Biomedical Informatics - Pathogenic - SWISS-MODEL Homology - Assessment Answer

February 28, 2018
Author : Ashley Simons

Solution Code: 1AHAB

Question: Biomedical Informatics


Biomedical Informatics Assignment

Assignment Task

  1. (0.5 points) Using the NCBI dbSNP database (www.ncbi.nlm.nih.gov), retrieve the data on the SNP with accession rs7412. Is it clinically significant? What is the name of the mutated gene? Is this SNP a silent mutation or not? If there is a change in the amino acid sequence of the encoded protein, give the wild-type and mutated codons and amino acids. Give some examples of diseases that are possibly associated with this mutation.
  2. (0.5 points) Using the National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies (GWAS) [NHGRI-EBI Catalog] (www.ebi.ac.uk/gwas), determine how many genome-wide association studies have identified a trait association with the SNP from the previous assignment (rs7412).
  3. (0.5 points) The enzyme EZH2 is involved in the development of cancers such as lymphoma. Using the Ensembl genome browser (www.ensembl.org), determine the location (chromosome, nucleotide positions) of the EZH2 gene (when searching by the name EZH2, choose the "Best gene match" suggested in the search results). How many splice variant transcripts are known for this gene? Look at the transcript table: how many variations are known in the transcript encoding the largest protein? How many variations are known for residue 615 of this protein? Are they located at the same nucleotide position in the transcript or not? According to the scores yielded by the SIFT program, are these mutations deleterious or not? What is reported about the clinical significance of these mutations by the dbSNP/ClinVar databases (click on the dbSNP accessions "rs..." in the "Variation ID" column)?
  4. (0.5 points) A mutation, located in the gene with GeneID 136371, has been identified in a patient and is suspected to be important. The mutation C>T is mapped to the transcript NM_080871 and is shown in the sequence region below:

TGCCGAGGCCACCACCGCCCGCTGCCT (wild type)
TGCCGAGGCCACCACTGCCCGCTGCCT (mutation)

Using the Mutation Taster server (www.mutationtaster.org), determine the potential importance of the mutation. Use the provided data as the input. Is it a silent mutation, or is there a change in the amino acid sequence ("AA changes")? Follow the link to the SNP database (reference ID: rs...). Using the information provided there, determine whether this SNP is pathogenic and which disease is associated with it. Using the dbSNP and ClinVar links, try to determine the putative molecular mechanism ("Functional consequence") underlying the effect of this mutation.

5. Toll-like receptors (TLRs) play an important role in signaling of innate immune responses to pathogens. A number of studies suggest associations of variations in TLRs with changes in susceptibility and/or resistance to pathogens and with the severity of some diseases. The TLRs have similar structures with three major domains: an ectodomain consisting of multiple leucine-rich repeats (LRRs), a transmembrane helix and a Toll/interleukin-1 receptor (TIR) domain. Using bioinformatic approaches, analyze some SNPs in TLR genes.

5.1. Possible associations of the following SNPs with lung function were studied in a group of patients: rs5743618 (TLR1 gene), rs5743708 (TLR2), rs3775291 (TLR3), rs4986790 (TLR4), rs5743810 (TLR6), rs179008 (TLR7), rs2407992 (TLR8), rs4129009 (TLR10). Determine the following features of these SNPs:

(a) Is the substitution synonymous or non-synonymous? What is reported about clinical significance by the dbSNP database?
(b) For a non-synonymous variant, determine the amino acid substitution and the domain containing it.
(c) SNP frequency according to the 1000 Genomes Project (if available). Data from the 1000G project can be viewed in Ensembl (www.ensembl.org). Note the "minor allele frequency" (MAF) values for all populations and the largest/lowest MAF values in the 5 main 1000G population groups.
(d) Determine the predicted effect according to the SIFT score (can be found in the transcript table of Ensembl).
(e) The study of these SNPs found that rs4986790, rs5743810 and rs179008 might be significant, while no associations were revealed for the others. Is there any correlation with SNP domain locations, SIFT scores or MAF values?

5.2. Toll-like receptor TLR3 is activated upon infection by flaviviruses such as Zika and dengue viruses. TLR3 is a receptor for double-helical RNA. Analyze the data on polymorphism of the TLR3 amino acids that are involved in RNA binding [these amino acid positions are identified by Liu et al. (2008), Science 320:379-381]. Identify SNPs of amino acids that are essential for RNA binding, their SIFT scores, MAF values and dbSNP reports on clinical significance. TLR3 activation during early brain development, mediated by Zika virus infection during pregnancy, was suggested to be associated with disorders in children. Is it possible to suggest that polymorphism in the TLR3 RNA-binding sites could play a role in this process?

6. Mutations in the transthyretin gene can lead to a number of disorders caused by protein misfolding. Transthyretin precursor (NP_000362) contains 147 amino acids, the first 20 residues being a signal peptide. One of the mutations in this gene is the deletion of valine at position 122 of the mature peptide (corresponding to the SNP rs121918096). Using the SWISS-MODEL homology modelling server (swissmodel.expasy.org), predict the structure of this mutant protein.

It is advised to start from searching for templates (it could take a couple of minutes, and meanwhile it is informative to watch the names of procedures/algorithms run by the server). When this is done, you can just use the template at the top of the list (default) for the modelling.

According to the model-template alignment, can this prediction be considered reliable? Does the Val122 deletion occur in a secondary structure element or in a coil region?

(NB. It is possible that Swiss-Model will inform you that your browser does not allow you to view the model itself. Actually such a view is not absolutely necessary for modelling and viewing the alignment to the template, so it is not necessary to change anything in your computer).

7. (2 points) Human MALAT1 (Metastasis-Associated Lung Adenocarcinoma Transcript 1) is a long (>8 kbp) noncoding RNA (lncRNA) implicated in a number of cancers. Using the UCSC genome browser (genome.ucsc.edu), determine location (chromosome, positions) of its gene.

It is known that the 3'-proximal part (about 500 bp) contains conserved functional RNA secondary structures. Thus select the region corresponding to the 3'-proximal 500 bp of MALAT1 in the browser and get MultiZ alignment (click on the MultiZ Align bar): you should get the output with the blocks of MultiZ alignment. What is the size of the block with the longest projection on the human genome sequence?

Try to predict the consensus structure in this alignment block using a few diverse sequences, for instance human, mouse and chicken. It is suggested to use the program RNAalifold for this task. Get the DNA sequences of these three species from the block of interest (click on the corresponding "D" in the list). Make a FASTA file of the three corresponding RNA fragments and submit it to Clustal Omega (www.ebi.ac.uk/Tools/msa/clustalo/) to produce a Clustal-formatted multiple alignment (save the file as text). Use this file for the RNAalifold (https://rna.tbi.univie.ac.at) prediction. Is any conserved structure predicted? Are there base covariations supporting it?


Solution:

  • Clinical significance: It is pathogenic.

Gene name: APOE.

Type of mutation: It is not a silent mutation; it is a missense mutation.

Codon: Wild type: CGC (Arg); mutated: TGC (Cys).

Associated diseases: Apolipoproteinemia E1, familial type 3 hyperlipoproteinemia, atorvastatin response (efficacy).
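The missense call can be checked against the standard genetic code. The sketch below is only an illustration: it builds the standard codon table (NCBI translation table 1) and classifies the CGC to TGC change reported above. The function name `classify_substitution` is a hypothetical helper, not part of any database tool.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1).
# Codons are enumerated with bases in T, C, A, G order, first base slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(wild_codon, mutant_codon):
    """Return (wild amino acid, mutant amino acid, mutation type)."""
    wild_aa = CODON_TABLE[wild_codon.upper()]
    mutant_aa = CODON_TABLE[mutant_codon.upper()]
    if wild_aa == mutant_aa:
        kind = "silent"
    elif mutant_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return wild_aa, mutant_aa, kind


# Codons reported for rs7412 in the answer above.
print(classify_substitution("CGC", "TGC"))  # ('R', 'C', 'missense')
```

R and C are the one-letter codes for Arg and Cys, which agrees with the codons listed in the answer.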

 

  • Six association studies are available. The associated traits mainly include lipid metabolism phenotypes, total cholesterol, LDL cholesterol and response to statin therapy (LDL-C).

  • Chromosome: 7

Nucleotide position: 148,807,383-148,884,321

Splice variants: 10

Variations associated with the largest protein coding transcript: 768

Variations associated with residue 615 of the protein: 2 missense variants. No, they are not located at the same nucleotide position.

Consequence of the mutation: The variation rs112034331 is deleterious, while rs587778301 is tolerated.

Clinical significance of variations: rs112034331- no information available, rs587778301- Untested.

 

  • Type of mutation: It is a silent mutation.

Clinical significance: It is pathogenic.

Functional consequence: It affects RNA splicing.

Associated diseases: It is associated with adult-onset primary open-angle glaucoma, a heterogeneous group of diseases.
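The position of the C>T change in the two 27-nt sequences from question 4 can be located with a direct comparison. The sketch below also translates the codon containing the change in each of the three possible reading frames. The frames are counted from the start of the snippet, which is an assumption for illustration only: the real reading frame of NM_080871 is not derived here, and MutationTaster remains the source for the reported result. The helper names are hypothetical.

```python
from itertools import product

WILD = "TGCCGAGGCCACCACCGCCCGCTGCCT"
MUTANT = "TGCCGAGGCCACCACTGCCCGCTGCCT"

# Standard genetic code (NCBI translation table 1), bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_substitutions(wild, mutant):
    """List (0-based index, wild base, mutant base) for every mismatch."""
    if len(wild) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [(i, w, m) for i, (w, m) in enumerate(zip(wild, mutant)) if w != m]


def codon_effect_by_frame(wild, mutant, index):
    """For each frame (relative to the snippet start), translate the codon
    that contains the substituted position in both sequences."""
    effects = {}
    for frame in range(3):
        start = index - ((index - frame) % 3)
        if start < 0 or start + 3 > len(wild):
            continue  # codon would run off the snippet
        w_codon, m_codon = wild[start:start + 3], mutant[start:start + 3]
        effects[frame] = (w_codon, CODON_TABLE[w_codon],
                          m_codon, CODON_TABLE[m_codon])
    return effects


index, wild_base, mutant_base = find_substitutions(WILD, MUTANT)[0]
print(index + 1, wild_base, mutant_base)
for frame, effect in codon_effect_by_frame(WILD, MUTANT, index).items():
    print(frame, effect)
```

Only one of the three frames leaves the amino acid unchanged (ACC to ACT, both Thr), so a silent call is plausible for this snippet if that frame is the one used by the transcript.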

 

  • The deletion of valine in transthyretin, represented by SNP rs121918096, occurs at position 142 of the precursor protein. This corresponds to position 122 of the mature peptide given in the assignment, because the precursor numbering includes the 20-residue signal peptide (122 + 20 = 142).

Top-ranked template: Protein: transthyretin; PDB ID: 4n85 (chain A); sequence identity: 100%; resolution: 1.6 Å.

Reliability of the model: Because the template is the same protein, the generated model can be considered highly reliable.

Location of mutation: The Val deletion lies in a secondary structure element, a beta sheet towards the C-terminus of the protein.
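The numbering shift between the mature peptide and the precursor, and the preparation of the deletion mutant for submission to SWISS-MODEL, can be sketched as follows. The 20-residue signal peptide length comes from the assignment text. The helper names and the short toy sequence are hypothetical, and the real NP_000362 sequence would have to be downloaded separately.

```python
SIGNAL_PEPTIDE_LENGTH = 20  # stated in the assignment for transthyretin


def mature_to_precursor(position, signal_length=SIGNAL_PEPTIDE_LENGTH):
    """Convert a 1-based mature-peptide position to precursor numbering."""
    return position + signal_length


def delete_residue(sequence, position, expected):
    """Delete the residue at a 1-based position after checking its identity."""
    actual = sequence[position - 1]
    if actual != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {actual}"
        )
    return sequence[:position - 1] + sequence[position:]


print(mature_to_precursor(122))  # 142

# Toy sequence (hypothetical, not transthyretin): delete the V at position 4.
print(delete_residue("MKTVLLA", 4, "V"))  # MKTLLA
```

Checking the expected residue before deleting guards against an off-by-twenty error when the wrong numbering scheme is used.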

 

 

5.1 The required information is provided in the form of the table below:

| SNP | rs5743618 | rs5743708 | rs3775291 | rs4986790 | rs5743810 | rs179008 | rs2407992 | rs4129009 |
|---|---|---|---|---|---|---|---|---|
| Gene | TLR1 | TLR2 | TLR3 | TLR4 | TLR6 | TLR7 | TLR8 | TLR10 |
| Substitution | Non-synonymous | Non-synonymous | Non-synonymous | Non-synonymous | Non-synonymous | Non-synonymous | Synonymous | Non-synonymous |
| Clinical significance | other | other | other | Benign | NA | NA | NA | NA |
| Amino acid substitution | Ser → Ile | Arg → Gln | Leu → Phe | Asp → Gly | Ser → Pro | Gln → Leu | / | Ile → Val |
| Domain/Repeats | LRRCT | TIR | LRR 15 | LRR 2 | LRR 8 | - | LRR 20 | TIR |
| MAF | 0.2 (C) | 0.01 (A) | 0.23 (T) | 0.06 (G) | 0.12 (A) | 0.12 (T) | 0.28 (G) | 0.15 (C) |
| Highest MAF | 0.67 (C), EUR | 0.02 (A), EUR | 0.33 (T), EAS | 0.13 (G), SAS | 0.41 (A), EUR | 0.23 (T), EUR | 0.62 (G), EUR | 0.27 (C), EAS |
| Lowest MAF | 0.01 (C), EAS | 0 (A), AFR/EAS/SAS | 0.03 (T), AFR | 0 (G), EAS | 0 (A), EAS | 0 (T), EAS | 0.05 (G), AFR | 0.01 (C), AFR |
| SIFT | 1 | 0 | 0 | 0.15 | 0.56 | 0.29 | - | 0.28 |
| Predicted effect | tolerated | deleterious | deleterious | tolerated | tolerated | tolerated | - | tolerated |

 

(e) No, there is no significant correlation with the SNP domain locations, SIFT scores or MAF values.
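The conclusion in (e) can be illustrated with a small comparison of the values from the 5.1 table. This is a minimal sketch, not a statistical test: with eight SNPs, no formal significance can be claimed. It assumes that the seven SIFT values belong to the seven non-synonymous SNPs in column order, so the synonymous rs2407992 has no score, and it uses the conventional SIFT cutoff of 0.05 for "deleterious".

```python
from statistics import mean

SIFT_DELETERIOUS_BELOW = 0.05  # conventional SIFT threshold

# Values copied from the 5.1 table; "significant" marks the three SNPs
# that the study in question 5.1(e) reported as possibly significant.
SNPS = {
    "rs5743618": {"gene": "TLR1", "sift": 1.0, "maf": 0.20, "significant": False},
    "rs5743708": {"gene": "TLR2", "sift": 0.0, "maf": 0.01, "significant": False},
    "rs3775291": {"gene": "TLR3", "sift": 0.0, "maf": 0.23, "significant": False},
    "rs4986790": {"gene": "TLR4", "sift": 0.15, "maf": 0.06, "significant": True},
    "rs5743810": {"gene": "TLR6", "sift": 0.56, "maf": 0.12, "significant": True},
    "rs179008": {"gene": "TLR7", "sift": 0.29, "maf": 0.12, "significant": True},
    "rs2407992": {"gene": "TLR8", "sift": None, "maf": 0.28, "significant": False},
    "rs4129009": {"gene": "TLR10", "sift": 0.28, "maf": 0.15, "significant": False},
}


def group_mean(field, significant):
    """Mean of a field over one group, skipping missing values."""
    values = [
        snp[field]
        for snp in SNPS.values()
        if snp["significant"] == significant and snp[field] is not None
    ]
    return mean(values)


deleterious = sorted(
    rs for rs, snp in SNPS.items()
    if snp["sift"] is not None and snp["sift"] < SIFT_DELETERIOUS_BELOW
)

for flag in (True, False):
    print(flag, round(group_mean("sift", flag), 3), round(group_mean("maf", flag), 3))
print(deleterious)
```

Both SNPs predicted as deleterious fall in the group without a reported association, and the mean SIFT scores of the two groups are nearly equal, which is consistent with the answer that no clear correlation is visible.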

5.2 The residues His39 and His60 at the N-terminus, and His539 and Asn541 at the C-terminus, are essential for RNA binding to TLR3. Variations reported among these residues include the following:

  • rs780502519: Missense mutation, Residue: 60, H [His] → Q [Gln], Clinical significance: NA, MAF: A=0.000008/1, SIFT: 0
  • rs776387492: Missense mutation, Residue: 539, H [His] → R [Arg], Clinical significance: NA, MAF: G=0.00002/2, SIFT: 0

A mutation in the RNA-binding site of TLR3 might either enhance the RNA recognition capacity of TLR3 or inhibit the binding of RNA to TLR3. A role for polymorphism in the RNA-binding sites in TLR3 activation can therefore be suggested only after experimental validation.

 

  • Location of MALAT1 gene: chr11:65497762-65506512
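The browser window for the 3'-proximal 500 bp follows from the gene coordinates by simple arithmetic. The sketch below assumes 1-based inclusive coordinates as shown in the UCSC position box and assumes that MALAT1 lies on the forward (+) strand. Neither the strand nor the genome assembly is stated in the answer, and `three_prime_region` is a hypothetical helper.

```python
def three_prime_region(start, end, strand, length=500):
    """Return (start, end) of the last `length` bases of a transcript.

    Coordinates are 1-based and inclusive. On the + strand the 3' end is
    the highest coordinate; on the - strand it is the lowest.
    """
    if length > end - start + 1:
        raise ValueError("requested region is longer than the gene")
    if strand == "+":
        return end - length + 1, end
    if strand == "-":
        return start, start + length - 1
    raise ValueError("strand must be '+' or '-'")


GENE_START, GENE_END = 65497762, 65506512  # chr11, from the answer above

print(GENE_END - GENE_START + 1)  # gene length in bp
region = three_prime_region(GENE_START, GENE_END, "+")  # strand is an assumption
print(f"chr11:{region[0]}-{region[1]}")
```

The printed position string can be pasted into the browser's position box before opening the MultiZ alignment track.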

 

Size of the block with the longest projection on the human genome sequence: 24 bp

MultiZ alignment: Using this option, the DNA sequences of human, mouse and dog were retrieved.

Clustal Omega: The alignment obtained using this program is shown below:

CLUSTAL O(1.2.3) multiple sequence alignment

Dog    AAATTCATATTAATAAAAAAAA--TTA--
Mouse  --------TTTTATGAAATAAAAACTAAA
Human  -----CAATTTTGTGTAATAAAAATGGAG
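Before running RNAalifold it can be useful to see how much of the alignment is conserved at all. The sketch below lists the gap-free columns that are identical in all three rows of the alignment above and shows the DNA-to-RNA conversion used when preparing the FASTA input. It does not predict structure or detect covariation, which is the job of RNAalifold itself. The helper names are hypothetical.

```python
# Alignment rows copied from the Clustal Omega output above.
ALIGNMENT = {
    "Dog": "AAATTCATATTAATAAAAAAAA--TTA--",
    "Mouse": "--------TTTTATGAAATAAAAACTAAA",
    "Human": "-----CAATTTTGTGTAATAAAAATGGAG",
}


def conserved_columns(alignment):
    """Return 0-based indices of gap-free columns with a single base."""
    rows = list(alignment.values())
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [
        i for i, column in enumerate(zip(*rows))
        if "-" not in column and len(set(column)) == 1
    ]


def to_rna(aligned_dna):
    """Strip gaps and convert a DNA string to RNA (T becomes U)."""
    return aligned_dna.replace("-", "").upper().replace("T", "U")


columns = conserved_columns(ALIGNMENT)
print(len(columns), "conserved columns:", columns)
for name, row in ALIGNMENT.items():
    print(f">{name}\n{to_rna(row)}")
```

A short, A/U-rich alignment with few conserved columns leaves little room for compensatory base-pair changes, so any predicted consensus structure should be read with caution.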

RNAalifold: The conserved structure obtained is as follows:

The free energy of the thermodynamic ensemble as predicted was -2.78 kcal/mol.
