Practical Issues in Bioinformatics - Assignment Example

Summary
The assignment "Practical Issues in Bioinformatics" critically analyses the major practical issues in bioinformatics. The query sequence matches the Homo sapiens prion protein (PRNP) gene, Gene ID 5621, with a query coverage of 100% and an E-value of 2e-21 [BLASTN 2.2.27+]…

Extract of sample "Practical Issues in Bioinformatics"

Health sciences and medicine, Assignment
Topic: Bioinformatics; CW1: Database work, 26th October

Partial DNA sequence for a gene that your company is interested in:
CGGCGCCGCGAGCTTCTCCTCTCCTCACGACCGAGGCAGAGCAGTCATTATGGCGAACCTTGGCTGCTGGATGCTGGTTCTCTTTGTGGCCACATGGAGTGACCTGGGCCTCTGCAAGAAGCGCCCGAAGCCTGGAGGATGGAACACTGGGGGCAGCCGATACCCGGGGCAGGGCAGCCCTGGAGGCAACCGCTACCCACCTCAGGGCGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGG

A short report telling them what data is publicly available for this gene.

1) Using NCBI BLAST identify the most likely candidate for the complete gene.

a. What is the name of the gene?
Homo sapiens prion protein (PRNP) gene, Gene ID: 5621 (PRNP). Query coverage of 100%; E-value of 2e-21 [BLASTN 2.2.27+] (Zhang et al., 2000).

b. What organism does the gene come from?
Homo sapiens (human). Most of the hits returned for the query were from Homo sapiens, which makes it likely that the partial DNA sequence originated from humans.

c. In trying to find the function of a gene it can be useful to see how widely distributed it is amongst species. For example, is it limited to bacteria? From the BLAST output what can you say about the distribution of the gene amongst different species?
The prion protein gene is not limited to humans. The BLAST hits include other primates, such as the Sumatran orangutan (Pongo abelii) and Macaca fascicularis. Although the sequence we obtained was from a primate, the prion gene is also found in other mammals, such as sheep and cattle.

2) Using the secondary databases find out as much as you can about the functional and structural properties of the gene. Why is this gene significant? What does it do? What does it look like? Where is it found within the organism? Are there any related genes?
Accession CAA58442: amino acid sequence of prion protein [Human]
MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKPKTNMKHMAGAAAAGAVVGGLGGYMLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG

The gene exists as a single copy, is located on chromosome 20, and encodes a glycosylphosphatidylinositol (GPI)-anchored membrane glycoprotein. The protein has a primary structure of 245 amino acids, a molecular weight of 26884.3 Da and a theoretical isoelectric point (pI) of 9.13 (Gasteiger et al., 2005). The normal cellular form of the gene product (PrPC) is monomeric, alpha-helical in structure, and usually anchored to the cell membrane via a lipid anchor; misfolding of this protein gives rise to a protease-resistant form (PrPSc). The protein is important because of its implication in the etiology of human and livestock disease, where its misfolding may lead to neuronal degeneration.

Figure 1: NMR solution structure of the human prion protein (Zahn et al., 2000)

The gene is involved in synaptic plasticity and neuronal development; it may also play roles in iron uptake and homeostasis.

Related genes
i). RNA-binding protein FUS isoform 1 [Homo sapiens]
ii). TATA-binding protein-associated factor 2N isoform 1 [Homo sapiens]
iii). Chain A, Mouse Prion Protein (121-231) Containing The Substitution F175a
iv). Single-stranded DNA-binding protein [Arthrobacter sp. Rue61a]
v). Hypothetical conserved protein [Oceanobacillus iheyensis HTE831]
vi). Translation initiation factor IF-2 [Corynebacterium glutamicum R]

3) Is this protein related to any diseases? What are they? What causes them? Are there any mutations of the gene associated with diseases?
Misfolding of the prion protein produces a variant (PrPSc) associated with various neurodegenerative diseases collectively termed transmissible spongiform encephalopathies (TSEs) or prion-related diseases (Taylor et al., 2009; Prusiner, 1998). Upon misfolding, the β-sheet content of the protein increases markedly, which causes the proteins to aggregate into large macromolecular assemblies. Prion diseases include bovine spongiform encephalopathy (BSE, or mad cow disease) in cattle, scrapie in sheep and, in humans, Creutzfeldt-Jakob disease (CJD), fatal insomnia, Gerstmann-Sträussler-Scheinker (GSS) disease and variably protease-sensitive prionopathy (VPSPr). These diseases may be genetic, sporadic or acquired by infection. Although devoid of any nucleic acid, the infectious agents are transmissible and consist entirely of the transformed protein (PrPSc).

Among the prion-related diseases, CJD is the most common in humans, and sporadic events account for about 85% of CJD cases (Torres et al., 2012). About 10% of cases (familial CJD) are associated with mutation of the prion gene, and fewer than 1% result from infection. According to these statistics, the majority of CJD cases are sporadic; genetic (familial) and infectious causes together contribute only about 11% of all CJD cases.

The normal prion protein may also have a neuroprotective function in the cerebrospinal fluid (CSF). Some studies have shown that this protein plays roles in signal transduction, cell survival and protection against oxidative stress (Watt and Hooper, 2005; Chen et al., 2003; Mouillet-Richard et al., 2000). Therefore, in prion disease the conversion of the normal prion protein to the abnormal form may have a negative effect on the CSF and thereby contribute to the progression of the disease.
In humans, the prion gene contains an octapeptide repeat region (R1-R2-R2-R3-R4). The R2, R3 and R4 repeats encode the octapeptide PHGGGWGQ, while R1 encodes the nonapeptide PQGGGGWGQ. The region therefore contains 5 repeats of 24-27 bp: one nonapeptide and four octapeptide coding sequences. According to Li et al. (2011), both an increase and a decrease in the number of these repeats are linked to prion diseases. Point mutations in the prion gene, such as E200K and P102L, have been associated with familial CJD and GSS respectively (Kong et al., 2004). Other mutations, such as insertions and deletions, have also been linked with familial prion diseases since they change the number of repeats. In conclusion, a mutation in the prion gene makes the resultant gene product more likely to adopt the abnormal prion protein conformation.

4) Describe how you carried out your investigation, including what databases you used and giving reasons for why you used those resources and supporting the evidence you gave for parts 1-3. Give your opinion as to the usefulness and accessibility of the different databases that you used [UniProt, Prosite etc.]

I submitted the provided partial DNA sequence to BLASTn, available at the NCBI website at http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn&BLAST_PROGRAMS=megaBlast&PAGE_TYPE=BlastSearch&SHOW_DEFAULTS=on&LINK_LOC=blasthome. BLASTn is a basic local alignment search tool program that uses a DNA sequence as a query to search a nucleotide database. To increase the probability of getting the correct result, the search was optimized for highly similar sequences using megablast. The NCBI has several nucleotide databases; thus, it was imperative to choose the appropriate database for the BLASTn search. Since no prior information was given on the source of the partial DNA sequence, the search was conducted on a general nucleotide collection: the non-redundant (nr) collection.
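The search settings described above are encoded in the query string of the BLAST URL quoted in the text. As a minimal sketch, the Python standard library can unpack that URL to show which program and optimisation were selected. The helper name `blast_settings` is hypothetical and introduced only for illustration; the URL itself is copied from the report.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the report.
BLAST_URL = (
    "http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastn"
    "&BLAST_PROGRAMS=megaBlast&PAGE_TYPE=BlastSearch"
    "&SHOW_DEFAULTS=on&LINK_LOC=blasthome"
)


def blast_settings(url):
    """Return the query-string parameters of a URL as a flat dict.

    Hypothetical helper: each parameter appears once in this URL,
    so only the first value of each key is kept.
    """
    query = urlsplit(url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


settings = blast_settings(BLAST_URL)
for name, value in settings.items():
    print(f"{name} = {value}")
```

The `PROGRAM` and `BLAST_PROGRAMS` entries correspond to the BLASTn program and the megablast optimisation mentioned in the text.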
The nr collection merges several nucleotide databases, which increases the chances of finding the correct sequence in the search. The most likely match to the query was chosen on the basis of the query coverage and the E-value displayed in the results: the best sequence had the smallest E-value and a large query coverage. The query coverage indicates how much of the query is aligned to a given sequence in the database. Most of the sequences returned had a query coverage of 100% and significant E-values. Most of them, however, were from human and shared a common gene ID. This is possible because different groups may have submitted different versions of the human prion gene to the nucleotide collection. Because the non-redundant collection draws on several source databases, there is a higher probability of finding the correct matching sequence.

After identifying the gene, sequences from organisms other than humans were picked as the related sequences. These related sequences were again selected on the basis of their E-values and query coverage. Using the protein sequence encoded by the selected human prion gene as a query, a BLASTp search was carried out against the NCBI nr protein database. The BLASTp algorithm searches a protein database using a protein query.

Physical properties of the protein were investigated by submitting its primary sequence to the ProtParam tool available on the ExPASy server. The physico-chemical properties reported included the isoelectric point and the molecular weight. Using the accession number of the protein, the sequence was then searched in the Protein Data Bank, a biological macromolecular resource that archives experimentally derived structures of biological molecules. The actual sequence can also be submitted at the Protein Data Bank.
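BLASTp needs a protein query, so the DNA sequence has to be translated before that step. The following is a minimal sketch of that translation for the partial DNA sequence given in the assignment. It assumes the standard genetic code and that the reading frame starts at the first ATG in the fragment; both are assumptions made for illustration, not steps stated in the report. The function name `translate_from_first_atg` is hypothetical.

```python
# Standard genetic code, listed in TCAG order for all three codon positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Partial DNA sequence from the assignment, split only for readability.
QUERY_DNA = (
    "CGGCGCCGCGAGCTTCTCCTCTCCTCACGACCGAGGCAGAGCAGTCATT"
    "ATGGCGAACCTTGGCTGCTGGATGCTGGTTCTCTTTGTGGCCACATGG"
    "AGTGACCTGGGCCTCTGCAAGAAGCGCCCGAAGCCTGGAGGATGGAAC"
    "ACTGGGGGCAGCCGATACCCGGGGCAGGGCAGCCCTGGAGGCAACCGC"
    "TACCCACCTCAGGGCGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGG"
)

# Start of the protein sequence reported under accession CAA58442.
REPORTED_PROTEIN_START = (
    "MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNR"
    "YPPQGGGGWGQPHGGGWGQPHGGGWGQ"
)


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the fragment end."""
    start = dna.find("ATG")
    if start == -1:
        return ""
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for pos in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


peptide = translate_from_first_atg(QUERY_DNA)
print(peptide)
print("Matches reported protein start:", REPORTED_PROTEIN_START.startswith(peptide))
```

Comparing the translated fragment with the start of the reported protein is a quick consistency check that the BLASTn hit and the protein record describe the same gene product.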
The Protein Data Bank entry provided the structure of the prion protein, showing its alpha helices (Figure 1). The results displayed for each query in both the NCBI nucleotide and protein BLAST searches contain links to the literature associated with most sequences submitted to the databases. These papers can be accessed at PubMed, and they provide detailed information on the sequences, including the rationale for sequencing them, related sequences and any other information a researcher may be interested in. This is where information on prion proteins and the diseases associated with their misfolding was sourced.

In conclusion, in the recent past there has been an increase in publicly available biological data, and almost any sequence sequenced in the laboratory may find its match in these databases. Information on the objectives of the research group that sequenced the DNA can also be accessed easily through links found in the results. However, redundancy in this data is a common problem that causes confusion. Thus, as one searches for a sequence of interest in the database, clear judgment is necessary, since redundancy exists in the form of hypothetical sequences, or even very short sequences that match the query although they do not come from a related source. Statistics alone, therefore, may not be enough to decide which sequence matches another; careful scrutiny of the literature backing up a given sequence is also necessary. Hypothetical sequences, one source of this redundancy, are derived from automated translation of submitted sequences.

References
Chen, S., Mange, A., Dong, L., Lehmann, S. and Schachner, M. (2003) ‘Prion protein as trans-interacting partner for neurons is involved in neurite outgrowth and neuronal survival’ Mol Cell Neurosci. 22 pp. 227-233.
Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M.R., Appel, R.D. and Bairoch, A. (2005) ‘Protein Identification and Analysis Tools on the ExPASy Server’ In: John M. Walker (ed.) The Proteomics Protocols Handbook. Totowa, NJ: Humana Press. pp. 571-607.
Kong, Q., Surewicz, W.K., Petersen, R.B., Zou, W., Chen, S.G., Gambetti, P., Parchi, P., Capellari, S., Goldfarb, L., Montagna, P., Lugaresi, E., Piccardo, P. and Ghetti, B. (2004) ‘Inherited Prion Disease’ In: Prusiner, S. (ed.) Prion Biology and Diseases. 2nd ed. New York: Cold Spring Harbor Laboratory Press. pp. 673-776.
Li, B., Qing, L., Yan, J. and Kong, Q. (2011) ‘Instability of the Octarepeat Region of the Human Prion Protein Gene’ PLoS ONE. 6(10): e26635. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026635 [accessed 27 October 2012].
Mouillet-Richard, S., Ermonval, M., Chebassier, C., Laplanche, J.L., Lehmann, S., Launay, J.M. and Kellermann, O. (2000) ‘Signal transduction through prion protein’ Science. 289 pp. 1925-1928.
Prusiner, S.B. (1998) ‘Prions’ Proc Natl Acad Sci USA. 95 pp. 13363-13383.
Taylor, D.R., Whitehouse, I.J. and Hooper, N.M. (2009) ‘Glypican-1 Mediates Both Prion Protein Lipid Raft Association and Disease Isoform Formation’ PLoS Pathog. 5(11): e1000666. http://www.plospathogens.org/article/info%3Adoi%2F10.1371%2Fjournal.ppat.1000666 [accessed 27 October 2012].
Torres, M., Cartier, L., Matamala, J.M., Hernandez, N., Woehlbier, U. and Hetz, C. (2012) ‘Altered Prion Protein Expression Pattern in CSF as a Biomarker for Creutzfeldt-Jakob Disease’ PLoS ONE. 7(4): e36159. http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0036159 [accessed 27 October 2012].
Watt, N.T. and Hooper, N.M. (2005) ‘Reactive oxygen species (ROS)-mediated beta-cleavage of the prion protein in the mechanism of the cellular response to oxidative stress’ Biochem Soc Trans. 33 pp. 1123-1125.
Zahn, R., Liu, A., Luhrs, T., Riek, R., Von Schroetter, C., Garcia, F.L., Billeter, M., Calzolai, L., Wider, G. and Wuthrich, K. (2000) ‘NMR solution structure of the human prion protein’ Proc Natl Acad Sci USA. 97(1) pp. 145-150.
Zhang, Z., Schwartz, S., Wagner, L. and Miller, W. (2000) ‘A greedy algorithm for aligning DNA sequences’ J Comput Biol. 7(1-2) pp. 203-214.
