Protein sequence databases Essay
3. Protein sequence databases
I. Primary Database
Protein Information Resource (PIR)
II. Secondary Database
PROSITE: a secondary protein sequence database. It rests on the observation that the many known proteins can be grouped into families on the basis of sequence similarity. Proteins belonging to the same family originate from a common ancestor and hence usually have similar functions.
During evolution, some stretches of a protein sequence are conserved better than others. These conserved stretches are known as motifs, and they contribute to the characteristic function or structure of the protein. Studying them makes it possible to distinguish the members of a particular protein family from unrelated proteins by means of a protein signature (similar to a fingerprint). The signature is then used to analyze new protein sequences and identify the family they belong to, in order to predict their function.
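The signature idea described above can be illustrated with a short Python sketch. It assumes the commonly documented PROSITE pattern notation (`x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for sequence ends) and translates a pattern into a regular expression. The function names and the test sequence are hypothetical and chosen only for illustration; this is not PROSITE's own scanning software.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")   # pattern must start the sequence
        anchor_end = element.endswith(">")       # pattern must end the sequence
        element = element.strip("<>")
        # Separate the residue part from an optional repeat such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":                          # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"       # excluded residues
        # A single letter or a [..] set is already valid regex syntax.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(("^" if anchor_start else "") + core + ("$" if anchor_end else ""))
    return "".join(parts)


def find_motifs(pattern: str, sequence: str):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# N-glycosylation site pattern; the sequence below is made up for the example.
hits = find_motifs("N-{P}-[ST]-{P}", "MKNGTAANPSLNHSQ")
print(hits)
```

In this made-up sequence the asparagine at position 8 is followed by a proline, so it is skipped, while the sites starting at positions 3 and 12 match the signature.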
Nucleotide sequence databases
ii. DNA Database of Japan (DDBJ)
iii. European Molecular Biology Laboratory (EMBL)
GenBank: a primary nucleotide sequence database. It is maintained by the National Center for Biotechnology Information (NCBI), which is part of the National Institutes of Health (NIH), a federal agency of the US government.
It collects all publicly available sequences, both those submitted directly by authors and those from genome sequencing groups. It is an archival database, because the submitted sequences are not checked.
Most records are single contiguous stretches of DNA or RNA. The information is retrieved using the Entrez integrated retrieval system. Data are exchanged with other databases such as DDBJ and EMBL on a regular basis, so the collection stays up to date.
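As a rough illustration of Entrez-based retrieval, the Python sketch below builds a request URL for NCBI's E-utilities `efetch` service, which returns a GenBank flat-file record for a given accession. The helper name `genbank_fetch_url` is hypothetical, and the accession `U49845` is used only as an example. The sketch constructs the URL without sending a request, so it runs offline.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fetch_url(accession: str) -> str:
    """Build an efetch URL that asks for one nucleotide record in GenBank format."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # accession number of the record
        "rettype": "gb",      # GenBank flat-file format
        "retmode": "text",    # plain text rather than XML
    }
    return EFETCH_URL + "?" + urlencode(params)


url = genbank_fetch_url("U49845")
print(url)
```

The resulting URL can be opened in a browser or passed to `urllib.request.urlopen` to download the record.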
4. Branches of proteomics
Proteome: the set of proteins of an organism.
Proteomics: the study of the distribution, interactions, expression, and dynamics of proteins within living systems. It requires rigorous data and depends on high-throughput measurements such as mass spectrometry and DNA microarrays.
The various subdivisions include:
Interaction proteomics: analyzes protein interactions in order to identify binary protein interactions and protein complexes. Information about protein-protein interactions helps in understanding system-level biology, gene regulatory networks, cell signaling cascades, etc.
Expression proteomics: analyzes protein expression on a large scale. It helps to recognize differentially expressed proteins (those that contribute to disease development), as well as the desired proteins in a sample. Proteins that share an identical or similar expression profile may be functionally related. Mass spectrometry and 2D-PAGE are used.
Biomarkers: indicators of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention. Specific protein biomarkers are used for disease diagnosis.
Proteogenomics: proteolytic events and post-translational modifications are discovered by simultaneous analysis of the proteome and the genome.
Structural proteomics: analyzes protein structures on a large scale. It helps in determining the target sites of drugs on proteins and the sites where proteins interact with each other, and in identifying the functions of newly discovered genes. X-ray crystallography and NMR spectroscopy are used.
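The point made under expression proteomics, that proteins with similar expression profiles may be functionally related, can be sketched in Python as follows. The example assumes Pearson correlation as the similarity measure and a threshold of 0.9; both are illustrative choices rather than a method stated in the essay, and the protein names and abundance values are invented.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def co_expressed_pairs(profiles, threshold=0.9):
    """Return protein pairs whose profiles correlate at or above the threshold."""
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if pearson(profiles[p], profiles[q]) >= threshold
    ]


# Hypothetical abundance values measured under four conditions.
profiles = {
    "protA": [1.0, 2.0, 3.0, 4.0],
    "protB": [2.1, 3.9, 6.2, 8.0],
    "protC": [4.0, 3.0, 2.0, 1.0],
}
pairs = co_expressed_pairs(profiles)
print(pairs)
```

Here protA and protB rise together across the conditions and are reported as a candidate pair, while protC follows the opposite trend and is not.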