Biological Database1

Uploaded by

shamnak2022

Biological Databases:

A biological database is a collection of data stored in an organized manner so that its contents can
be easily accessed, managed and updated. Such databases give scientists access to sequence and
structure data for hundreds of thousands of sequences from a broad range of organisms, and they
serve as an extensive resource in support of biological research. There are different types of
databases. Based on data source, they are classified as:
1) Primary Databases: These store experimentally derived data such as nucleotide sequences,
protein sequences or molecular structures.
Examples: GenBank, PDB, DDBJ
2) Secondary Databases: These contain curated data and inferences obtained by analysing
primary data with computational or manual methods.
Examples: PIR, SWISS-PROT and Pfam.
Based on Data type, they are:
3) Protein Databases: (Gerritsen, 2005) The most extensive sources of protein information are
protein sequence databases, which can be categorized into two types: Universal databases,
which aim to gather biological data across a wide range of species, and Specialized databases,
which focus on specific protein groups, families, or particular organisms.
4) Nucleotide Databases: (Lewitter, 1999) Nucleotide databases are described as essential
repositories that store the sequences of nucleotides (the building blocks of DNA and RNA)
from various organisms. These databases allow scientists to access, retrieve, and analyse
genetic information, which is crucial for understanding biological processes, genetic diversity,
and evolutionary relationships.
5) Structural Database: (Oliviero Carugo, 2002) It is a specialized type of database that stores
and organizes detailed 3D structural information about biological macromolecules, such as
proteins, nucleic acids, and protein-nucleic acid complexes. Unlike sequence databases,
which focus on linear genetic or protein sequences, structural databases emphasize the spatial
arrangement of atoms within a molecule, offering atomic-level detail that is crucial for
understanding biological mechanisms, like enzyme activity or molecular interactions.
Structural databases are essential in fields like structural biology and bioinformatics, as they
provide unique insights into molecular functions, such as identifying functionally important
motifs (e.g., the catalytic triad in enzymes), and allow for the visualization and analysis of
macromolecular interactions. One of the most well-known structural databases is the Protein
Data Bank (PDB), which was established in 1971 and contains extensive entries of
macromolecular structures derived from experimental methods like X-ray crystallography and
NMR spectroscopy.
6) Sequence Database: (Harger et al., 2000) A sequence database is a specialized database that
stores nucleotide sequences (DNA or RNA), or in some cases protein sequences, and provides
access to associated biological, bibliographic, and annotation information. The Genome
Sequence Database (GSDB) is an example of such a database, focusing on storing and
providing access to publicly available nucleotide sequences across various organisms.
E.g.: UniProt (protein sequences), the Nucleotide database (NCBI), the EMBL Nucleotide Sequence Database
Sequence Alignment:
(P. Haritha, 2018) Sequence alignment is essential for detecting similarities among protein,
DNA, and RNA sequences, aiding in the study of evolutionary relationships between them. This
analysis sheds light on the connections among groups of related proteins. In sequence alignment,
amino acids from the protein sequences are compared, usually arranged in a linear fashion. The
alignment tool identifies matching amino acids in each column, inserting gaps where needed.
Sequence alignment has a wide range of applications, including sequence assembly, gene and protein
annotation, structural and functional predictions, as well as phylogenetic and evolutionary research.
The primary types of alignment are Pairwise Alignment, which compares two sequences; Multiple
Sequence Alignment, which involves more than two sequences to find similarities and conserved
regions; and Structural Alignment, based on structural features.
(Sequence alignment software typically inserts gaps between nucleotide or amino acid
residues in the sequences to maximize the alignment of similar sites. In the end, a character matrix is
generated, with the rows representing the sequences and the columns reflecting aligned positions
across those sequences.)

Pairwise Alignment: This method compares two sequences to reveal regions of similarity between
them. The two main approaches are the Dot Plot (dot matrix) Method and Dynamic Programming.
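
The dynamic programming approach to pairwise alignment can be sketched with a small global aligner in the style of Needleman-Wunsch. This is a minimal illustration, not the method of any particular tool: the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example, and real programs use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    The scoring scheme is illustrative only (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for placing a[i-1] opposite b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the matrix: each cell comes from a diagonal, up, or left move
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `needleman_wunsch("ACGT", "AGT")` inserts a single gap in the shorter sequence so that the remaining three residues line up.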
Multiple sequence Alignment: Multiple Sequence Alignment (MSA) is a technique used to align more
than two sequences at once, helping to identify conserved regions that appear consistently across
multiple sequences. This alignment is particularly useful for constructing phylogenetic trees, which
represent the evolutionary relationships among various sequences. There are two main methods for
performing MSA: Progressive alignment, which builds alignments in stages, and Iterative alignment,
which refines the alignment through repeated adjustments.
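
Once sequences have been aligned into a character matrix (rows are sequences, columns are aligned positions), conserved regions can be read off column by column. The sketch below assumes the alignment has already been produced by some MSA tool. The function name and the strict definition of "conserved" (every row has the same non-gap residue) are assumptions made for illustration.

```python
def conserved_columns(alignment):
    """Return 0-based indices of columns where every sequence has the
    same residue and none has a gap ('-').

    `alignment` is a list of equal-length strings, one per sequence.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for col in range(length):
        residues = {row[col] for row in alignment}  # distinct symbols in column
        if len(residues) == 1 and "-" not in residues:
            conserved.append(col)
    return conserved
```

Practical tools usually report a graded conservation score per column rather than this all-or-nothing test.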
Protein:
The process of converting an RNA sequence into a sequence of amino acids is known as translation.
DNA is first transcribed into RNA, which is then translated into a protein sequence. Because the
genetic code is degenerate (most amino acids are encoded by more than one codon), a protein
sequence cannot be traced back unambiguously to the original DNA. Protein synthesis occurs in
three phases: Initiation, where the AUG initiator codon is located; Elongation, leading to the
formation of polypeptides, or chains of amino acids; and Termination, where a stop codon ends
synthesis and the polypeptide is released.
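
Translation can be illustrated with a short sketch that locates the first AUG codon, reads codons three bases at a time, and stops at the first stop codon. It uses the standard genetic code. The function and variable names are assumptions of this example, and the sketch ignores biological details such as ribosome binding sites and alternative start codons.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered by first, second, third base over UCAG;
# '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Accepts RNA (or DNA coding-strand) letters; returns one-letter amino
    acid codes, or an empty string if no AUG is present.
    Raises KeyError on characters other than A, C, G, U/T.
    """
    mrna = mrna.upper().replace("T", "U")
    start = mrna.find("AUG")  # initiation: locate the start codon
    if start == -1:
        return ""
    protein = []
    for pos in range(start, len(mrna) - 2, 3):  # elongation: codon by codon
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # termination: stop codon reached
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Note that `translate("AUGGCU")` and `translate("AUGGCC")` give the same protein, which is why a protein sequence alone does not determine the nucleotide sequence that encoded it.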

(LaPelusa & Kaushik, 2022) Proteins, often termed the cell’s workhorses, play vital roles in
providing structural support, facilitating movement, driving metabolism, regulating gene expression,
and enabling cell-environment interactions. Although they vary in shape and size, all proteins are
composed of the same fundamental building blocks.
Proteins are formed from twenty standard amino acids. Each amino acid contains an amino group
(NH3+), a carboxylate group (COO−), and a variable side chain, or R group, all attached to a central
carbon, known as the α-carbon. At physiological pH, amino acids carry both positive and negative
charges: the amino group is protonated, while the carboxyl group is deprotonated.
Amino acids are generally categorized based on their R groups: hydrophobic, polar (hydrophilic), or
charged. Within proteins, amino acids are linked via peptide bonds, and the sequence and nature of
these amino acids ultimately determine the protein's chemical and physical properties. The
arrangement of amino acids forms what is known as the protein's primary structure, while the folding
patterns due to backbone interactions constitute the secondary structure. The full 3D shape of the
protein, including the spatial arrangement of all its atoms, is called the tertiary structure. Proteins with
multiple polypeptide chains exhibit a quaternary structure, essential for stability and functionality.
Fundamentals of Protein Structure: The structural hierarchy of proteins—primary, secondary, tertiary,
and quaternary—is fundamental to their diverse roles within organisms.
Primary Structure: (LaPelusa & Kaushik, 2022) A protein’s primary structure, the unique sequence
of amino acids, is crucial to its function. For instance, a single amino acid change in haemoglobin
(replacement of glutamic acid with valine at the sixth position in β-globin) results in sickle-cell
anaemia. This linear sequence forms the foundation upon which secondary, tertiary, and quaternary
structures develop.
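
The sickle-cell substitution can be written out on the sequence itself. The sketch below uses the first eight residues of mature β-globin in one-letter code (VHLTPEEK). That fragment and the helper name are supplied here for illustration and are not taken from the cited source.

```python
def substitute(sequence, position, new_residue):
    """Return a copy of `sequence` with the residue at the 1-based
    `position` replaced by `new_residue`."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    return sequence[:position - 1] + new_residue + sequence[position:]


# N-terminal residues of mature beta-globin (assumed fragment for illustration)
NORMAL_BETA_GLOBIN_START = "VHLTPEEK"

# Glutamic acid (E) at position 6 replaced by valine (V)
SICKLE_BETA_GLOBIN_START = substitute(NORMAL_BETA_GLOBIN_START, 6, "V")
```

The two fragments differ at exactly one position, yet the change swaps a charged residue for a hydrophobic one.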
Secondary Structure: (LaPelusa & Kaushik, 2022) Secondary structure is stabilized by interactions
between the backbone’s peptide groups, particularly hydrogen bonds. The most common secondary
structures are the α-helix (a coiled configuration) and the β-pleated sheet (flat, folded segments). In an
α-helix, the backbone carbonyl oxygen of each residue forms a hydrogen bond with the amide
hydrogen of the residue four positions further along the chain, while β-pleated sheets form when
hydrogen bonds link the backbone groups of adjacent strands, which may lie far apart in the
sequence. The primary sequence largely dictates whether a segment adopts an α-helix or β-pleated
sheet conformation. Certain amino acids, such as proline, are less common in α-helices because the
rigid ring of proline leaves its backbone nitrogen without a hydrogen to donate.
Tertiary Structure: (LaPelusa & Kaushik, 2022) The tertiary structure is the overall 3D shape that
results from various interactions between R-groups and between R-groups and the backbone. These
interactions include hydrogen bonds, hydrophobic effects, van der Waals interactions, covalent
disulfide bridges, and ionic bonds. As the protein folds, nonpolar amino acids often cluster at its core,
stabilized by van der Waals forces, while hydrogen bonds and ionic interactions further solidify the
structure. The primary structure strongly influences the protein’s final shape, which is vital to its
functionality. Denaturation and renaturation experiments, such as those conducted with ribonuclease,
highlight the primary structure’s role in protein folding and stability. Proteins can undergo folding
with the assistance of chaperonin molecules, which shield them from disruptive cellular conditions,
aiding in the attainment of the correct 3D structure. Improper folding is linked to various genetic
disorders.
Quaternary Structure: (LaPelusa & Kaushik, 2022) While primary, secondary, and tertiary
structures involve single polypeptides, some proteins consist of multiple subunits held together by
interactions similar to those seen in the tertiary structure. This assembly, or quaternary structure,
allows such proteins to function cohesively as multi-unit complexes.
FASTA Tool

(P. Haritha, 2018) FASTA, released in 1988 as an enhanced version of the FASTP tool from 1985,
provides efficient comparison of protein and nucleotide sequences. It can search DNA as well as
protein sequences and assesses the statistical significance of matches. The FASTA package includes
TFASTX and TFASTY (which compare a protein query against a translated DNA library) and FASTX
and FASTY (which compare a translated DNA query against a protein database). Using a heuristic
algorithm, FASTA first identifies short identical words (k-tuples) shared by the sequences. It then
rescores the top regions with a substitution matrix such as PAM-250, joins high-scoring diagonals
while allowing gaps, and computes an optimized alignment score with a banded Smith-Waterman
algorithm. FASTA outputs four components: database information, score distribution histogram,
matched sequences with statistical data, and the aligned sequences.
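
The first stage of the FASTA heuristic, finding identical words and grouping them by diagonal, can be sketched as follows. This illustrates the idea only and is not code from the FASTA package. The function name and the default word length `ktup=2` are assumptions for the example.

```python
from collections import Counter, defaultdict


def diagonal_hits(query, subject, ktup=2):
    """Count identical words of length `ktup` shared by the two sequences,
    grouped by diagonal (subject position minus query position).

    A diagonal with many hits marks a region where the sequences match
    without gaps, which is where rescoring would be focused.
    """
    # Index every word of the query by its start position
    positions = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        positions[query[i:i + ktup]].append(i)

    # Scan the subject and record the diagonal of each identical word
    diagonals = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in positions.get(subject[j:j + ktup], ()):
            diagonals[j - i] += 1
    return diagonals
```

Calling `diagonal_hits(query, subject).most_common(1)` returns the diagonal with the most word matches, the natural candidate for rescoring with a substitution matrix.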
BLAST Tool

(P. Haritha, 2018) BLAST (Basic Local Alignment Search Tool) enables efficient comparisons between
protein or nucleotide sequences. Its variants include megaBLAST (highly similar nucleotide
sequences), BLASTN (more distant nucleotide sequences), BLASTP (protein-protein comparisons),
and BLASTX (translated nucleotide queries against protein databases), among others. PSI-BLAST
iteratively builds Position-Specific Scoring Matrices (PSSMs) to refine protein database searches.
RPS-BLAST searches a query against a database of precomputed PSSMs, and DELTA-BLAST uses
such a search to build a PSSM before searching a protein sequence database. BLAST conducts local
alignment in three phases: Setup (generating words based on the query), Preliminary Search
(scanning the database for word matches and scoring their extensions), and Traceback (producing
the final alignments with gapped extensions). Because it is heuristic, BLAST runs much faster than
full dynamic programming methods, at some cost in sensitivity, and it can report multiple local
alignments between a pair of sequences.
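
The seed-and-extend idea behind BLAST can be sketched in simplified form: index the words of the query, find exact word hits in the subject, and extend each hit without gaps until the score drops too far below its best value. This is a teaching sketch, not NCBI's implementation. The function names, word size, scores and drop-off threshold are all assumptions, and real BLAST additionally uses substitution matrices, neighbourhood words and gapped extension.

```python
def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a word of length w
    from the query occurs exactly in the subject."""
    words = {}
    for i in range(len(query) - w + 1):
        words.setdefault(query[i:i + w], []).append(i)
    return [
        (i, j)
        for j in range(len(subject) - w + 1)
        for i in words.get(subject[j:j + w], [])
    ]


def extend_ungapped(query, subject, i, j, w=3, match=1, mismatch=-1, x_drop=2):
    """Extend the seed at (i, j) in both directions without gaps.

    Extension in a direction stops once the running score falls x_drop
    below the best score seen. Returns (best_score, q_start, q_end), with
    q_end exclusive, describing the best-scoring segment of the query.
    """
    best = score = w * match
    q_start, q_end = i, i + w

    # Extend to the right of the seed
    qi, sj = i + w, j + w
    while qi < len(query) and sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        qi, sj = qi + 1, sj + 1
        if score > best:
            best, q_end = score, qi
        elif best - score >= x_drop:
            break

    # Extend to the left of the seed, starting from the best score so far
    score = best
    qi, sj = i - 1, j - 1
    while qi >= 0 and sj >= 0:
        score += match if query[qi] == subject[sj] else mismatch
        if score > best:
            best, q_start = score, qi
        elif best - score >= x_drop:
            break
        qi, sj = qi - 1, sj - 1

    return best, q_start, q_end
```

Each seed is examined independently, so several distinct local matches between the same pair of sequences can be reported.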

Bibliography
Harger, C., et al. (2000). The Genome Sequence DataBase. Nucleic Acids Research, 31-32.

Gerritsen, V. B. (2005). Protein Databases. Encyclopedia of Life Sciences, 1-7.

LaPelusa, A., & Kaushik, R. (2022). Physiology, Proteins. StatPearls Publishing.

Lewitter, A. P. (1999). Nucleotide sequence databases: a gold mine for biologists. Trends in
Biochemical Sciences, 276-280.

Oliviero Carugo, S. P. (2002). The evolution of structural databases. Trends in Biotechnology, 498-501.

Haritha, P., et al. (2018). A Comprehensive Review on Protein Sequence Analysis. International Journal
of Computer Sciences and Engineering, 1433-1442.
