Protein Databases 1

The document provides an overview of various protein databases, including PIR, SWISS-PROT, TrEMBL, and structure databases like PDB, SCOP, and CATH, which offer information on protein sequences, structures, and classifications. It also discusses protein pattern databases like InterPro and PROSITE, as well as metabolic pathway databases such as KEGG and Reactome, which are essential for understanding biochemical pathways. Additionally, it highlights protein-protein interaction databases like BIND, DIP, MINT, and STRING, emphasizing their applications in sequence analysis, protein structure prediction, and drug discovery.

Uploaded by

priyankamehta22012003

Protein databases

PIR

• PIR (Protein Information Resource) is a
popular protein sequence database
that provides information on
functionally annotated protein
sequences.
• PIR maintains three databases: the
Protein Sequence Database (PSD), the
Non-redundant Reference (NREF)
sequence database, and the integrated
Protein Classification (iProClass)
database. Together they contain annotated
protein sequences, classification
information, and protein family,
function, and structure information.
SWISS-PROT

• SWISS-PROT is a protein sequence database that provides a high
level of annotation, including information on a protein’s
function, domain structure, post-translational modifications,
and variants.
• Swiss-Prot is jointly managed by the SIB (Swiss Institute of
Bioinformatics) and the EBI (European Bioinformatics
Institute).
• The database distinguishes itself from other protein sequence
databases by three criteria: (i) annotations, which cover a
broad range of information, (ii) minimal redundancy, which
ensures that each sequence is represented only once, and (iii)
integration with other databases, which enables cross-
referencing and retrieval of information from related databases.
TrEMBL
• TrEMBL is a computer-annotated supplement of Swiss-
Prot. TrEMBL entries follow the Swiss-Prot format.
• It contains all the translations of EMBL (European
Molecular Biology Laboratory) nucleotide sequence
entries that have not yet been integrated into Swiss-
Prot.
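Because TrEMBL entries follow the Swiss-Prot format, one reader can handle both. The sketch below assumes the line-coded flat-file layout, in which each line starts with a two-letter code (ID, AC, DE, and so on), sequence lines follow the SQ line, and // closes the entry. The entry is a made-up toy record, not real database content, and `parse_entry` is a hypothetical helper name.

```python
# Toy entry in a Swiss-Prot-style line-coded layout (hypothetical data, illustration only).
TOY_ENTRY = """\
ID   TOY1_HUMAN              Reviewed;          20 AA.
AC   X00001;
DE   RecName: Full=Toy example protein;
OS   Homo sapiens (Human).
SQ   SEQUENCE   20 AA;
     MKTAYIAKQR QISFVKSHFS
//
"""


def parse_entry(text):
    """Collect a few fields from one flat-file entry into a dict."""
    entry = {"id": None, "accessions": [], "description": "", "sequence": ""}
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):      # end-of-entry marker
            break
        if in_sequence:                # sequence lines: residues in space-separated blocks
            entry["sequence"] += line.replace(" ", "")
            continue
        code, content = line[:2], line[5:].strip()
        if code == "ID":
            entry["id"] = content.split()[0]
        elif code == "AC":
            entry["accessions"] += [a.strip() for a in content.split(";") if a.strip()]
        elif code == "DE":
            entry["description"] += content
        elif code == "SQ":             # everything after this line is sequence
            in_sequence = True
    return entry


entry = parse_entry(TOY_ENTRY)
print(entry["id"], entry["accessions"], len(entry["sequence"]))
```

Real entries carry many more line types (references, cross-references, feature tables), so a maintained parser is the better choice for production work.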
Protein Structure Databases

Protein structure databases are collections of information
related to the three-dimensional structure and secondary
structure of proteins. Examples include:
PDB
• PDB (Protein Data Bank) is a worldwide repository of 3D
structure data on large molecules such as proteins, nucleic
acids, and other biological macromolecules.
• It stores three-dimensional structural models of
macromolecules obtained through three frequently used
experimental methods: X-ray crystallography, nuclear
magnetic resonance spectroscopy (NMR), and electron
microscopy (3DEM).
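To make the idea of a stored structural model concrete, here is a minimal sketch that reads atomic coordinates from the legacy fixed-column PDB text format. The column positions follow the published ATOM record layout. The two sample lines use made-up coordinates, and `parse_atom_records` is a hypothetical helper name.

```python
# Two ATOM records in fixed-column PDB layout (coordinates are made up for illustration).
SAMPLE = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614\n"
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842\n"
    "END\n"
)


def parse_atom_records(pdb_text):
    """Extract atom name, residue, chain, and x/y/z from ATOM/HETATM lines."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue
        atoms.append({
            "serial": int(line[6:11]),        # columns 7-11
            "name": line[12:16].strip(),      # columns 13-16
            "res_name": line[17:20].strip(),  # columns 18-20
            "chain": line[21],                # column 22
            "res_seq": int(line[22:26]),      # columns 23-26
            "x": float(line[30:38]),          # columns 31-38
            "y": float(line[38:46]),          # columns 39-46
            "z": float(line[46:54]),          # columns 47-54
        })
    return atoms


atoms = parse_atom_records(SAMPLE)
print(len(atoms), atoms[0]["name"], atoms[1]["x"])
```

Fixed columns matter here: splitting on whitespace breaks as soon as two fields touch, which happens with large coordinates or four-character atom names.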
SCOP
• SCOP (Structural Classification of Proteins) is a protein structure database
that organizes proteins of known structure according to their structural and
evolutionary relationships.
• SCOP places proteins into a hierarchy of levels: family, superfamily, fold,
and class.
• Proteins with high sequence identity or similar structure and function are
grouped into families, and families with similar structures but low
sequence identity are placed into superfamilies.
• Proteins with the same major secondary structures in the same
arrangement are placed into the same fold category, and folds are further
grouped into five structural classes (all-alpha, all-beta, alpha/beta,
alpha+beta, and multi-domain).
• In addition to these five main classes, SCOP also includes other categories
such as small proteins, membrane and cell surface proteins, coiled-coil proteins,
and those with low-resolution structures.
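The family, superfamily, fold, and class levels described above can be compared programmatically. The sketch below assumes SCOP's compact dotted identifiers of the form class.fold.superfamily.family, a convention the slides do not mention. The identifiers are used only as illustrations, and `deepest_shared_level` is a hypothetical helper name.

```python
# Letter codes for the five main classes, assuming SCOP's dotted-identifier convention.
SCOP_CLASSES = {
    "a": "all-alpha",
    "b": "all-beta",
    "c": "alpha/beta",
    "d": "alpha+beta",
    "e": "multi-domain",
}

# Order of the levels in an identifier, from most general to most specific.
LEVELS = ("class", "fold", "superfamily", "family")


def deepest_shared_level(sccs_a, sccs_b):
    """Return the most specific level two identifiers share, or None."""
    shared = None
    for level, x, y in zip(LEVELS, sccs_a.split("."), sccs_b.split(".")):
        if x != y:
            break
        shared = level
    return shared


print(deepest_shared_level("c.37.1.8", "c.37.1.20"))
print(SCOP_CLASSES["c"])
```

Two domains that agree down to the third field sit in the same superfamily but different families, which is the "similar structure, low sequence identity" case from the bullets above.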
CATH
• CATH is a database that categorizes protein domains
into hierarchical levels based on their folding patterns.
• Protein domains are classified into the CATH
hierarchy, which consists of four levels of increasing
specificity: Class, Architecture, Topology, and
Homologous Superfamily. Domains that have similar
folding patterns are grouped together at higher levels of
the hierarchy.
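A CATH classification is commonly written as four dot-separated numbers, one per level. As a rough sketch under that assumption, the code below splits such a code into named fields. The two example codes are illustrative, and `CathCode` and `parse_cath_code` are hypothetical names.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """The four CATH levels, from most general to most specific."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code):
    """Split a code such as '3.40.50.300' into its four levels."""
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels, got {len(parts)}: {code!r}")
    return CathCode(*(int(p) for p in parts))


def same_topology(a, b):
    """True when two domains agree on Class, Architecture, and Topology."""
    return a[:3] == b[:3]


first = parse_cath_code("3.40.50.300")
second = parse_cath_code("3.40.50.720")
print(first.topology, same_topology(first, second))
```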
Protein Pattern and Profile Databases

Protein pattern and profile databases contain information on motifs found
in sequences. Sequence motifs correspond to structural or functional
features in proteins, so protein sequence patterns and profiles
are valuable tools for determining the function of proteins.
InterPro
• InterPro is a database that contains information on protein families,
domains, and functional sites.
• It was created by combining several major protein signature databases,
including PROSITE, Pfam, PRINTS, ProDom, and SMART into a single
comprehensive resource.
PROSITE
• PROSITE is a collection of signatures that identify patterns or profiles in
proteins, which can provide information on their biological functions.
• The signatures in the database are linked to annotation documents that
provide information on the protein family or domain detected, including
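To illustrate how a pattern-type signature can be applied to a sequence, the sketch below translates a common subset of PROSITE-style pattern syntax into a Python regular expression. It assumes the usual conventions: `x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, and `-` between positions. The patterns are illustrative, `prosite_to_regex` is a hypothetical helper, and profiles (weight matrices) are not covered.

```python
import re


def prosite_to_regex(pattern):
    """Translate a common subset of PROSITE pattern syntax into a Python regex."""
    regex = ""
    for element in pattern.strip().rstrip(".").split("-"):
        # "<" and ">" anchor the pattern to the start or end of the sequence.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.strip("<>")
        # Separate the residue specification from an optional (n) or (n,m) repeat.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":                 # any residue
            core = "."
        elif core.startswith("{"):      # {P} means any residue except P
            core = "[^" + core[1:-1] + "]"
        # [ST] and single letters are already valid regex fragments.
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        regex += prefix + core + repeat + suffix
    return regex


motif = prosite_to_regex("N-{P}-[ST]-{P}")
print(motif, bool(re.search(motif, "AANGSAA")))
```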
Metabolic Pathway Databases
Metabolic pathway databases contain information about enzymes,
biochemical reactions, and metabolic pathways.
ENZYME
• ENZYME is a database that stores information on enzyme
nomenclature.
• It is used as the nomenclature source for enzyme names and
reactions by most metabolic databases as well as by other
biomolecular databases.
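Enzyme nomenclature is usually expressed as a four-part EC number, whose first digit gives the top-level enzyme class. The slides do not spell this out, so treat the following as a sketch under that assumption. `parse_ec_number` is a hypothetical helper name.

```python
# Top-level enzyme classes keyed by the first digit of an EC number.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def parse_ec_number(ec):
    """Split an EC number such as 'EC 1.1.1.1' into its levels and class name."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four levels in EC number: {ec!r}")
    return {"levels": parts, "class": EC_CLASSES[int(parts[0])]}


print(parse_ec_number("EC 1.1.1.1")["class"])
```

Levels are kept as strings because partially specified numbers use a dash for unknown positions.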
KEGG
• KEGG (Kyoto Encyclopedia of Genes and Genomes) is a
comprehensive database that maps out molecular and cellular
pathways involving interactions between genes and molecules.
• It is composed of pathway maps, molecule tables, gene tables, and
genome maps, and is used to build functional maps of metabolic and
regulatory pathways.
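KEGG tables can also be retrieved programmatically. The sketch below assumes the public REST interface at rest.kegg.jp and its `list` operation, which returns tab-delimited text; neither is described in the slides. The sample text only illustrates that layout, the function names are hypothetical, and `fetch_list` is defined but not called, since it needs network access.

```python
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"  # assumed base URL of the KEGG REST interface


def list_url(database, organism=None):
    """Build a 'list' request URL, e.g. all pathways for one organism code."""
    parts = [KEGG_REST, "list", database]
    if organism:
        parts.append(organism)
    return "/".join(parts)


def parse_list(text):
    """Turn tab-delimited 'identifier<TAB>description' lines into a dict."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        identifier, _, description = line.partition("\t")
        table[identifier] = description
    return table


def fetch_list(database, organism=None):
    """Download and parse a listing (requires network access; not run here)."""
    with urlopen(list_url(database, organism)) as response:
        return parse_list(response.read().decode("utf-8"))


# Sample text in the assumed response layout, for offline illustration.
SAMPLE = (
    "hsa00010\tGlycolysis / Gluconeogenesis - Homo sapiens (human)\n"
    "hsa00020\tCitrate cycle (TCA cycle) - Homo sapiens (human)\n"
)
pathways = parse_list(SAMPLE)
print(list_url("pathway", "hsa"), len(pathways))
```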
Reactome
• Reactome is an open-source, expert-curated, and peer-reviewed database
of biological reactions and pathways with cross-references to major
molecular databases.
• It provides visual representations of classical intermediary metabolism,
signaling, innate and acquired immune function, transcriptional regulation,
apoptosis, and disease processes.
• The Reactome website supports navigation of pathway knowledge and
pathway-based analysis and visualization of experimental or
computational data.
• Interaction, reaction, and pathway data are downloadable as flat files
and are also accessible through RESTful web services.
• Software tools such as Pathway Browser, Analyze Data, Species
Comparison, and Reactome FI Network are provided to support data
mining and analysis of large-scale data sets.
• The Reactome release of September 2015 contains 101,670 proteins,
74,357 complexes, 68,659 reactions, and 20,261 pathways.
Protein-Protein Interaction Databases
Protein-protein interaction databases are collections of information on the interactions between
proteins. These databases provide valuable information on the relationships between different
proteins and their functions in biological systems.
Examples of protein-protein interaction databases include:
BIND
• BIND (Biomolecular Interaction Network Database) is a database that stores detailed descriptions
of interactions, molecular complexes, and pathways between various biomolecules, including
proteins, nucleic acids, and small molecules.
• The database is designed to be used for data mining and can be used to study networks of
interactions and map pathways across different species. The database can also provide
information for kinetic simulations.
DIP
• DIP (Database of Interacting Proteins) is a database that contains protein-protein interaction
information that has been compiled through both manual curation and computational methods.
• It is useful for understanding protein functions and their relationships with other proteins. It can
also be used to study the properties of networks of interacting proteins, evaluate predictions of
protein-protein interactions, and explore the evolution of these interactions.
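Studying the properties of an interaction network usually starts by loading pairwise interactions into a graph. A minimal sketch, assuming the data has already been reduced to a list of protein pairs (the identifiers P1 to P4 are hypothetical placeholders, as are the function names):

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction graph as a dict of neighbor sets."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def degrees(network):
    """Number of distinct interaction partners for each protein."""
    return {protein: len(partners) for protein, partners in network.items()}


# Hypothetical interaction pairs; the repeated P2-P1 pair is absorbed by the sets.
PAIRS = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P2", "P1")]

network = build_network(PAIRS)
degree = degrees(network)
hub = max(degree, key=degree.get)
print(hub, degree[hub])
```

Storing neighbors in sets keeps duplicate reports of the same interaction from inflating the degree, which matters when merging records from several sources.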
MINT
• MINT (Molecular INTeraction database) stores information on functional interactions
between biological molecules such as proteins, RNA, and DNA.
• It also stores information on enzymatic modifications of partner molecules.
• The database primarily focuses on experimentally verified protein-protein interactions and
STRING

STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a biological
database and web resource of known and predicted protein–protein interactions.

The STRING database draws on numerous sources, including experimental data,
computational prediction methods, and public text collections. It is freely accessible and
regularly updated. The resource can also highlight functional enrichments in user-provided lists
of proteins, using functional classification systems such as GO, Pfam, and KEGG.
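Functional enrichment asks whether an annotation term appears in a protein list more often than chance would predict. The slides do not say which statistic STRING uses, so the sketch below shows one common choice, a one-sided hypergeometric test, purely as an assumption. The function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """P(at least `hits` annotated proteins in a random sample), hypergeometric.

    population: proteins in the background set
    annotated:  background proteins carrying the term
    sample:     proteins in the user-provided list
    hits:       listed proteins carrying the term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 5 of 10 background proteins carry the term,
# and all 5 proteins in the list carry it.
p = enrichment_p_value(population=10, annotated=5, sample=5, hits=5)
print(p)
```

When many terms are tested at once, the resulting p-values still need a multiple-testing correction before they are interpreted.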
Applications of protein databases
Protein databases have numerous applications, including:
• Protein databases can be used in sequence analysis to identify
homologous sequences and predict protein functions based on
sequence similarity.
• Protein databases can also be used for predicting protein
structure by comparing the amino acid sequence of a protein
with known structures in the database.
• Protein databases also include tools to study protein-protein
interactions.
• Protein pattern and profile databases can be used for protein
family identification by identifying conserved motifs.
• Protein databases such as metabolic pathway databases can
be used in drug discovery and disease research by studying the
metabolic pathways involved in diseases.
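The first application in the list, finding homologs by sequence similarity, rests on a similarity measure between sequences. The sketch below computes percent identity for two sequences that are assumed to be already aligned, with `-` marking gaps. The sequences are made up, `percent_identity` is a hypothetical helper, and real searches rely on alignment tools and substitution matrices rather than identity alone.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, non-gap positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:   # skip positions where either sequence has a gap
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments: four of five comparable positions match.
identity = percent_identity("MKT-AY", "MKTLAF")
print(identity)
```

Identity can be defined over different denominators (alignment length, shorter sequence, non-gap columns); this version uses non-gap columns, so values are only comparable when the same definition is used throughout.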
