Bioinformatics Reviewer Full

Bioinformatics applies computational techniques to analyze biological data across various fields such as genomics, proteomics, transcriptomics, metabolomics, and computational biology. Key concepts include genome sequencing, protein structure analysis, and metabolic profiling, which aid in understanding genetic information and biological processes. Applications range from predicting disease risks to improving agricultural practices and developing personalized medicine.

Uploaded by

jerrellempedrad3

BIOINFORMATICS

“Bioinformatics is the application of computational techniques to analyze and
interpret biological data.”

Fields of Bioinformatics

01 Genomics

“The study of genomes, the complete set of genetic material in an organism.”

02 Proteomics

“The study of the entire set of proteins produced by an organism.”

03 Transcriptomics

“The study of the transcriptome, the complete set of RNA molecules in a cell.”

04 Metabolomics

“The study of metabolites and metabolic pathways in an organism.”

05 Computational Biology

“The application of mathematics, statistics, and algorithms to biological data.”

GENOMICS: Genesis + Soma + ics

“Genomics is the study of all of a person’s genes (the genome).”

Genome: “The complete set of genes or genetic material present in a cell or
organism.”

DNA sequencing vs genome sequencing

• DNA sequencing refers to the process of determining the order of the
nucleotides (A, G, T, and C) within a DNA molecule.
• Genome sequencing determines the sequence of the entire genome.

Next-generation sequencing (NGS)

“A high-throughput sequencing technology that allows rapid sequencing of
DNA.”

Whole genome sequencing (WGS)

“Sequencing the entire genome of an organism.”

Single Nucleotide Polymorphism (SNP)

“A variation in a single nucleotide that occurs at a specific position in the
genome.”
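
As a rough illustration of the definition above, the sketch below compares a
sample sequence against a reference of the same length and reports every
single-base difference. The function name `find_snps` and both sequences are
made up for this example, and the sequences are assumed to be already aligned.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length (already aligned)")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Hypothetical sequences that differ at a single position.
reference = "ATGCGTACGTTAG"
sample = "ATGCGTACATTAG"
print(find_snps(reference, sample))  # [(9, 'G', 'A')]
```

Note that a single mismatch between two sequences is only a candidate; calling
it a polymorphism requires seeing the variant across a population.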

Sequence alignment

“The process of arranging sequences of DNA, RNA, or protein to identify regions
of similarity.”

Sequence alignments:

• Pairwise alignment – “Comparing two sequences to find the best-matching
alignment.”
• Global alignment – “Aligning entire sequences from end to end.”
• Local alignment – “Aligning regions of sequences that are similar.”
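
The difference between global and local alignment can be sketched with one
small dynamic-programming routine. This is a minimal sketch that returns only
the best score, not the alignment itself; the scoring values (match +1,
mismatch -1, gap -2) and the function name `alignment_score` are assumptions
chosen for illustration. The global mode follows the Needleman-Wunsch
recurrence and the local mode follows Smith-Waterman.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best pairwise alignment score: global (end to end) or local (best region)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for gaps at the sequence ends.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                # Local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Two hypothetical sequences sharing the core "GATTACA" with different flanks.
x, y = "TTGATTACATT", "CCGATTACACC"
print("global:", alignment_score(x, y))
print("local: ", alignment_score(x, y, local=True))
```

The local score rewards only the shared core, while the global score is also
penalized for the mismatched flanks.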

BLAST (Basic Local Alignment Search Tool)

“A tool for comparing an input sequence against a database of sequences to find
similar sequences.”

Genome annotation

“The process of identifying the locations of genes and other features within a
genome.”
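
One very simple way to picture locating candidate genes is an open reading
frame (ORF) scan. The sketch below is an assumption-laden toy, not a real
annotation pipeline: it looks only at the forward strand, treats ATG as the
only start codon, and uses a made-up minimum length. The function name
`find_orfs` is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) of forward-strand ORFs, 0-based, end exclusive.

    Each ORF runs from an ATG to the first in-frame stop codon (inclusive).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Hypothetical sequence containing one short ORF: ATG AAA TTT TAG.
sequence = "CCATGAAATTTTAGGG"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

Real annotation tools combine many more signals, such as both strands, splice
sites, and similarity to known genes.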

Gene annotation

“The process of deriving the structural and functional information of a protein or
gene from a raw data set using different analysis, comparison, estimation,
precision, and other mining techniques.”

3 basic categories of genome annotation

01 Nucleotide-level annotation – “It identifies the physical location of DNA
sequences, along with their components (i.e., genes, RNAs, and repetitive
elements).”

02 Protein-level annotation – “Determines the possible functions of genes and
their protein products, and compares their presence in other organisms.”

03 Process-level annotation – “Identifies the pathways and processes in which
different genes interact.”

Genome assembly

“The process of putting together the short DNA sequences into longer,
contiguous sequences (contigs).”
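
The idea of joining short reads into contigs can be sketched with a greedy
overlap merge. This is a toy under strong assumptions (error-free reads, one
strand, a made-up minimum overlap of 3 bases); the names `overlap` and
`greedy_assemble` are hypothetical, and real assemblers use far more robust
graph-based methods.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b, or 0."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads sampled from the sequence ATGCGTACGTTA.
print(greedy_assemble(["ATGCGT", "CGTACG", "ACGTTA"]))  # ['ATGCGTACGTTA']
```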

Applications

• Predicting Disease Risk at the Individual Level

“Example: In April 2010, Stephen Quake (a healthy individual and a
scientist) became a subject of genome sequencing and analysis at Stanford
University. Based on the analysis, he was predicted to have a 23 percent
risk of developing prostate cancer and a 1.4 percent risk of developing
Alzheimer’s.”

• Pharmacogenomics (sometimes grouped with Toxicogenomics)

“Evaluates the effectiveness and safety of drugs based on an individual’s
genomic sequence. This information allows scientists to tailor drugs to an
individual’s medical needs.”

• Metagenomics

“The study of the collective genomes of multiple species that grow and
interact in an environmental niche. Compared with pure-culture methods, it
allows scientists to identify bacterial species rapidly and to analyze the
effect of pollutants on the environment.”

• Microbial Genomics: Creation of New Biofuels

“Genomic analysis of the fungus Pichia will allow optimization of its use in
fermenting ethanol fuels. Analysis of the microbes in the hindgut of
termites has found 500 genes that may be useful in the enzymatic
breakdown of cellulose.”

• Mitochondrial Genomics

“It is the study of mitochondrial DNA (mtDNA). Human mtDNA is a
double-stranded, circular molecule of 16,569 bp that contains 37 genes
coding for two rRNAs, 22 tRNAs, and 13 polypeptides. Example: Over the past
three decades, informatics resources such as MitoCarta 2.0, MITOMAP Web, and
Mitochondrial Disease Sequence Data Resource (MSeqDR) have been
developed to improve the understanding of mitochondrial diseases.”

• Genomics in Agriculture

“Genomics in agriculture helps improve the resistance of plants and
animals to environmental stress and allows them to produce more of
their desired nutrients and vitamins. Through genome modification
techniques, scientists develop genetically modified organisms (GMOs)
such as Golden Rice, which produces more beta-carotene to help address
vitamin A deficiency in parts of Africa and Asia. Genome sequencing is also
used in the selective breeding of livestock such as cattle.”

• Genome-Wide Association Study (GWAS)

“It is a type of study in genomics that identifies genomic variants that are
statistically associated with a risk for a disease or a particular trait. This is
important not only for identifying genetic risk factors for diseases but also
for understanding the biological pathways involved. Example: In
Alzheimer’s disease, researchers found that people with certain SNPs in the
APOE gene have a higher risk of developing the disease.”
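
What "statistically associated" can mean for a single variant is sketched
below with a basic allelic chi-square test on a 2x2 table of allele counts in
cases versus controls. The counts are invented for illustration, the function
names are hypothetical, and a real GWAS adds quality control, covariates, and
correction for testing many variants at once.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele table."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][c] + table[1][c] for c in range(2)]
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = row_totals[r] * col_totals[c] / total
            chi2 += (table[r][c] - expected) ** 2 / expected
    return chi2


def p_value_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def odds_ratio(case_alt: int, case_ref: int,
               control_alt: int, control_ref: int) -> float:
    """Odds of carrying the alternate allele in cases relative to controls."""
    return (case_alt * control_ref) / (case_ref * control_alt)


# Invented allele counts: alternate vs reference allele in cases and controls.
counts = (60, 40, 40, 60)
chi2 = allelic_chi_square(*counts)
print("chi-square:", chi2)
print("p-value:   ", p_value_1df(chi2))
print("odds ratio:", odds_ratio(*counts))
```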

• Human Genome Project (HGP)

“It aimed to sequence all of the roughly 3 billion base pairs of DNA in the
human genome and was declared complete in 2003.”

PROTEOMICS

• Protein: “A molecule made up of amino acids that performs various
functions in the cell.”
• Protein structure: “The three-dimensional arrangement of atoms in a
protein.”
• Peptide: “A short chain of amino acids linked by peptide bonds.”
• Proteomics: “The study of proteomes (entire sets of proteins) – their
interactions, function, composition, and structure.”

Protein Structure Levels

• Primary Protein Structure - The linear sequence of amino acids in a protein
chain.
• Secondary Protein Structure - Local folding into patterns like alpha-helices
and beta-pleated sheets, stabilized by hydrogen bonds.
• Tertiary Protein Structure - The overall 3D shape of a single protein molecule,
formed by interactions between side chains.
• Quaternary Protein Structure - The structure formed when two or more
protein subunits come together to form a functional complex.

Mass Spectrometry (MS)

“An analytical technique used to measure the mass-to-charge ratio of ions,
widely used in proteomics.”
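
To make "mass-to-charge ratio" concrete, the sketch below computes the m/z
value expected for a molecule of a given neutral mass. It assumes
positive-mode ionization by protonation, so that m/z = (M + z × 1.007276) / z;
that ionization mode, the function name `mz`, and the example mass are
assumptions for illustration rather than details from the notes above.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons (Da)


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a molecule carrying `charge` extra protons (assumed ionization)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide with a neutral mass of 1000 Da at charge states 1 to 3.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

The same molecule therefore shows up at different m/z values depending on how
many charges it carries.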

Protein quantification

“Determining the concentration of proteins in a sample.”

Protein–Protein Interaction (PPI)

“A relationship between two or more proteins that often work together to carry
out biological functions.”

• Stable interactions: “Proteins that are purified as multi-subunit complexes,
and the subunits of these complexes can be identical or different.”
• Transient interactions: “Control the majority of cellular processes. They
require a set of conditions that promote the interaction (e.g., phosphorylation,
conformational changes, or localization to discrete areas of the cell).”

Two-dimensional gel electrophoresis (2D-GE)

“A technique used to separate proteins based on their isoelectric point and
molecular weight.”

TRANSCRIPTOMICS

• Transcriptomics: “The study of the transcriptome, the complete set of RNA
molecules in a cell.”
• Transcriptome: “The full range of messenger RNA molecules expressed by
the genome at any given time.”
• RNA sequencing (RNA-Seq): “A method used to sequence the RNA in a
sample, giving insight into gene expression.”
• Gene expression: “The process by which information from a gene is used
to create a functional gene product (usually a protein).”
• Differential expression: “Comparing gene expression levels between
different conditions, samples, or time points.”
• Alternative splicing detection: “Alternative splicing… different
combinations of exons are selectively joined or excluded from the
precursor mRNA, resulting in multiple mature mRNA transcripts from a
single gene.”
• cDNA: “A synthetic form of DNA that is complementary to mRNA, used in
gene expression studies.”
• qPCR (quantitative PCR): “A technique used to measure the amount of
specific RNA (or cDNA) in a sample.”
• ncRNAs (noncoding): “RNAs that are not translated into protein; they play
regulatory roles in diverse biological processes and diseases.”
• rRNAs (ribosomal): “Type of stable RNA that is a major constituent of
ribosomes, ensuring proper alignment of the mRNA and the ribosomes as
well as catalyzing the formation of the peptide bonds between two
aligned amino acids during protein synthesis.”
• tRNAs (transfer): “Small type of stable RNA that carries the correct amino
acid to the site of protein synthesis in the ribosome and base pairs with
the mRNA to allow the amino acid it carries to be inserted in the
polypeptide chain being synthesized.”
• snRNAs (small nuclear): “Participate in the pre-mRNA splicing reaction as a
small nuclear ribonucleoprotein (snRNP) complex.”
• mRNAs (messenger): “Short-lived type of RNA that serves as the
intermediary between DNA and the synthesis of protein products.”
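
The differential expression entry above can be illustrated with a log2 fold
change between two conditions. This is a minimal sketch with invented
expression counts and placeholder gene names; the pseudocount of 1 is an
assumption used to avoid dividing by zero, and a real analysis would add
normalization and a statistical test.

```python
import math


def log2_fold_change(treated: list[float], control: list[float],
                     pseudocount: float = 1.0) -> float:
    """log2 of (mean treated + pseudocount) / (mean control + pseudocount)."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Invented counts for three replicates per condition; gene names are placeholders.
counts = {
    "geneA": ([7, 7, 7], [1, 1, 1]),        # higher in treated samples
    "geneB": ([3, 3, 3], [3, 3, 3]),        # unchanged
    "geneC": ([0, 0, 0], [15, 15, 15]),     # lower in treated samples
}
for gene, (treated, control) in counts.items():
    print(gene, log2_fold_change(treated, control))
```

A value of 2 means roughly a fourfold increase, 0 means no change, and
negative values mean lower expression in the treated samples.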

METABOLOMICS

• Metabolomics: “The large-scale systematic study of metabolites within
biological systems, including cells, biofluids, tissues, or organisms.”
• Metabolite: “A substance that is produced or used when the body
processes food, medicines, chemicals, or its own tissues, like fat or muscle,
in a process called metabolism (National Cancer Institute, n.d.).”
• Metabolome: “The complete set of metabolites found in a cell, tissue, or
biological sample at a specific moment. It’s constantly changing, as small
molecules are always being absorbed, synthesised, and degraded, and they
interact with other molecules inside the body and with the environment.”
• Metabolic pathways: “Sequences of chemical reactions, facilitated by
enzymes, where the product of one reaction serves as the starting material
(substrate) for the next. These pathways can be categorized into two
types: Anabolic pathways (building up) and Catabolic pathways (breaking
down).”
• Targeted vs. Untargeted Metabolomics

| Feature | Targeted Metabolomics | Untargeted Metabolomics |
|---|---|---|
| Strength | High accuracy, reproducibility, and sensitivity. | Comprehensive coverage; identifies novel metabolites. |
| Limitations | Limited to known pathways; biased by prior knowledge. | Requires complex bioinformatics; lower quantification accuracy. |
| Purpose | Validating differences: specific focus on a particular metabolite or class of metabolites. | Finding differences: looking for differential metabolites between different sample groups. |
| Goal | Hypothesis-driven validation (quantitative). | Hypothesis-generating discovery (qualitative/semi-quantitative). |
| Scope | Focused on a predefined set of metabolites (e.g., 20–100). | Global analysis of all detectable metabolites (1000s), including unknowns. |
| Quantification | Optimized for specific metabolites (e.g., selective extraction). | Broad extraction to capture diverse metabolites. |

• Metabolic fingerprinting: “Metabolic fingerprinting involves analyzing
machine-generated data (e.g., NMR or MS spectra) to identify unique
chemical patterns characteristic of a given biological sample.”
• Metabolic profiling: “Metabolic profiling is the comprehensive
measurement of low-molecular-weight metabolites and their
intermediates in biological systems. This approach captures the dynamic
metabolic changes that occur in response to genetic, physiological,
pathological, or developmental influences.”

Metabolomic Technologies

• Mass spectrometry (MS): “Mass spectrometry (MS) is a tool used in
metabolomics to measure the mass-to-charge ratio of ionized small molecules
(like sugars, fats, and amino acids). This helps figure out what these
molecules are and how much of each is present in a sample.”
• Gas chromatography mass spectrometry / GC-MS: “Separates vaporized
metabolites in a carrier gas based on their chemical properties and identifies
them using mass spectrometry. Valuable for discovering disease biomarkers and
new metabolic pathways. Limited to analyzing volatile and semi-volatile
compounds.”
• Liquid chromatography mass spectrometry / LC-MS: “Works by using
liquid chromatography to sort compounds based on their chemical
properties. It is followed by mass spectrometry for identification and
measurement. Can handle non-volatile and polar compounds. More
complex and can take longer to perform.”
• Capillary electrophoresis-mass spectrometry / CE-MS: “Combines precise
separation by charge and size with accurate identification using mass
spectrometry. Suitable for analyzing charged small molecules with high
efficiency. Offers excellent separation and versatility, but it has lower
sensitivity and is limited to small metabolites.”
• NMR: “Nuclear Magnetic Resonance (NMR) is a non-destructive technique
used in metabolomics to identify and measure metabolites by detecting
the chemical environment of atoms, especially hydrogen. NMR is often
combined with other tools like mass spectrometry (MS) or
chromatography to give a more complete picture of the metabolome.”
• Chromatography: “Chromatography is a key technique in metabolomics
that separates complex mixtures based on properties like size or polarity.
It comes in several forms, including liquid chromatography (LC), gas
chromatography (GC), and ion chromatography (IC).”
• FTIR: “Fourier Transform Infrared Spectroscopy (FTIR) works by examining
how infrared radiation interacts with molecules to gather detailed
information about metabolites. It can analyze non-volatile and non-polar
compounds, but it has lower sensitivity than MS-based techniques.”

Computational Biology & Data Analysis Methods

• Computational biology: “An interdisciplinary field that uses mathematical
modeling, computer simulations, and algorithmic analysis to understand
and predict the behavior of complex biological systems.”
• Big Data Analytics: “With so much biological information available, big
data tools help organize, store, and analyze this information to uncover
useful patterns and insights.”
• Machine Learning and Deep Learning: “These methods use computers to
learn from existing data and make predictions about things like protein
structures or gene activities. Deep learning is especially good at handling
large, complex data.”
• Statistical Methods: “Statistics help find patterns and relationships in
biological data and test whether these patterns are statistically significant
rather than due to chance.”

Modeling Techniques

1. Computational Modeling: “This method uses math to create simulations of
biological systems, helping researchers study how they work and predict
what might happen in different scenarios.”
2. Multi-State Modeling: “This technique focuses on molecules that can act in
different ways, helping scientists understand how they interact in complex
systems like signaling pathways.”
3. Systems Biology Models: “These are detailed computer models that show
how different parts of a biological system work together, helping
researchers understand the overall behavior of the system.”
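
As one small example of using math to simulate a biological system, the sketch
below steps a logistic growth model, dN/dt = r · N · (1 − N / K), forward in
time with the Euler method. The choice of model, the parameter values, and the
function name `simulate_logistic` are all assumptions for illustration; the
notes above do not name a specific model.

```python
def simulate_logistic(n0: float, rate: float, capacity: float,
                      dt: float, steps: int) -> list[float]:
    """Euler simulation of logistic growth; returns the population at each step."""
    population = [n0]
    for _ in range(steps):
        n = population[-1]
        growth = rate * n * (1 - n / capacity)  # dN/dt at the current state
        population.append(n + growth * dt)
    return population


# Hypothetical scenario: 10 cells, growth rate 0.5 per hour, capacity 1000 cells.
trajectory = simulate_logistic(n0=10, rate=0.5, capacity=1000, dt=0.1, steps=400)
for hour in (0, 5, 10, 20, 40):
    print(hour, round(trajectory[hour * 10], 1))
```

Changing `rate` or `capacity` and rerunning is a simple way to explore "what
might happen in different scenarios".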

Structural Bioinformatics

“Develops and applies computational methods to the analysis, prediction, and
simulation of the three-dimensional structures of biological macromolecules
(primarily proteins and nucleic acids). Its goals are to understand how structure
dictates function, to model molecular interactions, and to guide experimental
design.”

1. Protein Folding: “The process by which a protein assumes its
three-dimensional structure from a linear chain of amino acids.”
2. Molecular Docking: “The process of simulating the interaction between
two or more molecules to predict their binding.”
3. Homology Modeling: “Predicting the three-dimensional structure of a
protein based on the known structure of a related protein.”
4. Molecular Dynamics (MD): “A computer simulation method used to model
the behavior of atoms and molecules over time.”
5. Secondary Structure: “The local folded structures within a polypeptide
chain, such as alpha-helices and beta-pleated sheets.”

Systems Biology

• Network Biology: “Studying biological systems as networks of interacting
components, such as proteins, genes, and metabolites.”
• Pathway Analysis: “Investigating the network of biochemical pathways
and their interactions in a cell or organism.”
• Gene Ontology (GO): “A framework for the representation of gene
functions, processes, and cellular components.”
• Metabolic Network: “A system of biochemical reactions within a cell,
involving metabolites and enzymes.”
• Cellular Pathways: “Series of interactions between molecules in a cell that
lead to a particular outcome, such as cell signaling or metabolic
processes.”
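
The network view described in this list can be sketched as a graph in which
nodes are components and edges are interactions. The example below stores an
undirected network as an adjacency map and ranks nodes by their number of
partners (degree). The node names P1 to P5 and their interactions are
placeholders, not real proteins, and the function names are hypothetical.

```python
from collections import defaultdict


def build_network(edges: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Build an undirected interaction network as node -> set of partners."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def top_hubs(network: dict[str, set[str]], top: int = 1) -> list[str]:
    """Nodes with the most interaction partners, ties broken alphabetically."""
    return sorted(network, key=lambda node: (-len(network[node]), node))[:top]


# Placeholder interactions between hypothetical components P1..P5.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
network = build_network(edges)
print(top_hubs(network, top=2))  # ['P1', 'P4']
```

Highly connected nodes ("hubs") are often a starting point when deciding which
components of a network to study further.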

Databases & Comparative Genomics

• GenBank: “A large database of DNA sequences maintained by NCBI.”
• UniProt: “A comprehensive database of protein sequences and functional
annotations.”
• PDB (Protein Data Bank): “A repository of three-dimensional structural
data for proteins and nucleic acids.”
• GEO (Gene Expression Omnibus): “A public functional genomics data
repository.”
• Ensembl: “A genome browser that provides access to annotated genomes
and gene-related information.”
• Phylogenetic Tree: “A diagram that represents the evolutionary
relationships between species or genes.”
• Orthologs: “Genes in different species that evolved from a common
ancestor and maintain similar functions.”
• Paralogs: “Genes within the same species that arise from gene duplication
and may evolve new functions.”
