Teach Me
This is a big and exciting field, so we'll break it down into manageable pieces, explaining key concepts
as we go. Think of this as a foundational course to get you started.
Let's begin!
The genome is the complete set of genetic instructions in an organism. It's essentially the
blueprint for life. Studying it is complex because it's so large.
Break it into small fragments: The entire genome is too large to analyze at once.
Amplify (make many copies of) specific fragments: This helps us get enough material
to study. We can do this outside a cell (in vitro) using a technique called PCR, or inside a
cell using cloning.
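To give a feel for why PCR yields enough material, here is a minimal sketch of the copy-number arithmetic. It assumes the idealized textbook model in which each cycle multiplies the template by (1 + efficiency); the function name and the efficiency parameter are illustrative, not part of any lab software.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of PCR cycles.

    efficiency = 1.0 is the idealized case in which every template is
    copied in every cycle (perfect doubling); real reactions fall short.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting molecule, idealized doubling versus an assumed 90% efficiency.
print(pcr_copies(1, 30))
print(pcr_copies(1, 30, efficiency=0.9))
```

Even with imperfect efficiency, a few dozen cycles turn a handful of molecules into a quantity that can be sequenced or run on a gel.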
RNA is a molecule that helps carry out the instructions encoded in DNA.
DNA is the genetic material that makes up our genes and chromosomes.
These techniques determine the exact order of nucleotides (A, T, C, G) in a DNA molecule.
F. SNP Genotyping
SNP chips/microarrays: Uses probes that bind to thousands of known SNPs across the
genome. High throughput, used in genetic mapping and association studies for disease
screening, expression profiling, and breed mapping in animals.
TaqMan Assays: A fluorescent probe-based qPCR method that targets specific SNP
alleles. Allele-specific probes have different reporter dyes. Highly specific.
RFLP (Restriction Fragment Length Polymorphism): Relies on SNPs that affect
a restriction enzyme site (a specific DNA sequence where a restriction enzyme cuts).
Involves PCR, enzymatic digestion, and gel electrophoresis to see different fragment
sizes.
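The RFLP idea can be sketched in a few lines: a SNP that destroys a restriction site changes the fragment sizes seen on the gel. The sequences below are made up for illustration; the enzyme is assumed to be EcoRI, which recognizes GAATTC and cuts after the first G.

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear DNA molecule at every site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)          # position of the cut within the site
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical PCR products for two alleles of the same locus.
allele_a = "ACGTACGTAC" + "GAATTC" + "TTGACCATGG"   # restriction site intact
allele_b = "ACGTACGTAC" + "GAGTTC" + "TTGACCATGG"   # SNP (A>G) destroys the site

print(digest(allele_a))   # [11, 15] -> two bands
print(digest(allele_b))   # [26]     -> one uncut band
```

A heterozygous individual would show all three bands (26, 15, and 11 bp in this toy case), which is how the gel pattern is read back into a genotype.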
Sequencing-based genotyping: Direct sequencing methods like Sanger or NGS can
detect SNPs.
Microsatellite genotyping: Uses STRs (Short Tandem Repeats), which are very
polymorphic (variable). Involves PCR and polyacrylamide gel electrophoresis. Used in
forensics, paternity testing, and population genetics.
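Microsatellite alleles differ in how many times the repeat unit occurs, which is what the gel separates by size. As a rough sketch, the repeat count can be read directly from a sequence; the GATA motif and flanking sequences here are invented for illustration.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Number of repeat units in the longest uninterrupted tandem run of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Two hypothetical alleles of the same STR locus with identical flanks.
allele_1 = "TTAGC" + "GATA" * 7 + "CCTGA"
allele_2 = "TTAGC" + "GATA" * 10 + "CCTGA"

print(longest_repeat_run(allele_1, "GATA"))   # 7
print(longest_repeat_run(allele_2, "GATA"))   # 10
print(len(allele_2) - len(allele_1))          # 12 bp size difference on the gel
```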
G. Bioinformatics
Bioinformatics is a field that applies computational tools to organize, analyze, and understand
molecular biological data. It's often described as an "information system for molecular biology".
The human genome is divided into two main parts: the mitochondrial genome and the nuclear
genome.
A. Mitochondrial vs. Nuclear Genomes
Genes: Contains 13 protein-coding genes (all for ATP production), 22 tRNA genes, and
2 rRNA genes.
Features: It's very compact, with no introns and common gene overlap. It relies on
about 1700 nuclear genes for its function.
Control region: Includes the D-loop, a highly variable region used in forensic tracking.
The nuclear genome contains chromosomes with varying gene density (some regions are "gene
deserts," others "gene clusters").
Main Classes:
o rRNAs (ribosomal RNAs): Essential for protein synthesis.
o tRNAs (transfer RNAs): Carry amino acids during protein synthesis.
Short ncRNAs:
o snRNAs (small nuclear RNAs): Involved in RNA splicing.
o miRNAs (microRNAs):
Definition: Small (18-25 nucleotides), non-coding RNAs that regulate
gene expression post-transcriptionally.
Biogenesis:
1. Transcribed by RNA Pol II into a primary-miRNA (pri-miRNA).
2. Processed by the Drosha enzyme in the nucleus into a pre-
miRNA (a 70-nucleotide hairpin structure).
3. Exported to the cytoplasm by Exportin-5.
4. Further processed by the Dicer enzyme into a mature miRNA
duplex.
5. One strand (the guide strand) is loaded into the RNA-induced
silencing complex (RISC), while the other strand is degraded.
Function: The miRNA-RISC complex binds to complementary
sequences, typically in the 3' untranslated region (3' UTR) of target
mRNAs. This binding can lead to mRNA degradation (if perfectly
complementary) or repression of translation (if partially complementary).
Impact: One miRNA can target hundreds of mRNAs, and one mRNA can
be regulated by multiple miRNAs.
Relevance: Dysregulated miRNAs are linked to diseases like cancer,
cardiovascular disease, and neurodegenerative disorders. They are also
used as biomarkers in blood or tissues and have therapeutic
potential (e.g., miRNA mimics or inhibitors).
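The targeting rule described under "Function" can be illustrated with a toy classifier. This is only a sketch: the miRNA sequence is made up, and using nucleotides 2-8 (the so-called seed region) as the stand-in for "partial complementarity" is an assumption added here, not something stated in these notes. Real target prediction tools use many more features.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (A-U and G-C pairing only)."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_target(mirna: str, utr: str) -> str:
    """Toy prediction of the outcome for an mRNA with the given 3' UTR."""
    full_site = reverse_complement(mirna)         # pairs with the whole miRNA
    seed_site = reverse_complement(mirna[1:8])    # pairs with nt 2-8 only (assumed rule)
    if full_site in utr:
        return "mRNA degradation"
    if seed_site in utr:
        return "translational repression"
    return "no predicted site"


mirna = "UACGGAUCCGAUUAGCCAUGGA"   # hypothetical 22-nt miRNA
utr_perfect = "AAAA" + reverse_complement(mirna) + "GGGG"
utr_partial = "AAAA" + reverse_complement(mirna[1:8]) + "CCCC"

print(classify_target(mirna, utr_perfect))
print(classify_target(mirna, utr_partial))
print(classify_target(mirna, "AAAAAAAAAAAA"))
```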
Long ncRNAs (lncRNAs):
o Definition: Non-coding RNAs longer than 200 nucleotides with low protein-
coding potential.
o Types: Include intergenic, intronic, and antisense lncRNAs.
o Function: Regulate transcription, chromatin structure, and epigenetic states.
Heterochromatin/Tandem Repeats:
o Located in centromeres (constricted regions of chromosomes)
and telomeres (chromosome ends).
o Generally not transcribed.
o Types include satellites, minisatellites, and microsatellites.
Transposons (Interspersed Repeats): "Jumping genes" that can move around the
genome.
o Class I (Retrotransposons): "Copy and paste" mechanism (transcription to RNA,
then reverse transcription back to DNA, then integration).
LINEs (Long Interspersed Nuclear Elements): Autonomous (can move
themselves).
SINEs (Short Interspersed Nuclear Elements): Non-autonomous (rely
on LINEs for movement).
o Class II (DNA Transposons): "Cut and paste" mechanism.
o Impact: Important for structural integrity and genome evolution but can be
mutagenic.
Gene regulation controls when, where, and how much a gene is expressed. This ensures that
genes are turned on or off at the right time (temporal control) and in the right place (spatial
control), leading to tissue-specific functions, developmental stages, and responses to
environmental signals.
Chromatin: The complex of DNA, histones (proteins around which DNA is wrapped),
and other proteins.
o Euchromatin: "Open" and transcriptionally active (genes can be expressed).
o Heterochromatin: "Condensed" and transcriptionally inactive (genes are
silenced).
o Remodeling: Changes in chromatin structure can increase or decrease DNA
accessibility to promoters, thereby aiding or hindering RNA polymerase binding
and gene expression.
TADs (Topologically Associating Domains): 3D chromatin domains that help bring
enhancers (DNA sequences that boost gene expression) close to promoters.
o Delineated by boundary proteins/elements: These act as borders between
domains and prevent the spread of heterochromatin or prevent enhancers from
activating unintended genes.
o Importance: Crucial for gene regulation and often conserved across species.
Their disruption is linked to rare diseases.
Gene expression is controlled at multiple levels, from changes in DNA structure to protein
processing.
Genetic variation refers to the differences in DNA sequences among individuals. These
variations are the raw material for evolution and can also cause disease.
Mutations, which are changes in the DNA sequence, are the ultimate source of genetic variation.
1. Replication Errors: DNA polymerase, the enzyme that copies DNA, can sometimes
mispair bases. While DNA mismatch repair systems fix many of these, some errors
persist.
2. Replication Slippage: Occurs typically in STRs (short tandem repeats), leading to
insertions or deletions (indels) of repeated units.
3. Chromosome Segregation/Recombination Errors: Mistakes during cell division
(meiosis or mitosis) can cause large-scale chromosomal changes like inversions,
deletions, translocations. For example, nondisjunction (failure of chromosomes to
separate properly) causes aneuploidy (an abnormal number of chromosomes), as seen in
Down, Turner, and Klinefelter syndromes.
4. Endogenous DNA Damage: Spontaneous damage to DNA from within the cell, such as
loss of bases, reactive oxygen species (ROS), or deamination (e.g., cytosine changing to
uracil, or 5-methylcytosine changing to thymine).
5. External Mutagens: Agents from the environment that damage DNA, such as ionizing
radiation (causes DNA breaks), UV radiation (forms thymine dimers), and hydrocarbons
from smoke or pollution.
B. Types of Mutations
Balanced mutations involve no net gain or loss of DNA but can still disrupt genes. Unbalanced
mutations result in a change in DNA copy number and often cause disease.
Mutations can have various effects on gene function and protein production:
Loss of Function:
1. Nonsense Mutation: Changes a codon (three-nucleotide sequence that codes for
an amino acid) into a premature stop codon, leading to a truncated (shorter) or
non-functional protein. If the stop codon occurs early in the coding region, it can
trigger nonsense-mediated mRNA decay (a quality control mechanism that
degrades faulty mRNA).
2. Missense Mutation: A single-nucleotide substitution that changes one amino acid in
the protein. Substitutions in coding sequence are usually graded by their effect:
Synonymous: No amino acid change (due to redundancy in the genetic
code), so strictly not a missense change; usually low risk.
Tolerated: Chemically similar amino acid substitution, often mild or no
effect.
Not Tolerated: Major disruption in protein function, high risk.
3. Splice Site Mutation: Occurs at the boundaries between introns and exons,
disrupting the normal splicing process. This can lead to exon skipping, the
creation of new exons, or enlarged exons.
4. Frameshift Mutations: Caused by indels (insertions or deletions not in multiples
of three) that shift the reading frame of the gene. This usually leads to an early
stop codon and a truncated or non-functional protein.
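The mutation classes listed above can be told apart mechanically by translating the reference and mutated coding sequences. The sketch below uses the standard genetic code and a made-up 15-nucleotide gene; it covers only the simple cases named in these notes (it does not handle splice-site or stop-loss variants).

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify(ref_cds: str, alt_cds: str) -> str:
    """Classify a variant by comparing reference and mutated coding sequences."""
    if (len(alt_cds) - len(ref_cds)) % 3 != 0:
        return "frameshift"
    if len(alt_cds) != len(ref_cds):
        return "in-frame indel"
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if ref_protein == alt_protein:
        return "synonymous"
    if len(alt_protein) < len(ref_protein):
        return "nonsense"
    return "missense"


reference = "ATGGAAAAATGGTAA"                 # hypothetical gene: M E K W stop
print(classify(reference, "ATGGAGAAATGGTAA"))  # GAA>GAG, still E
print(classify(reference, "ATGGTAAAATGGTAA"))  # GAA>GTA, E>V
print(classify(reference, "ATGGAAAAATGATAA"))  # TGG>TGA, premature stop
print(classify(reference, "ATGGAAAATGGTAA"))   # one base deleted
```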
Gain of Function:
1. Structural Rearrangements: Can lead to chimeric genes (combinations of parts
from two or more distinct genes) or ectopic expression (a gene being expressed
in a cell or tissue where it's not normally active). For example, enhancers
(regulatory DNA elements) might be placed next to oncogenes, leading to their
overexpression.
2. RNA Toxicity: Usually results from unstable repeat expansions where
abnormally long RNA transcripts are produced. These long RNAs can form stable
structures (hairpins) that trap RNA-binding proteins, disrupting the processing of
other genes (e.g., by causing mis-splicing). Seen in diseases like myotonic
dystrophy.
3. Unstable Repeat Expansions: Trinucleotide or multibase repeats expand beyond
a certain threshold.
4. Gene Duplications: Lead to more transcripts and potentially more protein, as
seen with IGF2 duplication.
5. Missense Mutations: Can change the protein's function, sometimes creating
dominant oncogenes.
Cells have sophisticated systems to fix DNA damage and replication errors.
1. Karyotyping: Visualizing the full set of chromosomes in a cell. Often uses G banding to
detect large-scale changes. Useful for balanced abnormalities.
2. Chromosome Painting: Uses fluorescently labeled DNA probes that bind to specific
chromosomes, revealing translocations or fusions.
3. FISH (Fluorescence In Situ Hybridization): Uses fluorescently dyed DNA probes to
bind to specific DNA sequences on chromosomes. Detects translocations, inversions,
duplications, deletions, and CNVs.
4. Comparative Mapping: Aligning genes or markers across species to identify conserved
synteny (blocks of genes with the same order). Uses genome browsers, sequence
alignments (like BLAST), and sequencing data.
5. CGH Arrays (Comparative Genomic Hybridization arrays): Detect copy number
differences, including those smaller than about 5 Mb (megabases) that conventional
karyotyping typically misses.
6. Whole Genome Sequencing (WGS): Can detect SNVs, indels, and structural variations.
7. Hi-C-seq: Maps 3D chromatin interactions, useful for studying TADs.
I. Comparative Genomics
This field studies the genetic similarities and differences across species.
K. Ethics in Genetics
Dog breeding.
Genetic testing of humans.
Gene editing in humans and animals.
Mendelian characters (or traits) are those whose inheritance patterns can be explained by the
laws of Mendelian genetics, typically involving a single gene.
B. Genetic Heterogeneity
This refers to situations where different genetic causes lead to the same phenotype (observable
trait).
1. Allelic Heterogeneity: Different mutations within the same gene cause the same
phenotype (e.g., over 12 mutations in the CFTR gene cause cystic fibrosis).
2. Locus Heterogeneity: Mutations in different genes cause the same phenotype (e.g.,
retinitis pigmentosa from mutations in over 16 genes).
3. Clinical Heterogeneity: Mutations in the same gene can lead to different phenotypes
(e.g., different dystrophin mutations cause Duchenne or Becker muscular dystrophy).
D. Genetic Mapping
Purpose: To locate genes responsible for traits (e.g., disease loci), determine the relative
positions of genes/markers on chromosomes, understand recombination patterns, and aid
breeding programs.
Genetic Markers: Polymorphic (variable) DNA sequences used to trace inheritance.
Types include SNPs, microsatellites/STRs, and RFLPs. They should be polymorphic,
stable, and easy to genotype.
Genetic Maps: Show the relative positions of markers based on recombination rates. 1
cM (centimorgan) equals 1% recombination frequency.
Fine Mapping: High-resolution mapping to narrow down the location of a disease-
causing mutation using dense marker panels and recombination events. Examples include
GWAS (Genome-Wide Association Studies) and linkage analysis.
E. Linkage Mapping
Main Concepts:
o Identifies disease loci by examining the inheritance of traits with nearby genetic
markers.
o Based on the principle that loci located close together on the same chromosome
tend to be inherited together because recombination between them is less likely.
o Unlinked genes: On different chromosomes or far apart on the same
chromosome; they assort independently.
o Linked genes: Close together on the same chromosome; usually inherited
together, with a lower recombination frequency.
o Recombination frequency: The percentage of offspring with recombinant
genotypes. A low percentage indicates loci are close together. The maximum
recombination frequency for unlinked loci is 50%.
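As a small worked example of these concepts, recombination frequency can be computed from offspring counts and converted to map distance using the rule that 1% recombination corresponds to 1 cM. The counts are invented, and the linear conversion is only a reasonable approximation for closely spaced loci.

```python
def recombination_frequency(parental: int, recombinant: int) -> float:
    """Fraction of offspring that carry a recombinant genotype."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one offspring is required")
    return recombinant / total


def map_distance_cm(rf: float) -> float:
    """Approximate map distance: 1% recombination is treated as 1 cM."""
    return 100.0 * rf


# Hypothetical cross: 180 parental-type and 20 recombinant offspring.
rf = recombination_frequency(parental=180, recombinant=20)
print(rf, map_distance_cm(rf))
print("linked" if rf < 0.5 else "unlinked (independent assortment)")
```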
LOD Score (Logarithm of Odds): A statistical measure used to estimate the likelihood
of linkage versus no linkage.
o Formula: log₁₀ (likelihood of linkage / likelihood of no linkage).
o Interpretation:
LOD > 3: Significant evidence for linkage (odds of 1000:1 in favor of
linkage).
LOD < -2: Evidence against linkage.
Values in between are inconclusive.
o The peak of a LOD score curve indicates the most probable recombination
frequency.
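The LOD formula can be made concrete for the simplest textbook case: phase-known meioses in which each offspring is scored as recombinant or non-recombinant. Under that assumption the likelihood of linkage at recombination fraction theta is theta^r * (1 - theta)^(n - r), and the likelihood of no linkage uses theta = 0.5. The counts below are hypothetical, and real linkage software handles far more complicated pedigrees.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known meioses at recombination fraction theta."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1.0 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# Hypothetical family data: 2 recombinants among 20 informative meioses.
thetas = [i / 100 for i in range(1, 51)]
best_theta = max(thetas, key=lambda t: lod_score(2, 20, t))
print(best_theta, lod_score(2, 20, best_theta))
```

Scanning theta and taking the maximum reproduces the "peak of the LOD curve": here it falls at the observed recombinant fraction (2/20), and the LOD at that peak exceeds the threshold of 3.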
Limitations: Requires large, informative families. Less useful for complex traits. Needs
clear phenotype classification and has limited resolution.
WGS is considered the "gold standard" for identifying causative mutations in monogenic
(single-gene) traits.
Analysis of NGS Data: NGS technologies (like Illumina HiSeq) produce billions of short
sequence reads, resulting in massive datasets. This requires automated pipelines, high-
performance computing, and systematic data processing.
Typical Workflow (Pipeline):
1. Quality Control: Raw sequence reads are trimmed to remove low-quality bases, and
adapter sequences are removed, so that sequencing artifacts do not carry into later steps.
2. Alignment: Reads are aligned (mapped) to a reference genome to identify their
original location. This creates SAM/BAM files.
3. Variant Detection: Genetic variants (like SNPs and indels) are identified by
comparing the aligned reads to the reference genome.
4. Variant Annotation and Filtering: Variants are annotated with their potential
functional consequences, and irrelevant variants are filtered out to prioritize those
likely to affect gene function or cause disease.
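Steps 1 to 3 of this workflow can be mimicked at toy scale in pure Python. This is a teaching sketch only: the reference, reads, thresholds, and brute-force aligner are all invented for illustration, whereas real pipelines use dedicated tools for trimming, alignment, and variant calling.

```python
from collections import Counter

REFERENCE = "ACGTACGTTAGCCGATAGCT"   # hypothetical 20-bp reference


def trim_low_quality(read: str, qualities: list[int], min_quality: int = 20) -> str:
    """Step 1 (QC): drop bases from the 3' end while their quality is too low."""
    end = len(read)
    while end > 0 and qualities[end - 1] < min_quality:
        end -= 1
    return read[:end]


def align(read: str, reference: str, max_mismatches: int = 1):
    """Step 2: ungapped brute-force alignment; returns the best offset or None."""
    best = None
    for offset in range(len(reference) - len(read) + 1):
        mismatches = sum(a != b for a, b in zip(read, reference[offset:]))
        if mismatches <= max_mismatches and (best is None or mismatches < best[1]):
            best = (offset, mismatches)
    return None if best is None else best[0]


def call_variants(reads, reference, min_depth=3, min_fraction=0.8):
    """Step 3: pile up aligned bases and report well-supported non-reference alleles."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        offset = align(read, reference)
        if offset is None:
            continue
        for i, base in enumerate(read):
            pileup[offset + i][base] += 1
    variants = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        base, support = counts.most_common(1)[0]
        if base != reference[pos] and support / depth >= min_fraction:
            variants.append((pos, reference[pos], base))   # 0-based position
    return variants


# Simulated sample with an A>G substitution at 0-based position 9.
sample = REFERENCE[:9] + "G" + REFERENCE[10:]
raw_reads = [(sample[i:i + 8] + "NN", [30] * 8 + [2, 2]) for i in (2, 4, 6, 8)]
clean_reads = [trim_low_quality(seq, quals) for seq, quals in raw_reads]
print(call_variants(clean_reads, REFERENCE))   # [(9, 'A', 'G')]
```

Step 4 (annotation and filtering) would then take each reported variant and ask what it does to the gene, for example with a codon-level comparison of reference and mutated sequences.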
Data Handling: Large datasets are processed using multi-node computer clusters or
cloud platforms for parallel processing and distributed storage.
Translating Data to Biological Insight: The ultimate goal is to identify a single
causative mutation or meaningful pattern. For example, in the case of PRA (Progressive
Retinal Atrophy) in dogs, filtering millions of variants led to identifying one causative
mutation, allowing for a genetic test to be developed.
WGS Workflow (Detailed):
1. Start with genomic DNA.
2. Sonication: DNA is fragmented into smaller pieces using sound waves.
3. End Repair: Fragment ends are prepared (made blunt or compatible).
4. Adapter Ligation: Sequencing adapters are attached to the DNA fragments.
5. PCR Amplification (optional, depending on sequencing platform).
6. Size Selection: DNA fragments of appropriate size are selected.
7. Sequencing.
B. Key Concepts
C. Example: Obesity
Definition: Abnormal or excessive fat accumulation that impairs health, often classified
by BMI (Body Mass Index).
Causes (Multifactorial): Genetic predisposition interacts with environmental/lifestyle
factors (diet, activity), hormonal/metabolic dysregulation, altered gut-brain
axis/microbiome, and neuroendocrine regulation (e.g., leptin, ghrelin).
Consequences (Comorbidities): Type 2 diabetes, non-alcoholic fatty liver disease
(NAFLD), cardiovascular disease, cancer, etc.
Treatments:
1. Lifestyle Changes: Diet, physical activity, behavioral therapy (first line, but high
relapse).
2. Pharmacotherapy: Drugs like GLP-1 receptor agonists (e.g., semaglutide) to
increase satiety and delay gastric emptying.
3. Surgery: For severe cases (e.g., bariatric procedures like gastric bypass).
Preclinical Drug Discovery:
o Involves identifying target genes, confirming that modifying them affects disease
outcomes, finding molecules that interact with the gene, optimizing potency, and
then assessing safety and efficacy in animal models before human clinical trials.
o Animal Models: Pigs (like the Gottingen Minipig) are good models for obesity
due to their similar metabolic profile to humans.
o Case Study (Gottingen Minipig Model): Demonstrated that semaglutide reduced
food intake and weight gain while preserving lean mass and maintaining energy
expenditure.
Principle: Scans the entire genome to identify genetic variants (typically SNPs) that
are associated (co-occur) with a particular phenotype or disease. It relies on linkage
disequilibrium (LD), the non-random co-inheritance of nearby alleles, to find SNPs linked
to disease-causing variants.
Causes of Association: Direct causation, natural selection, epistatic effects (gene-gene
interaction), population stratification (differences in allele frequency due to ancestry), or
Type 1 error (false positives).
Process: Uses SNP markers, assuming that a disease-causing allele is inherited along
with neighboring alleles in haploblocks.
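How strongly a marker SNP "travels with" a neighboring allele is commonly summarized by the LD statistic r², computed from haplotype frequencies as D² / (pA(1 - pA) pB(1 - pB)), where D = pAB - pA·pB. The sketch below assumes phased haplotype counts are already available; the counts are invented.

```python
def ld_r_squared(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """r-squared between two biallelic loci from phased haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total          # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total          # frequency of allele B at locus 2
    d = n_AB / total - p_A * p_B         # disequilibrium coefficient D
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denominator


print(ld_r_squared(50, 0, 0, 50))     # alleles always inherited together
print(ld_r_squared(25, 25, 25, 25))   # alleles combine at random
print(ld_r_squared(40, 10, 10, 40))   # intermediate LD
```

An r² near 1 means a genotyped SNP is a good proxy for an untyped causal variant in the same haploblock; an r² near 0 means it carries almost no information about it.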
The "Hidden Heritability" Problem (more commonly called "missing heritability"): GWAS
often explains only a small proportion of the heritability estimated from family studies.
o Possible causes: Small effect sizes missed due to lack of statistical power, gene-
gene or gene-environment interactions, structural or rare variants not captured, or
epigenetic/non-additive genetic architecture.
Multiple Testing Problem: Testing millions of SNPs increases the chance of false
positives (Type 1 error).
o Correction: Statistical corrections like Bonferroni correction (very
conservative), permutation tests, or False Discovery Rate (FDR) control are
used. P-values are used to create Manhattan plots to highlight significant
associations.
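The two most common corrections can be sketched directly. This is a minimal illustration with made-up p-values; the False Discovery Rate procedure shown is the Benjamini-Hochberg step-up method, which is an assumption about which FDR method is meant.

```python
def bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Significant if p <= alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            max_rank = rank                 # largest rank that passes its threshold
    significant = [False] * m
    for index in order[:max_rank]:
        significant[index] = True
    return significant


# Hypothetical p-values for 10 SNPs.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(bonferroni(p)))           # 1 SNP survives Bonferroni
print(sum(benjamini_hochberg(p)))   # 2 SNPs survive FDR control
```

With one million tests and alpha = 0.05, the Bonferroni threshold becomes 5e-8, which is the conventional genome-wide significance line drawn on Manhattan plots.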
Limitations:
o Doesn't directly identify the causal variant.
o Less power to detect rare variants, epistasis, or environmental interactions.
o Often identifies variants in non-coding regions, making biological interpretation
difficult.
o Can be confounded by population stratification and challenges in defining
phenotypes.
Outcome: Identification of significant SNPs associated with a phenotype, visualized in
Manhattan plots.
Follow-up Steps: Identify haplotype blocks, sequence candidate regions to find causal
variants, and conduct functional studies (e.g., using TaqMan validation) to determine
biological effects.
Example (MMVD, myxomatous mitral valve disease, in Cavalier King Charles Spaniels):
GWAS identified SNPs in LD with a causal variant, and subsequent genotyping pinpointed
a HYAL4 variant associated with the disease.
7. Biomarkers
A biomarker is a measurable characteristic that indicates normal biological processes,
pathogenic (disease) processes, or a pharmacological response to a treatment.
A. Types of Biomarkers
1. Diagnostic: Defines the presence or type of a disease (e.g., PSA for prostate cancer).
2. Prognostic: Predicts disease outcome (e.g., HER2 in breast cancer).
3. Predictive: Indicates the likelihood of response to a specific treatment.
4. Mechanistic: Provides insight into the molecular mechanisms of a disease or drug.
5. Safety: Indicates adverse or toxic responses to a drug.
B. Examples of Biomarkers
D. MicroRNAs as Biomarkers
Main Features: Small, non-coding RNAs (18-25 nt) that regulate gene expression by
binding to target mRNA at the 3' UTR, causing degradation or translational repression.
They are highly conserved across species.
Biogenesis (as explained in Section 2.D): Transcription to pri-miRNA, Drosha
processing to pre-miRNA, export to cytoplasm, Dicer processing to mature miRNA
duplex, RISC loading, and then function by targeting mRNA.
Functions/Circulating miRNAs:
o Regulate key processes in development, immune response, inflammation, and
cancer.
o Circulating miRNAs are highly stable extracellularly (e.g., in blood, urine,
saliva, CSF) because they are protected in vesicles (exosomes) or bound to
proteins. This stability makes them useful as non-invasive biomarkers (e.g.,
miRNA-21 in breast cancer).
Therapeutic Use:
o miRNA mimics: Enhance the expression of downregulated miRNAs.
o AntagomiRs/anti-miRs: Inhibit overexpressed miRNAs.
o Clinical trials are ongoing for miRNA-targeted therapies in various diseases.
o Examples: miRNA-21 upregulation linked to trastuzumab resistance in breast
cancer; miRNA-27a associated with GI cancer progression in dogs.
1. RT-qPCR: The "gold standard" for miRNA detection; sensitive, quantitative, but
challenging due to miRNA's short length and lack of a polyA tail.
2. Small RNA-seq: Provides global, unbiased profiling but is less sensitive than qPCR for
low-abundance miRNAs.
3. ISH (In situ hybridization): Used for spatial localization of miRNAs in tissues, often
with LNA (locked nucleic acid) probes.
4. Microarrays: High-throughput, but less common now due to RNA-seq.
5. Digital PCR: Very quantitative and useful for low RNA input scenarios.
A. Evolution of Cancer
Cancer follows a Darwinian selection model at the cellular level: Random mutations arise, and
those that give cells a growth or survival advantage become dominant. It's a multistage,
multistep process, moving from normal cells to abnormal proliferation (dysplasia) and then to
invasive cancer.
B. Hallmarks of Cancer
These are capabilities that all cancers must acquire to grow and spread:
1. Self-sufficiency in growth signals: Cancer cells don't need external signals to grow.
2. Insensitivity to anti-growth signals: They ignore signals that tell normal cells to stop
growing.
3. Evasion of apoptosis: They avoid programmed cell death, even when damaged.
4. Limitless replicative potential: They can divide indefinitely.
5. Sustained angiogenesis: They stimulate the formation of new blood vessels to supply
nutrients and oxygen.
6. Tissue invasion and metastasis: They can spread from their original location to other
parts of the body.
Epigenetic alterations: DNA methylation and histone modifications are reversible, and
drugs targeting them can reactivate silenced tumor suppressors or silence activated
oncogenes.
Stress-induced adaptations: Cancer cells can adapt to hostile conditions (e.g., acidosis,
hypoxia) and therapy stress without new mutations, enhancing their survival.
lncRNAs (long non-coding RNAs): Can regulate chromatin, transcription, and cell fate,
playing a role in cancer initiation (e.g., LINC00673 in pancreatic models).
F. Genomic Instability
Cancer cells often exhibit genomic instability, meaning their genomes are unstable, leading to
high mutation rates, chromosomal rearrangements, and defects in DNA repair pathways.
Cell plasticity is the ability of differentiated cells to change their identity, dedifferentiate (go
back to a less specialized state), or transdifferentiate (change into another cell type) without
needing mutations, typically under stress or injury.
Role in Cancer:
o Enables cancer cells to adapt to therapies and hostile environments.
o Promotes metastasis (spread of cancer) via EMT (Epithelial-Mesenchymal
Transition), a process where epithelial cells lose their cell-cell adhesion and gain
migratory properties.
o Facilitates drug resistance even without new mutations.
Pancreatic Cancer Example: Acinar-to-ductal metaplasia (ADM), where enzyme-
producing acinar cells become duct-like cells, can be driven by injury and oncogenic
KRAS mutations. This ADM can be a route to Pancreatic Ductal Adenocarcinoma
(PDAC). Transcription factors like Sox9 and Sox4 are involved in plasticity and
neoplastic transformation.
The TME is the complex environment surrounding a tumor, consisting of various cell types,
extracellular matrix, and signaling molecules. It profoundly shapes tumor behavior, progression
(angiogenesis, metastasis), and resistance to treatment.
Components:
o CAFs (Cancer-Associated Fibroblasts): Secrete growth factors, cytokines, and
extracellular matrix components that support tumor growth and immune evasion.
o Immune cells: Certain immune cells (e.g., tumor-associated macrophages,
myeloid-derived suppressor cells) can suppress anti-tumor immune responses.
o Endothelial cells: Support angiogenesis (formation of new blood vessels) to feed
the tumor.
o Extracellular Matrix (ECM): Provides structural support and biochemical
signals.
Pancreatic Cancer TME: Characterized by a dense stroma and an immunosuppressive
microenvironment, posing significant obstacles to treatment.
Concept: Uses an individual patient's genetic and molecular tumor profiles to tailor
treatment.
Inherited Mutations: Mutations in key tumor suppressor genes can lead to hereditary
cancer syndromes (e.g., BRCA1/2 for breast/ovarian cancer, TP53 for Li-Fraumeni, APC
for familial adenomatous polyposis).
Examples of Targeted Therapies:
o BCR-ABL fusion (in chronic myeloid leukemia) treated with Imatinib (a
tyrosine kinase inhibitor).
o HER2 amplification (in breast cancer) treated with Trastuzumab.
Requirements: Requires integrating data from whole-genome/exome sequencing,
transcriptomics (RNA-seq), and epigenetic profiling.
Advantages: Higher efficacy, fewer side effects, and targeting the "driver mutations"
(those that cause cancer progression) rather than just symptoms.
PCR-based Methods: For mutation detection (e.g., TaqMan for SNPs) and in veterinary
genetic testing (e.g., NPHP4 mutation in dogs).
Whole Genome Sequencing (WGS) / Whole Exome Sequencing (WES): Detects point
mutations, indels, and structural variants, helping identify driver vs. passenger mutations.
RNA-seq: Measures gene expression and identifies gene fusions and splice variants.
Epigenetic Assays:
o Methylation analysis: Bisulfite sequencing, MeDIP-seq.
o ChIP-seq: Maps histone modifications.
Reverse Phase Protein Arrays (RPPA): Profiles protein and phosphorylation levels.
Fluorescence-based genotyping (Microsatellites/STRs): Used in paternity testing and
population genetics.
Multiplex PCR Panels: Co-amplify multiple microsatellites in one reaction, used in
paternity and forensics.
Next-Generation Sequencing in Forensics: Uses STRs and SNPs for identity, ancestry,
and age estimation (e.g., with Ion Torrent and Illumina platforms).
Transgenic Animals: Animals that have been genetically modified to carry foreign
DNA (a "transgene") integrated into their own genome, and this transgene is passed on to
their offspring.
Chimeric Organism: An organism made from cells derived from two or more different
zygotes (early embryos).
1. Insertion into Germ Cells: This ensures the genetic modification is present in all cells of
the animal, including its reproductive cells, so it can be passed to the next generation.
o Pronuclear Injection: DNA is directly injected into the pronucleus (the haploid
nucleus of the sperm or egg) of a fertilized oocyte (egg cell) before the sperm and
egg nuclei fuse.
o Gene Transfer into Early Embryos or Gametes: DNA is introduced at very
early embryonic stages or into the gametes themselves.
o Somatic Cell Nuclear Transfer (SCNT): The nucleus from a modified adult cell
is transferred into an enucleated (nucleus-removed) egg cell. This was the
technique used to clone Dolly the sheep.
2. Gene Targeting (Precise Modification): Uses homologous recombination to introduce
a specific mutation or gene insertion at a desired location in the genome. This is done
using a plasmid (or other DNA template) with "homology arms" that match the target
gene sequence. This allows for knockout (inactivating a gene) or knockin (inserting a
gene or specific mutation).
o Backcrossing: After initial modification, animals are often backcrossed with
wild-type animals many times to ensure that the only genetic difference is the
targeted gene modification.
o Key Protein-based Techniques for Gene Editing: These all create a double-
strand break (DSB) at a specific DNA target site, which the cell then repairs.
Zinc Finger Nucleases (ZFNs): Engineered proteins with DNA-binding
domains (zinc fingers, each recognizing about 3 bp of DNA) and a
nuclease domain (FokI) that cuts DNA. Two ZFNs bind on opposite sides
of the target site, and their FokI domains dimerize to make a DSB.
TALENs (Transcription Activator-Like Effector Nucleases): Similar to
ZFNs, but the DNA-binding domains are derived from bacterial proteins
(TAL effectors). Two TALENs bind to sequences, and their FokI domains
dimerize to create a DSB in the spacer region between their binding sites.
Used in CAR T-cell therapy.
CRISPR-Cas9 (Clustered Regularly Interspaced Short Palindromic
Repeats-CRISPR-associated protein 9): This system uses a guide RNA
(gRNA) that directs the Cas9 nuclease enzyme to a specific DNA
sequence, which must be next to a PAM (Protospacer Adjacent
Motif) sequence. Cas9 then cuts the DNA, creating a DSB. CRISPR can be
used for selective reproduction to introduce or remove specific traits.
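The PAM requirement is easy to see in code. The sketch below scans one strand of a DNA sequence for candidate Cas9 target sites, assuming the commonly used SpCas9 conventions of a 20-nucleotide protospacer followed by an NGG PAM; those specifics are assumptions added here, and the input sequence is made up. Real guide design also checks the opposite strand and off-target matches.

```python
def find_cas9_targets(dna: str, protospacer_len: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every site followed by an NGG PAM."""
    dna = dna.upper()
    targets = []
    for start in range(len(dna) - protospacer_len - 2):
        pam_start = start + protospacer_len
        pam = dna[pam_start:pam_start + 3]
        if pam[1:] == "GG":                      # N can be any base, then GG
            targets.append((start, dna[start:pam_start], pam))
    return targets


# Hypothetical target region: 20-nt protospacer followed by a TGG PAM.
region = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_cas9_targets(region):
    print(start, protospacer, pam)
```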
o DNA Repair Pathways after DSB:
NHEJ (Non-Homologous End Joining): An error-prone repair pathway
that often results in small insertions or deletions (indels) at the repair site.
This is commonly used for gene knockout (inactivating a gene).
HDR (Homology-Directed Repair): A precise repair pathway that
requires a donor DNA template with homology arms (sequences matching
the target region). This pathway allows for precise gene insertion
(knockin) or correction of specific mutations.
o Conditional Gene Editing Systems (Cre-lox Recombination): Allows for
controlling where (tissue-specific) and when (time-specific) a gene is modified.
This is useful for studying gene function without unwanted effects during
development.
Cre recombinase: An enzyme that recognizes and cuts at loxP sites (short
DNA regions that flank the target gene). If Cre is introduced with a tissue-
specific or inducible promoter, it will only excise or invert the DNA
between the loxP sites in specific cells or at specific times, leading to a
conditional gene knockout.
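A conditional knockout can be modeled as a simple string operation: where Cre is present, the DNA between two loxP sites in the same orientation is removed and a single loxP site is left behind. In this sketch the loxP string is the commonly cited 34-bp sequence, while the gene, flanks, tissue names, and promoter activity are all hypothetical; inversion between oppositely oriented sites is not modeled.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"   # commonly cited 34-bp loxP site


def cre_excise(dna: str, cre_active: bool) -> str:
    """Excise the segment between the first two loxP sites if Cre is present."""
    if not cre_active:
        return dna
    first = dna.find(LOXP)
    if first == -1:
        return dna
    second = dna.find(LOXP, first + len(LOXP))
    if second == -1:
        return dna
    # Recombination joins the two sites, leaving one loxP behind.
    return dna[:first] + LOXP + dna[second + len(LOXP):]


# Hypothetical "floxed" allele: an exon flanked by two loxP sites.
upstream, exon, downstream = "GGGCCC", "ATGAAACCCGGGTTTTAA", "TTTAAA"
floxed = upstream + LOXP + exon + LOXP + downstream

# Assumed tissue-specific promoter: Cre is expressed in liver only.
cre_expressing_tissues = {"liver"}
for tissue in ("liver", "brain"):
    allele = cre_excise(floxed, cre_active=tissue in cre_expressing_tissues)
    print(tissue, "exon present:", exon in allele)
```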
3. Random Mutagenesis: DNA is inserted randomly into the genome. This method offers
less control and can unintentionally disrupt other genes.
1. Production of Reagents:
o Therapeutic Proteins: Genetically engineered organisms (e.g., CHO cells) are
used to produce proteins like insulin or monoclonal antibodies. This requires
precise control of expression, purification, and post-translational modifications.
o Vaccines:
Engineered viral vectors: Viruses (e.g., VSV for Ebola vaccine) are
genetically engineered to carry antigens.
mRNA vaccines: Use synthetic mRNA that codes for a specific antigen
(e.g., spike protein in COVID-19 vaccines).
DNA vaccines: Deliver plasmids into host cells to express antigens.
o Gene Therapy Products: Products like Casgevy for sickle cell disease involve
gene modification.
2. Cell Therapy: Uses intact cells to treat diseases.
o Applications: iPSC-derived neurons for neurodegenerative diseases,
hematopoietic stem cell transplants, regenerative repair (e.g., spinal injury, retinal
degeneration).
o Limitations/Risks: Tumor risk (especially with pluripotent stem cells like
iPSCs), immune rejection in allogeneic transplants, poor control over
differentiation leading to wrong tissue formation, ethical issues with embryonic
stem cells, and difficulty in effective delivery to damaged tissues.
o CAR T-cells (Chimeric Antigen Receptor T-cells): A type of immunotherapy
for cancer.
Process: T cells are extracted from a patient, modified (e.g., with a viral
vector) to express an artificial receptor (CAR19) that targets CD19-
positive cancer cells, and then infused back into the patient.
Donor T-cells: If using donor T cells, they might be genetically modified
using TALENs to disrupt T-cell receptor expression (to prevent immune
rejection/Graft vs Host reaction) and knock out genes like CD52 (to make
them resistant to certain drugs).
H. Stem Cells
Stem cells are undifferentiated cells with the ability to self-renew and differentiate into various
specialized cell types.
I. Risk Assessment for GMOs (Genetically Modified Organisms) and Gene Therapies
Definition: Environmental risk assessment (ERA) and human health risk assessment are
central to regulating GMOs.
Regulation: GMO regulations vary worldwide, with the EU having stricter rules. GMOs
are typically regulated under directives covering deliberate release into the environment
and contained use. Genome editing (e.g., CRISPR) is currently often regulated as GMOs,
but this is evolving.
ERA Purpose: To identify, characterize, and manage potential risks before GMOs are
used in clinical trials, agriculture, or industry.
Considerations in GMO Medicines (Gene Therapies, Cell Therapies, Vaccines):
o Require extensive preclinical trials in model organisms.
o Risks:
1. Shedding: The release of genetically modified material into the
environment (e.g., from dead embryos) that could infect unintended
individuals.
2. Replication competence: The ability of a viral vector (used to deliver
genes) to replicate uncontrollably within the patient or spread to the
environment.
3. Insertional mutagenesis: The inserted gene landing in the wrong place in
the genome, potentially causing cancer.
4. Genetic stability of the construct: Whether the inserted transgene
remains intact and functional over time in host cells. If unstable, it could
become ineffective or harmful.
5. Recombination of the construct: The inserted genetic material
recombining with host DNA or other viral sequences in unintended ways,
potentially leading to new, harmful viruses or hybrid genes.
6. Molecular characteristics of the construct: Poor design can lead to
overexpression, silencing of host genes, or other unintended effects.
Case Examples:
o Transgenic CHO cells: Used for pharmaceutical production, require inactivation
and DNA degradation before disposal to prevent environmental release.
o CRISPR-edited chickens: Modified so male embryos die early, while females
survive and are non-transgenic. Regulations must assess risks like horizontal gene
transfer (gene moving to another organism) or incomplete lethality of males.
Key Distinction:
o Risk assessment: Science-based process to evaluate hazards and their likelihood.
o Risk management: Policy-driven, uses scientific input for regulatory decisions.
o Risk communication: Interactive exchange of information about risks.
This comprehensive overview should provide you with a solid foundation as you begin to
explore the vast and exciting field of genetics and molecular biology!