14.
Methods to study genes
BIO 305
Andrzej Wierzbicki
14.2
How to prepare for BIO 305 exams
• Commit a substantial amount of time and effort
• Come to class, take notes, avoid distractions
• Actively participate in discussion sections
• Read textbook (required and optional readings)
• Make additional notes combining all sources of information
• Review material after each lecture
• Ask and answer questions on Piazza
• Work with practice exams (with caution)
• Come to the GSC and office hours
14.3
How do we study genes and gene products?
14.4
To see a gene or observe its activity we need to:
A. Have a good enough light microscope
B. Have an electron microscope
C. Have an atomic force microscope
D. Analyze large pools of molecules
E. Genes cannot be observed
Answer: D
Gel electrophoresis 14.5
Size marker
Large molecules, slow migration
Small molecules, fast migration
Sanders and Bowman p. 556
Restriction enzymes 14.6
Size marker
Plasmid digested with
1. BamHI
2. EcoRI
3. NotI
Clicker question:
What is the distance between NotI
restriction sites on this plasmid?
Provide estimated distance in
kilobases rounded to nearest kb.
Answer: 3kb and 4kb.
Sanders and Bowman p. 556
Restriction enzymes 14.7
Size marker
Plasmid digested with
1. BamHI
2. EcoRI
3. NotI
Conclusions of this digest
1. BamHI
   • One digestion site
   • Plasmid is ~7kb long
2. EcoRI
   • One digestion site
   • Plasmid is ~7kb long
3. NotI
   • Two digestion sites
   • Products ~3kb and ~4kb
Sanders and Bowman p. 556
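The logic behind these conclusions can be sketched in Python: on a circular plasmid, one cut gives a single linear fragment of full plasmid length, and two cuts give two fragments that add up to the plasmid length. The function name and the cut positions below are hypothetical, chosen only so that the fragment sizes match the ~7 kb, ~3 kb and ~4 kb bands.

```python
def digest_circular(plasmid_len, cut_sites):
    """Return sorted fragment lengths after cutting a circular plasmid."""
    sites = sorted(cut_sites)
    if not sites:
        return []  # an uncut plasmid stays circular: no linear fragments
    fragments = []
    for i, site in enumerate(sites):
        next_site = sites[(i + 1) % len(sites)]
        # Distance to the next site, wrapping around the circle.
        # A single site gives 0 here, which means the full plasmid length.
        fragments.append((next_site - site) % plasmid_len or plasmid_len)
    return sorted(fragments)

PLASMID_LEN = 7000  # ~7 kb plasmid, as concluded from the gel
# Hypothetical cut positions (bp); only the spacing between NotI sites matters
sites = {"BamHI": [500], "EcoRI": [1200], "NotI": [1000, 4000]}
digests = {enzyme: digest_circular(PLASMID_LEN, pos) for enzyme, pos in sites.items()}

for enzyme, fragments in digests.items():
    print(enzyme, fragments)
```

Reading the gel is the reverse problem: the number of bands suggests the number of sites, and the band sizes give the distances between them.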
Molecular cloning 14.8
DNA fragment of interest
• PCR product
• Chemical synthesis
• Other plasmid
Plasmid
Ligation
Plasmid with our sequence of interest
• Long term maintenance
• Easy multiplication
• Easy gene expression
Sanders and Bowman p. 558
Expression of recombinant proteins 14.9
mRNA → protein (bacterial cell)
Protein expression plasmids
• Promoter etc.
• Protein coding sequence
• Epitope tag
• Other helper sequences
Example: Production of human insulin in E. coli (p. 566)
Approaches to analyze and quantify genes and their products 14.10
• Northern blot: RNA length and quantity
• Western blot: Protein size and quantity
• PCR: Amplify or quantify RNA/DNA
• Sanger sequencing: Determine DNA sequence
• High throughput sequencing: Determine DNA/RNA sequence and quantity
Northern Blot 14.11
1. Separate RNA on gel
2. Transfer to membrane
3. Hybridize with probe
Make RNA accessible
[Figure: gel with size scale in kb (6, 5, 4, 3) and rRNA bands]
Probe
• Complementary to RNA of interest
• Binds RNA on membrane by base-pairing
• Radioactive
• Radioactivity indicates presence and quantity of RNA
Northern Blot 14.12
What conclusion can you make based on this northern blot result?
[Figure: northern blot of Samples 1-3 with probe panels, a northern blot loading control, and total RNA staining as a loading control]
A. Sample 1 has more Chll mRNA than samples 2 and 3
B. Samples 2 and 3 have more Chll mRNA than sample 1
C. Two loading controls are inconsistent
D. Sample 2 has more AC2/3 RNA than samples 1 and 3
E. Sample 3 has less Chll mRNA than Actin 2 mRNA
Answer: A
Blevins et al. 2006
Western blot 14.13
1. Separate protein on gel
2. Transfer to membrane
3. Probe blot with antibody
[Figure: example blot with muscle and brain lanes]
Antibody
• Binds protein of interest
• Labelled or detectable by labelled secondary antibody
• Label indicates presence and quantity of protein
Western blot 14.14
What conclusion can you make based on this western blot result?
[Figure: western blot, lanes 1-4, size markers 100, 75, 50, 37; panels aHA and CBB. Credit: Dr. Ji-Hee Min]
Western blot using anti-HA antibody. Samples 1-4 contain proteins from transgenic plants expressing various HA-tagged proteins.
A. Sample 3 contains a larger protein than sample 1
B. Sample 3 contains a larger protein than sample 4
C. Sample 2 contains more protein than sample 4
D. Sample 4 contains more protein than sample 3
E. Answers A and C are both correct
Answer: E
Polymerase Chain Reaction (PCR) – one of the most powerful methods of genetics 14.15
• Amplify any DNA fragments you want
• Produce DNA
• Quantify DNA or RNA
• Fast, cheap, in vitro
• Limitations
• Need template DNA
• Need to know sequence of flanking regions to design primers
• Length limit (10 kb)
Polymerase Chain Reaction 14.16
[Diagram: template DNA with a primer at each end of the target region]
Primers delimit the amplified region
Product after multiple rounds of PCR
PCR: cycle 1 14.17
[Diagram: double-stranded template, top strand 5’→3’ and bottom strand 3’→5’, shown at each step]
Step 1: denature DNA 95℃
Step 2: anneal primers ~55℃
Step 3: extend primers 72℃
PCR: cycle 2 14.18
Denature, anneal, and extend primers
PCR: cycle 3 14.19
Denature, anneal, and extend primers
PCR: additional cycles 14.20
Denature, anneal, and extend primers
14.21
PCR amplifies DNA between primers
Typical PCR: 25-35 cycles.
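Why 25-35 cycles is enough can be illustrated with a short, idealized calculation. The sketch below assumes every template is copied in every cycle (efficiency of 1.0), which real reactions only approximate; the function name and the efficiency parameter are illustrative, not part of the lecture.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Idealized PCR yield: each cycle multiplies the copy number by (1 + efficiency)."""
    return start_copies * (1 + efficiency) ** cycles

# Copies made from a single starting molecule under perfect doubling
for n in (25, 30, 35):
    print(f"{n} cycles: {pcr_copies(1, n):.2e} copies")
```

Even from one molecule, 30 ideal cycles give about a billion copies, which is why PCR works on very small amounts of template.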
14.22
Question
You want to PCR amplify part of the htz1 gene in yeast. The coding
strand of the sequence to be amplified is:
5’-AGGCGGCATACG-3’
5’-AGGCGGCATACGGGCGAATGTTAAGCAGTATGTAATGGCATTAGGC-3’
5’-GCCTAATGCCAT-3’
Which of the following sets of primers is best?
   Left primer           Right primer
A. 5’-AGGCGGCATACG-3’    5’-TACCGTAATCCG-3’
B. 5’-CGTATGCCGCCT-3’    3’-GCCTAATGCCAT-5’
C. 5’-AGGCGGCATACG-3’    5’-ATGGCATTAGGC-3’
D. 5’-AGGCGGCATACG-3’    5’-GCCTAATGCCAT-3’
E. 3’-AGGCGGCATACG-5’    5’-CGGATTACGGTA-3’
Answer: D
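The answer can be checked with a few lines of Python. The left primer is identical to the 5’ end of the coding strand, and the right primer is the reverse complement of its 3’ end, so both primers are written 5’→3’ and point toward each other. The 12-nucleotide primer length is taken from the answer choices; the helper name is an illustrative choice.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

# Coding strand of the region to amplify (from the question), 5'->3'
target = "AGGCGGCATACGGGCGAATGTTAAGCAGTATGTAATGGCATTAGGC"
primer_len = 12

left_primer = target[:primer_len]                        # same as the coding strand
right_primer = reverse_complement(target[-primer_len:])  # anneals to the coding strand

print("Left primer:  5'-" + left_primer + "-3'")
print("Right primer: 5'-" + right_primer + "-3'")
```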
14.23
Question
Below is a DNA sequence. N indicates unknown nucleotides.
Which primer would you use for Sanger sequencing to directly
determine the sequence of the top strand?
5’-AGAGTCGTGGGGNNNNNNNNNNNNNNNNNNNTTTTGGTTAAGG-3’
3’-TCTCAGCACCCCNNNNNNNNNNNNNNNNNNNAAAACCAATTCC-5’
A. 5’-AGAGTCGTGGGG-3’
B. 5’-CCTTAACCAAAA-3’
C. 5’-CCCCACGACTCT-3’
D. 5’-TTTTGGTTAAGG-3’
E. Answers A and B are correct
Answer: A
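The reasoning for this question can be sketched as a small check. A sequencing primer is extended 5’→3’, so to read the top strand directly it must have the same sequence as the top strand and sit upstream (5’) of the unknown region; it then anneals to the bottom strand and the newly made strand has top-strand sequence. The function name below is hypothetical.

```python
# Top strand from the question, 5'->3'; N marks unknown nucleotides
top = "AGAGTCGTGGGGNNNNNNNNNNNNNNNNNNNTTTTGGTTAAGG"

options = {
    "A": "AGAGTCGTGGGG",
    "B": "CCTTAACCAAAA",
    "C": "CCCCACGACTCT",
    "D": "TTTTGGTTAAGG",
}

def reads_top_strand(primer, top_strand):
    """True if the primer matches the top strand 5' of the unknown region."""
    pos = top_strand.find(primer)
    unknown_start = top_strand.index("N")
    return pos != -1 and pos + len(primer) <= unknown_start

usable = [name for name, primer in options.items() if reads_top_strand(primer, top)]
print(usable)
```

Primer B would also give a sequence, but it reads the bottom strand, so the top strand would only be known indirectly as its reverse complement.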
Analyzing RNA transcripts: RT-PCR 14.24
mRNA: 5’-...AAAAA-3’
Reverse transcription (RT)
mRNA/cDNA hybrid: 5’-...AAAAA-3’ (mRNA) paired with 3’-...TTTTT-5’ (cDNA)
RNase: degrade RNA
cDNA: 3’-...TTTTT-5’
Use as template in PCR reaction
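The relationship between the mRNA and its cDNA can be sketched in Python: the cDNA is the DNA reverse complement of the mRNA, so a poly(A) tail at the 3’ end of the mRNA becomes a run of T at the 5’ end of the cDNA. The mRNA sequence and the function name below are made up for illustration.

```python
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")

def reverse_transcribe(mrna):
    """Return first-strand cDNA (written 5'->3') complementary to an mRNA."""
    return mrna.translate(RNA_TO_DNA_COMPLEMENT)[::-1]

mrna = "AUGGCUUAA" + "AAAAA"  # hypothetical short mRNA with a poly(A) tail, 5'->3'
cdna = reverse_transcribe(mrna)

print("mRNA 5'-" + mrna + "-3'")
print("cDNA 5'-" + cdna + "-3'")
```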
Real time PCR (qPCR) 14.25
bio-rad.com
High throughput sequencing 14.26
• Sequencing on a massive scale
• Millions of sequences at once
• Limitations
• Sequences within one sample usually anonymous
• Sequences usually short (up to ~600bp)
• Requires complex data analysis
High throughput sequencing 14.27
Genome (re)sequencing
DNA extraction
Library generation
High throughput sequencing
Align to reference sequence
Genome sequence!
Inferred from differences
between sequencing reads and
the reference genome
sequence.
High throughput sequencing 14.28
Genome (re)sequencing applications
• Find mutation causing a phenotype
• Find mutations associated with disease
• Determine best targeted treatment for a specific cancer
• Sequence a new interesting genome
Limitations
• Very hard in the absence of a reference genome
• Very hard on repetitive sequences
• Difficult bioinformatics required
High throughput sequencing 14.29
RNA-seq - quantification of RNA
Total RNA from tissue
cDNA
Sequencing
Mapping to the genome
Gene 1: higher RNA level
Gene 2: lower RNA level
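The mapping and counting step can be illustrated with a toy example. The gene coordinates and read positions below are hypothetical, and the sketch only counts reads per gene; real analyses also account for gene length and sequencing depth.

```python
from collections import Counter

# Hypothetical gene coordinates on one chromosome: name -> (start, end)
genes = {"Gene 1": (100, 500), "Gene 2": (800, 1200)}

# Hypothetical start positions of reads after mapping to the genome
read_positions = [120, 150, 300, 310, 450, 480, 900, 1100]

def count_reads(genes, positions):
    """Count mapped reads that fall inside each gene."""
    counts = Counter({name: 0 for name in genes})
    for pos in positions:
        for name, (start, end) in genes.items():
            if start <= pos <= end:
                counts[name] += 1
    return counts

counts = count_reads(genes, read_positions)
for name, n in counts.items():
    print(name, n)
```

More reads mapping to Gene 1 than to Gene 2 is what the slide summarizes as a higher RNA level.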
High throughput sequencing 14.30
RNA-seq - quantification of RNA
Advantages
• Know RNA accumulation from ALL genes
• Analyze transcription and RNA processing
Limitations
• Need genome sequence
• Less abundant RNAs are hard to study
• Often overinterpreted as a measure of gene expression
High throughput sequencing 14.31
What is the correct conclusion of this RNA-seq experiment performed in Arabidopsis plants, comparing wild type and the nrpe1 mutant?
[Figure: RNA-seq signal at locus AT5G42900 in wild type and nrpe1 mutant]
A. Locus AT5G42900 is bound by NRPE1
B. Locus AT5G42900 has a lower RNA accumulation in nrpe1
C. Locus AT5G42900 has a lower transcription rate in nrpe1
D. Locus AT5G42900 has a lower translation rate in nrpe1
E. This result is inconclusive
Answer: B
High throughput sequencing 14.32
ChIP-seq – protein-DNA interactions
Purify DNA bound to protein of interest
Sequencing
Mapping to the genome
Region 1 Region 2
Evidence of protein binding
Song et al. 2015
High throughput sequencing 14.33
ChIP-seq – protein-DNA interactions
Advantages
• Find ALL binding sites of a protein
• Can study posttranslational protein modifications
Limitations
• Need an antibody specific towards the protein of interest
• Need a good negative control
• Limited resolution
High throughput sequencing 14.34
ChIP-seq experiment using an antibody specific towards the NRPE1 protein in Arabidopsis thaliana.
Which genomic region has the strongest evidence of NRPE1 binding to DNA?
[Figure: ChIP-seq signal at regions 1-4 in wild type and nrpe1 mutant]
A. 1
B. 2
C. 3
D. 4
Answer: C
High throughput sequencing 14.35
Metagenomics
1. Take an environmental sample (e.g. water from Lake Superior)
2. Purify DNA
3. Generate library
4. Sequence
5. Identify all species present in the sample
• Breakthrough in ecology
• Characterize ecosystems
• Discover new species (microorganisms)
• Early detection of invasive species
High throughput sequencing 14.36
Costs of high throughput sequencing (2024)
• 1/10 lane of Illumina NovaSeq X 10B = 1.5 billion reads
• One read = max. 2x150 bp
• Total coverage 450 billion bp (150 x human genome)
• Library: ~$100, sequencing: ~$1500
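The numbers on this slide can be checked with simple arithmetic. The sketch below assumes a human genome size of about 3 billion bp, which is what the 150x figure implies; the cost per billion bp is derived from the slide's prices and is not stated on the slide.

```python
reads = 1_500_000_000            # 1.5 billion reads
bp_per_read = 2 * 150            # paired-end read, max. 2x150 bp
human_genome_bp = 3_000_000_000  # assumption: ~3 billion bp

total_bp = reads * bp_per_read
coverage = total_bp / human_genome_bp

library_cost = 100      # ~$100
sequencing_cost = 1500  # ~$1500
cost_per_billion_bp = (library_cost + sequencing_cost) / (total_bp / 1e9)

print(f"Total: {total_bp / 1e9:.0f} billion bp")
print(f"Coverage: {coverage:.0f} x human genome")
print(f"Cost: ~${cost_per_billion_bp:.2f} per billion bp")
```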
Long read sequencing 14.37
• Direct sequencing (no amplification needed)
• DNA squeezed through a pore
• Up to 1 million bases
• Detect DNA modifications
• Limitations
• Lower throughput
• Lower accuracy
https://nanoporetech.com/platform/technology
14.38
Working on the same TDP43 isoforms in Drosophila, you performed quantitative
RT-PCR using primers complementary to GFP. Labels on the plot below indicate
results corresponding to A – GFP only, B - TDP43(l)-GFP and C - TDP43(s)-GFP.
What is the most reasonable interpretation of this result?
A. Transcription of TDP43(s)-GFP is lower than TDP43(l)-GFP and GFP
B. Translation of TDP43(s)-GFP is lower than TDP43(l)-GFP and GFP
C. RNA degradation of TDP43(s)-GFP is faster than TDP43(l)-GFP and GFP
D. RNA accumulation of TDP43(s)-GFP is lower than TDP43(l)-GFP and GFP
E. Protein accumulation of TDP43(s)-GFP is lower than TDP43(l)-GFP and GFP
Answer: D