Introduction to Genetic Variation
Prof. Xuhua Xia
[email protected] http://dambe.bio.uottawa.ca
$\pi = \dfrac{\sum_{i<j} \pi_{ij}}{n(n-1)/2}$
©University of Ottawa
Math/Stats
1
Learning objectives
By the end of this section you should:
• appreciate the contribution of mutation to genetic variation
and applications based on genetic variation
• be comfortable with simple terminology (e.g., loci, gene, allele,
the number of polymorphic loci, nucleotide diversity,
heterozygosity, Shannon entropy)
• be familiar with commonly used methods (indices) for
quantifying genetic variation
• be able to calculate genotype frequencies from absolute
numbers of different genotypes, allele frequencies from
genotype frequencies, and other simple indices of genetic
variation
• be able to recount practical applications based on genetic
variation and think of new applications
2
Population and population genetics
• Abstract population: exists in models in population genetics, e.g., a
population in the Wright-Fisher model
• Real Population:
– consists of members of the same species
– has its members potentially mating with any other member in the population,
although not always with the same probability
– has members continuously distributed geographically during mating season
– will typically last for many generations
– has population-level features, e.g.,
• size of a population
• frequency of a particular allele or genotype in that population
– exhibits genetic variation that can be quantified over space and time
• Population genetics
– Seeks to understand the structure and dynamics of genetic variation both within
and among populations/species (which can be either abstract or real)
– Studies the factors (mutation, migration, selection, etc.) that contribute to the origin,
maintenance and change of genetic variation
3
Paternal
chromosome
A1 = allele
A locus
Maternal
chromosome
A2 = allele
Relationship among alleles
• Two alleles within one individual
• Two randomly sampled alleles
4
Genetic Variation
• Mutation is the ultimate source of genetic variation
– Point mutation
– Insertion/deletion
– Inversion
– Duplication
– ...
• All evolutionary processes require genetic variation to have any
effect:
– natural selection
– genetic drift
– gene flow/migration
– Inbreeding/assortative mating
• Genotype and phenotype
– Phenotype with unknown genotype
– Genotype with unknown phenotype
– Both known
5
What do alleles look like?
• From Mendel to 1950s: Alleles were seen indirectly through
discrete phenotypes:
– Round seeds, wrinkled seeds in peas
– White-eyed and red-eyed fruit flies
– Color pattern in beetle elytra.
• From the 1920s to the early 1960s: Visible chromosome bands, inversions
and other differences on chromosomes
– Dobzhansky and his students' work on insects
– McClintock and her students
• Electromorphs (mid-1960s to present, but rarely used after
1980s): Lewontin and Hubby (1966)
• DNA fingerprinting and forensics based on polymorphic DNA,
especially on minisatellite and microsatellite DNA (Alec Jeffreys,
1980s)
• Nucleotide and genomic sequences (1980s to present)
Image credit: Wikipedia
6
Elytral color patterns
• CC Tan and JC Li (1934, Am Nat)
• CC Tan (1946, Genetics)
• Harmonia axyridis
– 105 named morphs
– classified into several genera
• Detailed breeding and bookkeeping:
– A single species
– A single locus with 12 alleles.
• Two lessons:
– a large "diversity" can be generated by
very little genetic variation
– These different alleles appear neutral
Ando, T, Niimi, T. Development and evolution of color
patterns in ladybird beetles: A case study in Harmonia
axyridis. Develop Growth Differ. 2019; 61: 73–84.
7
Salivary gland polytene chromosome
Drosophila obscura
Drosophila persimilis
Drosophila pseudoobscura
CC Tan 1935. PNAS
Two morphologically nearly identical "races" turned out to belong
to two different species, with multiple chromosomal inversions.
8
Much genetic difference may correspond to little morphological variation.
Little genetic variation may correspond to much morphological variation.
Rizki 1951. PNAS
9
Natural selection and genetic variation
Industrial melanism
Odontopera bidentata
scalloped hazel moth
Biston betularia the
peppered moth
Cook, L. M.; Turner, J. R. G. (2008). "Decline of melanism in two British moths: spatial,
temporal and inter-specific variation". Heredity. 101 (6): 483–489
https://en.wikipedia.org/wiki/Industrial_melanism
10
Alleles visualized in electrophoresis
Samples
Gel with alkaline running buffer
These bands are invisible without staining.
FF
FF
FF
FF
FS
SS
SS
Proteins, negatively charged in alkaline buffer, migrate towards the positive pole.
Xuhua Xia Slide 11
Detecting multiple paternity (esterase-1)
Table 3. Litters with multiple paternity in 1987 and 1988.
Year  Mother  Offspring
1987  AA      AB AA AD
1988  AA      AA AB AB AD AD
      AA      AA AB AB AB AC AD
      AA      AA AB AB AB AD
      AA      AA AA AA AB AB AD
      AA      AA AA AB AB AD
      AA      AA AB AB AD AD AD
Xia X, Millar JS. 1991. Genetic evidence of promiscuity in Peromyscus leucopus.
Behavioral Ecology and Sociobiology 28:171-178.
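The reasoning behind Table 3 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the cited paper. It assumes that the first genotype in each row is the mother's, that the mother is homozygous (as in every row of the table), and that a single diploid father can transmit at most two different alleles, so three or more distinct paternal alleles in a litter imply at least two fathers. The function name `paternal_alleles` is hypothetical.

```python
def paternal_alleles(mother, offspring):
    """Infer the set of paternal alleles in a litter (homozygous mother only)."""
    if mother[0] != mother[1]:
        raise ValueError("this sketch only handles homozygous mothers")
    maternal = mother[0]
    alleles = set()
    for genotype in offspring:
        if maternal not in genotype:
            raise ValueError(f"{genotype} cannot be an offspring of {mother}")
        # Remove one copy of the maternal allele; what remains came from the father.
        alleles.add(genotype.replace(maternal, "", 1))
    return alleles

# Litters from Table 3 as (mother, offspring genotypes).
litters = [
    ("AA", ["AB", "AA", "AD"]),
    ("AA", ["AA", "AB", "AB", "AD", "AD"]),
    ("AA", ["AA", "AB", "AB", "AB", "AC", "AD"]),
    ("AA", ["AA", "AB", "AB", "AB", "AD"]),
    ("AA", ["AA", "AA", "AA", "AB", "AB", "AD"]),
    ("AA", ["AA", "AA", "AB", "AB", "AD"]),
    ("AA", ["AA", "AB", "AB", "AD", "AD", "AD"]),
]

for mother, offspring in litters:
    from_father = paternal_alleles(mother, offspring)
    verdict = "multiple paternity" if len(from_father) > 2 else "one father suffices"
    print(mother, sorted(from_father), verdict)
```

Every litter in the table yields at least three distinct paternal alleles, which is why each is listed as a case of multiple paternity.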
12
Inbreeding/drift & genetic variation
[Figure: three individuals (Individual 1, Individual 2, Individual 3) with fragment
length polymorphism; populations labeled Serengeti and Gir. Labels: repeated DNA;
conserved region; forward and reverse PCR primers to amplify the fragment (colored yellow).]
Gilbert DA et al., 1991. JH
13
Applications of DNA fingerprinting
• Forensics (Alec Jeffreys)
– Family relationship
– Criminal identification
• Where is this salmon from?
– Pacific salmon stocks represent an important resource for west
coast fishermen in both the US and Canada
– Overfishing can cause serious problems
– Given N salmon that can be harvested from the Pacific salmon
stock, what proportion (p) should be allocated to US fishermen
and (1-p) to Canadian fishermen?
– What is a fair p?
– The equity principle of allocation
• Disease diagnosis
14
Identification of bacterial pathogen
Fig. 2. Agarose gel electrophoresis through a 1% gel of 3 μg mycobacterial DNA
digested with PstI. Bacteriophage λ DNA digested with HindIII and EcoRI was used
as size markers (lane a). Lanes: (b) E. coli; (c) M. smegmatis; (d) M. phlei;
(e) M. scrofulaceum; (f) M. marinum; (g) M. lepraemurium; (h) M. tuberculosis H37Rv;
(i) M. tuberculosis H37Ra; (j) M. tuberculosis H4Ra; (k) M. avium no. 214;
(l) M. avium no. 8063; (m) M. intracellulare; (n) ‘M. lufu’.
Patel et al. 1986. JGM
15
What do alleles look like? fruit fly Adh
5’UTR
“slow”
allele
“fast”
allele
“Exons are shown as boxes; translated regions are in black”.
Note: only the nts which differ from consensus are shown. These are not
neighboring nucleotides.
Asterisk (C*) = site of Lys-for-Thr replacement responsible
for mobility difference between fast (F) and slow (S)
electrophoretic alleles
Point mutation: A vs. C in 2nd position of codon (Lys: AAR; Thr: ACN)
Polymorphic sites in Drosophila Adh gene. Figure from Graur and Li (2000),
original data from Kreitman (1983)
Different methods & different results
5’UTR
“slow”
allele
“fast”
allele
Based on electromorph:
Allele  N  p
Fast    6  6/11
Slow    5  5/11

Based on nucleotide sequences:
Allele  N  p
1       1  1/11
2       1  1/11
3       1  1/11
4       1  1/11
5       1  1/11
6       1  1/11
7       1  1/11
8       1  1/11
9       1  1/11
10      1  1/11
11      1  1/11
17
Point mutation:
Sickle-cell anemia
Normal polypeptide (Hb-A): Val-His-Leu-Thr-Pro-Glu-Glu……
Codon change: GAG (Glu) → GUG (Val)
Sickle-cell polypeptide (Hb-S): Val-His-Leu-Thr-Pro-Val-Glu……
The first methionine is cleaved during posttranslational modification.
It is typically cut off if the second amino acid is small and nonpolar.
Linus Pauling
2/28/1901 – 8/19/1994
Insertion: β-Thalassemia HBB
10 20 30 40 50 60
----|----|----|----|----|----|----|----|----|----|----|----|--
Normal AUGGUGCACCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGU
Thalass. AUGGUGCACCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGU
**************************************************************
70 80 90 100 110 120
--|----|----|----|----|----|----|----|----|----|----|----|----
Normal GGAUGAAGUUGGUGGU-GAGGCCCUGGGCAGGUUGGUAUCAAGGUUACAAGACAGG......
Thalass. GGAUGAAGUUGGUGGUUGAGGCCCUGGGCAGGUUGGUAUCAAGGUUACAAGACAGG......
**************** ***************************************
• An insertion leading to the creation of a stop codon and a truncated protein
• Easy diagnosis: electrophoresis would show two bands, one rapidly
migrating short protein and one slowly migrating long (normal) protein
• Given that it is deleterious, why is it quite frequent in some areas?
• Hypotheses on fetal hemoglobin
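The effect of the single-nucleotide insertion can be checked by scanning both sequences codon by codon from the AUG. The sketch below is only an illustration: the two sequences are copied from the alignment above (with the gap character and trailing dots removed), and the function name `first_stop_codon` is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_stop_codon(mrna):
    """Return the 1-based number of the first in-frame stop codon, or None."""
    for start in range(0, len(mrna) - 2, 3):
        if mrna[start:start + 3] in STOP_CODONS:
            return start // 3 + 1
    return None

# First 62 nt are identical in both sequences (first row of the alignment).
shared = "AUGGUGCACCUGACUCCUGAGGAGAAGUCUGCCGUUACUGCCCUGUGGGGCAAGGUGAACGU"
normal = shared + "GGAUGAAGUUGGUGGUGAGGCCCUGGGCAGGUUGGUAUCAAGGUUACAAGACAGG"
thalassemia = shared + "GGAUGAAGUUGGUGGUUGAGGCCCUGGGCAGGUUGGUAUCAAGGUUACAAGACAGG"

print("normal:", first_stop_codon(normal))            # no stop in the region shown
print("thalassemia:", first_stop_codon(thalassemia))  # stop created by the inserted U
```

In the region shown, the normal sequence has no in-frame stop, whereas the inserted U turns codon 27 (counting the initiator AUG as codon 1) into UGA.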
Trinucleotide repeats & Huntington’s Disease
HTT coding sequence and its translated protein (<27 repeats, > 40 repeats)
HTT
AUGGCGACCCUGGAAAAGCUGAUGAAGGCCUUCGAGUCCCUCAAGUCCUUCCAGCAGCAGCAGCAGCAGC
AGCAGCAG
M A T L E K L M K A F E S L K S F Q Q Q Q Q Q Q Q Q
HTT
CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAACAGCCGCCACCGCCGCCGCCGCCGCCGCCGCCUCCUCA
G ...
Q Q Q Q Q Q Q Q Q Q Q Q P P P P P P P P P P P Q ...
Duplication of trinucleotide repeats
The U. S. –Venezuela Collaborative Research Project, Wexler
NS. 2004. Proc. Natl. Acad. Sci. U S A 101(10):3498.
[Figure, panels (A) and (B). Number under band: age of disease onset.]
• Autosomal dominant
• DNA replication slippage generates variation in repeat number
CAG repeats  Age  Age range
39           66   72-59
40           59   61-56
41           54   56-52
42           49   50-48
43           44   45-42
44           42   43-40
45           37   39-36
46           36   37-35
47           33   35-31
48           32   34-30
49           28   32-25
50           27   30-24
[Plot: median age of onset (y-axis) against number of CAG repeats (x-axis)]
Brinkman RR et al. 1997. Am J Hum Genet
Indices used to measure
genetic variation
25
Allele and genotype frequencies
• Alleles and allele frequencies: Given two different alleles A1 and A2 in a sample of N
diploid individuals, with the numbers of A1 and A2 alleles being $N_1$ and $N_2$
($N_1 + N_2 = 2N$), we have allele frequencies $p_1 = N_1/(2N)$ and $p_2 = N_2/(2N)$
• Diploid genotype and genotype frequencies: There are three possible genotypes for
two alleles (i.e., diploid combinations): A1A1, A1A2, and A2A2, with corresponding
genotype frequencies $P_{11} = N_{11}/N$, $P_{12} = N_{12}/N$ and $P_{22} = N_{22}/N$
• Ideally, genotypes are fully reflected by phenotypes, so one can distinguish among
A1A1, A1A2, and A2A2 and obtain N11, N12 and N22 by direct counting
• By convention, lower-case p denotes allele frequencies and capital P denotes genotype
frequencies
• A1A2 and A2A1 are treated as the same genotype and represented by the conventional
notation of A1A2. Note that there are two distinct forms (A1 sperm with A2 egg, and A1
egg with A2 sperm); we'll see later some situations in which this matters
26
Allele and genotype frequencies
A sample of N individuals for one locus with two alleles:
Genotype Nij Pij
A1A1 N11 P11=N11/N
A1A2 N12 P12=N12/N
A2A2 N22 P22=N22/N
Sum N 1
With large N, P11 may be thought of as the probability of choosing a random individual
from the population with the genotype A1A1, e.g., P11 = Pr(X = A1A1) in a random draw of
an individual from the population.
A general equation for the frequency of allele i at a locus with n alleles:
$p_i = P_{ii} + \dfrac{1}{2}\sum_{j=1,\, j \ne i}^{n} P_{ij}; \quad p_i = \dfrac{2N_{ii} + \sum_{j=1,\, j \ne i}^{n} N_{ij}}{2N}$
Expected genotype frequencies under HWE: $P_{ii} = p_i^2$ and, for $i \ne j$, $P_{ij} = 2 p_i p_j$
Homework
Genotype 1987 1988
Dam Young Dam Young
O E O E O E O E
AA 22 21.675 93 90.14961 14 13.59559 43 41.00154
AB 3 4.25 19 21.06299 10 12.01471 51 63.89198
AC 2 1.7 4 3.370079 0 0 1 1.006173
AD 2 1.7 5 9.267717 5 3.794118 25 16.09877
BB 1 0.208333 3 1.230315 4 2.654412 34 24.89043
BC 0 0.166667 0 0.393701 0 0 1 0.783951
BD 0 0.166667 0 1.082677 1 1.676471 7 12.54321
CC 0 0.033333 0 0.031496 0 0 0 0.006173
CD 0 0.066667 0 0.173228 0 0 0 0.197531
DD 0 0.033333 3 0.238189 0 0.264706 0 1.580247
Sum 30 30 127 127 34 34 162 162
Allele frequencies
A 0.85 0.84252 0.632353 0.503086
B 0.083333 0.098425 0.279412 0.391975
C 0.033333 0.015748 0 0.006173
D 0.033333 0.043307 0.088235 0.098765
sum 1 1 1 1
From observed genotype counts (O), one can obtain allele frequencies and expected genotype
counts (E) under HWE. The table shows O for Dam and Young in 1987 and 1988. Calculate allele frequencies and
expected genotype counts. The answers are provided in gray. You can do this quickly in Excel or R
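As an alternative to Excel or R, the calculation can be sketched in Python. This is a minimal illustration using the 1987 Dam column of the table; the function names `allele_freqs` and `hwe_expected` are hypothetical, and genotypes are assumed to be written as two-letter strings such as "AB".

```python
from itertools import combinations_with_replacement

def allele_freqs(genotype_counts):
    """Allele frequencies p_i = (2*N_ii + sum of N_ij) / (2N)."""
    n_ind = sum(genotype_counts.values())
    allele_counts = {}
    for genotype, count in genotype_counts.items():
        for allele in genotype:  # a homozygote such as "AA" adds its count twice
            allele_counts[allele] = allele_counts.get(allele, 0) + count
    return {a: c / (2 * n_ind) for a, c in sorted(allele_counts.items())}

def hwe_expected(genotype_counts):
    """Expected genotype counts under HWE: N*p_i^2 and N*2*p_i*p_j."""
    n_ind = sum(genotype_counts.values())
    p = allele_freqs(genotype_counts)
    expected = {}
    for a, b in combinations_with_replacement(sorted(p), 2):
        freq = p[a] ** 2 if a == b else 2 * p[a] * p[b]
        expected[a + b] = freq * n_ind
    return expected

# Observed genotype counts for dams in 1987 (genotypes with count 0 omitted).
dam_1987 = {"AA": 22, "AB": 3, "AC": 2, "AD": 2, "BB": 1}

p = allele_freqs(dam_1987)
expected = hwe_expected(dam_1987)
for genotype, e in expected.items():
    print(genotype, dam_1987.get(genotype, 0), round(e, 6))
print({a: round(f, 6) for a, f in p.items()})
```

The same two functions can be applied to the other three columns by changing the dictionary of observed counts.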
28
Indices: Proportion of polymorphic loci
1. Proportion of polymorphic loci (a polymorphic
locus is one with no allele frequency equal to 1)
Locus AA Aa aa Sum pA Polymorphism
L1 10 0 0 10 1 monomorphic
L2 3 4 3 10 0.5 polymorphic
L3 6 3 1 10 0.75 polymorphic
L4 1 1 8 10 0.15 polymorphic
L5 0 0 10 10 0 monomorphic
L6 10 0 0 10 1 monomorphic
2. Three of the six loci are polymorphic. Is it likely that the result is from a
population with a true $P_{poly} = 0.4$?
$\Pr(X = 3 \mid P_{poly} = 0.4,\ n = 6) = \dfrac{6!}{3!\,(6-3)!}\, 0.4^{3} (1 - 0.4)^{6-3} = 0.27648$
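The binomial probability above can be reproduced with the standard library. This is a small sketch; `binomial_pmf` is a hypothetical helper name, not a library function.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x polymorphic loci among n, given a true P_poly = p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

prob = binomial_pmf(3, 6, 0.4)
print(round(prob, 5))  # 0.27648
```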
29
Homework
1. You have 2 polymorphic loci out of
7. What is the probability that the
population has a true Ppoly = 0.5?
2. You have 20 polymorphic loci out of
70. What is the probability that the
population has a true Ppoly = 0.5?
30
Confidence interval of Ppoly=0.5
Locus  AA  Aa  aa  Sum  pA    Polymorphism
L1     10  0   0   10   1     monomorphic
L2     3   4   3   10   0.5   polymorphic
L3     6   3   1   10   0.75  polymorphic
L4     1   1   8   10   0.15  polymorphic
L5     0   0   10  10   0     monomorphic
L6     10  0   0   10   1     monomorphic
$P_{poly} = \dfrac{N_{poly}}{N} = \dfrac{3}{6} = 0.5$
Binomial distribution with n = 6 and p = 0.5 (x = number of polymorphic loci):
x  x/N       Pr(x)
0  0         0.015625
1  0.166667  0.09375
2  0.333333  0.234375
3  0.5       0.3125
4  0.666667  0.234375
5  0.833333  0.09375
6  1         0.015625
When N is small, we cannot use the normal approximation to attach a confidence interval
(CI). However, with small N, it is easy to compute exact probabilities, as shown in the
binomial table above. For a 95% CI, we can exclude the two extreme values because their
summation is only 0.03125, so the 95% CI of Ppoly is 0.1667 to 0.8333 (x = 1 to 5).
(This CI is disputed by Bayesians but we will not go there)
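The exact probabilities for n = 6 and p = 0.5 are easy to tabulate. The sketch below only illustrates the exclusion of the two extreme outcomes described above; it is not a general method for exact binomial confidence intervals.

```python
from math import comb

n, p = 6, 0.5
# Exact binomial probability of x polymorphic loci, for x = 0..n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(x, round(x / n, 6), prob)

tails = pmf[0] + pmf[n]   # probability of the two extreme outcomes
inside = sum(pmf[1:n])    # probability of x = 1..5
print("excluded:", tails, "retained:", inside)
```

Because the two extremes together carry only 0.03125 of the probability, the remaining outcomes (x = 1 to 5) cover 0.96875, which is at least 95%.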
Indices of genetic variation
1. Proportion of polymorphic loci (a polymorphic locus is one with no allele frequency
equal to 1)
2. var(p) = pq/(2N), convenient for a locus with only two alleles
• Inapplicable for extreme p values e.g., p ≈ 0 or p ≈ 1
• N should be large, e.g., N > 25.
Locus AA Aa aa Sum pA Polymorphism var(pA)
L1 10 0 0 10 1 monomorphic 0
L2 3 4 3 10 0.5 polymorphic 0.0125
L3 6 3 1 10 0.75 polymorphic 0.0094
L4 1 1 8 10 0.15 polymorphic 0.0064
L5 0 0 10 10 0 monomorphic 0
L6 10 0 0 10 1 monomorphic 0
var(pA) provides more information. However, with three or more alleles with
allele frequencies p1, p2, p3, ..., var(pA) becomes allele-specific and cannot
summarize genetic variation in a locus with a single value.
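The pA and var(pA) columns can be recomputed from the genotype counts. This is a minimal sketch for a locus with two alleles; the function name `allele_freq_and_var` is hypothetical.

```python
def allele_freq_and_var(n_AA, n_Aa, n_aa):
    """Return (pA, var(pA)) with var(p) = p*q / (2N) for a two-allele locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, p * (1 - p) / (2 * n)

# Genotype counts (AA, Aa, aa) for the six loci in the table.
loci = {
    "L1": (10, 0, 0), "L2": (3, 4, 3), "L3": (6, 3, 1),
    "L4": (1, 1, 8), "L5": (0, 0, 10), "L6": (10, 0, 0),
}

for name, counts in loci.items():
    p, var_p = allele_freq_and_var(*counts)
    print(name, p, round(var_p, 4))
```

Note that N = 10 here is below the N > 25 guideline stated above, so the table values are for illustration only.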
32
Indices: Shannon H
1. Proportion of polymorphic loci (a polymorphic locus is one with no allele
frequency equal to 1)
2. var(p) = pq/(2N) , for a locus with only two alleles.
3. Shannon entropy from allele frequencies: $H = -\sum_i p_i \log_2 p_i$ (Note logarithm
with base of 2. pi values equal to 0 are omitted in calculation.)
Locus AA Aa aa Sum pA Polymorphism var(pA) HpA
L1 10 0 0 10 1 monomorphic 0 0
L2 3 4 3 10 0.5 polymorphic 0.0125 1
L3 6 3 1 10 0.75 polymorphic 0.0094 0.8113
L4 1 1 8 10 0.15 polymorphic 0.0064 0.6098
L5 0 0 10 10 0 monomorphic 0 0
L6 10 0 0 10 1 monomorphic 0 0
Shannon H summarizes genetic variation at a locus with any number of alleles.
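The HpA column can be checked with a few lines of Python. This is a minimal sketch; `shannon_entropy` is a hypothetical helper that takes the allele frequencies at one locus.

```python
from math import log2

def shannon_entropy(freqs):
    """Shannon entropy in bits; allele frequencies equal to 0 are skipped."""
    return sum(-p * log2(p) for p in freqs if p > 0)

# Allele frequencies (pA, 1 - pA) for loci L1 to L4 in the table.
for name, p_a in [("L1", 1.0), ("L2", 0.5), ("L3", 0.75), ("L4", 0.15)]:
    print(name, round(shannon_entropy([p_a, 1 - p_a]), 4))
```

The same function works unchanged for three or more alleles, which is the advantage over var(p) noted above.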
33
Nucleotide diversity
Nucleotide diversity:
$\pi_{ij} = \dfrac{N_{diff.ij}}{L}$ ($N_{diff.ij}$ is also known as the Hamming distance; L is the sequence length)
$\pi = \dfrac{\sum_{i<j} \pi_{ij}}{n(n-1)/2} = \theta_T$
$\pi$ and $\theta_T$ are synonymous
A ACCGCTTAGC
B ACTGCTTAGC
C ACCACTTAGC
Pair  NDiff  πij
A..B  1      0.1
A..C  1      0.1
B..C  2      0.2
Nei, M., Li, W-H 1979. "Mathematical Model for Studying Genetic Variation
in Terms of Restriction Endonucleases". PNAS. 76 (10): 5269–73.
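The pairwise table and π for the three sequences above can be reproduced as follows. This is a minimal sketch assuming aligned sequences of equal length; the function name `p_distance` is hypothetical.

```python
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of sites that differ between two aligned sequences."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

seqs = {"A": "ACCGCTTAGC", "B": "ACTGCTTAGC", "C": "ACCACTTAGC"}

# One pi_ij value per unordered pair (i < j).
pi_ij = {(i, j): p_distance(si, sj)
         for (i, si), (j, sj) in combinations(seqs.items(), 2)}

# pi is the mean of pi_ij over the n(n-1)/2 pairs.
pi = sum(pi_ij.values()) / len(pi_ij)
print(pi_ij)
print(round(pi, 4))
```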
Homework
We sampled three sequences (A, B and C) from each of three populations (Pop1, Pop2 and Pop3).
We have already computed nucleotide diversity for the three sequences from the first
population (see detail in the previous slide). Now calculate π for the three sequences from
Pop2 and Pop3, respectively. You do not need to submit.
   Pop1        Pop2        Pop3
A  ACCGCTTAGC  ATCACGTCGC  GCTGGTAAGC
B  ACTGCTTAGC  ATCGCGTCGC  GCTGGCAAGC
C  ACCACTTAGC  ATCACGTCGC  GCTAGTAGGC
Pair          NDiff  πij  π
Pop1A..Pop1B  1      0.1
Pop1A..Pop1C  1      0.1
Pop1B..Pop1C  2      0.2  0.1333
Pop2A..Pop2B
Pop2A..Pop2C
Pop2B..Pop2C
Pop3A..Pop3B
Pop3A..Pop3C
Pop3B..Pop3C
Suppose you have sampled 100 sequences of 30 nt from a population. Only 3 sequences
(A, B and C shown below) are unique. There are 20 sequences identical to A, 30 identical to
B, and 50 identical to C. How to compute π?
A ACCGCTTAGCATCACGTCGCGCTGGTAAGC
B ACTGCTTAGCATCGCGTCGCGCTGGCAAGC
C ACCACTTAGCATCACGTCGCGCTAGTAGGC
There would be 100*99/2 (=4950) pairwise comparisons and 4950 πij values. Are you going
to compute all these πij values, sum them up and then divide by 4950 to get π?
Note that we have 3 unique alleles (A, B, and C) with allele frequencies $p_A = 0.2$,
$p_B = 0.3$ and $p_C = 0.5$, so only 3 pairwise comparisons (πAB, πAC, πBC) are needed.
Identical sequences contribute 0, and there are $n_i n_j$ pairs between alleles i and j:
$\pi = \dfrac{\sum_{i<j} n_i n_j \pi_{ij}}{n(n-1)/2} = \dfrac{n}{n-1} \sum_{i<j} 2 p_i p_j \pi_{ij}$
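For the sample of 100 sequences with three unique alleles, the shortcut can be checked against the brute-force average over all 4950 pairs. The sketch below assumes the weighting in which the $n_i n_j$ pairs between alleles i and j each contribute πij and identical pairs contribute 0; the function names are hypothetical.

```python
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of sites that differ between two aligned sequences."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def pi_brute_force(sample):
    """Mean pairwise distance over all n(n-1)/2 pairs in the sample."""
    pairs = list(combinations(sample, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

def pi_from_alleles(allele_counts):
    """Same quantity using only the unique alleles and their counts n_i."""
    n = sum(allele_counts.values())
    weighted = sum(n_i * n_j * p_distance(a, b)
                   for (a, n_i), (b, n_j) in combinations(allele_counts.items(), 2))
    return weighted / (n * (n - 1) / 2)

allele_counts = {
    "ACCGCTTAGCATCACGTCGCGCTGGTAAGC": 20,  # A
    "ACTGCTTAGCATCGCGTCGCGCTGGCAAGC": 30,  # B
    "ACCACTTAGCATCACGTCGCGCTAGTAGGC": 50,  # C
}
sample = [seq for seq, count in allele_counts.items() for _ in range(count)]

slow = pi_brute_force(sample)
fast = pi_from_alleles(allele_counts)
print(round(slow, 6), round(fast, 6))
```

Both routes give the same value; the shortcut needs 3 distance calculations instead of 4950.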
Indices of genetic variation
1. Proportion of polymorphic loci (a polymorphic locus is one with no allele
frequency equal to 1)
2. var(p) = pq/(2N) , good for a locus with only 2 alleles with p and q not close to 1 or
0, and large N.
3. Shannon entropy: $H = -\sum_i p_i \log_2 p_i$ (Note logarithm with base of 2. pi values
equal to 0 are omitted in calculation)
4. Nucleotide diversity
All these indices are based on allele or nucleotide frequencies.
If we sample a locus with two alleles in two populations (Pop1 and Pop2) and obtain
the following data:
genotype Pop1 Pop2
AA 25 10
Aa 50 80
aa 25 10
They both have p = q = 0.5, so any index based on allele frequencies will tell us that
the two populations are equally variable at the allele-frequency level, which is correct.
However, we see a clear difference in genotype frequencies. In particular, Pop2 has an
excess of heterozygotes. We need an index to show such differences.
36
Indices of genetic variation
1. Proportion of polymorphic loci (a polymorphic locus is one with no allele
frequency equal to 1)
2. var(p) = pq/(2N), good for a locus with only 2 alleles with p and q not close to 1 or
0, and large N.
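3. Shannon entropy: $H = -\sum_i p_i \log_2 p_i$ (Note logarithm with base of 2. pi values
equal to 0 are omitted in calculation)
4. Nucleotide diversity
5. Observed and expected heterozygosity
Observed heterozygosity (h): the frequency of heterozygotes.
Expected heterozygosity (assuming HWE): $1 - \sum_i p_i^2$, which is $2pq$ for two alleles.
Mean h: $\bar{h} = \dfrac{1}{n}\sum_{k=1}^{n} h_k$, where n is number of loci.
Mean expected heterozygosity: the same average taken over the expected heterozygosity of each locus.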
37
Observed and expected heterozygosity
genotype Pop1 Pop2
AA 25 10
Aa 50 80
aa 25 10
Sum 100 100
p 0.5 0.5
q 0.5 0.5
h 0.5 0.8
Expected h 0.5 0.5
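The contrast between the two populations can be reproduced directly from the genotype counts. This is a minimal sketch for a two-allele locus, using the expected heterozygosity under HWE without any small-sample correction; the function name `heterozygosity` is hypothetical.

```python
def heterozygosity(n_AA, n_Aa, n_aa):
    """Return (observed h, expected h under HWE) for a two-allele locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = n_Aa / n
    expected = 1 - (p**2 + q**2)  # equals 2pq for two alleles
    return observed, expected

pop1 = heterozygosity(25, 50, 25)
pop2 = heterozygosity(10, 80, 10)
print("Pop1:", pop1)
print("Pop2:", pop2)
```

Both populations have the same expected heterozygosity because they share p = q = 0.5, but the observed value in Pop2 is higher, which is the excess of heterozygotes that allele-frequency indices cannot show.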
38
Observed and expected heterozygosity
Locus AA Aa aa p Polymorphism h Expected h
L1 10 0 0 1.0 M 0 0
L2 3 4 3 0.5 P 0.4
L3 6 3 1 0.75 P 0.3
L4 1 1 8 0.15 P 0.10
L5 0 0 10 0 M 0 0
L6 10 0 0 1.0 M 0 0
39
Why is genetic variation important?
• Animal conservation
– Lions in Serengeti and Gir, Cheetah
– Endangered whale species
• Allocation of Pacific salmon fishing quota
• Viral genetic diversity
– New pathogenic viral strain
– Recombinants
– Tracing viral origin
– Vaccine development
• Forensics:
– Paternal identification
– Matching of crime-scene DNA and suspect DNA
– Tracing ancestors
• Disease diagnosis and mapping 40
Assignment (Lec3)
Frequency of alkaline phosphatase genotypes in a sample of English people, from Table 9 in Harris (1966)
Obs. Pij: observed genotype frequencies
Exp. Pij: expected genotype frequencies under HWE
Fill in the blanks and submit the slide with your
Name:
ID:
Genotype  Number  Obs. Pij  Exp. Pij
SS        141
SF        111
FF        28
SI        32
FI        15
II        5
Total     332

Allele  Ni  pi
S
F
I
Total   2N  1

Index       Value
h
Shannon Hp
41