CHEM41290 Notes Part I
Dr Elaine O’Reilly
Office 3.11 Chemistry
[email protected]
Biocatalysis
Key topics and learning objectives
Reading material
Ø Lecture notes
Ø Cited publications
10/19/20
Biocatalysis
Biocatalysis typically refers to the application of naturally occurring or modified
enzymes to perform a chemical transformation.
Advantages:
• Activity
• Selectivity
• Environmentally benign
• Enzyme cascades
• Tuneable
• Only need to ‘make’ it once
Challenges:
• Substrate range
• Stability
• Restricted to ‘biological reactions’
Developing Biocatalysts
Enzymes isolated from natural sources do not always display the required properties – substrate scope, activity,
stability (temperature/pH/solvent).
Enzymes can be engineered to improve these properties, and there are a variety of approaches that can be taken
to modify the protein, depending on how much information is known about the protein and what properties you
would like to optimise.
To understand how to engineer proteins, we first need to remind ourselves of a few basic principles.
A nucleotide is made up of three components:
1) Nitrogenous base
2) Sugar
3) Phosphate group
DNA Sequence
Heterologous Expression
• If we want to use enzymes to catalyze challenging reactions, the wild-type (WT) enzyme might do the job!
• The organism can sometimes be used directly to catalyze the reaction, but there are limitations to this:
compatibility of culturing and reaction conditions; solubility of substrates; purification of the product; side reactions, etc.
• Often more useful to clone and perform heterologous expression of the gene of interest
[Figure labels: clone the corresponding gene; wild-type enzyme]
[Figure: cloning and heterologous expression workflow. Labels, in order:]
• Vector with antibiotic resistance marker (e.g. β-lactamase)
• Recombinant DNA
• Bacteria carrying recombinant DNA. The vast majority of transformed cells only take up one plasmid (important for directed evolution).
• Each colony is derived from a single cell.
• Inoculate cultures with bacterial cells from a single colony.
• At a specific cell density, transcription (DNA copied to mRNA) and translation (templated protein synthesis coded by mRNA) are induced by addition of an inducer to produce the target protein.
• Harvest cells, followed by DNA or protein purification (isolated DNA).
Cloning – The gene encoding the protein of interest is inserted into the plasmid vector to generate a circular recombinant plasmid (piece of DNA).
Transformation – The plasmid is introduced into bacterial cells that have been made competent (able to take up circular DNA). The cells are then typically incubated in a suitable liquid growth medium so that they begin to divide (grow). This recovery step allows the newly transformed cells to begin expressing the gene that encodes antibiotic resistance. Typical growth time is 30-60 minutes.
Plating – After the short incubation period, cells are spread onto a nutrient agar plate and incubated. The bacterial colonies that result should each contain the plasmid, as any cells that do not will lack the antibiotic resistance gene and will be killed. Importantly, as each colony is derived from a single cell, all cells within an individual colony will contain the same plasmid – this becomes important when making mutants, which we will see later.
At this point, the cells can be grown up to isolate multiple copies of the recombinant DNA, which can be stored or used to make mutants etc. Alternatively, gene expression can be induced, resulting in protein production from the recombinant gene.
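The two steps named above, transcription (DNA copied to mRNA) and translation (protein synthesis templated by mRNA), can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `transcribe` and `translate` and the short example gene are assumptions made for this example, and the input is assumed to be the coding strand read 5' to 3' starting at the start codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA corresponding to a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3].replace("U", "T")  # table is keyed on DNA letters
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCTAAATGA"  # hypothetical 4-codon gene: Met, Ala, Lys, stop
    mrna = transcribe(gene)
    print(mrna, translate(mrna))
```

Because every codon maps to exactly one amino acid, a change in the plasmid DNA is carried through to the protein, which is why each colony (one plasmid) corresponds to one enzyme variant.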
Enzyme Engineering
However, WT enzymes are often not suitable for synthetic applications and their properties need to be tuned before
they are practically useful.
Enzyme engineering is often required (rational, semi-rational, random). The approach depends on a number of
factors:
• how much is understood about the enzyme (structure)
• what degree of change is expected from the enzyme
• the availability of a suitable assay
If the goal is to incorporate random mutations, a low-fidelity DNA polymerase must be used.
There are three main stages: Denaturing – when the double-stranded template DNA is heated to separate it into
two single strands. Annealing – when the temperature is lowered to enable the DNA primers to attach to the
template DNA. Extending – when the temperature is raised and the new strand of DNA is made by the Taq
polymerase enzyme. These three stages are repeated 20-40 times, doubling the number of DNA copies each
time.
Denaturation: During this stage the cocktail containing the template DNA and all the other core ingredients is
heated to 94-95 °C.
The high temperature causes the hydrogen bonds between the bases in two strands of template DNA to break
and the two strands to separate. This results in two single strands of DNA, which will act as templates for the
production of the new strands of DNA. It is important that the temperature is maintained at this stage for long
enough to ensure that the DNA strands have separated completely. This usually takes 15-30 seconds.
Extending: During this final step, the heat is increased to 72 °C to enable the new DNA to be made by a special
Taq DNA polymerase enzyme, which adds DNA bases. Taq DNA polymerase is an enzyme taken from the heat-loving
bacterium Thermus aquaticus. This bacterium normally lives in hot springs, so it can tolerate temperatures
above 80 °C. The bacterium's DNA polymerase is very stable at high temperatures, which means it can withstand
the temperatures needed to break the strands of DNA apart in the denaturing stage of PCR. DNA polymerase
from most other organisms would not be able to withstand these high temperatures; for example,
human polymerase works ideally at 37 °C (body temperature). 72 °C is the optimum temperature for the Taq
polymerase to build the complementary strand. It attaches to the primer and then adds DNA bases to the single
strand one-by-one in the 5’ to 3’ direction.
These three processes of thermal cycling are repeated 20-40 times to produce lots of copies of the DNA
sequence of interest. The new fragments of DNA that are made during PCR also serve as templates to which the
DNA polymerase enzyme can attach and start making DNA. The result is a huge number of copies of the specific
DNA segment produced in a relatively short period of time.
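The doubling described above can be checked with a short calculation. The sketch below assumes ideal amplification by default; the `efficiency` parameter is an assumption added here (it is not in the notes) to show how a less-than-perfect cycle changes the yield.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies of template after a number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle (perfect
    doubling); lower values model incomplete amplification. The efficiency
    parameter is an illustrative assumption, not part of the notes.
    """
    return initial_copies * (1 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (20, 30, 40):
        print(n, "cycles:", f"{pcr_copies(1, n):.3g}", "copies from one template")
```

With perfect doubling, a single template gives about a million copies after 20 cycles and about a billion after 30, which is why 20-40 cycles are sufficient in practice.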
Ø Nature does just this (unintentionally!) and is the reason we have such great genetic diversity
Ø Enzymes are typically extremely selective, but most evolved from less specialised proteins that likely catalysed
a greater diversity of reactions and/or with a greater variety of substrates
Ø As molecular biologists, instead of waiting around for mutations to creep in naturally – we speed things up by
introducing mutations
Ø For a protein of 400 amino acids there are 20^400 possible variants!
Ø No. of mutants in which only 1 amino acid is exchanged for any of the other 19 = 400 x 19 = 7,600
Ø If you go a little further, you quickly get into the ‘millions and billions’!!
Ø What you are trying to achieve can often determine what evolutionary approach you take
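The numbers in the bullets above follow from simple counting, sketched below. The general formula used for k simultaneous substitutions, C(L, k) x 19^k, is standard combinatorics rather than something stated in the notes, and the function names are chosen only for this example.

```python
from math import comb


def total_sequences(length, alphabet=20):
    """All possible sequences of a given length: alphabet ** length."""
    return alphabet ** length


def k_point_mutants(length, k, alphabet=20):
    """Variants that differ from the parent at exactly k positions.

    Choose which k positions change (comb), then pick one of the other
    (alphabet - 1) residues at each of them.
    """
    return comb(length, k) * (alphabet - 1) ** k


if __name__ == "__main__":
    print("digits in 20^400:", len(str(total_sequences(400))))
    for k in (1, 2, 3):
        print(k, "substitution(s):", f"{k_point_mutants(400, k):,}")
```

For a 400-residue protein, single mutants number 7,600, double mutants already run to tens of millions, and triple mutants to tens of billions, which is the jump into the ‘millions and billions’ mentioned above.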
Approaches to Engineering
Excerpt from Minireviews, ChemBioChem 2016, 17, 197 – 203 (www.chembiochem.org; © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim):
… library screening. Although simple in principle, directed evolution usually requires that considerable thought be given to library construction and high-throughput methods of screening and/or selection. In many cases it might not be possible to use evolutionary methods to improve an enzyme.
It should be stressed that rational and evolutionary approaches are not incompatible and that in recent decades hybrid approaches have been commonly employed.[39] This is referred to as semi-rational design or targeted mutagenesis and includes the use of site-saturation mutagenesis (SSM) or randomised mutagenesis over a portion of the enzyme rather than the entire enzyme.[40–42]
… reviews can be consulted for further information.[41, 46–61] The more common and well-tested techniques are described below.
PCR was originally designed to amplify lengths of DNA but has been adapted, in the form of error-prone PCR, to serve as a tool for library generation. DNA polymerases usually have proofreading activity to ensure DNA is replicated with high fidelity. However, as the name suggests, error-prone PCR is intended to introduce random mutations during replication by reducing the fidelity of the DNA polymerase. One approach to reducing the fidelity is to use a polymerase lacking the domain responsible for editing (e.g., Taq polymerase). However, the ab…
Semi-rational Approaches
Saturation mutagenesis (also known as cassette mutagenesis)
• One or more positions are selected and the residue at that/those positions randomized
• Smaller sequence space (not as many variants)
• Tend to be higher quality libraries
• There is also the option to make reduced codon libraries

Iterative saturation mutagenesis
Same principle, except the variants from the first round are subjected to further rounds of saturation mutagenesis
2) select, say, four residues to be randomized (A, B, C, D)
3) form four libraries from this parent sequence
etc.

CASTing involves focusing on the catalytic active center. The Cartesian space within a radius of approximately 10 Å is partitioned into defined regions (sites) to be randomized by saturation mutagenesis.
(Yellow/green residue represents the catalytic amino acid.)

Figure 4 | Generalization of CASTing. Scheme of an enzyme minimized to a spherical catalyst of 10 Å (for illustrative purposes Lip A is shown). The pdb file 1ISP is used as an explanatory model. In (a) the residue in yellow (CPK) represents the catalytic amino acid. Accelrys Discovery Studio Visualizer 1.6 and other protein viewers permit the selection of the residues located in a defined radius (10 Å in this case) around a specific amino acid residue or atom. To create this kind of representation, select the desired amino acid and then under the Menu ‘Edit’, click ‘Select’, define ‘Type’ ‘Amino acid’ and ‘Radius’ in Å. The residues appear in yellow as shown in (b), and this selection can be saved as a ‘group’ in the structure file as described in the text. In this way it is very easy to transform the protein formally into a small catalyst of defined radius as shown in (c). In (d) a general diagram of CAST is shown displaying (arbitrarily for illustrative purposes) five different regions A–E, each one harboring, e.g., 2 or 3 aa.

Figure 1 | Schematic illustration of iterative saturation mutagenesis … four randomization sites A, B, C and D … [remainder of caption not recoverable]
Figure 2 | Schematic illustration of iterative saturation mutagenesis as a strategy for the rapid convergence of beneficial effects of mutations.
TABLE 1 | Statistical analysis of codon usage. Columns: degenerate codon; no. of codons; no. of amino acids; no. of stops; amino acids encoded; 95% coverage (2 positions); 95% coverage (3 positions). [Table values not recoverable]

Recoverable points from the PROTOCOL excerpt shown on the slides:
• Over the past 15 years, traditional strategies of directed evolution based on classical methods such as repeating cycles of epPCR and DNA shuffling have been applied successfully in engineering numerous enzymes. Industrial applications of directed evolution, in particular, require rapid procedures.
• ISM is based on a Cartesian view of the 3D protein structure in which only defined parts of an enzyme are considered. The iterative process allows for high evolutionary pressure in confined regions of protein sequence space.
• In the case of four sites, this would mean convergence after four generations of saturation mutagenesis, and a total of 64 libraries if all pathways were to be explored experimentally. In practice this is not necessary.
• If a wrong decision is made in choosing a site, the mistake can be corrected because the site can be eliminated from further consideration after the initial mutagenesis round.
• NNK degeneracy (N: Ade/Cyt/Gua/Thy; K: Gua/Thy) involves 32 codons and all the 20 proteinogenic aa as building blocks.
• NDT degeneracy (N: Ade/Cyt/Gua/Thy; D: Ade/Gua/Thy; T: Thy) involves only 12 codons and 12 aa (Phe, Leu, Ile, Val, Tyr, His, Asn, Asp, Cys, Arg, Ser and Gly). The screening effort (95% coverage) when randomizing two or three positions then reduces to 430 and 5,175 clones, respectively, which constitutes a drastic reduction.
• Full coverage is not necessarily mandatory. Nevertheless, the higher the coverage, the greater the probability of finding improved variants.
• In the B-factor iterative test (B-FIT), only those amino acids in a protein that display the highest B-factors are targeted. The B-FIT as applied to the thermostabilization of B. subtilis Lip A is featured to illustrate ISM.
• Properties other than thermostability (e.g., enantioselectivity) can be improved, provided the screening system is designed accordingly (‘you get what you screen for’).
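The codon and screening numbers quoted for saturation mutagenesis (NNK: 32 codons; NDT: 12 codons and 12 amino acids; 430 and 5,175 clones for 95% coverage; 64 libraries for four ISM sites) can be reproduced with the sketch below. Two points are assumptions made for this example rather than statements from the notes: coverage is computed with the usual equal-probability sampling estimate T = ln(1 − P) / ln(1 − 1/V), where V is the number of codon combinations, and the ISM library count assumes every ordered pathway through the sites is followed. All function names are illustrative.

```python
from itertools import product
from math import log, perm

# Standard genetic code in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC degenerate-base symbols that appear in the notes.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T",
              "N": "ACGT", "K": "GT", "D": "AGT"}


def expand(degenerate_codon):
    """List every concrete codon encoded by a degenerate codon such as NNK."""
    choices = (DEGENERATE[symbol] for symbol in degenerate_codon)
    return ["".join(codon) for codon in product(*choices)]


def summarise(degenerate_codon):
    """Return (number of codons, number of amino acids, number of stops)."""
    codons = expand(degenerate_codon)
    encoded = {CODON_TABLE[c] for c in codons}
    stops = sum(CODON_TABLE[c] == "*" for c in codons)
    return len(codons), len(encoded - {"*"}), stops


def clones_for_coverage(degenerate_codon, positions, coverage=0.95):
    """Clones to screen so that a given variant is sampled with this probability."""
    variants = len(expand(degenerate_codon)) ** positions
    return round(log(1 - coverage) / log(1 - 1 / variants))


def ism_libraries(n_sites):
    """Total libraries if every ISM pathway through n_sites is explored."""
    return sum(perm(n_sites, k) for k in range(1, n_sites + 1))


if __name__ == "__main__":
    for codon in ("NNK", "NDT"):
        print(codon, summarise(codon),
              clones_for_coverage(codon, 2), clones_for_coverage(codon, 3))
    print("ISM libraries for 4 sites:", ism_libraries(4))
```

The comparison makes the case for reduced codon libraries concrete: dropping from 32 to 12 codons per position shrinks the screening effort at three positions from roughly a hundred thousand clones to about five thousand.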
Gene Shuffling
Gene shuffling involves random shuffling of related genes in order to explore more sequence space than could be achieved when doing more traditional mutagenesis.
A number of homologous genes are digested with DNase, creating numerous fragments. PCR is performed without additional primers. The result is a library of recombined genes.
Figure 3. Similarly, with DNase I, several members of the gene family are fragmented, and then PCR is run. During PCR, different members of the family are cross-primed. For example, homologous DNA fragments will anneal to each other. The hybrids formed from this process are then used to generate a library of mutants that are screened for unique properties.
Figure 4. Process of digestion and priming in PCR.
Figure 1 (Meyer et al., Curr Protoc Mol Biol. Author manuscript; available in PMC 2015 January 06):
The sequence space of all possible proteins is depicted as a 2-dimensional plane. The x- and y-axis represent
genotypic distance, such that neighbouring points have a similar genotype and distant points are more dissimilar.
Left) A starting parental sequence (black dot) is randomly mutagenized, resulting in a typical random mutagenesis
library (grey dots). This library explores the sequence space near the parental sequence, but does not contain the
sequence exhibiting a new function (star). Right) Multiple, divergent parental sequences (black dots) are
recombined, resulting in a gene-shuffled library (grey dots). This library explores the larger, intervening area, and
does contain the sequence exhibiting a new function (star). Thus, the latent evolutionary potential of a gene family
can be tapped to find new functionality.
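The recombination idea behind gene shuffling can be illustrated with a toy model. This is a deliberately simplified sketch, not the laboratory procedure: it assumes the parent genes are already aligned and of equal length, it places crossover points at random instead of modelling DNase fragmentation and primerless PCR, and the name `shuffle_genes` is hypothetical.

```python
import random


def shuffle_genes(parents, n_crossovers, rng):
    """Build one chimeric sequence from aligned parents of equal length.

    Random crossover points split the gene into segments, and each segment
    is copied from a randomly chosen parent (toy model of recombination).
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model assumes aligned parents of equal length")
    points = sorted(rng.sample(range(1, length), n_crossovers))
    segments = []
    start = 0
    for end in points + [length]:
        donor = rng.choice(parents)
        segments.append(donor[start:end])
        start = end
    return "".join(segments)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]
    for _ in range(5):
        print(shuffle_genes(parents, n_crossovers=3, rng=rng))
```

Each chimera mixes blocks from different parents, so the library samples the region of sequence space between the parental sequences instead of only the immediate neighbourhood of one parent.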
[Figure: directed evolution cycle. Labels: parent gene (= parent protein); a few mutations; 1000’s of colonies; directed evolution; evolved gene product. Method lists shown in the figure: in vivo, in vitro; complementation, activation of transcription, detoxification; absorbance, fluorescence, HPLC, mass spectrometry, NMR.]
Figure 1
General strategy for directed evolution and selected experimental methods. Protein catalysts are optimized using iterative cycles of gene diversification by mutagenesis (1), gene expression (2), screening or selection for improved variants (3), and subsequent gene amplification (4). Abbreviations: FACS, fluorescence-activated cell sorting; HPLC, high-performance liquid chromatography; IVC, in vitro compartmentalization; NMR, nuclear magnetic resonance; µSCALE, microcapillary single-cell analysis and laser extraction; PACE, phage-assisted continuous evolution; PCR, polymerase chain reaction.
automated using modern robotics systems, thus allowing more efficient laboratory evolution. For
a more comprehensive description of relevant methodology, an excellent review by Packer & Liu
(2) may be consulted.
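The iterative cycle of diversification, screening and selection of the best variant can be mimicked with a toy simulation. Everything here is an assumption made for illustration: the "screen" simply counts matches to a hypothetical target sequence, each library member carries one random substitution, and the names `fitness`, `mutate` and `directed_evolution` are invented for this sketch.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 proteinogenic amino acids


def fitness(sequence, target):
    """Hypothetical screen: number of positions that match the target."""
    return sum(a == b for a, b in zip(sequence, target))


def mutate(sequence, rng):
    """Return a copy of the sequence with one randomly substituted residue."""
    position = rng.randrange(len(sequence))
    return sequence[:position] + rng.choice(ALPHABET) + sequence[position + 1:]


def directed_evolution(parent, target, rounds, library_size, rng):
    """Repeat: build a library, screen it, keep the best hit as the new parent."""
    history = [fitness(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: fitness(s, target))
        if fitness(best, target) > fitness(parent, target):
            parent = best  # the improved variant seeds the next round
        history.append(fitness(parent, target))
    return parent, history


if __name__ == "__main__":
    rng = random.Random(42)
    evolved, scores = directed_evolution("AAAAAAAAAA", "MKTLVHGDSW",
                                         rounds=15, library_size=200, rng=rng)
    print(evolved, scores)
```

Because the parent is only replaced when a screened variant is better, the score never decreases from round to round, which mirrors the stepwise accumulation of beneficial mutations in a real campaign.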
2.1. Generating Genetic Diversity
Exhaustive sampling of sequence space is impossible. A library of fully randomized 40-residue
polypeptides, built from the 20 common proteinogenic amino acids and containing only a single
molecule of each possible sequence, would exceed the mass of Earth by several orders of magnitude.
Typical enzymes are even larger, usually containing several hundred amino acids.
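The comparison with the mass of Earth can be checked with a back-of-the-envelope calculation. The sketch below assumes an average residue mass of 110 Da, which is a common rule of thumb and not a figure taken from the notes; the Earth mass and Avogadro constant are standard values.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
EARTH_MASS_KG = 5.972e24        # mass of Earth in kg
MEAN_RESIDUE_MASS_DA = 110.0    # assumed average residue mass (rule of thumb)


def library_mass_kg(length, alphabet=20, residue_mass_da=MEAN_RESIDUE_MASS_DA):
    """Mass of a library holding one molecule of every possible sequence."""
    n_sequences = alphabet ** length
    molar_mass_g_per_mol = length * residue_mass_da
    return n_sequences * molar_mass_g_per_mol / AVOGADRO / 1000.0


if __name__ == "__main__":
    mass = library_mass_kg(40)
    print(f"library mass: {mass:.2e} kg")
    print(f"ratio to Earth: {mass / EARTH_MASS_KG:.1e}")
```

Under these assumptions the 40-residue library comes out at roughly ten thousand Earth masses, in line with the statement that it exceeds the mass of Earth by several orders of magnitude.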
Genotype-phenotype linkage
With proteins, including enzymes, the ‘phenotype’ is the binding ability or reactivity (the properties)
✓ General   ✓ Quantitative
Ø The prevalence of chiral amines in bioactive natural products and drug molecules means that these enzymes have
become extremely important in industry
Ø As the substrate scope of the wild-type enzymes is often limited, these enzymes are frequently engineered
Ø If we look at the overall transformation, it is clear why an assay that simply relies on the detection of an amine
product may not be suitable
[Scheme: transaminase reaction. The amine donor and the ketone substrate (R, R1) are converted to the ketone coproduct and the amine product. The cofactor shuttles between pyridoxal 5’-phosphate (PLP, shown in its enzyme-bound form with the active-site Lys) and pyridoxamine 5’-phosphate (PMP).]
Transaminases are PLP-dependent enzymes that are capable of catalysing the conversion of aldehydes and
pro-chiral ketones to the corresponding (chiral) primary amines. The enzymes require two substrates, an amine
donor and an amine acceptor, and the transformation consists of two half reactions. The PLP coenzyme is
responsible for shuttling the amino group from the amine donor to the acceptor substrate to form the product.
The reaction is freely reversible, and the position of equilibrium depends on the nature of
the amine and ketone substrates.
Finding a suitable assay for transaminase enzymes can be challenging because an amine and carbonyl are
both consumed and produced during the reaction. This means that an assay that simply detects the formation
of an amine product will typically not provide any useful information because there is an amine substrate in the
reaction anyway. Likewise, a screen that detects the formation of a ketone product (if you run the reaction in
the reverse direction) may also not work well.
One of the most widely used TA assays is shown below and involves the detection of acetophenone when
α-methylbenzylamine is used as the amine donor.
Acetophenone assay
[Scheme: the TA library converts a ketone substrate (R) and α-methylbenzylamine into the amine product and acetophenone.]
✗ Only works for low-absorbing ketones; high enzyme loadings interfere with absorbance; low-medium throughput;
ideal if used with a robotics platform (but most groups don’t have these).
https://youtu.be/PmgcektmndE
S. Schätzle, M. Höhne, E. Redestad, K. Robins, U. T. Bornscheuer, Anal. Chem. 2009, 81, 8244
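An absorbance-based assay of this kind is normally converted into a reaction rate with the Beer-Lambert law, A = ε·c·l. The sketch below shows that arithmetic only. The function name and the numbers in the usage example are illustrative assumptions; the extinction coefficient and detection wavelength for acetophenone should be taken from the cited paper or measured, not from this example.

```python
def rate_from_absorbance(delta_abs_per_min, epsilon_per_mM_cm, path_cm=1.0):
    """Product formation rate in mM/min from an absorbance slope.

    Rearranged Beer-Lambert law: concentration = A / (epsilon * path length).
    epsilon_per_mM_cm is the extinction coefficient of the detected product
    in mM^-1 cm^-1 and must be supplied by the user.
    """
    return delta_abs_per_min / (epsilon_per_mM_cm * path_cm)


if __name__ == "__main__":
    # Illustrative numbers only: a slope of 0.12 absorbance units per minute
    # and a placeholder extinction coefficient of 12 mM^-1 cm^-1.
    rate = rate_from_absorbance(0.12, epsilon_per_mM_cm=12.0)
    print(f"{rate:.3f} mM/min")
```

The same relation explains the limitations listed above: any other strongly absorbing species (a highly absorbing ketone substrate or a high protein loading) adds to A and masks the contribution of acetophenone.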
Control Assay
A. P. Green, N. J. Turner, E. O'Reilly, Angew. Chem. Int. Ed. 2014, 53, 10714
Using o-xylylenediamine as the donor leads to the formation of an imine co-product, which tautomerizes to
isoindole. Isoindole is very unstable and rapidly polymerizes, affording a dark precipitate. This can be exploited
for a medium-throughput colorimetric assay to detect TA activity.
The assay is very sensitive but is qualitative, not quantitative. This is because the dark polymer that forms
precipitates out of solution, so the conversion cannot be measured by UV, for example. However, it does
give useful information: it tells you whether or not a transaminase reaction is occurring. Useful leads can then be
followed up by HPLC, for example.
[…] both analogues can readily gain access to the active site with close proximity to PLP (ESI† Fig. S1). Assays
were performed with the amine acceptor 7, which led to the formation of a yellow and brownish coloured
precipitate, with 12 and 13 respectively (Fig. 2). Butylamine formation was confirmed by HPLC analysis showing
that reactions proceeded with similar conversions to those observed with amine donor 1, 74% (with 12) and 65%
(with 13). Due to the formation of a yellow precipitate, which is the same colour as PLP, compound 12 is less
suited as an amino donor. However, amine donor 13 can also be used for the screening of TAms, and as a cyclic
donor may be useful to identify TAms that are able to accept cyclic substrates.
Apart from the application in multi-well plates, a colony-based colorimetric assay to provide a HT method that
is amendable for rapid screening of TAm variant libraries was also developed. In a control reaction with wild
type E. coli BL21 (DE3) incubated with 1 (12.5 mM) and 2 (5 mM), background conversion by the host intrinsic
enzymes was excluded as it showed no coloration (Fig. 3A). However, the conversion of 1 (12.5 mM) and 2 (5 mM)
with recombinant E. coli BL21 (DE3) containing CV-TAm resulted in the formation of intensely red coloured
colonies (Fig. 3B).
Fig. 1 2-(4-Nitrophenyl)ethan-1-amine 1 based transaminase screening. The assay was performed in triplicate.
1 (25 mM), amino acceptor (10 mM), PLP (1 mM), KPi buffer pH 7.5 (100 mM) and enzyme as crude lysate, 18 h,
30 °C, 200 rpm. (A) CV-TAm; (B) Pp-TAm; (C) Kp-TAm; (D) ArRMut11; (E) no enzyme.
Fig. 2 Assay coloration when using amino donors 1, 12 and 13 with CV-TAm and acceptor 7.
Fig. 3 Colony-based TAm screening assay using amine donor 1 (12.5 mM) and acceptor benzaldehyde 2 (5 mM) at
30 °C for 30 min. The assay was performed in triplicate. Control assay with WT E. coli BL21 (DE3) with 2 (A).
Assays using E. coli BL21 (DE3) containing CV-TAm with 2 (B) and without 2 (C).
[…] with most enzymes in bioconversions (A–D: 1–6). CV-TAm, for example, readily accepted amine acceptors 2, 7,
8 and 11 (A2, A4–A6) resulting in intensely red coloured solutions. However, with amine acceptors 9 and 10
substantially less coloration was observed (A1, A3) indicating only moderate conversion of these substrates.
Bioconversions with Pp-TAm and ArRMut11 gave similar results, however, with less intense coloration (B1–B6,
D1–D6) in particular with amine acceptor 11 (B5, D5) compared to the […]
D. Baud et al. Chem. Commun., 2015, 51, 17225-17228
[…] a significant reduction in background (Fig. S4†), (S)-1-phenylethylamine appeared to be most effective,
possibly due to better diffusion of this substrate. However, this demonstrates that other amines can also be
used to quench endogenous pyruvate, for example with aminotransferases that do not accept 1-phenylethylamine
as a substrate.
Evolution towards substituted acetophenones
To test the methodology, HEWT was subjected to directed evolution using error-prone PCR to generate a
completely randomised mutant library with a high number of mutations. The enzyme has minimal activity towards
the aromatic substrate para-nitroacetophenone (1a) and this was selected as the amino acceptor for the initial
screening, using ortho-xylylenediamine as the amino donor. DMSO (10% (v/v)) was required for substrate
solubility and enhanced cell-permeability. The discrimination between wild-type and an improved variant relies
on the rapidity with which the enzyme […]
[…] template did not provide any significant hits, an improved variant was isolated from the B1 library. The new
mutant, identified as B2, afforded comparable molar conversion after 2 hours and a further 1.5-fold increase in
turn-over frequency (Table 1). The e.e. measured at 24 h was in all cases >99% (though the overall yields were
low due to experimental conditions needed to assess the kinetic behaviour of the mutants). A third round of
mutagenesis, using the B2 variant as a template, initially failed to identify better variants, as the colour
change on solid screening was too rapid and did not allow discrimination between parental activity and mutants.
Reducing the concentration of para-nitroacetophenone and ortho-xylylenediamine from 10 mM to 1 mM, with just
5% (v/v) DMSO prolonged the screening window to ca. 1 h. An additional variant, B3, was isolated from the
screening and while, upon purification, it displayed comparable reaction velocity and 2 h conversions to B2, it
had 2-fold higher expression levels which justifies a more rapid colour development with respect to the
parental variant.
[Table: five groups of columns, each giving TOF (10⁻³ s⁻¹), m.c. (%) and e.e. (%) per substrate; values not recoverable.]
a Biotransformation reactions were performed with 10 mM ketones, 500 mM L-alanine, 0.1 mM PLP, 0.1 mg mL⁻¹
(1.9 µM) enzyme in 50 mM phosphate buffer pH 8 and 10% (v/v) DMSO at 37 °C (see Experimental in the ESI,
Fig. S9). All experiments were conducted in triplicate and the standard error is reported accordingly.
Mutations; A1: W56C, V435A; B1: W56C, L211V, L306M; B2: W56C, L211V, L306M, V361A, Q388R, P453L;
B3: W56C, L211V, A254V, L306M, V361A, Q388R, P453L.
M. Planchestainer et al. Chem. Sci., 2019, 10, 5952
Enzyme engineering requires an efficient screen
Transaminases
o-xylylenediamine assay
[…] were processed using XDS and assigned to a triclinic (P1) space group using POINTLESS and scaled using
SCALA; both implemented in the CCP4i suite. The 3D structure of HEWT was solved using Molrep and the structure
of an aspartate aminotransferase from Pseudomonas aeruginosa (PDB entry 5TI8; 43% sequence identity over 433
aligned residues) as a search model. The structure was manually built using COOT and refined using
phenix.refine until satisfactory refinement parameters were achieved (Rwork = 16 %; Rfree = 21.0 %). All
residues are located in allowed regions of the Ramachandran except for Ala283 in both chains and Lys284
(chain A); the catalytic lysine is often found as an outlier in many PLP-dependent enzymes due to its covalent
interaction with PMP. Data collection parameters and refinement statistics are shown in Supplementary Table 3.
[Figure: Supplementary Figure 1 | Optimised amino acceptor screening based on the ortho-xylylenediamine assay. Scheme elaborated from Weis et al.]
Selected colonies that rapidly turned black (those much faster than the wild-type control)
M. Planchestainer et al. Chem. Sci., 2019, 10, 5952
Enzyme engineering requires an efficient screen
Transaminases
o-xylylenediamine assay
E. coli BL21(DE3) cells are transformed with the HEWT library of interest (A) and colonies are grown overnight
on a nitrocellulose membrane placed on LB agar plates supplemented with 100 µg/mL of ampicillin (B).
Membranes are transferred to LB agar plates supplemented with 100 µg/mL of ampicillin and 1 mM IPTG for
protein expression for 8 hours (C). To minimise false-positive background colour formation and to stop the cell
metabolism, colonies are dialyzed overnight by transferring the membrane to a dialysis plate containing 2% agar,
10 mM Tris-HCl pH 8, and 0.1 mM PLP (D). Afterwards, the background is depleted by placing the membrane on
filter paper soaked in 10 mM (S)-1-phenethylamine in phosphate buffer pH 8, 1% (v/v) DMSO for 30 minutes (E).
Finally, screening is conducted by incubation of the membranes on assay plates containing 10 mM
ortho-xylylenediamine and 10 mM amino acceptor of interest in phosphate buffer pH 8, 10% (v/v) DMSO (F).
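Preparing plates like these comes down to routine C1·V1 = C2·V2 dilutions. The following is a minimal sketch; the stock concentrations (100 mg/mL ampicillin, 1 M IPTG) and the 25 mL plate volume are hypothetical, as the source does not state them.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ml):
    """Volume of stock (in µL) needed to reach final_conc in final_volume_ml.

    final_conc and stock_conc must be given in the same units.
    Uses C1 * V1 = C2 * V2, so V1 = C2 * V2 / C1.
    """
    if stock_conc <= 0 or final_volume_ml <= 0 or final_conc < 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc / stock_conc * final_volume_ml * 1000.0


# Assumed stocks and plate volume (not from the source)
PLATE_ML = 25.0
amp_ul = stock_volume_ul(100.0, 100_000.0, PLATE_ML)  # µg/mL final, µg/mL stock
iptg_ul = stock_volume_ul(1.0, 1000.0, PLATE_ML)      # mM final, mM stock
print(f"ampicillin: {amp_ul:.1f} µL, IPTG: {iptg_ul:.1f} µL per plate")
```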
[…] deposited in the PDB was assessed using ENDscript 2.0 (http://endscript.ibcp.fr). HEWT was compared with 124
structures with sequence conservation >30%. As expected, the highest conservation (sequence and structural) is
located in the PLP-dependent transferase-like domain and active site region (Fig. 3A).
Sequencing of the selected mutants identified two amino acid changes in the A1 variant (W56C, V435A), while the
B1 variant displayed three amino acid substitutions (W56C, L211V, L306M) (notes in the ESI†). Interestingly,
both mutants harbour a tryptophan to cysteine change in position 56, which is located in the active site. This
residue has previously been identified as a hotspot in in silico rational design studies. In this case,
removing the large indole ring and replacing it with the smaller cysteine side chain allows for easier
substrate binding. Furthermore, the thiol can promote hydrogen bonding with the […]
Enzyme engineering requires an efficient screen
Transaminases
o-xylylenediamine assay
[Scheme: ketone substrate (R), labelled "Poor activity"]
Ø First round of evolution involved screening ca. 15,000 colonies and identified two improved clones (A1 and B1)
Ø A second round of evolution with both first generation clones afforded an improved variant B2, arising from the B1
parent library
The second round of evolution introduced three additional mutations (V361A, Q388R, P453L) in the B2 variant,
which are also located outside of the active site. All three are exposed to the solvent, and their side-chains do
not make any significant stabilising intramolecular interactions. B3 introduced mostly silent mutations, apart
from a single A254V mutation neighbouring residue D255, which forms a hydrogen bond with the pyridine-type
nitrogen of the PLP cofactor.
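The lineage of the variants can be checked by comparing their mutation lists as sets. The sketch below uses the mutation lists quoted from Planchestainer et al. (A1, B1, B2, B3); the helper names and the parsing approach are illustrative assumptions.

```python
import re

# Mutation lists as reported for each variant in the cited paper
VARIANTS = {
    "A1": "W56C, V435A",
    "B1": "W56C, L211V, L306M",
    "B2": "W56C, L211V, L306M, V361A, Q388R, P453L",
    "B3": "W56C, L211V, A254V, L306M, V361A, Q388R, P453L",
}

# One-letter wild-type residue, position, one-letter replacement (e.g. W56C)
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(text):
    """Return a set of (wild_type, position, mutant) tuples."""
    parsed = set()
    for token in text.split(","):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"not a point mutation: {token!r}")
        parsed.add((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def new_mutations(parent, child):
    """Mutations present in child but not in parent, sorted by position."""
    gained = parse_mutations(VARIANTS[child]) - parse_mutations(VARIANTS[parent])
    return [f"{wt}{pos}{mut}" for wt, pos, mut in sorted(gained, key=lambda m: m[1])]


print("B1 -> B2:", new_mutations("B1", "B2"))
print("B2 -> B3:", new_mutations("B2", "B3"))
shared = parse_mutations(VARIANTS["A1"]) & parse_mutations(VARIANTS["B1"])
print("shared by A1 and B1:", shared)
```

The set differences reproduce the description in the notes: three mutations are gained from B1 to B2, one from B2 to B3, and W56C is the only change common to both first-round clones.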
Improved variants have been isolated and identified using a screen that tells you very little about what is
really going on (kinetics, turnover, substrate scope…).
It simply identifies mutants that turn over the o-xylylenediamine substrate quickly (at least a small amount
of it), and this guides further engineering, all without needing to know anything about the protein structure.
Fig. 3 HEWT 3D structure conservation and mutant residue positions. Secondary structure of the HEWT monomer
shown in sausage representation, as automatically generated by ENDscript 2.0 (http://endscript.ibcp.fr).
Structure conservation between chain A of HEWT and 124 structure homologs present in the PDB is indicated by
ribbon thickness, with regions of low conservation being thicker than highly conserved regions (thin regions).
Sequence conservation is indicated by red shading; the redder the residue, the more conserved it is. Mutated
residues in variants A1 (blue), B1 (yellow), B2 (green) and C1 (orange) are shown as sticks. The B3 mutant shares
all B2 mutations and contains an extra A254V mutation (pink sticks). P453 from variants B2 and B3 is not present
in the model. The N- and C-termini are indicated and PLP is shown in sticks. This figure was generated using
Pymol 2.0.6. Modelling of the active site for the A1 and B1 mutants, where the bulky Trp (the steric
encumbrance of the residue is shown as an orange cloud) (D) is substituted with a cysteine (E), allowing the
aromatic ring substituent (para-nitroacetophenone in the aldimine intermediate complex with the pyridoxamine
phosphate (PMP)) to be more easily accommodated.
Enzyme engineering requires an efficient screen
Monoamine oxidase
A good screen is ESSENTIAL for successful evolution. This MAO screen is particularly powerful because it
detects the reaction by-product, hydrogen peroxide. This makes it independent of the amine substrate used: no
matter what amine you test, if it is a substrate, you will see a colour change. This is very important for
directed evolution. Many screens are substrate dependent, i.e. they only work with a particular amine, which
limits their scope. Another advantage of this screen is that it can be used for any oxidase where hydrogen
peroxide is the by-product.
[Scheme: directed evolution cycle with the labels: Parent MAO gene (= parent protein); (a few) random mutations; Select/screen; No improvement; Repeat.]
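The cycle of mutation, screening and repetition can be sketched as a toy simulation. Everything here is an assumption made for illustration: the "screen" is a hypothetical scoring function that counts matches to an arbitrary target sequence, which stands in for a real activity assay, and the sequences are invented.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_screen(seq, target):
    """Hypothetical stand-in for a screen: positions matching the target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rng, n_mutations=1):
    """Introduce (a few) random point mutations into the sequence."""
    residues = list(seq)
    for _ in range(n_mutations):
        pos = rng.randrange(len(residues))
        residues[pos] = rng.choice(AMINO_ACIDS)
    return "".join(residues)


def directed_evolution(parent, target, rounds=10, library_size=50, seed=0):
    """Mutate, screen, keep the best variant if it beats the parent, repeat."""
    rng = random.Random(seed)
    history = [toy_screen(parent, target)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: toy_screen(s, target))
        if toy_screen(best, target) > toy_screen(parent, target):
            parent = best  # improved variant becomes the next parent
        # no improvement: repeat the round with the same parent
        history.append(toy_screen(parent, target))
    return parent, history


final, scores = directed_evolution("AAAAAAAAAA", "MKVLSTEGHW")
print(final, scores)
```

Because a variant replaces the parent only when it scores higher, the recorded score never decreases from one round to the next, which mirrors the "no improvement, repeat" branch of the cycle.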
Ø Can screen for lipases using tributyrin (or equivalent) that can be incorporated
into agar
Ø This could be useful for lipase discovery or for selecting active lipases from
an evolution experiment, for example. It won't, however, give much more information
about the enzyme, e.g. substrate scope, enantioselectivity…
There are many ways to screen for lipase activity, both in liquid and on solid phase, including enantioselective
screens. The one shown here is useful for detecting lipase activity on solid phase. It relies on a compound called
tributyrin, which is incorporated into the agar media on the plate. If the cells express an active lipase, the
enzyme will hydrolyse the tributyrin and produce a zone of clearing on the plate. If there are hundreds of
colonies on the plate, a zone of clearing appears only around the colonies that are expressing an active
lipase.
Problem
[Scheme: a ketone is converted to the (S)-amine by a transaminase; substrate structure not recoverable.]
1) A given transaminase has no activity towards the above ketone and must be engineered.
Outline an engineering strategy you would take to alter the substrate scope of the enzyme to enable it
to accept this ketone.
Design a screen to assist in the engineering of a transaminase for the above reaction.
Some info
• The wild-type enzyme does not accept the starting substrate
• There is no crystal structure of the protein
• The DNA sequence is known
• A crystal structure of a similar protein (87% sequence similarity) is available in the literature
2) Having developed a protein that accepts the substrate, you now want it to work in 30% MeOH, but the mutant
shows only low activity under these conditions.
Suggest how you would further engineer the protein to enable it to tolerate these conditions.