
CHEM41290 Notes Part I

Uploaded by

Julia Machaj
Copyright
© © All Rights Reserved

10/19/20

Modern Techniques to Monitor Biological Interactions
CHEM 41290

Dr Elaine O’Reilly
Office 3.11 Chemistry
[email protected]

Biocatalysis
Key topics and learning objectives

Ø Genotype-phenotype linkage in vivo and in vitro


Ø Enzyme/protein engineering
Ø Drug discovery

Ø Mass spectrometry to study protein-protein interactions

Reading material

Ø Lecture notes

Ø Cited publications


Biocatalysis
Biocatalysis typically refers to the application of naturally occurring or modified
enzymes to perform a chemical transformation.

Simple chemical building blocks → (biocatalysis) → High-value chemicals,
e.g. pharmaceuticals, natural products, biofuels

Advantages
• Activity
• Selectivity
• Environmentally benign
• Enzyme cascades
• Tuneable
• Only need to ‘make’ it once

Challenges
• Substrate range
• Stability
• Restricted to ‘biological reactions’


Where Do We Get Enzymes


• Nature is a rich source of enzymes that are capable of catalyzing synthetically challenging reactions

• Interesting enzymes can be found in plants, mammals and microorganisms

• Studying biosynthetic pathways is an excellent method for identifying new enzymes

• But how do we find an enzyme for a desired transformation?

• Screen libraries of bacteria
• Genome mining
• Enzyme engineering
• Previously identified source of enzyme

Developing Biocatalysts
Enzymes isolated from natural sources do not always display the required properties – substrate scope, activity,
stability (temperature/pH/solvent).

Enzymes can be engineered to improve these properties, and there are a variety of approaches that can be taken
to modify the protein, depending on how much information is known about the protein and what properties you
would like to optimise.

To understand how to engineer proteins, we need to understand (or remind ourselves of) a few basic principles.


The Genetic Code


Ø Deoxyribonucleic acid (DNA) or ribonucleic acid (RNA)

Ø Composed of three parts

1) Nitrogen base

2) Sugar
3) Phosphate group

The Genetic Code


Ø Deoxyribonucleic acid (DNA)

Ø A nucleoside is a nitrogen base linked to a sugar

Ø When the sugar is phosphorylated it is called a nucleotide


The Genetic Code


Ø Base pairing in DNA

Ø Every base pair consists of one purine and one pyrimidine

Ø The paired bases are held together by hydrogen bonds (two for A/T, three for G/C)

Ø Base pairing is complementary – A/T

Ø Base pairing is complementary – G/C
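The pairing rules above are enough to work out the partner strand of any DNA sequence. The following is a minimal sketch in Python; the function names and the example sequence are illustrative assumptions, not part of the lecture material.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, read in the same direction."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5' to 3' (the strands are antiparallel)."""
    return complement(strand)[::-1]


example = "ATGGCC"  # arbitrary illustrative sequence
print(complement(example))          # TACCGG
print(reverse_complement(example))  # GGCCAT
```

Because the two strands run in opposite directions, the reverse complement is the form normally written down for the second strand.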

The Genetic Code


The Genetic Code


Insulin A Chain
Protein (Amino Acid) Sequence

DNA Sequence

Insulin A Chain With Mutations

What mutations have been incorporated?

What effect would you expect to have on the protein?
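One way to approach these two questions is to translate both DNA sequences and compare them codon by codon. The sketch below assumes the standard genetic code; the wild-type sequence is an assumed coding sequence for the first seven residues of the A chain (GIVEQCC) and the mutant is a hypothetical example, so neither is the sequence shown on the slide.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence, reading complete codons from position 0."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_mutations(wild_type: str, mutant: str):
    """Compare two equal-length coding sequences codon by codon."""
    changes = []
    for i in range(0, len(wild_type) - 2, 3):
        wt_codon, mut_codon = wild_type[i:i + 3], mutant[i:i + 3]
        if wt_codon == mut_codon:
            continue
        wt_aa, mut_aa = CODON_TABLE[wt_codon], CODON_TABLE[mut_codon]
        if wt_aa == mut_aa:
            kind = "silent"      # codon changed, amino acid unchanged
        elif mut_aa == "*":
            kind = "nonsense"    # premature stop codon
        else:
            kind = "missense"    # amino acid substituted
        changes.append((i // 3 + 1, wt_codon, mut_codon, wt_aa, mut_aa, kind))
    return changes


WILD_TYPE = "GGCATTGTGGAACAATGCTGT"  # assumed sequence encoding GIVEQCC
MUTANT = "GGCATTGTGGAACAGTGCTAT"     # hypothetical mutant, for illustration only

for change in classify_mutations(WILD_TYPE, MUTANT):
    print(change)
```

A silent mutation leaves the protein unchanged, whereas a missense mutation at a cysteine could remove a disulfide bond and would be expected to affect folding.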


Heterologous Expression
• If we want to use enzymes to catalyze challenging reactions, the wild-type (WT) enzyme might do the job!

• The organism can sometimes be used directly to catalyze the reaction, but there are limitations to this:
compatibility of culturing and reaction conditions; solubility of substrates; purification of product; side reactions ...

• Often more useful to clone and perform heterologous expression of the gene of interest

Clone the corresponding gene → Wild-type enzyme


Heterologous gene expression/protein production


Cloning: The gene coding for the protein of interest is combined with a vector carrying an antibiotic
resistance marker (e.g. β-lactamase) to give recombinant DNA.

Transformation: Bacteria carrying recombinant DNA. The vast majority of transformed cells only take up one
plasmid (important for directed evolution).

Plating: Bacterial colonies are grown on agar containing antibiotics. Only cells containing the plasmid of
interest survive. Each colony is derived from a single cell.

Expression: Inoculate cultures with bacterial cells from a single colony. At a specific cell density,
transcription (DNA copied to mRNA) and translation (templated protein synthesis coded by mRNA) are induced
by addition of an inducer to produce the target protein.

Harvest: Centrifuge to harvest the cells.

Cell lysis: Disrupt the cell wall.

DNA or protein purification: Gives isolated DNA or purified protein.


Heterologous gene expression/protein production


The linear gene is annealed with another piece of DNA to generate a plasmid/vector. Plasmid DNA is circular and
more easily taken up by the host. Plasmids are commercially available (pET vectors are very commonly used)
and are designed to allow the insertion of an array of genes. Importantly, they also carry other features,
including antibiotic resistance genes and purification tags.

Cloning – The gene encoding the protein of interest is annealed to the vector to generate a circular plasmid
(piece of DNA).
Transformation – The plasmid is introduced into bacterial cells that have been made competent (able to
take up circular DNA). The cells are typically incubated in a suitable liquid growth medium so that they
begin to divide (grow). This allows the newly transformed cells to begin expressing the genes that
encode antibiotic resistance. Typical growth time is 30-60 minutes.
Plating – After the short incubation period, cells are spread onto a nutrient agar plate containing the antibiotic
and incubated. The resulting bacterial colonies should each contain the plasmid, because any cells that do not
will lack the antibiotic resistance gene and will be killed. Importantly, as each colony is derived from a single
cell, all cells in an individual colony contain the same plasmid. This becomes important when making mutants,
which we will see later.
At this point, the cells can be grown up to isolate multiple copies of the recombinant DNA, which can be stored or
used to make mutants etc. Alternatively, gene expression can be induced, resulting in protein production from the
recombinant gene.


Enzyme Engineering
However, WT-enzymes are often not suitable for synthetic applications and their properties need to be tuned before
they are practically useful

Enzyme engineering is often required (rational, semi-rational, random). The approach depends on a number of
factors:
• how much is understood about the enzyme (structure)
• what degree of change is expected from the enzyme
• the availability of a suitable assay

Pre-existing knowledge of the enzyme is extremely beneficial (but not essential)


• sequence and structure
• close ancestral and evolutionary relations
• the active site and reaction mechanism


How do we incorporate mutations


The Polymerase Chain Reaction

If the goal is to incorporate random mutations, a low-fidelity DNA polymerase must be used.
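As a rough picture of what a low-fidelity polymerase does, each copied base can be treated as having a small chance of being replaced by a different base. The sketch below is a toy model only: the `error_rate` parameter, the uniform choice of the replacement base and the template are all assumptions made for illustration, and real polymerases show biased error spectra.

```python
import random


def error_prone_copy(template: str, error_rate: float, rng: random.Random) -> str:
    """Copy a DNA sequence, substituting each base with probability error_rate."""
    copy = []
    for base in template:
        if rng.random() < error_rate:
            # Misincorporation: pick one of the three other bases at random.
            copy.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def count_differences(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)        # fixed seed so the run is repeatable
template = "ATGC" * 25        # arbitrary 100-base template
variant = error_prone_copy(template, error_rate=0.02, rng=rng)
print(count_differences(template, variant), "substitutions in", len(template), "bases")
```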


How do we incorporate mutations


The polymerase chain reaction (PCR) was originally developed in 1983 by the American biochemist Kary Mullis.
He was awarded the Nobel Prize in Chemistry in 1993 for his pioneering work. PCR is used in molecular biology
to make many copies of (amplify) small sections of DNA or a gene. Using PCR, it is possible to generate
thousands to millions of copies of a particular section of DNA from a very small amount of DNA.

There are three main stages: Denaturing – when the double-stranded template DNA is heated to separate it into
two single strands. Annealing – when the temperature is lowered to enable the DNA primers to attach to the
template DNA. Extending – when the temperature is raised and the new strand of DNA is made by the Taq
polymerase enzyme. These three stages are repeated 20-40 times, doubling the number of DNA copies each
time.
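The doubling per cycle can be put into numbers with a short calculation. This is a minimal sketch; the `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling, as described above), and real reactions plateau as reagents run out.

```python
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Idealised copy number after a given number of PCR cycles.

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    return starting_copies * (1 + efficiency) ** cycles


for n in (20, 30, 40):
    print(n, "cycles:", f"{copies_after(n):.3g}", "copies from one template")
```

With perfect doubling, 20 cycles give about a million copies of each starting molecule and 30 cycles give about a billion.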

Denaturation: During this stage the cocktail containing the template DNA and all the other core ingredients is
heated to 94-95 °C.
The high temperature breaks the hydrogen bonds between the bases in the two strands of template DNA, so the
two strands separate. This results in two single strands of DNA, which act as templates for the
production of the new strands of DNA. The temperature must be maintained at this stage for long
enough to ensure that the DNA strands have separated completely. This usually takes 15-30 seconds.


How do we incorporate mutations


Annealing: During this stage the reaction is cooled to 50-65 °C. This enables the primers to attach to a specific
location on the single-stranded template DNA by way of hydrogen bonding (the exact temperature depends on the
melting temperature of the primers you are using). Primers are single strands of DNA or RNA that are
around 20 to 30 bases in length. The primers are designed to be complementary in sequence to short sections of
DNA at each end of the sequence to be copied. Primers serve as the starting point for DNA synthesis. The
polymerase enzyme can only extend an existing double-stranded region, so only once the primer has bound can the
polymerase attach and start making the new complementary strand of DNA from the loose DNA bases.
The two separated strands of DNA are complementary and run in opposite directions (from one end, the 5’ end,
to the other, the 3’ end); as a result, there are two primers: a forward primer and a reverse primer. This step
usually takes about 10-30 seconds.
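The relationship between the template, the two primers and their melting temperatures can be sketched as follows. The helper names and the template are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is an assumption brought in for illustration; it is only a rough guide for short oligonucleotides, and primer design software uses more detailed models.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Partner strand written 5' to 3'."""
    return "".join(PAIRS[b] for b in reversed(seq.upper()))


def design_primers(template: str, length: int = 20):
    """Return (forward, reverse) primers flanking the template.

    The forward primer matches the start of the given strand; the reverse
    primer is the reverse complement of its end, so it binds that strand.
    """
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    at = sum(primer.count(b) for b in "AT")
    gc = sum(primer.count(b) for b in "GC")
    return 2 * at + 4 * gc


template = "ATGGCCAAATTTGGGCAT"  # arbitrary illustrative sequence
fwd, rev = design_primers(template, length=6)
print(fwd, wallace_tm(fwd))  # ATGGCC 20
print(rev, wallace_tm(rev))  # ATGCCC 20
```

In practice the annealing temperature is usually set a few degrees below the lower of the two primer melting temperatures.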

Extending: During this final step, the temperature is increased to 72 °C to enable the new DNA to be made by a special
Taq DNA polymerase enzyme, which adds DNA bases. Taq DNA polymerase is an enzyme taken from the heat-loving
bacterium Thermus aquaticus. This bacterium normally lives in hot springs, so it can tolerate temperatures
above 80 °C. The bacterium's DNA polymerase is very stable at high temperatures, which means it can withstand
the temperatures needed to break the strands of DNA apart in the denaturing stage of PCR. DNA polymerases
from most other organisms would not be able to withstand these high temperatures; for example,
human polymerase works ideally at 37 °C (body temperature). 72 °C is the optimum temperature for the Taq
polymerase to build the complementary strand. It attaches to the primer and then adds DNA bases to the single
strand one by one in the 5’ to 3’ direction.


How do we incorporate mutations


The result is a brand new strand of DNA and therefore a double-stranded molecule of DNA. The duration of this step
depends on the length of the DNA sequence being amplified, but it usually takes around one minute to copy 1,000 DNA
bases (1 kb).

These three processes of thermal cycling are repeated 20-40 times to produce lots of copies of the DNA
sequence of interest. The new fragments of DNA that are made during PCR also serve as templates to which the
DNA polymerase enzyme can attach and start making DNA. The result is a huge number of copies of the specific
DNA segment produced in a relatively short period of time.
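The step durations quoted in these notes can be combined into a rough estimate of how long the thermal cycling takes. This is a sketch under stated assumptions: the default values are picked from the ranges given above, and ramp times between temperatures, any initial denaturation and any final extension are ignored.

```python
def cycling_time_seconds(
    amplicon_bp: int,
    cycles: int = 30,
    denature_s: float = 30,
    anneal_s: float = 30,
    extend_s_per_kb: float = 60,
) -> float:
    """Approximate time spent in the repeated denature/anneal/extend steps."""
    extend_s = amplicon_bp / 1000 * extend_s_per_kb  # about 1 minute per 1 kb
    return cycles * (denature_s + anneal_s + extend_s)


minutes = cycling_time_seconds(1000) / 60
print(f"1 kb amplicon, 30 cycles: about {minutes:.0f} minutes")
```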


But The Numbers Quickly Add Up


Ø It is possible to just introduce random mutations into a gene (random mutagenesis) and see what happens

Ø Nature does just this (unintentionally!) and is the reason we have such great genetic diversity

Ø Enzymes are typically extremely selective, but most evolved from less specialised proteins that likely catalysed
a greater diversity of reactions and/or with a greater variety of substrates

Ø As molecular biologists, instead of waiting around for mutations to creep in naturally – we speed things up by
introducing mutations

Ø For a protein of 400 amino acids there are 20^400 possible variants in total!

Ø No. of mutants in which only 1 amino acid is exchanged for any of the other 19 = 400 x 19 = 7,600

Ø If 2 amino acids are exchanged = (400 x 399 / 2) x 19 x 19 = 28,807,800

Ø If you go a little further, you quickly get into the ‘million and billions’!!

Ø What you are trying to achieve can often determine what evolutionary approach you take
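The size of the sequence space can be checked with a short calculation. This sketch counts variants with exactly k substitutions as C(n, k) x 19^k, that is, choosing which positions change and then which of the 19 other amino acids each becomes; the function name is an illustrative assumption.

```python
from math import comb


def variants_with_k_substitutions(length: int, k: int, alphabet: int = 20) -> int:
    """Number of sequences differing from the parent at exactly k positions."""
    return comb(length, k) * (alphabet - 1) ** k


LENGTH = 400
for k in (1, 2, 3):
    print(k, "substitution(s):", f"{variants_with_k_substitutions(LENGTH, k):,}")

total = 20 ** LENGTH
print("all possible sequences: a number with", len(str(total)), "digits")
```

Even three simultaneous substitutions already give tens of billions of variants, which is why the choice of engineering approach and the screening capacity matter so much.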


Approaches to Engineering
Rational Semi-rational Random
(targeted gene libraries)

A number of factors will affect which engineering approach you take:

1) What are you trying to achieve?
• Studying mechanism
• Altering substrate scope
• Stability (pH, temp, solvent)

2) What information do you have about the protein?
• Mechanism
• Active site
• Crystal structure

3) How are you going to screen?

Figure 1. Strategies for the design or directed evolution of enzymes.

... library screening. Although simple in principle, directed evolution usually requires that considerable thought
be given to library construction and high-throughput methods of screening and/or selection. In many cases it
might not be possible to use evolutionary methods to improve an enzyme.

It should be stressed that rational and evolutionary approaches are not incompatible and that in recent decades
hybrid approaches have been commonly employed.[39] This is referred to as semi-rational design or targeted
mutagenesis and includes the use of site-saturation mutagenesis (SSM) or randomised mutagenesis over a portion
of the enzyme rather than the entire enzyme.[40–42]

2.1. Library generation

The overall process of introducing mutations and the subsequent selection of desirable traits has been well
established. Development of the polymerase chain reaction (PCR), a relatively simple laboratory technique that
is capable of creating significant genetic diversity, has revolutionised the field.[34] Routine methods for
library generation include error-prone PCR, site-saturation mutagenesis and DNA shuffling.[43–45] New techniques
are continually being developed, and specialised reviews can be consulted for further information.[41, 46–61]
The more common and well-tested techniques are described below.

PCR was originally designed to amplify lengths of DNA but has been adapted, in the form of error-prone PCR, to
serve as a tool for library generation. DNA polymerases usually have proofreading activity to ensure DNA is
replicated with high fidelity. However, as the name suggests, error-prone PCR is intended to introduce random
mutations during replication by reducing the fidelity of the DNA polymerase. One approach to reducing the
fidelity is to use a polymerase lacking the domain responsible for editing (e.g., Taq polymerase). However, the
absence of an editing domain does not give rise to a high error rate and additional agents must be considered.
Protocols established to increase the rate of nucleotide misincorporation further include simply varying the
nucleotide ratio, increasing the concentration of Mg2+ or the addition of Mn2+.[34, 62] Experimental conditions
should be adjusted to obtain a mutation rate appropriate for the screening. The probability of obtaining
desirable mutants increases with the size of the library. It should be noted that libraries generated with
epPCR are not without their limitations. For example, it is not possible to obtain all mutations. On average,
5.6 mutations can be ob-

ChemBioChem 2016, 17, 197–203, www.chembiochem.org, © 2016 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim

Approaches to Engineering

Rational

• You must have some information about the DNA sequence and/or protein structure

• A point mutation approach could be used to examine the involvement of a residue (or residues) in
binding, mechanism etc.

Semi-rational Approaches

Saturation mutagenesis (also known as cassette mutagenesis)

• One or more positions are selected and the residue at that/those positions is randomized
• Smaller sequence space (not as many variants)
• Libraries tend to be of higher quality

Iterative saturation mutagenesis

Same principle, except that the variants from the first round are subjected to further rounds of saturation
mutagenesis.

Example:
1) take a wild-type sequence
2) select, say, four residues to be randomized (A, B, C, D)
3) form four libraries from this parent sequence
4) select the best hits from these libraries and subject them to further rounds of saturation mutagenesis

CASTing involves focusing on the catalytic active center. The Cartesian space within a radius of approximately
10 Å is partitioned into defined regions (sites) to be actually randomized by saturation mutagenesis.

Yellow/green residue represents catalytic amino acid

Figure 1 | Schematic illustration of iterative saturation mutagenesis involving (as an example) four
randomization sites A, B, C and D: confined protein sequence space for evolutionary enzyme optimization
(redundancy in some cases is expected). There is no rule regarding the number of cycles. Exploration of protein
sequence space is continued until the desired degree of improvement has been achieved.

Figure 4 | Generalization of CASTing. Scheme of an enzyme minimized to a spherical catalyst of 10 Å (for
illustrative purposes Lip A is shown). The pdb file 1ISP is used as an explanatory model. In (a) the residue in
yellow (CPK) represents the catalytic amino acid. Accelrys Discovery Studio Visualizer 1.6 and other protein
viewers permit the selection of the residues located in a defined radius (10 Å in this case) around a specific
amino acid residue or atom. To create this kind of representation, select the desired amino acid and then under
the Menu ‘Edit’, click ‘Select’, define ‘Type’ ‘Amino acid’ and ‘Radius’ in Å. The residues appear in yellow as
shown in (b), and this selection can be saved as a ‘group’ in the structure file as described in the text. In
this way it is very easy to transform the protein formally into a small catalyst of defined radius as shown in
(c). In (d) a general diagram of CAST is shown displaying (arbitrarily for illustrative purposes) five different
regions A–E, each one harboring, e.g., 2 or 3 aa.

© 2007 Nature Publishing Group http://www.nature.com/natureprotocols

The systematic, iterative nature of the strategy illustrated in Figure 1 is unlike an alternative approach in
which the mutations obtained in one initial library are simply combined with those of a hit in another new
library,[24] a process that allows only new mutants but not new mutations. It is also different from the
conventional approach based on DNA shuffling of the initial hits.[1,2] In contrast, each new cycle of ISM
maximizes the probability of obtaining additive and/or cooperative effects of newly introduced mutations in a
defined region of the fitness landscape in protein sequence space. We have demonstrated several times the
enormous benefits of conducting the search in protein sequence space when applying ISM, specifically in
enhancing enantioselectivity[22] and thermostability[23] of enzymes (see below). It is clear that such a process
of exerting evolutionary pressure is very different from the multiple cycles of epPCR most often used in
directed evolution studies to date. The latter addresses the whole enzyme anew in each cycle, which means that
all its regions are considered over again even though only a few positions may be important for inducing
positive responses rapidly. For this reason, improved mutants accessible by ISM are not likely to be found by
repeated rounds of epPCR, simply on statistical grounds. Likewise, DNA shuffling cannot be expected to generate
the kind of hits that arise from ISM because sequences are recombined. The focused character of our method,
based on structural information, enables fast convergence of beneficial mutations, as shown in Figure 2. A
forerunner of this concept is combinational cassette mutagenesis with simultaneous randomization at two amino
acid positions identified beforehand by epPCR and saturation mutagenesis.[18] It is also interesting to note
that in a previous study of selective binding of zinc finger proteins to DNA as analyzed by phage display, each
zinc finger domain was randomized and assembled individually (walking across a target).[25] Although this does
...

A variation of the scheme illustrated in Figure 1 involves the stipulation that in a given pathway each site is
considered only once. In the case of four sites, this would mean convergence after four generations of
saturation mutagenesis, and a total of 64 libraries if all pathways were to be explored experimentally. In
practice this is not necessary,[22] but it is of theoretical interest and thus needs to be considered.

... in organic chemistry, polymer technology and pollution clean-up, as components in detergents, as diagnostic
tools, as bio-nanotechnological devices and sometimes even as therapeutic drugs. For this reason, considerable
research has been conducted in the quest to enhance thermostability by a variety of techniques, including
directed evolution.[9,27–30] We speculated that ISM (Fig. 1) could be particularly effective.[23] The primary
challenge was to find a criterion that would allow a decision regarding the optimal choice of the sites in the
enzyme appropriate for saturation mutagenesis. On the basis of the well-known fact that hyperthermophilic
enzymes are more rigid than mesophilic analogs,[31–33] it appeared reasonable to introduce appropriate mutations
at sites displaying high degrees of flexibility. To identify such sites with some certainty, we turned to atomic
displacement parameters obtained from X-ray data, namely the B-factors (or B-values).[34–36] These reflect
smearing of atomic electron densities with respect to their equilibrium positions as a result of thermal motion
and positional disorder. Therefore, in the method that we call the B-factor iterative test (B-FIT), only those
amino acids in a protein that display the highest B-factors are targeted. After screening of the corresponding
initial mutant libraries prepared by saturation mutagenesis at positions A, B, C, D etc., the gene of the best
hit is used as the template for saturation mutagenesis at the site from which the second-best hit in the initial
mutagenesis experiments originates. The process of ISM is then continued in a hierarchical manner until all
sites have been ‘visited’. In principle, a given site can be considered more than once ...

Conclusions and perspectives

Over the past 15 years, traditional strategies of directed evolution based on classical methods such as
repeating cycles of epPCR and DNA shuffling have been applied successfully in engineering numerous
enzymes.[1–6] Recently, a number of groups have developed further molecular biological methods for gene
mutagenesis as well as new strategies for scanning protein sequence space in the hope of increasing
efficiency.[1–6,11,12,17–21] Industrial applications of directed evolution, in particular, require rapid
procedures. Our contribution to accelerated directed evolution is ISM,[22,23] which is based on a Cartesian view
of the 3D protein structure in which several predefined regions are considered as being crucial for improving a
given catalytic property. With the systematic restricting of saturation mutagenesis solely to ‘hot sites’, as
suggested by rational considerations resulting from structural information, only defined parts of an enzyme are
considered. The iterative process then allows for high evolutionary pressure in confined regions of protein
sequence space, which increases the probability of success while reducing cost, time and human effort.

The criteria for choosing the appropriate sites for randomization depend on the nature of the catalytic property
that is to be improved. In the case of thermostability,[23] we suggest the use of B-factors as a basis for
decision making (the B-FIT method). In the case of substrate acceptance and/or enantioselectivity,[22] the focus
is on sites around the complete binding pocket in a systematic manner (CASTing). Another possibility, yet to be
tested experimentally, is to perform an initial round of traditional epPCR and then to choose several hot spots
identified thereby as sites for subsequent ISM. In addition to such a possibility, it is conceivable that the
highly flexible sites identified on the basis of B-factors can be used in ISM in the quest to improve catalytic
properties other than thermostability (e.g., enantioselectivity), provided the screening system is designed
accordingly (‘you get what you screen for’).[1] Of course, such B-factor sites can also be considered together
with the sites defined by CASTing. Whatever criterion is chosen, if a wrong decision is made, the mistake can be
corrected because the site can be eliminated from further consideration after the initial mutagenesis round.

A final aspect to be considered when applying ISM is the codon usage. This important issue has been discussed in
a recent report emphasizing library mutagenesis methods that are complementary to those treated here.[44] In all
of our examples regarding ISM published so far,[22,23,45] we have used NNK degeneracy (N: Ade/Cyt/Gua/Thy; K:
Gua/Thy). This involves 32 codons and all the 20 proteinogenic aa as building blocks. On the basis of
statistical analyses,[46,47] we have calculated the number of clones that should be screened for 95% coverage in
the case of randomization at two and three positions, respectively (Table 1). Of course, full coverage is not
necessarily mandatory, and in fact in our studies published so far it was not sought.[22,23] Nevertheless, the
higher the coverage, the greater the probability of finding improved variants. Therefore, to reduce the
molecular biological work and the screening effort, we suggest that it may well be useful to consider other
degeneracies when applying ISM (Table 1). For example, NDT degeneracy (N: Ade/Cyt/Gua/Thy; D: Ade/Gua/Thy; T:
Thy) involves only 12 codons and 12 aa (Phe, Leu, Ile, Val, Tyr, His, Asn, Asp, Cys, Arg, Ser and Gly). The
screening effort when randomizing two or three positions then reduces to 430 and 5,175 clones, respectively,
which constitutes a drastic reduction in experimental work (Table 1). The computer program CASTER, available
from our website, contains calculations of the type shown in Table 1 as well as other aids useful in designing
saturation mutagenesis libraries.

The B-FIT as applied to the thermostabilization of B. subtilis Lip A is featured here to illustrate ISM.
not involve catalysis, ittobears
thesome
thermostabilization
relationship of B. subtilis Lip
iagram of CAST is shown displaying the sites defined by CASTing. Whatever criterion is chosen, ifto aour strategy.
A is featured here
Relevant, too, to illustrate
is a recent study of theISM.
engineering of
oups have devel-
TABLE 1 | Statistical analysis a
of codon usage d. an epoxide hydrolase for enhancing aerobic mineralization of cis-
different regions A–E, each one
c B 1,2-dichloroethylene by several rounds of saturation mutagenesis
ne mutagenesis as wrong decision is made, the mistake can be corrected because the
There is also theanalysis
op4on to make A reduced codon libraries
YYY Wild type
and co-expression of a DNA-shuffled toluene o-mono-oxygenase26.
enceDegenerate
space in the site No.
can | of
1 be eliminated from No. of usage
further 10consideration
a. No. of
after the initialAmino acids 95% coverage 95% coverage
YY

TABLE Statistical of codon Å C


YY

b
Figure 2 | Schematic illustration of iterative b mutagenesis as a
saturation
codon
al applications of codonsround.No. ofamino acidsENo. ofYY
mutagenesis
Degenerate stopsNo. of Enhancing encoded
thermostability by ISM
Amino acids 95% (2coverage
positions) 95%
strategy for coverage
the rapid (3 positions)
convergence of beneficial effects (red pathway upward)
Y

strategies of directed evolution To illustrate ISM, the thermostabilization of Bacillus subtilis lipase of mutations
Y

arising from amino acid exchanges in defined parts of the


procedures.
NNK cycles Our
repeating of epPCR and A final32aspect to be
codon considered
codons 20 whenamino applying
acids
D ISM is the
1 stops codon
A (Lip A) isAll 20 in detail in this protocol(2
encoded
described 3066
positions)
23. Thermostabilityb (3 positions)
protein, leaving
b
98163
the majority of mostly non-relevant protein sequence space
SM 22,23 , which is usage. This important issue has been discussed in a recent of proteins
report is an important issue when they are applied as catalysts unconsidered (inferior mutants not shown).
d successfully
NDT in engineering NNK 12 32 12 20 0 1 All 20
RNDCGHILFSYV 3066 430 98163 5175
number ofingroups
ructure which have devel-
emphasizing
NDT library mutagenesis
12 methods 12 that are complementary 0 RNDCGHILFSYV 430 5175
methods DBK for gene mutagenesisDBK as wrong 18decision is made, the 12
mistake
44. In all of our can be corrected 0
because the 892 | VOL.2 ARCGILMFSTWV
NO.4 | 2007 | NATURE PROTOCOLS 969 17470
being crucial
sequence for space in to
the those here18 12 examples regarding 0 ISM ARCGILMFSTWV 969 17470
NRT
protein
NRT 8treated
site can be eliminated from further
22,23,458, we have 8 consideration
8
after the0initial RNDCGHSY 190 190 1532
stematic restrict-
,17–21. Industrial applicationspublished
of mutagenesis so farround. used NNK degeneracy0 (N: Ade/ RNDCGHSY 1532
aSee CASTER worksheet. bNumberaSee CASTER worksheet. bNumber of clones to be screened for 95% coverage (over-sampling) when two or three amino acid positions at a given site are randomized using a specific degenerate codon.
of clones to be screened for 95% coverage (over-sampling) when two or three amino acid positions at a given site are randomized using a specific degenerate codon.
’require rapid procedures.
, as suggested by Our Cyt/Gua/Thy; A final aspect to be considered
K: Gua/Thy). Thiswhen applying32
involves ISMcodons
is the codon
and all the 20
d evolution is ISM22,23, which is usage. This important issue has been discussed in a recent report
nformation, onlyin whichproteinogenic aa as mutagenesis
building methods blocks.that On the basis of statistical
3D protein structure
iterative
894 as
nsidered | process
being crucial
VOL.2
24|894
NO.4 analyses
for2007
emphasizing
| VOL.2
to |
46,47
NATURE
those
library
NO.4 | 2007 | 44NATURE PROTOCOLS
, treated
we have herecalculated
PROTOCOLS . In all of the
are complementary
numberregarding
our examples of clones ISMthat should
ty. With the systematic 22,23,45, we have used NNK degeneracy (N: Ade/
nfined regions of restrict- published for
be screened so far95% coverage in the case of randomization at
y to ‘hot sites’, as suggested by Cyt/Gua/Thy; K: Gua/Thy). This involves 32 codons and all the 20
ability of success two and three positions, respectively (Table 1). Of course, full
om structural information, only proteinogenic aa as building blocks. On the basis of statistical
nsidered. The iterative process coverage is 46,47
analyses not, we necessarily
have calculated mandatory,
the number of and
clonesinthat
factshould
in our studies
orpressure
randomization of be screened
in confined regionspublished so farforit was95% notcoverage 22,23
in the
sought case of randomizationthe
. Nevertheless, at higher the
eases the probability of success two and three positions, respectively (Table 1). Of course, full
ty that is to be coverage, the greater the probability of finding improved variants.
an effort.
uggest theforuse
ropriate sites of Therefore,
randomization
coverage is not necessarily mandatory, and in fact in our studies
publishedto so
reduce
far it wasthenotmolecular biological the
sought22,23. Nevertheless, work and
higher the the screen-
12
Ttalytic
method).
property Inthat
the is toing be effort, wethesuggest
coverage, greater thethat it mayofwell
probability findingbeimproved
useful to consider other
variants.
vity 2223
ability , we suggest the use of Therefore, to reduce the molecular biological work and the screen-
, the focus is degeneracies when applying ISM (Table 1). For example, NDT
king (the B-FIT method). In the ing effort, we suggest that it may well be useful to consider other
in a systematic
enantioselectivity degeneracy
22, the focus is degeneracies (N:when Ade/Cyt/Gua/Thy;
applying ISM (Table 1). D: ForAde/Gua/Thy;
example, NDT T: Thy)
be
nding tested
pocketexperi- involves
in a systematic only 12
degeneracy (N:codons and 12 aa
Ade/Cyt/Gua/Thy; (Phe, Leu, Ile,
D: Ade/Gua/Thy; Val, Tyr, His,
T: Thy)
ibility, yet to be tested experi- involves only 12 codons and 12 aa (Phe, Leu, Ile, Val, Tyr, His,
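
The clone numbers in Table 1 can be reproduced with a short calculation. The sketch below assumes that every codon combination in the library is equally likely and uses the usual oversampling estimate T = ln(1 − P) / ln(1 − 1/V), where V is the number of codon combinations and P the desired coverage. The excerpt does not state which formula was used, so this is an assumption, although it returns the tabulated values. The function name `clones_for_coverage` is hypothetical.

```python
import math

# Number of codons covered by each degenerate codon (from Table 1).
CODONS = {"NNK": 32, "NDT": 12, "DBK": 18, "NRT": 8}


def clones_for_coverage(n_codons: int, positions: int, coverage: float = 0.95) -> int:
    """Estimate how many clones to screen to sample a library to `coverage`.

    Assumes all codon combinations are equally probable (an idealisation).
    """
    variants = n_codons ** positions  # library size at the DNA level
    clones = math.log(1.0 - coverage) / math.log(1.0 - 1.0 / variants)
    return round(clones)


if __name__ == "__main__":
    for name, n in CODONS.items():
        print(name, clones_for_coverage(n, 2), clones_for_coverage(n, 3))
```

The calculation makes the trade-off explicit: moving from NNK to NDT at three positions cuts the estimated screening effort roughly twenty-fold, at the price of sampling only 12 of the 20 amino acids.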

can be used for further testing.

Figure 3.

Gene Shuffling

Gene shuffling involves random shuffling of related genes in order to explore more sequence space than could be achieved when doing more traditional mutagenesis.

Similarly, with DNase 1, several members of the gene family are fragmented, and then PCR is run. During PCR, different members of the family are cross-primed. For example, homologous DNA fragments will anneal to each other. The hybrids formed from this process are then used to generate a library of mutants that are tested for unique properties.

Figure 4. Process of digestion and priming in PCR.

[Figure 1, Meyer et al.: sequence space explored by a random mutagenesis library (left) and by a gene-shuffled library (right)]

Gene Shuffling
A number of homologous genes are digested with DNase, creating numerous fragments. PCR is performed
without additional primers. The result is a library of recombined genes.
Curr Protoc Mol Biol. Author manuscript; available in PMC 2015 January 06.

The sequence space of all possible proteins is depicted as a 2-dimensional plane. The x- and y-axis represent
genotypic distance, such that neighbouring points have a similar genotype and distant points are more dissimilar.
Left) A starting parental sequence (black dot) is randomly mutagenized, resulting in a typical random mutagenesis
library (grey dots). This library explores the sequence space near the parental sequence, but does not contain the
sequence exhibiting a new function (star). Right) Multiple, divergent parental sequences (black dots) are
recombined, resulting in a gene-shuffled library (grey dots). This library explores the larger, intervening area, and
does contain the sequence exhibiting a new function (star). Thus, the latent evolutionary potential of a gene family
can be tapped to find new functionality.
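
The idea of recombining related parents into chimeras can be illustrated with a toy model. The sketch below is not a simulation of the real procedure (which relies on DNase fragmentation and homology-driven annealing during PCR); it simply assumes pre-aligned parents of equal length and copies each block between randomly chosen crossover points from a randomly chosen parent. The function name `shuffle_genes` and the example sequences are hypothetical.

```python
import random


def shuffle_genes(parents: list[str], n_crossovers: int, rng: random.Random) -> str:
    """Build one chimeric sequence from aligned parents of equal length."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model: parents must be pre-aligned and equal in length")
    # Pick distinct crossover positions, then copy each block from a random parent.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0, *cuts, length]
    blocks = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        blocks.append(donor[start:end])
    return "".join(blocks)


if __name__ == "__main__":
    rng = random.Random(0)
    parents = ["AAAAAAAAAAAA", "CCCCCCCCCCCC", "GGGGGGGGGGGG"]
    for _ in range(5):
        print(shuffle_genes(parents, n_crossovers=3, rng=rng))
```

Each chimera lies "between" the parents in sequence space, which is the region a point-mutagenesis library built around a single parent would not reach.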


An overview of directed evolution

Frances H. Arnold shared the Nobel Prize in Chemistry in 2018 “for the directed evolution of enzymes”

[Diagram labels: Parent gene (= parent protein); A few mutations; Library of evolved genes; Library of evolved proteins/enzymes; 1000's of colonies]

C. Zeymer, D. Hilvert, Annu. Rev. Biochem. 2018, 87, 131


J. L. Porter, R. A. Rusli, D. L. Ollis, ChemBioChem 2016, 17, 197
M. T. Reetz in Directed Evolution of Selective Enzymes, Wiley-VCH Verlag GmbH & Co. KGaA, 2016.


An overview of directed evolution


[Figure: the directed evolution cycle, from the gene of interest to the evolved gene product.
1 Gene diversification: error-prone PCR, cassette mutagenesis, DNA shuffling, mutator strains, recombination
2 Gene expression: in vivo, in vitro
3 Selection or screening: growth of survivors (complementation, activation of transcription, detoxification); phage display, mRNA display, ribosome display, PACE; absorbance, fluorescence, HPLC, mass spectrometry, NMR; medium throughput (liquid cell culture in microtiter plates, colonies on solid medium); high throughput (cell surface display and FACS, IVC and microfluidic-based screening, µSCALE)
4 Gene amplification: replication, PCR]

Annu. Rev. Biochem. 2018, 87, 131-157.

Figure 1
General strategy for directed evolution and selected experimental methods. Protein catalysts are optimized using iterative cycles of
gene diversification by mutagenesis (1), gene expression (2), screening or selection for improved variants (3), and subsequent gene
amplification (4). Abbreviations: FACS, fluorescence-activated cell sorting; HPLC, high-performance liquid chromatography; IVC, in
vitro compartmentalization; NMR, nuclear magnetic resonance; µSCALE, microcapillary single-cell analysis and laser extraction;
PACE, phage-assisted continuous evolution; PCR, polymerase chain reaction.
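
The iterative cycle in Figure 1 can be sketched as a simple loop. This is a toy illustration only: sequences are strings, diversification is a single random substitution, and the "screen" counts positions that match a hypothetical target sequence, which stands in for a real activity assay. All names (`mutate`, `screen`, `evolve`) and the example sequences are assumptions made for the sketch.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq: str, rng: random.Random) -> str:
    """Step 1, diversification: introduce one random point substitution."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def screen(seq: str, target: str) -> int:
    """Step 3, screening: toy 'activity' = positions matching a hypothetical target."""
    return sum(a == b for a, b in zip(seq, target))


def evolve(parent: str, target: str, rounds: int, library_size: int, seed: int = 0) -> str:
    """Run several rounds of diversify -> screen -> carry forward the best variant."""
    rng = random.Random(seed)
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: screen(s, target))
        # Step 4: the best variant becomes the next parent if it is no worse.
        if screen(best, target) >= screen(parent, target):
            parent = best
    return parent


if __name__ == "__main__":
    start, target = "AAAAAAAA", "MKTAYIAK"
    final = evolve(start, target, rounds=20, library_size=50)
    print(final, screen(final, target))
```

Gene expression (step 2) has no counterpart in the toy model because the sequence is scored directly; in the laboratory it is the step that links each gene to the protein being screened.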

automated using modern robotics systems, thus allowing more efficient laboratory evolution. For
a more comprehensive description of relevant methodology, an excellent review by Packer & Liu
(2) may be consulted.
2.1. Generating Genetic Diversity
Exhaustive sampling of sequence space is impossible. A library of fully randomized 40-residue
polypeptides, built from the 20 common proteinogenic amino acids and containing only a single
molecule of each possible sequence, would exceed the mass of Earth by several orders of magnitude.
Typical enzymes are even larger, usually containing several hundred amino acids.
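
The claim about the 40-residue library can be checked with a back-of-the-envelope calculation. The sketch below assumes an average residue mass of about 110 Da and an Earth mass of about 5.97 × 10^24 kg; neither value comes from the notes, and the function name `library_mass_kg` is hypothetical.

```python
import math

AVOGADRO = 6.022e23          # molecules per mole
EARTH_MASS_KG = 5.97e24      # approximate mass of Earth (assumed value)
AVG_RESIDUE_MASS_DA = 110.0  # assumed average mass of one amino acid residue


def library_mass_kg(length: int, alphabet: int = 20) -> float:
    """Mass of a library holding one molecule of every possible sequence."""
    n_sequences = alphabet ** length
    grams_per_molecule = length * AVG_RESIDUE_MASS_DA / AVOGADRO
    return n_sequences * grams_per_molecule / 1000.0


if __name__ == "__main__":
    mass = library_mass_kg(40)
    ratio = mass / EARTH_MASS_KG
    print(f"library mass: {mass:.1e} kg; log10(library / Earth) = {math.log10(ratio):.1f}")
```

Under these assumptions the library outweighs Earth by several orders of magnitude, in line with the statement above.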

Enzyme engineering requires an efficient screen


[Diagram labels: Parent gene (= parent protein); A few mutations; Library of evolved genes; Library of evolved proteins/enzymes; 1000's of colonies]

You are trying to link genotype to phenotype

Genotype-phenotype linkage

With proteins, including enzymes, the ‘phenotype’ is the binding ability or reactivity (the properties)


Enzyme engineering requires an efficient screen


You are trying to link genotype to phenotype
[Diagram labels: Parent gene (= parent protein); A few mutations; Library of evolved genes; Library of evolved proteins/enzymes; 1000's of colonies]

What makes a good high-throughput screen (HTS)?

✓ General
✓ Quantitative
✓ Operationally simple
✓ Whole-cells (or equivalent), droplets etc.
✓ Sensitive (UV, fluorescent)

Enzyme engineering requires an efficient screen


Transaminases
Transaminases can convert aldehydes and ketones to the corresponding (chiral) amine

Ø The prevalence of chiral amines in bioactive natural products and drug molecules means that these enzymes have
become extremely important in industry

Ø As the substrate scope of the wild-type enzymes is often limited, these enzymes are frequently engineered

Ø If we look at the overall transformation, it is clear why an assay that simply relies on the detection of an amine
product may not be suitable
[Scheme: the transaminase transfers the amino group from an amine donor to a ketone (R, R1), giving the amine product and a ketone coproduct. The cofactor cycles between pyridoxal 5'-phosphate (PLP), whose enzyme-bound form is linked to a Lys residue, and pyridoxamine 5'-phosphate (PMP).]

Enzyme engineering requires an efficient screen


Transaminases

Transaminases are PLP-dependent enzymes that catalyse the conversion of aldehydes and
pro-chiral ketones to the corresponding (chiral) primary amines. The enzymes require two substrates, an amine
donor and an amine acceptor, and the transformation consists of two half reactions. The PLP coenzyme
shuttles the amino group from the amine donor to the acceptor substrate to form the product.
The reaction is freely reversible, and the position of equilibrium depends on the nature of
the amine and ketone substrates.
Finding a suitable assay for transaminase enzymes can be challenging because an amine and carbonyl are
both consumed and produced during the reaction. This means that an assay that simply detects the formation
of an amine product will typically not provide any useful information because there is an amine substrate in the
reaction anyway. Likewise, a screen that detects the formation of a ketone product (if you run the reaction in
the reverse direction) may also not work well.


Enzyme engineering requires an efficient screen


Transaminases

Assays are rarely perfect and typically involve some compromise

One of the most widely used TA assays is shown below and involves the detection of acetophenone when
α-methylbenzylamine is used as the amine donor

Acetophenone assay
[Scheme: a TA library converts a ketone substrate and the amine donor α-methylbenzylamine into the corresponding amine and the co-product acetophenone, which is detected by UV at 245 nm]

Ø Not detecting the product itself but rather the co-product

Ø Why might this be useful?
✓ Simple; cheap; broadly accepted

X only works for low absorbing ketones; high enzyme loadings interfere with absorbance; low-medium throughput;
ideal if used with robotics platform (but most groups don't have these)
https://youtu.be/PmgcektmndE
S. Schätzle, M. Höhne, E. Redestad, K. Robins, U. T. Bornscheuer, Anal. Chem. 2009, 81, 8244
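
Reading out this assay amounts to converting an absorbance increase at 245 nm into a concentration of acetophenone with the Beer-Lambert law, A = ε · c · l. The sketch below shows that conversion. The extinction coefficient is not given in these notes, so it is left as a required argument; the value of 12 mM⁻¹ cm⁻¹ used in the example, the path length and the absorbance change are assumed illustrative numbers that should be checked against the cited paper and the instrument in use. The function name `acetophenone_mM` is hypothetical.

```python
def acetophenone_mM(delta_a245: float, epsilon_mM_cm: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration (mM) = dA / (epsilon * path length)."""
    if epsilon_mM_cm <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return delta_a245 / (epsilon_mM_cm * path_cm)


if __name__ == "__main__":
    # Assumed illustrative numbers, not taken from the lecture notes.
    conc = acetophenone_mM(delta_a245=0.6, epsilon_mM_cm=12.0, path_cm=1.0)
    print(f"{conc:.3f} mM acetophenone formed")
```

The same relation explains the drawbacks listed on the slide: any ketone substrate or protein that also absorbs near 245 nm adds to A and masks the signal from the co-product.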

Enzyme engineering requires an efficient screen


Transaminases
o-xylylenediamine assay

[Image labels: Control; Assay]

✓ Amine is commercially available; sensitive; liquid/solid-phase; displaces equilibrium effectively

X not universally accepted; not quantitative

A. P. Green, N. J. Turner, E. O'Reilly, Angew. Chem. Int. Ed. 2014, 53, 10714

Enzyme engineering requires an efficient screen


Transaminases
o-xylylenediamine assay

Using o-xylylenediamine as the donor leads to the formation of an imine co-product, which tautomerizes to
isoindole. Isoindole is very unstable and rapidly polymerizes, affording a dark precipitate. This can be exploited
for a medium-throughput colorimetric assay to detect TA activity.
The assay is very sensitive but is qualitative, not quantitative. This is because the dark polymer that forms
precipitates out of solution, and so the conversion cannot be measured by UV, for example. However, it does
give useful information. It tells you whether a transaminase reaction is occurring or not! Useful leads can then be
followed up by HPLC, for example.

Enzyme engineering requires an efficient screen

Transaminases

[Excerpt from the embedded article page; […] marks text lost in extraction.]

Fig. 1 2-(4-Nitrophenyl)ethan-1-amine 1 based transaminase screening. The assay was performed in triplicate. 1 (25 mM), amino acceptor (10 mM), PLP (1 mM), KPi buffer pH 7.5 (100 mM) and enzyme as crude lysate, 18 h, 30 °C, 200 rpm. (A) CV-TAm; (B) Pp-TAm; (C) Kp-TAm; (D) ArRMut11; (E) no enzyme.

[…] with most enzymes in bioconversions (A–D: 1–6). CV-TAm, for example, readily accepted amine acceptors 2, 7, 8 and 11 (A2, A4–A6) resulting in intensely red coloured solutions. However, with amine acceptors 9 and 10 substantially less coloration was observed (A1, A3) indicating only moderate conversion of these substrates. Bioconversions with Pp-TAm and ArRMut11 gave similar results, however, with less intense coloration (B1–B6, D1–D6) in particular with amine acceptor 11 (B5, D5) compared to the CV-TAm reactions. Amongst all enzymes tested, Kp-TAm showed only moderate acceptance of the different aldehydes and ketones under the reaction conditions used, as indicated by the slight coloration with amine acceptors 2 (C2), 7 (C4) and 10 (C6) and no colour change with 9 (C1), 10 (C3) and 11 (C5). To confirm the reliability of this assay, the conversion of the acceptors into the corresponding amines was determined by HPLC analysis and a good correlation was observed (see ESI†). For example, bioconversions with CV-TAm, Pp-TAm and ArRMut11 resulted in low but detectable levels of colour change with amine acceptors 9 (A1, B1, D1) and 10 (A3, B3, D3), which proceeded with moderate conversions of 1–4%. Combined, these results clearly demonstrated that the colorimetric assay developed offers a simple, rapid and sensitive HT platform for the evaluation and substrate profiling of large enzyme libraries. For lower conversions and lower substrate concentrations, these can be determined quantitatively.

Since the 4-nitroaryl electron withdrawing group (EWG) in 1 will enhance the tendency for it to form an enamine, other commercially available amine donors possessing EWGs were investigated to establish the wider generality of the assay: 4-(2-aminoethyl)benzonitrile hydrochloride 12, and a cyclic analogue of 1, 5-nitro-2,3-dihydro-1H-inden-2-amine hydrochloride 13, were used. Computational docking […] with PLP intermediates, into the ac[…] structure (PDB ID: 4AH3) using Auto[…] both analogues can readily gain access to the active site with close proximity to PLP (ESI† Fig. S1). Assays were performed with the amine acceptor 7, which led to the formation of a yellow and brownish coloured precipitate, with 12 and 13 respectively (Fig. 2). Butylamine formation was confirmed by HPLC analysis showing that reactions proceeded with similar conversions to those observed with amine donor 1, 74% (with 12) and 65% (with 13). Due to the formation of a yellow precipitate, which is the same colour as PLP, compound 12 is less suited as an amino donor. However, amine donor 13 can also be used for the screening of TAms, and as a cyclic donor may be useful to identify TAms that are able to accept cyclic substrates.

Apart from the application in multi-well plates, a colony-based colorimetric assay to provide a HT method that is amendable for rapid screening of TAm variant libraries was also developed. In a control reaction with wild type E. coli BL21 (DE3) incubated with 1 (12.5 mM) and 2 (5 mM), background conversion by the host intrinsic enzymes was excluded as it showed no coloration (Fig. 3A). However, the conversion of 1 (12.5 mM) and 2 (5 mM) with recombinant E. coli BL21 (DE3) containing CV-TAm resulted in the formation of intensely red coloured colonies (Fig. 3B). In contrast, control reactions without amine acceptor 2 led to the formation of faintly orange colonies (Fig. 3C) due to background conversions with residual intracellular acceptors such as pyruvate. However, a clearly visible difference in colour intensity was observed. Compared to previously published solid-phase TAm screening methods (refs 12, 15) this assay used a single amine donor to identify TAm activity and moreover differentiates between enzyme activity with a target substrate and residual activity with intracellular acceptors.

Fig. 2 Assay coloration when using amino donors 1, 12 and 13 with CV-TAm and acceptor 7.

Fig. 3 Colony-based TAm screening assay using amine donor 1 (12.5 mM) and acceptor benzaldehyde 2 (5 mM) at 30 °C for 30 min. The assay was performed in triplicate. Control assay with WT E. coli BL21 (DE3) with 2 (A). Assays using E. coli BL21 (DE3) containing CV-TAm with 2 (B) and without 2 (C).

Open Access Article. Published on 24 September 2015. This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. This journal is © The Royal Society of Chemistry 2015.

D. Baud et al. Chem. Commun., 2015, 51, 17225-17228

A Recent Example

Transaminases
o-xylylenediamine assay

A wild-type TA, named HEWT, has poor activity towards substituted acetophenones

Paradisi and co-workers generated a completely random mutant library and screened using the diamine

[Excerpt from the embedded article page; […] marks text lost in extraction.]

[…] a significant reduction in background (Fig. S4†), (S)-1-phenylethylamine appeared to be most effective, possibly due to better diffusion of this substrate. However, this demonstrates that other amines can also be used to quench endogenous pyruvate, for example with aminotransferases that do not accept 1-phenylethylamine as a substrate.

Evolution towards substituted acetophenones

To test the methodology, HEWT was subjected to directed evolution using error-prone PCR to generate a completely randomised mutant library with a high number of mutations. The enzyme has minimal activity towards the aromatic substrate para-nitroacetophenone (1a) and this was selected as the amino acceptor for the initial screening, using ortho-xylylenediamine as the amino donor. DMSO (10% (v/v)) was required for substrate solubility and enhanced cell-permeability. The discrimination between wild-type and an improved variant relies on the rapidity with which the enzyme converts the substrate in vivo. Screening of ca. 15 000 colonies provided two variants with enhanced catalytic activity towards para-nitroacetophenone. The two new variants (referred to as A1 and B1) were expressed and purified for further characterisation. They exhibited a 2-fold increase in turnover frequency with respect to the wild-type, achieved higher conversion after 2 hours, and maintained excellent enantioselectivity (Table 1). Both variants were subjected to a second round of error-prone PCR and the libraries screened as before with 1a. Interestingly, in this case the positive colonies were isolated after a significantly shorter incubation period (2–15 minutes compared to 25 minutes for the parental colony). While the A1 template did not provide any significant hits, an improved variant was isolated from the B1 library. The new mutant, identified as B2, afforded comparable molar conversion after 2 hours and a further 1.5-fold increase in turn-over frequency (Table 1). The e.e. measured at 24 h was in all cases >99% (though the overall yields were low due to experimental conditions needed to assess the kinetic behaviour of the mutants). A third round of mutagenesis, using the B2 variant as a template, initially failed to identify better variants, as the colour change on solid screening was too rapid and did not allow discrimination between parental activity and mutants. Reducing the concentration of para-nitroacetophenone and ortho-xylylenediamine from 10 mM to 1 mM, with just 5% (v/v) DMSO prolonged the screening window to ca. 1 h. An additional variant, B3, was isolated from the screening and while, upon purification, it displayed comparable reaction velocity and 2 h conversions to B2, it had 2-fold higher expression levels which justifies a more rapid colour development with respect to the parental variant. Each variant was also fully characterised in terms of activity and stability at different temperatures and pHs. In all the cases, the mutants generated did not show significant alterations when compared to the WT enzyme (Fig. S5–S8†).

A second substrate, para-cyanoacetophenone (2a), was also investigated. HEWT wild-type has negligible activity towards this molecule, even lower than with the nitro substitution, with a turnover frequency of 4 × 10⁻³ s⁻¹. The B2 variant showed an impressive 60-fold increase in turn-over frequency (229 × 10⁻³ s⁻¹), and enhanced stereoselectivity, affording >99% of the (S)-enantiomer (Table 1). Again, B3 showed virtually identical catalytic properties to B2.

Table 1 HEWT evolution towards substituted acetophenones. Turnover frequencies (TOF) measured at 10 mM carbonyl concentration, 2 hour molar conversions (m.c.), and enantiomeric excess (e.e.) for wild-type HEWT and isolated variants towards para-nitro- (1a) and para-cyanoacetophenone (2a) (a)

Substrate  Variant  TOF (10⁻³ s⁻¹)  m.c. (%)  e.e. (%)
1a         WT       36 ± 6          4         >99 (S)
1a         A1       66 ± 9          6         >99 (S)
1a         B1       65 ± 26         6         >99 (S)
1a         B2       92 ± 5          7         >99 (S)
1a         B3       84 ± 14         7         >99 (S)
2a         WT       4 ± 1           9         87 (S)
2a         A1       50 ± 7          11        94 (S)
2a         B1       127 ± 5         12        >99 (S)
2a         B2       229 ± 7         11        >99 (S)
2a         B3       246 ± 7         12        >99 (S)

(a) Biotransformation reactions were performed with 10 mM ketones, 500 mM L-alanine, 0.1 mM PLP, 0.1 mg mL⁻¹ (1.9 µM) enzyme in 50 mM phosphate buffer pH 8 and 10% (v/v) DMSO at 37 °C (see Experimental in the ESI, Fig. S9). All experiments were conducted in triplicate and the standard error is reported accordingly. Mutations; A1: W56C, V435A; B1: W56C, L211V, L306M; B2: W56C, L211V, L306M, V361A, Q388R, P453L; B3: W56C, L211V, A254V, L306M, V361A, Q388R, P453L.

Open Access Article. Published on 09 May 2019. This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. This journal is © The Royal Society of Chemistry 2019.

M. Planchestainer et al. Chem. Sci., 2019, 10, 5952–5958
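
The fold improvements quoted in the excerpt can be checked against the turnover frequencies in Table 1 (HEWT evolution). The sketch below uses the mean TOF values transcribed from that table and ignores the reported standard errors; the dictionary layout and the function name `fold_change` are hypothetical.

```python
# Mean turnover frequencies (10^-3 s^-1) transcribed from Table 1.
TOF = {
    "1a": {"WT": 36, "A1": 66, "B1": 65, "B2": 92, "B3": 84},
    "2a": {"WT": 4, "A1": 50, "B1": 127, "B2": 229, "B3": 246},
}


def fold_change(substrate: str, variant: str, reference: str = "WT") -> float:
    """Ratio of a variant's TOF to that of a reference variant."""
    return TOF[substrate][variant] / TOF[substrate][reference]


if __name__ == "__main__":
    for substrate, values in TOF.items():
        for variant in values:
            print(substrate, variant, round(fold_change(substrate, variant), 1))
```

The ratios agree with the rounded figures in the text: roughly 2-fold for A1 and B1 on 1a, a further 1.4-fold from B1 to B2, and close to 60-fold for B2 on 2a.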

Enzyme engineering requires an efficient screen

Transaminases
o-xylylenediamine assay

Selected colonies that rapidly turned black (those much faster than the wild-type control)

[Excerpt from the embedded supplementary information; […] marks text lost in extraction.]

[…] were processed using XDS (ref. 6) and assigned to a triclinic (P1) space group using POINTLESS and scaled using SCALA; both implemented in the CCP4i suite (refs 7, 8). The 3D structure of HEWT was solved using Molrep and the structure of an aspartate aminotransferase from Pseudomonas aeruginosa (PDB entry 5TI8; 43% sequence identity over 433 aligned residues) as a search model (ref. 9). The structure was manually built using COOT and refined using phenix.refine until satisfactory refinement parameters were achieved (Rwork = 16 %; Rfree = 21.0 %) (refs 10, 11). All residues are located in allowed regions of the Ramachandran except for Ala283 in both chains and Lys284 (chain A); the catalytic lysine is often found as an outlier in many PLP-dependent enzymes due to its covalent interaction with PMP. Data collection parameters and refinement statistics are shown in Supplementary Table 3.

Supplementary Figure 1 | Optimised amino acceptor screening based on the ortho-xylylenediamine assay. Scheme elaborated from Weis et al. (ref. 12)

M. Planchestainer et al. Chem. Sci., 2019, 10, 5952

8
Enzyme engineering requires an efficient screen
Transaminases
o-xylylenediamine assay

E. coli BL21(DE3) cells are transformed with the HEWT library of interest (A) and colonies are grown overnight
on a nitrocellulose membrane placed on LB agar plates supplemented with 100 µg/mL of ampicillin (B).
Membranes are transferred to LB agar plates supplemented with 100 µg/mL of ampicillin and 1 mM IPTG for
protein expression for 8 hours (C). To minimise false-positive background colour formation and to stop the cell
metabolism, colonies are dialyzed overnight by transferring the membrane to a dialysis plate containing 2% agar,
10 mM Tris-HCl pH 8, and 0.1 mM PLP (D). Afterwards, the background is depleted by placing the membrane on
filter paper soaked in 10 mM (S)-1-phenethylamine in phosphate buffer pH 8, 1% (v/v) DMSO for 30 minutes (E).
Finally, screening is conducted by incubation of the membranes on assay plates containing 10 mM
ortho-xylylenediamine and 10 mM amino acceptor of interest in phosphate buffer pH 8, 10% (v/v) DMSO (F).
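
The final hit-picking step of a colony screen like this (selecting colonies that turn black much faster than the wild-type control) can be sketched in a few lines of Python. This is only an illustration and not the authors' procedure: it assumes each colony has been given a hypothetical time-to-black reading in minutes, and that a hit is any colony that darkens in less than a chosen fraction of the wild-type time. The function name `pick_hits`, the colony labels, the timings and the 0.5 cut-off are all assumptions made for this example.

```python
from typing import Dict, List


def pick_hits(times_min: Dict[str, float], wild_type: str, fraction: float = 0.5) -> List[str]:
    """Return colonies that turn black faster than `fraction` x the wild-type time.

    times_min: hypothetical time (minutes) for each colony to turn black.
    wild_type: key of the wild-type control colony in `times_min`.
    fraction:  assumed cut-off; 0.5 means "at least twice as fast as wild type".
    """
    cutoff = times_min[wild_type] * fraction
    hits = [
        name
        for name, t in times_min.items()
        if name != wild_type and t < cutoff
    ]
    # Fastest colonies first, so the most promising clones are listed first.
    return sorted(hits, key=lambda name: times_min[name])


# Made-up readings, for illustration only.
example = {"WT": 120.0, "c1": 30.0, "c2": 110.0, "c3": 55.0, "c4": 20.0}
print(pick_hits(example, wild_type="WT"))
```

In practice the cut-off is a judgment call made by eye on the membrane; the point of the sketch is only that hits are defined relative to the wild-type control, not by an absolute colour value.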

M. Planchestainer et al. Chem. Sci., 2019, 10, 5952

Enzyme engineering requires an efficient screen

Transaminases

o-xylylenediamine assay

[Scheme: ketone substrate; label: Poor activity]

Ø First round of evolution involved screening ca. 15,000 colonies and identified two improved clones (A1 and B1)

Ø A second round of evolution with both first generation clones afforded an improved variant B2, arising from the B1
parent library

A1 mutations - W56C, V435A (First round of evolution)

B1 mutations - W56C, L211V, L306M (First round of evolution)

B2 mutations - W56C, L211V, L306M, V361A, Q388R, P453L (Second round of evolution)

B3 mutations - W56C, L211V, L306M, V361A, Q388R, P453L, A254V (only one extra mutation from B2) better expression

C1 mutations (not discussed here)

… deposited in the PDB was assessed using ENDscript 2.0 (http://endscript.ibcp.fr).34 HEWT was compared with 124
structures with sequence conservation >30%. As expected, the highest conservation (sequence and structural) is located
in the PLP-dependent transferase-like domain and active site region (Fig. 3A).

Fig. 3 HEWT 3D structure conservation and mutant residue positions. Secondary structure of the HEWT monomer shown in
sausage representation, as automatically generated by ENDscript 2.0 (http://endscript.ibcp.fr).34 Structure conservation
between chain A of HEWT and 124 structure homologs present in the PDB is indicated by ribbon thickness, with regions of
low conservation being thicker than highly conserved regions (thin regions). Sequence conservation is indicated by red
shading; the redder the residue, the more conserved it is. Mutated residues in variants A1 (blue), B1 (yellow), B2 (green)
and C1 (orange) are shown as sticks. The B3 mutant shares all B2 mutations and contains an extra A254V mutation (pink
sticks). P453 from variants B2 and B3 is not present in the model. The N- and C-termini are indicated and PLP is shown in
sticks. This figure was generated using Pymol 2.0.6. Modelling of the active site for the A1 and B1 mutants, where the
bulky Trp (the steric encumbrance of the residue is shown as an orange cloud) (D) is substituted with a cysteine (E),
allowing the aromatic ring substituent (para-nitroacetophenone in the aldimine intermediate complex with the pyridoxamine
phosphate (PMP)) to be more easily accommodated.

M. Planchestainer et al. Chem. Sci., 2019, 10, 5952

Enzyme engineering requires an efficient screen

Transaminases

o-xylylenediamine assay

Relevant mutations are shown in colour for each of the variants. PLP is docked in the active site. Sequencing of the
selected mutants identified two amino acid changes in the A1 variant (W56C, V435A), while the B1 variant
displayed three amino acid substitutions (W56C, L211V, L306M). Interestingly, both mutants harbour a tryptophan
to cysteine change in position 56, which is located in the active site. This residue has previously been identified as a
hotspot in in silico rational design studies. In this case, removing the large indole ring and replacing it with the
smaller cysteine side chain allows for easier substrate binding. Furthermore, the thiol can promote hydrogen
bonding with the aromatic substituent and stabilise para-nitroacetophenone. Accordingly, W56 is located in a highly
structurally conserved region, as is V435. The L306M mutation is peripheral, is not conserved in sequence and is less
structurally conserved than the A1 mutations, so it is somewhat more difficult to rationalise. The L306M mutation
introduces a longer side chain that may form more stabilising hydrophobic interactions with surrounding hydrophobic
residues (M126, V302 and F313).

The second round of evolution introduced three additional mutations (V361A, Q388R, P453L) in the B2 variant,
which are also located outside of the active site and all are exposed to the solvent; their side-chains do not make
any significant stabilising intramolecular interactions. B3 introduced mostly silent mutations, apart from a single
A254V mutation neighbouring residue D255, which forms a hydrogen bond with the pyridine-type nitrogen of the
cofactor PLP.

M. Planchestainer et al. Chem. Sci., 2019, 10, 5952


Enzyme engineering requires an efficient screen


Transaminases

A1 mutations - W56C, V435A (First round of evolution)

B1 mutations - W56C, L211V, L306M (First round of evolution)

B2 mutations - W56C, L211V, L306M, V361A, Q388R, P453L (Second round of evolution)

B3 mutations - W56C, L211V, L306M, V361A, Q388R, P453L, A254V (only one extra mutation from B2) better expression

C1 mutations (not discussed here)
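
The relationships between the variants are easiest to see as set operations on the mutation lists. The short Python sketch below uses only the mutations listed on this slide; the helper names `new_mutations` and `shared_mutations` are made up for this example.

```python
# Mutations per variant, as listed on the slide.
VARIANTS = {
    "A1": {"W56C", "V435A"},
    "B1": {"W56C", "L211V", "L306M"},
    "B2": {"W56C", "L211V", "L306M", "V361A", "Q388R", "P453L"},
    "B3": {"W56C", "L211V", "L306M", "V361A", "Q388R", "P453L", "A254V"},
}


def residue_number(mutation: str) -> int:
    """Extract the position from a label such as 'W56C' (gives 56)."""
    return int(mutation[1:-1])


def new_mutations(child: str, parent: str) -> list:
    """Mutations present in `child` but absent from `parent`, sorted by position."""
    return sorted(VARIANTS[child] - VARIANTS[parent], key=residue_number)


def shared_mutations(*names: str) -> list:
    """Mutations common to all named variants, sorted by position."""
    common = set.intersection(*(VARIANTS[name] for name in names))
    return sorted(common, key=residue_number)


print(new_mutations("B2", "B1"))     # gained in the second round of evolution
print(new_mutations("B3", "B2"))     # the single extra mutation in B3
print(shared_mutations("A1", "B1"))  # common to both first-round hits
```

The three calls return V361A, Q388R and P453L for B2 over B1, A254V alone for B3 over B2, and W56C as the only mutation shared by the two first-round clones.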

Improved variants have been isolated and identified using a screen that
tells you very little about what is really going on (kinetics, turnover, substrate
scope…).

It simply identifies mutants that turn over the o-xylylenediamine substrate quickly
(at least a small amount of it), and this guides further engineering, all without needing to know anything
about the protein structure.
Fig. 3 HEWT 3D structure conservation and mutant residue positions (M. Planchestainer et al. Chem. Sci., 2019, 10, 5952).

Enzyme engineering requires an efficient screen

Monoamine oxidase

Enzyme engineering requires an efficient screen


Monoamine oxidase

A good screen is ESSENTIAL for successful evolution. This MAO screen is particularly powerful because it
detects the reaction by-product, hydrogen peroxide. It is therefore independent of the amine substrate used,
i.e. no matter what amine you test, if it is a substrate you will see a colour change. This is very important for
directed evolution. Many screens are substrate dependent, i.e. they only work with a particular amine, and so
have limited scope. Another advantage of this screen is that it can be used for any oxidase
where hydrogen peroxide is the by-product.

Enzyme engineering requires an efficient screen


Monoamine oxidase

[Diagram: directed evolution cycle. Parent MAO gene (= parent protein) → (a few) random mutations → select/screen →
evolved MAO gene (= evolved protein); if no improvement, repeat.]

Advantages and drawbacks of this screen?
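
The mutate, screen, repeat cycle on this slide can be written as a short loop. The Python sketch below is a toy model under stated assumptions, not a description of the real MAO experiments: sequences are short made-up strings, and `toy_screen` stands in for the colony screen by counting matches to a hypothetical optimum `target` that would be unknown in a real experiment. All function names, sequences and parameter values are illustrative.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Return a copy of `seq` with `n_mutations` random point substitutions."""
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mutations):
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def toy_screen(seq: str, target: str) -> int:
    """Stand-in for a colony screen: counts positions matching a made-up optimum."""
    return sum(a == b for a, b in zip(seq, target))


def evolve(parent: str, target: str, rounds: int = 20, library_size: int = 200,
           n_mutations: int = 2, seed: int = 0) -> str:
    """Mutate, screen, keep the best variant only if it beats the parent, repeat."""
    rng = random.Random(seed)
    for _ in range(rounds):
        library = [mutate(parent, n_mutations, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: toy_screen(s, target))
        if toy_screen(best, target) > toy_screen(parent, target):
            parent = best  # improved clone becomes the parent of the next round
        # otherwise: no improvement, repeat with the same parent
    return parent


parent = "MKTAYIAKQR"
target = "MKWAYLAKCR"  # hypothetical optimum, unknown in a real experiment
evolved = evolve(parent, target)
print(toy_screen(parent, target), "->", toy_screen(evolved, target))
```

Note that the loop never needs structural information about the protein: it only needs the screen to rank variants, which is why the quality of the screen limits what the evolution can find.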


Enzyme engineering requires an efficient screen


Lipases

Ø Lipases can be screened using tributyrin (or an equivalent), which can be incorporated
into agar

Ø This could be useful for lipase discovery or for selecting active lipases from an
evolution experiment, for example. It won't, however, give much more information
about the enzyme, e.g. substrate scope, enantioselectivity…

Enzyme engineering requires an efficient screen


Lipases

There are many ways to screen for lipase activity, both in liquid and on solid phase, including enantioselective
screens. The one shown here is useful for detecting lipase activity on solid phase. It relies on a compound called
tributyrin, which is incorporated into the agar medium on the plate. If the cells express an active lipase, the enzyme
will hydrolyse the tributyrin and produce a zone of clearing on the plate. With hundreds of colonies on a plate,
a zone of clearing would appear only around the colonies expressing an active lipase.

Problem

[Scheme: ketone → (S)-amine, catalysed by a transaminase]

1) A given transaminase has no activity towards the above ketone and must be engineered.

Outline an engineering strategy you would take to alter the substrate scope of the enzyme to enable it
to accept this ketone

Design a screen to assist in the engineering of a transaminase for the above reaction

Some info
• The wild-type enzyme does not accept the starting substrate
• There is no crystal structure of the protein
• The DNA sequence is known
• A crystal structure of a similar protein (87% sequence similarity) is available in the literature

2) Having developed a protein that accepts the substrate, you now want it to work in 30% MeOH, but the mutant
shows only low activity under these conditions.

Suggest how you would further engineer the protein to enable it to tolerate these conditions
