Peter Tompa - Structure and Function of IDPs
Peter Tompa
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
Tompa, Peter.
Structure and function of intrinsically disordered proteins / Peter Tompa.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-4200-7892-3 (hardcover : alk. paper)
1. Proteins--Pathophysiology. 2. Proteins--Structure-activity relationships. 3.
Proteins--Metabolism--Disorders. I. Title.
[DNLM: 1. Protein Conformation. 2. Protein Denaturation. 3. Protein Folding. 4.
Structure-Activity Relationship. QU 55.9 T662s 2009]
RC632.P7T66 2009
612.3’98--dc22 2009011360
I wish to give special thanks to my mother, who is a mathematician, and my father, who
is a physicist. From them, I inherited a love for science and a spiritual foundation with-
out which this book would not exist. Ironically, the most profound message I took from
them was not about science, but poetry. They instilled in me a respect for the power of
language that has been a driving force for this book. I also thank my wife, Csilla, and
our daughters, Rozi and Bori, for their love, patience, and encouragement during the
endless days of writing.
Peter Tompa
Contents
4 Hydrodynamic Techniques
4.1 Gel Filtration (Size-Exclusion) Chromatography
4.2 Dynamic Light Scattering
4.3 Analytical Ultracentrifugation
4.4 Small-Angle X-Ray Scattering
4.4.1 Measles Virus Nucleoprotein
4.4.2 Bacterial Cellulase
4.4.3 p53
4.5 Pulsed-Field Gradient NMR
References
Index
Foreword
It is now half a century since the first crystal structure of a protein (myoglobin) was
published, soon to be followed by a series of high-resolution structures. For most of this
time, we have admired the beautiful structures of proteins composed of well-packed
helices and sheets linked by turns. A well-folded, albeit dynamic, structure was thought
to be the hallmark of protein function. This view was also built on the previous half
century where such ideas as lock-and-key specificity of enzymes and complementar-
ity of antibody to antigen structure were guiding principles. The subsequent 50,000
or so structures that are deposited in the Protein Data Bank (PDB) have provided the
foundation for our understanding of how enzymes, receptors, transporters, and struc-
tural proteins function. Accordingly, it came as a shock to discover that many proteins
or regions of proteins are not ordered, but intrinsically disordered. Like all paradigm
shifts, the existence of intrinsically disordered or unstructured proteins (IDPs or IUPs)
was not immediately accepted, and there is still skepticism in some quarters. But it now
seems that ordered proteins and domains cover only about half of the sequence space
in various proteomes. Protein disorder reaches high proportions in higher eukaryotes,
and is intimately linked with the functions of signaling and regulation, also often caus-
ally linked with debilitating diseases such as cancer and neurodegenerative disorders.
Peter Tompa’s fine comprehensive overview of this rapidly advancing field, Structure
and Function of Intrinsically Disordered Proteins, is of timely importance, both for its
documentation and for emphasizing the importance of IDPs in biology and in protein
science.
Peter Tompa addresses the structure, function, and evolution of IDPs at a variety
of levels. After a short introduction to the history of the recognition of the phenom-
enon, he provides insight into the physical principles of protein structure and describes
in detail the biophysical techniques applicable for the characterization of IDPs. The
book also highlights bioinformatics and proteomic techniques applied for their large-
scale discovery and characterization. Detailed description of the structural ensemble
of disordered proteins leads to chapters focusing on the functional insight gained by
recognizing disorder, such as the functional classification of IDPs, the extension of
the structure-function paradigm, and the involvement of structural disorder in disease.
This book demonstrates Peter Tompa’s considerable command of the field, providing
appropriate examples and ample details in every respect. Its coverage of even the latest
developments in the field is impressive, and the author manages to strike a good bal-
ance between detail and concept to lead the reader through this novel field. Thus, it
can be recommended to a wide audience, including researchers actively pursuing IDP
research, informed professionals interested in this novel concept, and undergraduate to
graduate level students who take on studies in protein science, biochemistry, or molecu-
lar biology. Since results of the field of protein disorder also shed new light on the etiol-
ogy of many well-known diseases, I can also recommend this book to physicians and
biomedical researchers, who must better understand the role of structural disorder in
human diseases in their efforts to develop novel remedies against them.
Alan Fersht
Cambridge, 2009
Preface
Throughout my career, I have shared the view of many colleagues—that the structure of
proteins must relate to their function, as clearly witnessed by tens of thousands of actual
structures solved and deposited in various databases. The turning point in my research
came when I encountered the idea of disorder in proteins and became interested in
understanding these rare exceptions to the rule. The introduction of this simple idea has
unleashed a wealth of information on proteins that defy the structure-function para-
digm. It turns out that these proteins are rather prevalent and play important regulatory
and signaling roles.
Due to the breathtaking pace at which novel information is generated these days,
it is both easy and difficult to write a book on this subject. It is easy because so many
observations have been made that the subject lends itself to extensive coverage. It is
difficult because the concepts are in continuous flux due to the constant outpouring of
information. At least one thing is clear: The structural disorder of proteins deserves
to be surveyed comprehensively. This book is my attempt to help reach this ambitious
goal. I hope it will inspire future work in the field.
About the Author
This book could not have been written without my colleagues Bianka Agoston, Denes
Kovacs, Attila Farkas, Veronika Csizmok, Zoltan Bozoki, Eszter Hazy, Agnes Tantos,
Lajos Kalmar, Hedi Hegyi, Istvan Simon, Andras Perczel, Robert Kiss, Kalman Tompa,
and Zsuzsa Dosztanyi, who helped with the figures and gave their advice and comments
on the text. I am also indebted to my editor, Luna Han, for her inspiration and assistance
throughout the various phases of writing.
Abbreviations and Acronyms
D-box: destruction-box
Df31: decondensation factor 31
DHFR: dihydrofolate reductase
DHN: dehydrin
DHPR: dihydropyridine receptor
DLS: dynamic light scattering
DP: dual personality
DSC: differential scanning calorimetry
DSSP: dictionary of protein secondary structure
ECM: extracellular matrix
EFP: EWS fusion protein
eIF4E: eukaryotic translation initiation factor 4E
eIF4F: eukaryotic translation initiation factor 4F
eIF4G1: eukaryotic translation initiation factor 4G1
ELM: eukaryotic linear motif
EM: electron microscopy
EOM: ensemble optimization method
EPR: electron paramagnetic resonance
ERD: early responsive to dehydration
ESR: electron spin resonance
EWS: Ewing’s sarcoma
FCS: fluorescence correlation spectroscopy
FID: free induction decay
FITC: fluorescein-isothiocyanate
FMRP: fragile X mental retardation protein
FnBPA: fibronectin binding protein(A)
FRET: fluorescence resonance energy transfer (also Förster resonance energy transfer)
FTIR: Fourier-transform infrared spectroscopy
GARP: glutamic acid-rich protein
GBD: GTPase-binding domain
GF: gel filtration (chromatography)
GFP: green fluorescent protein
Gnd-HCl: guanidine-hydrochloride
GO: gene ontology
HAT: histone acetyltransferase
HCAP: human cancer-associated proteins
HD: Huntington’s disease
HMG: high-mobility group
HMGA: high-mobility group protein A
hnRNPA1: heterogeneous nuclear ribonucleoprotein A1
HSQC: heteronuclear single quantum coherence
HTS: high-throughput screening
HTT: Huntingtin
HXMS: hydrogen/deuterium exchange mass-spectrometry
HCA: hydrophobic cluster analysis
I2: inhibitor-2
\[
F = k \, \frac{q_1 q_2}{r^2} \tag{1.1}
\]
where q1 and q2 are the charges at each atom separated by a distance r, and constant
k depends on the dielectric constant of the medium. The attractive interaction of two
opposite charges is called a salt bridge.
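As a rough numerical illustration of Equation 1.1, the sketch below evaluates the force between two unit charges of opposite sign. It assumes SI units with k = 1/(4πε0εr); the 4 Å separation and the relative permittivity of 80 for water are illustrative choices, not values taken from the text.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def coulomb_force(q1, q2, r, relative_permittivity=1.0):
    """Force in newtons between two point charges (C) at distance r (m).

    A negative value means attraction, a positive value repulsion.
    The constant k of Equation 1.1 is taken as 1/(4*pi*eps0*eps_r),
    which is how it depends on the dielectric constant of the medium.
    """
    k = 1.0 / (4.0 * math.pi * EPSILON_0 * relative_permittivity)
    return k * q1 * q2 / r ** 2


r = 4.0e-10  # assumed separation of 4 angstroms, for illustration only
f_vacuum = coulomb_force(+E_CHARGE, -E_CHARGE, r)
f_water = coulomb_force(+E_CHARGE, -E_CHARGE, r, relative_permittivity=80.0)
print(f"vacuum: {f_vacuum:.3e} N, water-like medium: {f_water:.3e} N")
```

With these assumptions the attraction is 80 times weaker in the water-like medium than in vacuum. This is one way to see why the strength of a salt bridge depends on how far it is buried from solvent.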
A physical interaction of special importance in protein structure is the hydrogen
bond (H-bond), which is the attraction between two electronegative atoms mediated
by a hydrogen atom covalently linked to one of them (donor). Attraction between the
partial positive charge of the hydrogen atom and the partial negative charge of the adja-
cent electronegative atom (acceptor) results in the formation of a partial covalent bond.
H-bonds typically form between carbonyl oxygens and amide hydrogens of the back-
bone, and represent the major stabilizing force of repetitive (or turn) secondary struc-
tural elements.
The structural fate of proteins also depends on the interactions of their residues
with solvent water. In general, polar residues interact favorably, whereas apolar residues
interact unfavorably with water. The relative tendency of amino acids to interact with
water is expressed in terms of hydrophobicity or hydropathy, numerically expressed in
scales such as the “Kyte–Doolittle” (Kyte and Doolittle 1982) and “Sweet–Eisenberg”
(Sweet and Eisenberg 1983) scales (see Figure 1.1). The importance of this interaction
stems from the fact that the vicinity of an apolar/hydrophobic residue limits the con-
formational freedom of water molecules. Thus, the release of such water molecules is
highly favorable and provides for the hydrophobic effect, which drives protein folding
(see Section 1.6).
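A hydropathy scale is typically applied by averaging the per-residue values over a sliding window along the sequence. The following is a minimal sketch using the Kyte–Doolittle values; the function name and the default window length of 9 are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle 1982), one-letter codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of every full window along the sequence.

    Returns len(sequence) - window + 1 values; positive values mark
    apolar stretches, negative values polar ones.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Hypothetical toy sequence: an apolar stretch followed by a charged one.
print(hydropathy_profile("LLIVVAAKKEEDD", window=5))
```

The window length trades resolution against noise, so it is best treated as a free parameter.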
1 • Principles of Protein Structure and Function 3
[Figure 1.1 panels: Small, Nucleophilic, Hydrophobic, Aromatic, Acidic, Amide, Basic. Values legible in the Small, Nucleophilic, and Hydrophobic panels:]

Amino acid               Group          Hydropathy   α-helix   β-sheet
Glycine (Gly, G)         Small          –0.4         0.57      0.75
Alanine (Ala, A)         Small           1.8         1.42      0.83
Serine (Ser, S)          Nucleophilic   –0.8         0.77      0.75
Threonine (Thr, T)       Nucleophilic   –0.7         0.83      1.19
Cysteine (Cys, C)        Nucleophilic    2.5         0.70      1.19
Valine (Val, V)          Hydrophobic     4.2         1.06      1.70
Leucine (Leu, L)         Hydrophobic     3.8         1.21      1.30
Isoleucine (Ile, I)      Hydrophobic     4.5         1.08      1.60
Methionine (Met, M)      Hydrophobic     1.9         1.45      1.05
Proline (Pro, P)         Hydrophobic    –1.6         0.57      0.55
Figure 1.1 Basic features of the 20 amino acids of proteins. The structure and basic
physicochemical features of amino acids (shown by their standard three-letter and one-
letter codes). The three numbers below the name represent their tendency to interact with
water (hydropathy or hydrophobicity, as given by the Kyte–Doolittle scale [data from Kyte
and Doolittle 1982]), and preference to be found in secondary structural elements α-helix
and β-sheet [data from Chou and Fasman 1978].
Figure 1.2 Local structure and dihedral angle around the α-carbon in a polypeptide
chain. A useful descriptor of local conformation of a polypeptide chain is the pair of dihe-
dral angles of the rotation of two planar peptide bonds around the alpha carbon. The
example shown is an Ala3 tripeptide: The four atoms of the peptide bonds on either side of
Cα are found within a plane, the position of which is described by two torsion angles, Φ
(defined as C′-N-Cα-C′) and Ψ (defined as N-Cα-C′-N).
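Given atomic coordinates, each torsion angle defined in the caption can be computed from its four atoms. The sketch below is one standard way of doing so using only the standard library; the function names and the toy coordinates are hypothetical, and the sign follows the usual convention in which a clockwise rotation of the far bond, viewed along the central bond, is positive.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees (-180 to 180) around the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# phi = dihedral(C_prev, N, CA, C); psi = dihedral(N, CA, C, N_next)
# Toy coordinates (not from a real structure) giving a torsion of -90 degrees:
print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1)))
```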
The basic units of genetic information are genes, which, to a first approximation,
correspond to segments of DNA that encode a protein. The information is first transcribed
into a messenger RNA (mRNA) molecule, which is then translated by the ribosome to give
rise to a protein molecule. The sequence of amino acids within the polypeptide chain is
defined by the succession of codons (nucleotide triplets) within the gene. Because there
are four types of nucleotides in DNA, 4^3 = 64 different codons exist. Four of these signal
the initiation (start codon, AUG, also encoding Met) and termination (stop codons, UAA,
UAG, and UGA) of protein synthesis; thus 61 actually encode amino acids. Several amino
acids have more than one corresponding codon (i.e., the genetic code is redundant in this sense).
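The codon arithmetic can be checked by direct enumeration. This is a small sketch; the variable names are chosen here for illustration.

```python
from itertools import product

NUCLEOTIDES = "ACGU"                  # RNA alphabet, as the codons are read from mRNA
STOP_CODONS = {"UAA", "UAG", "UGA"}   # termination signals
START_CODON = "AUG"                   # initiation signal, also encodes Met

# All ordered triplets over a four-letter alphabet: 4**3 of them.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Every codon except the three stop codons encodes an amino acid.
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons), len(sense_codons))  # 64 61
```

Since 61 codons encode only 20 amino acids, several amino acids must be served by more than one codon.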
The nucleotide sequence of the gene and the amino acid sequence of the polypeptide
chain of the protein are colinear (i.e., codons read from the 5′ end, defined by the
5′ hydroxyl group of the deoxyribose units within the backbone of DNA, toward the 3′ end of the
gene correspond to amino acids, starting from the N-terminus toward the C-terminus
within the protein). The sequence is also determined by covalent changes in mRNA,
because its intervening regions (introns) are removed, and the rest (exons) are joined
together in a process termed splicing, to yield mature mRNA. It is estimated that in
about one-half to two-thirds of eukaryotic genes, splicing can occur in more than one
way (Blencowe 2006; Huang et al. 2005; Kim, Magen, and Ast 2007), and such alterna-
tive splicing generates variants of the same protein.
1.4 Post-translational modifications of amino acids
The polypeptide chain after synthesis may function without further chemical modifica-
tion, but in many cases it may undergo additional post-translational chemical modifi-
cations, which either extend the range of chemical functionalities of amino acids or
change the function of the protein for the purposes of regulation.
The most frequent modification is the formation of disulfide bonds between thiol
groups of Cys residues, which is intimately linked to the stabilization of 3-D structure. Disulfide
bonds can form spontaneously, but their formation can also be assisted by specific
enzymes known as protein disulfide isomerases. The tendency of cysteines to spontane-
ously form disulfide bonds is probably one of the major reasons why IDPs have a low
level of this amino acid.
Another common modification of side-chains is the enzymatic phosphorylation of
Ser, Thr, and Tyr residues. This modification is carried out by protein kinases, and it can be
reversed by protein phosphatases. Reversible phosphorylation often causes changes in func-
tion, and thus it is used very extensively for regulatory purposes. Specificity of recognition
by the kinase comes from the primary sequence flanking the site of phosphorylation.
Proteins may also be glycosylated on their amide or hydroxyl groups. Such modification
adds single or multiple branched carbohydrate moieties to the polypeptide chain
(i.e., N-linked glycans are attached to the amide nitrogen of Asn, and O-linked glycans
are attached to the hydroxy oxygen of Ser or Thr side chains), which may increase
solubility, lengthen the biological lifetime of the protein, or modify its interactions with
other constituents of the cell. Glycosylated proteins are often involved in highly specific
cell–cell contacts or interactions between the cell and the extracellular matrix.
There are many other less-frequent but important regulatory post-translational mod-
ifications, such as acetylation, myristoylation, methylation, sulfonylation, and nitrosy-
lation. A special modification is targeted at the backbone of the protein: Enzymes of
proteolytic activity may cleave off segments of the polypeptide chain, with the remain-
ing fragment(s) having an activity different from the intact protein. Such limited prote-
olysis may be carried out by the proteasome, a large multi-protein complex primarily
involved in protein degradation (see Chapter 8, Section 8.3.1) (Liu et al. 2003).
The enzymatic modification of proteins may occur at a few specific residues only,
but proteins may also undergo spontaneous chemical modifications in the absence of
modifying enzymes at their chemically labile residues. For example, Cys and Met resi-
dues may be oxidized, Asn and Gln residues may undergo spontaneous deamidation,
or Ser residues may be glycosylated by glucose. Such modifications usually have severe
functional consequences.
can be adequately described by two torsion angles, Φ (phi) and Ψ (psi), corresponding
to the rotation of the two adjoining amide planes around the bond connecting them to
the Cα (Figure 1.2). The Ramachandran plot (Figure 1.3) describes local conformation
of a polypeptide chain by the Φ,Ψ pairs. Large parts of the plot correspond to disallowed
conformations, which do not occur in actual proteins. Different amino acids have dif-
ferent propensities to occur in different regions of the plot (i.e., in different secondary
structural elements) (Figure 1.1).
Residues in α-helices typically adopt backbone Φ,Ψ dihedral angles around –60°,
–45° (Figure 1.3). The resulting structure is repetitive, in which the polypeptide chain
takes turns so that the carbonyl oxygen of each peptide bond is H-bonded to the amide
hydrogen of the fourth peptide bond in the chain. The helix has 3.6 residues per turn,
with the H-bonds lying almost parallel with its axis (Figure 1.4A). Often, the distribution
of residues in the sequence creates an α-helix with sides of distinct physicochemical
character (i.e., an amphipathic helix, which has a hydrophobic/apolar and a hydrophilic/
polar face).
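With 3.6 residues per turn, successive side chains are offset by 360°/3.6 = 100° around the helix axis. One common way to quantify how amphipathic a helix is, not described in the text itself, is the hydrophobic moment: the length of the vector sum of per-residue hydrophobicities placed at those angles. The sketch below assumes the Kyte–Doolittle scale and a hypothetical designed 18-residue sequence.

```python
import math

# Kyte-Doolittle hydropathy values, used here as the hydrophobicity measure
# (an assumption; other scales are often used for this purpose).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, degrees_per_residue=100.0):
    """Mean hydrophobic moment per residue for an ideal helical geometry.

    degrees_per_residue = 100 corresponds to 3.6 residues per turn.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for i, aa in enumerate(sequence.upper()):
        angle = math.radians(i * degrees_per_residue)
        h = KYTE_DOOLITTLE[aa]
        sin_sum += h * math.sin(angle)
        cos_sum += h * math.cos(angle)
    return math.hypot(sin_sum, cos_sum) / len(sequence)


# Hypothetical 18-mer with Leu on one face and Lys on the other.
amphipathic = "LKKLLKKLLKLLKKLLKK"
uniform = "A" * 18
print(hydrophobic_moment(amphipathic), hydrophobic_moment(uniform))
```

A sequence with no periodicity in hydrophobicity gives a moment near zero, whereas segregating apolar and polar residues onto opposite faces gives a large one.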
[Figure 1.3 plot: Ψ (degrees) against Φ (degrees), both axes running from –180 to 180, with regions labeled for the repetitive secondary structures.]
Figure 1.3 Ramachandran plot: peptide bond dihedral angles in proteins. A Ramachandran
plot of major preferred (dark gray) and allowed (light gray) Φ, Ψ angle pairs in proteins, with
the position of repetitive secondary structures marked. Most of the area (white) on the plot
corresponds to disallowed conformations.
Figure 1.4 Typical secondary structural elements of proteins. Local conformation of the
polypeptide chain in a protein often assumes repetitive conformations, such as α-helix (A,
an oligo-Ala segment), β-sheet (B, shown here to be composed of antiparallel strands,
TOP7 structure, pdb 1qys), or PPII helix (C, collagen model peptide, pdb 2d3f).
of turns and the structural heterogeneity of individual residues in them, they occupy
different regions in the Ramachandran plot.
The left-handed PPII helix conformation was for a long time not recognized
as an individual secondary structural element, but comprehensive studies of ordered
(Adzhubei and Sternberg 1993) and intrinsically disordered (Syme et al. 2002) proteins
provided evidence for the frequent occurrence of this structural element. PPII is the
most fully extended secondary structural state of the polypeptide chain, with about three
residues per turn (Figure 1.4C). The polypeptide chain in PPII conformation derives its
stability mostly from H-bonds made with water molecules. Its location (–75°, 150°) on
the Ramachandran plot (Figure 1.3) partially overlaps with the β-strand region.
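The quoted Φ,Ψ positions can be used for a crude assignment of a residue to the nearest canonical conformation. The sketch below is only illustrative. The α-helix and PPII centers are the values given in the text, the β-strand center of (–120°, 130°) is an assumed typical value that is not stated in the text, and a nearest-center rule ignores the real shapes of the allowed regions.

```python
import math

# Canonical (phi, psi) centers in degrees.
CANONICAL = {
    "alpha-helix": (-60.0, -45.0),   # from the text
    "PPII": (-75.0, 150.0),          # from the text
    "beta-strand": (-120.0, 130.0),  # assumed typical value, not from the text
}


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def nearest_conformation(phi, psi):
    """Name of the canonical conformation closest to (phi, psi)."""
    def distance(center):
        return math.hypot(angular_difference(phi, center[0]),
                          angular_difference(psi, center[1]))
    return min(CANONICAL, key=lambda name: distance(CANONICAL[name]))


print(nearest_conformation(-70.0, 145.0))  # close to the PPII center
```

Because the PPII and β-strand regions partially overlap, residues that fall between the two centers cannot be assigned reliably by angles alone.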
There are segments of proteins that cannot be described by the repetition of any
of the previously described structural states; instead, their local conformation varies from
residue to residue. These regions are called coils, and at the extreme the whole protein
may be constituted of such segments (i.e., loopy proteins) (Liu, Tan, and Rost 2002).
When (part of) a protein fluctuates among many alternative conformations, without a
discernible preference for any of the foregoing secondary structural states, it is termed
a random coil. Although a fully structureless state probably does not exist even under
highly denaturing conditions (Kohn et al. 2004; Shortle 1996), this expression very
frequently occurs in the literature.
Secondary structural elements in actual proteins are usually not confined to the
very narrow range of Φ,Ψ angles defined, which introduces some uncertainty into
structural annotation of residues. This situation may be treated by applying a more
thorough set of definitions, as suggested in the dictionary of protein secondary struc-
ture (DSSP) approach (Kabsch and Sander 1983). The DSSP scheme is founded not
on angles but on the presence/absence of H-bonds, defined by a threshold value –0.5
kcal/mole of interaction calculated from partial charges and interatomic distances.
Two elementary H-bond patterns are defined. A turn occurs when there is an H-bond
between C=O of residue (i) and NH of residue (i + n), where n = 3, 4, or 5, whereas
a bridge is defined between two (parallel or antiparallel) stretches of tripeptides if
the actual residues (i and j) that form an H-bond are more remote in sequence than
in the case of the turn. A minimal helix is then defined as two consecutive n-turns,
whereas a longer helix is described as overlapping minimal helices (an α-helix, for
example, is described as repeating 4-turns). A ladder is defined as a set of consecu-
tive bridges of identical type, whereas a sheet is defined as one or more ladders con-
nected by shared residues. The extraction of such patterns from structures is easily
automated.
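The turn and helix definitions can be sketched in a few lines. The electrostatic energy function below uses the partial charges (0.42 e and 0.20 e) and the factor 332 as recalled from Kabsch and Sander (1983); these constants, the example distances, the function names, and the rule that a minimal helix spans residues i to i + n – 1 are assumptions not given in the text.

```python
Q1, Q2 = 0.42, 0.20    # partial charges on C=O and N-H, in units of e (assumed)
F = 332.0              # dimensional factor giving kcal/mol for distances in angstroms (assumed)
HBOND_CUTOFF = -0.5    # kcal/mol, threshold quoted in the text


def hbond_energy(r_on, r_ch, r_oh, r_cn):
    """Electrostatic interaction energy (kcal/mol) of a C=O ... H-N pair."""
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def n_turns(hbonds, n):
    """Residues i with an H-bond from C=O(i) to N-H(i + n).

    hbonds is a set of (i, j) pairs meaning C=O of residue i bonds N-H of j.
    """
    return sorted(i for (i, j) in hbonds if j - i == n)


def helix_residues(hbonds, n=4):
    """Residues covered by minimal helices (two consecutive n-turns)."""
    turns = set(n_turns(hbonds, n))
    helix = set()
    for i in turns:
        if i - 1 in turns:
            helix.update(range(i, i + n))
    return sorted(helix)


# Illustrative distances (angstroms) for a well-formed H-bond.
print(hbond_energy(2.9, 3.0, 1.9, 4.1) < HBOND_CUTOFF)
# Three consecutive 4-turns overlap into one longer alpha-helix.
print(helix_residues({(1, 5), (2, 6), (3, 7)}))
```

Overlapping minimal helices merge automatically because their residue ranges are collected into a single set.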
a structural unit of two long α-helices twisted around each other (termed a two-stranded
coiled coil). Fibroin and β-keratin found in silk fibers are composed of stacked antipar-
allel β-sheets. Collagen is a special type of helical structure made up of three helices
wound up around each other, each in a conformation close to the PPII helix.
Structures of globular proteins are more complex, because their polypeptide chain
folds up into a compact globule. Their interior is usually filled by tightly packed hydro-
phobic residues, with very few cavities and water mostly excluded. Their packing den-
sity is usually on the order of 0.72–0.77, close to that of contacting solid spheres. Most
of the polar side chains point outward and interact with solvent water.
Globular proteins represent an enormous variety of individual structures, but con-
sidering the arrangement of secondary structural elements they usually fall into one of
four broad structural classes (Figure 1.5): antiparallel α-helix, parallel or mixed β-sheet,
antiparallel β-sheet, and small metal- and disulfide-rich proteins (Garrett and Grisham
2007). Antiparallel α-helix proteins are dominated by α-helices, usually packed in
an antiparallel arrangement, with a slight twist of the helix bundle, as exemplified by
hemagglutinin (Figure 1.5A). In the second class, structures are arranged around paral-
lel or mixed β-sheets. Because a parallel sheet distributes hydrophobic side chains on
both of its sides, neither side can be exposed to the solvent; thus the sheet is typically
found within the core of the structure of proteins, such as in the eight-stranded β-barrel
of triose-phosphate isomerase (Figure 1.5B). The hydrophobic residues of antiparallel
β-sheets are located on just one side of the sheet, so such a sheet usually exists as one of two
juxtaposed sheets, with their opposite faces exposed to the solvent. An example is soybean
trypsin inhibitor (Figure 1.5C). Small proteins often do not fit into any of these
categories, because their structure is heavily influenced by liganded metals or disulfide
bonds, without which their structure is usually unstable. A characteristic example is
insulin (Figure 1.5D).
These descriptions of tertiary structures only apply to simple proteins composed of
a single globular unit. Real proteins are usually more complex, containing several auton-
omous structural regions, which are termed domains (Copley, Goodstadt, and Ponting
2003; Ponting et al. 2000; Vogel et al. 2004). In these cases, the above descriptions of
tertiary structure actually apply to domains, for which three distinct definitions are in use. The
original definition is that of an autonomous structural unit of a protein (Wetlaufer 1973)
(i.e., an element that has the same structure whether or not it is part of the protein). This
structural view is used synonymously with the concept of fold, which emphasizes the ability
of a domain to acquire a well-defined tertiary structure on its own (Han et al. 2007).
A domain may also be considered as a segment of the protein that can be recognized
in distinct genetic contexts by virtue of sequence similarity, when it is called a module
(Patthy 1996). Underlying these definitions is the idea that a domain is a functional unit
of the protein that carries a distinct function on its own (Vogel et al. 2004).
An additional and often underappreciated level of structural complexity stems from
the fact that proteins are not static, but rather undergo constant motions. Mobility has two
basic types: one is best approximated as harmonic atomic/collective oscillations about
the single, most stable equilibrium conformation, and the other is directed motions of
whole segments of the protein (i.e., conformational changes that often form part of the
function). The atomic vibrations are very fast, occurring on the order of picoseconds,
whereas conformational changes may be much slower, taking seconds or even longer.
Figure 1.5 The four major classes of structures of globular proteins. Ribbon diagrams of
globular proteins that represent the four major structural classes. (A) Hemagglutinin (pdb
1htm) belongs to the class of antiparallel α-helix proteins. (B) The eight-stranded β-barrel of
triose-phosphate isomerase (TIM, pdb 1r2t) represents the class of parallel or mixed β-sheet
proteins. (C) Antiparallel β-sheet structures are exemplified by soybean trypsin inhibitor (pdb
1avu). (D) The structure of small proteins that do not fit into the previous categories is often
organized around liganded metals or disulfide bonds, as is the case of insulin (pdb 2zp6).
1.5.3 Quaternary Structure
The native functional state of a protein is often not a single folded polypeptide chain but
an assembly of several chains (subunits) in a stable oligomeric species. This quaternary
structure can be described by the stoichiometry of subunits, their spatial relations, and
eventually the full description of the coordinates of all the atoms of the oligomer. The
oligomer may be composed of identical (homomultimer) or different (heteromultimer)
subunits, the interaction surfaces of which can be identical (isologous interaction) or
different (heterologous interaction). For example, alcohol dehydrogenase is a symmetric
dimer of two identical subunits. Hemoglobin, on the other hand, is a dimer
of dimers of two different subunits and has the structure α2β2. More complex cases are
tubulin, which is an αβ dimeric protein that polymerizes to form microtubules (αβ)n,
and the closed structure of the coat of tomato bushy stunt virus composed of 180 sub-
units (Garrett and Grisham 2007).
Individual components arise from differences between the unfolded and folded states
(Garrett and Grisham 2007). The folded structure is highly ordered; thus –TΔS_chain is a
large positive quantity in the equation. The other terms depend on the nature of amino
acid residues in the chain. Apolar groups can better interact with water than with each
other; thus ΔH_chain is somewhat favorable to the unfolded state. On the other hand, ΔH_solvent
is slightly favorable for the folded state, because water molecules can better interact with
other water molecules than with exposed apolar side chains. The critical component of the
equation, –TΔS_solvent, is large and negative in the presence of apolar groups and strongly
favors the folded state, because interaction with apolar groups forces water molecules
to become ordered. Usually, this hydrophobic effect drives the burial of apolar residues
within the interior of a globule. The net thermodynamic gain of large unfavorable and
favorable components, however, is rather small, usually on the order of 10 kcal/mol (40 kJ/
mol) for typical globular proteins (Baldwin 2007; Makhatadze and Privalov 1995).
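To put the net stability in perspective, the free energy of folding can be converted into an equilibrium constant K = [N]/[U] = exp(–ΔG/RT). The sketch below assumes a simple two-state equilibrium at 25°C, with the sign convention that a negative ΔG favors the folded state; the function names are chosen for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def folding_equilibrium_constant(delta_g_folding, temperature=298.15):
    """K = [N]/[U] for a folding free energy in kcal/mol (negative favors N)."""
    return math.exp(-delta_g_folding / (R_KCAL * temperature))


def fraction_folded(delta_g_folding, temperature=298.15):
    """Fraction of molecules in the folded state for a two-state equilibrium."""
    k = folding_equilibrium_constant(delta_g_folding, temperature)
    return k / (1.0 + k)


# A net stability of about 10 kcal/mol, the order of magnitude quoted in the text.
k_fold = folding_equilibrium_constant(-10.0)
print(f"K = {k_fold:.2e}, fraction folded = {fraction_folded(-10.0):.8f}")
```

Under these assumptions only about one molecule in ten million is unfolded at any instant. Because the net stability is a small difference between large opposing terms, modest changes in any of them can shift this balance appreciably.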
The thermodynamics of folding may also be interpreted in terms of the landscape
theory, which approaches folding by the free energy surface of states in the entire con-
formational space. The underlying assumption is that folding occurs over a funnel-like
energy surface (Figure 1.6) that leads from practically all possible starting positions
to the global minimum (Dill and Chan 1997). In terms of the terminology of reaction
kinetics, this means that there is no single transition state along the folding pathway;
rather, there are several alternatives, and the transition state is only adequately described
by a transition-state ensemble. Because the walls of the funnel are not perfectly smooth,
folding may occasionally be halted in local minima (folding traps), which manifests
itself in both the kinetics and the mechanism of folding.
Figure 1.6 The folding funnel of proteins. Protein folding usually occurs over a funnel-
like surface in the conformational space. The shape of the funnel and a global minimum
corresponding to the native (N) state ensure that the protein folds into the same structure
from practically any initial denatured/unfolded state. The walls of the funnel are not perfectly
smooth, and their ruggedness may occasionally halt folding in local minima (folding
traps) of conformational energy. Reproduced with permission from Dill and Chan (1997),
Nat. Struct. Biol. 4, 10–19. Copyright by Nature Publishing Group.
U↔N (1.3)
U↔I↔N (1.4)
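For schemes such as 1.3 and 1.4, the equilibrium populations of the states follow directly from the equilibrium constants of the individual steps. The sketch below is illustrative; the function names and the example constants are hypothetical.

```python
def two_state_populations(k_un):
    """Fractions (U, N) for U <-> N, with k_un = [N]/[U]."""
    total = 1.0 + k_un
    return (1.0 / total, k_un / total)


def three_state_populations(k_ui, k_in):
    """Fractions (U, I, N) for U <-> I <-> N.

    k_ui = [I]/[U] and k_in = [N]/[I]; each state is weighted relative to U.
    """
    weights = (1.0, k_ui, k_ui * k_in)
    total = sum(weights)
    return tuple(w / total for w in weights)


# Hypothetical constants: the intermediate is weakly populated,
# and the native state dominates.
print(three_state_populations(k_ui=0.1, k_in=1000.0))
```

When the intermediate is only marginally populated at equilibrium, the three-state scheme is experimentally hard to distinguish from the two-state one.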
[Figure 1.7 scheme: three routes connecting the D and N states, labeled framework, nucleation-condensation, and hydrophobic collapse.]
Figure 1.7 Mechanisms of protein folding. The scheme of the three possible mechanisms
of protein folding. The “framework” model assumes that secondary structural elements
form in the open state of the chain, and tertiary contacts are made by these pre-formed
elements. The “hydrophobic collapse” model suggests that folding is initiated by the
compaction of the polypeptide chain around a hydrophobic core, followed by the forma-
tion of secondary structural elements. A combination of the two models, “nucleation-
condensation,” states that the formation of secondary and tertiary structure occurs in
parallel, in a mutually cooperative manner. Reproduced with permission from Daggett and
Fersht (2003), Trends Biochem. Sci. 28, 18–25. Copyright by Elsevier.
1.7 Unfolding of a protein: lessons from polymer theory
The interest in describing the unfolded/denatured state of proteins has been motivated
by the fact that it serves as a reference point for understanding both thermodynamic and
mechanistic aspects of folding. Unfolded states are usually generated by denaturing
conditions, such as high concentrations of urea (8M) or guanidine hydrochloride (Gnd-HCl,
6M), low pH (2.0), or high temperatures (90–100°C). The structural description of
denatured states of globular proteins is closely related to that of IDPs
(discussed in detail in Chapter 10, Section 10.4). A first approximation of such
descriptions is given by one of several global hydrodynamic parameters, such as the following:
1. Radius of gyration (RG, the root mean square distance of atoms from the
center of mass, averaged over all molecules and over time)
2. Stokes radius, also termed hydrodynamic radius (RS or RH, the radius of a hard
sphere that diffuses at the same rate as the given molecule; the corresponding
volume of the molecule is the hydrodynamic volume, VH)
3. End-to-end distribution (RN, the function describing the distribution of distances
between the two ends of the protein), from which the mean-squared end-to-end distance
(<L²>, averaged over all molecules in the ensemble) derives
4. Persistence length (LP, the length over which correlations in the directions of
units of a polymer are lost). Below LP, the orientations of segments are correlated,
whereas for longer pieces the properties can only be described statistically.
These parameters are deeply rooted in polymer theory pioneered by Flory (reviewed
in [Flory 1969]) and applied to the field of proteins by Tanford (Tanford 1968). The two most
frequent models for describing polypeptide chains are the “freely jointed chain” and the
“wormlike chain.” In the freely jointed chain (Flory 1969), the chain is divided into N
statistical segments (beads) of size b, connected by virtual bonds. The chain performs
a random walk, with the mean-squared distance between units separated by N segments,
<RN²> = b²N, and the radius of gyration:
$$R_G = \frac{1}{\sqrt{6}}\, b\, N^{1/2} \tag{1.5}$$
Formally, this description requires that the segments behave independently of each
other. To describe chains of finite length and flexibility, the wormlike chain model
was developed, into which later refinements introduced the effects of heterogeneity of
residues, steric exclusion, and differences in the solvation of side chains. An important
parameter of this model is v (second virial coefficient), which describes interactions
within the chain. v < 0 indicates attraction between segments and a tendency for global
collapse, whereas v > 0 corresponds to repulsive interactions and indicates an overall
swelling of the chain beyond its predicted Gaussian dimensions. v = 0 reproduces the
ideal walk (i.e., the Gaussian or random-coil behavior). As a function of v, the radius of
gyration scales as
$$R_G = R_0 N^{\mu} \tag{1.6}$$
where R0 is a constant related to persistence length, and µ is the scaling factor, which,
depending on v, may take on the value µ = 0.33 (collapsed, spherical molecule in poor
solvent), µ = 0.5 (random chain in an ideal solvent, also termed Θ solvent), and µ = 0.588
(an extended volume random coil in a good solvent). <L²> for unfolded proteins is also
expected to scale linearly with chain length (<L²> = L0·N) (Fitzkee and Rose 2004).
Tanford provided experimental evidence for the random-coil behavior in the case of
globular proteins under highly denaturing conditions by intrinsic viscosity measure-
ments, which yielded µ = 0.67 (Tanford, Kawahara, and Lapanje 1966). In small-angle
X-ray scattering (SAXS) experiments, µ = 0.598 was obtained for denatured proteins
(Kohn et al. 2004).
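As a rough numerical illustration of Eq. 1.6, the following sketch evaluates RG = R0·N^µ for the three scaling exponents quoted above. The prefactor R0 = 2.0 Å, the chain length used in the demonstration, and the function names are illustrative assumptions and are not values or methods taken from the text.

```python
# Scaling exponents quoted in the text for Eq. 1.6 (R_G = R0 * N**mu).
EXPONENTS = {
    "collapsed (poor solvent)": 0.33,
    "ideal chain (theta solvent)": 0.5,
    "random coil (good solvent)": 0.588,
}


def radius_of_gyration(n_residues, mu, r0=2.0):
    """Return R_G in angstroms.

    r0 = 2.0 A is an assumed, purely illustrative prefactor.
    """
    if n_residues <= 0:
        raise ValueError("chain length must be positive")
    return r0 * n_residues ** mu


def compare_regimes(n_residues, r0=2.0):
    """Evaluate Eq. 1.6 for every solvent regime listed in EXPONENTS."""
    return {
        name: radius_of_gyration(n_residues, mu, r0)
        for name, mu in EXPONENTS.items()
    }


if __name__ == "__main__":
    # 109 residues is used only as an example chain length.
    for name, rg in compare_regimes(109).items():
        print(f"{name}: R_G ~ {rg:.1f} A")
```

Because the exponent enters as a power of N, even the small difference between µ = 0.5 and µ = 0.588 produces a clearly larger RG for chains of a hundred residues or more, which is why the exponent is a sensitive reporter of solvent quality.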
These concepts of polymer theory were adopted by the field of protein folding,
from where they reached the field of IDPs. Besides the random-coil state, observations
of more compact intermediates have led to the concept of molten globule (MG)
(Ptitsyn and Uversky 1994) and the somewhat less compact pre-molten globule (PMG)
(Uversky and Ptitsyn 1994) states. MG is characterized by a large internal flexibility of
side chains and backbone, with characteristic hydrodynamic parameters, such as RG, RS
1.5–2.0 times larger than that of globular proteins.
Figure 1.8 A well-defined 3-D structure is required for enzyme activity. (A) The classical
model of lock-and-key was formulated by Emil Fischer in 1894 to explain stereo-specificity of
enzyme catalysis (Fischer 1894). (B) The model assumes that the substrate fits tightly to the
binding site on the enzyme as a key into its lock. (C) The perfect fit between the enzyme
and its substrate can be mimicked by a tight complex with its inhibitor (trypsin in complex
with serpin, pdb 1k90).
spatial pattern of properly placed amino acid residues creates a special physico-
chemical microenvironment tailored for the tight and extremely specific binding of
ligands, catalysis of chemical reactions, translocation of ions or small molecules, or
assembly of specific macromolecular complexes. The success of this paradigm is
demonstrated by the structures deposited into the PDB and countless reports on the
functional details of enzymes, receptors, transporters, membrane channels, building
blocks, mechanochemical proteins, and many other types of proteins (Fersht 1985;
Garrett and Grisham 2007; Stryer 1995). The example of enzymes illustrates some
of the key points.
Enzymes have a rather well-defined binding pocket for the formation of an enzyme-
substrate (ES) complex, as formulated in the classical concept of the lock-and-key
hypothesis (Figure 1.8A; see also Chapter 2, Section 2.2.1), which suggested a tight fit
between the binding pocket and substrate (Figure 1.8B). The concept is substantiated by
the structure of a large number of enzyme-substrate and enzyme-inhibitor complexes
(Figure 1.8C). The current theory assumes a perfect fit between the enzyme and the
transition state (TS) of the reaction (i.e., that acceleration of the conversion of substrate
results from the stabilization of the highest-energy state on the reaction path). Enzymes
lower the energy of TS by several possible mechanisms, such as proximity/orientation,
the formation of transient covalent bonds, general acid/base catalysis, and complemen-
tarity in structure with TS. The residues that directly take part in accelerating the reac-
tion make up the active site. Whereas this model is much more elaborate in its details
than the lock-and-key, its basic premise is the perfect positioning of residues, which can
only be ensured by a well-defined 3-D structure.
2 A Brief History of Protein Disorder
The classical structure-function paradigm has been the dominant view of proteins for a
long time. This chapter provides a historical overview of how observations contradict-
ing this notion have slowly accumulated for decades, eventually leading to the recogni-
tion of the possible generality of the phenomenon of structural disorder.
painstaking to verify if a novel protein conforms to what is considered the rule. From a
practical point of view, it seems more useful to return to an operational definition that
rests on the result of one or a combination of several experimental protocols.
Actually, the whole field of protein disorder has grown out of observations that the
behavior of a protein is in contrast with what a “protein expert” would expect. If a pro-
tein can be boiled and still does not precipitate, this behavior can hardly be interpreted
in terms of the traditional view of proteins. If it has practically no secondary structural
elements by circular dichroism (CD), it is very suspicious behavior. If it is so sensitive
to proteolysis that it cannot be purified without fragmentation, we have a good reason
to suspect that our protein is not an “ordinary” one. If we see poor resonance dispersion
on a nuclear magnetic resonance (NMR) spectrum, there is every reason to assume it is
disordered. Of course, its structure might have been spoiled during expression and/or
purification, but additional test-tube experiments may provide evidence that the protein
is in a state compatible with function. Having accepted such very simple rules of thumb,
the field of protein disorder got a jump start around the year 2000.
(Sedzik and Kirschner 1992), it was found that “despite our efforts which included 4,600
different conditions, we were unable to induce crystallization of MBP . . . when it is
removed from its native environment in the myelin membrane . . . the protein adopts a
random coil conformation and persists as a population of structurally non-identical mol-
ecules.” Currently, a better definition of disorder is not available. Still, people were reluc-
tant to take notice, although some protein segments with no discernible electron density
were already recognized as essential for function (Alber et al. 1983; Bode, Schwager,
and Huber 1978; Spolar and Record 1994).
Another protein that resisted crystallization for a long time is microtubule-asso-
ciated protein 2 (MAP2), which also represents an example of the early observation
of protein disorder (Hernandez, Avila, and Andreu 1986). MAP2, the homolog of tau
protein involved in Alzheimer’s disease (see Chapter 15, Section 15.3.1), is a neuronal
microtubule (MT) binding protein, which stabilizes the unstable MT polymer composed
of αβ tubulin dimers. This protein was among the first to be recognized as disordered
under native, functional conditions. Avila and colleagues showed that heat treatment did
not affect its behavior, as assessed by several biophysical techniques (Hernandez et al.
1986). The high frictional ratio, f/f0 = 3.7, in sedimentation equilibrium and gel chroma-
tography suggested that MAP2 was clearly not globular but had either a very elongated
shape or an unordered expanded structure. Very little secondary structural content was
seen by CD, and this feature was independent of the purification procedure. Overall,
MAP2 in solution was described as an “unordered, very flexible and non-compact”
protein (Hernandez et al. 1986). Interestingly, a few years later, it was demonstrated by
small-angle X-ray scattering (SAXS), CD, and Fourier-transform infrared spectroscopy
(FTIR) that tau protein could “behave as if it were denatured, having no compact fold,
but a highly extended, random Gaussian polymer, with a minimal content of ordered
secondary structure” (Schweers et al. 1994). Because MAP2 and tau are functional ana-
logues but show limited sequence similarity, this similar structural behavior might have
focused attention on function of an IDP being maintained in the face of little sequence
conservation, an idea the field returned to much later (see Chapter 13, Section 13.4).
It was Paul Sigler who came closest to generalizing the concept of structural disorder.
In a seminal paper on transcription factors (Sigler 1988), Sigler described their DNA-binding
domains as structurally well-defined, whereas he noted that mutational and structural stud-
ies of their trans-activator domains (TADs) “suggest a disquieting picture of a conformation-
ally ill-defined polypeptide that can function almost irrespective of sequence, provided only
that there is a sufficient excess of acidic residues clustered or peppered about.” He concluded
that eukaryotic transcription relies on nearly shapeless molecules termed “acid blobs” or
“negative noodles.” Disorder, without using the word, is clearly described as “whereas crys-
tal structures at atomic resolution are crucial to our understanding of … specific molecular
interactions, we can imagine many assemblies … whose function requires strong but less
precisely defined arrangements than the ones we have seen crystallographically.”
regions in otherwise globular proteins actually have such high main-chain mobilities
that qualify them as disordered. He suggested that such regions in interleukin-4, GroES,
pyruvate dehydrogenase, and eglin c are actually involved in protein–protein recogni-
tion, in which disorder provides advantages, such as facilitated spatial search and reduc-
tion of binding energy without compromising specificity. These concepts of functional
advantages turned out to be critical for developing the paradigm of protein disorder.
Much controversy surrounded the structural state of prothymosin alpha (ProTa), a
small acidic protein of 109 amino acids in length. Though its exact function was—and
still is—not known, its evolutionary conservation and wide tissue distribution suggested
an essential biological role. Gel-filtration experiments suggested an apparent molar mass
five times greater than that calculated from the amino acid sequence, whereas sedimen-
tation equilibrium measurements gave the correct molecular mass (Haritos et al. 1989).
Proton NMR and CD suggested a disordered chain (Watts et al. 1990), whereas unusual
sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) mobility was
interpreted in terms of the protein being a stable dimer (Cordero et al. 1992). The con-
troversy of the physical state of ProTa was eventually settled by a detailed investigation
that combined SAXS, dynamic light scattering (DLS), mass spectrometry (MS), and
CD (Gast et al. 1995). The results clearly indicated that ProTa is monomeric but adopts
a random coil-like conformation with no regular secondary structure. RS (30.7 Å) and
RG (47.6 Å) are 1.77 and 3.42 times larger than those expected for a compactly folded
protein of its length. A cautious note of generalization has been worded by the claim
“the finding that a biologically active protein molecule with 109 amino acid residues
adopts a random coil conformation under physiological conditions raises the question
whether this is a rare or a hitherto-overlooked but widespread phenomenon in the field
of macromolecular polypeptides.” However, the major impact of this work resulted from
its title, which put it in plain terms “Prothymosin Alpha: A Biologically Active Protein
with Random Coil Conformation.”
A variety of techniques have been used in the conformational analysis of human
α-synuclein (Weinreb et al. 1996), also known as the non-Aβ component of Alzheimer’s
disease amyloid plaque (NACP, see Chapter 15, Section 15.3.1 and Section 15.3.2.1). An
elongated shape of the protein was indicated by its much larger RS and slower sedi-
mentation than a globular protein of similar MW. CD and FTIR indicated the absence
of significant amounts of secondary structure, whereas CD and ultraviolet (UV) spec-
troscopy suggested the lack of a hydrophobic core. Its conformational properties were
unchanged by boiling and were insensitive to denaturants. It was suggested that NACP
exists as a mixture of rapidly equilibrating extended conformers and that it probably
represents the emerging class of “natively unfolded” proteins. Again, the impact of the
study stems from the title “NACP, a Protein Implicated in Alzheimer’s Disease and
Learning, is Natively Unfolded.”
The functional importance of the unfolded state was raised in the case of the bacte-
rial transcription regulator FlgM (Plaxco and Gross 1997). This bacterial protein is an
inhibitor of the transcription factor σ28, and its function is to down-regulate the synthe-
sis of flagellar proteins when the assembly of the flagellum is completed. The protein is
depleted from cells by being transported through the central channel of flagella, which
are not completed and capped yet. Once flagella are ready, FlgM gets trapped in the
cell; it inhibits σ28 and shuts down expression of flagellar proteins. The argument for
its disorder comes from the fact that the channels of flagella are too narrow for FlgM
to wriggle through, unless it is in an unfolded state. Although this idea might be chal-
lenged, connecting function with disorder of a protein certainly had a significant impact
on the field, as signified by the title again: “The Importance of Being Unfolded.” This
view was supported by NMR studies of the protein (Daughdrill et al. 1997; Daughdrill,
Hanely, and Dahlquist 1998).
An important element of the functionality of IDPs was brought up by Wright and
colleagues upon studying the cyclin-dependent kinase (Cdk) inhibitor p21Cip1, a protein
important for the p53-dependent control of cell cycle (Kriwacki et al. 1996). Not only
was it shown by proteolytic mapping, CD spectroscopy, and NMR that the binding
domain of p21Cip1 lacks stable secondary or tertiary structure in the unbound state, it
was also demonstrated that the protein adopts a stable conformational state when it
binds its partner, Cdk2. It was suggested that the induced folding process enables p21Cip1
to bind and inhibit a diverse family of cyclin-Cdk complexes, including Cyclin A-Cdk2,
Cyclin E-Cdk2, and Cyclin D-Cdk4. Thus, structural disorder was possibly associated
with binding promiscuity, as suggested explicitly in the title “Conformational Disorder
Mediates Binding Diversity.”
function and the “lack” of structure demanded that the structure-function paradigm be
reexamined. Through a variety of examples disordered regions were demonstrated to be
frequently found in proteins involved in DNA and RNA binding, transcription, transla-
tion, cell-cycle regulation, and membrane fusion, and even in amyloid formation. The
examples pointed to the involvement of unstructured proteins in regulatory functions,
in which the lack of structure might confer functional advantages.
The analysis of amino acid preferences of “natively unfolded” proteins provided
some additional insight into the characteristics of disordered proteins (Uversky,
Gillespie, and Fink 2000a). It was demonstrated that these “natively unfolded” proteins
are specifically localized within a unique region of the net charge–mean hydrophobicity
phase space, which indicated that a combination of low overall hydrophobicity and large
net charge is primarily responsible for their inability to fold into well-defined structures.
This observation forms the basis of many bioinformatic predictors (Chapter 9) as well
as our understanding of the physical principles underlying disorder.
The transition in concept was solidified by many other contributions. The most
critical elements of the new view have been the following:
These achievements have been reviewed and discussed in many excellent reviews
(Demchenko 2001; Dunker et al. 2002; Dunker et al. 2001; Dyson and Wright 2002a,
2005; Fink 2005; Tompa 2002, 2003a, 2005; Uversky 2002a,b; Uversky, Oldfield, and
Dunker 2005; Wright and Dyson 1999). Their message is clear: The success of the field
and rapid progress in diverse directions lead to an ever more complete understanding
of the interplay between structure and function of proteins that do not fold into well-
defined 3-D structures. Detailed studies of such “intrinsically unstructured” (IUP),
“intrinsically disordered” (IDP), or “natively unfolded” (NU) proteins or intrinsically
disordered regions (IDR) drive the development of a novel structure-function paradigm
that can encompass all distinct structural states of proteins.
3 Indirect Techniques for Recognizing and Characterizing Protein Disorder
This chapter covers techniques that provide the first line of evidence on the unusual
structural state of disordered proteins. These simple techniques are usually applicable
on full-length proteins, and they are considered indirect, because they do not directly
provide structural information but suggest a behavior from which the disordered nature
of the protein can be inferred.
[Figure 3.1 gel image; MW marker labels: 67.0, 55.6, 42.7, 36.5, 26.6, 20.0]
Figure 3.1 Heat stability and anomalous SDS-PAGE mobility of IDPs. This SDS-PAGE
demonstrates both heat stability and anomalous SDS-PAGE mobility of IDPs. The superna-
tants of a globular control (BSA) and two IDPs (human calpastatin domain 1, CST, and the
juvenile form of microtubule-associated protein 2, MAP2c) were run on the gel without (–)
or with (+) heat treatment at 100°C for 10 min and subsequent centrifugation. BSA
precipitates, whereas IDPs stay in solution under these conditions. The apparent MW of the IDPs
is much higher than their absolute MW determined from their sequence: 25 kDa vs. 15 kDa
for CST and 67 kDa vs. 50 kDa for MAP2c.
resistance of IDPs resides in their unusual amino acid composition; that is, they are
highly charged and have a low content of hydrophobic amino acids (Dunker et al. 2001;
Uversky, Gillespie, and Fink 2000a), due to which they do not expose hydrophobic resi-
dues that would make them aggregate at elevated temperatures.
The possible generality of the phenomenon was addressed in the study of Kim
and colleagues, who characterized the heat stability of proteins in cell extracts and
found that 20 and 70 wt% of total proteins are heat-resistant in Jurkat T-cell lysates
and human serum, respectively (Kim et al. 2000b). The heat-stable proteins are, in
many cases, disordered. The correlation of heat resistance and disorder is also under-
lined by a few proteomic studies of disorder (see Chapter 7), in which an initial heat
treatment was used to enrich cellular extracts for IDPs, as confirmed by subsequent
mass spectrometry (MS) identification of proteins (Csizmok et al. 2006; Galea et al.
2006).
Whereas this technique is undoubtedly simple and effective in isolating and
identifying potential IDPs, it should never be neglected that although the protein
survives boiling, it may suffer irreversible chemical changes during the treatment. To a
first approximation, the structural ensemble of IDPs undergoes a reversible change at
elevated temperatures and returns to its native conformational state upon returning to
ambient temperature. For example, the structure and function of MAP2 prepared with
and without heat treatment showed no critical differences (Hernandez et al. 1986). In
a similar comparative study, the heat-treated and untreated caldesmon also showed
no detectable differences in certain functions (Bretscher 1984; Lynch et al. 1987) but
showed significant alterations in CaM-binding (Zhuang, Mabuchi, and Wang 1996).
It should also be borne in mind that most proteins are neither fully ordered nor fully
disordered but contain ordered and disordered regions at different ratios. Even if such
a protein survives heating as a whole, the ordered part may be irreversibly damaged.
In addition, even if the conformation of the protein is largely unchanged, several of its
residues may undergo chemical conversion that might compromise function. Among
many possibilities, deamidation of Gln and Asn and oxidation of Cys and Met residues
are the most trivial.
3.5 Limited proteolysis and local structure
Proteolysis under controlled conditions leads to a partial degradation of the substrate,
which may be used to probe into the structure of IDPs (see Chapter 12, Section 12.2.2).
This approach has provided ample insight into the structural topology of mostly ordered
proteins, delineating their flexible segments. It is much less appreciated that at very low
protease concentrations IDPs also undergo limited proteolysis, which implies that their
structural organization is not fully random. In the case of caldesmon (Marston and Redwood
1991), nucleoplasmin (Dingwall et al. 1987), the TAD of GCN4 (Hope, Mahadevan, and
Struhl 1988), CREB KID (Richards et al. 1996), stathmin (Redeker et al. 2000), BRCA1
(Mark et al. 2005), calpastatin (Csizmok et al. 2005), tau (Steiner et al. 1990), and
MAP2 (Wille, Mandelkow, and Mandelkow 1992b), the location of preferential cleav-
age site(s) is not random but correlates with their organization into larger functional and
possibly structural segments. An appealing interpretation of these observations is that
transient short- and/or long-range structural organization ensures preferential spatial
exposure of certain regions in these IDPs.
heat required to increase the temperature of a sample relative to that of a reference, from
which heat capacity as a function of temperature can be determined (Privalov 1979,
1982). The technique is particularly sensitive to heat-capacity changes accompanying
phase transitions, such as the temperature-induced unfolding of a globular protein. The
unfolding event appears as a heat-absorption curve, which, for a single-domain protein,
signals the cooperative melting of the structure. In the case of multidomain proteins,
individual melting peaks overlap and their deconvolution can provide information on
the domain structure of the protein. The two basic parameters derived from the melting
curve are the transition (or melting) temperature Tm and the enthalpy of melting (see
Figure 3.2).
It intuitively follows that the absence of such a cooperative transition may signal the
lack of globularity and, indirectly, disorder (Receveur-Brechot et al. 2005). DSC was used
to demonstrate disorder in an acid-denatured globular protein, alpha-fetoprotein (AFP),
which lacks a stable fold and behaves as a molten globule (MG) (Uversky et al. 1995).
In the case of bona fide IDPs, there are only a few relevant studies, such as in the case
of Df31 (Figure 3.2), a Drosophila protein of chromatin decondensation and remodeling
activities (Szollosi et al. 2008); the nuclear co-activator binding domain (NCBD) of CBP
(Demarest et al. 2004); the carboxy-terminal domain (CTD) of caldesmon (Permyakov
et al. 2003); α-synuclein, β- and κ-casein, and tau protein (Syme et al. 2002); rGmD-19,
a soybean group 1 LEA protein (Soulages et al. 2002); and the N-terminal prion domain
[Figure 3.2 plot; axes: Cp (kcal/mole °C) versus Temperature (°C); curves: Lysozyme, Df31]
Figure 3.2 DSC of Df31 and lysozyme. The DSC curve of intrinsically disordered Df31 and
globular lysozyme was recorded, and the change in heat capacity was calculated from the
observed flow of heat. The curves show the distinct behavior of the two proteins: Lysozyme
undergoes a cooperative structural transition (melting) with a Tm of 72°C, whereas Df31
lacks such a transition, which suggests its disordered structural state. Reproduced with per-
mission from Szollosi et al. (2008), J. Proteome Res. 7, 2291–9. Copyright by the American
Chemical Society.
of Ure2p (Baxa et al. 2004). DSC can also be used to study structural features of IDPs
in more subtle ways, as demonstrated by the next two examples.
and entropy (ΔS) can be calculated. The sample is titrated with aliquots of the ligand,
causing heat to be absorbed or released, which is measured by maintaining the sample
and a reference cell at the same temperature. Heat flow spikes are then integrated to
yield the total heat effect per injection, from which the thermodynamic parameters
can be calculated (see Figure 3.3). The actual parameters may suggest and characterize
disorder, as demonstrated by two examples.
[Figure 3.3 plot; axes: Time (min), Q (kcal × mol–1)]
Figure 3.3 ITC titration of an SH3 domain with a Pro-rich peptide. The Sem-5 C-SH3
domain was titrated with the SosY peptide (Ac-VPPPVPPRRRY-NH2). The power (in µcal/
sec) needed to maintain the reference and sample cells at identical temperatures is mea-
sured, from which molar heat (Q) is calculated. The thermodynamic parameters obtained
by fitting the data suggest large negative enthalpy and entropy changes of binding, which
contradict the sizeable hydrophobic surface buried upon the interaction and suggest that
the peptide is disordered before binding (see Section 3.7.1 for details). Reproduced with
permission from Ferreon and Hilser (2004), Biochemistry 43, 7787–97. Copyright by the
American Chemical Society.
of binding of these two motifs and the role of the formation of LH. To this end, the
interactions of KID with Cyclin A (mediated by domain 1), Cdk2 (mediated by domain
2), and the Cyclin A-Cdk2 binary complex were separately characterized by ITC (Lacy
et al. 2004).
It was found that all three binding reactions are driven by enthalpy, which over-
comes a large unfavorable decrease in entropy. Binding to Cyclin A (ΔG = –10.4 kcal
mol–1) is slightly more favorable than binding to Cdk2 (ΔG = –9.8 kcal mol–1), with
a very large entropic penalty for the latter, which reflects that the extent of folding
upon binding is very different (estimated from the length of domains to be 29 residues
vs. only 10). Binding of KID to the binary complex is much stronger than that of either
of its fragments (ΔG = –11.6 kcal mol–1), which indicates that both domains favorably
contribute to binding. A very large entropic penalty (–TΔS = +28.6 kcal mol–1) suggests
the extensive ordering of KID upon binding. It was estimated that binding of KID to
Cyclin A is accompanied by folding of about 34 residues, which corresponds to both
domain 1 (~12 residues) and the linker helix (~22 residues). The value for binding at
Cdk2 is about 59 residues, which is accounted for by folding both domain 2 (~30 resi-
dues) and the linker helix (~22 residues). These data suggest that binding of KID to either
partner is accompanied by folding of the partially folded linker region. KID binding is
initiated by domain 1, followed by wrapping around and binding of domain 2 (Lacy
et al. 2004). Ordering of the linker helix is induced upon—and not prior to—binding, in
accordance with previous mutagenesis studies (Bienkiewicz, Adkins, and Lumb 2002),
which showed that stabilization of the helix kinetically hinders formation of the com-
plex (see also Chapter 14, Section 14.3.1).
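The thermodynamic bookkeeping behind these ITC numbers can be sketched in a few lines. The example below uses the relation ΔG = ΔH – TΔS to recover the binding enthalpy from the quoted ΔG and –TΔS for the KID/binary-complex interaction, and converts ΔG into a dissociation constant. The temperature of 298.15 K, the 1 M standard state, and the function names are assumptions made for illustration; the text does not state the experimental temperature.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal mol^-1 K^-1


def enthalpy_from_itc(delta_g, minus_t_delta_s):
    """Return dH in kcal/mol from dG and the entropic term -T*dS.

    Rearranged from dG = dH - T*dS, i.e. dH = dG - (-T*dS).
    """
    return delta_g - minus_t_delta_s


def dissociation_constant(delta_g, temperature_k=298.15):
    """Return Kd in mol/L from the binding free energy (kcal/mol).

    Uses dG = RT ln(Kd) with a 1 M standard state; the default
    temperature is an assumption, not a value given in the text.
    """
    return math.exp(delta_g / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Values quoted in the text for KID binding to Cyclin A-Cdk2.
    dg, minus_tds = -11.6, 28.6
    print(f"dH ~ {enthalpy_from_itc(dg, minus_tds):.1f} kcal/mol")
    print(f"Kd ~ {dissociation_constant(dg):.2e} M (at the assumed 298.15 K)")
```

With the quoted values, the enthalpy comes out at about –40.2 kcal mol–1, which illustrates how a strongly favorable enthalpy must compensate the large entropic penalty attributed to folding upon binding.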
[Figure 4.1, panels A and B; axis labels: Volume (ml), Radius (cm), Residuals]
extended, random coil–type IDPs elute at an apparent MW that is 4–6 times their real
value (Csizmok et al. 2006).
It should be noted that an unexpectedly high apparent MW of a protein can also be
interpreted in terms of an oligomeric structure, as suggested in the case of Df31, for
example (see Chapter 3, Section 3.8) (Crevel and Cotterill 1995). Disorder and oligomeric
state can be distinguished by elution at high ionic strength and/or in the presence of dena-
turants, such as 8M urea, which can also provide evidence about the possible residual
structure of an IDP. For example, the RS of the carboxy-terminal domain (CTD) of calde-
smon determined by GF (Permyakov et al. 2003) increases slightly in the presence of 6M
Gnd-HCl (28.1 Å in buffer and 35.3 Å in Gnd-HCl), which suggests that the protein is in
a PMG state (estimated values for the same MW are: globular, 19.1 Å; MG, 21.7 Å; PMG,
27.4 Å; and random coil, 34.4 Å). The absolute MW of the protein can also be determined
by analytical ultracentrifugation (see Figure 4.1B and Section 4.3).
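The comparison made for the caldesmon CTD can be expressed as a simple lookup. The sketch below takes the four estimated RS values quoted above and reports which reference state lies closest to a measured RS. The nearest-value rule and the function name are assumptions chosen for illustration; they are not presented in the text as the method used in the cited study.

```python
# Estimated Stokes radii (angstroms) quoted in the text for a protein
# of the same MW as the caldesmon CTD, one per conformational state.
REFERENCE_RS = {
    "globular": 19.1,
    "MG": 21.7,
    "PMG": 27.4,
    "random coil": 34.4,
}


def closest_state(rs_measured, reference=REFERENCE_RS):
    """Return the state whose estimated R_S is nearest to the measurement.

    Nearest-value matching is an illustrative assumption, not a
    published classification procedure.
    """
    return min(reference, key=lambda state: abs(reference[state] - rs_measured))


if __name__ == "__main__":
    for label, rs in (("buffer", 28.1), ("6M Gnd-HCl", 35.3)):
        print(f"{label}: R_S = {rs} A -> closest to {closest_state(rs)}")
```

Applied to the two measurements quoted above, the value in buffer falls nearest the PMG estimate and the value in Gnd-HCl nearest the random-coil estimate, in line with the interpretation given in the text.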
GF has provided many observations on the unusual hydrodynamic behavior of
proteins and has basically contributed to the development of the concept of protein
disorder. It has been applied in the case of DARPP-32 (Hemmings et al. 1984), α- and
β-thymosin (Haritos et al. 1989), caldesmon (Lynch et al. 1987; Permyakov et al. 2003),
PKI (Thomas et al. 1991), prothymosin alpha (ProTa) (Cordero et al. 1992), Df31 (Crevel
and Cotterill 1995), α-synuclein (Weinreb et al. 1996), deoxyribonucleic acid (DNA)
4 • Hydrodynamic Techniques 45
repair protein XPA (Iakoucheva et al. 2001b), Nup2p (Denning et al. 2002), AavLEA1
(Goyal et al. 2003), measles virus NTAIL (Longhi et al. 2003), intracellular domain
of gliotactin (Zeev-Ben-Mordehai et al. 2003), IF7 of glutamine synthetase (Muro-Pastor
et al. 2003), T-cell receptor zeta cytD (Sigalov, Aivazian, and Stern 2004), securin
(Sanchez-Puig, Veprintsev, and Fersht 2005), glutamic acid-rich proteins (GARP)
of rod photoreceptors (Batra-Safferling et al. 2006), and the C/EBP homolog CHOP
(Singh et al. 2008), for example.
\[ D = \frac{k_B T}{6 \pi \eta R_S} \tag{4.1} \]
where u is the observed radial velocity of the macromolecule, ω is the angular velocity
of the rotor, r is the radial position, ω²r is the centrifugal field, M is the molar mass, ν is
the partial specific volume, ρ is solvent density, NA is Avogadro’s number, f is the frictional
coefficient, and D is the diffusion coefficient. From D, RS can be calculated by the
Stokes–Einstein equation (Eq. 4.1). Values of s are commonly expressed in Svedberg
(S) units, which correspond to 10⁻¹³ s. Correcting experimental s values to a
standard state (water at 20°C) yields the standard, easily comparable corrected
s20,w. An important parameter characterizing the size/shape of a protein is the
ratio of the maximum s value (that of a sphere of the given MW) to the actually observed
value, ssphere/s20,w, which is equal to the ratio of the experimentally observed frictional
coefficient to the minimum frictional coefficient expected for a sphere (f/f0). This ratio
characterizes the shape asymmetry of the molecule, which describes its possible devia-
tion from globularity. SV is also very useful in the identification of the oligomeric state
and the stoichiometry of heterogeneous interactions.
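As a numerical illustration of Eq. 4.1 and of the frictional ratio f/f0, the following sketch converts between D and RS and compares RS with the radius of an anhydrous sphere of the same mass. The default temperature and viscosity (water at 20°C), the partial specific volume of 0.73 ml/g, the 20 kDa example mass, and all function names are assumptions chosen for illustration; the sphere radius uses Stokes' law (f = 6πηR) and simple geometry rather than any formula reproduced in this section.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro's number, 1/mol


def stokes_radius(d, temperature=293.15, viscosity=1.002e-3):
    """R_S (m) from D (m^2/s) via Eq. 4.1; defaults assume water at 20 °C."""
    return K_B * temperature / (6.0 * math.pi * viscosity * d)


def diffusion_coefficient(r_s, temperature=293.15, viscosity=1.002e-3):
    """D (m^2/s) from R_S (m); the same relation solved for D."""
    return K_B * temperature / (6.0 * math.pi * viscosity * r_s)


def sphere_radius(molar_mass, v_bar=0.73e-3):
    """Radius (m) of an anhydrous sphere; molar_mass in kg/mol, v_bar in m^3/kg.

    v_bar = 0.73e-3 m^3/kg (0.73 ml/g) is an assumed typical protein value.
    """
    volume = molar_mass * v_bar / N_A
    return (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)


def frictional_ratio(r_s, molar_mass, v_bar=0.73e-3):
    """f/f0: with f = 6*pi*eta*R, this is R_S over the equivalent-sphere radius."""
    return r_s / sphere_radius(molar_mass, v_bar)


# Hypothetical example: a 20 kDa protein (20 kg/mol) with R_S = 28.1 Å.
print(diffusion_coefficient(28.1e-10))
print(frictional_ratio(28.1e-10, 20.0))
```

A ratio close to 1 indicates a compact, near-spherical particle, whereas larger values point to asymmetry or expansion.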
In SE, at centrifugal fields lower than those generally used in SV, sedimen-
tation is eventually balanced by diffusion opposing the concentration gradient,
resulting in a time-invariant concentration profile (Figure 4.1B). This experiment
is insensitive to the shape of the protein and directly reports on its MW and, for
chemically reacting mixtures, on chemical equilibrium constants. Thus, analysis
of SE data can also yield valuable thermodynamic and stoichiometric information
on the interaction of molecules, and it is often used for studying self-association
\[ I(s) = 4\pi \int_0^{D_{max}} r^2 \gamma(r) \frac{\sin(sr)}{sr} \, dr \tag{4.3} \]
where γ(r) is the spherically averaged autocorrelation function of the excess scattering
density, which is zero for distances exceeding the maximum particle diameter, Dmax. From
this relation, the histogram of interatomic distances within the molecule (i.e., the dis-
tance-distribution function p(r)) can be computed by the inverse Fourier transformation:
\[ p(r) = \frac{r^2}{2\pi^2} \int_0^{\infty} s^2 I(s) \frac{\sin(sr)}{sr} \, ds \tag{4.4} \]
In principle, p(r) contains the same information as the scattering intensity. This
real-space representation is more intuitive, however, and information about the particle
shape can often be deduced by simple visual inspection.
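Equation 4.3 can be checked numerically on a particle whose scattering is known in closed form. The sketch below assumes a homogeneous sphere, for which the normalized autocorrelation function and the form factor are standard textbook results; it is not taken from this section, and the function names and the 20 Å radius are illustrative choices.

```python
import math


def sinc(x):
    """sin(x)/x with the limiting value 1 at x = 0."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x


def gamma_sphere(r, d_max):
    """Normalized autocorrelation function of a homogeneous sphere of diameter d_max."""
    if r >= d_max:
        return 0.0
    x = r / d_max
    return 1.0 - 1.5 * x + 0.5 * x ** 3


def intensity(s, gamma, d_max, n=2000):
    """Eq. 4.3 by the trapezoidal rule: I(s) = 4*pi * integral of r^2 gamma(r) sin(sr)/(sr) dr."""
    h = d_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * r * r * gamma(r, d_max) * sinc(s * r)
    return 4.0 * math.pi * total * h


def sphere_form_factor(s, radius):
    """Analytical normalized scattering of a homogeneous sphere, P(0) = 1."""
    x = s * radius
    if x < 1e-6:
        return 1.0
    return (3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3) ** 2


radius = 20.0                                  # Å, hypothetical particle
volume = 4.0 / 3.0 * math.pi * radius ** 3
for s in (0.0, 0.05, 0.1):                     # Å^-1
    print(s, intensity(s, gamma_sphere, 2 * radius) / volume, sphere_form_factor(s, radius))
```

With γ(r) normalized to γ(0) = 1, the integral equals the particle volume times the normalized form factor, so the two printed columns should agree closely.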
[Figure 4.2 image not reproduced. Panel A: log(I) versus s² (nm⁻²), traces for tau and lysozyme. Panel B: I·s² versus s (nm⁻¹), traces for tau and lysozyme.]
Figure 4.2 SAXS characterization of tau protein. Comparison of the SAXS data of tau
protein and globular lysozyme. (A) The Guinier plot of tau is curved, indicating that no
defined RG can be assigned to this IDP. In contrast, the scattering curve of globular lysozyme
follows a straight line. (B) The Kratky plot of tau increases monotonically, which typifies a
fully disordered IDP. The hump on the lysozyme curve is characteristic of a globular struc-
ture. Reproduced with permission from Schweers et al. (1994), J. Biol. Chem. 269, 24290–7.
Copyright by the American Society for Biochemistry and Molecular Biology.
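The two representations in Figure 4.2 are straightforward to compute from a scattering curve. The sketch below assumes the standard Guinier approximation, I(s) ≈ I(0)·exp(−s²RG²/3), which is not written out in this section; the function names and the synthetic data are hypothetical. The linear fit is only meaningful when the Guinier plot is actually straight, which, as the caption notes, is not the case for tau.

```python
import math


def guinier_rg(s_values, intensities):
    """Fit ln I against s^2 by least squares; under the Guinier approximation slope = -Rg^2/3.

    Returns (Rg, I0). Only meaningful if the Guinier plot is linear over the fitted range.
    """
    xs = [s * s for s in s_values]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope: no radius of gyration can be assigned")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def kratky(s_values, intensities):
    """Return (s, I*s^2) pairs, the coordinates of a Kratky plot."""
    return [(s, i * s * s) for s, i in zip(s_values, intensities)]


# Synthetic, noise-free data for a hypothetical particle with Rg = 2.75 nm.
s_axis = [0.05 + 0.01 * k for k in range(10)]                    # nm^-1
curve = [100.0 * math.exp(-(s * 2.75) ** 2 / 3.0) for s in s_axis]
print(guinier_rg(s_axis, curve))
print(kratky(s_axis, curve)[:3])
```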
(Paz et al. 2008). In some instances, a more advanced SAXS approach, or the combination
of SAXS with other analyses to limit the number of solutions, has been used, as
demonstrated by a few of the most influential cases: SAXS and electron paramagnetic
resonance (EPR) (measles virus nucleoprotein); SAXS and molecular dynamics
(MD) (bacterial cellulase and p27Kip1; see Chapter 14, Section 14.12); and SAXS,
NMR, and MD (p53). These cases are discussed next.
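[Figure 4.3 image not reproduced. Panel A: P(r) versus r (Å), with model curves labeled 10 Å, 60 Å, and 120 Å. Panel B: structures labeled 10 Å, 30 Å, 50 Å, and 80 Å.]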
Figure 4.3 Structural ensemble of the linker region of bacterial cellulase. A chimeric cellulase
was constructed from the catalytic domains of bacterial cellulases Cel6A and Cel6B,
which are connected by a linker that is two times the length of the original. (A) The interatomic
distance-distribution function (P(r)) of the structural ensemble of the chimera was
determined by SAXS (trace with full-circle symbols). The P(r) profiles of models with intermodule
separations of 10 Å, 60 Å, and 120 Å are also shown for comparison. MD simulations
and modeling of the P(r) profile suggest a continuous distribution from a compact
to a fully extended (linker stretched to 120 Å) state. (B) The α-carbon views of four typical
molecular structures that were used for the weighted summation. Reproduced with permis-
sion from von Ossowski et al. (2005), Biophys. J. 88, 2823–32. Copyright by Elsevier Inc.
its negative sense, single-stranded genome packaged into a helical nucleocapsid by the
viral nucleoprotein (N). Transcription and replication of the viral genome requires the
action of RNA-dependent RNA polymerase (L) in association with another factor, phos-
phoprotein (P) (Curran and Kolakofsky 1999). In principle, N can self-assemble on
cellular RNA in the absence of viral RNA, and the interaction of P with N prevents this
illegitimate formation of nucleocapsid-like particles. The soluble N-P complex is the
substrate of L polymerase, which initiates the encapsidation of genomic RNA.
The region responsible for self-assembly and RNA-binding is NCORE, whereas inter-
action with P is mediated by the tail region NTAIL. NTAIL protrudes from the globular
region and carries three short recognition elements (Box 1, 2, and 3) that are important
for function. NTAIL has been shown to be disordered by many techniques (Longhi et al. 2003), a finding
corroborated by SAXS, which suggests an RG = 27.5 Å, compared to 15 Å expected for a
globular protein. By this criterion, NTAIL is largely, but not fully, disordered, because
its RG is smaller than that of a random coil-like chain (35–38 Å), its Kratky curve has a
bump, and its p(r) shows a maximum dimension of 120–130 Å, which is smaller than
that of a fully disordered random structure.
SAXS also provided structural details on the interaction of NTAIL with its binding
region within P, the XD domain (Bourhis et al. 2005). Box 2 of about 12 amino
acids is the primary site of interaction undergoing induced folding to a local α-helix
conformation. By SAXS, XD is globular with the expected RG (12.1 Å) and maximum
diameter (41 Å). The RG of the XD-NTAIL complex is much larger (32.7 Å), although
its MW is not much bigger than that of XD. Thus, the complex is not compact and is
highly anisotropic, which is also underscored by a maximum at 20 Å, a shoulder at
30 Å, and a long tail up to 146 Å on the p(r) function. The overall envelope of the
complex has a globular cluster and an elongated protuberance of varying shape, corresponding
to the N-terminal segment of NTAIL. Box 3, which also contributes to binding,
tends to point out in different solvent-exposed conformations, without folding to
a stable structure (as also demonstrated by EPR spectroscopy; see Chapter 5, Section
5.6). Overall, disorder of NTAIL and the observed binding mode might play important
roles in the processivity of virus replication, in which continuous rearrangement of
complexes takes place.
relative to CBD. Its detailed analysis was enabled by a fusion construct composed of
the N-terminal half of cellulase, Cel6A, and the C-terminal half of cellulase, Cel6B
(von Ossowski et al. 2005). The broad shoulder on the p(r) profile (Figure 4.3) is indic-
ative of a distribution of conformations with a Dmax of 178 Å and the linker adopting all
the possible separations between the two globular domains. MD simulations of 1,000
linker conformations fitted to the experimental data show that the linker preferentially
samples compact states, but it is able to undergo extension with a relatively low energy
cost (i.e., it can position the catalytic module and the CBD at a distance comparable
to one cellobiose unit, yet enabling processivity [see Chapter 14, Section 14.9] of cel-
lulose degradation).
4.4.3 p53
SAXS results can be effectively combined with NMR residual dipolar coupling (RDC)
data to obtain a self-consistent structural model of an IDP that combines both long-
and short-range structural features (Bernado et al. 2005). In the analysis, intrinsic conformational
sampling of an IDP, based on Φ,Ψ angles obtained from loop regions of
folded proteins, is used to restrain SAXS calculations. This approach was applied to
model both the bound and free forms of the tetrameric tumor suppressor p53 (Wells
et al. 2008). As described in detail in Chapter 15, Section 15.1.2, p53 is a tumor-sup-
pressor transcription factor of 393 amino acids, composed of four structural-functional
domains: a trans-activator domain (TAD) subdivided into TAD1 (1–40), TAD2 (40–61),
and a Pro-rich region (PRR, 64–92); a core DNA-binding domain (DBD, 93–293); a
tetramerization domain (TD, 325–356); and a regulatory domain (RD, 367–393). Its
modeling was achieved in three stages, which led to one of the major structural achieve-
ments of the IDP field (see Figure 15.2 and cover picture).
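The domain boundaries listed above lend themselves to a simple lookup table, which also makes the unassigned linker segments explicit. This is a minimal sketch: the residue ranges are copied from the text, whereas the names `P53_REGIONS`, `regions_at`, and `unassigned_segments` are hypothetical helpers introduced for illustration.

```python
P53_LENGTH = 393

# (name, first residue, last residue), as listed in the text.
P53_REGIONS = [
    ("TAD1", 1, 40),
    ("TAD2", 40, 61),
    ("PRR", 64, 92),
    ("DBD", 93, 293),
    ("TD", 325, 356),
    ("RD", 367, 393),
]


def regions_at(residue):
    """Return the names of all listed regions that contain the given residue number."""
    if not 1 <= residue <= P53_LENGTH:
        raise ValueError("residue number outside 1-393")
    return [name for name, start, end in P53_REGIONS if start <= residue <= end]


def unassigned_segments():
    """Return (start, end) runs of residues not covered by any listed region."""
    covered = {pos for _, start, end in P53_REGIONS for pos in range(start, end + 1)}
    segments, run_start = [], None
    for pos in range(1, P53_LENGTH + 2):       # one step past the end closes a final run
        free = pos <= P53_LENGTH and pos not in covered
        if free and run_start is None:
            run_start = pos
        elif not free and run_start is not None:
            segments.append((run_start, pos - 1))
            run_start = None
    return segments


print(regions_at(100))          # ['DBD']
print(regions_at(40))           # ['TAD1', 'TAD2'] (the quoted ranges share residue 40)
print(unassigned_segments())    # [(62, 63), (294, 324), (357, 366)]
```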
The structure of folded domains was solved by X-ray crystallography and NMR,
and the quaternary structure of the protein was delineated by verifying the tertiary
structures of individual domains in the intact protein, identifying domain–domain
interactions by NMR, and determining the arrangement of domains in the full-length
protein by SAXS. In the final stage of generating the structure, NMR RDC and
SAXS data were combined with MD simulations (Bernado et al. 2005) to elucidate
the ensemble of structures of the functionally important IDRs (TAD and RD) within
the full-length tetramer. In complex with a specific DNA element, the folded domains
are well-aligned and form a rather rigid single mass. The high degree of order of
this region is efficiently propagated into PRR, due to the relatively long persistence
length of the latter. In TADs, the local orientation sampling is less correlated due
to their much shorter persistence length. The relative stiffness of PRRs projects the
ensemble of TADs away from the main body of the protein, supporting a predominantly
structural role of PRR. In the largely disordered TAD, a transient helix at the
MDM2-interacting site (see Chapter 10, Section 10.2.3.6) is apparent. In the absence
of DNA, the entire p53 structure is much more dynamic, with core domain dimers
attached to tetramerization domains by flexible linkers and an apparent decoupling
of the effective alignment of TAD and PRR, with this region experiencing more iso-
tropic behavior.
[Figure image not reproduced. Recoverable axis labels: Chemical Shift [ppm]; Diffusion Gradient [%].]
fitting signal intensity as a function of gradient strength yields a decay rate, which is
proportional to the diffusion coefficient D. From D, RS and RH can be calculated by the
Stokes–Einstein equation (Eq. 4.1). Absolute values of D can only be obtained if the
temperature and viscosity of the solution are precisely controlled, and an internal radius
standard (e.g., dioxan) (Wilkins et al. 1999) is usually used instead, which provides the
effective RH of the protein by the following equation:
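The equation referred to above is not reproduced in this text. Independently of its exact form, Eq. 4.1 implies that D is inversely proportional to the hydrodynamic radius at fixed temperature and viscosity, so the ratio of decay rates measured for the standard and the protein in the same sample gives the ratio of their radii. The sketch below rests on two assumptions that are not taken from this section: a Gaussian decay of intensity with gradient strength, I(g) = I0·exp(−d·g²), and a reference radius of 2.12 Å for dioxane, a commonly quoted value that should be verified before use. All names and data are hypothetical.

```python
import math


def decay_rate(gradients, intensities):
    """Least-squares estimate of d, assuming I(g) = I0 * exp(-d * g^2)."""
    xs = [g * g for g in gradients]
    ys = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def relative_radius(d_protein, d_reference, r_reference):
    """R_H of the protein from the ratio of decay rates (D is proportional to 1/R_H, Eq. 4.1)."""
    return r_reference * d_reference / d_protein


# Synthetic, noise-free signals; gradient strength in % of maximum.
gradients = [10.0 * k for k in range(1, 11)]
dioxane = [math.exp(-4e-4 * g * g) for g in gradients]
protein = [math.exp(-4e-5 * g * g) for g in gradients]

R_DIOXANE = 2.12  # Å, assumed reference value
print(relative_radius(decay_rate(gradients, protein), decay_rate(gradients, dioxane), R_DIOXANE))
```

Because both species experience the same temperature and viscosity, these quantities cancel in the ratio, which is the practical advantage of an internal standard.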
5.1 X-ray crystallography
A description of the methodology of X-ray crystallography is beyond the scope of this
book, but is covered in excellent reviews and monographs (Drenth 2006). X-ray crystallography
can determine the arrangement of atoms in a protein by recording the intensity
and pattern of the X-rays scattered by the electrons within the protein crystal. Diffraction
appears as a pattern of regularly spaced spots known as reflections, from which the
three-dimensional model of electron density can be recovered by Fourier
transformation. The positions of the atomic nuclei are deduced from this electron density in
a manner consistent with the covalent structure (sequence) of the protein.
The structure is characterized by its resolution (down to 1 Å in the best cases) and
by B-factors of atoms (also termed temperature-factor, which describes the degree to
which the electron density is spread out due to either static or dynamic mobility). The
total number of structures solved by X-ray crystallography is about 44,000 today, which
represents the majority of structures (more than 50,000) deposited in the Protein Data
Bank (PDB). Of special significance with respect to disorder is that the position
of an atom often cannot be determined precisely, because of crystal defects or an actual multiplicity
of positions; the atom is then missing from the electron-density map and is termed disordered.
Whereas the underlying assumption is that such positions can ultimately be determined,
the term has also been applied to longer regions missing from the electron-density
map, which may be caused by either ordered parts occupying multiple positions (wobbly
domains) or “intrinsic” disorder, in the sense used throughout this book (Dunker
et al. 2001).
Initial identification of intrinsically disordered regions (IDRs) derived from such
observations, and their impact on the field is shown by the 138 out of about 500 entries in the
DisProt database that list “X-ray crystallography” as a detection method. A
few prominent examples illustrate the importance of such observations. The structure
of a 10-subunit yeast ribonucleic acid (RNA) polymerase II (RNAP II), the enzyme
responsible for transcription of protein-coding genes in eukaryotes, could be solved at
2.8 Å resolution (see Chapter 11, Figure 11.2) (Cramer, Bushnell, and Kornberg 2001).
The 280 amino acid–long carboxy-terminal domain (CTD) of the largest subunit, Rpb1,
is not seen in the structure. This region orchestrates a complex array of reactions in tran-
scription and messenger RNA (mRNA) maturation, and is indispensable in the function
of RNAP II (see Chapter 11, Section 11.2.3). DNA topoisomerase I (Topo I) is a nuclear
enzyme involved in transcription, replication, recombination, chromosome condensa-
tion, chromatin remodeling, and DNA damage recognition. The enzyme catalyzes the
ATP-independent breakage of single-stranded DNA, and it has an N-terminal region of
about 170 amino acids missing from the X-ray structure (Redinbo et al. 1998). This IDR
regulates Topo I activity through phosphorylation, and possible protein–protein inter-
actions with other chromosomal proteins. Calcineurin is a Ca2+-dependent calmodu-
lin (CaM)-stimulated protein phosphatase, involved in many pathways, such as T-cell
signal transduction, apoptosis, Wnt signaling, MAPK signaling, amyotrophic lateral
sclerosis (ALS) pathway, and osteoclast regulation. The enzyme has a 95 amino acid–long
region connecting its two subunits that is missing from the crystal structure (Kissinger
et al. 1995). This region has an autoinhibitory element and a CaM-binding site, which
cooperate in CaM-dependent regulation of the enzyme. Core histones form octamers,
which make up nucleosomes (see Figure 5.1) in complex with DNA, which are the basic
building elements of the chromatin (Luger et al. 1997). Their disordered N-terminal
tails missing from the crystal structure provide multiple functions in epigenetic regula-
tion, by mediating protein–protein interactions and posttranslational modifications (see
Chapter 11, Section 11.4.2.1) (Bhaumik, Smith, and Shilatifard 2007; Hansen, Tse, and
Wolffe 1998).
Structural disorder is rather widespread in the PDB, as shown by Dunker and col-
leagues by comparing data in PDB and the corresponding Swiss-Prot sequences (Le
Gall et al. 2007). The complete Swiss-Prot sequence can be found in the PDB structure
in only 7% of the cases, and more than 95% of the Swiss-Prot sequence in only 25% of
the cases. Thus, a great majority of PDB proteins are shorter than their corresponding
Swiss-Prot sequences, either because the construct has been truncated to enable crystallization
or because the structure contains residues that do not have well-defined coordinates.
Approximately 10% of the PDB proteins contain missing or ambiguous regions
longer than 30 consecutive amino acids, and about 40% of them have shorter regions
(between 10 and 30 residues) missing. The failure of crystallization of a protein may
also point to structural disorder. Of course, crystallization often fails even in the case
of ordered proteins, and thus the lack of its success does not prove disorder. There have
been some notable failures, though, which actually contributed to developing the con-
cept of disorder (see Chapter 2, Section 2.2.5).
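The kind of bookkeeping behind such statistics can be sketched as follows: given the length of the full sequence and the residue numbers that have coordinates, find the runs of missing residues and sort them by the length classes mentioned above (longer than 30, or between 10 and 30). This is a simplified illustration with hypothetical function names and made-up input; it assumes residue numbering that matches the reference sequence, which real PDB entries often do not provide, and it is not the procedure of Le Gall et al.

```python
def missing_runs(sequence_length, observed_residues):
    """Return (start, end) runs of residue numbers absent from the observed set."""
    observed = set(observed_residues)
    runs, run_start = [], None
    for pos in range(1, sequence_length + 2):      # one step past the end closes a final run
        missing = pos <= sequence_length and pos not in observed
        if missing and run_start is None:
            run_start = pos
        elif not missing and run_start is not None:
            runs.append((run_start, pos - 1))
            run_start = None
    return runs


def classify_runs(runs):
    """Split runs into the two length classes used in the text."""
    long_runs = [r for r in runs if r[1] - r[0] + 1 > 30]
    short_runs = [r for r in runs if 10 <= r[1] - r[0] + 1 <= 30]
    return {"longer than 30": long_runs, "10 to 30": short_runs}


# Made-up example: a 200-residue protein with coordinates for 41-150 and 161-195.
observed = list(range(41, 151)) + list(range(161, 196))
runs = missing_runs(200, observed)
print(runs)                  # [(1, 40), (151, 160), (196, 200)]
print(classify_runs(runs))   # {'longer than 30': [(1, 40)], '10 to 30': [(151, 160)]}
```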
5 • Spectroscopic Techniques for Characterizing Disorder 57
Figure 5.1 X-ray structure of the nucleosome. The structure of a core histone octamer
with 146 base pairs of DNA winding around (i.e., the nucleosome [pdb 1kx5]), as solved by
X-ray crystallography (Luger et al. 1997). The N-terminal tails of histones, as curved pieces,
which are the primary sites of posttranslational regulation of chromatin, are missing from
the actual structures.
Trp is the most highly fluorescent. The maximum of Trp excitation spectrum is at
280 nm, whereas its emission depends on its local molecular environment, which can
be exploited in characterizing the disordered state of proteins. Extrinsic fluorophores
are of three basic types. The first comprises covalently attached small molecules, such as fluorescein
and rhodamine isothiocyanate (FITC and RITC). These widely used extrinsic
labels react primarily with Lys and Cys groups of proteins, they have excellent quan-
tum yields, and, due to their long wavelengths of absorption and emission, biological
samples do not interfere with their fluorescence. The second type of extrinsic probe
is the non-covalent 1-anilino-8-naphthalene-sulfonic acid (ANS), which is practically
non-fluorescent in water, but becomes highly fluorescent in an apolar environment.
This makes ANS the classic probe of molten globule (MG) states of proteins, because
its intensity is much higher in the presence of a partially folded protein than of either a fully
folded or a fully unfolded protein (Goldberg et al. 1990; Greene, Wijesinha-Bettoni,
and Redfield 2006). A third type of extrinsic probe, small fluorescent proteins, can
be fused to the protein studied. Their prototype is green fluorescent protein (GFP),
which has a highly visible, efficiently emitting internal fluorophore (Tsien 1998). Since
its introduction into molecular biology and cell biology, GFP and its variants have
become standard markers of gene expression and protein targeting in intact cells and
organisms. Mutations that modulate its emission spectrum gave rise to variants of
different spectra (e.g., GFP CyPet, and YPet), making GFP compatible with fluores-
cence resonance energy transfer (FRET) (see Section 5.2.4) applications (Nguyen and
Daugherty 2005).
5.2.1 UV Fluorescence
The basic application of fluorescence in the IDP field derives from the spectral sensi-
tivity of Trp residues to the local environment. Trp can be specifically excited at 295
nm (where Tyr practically does not absorb), and it emits fluorescence with a maximum
around 350 nm if it is fully exposed to water. The fluorescence of a Trp shielded from
the aqueous environment in the hydrophobic core of globular proteins is generally blue-
shifted to around 320 nm (Schmid 1989). Trp residues of IDPs, depending on their
exposure or transient burial in a local hydrophobic cluster, emit somewhere in between
(see Figure 5.2), as shown in the case of α-casein. One should be aware, however, that
this parameter is very sensitive to the exact position of Trp and local perturbations of its
environment. Thus, emission at rather high wavelengths has been observed in the case
of some globular proteins, such as nuclease (334 nm) and human serum albumin (342
nm) (Lakowicz 2006).
[Figure 5.2 image not reproduced. Axis label: Fluorescence Intensity; trace labels: NATA, cas-D, cas-F, casein, RNAse T1.]
contact, with the quencher. Collisional quenching is especially useful for studying struc-
tural changes and dynamic processes of proteins. Its contribution to the deactivation
rate of the excited fluorophore is described by the Stern–Volmer (SV) equation:
\[ \frac{F_0}{F} = 1 + k_q \tau_0 [Q] = 1 + K_{SV} [Q] \tag{5.1} \]
where F0 and F are the fluorescence intensities measured in the absence and presence
of the quencher, applied at a molar concentration [Q]; kq is the collisional quenching rate
constant; τ0 is the excited-state lifetime of the fluorophore in the absence of the
quencher; and KSV is the SV constant, which is the product of the two
(KSV = kq τ0).
The collisional quenchers most often used are either charged (I⁻) or neutral (oxygen,
acrylamide) molecules, and quenching is visualized by plotting F0/F as a function
of [Q], which gives a linear function, the slope of which is the SV constant. Values of
KSV reflect accessibility of Trp side chains, which are traditionally used to unveil struc-
tural rigidity of a protein, because transient “breathing” enables the quencher to reach
internal Trp residues (Papp and Vanderkooi 1989). In the case of IDPs, Trp residues are
much more exposed, and their actual accessibility can be used to probe the presence
of transient residual structure. This was examined by comparing SV constants determined
for IDPs with those of a fully folded protein (ribonuclease [RNase] T1) and the
model compound of the fully exposed Trp (N-acetyl Trp amide, NATA), which indi-
cated that accessibility of IDPs falls in between the two extremes (KSV = 0.65, 3.99, and
24.37 for RNase T1, p21Cip1, and NATA, respectively, Tompa unpublished results).
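The slope determination described for the Stern–Volmer plot amounts to a one-parameter linear fit. The sketch below fits F0/F − 1 against [Q] with the line constrained through the origin, as Eq. 5.1 requires; the function name and the synthetic titration data are hypothetical, generated here from an assumed KSV of 3.99 purely to show that the fit recovers it.

```python
def stern_volmer_constant(quencher_conc, f0, f_values):
    """Least-squares slope of (F0/F - 1) against [Q], constrained through the origin (Eq. 5.1)."""
    numerator = sum(q * (f0 / f - 1.0) for q, f in zip(quencher_conc, f_values))
    denominator = sum(q * q for q in quencher_conc)
    return numerator / denominator


# Synthetic, noise-free titration generated with an assumed K_SV of 3.99.
k_true = 3.99
f0 = 100.0
conc = [0.05, 0.10, 0.20, 0.30]                      # quencher concentration, M
f_obs = [f0 / (1.0 + k_true * q) for q in conc]
print(stern_volmer_constant(conc, f0, f_obs))
```

With real data, upward or downward curvature of the plot signals deviations from purely collisional quenching, in which case a single slope is not an adequate description.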
\[ k_T = \frac{1}{\tau_d} \left( \frac{R_0}{r} \right)^6 \tag{5.2} \]
where τd is the lifetime of the donor in the absence of the acceptor, r is the distance
between the two molecules, and R0 is the Förster distance at which the efficiency of
transfer is 50% (Forster 1948; Michalet, Weiss, and Jager 2006; Stryer 1978). Because
transfer efficiency depends on the inverse sixth power of distance, FRET can typically measure
distances within the range 20–50 Å, which is well suited for studying the structure and
structural changes (Corradi and Adamo 2007) or interactions (McIntyre et al. 2007) of
proteins. By analyzing time-resolved decays of donor fluorescence, the relative rates of
diffusion of donor-acceptor pairs can also be characterized, which provides information
on the dynamics of structure.
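The steep distance dependence in Eq. 5.2 is easiest to appreciate numerically. The sketch below assumes the usual definition of transfer efficiency as the fraction of donor excitations lost through transfer, E = kT/(kT + 1/τd), which together with Eq. 5.2 gives E = 1/(1 + (r/R0)⁶); this definition is a standard one and is not spelled out in the text. The function names, the R0 of 35 Å, and the 4 ns lifetime are illustrative assumptions.

```python
def transfer_rate(r, r0, tau_d):
    """Eq. 5.2: k_T = (1/tau_d) * (R0/r)^6."""
    return (r0 / r) ** 6 / tau_d


def transfer_efficiency(r, r0):
    """E = k_T / (k_T + 1/tau_d) = 1 / (1 + (r/R0)^6); the donor lifetime cancels."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(efficiency, r0):
    """Invert the efficiency relation to recover the donor-acceptor distance."""
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0 = 35.0  # Å, hypothetical Förster distance of a donor-acceptor pair
for r in (20.0, 35.0, 50.0):
    print(r, transfer_efficiency(r, R0))
```

Outside roughly 0.5 to 1.5 times R0 the efficiency is close to 1 or 0 and changes little with distance, which is why the usable range is limited to a few tens of ångströms.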
Several applications have been presented in the IDP literature for the character-
ization of the distribution of distances and/or dynamics of disordered structures (see
Chapter 10). For example, FRET was used to examine the spatial relationship of
domains representing a preferred folding state of tau protein (Chapter 10, Figure 10.6)
(Jeganathan et al. 2006), and to estimate the end-to-end distance distributions and per-
sistence length of IDPs of various length, such as the charged-plus-PQ domain of ZipA,
the tail domain of α-adducin, and the C-terminal tail domain of FtsZ (Ohashi et al.
2007). The structural and dynamic behavior of the NM domain of yeast prion Sup35p
(Mukhopadhyay et al. 2007) was also studied by FRET (see also Section 5.2.5.2 and
Chapter 10, Section 10.5.1).
et al. 2004), and that of three IDPs, β-casein, MAP2c, and p21Cip1 in 40% Dextran (see
Chapter 8, Figure 8.2), 40% Ficoll 70, and 3.6M TMAO (Tompa, unpublished results)
was studied. In all cases, it was found that crowding causes compaction of IDPs, but
without a cooperative transition to a folded state.
the difference in absorbance of the two polarized lights is extremely small, within the
range 10⁻⁴ to 10⁻⁶ times the actual absorbance of the sample. CD signals are observed in
the same spectral region where absorption of the protein occurs. Typically, near-UV
and far-UV regions are distinguished, which give different kinds of information about
protein structure.
Near-UV CD in the 250–350 nm region (termed the aromatic region) provides
information on the tertiary structure. Different aromatic residues tend to have distinct
wavelength profiles. Phe mostly contributes at 250–270 nm, Tyr at 270–290 nm, and
Trp at 280–300 nm, whereas disulfide bonds contribute broad, weak signals throughout
the spectrum. The near-UV CD spectrum represents a detailed fingerprint of tertiary
structure around these reporter residues, but it cannot be interpreted in terms of the
actual structure. Proteins of stable 3-D structure are usually characterized by an intense
and detailed spectrum due to the asymmetric environment of their aromatic residues,
whereas the spectrum of unfolded proteins or IDPs is of low intensity and low complexity,
because their aromatic residues experience an isotropic environment. Further
insight into residual structure may come from aromatic residues in certain IDPs expe-
riencing local order, because they are part of an element of a residual structure/hydro-
phobic cluster.
The far-UV CD spectrum in the range 190–230 nm originates mainly from amide
(peptide) bonds. It can be used to determine the relative amount of different secondary
structural elements, because they have characteristic far-UV CD spectra, which are
clearly distinguishable in the case of α-helix, β-sheet, turn, PPII helix, and coil conformations
(Figure 5.3A). An actual spectrum can be approximated as a linear combination of
the contributions of different elements, but a major uncertainty of such deconvolution
into components comes from the choice of basis spectra, which can be polyamino acids
or actual proteins of well-known structure.
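The linear-combination idea can be illustrated with an unconstrained least-squares fit. In the sketch below, the basis spectra are HYPOTHETICAL numbers invented for the example (five wavelengths, arbitrary units) and do not represent measured reference spectra; the function names are likewise illustrative. Established deconvolution programs add constraints (non-negative fractions that sum to one) and use curated reference sets, none of which is attempted here.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_fractions(spectrum, basis):
    """Weights minimizing |spectrum - sum_k w_k * basis_k|^2 (normal equations)."""
    names = list(basis)
    gram = [[sum(p * q for p, q in zip(basis[a], basis[b])) for b in names] for a in names]
    rhs = [sum(p * q for p, q in zip(basis[a], spectrum)) for a in names]
    return dict(zip(names, solve_linear(gram, rhs)))


# HYPOTHETICAL basis spectra at five far-UV wavelengths (arbitrary units).
BASIS = {
    "helix": [70.0, -32.0, -28.0, -34.0, -12.0],
    "sheet": [30.0, -8.0, -18.0, -10.0, -2.0],
    "coil": [-40.0, -10.0, -2.0, 3.0, 0.0],
}

# Synthetic "measured" spectrum: 10% helix, 15% sheet, 75% coil.
mixture = [0.10 * h + 0.15 * s + 0.75 * c
           for h, s, c in zip(BASIS["helix"], BASIS["sheet"], BASIS["coil"])]
print(fit_fractions(mixture, BASIS))
```

Repeating the fit with a different basis set changes the recovered fractions, which is the uncertainty the text attributes to the choice of basis spectra.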
CD has been a dominant technique for identifying IDPs, and is also used to
characterize their non-fully random structure (see Chapter 10, Section 10.2). The
resemblance of the spectrum of a protein to that of the coil has been taken to indi-
cate the random coil or disordered character of a protein (Figure 5.3B). In DisProt
(Sickmeier et al. 2007), 156 out of about 500 proteins have the annotation CD in the
field detection method, and some of the most prominent IDPs have been first shown
to lack a well-defined structure by far-UV CD. To cite a few cases, CD was used in
the case of MAP2 (Hernandez, Avila, and Andreu 1986), tau protein (Schweers et al.
1994), ProTa (Gast et al. 1995), α-synuclein (Weinreb et al. 1996), p21Cip1 (Kriwacki
et al. 1996), dehydrin Dsp16 (Lisse et al. 1996), and the high mobility group pro-
tein HMGA (Reeves and Beckerbauer 2001). Near-UV CD has been used less often
(e.g., in the case of calpastatin [Konno et al. 1997]), but provided evidence for local
residual structure in some cases. In the case of caldesmon, a rather intense band
around 275 nm suggests that the Trp residues of this protein are in an asymmetric
environment, possibly in a hydrophobic cluster (Permyakov et al. 2003). In the
case of the trans-activator domain (TAD) of transcription factor Vmw65 (Donaldson
and Capone 1992) and the Potato virus A genome-linked protein VPg (Rantalainen
et al. 2008), the contributions of Phe residues at 250–270 nm and Tyr residues at
270–290 nm are interpreted in terms of an asymmetric environment around the aromatic
residues.
[Figure 5.3 image not reproduced. Panels A and B; y-axis label: Ellipticity per Residue × 10³ (deg cm/dmole); trace labels in panel A: Alpha helix, Beta sheet, Random coil; insert labels in panel B: Alpha helix <10%, Beta sheet 15%.]
Figure 5.3 Typical circular dichroism spectra. (A) Typical CD spectra of α-helix, β-strand,
and coil conformations. (B) CD spectrum of high-mobility group A1a (HMGA1a, also termed
HMG-I) protein, which shows that the protein largely lacks repetitive secondary structure.
Calculations of the secondary structure composition of the protein (insert) suggest the
presence of very little α-helix, β-sheet, or β-turn conformations, but the predominance of
random coil or “other” structures. Reproduced with permission from Reeves (2001), Gene.
277, 63–81. Copyright by Elsevier Inc.
\[ \Delta = \frac{I_R - I_L}{I_R + I_L} \tag{5.3} \]
Although the relative intensities are sensitive to local secondary structure, the exact
relation of spectral components and structural elements, such as α-helix and β-sheet, has
not yet been established. The observed correlations of ROA band pattern with backbone
[Figure 5.4 image not reproduced. Panels A, B, and C: ROA (IR – IL) spectra on a wavenumber axis from 800 to 1600. Labeled bands: panel A, 1316 and 1674; panel B, 1241, 1263, 1300, 1333, 1345, and 1664; panel C, 1282, 1319, and 1670.]
Figure 5.4 ROA spectra of ordered and disordered proteins. The spectra are shown
to demonstrate the differences between the IDP tau (A), ordered lysozyme (B), and the
Bowman–Birk protease inhibitor (BBI), which has an irregular fold (C). The spectrum of
tau is dominated by the peak at 1,316 cm⁻¹, which is attributed to PPII-helical conformation,
also seen in BBI, which has long loops that occur locally in this conformational
state. Reproduced with permission from Syme et al. (2002), Eur. J. Biochem. 268, 148–156.
Copyright by John Wiley & Sons, Inc.
absorption of energy is a change in the spin state of an unpaired electron, the EPR spectrum
is expected to consist of a single line. Interactions with nearby nuclear spins result
in splitting of allowed energy states and a multi-lined spectrum that contains information
on local structure (see Figure 5.5).
The application of EPR for studying proteins requires the presence of either para-
magnetic metal ions or organic free radicals. Thus, EPR is used either for studying met-
alloproteins (containing Mn2+, Cu2+, or Fe3+), or proteins labeled by specific spin-labels
(spin-probes). The label (e.g., HgR, 2,2,5,5-tetramethyl-4-(2-chloromercuriphenyl)-
3-imidazoline-1-oxyl or MTSL [1-oxyl-2,2,5,5-tetramethyl-3-pyrroline-3-methyl]
[Figure 5.5 image not reproduced. Recoverable labels: 2 Azz; IOG.]
positions. The shape of EPR spectra shows that the probe in the entire protein is in a
large-mobility state, with characteristic τR values on the order of 0.2–0.6 ns. For com-
parison, the correlation time of the unbound probe is about 0.05 ns, whereas that of the
label immobilized inside a well-folded protein is two orders of magnitude higher. These
differences can also be exploited to study induced folding of IDPs upon partner binding,
as demonstrated in the binding of the NTAIL region of measles virus nucleoprotein to the
XD domain of viral phosphoprotein (Morin et al. 2006). The mobility of the spin-probe
at three different positions is significantly reduced in the presence of XD (for details,
see Chapter 4, Section 4.4.1), whereas at a fourth position it is unaffected, which makes it possible
to delineate the regions involved in binding-induced ordering. A similar approach was
used to show that a segment of myelin basic protein (MBP, 82–93), a membrane-bound
protein in the central nervous system, forms an amphipathic α-helix, which lies on the
surface of the membrane, partly embedded in it (Bates et al. 2004).
An influential and rather unique application of EPR concerns the structural changes
that accompany amyloid formation. For example, free states and fibrils generated from
83 different spin-label derivatives of α-synuclein were studied by EPR (Chen et al.
2007). In the free state, all variants have sharp and narrowly spaced triplets, which is
suggestive of a high degree of mobility that follows from the disorder of the protein.
The situation is completely different in fibrils. Within about the N-terminal 30 and
C-terminal 15 residues, the spectra are heterogeneous and slightly broader than in the
free state, which is representative of high but somewhat restricted mobility. Spectra of
the central core region (NAC, 35–95) become almost completely free of hyperfine lines,
indicating spin-exchange narrowing. This fundamental change suggests spatial contacts
between multiple spin labels, which can be best accounted for by a parallel in-register
cross-β structure, which makes multiple molecules stack on top of each other in the
amyloid (see Chapter 15, Section 15.5.3 and Figure 15.5).
(slicing), and staining in heavy metals (lead, uranium, or tungsten). A variant of this
latter method, known as rotary shadowing, is very often used to increase the contrast
of a protein sample, which proceeds by freezing the specimen very rapidly, vacuum-
ing off loose and disorganized ice crystals, spraying the sample with metal vapor, and
then applying acids to dissolve away the protein itself. The specimen that remains is a
thin metal shell moulded in the shape of the protein. Probably because of the potentially
denaturing side effects of fixation, however, EM has been used in only a few
cases in the IDP field.
The classic observation is on caldesmon, which is an 89–93 kDa protein of the
contractile apparatus of muscle cells. Caldesmon is one of the first proteins to be rec-
ognized as intrinsically disordered by heat stability, CD, and GF (Lynch, Riseman,
and Bretscher 1987). Low-angle rotary shadow images of the protein indicate an elongated
flexible molecule with an average contour length of 146 ± 40 nm (Figure 5.6).
Its large length variation and highly varied shape are best accounted for by structural
disorder of the protein. A similar picture emerges in the case of the P/Q domain of the
cell-division protein ZipA (Ohashi et al. 2002). Rotary shadowing EM of a construct
of the protein containing two fused globular domains shows a wide distribution of
[Figure 5.6 histogram: Number of Molecules (0 to 50) versus contour length bin]
Figure 5.6 Electron micrographs of caldesmon. The histogram of the contour lengths of
208 molecules of caldesmon determined by EM has an average length of 146 ± 40 nm. The
collection of low-angle rotary shadowed images demonstrates the extreme flexibility of the
molecule (insert, magnification: 102,000 ×). Reproduced with permission from Lynch et al.
(1987), J. Biol. Chem. 262, 7429–37. Copyright by the American Society for Biochemistry
and Molecular Biology.
5 • Spectroscopic Techniques for Characterizing Disorder 71
separations between the two domains, ranging from 7.8–19.5 nm, with an average of
12.4 nm. The EM of MAP2, which is 1,828 amino acids in length, shows a highly
elongated shape of an apparent width of 3.5 nm and a rather varied length averaging
97 ± 17 nm (Wille et al. 1992a). It is highly flexible, as shown by the molecule folding
back on several images to form antiparallel hairpin-like structures. Its juvenile form,
MAP2c (467 residues), behaves in a similar fashion, forming rod-like structures 4 nm
in width and 48 ± 7 nm in length, which are often curved/bent in shape (Wille et al.
1992b).
EM images can also provide functional insight. Cardiac titin is a giant muscle pro-
tein with multiple Ig domains and a highly repetitive disordered domain termed PEVK
(Pro, Glu, Val, Lys-rich region) because of the preponderance of these four amino acids
(see Chapter 12, Section 12.1.3 and Chapter 13, Section 13.3.1.3). A construct of Ig
domains connected by a 186-amino acid long PEVK linker studied by rotary-shadowed
EM (Li et al. 2001) has a wide length distribution ranging from 9–24 nm, with two
peaks at 11 and 17 nm. The functional importance of this extended structural state (also
addressed by atomic force microscopy (AFM) force-extension curves; see Section 5.8)
lies in its elastic behavior, which withstands many stretch-relaxation cycles and is critical to
its function as an elastic molecule in muscle. Rotary shadowing EM suggests a similar
structural picture but somewhat different functional interpretation in the case of myosin
VI (Rock et al. 2005). Myosin VI is a processive motor moving along actin filaments
with a larger than expected step size. In dimeric molecules, the separation of the head
modules is 27 ± 6 nm by EM, which suggests that the 80-residue long segment
next to the tail is not rigid but highly flexible and allows a diffusive search for binding
sites on F-actin (see Chapter 14, Section 14.9).
5.8.2 α-Synuclein
AFM can also be used to measure the force required for the extension of an IDP (e.g.,
that of α-synuclein) (Sandal et al. 2008). To this end, force-extension curves of multiple
unfolding events of a fusion construct of three N-terminal and three C-terminal titin
immunoglobulin (I27) domains flanking a single α-synuclein molecule were recorded.
About 30% of the molecules show unfolding typical of a fully disordered state, without
any significant deviation from a worm-like chain behavior. About 60% of them display
single or multiple small peaks superimposed on the purely entropic behavior, ascribed
to additional mechanically weak interactions along the chain. Most interestingly, about
7% of α-synuclein molecules display extension at a force very similar to that required to
unfold the Ig-domains, which suggests a rather ordered structure dominated by β-type
interactions. The importance of this observation is underscored by the fact that the
ratio of this structured component increases under conditions that promote the forma-
tion of α-synuclein aggregates, such as the presence of copper, the pathologic A30P
mutation, and high ionic strength (see Chapter 15, Section 15.3.2.1).
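The purely entropic, worm-like chain behavior mentioned above is often described with the Marko-Siggia interpolation formula. The sketch below assumes that form; the text does not state which expression Sandal et al. used, and the function name, default temperature, and any contour or persistence length passed to it are illustrative assumptions.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K


def wlc_force(extension_nm, contour_length_nm, persistence_length_nm,
              temperature_k=298.0):
    """Entropic force (pN) of a worm-like chain, Marko-Siggia interpolation.

    F = (kT/p) * [1/(4(1 - x/L)^2) - 1/4 + x/L]
    The choice of this formula is an assumption, not taken from the source.
    """
    if not 0.0 <= extension_nm < contour_length_nm:
        raise ValueError("extension must satisfy 0 <= x < contour length")
    x = extension_nm / contour_length_nm
    kt_over_p = K_B * temperature_k / (persistence_length_nm * 1e-9)  # newtons
    force_newton = kt_over_p * (0.25 / (1.0 - x) ** 2 - 0.25 + x)
    return force_newton * 1e12  # convert N to pN


# Hypothetical chain: 50 nm contour length, 0.4 nm persistence length
for ext in (5.0, 25.0, 45.0):
    print(ext, round(wlc_force(ext, 50.0, 0.4), 2))
```

A force-extension trace that follows such a curve without extra peaks corresponds to the fully disordered class described above; superimposed peaks indicate additional interactions along the chain.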
6 Nuclear Magnetic Resonance
Nuclear magnetic resonance (NMR) spectroscopy has a special status among spectro-
scopic techniques because it can provide residue-level information on the structure and
dynamics of disordered proteins. The NMR of intrinsically disordered proteins (IDPs)
owes a lot to studies of protein denaturation and folding, where many relevant experiments
had been conducted. In the field of IDPs, NMR was initially used for demonstrating
their disorder (i.e., for simply contrasting their behavior with that of folded proteins).
Later, the emphasis shifted to characterizing residual structure and correlating it with
(binding) function, which are the major assets of protein NMR. Combinations of NMR
data with other methods or applying NMR to proteins in a living cell provide unprec-
edented insight into the structure and function of IDPs.
different frequency are applied and are assembled into different time-domain pulse
combinations. The combinations are designed to allow magnetization transfer between
nuclei in coherent motion (spin systems), with the aim of detecting their interactions.
Following excitation, how the out-of-equilibrium magnetization vector precesses about
the external magnetic field and returns to the ground state is measured. These processes
[Figure 6.1 spectra, panels A and B; horizontal axis: Chemical Shift (ppm), 11 to 0]
Figure 6.1 1-D 1H NMR spectrum of the intrinsically disordered cytoplasmic domain of
gliotactin. The 1-D 1H NMR spectrum of the folded 9-kDa complex of α-bungarotoxin with
a 13-mer peptide (A) compared to the spectrum of the cytD of gliotactin (B). The spectrum
of Gli-cytD is typical of that of a protein in a random coil state. Reproduced with permission
from Zeev-Ben-Mordehai et al. (2003), Proteins 53, 758–67. Copyright by Wiley-Liss, Inc.
6.2.2 Wide-Line NMR
The unfolded polypeptide chain of IDPs is largely exposed to the solvent, which is
manifested in a high level of hydration. This can be directly visualized by measuring
the FID of water protons, separating the signal coming from the hydrate layer from
those of the protein and bulk water by freezing (Bokor et al. 2005). Water molecules in
the hydrate layer remain motile below the temperature at which bulk water freezes out,
and the phases of ice protons, protein protons, and unfrozen water protons become sep-
arated in the FID signal due to large differences in their spin–spin relaxation rates. Ice
protons have a typical value of R2 > 200,000 s⁻¹, which is completely buried in the dead
time of the spectrometer. Protein protons also have a large R2 > 20,000 s⁻¹, whereas
water signals typically relax at a rate R2 < 2000 s⁻¹. In practical terms, the temperature
range can be divided into four regions, within which distinct observed behavior can be
interpreted as weighted averages of the amplitude and dynamics of different unfrozen
water fractions.
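The separation of the three proton phases by relaxation rate can be illustrated with a simple numerical sketch. It assumes single-exponential decay for every component (a simplification, since solid-like signals are often closer to Gaussian) and uses the bounds quoted above as representative rates; the 50 µs delay is a hypothetical value, not a spectrometer specification from the source.

```python
import math

# Bounds quoted in the text, used here as representative R2 values (s^-1)
R2_PER_S = {"ice": 200_000.0, "protein": 20_000.0, "water": 2_000.0}


def surviving_fraction(r2_per_s, time_s):
    """Fraction of a component's signal left after time_s (exponential decay assumed)."""
    return math.exp(-r2_per_s * time_s)


def fid_amplitude(time_s, amplitudes):
    """Total FID amplitude as a sum of exponentially decaying components."""
    return sum(a * surviving_fraction(R2_PER_S[name], time_s)
               for name, a in amplitudes.items())


delay_s = 50e-6  # hypothetical delay after the pulse
for name, rate in R2_PER_S.items():
    print(name, surviving_fraction(rate, delay_s))
```

With these assumed numbers the ice signal has essentially vanished after the delay while most of the unfrozen water signal remains, which is the basis for reading the hydrate-layer amplitude from the slowly decaying part of the FID.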
Thus, wide-line NMR relaxation can be used for demonstrating a large hydrate
layer and structural disorder, as shown in the case of microtubule-associated protein 2
(MAP2) and calpastatin (Bokor et al. 2005), early responsive to dehydration (ERD10/14)
(Bokor et al. 2005; Tompa et al. 2006a), and Df31 (Szollosi et al. 2008). Because tran-
sient intramolecular interactions effectively compete with hydration of the protein, a
comparison of the hydrate layer of a full-length IDP and its segments also provides
information on residual structure, as suggested in the case of calpastatin (Csizmok et
al. 2005). Overall, the method is applicable for visualizing the interface region of IDPs,
which is a surface representation of structure that is in direct connection with function
(see Chapter 10, Section 10.6).
6.2.4 HSQC
Sequence-specific assignment of resonances is made possible by multidimensional tri-
ple resonance methods of 13C- and 15N-labeled proteins, because the chemical shifts of
backbone 15N and carbonyl 13C resonances are well dispersed in the disordered state.
Typically, the first experiment to be measured with an isotope-labeled protein is a 2-D
heteronuclear single quantum coherence (HSQC) spectrum (Figure 6.2), which corre-
lates the backbone amide nitrogen resonances with those of the directly attached pro-
tons. In a 1H-15N HSQC spectrum, the amide bond of each amino acid residue (with
the exception of prolines) plus amide nitrogen-containing side-chains provide a signal.
The peaks in the proton dimension of an IDP show the lack of dispersion apparent in
the 1-D 1H spectrum (Figure 6.1), spanning between 8.0 ppm and 8.5 ppm (Figure 6.2).
In the nitrogen dimension, the spectrum is well spread out, spanning 105–130 ppm for
backbone and side-chain amide groups. This difference in dispersion in the two dimen-
sions is a reliable indicator of structural disorder, due to which HSQC is often recorded
simply for characterizing the structural state of an IDP.
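The dispersion argument above can be turned into a rough numerical check on a peak list. This is only a sketch: the function names, the threshold values, and the example peaks are hypothetical, and a real assessment would inspect the full spectrum rather than two spans.

```python
def shift_span(shifts_ppm):
    """Range covered by a set of chemical shifts (ppm)."""
    return max(shifts_ppm) - min(shifts_ppm)


def looks_disordered(peaks, max_h_span=1.0, min_n_span=10.0):
    """Heuristic: narrow 1H dispersion combined with wide 15N dispersion.

    peaks: list of (1H ppm, 15N ppm) tuples.
    The thresholds are illustrative assumptions, not values from the source.
    """
    h_shifts = [h for h, _ in peaks]
    n_shifts = [n for _, n in peaks]
    return shift_span(h_shifts) <= max_h_span and shift_span(n_shifts) >= min_n_span


# Made-up peak lists for illustration only
idp_like = [(8.05, 108.2), (8.21, 115.0), (8.33, 121.7), (8.48, 127.9)]
folded_like = [(6.70, 105.5), (7.90, 114.0), (9.40, 122.3), (10.30, 129.1)]
print(looks_disordered(idp_like), looks_disordered(folded_like))
```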
HSQC is also the starting point of resonance assignment, which is essential for a
meaningful interpretation of more advanced NMR experiments. The problem of poor
proton dispersion is overcome by using the dispersion of 15N and 13C nuclei in a vari-
ety of experiments that add further dimensions to the HSQC plane and help resolve
ambiguous resonances. In the case of small unlabeled proteins, this is achieved by a set
of two-dimensional homonuclear NMR experiments, such as homonuclear correlation
[Figure 6.2 spectrum; vertical axis: ω1 – 15N (ppm), 110 to 125]
Figure 6.2 HSQC spectrum of calpastatin domain 1. The 1H-15N HSQC spectrum of uni-
formly 15N-labeled calpastatin domain 1 of 141 amino acids (126 non-Pro), of which 121
expected backbone peaks could be assigned. See Kiss et al. 2008b for details.
HNCACB, and CBCACONH experiments, which add a carbon dimension to the HSQC
plane. Assignment is achieved via a sequential walk through the backbone and along
the side-chain on the basis of one-bond correlations. As to the state of the art, an IDP
as large as 202 amino acids (human securin) could be fully assigned by a combination
of proton-based and proton-less approaches (Csizmok et al. 2008). The sequence of
hTau40, the longest human tau isoform (441 residues), could be almost fully assigned
by a lengthy procedure that combined the graphical analysis of the spectra of the
full-length wild-type protein, different isoforms, and corresponding peptides with
subsequent (HA)CANNH and HNN experiments (Lippens et al. 2006; Lippens et al. 2004;
Mukrasch et al. 2005; Mukrasch et al. 2007b; Smet et al. 2004).
Following assignment of the peaks, HSQC is also the starting point of the determi-
nation of a variety of parameters for the sequence-specific characterization of transient
structure and dynamics at the local level. The HSQC experiment is also useful for detecting
interactions with other proteins, because a change in relaxation due to the interaction
makes the peaks of residues directly involved shift or even disappear from
the spectrum. For these reasons, the HSQC spectrum has been one of the most frequent
experiments in the IDP literature, applied in the case of p21Cip1 (Kriwacki et al. 1996),
FlgM (Daughdrill et al. 1997), VP16 TAD (Uesugi et al. 1997), 4E-BP1 (Fletcher and
Wagner 1998), D1–D4 of fibronectin binding protein(A) (FnBPA) (Penkett et al. 1998),
CREB KID (Radhakrishnan et al. 1998), protein kinase inhibitor α (PKIα) (Hauer et al.
1999a), eukaryotic translation initiation factor 4G1 (eIF4G1) (Hershey et al. 1999), p53
TAD (Lee et al. 2000), α-synuclein (Eliezer et al. 2001), CP 12 (Graciet et al. 2003),
Grb14 (Moncoq et al. 2003), Wiskott–Aldrich syndrome protein (WASP) (Panchal et al.
2003), Smad-anchor for receptor activation Smad-binding domain (SARA SBD) (Chong
et al. 2004), thymosin β4 (Tβ4) (Domanski et al. 2004), IA3 (Green et al. 2004), myelin
basic protein (MBP) (Harauz et al. 2004), T-cell receptor zeta cytoplasmic domain (cytD)
(Sigalov et al. 2004), tau protein (Eliezer et al. 2005), BRCA1 (Mark et al. 2005), prion
domain of Ure2p (Pierce et al. 2005), colicin E9 (Tozawa et al. 2005), UreG (Zambelli
et al. 2005), rod photoreceptor glutamic acid-rich protein (GARP) (Batra-Safferling et
al. 2006), phage lambdaN (Prasch et al. 2006), cystic fibrosis transmembrane conduc-
tance regulator (CFTR) R domain (Baker et al. 2007), β-synuclein (Bertoncini et al.
2007), Nogo (Li and Song 2007), Sic1 (Mittag et al. 2008), and MSP2 (Zhang et al.
2008). HSQC is also routinely used for screening the prospect of structure solution in
various structural genomics programs (Oldfield et al. 2005c; Peti et al. 2004).
6.3 Sequence-specific structural information
Once resonance assignment has been achieved, a variety of NMR parameters can be
determined to characterize structural and dynamic behavior at the residue level. The
values are usually compared to those expected on the assumption of the random coil
state, and deviations are used for a detailed description of local structure (see Chapter
10, Section 10.2.3 and Table 10.1).
[Figure 6.3, panels A and B; labels: αA, αB, pSer133, I137]
Figure 6.3 The structure of phosphorylated CREB-KID in CBP-bound and free states.
(A) The structure of the phosphorylated KID domain of CREB in complex with the KIX
domain of CBP (pdb 1kdx) that encompasses two helices connected by a wide turn harbor-
ing phosphorylated Ser133, as solved by NMR (Radhakrishnan et al. 1997). (B) Hα and Cα
CSI values of non-phosphorylated (◽), Ser133-phosphorylated (o), and KIX-bound (•) forms
of KID. CSI values indicate helical preference within regions αA and αB, and very little dif-
ference between the non-phosphorylated and phosphorylated forms. Panel B reproduced
with permission from Radhakrishnan et al. (1998), FEBS Lett. 430, 317–22. Copyright by
Elsevier Inc.
Hanely, and Dahlquist 1998), FnBPA (Penkett et al. 1998), sialoprotein and osteopontin
(Fisher et al. 2001), histone messenger ribonucleic acid (mRNA) stem-loop binding pro-
tein (SLBP) (Thapar, Mueller, and Marzluff 2004), replication protein A (Olson et al.
2005), transcription factor ETS1 (Macauley et al. 2006), p53 (Veprintsev et al. 2006),
β-synuclein (Bertoncini et al. 2007), CREB KID (Sugase, Dyson, and Wright 2007),
and MBP (Libich and Harauz 2008).
[Plot: Peak Intensity Ratio (0.0 to 1.2) versus Residue Number (20 to 140)]
that can then be fed into molecular dynamics (MD) simulations to obtain a realistic
picture of the structural ensemble of the IDP (Dedmon et al. 2005). This approach was
used in the case of p53 (Vise et al. 2007) and α-synuclein (see Figure 10.7) (Dedmon
et al. 2005), for example.
or relaxation data. The few examples that should be noted are FnBPA (Penkett et al.
1998), histone mRNA binding protein SLBP (Thapar et al. 2004), Sendai virus phos-
phoprotein (Bernado et al. 2005), α- and β-synuclein (Bertoncini et al. 2005; Bertoncini
et al. 2007), tau protein (Mukrasch et al. 2005; Mukrasch et al. 2007a), and p53 (Wells
et al. 2008).
experiments, water protons are selectively excited, and their exchange with amide posi-
tions is observed by following transfer of magnetization to the amide site. The method
is capable of detecting exchange processes on the millisecond timescale, and is potentially
applicable for the characterization of IDPs. The results are usually expressed
as protection factors, that is, the ratio of the intrinsic exchange rate of an
unprotected amide in the same chemical environment (same sequence in local random
coil conformation) to the observed exchange rate (i.e., kint/kobs) of the given amide
hydrogen. The former can be directly measured or calculated from the sequence (Bai
et al. 1993).
Typically, protection factors on the order of 10³–10⁶ are observed for folded regions,
whereas transient structural elements in IDPs/intrinsically disordered regions (IDRs)
can only provide a protection up to about 10-fold. This approach has been used for
showing the lack of structure in the N- and C-terminal regions of Nogo (Li and Song
2007) and segments of the HET-s prion amyloid (Ritter et al. 2005), whereas it dem-
onstrated transient helical segments in securin (Csizmok et al. 2008) and the NTD of
histone mRNA binding protein SLBP (Thapar et al. 2004).
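The protection factor defined above is a simple ratio, and the ranges quoted in the text suggest a coarse reading of its value. The sketch below uses hypothetical function names and treats the quoted ranges as cut-offs, which is an assumption made only for illustration; values between about 10 and 10³ are not covered by the text.

```python
def protection_factor(k_int, k_obs):
    """Protection factor P = k_int / k_obs (both rates in the same units)."""
    if k_int <= 0 or k_obs <= 0:
        raise ValueError("exchange rates must be positive")
    return k_int / k_obs


def interpret(pf):
    """Coarse reading based on the ranges quoted in the text (cut-offs assumed)."""
    if pf >= 1e3:
        return "folded-like protection"
    if pf > 10:
        return "intermediate (not covered by the quoted ranges)"
    return "at most transient structure"


# Hypothetical rates in s^-1
print(interpret(protection_factor(10.0, 2.0)))
print(interpret(protection_factor(10.0, 0.002)))
```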
proteins and provide binary information on whether the protein behaves as ordered, or
rather as disordered, within the resolution of the technique.
[Figure 7.1 gels, panels (A) and (B); horizontal axis: pH 3 to 10; vertical axis: molecular mass markers at 66, 45, 36, 29, 24, 20, and 14 kDa]
Figure 7.1 2-D electrophoresis of A. thaliana proteins enriched for disorder by heat
treatment. A. thaliana seed extracts were boiled to select for heat-resistant dehydration
stress proteins. The supernatant was run on 2DE, and spots were excised for MS identifica-
tion. The enrichment is shown by comparing the 2DE gel before (A) and after (B) the heat
treatment. Boiling reduced the number of spots from 710 to 406, and mostly preserved
disordered LEA and storage proteins. Reproduced with permission from Irar et al. (2006),
Proteomics 6, S1, S175–85. Copyright by Wiley-VCH.
[Figure 7.2 gel; spot labels: IPMDH, MAP2c, Ovalbumin, BSA, MYPT1, Stathmin, Fetuin, ERD10, β-casein, NACP, α-casein, CSTD1, DARPP-32, Bob-1]
Figure 7.2 Native/urea 2-D electrophoresis of IDPs and globular proteins. A mixture of
IDPs and globular proteins (1 µg each) was run on a native gel in the first dimension and
on a gel containing 8M urea in the second dimension. IDPs stathmin, microtubule-associ-
ated protein 2c (MAP2c), MYPT1 304-511, ERD10, β-casein, α-synuclein (NACP), CSTD1,
Bob-1, DARPP-32, and α-casein are aligned along the diagonal line of the gel (marked by a
solid line). Globular proteins fetuin, IPMDH, BSA, and ovalbumin occupy off-diagonal posi-
tions (marked by a dashed line). Reproduced with permission from Csizmok et al. (2006),
Mol. Cell. Proteomics 5, 265–73. Copyright by the American Society for Biochemistry and
Molecular Biology.
7 • Proteomic Approaches for the Identification of IDPs 89
groups are separated from each other, but direct information on the structural status of
IDPs is provided (Csizmok et al. 2006).
This separation principle can be illustrated by experimentally confirmed IDPs and
globular control proteins (Figure 7.2). IDPs align along the diagonal line, whereas con-
trols appear above. When applied for the proteomic-scale identification of IDPs from
cellular extracts of E. coli and S. cerevisiae, bioinformatic, gel filtration (GF), and cir-
cular dichroism (CD) characterization showed that proteins tend to fall into two groups
(Table 7.2). Some of them, such as transcription factor IIA (TFIIA), appear to be fully
extended, devoid of appreciable secondary structure content, and might be classified
as an IDP of random-coil- or pre-molten globule (PMG)-type. Other proteins, such as
Ubi6, contain a significant amount of secondary structure, are more compact by
hydrodynamic criteria, and can probably be approximated as molten globules (MGs).
The technique can also be used for studying proteins of limited purity and very low
quantity (on the order of µg). Structural techniques traditionally used for characterizing
IDPs, such as CD, nuclear magnetic resonance (NMR), or small-angle X-ray scatter-
ing (SAXS), require large amounts of purified proteins and cannot be applied to trace
amounts of contaminated proteins. The position on the native/urea 2DE gel, however,
is a dependable indicator of the disordered status of a protein under such conditions
(Csizmok et al. 2006).
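The diagonal criterion of the native/urea 2DE gel can be sketched as a distance-from-diagonal test. The sketch assumes that migration in each dimension has been normalized to the same 0 to 1 scale and that a fixed tolerance separates diagonal from off-diagonal spots; both the normalization and the tolerance are hypothetical and are not taken from Csizmok et al. (2006).

```python
import math


def off_diagonal_distance(native_migration, urea_migration):
    """Perpendicular distance of a spot from the diagonal y = x.

    Both coordinates are assumed to be normalized migration distances (0 to 1).
    """
    return abs(urea_migration - native_migration) / math.sqrt(2.0)


def on_diagonal(native_migration, urea_migration, tolerance=0.05):
    """True if the spot runs alike with and without urea (tolerance is illustrative)."""
    return off_diagonal_distance(native_migration, urea_migration) <= tolerance


# Made-up spot coordinates for illustration only
print(on_diagonal(0.40, 0.41))  # behaves like an IDP in this sketch
print(on_diagonal(0.60, 0.35))  # behaves like a globular protein in this sketch
```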
8 IDPs under Conditions Approaching In Vivo
A major challenge in the field of intrinsically disordered proteins (IDPs) is to assess to
what extent in vitro observations on the structural state and function of these proteins
can be extrapolated to the living cell. This chapter discusses in vitro experiments aimed
at approximating in vivo conditions and also in vivo experiments that directly address
the physiological state of IDPs. Most studies suggest that IDPs are probably more com-
pact under such conditions, but they do not assume a unique folded state. Indirect con-
siderations also support the notion that IDPs do not become folded by the crowded
conditions encountered in the cell (i.e., disorder is their physiological state). The issue of
their assumed fast intracellular degradation, which would apparently be contradictory
to their existence and functioning, is also addressed.
8.1 Macromolecular crowding in the cell
Our observations on IDPs are dominated by in vitro experiments, and it is implicitly
assumed that the emerging picture is relevant to their state and behavior
in a living cell. The cell, however, has extremely high intracellular macromolecular
concentrations that give rise to a crowding effect, which might bear directly
on the structural state of IDPs (Ellis 2001; Minton 2005). Typical concentrations of
proteins and other macromolecules reach 300–400 mg/ml, which basically limits the
available space for other macromolecules (Figure 8.1) and causes a severe excluded
volume effect that increases chemical activity of the molecules. Theoretical and exper-
imental estimates suggest that this effect can be of several orders of magnitude for a
protein of average size, which may fundamentally affect structural transitions accom-
panied by changes in volume, such as protein–protein interactions and folding. In the
case of unfolded/denatured globular proteins, crowding does promote their native-like
compact states of at least partial activity (Baskakov and Bolen 1998; McPhie, Ni, and
Figure 8.1 Volume exclusion by crowding. Macromolecules in this schematic cell occupy
about 30% of the available space. (A) A small molecule has accessibility to virtually all
of the remaining 70%. (B) A molecule of size similar to the “crowding” macromolecules
is excluded from most of this volume, which gives rise to appreciable excluded volume
effects. Reproduced with permission from Ellis (2001), Trends Biochem. Sci. 26, 597–604.
Copyright by Elsevier Trends Journals.
Minton 2006; Qu and Bolen 2002). By analogy, crowding may also force IDPs to
assume compact or even folded states, making this issue very critical with respect to
their physiological structural state and function.
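The link between the quoted macromolecular concentrations and the occupied volume shown in Figure 8.1 can be checked with a short calculation. The sketch assumes a partial specific volume of about 0.73 ml/g, a typical value for proteins that is not given in the text; the function name is hypothetical.

```python
def occupied_volume_fraction(conc_mg_per_ml, specific_volume_ml_per_g=0.73):
    """Fraction of solution volume taken up by macromolecules.

    specific_volume_ml_per_g is an assumed typical protein value.
    """
    grams_per_ml = conc_mg_per_ml / 1000.0
    return grams_per_ml * specific_volume_ml_per_g


for conc in (300.0, 400.0):
    print(conc, occupied_volume_fraction(conc))
```

Under this assumption, 300–400 mg/ml corresponds to roughly 22–29% of the volume, in line with the "about 30%" of the schematic cell.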
Basically, there are two direct approaches that can be used to address these issues.
The first is to mimic crowding conditions in the test tube and study the structural state/
function of selected IDPs. The second is to characterize the proteins in living cells
by appropriate experimental techniques. Complementing these two is the collection
of indirect observations, from which inferences can be made with respect to the likely
physiological state of IDPs.
The most widely accepted approach is to apply the high molecular-weight polymers
Dextran and Ficoll 70, which probably present a combination of all three effects. The
application of another protein, such as bovine serum albumin (BSA), is also acceptable,
but it may, for example, engage in unwanted nonspecific interactions. Small-molecule osmolytes,
such as sucrose, trimethylamine N-oxide (TMAO), and trifluoroethanol (TFE),
primarily act upon viscosity and/or solvation of the protein backbone, which do occur
in the cell, but do not directly pertain to crowding by definition. These small molecules
are considered “chemical chaperones,” and they do have the potential to induce
and/or stabilize the folding of proteins. Further, they do belong to nature’s arsenal for
fighting abiotic conditions, such as water deficit (Bray 1993) or unfolding of proteins
(Baskakov and Bolen 1998), and thus they represent a fair alternative for approaching
living conditions.
Studies carried out under a wide range of conditions mostly agree that crowding
does elicit some compaction but not folding of IDPs. The conformational state of kinase
inhibitory domain (KID) of p27Kip1 and trans-activator domain (TAD) of c-Fos was
studied by circular dichroism (CD) spectroscopy or 1-anilino-8-naphthalene-sulfonic
acid (ANS)-binding (Flaugh and Lumb 2001). Dextrans of various molecular mass
(MW) or Ficoll 70 (both up to 250 mg/ml) have no effect on either protein, whereas
TFE at 30% concentration induces a significant amount of α-helix in both proteins.
α-Synuclein and an unfolded globular protein, acid-denatured cytochrome c, were studied
by two methods, pulsed-field gradient (PFG) nuclear magnetic resonance (NMR)
and CD (Morar et al. 2001), in the presence of 1M glucose. Hydrodynamic measurements
suggest only a slight compaction of α-synuclein (RH decreases from 26.6 Å to
22.5 Å, compared to the value for a globular protein of 140 amino acids, 18–20 Å, and
for a random coil, 33–37 Å), without any change in the secondary structure content. In
the case of acid-denatured cytochrome c, 1M glucose does make it collapse to a state
compatible with the compactness of the native conformation (RH decreases from 30.6
Å to 17.7 Å). Fluorescence resonance energy transfer (FRET) in the case of the P/Q
domain of ZipA (Ohashi et al. 2007) suggested that the average end-to-end distance
in the ensemble slightly decreases in the presence of 20% Ficoll 70, which suggests a
limited compaction under crowding conditions.
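One way to put the α-synuclein RH values quoted above on a common scale is to place them between the random-coil and globular reference values. The normalization below is a hypothetical convenience, not a quantity used in the source, and it takes the midpoints of the quoted reference ranges (19 Å globular, 35 Å random coil for 140 residues) as an assumption.

```python
def compaction_index(rh_observed, rh_globular, rh_coil):
    """0 means random-coil-like, 1 means globular-like (hypothetical normalization)."""
    if rh_coil <= rh_globular:
        raise ValueError("coil reference must exceed globular reference")
    return (rh_coil - rh_observed) / (rh_coil - rh_globular)


# Midpoints of the reference ranges quoted in the text for 140 residues (assumed)
RH_GLOBULAR, RH_COIL = 19.0, 35.0

print(compaction_index(26.6, RH_GLOBULAR, RH_COIL))  # α-synuclein without glucose
print(compaction_index(22.5, RH_GLOBULAR, RH_COIL))  # α-synuclein in 1M glucose
```

On this scale α-synuclein moves toward the globular reference in glucose but does not reach it, consistent with compaction without folding.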
TMAO is often (and not fully justifiably) used to assess the structure of unfolded proteins
under conditions approaching in vivo, with mixed results. It has a significant effect
on α-synuclein, as measured by various techniques (Uversky, Li, and Fink 2001d). It
promotes the transition to a conformation dominated by α-helices, with a half-maximal
TMAO concentration around 2.2M. ANS binding, fluorescence quenching, and small-
angle X-ray scattering (SAXS) suggest a large compaction of the structure at 3.5M
TMAO, compatible with the oligomerization/aggregation of the protein (Uversky et al.
2001d). A significant effect on secondary structure (i.e., development of local α-helices)
and a slight compaction (i.e., blue-shifting of UV fluorescence) is seen in the case of the
intermediate chain of cytoplasmic dynein at 2.4M TMAO (Nyarko et al. 2004), whereas
the osmolyte has no effect on the secondary structure of myelin basic protein (MBP) at
concentrations up to 2M (Hill et al. 2002). In the case of a mutant tau protein, TMAO
promotes slight secondary structure formation and restores the ability of the protein
to promote tubulin polymerization (Smith, Crowther, and Goedert 2000). The power
of TMAO, on the other hand, is demonstrated by its effectiveness in making unfolded
[Plot: D (m2 sec–1), logarithmic axis from 1E-12 to 1E-8, versus Dextran (%), 0 to 50]
see Chapter 14, Section 14.2.1), may receive even more credit than suggested by in vitro
structural studies. Local elements thus stabilized may be related to recognition func-
tions, as suggested in the case of tau protein (Smith et al. 2000), cytoplasmic dynein
(Nyarko et al. 2004), α-casein, p21Cip1, MAP2c (Tompa, unpublished results), and per-
haps α-synuclein (Uversky et al. 2001d).
[Figure 8.3 schematic; labels: ODC dimer, ODC monomers, NADH, Az, Mdm2, NQO1, p53, 20S proteasome]
Figure 8.3 20S proteasome-mediated degradation of IDPs “by default.” Schematic rep-
resentation of the “default” degradation pathways of p53 and ornithine decarboxylase
(ODC). These proteins are degraded by both the 26S and 20S proteasomes, but degrada-
tion by 20S proteasomes does not require prior poly-ubiquitination. It is suggested that
disorder is important for this relation: p53 contains a significant level of disorder (Bell et al.
2002; Dawson et al. 2003), and ODC is a two-state dimer, preferentially disordered in the
monomer state. Degradation of both proteins is regulated by interactions with NAD(P)H
quinone oxidoreductase-1 (NQO1). (Az stands for Antizyme.) Reproduced with permission
from Asher et al. (2006), BioEssays 28, 844–9. Copyright by Wiley-VCH.
[Figure 8.4, panels A to C. Panel B: HSQC spectra with residue-numbered peaks; axis: Nitrogen (ppm), 112.0 to 128.0. Panel C: ppm Difference from Random Coil Value (–2.00 to 3.00) by residue, V12 to K97.]
Figure 8.4 Bound structure, free state, and in-cell NMR of FlgM. FlgM is an inhibitor
of the bacterial transcription factor σ28. (A) FlgM is about 97 amino acids in length, and the
structure of A. aeolicus protein bound to its partner (pdb 1rp3) has been solved by X-ray
crystallography (Sorenson et al. 2004). (B) When FlgM is overexpressed in E. coli, its HSQC
spectrum measured in solution (left panel) undergoes significant changes, which suggests
that its C-terminal region assumes structure or is bound to a partner, in the cell (Dedmon
et al. 2002). (C) In isolation, the protein from S. typhimurium tends to sample transient helix
conformations in the C-terminal half, as shown by NMR chemical shift index (CSI) values
(Daughdrill et al. 1998). Reproduced with permission from Dedmon et al. (2002), Proc.
Natl. Acad. Sci. USA. 99, 12681–4, copyright by the National Academy of Sciences, and
Daughdrill et al. (1998) Biochemistry 37, 1076–82, copyright by the American Chemical
Society.
[Figure 8.5 plot: Fold Index Segment (0 to 25) versus Half-life (min), logarithmic axis from 1 to 1000]
Figure 8.5 Low correlation of physiological half-lives and protein disorder. Data on the
physiological half-life of 3,750 proteins were determined by epitope tagging and quantita-
tive Western-blotting following arrest of translation in yeast (Belle et al. 2006). The half-
lives thus determined indicate a very low level of correlation with the number of predicted
IDRs ≥30 consecutive residues. Reproduced with permission from Tompa et al. (2008),
Proteins 71, 903–9. Copyright by Wiley-Liss.
The most likely interpretation of these findings is that protein degradation is not
determined by a single characteristic, and the proteolytic systems in the cell are highly
regulated. This can be formally demonstrated by showing that proteases/proteolytic sys-
tems of the highest copy number in the cell are all tightly regulated by various means,
such as ubiquitination, localization, special substrate requirement, or post-translational
modification (Tompa et al. 2008).
The question of the possible compact fold of IDPs in vivo is irrelevant in the
case of extracellular IDPs, which do not experience a crowded environment in their
natural physiological habitat. There are many such IDPs in DisProt (Sickmeier et al.
2007). For example, casein(s) function as scavengers of calcium-phosphate seeds in
milk (Holt, Walgren, and Drakenberg 1996), salivary proline-rich glycoproteins serve
to neutralize plant polyphenolic compounds (i.e., tannins) in saliva (Lu and Bennick
1998), whereas bacterial fibronectin-binding proteins that protrude from the surface
of the cell tether bacteria to the extracellular matrix (ECM) of the host (Penkett et al.
1997).
The issue of in vivo order or disorder is probably also irrelevant with respect to
IDPs that perform their function directly by disorder, which is by definition incompat-
ible with a single, stable, conformational state (entropic chains, see Chapter 12, Section
12.1). The function of entropic chains stems directly from the ability of their polypeptide
chain to rapidly fluctuate between a large number of alternative conformational states.
For example, the function of the Pro, Glu, Val, Lys-rich (PEVK) region of titin, an elas-
tic protein that provides passive tension in muscle (Trombitas et al. 1998), the projection
domain of MAP2, which ensures entropic spacing in the cytoskeleton (Mukhopadhyay
and Hoh 2001), and the repeat regions of FG nucleoporins (Nups), which regulate gating
of transport through the nuclear pore (Patel et al. 2007), cannot be rationalized in terms
of a folded, well-defined structure.
As outlined in Chapter 12, many IDPs function by molecular recognition, when
they either transiently or permanently bind to a structured partner. The structures of
these complexes are solved in many cases and demonstrate that IDPs often bind in an
extended, open configuration (see Chapter 6, Figure 6.3; Chapter 10, Figure 10.3; and
Chapter 11, Figure 11.5). Because strength and specificity of binding (often in the range
of nM/pM) argue that the structure determined in vitro is a faithful reflection of the
mode of binding in vivo, it is logical to assume that the protein was unfolded, instead
of having to unfold, prior to binding. In addition, several IDPs do not become fully
ordered even in the partner-bound state but remain partially or even fully disordered,
as captured by the concept of fuzziness (see Chapter 14, Section 14.8). Such complexes
argue against the idea that the IDP is folded in the cell.
Directly linked to the binding argument is the fact that certain IDPs can bind sev-
eral different partners in a process termed binding promiscuity (Kriwacki et al. 1996), or
one-to-many signaling (Dunker et al. 2001), in which the IDP may adopt different
structures. Such adaptability (see Chapter 14, Section 14.6) has been reported in the case
of the Cdk inhibitor p21Cip1, which can bind different cyclin-Cdk complexes (Kriwacki
et al. 1996), the C-terminal domain (CTD) of RNA polymerase II (RNAP II), which
can recognize both RNA guanylyl transferase Cgt1 and peptidyl-proline isomerase Pin1
(Fabrega et al. 2003), the HIF-1α interaction domain bound to either the TAZ1 domain
of CBP (Dames et al. 2002) or the asparagine hydroxylase FIH (Elkins et al. 2003),
and Tβ4, which can bind G-actin (Irobi et al. 2004) (see Chapter 11, Figure 11.5) as
well as integrin-linked kinase (ILK) and PINCH (Bock-Marquette et al. 2004). This
level of adaptability can be best interpreted in terms of the disorder of the protein in the
unbound state.
9 Prediction of Disorder
This chapter describes basic bioinformatic approaches to predicting protein disorder
from sequence. Prediction is a classification problem, which can be approached from
three distinct directions: (1) from simple propensities reflecting some basic physical or
sequence features; (2) from machine-learning algorithms, which are trained to recognize
sequences; and (3) from the tendency of amino acids to make or avoid contacts with each
other. The resulting distinctions between predictors are not absolute, because several of
them incorporate more than one of these features. The predictors can be combined
into meta-predictors, and their performance can be compared in a statistically rigorous
way. Their application addresses many structural/functional issues and improves target
prioritization in structural genomics programs.
9.1 General points
It should be made clear that although different predictors are based on different principles
and apply different computational approaches, in one way or another they all rely on
the biased sequence features of intrinsically disordered proteins (IDPs) (see Chapter 10,
Section 10.1), with their basic feature being an enrichment in disorder-promoting amino
acids and depletion in order-promoting amino acids (Dunker et al. 2001). There are about
20 predictors and also some meta-predictors (Table 9.1), which combine the assessment of
several individual servers. An update of links of the predictors is available at the DisProt
Web site (http://www.disprot.org), and the subject is covered in several reviews (Bracken
et al. 2004; Dosztanyi et al. 2007; Ferron et al. 2006).
Table 9.1 Disorder predictors*

Machine-Learning Algorithms
PONDR® (VL-XT) | http://www.PONDR.com/ | NN based on amino acid features | No | (Li et al. 1999)
RONN | http://www.strubi.ox.ac.uk/RONN | Bio-basis function NN | No | (Yang et al. 2005)
DISOPRED2 | http://bioinf.cs.ucl.ac.uk/disopred | SVM, NN for smoothing | Yes | (Ward et al. 2004)
DisEMBL | http://dis.embl.de | Neural network | No | (Linding et al. 2003a)
DISpro | http://www.ics.uci.edu/~baldig/dispro.html | 1-D-recursive NN with profiles | Yes |
Spritz | http://protein.cribi.unipd.it/spritz/ | Two SVMs with non-linear | Yes | (Vullo et al. 2006)

Metaservers
MeDOR | http://www.vazymolo.org/MeDor/ | Graphical output of 12 disorder | No | (Lieutaud et al. 2008)
[Figure 9.1 plot: Absolute Net Charge (y-axis, 0.0 to 0.6) versus Mean Normalized Hydrophobicity (x-axis, 0.2 to 0.6).]
Figure 9.1 Charge-hydropathy plot of protein disorder. Net charge vs. mean hydropho-
bicity is plotted for intrinsically disordered (full diamonds) and ordered (empty circles) pro-
teins. The two sets are separated by a straight line <charge> = 2.743 <hydropathy> – 1.109,
with dashed lines delimiting the zone with a prediction accuracy of 95% for disordered
proteins and 97% for ordered proteins, at the expense of discarding 50% of all proteins.
Reproduced with permission from Oldfield et al. (2005), Biochemistry 44, 1989–2000.
Copyright by the American Chemical Society.
Section 7.3 and Table 7.2). For example, identified E. coli and S. cerevisiae proteins
segregate into three groups: disordered (distance on CH plot: 11; PONDR®%: 58.5%;
apparent MW relative to real MW: 3.0), slightly disordered (3;40%;2.35), and ambivalent
(–0.2;35.2%;2.9), by these distinct characteristics.
The CH-plot was developed into a sequence-specific predictor by calculating the
CH values for a segment of the sequence within a sliding window (Prilusky et al. 2005).
The predictor developed on this principle, FoldIndex (Figure 9.2A), is based on rear-
ranging and optimizing the original formula (Uversky et al. 2000a) in the form
to yield IF, the “Fold Index,” which is used to assign order to residues with positive val-
ues and disorder to residues with negative values.
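
The sliding-window idea can be illustrated with a minimal sketch. It rearranges the boundary line quoted in the caption of Figure 9.1 so that positive scores fall on the ordered side and negative scores on the disordered side. Several points are assumptions rather than details taken from the text: hydropathy is taken as the Kyte-Doolittle scale rescaled to 0–1, net charge is counted from Lys/Arg and Asp/Glu only, windows are truncated at the chain ends, and the coefficients are simply those of the caption (the published FoldIndex coefficients may differ). The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values; rescaled to the 0..1 range below (assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Boundary coefficients as quoted in the Figure 9.1 caption.
SLOPE, INTERCEPT = 2.743, 1.109


def mean_hydropathy(segment):
    """Mean hydropathy of a segment, each residue rescaled to 0..1."""
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in segment) / len(segment)


def abs_net_charge(segment):
    """Absolute mean net charge, counting K/R as +1 and D/E as -1."""
    positive = sum(segment.count(aa) for aa in "KR")
    negative = sum(segment.count(aa) for aa in "DE")
    return abs(positive - negative) / len(segment)


def ch_score(segment):
    """Positive: ordered side of the boundary; negative: disordered side."""
    return SLOPE * mean_hydropathy(segment) - abs_net_charge(segment) - INTERCEPT


def sliding_ch_scores(sequence, window=51):
    """One score per residue from a centered window, truncated at the ends."""
    half = window // 2
    scores = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        scores.append(ch_score(segment))
    return scores
```

A call such as `sliding_ch_scores(sequence, window=51)` mirrors the window length quoted for FoldIndex in Figure 9.2; residues with negative scores would be assigned to disorder under this convention.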
[Figure 9.2, panels A–C. Panel A: score (–0.4 to 0.4) versus residue number (0 to 350). Panel B: score (0.2 to 1.0). Panel C: domain map (TAD, DBD, TD, RD).]
Figure 9.2 Predicted disorder in tumor suppressor p53. (A) FoldIndex score was calcu-
lated within a sliding window of 51 residues. Ordered (light gray) and disordered (dark gray)
regions are color coded (Prilusky et al. 2005). (B) IUPred score was calculated by a window
of 100 residues. A score above the threshold 0.5 is considered disordered (Dosztanyi
et al. 2005a; Dosztanyi et al. 2005b). (C) The domain structure of p53 shows that its TAD,
T(etramerization)D and R(egulatory)D domains are disordered, whereas its DNA-binding
domain (DBD) is ordered (for further details, see text).
only takes into consideration a single parameter, which is optionally either the Russell/
Linding scale (the difference of the propensity of an amino acid to be in “secondary
structure” or “random coil” region) or a scale based on the preference of residues to
be missing in the ATOM records, according to REMARK 465 in the Protein Data Bank
(PDB). The values are integrated along the sequence following Savitzky–Golay smooth-
ing, which also provides first derivative estimates. Putative globular and disordered seg-
ments are selected using a simple peak-finder algorithm, when the first derivative shows
positive (disorder) or negative (order) values over a continuous stretch of amino acids
with a minimum length.
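
The procedure described above (integrate a single propensity along the chain, take a smoothed first derivative, then pick continuous stretches of one sign) can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the propensity values are a hypothetical toy scale (not the Russell/Linding scale), the derivative uses the standard Savitzky-Golay first-derivative weights for a symmetric window with a linear or quadratic fit, indices are clamped at the chain ends, and the run-finding step is a deliberately simple stand-in for the peak finder.

```python
# Hypothetical toy propensities (NOT the Russell/Linding scale):
# positive values favor disorder, negative values favor order.
TOY_PROPENSITY = {"P": 1.0, "S": 0.5, "G": 0.5, "L": -1.0, "I": -1.0, "V": -0.5}


def running_sum(sequence, scale):
    """Integrate per-residue propensities along the sequence."""
    total, curve = 0.0, []
    for aa in sequence:
        total += scale.get(aa, 0.0)  # residues missing from the scale add nothing
        curve.append(total)
    return curve


def smoothed_derivative(curve, half_width=2):
    """Savitzky-Golay first derivative (linear/quadratic fit, symmetric window).

    The weights are k / sum(k^2); indices are clamped at the ends (assumption).
    """
    norm = sum(k * k for k in range(-half_width, half_width + 1))
    last = len(curve) - 1
    deriv = []
    for i in range(len(curve)):
        acc = 0.0
        for k in range(-half_width, half_width + 1):
            j = min(max(i + k, 0), last)
            acc += k * curve[j]
        deriv.append(acc / norm)
    return deriv


def find_segments(deriv, min_len=5):
    """Return (label, start, end) runs, end exclusive, where the derivative
    keeps one sign for at least min_len consecutive residues."""
    segments, start, sign = [], 0, 0
    for i, d in enumerate(deriv + [0.0]):  # trailing sentinel closes the last run
        s = (d > 0) - (d < 0)
        if s != sign:
            if sign != 0 and i - start >= min_len:
                segments.append(("disorder" if sign > 0 else "order", start, i))
            start, sign = i, s
    return segments
```

Working on the derivative rather than on the raw propensities means that a segment is reported only where the integrated curve rises or falls consistently, which suppresses isolated residues of the opposite type.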
To quantify these two properties, amino acid distributions in disordered (residues missing
from PDB) and ordered (ensemble of PDB structures) data sets were computed, and
the distance to the nearest hydrophobic cluster was calculated by automated hydrophobic
cluster analysis (HCA). Disorder prediction is based on amino acid composition and
maximal distance from the nearest hydrophobic cluster.
[Figure 9.3 plot: Estimated Energy/Residue (y-axis, –2.0 to 1.0) versus Length (x-axis, 0 to 800).]
Figure 9.3 Pair-wise interresidue interaction energies of globular and disordered pro-
teins. The total pair-wise interresidue interaction energy of globular proteins (inverse tri-
angles) and disordered proteins (circles) is estimated from their amino acid composition
and plotted as a function of the length of their polypeptide chains. Values that are more
negative represent more stabilization due to amino acid interactions. The formula gen-
erating these energies forms the basis of the IUPred algorithm. Reproduced with permis-
sion from Dosztanyi et al. (2005), J. Mol. Biol. 347, 827–39. Copyright by Elsevier Inc.
easy to define. Either a length threshold (usually 30 amino acids) or a definition that
short regions are the ones missing from PDB, whereas long ones are identified by other
methods, is applied. Distinction between the two categories, nevertheless, is justified
by significant differences in their amino-acid composition (Peng et al. 2006) and the
existence of sequence clues of disorder located outside the region (see also Chapter 10,
Section 10.3.3). This latter situation is underscored by a “twilight” zone between order
and disorder (Szilagyi, Gyorffy, and Zavodszky 2008) and the difference in the perfor-
mance of predictors on short versus long regions of disorder, with short regions usually
being predicted less accurately (Bordoli, Kiefer, and Schwede 2007; Jin and Dunbrack
2005; Melamud and Moult 2003). To handle this problem, predictors have been devel-
oped, which predict short and long regions separately and combine the results on dis-
order afterward.
The problem was addressed first by the development of VSL1 (Obradovic et al.
2005), which consists of three component predictors, each as an ensemble of logistic
regression models, in a two-level architecture. At the first level, there are two special-
ized predictors for predicting long (VSL1-L, for >30 residues) and short (VSL1-S, for
≤30 residues) regions of disorder. VSL1-L was trained on DisProt sequences (112 amino
acids in length on the average), whereas VSL1-S was trained on regions missing from
PDB structures (10 amino acids in length on the average). At the second level, VSL1-M
(a meta-predictor) integrates outputs of the two predictors. In all three predictions, vari-
ous attributes such as amino acid frequency, “spacer” frequency, K2-entropy, charge-
hydrophobicity ratio, flexibility index, PSI-BLAST profiles, and predicted secondary
structure are included. The algorithm was developed further as VSL2 (Peng et al. 2006).
Its component predictors VSL2-S and VSL2-L are SVMs with a linear kernel, whereas
for integrating data of the two, an optimized meta-predictor (VSL2-M2) is used. This
latter is also a linear SVM that uses neighboring predictions of VSL2-S and VSL2-L
as inputs.
Dataset-dependent prediction of disorder is also the defining theme of the Spritz
algorithm (Vullo et al. 2006). Spritz includes two SVM predictors: one trained on a
dataset of long disorder taken from DisProt and the other trained on a dataset of short
disorder taken from the PDB. The two binary classifiers are both implemented with a
non-linear Gaussian kernel, and unbalanced class frequencies are mitigated by using
asymmetric costs (i.e., a larger penalty for disorder misclassification). The two classi-
fiers use residue attributes such as amino-acid frequencies computed from PSI-BLAST
multiple alignments combined with secondary structure prediction.
9.6 Combination of predictors: meta-servers
Different predictors rely on different principles and/or algorithms, each having strengths
and weaknesses. As discussed in Section 9.8, there is no universal solution to compar-
ing them and establishing the “best” predictor. Thus, it is recommended to compare
predictions by different algorithms based on different physical and/or computational
principles and seek a consensus of their scores. This practical point of view called meta-
servers into existence, which either simply help carry out numerous parallel predictions
or, in a more sophisticated way, integrate several outputs to produce a consensus by
some predefined criterion.
A simple tool to speed up the process of disorder prediction is MeDor, which sub-
mits the query sequence to several servers simultaneously and provides a graphical
output of disorder scores (Lieutaud, Canard, and Longhi 2008). Besides 12 different
disorder predictions (e.g., different versions of IUPred, RONN, FoldUnfold, DisEMBL,
FoldIndex, GlobPlot, DISPROT, and Phobius), it also adds the prediction of secondary
structure and hydrophobic clusters.
A consensus is sought by metaPRDOS (Ishida and Kinoshita 2008), which sends
the query sequence to seven individual predictors (PrDOS, DISOPRED2, DisEMBL,
DISPROT, DISPro, IUPred, and POODLE-S), and compares their predictions. Because
sensitivity and scaling of the component predictors differ, metaPRDOS does not use
simple averaging, but takes the individual scores as a seven-element input vector to an
SVM trained on a dataset of disordered proteins.
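
Why plain averaging is avoided can be seen in a much simpler consensus scheme, sketched below. It is not the metaPRDOS method (which feeds the scores to a trained SVM); instead, each predictor's per-residue score is first converted to a yes/no call with that predictor's own cutoff, and the calls are then combined by majority vote. The function name, the predictor names, and the cutoffs are hypothetical.

```python
def consensus_disorder(score_tracks, thresholds, min_votes=None):
    """Majority-vote consensus over several per-residue disorder scores.

    score_tracks: {predictor name: [score for each residue]}
    thresholds:   {predictor name: cutoff above which a residue is called disordered}
    min_votes:    votes needed for a consensus call (default: strict majority)
    """
    names = list(score_tracks)
    length = len(score_tracks[names[0]])
    if any(len(score_tracks[n]) != length for n in names):
        raise ValueError("all predictors must score the same residues")
    if min_votes is None:
        min_votes = len(names) // 2 + 1
    calls = []
    for i in range(length):
        # Binarize with each predictor's own cutoff before counting votes,
        # so that differently scaled scores are never mixed directly.
        votes = sum(score_tracks[n][i] >= thresholds[n] for n in names)
        calls.append(votes >= min_votes)
    return calls


# Hypothetical scores for three residues from three made-up predictors.
tracks = {"pred_a": [0.9, 0.2, 0.6], "pred_b": [0.7, 0.1, 0.4], "pred_c": [0.1, 0.0, 0.3]}
cutoffs = {"pred_a": 0.5, "pred_b": 0.5, "pred_c": 0.2}
calls = consensus_disorder(tracks, cutoffs)
```

Voting discards the magnitude of each score, which a trained combiner can exploit; the sketch only shows how differences in scaling can be neutralized before scores are combined.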
Figure 9.4 Prediction of the binding region of an IDP by PONDR® VL-XT. A binding region
within an IDP can be recognized as a downward spike on the disorder score generated by
PONDR® VL-XT. (A) The structure of the complex (pdb 1ej4) of the eukaryotic translation
initiation factor 4E (eIF4E, light gray), and 4E binding protein (4E-BP1, dark gray). (B) The
binding region of completely disordered 4E-BP1 is outlined by PONDR® VL-XT, which has a
downward spike at the location of the binding site (marked by a horizontal bar). Reproduced
with permission from Oldfield et al. (2005), Biochemistry 44, 12454–70. Copyright by the
American Chemical Society.
elements in IDPs/IDRs are also captured by other related concepts, such as eukary-
otic linear motifs (ELMs), preformed structural elements (PSEs), short linear motifs
(SLiMs), and primary contact sites (PCSs)—the recognition of which is achieved by
different bioinformatic approaches, such as SLiMDisc and DILIMOT (see Chapter
14, Section 14.2.2).
[Figure 9.5 plot: Hit Rate (y-axis, 0 to 0.9) versus False Alarm (x-axis, 0 to 1).]
Figure 9.5 ROC curve of disorder prediction. True positive (hit rate) of disorder predic-
tion is plotted against false positive (false alarm) rate to give ROC curves for predictions
by the DISOPRED algorithm applied in two different modes, including (solid curve) or not
including (dashed curve) information on the structure of homologs of known 3-D structure.
The result expected for a completely random predictor is also shown as a solid diagonal
line. Reproduced with permission from Jones and Ward (2003), Proteins 53, S6, 573–8.
Copyright by Wiley-Liss.
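\[
S_{\mathrm{sens}} = \frac{N_{TP}}{N_{TP} + N_{FN}} = \frac{N_{TP}}{N_{\mathrm{disorder}}} \tag{9.3}
\]

\[
S_{\mathrm{spec}} = \frac{N_{TN}}{N_{TN} + N_{FP}} = \frac{N_{TN}}{N_{\mathrm{order}}} \tag{9.4}
\]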
where Ndisorder and Norder (abbreviated Nd and No below) are the total numbers of residues
observed as disordered and ordered, respectively. A predictor performs better if it has both
higher sensitivity and specificity (Jin and Dunbrack 2005; Melamud and Moult 2003), which is
sometimes assessed by calculating their product or combining them into an overall accuracy (ACC):
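\[
S_{\mathrm{product}} = S_{\mathrm{sens}} S_{\mathrm{spec}} = \frac{N_{TP} N_{TN}}{N_d N_o} \tag{9.5}
\]

\[
\mathrm{ACC} = \frac{S_{\mathrm{sens}} + S_{\mathrm{spec}}}{2} \tag{9.6}
\]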
Another measure of performance is the overall percentage of accuracy (Q2), which
is also often used in evaluating secondary structure predictors (Jin and Dunbrack
2005):
\[
Q_2 = \frac{N_{TP} + N_{TN}}{N_{TP} + N_{FP} + N_{TN} + N_{FN}} \tag{9.7}
\]
which, however, suffers from unbalanced class frequencies: because ordered residues
outnumber disordered ones in typical datasets, a method predicting all residues as ordered
would probably have the highest Q2 accuracy. Thus, schemes that reward correct prediction of
disordered residues over correct prediction of ordered residues, such as the weighted score (SW):
\[
S_W = \frac{W_d N_{TP} - W_o N_{FP} + W_o N_{TN} - W_d N_{FN}}{W_d N_d + W_o N_o} \tag{9.8}
\]
where Wd and Wo are weights assigned to experimentally defined disordered and ordered
residues, and the Matthews correlation coefficient (SMCC):
\[
S_{\mathrm{MCC}} = \frac{N_{TP} N_{TN} - N_{FP} N_{FN}}{\sqrt{(N_{TP} + N_{FP})(N_{TP} + N_{FN})(N_{TN} + N_{FP})(N_{TN} + N_{FN})}} \tag{9.9}
\]
offer a more reasonable and fair comparison of methods. Depending on the underlying
datasets and criteria of comparison, different predictors have different performance,
with the top ones being PONDR®, PrDOS, DISPro, IUPred and DISOPRED2 (CASP6)
and PONDR®-VSL2, CBRC-DR, DISOPRED2, DisPRO, and GeneSilicoMetaServer
(CASP7). As a final note, it should be stressed that it is very difficult and maybe imprac-
tical to search for the “best” predictor of disorder, because any predictor can be sensi-
tive to certain types of disorder but less sensitive to others. Thus, it is recommended that
several algorithms be tried, and their results combined with caution.
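
The measures in Equations 9.3 to 9.9 can be computed from the four residue counts, as in the minimal sketch below. The Matthews coefficient is written in its standard form, with a square root in the denominator. The function names and the example weights (Wd = 9, Wo = 1) are illustrative assumptions, and the sketch assumes that both disordered and ordered residues occur in the reference data.

```python
from math import sqrt


def confusion_counts(observed, predicted):
    """Count TP, TN, FP, FN for per-residue labels (True = disordered)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    return tp, tn, fp, fn


def performance(tp, tn, fp, fn, w_d=1.0, w_o=1.0):
    """Evaluate Equations 9.3-9.9 from the four counts."""
    n_d, n_o = tp + fn, tn + fp  # observed disordered / ordered residues
    if n_d == 0 or n_o == 0:
        raise ValueError("both classes must be present in the reference data")
    sens = tp / n_d
    spec = tn / n_o
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sens": sens,
        "spec": spec,
        "product": sens * spec,
        "acc": (sens + spec) / 2,
        "q2": (tp + tn) / (tp + fp + tn + fn),
        "sw": (w_d * tp - w_o * fp + w_o * tn - w_d * fn) / (w_d * n_d + w_o * n_o),
        # MCC is undefined when a row or column of the table is empty; report 0.
        "mcc": (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0,
    }


# A predictor that calls every residue ordered, on 10 disordered + 90 ordered residues.
all_ordered = performance(tp=0, tn=90, fp=0, fn=10, w_d=9.0, w_o=1.0)
```

In this example Q2 is 0.9 although no disordered residue is found, whereas ACC is 0.5 and both the weighted score and the Matthews coefficient are 0, which illustrates why the balanced measures are preferred.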
[Figure 9.6, panels A and B. Panel A: % of Targets by pipeline stage (Cloned, Expressed, Soluble, Purified, Crystallized, In PDB); bar labels 77323, 45259, 22868, 19150, 6560, 3560. Panel B: % of Targets with Disorder for the same stages; bar labels 13006, 7031, 3119, 1949, 424, 118.]
Figure 9.6 The number of targets and the percentage of proteins with long disordered
regions. (A) The number and percentage of targets in the TargetDB database at various
stages of the structure solution pipeline. (B) The number and percentage of proteins with
an IDR ≥30 residues in length, as determined by the IUPred algorithm. For each bar, the
number of proteins in the given stage was used as the 100%. Reproduced with permission
from Dosztanyi et al. (2007), Curr. Protein Pept. Sci. 8, 161–71. Copyright Bentham Science
Publishers Ltd.
of the targets contain at least one IDR ≥30 residues in the initial stage of cloning,
whereas in the final stage of solved structures this value is only 3.5% (Figure 9.6).
Thus, disorder correlates negatively with the success of solving novel structures—
most significantly at the stages of crystallization and actual solution of structure.
10 Structure of IDPs
This chapter is central to the theme of the book, surveying ideas about the structure of
intrinsically disordered proteins (IDPs). Because an IDP by definition has an ensem-
ble of structures that differ for each individual protein, the “structure” of IDPs cannot
be described in general. Rather, the general principles can be outlined, and it can be
shown how these appear in some of the best characterized examples. As we will see,
the description entails three different levels: the primary (sequence), secondary (local),
and tertiary (global) structure. Often, the basic structural concepts are the same as in
the case of globular proteins (see Chapter 1), and particularly close analogies apply with
their denatured states.
[Figure 10.1 plot: y-axis (Dataset - Globular-3D)/Globular-3D; legend entry DisProt 1.0 (2004).]
Figure 10.1 Amino acid composition of disordered proteins. The differences between
the amino acid compositions of disordered datasets (DisProt 1.0 and DisProt 3.4) and that
of an ordered dataset were plotted as a function of the B-factor estimates of flexibility
of residues. There is a tendency for IDPs to be depleted in rigid (order-promoting) amino
acids and enriched in the more flexible (disorder promoting) amino acids. Reproduced with
permission from Dunker et al. (2008), BMC Genomics 9, S2, S1. Copyright by the BioMed
Central Ltd.
Another underlying cause of this biased composition may come from the need to
avoid amyloid formation (Tompa 2002). The ease of transition of the open structure of
IDPs to a β-strand poses the danger of amyloidosis (Conway, Harper, and Lansbury
2000; von Bergen et al. 2000), as outlined in detail in Chapter 15. A likely selective
force in the evolution of IDPs is to minimize this threat (Monsellier and Chiti 2007),
as also manifested in the strong general correlation of β-strand-forming potential with
order–disorder discrimination (Williams et al. 2001) (i.e., a negative correlation of
β-aggregation tendency and disorder) (Linding et al. 2004).
The correlation of particular amino acid features with disorder supports these conclusions
(Williams et al. 2001). Of 265 features analyzed, the 10 that best discriminate
between order and disorder comprise 2 contact scales, 4 hydrophobicity
scales, 3 β-sheet propensity scales, and 1 polarity scale. Certain features are enriched
in the ordered dataset (contact scales, hydrophobicity, and β-sheet propensity), whereas
others are enriched in the disordered dataset (polarity), which is in agreement with the
functional requirement for an open structure that is capable of avoiding aggregation.
In accord with the foregoing points, ordered and disordered proteins can be dis-
criminated by a reduced amino acid alphabet (Weathers et al. 2004), because the accu-
racy of IDP prediction by a support vector machine (SVM) is largely preserved when
amino acids are clustered by chemical similarity (87% for 20 amino acids vs. 84% for
4 groups of amino acids). The weights associated with these four vectors (i.e., Phe-Trp-
Tyr, Cys-Ile-Leu-Met-Val, Ala-Gly-Pro-Ser-Thr, and Asp-Glu-His-Lys-Asn-Gln-Arg)
underscore the notion that simple general physicochemical properties and the avoidance
of aggregate formation are critical in defining protein disorder.
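
The four-group alphabet quoted above can be turned into a feature vector with a few lines of code, as in the sketch below. It only shows the encoding step (mapping residues to groups and computing group frequencies); the SVM itself is not reproduced, and the function names are hypothetical.

```python
# The four residue groups listed in the text, as one-letter codes.
GROUPS = ["FWY", "CILMV", "AGPST", "DEHKNQR"]

# Lookup from residue to group index (0..3).
RESIDUE_TO_GROUP = {aa: idx for idx, members in enumerate(GROUPS) for aa in members}


def reduce_sequence(sequence):
    """Translate a sequence into group indices, skipping non-standard residues."""
    return [RESIDUE_TO_GROUP[aa] for aa in sequence if aa in RESIDUE_TO_GROUP]


def group_composition(sequence):
    """Four-element vector of group frequencies for a sequence or window."""
    reduced = reduce_sequence(sequence)
    if not reduced:
        raise ValueError("sequence contains no standard residues")
    return [reduced.count(idx) / len(reduced) for idx in range(len(GROUPS))]
```

A classifier trained on such four-element vectors sees only the aromatic, aliphatic/sulfur-containing, small, and polar/charged content of a segment, which is the level of description the text refers to.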
[Figure plot: log20 (frequency) versus Complexity K1 (x-axis, 0.0 to 0.8), shown separately for SwissProt and PDB.]
complexity are required for defining a domain. They found that proteins in SwissProt
cover the entire possible range of alphabet size (1–20) and informational entropy range
(K = 0.0 – 4.5), whereas globular domains only occupy a restricted region of values
(alphabet = 10 – 20, K = 3.0 – 4.2). Regions of lower values (down to alphabet size = 3
and K = 1.5) correspond to fibrous structured proteins, such as coiled coils, collagens
and fibroins. It was concluded that a minimal alphabet size of 10 and entropy around
2.9 are necessary and sufficient to define a sequence that can fold into a globular
structure. Although complexity distributions of IDPs/intrinsically disordered regions (IDRs)
are shifted to values lower than those of ordered proteins, there is a significant overlap
(Romero et al. 2001), because disordered proteins cover the range K = 2 – 4.2,
occasionally exhibiting values as low as K = 1.0. Thus, disorder and low complexity are
related but distinct phenomena.
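
The two quantities used in this comparison, alphabet size and informational entropy, are straightforward to compute for a sequence or a window, as sketched below. The sketch assumes that K denotes the Shannon entropy of the residue distribution in bits; the window length and function names are arbitrary illustrative choices.

```python
from collections import Counter
from math import log2


def alphabet_size(segment):
    """Number of distinct residue types in the segment."""
    return len(set(segment))


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue distribution in the segment."""
    n = len(segment)
    return -sum((c / n) * log2(c / n) for c in Counter(segment).values())


def windowed_entropy(sequence, window=45):
    """Entropy of every full-length window along the sequence."""
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
```

With this definition a homopolymer has an entropy of 0, and a segment using all 20 amino acids equally has an entropy of log2(20), about 4.32 bits.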
Although amino acid composition and local sequence features are the primary
determinants of disorder, several points suggest the importance of higher-level sequence
attributes. For example, machine-learning predictors trained on sequences outperform
simple propensity-based predictors (Chapter 9), and there is a twilight zone between
order and disorder (Section 10.3.3). In the case of short segments, for instance, disorder
is encoded not only by local composition but also by sequence and environment (i.e.,
context) (Szilagyi et al. 2008). A critical increase in the amount of data on IDP sequences
is required to enable the exploration of such higher-order features.
(CTD) of RNA polymerase II (RNAP II) (Bienkiewicz, Woody, and Woody 2000). The
characteristic signature of PPII in the CD spectrum, a positive peak at 217 nm, however,
is obscured by large negative contributions from the α-helix and β-strand, which limits
the insight provided by CD. Conclusive and quantifiable data on PPII conformation in
IDPs is only provided by Raman optical activity (ROA) (see Chapter 5, Section 5.5),
which shows the prevalence of PPII in several IDPs, such as casein, α-synuclein, and
tau protein (Syme et al. 2002); some wheat gluten proteins (Blanch et al. 2003); and in
Ala-repeat oligopeptides (McColl et al. 2004).
The importance of PPII helix conformation also derives from its probable involve-
ment in conformational diseases (Blanch et al. 2000). It is suggested that disorder of
the PPII type may enable the formation of regular fibrils, whereas more dynamic or
random coil-type disorder may instead lead to amorphous aggregates. In accord, PPII
conformation appears to dominate in the partially unfolded states of amyloidogenic
globular proteins (Blanch et al. 2000), the neurodegenerative IDPs α-synuclein and
tau protein (Syme et al. 2002), and repetitive amyloidogenic peptides/proteins, such as
polyQ stretches (Chellgren, Miller, and Creamer 2006), and oligoAla peptides (Chen,
Liu, and Kallenbach 2004; Shi et al. 2002).
10.2.3.1 p27Kip1
The Cdk-inhibitor p27Kip1 is one of the best characterized IDPs and is detailed for its
function and involvement in disease in Chapter 15, Section 15.1.3. Its nonrandom solu-
tion state characterized by NMR provides one of the most insightful examples of the
transient organization of an IDP, also reflecting its partner-bound state. Its backbone
Hα, Cα, and amide-N chemical shift index (CSI) values suggest extensive deviations
from the random-coil behavior (Lacy et al. 2004). Most notably, positive values in the
D37–K59 region are consistent with transient local helical conformations within a seg-
ment termed linker helix (LH; for domain definitions, see Chapter 3, Section 3.7.2).
These results were corroborated and extended by a combination of NMR studies
and MD simulations (Sivakolundu, Bashford, and Kriwacki 2005), which provided a
detailed, nearly high-resolution description of the structural ensemble of
the kinase-inhibitory domain (KID) of p27Kip1 in the unbound state (Figure 10.3).
In the KID domain, no nuclear Overhauser effect (NOE) can be observed in the
specificity-determining domain 1, whereas proximity of sequential residues results in
Table 10.1 Secondary structural elements in IDPs*

Protein | DisProt | Length | Disordered region | Secondary structure observed (residues) | Reference
p27 Kip1 | DP00018 | 198 | 1–198 | α-helix (38–60) | (Lacy et al. 2004)
 | | | | α-helix (37–59), β-hairpin (65–75), single turn of helix (87–90) | (Sivakolundu et al. 2005)
α-synuclein | DP00070 | 140 | 1–140 | α-helix (18–34), partial α-helix (1–100), possible β-turn (C-terminal region) | (Bussell and Eliezer 2001)
Potassium channel shaker | DP00267 | 401 | 1–62 | α-helix (2–10, 44–52, 56–61) | (Wissmann et al. 1999)
Tau protein F | DP00126 | 441 | 1–441 | β-strand (274–284, 305–315, 336–345) | (Mukrasch et al. 2007a)
Stem-loop binding protein, SLBP | DP00144 | 276 | 1–175 | α-helix (28–45, 50–57, 66–75, 91–96) | (Thapar et al. 2004)
CREB | DP00080 | 341 | 1–265 | α-helix (119–130) | (Hua et al. 1998)
 | | | | α-helix (120–129, 134–144) | (Radhakrishnan et al. 1998)
FlgM | DP00027 | 97 | 1–97 | α-helix (60–73, 83–90) | (Daughdrill et al. 1998)
p53 | DP00086 | 393 | 1–73 | α-helix (18–24), mixture of α-helix, β-strand and | (Vise et al. 2005)
* IDPs are listed for which deviations of NMR observables from random coil values allowed the characterization of local structural preferences in the solution
state. DisProt number, total length of the protein, the region known to be disordered, and the region of transient secondary structure are shown.
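[Figure 10.3, panels A and B. Panel A: p27-KID with Domain 1 (D1), Domain LH, and Domain 2 (D2; subdomains D2.1, D2.2, D2.3) labeled. Panel B: structures at 30 ns, 45 ns, 85 ns, and 99 ns, with N and C termini marked.]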
Figure 10.3 Structure of the KID domain of p27Kip1 in the bound and free states.
(A) The structure of p27-KID bound to the Cyclin A-Cdk2 complex (Russo et al. 1996). (B)
The structure of unbound p27-KID was characterized by MD simulations restrained by NMR
NOE distance values. The MD trajectory from 14 ns to 100 ns was analyzed by the diction-
ary of protein secondary structure (DSSP) algorithm for secondary structure, and respective
samples are shown. Reproduced with permission from Sivakolundu et al. (2005), J. Mol. Biol.
353, 1118–28. Copyright by Elsevier Inc.
observable correlations for residues 38–60 (domain LH), residues 62–70 within domain
2 (domain 2.1), 74–80 within domain 2 (domain 2.2), and 86–90 within domain 2
(domain 2.3, see Figure 10.3). To extract details, NOE-constrained MD simulations
were carried out with starting coordinates taken from the crystal structure (Russo et al.
1996). The trajectory confirms the previously characterized helical conformation in LH
and also reveals significantly populated, locally folded conformations in several other regions,
such as a short antiparallel β-sheet in region 62–70 and a transient α-helix at 86–90.
These regions, termed intrinsically folded structural units (IFSUs), closely match the
structural features of bound p27-KID.
10.2.3.3 Tau protein
Interest in the microtubule-associated protein tau derives primarily from its involve-
ment in Alzheimer’s disease, where it forms aggregates termed paired helical filaments
(PHFs) deposited as neurofibrillary tangles in the neurons affected by the disease (see
Chapter 15, Section 15.3.1). Tau contains a C-terminal repeat domain (i.e., TBD), which binds
microtubules (MT) and promotes MT assembly. TBD has 31-amino acid long micro-
tubule-binding repeats (MTBRs, R1 through R4) (Buee et al. 2000; Mandelkow et al.
1996). A reasonably full assignment of tau of 441 amino acids could be achieved by a
combination of several NMR procedures (see Chapter 6, Section 6.2.4).
Based on this assignment, Cα CSI values suggest a distinct pattern of small but
significant deviations from random-coil values in MTBR. Several continuous stretches
(containing 7–11 residues) with negative values can be observed, in particular Lys274 –
Leu284 (R1/R2), Ser305–Asp315 (R2/R3), and Gln336 –Asp345 (R3/R4). The values indicate
a propensity for β-conformations populated 22%, 25%, and 19% of the time for the three
regions. This local structural element is located at the beginning of repeat units R2, R3,
and R4, but it is absent from R1, most likely because of the Pro residues present
there. This structural interpretation is supported by additional residual dipolar
coupling (RDC), 3JHNα, and NOE measurements (Mukrasch et al. 2007a), which also
suggest stable, highly populated β-turn elements immediately following
the β-strand regions, and probably also at the end of the repeats.
10.2.3.5 α-synuclein
α-synuclein (NACP) is involved in Parkinson’s disease and other synucleinopathies
(Chapter 15, Section 15.3.2). It is also not fully random, but contains elements of tran-
sient local order, most notably a short helical segment in the N-terminal region (residues
18–34) (Bussell and Eliezer 2001; Eliezer et al. 2001). Because this region is involved
in membrane-association of the protein, this transient structure may be important for
its physiological function and possibly also for its pathological aggregation. It should be
noted that long-range structural order was also identified in α-synuclein (see Chapter 10,
Section 10.4.2).
10.2.3.6 p53
p53 is a tumor suppressor gene product involved in the regulation of DNA repair and
apoptosis (see Chapter 15, Section 15.1.2) (Joerger and Fersht 2008; Levine 1997). Its
N-terminal TAD is disordered by CD (Bell et al. 2002), ultraviolet (UV) spectros-
copy, gel filtration (GF), and NMR (Dawson et al. 2003), whereas its full structural
characterization by a combination of small-angle X-ray scattering (SAXS), NMR, and
MD is one of the triumphs of IDP research (see Chapter 4, Section 4.4.3, Chapter 15,
Figure 15.2, and cover picture). Thorough NMR analyses based on CSI values, RDC
constants, amide-N relaxation rates and NOEs suggested that TAD has a transient
α-helix in the region 18–24. This region is the binding site of murine double minute 2
(MDM2) (Kussie et al. 1996), the regulatory protein (see Chapter 12, Section
12.6.2.2) that plays a key role in the regulation of p53 (Lee et al. 2000; Vise et al.
2005; Wells et al. 2008).
10.2.3.7 Calpastatin
Calpastatin is the inhibitor of calpain, the calcium-activated intracellular cysteine
protease involved in various physiological and pathological processes (Wendt et
al. 2004). Full assignment of 121 out of its 126 non-Pro residues (see Chapter 6,
Section 6.2.4 and Figure 6.2) enabled its detailed structural characterization in
the solution state (Kiss et al. 2008b). CSI values, amide-N relaxation rates, and
heteronuclear NOE values indicate that the conserved subdomains A (Ser12–Gly30)
and C (Ser87–Cys105) sample partially helical backbone conformations, whereas the
primary determinant of inhibition, subdomain B (Met50–Arg70), also shows local
conformational preferences that are not fully random (see Chapter 13, Figure 13.4). As shown
by the X-ray structure of the calpain–calpastatin complex (Moldoveanu, Gehring,
and Green 2008), the helical regions bind calpain in a calcium-dependent manner,
whereas the turn region directly inhibits the enzyme.
10 • Structure of IDPs 133
[Figure 10.4 graphic: four histogram panels; x-axis: Secondary Structure (%), 0–100; y-axis ticks: 0–60]
Figure 10.4 Secondary structure distribution of IDPs in the bound state. Distribution
of the residues of IDPs in complex with their partner (light gray) and those of refer-
ence globular proteins (dark gray) in helix, extended, turn, and coil conformations.
Reproduced with permission from Fuxreiter et al. (2004), J. Mol. Biol. 338, 1015–26.
Copyright by Elsevier Inc.
134 Structure and Function of Intrinsically Disordered Proteins
sequences. About 90% of sequences fall into the “twilight zone” of dubious identity
for chains less than 50 amino acids in length, but only 25% for chains longer than 300
amino acids.
[Figure 10.5 graphic. (A) Ordered, Molten Globule, Random Coil. (B) Ordered, Molten Globule, Pre-Molten Globule, Random Coil.]
Figure 10.5 The protein trinity and protein quartet models of the structure-function
relationship. These models provide a conceptual framework to extend the classical struc-
ture-function paradigm by suggesting that different proteins can exist in any of three
(random coil, MG, and globular) or four (random coil, PMG, MG, and globular) structural
states. Function can arise from any of the states or transitions between them. Reproduced
with permission from Dunker et al. (2001), J. Mol. Graphics Modelling 19, 26–59, copy-
right by Elsevier Inc., and Uversky (2002) Protein Sci. 11, 739–56, copyright by the Protein
Society.
paramagnetic resonance (EPR) was used to elucidate the spatial relationship of domains
and global folding state (Jeganathan et al. 2006). FRET pairs were created by engineered
Trp residues (donors) and covalently attached IAEDANS molecules (acceptors). The
observed FRET distances were found to significantly differ from the values expected for
a random coil (Figure 10.6), suggesting that tau folds back so that its C-terminal end is
in the vicinity of TBD, whereas the N-terminus remains outside the FRET distance of
TBD, yet it approaches the other end of the molecule. The average distance between the
C-terminal end and the TBD is about 19–23 Å, which is significantly shorter than the
value predicted for the random coil ensemble (87.7–99.3 Å). The two ends of the mole-
cule are about 20.8–24.2 Å apart, as opposed to the theoretical value of 170 Å. Gnd-HCl
abolishes these interactions, corroborating non-random structural features in tau. FRET
between green fluorescent protein pairs (GFPs)—CyPet and YPet—was also used to
estimate the end-to-end distributions of IDPs of various length, such as the charged-
plus-PQ domain of ZipA, the tail domain of α-adducin, and the C-terminal tail domain
of FtsZ (Ohashi et al. 2007). Constructs of similar length give different FRET intensities.
For example, the N-terminal 33 amino acids of the charged domain of ZipA give
strong FRET signals, whereas the C-terminal 33 amino acids of the PQ domain of ZipA
give only moderate FRET signals. Thus, variations of the donor–acceptor distance
calculated from FRET efficiency suggest variable stiffness (persistence length), which is
abolished by 6 M urea (i.e., the structural ensemble is more compact than a random coil).
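The conversion between FRET efficiency and donor–acceptor distance that underlies such measurements follows the Förster relation, E = 1/(1 + (r/R0)^6). The sketch below is a minimal illustration of that relation; the Förster radius of 22 Å and the function names are assumptions chosen for the example and are not values reported in the cited studies.

```python
def fret_efficiency(r, r0):
    """Förster transfer efficiency for donor-acceptor distance r and Förster radius r0."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def fret_distance(efficiency, r0):
    """Invert the Förster relation to obtain the apparent distance from an efficiency."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

R0 = 22.0  # Å; assumed Förster radius, for illustration only

# Distances (Å) of the order quoted in the text for compact and random-coil models
for r in (20.0, 90.0, 170.0):
    print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r, R0):.6f}")
```

Because of the sixth-power dependence, efficiency falls off steeply beyond R0, which is why distances expected for a random coil of this size would give essentially no transfer, whereas distances near 20 Å give a readily measurable signal.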
Hydrodynamic measures can also be obtained by MD simulations, which are
constrained by distance restraints derived from long-range paramagnetic relaxation
enhancement (PRE) measurements (Dedmon et al. 2005). This approach suggests that the RG
probability distribution of α-synuclein has a mean RG of 24.7 Å and an average RH of
[Figure 10.6 graphic: repeats R1–R4; labeled positions 17/18 (N), 291, 310, 322, and 432 (C); distances of 19 Å and 23 Å marked]
Figure 10.6 Global hairpin folding of tau in solution. Long-range intramolecular inter-
actions within tau protein were assessed by FRET. The tubulin-binding repeats within TBD
are marked R1 through R4. The positions of residues to which fluorescent labels were
attached are indicated by the numbers in ovals. Major features of the global fold are that
the C-terminus folds in the vicinity of the repeat domain, and it is also within FRET distance
from the N-terminus, whereas this latter stays away from the repeats. Reproduced with
permission from Jeganathan et al. (2006), Biochemistry 45, 2283–93. Copyright by the
American Chemical Society.
[Figure 10.7 graphic: Probability Density (0–0.1) versus Rg (Å), 20–70]
Figure 10.7 Structural ensemble of α-synuclein. Radius of gyration (RG) probability dis-
tributions were calculated by MD simulations incorporating PRE-derived distance restraints
for native (black trace) and random coil (gray trace) models of α-synuclein. Representative
structures are indicated with arrows pointing to their corresponding RG values. Reproduced
with permission from Dedmon et al. (2005), J. Am. Chem. Soc. 127, 476–7. Copyright by
the American Chemical Society.
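For each conformer of an ensemble, the radius of gyration is the root-mean-square distance of its atoms from their centroid, and a distribution such as the one in Figure 10.7 is obtained by binning these values. A minimal sketch follows, assuming equally weighted points (e.g., Cα positions) and hypothetical coordinates; the helper names and bin settings are illustrative and do not reproduce the procedure of Dedmon et al.

```python
import math

def radius_of_gyration(coords):
    """Root-mean-square distance of equally weighted points from their centroid."""
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    mean_sq = sum(
        sum((p[i] - centroid[i]) ** 2 for i in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)

def probability_density(values, bin_width=10.0, start=20.0, stop=70.0):
    """Normalized density over fixed-width bins; values outside the range are ignored."""
    n_bins = int((stop - start) / bin_width)
    counts = [0] * n_bins
    for v in values:
        if start <= v < stop:
            counts[int((v - start) // bin_width)] += 1
    total = sum(counts)
    return [c / (total * bin_width) if total else 0.0 for c in counts]

# Hypothetical three-point conformers (coordinates in Å), for illustration only
ensemble = [
    [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (60.0, 0.0, 0.0)],   # extended
    [(0.0, 0.0, 0.0), (30.0, 0.0, 0.0), (30.0, 30.0, 0.0)],  # bent
]
rg_values = [radius_of_gyration(conformer) for conformer in ensemble]
print([round(rg, 1) for rg in rg_values])
print(probability_density(rg_values))
```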
protein (Jeganathan et al. 2006) showed high local mobility, with effective rotational
correlation times of 0.2–0.6 ns. In the case of an unfolded globular protein, denatured
barstar (Saxena et al. 2006), fluorescence anisotropy decay analysis showed side-chain
dynamics with time constants of 0.2–0.4 ns and somewhat slower local segmental motions
on the order of 1–3 ns.
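Motions on two time scales of this kind are commonly described by a multi-exponential anisotropy decay, r(t) = r0 Σ βi exp(−t/φi). The sketch below evaluates such a decay for a fast and a slow component; the initial anisotropy, the amplitudes, and the chosen correlation times are assumptions for illustration, not fitted values from the cited work.

```python
import math

def anisotropy(t, r0, components):
    """Multi-exponential anisotropy decay r(t) = r0 * sum(beta_i * exp(-t / phi_i)).

    components: list of (amplitude beta_i, correlation time phi_i in ns);
    the amplitudes must sum to 1.
    """
    total_beta = sum(beta for beta, _ in components)
    if abs(total_beta - 1.0) > 1e-9:
        raise ValueError("amplitudes must sum to 1")
    return r0 * sum(beta * math.exp(-t / phi) for beta, phi in components)

R_INITIAL = 0.3  # assumed initial anisotropy, illustrative
# Assumed amplitudes; 0.3 ns (side-chain) and 2.0 ns (segmental) lie in the quoted ranges
two_motions = [(0.6, 0.3), (0.4, 2.0)]

for t in (0.0, 0.3, 1.0, 3.0):
    print(f"t = {t:.1f} ns -> r = {anisotropy(t, R_INITIAL, two_motions):.3f}")
```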
Table 11.1 Biological process and molecular function ontologies enriched with disorder*
BP (Jones) | BP (Tompa) | SP function (Dunker) | MF (Jones)
Ty transposition | Pseudohyphal growth | Differentiation | Transcription regulation
Development | Transcription | Transcription | Protein kinase
Morphogenesis | Morphogenesis | Transcription regulation | Transcription factor
Protein phosphorylation | Conjugation | Spermatogenesis | Binding
Regulation of transcription | Cell cycle/cytokinesis | DNA condensation | DNA binding
Transcription, DNA-dependent | Meiosis | Cell cycle | Nucleic acid binding
DNA packaging | Signal transduction | mRNA processing | RNA polymerase II
Signal transduction | Ribosome biogenesis and assembly | mRNA splicing | Kinase activity
Actin cytoskeleton | Cytoskeleton o/b | Mitosis | Enzyme regulator
Pseudohyphal growth | Sporulation | Apoptosis | Cytoskeletal binding
Chromosome o/b | DNA metabolism | Protein transport | RNA binding
DNA recombination | Nuclear o/b | Meiosis | Signal transducer
Cytoskeleton o/b | Cell budding | Cell division | Intracellular transport
Epigenetic regulation of gene expression | Cell wall o/b | Ubl conjugation pathway | Carbohydrate transport
Gene silencing | RNA metabolism | Wnt signaling pathway | Nucleotide binding
* Top BP and MF categories in GO significantly enriched with protein disorder, as suggested in three
bioinformatic studies. Jones and colleagues (Jones et al. 2004) used DISOPRED2, whereas Tompa and
colleagues (Tompa et al. 2006b) used IUPred to estimate the frequency of disorder in proteins in vari-
ous GO categories. Dunker and colleagues (Xie et al. 2007) used PONDR® VL3E to estimate long
disorder in SwissProt (SP) proteins and looked for keywords in their functional annotation record. The
categories are listed in the order of decreasing significance of the correlation. o/b stands for organi-
zation and biogenesis.
These studies agree that disorder is significantly enriched in five large functional
areas (see Table 11.1). The strongest correlations are found with the following:
The studies also agree on the functional categories that show a characteristic depletion
in disorder; these are usually the ones that require enzymatic and/or ligand-binding
activities. BP categories, such as biosynthesis, metabolism, respiration and energy
pathways, and MF categories, such as oxidoreductase, catalytic, ligase, lyase, and structural
molecule, are noted most often.
[Figure 11.1 graphic. (A) PONDR Score (0–1.0, with Order and Disorder regions marked) versus Residue Number (0–2400). (B) Domain scheme with labels HAT, TAZ1, KIX, Bromo, ZZ, NRID, NCBD, Linker, PHD, and TAZ2.]
Figure 11.1 Domain structure and predicted disorder of CREB-binding protein. Predicted
disorder (A) and schematic domain representation (B) of CREB-binding protein CBP. Known
structured domains are represented by circles, whereas uncharacterized linker regions con-
necting domains and the disordered nuclear-receptor co-activator-binding domain (NCBD)
and nuclear-receptor-interaction domain (NRID) are represented by connecting lines.
Ordered transcriptional-adaptor zinc-finger-1/2 (TAZ1/2), KID-binding domain (KIX, see
also Chapter 6, Figure 6.3), bromo domain (Bromo, see also Chapter 12, Figure 12.3), his-
tone acetyltransferase domain (HAT), plant homeodomain (PHD), and zinc-binding domain
(ZZ) are shown. Most regions between the ordered domains are predicted disordered.
Reproduced with permission from Dyson and Wright (2005), Nat. Rev. Mol. Cell Biol. 6,
197–208. Copyright by the Nature Publishing Group.
(ACTR) domain of p160 co-activator (Chapter 14, Figure 14.3 (Demarest et al. 2002)).
The long linker regions connecting ordered domains enable conformational flexibility
required for CBP function, but they also serve as posttranslational modification sites (e.g.,
for SUMO-ylation (Girdwood and Specified 2003)) and harbor eukaryotic linear motifs
(ELMs) binding regulatory proteins (e.g., three LXXLL motifs for steroid and retinoid
receptors (Heery et al. 2001)). Due to the extreme complexity of signaling/binding func-
tions, CBP/p300 is denoted as a “molecular interpreter” of the “words,” “phrases,” and
“sentences” represented by different combinations of regulatory signals (Smith 2004), in
which adaptability enabled by structural disorder might be indispensable.
The remodeling of chromatin and the assembly of PIC are also coordinated through
the interplay between CBP/p300 and the Mediator complex (Black et al. 2006), a large
multi-subunit assembly comprising about 25 components (Kornberg 2005). The two
co-activators act synergistically, but CBP/p300 competes with the general transcription
factor TFIID for binding to Mediator at the promoter. Mediator has three large modules,
which appear to be specialized for certain functions (Asturias et al. 1999; Myers et al.
1999). The Head module mediates the interaction with RNAP II and other components
of the basal transcription machinery (Takagi et al. 2006), the Middle module is involved
in mediating repression signals and making contacts with the dissociable Cdk com-
plex, whereas the Tail module is targeted by a number of regulatory proteins (Myers
et al. 1999). Cryo-electron microscopy (EM) studies show that the Mediator undergoes
profound conformational changes upon interaction with activators and RNAP II CTD
(Taatjes et al. 2002; Taatjes et al. 2004), which may be facilitated by disordered regions
within several Mediator subunits, as demonstrated in a bioinformatic analysis. In yeast,
80% of Mediator subunits have predicted IDRs ≥30 consecutive residues, and 24% of
them have IDRs ≥100 consecutive residues. In the human Mediator, IDRs ≥30 and ≥100
residues appear in 75% and 32% of the subunits, respectively (Toth-Petroczy et al. 2008).
Disordered regions are also observed experimentally in several cases, such as the Med8/
Med18/Med20 submodule, which contains multiple binding sites for the TATA-box
binding protein (TBP) complex. In the crystal structure, only a short α-helical region of
Med8 can be observed (Lariviere et al. 2006), whereas the linker between the C- and
N-terminal regions of Med8 exhibits enhanced sensitivity to proteolytic digestion in the
free protein. The functional importance of disorder is also underscored by the evolution-
ary conservation of disorder patterns (Toth-Petroczy et al. 2008).
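Statistics of this kind (the fraction of proteins with an IDR of at least 30 or 100 consecutive residues) can be derived from per-residue disorder scores by finding the longest uninterrupted run above a cutoff. The sketch below shows one way to do so; the 0.5 cutoff is assumed here as a common convention, and the protein names and scores are hypothetical rather than output of any particular predictor.

```python
def longest_disordered_run(scores, threshold=0.5):
    """Length of the longest stretch of consecutive residues scoring at or above threshold."""
    longest = current = 0
    for score in scores:
        current = current + 1 if score >= threshold else 0
        longest = max(longest, current)
    return longest

def fraction_with_idr(proteins, min_length, threshold=0.5):
    """Fraction of proteins whose longest predicted IDR has at least min_length residues."""
    hits = sum(
        1 for scores in proteins.values()
        if longest_disordered_run(scores, threshold) >= min_length
    )
    return hits / len(proteins)

# Hypothetical per-residue disorder scores for three made-up subunits
toy_subunits = {
    "subunit_a": [0.9] * 120 + [0.1] * 80,               # one long IDR
    "subunit_b": [0.2] * 50 + [0.7] * 35 + [0.3] * 40,   # one medium IDR
    "subunit_c": [0.1] * 150,                            # fully ordered
}
for cutoff in (30, 100):
    print(f"IDR >= {cutoff}: {fraction_with_idr(toy_subunits, cutoff):.0%}")
```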
[Figure 11.2 graphic; label: CTD]
Figure 11.2 X-ray structure of eukaryotic RNA polymerase II complex. The structure of
the 10-subunit yeast RNA polymerase II (pdb 1i50) has been solved at 2.8 Å resolution by
X-ray crystallography (Cramer et al. 2001). The repetitive CTD of the largest subunit, Rpb1,
is missing from the structure due to its structural disorder and is marked by a dashed line.
charged and is extremely sensitive to proteolysis, also suggesting an exposed and prob-
ably disordered structure (Yong et al. 1998).
Figure 11.3 Many-to-one signaling involving β-catenin partners. β-catenin (light grey)
has several binding partners (dark grey) such as E-cadherin (pdb 1i7w), Tcf3 (pdb 1g3j),
APC (pdb 1jpp), and ICAT (pdb 1luj). The β-catenin central domain (515 residues) contains
12 armadillo repeats, each consisting of 3 helices stacked to form a positively charged
right-handed superhelix of helices, and serves as a scaffold for binding distinct partners.
The interplay between various binding modes of β-catenin represents an example of many-
to-one signaling enabled by disordered partners binding to the same ordered target. For
further details, see Gooding et al. 2004.
(CBD) (see Chapter 14 Section 14.2.4), and their binding to the same partner represents
a case of many-to-one signaling enabled by disorder.
A special example of disorder in receptor proteins is provided by the cytoplasmic
domains of antigen receptors of immune cells, such as T cells, B cells, mast cells, and
basophils (Sigalov 2004). In these cells, antigen recognition results in the initiation of
immune response mediated by membrane-bound receptors termed multichain immune
recognition receptors (MIRRs). MIRRs consist of multiple single-transmembrane
subunits, each with extracellular ligand-binding domains and intracellular signaling
domains, these latter containing one or more copies of an immunoreceptor tyrosine-
based activation motif (ITAM), which gets Tyr-phosphorylated upon receptor cluster-
ing. The cytoplasmic domains of ITAM-containing signaling subunits were shown to be
disordered, even in the homo-oligomeric (tetrameric) form, as explicitly demonstrated
in the case of TCRzeta cytoplasmic domain (Sigalov, Aivazian, and Stern 2004; Sigalov
et al. 2006). This seminal observation contributed to the development of the concept of
fuzziness (Chapter 14, Section 14.8).
11.4.1 Ribosome
The ribosome performs protein synthesis in the cell by using mRNA as template.
Eukaryotes have 80S ribosomes, composed of a small (40S) and large (60S) subunit (30S
and 50S in prokaryotes). Their large subunit contains three RNA components, 5S
(120 nucleotides), 28S (4,700 nucleotides), and 5.8S (160 nucleotides), and about
49 associated (L) proteins. The 40S subunit consists of an 18S (1,900 nucleotides) RNA
and about 33 associated (S) proteins. Ribosomal proteins are consistently predicted to
be among the most highly disordered proteins (Iakoucheva et al. 2002; Ward et al. 2004;
Xie et al. 2007). Nearly 68% of them have predicted IDRs ≥30 consecutive residues,
close to the value of regulatory and cancer-associated proteins in general (Iakoucheva
et al. 2002). The structure of individual ribosomal proteins in complex with ribosomal
11 • Biological Processes Enriched in Disorder 153
RNA is known at atomic resolution (Ban et al. 2000), but their structural disorder in
isolation has not been studied in great detail.
Ribosomal proteins studied in isolation show clear signs of intrinsic disorder.
The ratio of their CD ellipticities at 200 and 222 nm suggests that many of them exist
in a PMG state in solution (Uversky 2002a). In certain cases, the structure solved
by nuclear magnetic resonance (NMR) shows a globular domain and disordered
N- or C-terminal tails, such as the N-terminal 22 amino acids of L18 (Turner and
Moore 2004) or the N-terminal 25 amino acids and a 15 amino acid-long loop in
L16 (Nishimura et al. 2004). In addition, certain ribosomal proteins, such as L7/L12
(Mulder et al. 2004), which constitutes a stalk-like extension on the 50S subunit,
also show structural disorder in the ribosome-bound state. L7/L12 has two globular
domains: the N-domain forms tetramers and connects to the body of the ribosome,
whereas the C-domain binds to GTPases in the course of translation (IF-2, EF-G,
EF-Tu, RF3). They are flexibly tethered to the ribosome via a disordered linker, which
can be replaced with an unrelated sequence without compromising ribosome func-
tion, whereas shortening or lengthening it seriously impairs translation (Bubunenko,
Chuikov, and Gudkov 1992).
The disorder of ribosomal proteins is probably critical in ribosome assembly, which
involves the sequential binding of numerous proteins via multiple pathways leading to
large-scale changes in the conformation of both RNA and proteins (Xie et al. 2007). For
example, L5 contributes to folding of rRNA in a mutual induced fit mechanism (DiNitto
and Huber 2003). In addition, many ribosomal proteins appear to have extra-ribosomal
functions (Wool 1996), which are often implicated in the regulation of transcription,
RNA processing, DNA repair, and translation. These functions in general are closely
associated with structural disorder, which suggests that disorder of ribosomal proteins
is probably instrumental in these extra-ribosomal functions.
(Clapier et al. 2001) and NURF (Xiao et al. 2001), and it can also interact with
sequence-specific transcription factors resulting in both activation and repression
(Fazzio, Gelbart, and Tsukiyama 2005).
Linker histones are nucleosome-binding proteins that stabilize condensed chro-
matin. They have a very simple domain organization, consisting of a central winged
helix fold, a short N-terminal extension, and a long basic disordered CTD of about
100 residues in length (Hansen et al. 2006). The determinants required to condense
chromatin fibers reside in the CTD, which binds to linker DNA (i.e., DNA between
nucleosomes) and stabilizes nucleosome–nucleosome interactions. In addition, a 47
residue-long segment within linker histone H1 CTD binds and activates the DFF40/
CAD apoptotic nuclease, irrespective of its primary sequence (Lu and Hansen 2004),
which represents a case of sequence independence of recognition in fuzziness (see
Chapter 14, Section 14.10).
[Figure 11.4 graphic. (A) Assembly scheme with states eIF4E, Cap-eIF4E*, Cap-eIF4E, Cap-eIF4E-eIF4G*, eIF4E-eIF4G*, further eIF4E-eIF4G assembly intermediates, eIF4E-eIF4G, and Cap-eIF4E-eIF4G; helices h1–h5 and the cap structure are marked. (B) 4E-BP1 and eIF4E. (C) eIF4G and eIF4E.]
Figure 11.4 Disorder in eukaryotic translation initiation. (A) A model of structural transi-
tions during the assembly of the cap-binding complex in translation initiation. Key features
of the model are that eIF4E is partially disordered, whereas eIF4G is fully disordered before
binding to the 5’ cap of mRNA or to each other. Reproduced with permission from von der
Haar et al. (2006), J. Mol. Biol. 356, 982–92. Copyright by Elsevier Inc. (B) 4E-BP peptide
bound to eIF4E-7mGDP (pdb 1ej4), which is a molecular mimic of the binding of eIF4G
peptide (C, pdb 1ejh).
11.6.1 Microfilaments
Microfilaments are the most diverse and versatile elements of the cytoskeleton, contrib-
uting to cellular processes as diverse as muscle contraction, cell adhesion, cell migra-
tion, vesicle and organelle transport, signaling, cell division, and cytokinesis (Sparrow
1999). The elementary unit of the filaments is globular actin (G-actin), a 42 kDa con-
served protein that polymerizes into F-actin. Microfilaments, which are composed of
two intertwined fibers, are around 7 nm in diameter, and have two distinguishable ends
termed barbed (+) and pointed (-) ends. They reach the highest density under the cell
membrane, where they are responsible for resisting tension and maintaining cellular
shape, and also for forming cytoplasmatic protuberances, such as pseudopodia and
lamellipodia. When cells move, they form stress fibers, which are responsible for loco-
motion. They are the most dynamic among the three elements of the cytoskeleton and
are under the influence of many regulatory inputs mediated by IDPs.
Tβ4, an IDP of 45 amino acids in length, binds and sequesters G-actin and inhibits
actin polymerization (Hertzog et al. 2004). Tβ4 bound to G-actin (Figure 11.5) blocks
both the barbed and pointed ends of G-actin, thus preventing its interaction with either
end of the growing F-actin polymer (Irobi et al. 2004). Interestingly, a region homolo-
gous to Tβ4, named WASP homology domain 2 (WH2), can be found in many other
proteins regulating the actin cytoskeleton (e.g., ciboulot, verprolin, spire, cordon bleu,
WASP, WAVE, and WIP) (Paunola, Mattila, and Lappalainen 2002; Renault 2008). Its
conserved features suggest that in all the homologs it functions in actin binding, but with
context-dependent outcome. Whereas a single WH2 domain in Tβ4 sequesters G-actin,
several tandem WH2 domains (e.g., in ciboulot) promote actin polymerization, probably
due to tethering multiple actin monomers next to each other (Chereau et al. 2005).
The WH2 domain is also involved in regulating the actin cytoskeleton in a more
sophisticated manner in WASP, as detailed in Chapter 14, Section 14.12.2.
Figure 11.5 The model of thymosin-β4 bound to G-actin. The structure of Tβ4 (dark
grey) bound to G-actin (light grey) has been assembled by combining the structures of
two fusion proteins containing either half of Tβ4 in complex with G-actin (pdb 1t44 and
pdb 1sqk).
WASP mediates the effect of one of the Rho subfamily GTPases, Cdc42 (Caron 2002),
which stimulates de novo actin polymerization. Mammalian WASP is a modular protein
around 500 amino acids in length, with a WASP homology domain 1 (WH1), a basic
region (BR), a GTPase-binding domain (GBD), a Pro-rich region, and a C-terminal
VCA region composed of a verprolin-homology (V) or WASP homology domain 2
(WH2), a central hydrophobic region (C), and an acidic tail (A). The regions GBD and
WH2 are disordered (Abdul-Manan et al. 1999; Kim et al. 2000a), and the inactive
state of WASP is characterized by the interaction of its GBD and VCA regions, which
occludes VCA (Kim et al. 2000a). Activation primarily results from Cdc42-binding
at GBD, which releases VCA for interaction with the actin-related protein (Arp) 2/3
complex, as described in Chapter 14, Section 14.12.2 (see Chapter 14, Figure 14.3 and
Figure 14.10) (Abdul-Manan et al. 1999; Panchal et al. 2003).
heteropolymers composed of three subunits, NF-L, NF-M, and NF-H, which differ in their
MW (68–70 kDa, 145–160 kDa and 200–220 kDa, respectively). The difference mostly
comes from their sequentially diverged and disordered (Brown and Hoh 1997) C-terminal
tail domains, which extend from the filament backbone and form lateral cross-bridges
between adjacent filaments. The tail domain of the longest isoform, NF-H, is highly
repetitive, containing more than 100 copies of a hexapeptide element, which harbors a
characteristic KSP phosphorylation motif and has been generated by repeat expansion (see Chapter
13, Section 13.3.1). The importance of repeats derives from contributing multiple sites for
phosphorylation, which determines interfilament spacing by virtue of tuning the entropic
exclusion effect of tail domains by electrostatic repulsion (Brown and Hoh 1997).
11.6.3 Microtubules
MTs have the largest diameter (about 25 nm) of the three components of the
cytoskeleton. They play a basic ultrastructural role in highly elongated cells, such as
neurons, and also in cellular processes, such as mitosis, cytokinesis, and vesicular
transport (Avila 1989). They are polymers of α/β-tubulin dimers, which polymer-
ize end to end in protofilaments, 13 of which then bundle into hollow cylindrical
filaments. MTs are nucleated and organized by microtubule organizing centers
(MTOCs), such as centrosomes and basal bodies, and they form the mitotic spindle
required for the segregation of chromosomes in mitosis. MT polymerization is driven
by GTP-binding to tubulin. GTP hydrolysis at the tip of polymers may reverse growth,
and occasionally cause rapid depolymerization and shrinkage, termed a catastrophe.
Due to this inherent instability, MT function depends critically on accessory
proteins.
Microtubule-associated protein 2 (MAP2) and tau protein are fully disordered
(Csizmok et al. 2005; Hernandez, Avila, and Andreu 1986; Schweers et al. 1994); they
share a common tubulin-binding domain (TBD) but have unrelated N-terminal
projection domains. The projection domain of tau (Bodart et al. 2008) and probably
also of MAP2 (Mukhopadhyay and Hoh 2001) remains disordered even in the bound
state in vivo, and functions as an entropic spacer/bristle that provides proper spacing in
the cytoskeleton. Due to its involvement in Alzheimer’s disease, tau protein is among
the best characterized IDPs (see Chapter 10, Section 10.2.3.3 and Chapter 15, Section
15.3.1.2). Stathmin is also fully disordered (Honnappa et al. 2006) but plays a role
opposite to that of MAP2 and tau protein, because its interaction with tubulin destabilizes
assembled MTs, causing their catastrophic depolymerization (Gigant et al. 2000).
discussed in some detail here. LEA proteins are expressed in late stages of seed maturation
and they are also strongly associated with the tolerance of abiotic stress conditions,
such as dehydration caused by high salinity, high/low temperature, or drought
(Tunnacliffe and Wise 2007; Wise and Tunnacliffe 2004). Based on the presence of
certain sequence motifs and function, LEA proteins are classified into three groups.
Homologs of group 1 and 3 proteins are also found in bacteria and in certain inverte-
brates (Tunnacliffe and Wise 2007; Wise and Tunnacliffe 2004). Group 2 is termed
dehydrins (DHNs). LEA proteins have high charge and are hydrophilic in character,
and several of them, such as wheat EM (McCubbin, Kay, and Lane 1985), A. avenae
LEA1 (Goyal et al. 2003), soybean DHN1 (Soulages et al. 2003), maize DHN1 (Koag
et al. 2003), and A. thaliana ERD10/14 (Bokor et al. 2005; Tompa et al. 2006a), are
fully disordered. Overall, it is reasonable to consider LEA proteins disordered in gen-
eral (Goyal et al. 2003; Irar et al. 2006).
LEA proteins have several suggested functions, such as antioxidants, ion sinks,
and membrane stabilizers (Tunnacliffe and Wise 2007; Wise and Tunnacliffe 2004),
but results most consistently point to their stress-related function as chaperones. For
two LEA proteins, A. avenae LEA1 and wheat EM, protection of citrate synthase from
heat-induced aggregation and lactate dehydrogenase from cold-induced aggregation
was demonstrated (Goyal, Walton, and Tunnacliffe 2005). A broad protein stabilization
function of A. avenae LEA1 was also described, with potent inhibitory activity against
polyQ aggregation in vivo (Chakrabortee et al. 2007). Cryoprotective activity was dem-
onstrated for two soybean dehydrin-type proteins, Mat1 and Mat9 (Momma et al. 2003),
and a similar effect was also shown for PCA60, a protein from winter bark tissues of
peach (Wisniewski et al. 1999). Potent chaperone activity of A. thaliana ERD10/14 was
observed against heat-induced aggregation and/or denaturation of a range of substrates,
such as lysozyme, alcohol dehydrogenase, firefly luciferase (Chapter 12, Figure 12.4),
and citrate synthase (Kovacs et al. 2008). These results point to the role of disorder in
stress-related proteins in general, and chaperones in particular (also discussed in detail
in Chapter 12, Section 12.3 and Chapter 14, Section 14.15).
which suggests that metal binding by its CTD is of rather broad specificity. The effi-
ciency of different metal ions in stimulating fibrillation correlates with their ability to
induce a conformational change in the IDP.
Prion protein (PrP) is also best known for its involvement in a range of fatal neuro-
degenerative diseases (Chapter 15, Section 15.3.4). As described in Chapter 13, Section
13.3.1.4, PrP has a disordered N-terminal half that contains a polymorphic octapeptide
repeat region, which constitutes a high-affinity copper binding site capable of binding
Cu2+ ion in vitro with a Kd of 10⁻¹⁴ M (Jackson et al. 2001). Mice in which the PrP
gene is ablated exhibit severe reduction in the copper content of synaptosomal and
endosome-enriched subcellular fractions of brain extracts, which suggests that copper
binding is a function of the protein in vivo (Brown et al. 1997a). Because PrP null-
mutant mice also have reduced copper/zinc superoxide dismutase activity, the prion
protein might be a recycling transport protein for copper transport, and/or a superoxide
dismutase enzyme itself.
Several IDPs have also been noted for their ability to bind metal ions with low
affinity but high capacity. For example, calsequestrin (see Chapter 12, Section 12.5.3)
can bind 40–50 Ca2+ ions per molecule, with a Kd of about 1 mM (He et al. 1993).
The function of this protein is probably to store Ca2+ ions and regulate their traffic in
the sarcoplasmic reticulum.
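The contrast between high-affinity and low-affinity, high-capacity binding can be made concrete with the single-site occupancy relation, fraction bound = [L]/(Kd + [L]). The sketch below applies it to the two Kd values quoted above; the free-ion concentrations, the site count of 45, and the treatment of all sites as identical and independent are simplifying assumptions for illustration only.

```python
def fraction_bound(free_ligand, kd):
    """Fractional occupancy of one site: [L] / (Kd + [L]); both in mol/L."""
    return free_ligand / (kd + free_ligand)

KD_HIGH_AFFINITY = 1e-14  # M; Kd quoted in the text for Cu2+ binding by PrP
KD_LOW_AFFINITY = 1e-3    # M; Kd quoted in the text for Ca2+ binding by calsequestrin
N_SITES = 45              # assumed midpoint of the quoted 40-50 Ca2+ ions per molecule

# Hypothetical free-ion concentrations (M)
for free in (1e-9, 1e-6, 1e-3):
    high = fraction_bound(free, KD_HIGH_AFFINITY)
    low = fraction_bound(free, KD_LOW_AFFINITY)
    print(f"[L] = {free:.0e} M: high-affinity site {high:.4f}, "
          f"low-affinity sites {low:.4f} (about {N_SITES * low:.1f} ions bound)")
```

Under these assumptions the high-affinity site stays saturated across the whole range, whereas the low-affinity sites load and unload as the free-ion concentration approaches the millimolar range, which is the behavior expected of a storage or buffering protein.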
UreG is one of four nickel chaperones (UreE, F, G, and D) involved in the assem-
bly of active urease in bacteria. The CD spectrum of the protein indicates 15% α-helix
and 29% β-strand structure, but NMR spectroscopy shows flexibility characteristic of
a disordered protein (Zambelli et al. 2005). These observations are compatible with
an MG state, even though the protein is a homo-dimer (Neyroz, Zambelli, and Ciurli
2006). UreG catalyzes the hydrolysis of GTP with a kcat of 0.04 min⁻¹, coupling energy
requirement and nickel incorporation into the urease active site. The protein is specific
for the metal ion and has been suggested by structure prediction (threading) to have a
well-defined structure characteristic of GTPases. Apparently, UreG takes the energy
from interaction with cofactors and/or other protein partners to complete folding and
perform its catalytic function.
Display Sites
  CREB KID                    PKA                      Site of phosphorylation
  Cyclin B N-terminal domain  E3 ubiquitin ligase      Site of ubiquitination
Chaperones
  β-synuclein                 e.g., α-synuclein        Prevention of aggregation
  ERD 10/14                   e.g., luciferase         Prevention of aggregation
  Nucleocapsid protein 7/9    e.g., RNA                Trans-splicing
  hnRNP A1                    e.g., DNA                Strand re-annealing
Effectors
  4E-BP1                      eIF4E                    Inhibition of translation initiation
  p27Kip1                     Cyclin A-Cdk2            Inhibition of cell-cycle
  FlgM                        σ28                      Inhibition of transcription
  Securin                     Separase                 Inhibition of anaphase
  Stathmin                    Tubulin                  Inhibition of tubulin polymerization
Assemblers
  RNAP II CTD                 mRNA maturation factors  Regulation of mRNA maturation
  SARA                        Smad                     Targeting TGFβ activity at Smad
  Ciboulot                    Actin                    Promoting actin polymerization
  p21Cip1                     Cyclin A-Cdk2            Assembly of cyclin-Cdk complex
  CREB                        p300/CBP                 Initiation of transcription
Scavengers
  Casein                      Calcium phosphate        Stabilization of calcium phosphate in milk
  Salivary PRPs               Tannin                   Neutralization of plant tannins
Prions
  Ure2p                       Gln3p                    Utilization of urea under conditions of growth on poor nitrogen source
  Sup35p                      NusA, mRNA               Suppression of translation termination, translation readthrough
  CPEB                        Cytoplasmic mRNA         Polyadenylation of dormant mRNA
search for binding partners, and/or increase binding affinity by increasing local con-
centration, by virtue of the entropic gain from the physical connection of two binding
elements, also known as the chelate effect (Jencks 1981). Another result is processivity,
which results from alternative binding interactions of the two connected recognition ele-
ments with multiple binding sites along an elongated partner, without full release at any
point. This binding capacity results in rapid diffusive movements along the substrate,
as observed in the case of bacterial cellulase, matrix metalloproteinase 9 (MMP-9), and
myosin VI (discussed in detail in Chapter 14, Section 14.9).
An increase in binding strength and specificity is observed in the case of the tran-
scription factor Oct-1, which regulates the expression of immunoglobulin genes at the
Igκ promoter (Chang et al. 1999). Oct-1 has two globular deoxyribonucleic acid (DNA)-
binding domains (POU homeodomain and POU-specific domain), each recognizing a
4–5 base pair sequence, connected by a 23 amino acid-long linker region. Upon interac-
tion with the promoter region, the two domains connected by the linker target an octamer
DNA sequence with high specificity. The linker is disordered in both the free and bound
states, and it is rather resistant to deletions that shorten it down to about 10–14 amino
acids; Oct-1 with an even shorter linker, 8 amino acids, has a high affinity for a pro-
moter region in which the order of the two DNA recognition sequences is reversed (van
Leeuwen et al. 1997). Interaction with a promoter in which the distance between the two
sequences is increased by 3 base pairs requires the linker to be lengthened to 37 amino
acids. Separation of the two domains (i.e., deletion of the linker) practically abolishes
binding. Overall, flexibility and length of the linker region enable selective binding of
differently spaced and oriented subsites of cognate DNA. Other notable linkers are
discussed in relation to processivity (Chapter 14, Section 14.9) and the retention of linker
function in spite of rapid evolutionary changes (Chapter 13, Section 13.4.1).
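The local-concentration argument behind linker function can be made semi-quantitative with a minimal sketch. It assumes, beyond anything stated in the text, that the disordered linker behaves as an ideal Gaussian chain with a contour length of 0.38 nm per residue and a persistence length of 0.4 nm; the function name and both polymer parameters are illustrative choices. The probability density of finding the two chain ends together is then read as the effective concentration of the second binding domain once the first is bound.

```python
import math

AVOGADRO = 6.022e23        # molecules per mole
RESIDUE_LENGTH_NM = 0.38   # assumed contour length per residue
PERSISTENCE_NM = 0.4       # assumed persistence length of a disordered chain


def effective_concentration_molar(n_residues: int) -> float:
    """Effective concentration (mol/L) of one linker end at the other end.

    Uses the ideal Gaussian-chain end-to-end distribution evaluated at zero
    separation. This is an assumption, not a model fitted to Oct-1 data.
    """
    if n_residues <= 0:
        raise ValueError("linker must contain at least one residue")
    contour_nm = n_residues * RESIDUE_LENGTH_NM
    mean_sq_r = 2.0 * PERSISTENCE_NM * contour_nm               # <r^2> in nm^2
    density_per_nm3 = (3.0 / (2.0 * math.pi * mean_sq_r)) ** 1.5
    litres_per_nm3 = 1e-24
    return density_per_nm3 / (AVOGADRO * litres_per_nm3)


if __name__ == "__main__":
    for n in (8, 23, 37):  # linker lengths mentioned for Oct-1 variants
        print(n, f"{effective_concentration_molar(n) * 1e3:.1f} mM")
```

Under these assumptions the effective concentration falls off as the linker length to the power of -3/2, so a shorter linker raises the local concentration but also restricts the spacing of DNA subsites that the two domains can reach. The sketch ignores the size and geometry of the domains themselves.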
(Shaker channel) of nerve axons (Magidovich et al. 2006; Magidovich et al. 2007). The
channel is activated by membrane depolarization, but within 1 ms it becomes inacti-
vated even if membrane depolarization is maintained (Hoshi, Zagotta, and Aldrich
1990; Liebovitch, Selector, and Kline 1992). The molecular mechanism of inactivation
(see Chapter 11, Section 11.3.1) can be accounted for by a ball and chain mechanism,
in which a short helix (ball) is connected to the body of the channel by a disordered
linker (chain). The linker enables the ball to freely move around and search in space
for its cognate site. When bound, the ball sterically occludes the mouth of the channel
and prevents ion translocation (Bentrop et al. 2001; Zagotta, Hoshi, and Aldrich 1990).
Model calculations suggest that movement of the ball on the chain and inactivation
kinetics of the channel can be described by a random spatial walk (Liebovitch et al.
1992). Entropic clock (timing) function results from the control of kinetics of channel
inactivation by disorder of the chain, as substantiated by the dependence of channel
kinetics on its length (Hoshi et al. 1990; Podlaha and Zhang 2003).
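The dependence of inactivation kinetics on chain length can be illustrated with a toy random walk. The sketch below is not the model of Liebovitch et al. (1992); it assumes a one-dimensional lattice on which the ball starts at full chain extension, is reflected there, and is absorbed when it reaches its binding site at position 0. The function names and the lattice picture are hypothetical.

```python
import random


def inactivation_steps(chain_length: int, rng: random.Random) -> int:
    """Steps until the ball first reaches its binding site (position 0).

    The ball starts at position chain_length (fully extended chain), which
    acts as a reflecting wall. Position 0 is absorbing.
    """
    position = chain_length
    steps = 0
    while position > 0:
        if position == chain_length:
            position -= 1                     # the tether pulls the ball back
        else:
            position += rng.choice((-1, 1))   # unbiased diffusive step
        steps += 1
    return steps


def mean_inactivation_steps(chain_length: int, trials: int = 2000,
                            seed: int = 0) -> float:
    """Monte Carlo estimate of the mean first-passage time, in steps."""
    rng = random.Random(seed)
    total = sum(inactivation_steps(chain_length, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for length in (5, 10, 20):
        # For this toy model the exact mean is chain_length ** 2.
        print(length, round(mean_inactivation_steps(length), 1), length ** 2)
```

In this toy model the mean search time grows with the square of the chain length, which captures the qualitative point that a longer disordered chain delays inactivation.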
effect was first described for cytoskeletal proteins (i.e., the side-arms of neurofilaments)
(Brown and Hoh 1997) and projection domains of microtubule-associated proteins
(MAPs) (Mukhopadhyay and Hoh 2001). In both cases, atomic force microscopy (AFM)
measurements on proteins attached to a solid surface were performed to show that these
regions exert a long-range repulsive effect on approaching macroscopic objects (the tip
of AFM in this case). Typical distances on the order of 50 nm, as opposed to about 5
nm in the case of globular proteins of similar MW, were observed. In both cases, it was
found that the distance-force relationship of the protein and actual cytoskeletal spacing
in cells are correlated, in agreement with the role of these disordered proteins/regions
providing proper spacing in the cytoskeleton by entropic exclusion.
This molecular principle is also exploited for an entirely different purpose in the
mechanism of gating within the nuclear pore complex (NPC). The nuclear pore is a
huge assembly of approximately 50 MDa that selectively transports cargoes across the
nuclear envelope (Alber et al. 2007). NPC in yeast is made up of about 450 copies of 30
different subunits, arranged as a large circle surrounding a central pore of about 9 nm
(extensible to 30 nm, Figure 12.1). NPC has unusual size-selective filtering capacity as it
lets molecules smaller than about 40 kDa freely through, and excludes everything above
this threshold, unless it can bind to a specific carrier molecule termed karyopherin (Rout
et al. 2000). A cargo bound to a karyopherin can translocate through the pore in either
direction between the cytoplasm and the nucleus in an energy-dependent manner. This
enigmatic molecular mechanism of NPC gating can be explained by the entropic effect
of disordered NPC components (nucleoporins, Nups). Thirteen Nups in yeast contain long
Phe-Gly repeats (thus termed FG Nups), which are intrinsically disordered both in vitro
[Figure 12.1. Panel A: force (nN) plotted against tip-sample distance D (nm). Panel B: NPC model labeled with cytoplasmic fibrils, ring scaffold, central meshwork, and nuclear basket.]
Figure 12.1 Entropic bristle function of FG Nups in the nuclear pore. (A) Compression of
a nucleoporin (cNUP153) by the tip of AFM results in a force-distance curve which shows
a long-range repulsion due to entropic exclusion by the disordered FG repeat region.
(B) Artistic model of the gating device nuclear pore complex (NPC), with a ring scaffold
made up of different Nups, having extensions forming cytoplasmic fibers, a meshwork
of FG-domain filaments in its center, and the nuclear basket structure. A key element of
the gating function of NPC is size-dependent filtering by entropic exclusion exerted by the
disordered FG-domains. Reproduced with permission from Lim et al. (2006), Proc. Natl.
Acad. Sci. USA 103, 9512–7, copyright by the National Academy of Sciences, and Patel et al.
(2007) Cell 129, 83–96, copyright by Elsevier Inc.
and in vivo (Denning et al. 2003). These disordered appendages physically fill the cen-
tral pore of NPC, provide multiple binding sites for karyopherins, and form a meshwork
of random coil chains through which nuclear transport proceeds. Measurements of the
associations of FG-domain coated beads (Patel et al. 2007), and AFM compressibility
in a way similar to that applied in the case of MAPs and neurofilament side-arms,
demonstrated long-range repulsive effects of entropic origin (Figure 12.1) (Lim et al.
2006). These and other observations on transient hydrophobic interactions suggest that
FG Nups anchored at the NPC center form a cohesive meshwork of filaments primarily
via hydrophobic interactions (Frey, Richter, and Gorlich 2006), whereas four peripher-
ally anchored Nups are generally non-cohesive. The interplay of these two different
behaviors results in a two-gate model of NPC featuring a central diffusion gate formed
by a hydrophobic meshwork and a peripheral gate that principally operates by entropic
exclusion (Patel et al. 2007).
The effect of entropic exclusion probably also constitutes a critical mechanistic
element of the chaperone function of IDPs (Tompa and Csermely 2004). For exam-
ple, in the case of late-embryogenesis abundant (LEA) proteins (Chapter 11, Section
11.7), part of their chaperone activity probably results from preventing the aggregation
of their partners by serving as “space fillers” (detailed in Chapter 14, Section 14.15)
(Chakrabortee et al. 2007; Tunnacliffe and Wise 2007). The entropic origin of this
effect is underlined by its similarity to the function of caseins, which bind small cal-
cium-phosphate seeds in milk and prevent their aggregation by an entropic exclusion/
entropic brush mechanism (Holt and Sawyer 1993). This mechanism, termed polymer
brush, has long been known in polymer chemistry and colloid chemistry
(Bright et al. 2001).
Thr, or Tyr residues, and the phosphate groups are removed by protein phosphatases
(Cohen 1997; Cohen et al. 1996). Reversible phosphorylation is implicated in the regula-
tion of practically all basic cellular processes, such as cell division (Murray 2004), differ-
entiation (Frebel and Wiese 2006), migration (Panetti 2002; Xie and Tsai 2004), apoptosis
(Ojala et al. 2000), and synaptic transmission (Chen and Roche 2007; Takahashi et al.
2003; Wang et al. 2005). By conservative estimates, one-third of eukaryotic proteins
undergo reversible phosphorylation (Hunter 1987; Johnson and Hunter 2005; Manning
2005), and up to 2% of the genome encodes kinases (the kinome) (Manning 2005) and
phosphatases (Cohen 1997; Cohen, Chen, and Armstrong 1996). The loss of control over
the balance of phosphorylation/dephosphorylation is often implicated in cancer (Futreal
et al. 2004).
Studies on individual proteins have shown that phosphorylation occurs in prac-
tically all known IDPs/IDRs, such as cyclic-AMP response element-binding protein
(CREB) (Parker et al. 1996; Radhakrishnan et al. 1998), protein phosphatase 1 (PP1)
I2 (Hurley et al. 2007; Park and DePaoli-Roach 1994), p53 (Chehab et al. 1999; Shieh,
Taya, and Prives 1999), microtubule-associated protein 2 (MAP2) (Hernandez, Avila,
and Andreu 1986; Sanchez, Diaz-Nido, and Avila 2000), tau protein (Mandelkow
et al. 1996; Schweers et al. 1994; Uversky et al. 1998; Zheng-Fischhofer et al. 1998),
p27Kip1 (Galea et al. 2008a), the R domain of cystic fibrosis transmembrane conductance
regulator (CFTR) (Baker et al. 2007; Cheng et al. 1991), stathmin (Honnappa et al. 2006;
Wittmann, Bokoch, and Waterman-Storer 2004), DARPP-32 (Hemmings et al. 1990),
osteopontin (Fisher et al. 2001; Singh, Devouge, and Mukherjee 1990), calpastatin
(Averna et al. 2001; Salamino et al. 1994), the C-terminal domain (CTD) of ribonucleic
acid polymerase II (RNAP II) (Fabrega et al. 2003; Meinhart and Cramer 2004; Zhang
and Corden 1991), LEA proteins (Alsheikh, Heyen, and Randall 2003; Heyen et al.
2002; Irar et al. 2006), 4E-binding protein (4E-BP) (Marcotrigiano et al. 1999), the
cytoplasmic domain (cytD) of E-cadherin (Huber and Weis 2001), securin (Agarwal
and Cohen-Fix 2002), neurofilament sidearms (Aranda-Espinoza et al. 2002), histones
(Bhaumik, Smith, and Shilatifard 2007; Hansen et al. 2006), and caldesmon (Hai and
Gu 2006).
Systematic bioinformatic studies underline the general correlation of the site of
phosphorylation and local disorder (Iakoucheva et al. 2004). By comparing a collec-
tion of more than 1,500 experimentally determined Ser (PS), Thr (PT), and Tyr (PY)
phosphorylation sites to potential sites that are actually non-phosphorylated (NS, NT,
and NY), it was found that segments surrounding phosphorylation sites are signifi-
cantly enriched in amino acids of higher surface exposure, charge, and flexibility and
lower hydrophobicity, reminiscent of the features of disorder-promoting amino acids
(Dunker et al. 2001; Romero et al. 2001). By combining the sets of positive and negative
examples and considering disorder, a predictor, DISPHOS (disorder-enhanced phos-
phorylation predictor), could be constructed. The predictor has an improved accuracy
over other phosphorylation-site predictors, such as NetPhos (Blom, Gammeltoft, and
Brunak 1999) and Scansite (Obenauer, Cantley, and Yaffe 2003), with accuracy
varying somewhat by residue type: 76% for Ser, 81% for Thr, and 83% for Tyr
residues. DISPHOS predictions suggest that phosphorylation sites primarily occur in
regulatory, cancer-associated and cytoskeletal proteins, as opposed to proteins involved
in degradation, biosynthesis, and metabolism (Figure 12.2).
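The kind of comparison described above, between the sequence neighborhoods of phosphorylated and non-phosphorylated Ser, Thr, and Tyr residues, can be sketched as follows. This is not DISPHOS. It only extracts a window around each candidate site and reports the fraction of disorder-promoting residues, taken here to be A, R, G, Q, S, P, E, and K (an assumption based on Dunker et al. 2001). The function names, the default window half-width, and the toy sequence are hypothetical.

```python
DISORDER_PROMOTING = frozenset("ARGQSPEK")  # assumed set, after Dunker et al. (2001)
PHOSPHO_ACCEPTORS = frozenset("STY")


def candidate_sites(sequence: str) -> list[int]:
    """0-based positions of Ser, Thr, and Tyr residues."""
    return [i for i, aa in enumerate(sequence.upper()) if aa in PHOSPHO_ACCEPTORS]


def window_around(sequence: str, position: int, half_width: int = 12) -> str:
    """Residues within half_width of position, truncated at the termini."""
    if not 0 <= position < len(sequence):
        raise IndexError("position outside the sequence")
    start = max(0, position - half_width)
    return sequence.upper()[start:position + half_width + 1]


def disorder_promoting_fraction(sequence: str, position: int,
                                half_width: int = 12) -> float:
    """Fraction of disorder-promoting residues in the window around a site."""
    window = window_around(sequence, position, half_width)
    return sum(aa in DISORDER_PROMOTING for aa in window) / len(window)


if __name__ == "__main__":
    toy = "MKPEESPRGSQEKALLVIWFCSLIV"  # hypothetical sequence, not a real protein
    for pos in candidate_sites(toy):
        print(pos + 1, toy[pos], round(disorder_promoting_fraction(toy, pos, 4), 2))
```

A real predictor would combine many such window features, including predicted disorder, and train on the positive and negative site collections. The sketch shows only how one feature is computed.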
[Figure 12.2. Estimated P-sites (%) at S, T, and Y sites for the protein categories Regulation, Cancer, Cytoskeletal, Membrane, Ribosomal, Inhibitors, Transport, Kinases, Degradation, Biosynthesis, Metabolism, GPCRs, All disorder, and PDB order.]
12.2.3 Ubiquitination Sites
Disorder may also be directly implicated in ubiquitination (the covalent addition of
ubiquitin, a small conserved protein of 76 amino acids), although the information on
this relation is more limited. Its importance, however, follows from the fact that
targeted destruction of proteins is a critical regulatory mechanism of protein function.
Three separate enzymatic systems take part in the process. Ubiquitin-activating
enzymes (E1) activate ubiquitin in an ATP-dependent manner and transfer it to
ubiquitin-conjugating enzymes (E2). Then, an E2 alone or in concert with a ubiquitin
ligase (E3) links the C-terminal carboxyl group of ubiquitin to the ε-amino group of a
Lys residue in the target protein (Hershko and Ciechanover 1998). Addition of ubiquitin
moieties usually continues to form a polyubiquitin chain, which targets the protein to
the proteasome (see Chapter 8, Section 8.3.1). The importance of the
ubiquitin/proteasome system is underscored by its involvement in the regulation of key
cellular processes, from cell-cycle control to the inflammatory response (Hershko and
Ciechanover 1998).
The involvement of structural disorder in ubiquitination was explicitly stated in
the regulation of the cell cycle (Chapter 11, Section 11.3.3), in the mitotic destruction of
securin and Cyclin B (Cox et al. 2002) by the E3 APC (Murray 2004). Securin (Chapter
15, Section 15.1.5) is the inhibitor of separase, the cysteine protease that initiates ana-
phase by cleaving the Scc1/Mcd1/Rad21 cohesin subunit, which holds sister chromatids
together (Jallepalli et al. 2001; Waizenegger et al. 2002; Zou et al. 1999). Cyclin B is
a mitosis-specific cyclin, the level of which rises during interphase and drops during
mitosis. Securin has both D-box (RxxL) and KEN box motifs, whereas cyclin B only
has a D-box. The N-terminal regions of cyclin B and yeast securin Pds1 encompassing
the ubiquitination segments are intrinsically disordered (Cox et al. 2002).
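Locating the two destruction motifs named above in a sequence can be sketched with regular expressions. The D-box pattern is the minimal RxxL consensus given in the text, and the KEN box is matched literally. The function name and the toy sequence are hypothetical. Such short patterns match frequently by chance, so a hit is only a candidate, not evidence of a functional degron.

```python
import re

# Lookahead patterns so that overlapping matches are also reported.
MOTIFS = {
    "D-box": re.compile(r"(?=(R..L))"),    # minimal RxxL consensus from the text
    "KEN box": re.compile(r"(?=(KEN))"),
}


def find_degrons(sequence: str) -> dict[str, list[tuple[int, str]]]:
    """Return 1-based start positions and matched text for each motif."""
    seq = sequence.upper()
    return {
        name: [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


if __name__ == "__main__":
    toy = "MKENARTALGD"  # hypothetical sequence, not securin or cyclin B
    for motif, hits in find_degrons(toy).items():
        print(motif, hits)
```

Combining such a scan with a per-residue disorder prediction would show whether candidate motifs and nearby Lys residues fall within disordered segments, as described for cyclin B and Pds1.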
Disorder of these regions sheds light on two intriguing experimental observations,
multiple monoubiquitination and polyubiquitination. The N-terminal region of Cyclin
B actually becomes ubiquitinated at several different Lys residues with no preference
for a particular site (King, Glotzer, and Kirschner 1996). Such a mode of modification
is most compatible with local disorder, which might enable several Lys residues to be
brought into apposition to the active site. The mechanism of polyubiquitination (i.e.,
the formation of a chain of ubiquitin moieties) apparently also requires large confor-
mational rearrangements following the addition of every ubiquitin molecule, enabled
by disorder.
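[Figure 12.3. Structure image with the N and C termini marked and the p53 peptide residues R379, H380, K381, AcK382, L383, M384, F385, and K386 labeled.]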
Figure 12.3 Structure of the CBP bromo-domain/p53 AcK382 peptide complex. Ribbon
representation of the average minimized NMR structure of the CBP bromo-domain/acety-
lated p53 peptide complex (see Mujtaba et al. 2004). The peptide corresponds to residues
Arg379–Lys386 of p53 and encompasses acetylated Lys382, the site of acetylation within the
regulatory domain of the protein (pdb 1jsp).
to be highly sophisticated machines that use the energy of ATP hydrolysis to drive
folding intermediates over the energy barrier of the folding trap (Csermely 1999; Todd
et al. 1996). Due to their overall benefit to the cell and mechanistic demands of their
action, their appearance is considered a critical early evolutionary invention (Csermely
1997). Due to the diverse mechanistic demands, chaperones are generally thought of as
ordered proteins/complexes.
A bioinformatic analysis shows that chaperone action is compatible with structural
disorder (Table 12.1). There is an elevated level of disorder in protein chaperones, and a
very high level of disorder in RNA chaperones (see Chapter 11, Section 11.5), with 54.2%
of their residues falling into disordered regions and 40% within IDRs ≥30 consecutive
residues (Tompa and Csermely 2004). These numbers exceed even those of regulatory
and signaling proteins, which are thought to be the most disordered functional classes
(Iakoucheva et al. 2002; Ward et al. 2004), and strongly argue for the functional impor-
tance of disorder in RNA (and protein) chaperone functions. Whereas molecular details
of their chaperone action are rather obscure, in principle they may act by
The possible mechanistic details are discussed in Chapter 14, Section 14.15.
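The two statistics quoted above, the percentage of residues in disordered regions and the percentage within IDRs of at least 30 consecutive residues, can be computed from any per-residue disorder assignment. The following minimal sketch assumes the assignment is already available as a list of booleans, one per residue. How that assignment is obtained (which predictor, which threshold) is left open, and the function name and returned keys are hypothetical.

```python
from itertools import groupby


def disorder_statistics(is_disordered: list[bool], min_idr: int = 30) -> dict[str, float]:
    """Summarize a per-residue disorder assignment for one protein."""
    flags = [bool(x) for x in is_disordered]
    n = len(flags)
    if n == 0:
        raise ValueError("empty assignment")
    in_long_idrs = 0
    longest = 0
    # groupby yields maximal runs of identical consecutive values.
    for state, run in groupby(flags):
        length = sum(1 for _ in run)
        if state:
            longest = max(longest, length)
            if length >= min_idr:
                in_long_idrs += length
    return {
        "percent_disordered": 100.0 * sum(flags) / n,
        "percent_in_long_idrs": 100.0 * in_long_idrs / n,
        "longest_idr": float(longest),
    }


if __name__ == "__main__":
    # Hypothetical 100-residue protein: a 35-residue IDR and a 15-residue IDR.
    toy = [True] * 35 + [False] * 50 + [True] * 15
    print(disorder_statistics(toy))
```

Averaging these per-protein values over a functional class gives class-level figures of the kind reported for protein and RNA chaperones.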
[Figure 12.4. Plot of absorbance at 400 nm (axis range 0.000 to 0.030).]
Figure 12.4 Chaperone effect of ERD10 and ERD14, two disordered LEA proteins. The
effect of two plant LEA proteins, ERD10 and ERD14, on the heat-induced aggregation of
firefly luciferase. Aggregation of 1.1 µM luciferase induced by heat (45°C) was followed
without additions (◾), or in the presence of 2 µM BSA (⦁), 2 µM HSP90 (▲), 2 µM ERD10
(◽), or 2 µM ERD14 ( ). Aggregation was measured by absorbance at 400 nm. Reproduced
with permission from Kovacs et al. (2008), Plant Physiol. 147, 381–90. Copyright by the
American Society of Plant Physiologists.
and incorporation into their permanent complex (Cristofari and Darlix 2002; Lorsch
2002). The distinction between the two categories is not always straightforward, but
there are several cases in which an elevated level of disorder in bona fide RNA
chaperones is described. Probably the best characterized such protein is heterogeneous
nuclear ribonucleoprotein A1 (hnRNP A1), which is very effective in promoting
renaturation of complementary nucleic acid strands (Figure 12.5). The disordered
Gly-rich CTD of the protein promotes assembly of the protein–nucleic acid complex, and
is involved in maximal renaturation activity of the protein (Pontius and Berg 1990).
This observation led to the concept that nonspecific initial interactions of disordered
regions of proteins can significantly accelerate macromolecular association reactions
(Pontius 1993).
Nucleocapsid proteins are encoded by both HIV (Ncp7) and the distantly
related yeast Ty3 retrotransposon (Ncp9). These proteins have two zinc-finger motifs
and disordered N-terminal and C-terminal segments, which facilitate strand transfer
reactions during reverse transcription (Cristofari et al. 1999; Morellet et al. 1992).
This was directly shown for nucleocapsid proteins of viruses from several genera of the
Flaviviridae family, such as GB virus B, West Nile virus, and bovine viral diarrhoea
virus (Ivanyi-Nagy et al. 2007). A similar chaperone function was described and mapped to the
disordered N-terminal half of the prion protein (Gabus et al. 2001). In a systematic
in vitro trans-splicing assay of the RNA chaperone activity of ribosomal proteins of
[Figure 12.5. Fraction denatured (0.0 to 1.0) plotted against time (0 to 30 min), with curves labeled "A1" and "no A1".]
Figure 12.5 DNA renaturation facilitated by hnRNP A1. Time course of the renaturation
of single-stranded (ss) DNA 124-nucleotide in length in the absence and presence of hnRNP
A1. The time course was followed by 4.5 nM ss DNA. Lanes 1 and 2, no A1 added; further
lanes, A1 at 32 nM for the time indicated. A1 under these conditions accelerates renatur-
ation more than 3,000-fold (lower panel). Reproduced with permission from Pontius and
Berg (1990), Proc. Natl. Acad. Sci. USA 87, 8403–7. Copyright by the National Academy of
Sciences.
the large ribosomal subunit (Semrad, Green, and Schroeder 2004), it was found that
several of them, such as L13, L15, L16, L18, and L19, are potent RNA chaperones.
Some of these ribosomal proteins also possess protein-chaperone activity, which
gave rise to the concept of “Janus” chaperones that can assist the folding of both
RNA and protein partners (Kovacs et al. 2009). Another example is the fragile X
mental retardation protein (FMRP), which possesses RNA-binding and chaperone
activities in vitro under physiological conditions (Gabus et al. 2004). FMRP is a
large and complex protein, and its RNA chaperone activity is thought to reside in
its disordered region (Ivanyi-Nagy et al. 2005). A direct connection between the
disordered region and RNA chaperone activity is shown when deletion of the under-
lying IDR abolishes activity of the protein. This was also observed in the case of
the prion protein (Gabus et al. 2001), hnRNP A1 (Pontius and Berg 1990), and Ncp9
(Cristofari et al. 1999).
are often found in the Protein Data Bank (PDB). When the effector has both activities,
sometimes with the same partner, it is termed “multitasking” or “moonlighting.”
12.4.1 Inhibitors
There are many examples of this function, a few of which are mentioned here. The
archetypical inhibitor is one of the best characterized IDPs, p27Kip1 (see Chapter 14,
Section 14.12.1, and Chapter 15, Section 15.1.3), which inhibits Cdk2 by binding to the
Cyclin A-Cdk2 complex (Russo et al. 1996). Its close homolog, p21Cip1, was the first IDP
for which binding promiscuity was described, because it can inhibit distinct Cdks by
binding to Cyclin A-Cdk2, Cyclin E-Cdk2, and Cyclin D-Cdk4 complexes (Kriwacki
et al. 1996). In apparent contradiction with promiscuity, its inhibition is highly specific,
as demonstrated by its inability to bind and inhibit non-cell-cycle dependent kinases
(e.g., Cdk5 and Cdk7), due to the lack of specificity determinants on their cyclin part-
ners, p35 and cyclin H (Lacy et al. 2004).
Further well-characterized IDP inhibitor–partner pairs are (see also Table 12.1)
IA3–aspartic proteinase (Ganesh et al. 2006; Green et al. 2004), PKIα–cAMP-dependent
protein kinase (Hauer et al. 1999a), I2–PP1 (Hurley et al. 2007; Park and DePaoli-Roach
1994), stathmin–tubulin (Honnappa et al. 2006; Wittmann et al. 2004), DARPP-32–PP1
(Hemmings et al. 1990), 4E-BP1–eukaryotic translation initiation factor 4E (eIF4E)
(Marcotrigiano et al. 1999), Tβ4–actin (Domanski et al. 2004; Hertzog et al. 2004),
calpastatin–calpain (Kiss et al. 2008a; Moldoveanu et al. 2008), securin–separase
(Jallepalli et al. 2001; Waizenegger et al. 2002), FlgM–σ28 (Daughdrill et al. 1997;
Sorenson et al. 2004), and α-synuclein–phospholipase D2 (Jenco et al. 1998). The high
frequency of this functional relation is also indicated by the DisProt database (Sickmeier
et al. 2007), which lists 22 IDP inhibitors.
12.4.2 Activators
Most effectors inhibit their partners, which probably follows from inhibition of
activity of an enzyme being mechanistically less demanding than its activation. In
fact, activation is always described for proteins that also have an inhibitory effect,
suggesting multiple, often opposing functions for the same protein. To describe this
kind of activity, the terms “moonlighting” or “multitasking” were suggested by Jeffery
(Jeffery 1999; Jeffery 2003a) for ordered proteins. The effect is discussed in detail in
Chapter 14, Section 14.6, where the most instructive examples are mentioned, such as
p21Cip1/p27Kip1, which can both inhibit and activate cyclin-Cdk complexes (Bagui et
al. 2003; Cheng et al. 1999); the random coil C fragment of dihydropyridine receptor
(DHPR), which can interact with skeletal muscle ryanodine receptor (RyR) in two sto-
chastically alternating modes, with one activating and the other inhibiting the partner
(Haarmann et al. 2003); and Tβ4, which inhibits G-actin polymerization (Domanski et
al. 2004; Hertzog et al. 2004) but can also activate integrin-linked kinase ILK (Bock-
Marquette et al. 2004).
12.5.2 Caseins
Caseins constitute a family of proteins in the milk of mammals, traditionally thought
to serve as nutrients for breast-fed newborns (Andrews et al. 1979; Creamer et al. 1981;
Holt and Sawyer 1993). As discussed in the chapter on the history of disorder (Chapter 2,
Section 2.2.4), structural disorder of caseins (i.e., rheomorphism) was among the first
to be recognized (Holt and Sawyer 1993; McMeekin 1952). Perhaps as important as
being nutrients in milk, caseins also function by binding and neutralizing calcium phos-
phate. Milk is a rich source of a great variety of nutrients, vitamins, and minerals,
among which calcium and phosphate can reach concentrations as high as 20–30 mM.
Calcium phosphate is not soluble in water at these concentrations, and its precipitation
would have deleterious effects in the mammary gland. Caseins have binding sites for
calcium phosphate seeds, and due to their open structure they can interact with small
seeds with a large capacity and speed, with an apparent first-order rate constant rival-
ing the active-site activity of enzymes (Holt and Sawyer 1993; Holt, Wahlgren, and
Drakenberg 1996).
12.5.3 Calsequestrin
Calsequestrin is a low-affinity, high-capacity calcium-binding protein, which can bind
40–50 Ca2+ ions per molecule, with an affinity of about 1 mM (He et al. 1993). The
protein can be found in the terminal cisternae of the sarcoplasmic reticulum of muscle
cells, where calcium concentrations reach millimolar levels. The large storage
capacity of a protein with a Kd value in the range of the concentration of the free ion
enables calsequestrin to bind large amounts of Ca2+, thereby lowering the free Ca2+
concentration inside the sarcoplasmic reticulum and allowing the accumulation of Ca2+ via
Ca2+-ATPase. When the Ca2+-release channel is stimulated to open, free Ca2+ at the
terminal cisternae is increased due to dissociation of Ca2+ from calsequestrin (Ikemoto
et al. 1991), which localizes released Ca2+ directly at the release channel concomitant
with its opening. Structurally, calsequestrin is an IDP that undergoes significant induced
folding upon Ca2+ binding, with an increase in α-helix content, compactness, and resistance
to proteases (He et al. 1993); in this state its 3-D structure could be solved (Wang et al. 1998).
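The buffering argument, that a Kd close to the free ion concentration lets the protein take up and release large amounts of Ca2+, can be checked with a simple occupancy calculation. The sketch assumes identical, independent binding sites (a simplification that ignores any cooperativity or the folding coupled to binding) and uses 45 sites as the midpoint of the 40–50 range. The function name and defaults are hypothetical.

```python
def bound_ions_per_molecule(free_ca_mM: float, kd_mM: float = 1.0,
                            n_sites: int = 45) -> float:
    """Mean number of bound Ca2+ ions, assuming n identical independent sites."""
    if free_ca_mM < 0 or kd_mM <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return n_sites * free_ca_mM / (kd_mM + free_ca_mM)


if __name__ == "__main__":
    for free in (0.1, 0.5, 1.0, 2.0):  # free Ca2+ in mM
        print(f"{free:.1f} mM free Ca2+: "
              f"{bound_ions_per_molecule(free):.1f} ions bound")
```

Under these assumptions, half of the sites are occupied when free Ca2+ equals the Kd, and a drop in free Ca2+ from 1 mM to 0.1 mM releases roughly 18 ions per molecule. A protein with much higher affinity would remain saturated over the same range and would release very little.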
Figure 12.6 The complex of SARA and Smad2. Structure of the Smad-anchor for receptor
activation Smad-binding domain (SARA SBD, dark grey) in complex with the MH2 domain
of Smad2 (pdb 1dev). The interaction recruits Smad for phosphorylation by the transform-
ing growth factor-β (TGFβ) transmembrane Ser-Thr kinase receptor (see Wu et al. 2000).
cell, denoted as the interactome (Aloy and Russell 2004; Arifuzzaman et al. 2006;
Gavin et al. 2006). Considering the distribution of connectivities, the interactome is
“scale-free” (Barabasi and Oltvai 2004) (i.e., the number of connections of proteins
follows a power law). A few proteins in such a network possess a very large number of
connections (hubs), whereas most others (ends) have very few connections, often only one
(Barabasi and Oltvai 2004). This arrangement suggests a functional specialization, in
which hubs are preferentially involved in organizing the network, whereas ends are
rather the executors of specialized functions. The interactome shows an enhanced
sensitivity to the removal of hubs (Jeong et al. 2001), which underscores the central role
of proteins with multiple interactions. A range of bioinformatic studies suggest that hub
proteins have an elevated level of disorder.
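The distinction between hubs and ends rests on counting connections per protein in the interaction list. A minimal sketch follows. The hub threshold of five partners is an arbitrary illustrative value (the studies cited use their own, sometimes dynamic, thresholds), and the protein names in the toy network are hypothetical.

```python
from collections import Counter


def degree_table(interactions: list[tuple[str, str]]) -> Counter:
    """Number of distinct partners per protein (self and duplicate pairs ignored)."""
    unique = {frozenset(pair) for pair in interactions if pair[0] != pair[1]}
    degrees = Counter()
    for pair in unique:
        for protein in pair:
            degrees[protein] += 1
    return degrees


def classify(degrees: Counter, hub_threshold: int = 5) -> tuple[list[str], list[str]]:
    """Split proteins into hubs (>= threshold partners) and ends (one partner)."""
    hubs = sorted(p for p, k in degrees.items() if k >= hub_threshold)
    ends = sorted(p for p, k in degrees.items() if k == 1)
    return hubs, ends


def degree_distribution(degrees: Counter) -> dict[int, int]:
    """How many proteins have each number of partners."""
    return dict(sorted(Counter(degrees.values()).items()))


if __name__ == "__main__":
    toy = [("H", x) for x in "ABCDEF"] + [("A", "B"), ("B", "A"), ("C", "C")]
    table = degree_table(toy)
    print(classify(table))
    print(degree_distribution(table))
```

For a scale-free network, plotting the degree distribution on log-log axes gives an approximately straight line. With hubs and ends identified this way, their predicted disorder can be compared class by class.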
For example, the analysis of the Database of Interacting Proteins (DIP) suggests
that predicted disorder is 21.7% for hubs and 17.2% for non-hubs. For hubs that can be
found in PDB, the observed disorder is 41.2%, as opposed to 32.1% in non-hubs (Patil
and Nakamura 2006). In a different approach comparing data in four interactomes
(human, worm, fly, yeast) (Haynes et al. 2006), statistically significant differences were
observed; for example in C. elegans (worm), the percentage of proteins with at least one
IDR ≥40 consecutive residues is about 67% for hubs and 45% for non-hubs (Figure 12.7).
By applying a dynamic threshold for hub proteins (Dosztanyi et al. 2006), proteins with
the highest level of disorder were significantly enriched in hubs compared to non-hubs
(32% vs. 16% in yeast, for example). When “party” hubs (which interact with most of
their partners simultaneously) are compared to “date” hubs (which bind their different
[Figure 12.7. Proteins (%) plotted against length of predicted disordered region (AA, ≥30 to ≥100) for hubs, ends, and O_PDB_S25.]
Figure 12.7 Distribution of predicted disorder in hubs and non-hubs. The percentages
of hub (black), non-hub (end, white), and PDB (gray) proteins with at least one IDR ≥30 to
≥100 consecutive residues predicted by predictor of natural disordered regions (PONDR®)
VL-XT for the C. elegans (worm) interactome. Reproduced from Haynes et al. (2006), PLoS
Comput. Biol. 2, e100.
partners at different times or locations) (Han et al. 2004), 30.8% of date hubs but only
10.2% of party hubs were found to be mostly disordered by the charge-hydropathy anal-
ysis, and 20.4% of the residues in date hubs but only 7.8% of the residues in party hubs
were found to fall into locally disordered regions.
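The charge-hydropathy analysis mentioned above classifies a whole protein from two sequence averages. The sketch below assumes the commonly used form of this classifier: mean hydropathy on the Kyte-Doolittle scale rescaled to the range 0 to 1, mean absolute net charge per residue counting Lys and Arg as +1 and Asp and Glu as -1, and the linear boundary R = 2.785 H - 1.151. It is a simplified illustration, not necessarily the exact procedure used in the cited study, and the function names are hypothetical.

```python
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validated(sequence: str) -> str:
    seq = sequence.upper()
    if not seq or any(aa not in KYTE_DOOLITTLE for aa in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")
    return seq


def mean_hydropathy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy rescaled to the range 0..1."""
    seq = _validated(sequence)
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(sequence: str) -> float:
    """Absolute net charge per residue (K, R = +1; D, E = -1; His neutral)."""
    seq = _validated(sequence)
    net = sum(aa in "KR" for aa in seq) - sum(aa in "DE" for aa in seq)
    return abs(net) / len(seq)


def is_mostly_disordered(sequence: str) -> bool:
    """True if the sequence lies on the disordered side of the assumed boundary."""
    return mean_net_charge(sequence) > 2.785 * mean_hydropathy(sequence) - 1.151


if __name__ == "__main__":
    for name, seq in (("poly-Glu", "E" * 20), ("poly-Leu", "L" * 20)):
        print(name, is_mostly_disordered(seq))
```

Because the test uses only whole-sequence averages, it identifies proteins that are disordered throughout. Local disorder within otherwise ordered proteins requires a per-residue predictor, which is why the passage reports both measures.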
Considering experimental data on hubs (Table 12.2), intrinsic disorder can appar-
ently contribute to hub function in three different ways (Dunker et al. 2005). First, the
disorder of a hub protein can provide the structural basis of binding promiscuity. Such
hubs are exemplified by proteins which are fully disordered (α-synuclein, caldesmon,
high-mobility group protein A (HMGA), and synaptobrevin) and some proteins, which
are “mostly” disordered (i.e., have the majority of their residues in local disorder (BRCA1
and XPA)). Second, partially disordered hubs (p53 and murine double minute 2 [MDM2]) have
fewer residues in local disorder than in local order, and their disordered regions constitute
domains/linkers next to, or between, ordered domains. Third, there are certain hubs
(14-3-3 domain, actin, and CaM), which are well-structured and contain very little pre-
dicted disorder. All three types of behavior, however, are linked with protein disorder
in one way or the other.
MDM2 is also involved in interactions with many other proteins. The partners are
usually classified as effectors (i.e., upstream regulators of MDM2) and affectors (i.e.,
downstream proteins regulated by MDM2) (Iwakuma and Lozano 2003). Among the
effectors, interaction with ARF blocks nucleocytoplasmic shuttling of MDM2 and thus
enhances p53 function (Tao and Levine 1999). HIF-1α probably has a similar func-
tion, because direct interactions between HIF-1α and MDM2 modulate p53 function
(Chen, Luo, and Gu 2003). MDM2 is also the target of several kinases; phosphorylation
by ataxia-telangiectasia mutated (ATM) kinase and by c-Abl interferes
with the interaction of MDM2 with p53 and impairs degradation of p53. Ribosomal
proteins, such as L11 (Lohrum et al. 2003), L5, and L23 (Dai and Lu 2004), bind at the
central acidic region, sequester MDM2 in the nucleolus, and/or directly interfere with
p53 ubiquitination, thus stabilizing p53. Interaction with p300/CBP, on the other hand,
cooperates in the degradation of p53 (Grossman et al. 1998). MDM2 also affects the
activity of several interacting proteins, such as retinoblastoma protein (RB), Sp1 tran-
scriptional activator, E2F1, and p300/CBP. Some other proteins, such as the androgen
receptor (AR) and Numb, are also targeted by the E3 ubiquitin ligase activity of MDM2
(Iwakuma and Lozano 2003).
CaM also involves the flexibility/disorder of the partner (Radivojac et al. 2006). A criti-
cal element of evidence is that often CaM-dependent enzymes are also stimulated by
limited proteolytic digestion (e.g., calcineurin [Manalan and Klee 1983] or cyclic
nucleotide phosphodiesterase [Tucker et al. 1981]), which brings them into a state in which they
can no longer respond to, or bind, CaM (see Section 12.2.2). Further, the wrapping of
CaM around the binding peptide demands an open spatial location of the peptide, which
is most easily reconciled with disorder, as also shown by the structural state of binding
regions of CaM. Of 42 CaM-partner structures in PDB, CaMBT appears to be properly
folded in 3 cases only, whereas it is missing in 4 cases (e.g., in the case of calcineurin
[Kissinger et al. 1995]), it had been removed in 24 cases (as often done with disordered
segments to help crystallization), and it is either in crystal contacts or in interchain
contacts in a further 11 cases.
[Figure 12.8 axes: Median Disorder versus Complex Size (Single, 2–4, 5–10, 11–100)]
Figure 12.8 Predicted and observed disorder in complexes of various size. Average dis-
order of complexes of various numbers of subunits was either predicted by the IUPred algo-
rithm or determined by examining their individual components in the PDB. The values thus
determined for individual protein components are averaged within four groups (i.e., single
proteins and complexes of 2–4, 5–10, and 11–100 subunits) (light gray: yeast; white: E.
coli, both predicted; dark gray: E. coli, observed). Reproduced with permission from Hegyi
et al. (2007), BMC Struct. Biol. 7, 65. Copyright by BioMed Central Ltd.
through the mating pathway. The protein is 917 amino acids in length, it contains only
a single RING-type Zn-finger domain, and it has the capacity to tether and activate the
respective pathway members.
A scaffold protein in the post-synaptic density (PSD) is the CASK-interactive protein
Caskin1 (Tabuchi et al. 2002), a multi-domain protein of 1,430 amino acids, possessing
6 ankyrin repeats, 2 sterile-α motifs (SAM domains), and a single SH3 domain in the
N-terminal part. There are no recognizable domains in its C-terminal 800 amino acids,
which are dominated by a long, disordered Pro-rich region (Balázs et al. 2009). Caskin1
can bind the CASK adaptor protein (Tabuchi et al. 2002), the Abl-interactor-2 (Abi-2),
and nine other proteins, and is presumably involved in the assembly of the PSD and in
signaling related to Abl tyrosine kinases.
12.7.1 Sup35
The Sup35 prion was first described as the genetic element [PSI+] in yeast, which causes
translational read-through and is inherited in a non-Mendelian manner (Lindquist
1997). This unusual behavior can be ascribed to the altered conformation of a cellular
protein, Sup35p, which is part of the translational termination complex. The protein is
composed of a disordered, Q/N-rich N-terminal domain (NTD, or NM region; see Chapter
5, Section 5.2.5.2, and Chapter 10, Section 10.5.1.2) (Mukhopadhyay et al. 2007) and a
globular CTD that forms part of the complex. When the NTD undergoes self-sustain-
ing transition to the prion (amyloid) state (Nelson et al. 2005), it occludes the globular
domain, which can no longer take part in complex formation. This suppresses the ter-
mination of translation at stop codons, causing translational read-through, which may
provide functional advantages under certain circumstances (Li and Lindquist 2000).
12.7.2 Cytoplasmic Polyadenylation Element Binding Protein
Arguably, the most intriguing example of disorder in a functional prion is cytoplasmic
polyadenylation element binding protein (CPEB) of the marine snail Aplysia califor-
nica. This is a neuronal member of a larger family, which regulates mRNA translation
by promoting the polyadenylation of cytoplasmic mRNA, thus activating “dormant”
message and facilitating local protein synthesis at activated synapses (Si et al. 2003a;
Si, Lindquist, and Kandel 2003b). Neuronal CPEB has a Q/N-rich NTD that resem-
bles yeast prion-determinants with predicted conformational flexibility. Expressed as a
fusion construct in yeast, this region brings about epigenetic changes of the cell, which
is a hallmark of yeast prions (Li and Lindquist 2000; Wickner et al. 2004). In the syn-
apses of the snail activated by repetitive neuronal stimuli, its expression is up-regulated,
which promotes its transition to the prion state. This altered state of CPEB serves as a
molecular marker that confers synapse specificity and promotes synaptic growth asso-
ciated with the maintenance of long-term facilitation. Surprisingly, it is the dominant,
self-perpetuating prion-like form that has an elevated capacity to stimulate translation
of CPEB-regulated mRNA. By all criteria, CPEB is a prion with the physiological func-
tion of strengthening synaptic communication in memory formation (Si et al. 2003a;
Si et al. 2003b).
13 Evolution and Prevalence of Disorder
The evolutionary history of disorder is of particular importance because disorder corre-
lates with regulatory functions that have undergone an expansion in higher multicellular
organisms. Such functions are often missing from bacteria, which raises several issues
with respect to the generation and evolutionary modification of genes encoding for
intrinsically disordered proteins (IDPs). In addition, tracking the evolutionary history
of a protein is very closely related to understanding the molecular basis of its func-
tion, because selection among functional variants generated by mutations is intimately
linked with their phenotypic effects (i.e., functional readout).
potential functional significance are even more prevalent, reaching 63% in Drosophila
(Table 13.1). Predictions by DISOPRED2 corroborate these results, with somewhat
different levels due to differences in the false-positive rates of the predictor (Ward et
al. 2004). In this case, the frequency of proteins with at least one IDR ≥30 residues
[Figure 13.1 axes: Proteins (%) for Genome, Proteome, and Essential]
Figure 13.1 Structural disorder in E. coli and S. cerevisiae genomes and pro-
teomes. Structural disorder was predicted by IUPred for the E. coli (gray columns) and
S. cerevisiae (white columns) genomes, proteomes, and essential proteins. The percentage
of proteins with at least one IDR ≥30 consecutive residues is shown. Reproduced with
permission from Tompa et al. (2006), J. Proteome Res. 5, 1996–2000. Copyright by
Elsevier Inc.
been studied yet. The third possible mechanism is gene duplication, which is the leading
mechanism of the generation of novel genes (Conrad and Antonarakis 2007). In terms
of the generation of novel genes encoding IDPs, this would assume that when genes
of ordered proteins duplicated, one copy preserved its original structure, whereas the
other became an IDP by acquiring multiple mutations. This mechanism is somewhat
unlikely, because it assumes a series of mutations that can lead from an ordered
to a disordered state, preserving functionality without degenerating into a pseudogene
(Chothia et al. 2003; Prince and Pickett 2002). There is more evidence for duplica-
tions and exchange at the domain level—attaching a disordered domain to an already
existing protein. Such events allowed gradual evolutionary changes and experimenta-
tion with chimera constructs that preserved their original function, undergoing stepwise
modifications.
others (Paunola, Mattila, and Lappalainen 2002). The domain occurs in different
sequence contexts, but almost always in a Pro-rich region and with conserved features,
which suggest that in all homologs it functions in actin binding. The binding event
has different functional outcomes, because a single WH2 domain in Tβ4 inhibits actin
polymerization (Figure 11.5), whereas several tandem WH2 domains in other actin-
binding proteins promote actin polymerization (Chereau et al. 2005) (see also Chapter
11, Section 11.6.1).
The possible generality of disorder spreading by domain duplications and exchange
between genes is also underscored by the observation that about 14% of all Pfam
domains are mostly disordered by prediction (for further examples, see Chapter 14,
Section 14.2.4). Because Pfam domains are, by definition, homologous, they must have
spread by duplications and module exchange, which suggests that these mechanisms
contributed significantly to the spread of disorder in eukaryotes.
[Figure: axis labels Protein Family and Higher Variability; categories Disordered and Globular]
having a DNA-binding domain that tends to be ordered and a TAD(s) that tends to
be disordered (see Chapter 11, Section 11.2.1), these results suggest that disordered
TADs tolerate a significantly greater number of nonsynonymous mutations (i.e., they
evolve in an almost neutral fashion).
The linker region (termed intrinsically unstructured linker domain, IULD) of
RPA70, the 70-kDa subunit of replication protein A, also appears to evolve
neutrally. RPA70 plays a critical role in replication, recombination, and DNA repair and
has an N-terminal DNA/protein-interaction domain (DBD F) connected by the linker
to two tandem high-affinity, single-stranded DNA-binding domains (DBD A and B).
Sequences from distant species (animals, fungi, and plants) are too diverged to be aligned,
and evolutionary variability of the linker region can only be approached by examining
more closely related mammalian sequences (Daughdrill et al. 2007), very much like
in the case of SRY. Most sites in the linker region evolve nearly neutrally, with certain
interspersed conserved sites, which happen to be mostly Gly residues (six are preserved
in all nine mammalian homologs, and six are conserved in eight of them). Apparently,
rapid neutral evolution is compatible with flexibility being the primary functional pre-
requisite of the linker (see Chapter 13, Section 13.4.1), which also explains the presence
of conserved Gly residues critical for maintaining flexibility.
The analysis of a collection of 126 IDPs showed directly that the percentage of pro-
teins with tandemly repeated segments is much higher in IDPs (39%) than in SwissProt
(14%), yeast (18%), or human (28%) proteins (Tompa 2003b). Repeat regions make up
a very large fraction, about 34%, of all IDPs, as opposed to about 7% of SwissProt
proteins. Microsatellite and minisatellite sequences are about equally represented, and
they are often essential to the function of the protein. In addition, these regions often
show an exceptional evolutionary activity (i.e., repeat length variation). The possible
mechanisms are discussed next.
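To make the notion of a tandemly repeated segment concrete, the following minimal sketch finds exact, back-to-back repeats in a sequence and reports the fraction of residues they cover. It is an illustration only, not the procedure used in the cited analysis: the function names, the unit-length limits (2 to 10 residues), and the restriction to perfect repeats are all assumptions made here, and real repeat-detection tools also tolerate imperfect copies.

```python
import re


def tandem_repeats(sequence, min_unit=2, max_unit=10, min_copies=2):
    """Return (start, unit, copies) for exact tandem repeats in a sequence.

    Hypothetical helper: only perfect, directly adjacent copies are detected.
    """
    # A unit of min_unit..max_unit residues followed by at least
    # (min_copies - 1) further identical copies of the same unit.
    pattern = re.compile(
        r"(.{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


def repeat_fraction(sequence, **kwargs):
    """Fraction of residues that fall inside detected tandem repeats."""
    if not sequence:
        return 0.0
    covered = sum(len(unit) * copies
                  for _, unit, copies in tandem_repeats(sequence, **kwargs))
    return covered / len(sequence)


if __name__ == "__main__":
    toy = "AAPQGGPQGGPQGGTT"  # made-up sequence with a PQGG unit repeated three times
    print(tandem_repeats(toy))
    print(repeat_fraction(toy))
```

Applied to every protein in a dataset, a per-protein repeat fraction of this kind is what allows repeat content to be compared between collections such as IDPs and SwissProt.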
[Figure 13.3 panels: Type I, Type II, Type III]
Figure 13.3 Repeat expansion in the evolution of IDPs. IDPs/IDRs are often made up of
internal repeats, which may follow three evolutionary routes of expansion. Type I denotes
regions in which repeats generated by tandem duplication(s) remain functionally equivalent.
Repeat units in type II regions diversify due to mutations leading to changes in sequence. A
type III repeat region is envisaged to acquire a novel function as a consequence of expan-
sion. Reproduced with permission from Tompa (2003), BioEssays 25, 847–55. Copyright by
Wiley Periodicals.
of some changes in repeat sequences and/or different distances from the catalytic unit
of the polymerase. Titin followed a different evolutionary path, in the sense that repeat
units of its PEVK domain remained functionally equivalent in terms of the physical
elasticity they provide. Copper binding by the prion octarepeat represents still another
evolutionary alternative, because it indicates a possible sudden functional change when
the repeat region acquired physiologically significant affinity upon extending to four
repeats. These three alternative mechanisms represent three different types of logic in
IDP function and evolution by repeat expansion (Tompa 2003b).
from the three kingdoms of life—two from animals, one from fungi, and two from
plants—show very similar backbone flexibilities by NMR. Thus, the entropic-chain
function of an IDP can be retained in the face of negligible sequence conservation.
The conservation of recognition functions poses an even more serious challenge
in light of the range of sequence variations commonly observed in IDPs (Brown et
al. 2002). The solution may reside in the special binding mode of IDPs (i.e., that they
often recognize their partners by virtue of short recognition motifs) (see Chapter 14,
Section 14.2). These elements are often constructed from a few specificity determinant
residues interspersed in highly variable and disordered regions (Fuxreiter, Tompa, and
Simon 2007). Apparently, a large fraction of these recognition sequences functions
as a linker that is rather free to mutate, and only the very small fraction of direct
recognition residues is subject to evolutionary constraints, practically falling to the level of
noise when considering the variability of the entire domain/protein.
Calpastatin, the inhibitor of calpain, demonstrates this situation. Calpastatin is
composed of four equivalent inhibitory domains of about 140 amino acids, each capable
of very tight and specific inhibition of the enzyme with inhibitory constants ranging
from 4.5 pM to 4 nM (Hanna, Garcia-Diaz, and Davies 2007). Because the inhibitor
has co-evolved with its cognate enzyme and each domain can inhibit the same enzyme
species, the absolute conservation of a recognition function is assured. Alignment of
the domains (Figure 13.4) shows the presence of short, conserved segments within each
domain (termed subdomains, marked A through D). Subdomains A, B, and C (Ma et al.
1994; Ma et al. 1993; Takano et al. 1995), and probably also subdomain D (Kiss et al.
2008b) serve as the recognition determinants of the inhibitor, and are in direct contact
with the enzyme, whereas the linker regions connecting them remain free even in the
state bound to the enzyme (Kiss et al. 2008b; Moldoveanu, Gehring, and Green 2008).
Binding occurs through a few specificity-determinant residues only, but the intervening
Figure 13.4 Alignment of four calpastatin domains. The alignment of the four inhibitory
domains of calpastatin exemplifies how function of disordered proteins is preserved in the
face of limited amino acid sequence conservation. Reproduced with permission from Kiss
et al. (2008), Biochemistry 47, 6936–45. Copyright by the American Chemical Society.
regions are rather insensitive to the identity of the actual residues (Betts et al. 2003).
The flexible linkers separating subdomains are largely variable, because they only have
to ensure a range of distances and relative orientations of the subdomains for effective
recognition. Thus, binding results from the combination of very short subsites connected
by flexible linkers, which overall provides respectable binding strength and specificity,
yet it enables large evolutionary variability (Hanna et al. 2007).
[Figure labels: Original; Copy; Mutation and selection]
populated conformation has more potential for another (binding) function. Initially,
this secondary activity provides only a limited fitness advantage, because binding of
the primary substrate will sequester most of the protein. Improvement through muta-
tion is only possible to a limited extent because such mutations might decrease the
primary activity. Following gene duplication, however, one gene copy becomes free
to evolve without compromising the original activity, and its mutations could improve
the secondary activity very rapidly (Figure 13.5). After successive rounds of mutation
and selection, primary activity of the second copy may completely differ. Because the
ensemble of structures of IDPs already harbors the capacity to manifest different func-
tions, as formulated in the concepts of binding promiscuity (Kriwacki et al. 1996) and
moonlighting (Tompa, Szasz, and Buday 2005), this evolutionary scenario may be of
prime importance in the case of IDPs.
14 Extension of the Structure-Function Paradigm
This chapter discusses how the rapidly accumulating structural and functional
information on intrinsically disordered proteins (IDPs) appears to solidify into a
consistent framework of functional modes (i.e., how function can be interpreted
in terms of the structural features of IDPs). This information is closely related to
the functional classification scheme outlined in Chapter 12, but with a different
emphasis. Rather than trying to classify IDPs by function, here special mechanistic
features of their action are considered, which are often thought of as imparting
“functional advantages” on IDPs. The two dominant elements of these are molec-
ular recognition accompanied by local induced folding and entropic-chain-type
functions that directly stem from disorder. These principles appear in various com-
binations in actual situations, and together they contribute toward formulating an
extended structure-function paradigm that can encompass both ordered and disor-
dered proteins.
1. Linkers and spacers, which provide appropriate spatial separation and search
of binding/catalytic domains or elements (e.g., linker region of cellulase E
(von Ossowski et al. 2005))
molecular interfaces (Gunasekaran et al. 2004; Meszaros et al. 2007) and also the
probable structural preferences of IDPs in the unbound state. Such studies have led
to the concept of preformed structural elements (PSEs) (Fuxreiter et al. 2004) and
intrinsically folded structural units (IFSUs) (Sivakolundu, Bashford, and Kriwacki
2005).
The analysis of 26 such intrinsically disordered region (IDR)/partner complex
structures (Fuxreiter et al. 2004) showed that the accuracy of predicting secondary
structural elements in IDPs in the bound state is higher than that of their partner pro-
teins and is significantly higher than the corresponding values for random sequences
(Figure 14.1). This observation suggests that IDPs have rather strong intrinsic prefer-
ences for the conformation they attain when bound to their partners, which may be
interpreted in terms of the partial preformation of their recognition segments in the
free state. The relationship is strongest for helices and weakest for coils. Although these
results are not conclusive with respect to the mechanism of binding (i.e., whether these
elements are truly preformed), often a similar structure in the unbound and bound states
is observed when the IDP is characterized by nuclear magnetic resonance (NMR) (see
Chapter 10, Sections 10.2.3 and 10.2.4, and Table 10.1). For example, this correlation
was observed in the case of the kinase-inducible domain (KID) of cyclic-AMP response
element-binding protein (CREB) (Parker et al. 1999; Radhakrishnan et al. 1998), p21Cip1/
p27Kip1 (Kriwacki et al. 1996; Lacy et al. 2004; Sivakolundu et al. 2005), p53 (Lee et al.
2000), FlgM (Daughdrill, Hanely, and Dahlquist 1998; Dedmon et al. 2002; Sorenson,
Ray, and Darst 2004), PKI alpha (Hauer et al. 1999a), Tβ4 (Domanski et al. 2004), and
measles virus nucleoprotein (Longhi et al. 2003). Whether such preformed elements
[Figure 14.1 axes: Prediction Accuracy (%) for IDP, IDP-rand, Glob, and Glob-rand]
Figure 14.1 Predictability of the secondary structure of IDPs in the bound state.
A selection of 26 IDP structures in complex with their partners was analyzed for the
predictability of the secondary structure attained in the complex by ALB for IDPs (IDP), ran-
domized sequences of IDPs (IDP-rand), sequences of globular partners (Glob) and random-
ized sequences of partners (Glob-rand). Intrinsic structural preferences of IDPs are strongly
correlated with their conformation attained in the bound form. Data from Fuxreiter et al.
(2004).
also serve as initial contact points of interaction is a matter of speculation, but they
probably limit the entropic penalty of the induced folding process.
14.2.2 Linear Motifs
The concept of linear motifs (LMs, also denoted as eukaryotic linear motifs (ELMs)
and short linear motifs (SLiMs)) derives from analyzing the sequences involved in
interactions. In certain proteins, the element of recognition is a short motif of discernible
conservation, often denoted as a “consensus” sequence, such as modification sites of
kinases or binding sites of SH3 domains (Neduva and Russell 2005). LMs are usu-
ally constructed as a few conserved specificity determinant residues interspersed with
residues hardly constrained, with a typical length between 5 and 25 residues. They are
usually described as a short sequence pattern, in which certain sites are restricted (RSs,
e.g., P in Pxx for the SH3 binding sites), whereas others are rather freely exchangeable
(non-restricted sites, NRSs, marked with “x” in the above pattern). The first set of
residues serves as specificity determinants, whereas the second set likely acts as spacers.
Due to their limited information content, LMs are much more difficult to identify by
sequence comparisons than domains. A traditional Basic Local Alignment Search Tool
(BLAST) search cannot positively identify LMs, and special algorithms that combine
functional/structural clues with sequence analysis of non-globular regions had to be
developed for this purpose (e.g., DILIMOT (Neduva and Russell 2006) and SLiMDisc
(Davey, Shields, and Edwards 2006)).
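The pattern notation described above, with restricted sites written as residue letters and non-restricted sites as "x", maps directly onto a regular expression. The sketch below is a minimal illustration, assuming the widely quoted PxxP core of SH3-binding sites as the example pattern; the function names and the toy sequence are hypothetical. It also shows why such searches are so unspecific: any two prolines three positions apart will match.

```python
import re

# The 20 standard amino acids in one-letter code; "x" may be any of them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def motif_to_regex(pattern):
    """Translate a motif such as 'PxxP' into a compiled regular expression.

    Upper-case letters are restricted sites; lower-case 'x' marks a
    non-restricted site. The lookahead lets overlapping hits be reported.
    """
    body = "".join(
        "[" + AMINO_ACIDS + "]" if ch == "x" else re.escape(ch)
        for ch in pattern
    )
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, pattern):
    """Return (start, matched_segment) for every occurrence, 0-based."""
    regex = motif_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "AAPPLPPRTT"  # made-up sequence, not a real protein
    for start, segment in find_motif(toy_sequence, "PxxP"):
        print(start, segment)
```

Because a four-residue pattern with only two restricted sites is expected to occur by chance in many proteins, dedicated tools combine such pattern matching with structural or functional context rather than relying on the pattern alone.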
LMs described in the literature have been collected in the eukaryotic lin-
ear motif (ELM) database available via the ELM server (Puntervoll et al. 2003),
which contains about 800 examples of more than 100 ELMs. LMs are generally
thought to correlate with local disorder (Linding et al. 2003b; Puntervoll et al.
2003), as confirmed in a systematic bioinformatic analysis of the ELM database
(Fuxreiter, Tompa, and Simon 2007). The analysis suggests that LMs and their
flanking segments of about 20 residues in both directions tend to be locally disordered
(Figure 14.2A). The amino acid composition of the motifs resembles the characteristic
composition of IDPs (Figure 14.2B), but at certain points the similarity
breaks down, because LMs are enriched in hydrophobic residues Trp, Leu, Cys, and
Tyr and the charged residues Arg and Asp. Further, LMs are depleted in Gly and
Ala and enriched in Pro.
Marked differences in the amino acid frequencies of RS and NRS positions explain
these propensities. At the conserved positions, either hydrophobic and rigid, or charged
and flexible residues are preferred, whereas in NRS positions, excessive flexibility, very
similar to that of IDPs, can be observed. The only exception is Pro, which is in excess in
both RS positions and LM flanking regions, indicating its dual role as a contact residue
within LMs and promoter of an open structure outside LMs. Overall, the unique amino
acid composition suggests a mixed nature of LMs, with a few specificity-determinant
residues strongly favoring order, grafted on a completely disordered carrier sequence
flanking and intervening the region critical for interaction.
[Figure 14.2 panels: (A) IUPred Disorder profile; (B) Amino Acid Propensity for DisProt, LM, Flank, and LM + flank, by residue (W C F I Y V L H M A T R G Q S N P D E K)]
Figure 14.2 Linear motifs tend to fall into local disorder and are enriched in a special set
of amino acids. Short recognition elements (linear motifs, LM) have been collected from
the ELM database (Puntervoll et al. 2003). (A) Disorder profiles by the IUPred algorithm
were computed and averaged. A thin horizontal line at 0.5 shows the threshold of disorder,
whereas a dotted line at 0.4 shows the average score for experimentally verified disordered
proteins in DisProt (Sickmeier et al. 2007). Standard error of the mean (SEM) values are
displayed by an error bar. (B) Amino acid propensities of LMs and their flanking regions were
also calculated and are shown as the difference between that of LMs and globular proteins.
IDPs of the DisProt database (light gray), LMs (dark gray), 20-residue-long LM flanking seg-
ments (white), and LMs plus 20-residue-flanking segments (black) are shown. Reproduced
with permission from Fuxreiter et al. (2007), Bioinformatics 23, 950–6. Copyright by Oxford
University Press.
14.2.4 Recognition by Domain-Sized Motifs and Mutual Folding
The three concepts of short recognition motifs of IDPs (PSEs, LMs, MoRFs) can be
considered as manifestations of the same underlying principle of binding of an ordered
partner by a short segment within a disordered region, which undergoes disorder-to-order
transition or folding induced upon binding (Dyson and Wright 2002a). The implicit
assumption in this view is that motifs are always short, on the order of 5 to 25 resi-
dues in the case of LMs (Fuxreiter et al. 2007) and less than 30 residues in the case of
MoREs (Oldfield et al. 2005b). The definition of MoRFs allows somewhat longer disor-
dered binding motifs that fall between 10 and 70 residues (Mohan et al. 2006), which
raises the possibility that interactions of IDPs might actually conform to two principally
different concepts.
There are several examples in the literature in which binding of an IDP involves a
domain-sized segment, far too long to be considered a short motif (see Chapter 13, Section
13.1.3). For example, such a binding mode has been described in the case of the disor-
dered Smad-binding domain (SBD) of the Smad anchor for receptor activation (SARA)
binding to the MH2 domain of Smad (Chapter 12, Figure 12.6) in transforming growth
factor beta (TGF-β) signaling (Wu et al. 2000), the KID domain of the cyclin-dependent
kinase (Cdk) inhibitor p27Kip1 binding to the Cyclin A-Cdk2 complex (Chapter 10,
Figure 10.3) (Russo et al. 1996), botulinum neurotoxin serotype A (BoNT/A) binding to
SNAP-25 (Breidenbach and Brunger 2004; Brunger et al. 2007), the Wiskott–Aldrich
syndrome protein (WASP) homology domain 2 (WH2) domain of Tβ4 binding to
G-actin (Chapter 11, Figure 11.5) (Irobi et al. 2004), E-cadherin cytoplasmic domain
(cytD) (Huber and Weis 2001) or the catenin binding domain (CBD) of T-cell factor
3 (Tcf3) binding to β-catenin (Chapter 11, Figure 11.3) (Graham et al. 2000), and the
GTPase-binding domain of WASP binding to the small GTPase Cdc42 (Figure 14.3A)
(Abdul-Manan et al. 1999). Although in principle these recognition events can be visu-
alized as binding by several short neighboring or even overlapping motifs, often there
are no apparent motifs within these domain-sized regions, and their binding is better
described as recognition by a disordered domain. In fact, about 14% of Pfam domains,
which by definition arose by evolutionary divergence (unlike motifs, which arise by
convergence), are mostly disordered (see Chapter 13, Section 13.1.3), which has led to the
extension of the domain concept to the disordered state (Tompa et al. 2009).
A further apparently closely related deviation from the simple picture of recogni-
tion by a short disordered segment is the mutual recognition of two IDPs in a pro-
cess of mutual induced folding, also termed as “co-folding” or “synergistic folding,” as
reported in the case of the interaction between Bob1 and Oct1 trans-activator domain
(TAD) (Lee et al. 2001), multiple vesicle-associated proteins (Dafforn and Smith 2004;
Williamson 1994), and CBP/p300 and p160 nuclear receptor co-activators (Demarest
et al. 2004; Demarest et al. 2002). Whereas in most cases these inferences rely only on
biochemical and functional studies, in the case of the mutual binding of the activator for
thyroid hormone and retinoid receptors (ACTR) domain of p160 and the MG-like nucle-
ar-receptor co-activator-binding domain (NCBD) of CBP, the structure of the resulting
complex is known in atomic detail (Figure 14.3B). The two domains completely wrap
[Figure 14.3 panels: A, B]
Figure 14.3 Binding of IDPs by disordered domains and mutual induced folding of two
disordered recognition segments. (A) The disordered GTPase binding domain (GBD) of
WASP (dark gray) bound to the small GTPase Cdc42 (light gray, pdb 1cee). (B) The structure
of the complex that results from the mutual folding of disordered NCBD of CBP (light gray)
and disordered ACTR domain of p160 (dark gray, pdb 1kbh).
around each other and bury a surface area of 1,500 Å² of primarily hydrophobic nature.
The interaction is rather tight (Kd = 3.4 × 10⁻⁸ M) and is driven by enthalpy (∆H° = –31.7
kcal mol⁻¹), which compensates for the high entropic cost (T∆S° = –21.3 kcal mol⁻¹)
associated with the induced folding of both ACTR and NCBD domains.
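The thermodynamic values quoted for the ACTR/NCBD complex can be checked for internal consistency, because the binding free energy follows both from ΔG° = ΔH° − TΔS° and from ΔG° = RT ln Kd. The following is a minimal sketch of that check. The temperature of 298.15 K is an assumption (the text does not state the measurement temperature), as are the 1 M standard state and the function names.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL_PER_MOL_K = 1.987204e-3


def delta_g_from_terms(delta_h, t_delta_s):
    """Binding free energy from enthalpy and entropy terms (kcal/mol)."""
    return delta_h - t_delta_s


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy from a dissociation constant (kcal/mol).

    Assumes a 1 M standard state; the default temperature is an assumption.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # Values quoted in the text for the ACTR/NCBD interaction.
    dg_terms = delta_g_from_terms(delta_h=-31.7, t_delta_s=-21.3)
    dg_kd = delta_g_from_kd(kd_molar=3.4e-8)
    print(f"dG from dH - TdS: {dg_terms:.1f} kcal/mol")
    print(f"dG from RT ln Kd: {dg_kd:.1f} kcal/mol")
```

Under these assumptions both routes give a free energy of roughly −10 kcal mol⁻¹ (about −10.4 from the enthalpy and entropy terms and about −10.2 from Kd), so the quoted values are mutually consistent: roughly two thirds of the favorable enthalpy is spent on the entropic cost of folding the two domains.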
[Figure 14.4 panels: (A) surface area per residue versus Interface Area/Residue (Å²); (B) Protein (%) versus Number of Segments]
Figure 14.4 Surface and interface area and segmentation of interfaces of IDPs. (A) IDPs
use a large fraction of their surface for binding. The total surface area per residue is given
as a function of the interface area per residue for the smaller chain of ordered complexes
(dark gray triangles) and for disordered proteins in complex with an ordered protein (light
gray squares). (B) IDPs (light gray) tend to use fewer segments to make up the binding site
than globular proteins (dark gray). The distribution of interfaces with the given number of
non-continuous sequence segments is shown. Reproduced with permission from Meszaros
et al. (2007), J. Mol. Biol. 372, 549–61. Copyright by Elsevier Inc.
state. Exposure of hydrophobic amino acids and/or a compositional bias favoring them
at the interface also follows from the analysis of ELMs (Fuxreiter et al. 2007), two-state
complexes (Gunasekaran et al. 2004), and MoRFs (Vacic et al. 2007).
In line with these characteristic composition values, IDP interfaces make much
more hydrophobic–hydrophobic contacts (IDPs: 33%, ordered proteins: 22%), whereas
ordered proteins make significantly more polar–polar contacts (IDPs: 27%, ordered
proteins: 33%) (Gunasekaran et al. 2004; Meszaros et al. 2007). The probable reason
for these distinctions is that IDPs require more enthalpic stabilization to counteract their
decrease in configurational entropy, but probably also that they are less able to shield
interactions of polar residues from hydrate water. In relation to this difference, IDP inter-
faces are tighter (i.e., structurally more complementary), probably due to a better adapta-
tion to the structure of the partner enabled by their induced folding. Structural adaptation
of ordered proteins is limited due to their much lower level of conformational freedom.
These observed differences also manifest themselves in differences in the interaction
energies of the two types of complexes, as demonstrated by the IUPred (Dosztanyi et al.
2005a; Dosztanyi et al. 2005b) algorithm developed to estimate pair-wise inter-residue
interaction energy of IDPs (see Chapter 9, Section 9.4.2). Its application toward analyz-
ing the interfaces (Meszaros et al. 2007) suggested that ordered proteins tend to realize
more stabilizing interactions within their polypeptide chains, whereas IDPs derive more
stabilization from the interaction with the partner than from interactions within their
own chain. The overall balance is therefore shifted towards the folded state only in the
presence of the partner, explaining why IDPs do not fold in isolation.
14.2.6 Unification of Concepts?
The different concepts of short recognition elements are based on different premises and
do not necessarily correspond to the same structural and functional feature, although
some unifying themes do appear. PSEs by definition exist in a conformational state
in solution similar to that in the bound state, which corresponds primarily to α-helices.
In this respect, their overlap is most apparent with α-MoRFs and maybe also with
β-MoRFs. In accord, the predictability of PSE structures in the bound state (Fuxreiter
et al. 2004) is also observed for MoRFs (Mohan et al. 2006). LMs, on the other hand,
are defined at the level of sequence, and could correspond to all three MoRF classes,
and, if they have a discernible preference for some local fold, also to PSEs. On the other
hand, PSEs by definition are intrinsically disordered in isolation, whereas LMs and
MoRFs can also occur in ordered regions of the proteins. Thus, in many instances, a
short recognition element conforms to all three definitions, but further work is needed
to arrive at a unified concept.
recognition (Dyson and Wright 2002a), the exact mechanism and thermodynamic
consequences of the process are often rather obscure. One key issue is whether folding
occurs before, after, or concomitantly with binding. Experimental evidence seems to
support all these varieties, and a certain level of unification is feasible, especially if
the strong mechanistic parallels of induced folding with the process of protein folding
(Daggett and Fersht 2003; Gianni et al. 2003) are taken into account (see Chapter 1,
Section 1.6).
The implicit assumption in the PSE, and maybe also in the MoRF concept, is that
the polypeptide chain preferentially samples the local conformation attained in the com-
plex. Due to the inherent stability of α-helix conformation, this correlation usually corre-
sponds to a transient helical structure (see Section 14.2.1 and Chapter 10, Section 10.2.3
and Table 10.1). In some cases, an extended conformation may also appear locally, such
as a β-strand in the case of fibronectin binding protein (FnBP) (Penkett et al. 1998) and
polyproline II (PPII) conformation in the case of partners of the Pro-rich peptide binding
GYF domain (Gu et al. 2005). To uncover the actual mechanism of binding at atomistic
detail, the process of induced folding has been approached by site-directed mutagenesis,
molecular dynamics simulation, and NMR spectroscopy. The results appear to depend
very much on the actual system studied, with no apparent generalizations.
14.3.1 Site-Directed Mutagenesis Studies of Induced Folding
Site-directed mutagenesis can be primarily used to stabilize or destabilize recognition
helices in IDPs, to address if the change inflicted upon the unbound state affects the
process of binding. One of the best characterized systems is the binding of p27Kip1 to the
Cyclin A-Cdk2 complex, mediated by the KID domain (see Chapter 3, Section 3.7.2;
Chapter 10, Section 10.2.3.1; and Chapter 15, Section 15.1.3).
As shown by NMR and MD, the structure of KID in solution has a significant preference
to locally populate the conformational elements of the bound state (see Chapter 10,
Figure 10.3), and thus binding may proceed from recognition by the PSE α-helix (linker
helix LH, also termed an IFSU (Sivakolundu et al. 2005)). Analysis of the kinetics
and thermodynamics of binding of various truncated constructs, however, shows that
binding is initiated by the N-terminal coil segment (domain 1), followed by wrapping
around in a staple-like fashion, binding at the active site of the kinase (domain 2), and
finished off by stabilization of LH (details in Chapter 3, Section 3.7.2). Because the helix
defines the geometry of binding, it might be expected to function as a PSE or IFSU that
initiates the recognition process. This is not the case, however.
Mutagenesis studies showed that the final formation of the helix only occurs after
the transition state of binding (i.e., binding is initiated by a non-structured state (seg-
ment) of the protein) (Bienkiewicz, Adkins, and Lumb 2002). Stabilization of LH helix
by a triple-Ala mutation (E40A/D44A/K47A) or its destabilization by single-Pro muta-
tions (L41P, K47P, and A55P) hardly affect the equilibrium Kd characteristic of the
formation of the p27-KID–Cyclin A-Cdk2 complex (8 ± 2 nM). The single mutant
L41P, which has half the preference for the helix in the unbound state, binds with about
the same affinity (9 ± 1 nM), whereas the other mutants K47P and A55P, in which helix
formation is practically abolished, bind with only slightly lower affinities (16 ± 3 nM
and 13 ± 2 nM, respectively). In addition, stabilization by a triple-Ala mutation does
not lead to a corresponding increase in stability, rather to a slight destabilization of the
complex (12 ± 3 nM). In kinetic experiments, the binding of the helix-stabilized E40A/
D44A/K47A mutant is actually three times slower than that of the wild-type protein or
a single-Pro mutant. Thus, stabilization of the helix results in a kinetic impediment to
binding, suggesting that binding is initiated by a locally unfolded or partially folded
state of p27Kip1 KID, and it proceeds through a rather disordered transition state.
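To put the small differences in Kd into energetic terms, the change in binding free energy relative to the wild type can be estimated as ∆∆G = RT ln(Kd,mutant / Kd,wild type). The following Python sketch applies this to the Kd values quoted above; the temperature of 298.15 K is an assumption, and the names used are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

# Equilibrium Kd values (nM) quoted in the text for p27-KID variants.
KD_NM = {
    "wild type": 8.0,
    "L41P": 9.0,
    "K47P": 16.0,
    "A55P": 13.0,
    "E40A/D44A/K47A": 12.0,
}


def ddg_binding(kd_mutant, kd_wild_type, temp_k=298.15):
    """Change in binding free energy (kcal/mol); positive means weaker binding."""
    return R_KCAL * temp_k * math.log(kd_mutant / kd_wild_type)


DDG = {
    name: ddg_binding(kd, KD_NM["wild type"])
    for name, kd in KD_NM.items()
    if name != "wild type"
}

for name, value in DDG.items():
    print(f"{name:>16}: ddG = {value:+.2f} kcal/mol")
```

Under these assumptions every mutant falls below about 0.5 kcal mol–1, which is less than the thermal energy RT (about 0.6 kcal mol–1), in line with the statement that helix stabilization or destabilization hardly affects the equilibrium affinity.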
Similar helix-stabilizing and destabilizing mutations unveiled a different binding
mechanism in the case of the transcription factor Gcn4p. Gcn4p contains a dimeric Leu-
zipper deoxyribonucleic acid (DNA)-binding motif, the coiled-coil element of which
has approximately 70% helix content in the absence of DNA, implying only partial
preformation of the zipper (Weiss et al. 1990). In the presence of DNA, α-helix content
increases to at least 95%. To probe into the importance of the formation of helical seg-
ments in the transition state, a series of quadruple amino acid replacements spanning
the entire helix propensity scale were generated at positions that do not directly inter-
fere with DNA binding (Zitzewitz et al. 2000). Binding strength of DNA was found to
correlate with helical propensity, which suggests that preformed elements of secondary
structure play the key role in recognition. This scenario was also corroborated by muta-
tions of the noncontacting Asp-Pro residues at the N-terminal end of the helical region.
Such N-capping motifs, which can stabilize α-helical structure, contribute significantly
to the stability of the complex of Gcn4p with cognate DNA (Hollenbeck, McClain, and
Oakley 2002).
Figure 14.5 Change in the turn conformation of CREB KID upon phosphorylation of Ser133.
The conformation of CREB KID was assessed by MD simulations in the Ser133-phosphorylated
state (pKID, left) and nonphosphorylated state (KID, right). Representative configurations
are shown, with the turn region Arg130–Ser133 displayed in dark gray. In pKID, a hydrogen
bond is maintained between pSer133 and Arg131 (marked by a dotted line), which stabilizes
the region in a binding-competent closed conformation. Reproduced with permission from
Solt et al. (2006), Proteins 64, 749–57. Copyright by Wiley-Liss.
NMR (Radhakrishnan et al. 1997) shows that CREB KID residues 120–144 undergo
induced folding, which results in two perpendicular α-helices (αA: Asp120–Ser129, αB:
pSer133–Asp144), connected by a short turn-like segment that harbors the phosphorylation
site Ser133 (Chapter 6, Figure 6.3), which plays a critical role in the function of
CREB (Parker et al. 1996; Zor et al. 2002). Because the two helices are transiently
populated in the solution state of CREB (αA 50% of the time, αB 10% of the time; see
Chapter 6, Section 6.3.1), the question whether helices form prior to or after the transi-
tion state of the binding reaction was addressed by MD simulations (Solt et al. 2006). It
was found that helical populations are hardly affected by phosphorylation that initiates
the interaction, whereas a subtle change in the turn region that connects the two helices
occurs (Figure 14.5). Here, phosphorylation induces a transient structural element that
resembles the bound conformation of the molecule, stabilized by the pSer133–Arg131
interaction. Its formation may limit the conformational search of the flanking helices
and initiate binding, thus serving as a PSE (and/or a primary contact site [PCS], see
Section 14.5.1). In the context of the role of local structure in the transition state of
binding, this MD study suggests that less importance be assigned to preformed helices,
and more importance to the turn region that connects them.
pKID forms an ensemble of transient encounter complexes with KIX, and is stabilized
primarily by non-specific hydrophobic contacts. In this complex, pKID explores an
ensemble of weak interactions with multiple sites on the KIX surface. The encounter
complex is characterized by the strong involvement of pSer133 and has further stabiliz-
ing hydrophobic contacts by Tyr134, Ile137, and Leu138, which lie on the contacting face of
helix αB. CSI values suggest that αB is only partly formed (up to 30%), whereas αA is
almost fully formed, but hardly makes any contacts with KIX. R2 relaxation dispersion
experiments corroborate that αA behaves as a single cluster, whereas αB behaves as
two separate clusters (i.e., it is incompletely folded).
Overall, the process is best described by an encounter complex dominated by
pSer133 interactions and hydrophobic contacts that anchor the pKID αB helix to
the hydrophobic groove of KIX in a partially formed state. Within this encounter
complex, there is a continuing conformational search for the favorable intermolecular
interaction, without pKID dissociating from KIX. This analysis and the MD study
(Solt et al. 2006) agree that the pSer133 region makes contacts critical for the forma-
tion of the encounter complex, and formation of the helices is not essential for the
recognition step.
where ∆SHE is the entropy change that results from the burial of hydrophobic surface
and ∆Srt is the entropy change that results from the loss of rotational-translational free-
dom. Often, ∆SHE is much larger than that expected assuming a rigid-body association
and much larger than the magnitude of ∆Srt, which suggests a large value for ∆Sother
associated with a change in conformational entropy (in the protein, DNA or both) upon
binding. In general, specific DNA sequences serve as better templates for folding of the
protein, and local or even global folding transitions are coupled to DNA binding at spe-
cific sites. Frequently and often unjustifiably, this analysis is generalized to all protein–
protein interactions of IDPs, and is thought to suggest that structural disorder confers the
ability of specific binding on IDPs, which may turn out to be generally true. In terms of
the original issue (i.e., the question of uncoupling specificity from binding strength), it
actually provides evidence for the opposite, showing that the increased conformational
freedom enables IDPs to actually realize a stronger binding with the specific partner.
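The entropy terms named above are conventionally combined additively. Assuming this usual decomposition (the symbol ∆S_assoc for the total entropy change of association is introduced here for illustration only), the relation and the way ∆S_other is obtained from it can be written as:

```latex
\[
\Delta S_{\mathrm{assoc}}
  = \Delta S_{\mathrm{HE}} + \Delta S_{\mathrm{rt}} + \Delta S_{\mathrm{other}}
\quad\Longrightarrow\quad
\Delta S_{\mathrm{other}}
  = \Delta S_{\mathrm{assoc}} - \Delta S_{\mathrm{HE}} - \Delta S_{\mathrm{rt}}
\]
```

In this form, ∆S_other is simply the part of the measured entropy change that is not accounted for by hydrophobic burial and by the loss of rotational-translational freedom.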
Figure 14.6 The structure of the tripartite I-2–PP-1 complex. Inhibitor-2 wraps around
protein phosphatase 1cγ and contacts the enzyme by three isolated binding segments
(encompassing residues 12–17, 44–56, and 130–169). The structure has been solved by
X-ray crystallography (Hurley et al. 2007). The light gray ribbon represents the structure of
PP1cγ, and the inhibitor is shown in dark gray (pdb 2o86).
the chelate effect, in which binding by multiple weak subsites of the same molecule
results in a strong and probably very specific binding. For example, PRPs/PRGs in
saliva function by the strong multidentate binding of tannins (Charlton et al. 1996;
Hagerman and Butler 1981) by their tandemly repeated Pro-rich sequences (see Chapter
12, Section 12.5.1). Inhibitor-2 (I2) binds protein phosphatase 1 (PP1) with a Kd of 2 nM,
wraps around it, and contacts the enzyme via three separate binding regions (Figure 14.6;
Hurley et al. 2007). Calpastatin, the strong (Kd = 4.5 pM; Hanna, Garcia-Diaz, and
Davies 2007) and specific inhibitor of calpain, also wraps around its partner and binds it
via three separate binding regions, termed subdomains (Kiss et al. 2008a; Moldoveanu,
Gehring, and Green 2008).
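The chelate effect can be illustrated with a simple and commonly used approximation for tethered binding sites: once the first subsite is bound, each further subsite binds intramolecularly at some effective concentration, so that the overall Kd is roughly the product of the subsite Kd values divided by the effective concentration raised to the number of additional subsites. The Python sketch below uses this approximation; the subsite affinities and the effective concentration are purely illustrative values, not measurements for I2 or calpastatin, and the function name is hypothetical.

```python
from math import prod


def multivalent_kd(subsite_kds, effective_conc):
    """Approximate overall Kd (M) of a ligand with tethered subsites.

    subsite_kds    -- Kd (M) of each isolated subsite
    effective_conc -- effective local concentration (M) felt by each
                      subsite after the first one has bound (assumed equal)
    """
    if len(subsite_kds) == 0:
        raise ValueError("at least one subsite is required")
    return prod(subsite_kds) / effective_conc ** (len(subsite_kds) - 1)


# Illustrative numbers only: three weak (100 uM) subsites, 10 mM effective
# concentration between neighboring subsites.
WEAK_SITES = [1e-4, 1e-4, 1e-4]
OVERALL_KD = multivalent_kd(WEAK_SITES, effective_conc=1e-2)

print(f"overall Kd = {OVERALL_KD:.1e} M")
```

With these assumed values, three subsites that each bind only in the 100 µM range combine into an overall affinity in the 10 nM range, which is the essence of strong binding through multiple weak contacts.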
simple polyamines speed up DNA renaturation (Chapter 12, Figure 12.5) (Pontius
1993; Pontius and Berg 1990, 1991). This issue can be approached from both mech-
anistic and thermodynamic points of view.
[Figure 14.7 panel labels: S100ββ-p53, Sirtuin-p53. p53 C-terminal sequence shown: 365-HSSHLKSKKGQSTSRHKKLMFKTEGPDSD-COO–]
Figure 14.7 Structural adaptability of an IDP. Structural analyses of p53 indicate that
a segment within its regulatory domain (residues 374–388) can bind four different part-
ners (Cyclin A, sirtuin, CBP, and S100ββ), which show that the same disordered region can
adopt different structures. Reproduced with permission from Oldfield et al. (2008), BMC
Genomics 9, S1, S1. Copyright by the BioMed Central Ltd.
Other disordered moonlighting proteins can bind two different partners with
distinct, often opposing activities. For example, the established function of Tβ4 is
to sequester G-actin by keeping it in a polymerization-incompetent state (Domanski
et al. 2004; Hertzog et al. 2004). The protein can also bind and activate integrin-linked
kinase (ILK), which subsequently phosphorylates the survival kinase Akt (Bock-
Marquette et al. 2004). EBV-SM not only down-regulates intron-containing mRNA
but also up-regulates intron-less mRNA (Ruvolo et al. 1998). PIAS1 not only inhibits
activated STAT but can also activate p53 (Liao, Fu, and Shuai 2000; Megidish, Xu,
and Xu 2002).
The unifying theme of these and other examples (Tompa 2002; Tompa et al. 2005)
is the adaptability of disordered regions for binding distinct partners, or even the very
same partner in different modes. Based on the combination of functional, biochemical,
and limited structural data, three molecular mechanisms have been proposed for
switching between the different functional modes (Figure 14.8) (Tompa
et al. 2005). Due to the adaptability of recognition regions, certain moonlighting IDPs
can bind different partners via two alternative conformations of the same site, or by
two different but overlapping sites (Tβ4). In other cases, the IDP can bind the same
partner in two basically different conformations or binding sites, leading to different
effects (DHPR C). The third case is when the IDP binds the partner in one mode, but
can undergo significant conformational change or reorganization in the bound state,
resulting in a distinct functional outcome (I2).
14 • Extension of the Structure-Function Paradigm 225
Figure 14.8 Mechanisms by which moonlighting IDPs switch between functions. The
highly simplified scheme depicts the three basic molecular mechanisms by which IDPs may
exert opposing effects. The partner molecule is represented by a light gray diamond, which
turns into an oval when it assumes an active conformation and a rectangle when in an
inactive conformation. Its light and dark shades indicate activation and inhibition, respec-
tively. Mostly biochemical data suggest that a protein can bind to the same partner in two
basically different conformations or binding sites, leading to different effects (A). Another
mechanism is when an inhibitor shifts the equilibrium of its partner in favor of its active
conformation but blocks its active site. Activation occurs when the inhibitory interaction is
partially released due to post-translational modification (B). The protein may also bind two
different partners due to structural adaptability of its binding site or by two overlapping
(nested) sites (C). Reproduced with permission from Tompa et al. (2005), Trends Biochem.
Sci. 30, 484–9. Copyright by Elsevier Trends Journals.
a number of binding partners including other proteins and the mineral phase of bones
and teeth (Fisher et al. 2001). For example, OPN binds to hydroxyapatite (HA) along its
entire length, probably via its many Asp groups. This binding is important for mediat-
ing cell attachment to HA, yet keeping the RGD motif of OPN available for integrin
binding. For example, osteoclasts (i.e., bone cells that remove the bone’s mineralized
matrix in bone resorption) may use OPN to bridge between the integrins on the cell
surface and the mineral phase during resorption (Reinholt et al. 1990). OPN complexed
to either integrin or CD44 can also bind Factor H (FH) (Fedarko et al. 2000). Binding
of the various partners HA, FH, integrin, and CD44 occurs by partially overlapping
binding surfaces, and in various combinations (e.g., HA plus integrin, or FH plus integ-
rin, etc.) that are mutually exclusive due to the overlaps between the respective binding
surfaces/motifs (Fisher et al. 2001).
[Figure panels A–D; labels: Static, Dynamic, Disorder]
(Fontes, Teh, and Kobe 2000; Fontes et al. 2003) and Pro-rich regions, binding the
GYF domain in alternative conformations (Gu et al. 2005).
2008a; Moldoveanu et al. 2008). Clamp-type fuzzy binding with an elongated partner
of multiple binding sites may also result in processivity (see Section 14.9), as observed
in the case of bacterial cellulase, myosin VI, and matrix metalloproteinase 9 (MMP-9).
a distributed array of short, marginally defined binding motifs, which can even attain
binding by alternative patterns, and overall do not lead to a detectable level of ordering
of the whole protein.
14.11 Ultrasensitivity of Recognition
The phenomenon of ultrasensitivity of recognition, another unusual mode of
protein–protein interaction enabled by structural disorder, is also in close associa-
tion with fuzziness. Ultrasensitive binding is observed when numerous suboptimal
binding sites are located in close proximity in a disordered segment of a protein, and
post-translational modification of several of them is required for productive binding and
function, even though there is only a single binding site on the partner. This unusual
binding mode results in an ultrasensitive dose-response curve of recognition (Bary et al.
2007), typified by a high Hill coefficient. There are two cases that have been studied in
detail: the recognition and ubiquitination of the Cdk inhibitor Sic1 by cell-division
cycle protein 4 (Cdc4p), a subunit of the SCF ubiquitin ligase, and the binding and
inhibition of CFTR by its disordered regulatory (R) domain.
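The meaning of a high Hill coefficient can be made concrete with the Hill equation, response = x^n / (K^n + x^n), where x is the input dose (for example, kinase activity; this reading of x is an assumption for illustration), K is the dose giving half-maximal response, and n is the Hill coefficient. A convenient measure of steepness is the fold change in dose needed to go from 10% to 90% response, which equals 81^(1/n). The following Python sketch uses hypothetical function names and illustrative values of n.

```python
def hill_response(dose, k_half, n_hill):
    """Fractional response (0..1) predicted by the Hill equation."""
    return dose ** n_hill / (k_half ** n_hill + dose ** n_hill)


def switching_ratio(n_hill):
    """Fold change in dose required to go from 10% to 90% response."""
    return 81.0 ** (1.0 / n_hill)


# Illustrative Hill coefficients only; they are not values from the text.
for n in (1, 2, 4, 6):
    print(f"n = {n}: dose must rise {switching_ratio(n):.1f}-fold "
          f"to go from 10% to 90% response")
```

A non-cooperative system (n = 1) needs an 81-fold increase in dose to switch, whereas a system with a high Hill coefficient switches over a much narrower range, which is what makes the response ultrasensitive.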
of any of the sites reduces local helicity and decreases affinity of the segment for the
NBD domains, but the involvement of several sites is required to have an overall effect
of relieving NBD segregation and activating chloride conductance. This observation
explains the dependence of CFTR activity on multiple PKA phosphorylation events
(Baker et al. 2007).
Figure 14.10 Redesigning the allosteric N-WASP switch. Disorder of N-WASP enables
it to be redesigned as an artificial switch gated by heterologous ligands. (A) N-WASP is
a modular allosteric switch, with an output domain signaling to the actin cytoskeleton
by stimulating Arp2/3. Its activity is repressed by autoinhibitory interactions involving the
endogenous GBD domain and basic motif (marked as B). Ligands can activate N-WASP by
disrupting autoinhibitory interactions. (B) A single-input switch could be designed by tak-
ing advantage of the allostery enabled by disorder of the long linker region of N-WASP, by
placing a PDZ domain-ligand pair flanking the output domain. Reproduced with permission
from Dueber et al. (2003), Science 301, 1904–8. Copyright by the American Association for
the Advancement of Science.
characterized human proteins shows that 81% of 75 alternatively spliced fragments are
associated with fully (57%) or partially (24%) disordered protein regions, and regions
affected by alternative splicing are significantly biased toward disorder-promoting
amino acids (Dunker et al. 2001). Disorder predictions are consistent with these experi-
mental data.
section we examine a model for the chaperone action of disordered proteins, in which
the reciprocity of entropy change (i.e., the functional effect of the increase of entropy of
the partner concomitant to binding of the disordered protein) is suggested. This mecha-
nism forms the basis of a coherent mechanistic model of the action of disordered chap-
erones (details on their identity and action in Chapter 12, Section 12.3).
The first element of the model is that disordered proteins/regions provide unique
versatility in the recognition process, which may be beneficial in fast, relatively
nonspecific and reversible interactions with a range of apparently unrelated part-
ner molecules. The second mechanistic element of the entropy-transfer model is that
disordered segments provide a significant effect of solubilization, as demonstrated in
many cases of protein aggregation, which are inhibited or sometimes even reversed
by disordered chaperones. Because aggregation is usually caused by the association of
hydrophobic patches exposed due to inappropriate folding of the substrate, this effect
may simply result from shielding by highly hydrophilic disordered segments. Long-
range repulsion resulting from entropic exclusion by disordered appendages may add to
this effect, because it may physically prevent molecules from approaching each other
(entropic bristle/brush mechanism, Chapter 12, Section 12.1.4). The magnitude of this
effect has been demonstrated by biophysical measurements in the case of proteins that
ensure spacing in the cytoskeleton, such as MAPs (Mukhopadhyay and Hoh 2001) and
neurofilament side-arms (Brown and Hoh 1997), and also Nups in NPC gating (Chapter
12, Figure 12.1) (Lim et al. 2006). Similar activity has been suggested for caseins in
preventing the aggregation of calcium phosphate nanoclusters (Holt et al. 1996), and
its direct involvement in chaperone-related functions was demonstrated in the case of
Hsp25, for example (Lindner et al. 2000).
The key mechanistic element of chaperone action by disordered proteins, how-
ever, may come from transient ordering of the chaperone upon binding to the substrate.
Because kinetically trapped substrates are stuck in a local energy minimum, chaper-
ones assist folding by random disruption of misformed bonds via reciprocal changes
in disorder and order. Local loss of flexibility upon substrate binding was seen in the
case of GroEL (Gorovits and Horowitz 1995) and α-crystallin (Lindner et al. 1998). A
transient increase of flexibility of the misfolded substrate in the presence of the respec-
tive chaperone was observed in a group of introns in the presence of StpA (Waldsich,
Grossberger, and Schroeder 2002), the TAR element (Azoulay et al. 2003; Bernacchi
et al. 2002), and tRNA Lys (Tisne, Roques, and Dardel 2001) in the presence of NCP7,
mRNA in the presence of cold-shock proteins (Phadtare, Alsina, and Inouye 1999),
and both RuBisCO (Rye et al. 1997) and carbonic anhydrase (Persson et al. 1999) in
the presence of GroEL. Because binding by the chaperone keeps different segments/
strands of the substrate at a close range, this proximity may also spatially limit sub-
sequent conformational search and speed up the folding process, as directly demon-
strated in the case of DNA renaturation facilitated by the disordered CTD of hnRNP
A1 (Pontius and Berg 1990). In all, these distinct mechanistic elements of rapid and
promiscuous binding, solubilization, local unfolding, and proximal positioning com-
bine into a mechanistic model of the action of disordered chaperones, in which recipro-
cal changes in order and disorder (i.e., “transfer of entropy”) play a key role (Tompa
and Csermely 2004).
15 Structural Disorder and Disease
Proteins involved in various diseases, such as cancer and neurodegeneration, have a
high frequency of disorder. This general correlation and the possible direct involve-
ment of structural disorder in disease is supported by several bioinformatic analyses
and detailed studies on individual proteins. We will discuss the most important dis-
eases and examples, as well as the elevated level of disorder in pathogenic organ-
isms. The chapter will be concluded with how novel structural insight gained from
the recognition of structural disorder can be harnessed for the purposes of rational
drug design.
[Figure 15.1 plot. Proteins (%) versus Minimum Disordered Region Length (30–100). Series: PDB_S25, Signaling, Cancer, Cardiovascular disease, Neurodegenerative diseases, Diabetes.]
Figure 15.1 Predicted disorder of proteins associated with various diseases. The percent-
age of proteins with at least one IDR of a given minimal length was predicted in six datasets.
The datasets are non-homologous protein segments with well-defined 3-D structures from
PDB (PDB_S25), human signaling proteins, and proteins implicated in cancer, cardiovas-
cular diseases, neurodegenerative diseases, and diabetes. The error bars represent 95%
confidence intervals. Reproduced with permission from Uversky et al. (2008), Annu. Rev.
Biophys. 37, 215–46. Copyright by Annual Reviews.
15.1.2 p53
p53 is a prime example of how, without a detailed characterization of structural disorder,
even a protein studied in as much detail as p53 cannot be fully understood. p53
plays essential roles in maintaining the integrity of the human genome by control-
ling apoptosis, cell cycle, deoxyribonucleic acid (DNA) repair, and senescence, and
is thus sometimes called the “cellular gatekeeper” (Levine 1997) or “guardian of the
genome” (Lane 1992). p53 is directly inactivated in about 50% of human cancers,
and in the remainder its activity is lost due to disruption of associated pathways. This
protein is a transcription factor that responds to upstream signals generated by stress
conditions, such as oncogene activation, DNA damage, and hypoxia, and induces or
inhibits about 150 downstream effectors, such as Bax and p53-upregulated modulator
of apoptosis (PUMA) (apoptosis), Gadd45, and proliferating cell nuclear antigen
(PCNA) (DNA repair), p21Cip1 and 14-3-3σ (cell-cycle arrest) and Maspin and brain-
specific angiogenesis inhibitor 1 (BAI1) (anti-angiogenesis). At the molecular level,
p53 function is regulated by a wide array of post-translational modifications (Joerger
and Fersht 2008), and is realized in conjunction with negative (e.g., murine-double
minute 2 (MDM2); see Chapter 12, Section 12.6.2.2) and positive (e.g., p300/CBP; see
Chapter 11, Section 11.2.2) regulators. MDM2 is an E3 ubiquitin ligase, which directly
inhibits the binding functions of p53 and promotes its ubiquitin-dependent degradation
by the proteasome.
Human p53 is 393 amino acids in length, and it can be divided into four structural
and functional regions/domains as detailed in Chapter 4, Section 4.4.3 (see Chapter 9,
Figure 9.2 for predicted disorder). Basically, p53 is a homotetramer, with folded tetramer-
ization and core domains that are linked together and flanked by ID domains at the N-
and C-termini (Joerger and Fersht 2008). A variety of techniques have shown that the
trans-activator domain (TAD) is disordered (Bell et al. 2002; Dawson et al. 2003) but
also suggested function-related transient structural organization within regions 15–29
and 39–59, which adopt amphipathic α-helices upon interaction with MDM2 (Kussie
et al. 1996) or replication protein A RPA70 (Bochkareva et al. 2005), respectively. The
binding region of MDM2 appears as a downward spike on the disorder score (Mohan
et al. 2006). Solution studies by multidimensional nuclear magnetic resonance (NMR)
have confirmed that unbound full-length p53 TAD populates a helix conformation
between residues T18-L26 (Lee et al. 2000; Wells et al. 2008).
Paramagnetic relaxation enhancement (PRE) experiments confirmed the short-range
transient order in p53 TAD, and have also shown its compact dynamic ensemble,
in which the regions responsible for MDM2 and RPA70 binding are separated by an
average distance of 10–15 Å, less than the random coil expectation (Vise et al. 2007).
Molecular dynamics (MD) simulations restrained by PRE internuclear distances (Lowry
et al. 2008b) also suggest a partially collapsed state of TAD, which places the MDM2
and RPA70 binding regions in close proximity, implying their possible functional
interplay. Principal component analysis of the atomic contact maps (Lowry et al. 2008a)
suggested that the ensemble is conspicuously nonrandom, with the negative charges
uniformly exposed on one face of the clusters. This imbalance of charges may steer
other factors in the p53-mediated assembly of complexes.
Structural analysis of other p53 domains also portrays an overall rather complex
picture. p53C (DNA-binding region) is well-folded in both DNA-bound and DNA-free
forms (Joerger and Fersht 2008), with only marginal stability, however (typical melting
temperature is 44–45°C). The tetramerization domain provides yet another interesting
structural example, because it is predicted to be fully disordered (Chapter 9, Figure 9.2),
yet it is ordered in the tetrameric state. This region is probably an example of two-
state complexes, which are disordered in isolation, only to become ordered in the oli-
gomeric state (Gunasekaran, Tsai, and Nussinov 2004). In accord, dimerization of this
domain occurs cotranslationally, whereas the tetramers are formed posttranslationally
by dimerization of dimers (Nicholls et al. 2002). The C-terminal regulatory domain is
fully disordered. This region is highly basic; it can weakly and nonspecifically interact
with DNA and modulate binding at specific sites by p53C. This region is subject to
extensive regulatory post-translational modifications and shows binding promiscuity
(Oldfield et al. 2008), because it can bind several partners, such as S100ββ, CBP, Cyclin
A2, and sirtuin, in different local conformations (Chapter 14, Figure 14.7).
The structure of full-length tetrameric p53 (Figure 15.2 and cover picture) has been
unveiled by a combination of small-angle X-ray scattering (SAXS), electron micros-
copy (EM), NMR, and MD simulations (Joerger and Fersht 2008; Wells et al. 2008). In
the DNA-free form the protein forms an elongated cross-shaped tetramer, in which core
domain dimers are loosely coupled and the N- and C-termini are extended and disor-
dered. In the DNA-bound state, the molecule wraps around the DNA helix, enabled by
the flexibility of the linker between p53C and the tetramerization domain. The TADs
Figure 15.2 Disorder in p53. A visual image of tetrameric p53 in complex with DNA
was created from the X-ray structure of DNA-bound DBD, the tetramerisation domain
(p53CTetD), and the calculated ensemble of the N-terminal domain (see also cover picture).
DBD and p53CTetD (light gray) and DNA (dark gray) are shown in space filling model. The
flexible CTD is not shown for reasons of clarity. NTDs are modeled by using a conforma-
tional sampling model that reproduces NMR residual dipolar coupling (RDC) values; 20
copies for each of the 4 different monomers are shown as thin traces of the polypeptide
chain. Reproduced with permission from Wells et al. (2008), Proc. Natl. Acad. Sci. USA 105,
5762–7. Copyright by the National Academy of Sciences.
extend away from p53C, probably due to the relative stiffness of their Pro-rich regions,
which is consistent with their role as targets of a large number of modifications and
signaling protein partners (see also Chapter 4, Section 4.4.3).
[Figure 15.3 diagram: cell cycle phases G1, S, G2, and M, with the complexes Cdk4 (and Cdk6)/cyclin D, Cdk2 (and Cdk1)/cyclin E, Cdk2 (and Cdk1)/cyclin A, Cdk1/cyclin A, and Cdk1/cyclin B, each annotated with p21 and p27.]
Figure 15.3 Regulation of eukaryotic cell division cycle. Scheme of the cell division
cycle and the cyclin-Cdk complexes that regulate progression through the different stages.
Initiation of cell division in G1 phase requires Cdk4/Cyclin D and Cdk6/Cyclin D activities,
whereas progression into S phase requires the operation of Cdk2/Cyclin E and Cdk2/Cyclin
A complexes. Cdk1/Cyclin B and Cdk1/Cyclin A are involved in the entry into mitosis (M).
Due to their disorder and binding promiscuity, p21Cip1 and p27Kip1 are involved in inhibiting
and activating (the latter indicated by an arrow) various complexes under certain circum-
stances. Reproduced with permission from Galea et al. (2008), Biochemistry 47, 7598–609.
Copyright by the American Chemical Society.
is elevated in mitogen-starved cells and is rapidly degraded as cells enter the cell cycle;
p57Kip2 regulates cell cycle during embryonic development. CKIs are tumor suppressors
(Fero et al. 1998; Fero et al. 1996), and they also have other functions in apoptosis, tran-
scriptional regulation, and cytoskeletal dynamics (Besson et al. 2008). However, unlike
the classic tumor suppressor genes, oncogenic loss-of-function mutations in CKIs are
extremely rare. Instead, their level is down-regulated by distinct mechanisms. Their activ-
ity is also regulated by other factors, such as phosphorylation and interaction with protein
partners (Galea et al. 2008b; Sherr and Roberts 1999), and they also inhibit cell-cycle
progression independent of cyclins and Cdks, via the inhibition of components of the rep-
lication machinery (Luo, Hurwitz, and Massague 1995). Both p21Cip1 and p57Kip2 can bind
to PCNA, a DNA polymerase processivity factor, whereas p27Kip1 binds minichromo-
some maintenance deficient 7 (MCM7), which is the subunit of a replication fork helicase
(Nallamshetty et al. 2005). In all, CKIs are involved in regulating both cell migration and
cell division activated by the same upstream mitogenic signals (Besson et al. 2008) (i.e.,
they may be involved in the decision between movement and proliferation of cells).
The three CKIs share a conserved, 60-residue N-terminal kinase inhibitory
domain (KID, residues 28–90 in p27Kip1) but diverge in the remainder of the sequence,
which suggests distinct functions and regulation of action. They have nuclear localiza-
tion signals (NLSs) within their CTDs, p21Cip1 and p57Kip2 contain a PCNA-binding
domain, and p27Kip1 and p57Kip2 possess a C-terminal QT domain that harbors a
Cdk2-dependent phosphorylation site that triggers ubiquitination by the Skp1, Cullin,
F-box (SCF/Skp2) complex (see Chapter 14, Section 14.12.1). NMR studies in the case of p21Cip1
(Kriwacki et al. 1996) and CD and fluorescence studies of p27Kip1 and p57Kip2 (Adkins
and Lumb 2002) have shown that the inhibitors are fully random in the solution state.
More detailed studies of p27Kip1 refined this picture (Lacy et al. 2004). As described
in Section 10.2.3.1, NMR chemical shift index (CSI) and nuclear Overhauser effect
(NOE) values, and restrained MD simulations (Sivakolundu, Bashford, and Kriwacki
2005) suggest that the structural ensemble in the solution state of the protein closely
reflects its bound conformation (Chapter 10, Figure 10.3). The structure of p27Kip1
KID bound to Cyclin A-Cdk2 shows that Tyr88 inserts itself into the catalytic cleft of
the kinase, thereby preventing catalytic activity (Russo et al. 1996). The combination
of disorder and preformed structural elements (PSEs) in KID enables a sequential
“staple”-like binding mechanism, which is probably required for fast, specific, yet
adaptable interaction with its partners (Lacy et al. 2004). The rest of the molecule
remains disordered even in the bound state and mediates other functions, such as
PCNA binding (Galea et al. 2008a) and regulation of degradation by the ubiquitin/
proteasome machinery (Grimmler et al. 2007). p21Cip1 was the first IDP for which
disorder-dependent binding promiscuity was suggested (Kriwacki et al. 1996), which
is thought to enable it to regulate distinct cyclin-Cdk complexes that control entry into
G1 phase (Cyclin D-Cdk4/6) and progression from G1 to S phase (Cyclin A/E-Cdk2,
see Figure 15.3). Promiscuity in binding was implicated in seemingly opposite effects
of p21Cip1 and p27Kip1 on different cyclin-Cdk complexes, promoting the assembly and
catalytic activity of some (Cyclin D-Cdk4) and potently inhibiting others (Cyclin A/E-
Cdk2 (Cheng et al. 1999; Sherr and Roberts 1999)).
15.1.4 Breast-Cancer 1
Mutations in the breast cancer 1, early onset (BRCA1) gene are implicated in 45% of
familial breast cancers and 80% of familial cases of both breast and ovarian cancer (Miki
et al. 1994). The product of BRCA1 gene is a multifunctional protein of 1863 amino
acids in length (Chapter 12, Section 12.6.2), involved in DNA double-strand break
repair, transcription-coupled repair, cell cycle checkpoint control, centrosome dupli-
cation, transcription regulation, DNA damage signaling, growth regulation, and the
induction of apoptosis (Deng 2006; Venkitaraman 2002). The molecular mechanism
of how BRCA1 can carry out these diverse functions is uncertain, but it involves
interactions with DNA and a large number of protein partners. Upon UV exposure,
BRCA1 localizes to the nucleus in a phosphorylation-dependent manner, which
suggests that phosphorylation may play a general role in BRCA1 activation. For
example, in response to ionizing radiation (IR), BRCA1 is phosphorylated by ataxia
telangiectasia mutated (ATM) kinase and by the ATM-dependent kinase Chk2, whereas
upon exposure to UV it is phosphorylated by ATR (ATM and Rad3-related kinase).
BRCA1 has only two small structured domains located at the opposite ends
of the protein, both implicated in protein-protein interactions. The N-terminal
RING domain (residues 1–103) forms a heterodimer with BRCA1-associated RING
domain 1 (BARD1), resulting in an active E3 ubiquitin ligase complex (Brzovic et al.
2003). At the C-terminus, there are two tandem BRCA1 C-terminal domains (BRCT,
residues 1646–1863), which might be involved in the DNA damage response signal
cascade due to their phosphopeptide binding capacity. The two regions are separated
by a long central region of about 1,500 amino acids, which contains no recognizable
domains, and is predicted to be largely disordered (Mark et al. 2005). Its 21 overlap-
ping fragments (each about 200 amino acids) were found to be disordered by NMR
and CD (Mark et al. 2005). This long IDR harbors binding sites for DNA and more
than 50 DNA damage sensors, DNA repair proteins, and signal transduction proteins
such as p53, BRCA2, c-Myc, retinoblastoma protein (RB), JunB, Rad50 and Rad51,
and the Fanconi anemia group A (FANCA) protein. In all, the observed disorder
within the central region of BRCA1 is consistent with an earlier proposal that BRCA1
acts as a scaffold (see Chapter 12, Section 12.6.2) that mediates multiple, weak and
possibly transient interactions, thereby integrating multiple signals in the DNA dam-
age response and repair pathway (Foray et al. 2003).
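The domain arithmetic in this section can be checked with a few lines of code. The sketch below uses only the residue ranges quoted in the text; treating the central region as everything between the RING and BRCT domains (residues 104–1645) is an assumption made for illustration, and the variable names are hypothetical.

```python
# Residue ranges quoted in the text (1-based, inclusive).
BRCA1_LENGTH = 1863
RING = (1, 103)
BRCT = (1646, 1863)


def region_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


# Assumption: the central region is everything between the two domains.
central = (RING[1] + 1, BRCT[0] - 1)
central_len = region_length(*central)
central_fraction = central_len / BRCA1_LENGTH

print(f"central region: residues {central[0]}-{central[1]}")
print(f"length: {central_len} residues ({central_fraction:.1%} of the protein)")
```

Under this assumption the central region comes out slightly above the "about 1,500 amino acids" quoted in the text and accounts for most of the protein.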
[Figure 15.4 plot: IUPred score (0.0 to 1.0) versus residue number (200 to 1400), with the CC and Tyr-k domains and the Tyr177 site marked.]
Figure 15.4 Predicted disorder and domain structure of BCR-ABL. Structural disorder
predicted by the IUPred algorithm and domain structure identified by Pfam of the cancer
protein BCR-ABL generated by chromosomal translocation. Disorder score above 0.5 is con-
sidered disordered. The position of the break point is marked by a vertical line in the plot,
whereas the elements critical for the oncogenic function of the fusion protein are depicted
as rectangles (cc, coiled coil; Tyr-k, tyrosine-kinase domain; arrowhead, the position of
Tyr177 phosphorylation site).
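The 0.5 cutoff mentioned in the caption can be turned into a simple region-calling rule. The following is a minimal sketch, not the IUPred program itself: it assumes a list of per-residue scores has already been obtained, the function name `disordered_segments` is hypothetical, and the scores used are toy values rather than BCR-ABL data.

```python
from typing import List, Tuple


def disordered_segments(scores: List[float],
                        threshold: float = 0.5,
                        min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of consecutive
    residues whose score lies above the threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = i              # a new disordered stretch begins
        elif start is not None:
            if i - start >= min_len:   # stretch covers residues start..i-1
                segments.append((start, i - 1))
            start = None
    # Close a stretch that runs to the end of the chain.
    if start is not None and len(scores) - start + 1 >= min_len:
        segments.append((start, len(scores)))
    return segments


# Toy scores for illustration only.
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.9, 0.8, 0.55, 0.1]
print(disordered_segments(toy_scores))
print(disordered_segments(toy_scores, min_len=3))
```

The `min_len` parameter is included because analyses of this kind often consider only long disordered regions; the appropriate minimum length depends on the question being asked.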
Table 15.2 Predicted α-MoREs and druggable targets in genomes, functional classes,
and diseases*
                                             Percent of                   Number of
                                             Proteins with   Number of    Prospective
                                             Predicted       Predicted    Druggable
Group                                        MoREs           MoREs        Interactions
Kingdoms             Eukaryotes              21 (±4)
                     Bacteria                3 (±1)
                     Archaea                 2 (±2)
Functional classes   Regulation              48              879
                     Cell division           42              40
                     Differentiation         38              25
                     Cytoskeleton            37              113
                     Membrane                24              88
                     Inhibitor               20              50
                     Transport               18              190
                     Degradation             16              10
Diseases             Cancer                  34              1,334        837
                     Diabetes                33              176          116
                     Autoimmune              32              934          680
                     Neurodegenerative       24              395          238
                       disease
                     Cardiovascular          21              198          153
* Proteins in various kingdoms, functional categories, and diseases have been predicted for the occur-
rence of molecular recognition features (α-MoREs). The percent of proteins with α-MoREs, the
actual number of predicted α-MoREs, and the number of those examples that possibly represent
druggable targets are shown. Adapted from Cheng et al. 2006b.
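For the disease rows of Table 15.2, the share of predicted α-MoREs that may represent druggable interactions follows directly from the two counts. The sketch below only restates the tabulated numbers; the dictionary and function names are hypothetical.

```python
# (number of predicted MoREs, number of prospective druggable interactions),
# copied from the disease rows of Table 15.2.
disease_rows = {
    "Cancer": (1334, 837),
    "Diabetes": (176, 116),
    "Autoimmune": (934, 680),
    "Neurodegenerative disease": (395, 238),
    "Cardiovascular": (198, 153),
}


def druggable_fraction(n_mores: int, n_druggable: int) -> float:
    """Fraction of predicted MoREs counted as prospective druggable interactions."""
    return n_druggable / n_mores


for disease, (n_mores, n_druggable) in disease_rows.items():
    fraction = druggable_fraction(n_mores, n_druggable)
    print(f"{disease:26s} {n_druggable:4d} / {n_mores:4d} = {fraction:.0%}")
```

In every disease class more than half of the predicted α-MoREs fall into the prospective druggable category.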
15.3.1.1 Aβ peptide
Mutations in familial AD mostly affect the gene of amyloid precursor protein (APP)
and/or presenilin 1 and 2 (PSEN1 and PSEN2). APP is a single-pass transmembrane
protein, the metabolism of which under normal conditions is initiated by two prote-
olytic cleavage events catalyzed by α-secretase (TACE and ADAM10) and γ-secretase
(presenilin). This normal product has no propensity for amyloid formation. If
cleavage occurs at the β- and γ-sites due to the action of β-secretase (β-site amyloid
precursor protein-cleaving enzyme, BACE), the resulting Aβ peptide (Chiang, Lam, and Luo
2008) displays enhanced amyloidogenicity and is deposited into plaques, which is
the causative event in AD. Mutations in the inherited forms (APP, PSEN1, PSEN2)
promote formation of Aβ and lead to the disease. Prior to fibrillation, Aβ monomers
exist as predominantly extended random chains with no α-helical or β-strand
structures (Kirkitadze, Condron, and Teplow 2001). A partial refolding to a somewhat more
structured state occurs at the earliest stages of fibrillation. Whereas the monomer is
not toxic, Aβ becomes neurotoxic to cortical cell cultures when aggregated (Simmons
et al. 1994).
15.3.1.2 Tau protein
Also contributing to the etiology of AD is the overactivation of the protein kinase Cdk5
and/or glycogen-synthase kinase (GSK3), which leads to the hyperphosphorylation of
tau protein, its dissociation from microtubules and aggregation into PHFs (Iqbal and
Grundke-Iqbal 2008). Tau protein belongs to the family of microtubule-associated
proteins (MAPs), accessory proteins required for MT polymerization and stability
(see Chapter 10, Section 10.2.3.3 and Section 10.4.2, and Chapter 11, Section 11.6.3).
Whether the formation of PHFs of soluble tau protein is a cause or a consequence of
disease is a matter of debate, because it also appears in other diseases, tauopathies, and
frontotemporal dementias (e.g., frontotemporal dementia with Parkinsonism linked to
chromosome 17, FTDP-17).
Tau protein has several isoforms generated by alternative splicing (Himmler 1989).
The longest one is 441 amino acids, and by functional criteria it can be roughly divided
into an N-terminal projection domain and a C-terminal repetitive tubulin-binding
domain (TBD). Tau protein in isolation is mostly disordered (Schweers et al. 1994),
with some short-range structural organization in the MT-binding repeats (Mukrasch
et al. 2007a) and also transient long-range tertiary interactions (Jeganathan et al. 2006)
(see Figure 10.6).
By a combination of limited proteolysis, generation of overlapping peptides, and
aggregation assays, it was shown that a 43-residue fragment within the third repeat
of tau is required for self-assembly into filaments (von Bergen et al. 2000). A
minimal hexapeptide interaction motif, VQIVYK (residues 306–311), at the beginning of the
third internal repeat, which shows the highest predicted β-structure forming potential in tau, is
probably the most critical element for aggregation. The importance of local propensity
for β-structures in PHF formation was suggested by residual β-structure (see Section
10.2.3.3) for 8–10 residues at the beginning of repeats R2–R4 (Mukrasch et al. 2005).
These regions correspond to sequence motifs, which form the core of the cross-β struc-
ture of tau PHFs.
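Locating a short aggregation motif such as VQIVYK in a sequence is a plain substring search. The sketch below is illustrative only: the function name `find_motif` is hypothetical, and the test sequence is an artificial string, not the tau sequence.

```python
from typing import List


def find_motif(sequence: str, motif: str, first_residue: int = 1) -> List[int]:
    """Return the residue numbers at which each occurrence of the motif
    starts. `first_residue` is the number of the first residue in
    `sequence`, so that fragments keep their full-length numbering."""
    positions = []
    index = sequence.find(motif)
    while index != -1:
        positions.append(index + first_residue)
        index = sequence.find(motif, index + 1)  # also finds overlapping hits
    return positions


# Artificial sequence for illustration only (not tau).
toy_sequence = "GSVQIVYKPVDLSKVQIVYK"
print(find_motif(toy_sequence, "VQIVYK"))
# The same fragment, numbered as if it started at residue 304.
print(find_motif(toy_sequence, "VQIVYK", first_residue=304))
```

With `first_residue=304` the first hit is reported at position 306, which mirrors how a motif in an excised fragment maps back onto full-length numbering.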
[Figure 15.5 plot: inverse central line width (∆H0⁻¹) and O2 accessibility, π(O2), each on a scale of 0.00 to 0.45, versus residue number (0 to 115); the sequence of residues 35–100 is shown above the plot.]
Figure 15.5 Residue mobility in the amyloid state of α-synuclein. EPR spectroscopy was
used to characterize the mobility and accessibility of residues in α-synuclein fibrils. Data
represent a composite of a series of measurements with mutants spin-labeled at positions
represented by actual data points. Mobility is represented by the inverse central line widths
(∆H0⁻¹, black dots), whereas O2 accessibility is given by π(O2) (gray triangles). Data points are
connected by solid lines in the case of consecutive residues; dashed lines are used other-
wise. Gray shading indicates areas with increased mobilities and accessibilities. The region
36–98 is rather rigid, forming the core of the fibrils. The linker region 62–67 and flanking
regions 5–35 and 99–110 preserve mobility in the amyloid state. Reproduced with permis-
sion from Chen et al. (2007), J. Biol. Chem. 282, 24970–9. Copyright by the American
Society for Biochemistry and Molecular Biology.
15.3.3 Glutamine-Repeat Diseases
Glutamine-, trinucleotide-, or CAG-repeat diseases, also termed polyQ diseases,
are related autosomal dominant neurodegenerative disorders that result from aggre-
gation of a functional protein with a Gln-repeat region that expands to pathological
lengths (Table 15.3). The proteins involved are unrelated and result in diseases as
diverse as Huntington’s disease (huntingtin), spinocerebellar ataxia (ataxin),
spinocerebellar ataxia 17 (TATA-box binding protein), X-linked spinal and bulbar
muscular atrophy, also termed Kennedy’s disease (androgen receptor), and hereditary
15.3.3.2 Huntingtin
Huntingtin (HTT) is a large protein of 3,144 amino acids, encoded by a gene consisting
of 67 exons. Its normal physiological function is unknown, but knockout mouse models
have shown it to be essential for development and survival (Nasir et al. 1995), maybe
due to acting as a transcription factor upregulating the expression of brain-derived neu-
rotrophic factor (BDNF), and/or having a role in cytoskeletal anchoring/transport and
vesicle trafficking. The protein displays no homology to other proteins and is highly
expressed in neurons and in testis (Cattaneo, Zuccato, and Tartari 2005). Its CAG-
repeat region encoding for a run of Glns is within exon 1, followed by a Pro-repeat of
11 prolines. When the polyQ region contains fewer than 36 glutamines (usually below
27), it results in a soluble cytoplasmic protein, but when it expands to 36 or more, HTT
deposits into aggregates, causing neuronal decay (Katsuno et al. 2008). The number
of CAG repeats correlates with age at onset and the rate of progression of symptoms
(Kieburtz et al. 1994). With very large repeat counts in about 7% of the cases, HD can
even occur under the age of 20.
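The repeat-length threshold described above lends itself to a short illustration. The sketch below counts the longest uninterrupted CAG run in a DNA string and compares it with the cutoff of 36 quoted in the text. It is a simplification under stated assumptions: reading frame is ignored, CAA codons (which also encode Gln) are not counted, and the function names and labels are hypothetical.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Length, in repeat units, of the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat(n_gln: int, pathogenic_threshold: int = 36) -> str:
    """Apply the cutoff quoted in the text: 36 or more glutamines is
    associated with aggregation, fewer with a soluble protein."""
    if n_gln >= pathogenic_threshold:
        return "expanded (aggregation-prone)"
    return "soluble range"


# Artificial sequences for illustration only.
normal_like = "ATG" + "CAG" * 20 + "CCG"
expanded_like = "ATG" + "CAG" * 45 + "CCG"
for name, seq in [("normal-like", normal_like), ("expanded-like", expanded_like)]:
    n = longest_cag_run(seq)
    print(f"{name}: {n} repeats -> {classify_repeat(n)}")
```

A binary label is used here only because the text gives a single cutoff; the correlation of repeat count with age at onset is not modeled.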
HTT had a major role in structural studies of the polyQ regions of proteins. CD and
NMR data of synthetic peptides or GST fusion constructs showed that polyQ sequences
up to the pathogenic length prefer the random coil state (Chen et al. 2001; Masino et al.
2002). The picture was refined in hydrodynamic measurements by fluorescence
correlation spectroscopy (FCS) (Crick et al. 2006), which showed that the translational
diffusion time of polyQ scales with chain length with an exponent of 0.32,
significantly smaller than the value of 0.5 or 0.58 expected for a random coil chain in an
ideal or good solvent, respectively (see Chapter 1, Section 1.7). Thus, water is a poor
solvent for polyQ in the polymer-physics sense, and the structural ensemble of monomeric
polyQ is made up of a heterogeneous collection of collapsed structures.
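A scaling exponent of this kind is the slope of a straight line in a log-log plot of the hydrodynamic observable against chain length. The sketch below shows the fit on synthetic data generated with an exponent of 0.32; the chain lengths, prefactor, and function name are assumptions for illustration and are not measured values.

```python
import math
from typing import Sequence, Tuple


def fit_scaling_exponent(chain_lengths: Sequence[float],
                         observable: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of observable = prefactor * N**nu in log-log space.
    Returns (nu, prefactor)."""
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(v) for v in observable]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx
    prefactor = math.exp(mean_y - nu * mean_x)
    return nu, prefactor


# Synthetic, noise-free data (assumed values, not experimental results).
lengths = [15, 20, 24, 27, 33, 36, 40, 47, 53]
synthetic = [2.0 * n ** 0.32 for n in lengths]

nu, prefactor = fit_scaling_exponent(lengths, synthetic)
print(f"fitted exponent nu = {nu:.2f}")

# Reference exponents quoted in the text for comparison.
for label, ref in [("ideal solvent", 0.5), ("good solvent", 0.58)]:
    print(f"{label}: {ref} (fitted value is {'smaller' if nu < ref else 'not smaller'})")
```

With real measurements the fit would carry an uncertainty, which has to be considered before concluding that the exponent differs from the random coil values.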
NMR and CD spectroscopy demonstrated that Gln residues possess a high
propensity to adopt the PPII helix conformation (Chellgren et al. 2006) in short tracts
up to about 15 residues. Studies of longer stretches are hampered by poor solubility;
thus currently there is no evidence that the observed PPII helical structure is a pre-
cursor to aggregation, although PPII helix is known to transit easily to other confor-
mational states (see Chapter 1, Section 1.5.1 and Chapter 10, Section 10.2.2) (Blanch
et al. 2000).
160–220, which involves two α-helical regions (helix A and helix B of the C-terminal
globular half), is likely involved in the transition to the β-structure.
[Figure 15.6 image: residue labels Gly1, Asn2, Asn3, Gln5, Asn6, and Tyr7, with a marked distance of 4.87 Å.]
Figure 15.6 Crystal structure of a model amyloid. Amyloid-like structure of the hep-
tapeptide GNNQQNY derived from the yeast prion Sup35p. The structure is a sandwich
of β-sheets, with each β-strand represented by an arrow. Side-chains protrude from
the sheet. Side-chains in the dry interface between the two sheets tightly interdigitate
excluding water, whereas side-chains on the wet outside surface form an extensive net-
work of H-bonds. This molecular arrangement is termed a “steric zipper.” Reproduced
with permission from Nelson et al. (2005), Nature 435, 773–8. Copyright by the Nature
Publishing Group.
of β-strands. A series of mutants and labeling suggest that the tightly packed core region
extends from residue 36 to 98 in the case of α-synuclein (see Chapter 5, Section 5.6 and
Figure 15.5), whereas in the case of prion protein it extends approximately between
residues 160 and 220.
tetramerization domain (Lee et al. 2003), and immunoglobulin light chain (Wall et al.
1999). Nonnative conditions destabilizing structure, such as low or high pH, high tem-
peratures, or mild denaturation, also lead to an increased fibrillation, as demonstrated
in the case of the SH3 domain of PI3K (Guijarro et al. 1998) and fibronectin type III
module of murine fibronectin (Litvinovich et al. 1998), for example. Conversely, amy-
loidogenicity of a globular protein can be significantly reduced by stabilization of the
native structure by ligand binding, for example (Chiti et al. 2001). In the case of IDPs,
the primary step of fibrillogenesis is the stabilization of a partially folded conformation,
as demonstrated in the case of α-synuclein (Uversky, Li, and Fink 2001b) and IAPP
(Kayed et al. 1999). In general, the structural prerequisite of amyloid formation is the
transformation of a polypeptide chain into a partially folded disordered conformation,
originating from an ordered or disordered initial state.
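[Figure 15.7 diagram: panel A marks p53 residues 175, 220, 245, 273, and 282; the BRCA1 residue axis is marked at 1, 103, 219, 408, 540, 758, 894, 1113, 1234, 1646, and 1863.]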
Figure 15.7 Oncogenic mutations in p53 and BRCA1. The distribution and relative
frequency of oncogenic missense mutations in p53 (A) and BRCA1 (B). In p53, the
mutations cluster in the central ordered DNA-binding domain (p53C), whereas in BRCA1 they
are not confined to the N- and C-terminal ordered domains, but also appear in large
proportions in the central region that is largely disordered. Reproduced with permission from
Joerger and Fersht (2008), Annu. Rev. Biochem. 77, 557–82, copyright by Annual Reviews,
and Mark et al. (2005), J. Mol. Biol. 345, 275–87, copyright by Elsevier Inc.
their functions are resistant to mutations). In conclusion, disorder per se might not pose
a particular risk in these diseases (Brown et al. 2002; Daughdrill et al. 2007). Of course,
disorder appears as a permissive element in the oncogenic function of fusion proteins
generated by chromosomal translocations (Section 15.1.6), which establishes a direct
link of disorder with cancer.
IDPs/IDRs also appear disproportionately frequently, in about two-thirds of the
cases, in amyloid diseases (Table 15.3), and neurodegenerative diseases are caused by
disordered precursor proteins without exception. Although the special amino acid
composition of IDPs (Tompa 2002) limits their amyloidogenicity (Linding et al. 2004),
their open and exposed structure makes them vulnerable to misfolding/deposition in
disease, and in this sense the presence of structural disorder probably does pose a dan-
ger to the organism.
[Figure: bar chart; y-axis, residues within disordered regions of more than 40 residues (%); x-axis categories, Apicomplexa, Fungi, higher organisms, Archaea, and Bacteria.]
benign papillomas and act as cofactors in carcinomas, and they have about 100 different
types grouped into low- and high-risk groups according to their association with cancer.
Comparative bioinformatics analysis of the two groups, with particular focus on E6 and
E7 transforming oncoproteins, showed an increased level of disorder in the high-risk
group (Uversky et al. 2006).
The role of disorder in infection is mechanistically established in pathogenic
bacteria, which use membrane-anchored extracellular disordered proteins to tether
to the extracellular matrix (ECM) of their host, an important step in the mechanism
of invasion (Patti et al. 1994). These proteins termed MSCRAMMs (see Chapter 10,
Section 10.2.3.4 for details) are single transmembrane-spanning helix proteins, with
highly repetitive disordered extracellular tail regions that undergo disorder-to-order
transition upon binding to the cognate ECM component (House-Pompeo et al. 1996;
Schwarz-Linek et al. 2004; Schwarz-Linek et al. 2003).
Bates, I. R., J. B. Feix, J. M. Boggs, and G. Harauz. 2004. An immunodominant epitope of myelin
basic protein is an amphipathic alpha-helix. J. Biol. Chem. 279: 5757–64.
Batra-Safferling, R., K. Abarca-Heidemann, H. G. Korschen, et al. 2006. Glutamic acid-rich pro-
teins of rod photoreceptors are natively unfolded. J. Biol. Chem. 281: 1449–60.
Baxa, U., P. D. Ross, R. B. Wickner, and A. C. Steven. 2004. The N-terminal prion domain of
Ure2p converts from an unfolded to a thermally resistant conformation upon filament for-
mation. J. Mol. Biol. 339: 259–64.
Baxter, N. J., T. H. Lilley, E. Haslam, and M. P. Williamson. 1997. Multiple interactions between
polyphenols and a salivary proline-rich protein repeat result in complexation and precipita-
tion. Biochemistry 36: 5566–77.
Bell, S., C. Klein, L. Muller, S. Hansen, and J. Buchner. 2002. p53 Contains large unstructured
regions in its native state. J. Mol. Biol. 322: 917–27.
Belle, A., A. Tanay, L. Bitincka, R. Shamir, and E. K. O’Shea. 2006. Quantification of protein
half-lives in the budding yeast proteome. Proc. Natl. Acad. Sci. USA 103: 13004–9.
Belmont, L. D. and T. J. Mitchison. 1996. Identification of a protein that interacts with tubulin
dimers and increases the catastrophe rate of microtubules. Cell 84: 623–31.
Bennett, M. C. 2005. The role of alpha-synuclein in neurodegenerative diseases. Pharmacol.
Ther. 105: 311–31.
Bentrop, D., M. Beyermann, R. Wissmann, and B. Fakler. 2001. NMR structure of the “ball-and-
chain” domain of KCNMB2, the beta 2-subunit of large conductance Ca2+- and voltage-
activated potassium channels. J. Biol. Chem. 276: 42116–21.
Bernacchi, S., S. Stoylov, E. Piemont, et al. 2002. HIV-1 nucleocapsid protein activates tran-
sient melting of least stable parts of the secondary structure of TAR and its complementary
sequence. J. Mol. Biol. 317: 385–99.
Bernado, P., L. Blanchard, P. Timmins, D. Marion, R. W. Ruigrok, and M. Blackledge. 2005.
A structural model for unfolded proteins from residual dipolar couplings and small-angle
x-ray scattering. Proc. Natl. Acad. Sci. USA 102: 17002–7.
Bernado, P., E. Mylonas, M. V. Petoukhov, M. Blackledge, and D. I. Svergun. 2007. Structural
characterization of flexible proteins using small-angle X-ray scattering. J. Am. Chem. Soc.
129: 5656–64.
Bertoncini, C. W., Y. S. Jung, C. O. Fernandez, et al. 2005. Release of long-range tertiary interac-
tions potentiates aggregation of natively unstructured alpha-synuclein. Proc. Natl. Acad.
Sci. USA 102: 1430–5.
Bertoncini, C. W., R. M. Rasia, G. R. Lamberto, et al. 2007. Structural characterization of the
intrinsically unfolded protein beta-synuclein, a natural negative regulator of alpha-synu-
clein aggregation. J. Mol. Biol. 372: 708–22.
Bertram, L. and R. E. Tanzi. 2004. Alzheimer’s disease: one disorder, too many genes? Hum. Mol.
Genet. 13 Spec. No. 1: R135–41.
Besson, A., S. F. Dowdy, and J. M. Roberts. 2008. CDK inhibitors: cell cycle regulators and
beyond. Dev. Cell 14: 159–69.
Betts, R., S. Weinsheimer, G. E. Blouse, and J. Anagli. 2003. Structural determinants of
the calpain inhibitory activity of calpastatin peptide B27-WT. J. Biol. Chem. 278:
7800–9.
Bevivino, A. E. and P. J. Loll. 2001. An expanded glutamine repeat destabilizes native ataxin-3
structure and mediates formation of parallel beta-fibrils. Proc. Natl. Acad. Sci. USA 98:
11955–60.
Bhattacharyya, J. and K. P. Das. 1999. Molecular chaperone-like properties of an unfolded pro-
tein, alpha(s)-casein. J. Biol. Chem. 274: 15505–9.
Bhattacharyya, R. P., A. Remenyi, M. C. Good, C. J. Bashor, A. M. Falick, and W. A. Lim. 2006.
The Ste5 scaffold allosterically modulates signaling output of the yeast mating pathway.
Science 311: 822–6.
Bhaumik, S. R., E. Smith, and A. Shilatifard. 2007. Covalent modifications of histones during
development and disease pathogenesis. Nat. Struct. Mol. Biol. 14: 1008–16.
Bienkiewicz, E. A., J. N. Adkins, and K. J. Lumb. 2002. Functional consequences of preorganized
helical structure in the intrinsically disordered cell-cycle inhibitor p27(Kip1). Biochemistry
41: 752–9.
Bienkiewicz, E. A., A. M. Woody, and R. W. Woody. 2000. Conformation of the RNA polymerase
II C-terminal domain: circular dichroism of long and short fragments. J. Mol. Biol. 297:
119–33.
Bjorklund, A. K., D. Ekman, and A. Elofsson. 2006. Expansion of protein domain repeats. PLoS
Comput. Biol. 2: e114.
Black, J. C., J. E. Choi, S. R. Lombardo, and M. Carey. 2006. A mechanism for coordinating
chromatin modification and preinitiation complex assembly. Mol. Cell 23: 809–18.
Blake, C. C., D. F. Koenig, G. A. Mair, A. C. North, D. C. Phillips, and V. R. Sarma. 1965.
Structure of hen egg-white lysozyme. A three-dimensional Fourier synthesis at 2 Angstrom
resolution. Nature 206: 757–61.
Blanch, E. W., D. D. Kasarda, L. Hecht, K. Nielsen, and L. D. Barron. 2003. New insight into
the solution structures of wheat gluten proteins from Raman optical activity. Biochemistry
42: 5665–73.
Blanch, E. W., L. A. Morozova-Roche, D. A. Cochran, A. J. Doig, L. Hecht, and L. D. Barron.
2000. Is polyproline II helix the killer conformation? A Raman optical activity study
of the amyloidogenic prefibrillar intermediate of human lysozyme. J. Mol. Biol. 301:
553–63.
Blander, G. and L. Guarente. 2004. The Sir2 family of protein deacetylases. Annu. Rev. Biochem.
73: 417–35.
Blencowe, B. J. 2006. Alternative splicing: new insights from global analyses. Cell 126:
37–47.
Blom, N., S. Gammeltoft, and S. Brunak. 1999. Sequence and structure-based prediction of
eukaryotic protein phosphorylation sites. J. Mol. Biol. 294: 1351–62.
Bloomfield, V. A. and T. K. Lim. 1978. Quasi-elastic laser light scattering. Methods Enzymol. 48:
415–94.
Bochicchio, B. and A. M. Tamburro. 2002. Polyproline II structure in proteins: identification by
chiroptical spectroscopies, stability, and functions. Chirality 14: 782–92.
Bochkareva, E., L. Kaustov, A. Ayed, et al. 2005. Single-stranded DNA mimicry in the p53 trans-
activation domain interaction with replication protein A. Proc. Natl. Acad. Sci. USA 102:
15412–7.
Bochtler, M., L. Ditzel, M. Groll, C. Hartmann, and R. Huber. 1999. The proteasome. Annu. Rev.
Biophys. Biomol. Struct. 28: 295–317.
Bock-Marquette, I., A. Saxena, M. D. White, J. Michael Dimaio, and D. Srivastava. 2004.
Thymosin beta4 activates integrin-linked kinase and promotes cardiac cell migration, sur-
vival and cardiac repair. Nature 432: 466–72.
Bodart, J. F., J. M. Wieruszeski, L. Amniai, et al. 2008. NMR observation of Tau in Xenopus
oocytes. J. Magn. Reson. 192: 252–7.
Bode, W., P. Schwager, and R. Huber. 1978. The transition of bovine trypsinogen to a trypsin-like
state upon strong ligand binding. The refined crystal structures of the bovine trypsinogen-
pancreatic trypsin inhibitor complex and of its ternary complex with Ile-Val at 1.9 A resolu-
tion. J. Mol. Biol. 118: 99–112.
Bois, P. and A. J. Jeffreys. 1999. Minisatellite instability and germline mutation. Cell. Mol. Life
Sci. 55: 1636–48.
Bois, P. R. 2003. Hypermutable minisatellites, a human affair? Genomics 81: 349–55.
Bokor, M., V. Csizmok, D. Kovacs, et al. 2005. NMR relaxation studies on the hydrate layer of
intrinsically unstructured proteins. Biophys J. 88: 2030–7.
Bonin, I., R. Muhlberger, G. P. Bourenkov, et al. 2004. Structural basis for the interaction of
Escherichia coli NusA with protein N of phage lambda. Proc. Natl. Acad. Sci. USA 101:
13762–7.
Bonsor, D. A., I. Grishkovskaya, E. J. Dodson, and C. Kleanthous. 2007. Molecular mimicry
enables competitive recruitment by a natively disordered protein. J. Am. Chem. Soc. 129:
4800–7.
Booth, D. R., M. Sunde, V. Bellotti, et al. 1997. Instability, unfolding and aggregation of human
lysozyme variants underlying amyloid fibrillogenesis. Nature 385: 787–93.
Bordoli, L., F. Kiefer, and T. Schwede. 2007. Assessment of disorder predictions in CASP7.
Proteins 69 (Suppl 8): 129–36.
Borg, M., T. Mittag, T. Pawson, M. Tyers, J. D. Forman-Kay, and H. S. Chan. 2007. Polyelectrostatic
interactions of disordered ligands suggest a physical basis for ultrasensitivity. Proc. Natl.
Acad. Sci. USA 104: 9650–5.
Bourhis, J. M., B. Canard, and S. Longhi. 2006. Structural disorder within the replicative complex
of measles virus: functional implications. Virology 344: 94–110.
Bourhis, J. M., V. Receveur-Brechot, M. Oglesbee, et al. 2005. The intrinsically disordered
C-terminal domain of the measles virus nucleoprotein interacts with the C-terminal domain
of the phosphoprotein via two distinct sites and remains predominantly unfolded. Protein
Sci. 14: 1975–92.
Bracken, C., L. M. Iakoucheva, P. R. Romero, and A. K. Dunker. 2004. Combining prediction,
computation and experiment for the characterization of protein disorder. Curr. Opin. Struct.
Biol. 14: 570–6.
Braig, K., Z. Otwinowski, R. Hegde, et al. 1994. The crystal structure of the bacterial chaperonin
GroEL at 2.8 A. Nature 371: 578–86.
Bray, E. A. 1993. Molecular responses to water deficit. Plant Physiol. 103: 1035–40.
Breidenbach, M. A. and A. T. Brunger. 2004. Substrate recognition strategy for botulinum neuro-
toxin serotype A. Nature 432: 925–9.
Brenner, S. E. 2000. Target selection for structural genomics. Nat. Struct. Biol. 7 Suppl:
967–9.
Bretscher, A. 1984. Smooth muscle caldesmon. Rapid purification and F-actin cross-linking prop-
erties. J. Biol. Chem. 259: 12873–80.
Bright, J. N., T. B. Woolf, and J. H. Hoh. 2001. Predicting properties of intrinsically unstructured
proteins. Prog. Biophys. Mol. Biol. 76: 131–73.
Brookmeyer, R., S. Gray, and C. Kawas. 1998. Projections of Alzheimer’s disease in the United States
and the public health impact of delaying disease onset. Am. J. Public. Health. 88: 1337–42.
Brooks, C. L. and W. Gu. 2006. p53 ubiquitination: Mdm2 and beyond. Mol. Cell 21: 307–15.
Brown, C. J., S. Takayama, A. M. Campen, et al. 2002. Evolutionary rate heterogeneity in proteins
with long disordered regions. J. Mol. Evol. 55: 104–10.
Brown, D. R., K. Qin, J. W. Herms, et al. 1997a. The cellular prion protein binds copper in vivo.
Nature 390: 684–7.
Brown, D. R., W. J. Schulz-Schaeffer, B. Schmidt, and H. A. Kretzschmar. 1997b. Prion protein-
deficient cells show altered response to oxidative stress due to decreased SOD-1 activity.
Exp. Neurol. 146: 104–12.
Brown, H. G. and J. H. Hoh. 1997. Entropic exclusion by neurofilament sidearms: a mechanism
for maintaining interfilament spacing. Biochemistry 36: 15035–40.
Brunger, A. T., M. A. Breidenbach, R. Jin, A. Fischer, J. S. Santos, and M. Montal. 2007.
Botulinum neurotoxin heavy chain belt as an intramolecular chaperone for the light chain.
PLoS Pathog. 3: 1191–4.
Brzovic, P. S., J. R. Keeffe, H. Nishikawa, et al. 2003. Binding and recognition in the assembly of an
active BRCA1/BARD1 ubiquitin-ligase complex. Proc. Natl. Acad. Sci. USA 100: 5646–51.
Buard, J. and A. J. Jeffreys. 1997. Big, bad minisatellites. Nat. Genet. 15: 327–8.
Bubunenko, M. G., S. V. Chuikov, and A. T. Gudkov. 1992. The length of the interdomain region
of the L7/L12 protein is important for its function. FEBS Lett. 313: 232–4.
Buday, L. 1999. Membrane-targeting of signalling molecules by SH2/SH3 domain-containing
adaptor proteins. Biochim. Biophys. Acta. 1422: 187–204.
Buee, L., T. Bussiere, V. Buee-Scherrer, A. Delacourte, and P. R. Hof. 2000. Tau protein isoforms,
phosphorylation and role in neurodegenerative disorders. Brain Res. Brain Res. Rev. 33:
95–130.
Bushnell, D. A., K. D. Westover, R. E. Davis, and R. D. Kornberg. 2004. Structural basis of
transcription: an RNA polymerase II-TFIIB cocrystal at 4.5 angstroms. Science 303:
983–8.
Bussell, R. Jr. and D. Eliezer. 2001. Residual structure and dynamics in Parkinson’s disease-
associated mutants of alpha-synuclein. J. Biol. Chem. 276: 45996–6003.
Bustos, D. M. and A. A. Iglesias. 2006. Intrinsic disorder is a key characteristic in partners that
bind 14-3-3 proteins. Proteins 63: 35–42.
Callaghan, A. J., J. P. Aurikko, L. L. Ilag, et al. 2004. Studies of the RNA degradosome-organizing
domain of the Escherichia coli ribonuclease RNase E. J. Mol. Biol. 340: 965–79.
Campbell, K. M., A. R. Terrell, P. J. Laybourn, and K. J. Lumb. 2000. Intrinsic structural disorder
of the C-terminal activation domain from the bZIP transcription factor Fos. Biochemistry
39: 2708–13.
Carafoli, E., L. Santella, D. Branca, and M. Brini. 2001. Generation, control, and processing of
cellular calcium signals. Crit. Rev. Biochem. Mol. Biol. 36: 107–260.
Carlson, D. M. 1993. Salivary proline-rich proteins: biochemistry, molecular biology, and regula-
tion of expression. Crit. Rev. Oral. Biol. Med. 4: 495–502.
Caron, E. 2002. Regulation of Wiskott–Aldrich syndrome protein and related molecules. Curr.
Opin. Cell Biol. 14: 82–7.
Carr, C. M. and P. S. Kim. 1993. A spring-loaded mechanism for the conformational change of
influenza hemagglutinin. Cell 73: 823–32.
Carrard, G., A. Koivula, H. Soderlund, and P. Beguin. 2000. Cellulose-binding domains promote
hydrolysis of different sites on crystalline cellulose. Proc. Natl. Acad. Sci. USA 97:
10342–7.
Casares, S., M. Sadqi, O. Lopez-Mayorga, F. Conejero-Lara, and N. A. Van Nuland. 2004.
Detection and characterization of partially unfolded oligomers of the SH3 domain of alpha-
spectrin. Biophys. J. 86: 2403–13.
Cattaneo, E., C. Zuccato, and M. Tartari. 2005. Normal huntingtin function: an alternative
approach to Huntington’s disease. Nat. Rev. Neurosci. 6: 919–30.
Chakrabortee, S., C. Boschetti, L. J. Walton, S. Sarkar, D. C. Rubinsztein, and A. Tunnacliffe.
2007. Hydrophilic protein associated with desiccation tolerance exhibits broad protein sta-
bilization function. Proc. Natl. Acad. Sci. USA 104: 18073–78.
Chang, J. F., K. Phillips, T. Lundback, M. Gstaiger, J. E. Ladbury, and B. Luisi. 1999. Oct-1 POU
and octamer DNA co-operate to recognize the Bob-1 transcription co-activator via induced
folding. J. Mol. Biol. 288: 941–52.
Chang, X. B., J. A. Tabcharani, Y. X. Hou, et al. 1993. Protein kinase A (PKA) still activates
CFTR chloride channel after mutagenesis of all 10 PKA consensus phosphorylation sites.
J. Biol. Chem. 268: 11304–11.
Charlton, A. J., N. J. Baxter, T. H. Lilley, E. Haslam, C. J. McDonald, and M. P. Williamson. 1996.
Tannin interactions with a full-length human salivary proline-rich protein display a stronger
affinity than with single proline-rich repeats. FEBS Lett. 382: 289–92.
Chatterjee, A., A. Kumar, J. Chugh, S. Srivastava, N. S. Bhavesh, and R. V. Hosur. 2005. NMR of
unfolded proteins. J. Chem. Sci. 117: 3–21.
Chattopadhyay, K., E. L. Elson, and C. Frieden. 2005. The kinetics of conformational fluctuations
in an unfolded protein measured by fluorescence methods. Proc. Natl. Acad. Sci. USA 102:
2385–9.
Chiti, F. and C. M. Dobson. 2006. Protein misfolding, functional amyloid, and human disease.
Annu. Rev. Biochem. 75: 333–66.
Chiti, F., M. Stefani, N. Taddei, G. Ramponi, and C. M. Dobson. 2003. Rationalization of the
effects of mutations on peptide and protein aggregation rates. Nature 424: 805–8.
Chiti, F., N. Taddei, M. Stefani, C. M. Dobson, and G. Ramponi. 2001. Reduction of the amy-
loidogenicity of a protein by specific binding of ligands to the native conformation. Protein
Sci. 10: 879–86.
Chiti, F., P. Webster, N. Taddei, et al. 1999. Designing conditions for in vitro formation of amyloid
protofilaments and fibrils. Proc. Natl. Acad. Sci. USA 96: 3590–4.
Chong, P. A., B. Ozdamar, J. L. Wrana, and J. D. Forman-Kay. 2004. Disorder in a target for the
Smad2 mad homology 2 domain and its implications for binding and specificity. J. Biol.
Chem. 279: 40707–14.
Chothia, C., J. Gough, C. Vogel, and S. A. Teichmann. 2003. Evolution of the protein repertoire.
Science 300: 1701–3.
Chou, P. Y. and G. D. Fasman. 1978. Prediction of the secondary structure of proteins from their
amino acid sequence. Adv. Enzymol. Relat. Areas Mol. Biol. 47: 45–148.
Chung, W. H., J. L. Craighead, W. H. Chang, et al. 2003. RNA polymerase II/TFIIF structure and
conserved organization of the initiation complex. Mol. Cell 12: 1003–13.
Clapier, C. R., G. Langst, D. F. Corona, P. B. Becker, and K. P. Nightingale. 2001. Critical role
for the histone H4 N terminus in nucleosome remodeling by ISWI. Mol. Cell Biol. 21:
875–83.
Clayton, D. F. and J. M. George. 1998. The synucleins: a family of proteins involved in synaptic
function, plasticity, neurodegeneration and disease. Trends Neurosci. 21: 249–54.
Cliff, M. J., R. Harris, D. Barford, J. E. Ladbury, and M. A. Williams. 2006. Conformational
diversity in the TPR domain-mediated interaction of protein phosphatase 5 with Hsp90.
Structure 14: 415–26.
Cobb, N. J., F. D. Sonnichsen, H. Mchaourab, and W. K. Surewicz. 2007. Molecular architecture
of human prion protein amyloid: a parallel, in-register beta-structure. Proc. Natl. Acad. Sci.
USA 104: 18946–51.
Coeytaux, K. and A. Poupon. 2005. Prediction of unfolded segments in a protein sequence based
on amino acid composition. Bioinformatics 21: 1891–900.
Cohen, P. 2000. The regulation of protein function by multisite phosphorylation—a 25-year
update. Trends Biochem. Sci. 25: 596–601.
Cohen, P. T. 1997. Novel protein serine/threonine phosphatases: variety is the spice of life. Trends
Biochem. Sci. 22: 245–51.
Cohen, P. T., M. X. Chen, and C. G. Armstrong. 1996. Novel protein phosphatases that may
participate in cell signaling. Adv. Pharmacol. 36: 67–89.
Collins, E. C. and T. H. Rabbitts. 2002. The promiscuous MLL gene links chromosomal translo-
cations to cellular differentiation and tumour tropism. Trends Mol. Med. 8: 436–42.
Come, J. H., P. E. Fraser, and P. T. Lansbury Jr. 1993. A kinetic model for amyloid formation in
the prion diseases: importance of seeding. Proc. Natl. Acad. Sci. USA 90: 5959–63.
Conrad, B. and S. E. Antonarakis. 2007. Gene duplication: a drive for phenotypic diversity and
cause of human disease. Annu. Rev. Genomics. Hum. Genet. 8: 17–35.
Consortium. 2004. Finishing the euchromatic sequence of the human genome. Nature 431:
931–45.
Conway, K. A., J. D. Harper, and P. T. Lansbury. 1998. Accelerated in vitro fibril formation by a
mutant alpha-synuclein linked to early-onset Parkinson disease. Nat. Med. 4: 1318–20.
Conway, K. A., J. D. Harper, and P. T. Lansbury Jr. 2000. Fibrils formed in vitro from alpha-
synuclein and two mutant forms linked to Parkinson’s disease are typical amyloid.
Biochemistry 39: 2552–63.
Copley, R. R., T. Doerks, I. Letunic, and P. Bork. 2002. Protein domain analysis in the era of
complete genomes. FEBS Lett. 513: 129–34.
Copley, R. R., L. Goodstadt, and C. Ponting. 2003. Eukaryotic domain evolution inferred from
genome comparisons. Curr. Opin. Genet. Dev. 13: 623–8.
Cordero, O. J., C. S. Sarandeses, J. L. Lopez, and M. Nogueira. 1992. On the anomalous behav-
iour on gel-filtration and SDS-electrophoresis of prothymosin-alpha. Biochem. Int. 28:
1117–24.
Corradi, G. R. and H. P. Adamo. 2007. Intramolecular fluorescence resonance energy transfer
between fused autofluorescent proteins reveals rearrangements of the N- and C-terminal
segments of the plasma membrane Ca2+ pump involved in the activation. J. Biol. Chem.
282: 35440–8.
Cortese, M. S., J. P. Baird, V. N. Uversky, and A. K. Dunker. 2005. Uncovering the unfoldome:
enriching cell extracts for unstructured proteins by acid treatment. J. Proteome. Res. 4:
1610–8.
Cox, C. J., K. Dutta, E. T. Petri, et al. 2002. The regions of securin and cyclin B proteins recog-
nized by the ubiquitination machinery are natively unfolded. FEBS Lett. 527: 303–8.
Cramer, P., D. A. Bushnell, and R. D. Kornberg. 2001. Structural basis of transcription: RNA
polymerase II at 2.8 angstrom resolution. Science 292: 1863–76.
Creamer, L. K., T. Richardson, and D. A. Parry. 1981. Secondary structure of bovine alpha s1- and
beta-casein in solution. Arch. Biochem. Biophys. 211: 689–96.
Crevel, G. and S. Cotterill. 1995. DF 31, a sperm decondensation factor from Drosophila melano-
gaster: purification and characterization. EMBO J. 14: 1711–7.
Crevel, G., H. Huikeshoven, and S. Cotterill. 2001. Df31 is a novel nuclear protein involved in
chromatin structure in Drosophila melanogaster. J. Cell Sci. 114: 37–47.
Crick, S. L., M. Jayaraman, C. Frieden, R. Wetzel, and R. V. Pappu. 2006. Fluorescence correla-
tion spectroscopy shows that monomeric polyglutamine molecules form collapsed struc-
tures in aqueous solutions. Proc. Natl. Acad. Sci. USA 103: 16764–9.
Cristofari, G. and J. L. Darlix. 2002. The ubiquitous nature of RNA chaperone proteins. Prog.
Nucleic Acid Res. Mol. Biol. 72: 223–68.
Cristofari, G., C. Gabus, D. Ficheux, M. Bona, S. F. Le Grice, and J. L. Darlix. 1999.
Characterization of active reverse transcriptase and nucleoprotein complexes of the yeast
retrotransposon Ty3 in vitro. J. Biol. Chem. 274: 36643–8.
Crowther, R. A., R. Jakes, M. G. Spillantini, and M. Goedert. 1998. Synthetic filaments assembled
from C-terminally truncated alpha-synuclein. FEBS Lett. 436: 309–12.
Csermely, P. 1997. Proteins, RNAs and chaperones in enzyme evolution: a folding perspective.
Trends Biochem. Sci. 22: 147–9.
Csermely, P. 1999. Chaperone-percolator model: a possible molecular mechanism of
Anfinsen-cage-type chaperones. Bioessays 21: 959–65.
Csermely, P., T. Schnaider, C. Soti, Z. Prohaszka, and G. Nardai. 1998. The 90-kDa molecular
chaperone family: structure, function, and clinical applications. A comprehensive review.
Pharmacol. Ther. 79: 129–68.
Csizmok, V., M. Bokor, P. Banki, et al. 2005. Primary contact sites in intrinsically unstructured
proteins: the case of calpastatin and microtubule-associated protein 2. Biochemistry 44:
3955–64.
Csizmok, V., I. Felli, P. Tompa, L. Banci, and I. Bertini. 2008. Structural and dynamic character-
ization of intrinsically disordered human securin by NMR spectroscopy. J. Am. Chem. Soc.
130: 16873–9.
Csizmok, V., E. Szollosi, P. Friedrich, and P. Tompa. 2006. A novel two-dimensional electro-
phoresis technique for the identification of intrinsically unstructured proteins. Mol. Cell.
Proteomics. 5: 265–73.
Curran, J. and D. Kolakofsky. 1999. Replication of paramyxoviruses. Adv. Virus Res. 54:
403–22.
Dafforn, T. R. and C. J. Smith. 2004. Natively unfolded domains in endocytosis: hooks, lines and
linkers. EMBO Rep. 5: 1046–52.
Daggett, V. and A. R. Fersht. 2003. Is there a unifying mechanism for protein folding? Trends
Biochem. Sci. 28: 18–25.
Dai, M. S. and H. Lu. 2004. Inhibition of MDM2-mediated p53 ubiquitination and degradation by
ribosomal protein L5. J. Biol. Chem. 279: 44475–82.
Dai, P., H. Akimaru, Y. Tanaka, et al. 1996. CBP as a transcriptional coactivator of c-Myb. Genes
Dev. 10: 528–40.
Dames, S. A., M. Martinez-Yamout, R. N. De Guzman, H. J. Dyson, and P. E. Wright. 2002.
Structural basis for Hif-1 alpha/CBP recognition in the cellular hypoxic response. Proc.
Natl. Acad. Sci. USA 99: 5271–6.
Daniels, D. L., K. Eklof-Spink, and W. I. Weis. 2001. Beta-catenin: molecular plasticity and drug
design. Trends Biochem. Sci. 26: 672–8.
Daughdrill, G. W., M. S. Chadsey, J. E. Karlinsey, K. T. Hughes, and F. W. Dahlquist. 1997. The
C-terminal half of the anti-sigma factor, FlgM, becomes structured when bound to its target,
sigma 28. Nat. Struct. Biol. 4: 285–91.
Daughdrill, G. W., L. J. Hanely, and F. W. Dahlquist. 1998. The C-terminal half of the anti-sigma
factor FlgM contains a dynamic equilibrium solution structure favoring helical conforma-
tions. Biochemistry 37: 1076–82.
Daughdrill, G. W., P. Narayanaswami, S. H. Gilmore, A. Belczyk, and C. J. Brown. 2007. Dynamic
behavior of an intrinsically unstructured linker domain is conserved in the face of negligible
amino acid sequence conservation. J. Mol. Evol. 65: 277–88.
Davey, N. E., D. C. Shields, and R. J. Edwards. 2006. SLiMDisc: short, linear motif discovery,
correcting for common evolutionary descent. Nucleic Acids Res. 34: 3546–54.
David, D. C., R. Layfield, L. Serpell, Y. Narain, M. Goedert, and M. G. Spillantini. 2002.
Proteasomal degradation of tau protein. J. Neurochem. 83: 176–85.
Davies, K. J. 2001. Degradation of oxidized proteins by the 20S proteasome. Biochimie 83:
301–10.
Dawson, R., L. Muller, A. Dehner, C. Klein, H. Kessler, and J. Buchner. 2003. The N-terminal
domain of p53 is natively unfolded. J. Mol. Biol. 332: 1131–41.
De Guzman, R. N., M. Martinez-Yamout, H. J. Dyson, and P. E. Wright. 2004. Interaction of the
TAZ1 domain of CREB-binding protein with the activation domain of CITED2: regulation
by competition between intrinsically unstructured ligands for non-identical binding sites.
J. Biol. Chem. 279: 3042–49.
Dedmon, M. M., K. Lindorff-Larsen, J. Christodoulou, M. Vendruscolo, and C. M. Dobson. 2005.
Mapping long-range interactions in alpha-synuclein using spin-label NMR and ensemble
molecular dynamics simulations. J. Am. Chem. Soc. 127: 476–7.
Dedmon, M. M., C. N. Patel, G. B. Young, and G. J. Pielak. 2002. FlgM gains structure in living
cells. Proc. Natl. Acad. Sci. USA 99: 12681–4.
Demarest, S. J., S. Deechongkit, H. J. Dyson, R. M. Evans, and P. E. Wright. 2004. Packing,
specificity, and mutability at the binding interface between the p160 coactivator and CREB-
binding protein. Protein Sci. 13: 203–10.
Demarest, S. J., M. Martinez-Yamout, J. Chung, et al. 2002. Mutual synergistic folding in recruit-
ment of CBP/p300 by p160 nuclear receptor coactivators. Nature 415: 549–53.
Demchenko, A. P. 2001. Recognition between flexible protein molecules: induced and assisted
folding. J. Mol. Recognit. 14: 42–61.
Deng, C. X. 2006. BRCA1: cell cycle checkpoint, genetic instability, DNA damage response and
cancer evolution. Nucleic Acids Res. 34: 1416–26.
Denning, D. P., S. S. Patel, V. Uversky, A. L. Fink, and M. Rexach. 2003. Disorder in the nuclear
pore complex: the FG repeat regions of nucleoporins are natively unfolded. Proc. Natl.
Acad. Sci. USA 100: 2450–5.
Denning, D. P. and M. F. Rexach. 2007. Rapid evolution exposes the boundaries of domain
structure and function in natively unfolded FG nucleoporins. Mol. Cell. Proteomics. 6:
272–82.
Denning, D. P., V. Uversky, S. S. Patel, A. L. Fink, and M. Rexach. 2002. The Saccharomyces
cerevisiae nucleoporin Nup2p is a natively unfolded protein. J. Biol. Chem. 277:
33447–55.
Dill, K. A. and H. S. Chan. 1997. From Levinthal to pathways to funnels. Nat. Struct. Biol. 4:
10–9.
Dill, K. A. and D. Shortle. 1991. Denatured states of proteins. Annu. Rev. Biochem. 60: 795–825.
Dingwall, C., S. M. Dilworth, S. J. Black, S. E. Kearsey, L. S. Cox, and R. A. Laskey. 1987.
Nucleoplasmin cDNA sequence reveals polyglutamic acid tracts and a cluster of sequences
homologous to putative nuclear localization signals. EMBO J. 6: 69–74.
Dinitto, J. P. and P. W. Huber. 2003. Mutual induced fit binding of Xenopus ribosomal protein L5
to 5S rRNA. J. Mol. Biol. 330: 979–92.
Dobson, C. M. 1993. Flexible friends. Current Biology 3: 530–32.
Dobson, C. M. 1999. Protein misfolding, evolution and disease. Trends Biochem. Sci. 24:
329–32.
Dobson, C. M. 2002. Getting out of shape. Nature 418: 729–30.
Domanski, M., M. Hertzog, J. Coutant, et al. 2004. Coupling of folding and binding of thymosin
beta4 upon interaction with monomeric actin monitored by nuclear magnetic resonance.
J. Biol. Chem. 279: 23637–45.
Donaldson, L. and J. P. Capone. 1992. Purification and characterization of the carboxyl-terminal
transactivation domain of Vmw65 from herpes simplex virus type 1. J. Biol. Chem. 267:
1411–4.
Donne, D. G., J. H. Viles, D. Groth, et al. 1997. Structure of the recombinant full-length hamster
prion protein PrP(29-231): the N terminus is highly flexible. Proc. Natl. Acad. Sci. USA 94:
13452–7.
Dosztanyi, Z., J. Chen, A. K. Dunker, I. Simon, and P. Tompa. 2006. Disorder and sequence
repeats in hub proteins and their implications for network evolution. J. Proteome. Res. 5:
2985–95.
Dosztanyi, Z., V. Csizmok, P. Tompa, and I. Simon. 2005a. IUPred: web server for the predic-
tion of intrinsically unstructured regions of proteins based on estimated energy content.
Bioinformatics 21: 3433–4.
Dosztanyi, Z., V. Csizmok, P. Tompa, and I. Simon. 2005b. The pair-wise energy content estimated
from amino acid composition discriminates between folded and intrinsically unstructured
proteins. J. Mol. Biol. 347: 827–39.
Dosztanyi, Z., M. Sandor, P. Tompa, and I. Simon. 2007. Prediction of protein disorder at the
domain level. Curr. Protein Pept. Sci. 8: 161–71.
Drenth, J. 2006. Principles of Protein X-Ray Crystallography. New York: Springer.
Drews, J. 2000. Drug discovery: a historical perspective. Science 287: 1960–4.
Dueber, J. E., B. J. Yeh, K. Chak, and W. A. Lim. 2003. Reprogramming control of an allosteric
signaling switch through modular recombination. Science 301: 1904–8.
Dunker, A. K. 2007. Another window into disordered protein function. Structure 15: 1026–8.
Dunker, A. K., C. J. Brown, J. D. Lawson, L. M. Iakoucheva, and Z. Obradovic. 2002. Intrinsic
disorder and protein function. Biochemistry 41: 6573–82.
Dunker, A. K., M. S. Cortese, P. Romero, L. M. Iakoucheva, and V. N. Uversky. 2005. Flexible nets.
The roles of intrinsic disorder in protein interaction networks. FEBS J. 272: 5129–48.
Dunker, A. K., E. Garner, S. Guillot, et al. 1998. Protein disorder and the evolution of molecular
recognition: theory, predictions and observations. Pac. Symp. Biocomput. 3:
473–84.
Dunker, A. K., J. D. Lawson, C. J. Brown, et al. 2001. Intrinsically disordered protein. J. Mol.
Graphics. Modelling 19: 26–59.
Dunker, A. K., Z. Obradovic, P. Romero, E. C. Garner, and C. J. Brown. 2000. Intrinsic pro-
tein disorder in complete genomes. Genome Inform. Ser. Workshop Genome Inform. 11:
161–71.
Dunker, A. K., C. J. Oldfield, J. Meng, et al. 2008. The unfoldomics decade: an update on intrinsi-
cally disordered proteins. BMC Genomics 9 Suppl 2: S1.
Dyson, H. J. and P. E. Wright. 2002a. Coupling of folding and binding for unstructured proteins.
Curr. Opin. Struct. Biol. 12: 54–60.
Dyson, H. J. and P. E. Wright. 2002b. Insights into the structure and dynamics of unfolded pro-
teins from nuclear magnetic resonance. Adv. Protein Chem. 62: 311–40.
Dyson, H. J. and P. E. Wright. 2004. Unfolded proteins and protein folding studied by NMR.
Chem. Rev. 104: 3607–22.
Dyson, H. J. and P. E. Wright. 2005. Intrinsically unstructured proteins and their functions. Nat.
Rev. Mol. Cell Biol. 6: 197–208.
Ebert, M. O., S. H. Bae, H. J. Dyson, and P. E. Wright. 2008. NMR relaxation study of the
complex formed between CBP and the activation domain of the nuclear hormone receptor
coactivator ACTR. Biochemistry 47: 1299–308.
Ekman, D., S. Light, A. K. Bjorklund, and A. Elofsson. 2006. What properties characterize the
hub proteins of the protein–protein interaction network of Saccharomyces cerevisiae?
Genome Biol. 7: R45.
El-Agnaf, O. M. and G. B. Irvine. 2002. Aggregation and neurotoxicity of alpha-synuclein and
related peptides. Biochem. Soc. Trans. 30: 559–65.
Eliezer, D., P. Barre, M. Kobaslija, D. Chan, X. Li, and L. Heend. 2005. Residual structure in the
repeat domain of tau: echoes of microtubule binding and paired helical filament formation.
Biochemistry 44: 1026–36.
Eliezer, D., E. Kutluay, R. Bussell Jr., and G. Browne. 2001. Conformational properties of alpha-
synuclein in its free and lipid-associated states. J. Mol. Biol. 307: 1061–73.
Elion, E. A. 2001. The Ste5p scaffold. J. Cell Sci. 114: 3967–78.
Elkins, J. M., K. S. Hewitson, L. A. McNeill, et al. 2003. Structure of factor-inhibiting hypoxia-
inducible factor (HIF) reveals mechanism of oxidative modification of HIF-1alpha. J. Biol.
Chem. 278: 1802–06.
Ellis, R. J. 2001. Macromolecular crowding: obvious but underappreciated. Trends Biochem. Sci.
26: 597–604.
Ellis, R. J. 2006. Molecular chaperones: assisting assembly in addition to folding. Trends Biochem.
Sci. 31: 395–401.
Etoh, Y., M. Simon, and H. Green. 1986. Involucrin acts as a transglutaminase substrate at mul-
tiple sites. Biochem. Biophys. Res. Commun. 136: 51–6.
Evans, P. R. and D. J. Owen. 2002. Endocytosis and vesicle trafficking. Curr. Opin. Struct. Biol.
12: 814–21.
Fabrega, C., V. Shen, S. Shuman, and C. D. Lima. 2003. Structure of an mRNA capping enzyme bound
to the phosphorylated carboxy-terminal domain of RNA polymerase II. Mol. Cell 11: 1549–61.
Fandrich, M., M. A. Fletcher, and C. M. Dobson. 2001. Amyloid fibrils from muscle myoglobin.
Nature 410: 165–6.
Fazzio, T. G., M. E. Gelbart, and T. Tsukiyama. 2005. Two distinct mechanisms of chromatin inter-
action by the Isw2 chromatin remodeling complex in vivo. Mol. Cell Biol. 25: 9165–74.
Fedarko, N. S., B. Fohr, P. G. Robey, M. F. Young, and L. W. Fisher. 2000. Factor H binding
to bone sialoprotein and osteopontin enables tumor cell evasion of complement-mediated
attack. J. Biol. Chem. 275: 16666–72.
Feldman, R. M., C. C. Correll, K. B. Kaplan, and R. J. Deshaies. 1997. A complex of Cdc4p,
Skp1p, and Cdc53p/cullin catalyzes ubiquitination of the phosphorylated CDK inhibitor
Sic1p. Cell 91: 221–30.
Felsenstein, J. 1997. An alternating least squares approach to inferring phylogenies from pairwise
distances. Syst. Biol. 46: 101–11.
Feng, Z. P., X. Zhang, P. Han, N. Arora, R. F. Anders, and R. S. Norton. 2006. Abundance of
intrinsically unstructured proteins in P. falciparum and other apicomplexan parasite pro-
teomes. Mol. Biochem. Parasitol. 150: 256–67.
Fowler, D. M., A. V. Koulov, W. E. Balch, and J. W. Kelly. 2007. Functional amyloid – from bacteria
to humans. Trends Biochem. Sci. 32: 217–24.
Frebel, K. and S. Wiese. 2006. Signalling molecules essential for neuronal survival and differen-
tiation. Biochem. Soc. Trans. 34: 1287–90.
Frey, S., R. P. Richter, and D. Gorlich. 2006. FG-rich repeats of nuclear pore proteins form a
three-dimensional meshwork with hydrogel-like properties. Science 314: 815–7.
Frieden, C., K. Chattopadhyay, and E. L. Elson. 2002. What fluorescence correlation spectros-
copy can tell us about unfolded proteins. Adv. Protein Chem. 62: 91–109.
Futreal, P. A., L. Coin, M. Marshall, et al. 2004. A census of human cancer genes. Nat. Rev.
Cancer. 4: 177–83.
Fuxreiter, M., I. Simon, P. Friedrich, and P. Tompa. 2004. Preformed structural elements
feature in partner recognition by intrinsically unstructured proteins. J. Mol. Biol. 338:
1015–26.
Fuxreiter, M., P. Tompa, and I. Simon. 2007. Structural disorder imparts plasticity on linear
motifs. Bioinformatics 23: 950–6.
Gabus, C., E. Derrington, P. Leblanc, et al. 2001. The prion protein has RNA binding and chap-
eroning properties characteristic of nucleocapsid protein NCP7 of HIV-1. J. Biol. Chem.
276: 19301–9.
Gabus, C., R. Mazroui, S. Tremblay, E. W. Khandjian, and J. L. Darlix. 2004. The fragile X
mental retardation protein has nucleic acid chaperone properties. Nucleic Acids Res. 32:
2129–37.
Gajdusek, D. C., C. J. Gibbs Jr., D. M. Asher, and E. David. 1968. Transmission of experimental
kuru to the spider monkey (Ateles geoffreyi). Science 162: 693–4.
Galea, C. A., A. Nourse, Y. Wang, S. G. Sivakolundu, W. T. Heller, and R. W. Kriwacki. 2008a.
Role of intrinsic flexibility in signal transduction mediated by the cell cycle regulator,
p27(Kip1). J. Mol. Biol. 376: 827–38.
Galea, C. A., V. R. Pagala, J. C. Obenauer, C. G. Park, C. A. Slaughter, and R. W. Kriwacki. 2006.
Proteomic studies of the intrinsically unstructured mammalian proteome. J. Proteome. Res.
5: 2839–48.
Galea, C. A., Y. Wang, S. G. Sivakolundu, and R. W. Kriwacki. 2008b. Regulation of cell divi-
sion by intrinsically unstructured proteins: intrinsic flexibility, modularity, and signaling
conduits. Biochemistry 47: 7598–609.
Galzitskaya, O. V., S. O. Garbuzynskiy, and M. Y. Lobanov. 2006. FoldUnfold: web server for the
prediction of disordered regions in protein chain. Bioinformatics 22: 2948–9.
Ganesh, O. K., T. B. Green, A. S. Edison, and S. J. Hagen. 2006. Characterizing the residue level
folding of the intrinsically unstructured IA3. Biochemistry 45: 13585–96.
Garay-Arroyo, A., J. M. Colmenero-Flores, A. Garciarrubio, and A. A. Covarrubias. 2000. Highly
hydrophilic proteins in prokaryotes and eukaryotes are common during conditions of water
deficit. J. Biol. Chem. 275: 5668–74.
Garbuzynskiy, S. O., M. Y. Lobanov, and O. V. Galzitskaya. 2004. To be folded or to be unfolded?
Protein Sci. 13: 2871–7.
Garner, E., P. Cannon, P. Romero, Z. Obradovic, and A. K. Dunker. 1998. Predicting Disordered
Regions from Amino Acid Sequence: Common Themes Despite Differing Structural
Characterization. Genome Inform. Ser. Workshop Genome Inform. 9: 201–13.
Garner, E., P. Romero, A. K. Dunker, C. Brown, and Z. Obradovic. 1999. Predicting Binding
Regions within Disordered Proteins. Genome Inform. Ser. Workshop Genome Inform. 10:
41–50.
Garrett, R. H. and C. M. Grisham. 2007. Biochemistry. Belmont, CA: Thomson Brooks/Cole.
Gast, K., H. Damaschun, K. Eckert, et al. 1995. Prothymosin alpha: a biologically active protein
with random coil conformation. Biochemistry 34: 13211–8.
Gavin, A. C., P. Aloy, P. Grandi, et al. 2006. Proteome survey reveals modularity of the yeast cell
machinery. Nature 440: 631–6.
Gavin, A. C., M. Bosche, R. Krause, et al. 2002. Functional organization of the yeast proteome by
systematic analysis of protein complexes. Nature 415: 141–7.
George, R. A. and J. Heringa. 2002. An analysis of protein domain linkers: their classification and
role in protein folding. Protein Eng. 15: 871–9.
Gerber, H. P., K. Seipel, O. Georgiev, et al. 1994. Transcriptional activation modulated by
homopolymeric glutamine and proline stretches. Science 263: 808–11.
Ghiso, J., R. Vidal, A. Rostagno, et al. 2000. A newly formed amyloidogenic fragment due to a
stop codon mutation causes familial British dementia. Ann. NY Acad. Sci. 903: 129–37.
Giaever, G., A. M. Chu, L. Ni, et al. 2002. Functional profiling of the Saccharomyces cerevisiae
genome. Nature 418: 387–91.
Gianni, S., N. R. Guydosh, F. Khan, et al. 2003. Unifying features in protein-folding mechanisms.
Proc. Natl. Acad. Sci. USA 100: 13286–91.
Gibbs, C. J. Jr., D. C. Gajdusek, D. M. Asher, et al. 1968. Creutzfeldt–Jakob disease (spongiform
encephalopathy): transmission to the chimpanzee. Science 161: 388–9.
Gidalevitz, T., A. Ben-Zvi, K. H. Ho, H. R. Brignull, and R. I. Morimoto. 2006. Progressive
disruption of cellular protein folding in models of polyglutamine diseases. Science 311:
1471–4.
Giesecke, H., J. C. Barale, G. Langsley, and A. W. Cornelissen. 1991. The C-terminal domain of
RNA polymerase II of the malaria parasite Plasmodium berghei. Biochem. Biophys. Res.
Commun. 180: 1350–5.
Gigant, B., P. A. Curmi, C. Martin-Barbey, et al. 2000. The 4 Å X-ray structure of a tubulin:stathmin-
like domain complex. Cell 102: 809–16.
Gill, G. and M. Ptashne. 1987. Mutants of GAL4 protein altered in an activation function. Cell
51: 121–6.
Gillespie, J. R. and D. Shortle. 1997. Characterization of long-range structure in the denatured
state of staphylococcal nuclease. II. Distance restraints from paramagnetic relaxation and
calculation of an ensemble of structures. J. Mol. Biol. 268: 170–84.
Girdwood, D., D. Bumpass, O. A. Vaughan, et al. 2003. p300 transcriptional repression is mediated by
SUMO modification. Mol. Cell 11: 1043–54.
Goldberg, M. E., G. V. Semisotnov, B. Friguet, K. Kuwajima, O. B. Ptitsyn, and S. Sugai. 1990.
An early immunoreactive folding intermediate of the tryptophan synthase beta 2 subunit is
a “molten globule.” FEBS Lett. 263: 51–6.
Goldfarb, L. G., P. Brown, W. R. McCombie, et al. 1991. Transmissible familial Creutzfeldt–Jakob
disease associated with five, seven, and eight extra octapeptide coding repeats in the PRNP
gene. Proc. Natl. Acad. Sci. USA 88: 10926–30.
Goldgur, Y., S. Rom, R. Ghirlando, et al. 2007. Desiccation and zinc binding induce transition of
tomato abscisic acid stress ripening 1, a water stress- and salt stress-regulated plant-specific
protein, from unfolded to folded state. Plant Physiol. 143: 617–28.
Gooding, J. M., K. L. Yap, and M. Ikura. 2004. The cadherin–catenin complex as a focal point
of cell adhesion and signalling: new insights from three-dimensional structures. Bioessays
26: 497–511.
Goodman, R. H. and S. Smolik. 2000. CBP/p300 in cell growth, transformation, and develop-
ment. Genes Dev. 14: 1553–77.
Gorovits, B. M. and P. M. Horowitz. 1995. The molecular chaperonin cpn60 displays local flexibil-
ity that is reduced after binding with an unfolded protein. J. Biol. Chem. 270: 13057–62.
Goyal, K., L. Tisi, A. Basran, et al. 2003. Transition from natively unfolded to folded state induced
by desiccation in an anhydrobiotic nematode protein. J. Biol. Chem. 278: 12977–84.
Goyal, K., L. J. Walton, and A. Tunnacliffe. 2005. LEA proteins prevent protein aggregation due
to water stress. Biochem. J. 388: 151–7.
Graciet, E., P. Gans, N. Wedel, S. Lebreton, J. M. Camadro, and B. Gontero. 2003. The small
protein CP12: a protein linker for supramolecular complex assembly. Biochemistry 42:
8163–70.
Graham, T. A., D. M. Ferkey, F. Mao, D. Kimelman, and W. Xu. 2001. Tcf4 can specifically rec-
ognize beta-catenin using alternative conformations. Nat. Struct. Biol. 8: 1048–52.
Graham, T. A., C. Weaver, F. Mao, D. Kimelman, and W. Xu. 2000. Crystal structure of a beta-
catenin/Tcf complex. Cell 103: 885–96.
Granzier, H. and S. Labeit. 2002. Cardiac titin: an adjustable multi-functional spring. J. Physiol.
541: 335–42.
Greaser, M. 2001. Identification of new repeating motifs in titin. Proteins 43: 145–9.
Green, T. B., O. Ganesh, K. Perry, et al. 2004. IA3, an aspartic proteinase inhibitor from
Saccharomyces cerevisiae, is intrinsically unstructured in solution. Biochemistry 43:
4071–81.
Greenbaum, E. A., C. L. Graves, A. J. Mishizen-Eberz, et al. 2005. The E46K mutation in alpha-
synuclein increases amyloid fibril formation. J. Biol. Chem. 280: 7800–7.
Greenblatt, J. and J. Li. 1982. Properties of the N gene transcription antitermination protein of
bacteriophage lambda. J. Biol. Chem. 257: 362–5.
Greene, L. H., R. Wijesinha-Bettoni, and C. Redfield. 2006. Characterization of the molten glob-
ule of human serum retinol-binding protein using NMR spectroscopy. Biochemistry 45:
9475–84.
Grimmler, M., Y. Wang, T. Mund, et al. 2007. Cdk-inhibitory activity and stability of p27Kip1 are
directly regulated by oncogenic tyrosine kinases. Cell 128: 269–80.
Grosschedl, R., K. Giese, and J. Pagel. 1994. HMG domain proteins: architectural elements in the
assembly of nucleoprotein structures. Trends Genet. 10: 94–100.
Grossman, S. R., M. Perez, A. L. Kung, et al. 1998. p300/MDM2 complexes participate in MDM2-
mediated p53 degradation. Mol. Cell 2: 405–15.
Group, The Huntington’s Disease Collaborative Research. 1993. A novel gene containing a
trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes.
Cell 72: 971–83.
Gu, W., M. Kofler, I. Antes, C. Freund, and V. Helms. 2005. Alternative binding modes of proline-
rich peptides binding to the GYF domain. Biochemistry 44: 6404–15.
Guijarro, J. I., M. Sunde, J. A. Jones, I. D. Campbell, and C. M. Dobson. 1998. Amyloid fibril
formation by an SH3 domain. Proc. Natl. Acad. Sci. USA 95: 4224–8.
Gunasekaran, K., C. J. Tsai, S. Kumar, D. Zanuy, and R. Nussinov. 2003. Extended disordered
proteins: targeting function with less scaffold. Trends Biochem. Sci. 28: 81–5.
Gunasekaran, K., C. J. Tsai, and R. Nussinov. 2004. Analysis of ordered and disordered protein
complexes reveals structural features discriminating between stable and unstable mono-
mers. J. Mol. Biol. 341: 1327–41.
Guo, J. T., J. W. Jaromczyk, and Y. Xu. 2007. Analysis of chameleon sequences and their implica-
tions in biological processes. Proteins 67: 548–58.
Gusev, N. B., J. Hajdu, and P. Friedrich. 1979. Motility of the N-terminal tail of phosphorylase b
as revealed by cross-linking. Biochem. Biophys. Res. Commun. 90: 70–7.
Gutierrez-Cruz, G., A. H. Van Heerden, and K. Wang. 2001. Modular motif, structural folds and
affinity profiles of the PEVK segment of human fetal skeletal muscle titin. J. Biol. Chem. 276:
7442–9.
Haarmann, C. S., D. Green, M. G. Casarotto, D. R. Laver, and A. F. Dulhunty. 2003. The random-
coil “C” fragment of the dihydropyridine receptor II–III loop can activate or inhibit native
skeletal ryanodine receptors. Biochem J. 372: 305–16.
Hackel, M., H. J. Hinz, and G. R. Hedwig. 1999. A new set of peptide-based group heat capacities
for use in protein stability calculations. J. Mol. Biol. 291: 197–213.
Hackel, M., T. Konno, and H. Hinz. 2000. A new alternative method to quantify residual structure
in “unfolded” proteins. Biochim. Biophys. Acta. 1479: 155–65.
Hagerman, A. E. and L. G. Butler. 1981. The specificity of proanthocyanidin–protein interactions.
J. Biol. Chem. 256: 4494–7.
Hagestedt, T., B. Lichtenberg, H. Wille, E. M. Mandelkow, and E. Mandelkow. 1989. Tau protein
becomes long and stiff upon phosphorylation: correlation between paracrystalline structure
and degree of phosphorylation. J. Cell Biol. 109: 1643–51.
Hai, C. M. and Z. Gu. 2006. Caldesmon phosphorylation in actin cytoskeletal remodeling. Eur.
J. Cell Biol. 85: 305–9.
Hajdu, J., V. Dombradi, G. Bot, and P. Friedrich. 1979. Structural changes in glycogen
phosphorylase as revealed by cross-linking with bifunctional diimidates: phosphorylase b.
Biochemistry 18: 4037–41.
Han, J. D., N. Bertin, T. Hao, et al. 2004. Evidence for dynamically organized modularity in the
yeast protein–protein interaction network. Nature 430: 88–93.
Han, J. H., S. Batey, A. A. Nickson, S. A. Teichmann, and J. Clarke. 2007. The folding and evolu-
tion of multidomain proteins. Nat. Rev. Mol. Cell Biol. 8: 319–30.
Hanna, R. A., B. E. Garcia-Diaz, and P. L. Davies. 2007. Calpastatin simultaneously binds four
calpains with different kinetic constants. FEBS Lett. 581: 2894–8.
Hansen, J. C. 2002. Conformational dynamics of the chromatin fiber in solution: determinants,
mechanisms and functions. Annu. Rev. Biophys. Biomol. Struct. 31: 361–92.
Hansen, J. C., X. Lu, E. D. Ross, and R. W. Woody. 2006. Intrinsic protein disorder, amino acid
composition, and histone terminal domains. J. Biol. Chem. 281: 1853–6.
Hansen, J. C., C. Tse, and A. P. Wolffe. 1998. Structure and function of the core histone N-termini:
more than meets the eye. Biochemistry 37: 17637–41.
Harauz, G., N. Ishiyama, C. M. Hill, I. R. Bates, D. S. Libich, and C. Fares. 2004. Myelin basic
protein-diverse conformational states of an intrinsically unstructured protein and its roles in
myelin assembly and multiple sclerosis. Micron 35: 503–42.
Hardy, J., M. R. Cookson, and A. Singleton. 2003. Genes and parkinsonism. Lancet Neurol 2:
221–8.
Haritos, A. A., P. P. Yialouris, E. P. Heimer, A. M. Felix, E. Hannappel, and M. A. Rosemeyer.
1989. Evidence for the monomeric nature of thymosins. FEBS Lett. 244: 287–90.
Hartlepp, K. F., C. Fernandez-Tornero, A. Eberharter, T. Grune, C. W. Muller, and P. B. Becker.
2005. The histone fold subunits of Drosophila CHRAC facilitate nucleosome sliding
through dynamic DNA interactions. Mol. Cell Biol. 25: 9886–96.
Hashimoto, M., T. Ichimura, H. Mizoguchi, et al. 2005. Cell size and nucleoid organiza-
tion of engineered Escherichia coli cells with a reduced genome. Mol. Microbiol. 55:
137–49.
Hauer, J. A., P. Barthe, S. S. Taylor, J. Parello, and A. Padilla. 1999a. Two well-defined motifs
in the cAMP-dependent protein kinase inhibitor (PKI alpha) correlate with inhibitory and
nuclear export function. Protein Sci. 8: 545–53.
Hauer, J. A., S. S. Taylor, and D. A. Johnson. 1999b. Binding-dependent disorder–order transition
in PKI alpha: a fluorescence anisotropy study. Biochemistry 38: 6774–80.
Hayashi, K., K. Kanda, F. Kimizuka, I. Kato, and K. Sobue. 1989. Primary structure and func-
tional expression of h-caldesmon complementary DNA. Biochem. Biophys. Res. Commun.
164: 503–11.
Haynes, C. and L. M. Iakoucheva. 2006. Serine/arginine-rich splicing factors belong to a class of
intrinsically disordered proteins. Nucleic Acids Res. 34: 305–12.
Haynes, C., C. J. Oldfield, F. Ji, et al. 2006. Intrinsic disorder is a common feature of hub proteins
from four eukaryotic interactomes. PLoS Comput. Biol. 2: e100.
He, Z., A. K. Dunker, C. R. Wesson, and W. R. Trumble. 1993. Ca(2+)-induced folding and
aggregation of skeletal muscle sarcoplasmic reticulum calsequestrin. The involvement of
the trifluoperazine-binding site. J. Biol. Chem. 268: 24635–41.
Heery, D. M., S. Hoare, S. Hussain, M. G. Parker, and H. Sheppard. 2001. Core LXXLL motif
sequences in CREB-binding protein, SRC1, and RIP140 define affinity and selectivity for
steroid and retinoid receptors. J. Biol. Chem. 276: 6695–702.
Hegyi, H., L. Buday, and P. Tompa. 2009. Intrinsic structural disorder confers cellular viability on
oncogenic fusion proteins. PLoS Comput. Biol., in press.
Hegyi, H., E. Schad, and P. Tompa. 2007. Structural disorder promotes assembly of protein com-
plexes. BMC Struct. Biol. 7: 65.
Hemmings, H. C. Jr., A. C. Nairn, D. W. Aswad, and P. Greengard. 1984. DARPP-32, a dopamine-
and adenosine 3′:5′-monophosphate-regulated phosphoprotein enriched in dopamine-inner-
vated brain regions. II. Purification and characterization of the phosphoprotein from bovine
caudate nucleus. J. Neurosci. 4: 99–110.
Hemmings, H. C. Jr., A. C. Nairn, J. I. Elliott, and P. Greengard. 1990. Synthetic peptide analogs
of DARPP-32 (Mr 32,000 dopamine- and cAMP-regulated phosphoprotein), an inhibitor of
protein phosphatase-1. Phosphorylation, dephosphorylation, and inhibitory activity. J. Biol.
Chem. 265: 20369–76.
Heringa, J. 1998. Detection of internal repeats: how common are they? Curr. Opin. Struct. Biol.
8: 338–45.
Hernandez, M. A., J. Avila, and J. M. Andreu. 1986. Physicochemical characterization of the heat-
stable microtubule-associated protein MAP2. Eur. J. Biochem. 154: 41–8.
Herschlag, D. 1995. RNA chaperones and the RNA folding problem. J. Biol. Chem. 270:
20871–4.
Hershey, P. E., S. M. McWhirter, J. D. Gross, G. Wagner, T. Alber, and A. B. Sachs. 1999. The
Cap-binding protein eIF4E promotes folding of a functional domain of yeast translation
initiation factor eIF4G1. J. Biol. Chem. 274: 21297–304.
Hershko, A. and A. Ciechanover. 1998. The ubiquitin system. Annu. Rev. Biochem. 67: 425–79.
Hertzog, M., C. Van Heijenoort, D. Didry, et al. 2004. The beta-thymosin/WH2 domain; struc-
tural basis for the switch from inhibition to promotion of actin assembly. Cell 117:
611–23.
Hess, J. L. 2004. MLL: a histone methyltransferase disrupted in leukemia. Trends Mol. Med. 10:
500–7.
Hess, S. T., S. Huang, A. A. Heikal, and W. W. Webb. 2002. Biological and chemical applications
of fluorescence correlation spectroscopy: a review. Biochemistry 41: 697–705.
Heyen, B. J., M. K. Alsheikh, E. A. Smith, C. F. Torvik, D. F. Seals, and S. K. Randall. 2002.
The calcium-binding activity of a vacuole-associated, dehydrin-like protein is regulated by
phosphorylation. Plant Physiol. 130: 675–87.
Hill, C. M., I. R. Bates, G. F. White, F. R. Hallett, and G. Harauz. 2002. Effects of the osmolyte
trimethylamine-N-oxide on conformation, self-association, and two-dimensional crystal-
lization of myelin basic protein. J. Struct. Biol. 139: 13–26.
Hilser, V. J. and E. B. Thompson. 2007. Intrinsic disorder as a mechanism to optimize allosteric
coupling in proteins. Proc. Natl. Acad. Sci. USA 104: 8311–5.
Himmler, A. 1989. Structure of the bovine tau gene: alternatively spliced transcripts generate a
protein family. Mol. Cell Biol. 9: 1389–96.
Hirano, T., S. I. Funahashi, T. Uemura, and M. Yanagida. 1986. Isolation and characterization of
Schizosaccharomyces pombe cut mutants that block nuclear division but not cytokinesis.
EMBO J. 5: 2973–79.
Hiroaki, H., T. Ago, T. Ito, H. Sumimoto, and D. Kohda. 2001. Solution structure of the PX
domain, a target of the SH3 domain. Nat. Struct. Biol. 8: 526–30.
Hollenbeck, J. J., D. L. McClain, and M. G. Oakley. 2002. The role of helix stabilizing residues in
GCN4 basic region folding and DNA binding. Protein Sci. 11: 2740–7.
Holt, C. and L. Sawyer. 1993. Caseins as rheomorphic proteins: interpretation of primary and
secondary structures of the alpha(s1)-, beta- and kappa-caseins. J. Chem. Soc. Faraday
Trans. 89: 2683–92.
Holt, C., N. M. Wahlgren, and T. Drakenberg. 1996. Ability of a beta-casein phosphopeptide to
modulate the precipitation of calcium phosphate by forming amorphous dicalcium phos-
phate nanoclusters. Biochem J. 314: 1035–9.
Honnappa, S., W. Jahnke, J. Seelig, and M. O. Steinmetz. 2006. Control of intrinsically disordered
stathmin by multisite phosphorylation. J. Biol. Chem. 281: 16078–83.
Hope, I. A., S. Mahadevan, and K. Struhl. 1988. Structural and functional characterization of
the short acidic transcriptional activation region of yeast GCN4 protein. Nature 333:
635–40.
Hornig, N. C., P. P. Knowles, N. Q. McDonald, and F. Uhlmann. 2002. The dual mechanism of
separase regulation by securin. Curr. Biol. 12: 973–82.
Hoshi, T., W. N. Zagotta, and R. W. Aldrich. 1990. Biophysical and molecular mechanisms of
Shaker potassium channel inactivation. Science 250: 533–8.
Hotopp, J. C., M. E. Clark, D. C. Oliveira, et al. 2007. Widespread lateral gene transfer from
intracellular bacteria to multicellular eukaryotes. Science 317: 1753–6.
House-Pompeo, K., Y. Xu, D. Joh, P. Speziale, and M. Hook. 1996. Conformational changes in
the fibronectin binding MSCRAMMs are induced by ligand binding. J. Biol. Chem. 271:
1379–84.
Howard, M. B., N. A. Ekborg, L. E. Taylor, S. W. Hutcheson, and R. M. Weiner. 2004. Identification
and analysis of polyserine linker domains in prokaryotic proteins with emphasis on the
marine bacterium Microbulbifer degradans. Protein Sci. 13: 1422–5.
Hoyt, M. A., J. Zich, J. Takeuchi, M. Zhang, C. Govaerts, and P. Coffino. 2006. Glycine-alanine
repeats impair proper substrate unfolding by the proteasome. EMBO J. 25: 1720–9.
Hua, Q. X., W. H. Jia, B. P. Bullock, J. F. Habener, and M. A. Weiss. 1998. Transcriptional acti-
vator–coactivator recognition: nascent folding of a kinase-inducible transactivation domain
predicts its structure on coactivator binding. Biochemistry 37: 5858–66.
Huang, H. D., J. T. Horng, F. M. Lin, Y. C. Chang, and C. C. Huang. 2005. SpliceInfo: an information
repository for mRNA alternative splicing in human genome. Nucleic Acids Res. 33: D80–5.
Hubbard, S. J., R. J. Beynon, and J. M. Thornton. 1998. Assessment of conformational param-
eters as predictors of limited proteolytic sites in native protein structures. Protein Eng. 11:
349–59.
Hubbard, S. J., F. Eisenmenger, and J. M. Thornton. 1994. Modeling studies of the change in con-
formation required for cleavage of limited proteolytic sites. Protein Sci. 3: 757–68.
Hubbell, W. L., C. Altenbach, C. M. Hubbell, and H. G. Khorana. 2003. Rhodopsin structure,
dynamics, and activation: a perspective from crystallography, site-directed spin labeling,
sulfhydryl reactivity, and disulfide cross-linking. Adv. Protein Chem. 63: 243–90.
Huber, A. H., D. B. Stewart, D. V. Laurents, W. J. Nelson, and W. I. Weis. 2001. The cadherin
cytoplasmic domain is unstructured in the absence of beta-catenin. A possible mechanism
for regulating cadherin turnover. J. Biol. Chem. 276: 12301–9.
Huber, A. H. and W. I. Weis. 2001. The structure of the beta-catenin/E-cadherin complex and the
molecular basis of diverse ligand recognition by beta-catenin. Cell 105: 391–402.
Hunter, T. 1987. A thousand and one protein kinases. Cell 50: 823–9.
Hurley, T. D., J. Yang, L. Zhang, et al. 2007. Structural basis for regulation of protein phosphatase
1 by inhibitor-2. J. Biol. Chem. 282: 28874–83.
Hurst, L. D. 2002. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet.
18: 486.
Huth, J. R., C. A. Bewley, M. S. Nissen, et al. 1997. The solution structure of an HMG-I(Y)-
DNA complex defines a new architectural minor groove binding motif. Nat. Struct. Biol.
4: 657–65.
Iakoucheva, L., C. Brown, J. Lawson, Z. Obradovic, and A. Dunker. 2002. Intrinsic disorder in
cell-signaling and cancer-associated proteins. J. Mol. Biol. 323: 573–84.
Iakoucheva, L. M., A. L. Kimzey, C. D. Masselon, et al. 2001a. Identification of intrinsic order
and disorder in the DNA repair protein XPA. Protein Sci. 10: 560–71.
Iakoucheva, L. M., A. L. Kimzey, C. D. Masselon, R. D. Smith, A. K. Dunker, and E. J. Ackerman.
2001b. Aberrant mobility phenomena of the DNA repair protein XPA. Protein Sci. 10:
1353–62.
Iakoucheva, L. M., P. Radivojac, C. J. Brown, et al. 2004. The importance of intrinsic disorder for
protein phosphorylation. Nucleic Acids Res. 32: 1037–49.
Ikemoto, N., B. Antoniu, J. J. Kang, L. G. Meszaros, and M. Ronjat. 1991. Intravesicular cal-
cium transient during calcium release from sarcoplasmic reticulum. Biochemistry 30:
5230–7.
Ikura, M. and J. B. Ames. 2006. Genetic polymorphism and protein conformational plasticity in
the calmodulin superfamily: two ways to promote multifunctionality. Proc. Natl. Acad. Sci.
USA 103: 1159–64.
Imarisio, S., J. Carmichael, V. Korolchuk, et al. 2008. Huntington’s disease: from pathology and
genetics to potential therapies. Biochem J. 412: 191–209.
Iqbal, K. and I. Grundke-Iqbal. 2008. Alzheimer neurofibrillary degeneration: significance, etio-
pathogenesis, therapeutics and prevention. J. Cell. Mol. Med. 12: 38–55.
Irar, S., E. Oliveira, M. Pages, and A. Goday. 2006. Towards the identification of late-embryo-
genic-abundant phosphoproteome in Arabidopsis by 2-DE and MS. Proteomics 6 Suppl 1:
S175–85.
Irobi, E., A. H. Aguda, M. Larsson, et al. 2004. Structural basis of actin sequestration by thy-
mosin-beta4: implications for WH2 proteins. EMBO J. 23: 3599–608.
Ishida, T. and K. Kinoshita. 2007. PrDOS: prediction of disordered protein regions from amino
acid sequence. Nucleic Acids Res. 35: W460–4.
Ishida, T. and K. Kinoshita. 2008. Prediction of disordered regions in proteins based on the meta
approach. Bioinformatics 24: 1344–8.
Ivanyi-Nagy, R., L. Davidovic, E. W. Khandjian, and J. L. Darlix. 2005. Disordered RNA chaper-
one proteins: from functions to disease. Cell. Mol. Life Sci. 62: 1409–17.
Ivanyi-Nagy, R., J. P. Lavergne, C. Gabus, D. Ficheux, and J. L. Darlix. 2007. RNA chaperoning
and intrinsic disorder in the core proteins of Flaviviridae. Nucleic Acids Res. 36:
712–25.
Iwai, A., E. Masliah, M. Yoshimoto, et al. 1995. The precursor protein of non-A beta compo-
nent of Alzheimer’s disease amyloid is a presynaptic protein of the central nervous system.
Neuron 14: 467–75.
Iwakuma, T. and G. Lozano. 2003. MDM2, an introduction. Mol. Cancer Res. 1: 993–1000.
Jackson, G. S., I. Murray, L. L. Hosszu, et al. 2001. Location and properties of metal-binding sites
on the human prion protein. Proc. Natl. Acad. Sci. USA 98: 8531–5.
Jackson, S. E. and A. R. Fersht. 1991. Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-
state transition. Biochemistry 30: 10428–35.
Jallepalli, P. V., I. C. Waizenegger, F. Bunz, et al. 2001. Securin is required for chromosomal sta-
bility in human cells. Cell 105: 445–57.
James, L. C., P. Roversi, and D. S. Tawfik. 2003. Antibody multispecificity mediated by confor-
mational diversity. Science 299: 1362–7.
James, L. C. and D. S. Tawfik. 2003. Conformational diversity and protein evolution—a 60-year-
old hypothesis revisited. Trends Biochem. Sci. 28: 361–8.
Jarrett, J. T. and P. T. Lansbury Jr. 1993. Seeding “one-dimensional crystallization” of amyloid: a
pathogenic mechanism in Alzheimer’s disease and scrapie? Cell 73: 1055–8.
Jeffery, C. J. 1999. Moonlighting proteins. Trends Biochem. Sci. 24: 8–11.
Jeffery, C. J. 2003a. Moonlighting proteins: old proteins learning new tricks. Trends Genet. 19:
415–7.
Jeffery, C. J. 2003b. Multifunctional proteins: examples of gene sharing. Ann. Med. 35: 28–35.
Jeffery, C. J. 2004. Molecular mechanisms for multitasking: recent crystal structures of moon-
lighting proteins. Curr. Opin. Struct. Biol. 14: 663–8.
Jeganathan, S., M. Von Bergen, H. Brutlach, H. J. Steinhoff, and E. Mandelkow. 2006. Global
hairpin folding of tau in solution. Biochemistry 45: 2283–93.
Jencks, W. P. 1981. On the attribution and additivity of binding energies. Proc. Natl. Acad. Sci.
USA 78: 4046–50.
Keskin, O., B. Ma, and R. Nussinov. 2005. Hot regions in protein–protein interactions: the
organization and contribution of structurally conserved hot spot residues. J. Mol. Biol. 345:
1281–94.
Khan, A. N. and P. N. Lewis. 2005. Unstructured conformations are a substrate requirement for
the Sir2 family of NAD-dependent protein deacetylases. J. Biol. Chem. 280: 36073–8.
Khaymina, S. S., J. M. Kenney, M. M. Schroeter, and J. M. Chalovich. 2007. Fesselin is a natively
unfolded protein. J. Proteome. Res. 6: 3648–54.
Kieburtz, K., M. MacDonald, C. Shih, et al. 1994. Trinucleotide repeat length and progression of
illness in Huntington’s disease. J. Med. Genet. 31: 872–4.
Kim, A. S., L. T. Kakalis, N. Abdul-Manan, G. A. Liu, and M. K. Rosen. 2000a. Autoinhibition
and activation mechanisms of the Wiskott–Aldrich syndrome protein. Nature 404: 151–8.
Kim, E., A. Magen, and G. Ast. 2007. Different levels of alternative splicing among eukaryotes.
Nucleic Acids Res. 35: 125–31.
Kim, P. S. and R. L. Baldwin. 1990. Intermediates in the folding reactions of small proteins. Annu.
Rev. Biochem. 59: 631–60.
Kim, T. D., H. J. Ryu, H. I. Cho, C. H. Yang, and J. Kim. 2000b. Thermal behavior of proteins:
heat-resistant proteins and their heat-induced secondary structural changes. Biochemistry
39: 14839–46.
King, R. W., M. Glotzer, and M. W. Kirschner. 1996. Mutagenic analysis of the destruction signal
of mitotic cyclins and structural characterization of ubiquitinated intermediates. Mol. Biol.
Cell 7: 1343–57.
Kirkitadze, M. D., M. M. Condron, and D. B. Teplow. 2001. Identification and characteriza-
tion of key kinetic intermediates in amyloid beta-protein fibrillogenesis. J. Mol. Biol. 312:
1103–19.
Kiss, R., Z. Bozoky, D. Kovacs, et al. 2008a. Calcium-induced tripartite binding of intrinsically
disordered calpastatin to its cognate enzyme, calpain. FEBS Lett. 582: 2149–54.
Kiss, R., D. Kovacs, P. Tompa, and A. Perczel. 2008b. Local structural preferences of calpastatin,
the intrinsically unstructured protein inhibitor of calpain. Biochemistry 47: 6936–45.
Kissinger, C. R., H. E. Parge, D. R. Knighton, et al. 1995. Crystal structures of human calcineurin
and the human FKBP12-FK506-calcineurin complex. Nature 378: 641–4.
Kitada, T., S. Asakawa, N. Hattori, et al. 1998. Mutations in the Parkin gene cause autosomal
recessive juvenile parkinsonism. Nature 392: 605–8.
Klein, C. and L. T. Vassilev. 2004. Targeting the p53–MDM2 interaction to treat cancer. Br. J.
Cancer 91: 1415–9.
Kleinschmidt, J. A., C. Dingwall, G. Maier, and W. W. Franke. 1986. Molecular characterization
of a karyophilic, histone-binding protein: cDNA cloning, amino acid sequence and expres-
sion of nuclear protein N1/N2 of Xenopus laevis. EMBO J. 5: 3547–52.
Koag, M. C., R. D. Fenton, S. Wilkens, and T. J. Close. 2003. The binding of maize DHN1 to lipid
vesicles. Gain of structure and lipid specificity. Plant Physiol. 131: 309–16.
Kohn, J. E., I. S. Millett, J. Jacob, et al. 2004. Random-coil behavior and the dimensions of chemi-
cally unfolded proteins. Proc. Natl. Acad. Sci. USA 101: 12491–6.
Konno, T., N. Tanaka, M. Kataoka, E. Takano, and M. Maki. 1997. A circular dichroism study of
preferential hydration and alcohol effects on a denatured protein, pig calpastatin domain I.
Biochim. Biophys. Acta. 1342: 73–82.
Kornberg, R. D. 2005. Mediator and the mechanism of transcriptional activation. Trends Biochem.
Sci. 30: 235–9.
Koshland, D. E. 1958. Application of a theory of enzyme specificity to protein synthesis. Proc.
Natl. Acad. Sci. USA 44: 98–104.
Kovacs, D., E. Kalmar, Z. Torok, and P. Tompa. 2008. Chaperone activity of ERD10 and ERD14,
two disordered stress-related plant proteins. Plant Physiol. 147: 381–90.
Kovacs, D., M. Rakacs, B. Agoston, et al. 2009. Janus chaperones: assistance of both RNA- and
protein-folding by ribosomal proteins. FEBS Lett. 583: 88–92.
Kovacs, G. G., L. Laszlo, J. Kovacs, et al. 2004. Natively unfolded tubulin polymerization
promoting protein TPPP/p25 is a common marker of alpha-synucleinopathies. Neurobiol. Dis.
17: 155–62.
Kreil, D. P. and G. Kreil. 2000. Asparagine repeats are rare in mammalian proteins. Trends
Biochem. Sci. 25: 270–1.
Kreimer, D. I., R. Szosenfogel, D. Goldfarb, I. Silman, and L. Weiner. 1994. Two-state transition
between molten globule and unfolded states of acetylcholinesterase as monitored by elec-
tron paramagnetic resonance spectroscopy. Proc. Natl. Acad. Sci. USA 91: 12145–9.
Kretsinger, R. H. and C. E. Nockolds. 1973. Carp muscle calcium-binding protein. II. Structure
determination and general description. J. Biol. Chem. 248: 3313–26.
Krimm, S. and J. Bandekar. 1986. Vibrational spectroscopy and conformation of peptides, poly-
peptides, and proteins. Adv. Protein Chem. 38: 181–364.
Kriwacki, R. W., L. Hengst, L. Tennant, S. I. Reed, and P. E. Wright. 1996. Structural studies of
p21Waf1/Cip1/Sdi1 in the free and Cdk2-bound state: conformational disorder mediates
binding diversity. Proc. Natl. Acad. Sci. USA 93: 11504–9.
Kriwacki, R. W., J. Wu, L. Tennant, P. E. Wright, and G. Siuzdak. 1997. Probing protein structure
using biochemical and biophysical methods. Proteolysis, matrix-assisted laser desorption/
ionization mass spectrometry, high-performance liquid chromatography and size-exclusion
chromatography of p21Waf1/Cip1/Sdi1. J. Chromatogr. A 777: 23–30.
Kumar, N., S. Shukla, S. Kumar, et al. 2008. Intrinsically disordered protein from a pathogenic
mesophile Mycobacterium tuberculosis adopts structured conformation at high
temperature. Proteins 71: 1123–33.
Kumar, R., R. Betney, J. Li, E. B. Thompson, and I. J. McEwan. 2004. Induced alpha-helix
structure in AF1 of the androgen receptor upon binding transcription factor TFIIF. Biochemistry
43: 3008–13.
Kumar, R., S. R. Pavithra, and U. Tatu. 2007. Three-dimensional structure of heat shock protein
90 from Plasmodium falciparum: molecular modelling approach to rational drug design
against malaria. J. Biosci. 32: 531–6.
Kurdistani, S. K. and M. Grunstein. 2003. Histone acetylation and deacetylation in yeast. Nat. Rev.
Mol. Cell Biol. 4: 276–84.
Kussie, P. H., S. Gorina, V. Marechal, et al. 1996. Structure of the MDM2 oncoprotein bound to
the p53 tumor suppressor transactivation domain. Science 274: 948–53.
Kyte, J. and R. F. Doolittle. 1982. A simple method for displaying the hydropathic character of a
protein. J. Mol. Biol. 157: 105–32.
Labeit, S. and B. Kolmerer. 1995. Titins: giant proteins in charge of muscle ultrastructure and
elasticity. Science 270: 293–6.
Lacy, E. R., I. Filippov, W. S. Lewis, et al. 2004. p27 binds cyclin-CDK complexes through a sequen-
tial mechanism involving binding-induced protein folding. Nat. Struct. Mol. Biol. 11: 358–64.
Laemmli, U. K. 1970. Cleavage of structural proteins during the assembly of the head of bacte-
riophage T4. Nature 227: 680–5.
Lagerstrom, M. C. and H. B. Schioth. 2008. Structural diversity of G protein-coupled receptors
and significance for drug discovery. Nat. Rev. Drug Discov. 7: 339–57.
Lakowicz, J. R. 2006. Principles of Fluorescence Spectroscopy, 3rd ed. New York: Springer.
Lane, D. P. 1992. Cancer. p53, guardian of the genome. Nature 358: 15–6.
Langowski, J., W. Kremer, and U. Kapp. 1992. Dynamic light scattering for study of solution
conformation and dynamics of superhelical DNA. Methods Enzymol. 211: 430–48.
Lariviere, L., S. Geiger, S. Hoeppner, S. Rother, K. Strasser, and P. Cramer. 2006. Structure and
TBP binding of the mediator head subcomplex Med8-Med18-Med20. Nat. Struct. Mol.
Biol. 13: 895–901.
Lashuel, H. A., C. Wurth, L. Woo, and J. W. Kelly. 1999. The most pathogenic transthyretin vari-
ant, L55P, forms amyloid fibrils under acidic conditions and protofilaments under physi-
ological conditions. Biochemistry 38: 13560–73.
Le Gall, T., P. R. Romero, M. S. Cortese, V. N. Uversky, and A. K. Dunker. 2007. Intrinsic disor-
der in the protein data bank. J. Biomol. Struct. Dyn. 24: 325–42.
Lebowitz, J., M. S. Lewis, and P. Schuck. 2002. Modern analytical ultracentrifugation in protein
science: a tutorial review. Protein Sci. 11: 2067–79.
Lee, A. S., C. Galea, E. L. DiGiammarino, et al. 2003. Reversible amyloid formation by the p53
tetramerization domain and a cancer-associated mutant. J. Mol. Biol. 327: 699–709.
Lee, H., K. H. Mok, R. Muhandiram, et al. 2000. Local structural elements in the mostly unstruc-
tured transcriptional activation domain of human p53. J. Biol. Chem. 275: 29426–32.
Lee, H. J., C. Choi, and S. J. Lee. 2002. Membrane-bound alpha-synuclein has a high aggrega-
tion propensity and the ability to seed the aggregation of the cytosolic form. J. Biol. Chem.
277: 671–8.
Lee, L., E. Stollar, J. Chang, et al. 2001. Expression of the Oct-1 transcription factor and charac-
terization of its interactions with the Bob1 coactivator. Biochemistry 40: 6580–8.
Lee, M. K. and D. W. Cleveland. 1996. Neuronal intermediate filaments. Annu. Rev. Neurosci.
19: 187–217.
Legault, P., J. Li, J. Mogridge, L. E. Kay, and J. Greenblatt. 1998. NMR structure of the bacterio-
phage lambda N peptide/boxB RNA complex: recognition of a GNRA fold by an arginine-
rich motif. Cell 93: 289–99.
Legname, G., I. V. Baskakov, H. O. Nguyen, et al. 2004. Synthetic mammalian prions. Science
305: 673–6.
Legname, G., S. J. DeArmond, F. Cohen, and S. B. Prusiner. 2007. Pathogenesis of prion diseases.
In Protein Misfolding, Aggregation and Conformational Diseases. New York: Springer.
Lehn, D. A., T. S. Elton, K. R. Johnson, and R. Reeves. 1988. A conformational study of the
sequence specific binding of HMG-I (Y) with the bovine interleukin-2 cDNA. Biochem Int.
16: 963–71.
Leismann, O., A. Herzig, S. Heidmann, and C. F. Lehner. 2000. Degradation of Drosophila PIM
regulates sister chromatid separation during mitosis. Genes Dev. 14: 2192–205.
Letunic, I., L. Goodstadt, N. J. Dickens, et al. 2002. Recent improvements to the SMART
domain-based sequence annotation resource. Nucleic Acids Res. 30: 242–4.
Levine, A. J. 1997. p53, the cellular gatekeeper for growth and division. Cell 88: 323–31.
Levinthal, C. 1969. How to Fold Graciously. In Mössbauer Spectroscopy in Biological Systems
(eds DeBrunner, J. T. P and E. Munck) pp. 22–24.
Levy, Y., J. N. Onuchic, and P. G. Wolynes. 2007. Fly-casting in protein-DNA binding: frustration
between protein folding and electrostatics facilitates target recognition. J. Am. Chem. Soc.
129: 738–9.
Li, H., A. F. Oberhauser, S. D. Redick, M. Carrion-Vazquez, H. P. Erickson, and J. M. Fernandez.
2001. Multiple conformations of PEVK proteins detected by single-molecule techniques.
Proc. Natl. Acad. Sci. USA 98: 10682–6.
Li, L. and S. Lindquist. 2000. Creating a protein-based element of inheritance. Science 287:
661–4.
Li, L., V. N. Uversky, A. K. Dunker, and S. O. Meroueh. 2007. A computational investigation of
allostery in the catabolite activator protein. J. Am. Chem. Soc. 129: 15668–76.
Li, M. and J. Song. 2007. The N- and C-termini of the human Nogo molecules are intrinsi-
cally unstructured: bioinformatics, CD, NMR characterization, and functional implications.
Proteins 68: 100–8.
Li, X., Z. Obradovic, C. J. Brown, E. Garner, and A. K. Dunker. 2000. Comparing predictors of
disordered protein. Genome Inform. Ser. Workshop Genome Inform. 11: 172–84.
Li, X., P. Romero, M. Rani, A. K. Dunker, and Z. Obradovic. 1999. Predicting protein disorder for
N-, C-, and internal regions. Genome Inform. Ser. Workshop Genome Inform. 10: 30–40.
Liao, J., Y. Fu, and K. Shuai. 2000. Distinct roles of the NH2- and COOH-terminal domains of the
protein inhibitor of activated signal transducer and activator of transcription (STAT) 1 (PIAS1)
in cytokine-induced PIAS1-Stat1 interaction. Proc. Natl. Acad. Sci. USA 97: 5267–72.
Libich, D. S. and G. Harauz. 2008. Backbone dynamics of the 18.5 kDa isoform of myelin
basic protein reveals transient alpha-helices and a calmodulin-binding site. Biophys J. 94:
4847–66.
Licht, J. D. 2001. AML1 and the AML1-ETO fusion protein in the pathogenesis of t(8;21) AML.
Oncogene 20: 5660–79.
Liebovitch, L. S., L. Y. Selector, and R. P. Kline. 1992. Statistical properties predicted by the ball
and chain model of channel inactivation. Biophys J. 63: 1579–85.
Lieutaud, P., B. Canard, and S. Longhi. 2008. MeDor: a metaserver for predicting protein disor-
der. BMC Genomics 9 Suppl 2: S25.
Lim, R. Y., N. P. Huang, J. Koser, et al. 2006. Flexible phenylalanine-glycine nucleoporins as
entropic barriers to nucleocytoplasmic transport. Proc. Natl. Acad. Sci. USA 103: 9512–7.
Linding, R., L. J. Jensen, F. Diella, P. Bork, T. J. Gibson, and R. B. Russell. 2003a. Protein disor-
der prediction: implications for structural proteomics. Structure 11: 1453–9.
Linding, R., R. B. Russell, V. Neduva, and T. J. Gibson. 2003b. GlobPlot: exploring protein
sequences for globularity and disorder. Nucleic Acids Res. 31: 3701–8.
Linding, R., J. Schymkowitz, F. Rousseau, F. Diella, and L. Serrano. 2004. A comparative study of
the relationship between protein structure and beta-aggregation in globular and intrinsically
disordered proteins. J. Mol. Biol. 342: 345–53.
Lindner, R. A., J. A. Carver, M. Ehrnsperger, et al. 2000. Mouse Hsp25, a small heat shock protein. The
role of its C-terminal extension in oligomerization and chaperone action. Eur. J. Biochem.
267: 1923–32.
Lindner, R. A., A. Kapur, M. Mariani, S. J. Titmuss, and J. A. Carver. 1998. Structural alterations
of alpha-crystallin during its chaperone action. Eur. J. Biochem. 258: 170–83.
Lindquist, S. 1997. Mad cows meet psi-chotic yeast: the expansion of the prion hypothesis. Cell
89: 495–8.
Lipari, G. and A. Szabo. 1982. Model-free approach to the interpretation of nuclear magnetic
resonance relaxation in macromolecules 1. Theory and range of validity. J. Am. Chem. Soc.
104: 4546–59.
Lippens, G., A. Sillen, C. Smet, et al. 2006. Studying the natively unfolded neuronal tau protein
by solution NMR spectroscopy. Protein Pept. Lett. 13: 235–46.
Lippens, G., J. M. Wieruszeski, A. Leroy, et al. 2004. Proline-directed random-coil chemical shift
values as a tool for the NMR assignment of the tau phosphorylation sites. Chembiochem
5: 73–8.
Lise, S. and D. T. Jones. 2005. Sequence patterns associated with disordered regions in proteins.
Proteins 58: 144–50.
Lisse, T., D. Bartels, H. R. Kalbitzer, and R. Jaenicke. 1996. The recombinant dehydrin-like desic-
cation stress protein from the resurrection plant Craterostigma plantagineum displays no
defined three-dimensional structure in its native state. Biol. Chem. 377: 555–61.
Litingtung, Y., A. M. Lawler, S. M. Sebald, et al. 1999. Growth retardation and neonatal lethality
in mice with a homozygous deletion in the C-terminal domain of RNA polymerase II. Mol.
Gen. Genet. 261: 100–5.
Litvinovich, S. V., S. A. Brew, S. Aota, S. K. Akiyama, C. Haudenschild, and K. C. Ingham. 1998.
Formation of amyloid-like fibrils by self-association of a partially unfolded fibronectin type
III module. J. Mol. Biol. 280: 245–58.
Liu, C. W., M. J. Corboy, G. N. Demartino, and P. J. Thomas. 2003. Endoproteolytic activity of
the proteasome. Science 299: 408–11.
Liu, J., N. B. Perumal, C. J. Oldfield, E. W. Su, V. N. Uversky, and A. K. Dunker. 2006a. Intrinsic
disorder in transcription factors. Biochemistry 45: 6873–88.
Liu, J. and B. Rost. 2003. NORSp: Predictions of long regions without regular secondary
structure. Nucleic Acids Res. 31: 3833–5.
Liu, J., H. Tan, and B. Rost. 2002. Loopy proteins appear conserved in evolution. J. Mol. Biol.
322: 53–64.
Liu, J., Y. Xing, T. R. Hinds, J. Zheng, and W. Xu. 2006b. The third 20 amino acid repeat is the
tightest binding site of APC for beta-catenin. J. Mol. Biol. 360: 133–44.
Lo Conte, L., C. Chothia, and J. Janin. 1999. The atomic structure of protein–protein recognition
sites. J. Mol. Biol. 285: 2177–98.
Lobley, A., M. B. Swindells, C. A. Orengo, and D. T. Jones. 2007. Inferring function using pat-
terns of native disorder in proteins. PLoS Comput. Biol. 3: e162.
Loftus, S. R., D. Walker, M. J. Mate, et al. 2006. Competitive recruitment of the periplasmic
translocation portal TolB by a natively disordered domain of colicin E9. Proc. Natl. Acad.
Sci. USA 103: 12353–8.
Lohrum, M. A., R. L. Ludwig, M. H. Kubbutat, M. Hanlon, and K. H. Vousden. 2003. Regulation
of HDM2 activity by the ribosomal protein L11. Cancer Cell 3: 577–87.
Longhi, S., V. Receveur-Brechot, D. Karlin, et al. 2003. The C-terminal domain of the measles
virus nucleoprotein is intrinsically disordered and folds upon binding to the C-terminal
moiety of the phosphoprotein. J. Biol. Chem. 278: 18638–48.
Lopez-Garcia, F., R. Zahn, R. Riek, and K. Wuthrich. 2000. NMR structure of the bovine prion
protein. Proc. Natl. Acad. Sci. USA 97: 8334–9.
Lorsch, J. R. 2002. RNA chaperones exist and DEAD box proteins get a life. Cell 109:
797–800.
Love, J. J., X. Li, J. Chung, H. J. Dyson, and P. E. Wright. 2004. The LEF-1 high-mobility group
domain undergoes a disorder-to-order transition upon formation of a complex with cognate
DNA. Biochemistry 43: 8725–34.
Lowry, D. F., A. C. Hausrath, and G. W. Daughdrill. 2008a. A robust approach for analyzing a
heterogeneous structural ensemble. Proteins 73: 918–28.
Lowry, D. F., A. Stancik, R. M. Shrestha, and G. W. Daughdrill. 2008b. Modeling the accessible
conformations of the intrinsically unstructured transactivation domain of p53. Proteins 71:
587–98.
Lu, X. and J. C. Hansen. 2004. Identification of specific functional subdomains within the linker
histone H10 C-terminal domain. J. Biol. Chem. 279: 8701–7.
Lu, Y. and A. Bennick. 1998. Interaction of tannin with human salivary proline-rich proteins.
Arch. Oral Biol. 43: 717–28.
Luger, K., A. W. Mader, R. K. Richmond, D. F. Sargent, and T. J. Richmond. 1997. Crystal
structure of the nucleosome core particle at 2.8 Å resolution. Nature 389: 251–60.
Luo, Y., J. Hurwitz, and J. Massague. 1995. Cell-cycle inhibition by independent CDK and PCNA
binding domains in p21Cip1. Nature 375: 159–61.
Lynch, W. P., V. M. Riseman, and A. Bretscher. 1987. Smooth muscle caldesmon is an extended
flexible monomeric protein in solution that can readily undergo reversible intra- and inter-
molecular sulfhydryl cross-linking. A mechanism for caldesmon’s F-actin bundling activ-
ity. J. Biol. Chem. 262: 7429–37.
Ma, H., H. Q. Yang, E. Takano, M. Hatanaka, and M. Maki. 1994. Amino-terminal conserved
region in proteinase inhibitor domain of calpastatin potentiates its calpain inhibitory activ-
ity by interacting with calmodulin-like domain of the proteinase. J. Biol. Chem. 269:
24430–6.
Ma, H., H. Q. Yang, E. Takano, W. J. Lee, M. Hatanaka, and M. Maki. 1993. Requirement of dif-
ferent subdomains of calpastatin for calpain inhibition and for binding to calmodulin-like
domains. J. Biochem. (Tokyo) 113: 591–9.
Ma, J. 2000. Stimulatory and inhibitory functions of the R domain on CFTR chloride channel.
News Physiol. Sci. 15: 154–8.
Ma, K., L. Kan, and K. Wang. 2001. Polyproline II helix is a key structural motif of the elastic
PEVK segment of titin. Biochemistry 40: 3427–38.
Ma, K. and K. Wang. 2003. Malleable conformation of the elastic PEVK segment of titin: non-
co-operative interconversion of polyproline II helix, beta-turn and unordered structures.
Biochem J. 374: 687–95.
McEwan, I. J., D. Lavery, K. Fischer, and K. Watt. 2007. Natural disordered sequences in the
amino terminal domain of nuclear receptors: lessons from the androgen and glucocorticoid
receptors. Nucl. Recept. Signal. 5: e001.
McMeekin, T. L. 1952. Milk proteins. J. Food Protect. 15: 57–63.
McNulty, B. C., G. B. Young, and G. J. Pielak. 2006. Macromolecular crowding in the Escherichia
coli periplasm maintains alpha-synuclein disorder. J. Mol. Biol. 355: 893–7.
McPhie, P., Y. S. Ni, and A. P. Minton. 2006. Macromolecular crowding stabilizes the molten
globule form of apomyoglobin with respect to both cold and heat unfolding. J. Mol. Biol.
361: 7–10.
Megidish, T., J. H. Xu, and C. W. Xu. 2002. Activation of p53 by protein inhibitor of activated
Stat1 (PIAS1). J. Biol. Chem. 277: 8255–9.
Meinhart, A. and P. Cramer. 2004. Recognition of RNA polymerase II carboxy-terminal domain
by 3′-RNA-processing factors. Nature 430: 223–6.
Meininghaus, M., R. D. Chapman, M. Horndasch, and D. Eick. 2000. Conditional expression of
RNA polymerase II in mammalian cells. Deletion of the carboxyl-terminal domain of the
large subunit affects early steps in transcription. J. Biol. Chem. 275: 24375–82.
Melamud, E. and J. Moult. 2003. Evaluation of disorder predictions in CASP5. Proteins 53 Suppl
6: 561–5.
Meszaros, B., P. Tompa, I. Simon, and Z. Dosztanyi. 2007. Molecular principles of the interac-
tions of disordered proteins. J. Mol. Biol. 372: 549–61.
Michalet, X., S. Weiss, and M. Jager. 2006. Single-molecule fluorescence studies of protein fold-
ing and conformational dynamics. Chem. Rev. 106: 1785–813.
Miki, Y., J. Swensen, D. Shattuck-Eidens, et al. 1994. A strong candidate for the breast and ovar-
ian cancer susceptibility gene BRCA1. Science 266: 66–71.
Minezaki, Y., K. Homma, A. R. Kinjo, and K. Nishikawa. 2006. Human transcription factors
contain a high fraction of intrinsically disordered regions essential for transcriptional regu-
lation. J. Mol. Biol. 359: 1137–49.
Minezaki, Y., K. Homma, and K. Nishikawa. 2007. Intrinsically disordered regions of human
plasma membrane proteins preferentially occur in the cytoplasmic segment. J. Mol. Biol.
368: 902–13.
Minor, D. L. Jr. and P. S. Kim. 1996. Context-dependent secondary structure formation of a
designed protein sequence. Nature 380: 730–4.
Minton, A. P. 2005. Models for excluded volume interaction between an unfolded protein and
rigid macromolecular cosolutes: macromolecular crowding and protein stability revisited.
Biophys J. 88: 971–85.
Mirsky, A. E. and L. Pauling. 1936. On the structure of native, denatured, and coagulated proteins.
Proc. Natl. Acad. Sci. USA 22: 439–47.
Mittag, T., S. Orlicky, W. Y. Choy, et al. 2008. Dynamic equilibrium engagement of a polyvalent
ligand with a single-site receptor. Proc. Natl. Acad. Sci. USA 105: 17772–7.
Mohan, A., C. J. Oldfield, P. Radivojac, et al. 2006. Analysis of molecular recognition features
(MoRFs). J. Mol. Biol. 362: 1043–59.
Mohan, A., W. J. Sullivan Jr., P. Radivojac, A. K. Dunker, and V. N. Uversky. 2008. Intrinsic disor-
der in pathogenic and non-pathogenic microbes: discovering and analyzing the unfoldomes
of early-branching eukaryotes. Mol. Biosyst. 4: 328–40.
Moldoveanu, T., K. Gehring, and D. R. Green. 2008. Concerted multi-pronged attack by
calpastatin to occlude the catalytic cleft of heterodimeric calpains. Nature 456: 404–8.
Momma, M., S. Kaneko, K. Haraguchi, and U. Matsukura. 2003. Peptide mapping and assessment
of cryoprotective activity of 26/27-kDa dehydrin from soybean seeds. Biosci. Biotechnol.
Biochem. 67: 1832–5.
Moncoq, K., I. Broutin, C. T. Craescu, P. Vachette, A. Ducruix, and D. Durand. 2004. SAXS study
of the PIR domain from the Grb14 molecular adaptor: a natively unfolded protein with a
transient structure primer? Biophys J. 87: 4056–64.
Moncoq, K., I. Broutin, V. Larue, et al. 2003. The PIR domain of Grb14 is an intrinsically unstruc-
tured protein: implication in insulin signaling. FEBS Lett. 554: 240–6.
Monsellier, E. and F. Chiti. 2007. Prevention of amyloid-like aggregation as a driving force of
protein evolution. EMBO Rep. 8: 737–42.
Morar, A. S., A. Olteanu, G. B. Young, and G. J. Pielak. 2001. Solvent-induced collapse of alpha-
synuclein and acid-denatured cytochrome c. Protein Sci. 10: 2195–9.
Morellet, N., N. Jullian, H. De Rocquigny, B. Maigret, J. L. Darlix, and B. P. Roques. 1992.
Determination of the structure of the nucleocapsid protein NCp7 from the human immuno-
deficiency virus type 1 by 1H NMR. EMBO J. 11: 3059–65.
Morin, B., J. M. Bourhis, V. Belle, et al. 2006. Assessing induced folding of an intrinsically disor-
dered protein by site-directed spin-labeling electron paramagnetic resonance spectroscopy.
J. Phys. Chem. B 110: 20596–608.
Mouillon, J. M., P. Gustafsson, and P. Harryson. 2006. Structural investigation of disordered stress
proteins. Comparison of full-length dehydrins with isolated peptides of their conserved seg-
ments. Plant Physiol. 141: 638–50.
Muenzer, J., C. Bildstein, M. Gleason, and D. M. Carlson. 1979. Properties of proline-rich pro-
teins from parotid glands of isoproterenol-treated rats. J. Biol. Chem. 254: 5629–34.
Mujtaba, S., Y. He, L. Zeng, et al. 2004. Structural mechanism of the bromodomain of the coacti-
vator CBP in p53 transcriptional activation. Mol. Cell 13: 251–63.
Mukhopadhyay, R. and J. H. Hoh. 2001. AFM force measurements on microtubule-associated pro-
teins: the projection domain exerts a long-range repulsive force. FEBS Lett. 505: 374–8.
Mukhopadhyay, S., R. Krishnan, E. A. Lemke, S. Lindquist, and A. A. Deniz. 2007. A natively
unfolded yeast prion monomer adopts an ensemble of collapsed and rapidly fluctuating
structures. Proc. Natl. Acad. Sci. USA 104: 2649–54.
Mukrasch, M. D., J. Biernat, M. Von Bergen, C. Griesinger, E. Mandelkow, and M. Zweckstetter.
2005. Sites of tau important for aggregation populate (beta)-structure and bind to microtu-
bules and polyanions. J. Biol. Chem. 280: 24978–86.
Mukrasch, M. D., P. Markwick, J. Biernat, et al. 2007a. Highly populated turn conformations
in natively unfolded tau protein identified from residual dipolar couplings and molecular
simulation. J. Am. Chem. Soc. 129: 5235–43.
Mukrasch, M. D., M. Von Bergen, J. Biernat, et al. 2007b. The “jaws” of the tau-microtubule
interaction. J. Biol. Chem. 282: 12230–9.
Mulder, F. A., L. Bouakaz, A. Lundell, et al. 2004. Conformation and dynamics of ribosomal stalk
protein L12 in solution and on the ribosome. Biochemistry 43: 5930–6.
Muro-Pastor, M. I., F. N. Barrera, J. C. Reyes, F. J. Florencio, and J. L. Neira. 2003. The inactivat-
ing factor of glutamine synthetase, IF7, is a “natively unfolded” protein. Protein Sci. 12:
1443–54.
Murray, A. W. 2004. Recycling the cell cycle: cyclins revisited. Cell 116: 221–34.
Murzin, A. G., S. E. Brenner, T. Hubbard, and C. Chothia. 1995. SCOP: a structural classification
of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247:
536–40.
Myers, L. C., C. M. Gustafsson, K. C. Hayashibara, P. O. Brown, and R. D. Kornberg. 1999.
Mediator protein mutations that selectively abolish activated transcription. Proc. Natl.
Acad. Sci. USA 96: 67–72.
Nagao, K., Y. Adachi, and M. Yanagida. 2004. Separase-mediated cleavage of cohesin at inter-
phase is required for DNA repair. Nature 430: 1044–8.
Nagao, K. and M. Yanagida. 2006. Securin can have a separase cleavage site by substitution
mutations in the domain required for stabilization and inhibition of separase. Genes Cells
11: 247–60.
Nallamshetty, S., M. Crook, M. Boehm, T. Yoshimoto, M. Olive, and E. G. Nabel. 2005. The
cell cycle regulator p27Kip1 interacts with MCM7, a DNA replication licensing factor, to
inhibit initiation of DNA replication. FEBS Lett. 579: 6529–36.
Nash, P., X. Tang, S. Orlicky, et al. 2001. Multisite phosphorylation of a CDK inhibitor sets a
threshold for the onset of DNA replication. Nature 414: 514–21.
Nasir, J., S. B. Floresco, J. R. O’Kusky, et al. 1995. Targeted disruption of the Huntington’s
disease gene results in embryonic lethality and behavioral and morphological changes in
heterozygotes. Cell 81: 811–23.
Neduva, V. and R. B. Russell. 2005. Linear motifs: evolutionary interaction switches. FEBS Lett.
579: 3342–5.
Neduva, V. and R. B. Russell. 2006. DILIMOT: discovery of linear motifs in proteins. Nucleic
Acids Res. 34: W350–5.
Nelson, R., M. R. Sawaya, M. Balbirnie, et al. 2005. Structure of the cross-beta spine of amyloid-
like fibrils. Nature 435: 773–8.
Neri, D., M. Billeter, G. Wider, and K. Wuthrich. 1992. NMR determination of residual structure
in a urea-denatured protein, the 434-repressor. Science 257: 1559–63.
Neyroz, P., B. Zambelli, and S. Ciurli. 2006. Intrinsically disordered structure of Bacillus pas-
teurii UreG as revealed by steady-state and time-resolved fluorescence spectroscopy.
Biochemistry 45: 8918–30.
Ng, K. P., G. Potikyan, R. O. Savene, C. T. Denny, V. N. Uversky, and K. A. Lee. 2007. Multiple
aromatic side chains within a disordered structure are critical for transcription and trans-
forming activity of EWS family oncoproteins. Proc. Natl. Acad. Sci. USA 104: 479–84.
Nguyen, A. W. and P. S. Daugherty. 2005. Evolutionary optimization of fluorescent proteins for
intracellular FRET. Nat. Biotechnol. 23: 355–60.
Nicholls, C. D., K. G. Mclure, M. A. Shields, and P. W. Lee. 2002. Biogenesis of p53 involves
cotranslational dimerization of monomers and posttranslational dimerization of dimers.
Implications on the dominant negative effect. J. Biol. Chem. 277: 12937–45.
Nimmo, G. A. and P. Cohen. 1978. The regulation of glycogen metabolism. Purification and char-
acterisation of protein phosphatase inhibitor-1 from rabbit skeletal muscle. Eur. J. Biochem.
87: 341–51.
Nishimura, M., T. Yoshida, M. Shirouzu, et al. 2004. Solution structure of ribosomal protein L16
from Thermus thermophilus HB8. J. Mol. Biol. 344: 1369–83.
Nonet, M., D. Sweetser, and R. A. Young. 1987. Functional redundancy and structural polymor-
phism in the large subunit of RNA polymerase II. Cell 50: 909–15.
Nooren, I. M. and J. M. Thornton. 2003. Diversity of protein–protein interactions. EMBO J. 22:
3486–92.
Nyarko, A., M. Hare, T. S. Hays, and E. Barbar. 2004. The intermediate chain of cytoplas-
mic dynein is partially disordered and gains structure upon binding to light-chain LC8.
Biochemistry 43: 15595–603.
Obenauer, J. C., L. C. Cantley, and M. B. Yaffe. 2003. Scansite 2.0: Proteome-wide
prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res. 31:
3635–41.
Obradovic, Z., K. Peng, S. Vucetic, P. Radivojac, C. J. Brown, and A. K. Dunker. 2003. Predicting
intrinsic disorder from amino acid sequence. Proteins 53 Suppl 6: 566–72.
Obradovic, Z., K. Peng, S. Vucetic, P. Radivojac, and A. K. Dunker. 2005. Exploiting heteroge-
neous sequence properties improves prediction of protein disorder. Proteins 61 Suppl 7:
176–82.
Ohashi, T., S. D. Galiacy, G. Briscoe, and H. P. Erickson. 2007. An experimental study of
GFP-based FRET, with application to intrinsically unstructured proteins. Protein Sci. 16:
1429–38.
Ohashi, T., C. A. Hale, P. A. De Boer, and H. P. Erickson. 2002. Structural evidence that the
P/Q domain of ZipA is an unstructured, flexible tether between the membrane and the
C-terminal FtsZ-binding domain. J. Bacteriol. 184: 4313–5.
Ohnishi, S., A. L. Lee, M. H. Edgell, and D. Shortle. 2004. Direct demonstration of structural
similarity between native and denatured eglin C. Biochemistry 43: 4064–70.
Ohno, S. 1984. Repeats of base oligomers as the primordial coding sequences of the primeval
earth and their vestiges in modern genes. J. Mol. Evol. 20: 313–21.
Ohno, S. 1987. Early genes that were oligomeric repeats generated a number of divergent domains
on their own. Proc. Natl. Acad. Sci. USA 84: 6486–90.
Ojala, P. M., K. Yamamoto, E. Castanos-Velez, P. Biberfeld, S. J. Korsmeyer, and T. P. Makela.
2000. The apoptotic v-cyclin-CDK6 complex phosphorylates and inactivates Bcl-2. Nat.
Cell Biol. 2: 819–25.
Oldfield, C. J., Y. Cheng, M. S. Cortese, C. J. Brown, V. N. Uversky, and A. K. Dunker. 2005a.
Comparing and combining predictors of mostly disordered proteins. Biochemistry 44:
1989–2000.
Oldfield, C. J., Y. Cheng, M. S. Cortese, P. Romero, V. N. Uversky, and A. K. Dunker. 2005b.
Coupled folding and binding with alpha-helix-forming molecular recognition elements.
Biochemistry 44: 12454–70.
Oldfield, C. J., J. Meng, J. Y. Yang, M. Q. Yang, V. N. Uversky, and A. K. Dunker. 2008. Flexible
nets: disorder and induced fit in the associations of p53 and 14-3-3 with their partners. BMC
Genomics 9 (Suppl. 1): S1.
Oldfield, C. J., E. L. Ulrich, Y. Cheng, A. K. Dunker, and J. L. Markley. 2005c. Addressing the
intrinsic disorder bottleneck in structural proteomics. Proteins 59: 444–53.
Olson, K. E., P. Narayanaswami, P. D. Vise, D. F. Lowry, M. S. Wold, and G. W. Daughdrill. 2005.
Secondary structure and dynamics of an intrinsically unstructured linker domain. J. Biomol.
Struct. Dyn. 23: 113–24.
Orengo, C. A., A. D. Michie, S. Jones, D. T. Jones, M. B. Swindells, and J. M. Thornton. 1997.
CATH—a hierarchic classification of protein domain structures. Structure 5: 1093–108.
Orphanides, G. and D. Reinberg. 2002. A unified theory of gene expression. Cell 108: 439–51.
Ostedgaard, L. S., O. Baldursson, D. W. Vermeer, M. J. Welsh, and A. D. Robertson. 2000. A
functional R domain from cystic fibrosis transmembrane conductance regulator is predomi-
nantly unstructured in solution. Proc. Natl. Acad. Sci. USA 97: 5657–62.
Otzen, D. E., L. S. Itzhaki, N. F. Elmasry, S. E. Jackson, and A. R. Fersht. 1994. Structure
of the transition state for the folding/unfolding of the barley chymotrypsin inhibitor 2
and its implications for mechanisms of protein folding. Proc. Natl. Acad. Sci. USA 91:
10422–5.
Overall, C. M. and G. S. Butler. 2007. Protease yoga: extreme flexibility of a matrix metallopro-
teinase. Structure 15: 1159–61.
Page, R., W. Peti, I. A. Wilson, R. C. Stevens, and K. Wuthrich. 2005. NMR screening and crystal
quality of bacterially expressed prokaryotic and eukaryotic proteins in a structural genom-
ics pipeline. Proc. Natl. Acad. Sci. USA 102: 1901–5.
Palmer, M. S. and J. Collinge. 1993. Mutations and polymorphisms in the prion protein gene.
Hum. Mutat. 2: 168–73.
Pan, H., G. Barany, and C. Woodward. 1997. Reduced BPTI is collapsed. A pulsed field gradient
NMR study of unfolded and partially folded bovine pancreatic trypsin inhibitor. Protein
Sci. 6: 1985–92.
Pan, K. M., M. Baldwin, J. Nguyen, et al. 1993. Conversion of alpha-helices into beta-sheets
features in the formation of the scrapie prion proteins. Proc. Natl. Acad. Sci. USA 90:
10962–6.
Panchal, S. C., D. A. Kaiser, E. Torres, T. D. Pollard, and M. K. Rosen. 2003. A conserved amphip-
athic helix in WASP/Scar proteins is essential for activation of Arp2/3 complex. Nat. Struct.
Biol. 10: 591–8.
Panetti, T. S. 2002. Tyrosine phosphorylation of paxillin, FAK, and p130CAS: effects on cell
spreading and migration. Front. Biosci. 7: d143–50.
Pantazatos, D., J. S. Kim, H. E. Klock, et al. 2004. Rapid refinement of crystallographic protein
construct definition employing enhanced hydrogen/deuterium exchange MS. Proc. Natl.
Acad. Sci. USA 101: 751–6.
Rape, M. and S. Jentsch. 2002. Taking a bite: proteasomal protein processing. Nat. Cell Biol. 4:
E113–6.
Rauscher, S., S. Baud, M. Miao, F. W. Keeley, and R. Pomes. 2006. Proline and glycine control
protein self-organization into elastomeric or amyloid fibrils. Structure 14: 1667–76.
Receveur-Brechot, V., J. M. Bourhis, V. N. Uversky, B. Canard, and S. Longhi. 2005. Assessing
protein disorder and induced folding. Proteins 62: 24–45.
Receveur, V., M. Czjzek, M. Schulein, P. Panine, and B. Henrissat. 2002. Dimension, shape, and
conformational flexibility of a two-domain fungal cellulase in solution probed by small
angle X-ray scattering. J. Biol. Chem. 277: 40887–92.
Rechsteiner, M. and S. W. Rogers. 1996. PEST sequences and regulation by proteolysis. Trends
Biochem. Sci. 21: 267–71.
Redeker, V., S. Lachkar, S. Siavoshian, et al. 2000. Probing the native structure of stathmin and
its interaction domains with tubulin. Combined use of limited proteolysis, size exclusion
chromatography, and mass spectrometry. J. Biol. Chem. 275: 6841–9.
Redinbo, M. R., L. Stewart, P. Kuhn, J. J. Champoux, and W. G. Hol. 1998. Crystal structures
of human topoisomerase I in covalent and noncovalent complexes with DNA. Science 279:
1504–13.
Reeves, R. 2001. Molecular biology of HMGA proteins: hubs of nuclear function. Gene 277:
63–81.
Reeves, R. and L. Beckerbauer. 2001. HMGI/Y proteins: flexible regulators of transcription and
chromatin structure. Biochim. Biophys. Acta. 1519: 13–29.
Reinholt, F. P., K. Hultenby, A. Oldberg, and D. Heinegard. 1990. Osteopontin—a possible anchor
of osteoclasts to bone. Proc. Natl. Acad. Sci. USA 87: 4473–5.
Renault, L., B. Bugyi, and M. F. Carlier. 2008. Spire and Cordon-bleu: multifunctional regulators
of actin dynamics. Trends Cell Biol. 18: 494–504.
Richards, J. P., H. P. Bachinger, R. H. Goodman, and R. G. Brennan. 1996. Analysis of the struc-
tural properties of cAMP-responsive element-binding protein (CREB) and phosphorylated
CREB. J. Biol. Chem. 271: 13716–23.
Riek, R., S. Hornemann, G. Wider, R. Glockshuber, and K. Wuthrich. 1997. NMR characterization
of the full-length recombinant murine prion protein, mPrP(23–231). FEBS Lett. 413: 282–8.
Riordan, J. R., J. M. Rommens, B. Kerem, et al. 1989. Identification of the cystic fibrosis gene:
cloning and characterization of complementary DNA. Science 245: 1066–73.
Ritter, C., M. L. Maddelein, A. B. Siemer, et al. 2005. Correlation of structural elements and
infectivity of the HET-s prion. Nature 435: 844–8.
Rochet, J. C. and P. T. Lansbury Jr. 2000. Amyloid fibrillogenesis: themes and variations. Curr.
Opin. Struct. Biol. 10: 60–8.
Rock, R. S., B. Ramamurthy, A. R. Dunn, et al. 2005. A flexible domain is essential for the large
step size and processivity of myosin VI. Mol. Cell 17: 603–9.
Rodger, A. and B. Nordén. 1997. Circular Dichroism and Linear Dichroism. Oxford: Oxford
University Press.
Romero, P., Z. Obradovic, and A. K. Dunker. 1999. Folding minimal sequences: the lower bound
for sequence complexity of globular proteins. FEBS Lett. 462: 363–7.
Romero, P., Z. Obradovic, C. R. Kissinger, J. E. Villafranca, and A. K. Dunker. 1997. Identifying
disordered regions in proteins from amino acid sequences. Proc. IEEE Int. Conf. Neural
Networks 1: 90–5.
Romero, P., Z. Obradovic, C. R. Kissinger, et al. 1998. Thousands of proteins likely to have long
disordered regions. Pac. Symp. Biocomputing 3: 437–48.
Romero, P., Z. Obradovic, X. Li, E. C. Garner, C. J. Brown, and A. K. Dunker. 2001. Sequence
complexity of disordered protein. Proteins 42: 38–48.
Romero, P. R., S. Zaidi, Y. Y. Fang, et al. 2006. Alternative splicing in concert with protein intrin-
sic disorder enables increased functional diversity in multicellular organisms. Proc. Natl.
Acad. Sci. USA 103: 8390–5.
Schmidt, E. E. and C. J. Davies. 2007. The origins of polypeptide domains. Bioessays 29:
262–70.
Schneider, B. L., Q. H. Yang, and A. B. Futcher. 1996. Linkage of replication to start by the Cdk
inhibitor Sic1. Science 272: 560–2.
Schwarz-Linek, U., E. S. Pilka, A. R. Pickford, et al. 2004. High affinity streptococcal binding
to human fibronectin requires specific recognition of sequential f1 modules. J. Biol. Chem.
279: 39017–25.
Schwarz-Linek, U., J. M. Werner, A. R. Pickford, et al. 2003. Pathogenic bacteria attach to human
fibronectin through a tandem beta-zipper. Nature 423: 177–81.
Schwarzinger, S., G. J. Kroon, T. R. Foss, J. Chung, P. E. Wright, and H. J. Dyson. 2001.
Sequence-dependent correction of random coil NMR chemical shifts. J. Am. Chem. Soc.
123: 2970–8.
Schwarzinger, S., G. J. Kroon, T. R. Foss, P. E. Wright, and H. J. Dyson. 2000. Random coil
chemical shifts in acidic 8-M urea: implementation of random coil shift data in NMRView.
J. Biomol. NMR 18: 43–8.
Schweers, O., E. Schonbrunn-Hanebeck, A. Marx, and E. Mandelkow. 1994. Structural studies of
tau protein and Alzheimer paired helical filaments show no evidence for beta-structure. J.
Biol. Chem. 269: 24290–7.
Schwob, E., T. Bohm, M. D. Mendenhall, and K. Nasmyth. 1994. The B-type cyclin kinase inhibi-
tor p40SIC1 controls the G1 to S transition in S. cerevisiae. Cell 79: 233–44.
Sebolt-Leopold, J. S. and J. M. English. 2006. Mechanisms of drug inhibition of signalling mol-
ecules. Nature 441: 457–62.
Sedzik, J. and D. A. Kirschner. 1992. Is myelin basic protein crystallizable? Neurochem. Res. 17:
157–66.
Seet, B. T., I. Dikic, M. M. Zhou, and T. Pawson. 2006. Reading protein modifications with inter-
action domains. Nat. Rev. Mol. Cell Biol. 7: 473–83.
Selenko, P., G. Gregorovic, R. Sprangers, et al. 2003. Structural basis for the molecular
recognition between human splicing factors U2AF65 and SF1/mBBP. Mol. Cell 11:
965–76.
Selenko, P., Z. Serber, B. Gadea, J. Ruderman, and G. Wagner. 2006. From the cover: Quantitative
NMR analysis of the protein G B1 domain in Xenopus laevis egg extracts and intact oocytes.
Proc. Natl. Acad. Sci. USA 103: 11904–9.
Selenko, P. and G. Wagner. 2007. Looking into live cells with in-cell NMR spectroscopy. J. Struct.
Biol. 158: 244–53.
Selkoe, D. J. 2003. Folding proteins in fatal ways. Nature 426: 900–4.
Semrad, K., R. Green, and R. Schroeder. 2004. RNA chaperone activity of large ribosomal sub-
unit proteins from Escherichia coli. RNA 10: 1855–60.
Serber, Z. and V. Dotsch. 2001. In-cell NMR spectroscopy. Biochemistry 40: 14317–23.
Serpell, L. C., M. Sunde, M. D. Benson, G. A. Tennent, M. B. Pepys, and P. E. Fraser. 2000. The
protofilament substructure of amyloid fibrils. J. Mol. Biol. 300: 1033–9.
Shannon, C. E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27: 379–423,
623–56.
Sheaff, R. J., J. D. Singer, J. Swanger, M. Smitherman, J. M. Roberts, and B. E. Clurman. 2000.
Proteasomal turnover of p21Cip1 does not require p21Cip1 ubiquitination. Mol. Cell 5:
403–10.
Sheng, M. and E. Kim. 2000. The Shank family of scaffold proteins. J. Cell Sci. 113: 1851–6.
Sherr, C. J. and J. M. Roberts. 1999. CDK inhibitors: positive and negative regulators of G1-phase
progression. Genes Dev. 13: 1501–12.
Shi, Z., C. A. Olson, G. D. Rose, R. L. Baldwin, and N. R. Kallenbach. 2002. Polyproline II struc-
ture in a sequence of seven alanine residues. Proc. Natl. Acad. Sci. USA 99: 9190–5.
Shieh, S. Y., Y. Taya, and C. Prives. 1999. DNA damage-inducible phosphorylation of p53 at N-terminal
sites including a novel site, Ser20, requires tetramerization. EMBO J. 18: 1815–23.
Sunde, M. and C. Blake. 1997. The structure of amyloid fibrils by electron microscopy and X-ray
diffraction. Adv. Protein Chem. 50: 123–59.
Suzuki, K., S. Hata, Y. Kawabata, and H. Sorimachi. 2004. Structure, activation, and biology of
calpain. Diabetes 53 Suppl 1: S12–8.
Svergun, D. I. and M. H. Koch. 2002. Advances in structure analysis using small-angle scattering
in solution. Curr. Opin. Struct. Biol. 12: 654–60.
Svergun, D. I. and M. H. J. Koch. 2003. Small-angle scattering studies of biological macromol-
ecules in solution. Rep. Prog. Phys. 66: 1735–82.
Sweet, R. M. and D. Eisenberg. 1983. Correlation of sequence hydrophobicities measures similar-
ity in three-dimensional protein structure. J. Mol. Biol. 171: 479–88.
Syme, C. D., E. W. Blanch, C. Holt, et al. 2002. A Raman optical activity study of rheomorphism
in caseins, synucleins and tau. Eur. J. Biochem. 269: 148–56.
Szilagyi, A., D. Gyorffy, and P. Zavodszky. 2008. The twilight zone between protein order and
disorder. Biophys J. 95: 1612–26.
Szollosi, E., M. Bokor, A. Bodor, et al. 2008. Intrinsic structural disorder of DF31, a Drosophila
protein of chromatin decondensation and remodeling activities. J. Proteome Res. 7:
2291–9.
Taatjes, D. J., A. M. Naar, F. Andel III, E. Nogales, and R. Tjian. 2002. Structure, function, and
activator-induced conformations of the CRSP coactivator. Science 295: 1058–62.
Taatjes, D. J., T. Schneider-Poetsch, and R. Tjian. 2004. Distinct conformational states of nuclear
receptor-bound CRSP-Med complexes. Nat. Struct. Mol. Biol. 11: 664–71.
Tabuchi, K., T. Biederer, S. Butz, and T. C. Sudhof. 2002. CASK participates in alternative tripar-
tite complexes in which Mint 1 competes for binding with caskin 1, a novel CASK-binding
protein. J. Neurosci. 22: 4264–73.
Takagi, Y., G. Calero, H. Komori, et al. 2006. Head module control of mediator interactions. Mol.
Cell 23: 355–64.
Takahashi, M., M. Itakura, and M. Kataoka. 2003. New aspects of neurotransmitter release and
exocytosis: regulation of neurotransmitter release by phosphorylation. J. Pharmacol. Sci.
93: 41–5.
Takano, E., H. Ma, H. Q. Yang, M. Maki, and M. Hatanaka. 1995. Preference of calcium-depen-
dent interactions between calmodulin-like domains of calpain and calpastatin subdomains.
FEBS Lett. 362: 93–7.
Takano, E., M. Maki, H. Mori, et al. 1988. Pig heart calpastatin: identification of repetitive domain
structures and anomalous behavior in polyacrylamide gel electrophoresis. Biochemistry 27:
1964–72.
Tanford, C. 1968. Protein denaturation. Adv. Protein Chem. 23: 121–282.
Tanford, C., K. Kawahara, and S. Lapanje. 1966. Proteins in 6-M guanidine hydrochloride.
Demonstration of random coil behavior. J. Biol. Chem. 241: 1921–3.
Tao, W. and A. J. Levine. 1999. P19(ARF) stabilizes p53 by blocking nucleocytoplasmic shuttling
of Mdm2. Proc. Natl. Acad. Sci. USA 96: 6937–41.
Tfelt-Hansen, J., D. Kanuparthi, and N. Chattopadhyay. 2006. The emerging role of pituitary
tumor transforming gene in tumorigenesis. Clin. Med. Res. 4: 130–7.
Thapar, R., G. A. Mueller, and W. F. Marzluff. 2004. The N-terminal domain of the Drosophila
histone mRNA binding protein, SLBP, is intrinsically disordered with nascent helical struc-
ture. Biochemistry 43: 9390–400.
Thirone, A. C., C. Huang, and A. Klip. 2006. Tissue-specific roles of IRS proteins in insulin sig-
naling and glucose transport. Trends Endocrinol. Metab. 17: 72–8.
Thomas, B. and M. F. Beal. 2007. Parkinson’s disease. Hum. Mol. Genet. 16 Spec No. 2:
R183–94.
Thomas, J., S. M. Van Patten, P. Howard, et al. 1991. Expression in Escherichia coli and charac-
terization of the heat-stable inhibitor of the cAMP-dependent protein kinase. J. Biol. Chem.
266: 10906–11.
Thomas, P. D. and K. A. Dill. 1996. An iterative method for extracting energy-like quantities from
protein structures. Proc. Natl. Acad. Sci. USA 93: 11628–33.
Thomas, W. H., U. Weser, and K. Hempel. 1977. Conformational changes induced by ionic
strength and pH in two bovine myelin basic proteins. Hoppe Seylers Z. Physiol. Chem. 358:
1345–52.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity
of progressive multiple sequence alignment through sequence weighting, position-specific
gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–80.
Thorn, D. C., S. Meehan, M. Sunde, et al. 2005. Amyloid fibril formation by bovine milk
kappa-casein and its inhibition by the molecular chaperones alpha(S)- and beta-casein.
Biochemistry 44: 17027–36.
Tisne, C., B. P. Roques, and F. Dardel. 2001. Heteronuclear NMR studies of the interaction of
tRNA(Lys)3 with HIV-1 nucleocapsid protein. J. Mol. Biol. 306: 443–54.
Todd, M. J., G. H. Lorimer, and D. Thirumalai. 1996. Chaperonin-facilitated protein folding:
optimization of rate and yield by an iterative annealing mechanism. Proc. Natl. Acad. Sci.
USA 93: 4030–5.
Tofaris, G. K., R. Layfield, and M. G. Spillantini. 2001. alpha-synuclein metabolism and aggre-
gation is linked to ubiquitin-independent degradation by the proteasome. FEBS Lett. 509:
22–6.
Tokuriki, N., M. Kinjo, S. Negi, et al. 2004. Protein folding by the effects of macromolecular
crowding. Protein Sci. 13: 125–33.
Tompa, P. 2002. Intrinsically unstructured proteins. Trends Biochem. Sci. 27: 527–33.
Tompa, P. 2003a. The functional benefits of protein disorder. J. Mol. Struct. THEOCHEM 666–
667: 361–71.
Tompa, P. 2003b. Intrinsically unstructured proteins evolve by repeat expansion. BioEssays 25:
847–55.
Tompa, P. 2005. The interplay between structure and function in intrinsically unstructured pro-
teins. FEBS Lett. 579: 3346–54.
Tompa, P., P. Banki, M. Bokor, et al. 2006a. Protein-water and protein-buffer interactions in the
aqueous solution of an intrinsically unstructured plant dehydrin: NMR intensity and DSC
aspects. Biophys. J. 91: 2243–9.
Tompa, P., P. Buzder-Lantos, A. Tantos, et al. 2004. On the sequential determinants of calpain
cleavage. J. Biol. Chem. 279: 20775–85.
Tompa, P. and P. Csermely. 2004. The role of structural disorder in the function of RNA and protein
chaperones. FASEB J. 18: 1169–75.
Tompa, P., Z. Dosztanyi, and I. Simon. 2006b. Prevalent structural disorder in E. coli and S. cerevisiae
proteomes. J. Proteome Res. 5: 1996–2000.
Tompa, P. and M. Fuxreiter. 2008. Fuzzy complexes: polymorphism and structural disorder in
protein–protein interactions. Trends Biochem. Sci. 33: 2–8.
Tompa, P., M. Fuxreiter, C. J. Oldfield, I. Simon, A. K. Dunker, and V. N. Uversky. 2009. Close
encounters of the third kind: disordered domains and the interactions of proteins. BioEssays
31: 328–35.
Tompa, P., J. Prilusky, I. Silman, and J. L. Sussman. 2008. Structural disorder serves as a weak
signal for intracellular protein degradation. Proteins 71: 903–9.
Tompa, P., C. Szasz, and L. Buday. 2005. Structural disorder throws new light on moonlighting.
Trends Biochem. Sci. 30: 484–9.
Tong, K. I., Y. Katoh, H. Kusunoki, K. Itoh, T. Tanaka, and M. Yamamoto. 2006. Keap1 recruits
Neh2 through binding to ETGE and DLG motifs: characterization of the two-site molecular
recognition model. Mol. Cell Biol. 26: 2887–900.
Torok, M., S. Milton, R. Kayed, et al. 2002. Structural and dynamic features of Alzheimer’s
Abeta peptide in amyloid fibrils studied by site-directed spin labeling. J. Biol. Chem. 277:
40810–5.
Toth-Petroczy, A., C. J. Oldfield, I. Simon, et al. 2008. Malleable machines in transcription regu-
lation: the mediator complex. PLoS Comput. Biol. 4: e1000243.
Tozawa, K., C. J. Macdonald, C. N. Penfold, et al. 2005. Clusters in an intrinsically disordered
protein create a protein-binding site: the TolB-binding region of colicin E9. Biochemistry
44: 11496–507.
Triezenberg, S. J. 1995. Structure and function of transcriptional activation domains. Curr. Opin.
Genet. Dev. 5: 190–6.
Trombitas, K., M. Greaser, S. Labeit, et al. 1998. Titin extensibility in situ: entropic elasticity
of permanently folded and permanently unfolded molecular segments. J. Cell Biol. 140:
853–9.
Tsien, R. Y. 1998. The green fluorescent protein. Annu. Rev. Biochem. 67: 509–44.
Tsukazaki, T., T. A. Chiang, A. F. Davison, L. Attisano, and J. L. Wrana. 1998. SARA, a FYVE
domain protein that recruits Smad2 to the TGFbeta receptor. Cell 95: 779–91.
Tsvetkov, P., G. Asher, A. Paz, et al. 2008. Operational definition of intrinsically unstructured
protein sequences based on susceptibility to the 20S proteasome. Proteins 70: 1357–66.
Tucker, M. M., J. B. Robinson Jr., and E. Stellwagen. 1981. The effect of proteolysis on the calm-
odulin activation of cyclic nucleotide phosphodiesterase. J. Biol. Chem. 256: 9051–8.
Tucker, P. K. and B. L. Lundrigan. 1993. Rapid evolution of the sex determining locus in Old
World mice and rats. Nature 364: 715–7.
Tung, H. Y., W. Wang, and C. S. Chan. 1995. Regulation of chromosome segregation by Glc8p,
a structural homolog of mammalian inhibitor 2 that functions as both an activator and an
inhibitor of yeast protein phosphatase 1. Mol. Cell Biol. 15: 6064–74.
Tunnacliffe, A. and M. J. Wise. 2007. The continuing conundrum of the LEA proteins.
Naturwissenschaften 94: 791–812.
Turner, C. F. and P. B. Moore. 2004. The solution structure of ribosomal protein L18 from Bacillus
stearothermophilus. J. Mol. Biol. 335: 679–84.
Tyukhtenko, S., L. Deshmukh, V. Kumar, et al. 2008. Characterization of the neuron-specific
L1-CAM cytoplasmic tail: naturally disordered in solution it exercises different binding
modes for different adaptor proteins. Biochemistry 47: 4160–8.
Ueda, K., H. Fukushima, E. Masliah, et al. 1993. Molecular cloning of cDNA encoding an unrec-
ognized component of amyloid in Alzheimer’s disease. Proc. Natl. Acad. Sci. USA 90:
11282–6.
Uesugi, M., O. Nyanguile, H. Lu, A. J. Levine, and G. L. Verdine. 1997. Induced alpha helix in the
VP16 activation domain upon binding to a human TAF. Science 277: 1310–3.
Ulfers, A. L., J. L. McMurry, D. A. Kendall, and D. F. Mierke. 2002. Structure of the third intracellular
loop of the human cannabinoid 1 receptor. Biochemistry 41: 11344–50.
Uversky, V. N. 1993. Use of fast protein size-exclusion liquid chromatography to study the unfold-
ing of proteins which denature through the molten globule. Biochemistry 32: 13288–98.
Uversky, V. N. 2002a. Natively unfolded proteins: A point where biology waits for physics.
Protein Sci. 11: 739–56.
Uversky, V. N. 2002b. What does it mean to be natively unfolded? Eur. J. Biochem. 269: 2–12.
Uversky, V. N. 2003. A protein-chameleon: conformational plasticity of alpha-synuclein, a disor-
dered protein involved in neurodegenerative disorders. J. Biomol. Struct. Dyn. 21: 211–34.
Uversky, V. N. 2007. Neuropathology, biochemistry, and biophysics of alpha-synuclein aggrega-
tion. J. Neurochem. 103: 17–37.
Uversky, V. N. and A. L. Fink. 2002. The chicken–egg scenario of protein folding revisited. FEBS
Lett. 515: 79–83.
Uversky, V. N. and A. L. Fink. 2004. Conformational constraints for amyloid fibrillation: the
importance of being unfolded. Biochim. Biophys. Acta 1698: 131–53.
Uversky, V. N., J. R. Gillespie, and A. L. Fink. 2000a. Why are “natively unfolded” proteins
unstructured under physiologic conditions? Proteins 41: 415–27.
Uversky, V. N., J. R. Gillespie, I. S. Millett, et al. 2000b. Zn(2+)-mediated structure formation and
compaction of the “natively unfolded” human prothymosin alpha. Biochem. Biophys. Res.
Commun. 267: 663–8.
Uversky, V. N., J. R. Gillespie, I. S. Millett, et al. 1999. Natively unfolded human prothymosin
alpha adopts partially folded collapsed conformation at acidic pH. Biochemistry 38:
15009–16.
Uversky, V. N., M. D. Kirkitadze, N. V. Narizhneva, S. A. Potekhin, and A. Tomashevski. 1995.
Structural properties of alpha-fetoprotein from human cord serum: the protein molecule at
low pH possesses all the properties of the molten globule. FEBS Lett. 364: 165–7.
Uversky, V. N., H. J. Lee, J. Li, A. L. Fink, and S. J. Lee. 2001a. Stabilization of partially folded
conformation during alpha-synuclein oligomerization in both purified and cytosolic prepa-
rations. J. Biol. Chem. 276: 43495–8.
Uversky, V. N., J. Li, and A. L. Fink. 2001b. Evidence for a partially folded intermediate in alpha-
synuclein fibril formation. J. Biol. Chem. 276: 10737–44.
Uversky, V. N., J. Li, and A. L. Fink. 2001c. Metal-triggered structural transformations, aggregation,
and fibrillation of human alpha-synuclein. A possible molecular link between
Parkinson’s disease and heavy metal exposure. J. Biol. Chem. 276: 44284–96.
Uversky, V. N., J. Li, and A. L. Fink. 2001d. Trimethylamine-N-oxide-induced folding of alpha-
synuclein. FEBS Lett. 509: 31–5.
Uversky, V. N. and O. B. Ptitsyn. 1994. “Partly folded” state, a new equilibrium state of protein
molecules: four-state guanidinium chloride-induced unfolding of beta-lactamase at low
temperature. Biochemistry 33: 2782–91.
Uversky, V. N., C. J. Oldfield, and A. K. Dunker. 2005. Showing your ID: intrinsic disorder as an
ID for recognition, regulation and cell signaling. J. Mol. Recognit. 18: 343–84.
Uversky, V. N., C. J. Oldfield, and A. K. Dunker. 2008. Intrinsically disordered proteins in human
diseases: introducing the D2 concept. Annu. Rev. Biophys. 37: 215–46.
Uversky, V. N., A. Roman, C. J. Oldfield, and A. K. Dunker. 2006. Protein intrinsic disorder and
human papillomaviruses: increased amount of disorder in E6 and E7 oncoproteins from
high risk HPVs. J. Proteome Res. 5: 1829–42.
Uversky, V. N., S. Winter, O. V. Galzitskaya, L. Kittler, and G. Lober. 1998. Hyperphosphorylation
induces structural modification of tau protein. FEBS Lett. 439: 21–5.
Vacic, V., C. J. Oldfield, A. Mohan, et al. 2007. Characterization of molecular recognition features,
MoRFs, and their binding partners. J. Proteome Res. 6: 2351–66.
Vamvaca, K., B. Vogeli, P. Kast, K. Pervushin, and D. Hilvert. 2004. An enzymatic molten
globule: efficient coupling of folding and catalysis. Proc. Natl. Acad. Sci. USA 101:
12860–4.
Van Gilst, M. R., W. A. Rees, A. Das, and P. H. Von Hippel. 1997. Complexes of N antitermina-
tion protein of phage lambda with specific and nonspecific RNA target sites on the nascent
transcript. Biochemistry 36: 1514–24.
Van Leeuwen, H. C., M. J. Strating, M. Rensen, W. De Laat, and P. C. Van Der Vliet. 1997.
Linker length and composition influence the flexibility of Oct-1 DNA binding. EMBO J.
16: 2043–53.
Van Montfort, R. L., E. Basha, K. L. Friedrich, C. Slingsby, and E. Vierling. 2001. Crystal struc-
ture and assembly of a eukaryotic small heat shock protein. Nat. Struct. Biol. 8: 1025–30.
Vassilev, L. T., B. T. Vu, B. Graves, et al. 2004. In vivo activation of the p53 pathway by small-
molecule antagonists of MDM2. Science 303: 844–8.
Vaynberg, J., T. Fukuda, K. Chen, et al. 2005. Structure of an ultraweak protein–protein complex
and its crucial role in regulation of cell morphology and motility. Mol. Cell 17: 513–23.
Venkatraman, P., R. Wetzel, M. Tanaka, N. Nukina, and A. L. Goldberg. 2004. Eukaryotic pro-
teasomes cannot digest polyglutamine sequences and release them during degradation of
polyglutamine-containing proteins. Mol. Cell 14: 95–104.
Venkitaraman, A. R. 2002. Cancer susceptibility and the functions of BRCA1 and BRCA2. Cell
108: 171–82.
Veprintsev, D. B., S. M. Freund, A. Andreeva, et al. 2006. Core domain interactions in full-length
p53 in solution. Proc. Natl. Acad. Sci. USA 103: 2115–9.
Vergnaud, G. and F. Denoeud. 2000. Minisatellites: mutability and genome architecture. Genome
Res. 10: 899–907.
Verkhivker, G. M. 2004. Protein conformational transitions coupled to binding in molecular rec-
ognition of unstructured proteins: hierarchy of structural loss from all-atom Monte Carlo
simulations of p27Kip1 unfolding-unbinding and structural determinants of the binding
mechanism. Biopolymers 75: 420–33.
Verkhivker, G. M. 2005. Protein conformational transitions coupled to binding in molecular rec-
ognition of unstructured proteins: deciphering the effect of intermolecular interactions on
computational structure prediction of the p27Kip1 protein bound to the cyclin A-cyclin-
dependent kinase 2 complex. Proteins 58: 706–16.
Verkhivker, G. M., D. Bouzida, D. K. Gehlhaar, P. A. Rejto, S. T. Freer, and P. W. Rose. 2003.
Simulating disorder-order transitions in molecular recognition of unstructured proteins:
where folding meets binding. Proc. Natl. Acad. Sci. USA 100: 5148–53.
Vihinen, M., E. Torkkila, and P. Riikonen. 1994. Accuracy of protein flexibility predictions.
Proteins 19: 141–9.
Vise, P., B. Baral, A. Stancik, D. F. Lowry, and G. W. Daughdrill. 2007. Identifying long-range
structure in the intrinsically unstructured transactivation domain of p53. Proteins 67:
526–30.
Vise, P. D., B. Baral, A. J. Latos, and G. W. Daughdrill. 2005. NMR chemical shift and relaxation
measurements provide evidence for the coupled folding and binding of the p53 transactivation
domain. Nucleic Acids Res. 33: 2061–77.
Vitalis, A., X. Wang, and R. V. Pappu. 2007. Quantitative characterization of intrinsic disorder in
polyglutamine: insights from analysis based on polymer theories. Biophys. J. 93: 1923–37.
Vogel, C., M. Bashton, N. D. Kerrison, C. Chothia, and S. A. Teichmann. 2004. Structure, func-
tion and evolution of multidomain proteins. Curr. Opin. Struct. Biol. 14: 208–16.
Voges, D., P. Zwickl, and W. Baumeister. 1999. The 26S proteasome: a molecular machine
designed for controlled proteolysis. Annu. Rev. Biochem. 68: 1015–68.
von Bergen, M., P. Friedhoff, J. Biernat, J. Heberle, E. M. Mandelkow, and E. Mandelkow.
2000. Assembly of tau protein into Alzheimer paired helical filaments depends on a local
sequence motif ((306)VQIVYK(311)) forming beta structure. Proc. Natl. Acad. Sci. USA
97: 5129–34.
von der Haar, T., Y. Oku, M. Ptushkina, et al. 2006. Folding transitions during assembly of the
eukaryotic mRNA cap-binding complex. J. Mol. Biol. 356: 982–92.
von Mering, C., L. J. Jensen, B. Snel, et al. 2005. STRING: known and predicted protein–protein
associations, integrated and transferred across organisms. Nucleic Acids Res. 33:
D433–D37.
von Ossowski, I., J. T. Eaton, M. Czjzek, et al. 2005. Protein disorder: conformational distribution
of the flexible linker in a chimeric double cellulase. Biophys. J. 88: 2823–32.
Vrhovski, B. and A. S. Weiss. 1998. Biochemistry of tropoelastin. Eur. J. Biochem. 258: 1–18.
Vucetic, S., C. J. Brown, A. K. Dunker, and Z. Obradovic. 2003. Flavors of protein disorder.
Proteins 52: 573–84.
Vucetic, S., Z. Obradovic, V. Vacic, et al. 2005. DisProt: a database of protein disorder.
Bioinformatics 21: 137–40.
Vullo, A., O. Bortolami, G. Pollastri, and S. C. Tosatto. 2006. Spritz: a server for the prediction of
intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids
Res. 34: W164–8.
Waizenegger, I., J. F. Gimenez-Abian, D. Wernic, and J. M. Peters. 2002. Regulation of human
separase by securin binding and autocleavage. Curr. Biol. 12: 1368–78.
Waldsich, C., R. Grossberger, and R. Schroeder. 2002. RNA chaperone StpA loosens interactions
of the tertiary structure in the td group I intron in vivo. Genes Dev. 16: 2300–12.
Walker, F. O. 2007. Huntington’s disease. Lancet 369: 218–28.
Wall, J., M. Schell, C. Murphy, R. Hrncic, F. J. Stevens, and A. Solomon. 1999. Thermodynamic
instability of human lambda 6 light chains: correlation with fibrillogenicity. Biochemistry
38: 14101–8.
Wallon, G., J. Rappsilber, M. Mann, and L. Serrano. 2000. Model for stathmin/OP18 binding to
tubulin. EMBO J. 19: 213–22.
Wang, J. Q., A. Arora, L. Yang, et al. 2005. Phosphorylation of AMPA receptors: mechanisms and
synaptic plasticity. Mol. Neurobiol. 32: 237–49.
Wang, S., W. R. Trumble, H. Liao, C. R. Wesson, A. K. Dunker, and C. H. Kang. 1998. Crystal
structure of calsequestrin from rabbit skeletal muscle sarcoplasmic reticulum. Nat. Struct.
Biol. 5: 476–83.
Ward, J. J., J. S. Sodhi, L. J. McGuffin, B. F. Buxton, and D. T. Jones. 2004. Prediction and functional
analysis of native disorder in proteins from the three kingdoms of life. J. Mol. Biol.
337: 635–45.
Watanabe, K., P. Nair, D. Labeit, et al. 2002. Molecular mechanics of cardiac titin’s PEVK and
N2B spring elements. J. Biol. Chem. 277: 11549–58.
Watt, I. M. 1997. The Principles & Practice of Electron Microscopy. Cambridge: Cambridge
University Press.
Watts, J. D., P. D. Cary, P. Sautiere, and C. Crane-Robinson. 1990. Thymosins: both nuclear and
cytoplasmic proteins. Eur. J. Biochem. 192: 643–51.
Weathers, E. A., M. E. Paulaitis, T. B. Woolf, and J. H. Hoh. 2004. Reduced amino acid alphabet
is sufficient to accurately recognize intrinsically disordered protein. FEBS Lett. 576:
348–52.
Weikl, T., K. Abelmann, and J. Buchner. 1999. An unstructured C-terminal region of the Hsp90
co-chaperone p23 is important for its chaperone function. J. Mol. Biol. 293: 685–91.
Weinreb, P. H., W. Zhen, A. W. Poon, K. A. Conway, and P. T. Lansbury Jr. 1996. NACP, a pro-
tein implicated in Alzheimer’s disease and learning, is natively unfolded. Biochemistry 35:
13709–15.
Weiss, M. A., T. Ellenberger, C. R. Wobbe, J. P. Lee, S. C. Harrison, and K. Struhl. 1990. Folding
transition in the DNA-binding domain of GCN4 on specific binding to DNA. Nature 347:
575–8.
Wells, M., H. Tidow, T. J. Rutherford, et al. 2008. Structure of tumor suppressor p53 and its
intrinsically disordered N-terminal transactivation domain. Proc. Natl. Acad. Sci. USA 105:
5762–7.
Wells, R. D. 1996. Molecular basis of genetic instability of triplet repeats. J. Biol. Chem. 271:
2875–8.
Wendt, A., V. F. Thompson, and D. E. Goll. 2004. Interaction of calpastatin with calpain: a review.
Biol. Chem. 385: 465–72.
Westermark, P., S. Araki, M. D. Benson, et al. 1999. Nomenclature of amyloid fibril proteins.
Report from the meeting of the International Nomenclature Committee on Amyloidosis,
August 8–9, 1998. Part 1. Amyloid 6: 63–6.
Westhof, E., D. Altschuh, D. Moras, et al. 1984. Correlation between segmental mobility and the
location of antigenic determinants in proteins. Nature 311: 123–6.
Wetlaufer, D. B. 1973. Nucleation, rapid folding, and globular intrachain regions in proteins.
Proc. Natl. Acad. Sci. USA 70: 697–701.
Whitfield, L. S., R. Lovell-Badge, and P. N. Goodfellow. 1993. Rapid sequence evolution of the
mammalian sex-determining gene SRY. Nature 364: 713–5.
Wickner, R. B. 2005. Scrapie in ancient China? Science 309: 874.
Wickner, R. B., H. K. Edskes, M. L. Maddelein, K. L. Taylor, and H. Moriyama. 1999. Prions of
yeast and fungi. Proteins as genetic material. J. Biol. Chem. 274: 555–8.
Wickner, R. B., H. K. Edskes, B. T. Roberts, et al. 2004. Prions: proteins as genes and infectious
entities. Genes Dev. 18: 470–85.
Wickner, R. B., K. L. Taylor, H. K. Edskes, and M. L. Maddelein. 2000. Prions: Portable prion
domains. Curr. Biol. 10: R335–7.
Wilkins, D. K., S. B. Grimshaw, V. Receveur, C. M. Dobson, J. A. Jones, and L. J. Smith. 1999.
Hydrodynamic radii of native and denatured proteins measured by pulsed field gradient
NMR techniques. Biochemistry 38: 16424–31.
Wille, H., E. M. Mandelkow, J. Dingus, R. B. Vallee, L. I. Binder, and E. Mandelkow. 1992a.
Domain structure and antiparallel dimers of microtubule-associated protein 2 (MAP2). J.
Struct. Biol. 108: 49–61.
Wille, H., E. M. Mandelkow, and E. Mandelkow. 1992b. The juvenile microtubule-associated
protein MAP2c is a rod-like molecule that forms antiparallel dimers. J. Biol. Chem. 267:
10737–42.
Williams, R. J. 1989. NMR studies of mobility within protein structure. Eur. J. Biochem. 183:
479–97.
Williams, R. M., Z. Obradovic, V. Mathura, et al. 2001. The protein non-folding problem: amino
acid determinants of intrinsic order and disorder. Pac. Symp. Biocomput. 6: 89–100.
Williamson, M. P. 1994. The structure and function of proline-rich regions in proteins. Biochem.
J. 297: 249–60.
Winzeler, E. A., D. D. Shoemaker, A. Astromoff, et al. 1999. Functional characterization of the S.
cerevisiae genome by gene deletion and parallel analysis. Science 285: 901–6.
Wise, M. J. and A. Tunnacliffe. 2004. POPP the question: what do LEA proteins do? Trends Plant
Sci. 9: 13–7.
Wishart, D. S., C. G. Bigam, A. Holm, R. S. Hodges, and B. D. Sykes. 1995. 1H, 13C and 15N
random coil NMR chemical shifts of the common amino acids. I. Investigations of nearest-
neighbor effects. J. Biomol. NMR 5: 67–81.
Wisniewski, M., R. Webb, R. Balsamo, T. Close, X. Yu, and M. Griffith. 1999. Purification,
immunolocalization, cryoprotective, and antifreeze activity of PCA60: a dehydrin from
peach (Prunus persica). Physiol. Plant. 105: 600–8.
Wissmann, R., T. Baukrowitz, H. Kalbacher, et al. 1999. NMR structure and functional charac-
teristics of the hydrophilic N terminus of the potassium channel beta-subunit Kvbeta1.1. J.
Biol. Chem. 274: 35521–5.
Wittmann, T., G. M. Bokoch, and C. M. Waterman-Storer. 2004. Regulation of microtu-
bule destabilizing activity of Op18/stathmin downstream of Rac1. J. Biol. Chem. 279:
6196–203.
Wool, I. G. 1996. Extraribosomal functions of ribosomal proteins. Trends Biochem. Sci. 21:
164–5.
Wootton, J. C. 1994a. Non-globular domains in protein sequences: automated segmentation using
complexity measures. Computers Chem. 18: 269–85.
Wootton, J. C. 1994b. Sequences with “unusual” amino acid compositions. Curr. Opin. Struct.
Biol. 4: 413–21.
Wootton, J. C. and M. H. Drummond. 1989. The Q-linker: a class of interdomain sequences found
in bacterial multidomain regulatory proteins. Protein Eng. 2: 535–43.
Wopfner, F., G. Weidenhofer, R. Schneider, et al. 1999. Analysis of 27 mammalian and 9 avian
PrPs reveals high conservation of flexible regions of the prion protein. J. Mol. Biol. 289:
1163–78.
Wright, P. E. and H. J. Dyson. 1999. Intrinsically unstructured proteins: re-assessing the protein
structure-function paradigm. J. Mol. Biol. 293: 321–31.
Wu, G., Y. G. Chen, B. Ozdamar, et al. 2000. Structural basis of Smad2 recognition by the Smad
anchor for receptor activation. Science 287: 92–7.
Wu, K., M. E. Bottazzi, C. De La Fuente, et al. 2004. Protein profile of tax-associated complexes.
J. Biol. Chem. 279: 495–508.
Wüthrich, K. 1986. NMR of Proteins and Nucleic Acids. New York: Wiley Interscience.
Xiao, H., R. Sandaltzopoulos, H. M. Wang, et al. 2001. Dual functions of largest NURF sub-
unit NURF301 in nucleosome sliding and transcription factor interactions. Mol. Cell 8:
531–43.
Xie, H., S. Vucetic, L. M. Iakoucheva, et al. 2007. Functional anthology of intrinsic disorder. 1.
Biological processes and functions of proteins with long disordered regions. J. Proteome
Res. 6: 1882–98.
Xie, Z. and L. H. Tsai. 2004. Cdk5 phosphorylation of FAK regulates centrosome-associated
microtubules and neuronal migration. Cell Cycle 3: 108–10.
Yamamoto, A., V. Guacci, and D. Koshland. 1996. Pds1p is required for faithful execution of
anaphase in the yeast, Saccharomyces cerevisiae. J. Cell Biol. 133: 85–97.
Yamamoto, T., S. Izumi, and K. Gekko. 2004. Mass spectrometry on segment-specific hydrogen
exchange of dihydrofolate reductase. J. Biochem. (Tokyo) 135: 17–24.
Yang, J., T. D. Hurley, and A. A. DePaoli-Roach. 2000. Interaction of inhibitor-2 with the catalytic
subunit of type 1 protein phosphatase. Identification of a sequence analogous to the consen-
sus type 1 protein phosphatase-binding motif. J. Biol. Chem. 275: 22635–44.
Yang, W. Z., T. P. Ko, L. Corselli, R. C. Johnson, and H. S. Yuan. 1998. Conversion of a beta-
strand to an alpha-helix induced by a single-site mutation observed in the crystal structure
of Fis mutant Pro26Ala. Protein Sci. 7: 1875–83.
Yang, X. J. 2004a. The diverse superfamily of lysine acetyltransferases and their roles in leukemia
and other diseases. Nucleic Acids Res. 32: 959–76.
Yang, X. J. 2004b. Lysine acetylation and the bromodomain: a new partnership for signaling.
BioEssays 26: 1076–87.
Yang, X. J. 2005. Multisite protein modification and intramolecular signaling. Oncogene 24:
1653–62.
Yang, Z. R., R. Thomson, P. McNeil, and R. M. Esnouf. 2005. RONN: the bio-basis function neural
network technique applied to the detection of natively disordered regions in proteins.
Bioinformatics 21: 3369–76.
Yap, K. L., J. Kim, K. Truong, M. Sherman, T. Yuan, and M. Ikura. 2000. Calmodulin target data-
base. J. Struct. Funct. Genomics 1: 8–14.
Yong, C., H. Mitsuyasu, Z. Chun, S. Oshiro, N. Hamasaki, and S. Kitajima. 1998. Structure of the
human transcription factor TFIIF revealed by limited proteolysis with trypsin. FEBS Lett.
435: 191–4.
Young, R. A. 1991. RNA polymerase II. Annu. Rev. Biochem. 60: 689–715.
Yu, H., J. K. Chen, S. Feng, D. C. Dalgarno, A. W. Brauer, and S. L. Schreiber. 1994. Structural
basis for the binding of proline-rich peptides to SH3 domains. Cell 76: 933–45.
Zagotta, W. N., T. Hoshi, and R. W. Aldrich. 1990. Restoration of inactivation in mutants of
Shaker potassium channels by a peptide derived from ShB. Science 250: 568–71.
Zahn, R., A. Liu, T. Luhrs, et al. 2000. NMR solution structure of the human prion protein. Proc.
Natl. Acad. Sci. USA 97: 145–50.
Zambelli, B., M. Stola, F. Musiani, et al. 2005. UreG, a chaperone in the urease assembly process,
is an intrinsically unstructured GTPase that specifically binds Zn2+. J. Biol. Chem. 280:
4684–95.
Zeev-Ben-Mordehai, T., E. H. Rydberg, A. Solomon, et al. 2003. The intracellular domain of
the Drosophila cholinesterase-like neural adhesion protein, gliotactin, is natively unfolded.
Proteins 53: 758–67.
Zhang, J. and J. L. Corden. 1991. Phosphorylation causes a conformational change in the car-
boxyl-terminal domain of the mouse RNA polymerase II largest subunit. J. Biol. Chem.
266: 2297–302.
Zhang, M. and P. Coffino. 2004. Repeat sequence of Epstein–Barr virus-encoded nuclear
antigen 1 protein interrupts proteasome substrate processing. J. Biol. Chem. 279:
8635–41.
Zhang, X., M. A. Perugini, S. Yao, et al. 2008. Solution conformation, backbone dynamics and
lipid interactions of the intrinsically unstructured malaria surface protein MSP2. J. Mol.
Biol. 379: 105–21.
Zhang, Y., Y. Kim, N. Genoud, et al. 2006. Determinants for dephosphorylation of the RNA poly-
merase II C-terminal domain by Scp1. Mol. Cell 24: 759–70.
Zhang, Y., B. Stec, and A. Godzik. 2007. Between order and disorder in protein structures: analy-
sis of “dual personality” fragments in proteins. Structure 15: 1141–7.
Zheng-Fischhofer, Q., J. Biernat, E. M. Mandelkow, S. Illenberger, R. Godemann, and E.
Mandelkow. 1998. Sequential phosphorylation of tau by glycogen synthase kinase-3beta
and protein kinase A at Thr212 and Ser214 generates the Alzheimer-specific epitope of
antibody AT100 and requires a paired-helical-filament-like conformation. Eur. J. Biochem.
252: 542–52.
Zhu, F., J. Kapitan, G. E. Tranter, et al. 2007. Residual structure in disordered peptides and
unfolded proteins from multivariate analysis and ab initio simulation of Raman optical
activity data. Proteins 70: 823–33.
Zhuang, S., K. Mabuchi, and C. A. Wang. 1996. Heat treatment could affect the biochemical
properties of caldesmon. J. Biol. Chem. 271: 30242–8.
Zitzewitz, J. A., B. Ibarra-Molero, D. R. Fishel, K. L. Terry, and C. R. Matthews. 2000. Preformed
secondary structure drives the association reaction of GCN4-p1, a model coiled-coil sys-
tem. J. Mol. Biol. 296: 1105–16.
Zor, T., B. M. Mayr, H. J. Dyson, M. R. Montminy, and P. E. Wright. 2002. Roles of phos-
phorylation and helix propensity in the binding of the KIX domain of CREB-binding
protein by constitutive (c-Myb) and inducible (CREB) activators. J. Biol. Chem. 277:
42241–8.
Zou, H., T. J. McGarry, T. Bernal, and M. W. Kirschner. 1999. Identification of a vertebrate
sister-chromatid separation inhibitor involved in transformation and tumorigenesis. Science 285:
418–22.
Index
A
Accuracy,
of disorder prediction, 118
Acetylation, 172
Acetylcholinesterase,
EPR spectrum, 68
ACF, see Autocorrelation function
Acid blob, see Trans-activator domain
Actin,
globular, 157
in microfilament, 157
structure, Tβ4-bound, 158
Activator for thyroid hormone and retinoid receptors (ACTR), 142, 147, 211–212
Activator, function, effector,
ACTR, see Activator for thyroid hormone and retinoid receptors
AD, see Alzheimer’s disease
Adaptability, see Structural adaptability, see also Moonlighting
in binding, 23
structural, 101, 203, 223–224; see also Promiscuity, One-to-many signaling,
Adaptor protein, see Scaffold protein
Adenomatous polyposis coli (APC), 150
AFM, see Atomic force microscopy
Allostery,
in catabolite activator protein, 234
in Wiskott–Aldrich syndrome protein, 234
Aβ peptide,
in Alzheimer’s disease, 248
amyloid, structure of, 257
generation from APP, 248
α-Helix, 7–8; see also Dictionary of secondary structure of proteins (DSSP), Ramachandran plot, Secondary structure
amphipathic, 7, 69, 263
as α-MoRE, 246
by circular dichroism, 64–65
by FTIR, 63
by NMR, 79, 128–129
forming potential of amino acids, 3
in CREB KID, 130–131, 216–217
in DNA recognition, 216
in IDPs, 128–129, 133
in IDPs, predicted, 115
in KID of p27Kip1, 127
in measles virus nucleoprotein, 50–51
in ordered proteins, 10–11
in PDB, 135
α-Synuclein, AFM, 72
amyloid structure, 69, 250
dynamics (in binding), 141
FTIR, 63
function, chaperone, 174
in-cell NMR, 97
in Parkinson’s disease, 250–251
metal binding, 160
natively unfolded protein, 26
proteasomal degradation, 95
residual structure, 141
ROA, 66
structure, solution, 132
hydrodynamic behavior, 136, 138
tertiary structure by PRE, 81–82
under crowding, 93
Alternative splicing, 234–235
Alzheimer’s disease, 248–249
Ambiguity of structure, see Secondary structure; see also Chameleon sequences
Amide bond, see Peptide bond
Amide proton exchange, see H/D exchange
Amino acid, 3–4
sequence, see Primary structure
structure, 3
Amino acid composition,
disorder-promoting amino acids, 121–122
of IDPs, 121–122
of interfaces, 212–213
of linear motifs, 209
order-promoting amino acids, 121–122
Amphipathic α-helix, 7
in drug design, 263
in myelin basic protein, 69
Amyloid, disorder of precursors, 257–258; see also Amyloidosis
electron microscopy, 257
kinetics of formation, 256
mechanism of formation, 258–259
structure, 257–258
Amyloid precursor protein (APP)
Aβ peptide, generation from, 248
mutation in Alzheimer’s disease, 248
Amyloidosis, 247
neurodegenerative, 246–255
systemic, 255
C rheomorphic, 24, 66
under crowding, 94
Cadherin (E-), UV fluorescence, 58
catenin-binding domain (CBD), 192 Caskin, disordered scaffold protein, 187
complex with β-catenin, 151 CASP (critical assessment of methods of protein
cytoplasmic domain, 150 structure prediction), see Comparison
CAG-repeat disease, see Glutamine-repeat disease of predictors
Calcineurin, 56 Catabolite activator protein, see also Allostery
Caldesmon, chemical cross-linking, 40 order-to-disorder transition in, 234
electron microscopy, 70 Catenin binding domain (CBD), evolution, 211
gel filtration, 44 in E-cadherin and T-cell factor 3/4, 192
hydrodynamic behavior, 136 in many-to-one signaling, 151
near-UV CD, 64 CBD, see Cellulose binding domain
Calmodulin, as hub protein, 184 CBD, see Catenin binding domain
binding partners, disorder of, 35, 170, 184–185 CBP, see CREB-binding protein
disorder in function, 184 CD, see Circular dichroism spectroscopy
partners, limited proteolysis of, 170 Cdc42, 115, 158, 211, 233
Calmodulin-binding target (CaMBT), 184 Cdk, see Cyclin-dependent kinase inhibitor
enhanced proteolytic sensitivity, 35, 170 CDP, see Conserved disorder prediction
in PDB, 185 Cell-cycle, 241
Calorimetry, see Differential scanning regulation, see Signal transduction
calorimetry, Isothermal Cellulase, bacterial,
titration calorimetry processivity of binding, 229
Calpastatin, structure, SAXS, 51–52
evolutionary variability, 201 Cellulose binding domain (CBD),
HSQC spectrum, 77 in bacterial cellulase, 51–52
hydration of, 142 CFTR, see Cystic fibrosis transmembrane
primary contact site, 222 conductance regulator
residual structure, 37 CH, see Charge-hydropathy plot
resistance to heat, 32 Chameleon sequences, 134
resonance assignment, NMR, 78 Chaperone, 173; see also Function, chaperone;
structure, in solution, 132 LEA protein
wide-line NMR, 76 disorder in
Calsequestrin, function, scavenger, 179 entropy transfer, 236
metal binding, 161 fully disordered, 174
structure, 179 heat-shock protein, 15
CaM, see Calmodulin in folding, 14–15
CaMBT, see Calmodulin-binding target mechanism of, 235–236
cAMP response element binding protein of protein, 174
(CREB); see also Kinase inducible of RNA, 174–176
domain (KID) Charge-hydropathy plot, 107–108
binding to KIX domain of CBP, 80, 130 Chelate effect, see Multivalent binding
function, transcription factor, 145–146 Chemical cross-linking, 40; see also
fuzziness of binding, 228 Indirect techniques
inducibility of binding, 220 Chemical denaturation, 33; see also
local secondary structure, 79 Indirect techniques
mechanism of binding, 216–218 Chorismate mutase,
structure, solution, 130–131 as molten-globule enzyme, 161
Cancer, structural disorder in, 237–245 Chromatin, 153–155; see also Histone
CAP, see Catabolite activator protein Chromosomal translocation, 244–245
Cardiovascular disease, structural disorder in, 245 Ciboulot, 157, 164, 192
Casein, FTIR, 63 Cip/Kip Cdk inhibitor, 240–242
function, chaperone, 174 disordered domain, 211
function, scavenger, 178 p21Cip1, argument for disorder, 101
in history of structural disorder, 24 p21Cip1, in history of disorder, 27
proteasomal degradation, 95 p27Kip1, as signaling conduit, 232–233
random coil structure, 24 p27Kip1, structure in solution, 127
Dehydrin (DHN), see also Early responsive interatomic, by FRET, 60–61, 138
to dehydration (ERD), Late Distance-distribution function, see Small-angle
embryogenesis abundant (LEA) X-ray scattering, Distance-distribution
as group 2 LEA protein, 160 function
hydration of, 142 DLS, see Dynamic light scattering,
in stress response, 160 DNA-binding domain (DBD), see also
Denaturation, see also Unfolding, 15 transcription factors
and amyloid formation, 259 coiled-coil, see also Leu-zipper, 216, 245
and lock-and-key hypothesis, 22 disorder-to-order transition and
and residual structure, 37 specificity, 219–220
resistance to, 33 in Oct1, see also POU domain, 165
Destruction-box, see Degradation, in RPA70, 195, 200
Destruction-box in transcription factor, 145
Dextran, mutations, in p53, 259
in gel-filtration chromatography, 43 of p53, 52, 108, 240
in mimicking crowding, 59, 93 predicted disorder in, 146
Df31, see Decondensation factor 31 Docking protein, see Scaffold protein
DHFR, see Dihydrofolate reductase Domain, definitions of, 10; see also Catenin-
DHN, see Dehydrin binding domain, Cellulose-binding
DHPR, see Dihydropyridine receptor domain, C-terminal domain,
Diabetes, structural disorder in, 245 DNA-binding domain, Intrinsically
Dictionary of secondary structure of unstructured linker domain, Kinase
proteins (DSSP), 9 inhibitory domain, Kinase-inducible
Differential scanning calorimetry (DSC), 35–37; domain, N-terminal domain, PDZ
see also Indirect techniques domain, Regulatory domain, Tyrosine-
and residual structure, in calpastatin, 37 kinase domain, Trans-activator
and transition to ordered state, 37 domain, Tubulin-binding domain,
molten globule structure, of chorismate disordered, 192–193, 210–212
mutase, 161 DP, see Dual-personality sequence
of caldesmon, 36 Drug design,
of decondensation factor 31, 36 based on disorder, 262–263
of lysozyme, 36 DSC, see Differential scanning calorimetry
Diffusion coefficient, by dynamic DSSP, see Dictionary of secondary structure
light scattering, 45 of proteins
by PFG, 53–54 Dual-personality sequence, 135
Dihydrofolate reductase, Dynamic light scattering, 45; see also
flexibility, of linker, 41 Hydrodynamic techniques
Dihydropyridine receptor (DHPR), Dynamics,
moonlighting of, 177, 223 and FCS, 62
DILIMOT, see also Prediction of linear and fluorescence spectroscopy, 57
motifs, 116, 208 and FRET, 60
DisEMBL, 110 and EPR, 68
DisProt, see also Databases, 18 and NMR relaxation, 79–81
DISOPRED, 111 of structure of IDPs, 140–142
Disorderome, 86 of Sup35p NM region, 62
Disorder-promoting amino acid, see Amino of tau protein, 69
acid composition
Disorder-to-order transition, 141–142; see also E
Induced folding
mechanism, 214–218 Early responsive to dehydration (ERD); see also
reduction of mobility in, 142 LEA proteins hydration, 142
DISPHOS, see also Prediction, of ECM, see Extracellular matrix
phosphorylation site, 169 E-cadherin, see Cadherin (E-)
Display site, see Function, display site; see also Effector, see Function, effector
Post-translational modification EFP, see EWS fusion protein
DisProt, see Databases eIF4F, see Eukaryotic translation initiation
Distance, interatomic, by NMR NOE, 81–82 factor 4F
Intrinsically disordered protein, IDP, 29; see also KIX domain, 147
Structural disorder binding of CREB KID, 79
Intrinsically folded structural unit (IFSU), 130; structure in complex with CREB KID, 80
see also Preformed structural element Kyte–Doolittle scale, see Hydrophobicity,
Intrinsically unstructured linker domain scale of
(IULD), 195
evolution, neutrality, 200 L
Intrinsically unstructured protein, IUP, 29; see
also Structural disorder λN, bacteriophage, 155
Isothermal titration calorimetry, 37–38; see also Landscape theory, see also Folding funnel
Indirect techniques of folding, 12
and free energy of binding, 38–40 Late embryogenesis abundant (LEA) protein,
binding of p27Kip1 KID domain to Cyclin A – entropic exclusion, 168
Cdk2, 38–40 function, chaperone, 174–175
in binding of polyproline II helix to SH3 in stress response, 160
domain, 38 LEA protein, see Late embryogenesis
ITC, see Isothermal titration calorimetry abundant protein
IULD, see Intrinsically unstructured LEF (lymphocyte enhancer binding factor),
linker domain see T-cell factor (3/4)
IUP, intrinsically unstructured protein, 29; Leu-zipper, 216; see also Coiled-coil, DNA-
see also Structural disorder binding domain
IUPred, 112 Levinthal paradox, 12
LH, see Linker, helix
J Limited proteolysis, 35; see also
Indirect techniques
Janus chaperone, 176 and display site function, 170–171
Janus chaperone, see also Chaperone and local structure, 35
as post-translational modification, 6
K by proteasome, 171
in proteomics, 35
KID domain, see Kinase inhibitory domain (in of calmodulin-binding target, 170
p27Kip1), Kinase-inducible domain (in prerequisites of, 170
CREB) Linear motif, eukaryotic, 208–209; see also
KID-binding domain, Eukaryotic Linear Motifs (ELM)
in CBP, see KIX domain database, Molecular recognition
Kinase inhibitory domain (KID in p27Kip1), 38; Linear motif, short, see also Prediction, of
see also p27Kip1 linear motifs
binding of Cyclin A-Cdk2, energetics, 38–40 molecular recognition, 208–209
disordered domains in CKIs, 192 Linker, see also Function, linker; Intrinsically
in signaling conduit of p27Kip1, 232–233 unstructured linker domain
mechanism of binding to Cyclin flexibility in dihydrofolate reductase, 41
A-Cdk2, 215–216 helix, in KID domain of p27Kip1, 38
mechanism, inhibition of Cdks, 241–242 neutral evolution of, 200
structure, bound, 130 of CBP, 147
structure, solution, 127–130 of matrix metalloproteinase 9, 165
under crowding, 93 of Oct1, 165
Kinase-inducible domain (KID in CREB), of replication protein A, 195
function, display site, 164 LM, see Linear motif, eukaryotic
function, transcription factor, 145–146 Lock-and-key hypothesis, 19; see also Structure–
fuzziness, 228 function paradigm, classical
inducibility of interaction, 220 Loopy protein, 66
MD analysis of binding mechanism, 216–217 Low-complexity region, 123
NMR analysis of binding mechanism, 217–218 and Shannon entropy, 106, 123
preformed structural elements in, 207 in IDPs, 124
structure, in complex with KIX domain prediction of, 106
of CBP, 80 Lymphocyte enhancer binding factor (LEF),
structure, solution, 79–80, 130–131 see T-cell factor (3/4)
Molecular mimicry, by 4E-BP, 235; see also Mutation, advantageous, 194; see also
Molecular recognition Evolution, Polymorphism, genetic,
by colicin E9, 235 Repeat expansion
by EspF(U), 235 amyloid precursor protein, 248
Molecular recognition, 206–230 in BRCA1, oncogenic, 259–260
and fast binding, 221–223 disadvantageous, 194
and fly-casting, 222–223 in lysozyme, amyloidogenic, 247, 255
and fuzziness, 226–228 in p53, oncogenic, 259
and interface, 212 in transthyretin, amyloidogenic, 247, 255
and molecular mimicry, 235 missense, 194
and nested interfaces, 225 neutral, 194
by linear motifs, 208 nonsense, 194
by molecular recognition elements point mutations, in evolution, 193
(MoREs), 210 sense, 194
by molecular recognition features Mutual synergistic folding, see Co-folding
(MoRFs), 210 Mw, see Molecular mass
by preformed structural elements Myelin basic protein,
(PSEs), 206–208 amphipathic α-helix in, 69
by primary contact site (PCS), 222 failure of crystallization, 24
by short motifs, 206–214 membrane binding, 69
mechanism of, 215–218 Myosin VI, electron microscopy, 71
sequence independence of, 230 processivity of binding, 229
ultrasensitivity in, 230–231
uncoupling specificity from binding N
strength, 219–221
Molecular recognition element, see MoRE N-acetyl tryptophan amide (NATA),
Molecular recognition feature, see MoRF fluorescence spectrum, 59
Molecular weight, see Molecular mass quenching of, 59
Molten globule (MG), 17; see also Protein NACP (non-Aβ component of Alzheimer’s disease
quartet model, Protein trinity model amyloid plaques), see α-synuclein
as enzyme, 161–162 NAC-region,
by ANS binding, 60 in α-synuclein, 139, 251
by gel-filtration, 43 in amyloid formation, 250–251
by SAXS, 48 restricted motion of, 141
Moonlighting, 177, 223; see also Adaptability, NAD(P)H quinone oxidoreductase 1 (NQO1), 96
Promiscuity, functional NATA, see N-acetyl tryptophan amide
mechanisms, 225 Native state,
MoRE/MoRF, 210
in disease proteins, 246 Natively unfolded protein, NU, 29; see also
in functional classes, 246 Structural disorder
prediction of, 115, 134, 210, 212, 246, 263 NCBD, see Nuclear coactivator binding domain
mRNA, 5’ capping, 155; see also Alternative Nested interface, 225–226; see also Osteopontin,
splicing, Splicing Sialoprotein
MS, see Mass-spectrometry Neurodegenerative disease, 246–259; see also
MSCRAMM, see Microbial surface Amyloidosis
components recognizing Neurofilament, see Intermediate filament
adhesive matrix molecules; Neutral evolution,
see also Fibronectin-binding of intrinsically disordered linker domain, 200
protein (A) of trans-activator domain, 194
MT, see Microtubule Neutrality,
MTBR, see Microtubule-binding region in evolution, 194–195
Multitasking, see Moonlighting NLS, see Nuclear localization signal
Multivalent binding, 220–221; see also NMR, 73–84
Fuzziness, clamp type and H/D exchange, 83–84
Murine double minute 2 (MDM2), and MD simulations, 83
as hub protein, 183–184 chemical shift index, 79, 98, 127
p53 binding, 183 NOE and distance information, 81–82
PDB (Protein Data Bank), 18; see also Databases in polyQ regions, 253
PDZ domain, 150 in RNA polymerase II, 148
Peptide bond, 3–4 in tau protein, 66
Perchloro-acetic acid (PCA), 86 in titin PEVK domain, 81, 126
Persistence length, 16 PolyQ disease, see Glutamine-repeat disease
PEST region, and half-life in vivo, 99 PolyQ region,
PEVK region (in titin), dimensions by FCS, 61
evolution by repeat expansion, 200 PONDR®, see Predictor of naturally
function, entropic chain, 101, 164 disordered regions
function, entropic spring, 166 Post-synaptic density, 187
polyproline II helix, 81, 126 voltage-dependent potassium channel, in
tandem repeats in, 198 assembly of, 150
electron microscopy, 71 Post-translational modification, see also
Pfam, 18; see also Databases Linear motifs
PFG, see NMR, pulsed-field gradient acetylation, 172
PG-SLED (pulse gradient stimulated echo disulfide bridge, 5
longitudinal encode-decode), glycosylation, 5
see NMR, pulsed-field gradient phosphorylation, 5, 168–170
PHF, see Paired helical filament proteolytic processing, 170, 171;
Φ-value analysis, 14 see also Limited proteolysis
Phospho-degron, 231; see also ubiquitination, 171
Degradation, signals spontaneous, 6
Phosphorylation, 168–170; see also POU domain, 165, 227
Autophosphorylation prediction of PP1, see Protein phosphatase 1
of CFTR regulatory domain, 231–232 PPII helix, see Polyproline II helix
of KID domain of CREB, 79–80, 216–217 PRE, see Paramagnetic resonance enhancement
of p27Kip1, 232–233 Prediction,
of Sic1, 231 and meta-servers, 114
Phospho-tyrosine-binding domain (PTB), 206 based on amino acid propensity, 103
Phylogenetic distribution, based on contact numbers, 111
of disorder, 189–192 based on contact potentials, 112
PIC, see Pre-initiation complex based on inter-residue interaction energies, 112
Pituitary tumor transforming gene (PTTG), based on neural networks, 109–110
see Securin based on support vector machines, 110–111
PKA, see Protein kinase A of function, 162
PMG, see Pre-molten globule of functional motifs, 115–116
Polar zipper, 257; see also Amyloid, structure of globularity, 107
Polyacrylamide gel-electrophoresis (PAGE), of linear motifs, 116, 208
native, 88–89 of low-complexity regions, 106
SDS-, see SDS-PAGE of MoREs/MoRFs, 115, 134, 210, 212, 246, 263
Polyelectrostatics, see Ultrasensitivity of phosphorylation sites, 169
Polymer theory, 15–17 of structural disorder, 103–120
Polymorphism, of structural disorder in structural
genetic, 197; see also Tandem repeat genomics, 119–120
in glutamine-repeat disease, 251–252 Predictor of naturally disordered regions
in prion protein, 254 (PONDR®), 109–110
structural, 202, 226–227; see also Fuzziness Preformed structural elements, 206–208;
Polyproline II helix (PPII helix), 8, 9; see also see also Residual structure
Ramachandran plot, Dictionary of Pre-initiation complex, 145
secondary structure of proteins (DSSP) RNA polymerase II, role of, 198
and circular dichroism, 63–65 PreLink, 108–109
and ROA, 65–66, 67 Pre-molten globule, 17; see also
binding to SH3 domain, 38 Protein quartet model
energetics of binding, 58 by gel-filtration, 43
in Ala repeats, 127 by SAXS, 48
in IDPs, 126–127 of caldesmon, 44
in ordered proteins, 9 of ribosomal protein, 153
W
Z
WASP, see Wiskott–Aldrich syndrome protein
WASP homology domain 2 (WH2), 157 ZipA protein, and crowding, 93
WAVE, 157 electron microscopy, 70