Using AutoDock 4 with ADT: A Tutorial
Dr. Ruth Huey & Dr. Garrett M. Morris
5/13/08 Using AutoDock 4 with ADT 1
What is Docking?
Predicting the best ways two molecules will interact.
(1) Obtain the 3D structures of the two molecules.
(2) Locate the best binding site.
(3) Determine the best binding modes.
What is Docking?
Predicting the best ways two molecules will interact.
- We need to quantify or rank solutions; we need a Scoring Function or force field.
- "Ways" is plural: the experimentally observed structure may be one of several predicted solutions. We need a Search Method.
Defining a Docking
- Position: x, y, z
- Orientation: qx, qy, qz, qw
- Torsions: τ1, τ2, … τn
Key aspects of docking
- Scoring Functions
  - What are they?
- Search Methods
  - How do they work?
  - Which search method should I use?
- Dimensionality
  - What is it?
  - Why is it important?
Scoring Function in AutoDock 4: Motivation
- To improve the scoring function:
  - improved hydrogen bonding
  - new desolvation energy term & internal desolvation energy
  - larger training set and new weights
- To permit protein sidechain, loop or domain flexibility (new DPF keyword, "flexres"):
  - "flexres" treats the protein's moving atoms as part of the non-translating, non-reorienting part of the torsion tree
- To simulate the unbound state of the ligand & protein:
  - extended, compact and crystallographic ligand conformations

$$\Delta G = (V_{bound}^{L-L} - V_{unbound}^{L-L}) + (V_{bound}^{P-P} - V_{unbound}^{P-P}) + (V_{bound}^{P-L} - V_{unbound}^{P-L}) - T\Delta S_{conf}$$
AutoDock 4 Scoring Function Terms
$$\Delta G_{binding} = \Delta G_{vdW} + \Delta G_{elec} + \Delta G_{hbond} + \Delta G_{desolv} + \Delta G_{tors}$$

- ΔG_vdW: 12-6 Lennard-Jones potential (with 0.5 Å smoothing)
- ΔG_elec: with Solmajer & Mehler distance-dependent dielectric
- ΔG_hbond: 12-10 H-bonding potential with Goodford directionality
- ΔG_desolv: charge-dependent variant of Stouten pairwise atomic solvation parameters
- ΔG_tors: number of rotatable bonds
http://autodock.scripps.edu/science/equations http://autodock.scripps.edu/science/autodock-4-desolvation-free-energy/
Pairwise terms in AutoDock 4
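$$V = W_{vdw} \sum_{i,j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right) + W_{hbond} \sum_{i,j} E(t) \left( \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right) + W_{elec} \sum_{i,j} \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}} + W_{sol} \sum_{i,j} \left( S_i V_j + S_j V_i \right) e^{(-r_{ij}^2 / 2\sigma^2)}$$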
- Desolvation includes terms for all atom types
  - Favorable term for C, A (aliphatic and aromatic carbons)
  - Unfavorable term for O, N
  - Proportional to the absolute value of the charge on the atom
  - Computes the intramolecular desolvation energy for moving atoms
- Calibrated with 188 complexes from LPDB, K_i values from PDB-Bind
- Standard error (in kcal/mol):
  - 2.62 (extended)
  - 2.72 (compact)
  - 2.52 (bound)
  - 2.63 (AutoDock 3, bound)
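To make the structure of the pairwise sum concrete, here is a minimal Python sketch that evaluates the four terms for a single atom pair. It is only an illustration: the weights, the A, B, C, D coefficients, the constant dielectric and the value of sigma are placeholder assumptions, not AutoDock's calibrated parameters, and the function name `pair_energy` is hypothetical.

```python
import math

def pair_energy(r, q_i, q_j, S_i, V_i, S_j, V_j,
                A, B, C=0.0, D=0.0, E_t=1.0,
                W_vdw=1.0, W_hbond=1.0, W_elec=1.0, W_sol=1.0,
                dielectric=lambda r: 40.0, sigma=3.5):
    """Energy of one atom pair at separation r, term by term.

    All default values are placeholders for illustration only.
    """
    # 12-6 dispersion/repulsion term
    vdw = W_vdw * (A / r**12 - B / r**6)
    # 12-10 hydrogen-bond term, scaled by the directional factor E(t)
    hbond = W_hbond * E_t * (C / r**12 - D / r**10)
    # Coulomb term screened by a distance-dependent dielectric
    elec = W_elec * q_i * q_j / (dielectric(r) * r)
    # Desolvation term with a Gaussian distance weighting
    desolv = W_sol * (S_i * V_j + S_j * V_i) * math.exp(-r**2 / (2 * sigma**2))
    return {"vdw": vdw, "hbond": hbond, "elec": elec, "desolv": desolv,
            "total": vdw + hbond + elec + desolv}
```

Summing `pair_energy(...)["total"]` over all interacting pairs i, j gives V in the equation above; a realistic dielectric function would be passed in through the `dielectric` argument.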
Improved H-bond Directionality
[Figure: hydrogen affinity and oxygen affinity maps around guanine and cytosine, computed with AutoGrid 3 and with AutoGrid 4.]
Huey, Goodsell, Morris, and Olson (2004) Letts. Drug Des. & Disc., 1: 178-183
Why Use Grid Maps?
Saves time:
- Pre-computing the interactions on a grid is typically 100 times faster than traditional Molecular Mechanics methods
- O(N^2) calculation becomes O(N)
AutoDock uses trilinear interpolation to compute the score of a candidate docked ligand conformation
- AutoDock needs one map for each atom type in the ligand(s) and moving parts of receptor (if there are any)
- Drawback: The receptor is conformationally rigid (although "vdW softened")
- Limits the search space
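The grid look-up can be pictured with a small Python sketch of trilinear interpolation. This is a generic textbook version, not AutoDock's own code; the function name `trilinear`, the nested-list grid layout and the clamping behaviour at the box edge are assumptions made for the example.

```python
import math

def trilinear(grid, origin, spacing, point):
    """Interpolate grid[ix][iy][iz] at a Cartesian point.

    grid    : nested lists indexed [ix][iy][iz], at least 2 points per axis
    origin  : (x, y, z) coordinates of grid point (0, 0, 0)
    spacing : distance between neighbouring grid points
    """
    # Fractional grid coordinates of the point
    f = [(p - o) / spacing for p, o in zip(point, origin)]
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    # Lower corner of the enclosing cell, clamped so corner + 1 stays in range
    # (points outside the grid are extrapolated from the nearest edge cell)
    i0 = [min(max(int(math.floor(v)), 0), n - 2) for v, n in zip(f, dims)]
    t = [v - i for v, i in zip(f, i0)]
    value = 0.0
    # Weighted sum over the 8 corners of the cell
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                value += w * grid[i0[0] + dx][i0[1] + dy][i0[2] + dz]
    return value
```

Scoring a ligand then costs one such look-up per atom in the map for that atom's type, which is where the O(N) behaviour comes from.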
Setting up the AutoGrid Box
- Macromolecule atoms in the rigid part
- Center:
  - center of ligand;
  - center of macromolecule;
  - a picked atom; or
  - typed-in x-, y- and z-coordinates.
- Grid point spacing:
  - default is 0.375 Å (from 0.2 Å to 1.0 Å).
- Number of grid points in each dimension:
  - only give even numbers (from 2 × 2 × 2 to 126 × 126 × 126).
  - AutoGrid adds one point to each dimension.
- Grid Maps depend on the orientation of the macromolecule.
- Make sure all the flexible parts of the macromolecule are inside the grid
- To make a PDB file that shows where the grid box is, use the script "makebox":
  % makebox mol.gpf > mol.gpf.box.pdb
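The box geometry implied by these settings can be checked with a short Python sketch. It assumes the box is centred on the chosen centre and that an even request of n points becomes n + 1 points, that is n intervals, along each axis; the helper names `grid_box` and `inside` are hypothetical.

```python
def grid_box(center, npts, spacing=0.375):
    """Return (min_corner, max_corner) of the grid box in Angstroms.

    center  : (x, y, z) of the box centre
    npts    : even number of points requested per dimension, 2 to 126
    spacing : grid point spacing
    """
    for n in npts:
        if n % 2 != 0 or not 2 <= n <= 126:
            raise ValueError("npts must be even and between 2 and 126")
    # n requested points become n + 1 points, so the edge spans n intervals
    half = [n * spacing / 2.0 for n in npts]
    lo = tuple(c - h for c, h in zip(center, half))
    hi = tuple(c + h for c, h in zip(center, half))
    return lo, hi

def inside(point, lo, hi):
    """True if the point lies within the box on all three axes."""
    return all(l <= p <= h for p, l, h in zip(point, lo, hi))
```

With the default spacing, a 40 × 40 × 40 request gives a cube 15 Å on a side; running `inside` over the coordinates of any flexible residue atoms is a quick way to confirm they fall within the grid.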
Relaxed Complex Method
- Lin, J. H., Perryman, A. L., Schames, J. R., and McCammon, J. A. (2002). Computational drug design accommodating receptor flexibility: The relaxed complex scheme. Journal of the American Chemical Society, 124: 5632-5633.
- McCammon, J. A. (2005). Target flexibility in molecular recognition. Biochimica et Biophysica Acta, 1754: 221-224.
- Perryman, A. L. & McCammon, J. A. (2002). AutoDocking dinucleotides to the HIV-1 integrase core domain: Exploring possible binding sites for viral and genomic DNA. J Med Chem, 45: 5624-5627.
- Schames, J. R., Henchman, R. H., Siegel, J. S., Sotriffer, C. A., Ni, H., and McCammon, J. A. (2004). Discovery of a novel binding trench in HIV integrase. J Med Chem, 47(8): 1879-1881.
Docking of the 5CITEP inhibitor to snapshots of a 2 ns HIV-1 integrase MD trajectory indicated a previously uncharacterized trench adjacent to the active site that intermittently opens. Further docking studies of novel ligands with the potential to bind to both regions showed greater selective affinity when able to bind to the trench. Our ranking of ligands is open to experimental testing, and our approach suggests a new target for HIV-1 therapeutics.
Spectrum of Search: Breadth and Level-of-Detail
Search Breadth
- Local
  - Molecular Mechanics (MM)
- Intermediate
  - Monte Carlo Simulated Annealing (MC SA)
  - Brownian Dynamics
  - Molecular Dynamics (MD)
- Global
  - Docking
Level-of-Detail
- Atom types
- Bond stretching
- Bond-angle bending
- Rotational barrier potentials
- Implicit solvation
- Polarizability
- What's rigid and what's flexible?
Two Kinds of Search
Systematic
- Exhaustive
- Deterministic
- Outcome is dependent on granularity of sampling
- Feasible only for low-dimensional problems
- e.g. DOT (6D)
Stochastic
- Random
- Outcome varies
- Must repeat the search to improve chances of success
- Feasible for bigger problems
- e.g. AutoDock
Stochastic Search Methods
- Simulated Annealing (SA)*
- Evolutionary Algorithms (EA)
  - Genetic Algorithm (GA)*
- Others
  - Tabu Search (TS)
  - Particle Swarm Optimisation (PSO)
- Hybrid Global-Local Search Methods
  - Lamarckian GA (LGA)*
*Supported in AutoDock
AutoDock has a Variety of Search Methods
- Global search algorithms:
  - Simulated Annealing (Goodsell et al. 1990)
  - Distributed SA (Morris et al. 1996)
  - Genetic Algorithm (Morris et al. 1998)
- Local search algorithm:
  - Solis & Wets (Morris et al. 1998)
- Hybrid global-local search algorithm:
  - Lamarckian GA (Morris et al. 1998)
How Simulated Annealing Works
- Ligand starts at a random (or user-specified) position/orientation/conformation ("state")
- Constant-temperature annealing cycle:
  - Ligand's state undergoes a random change.
  - Compare the energy of the new position with that of the last position (the Metropolis criterion); if it is:
    - lower, the move is accepted;
    - higher, the move is accepted if e^(-ΔE/kT) is greater than a random number drawn uniformly between 0 and 1; otherwise the current move is rejected.
  - Cycle ends when we exceed either the maximum number of accepted moves or the maximum number of rejected moves.
- Annealing temperature is reduced: T_i = g T_(i-1), with 0.85 < g < 1
- Rinse and repeat.
- Stops at the maximum number of cycles.
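The annealing loop described above can be sketched in a few lines of Python. This is a generic Metropolis annealer over an arbitrary state, not AutoDock's implementation: the function name, the default limits and the convention that `t` already stands for kT in the energy's own units are all assumptions of the example.

```python
import math
import random

def simulated_annealing(energy, perturb, state, t0=100.0, g=0.95,
                        cycles=50, max_accepted=100, max_rejected=100,
                        rng=None):
    """Minimise energy(state) by Metropolis annealing; returns (state, energy)."""
    rng = rng or random.Random()
    e = energy(state)
    best_state, best_e = state, e
    t = t0  # plays the role of kT
    for _ in range(cycles):
        accepted = rejected = 0
        # One constant-temperature cycle
        while accepted < max_accepted and rejected < max_rejected:
            new_state = perturb(state, rng)
            new_e = energy(new_state)
            delta = new_e - e
            # Metropolis criterion: always accept downhill moves, accept
            # uphill moves with probability exp(-delta / t)
            if delta <= 0 or math.exp(-delta / t) > rng.random():
                state, e = new_state, new_e
                accepted += 1
                if e < best_e:
                    best_state, best_e = state, e
            else:
                rejected += 1
        t *= g  # T_i = g * T_(i-1)
    return best_state, best_e
```

For docking, `state` would hold the position, orientation and torsions, `perturb` would apply a random change to them, and `energy` would be the grid-based score.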
How a Genetic Algorithm Works
- Start with a random population (50-300)
- Genes correspond to state variables
- Perform genetic operations
  - Crossover
    - 1-point crossover, ABCD + abcd → Abcd + aBCD
    - 2-point crossover, ABCD + abcd → AbCD + aBcd
    - uniform crossover, ABCD + abcd → AbCd + aBcD
    - arithmetic crossover, ABCD + abcd → [α ABCD + (1-α) abcd] + [(1-α) ABCD + α abcd], where 0 < α < 1
  - Mutation
    - add or subtract a random amount from randomly selected genes, A → A'
- Compute the fitness of individuals (energy evaluation)
- Proportional Selection & Elitism
- If total energy evaluations or maximum generations reached, stop
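The genetic operators listed above are easy to express on plain Python lists. The sketch below is illustrative only: the function names are made up for the example, and the mutation draws its random amount from a uniform distribution, which is an assumption rather than a statement about AutoDock's own mutation operator.

```python
import random

def one_point(p1, p2, cut):
    """Swap everything after position cut."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, a, b):
    """Swap the segment between positions a and b."""
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

def uniform(p1, p2, mask):
    """Take each gene from p1 where mask is true, otherwise from p2."""
    c1 = [x if m else y for x, y, m in zip(p1, p2, mask)]
    c2 = [y if m else x for x, y, m in zip(p1, p2, mask)]
    return c1, c2

def arithmetic(p1, p2, alpha):
    """Blend numeric genes with weight alpha, 0 < alpha < 1."""
    c1 = [alpha * x + (1 - alpha) * y for x, y in zip(p1, p2)]
    c2 = [(1 - alpha) * x + alpha * y for x, y in zip(p1, p2)]
    return c1, c2

def mutate(genes, rate, scale, rng):
    """Add or subtract a random amount from randomly selected genes."""
    return [g + rng.uniform(-scale, scale) if rng.random() < rate else g
            for g in genes]
```

Calling `one_point(list("ABCD"), list("abcd"), 1)` reproduces the Abcd + aBCD example, and `two_point(..., 1, 2)` and `uniform(..., [1, 0, 1, 0])` reproduce the other two letter patterns.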
Lamarck
Jean-Baptiste-Pierre-Antoine de Monet, Chevalier de Lamarck: pioneer French biologist who is best known for his idea that acquired traits are inheritable, an idea known as Lamarckism, which is controverted by Darwinian theory.
How a Lamarckian GA works
- Lamarckian:
  - phenotypic adaptations of an individual to its environment can be mapped to its genotype & inherited by its offspring.
- Phenotype - Atomic coordinates
- Genotype - State variables
- (1) Local search (LS) modifies the phenotype,
- (2) Inverse map phenotype to the genotype
- Solis and Wets local search
  - advantage that it does not require gradient information in order to proceed
- Rik Belew (UCSD) & William Hart (Sandia).
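To show what a gradient-free local search looks like, here is a simplified Python sketch in the spirit of Solis and Wets: random steps, a retry in the opposite direction, and a step size that grows after repeated successes and shrinks after repeated failures. The update constants, the default limits and the function name are illustrative assumptions, not the values used inside AutoDock.

```python
import random

def solis_wets(f, x, rho=1.0, max_its=300, max_succ=4, max_fail=4,
               rho_min=0.01, rng=None):
    """Gradient-free local minimisation of f starting from list x."""
    rng = rng or random.Random()
    x = list(x)
    fx = f(x)
    bias = [0.0] * len(x)
    succ = fail = 0
    for _ in range(max_its):
        # Random step around the current bias, with spread rho
        step = [rng.gauss(b, rho) for b in bias]
        cand = [xi + si for xi, si in zip(x, step)]
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
            bias = [0.2 * b + 0.4 * s for b, s in zip(bias, step)]
            succ, fail = succ + 1, 0
        else:
            # Try the opposite direction before counting a failure
            cand = [xi - si for xi, si in zip(x, step)]
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                bias = [b - 0.4 * s for b, s in zip(bias, step)]
                succ, fail = succ + 1, 0
            else:
                bias = [0.5 * b for b in bias]
                succ, fail = 0, fail + 1
        if succ >= max_succ:
            rho, succ = rho * 2.0, 0   # expand after repeated successes
        elif fail >= max_fail:
            rho, fail = rho * 0.5, 0   # contract after repeated failures
        if rho < rho_min:
            break
    return x, fx
```

In a Lamarckian step, the improved variables returned by the local search are written back into the individual's genes, so its offspring inherit the refinement.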
Important Search Parameters
Simulated Annealing
- Initial temperature (K)
  rt0 61600
- Temperature reduction factor (per cycle)
  rtrf 0.95
- Termination criteria:
  - accepted moves
    accs 25000
  - rejected moves
    rejs 25000
  - annealing cycles
    cycles 50
Genetic Algorithm & Lamarckian GA
- Population size
  ga_pop_size 300
- Crossover rate
  ga_crossover_rate 0.8
- Mutation rate
  ga_mutation_rate 0.02
- Solis & Wets local search (LGA only)
  sw_max_its 300
- Termination criteria:
  ga_num_evals 250000 # short
  ga_num_evals 2500000 # medium
  ga_num_evals 25000000 # long
  ga_num_generations 27000
Dimensionality of Molecular Docking
Degrees of Freedom (DOF)
- Position / Translation (3): x, y, z
- Orientation / Quaternion (3): qx, qy, qz, qw (normalized in 4D)
- Rotatable Bonds / Torsions (n): τ1, τ2, … τn
Dimensionality, D = 3 + 3 + n
Multidimensional Treasure Hunt
Divide each dimension of the landscape into 2 and hide the treasure ($) in one cell. Chances of finding it:
- 1 dimension: 1/2
- 2 dimensions: 1/4
- 3 dimensions: 1/8
Sampling Hyperspace
- Say we are hunting in D-dimensional hyperspace
- We want to evaluate each of the D dimensions N times.
- The number of evals needed, n, is:
  $$n = N^{D} \quad\Rightarrow\quad N = n^{1/D}$$
- For example, if n = 10^6 and
  - D = 6, N = (10^6)^{1/6} = 10 evaluations per dimension
  - D = 36, N = (10^6)^{1/36} ≈ 1.5 evaluations per dimension
- Clearly, the more dimensions, the tougher it gets.
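The arithmetic above is easy to reproduce for any ligand. The Python sketch below uses two small helper functions whose names are made up for the example; the dimensionality formula is the D = 3 + 3 + n rule used in this tutorial.

```python
def docking_dimensionality(n_torsions):
    """D = 3 (translation) + 3 (orientation) + n (torsions)."""
    return 3 + 3 + n_torsions

def evals_per_dimension(n_evals, dims):
    """N = n ** (1 / D): samples available along each dimension."""
    return n_evals ** (1.0 / dims)

if __name__ == "__main__":
    for torsions in (0, 10, 30):
        d = docking_dimensionality(torsions)
        n = evals_per_dimension(1e6, d)
        print(f"{torsions:2d} torsions, D = {d:2d}: "
              f"{n:.2f} evaluations per dimension")
```

A rigid ligand (D = 6) gets about 10 samples per dimension from a million evaluations, while a ligand with 30 torsions (D = 36) gets fewer than 2.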
Next, AutoDock
Now for some specifics about AutoDock
More information can be found in the User Guide!
AutoDock / ADT
AutoDock & AutoGrid (1990)
- Number crunching
- Command-line
- awk, shell & Python scripts
- Text editors
- C & C++, compiled
ADT (2000)
- Visualizing, set-up
- Graphical User Interface
- PMV Python
- GUI-less, self-logging & rescriptable
- Python, interpreted
Community (1991 - mid 2005)
AutoDock licenses
Papers citing AutoDock (source: Science Citation Index Expanded)
Number of Citations for Docking Programs ISI Web of Science (2005)
Sousa, S.F., Fernandes, P.A. & Ramos, M.J. (2006) Protein-Ligand Docking: Current Status and Future Challenges Proteins, 65:15-26
Trends in Citations of Docking Programs ISI Web of Science (2005)
Sousa, S.F., Fernandes, P.A. & Ramos, M.J. (2006) Protein-Ligand Docking: Current Status and Future Challenges Proteins, 65:15-26
Practical Considerations
- What problem does AutoDock solve?
  - Flexible ligands (4.0: flexible protein).
- What range of problems is feasible?
  - Depends on the search method:
    - LGA > GA >> SA >> LS
    - SA: can output trajectories, D < about 8 torsions.
    - LGA: D < about 8-32 torsions.
- When is AutoDock not suitable?
  - No 3D-structures are available;
  - Modelled structure of poor quality;
  - Too many (32 torsions, 2048 atoms, 22 atom types);
  - Target protein too flexible.
Using AutoDock: Step-by-Step
1. Set up ligand PDBQT, using ADT's "Ligand" menu.
2. OPTIONAL: Set up flexible receptor PDBQT, using ADT's "Flexible Residues" menu.
3. Set up macromolecule & grid maps, using ADT's "Grid" menu.
4. Pre-compute AutoGrid maps for all atom types in your set of ligands, using "autogrid4".
5. Perform dockings of ligand to target, using "autodock4", and in parallel if possible.
6. Visualize AutoDock results, using ADT's "Analyze" menu.
7. Cluster dockings, using the "analysis" DPF command in "autodock4", or ADT's "Analyze" menu for parallel docking results.
AutoDock 4 File Formats
Prepare the Following Input Files
- Ligand PDBQT file
- Rigid Macromolecule PDBQT file
- Flexible Macromolecule PDBQT file ("Flexres")
- AutoGrid Parameter File (GPF)
  - GPF depends on atom types in:
    - Ligand PDBQT file
    - (Optional flexible residue PDBQT files)
- AutoDock Parameter File (DPF)
Run AutoGrid 4
- Macromolecule PDBQT + GPF → Grid Maps, GLG
Run AutoDock 4
- Grid Maps + Ligand PDBQT + [Flexres PDBQT +] DPF → DLG (dockings & clustering)
Run ADT to Analyze DLG
Things you need to do before using AutoDock 4
Ligand:
- Add all hydrogens, compute Gasteiger charges, and merge non-polar H; also assign AutoDock 4 atom types
- Ensure total charge corresponds to tautomeric state
- Choose torsion tree root & rotatable bonds
Macromolecule:
- Add all hydrogens, compute Gasteiger charges, and merge non-polar H; also assign AutoDock 4 atom types
- Assign Stouten atomic solvation parameters
- Optionally, create a flexible residues PDBQT in addition to the rigid PDBQT file
- Compute AutoGrid maps
Preparing Ligands and Receptors
- AutoDock uses "United Atom" model
  - Reduces number of atoms, speeds up docking
- Need to:
  - Add polar Hs. Remove non-polar Hs.
    - Both Ligand & Macromolecule
  - Replace missing atoms (disorder).
  - Fix hydrogens at chain breaks.
- Need to consider pH:
  - Acidic & Basic residues, Histidines.
  - http://molprobity.biochem.duke.edu/
- Other molecules in receptor:
  - Waters;
  - Cofactors;
  - Metal ions.
- Molecular Modelling elsewhere.
Atom Types in AutoDock 4
- One-letter or two-letter atom type codes
- More atom types than AD3:
  - 22
- Same atom types in both ligand and receptor
- http://autodock.scripps.edu/wiki/NewFeatures
- http://autodock.scripps.edu/faqs-help/faq/how-do-i-add-new-atom-types-to-autodock-4
- http://autodock.scripps.edu/faqs-help/faq/where-do-i-set-the-autodock-4-force-field-parameters
Partial Atomic Charges are required for both Ligand and Receptor
- Partial Atomic Charges:
  - Peptides & Proteins; DNA & RNA
    - Gasteiger (PEOE) - AD4 Force Field
  - Organic compounds; Cofactors
    - Gasteiger (PEOE) - AD4 Force Field;
    - MOPAC (MNDO, AM1, PM3);
    - Gaussian (6-31G*).
- Integer total charge per residue.
- Non-polar hydrogens:
  - Always merge
Carbon Atoms can be either Aliphatic or Aromatic Atom Types
- Solvation Free Energy
  - Based on a partial-charge-dependent variant of Stouten method.
  - Treats aliphatic ("C") and aromatic ("A") carbons differently.
- Need to rename ligand aromatic "C" to "A".
- ADT determines if ligand is a peptide:
  - If so, uses a look-up dictionary.
  - If not, inspects geometry of "C"s in rings. Renames "C" to "A" if flat enough.
  - Can adjust "planarity" criterion (15° detects more rings than default 7.5°).
Defining Ligand Flexibility
- Set Root of Torsion Tree:
  - By interactively picking, or
  - Automatically.
    - Smallest "largest sub-tree".
- Interactively Pick Rotatable Bonds:
  - No "leaves";
  - No bonds in rings;
  - Can freeze:
    - Peptide/amide/selected/all;
  - Can set the number of active torsions that move either the most or the fewest atoms
Setting Up Your Environment
- At TSRI:
  - Modify .cshrc
    - Change PATH & stacksize:
      setenv PATH (/mgl/prog/$archosv/bin:/tsri/python:$path)
      % limit stacksize unlimited
  - ADT Tutorial, every time you open a Shell or Terminal, type:
    % source /tsri/python/share/bin/initadtcsh
  - To start AutoDockTools, type:
    % cd tutorial
    % adt1
Web
http://autodock.scripps.edu http://mgltools.scripps.edu
Choose the Docking Algorithm
- SA.dpf: Simulated Annealing
- GA.dpf: Genetic Algorithm
- LS.dpf: Local Search
  - Solis-Wets (SW)
  - Pseudo Solis-Wets (pSW)
- GALS.dpf: Genetic Algorithm with Local Search, i.e. Lamarckian GA
Run AutoGrid
- Check: Enough disk space?
  - Maps are ASCII, but can be ~2-8 MB!
- Start AutoGrid from the Shell:
  % autogrid4 -p mol.gpf -l mol.glg &
- Follow the log file using:
  % tail -f mol.glg
  - Type <Ctrl>-C to break out of the "tail -f" command
- Wait for "Successful Completion" before starting AutoDock, or run the two in sequence:
  % autogrid4 -p mol.gpf -l mol.glg ; autodock4 -p mol.dpf -l mol.dlg
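When the two programs are driven from a script rather than typed by hand, the "wait for completion" rule can be enforced in code. The Python sketch below is one possible wrapper, not part of AutoDock or ADT: it uses only the `-p` and `-l` options shown above, the function names are hypothetical, and it assumes that a non-zero exit status signals failure and that the log file contains the text "Successful Completion" on success.

```python
import subprocess

def build_commands(stem):
    """Command lines for AutoGrid then AutoDock, sharing one file stem."""
    return [
        ["autogrid4", "-p", stem + ".gpf", "-l", stem + ".glg"],
        ["autodock4", "-p", stem + ".dpf", "-l", stem + ".dlg"],
    ]

def log_reports_success(path):
    """True if the log file contains the 'Successful Completion' message."""
    with open(path, errors="replace") as handle:
        return "Successful Completion" in handle.read()

def run_pipeline(stem, runner=subprocess.run):
    """Run AutoGrid, then AutoDock only if AutoGrid exited cleanly."""
    for cmd in build_commands(stem):
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError("command failed: " + " ".join(cmd))
```

Calling `run_pipeline("mol")` is the scripted counterpart of the one-line shell sequence, with the difference that AutoDock is skipped if AutoGrid fails; `log_reports_success("mol.glg")` can be used as an extra check on the log.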
Run AutoDock
- Do a test docking, ~25,000 evals
- Do a full docking, if test is OK, ~250,000 to 50,000,000 evals
- From the Shell:
  % autodock4 -p yourFile.dpf -l yourFile.dlg &
- Expected time?
- Size of docking log?
- Distributed computation
  - At TSRI, Linux Clusters
    % submit.py stem 20
    % recluster.py stem 20 during 3.5
Analyzing AutoDock Results
- In ADT, you can:
  - Read & view a single DLG, or
  - Read & view many DLG results files in a single directory
  - Re-cluster docking results by conformation & view these
- Outside ADT, you can re-cluster several DLGs
  - Useful in distributed docking
    % recluster.py stem 20 [during|end] 3.5
Viewing Conformational Clusters by RMSD
- List the RMSD tolerances
  - Separated by spaces
- Histogram of conformational clusters
  - Number in cluster versus lowest energy in that cluster
- Picking a cluster
  - makes a list of the conformations in that cluster;
  - sets these to be the current sequence for the states player.
Advanced Topics
- Stochastic search methods rely on random numbers
- Random Number Generator, RNG
Random number generator
- RNG needs a seed or seeds.
  - Different seeds lead to different sequences of random numbers
- SA and GA use different RNGs
  - SA needs 1 seed
  - GA & LGA need 2 seeds
- A seed can be:
  - A long integer, say 3141529; or
  - "time" = number of seconds since 1970 Jan 1; or
  - "pid" = UNIX process ID of this job
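The effect of the seed is easy to demonstrate with Python's standard library. The sketch below is only an analogy: Python's `random` module is not the generator AutoDock uses, and the helper names `make_seed` and `sample` are invented for the example. It resolves the three kinds of seed listed above and shows that one seed always reproduces one sequence.

```python
import os
import random
import time

def make_seed(spec):
    """Resolve a seed given as 'time', 'pid' or an integer (or its string)."""
    if spec == "time":
        return int(time.time())   # seconds since 1970 Jan 1
    if spec == "pid":
        return os.getpid()        # process ID of this job
    return int(spec)

def sample(seed, n=3):
    """First n random numbers produced from the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

if __name__ == "__main__":
    print(sample(make_seed(3141529)) == sample(make_seed("3141529")))
    print(sample(3141529) == sample(3141530))
```

A fixed integer seed makes a docking repeatable, which helps when debugging; "time" and "pid" seeds give independent runs, which is what repeated stochastic searches need.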
Acknowledgments
- Ruth Huey
- William Lindstrom
- David S. Goodsell
- Michel Sanner
- Sophie Coon
- Daniel Stoffler
- Michael Pique
- Art J. Olson
- Rik Belew (UCSD)
- Bill Hart (Sandia)
- Scott Halliday
- Chris Rosin
- Max Chang
- Flavio Grynszpan (TSRI)
- Many patient ADT users