Source code for the Nature Communications paper DynamicBind: predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model.
DynamicBind recovers ligand-specific conformations from unbound protein structures (e.g. AF2-predicted structures), promoting efficient transitions between different equilibrium states.
Create environment dynamicbind:
conda create -n dynamicbind python=3.10
conda activate dynamicbind
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install torch_geometric -f https://data.pyg.org/whl/torch-2.4.0+cu121.html
pip install rdkit
pip install pyyaml biopython
pip install e3nn fair-esm spyrmsd
pip install pandas
pip install tqdm
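Once the packages above are installed, a quick import check can confirm that the dynamicbind environment is complete. The sketch below is not part of the repository: `REQUIRED_MODULES` and `missing_modules` are hypothetical names, and the list simply mirrors the packages installed above using their import names (pyyaml imports as `yaml`, biopython as `Bio`, fair-esm as `esm`).

```python
import importlib.util

# Import names for the packages installed above. Some differ from the pip
# names: pyyaml -> yaml, biopython -> Bio, fair-esm -> esm.
REQUIRED_MODULES = [
    "torch", "torch_geometric", "rdkit", "yaml", "Bio",
    "e3nn", "esm", "spyrmsd", "pandas", "tqdm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Run it with the interpreter of the dynamicbind environment. It only checks that each module can be found, not that the installed versions are compatible with each other.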
Create a second environment, relax, for structural relaxation:
conda create --name relax python=3.8
conda activate relax
pip install openmm
conda install -c conda-forge pdbfixer
pip install compilers biopython
pip install tqdm
pip install scipy
pip install pandas
pip install rdkit
pip install networkx
Download the ESM-2 checkpoints:
mkdir esm_models
cd esm_models/
mkdir checkpoints
cd checkpoints/
wget -c https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t33_650M_UR50D.pt
wget -c https://dl.fbaipublicfiles.com/fair-esm/regression/esm2_t33_650M_UR50D-contact-regression.pt
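To confirm that both ESM-2 files were downloaded, a small check such as the following can be used. This is a sketch, not part of the repository: `ESM_CHECKPOINT_FILES` and `missing_checkpoints` are hypothetical names, and the file names are taken from the two `wget` URLs above.

```python
from pathlib import Path

# File names taken from the two wget commands above.
ESM_CHECKPOINT_FILES = [
    "esm2_t33_650M_UR50D.pt",
    "esm2_t33_650M_UR50D-contact-regression.pt",
]


def missing_checkpoints(checkpoint_dir, filenames=ESM_CHECKPOINT_FILES):
    """Return expected file names that are absent or empty in checkpoint_dir."""
    directory = Path(checkpoint_dir)
    missing = []
    for name in filenames:
        path = directory / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains esm_models/.
    print(missing_checkpoints("esm_models/checkpoints"))
```

An empty list means both files are present and non-empty. The check cannot detect a partially downloaded file; rerunning `wget -c` resumes an interrupted download.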
Download and unzip the workdir.zip containing the model checkpoint from https://zenodo.org/records/10137507; v2 is available at https://zenodo.org/records/10183369.
By default, 40 poses are predicted and ranked (rank1 is the best-scoring pose, rank40 the lowest), and the relaxation step is included.
- Protein (PDB File):
  - protein.pdb - Automatically cleaned to remove non-standard amino acids, water molecules, and small molecules.
- Ligand (CSV File):
  - ligand.csv - Must contain a column named 'ligand' listing SMILES strings.
- Number of Animations:
  - --savings_per_complex, default is 40.
  - Outputs intermediate pkl data, not the final animation PDB.
- Frames in Animation/inference_steps:
  - --inference_steps, default is 20.
- --header: Name of the result folder.
- --device: GPU device ID.
- --python: Python environment for inference.
- --relax_python: Python environment for relaxation.
- --num_workers: Number of processes for final output relaxation.
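The ligand input described above is a CSV file with a 'ligand' column of SMILES strings. A minimal sketch for writing and re-reading such a file with the standard library is shown here. The helper names `write_ligand_csv` and `read_ligand_csv`, the output file name, and the two example SMILES (aspirin and ethanol) are illustrative assumptions, not part of the repository.

```python
import csv


def write_ligand_csv(path, smiles_list):
    """Write SMILES strings to a CSV file with a single 'ligand' column."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ligand"])  # column name required by the input format
        for smiles in smiles_list:
            writer.writerow([smiles])


def read_ligand_csv(path):
    """Read the 'ligand' column back, failing early if the column is missing."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        if "ligand" not in (reader.fieldnames or []):
            raise ValueError("CSV file has no 'ligand' column")
        return [row["ligand"] for row in reader]


if __name__ == "__main__":
    # Placeholder ligands: aspirin and ethanol.
    write_ligand_csv("example_ligand.csv", ["CC(=O)OC1=CC=CC=C1C(=O)O", "CCO"])
    print(read_ligand_csv("example_ligand.csv"))
```

The sketch only checks that the column exists; it does not validate that each string is a parsable SMILES.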
python run_single_protein_inference.py data/origin-1qg8.pdb data/1qg8_input.csv \
--savings_per_complex 40 \
--inference_steps 20 \
--header test --device 0 \
--python /home/ruofan/anaconda3/envs/dynamicbind/bin/python \
--relax_python /home/ruofan/anaconda3/envs/relax/bin/python \
--movie
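When many proteins or ligand files need to be processed, the command above can be assembled programmatically. The following is a sketch under stated assumptions: `build_inference_command` is a hypothetical helper, the interpreter paths are placeholders to be replaced with the paths of your own dynamicbind and relax environments, and only flags that appear in the command above are used.

```python
import shlex


def build_inference_command(
    protein_pdb,
    ligand_csv,
    header,
    device,
    python_path,
    relax_python_path,
    savings_per_complex=40,
    inference_steps=20,
    movie=False,
):
    """Assemble the argument list for run_single_protein_inference.py."""
    command = [
        "python", "run_single_protein_inference.py",
        str(protein_pdb), str(ligand_csv),
        "--savings_per_complex", str(savings_per_complex),
        "--inference_steps", str(inference_steps),
        "--header", str(header),
        "--device", str(device),
        "--python", str(python_path),
        "--relax_python", str(relax_python_path),
    ]
    if movie:
        command.append("--movie")
    return command


if __name__ == "__main__":
    cmd = build_inference_command(
        "data/origin-1qg8.pdb", "data/1qg8_input.csv",
        header="test", device=0,
        python_path="/path/to/envs/dynamicbind/bin/python",  # placeholder
        relax_python_path="/path/to/envs/relax/bin/python",  # placeholder
        movie=True,
    )
    # Print the shell-quoted command for inspection.
    print(shlex.join(cmd))
    # To launch the run instead: subprocess.run(cmd, check=True)
```

Returning a list rather than a single string lets the command be passed directly to `subprocess.run` without shell quoting issues.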