==============================
[]: # (Badges)
qmlify provides python executables for post-processing molecular mechanical (MM) receptor:ligand binding free energy calculations for the purpose of making hybrid machine learning/MM (ML/MM) free energy corrections. Current ML implementations are with Torchani's ANI2x model.
MM free energy calculations are conducted with Perses
The code is experimental and subject to change.
Install qmlify master via git clone:
git clone https://github.com/choderalab/qmlify.git
cd qmlify
python setup.py install
You will also need the latest conda release of Perses:
conda install -c omnia perses
as well as a couple of classes from coddiwomple and Arsenic:
git clone https://github.com/choderalab/coddiwomple.git
cd coddiwomple
python setup.py install
git clone https://github.com/openforcefield/Arsenic.git
cd arsenic
python setup.py install
qmlify postprocesses perses free energy calculations. You will need perses MM relative free energy calculation data before you can post-process it with qmlify
See experiments/mm_data/README.md for instructions on running perses replica exchange relative free energy calculations. A set of pre-generated perses simulation data is provided at experiments/mm_data, which are queried and analyzed in a demonstration at experiments/free_energy_corrections.ipynb.
qmlify uses bidirectional nonequilibrium (NEQ) free energy calculations to anneal from the MM thermodynamic states (afforded by perses) to the hybrid ML/MM thermodynamics states (and back).
Once perses free energy calculations are complete, you can extract and reformat the necessary perses files for the purpose of conducting a MM-to-ML/MM corrections using:
from qmlify.executables import perses_extraction_admin
ligand_indices = [(0,12),(0,7),(10,13),(10,6),(11,0),(11,14),(14,2),(14,8),(14,9),(15,14),(15,4),(1,13),(1,6),(2,7),(3,0),(3,13),(4,12),(4,13),(4,14),(4,5),(4,9),(5,0),(6,0),(8,0)] # the full set of ligand index pairs for Tyk2
perses_extraction_admin(ligand_indices,
'<location of perses data directory to extract from>',
'<location of local directory to extract to>',
phases = ['solvent', 'complex'],
delete_execution=False)
Once you have switched to the extract_to directory, execute forward NEQ:
from qmlify.executables import propagation_admin
import os
propagation_admin(ligand_indices, annealing_steps=5000, direction = 'forward', parent_dir = os.getcwd(), extraction_indices = range(0, 200, 2), phases = ['solvent', 'complex'], write_log=True, cat_outputs=False, delete_outputs=False)
followed by resampling and decorrelation at the ML/MM endstate:
propagation_admin(ligand_indices, annealing_steps=5000, direction = 'ani_endstate', parent_dir = os.getcwd(), extraction_indices = 100, phases = ['solvent', 'complex'], write_log=True, cat_outputs=True, delete_outputs=False, eq_steps=5000)
and finally, backward NEQ:
propagation_admin(ligand_indices, annealing_steps=5000, direction = 'backward', parent_dir = os.getcwd(), extraction_indices = 100, phases = ['solvent', 'complex'], write_log=True, cat_outputs=True, delete_outputs=False, eq_steps=5000)
These commands will generate .npz files containing a numpy array of cumulative reduced work values (for each ligand in each pair, for each phase, and for each trajectory launched), the last of which is the total reduced work performed on the trajectory over the protocol. Each work file has a default form of lig{i}to{j}.{phase}.{old/new}.{direction}.idx_{snapshot_index}.{annealing_steps}.works.npz where i, j are ligand index pairs, phase is either 'complex' or 'solvent', old/new denotes whether the work array corresponds to ligand i (old) or j (new), direction is 'forward' or 'backward', snapshot_index is which configuration index is annealed, and annealing_steps denotes the number of integration steps in the NEQ protocol. Forward and backward work distributions can be extracted, and the free energy correction of each phase can be computed (in kT) with the Bennett Acceptance Ratio (BAR) to find the maximum likelihood estimate of the free energy.
The work files can be queried and saved for each phase with:
import numpy as np
from qmlify.analysis import aggregate_pair_works, fully_aggregate_work_dict
work_pair_dictionary = aggregate_pair_works(ligand_indices, annealing_steps = {'complex': 5000, 'solvent':5000}) #this may take a few minutes
aggregated_work_dictionary, concatenated_work_dictionary = fully_aggregate_work_dict(work_pair_dictionary) #process the pair dictionary into something that is easier to analyze
np.savez('work_dictionaries.npz', aggregated_work_dictionary, concatenated_work_dictionary) #save into a compressed file
.See the function documentation for kwargs in qmlify.analysis for a more complete description. Once the work_dictionary.npz is generated, see experiments/free_energy_corrections.ipynb, which demonstrates how to query the work dictionary, calculate BAR free energy estimates, plot the work distributions per ligand, per phase, and perform MM-to-MM/ML free energy corrections (with plots).
Copyright (c) 2020, Chodera Lab
- dominic rufa
- Hannah E. Bruce Macdonald
- Josh Fass
- Marcus Wieder
Project based on the Computational Molecular Science Python Cookiecutter version 1.3.
If you found this repository helpful, consider citing:
@article {Rufa2020.07.29.227959,
author = {Rufa, Dominic A. and Bruce Macdonald, Hannah E. and Fass, Josh and Wieder, Marcus and Grinaway, Patrick B. and Roitberg, Adrian E. and Isayev, Olexandr and Chodera, John D.},
title = {Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning / molecular mechanics potentials},
elocation-id = {2020.07.29.227959},
year = {2020},
doi = {10.1101/2020.07.29.227959},
publisher = {Cold Spring Harbor Laboratory},
URL = {https://www.biorxiv.org/content/early/2020/07/30/2020.07.29.227959},
eprint = {https://www.biorxiv.org/content/early/2020/07/30/2020.07.29.227959.full.pdf},
journal = {bioRxiv}
}