Molecular dynamics (MD) simulations have contributed greatly to revealing the structural and functional mechanisms of many biomolecular systems. However, identifying functional states and important residues in the vast conformational space generated by MD remains challenging, so an intelligent means of navigation is highly desirable. Although deep learning has shown clear advantages in analyzing MD trajectories, its black-box nature limits its application. To address this problem, we explore an interpretable deep learning framework based on a convolutional neural network (CNN), named ICNNMD, to automatically identify diverse active states of GPCRs from MD trajectories. To avoid information loss when representing the structure of a conformation, a pixel representation is adopted, from which the CNN efficiently extracts features. More importantly, we construct an interpreter for the CNN result by approximating it locally with a linear model, through which the important residues that determine distinct active states can be quickly identified. Our model achieves 100% classification accuracy for three important GPCR systems with different functional selectivity. Notably, some important residues regulating different biased activities are successfully identified, which helps to elucidate the diverse activation mechanisms of GPCRs. Our model can serve as a general tool for analyzing MD trajectories of other biomolecular systems. All source code is freely available on GitHub to aid MD studies.
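The idea of explaining a black-box prediction by fitting a linear model around one input can be illustrated with a small NumPy sketch. This is not the ICNNMD implementation (the requirements below list the lime package for that purpose); the function name local_linear_explanation, the Gaussian perturbation scheme, and all parameters are assumptions made only for illustration.

```python
import numpy as np


def local_linear_explanation(predict, x, n_samples=500, scale=0.1,
                             kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around `x`.

    Hypothetical helper: returns one coefficient per input feature, where a
    larger magnitude suggests a larger local influence on the prediction.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)

    # Perturb the instance with small Gaussian noise to sample its neighborhood.
    noise = rng.normal(0.0, scale, size=(n_samples, x.size))
    samples = x + noise

    # Query the black-box model on every perturbed sample.
    y = np.asarray([predict(s) for s in samples], dtype=float)

    # Weight samples by closeness to the original instance.
    dist = np.linalg.norm(noise, axis=1)
    weights = np.exp(-(dist ** 2) / (kernel_width ** 2))

    # Weighted least squares with an intercept column.
    design = np.hstack([np.ones((n_samples, 1)), noise])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * sw, y * sw[:, 0], rcond=None)

    # Drop the intercept; the rest are per-feature local importances.
    return coef[1:]
```

In a setting like the one described above, predict would wrap the trained CNN's class probability and the features would correspond to pixels, which could then be mapped back to atoms and residues; that mapping is not shown here.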
ICNNMD takes MD trajectories and their topology information as input, and outputs files containing importance scores for atoms and residues. The trajectory and topology input files are in ‘nc’ and ‘pdb’ format, respectively. The overall workflow of ICNNMD is shown in the following figure.
Requirements:
TensorFlow 1.14.0
Scikit-learn
numpy
keras
lime
mdtraj
msmbuilder
xlrd
XlsxWriter
Note 1: To analyze your own MD trajectories with ICNNMD, make sure you have two different trajectories in ‘nc’ format and their topology files in ‘pdb’ format.
Note 2: The two trajectories must come from the same molecule with different properties, and must contain the same residues and atoms.
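Before running ICNNMD, it may help to confirm that the two topology files really describe the same atoms and residues. The sketch below is not part of ICNNMD; read_atoms and same_topology are hypothetical helpers that rely only on the fixed-column layout of ATOM/HETATM records in the PDB format, and they assume both files use the same atom order and residue numbering.

```python
def read_atoms(pdb_lines):
    """Collect (atom name, residue name, residue number) from PDB records."""
    atoms = []
    for line in pdb_lines:
        # Only coordinate records carry atom and residue identities.
        if line.startswith(("ATOM", "HETATM")):
            atom_name = line[12:16].strip()  # PDB columns 13-16
            res_name = line[17:20].strip()   # PDB columns 18-20
            res_seq = line[22:26].strip()    # PDB columns 23-26
            atoms.append((atom_name, res_name, res_seq))
    return atoms


def same_topology(pdb_lines_a, pdb_lines_b):
    """Return True if both inputs list identical atoms in identical order."""
    return read_atoms(pdb_lines_a) == read_atoms(pdb_lines_b)
```

Both functions accept any iterable of lines, so two open file handles (for example, for traj1.pdb and traj2.pdb) can be passed directly.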
Several command-line arguments need to be set; they are illustrated in the example below.
For example, you can run ICNNMD directly with:
python main.py --nc1_file='../data/traj1.nc' --nc2_file='../data/traj2.nc' --pdb1_file='../data/traj1.pdb' --pdb2_file='../data/traj2.pdb' --print_acc=1 --save_models=1 --print_detail=1