PoseX is a comprehensive benchmark dataset designed to evaluate molecular docking algorithms for predicting protein-ligand binding poses. It covers the construction of the Self-Docking and Cross-Docking datasets, as well as complete evaluation code for different docking tools.
Install PoseX directly from GitHub to get the latest updates.
```bash
git clone https://github.com/CataAI/PoseX.git
cd PoseX
```

We recommend using mamba to manage the Python environment. For more information on how to install mamba, see Miniforge.
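For convenience, the typical Miniforge installation on Linux/macOS looks roughly like the sketch below; check the Miniforge README for the current, platform-specific instructions.

```bash
# Download and run the Miniforge installer (provides conda and mamba)
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
```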
Once mamba is installed, run the following commands to set up the base environment.

```bash
mamba env create -f environments/base.yaml
mamba activate posex
```

For a specific molecular docking tool, use the corresponding environment file in the environments folder. Take Chai-1 as an example:
```bash
pip install -r environments/chai-1.txt
```

Download posex_set.zip and posex_cif.zip from the Dataset on HF and unpack them into the data directory:
```bash
mkdir data
unzip posex_set.zip -d data
unzip posex_cif.zip -d data
mv data/posex_set data/dataset
```

For information about creating a dataset from scratch, please refer to the data construction README.
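As an alternative to downloading the two archives through the web page, they can be fetched from the command line with the huggingface_hub CLI (`pip install huggingface_hub`). The repository id below is a placeholder; substitute the actual PoseX dataset id from the Hugging Face page linked above.

```bash
# <hf-org>/<posex-dataset> is a placeholder for the real dataset repository id
huggingface-cli download <hf-org>/<posex-dataset> posex_set.zip posex_cif.zip \
    --repo-type dataset --local-dir .
```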
This project provides a complete pipeline for the following steps (chained together in the sketch after this list):
- Generating the CSV files required for the benchmark based on the PoseX dataset
- Converting docking data into model-specific input formats
- Running different docking models or tools
- Energy minimization of molecular docking results
- Extracting and aligning model outputs
- Calculating evaluation metrics using PoseBusters
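To see how the steps fit together, here is the sequence of commands for a single dataset/model pair, taken verbatim from the step-by-step examples below (PoseX Self-Docking with AlphaFold3 and relaxation enabled; the relax `--input_dir` is the AlphaFold3 output directory used in the energy-minimization example):

```bash
# End-to-end run: PoseX Self-Docking, AlphaFold3, with relaxation
bash ./scripts/generate_docking_benchmark.sh posex_self_dock
bash ./scripts/convert_to_model_input.sh posex_self_dock alphafold3
bash ./scripts/run_alphafold3/run_alphafold3.sh posex_self_dock
bash ./scripts/extract_model_output.sh posex_self_dock alphafold3
python -m scripts.relax_model_outputs --input_dir data/benchmark/posex_self_dock/alphafold3/output --cif_dir data/posex_cif
bash ./scripts/complex_structure_alignment.sh posex_self_dock alphafold3 true
bash ./scripts/calculate_benchmark_result.sh posex_self_dock alphafold3 true
```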
Generate benchmark CSV files containing protein sequences, ligand SMILES, and other metadata:
```bash
bash ./scripts/generate_docking_benchmark.sh <dataset>
# --------------- Example --------------- #
# Astex
bash ./scripts/generate_docking_benchmark.sh astex
# PoseX Self-Docking
bash ./scripts/generate_docking_benchmark.sh posex_self_dock
# PoseX Cross-Docking
bash ./scripts/generate_docking_benchmark.sh posex_cross_dock
```

Convert benchmark CSV files to model-specific input formats:
```bash
bash ./scripts/convert_to_model_input.sh <dataset> <model_type>
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/convert_to_model_input.sh posex_self_dock alphafold3
```

Run different docking models:
```bash
bash ./scripts/run_<model_type>/run_<model_type>.sh <dataset>
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/run_alphafold3/run_alphafold3.sh posex_self_dock
```

Extract predicted structures from model outputs:
```bash
bash ./scripts/extract_model_output.sh <dataset> <model_type>
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/extract_model_output.sh posex_self_dock alphafold3
```

Run energy minimization (relaxation) on the extracted model outputs:

```bash
python -m scripts.relax_model_outputs --input_dir <input_dir> --cif_dir data/posex_cif
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
python -m scripts.relax_model_outputs --input_dir data/benchmark/posex_self_dock/alphafold3/output --cif_dir data/posex_cif
```

Align predicted structures to the reference structures:
```bash
bash ./scripts/complex_structure_alignment.sh <dataset> <model_type> <relax_mode>
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3) (Using Relax)
bash ./scripts/complex_structure_alignment.sh posex_self_dock alphafold3 true
```

Calculate evaluation metrics using PoseBusters:
```bash
bash ./scripts/calculate_benchmark_result.sh <dataset> <model_type> <relax_mode>
# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3) (Using Relax)
bash ./scripts/calculate_benchmark_result.sh posex_self_dock alphafold3 true
```

We provide the results here.
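For reference, PoseBusters can also be run directly on an individual predicted pose outside this pipeline. The sketch below uses its command-line tool with placeholder file paths; see the PoseBusters documentation for the exact options.

```bash
# Placeholder paths: predicted pose, reference crystal ligand, and receptor structure
bust pred_ligand.sdf -l crystal_ligand.sdf -p protein.pdb --outfmt csv
```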
- Code: Licensed under the MIT License.
- Dataset: Licensed under Creative Commons Attribution 4.0 International (CC-BY 4.0). See PoseX Dataset for details.
If you are interested in our work or use our data and code, please cite the following article:
@misc{jiang2025posexaidefeatsphysics,
title={PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking},
author={Yize Jiang and Xinze Li and Yuanyuan Zhang and Jin Han and Youjun Xu and Ayush Pandit and Zaixi Zhang and Mengdi Wang and Mengyang Wang and Chong Liu and Guang Yang and Yejin Choi and Wu-Jun Li and Tianfan Fu and Fang Wu and Junhong Liu},
year={2025},
eprint={2505.01700},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.01700},
}