CataAI/PoseX

PoseX: AI Defeats Physics-based Methods on Protein Ligand Cross-Docking

arXiv | License: MIT | Project Status: Active (the project has reached a stable, usable state and is being actively developed)

Dataset on HF

PoseX is a comprehensive benchmark dataset for evaluating how well molecular docking algorithms predict protein-ligand binding poses. This repository contains the construction process for the Self-Docking and Cross-Docking datasets, as well as complete evaluation code for the different docking tools.

Online Leaderboard

Dataset Repository


PoseX Self-Docking Result

 

PoseX Cross-Docking Result

 


Installation

Install PoseX directly from GitHub to get the latest updates.

git clone https://github.com/CataAI/PoseX.git
cd PoseX

We recommend using mamba to manage the Python environment. For more information on how to install mamba, see Miniforge. Once mamba is installed, we can run the following command to install the basic environment.

mamba create -f environments/base.yaml
mamba activate posex

For a specific molecular docking tool, we can use the corresponding environment file in the environments folder. Take Chai-1 as an example:

pip install -r environments/chai-1.txt

Benchmark Data

Download posex_set.zip and posex_cif.zip from the dataset on Hugging Face (Dataset on HF), then unpack them into a data directory:

mkdir data
unzip posex_set.zip -d data 
unzip posex_cif.zip -d data
mv data/posex_set data/dataset
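
Before moving on, it can help to confirm that the archives unpacked into the expected places. The following is a minimal Python sketch, not part of PoseX: the function name `missing_data_dirs` is hypothetical, and it assumes that posex_cif.zip unpacks to a top-level `posex_cif` folder (the later `--cif_dir data/posex_cif` argument suggests this, but the archive layout is not stated here).

```python
from pathlib import Path

# Directory names inferred from the commands above (assumption, adjust if your
# archives unpack differently): data/dataset and data/posex_cif.
EXPECTED_DIRS = ("dataset", "posex_cif")


def missing_data_dirs(data_root="data"):
    """Return the expected subdirectories that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing under data/:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

Run it from the repository root; an empty result means both folders are in place.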

For information about creating a dataset from scratch, please refer to the data construction README.

Benchmark Pipeline

This project provides a complete pipeline for:

  • Generating the CSV files required for the benchmark from the PoseX dataset
  • Converting docking data into model-specific input formats
  • Running different docking models or tools
  • Minimizing the energy of molecular docking results
  • Extracting and aligning model outputs
  • Calculating evaluation metrics using PoseBusters

1. Generate Benchmark CSV Data

Generate benchmark CSV files containing protein sequences, ligand SMILES, and other metadata:

bash ./scripts/generate_docking_benchmark.sh <dataset>

# --------------- Example --------------- #
# Astex
bash ./scripts/generate_docking_benchmark.sh astex
# PoseX Self-Docking
bash ./scripts/generate_docking_benchmark.sh posex_self_dock
# PoseX Cross-Docking
bash ./scripts/generate_docking_benchmark.sh posex_cross_dock

2. Convert to Model Inputs

Convert benchmark CSV files to model-specific input formats:

bash ./scripts/convert_to_model_input.sh <dataset> <model_type>

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/convert_to_model_input.sh posex_self_dock alphafold3

3. Run Docking Models

Run different docking models:

bash ./scripts/run_<model_type>/run_<model_type>.sh <dataset>

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/run_alphafold3/run_alphafold3.sh posex_self_dock
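
Because the run scripts follow the `scripts/run_<model_type>/run_<model_type>.sh` pattern, a checkout can be scanned to see which model types have a run script available. This is a minimal Python sketch under that assumption; the helper names `run_script_path` and `available_models` are hypothetical and do not exist in PoseX.

```python
from pathlib import Path


def run_script_path(model_type, scripts_dir="scripts"):
    """Build the run-script path following the run_<model_type> pattern."""
    return Path(scripts_dir) / f"run_{model_type}" / f"run_{model_type}.sh"


def available_models(scripts_dir="scripts"):
    """List model types whose run script exists under scripts_dir."""
    models = []
    for folder in sorted(Path(scripts_dir).glob("run_*")):
        # Strip the "run_" prefix to recover the model type from the folder name.
        model_type = folder.name[len("run_"):]
        if run_script_path(model_type, scripts_dir).is_file():
            models.append(model_type)
    return models


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    for name in available_models():
        print(name)
```

Folders that match `run_*` but lack the corresponding script are skipped, so the output lists only model types that can be launched with the command shown above.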

4. Extract Model Outputs

Extract predicted structures from model outputs:

bash ./scripts/extract_model_output.sh <dataset> <model_type>

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
bash ./scripts/extract_model_output.sh posex_self_dock alphafold3

5. Energy Minimization

Relax the model outputs by energy minimization:

python -m scripts.relax_model_outputs --input_dir <input_dir> --cif_dir data/posex_cif

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3)
python -m scripts.relax_model_outputs --input_dir data/benchmark/posex_self_dock/alphafold3/output --cif_dir data/posex_cif

6. Align Predicted Structures

Align predicted structures to the reference structures:

bash ./scripts/complex_structure_alignment.sh <dataset> <model_type> <relax_mode>

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3) (Using Relax)
bash ./scripts/complex_structure_alignment.sh posex_self_dock alphafold3 true

7. Calculate Benchmark Result

Calculate evaluation metrics using PoseBusters:

bash ./scripts/calculate_benchmark_result.sh <dataset> <model_type> <relax_mode>

# --------------- Example --------------- #
# PoseX Self-Docking (AlphaFold3) (Using Relax)
bash ./scripts/calculate_benchmark_result.sh posex_self_dock alphafold3 true
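
The seven steps above can be chained for one dataset and model. The sketch below only builds the command lines in order; it is not part of PoseX, and the function name `pipeline_commands` is hypothetical. It makes two assumptions that the examples suggest but do not confirm: that the relax input directory follows `data/benchmark/<dataset>/<model_type>/output`, and that `false` is the accepted value of `<relax_mode>` when relaxation is skipped.

```python
import shlex


def pipeline_commands(dataset, model_type, relax=True,
                      bench_root="data/benchmark", cif_dir="data/posex_cif"):
    """Return the benchmark commands, in order, as argument lists."""
    # Assumption: "false" is the value the scripts expect when relax is off.
    relax_mode = "true" if relax else "false"
    cmds = [
        ["bash", "./scripts/generate_docking_benchmark.sh", dataset],
        ["bash", "./scripts/convert_to_model_input.sh", dataset, model_type],
        ["bash", f"./scripts/run_{model_type}/run_{model_type}.sh", dataset],
        ["bash", "./scripts/extract_model_output.sh", dataset, model_type],
    ]
    if relax:
        # Assumption: output layout generalizes from the AlphaFold3 example.
        input_dir = f"{bench_root}/{dataset}/{model_type}/output"
        cmds.append(["python", "-m", "scripts.relax_model_outputs",
                     "--input_dir", input_dir, "--cif_dir", cif_dir])
    cmds.append(["bash", "./scripts/complex_structure_alignment.sh",
                 dataset, model_type, relax_mode])
    cmds.append(["bash", "./scripts/calculate_benchmark_result.sh",
                 dataset, model_type, relax_mode])
    return cmds


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in pipeline_commands("posex_self_dock", "alphafold3"):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to try; each list can be passed to `subprocess.run(cmd, check=True)` from the repository root once the commands look right.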

We provide the results here.

Acknowledgements

| Method | Pub. Year | License |
|---|---|---|
| Physics-based methods | | |
| Discovery Studio | late 1990s | Commercial |
| Schrödinger Glide | 2004 | Commercial |
| MOE | 2008 | Commercial |
| AutoDock Vina | 2010, 2021 | Apache-2.0 |
| GNINA | 2021 | Apache-2.0 |
| AI docking methods | | |
| DeepDock | 2021 | MIT |
| EquiBind | 2022 | MIT |
| TankBind | 2022 | MIT |
| DiffDock | 2022 | MIT |
| Uni-Mol | 2024 | MIT |
| FABind | 2023 | MIT |
| DiffDock-L | 2024 | MIT |
| DiffDock-Pocket | 2024 | MIT |
| DynamicBind | 2024 | MIT |
| Interformer | 2024 | Apache-2.0 |
| SurfDock | 2024 | MIT |
| AI co-folding methods | | |
| NeuralPLexer | 2024 | BSD |
| RoseTTAFold-All-Atom | 2023 | BSD |
| AlphaFold3 | 2024 | CC-BY-NC-SA 4.0 |
| Chai-1 | 2024 | Apache-2.0 |
| Boltz-1 | 2024 | MIT |
| Boltz-1x | 2025 | MIT |
| Protenix | 2025 | Apache-2.0 |

License

Citations

If you are interested in our work or use our data and code, please cite the following article:

@misc{jiang2025posexaidefeatsphysics,
      title={PoseX: AI Defeats Physics Approaches on Protein-Ligand Cross Docking}, 
      author={Yize Jiang and Xinze Li and Yuanyuan Zhang and Jin Han and Youjun Xu and Ayush Pandit and Zaixi Zhang and Mengdi Wang and Mengyang Wang and Chong Liu and Guang Yang and Yejin Choi and Wu-Jun Li and Tianfan Fu and Fang Wu and Junhong Liu},
      year={2025},
      eprint={2505.01700},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.01700}, 
}
