Template-based protein–ligand pose prediction with command-line interface and web application.
TEMPL is a template-based method for rapid protein–ligand pose prediction that leverages ligand similarity and template superposition. The method uses maximal common substructure (MCS) alignment and constrained conformer generation (ETKDG v3) for pose generation within known chemical space.
Key Features:
- Template-based pose prediction using ligand similarity
- Alignment driven by maximal common substructure (MCS)
- Shape and pharmacophore scoring for pose selection
- Built-in benchmarks (Polaris, time-split PDBbind)
- CPU/GPU adaptive
- Optimized for rapid pose prediction within known chemical space. Performance may be limited for novel scaffolds, allosteric sites, or targets with insufficient template coverage.
git clone https://github.com/fulopjoz/templ-pipeline
cd templ-pipeline
source setup_templ_env.sh

The setup script automatically:
- Detects hardware configuration
- Creates the .templ virtual environment
- Installs dependencies with uv
- Downloads required datasets from Zenodo
- Verifies installation
For future sessions, activate the environment:
source .templ/bin/activate

TEMPL requires pre-computed embeddings and ligand structures that are automatically downloaded during setup from Zenodo:
- templ_protein_embeddings_v1.0.0.npz (~90MB) - Pre-computed ESM-2 protein embeddings for 18,902 PDBBind structures
- templ_processed_ligands_v1.0.0.sdf.gz (~10MB) - Processed ligand molecules
Related Zenodo Datasets:
- TEMPL Pipeline Core Dataset - Essential data files for pipeline operation
- TEMPL Pipeline Benchmark Results Dataset - PDBBind timesplit and Polaris benchmark results
See data/README.md for directory layout and the provided benchmark splits under data/splits/.
For benchmarking, download the following freely available v2020 subsets from the official PDBbind website:
- Protein-ligand complexes: The general set minus refined set (1.8 GB) → PDBbind_v2020_other_PL
- Protein-ligand complexes: The refined set (658 MB) → PDBbind_v2020_refined
After downloading, extract both folders into data/PDBBind/ using the standard directory structure.
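Before starting a benchmark, it can help to confirm that both subsets were extracted where the text above says they should be. The following is a minimal sketch, not part of TEMPL: the function name `missing_pdbbind_subsets` is hypothetical, and it only checks that the two top-level folders exist under data/PDBBind/, since the README does not spell out the layout inside them.

```python
from pathlib import Path
from typing import List

# Folder names as listed in the download instructions above.
EXPECTED_SUBSETS = ("PDBbind_v2020_other_PL", "PDBbind_v2020_refined")


def missing_pdbbind_subsets(root: Path = Path("data/PDBBind")) -> List[str]:
    """Return the expected subset folders that are absent under root.

    Hypothetical helper: it checks only for the two top-level folders,
    not for the per-complex files inside them.
    """
    return [name for name in EXPECTED_SUBSETS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_pdbbind_subsets()
    if missing:
        print("Missing PDBbind subsets:", ", ".join(missing))
    else:
        print("Both PDBbind v2020 subsets found.")
```

An empty list means both folders are in place; anything else names the subset that still needs to be downloaded or extracted.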
.
├── setup_templ_env.sh # One-shot environment setup
├── pyproject.toml # Packaging and dependencies
├── data/ # Embeddings, ligands, splits (see data/README.md)
├── templ_pipeline/ # Main Python package and CLI
├── scripts/ # Helper entry points (UI launcher, tests, benchmarks)
├── output/ # Pipeline run outputs (templ_run_...)
├── benchmarks/ # Benchmark workspaces and archives
├── tests/ # Unit, integration, performance tests
├── deploy/ # Docker and Kubernetes assets
├── docs/ # Additional documentation
├── diagrams/ # Architecture and flow diagrams
├── tools/ # Dev configs (pytest, workspace)
└── README.md # This file
- templ_pipeline/core/README.md - Module-by-module overview of the core scripts that implement the workflow in Figure 1.
- templ_pipeline/core/templ_demo.ipynb - Executable notebook that reproduces the end-to-end Panel A → Panel B workflow using the bundled example data.
# Basic pose prediction
templ run --protein-file protein.pdb --ligand-smiles "C1CC(=O)N(C1)CC(=O)N"
# Using PDB ID
templ run --protein-pdb-id 1iky --ligand-smiles "C1CC(=O)N(C1)CC(=O)N"
# Using SDF file
templ run --protein-file protein.pdb --ligand-file ligand.sdf
# Show available commands
templ --help

Hosted app: templ.dyn.cloud.e-infra.cz
python scripts/run_streamlit_app.py

- The launcher picks the first free port starting at 8501. Override via PORT or TEMPL_PORT_START.
- It prints both Local and Network URLs; the app listens on 0.0.0.0 for LAN access.
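The port selection described above can be pictured with a short sketch. This is an illustration only, not the launcher's actual code: the names `first_free_port`, `resolve_port`, and `max_tries` are hypothetical, and the precedence between the two environment variables (PORT pins an exact port, TEMPL_PORT_START moves the start of the scan) is an assumption the README does not confirm.

```python
import os
import socket


def first_free_port(start: int = 8501, host: str = "0.0.0.0", max_tries: int = 100) -> int:
    """Return the first port >= start that can be bound on host."""
    for port in range(start, start + max_tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken; try the next one
            return port
    raise RuntimeError(f"no free port in {start}..{start + max_tries - 1}")


def resolve_port(env=os.environ) -> int:
    """Pick a port from the environment (assumed precedence, see lead-in)."""
    if env.get("PORT"):
        return int(env["PORT"])  # assumption: PORT is used as-is, without scanning
    return first_free_port(int(env.get("TEMPL_PORT_START", "8501")))
```

Binding to 0.0.0.0 mirrors the LAN-accessible behaviour noted above; pass `host="127.0.0.1"` to probe the loopback interface only.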
# Full benchmark
templ benchmark polaris
templ benchmark time-split # PDBBind dataset
# Partial benchmark
templ benchmark time-split --test-only

- Individual runs are written to output/ as templ_run_YYYYMMDD_HHMMSS_[pdbid]/
- Benchmark artifacts are saved under benchmarks/[suite]/...
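For scripts that collect results from output/, the run-folder naming pattern above can be built and parsed as follows. This is a minimal sketch under stated assumptions: `run_dir` and `parse_run_dir` are hypothetical helpers, not TEMPL functions, and the PDB ID is assumed to be appended exactly as given.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional, Tuple

# Matches templ_run_YYYYMMDD_HHMMSS_[pdbid] as described above.
RUN_DIR_RE = re.compile(r"^templ_run_(\d{8}_\d{6})_(.+)$")


def run_dir(pdb_id: str, root: Path = Path("output"), now: Optional[datetime] = None) -> Path:
    """Build the folder path for a run started at `now` (defaults to the current time)."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return root / f"templ_run_{stamp}_{pdb_id}"


def parse_run_dir(name: str) -> Optional[Tuple[datetime, str]]:
    """Split a run-folder name into (timestamp, pdb_id); None if it does not match."""
    match = RUN_DIR_RE.match(name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S"), match.group(2)
```

Because the timestamp is zero-padded and comes first, sorting the folder names alphabetically also sorts the runs chronologically.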
| Command | Description |
|---|---|
| templ run | Complete pipeline (recommended) |
| templ embed | Generate protein embeddings |
| templ find-templates | Find similar protein templates |
| templ generate-poses | Generate ligand poses |
For detailed command options, run templ --help or templ <command> --help.
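To script many predictions, the `templ run` invocations from the usage examples can be assembled programmatically. The sketch below uses only the four flags that appear in this README; the helper name `templ_run_cmd` is hypothetical, and the rule that exactly one protein input and one ligand input must be given is an assumption about the CLI, not documented behaviour.

```python
import shlex
from typing import List, Optional


def templ_run_cmd(
    smiles: Optional[str] = None,
    ligand_file: Optional[str] = None,
    protein_file: Optional[str] = None,
    pdb_id: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `templ run` from the flags shown in this README."""
    # Assumption: the CLI expects exactly one protein source and one ligand source.
    if (protein_file is None) == (pdb_id is None):
        raise ValueError("give exactly one of protein_file or pdb_id")
    if (smiles is None) == (ligand_file is None):
        raise ValueError("give exactly one of smiles or ligand_file")
    cmd = ["templ", "run"]
    cmd += ["--protein-file", protein_file] if protein_file else ["--protein-pdb-id", pdb_id]
    cmd += ["--ligand-smiles", smiles] if smiles else ["--ligand-file", ligand_file]
    return cmd


# Print a shell-safe command line; SMILES strings contain characters the shell treats specially.
print(shlex.join(templ_run_cmd(smiles="C1CC(=O)N(C1)CC(=O)N", pdb_id="1iky")))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with parentheses and other special characters in SMILES strings.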
We welcome contributions! Please see our Contributing Guidelines for details on how to submit pull requests, report issues, and contribute to the project.
For questions or discussions, please use GitHub Discussions.
Jozef Fülöp
CZ-OPENSCREEN, Department of Informatics and Chemistry, Faculty of Chemical Technology
University of Chemistry and Technology, Prague
Martin Šícho
CZ-OPENSCREEN, Department of Informatics and Chemistry, Faculty of Chemical Technology
University of Chemistry and Technology, Prague
Wim Dehaen
CZ-OPENSCREEN, Department of Informatics and Chemistry & Department of Organic Chemistry
University of Chemistry and Technology, Prague
📧 [email protected]
For the research paper describing the method, please cite:
@article{fulop2025templ,
title={TEMPL: A Template-Based Protein--Ligand Pose Prediction Baseline},
author={Fülöp, Jozef and Šícho, Martin and Dehaen, Wim},
journal={Journal of Chemical Information and Modeling},
year={2025},
publisher={American Chemical Society},
doi={10.1021/acs.jcim.5c01985},
url={https://doi.org/10.1021/acs.jcim.5c01985},
note={Published as part of the special issue "Open Science and Blind Data: The Antiviral Discovery Challenge"}
}

If you use TEMPL in your research, please cite the software:
@software{templ2025,
title={TEMPL: Template-based protein-ligand pose prediction},
author={Fülöp, Jozef and Šícho, Martin and Dehaen, Wim},
institution={University of Chemistry and Technology, Prague},
url={https://github.com/fulopjoz/templ-pipeline},
doi={10.5281/zenodo.16890956},
year={2025},
version={1.0.0}
}

J.F., M.Š. and W.D. were supported by the Ministry of Education, Youth and Sports of the Czech Republic – National Infrastructure for Chemical Biology (CZ-OPENSCREEN, LM2023052). W.D. was supported by the Ministry of Education, Youth and Sports of the Czech Republic by the project "New Technologies for Translational Research in Pharmaceutical Sciences/NETPHARM", project ID CZ.02.01.01/00/22_008/0004607, cofunded by the European Union. Computational resources were provided by the e-INFRA CZ project (ID:90254), supported by the Ministry of Education, Youth and Sports of the Czech Republic.
This project uses dual licensing:
- Software Code: Licensed under the MIT License
- Data, Documentation, and Research Outputs: Licensed under CC BY 4.0
- Third-party Data: Polaris benchmark datasets are CC0-1.0 (public domain)
The software components use the permissive MIT License for maximum reusability. Research data and documentation are shared under the Creative Commons Attribution 4.0 International License, which requires attribution on reuse.