
fulopjoz/templ-pipeline


TEMPL Pipeline

Requires Python 3.12+. Licensed under MIT (software) and CC BY 4.0 (research data and documentation).

Template-based protein–ligand pose prediction with command-line interface and web application.


Overview

TEMPL is a template-based method for rapid protein–ligand pose prediction that leverages ligand similarity and template superposition. The method uses maximal common substructure (MCS) alignment and constrained conformer generation (ETKDG v3) for pose generation within known chemical space.

Key Features:

  • Template-based pose prediction using ligand similarity
  • Alignment driven by maximal common substructure (MCS)
  • Shape and pharmacophore scoring for pose selection
  • Built-in benchmarks (Polaris, time-split PDBbind)
  • Adapts to the available CPU/GPU hardware

⚠️ Scope:

  • Optimized for rapid pose prediction within known chemical space. Performance may be limited for novel scaffolds, allosteric sites, or targets with insufficient template coverage.

Installation

Installation (one-time)

git clone https://github.com/fulopjoz/templ-pipeline
cd templ-pipeline
source setup_templ_env.sh

The setup script automatically:

  • Detects hardware configuration
  • Creates the .templ virtual environment
  • Installs dependencies with uv
  • Downloads required datasets from Zenodo
  • Verifies installation

For future sessions, activate the environment:

source .templ/bin/activate
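
If a later command fails with import errors, a quick check of the active interpreter can help. The sketch below is not part of TEMPL; it only assumes that the environment folder is named .templ, as stated above, and that Python 3.12+ is required. The function names are hypothetical.

```python
import sys
from pathlib import Path


def in_templ_env(prefix: str = sys.prefix, base_prefix: str = sys.base_prefix) -> bool:
    """True when running inside a virtual environment whose folder is `.templ`."""
    # Inside a virtual environment sys.prefix differs from sys.base_prefix.
    return prefix != base_prefix and Path(prefix).name == ".templ"


def python_ok(version=sys.version_info) -> bool:
    """True when the interpreter is Python 3.12 or newer."""
    return tuple(version[:2]) >= (3, 12)


if __name__ == "__main__":
    print("inside .templ environment:", in_templ_env())
    print("Python 3.12+:", python_ok())
```

If the first line prints False, run the activation command above in the same shell and try again.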

Data Requirements

TEMPL requires pre-computed embeddings and ligand structures that are automatically downloaded during setup from Zenodo:

  • templ_protein_embeddings_v1.0.0.npz (~90 MB) - Pre-computed ESM-2 protein embeddings for 18,902 PDBbind structures
  • templ_processed_ligands_v1.0.0.sdf.gz (~10 MB) - Processed ligand molecules

Related Zenodo Datasets:

See data/README.md for directory layout and the provided benchmark splits under data/splits/.

PDBbind Dataset

For benchmarking, download the following freely available v2020 subsets from the official PDBbind website:

  1. Protein-ligand complexes: The general set minus refined set (1.8 GB) → PDBbind_v2020_other_PL

  2. Protein-ligand complexes: The refined set (658 MB) → PDBbind_v2020_refined

After downloading, extract both folders into data/PDBBind/ using the standard directory structure.
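
Before starting a benchmark it can be useful to confirm that the data described above is in place. The following is a minimal sketch, not a TEMPL utility: the file and folder names come from this README, but where the Zenodo files sit inside data/ is not stated here, so they are searched for recursively. The helper name missing_data is hypothetical.

```python
from pathlib import Path

# Names taken from this README. Their exact location inside data/ is an
# assumption, so the Zenodo files are looked up recursively.
ZENODO_FILES = (
    "templ_protein_embeddings_v1.0.0.npz",
    "templ_processed_ligands_v1.0.0.sdf.gz",
)
PDBBIND_FOLDERS = (
    "PDBbind_v2020_other_PL",
    "PDBbind_v2020_refined",
)


def missing_data(data_dir: Path, need_pdbbind: bool = False) -> list[str]:
    """Return the expected files/folders that were not found under data_dir."""
    missing = [name for name in ZENODO_FILES if not any(data_dir.rglob(name))]
    if need_pdbbind:
        # Assumes both PDBbind folders sit directly under data/PDBBind/.
        pdbbind_root = data_dir / "PDBBind"
        missing += [
            f"PDBBind/{name}"
            for name in PDBBIND_FOLDERS
            if not (pdbbind_root / name).is_dir()
        ]
    return missing


if __name__ == "__main__":
    gaps = missing_data(Path("data"), need_pdbbind=True)
    print("All data found." if not gaps else f"Missing: {', '.join(gaps)}")
```

Run it from the repository root. PDBbind is only needed for benchmarking, so need_pdbbind defaults to False.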


Project Structure

.
├── setup_templ_env.sh        # One-shot environment setup
├── pyproject.toml            # Packaging and dependencies
├── data/                     # Embeddings, ligands, splits (see data/README.md)
├── templ_pipeline/           # Main Python package and CLI
├── scripts/                  # Helper entry points (UI launcher, tests, benchmarks)
├── output/                   # Pipeline run outputs (templ_run_...)
├── benchmarks/               # Benchmark workspaces and archives
├── tests/                    # Unit, integration, performance tests
├── deploy/                   # Docker and Kubernetes assets
├── docs/                     # Additional documentation
├── diagrams/                 # Architecture and flow diagrams
├── tools/                    # Dev configs (pytest, workspace)
└── README.md                 # This file

Usage

Core Workflow Documentation

Command Line Interface

# Basic pose prediction
templ run --protein-file protein.pdb --ligand-smiles "C1CC(=O)N(C1)CC(=O)N"

# Using PDB ID
templ run --protein-pdb-id 1iky --ligand-smiles "C1CC(=O)N(C1)CC(=O)N"

# Using SDF file
templ run --protein-file protein.pdb --ligand-file ligand.sdf

# Show available commands
templ --help
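
For scripted use, the commands above can be assembled from Python. This is a hedged sketch that only uses the four flags shown in this section; the wrapper functions are hypothetical, and the rule that exactly one protein source and one ligand source must be given is an assumption of the wrapper, not documented CLI behavior.

```python
import shlex
import subprocess


def build_templ_run(*, protein_file=None, protein_pdb_id=None,
                    ligand_smiles=None, ligand_file=None):
    """Build the argv list for `templ run` from the flags shown above."""
    # Assumption: one protein source and one ligand source per call.
    if (protein_file is None) == (protein_pdb_id is None):
        raise ValueError("give exactly one of protein_file or protein_pdb_id")
    if (ligand_smiles is None) == (ligand_file is None):
        raise ValueError("give exactly one of ligand_smiles or ligand_file")
    argv = ["templ", "run"]
    if protein_file is not None:
        argv += ["--protein-file", str(protein_file)]
    else:
        argv += ["--protein-pdb-id", protein_pdb_id]
    if ligand_smiles is not None:
        argv += ["--ligand-smiles", ligand_smiles]
    else:
        argv += ["--ligand-file", str(ligand_file)]
    return argv


def run_templ(argv):
    """Run the command; needs the activated .templ environment on PATH."""
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    cmd = build_templ_run(protein_pdb_id="1iky",
                          ligand_smiles="C1CC(=O)N(C1)CC(=O)N")
    print(shlex.join(cmd))  # inspect first, then call run_templ(cmd)
```

Passing the arguments as a list avoids shell quoting problems with SMILES strings, which often contain parentheses and other shell metacharacters.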

Web Interface

Hosted app: templ.dyn.cloud.e-infra.cz

python scripts/run_streamlit_app.py

Notes

  • The launcher picks the first free port starting at 8501. Override via PORT or TEMPL_PORT_START.
  • It prints both Local and Network URLs; the app listens on 0.0.0.0 for LAN access.
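
The port selection described in the notes can be pictured roughly as follows. This is an illustrative sketch, not the launcher's actual code: the function names are hypothetical, and the split between PORT (taken as a fixed port) and TEMPL_PORT_START (taken as the start of the scan) is an assumption.

```python
import os
import socket


def first_free_port(start: int = 8501, host: str = "0.0.0.0",
                    max_tries: int = 100) -> int:
    """Return the first port >= start that can currently be bound on host."""
    stop = min(start + max_tries, 65536)  # stay inside the valid port range
    for port in range(start, stop):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # in use or not permitted; try the next port
            return port
    raise RuntimeError(f"no free port found in {start}..{stop - 1}")


def choose_port(env=os.environ) -> int:
    """Pick a port from the environment, falling back to a scan from 8501."""
    # Assumed semantics: PORT pins one port, TEMPL_PORT_START moves the scan.
    if env.get("PORT"):
        return int(env["PORT"])
    return first_free_port(int(env.get("TEMPL_PORT_START", "8501")))
```

A probe like this is inherently racy, since another process can claim the port between the check and the server start, so a launcher still has to handle a bind failure.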

Benchmarking

# Full benchmark
templ benchmark polaris
templ benchmark time-split # PDBbind dataset

# Partial benchmark
templ benchmark time-split --test-only

Outputs

  • Individual runs are written to output/ as templ_run_YYYYMMDD_HHMMSS_[pdbid]/
  • Benchmark artifacts are saved under benchmarks/[suite]/...

API Reference

Command               Description
templ run             Complete pipeline (recommended)
templ embed           Generate protein embeddings
templ find-templates  Find similar protein templates
templ generate-poses  Generate ligand poses

For detailed command options, run templ --help or templ <command> --help.
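
To give a feel for what an embedding-based template lookup involves, here is a toy sketch. It is not TEMPL's implementation: this README does not state which similarity measure templ find-templates uses, so cosine similarity is an assumption, and the two-dimensional vectors and the identifiers "aaaa", "bbbb" and "cccc" are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_templates(query, library, k=3):
    """Rank library entries (id -> embedding) by similarity to the query."""
    scored = [(cosine(query, emb), pdb_id) for pdb_id, emb in library.items()]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(pdb_id, score) for score, pdb_id in scored[:k]]


if __name__ == "__main__":
    # Toy data only; real ESM-2 embeddings have far more dimensions.
    library = {"aaaa": [1.0, 0.0], "bbbb": [0.0, 1.0], "cccc": [1.0, 1.0]}
    for pdb_id, score in top_templates([1.0, 0.1], library, k=2):
        print(f"{pdb_id}  {score:.3f}")
```

In practice the vectors would come from the pre-computed .npz file and the comparison would be vectorized with NumPy; the pure-Python version only keeps the example dependency-free.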


Contributing

We welcome contributions! Please see our Contributing Guidelines for details on how to submit pull requests, report issues, and contribute to the project.

For questions or discussions, please use GitHub Discussions.


Authors

Jozef Fülöp, CZ-OPENSCREEN, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague

Martin Šícho, CZ-OPENSCREEN, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague

Wim Dehaen, CZ-OPENSCREEN, Department of Informatics and Chemistry & Department of Organic Chemistry, University of Chemistry and Technology, Prague


Citation

For the research paper describing the method, please cite:

@article{fulop2025templ,
  title={TEMPL: A Template-Based Protein--Ligand Pose Prediction Baseline},
  author={Fülöp, Jozef and Šícho, Martin and Dehaen, Wim},
  journal={Journal of Chemical Information and Modeling},
  year={2025},
  publisher={American Chemical Society},
  doi={10.1021/acs.jcim.5c01985},
  url={https://doi.org/10.1021/acs.jcim.5c01985},
  note={Published as part of the special issue "Open Science and Blind Data: The Antiviral Discovery Challenge"}
}

If you use TEMPL in your research, please cite the software:

@software{templ2025,
  title={TEMPL: Template-based protein-ligand pose prediction},
  author={Fülöp, Jozef and Šícho, Martin and Dehaen, Wim},
  institution={University of Chemistry and Technology, Prague},
  url={https://github.com/fulopjoz/templ-pipeline},
  doi={10.5281/zenodo.16890956},
  year={2025},
  version={1.0.0}
}

Acknowledgement

J.F., M.Š. and W.D. were supported by the Ministry of Education, Youth and Sports of the Czech Republic – National Infrastructure for Chemical Biology (CZ-OPENSCREEN, LM2023052). W.D. was supported by the Ministry of Education, Youth and Sports of the Czech Republic by the project "New Technologies for Translational Research in Pharmaceutical Sciences/NETPHARM", project ID CZ.02.01.01/00/22_008/0004607, cofunded by the European Union. Computational resources were provided by the e-INFRA CZ project (ID:90254), supported by the Ministry of Education, Youth and Sports of the Czech Republic.


License

This project uses dual licensing:

The software components are released under the permissive MIT License for maximum reusability. Research data and documentation are shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0), which requires attribution when the material is reused.