BackFlip is a model trained to predict per-residue backbone flexibility of protein structures, described in the paper *Flexibility-conditioned protein structure design with flow matching*.
This repository relies on the GAFL package and code from FrameFlow.
We provide an instructive Google Colab tutorial for predicting the flexibility of ubiquitin that requires no local installation. Go ahead and try out BackFlip for your favorite protein!
```python
from backflip.deployment.inference_class import BackFlip

# Load the BackFlip model from a tag:
bf = BackFlip.from_tag(tag='backflip-0.2', device='cpu')

# Run BackFlip:
prediction = bf.predict_from_pdb(pdb_path='ubiquitin.pdb')
```

This returns a dictionary with local and global flexibility, at a runtime below 1 s on a CPU.
*Local flexibility for ubiquitin (1UBQ), predicted by BackFlip with the above code snippet. BackFlip predicts the alpha helix as locally stiff, the beta sheet as slightly more flexible, and the terminus as very flexible.*
Inference on the example folder containing `.pdb` files:

```python
from pathlib import Path

from backflip.deployment.inference_class import BackFlip

# Inference on the folder containing .pdb files:
pdb_folder_test = Path('./test_data/inference_examples/from_pdb_folder').resolve()

# Download the model weights and load the BackFlip model from a tag:
bf = BackFlip.from_tag(tag='backflip-0.2', device='cuda', progress_bar=True)

# Predict and write the local RMSF as a B-factor to the pdb files:
bf.predict(pdb_folder=pdb_folder_test, cuda_memory_GB=8)
```

We recommend running inference with BackFlip on a folder containing `.pdb` or `.cif` files as input; you can also point directly to a single structure file. For more details and brief analyses, see the example inference script at `scripts/example_inference.py`.
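Since the predictions are written into the B-factor column, they can be read back with any PDB parser or even plain string slicing of the fixed-width PDB format. A minimal sketch (the restriction to CA atoms and the helper name are illustrative, not part of the BackFlip API):

```python
def read_ca_bfactors(pdb_lines):
    """Collect the B-factor of each CA atom from PDB-format lines.

    Uses the fixed PDB columns: atom name in columns 13-16,
    B-factor (temperature factor) in columns 61-66.
    """
    bfactors = []
    for line in pdb_lines:
        if line.startswith('ATOM') and line[12:16].strip() == 'CA':
            bfactors.append(float(line[60:66]))
    return bfactors

# Example with a single hand-written ATOM record:
example = [
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00  0.49",
]
print(read_ca_bfactors(example))  # [0.49]
```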
BackFlip's default inference script expects clean, monomeric `.pdb` or `.cif` structure files without chain breaks as input. We provide a pre-processing script that cleans PDBs. Run it on a folder containing structure files with:

```bash
python scripts/process_pdb_folder.py --pdb_dir <path>
```

This will create a folder `clean_pdb` with the cleaned PDBs, along with a `metadata.csv` containing some information on the protein structures. BackFlip is not (yet) robust to structural breaks, so we discourage running inference on protein structures that have `has_breaks == True` in `metadata.csv`; this can cause artifacts and unrealistic predictions!
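A chain-break check like the one behind `has_breaks` can be sketched as follows: a chain is flagged as broken if any two consecutive CA atoms are farther apart than expected for a peptide bond (roughly 3.8 Å). The 4.5 Å cutoff and function name below are illustrative assumptions, not necessarily what the pre-processing script uses:

```python
import math

def has_breaks(ca_coords, cutoff=4.5):
    """Return True if any consecutive CA-CA distance exceeds the cutoff (Angstrom).

    ca_coords: list of (x, y, z) tuples, one per residue, in chain order.
    """
    for a, b in zip(ca_coords, ca_coords[1:]):
        if math.dist(a, b) > cutoff:
            return True
    return False

# Consecutive CAs ~3.8 A apart -> intact; a 10 A jump -> break.
intact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
broken = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (13.8, 0.0, 0.0)]
print(has_breaks(intact), has_breaks(broken))  # False True
```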
We provide a detailed explanation of how to compute local or global RMSF in `scripts/example_rmsf.py`.
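For reference, the per-residue RMSF over an ensemble of T structures is RMSF_i = sqrt((1/T) * sum_t ||x_i(t) - mean_i||^2). A minimal numpy sketch of this formula (it assumes the ensemble is already aligned and is an illustration, not the code in `scripts/example_rmsf.py`):

```python
import numpy as np

def per_residue_rmsf(ensemble):
    """RMSF per residue for a pre-aligned ensemble.

    ensemble: array of shape (T, N, 3) with T frames and N residues
    (e.g. CA coordinates). Returns an array of shape (N,).
    """
    mean = ensemble.mean(axis=0)                    # (N, 3) mean position
    sq_dev = ((ensemble - mean) ** 2).sum(axis=-1)  # (T, N) squared deviations
    return np.sqrt(sq_dev.mean(axis=0))             # (N,) root-mean-square

# Two-frame toy ensemble: residue 0 is static, residue 1 fluctuates by +-1 A in x.
ens = np.array([
    [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [7.0, 0.0, 0.0]],
])
print(per_residue_rmsf(ens))  # [0. 1.]
```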
You can use our install script (here for torch version 2.6.0 and CUDA 12.4), which essentially executes the steps specified in the pip section below:

```bash
git clone https://github.com/graeter-group/backflip.git
conda create -n backflip python=3.10 pip=23.2.1 -y
conda activate backflip && bash backflip/install_utils/install_via_pip.sh 2.6.0 124 # torch version and cuda version as args
```

Verify your installation by running our example script:

```bash
cd backflip/ && python backflip/scripts/example_inference.py
```

Optional: Create a virtual environment, e.g. with conda, and install pip 23.2.1:
```bash
conda create -n backflip python=3.10 pip=23.2.1 -y
conda activate backflip
```

Install the dependencies from the requirements file:
```bash
git clone https://github.com/graeter-group/backflip.git
pip install -r backflip/install_utils/requirements.txt

# BackFlip builds on top of the GAFL package, which is installed from source:
git clone https://github.com/hits-mli/gafl.git
cd gafl
bash install_gatr.sh # Apply patches to gatr (needed for gafl)
pip install -e . # Install GAFL
cd ..

# Finally, install backflip with pip:
cd backflip
pip install -e .
```

Install torch with a suitable CUDA version, e.g.

```bash
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.6.0+cu124.html
```

where you can replace `cu124` with your CUDA version, e.g. `cu118` or `cu121`.
BackFlip relies on the GAFL package, which can be installed from GitHub as shown below. The dependencies besides GAFL are listed in `install_utils/environment.yaml`; we also provide a minimal environment in `install_utils/minimal_env.yaml`, where it is easier to change torch/CUDA versions.
```bash
# download backflip:
git clone https://github.com/graeter-group/backflip.git
# create env with dependencies:
conda env create -f backflip/install_utils/minimal_env.yaml
conda activate backflip

# install gafl:
git clone https://github.com/hits-mli/gafl.git
cd gafl
bash install_gatr.sh # Apply patches to gatr (needed for gafl)
pip install -e .
cd ..

# install backflip:
cd backflip
pip install -e .
```

Problems with `torch_scatter` can usually be resolved by uninstalling and re-installing it via pip for the correct torch and CUDA version, e.g. `pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.0+cu124.html` for torch 2.0.0 and CUDA 12.4.
```bibtex
@inproceedings{
viliuga2025flexibilityconditioned,
title={Flexibility-conditioned protein structure design with flow matching},
author={Vsevolod Viliuga and Leif Seute and Nicolas Wolf and Simon Wagner and Arne Elofsson and Jan St{\"u}hmer and Frauke Gr{\"a}ter},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=890gHX7ieS}
}
```
The code relies on the GAFL package and code from FrameFlow; we would appreciate it if you also cited the two respective papers when using this code.
