Overview

roshambo is a Python package for robust Gaussian molecular shape comparison. It provides fast algorithms for comparing the shapes of small molecules, reads input files in the SDF and SMILES formats, and uses PAPER as the backend for overlap optimization.

Installation

We recommend that you create a new conda environment before installing roshambo.

```
conda create -n roshambo python=3.9
conda activate roshambo
```

Prerequisites

roshambo requires a compiled version of rdkit. Compile rdkit against the same version of Python that you use to install roshambo, and with the INCHI option enabled. Please refer to the rdkit documentation for installation instructions.

Important

We have tested roshambo with rdkit version 2023.03.1. We highly recommend using this version of rdkit to avoid any compatibility issues.
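
As a quick sanity check on the prerequisites above, the following sketch reports which Python interpreter is running, whether rdkit can be imported from it, which rdkit version it is, and whether InChI support is available. The helper name `rdkit_report` is hypothetical and not part of roshambo; the sketch assumes the rdkit attributes `rdBase.rdkitVersion` and `rdkit.Chem.inchi.INCHI_AVAILABLE`.

```python
import sys

TESTED_RDKIT_VERSION = "2023.03.1"  # version named in this README


def rdkit_report(tested_version=TESTED_RDKIT_VERSION):
    """Collect basic facts about the rdkit build visible to this interpreter.

    Hypothetical helper for illustration; not part of roshambo.
    """
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "rdkit_version": None,
        "inchi_available": False,
        "matches_tested_version": False,
    }
    try:
        from rdkit import rdBase
        from rdkit.Chem import inchi
    except ImportError:
        # rdkit is not importable from this interpreter
        return report
    report["rdkit_version"] = rdBase.rdkitVersion
    report["inchi_available"] = bool(inchi.INCHI_AVAILABLE)
    report["matches_tested_version"] = rdBase.rdkitVersion == tested_version
    return report


if __name__ == "__main__":
    for key, value in rdkit_report().items():
        print(f"{key}: {value}")
```

If `rdkit_version` comes back as `None`, rdkit was most likely built for a different Python version or is not on `PYTHONPATH`.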

Additionally, since roshambo is GPU-accelerated, you need to have CUDA installed.

After you have installed rdkit and CUDA, you need to set the following environment variables:

```
export RDBASE=/path/to/your/rdkit/installation
export RDKIT_LIB_DIR=$RDBASE/lib
export RDKIT_INCLUDE_DIR=$RDBASE/Code
export RDKIT_DATA_DIR=$RDBASE/Data
export PYTHONPATH=$PYTHONPATH:$RDBASE

export CUDA_HOME=/path/to/your/cuda/installation
```
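
Before building, it can help to confirm that these variables are actually visible to the process that will run the install. The sketch below is one way to do that with the standard library only; `check_env` is a hypothetical helper, not part of roshambo, and it only checks that each variable is set and points to an existing directory. `PYTHONPATH` is left out because it holds a list of paths rather than a single directory.

```python
import os

# Variables exported in the instructions above (PYTHONPATH is skipped on purpose)
REQUIRED_VARS = (
    "RDBASE",
    "RDKIT_LIB_DIR",
    "RDKIT_INCLUDE_DIR",
    "RDKIT_DATA_DIR",
    "CUDA_HOME",
)


def check_env(env=None):
    """Return {name: problem} for every required variable that is unset
    or does not point to an existing directory."""
    env = os.environ if env is None else env
    problems = {}
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not os.path.isdir(value):
            problems[name] = f"not a directory: {value}"
    return problems


if __name__ == "__main__":
    issues = check_env()
    if issues:
        for name, problem in issues.items():
            print(f"{name}: {problem}")
    else:
        print("All required variables are set.")
```

An empty result does not prove that the rdkit or CUDA installation is complete, only that the paths exist.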

Installation

Clone the repository:

```
git clone https://github.com/rashatwi/roshambo.git
```

Navigate to the roshambo directory:
```
cd roshambo
```
Install the package:
```
pip3 install .
```
Depending on your cluster/machine settings, you might need to install in editable mode:
```
pip3 install -e .
```

Usage

```
from roshambo.api import get_similarity_scores

get_similarity_scores(
    ref_file="query.sdf",
    dataset_files_pattern="dataset.sdf",
    ignore_hs=True,
    n_confs=0,
    use_carbon_radii=True,
    color=True,
    sort_by="ComboTanimoto",
    write_to_file=True,
    gpu_id=0,
    working_dir="data/basic_run",
)
```

The above code will run a similarity calculation between the reference molecule in query.sdf and all the molecules in the dataset.sdf file. Hydrogen atoms will be ignored when aligning the molecules and carbon radii will be used. No conformers will be generated for the dataset molecules. Both shape and color similarity scores will be calculated and the results will be written to a file. The scores will be sorted by the ComboTanimoto score and saved in the directory data/basic_run.

You can also run the above example from the command line:

```
roshambo --n_confs 0 --ignore_hs --color --sort_by ComboTanimoto --write_to_file --working_dir data/basic_run --gpu_id 0 query.sdf dataset.sdf
```
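
When several runs need to be scripted, the same command line can be assembled from Python. The sketch below only uses the flags that appear in the command above; `build_roshambo_command` is a hypothetical helper, not part of roshambo, and it builds the argument list without launching anything.

```python
import shlex


def build_roshambo_command(ref_file, dataset_file, working_dir,
                           gpu_id=0, n_confs=0, sort_by="ComboTanimoto"):
    """Assemble the argument list for the command-line call shown above.

    Hypothetical helper for illustration; only flags from this README are used.
    """
    return [
        "roshambo",
        "--n_confs", str(n_confs),
        "--ignore_hs",
        "--color",
        "--sort_by", sort_by,
        "--write_to_file",
        "--working_dir", working_dir,
        "--gpu_id", str(gpu_id),
        ref_file,
        dataset_file,
    ]


cmd = build_roshambo_command("query.sdf", "dataset.sdf", "data/basic_run")
print(shlex.join(cmd))
# To actually launch the run: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` rather than a single string avoids shell quoting problems with paths that contain spaces.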
