MolVS is a molecule validation and standardization tool, written in Python using the RDKit chemistry framework.
Building a collection of chemical structures from different sources can be difficult due to differing representations, drawing conventions and mistakes. MolVS can standardize chemical structures to improve data quality, help with de-duplication and identify relationships between molecules.
Sensible defaults make it easy to get started:
>>> from molvs import standardize_smiles
>>> standardize_smiles('C2(=C1C(=NC=N1)[NH]C(=N2)N)O')
'Nc1nc(O)c2ncnc-2[nH]1'
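As a rough illustration of the de-duplication use case, the sketch below groups input SMILES strings by the standardized form they map to. The helper names `group_by_standard_form` and `duplicate_groups` are hypothetical and not part of MolVS. The standardizing function is passed in as an argument, so the block itself has no dependencies.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_standard_form(
    smiles: Iterable[str],
    standardize: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group input strings by the standardized form each one maps to.

    `standardize` is any function from a SMILES string to a standardized
    SMILES string. Hypothetical helper, not a MolVS function.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for s in smiles:
        # Inputs that share a standardized form land in the same bucket.
        groups[standardize(s)].append(s)
    return dict(groups)


def duplicate_groups(groups: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the groups that contain more than one input string."""
    return {key: members for key, members in groups.items() if len(members) > 1}
```

With MolVS installed, passing `standardize_smiles` as the `standardize` argument should place different representations of the same structure in one group. Inputs that cannot be parsed are not handled here and would need their own error handling.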
To install MolVS, run:
pip install molvs
Alternatively, try one of the other installation options.
Full documentation is available at http://molvs.readthedocs.org.
- Feature ideas and bug reports are welcome on the Issue Tracker.
- Fork the source code on GitHub, make changes and send a pull request.
MolVS is licensed under the MIT license.
There are a number of projects with similar goals that take differing approaches: