Digitize your chemical reaction image into a machine-readable representation.
V.0 · IBM Research Zurich
V.1 · Wageningen University & Research
- Description
- Step by step
- Benchmarking
- Installation
- Models - Training - Evaluation - Inference
- Contributing
- Creators
- Thanks
- Citing
From a chemical reaction image, molecules, text and arrows are detected and classified using a Vision Transformer. The detections are then translated into text using OCR, or into SMILES using DECIMER AI. The direction of the reaction is detected and preserved in the .json file produced as output.
Output:
SMILES:
CC(=O)OC1=C(C=CC=C1)C(=O)O
A DETR model with a ResNet50 backbone is used to detect the objects in the image. Classes to be found = ["molecules","arrows","text", "+ symbols"]. Images of type "png" are fed as input; the returned outputs are a tensor of bounding boxes locating the objects, together with the label of each box.
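As an illustration of how such detections could be post-processed, the sketch below assumes the raw DETR convention of normalized (center_x, center_y, width, height) boxes and converts them to pixel corner coordinates while dropping low-confidence predictions. The function names, the 0.7 threshold and the class-index order are assumptions made for this example, not taken from the repository.

```python
# Class order is assumed to match the list given in the text.
CLASSES = ["molecules", "arrows", "text", "+ symbols"]


def cxcywh_to_xyxy(box, img_w, img_h):
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


def filter_detections(boxes, scores, labels, img_w, img_h, threshold=0.7):
    """Keep detections whose score reaches the (hypothetical) threshold."""
    kept = []
    for box, score, label in zip(boxes, scores, labels):
        if score >= threshold:
            kept.append(
                {
                    "label": CLASSES[label],
                    "score": score,
                    "box": cxcywh_to_xyxy(box, img_w, img_h),
                }
            )
    return kept


# Two made-up predictions on a 1000x400 image; only the first passes the threshold.
detections = filter_detections(
    boxes=[(0.5, 0.5, 0.25, 0.5), (0.1, 0.1, 0.05, 0.05)],
    scores=[0.95, 0.30],
    labels=[0, 2],
    img_w=1000,
    img_h=400,
)
print(detections)
```

If the detector is run through a wrapper that already returns absolute corner coordinates, the conversion step is unnecessary and only the score filter applies.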
Synthetic dataset consisting of 60k images that are synthetically created to simulate the distribution of reactions in real-world publications. We also use a small validation set of 8k images and a test set of 2k. In addition, to see how the model performs, we use a small dataset of "real-world" reactions extracted from the Organic Chemistry Portal.
Data Structure
--Images/
|
|-- train2017/
|-- val2017/
|-- test2017/
|-- labelled2017/
|-- annotations/
|
|-- custom_train.json
|-- custom_val.json
|-- custom_test.json
--real_data
|
|-- real/
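The folder names (train2017, val2017) and the annotations/ directory suggest a COCO-style layout, although the source does not state the annotation format. Under that assumption, a minimal sketch of what one record in a file such as custom_train.json could look like is given below. The file name, image size, category ids and helper function are all hypothetical.

```python
import json

# Hypothetical category ids; the names follow the classes listed in the text.
CATEGORIES = [
    {"id": 1, "name": "molecules"},
    {"id": 2, "name": "arrows"},
    {"id": 3, "name": "text"},
    {"id": 4, "name": "+ symbols"},
]


def make_annotation(ann_id, image_id, category_id, xyxy):
    """Build a COCO-style annotation from pixel corner coordinates.

    COCO stores boxes as [x_min, y_min, width, height].
    """
    x0, y0, x1, y1 = xyxy
    width, height = x1 - x0, y1 - y0
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x0, y0, width, height],
        "area": width * height,
        "iscrowd": 0,
    }


dataset = {
    "images": [
        {"id": 1, "file_name": "reaction_0001.png", "width": 1000, "height": 400}
    ],
    "annotations": [make_annotation(1, 1, 1, (50, 100, 250, 300))],
    "categories": CATEGORIES,
}

print(json.dumps(dataset["annotations"][0], indent=2))
```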
For Optical Character Recognition (OCR), we used a PaddleOCR model, an open-source tool optimized for extracting text from images. It applies deep learning techniques to detect and recognize text regions, even in complex layouts or low-quality scans. This step was essential for identifying and extracting relevant textual information, such as labels, annotations, or chemical names, from the input documents before structural analysis.
To translate molecules from the input images into SMILES strings, we used DECIMER AI, an open-source optical chemical structure recognition (OCSR) tool that uses deep learning to detect, segment, and recognize chemical structures from scientific documents. It turns images of molecules into machine-readable formats, helping extract chemical data from scanned papers and literature.
The direction of the reaction is detected with a simple heuristic. The algorithm compares the bounding-box coordinates of the arrows and the molecules in the image, and assigns a direction to each arrow based on its position relative to the molecules.
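A minimal sketch of one such position-based heuristic follows. It is not the repository's implementation: it assumes that horizontal arrows read left to right and vertical arrows top to bottom, and it pairs each arrow with the nearest molecule on either side. All function names are hypothetical, and boxes are taken to be pixel (x_min, y_min, x_max, y_max) tuples.

```python
from math import hypot


def center(box):
    """Return the (x, y) center of an (x_min, y_min, x_max, y_max) box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def nearest(point, indexed_boxes):
    """Return the index of the box whose center is closest to point, or None."""
    if not indexed_boxes:
        return None
    px, py = point

    def distance(item):
        cx, cy = center(item[1])
        return hypot(cx - px, cy - py)

    return min(indexed_boxes, key=distance)[0]


def link_arrow(arrow_box, mol_boxes):
    """Pair an arrow with the molecule before it and the molecule after it.

    Assumption: wide arrows point left to right, tall arrows top to bottom.
    """
    x0, y0, x1, y1 = arrow_box
    ax, ay = center(arrow_box)
    indexed = list(enumerate(mol_boxes))
    if (x1 - x0) >= (y1 - y0):  # horizontal arrow
        tail, head = (x0, ay), (x1, ay)
        before = [(i, b) for i, b in indexed if center(b)[0] < ax]
        after = [(i, b) for i, b in indexed if center(b)[0] > ax]
    else:  # vertical arrow
        tail, head = (ax, y0), (ax, y1)
        before = [(i, b) for i, b in indexed if center(b)[1] < ay]
        after = [(i, b) for i, b in indexed if center(b)[1] > ay]
    return {"prev_mol": nearest(tail, before), "post_mol": nearest(head, after)}


# Made-up layout: two molecules on the top row, one below the second molecule.
molecules = [(0, 0, 80, 100), (220, 0, 300, 100), (220, 300, 300, 400)]
horizontal_arrow = (100, 45, 200, 55)
vertical_arrow = (255, 110, 265, 290)

print(link_arrow(horizontal_arrow, molecules))
print(link_arrow(vertical_arrow, molecules))
```

The returned indices refer to positions in the molecule list. A real pipeline would also need to handle reversed, diagonal or curved arrows, which this sketch ignores.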
- Make sure all packages listed in requirements.txt are installed.
- See DETR Fine-Tuning if in doubt about running the DETR model with Detectron2.
- Follow steps in arrow_78/README.md file.
- Follow steps in Detectron_2/README.md file.
- Backend/:
- Run the file end_to_end.sh in the terminal. This runs the whole pipeline from image to JSON file. The output JSON file is saved in the output folder.
- If --debugging is set to True, the output from arrow and molecule detection is saved in their respective folders, making it possible to inspect what the model detects, and how.
A small, randomly selected sample of the test set is evaluated under the "test_results" folder of each approach: DETR, FRCNN and RetinaNet. Check there to assess the performance of the models qualitatively.
By aggregating the outcomes of the aforementioned steps, we can reconstruct JSON and text files.
{
"arrow0": {
"prev_mol": "C=CC(C)(C)C1=C(C[C@@H]2C(=O)N3CCC=C3C(=N2)OC)C4=C(C5=C(C=C4)OC(C)(C)C=C5)N1",
"text": [
[
"20%aq","KOH",
"MeOH"
]
],
"post_mol": "C=CC(C)(C)C1=C(CC2=NC(=C3CCCN3C2=O)OC)C4=C(C5=C(C=C4)OC(C)(C)C=C5)N1"
},
"arrow8": {
"prev_mol": "C=CC(C)(C)C1=C(CC2=NC(=C3CCCN3C2=O)OC)C4=C(C5=C(C=C4)OC(C)(C)C=C5)N1",
"text": [
[
"86%2.4:1 d.r.",
"i. HCI96%"
]
],
"post_mol": "CC1(C)C=CC2=C(C=CC3=C2NC4=C3C[C@]56[C@@H](C[C@@]7(C[C@@H]8C[C@@]7(C(=O)N5)NC8=O)C(=O)N6)C4(C)C)O1"
}
}
Wageningen University & Research (WUR), Department of Plant Sciences
IBM Research Europe
Escola Superior de Comerç Internacional (ESCI-UPF)



This thesis would not have been possible without the guidance of Dr. Daniel Probst as my supervisor, and the previous work done by Mark Martori, whom I deeply thank. Throughout the writing of this dissertation I have also received a great deal of support from my colleagues at the Department of Plant Sciences at WUR.
@software{LSilva2025,
author = {Martori, Mark and Probst, Daniel and Silva, Lucas},
title = {{Machine Learning approach for chemical reactions digitalisation.}},
url = {https://github.com/Brachingo/OChemR},
version = {1.5},
year = {2025}
}





