Credit goes to chembl.blogspot.com

Posts

Showing posts from February, 2026

Recording: SureChEMBL2.0 - Now with Added Disease and Protein Annotations

A week ago, I had the pleasure of presenting SureChEMBL2.0 at the Cambridge Cheminformatics Network Meeting, organised by Andreas Bender and kindly hosted by the Cambridge Crystallographic Data Centre. It was a great opportunity to introduce one of the latest freely available databases of scientifically annotated patents to a broad scientific audience. The recording of the talk is now available online, along with the slides.

What did I cover during this 30-minute talk?

Why scientists should pay attention to patent data
Why patents are challenging to work with
What SureChEMBL is and what it does
How we identify chemical compounds in patent documents
What SureChEMBL 2.0 has recently introduced
How we annotate patents for genes/proteins and diseases
How we are improving the quality of structures extracted from images
What you can download from the SureChEMBL core datasets, and what they contain
Examples of queries that SureChEMBL h...

AI-driven Annotation and FAIRification of ChEMBL Bioassays

The continued expansion of ChEMBL bioactivity data makes high-quality, structured assay metadata essential for reproducible analysis and for machine-learning applications aligned with FAIR principles. Recent work by our team, published in J. Cheminf., describes coordinated manual and AI-driven strategies to enhance the annotation, classification, and interoperability of ChEMBL bioassays. In this work, we developed a spaCy-based named entity recognition (NER) model, trained on manually curated assay descriptions, to identify the Experimental Method within ChEMBL assay descriptions. The model achieved cross-validated precision, recall, and F1-scores of approximately 0.93, 0.95, and 0.94, respectively, and detected experimental methods in ~57% of binding and functional assays in ChEMBL 35. Extracted method terms were subsequently mapped to the Bioassay Ontology (BAO), demonstrating good precision at higher confidence thresholds bu...
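To make the reported metrics concrete, here is a minimal sketch of how entity-level precision, recall, and F1 are commonly computed for an NER model, assuming exact-match scoring of labelled spans. The span format, the `METHOD` label, and the toy annotations are hypothetical illustrations, not taken from the paper, and the authors' evaluation protocol may differ.

```python
from typing import Iterable, Tuple

# Hypothetical span format: (assay_index, start_char, end_char, label)
Span = Tuple[int, int, int, str]


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def prf1(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall, f1_from(precision, recall)


# Toy example: four curated method mentions, three of which the model finds.
gold = [(0, 10, 25, "METHOD"), (1, 0, 5, "METHOD"),
        (2, 14, 30, "METHOD"), (3, 7, 19, "METHOD")]
predicted = [(0, 10, 25, "METHOD"), (1, 0, 5, "METHOD"), (2, 14, 30, "METHOD")]

precision, recall, f1 = prf1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# The figures quoted above are mutually consistent under this definition.
print(round(f1_from(0.93, 0.95), 2))
```

The last line checks that a precision of 0.93 and a recall of 0.95 give an F1 of roughly 0.94. Cross-validated scores are usually averaged per fold, so the published F1 need not equal this harmonic mean exactly.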