Collection 17 April 2025

Generative modeling for molecular design and discovery

Generative models have gained widespread attention in recent years due to their inverse design capabilities and their potential to accelerate the molecular design and discovery processes. This Collection includes manuscripts published by Nature Computational Science that apply and develop generative modeling tools for small-molecule design and discovery. The Collection features both primary research articles and non-primary content and will be updated as new content is published. Content appears in reverse chronological order.

Research

Rapid traversal of vast chemical space using machine learning-guided docking screens

Combining conformal prediction machine learning with molecular docking, a method to efficiently screen multi-billion-scale libraries is developed, enabling the discovery of a dual-target ligand modulating the A₂A adenosine and D₂ dopamine receptors.
- Andreas Luttens
- Israel Cabeza de Vaca
- Jens Carlsson
Article | Open Access | 13 Mar 2025 | Nature Computational Science
Harnessing large language models for data-scarce learning of polymer properties

A physics-based training pipeline is developed to help tackle the challenges of data scarcity. The framework aligns large language models to a physically consistent initial state that is fine-tuned for learning polymer properties.
- Ning Liu
- Siavash Jafarzadeh
- Yue Yu
Article | 10 Feb 2025 | Nature Computational Science
Structure-based drug design with equivariant diffusion models

This work applies diffusion models to conditional molecule generation and shows how they can be used to tackle various structure-based drug design problems.
- Arne Schneuing
- Charles Harris
- Bruno Correia
Brief Communication | Open Access | 9 Dec 2024 | Nature Computational Science
A deep learning approach for rational ligand generation with toxicity control via reactive building blocks

DeepBlock is a deep learning framework for ligand generation, inspired by the DNA-encoded compound library technique, that enhances ligand design with building blocks and a rule-based reconstruction algorithm, achieving better drug properties.
- Pengyong Li
- Kaihao Zhang
- Xiangxiang Zeng
Article | 8 Nov 2024 | Nature Computational Science
An algorithmic framework for synthetic cost-aware decision making in molecular design

The downselection of compounds for synthesis is a key challenge in molecular design cycles that typically relies on expert chemist intuition. Fromer and Coley propose a cost-aware method to automatically select compounds and synthetic routes.
- Jenna C. Fromer
- Connor W. Coley
Article | 17 Jun 2024 | Nature Computational Science
Directional multiobjective optimization of metal complexes at the billion-system scale

A method is developed for the directional optimization of multiple properties without prior knowledge on their nature. Using a large ligand dataset, diverse metal complexes are found along the Pareto front of vast chemical spaces.
- Hannes Kneiding
- Ainara Nova
- David Balcells
Article | 29 Mar 2024 | Nature Computational Science
Electron density-based GPT for optimization and suggestion of host–guest binders

An optimization algorithm is used to discover guest molecules based on knowing only the structure of the host. The molecules are represented as 3D volumes, optimized to improve host–guest interaction and converted into SMILES using a transformer model.
- Juan M. Parrilla-Gutiérrez
- Jarosław M. Granda
- Leroy Cronin
Article | Open Access | 8 Mar 2024 | Nature Computational Science
Accurate transition state generation with an object-aware equivariant elementary reaction diffusion model

A diffusion model that generates chemical reactions in 3D with all desired symmetries preserved is established and shown to reduce transition state search from days to seconds and complement intuition-based reaction exploration with generative AI.
- Chenru Duan
- Yuanqi Du
- Heather J. Kulik
Article | 15 Dec 2023 | Nature Computational Science
Learning on topological surface and geometric structure for 3D molecular generation

SurfGen is a structure-based drug design approach that delves into topological and geometric deep learning techniques for interaction learning, echoing the classical lock-and-key model.
- Odin Zhang
- Tianyue Wang
- Tingjun Hou
Article | 9 Oct 2023 | Nature Computational Science
Guided diffusion for inverse molecular design

GaUDI is a guided diffusion method for the design of molecular structures that features a flexible and scalable target function and that achieves high validity of generated molecules.
- Tomer Weiss
- Eduardo Mayo Yanes
- Renana Gershoni-Poranne
Article | 5 Oct 2023 | Nature Computational Science
High-throughput property-driven generative design of functional organic molecules

A generative deep learning model of molecular structure is combined with supervised deep learning models of molecular properties to achieve high-throughput (multi-)property-driven design of organic molecules.
- Julia Westermayr
- Joe Gilkes
- Reinhard J. Maurer
Article | 6 Feb 2023 | Nature Computational Science
Unconstrained generation of synthetic antibody–antigen structures to guide machine learning methodology for antibody specificity prediction

The Absolut! framework can generate synthetic three-dimensional antibody–antigen structures to assist machine learning and dataset construction for antibody design. Most importantly, the relative machine learning performance learnt on Absolut! datasets is shown to transfer to experimental datasets.
- Philippe A. Robert
- Rahmad Akbar
- Victor Greiff
Resource | 19 Dec 2022 | Nature Computational Science

Reviews & Perspectives

The future of machine learning for small-molecule drug discovery will be driven by data

The application of machine learning techniques to small-molecule drug discovery has not yet yielded a true leap forward in the field. This Perspective discusses how a renewed focus on data and validation could help unlock machine learning’s potential.
- Guy Durant
- Fergus Boyles
- Charlotte M. Deane
Perspective | 15 Oct 2024 | Nature Computational Science
Designing molecules with autoencoder networks

Autoencoders are versatile tools for molecular informatics with the opportunity for advancing molecule and drug design. In this Review, the authors highlight the active areas of development in the field and explore the challenges that need to be addressed moving forward.
- Agnieszka Ilnicka
- Gisbert Schneider
Review Article | 21 Nov 2023 | Nature Computational Science

News & Opinion

Generative molecular design and discovery on the rise

Nature Computational Science is calling all researchers who develop and use generative models for molecular design and discovery to publish with us.

Editorial | 17 Apr 2025 | Nature Computational Science
The promise and pitfalls of AI for molecular and materials synthesis

As artificial intelligence (AI) proliferates, synthetic chemistry stands to benefit from its progress. Despite hidden variables and ‘unknown unknowns’ in datasets that may impede the realization of a digital twin for the laboratory flask, there are many opportunities to leverage AI and large datasets to advance synthesis science.
- Nicholas David
- Wenhao Sun
- Connor W. Coley
Comment | 18 May 2023 | Nature Computational Science
