
AlphaSwarm Logo

AlphaSwarm: ML-guided Particle Swarm Optimisation for Chemical Reaction Optimisation


πŸš€ Installation

Install uv with the command

pipx install uv

Create the environment with the following command

uv venv alphaswarm --python=3.12

and activate the environment

source alphaswarm/bin/activate
# source .venv/bin/activate  # if the environment was created without a name (plain `uv venv`)
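The editable install of the package itself is only shown below for the conda route; with the uv environment active, the corresponding step would presumably be:

uv pip install -e .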

Alternatively, you can use conda / mamba to create the environment and install all required packages (this is the setup used for all benchmarks and experiments):

git clone https://github.com/schwallergroup/alphaswarm.git
cd alphaswarm
conda env create -f environment.yml
# mamba env create -f environment.yml
python -m pip install -e .

πŸ“– Usage

πŸ–₯️ Benchmark

To run a benchmark, define a configuration file, e.g. benchmark.toml:

file_path = "data/benchmark/virtual_experiment.csv"  # Path to the dataset with features and target
y_columns = ["AP yield", "AP selectivity"]  # Columns containing the objective values
exclude_columns = ["catalyst", "base"]  # (Optional) Columns to exclude from the feature set used for modelling, usually columns containing text data

seed = 42  # Seed for reproducibility
n_iter = 3  # Number of iterations
n_particles = 24  # Number of particles (batch size)
init_method = "sobol"  # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"  # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol)

[pso_params]  # only for (canonical-pso, alpha-pso)
c_1 = 1.0  # Cognitive parameter
c_2 = 1.0  # Social parameter
c_a = 1.0  # ML parameter
w = 1.0  # Inertia parameter
n_particles_to_move = [0, 0]  # Number of particles to move directly to ML predictions at each iteration after initialisation (list size = iteration_number - 1)

objective_function = "weighted_sum"  # Objective function to use (weighted_sum, weighted_power, ...)

[obj_func_params]
weights = [1.0, 1.0]  # Weights for the weighted sum objective function
noise = 0.0  # Noise to add to the objectives

[model_config]
kernel = "MaternKernel"  # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"  # Kernel parameters
training_iter = 1000  # Number of iterations for training
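The pso_params above map onto the PSO velocity update, with c_a weighting an additional attraction toward the surrogate model's predicted optimum. The snippet below is a minimal sketch of that canonical update; the names x, v, p_best, g_best and ml_best are illustrative, not the package's API, and the actual alpha-pso implementation in src/alphaswarm/pso.py may differ in its details:

import numpy as np

def velocity_update(x, v, p_best, g_best, ml_best, w=1.0, c_1=1.0, c_2=1.0, c_a=1.0, rng=None):
    """Sketch of an ML-augmented PSO velocity update for a single particle."""
    rng = np.random.default_rng() if rng is None else rng
    r_1, r_2, r_a = rng.random(3)  # stochastic factors in [0, 1)
    return (
        w * v                        # inertia
        + c_1 * r_1 * (p_best - x)   # cognitive term: pull toward the particle's best position
        + c_2 * r_2 * (g_best - x)   # social term: pull toward the swarm's best position
        + c_a * r_a * (ml_best - x)  # ML term: pull toward the model's predicted optimum
    )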

Then, run the benchmark with the following command:

alphaswarm benchmark benchmark.toml
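The weighted_sum objective with the weights and noise settings above presumably scalarises the two objectives roughly as follows (a sketch under that assumption; the package's own implementation lives in src/alphaswarm/objective_functions.py and may handle noise differently):

import numpy as np

def weighted_sum(objectives, weights=(1.0, 1.0), noise=0.0, rng=None):
    """Sketch: scalarise a vector of objective values into a single score."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(objectives, dtype=float)
    if noise > 0:
        y = y + rng.normal(0.0, noise, size=y.shape)  # optional Gaussian noise on the objectives
    return float(np.dot(np.asarray(weights, dtype=float), y))

# e.g. weighted_sum([0.85, 0.60]) -> 1.45 with equal weights and no noise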

πŸ§ͺ Experimental campaign

To run an experimental campaign, you need a chemical space file (.csv) that contains your reaction features. This file must also include a column named rxn_id for the reaction identifiers.

If your file includes non-feature columns describing reaction conditions (e.g., catalyst, base, solvent), you must list them in the configuration file under the exclude_columns parameter. This ensures they are excluded from the model's feature set. The config files used in the manuscript accompanying this repository can be found in the /configs directory.
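As an illustration only (the column names are hypothetical, not taken from the repository), a chemical space file could be assembled like this:

import pandas as pd

# Hypothetical chemical space: one row per candidate reaction.
chemical_space = pd.DataFrame(
    {
        "rxn_id": ["rxn_001", "rxn_002", "rxn_003"],   # required identifier column
        "ligand": ["L1", "L2", "L3"],                  # text column -> list under exclude_columns
        "base": ["K3PO4", "K2CO3", "CsF"],             # text column -> list under exclude_columns
        "temperature": [0.25, 0.50, 0.75],             # numeric features, normalised to [0, 1]
        "ligand_descriptor_1": [0.10, 0.80, 0.40],
    }
)
chemical_space.to_csv("chemical_space.csv", index=False)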

An example of a configuration file is shown below:

Warning

The chemical space features for the experimental campaign must be normalised between 0 and 1. Normalisation can be done with the normalise_features(...) function (a minimal equivalent is sketched after the run command below).

chemical_space_file = "data/experimental_campaigns/example/chemical_space.csv"  # Path to the chemical space
exclude_columns = ["ligand", "solvent", "precursor", "base"]  # (Optional) Columns to exclude from the input features, usually columns containing text data (rxn_id is automatically excluded)

iteration_number = 1  # Number of iterations (1 for the initialisation)

seed = 42  # Random seed for reproducibility
n_particles = 96  # Number of particles (batch size)
init_method = "sobol"  # Initialisation method (random, sobol, LHS, halton)
algo = "alpha-pso"  # Algorithm to use (canonical-pso, alpha-pso, qnehvi, sobol)

[pso_params]  # only for (canonical-pso, alpha-pso)
c_1 = 1.0  # Cognitive parameter
c_2 = 1.0  # Social parameter
c_a = 1.0  # ML parameter
w = 1.0  # Inertia parameter
n_particles_to_move = [0]  # Number of particles to move directly to ML predictions at each iteration after initialisation (list size = iteration_number - 1)

objective_columns = ["AP yield", "AP selectivity"]  # Columns specifying the objectives

# Suggestions path/file format
pso_suggestions_path = "data/experimental_campaigns/example/pso_plate_suggestions"  # output path for the PSO suggestions
pso_suggestions_format = "PSO_plate_{}_suggestions.csv"  # file format of the PSO suggestions
# Experimental/Training data path/file format
experimental_data_path = "data/experimental_campaigns/example/pso_training_data"  # path to the experimental data
experimental_data_format = "PSO_plate_{}_train.csv"  # file format of the training data

[model_config]
kernel = "MaternKernel"  # Kernel to use (MaternKernel, KMaternKernel)
kernel_params = "default"  # Kernel parameters
training_iter = 1000  # Number of iterations for training

Then, run the experimental campaign with the following command:

alphaswarm experimental experimental.toml
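As noted in the warning above, all features must lie in [0, 1] before running the campaign. The exact signature of normalise_features(...) is not reproduced here; a minimal equivalent min-max scaling with pandas would look like this (the file path matches the example config, the excluded column names are illustrative):

import pandas as pd

df = pd.read_csv("data/experimental_campaigns/example/chemical_space.csv")

exclude = ["rxn_id", "ligand", "solvent", "precursor", "base"]  # identifiers and text columns
feature_cols = [c for c in df.columns if c not in exclude]

for col in feature_cols:
    col_min, col_max = df[col].min(), df[col].max()
    if col_max > col_min:
        df[col] = (df[col] - col_min) / (col_max - col_min)  # min-max scale to [0, 1]
    else:
        df[col] = 0.0  # constant column

df.to_csv("data/experimental_campaigns/example/chemical_space.csv", index=False)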

Package structure

The package is structured as follows:

πŸ“alphaswarm/
    β”œβ”€β”€ LICENSE  # MIT License file
    β”œβ”€β”€ README.md  # Installation and usage instructions
    |── tox.ini  # Configuration file for tox (testing)
    β”œβ”€β”€ pyproject.toml  # Project configuration file
    β”œβ”€β”€ environment.yml # Configuration file for conda environment
    β”œβ”€β”€ data/
    β”‚Β Β  β”œβ”€β”€ benchmark/  # Contains the virtual experiments for benchmarking
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ buchwald_virtual_benchmark.csv
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ ni_suzuki_virtual_benchmark.csv
    β”‚   β”‚   β”œβ”€β”€ sulfonamide_virtual_benchmark.csv
    β”‚Β Β  β”‚Β Β  └── experimental_data/  # Contains the experimental data for training emulators
    β”‚Β Β  β”‚Β Β   Β Β  β”œβ”€β”€ buchwald_train_data.csv
    β”‚Β Β  β”‚Β Β   Β Β  β”œβ”€β”€ ni_suzuki_train_data.csv
    β”‚   β”‚       └── sulfonamide_train_data.csv
    β”‚Β Β  β”œβ”€β”€ experimental_campaigns/
    β”‚Β Β  β”‚Β Β  └── pso_suzuki/  # Example of an experimental campaign
    β”‚Β Β  β”‚Β Β      β”œβ”€β”€ chemical_spaces/  # Contains the chemical spaces
    β”‚Β Β  β”‚Β Β      β”‚   └── pso_suzuki_chemical_space.csv
    β”‚   β”‚       β”œβ”€β”€ configs/  # Contains the config .toml files use to obtain experimental suggestions
    β”‚   β”‚       β”‚   β”œβ”€β”€ pso_suzuki_iter_1.toml
    β”‚   β”‚Β Β  Β Β   β”‚Β Β  ...
    β”‚Β Β  β”‚Β Β      β”œβ”€β”€ pso_plate_suggestions/  # Contains the experimental suggestions
    β”‚Β Β  β”‚Β Β   Β Β  |   β”œβ”€β”€ PSO_suzuki_plate_1_suggestions.csv
    β”‚Β Β  β”‚Β Β   Β Β  |   ...
    β”‚Β Β  β”‚Β Β      └── pso_training_data/  # Contains the training data (experimental results)
    β”‚Β Β  β”‚Β Β          β”œβ”€β”€ PSO_suzuki_plate_1_train.csv
    β”‚Β Β  β”‚Β Β          ...
    β”‚   │── HTE_datasets/ # Contains the experimental HTE datasets in SURF format
    β”‚   β”‚   β”œβ”€β”€ pd_sulfonamide_SURF.csv
    β”‚   β”‚   └── pd_suzuki_SURF.csv
    β”œβ”€β”€ src/
    β”‚Β Β  └── alphaswarm/
    β”‚Β Β      β”œβ”€β”€ __about__.py
    β”‚Β Β      β”œβ”€β”€ __init__.py
    β”‚Β Β      β”œβ”€β”€ cli.py  # Command line interface tools
    β”‚Β Β      β”œβ”€β”€ configs.py  # Configurations for benchmark and experimental campaigns
    β”‚Β Β      β”œβ”€β”€ metrics.py  # Metrics for the benchmark
    β”‚Β Β      β”œβ”€β”€ objective_functions.py  # Objective functions for the benchmark
    β”‚Β Β      β”œβ”€β”€ pso.py  # Main PSO algorithm
    β”‚Β Β      β”œβ”€β”€ swarms.py  # Particle and Swarm classes
    β”‚Β Β      β”œβ”€β”€ acqf/  # Acquisition functions
    β”‚Β Β      β”‚Β Β  β”œβ”€β”€ acqf.py
    β”‚Β Β      β”‚Β Β  └── acqfunc.py
    β”‚Β Β      β”œβ”€β”€ models/  # Surrogate models
    β”‚Β Β      β”‚Β Β  └── gp.py  # Gaussian Process models
    β”‚Β Β      └── utils/
    β”‚Β Β          β”œβ”€β”€ logger.py  # Logger for the package
    β”‚Β Β          β”œβ”€β”€ moo_utils.py  # Utilities for multi-objective optimisation
    β”‚Β Β          β”œβ”€β”€ tensor_types.py  # Type definitions for tensors
    β”‚Β Β          └── utils.py  # General utilities
    └── tests/  # Contains all the unit tests
    

All data is stored in the data/ directory. The benchmark/ directory contains the virtual experiments used for benchmarking. The experimental_campaigns/ directory contains the chemical spaces and the experimental data for the experimental campaigns.

πŸ› οΈ Development details

See developer instructions

To install, run

pip install -e ".[test]"

To run style checks:

uv pip install pre-commit
pre-commit run -a

Run style checks, coverage, and tests

Ruff is used for linting. To run the lint checks and apply automatic fixes, use the following command:

ruff check src/ --fix

To test:

uv pip install tox
python -m tox r -e py312

Tensor shapes can be checked using jaxtyping. To enable these checks, set the TYPECHECK environment variable to 1 and run the code as usual:

export TYPECHECK=1
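For context, jaxtyping annotations of the kind collected in tensor_types.py attach expected shapes to tensor arguments. The function below is hypothetical, not taken from the package:

import torch
from jaxtyping import Float
from torch import Tensor

def pairwise_distances(x: Float[Tensor, "n d"], y: Float[Tensor, "m d"]) -> Float[Tensor, "n m"]:
    # The annotations declare the expected shapes; with TYPECHECK=1 the package
    # presumably verifies them at runtime.
    return torch.cdist(x, y)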

Generate coverage badge

This works only after running tox, which generates the coverage.xml report:

uv pip install "genbadge[coverage]"
genbadge coverage -i coverage.xml
