Transfer Learning in Plasma Catalysis

This codebase is developed from the Open Catalyst Project (OCP) for transfer learning from thermal catalysis to plasma catalysis. Installation and usage of the code remain the same as in OCP.

Tips for Installation

In our experience, the tested installation steps below are faster and more convenient than the official ones. They work on Red Hat Enterprise Linux 9.4 (Plow) and normally work on other platforms. The whole installation should take at most about 10 minutes.

Create a new conda environment with Python 3.9.18.
Activate the conda environment and install all the other dependencies: pip install -r env.txt
Install PyTorch: pip install torch==1.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
Install the torch extensions: pip install torch-scatter==2.1.1 torch-sparse==0.6.17 torch-cluster==1.6.1 torch-spline-conv==1.2.2 torch-geometric -f https://data.pyg.org/whl/torch-1.13.1+cu116.html
Clone this OCP fork: git clone https://github.com/wwwccttoo/ocp.git
Enter the cloned folder.
Install the package in editable mode: pip install -e .
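
After the steps above, it can help to confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of the repository: the function name find_mismatches and the EXPECTED_VERSIONS table are illustrative, and the comparison deliberately ignores local build tags (the part after "+", such as +cu116), so it does not verify the CUDA variant.

```python
from importlib import metadata

# Pins copied from the installation steps; torch-geometric is left out
# because the steps do not pin it. Build tags such as "+cu116" are ignored.
EXPECTED_VERSIONS = {
    "torch": "1.13.1",
    "torch-scatter": "2.1.1",
    "torch-sparse": "0.6.17",
    "torch-cluster": "1.6.1",
    "torch-spline-conv": "1.2.2",
}


def find_mismatches(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for every package whose installed
    base version differs from the pin. found is None if it is not installed."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        # Compare only the part before "+", e.g. "1.13.1+cu116" -> "1.13.1".
        base = found.split("+", 1)[0] if found is not None else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(EXPECTED_VERSIONS)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run it inside the activated conda environment; an empty result means the pinned packages are present at the expected base versions.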

Training

The following is an example training command:

python3 -u -m torch.distributed.launch --nproc_per_node=4 ocp/main.py --distributed --num-gpus 4 --mode train --config-yml configs/s2ef/all/equiformer_v2_plasma/Task1_equiformer_v2_plasma_all_traj_scratch_31M.yml --amp --checkpoint checkpoints/2024-04-11-18-28-16/checkpoint.pt

Here, four GPUs are used. The config file Task1_equiformer_v2_plasma_all_traj_scratch_31M.yml defines the model, optimization settings, etc. The checkpoint is used either to restart training or to start training from a pre-trained model.
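
When the GPU count, config, or checkpoint changes often, the command can be assembled programmatically. The sketch below is a hypothetical helper (build_launch_command is not part of this repository); it only rearranges the flags that appear in the command above and accepts just the two modes used in this README, train and predict.

```python
import shlex


def build_launch_command(config_yml, checkpoint=None, mode="train",
                         num_gpus=4, amp=True):
    """Assemble the torch.distributed.launch invocation as an argument list."""
    # Only the two modes used in this README are accepted in this sketch.
    if mode not in ("train", "predict"):
        raise ValueError(f"unsupported mode: {mode!r}")
    cmd = [
        "python3", "-u", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}",
        "ocp/main.py",
        "--distributed",
        "--num-gpus", str(num_gpus),
        "--mode", mode,
        "--config-yml", config_yml,
    ]
    if amp:
        cmd.append("--amp")
    if checkpoint is not None:
        # Restart from, or fine-tune, an existing checkpoint.
        cmd += ["--checkpoint", checkpoint]
    return cmd


if __name__ == "__main__":
    command = build_launch_command(
        "configs/s2ef/all/equiformer_v2_plasma/"
        "Task1_equiformer_v2_plasma_all_traj_scratch_31M.yml",
        checkpoint="checkpoints/2024-04-11-18-28-16/checkpoint.pt",
    )
    # Print a copy-pasteable shell command; pass the list to
    # subprocess.run(command, check=True) to launch it directly.
    print(shlex.join(command))
```

Returning a list rather than a string keeps the arguments safe to hand to subprocess without shell quoting issues; omitting checkpoint corresponds to training from scratch.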

Checkpoints will be automatically saved for restarting or fine-tuning.

Note: due to the number of atoms in the plasma catalysis data, a GPU with at least 12 GB of memory should be used. In this training, we used four 12 GB GPUs with a batch size of 4. Training for 100 epochs takes about 7 days.
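
For planning shorter or longer runs, the quoted figure can be scaled as a rough estimate. This sketch assumes training time grows linearly with the number of epochs on the same hardware, which is an assumption and not a measured result; the function name is illustrative.

```python
# Reference figure from the note above: about 7 days for 100 epochs
# on four 12 GB GPUs.
REFERENCE_EPOCHS = 100
REFERENCE_HOURS = 7 * 24


def estimate_training_hours(epochs):
    """Scale the reference run linearly to another epoch count (assumption)."""
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    return epochs * REFERENCE_HOURS / REFERENCE_EPOCHS
```

Under this assumption, one epoch corresponds to roughly 1.7 hours (168 hours divided by 100 epochs).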

Test

The following is an example test command:

python3 -u -m torch.distributed.launch --nproc_per_node=4 ocp/main.py --distributed --num-gpus 4 --mode predict --config-yml configs/s2ef/all/equiformer_v2_plasma/Task1_equiformer_v2_plasma_all_traj_scratch_31M.yml --amp --checkpoint checkpoints/2024-04-11-18-28-16/checkpoint.pt

The only difference from the training command is that --mode is set to predict instead of train. The run outputs the predicted energy and atomic forces for each catalysis system in the test dataset. The test dataset should be specified in the config file.

Reproduction

We provide all the generated predictions for training, validation, test, and extrapolation, as well as the attention scores collected for Task2. They can be found in the Data section.

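We also provide three .ipynb notebooks (Task1_model_comparison.ipynb, Task2_attention_check.ipynb, and Task3_prediction_cluster_check.ipynb) that can be used to regenerate the results reported in the manuscript. The data must be downloaded first, and the data paths in these notebooks must be updated to point to the downloaded files.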
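
One way to avoid editing paths in several places is to resolve the data location once at the top of each notebook. This is a minimal sketch under stated assumptions: the environment variable PLASMA_DATA_ROOT, the default folder ./plasma_data, and the function resolve_data_root are all hypothetical and do not exist in the repository or the notebooks.

```python
import os
from pathlib import Path


def resolve_data_root(default="./plasma_data", env_var="PLASMA_DATA_ROOT"):
    """Return the folder holding the downloaded data.

    The environment variable (hypothetical name) takes precedence over the
    default folder; a missing folder raises early with a clear message.
    """
    root = Path(os.environ.get(env_var, default)).expanduser().resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"data directory not found: {root}")
    return root


# Example use in a notebook cell (file name below is a placeholder):
# data_root = resolve_data_root()
# predictions_path = data_root / "some_predictions_file"
```

Failing early on a missing folder makes a wrong path obvious before any plotting or analysis cell runs.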

Features

The main added features are the model and dataloader for plasma catalysis. They can be found in:

plasma_v2: the dataloader class for plasma catalysis.
equiformer_v2_plasma: the model used for Task1 (Transfer Learning from Thermal Catalysis to Plasma Catalysis for Single Metal Atoms) and Task3 (Transfer Learning from Single Atoms to Metal Clusters).
gemnet_equiformer_v2_newdist: the model used for Task2 (Interpretable Transfer Learning to Elucidate the Role of Surface Charge).

Acknowledgements

This project uses code adapted from https://github.com/FAIR-Chem/fairchem (yes, they renamed it), which is available under the MIT license. We thank the original authors for their work.

Data

We provide a link to all the datasets and training configs we used. All the checkpoints for the trained models can be found at the same link: https://drive.google.com/drive/folders/1mCco444-XpJ7yrezEqb7MqweQ2QXIFbK?usp=drive_link

Rights

Questions regarding this code may be directed to ketong_shao (at) berkeley.edu
