A toolkit that integrates DFT codes with Bayesian optimization to explore stable atomic configurations in materials [1].
- Python 3.9.6 (tested)
- numpy == 1.26.4
- pandas == 2.2.3
- physbo == 3.1.0
- scikit-learn == 1.6.1
- scipy == 1.13.1
- Quantum ESPRESSO v7.3.1 (tested)
git clone https://github.com/a-ksb/PyAPX.git
cd PyAPX
pip install -r requirements.txt

For details about this material system, please refer to articles [2,3].
cd PyAPX/examples/H_GaN0001_6x6
python gen_candidates.py
PYTHONPATH=../.. python -m pyapx.cli
python visualize_results.py

For details about this material system, please refer to the article [4].
cd PyAPX/examples/h-BCN_3x3

Users need to prepare the following three files:
- apx.in: PyAPX input file
- qe_template.in: Quantum ESPRESSO input template
- candidates.csv: List of candidate atomic configurations
In apx.in, users can set parameters as follows:
ENCODING = True
RANDOM_SAMPLING = 2 # the number of random sampling
BAYES_SAMPLING = 2 # the number of Bayesian sampling
ENERGY_EVALUATOR = qe # "qe" or custom:module.function
#PARALLEL_COMMAND = mpirun -np 8 # mpi setting for dft, Default: "mpiexec"
OPTIMIZER = physbo

When ENCODING = True, encoded_candidates.pkl is generated, which is required for Bayesian sampling.
Then, set the number of random sampling and Bayesian sampling iterations. Bayesian sampling requires at least two initial data points.
PyAPX (v1.0.0) currently supports only Quantum ESPRESSO as a DFT code.
As an energy evaluator, you can also specify user-defined functions instead of DFT codes (see Quick Start example).
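The `KEY = VALUE # comment` layout of the simple settings shown above can be read with a few lines of Python. The following is only a minimal sketch for inspecting such a file; the function name `parse_simple_settings` is hypothetical, and this is not how PyAPX itself reads apx.in.

```python
def parse_simple_settings(text):
    """Parse `KEY = VALUE  # comment` lines into a dict of strings."""
    settings = {}
    for raw_line in text.splitlines():
        # Drop everything after the first '#'. This removes inline comments
        # and also skips fully commented-out lines such as #PARALLEL_COMMAND.
        line = raw_line.split("#", 1)[0].strip()
        if "=" not in line:
            continue  # blank lines and block keywords without '='
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


example = """\
ENCODING = True
RANDOM_SAMPLING = 2 # the number of random sampling
BAYES_SAMPLING = 2 # the number of Bayesian sampling
ENERGY_EVALUATOR = qe # "qe" or custom:module.function
#PARALLEL_COMMAND = mpirun -np 8 # mpi setting for dft, Default: "mpiexec"
OPTIMIZER = physbo
"""

settings = parse_simple_settings(example)

# Assumption for this sketch: all initial data points come from the random
# sampling stage of the same run, so at least two random samples are needed.
enough_initial_points = int(settings["RANDOM_SAMPLING"]) >= 2
print(settings)
print("enough initial points:", enough_initial_points)
```

Values are kept as strings on purpose, so the caller decides how to interpret entries such as `True` or `2`.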
# settings for encoding
ENCODE = NAmod # encoding method: "OH", "NA" or "NAmod"
WEIGHT = 0.3 # parameter for "NA" and "NAmod"

You can choose from the following encoding methods: one-hot (OH) encoding, neighbor-atom (NA) encoding, and modified neighbor-atom (NAmod) encoding. For details, please refer to the article [1].
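To illustrate the simplest of these options, the sketch below applies generic one-hot encoding to one candidate configuration. It assumes a species order of C, B, N and a flat site-by-site layout; the vector layout PyAPX actually uses may differ, and the NA and NAmod encodings are defined in [1] and are not reproduced here. The function name `one_hot_encode` is hypothetical.

```python
def one_hot_encode(configuration, species):
    """Return a flat 0/1 vector with len(species) entries per site."""
    index = {element: i for i, element in enumerate(species)}
    vector = []
    for element in configuration:
        block = [0] * len(species)
        block[index[element]] = 1  # mark the species occupying this site
        vector.extend(block)
    return vector


species = ["C", "B", "N"]  # assumed ordering, for illustration only
row = "C,B,B,B,B,B,B,C,C,C,C,C,N,N,N,N,N,N".split(",")
encoded = one_hot_encode(row, species)
print(len(encoded), sum(encoded))
```

Each site contributes exactly one nonzero entry, so the vector length is the number of sites times the number of species.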
# settings for physbo
SCORE = TS # acquisition function: "TS", "EI" or "PI"
NUM_RAND_BASIS = 3000 # the number of basis functions

You can choose from the following acquisition functions: Thompson Sampling (TS), Expected Improvement (EI), and Probability of Improvement (PI). For details, please refer to the PHYSBO documentation.
# settings for "NA", "NAmod"
NEIGHBOR_SITES
10 12 18 # site_1's neighbors are site_10, site_12 and site_18
10 11 16 # site_2's neighbors are site_10, site_11 and site_16
11 12 17 # site_3's neighbors are site_11, site_12 and site_17
.
.
.

When using NA or NAmod encoding, list the neighboring sites in order from site_1 to define the site network.
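Because the neighbor list is written by hand, a quick consistency check can be useful before running. The sketch below parses lines in the format shown above (row i lists the neighbors of site_i) and reports pairs that are listed in one direction only. It assumes the neighbor relation is meant to be symmetric, which is not stated here; the function names and the four-site ring are hypothetical and used only for illustration.

```python
def parse_neighbor_sites(lines):
    """Map site number (1-based, by row order) to its list of neighbor sites."""
    neighbors = {}
    for site, raw_line in enumerate(lines, start=1):
        fields = raw_line.split("#", 1)[0].split()  # ignore trailing comments
        neighbors[site] = [int(field) for field in fields]
    return neighbors


def asymmetric_pairs(neighbors):
    """Return (i, j) pairs where j is listed for i but i is not listed for j."""
    return [
        (i, j)
        for i, listed in neighbors.items()
        for j in listed
        if i not in neighbors.get(j, [])
    ]


# Hypothetical four-site ring: 1-2-3-4-1
ring = parse_neighbor_sites(["2 4", "1 3", "2 4", "1 3"])
print(asymmetric_pairs(ring))  # empty list for a consistent network

# The same ring with site_4's entry left incomplete
broken = parse_neighbor_sites(["2 4", "1 3", "2 4", "1"])
print(asymmetric_pairs(broken))
```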
In qe_template.in, users need to include atomic coordinates in the Quantum ESPRESSO input file as follows:
&control
calculation = 'vc-relax' # DFT calc settings
.
.
.
ATOMIC_SPECIES
C 12.01060 C_paw.UPF
B 10.81350 B_paw.UPF
N 14.00650 N_paw.UPF
ATOMIC_POSITIONS {crystal}
X 0.0000 0.0000 0.0000 # site_1
X 0.0000 0.3333 0.0000 # site_2
X 0.0000 0.6667 0.0000 # site_3
.
.
.

Here, sites that are subject to atomic rearrangement should be set with the element as a wildcard "X".
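The role of the wildcard can be sketched as a simple substitution: each line whose first token is "X" receives the next element of a candidate configuration, in order. This is an illustrative sketch only, not PyAPX's actual implementation; the function name `fill_template` is hypothetical, and the template fragment omits the explanatory `# site_N` annotations.

```python
def fill_template(template_text, elements):
    """Replace the leading 'X' of each wildcard line with the next element."""
    remaining = iter(elements)
    filled = []
    for line in template_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "X":
            try:
                element = next(remaining)
            except StopIteration:
                raise ValueError("more wildcard sites than elements") from None
            # The first 'X' character is the wildcard token itself.
            line = line.replace("X", element, 1)
        filled.append(line)
    if next(remaining, None) is not None:
        raise ValueError("more elements than wildcard sites")
    return "\n".join(filled)


template = """ATOMIC_POSITIONS {crystal}
X 0.0000 0.0000 0.0000
X 0.0000 0.3333 0.0000
X 0.0000 0.6667 0.0000"""

print(fill_template(template, ["C", "B", "N"]))
```

Raising an error on a length mismatch makes it obvious when the number of wildcard sites and the number of site columns disagree.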
In this example, candidates.csv is generated by the following enumeration script and contains 5,717,712 candidate configurations.
julia gen_candidates.jl # julia 1.10.3 (tested)

structure_id,site_1,site_2,site_3,...,site_18
0,C,B,B,B,B,B,B,C,C,C,C,C,N,N,N,N,N,N
1,C,B,B,B,B,B,B,C,C,C,C,N,C,N,N,N,N,N
2,C,B,B,B,B,B,B,C,C,C,C,N,N,C,N,N,N,N
.
.
.
5717711,C,N,N,N,N,N,N,C,C,C,C,C,B,B,B,B,B,B

Users are required to prepare candidates.csv in the format shown above.
The sites in candidates.csv correspond to the sites specified by "X" in qe_template.in in order.
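The candidate count quoted above can be cross-checked with a short calculation. The sketch assumes, based only on the rows displayed above, that site_1 is always C and that the remaining 17 sites hold 5 C, 6 B, and 6 N atoms; the actual rules used by gen_candidates.jl are not shown here.

```python
from math import comb

free_sites = 17                      # site_2 ... site_18 (site_1 assumed fixed to C)
counts = {"C": 5, "B": 6, "N": 6}    # assumed composition of the free sites

# Multinomial coefficient: choose positions for C, then for B; N fills the rest.
n_candidates = comb(free_sites, counts["C"]) * comb(free_sites - counts["C"], counts["B"])
print(n_candidates)  # 5717712
```

Under these assumptions the result matches the last structure_id plus one (5717711 + 1).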
Finally, ensure that the three files apx.in, qe_template.in, and candidates.csv are present in the current directory, and run the following command with the PYTHONPATH set to the project root directory where the pyapx directory is located.
PYTHONPATH=../.. python -m pyapx.cli

Output files:
- samples.csv: Sampling results and energy values
- dft_calc/qe_sample_[ID].in: Generated QE input files
- dft_calc/qe_sample_[ID].out: QE calculation output files
When publishing results obtained with PyAPX, please cite the article [1]. Additionally, please follow the guidelines for PHYSBO and Quantum ESPRESSO to acknowledge their usage and cite the relevant references.
[1] A. Kusaba et al., "PyAPX: Python toolkit for atomic configuration pattern exploration", arXiv:2511.17972 [cond-mat.mtrl-sci].
PyAPX originates from our previous studies [2,3,4].
[2] A. Kusaba et al., "Exploration of a large-scale reconstructed structure on GaN(0001) surface by Bayesian optimization", Applied Physics Letters 120, 021602 (2022).
[3] K. Kawka et al., "Augmentation of the Electron Counting Rule with Ising Model", Journal of Applied Physics 135, 225302 (2024).
[4] T. Hara et al., "Exploration of Stable Atomic Configurations in Graphene-like BCN Systems by Density Functional Theory and Bayesian Optimization", Crystal Growth & Design 25, 6719-6726 (2025).