PH451 Final Project

Justin Lewis, Olivia Holmes, Riley Nold

April 2024

Abstract

With the rising popularity of Perovskite compounds in technology centers, particularly photovoltaics,

experimental research into Perovskites has increasing appeal. To conduct this research,

Perovskites must be created, but not all Perovskite structures are energetically favorable enough

to be reliably grown in a lab setting. This stability is based on the distance of a compound from

its convex hull. This distance can be calculated by hand, but it is a time-consuming process. We

attempt to simplify the process of stability predictions by outsourcing it to a machine learning

program. By training three models we find that we can isolate important factors in stability

prediction, estimate a compound's convex hull distance, and produce a prediction of stable or

unstable based on whether a compound's convex hull distance is above or below 40 meV per atom.

These tasks are performed by a random forest, neural network regressor, and neural network

classifier model respectively.

1 Introduction

The side-by-side advancement of modern material physics and accessible computational power has

made possible the data-mining and analysis of vast theoretical and experimental material properties

databases. In particular, the application of machine learning principles to important questions of

material physics has generated complex and robust prediction schemes which have informed materials

discovery and experimental research for the last two decades. In this project, we seek to address

key questions in the field of ceramics and ceramic oxides through a combination of domain-specific

knowledge and neural network modelling. As a result of our work, we have developed two neural

networks capable of predicting the formation energy and convex-hull-distance of novel Perovskite

compounds, as well as classifying such compounds as experimentally stable or unstable. We have

also utilized a simple Random-Forest model to provide physical insight into which features are most

predictive in regard to Perovskite stability/instability. We hope our work can function as a ”pre-

screening” for Perovskite researchers seeking to isolate compounds worthy of further experimental

and theoretical analysis.

2 Background and Theory

Novel materials have driven technological revolutions in nearly all sectors of society, including health,

energy, scientific research, and beyond. In the medical sector, the discovery of silicon-based semi-

conducting devices provided for magnetic-resonance imaging and improved patient implants (such

as heart stents) [1]. In the industrial setting, materials science has been aimed towards problems

of energy storage and energy collection devices, tunable materials, and structurally robust material

frameworks [2].

Over the last decade, interest in a new material class known as "Perovskites" has grown due to

promising initial results detailing their wide range of tunable properties. For example, Perovskite

based solar cells were found to have an abnormally high solar power conversion efficiency [2] compared

to standard silicon-based devices [3], primarily due to their large photoabsorption coefficient [4].

Certain Perovskite materials, particularly Tin-Iodide based Perovskites, have also been shown to

experience temperature-induced structural and metal-to-insulator transitions, allowing for thermal

control of both their electronic and crystallographic properties [2]. In the biochemical research sphere,

Perovskites have shown tremendous performance in chemical sensing tasks, with the ability to detect

noxious and flammable compounds at concentrations as low as 50 ppm [5].

Given the complex nature of these materials, a brief description of what constitutes a Perovskite

is warranted.

2.1 Perovskite Material Class

Perovskite compounds are crystalline materials with a chemical formula of ABX3, where A and B

are taken to be cations, and X is taken to be an anion (typically oxygen). These materials typically

form in a cubic structure, with the B-type ions sitting at the center and the X ions sitting at the centers of the cube

faces, forming an X-ion octahedron which surrounds the cationic center. The A-type cations are set

at the corners of the cubic unit cell.

Figure 1: Unit-Cell of Perovskite material (Figure is from Ref. 5).

2.2 Existing Challenges

Currently, companies and research labs across the globe are looking for new ways to synthesize

Perovskite materials and realize these exciting technologies at an industrial level [6]. Despite this

excitement, serious roadblocks have emerged in the field. In order to realize these novel material

properties, synthesized Perovskites must be phase-stable, stoichiometrically pure, and stable against

environmental factors. In general, it is known for many Perovskites that there exist materials of

similar stoichiometry which may be more thermodynamically favored to form than the Perovskite

of interest. In laboratory settings, reports have shown this effect manifesting as phase-segregation,

rapid oxidation, degradation due to light exposure, temperature-induced material degradation, etc.

[2,3,6]. The onset of these phase-instabilities can completely disrupt the desirable properties of the

intended Perovskite.

2.3 The Primary Problem

Given the existing challenges of Perovskite phase-stability, a natural question arises: Can we predict

which Perovskites will be the most stable? A predictive model which can classify Perovskites as more

or less ”experimentally stable” would function to pre-screen the Perovskite material space, allowing

researchers and companies to isolate the most promising candidates for Perovskites with properties

which are robust against potential phase-instability and environmental degradation. One common

metric of material stability is the Convex-Hull distance (CHD). In a sense, the CHD measures

the difference in formation energy of a material from the

closest known stable-phase compound with similar crystal structure and stoichiometry. A CHD of 0

implies the material is stable, while large positive values of CHD indicate increasing phase-instability.

Therefore, we seek to develop a set of models (in particular, a Machine-Learning model) capable

of predicting a material's CHD, formation energy, and general stability given information about its

constituent elements.

2.4 Related Works

Machine-Learning and related theoretical and/or computational approaches have shown great suc-

cess in predicting the thermodynamic properties of a wide variety of compounds. It has been

shown that high-throughput Density-Functional Theory (DFT) predictions are highly accurate in

predicting formation energies and CHD values, but are computationally lengthy and expensive

[7]. Machine-Learning models trained on the Open-Quantum-Materials database (Random-Forests,

Support-Vector Machines, Neural-Networks) have been shown to provide low MSE predictions of

formation energies for ternary compounds similar to the Perovskite material class [7]. More specifically,

research performed by Wei Li et al in 2018 found that a neural-network approach to classifying

Perovskites as above or below a 40 meV/atom CHD threshold was successful with a test accuracy, F1-score,

and AUC-score of 0.93, 0.88, and 0.976 respectively [8,9]. A natural materials-discovery pipeline

therefore presents itself. Existing experimental and DFT-based Perovskite data can be used to train

complex Machine-Learning models which can predict highly-stable Perovskites which are viable for

future experimental and theoretical analyses [10].

3 Research Goals

In our project, we focus on a particular portion of this materials-discovery pipeline, namely, the

development of a predictive machine-learning model(s) capable of estimating CHD and formation

energy values and classifying Perovskites as stable vs. unstable according to a particular CHD

threshold. Throughout the course of this project and the development of the model(s), we aim to

address two questions:

1. Which Machine-Learning architecture(s) are best-suited for modelling Perovskite stability?

2. Which elemental features provide the most predictive power in determining Perovskite stability?

4 Data Description and Visualization

The dataset that is to be used for this model can be found at figshare.com and was collected under

NSF grant 1148011 in collaboration with the Wisconsin Education Innovation Committee [11]. It

contains 65 columns of data (features and labels), including the atomic radii of the composite atoms

for each material, the unit cell volume, the electron affinity, and the first ionization energy. The data

is categorized by three key labels: convex hull distance, formation energy, and "stable vs. unstable" based

upon a 40 meV/atom CHD threshold. In all there are 1929 instances split between the training, validation,

and testing sets. To give a sense of some of the (potentially) relevant features, we plot below the

covalent radius vs. the formation energy and CHD values respectively.

Figure 2: Plot of Convex Hull Distance (meV/atom) vs. Covalent Radius (Å).

Figure 3: Plot of Formation Energy (eV/atom) vs. Covalent Radius (Å).

From these graphs it is hard to draw any firm conclusions as they are quite scattered, though

there appear to be some vertical linear groupings in the covalent radius vs. formation energy graph.

We also show scatter plots for electron affinity vs convex hull distance and formation energy:

Figure 4: Plot of Convex Hull Distance (meV/atom) vs. Electron Affinity (kJ/mol).

Figure 5: Plot of Formation Energy (eV/atom) vs. average AB site Electron Affinity (kJ/mol).

Here, linear groupings can be seen between the electron affinity and the formation

energy, which are more distinct at higher energies. As mentioned in the background, the Perovskite

structure is typically of the ABX3 form. However, in many Perovskites there is not a single A or

B site element. Instead, the A and B site elements vary from unit cell to unit cell between a few

elements of interest. Each Perovskite must be composed of at least three distinct

elements. To understand the underlying element distribution of our dataset, we provide a bar-chart

of how many elements the various instances contain:

Figure 6: Histogram depicting the number of elements in the various Perovskite instances.

It can be seen that there are significantly more 4- and 5-element materials than any other number.

Lastly, in order to contextualize our later results we must discuss the underlying label distribution

of our dataset. Namely, we find that the formation energies follow a multimodal distribution with an underlying

Gaussian background centered at ∼ −1.8 eV/atom. The largest peaks appear around −2 eV/atom and

−1.5 eV/atom. In regard to stability, we find that only 29.4% of Perovskites in the dataset are stable

(w.r.t. the chosen CHD threshold) with 70.6% deemed unstable. This ratio will be important to

keep in mind during the discussion of our classifier models.

5 Models and Methodologies

5.1 Loading the Data

To begin, the data was read from a local Excel file into a pandas DataFrame within the attached lab

notebook. The data was subsequently converted to a NumPy array and split into training, validation,

and test data using the scikit-learn train_test_split function [12]. Three models were trained in

Figure 7: Bar-Chart of formation energies (eV/atom) for the 1929 instances in our dataset.

Figure 8: Pie chart depicting the percentage of stable and unstable Perovskites within the dataset.

order to address the various research goals. The first model was a random-forest ensemble [13],

whereas the second and third models were a neural network regressor and classifier respectively. The

train-test split was first applied including only the two numerical labels (Convex Hull Distance and

formation energy). This first data-split was used to train the neural network regressor. A second

train-test split was performed on a copy of the original dataset, this time including only the categorical

label (stable vs. unstable). This split dataset was used for both the random-forest model and neural

network classifier. The stability labels were generated by comparing the CHD for an instance to

a 40 meV/atom threshold, similar to other works in this field [8]. Instances with a CHD above 40 meV/atom were

labeled unstable (0), while those with a CHD below 40 meV/atom were labeled stable (1). For both split data

sets, 80%/10%/10% train/validation/test split percentages were used. Finally, these data were read

into separate PyTorch dataloaders in order to be interpretable to our models during training.
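
The labelling and splitting steps described above can be sketched in plain Python. This is only a minimal illustration, not the project's notebook code: the helper names (label_stability, split_indices), the random seed, and the choice to count a CHD exactly at the threshold as stable are all assumptions, and the project itself used the scikit-learn splitting function rather than a hand-written shuffle.

```python
import random

CHD_THRESHOLD_MEV = 40.0  # stability threshold in meV/atom, as stated in the text


def label_stability(chd_mev_per_atom):
    """Return 1 (stable) or 0 (unstable) for a given convex hull distance.

    Counting a CHD exactly equal to the threshold as stable is an assumption.
    """
    return 1 if chd_mev_per_atom <= CHD_THRESHOLD_MEV else 0


def split_indices(n_instances, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle row indices and cut them into train/validation/test subsets."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    n_train = int(train_frac * n_instances)
    n_val = int(val_frac * n_instances)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1929)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because 1929 is not divisible by 10, the three subsets cannot be exactly 80%/10%/10%; in this sketch the leftover rows fall into the test set.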

5.2 Random Forest Model and Results

Our primary model architectures for both the regression and classification tasks were chosen to be

neural networks due to their versatility, ability to capture complex relations, and the particular

skillsets of our group. However, two critical drawbacks of the neural network approach are apparent.

Firstly, it is often difficult to accurately interpret the weights of a complex neural network, making it

challenging to determine which combination(s) of features are the strongest-predictors of our desired

label/outcome. Additionally, neural networks are often computationally expensive and data-hungry

which can make them unwieldy, especially when compared to simpler architectures. For these reasons,

we decided to first train a Random-Forest model on the classification task before moving on to the

neural networks. The Random-Forest will also serve as a useful benchmark for the success of our

more complex models, allowing us to gauge if the increased sophistication produces (or is worth)

potential boosts in accuracy.

Our Random-Forest was composed of 100 decision trees with a Gini-Impurity training criterion

[14]. The min_samples_leaf and min_samples_split parameters were both set to 5, with no max_depth restriction

and bootstrapping enabled.

The model performed with an 88.1% (82.3%) validation (test) classification accuracy, which (as

we will see later) under-performed the neural network classifier. However, the interpretability of

the model provides some useful physical insights. In particular, the trained Random-Forest model

returns a normalized Gini-importance index (GII) for each feature, with a higher GII denoting a

higher predictive power associated with that feature. For more information on the GII, see [14]. We

report in figure 9 the GII of the 61 features (as a bar chart).
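
A forest with the hyperparameters stated above, together with its normalized Gini importances, could be set up in scikit-learn roughly as follows. The data here is a synthetic stand-in (200 instances, 5 features, with only feature 0 carrying signal) and the random seeds are assumptions; the sketch is meant to show where the importances come from, not to reproduce the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (hypothetical): 200 instances, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only feature 0 determines the label, so it should rank first.
y = (X[:, 0] > 0).astype(int)

forest = RandomForestClassifier(
    n_estimators=100,      # 100 decision trees
    criterion="gini",      # Gini-impurity split criterion
    min_samples_leaf=5,
    min_samples_split=5,
    max_depth=None,        # no depth restriction
    bootstrap=True,        # bootstrapping enabled
    random_state=0,        # assumed seed, for repeatability only
)
forest.fit(X, y)

# feature_importances_ holds the impurity-based importances, normalized to sum to 1.
importances = forest.feature_importances_
ranking = np.argsort(importances)[::-1]
for idx in ranking:
    print(f"feature {idx}: {importances[idx]:.3f}")
```

A feature that is never chosen for a split receives an importance of exactly zero, which is the situation described later for the one categorical feature.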

The four most predictive features in order were ”Asite BCCenergy pa max”, ”Asite BCCenergydiff min”,

"Bsite Second Ionization Potential (V) weighted avg", and "Bsite At.#weighted avg", with normalized

GII values of approximately 0.050, 0.049, 0.045, and 0.043 respectively. The first feature corresponds

to the formation energy of the A-site element (per atom) in the BCC crystal phase.

If multiple A-site elements are present, the max of their respective BCC formation energies is taken.

Figure 9: Gini Importance Index of the 61 Perovskite features.

Likewise, the second feature corresponds to the minimum pairwise difference in BCC formation ener-

gies between any two A-site elements. If only one A-site element is present then this feature is set to

0. The third feature is the weighted average of the second ionization potential (V) across the B-site

elements. The last feature is the weighted average of the B-site atomic number (essentially encoding

the B-site element choice). In general, these features encode three concepts: the crystallography of

the A-site element(s), the chemical stability of the B-site element(s), and the B-site element choice

more generically. Information surrounding these three key concepts seems to provide the most predictive

power in determining the stability of the Perovskite in question. The most important feature

(Asite BCCenergy pa max) is plotted below against the associated CHD values. We observe that the

data in this plot are strongly grouped into vertical columns, with certain columns having a much higher or lower mean

CHD value, making them highly informative to split on in the random-forest context. A similar

trend is observed for the other three high GII features. We also note that one categorical feature,

"Is Pnictide", was never split on and therefore has a GII of zero. This feature encodes whether

a Perovskite contains a group 15 element; its zero GII suggests that such information is largely uncorrelated with

Perovskite stability in this dataset.

5.3 Neural Network Models

After testing the Random-Forest model, two independent neural network models were trained and

tested. The first model is a regression-oriented PyTorch-based [15] neural network. The model begins

with a batch-normalization of the input data [16], in order to standardize the variety of inputs and

Figure 10: Asite BCCenergy pa max vs. CHD values for all 1929 Perovskite instances.

learn their general scale and offset. We then implement a dense layer with 256 output neurons,

another batch normalization layer, and finally a Parametric-ReLU activation layer [17]. This 3-layer

sequence is repeated 3 more times, with the number of output neurons in the dense layer decreasing

approximately 2-fold each time. Additionally, dropout layers [18] are added to the final two sequences

in order to combat overfitting (which may arise from the PReLU activation, among other sources).

The dropout probabilities were fixed during training to around 0.5 to 0.6. The final layer has 2 output

neurons, as we are attempting to predict two labels (convex hull distance and formation energy).
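
The layer sequence described above might look roughly like the following in PyTorch. This is a sketch under stated assumptions rather than the project's actual model: the class and helper names are hypothetical, the exact widths (256, 128, 64, 32) are one reading of "decreasing approximately 2-fold", 61 input features is taken from the feature count quoted for the Random-Forest, and placing dropout after the activation with p = 0.5 is a guess.

```python
import torch
from torch import nn


def make_block(n_in, n_out, dropout=None):
    """Dense layer -> batch normalization -> PReLU, with optional dropout."""
    layers = [nn.Linear(n_in, n_out), nn.BatchNorm1d(n_out), nn.PReLU()]
    if dropout is not None:
        layers.append(nn.Dropout(p=dropout))
    return layers


class PerovskiteRegressor(nn.Module):
    """Hypothetical reconstruction of the regressor described in the text."""

    def __init__(self, n_features=61, widths=(256, 128, 64, 32), dropout=0.5):
        super().__init__()
        layers = [nn.BatchNorm1d(n_features)]  # normalize the raw inputs
        n_in = n_features
        for i, n_out in enumerate(widths):
            # Dropout only in the final two dense sequences.
            use_dropout = i >= len(widths) - 2
            layers += make_block(n_in, n_out, dropout if use_dropout else None)
            n_in = n_out
        layers.append(nn.Linear(n_in, 2))  # outputs: CHD and formation energy
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


model = PerovskiteRegressor()
model.eval()  # use running statistics and disable dropout for this check
with torch.no_grad():
    out = model(torch.randn(8, 61))
print(out.shape)
```

A classifier variant along the lines described in the next paragraph would swap nn.PReLU() for nn.LeakyReLU(0.01), end in nn.Linear(n_in, 1), and apply a sigmoid to the output.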

The second model is a PyTorch-based neural network classifier, and is identical to the first model

with three exceptions. Firstly, the final linear layer has only one output neuron as we are only

predicting a single label (stable vs. unstable). Secondly, a sigmoid activation is called after the final

linear layer, in order to produce a valid class probability. Finally, the parametric-ReLU activations

were replaced with Leaky-ReLU activations [17] with a fixed slope parameter of 0.01. This final

modification will be discussed in the ”Model Justification” section.

For both models, an Adam optimizer [19] was utilized with an initial learning rate of 0.3 (0.6) for

the regressor (classifier), and a weight-decay [20] of 1e-6 for both models. The Adam optimizer was

used in order to converge to a loss minimum more quickly than standard SGD approaches. The learning rate

was handled by a Cosine-Annealing-Warm-Restarts learning-rate scheduler [21], a complex scheduler

which will be discussed further in the following section. The T_0 and eta_min values for the scheduler

were set to 300 epochs and 0.002 respectively. The learning rate vs. the number of epochs is shown

below for the training-loop of the regression model. As we can see, the learning rate initially drops

following a cosine curve, reaching near zero after 300 epochs. The learning rate is then artificially

spiked back to the initial learning rate, before the cosine trend is repeated with a period of 300

epochs. Both neural networks were trained for a total of 2500 epochs. For further detail, please see

the attached pdf which contains the relevant code discussed above.

Figure 11: Plot of regression model learning rate vs. number of training epochs.

The particular metrics and loss-functions utilized will be discussed further in the analysis section.

Standard training and testing loops were utilized, similar to those seen in previous hackathons

and hands-on-activities (Hands-On 6 for example). We note that these loops were modified in order

to return the best model from training (according to the validation metric) as opposed to the final

model, so as to suppress any chance irregularities which may occur towards the end of the training

cycle.
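
The "return the best model rather than the final model" modification can be illustrated with a small framework-free sketch. The function name, the run_epoch callable, and the toy metric values are all hypothetical; the only point is the pattern of snapshotting the model state whenever the validation metric improves.

```python
import copy


def train_keep_best(model_state, run_epoch, n_epochs, higher_is_better=True):
    """Run `run_epoch` repeatedly and return the state with the best validation metric.

    `run_epoch(state, epoch)` is a hypothetical callable that updates `state`
    in place and returns the validation metric for that epoch.
    """
    best_metric = None
    best_state = None
    best_epoch = None
    for epoch in range(n_epochs):
        metric = run_epoch(model_state, epoch)
        improved = (
            best_metric is None
            or (metric > best_metric if higher_is_better else metric < best_metric)
        )
        if improved:
            best_metric = metric
            best_epoch = epoch
            best_state = copy.deepcopy(model_state)  # snapshot, not a reference
    return best_state, best_metric, best_epoch


# Toy usage: the "model" is a dict and the validation metric peaks at epoch 2.
toy_metrics = [0.70, 0.85, 0.92, 0.88, 0.90]


def toy_epoch(state, epoch):
    state["epoch"] = epoch
    return toy_metrics[epoch]


best_state, best_metric, best_epoch = train_keep_best({"epoch": -1}, toy_epoch, 5)
print(best_epoch, best_metric, best_state)
```

In a PyTorch loop the snapshot would typically be a deep copy of model.state_dict(); copying matters because the state dict otherwise keeps pointing at tensors that later epochs overwrite.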

5.4 Model Justification

Given the two-fold neural network approach of our project (regression and classification) and our

group's particular skill-sets, a neural network approach was deemed best suited for this task. The

batch-normalization layers were introduced in order to improve information propagation through the

network, and to allow the model to learn the general scale of each of the input features. A ReLU-like

activation was chosen in order to avoid the problem of vanishing gradients. PReLU activation in

particular was chosen to combat the “dying ReLU” problem during training and to allow our model

to find the best slope parameters during training. On the other hand, these extra degrees of freedom

(the PReLU slopes) could lead to overfitting if not properly handled. While this overfitting was not

observed for the regressor, the classifier suffered significant overfitting during our initial tests. To

combat this model failure, two regularization techniques were chosen. Firstly, the dropout layers

were added to regularize the model and prevent severe overfitting of the training set. Secondly, the

aforementioned weight-decay was utilized in order to prevent exploding neuron weights. For the

Adam optimizer, weight-decay acts similarly to the L2 regularization seen in other models, penalizing

the model for large weights (w.r.t. the L2 norm).

Finally, we address the choice of learning-rate scheduler. Our initial attempts utilized a performance-based

scheduler (ReduceLROnPlateau from PyTorch) with a patience of 5 epochs and a learning

factor of 0.95. However, we noticed the learning rates for both the regressor and classifier reduced

exponentially to zero, largely due to unforeseen instabilities during training. This led to a ”freeze”

in learning after approximately 300 epochs. In order to combat this ”dying-learning-rate” problem,

we switched to the Cosine-Annealing-Warm-Restarts scheduler. This scheduler operates by reducing

the learning rate via a cosine function of the number of epochs, reaching a learning-rate minimum

(eta_min) at a specified number of epochs (T_0). The learning rate is then artificially spiked back

to the initial learning rate, repeating the same cosine pattern with a period of 300 epochs (as was

previously mentioned).
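
The schedule just described can be written out directly. The sketch below uses the cosine-annealing formula from the SGDR paper [21] with a constant restart period, which matches the fixed 300-epoch period reported here; the function name is hypothetical, and the maximum learning rate of 0.3 is the regressor's initial rate quoted earlier.

```python
import math


def cosine_warm_restart_lr(epoch, eta_max, eta_min=0.002, t_0=300):
    """Learning rate at a given epoch for cosine annealing with warm restarts.

    Assumes a constant restart period t_0 (no period growth between restarts).
    """
    t_cur = epoch % t_0  # epochs elapsed since the last restart
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t_cur / t_0))


for epoch in (0, 150, 299, 300):
    print(epoch, round(cosine_warm_restart_lr(epoch, eta_max=0.3), 4))
```

Halfway through a cycle the rate sits at the midpoint between eta_max and eta_min, and at epoch 300 it jumps back to eta_max, which is the source of the periodic spikes in the training curves.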

6 Analysis

6.1 Neural Network Models

To analyze our results on the regression model we decided to use the mean-squared error as a loss

function and the L1 loss as our metric, seeking to minimize each [22]. The L1 loss was chosen as a

performance metric primarily due to outliers that were noted in the initial data visualization. Both

the loss and metric dropped quickly over the first five or so epochs before leveling out and decreasing

more incrementally. We also observed large spikes over certain epoch ranges, which can be explained

by the learning-rate spikes induced by the scheduler.
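
To make the outlier argument concrete, the following sketch compares the mean-squared error and the L1 (mean absolute) error on two made-up sets of predictions, one of which contains a single large miss. The numbers are purely illustrative and are not taken from the dataset.

```python
def mse(y_true, y_pred):
    """Mean-squared error: squares each residual, so large misses dominate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """L1 (mean absolute) error: grows only linearly with each residual."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [0.0, 0.0, 0.0, 0.0]
small_errors = [0.1, -0.1, 0.1, -0.1]   # every prediction is off by 0.1
one_outlier = [0.1, -0.1, 0.1, -2.0]    # same, except one prediction is off by 2.0

print("MSE:", mse(y_true, small_errors), "->", mse(y_true, one_outlier))
print("L1 :", mae(y_true, small_errors), "->", mae(y_true, one_outlier))
```

The single outlier inflates the MSE by roughly a factor of 100 but the L1 error by less than a factor of 6, which is why the L1 error gives a steadier picture of typical performance when outliers are present.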

By these metrics, the best model we achieved was at epoch 2304 with a test MSE loss of 0.0628

and a test L1 loss of 0.1738. Although the regressor was not trained for the stable/unstable classification

task, we can also calculate an accuracy score for the regressor by comparing the model CHD

predictions against the 40 meV/atom threshold and generating associated class predictions. Under this scheme, the

Figure 12: Plot of regression model MSE and L1 loss vs. number of training epochs.

regressor was found to have an 86.5% test classification accuracy.

To train our classifier model we used binary cross entropy loss as the loss function and a standard

accuracy metric. For accuracy we divided the number of correct classifications by the total number

of classifications, thus trying to minimize the BCE loss criterion and maximize the accuracy metric.

The loss function dropped quickly while the accuracy metric took a while to grow and was somewhat

jumpy between epochs (as seen in Fig. 13). The best model (achieved on epoch 2133) had a test BCE

loss of 0.2953 and an accuracy of 92.5%. We managed to achieve an ROC-AUC of 0.932 with an F1

score of 0.761. The ROC-Curve and confusion matrix for this model are shown below. We note that

this classification accuracy is on par with other models on similar datasets [8], and was notably more

accurate than the random-forest approach.

The model was able to very accurately determine the unstable materials, with a lower but still

reasonably high accuracy for classifying the stable compounds. This is likely in part due to the

number of unstable materials greatly outnumbering the stable ones in our dataset and in nature.

The numbers of type 1 and type 2 errors are also quite close. Despite this

similarity, we conclude that the model has a slight false positive bias (predicted stable when the Per-

ovskite is actually unstable), due to the asymmetry in the underlying stable/unstable distribution.

In other words, since there are approximately 3 times as many unstable compounds in the dataset,

we would expect approximately three times more false negatives than false positives.
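
When classes are imbalanced, it can help to look at per-class error rates alongside raw error counts. The sketch below computes the usual metrics from a confusion matrix, taking "stable" as the positive class. The counts passed in are hypothetical placeholders chosen only to resemble a roughly 30/70 class split; they are not the values from the confusion matrix in Figure 15, and the function name is an assumption.

```python
def classification_report(tp, fp, fn, tn):
    """Summary metrics from confusion-matrix counts (positive class = stable)."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # Fraction of truly unstable compounds predicted stable.
        "false_positive_rate": fp / (fp + tn),
        # Fraction of truly stable compounds predicted unstable.
        "false_negative_rate": fn / (fn + tp),
        # Accuracy of always predicting the larger class.
        "majority_baseline": max(tp + fn, fp + tn) / total,
    }


# Hypothetical counts for illustration only.
report = classification_report(tp=48, fp=8, fn=7, tn=130)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

With these placeholder counts the numbers of false positives and false negatives are nearly equal, yet the two error rates differ by about a factor of two because each is normalized by a different class size. The majority-class baseline is also worth keeping in view, since a classifier on this kind of split scores around 70% accuracy by always predicting "unstable".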

Figure 13: Plot of classifier model BCE loss and accuracy vs. number of training epochs.

Figure 14: Receiver-Operating Curve (ROC) for the Perovskite stability classifier.

Overall, both models provided satisfactory results on their respective tasks.

7 Further Discussion and Conclusions

7.1 Perovskite Stability and Formation

Although a more thorough analysis may be required (see model considerations), initial estimates

of the most important features for Perovskite stability have been determined from the Random-

Figure 15: Confusion Matrix for Perovskite stability classifier.

Forest model. Three types of features have been identified as key estimators of Perovskite stability:

electronic/chemical structure of the constituent elements, crystallographic properties (namely BCC

formation energy), and more generally the B-site element choice. For example, the second ionization

energy of the B-site element seems to heavily dictate the stability of the Perovskite compound.

This result fits well with a chemical view of Perovskite stability, as the electronic environments of

the constituent components dictate whether or not a stable bond(s) will be formed. Secondly, the

thermodynamic/crystallographic properties of the A-site element, such as the BCC formation energy,

also act as key predictors. This information seems to capture the idea of thermodynamic competition

between the desired Perovskite phase and elemental or binary phases. It may also imply a similarity

in local atomic environment for the A-site element between the BCC and Perovskite phases, so that

high/low A-site BCC stability correlates with improved/diminished Perovskite stability. We note

that Wei Li et al [8] also found similar features were important in predicting stability in more general

classes of oxide materials.

In regard to the neural network models, we found that both the regressor and classifier were

highly successful in their respective tasks, achieving accuracies/losses which were comparable to other

models seen in the literature [8,10]. Notably, the neural network classifier achieved a 4% increase in

test accuracy compared to the random-forest model, justifying the more sophisticated architecture

choice. Despite this jump in complexity our approach still represents a notable reduction in the

feature-space, model size, and training time compared to other similar works [8,10].

7.2 Further Considerations

Although the random forest model provides a certain degree of information regarding relative feature

importance (for the classification task), more advanced methods may provide a more complete picture of the

feature importance problem. In future studies, a principal-component analysis or lasso-regularization

approach may provide further insights into the predictive power of our features and allow us to fur-

ther reduce the feature space our neural networks have to explore. Additionally, when comparing the

neural network classifier training and validation accuracies, we can see there is still slight overfitting

occurring. This may be due to the choice of Leaky ReLU activation. Therefore, further experimenta-

tion with the classifier activation functions and our regularization methods is required. Despite these

considerations, our first attempt is highly promising, as we were able to produce a classifier with a

test accuracy of ∼ 92.5%. It is possible that, with further hyperparameter tuning of the dropout and

initial learning rates, these accuracies could be further improved. However, our current models

are still comparable in performance to other models in the literature.

7.3 Concluding Thoughts

The methods developed in this paper have the potential to improve the lives of Perovskite re-

searchers by equipping them to make fast assessments with limited data regarding Perovskite stability.

Nonetheless, further hyperparameter tuning and a larger dataset may improve the models' capabilities

still further. In the authors' opinion, further investigation into improving the models is

worthwhile.

8 References
1. Saddow SE. Silicon Carbide Technology for Advanced Human Healthcare Applications. Micromachines

(Basel). 2022 Feb 22;13(3):346. doi: 10.3390/mi13030346. PMID: 35334637; PMCID: PMC8949526.

2. Jung, Hyun Suk, and Nam-Gyu Park. ”Perovskite Solar Cells: From Materials to Devices.” Small 11.1

(2015): 10-25. Print.

3. Meng, Lei, Jingbi You, and Yang Yang. ”Addressing the Stability Issue of Perovskite Solar Cells for

Commercial Applications.” Nature Communications 9.1 (2018). Print.

4. K. Rao, Maithili, et al. ”Review on Persistent Challenges of Perovskite Solar Cells’ Stability.” Solar

Energy 218 (2021): 469-91. Print.

5. Shellaiah, Muthaiah, and Kien Wen Sun. ”Review on Sensing Applications of Perovskite Nanomateri-

als.” Chemosensors 8.3 (2020): 55. Print.

6. Rong, Yaoguang, et al. ”Challenges for Commercializing Perovskite Solar Cells.” Science 361.6408

(2018): eaat8235. Print.

7. Peterson, Gordon G. C., and Jakoah Brgoch. ”Materials Discovery through Machine-Learning Forma-

tion Energy.” Journal of Physics: Energy 3.2 (2021): 022002. Print.

8. Li, Wei, Ryan Jacobs, and Dane Morgan. ”Predicting the Thermodynamic Stability of Perovskite

Oxides Using Machine-Learning Models.” Computational Materials Science 150 (2018): 454-63. Print.

9. Wu, Yabi, et al. ”First Principles High Throughput Screening of Oxynitrides for Water-Splitting

Photocatalysts.” Energy Environ. Sci. 6.1 (2013): 157-68. Print.

10. Saal, James E., Anton O. Oliynyk, and Bryce Meredig. ”Machine-Learning in Materials Discovery:

Confirmed Predictions and Their Underlying Approaches.” Annual Review of Materials Research 50.1

(2020): 49-69. Print.

11. ”Machine Learning Materials Datasets”, (2018), NSF grant 1148011 and the Wisconsin Education

Innovation Committee

12. Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier

Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre

Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Édouard Duchesnay. 2011. Scikit-

learn: Machine Learning in Python. J. Mach. Learn. Res. 12, (2/1/2011), 2825–2830.

13. Breiman, L. Random Forests. Machine Learning 45, 5–32 (2001). https://doi.org/10.1023/A:1010933404324

14. Stefano Nembrini, Inke R König, Marvin N Wright, The revival of the Gini importance?, Bioinformatics, Volume 34, Issue 21, November 2018, Pages 3711–3718, https://doi.org/10.1093/bioinformatics/bty373

15. Paszke, Adam et al. “PyTorch: An Imperative Style, High-Performance Deep Learning Library.” ArXiv abs/1912.01703 (2019).

16. Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd International Conference on Machine Learning - Volume 37 (ICML’15). 448–456.

17. T. Jiang and J. Cheng, “Target Recognition Based on CNN with LeakyReLU and PReLU Activation Functions,” 2019 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Beijing, China, 2019, pp. 718-722, doi: 10.1109/SDPC.2019.00136.

18. Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1 (January 2014).

19. Kingma, Diederik P. and Jimmy Ba. “Adam: A Method for Stochastic Optimization.” CoRR abs/1412.6980 (2014).

20. Zhang, Guodong et al. “Three Mechanisms of Weight Decay Regularization.” ArXiv abs/1810.12281 (2018).

21. Loshchilov, Ilya and Frank Hutter. “SGDR: Stochastic Gradient Descent with Warm Restarts.” ArXiv abs/1608.03983 (2016).

22. Wang, Q., Ma, Y., Zhao, K. et al. A Comprehensive Survey of Loss Functions in Machine Learning. Ann. Data. Sci. 9, 187–212 (2022). https://doi.org/10.1007/s40745-020-00253-5

