General-disorder-prediction

At the most basic level all compounds can be divided into two categories: ordered and disordered, judging by the presence of the partial occupancies in CIF. Here we try to find the best machine learning model which would predict disorder from the composition.

The model is trained on the ICSD data measured at room temperature and atmosperic preassure. It is important to note that the fraction of compounds in the ICSD which have both ordered and disordered version for the same composition is just 1%. So, predicting disorder from composion is a better defined task when one would think knowing about existence of phase transitions and overall temperature dependence of disorder effects.

Python environment file is attached (sorry, I probably should have created specific environment for this project, but I did not)

The code is implemented in pytorch lightning, logging is to wandb (so do not forget to specify your key)

Trained models can be found on my group's external hard drive (beware formated for MAC, does not work in Windows)

List of models:

Random Forrest + Magpie Balanced Accuracy = 0.87, Recall = 0.9, Precision = 0.9, ROC AUC = 0.94, MCC (Matthews correlation coefficient) = 0.74

3-NN + scaled Magpie Balanced Accuracy = 0.79, Recall = 0.83, Precision = 0.83, ROC AUC = 0.86, MCC (Matthews correlation coefficient) = 0.58

Roost + Matscholar Balanced Accuracy = 0.84, Recall = 0.83, Precision = 0.89, ROC AUC = 0.91, MCC (Matthews correlation coefficient) = 0.67

CrabNet + Mat2vec Balanced Accuracy = 0.90, Recall = 0.88, Precision = 0.94, ROC AUC = 0.95, MCC (Matthews correlation coefficient) = 0.79

Ensemble-10Roost + Matscholar Balanced Accuracy = 0.86, Recall = 0.84, Precision = 0.92, ROC AUC = 0.93, MCC (Matthews correlation coefficient) = 0.71

Ensemble-10CrabNet + Matscholar Balanced Accuracy = 0.91, Recall = 0.89, Precision = 0.95, ROC AUC = 0.97, MCC (Matthews correlation coefficient) = 0.82

Blending (Logistic regression on output of all classifiers) Balanced Accuracy = 0.90, Recall = 0.91, Precision = 0.93, ROC AUC = 0.96, MCC (Matthews correlation coefficient) = 0.80

Transfer model: CrabNet trained on formation energy transfered to predict disorder

No gains compared to simple CrabNet

Multi-property model: CrabNet encoder with two projection heads predicting disorder and formation energy.

Trained on disorder data + MP (entry can have only one of those two values to be included in teh dataset). No gains compared to simple CrabNet

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
crabnet		crabnet
data		data
disorder		disorder
multi-property model		multi-property model
roost		roost
CrabNet-global-disorder.ipynb		CrabNet-global-disorder.ipynb
CrabNet-global-disorder.py		CrabNet-global-disorder.py
Data_preparation.ipynb		Data_preparation.ipynb
Dummy-models.ipynb		Dummy-models.ipynb
README.md		README.md
RF-global-disorder.ipynb		RF-global-disorder.ipynb
Roost-global-disorder-inference-and-ensembling-both-Roost-and-CrabNet-and-Blending.ipynb		Roost-global-disorder-inference-and-ensembling-both-Roost-and-CrabNet-and-Blending.ipynb
Roost-global-disorder-transfer.py		Roost-global-disorder-transfer.py
Roost-global-disorder.ipynb		Roost-global-disorder.ipynb
Roost-global-disorder.py		Roost-global-disorder.py
Roost-transfer-learning.ipynb		Roost-transfer-learning.ipynb
disorder_prediction.yml		disorder_prediction.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

General-disorder-prediction

About

Releases

Packages

Languages

lrcfmd/General-disorder-prediction

Folders and files

Latest commit

History

Repository files navigation

General-disorder-prediction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages