How to choose the chemical potential range for the MC simulations #355

Open
darjaved opened this issue Feb 14, 2024 · 9 comments

Comments

@darjaved

darjaved commented Feb 14, 2024

I have my casm learn input file with some compositions. How do I choose the chemical potential range so that it includes all of these compositions? Is it necessary for the range to include all my compositions exactly? Why does my results.json file contain compositions like:

"<atom_frac(Sn)>" : [ 0.254463252315, 0.333333333333, 0.333333333333, 0.333333333333, 0.488210556082, 0.520053810807, 0.666663652585, 0.666664382310, 0.666664358008, 0.666665905214, 0.666666666667 ]
Some of these compositions are very close to each other. Shouldn't the compositions match only those in my casm learn input file?

Also, in my casm learn input some configurations have zero weight (my choice). Should I remove them completely? Will they be part of the MC simulations?

{
"comment" : "Built from example",
"debug" : false,
"ensemble" : "grand_canonical",
"method" : "metropolis",
"model" : {
"formation_energy" : "formation_energy"
},
"supercell" : [
[24, 0, 0],
[0, 24, 0],
[0, 0, 24]
],
"data" : {
"sample_by" : "pass",
"sample_period" : 1,
"min_pass" : 100,
"max_pass" : 100,
"confidence" : 0.95,
"measurements" : [
{
"quantity" : "formation_energy",
"precision" : 1e-3
},
{
"quantity" : "potential_energy",
"precision" : 1e-3
},
{
"quantity" : "clex_hull_dist(casm_learn_input,comp)",
"precision" : 1e-3
},
{
"quantity" : "atom_frac"
},
{
"quantity" : "site_frac"
},
{
"quantity" : "comp",
"precision" : 1e-3
},
{
"quantity" : "comp_n"
}
],
"storage" : {
"write_observations" : false,
"write_trajectory" : false,
"output_format" : ["csv", "json"]
}
},
"driver" : {
"dependent_runs": false,
"mode" : "incremental",
"motif" : {
"configname" : "auto"
},
"initial_conditions" : {
"param_chem_pot" : {
"a" : -0.6,
"b" : 0
},
"temperature" : 5,
"tolerance" : 0.001
},
"final_conditions" : {
"param_chem_pot" : {
"a" : 0.7,
"b" : 0
},
"temperature" : 5,
"tolerance" : 0.001
},
"incremental_conditions" : {
"param_chem_pot" : {
"a" : 0.1,
"b" : 0
},
"temperature" : 0,
"tolerance" : 0.001
}
}
}

@darjaved
Author

Also, during the weight optimisation for obtaining the best fit with LASSO, what can the maximum value of the weight be? Are there any guidelines?

@darjaved
Author

{
"<atom_frac(Na)>" : [ 1.000000000000, 0.750016693376, 0.744715418544, 0.666666666667, 0.666666666667, 0.666666666667, 0.508919461196, 0.477095170455, 0.333335617690, 0.333334094786, 0.333340115017, 0.333334094786, 0.333332571881, 0.000000000000, 0.000000000000 ],
"<atom_frac(Sn)>" : [ 0.000000000000, 0.249983306624, 0.255284581456, 0.333333333333, 0.333333333333, 0.333333333333, 0.491080538804, 0.522904829545, 0.666664382310, 0.666665905214, 0.666659884983, 0.666665905214, 0.666667428119, 1.000000000000, 1.000000000000 ],
"<clex_hull_dist(casm_learn_input,comp)>" : [ 0.000000000000, -0.178039938030, -0.121295321820, 0.000000000000, 0.000000000000, 0.000000000000, -0.042914768225, -0.038028003429, -0.065069651294, -0.065070827050, -0.065066179139, -0.065069138361, -0.065069469773, 0.000000000000, 0.000000000000 ],
"<comp(a)>" : [ 0.000000000000, 0.249983306624, 0.255284581456, 0.333333333333, 0.333333333333, 0.333333333333, 0.491080538804, 0.522904829545, 0.666664382310, 0.666665905214, 0.666659884983, 0.666665905214, 0.666667428119, 1.000000000000, 1.000000000000 ],
"<comp_n(Na)>" : [ 1.000000000000, 0.750016693376, 0.744715418543, 0.666666666667, 0.666666666667, 0.666666666667, 0.508919461196, 0.477095170455, 0.333335617690, 0.333334094786, 0.333340115017, 0.333334094786, 0.333332571881, 0.000000000000, 0.000000000000 ],
"<comp_n(Sn)>" : [ 0.000000000000, 0.249983306624, 0.255284581456, 0.333333333333, 0.333333333333, 0.333333333333, 0.491080538804, 0.522904829545, 0.666664382310, 0.666665905214, 0.666659884983, 0.666665905214, 0.666667428119, 1.000000000000, 1.000000000000 ],
"<formation_energy>" : [ -0.001507260663, -0.330447594354, -0.276425692132, -0.195193703029, -0.195193703029, -0.195193703029, -0.262256773203, -0.261613218586, -0.252203646786, -0.252204017631, -0.252202551636, -0.252202328942, -0.252201855442, -0.000409821000, -0.000409821000 ],
"<potential_energy>" : [ -0.001507260663, -0.180457610380, -0.148783401404, -0.061860369696, -0.095193703029, -0.128527036363, -0.213148719323, -0.261613218586, -0.318870085017, -0.385537198673, -0.452200517131, -0.518868691028, -0.585535569501, -0.600409821000, -0.700409821000 ],
"<site_frac(Na)>" : [ 1.000000000000, 0.750016693376, 0.744715418543, 0.666666666667, 0.666666666667, 0.666666666667, 0.508919461196, 0.477095170455, 0.333335617690, 0.333334094786, 0.333340115017, 0.333334094786, 0.333332571881, 0.000000000000, 0.000000000000 ],

Could you please check these and tell me whether there is any problem?

@darjaved
Author

"is_converged" : [ true, true, true, true, true, true, true, true, true, true, true, true, true ],
"is_equilibrated" : [ true, true, true, true, true, true, true, true, true, true, true, true, true ],
"param_chem_pot(a)" : [ -0.600000000000, -0.500000000000, -0.400000000000, -0.300000000000, -0.200000000000, -0.100000000000, -0.000000000000, 0.100000000000, 0.200000000000, 0.300000000000, 0.400000000000, 0.500000000000, 0.600000000000 ],
"prec(<atom_frac(Na)>)" : [ 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<atom_frac(Sn)>)" : [ 0.000000000000, 0.000057317051, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<clex_hull_dist(casm_learn_input,comp)>)" : [ 0.000072094396, 0.000052275365, 0.000000000000, 0.000000000000, 0.000000000000, 0.000047261045, 0.000025094315, 0.000016087867, 0.000000000000, 0.000000000000, 0.000000000000, 0.000016433246, 0.000000000000 ],
"prec(<comp(a)>)" : [ 0.000000000000, 0.000057317051, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<comp_n(Na)>)" : [ 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<comp_n(Sn)>)" : [ 0.000000000000, 0.000057317051, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<formation_energy>)" : [ 0.000000000000, 0.000082999424, 0.000000000000, 0.000000000000, 0.000000000000, 0.000069281800, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<potential_energy>)" : [ 0.000076637017, 0.000052670934, 0.000000000000, 0.000000000000, 0.000000000000, 0.000052108938, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],
"prec(<site_frac(Na)>)" : [ 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000158634892, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000, 0.000000000000 ],

Could you please explain what these tags actually mean, for example prec(<clex_hull_dist(casm_learn_input,comp)>)?

@darjaved
Author

SCEL8_8_1_1_0_2_7/24 0.5 -0.463914 -0.392215 True 40.0 71.69897
SCEL8_4_2_1_1_0_1/32 0.5 -0.464785 -0.424896 True 24.0 39.88902
SCEL8_8_1_1_0_3_6/12 0.5 -0.467476 -0.407082 True 40.0 60.39397
SCEL8_2_4_1_1_0_0/36 0.5 -0.468075 -0.430692 True 20.0 37.38247
SCEL8_8_1_1_0_5_1/54 0.5 -0.469709 -0.437471 True 22.0 32.23754
SCEL8_8_1_1_0_5_1/52 0.5 -0.476909 -0.436568 True 50.0 40.34080
SCEL8_4_2_1_1_0_1/36 0.5 -0.485797 -0.428869 True 100.0 56.92849
min E_DFT -0.48579745 at SCEL8_4_2_1_1_0_1/36 weight 100.0
min E_CX -0.44363867 at SCEL8_4_2_1_1_0_1/28 weight 40.0
max E_DFT -0.3061359 at SCEL8_2_2_2_0_0_0/35 weight 20.0
max E_CX -0.34110944 at SCEL8_2_2_2_0_1_1/12 weight 24.0

Even after using high weights, I am not getting the correct ground state. Is there any solution?

@darjaved
Author

@xivh Could you please guide me here?

@xivh
Contributor

xivh commented Feb 28, 2024

How do I choose the chemical potential range so that it includes all of these compositions? Is it necessary for the range to include all my compositions exactly?

Are you asking about the fitting or the Monte Carlo simulations?

Also, in my casm learn input some configurations have zero weight (my choice). Should I remove them completely? Will they be part of the MC simulations?

You can ignore configurations in the fitting, but you should then make sure that they are not predicted as ground states. You can't exclude configurations from the Monte Carlo simulation, but they will be sampled infrequently or never if they are high in energy.
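
One way to run the check described above is to build the lower convex hull of the cluster-expansion-predicted energies and see whether any zero-weight configuration ends up on it. The sketch below assumes a single parametric composition axis; the configuration names, energies, and weights are invented placeholders, and `predicted_ground_states` and `excluded_ground_states` are hypothetical helpers, not part of CASM.

```python
def predicted_ground_states(configs):
    """configs: list of (name, comp, predicted_energy) tuples.

    Returns the names of the configurations on the lower convex hull of
    predicted energy vs. composition, ordered by composition.
    """
    # Keep only the lowest-energy configuration at each composition.
    best = {}
    for name, x, e in configs:
        if x not in best or e < best[x][1]:
            best[x] = (x, e, name)

    # Andrew's monotone chain, lower hull only.
    hull = []
    for p in sorted(best.values()):
        while len(hull) >= 2:
            (x1, e1, _), (x2, e2, _) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()  # last vertex lies on or above the new segment
            else:
                break
        hull.append(p)
    return [name for _, _, name in hull]


def excluded_ground_states(configs, weights):
    """Names of predicted ground states whose fitting weight is zero.

    A configuration missing from `weights` is treated as zero weight.
    """
    return [n for n in predicted_ground_states(configs)
            if weights.get(n, 0.0) == 0.0]


# Placeholder data (assumption): cfg_2 was given zero weight in the fit,
# yet the fitted model predicts it below cfg_1 at the same composition.
configs = [
    ("pure_A", 0.0, 0.00),
    ("cfg_1", 0.5, -0.40),
    ("cfg_2", 0.5, -0.45),
    ("pure_B", 1.0, 0.00),
]
weights = {"pure_A": 1.0, "cfg_1": 10.0, "cfg_2": 0.0, "pure_B": 1.0}

print(excluded_ground_states(configs, weights))
```

If the printed list is not empty, the fit predicts a ground state that it was never asked to reproduce, which is the situation to avoid.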

Also, during the weight optimisation for obtaining the best fit with LASSO, what can the maximum value of the weight be? Are there any guidelines?

I usually do cross-validation (CV) over a range of something like 1e-5 to 1e-1. What matters more is that the ECI look good (not overfitting). If the $\lambda$ you get from CV equals the largest value that you tried, then you should increase the range.
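
The range check described above can be written down in a few lines. This is only a sketch: the CV scores are invented placeholders that you would replace with the scores from your own fitting tool, and `log_grid` and `pick_lambda` are hypothetical helper names, not casm learn functions.

```python
import math


def log_grid(lo, hi, n):
    """Return n values evenly spaced in log10 between lo and hi, inclusive."""
    start, stop = math.log10(lo), math.log10(hi)
    step = (stop - start) / (n - 1)
    return [10 ** (start + i * step) for i in range(n)]


def pick_lambda(lambdas, cv_scores):
    """Return (best_lambda, at_max).

    best_lambda has the lowest CV score; at_max is True when it is the
    largest lambda tried, meaning the search range should be extended upward.
    """
    best_i = min(range(len(lambdas)), key=lambda i: cv_scores[i])
    best = lambdas[best_i]
    return best, best == max(lambdas)


# Search range taken from the comment above: 1e-5 to 1e-1.
lambdas = log_grid(1e-5, 1e-1, 5)

# Placeholder CV scores (assumption), one per lambda; lower is better.
cv_scores = [0.020, 0.015, 0.012, 0.011, 0.010]

best, at_max = pick_lambda(lambdas, cv_scores)
if at_max:
    print("best lambda is at the top of the range; extend the range")
else:
    print("best lambda:", best)
```

The same comparison could be made against the smallest value tried, but the comment above only calls out the upper end.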

Could you please explain what these tags actually mean, for example prec(<clex_hull_dist(casm_learn_input,comp)>)?

Maybe this issue will help you? #67

Even after using high weights, I am not getting the correct ground state. Is there any solution?

I have had success augmenting my data with these hull distance correlations:
https://github.com/Van-der-Ven-Group/thermocore/blob/53daacf16e7fe36a62d0d47f7c4f0cc571696f5d/thermocore/geometry/hull.py#L310

You will have to fit outside of casm learn, though.

@darjaved darjaved reopened this Feb 29, 2024
@darjaved
Author

Regarding the chemical potential range: I am asking about the Monte Carlo simulations.

@xivh
Contributor

xivh commented Mar 1, 2024

If you plot the formation energy per primitive cell against the parametric composition axis, the maximum and minimum slopes are good starting points for your chemical potential bounds. If you are integrating across chemical potential at fixed temperature, you will want to start from a chemical potential that is large enough in magnitude that you have a pure compound at your starting point. Here is a reference about the Monte Carlo in CASM:

https://arxiv.org/abs/2309.11761
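
As a rough illustration of the slope argument above, the sketch below builds the lower convex hull of formation energy versus parametric composition and reports the slope of each hull segment. The (composition, energy) pairs are invented placeholder values, not data from this issue, and `lower_hull`, `hull_slopes`, `suggest_range`, and the `margin` parameter are hypothetical, not CASM functions or settings. How a slope maps onto `param_chem_pot(a)` depends on your project's sign and unit conventions, so treat the result only as a starting point.

```python
def lower_hull(points):
    """Lower convex hull of (comp, energy) points, ordered by composition."""
    # Keep only the lowest energy at each composition.
    best = {}
    for x, e in points:
        best[x] = min(e, best.get(x, e))

    # Andrew's monotone chain, lower hull only.
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()  # last vertex lies on or above the new segment
            else:
                break
        hull.append(p)
    return hull


def hull_slopes(points):
    """Slope dE/dx of each segment of the lower hull, left to right."""
    hull = lower_hull(points)
    return [(e2 - e1) / (x2 - x1) for (x1, e1), (x2, e2) in zip(hull, hull[1:])]


def suggest_range(points, margin=0.1):
    """Widen [min slope, max slope] by `margin` on each side (an arbitrary
    padding chosen for this sketch) so the end points sit in pure phases."""
    slopes = hull_slopes(points)
    return min(slopes) - margin, max(slopes) + margin


# Placeholder formation energies per primitive cell (assumption).
points = [(0.0, 0.0), (0.25, -0.3), (0.5, -0.4), (0.75, -0.1), (1.0, 0.0)]

print("hull vertices:", lower_hull(points))
print("segment slopes:", hull_slopes(points))
print("suggested chemical potential range:", suggest_range(points))
```

The point at composition 0.75 lies above the hull in this example, so it does not contribute a slope; only hull segments set the bounds.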
