Bug: Python API cannot save trained model (Segmentation fault, list index out of range) #386

Open
m-a-saleh opened this issue Dec 6, 2023 · 8 comments

Comments

@m-a-saleh

m-a-saleh commented Dec 6, 2023

Describe the bug
After training an SGP model using Python, I am not able to save the model. Calling sgp_calc.write_model('my_model.json') generates an empty "my_model.json" file and gives the following error:

Traceback (most recent call last):
  File "train_validate.py", line 289, in <module>
    sgp_calc.write_model('my_model.json') #save the the trained model
  File "/home/saleh/miniconda3/envs/flare/lib/python3.8/site-packages/flare/bffs/sgp/calculator.py", line 158, in write_model
    json.dump(self.as_dict(), f, cls=NumpyEncoder)
  File "/home/saleh/miniconda3/envs/flare/lib/python3.8/site-packages/flare/bffs/sgp/calculator.py", line 139, in as_dict
    out_dict["gp_model"] = self.gp_model.as_dict()
  File "/home/saleh/miniconda3/envs/flare/lib/python3.8/site-packages/flare/bffs/sgp/sparse_gp.py", line 205, in as_dict
    train_struc.info["rel_efs_noise"] = np.array(self.rel_efs_noise[s])
IndexError: list index out of range

The bug is in "sparse_gp.py": the lists self.atom_indices and self.rel_efs_noise are initialized as empty lists but are never populated, so they stay empty.
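The failure mode can be illustrated without FLARE installed. The sketch below uses a toy stand-in class (ToyWrapper, add_structure_directly and the dictionary layout are hypothetical, not FLARE's API) that indexes rel_efs_noise the same way as the failing line in the traceback. It also shows one plausible reason the output file ends up empty: the file is opened for writing before as_dict() is evaluated, so the exception leaves a zero-byte file behind.

```python
import json
import os
import tempfile


class ToyWrapper:
    """Toy stand-in for the sparse GP wrapper; not the FLARE class."""

    def __init__(self):
        self.training_structures = []
        self.rel_efs_noise = []  # starts empty, as described in the report

    def add_structure_directly(self, struc):
        # Adds a structure without recording a matching noise entry.
        self.training_structures.append(struc)

    def as_dict(self):
        entries = []
        for s, struc in enumerate(self.training_structures):
            # Same indexing pattern as the failing line in the traceback.
            entries.append({"struc": struc, "rel_efs_noise": self.rel_efs_noise[s]})
        return {"training_structures": entries}

    def write_model(self, path):
        # The file is opened (and truncated) before as_dict() is evaluated.
        with open(path, "w") as f:
            json.dump(self.as_dict(), f)


wrapper = ToyWrapper()
wrapper.add_structure_directly("frame_0")

path = os.path.join(tempfile.mkdtemp(), "my_model.json")
try:
    wrapper.write_model(path)
    error_message = None
except IndexError as err:
    error_message = str(err)

file_size = os.path.getsize(path)
print(error_message, file_size)
```

In this toy version the structure list has one entry and the noise list has none, so the first lookup fails, which matches the symptom reported above.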

In addition, calling sgp_calc.build_map("lmp.flare", "my_name") generates an empty "lmp.flare" file and gives the following error:

Segmentation fault (core dumped)

To Reproduce
In the example given here, just add sgp_calc.write_model('my_model.json') and sgp_calc.build_map("lmp.flare", "my_name") to the end of the script.

Expected behavior
Write "my_model.json" and "lmp.flare" model files.

Desktop (please complete the following information):

  • OS: [Ubuntu 20 LTS]
  • flare version: 1.3.3 and development
@m-a-saleh m-a-saleh changed the title Python API cannot save trained model Bug: Python API cannot save trained model (Segmentation fault, list index out of range) Dec 30, 2023
@YuuuXie
Collaborator

YuuuXie commented Jan 5, 2024

@m-a-saleh can you try manually adding rel_efs_noise to sparse gp to get around this error?

@m-a-saleh
Author

m-a-saleh commented Jan 5, 2024

@m-a-saleh can you try manually adding rel_efs_noise to sparse gp to get around this error?

@YuuuXie I managed to get a working .json model file with the following steps:
1- Edit flare/bffs/sgp/sparse_gp.py by commenting out the line train_struc.info["rel_efs_noise"] = np.array(self.rel_efs_noise[s]) in the as_dict() function. This generated a .json file, but without rel_efs_noise and atom_indices.
2- Edit the .json file by adding rel_efs_noise and atom_indices.

The previous steps worked only for the combination (force_training=True, energy_training=False, stress_training=False). They did not work with energy_training=True.

I am still not able to generate the lmp.flare file; build_map still raises a segmentation fault.

@YuuuXie
Collaborator

YuuuXie commented Mar 30, 2024

Sorry for the late reply. Can you try an earlier version of flare, such as 1.3.0?

@rbjiawen

Hi, I get the same error when using sparse_gp.write_model() with the latest version of flare (1.3.3): IndexError: list index out of range.

@jonpvandermause
Collaborator

Hi @rbjiawen,

Are you trying the same experiment as @m-a-saleh, that is, calling sgp_calc.write_model after building a sparse GP calculator with custom descriptors as in the tutorial? The issue is that the write_model method as currently written assumes that the training structures were added to the sparse GP wrapper along with associated "relative" energy, force, and stress noises, but in the tutorial, these noise values aren't inserted.

So one way to proceed is to add the last line in this block:

# add structure to the training set
sparse_gp.sparse_gp.add_training_structure(struc_pp, [-1])
sparse_gp.rel_efs_noise.append([1, 1, 1])

Can you give this a try and let me know if it works?
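If several structures have already been added without noise entries, the same workaround can be applied in bulk. The following is a minimal sketch, assuming only that rel_efs_noise is a plain Python list with one [energy, force, stress] entry per training structure, as the snippet above suggests. The helper pad_rel_efs_noise is hypothetical and not part of FLARE, and the caller has to supply the number of training structures.

```python
def pad_rel_efs_noise(rel_efs_noise, n_structures, default=(1, 1, 1)):
    """Return a copy of rel_efs_noise with one entry per training structure.

    Missing entries are filled with `default`, mirroring the manual
    append([1, 1, 1]) workaround. Hypothetical helper, not part of FLARE.
    """
    if len(rel_efs_noise) > n_structures:
        raise ValueError("more noise entries than training structures")
    # Copy existing entries so the caller's list is not modified in place.
    padded = [list(entry) for entry in rel_efs_noise]
    # Append one default entry for every structure that has none yet.
    padded.extend(list(default) for _ in range(n_structures - len(padded)))
    return padded


padded = pad_rel_efs_noise([], n_structures=3)
print(padded)
```

The result could then be assigned back to sparse_gp.rel_efs_noise before calling write_model. This thread only confirms the rel_efs_noise part of the workaround; whether atom_indices needs similar treatment is not established here.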

@rbjiawen

Yes, I followed the example of customizing the descriptor and tried calling sgp_calc.write_model("mymodel.json"). I switched to version 1.3.0 and do not get this error. I will try adding rel_efs_noise with version 1.3.3, thanks!

@rbjiawen

flare version 1.3.3 works after adding rel_efs_noise!

@jonpvandermause
Collaborator

Glad that works! Please note, however, that the write_model method, along with the from_file and underlying as_dict and from_dict methods of the sparse GP wrapper class, are only designed to work with B2 descriptors, so if you do design custom descriptors that differ from B2, these serialization methods won't work as expected. I'll plan to add a warning to the tutorial to this effect.
