Problem running otf_train.yaml (error message sparse_gp.py) #359

Open
johnemec opened this issue Jun 30, 2023 · 4 comments

@johnemec

Describe the bug
I ran the command flare-otf otf_train.yaml with a POSCAR file as the input structure and VASP as the DFT calculator, and got this error message:

File "/home/USER/.local/lib/python3.10/site-packages/flare/bffs/sgp/sparse_gp.py", line 335, in update_db
coded_species.append(self.species_map[spec])
KeyError: 14

To Reproduce
Steps to reproduce the behavior:

  1. otf_train.yaml

# Super cell is read from a file such as POSCAR, xyz, lammps-data
# or any format that ASE supports
supercell:
    file: POSCAR
    format: vasp
    replicate: [1, 1, 1] # supercell creation. Be mindful of DFT limitations and periodicity of your cell.
    jitter: 0.1 # perturb the initial atomic positions by 0.1 A, so initial atomic environments added to the sparse set are not the same

# Set up FLARE calculator with (sparse) Gaussian process
flare_calc:
    gp: SGP_Wrapper
    kernels:
        - name: NormalizedDotProduct # select kernel for comparison of atomic environments
          sigma: 2.0 # signal variance, this hyperparameter will be trained, and is typically between 1 and 10.
          power: 2 # power of the kernel, influences body-order
    descriptors:
        - name: B2 # Atomic Cluster Expansion (ACE) descriptor from R. Drautz (2019). FLARE can only go from B1 up to B3 currently.
          nmax: 8 # Radial fidelity of the descriptor (higher value = higher cost)
          lmax: 3 # Angular fidelity of the descriptor (higher value = higher cost)
          cutoff_function: quadratic # Cutoff behavior
          radial_basis: chebyshev # Formalism for the radial basis functions
          cutoff_matrix: [[5.0]] # In angstroms. NxN array for N_species in a system.
    energy_noise: 0.096 # Energy noise hyperparameter, will be trained later. Typically set to 1 meV * N_atoms.
    forces_noise: 0.05 # Force noise hyperparameter, will be trained later. System dependent, typically between 0.05 meV/A and 0.2 meV/A.
    stress_noise: 0.001 # Stress noise hyperparameter, will be trained later. Typically set to 0.001 meV/A^3.
    energy_training: True
    force_training: True
    stress_training: True
    species:
        - 13 # Atomic number of your species (here, 13 = Al).
    single_atom_energies:
        - 0 # Single atom energies to bias the energy prediction of the model. Can help in systems with poor initial energy estimations. Length must equal the number of species.
    cutoff: 5.0 # Cutoff for the (ACE) descriptor. Typically informed by the radial distribution function of the system. Should equal the maximum value in the cutoff_matrix.
    variance_type: local # Calculate atomic uncertainties.
    max_iterations: 20 # Maximum steps taken during each hyperparameter optimization call.
    use_mapping: True # Print mapped model (ready for use in LAMMPS) during trajectory. Model is re-mapped and replaced if new DFT calls are made throughout the trajectory.

# In the tutorial, we use ASE Lennard-Jones potential as ground truth
# instead of DFT to save time
dft_calc:
    name: Vasp
    kwargs:
        command: "mpirun vasp_std"
        # pseudo-potential
        xc: pbe
        # k points
        kpts: [4, 4, 4]
        # INCAR
        istart: 0
        npar: 8
        ediff: 1.0e-6
        encut: 500
        ismear: -5
        sigma: 0.2
        lreal: Auto
        prec: Accurate
        algo: Fast
        lscalapack: False
    params: {}

# Set up On-the-fly training and MD
otf: # On-the-fly training and MD
    mode: fresh # Start from an empty SGP
    md_engine: VelocityVerlet # Define MD engine, here we use the Velocity Verlet engine from ASE. LAMMPS examples can be found in the flare/examples directory in the repo
    md_kwargs: {} # Define MD kwargs
    initial_velocity: 1000 # Initialize the velocities
    dt: 0.001 # Set the time step in picoseconds (1 fs here)
    number_of_steps: 10 # Total number of MD steps to be taken
    output_name: Si_otf # Name of output
    init_atoms: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] # Initial atoms to be added to the sparse set
    std_tolerance_factor: -0.01 # The uncertainty threshold above which the DFT will be called
    max_atoms_added: -1 # Allow for all atoms in a given frame to be added to the sparse set if uncertainties permit
    train_hyps: [5,inf] # Define range in which hyperparameters will be optimized. Here, hyps are optimized at every DFT call after the 5th call.
    write_model: 4 # Verbosity of model output.
    update_style: threshold # Sparse set update style. Atoms above a defined "threshold" will be added using this method
    update_threshold: 0.001 # Threshold for adding atoms if "update_style = threshold". Threshold represents relative uncertainty to mean atomic uncertainty, where atoms above are added to sparse set
    force_only: False # Train on forces, stresses, and energies.

  2. POSCAR

Si
1.0000000000000000
5.4437023729394527 0.0000000000000000 0.0000000000000003
0.0000000000000009 5.4437023729394527 0.0000000000000003
0.0000000000000000 0.0000000000000000 5.4437023729394527
Si
8
Cartesian
4.1147590257602102 4.2452943034050890 1.3468254251531135
-0.0802625722806385 2.7244722198520734 2.8298858824873951
4.0422699347878837 1.3440468765613307 4.0661853445649534
0.0494231634513068 -0.0609311205725576 0.1084518318735196
1.3544414172112398 4.0951079607678640 4.0422043460588535
2.6399530632153270 2.7130357906305003 0.0432045106348295
1.4549697967372583 1.5227723112914378 1.3071579071337061
2.8919919707496526 -0.0405562349093180 2.8027838013455884

  3. FLARE version

git clone https://github.com/mir-group/flare.git (latest release 1.3.3)

@cjowen1
Collaborator

cjowen1 commented Jun 30, 2023

Hello,

The error you are seeing is due to a mismatch between the species listed in the flare_calc section of your yaml and the species in the structure you are reading. You need to modify the following (assuming your input file only contains Si):

# old
species:
    - 13 # Atomic number of your species (here, 13 = Al).

# new
species:
    - 14 # Atomic number of your species (here, 14 = Si).

- Cameron
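
To see why the mismatch surfaces as KeyError: 14, here is a minimal sketch in plain Python. It imitates the species_map lookup shown in the traceback rather than calling FLARE itself; the helpers build_species_map and missing_species are hypothetical names used only for illustration.

```python
# Illustrative sketch only; these helpers are hypothetical and not part of FLARE.

def build_species_map(species):
    """Map each configured atomic number to a 0-based index."""
    return {z: i for i, z in enumerate(species)}


def missing_species(species, atomic_numbers):
    """Return atomic numbers found in the structure but absent from the config."""
    species_map = build_species_map(species)
    return sorted(set(atomic_numbers) - set(species_map))


config_species = [13]          # species list from the yaml (Al)
structure_numbers = [14] * 8   # the POSCAR holds 8 Si atoms

species_map = build_species_map(config_species)
try:
    coded_species = [species_map[z] for z in structure_numbers]
except KeyError as err:
    print(f"KeyError: {err}")  # the lookup fails on the first Si atom

print(missing_species(config_species, structure_numbers))  # [14]
print(missing_species([14], structure_numbers))            # []
```

Running a check like this before launching the DFT job would flag the missing atomic number without waiting for the first sparse-set update.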

@johnemec
Author

johnemec commented Jul 3, 2023

Thank you, it worked! If I have a system with more than one species (for example Si=14 and O=8), what should my otf_train.yaml look like? I tried several ways, but always got an error message.

Thank you!

To Reproduce

  1. otf_train.yaml

# Super cell is read from a file such as POSCAR, xyz, lammps-data
# or any format that ASE supports
supercell:
    file: POSCAR
    format: vasp
    replicate: [1, 1, 1] # supercell creation. Be mindful of DFT limitations and periodicity of your cell.
    jitter: 0.1 # perturb the initial atomic positions by 0.1 A, so initial atomic environments added to the sparse set are not the same

# Set up FLARE calculator with (sparse) Gaussian process
flare_calc:
    gp: SGP_Wrapper
    kernels:
        - name: NormalizedDotProduct # select kernel for comparison of atomic environments
          sigma: 2.0 # signal variance, this hyperparameter will be trained, and is typically between 1 and 10.
          power: 2 # power of the kernel, influences body-order
    descriptors:
        - name: B2 # Atomic Cluster Expansion (ACE) descriptor from R. Drautz (2019). FLARE can only go from B1 up to B3 currently.
          nmax: 8 # Radial fidelity of the descriptor (higher value = higher cost)
          lmax: 3 # Angular fidelity of the descriptor (higher value = higher cost)
          cutoff_function: quadratic # Cutoff behavior
          radial_basis: chebyshev # Formalism for the radial basis functions
          cutoff_matrix: [[5.0]] # In angstroms. NxN array for N_species in a system.
    energy_noise: 0.096 # Energy noise hyperparameter, will be trained later. Typically set to 1 meV * N_atoms.
    forces_noise: 0.05 # Force noise hyperparameter, will be trained later. System dependent, typically between 0.05 meV/A and 0.2 meV/A.
    stress_noise: 0.001 # Stress noise hyperparameter, will be trained later. Typically set to 0.001 meV/A^3.
    energy_training: True
    force_training: True
    stress_training: True
    species:
        - [14, 8] # Atomic number of your species (here, 13 = Al).
    single_atom_energies:
        - 0 # Single atom energies to bias the energy prediction of the model. Can help in systems with poor initial energy estimations. Length must equal the number of species.
    cutoff: 5.0 # Cutoff for the (ACE) descriptor. Typically informed by the radial distribution function of the system. Should equal the maximum value in the cutoff_matrix.
    variance_type: local # Calculate atomic uncertainties.
    max_iterations: 20 # Maximum steps taken during each hyperparameter optimization call.
    use_mapping: True # Print mapped model (ready for use in LAMMPS) during trajectory. Model is re-mapped and replaced if new DFT calls are made throughout the trajectory.

# In the tutorial, we use ASE Lennard-Jones potential as ground truth
# instead of DFT to save time
dft_calc:
    name: Vasp
    kwargs:
        command: "mpirun vasp_std"
        # pseudo-potential
        xc: pbe
        # k points
        kpts: [5, 5, 4]
        # INCAR
        istart: 0
        npar: 8
        ediff: 1.0e-6
        encut: 800
        ismear: -5
        sigma: 0.2
        lreal: Auto
        prec: Accurate
        algo: Fast
        lscalapack: False
    params: {}

# Set up On-the-fly training and MD
otf: # On-the-fly training and MD
    mode: fresh # Start from an empty SGP
    md_engine: VelocityVerlet # Define MD engine, here we use the Velocity Verlet engine from ASE. LAMMPS examples can be found in the flare/examples directory in the repo
    md_kwargs: {} # Define MD kwargs
    initial_velocity: 1000 # Initialize the velocities
    dt: 0.001 # Set the time step in picoseconds (1 fs here)
    number_of_steps: 10 # Total number of MD steps to be taken
    output_name: Al_otf # Name of output
    init_atoms: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] # Initial atoms to be added to the sparse set
    std_tolerance_factor: -0.01 # The uncertainty threshold above which the DFT will be called
    max_atoms_added: -1 # Allow for all atoms in a given frame to be added to the sparse set if uncertainties permit
    train_hyps: [5,inf] # Define range in which hyperparameters will be optimized. Here, hyps are optimized at every DFT call after the 5th call.
    write_model: 4 # Verbosity of model output.
    update_style: threshold # Sparse set update style. Atoms above a defined "threshold" will be added using this method
    update_threshold: 0.001 # Threshold for adding atoms if "update_style = threshold". Threshold represents relative uncertainty to mean atomic uncertainty, where atoms above are added to sparse set
    force_only: False # Train on forces, stresses, and energies.

  2. Error message

File "/home/USER/.local/lib/python3.10/site-packages/flare/scripts/otf_train.py", line 285, in
species_map = {flare_config.get("species")[i]: i for i in range(n_species)}
TypeError: unhashable type: 'list'

  1. otf_train.yaml (2)
    same as above, except line:
    species:
    - 14
    - 8
  2. Error message

File "/home/USER/.local/lib/python3.10/site-packages/flare/scripts/otf_train.py", line 233, in get_sgp_calc
assert np.allclose(np.array(d["cutoff_matrix"]).shape, (n_species, n_species)),
AssertionError: cutoff_matrix needs to be of shape (n_species, n_species)

  1. otf_train.yaml (3)
    same as above, except line:
    species:
    - (14, 8)
  2. Error message

File "/home/USER/.local/lib/python3.10/site-packages/flare/bffs/sgp/sparse_gp.py", line 335, in update_db
coded_species.append(self.species_map[spec])

  1. otf_train.yaml (4)
    same as above, except line:
    species:
    - 14, 8
  2. Error message
    File "/home/USER/.local/lib/python3.10/site-packages/flare/bffs/sgp/sparse_gp.py", line 335, in update_db
    coded_species.append(self.species_map[spec])
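
The four attempts above differ mainly in how the species entry parses. The sketch below is a rough illustration in plain Python: the parsed values are written out by hand as what a YAML loader is expected to return for each variant (no YAML library is used), and the map is built with the same dict comprehension that appears in the first traceback. Taking n_species as the length of the species list is an assumption.

```python
# Values a YAML loader is expected to produce for each `species:` variant.
# They are written out by hand so the sketch needs no YAML library.
parsed_variants = {
    "- [14, 8]": [[14, 8]],     # one item that is itself a list
    "- 14 / - 8": [14, 8],      # two integer items
    "- (14, 8)": ["(14, 8)"],   # one plain string; YAML has no tuple syntax
    "- 14, 8": ["14, 8"],       # one plain string
}


def try_species_map(species):
    """Build the map like the dict comprehension in the traceback."""
    n_species = len(species)  # assumption: one species per list item
    try:
        return {species[i]: i for i in range(n_species)}
    except TypeError as err:
        return f"TypeError: {err}"


results = {label: try_species_map(value) for label, value in parsed_variants.items()}
for label, result in results.items():
    print(label, "->", result)
```

Only the two-item form yields a map with the integer keys 14 and 8. The nested list cannot be used as a dictionary key, and the two string forms produce a single string key, so looking up an integer atomic number later fails in update_db. The two-item form gets past this stage and then reaches the cutoff_matrix shape assertion instead.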

@YuuuXie
Collaborator

YuuuXie commented Jul 4, 2023

@johnemec If you have two species, then use

species:
- 14
- 8

In that case, cutoff_matrix: [[5.0]] is wrong. The cutoff_matrix should instead be a 2x2 matrix specifying the cutoffs for Si-Si, Si-O, O-Si, and O-O. If you want to use the same cutoff for all pairs, you can also just remove the cutoff_matrix argument.
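
As a rough illustration of the 2x2 layout, the following sketch builds a cutoff matrix from per-pair cutoffs in plain Python. The helper cutoff_matrix_from_pairs is hypothetical, the 5.0 values simply reuse the cutoff from the yaml above, and it assumes that rows and columns follow the order of the species list.

```python
# Hypothetical helper for illustration; not part of FLARE.

def cutoff_matrix_from_pairs(species, pair_cutoffs):
    """Build an n x n cutoff matrix from cutoffs given per unordered species pair."""
    n = len(species)
    matrix = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(species):
        for j, b in enumerate(species):
            # Accept either ordering of the pair, so Si-O and O-Si share one entry.
            key = (a, b) if (a, b) in pair_cutoffs else (b, a)
            matrix[i][j] = pair_cutoffs[key]
    return matrix


species = [14, 8]  # Si, O
pair_cutoffs = {
    (14, 14): 5.0,  # Si-Si
    (14, 8): 5.0,   # Si-O and O-Si
    (8, 8): 5.0,    # O-O
}

matrix = cutoff_matrix_from_pairs(species, pair_cutoffs)

# Same condition as the assertion in the traceback: shape must be (n_species, n_species).
n_species = len(species)
assert len(matrix) == n_species and all(len(row) == n_species for row in matrix)

print(f"cutoff_matrix: {matrix}")  # cutoff_matrix: [[5.0, 5.0], [5.0, 5.0]]
```

The printed line can be pasted into the descriptor block of the yaml in place of [[5.0]].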

@cjowen1
Collaborator

cjowen1 commented Jul 5, 2023

@johnemec, please also include the following, in addition to Yu's suggestions:

single_atom_energies: # total number of entries should match the number of elements considered
- 0
- 0
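
Putting the suggestions in this thread together, a small pre-flight check might look like the sketch below. It is plain Python operating on a dict shaped like the flare_calc section; check_flare_calc is a hypothetical helper, not a FLARE function, and it only covers the three conditions discussed here.

```python
# Hypothetical pre-flight check; not part of FLARE.

def check_flare_calc(flare_calc):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    species = flare_calc.get("species", [])
    n_species = len(species)

    if not all(isinstance(z, int) for z in species):
        problems.append("species must be a flat list of integer atomic numbers")

    energies = flare_calc.get("single_atom_energies")
    if energies is not None and len(energies) != n_species:
        problems.append(
            f"single_atom_energies has {len(energies)} entries, expected {n_species}"
        )

    for descriptor in flare_calc.get("descriptors", []):
        matrix = descriptor.get("cutoff_matrix")
        if matrix is None:
            continue  # omitted, as suggested in the thread for a uniform cutoff
        if len(matrix) != n_species or any(len(row) != n_species for row in matrix):
            problems.append(
                f"cutoff_matrix must have shape ({n_species}, {n_species})"
            )
    return problems


# Two-species config before the fixes suggested in the thread.
before = {
    "species": [14, 8],
    "single_atom_energies": [0],
    "descriptors": [{"name": "B2", "cutoff_matrix": [[5.0]]}],
}
# Same config after the fixes: two energies, cutoff_matrix removed.
after = {
    "species": [14, 8],
    "single_atom_energies": [0, 0],
    "descriptors": [{"name": "B2"}],
}

print(check_flare_calc(before))
print(check_flare_calc(after))  # []
```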
