Add stress support, update precision defaults, update README (#52)
Linux-cpp-lisp authored Jun 5, 2024
2 parents b3680ef + f066ab8 commit 3dda11b
Showing 6 changed files with 89 additions and 103 deletions.
14 changes: 3 additions & 11 deletions .github/workflows/tests.yml
@@ -1,14 +1,6 @@
name: Run LAMMPS-Python tests

on:
push:
branches:
- main
- develop

pull_request:
branches:
- main
on: [push, pull_request]

jobs:
build:
@@ -17,7 +9,7 @@ jobs:
strategy:
matrix:
python-version: [3.9]
torch-version: [1.10.1, 1.11.0]
torch-version: [1.11.0]
nequip-branch: ["main"]

steps:
@@ -46,7 +38,7 @@ jobs:
run: |
mkdir lammps_dir/
cd lammps_dir/
git clone -b stable_29Sep2021_update2 --depth 1 "https://github.com/lammps/lammps"
git clone --depth 1 "https://github.com/lammps/lammps"
cd ..
./patch_lammps.sh lammps_dir/lammps/
cd lammps_dir/lammps/
18 changes: 10 additions & 8 deletions README.md
@@ -8,7 +8,7 @@ This pair style allows you to use NequIP models from the [`nequip`](https://gith

## Pre-requisites

* PyTorch or LibTorch >= 1.10.0
* PyTorch or LibTorch >= 1.11.0; please note that at present we have only thoroughly tested 1.11 on NVIDIA GPUs (see [#311 for NequIP](https://github.com/mir-group/nequip/discussions/311#discussioncomment-5129513)) and 1.13 on AMD GPUs, but newer 2.x versions *may* also work. With newer versions, setting the environment variable `PYTORCH_JIT_USE_NNC_NOT_NVFUSER=1` sometimes helps.

## Usage in LAMMPS

@@ -18,21 +18,21 @@ pair_coeff * * deployed.pth <type name 1> <type name 2> ...
```
where `deployed.pth` is the filename of your trained, **deployed** model.

The names after the model path `deployed.pth` indicate, in order, the names of the NequIP model's atom types to use for LAMMPS atom types 1, 2, and so on. The number of names given must be equal to the number of atom types in the LAMMPS configuration (not the NequIP model!).
The given names must be consistent with the names specified in the NequIP training YAML in `chemical_symbol_to_type` or `type_names`.
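
As an illustrative sketch only, the same commands can also be issued through the LAMMPS Python module; the data file, model file, and the type names `C`/`H` below are placeholders, not from this repository:

```python
# Hypothetical sketch: drive a patched LAMMPS build through its Python module.
# "my_system.data", "deployed.pth", and the type names C/H are placeholders.
from lammps import lammps

lmp = lammps()
lmp.commands_string(
    """
    units metal
    atom_style atomic
    read_data my_system.data          # hypothetical data file with 2 atom types
    pair_style nequip
    pair_coeff * * deployed.pth C H   # LAMMPS type 1 -> C, type 2 -> H
    run 0
    """
)
```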

## Building LAMMPS with this pair style

### Download LAMMPS
```bash
git clone -b stable_29Sep2021_update2 --depth 1 git@github.com:lammps/lammps
git clone --depth=1 https://github.com/lammps/lammps
```
or your preferred method.
(`--depth 1` prevents the entire history of the LAMMPS repository from being downloaded.)
(`--depth=1` prevents the entire history of the LAMMPS repository from being downloaded.)

### Download this repository
```bash
git clone git@github.com:mir-group/pair_nequip
git clone https://github.com/mir-group/pair_nequip
```

### Patch LAMMPS
@@ -49,7 +49,6 @@ cp /path/to/pair_nequip/*.cpp /path/to/lammps/src/
cp /path/to/pair_nequip/*.h /path/to/lammps/src/
```
Then make the following modifications to `lammps/cmake/CMakeLists.txt`:
- Change `set(CMAKE_CXX_STANDARD 11)` to `set(CMAKE_CXX_STANDARD 14)`
- Append the following lines:
```cmake
find_package(Torch REQUIRED)
@@ -106,6 +105,9 @@ This gives `lammps/build/lmp`, which can be run as usual with `/path/to/lmp -in
```

A: Make sure you remembered to deploy (compile) your model using `nequip-deploy`, and that the path to the model given with `pair_coeff` points to a deployed model `.pth` file, **not** a file containing only weights like `best_model.pth`.
3. Q: The output pressures and stresses seem wrong / my NPT simulation is broken
3. Q: I get the following error:
```
Exception: Argument passed to at() was not in the map
```
A: NPT/stress support in LAMMPS for `pair_nequip` is in-progress on the `stress` branch and is not yet finished.
A: We now require models to have been trained with stress support, which is achieved by replacing `ForceOutput` with `StressForceOutput` in the training configuration. Note that you do not need to train on stress (though it may improve your potential, assuming your stress data is correct and converged). If you desperately wish to keep using a model without stress output, there are two options: 1) Remove lines that look like [these](https://github.com/mir-group/pair_allegro/blob/99036043e74376ac52993b5323f193dee3f4f401/pair_allegro_kokkos.cpp#L332-L343) in your version of `pair_allegro[_kokkos].cpp` 2) Redeploy the model with an updated config file, as described [here](https://github.com/mir-group/nequip/issues/69#issuecomment-1129273665).
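
If you are unsure how a given model was deployed, one way to inspect it is to read back the metadata stored in the TorchScript archive. This is only a sketch using PyTorch's `_extra_files` mechanism: beyond `_jit_fusion_strategy`, which appears in the C++ source below, any key names you try are assumptions, and depending on the PyTorch version an absent key may come back empty or cause the load to fail.

```python
# Sketch: read back metadata stored in a deployed NequIP TorchScript archive.
# "_jit_fusion_strategy" is taken from the C++ below; other keys are guesses.
import torch

metadata = {"_jit_fusion_strategy": ""}  # values are filled in by the load
torch.jit.load("deployed.pth", map_location="cpu", _extra_files=metadata)
print(metadata)
```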
120 changes: 39 additions & 81 deletions pair_nequip.cpp
@@ -38,28 +38,12 @@
#include <torch/torch.h>
#include <torch/script.h>
#include <torch/csrc/jit/runtime/graph_executor.h>
//#include <c10/cuda/CUDACachingAllocator.h>


// We have to do a backward compatibility hack for <1.10
// https://discuss.pytorch.org/t/how-to-check-libtorch-version/77709/4
// Basically, the check in torch::jit::freeze
// (see https://github.com/pytorch/pytorch/blob/dfbd030854359207cb3040b864614affeace11ce/torch/csrc/jit/api/module.cpp#L479)
// is wrong, and we have to "reimplement" the function
// to get around that...
// it's broken in 1.8 and 1.9
// BUT the internal logic in the function is wrong in 1.10
// So we only use torch::jit::freeze in >=1.11

// Freezing is broken from C++ in <=1.10, so we've dropped support.
#if (TORCH_VERSION_MAJOR == 1 && TORCH_VERSION_MINOR <= 10)
#define DO_TORCH_FREEZE_HACK
// For the hack, need more headers:
#include <torch/csrc/jit/passes/freeze_module.h>
#include <torch/csrc/jit/passes/frozen_conv_add_relu_fusion.h>
#include <torch/csrc/jit/passes/frozen_graph_optimizations.h>
#include <torch/csrc/jit/passes/frozen_ops_to_mkldnn.h>
#error "PyTorch version < 1.11 is not supported"
#endif


using namespace LAMMPS_NS;

PairNEQUIP::PairNEQUIP(LAMMPS *lmp) : Pair(lmp) {
@@ -92,13 +76,7 @@ void PairNEQUIP::init_style(){
if (atom->tag_enable == 0)
error->all(FLERR,"Pair style NEQUIP requires atom IDs");

// need a full neighbor list
int irequest = neighbor->request(this,instance_me);
neighbor->requests[irequest]->half = 0;
neighbor->requests[irequest]->full = 1;

// TODO: probably also
neighbor->requests[irequest]->ghost = 0;
neighbor->add_request(this, NeighConst::REQ_FULL);

// TODO: I think Newton should be off, enforce this.
// The network should just directly compute the total forces
@@ -125,7 +103,7 @@ void PairNEQUIP::allocate()
}

void PairNEQUIP::settings(int narg, char ** /*arg*/) {
// "flare" should be the only word after "pair_style" in the input file.
// "nequip" should be the only word after "pair_style" in the input file.
if (narg > 0)
error->all(FLERR, "Illegal pair_style command");
}
@@ -186,52 +164,23 @@ void PairNEQUIP::coeff(int narg, char **arg) {
// This is the check used by PyTorch: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/api/module.cpp#L476
if (model.hasattr("training")) {
std::cout << "Freezing TorchScript model...\n";
#ifdef DO_TORCH_FREEZE_HACK
// Do the hack
// Copied from the implementation of torch::jit::freeze,
// except without the broken check
// See https://github.com/pytorch/pytorch/blob/dfbd030854359207cb3040b864614affeace11ce/torch/csrc/jit/api/module.cpp
bool optimize_numerics = true; // the default
// the {} is preserved_attrs
auto out_mod = freeze_module(
model, {}
);
// See 1.11 bugfix in https://github.com/pytorch/pytorch/pull/71436
auto graph = out_mod.get_method("forward").graph();
OptimizeFrozenGraph(graph, optimize_numerics);
model = out_mod;
#else
// Do it normally
model = torch::jit::freeze(model);
#endif
model = torch::jit::freeze(model);
}
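
Since support for <=1.10 is dropped, the pair style now calls `torch::jit::freeze` directly instead of the old hack. For reference, a minimal Python sketch of the equivalent freezing call, on a toy module rather than a NequIP model:

```python
# Sketch: what freezing does, demonstrated on a toy module.
import torch

class Doubler(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return 2 * x

scripted = torch.jit.script(Doubler().eval())  # must be in eval mode to freeze
frozen = torch.jit.freeze(scripted)  # inlines parameters/attributes into the graph
print(frozen(torch.ones(3)))  # tensor([2., 2., 2.])
```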

#if (TORCH_VERSION_MAJOR == 1 && TORCH_VERSION_MINOR <= 10)
// Set JIT bailout to avoid long recompilations for many steps
size_t jit_bailout_depth;
if (metadata["_jit_bailout_depth"].empty()) {
// This is the default used in the Python code
jit_bailout_depth = 2;
} else {
jit_bailout_depth = std::stoi(metadata["_jit_bailout_depth"]);
}
torch::jit::getBailoutDepth() = jit_bailout_depth;
#else
// In PyTorch >=1.11, this is now set_fusion_strategy
torch::jit::FusionStrategy strategy;
if (metadata["_jit_fusion_strategy"].empty()) {
// This is the default used in the Python code
strategy = {{torch::jit::FusionBehavior::DYNAMIC, 3}};
} else {
std::stringstream strat_stream(metadata["_jit_fusion_strategy"]);
std::string fusion_type, fusion_depth;
while(std::getline(strat_stream, fusion_type, ',')) {
std::getline(strat_stream, fusion_depth, ';');
strategy.push_back({fusion_type == "STATIC" ? torch::jit::FusionBehavior::STATIC : torch::jit::FusionBehavior::DYNAMIC, std::stoi(fusion_depth)});
}
// In PyTorch >=1.11, this is now set_fusion_strategy
torch::jit::FusionStrategy strategy;
if (metadata["_jit_fusion_strategy"].empty()) {
// This is the default used in the Python code
strategy = {{torch::jit::FusionBehavior::DYNAMIC, 3}};
} else {
std::stringstream strat_stream(metadata["_jit_fusion_strategy"]);
std::string fusion_type, fusion_depth;
while(std::getline(strat_stream, fusion_type, ',')) {
std::getline(strat_stream, fusion_depth, ';');
strategy.push_back({fusion_type == "STATIC" ? torch::jit::FusionBehavior::STATIC : torch::jit::FusionBehavior::DYNAMIC, std::stoi(fusion_depth)});
}
torch::jit::setFusionStrategy(strategy);
#endif
}
torch::jit::setFusionStrategy(strategy);
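
For reference, the `_jit_fusion_strategy` metadata string parsed above has the form `TYPE,depth;TYPE,depth;...`. A minimal Python sketch of the same parsing, with a made-up example value:

```python
# Sketch of the same parsing in Python; the example string is made up.
def parse_fusion_strategy(s: str = "DYNAMIC,3;STATIC,2"):
    strategy = []
    for entry in s.split(";"):
        if not entry:
            continue
        fusion_type, fusion_depth = entry.split(",")
        strategy.append((fusion_type, int(fusion_depth)))
    return strategy

print(parse_fusion_strategy())  # [('DYNAMIC', 3), ('STATIC', 2)]
```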

// Set whether to allow TF32:
bool allow_tf32;
@@ -463,40 +412,49 @@ void PairNEQUIP::compute(int eflag, int vflag){
auto output = model.forward(input_vector).toGenericDict();

torch::Tensor forces_tensor = output.at("forces").toTensor().cpu();
auto forces = forces_tensor.accessor<float, 2>();
auto forces = forces_tensor.accessor<double, 2>();

torch::Tensor total_energy_tensor = output.at("total_energy").toTensor().cpu();

// store the total energy where LAMMPS wants it
eng_vdwl = total_energy_tensor.data_ptr<float>()[0];
eng_vdwl = total_energy_tensor.data_ptr<double>()[0];

torch::Tensor atomic_energy_tensor = output.at("atomic_energy").toTensor().cpu();
auto atomic_energies = atomic_energy_tensor.accessor<float, 2>();
float atomic_energy_sum = atomic_energy_tensor.sum().data_ptr<float>()[0];
auto atomic_energies = atomic_energy_tensor.accessor<double, 2>();

if(vflag){
torch::Tensor v_tensor = output.at("virial").toTensor().cpu();
auto v = v_tensor.accessor<double, 3>();
// Convert from 3x3 symmetric tensor format, which NequIP outputs, to the flattened form LAMMPS expects
// First [0] index on v is batch
virial[0] = v[0][0][0];
virial[1] = v[0][1][1];
virial[2] = v[0][2][2];
virial[3] = v[0][0][1];
virial[4] = v[0][0][2];
virial[5] = v[0][1][2];
}
if(vflag_atom) {
error->all(FLERR,"Pair style NEQUIP does not support per-atom virial");
}
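
The block above maps the model's symmetric 3x3 virial into the 6-component ordering LAMMPS expects for `virial[]`. A standalone sketch of that mapping, with dummy values:

```python
# Sketch: map a symmetric 3x3 virial to LAMMPS's 6-component ordering
# (xx, yy, zz, xy, xz, yz), mirroring the C++ above. Values are dummies.
import numpy as np

v = np.arange(9, dtype=np.float64).reshape(3, 3)
v = 0.5 * (v + v.T)  # symmetrize, as the model output is assumed to be
lammps_virial = [v[0, 0], v[1, 1], v[2, 2], v[0, 1], v[0, 2], v[1, 2]]
print(lammps_virial)
```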

if(debug_mode){
std::cout << "NequIP model output:\n";
std::cout << "forces: " << forces_tensor << "\n";
std::cout << "total_energy: " << total_energy_tensor << "\n";
std::cout << "atomic_energy: " << atomic_energy_tensor << "\n";
if(vflag) std::cout << "virial: " << output.at("virial").toTensor().cpu() << std::endl;
}

//std::cout << "atomic energy sum: " << atomic_energy_sum << std::endl;
//std::cout << "Total energy: " << total_energy_tensor << "\n";
//std::cout << "atomic energy shape: " << atomic_energy_tensor.sizes()[0] << "," << atomic_energy_tensor.sizes()[1] << std::endl;
//std::cout << "atomic energies: " << atomic_energy_tensor << std::endl;

// Write forces and per-atom energies (0-based tags here)
for(int itag = 0; itag < inum; itag++){
int i = tag2i[itag];
f[i][0] = forces[itag][0];
f[i][1] = forces[itag][1];
f[i][2] = forces[itag][2];
if (eflag_atom) eatom[i] = atomic_energies[itag][0];
//printf("%d %d %g %g %g %g %g %g\n", i, type[i], pos[itag][0], pos[itag][1], pos[itag][2], f[i][0], f[i][1], f[i][2]);
}

// TODO: Virial stuff? (If there even is a pairwise force concept here)

// TODO: Performance: Depending on how the graph network works, using tags for edges may lead to shitty memory access patterns and performance.
// It may be better to first create tag2i as a separate loop, then set edges[edge_counter][:] = (i, tag2i[jtag]).
3 changes: 1 addition & 2 deletions patch_lammps.sh
@@ -62,7 +62,6 @@ fi

echo "Updating CMakeLists.txt..."

# Update CMakeLists.txt
sed -i "s/set(CMAKE_CXX_STANDARD 11)/set(CMAKE_CXX_STANDARD 14)/" $lammps_dir/cmake/CMakeLists.txt

# Add libtorch
@@ -73,4 +72,4 @@ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")
target_link_libraries(lammps PUBLIC "${TORCH_LIBRARIES}")
EOF2

echo "Done!"
echo "Done!"
8 changes: 8 additions & 0 deletions tests/test_data/test_repro.yaml
@@ -2,6 +2,14 @@ run_name: minimal
seed: 123
dataset_seed: 456

# from minimal_stress.yaml
model_builders:
- SimpleIrrepsConfig
- EnergyModel
- PerSpeciesRescale
- StressForceOutput
- RescaleEnergyEtc

# network
num_basis: 4
l_max: 1
29 changes: 28 additions & 1 deletion tests/test_python_repro.py
@@ -12,6 +12,7 @@
from collections import Counter

import ase
import ase.units
import ase.build
import ase.io

@@ -144,9 +145,11 @@ def test_repro(deployed_model):
compute atomicenergies all pe/atom
compute totalatomicenergy all reduce sum c_atomicenergies
compute stress all pressure NULL virial # NULL means without temperature contribution
thermo_style custom step time temp pe c_totalatomicenergy etotal press spcpu cpuremain
thermo_style custom step time temp pe c_totalatomicenergy etotal press spcpu cpuremain c_stress[*]
run 0
print "$({PRECISION_CONST} * c_stress[1]) $({PRECISION_CONST} * c_stress[2]) $({PRECISION_CONST} * c_stress[3]) $({PRECISION_CONST} * c_stress[4]) $({PRECISION_CONST} * c_stress[5]) $({PRECISION_CONST} * c_stress[6])" file stress.dat
print $({PRECISION_CONST} * pe) file pe.dat
print $({PRECISION_CONST} * c_totalatomicenergy) file totalatomicenergy.dat
write_dump all custom output.dump id type x y z fx fy fz c_atomicenergies modify format float %20.15g
@@ -309,9 +312,33 @@ def test_repro(deployed_model):
float(Path(tmpdir + f"/totalatomicenergy.dat").read_text())
/ PRECISION_CONST
)
# in `metal` units, pressure/stress has units of bars,
# so we need to convert
lammps_stress = np.fromstring(
Path(tmpdir + f"/stress.dat").read_text(), sep=" ", dtype=np.float64
) * (ase.units.bar / PRECISION_CONST)
# https://docs.lammps.org/compute_pressure.html
# > The ordering of values in the symmetric pressure tensor is as follows: pxx, pyy, pzz, pxy, pxz, pyz.
lammps_stress = np.array(
[
[lammps_stress[0], lammps_stress[3], lammps_stress[4]],
[lammps_stress[3], lammps_stress[1], lammps_stress[5]],
[lammps_stress[4], lammps_stress[5], lammps_stress[2]],
]
)
assert np.allclose(lammps_pe, lammps_totalatomicenergy)
assert np.allclose(
structure.get_potential_energy(),
lammps_pe,
atol=1e-6,
)
if periodic:
# In LAMMPS, the convention is that the stress tensor, and thus the pressure, is related to the virial
# WITHOUT a sign change. In `nequip`, we chose currently to follow the virial = -stress x volume
# convention => stress = -1/V * virial. ASE does not change the sign of the virial, so we have
# to flip the sign from ASE for the comparison.
assert np.allclose(
-structure.get_stress(voigt=False),
lammps_stress,
atol=1e-6,
)
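
The comparison in the test combines two conversions: LAMMPS `metal`-units pressure components are in bars (converted to ASE's eV/Å³ via `ase.units.bar`), and the sign flips because `nequip` follows the virial = -stress x volume convention while LAMMPS and ASE relate the stress to the virial without the sign change. A standalone sketch of that conversion, with made-up numbers:

```python
# Sketch: convert LAMMPS pressure components (bars, metal units) into ASE
# stress units (eV/Angstrom^3). The six numbers below are made up and stand
# in for pxx, pyy, pzz, pxy, pxz, pyz.
import numpy as np
import ase.units

p = np.array([1.0, 2.0, 3.0, 0.1, 0.2, 0.3])  # bars, hypothetical
p_ev_per_A3 = p * ase.units.bar  # ase.units.bar ~ 6.24e-7 in eV/A^3
lammps_stress = np.array([
    [p_ev_per_A3[0], p_ev_per_A3[3], p_ev_per_A3[4]],
    [p_ev_per_A3[3], p_ev_per_A3[1], p_ev_per_A3[5]],
    [p_ev_per_A3[4], p_ev_per_A3[5], p_ev_per_A3[2]],
])
# Compare against ASE with the sign flipped, as in the test above:
# np.allclose(-atoms.get_stress(voigt=False), lammps_stress)
```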
