Add preprocess docs
ElliottKasoar committed Nov 1, 2024
1 parent 6534669 commit 903ea88
Showing 3 changed files with 33 additions and 1 deletion.
1 change: 1 addition & 0 deletions README.md
@@ -114,6 +114,7 @@ janus phonons
janus eos
janus train
janus descriptors
janus preprocess
```

For example, a single point calculation (using the [MACE-MP](https://github.com/ACEsuit/mace-mp) "small" force-field) can be performed by running:
10 changes: 10 additions & 0 deletions docs/source/apidoc/janus_core.rst
@@ -289,6 +289,16 @@ janus\_core.processing.symmetry module
:undoc-members:
:show-inheritance:

janus\_core.training.preprocess module
--------------------------------------

.. automodule:: janus_core.training.preprocess
   :members:
   :special-members:
   :private-members:
   :undoc-members:
   :show-inheritance:

janus\_core.training.train module
---------------------------------

23 changes: 22 additions & 1 deletion docs/source/user_guide/command_line.rst
@@ -346,7 +346,7 @@ Training and fine-tuning MLIPs
------------------------------

.. note::
-    Currently only MACE models are supported. See the `MACE CLI <https://github.com/ACEsuit/mace/blob/main/mace/cli/run_train.py>`_ for further configuration details
+    Currently only MACE models are supported. See the `MACE run_train CLI <https://github.com/ACEsuit/mace/blob/main/mace/cli/run_train.py>`_ for further configuration details

Models can be trained by passing a configuration file to the MLIP's command line interface:

@@ -364,6 +364,27 @@ Foundational models can also be fine-tuned, by including the ``foundation_model``
    janus train --mlip-config /path/to/fine/tuning/config.yml --fine-tune

Preprocessing training data
----------------------------

.. note::
    Currently only MACE models are supported. See the `MACE preprocess_data CLI <https://github.com/ACEsuit/mace/blob/main/mace/cli/preprocess_data.py>`_ for further configuration details

Large datasets, which may not fit into GPU memory, can be preprocessed,
converting xyz training, test, and validation files into HDF5 files that can then be used for on-line data loading.

This can be done by passing a configuration file to the MLIP's command line interface:

.. code-block:: bash

    janus preprocess --mlip-config /path/to/preprocessing/config.yml

For MACE, this will create separate folders for ``train``, ``val`` and ``test`` HDF5 data files, when relevant,
as well as saving the statistics of your data in ``statistics.json``, if requested.

Additionally, a log file, ``preprocess-log.yml``, and summary file, ``preprocess-summary.yml``, will be generated.
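
As a rough illustration of the workflow above, the following Python sketch writes a small configuration file and assembles the matching ``janus preprocess`` command without running it. The keys are assumed to mirror option names of MACE's ``preprocess_data`` script (``train_file``, ``valid_file``, ``test_file``, ``r_max``, ``compute_statistics``, ``E0s``, ``seed``). The file names, values, and the helpers ``to_yaml`` and ``build_command`` are hypothetical and not part of janus-core or MACE.

```python
from pathlib import Path
import tempfile

# Hypothetical settings: keys are assumed to match MACE preprocess_data
# option names; the paths and values are placeholders only.
settings = {
    "train_file": "train.xyz",
    "valid_file": "valid.xyz",
    "test_file": "test.xyz",
    "r_max": 6.0,
    "compute_statistics": True,
    "E0s": "average",
    "seed": 2024,
}


def to_yaml(options):
    """Render a flat mapping as simple ``key: value`` YAML lines."""
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML boolean literals
        elif isinstance(value, str):
            value = f'"{value}"'  # quote strings to keep them unambiguous
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


def build_command(config_path):
    """Return the argument list for the command shown in this section."""
    return ["janus", "preprocess", "--mlip-config", str(config_path)]


# Write the configuration to a temporary directory and build the command.
config_path = Path(tempfile.mkdtemp()) / "preprocess_config.yml"
config_path.write_text(to_yaml(settings))
command = build_command(config_path)
print(" ".join(command))
```

The resulting argument list could be passed to ``subprocess.run`` in an environment where janus-core is installed. Which keys are actually accepted depends on the installed MACE version, so the MACE ``preprocess_data`` CLI remains the reference for the available options.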


Calculate descriptors
---------------------
