From 3c5af441aac2f95deed7ee39362976b68da6c8b4 Mon Sep 17 00:00:00 2001 From: Leo Lonzarich Date: Thu, 5 Dec 2024 17:17:46 -0500 Subject: [PATCH] update dmg --- docs/codes/index.md | 4 +- docs/codes/lonzarich_2024.md | 14 +++-- docs/dmg/code.md | 104 +++++++++++++++++++++++++++++++++++ docs/dmg/detail.md | 46 +++++++++++++--- mkdocs.yml | 1 + 5 files changed, 154 insertions(+), 15 deletions(-) create mode 100644 docs/dmg/code.md diff --git a/docs/codes/index.md b/docs/codes/index.md index fae9a69..6bee85e 100644 --- a/docs/codes/index.md +++ b/docs/codes/index.md @@ -8,12 +8,11 @@ See below for coding projects developed by the community that utilize HydroDL
-- [𝛿MG][lonzarich_2024.md] +- [𝛿MG __(Lonzarich et al. 2024)__][lonzarich_2024.md] --- ![](../assets/project-figures/dMG.png){align="left" width="120"} A second-generation generic, scalable differentiable modeling framework on PyTorch for integrating neural networks with physical models. Coupled with HydroDL2, 𝛿MG enables the hydrologic modeling like HydroDL while greatly expanding the range of applications and capabilities. -
@@ -29,7 +28,6 @@ A second-generation generic, scalable differentiable modeling framework on PyTor ![Manning's n recovery against USGS Data](../assets/project-figures/wrcr27009-fig-0001-m.jpg){align="left" width="120"} A differentiable routing method that uses Muskingum-Cunge and an NN to infer parameterizations for Manning’s roughness -
diff --git a/docs/codes/lonzarich_2024.md b/docs/codes/lonzarich_2024.md index 2646390..67b7a42 100644 --- a/docs/codes/lonzarich_2024.md +++ b/docs/codes/lonzarich_2024.md @@ -10,9 +10,15 @@ The `hydroDL2` repository for hydrology models couples with the 𝛿MG framework Closely synergizes with deep learning tools and the scale advantage of PyTorch. Maintained by the [MHPI group](http://water.engr.psu.edu/shen/) advised by Dr. Chaopeng Shen. +
-### Code Release +## Differentiable Models + +Characterized by the combination of process-based equations with neural networks (NNs), differentiable models train these components together, enabling parameter inputs for the equations to be effectively and efficiently learned at scale by the NNs. There are many possibilities for how such models are built. -- [𝛿MG](https://github.com/mhpi/generic_deltaModel) -- [HydroDL 2.0](https://github.com/mhpi/hydroDL2) -- [Example Data Extracted from CAMELS ](https://mhpi-spatial.s3.us-east-2.amazonaws.com/mhpi-release/camels/camels_data.zip) +In 𝛿MG, we define a differentiable model with the class *DeltaModel*, which can couple one or more NNs with a process-based model (itself potentially a collection of models). This class holds `nn` and `phy_model` objects as internal attributes and describes how they interface with each other: + +- **nn**: PyTorch neural networks that can learn and provide parameters, missing process representations, corrections, or other forms of enhancement to physical models. +- **phy_model**: The physical model written in PyTorch (or potentially another interoperable differentiable platform) that takes learnable outputs from the `nn` model(s) and returns a prediction of some target variable(s). This can also be a wrapper holding several physical models. + +The *DeltaModel* object can be trained and forwarded just as any other PyTorch model (`nn.Module`). 
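The coupling described above can be illustrated with a minimal sketch. This is not the 𝛿MG implementation: `ToyBucket`, `ToyDeltaModel`, the single recession parameter `k`, and the random tensors are all hypothetical stand-ins chosen for illustration. Only the attribute names `nn` and `phy_model` and the use of `torch.nn.Module` come from the text.

```python
import torch


class ToyBucket(torch.nn.Module):
    """Hypothetical physical model: a one-parameter linear reservoir."""

    def forward(self, precip: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
        # precip: (time, basins); k: (basins,) recession coefficient in (0, 1).
        storage = torch.zeros_like(precip[0])
        flows = []
        for p_t in precip:
            storage = storage + p_t
            q_t = k * storage        # outflow is a learned fraction of storage
            storage = storage - q_t
            flows.append(q_t)
        return torch.stack(flows)    # (time, basins)


class ToyDeltaModel(torch.nn.Module):
    """Hypothetical wrapper holding `nn` and `phy_model` as attributes."""

    def __init__(self, nn_model: torch.nn.Module, phy_model: torch.nn.Module) -> None:
        super().__init__()
        self.nn = nn_model
        self.phy_model = phy_model

    def forward(self, attributes: torch.Tensor, precip: torch.Tensor) -> torch.Tensor:
        # The NN maps static basin attributes to a physical parameter in (0, 1).
        k = torch.sigmoid(self.nn(attributes)).squeeze(-1)
        return self.phy_model(precip, k)


torch.manual_seed(0)
model = ToyDeltaModel(torch.nn.Linear(3, 1), ToyBucket())
attributes = torch.rand(4, 3)    # 4 basins, 3 static attributes (synthetic)
precip = torch.rand(10, 4)       # 10 time steps, 4 basins (synthetic)
target = torch.rand(10, 4)       # synthetic observations

prediction = model(attributes, precip)
loss = torch.nn.functional.mse_loss(prediction, target)
loss.backward()  # gradients pass through the physical model into the NN
```

Because the physical model is written in differentiable tensor operations, the loss on the predicted flow updates the NN weights directly, which is the sense in which the two components are trained together.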
diff --git a/docs/dmg/code.md b/docs/dmg/code.md new file mode 100644 index 0000000..bfab4c8 --- /dev/null +++ b/docs/dmg/code.md @@ -0,0 +1,104 @@ +# Code Release + +To start using 𝛿MG, clone the framework `generic_deltaModel`: + +- [𝛿MG](https://github.com/mhpi/generic_deltaModel) + +For [tutorials](https://github.com/mhpi/generic_deltaModel/tree/master/example) and MHPI benchmarks, additionally clone the hydrologic model package `hydroDL2` and download our CAMELS data extraction: + +- [HydroDL 2.0](https://github.com/mhpi/hydroDL2) +- [CAMELS Data](https://mhpi-spatial.s3.us-east-2.amazonaws.com/mhpi-release/camels/camels_data.zip) + + +
+ + +# Getting Started with *HydroDL 2.0* + +## System Requirements + +𝛿MG uses PyTorch models that require CUDA, which is supported exclusively by NVIDIA GPUs. This requires: + +- Windows or Linux + +- NVIDIA GPU(s) (with CUDA version >12.0 recommended) + + + +## Setup + +For a functioning 𝛿MG + HydroDL2 setup... + + + +### 1. Clone and Download Data +- Open a terminal on your system, navigate to the directory where 𝛿MG and HydroDL2 will be stored, and clone (`master` branch): + + ```shell + git clone https://github.com/mhpi/generic_deltaModel.git + git clone https://github.com/mhpi/hydroDL2.git + ``` + +- Download the CAMELS data zip from the link above and extract it, optionally to a `Data/` folder in your working directory, which should now look something like + +``` + . + ├── Data/ + │ ├── training_file # Pickle file with training data + │ ├── validation_file # Pickle file with validation/testing data + │ ├── gage_ids.npy # Numpy array with all 671 CAMELS gage ids + │ └── 531_subset.txt # Text file of gage ids in 531-gage subset + ├── generic_deltaModel/ + └── hydroDL2/ + +``` + + +### 2. Install the ENV +- A minimal yaml list of essential packages is included in `generic_deltaModel`: `generic_deltaModel/envs/deltamodel_env.yaml`. +- To install, run the following from your working directory (optionally, include the `--prefix` flag to specify the env download location): + + ```shell + conda env create --file generic_deltaModel/envs/deltamodel_env.yaml + ``` + or + + ```shell + conda env create --prefix path/to/env --file generic_deltaModel/envs/deltamodel_env.yaml + ``` + +- Activate with `conda activate deltamodel` and open a Python instance to check that CUDA is available with PyTorch: + + ```python + import torch + print(torch.cuda.is_available()) + ``` + +- If CUDA is not available, PyTorch is often installed incorrectly. Uninstall PyTorch from the env and reinstall according to your system specifications [here](https://pytorch.org/get-started/locally/). 
E.g., + + ```shell + conda uninstall pytorch + conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia + ``` + + +### 3. Install HydroDL2 +- For `hydroDL2` to be accessible within `generic_deltaModel`, install with pip (optionally, include `-e` flag to install with developer mode): + + ```shell + cd hydroDL2 + pip install . + ``` + + or + + ```shell + cd hydroDL2 + pip install -e . + ``` + +--- + +### 4. Build Models + +- That's it. You should now be able to run the tutorials, train/test MHPI benchmarks, and build your own differentiable models. diff --git a/docs/dmg/detail.md b/docs/dmg/detail.md index eb10bae..7f47a5a 100644 --- a/docs/dmg/detail.md +++ b/docs/dmg/detail.md @@ -17,9 +17,6 @@ For differentiable hydrology models used in MHPI research, 𝛿MG seamlessly int
-Explore the project's [roadmap](https://github.com/orgs/mhpi/projects/4) for planned features and future improvements. It is in our roadmap to interface with differentiable numerical packages like torchode and torchdiffeq. - - ### Key Features - **Hybrid Modeling**: Combines neural networks with physical process equations for enhanced interpretability and generalizability. Skip manually tuning model parameters by using neural networks to feed robust and interpretable parameter predictions directly. @@ -31,11 +28,16 @@ Explore the project's [roadmap](https://github.com/orgs/mhpi/projects/4) for pla - **NextGen-ready**: 𝛿MG is designed to be [CSDMS BMI](https://csdms.colorado.edu/wiki/BMI)-compliant, and our differentiable hydrology models in hydroDL2 come with a prebuilt BMI allowing seamless compatibility with [NOAA-OWP](https://water.noaa.gov/about/owp)'s [NextGen National Water Modelling Framework](https://github.com/NOAA-OWP/ngen). Incidentally, this capability also lends to 𝛿MG being easily wrappable for other applications. +
+ +### Use Cases +This package powers the global- and [`national-scale water models`](https://doi.org/10.22541/essoar.172736277.74497104/v1) that provide high-quality, seamless hydrologic simulations over the US and the world. +It also hosts [`global-scale photosynthesis`](https://doi.org/10.22541/au.173101418.87755465/v1) learning and simulations.
### The Overall Idea -We define a "differentiable model" (dModel) class which describes how neural networks and the process-based model are coupled. dModel holds NNs and process-based models as attributes and can be trained and forwarded just as any other PyTorch model (nn.Module). We define classes to handle datasets (dataset class), various train/test experiments (trainer), multimodel handling and multi-GPU training (model handler), data assimilation and streaming in a uniform and modular way. All training and simulations can be specified by a config file to be adapted to custom applications. +We define a "differentiable model" class, *DeltaModel*, which describes how neural networks and the process-based model are coupled. *DeltaModel* holds NNs and process-based models as attributes and can be trained and forwarded just as any other PyTorch model (nn.Module). We define classes to handle datasets (dataset class), various train/test experiments (trainer), multimodel handling and multi-GPU training (model handler), and data assimilation and streaming in a uniform and modular way. All training and simulations can be specified by a config file to be adapted to custom applications. According to the schema, we define these core classes, from bottom up: - **NN**: Neural networks that can provide either parameters, missing process representations, corrections or other forms of enhancements to process-based models. @@ -45,6 +47,31 @@ According to the schema, we define these core classes, from bottom up: - **Trainer**: Manages the train and test of models and connects data to model. - **dataset**: Manages data ingestion in a unified format; support multiple file formats. +
+ +### 𝛿MG Repository Structure: + + . + ├── deltaModel/ + │ ├── __main__.py # Main entry point + │ ├── conf/ # Configuration files + │ │ ├── config.py + │ │ ├── config.yaml # Main configuration file + │ │ ├── hydra/ + │ │ └── observations/ # Observation data config + │ ├── core/ + │ │ ├── calc/ # Calculation utilities + │ │ ├── data/ # Data processing + │ │ └── utils/ # Helper functions + │ ├── models/ + │ │ ├── differentiable_model.py # Differentiable model definition + │ │ ├── model_handler.py # High-level model manager + │ │ ├── loss_functions/ # Custom loss functions + │ │ └── neural_networks/ # Neural network architectures + │ └── trainers/ # Training routines + ├── docs/ + ├── envs/ # Environment configuration files + └── example/ # Example scripts and usage guides
@@ -53,6 +80,8 @@ According to the schema, we define these core classes, from bottom up: Here’s an example of how you can build a differentiable model, coupling a physics-based model with a neural network to intelligently learn model parameters. In this instance, we use an LSTM with the [HBV](https://en.wikipedia.org/wiki/HBV_hydrology_model) hydrology model. ```python +CONFIG_PATH = '../example/conf/config_dhbv1_1p.yaml' + # 1. Load configuration dictionary of model parameters and options. config = load_config(CONFIG_PATH) @@ -73,9 +102,10 @@ dpl_model = dHBV(phy_model=phy_model, nn_model=nn) # 5. For example, to forward: output = dpl_model.forward(dataset_sample) - ``` -### Use Cases -This package powers the global- and ([`national-scale water model`](https://doi.org/10.22541/essoar.172736277.74497104/v1)) that provide high-quality seamless hydrologic simulations over US and the world. -It also hosts ([`global-scale photosynthesis `](https://doi.org/10.22541/au.173101418.87755465/v1)) learning and simulations +See [here](https://github.com/mhpi/generic_deltaModel/blob/master/example/differentiable_hydrology/dhbv_tutorial.ipynb) in the `generic_deltaModel` repository for this and other examples. + +
+ +Explore the [roadmap](https://github.com/orgs/mhpi/projects/4) for planned features and improvements. It is in our roadmap to interface with differentiable numerical packages like torchode and torchdiffeq. diff --git a/mkdocs.yml b/mkdocs.yml index 1c2eca0..fdb005d 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -128,6 +128,7 @@ nav: - 𝛿MG: - Overview: codes/lonzarich_2024.md - Details: dmg/detail.md + - Code Release: dmg/code.md - Benchmarks: - benchmarks/index.md - Getting started: