Merge pull request #17 from mhpi/dmg_build
update dmg
leoglonz authored Dec 5, 2024
2 parents ff1671c + 3c5af44 commit 04528c6
Showing 5 changed files with 154 additions and 15 deletions.
4 changes: 1 addition & 3 deletions docs/codes/index.md
@@ -8,12 +8,11 @@ See below for coding projects developed by the community that utilize HydroDL
<div class="result" markdown>
<div class="grid cards" markdown>

- [𝛿MG][lonzarich_2024.md]
- [𝛿MG __(Lonzarich et al. 2024)__][lonzarich_2024.md]
---
![](../assets/project-figures/dMG.png){align="left" width="120"}
A second-generation generic, scalable differentiable modeling framework on PyTorch for integrating neural networks with physical models. Coupled with HydroDL2, 𝛿MG enables hydrologic modeling like HydroDL while greatly expanding the range of applications and capabilities.


</div>
</div>

@@ -29,7 +28,6 @@ A second-generation generic, scalable differentiable modeling framework on PyTor
![Manning's n recovery against USGS Data](../assets/project-figures/wrcr27009-fig-0001-m.jpg){align="left" width="120"}
A differentiable routing method that uses Muskingum-Cunge and an NN to infer parameterizations for Manning’s roughness.


</div>
</div>
<div class="result" markdown>
14 changes: 10 additions & 4 deletions docs/codes/lonzarich_2024.md
@@ -10,9 +10,15 @@ The `hydroDL2` repository for hydrology models couples with the 𝛿MG framework

Closely synergizes with deep learning tools and the scale advantage of PyTorch. Maintained by the [MHPI group](http://water.engr.psu.edu/shen/) advised by Dr. Chaopeng Shen.

<br>

### Code Release
## Differentiable Models

Characterized by the combination of process-based equations with neural networks (NNs), differentiable models train these components together, enabling parameter inputs for the equations to be effectively and efficiently learned at scale by the NNs. There are many possibilities for how such models are built.

- [𝛿MG](https://github.com/mhpi/generic_deltaModel)
- [HydroDL 2.0](https://github.com/mhpi/hydroDL2)
- [Example Data Extracted from CAMELS ](https://mhpi-spatial.s3.us-east-2.amazonaws.com/mhpi-release/camels/camels_data.zip)
In 𝛿MG, we define a differentiable model with the class *DeltaModel*, which can couple one or more NNs with a process-based model (itself potentially a collection of models). This class holds `nn` and `phy_model` objects as internal attributes and describes how they interface with each other:

- **nn**: PyTorch neural networks that can learn and provide either parameters, missing process representations, corrections, or other forms of enhancements to physical models.
- **phy_model**: The physical model written in PyTorch (or potentially another interoperable differentiable platform) that takes learnable outputs from the `nn` model(s) and returns a prediction of some target variable(s). This can also be a wrapper holding several physical models.

The *DeltaModel* object can be trained and forwarded just as any other PyTorch model (nn.Module).
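
The coupling described above can be illustrated with a minimal sketch. This is not the actual *DeltaModel* implementation: `ToyBucket` and `ToyDeltaModel` are hypothetical names, and the single-reservoir "physical model" is an assumption chosen only to keep the example short. It shows an NN predicting a parameter that a PyTorch-written process model consumes, with gradients flowing back through both.

```python
import torch
from torch import nn


class ToyBucket(nn.Module):
    """Hypothetical physical model: one linear reservoir, q = k * storage."""

    def forward(self, precip: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
        # precip: (time, basins); k: (basins,) recession parameter in (0, 1)
        storage = torch.zeros_like(precip[0])
        flows = []
        for p_t in precip:
            storage = storage + p_t      # add rainfall to storage
            q_t = k * storage            # outflow proportional to storage
            storage = storage - q_t      # remove outflow from storage
            flows.append(q_t)
        return torch.stack(flows)        # (time, basins)


class ToyDeltaModel(nn.Module):
    """Hypothetical wrapper holding `nn` and `phy_model` as attributes."""

    def __init__(self, nn_model: nn.Module, phy_model: nn.Module) -> None:
        super().__init__()
        self.nn = nn_model
        self.phy_model = phy_model

    def forward(self, attributes: torch.Tensor, precip: torch.Tensor) -> torch.Tensor:
        # The NN maps static basin attributes to a physical parameter in (0, 1).
        k = torch.sigmoid(self.nn(attributes)).squeeze(-1)
        return self.phy_model(precip, k)


torch.manual_seed(0)
model = ToyDeltaModel(nn.Linear(3, 1), ToyBucket())
attributes = torch.rand(4, 3)   # 4 basins, 3 static attributes (synthetic)
precip = torch.rand(10, 4)      # 10 time steps, 4 basins (synthetic)
target = torch.rand(10, 4)      # synthetic "observed" flow

loss = torch.mean((model(attributes, precip) - target) ** 2)
loss.backward()  # gradients pass through the physical model into the NN
```

Because both components are `nn.Module` objects, the wrapper can be handed to any standard PyTorch optimizer in the usual way.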
104 changes: 104 additions & 0 deletions docs/dmg/code.md
@@ -0,0 +1,104 @@
# Code Release

To start using 𝛿MG, clone the framework `generic_deltaModel`:

- [𝛿MG](https://github.com/mhpi/generic_deltaModel)

For [tutorials](https://github.com/mhpi/generic_deltaModel/tree/master/example) and MHPI benchmarks, additionally clone the hydrologic model package `hydroDL2` and download our CAMELS data extraction:

- [HydroDL 2.0](https://github.com/mhpi/hydroDL2)
- [CAMELS Data](https://mhpi-spatial.s3.us-east-2.amazonaws.com/mhpi-release/camels/camels_data.zip)


<br>


# Getting Started with *HydroDL 2.0*

## System Requirements

𝛿MG uses PyTorch models that require CUDA, which is supported exclusively on NVIDIA GPUs. This requires:

- Windows or Linux

- NVIDIA GPU(s) (with CUDA version >12.0 recommended)



## Setup

For a functioning 𝛿MG + HydroDL2 setup, complete the following steps.



### 1. Clone and Download Data
- Open a terminal on your system, navigate to the directory where 𝛿MG and HydroDL2 will be stored, and clone (`master` branch):

```shell
git clone https://github.com/mhpi/generic_deltaModel.git
git clone https://github.com/mhpi/hydroDL2.git
```

- Download the CAMELS data zip from the link above and extract it, optionally to a `Data/` folder in your working directory, which should now look something like:

```
.
├── Data/
│ ├── training_file # Pickle file with training data
│ ├── validation_file # Pickle file with validation/testing data
│ ├── gage_ids.npy # Numpy array with all 671 CAMELS gage ids
│ └── 531_subset.txt # Text file of gage ids in 531-gage subset
├── generic_deltaModel/
└── hydroDL2/

```
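
As a quick sanity check of the layout above, a short script along these lines can report anything that is missing. It is only a sketch: `missing_paths` is a hypothetical helper, and the entry names are copied literally from the tree, so `training_file` and `validation_file` may need to be replaced with the actual pickle file names in the extracted archive.

```python
from pathlib import Path

# Entries taken literally from the directory tree above (adjust as needed).
EXPECTED = (
    "Data/training_file",
    "Data/validation_file",
    "Data/gage_ids.npy",
    "Data/531_subset.txt",
    "generic_deltaModel",
    "hydroDL2",
)


def missing_paths(root: Path) -> list[str]:
    """Return the expected entries that do not exist under `root`."""
    return [rel for rel in EXPECTED if not (root / rel).exists()]


# Run from the working directory that holds Data/ and the two cloned repos.
missing = missing_paths(Path.cwd())
print("Layout OK" if not missing else f"Missing: {missing}")
```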
### 2. Install the ENV
- A minimal yaml list of essential packages is included in `generic_deltaModel`: `generic_deltaModel/envs/deltamodel_env.yaml`.
- To install, run the following (optionally, include `--prefix` flag to specify the env download location):
```shell
conda env create --file generic_deltaModel/envs/deltamodel_env.yaml
```
or
```shell
conda env create --prefix path/to/env --file generic_deltaModel/envs/deltamodel_env.yaml
```
- Activate with `conda activate deltamodel` and open a Python instance to check that CUDA is available with PyTorch:
```python
import torch
print(torch.cuda.is_available())
```
- If CUDA is not available, PyTorch is often installed incorrectly. Uninstall PyTorch from the env and reinstall it according to your system specifications [here](https://pytorch.org/get-started/locally/). E.g.,
```shell
conda uninstall pytorch
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```
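
If the one-line CUDA check in this step prints `False`, a slightly more detailed report can help narrow down the cause. The sketch below is one way to do this; `cuda_report` is a hypothetical helper name, and it only relies on standard PyTorch calls (`torch.cuda.is_available`, `torch.cuda.device_count`, `torch.cuda.get_device_name`, `torch.version.cuda`).

```python
import importlib.util


def cuda_report() -> str:
    """Summarize whether PyTorch is installed and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."

    import torch  # imported lazily so the check also runs without PyTorch

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} is installed, but CUDA is not available."

    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, GPUs: {names}"


print(cuda_report())
```

A CPU-only PyTorch build typically lands in the second branch, which points to the reinstall step above.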
### 3. Install HydroDL2
- For `hydroDL2` to be accessible within `generic_deltaModel`, install it with pip (optionally, include the `-e` flag to install in editable, i.e. developer, mode):
```shell
cd hydroDL2
pip install .
```
or
```shell
cd hydroDL2
pip install -e .
```
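
To confirm that the install worked, you can check that the package is importable from the active environment. This sketch assumes the import name matches the repository name, `hydroDL2`; `is_installed` is a hypothetical helper.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package) is not None


# Assumption: the pip-installed package is importable as `hydroDL2`.
print("hydroDL2 importable:", is_installed("hydroDL2"))
```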
---
### 4. Build Models
- That's it. You should now be able to run the tutorials, train/test MHPI benchmarks, and build your own differentiable models.
46 changes: 38 additions & 8 deletions docs/dmg/detail.md
@@ -17,9 +17,6 @@ For differentiable hydrology models used in MHPI research, 𝛿MG seamlessly int

<br>

Explore the project's [roadmap](https://github.com/orgs/mhpi/projects/4) for planned features and future improvements. It is in our roadmap to interface with differentiable numerical packages like torchode and torchdiffeq.


### Key Features
- **Hybrid Modeling**: Combines neural networks with physical process equations for enhanced interpretability and generalizability. Skip manually tuning model parameters by using neural networks to feed robust and interpretable parameter predictions directly.

@@ -31,11 +28,16 @@ Explore the project's [roadmap](https://github.com/orgs/mhpi/projects/4) for pla

- **NextGen-ready**: 𝛿MG is designed to be [CSDMS BMI](https://csdms.colorado.edu/wiki/BMI)-compliant, and our differentiable hydrology models in hydroDL2 come with a prebuilt BMI, allowing seamless compatibility with [NOAA-OWP](https://water.noaa.gov/about/owp)'s [NextGen National Water Modelling Framework](https://github.com/NOAA-OWP/ngen). Incidentally, this capability also makes 𝛿MG easy to wrap for other applications.

<br>

### Use Cases
This package powers the global- and [`national-scale water models`](https://doi.org/10.22541/essoar.172736277.74497104/v1) that provide high-quality, seamless hydrologic simulations over the US and the world.
It also hosts [`global-scale photosynthesis`](https://doi.org/10.22541/au.173101418.87755465/v1) learning and simulations.

<br>

### The Overall Idea
We define a "differentiable model" (dModel) class which describes how neural networks and the process-based model are coupled. dModel holds NNs and process-based models as attributes and can be trained and forwarded just as any other PyTorch model (nn.Module). We define classes to handle datasets (dataset class), various train/test experiments (trainer), multimodel handling and multi-GPU training (model handler), data assimilation and streaming in a uniform and modular way. All training and simulations can be specified by a config file to be adapted to custom applications.
We define a "differentiable model" class, *DeltaModel*, which describes how neural networks and the process-based model are coupled. *DeltaModel* holds NNs and process-based models as attributes and can be trained and forwarded just as any other PyTorch model (nn.Module). We define classes to handle datasets (dataset class), various train/test experiments (trainer), multimodel handling and multi-GPU training (model handler), and data assimilation and streaming in a uniform and modular way. All training and simulations can be specified by a config file that can be adapted to custom applications.
According to the schema, we define these core classes, from bottom up:

- **NN**: Neural networks that can provide either parameters, missing process representations, corrections or other forms of enhancements to process-based models.
@@ -45,6 +47,31 @@ According to the schema, we define these core classes, from bottom up:
- **Trainer**: Manages the training and testing of models and connects data to models.
- **dataset**: Manages data ingestion in a unified format; supports multiple file formats.

<br>

### 𝛿MG Repository Structure:

.
├── deltaModel/
│ ├── __main__.py # Main entry point
│ ├── conf/ # Configuration files
│ │ ├── config.py
│ │ ├── config.yaml # Main configuration file
│ │ ├── hydra/
│ │ └── observations/ # Observation data config
│ ├── core/
│ │ ├── calc/ # Calculation utilities
│ │ ├── data/ # Data processing
│ │ └── utils/ # Helper functions
│ ├── models/
│ │ ├── differentiable_model.py # Differentiable model definition
│ │ ├── model_handler.py # High-level model manager
│ │ ├── loss_functions/ # Custom loss functions
│ │ └── neural_networks/ # Neural network architectures
│ └── trainers/ # Training routines
├── docs/
├── envs/ # Environment configuration files
└── example/ # Example scripts and usage guides

<br>

@@ -53,6 +80,8 @@ According to the schema, we define these core classes, from bottom up:
Here’s an example of how you can build a differentiable model, coupling a physics-based model with a neural network to intelligently learn model parameters. In this instance, we use an
LSTM with the [HBV](https://en.wikipedia.org/wiki/HBV_hydrology_model) hydrology model.
```python
CONFIG_PATH = '../example/conf/config_dhbv1_1p.yaml'

# 1. Load configuration dictionary of model parameters and options.
config = load_config(CONFIG_PATH)

@@ -73,9 +102,10 @@ dpl_model = dHBV(phy_model=phy_model, nn_model=nn)

# 5. For example, to forward:
output = dpl_model.forward(dataset_sample)

```

### Use Cases
This package powers the global- and [`national-scale water models`](https://doi.org/10.22541/essoar.172736277.74497104/v1) that provide high-quality, seamless hydrologic simulations over the US and the world.
It also hosts [`global-scale photosynthesis`](https://doi.org/10.22541/au.173101418.87755465/v1) learning and simulations.
See [here](https://github.com/mhpi/generic_deltaModel/blob/master/example/differentiable_hydrology/dhbv_tutorial.ipynb) in the `generic_deltaModel` repository for this and other examples.

<br>

Explore the [roadmap](https://github.com/orgs/mhpi/projects/4) for planned features and improvements. Interfacing with differentiable numerical packages such as torchode and torchdiffeq is on the roadmap.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -128,6 +128,7 @@ nav:
- 𝛿MG:
- Overview: codes/lonzarich_2024.md
- Details: dmg/detail.md
- Code Release: dmg/code.md
- Benchmarks:
- benchmarks/index.md
- Getting started: