Merge pull request #1 from tornado-prediction/keras3
Keras3 Updates
markveilletteLL authored and GitHub Enterprise committed Jul 9, 2024
2 parents 3f8a8cb + eefbd2b commit c30dabd
Showing 46 changed files with 1,618 additions and 773 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -5,3 +5,5 @@ __pycache__/
*.egg-info
experiments/
lightning_logs/
tornado_baseline*
tornet_baseline*
70 changes: 54 additions & 16 deletions README.md
@@ -2,8 +2,19 @@

Software to work with the TorNet dataset as described in the paper [*A Benchmark Dataset for Tornado Detection and Prediction using Full-Resolution Polarimetric Weather Radar Data*](https://arxiv.org/abs/2401.16437)

## Updates (7/9/24)

* The TorNet dataset has been updated to version 1.1. This update fixes a small number of incorrectly labeled frames in v1 of the dataset, and fixes the event and episode IDs of the warning categories. Version 1.1 also provides the tornado start and end times in the metadata. We recommend re-downloading the newer version of the data using the links below.

* The code and pretrained models are now compatible with `keras` 3.0. Users can now select their deep learning backend from `tensorflow`, `torch`, or `jax`. Backend-agnostic data loaders are also provided. Read more about this library at [keras's website](https://keras.io/). Users of `tf.keras` should use the `tf_keras` branch of this repo.

* The pretrained CNN model is now available on [huggingface (tornet-ml/tornado_detector_baseline_v1)](https://huggingface.co/tornet-ml/tornado_detector_baseline_v1). Instructions for downloading and using the pre-trained model can be found in `models/README.md` and in the `VisualizeSamples.ipynb` notebook.


![Alt text](tornet_image.png?raw=true "sample")



## Downloading the Data

The TorNet dataset can be downloaded from the following location:
@@ -12,16 +23,16 @@ The TorNet dataset can be downloaded from the following location:

TorNet is split across 10 files, each containing 1 year of data. There is also a catalog CSV file that is used by some functions in this repository.

* Tornet 2013 (3 GB) and catalog: [https://zenodo.org/doi/10.5281/zenodo.10558658](https://zenodo.org/doi/10.5281/zenodo.10558658)
* Tornet 2014 (15 GB): [https://zenodo.org/doi/10.5281/zenodo.10558838](https://zenodo.org/doi/10.5281/zenodo.10558838)
* Tornet 2015 (17 GB): [https://zenodo.org/doi/10.5281/zenodo.10558853](https://zenodo.org/doi/10.5281/zenodo.10558853)
* Tornet 2016 (16 GB): [https://zenodo.org/doi/10.5281/zenodo.10565458](https://zenodo.org/doi/10.5281/zenodo.10565458)
* Tornet 2017 (15 GB): [https://zenodo.org/doi/10.5281/zenodo.10565489](https://zenodo.org/doi/10.5281/zenodo.10565489)
* Tornet 2018 (12 GB): [https://zenodo.org/doi/10.5281/zenodo.10565514](https://zenodo.org/doi/10.5281/zenodo.10565514)
* Tornet 2019 (18 GB): [https://zenodo.org/doi/10.5281/zenodo.10565535](https://zenodo.org/doi/10.5281/zenodo.10565535)
* Tornet 2020 (17 GB): [https://zenodo.org/doi/10.5281/zenodo.10565581](https://zenodo.org/doi/10.5281/zenodo.10565581)
* Tornet 2021 (18 GB): [https://zenodo.org/doi/10.5281/zenodo.10565670](https://zenodo.org/doi/10.5281/zenodo.10565670)
* Tornet 2022 (19 GB): [https://zenodo.org/doi/10.5281/zenodo.10565691](https://zenodo.org/doi/10.5281/zenodo.10565691)
* Tornet 2013 (3 GB) and catalog: [https://doi.org/10.5281/zenodo.12636522](https://doi.org/10.5281/zenodo.12636522)
* Tornet 2014 (15 GB): [https://doi.org/10.5281/zenodo.12637032](https://doi.org/10.5281/zenodo.12637032)
* Tornet 2015 (17 GB): [https://doi.org/10.5281/zenodo.12655151](https://doi.org/10.5281/zenodo.12655151)
* Tornet 2016 (16 GB): [https://doi.org/10.5281/zenodo.12655179](https://doi.org/10.5281/zenodo.12655179)
* Tornet 2017 (15 GB): [https://doi.org/10.5281/zenodo.12655183](https://doi.org/10.5281/zenodo.12655183)
* Tornet 2018 (12 GB): [https://doi.org/10.5281/zenodo.12655187](https://doi.org/10.5281/zenodo.12655187)
* Tornet 2019 (18 GB): [https://doi.org/10.5281/zenodo.12655716](https://doi.org/10.5281/zenodo.12655716)
* Tornet 2020 (17 GB): [https://doi.org/10.5281/zenodo.12655717](https://doi.org/10.5281/zenodo.12655717)
* Tornet 2021 (18 GB): [https://doi.org/10.5281/zenodo.12655718](https://doi.org/10.5281/zenodo.12655718)
* Tornet 2022 (19 GB): [https://doi.org/10.5281/zenodo.12655719](https://doi.org/10.5281/zenodo.12655719)

If downloading through your browser is slow, we recommend downloading these using `zenodo_get` (https://gitlab.com/dvolgyes/zenodo_get).

@@ -30,41 +41,68 @@ After downloading, there should be 11 files, `catalog.csv`, and 10 files named a
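
As a rough sanity check on the download, the sketch below confirms that `catalog.csv` is present and counts the remaining files in a directory. The function name `check_tornet_download` and its arguments are hypothetical and not part of this repository, and it assumes all downloaded files sit directly in one folder with no other files alongside them.

```python
from pathlib import Path


def check_tornet_download(download_dir: str, expected_yearly_files: int = 10) -> bool:
    """Return True if catalog.csv plus the expected number of other files are present."""
    root = Path(download_dir)
    # Regular files only; subdirectories (e.g. already-extracted data) are ignored.
    files = [p for p in root.iterdir() if p.is_file()]
    has_catalog = any(p.name == "catalog.csv" for p in files)
    # Everything that is not the catalog is counted as a yearly data file.
    yearly = [p for p in files if p.name != "catalog.csv"]
    return has_catalog and len(yearly) == expected_yearly_files
```

Stray files in the folder (hidden files, partial downloads) will make the count fail, which is usually the desired behavior for a check like this.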

## Setup

Basic python requirements are listed in `requirements/basic.txt` and can be installed using `pip install -r requirements.txt`.
Basic python requirements are listed in `requirements/basic.txt`.

The `tornet` package can then be installed into your environment by running

`pip install .`

in this repo. To do ML with TorNet, additional installs may be necessary depending on library of choice. See e.g., `requirements/tensorflow.txt`, `requirements/torch.txt`.
in this repo. To do ML with TorNet, additional installs may be necessary depending on your library of choice. See, e.g., `requirements/tensorflow.txt`, `requirements/torch.txt`, and/or `requirements/jax.txt`.

Please note that we did not exhaustively test all combinations of operating systems, data loaders, deep learning frameworks, and GPU usage. If you are using the latest version of `keras`, we recommend that you follow the setup instructions on the keras webpage [https://keras.io/getting_started/](https://keras.io/getting_started/). Feel free to describe any issues you are having under the issues tab.

### Conda

If using conda:

```
conda create -n tornet-{backend} python=3.10
conda activate tornet-{backend}
pip install -r requirements/{backend}.txt
```

Replace {backend} with tensorflow, torch or jax.


## Loading and visualizing TorNet

Start with `notebooks/DataLoaders.ipynb` to get an overview on loading and visualizing the dataset.

To run inference on TorNet samples using a pretrained model, look at `notebooks/VisualizeSamples.ipynb`.

## Train CNN baseline model

### Multiple backend support with Keras 3
The model uses Keras 3, which supports multiple backends. The environment variable
`KERAS_BACKEND` can be used to choose the backend.

```
export KERAS_BACKEND=tensorflow
# export KERAS_BACKEND=torch
# export KERAS_BACKEND=jax
```
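
The same choice can be made from inside Python, for example in a notebook, as long as the variable is set before `keras` is first imported. A minimal sketch; the helper name `select_keras_backend` is hypothetical and not part of this repository:

```python
import os

VALID_BACKENDS = ("tensorflow", "torch", "jax")


def select_keras_backend(name: str) -> str:
    """Set KERAS_BACKEND for this process; call before the first `import keras`."""
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("tensorflow")
# import keras  # import only after the variable is set
```

Setting the variable after `keras` has already been imported has no effect on the running process, so this call belongs at the very top of a script or notebook.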

The following trains the CNN baseline model described in the paper using `tensorflow`. If you run this out-of-the-box, it will run very slowly because it uses the basic data loader. Read `notebooks/DataLoaders.ipynb` for tips on how to optimize the data loader.
```
# Set path to dataset
export TORNET_ROOT=/path/to/tornet
# Run training
python scripts/tornado_detection/train_tornado_tf.py scripts/tornado_detection/config/params.json
python scripts/tornado_detection/train_tornado_keras.py scripts/tornado_detection/config/params.json
```

## Evaluate trained model
Weights of a pretrained CNN baseline are provided in `model/`. To evaluate this model on the test set, run
To evaluate this model on the test set, run

```
# Set path to dataset
export TORNET_ROOT=/path/to/tornet
# Evaluate trained model
python scripts/tornado_detection/test_tornado_tf.py models/tornado_detector_baseline.SavedModel
python scripts/tornado_detection/test_tornado_keras.py
```

This will compute and print various metrics computed on the test set.
This will compute and print various metrics on the test set. Note that this script will attempt to download pretrained weights from huggingface, so ensure there is internet connectivity. Alternatively, download the pretrained weights manually and provide their location with `--model_path`.
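
For machines without internet access at evaluation time, one possible workflow is to fetch the weights ahead of time and pass the local file to the script. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, `huggingface_hub` must be installed for the download step, and it assumes `--model_path` accepts the path of the downloaded `.keras` file.

```python
import os
import sys

EVAL_SCRIPT = "scripts/tornado_detection/test_tornado_keras.py"


def download_weights() -> str:
    """Fetch the pretrained weights and return the local file path."""
    # Imported lazily so the rest of this sketch works without huggingface_hub.
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id="tornet-ml/tornado_detector_baseline_v1",
                           filename="tornado_detector_baseline.keras")


def build_eval_command(model_path: str) -> list[str]:
    """Build the evaluation command for an already-downloaded weights file."""
    return [sys.executable, EVAL_SCRIPT, "--model_path", model_path]


def build_eval_env(tornet_root: str) -> dict[str, str]:
    """Copy the current environment and point TORNET_ROOT at the dataset."""
    env = dict(os.environ)
    env["TORNET_ROOT"] = tornet_root
    return env
```

Run from the repository root, `subprocess.run(build_eval_command(download_weights()), env=build_eval_env("/path/to/tornet"), check=True)` would then start the evaluation with the local weights.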


### Disclosure
11 changes: 11 additions & 0 deletions models/README.md
@@ -0,0 +1,11 @@
Pretrained models can be downloaded from huggingface
https://huggingface.co/tornet-ml/tornado_detector_baseline_v1

or accessed using the huggingface api: (assumes `tornet` is in your path)

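```python
import keras
from huggingface_hub import hf_hub_download

# Download the pretrained weights (cached locally after the first call)
trained_model = hf_hub_download(repo_id="tornet-ml/tornado_detector_baseline_v1",
                                filename="tornado_detector_baseline.keras")
# compile=False loads the model without compiling it, which is enough for inference
model = keras.saving.load_model(trained_model, compile=False)
```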
1 change: 0 additions & 1 deletion models/tornado_detector_baseline.SavedModel/fingerprint.pb

This file was deleted.

49 changes: 0 additions & 49 deletions models/tornado_detector_baseline.SavedModel/keras_metadata.pb

This file was deleted.

Binary file not shown.