Documentation (#43)
* Add MkDocs. (#41)

* Simple MODIS PipeLine Completed.

* Working barebones GOES16 Pipeline

* Updates.

* Quick commit.

Co-authored-by: annajungbluth <[email protected]>
Co-authored-by: Lilli Freischem <[email protected]>

* Coding session w/ Anna & Lilli

---------

Co-authored-by: annajungbluth <[email protected]>
Co-authored-by: Lilli Freischem <[email protected]>

* More Mkdocs. (#42)

* Simple MODIS PipeLine Completed.

* Working barebones GOES16 Pipeline

* Updates.

* Quick commit.

Co-authored-by: annajungbluth <[email protected]>
Co-authored-by: Lilli Freischem <[email protected]>

* Coding session w/ Anna & Lilli

* Add prelim docs.

---------

Co-authored-by: annajungbluth <[email protected]>
Co-authored-by: Lilli Freischem <[email protected]>

* Updated docs.

* Updates.

* Final Updates.

* Updated the ReadMe.

* Fixed pictures.

* Fixed links.

---------

Co-authored-by: annajungbluth <[email protected]>
Co-authored-by: Lilli Freischem <[email protected]>
3 people authored Apr 29, 2024
1 parent 576b3ed commit 1b269f5
Showing 40 changed files with 11,496 additions and 2,443 deletions.
Empty file added .env.example
Empty file.
71 changes: 71 additions & 0 deletions .github/workflows/build_docs.yaml
@@ -0,0 +1,71 @@
name: Build the documentation

on:
  push:
    branches:
      - main

permissions:
  contents: write

jobs:
  build-docs:
    concurrency: ci-${{ github.ref }}
    name: Build docs (${{ matrix.python-version }}, ${{ matrix.os }})
    runs-on: ${{ matrix.os }}
    defaults:
      run:
        shell: bash -l {0}
    strategy:
      matrix:
        os: ["ubuntu-latest"]
        python-version: ["3.10"]

    steps:
      # Grab the latest commit from the branch
      - name: Checkout the branch
        uses: actions/[email protected]
        with:
          persist-credentials: false

      # Create a virtual environment
      - name: create Conda environment
        uses: conda-incubator/setup-miniconda@v2
        with:
          auto-update-conda: true
          python-version: ${{ matrix.python-version }}

      # Install katex for math support
      - name: Install NPM
        uses: actions/setup-node@v3
        with:
          node-version: 16
      - name: Install KaTeX
        run: |
          npm install katex
      # Install Poetry and build the documentation
      - name: Install and configure Poetry
        uses: snok/install-poetry@v1
        with:
          version: 1.2.2
          virtualenvs-create: false
          virtualenvs-in-project: false
          installer-parallel: true

      - name: Install LaTeX
        run: |
          sudo apt-get update
          sudo apt-get install texlive-fonts-recommended texlive-fonts-extra texlive-latex-extra dvipng cm-super
      - name: Build the documentation with MkDocs
        run: |
          cp docs/examples/gpjax.mplstyle .
          poetry install --all-extras --with docs
          conda install pandoc
          poetry run mkdocs build
      - name: Deploy Page 🚀
        uses: JamesIves/[email protected]
        with:
          branch: gh-pages
          folder: site
167 changes: 157 additions & 10 deletions README.md
@@ -1,30 +1,177 @@
# A minimal library for preprocessing remote sensing data for machine learning applications (In Progress)
[![CodeFactor](https://www.codefactor.io/repository/github/jejjohnson/rs_tools/badge)](https://www.codefactor.io/repository/github/jejjohnson/rs_tools)
[![codecov](https://codecov.io/gh/jejjohnson/rs_tools/branch/main/graph/badge.svg?token=YGPQQEAK91)](https://codecov.io/gh/jejjohnson/rs_tools)
# `rs-tools`

> This package has some simple, minimal preprocessing of helio-data to make it machine learning ready.


## What are RS-Tools?

`rs_tools` is a toolbox of functions designed to
There is a high barrier to entry when working with remote sensing data for machine learning (ML) research.
This is especially true for level 1 data which is typically raw radiance observations.
There are often many domain-specific transformations that can completely make or break the success of the ML task.
`rs_tools` seeks to lower the barrier to entry for ML researchers so that they can make meaningful progress when dealing with remote sensing data.
It features a standardized, transparent and flexible procedure for defining data and evaluation pipelines for data-intensive level 1 data products.

---
***
### Agnostic Toolbox of Functions

We provide a suite of useful functions which can be used to clean level-1 remote sensing data to be used for downstream tasks.
It is an agnostic suite of functions that can be piped together to create preprocessing and evaluation chains.
We take care of all of the nitty-gritty details which are often common for these types of datasets.
However, we take care not to hard-code anything and try to be as transparent as possible so that users can understand and modify the scripts for their own use cases.

***
### Pipelines

We provide some hydra-integrated pipelines which allow users to do some high-level processing to produce ML-ready datasets.
We follow best practices to be as agnostic as possible so that users are not bound to any ML framework.
In addition, we provide many small bite-sized functions which users can piece together in their own way for their own applications.
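
As a rough illustration of what "hydra-integrated" means here: the satellite configs in this repository (for example `config/example/satellite/aqua.yaml`) name a function through a `_target_` key, and the remaining keys are passed to it as arguments. The sketch below mimics that lookup with the standard library only. `resolve_target` and `call_from_config` are hypothetical helper names, and the `datetime.date` target is a stand-in so the snippet runs anywhere; real pipelines would rely on `hydra.utils.instantiate` rather than this simplified version.

```python
import importlib
from typing import Any, Callable, Dict


def resolve_target(path: str) -> Callable[..., Any]:
    """Import the object named by a dotted path such as 'math.sqrt'."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def call_from_config(config: Dict[str, Any]) -> Any:
    """Call the `_target_` callable with all other keys as keyword arguments."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


# Stand-in config (hypothetical): a standard-library target replaces the
# rs_tools download function so that the sketch has no extra dependencies.
example = {"_target_": "datetime.date", "year": 2020, "month": 10, "day": 1}
result = call_from_config(example)
```

Keeping the function path in the config is what lets a command-line override such as `satellite=goes` swap the whole download function without touching the code.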


***
#### Data Downloader

With a few simple commands, we can download raw level 1 data products with minimal preprocessing.
We currently have data downloaders for [MODIS Level 1](https://spaceml-org.github.io/rs_tools/datasets/modis) data, [MSG Level 1](https://spaceml-org.github.io/rs_tools/datasets/msg) data, and [GOES16 Level 1](https://spaceml-org.github.io/rs_tools/datasets/goes/) data.


A user can get started right away by simply running the following snippet in the command line.

```bash
# GOES 16
python rs_tools satellite=goes stage=download
# MODIS - AQUA (or TERRA)
python rs_tools satellite=aqua stage=download
# MSG
python rs_tools satellite=msg stage=download
```


***
#### Analysis-Ready Data

We have scripts to generate some *analysis-ready data*.
These are datasets that have been harmonized under a common data structure.
We try to keep as much meta-data as possible which could be useful for downstream tasks, e.g., coordinates, time stamps, units and cloud masks.
A user can do further analysis on these datasets.

<center>
<img src="docs/assets/analysis_ready_data.png" alt="drawing" width="500"/>
</center>

A user can get started right away by simply running the following snippet in the command line.

```bash
# GOES16
python rs_tools satellite=goes stage=geoprocess
# MODIS - AQUA (or TERRA)
python rs_tools satellite=aqua stage=geoprocess
# MSG
python rs_tools satellite=msg stage=geoprocess
```

For more examples, see our pipelines sections for MODIS, MSG and GOES16.

***
#### Machine-Learning Ready Data

We also feature some *ML-Ready* data which is immediately ready for ML-specific tasks.
These are data that have already been divided into patches which sufficiently span the space for the ML task.
A user can use whichever ML dataset/dataloader framework they choose.



<center>
<img src="docs/assets/ml_ready_data.png" alt="drawing" width="500"/>
</center>

A user can get started right away by simply running the following snippet in the command line.

```bash
# GOES16
python rs_tools satellite=goes stage=patch
# MODIS - AQUA (or TERRA)
python rs_tools satellite=aqua stage=patch
# MSG
python rs_tools satellite=msg stage=patch
```
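
To make the patching idea concrete, here is a minimal sketch of stride-based patch extraction with a NaN cutoff. The argument names mirror the `patch_size`, `stride_size` and `nan_cutoff` keys in the satellite configs, but the function `extract_patches` is hypothetical, and reading `nan_cutoff` as the maximum allowed fraction of NaN pixels per patch is an assumption for illustration, not the library's implementation.

```python
import math
from typing import List

Grid = List[List[float]]


def extract_patches(
    image: Grid, patch_size: int, stride_size: int, nan_cutoff: float
) -> List[Grid]:
    """Slide a square window over a 2D grid and keep patches with few NaNs.

    A patch is kept when its fraction of NaN pixels is <= nan_cutoff
    (assumed meaning of the cutoff).
    """
    n_rows, n_cols = len(image), len(image[0])
    patches = []
    for top in range(0, n_rows - patch_size + 1, stride_size):
        for left in range(0, n_cols - patch_size + 1, stride_size):
            patch = [row[left:left + patch_size] for row in image[top:top + patch_size]]
            n_nan = sum(math.isnan(value) for row in patch for value in row)
            if n_nan / (patch_size * patch_size) <= nan_cutoff:
                patches.append(patch)
    return patches


nan = float("nan")
image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, nan, nan, 12.0],
    [13.0, nan, nan, 16.0],
]
# The two bottom patches are half NaN and are dropped at a cutoff of 0.25.
patches = extract_patches(image, patch_size=2, stride_size=2, nan_cutoff=0.25)
```

A stride smaller than the patch size would produce overlapping patches, which increases the number of samples at the cost of redundancy.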

***
#### Use Case: Instrument-to-Instrument Translation (Work In Progress)

We also feature an Instrument-2-Instrument translation use-case.
See [github/InstrumentToInstrument](https://github.com/RobertJaro/InstrumentToInstrument/tree/master) repo for more details.
In the rs-tools library, we have a simple example training script.


```bash
python rs_tools ...
```

***
## Installation

We can install it directly through pip
### Conda (Recommended)

We recommend using `conda` with the provided environment file.

```bash
pip install git+https://github.com/jejjohnson/rs_tools
conda env create -f environments/environment.yaml
conda activate rs_tools
```

We also use poetry for the development environment.
***
### Pip (Alpha-Version)

We can install via the github repo through pip.

```bash
git clone https://github.com/jejjohnson/rs_tools.git
pip install git+https://github.com/space-ml/rs_tools.git
```

**Warning**: This is an alpha version.

***
### Development Version



```bash
git clone https://github.com/space-ml/rs_tools.git
cd rs_tools
conda create -n rs_tools python=3.11 poetry
poetry install
```

!!! tip
We advise you to create a virtual environment before installing:

```bash
conda env create -f environment.yaml
conda activate rs_tools
```

and recommend you check your installation passes the supplied unit tests:

```bash
poetry run pytest tests/
```

***
### Instrument To Instrument (Work-in-Progress)

We have an example where we run inference using a pre-trained model from the ITI repo.
This requires installing the `itipy` repo directly.

We can use the dedicated environment file:


```bash
conda env create -f environments/environment_iti.yaml
conda activate rs_tools
```


Please see the [InstrumentToInstrument](https://github.com/spaceml-org/InstrumentToInstrument/tree/development-eo) repo with the [example](https://github.com/spaceml-org/InstrumentToInstrument/blob/development-eo/iti/train/msg_to_goes.py) for more details.



---
2 changes: 1 addition & 1 deletion config/example/satellite/aqua.yaml
@@ -1,6 +1,6 @@
download:
_target_: rs_tools._src.data.modis.downloader_aqua.download
save_dir: ${save_dir}/aqua/
save_dir: ${save_dir}/aqua/raw
start_date: ${period.start_date}
start_time: ${period.start_time}
end_date: ${period.end_date}
4 changes: 2 additions & 2 deletions config/example/satellite/goes.yaml
@@ -1,6 +1,6 @@
download:
_target_: rs_tools._src.data.goes.downloader_goes16.download
save_dir: ${save_dir}/goes16/
save_dir: ${save_dir}/goes16/raw
start_date: ${period.start_date}
start_time: ${period.start_time}
end_date: ${period.end_date}
@@ -26,4 +26,4 @@ patch:
patch_size: ${patch_size}
stride_size: ${stride_size}
nan_cutoff: ${nan_cutoff}
save_filetype: ${save_filetype}
save_filetype: ${save_filetype}
2 changes: 1 addition & 1 deletion config/example/satellite/msg.yaml
@@ -1,6 +1,6 @@
download:
_target_: rs_tools._src.data.msg.downloader_msg.download
save_dir: ${save_dir}/msg/
save_dir: ${save_dir}/msg/raw
start_date: ${period.start_date}
start_time: ${period.start_time}
end_date: ${period.end_date}
2 changes: 1 addition & 1 deletion config/example/satellite/terra.yaml
@@ -1,6 +1,6 @@
download:
_target_: rs_tools._src.data.modis.downloader_terra.download
save_dir: ${save_dir}/terra/
save_dir: ${save_dir}/terra/raw
start_date: ${period.start_date}
start_time: ${period.start_time}
end_date: ${period.end_date}
Empty file added config/main.yaml
Empty file.
Binary file added docs/assets/analysis_ready_data.png
Binary file added docs/assets/ml_ready_data.png
94 changes: 94 additions & 0 deletions docs/datasets/goes.md
@@ -0,0 +1,94 @@
# GOES 16

Below are some notes on NOAA's GOES satellites, specifically focussing on the [Advanced Baseline Imager](https://www.goes-r.gov/spacesegment/abi.html) (ABI).


## GOES Satellites

### GOES-16
- Launched on 19 November 2016, operational since 18 December 2017
- Longitude central point: -75.2

### GOES-17 [no longer operational]
- Launched on 1 March 2018, operational from 12 February 2019 to 4 January 2023 at longitude -136.9
- Replaced by GOES-18 due to issues with its Advanced Baseline Imager (ABI) instrument
- Moved to longitude -104.7 (between GOES-16 and GOES-18) and serves as backup for the operational satellites

### GOES-18
- Launched on 1 March 2022, operational since 4 January 2023 (replaced GOES-17)
- Longitude central point: -136.9

## [GOES Instruments](https://www.goes-r.gov/spacesegment/instruments.html)

Earth-facing:
- Advanced Baseline Imager (ABI)
- Geostationary Lightning Mapper (GLM)

Sun-facing:
- Extreme Ultraviolet and X-ray Irradiance Sensors (EXIS)
- Solar Ultraviolet Imager (SUVI)

Space environment:
- Magnetometer (MAG)
- Space Environment In-Situ Suite (SEISS)

## ABI Data

### Processing Levels

* Level-0: Raw instrument measurements
* Level-1B: Calibrated and geolocated radiances
* Level-2: Derived geophysical variables
* Level-3: Geophysical variables mapped on uniform space-time grid

### Level-1B: Spectral Bands & Resolution

<img width="617" alt="GOES-ABI-bands" src="https://github.com/spaceml-org/rs_tools/assets/33373979/93d673d1-2eca-4a4b-84a9-d9135e7d0dd7">

### Level 2: Clear Sky Mask (ACM)
The [clear sky mask algorithm](https://www.star.nesdis.noaa.gov/goesr/documents/ATBDs/Enterprise/ATBD_Enterprise_Cloud_Mask_v1.2_2020_10_01.pdf) uses the GOES ABI visible, near-infrared and infrared bands to automatically assign one of the following 4 classes to each pixel:
- cloudy
- probably cloudy
- probably clear
- clear

ACM data is provided at the native 2km resolution on the ABI fixed grid for full disk, CONUS, and mesoscale coverage regions, at the same temporal resolution as ABI L1b data.
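
For many downstream tasks the four classes are collapsed into a binary cloud mask. The sketch below shows one way to do that. The integer coding (0 = clear up to 3 = cloudy) is an assumption here and should be checked against the flag metadata of the actual file; `to_binary_cloud_mask` is a hypothetical helper, not part of `rs_tools`.

```python
from typing import List

# Assumed integer coding of the 4-level mask; verify against the file metadata.
ACM_CLASSES = {0: "clear", 1: "probably clear", 2: "probably cloudy", 3: "cloudy"}


def to_binary_cloud_mask(acm: List[List[int]], cloudy_from: int = 2) -> List[List[bool]]:
    """Return True where the class code is >= cloudy_from.

    With the default threshold, 'probably cloudy' and 'cloudy' count as cloud.
    """
    return [[code >= cloudy_from for code in row] for row in acm]


acm = [
    [0, 1, 2],
    [3, 2, 0],
]
mask = to_binary_cloud_mask(acm)
strict_mask = to_binary_cloud_mask(acm, cloudy_from=3)  # only confident cloud
```

Where the threshold sits is a design choice: a conservative clear-sky dataset would also treat "probably clear" as cloud by setting `cloudy_from=1`.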

### Naming Conventions
GOES ABI Level 1b and 2 data are named according to the following [naming conventions](https://cimss.ssec.wisc.edu/goes/ABI_File_Naming_Conventions.pdf):

`<SE>_<DSN>_<PID>_<Obs Start Date & Time>_<Obs End Date & Time>_<Creation Date & Time>.<FE>`

where:
- SE = System Environment
- DSN = Data Short Name
- PID = Platform Identifier
- Obs Start Date & Time = Observation Period Start Date & Time
- Obs End Date & Time = Observation Period End Date & Time
- Creation Date & Time = File Creation Date & Time
- FE = File Extension
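
The convention above can be parsed with a short regular expression. This is a minimal sketch: it assumes each timestamp is prefixed with `s`, `e` or `c` and encoded as year, day of year, hour, minute, second and tenths of a second (14 digits). The helper `parse_goes_filename` and the example filename are illustrative, not taken from `rs_tools` or from a real file listing.

```python
import re
from datetime import datetime
from typing import Any, Dict

# Timestamps: YYYY + day of year + HHMMSS + tenths of a second (14 digits).
GOES_PATTERN = re.compile(
    r"^(?P<system_environment>[^_]+)_"
    r"(?P<data_short_name>[^_]+)_"
    r"(?P<platform>[^_]+)_"
    r"s(?P<start>\d{14})_"
    r"e(?P<end>\d{14})_"
    r"c(?P<created>\d{14})\."
    r"(?P<extension>\w+)$"
)


def _parse_time(stamp: str) -> datetime:
    """Parse 'YYYYJJJHHMMSSs'; the trailing tenth of a second is dropped."""
    return datetime.strptime(stamp[:13], "%Y%j%H%M%S")


def parse_goes_filename(filename: str) -> Dict[str, Any]:
    """Split a GOES ABI filename into its named fields."""
    match = GOES_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a recognised GOES ABI filename: {filename!r}")
    fields: Dict[str, Any] = match.groupdict()
    for key in ("start", "end", "created"):
        fields[key] = _parse_time(fields[key])
    return fields


# Illustrative filename constructed to follow the convention.
example = "OR_ABI-L1b-RadF-M6C01_G16_s20202750000195_e20202750009503_c20202750009567.nc"
fields = parse_goes_filename(example)
```

Here `fields["platform"]` is `G16`, and the start time resolves to day 275 of 2020, i.e. 1 October 2020.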

## Working with Level-1B Data

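GOES/ABI radiances are provided in $mW/m^2/sr/cm^{-1}$, i.e. the data is normalised to wavenumber. Converting the data to $W/m^2/sr/\mu m$ takes two factors: $10^{-7}$ for the unit prefixes ($1\,mW = 10^{-3}\,W$, $1\,cm = 10^{4}\,\mu m$), and $\tilde{\nu}^2$ (with the band wavenumber $\tilde{\nu}$ in $cm^{-1}$) for the change from per-wavenumber to per-wavelength intervals, so that $L_\lambda = 10^{-7}\,\tilde{\nu}^2\,L_{\tilde{\nu}}$.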
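
A minimal Python sketch of converting per-wavenumber radiance to per-wavelength radiance follows. It assumes a single representative wavenumber per band. Because $\tilde{\nu} = 1/\lambda$, the interval widths are related by $|d\tilde{\nu}/d\lambda| = \tilde{\nu}^2$, which enters alongside the $10^{-7}$ unit-prefix factor. The function names are illustrative and not part of `rs_tools`.

```python
def wavelength_um_to_wavenumber_cm(wavelength_um: float) -> float:
    """Convert wavelength in um to wavenumber in cm^-1 (1 cm = 1e4 um)."""
    return 1e4 / wavelength_um


def wavenumber_to_wavelength_radiance(radiance_mw: float, wavenumber_cm: float) -> float:
    """Convert radiance from mW/m^2/sr/cm^-1 to W/m^2/sr/um.

    Multiplying by wavenumber^2 gives mW/m^2/sr per cm of wavelength;
    the factor 1e-7 then applies mW -> W (1e-3) and per-cm -> per-um (1e-4).
    """
    return radiance_mw * wavenumber_cm ** 2 * 1e-7


# Example: a radiance of 100 mW/m^2/sr/cm^-1 at a wavelength of 10 um.
wavenumber = wavelength_um_to_wavenumber_cm(10.0)
radiance = wavenumber_to_wavelength_radiance(100.0, wavenumber)
```

In this example the wavenumber is 1000 $cm^{-1}$, and the result is approximately 10 $W/m^2/sr/\mu m$.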

## Data Format & Access
GOES Data can be explored in the following buckets:

AWS:
- [GOES-16 AWS S3 Explorer](https://noaa-goes16.s3.amazonaws.com/index.html)
- [GOES-17 AWS S3 Explorer](https://noaa-goes17.s3.amazonaws.com/index.html)
- [GOES-18 AWS S3 Explorer](https://noaa-goes18.s3.amazonaws.com/index.html)

Google Cloud:
- [GOES-16 Google Cloud Bucket Explorer](https://console.cloud.google.com/storage/browser/gcp-public-data-goes-16)
- [GOES-17 Google Cloud Bucket Explorer](https://console.cloud.google.com/storage/browser/gcp-public-data-goes-17)
- [GOES-18 Google Cloud Bucket Explorer](https://console.cloud.google.com/storage/browser/gcp-public-data-goes-18)
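
Objects in these buckets are, as far as we are aware, organised by product, year, day of year and hour. The sketch below builds such a prefix for browsing or listing. The layout `<product>/<YYYY>/<DDD>/<HH>/` and the product name `ABI-L1b-RadF` are stated here as assumptions, and `goes_prefix` and `goes_s3_uri` are hypothetical helpers rather than functions from `rs_tools`.

```python
from datetime import datetime


def goes_prefix(product: str, when: datetime) -> str:
    """Build '<product>/<YYYY>/<DDD>/<HH>/' for a given UTC hour."""
    return f"{product}/{when:%Y}/{when:%j}/{when:%H}/"


def goes_s3_uri(satellite: int, product: str, when: datetime) -> str:
    """Build an S3 URI in a 'noaa-goes<NN>' bucket for the given hour."""
    return f"s3://noaa-goes{satellite}/{goes_prefix(product, when)}"


when = datetime(2020, 10, 1, 12)
prefix = goes_prefix("ABI-L1b-RadF", when)
uri = goes_s3_uri(16, "ABI-L1b-RadF", when)
```

The day of year is zero-padded to three digits, so 1 October 2020 becomes `275` and 5 January becomes `005`.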

## Software Tools
> [GOES2GO](https://blaylockbk.github.io/goes2go/_build/html/index.html) - Software download
* allows downloading of GOES data from AWS



## Q/A