
A [Krawler](https://kalisio.github.io/krawler/) based service to download data from the French open data portal [Hub'Eau](https://hubeau.eaufrance.fr/).

## K-hubeau-hydro

The **k-hubeau-hydro** jobs scrape hydrometric data from the following API: [http://hubeau.eaufrance.fr/page/api-hydrometrie](http://hubeau.eaufrance.fr/page/api-hydrometrie). The downloaded data are stored in a [MongoDB](https://www.mongodb.com/) database, in 3 collections (a read sketch follows the list):
* the `observations` collection stores the observed data:
  * the water level `H` in meters (m)
  * the water flow `Q` in cubic meters per second (m3/s)
* the `stations` collection stores the station data
* the `predictions` collection stores the predicted data:
  * the water level `H` in meters (m)
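
For instance, here is a minimal read sketch using the official MongoDB Node.js driver, assuming the default `DB_URL` documented in the configuration tables below; only the `H` and `Q` fields are documented in this README, so the `time` field used for sorting is an assumption:

```js
// Minimal read sketch (not part of the project) using the official MongoDB
// Node.js driver. Only the H and Q fields are documented in this README;
// the 'time' field used for sorting is an assumption.
import { MongoClient } from 'mongodb'

const client = new MongoClient(process.env.DB_URL || 'mongodb://127.0.0.1:27017/hubeau')
await client.connect()
const observations = client.db().collection('observations')
// Latest measurements that carry a water level (H, in meters)
const latest = await observations
  .find({ H: { $exists: true } })
  .sort({ time: -1 }) // assumed field name
  .limit(10)
  .toArray()
console.log(latest)
await client.close()
```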

The project consists of 3 jobs (see the cron sketch below):
* the `stations` job scrapes the station data according to a specific cron expression, by default every day at midnight.
* the `observations` job scrapes the observations according to a specific cron expression, by default every 15 minutes.
* the `prediction` job generates predictions of future water levels.
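
As an illustration only, the default schedules described above correspond to the following standard cron expressions (the exact option names used by the jobs are not shown in this README):

```js
// Illustrative only: standard cron expressions matching the default schedules
// described above; the prediction job has no documented default schedule.
const defaultSchedules = {
  stations: '0 0 * * *',        // every day at midnight
  observations: '*/15 * * * *'  // every 15 minutes
}
```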

### Stations

| `TIMEOUT` | The maximum duration of the job. It must be in milliseconds and the default value is `1 800 000` (30 minutes). |
| `DEBUG` | Enables debug output. Set it to `krawler*` to enable full output. By default it is undefined. |

## K-hubeau-piezo

The **k-hubeau-piezo** jobs scrape piezometric data from the following API: [http://hubeau.eaufrance.fr/page/api-piezometrie](http://hubeau.eaufrance.fr/page/api-piezometrie). The downloaded data are stored in a [MongoDB](https://www.mongodb.com/) database, in 2 collections (an illustrative document sketch follows the list):

* the `observations` collection stores the observed data:
  * the water table depth `profondeur_nappe` in meters (m)
  * the water table level in the NGF reference system `niveau_eau_ngf` in meters (m)
* the `stations` collection stores the station data
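
As a purely illustrative sketch, an `observations` document might look like the following; only the two measurements above are documented in this README, so the other field names are assumptions:

```js
// Purely illustrative shape of a piezometric observation document.
// Only profondeur_nappe and niveau_eau_ngf are documented in this README;
// the station code and time fields are assumptions.
const exampleObservation = {
  code_bss: '07548X0009/F',                // hypothetical station identifier
  time: new Date('2023-05-04T06:00:00Z'),  // hypothetical measurement time
  profondeur_nappe: 12.4,                  // water table depth, in meters (m)
  niveau_eau_ngf: 102.7                    // water table level (NGF), in meters (m)
}
```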

The project consists of 2 jobs:
* the `stations` job scrapes the station data according to a specific cron expression, by default every day at midnight.
* the `observations` job scrapes the observations according to a specific cron expression, by default every hour, at 15 minutes past the hour.

### Stations
| Variable | Description |
|--- | --- |
| `DB_URL` | The database URL. The default value is `mongodb://127.0.0.1:27017/hubeau` |
| `CODE_DEP` | A list of department codes used to filter the stations (e.g. `"75", "92"`). The default is all 101 French departments. |
| `DATE_FIN_MESURE` | The cutoff date: stations whose last measurement is older than this date are considered inactive. The default value is `2022-01-01`. |
| `DEBUG` | Enables debug output. Set it to `krawler*` to enable full output. By default it is undefined. |
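
A minimal sketch, not the project's actual code, of how these variables could be read with the defaults documented above:

```js
// Sketch only: reading the 'stations' job configuration with the defaults
// documented in the table above. Not the project's actual code.
const dbUrl = process.env.DB_URL || 'mongodb://127.0.0.1:27017/hubeau'
// CODE_DEP is a list such as "75", "92"; by default all 101 French
// departments are kept, represented here as an empty filter.
const codeDep = process.env.CODE_DEP
  ? process.env.CODE_DEP.split(',').map(code => code.trim().replace(/"/g, ''))
  : []
// Stations whose last measurement is older than this date are considered inactive.
const dateFinMesure = process.env.DATE_FIN_MESURE || '2022-01-01'
console.log({ dbUrl, codeDep, dateFinMesure })
```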

### Observations
| Variable | Description |
|--- | --- |
| `DB_URL` | The database URL. The default value is `mongodb://127.0.0.1:27017/hubeau` |
| `TTL` | The observations data time to live. It must be expressed in seconds and the default value is `604 800` (7 days). |
| `HISTORY` | The duration of the observations data history the job has to download. It must be expressed in milliseconds (should be full days) and the default value is `86 400 000` (1 day). |
| `TIMEOUT` | The maximum duration of the job. It must be in milliseconds and the default value is `1 800 000` (30 minutes). |
| `DEBUG` | Enables debug output. Set it to `krawler*` to enable full output. By default it is undefined. |
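
Similarly, here is a sketch of how the documented durations relate to each other; the TTL index mentioned in the comment is only one common way of expiring documents in MongoDB and is an assumption about the implementation:

```js
// Sketch only: interpreting the 'observations' job durations with the
// defaults documented in the table above. Not the project's actual code.
const ttl = Number(process.env.TTL || 604800)            // seconds (7 days)
const history = Number(process.env.HISTORY || 86400000)  // milliseconds (1 day)
const timeout = Number(process.env.TIMEOUT || 1800000)   // milliseconds (30 minutes)

// Start of the download window: 'history' milliseconds before now.
const startDate = new Date(Date.now() - history)

// One common way to enforce a TTL in MongoDB is a TTL index, e.g.:
// await db.collection('observations').createIndex({ time: 1 }, { expireAfterSeconds: ttl })
// ('time' is a hypothetical field name; the actual mechanism is not documented here.)
console.log({ ttl, history, timeout, startDate })
```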



## Deployment

We personally use [Kargo](https://kalisio.github.io/kargo/) to deploy the service.
