Commands

Scrape Races

Retrieve and process race data from a web datasource, JSON file or spreadsheet.

python manage.py scrape input_source [RACE_ID [RACE_ID ...]] \
	[-c, --club CLUB] \
	[-e, --entity ENTITY] \
	[-f, --flag FLAG] \
	[-y, --year YEAR] \
	[-sy, --start-year YEAR] \
	[-t, --table TABLE] \
	[-g, --gender] \
	[-ca, --category CATEGORY] \
	[--sheet-id SHEET_ID] \
	[--sheet-name SHEET_NAME] \
	[--file-path FILE_PATH] \
	[-i, --ignore ID [ID ...]] \
	[-w, --last-weekend] \
	[--force-gender] \
	[--force-category] \
	[--save-old] \
	[-o, --output OUTPUT]

# positional arguments:
#   input_source          name of the Datasource or path to import data from.
#   race_ids              raceIDs to find in the source and ingest.
#
# options:
#   -c CLUB, --club CLUB
#                         clubID for which races should be imported.
#   -e ENTITY, --entity ENTITY
#                         entityID for which races should be imported.
#   -f FLAG, --flag FLAG
#                         flagID for which races should be imported.
#   -y YEAR, --year YEAR  year for which races should be imported, 'all' to import from the source beginnig.
#   -sy START_YEAR, --start-year START_YEAR
#                         year for which we should start processing years. Only used with year='all'.
#   --sheet-id SHEET_ID
#                         google-sheet ID used for TABULAR datasource.
#   --sheet-name SHEET_NAME
#                         google-sheet name used for TABULAR datasource.
#   --file-path FILE_PATH
#                         sheet file path used for TABULAR datasource.
#   -t TABLE, --table TABLE
#                         table of the race for multipage races.
#   -g GENDER, --gender GENDER
#                         gender filter.
#   -ca CATEGORY, --category CATEGORY
#                         category filter.
#   -i [IGNORE ...], --ignore [IGNORE ...]
#                         raceIDs to ignore during ingestion.
#   -w, --last-weekend
#                         fetches the races for the last weekend.
#   --force-gender
#                         forces the gender to match.
#   --force-category
#                         forces the category to match.
#   --save-old
#                         automatically saves the races before 2003 without asking.
#   -o OUTPUT, --output OUTPUT
#                         Outputs the race data to the given folder path in JSON format.

Examples

# Scrape last weekend races from the LGT datasource forcing the gender and category match in the DB.
python manage.py scrape lgt -w --force-gender --force-category

Recheck Races

Recheck the already full imported flags to find new races.

python manage.py scrape datasource \
	[-f, --flag FLAG] \
	[-g, --gender] \
	[-ca, --category CATEGORY] \
	[--check-participants] \
	[--force-gender] \
	[--force-category]

# positional arguments:
#   datasource            name of the Datasource.
#
# options:
#   -f FLAG, --flag FLAG
#                         flagID for which races should be imported.
#   -g GENDER, --gender GENDER
#                         gender filter.
#   -ca CATEGORY, --category CATEGORY
#                         category filter.
#   --check-participants
#                         checks if the number of participants matches.
#   --force-gender
#                         forces the gender to match.
#   --force-category
#                         forces the category to match.

Plot Data

Use boxplot graphs to visualize each year speeds.

python manage.py plot type \
	[-i, --index INDEX] \
	[-c, --club CLUB_ID] \
	[-l, --league LEAGUE_ID] \
	[-f, --flag FLAG_ID] \
	[-g, --gender GENDER] \
	[-ca, --category CATEGORY] \
	[-y, --years YEARS] \
	[-d, --day DAY] \
	[-n, --normalize] \
	[--leagues-only] \
	[--branch-teams] \
	[-o, --output FILE]

# positional arguments:
#   type                  plot type ['boxplot', 'line', 'nth'].
#
# options:
#   -i INDEX, --index INDEX
#                         position to plot the speeds in 'nth' charts.
#   -c CLUB, --club CLUB
#                         club ID for which to load the data.
#   -l LEAGUE, --league LEAGUE
#                         league ID for which to load the data.
#   -f FLAG, --flag FLAG
#                         flag ID for which to load the data.
#   -g GENDER, --gender GENDER
#                         gender filter.
#   -ca CATEGORY, --category CATEGORY
#                         category filter.
#   -y [YEARS ...], --years [YEARS ...]
#                         years to include in the data.
#   -d DAY, --day DAY
#                         day of the race for multiday races.
#   --leagues-only
#                         only races from a league.
#   --branch-teams
#                         filter only branch teams.
#   -n, --normalize
#                         exclude outliers based on the speeds' standard deviation.
#   -o OUTPUT, --output OUTPUT
#                         saves the output plot.

Examples

# Plot the winner speed of each race for the league 5 in 2015, 2016, 2017, and 2018.
# The plot will be saved in the Downloads folder with the name test.png.
python manage.py plot nth --league 5 -i 1 -y 2015 2016 2017 2018 -o ~/Downloads/p.png

# Plot the normalized league speeds of the Puebla team for all the years.
python manage.py plot -c 25 --leagues-only -n -o ~/Downloads/p.png

# Plot all AVG speeds per year for the league 5.
python manage.py plot --league 5 -o ~/Downloads/p.png

# Plot the speeds of the Puebla team for the league 5 in 2021, 2022, and 2023.
python manage.py plot line -c 25 --league 5 -y 2021 2022 2023 -o ~/Downloads/p.png

# Plot the speeds of the Puebla team for the flag 12 in 2021, 2022, and 2023.
python manage.py plot line -f 12 -y 2021 2022 2023 -o ~/Downloads/p.png

# Plot all leagues AVG speeds per year.
# The plot will be saved in the Downloads folder with a generated name.
parallel -j 11 python manage.py plot --league {} -o ~/Downloads/l{}.png ::: $(seq 1 11)

Development

Environment variables

Default .env file.

DEBUG=False
SECRET_KEY=what-a-fake-secret-key-lol

## DATABASE
DATABASE_HOST=localhost
DATABASE_NAME=postgres
DATABASE_USERNAME=postgres
DATABASE_PASSWORD=postgres

## EMAIL
EMAIL_HOST_USER=,
EMAIL_HOST_PASSWORD=,

Backup Database to YAML

python manage.py dumpdata > <name>.yaml --format yaml --exclude admin.logentry --exclude auth --exclude sessions --exclude contenttypes

Create a fixture to use in tests

python manage.py dumpdata races.race --pks=<PK> > race.json
python manage.py dumpdata participants.participant --pks=<PK1,PK2,...> > participants.json
jq -s '.[0] + .[1]' race.json participants.json > combined.json
jq -s '.[0] + .[1]' fixtures/frozen-db.json combined.json > copy.json
mv copy.json fixtures/frozen-db.json
rm race.json participants.json combined.json

Name		Name	Last commit message	Last commit date
Latest commit History 180 Commits
.github		.github
apps		apps
config		config
fixtures		fixtures
logs		logs
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.ruff.toml		.ruff.toml
.testenv		.testenv
CHANGELOG.md		CHANGELOG.md
LICENSE.md		LICENSE.md
README.md		README.md
manage.py		manage.py
utils.sql		utils.sql

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Commands

Scrape Races

Examples

Recheck Races

Plot Data

Examples

Development

Environment variables

Backup Database to YAML

Create a fixture to use in tests

About

Releases 3

Packages

Contributors 2

Languages

License

iagocanalejas/regatas

Folders and files

Latest commit

History

Repository files navigation

Commands

Scrape Races

Examples

Recheck Races

Plot Data

Examples

Development

Environment variables

Backup Database to YAML

Create a fixture to use in tests

About

Resources

License

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Languages

Packages