Staging/detran #334

Merged
merged 287 commits into from
Jan 29, 2024
48fc127
Add packages (Polars and Excel readers)
tamireinhorn Apr 19, 2023
b7b2db4
Create utility functions for processing files
tamireinhorn Apr 19, 2023
47c6c97
Notebook to demonstrate data cleaning
tamireinhorn Apr 19, 2023
a854ccd
Cleaning works for uf tipo
tamireinhorn Apr 19, 2023
f74c3e0
Start file for UF Tipo treatment
tamireinhorn Apr 20, 2023
42b07d6
Better handling of directory changes
tamireinhorn Apr 20, 2023
155d75a
Reorganize testing into two distinct types
tamireinhorn Apr 20, 2023
66afce1
More test rearranging
tamireinhorn Apr 20, 2023
5aab93c
Revert bugged discovery
tamireinhorn Apr 20, 2023
427ae97
Revert "Revert bugged discovery"
tamireinhorn Apr 20, 2023
73c8cad
Discovery works
tamireinhorn Apr 20, 2023
6df4a89
Pre commit enforcing py 3.9
tamireinhorn Apr 21, 2023
1d43c57
Add function to verify denatran total
tamireinhorn Apr 21, 2023
30ca4cd
Wide to long now checks and removes total
tamireinhorn Apr 21, 2023
830ad07
Adds packages for manage.py to work
tamireinhorn Apr 21, 2023
7e6303d
Create pipeline folder structure
tamireinhorn Apr 21, 2023
7779e0a
Just for stashing
tamireinhorn Apr 22, 2023
bdcfe48
Create constants file
tamireinhorn Apr 23, 2023
44e39b4
Add more utility functions
tamireinhorn Apr 23, 2023
031d4de
Migrate code to proper folder
tamireinhorn Apr 23, 2023
635c61f
More constants
tamireinhorn Apr 23, 2023
8e089a7
Rename substitutions ruleset
tamireinhorn Apr 23, 2023
2fe17fe
Cleanup
tamireinhorn Apr 23, 2023
a25ce63
Cleanup crew
tamireinhorn Apr 23, 2023
2516646
More migration of code
tamireinhorn Apr 23, 2023
60bdf9b
Small refactoring
tamireinhorn Apr 23, 2023
0a07c40
More reorganizing
tamireinhorn Apr 23, 2023
2ae6fa5
Docstrings for all utils
tamireinhorn Apr 23, 2023
15cec37
Fix documentation in utils
tamireinhorn Apr 23, 2023
0ff97de
Apply better fuzzy matching for strings
tamireinhorn Apr 23, 2023
30f19d2
Add package to speed up string comparisons
tamireinhorn Apr 23, 2023
0bc7fd6
Maybe difflib was better
tamireinhorn Apr 23, 2023
882e8be
All rules done afaik
tamireinhorn Apr 23, 2023
a55714e
Actual decent file to be tested
tamireinhorn Apr 23, 2023
35d3ebc
Imports cleanup
tamireinhorn Apr 23, 2023
ba51ac5
Added guard
tamireinhorn Apr 23, 2023
da00c6d
Clean up
tamireinhorn Apr 25, 2023
baceabc
Reorganizing
tamireinhorn Apr 25, 2023
293364b
Testing more stuff
tamireinhorn Apr 26, 2023
dda7427
Coverage back
tamireinhorn Apr 26, 2023
19d276f
Increasing coverage and cleanup of guess header
tamireinhorn Apr 26, 2023
9faed06
Reduce exposure
tamireinhorn Apr 26, 2023
514f156
More testing
tamireinhorn Apr 26, 2023
02cd15a
Erroring when appropriate for municipio_tipo
tamireinhorn Apr 26, 2023
5fb737f
This is useless
tamireinhorn Apr 26, 2023
4d2f644
Directories are again working!!!!
tamireinhorn Apr 26, 2023
5896ac9
All constants
tamireinhorn Apr 29, 2023
49e668d
Merge branch 'master' into tamir_br_denatran_frota
tamireinhorn Apr 29, 2023
a42845c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 29, 2023
70d7d64
Add utils function to correct folder
tamireinhorn Apr 29, 2023
5fd8a79
Redo lock file
tamireinhorn May 2, 2023
26fe595
Relocate test and all works
tamireinhorn May 2, 2023
f56900a
Correct importing
tamireinhorn May 2, 2023
2624e48
Fix import yet again
tamireinhorn May 2, 2023
1283baf
Add todo list for myself and ignore it.
tamireinhorn May 2, 2023
4a37eea
First task
tamireinhorn May 2, 2023
2be72b0
Extra arguments for task
tamireinhorn May 2, 2023
0e62488
Dirty test clean up + flow start
tamireinhorn May 3, 2023
0e495bc
Tests are the bomb
tamireinhorn May 3, 2023
7d0ff67
Clean up task tests
tamireinhorn May 3, 2023
f69fadb
Works as is
tamireinhorn May 3, 2023
f493824
Better encapsulation, all tests pass
tamireinhorn May 3, 2023
e6a7c00
flows goes
tamireinhorn May 3, 2023
2e986b8
Changes in header guess uses other flow for test
tamireinhorn May 5, 2023
f0a3a68
Adjustment in tests after adjusting function
tamireinhorn May 5, 2023
6fb1556
Cleaner task test
tamireinhorn May 5, 2023
f6e1f8c
Constant for the CSVs
tamireinhorn May 5, 2023
87aa026
Debugging via workaround
tamireinhorn May 5, 2023
d2a5eec
Adjustment still makes tests pass
tamireinhorn May 5, 2023
63b9998
No test side effects
tamireinhorn May 5, 2023
9b3665b
RAR files work now
tamireinhorn May 5, 2023
e3032f6
Awesome trick for reusing code for zip and rar
tamireinhorn May 5, 2023
d3d9599
No need for specific zip function anymore
tamireinhorn May 5, 2023
d88e64e
Treat 2013 exception
tamireinhorn May 5, 2023
0c9aba7
Merge branch 'master' into tamir_br_denatran_frota
tamireinhorn May 5, 2023
0d8cea2
Modifications to deal with pre 2013 data
tamireinhorn May 7, 2023
20cc2b1
Pre-2013: crude but getting there
tamireinhorn May 7, 2023
653fcf4
2012 extraction works!
tamireinhorn May 7, 2023
ab1f39c
2011 also works now
tamireinhorn May 7, 2023
c474f0c
Fixing toml, lock + adding code2flow
tamireinhorn May 8, 2023
aa22448
Refactoring for cleaner 2010-2012 code
tamireinhorn May 8, 2023
3644bb0
Better encapsulation
tamireinhorn May 9, 2023
b7d1c09
Trying to solve 2009 and down
tamireinhorn May 9, 2023
f5bb766
Adhoc treatment for pre 2010
tamireinhorn May 9, 2023
4800468
Almost there
tamireinhorn May 10, 2023
b011ce1
Works from 2005 onwards
tamireinhorn May 10, 2023
ac0117c
2005 is the best I have so far
tamireinhorn May 10, 2023
80cb580
And now we also have 2004
tamireinhorn May 10, 2023
af18092
Almost 2003
tamireinhorn May 10, 2023
2d9fbf7
Almost 2003
tamireinhorn May 10, 2023
8ac4dcf
2003 extraction works
tamireinhorn May 10, 2023
e09bca8
Extraction task is fully functional
tamireinhorn May 10, 2023
8c50f07
Rename task for consistency
tamireinhorn May 10, 2023
ea8d254
Begin testing of UF tipo treatment
tamireinhorn May 10, 2023
cbed880
Treatment for UF tipo works up for 2009->
tamireinhorn May 10, 2023
93c1370
I have become Death, the destroyer of Worlds.
tamireinhorn May 12, 2023
8dd26b9
I'm going to hell
tamireinhorn May 12, 2023
bb4f0be
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 12, 2023
e2ec859
Trying to unleash R inside Python
tamireinhorn May 15, 2023
8fd4b2a
Treatment working except 2003-2004
tamireinhorn May 15, 2023
a3c37fa
Adjust parameter for old data
tamireinhorn May 15, 2023
4b9b74b
uf tipo is done
tamireinhorn May 15, 2023
2d0f30a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 15, 2023
3a8e207
Add task for outputting file to CSV
tamireinhorn May 16, 2023
c4e6a74
Small error
tamireinhorn May 16, 2023
d65706d
WIP for treating municipality data
tamireinhorn May 16, 2023
a5fc92d
Add export task
tamireinhorn May 17, 2023
11d9f04
Change constants to keep things standard
tamireinhorn May 17, 2023
242933e
Add task for getting the extracted file
tamireinhorn May 17, 2023
048e5e5
Create first flow
tamireinhorn May 17, 2023
f949e0c
Ignore my local run file
tamireinhorn May 17, 2023
ef75009
The flow should work but...
tamireinhorn May 17, 2023
98c6217
yeah the flow still sucks i'm stupid
tamireinhorn May 17, 2023
faeda9f
The first flow gets to Google Cloud
tamireinhorn May 19, 2023
39300de
Type hints in handlers.py
tamireinhorn May 19, 2023
9fd4102
Add type hints to task
tamireinhorn May 19, 2023
d3bd53a
Cleanup
tamireinhorn May 19, 2023
fa1e04b
Guess header refactor
tamireinhorn May 19, 2023
c65d804
Fixed function call
tamireinhorn May 19, 2023
ae14b15
Create function for UF Tipo and handler
tamireinhorn May 19, 2023
a7c9ae9
Adds docstring checker for linting
tamireinhorn May 21, 2023
bb8c9ab
General cleanup
tamireinhorn May 21, 2023
c45d045
Documentations in progress
tamireinhorn May 21, 2023
9baf530
More cleanup
tamireinhorn May 21, 2023
570ce51
Using enums + cleanup
tamireinhorn May 21, 2023
4fea4ed
2023 works for treating municipality
tamireinhorn May 22, 2023
325398f
Working from 2016 onwards
tamireinhorn May 22, 2023
ed3343a
Working from 2014 onwards
tamireinhorn May 22, 2023
2990dd8
2010 has a total error?
tamireinhorn May 22, 2023
038bdca
Historical data is messy, adjust util for that
tamireinhorn May 23, 2023
1704472
More substitutions rules for a lot of messy data
tamireinhorn May 23, 2023
e45fec7
Adjustments for historical data
tamireinhorn May 23, 2023
5b9642b
Treatment works!!!!
tamireinhorn May 23, 2023
cbe964d
Flow is done
tamireinhorn May 24, 2023
ba325bd
Fix small error in quadriciclo data
tamireinhorn May 25, 2023
d339e79
Ignore credentials
tamireinhorn May 30, 2023
b908671
Merge branch 'master' into tamir_br_denatran_frota
mergify[bot] Jun 14, 2023
dff5dc7
Merge pull request #310 from tamireinhorn/tamir_br_denatran_frota
lucascr91 Jun 14, 2023
bcb1502
Merge branch 'master' into staging/detran
lucascr91 Jun 15, 2023
a3fcb06
Adjust flow
tamireinhorn Jun 16, 2023
450868c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 16, 2023
23deaf2
Savior commit
tamireinhorn Jun 16, 2023
9a3d2de
Push the right config to read from BD
tamireinhorn Jun 16, 2023
9262ee9
Reallocate logs
tamireinhorn Jun 17, 2023
da3c660
New test
tamireinhorn Jun 20, 2023
f14f975
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 20, 2023
8b9638c
Try mapping for a flow
tamireinhorn Jun 22, 2023
0836c77
Overwrite the table (should raise an error)
tamireinhorn Jun 22, 2023
6fc2193
Missing map
tamireinhorn Jun 22, 2023
ef71443
register flow
folhesgabriel Jun 24, 2023
90bff91
test: access datasus ftp using prefect
folhesgabriel Apr 10, 2023
6fc2241
re register flow
folhesgabriel Jul 4, 2023
c3ed460
Merge branch 'master' into staging/detran
folhesgabriel Jul 6, 2023
eaeacfe
re register
folhesgabriel Jul 6, 2023
0e96d9d
I don't know
tamireinhorn Jul 9, 2023
9922962
uh
tamireinhorn Jul 10, 2023
6e38a4f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 10, 2023
944d931
Test for the new task
tamireinhorn Jul 10, 2023
e5f50b9
Modify flow accordingly
tamireinhorn Jul 10, 2023
99617f9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 10, 2023
04c85cc
Merge branch 'staging/detran' of https://github.com/basedosdados/pipe…
tamireinhorn Jul 10, 2023
15e34a7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 10, 2023
e1780b3
Error handling
tamireinhorn Jul 10, 2023
4f5368b
Merge branch 'staging/detran' of https://github.com/basedosdados/pipe…
tamireinhorn Jul 10, 2023
38428ab
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 10, 2023
a67db8a
Try using year and month from prod
tamireinhorn Jul 10, 2023
2e5aadd
Merge branch 'staging/detran' of https://github.com/basedosdados/pipe…
tamireinhorn Jul 10, 2023
a2ca388
Oops
tamireinhorn Jul 10, 2023
68771ee
Merge branch 'staging/detran' of https://github.com/basedosdados/pipe…
tamireinhorn Jul 10, 2023
0cf432c
ah ok
tamireinhorn Jul 10, 2023
63b8e8f
logs
tamireinhorn Jul 11, 2023
b7ae3bd
simpler
tamireinhorn Jul 12, 2023
77cb983
Merge branch 'master' into staging/detran
tamireinhorn Jul 12, 2023
c165835
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 12, 2023
7a8e33c
Flows flows flows
tamireinhorn Jul 12, 2023
99f1b6a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 12, 2023
f940d88
Add partitioning
tamireinhorn Aug 17, 2023
c4296e3
Guard in case the prod fetch fails
tamireinhorn Aug 25, 2023
b975225
deploy pls
tamireinhorn Aug 25, 2023
8abb9b4
Remove unnecessary folder, please deploy
tamireinhorn Aug 29, 2023
b4a06d9
Merge branch 'master' into staging/detran
tamireinhorn Aug 30, 2023
9ad3894
Fix TOML after merge
tamireinhorn Aug 30, 2023
2f518d1
Regenerate the .lock
tamireinhorn Aug 30, 2023
87b4f7e
For some reason, Polars was missing
tamireinhorn Aug 30, 2023
b59e9f5
Merge remote-tracking branch 'origin/master' into staging/detran
tamireinhorn Aug 30, 2023
5ae929f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 30, 2023
90b1cde
Flake8
tamireinhorn Aug 30, 2023
ab07da7
Add missing packages from master
tamireinhorn Aug 30, 2023
0f39d8b
Flake8 again
tamireinhorn Aug 30, 2023
61cd61b
removed junk from gitignore
tamireinhorn Aug 30, 2023
809a41b
Add missing package
tamireinhorn Aug 30, 2023
3381dd1
More missing packages
tamireinhorn Aug 30, 2023
99c4947
missed one
tamireinhorn Aug 30, 2023
bcf4acd
This works now
tamireinhorn Aug 30, 2023
7906b0f
xlrd
tamireinhorn Aug 30, 2023
62187a0
This was the error
tamireinhorn Sep 2, 2023
803ea5e
now it'll work
tamireinhorn Sep 3, 2023
6d255bc
Fix problem 2
tamireinhorn Sep 3, 2023
d5dc7c9
kkkkkkkkkkkkkkkkk
tamireinhorn Sep 16, 2023
d3e2e50
schedule fix
tamireinhorn Sep 16, 2023
981112f
Merge remote-tracking branch 'origin/master' into staging/detran
tamireinhorn Sep 16, 2023
6323142
TODO MERGE THIS
tamireinhorn Sep 16, 2023
db0dbea
.
tamireinhorn Sep 16, 2023
7e97dc5
Done
tamireinhorn Sep 16, 2023
ddaee64
This was supposed to be so simple
tamireinhorn Sep 16, 2023
42f0e52
Dude.
tamireinhorn Sep 16, 2023
4d33dd2
Dude
tamireinhorn Sep 16, 2023
104fb3b
convert at the right time
tamireinhorn Sep 16, 2023
dafcdc8
My god, dude
tamireinhorn Sep 22, 2023
e101494
I just want this not to run right now, as it seems to
tamireinhorn Sep 22, 2023
0cfa89a
Something is running automatically and I'm tired of it
tamireinhorn Sep 22, 2023
af9bebc
Better safe than sorry
tamireinhorn Sep 22, 2023
5bd07ec
I think this holds it
tamireinhorn Sep 22, 2023
06561f3
OK
tamireinhorn Sep 26, 2023
b7f00ab
done
tamireinhorn Sep 27, 2023
736722a
solution
tamireinhorn Sep 29, 2023
87ee563
Bugged data
tamireinhorn Oct 29, 2023
8370496
Scan for and fix columns multiplied by 10
tamireinhorn Nov 6, 2023
e411b8a
Done
tamireinhorn Nov 6, 2023
57385dc
All is good
tamireinhorn Nov 19, 2023
88af8e2
Merge remote-tracking branch 'origin/main' into staging/detran
tamireinhorn Nov 23, 2023
abb6614
Redo the pre-commit, which does NOT work here.
tamireinhorn Nov 23, 2023
f81b587
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 23, 2023
863fc05
Use the task to decide
tamireinhorn Nov 24, 2023
7d0bd15
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Nov 24, 2023
136da7b
feat: add update_django_metadata task
folhesgabriel Dec 12, 2023
aec9ee8
feat: add new get_data task
folhesgabriel Dec 14, 2023
f5685b6
feat: update branch
folhesgabriel Dec 14, 2023
e682d24
Merge branch 'main' of https://github.com/basedosdados/pipelines into…
folhesgabriel Dec 14, 2023
ff57ade
fix: basedosdados read_sql input
folhesgabriel Dec 14, 2023
4153283
fix: table_id name and handlers
folhesgabriel Dec 14, 2023
d7c29ed
feat: add schedules
folhesgabriel Dec 19, 2023
ac20eb5
feat: add final flows
folhesgabriel Jan 18, 2024
9f01e14
Merge remote-tracking branch origin into staging/detran
folhesgabriel Jan 18, 2024
3ea44a9
fix: crawler error to download october onwards files
folhesgabriel Jan 18, 2024
d0f6b58
feat: final modifications
folhesgabriel Jan 23, 2024
19fe83f
feat: add logs and set df cols to str
folhesgabriel Jan 24, 2024
cae36a3
Merge branch 'main' into staging/detran
mergify[bot] Jan 24, 2024
b3474a5
feat: fix get_latest_data return
folhesgabriel Jan 24, 2024
c1e9fae
Merge branch 'staging/detran' of https://github.com/basedosdados/pipe…
folhesgabriel Jan 24, 2024
93ec46e
Merge branch 'main' into staging/detran
mergify[bot] Jan 24, 2024
658bf37
feat: delete backfill and add documentation
folhesgabriel Jan 25, 2024
0c41c7d
feat: remove backfill parameters from flow
folhesgabriel Jan 25, 2024
33803d8
Merge branch 'main' into staging/detran
mergify[bot] Jan 25, 2024
869430c
feat: schedules
folhesgabriel Jan 26, 2024
32a7f32
feat: add code owners
folhesgabriel Jan 26, 2024
3da8ebe
Merge branch 'main' into staging/detran
mergify[bot] Jan 26, 2024
5d3d793
fix: flow
folhesgabriel Jan 26, 2024
ea8d3c9
fix: lint code
folhesgabriel Jan 29, 2024
16e1977
feat: remove unused imports
folhesgabriel Jan 29, 2024
4 changes: 4 additions & 0 deletions .gitignore
@@ -147,8 +147,12 @@ notebooks/
/tests/




.DS_Store

# Mac
.DS_Store


/DENATRAN_FILES
1 change: 0 additions & 1 deletion .pre-commit-config.yaml
@@ -12,7 +12,6 @@ repos:
rev: 23.7.0
hooks:
- id: black

exclude: 'pipelines\/\{\{cookiecutter\.project_name\}\}.*'
- repo: https://github.com/PyCQA/flake8
rev: 6.1.0
1 change: 1 addition & 0 deletions pipelines/datasets/__init__.py
@@ -27,6 +27,7 @@
from pipelines.datasets.br_me_caged.flows import *
from pipelines.datasets.br_ibge_pnadc.flows import *
from pipelines.datasets.cross_update.flows import *
from pipelines.datasets.br_denatran_frota.flows import *
from pipelines.datasets.br_bcb_estban.flows import *
from pipelines.datasets.br_ms_cnes.flows import *
from pipelines.datasets.br_rj_isp_estatisticas_seguranca.flows import *
Empty file.
27 changes: 27 additions & 0 deletions pipelines/datasets/br_denatran_frota/backfill.py
@@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-
from pipelines.datasets.br_denatran_frota.handlers import (
    crawl,
    treat_uf_tipo,
    get_desired_file,
    output_file_to_csv,
)
from pipelines.datasets.br_denatran_frota.constants import constants

# Fill for UF TIPO
months = range(1, 13)
years = range(2003, 2023)
for year in years:
    for month in months:
        print(month)
        crawl(month=month, year=year, temp_dir="DENATRAN_FILES")
        file = get_desired_file(
            year=year,
            download_directory="DENATRAN_FILES",
            filetype=f"{constants.UF_TIPO_BASIC_FILENAME.value}_{month}",
        )
        if year == 2004 and month == 3:
            breakpoint()
        df = treat_uf_tipo(file=file)
        path = output_file_to_csv(
            df=df, filename=constants.UF_TIPO_BASIC_FILENAME.value
        )
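
The backfill script above visits every (year, month) pair from 2003 through 2022. A minimal sketch of that iteration pattern follows, assuming the per-month work can be wrapped in a single callable. `backfill_periods`, `run_backfill`, and `fake_process_month` are hypothetical names used for illustration only; the real `crawl`, `get_desired_file`, `treat_uf_tipo`, and `output_file_to_csv` handlers are not shown in this diff, so a stand-in replaces them here.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, List, Tuple


def backfill_periods(
    first_year: int = 2003, last_year: int = 2022
) -> Iterator[Tuple[int, int]]:
    """Yield (year, month) pairs in chronological order, last_year inclusive."""
    return product(range(first_year, last_year + 1), range(1, 13))


def run_backfill(
    process_month: Callable[[int, int], str],
    periods: Iterable[Tuple[int, int]],
) -> List[str]:
    """Call the per-month handler once per period and collect what it returns."""
    return [process_month(year, month) for year, month in periods]


def fake_process_month(year: int, month: int) -> str:
    """Hypothetical stand-in for the crawl -> treat -> export steps."""
    return f"{year}-{month:02d}"


outputs = run_backfill(fake_process_month, backfill_periods(2003, 2004))
print(len(outputs), outputs[0], outputs[-1])
```

Keeping the period generation separate from the per-month work makes it possible to rerun a single problematic month (for example March 2004) without a hard-coded `breakpoint()` in the loop.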
147 changes: 147 additions & 0 deletions pipelines/datasets/br_denatran_frota/constants.py
@@ -0,0 +1,147 @@
# -*- coding: utf-8 -*-
"""
Constant values for the datasets projects
"""

from enum import Enum


class constants(Enum):  # pylint: disable=c0103
    """
    Constant values for the br_denatran_frota project
    """

    # -*- coding: utf-8 -*-

    MONTHS = {
        "janeiro": 1,
        "fevereiro": 2,
        "marco": 3,
        "abril": 4,
        "maio": 5,
        "junho": 6,
        "julho": 7,
        "agosto": 8,
        "setembro": 9,
        "outubro": 10,
        "novembro": 11,
        "dezembro": 12,
    }

    DATASET = "br_denatran_frota"

    HEADERS = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36"
    }

    DICT_UFS = {
        "AC": "Acre",
        "AL": "Alagoas",
        "AP": "Amapá",
        "AM": "Amazonas",
        "BA": "Bahia",
        "CE": "Ceará",
        "DF": "Distrito Federal",
        "ES": "Espírito Santo",
        "GO": "Goiás",
        "MA": "Maranhão",
        "MT": "Mato Grosso",
        "MS": "Mato Grosso do Sul",
        "MG": "Minas Gerais",
        "PA": "Pará",
        "PB": "Paraíba",
        "PR": "Paraná",
        "PE": "Pernambuco",
        "PI": "Piauí",
        "RJ": "Rio de Janeiro",
        "RN": "Rio Grande do Norte",
        "RS": "Rio Grande do Sul",
        "RO": "Rondônia",
        "RR": "Roraima",
        "SC": "Santa Catarina",
        "SP": "São Paulo",
        "SE": "Sergipe",
        "TO": "Tocantins",
    }

    SUBSTITUTIONS = {
        ("RN", "assu"): "acu",
        ("PB", "sao domingos de pombal"): "sao domingos",
        ("PB", "santarem"): "joca claudino",
        ("SP", "embu"): "embu das artes",
        ("TO", "sao valerio da natividade"): "sao valerio",
        ("PB", "campo de santana"): "tacima",
        ("AP", "amapari"): "pedra branca do amapari",
        ("BA", "maracani"): "macarani",
        ("BA", "livramento do brumado"): "livramento de nossa senhora",
        ("PB", "sao bento de pombal"): "sao bentinho",
        ("PB", "serido"): "sao vicente do serido",
        ("PR", "vila alta"): "alto paraiso",
        ("RN", "espirito santo do oeste"): "parau",
        ("RO", "jamari"): "itapua do oeste",
        ("SC", "picarras"): "balneario picarras",
        ("SC", "barra do sul"): "balneario barra do sul",
    }

    DOWNLOAD_PATH = f"/tmp/input/{DATASET}"

    OUTPUT_PATH = f"/tmp/output/{DATASET}"

    UF_TIPO_BASIC_FILENAME = "frota_por_uf_e_tipo_de_veiculo"

    MUNIC_TIPO_BASIC_FILENAME = "frota_por_municipio_e_tipo"

    MONTHS_SHORT = {month[:3]: number for month, number in MONTHS.items()}

    UF_TIPO_HEADER = [
Collaborator review comment: You don't use this list; delete it.

        "Grandes Regiões e\nUnidades da Federação",
        "TOTAL",
        "AUTOMÓVEL",
        "BONDE",
        "CAMINHÃO",
        "CAMINHÃO TRATOR",
        "CAMINHONETE",
        "CAMIONETA",
        "CHASSI PLATAFORMA",
        "CICLOMOTOR",
        "MICROÔNIBUS",
        "MOTOCICLETA",
        "MOTONETA",
        "ÔNIBUS",
        "QUADRICICLO",
        "REBOQUE",
        "SEMI-REBOQUE",
        "SIDE-CAR",
        "OUTROS",
        "TRATOR ESTEIRA",
        "TRATOR RODAS",
        "TRICICLO",
        "UTILITÁRIO",
    ]

    MUNICIPIO_TIPO_HEADER = [
Collaborator review comment: Same thing here...

        "UF",
        "MUNICIPIO",
        "TOTAL",
        "AUTOMÓVEL",
        "BONDE",
        "CAMINHÃO",
        "CAMINHÃO TRATOR",
        "CAMINHONETE",
        "CAMIONETA",
        "CHASSI PLATAFORMA",
        "CICLOMOTOR",
        "MICROÔNIBUS",
        "MOTOCICLETA",
        "MOTONETA",
        "ÔNIBUS",
        "QUADRICICLO",
        "REBOQUE",
        "SEMI-REBOQUE",
        "SIDE-CAR",
        "OUTROS",
        "TRATOR ESTEIRA",
        "TRATOR RODAS",
        "TRICICLO",
        "UTILITÁRIO",
    ]
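
Because `constants` is an `Enum`, each attribute is an enum member, and the underlying object is reached through `.value`, as `backfill.py` does with `constants.UF_TIPO_BASIC_FILENAME.value`. The sketch below uses a trimmed, illustrative copy of the class (three months, two substitutions) to show how `DOWNLOAD_PATH` and `MONTHS_SHORT` are derived inside the class body. `canonical_name` is a hypothetical helper; how the pipeline actually applies `SUBSTITUTIONS` is not shown in this diff.

```python
from enum import Enum


class constants(Enum):  # trimmed, illustrative copy of the class in the diff
    MONTHS = {"janeiro": 1, "fevereiro": 2, "marco": 3}
    DATASET = "br_denatran_frota"
    SUBSTITUTIONS = {
        ("RN", "assu"): "acu",
        ("SP", "embu"): "embu das artes",
    }
    # Inside the class body, DATASET and MONTHS are still plain objects,
    # so they can be used to build other constants.
    DOWNLOAD_PATH = f"/tmp/input/{DATASET}"
    MONTHS_SHORT = {month[:3]: number for month, number in MONTHS.items()}


def canonical_name(uf: str, municipality: str) -> str:
    """Hypothetical helper: map an old (UF, name) pair to its current name."""
    return constants.SUBSTITUTIONS.value.get((uf, municipality), municipality)


print(constants.DOWNLOAD_PATH.value)
print(constants.MONTHS_SHORT.value)
print(canonical_name("SP", "embu"), canonical_name("SP", "santos"))
```

Keying the substitutions on the `(UF, name)` tuple rather than the name alone matters because the same municipality name can exist in more than one state.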