[Infra] Version 2.0.0b16 (#1678)
* [infra] Version 1.7.0 python-package

* [infra] fix update_columns test

* [infra] remove unused import

* [infra] add to_partition utility function

* [infra] add test for to_partitions

* [infra] bump package version 1.6.9-b2

* [infra] add break_file feature

* Revert "[infra] bump package version 1.6.9-b2"

This reverts commit 0cba449.

* feat: add `connection_id` to external data configuration

* fix(Datatype): add connection id for external configuration

* feat: add automatic management of BQ connection

* chore: fix linting issues

* feat: add test folder to gitignore

* feat: release beta version

* feat(Connection): add `service_account` property

* feat(Base): add IAM stuff

* chore: fix linting issues

* feat: automatic granting roles to BigLake service account

* feat: better error handling, set biglake permissions is now optional

* feat: release beta version

* chore: modify log message

* chore: make all partitions string

* chore: merge master

* add __version__ attribute (#1488)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* add option to change copied table name (#1489)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* fix: pylint

* changing python_path fixture

* adding shapely as dependency to downgrade if already installed, like Colab

* pylinting files

* bump version 1.6.10-beta.1

* bump version 1.6.10

* updating version

* return update_columns to Table class

* authentication methods in base class

* method to return dataset id from slug using graphql

* method to return table id from slug of dataset and table using graphql

* using variables in graphql query

* change default & log downloaded path

* authentication with graphql

* change version in pyproject.toml

* chore: refactor connection imports and make working dir default for storage download

* chore: make staging the default mode for storage download

* chore: make staging the default mode for storage download

* fix: pylint

* chore: release new beta version

* small corrections in 1.6.11

* methods to retrieve metadata from graphql api

* adding data to api_data_dict

* adding exists_in_api method

* changing is_updated method

* method to get a request in graphql

* logging errors with loguru, instead of print

* writing yaml files before updating the database

* helper to convert case from snake_case to camelCase and vice versa

* start refactoring the query to use alias

* refactoring: clean edges and nodes from graphql response

* moving graphql queries for separated files and others

* improving unit tests and graphql for api_metadata

* changing api_response for compatibility with current yamls

* Hotfix storage init args (#1576)

* [dados]br_ibge_estadic.indicadores (#1535)

* dados

* dados

* title

* observation_level

* escolaridade

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_ipea_avs (#1530)

* up br_ipea_avs

* Addressing the data team's comments.

* Changing the temporal_coverage

* Adjusting the PR.3

* Create code

* Delete code

* Create br_avs_ipea

* Add files via upload

* Delete br_avs_ipea

* Re-uploading the entire dataset because of the changes.

* Adjusting the PR

* Delete br_ipea_avs.ipynb

* update

* update

* update

* update

* update

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* Update table_config.yaml (#1553)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_me_clima_organizacional (#1548)

* dados

* partner_organization

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_me_exportadoras_importadoras  (#1521)

* Upload br_me_exportadoras_importadoras

* Fix the script

* Fix table_config errors flagged in the PR review

* Fix table_config errors flagged in the PR review v2

* Fix errors in the dictionary

* Fix the CEP type

* Fix the CEP data type in publish.sql

* Delete dataset_config yamls

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Laura Amaral <[email protected]>

* [dados] world_fao_production (#1536)

* Open PR world_fao_production

* Fix errors flagged in the PR review

* Fix errors flagged in the PR review v2

* Change the observation level in the table_config of the item table

* Fix errors flagged in the PR review

* Fix the partition and rename the variable ano to year

* remove repeated word from the description

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Laura Amaral <[email protected]>

* [dados] world_wb_wwbi.country_finance (#1538)

* dados

* update

* data

* add observacoes

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados-atualizacao] update pib municipio (#1559)

* update pib municipio

* updates

* updates

* Update table_description.txt

---------

Co-authored-by: Gabrielle Carvalho <[email protected]>

* [dados] br_ibge_estadic (#1560)

* update

* update metadata_modified

* update metadata_modified

* update metadata_modified

---------

Co-authored-by: Laura Amaral <[email protected]>

* [dados] br_ibge_munic (#1534)

* dados

* update table_config

* update

* code

* code novo

* sigla_uf

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_me_siconfi.uf (#1546)

* upload br_me_siconfi_uf

* update review gabs

* Update table_config.yaml

* Update table_config.yaml

* Update table_config.yaml

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Laura Amaral <[email protected]>

* [dados] br_ipea_avs (#1564)

* up br_ipea_avs

* up br_ibge_estadic

* Addressing the data team's comments.

* Adjusting temporal coverage.

* Delete README.md

* Delete dataset_config.yaml

* Delete publish.sql

* Delete schema-prod.json

* Delete schema-staging.json

* Delete table_config.yaml

* Delete table_description.txt

* update

* update

* update

* update

* update

* update

* update

* Delete br_ibge_estadic_educação.ipynb

deleting estadic code

* Delete publish.sql

deleting publish.sql

* Delete schema-prod.json

deleting schema-prod

* Delete schema-staging.json

deleting schema-staging

* Delete table_config.yaml

deleting table_config

* Delete table_description.txt

deleting table_description

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_bcb_estban (#1561)

* Open PR for the br_bcb_estban dataset

* Fix errors flagged in the PR review

* Delete dataset_config.yaml

* Fix errors flagged in the PR review v2.0

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Laura Amaral <[email protected]>

* update br_me_rais dictionary (#1567)

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados] br_ibge_estadic (#1531)

* up br_ipea_avs

* up br_ibge_estadic

* Addressing the data team's comments.

* Adjusting temporal coverage.

* Delete README.md

* Delete dataset_config.yaml

* Delete publish.sql

* Delete schema-prod.json

* Delete schema-staging.json

* Delete table_config.yaml

* Delete table_description.txt

* update

* update

* update

* update

* update

* update

* update

* update

* update br_ibge_estadic dictionary

* update br_ibge_estadic

* update br_ibge_estadic dictionary

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

* [dados-atualizacao] update `br_sgp_informacao.despesas_cartao_corporativo` (#1570)

* update cartao corporativo

* Update table_config.yaml

* [dados] world_spi (#1555)

* update world_spi

* update

* update

* update_2

* update world_spi

* fix: dataset_id

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Crislane Alves <[email protected]>

* add required args

* [dados-atualizacao] br_inep_indicadores_educacionais (#1566)

* [dados-atualizacao]

updates the data to 2022; updates table_config; adds a Python script

* Update table_config.yaml

* Update table_config.yaml

* Update table_config.yaml

* Update table_config.yaml

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>

---------

Co-authored-by: Gabrielle Carvalho <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Patrick Teixeira <[email protected]>
Co-authored-by: Gabriel Pisa <[email protected]>
Co-authored-by: Laura Amaral <[email protected]>
Co-authored-by: Arthur Gusmão <[email protected]>
Co-authored-by: Crislane Alves <[email protected]>
Co-authored-by: Fernanda Scovino <[email protected]>
Co-authored-by: Lucas Moreira <[email protected]>

* removing coverage from query

* fixes in dataset and table config files creation

* owner_org and exists_in_api methods

* removing references to REST API and treat errors in login

* adjustments in data_dict, tests for publish

* initializing RemoteAPI for mutations

* mutation to create a dataset

* adjustments in data_dict and others

* corrections and new table for publish tests

* removing IDE settings from project

* prevent exclusion of tmp_bases when it already exists

* change filename to avoid conflict in tests

* committing notebook file as it does not roll back, even though the contents are identical

* stable exists method

* metadata_modified as datetime

* chore: remove yaml dependency from metadata.py

* chore: remove yaml dependency from metadata.py

* feat: remove more code

* feat: remove more code

* initial structure

* chore: refactor graphql requests

* chore: refactor graphql requests

* chore: refactor graphql requests

* chore: make publish_sql

* feat: add backend class for handling interaction with graphql

* chore: clean some code and comment parts where table_config are needed

* chore: clean some code and comment parts where table_config are needed

* chore: add dataset config query

* chore: add table config query

* feat: create dataset and use API metadata

* chore(deps): remove unnecessary deps

* chore: minor cleanup

* chore: delete file

* feat: fix occurrences of `table_config`

* feat: add API url to config init

* feat: add structure for `Metadata.create`

* chore: more table modifications

* feat: table create using data_columns and partitioned data

* feat: some refactor and finish table.create

* chore: better casing

* chore: better casing

* chore: better logging

* chore: update table.create docstring

* chore: clean config files

* feat: refactor table.publish and table.update

* chore: make publish.sql from staging schema

* feat: get partition dict from storage

* chore: rename some methods

* chore: update and publish only acts in prod and uses the staging table schema to generate the prod publish query and update schema

* chore: load schema using SchemaField, remove code that depends on template

* chore: refactor init process

* chore: remove upload function from cli

* chore: remove upload function from cli

* chore: clean unused imports, redo poetry packages and release 2.0.0-b1

* chore: add a new dependency, requests-toolbelt

* chore: add tomlkit and better error if a column does not have a name

* chore: error handling, and make publish and update get info from the API if it exists

* fix: typo in _get_columns_from_data and better infos

* chore: add tomlkit

* chore: error handling in case that API is off

* chore: error handling in case that API is off

* hotfix: change metadata base_url

* chore: get backend metadata from cloud tables

* feat: bump beta version

* chore: change mode in table.delete

* chore: bump version

* chore: no more version number on files

* feat: implement external warnings and messages

* feat: add csv_delimiter and allow csv_allow_jagged_rows

* fix: bump bd version and add new parameters csv_delimiter and csv_allow_jagged_rows

* chore: cleanup

* chore: remove compressed r package

* chore: refactor dependency management

* chore: fix linting issues

* chore: remove pylint action

* fix: change install instructions

* feat: bump bd version

* fix: change install instructions

* feat: bump bd version

* feat: add new parameter csv_skip_leading_rows and setup.py

* feat: bump bd version

* chore: fix conflicts

* chore: add `all` extra

* chore: lint

* feat: create branch v2.0.0

* fix: add csv delimiter to schema

* feat: expand credential scope to drive and bq

* chore: start cleaning tests

* chore: add timeout to pypi warning

---------

Co-authored-by: lucascr91 <[email protected]>
Co-authored-by: Mauricio Fagundes <[email protected]>
Co-authored-by: Gabriel Gazola Milan <[email protected]>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Fernanda Scovino <[email protected]>
Co-authored-by: Gabrielle Carvalho <[email protected]>
Co-authored-by: Patrick Teixeira <[email protected]>
Co-authored-by: Gabriel Pisa <[email protected]>
Co-authored-by: Laura Amaral <[email protected]>
Co-authored-by: Arthur Gusmão <[email protected]>
Co-authored-by: Crislane Alves <[email protected]>
Co-authored-by: Lucas Moreira <[email protected]>
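
Two of the squashed commits above mention a helper that converts between snake_case and camelCase and a refactor that cleans `edges` and `nodes` out of GraphQL responses. The commit page does not show that code, so the following is only a minimal sketch of what such helpers might look like, assuming a Relay-style response shape; the function names `snake_to_camel`, `camel_to_snake`, and `clean_graphql` are hypothetical and are not taken from the package.

```python
import re
from typing import Any


def snake_to_camel(name: str) -> str:
    """Convert a snake_case identifier to camelCase (hypothetical helper)."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def camel_to_snake(name: str) -> str:
    """Convert a camelCase identifier to snake_case (hypothetical helper)."""
    # Insert an underscore before every uppercase letter except a leading one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def clean_graphql(value: Any) -> Any:
    """Recursively replace {"edges": [{"node": X}, ...]} with [X, ...].

    Assumes a Relay-style connection layout; other keys next to "edges"
    (for example page info) are dropped in this sketch.
    """
    if isinstance(value, dict):
        if isinstance(value.get("edges"), list):
            return [clean_graphql(edge["node"]) for edge in value["edges"]]
        return {key: clean_graphql(item) for key, item in value.items()}
    if isinstance(value, list):
        return [clean_graphql(item) for item in value]
    return value


if __name__ == "__main__":
    response = {
        "allDataset": {
            "edges": [
                {"node": {"slug": "x", "tables": {"edges": [{"node": {"slug": "t"}}]}}}
            ]
        }
    }
    print(snake_to_camel("metadata_modified"))
    print(camel_to_snake("metadataModified"))
    print(clean_graphql(response))
```

Flattening the connection wrappers once, right after the request, would let the rest of the code work with plain lists and dictionaries instead of walking `edges` and `node` at every call site.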
13 people authored May 2, 2024
1 parent b2d1414 commit 246943c
Showing 62 changed files with 10,522 additions and 7,741 deletions.
4 changes: 4 additions & 0 deletions .flake8
@@ -0,0 +1,4 @@
+[flake8]
+select = C,E,F,W,B,B950
+extend-ignore = E501
+max-line-length = 88
29 changes: 0 additions & 29 deletions .github/workflows/lint_python.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion .github/workflows/metadata-validate/metadata_validate.py
@@ -8,7 +8,7 @@
 import yaml
 from basedosdados import Dataset, Storage
 from basedosdados.upload.base import Base
-from basedosdados.upload.metadata import Metadata
+from basedosdados.upload.metadata import Metadata  # TODO: deprecate
 
 
 def tprint(title=""):
2 changes: 1 addition & 1 deletion .github/workflows/table-approve/table_approve.py
@@ -10,7 +10,7 @@
 import yaml
 from basedosdados import Dataset, Storage
 from basedosdados.upload.base import Base
-from basedosdados.upload.metadata import Metadata
+from basedosdados.upload.metadata import Metadata  # TODO: deprecate
 
 
 def tprint(title=""):
13 changes: 13 additions & 0 deletions .gitignore
@@ -3,6 +3,13 @@
 .mais
 bases/pytest/*
 bases/test/
+test/*
+test.py
+
+
+.DS_Storage
+*/*/.DS_Storage
+
 
 # NEW repo name
 .mais
@@ -144,6 +151,12 @@ venv.bak/
 .spyderproject
 .spyproject
 
+# VS Code project settings
+.vscode/
+
+# Pycharm project settings
+.idea/
+
 # Rope project settings
 .ropeproject
 
54 changes: 34 additions & 20 deletions .pre-commit-config.yaml
@@ -1,27 +1,41 @@
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.3.0
+    rev: v4.4.0
     hooks:
       - id: check-added-large-files
       - id: check-merge-conflict
-      # - id: check-yaml
       - id: detect-private-key
-      # - id: end-of-file-fixer
-      # - id: no-commit-to-branch
-      # args: [-b, main]
-      # - id: trailing-whitespace
-  - repo: local
+      - id: fix-byte-order-marker
+      - id: no-commit-to-branch
+      - id: trailing-whitespace
+
+  - repo: https://github.com/psf/black
+    rev: 22.12.0
     hooks:
-      - id: pylint
-        name: pylint
-        entry: pylint
-        language: system
-        types: [python]
-        args:
-          - "--rcfile=.pylintrc"
-        exclude: .github/
-  # - repo: https://github.com/macisamuele/language-formatters-pre-commit-hooks
-  # rev: v2.3.0
-  # hooks:
-  # - id: pretty-format-yaml
-  # args: [--autofix, --indent, '2']
+      - id: black
+        language_version: python3.10
+
+  - repo: https://github.com/PyCQA/isort
+    rev: 5.12.0
+    hooks:
+      - id: isort
+
+  - repo: https://github.com/PyCQA/flake8
+    rev: 6.0.0
+    hooks:
+      - id: flake8
+
+  - repo: https://github.com/returntocorp/semgrep
+    rev: v1.30.0
+    hooks:
+      - id: semgrep
+        language: python
+        args: [
+          "--error",
+          "--config",
+          "auto",
+          "--exclude-rule",
+          "python.lang.security.audit.subprocess-shell-true.subprocess-shell-true",
+          "--exclude-rule",
+          "yaml.github-actions.security.third-party-action-not-pinned-to-commit-sha.third-party-action-not-pinned-to-commit-sha",
+        ]
16 changes: 0 additions & 16 deletions Makefile

This file was deleted.

Binary file removed basedosdados_0.2.2.tar.gz
Binary file not shown.

Large diffs are not rendered by default.

4 changes: 0 additions & 4 deletions bases/test_dataset/test_table/table_description.txt
@@ -1,8 +1,4 @@
<<<<<<< HEAD
this is a test-dataset
=======
None
>>>>>>> fee7177eb7c1b2efc60334b30538bfea04eb2af9

Para saber mais acesse:
Website: