receita-tools


README Languages: ptbr en

A set of tools for automated information retrieval from the website of the Secretariat of the Federal Revenue of Brazil (Receita Federal). The tools use the receitaws.com.br web service to retrieve information about any Brazilian companies you like.

The easiest way to install the tools is with pip:

pip install receita-tools

This set of tools lets you easily retrieve data from Receita's website. You can get information about multiple companies at once. The tools also let you create a few CSV files so you can easily import the retrieved data into your system.

The tools provided here use the ReceitaWS webservice. Here are a few important links to read about how the system works before using this tool:

The data retriever program works from a CSV file listing the CNPJs it should look up. This file must have at least one column, and the first column must contain the CNPJ of each company you want information about.
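As a minimal sketch of the input format described above, the following Python snippet writes a one-column CSV file with one CNPJ per row. The CNPJ values are placeholders for illustration, and the helper name write_cnpj_csv is hypothetical. Whether a header row or punctuated CNPJs are accepted is not stated here, so the sketch writes plain digits and no header.

```python
import csv

# Placeholder CNPJs for illustration only; replace them with the ones you need.
EXAMPLE_CNPJS = ["00000000000191", "00000000000272"]


def write_cnpj_csv(path, cnpjs):
    """Write one CNPJ per row, in the first column of the file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for cnpj in cnpjs:
            # Only the first column is required; extra columns could follow it.
            writer.writerow([cnpj])


write_cnpj_csv("cnpj.csv", EXAMPLE_CNPJS)
```

The resulting cnpj.csv can then be passed to receita get.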

You can run receita get cnpj.csv to get information for the companies in that CSV file. By default, the retrieved data is saved to the data directory inside the directory where you ran the command. You can change the directory with the --output option, which accepts both absolute and relative paths.

You can use the webservice Public API or the Commercial API. Below we describe how to use each of them.

By default, the get command uses the Public API to get information about companies. No extra configuration or command is needed, so you are ready to go. For example, to get data for the companies listed in the list.csv file and save it to the cnpj_data folder using the Public API:

receita get list.csv --output cnpj_data

To use the Commercial API you need to provide two extra pieces of information: the maximum data deprecation value (how old, in days, the returned data may be) and the API access token. You can generate an access token from your control panel on the ReceitaWS website.

Once you have that information, you need to provide your token as the RWS_TOKEN environment variable. The deprecation value must be provided using the -d option.

To set the environment variable you can use the export command or simply define it inline when running the get command. Here is a sample using the export command and setting the data tolerance to 20 days:

export RWS_TOKEN="<my-token>"
receita get list.csv --output cnpj_data -d 20
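If you drive the tool from a Python script, the same call can be sketched with the standard subprocess module, passing RWS_TOKEN only to the child process instead of exporting it. The helper names below (receita_get_args, build_env, run_get) are hypothetical; the only flags used are the --output and -d options described above.

```python
import os
import subprocess


def receita_get_args(csv_path, output_dir, max_age_days=None):
    """Build the argument list for the get command."""
    args = ["receita", "get", csv_path, "--output", output_dir]
    if max_age_days is not None:
        # -d carries the data deprecation value, in days.
        args += ["-d", str(max_age_days)]
    return args


def build_env(token):
    """Copy the current environment and add the access token to the copy."""
    return dict(os.environ, RWS_TOKEN=token)


def run_get(csv_path, output_dir, max_age_days, token):
    """Run the get command with the token set for that process only."""
    return subprocess.run(
        receita_get_args(csv_path, output_dir, max_age_days),
        env=build_env(token),
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
```

Calling run_get("list.csv", "cnpj_data", 20, "<my-token>") would then run the same command as the sample above, assuming the receita executable is on your PATH.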

After you run the get command, all the data is available on your local filesystem. The build command reads this data and generates consolidated CSV files from it.

If you did not use the default directory to save the data, you need to specify it with the --input option. You can also choose the directory where the generated files will be stored with the --output option.

receita build --input cnpj_data --output results

This command generates three files in the output directory:

  • companies.csv: data for every company retrieved;
  • activities.csv: list of each company's activities (primary/secondary);
  • activities_seen.csv: the full set of activities from those companies.
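To check the generated files before importing them, a short Python sketch like the one below can report the header and the number of data rows of each file. It assumes the files are comma-separated, UTF-8 encoded, and start with a header row, none of which is confirmed here. The function names are hypothetical, and no column names are assumed.

```python
import csv
import os

# File names listed above for the build command output.
RESULT_FILES = ["companies.csv", "activities.csv", "activities_seen.csv"]


def summarize_csv(path):
    """Return (header, data_row_count), treating the first row as the header."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)
    return header, row_count


def summarize_results(directory="results"):
    """Summarize each expected file that exists in the output directory."""
    summary = {}
    for name in RESULT_FILES:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            summary[name] = summarize_csv(path)
    return summary
```

For example, summarize_results("results") returns a dictionary keyed by file name, skipping any file that has not been generated.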

You can always use the --help option to get help about a command. You can also use it with the subcommands, like receita build --help.