A set of tools for automated retrieval of company information from the website of the Secretariat of the Federal Revenue of Brazil (Receita Federal). The tools use the receitaws.com.br web service to retrieve information about any Brazilian companies you choose.
The easiest way to install the tools is with pip:
pip install receita-tools
These tools let you easily retrieve data from Receita's website. You can get information about multiple companies at once. The tools can also generate a few CSV files that make it easy to import the retrieved data into your own system.
The tools provided here use the ReceitaWS web service. Before using them, it is worth reading about how the service works on the ReceitaWS website.
The data retriever program works from a CSV file listing the CNPJs it should look up. This file must have at least one column, and the first column must contain the CNPJ of each company you want information about.
To get information for the CNPJs listed in a file named cnpj.csv, run:
receita get cnpj.csv
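As an illustration of the input format described above, here is a minimal Python sketch that writes such a CSV file. The function name write_cnpj_csv is hypothetical and not part of receita-tools. The sketch assumes that no header row is needed and writes each CNPJ exactly as given (only surrounding whitespace is removed), since neither point is specified here.

```python
import csv


def write_cnpj_csv(path, cnpjs):
    """Write one CNPJ per row, in the first column, to the CSV file at `path`.

    Hypothetical helper, not part of receita-tools.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for cnpj in cnpjs:
            # Assumption: no header row, and values are kept as given
            # apart from stripping surrounding whitespace.
            writer.writerow([cnpj.strip()])
```

For example, write_cnpj_csv("cnpj.csv", my_cnpjs), where my_cnpjs is your own list of CNPJ strings, produces a file that can be passed to the get command.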
By default, the retrieved data is saved in a directory named data inside the directory where you ran the command. You can change this location with the --output option, which accepts both absolute and relative paths.
You can use either the web service's Public API or its Commercial API. Below we describe how to use each of them.
By default, the get command uses the Public API to get information about companies. No extra configuration or command is required, so you are ready to go. For example, to get data for the companies listed in the list.csv file and save it to the cnpj_data folder using the Public API:
receita get list.csv --output cnpj_data
To use the Commercial API you need to provide two extra pieces of information: the maximum data deprecation value (the tolerated age of the data, in days) and the API access token. You can generate an access token from your control panel on the ReceitaWS website.
Once you have that information, provide your token in the RWS_TOKEN environment variable. The deprecation value must be provided using the -d option.
To set the environment variable, you can use the export command or define it inline when running the get command. Here is a sample using the export command and setting the data tolerance to 20 days:
export RWS_TOKEN="<my-token>"
receita get list.csv --output cnpj_data -d 20
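If you prefer to drive the get command from a script, the same invocation can be sketched in Python. This is only an illustration: build_get_command and run_get are hypothetical helper names, not part of receita-tools, and the sketch uses only the --output and -d options and the RWS_TOKEN variable described above. It assumes the receita executable is available on your PATH.

```python
import os
import subprocess


def build_get_command(csv_path, output=None, days=None):
    """Build the argument list for `receita get` (hypothetical helper)."""
    args = ["receita", "get", csv_path]
    if output is not None:
        args += ["--output", output]
    if days is not None:
        # Maximum data deprecation value, in days (Commercial API only).
        args += ["-d", str(days)]
    return args


def run_get(csv_path, output=None, days=None, token=None):
    """Run `receita get`, passing the token through RWS_TOKEN if given."""
    env = dict(os.environ)
    if token is not None:
        env["RWS_TOKEN"] = token
    # check=True raises CalledProcessError if the command exits with an error.
    return subprocess.run(
        build_get_command(csv_path, output, days), env=env, check=True
    )
```

Calling run_get("list.csv", output="cnpj_data", days=20, token="<my-token>") is intended to mirror the export sample above, while run_get("list.csv", output="cnpj_data") corresponds to the Public API example.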
After you run the get command, all the data has been downloaded to your local filesystem. The build command reads this data and generates consolidated CSV files from it.
If you did not use the default directory to save the data, you need to specify it with the --input option. You can also choose the directory where the generated files will be stored with the --output option.
receita build --input cnpj_data --output results
This command will generate three files in the output directory:
- companies.csv: data for every company retrieved;
- activities.csv: list of each company's activities (primary/secondary);
- activities_seen.csv: the full set of activities from those companies.
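Before importing the generated files into another system, it can help to check what they contain. The following Python sketch reports the column names and row count of a CSV file without assuming any particular columns. The function name summarize_csv is hypothetical, and the sketch assumes the generated files start with a header row and are UTF-8 encoded, neither of which is stated here.

```python
import csv


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file.

    Hypothetical helper, not part of receita-tools.
    Assumption: the first row is a header and the file is UTF-8 encoded.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the header first, then counts the data rows.
        row_count = sum(1 for _ in reader)
        return reader.fieldnames, row_count
```

For example, summarize_csv("results/companies.csv") returns the header names found in that file along with the number of data rows.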
You can always use the --help option to get help about a command. It also works with subcommands, for example:
receita build --help