Downloader for the Facebook ads archive, built for the Media Analytics Tool project.
git clone git@github.com:lvtffuk/mat-facebook-downloader.git
cd mat-facebook-downloader
npm install
npm start
The settings are configured with environment variables.
Variable | Description | Required | Default value |
---|---|---|---|
TOKENS_FILE_PATH | The path of the CSV file with access tokens. | ✔️ | |
SEARCH_INPUT_FILE_PATH | The path of the CSV file with search input data. | ✔️ | |
OUT_DIR | The directory where the output is stored. | ✔️ | |
CSV_SEPARATOR | The separator used in the input CSV files. | ❌ | ; |
WORKER_CONCURRENCY | The number of ads archive downloads that run in parallel. | ❌ | 5 |
CLEAR | Indicates whether the output directory should be cleared before the run. If it is cleared, all downloads start again from the beginning. | ❌ | 0 |
ADS_ARCHIVE_LIMIT | The limit of results per Facebook API request. | ❌ | 100 |
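The table above can be turned into a small configuration loader. The following is a minimal sketch, not the project's actual code: the function name `loadConfig` and the returned property names are hypothetical, and treating `CLEAR=1` as "enabled" is an assumption based on the default value of `0`.

```javascript
// Hypothetical helper, not taken from the project's source.
// Reads the documented environment variables and applies the documented defaults.
function loadConfig(env = process.env) {
  // Variables marked as required in the table.
  const required = ["TOKENS_FILE_PATH", "SEARCH_INPUT_FILE_PATH", "OUT_DIR"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    tokensFilePath: env.TOKENS_FILE_PATH,
    searchInputFilePath: env.SEARCH_INPUT_FILE_PATH,
    outDir: env.OUT_DIR,
    csvSeparator: env.CSV_SEPARATOR || ";",
    workerConcurrency: Number.parseInt(env.WORKER_CONCURRENCY || "5", 10),
    // Assumption: "1" enables clearing, any other value disables it.
    clear: env.CLEAR === "1",
    adsArchiveLimit: Number.parseInt(env.ADS_ARCHIVE_LIMIT || "100", 10),
  };
}
```

Calling `loadConfig()` without arguments reads from `process.env`; passing a plain object instead makes the defaults easy to check without touching the real environment.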
Access to the Facebook Ads Library requires access tokens that are allowed to read the ads archive.
A Facebook application must also be created.
The tokens must be stored in a CSV file.
"id";"secret";"token"
"appId";"appSecret";"accessToken"
"appId";"appSecret";"accessToken"
"appId";"appSecret";"accessToken"
"appId";"appSecret";"accessToken"
The first line is the header.
The appId and appSecret can be retrieved from the developers console.
The access token can be retrieved from the Graph API Explorer. The user must be logged in to the correct app.
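To illustrate the token file format, here is a minimal parsing sketch. It is not the project's parser: the helper names `splitCsvLine` and `parseCsv` are hypothetical, the separator is assumed to be a single character, and quoted fields that span several lines are not handled.

```javascript
// Hypothetical helpers, not taken from the project's source.
// Split one CSV line into fields, honouring double quotes ("" is an escaped quote).
function splitCsvLine(line, separator = ";") {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === separator) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Parse CSV text whose first line is the header into an array of objects.
function parseCsv(text, separator = ";") {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = splitCsvLine(lines[0], separator);
  return lines.slice(1).map((line) => {
    const values = splitCsvLine(line, separator);
    return Object.fromEntries(header.map((name, i) => [name, values[i] ?? ""]));
  });
}

// The sample mirrors the placeholder rows shown above.
const sample = '"id";"secret";"token"\n"appId";"appSecret";"accessToken"\n';
const tokens = parseCsv(sample);
console.log(tokens);
```

In a real run the text would come from the file named by TOKENS_FILE_PATH and the separator from CSV_SEPARATOR; both are passed in as plain arguments here so the sketch stays free of file access.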
The search input is a CSV file that describes what to search for.
"countries";"search"
"countries";"searchTerm"
"countries";"searchTerm"
"countries";"searchTerm"
"countries";"searchTerm"
"countries";"searchTerm"
The first line is the header.
The countries column contains comma-separated country codes, and searchTerm is the term to search for in the archive.
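As an illustration of how one search input row might be interpreted, the sketch below splits the countries column into individual codes. The function name `normalizeSearchRow` and the shape of the returned object are hypothetical, and the row is assumed to be already parsed into an object keyed by the header names `countries` and `search`. The country codes in the example are illustrative only.

```javascript
// Hypothetical helper, not taken from the project's source.
// Turns one parsed row of the search input into a list of country codes and a search term.
function normalizeSearchRow(row) {
  const countries = row.countries
    .split(",")
    .map((code) => code.trim())
    .filter((code) => code !== ""); // ignore empty entries such as a trailing comma
  return { countries, searchTerm: row.search.trim() };
}

// Example row with illustrative values.
const query = normalizeSearchRow({ countries: "CZ, SK", search: "election" });
console.log(query);
```

Trimming each code tolerates spaces after the commas, which is a convenience of this sketch and not a documented property of the input format.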
The output is stored as an Apache Parquet file named archive.parquet in the output directory.
The image is stored in the GitHub Packages registry, and the app can be run in a Docker environment.
docker pull ghcr.io/lvtffuk/mat-facebook-downloader:latest
docker run \
--name=mat-facebook-downloader \
-e 'TOKENS_FILE_PATH=./input/tokens.csv' \
-e 'SEARCH_INPUT_FILE_PATH=./input/search.csv' \
-e 'OUT_DIR=./output' \
-v '/absolute/path/to/output/dir:/usr/src/app/output' \
-v '/absolute/path/to/input/dir:/usr/src/app/input' \
ghcr.io/lvtffuk/mat-facebook-downloader:latest
The volumes must be mounted so that the container can access the input and output data.
This work was supported by the European Regional Development Fund-Project “Creativity and Adaptability as Conditions of the Success of Europe in an Interrelated World” (No. CZ.02.1.01/0.0/0.0/16_019/0000734).