Extracting Supply Chain Information from News Articles Using Large Language Models: A Fully Automatic Approach
This codebase has only been tested on Windows 11.

- Download uv
- Open `cmd` in the repository directory
- Type `uv sync`
- Run each Jupyter notebook using the created virtualenv
To create 'ManualDataset' and 'ManualReducedDataset', the dataset of Wichmann et al. is required. Go to https://github.com/pwichmann/supply_chain_mining and follow the instructions there on obtaining the dataset.
To use Llama-3-8B-Instruct, you need an access token from Hugging Face and permission from Meta to use the model. The necessary steps are:
- Go to https://huggingface.co, and create an account.
- Then, go to https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct and request access to the model. Approval may take several days.
- Once access to the model is granted, go to https://huggingface.co/settings/tokens and create a new access token. Select 'Read' as the 'Token type'.
- The access token (starting with 'hf_') is shown only once, right after it is created. Copy and paste it into the .env file as 'ACCESS_TOKEN':

```
ACCESS_TOKEN=hf_...
```
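The notebooks are expected to read this token from the .env file. As a minimal sketch of how that can work (assuming a plain `KEY=VALUE` layout in .env; the `load_env` helper is illustrative, not part of this codebase), the token can be parsed and then passed to `transformers` when loading the gated model:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments; split only on the first '='.
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                env[key.strip()] = value.strip()
    return env

if os.path.exists(".env"):
    token = load_env()["ACCESS_TOKEN"]
    # The token is then passed when downloading the gated model, e.g.:
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(
    #     "meta-llama/Meta-Llama-3-8B-Instruct", token=token)
```

Libraries such as python-dotenv do the same parsing more robustly; the sketch above only avoids an extra dependency.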
To obtain the case study dataset (mining.json, mining_processed.json), please contact the author ([email protected]).