Extracting Supply Chain Information from News Articles Using Large Language Models: A Fully Automatic Approach
This codebase has only been tested on Windows 11.

- Download uv
- Open `cmd` in the repository directory
- Type `uv sync`
- Run each Jupyter notebook using the created virtualenv
To create 'ManualDataset' and 'ManualReducedDataset', the dataset of Wichmann et al. is required. Go to https://github.com/pwichmann/supply_chain_mining and follow the instructions there on obtaining the dataset.
To use Llama-3-8B-Instruct, you need an access token from Hugging Face and permission from Meta to use the model. The necessary steps are:
- Go to https://huggingface.co, and create an account.
- Then, go to https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct and request access to the model. Approval may take several days.
- Once access to the model is granted, go to https://huggingface.co/settings/tokens and create a new access token. Select 'Read' as the 'Token type'.
- The access token (starting with 'hf_') is shown only once, right after it is created. Copy and paste it into the .env file as 'ACCESS_TOKEN':

```
ACCESS_TOKEN=hf_...
```
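The notebooks are expected to read this token from the .env file. As a minimal sketch of how that can work (assuming a plain `KEY=VALUE` layout in .env; the `load_env` helper is illustrative, not part of this codebase), the token can be parsed and then passed to `transformers` when loading the gated model:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    env = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments; split only on the first '='.
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                env[key.strip()] = value.strip()
    return env

if os.path.exists(".env"):
    token = load_env()["ACCESS_TOKEN"]
    # The token is then passed when downloading the gated model, e.g.:
    # from transformers import AutoModelForCausalLM
    # model = AutoModelForCausalLM.from_pretrained(
    #     "meta-llama/Meta-Llama-3-8B-Instruct", token=token)
```

Libraries such as python-dotenv do the same parsing more robustly; the sketch above only avoids an extra dependency.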
To obtain the case study dataset (mining.json, mining_processed.json), please contact the author ([email protected]).