A set of scripts for the COCO dataset. Any .qsub files are examples of HPC scripts that can be used to download the required files and run the necessary scripts.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install numpy
The BUTD features can be downloaded following the instructions here.
make_bu_data.py: Extracts the features from the tsv files and creates the required directories. See make_bu_data.qsub for an example of how the script is called.
prepro_labels.py: Used by some codebases to preprocess the captions.
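To give a rough idea of what reading the BUTD tsv files involves, here is a minimal sketch that decodes one row at a time. It is not the code in make_bu_data.py. The column names and the base64-encoded float32 layout are assumptions based on the public bottom-up-attention feature release, and `iter_tsv_rows` and `decode_row` are hypothetical helper names used only for illustration.

```python
import base64
import csv

import numpy as np

# Assumed column order of the BUTD tsv files (not confirmed by this repo).
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]

# Feature fields are very long strings, so raise the csv field size limit.
csv.field_size_limit(2**31 - 1)


def decode_row(row):
    """Turn one tsv row (a dict of strings) into (image_id, boxes, features)."""
    num_boxes = int(row["num_boxes"])
    # Assumption: both arrays are base64-encoded raw float32 buffers.
    boxes = np.frombuffer(base64.b64decode(row["boxes"]), dtype=np.float32)
    features = np.frombuffer(base64.b64decode(row["features"]), dtype=np.float32)
    return (
        int(row["image_id"]),
        boxes.reshape(num_boxes, -1),     # one row of coordinates per box
        features.reshape(num_boxes, -1),  # one feature vector per box
    )


def iter_tsv_rows(handle):
    """Yield decoded rows from an open text handle over a BUTD-style tsv file."""
    reader = csv.DictReader(handle, delimiter="\t", fieldnames=FIELDNAMES)
    for row in reader:
        yield decode_row(row)
```

A caller would open one of the downloaded tsv files in text mode, loop over `iter_tsv_rows(handle)`, and save each `features` array under a name derived from `image_id`. The actual output layout is whatever make_bu_data.py defines.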
The python scripts required to
coco-download.sh: Downloads the COCO dataset images and the Karpathy split JSON file.
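Once the Karpathy split JSON file is downloaded, a quick sanity check is to count the images in each split. The sketch below assumes the commonly distributed layout (a top-level `images` list whose entries carry `split`, `filename`, and `sentences` with a `raw` caption string). The function names are hypothetical, and the file path is left to the caller because it depends on where coco-download.sh places the file.

```python
import json
from collections import Counter


def load_karpathy(path):
    """Load the Karpathy split JSON from a caller-supplied path."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


def split_counts(dataset):
    """Count images per split (e.g. train, val, test, restval)."""
    return Counter(img["split"] for img in dataset["images"])


def captions_for_split(dataset, split):
    """Map each image filename in the given split to its raw captions."""
    return {
        img["filename"]: [sent["raw"] for sent in img["sentences"]]
        for img in dataset["images"]
        if img["split"] == split
    }
```

Calling `split_counts(load_karpathy(path))` on the downloaded file should show whether every expected split is present before any preprocessing is run.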