A set of scripts for the COCO dataset. Any .qsub files are examples of HPC scripts that can be used to download required files and run necessary scripts.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install numpy

The BUTD features can be downloaded following the instructions here.
make_bu_data.py: Extracts the features out of the tsv files and creates the required directories. See make_bu_data.qsub for an example of how the script is called.
prepro_labels.py: Used by some codebases to preprocess the captions
The python scripts required to
coco-download.sh: Downloads the COCO dataset images and the Karpathy Split JSON file
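
To make the tsv extraction step done by make_bu_data.py more concrete, here is a minimal sketch of reading one BUTD feature file. It is not the script itself. It assumes the layout used by the public bottom-up-attention release: six tab-separated columns (image_id, image_w, image_h, num_boxes, boxes, features), with boxes and features stored as base64-encoded little-endian float32 arrays. The names FIELDNAMES, decode_floats, encode_floats and read_tsv are hypothetical, and the sketch uses only the standard library so it runs as-is. With numpy installed, np.frombuffer would be the more usual way to decode the arrays.

```python
import base64
import csv
import io
import struct

# Assumed column order (from the public bottom-up-attention release); check your files.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def decode_floats(b64_text, num_rows):
    """Decode base64 text holding little-endian float32 values into num_rows equal-length rows."""
    raw = base64.b64decode(b64_text)
    count = len(raw) // 4  # 4 bytes per float32
    values = struct.unpack("<%df" % count, raw)
    width = count // num_rows if num_rows else 0
    return [list(values[i * width:(i + 1) * width]) for i in range(num_rows)]


def encode_floats(rows):
    """Inverse of decode_floats; only used here to build a synthetic example row."""
    flat = [value for row in rows for value in row]
    packed = struct.pack("<%df" % len(flat), *flat)
    return base64.b64encode(packed).decode("ascii")


def read_tsv(handle):
    """Yield one dict per image from an open text handle on a feature tsv file."""
    # The feature column is far longer than the csv module's default field limit.
    csv.field_size_limit(2**31 - 1)
    reader = csv.DictReader(handle, delimiter="\t", fieldnames=FIELDNAMES)
    for row in reader:
        num_boxes = int(row["num_boxes"])
        yield {
            "image_id": int(row["image_id"]),
            "boxes": decode_floats(row["boxes"], num_boxes),
            "features": decode_floats(row["features"], num_boxes),
        }


# Synthetic one-image file: 2 boxes with 3-dimensional features (made-up values).
boxes = [[0.0, 0.0, 10.0, 20.0], [5.0, 5.0, 15.0, 25.0]]
features = [[0.5, 1.5, 2.5], [3.5, 4.5, 5.5]]
line = "\t".join(["42", "640", "480", "2", encode_floats(boxes), encode_floats(features)])
records = list(read_tsv(io.StringIO(line + "\n")))
print(records[0]["image_id"], len(records[0]["features"]))
```

For a real file, pass an open handle such as open(path, newline="") to read_tsv instead of the in-memory string. What gets written out per image, and into which directories, is up to make_bu_data.py and is not covered by this sketch.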