This API can load all four referring expression datasets: RefClef, RefCOCO, RefCOCO+ and RefCOCOg. They come with different train/val/test splits, provided by UNC, Google and UC Berkeley. We provide all of these splits here. Note that RefCOCO+ may change in the future, as we are still cleaning it; any changes will be announced.
If you use any of the three datasets collected by UNC (RefClef, RefCOCO and RefCOCO+), please consider citing our EMNLP 2014 paper; if you want to compare with our recent results, please see our ECCV 2016 paper.
Kazemzadeh, Sahar, et al. "ReferItGame: Referring to Objects in Photographs of Natural Scenes." EMNLP 2014.
Yu, Licheng, et al. "Modeling Context in Referring Expressions." ECCV 2016.
To install this package along with its dependencies, run:
pip install -U .
This package depends on NumPy, Matplotlib, scikit-image and Cython. It also depends on the mscoco API mask routines, which are compiled during setup; the mask-related code is copied from the mscoco API.
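Before running the install, it can help to see which of the listed dependencies are already present. The following is a minimal sketch, not part of this package: the helper name `missing_dependencies` is hypothetical, and the import names (for example `skimage` for scikit-image) are the usual ones but are stated here as assumptions.

```python
import importlib.util

# Display name -> import name. The import names are assumptions based on
# how these packages are normally imported (scikit-image as "skimage").
REQUIRED_MODULES = {
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "scikit-image": "skimage",
    "Cython": "Cython",
}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that the modules are importable; it does not check versions or whether the mask routines compiled correctly.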
Download the cleaned data and extract it into the "data" folder.
In addition, add "mscoco" to the data/images folder; the images can be obtained from mscoco.
COCO's images are used for RefCOCO, RefCOCO+ and RefCOCOg.
For RefCLEF, please add saiapr_tc-12 to the data/images folder. We extracted the 19997 images relevant to our cleaned RefCLEF dataset; they are a subset of the original ImageCLEF collection. Download the subset and unzip it to data/images/saiapr_tc-12.
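The folder layout described above can be checked with a short script before loading anything. This is a sketch under assumptions: the helper names `image_dir` and `missing_image_dirs` are hypothetical and not part of this package, and only the two image folders named above (data/images/mscoco and data/images/saiapr_tc-12) are checked, not the annotation files.

```python
from pathlib import Path

# Image folder used by each dataset, as described in the text above.
IMAGE_DIRS = {
    "refclef": "saiapr_tc-12",
    "refcoco": "mscoco",
    "refcoco+": "mscoco",
    "refcocog": "mscoco",
}


def image_dir(data_root, dataset):
    """Return the expected image folder for a dataset under data_root."""
    return Path(data_root) / "images" / IMAGE_DIRS[dataset]


def missing_image_dirs(data_root):
    """Return the sorted, de-duplicated image folders that do not exist."""
    expected = {image_dir(data_root, name) for name in IMAGE_DIRS}
    return sorted(str(path) for path in expected if not path.is_dir())


if __name__ == "__main__":
    for path in missing_image_dirs("data"):
        print("Missing image folder:", path)
```

Running it from the repository root prints one line per missing folder and nothing if both folders are in place.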
The refer module (referit/refer.py) can load all four datasets with the different data splits provided by UNC, Google and UC Berkeley.
from referit import REFER
# set data_root to your own data folder, then choose the dataset and split_by you want to use
refer = REFER(data_root, dataset='refclef', split_by='unc')
refer = REFER(data_root, dataset='refclef', split_by='berkeley') # 2 training images and 1 testing image are missing
refer = REFER(data_root, dataset='refcoco', split_by='unc')
refer = REFER(data_root, dataset='refcoco', split_by='google')
refer = REFER(data_root, dataset='refcoco+', split_by='unc')
refer = REFER(data_root, dataset='refcocog', split_by='google') # testing data has not been released yet
refer = REFER(data_root, dataset='refcocog', split_by='umd') # train/val/test split provided by UMD (recommended)
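The dataset and split combinations listed above can be collected into a small lookup so that a typo is caught before any data is loaded. This is a sketch only: `SUPPORTED_SPLITS` and `check_split` are hypothetical names, not part of the package, and the table reflects just the combinations shown in the examples above, so the package itself may accept others.

```python
# (dataset -> split_by values) copied from the examples above.
SUPPORTED_SPLITS = {
    "refclef": ("unc", "berkeley"),
    "refcoco": ("unc", "google"),
    "refcoco+": ("unc",),
    "refcocog": ("google", "umd"),
}


def check_split(dataset, split_by):
    """Return (dataset, split_by) if the pair is listed, else raise ValueError."""
    if dataset not in SUPPORTED_SPLITS:
        raise ValueError(
            f"Unknown dataset {dataset!r}; expected one of {sorted(SUPPORTED_SPLITS)}"
        )
    if split_by not in SUPPORTED_SPLITS[dataset]:
        raise ValueError(
            f"Split {split_by!r} is not listed for {dataset!r}; "
            f"expected one of {SUPPORTED_SPLITS[dataset]}"
        )
    return dataset, split_by


if __name__ == "__main__":
    dataset, split_by = check_split("refcocog", "umd")
    print(dataset, split_by)
    # The validated pair could then be passed on, for example:
    # refer = REFER(data_root, dataset=dataset, split_by=split_by)
```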