- Download and unzip The CLEF Test Suite for the CLEF 2000-2003 Campaigns – Evaluation Package.
- Set the environment variable `CLEF_HOME` to point to the location of the unzipped dataset.
- (Optional) Download and unzip the Swahili (SW) and Somali (SO) CLEF queries.
- (Optional) Set `CLEF_LOWRES_DIR` in `clef_paths.py` to the directory where you unzipped the low-resource queries.
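
As a quick sanity check after these steps, a minimal sketch like the one below reports which of the expected top-level directories (`DocumentData`, `RelAssess`, `Topics`, as in the layout shown below) are missing. The helper name `missing_clef_dirs` is hypothetical and not part of this repository, and the sketch assumes `CLEF_HOME` points at the `clef/` directory itself.

```python
import os
from pathlib import Path

# Top-level directories taken from the layout this README describes.
# The optional clef-low-resource directory is deliberately not checked.
EXPECTED_DIRS = ("DocumentData", "RelAssess", "Topics")


def missing_clef_dirs(clef_home=None):
    """Return the expected top-level directories that are absent.

    Hypothetical helper for illustration only. If clef_home is None,
    the CLEF_HOME environment variable is used instead.
    """
    root = clef_home if clef_home is not None else os.environ.get("CLEF_HOME")
    if not root:
        raise RuntimeError("CLEF_HOME is not set")
    root = Path(root).expanduser()
    # Keep the order of EXPECTED_DIRS so the report is stable.
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_clef_dirs()` with no argument reads `CLEF_HOME` from the environment; an empty list means all three directories were found.
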
Dataloaders expect the following structure after downloading and unzipping CLEF:
clef/
├── clef-low-resource
│   └── long_paper
├── DocumentData
│   ├── dutch
│   ├── english
│   ├── finnish
│   ├── french
│   ├── german
│   ├── italian
│   └── russian
├── RelAssess
│   ├── 2001
│   ├── 2002
│   └── 2003
└── Topics
    ├── 2001
    ├── 2002
    └── 2003

@inproceedings{Bonab2019swahiliclef,
author = {Bonab, Hamed and Allan, James and Sitaraman, Ramesh},
title = {Simulating CLIR Translation Resource Scarcity Using High-Resource Languages},
year = {2019},
url = {https://doi.org/10.1145/3341981.3344236},
booktitle = {Proceedings of ICTIR},
pages = {129--136},
}

@inproceedings{braschler2003clef,
title={{CLEF} 2003--Overview of results},
author={Braschler, Martin},
booktitle={Workshop of the Cross-Language Evaluation Forum for European Languages},
pages={44--63},
year={2003},
organization={Springer}
}