This repository contains the implementation of the IEEE CBMI 2025 paper "ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval" by Guanqi Zhan*, Yuanpei Liu*, Kai Han, Weidi Xie and Andrew Zisserman.
The datasets we use are COCO, Flickr, Occluded COCO, and ImageNet-R.
For ELIP-C, ELIP-S, and ELIP-S2, change into the ELIP-C directory and follow the corresponding README. For ELIP-B, change into the ELIP-B directory and consult its README.
This repository builds on OpenCLIP and LAVIS. We thank the authors of these well-organized codebases.
If you find this repo useful for your research, please consider citing our paper:
@inproceedings{Zhan2025ELIP,
author = {Zhan, Guanqi and Liu, Yuanpei and Han, Kai and Xie, Weidi and Zisserman, Andrew},
title = {ELIP: Enhanced Visual-Language Foundation Models for Image Retrieval},
booktitle = {International Conference on Content-Based Multimedia Indexing (CBMI)},
year = {2025}
}