Sangbeom Lim1* · Junwan Kim2* · Heeji Yoon3 · Jaewoo Jung3 · Seungryong Kim3†
1Korea University 2Yonsei University 3KAIST AI
*: Equal Contribution
†: Corresponding Author
ArXiv 2025
URECA can generate Unique Caption for Any Granularity Regions!
- 2025-07-08: URECA training dataset is released!
- 2025-04-08: Our ArXiv Paper is released!
- 🌟 Featured: URECA is now highlighted as a Paper of the Day on the Hugging Face Daily Papers page! 🌟
- 2025-04-06: Training Code, Data collection pipeline, and URECA Model are released.
- 2025-04-06: URECA is released.
Please stay tuned for the URECA test dataset and evaluation code!
- Train Code (Apr 6, 2025)
- Pre-trained weights (Apr 6, 2025)
- Code of interactive demo (Apr 6, 2025)
- Demo update (Apr 6, 2025)
- Release ArXiv paper (Apr 8, 2025)
- Training Dataset release (Jul 8, 2025)
- Evaluation Code
- Test Dataset release
conda create -n ureca python=3.9
conda activate ureca
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
pip install -r requirements.txt
Please download SAM and place it in the models folder.
Download the URECA model by running the script below.
mkdir models
cd models
git lfs install
git clone https://huggingface.co/SammyLim/URECA
mkdir sam
# Download the SAM-H model weights manually!
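Before launching the demo, it can help to confirm that the download steps above left the expected folders in place. The following is a minimal sketch, not part of the URECA repository: the helper name `check_model_layout` is hypothetical, and it assumes the SAM weights are stored as a `.pth` file inside `models/sam` (the exact location the code expects is not stated here).

```python
from pathlib import Path


def check_model_layout(models_dir="models"):
    """Return a list of problems found; an empty list means the layout looks complete."""
    root = Path(models_dir)
    problems = []

    # The URECA model is cloned from Hugging Face into models/URECA.
    ureca_dir = root / "URECA"
    if not ureca_dir.is_dir():
        problems.append(f"missing {ureca_dir}: clone the URECA model first")
    elif not any(ureca_dir.iterdir()):
        problems.append(f"{ureca_dir} is empty: the clone may have failed")

    # Assumption: the manually downloaded SAM-H checkpoint is a .pth file in models/sam.
    sam_dir = root / "sam"
    if not sam_dir.is_dir():
        problems.append(f"missing {sam_dir}: create it and add the SAM-H weights")
    elif not list(sam_dir.glob("*.pth")):
        problems.append(f"no .pth checkpoint found in {sam_dir}")

    return problems


if __name__ == "__main__":
    issues = check_model_layout()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Model folders look complete.")
```

Run it from the repository root. Note that it only checks that files exist, so it will not detect Git LFS pointer files left behind when `git lfs` was not installed before cloning.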
python gradio_demo/app.py
We release our URECA training dataset, which contains 138,152 mask-caption pairs! To obtain the image-mask pairs, please download SA-1B. The URECA training caption file is available from the Hugging Face link!
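Since the image-mask pairs come from SA-1B, a small index over the SA-1B annotation files can make it easier to look up individual masks. The sketch below is illustrative only: it assumes the per-image JSON layout distributed with SA-1B (an `image` object with `image_id` and `file_name`, and an `annotations` list whose entries carry `id` and `bbox`), and the function name `index_sa1b_masks` is hypothetical. How the URECA caption file refers to masks is not described here, so joining captions to this index is left out.

```python
import json
from pathlib import Path


def index_sa1b_masks(sa1b_dir):
    """Map (image_id, annotation_id) to the image file name and mask bounding box."""
    index = {}
    for json_path in sorted(Path(sa1b_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            record = json.load(f)

        image = record["image"]
        for ann in record["annotations"]:
            # Key on both ids so the lookup does not rely on annotation ids
            # being unique across images.
            index[(image["image_id"], ann["id"])] = {
                "file_name": image["file_name"],
                "bbox": ann["bbox"],  # [x, y, width, height] in pixels
            }
    return index
```

The segmentation masks themselves are stored in COCO run-length encoding, so decoding them needs an extra library such as `pycocotools`, which is omitted here to keep the sketch dependency-free.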
Please use the following bibtex to cite our work:
@article{lim2025ureca,
title={URECA: Unique Region Caption Anything},
author={Lim, Sangbeom and Kim, Junwan and Yoon, Heeji and Jung, Jaewoo and Kim, Seungryong},
journal={arXiv preprint arXiv:2504.05305},
year={2025}
}
This project is largely based on the InternVL repository. Thanks to the authors for their invaluable work and contributions.