- CUDA >= 12.3
- Please see `apptainer/config.def`
- Download the COCO dataset from here
- Install the dataset into `./dataset/coco/`
- Download the ImageNet dataset from here
- Install the dataset into `./dataset/imagenet/`
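
Before launching a run, it can help to confirm that both dataset directories are in place. The following is a minimal sketch, not part of this repository: the helper name `missing_datasets` is hypothetical, and it only assumes the two install paths listed above (`./dataset/coco/` and `./dataset/imagenet/`). It does not check the internal layout of either dataset.

```python
from pathlib import Path

# Directories named in the setup steps above; the helper itself is hypothetical.
EXPECTED_DIRS = ("dataset/coco", "dataset/imagenet")


def missing_datasets(root=".", expected=EXPECTED_DIRS):
    """Return the expected dataset directories that are absent or empty under root."""
    missing = []
    for rel in expected:
        path = Path(root) / rel
        # A directory counts as installed only if it exists and holds at least one entry.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_datasets()
    if problems:
        print("Missing or empty dataset directories:", ", ".join(problems))
    else:
        print("Dataset directories look populated.")
```

Run it from the repository root so that the relative paths resolve to the same locations the setup steps use.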
python3 main/train.py --config_path config/01_post-pre-training/clip-refine.yaml

python3 main/test.py --config_path config/01_post-pre-training/clip-refine.yaml

@inproceedings{Yamaguchi_CVPR25_CLIP-Refine,
title={Post-pre-training for Modality Alignment in Vision-Language Foundation Models},
author={Yamaguchi, Shin'ya and Feng, Dewei and Kanai, Sekitoshi and Adachi, Kazuki and Chijiwa, Daiki},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}