FTI4CIR

[SIGIR 2024] - Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval (FTI4CIR).

Installation

Clone the repository

git clone https://github.com/ZiChao111/FTI4CIR.git

Running Environment

Platform: NVIDIA A100 40G
Python 3.9.12
PyTorch 2.2.0

Data Preparation

ImageNet

Download the ImageNet1K (ILSVRC2012) test set following the instructions on the official site.

After downloading the dataset, ensure that the folder structure matches the following:

├── ImageNet
│   ├── test
│   │   ├── [ILSVRC2012_test_[00000001 | ... | 00100000].JPEG]
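
The layout above can be checked before training. The following is a minimal sketch, not part of the repository: the function name `check_imagenet_test` and the `expected` parameter are illustrative, and the file-name pattern is taken from the tree shown above.

```python
import re
from pathlib import Path

# Pattern from the layout above: ILSVRC2012_test_ + 8 digits + .JPEG
TEST_NAME = re.compile(r"^ILSVRC2012_test_(\d{8})\.JPEG$")


def check_imagenet_test(root, expected=100_000):
    """Return (number of matching files, sorted list of missing indices).

    `root` is the ImageNet folder that contains the `test` subfolder.
    Hypothetical helper; the repository may load the data differently.
    """
    test_dir = Path(root) / "test"
    found = set()
    for path in test_dir.iterdir():
        match = TEST_NAME.match(path.name)
        if match:
            found.add(int(match.group(1)))
    # Indices run from 1 to `expected` inclusive, as in the tree above.
    missing = sorted(set(range(1, expected + 1)) - found)
    return len(found), missing
```

Calling `check_imagenet_test("ImageNet")` on a complete download should report no missing indices; files that do not match the pattern are ignored.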

FashionIQ

Download the FashionIQ dataset following the instructions in the official repository.

After downloading the dataset, ensure that the folder structure matches the following:

├── FashionIQ
│   ├── captions
│   │   ├── cap.dress.[train | val | test].json
│   │   ├── cap.toptee.[train | val | test].json
│   │   ├── cap.shirt.[train | val | test].json

│   ├── image_splits
│   │   ├── split.dress.[train | val | test].json
│   │   ├── split.toptee.[train | val | test].json
│   │   ├── split.shirt.[train | val | test].json

│   ├── dress
│   │   ├── [B000ALGQSY.jpg | B000AY2892.jpg | B000AYI3L4.jpg | ...]

│   ├── shirt
│   │   ├── [B00006M009.jpg | B00006M00B.jpg | B00006M6IH.jpg | ...]

│   ├── toptee
│   │   ├── [B0000DZQD6.jpg | B000A33FTU.jpg | B000AS2OVA.jpg | ...]
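
As a quick sanity check of the FashionIQ layout, the expected caption files, split files, and image folders can be enumerated and tested for existence. This is a hedged sketch: the helper names are hypothetical and the file names are derived only from the tree above.

```python
from pathlib import Path

# Categories and splits as listed in the folder structure above.
CATEGORIES = ("dress", "toptee", "shirt")
SPLITS = ("train", "val", "test")


def expected_fashioniq_paths(root):
    """List every folder and JSON file the layout above requires."""
    root = Path(root)
    paths = []
    for category in CATEGORIES:
        paths.append(root / category)  # image folder for this category
        for split in SPLITS:
            paths.append(root / "captions" / f"cap.{category}.{split}.json")
            paths.append(root / "image_splits" / f"split.{category}.{split}.json")
    return paths


def missing_fashioniq_paths(root):
    """Return the expected paths that do not exist under `root`."""
    return [p for p in expected_fashioniq_paths(root) if not p.exists()]
```

An empty result from `missing_fashioniq_paths("FashionIQ")` means the folder matches the structure above; the check does not inspect the individual images.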

CIRR

Download the CIRR dataset following the instructions in the official repository.

After downloading the dataset, ensure that the folder structure matches the following:

├── CIRR
│   ├── train
│   │   ├── [0 | 1 | 2 | ...]
│   │   │   ├── [train-10108-0-img0.png | train-10108-0-img1.png | ...]

│   ├── dev
│   │   ├── [dev-0-0-img0.png | dev-0-0-img1.png | ...]

│   ├── test1
│   │   ├── [test1-0-0-img0.png | test1-0-0-img1.png | ...]

│   ├── cirr
│   │   ├── captions
│   │   │   ├── cap.rc2.[train | val | test1].json
│   │   ├── image_splits
│   │   │   ├── split.rc2.[train | val | test1].json

CIRCO

Download the CIRCO dataset following the instructions in the official repository.

After downloading the dataset, ensure that the folder structure matches the following:

├── CIRCO
│   ├── annotations
│   │   ├── [val | test].json

│   ├── COCO2017_unlabeled
│   │   ├── annotations
│   │   │   ├── image_info_unlabeled2017.json
│   │   ├── unlabeled2017
│   │   │   ├── [000000243611.jpg | 000000535009.jpg | ...]

Pre-trained model

The pre-trained model is available on Google Drive.

Pre-training Phase

We generate the image captions with BLIP.

Sample running code for training:

python src/train.py \
    --save-frequency 1 \
    --batch-size=256 \
    --lr=4e-5 \
    --wd=0.01 \
    --epochs=60 \
    --model-dir="./model_save" \
    --workers=8 \
    --model ViT-L/14
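
The sample command mixes `--flag value` and `--flag=value`. If `src/train.py` parses its options with Python's standard `argparse` (an assumption; the script is not shown here), both spellings are equivalent. The parser below is a hypothetical stand-in that mirrors only the flags in the sample command, with defaults copied from that command rather than from the repository.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags in the sample command.

    The real src/train.py may define more options or different defaults.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--save-frequency", type=int, default=1)
    parser.add_argument("--batch-size", type=int, default=256)
    parser.add_argument("--lr", type=float, default=4e-5)
    parser.add_argument("--wd", type=float, default=0.01)
    parser.add_argument("--epochs", type=int, default=60)
    parser.add_argument("--model-dir", type=str, default="./model_save")
    parser.add_argument("--workers", type=int, default=8)
    parser.add_argument("--model", type=str, default="ViT-L/14")
    return parser


# Both spellings produce the same parsed namespace.
spaced = build_parser().parse_args(["--batch-size", "256", "--lr", "4e-5"])
joined = build_parser().parse_args(["--batch-size=256", "--lr=4e-5"])
assert spaced == joined
```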

Inference Phase

Validation (split=val)

To evaluate on the validation split of FashionIQ, CIRR, or CIRCO, run:

python src/evaluate.py \
    --dataset='cirr' \
    --save-path='' \
    --model-path=""

    --dataset <str>                 Dataset to use, options: ['fashioniq', 'cirr', 'circo']
    --model-path <str>              Path of the pre-trained model
    --save-path <str>               Path to save the predictions file

Test (split=test)

To generate the predictions file for upload to the CIRR Evaluation Server or the CIRCO Evaluation Server using our model, run the following command:

python src/test.py \
    --dataset='cirr' \
    --save-path='' \
    --model-path=""

    --dataset <str>                 Dataset to use, options: ['cirr', 'circo']
    --model-path <str>              Path of the pre-trained model
    --save-path <str>               Path to save the predictions file
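
To produce predictions for both evaluation servers in one go, the command above can be wrapped in a small driver. This is a sketch under stated assumptions: `build_test_command` and `run_all` are hypothetical helpers, the per-dataset output location under `save_dir` is an illustrative choice, and only the three options documented above are used.

```python
import shlex
import subprocess
import sys

# Dataset names come from the option list above.
TEST_DATASETS = ("cirr", "circo")


def build_test_command(dataset, model_path, save_path):
    """Build the argument list for one src/test.py run."""
    if dataset not in TEST_DATASETS:
        raise ValueError(f"unsupported dataset for test split: {dataset!r}")
    return [
        sys.executable, "src/test.py",
        f"--dataset={dataset}",
        f"--model-path={model_path}",
        f"--save-path={save_path}",
    ]


def run_all(model_path, save_dir, dry_run=True):
    """Print (dry_run=True) or execute one test command per dataset."""
    commands = []
    for dataset in TEST_DATASETS:
        # Assumption: one output location per dataset under save_dir.
        command = build_test_command(dataset, model_path, f"{save_dir}/{dataset}")
        commands.append(command)
        if dry_run:
            print(shlex.join(command))
        else:
            subprocess.run(command, check=True)
    return commands
```

With `dry_run=True` the commands are only printed, which makes it easy to inspect them before launching the actual runs from the repository root.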

Citation


@inproceedings{FTI4CIR,
  author    = {Haoqiang Lin and Haokun Wen and Xuemeng Song and Meng Liu and Yupeng Hu and Liqiang Nie},
  title     = {Fine-grained Textual Inversion Network for Zero-Shot Composed Image Retrieval},
  booktitle = {Proceedings of the International {ACM} SIGIR Conference on Research and Development in Information Retrieval},
  pages     = {240-250},
  publisher = {{ACM}},
  year      = {2024}
}
