Boosting_WRIS

Official code for our NeurIPS 2024 paper "Boosting Weakly-Supervised Referring Image Segmentation via Progressive Comprehension".

Overview

We propose a Progressive Comprehension Network (PCNet) that leverages target-related textual cues from the input description to progressively localize the target object. We first use a Large Language Model (LLM) to decompose the input text description into short phrases. These phrases serve as target-related cues and are fed into a Conditional Referring Module (CRM) over multiple stages, so that the response map used for target localization is refined stage by stage. We also propose a Region-aware Shrinking (RaS) loss to constrain the visual localization across stages, and an Instance-aware Disambiguation (IaD) loss to suppress instance localization ambiguity.
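The multi-stage idea can be illustrated with a toy sketch. It is not PCNet's implementation: the additive query update merely stands in for the learned CRM, the embeddings are hand-written vectors, and `response_map` and `progressive_localize` are hypothetical names. It only shows how phrase cues, applied one stage at a time, reshape a response map over image locations.

```python
import math


def softmax(scores):
    """Normalize raw scores into a distribution over image locations."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def response_map(pixel_feats, query):
    """Score every location by its dot product with the text query."""
    scores = [sum(p * q for p, q in zip(feat, query)) for feat in pixel_feats]
    return softmax(scores)


def progressive_localize(pixel_feats, sentence_emb, cue_embs):
    """Return one response map per stage, each conditioned on one more cue.

    Assumption: the cue is simply added to the query. The real CRM is a
    learned module; see the paper and the model/ directory.
    """
    query = list(sentence_emb)
    maps = []
    for cue in cue_embs:
        query = [q + c for q, c in zip(query, cue)]
        maps.append(response_map(pixel_feats, query))
    return maps


# Three locations with 2-D features; two cues that jointly single out location 0.
pixel_feats = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
maps = progressive_localize(pixel_feats, [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
for stage, m in enumerate(maps, start=1):
    print(f"stage {stage}: {[round(v, 3) for v in m]}")
```

In this toy setup the first cue cannot separate locations 0 and 1, while the second stage concentrates the response on location 0.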

Preparation

Data

Extract short phrases from each referring expression with an LLM (Mistral or another LLM).
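The README does not include the prompt or the file format stored in Mistral_7B_Ref_Sents/, so the following is only a sketch of what this step might look like. The prompt wording, the one-phrase-per-line reply format, the `max_phrases` limit, and the helper names `build_prompt` and `parse_phrases` are all assumptions; the call to the LLM itself is left out.

```python
import re

# Hypothetical prompt; the wording actually used by the authors is not shown here.
PROMPT_TEMPLATE = (
    "Decompose the referring expression into short phrases that describe "
    "the target object. Return one phrase per line.\n"
    "Expression: {sentence}\n"
    "Phrases:"
)


def build_prompt(sentence):
    """Fill the template with one referring expression."""
    return PROMPT_TEMPLATE.format(sentence=sentence.strip())


def parse_phrases(reply, max_phrases=3):
    """Turn a line-per-phrase LLM reply into a clean list of phrases.

    Strips list markers such as "1." or "-", drops empty lines and
    case-insensitive duplicates, and keeps at most max_phrases entries.
    """
    phrases, seen = [], set()
    for line in reply.splitlines():
        phrase = re.sub(r"^\s*(?:\d+[.)]|[-*])\s*", "", line).strip()
        key = phrase.lower()
        if phrase and key not in seen:
            seen.add(key)
            phrases.append(phrase)
    return phrases[:max_phrases]


# Example with a hand-written reply standing in for real LLM output.
reply = "1. a man\n2. wearing a red hat\n- a man\n3. standing on the left"
print(parse_phrases(reply))
# -> ['a man', 'wearing a red hat', 'standing on the left']
```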

Extract mask proposals with SAM.
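A sketch of this step with the `segment_anything` package (the repository ships a segment-anything directory) is shown below. `sam_model_registry` and `SamAutomaticMaskGenerator` are part of that package; the model type, the checkpoint path, the top-k filtering, and the helper names are assumptions, and the on-disk format expected in SAM_Mask/ is not specified here.

```python
def generate_proposals(image_rgb, checkpoint_path, model_type="vit_h"):
    """Run SAM's automatic mask generator on one HxWx3 uint8 RGB image.

    Needs the segment_anything package and a downloaded SAM checkpoint,
    so it is imported lazily and not executed in this sketch.
    """
    from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    generator = SamAutomaticMaskGenerator(sam)
    # Each proposal is a dict with keys such as "segmentation", "area",
    # "bbox" and "predicted_iou".
    return generator.generate(image_rgb)


def keep_top_proposals(proposals, top_k=10, min_area=0):
    """Hypothetical post-filter: drop small masks, keep the top_k by predicted IoU."""
    large_enough = [p for p in proposals if p["area"] >= min_area]
    ranked = sorted(large_enough, key=lambda p: p["predicted_iou"], reverse=True)
    return ranked[:top_k]


# Stand-in proposals with only the fields the filter reads.
fake = [
    {"area": 50, "predicted_iou": 0.95},
    {"area": 4000, "predicted_iou": 0.80},
    {"area": 9000, "predicted_iou": 0.91},
]
print([p["area"] for p in keep_top_proposals(fake, top_k=2, min_area=100)])
# -> [9000, 4000]
```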

The overall dataset structure is as follows:

coco/
├── annotations/
│   ├── captions_train2014.json
│   ├── instances_train2014.json
├── refer/
│   ├── grefcoco/
│   ├── refcoco/
│   ├── refcoco+/
│   │   ├── Mistral_7B_Ref_Sents/
│   │   ├── SAM_Mask/
│   │   ├── SOLO_Mask/
│   │   ├── instances.json
│   │   └── refs(unc).p
│   ├── refcocog/
│   │   └── new_grefs_unc.json
│   ├── refcoco.zip
│   ├── refcoco+.zip
│   └── refcocog.zip
├── train2014/
    ├── COCO_train2014_000000000009.jpg
    ├── COCO_train2014_000000000025.jpg
    ├── COCO_train2014_000000000034.jpg

Set refer_data_root at line 94 of dataset/ReferDataset to the path of the coco directory.
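Before training or evaluation it can help to confirm that the directory matches this layout. The sketch below checks only entries that appear in the tree above, using refcoco+ as the example; `EXPECTED` and `missing_entries` are hypothetical names, and the other datasets may need different files.

```python
from pathlib import Path

# Paths relative to the coco root, taken from the tree above (refcoco+ only).
# SOLO_Mask is left out because the Data steps only mention SAM proposals.
EXPECTED = [
    "annotations/captions_train2014.json",
    "annotations/instances_train2014.json",
    "refer/refcoco+/Mistral_7B_Ref_Sents",
    "refer/refcoco+/SAM_Mask",
    "refer/refcoco+/instances.json",
    "refer/refcoco+/refs(unc).p",
    "train2014",
]


def missing_entries(coco_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under coco_root."""
    root = Path(coco_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "coco"
    missing = missing_entries(root)
    if missing:
        print("Missing under", root)
        for rel in missing:
            print("  ", rel)
    else:
        print("Layout looks complete for refcoco+.")
```

Run it with the same path that refer_data_root points to; an empty "missing" list means every listed entry was found.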

Environment
```
conda env create -f environment.yml
```

Evaluation

Download the checkpoint weights, then run:

dataset=refcoco+
splitBy=unc
test_split=val
model_path="weights/ckpt_hit.pth"
out_put_dir=""


CUDA_VISIBLE_DEVICES=0 python validate.py \
    --batch_size 1 \
    --size 320 \
    --dataset ${dataset} \
    --splitBy ${splitBy} \
    --test_split ${test_split} \
    --max_query_len 20 \
    --attn_multi_vis 0.1 \
    --attn_multi_text 0.1 \
    --output ${out_put_dir} \
    --resume \
    --K_Iters 1 \
    --pretrain  "${model_path}" \
    --cam_save_dir "${out_put_dir}/cam_${test_split}" \
    --name_save_dir "${out_put_dir}/name_save" \
    --save_train_pseudo_dir "${out_put_dir}/train_pseudo" \
    --save_cam \
    --eval
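To evaluate several splits in one go, the command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above; `build_validate_cmd` is a hypothetical helper, the split names `testA` and `testB` are the usual UNC splits for refcoco+ rather than something confirmed here, and `outputs/eval` is a placeholder directory. Note that with `out_put_dir=""`, paths such as `${out_put_dir}/cam_${test_split}` expand to `/cam_val` at the filesystem root, so a non-empty directory is used here.

```python
import shlex


def build_validate_cmd(dataset, split_by, test_split, model_path, output_dir):
    """Assemble the validate.py call with the same flags as the shell command."""
    return [
        "python", "validate.py",
        "--batch_size", "1",
        "--size", "320",
        "--dataset", dataset,
        "--splitBy", split_by,
        "--test_split", test_split,
        "--max_query_len", "20",
        "--attn_multi_vis", "0.1",
        "--attn_multi_text", "0.1",
        "--output", output_dir,
        "--resume",
        "--K_Iters", "1",
        "--pretrain", model_path,
        "--cam_save_dir", f"{output_dir}/cam_{test_split}",
        "--name_save_dir", f"{output_dir}/name_save",
        "--save_train_pseudo_dir", f"{output_dir}/train_pseudo",
        "--save_cam",
        "--eval",
    ]


# Print one command per split. To execute instead, pass the list to
# subprocess.run(cmd, check=True) with CUDA_VISIBLE_DEVICES set in the environment.
for split in ["val", "testA", "testB"]:
    cmd = build_validate_cmd(
        "refcoco+", "unc", split, "weights/ckpt_hit.pth", "outputs/eval"
    )
    print(shlex.join(cmd))
```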

Acknowledgement

This repository is largely based on TRIS.

Citation

@article{yang2024boosting,
  title={Boosting weakly-supervised referring image segmentation via progressive comprehension},
  author={Yang, Zaiquan and Liu, Yuhao and Lin, Jiaying and Hancke, Gerhard and Lau, Rynson WH},
  journal={arXiv preprint arXiv:2410.01544},
  year={2024}
}

Contact

If you have any questions, please feel free to reach out at [email protected].
