
RCA: Region Conditioned Adaptation for Visual Abductive Reasoning (ACM Multimedia 2024)

This is the official implementation of the paper RCA: Region Conditioned Adaptation for Visual Abductive Reasoning, by Hao Zhang, Yeo Keat Ee, and Basura Fernando. We achieved the top rank on the official Sherlock Abductive Reasoning Leaderboard and the top DHPR retrieval performance.

July 19, 2024

  • Released the RCA-V1 version (the version used in the paper) to the public.

Model Zoo

| Model | Backbone | Tuned (M ↓) | im→txt (↓) | txt→im (↓) | P@1→I (↑) | GT / Auto-Box (↑) | Human Acc (↑) | Model Link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LXMERT [1] from [4] | F-RCNN | NA | 51.10 | 48.80 | 14.90 | 69.50 / 30.30 | 21.10 | NA |
| UNITER [2] from [4] | F-RCNN | NA | 40.40 | 40.00 | 19.80 | 73.00 / 33.30 | 22.90 | NA |
| CPT [3] from [4] | RN50×64 | NA | 16.35 | 17.72 | 33.44 | 87.22 / 40.60 | 27.12 | NA |
| CPT [3] from [4] | ViT-B-16 | 149.62 | 19.85 | 21.64 | 30.56 | 85.33 / 36.60 | 21.31 | pth |
| RCA + Dual-Contrast Loss | ViT-B-16 | 42.26 | 13.92 | 16.58 | 35.42 | 88.08 / 42.32 | 27.51 | pth |
| CPT [3] (our impl) | ViT-L-14 | 428.53 | 13.08 | 14.91 | 37.21 | 87.85 / 41.99 | 29.58 | pth |
| RCA + Dual-Contrast Loss | ViT-L-14 | 89.63 | 10.14 | 12.65 | 40.36 | 89.72 / 44.73 | 31.74 | pth |

References: [1] LXMERT, [2] UNITER, [3] CPT, [4] SHERLOCK.
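
The pth links in the table point to released checkpoints. As a rough sketch of inspecting one before wiring it into a model (not the repository's own loading code; the local filename rca_vitb16.pth is hypothetical):

import torch

# Load a downloaded RCA checkpoint on CPU; replace the hypothetical filename
# with the path of whichever "pth" file you downloaded from the table above.
ckpt = torch.load("rca_vitb16.pth", map_location="cpu")

# Checkpoints are commonly either a raw state_dict or a dict wrapping one;
# unwrap a "state_dict" key if present, then list a few parameter shapes.
if isinstance(ckpt, dict):
    state_dict = ckpt.get("state_dict", ckpt)
    for name, value in list(state_dict.items())[:5]:
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(name, shape)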

Installation

cd train_code_v2.20.0_RCA_CLIP
pip install -r requirements.txt
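
After installing, a quick way to confirm that PyTorch and the GPU are visible (assuming torch is among the pinned requirements, which is our reading of a CLIP-based training codebase) is:

import torch

# Print the installed PyTorch version and whether a CUDA device is available.
print(torch.__version__, torch.cuda.is_available())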

Quick Start

Train

Prepare data

Create a folder named Sherlock and put the following files in it (the annotations can be downloaded directly from annotations.zip; please download the images from Sherlock). A small sanity-check sketch follows the directory layout below.

Sherlock
|_sherlock_val_with_split_idxs_v1_1.json
|_sherlock_train_v1_1.json
|
|_test_localization_public
|_test_retrieval_public
|_test_comparison_public
|_val_localization
|_val_retrieval
|_val_comparison
|
|_images
  |_vcr1images
  |        |_vcr1images_0.jpg
  |        |_...
  |
  |_VG_100K
  |        |_vcr1images_1.jpg
  |        |_...
  |
  |_VG_100K_2
          |_vcr1images_2.jpg
          |_...
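
A quick layout check for the folder above (a sketch only; SHERLOCK_ROOT is a placeholder for wherever you created the Sherlock folder):

from pathlib import Path

# Verify that the Sherlock folder matches the layout described above.
SHERLOCK_ROOT = Path("Sherlock")  # placeholder path; adjust to your location

expected = [
    "sherlock_val_with_split_idxs_v1_1.json",
    "sherlock_train_v1_1.json",
    "test_localization_public",
    "test_retrieval_public",
    "test_comparison_public",
    "val_localization",
    "val_retrieval",
    "val_comparison",
    "images/vcr1images",
    "images/VG_100K",
    "images/VG_100K_2",
]

missing = [p for p in expected if not (SHERLOCK_ROOT / p).exists()]
print("Layout looks complete." if not missing else f"Missing entries: {missing}")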

Evaluate
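
The im→txt and txt→im columns in the Model Zoo table are retrieval scores where lower is better; assuming they are mean ranks of the ground-truth pairing (our reading of the leaderboard metric, not this repository's exact evaluation script), they can be computed from an image-text similarity matrix roughly as follows:

import numpy as np

def mean_ranks(sim):
    # sim: (N x N) similarity matrix where sim[i, i] scores the ground-truth pair.
    # Returns 1-based mean ranks for im->txt and txt->im retrieval (lower is better).
    # This is a generic sketch of a mean-rank metric, not the repo's evaluation code.
    order_it = np.argsort(-sim, axis=1)            # texts sorted per image, best first
    ranks_it = [int(np.where(order_it[i] == i)[0][0]) + 1 for i in range(sim.shape[0])]
    order_ti = np.argsort(-sim, axis=0)            # images sorted per text, best first
    ranks_ti = [int(np.where(order_ti[:, j] == j)[0][0]) + 1 for j in range(sim.shape[1])]
    return float(np.mean(ranks_it)), float(np.mean(ranks_ti))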

Contributors

RCA is implemented and maintained by Dr. Hao Zhang.

Citing

If you find the paper helpful for your work, please consider citing the following:

@inproceedings{zhang2024rca,
  title={{RCA: Region Conditioned Adaptation for Visual Abductive Reasoning}},
  author={Hao Zhang and Yeo Keat Ee and Basura Fernando},
  booktitle={ACM Multimedia},
  year={2024}
}
@inproceedings{hesselhwang2022abduction,
  title={{The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning}},
  author={*Hessel, Jack and *Hwang, Jena D and Park, Jae Sung and Zellers, Rowan and Bhagavatula, Chandra and Rohrbach, Anna and Saenko, Kate and Choi, Yejin},
  booktitle={ECCV},
  year={2022}
}
@article{10568360,
  author={Charoenpitaks, Korawat and Nguyen, Van-Quang and Suganuma, Masanori and Takahashi, Masahiro and Niihara, Ryoma and Okatani, Takayuki},
  journal={IEEE Transactions on Intelligent Vehicles}, 
  title={Exploring the Potential of Multi-Modal AI for Driving Hazard Prediction}, 
  year={2024},
  volume={},
  number={},
  pages={1-11},
  keywords={Hazards;Cognition;Videos;Automobiles;Accidents;Task analysis;Natural languages;Vision;Language;Reasoning;Traffic Accident Anticipation},
  doi={10.1109/TIV.2024.3417353}
}

Acknowledgement

Thanks to the following GitHub repositories:

License

The code is released under the Apache-2.0 license (CODE_LICENSE) and the dataset under the CC-BY-4.0 license (DATASET_LICENSE).