DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models (ICLR 2025)

Introduction

This paper aims to mitigate hallucinations in Vision-Language Models (VLMs) by accumulating visual information from earlier layers, where we found that the correct information often appears. By refining activations throughout inference, DAMO preserves essential visual semantics, leading to more accurate and reliable predictions.
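To give a rough intuition for carrying early-layer information forward, here is a toy sketch that keeps an exponential moving average over per-layer activation vectors. It is an illustration only and is not DAMO's actual update rule (see the paper for that); the function name `accumulate_momentum` and the coefficient `beta` are hypothetical and do not come from this repository.

```python
def accumulate_momentum(layer_activations, beta=0.9):
    """Toy illustration: running exponential moving average across layers.

    layer_activations: list of equal-length lists of floats, one per layer,
                       ordered from the earliest layer to the latest.
    beta:              hypothetical momentum coefficient in [0, 1]; larger
                       values keep more of the earlier layers' signal.
    Returns one blended vector per layer.
    """
    momentum = None
    refined = []
    for activation in layer_activations:
        if momentum is None:
            # The first layer initialises the running momentum.
            momentum = list(activation)
        else:
            # Blend the accumulated earlier-layer signal with the current layer.
            momentum = [
                beta * m + (1.0 - beta) * a
                for m, a in zip(momentum, activation)
            ]
        refined.append(momentum)
    return refined


if __name__ == "__main__":
    layers = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    for depth, vector in enumerate(accumulate_momentum(layers, beta=0.5)):
        print(depth, vector)
```

With `beta=0.5`, the first layer's signal still contributes a quarter of the blended vector at the third layer, which is the sense in which early information is preserved rather than overwritten.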

Here is the paper link: https://openreview.net/forum?id=JUr0YOMvZA

Usage (Taking LLaVA1.5 as an example)

Please create the LLaVA-1.5 environment by following the official repository:
```
https://github.com/haotian-liu/LLaVA.git
```
Then replace the following files in the official LLaVA checkout with the versions from this repository:

- LLaVA/llava/model/language_model/llava_llama.py
- LLaVA/llava/model/llava_arch.py
- LLaVA/llava/eval/run_llava.py

To evaluate on the MME benchmark, run `CUDA_VISIBLE_DEVICES=0 python llava_mme.py --output_dir DAMO`.
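The three file replacements described above can also be scripted. The sketch below is a minimal helper, not part of this repository: the function name `overlay_files`, its arguments, and the `.orig` backup suffix are hypothetical, and it assumes the modified files sit under the same relative paths inside this repository's LLaVA folder.

```python
import shutil
from pathlib import Path

# Paths of the three files named above, relative to the LLaVA checkout root.
FILES_TO_REPLACE = [
    "llava/model/language_model/llava_llama.py",
    "llava/model/llava_arch.py",
    "llava/eval/run_llava.py",
]


def overlay_files(damo_llava_dir, official_llava_dir, backup_suffix=".orig"):
    """Copy the modified files over an official LLaVA checkout.

    damo_llava_dir:     folder holding the modified files (assumed layout).
    official_llava_dir: root of the official LLaVA checkout.
    An existing target is backed up once, next to the original file.
    Returns the list of destination paths that were written.
    """
    src_root = Path(damo_llava_dir)
    dst_root = Path(official_llava_dir)
    written = []
    for rel in FILES_TO_REPLACE:
        src = src_root / rel
        dst = dst_root / rel
        if not src.is_file():
            raise FileNotFoundError(f"missing modified file: {src}")
        if dst.is_file():
            backup = dst.with_name(dst.name + backup_suffix)
            if not backup.exists():
                # Keep the official version so the change can be undone.
                shutil.copy2(dst, backup)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        written.append(dst)
    return written
```

For example, `overlay_files("DAMO/LLaVA", "LLaVA")` would apply the overlay if the two checkouts sit side by side; both directory names here are placeholders for wherever the repositories were cloned.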

Citation

@inproceedings{wangdamo,
  title={DAMO: Decoding by Accumulating Activations Momentum for Mitigating Hallucinations in Vision-Language Models},
  author={Wang, Kaishen and Gu, Hengrui and Gao, Meijun and Zhou, Kaixiong},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
