Official implementation of the paper: Attention Debiasing for Token Pruning in Vision–Language Models.
Vision–language models (VLMs) typically encode substantially more visual tokens than text tokens, resulting in significant token redundancy. Pruning uninformative visual tokens is therefore crucial for improving computational efficiency. However, we find that attention in VLMs is systematically biased in two ways (see the diagnostic sketch after this list):
- Recency Bias: It disproportionately favors tokens appearing later in the sequence (over-attention to lower image regions).
- Attention Sink: It assigns inflated scores to semantically empty padding tokens.
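Both biases can be observed directly from a model's attention weights. Below is a minimal diagnostic sketch (not part of the released code); the tensor shapes and the LLaVA-v1.5-style constants (576 visual tokens starting at position 35) are assumptions made for illustration.

```python
import torch

def positional_attention_profile(attn: torch.Tensor,
                                 vis_start: int, vis_end: int) -> torch.Tensor:
    """attn: [batch, heads, seq_len, seq_len] softmax attention weights.
    Returns the mean attention received by each visual token in
    [vis_start, vis_end), averaged over batch, heads, and query positions."""
    received = attn[..., vis_start:vis_end]  # attention paid *to* visual tokens
    return received.mean(dim=(0, 1, 2))      # [num_visual_tokens]

# Random weights stand in for a real forward pass in this illustration.
attn = torch.softmax(torch.randn(1, 32, 700, 700), dim=-1)
profile = positional_attention_profile(attn, vis_start=35, vis_end=35 + 576)
# Under recency bias, the tail of this profile dominates the head; under an
# attention sink, padding positions show inflated values.
print(profile[:5], profile[-5:])
```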
We introduce two lightweight, training-free, and plug-and-play debiasing techniques that restore the reliability of attention as a pruning criterion (a minimal illustrative sketch follows the feature list below).
- Model-Agnostic: Works across various VLM architectures (LLaVA-v1.5, Video-LLaVA, etc.).
- Pruning-Method-Agnostic: Enhances existing attention-based pruning methods (FastV, PyramidDrop, SparseVLM, HiMAP, TokenCarve, iLLaVA).
- Training-Free: No retraining or fine-tuning required.
- Efficient: Incurs negligible computational overhead.
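To make the idea concrete, here is a hypothetical sketch of how a debiased pruning criterion could look. The exact corrections used in the paper may differ; the function name, the padding mask, and the linear detrending below are illustrative assumptions, not the released API.

```python
import torch

def debiased_pruning_scores(scores: torch.Tensor,
                            pad_mask: torch.Tensor) -> torch.Tensor:
    """scores: [N] raw attention each visual token receives from the query.
    pad_mask: [N] bool tensor, True where the visual token is padding."""
    scores = scores.clone()
    # (1) Attention-sink correction: padding tokens are never retained.
    scores[pad_mask] = float("-inf")
    # (2) Recency-bias correction: fit a linear positional trend on the
    #     non-padding tokens and subtract it, so later tokens no longer
    #     win by position alone.
    pos = torch.arange(scores.numel(), dtype=scores.dtype)
    x, y = pos[~pad_mask], scores[~pad_mask]
    slope = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    scores[~pad_mask] = y - slope * x
    return scores

# Retain the top-k tokens under the debiased criterion (stand-in inputs):
raw = torch.rand(576)                     # raw attention scores
pad = torch.zeros(576, dtype=torch.bool)  # padding mask
keep = torch.topk(debiased_pruning_scores(raw, pad), k=144).indices
```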
```bash
git clone --recursive https://github.com/intcomp/attention-bias.git
cd attention-bias
```

We provide a utility script to apply our debiasing techniques to the integrated pruning baselines:

```bash
bash scripts/apply_patches.sh
```

The environment requirements are primarily based on LLaVA. You can follow the installation steps in the respective submodule directories (e.g., FastV, SparseVLMs). Generally, you will need:
- Python 3.10+
- PyTorch 2.0+
- CUDA 11.7+
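A quick, illustrative way to verify these requirements before running anything:

```python
import sys
import torch

assert sys.version_info >= (3, 10), "Python 3.10+ is required"
print("PyTorch:", torch.__version__)        # expect 2.0 or newer
print("CUDA available:", torch.cuda.is_available())
print("CUDA build:", torch.version.cuda)    # expect 11.7 or newer
```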
Please follow the instructions in LLaVA/Evaluation to download and organize the datasets. Ensure your data structure matches the LLaVA guidelines.
We provide scripts to run evaluations for different pruning methods with our debiasing techniques enabled.
```bash
# Example: Run FastV evaluation
bash scripts/run_fastv.sh

# Example: Run HiMAP evaluation
bash scripts/run_himap.sh
```

Available scripts in `scripts/`:

- `run_fastv.sh`
- `run_himap.sh`
- `run_pdrop.sh`
- `run_sparsevlm.sh`
- `run_tokencarve.sh`
Our method effectively suppresses the retention of bias-favored padding and bottom-region tokens while preserving semantically important visual tokens.
If you find our work useful in your research, please consider citing:
```bibtex
@article{zhao2026attention,
  title={Attention Debiasing for Token Pruning in Vision–Language Models},
  author={Zhao, Kai and Yuan, Wubang and Lin, Yuchen and Ruan, Liting and Lu, Xiaofeng and Fan, Deng-Ping and Cheng, Ming-Ming and Zeng, Dan},
  journal={arXiv preprint},
  year={2026}
}
```