*denotes equal contribution
NeurIPS 2025
This is the official implementation of Attention (as Discrete-Time Markov) Chains.
For a straightforward implementation of multi-bounce attention, TokenRank, and lambda-weighting from the paper, see `helpers.py`.
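As a rough, self-contained sketch of the underlying idea (not the actual code in `helpers.py`; the function names and signatures below are illustrative), a row-stochastic attention map can be read as the transition matrix of a discrete-time Markov chain: multi-bounce attention then roughly corresponds to taking powers of that matrix, and TokenRank to its stationary distribution, e.g. via power iteration:

```python
import torch

def multi_bounce(attn: torch.Tensor, n_bounces: int) -> torch.Tensor:
    """Illustrative sketch: propagate attention over several 'bounces'.

    attn: row-stochastic attention matrix of shape (N, N), i.e. each row sums to 1.
    Returns the n-step transition matrix of the induced Markov chain.
    """
    return torch.linalg.matrix_power(attn, n_bounces)

def token_rank(attn: torch.Tensor, n_iters: int = 100, tol: float = 1e-8) -> torch.Tensor:
    """Illustrative sketch: stationary distribution of the attention Markov chain.

    Starts from a uniform distribution over the N tokens and repeatedly
    applies the transition matrix until convergence.
    """
    n = attn.shape[0]
    pi = torch.full((n,), 1.0 / n, dtype=attn.dtype)
    for _ in range(n_iters):
        nxt = pi @ attn  # one Markov-chain step: pi_{t+1} = pi_t @ P
        if torch.norm(nxt - pi, p=1) < tol:
            return nxt
        pi = nxt
    return pi
```

On a single head's softmaxed attention map, `token_rank(attn)` would give a per-token importance score, conceptually similar to PageRank; `helpers.py` contains the paper's actual implementation, including the lambda-weighting.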
We provide a demo for DINOv1/2, CLIP, and supervised ViT (from the `transformers` library) in `demo.ipynb`.
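For orientation, attention maps of a supervised ViT can be pulled from the `transformers` library roughly as follows (a minimal sketch, not the contents of `demo.ipynb`; the checkpoint name and image path are just examples):

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, ViTModel

# Example checkpoint; the demo may use different models.
processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTModel.from_pretrained("google/vit-base-patch16-224")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one (1, heads, tokens, tokens) tensor per layer.
last_layer_attn = outputs.attentions[-1]
print(last_layer_attn.shape)
```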
To visualize attention with FLUX, run:
`python flux.py flux.yml`
You can edit `flux.yml` to tweak the results.
*Note: you must have the libraries imported by `flux.py` installed in your virtual environment.
- Basic functionality
- Visualization demo for FLUX
- Segmentation demo for FLUX
- Demo for DINOv1/2, ViT, CLIP
- Reproduction of experiments
If you find our work useful, please consider giving a star ⭐ and a citation.
@article{erel2025attentionasdiscretetimemarkov,
title = {Attention (as Discrete-Time Markov) Chains},
author = {Erel, Yotam and D{\"u}nkel, Olaf and Dabral, Rishabh and Golyanik, Vladislav and Theobalt, Christian and Bermano, Amit H.},
journal = {arXiv preprint arXiv:2507.17657},
year = {2025}
}