| name | pretrain | resolution | acc@1 (%) | #param | FLOPs | download |
|---|---|---|---|---|---|---|
| DAMamba-T | ImageNet-1K | 224x224 | 83.8 | 26M | 4.8G | ckpt |
| DAMamba-S | ImageNet-1K | 224x224 | 84.8 | 45M | 10.3G | ckpt |
| DAMamba-B | ImageNet-1K | 224x224 | 85.2 | 86M | 16.3G | ckpt |
- ImageNet is an image database organized according to the WordNet hierarchy. Download and extract the ImageNet train and val images from http://image-net.org/, then organize the data into the following directory structure:

      imagenet/
      ├── train/
      │   ├── n01440764/  (Example synset ID)
      │   │   ├── image1.JPEG
      │   │   ├── image2.JPEG
      │   │   └── ...
      │   ├── n01443537/  (Another synset ID)
      │   │   └── ...
      │   └── ...
      └── val/
          ├── n01440764/  (Example synset ID)
          │   ├── image1.JPEG
          │   └── ...
          └── ...

- COCO is a large-scale object detection, segmentation, and captioning dataset. Please visit http://cocodataset.org/ for more information, including the data, paper, and tutorials. The COCO API also provides a concise and efficient way to process the data.
- ADE20K is composed of more than 27K images from the SUN and Places databases. Please visit https://ade20k.csail.mit.edu/ for more information, and see the GitHub Repository for an overview of how to access and explore ADE20K.
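
Before launching training, it can help to confirm that the ImageNet folder matches the layout described above. The following is a minimal sketch using only the Python standard library; it is not part of this repository, and the names `summarize_split` and `check_imagenet_layout` are hypothetical helpers introduced here for illustration. It assumes each synset folder directly contains `.JPEG` (or `.jpg`) files.

```python
from pathlib import Path

# Assumption: images use the usual ImageNet extensions (case-insensitive).
IMAGE_SUFFIXES = {".jpeg", ".jpg"}


def summarize_split(split_dir):
    """Return {synset_id: image_count} for one split directory (train or val)."""
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing split directory: {split_dir}")
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_imagenet_layout(root):
    """Summarize <root>/train and <root>/val and flag common layout problems."""
    root = Path(root)
    summary = {split: summarize_split(root / split) for split in ("train", "val")}
    # Synsets that exist in train but have no folder in val.
    missing_in_val = sorted(set(summary["train"]) - set(summary["val"]))
    # Synset folders that contain no images at all.
    empty_dirs = sorted(
        f"{split}/{synset}"
        for split, counts in summary.items()
        for synset, n in counts.items()
        if n == 0
    )
    return {
        "train_classes": len(summary["train"]),
        "val_classes": len(summary["val"]),
        "train_images": sum(summary["train"].values()),
        "val_images": sum(summary["val"].values()),
        "missing_in_val": missing_in_val,
        "empty_dirs": empty_dirs,
    }
```

Calling `check_imagenet_layout("imagenet")` (with the path adjusted to the local dataset root) returns a small report dictionary. For a complete ImageNet-1K copy, one would expect 1000 classes in both splits and empty `missing_in_val` and `empty_dirs` lists.
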
@article{li2025damamba,
title={DAMamba: Vision State Space Model with Dynamic Adaptive Scan},
author={Li, Tanzhe and Li, Caoshuo and Lyu, Jiayi and Pei, Hongjuan and Zhang, Baochang and Jin, Taisong and Ji, Rongrong},
journal={arXiv preprint arXiv:2502.12627},
year={2025}
}

This project is largely based on Mamba, VMamba, Swin-Transformer, InternImage, and OpenMMLab. We are truly grateful for their excellent work.
This project is released under the Apache 2.0 license.