UniMamba

Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin♠️,2,3, Haisheng Su♠️,1,3 📧, Kai Liu3, Cong Ma3, Wei Wu3,4, Fei Hui3 📧, Junchi Yan1 📧

1 School of Computer Science, Shanghai Jiao Tong University

2 Chang'an University, 3 SenseAuto Research, 4 Tsinghua University

♠️ Equal Contributions, 📧 Corresponding authors

Paper

News

  • Mar. 9th, 2025: We released our paper on arXiv. Code and models are coming soon. Please stay tuned! ☕️
  • Mar. 9th, 2025: Our paper has been accepted to CVPR 2025!


Introduction

Recent advances in LiDAR-based 3D detection have demonstrated the effectiveness of Transformer frameworks at capturing global dependencies in point cloud spaces, serializing the 3D voxels into a flattened 1D sequence for iterative self-attention. However, the spatial structure of the 3D voxels is inevitably destroyed during serialization. Moreover, because of the considerable number of 3D voxels and the quadratic complexity of Transformers, the sequence must be partitioned into multiple groups before being fed to the Transformer, which limits the receptive field.

Inspired by the impressive performance of State Space Models (SSMs), we propose Unified Mamba (UniMamba), which seamlessly integrates the merits of 3D convolution and SSMs in a concise multi-head manner to perform "local and global" spatial context aggregation efficiently and simultaneously. Specifically, each UniMamba block consists of three main components: spatial locality modeling, complementary Z-order serialization, and a local-global sequential aggregator. The spatial locality modeling module uses 3D submanifold convolution to capture dynamic spatial position embeddings before serialization. The efficient Z-order curve then serializes the voxels both horizontally and vertically. Finally, the local-global sequential aggregator applies a channel grouping strategy to efficiently encode both local and global spatial inter-dependencies using a multi-head SSM. Stacked UniMamba blocks further form an encoder-decoder architecture that facilitates hierarchical multi-scale spatial learning.

Extensive experiments are conducted on three popular datasets: nuScenes, Waymo, and Argoverse 2. Notably, UniMamba achieves 70.2 mAP on the nuScenes dataset.
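As a rough illustration of the Z-order serialization step described above, the sketch below computes Morton codes by bit-interleaving integer voxel coordinates and sorts the voxels along the resulting curve. This is not taken from the released code; the function names and the 10-bit-per-axis limit are assumptions for the sake of the example.

```python
import numpy as np

def _spread_bits(v):
    # Insert two zero bits between each of the low 10 bits of v,
    # so three axes can be interleaved into one 30-bit Morton code.
    v = v & 0x3FF
    v = (v | (v << 16)) & 0x030000FF
    v = (v | (v << 8))  & 0x0300F00F
    v = (v | (v << 4))  & 0x030C30C3
    v = (v | (v << 2))  & 0x09249249
    return v

def z_order_codes(coords):
    # Morton codes for an (N, 3) integer array of (x, y, z) voxel indices.
    x, y, z = coords[:, 0], coords[:, 1], coords[:, 2]
    return (_spread_bits(z) << 2) | (_spread_bits(y) << 1) | _spread_bits(x)

def serialize(coords):
    # Permutation that orders voxels along the Z-order curve.
    return np.argsort(z_order_codes(coords.astype(np.int64)), kind="stable")
```

The point of this ordering is that spatially adjacent voxels tend to land near each other in the flattened 1D sequence, so a 1D sequence model scanning the serialized voxels still sees mostly local neighborhoods.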

Framework

Evaluation on nuScenes dataset

Evaluation on Waymo Open dataset

Evaluation on Argoverse 2 dataset

Getting Started

TBD

License

This project is released under the MIT license.

Contact

If you have any questions, please contact Haisheng Su via email ([email protected]).

Citation

If you find UniMamba useful in your research or applications, please consider giving us a star 🌟 and citing it using the following BibTeX entry:

```bibtex
@article{su2025unimamba,
  title={UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection},
  author={Jin, Xin and Su, Haisheng and Liu, Kai and Ma, Cong and Wu, Wei and Hui, Fei and Yan, Junchi},
  journal={arXiv preprint arXiv:2503.12009},
  year={2025}
}
```
