
[AAAI'26] BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection


whuhxb/BEVDilation

 
 


BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection

AAAI 2026

Guowen Zhang · Chenhang He · Liyi Chen · Lei Zhang

The Hong Kong Polytechnic University
†Corresponding author

arXiv

BEVDilation achieves state-of-the-art performance on the nuScenes dataset. It prioritizes LiDAR information during multi-modal fusion, which makes the fusion both effective and robust.

🔥News

-[25-11-24] BEVDilation is released on arXiv.
-[25-11-24] BEVDilation is accepted to AAAI 2026!

📘TODO

  • Release the arXiv version.
  • Clean up and release the code.

🏆Main Results

nuScenes Dataset

Validation set

| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | ckpt |
|---|---|---|---|---|---|---|---|---|
| BEVDilation | 73.0 | 75.0 | 26.9 | 24.7 | 28.6 | 17.7 | 17.3 | ckpt |

Test set

| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | Leaderboard | Submission |
|---|---|---|---|---|---|---|---|---|---|
| BEVDilation | 73.1 | 75.4 | 24.8 | 23.4 | 33.8 | 17.8 | 11.7 | leaderboard | Submission |

Results of BEVDilation on nuScenes, compared with other leading methods. All experiments were evaluated on an NVIDIA A6000 GPU in the same environment. We hope BEVDilation can serve as a LiDAR-centric solution for efficiently handling multi-modal fusion in 3D tasks.
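As a sanity check on the tables above, the NDS column can be recomputed from the other columns using the nuScenes detection score definition, NDS = (5 · mAP + Σ (1 − min(1, mTP))) / 10, where the sum runs over the five true-positive errors (mATE, mASE, mAOE, mAVE, mAAE). The sketch below assumes that every table value is reported multiplied by 100; the function name `nds` is illustrative and not part of this repository.

```python
def nds(map_pct, tp_errors_pct):
    """nuScenes detection score, with all inputs and the output scaled by 100.

    Assumption: the table lists mAP and the five TP errors
    (mATE, mASE, mAOE, mAVE, mAAE) multiplied by 100.
    """
    # Each TP error is clipped to [0, 100] and converted to a score.
    tp_scores = [100.0 - min(100.0, err) for err in tp_errors_pct]
    # mAP carries weight 5, each of the five TP scores carries weight 1.
    return (5.0 * map_pct + sum(tp_scores)) / 10.0


# Validation and test rows from the tables above.
val_nds = nds(73.0, [26.9, 24.7, 28.6, 17.7, 17.3])
test_nds = nds(73.1, [24.8, 23.4, 33.8, 17.8, 11.7])
print(round(val_nds, 1), round(test_nds, 1))
```

Under that scaling assumption, both recomputed values agree with the reported NDS after rounding to one decimal place.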

🚀Usage

Installation

Please refer to INSTALL.md for installation.

Dataset Preparation

Please follow the instructions from DAL. We adopt the same data generation process.

BEVDilation/
├── data/
│   └── nuscenes/
│       ├── samples/          # Sensor data (keyframes)
│       ├── sweeps/           # Sensor data (intermediate frames)
│       ├── maps/             # Map data (optional)
│       ├── v1.0-trainval/    # Metadata for train and val splits
│       ├── v1.0-test/        # Metadata for test split
│       ├── bevdetv3-nuscenes_gt_database/
│       ├── bevdetv3-nuscenes_dbinfos_train.pkl
│       ├── bevdetv3-nuscenes_infos_train.pkl
│       └── bevdetv3-nuscenes_infos_val.pkl
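
Before launching a long training run, it can help to confirm that the layout above is in place. The following is a minimal sketch, not a script shipped with this repository: the helper name `missing_entries` is hypothetical, the entry list is copied from the tree above, and the optional `maps/` directory is deliberately not checked. Drop `v1.0-test/` from the list if you only train and validate.

```python
from pathlib import Path

# Entries copied from the directory tree above; a trailing "/" marks a directory.
EXPECTED = [
    "samples/",
    "sweeps/",
    "v1.0-trainval/",
    "v1.0-test/",
    "bevdetv3-nuscenes_gt_database/",
    "bevdetv3-nuscenes_dbinfos_train.pkl",
    "bevdetv3-nuscenes_infos_train.pkl",
    "bevdetv3-nuscenes_infos_val.pkl",
]


def missing_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = []
    for entry in EXPECTED:
        path = root / entry
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing


# Run from the repository root; an empty list means the layout is complete.
print(missing_entries("data/nuscenes"))
```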

Generate the Hilbert template, following Voxel Mamba:

# run from the repository root so that ./tools resolves
mkdir -p data/hilbert
python ./tools/create_hilbert_curve_template.py

You can also download Hilbert Template files from Google Drive or BaiduYun (code: mwd4).
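
For intuition, a Hilbert template can be thought of as a lookup table that maps each grid cell to its position along a Hilbert space-filling curve, so that cells close together on the curve are also close together in space. The sketch below builds such a table for a small 2D grid using the standard iterative coordinate-to-index algorithm. It is an illustration only: the function names are hypothetical, and it does not reproduce `create_hilbert_curve_template.py`, whose dimensionality and file format are not described here.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n x n grid.

    n must be a power of two; x and y are integers in [0, n).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Quadrants are visited in the order (0,0), (0,1), (1,1), (1,0).
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip so the sub-curve has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_template(n):
    """Lookup table: template[x][y] is the curve position of cell (x, y)."""
    return [[hilbert_index(n, x, y) for y in range(n)] for x in range(n)]


print(hilbert_template(2))  # [[0, 1], [3, 2]]
```

Sorting voxels by such an index turns a grid into a 1D sequence in which consecutive elements are always spatial neighbors, which is the property that makes a precomputed template useful for sequence models.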

Training

# multi-gpu training (8 GPUs), run from the repository root
./tools/dist_train.sh configs/bevdilation/bevdilation.py 8

Test

# multi-gpu testing
./tools/dist_test.sh configs/bevdilation/bevdilation.py ./checkpoint_path 8 --eval mAP

Citation

Please consider citing our work as follows if it is helpful.

@inproceedings{zhang2024bevdilation,
  title={BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection},
  author={Zhang, Guowen and He, Chenhang and Chen, Liyi and Zhang, Lei},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}

Acknowledgments

BEVDilation is based on DAL.
We also thank the Voxel Mamba, DAL, OpenPCDet, and MMDetection3D authors for their efforts.
