Guowen Zhang · Chenhang He · Liyi Chen · Lei Zhang†
The Hong Kong Polytechnic University
†Corresponding author
BEVDilation achieves state-of-the-art performance on the nuScenes dataset. It prioritizes LiDAR information in multi-modal fusion, which makes the fusion both effective and robust.
- [25-11-24] BEVDilation is released on arXiv.
- [25-11-24] BEVDilation is accepted by AAAI26!
- Release the arXiv version.
- Clean up and release the code.
Validation set
| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | ckpt |
|---|---|---|---|---|---|---|---|---|
| BEVDilation | 73.0 | 75.0 | 26.9 | 24.7 | 28.6 | 17.7 | 17.3 | ckpt |
Test set
| Model | mAP | NDS | mATE | mASE | mAOE | mAVE | mAAE | Leaderboard | Submission |
|---|---|---|---|---|---|---|---|---|---|
| BEVDilation | 73.1 | 75.4 | 24.8 | 23.4 | 33.8 | 17.8 | 11.7 | leaderboard | Submission |
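As a sanity check on the two tables, the NDS column can be recomputed from the other columns. The sketch below uses the standard nuScenes definition, NDS = (5 · mAP + Σ (1 − min(1, mTP))) / 10, and assumes the five error columns are the usual nuScenes true-positive errors scaled by 100, as the table values suggest. The function name `nds` is ours and is not part of this repository.

```python
def nds(map_pct, tp_errors_pct):
    """nuScenes Detection Score (in percent) from mAP and the five mean TP errors.

    All inputs are given in percent, matching the tables above (assumption:
    mATE, mASE, mAOE, mAVE and mAAE are reported as raw value x 100).
    """
    m_ap = map_pct / 100.0
    # Each TP error is clipped to 1 and converted into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err / 100.0) for err in tp_errors_pct]
    return 100.0 * (5.0 * m_ap + sum(tp_scores)) / 10.0


# Rows copied from the validation and test tables.
val_nds = nds(73.0, [26.9, 24.7, 28.6, 17.7, 17.3])
test_nds = nds(73.1, [24.8, 23.4, 33.8, 17.8, 11.7])
print(f"val NDS: {val_nds:.1f}, test NDS: {test_nds:.1f}")
```

Both values agree with the reported NDS after rounding to one decimal place.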
BEVDilation's results on nuScenes compared with other leading methods. All experiments are evaluated on an NVIDIA A6000 GPU in the same environment. We hope that BEVDilation can serve as a LiDAR-centric solution for efficient multi-modal fusion in 3D tasks.
Please refer to INSTALL.md for installation.
Please follow the data preparation instructions from DAL; we adopt the same data generation process. The expected directory layout is:
BEVDilation/
├── data/
│   └── nuscenes/
│       ├── samples/                              # Sensor data (keyframes)
│       ├── sweeps/                               # Sensor data (intermediate frames)
│       ├── maps/                                 # Map data (optional)
│       ├── v1.0-trainval/                        # Metadata for train and val splits
│       ├── v1.0-test/                            # Metadata for test split
│       ├── bevdetv3-nuscenes_gt_database/
│       ├── bevdetv3-nuscenes_dbinfos_train.pkl
│       ├── bevdetv3-nuscenes_infos_train.pkl
│       └── bevdetv3-nuscenes_infos_val.pkl
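Before launching training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not a script shipped with this repository: the helper name `missing_entries` is hypothetical, and the list of entries is simply copied from the tree (`maps/` is skipped because it is marked optional).

```python
from pathlib import Path

# Entries copied from the directory tree; maps/ is optional and not checked.
EXPECTED_ENTRIES = [
    "samples",
    "sweeps",
    "v1.0-trainval",
    "v1.0-test",
    "bevdetv3-nuscenes_gt_database",
    "bevdetv3-nuscenes_dbinfos_train.pkl",
    "bevdetv3-nuscenes_infos_train.pkl",
    "bevdetv3-nuscenes_infos_val.pkl",
]


def missing_entries(nuscenes_root):
    """Return the expected files or directories that do not exist under nuscenes_root."""
    root = Path(nuscenes_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_entries("data/nuscenes")
    if missing:
        print("Missing under data/nuscenes:", ", ".join(missing))
    else:
        print("data/nuscenes layout looks complete.")
```

If only the validation split is used, `v1.0-test` may legitimately be absent and can be dropped from the list.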
Generate the Hilbert curve templates, following Voxel Mamba:
# run from the repository root
mkdir -p data/hilbert
python ./tools/create_hilbert_curve_template.py
You can also download the Hilbert template files from Google Drive or BaiduYun (code: mwd4).
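For readers unfamiliar with the idea, a Hilbert template is a lookup table that maps each grid cell to its position along a Hilbert space-filling curve, so that cells close together on the grid tend to stay close together in the serialized order. The sketch below illustrates this in 2D with the classic bit-wise algorithm. It is only an illustration under our own naming (`hilbert_index`, `hilbert_template`); it does not reproduce the output format of `create_hilbert_curve_template.py`, which remains the authoritative generator.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n x n grid.

    n must be a power of two; x and y are integers in [0, n).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the canonical orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hilbert_template(n):
    """Lookup table: template[x][y] is the curve position of cell (x, y)."""
    return [[hilbert_index(n, x, y) for y in range(n)] for x in range(n)]


if __name__ == "__main__":
    for row in hilbert_template(4):
        print(row)
```

Sorting voxels by their template value yields a 1D sequence in which consecutive elements are always neighbouring cells, which is the locality property that curve-based serialization relies on.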
# multi-gpu training (run from the repository root)
./tools/dist_train.sh configs/bevdilation/bevdilation.py 8
# multi-gpu testing
./tools/dist_test.sh configs/bevdilation/bevdilation.py ./checkpoint_path 8 --eval mAP
Please consider citing our work as follows if it is helpful.
@inproceedings{zhang2024bevdilation,
title={BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection},
author={Zhang, Guowen and He, Chenhang and Chen, Liyi and Zhang, Lei},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2026}
}
BEVDilation is based on DAL.
We also thank the Voxel Mamba, DAL, OpenPCDet, and MMDetection3D authors for their efforts.