Implementation of the paper "DQFormer: Towards Unified LiDAR Panoptic Segmentation with Decoupled Queries"

Introduction

LiDAR panoptic segmentation (LPS) performs semantic and instance segmentation for things (foreground objects) and stuff (background elements), and is essential for scene perception and remote sensing. We present DQFormer, a novel framework for unified LPS that employs a decoupled query workflow to adapt to the characteristics of things and stuff in outdoor scenes. It first uses a feature encoder to extract multi-scale voxel-wise, point-wise, and BEV features. A decoupled query generator then proposes informative queries by localizing things/stuff positions and fusing multi-level BEV embeddings. Finally, a query-oriented mask decoder applies masked cross-attention to decode segmentation masks, which are combined with query semantics to produce panoptic results. Extensive experiments on large-scale outdoor scenes, including the vehicular datasets nuScenes and SemanticKITTI as well as the aerial point cloud dataset DALES, show that DQFormer outperforms previous leading methods by +1.8%, +0.9%, and +3.5% in panoptic quality (PQ), respectively.

Get Started

Running the code primarily consists of four steps:

  1. Installation: Prepare the required environments
  2. Datasets Preparation: Prepare the datasets
  3. Training: Execute the training scripts
  4. Validation: Run the evaluation scripts

Installation

Install dependencies (tested with python=3.8.17, pytorch==1.10.0, and cuda==11.3; other versions may also work):

git clone https://github.com/yuyang-cloud/DQFormer.git
pip install torch==1.10.1+cu113 torchvision==0.11.2+cu113 torchaudio==0.10.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html

# Install requirements
pip install -r requirements.txt

# Install DQFormer package by running in the root directory of this repo
pip install -U -e .

Install sptr

cd dqformer/third_party/SparseTransformer && python setup.py install

Note: Make sure gcc and CUDA are installed and that nvcc works (installing CUDA via conda does not provide nvcc; in that case, install CUDA manually).
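
A quick check such as the following can confirm that the toolchain and the PyTorch CUDA build are in place before or after building sptr:

# Check that the compilers are available
gcc --version
nvcc --version

# Check that PyTorch sees the GPU and reports its CUDA build
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"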

Download the pre-trained weights for the backbone and place them in the DQFormer/ckpts directory.

Datasets Preparation

SemanticKITTI

Download the SemanticKITTI dataset from here. Unzip it and arrange it as follows:

DQFormer/
└── dqformer/
    └── data/
        └── kitti/
            └── sequences/
                ├── 00/
                │   ├── velodyne/
                │   │   ├── 000000.bin
                │   │   ├── 000001.bin
                │   │   └── ...
                │   └── labels/
                │       ├── 000000.label
                │       ├── 000001.label
                │       └── ...
                ├── 08/   # for validation
                ├── 11/   # 11-21 for testing
                ├── .../
                └── 21/
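
As a quick sanity check of the layout, a comparison like the one below can confirm that every scan in a sequence has a matching label (sequence 08, the validation split, is used here only as an example):

cd DQFormer/dqformer/data/kitti/sequences
ls 08/velodyne | wc -l   # number of .bin scans
ls 08/labels | wc -l     # number of .label files; the two counts should match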

nuScenes

Download the nuScenes dataset from here, which includes the full dataset, the nuScenes-lidarseg dataset, and the nuScenes-panoptic dataset. Arrange the downloaded data in the following folder structure:

nuscene
├── lidarseg
├── maps
├── panoptic
├── samples
├── sweeps
├── v1.0-mini
├── v1.0-test
└── v1.0-trainval

Then, use the nuscenes2kitti script to convert the nuScenes format into the SemanticKITTI format:

python dqformer/utils/nuscenes2kitti.py --nuscenes_dir <nuscenes_directory> --output_dir <output_directory>

This will create a directory for each scene in <output_directory>, maintaining the same structure as SemanticKITTI. Then create a symbolic link from DQFormer/dqformer/data/nuscenes to <output_directory>:

ln -s <output_directory> DQFormer/dqformer/data/nuscenes

DQFormer/
└── dqformer/
    └── data/
        └── nuscenes/
            ├── 0001/
            │   ├── velodyne/
            │   │   ├── 000000.bin
            │   │   ├── 000001.bin
            │   │   └── ...
            │   ├── labels/
            │   │   ├── 000000.label
            │   │   ├── 000001.label
            │   │   └── ...
            │   ├── calib.txt
            │   ├── files_mapping.txt
            │   ├── lidar_tokens.txt
            │   └── poses.txt
            ├── .../
            └── 1110/
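
For reference, a worked example of the conversion and linking steps might look like the following; the input and output paths are placeholders and should be replaced with your own:

# Placeholder paths for illustration only
python dqformer/utils/nuscenes2kitti.py --nuscenes_dir /data/nuscenes --output_dir /data/nuscenes_kitti
ln -s /data/nuscenes_kitti DQFormer/dqformer/data/nuscenes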

Training

cd DQFormer/dqformer
CUDA_VISIBLE_DEVICES=[GPU_IDs] python scripts/train_model.py

Set GPU_IDs to the GPU indices to use for training; when multiple IDs are given, training runs in parallel across them.
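
For example, to train on two GPUs (the IDs below are illustrative):

CUDA_VISIBLE_DEVICES=0,1 python scripts/train_model.py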

Validation

cd DQFormer/dqformer
CUDA_VISIBLE_DEVICES=[GPU_IDs] python scripts/evaluate_model.py --w [path_to_model]

Set GPU_IDs to the GPU indices to use for evaluation; when multiple IDs are given, evaluation runs in parallel across them. Note that per-class performance is reported only when a single GPU is used.
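
For example, to obtain per-class results, run the evaluation on a single GPU:

CUDA_VISIBLE_DEVICES=0 python scripts/evaluate_model.py --w [path_to_model]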

Acknowledgement

This repo benefits from SphereFormer, CenterFormer and MaskPLS.

License

This project is released under the MIT License as found in the LICENSE file.
