arXiv
Yu Wang, Xiaobao Wei, Ming Lu, Guoliang Kang$^\dagger$
TIP 2025
$\dagger$ Corresponding author
Robust 3D panoptic scene understanding plays an essential role in applications such as robot grasping and self-driving. Although rapid progress has been made on 2D panoptic segmentation, it remains challenging to obtain 3D panoptic segmentation masks of a specific scene that are consistent at both the semantic and instance levels across different views. The reasons are two-fold. First, directly applying a 2D panoptic segmentation model to images from different views does not yield consistent masks: a single-image segmentation model has no understanding of the 3D scene and is sensitive to view variations, so it produces noisy and view-inconsistent masks. Second, fully understanding a 3D scene requires large amounts of 2D or 3D annotations, which are expensive and time-consuming to obtain. Therefore, a typical way to perform 3D panoptic segmentation is to lift segmentation masks from 2D to 3D, i.e., to train a 3D panoptic segmentation model on machine-generated, noisy, and view-inconsistent 2D segmentation masks from different views.
We recommend using conda for the environment setup. Please follow the steps below:
git clone https://github.com/wangyuyy/PLGS.git
cd PLGS
# create conda environment, we recommend python >= 3.9
conda create -n plgs python=3.9 -y
conda activate plgs
# install pytorch according to your CUDA version, please refer to https://pytorch.org/get-started/locally/ for more details
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
# install torch_scatter
pip install torch_scatter --extra-index-url https://pytorch-geometric.com/whl/
# install faiss
pip install faiss-gpu
# install other dependencies
pip install -r requirements.txt
# for submodules
cd submodules/diff-gaussian-rasterization
pip install -e .
cd ../simple-knn
pip install -e .
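After installation, a quick sanity check can confirm that PyTorch sees the GPU and that the compiled extensions import cleanly. This is a minimal sketch; the submodule package names below are assumed from the upstream 3D Gaussian Splatting repositories:
# check PyTorch version and CUDA availability
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# check torch_scatter and faiss
python -c "import torch_scatter, faiss; print('torch_scatter and faiss OK')"
# check the locally built submodules (import names assumed from upstream 3DGS)
python -c "import diff_gaussian_rasterization, simple_knn; print('submodules OK')"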
We conduct experiments on the Replica, ScanNet, and Hypersim datasets. We recommend downloading the processed data and pretrained models. Alternatively, you can follow Panoptic Lifting to download and process the data yourself, which may be cumbersome.
After downloading the data, please organize the data directory as follows:
├── assets
├── data
│   ├── replica
│   │   ├── room_0
│   │   └── ...
│   ├── scannet
│   │   ├── scene0050_00
│   │   └── ...
│   └── hypersim
│       ├── ai_001_003
│       └── ...
└── ...
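To confirm the layout is in place, you can list one example scene per dataset (the scene names below are taken from the tree above; other scenes follow the same pattern):
# verify that the expected scene folders exist
ls data/replica/room_0
ls data/scannet/scene0050_00
ls data/hypersim/ai_001_003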
Before training, we need to preprocess the data to obtain the initial semantic point cloud and instance masks.
python scene/preprocess_semantic.py --dataset replica
python scene/preprocess_instance.py --dataset replica
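The commands above preprocess Replica. Presumably the other datasets are selected via the same --dataset flag; this is an assumption based on the flag above, so please check each script's argument parser for the accepted values:
# assumed analogous invocations for the other datasets
python scene/preprocess_semantic.py --dataset scannet
python scene/preprocess_instance.py --dataset scannet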
Training can be launched with the following commands:
bash scripts/train_replica.sh <GPU_ID>
bash scripts/train_scannet.sh <GPU_ID>
bash scripts/train_hypersim.sh <GPU_ID>
Following Scaffold-GS, inference and evaluation are integrated into train.py. Alternatively, you can run render.py to render and evaluate a trained model separately.
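For the separate rendering path, an invocation along these lines may work. This is a sketch assuming render.py follows the Scaffold-GS convention of taking the trained model directory via -m; adjust to the actual arguments:
# render and evaluate a trained model (the -m flag is an assumption from Scaffold-GS)
python render.py -m <path_to_trained_model>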
Our code is based on Panoptic Lifting and Scaffold-GS.
Thanks to the authors for their great open-source repositories!
If you find our work useful in your research, please consider citing:
@article{wang2025plgs,
  title={{PLGS}: Robust Panoptic Lifting with {3D} Gaussian Splatting},
  author={Wang, Yu and Wei, Xiaobao and Lu, Ming and Kang, Guoliang},
  journal={IEEE Transactions on Image Processing},
  year={2025},
  publisher={IEEE}
}