BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
This is an official release of the paper BUOL: A Bottom-Up Framework with Occupancy-aware Lifting for Panoptic 3D Scene Reconstruction From A Single Image
Tao Chu, Pan Zhang, Qiong Liu, Jiaqi Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023)
The results of BUOL on each dataset are shown below. We have released the models.
| dataset | PRQ | RSQ | RRQ | PRQ_th | PRQ_st | Download |
|---|---|---|---|---|---|---|
| 3D-FRONT | 54.05 | 63.72 | 83.14 | 49.77 | 73.34 | front3d.pth |
| Matterport3D | 14.54 | 45.91 | 31.08 | 11.02 | 25.09 | matterport3d.pth |
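Per category, PRQ factors into RSQ × RRQ, analogous to panoptic quality (the table averages each metric over categories, so the averaged PRQ is not simply the product of the averaged RSQ and RRQ). A minimal per-category sketch, assuming the PQ-style formulation of Dahnert et al.; the function name and the matching threshold mentioned in the comment are illustrative, not from this repo:

```python
# Sketch of per-category PRQ/RSQ/RRQ in the panoptic-quality style used for
# panoptic 3D scene reconstruction (Dahnert et al.). A predicted segment is a
# true positive when its voxel IoU with a ground-truth segment exceeds a
# threshold (0.25 in Dahnert et al.).

def prq_metrics(matched_ious, num_fp, num_fn):
    """matched_ious: IoUs of the true-positive matches for one category."""
    tp = len(matched_ious)
    if tp == 0:
        return 0.0, 0.0, 0.0
    rsq = sum(matched_ious) / tp                   # reconstructed segmentation quality
    rrq = tp / (tp + 0.5 * num_fp + 0.5 * num_fn)  # reconstructed recognition quality
    return rsq * rrq, rsq, rrq                     # PRQ = RSQ * RRQ

prq, rsq, rrq = prq_metrics([0.8, 0.6], num_fp=1, num_fn=1)
# rsq = 0.7, rrq = 2/3, prq = rsq * rrq
```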
Create the environment:

```shell
conda create -n buol -y
conda activate buol
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge -y
```

Install MinkowskiEngine:
```shell
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
python setup.py install --blas_include_dirs=${CONDA_PREFIX}/include --blas=openblas --force_cuda
```

Install PyMarchingCubes:
```shell
git clone https://github.com/xheon/PyMarchingCubes.git
cd PyMarchingCubes
git clone https://gitlab.com/libeigen/eigen.git
python setup.py install
```

Install the other dependencies:
```shell
pip install yacs fvcore
pip install opencv-python
conda install -c conda-forge openexr-python -y
pip install pyexr
pip install matplotlib
pip install plyfile
pip install loguru
pip install scipy
```

Download front3d.pth
and put it at models/front3d.pth, and run:
```shell
python demo.py
```

Download the datasets and put them in datasets/<dataset_name> following the structure shown below,
and then set GPUS (e.g. GPUS: (0, 1, 2, 3)) and MODEL.EVAL: False in the config file,
and train with multi-GPU:
```shell
python -m torch.distributed.launch --nproc_per_node=4 main.py --cfg configs/front.yaml
```

Download the model or train one yourself, then set MODEL.WEIGHTS to the model path.
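Put together, the config entries mentioned in these instructions look roughly like the following sketch. Key names and the tuple syntax are copied from the instructions above; the nesting and values are assumptions, so check configs/front.yaml for the actual schema:

```
GPUS: (0, 1, 2, 3)              # one entry per training GPU; (0,) for testing
MODEL:
  EVAL: False                   # set True for testing
  WEIGHTS: models/front3d.pth   # downloaded or self-trained checkpoint
```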
Set GPUS: (0,) and MODEL.EVAL: True in the config file, and test with one GPU:
```shell
python main.py --cfg configs/front.yaml
```

3D-FRONT is a synthetic indoor dataset. We process it in the same way as Dahnert et al. (Panoptic 3D Scene Reconstruction from a Single RGB Image); you can download the processed data or reproduce the processing from there.
```
front3d/
    <scene_id>/
        ├── rgb_<frame_id>.png                    # Color image: 320x240x3
        ├── depth_<frame_id>.exr                  # Depth image: 320x240x1
        ├── segmap_<frame_id>.mapped.npz          # 2D segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
        ├── geometry_<frame_id>.npz               # 3D geometry: 256x256x256x1, truncated (unsigned) distance field at 3 cm voxel resolution and 12-voxel truncation
        ├── segmentation_<frame_id>.mapped.npz    # 3D segmentation: 256x256x256x2, with 0: pre-mapped semantics, 1: instances
        └── weighting_<frame_id>.mapped.npz       # 3D weighting mask: 256x256x256x1
```
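The per-frame volumes are standard NumPy archives (the depth images are OpenEXR, readable e.g. with pyexr). A hedged loading sketch: the array key "data" inside the .npz is an assumption, so list a real file's keys with np.load(path).files; here an in-memory synthetic archive stands in for a real sample:

```python
import io

import numpy as np

# Hedged sketch: inspect a 3D-geometry archive like
# datasets/front3d/<scene_id>/geometry_<frame_id>.npz. The key name "data"
# is an assumption -- check a real file with np.load(path).files.
# A synthetic 256^3 field stands in for a real sample: every voxel sits at
# the truncation value (12 voxels at 3 cm each, i.e. 36 cm).
truncation = 12.0
buf = io.BytesIO()
np.savez_compressed(buf, data=np.full((256, 256, 256), truncation, dtype=np.float32))
buf.seek(0)

with np.load(buf) as archive:
    print(archive.files)        # keys stored in the archive -> ['data']
    udf = archive["data"]       # truncated unsigned distance field

print(udf.shape, udf.dtype)     # (256, 256, 256) float32
surface = udf < 1.0             # voxels within one voxel of a surface
print(int(surface.sum()))       # 0: the stand-in field contains no surface
```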
Matterport3D is a real-world indoor dataset. We follow Dahnert et al. to preprocess it. In addition, we generate depth images and room masks by rendering the 3D scenes instead of using the original versions.
```
matterport/
    <scene_id>/
        ├── <image_id>_i<frame_id>.png                     # Color image: 320x240x3
        ├── <image_id>_segmap<frame_id>.mapped.npz         # 2D segmentation: 320x240x2, with 0: pre-mapped semantics, 1: instances
        ├── <image_id>_intrinsics_<camera_id>.png          # Intrinsics matrix: 4x4
        ├── <image_id>_geometry<frame_id>.npz              # 3D geometry: 256x256x256x1, truncated (unsigned) distance field at 3 cm voxel resolution and 12-voxel truncation
        ├── <image_id>_segmentation<frame_id>.mapped.npz   # 3D segmentation: 256x256x256x2, with 0: pre-mapped semantics, 1: instances
        └── <image_id>_weighting<frame_id>.npz             # 3D weighting mask: 256x256x256x1
matterport_depth_gen/
    <scene_id>/
        └── <position_id>_d<frame_id>.png                  # Depth image: 320x240x1
matterport_room_mask/
    <scene_id>/
        └── <position_id>_rm<frame_id>.png                 # Room mask: 320x240x1
```
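The generated depth images are 16-bit PNGs. A hedged decoding sketch: the 1/4000 scale below follows the original Matterport3D depth convention (0.25 mm per raw unit), but since these depths are re-rendered by the authors, verify the scale against the repo's dataloader:

```python
import numpy as np

# Hedged sketch: convert a 16-bit depth PNG from matterport_depth_gen/ into
# metric depth. DEPTH_SCALE is an assumption (the original Matterport3D
# convention of 0.25 mm per raw unit); a zero pixel means "no depth".
DEPTH_SCALE = 1.0 / 4000.0

def decode_depth(raw: np.ndarray) -> np.ndarray:
    """Map raw uint16 depth to meters, marking invalid (zero) pixels as NaN."""
    depth = raw.astype(np.float32) * DEPTH_SCALE
    depth[raw == 0] = np.nan
    return depth

# Stand-in for cv2.imread(path, cv2.IMREAD_UNCHANGED) on a real depth PNG.
raw = np.array([[0, 4000], [8000, 12000]], dtype=np.uint16)
depth_m = decode_depth(raw)
print(depth_m)  # NaN for the invalid pixel; 1.0, 2.0, 3.0 meters for the rest
```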
```bibtex
@inproceedings{chu2023buol,
  title={BUOL: A Bottom-Up Framework With Occupancy-Aware Lifting for Panoptic 3D Scene Reconstruction From a Single Image},
  author={Chu, Tao and Zhang, Pan and Liu, Qiong and Wang, Jiaqi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4937--4946},
  year={2023}
}
```