Paper | Project Page | Video
3D Reconstruction with Spatial Memory
Hengyi Wang, Lourdes Agapito
arXiv 2024
[2024-10-25] Add support for Nerfstudio
[2024-10-18] Add camera param estimation
[2024-09-30] @hugoycj adds a Gradio demo
[2024-09-20] Instructions for datasets: see data_preprocess.md
[2024-09-11] Code for Spann3R
- Clone Spann3R

    git clone https://github.com/HengyiWang/spann3r.git
    cd spann3r

- Create conda environment

    conda create -n spann3r python=3.9 cmake=3.14.0
    conda activate spann3r
    # Use the correct version of CUDA for your system
    conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=11.8 -c pytorch -c nvidia
    pip install -r requirements.txt
    # Open3D has a bug from 0.16.0, please use the dev version
    pip install -U -f https://www.open3d.org/docs/latest/getting_started.html open3d

- Compile CUDA kernels for RoPE

    cd croco/models/curope/
    python setup.py build_ext --inplace
    cd ../../../

- Download the DUSt3R checkpoint

    mkdir checkpoints
    cd checkpoints
    # Download DUSt3R checkpoints
    wget https://download.europe.naverlabs.com/ComputerVision/DUSt3R/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
    # Return to the repository root
    cd ..

- Download our checkpoint and place it under ./checkpoints

- Download the example data (2 scenes from map-free-reloc) and unzip it as ./examples

- Run demo:

    python demo.py --demo_path ./examples/s00567 --kf_every 10 --vis --vis_cam

  With --vis, a window opens so you can adjust the rendering view. Once you have found the view you want to render, press the space key and close the window. The code will then render the incremental reconstruction.

- Nerfstudio:

    # Run the demo; use --save_ori to save scaled intrinsics for the original images
    python demo.py --demo_path ./examples/s00567 --kf_every 10 --vis --vis_cam --save_ori
    # Run splatfacto
    ns-train splatfacto --data ./output/demo/s00567 --pipeline.model.camera-optimizer.mode SO3xR3
    # Render your results
    ns-render interpolate --load-config [path-to-your-config]/config.yml

  Note that --save_ori saves the scaled intrinsics into transform.json, so that you can train NeRF/3D Gaussians with the original images.
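The --save_ori option refers to scaled intrinsics. As a rough illustration of what rescaling pinhole intrinsics between two image resolutions involves, here is a minimal sketch. It assumes a plain resize with no cropping; the function name scale_intrinsics and the example numbers are hypothetical and are not taken from the Spann3R code, which may handle cropping or principal-point conventions differently.

```python
def scale_intrinsics(K, src_size, dst_size):
    """Rescale a 3x3 pinhole intrinsics matrix from src_size to dst_size.

    K is a nested list [[fx, s, cx], [0, fy, cy], [0, 0, 1]].
    Sizes are (width, height). Assumes a plain resize without cropping,
    so the first row scales with width and the second row with height.
    """
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return [
        [K[0][0] * sx, K[0][1] * sx, K[0][2] * sx],  # fx, skew, cx
        [K[1][0] * sy, K[1][1] * sy, K[1][2] * sy],  # 0, fy, cy
        [K[2][0], K[2][1], K[2][2]],                 # homogeneous row unchanged
    ]


# Hypothetical numbers: intrinsics at 512x384, original images at 1024x768.
K_small = [
    [400.0, 0.0, 256.0],
    [0.0, 400.0, 192.0],
    [0.0, 0.0, 1.0],
]
K_orig = scale_intrinsics(K_small, src_size=(512, 384), dst_size=(1024, 768))
for row in K_orig:
    print(row)
```

Under these assumptions, doubling the image size doubles both focal lengths and the principal point, while the last row stays fixed.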
We also provide a Gradio interface for a better experience. To launch it, run:

    # For Linux and Windows users (and possibly macOS with Intel)
    python app.py

You can pass the --server_port, --share, and --server_name arguments to suit your needs.
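For readers unfamiliar with these three flags, the sketch below shows one common way such options are parsed and forwarded to Gradio. It is an assumption about typical wiring, not a copy of app.py: the helper name build_launch_kwargs is hypothetical, and the only Gradio detail relied on is that launch() accepts server_name, server_port, and share keyword arguments.

```python
import argparse


def build_launch_kwargs(argv=None):
    """Parse the three server flags and return keyword arguments for launch()."""
    parser = argparse.ArgumentParser(description="Hypothetical launcher flags")
    parser.add_argument("--server_name", type=str, default=None,
                        help="Address to bind, e.g. 0.0.0.0 to accept LAN connections")
    parser.add_argument("--server_port", type=int, default=None,
                        help="Port to serve on; None leaves the choice to Gradio")
    parser.add_argument("--share", action="store_true",
                        help="Request a temporary public share link")
    args = parser.parse_args(argv)
    return {
        "server_name": args.server_name,
        "server_port": args.server_port,
        "share": args.share,
    }


kwargs = build_launch_kwargs(
    ["--server_name", "0.0.0.0", "--server_port", "7860", "--share"]
)
print(kwargs)
# With Gradio installed, a Blocks or Interface object named demo (hypothetical)
# would typically be started with demo.launch(**kwargs).
```

Returning a plain dictionary keeps the parsing step testable without importing Gradio.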
We use Habitat, ScanNet++, ScanNet, ARKitScenes, Co3D, and BlendedMVS to train our model. Please refer to data_preprocess.md for dataset instructions.
Please use the following command to train our model:
torchrun --nproc_per_node 8 train.py --batch_size 4
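When adapting this command to a different number of GPUs, it helps to keep track of the effective batch size. The sketch below assumes that --batch_size is the per-process (per-GPU) batch size, which is a common convention with torchrun but is not stated here; check train.py before relying on it. Both function names are hypothetical.

```python
def effective_batch_size(nproc_per_node, batch_size, nnodes=1):
    """Total samples per optimizer step, assuming batch_size is per process."""
    return nnodes * nproc_per_node * batch_size


def per_process_batch_size(target_effective, nproc_per_node, nnodes=1):
    """Per-process batch size needed to reach a target effective batch size."""
    world_size = nnodes * nproc_per_node
    if target_effective % world_size != 0:
        raise ValueError("target batch size is not divisible by the world size")
    return target_effective // world_size


# The command above: 8 processes with a batch size of 4 each.
total = effective_batch_size(nproc_per_node=8, batch_size=4)
print(total)  # 32

# Matching that total on 2 GPUs would need 16 samples per process,
# which may not fit in memory.
print(per_process_batch_size(total, nproc_per_node=2))  # 16
```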
Please use the following command to evaluate our model:
python eval.py
Our code, data preprocessing pipeline, and evaluation scripts are based on several awesome repositories. We thank the authors for releasing their code!
The research presented here has been supported by a sponsored research award from Cisco Research and the UCL Centre for Doctoral Training in Foundational AI under UKRI grant number EP/S021566/1. This project made use of time on the Tier 2 HPC facility JADE2, funded by EPSRC (EP/T022205/1).
If you find our code or paper useful for your research, please consider citing:
@article{wang20243d,
title={3D Reconstruction with Spatial Memory},
author={Wang, Hengyi and Agapito, Lourdes},
journal={arXiv preprint arXiv:2408.16061},
year={2024}
}