Built on top of MASt3R-SLAM and integrated with Spann3R
Spann3R-SLAM integrates Spann3R into a MASt3R-SLAM-style real-time pipeline.
This repo uses copied upstream Spann3R code under spann3r_core/ (no Spann3R submodule) and runs directly with:
- checkpoints/spann3r.pth
- checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
- demo data datasets/examples/s00567
- Spann3R backend: Tracking/optimization pipeline is wired to Spann3R inference.
- Interactive 3D visualization: in3d-based viewer with mouse orbit/pan/zoom and camera frustums.
- Spann3R reprojection rendering: Reprojects Spann3R world points (pts3d + RGB) to the current view.
- Per-frame PNG export: Saves rendered frames by default to logs/spann3r_renders/.
- Runtime tuning: CLI + GUI controls for point density and rendering quality/performance.
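
The reprojection rendering listed above can be pictured with a generic pinhole projection. The following is a minimal sketch under stated assumptions, not the repository's renderer: the function name `project_points`, its parameters, and the pure-Python z-buffer are all hypothetical, and the actual implementation may run on the GPU and splat points with a configurable radius.

```python
def project_points(points_world, colors, R_cw, t_cw, fx, fy, cx, cy, width, height):
    """Project colored world points into a pinhole camera (hypothetical helper).

    R_cw (3x3 nested lists) and t_cw (length 3) map world coordinates to the
    camera frame. Returns {(u, v): rgb}, keeping the nearest point per pixel.
    """
    nearest_depth = {}
    image = {}
    for (X, Y, Z), rgb in zip(points_world, colors):
        # World -> camera: p_cam = R_cw * p_world + t_cw
        xc = R_cw[0][0] * X + R_cw[0][1] * Y + R_cw[0][2] * Z + t_cw[0]
        yc = R_cw[1][0] * X + R_cw[1][1] * Y + R_cw[1][2] * Z + t_cw[1]
        zc = R_cw[2][0] * X + R_cw[2][1] * Y + R_cw[2][2] * Z + t_cw[2]
        if zc <= 0:
            continue  # point is behind the camera
        # Camera -> pixel via the pinhole model
        u = int(round(fx * xc / zc + cx))
        v = int(round(fy * yc / zc + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # point falls outside the image
        # Simple z-buffer: a closer point overwrites a farther one
        if (u, v) not in nearest_depth or zc < nearest_depth[(u, v)]:
            nearest_depth[(u, v)] = zc
            image[(u, v)] = rgb
    return image
```

A dictionary keyed by pixel keeps the sketch dependency-free; a practical renderer would write into an image buffer instead.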
| Aspect | MASt3R-SLAM | Spann3R-SLAM |
|---|---|---|
| Backend model | MASt3R | Spann3R (+ DUSt3R checkpoint) |
| Real-time render content | OpenGL point map shaders | Spann3R point reprojection renderer |
| PNG render export | Not default | Enabled by default (logs/spann3r_renders/) |
| Integration mode | Native MASt3R stack | Upstream Spann3R source copied into repo |
- Ubuntu 20.04+ (or WSL2)
- NVIDIA GPU + CUDA-capable driver
- Conda/Miniconda
- Git
git clone --recursive https://github.com/Looong01/Spann3R-SLAM.git
cd Spann3R-SLAM
If cloned without --recursive, run:
git submodule update --init --recursive
conda create -n spann3r-slam python=3.11 -y
conda activate spann3r-slam
Install the CUDA-matching PyTorch for your machine. Example (CUDA 12.4):
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
pip install -e thirdparty/in3d
pip install --no-build-isolation thirdparty/lietorch
pip install --no-build-isolation -e .
Notes:
- pip install --no-build-isolation -e . builds the backend extension (mast3r_slam_backends).
- lietorch must be installed before running main.py.
Create the checkpoint directory:
mkdir -p checkpoints
Place the following files in checkpoints/:
- spann3r.pth
- DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth
Put the example scene at:
datasets/examples/s00567
(Download source is the same Google Drive folder above.)
The image-folder loader supports png/jpg/jpeg.
Default inference image size is 224 (aligned with current Spann3R demo usage in this repo).
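
As an illustration of the extension filter described above, here is a minimal sketch of collecting frames from an image folder. The helper name `list_frames`, the case-insensitive match, and sorting by file name are assumptions made for this example; the repository's loader may behave differently.

```python
from pathlib import Path

# Extensions named in this README. Matching them case-insensitively is an
# assumption of this sketch; the real loader may be stricter.
SUPPORTED_EXTS = {".png", ".jpg", ".jpeg"}


def list_frames(folder):
    """Return the supported image files in `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTS
    )
```

Files with any other extension are skipped, so a folder can hold notes or calibration files alongside the frames.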
python main.py \
--dataset datasets/examples/s00567 \
--checkpoint checkpoints/spann3r.pth \
--dust3r-checkpoint checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth \
--config config/base.yaml
Minimal command (uses defaults above):
python main.py
Headless mode:
python main.py --no-viz

| Argument | Default | Description |
|---|---|---|
| --dataset | datasets/examples/s00567 | Input sequence folder / video / image folder / realsense / webcam |
| --config | config/base.yaml | SLAM config YAML |
| --save-as | default | Save subdir under logs/ for trajectory/reconstruction outputs |
| --no-viz | off | Disable interactive GUI window |
| --calib | "" | Optional calibration YAML |
| --checkpoint | checkpoints/spann3r.pth | Spann3R checkpoint path |
| --dust3r-checkpoint | checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth | DUSt3R checkpoint path |
| --no-render-gaussians | off | Disable Spann3R rendering and per-frame PNG export |
| --render-dir | logs/spann3r_renders | Directory for rendered PNGs |
| --max-gaussians | 4194304 | Max active points used by renderer |
| --spatial-stride | 4 | Per-frame point subsampling stride (1 = no subsampling) |
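
For readers scripting around the CLI, the arguments above could be mirrored with `argparse` roughly as follows. This is a sketch built only from the documented flags and defaults; the function name `build_parser` is hypothetical, and main.py may define its parser differently.

```python
import argparse


def build_parser():
    """Sketch of a parser matching the documented flags (not the repo's code)."""
    p = argparse.ArgumentParser(description="Sketch of the documented CLI")
    p.add_argument("--dataset", default="datasets/examples/s00567")
    p.add_argument("--config", default="config/base.yaml")
    p.add_argument("--save-as", default="default")
    p.add_argument("--no-viz", action="store_true")
    p.add_argument("--calib", default="")
    p.add_argument("--checkpoint", default="checkpoints/spann3r.pth")
    p.add_argument(
        "--dust3r-checkpoint",
        default="checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth",
    )
    p.add_argument("--no-render-gaussians", action="store_true")
    p.add_argument("--render-dir", default="logs/spann3r_renders")
    p.add_argument("--max-gaussians", type=int, default=4194304)
    p.add_argument("--spatial-stride", type=int, default=4)
    return p


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--spatial-stride", "8", "--no-viz"])
```

Flags listed as "off" in the table are modeled as `store_true` switches, which is an assumption about how they are implemented.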
When GUI is enabled, the left panel exposes runtime controls:
| GUI Item | Range / Default | Effect |
|---|---|---|
| pause | bool | Pause frame stepping |
| C_conf_threshold | 0.0 .. 5.0 (init from config) | Confidence filtering for rendered/visualized points |
| follow cam | bool (on) | Viewer follows current camera |
| spann3r_rendering | bool (on) | Toggle Spann3R reprojection rendering layer |
| render_res_scale | 0.2 .. 1.0 (default 0.5) | Viewport rendering resolution scale |
| spatial_stride | 1 .. 16 (init from CLI) | Point density control |
| max_gaussians | 20000 .. dynamic upper bound (init from CLI) | Active point cap |
| render_point_radius | 0 .. 2 (default 1) | Point splat radius in pixels |
| cache_refresh | 1 .. 30 (default 1) | Refresh interval of current-frame cache |
| show_keyframe_edges / show_keyframe / show_axis | bool | Overlay debug visuals |
| line_thickness / frustum_scale | drag | Frustum/edge drawing style |
- --spatial-stride and --max-gaussians are startup defaults and initialize the GUI sliders.
- During GUI runs, slider changes apply live to interactive rendering.
- PNG export uses current GUI values of spatial_stride and max_gaussians.
- In headless mode (--no-viz), only CLI values are used.
- If --no-render-gaussians is set, rendering + PNG export are disabled.
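
The precedence rules above can be summarized in a few lines. This is a hedged sketch of the described behavior only; `effective_render_settings` and its parameters are hypothetical names, not functions from this repository.

```python
def effective_render_settings(cli_stride, cli_max_points, *, headless=False,
                              rendering_disabled=False,
                              gui_stride=None, gui_max_points=None):
    """Return (spatial_stride, max_gaussians) used for export, or None.

    Hypothetical helper that encodes the rules stated in the list above.
    """
    if rendering_disabled:
        # --no-render-gaussians: no rendering and no PNG export
        return None
    if headless:
        # --no-viz: there are no sliders, so only CLI values apply
        return (cli_stride, cli_max_points)
    # GUI run: sliders start at the CLI values and override them once changed
    stride = cli_stride if gui_stride is None else gui_stride
    max_points = cli_max_points if gui_max_points is None else gui_max_points
    return (stride, max_points)
```

For example, a GUI run started with stride 4 whose slider is later moved to 8 resolves to stride 8, while the same call with `headless=True` stays at 4.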
Higher quality (slower):
python main.py \
--dataset datasets/examples/s00567 \
--checkpoint checkpoints/spann3r.pth \
--dust3r-checkpoint checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth \
--config config/base.yaml \
--spatial-stride 1 \
--max-gaussians 8388608
Higher speed (lighter memory):
python main.py \
--dataset datasets/examples/s00567 \
--checkpoint checkpoints/spann3r.pth \
--dust3r-checkpoint checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth \
--config config/base.yaml \
--spatial-stride 8 \
--max-gaussians 2097152
Disable render PNG saving:
python main.py \
--dataset datasets/examples/s00567 \
--checkpoint checkpoints/spann3r.pth \
--dust3r-checkpoint checkpoints/DUSt3R_ViTLarge_BaseDecoder_512_dpt.pth \
--config config/base.yaml \
--no-render-gaussians
With custom intrinsics:
python main.py \
--dataset path/to/data \
--config config/base.yaml \
--calib config/intrinsics.yaml
Video / image folder / live input:
python main.py --dataset path/to/video.mp4 --config config/base.yaml
python main.py --dataset path/to/image_folder --config config/base.yaml
python main.py --dataset realsense --config config/base.yaml
python main.py --dataset webcam --config config/base.yaml

| Output | Location | Description |
|---|---|---|
| Trajectory | logs/<save_as>/<seq_name>.txt (or logs/<seq_name>.txt if default) | Estimated trajectory |
| Reconstruction | logs/<save_as>/<seq_name>.ply (or logs/<seq_name>.ply if default) | Reconstructed point cloud |
| Keyframes | logs/<save_as>/keyframes/<seq_name>/ (or logs/keyframes/<seq_name>/) | Keyframe images |
| Spann3R renders | logs/spann3r_renders/ (or --render-dir) | Per-frame rendered PNGs |
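
To locate results from a script, the output layout above can be expressed as a small path helper. This is a sketch based only on the table; `output_paths` is a hypothetical function, and how the repository derives the sequence name from the dataset argument is not specified here.

```python
from pathlib import Path


def output_paths(seq_name, save_as="default", render_dir="logs/spann3r_renders"):
    """Map a sequence name to the documented output locations (hypothetical)."""
    # With the default --save-as value, outputs sit directly under logs/
    root = Path("logs") if save_as == "default" else Path("logs") / save_as
    return {
        "trajectory": root / f"{seq_name}.txt",
        "reconstruction": root / f"{seq_name}.ply",
        "keyframes": root / "keyframes" / seq_name,
        "renders": Path(render_dir),
    }
```

Calling `output_paths("s00567", save_as="run1")` places the trajectory at logs/run1/s00567.txt, matching the first row of the table.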
Spann3R-SLAM/
├── main.py # Main entry
├── spann3r_slam/ # SLAM package
│ ├── spann3r_utils.py # Spann3R loading/inference/render bridge
│ ├── tracker.py # Tracking
│ ├── global_opt.py # Backend optimization
│ ├── frame.py # Frame + shared states
│ ├── visualization.py # in3d interactive visualization + controls
│ └── ...
├── spann3r_core/ # Copied upstream Spann3R / DUSt3R / CroCo code
│ ├── spann3r/
│ ├── dust3r/
│ ├── croco/
│ └── ...
├── thirdparty/
│ ├── in3d/ # Visualization framework
│ ├── lietorch/
│ └── eigen/
├── config/
├── scripts/
├── datasets/
└── checkpoints/
Download helpers:
bash ./scripts/download_tum.sh
bash ./scripts/download_7_scenes.sh
bash ./scripts/download_euroc.sh
bash ./scripts/download_eth3d.sh
Evaluation helpers:
bash ./scripts/eval_tum.sh
bash ./scripts/eval_tum.sh --no-calib
bash ./scripts/eval_7_scenes.sh
bash ./scripts/eval_euroc.sh
bash ./scripts/eval_eth3d.sh
Install thirdparty/lietorch first:
pip install --no-build-isolation thirdparty/lietorch
Use headless mode:
python main.py --no-viz
Reduce density:
python main.py --spatial-stride 8 --max-gaussians 2097152
Or reduce input resolution in config (dataset.img_downsample).
This is a performance warning from upstream Spann3R/CroCo kernels; the run can still proceed with slower PyTorch fallback.
This repository builds on open-source contributions from the MASt3R-SLAM and Spann3R projects.
@article{wang20243d,
title={3D Reconstruction with Spatial Memory},
author={Wang, Hengyi and Agapito, Lourdes},
journal={arXiv preprint arXiv:2408.16061},
year={2024}
}
@inproceedings{murai2024_mast3rslam,
title={{MASt3R-SLAM}: Real-Time Dense {SLAM} with {3D} Reconstruction Priors},
author={Murai, Riku and Dexheimer, Eric and Davison, Andrew J.},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}