
SII-ZTM/WinT3R

 
 


WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool

Paper Website Hugging Face

Official implementation of WinT3R, an online model that infers precise camera poses and high-quality point maps from image streams.

Teaser Video

📖 Overview

We present WinT3R, a feed-forward model that infers precise camera poses and high-quality point maps from image streams in an online manner. Our main contributions are summarized as follows:

  1. We propose an online window mechanism that enables sufficient interaction among image tokens within the same window and across adjacent windows.
  2. We maintain a camera token pool, which functions as a lightweight "global memory" and improves the quality of camera pose prediction by providing a global perspective.
  3. Experiments demonstrate that WinT3R achieves state-of-the-art performance in online 3D reconstruction and camera pose estimation, with the fastest reconstruction speed to date.
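The first two contributions can be pictured with a small, purely illustrative sketch. It is not the model code: the window size, the stride, the `CameraTokenPool` class and the string "tokens" are all assumptions chosen for the example, and the real network operates on learned feature tokens. The sketch only shows how overlapping windows share frames with their neighbours while a pool accumulates one camera token per frame seen so far.

```python
from typing import Dict, List, Tuple


def sliding_windows(num_frames: int, window_size: int = 4, stride: int = 2) -> List[Tuple[int, ...]]:
    """Group frame indices into windows; adjacent windows share window_size - stride frames.

    window_size and stride are illustrative values, not taken from the README.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows: List[Tuple[int, ...]] = []
    start = 0
    while start < num_frames:
        end = min(start + window_size, num_frames)
        windows.append(tuple(range(start, end)))
        if end == num_frames:
            break
        start += stride
    return windows


class CameraTokenPool:
    """Hypothetical stand-in for a global memory holding one camera token per frame."""

    def __init__(self) -> None:
        self._tokens: Dict[int, str] = {}

    def update(self, frame_idx: int, token: str) -> None:
        # A frame seen again in an overlapping window overwrites its earlier token.
        self._tokens[frame_idx] = token

    def all_tokens(self) -> List[str]:
        return [self._tokens[i] for i in sorted(self._tokens)]

    def __len__(self) -> int:
        return len(self._tokens)


def run_stream(num_frames: int) -> CameraTokenPool:
    """Process a stream window by window, adding a placeholder token for each frame."""
    pool = CameraTokenPool()
    for window in sliding_windows(num_frames):
        for frame_idx in window:
            pool.update(frame_idx, f"cam_{frame_idx}")  # placeholder for a learned token
    return pool


if __name__ == "__main__":
    print(sliding_windows(8))
    print(len(run_stream(8)))
```

Under these assumptions, the pool grows by one small token per frame rather than by a full set of image tokens, which is the sense in which such a memory stays lightweight.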

🛠️ TODO List

  • Release point cloud and camera pose estimation code.
  • Release evaluation code.
  • Release training code.

🌍 Installation

conda create -n WinT3R python=3.10
conda activate WinT3R
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118  # replace cu118 with the CUDA version that matches your system
pip install -r requirements.txt
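
After installation, a quick check along the following lines can confirm that PyTorch imports and can see a GPU. This is a minimal sketch and not part of the repository; the function name `check_environment` is made up for illustration.

```python
import importlib.util
import sys


def check_environment() -> dict:
    """Report the Python version and, if PyTorch is installed, its version and CUDA visibility."""
    report = {
        "python": "%d.%d" % sys.version_info[:2],
        "torch_installed": False,
        "torch_version": None,
        "cuda_available": False,
    }
    # find_spec avoids an ImportError when torch is absent from this environment.
    if importlib.util.find_spec("torch") is None:
        return report
    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False on a machine with a GPU, the wheel index in the `pip install torch` line probably does not match the installed CUDA driver.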

💿 Checkpoints

Download the checkpoint from Hugging Face and save it as checkpoints/pytorch_model.bin.
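
Before running inference, it can help to verify that the checkpoint sits where the README says it should. The helper below is a hypothetical sketch (the name `checkpoint_status` is not from the repository) and assumes `checkpoints/pytorch_model.bin` is a file path relative to the repository root.

```python
from pathlib import Path

# Location stated in the README, assumed to be relative to the repository root.
DEFAULT_CHECKPOINT = Path("checkpoints") / "pytorch_model.bin"


def checkpoint_status(path=DEFAULT_CHECKPOINT) -> str:
    """Return 'ok', 'missing', 'empty', or 'is_directory' for the given checkpoint path."""
    path = Path(path)
    if path.is_dir():
        # A directory with this name usually means the file was saved one level too deep.
        return "is_directory"
    if not path.is_file():
        return "missing"
    if path.stat().st_size == 0:
        return "empty"
    return "ok"


if __name__ == "__main__":
    print(f"{DEFAULT_CHECKPOINT}: {checkpoint_status()}")
```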

🎯 Run Inference from Command Line

# Run with default example images
python recon.py

# Run on your own data
python recon.py --data_path <path/to/your/images_dir>
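
The same commands can be driven from Python, for example to process several image folders in a row. The sketch below uses only the `recon.py` script and the `--data_path` flag shown above. The helper names and the set of image extensions are assumptions, since the README does not say which formats `recon.py` accepts.

```python
import sys
from pathlib import Path
from typing import List, Optional

# Assumed extensions; the README does not list the formats recon.py reads.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(images_dir) -> List[Path]:
    """Return image files in images_dir, sorted by name."""
    return sorted(
        p for p in Path(images_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def build_recon_command(data_path: Optional[str] = None) -> List[str]:
    """Build the argument list for recon.py; with no data_path the default examples are used."""
    cmd = [sys.executable, "recon.py"]
    if data_path is not None:
        cmd += ["--data_path", str(data_path)]
    return cmd


if __name__ == "__main__":
    # Prints the command only; pass it to subprocess.run(..., check=True) from the repository root to execute it.
    print(build_recon_command("path/to/your/images_dir"))
```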

📜 Citation

@misc{li2025wint3rwindowbasedstreamingreconstruction,
      title={WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool}, 
      author={Zizun Li and Jianjun Zhou and Yifan Wang and Haoyu Guo and Wenzheng Chang and Yang Zhou and Haoyi Zhu and Junyi Chen and Chunhua Shen and Tong He},
      year={2025},
      eprint={2509.05296},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.05296}, 
}

🙏 Acknowledgement

WinT3R is built on outstanding open-source projects. We are extremely grateful to these projects and their communities, whose hard work has greatly advanced the field and made our work possible.
