4RC (pronounced "ARC") enables unified and complete 4D reconstruction from monocular videos in a single feed-forward pass, via conditional querying.
🎇 For more visual results, go check out our project page
Introducing 4RC
We present 4RC, a unified feed-forward framework for 4D reconstruction from monocular videos. Unlike existing methods that typically decouple motion from geometry or produce limited 4D attributes, such as sparse trajectories or two-view scene flow, 4RC learns a holistic 4D representation that jointly captures dense scene geometry and motion dynamics. At its core, 4RC introduces a novel encode-once, query-anywhere and anytime paradigm: a transformer backbone encodes the entire video into a compact spatio-temporal latent space, from which a conditional decoder can efficiently query 3D geometry and motion for any query frame at any target timestamp. To facilitate learning, we represent per-view 4D attributes in a minimally factorized form, decomposing them into base geometry and time-dependent relative motion. Extensive experiments demonstrate that 4RC outperforms prior and concurrent methods across a wide range of 4D reconstruction tasks.
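The "encode once, query anywhere and anytime" paradigm and the factorized 4D representation described above can be illustrated with a toy sketch. This is not 4RC's actual implementation: the encoder is a random linear projection standing in for the transformer backbone, and the decoder heads (`W_geo`, `W_mot`) and the `query` function are hypothetical names. It only demonstrates the structure: encode the whole video once into a compact latent, then answer many cheap conditional queries, each decomposed into time-independent base geometry plus a time-dependent relative-motion term that vanishes at the frame's own timestamp.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "video": T frames of H x W depth-like maps (illustrative shapes only).
T, H, W, D = 6, 4, 4, 16
video = rng.standard_normal((T, H, W))

# --- Encode once: compress the whole clip into per-frame latents. ---
# (A random projection stands in for the transformer backbone.)
W_enc = rng.standard_normal((H * W, D)) / np.sqrt(H * W)
latents = video.reshape(T, -1) @ W_enc              # (T, D) spatio-temporal latent

# --- Conditional decoder heads (hypothetical). ---
W_geo = rng.standard_normal((D, 3)) / np.sqrt(D)       # base-geometry head
W_mot = rng.standard_normal((D + 1, 3)) / np.sqrt(D)   # motion head, conditioned on time

def query(frame_idx, t_target):
    """Decode 4D attributes for query frame `frame_idx` at timestamp `t_target`."""
    z = latents[frame_idx]
    base = z @ W_geo                                   # time-independent base geometry
    dt = t_target - frame_idx                          # relative time offset
    motion = (np.concatenate([z, [dt]]) @ W_mot) * dt  # vanishes when dt == 0
    return base + motion                               # minimally factorized output

# Encoding ran once; each query is a cheap decode. At a frame's own timestamp
# the motion term is zero, so the query returns the base geometry alone.
print(np.allclose(query(2, 2.0), latents[2] @ W_geo))  # True
```

The design point the sketch mirrors is that the expensive step (encoding) is amortized across arbitrarily many (frame, timestamp) queries, and the zero-motion anchor at the query frame's own timestamp is what the "minimally factorized" decomposition buys.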
- [2026/02/11] Our paper is now live.
If you find our repo useful for your research, please consider citing our paper:
@article{luo20264rc,
title = {4RC: 4D Reconstruction via Conditional Querying Anytime and Anywhere},
author = {Yihang Luo and Shangchen Zhou and Yushi Lan and Xingang Pan and Chen Change Loy},
journal = {arXiv preprint arXiv:2602.10094},
year = {2026}
}

If you have any questions, please feel free to reach us at [email protected].