Stars
[ICLR 2024] Code for FreeNoise based on VideoCrafter
[CVPR 2025] From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
[CVPR 2025] StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
[ICCV 2025] Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Long-horizon, spatially consistent video generation enabled by persistent 3D scene point clouds and dynamic-static disentanglement.
Official repo for the paper "IC-Effect: Precise and Efficient Video Effects Editing via In-Context Learning"
WorldPlay: Interactive World Modeling with Real-Time Latency and Geometric Consistency
[ICCV 2025] GameFactory: Creating New Games with Generative Interactive Videos
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
[CVPR 2025 Highlight] GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control
[NeurIPS 2025] WorldMem: Long-term Consistent World Simulation with Memory
[ICCV 2025 ⭐highlight⭐] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
[TPAMI 2025] ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis
Official implementation of Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers" (ICML 2025) and "UltraViCo: Breaking Extrapolation Limits in Video Diffusion Transformers"
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
[NeurIPS 2025] Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT-5 by 10% on eyeballing puzzles and reaches…
Thinking with Videos from Open-Source Priors. We reproduce chain-of-frames visual reasoning by fine-tuning open-source video models. Give it a star 🌟 if you find it useful.
Official repo for the paper "EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning"
MAGI-1: Autoregressive Video Generation at Scale
A comprehensive list of papers on the definition of World Models and the use of World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, code, and related websites.