EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh

🌟 Highlights

  • 🎯 Extreme Viewpoint Synthesis: Generate high-quality 4D videos with camera movements ranging from -90° to 90°
  • 🔧 Depth Watertight Mesh: Novel geometric representation that models both visible and occluded regions
  • ⚡ Lightweight Architecture: Only 140M trainable parameters, about 1% of the 14B video diffusion backbone
  • 🎭 No Multi-view Training: Innovative masking strategy eliminates the need for expensive multi-view datasets
  • 🏆 State-of-the-art Performance: Outperforms existing methods, especially on extreme camera angles
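
The 1% figure in the highlights follows from the two parameter counts quoted there. The sketch below checks that arithmetic and, for context, shows the generic parameter count of a rank-r LoRA update on a single linear layer. The layer width (4096) and rank (16) are hypothetical values chosen for illustration, not numbers taken from EX-4D.

```python
def trainable_fraction(trainable: float, total: float) -> float:
    """Share of parameters that receive gradients."""
    return trainable / total


def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by a rank-`rank` LoRA update W + B @ A,
    where A is (rank x d_in) and B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Counts quoted in the highlights: 140M trainable, 14B backbone.
frac = trainable_fraction(140e6, 14e9)
print(f"trainable share: {frac:.2%}")

# Hypothetical layer: 4096 -> 4096 with rank 16 (illustrative only).
extra = lora_extra_params(4096, 4096, 16)
full = 4096 * 4096
print(f"LoRA adds {extra} params to a layer with {full} frozen params")
```

A low-rank update grows linearly with the layer width, while the frozen weight grows quadratically, which is why the trainable share stays small.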

🎬 Demo Results

EX-4D Demo Results

EX-4D transforms monocular videos into camera-controllable 4D experiences with physically consistent results under extreme viewpoints.

🏗️ Framework Overview

EX-4D Architecture

Our framework consists of three key components:

  1. 🔺 Depth Watertight Mesh Construction: Creates a robust geometric prior that explicitly models both visible and occluded regions
  2. 🎭 Simulated Masking Strategy: Generates effective training data from monocular videos without multi-view datasets
  3. ⚙️ Lightweight LoRA Adapter: Efficiently integrates geometric information with pre-trained video diffusion models
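
To give a feel for the first component, here is a minimal sketch of turning a depth map into a triangle mesh in which faces that span a depth discontinuity are flagged as occluded instead of being dropped, so the surface keeps no holes. This is a generic illustration and not the code in `recon.py`: the pinhole camera model, the `focal` default, the discontinuity threshold `tau`, and the function name `depth_to_mesh` are all assumptions.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Face = Tuple[int, int, int]


def depth_to_mesh(
    depth: List[List[float]], focal: float = 1.0, tau: float = 0.5
) -> Tuple[List[Vertex], List[Face], List[bool]]:
    """Unproject a depth grid to vertices and triangulate every pixel quad.

    Returns (vertices, faces, occluded), where occluded[i] is True when
    face i spans a depth jump larger than `tau` (hypothetical threshold).
    """
    h, w = len(depth), len(depth[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0  # assume principal point at centre

    # One vertex per pixel, unprojected with a pinhole model.
    vertices: List[Vertex] = []
    for v in range(h):
        for u in range(w):
            z = depth[v][u]
            vertices.append(((u - cx) * z / focal, (v - cy) * z / focal, z))

    # Two triangles per 2x2 pixel block; keep all of them.
    faces: List[Face] = []
    occluded: List[bool] = []
    for v in range(h - 1):
        for u in range(w - 1):
            i00 = v * w + u
            i01, i10, i11 = i00 + 1, i00 + w, i00 + w + 1
            for tri in ((i00, i10, i01), (i01, i10, i11)):
                zs = [vertices[i][2] for i in tri]
                faces.append(tri)
                occluded.append(max(zs) - min(zs) > tau)
    return vertices, faces, occluded


# Tiny example: a near surface (depth 1) next to a far one (depth 5).
verts, faces, occ = depth_to_mesh([[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]])
print(len(verts), len(faces), occ)
```

When such a mesh is rendered from a new viewpoint, the flagged faces could be used to produce a mask of regions that were hidden in the input view, which is the kind of signal a mask video would carry.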

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/tau-yihouxiang/EX-4D.git
cd EX-4D

# Create conda environment
conda create -n ex4d python=3.10
conda activate ex4d
# Install PyTorch (2.x recommended)
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
# Install Nvdiffrast
pip install git+https://github.com/NVlabs/nvdiffrast.git
# Install dependencies and diffsynth
pip install -e .
# Install DepthCrafter for depth estimation (follow DepthCrafter's installation instructions to prepare its checkpoints)
git clone https://github.com/Tencent/DepthCrafter.git
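
After the installation steps, a quick import check can catch a missing package before a long run starts. The sketch below only tests whether each module can be located; it does not verify versions or CUDA support. The import names `nvdiffrast` and `diffsynth` are assumed to match the packages installed above.

```python
import importlib.util

# Assumed import names for the packages installed above.
REQUIRED = ("torch", "nvdiffrast", "diffsynth")


def missing_modules(names=REQUIRED):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("missing modules:", ", ".join(missing))
    else:
        print("all required modules found")
```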

Download Pretrained Model

huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./models/Wan-AI
huggingface-cli download yihouxiang/EX-4D --local-dir ./models/EX-4D
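
The same downloads can be scripted from Python with `snapshot_download` from the `huggingface_hub` package, which may be convenient inside a setup script. This is a sketch that mirrors the two commands above; the `MODELS` mapping and the `download_all` helper are illustrative names, and `huggingface_hub` is assumed to be installed.

```python
# Repository IDs and target directories copied from the commands above.
MODELS = {
    "Wan-AI/Wan2.1-I2V-14B-480P": "./models/Wan-AI",
    "yihouxiang/EX-4D": "./models/EX-4D",
}


def download_all(models=MODELS):
    """Download every repository in `models` to its local directory."""
    # Imported lazily so the module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    return [
        snapshot_download(repo_id=repo_id, local_dir=local_dir)
        for repo_id, local_dir in models.items()
    ]


if __name__ == "__main__":
    for path in download_all():
        print("downloaded to", path)
```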

Example Usage

1. DW-Mesh Reconstruction

# --cam takes 180 (used here) or one of 30 / 60 / 90 / zoom_in / zoom_out
python recon.py --input_video examples/flower/input.mp4 --cam 180 --output_dir outputs/flower --save_mesh
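
To reconstruct several camera settings in one go, the command above can be wrapped in a small Python helper. This sketch uses only the flags shown in the command; the helper names are hypothetical, and treating `--save_mesh` as optional is an assumption.

```python
import subprocess
import sys


def build_recon_cmd(input_video, cam, output_dir, save_mesh=True):
    """Build the argument list for recon.py using the flags shown above."""
    cmd = [
        sys.executable, "recon.py",
        "--input_video", str(input_video),
        "--cam", str(cam),
        "--output_dir", str(output_dir),
    ]
    if save_mesh:  # assumed to be an optional switch
        cmd.append("--save_mesh")
    return cmd


def run_recon(input_video, cams, output_dir):
    """Run recon.py once per camera setting, stopping on the first failure."""
    for cam in cams:
        subprocess.run(build_recon_cmd(input_video, cam, output_dir), check=True)


if __name__ == "__main__":
    # Run from the repository root so that recon.py is found.
    run_recon("examples/flower/input.mp4", [30, 60, 90, 180], "outputs/flower")
```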

2. EX-4D Generation (48GB VRAM required)

python generate.py --color_video outputs/flower/color_180.mp4 --mask_video outputs/flower/mask_180.mp4 --output_video outputs/flower/output.mp4
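
The generation step consumes the color and mask videos written by the reconstruction step. The sketch below derives both paths from the output directory and camera setting, then reports any that are missing before launching `generate.py`. The `color_<cam>.mp4` / `mask_<cam>.mp4` naming is inferred from the single example above and is an assumption, as are the helper names.

```python
import subprocess
import sys
from pathlib import Path


def build_generate_cmd(output_dir, cam, output_name="output.mp4"):
    """Build the argument list for generate.py from a recon output directory."""
    out = Path(output_dir)
    return [
        sys.executable, "generate.py",
        # File naming below is inferred from the example, not documented.
        "--color_video", (out / f"color_{cam}.mp4").as_posix(),
        "--mask_video", (out / f"mask_{cam}.mp4").as_posix(),
        "--output_video", (out / output_name).as_posix(),
    ]


def missing_inputs(cmd):
    """Return the color/mask paths in `cmd` that do not exist on disk."""
    paths = [cmd[cmd.index(flag) + 1] for flag in ("--color_video", "--mask_video")]
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    cmd = build_generate_cmd("outputs/flower", 180)
    missing = missing_inputs(cmd)
    if missing:
        print("run recon.py first; missing:", ", ".join(missing))
    else:
        subprocess.run(cmd, check=True)
```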

Input Video

Output Video

User Study Results

  • 70.7% of participants preferred EX-4D over baseline methods
  • Superior performance in physical consistency and extreme viewpoint quality
  • Significant improvement as camera angles become more extreme

🎯 Applications

  • 🎮 Gaming: Create immersive 3D game cinematics from 2D footage
  • 🎬 Film Production: Generate novel camera angles for post-production
  • 🥽 VR/AR: Create free-viewpoint video experiences
  • 📱 Social Media: Generate dynamic camera movements for content creation
  • 🏢 Architecture: Visualize spaces from multiple viewpoints

⚠️ Limitations

  • Depth Dependency: Performance relies on monocular depth estimation quality
  • Computational Cost: Requires significant computation for high-resolution videos
  • Reflective Surfaces: Challenges with reflective or transparent materials

🔮 Future Work

  • Real-time inference optimization (3DGS / 4DGS)
  • Support for higher resolutions (1K, 2K)
  • Neural mesh refinement techniques

🙏 Acknowledgments

We would like to thank the DiffSynth-Studio v1.1.1 project for providing the foundational diffusion framework.

📚 Citation

If you find our work useful, please consider citing:

@misc{hu2025ex4dextremeviewpoint4d,
      title={EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh}, 
      author={Hu, Tao and Peng, Haoyang and Liu, Xiao and Ma, Yuewen},
      year={2025},
      eprint={2506.05554},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.05554}
}
