📄 Paper | 🎥 Homepage | 💻 Code
- 🎯 Extreme Viewpoint Synthesis: Generate high-quality 4D videos with camera movements ranging from -90° to 90°
- 🔧 Depth Watertight Mesh: Novel geometric representation that models both visible and occluded regions
- ⚡ Lightweight Architecture: Only 140M trainable parameters, about 1% of the 14B video diffusion backbone
- 🎭 No Multi-view Training: Innovative masking strategy eliminates the need for expensive multi-view datasets
- 🏆 State-of-the-art Performance: Outperforms existing methods, especially on extreme camera angles
EX-4D transforms monocular videos into camera-controllable 4D experiences with physically consistent results under extreme viewpoints.
Our framework consists of three key components:
- 🔺 Depth Watertight Mesh Construction: Creates a robust geometric prior that explicitly models both visible and occluded regions
- 🎭 Simulated Masking Strategy: Generates effective training data from monocular videos without multi-view datasets
- ⚙️ Lightweight LoRA Adapter: Efficiently integrates geometric information with pre-trained video diffusion models
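For readers curious how such an adapter is wired up, the sketch below shows one common way to attach low-rank (LoRA) layers to a frozen backbone so that only the adapter weights train. The class and layer names (`LoRALinear`, `to_q`/`to_k`/`to_v`) are illustrative assumptions, not EX-4D's actual implementation.

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update (illustrative sketch)."""
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():      # freeze the pretrained weights
            p.requires_grad = False
        self.down = nn.Linear(base.in_features, rank, bias=False)   # A: d_in -> r
        self.up = nn.Linear(rank, base.out_features, bias=False)    # B: r -> d_out
        nn.init.zeros_(self.up.weight)        # zero-init so training starts from the base model
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

def add_lora(module: nn.Module, rank: int = 16, targets=("to_q", "to_k", "to_v")):
    """Recursively replace attention projections with LoRA-wrapped versions.

    The target layer names are an assumption about the backbone, not taken from EX-4D's code.
    """
    for name, child in module.named_children():
        if isinstance(child, nn.Linear) and name in targets:
            setattr(module, name, LoRALinear(child, rank))
        else:
            add_lora(child, rank, targets)
    return module
```

Because only the rank-r factors are trainable while the backbone stays frozen, the adapter's parameter count remains a small fraction of the backbone's, which is how a 14B model can be adapted with on the order of 140M trainable weights.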
# Clone the repository
git clone https://github.com/tau-yihouxiang/EX-4D.git
cd EX-4D
# Create conda environment
conda create -n ex4d python=3.10
conda activate ex4d
# Install PyTorch (2.x recommended)
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu124
# Install Nvdiffrast
pip install git+https://github.com/NVlabs/nvdiffrast.git
# Install dependencies and diffsynth
pip install -e .
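Before downloading the models, a quick environment check such as the following (not shipped with this repository) confirms that CUDA-enabled PyTorch and nvdiffrast are working:

```python
# Quick environment check (illustrative helper, not part of EX-4D)
import torch
import nvdiffrast.torch as dr

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# Creating a rasterization context fails early if the CUDA toolchain is misconfigured.
ctx = dr.RasterizeCudaContext()
print("nvdiffrast CUDA rasterizer initialized:", ctx is not None)
```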
# Install DepthCrafter for depth estimation (follow DepthCrafter's installation instructions to prepare its checkpoints)
git clone https://github.com/Tencent/DepthCrafter.git
# Download the pretrained base model and the EX-4D checkpoint
huggingface-cli download Wan-AI/Wan2.1-I2V-14B-480P --local-dir ./models/Wan-AI
huggingface-cli download yihouxiang/EX-4D --local-dir ./models/EX-4D
# Step 1: Reconstruct the depth watertight mesh and render color/mask videos
# --cam 180 selects the camera trajectory (alternatives: 30 / 60 / 90 / zoom_in / zoom_out)
python recon.py --input_video examples/flower/input.mp4 --cam 180 --output_dir outputs/flower --save_mesh
# Step 2: Generate the final video with the diffusion model
python generate.py --color_video outputs/flower/color_180.mp4 --mask_video outputs/flower/mask_180.mp4 --output_video outputs/flower/output.mp4
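To render the same clip under every supported trajectory, the two commands above can be looped from a small driver script. The helper below is a hypothetical sketch; it assumes the `color_<cam>.mp4` / `mask_<cam>.mp4` naming shown in the 180° example.

```python
# sweep_cams.py -- render multiple camera trajectories for one clip (illustrative, not part of the repo)
import subprocess
from pathlib import Path

INPUT = "examples/flower/input.mp4"
OUT_ROOT = Path("outputs/flower")

for cam in ["30", "60", "90", "180", "zoom_in", "zoom_out"]:
    out_dir = OUT_ROOT / cam
    out_dir.mkdir(parents=True, exist_ok=True)
    # Step 1: build the depth watertight mesh and render color/mask videos for this trajectory.
    subprocess.run(["python", "recon.py", "--input_video", INPUT,
                    "--cam", cam, "--output_dir", str(out_dir), "--save_mesh"], check=True)
    # Step 2: synthesize the final video with the diffusion model.
    subprocess.run(["python", "generate.py",
                    "--color_video", str(out_dir / f"color_{cam}.mp4"),
                    "--mask_video", str(out_dir / f"mask_{cam}.mp4"),
                    "--output_video", str(out_dir / "output.mp4")], check=True)
```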
Input Video ➜ Output Video (example comparison clips)
- 70.7% of participants preferred EX-4D over baseline methods
- Superior performance in physical consistency and extreme viewpoint quality
- Significant improvement as camera angles become more extreme
- 🎮 Gaming: Create immersive 3D game cinematics from 2D footage
- 🎬 Film Production: Generate novel camera angles for post-production
- 🥽 VR/AR: Create free-viewpoint video experiences
- 📱 Social Media: Generate dynamic camera movements for content creation
- 🏢 Architecture: Visualize spaces from multiple viewpoints
- Depth Dependency: Performance relies on monocular depth estimation quality
- Computational Cost: Requires significant computation for high-resolution videos
- Reflective Surfaces: Challenges with reflective or transparent materials
- Real-time inference optimization (3DGS / 4DGS)
- Support for higher resolutions (1K, 2K)
- Neural mesh refinement techniques
We would like to thank the DiffSynth-Studio v1.1.1 project for providing the foundational diffusion framework.
If you find our work useful, please consider citing:
@misc{hu2025ex4dextremeviewpoint4d,
title={EX-4D: EXtreme Viewpoint 4D Video Synthesis via Depth Watertight Mesh},
author={Hu, Tao and Peng, Haoyang and Liu, Xiao and Ma, Yuewen},
year={2025},
eprint={2506.05554},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.05554}
}