Zhaoxi Chen, Tianqi Liu, Long Zhuo, Jiawei Ren, Zeng Tao, He Zhu, Fangzhou Hong, Liang Pan†, Ziwei Liu†
TL;DR: 4DNeX is a feed-forward framework for generating 4D scene representations from a single image by fine-tuning a video diffusion model. It produces high-quality dynamic point clouds and enables downstream tasks such as novel-view video synthesis with strong generalizability.
We present 4DNeX, the first feed-forward framework for generating 4D (i.e., dynamic 3D) scene representations from a single image. In contrast to existing methods that rely on computationally intensive optimization or require multi-frame video inputs, 4DNeX enables efficient, end-to-end image-to-4D generation by fine-tuning a pretrained video diffusion model. Specifically, 1) To alleviate the scarcity of 4D data, we construct 4DNeX-10M, a large-scale dataset with high-quality 4D annotations generated using advanced reconstruction approaches. 2) We introduce a unified 6D video representation that jointly models RGB and XYZ sequences, facilitating structured learning of both appearance and geometry. 3) We propose a set of simple yet effective adaptation strategies to repurpose pretrained video diffusion models for the 4D generation task. 4DNeX produces high-quality dynamic point clouds that enable novel-view video synthesis. Extensive experiments demonstrate that 4DNeX achieves competitive performance compared to existing 4D generation approaches, offering a scalable and generalizable solution for single-image-based 4D scene generation.
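To make the unified 6D video representation concrete, the sketch below shows one plausible way to pack an RGB clip and its per-pixel XYZ point maps into a single tensor. The channel-wise concatenation and the tensor shapes are illustrative assumptions only, not necessarily the exact layout used by the model:
# Sketch: pack an RGB clip and its per-pixel XYZ point maps into one "6D" video.
# The channel-wise concatenation and these shapes are illustrative assumptions.
import torch

T, H, W = 49, 480, 832                   # assumed clip length and resolution
rgb = torch.rand(T, 3, H, W)             # appearance: RGB frames in [0, 1]
xyz = torch.randn(T, 3, H, W)            # geometry: per-pixel 3D coordinates
video_6d = torch.cat([xyz, rgb], dim=1)  # (T, 6, H, W) joint appearance + geometry
print(video_6d.shape)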
- Data Preprocessing Scripts
- Training Scripts
- Inference Scripts
- Pointmap Registration Scripts
- Visualization Scripts
We use Anaconda or Miniconda to manage the Python environment:
conda create -n "4dnex" python=3.10 -y
conda activate 4dnex
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
# git lfs and rerun
conda install -c conda-forge git-lfs
conda install -c conda-forge rerun-sdk
Our model is built on top of Wan2.1 I2V 14B. Please download the pretrained model from Hugging Face and place it in the pretrained directory with the following structure:
4DNeX/
└── pretrained/
└── Wan2.1-I2V-14B-480P-Diffusers/
├── model_index.json
├── scheduler/
├── transformer/
├── vae/
├── text_encoder/
├── tokenizer/
└── ...
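If you prefer to fetch the base model programmatically, here is a minimal sketch using huggingface_hub. The repo id Wan-AI/Wan2.1-I2V-14B-480P-Diffusers is an assumption; substitute the official Diffusers checkpoint you actually use:
# Sketch: download the Wan2.1 I2V base model into ./pretrained.
# The repo id below is an assumption; replace it with the checkpoint you use.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Wan-AI/Wan2.1-I2V-14B-480P-Diffusers",  # assumed repo id
    local_dir="./pretrained/Wan2.1-I2V-14B-480P-Diffusers",
)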
Then, you may download our pretrained LoRA weights from Hugging Face here and place them in the ./pretrained directory:
cd pretrained
mkdir 4dnex-lora
cd 4dnex-lora
huggingface-cli download FrozenBurning/4DNex-Lora --local-dir .
cd ../..
export PRETRAINED_LORA_PATH=./pretrained/4dnex-lora
After setting up the environment and the pretrained models, you can run the following command to generate 4D scene representations from a single image; the output video and point map will be saved in the OUTPUT_DIR directory. Assuming we want to save the results in the ./results directory:
export OUTPUT_DIR=./results
python inference.py --prompt ./example/prompt.txt --image ./example/image.txt --out $OUTPUT_DIR --sft_path ./pretrained/Wan2.1-I2V-14B-480P-Diffusers/transformer --type i2vwbw-demb-samerope --mode xyzrgb --lora_path $PRETRAINED_LORA_PATH --lora_rank 64
The path to the input image is stored in ./example/image.txt and the prompt in ./example/prompt.txt. Feel free to modify the prompt and image path to generate your own 4D scene representations.
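For reference, the snippet below writes a one-sample prompt/image list. The one-entry-per-line format and the example paths are assumptions for illustration:
# Sketch: write a one-sample prompt/image list for inference.
# The one-entry-per-line format and these paths are assumptions.
with open("./example/image.txt", "w") as f:
    f.write("./example/images/cat.png\n")  # hypothetical input image path
with open("./example/prompt.txt", "w") as f:
    f.write("A cat turns its head in a sunlit living room.\n")  # hypothetical caption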
To visualize the generated 4D scene representations, you may first perform pointmap registration using the following command:
python pm_registration.py --pkl_dir $OUTPUT_DIR
Then, you can visualize the registration results with Rerun as follows:
python rerun_vis.py --rr_recording test_log.rrd --pkl_dir $OUTPUT_DIR
rerun test_log.rrd --web-viewer
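If you want to inspect the registered point maps without the provided script, the sketch below logs them to Rerun directly. It assumes the registration step writes one pickle per frame containing xyz and rgb arrays under those keys; adapt the key names and paths to the actual output files:
# Sketch: log registered point maps to Rerun.
# Assumes one .pkl per frame with "xyz" (N, 3) and "rgb" (N, 3) arrays; adjust as needed.
import glob, pickle
import rerun as rr

rr.init("4dnex_pointmaps")
rr.save("pointmaps.rrd")  # view later with: rerun pointmaps.rrd --web-viewer

for frame_idx, pkl_path in enumerate(sorted(glob.glob("./results/*.pkl"))):
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    rr.set_time_sequence("frame", frame_idx)
    rr.log("world/points", rr.Points3D(data["xyz"], colors=data["rgb"]))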
Please check out our 4DNeX-10M dataset from here and place it in the ./data directory. The data can be organized in the following structure:
data/
├── dynamic/
│ ├── dynamic_1/
│ ├── dynamic_2/
│ └── dynamic_3/
├── static/
│ ├── static_1/
│ └── static_2/
├── caption/
│ ├── dynamic_1_with_caption_upload.csv
│ ├── dynamic_2_with_caption_upload.csv
│ ├── dynamic_3_with_caption_upload.csv
│ ├── static_1_with_caption_upload.csv
│ └── static_2_with_caption_upload.csv
└── raw/
├── dynamic/
│ ├── dynamic_1/
│ ├── dynamic_2/
│ └── dynamic_3/
└── static/
├── static_1/
└── static_2/
Run the command below to preprocess it:
python build_wan_dataset.py \
--data_dir ./data \
--out ./data/wan21
Once preprocessing is finished, the output directory will be organized as follows:
wan21/
├── cache/
├── videos/
├── first_frames/
├── pointmap/
├── pointmap_latents/
├── prompts.txt
├── videos.txt
└── generated_datalist.txt
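As a quick sanity check on the preprocessed output, the snippet below verifies that prompts.txt and videos.txt are line-aligned; the one-entry-per-line format is an assumption about these files:
# Sketch: check that the preprocessed prompt/video lists are line-aligned.
# Assumes prompts.txt and videos.txt contain one entry per line.
from pathlib import Path

root = Path("./data/wan21")
prompts = root.joinpath("prompts.txt").read_text().splitlines()
videos = root.joinpath("videos.txt").read_text().splitlines()
assert len(prompts) == len(videos), f"{len(prompts)} prompts vs {len(videos)} videos"
print(f"{len(videos)} training samples found")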
To launch training, assuming all preprocessed data are in the ./data/wan21 directory, run the following command:
bash scripts/finetune.sh
After training, you may convert the ZeRO checkpoint to an fp32 checkpoint for inference. For example, the following command saves the converted weights to the ./training/4dnex/5000-out directory:
python scripts/zero_to_fp32.py ./training/4dnex/checkpoint-5000 ./training/4dnex/5000-out --safe_serialization
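To confirm the conversion succeeded, you can load the exported weights with safetensors, as sketched below. The model.safetensors filename is an assumption (the converter may shard the output), so adjust the path to whatever appears in 5000-out:
# Sketch: inspect the converted fp32 weights.
# "model.safetensors" is an assumed filename; the converter may shard the output.
from safetensors.torch import load_file

state_dict = load_file("./training/4dnex/5000-out/model.safetensors")
print(f"{len(state_dict)} tensors, e.g. {next(iter(state_dict))}")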
If you find our work useful for your research, please consider citing our paper:
@article{chen20254dnex,
title={4DNeX: Feed-Forward 4D Generative Modeling Made Easy},
author={Chen, Zhaoxi and Liu, Tianqi and Zhuo, Long and Ren, Jiawei and Tao, Zeng and Zhu, He and Hong, Fangzhou and Pan, Liang and Liu, Ziwei},
journal={arXiv preprint arXiv:2508.13154},
year={2025}
}