
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance


Watch the video

💡 TLDR: Bring Wan I2V to SOTA fine-grained, point-level motion control!

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance [Paper]
Ruihang Chu, Yefei He, Zhekai Chen, Shiwei Zhang, Xiaogang Xu, Bin Xia, Dingdong Wang, Hongwei Yi, Xihui Liu, Hengshuang Zhao, Yu Liu, Yingya Zhang, Yujiu Yang

We present our NeurIPS 2025 paper Wan-Move, a simple and scalable motion-control framework for video generation. Wan-Move offers the following key features:

  • 🎯 High-Quality 5s 480p Motion Control: Through scaled training, Wan-Move can generate 5-second, 480p videos with SOTA motion controllability on par with commercial systems such as Kling 1.5 Pro's Motion Brush, as verified via user studies.

  • 🧩 Novel Latent Trajectory Guidance: Our core idea is to represent the motion condition by propagating the first frame's features along the trajectory, which can be seamlessly integrated into off-the-shelf image-to-video models (e.g., Wan-I2V-14B) without any architecture change or extra motion modules (see the sketch after this list).

  • 🕹️ Fine-grained Point-level Control: Object motions are represented with dense point trajectories, enabling precise, region-level control over how each element in the scene moves.

  • 📊 Dedicated Motion-control Benchmark MoveBench: MoveBench is a carefully curated benchmark with larger-scale samples, diverse content categories, longer video durations, and high-quality trajectory annotations.
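
To make the latent trajectory guidance idea concrete, below is a minimal, illustrative sketch of how first-frame latent features could be propagated along point trajectories to form a motion-condition latent. The tensor shapes, variable names, and nearest-pixel scatter rule are assumptions for illustration only, not the exact implementation in this repository.

# Minimal sketch of latent trajectory guidance (illustrative only; shapes,
# names, and the scatter rule are assumptions, not the authors' exact code).
import torch

def build_motion_condition(first_frame_latent, tracks, visibility):
    """Propagate first-frame latent features along point trajectories.

    first_frame_latent: (C, H, W) latent of the conditioning image.
    tracks:             (T, N, 2) per-frame (x, y) positions of N tracked
                        points, assumed to be given at latent resolution.
    visibility:         (T, N) boolean mask, True where a point is visible.
    Returns:            (T, C, H, W) motion-condition latent, zero elsewhere.
    """
    C, H, W = first_frame_latent.shape
    T, N, _ = tracks.shape
    condition = torch.zeros(T, C, H, W)

    # Each point's feature is sampled once from the first frame ...
    x0 = tracks[0, :, 0].round().long().clamp(0, W - 1)
    y0 = tracks[0, :, 1].round().long().clamp(0, H - 1)
    point_feats = first_frame_latent[:, y0, x0]  # (C, N)

    # ... and scattered to that point's location in every later frame.
    for t in range(T):
        for n in range(N):
            if not visibility[t, n]:
                continue
            x = int(tracks[t, n, 0].round().clamp(0, W - 1))
            y = int(tracks[t, n, 1].round().clamp(0, H - 1))
            condition[t, :, y, x] = point_feats[:, n]
    return condition

The resulting condition latent could then be fed to the I2V backbone together with the noisy video latent (for example, via channel concatenation), which is why no architecture change or extra motion module is needed.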

🙌 We're glad to see Wan-Move being tested on real-world videos by many creators and users.

🔥 Latest News!!

  • Dec 15, 2025: 👋 We've released a local Gradio demo for interactive trajectory drawing and video generation.
  • Dec 10, 2025: 👋 We've released the inference code, model weights, and MoveBench of Wan-Move.
  • Sep 18, 2025: 👋 Wan-Move has been accepted to NeurIPS 2025! 🎉🎉🎉

Community Works

📑 Todo List

  • Wan-Move-480P
    • Multi-GPU inference code of the 14B models
    • Checkpoints of the 14B models
    • Data and evaluation code of MoveBench
    • Gradio demo

Introduction of Wan-Move

Wan-Move supports diverse motion control applications in image-to-video generation. The generated samples (832×480, 5s) exhibit high visual fidelity and accurate motion.

The framework of Wan-Move. (a) How to inject motion guidance. (b) Training pipeline.

The construction pipeline and statistics of MoveBench. Everyone is welcome to use it!

Qualitative comparisons between Wan-Move, academic methods, and commercial solutions.

Quickstart

Installation

💡 Note: Wan-Move is implemented as a minimal extension on top of the Wan2.1 codebase. If you have tried Wan2.1, you can reuse most of your existing setup with very low migration cost.

Clone the repo:

git clone https://github.com/ali-vilab/Wan-Move.git
cd Wan-Move

Install dependencies:

# Ensure torch >= 2.4.0
pip install -r requirements.txt

Model Download

Models              Download Link                      Notes
Wan-Move-14B-480P   🤗 Huggingface · 🤖 ModelScope      5s 480P video generation

Download models using huggingface-cli:

pip install "huggingface_hub[cli]"
huggingface-cli download Ruihang/Wan-Move-14B-480P --local-dir ./Wan-Move-14B-480P

Download models using modelscope-cli:

pip install modelscope
modelscope download churuihang/Wan-Move-14B-480P --local_dir ./Wan-Move-14B-480P

Evaluation on MoveBench

Download MoveBench from Hugging Face

huggingface-cli download Ruihang/MoveBench --local-dir ./MoveBench --repo-type dataset

💡 Note:

  • MoveBench provides the video captions. For a fair evaluation, you should turn off the prompt extension function developed in Wan2.1.
  • MoveBench provides data in both English and Chinese versions. You can select the language via the --language flag: use en for English and zh for Chinese.
  • Single-GPU inference
# For single-object motion test, run: 
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench

# For multi-object motion test, run: 
python generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench

💡 Note:

  • If you want to visualize the trajectory motion effect as in our video demo, add the --vis_track flag. We also provide a separate visualization script, scripts/visualize.py, which supports different visualization settings, for example enabling mouse-button effects! 😊😊😊
  • If you encounter OOM (Out-of-Memory) issues, you can use the --offload_model True and --t5_cpu options to reduce GPU memory usage.
  • The 14B model can be run on a single 40GB GPU with --t5_cpu --offload_model True --dtype bf16! 🤗🤗🤗
  • Multi-GPU inference

    Following Wan2.1, Wan-Move also supports FSDP and xDiT USP to accelerate inference. When running multi-GPU batch evaluation (e.g., evaluating MoveBench or a file containing multiple test cases), you should disable the Ulysses strategy by setting --ulysses_size 1. Ulysses is only supported when generating a single video with multi-GPU inference.

# For single-object motion test, run: 
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode single --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp

# For multi-object motion test, run: 
torchrun --nproc_per_node=8 generate.py --task wan-move-i2v --size 480*832 --ckpt_dir ./Wan-Move-14B-480P --mode multi --language en --save_path results/en --eval_bench --dit_fsdp --t5_fsdp

After all results are generated, you can change the results storage path inside MoveBench/bench.py, then run:

python MoveBench/bench.py

Run the Default Example

For single-video generation (not evaluating MoveBench), we also provide a sample case in the examples folder. You can directly run:

python generate.py \
  --task wan-move-i2v \
  --size 480*832 \
  --ckpt_dir ./Wan-Move-14B-480P \
  --image examples/example.jpg \
  --track examples/example_tracks.npy \
  --track_visibility examples/example_visibility.npy \
  --prompt "A laptop is placed on a wooden table. The silver laptop is connected to a small grey external hard drive and transfers data through a white USB-C cable. The video is shot with a downward close-up lens." \
  --save_file example.mp4
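
If you want to drive generation with your own trajectories, the --track and --track_visibility arguments take NumPy .npy files like the provided examples. The snippet below is a hypothetical sketch of preparing such files; the exact shapes and dtypes expected by generate.py should be checked against examples/example_tracks.npy and examples/example_visibility.npy, and the (frames × points × 2) / (frames × points) layout used here is an assumption based on common point-tracking conventions (e.g., CoTracker).

# Hypothetical sketch of building custom trajectory files (shapes, dtypes,
# and file names are assumptions; verify against the provided examples).
import numpy as np

num_frames, num_points = 81, 2   # frame count assumed to match the generated clip
tracks = np.zeros((num_frames, num_points, 2), dtype=np.float32)
visibility = np.ones((num_frames, num_points), dtype=bool)

# Point 0: slide horizontally from (100, 240) to (500, 240) in pixel coordinates.
tracks[:, 0, 0] = np.linspace(100, 500, num_frames)   # x
tracks[:, 0, 1] = 240                                  # y

# Point 1: keep a second region anchored at (700, 120).
tracks[:, 1] = [700, 120]

np.save("my_tracks.npy", tracks)
np.save("my_visibility.npy", visibility)

You can then pass these files via --track my_tracks.npy --track_visibility my_visibility.npy in the command above.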

Gradio Demo

We provide a local Gradio demo for interactive trajectory drawing and video generation.

  1. Launch the Demo:
python gradio_app.py \
    --task wan-move-i2v \
    --size 480*832 \
    --ckpt_dir ./Wan-Move-14B-480P \
    --t5_cpu \
    --offload_model True \
    --dtype bf16 \
    --port 7860 \
    --share
  2. Features:

    • Multi-Trajectory Control: Draw multiple trajectories with distinct colors.
    • Speed Control: Adjust the speed curve for each trajectory independently.
    • Real-time Preview: Visualize your drawn trajectories on the input image and as a GIF.
    • Lazy Loading: The model loads only when you start generation, ensuring fast startup.
    • History Gallery: View your previously generated videos.
  3. Usage:

    • Upload an image.
    • Click on the image to add trajectory points.
    • (Optional) Adjust the speed curve in the editor.
    • Select "Create New..." in the dropdown to add more trajectories.
    • Click "Generate Video".

Citation

If you find our work helpful, please cite us.

@article{chu2025wan,
      title={Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance},
      author={Ruihang Chu and Yefei He and Zhekai Chen and Shiwei Zhang and Xiaogang Xu and Bin Xia and Dingdong Wang and Hongwei Yi and Xihui Liu and Hengshuang Zhao and Yu Liu and Yingya Zhang and Yujiu Yang},
      year={2025},
      eprint={2512.08765},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

License Agreement

The models in this repository are licensed under the Apache 2.0 License. We claim no rights over your generated content, granting you the freedom to use it while ensuring that your usage complies with the provisions of this license. You are fully accountable for your use of the models, which must not involve sharing any content that violates applicable laws, causes harm to individuals or groups, disseminates personal information intended for harm, spreads misinformation, or targets vulnerable populations. For a complete list of restrictions and details regarding your rights, please refer to the full text of the license.

Acknowledgements

We would like to thank the contributors to the Wan, CoTracker, umt5-xxl, and HuggingFace repositories for their open research.

Contact Us

If you would like to leave a message for our research team, feel free to drop us an email.
