robingg1/PoseTraj

PoseTraj

[CVPR 2025] PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

Official implementation of the paper "PoseTraj: Pose-Aware Trajectory Control in Video Diffusion".


Updates

  • Support Gradio demo and more checkpoints.
  • Release checkpoint on VIPSeg.
  • Release training and inference code.
  • Release dataset and rendering process.
  • Repo initialization.

Abstract

Recent advancements in trajectory-guided video generation have achieved notable progress. However, existing models still face challenges in generating object motions with potentially changing 6D poses under wide-range rotations, due to limited 3D understanding. To address this problem, we introduce PoseTraj, a pose-aware video dragging model for generating 3D-aligned motion from 2D trajectories. Our method adopts a novel two-stage pose-aware pretraining framework, improving 3D understanding across diverse trajectories. Specifically, we propose a large-scale synthetic dataset, PoseTraj-10k, containing 10k videos of objects following rotational trajectories, and enhance the model's perception of object pose changes by incorporating 3D bounding boxes as intermediate supervision signals. Following this, we fine-tune the trajectory-controlling module on real-world videos, applying an additional camera-disentanglement module to further refine motion accuracy. Experiments on various benchmark datasets demonstrate that our method not only excels in 3D pose-aligned dragging for rotational trajectories but also outperforms existing baselines in trajectory accuracy and video quality.
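To make the notion of a 2D drag trajectory concrete, here is a minimal sketch that samples points along a circular arc in image coordinates. The function name `arc_trajectory`, its parameters, and the list-of-(x, y)-tuples representation are assumptions made for illustration only; this README does not specify the trajectory format that the PoseTraj scripts expect.

```python
import math


def arc_trajectory(center, radius, start_deg, end_deg, num_points):
    """Sample (x, y) points along a circular arc (hypothetical helper).

    Angles are in degrees and interpolated linearly, so the points are
    evenly spaced along the arc from start_deg to end_deg inclusive.
    """
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    cx, cy = center
    points = []
    for i in range(num_points):
        t = i / (num_points - 1)  # goes from 0.0 to 1.0
        angle = math.radians(start_deg + t * (end_deg - start_deg))
        points.append((cx + radius * math.cos(angle),
                       cy + radius * math.sin(angle)))
    return points


if __name__ == "__main__":
    # A quarter-circle drag around pixel (256, 256) with a 100-pixel radius.
    for x, y in arc_trajectory((256, 256), 100, 0, 90, 5):
        print(f"{x:.1f}, {y:.1f}")
```

A curved path like this is only one simple example of a trajectory along which an object's pose would be expected to change; the trajectories in PoseTraj-10k may be constructed differently.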


Pose-Aware Dragging for Rotational Motions

Input Image | Drag Trajectory | Generated Video

1. Environment Setup

Step 1: Create and Activate the Environment

conda create -n PoseTraj python=3.8
conda activate PoseTraj
pip install -r requirements.txt
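After installing, a quick sanity check can confirm that the active interpreter matches the Python version requested above and that key packages are importable. The sketch below is an illustration, not part of the repository: `check_environment` is a hypothetical helper, and `torch` is assumed to be among the dependencies listed in requirements.txt.

```python
import importlib.util
import sys


def check_environment(required=("torch",), expected_python=(3, 8)):
    """Return a list of problems; an empty list means every check passed.

    `required` holds top-level package names to look up. The default,
    "torch", is an assumption about what requirements.txt installs.
    """
    problems = []
    if sys.version_info[:2] != tuple(expected_python):
        found = ".".join(str(part) for part in sys.version_info[:2])
        wanted = ".".join(str(part) for part in expected_python)
        problems.append(f"Python {wanted} expected, found {found}")
    for name in required:
        # find_spec returns None when a top-level package cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("environment looks OK" if not issues else "\n".join(issues))
```

Run it inside the activated PoseTraj environment; any reported line points to a version mismatch or a package that did not install.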

Step 2: Download Model Weights

Download the PoseTraj model weights from Google Drive.

Download the SVD model weights from the hub.

2. Dataset Preparation

You can either use our pre-processed dataset or create your own.

Option 1: Download Prebuilt Dataset

split1, split2

Option 2: Construct Your Own Dataset

Refer to the detailed steps in data_render/ to generate your own dataset.

3. Running Inference

To perform inference, simply run:

python scripts/run_inference_vipseg_json_repro.py

Gradio demo support is coming soon!

4. Training Instructions

Two-stage pretraining:

sh scripts/start_10k_pretrain.sh

Open-domain fine-tuning:

# with the camera-disentanglement module
sh scripts/start_ft_cam.sh

# without the camera-disentanglement module
sh scripts/start_ft.sh
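The abstract describes pretraining followed by fine-tuning, so the two stages can be chained in that order. The sketch below is one hedged way to do this from Python; `build_training_plan` and `run_training` are hypothetical helpers, and only the script paths come from this README. Whether the fine-tuning scripts pick up the pretraining checkpoint automatically is not stated here, so paths inside the scripts may need to be set by hand.

```python
import subprocess

# Script paths as listed in this README.
PRETRAIN_SCRIPT = "scripts/start_10k_pretrain.sh"
FINETUNE_WITH_CAM = "scripts/start_ft_cam.sh"
FINETUNE_NO_CAM = "scripts/start_ft.sh"


def build_training_plan(use_camera_disentangle=True):
    """Return the commands in execution order: pretraining, then fine-tuning."""
    finetune = FINETUNE_WITH_CAM if use_camera_disentangle else FINETUNE_NO_CAM
    return [["sh", PRETRAIN_SCRIPT], ["sh", finetune]]


def run_training(use_camera_disentangle=True, dry_run=True):
    """Print the plan when dry_run is True; otherwise run each stage in turn."""
    plan = build_training_plan(use_camera_disentangle)
    for command in plan:
        if dry_run:
            print("would run:", " ".join(command))
        else:
            # check=True stops the pipeline if a stage exits with an error.
            subprocess.run(command, check=True)
    return plan


if __name__ == "__main__":
    run_training(use_camera_disentangle=True, dry_run=True)
```

The dry run is the default so the order of stages can be inspected before committing to long training runs; pass `dry_run=False` from the repository root to execute them.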
