# Ponimator: Unfolding Interactive Pose for Versatile Human-Human Interaction Animation

ICCV, 2025
Shaowei Liu · Chuan Guo* · Bing Zhou* · Jian Wang*
This repository contains the PyTorch implementation for the paper Ponimator: Unfolding Interactive Pose for Versatile Human-Human Interaction Animation, ICCV 2025. In this paper, we propose a unified framework for human-human interaction animation and generation, anchored on interactive poses.
- Installation
- Interactive Pose Animation Demo
- Interactive Motion Generation Demo
- Training and Inference
- Custom Third-party Scripts
- Citation
## Installation

- Clone this repository:

  ```bash
  git clone https://github.com/stevenlsw/ponimator.git
  cd ponimator
  ```
- Our code has minimal dependencies:

  ```bash
  # most versions of Python and PyTorch should work
  conda create -n ponimator python=3.10
  conda activate ponimator
  pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu124
  pip3 install aitviewer yacs roma git+https://github.com/openai/CLIP.git
  ```
- [Optional] Install visualization dependencies:

  ```bash
  pip install blendify --find-links https://download.blender.org/pypi/bpy/
  pip install imageio[ffmpeg]
  ```
- Prepare the SMPL-X body models. You can download a minimal version from the minimal smplx folder and put the `smplx` folder under `body_models/` (a quick load check is sketched at the end of this section):

  ```
  body_models/
  ├── smplx
      ├── SMPLX_MALE.npz
      ├── SMPLX_FEMALE.npz
      ├── SMPLX_NEUTRAL.npz
      ├── SMPLX_MALE.pkl
      ├── SMPLX_FEMALE.pkl
      ├── SMPLX_NEUTRAL.pkl
  ```
- Download the pretrained checkpoints: Interactive Pose Animator and Interactive Pose Generator, and put them under `checkpoints/` (a quick load check is sketched at the end of this section):

  ```
  checkpoints/
  ├── contactmotion.ckpt
  ├── contactpose.ckpt
  ```
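If you want to verify the body-model setup before running the demos, the one-liner below is a minimal sketch that loads the neutral SMPL-X model from `body_models/`. It assumes the `smplx` Python package is available, which is not part of the dependency list above (`pip install smplx` if needed).

```bash
# Optional sanity check: load the neutral SMPL-X model from body_models/smplx/.
# Assumes the `smplx` package is installed; it is not in the dependency list above.
python -c "import smplx; print(smplx.create('body_models', model_type='smplx', gender='neutral', ext='npz'))"
```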
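Similarly, you can check that both downloaded checkpoints deserialize. This sketch only assumes they are standard PyTorch checkpoint files, as the `.ckpt` extension suggests.

```bash
# Optional: confirm the downloaded checkpoints can be loaded with PyTorch.
# weights_only=False because the checkpoints may contain more than bare tensors.
python -c "import torch; torch.load('checkpoints/contactpose.ckpt', map_location='cpu', weights_only=False); print('contactpose.ckpt OK')"
python -c "import torch; torch.load('checkpoints/contactmotion.ckpt', map_location='cpu', weights_only=False); print('contactmotion.ckpt OK')"
```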
## Interactive Pose Animation Demo

- [Optional] Given an interactive pose image, you can estimate the interactive pose with buddi; check the custom third-party scripts section.
- We put the buddi demo outputs under `data/buddi/{video_name}`. You can generate interactive motion by:

  ```bash
  python scripts/run_pose2motion.py --data_dir data/buddi/Couple_6806 --save
  ```
- Visualize the interactive pose by appending `--vis_interactive_pose` to the above command. For off-screen rendering, you can add `xvfb-run -a -s "-screen 0 1024x768x24"` before the command (a combined invocation is sketched at the end of this section). The output video will be saved under `outputs/Couple_6806`. You can also turn off visualization by appending `--disable_vis` to the command. The output video will be saved under `outputs/Couple_6806/vis_motion_pred.mp4`.
- [Optional] To render the generated motion on the original image:

  ```bash
  python demo/vis_smpler.py --data_dir data/buddi/Couple_6806 --result_dir outputs/Couple_6806
  ```

  The output video will be saved under `outputs/Couple_6806/render_motion_pred.mp4`.
- You are expected to get the following result:

  | Input Image | Generated Motion | Output Video |
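For reference, here is a combined invocation of the animation step above with off-screen rendering and interactive-pose visualization enabled; it only puts together the command and flags already listed in this section.

```bash
# Off-screen rendering plus interactive-pose visualization, as described above
xvfb-run -a -s "-screen 0 1024x768x24" \
    python scripts/run_pose2motion.py \
    --data_dir data/buddi/Couple_6806 \
    --save \
    --vis_interactive_pose
```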
## Interactive Motion Generation Demo

- [Optional] Given a single-person image, you can use SMPLer-X to estimate the single-person pose, or animate an existing single-person pose dataset such as Motion-X. See custom third-party scripts for more details.
- We put the demo Motion-X data under `data/motionx/{video_name}`. You can generate interactive motion by:

  ```bash
  python scripts/run_singlepose2motion.py --data_dir data/motionx/Back_Flip_Kungfu_wushu_Trim9_clip1 --save
  ```

  The output video will be saved under `outputs/Back_Flip_Kungfu_wushu_Trim9_clip1/vis_motion_pred.mp4`. You can adjust `--seed`, `--inter_time_idx`, `--gender`, and `--text` for different outputs (an example variation is sketched at the end of this section).
- [Optional] To render the generated motion on the original image:

  ```bash
  python demo/vis_motionx.py --data_dir data/motionx/Back_Flip_Kungfu_wushu_Trim9_clip1 --result_dir outputs/Back_Flip_Kungfu_wushu_Trim9_clip1
  ```

  The output video will be saved under `outputs/Back_Flip_Kungfu_wushu_Trim9_clip1/render_motion_pred.mp4`.
- You are expected to get the following result:

  | Input Image | Generated Motion | Output Video |
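As a reference for the adjustable flags mentioned above, here is one possible variation of the generation command. The specific values below (seed, time index, gender, prompt) are only illustrative, not defaults from the repository.

```bash
# Example variation of the command above; flag values are illustrative only.
python scripts/run_singlepose2motion.py \
    --data_dir data/motionx/Back_Flip_Kungfu_wushu_Trim9_clip1 \
    --save \
    --seed 2 \
    --inter_time_idx 0 \
    --gender female \
    --text "two people dance together"
```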
## Custom Third-party Scripts

- Estimate the interactive pose with Buddi: a custom script is provided at `third_party_scripts/buddi/custom_demo.sh`; put it under the `buddi` root directory.

  ```bash
  cd buddi/
  chmod +x custom_demo.sh
  ./custom_demo.sh ${input_image_path} ${gpu_id}
  ```
- Estimate the single-person pose with SMPLest-X: a custom script is provided at `third_party_scripts/SMPLest-X/custom_inference.py`; put it under the `SMPLest-X/main` folder.

  ```bash
  cd SMPLest-X
  python main/custom_inference.py --file_name ${input_image_path} --ckpt_name smplest_x_h --output_folder ${output_folder} --save_pkl
  ```
- An example estimated single pose from SMPLest-X is shown below:

  | Input Image | Estimated Single Pose |

  Quick note: the example image shows the most famous statue on Waikiki Beach, Hawaii.
- Generate interactive motion for the custom text `"two person hug each other"` by:

  ```bash
  python scripts/run_singlepose2motion.py --save --data_dir data/smpler --data_source smpler --text "two person hug each other" --seed 1 --save_dir outputs/smpler
  ```

  The default output is under `outputs/image_ori`.
- Render the output video on the original image (an end-to-end sketch chaining the estimation, generation, and rendering steps follows this list):

  ```bash
  python demo/vis_smpler.py --data_dir data/smpler --result_dir outputs/image_ori
  ```

  The output video will be saved under `outputs/image_ori/rendered_video.mp4`.
- You are expected to get the following result:

  | Input Image | Generated Motion | Output Video |
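For convenience, the steps of this section can be chained for a custom single-person image. The sketch below only reuses the commands shown above; how the SMPLest-X output folder maps onto the `--data_dir` expected by `run_singlepose2motion.py` (assumed here to be `data/smpler` inside the ponimator repo) is an assumption, so adapt the paths to your setup.

```bash
# End-to-end sketch for a custom image; the hand-off paths between steps are assumptions.
# 1. Estimate the single-person pose with SMPLest-X (run from the SMPLest-X root).
cd SMPLest-X
python main/custom_inference.py --file_name ${input_image_path} --ckpt_name smplest_x_h \
    --output_folder ../ponimator/data/smpler --save_pkl   # assumed output location
cd ../ponimator

# 2. Generate interactive motion from the estimated pose and a text prompt.
python scripts/run_singlepose2motion.py --save --data_dir data/smpler --data_source smpler \
    --text "two person hug each other" --seed 1 --save_dir outputs/smpler

# 3. Render the generated motion on the original image.
python demo/vis_smpler.py --data_dir data/smpler --result_dir outputs/image_ori
```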
## Citation

If you find our work useful in your research, please cite:

```bibtex
@inproceedings{liu2025ponimator,
  title={Ponimator: Unfolding Interactive Pose for Versatile Human-Human Interaction Animation},
  author={Liu, Shaowei and Guo, Chuan and Zhou, Bing and Wang, Jian},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2025}
}
```
Our model is based on InterGen. We are also grateful to several other open-source repositories that we built upon during the development of our pipeline: