LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion

Project Page | Paper | Poster

TODO

  • Code cleanup and refactoring
  • Integrate motionrender for visualization
  • Release test set of OOD Oakink Objects
  • Upload pretrained GraspVAE checkpoint

Environment Setup

For environment setup instructions, please refer to projects/mdm_hand/environment.md.

Data Preparation

Hand Model (MANO)

Step 1: Register and Download

  1. Register for a MANO account.
  2. Download the checkpoint files.

Step 2: File Placement

Place the downloaded files under mdm_hand/data/body_models/mano.

Expected Folder Structure

The directory structure should look like this:

data/
├── body_models/
│   └── mano/
│       ├── MANO_LEFT.pkl
│       └── MANO_RIGHT.pkl
├── grab/
│   ├── grab_frames
│   └── grab_seq20fps
└── oakink/
    └── oakink_aligned
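
A quick way to confirm the layout is in place is a small path check like the sketch below. The paths are taken from the tree above (the grab/ and oakink/ entries will only exist after the download steps below); adjust the root if your checkout differs.

import os

root = "projects/mdm_hand/data"  # run from the repository root; adjust if needed
expected = [
    "body_models/mano/MANO_LEFT.pkl",
    "body_models/mano/MANO_RIGHT.pkl",
    "grab/grab_frames",
    "grab/grab_seq20fps",
    "oakink/oakink_aligned",
]
for rel in expected:
    path = os.path.join(root, rel)
    print(("OK      " if os.path.exists(path) else "MISSING ") + path)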

Download processed dataset

For convenience, the processed data can be downloaded from Hugging Face. Run the following script:

cd projects/mdm_hand/data
wget https://huggingface.co/datasets/jojo23333/LatentHOI-data/resolve/main/grab_frames.tar.gz
wget https://huggingface.co/datasets/jojo23333/LatentHOI-data/resolve/main/grab_seq20fps.tar.gz
tar -xzvf grab_frames.tar.gz
tar -xzvf grab_seq20fps.tar.gz

We also provide the preprocessed Oakink split used in our code, with mesh coordinate directions calibrated to the GRAB objects:

cd projects/mdm_hand/data
unzip oakink.zip

(Alternatively) Prepare from Raw

All data preparation scripts should be run from the projects/mdm_hand/datasets/GRAB directory.

GRAB Dataset

# For VAE training (single-frame hand data; left hands not in contact are omitted)
python grab/grab_preprocessing_adapt_flat_hand.py

# For Diffusion model training (sequence data)
python grab/grab_preprocessing_all_seq.py

DexYCB Dataset

# For VAE training
python grab/dexycb_preprocessing_all_seq.py

# For Diffusion model training (with --seq flag)
python grab/dexycb_preprocessing_all_seq.py --seq

Training

GraspVAE

# GRAB
python -m tools.train_vae --num-gpus 1 --resume --config config/VAE/VAE_grab.yaml

# DexYCB
python -m tools.train_vae --num-gpus 1 --resume --config config/VAE/VAE_dexycb.yaml

Alternatively, you can use our pretrained GraspVAE checkpoint here: https://drive.google.com/drive/folders/13dvExxUbENk9DF4XhNBO1NKAC7tx0Em8?usp=sharing
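
For example, the shared folder can be fetched programmatically with the third-party gdown package (an assumption on our part; any Google Drive client works, and the output directory name below is arbitrary):

import gdown  # pip install gdown

url = "https://drive.google.com/drive/folders/13dvExxUbENk9DF4XhNBO1NKAC7tx0Em8"
# download the shared folder into a local directory of your choice
gdown.download_folder(url=url, output="pretrained_graspvae", quiet=False)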

Latent Diffusion

In the configs, set DIFFUSION.VAE_CHECKPOINT to the path of the VAE checkpoint trained above (a small sanity-check sketch follows the commands below).

# GRAB
python -m tools.train_diff --num-gpus 2 --mode ldm --resume --config config/grab/LDM_pretrain_vae.yaml 

# DexYCB
python -m tools.train_diff --num-gpus 2 --mode ldm --resume --config config/dexycb/LDM_pretrain_vae.yaml 
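
Before launching, a quick check that the config actually points at an existing checkpoint can save a failed run. The sketch below assumes the config is plain YAML with a DIFFUSION.VAE_CHECKPOINT entry defined directly in the file; adjust the key lookup if the config uses a different layout or base-config includes.

import os
import yaml

cfg_path = "config/grab/LDM_pretrain_vae.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# assumes DIFFUSION.VAE_CHECKPOINT is defined directly in this file
ckpt = cfg["DIFFUSION"]["VAE_CHECKPOINT"]
print(ckpt, "OK" if os.path.isfile(ckpt) else "NOT FOUND")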

Motion Generation

# Generate for Oakink split
python -m tools.train_diff --mode ldm --eval-only --config config/oakink/ldm_oakink.yaml TEST.BATCH_SIZE 9

Motion Visualization & Evaluation

During training, intermediate results for evaluation are stored at the frequency defined by TEST.EVAL_PERIOD. <path_to_vis_folder> should contain these evaluation results as .pth files; the commands below visualize/evaluate every .pth file in the folder (a small inspection sketch follows the commands).

# Basic evaluation
python -m tools.eval_motion -f <path_to_vis_folder>

# With visualization (generates videos)
python -m tools.eval_motion -f <path_to_vis_folder> --vis

# With physics evaluation
python -m tools.eval_motion -f <path_to_vis_folder> --eval

# For DexYCB dataset
python -m tools.eval_motion -f <path_to_vis_folder> --dex
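
If you are unsure what a results folder contains, a small inspection sketch like the following lists the stored .pth files and their top-level keys (the exact contents are repo-specific; this only peeks at the structure):

import glob
import torch

folder = "<path_to_vis_folder>"  # same folder passed to tools.eval_motion
for path in sorted(glob.glob(f"{folder}/*.pth")):
    data = torch.load(path, map_location="cpu")
    # results are typically dicts; print their keys, otherwise the type
    print(path, list(data.keys()) if isinstance(data, dict) else type(data))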

Visualization

Visualization is integrated into the evaluation process. Use the --vis flag with the evaluation command to generate videos of the hand motions:

python -m tools.eval_motion -f <path_to_vis_folder> --vis

Troubleshooting

OpenGL Headless Rendering Issue

Modify the AITViewer backend to use EGL in aitviewer/viewer.py line 129:

self.window = base_window_cls(
    title=title,
    size=size,
    fullscreen=C.fullscreen,
    resizable=C.resizable,
    gl_version=self.gl_version,
    aspect_ratio=None,
    vsync=C.vsync,
    samples=self.samples,
    cursor=True,
    backend="egl"
)

FFMPEG Video Export Issue

The ffmpeg installed in the environment might not recognize the presets used in the export commands. Possible solutions:

  1. Download and replace the conda environment ffmpeg as described in StyleSDF issue #20
  2. Use the system's global ffmpeg installation

Memory Leak in Headless Rendering

As reported in AITViewer issue #53, use sub-processes to run rendering commands.
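
A minimal sketch of that workaround, assuming you drive the rendering through tools.eval_motion and have a list of result folders to process (the folder names below are hypothetical):

import subprocess

folders = ["outputs/run1", "outputs/run2"]  # hypothetical result folders
for f in folders:
    # one fresh process per folder, so memory leaked by headless rendering
    # is released when the process exits
    subprocess.run(["python", "-m", "tools.eval_motion", "-f", f, "--vis"], check=True)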

ValueError: bad value(s) in fds_to_keep

This error occurs when storing shared tensors for dataloader workers. Solution:

# deepcopy detaches the chunks from the original (possibly shared) tensor
mean_latent, std_latent = copy.deepcopy(torch.chunk(mean_latent, 2, dim=-1))
# store as numpy arrays so dataloader workers do not inherit shared torch tensors
dataset.mean_latent, dataset.std_latent = mean_latent.numpy(), std_latent.numpy()

Adding .numpy() converts tensors to numpy arrays, solving the shared tensor issue.

Citation

@InProceedings{Muchen_LatentHOI,
    author    = {Li, Muchen and Christen, Sammy and Wan, Chengde and Cai, Yujun and Liao, Renjie and Sigal, Leonid and Ma, Shugao},
    title     = {LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion.},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {17416-17425}
}
