- Code cleanup and refactoring
- Integrate motionrender for visualization
- Release test set of OOD Oakink Objects
- Uploaded Pretrained GraspVAE Checkpoint
For environment setup instructions, please refer to projects/mdm_hand/environment.md.
- Register for a MANO account.
- Download the MANO model files.
Place the downloaded files under mdm_hand/data/body_models/mano.
The directory structure should look like this:
data/
├── body_models/
│   └── mano/
│       ├── MANO_LEFT.pkl
│       └── MANO_RIGHT.pkl
├── grab/
│   ├── grab_frames
│   └── grab_seq20fps
└── oakink/
    └── oakink_aligned
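As a quick sanity check, the layout can be verified with a short script; this is a minimal sketch (the data root path is an assumption, adjust it to where you placed the data):
# Verify the expected data layout (hypothetical helper, not part of the repo)
from pathlib import Path

DATA_ROOT = Path("projects/mdm_hand/data")  # assumed data root; adjust if yours differs
expected = [
    "body_models/mano/MANO_LEFT.pkl",
    "body_models/mano/MANO_RIGHT.pkl",
    "grab/grab_frames",
    "grab/grab_seq20fps",
    "oakink/oakink_aligned",
]
for rel in expected:
    path = DATA_ROOT / rel
    print(("OK      " if path.exists() else "MISSING ") + str(path))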
For convenience, the processed data can be downloaded from Hugging Face by running the following script:
cd projects/mdm_hand/data
wget https://huggingface.co/datasets/jojo23333/LatentHOI-data/resolve/main/grab_frames.tar.gz
wget https://huggingface.co/datasets/jojo23333/LatentHOI-data/resolve/main/grab_seq20fps.tar.gz
tar -xzvf grab_frames.tar.gz
tar -xzvf grab_seq20fps.tar.gz
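If wget is not available, the same archives can be fetched with the huggingface_hub Python client; a minimal sketch (filenames taken from the commands above, download location is illustrative):
# Alternative download via the huggingface_hub Python client
from huggingface_hub import hf_hub_download

for fname in ["grab_frames.tar.gz", "grab_seq20fps.tar.gz"]:
    local_path = hf_hub_download(
        repo_id="jojo23333/LatentHOI-data",
        filename=fname,
        repo_type="dataset",  # the archives live in a dataset repo
        local_dir=".",        # place the archives in the current directory
    )
    print("downloaded to", local_path)
The archives still need to be extracted with tar as shown above.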
We also provide the preprocessed OakInk split used in our code, with the mesh coordinate directions calibrated to the GRAB objects:
cd projects/mdm_hand/data
unzip oakink.zip
All data preparation scripts should be run from the projects/mdm_hand/datasets/GRAB directory.
GRAB Dataset
# For VAE training (single-frame hand data; left hands not in contact are omitted)
python grab/grab_preprocessing_adapt_flat_hand.py
# For Diffusion model training (sequence data)
python grab/grab_preprocessing_all_seq.py
DexYCB Dataset
# For VAE training
python grab/dexycb_preprocessing_all_seq.py
# For Diffusion model training (with --seq flag)
python grab/dexycb_preprocessing_all_seq.py --seq
VAE Training
# GRAB
python -m tools.train_vae --num-gpus 1 --resume --config config/VAE/VAE_grab.yaml
# DexYCB
python -m tools.train_vae --num-gpus 1 --resume --config config/VAE/VAE_dexycb.yaml
Or you can use my pretrained GraspVAE checkpoint here: https://drive.google.com/drive/folders/13dvExxUbENk9DF4XhNBO1NKAC7tx0Em8?usp=sharing
Diffusion Model Training
In the configs, set DIFFUSION.VAE_CHECKPOINT to the path of the VAE checkpoint trained above.
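If you prefer to script that change, a minimal sketch with PyYAML is shown below; it assumes the config is plain YAML with a nested DIFFUSION block, and the checkpoint path is a placeholder:
# Point DIFFUSION.VAE_CHECKPOINT at a trained VAE checkpoint (sketch; config structure assumed)
import yaml

cfg_path = "config/grab/LDM_pretrain_vae.yaml"
vae_ckpt = "output/vae_grab/model_final.pth"  # hypothetical path to your trained VAE checkpoint

with open(cfg_path) as f:
    cfg = yaml.safe_load(f)
cfg.setdefault("DIFFUSION", {})["VAE_CHECKPOINT"] = vae_ckpt
with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)
Note that rewriting the file this way drops YAML comments; editing the config by hand works just as well.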
# GRAB
python -m tools.train_diff --num-gpus 2 --mode ldm --resume --config config/grab/LDM_pretrain_vae.yaml
# DexYCB
python -m tools.train_diff --num-gpus 2 --mode ldm --resume --config config/dexycb/LDM_pretrain_vae.yaml
# Generate for Oakink split
python -m tools.train_diff --mode ldm --eval-only --config config/oakink/ldm_oakink.yaml TEST.BATCH_SIZE 9
During training, intermediate results for evaluation are stored at the frequency defined by TEST.EVAL_PERIOD. <path_to_vis_folder> should contain the evaluated results as .pth files; the commands below will visualize/evaluate all .pth files in that folder.
# Basic evaluation
python -m tools.eval_motion -f <path_to_vis_folder>
# With visualization (generates videos)
python -m tools.eval_motion -f <path_to_vis_folder> --vis
# With physics evaluation
python -m tools.eval_motion -f <path_to_vis_folder> --eval
# For DexYCB dataset
python -m tools.eval_motion -f <path_to_vis_folder> --dex
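The stored results are regular PyTorch files, so they can also be inspected directly; a minimal sketch (the exact contents of the .pth files are not documented here, so only the top-level keys are printed):
# Inspect the evaluation dumps in <path_to_vis_folder>
from pathlib import Path
import torch

vis_folder = Path("path/to/vis_folder")  # replace with your <path_to_vis_folder>
for pth_file in sorted(vis_folder.glob("*.pth")):
    result = torch.load(pth_file, map_location="cpu")
    keys = list(result.keys()) if isinstance(result, dict) else type(result)
    print(pth_file.name, keys)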
Visualization is integrated into the evaluation process. Use the --vis flag with the evaluation command to generate videos of the hand motions:
python -m tools.eval_motion -f <path_to_vis_folder> --vis
For headless/offscreen rendering, modify the AITViewer backend to use EGL in aitviewer/viewer.py line 129:
self.window = base_window_cls(
title=title,
size=size,
fullscreen=C.fullscreen,
resizable=C.resizable,
gl_version=self.gl_version,
aspect_ratio=None,
vsync=C.vsync,
samples=self.samples,
cursor=True,
backend="egl"
)
The ffmpeg bundled with the conda environment might not recognize the presets used in the rendering commands. Solution options:
- Download and replace the conda environment ffmpeg as described in StyleSDF issue #20
- Use the system's global ffmpeg installation
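Before swapping binaries, it can help to confirm which ffmpeg is actually being resolved; a minimal sketch using only the standard library:
# Check which ffmpeg binary is on PATH and print its version string
import shutil
import subprocess

ffmpeg_path = shutil.which("ffmpeg")
print("ffmpeg resolved to:", ffmpeg_path)
if ffmpeg_path:
    out = subprocess.run([ffmpeg_path, "-version"], capture_output=True, text=True)
    print(out.stdout.splitlines()[0])  # first line identifies the build in use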
As reported in AITViewer issue #53, run rendering commands in sub-processes.
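A minimal sketch of that workaround, launching each rendering/evaluation run in its own sub-process (the command mirrors the eval_motion calls above; the folder list is illustrative):
# Run each visualization pass in a separate sub-process
import subprocess
import sys

vis_folders = ["output/eval/epoch_100", "output/eval/epoch_200"]  # hypothetical result folders
for folder in vis_folders:
    cmd = [sys.executable, "-m", "tools.eval_motion", "-f", folder, "--vis"]
    subprocess.run(cmd, check=True)  # each run gets a fresh process and a fresh rendering context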
A shared-tensor error can occur when tensors stored on the dataset object are shared with dataloader workers. Solution:
mean_latent, std_latent = copy.deepcopy(torch.chunk(mean_latent, 2, dim=-1))
dataset.mean_latent, dataset.std_latent = mean_latent.numpy(), std_latent.numpy()
Adding .numpy() converts the tensors to NumPy arrays, which resolves the shared-tensor issue.
@InProceedings{Muchen_LatentHOI,
author = {Li, Muchen and Christen, Sammy and Wan, Chengde and Cai, Yujun and Liao, Renjie and Sigal, Leonid and Ma, Shugao},
title = {LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion.},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
month = {June},
year = {2025},
pages = {17416-17425}
}