This repo contains the model, demo, and test code of our paper:
CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction
Lixin Yang, Xinyu Zhan, Kailin Li, Wenqiang Xu, Jiefeng Li, Cewu Lu
ICCV 2021
$ git clone --recursive https://github.com/lixiny/CPF.git
$ cd CPF
$ conda env create -f environment.yaml
$ conda activate cpf
For researchers in China, you can alternatively download our preprocessed files from this mirror: Baidu Pan (code: 2tqv)
Download our [assets.zip] and unzip it as an assets/ folder.
Download the MANO model files from the official MANO website and put them into assets/mano/.
We currently only use MANO_RIGHT.pkl.
Now your assets/ folder should look like this:
assets/
├── anchor/
│   ├── anchor_mapping_path.pkl
│   ├── anchor_weight.txt
│   ├── face_vertex_idx.txt
│   └── merged_vertex_assignment.txt
├── closed_hand/
│   └── hand_mesh_close.obj
├── fhbhands_fits/
│   ├── Subject_1/
│   │   └── ...
│   └── Subject_2/
│       └── ...
├── hand_palm_full.txt
└── mano/
    ├── fhb_skel_centeridx9.pkl
    ├── info.txt
    ├── LICENSE.txt
    └── MANO_RIGHT.pkl
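As an optional sanity check, you can confirm the MANO pickle is readable. This is only an illustrative snippet, not part of the repo; the official MANO files are Python 2 pickles containing chumpy arrays, so loading them needs encoding="latin1" and the chumpy package (already a dependency of manopth/manotorch):

# Optional sanity check: confirm assets/mano/MANO_RIGHT.pkl loads.
# MANO pickles were written under Python 2, hence encoding="latin1";
# unpickling also requires the chumpy package to be installed.
import pickle

with open("assets/mano/MANO_RIGHT.pkl", "rb") as f:
    mano_data = pickle.load(f, encoding="latin1")

print(sorted(mano_data.keys()))  # expect keys such as 'v_template', 'f', 'shapedirs'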
Download and unzip the First-Person Hand Action Benchmark dataset into the data/fhbhands/ folder, following the official instructions.
If everything is correct, your data/fhbhands/ should look like this:
data/fhbhands/
├── action_object_info.txt
├── action_sequences_normalized/
├── change_log.txt
├── data_split_action_recognition.txt
├── file_system.jpg
├── Hand_pose_annotation_v1/
├── Object_6D_pose_annotation_v1_1/
├── Object_models/
├── Subjects_info/
├── Video_files/
└── Video_files_480/   # optional (see the resizing step below)
Optionally, resize the images (this speeds up training!) using the reduce_fphab.py script from handobjectconsist:
$ python reduce_fphab.py
Download our [fhbhands_supp.zip] and unzip it as data/fhbhands_supp/.
Download our [fhbhands_example.zip] and unzip it as data/fhbhands_example.
This fhbhands_example/ contains 10 samples designed to demonstrate our pipeline.
Currently, your data/ folder should look like this:
data/
βββ fhbhands/
βββ fhbhands_supp/
βΒ Β βββ Object_models/
βΒ Β βββ Object_models_binvox/
βββ fhbhands_example/
βΒ Β βββ annotations/
βΒ Β βββ images/
βΒ Β βββ object_models/
βΒ Β βββ sample_list.txt
Download and unzip the HO3D dataset into the data/HO3D/ folder, following the official instructions.
If everything is correct, the HO3D and YCB folders in your data/ folder should look like this:
data/
├── HO3D/
│   ├── evaluation/
│   ├── evaluation.txt
│   ├── train/
│   └── train.txt
├── YCB_models/
│   ├── 002_master_chef_can/
│   └── ...
└── ...
Download our [YCB_models_supp.zip] and unzip it as data/YCB_models_supp/.
Finally, the data/ folder should have a structure like:
data/
├── fhbhands/
├── fhbhands_supp/
├── fhbhands_example/
├── HO3D/
├── YCB_models/
└── YCB_models_supp/
Download our pre-trained [CPF_checkpoints.zip] and unzip it as a CPF_checkpoints/ folder:
CPF_checkpoints/
├── honet/
│   ├── fhb/
│   ├── ho3dofficial/
│   └── ho3dv1/
└── picr/
    ├── fhb/
    ├── ho3dofficial/
    └── ho3dv1/
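To verify a download, you can peek inside a checkpoint with standard torch serialization. This is illustrative only; which keys are stored is defined by the training scripts:

# Illustrative: inspect a downloaded checkpoint (a standard torch-serialized file).
import torch

ckpt = torch.load("CPF_checkpoints/picr/fhb/checkpoint_200.pth.tar", map_location="cpu")
# The stored keys depend on the training script; print them to see what is inside.
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))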
Replace ${GPU_ID} with a comma-separated list of integers indicating the GPU ids to use, e.g. --gpu 0, --gpu 0,1, or --gpu 0,1,2,3.
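The per-script comments below also mention CUDA_VISIBLE_DEVICES. Purely as an illustration of how a comma-separated --gpu flag commonly maps to visible devices (this is not the repo's actual argument handling):

# Illustration only: map "--gpu 0,1" onto visible CUDA devices.
# The repo's own scripts handle device selection internally.
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--gpu", type=str, default="0", help="e.g. '0' or '0,1,2,3'")
args = parser.parse_args()

# Restrict this process to the requested GPUs (equivalent to prefixing the
# command line with CUDA_VISIBLE_DEVICES=0,1).
os.environ["CUDA_VISIBLE_DEVICES"] = args.gpu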
We create an FHBExample dataset in hocontact/hodatasets/fhb_example.py that contains only 10 samples to demonstrate our pipeline.
Notice: this demo requires an active display for visualization. Press q in the "runtime hand" window to start fitting.
# recommend 1 GPU
$ python scripts/run_demo.py \
--gpu ${GPU_ID} \
--init_ckpt CPF_checkpoints/picr/fhb/checkpoint_200.pth.tar \
    --honet_mano_fhb_hand
We provide shell scripts to test on the full dataset and approximately reproduce our results.
Dump the results of HoNet and PiCR:
# recommend 2 GPUs
$ python scripts/dump_picr_res.py \
--gpu ${GPU_ID} \
--dist_master_addr localhost \
--dist_master_port 12355 \
--exp_keyword fhb \
--train_datasets fhb \
--train_splits train \
--val_dataset fhb \
--val_split test \
--split_mode actions \
--batch_size 8 \
--dump_eval \
--dump \
--vertex_contact_thresh 0.8 \
--filter_thresh 5.0 \
--dump_prefix common/picr \
    --init_ckpt CPF_checkpoints/picr/fhb/checkpoint_200.pth.tar
Then reload the dumped results and run the GeO optimizer:
# setting 1: hand-only
# CUDA_VISIBLE_DEVICES=0,1,2,3
# recommend 4 GPUs
$ python scripts/eval_geo.py \
--gpu ${GPU_ID} \
--n_workers 16 \
--data_path common/picr/fhbhands/test_actions_mf1.0_rf0.25_fct5.0_ec \
--mode hand
# setting 2: hand-obj
$ python scripts/eval_geo.py \
--gpu ${GPU_ID} \
--n_workers 16 \
--data_path common/picr/fhbhands/test_actions_mf1.0_rf0.25_fct5.0_ec \
--mode hand_obj \
    --compensate_tsl
Dump the HO3Dv1 results:
# recommend 2 GPUs
$ python scripts/dump_picr_res.py \
--gpu ${GPU_ID} \
--dist_master_addr localhost \
--dist_master_port 12356 \
--exp_keyword ho3dv1 \
--train_datasets ho3d \
--train_splits train \
--val_dataset ho3d \
--val_split test \
--split_mode objects \
--batch_size 4 \
--dump_eval \
--dump \
--vertex_contact_thresh 0.8 \
--filter_thresh 5.0 \
--dump_prefix common/picr_ho3dv1 \
    --init_ckpt CPF_checkpoints/picr/ho3dv1/checkpoint_300.pth.tar
Then reload the dumped results and run the optimizer:
# hand-only
# CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# recommend 8 GPUs
$ python scripts/eval_geo.py \
    --gpu ${GPU_ID} \
--n_workers 24 \
--data_path common/picr_ho3dv1/HO3D/test_objects_mf1_likev1_fct5.0_ec/ \
--lr 1e-2 \
--n_iter 500 \
--hodata_no_use_cache \
--lambda_contact_loss 10.0 \
--lambda_repulsion_loss 4.0 \
--repulsion_query 0.030 \
--repulsion_threshold 0.080 \
--mode hand
# hand-obj
# recommend 8 GPUs
$ python scripts/eval_geo.py \
--gpu ${GPU_ID} \
--n_workers 24 \
--data_path common/picr_ho3dv1/HO3D/test_objects_mf1_likev1_fct5.0_ec/ \
--lr 1e-2 \
--n_iter 500 \
--hodata_no_use_cache \
--lambda_contact_loss 10.0 \
--lambda_repulsion_loss 6.0 \
--repulsion_query 0.030 \
--repulsion_threshold 0.080 \
--mode hand_obj
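For reference, the two lambda flags above set the relative weights of the energy terms minimized during GeO fitting. Schematically (this is only a summary of the flag semantics; see the paper for the exact formulation):

E_fit = lambda_contact_loss * E_contact + lambda_repulsion_loss * E_repulsion

where E_contact attracts the hand toward the predicted contact regions and E_repulsion penalizes hand-object interpenetration (per the flag names, --repulsion_query and --repulsion_threshold control its query radius and cutoff).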
Dump the HO3Dofficial results:
# recommend 2 GPUs
$ python scripts/dump_picr_res.py \
--gpu ${GPU_ID} \
--dist_master_addr localhost \
--dist_master_port 12356 \
--exp_keyword ho3dofficial \
--train_datasets ho3d \
--train_splits val \
--val_dataset ho3d \
--val_split test \
--split_mode official \
--batch_size 4 \
--dump_eval \
--dump \
--test_dump \
--vertex_contact_thresh 0.8 \
--filter_thresh 5.0 \
--dump_prefix common/picr_ho3dofficial \
    --init_ckpt CPF_checkpoints/picr/ho3dofficial/checkpoint_300.pth.tar
Then reload the dumped results and run the optimizer:
# CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# recommend 8 GPUs
$ python scripts/eval_geo.py \
--gpu ${GPU_ID} \
--n_workers 24 \
--data_path common/picr_ho3dofficial/HO3D/test_official_mf1_likev1_fct\(x\)_ec/ \
--lr 1e-2 \
--n_iter 500 \
--hodata_no_use_cache \
--lambda_contact_loss 10.0 \
--lambda_repulsion_loss 2.0 \
--repulsion_query 0.030 \
--repulsion_threshold 0.080 \
    --mode hand_obj
Testing on the full dataset may take a while (0.5 ~ 1.5 days), so we also provide our test results in fitting_res.txt.
We provide a PyTorch implementation of our Anatomically Constrained MANO in lixiny/manotorch, which is modified from the original hassony2/manopth. Thanks to Yana Hasson for providing the code.
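As a usage sketch, here is a MANO forward pass in the manopth style; manotorch exposes a similar layer-based interface, but check its README for the exact constructor arguments:

# Minimal MANO forward pass using manopth (which CPF's manotorch derives from).
import torch
from manopth.manolayer import ManoLayer

# mano_root points at the folder holding MANO_RIGHT.pkl (assets/mano in this repo).
mano_layer = ManoLayer(mano_root="assets/mano", use_pca=False, flat_hand_mean=True)

batch_size = 1
pose = torch.zeros(batch_size, 48)   # 3 global rotation + 45 joint angles (axis-angle)
shape = torch.zeros(batch_size, 10)  # MANO shape (beta) coefficients

# Returns 778 hand vertices and 21 joints per sample (manopth outputs millimeters).
hand_verts, hand_joints = mano_layer(pose, shape)
print(hand_verts.shape, hand_joints.shape)  # (1, 778, 3), (1, 21, 3)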
If you find this work helpful, please consider citing us:
@inproceedings{yang2021cpf,
title={{CPF}: Learning a Contact Potential Field to Model the Hand-Object Interaction},
author={Yang, Lixin and Zhan, Xinyu and Li, Kailin and Xu, Wenqiang and Li, Jiefeng and Lu, Cewu},
booktitle={ICCV},
year={2021}
}

