Code repository for the paper:
PICO: Reconstructing 3D People In Contact with Objects
Alpár Cseke*, Shashank Tripathi*, Sai Kumar Dwivedi, Arjun Lakshmipathy, Agniv Chatterjee, Michael J. Black, Dimitrios Tzionas
Conference on Computer Vision and Pattern Recognition (CVPR), 2025
* equal contribution
[Project Page] [Paper] [Video] [Poster] [License] [Contact]
- [2025/06/11] PICO-fit* optimization script is released!
- [2025/09/10] Added auxiliary files for PICO-fit* optimization as reference
- [2025/09/21] Added back collision loss module, with installation help
- [2025/09/22] Closest match lookup script in PICO-db for new input images
- [2025/09/23] Example script on how to load PICO-db contact mappings
- First, clone the repo. Then, we recommend creating a clean conda environment, as follows:
  ```bash
  git clone https://github.com/alparius/pico.git
  cd pico
  conda create -n pico python=3.10 -y
  conda activate pico
  ```
- Install packages:
  ```bash
  pip install -r requirements.txt
  ```
- Install PyTorch:
  ```bash
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  ```
  Please adjust the CUDA version as required.
- Install PyTorch3D. Users may also refer to PyTorch3D-install for more details. However, our tests show that installing using `conda` sometimes runs into dependency conflicts. Hence, users may alternatively install PyTorch3D from source:
  ```bash
  pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
  ```
- Install the SDF-based collision loss library:
  - based on https://github.com/JiangWenPL/multiperson/tree/master/sdf
  - go to `src/utils/sdf` and run `python setup.py install`
- Download some required files:
  - run `sh fetch_static.sh` (see the script for details)
  - download the SMPL-X model files from here. Put `SMPLX_NEUTRAL.npz` under `static/human_model_files/smplx/`
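After these steps, a quick import check can confirm the environment is usable. This is only a minimal sketch; the `sdf` module name follows the upstream multiperson package and may differ in your local build.

```python
# Quick environment sanity check (minimal sketch; adjust module names if your setup differs).
import torch
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

import pytorch3d
print("pytorch3d:", pytorch3d.__version__)

# The SDF-based collision loss extension is built from src/utils/sdf;
# if this import fails, re-run `python setup.py install` in that folder.
import sdf  # noqa: F401
print("sdf extension imported")
```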
Register an account on the PICO website to access the dataset download page. The dataset consists of the selected object mesh for each image and a contact map between the SMPL-X human mesh and that object mesh.
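To get a quick feel for a downloaded sample, the object mesh and contact file can be inspected directly. This is only a minimal sketch: the exact JSON schema is not documented here, so the snippet merely prints its structure, and the sample path reuses the demo folder from the command below.

```python
# Minimal sketch for inspecting one dataset sample; the contact JSON schema is not
# documented here, so we only print its top-level structure.
import json
import trimesh

sample_dir = "demo_input/skateboard__vcoco_000000012938"  # example sample folder

# Object mesh selected for this image
obj_mesh = trimesh.load(f"{sample_dir}/object.obj", process=False)
print("object mesh:", len(obj_mesh.vertices), "vertices,", len(obj_mesh.faces), "faces")

# Contact mapping between the SMPL-X human mesh and the object mesh
with open(f"{sample_dir}/corresponding_contacts.json") as f:
    contacts = json.load(f)
print("contact mapping type:", type(contacts).__name__)
if isinstance(contacts, dict):
    print("top-level keys:", list(contacts.keys()))
```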
```bash
python demo.py <folder_path_with_inputs> <folder_path_for_outputs>
```
e.g.:
```bash
python demo.py demo_input/skateboard__vcoco_000000012938 demo_output/skateboard__vcoco_000000012938
```
The input folder has to include the following files:
- the input image that has the same filename as the folder itself (plus an image extension)
- `osx_human.npz`: human pose and shape data
- `human_detection.npz`, `object_detection.npz`: mask and bbox for the two subjects
- `object.obj`: trimesh file of the object the human interacts with
- `corresponding_contacts.json`: contact mapping data
- the latter two files make up the dataset itself that you can download from the above link
- there we also include the other 3 files for most of the samples in another archive, but feel free to bring your own inference results.
- please refer to the `notebooks/contact_lookup_on_dataset.ipynb` notebook for an example of closest-match lookup in PICO-db. Given the human contact data, it finds the closest contact sample (and corresponding object mesh) in the database, which can then be used to reconstruct the interaction from the new image. See the second cell of the notebook for more details.
- the other 3 `.npz` files you will have to provide yourself with the off-the-shelf methods of your choice; a quick sanity check of a custom input folder is sketched below.
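When bringing your own inference results, a check like the following sketch can confirm the input folder is complete before running the demo. The key names inside the `.npz` files are not documented here, so the snippet only lists whatever it finds.

```python
# Minimal sketch to verify a custom input folder before running demo.py;
# .npz key names are not documented here, so we only list the contents.
import os
import numpy as np

input_dir = "demo_input/skateboard__vcoco_000000012938"  # replace with your folder

# the input image shares the folder's name (plus an image extension)
folder_name = os.path.basename(os.path.normpath(input_dir))
images = [f for f in os.listdir(input_dir)
          if f.startswith(folder_name) and f.lower().endswith((".jpg", ".jpeg", ".png"))]
print("input image:", images[0] if images else "MISSING")

expected = ["osx_human.npz", "human_detection.npz", "object_detection.npz",
            "object.obj", "corresponding_contacts.json"]
for name in expected:
    present = os.path.exists(os.path.join(input_dir, name))
    print(f"{name}: {'found' if present else 'MISSING'}")

# list the (undocumented) keys stored in each provided .npz file
for name in ["osx_human.npz", "human_detection.npz", "object_detection.npz"]:
    path = os.path.join(input_dir, name)
    if os.path.exists(path):
        with np.load(path, allow_pickle=True) as data:
            print(name, "keys:", data.files)
```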
Please refer to the following repository for efficient object lookup and retrieval from a single image.
The same object retrieval strategy was used in both PICO and InteractVLM.
If you find this code useful for your research, please consider citing the following paper:
```bibtex
@inproceedings{cseke_tripathi_2025_pico,
    title = {{PICO}: Reconstructing {3D} People In Contact with Objects},
    author = {Cseke, Alp\'{a}r and Tripathi, Shashank and Dwivedi, Sai Kumar and Lakshmipathy, Arjun and Chatterjee, Agniv and Black, Michael J. and Tzionas, Dimitrios},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2025},
}
```

See LICENSE.
We thank Felix Grüninger for advice on mesh preprocessing, Jean-Claude Passy and Valkyrie Felso for advice on the data collection, and Xianghui Xie for advice on HDM evaluation. We also thank Tsvetelina Alexiadis, Taylor Obersat, Claudia Gallatz, Asuka Bertler, Arina Kuznetcova, Suraj Bhor, Tithi Rakshit, Tomasz Niewiadomski, Valerian Fourel and Florentin Doll for their immense help in the data collection and verification process, Benjamin Pellkofer for IT support, and Nikos Athanasiou for the helpful discussions. This work was funded in part by the International Max Planck Research School for Intelligent Systems (IMPRS-IS). D. Tzionas is supported by the ERC Starting Grant (project STRIPES, 101165317).
Dimitris Tzionas has received a research gift fund from Google. While Michael J. Black is a co-founder and Chief Scientist at Meshcapade, his research in this project was performed solely at, and funded solely by, the Max Planck Society.
For technical questions, please create an issue. For other questions, please contact [email protected].
For commercial licensing, please contact [email protected].
