
HumanLift: Single-Image 3D Human Reconstruction with 3D-Aware Diffusion Priors and Facial Enhancement

Project Page

Jie Yang1, Bo-Tao Zhang1,2, Feng-Lin Liu1,2, Hongbo Fu3, Yu-Kun Lai4, Lin Gao1,2

1. Institute of Computing Technology, Chinese Academy of Sciences  2. University of Chinese Academy of Sciences
3. Hong Kong University of Science and Technology  4. Cardiff University

SIGGRAPH ASIA 2025


Overview

teaser

HumanLift lifts a single reference image to an animatable 3D human, enabling view-consistent and photorealistic full-body image synthesis with high-quality facial details.


Quick Start

Multi-view Generation

Install the basic dependencies for multi-view generation (based on DiffSynth):

pip install torch torchvision numpy==1.23 Pillow huggingface_hub

Then install the following to obtain SMPL condition images:

# Install PyTorch3D
pip install "git+https://github.com/facebookresearch/pytorch3d.git"

# Install mmcv-full
pip install "mmcv-full>=1.3.17,<1.6.0" -f https://download.openmmlab.com/mmcv/dist/cu117/torch2.0.1/index.html

# Install mmhuman3d
pip install "git+https://github.com/open-mmlab/mmhuman3d.git"

Reconstruction

Install the 3D Gaussian Splatting package:

pip install gsplat

Animation (Optional)

To obtain an animatable 3D human, set up the LHM environment and download the pretrained models.


Inference for Reconstruction

1. SMPL-X Parameter Estimation

Estimate SMPL-X parameters and render multi-view images from an input image:

python pose_estimation/video2motion.py \
    --input_path ./images/2.jpg \
    --output_path ./motion \
    --visualize

2. Multi-view RGB Image Generation

Generate multi-view RGB images using the input image and semantic maps.

Download Checkpoints:

Download the required model checkpoints from Google Drive and place them in the ckpt directory (refer to inference_wan_rgb.py for the expected path structure).
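
If you prefer the command line, a tool like gdown can fetch Google Drive folders; this is a sketch only, and the folder URL is a placeholder for the actual link:

# Optional: fetch the checkpoints with gdown (placeholder URL -- substitute the real Drive link)
pip install gdown
gdown --folder <GOOGLE_DRIVE_FOLDER_URL> -O ckpt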

Setup:

  • Copy ./images/ to data/data/ and rename it to test (see the sketch after this list)
  • Copy ./motion/ to data/output/ and rename it to test
  • Update the model paths in inference_wan_rgb.py to point to the downloaded Wan2.1-14B weights and our fine-tuned weights from Google Drive
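
A minimal sketch of the first two steps, assuming the commands are run from the repository root and that the target directories do not exist yet:

# Sketch only -- lay out the data directories described above
mkdir -p data/data data/output
cp -r ./images data/data/test     # input images -> data/data/test
cp -r ./motion data/output/test   # SMPL-X motion from step 1 -> data/output/test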

Run:

python inference_wan_rgb.py

3. Human Gaussian Reconstruction

Pre-processing:

  • Remove backgrounds from the generated RGB images and save them as transparent RGBA
  • Pad the images to 832×832 resolution
  • Copy the processed images to 3-gs_recon/data/test/images/ and rename them sequentially as lgt0_r_0000.png to lgt0_r_0080.png (a sketch of these steps follows this list)
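
One possible way to script these steps; rembg and ImageMagick are assumptions here (neither ships with this repo), and generated_rgb/ is a hypothetical name for the folder holding the 81 generated views:

SRC=generated_rgb                     # assumed location of the generated views
DST=3-gs_recon/data/test/images
mkdir -p "$DST"
rembg p "$SRC" "$DST"                 # batch background removal -> transparent RGBA
# fit each image inside 832x832, then pad with transparency (no stretching)
mogrify -background none -gravity center -resize 832x832 -extent 832x832 "$DST"/*.png
i=0                                   # rename sequentially: lgt0_r_0000.png ... lgt0_r_0080.png
for f in "$DST"/*.png; do
  mv "$f" "$DST/$(printf 'lgt0_r_%04d.png' "$i")"; i=$((i+1))
done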

Run reconstruction:

python train.py -s data/test -m output/test

Training for Reconstruction

1. Configure Environment

Edit train.sh to set:

  • Dataset path (dataset_path)
  • Wan2.1-14B model path (see the example below)
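
A hypothetical excerpt; the paths are placeholders, and any variable name other than dataset_path is an assumption to be matched against what train.sh actually defines:

# In train.sh -- placeholder paths; the second variable name is assumed
dataset_path=/path/to/your/training/dataset
wan_model_path=/path/to/Wan2.1-14B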

2. Start Training

bash train.sh

Animation (Alternative Workflow)

⚠️ This section provides an alternative animation method that may yield lower quality than the main reconstruction pipeline.

0. Pose Change (optional)

  • Use WeShopAI Fashion Model Pose Change to generate a T‑pose image (image A) with the prompt:
    "a full-body portrait of a person standing with arms and legs spread apart".

1. Prepare SMPL Renderings

  • Set IMAGE_INPUT in predict.sh (example after this list) and run:
    bash predict.sh
  • SMPL-rendered images will be saved in tmp/test/smplimagesrgb.
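
For reference, the assignment might look like this (the path is a placeholder):

# In predict.sh
IMAGE_INPUT=/path/to/input_image.jpg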

2. Align Reference Image

  • Use Photoshop to align the T‑pose image (image A) with the first SMPL rendering (000000.png), producing an aligned reference image (image B).
  • Why Photoshop? Current SMPL estimation models are not designed for orthographic camera alignment.

3. Generate Multi-view Images

  • Place SMPL renderings and image B into HumanWan-Dit and modify hyperparameters in inference_wan_rgb.py.
  • Run:
    python inference_wan_rgb.py

4. Background Removal

  • Remove backgrounds from all 81 multi-view images.
  • Save them in RGBA format with transparency (the pre-processing sketch in the reconstruction section applies here as well).

5. Run Inference

Set the following paths in inference.sh (see the example after this list):

  • IMAGE_INPUT: path to image A (T‑pose)
  • MOTION_SEQS_DIR: SMPL motion folder
  • DATASET_DIR: RGBA multi-view images folder
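
A hypothetical example of these assignments, with placeholder paths:

# In inference.sh -- all paths are placeholders
IMAGE_INPUT=/path/to/image_A.png         # T-pose reference (image A)
MOTION_SEQS_DIR=/path/to/smpl_motion     # SMPL motion folder
DATASET_DIR=/path/to/rgba_multiview      # RGBA multi-view images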

Then run:

bash inference.sh

Acknowledgements

We thank the following open-source projects:
DiffSynth, LHM,
WeShopAI Fashion Model Pose Change,
gsplat, and many other inspiring works.


License

MIT License


If you use this work, please cite our SIGGRAPH ASIA 2025 paper. For questions or issues, open an issue on the repository.

@inproceedings{humanlift2025,
  author    = {Yang, Jie and Zhang, Bo-Tao and Liu, Feng-Lin and Fu, Hongbo and Lai, Yu-Kun and Gao, Lin},
  title     = {HumanLift: Single-Image 3D Human Reconstruction with 3D-Aware Diffusion Priors and Facial Enhancement},
  year      = {2025},
  url       = {https://doi.org/10.1145/3757377.3763839},
  doi       = {10.1145/3757377.3763839},
  booktitle = {SIGGRAPH Asia 2025 Conference Papers (SA Conference Papers '25)},
  articleno = {31},
  numpages  = {12},
  series    = {SIGGRAPH ASIA Conference Papers '25}
}
