HumanLift: Single-Image 3D Human Reconstruction with 3D-Aware Diffusion Priors and Facial Enhancement
Jie Yang1, Bo-Tao Zhang1,2, Feng-Lin Liu1,2, Hongbo Fu3, Yu-Kun Lai4, Lin Gao1,2
1. Institute of Computing Technology, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
3. Hong Kong University of Science and Technology 4. Cardiff University
SIGGRAPH ASIA 2025
HumanLift lifts a single reference image to an animatable 3D human, enabling view-consistent, photorealistic full-body image synthesis with high-quality facial details.
Install the basic dependencies for multi-view generation (based on DiffSynth):
```bash
pip install torch torchvision numpy==1.23 Pillow huggingface_hub
```

Then install the following to obtain SMPL condition images:
```bash
# Install PyTorch3D
pip install "git+https://github.com/facebookresearch/pytorch3d.git"

# Install mmcv-full
pip install "mmcv-full>=1.3.17,<1.6.0" -f https://download.openmmlab.com/mmcv/dist/cu117/torch2.0.1/index.html

# Install mmhuman3d
pip install "git+https://github.com/open-mmlab/mmhuman3d.git"
```

Install the 3D Gaussian Splatting package:
```bash
pip install gsplat
```

To obtain an animatable 3D human, set up the LHM environment and download the pretrained models.
Estimate SMPL-X parameters and render multi-view images from an input image:
```bash
python pose_estimation/video2motion.py \
    --input_path ./images/2.jpg \
    --output_path ./motion \
    --visualize
```

Generate multi-view RGB images using the input image and semantic maps.
Download the required model checkpoints from Google Drive and place them in the `ckpt` directory (refer to `inference_wan_rgb.py` for the expected path structure).
- Copy `./images/` to `data/data/` and rename it to `test`
- Copy `./motion/` to `data/output/` and rename it to `test` (both copy steps are sketched after this list)
- Update the model paths in `inference_wan_rgb.py` (the downloaded Wan2.1-14B and the fine-tuned weights from our Google Drive)
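The two copy/rename steps can be scripted; a minimal sketch, assuming the `./images` and `./motion` folders produced earlier and the `data/` layout named in the list above:

```bash
# Stage the reference images and the estimated SMPL-X motion
# under the layout expected by inference_wan_rgb.py.
mkdir -p data/data data/output
cp -r ./images data/data/test    # reference image(s)
cp -r ./motion data/output/test  # SMPL-X motion from pose estimation
```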
```bash
python inference_wan_rgb.py
```

- Remove backgrounds from the generated RGB images and save them as transparent RGBA
- Pad the images to 832×832 resolution
- Copy the processed images to `3-gs_recon/data/test/images/` and rename them sequentially as `lgt0_r_0000.png` to `lgt0_r_0080.png` (see the sketch after this list)
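One way to script this post-processing, assuming the `rembg` CLI and ImageMagick 7 (`magick`) are installed; the `generated_rgb/` input folder is a hypothetical name:

```bash
mkdir -p 3-gs_recon/data/test/images
i=0
for f in generated_rgb/*.png; do
    # Remove the background (writes an RGBA PNG with a transparent background).
    rembg i "$f" /tmp/rgba.png
    # Pad to 832x832 with transparent borders and rename sequentially.
    magick /tmp/rgba.png -background none -gravity center -extent 832x832 \
        "3-gs_recon/data/test/images/$(printf 'lgt0_r_%04d.png' "$i")"
    i=$((i+1))
done
```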
```bash
python train.py -s data/test -m output/test
```

Edit `train.sh` to set:
- the dataset path (`dataset_path`)
- the Wan2.1-14B model path (see the sketch below)
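For reference, a hypothetical sketch of the relevant lines in `train.sh`; only `dataset_path` is named in this README, so the second variable name and both paths are assumptions:

```bash
# train.sh (assumed layout; adjust to the released script)
dataset_path=data/test            # dataset staged in the reconstruction step
wan_model_path=./ckpt/Wan2.1-14B  # hypothetical variable for the Wan2.1-14B path
```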
```bash
bash train.sh
```
⚠️ This section provides an alternative animation method that may yield lower quality compared to the main reconstruction pipeline.
- Use WeShopAI Fashion Model Pose Change to generate a T‑pose image (image A) with the prompt:
"a full-body portrait of a person standing with arms and legs spread apart".
- Set `IMAGE_INPUT` in `predict.sh` and run: `bash predict.sh`
- The SMPL-rendered images will be saved in `tmp/test/smplimagesrgb`.
- Use Photoshop to align the T‑pose image (image A) with the first SMPL rendering (`000000.png`), producing an aligned reference image (image B).
- Why Photoshop? Current SMPL estimation models are not designed for orthographic camera alignment.
- Place the SMPL renderings and image B into HumanWan-Dit and modify the hyperparameters in `inference_wan_rgb.py`.
- Run: `python inference_wan_rgb.py`
- Remove the backgrounds from all 81 multi-view images.
- Save them in RGBA format with transparency (a batch sketch follows).
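One way to batch both steps, assuming the `rembg` command-line tool is installed; the folder names are hypothetical:

```bash
# rembg's folder mode writes RGBA PNGs with transparent backgrounds
# for every image in the input directory.
rembg p multiview_rgb/ multiview_rgba/
```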
Set the following paths in `inference.sh` (sketched below):
- `IMAGE_INPUT`: path to image A (T‑pose)
- `MOTION_SEQS_DIR`: SMPL motion folder
- `DATASET_DIR`: RGBA multi-view images folder
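For example, the corresponding lines in `inference.sh` might look like this; the variable names follow the list above, while the paths are hypothetical placeholders:

```bash
# inference.sh (assumed values)
IMAGE_INPUT=./images/tpose_A.png  # image A (T-pose)
MOTION_SEQS_DIR=./motion          # SMPL motion folder
DATASET_DIR=./multiview_rgba      # RGBA multi-view images folder
```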
Then run:
```bash
bash inference.sh
```

We thank the following open-source projects:
DiffSynth, LHM,
WeShopAI Fashion Model Pose Change,
gsplat, and many other inspiring works.
If you use this work, please cite our SIGGRAPH Asia 2025 paper. For questions or problems, please open an issue on the repository.
```bibtex
@inproceedings{humanlift2025,
  author = {Yang, Jie and Zhang, Bo-Tao and Liu, Feng-Lin and Fu, Hongbo and Lai, Yu-Kun and Gao, Lin},
  title = {HumanLift: Single-Image 3D Human Reconstruction with 3D-Aware Diffusion Priors and Facial Enhancement},
  year = {2025},
  url = {https://doi.org/10.1145/3757377.3763839},
  doi = {10.1145/3757377.3763839},
  booktitle = {SIGGRAPH Asia 2025 Conference Papers (SA Conference Papers '25)},
  articleno = {31},
  numpages = {12},
  series = {SA Conference Papers '25}
}
```