Weijie Lyu, Yi Zhou, Ming-Hsuan Yang, Zhixin Shu
University of California, Merced - Adobe Research
FaceLift transforms a single facial image into a high-fidelity 3D Gaussian head representation, and it generalizes remarkably well to real-world human images.
This is a self-reimplementation of FaceLift.
Model checkpoints will be automatically downloaded from HuggingFace on first run.
Alternatively, you can manually place the checkpoints in the `checkpoints/` directory:
- `checkpoints/mvdiffusion/pipeckpts/` - Multi-view diffusion model checkpoints
- `checkpoints/gslrm/ckpt_0000000000021125.pt` - GS-LRM model checkpoint
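If the automatic download is not possible in your environment, you can fetch the files yourself. A minimal sketch using `huggingface_hub`; the repo id below is a placeholder, not the project's actual one:

```python
from pathlib import Path

from huggingface_hub import snapshot_download  # pip install huggingface_hub

# NOTE: "your-org/FaceLift" is a hypothetical repo id; substitute the id the
# project actually publishes its checkpoints under.
snapshot_download(
    repo_id="your-org/FaceLift",
    local_dir="checkpoints",  # materialize files under checkpoints/
)

# Sanity-check the layout described above.
assert Path("checkpoints/mvdiffusion/pipeckpts").is_dir()
assert Path("checkpoints/gslrm/ckpt_0000000000021125.pt").exists()
```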
Set up the environment:

```bash
bash setup_env.sh
```

Process images from a directory:
```bash
python inference.py --input_dir examples/ --output_dir outputs/
```

Available Arguments:
| Argument | Short | Default | Description |
|---|---|---|---|
| `--input_dir` | `-i` | `examples/` | Input directory containing images |
| `--output_dir` | `-o` | `outputs/` | Output directory for results |
| `--auto_crop` | - | `True` | Automatically crop faces |
| `--seed` | - | `4` | Random seed for reproducible results |
| `--guidance_scale_2D` | - | `3.0` | Guidance scale for multi-view generation |
| `--step_2D` | - | `50` | Number of diffusion steps |
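For reference, a minimal `argparse` sketch mirroring the table above; it is illustrative only, and the actual parser in `inference.py` may differ in details such as flag types:

```python
import argparse

# Illustrative parser; defaults are taken from the argument table above.
parser = argparse.ArgumentParser(description="FaceLift single-image 3D head inference")
parser.add_argument("--input_dir", "-i", default="examples/",
                    help="Input directory containing images")
parser.add_argument("--output_dir", "-o", default="outputs/",
                    help="Output directory for results")
parser.add_argument("--auto_crop", action=argparse.BooleanOptionalAction, default=True,
                    help="Automatically crop faces (--no-auto_crop disables)")
parser.add_argument("--seed", type=int, default=4,
                    help="Random seed for reproducible results")
parser.add_argument("--guidance_scale_2D", type=float, default=3.0,
                    help="Guidance scale for multi-view generation")
parser.add_argument("--step_2D", type=int, default=50,
                    help="Number of diffusion steps")
args = parser.parse_args()
```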
Launch the interactive Gradio web interface:
```bash
python gradio_app.py
```

Open your browser and navigate to http://localhost:7860 to use the web interface. If running on a server, use the provided public link.
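If the default localhost binding is not reachable (e.g., on a remote machine), the launch call Gradio apps typically use looks like the sketch below; `run_facelift` is a hypothetical stand-in, and `gradio_app.py` itself may differ:

```python
import gradio as gr

def run_facelift(image):
    # Hypothetical stand-in; gradio_app.py wires the real FaceLift pipeline here.
    return image

demo = gr.Interface(fn=run_facelift, inputs=gr.Image(), outputs=gr.Image())
# share=True creates the temporary public link mentioned above; binding to
# 0.0.0.0 additionally exposes port 7860 to other machines on the network.
demo.launch(server_name="0.0.0.0", server_port=7860, share=True)
```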
Training data are currently not available. To train with your own data, follow the structure in `FaceLift/data_sample/`:
Multi-view Diffusion Data:
```
data_sample/
├── mvdiffusion/
│   ├── data_mvdiff_train.txt    # Training data list
│   ├── data_mvdiff_val.txt      # Validation data list
│   └── sample_000/
│       ├── cam_000.png          # Front view (RGBA, 512×512)
│       ├── cam_001.png          # Front-right view
│       ├── cam_002.png          # Right view
│       ├── cam_003.png          # Back view
│       ├── cam_004.png          # Left view
│       └── cam_005.png          # Front-left view
```
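Wrong image sizes or missing alpha channels are easy mistakes when assembling your own samples. A minimal validation sketch using Pillow, assuming the layout above and that the list files hold one sample directory name per line:

```python
from pathlib import Path

from PIL import Image  # pip install Pillow

def check_mvdiff_sample(sample_dir: Path, n_views: int = 6) -> None:
    """Verify one sample: six ordered views, each a 512x512 RGBA PNG."""
    for i in range(n_views):
        path = sample_dir / f"cam_{i:03d}.png"
        assert path.exists(), f"missing view: {path}"
        with Image.open(path) as img:
            assert img.size == (512, 512), f"{path}: got {img.size}"
            assert img.mode == "RGBA", f"{path}: got mode {img.mode}"

root = Path("data_sample/mvdiffusion")
# Assumption: the train list holds one sample directory name per line.
for name in (root / "data_mvdiff_train.txt").read_text().split():
    check_mvdiff_sample(root / name)
```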
GS-LRM Data:
```
data_sample/
├── gslrm/
│   ├── data_gslrm_train.txt     # Training data list
│   ├── data_gslrm_val.txt       # Validation data list
│   └── sample_000/
│       ├── images/
│       │   ├── cam_000.png      # Multi-view images (RGBA, 512×512)
│       │   ├── cam_001.png
│       │   ├── ...
│       │   └── cam_031.png      # 32 views total
│       └── opencv_cameras.json  # Camera parameters
```
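A quick sanity check for a GS-LRM sample is sketched below. The exact field layout of `opencv_cameras.json` is repo-specific (camera parameters in OpenCV convention), so the sketch only loads it; inspect the shipped `sample_000` to learn the schema:

```python
import json
from pathlib import Path

sample_dir = Path("data_sample/gslrm/sample_000")

# A sample pairs 32 ordered views with the camera file.
images = sorted((sample_dir / "images").glob("cam_*.png"))
assert len(images) == 32, f"expected 32 views, got {len(images)}"

# Field names inside the JSON are not documented here; load and inspect.
with open(sample_dir / "opencv_cameras.json") as f:
    cameras = json.load(f)
print(type(cameras))
```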
Train the multi-view diffusion model:

```bash
accelerate launch --config_file mvdiffusion/node_config/8gpu.yaml \
    train_diffusion.py --config configs/mvdiffusion.yaml
```

Our Gaussian Reconstructor is based on GS-LRM and uses pre-trained weights from Objaverse data. It is trained in three stages:
- Stage I: 256 resolution on Objaverse - `gslrm_pretrain_256.yaml`
- Stage II: 512 resolution on Objaverse - `gslrm_pretrain_512.yaml`
- Stage III: 512 resolution on synthetic heads data - `gslrm.yaml`
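The three stages share the same entry point and differ only in config. A minimal sketch of driving them sequentially from Python, assuming single-node defaults; the `configs/` paths and the assumption that each stage resumes from the previous checkpoint via its YAML are guesses, so adjust to the repo's actual configs:

```python
import subprocess

# Stage order follows the list above.
stages = [
    "configs/gslrm_pretrain_256.yaml",  # Stage I: 256 res, Objaverse
    "configs/gslrm_pretrain_512.yaml",  # Stage II: 512 res, Objaverse
    "configs/gslrm.yaml",               # Stage III: 512 res, synthetic heads
]
for cfg in stages:
    subprocess.run(
        ["torchrun", "--nproc_per_node", "8", "--nnodes", "1",
         "train_gslrm.py", "--config", cfg],
        check=True,  # stop the schedule if a stage fails
    )
```

For a single stage, the raw `torchrun` invocation below (Stage III shown) is the reference.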
Train the GS-LRM reconstructor:

```bash
torchrun --nproc_per_node 8 --nnodes 1 \
    --rdzv_id ${JOB_UUID} --rdzv_backend c10d --rdzv_endpoint localhost:29500 \
    train_gslrm.py --config configs/gslrm.yaml
```

If you find our work useful for your research, please consider citing our paper:
```bibtex
@InProceedings{Lyu_2025_ICCV,
    author    = {Lyu, Weijie and Zhou, Yi and Yang, Ming-Hsuan and Shu, Zhixin},
    title     = {FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {12691-12701}
}
```

Copyright 2025 Adobe Inc.
Code is licensed under the Apache-2.0 License.
Model weights are licensed by Adobe Inc. under the Adobe Research License.
This work is built upon Era3D and GS-LRM. We thank the authors for their excellent work.
The code has been reimplemented and the weights retrained. Results may differ slightly from those reported in the paper.