Real-time 3D-aware Portrait Editing from a Single Image
Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen
European Conference on Computer Vision (ECCV) 2024
Figure: Editing results produced by our proposed 3DPE, which allows users to perform 3D-aware portrait editing using image or text prompts.
[Paper] [Project Page]
This work presents 3DPE, a practical method that can efficiently edit a face image in a 3D-aware manner, following prompts given as reference images or text descriptions. To this end, a lightweight module is distilled from a 3D portrait generator and a text-to-image model, which provide prior knowledge of face geometry and strong editing capability, respectively. This design brings two compelling advantages over existing approaches. First, our method achieves real-time editing with a feedforward network (i.e., ∼0.04s per image), over 100× faster than the next fastest competitor. Second, thanks to these powerful priors, our module can focus on learning editing-related variations, so it handles various types of editing simultaneously during training and further supports fast adaptation to user-specified, customized types of editing at inference time.
Figure: Method overview. We distill priors in the 2D diffusion model and 3D GAN for real-time 3D-aware editing.
First, install the required dependencies using the provided `requirements.txt`:

```bash
pip install -r requirements.txt
```
Download the model weights from this link and extract the `pretrained_models` folder to the current directory. The folder should contain:

- `3dpe.pt`: the main model checkpoint
- Other pre-trained components for the 3D portrait editing pipeline
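To verify the download is intact, you can inspect the checkpoint with a few lines of PyTorch. This is a minimal sanity-check sketch, assuming `3dpe.pt` is a standard `torch.save` checkpoint; nothing is assumed about its key names.

```python
import torch

# Load the checkpoint on CPU purely to confirm it deserializes correctly.
ckpt = torch.load("./pretrained_models/3dpe.pt", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys())[:10])  # peek at the first few top-level keys
```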
Use `test_scripts.py` to run inference on test images:

```bash
python test_scripts.py
```
This script will:

- Load the pre-trained model from `./pretrained_models/3dpe.pt`
- Process images from the `./test_imgs/` directory (which contains `input/` and `ref/` subdirectories)
- Generate multi-view 3D-aware portrait editing results
- Save outputs to the `./results` directory
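To run the script on your own data, mirror the same paired layout. The sketch below only checks the directory structure described above; how `test_scripts.py` pairs inputs with references is not assumed here.

```python
from pathlib import Path

# Check the expected ./test_imgs layout before running test_scripts.py.
inputs = sorted(p.name for p in Path("./test_imgs/input").iterdir())
refs = sorted(p.name for p in Path("./test_imgs/ref").iterdir())
print(f"{len(inputs)} input images, {len(refs)} reference prompts")
```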
Download the training dataset from Google Drive. The dataset consists of 4 compressed files:

- `3dpe_dataset.tar.gz.00` (3.81 GB)
- `3dpe_dataset.tar.gz.01` (3.81 GB)
- `3dpe_dataset.tar.gz.02` (3.81 GB)
- `3dpe_dataset.tar.gz.03` (392.1 MB)
To extract the dataset, use the following commands:

```bash
# Combine the split files and extract
cat 3dpe_dataset.tar.gz.* | tar -xzf -
```
This will create a `3dpe_dataset` folder with the following structure:

```
3dpe_dataset/
├── train/   # Training images
└── test/    # Test images
```
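To confirm the archive unpacked completely, you can count the files in each split. This is a quick sketch using the split names from the tree above; the expected file counts are not documented here.

```python
from pathlib import Path

# Count extracted files per split to confirm the archive unpacked fully.
root = Path("3dpe_dataset")
for split in ("train", "test"):
    n_files = sum(1 for p in (root / split).rglob("*") if p.is_file())
    print(f"{split}: {n_files} files")
```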
Use `train_scripts.py` to train the model:

```bash
python train_scripts.py
```
This script will:

- Use `torchrun` for distributed training with 8 GPUs
- Load pre-trained models from `./pretrained_models/`
- Train on the 3DPE dataset (the `train/` subfolder)
- Save experiment outputs to the `./exps` directory
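For context, `torchrun` launches one worker per GPU and passes rank information through environment variables. The snippet below is generic PyTorch boilerplate illustrating that pattern, not code taken from `train_scripts.py`:

```python
import os
import torch
import torch.distributed as dist

def setup_distributed() -> int:
    """Standard torchrun setup: join the process group and pin one GPU per worker."""
    dist.init_process_group(backend="nccl")  # rank/world size come from torchrun
    local_rank = int(os.environ["LOCAL_RANK"])  # set by torchrun per process
    torch.cuda.set_device(local_rank)
    return local_rank
```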
This codebase is built on this implementation of Live3DPortrait.
If you find our work helpful for your research, please consider citing:
```bibtex
@inproceedings{bai20243dpe,
  title     = {Real-time 3D-aware Portrait Editing from a Single Image},
  author    = {Bai, Qingyan and Shi, Zifan and Xu, Yinghao and Ouyang, Hao and Wang, Qiuyu and Yang, Ceyuan and Wang, Xuan and Wetzstein, Gordon and Shen, Yujun and Chen, Qifeng},
  booktitle = {European Conference on Computer Vision},
  year      = {2024}
}
```