🚀DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation

📖 Introduction

DilateQuant is a novel quantization-aware training (QAT) framework for accelerating diffusion models. It maintains high-quality image generation at 4-bit and 6-bit precision. Specifically, we identify the unsaturation property of in-channel weights and exploit it to reduce the wide range of activations. By dilating the unsaturated channels to a constrained range, our Weight Dilation (WD) method steadily reduces quantization errors and ensures that QAT converges. Furthermore, we design a flexible quantizer (TPQ) and introduce a novel knowledge distillation strategy (BKD), which together further improve performance while significantly improving training efficiency.
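
As a rough illustration of the dilation idea described above, the following NumPy sketch scales each input channel of a weight matrix by the largest factor that keeps every element inside its output channel's existing [min, max] range, and divides the matching activation channel by the same factor. This is a minimal sketch under stated assumptions, not the repository's implementation: the function name `dilate_weights`, the per-output-channel bounds, and the closed-form choice of the factor are all assumptions made for illustration.

```python
import numpy as np


def dilate_weights(W):
    """Hypothetical helper. W has shape (out_features, in_features).

    Returns the dilated weights and one scaling factor per input channel.
    """
    w_max = W.max(axis=1, keepdims=True)  # upper bound of each output channel
    w_min = W.min(axis=1, keepdims=True)  # lower bound of each output channel
    # Positive weights may grow up to w_max, negative weights down to w_min.
    bound = np.where(W > 0, w_max, w_min)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(W != 0, bound / W, np.inf)
    # The tightest constraint over all output channels decides the factor.
    s = ratio.min(axis=0)
    s = np.where(np.isfinite(s), s, 1.0)  # all-zero columns stay untouched
    return W * s, s


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))   # toy linear layer weights
X = rng.normal(size=(4, 16))   # toy activations

W_d, s = dilate_weights(W)
X_d = X / s                    # activations absorb the inverse factor

# The layer output is mathematically unchanged by the rescaling.
same_output = np.allclose(X @ W.T, X_d @ W_d.T)
```

Every factor in `s` is at least 1, so each activation channel is divided by a value of 1 or more while the dilated weights stay within the original per-output-channel bounds.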

🔹 Overview of Methods

🔹 Weight Dilation

This repository provides the official implementation of DilateQuant, covering calibration, training, and inference in full. The evaluation and deployment settings are the same as those used in EDA-DM.

🔓 Getting Started

🗝️ Installation

Clone this repository, then create and activate a conda environment named dilatequant with the following commands:

git clone https://github.com/BienLuky/DilateQuant.git
cd DilateQuant
conda env create -f env.yaml
conda activate dilatequant

🔧 Usage

For the Latent Diffusion and Stable Diffusion experiments, first download the relevant checkpoints by following the instructions in the latent-diffusion and stable-diffusion repos from CompVis. We currently use sd-v1-4.ckpt for Stable Diffusion.
Then run the script for the target benchmark:

# CIFAR-10 (DDIM)
bash scripts/for_cifar.sh

# LSUN Bedroom (LDM-4)
bash scripts/for_bedroom.sh

# LSUN Church (LDM-8)
bash scripts/for_church.sh

# ImageNet (LDM-4)
bash scripts/for_imagenet.sh

# COCO (Stable Diffusion)
bash scripts/for_coco.sh

🔍 Details

This work uses EDA-DM as its baseline. We adopt a random sampling strategy to construct the calibration set and employ TPQ to assign separate quantization parameters to each diffusion timestep. During the distillation-based optimization, we use an index-based approach to update only the quantization parameters associated with the current diffusion timestep.
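
The per-timestep parameter assignment can be pictured with a small NumPy sketch. It assumes a hypothetical `TimestepQuantizer` class that stores one scale and zero point per timestep and selects them by the timestep index; the class, its methods, and the min-max calibration are illustrative assumptions rather than the repository's TPQ code.

```python
import numpy as np


class TimestepQuantizer:
    """Hypothetical activation quantizer with one parameter pair per timestep."""

    def __init__(self, num_steps, n_bits=4):
        self.qmax = 2 ** n_bits - 1
        self.scale = np.ones(num_steps)
        self.zero_point = np.zeros(num_steps)

    def calibrate(self, x, t):
        # Index-based update: only entry t is written.
        lo, hi = float(x.min()), float(x.max())
        self.scale[t] = max(hi - lo, 1e-8) / self.qmax
        self.zero_point[t] = np.round(-lo / self.scale[t])

    def __call__(self, x, t):
        # Quantize and dequantize with the parameters of timestep t.
        s, z = self.scale[t], self.zero_point[t]
        q = np.clip(np.round(x / s) + z, 0, self.qmax)
        return (q - z) * s


rng = np.random.default_rng(0)
quantizer = TimestepQuantizer(num_steps=3, n_bits=4)

# Toy activations whose range differs from one timestep to the next.
acts = [rng.normal(scale=sigma, size=256) for sigma in (0.5, 2.0, 8.0)]
for t, x in enumerate(acts):
    quantizer.calibrate(x, t)

before = quantizer.scale.copy()
quantizer.calibrate(acts[0] * 0.5, 0)  # touches timestep 0 only
x_hat = quantizer(acts[2], 2)          # uses the parameters stored for t = 2
```

In this sketch each timestep keeps its own step size, so a narrow activation range at one timestep is not forced onto the coarse grid needed by a wide range at another.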

📊 Results

🔹Random samples

LDM-4-ImageNet (3.35× Acceleration)

🔹Compression

We deploy the quantized models on an RTX 3090 GPU.

🔹Quantization performance

Task	Method	Calib.	Prec. (W/A)	FID ↓	sFID ↓	IS ↑
ImageNet 256 × 256, LDM-4 (steps = 20, eta = 0.0, scale = 3.0)	FP	–	32/32	11.69	7.67	364.72
	PTQD	1024	6/6	16.38	17.79	146.78
	EDA-DM	1024	6/6	11.52	8.02	360.77
	TFMQ-DM	10240	6/6	7.83	8.23	311.32
	EfficientDM	102.4K	6/6	8.69	8.10	309.52
	QuEST	5120	6/6	8.45	9.36	310.12
	DilateQuant	1024	6/6	8.25	7.66	312.30
	PTQD	1024	4/4	245.84	107.63	2.88
	EDA-DM	1024	4/4	20.02	36.66	204.93
	TFMQ-DM	10240	4/4	258.81	152.42	2.40
	TCAQ-DM	–	4/4	30.69	18.92	86.11
	EfficientDM	102.4K	4/4	12.08	14.75	122.12
	QuEST	5120	4/4	38.43	29.27	69.58
	DilateQuant	1024	4/4	8.01	13.92	257.24

🔹Quantization efficiency

Task	Method	Framework	Calib.	Data	Time	Memory
CIFAR-10 32 × 32	EDA-DM	PTQ	5120	0	0.97 h	3019 MB
	LSQ	QAT	–	50K	13.89 h	9974 MB
	EfficientDM	V-QAT	819.2K	0	2.98 h	9546 MB
	DilateQuant	V-QAT	5120	0	1.08 h	3439 MB
ImageNet 256 × 256	QuEST	V-QAT	5120	0	15.25 h	20642 MB
	DilateQuant	V-QAT	1024	0	6.56 h	14680 MB

💙 Acknowledgments

This code was developed based on EDA-DM. We thank the authors of torch-fidelity, pytorch-fid, and clip-score, which we use to compute IS, sFID, FID, and CLIP score.

📚 Citation

If you find this work useful in your research, please consider citing our paper:

@article{liu2024dilatequant,
  title={DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation},
  author={Liu, Xuewen and Li, Zhikai and Gu, Qingyi},
  journal={arXiv preprint arXiv:2409.14307},
  year={2024}
}

@article{liu2024enhanced,
  title={Enhanced distribution alignment for post-training quantization of diffusion models},
  author={Liu, Xuewen and Li, Zhikai and Xiao, Junrui and Gu, Qingyi},
  journal={arXiv e-prints},
  pages={arXiv--2401},
  year={2024}
}

