Diffusion models for machine learning
Tran Trong Khiem
AI lab training
2024/08/01
1 Introduction
2 NCSN
3 DDPM
4 References
Introduction
Math for machine learning:
• Complete the foundations chapters in Probabilistic Machine Learning [1].
• Probability and Statistics.
• Linear Algebra.
• Optimization.
Generative AI:
• GANs
• VAEs
• Flow-based models
• Diffusion models
Generative AI
Existing generative modeling techniques can largely be grouped into two categories based on how they represent probability distributions.
1 Likelihood-based models: directly learn the distribution's probability density (or mass) function via (approximate) maximum likelihood (VAEs, EBMs, ...).
• Cons: require strong restrictions on the model architecture to
ensure a tractable normalizing constant for likelihood
computation.
2 Implicit generative models: the probability distribution is implicitly represented by a model of its sampling process (GANs, ...).
• Cons: training is often unstable and can suffer from mode collapse.
Diffusion models introduce another way to represent probability distributions that circumvents several of these limitations.
Diffusion model
The key idea is to model the gradient of the log probability density function, known as the score function.
• Score-based models do not require a tractable normalizing constant and can be learned directly by score matching; they can match or outperform GANs in image generation.
Notation:
• The dataset consists of N i.i.d. samples xi ∈ R^D drawn from an unknown data distribution pdata(x).
• The score of a probability density p(x) is defined as ∇x log p(x).
• The score network sθ : R^D → R^D is trained to approximate the score of pdata(x).
The framework of score-based generative modeling:
1 score matching
2 Langevin dynamics.
Framework of score-based generative modeling
Score matching:
• Train a score network sθ(x) to estimate ∇x log pdata(x) without training a model to estimate pdata(x).
Langevin dynamics:
• Produce samples from a probability density p(x) using only its score function ∇x log p(x).
Figure 1: Score-based generative modeling with score matching + Langevin dynamics.
Score matching for score estimation
Goal: train a score network sθ (x) to estimate ∇x log pdata (x).
The objective minimizes:
  Epdata[ ∥sθ(x) − ∇x log pdata(x)∥₂² ]
which can be shown to be equivalent, up to a constant, to:
  Epdata(x)[ tr(∇x sθ(x)) + (1/2)∥sθ(x)∥₂² ]
Problem: Score matching is not scalable to deep networks and high-
dimensional data due to the computation of tr(∇x sθ (x)).
Solution: There are two popular methods for large-scale score matching.
1 Denoising score matching
2 Sliced score matching
Score matching for score estimation (cont.)
Denoising score matching:
• completely circumvents tr(∇x sθ (x)).
• perturbs the data point x with a prespecified noise distribution qσ(x̃ | x).
• employs score matching to estimate the score of the perturbed data:
  Eqσ(x̃|x)pdata(x)[ ∥sθ(x̃) − ∇x̃ log qσ(x̃ | x)∥₂² ]
• However, s*θ(x) = ∇x log qσ(x) ≈ ∇x log pdata(x) holds only when the noise is small enough that qσ(x) ≈ pdata(x).
Sliced score matching:
• uses random projections to approximate tr(∇x sθ (x)).
• The objective is:
  Epv Epdata[ vᵀ∇x sθ(x) v + (1/2)∥sθ(x)∥₂² ]
• pv is a simple distribution of random vectors.
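To make the denoising variant concrete, here is a minimal PyTorch-style sketch of the denoising score matching loss for the Gaussian kernel qσ(x̃ | x) = N(x̃ | x, σ²I); score_net is an assumed placeholder for any network sθ : R^D → R^D.

import torch

def denoising_score_matching_loss(score_net, x, sigma):
    """Denoising score matching with Gaussian kernel q_sigma(x_tilde | x) = N(x_tilde; x, sigma^2 I).
    The regression target is the analytic score of the kernel: -(x_tilde - x) / sigma^2."""
    noise = torch.randn_like(x)            # epsilon ~ N(0, I)
    x_tilde = x + sigma * noise            # sample x_tilde ~ q_sigma(x_tilde | x)
    target = -(x_tilde - x) / sigma**2     # grad_{x_tilde} log q_sigma(x_tilde | x)
    pred = score_net(x_tilde)              # s_theta(x_tilde)
    return 0.5 * ((pred - target) ** 2).sum(dim=-1).mean()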
Sampling with Langevin dynamics
Goal: produce samples from a probability density p(x) using only the
score function ∇x log p(x).
• Given a fixed step size ϵ > 0, and an initial value x̃0 ∼ π(x)
• π is a prior distribution.
• The Langevin method recursively computes:
  x̃t = x̃t−1 + (ϵ/2) ∇x log p(x̃t−1) + √ϵ zt,   where zt ∼ N(0, I)
• The distribution of x̃T equals p(x) when ϵ → 0 and T → ∞.
• In practice, ϵ is small and T is large.
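A minimal sketch of this sampler (PyTorch-style; score_fn is an assumed stand-in for the learned score sθ or the true score ∇x log p):

import torch

def langevin_sampling(score_fn, x_init, eps=1e-4, n_steps=1000):
    """Unadjusted Langevin dynamics:
    x_t = x_{t-1} + (eps/2) * score(x_{t-1}) + sqrt(eps) * z_t,  z_t ~ N(0, I)."""
    x = x_init.clone()
    for _ in range(n_steps):
        z = torch.randn_like(x)
        x = x + 0.5 * eps * score_fn(x) + (eps ** 0.5) * z
    return x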
Challenges of score-based generative modeling
Inaccurate score estimation with score matching:
• In score matching, we minimize:
  Epdata[ ∥sθ(x) − ∇x log pdata(x)∥₂² ] = ∫ pdata(x) ∥sθ(x) − ∇x log pdata(x)∥₂² dx
• Since the squared error is weighted by pdata(x), it is largely ignored in low-density regions where pdata(x) is small.
Figure 2: Estimated scores are only accurate in high-density regions.
How to bypass the inaccurate score estimation in regions of
low data density?
Observation: perturbing data with random Gaussian noise makes the
data distribution more amenable to score-based generative modeling.
• large Gaussian noise has the effect of filling low density regions
in the original distribution.
This intuition is the key idea behind Noise Conditional Score Networks (NCSN):
1 perturbing the data using various levels of noise.
2 simultaneously estimating scores corresponding to all noise levels
by training a single conditional score network.
Noise Conditional Score Networks
Problem: How to choose an appropriate noise scale for the perturbation process?
• Larger noise over-corrupts the data and alters it significantly from
the original distribution.
• Smaller noise, on the other hand, causes less corruption of the original data, but does not cover the low-density regions as well.
Solution: Use multiple scales of noise perturbations simultaneously.
Notation:
• Let σ1 > σ2 > · · · > σL > 0 be a geometrically decreasing sequence of noise levels.
• qσ(x) = ∫ pdata(t) N(x | t, σ²I) dt is the perturbed data distribution.
• sθ(x, σ) is a Noise Conditional Score Network (NCSN).
• Train the model to jointly estimate the scores of all perturbed data distributions:
  ∀σ ∈ {σ1, . . . , σL} : sθ(x, σ) ≈ ∇x log qσ(x)
Learning NCSNs via score matching
Adapt denoising score matching for learning NCSNs.
• choose the noise distribution to be qσ(x̃ | x) = N(x̃ | x, σ²I),
• therefore ∇x̃ log qσ(x̃ | x) = −(x̃ − x)/σ².
• For a given σ, the denoising score matching objective is:
  L(θ; σ) = (1/2) Epdata(x) Ex̃∼N(x, σ²I)[ ∥sθ(x̃, σ) + (x̃ − x)/σ²∥₂² ]
• We combine over all σ ∈ {σ1, . . . , σL} to get one unified objective:
  L(θ; {σi}) = (1/L) Σ_{i=1}^L λ(σi) L(θ; σi)
  where λ(σi) > 0 is a weighting coefficient, commonly chosen as λ(σ) = σ².
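A sketch of this unified objective, assuming a PyTorch score_net taking (x̃, σ) and the λ(σ) = σ² weighting suggested in the NCSN paper:

import torch

def ncsn_loss(score_net, x, sigmas):
    """Average of weighted denoising score matching losses over all noise levels."""
    total = 0.0
    for sigma in sigmas:
        noise = torch.randn_like(x)
        x_tilde = x + sigma * noise
        target = -(x_tilde - x) / sigma**2          # score of the perturbation kernel
        pred = score_net(x_tilde, sigma)            # noise-conditional score s_theta(x_tilde, sigma)
        per_sigma = 0.5 * ((pred - target) ** 2).sum(dim=-1).mean()
        total = total + (sigma ** 2) * per_sigma    # lambda(sigma) * L(theta; sigma)
    return total / len(sigmas)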
NCSN inference via annealed Langevin dynamics
• Sampling uses annealed Langevin dynamics: run Langevin dynamics at each noise level, from the largest σ1 down to the smallest σL, using the samples from each level to initialize the next.
Figure 3: Annealed Langevin dynamics.
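A sketch of this procedure, following Algorithm 1 of the NCSN paper (step size αi = ϵ σi²/σL²); score_net(x, σ) is an assumed noise-conditional score network:

import torch

def annealed_langevin_dynamics(score_net, x_init, sigmas, eps=2e-5, T=100):
    """Annealed Langevin dynamics: for each noise level sigma_1 > ... > sigma_L,
    run T Langevin steps with step size alpha_i = eps * sigma_i^2 / sigma_L^2,
    using the final samples at one level to initialize the next."""
    x = x_init.clone()
    sigma_L = sigmas[-1]
    for sigma in sigmas:                             # anneal from largest to smallest noise
        alpha = eps * (sigma ** 2) / (sigma_L ** 2)
        for _ in range(T):
            z = torch.randn_like(x)
            x = x + 0.5 * alpha * score_net(x, sigma) + (alpha ** 0.5) * z
    return x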
Denoising Diffusion Probabilistic Models
Figure 4: Diffusion model
Forward diffusion process:
• Add a small amount of Gaussian noise to the sample over T steps,
• producing a sequence of noisy samples x1, x2, . . . , xT.
• This converts any complex data distribution into a simple, tractable distribution.
Reverse diffusion process:
• Learn a reversal of the forward diffusion process.
Forward process
Gradually adds Gaussian noise to the data according to a variance schedule β1, . . . , βT:
  q(x1:T | x0) := ∏_{t=1}^T q(xt | xt−1),   q(xt | xt−1) := N(xt; √(1 − βt) xt−1, βt I)
Nice property: We can sample xt at any timestep t in closed form:
  xt = √αt xt−1 + √(1 − αt) ϵt = √ᾱt x0 + √(1 − ᾱt) ϵ
• ϵt, ϵ ∼ N(0, I)
• αt = 1 − βt and ᾱt = ∏_{s=1}^t αs
• Thus we have: q(xt | x0) = N(xt; √ᾱt x0, (1 − ᾱt) I)
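A small sketch of this closed-form sampling, assuming PyTorch, a scalar timestep t, and the linear β schedule used in the DDPM paper:

import torch

# Example variance schedule (linear, beta_1 = 1e-4 to beta_T = 0.02, T = 1000).
betas = torch.linspace(1e-4, 0.02, 1000)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)     # alpha_bar_t = prod_{s <= t} alpha_s

def q_sample(x0, t, alpha_bar):
    """Sample x_t directly from x_0 via q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I),
    for a scalar integer timestep t."""
    eps = torch.randn_like(x0)
    a_bar = alpha_bar[t]
    return (a_bar ** 0.5) * x0 + ((1.0 - a_bar) ** 0.5) * eps, eps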
Reverse diffusion process
Goal: Learn to reverse the forward process and sample from q(xt−1 |xt ).
• Use pθ (xt−1 |xt ) to approximate q(xt−1 |xt ).
• The reverse conditional probability is tractable when conditioned on x0:
  q(xt−1 | xt, x0) = N(xt−1; μ̃(xt, x0), β̃t I)
• β̃t = ((1 − ᾱt−1)/(1 − ᾱt)) βt
• μ̃(xt, x0) = (√αt (1 − ᾱt−1)/(1 − ᾱt)) xt + (√ᾱt−1 βt/(1 − ᾱt)) x0
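A sketch of these posterior parameters in PyTorch, assuming precomputed 1-D schedule tensors (betas, alphas, alpha_bar) and a scalar integer timestep t:

import torch

def q_posterior(x0, xt, t, betas, alphas, alpha_bar):
    """Mean and variance of the tractable posterior q(x_{t-1} | x_t, x_0);
    alpha_bar_{t-1} is taken as 1 when t == 0."""
    a_bar_t = alpha_bar[t]
    a_bar_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
    coef_xt = alphas[t].sqrt() * (1.0 - a_bar_prev) / (1.0 - a_bar_t)
    coef_x0 = a_bar_prev.sqrt() * betas[t] / (1.0 - a_bar_t)
    mean = coef_xt * xt + coef_x0 * x0                       # mu_tilde(x_t, x_0)
    var = (1.0 - a_bar_prev) / (1.0 - a_bar_t) * betas[t]    # beta_tilde_t
    return mean, var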
Reverse diffusion process (cont.)
Training is performed by optimizing the usual variational bound on the negative log likelihood:
  Eq[− log pθ(x0)] ≤ Eq[ − log (pθ(x0:T) / q(x1:T | x0)) ]
  = Eq[ − log p(xT) − Σ_{t=1}^T log (pθ(xt−1 | xt) / q(xt | xt−1)) ] =: L
The loss function can be rewritten as:
  L = Eq[ DKL(q(xT | x0) ∥ p(xT)) + Σ_{t>1} DKL(q(xt−1 | xt, x0) ∥ pθ(xt−1 | xt)) − log pθ(x0 | x1) ]   (1)
Label each component of the variational lower bound separately, LVLB = Σ_{t=0}^T Lt:
• LT = DKL(q(xT | x0) ∥ p(xT))
• Lt = DKL(q(xt | xt+1, x0) ∥ pθ(xt | xt+1)) for 1 ≤ t ≤ T − 1
• L0 = − log pθ(x0 | x1)
Reverse diffusion process (cont.)
The loss term Lt is parameterized and simplified to minimize:
  Lt^simple = Et∼[1,T], x0, ϵt[ ∥ϵt − ϵθ(√ᾱt x0 + √(1 − ᾱt) ϵt, t)∥² ]
Figure 5: Training process.
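A sketch of one training step under this simplified objective, assuming PyTorch and an assumed noise-prediction network eps_net(xt, t):

import torch
import torch.nn.functional as F

def ddpm_training_step(eps_net, x0, alpha_bar):
    """Sample t and epsilon, form x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) epsilon,
    and regress the network output onto the true noise with an MSE loss."""
    B = x0.shape[0]
    T = alpha_bar.shape[0]
    t = torch.randint(0, T, (B,), device=x0.device)                 # t ~ Uniform over timesteps
    eps = torch.randn_like(x0)
    a_bar = alpha_bar[t].view(B, *([1] * (x0.dim() - 1)))           # broadcast over data dims
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * eps
    eps_pred = eps_net(x_t, t)                                      # epsilon_theta(x_t, t)
    return F.mse_loss(eps_pred, eps)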
References
1 Weng, Lilian. (Jul 2021). What are diffusion models? Lil'Log.
2 Ho, Jonathan; Jain, Ajay; Abbeel, Pieter. Denoising Diffusion Probabilistic Models.
3 Song, Yang; Ermon, Stefano. Generative Modeling by Estimating Gradients of the Data Distribution.