This repository contains an implementation of Denoising Diffusion Implicit Models (DDIM) for image generation and inversion tasks, modified for compatibility with the rectified flow model. Denoising diffusion implicit models (DDIMs) generalize DDPMs via a class of non-Markovian diffusion processes that still lead to the same training objective as DDPMs, but whose corresponding generative processes can be made deterministic. DDIMs allow us to perform semantic image interpolation directly in the latent space and to reconstruct observations with very low error, while also sampling much faster than DDPMs.
Deep generative models have demonstrated the ability to produce high-quality samples from unknown data distributions. For image generation, DDPMs have shown results comparable to those of GANs. However, GANs require very specific choices of optimization and architecture to stabilize training, and they can fail to cover modes of the data distribution. This opened the way to a new class of generative models in which a neural network is trained to denoise an image that has been progressively corrupted by Gaussian noise through a forward process that simulates a [[Markov chain]]. Samples are then generated through a Markov chain that starts from white noise and progressively denoises it into an image.
Given samples from a data distribution $q(x_0)$, we want to learn a model distribution $p_\theta(x_0)$ that approximates $q(x_0)$ and is easy to sample from.
DDPMs are latent variable models of the following form:

$$p_\theta(x_0) = \int p_\theta(x_{0:T}) \, dx_{1:T}, \qquad p_\theta(x_{0:T}) := p_\theta(x_T) \prod_{t=1}^{T} p_\theta^{(t)}(x_{t-1} \mid x_t)$$
where the parameters $\theta$ are learned to fit the data distribution $q(x_0)$ by maximizing a variational lower bound:

$$\max_\theta \, \mathbb{E}_{q(x_0)}[\log p_\theta(x_0)] \;\geq\; \max_\theta \, \mathbb{E}_{q(x_0, x_1, \dots, x_T)}\!\left[\log p_\theta(x_{0:T}) - \log q(x_{1:T} \mid x_0)\right]$$

Unlike typical latent variable models, DDPMs use a fixed (rather than trainable) inference procedure $q(x_{1:T} \mid x_0)$: a Markov chain with Gaussian transitions parameterized by a decreasing sequence $\alpha_{1:T} \in (0, 1]^T$,

$$q(x_{1:T} \mid x_0) := \prod_{t=1}^{T} q(x_t \mid x_{t-1})$$
where

$$q(x_t \mid x_{t-1}) := \mathcal{N}\!\left( \sqrt{\frac{\alpha_t}{\alpha_{t-1}}} \, x_{t-1}, \; \left( 1 - \frac{\alpha_t}{\alpha_{t-1}} \right) I \right)$$
where the covariance matrix is ensured to have positive terms on its diagonal. This is called the forward process due to the autoregressive nature of the sampling procedure (from $x_0$ to $x_T$).
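As a rough PyTorch sketch of one forward transition, assuming a 1-D tensor `alphas` holding the decreasing sequence $\alpha_{1:T}$ (an illustrative name, not one fixed by this repo):

```python
import torch

def forward_step(x_prev: torch.Tensor, t: int, alphas: torch.Tensor) -> torch.Tensor:
    """One Markov forward step x_{t-1} -> x_t:
    q(x_t | x_{t-1}) = N(sqrt(alpha_t / alpha_{t-1}) x_{t-1}, (1 - alpha_t / alpha_{t-1}) I).
    `alphas` is assumed indexed so that alphas[t] holds alpha_t."""
    ratio = alphas[t] / alphas[t - 1]
    return ratio.sqrt() * x_prev + (1.0 - ratio).sqrt() * torch.randn_like(x_prev)
```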
A special property of the forward process is that

$$q(x_t \mid x_0) := \int q(x_{1:t} \mid x_0) \, dx_{1:(t-1)} = \mathcal{N}\!\left( x_t; \sqrt{\alpha_t} \, x_0, \, (1 - \alpha_t) I \right)$$
This allows us to express $x_t$ as a linear combination of $x_0$ and a noise variable $\epsilon$:

$$x_t = \sqrt{\alpha_t} \, x_0 + \sqrt{1 - \alpha_t} \, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$
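In code, this closed form lets us jump directly from $x_0$ to any $x_t$ without iterating the chain. A minimal sketch under the same assumed `alphas` tensor:

```python
import torch

def q_sample(x0: torch.Tensor, t: int, alphas: torch.Tensor) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_t) x_0, (1 - alpha_t) I)
    in one shot via x_t = sqrt(alpha_t) x_0 + sqrt(1 - alpha_t) eps."""
    eps = torch.randn_like(x0)          # eps ~ N(0, I)
    a_t = alphas[t]
    return a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * eps
```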
When we set $\alpha_T$ sufficiently close to $0$, $q(x_T \mid x_0)$ converges to a standard Gaussian for all $x_0$, so it is natural to set $p_\theta(x_T) := \mathcal{N}(0, I)$. If all the conditionals are modeled as Gaussians with trainable mean functions and fixed variances, the variational objective can be rewritten as

$$L_\gamma(\epsilon_\theta) := \sum_{t=1}^{T} \gamma_t \, \mathbb{E}_{x_0 \sim q(x_0), \, \epsilon_t \sim \mathcal{N}(0, I)}\!\left[ \left\lVert \epsilon_\theta^{(t)}\!\left( \sqrt{\alpha_t} \, x_0 + \sqrt{1 - \alpha_t} \, \epsilon_t \right) - \epsilon_t \right\rVert_2^2 \right]$$
where $\epsilon_\theta := \{\epsilon_\theta^{(t)}\}_{t=1}^{T}$ is a set of $T$ functions with trainable parameters $\theta$, and $\gamma := [\gamma_1, \dots, \gamma_T]$ is a vector of positive coefficients that depends on $\alpha_{1:T}$. DDPMs are trained with $\gamma = \mathbf{1}$, which was found to maximize generation quality.
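A sketch of one Monte Carlo estimate of this objective with $\gamma = \mathbf{1}$, assuming a hypothetical `model(x_t, t)` that implements $\epsilon_\theta^{(t)}(x_t)$ (the actual training loop in this repo may differ):

```python
import torch
import torch.nn.functional as F

def ddpm_loss(model, x0: torch.Tensor, alphas: torch.Tensor) -> torch.Tensor:
    """L_gamma with gamma = 1: draw a uniform timestep per sample, noise x0
    to x_t with the closed form, and regress the model output onto eps."""
    T = alphas.shape[0]
    t = torch.randint(0, T, (x0.shape[0],), device=x0.device)        # uniform t
    a_t = alphas.to(x0.device)[t].view(-1, *([1] * (x0.dim() - 1)))  # broadcastable
    eps = torch.randn_like(x0)
    x_t = a_t.sqrt() * x0 + (1.0 - a_t).sqrt() * eps                 # q(x_t | x_0)
    return F.mse_loss(model(x_t, t), eps)   # mean squared error between eps_theta and eps
```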
The length $T$ of the forward process is an important hyperparameter in DDPMs: a large $T$ keeps each reverse step close to Gaussian, so a generative process built from Gaussian conditionals is a good approximation, but sampling then requires $T$ sequential passes through the network, which makes DDPMs much slower than GANs.
The generative process is an approximation of the reverse of the inference (diffusion) process, so rethinking the inference process can speed up generation: non-Markovian inference processes admit generative processes that require fewer iterations.
One important observation regarding the objective $L_\gamma$ is that it depends only on the marginals $q(x_t \mid x_0)$, not directly on the joint $q(x_{1:T} \mid x_0)$. Since many joint distributions share the same marginals, we can look for alternative, non-Markovian inference processes.
These non-Markovian inference processes lead to the same objective function as DDPMs.
Let's consider a family $\mathcal{Q}$ of inference distributions, indexed by a real vector $\sigma \in \mathbb{R}_{\geq 0}^{T}$:

$$q_\sigma(x_{1:T} \mid x_0) := q_\sigma(x_T \mid x_0) \prod_{t=2}^{T} q_\sigma(x_{t-1} \mid x_t, x_0)$$
where $q_\sigma(x_T \mid x_0) = \mathcal{N}(\sqrt{\alpha_T} \, x_0, (1 - \alpha_T) I)$ and, for all $t > 1$,

$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\!\left( \sqrt{\alpha_{t-1}} \, x_0 + \sqrt{1 - \alpha_{t-1} - \sigma_t^2} \cdot \frac{x_t - \sqrt{\alpha_t} \, x_0}{\sqrt{1 - \alpha_t}}, \; \sigma_t^2 I \right)$$
The mean function is chosen to ensure that $q_\sigma(x_t \mid x_0) = \mathcal{N}(\sqrt{\alpha_t} \, x_0, (1 - \alpha_t) I)$ for all $t$, so that the joint defines an inference distribution with the desired marginals. The forward process can then be derived via Bayes' rule:

$$q_\sigma(x_t \mid x_{t-1}, x_0) = \frac{q_\sigma(x_{t-1} \mid x_t, x_0) \, q_\sigma(x_t \mid x_0)}{q_\sigma(x_{t-1} \mid x_0)}$$
which is also Gaussian. But unlike the diffusion process, the forward process here is no longer Markovian, since each $x_t$ can depend on both $x_{t-1}$ and $x_0$.
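To make the role of $\sigma$ concrete, here is a sketch that samples $x_{t-1} \sim q_\sigma(x_{t-1} \mid x_t, x_0)$ for a known $x_0$, with `sigmas` an assumed 1-D tensor of the $\sigma_t$ values, indexed like `alphas`:

```python
import torch

def q_sigma_sample(x_t, x0, t: int, alphas, sigmas):
    """Sample x_{t-1} ~ q_sigma(x_{t-1} | x_t, x_0) for t > 1. The mean moves
    toward sqrt(alpha_{t-1}) x_0 along the normalized noise direction
    (x_t - sqrt(alpha_t) x_0) / sqrt(1 - alpha_t), preserving the marginals.
    Requires sigma_t^2 <= 1 - alpha_{t-1}."""
    a_t, a_prev, s_t = alphas[t], alphas[t - 1], sigmas[t]
    direction = (x_t - a_t.sqrt() * x0) / (1.0 - a_t).sqrt()
    mean = a_prev.sqrt() * x0 + (1.0 - a_prev - s_t**2).sqrt() * direction
    return mean + s_t * torch.randn_like(x_t)
```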
Next, we define a trainable generative process $p_\theta(x_{0:T})$ in which each $p_\theta^{(t)}(x_{t-1} \mid x_t)$ leverages knowledge of $q_\sigma(x_{t-1} \mid x_t, x_0)$: given a noisy observation $x_t$, we first predict the corresponding $x_0$, and then use it to obtain a sample $x_{t-1}$ through the reverse conditional distribution defined above.
For some $x_0 \sim q(x_0)$ and $\epsilon_t \sim \mathcal{N}(0, I)$, $x_t$ can be obtained from the linear combination above. The model $\epsilon_\theta^{(t)}(x_t)$ then attempts to predict $\epsilon_t$ from $x_t$, without knowledge of $x_0$.
By rewriting the equation, we can then compute the denoised observation, which is a prediction of $x_0$ given $x_t$:

$$f_\theta^{(t)}(x_t) := \frac{x_t - \sqrt{1 - \alpha_t} \cdot \epsilon_\theta^{(t)}(x_t)}{\sqrt{\alpha_t}}$$
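A short sketch of this denoising prediction, again assuming `model(x_t, t)` implements $\epsilon_\theta^{(t)}$:

```python
def predict_x0(model, x_t, t: int, alphas):
    """f_theta^{(t)}(x_t): invert x_t = sqrt(alpha_t) x_0 + sqrt(1 - alpha_t) eps
    using the model's noise prediction in place of eps."""
    a_t = alphas[t]
    return (x_t - (1.0 - a_t).sqrt() * model(x_t, t)) / a_t.sqrt()
```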
Then we can define the generative process with a fixed prior $p_\theta(x_T) = \mathcal{N}(0, I)$ and

$$p_\theta^{(t)}(x_{t-1} \mid x_t) = \begin{cases} \mathcal{N}\!\left( f_\theta^{(1)}(x_1), \, \sigma_1^2 I \right) & \text{if } t = 1 \\ q_\sigma\!\left( x_{t-1} \mid x_t, f_\theta^{(t)}(x_t) \right) & \text{otherwise} \end{cases}$$
where $q_\sigma(x_{t-1} \mid x_t, f_\theta^{(t)}(x_t))$ is the reverse conditional defined above with $x_0$ replaced by the prediction $f_\theta^{(t)}(x_t)$; some Gaussian noise (with covariance $\sigma_1^2 I$) is added to the $t = 1$ case to ensure that the generative process is supported everywhere.
We optimize for $\theta$ via the following variational objective (a functional over $\epsilon_\theta$):

$$J_\sigma(\epsilon_\theta) := \mathbb{E}_{x_{0:T} \sim q_\sigma}\!\left[ \log q_\sigma(x_{1:T} \mid x_0) - \log p_\theta(x_{0:T}) \right] = \mathbb{E}_{x_{0:T} \sim q_\sigma}\!\left[ \log q_\sigma(x_T \mid x_0) + \sum_{t=2}^{T} \log q_\sigma(x_{t-1} \mid x_t, x_0) - \sum_{t=1}^{T} \log p_\theta^{(t)}(x_{t-1} \mid x_t) - \log p_\theta(x_T) \right]$$
We get the second formula by factorizing $q_\sigma(x_{1:T} \mid x_0)$ according to its definition above and $p_\theta(x_{0:T})$ according to the definition of the generative process.
From the definition of $J_\sigma$, it would appear that a different model has to be trained for every choice of $\sigma$. However, $J_\sigma$ is equivalent to $L_\gamma$ for certain weights $\gamma$: for all $\sigma > 0$, there exist $\gamma \in \mathbb{R}_{>0}^{T}$ and $C \in \mathbb{R}$ such that $J_\sigma = L_\gamma + C$.
With models $\epsilon_\theta^{(t)}$ whose parameters are not shared across different $t$, the optimal solution of $L_\gamma$ does not depend on the weights $\gamma$ (each term in the sum is optimized separately), so the $L_{\mathbf{1}}$ objective used to train DDPMs can serve as a surrogate for $J_\sigma$ as well.
From $p_\theta(x_{0:T})$ defined above, one can generate a sample $x_{t-1}$ from a sample $x_t$ via

$$x_{t-1} = \sqrt{\alpha_{t-1}} \underbrace{\left( \frac{x_t - \sqrt{1 - \alpha_t} \, \epsilon_\theta^{(t)}(x_t)}{\sqrt{\alpha_t}} \right)}_{\text{predicted } x_0} + \underbrace{\sqrt{1 - \alpha_{t-1} - \sigma_t^2} \cdot \epsilon_\theta^{(t)}(x_t)}_{\text{direction pointing to } x_t} + \underbrace{\sigma_t \epsilon_t}_{\text{random noise}}$$

where $\epsilon_t \sim \mathcal{N}(0, I)$ is standard Gaussian noise independent of $x_t$. Different choices of $\sigma$ yield different generative processes, all using the same trained model $\epsilon_\theta$. In the special case $\sigma_t = \sqrt{(1 - \alpha_{t-1})/(1 - \alpha_t)} \sqrt{1 - \alpha_t / \alpha_{t-1}}$ for all $t$, the forward process becomes Markovian and the generative process becomes a DDPM.
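The three terms map directly onto code. A sketch of a single generative step under the same assumed names (`alphas`, `sigmas`, `model`):

```python
import torch

def p_sample_step(model, x_t, t: int, alphas, sigmas):
    """One generative step x_t -> x_{t-1}: scaled predicted x_0, plus the
    direction pointing to x_t, plus sigma_t-scaled fresh noise."""
    a_t, a_prev, s_t = alphas[t], alphas[t - 1], sigmas[t]
    eps = model(x_t, t)
    x0_pred = (x_t - (1.0 - a_t).sqrt() * eps) / a_t.sqrt()   # predicted x_0
    direction = (1.0 - a_prev - s_t**2).sqrt() * eps          # points to x_t
    return a_prev.sqrt() * x0_pred + direction + s_t * torch.randn_like(x_t)
```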
Another special case is when $\sigma_t = 0$ for all $t$: the forward process becomes deterministic given $x_{t-1}$ and $x_0$ (except for $t = 1$), and in the generative process the coefficient in front of the random noise $\epsilon_t$ vanishes. The resulting model is an implicit probabilistic model, in which samples are generated from latent variables via a fixed procedure (from $x_T$ to $x_0$). We call this the denoising diffusion implicit model (DDIM): an implicit probabilistic model trained with the DDPM objective.
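Putting it together, a sketch of the full deterministic DDIM sampler; it also illustrates the speedup discussed earlier, since it denoises over an evenly strided subsequence of the $T$ training timesteps (using the convention $\alpha_0 := 1$ for the final step; the `model(x, t)` signature is assumed):

```python
import torch

@torch.no_grad()
def ddim_sample(model, shape, alphas, steps: int) -> torch.Tensor:
    """Deterministic DDIM sampling (sigma_t = 0 for all t), starting from
    white noise x_T ~ N(0, I) and denoising over `steps` << T iterations."""
    T = alphas.shape[0]
    ts = torch.linspace(T - 1, 0, steps).round().long().tolist()   # strided schedule
    x = torch.randn(shape, device=alphas.device)                   # x_T
    for i, t in enumerate(ts):
        a_t = alphas[t]
        a_prev = alphas[ts[i + 1]] if i + 1 < len(ts) else torch.ones_like(a_t)
        eps = model(x, t)                                          # eps_theta^{(t)}(x_t)
        x0_pred = (x - (1.0 - a_t).sqrt() * eps) / a_t.sqrt()      # f_theta^{(t)}(x_t)
        x = a_prev.sqrt() * x0_pred + (1.0 - a_prev).sqrt() * eps  # sigma_t = 0 update
    return x
```

Because no fresh noise is injected, the map from $x_T$ to $x_0$ is a fixed function of the latent, which is what enables the low-error inversion and latent-space interpolation mentioned at the top.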