AI-5004
Advanced Topics in Generative AI
Dr. Usman Haider
Generative Adversarial Networks (GANs)
[Diagram] Autoencoder: image → Encoder (NN) → latent code → Decoder (NN) → reconstruction, trained to be as close as possible to the input.
[Diagram] Generation: randomly generate a latent vector → Decoder (NN) → image?
Common Problems with VAEs
▪ Blurriness in Generated Samples:
▪ VAEs tend to produce blurrier, less sharp images than the original data samples.
▪ This stems from the nature of the VAE reconstruction loss, which averages out variations in the data.
▪ Latent Space Regularization
▪ VAEs enforce a regularization on the latent space so that it follows a specific distribution (usually a Gaussian).
▪ The chosen prior may not approximate the true data distribution well.
▪ This can prevent the VAE from capturing more complex distributions present in the data.
▪ Balancing Act
▪ There is a trade-off between the reconstruction loss and the KL divergence (see the sketch below):
▪ Too much weight on the reconstruction loss can lead to ignoring the latent space structure.
▪ Too much weight on the KL divergence can lead to ignoring the data's details.
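This trade-off is commonly made explicit by weighting the KL term (as in a β-VAE). A minimal sketch, assuming a PyTorch VAE whose encoder outputs mu and log_var:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    # Reconstruction term: how well the decoder reproduces the input.
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # KL term: pushes the approximate posterior N(mu, sigma^2) toward N(0, I).
    kl = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())
    # beta > 1 emphasizes latent-space structure (KL); beta < 1 emphasizes
    # reconstruction detail; beta = 1 is the standard VAE objective.
    return recon + beta * kl
```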
Transposed Convolution (Deconvolution)
[Diagram] Convolution vs. transposed convolution (deconvolution): input → output for Stride=1 Pad=Valid, Stride=2 Pad=Valid, and Stride=2 Pad=Same.
Animations: https://github.com/vdumoulin/conv_arithmetic
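A small sketch of how stride and padding change the output size of a transposed convolution, using PyTorch's ConvTranspose2d (the channel counts and kernel sizes here are illustrative, not from the slides):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 16, 7, 7)          # batch of 7x7 feature maps with 16 channels

# Stride 1: output spatial size grows by (kernel_size - 1) with no padding.
up1 = nn.ConvTranspose2d(16, 8, kernel_size=3, stride=1)        # -> 1 x 8 x 9 x 9

# Stride 2 with padding/output_padding chosen to exactly double the size.
up2 = nn.ConvTranspose2d(16, 8, kernel_size=3, stride=2,
                         padding=1, output_padding=1)            # -> 1 x 8 x 14 x 14

print(up1(x).shape)   # torch.Size([1, 8, 9, 9])
print(up2(x).shape)   # torch.Size([1, 8, 14, 14])
```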
Generative Adversarial Network (GAN)
▪ GANs were first introduced by Ian Goodfellow et al. in 2014.
▪ They can be used to generate images, videos, text, and even simple conversation.
▪ Note: image processing is comparatively easy (even animals can do it), while NLP is hard (only humans can do it).
▪ This co-evolution approach might have far-reaching implications.
Bengio: this may hold the key to making computers a lot more
intelligent.
GAN – Learn a discriminator
[Diagram] Randomly sample a vector → Generator (something like the decoder in a VAE) → generated image, labeled 0 (fake). Real images sampled from a database are labeled 1 (real). The Discriminator outputs 1/0 (real or fake).
Generative Adversarial Network
▪ GANs perform unsupervised learning tasks in machine learning.
▪ A GAN consists of two models that automatically discover and learn the patterns in the input data:
▪ Generator and Discriminator
▪ They compete with each other to capture and replicate the variations within a dataset.
▪ GANs can be used to generate new examples that are similar to the original training
dataset.
Generative Adversarial Network
▪ Generator:
▪ The Generator in a GAN is a neural network that creates fake data on which the discriminator is trained.
▪ It learns to generate plausible data.
▪ The generated examples/instances become negative training
examples for the discriminator.
▪ It takes a fixed-length random vector carrying noise as input and
generates a sample.
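As an illustration of this slide (not the course's reference implementation), a minimal fully connected generator sketch in PyTorch, assuming a 100-dimensional noise vector and a flattened 28x28 output:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100, img_dim=28 * 28):
        super().__init__()
        # Maps a fixed-length random noise vector to a flattened fake image.
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, img_dim),
            nn.Tanh(),            # outputs in [-1, 1], matching normalized images
        )

    def forward(self, z):
        return self.net(z)

z = torch.randn(64, 100)          # batch of random noise vectors
fake_images = Generator()(z)      # -> shape (64, 784)
```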
Generative Adversarial Network
▪ The main aim of the Generator is to make the discriminator classify its output as real.
▪ This can happen when the generator generates realistic images.
Generative Adversarial Network
▪ Discriminator
▪ The Discriminator is a neural network that distinguishes real data from the fake data created by the Generator.
▪ The discriminator's training data comes from two different sources:
▪ The real data instances
▪ The fake data instances
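A matching discriminator sketch under the same illustrative assumptions; it is simply a binary classifier that outputs the probability that its input is real:

```python
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, img_dim=28 * 28):
        super().__init__()
        # Maps a flattened image to a single real/fake probability.
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),         # probability that the input is real
        )

    def forward(self, x):
        return self.net(x)
```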
Generative Adversarial Network
▪ Complete GAN Network
Random Input
▪ In its most basic form, a GAN takes random noise as its input.
▪ The generator then transforms this noise into a meaningful output.
▪ By introducing noise, we can get the GAN to produce a wide variety of data.
GAN Training
▪ Because a GAN contains two separately trained networks, its training algorithm must
address two complications:
▪ GANs must alternate between two different kinds of training (generator and discriminator).
▪ GAN convergence is hard to identify.
▪ Alternating Training
1. The discriminator trains for one or more epochs.
2. The generator trains for one or more epochs.
3. Repeat steps 1 and 2 to continue to train the generator and discriminator networks.
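A minimal sketch of this alternating scheme, reusing the illustrative Generator and Discriminator above; for brevity each "phase" here is one mini-batch rather than one or more epochs:

```python
import torch
import torch.nn as nn

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real):                            # real: (batch, 784) tensor
    batch = real.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # 1. Discriminator step: real images labeled 1, generated images labeled 0.
    fake = G(torch.randn(batch, 100)).detach()   # detach: don't update G here
    loss_d = bce(D(real), ones) + bce(D(fake), zeros)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2. Generator step: try to make D classify fresh fakes as real.
    fake = G(torch.randn(batch, 100))
    loss_g = bce(D(fake), ones)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```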
Generative Adversarial Network
▪ The backpropagation method is used to adjust each weight in the right direction by
calculating the weight's impact on the output.
Generative Adversarial Network
▪ Loss Functions
▪ GANs try to replicate the probability distribution of the training data.
▪ The loss function measures the distance between the distribution of the data generated by the GAN and the distribution of the real data.
▪ Active research area
▪ Many approaches have been proposed.
▪ GANs have two loss functions: one for generator training and one for discriminator training.
▪ Two common GAN loss functions:
▪ Minimax loss
▪ Wasserstein loss
Loss Functions
▪ Minimax Loss
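The formula on this slide did not survive the export; the standard minimax objective from Goodfellow et al. (2014) is:

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```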
Similarity between two Distributions
▪ KL (Kullback–Leibler) divergence:
▪ Measures how one probability distribution diverges from a second expected probability distribution.
▪ KL divergence is asymmetric:
▪ If P(x) ≈ 0 where Q(x) is non-zero, Q's effect is disregarded.
▪ This makes it problematic for measuring the similarity between two equally important distributions.
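The KL formula itself was also lost in the export; the standard definition for discrete distributions P and Q is:

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \, \log \frac{P(x)}{Q(x)}
```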
GAN on MNIST
Generator: 100-d noise → FC, BN, ReLU, Reshape → 7x7x16 → Deconv, BN, LReLU → 14x14x8 → Deconv, Tanh/Sigmoid → 28x28x1
Discriminator: 28x28x1 → Conv, BN, LReLU → 14x14x8 → Conv, BN, LReLU → 7x7x16 → Reshape, FC, BN, LReLU → 256 → FC, Sigmoid → 1
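A PyTorch sketch of this architecture; the feature-map shapes follow the slide, but the kernel sizes, strides, and the choice of Tanh on the generator output are assumptions:

```python
import torch.nn as nn

class MnistGenerator(nn.Module):
    """100-d noise -> 7x7x16 -> 14x14x8 -> 28x28x1, as in the slide's table."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(100, 7 * 7 * 16), nn.BatchNorm1d(7 * 7 * 16), nn.ReLU())
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1),   # 7x7 -> 14x14
            nn.BatchNorm2d(8), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(8, 1, 4, stride=2, padding=1),    # 14x14 -> 28x28
            nn.Tanh())

    def forward(self, z):
        return self.deconv(self.fc(z).view(-1, 16, 7, 7))

class MnistDiscriminator(nn.Module):
    """28x28x1 -> 14x14x8 -> 7x7x16 -> 256 -> 1, as in the slide's table."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 8, 4, stride=2, padding=1),             # 28x28 -> 14x14
            nn.BatchNorm2d(8), nn.LeakyReLU(0.2),
            nn.Conv2d(8, 16, 4, stride=2, padding=1),            # 14x14 -> 7x7
            nn.BatchNorm2d(16), nn.LeakyReLU(0.2))
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(7 * 7 * 16, 256), nn.BatchNorm1d(256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, x):
        return self.fc(self.conv(x))
```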
Issues: Convergence
▪ As the generator improves with training, the discriminator performance gets worse
▪ Discriminator can’t differentiate between real and fake.
▪ If the generator succeeds perfectly, then the discriminator has a 50% accuracy.
▪ The discriminator essentially starts guessing at random, so its feedback becomes less meaningful.
▪ Warning: the generator may then start to train on this random feedback, and its own quality may degrade.
Problems
▪ Imbalance
▪ One network may dominate the other
▪ e.g., the discriminator may always tell the fake and real data apart, and the generator may never be able to fool the discriminator.
▪ Local Convergence
▪ We may get stuck at a local minimum rather than the global optimum.
▪ The discriminator feedback gets less meaningful over time.
▪ The generator starts to train on junk feedback, and its own quality may collapse.