Generative Adversarial Networks
Today’s class
• Unsupervised Learning
• Generative Models
• Autoencoders (AE)
• Generative Adversarial Networks (GAN)
• GANs: Recent Trends
Supervised vs Unsupervised Learning
Supervised Learning
Data: (x, y)
x is data, y is label
Goal: Learn a function to map x -> y
Examples: Classification,
regression, object detection,
semantic segmentation,
image captioning, etc.
Credit: cs231n, Stanford
(Figure panels: image classification (“Cat”), object detection, semantic segmentation, image captioning.)
Supervised vs Unsupervised Learning
Unsupervised Learning
Data: x
Just data, no labels!
Goal: Learn some underlying
hidden structure of the data
Examples: Clustering,
dimensionality reduction, feature
learning, density estimation, etc.
Credit: cs231n, Stanford
(Figure panels: k-means clustering; dimensionality reduction with principal component analysis; generative adversarial networks (distribution learning).)
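As a quick illustration of one of these examples, dimensionality reduction with principal component analysis takes only a few lines of NumPy (an illustrative sketch, not part of the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 5-D that mostly vary along 2 directions.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5)) \
    + 0.01 * rng.normal(size=(200, 5))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

Z = Xc @ Vt[:2].T                      # project onto top-2 principal components
explained = (S[:2] ** 2).sum() / (S ** 2).sum()
print(Z.shape, round(explained, 3))    # (200, 2); explained fraction close to 1
```

Because the data is nearly rank-2, the first two components capture almost all of the variance, which is exactly the "hidden structure" unsupervised learning is after.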
Autoencoders
Unsupervised approach for learning a lower-dimensional feature
representation from unlabeled training data.
Originally: linear + nonlinearity (sigmoid)
Later: deep, fully connected
Later: ReLU CNN
z is usually smaller than x (dimensionality reduction).
Q: Why dimensionality reduction?
A: We want the features to capture meaningful factors of variation in the data.
Credit: cs231n, Stanford
Autoencoders
How do we learn this feature representation?
Train the network so that the features can be used to reconstruct the
original data: “autoencoding” means encoding the data itself.
Input data -> encoder (4-layer conv) -> features z -> decoder (4-layer upconv) -> reconstructed data.
Training minimizes an L2 loss between the input and its reconstruction.
Note that this doesn’t use labels!
Credit: cs231n, Stanford
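As a concrete (hypothetical) illustration, a linear autoencoder trained with an L2 reconstruction loss fits in a few lines of NumPy; the 4-layer conv/upconv networks above play the same roles as the two weight matrices here:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled data: 500 points in 10-D lying near a 3-D subspace.
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 10))

d, k = 10, 3                           # input dim, bottleneck dim (z smaller than x)
We = 0.1 * rng.normal(size=(d, k))     # encoder weights
Wd = 0.1 * rng.normal(size=(k, d))     # decoder weights
lr = 1e-3

def l2_loss(X, We, Wd):
    X_hat = (X @ We) @ Wd              # encode, then decode
    return ((X - X_hat) ** 2).mean()

loss_before = l2_loss(X, We, Wd)
for _ in range(3000):
    Z = X @ We                         # features z
    X_hat = Z @ Wd                     # reconstruction
    G = 2 * (X_hat - X) / X.size       # d(loss)/d(X_hat)
    grad_Wd = Z.T @ G                  # backprop through the decoder
    grad_We = X.T @ (G @ Wd.T)         # backprop through the encoder
    We -= lr * grad_We
    Wd -= lr * grad_Wd

loss_after = l2_loss(X, We, Wd)
print(loss_before > loss_after)        # reconstruction loss decreases
```

No labels appear anywhere: the data itself is the training target, which is the whole point of "autoencoding".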
Autoencoders
After training, throw away the decoder and keep only the encoder.
Credit: cs231n, Stanford
Autoencoders
The encoder can be used to initialize a supervised model: attach a
classifier with its own loss function (softmax, etc.), then fine-tune the
encoder jointly with the classifier, training for the final task
(sometimes with small data).
Credit: cs231n, Stanford
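A minimal sketch of this reuse, assuming a toy fixed encoder and a logistic-regression head (the encoder weights below are random stand-ins for pretrained ones; full fine-tuning would also update them):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend this encoder was pretrained as part of an autoencoder
# (here it is just a fixed random projection, for illustration only).
We = rng.normal(size=(10, 3))

def encoder(X):
    return np.maximum(X @ We, 0.0)     # features z = ReLU(x We)

# Small labeled dataset for the final task: two Gaussian blobs.
X = np.vstack([rng.normal(loc=-2.0, size=(50, 10)),
               rng.normal(loc=+2.0, size=(50, 10))])
y = np.array([0] * 50 + [1] * 50)

Z = encoder(X)                         # reuse the encoder's features
w = np.zeros(3); b = 0.0               # logistic-regression head

for _ in range(3000):                  # train only the head here; joint
    p = 1 / (1 + np.exp(-(Z @ w + b))) # fine-tuning would also update We
    g = (p - y) / len(y)               # gradient of the cross-entropy loss
    w -= 0.01 * Z.T @ g
    b -= 0.01 * g.sum()

acc = ((Z @ w + b > 0).astype(int) == y).mean()
print(acc)
```

The appeal is that the encoder was trained on unlabeled data, so the labeled set for the final task can be small.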
Generative Adversarial Networks
Sample from a simple distribution, e.g. random noise.
Learn transformation to training distribution.
Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Credit: cs231n, Stanford
A neural network can be used to represent this complex transformation!
Training GANs: Two-player game
Generator network: try to fool the discriminator by generating real-looking images
Discriminator network: try to distinguish between real and fake images
Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Credit: cs231n, Stanford
(Fake and real example images copyright Emily Denton et al. 2015; training figure from Ian Goodfellow’s NIPS talk.)
• Discriminator (θd) wants to maximize the objective so that D(x) is close to 1 (real)
and D(G(z)) is close to 0 (fake).
• Generator (θg) wants to minimize the objective so that D(G(z)) is close to 1
(the discriminator is fooled into thinking the generated G(z) is real).
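Written out, the minimax objective from Goodfellow et al. (2014) is:

```latex
\min_{\theta_g}\;\max_{\theta_d}\;
\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D_{\theta_d}(x)\big]
+ \mathbb{E}_{z \sim p(z)}\big[\log\big(1 - D_{\theta_d}(G_{\theta_g}(z))\big)\big]
```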
In practice, optimizing this generator objective (minimizing
log(1 - D(G(z)))) does not work well!
Instead of minimizing the likelihood of the discriminator being correct,
maximize the likelihood of the discriminator being wrong: the generator
does gradient ascent on log D(G(z)) rather than descent on log(1 - D(G(z))).
This is the same objective of fooling the discriminator, but it gives a
higher gradient signal for bad samples => works much better!
Standard in practice.
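To see why, compare the gradient magnitudes of the two generator losses as a function of D(G(z)) (a small illustrative NumPy check, not from the lecture):

```python
import numpy as np

# D(G(z)) for generated samples, from "very bad" (0.01) to "convincing" (0.9).
d = np.array([0.01, 0.1, 0.5, 0.9])

# Original (saturating) loss: minimize log(1 - D(G(z))).
# |d/dD log(1 - D)| = 1 / (1 - D)
grad_saturating = 1.0 / (1.0 - d)

# Non-saturating loss: maximize log D(G(z)).
# |d/dD log D| = 1 / D
grad_nonsaturating = 1.0 / d

for di, gs, gn in zip(d, grad_saturating, grad_nonsaturating):
    print(f"D(G(z))={di:>4}: |grad| saturating={gs:6.2f}, non-saturating={gn:6.2f}")
# For bad samples (D(G(z)) near 0) the non-saturating loss gives a much
# larger gradient; the original loss saturates exactly where learning is
# most needed.
```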
GAN training algorithm: alternate between k steps of updating the
discriminator (gradient ascent on the discriminator objective) and one
step of updating the generator.
Some find k = 1 more stable, others use k > 1; there is no best rule.
Recent work (e.g. Wasserstein GAN) alleviates this instability and
trains more stably!
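The alternating updates can be sketched end to end on a toy 1-D problem. This is an illustrative NumPy sketch with a linear generator and logistic discriminator; real GANs use deep networks for both, and the hyperparameters here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

# Toy 1-D setting: real data ~ N(3, 0.5); generator G(z) = a*z + c;
# discriminator D(x) = sigmoid(w*x + b).
a, c = 1.0, 0.0                  # generator parameters (theta_g)
w, b = 0.1, 0.0                  # discriminator parameters (theta_d)
lr, k, batch = 0.05, 1, 64

for step in range(500):
    # --- k discriminator updates: ascend log D(x) + log(1 - D(G(z))) ---
    for _ in range(k):
        x = rng.normal(3.0, 0.5, batch)        # real minibatch
        z = rng.normal(size=batch)             # noise minibatch
        xf = a * z + c                         # fake minibatch
        dr, df = sigmoid(w * x + b), sigmoid(w * xf + b)
        gw = ((1 - dr) * x - df * xf).mean()   # gradient of the objective
        gb = ((1 - dr) - df).mean()
        w += lr * gw                           # gradient *ascent*
        b += lr * gb
    # --- one generator update: ascend log D(G(z)) (non-saturating loss) ---
    z = rng.normal(size=batch)
    xf = a * z + c
    df = sigmoid(w * xf + b)
    gx = (1 - df) * w                          # d log D / d xf
    ga, gc = (gx * z).mean(), gx.mean()
    a += lr * ga
    c += lr * gc

print(round(c, 2))   # generator offset shifts from 0 toward the data mean (3.0)
```

Even in this toy setting the two players chase each other: the discriminator keeps relocating its decision boundary while the generator's samples drift into the real data's region.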
After training, use the generator network to generate new images.
Generative Adversarial Nets
Generated samples (MNIST database, Toronto Face Database (TFD)), shown
alongside nearest neighbors from the training set.
Generated samples (CIFAR-10 database): one panel from a fully connected
model, another from a convolutional discriminator with a
“deconvolutional” generator; nearest neighbors from the training set
are shown for comparison.
DCGAN
Deep Convolutional Generative Adversarial Nets
❖ Generator is an upsampling network with fractionally-strided convolutions.
Generator
Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016
❖ Discriminator is a convolutional network.
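Fractionally-strided (transposed) convolution can be illustrated directly: each input pixel stamps a scaled copy of the kernel into the output at stride-spaced positions, so the spatial size grows. A minimal NumPy sketch (single channel, no padding; DCGAN's real layers are multi-channel with padding):

```python
import numpy as np

def fractionally_strided_conv(x, kernel, stride=2):
    """Transposed (fractionally-strided) convolution, no padding.

    Each input pixel "stamps" a scaled copy of the kernel onto the
    output, with stamps placed `stride` pixels apart -- this is how
    the DCGAN generator upsamples its feature maps.
    """
    n, m = x.shape
    k = kernel.shape[0]
    out = np.zeros(((n - 1) * stride + k, (m - 1) * stride + k))
    for i in range(n):
        for j in range(m):
            out[i * stride:i * stride + k,
                j * stride:j * stride + k] += x[i, j] * kernel
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4))        # small input feature map
kernel = rng.normal(size=(4, 4))   # learned filter (random here)
y = fractionally_strided_conv(x, kernel, stride=2)
print(y.shape)                     # (10, 10): spatial size (n-1)*stride + k
```

Stacking several such layers (with padding chosen so each one exactly doubles the resolution) takes the DCGAN generator from a noise vector to a full-size image.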
Generated bedrooms after one training pass through the LSUN dataset. Amazing!
Image-to-Image Translation
• Conditional GAN1
• Cycle-Consistent Adversarial Network2
• Dual GAN3
1. Mirza, Mehdi, and Simon Osindero. "Conditional generative adversarial nets." arXiv preprint
arXiv:1411.1784 (2014).
2. Zhu, Jun-Yan, et al. Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial
Networks. CVPR 2017.
3. Yi, Zili, et al. DualGAN: Unsupervised Dual Learning for Image-To-Image Translation. CVPR 2017.
Slide Credit: Kishan Babu, PhD Student, IIIT Sri City
Image-to-Image Translation with Conditional Adversarial Networks
Isola et al., CVPR 2017
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu et al., CVPR 2017
“The GAN Zoo”
https://github.com/hindupuravinash/the-gan-zoo
And many more…
GANs: Things to Remember
GANs take a game-theoretic approach: learn to generate from the training
distribution through a two-player game.
Pros:
- Beautiful, state-of-the-art samples!
Cons:
- Trickier / more unstable to train
- Can’t solve inference queries such as p(x), p(z|x)
Active areas of research:
- Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others)
- Conditional GANs, GANs for all kinds of applications