Deep Learning
Dr. Pratik Narang
Department of CSIS, BITS Pilani, Pilani Campus
Lecture 11
Autoencoders
Dimensionality reduction
• In machine learning, dimensionality reduction is the process of
reducing the number of features that describe some data.
• This reduction is done either by selection (only some existing
features are conserved) or by extraction (a reduced number of new
features are created based on the old features)
• Useful in many situations that require low dimensional data (data
visualisation, data storage, heavy computation…).
• Commonly used approaches: PCA, ICA
Dimensionality reduction
Let us call the encoder the process that produces the “new features”
representation from the “old features” representation (by selection or
by extraction), and the decoder the reverse process.
Dimensionality reduction can then be interpreted as data compression,
where the encoder compresses the data (from the initial space to the
encoded space, also called the latent space) whereas the decoder
decompresses it.
Source: https://towardsdatascience.com/
Principal components analysis (PCA)
The idea of PCA is to build n_e new independent features
that are linear combinations of the n_d old features, such
that the projections of the data onto the subspace
defined by these new features are as close as possible to
the initial data (in terms of Euclidean distance).
In other words, PCA is looking for the best linear subspace
of the initial space (described by an orthogonal basis of
new features) such that the error of approximating the
data by their projections on this subspace is as small as
possible.
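As a rough illustration, here is a minimal NumPy sketch of this idea, computing the best n_e-dimensional linear subspace via the SVD; the toy data and the choice n_e = 2 are illustrative assumptions, not values from the slides:

```python
import numpy as np

# Toy data: 200 samples with n_d = 5 original features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

n_e = 2  # number of new (latent) features to keep

# Center the data, then take the top n_e right singular vectors
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
W = Vt[:n_e].T                     # orthogonal basis of the best linear subspace

# "Encoder": project onto the subspace; "decoder": map back to the input space
codes = X_centered @ W             # shape (200, n_e)
X_reconstructed = codes @ W.T + X.mean(axis=0)

# Average squared Euclidean reconstruction error that PCA minimizes
print(np.mean(np.sum((X - X_reconstructed) ** 2, axis=1)))
```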
Generative modeling
Supervised vs unsupervised learning
Why generative modelling?
Debiasing
Outlier detection
Latent variable models – Autoencoders and GANs
What is a latent variable?
Plato, Republic
Autoencoders
Typical DNNs characteristics
So far, the deep learning models we have seen have several things in common:
• Input layer: a (possibly vectorized) quantitative
representation of the data
• Hidden layer(s): apply transformations with nonlinearities
• Output layer: the result for classification, regression,
translation, segmentation, etc.
• The models are used for supervised learning
Example
Source: https://cse.iitkgp.ac.in/~sudeshna/courses/DL18/
Changing the objective!
Now we will talk about unsupervised learning with Deep Neural Networks
Source: https://cse.iitkgp.ac.in/~sudeshna/courses/DL18/
Autoencoders: definition
Autoencoders are neural networks that are trained to copy their inputs to
their outputs.
• Usually constrained in particular ways to make this task more difficult.
• They compress the input into a lower-dimensional code and then
reconstruct the output from this representation. The code is a compact
“summary” or “compression” of the input, also called the latent-space
representation.
• Structure is almost always organized into an encoder network, f, and a
decoder network, g: model = g(f(x))
• Trained by gradient descent with a reconstruction loss that measures
the difference between input and output, e.g. the MSE:
L(x, g(f(x))) = || x - g(f(x)) ||^2
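A minimal PyTorch sketch of this encoder/decoder structure and the MSE reconstruction loss is given below; the 784-dimensional input (e.g. a flattened 28x28 image) and the layer sizes are illustrative assumptions, not values given in the slides:

```python
import torch
import torch.nn as nn

# Encoder f and decoder g; the model computes g(f(x))
encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 32))   # f
decoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 784))   # g
model = nn.Sequential(encoder, decoder)

loss_fn = nn.MSELoss()                      # reconstruction loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(16, 784)                     # a dummy batch of flattened inputs
x_hat = model(x)                            # g(f(x))
loss = loss_fn(x_hat, x)                    # || x - g(f(x)) ||^2 averaged over the batch
loss.backward()
optimizer.step()
```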
Autoencoders
Reconstruction quality
Autoencoders for representation learning
Autoencoders
Autoencoders are mainly a dimensionality reduction (or compression)
algorithm with a couple of important properties:
Data-specific: They are only able to meaningfully compress data similar
to what they have been trained on. Since they learn features specific
to the given training data, they are different from a standard data
compression algorithm like gzip.
Lossy: The output of the autoencoder will not be exactly the same as
the input; it will be a close but degraded representation.
Unsupervised: Autoencoders are considered an unsupervised learning
technique since they don’t need explicit labels to train on.
Undercomplete Autoencoders
Undercomplete autoencoders are defined to have a hidden layer h with a
smaller dimension than the input layer.
• The network must model x in a lower-dimensional space and map the latent
space accurately back to the input space.
• Encoder network: a function that returns a useful, compressed representation
of the input.
• If the network has only linear transformations, the encoder learns the same
subspace as PCA. With typical nonlinearities, the network learns a generalized,
more powerful version of PCA.
Source: https://cse.iitkgp.ac.in/~sudeshna/courses/DL18/
Architecture
Source: https://towardsdatascience.com/
Training
Four hyperparameters need to be set before training an autoencoder:
Code size: number of nodes in the middle layer. Smaller size results in
more compression.
Number of layers: the autoencoder can be as deep as we like
Number of nodes per layer: a stacked autoencoder is one where the
layers are stacked one after another. Stacked autoencoders usually look
like a “sandwich”: the number of nodes per layer decreases with each
subsequent layer of the encoder and increases back in the decoder. The
decoder is also symmetric to the encoder in terms of layer structure.
Loss function: we use either mean squared error (MSE) or binary
cross-entropy. If the input values are in the range [0, 1] we typically
use cross-entropy; otherwise we use MSE.
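As an illustration, a stacked “sandwich” autoencoder with these four hyperparameters made explicit might look like the following PyTorch sketch (all sizes and the choice of binary cross-entropy are illustrative assumptions):

```python
import torch.nn as nn

code_size = 32          # hyperparameter 1: code size
# hyperparameters 2 and 3: number of layers and nodes per layer
# (encoder sizes decrease, decoder mirrors them back up)
encoder = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, code_size),
)
decoder = nn.Sequential(
    nn.Linear(code_size, 128), nn.ReLU(),
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Sigmoid(),   # outputs constrained to [0, 1]
)
autoencoder = nn.Sequential(encoder, decoder)

# hyperparameter 4: loss function; inputs in [0, 1], so binary cross-entropy
loss_fn = nn.BCELoss()
```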
Training
• We can make the autoencoder very powerful by increasing the number
of layers, nodes per layer and most importantly the code size.
• Increasing these hyperparameters will let the autoencoder learn
more complex codings.
• But we should be careful not to make it too powerful. Otherwise the
autoencoder will simply learn to copy its inputs to the output, without
learning any meaningful representation: it will just mimic the identity
function.
• This is why we prefer a “sandwich” architecture and deliberately keep
the code size small.
• Since the coding layer has a lower dimensionality than the input data,
the autoencoder is said to be undercomplete. It won’t be able to directly
copy its inputs to the output, and will be forced to learn intelligent
features.
Denoising autoencoders
Another way to force the autoencoder to learn useful
features is by adding random noise to its inputs and
making it recover the original noise-free data.
• This way the autoencoder can’t simply copy the input to
its output because the input also contains random noise.
• We are asking it to subtract the noise and produce the
underlying meaningful data.
• This is called a denoising autoencoder.
Example
Source: https://towardsdatascience.com/
Denoising autoencoders
• We introduce a corruption process C(x̃ | x), which represents a
conditional distribution over corrupted samples x̃, given a data sample
x. The autoencoder then learns a reconstruction distribution
p_reconstruct(x | x̃) estimated from training pairs (x, x̃) as follows:
• Sample a training example x from the training data.
• Sample a corrupted version x̃ from C(x̃ | x = x).
• Use (x, x̃) as a training example for estimating the autoencoder
reconstruction distribution p_reconstruct(x | x̃) = p_decoder(x | h), with h the
output of the encoder f(x̃) and p_decoder typically defined by a decoder g(h)
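A sketch of a single denoising training step in PyTorch, assuming additive Gaussian noise as the corruption process C(x̃ | x); the architecture, noise level, and MSE loss are illustrative assumptions:

```python
import torch
import torch.nn as nn

# A small sandwich autoencoder (sizes are illustrative)
autoencoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32),                 # encoder f
    nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784), nn.Sigmoid(),   # decoder g
)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(16, 784)                        # clean training batch
x_tilde = x + 0.2 * torch.randn_like(x)        # corrupted version x~ sampled from C(x~ | x)

x_hat = autoencoder(x_tilde)                   # reconstruct from the corrupted input
loss = loss_fn(x_hat, x)                       # compare with the clean x, not with x~
loss.backward()
optimizer.step()
```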
Sparse autoencoders
A third method to force the autoencoder to learn useful
features: using regularization
We can regularize the autoencoder by using a sparsity
constraint such that only a fraction of the nodes would have
nonzero values, called active nodes.
Add a penalty term to the loss function such that only a
fraction of the nodes become active. This forces the
autoencoder to represent each input as a combination of a
small number of nodes, and pushes it to discover
interesting structure in the data.
This method works even if the code size is large, since only a
small subset of the nodes will be active at any time.
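One common way to realise this constraint, sketched below, is to add an L1 penalty on the code activations to the reconstruction loss; the architecture and the penalty weight lambda_sparse are illustrative assumptions:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 128), nn.ReLU())
decoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 784))
lambda_sparse = 1e-3                      # strength of the sparsity penalty (illustrative)

x = torch.rand(16, 784)
h = encoder(x)                            # code: only a few units should end up active
x_hat = decoder(h)

reconstruction = nn.functional.mse_loss(x_hat, x)
sparsity = h.abs().mean()                 # L1 penalty pushes most activations toward zero
loss = reconstruction + lambda_sparse * sparsity
loss.backward()
```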
Applications
Data denoising
Dimensionality reduction
Information retrieval
Content generation (Generative models) - VAEs
Content generation
At first sight, we could be tempted to think that, if the latent
space is regular enough (well “organized” by the encoder
during the training process), we could take a point
randomly from that latent space and decode it to get
new content.
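In code, this naive generative attempt would amount to something like the sketch below (the decoder architecture and latent size are illustrative assumptions, and the decoder is assumed to be already trained):

```python
import torch
import torch.nn as nn

latent_dim = 32
decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                        nn.Linear(128, 784), nn.Sigmoid())   # stands in for a trained decoder g

z = torch.randn(1, latent_dim)     # pick a random point in the latent space
x_new = decoder(z)                 # decode it, hoping to obtain plausible new content
```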
Content generation
However, the regularity of the latent
space for autoencoders is a difficult
point that depends on:
• the distribution of the data in the initial space,
• the dimension of the latent space,
• the architecture of the encoder.
So, it is pretty difficult (if not impossible) to ensure, a priori, that the
encoder will organize the latent space in a smart way compatible with
the generative process.
The autoencoder is solely trained to encode and decode with as little loss
as possible, no matter how the latent space is organised.
Thus, it is natural that, during the training, the network takes advantage of
any overfitting possibilities to achieve its task as well as it can.