1.
What do you mean by a single-layer perceptron? Show that a single-layer perceptron
can be used to represent the Boolean functions AND, OR, NOR, and NAND.
Single Layer Perceptron: A simple neural network model with a single layer of
output neurons connected to input features. It applies a linear combination of
inputs and weights, followed by an activation function.
Boolean Functions: The perceptron can represent linearly separable boolean
functions like:
o AND: The perceptron fires when both inputs are 1. Weights can be
positive, and bias must be negative enough to prevent firing when any
input is 0.
o OR: It fires if either input is 1. Weights are positive, with bias slightly
negative.
o NAND: The complement of AND; with negative weights and a sufficiently
positive bias, the perceptron fires for every input combination except
when both inputs are 1.
o NOR: The complement of OR; negative weights with a small positive bias make
it fire only when both inputs are 0.
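A minimal NumPy sketch of these four gates (the weights and biases below are one possible choice among many, shown for illustration only):

import numpy as np

def perceptron(x, w, b):
    """Single-layer perceptron: fires (outputs 1) when w . x + b > 0."""
    return int(np.dot(w, x) + b > 0)

# One possible choice of weights and biases (many other values work equally well).
gates = {
    "AND":  (np.array([1.0, 1.0]), -1.5),   # fires only when both inputs are 1
    "OR":   (np.array([1.0, 1.0]), -0.5),   # fires when at least one input is 1
    "NAND": (np.array([-1.0, -1.0]), 1.5),  # complement of AND
    "NOR":  (np.array([-1.0, -1.0]), 0.5),  # fires only when both inputs are 0
}

for name, (w, b) in gates.items():
    outputs = [perceptron(np.array(x), w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]]
    print(f"{name}: {outputs}")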
2. What is the significance of the activation function used in a deep neural network?
Explain different types of activation functions used in deep neural networks.
Significance:
o Introduces non-linearity, enabling the network to model complex
patterns.
o Helps in controlling output ranges (e.g., sigmoid constrains to 0-1).
o Aids in gradient-based optimization by propagating errors during
backpropagation.
Types of Activation Functions:
1. Sigmoid:
Range (0, 1), used in binary classification.
Problem: Vanishing gradients for inputs of large magnitude (saturation).
2. Tanh:
Range (-1, 1), used in RNNs.
Better than sigmoid but still faces vanishing gradient issues.
3. ReLU (Rectified Linear Unit):
Outputs 0 for negative inputs, linear for positive inputs.
Fast convergence but can cause "dead neurons".
4. Leaky ReLU:
Allows small negative values, fixing the "dead neuron" issue.
5. Softmax:
Converts logits into probabilities, used in multi-class
classification.
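The following NumPy sketch implements the five activation functions listed above (the sample input values are arbitrary, chosen only to show the output ranges):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))           # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)                          # squashes to (-1, 1)

def relu(x):
    return np.maximum(0.0, x)                  # 0 for negatives, identity for positives

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)       # small slope for negatives

def softmax(logits):
    z = logits - logits.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()                         # probabilities summing to 1

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("sigmoid:   ", sigmoid(x))
print("tanh:      ", tanh(x))
print("relu:      ", relu(x))
print("leaky relu:", leaky_relu(x))
print("softmax:   ", softmax(x))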
3. What do you mean by the vanishing gradient problem that generally occurs in a
deep neural network? How is it distinguished from the gradient descent problem?
Vanishing Gradient Problem:
o Occurs when gradients become extremely small during backpropagation,
especially with deep networks.
o This prevents the network from learning effectively as the updates to
weights become negligible.
o Mainly arises when using Sigmoid or Tanh activation functions due to
their small gradients in the saturation region.
Distinguished from Gradient Descent Problem:
o Vanishing Gradient refers to diminishing gradients in deep layers,
slowing down learning.
o Gradient Descent is the optimization process that can face issues like
slow convergence or saddle points, but it's not specifically about the
gradients disappearing.
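A small numerical illustration of why sigmoid saturation causes vanishing gradients (the 20-layer depth and random pre-activations are assumptions made for the demo): the sigmoid derivative is at most 0.25, so the chain-rule product across many layers shrinks toward zero.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # never exceeds 0.25

# Pre-activation values for a chain of 20 layers (assumed for illustration).
pre_activations = np.random.default_rng(0).normal(size=20)

# By the chain rule, the gradient reaching the first layer is (up to the weights)
# the product of the per-layer sigmoid derivatives.
grad_factor = np.prod(sigmoid_grad(pre_activations))
print(f"product of 20 sigmoid derivatives: {grad_factor:.2e}")  # vanishingly small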
4. Explain feedforward and backpropagation algorithm.
Feedforward Algorithm:
o The process of passing input data through the network layer by layer to get
the final output.
o Each layer applies weights and an activation function to compute the
outputs, which are passed to the next layer.
Backpropagation Algorithm:
o Step 1: Compute loss by comparing the predicted output with the actual
output using a loss function.
o Step 2: Calculate the gradient of the loss with respect to the weights using the
chain rule (i.e., partial derivatives).
o Step 3: Update the weights in the direction of the negative gradient to
minimize the loss (gradient descent).
Example:
In a network with 2 layers, backpropagation will compute the gradients for each
layer starting from the output, updating weights to reduce error.
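A NumPy sketch of one feedforward/backpropagation cycle for a 2-layer network with sigmoid activations and MSE loss (the layer sizes, toy data, and learning rate are arbitrary choices for illustration):

import numpy as np

rng = np.random.default_rng(0)

# Toy data and a tiny 2-layer network (sizes chosen arbitrarily for illustration).
X = rng.normal(size=(4, 3))                      # 4 samples, 3 features
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)    # hidden layer
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)    # output layer
lr = 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(1000):
    # Feedforward: layer by layer, weights then activation.
    z1 = X @ W1 + b1
    a1 = sigmoid(z1)
    z2 = a1 @ W2 + b2
    y_hat = sigmoid(z2)

    loss = np.mean((y_hat - y) ** 2)             # Step 1: MSE loss

    # Step 2: gradients via the chain rule, starting from the output layer.
    d_z2 = 2 * (y_hat - y) * y_hat * (1 - y_hat) / len(X)
    d_W2, d_b2 = a1.T @ d_z2, d_z2.sum(axis=0)
    d_a1 = d_z2 @ W2.T
    d_z1 = d_a1 * a1 * (1 - a1)
    d_W1, d_b1 = X.T @ d_z1, d_z1.sum(axis=0)

    # Step 3: gradient descent update (move against the gradient).
    W2 -= lr * d_W2; b2 -= lr * d_b2
    W1 -= lr * d_W1; b1 -= lr * d_b1

print(f"final loss: {loss:.4f}")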
5. Why is batch normalization required while training a deep neural network model?
Reasons for Batch Normalization:
1. Reduces Internal Covariate Shift: Stabilizes the distribution of inputs to
each layer, reducing variation in input distribution and speeding up
learning.
2. Faster Convergence: Allows higher learning rates by normalizing inputs,
helping faster and more stable training.
3. Mitigates Vanishing/Exploding Gradients: Normalizes activations,
keeping gradients well-scaled, which is crucial in deep networks.
4. Regularization: Reduces overfitting by introducing noise through mini-
batch normalization.
5. Enables Deeper Networks: Helps train deeper networks by stabilizing
the learning process.
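A minimal sketch of the batch normalization computation itself, assuming a mini-batch of shape (batch_size, num_features); gamma and beta are the learnable scale and shift parameters:

import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift."""
    mu = x.mean(axis=0)                      # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # zero mean, unit variance
    return gamma * x_hat + beta              # restore representational capacity

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(8, 4))   # poorly scaled inputs
out = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print("mean ~0:", out.mean(axis=0).round(3))
print("std  ~1:", out.std(axis=0).round(3))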
6. What are the problems that a neural network model may face while training it?
(Overfitting, Covariate shift, etc.)
Problems:
1. Overfitting:
The model performs well on training data but poorly on unseen
data.
Solution: Use regularization techniques (dropout, L2) or more data.
2. Covariate Shift:
The distribution of the input data changes over time.
Solution: Batch normalization or domain adaptation techniques.
3. Vanishing Gradient:
Occurs in deep networks; resolved using ReLU, batch
normalization.
4. Exploding Gradient:
Gradients grow uncontrollably, leading to unstable training.
Solution: Gradient clipping.
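A short sketch of gradient clipping by global norm, one common remedy for exploding gradients (the gradient values and the max_norm threshold are illustrative only):

import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-12))
    return [g * scale for g in grads]

# Exploding gradients (values are made up for the example).
grads = [np.array([30.0, -40.0]), np.array([120.0])]
clipped = clip_by_global_norm(grads, max_norm=5.0)
print(clipped)   # same direction, but global norm capped at 5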
7. With the help of suitable diagrams, explain the basic building blocks of a
convolutional neural network and their operation.
Basic Building Blocks of CNN:
1. Convolutional Layer:
Applies filters/kernels to the input image, extracting feature maps.
Uses convolution operation with stride and padding to control
output size.
2. ReLU Activation:
Introduces non-linearity after each convolution operation.
3. Pooling Layer:
Reduces the spatial dimensions of the feature maps (down-
sampling).
Max pooling is the most common type.
4. Fully Connected Layer:
Flattens the output from the convolutional layers and connects it
to the output layer, performing final classification or regression
tasks.
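A minimal PyTorch sketch stacking the four building blocks, assuming 1-channel 28x28 inputs and 10 output classes (both are assumptions, not given in the question):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1, padding=1),  # convolution
    nn.ReLU(),                                # non-linearity
    nn.MaxPool2d(kernel_size=2),              # down-sampling: 28x28 -> 14x14
    nn.Flatten(),                             # prepare for the fully connected layer
    nn.Linear(8 * 14 * 14, 10),               # final classification layer
)

x = torch.randn(1, 1, 28, 28)                 # dummy image batch
print(model(x).shape)                         # torch.Size([1, 10])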
8. Explain the difference between overfitting and underfitting.
Overfitting:
o The model learns the noise and details of the training data too well,
resulting in poor generalization to new data.
o Symptoms: Low training error, high test error.
o Solution: Use more data, regularization, or dropout.
Underfitting:
o The model fails to learn the underlying patterns in the data, resulting in
poor performance on both training and test data.
o Symptoms: High training and test error.
o Solution: Use a more complex model, reduce bias, or increase model
capacity.
9. Explain different types of loss functions used in the context of deep neural
networks.
Types of Loss Functions:
1. Mean Squared Error (MSE):
Used for regression tasks, it calculates the average of squared
differences between predicted and actual values.
2. Cross-Entropy Loss:
Used for classification tasks, it measures the difference between
the actual label distribution and predicted label distribution.
3. Hinge Loss:
Used for SVMs, encourages correct classification with a margin.
4. Huber Loss:
A combination of MSE and MAE, robust to outliers, used for
regression tasks.
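The four losses as short NumPy functions (the sample labels and predictions are arbitrary, used only to exercise each loss):

import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, y_prob, eps=1e-12):
    # y_true: one-hot labels, y_prob: predicted class probabilities
    return -np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1))

def hinge(y_true, scores):
    # y_true in {-1, +1}, scores are raw margins
    return np.mean(np.maximum(0.0, 1.0 - y_true * scores))

def huber(y_true, y_pred, delta=1.0):
    err = y_true - y_pred
    quadratic = 0.5 * err ** 2                       # MSE-like near zero
    linear = delta * (np.abs(err) - 0.5 * delta)     # MAE-like for outliers
    return np.mean(np.where(np.abs(err) <= delta, quadratic, linear))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.0])))
print(cross_entropy(np.array([[0, 1]]), np.array([[0.2, 0.8]])))
print(hinge(np.array([1, -1]), np.array([0.3, -2.0])))
print(huber(np.array([0.0, 0.0]), np.array([0.5, 4.0])))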
10. Explain the concept of parameter sharing in deep learning.
Parameter Sharing:
o Refers to reusing the same set of parameters (weights) across different
parts of the model, most commonly seen in Convolutional Neural
Networks (CNNs).
o In CNNs: Filters are shared across the input image, meaning the same
filter is applied to multiple parts of the image to detect similar features
like edges, textures, etc.
o Benefit: Reduces the number of parameters, making the model more
efficient and reducing overfitting.
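A back-of-the-envelope comparison of parameter counts, assuming a 32x32 grayscale input and a single shared 3x3 filter versus a fully connected layer of the same output size (sizes are assumptions for illustration):

# One shared 3x3 filter + bias, applied everywhere on the image.
conv_params = 3 * 3 * 1 + 1                     # = 10
# A dense layer connecting every input pixel to every output unit.
dense_params = (32 * 32) * (32 * 32) + 32 * 32  # = 1,049,600
print(conv_params)
print(dense_params)   # parameter sharing cuts this drastically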
11. Explain different types of regularization techniques used in the deep neural
network.
Regularization Techniques:
1. L2 Regularization (Ridge):
Adds a penalty proportional to the square of the magnitude of
weights, discouraging large weights.
2. L1 Regularization (Lasso):
Adds a penalty proportional to the absolute value of weights,
encouraging sparsity.
3. Dropout:
Randomly drops neurons during training, preventing co-adaptation
of features and reducing overfitting.
4. Early Stopping:
Stops training when validation performance starts to degrade,
avoiding overfitting.
5. Data Augmentation:
Artificially increases the training data by applying transformations
such as rotation, flipping, cropping, and scaling.
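A brief NumPy sketch of L1/L2 penalties and inverted dropout (the penalty strength 0.01 and keep probability 0.8 are assumed values):

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))

# L2 penalty added to the data loss (discourages large weights).
l2_penalty = 0.01 * np.sum(W ** 2)

# L1 penalty (pushes individual weights toward exactly zero, i.e., sparsity).
l1_penalty = 0.01 * np.sum(np.abs(W))

# Inverted dropout: randomly zero activations during training and rescale
# so the expected activation stays unchanged at test time.
def dropout(a, keep_prob=0.8):
    mask = rng.random(a.shape) < keep_prob
    return a * mask / keep_prob

a = rng.normal(size=(2, 5))
print(l2_penalty, l1_penalty)
print(dropout(a))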
12. Explain the working principle of Generative Adversarial Network (GAN) with a
suitable diagram.
Working Principle of Generative Adversarial Networks (GANs)
Generative Adversarial Networks (GANs) are a class of machine learning frameworks
designed to generate new data samples that resemble a given training dataset. They
consist of two neural networks—the Generator and the Discriminator—that work
against each other in a competitive setting.
Components of GANs
1. Generator (G):
o The generator network creates fake data samples from random noise
(latent space).
o It learns to produce data that resembles the training data as closely as
possible.
o The goal of the generator is to "fool" the discriminator into thinking the
generated data is real.
2. Discriminator (D):
o The discriminator network evaluates data samples and classifies them as
real (from the training dataset) or fake (from the generator).
o It learns to distinguish between real and generated data.
o The goal of the discriminator is to correctly identify real vs. fake samples.
Training Process
The training process of GANs involves the following steps:
1. Training the Discriminator:
o A batch of real data is fed into the discriminator to compute the loss (how
well it classifies real data).
o A batch of fake data generated by the generator is fed into the
discriminator to compute the loss (how well it classifies fake data).
o The discriminator's weights are updated to maximize its accuracy in
distinguishing real from fake data.
2. Training the Generator:
o The generator produces a new batch of fake data.
o The fake data is sent to the discriminator, and the loss is computed based
on how well the discriminator classifies it as real.
o The generator's weights are updated to minimize this loss, meaning it
learns to create better data to fool the discriminator.
3. Adversarial Training:
o The two networks are trained simultaneously in an adversarial manner.
o The generator aims to improve its ability to create realistic data, while the
discriminator aims to become better at distinguishing between real and
fake data.
o This process continues iteratively, leading to improvements in both
networks.
Diagram
Below is a simplified diagram of the GAN architecture:
   Random Noise (z)               Real Data (x)
          |                             |
          v                             |
  +-----------------+                   |
  |  Generator (G)  |                   |
  +-----------------+                   |
          | Fake Data G(z)              |
          v                             v
        +---------------------------------+
        |        Discriminator (D)        |
        +---------------------------------+
                         |
                         v
                   Real or Fake?
Conclusion
The interaction between the generator and the discriminator forms a minimax game,
where the generator tries to minimize the likelihood of detection by the discriminator,
while the discriminator tries to maximize its accuracy. Over time, as both networks
improve, the generator produces high-quality samples that closely resemble the training
data, achieving the goal of generating realistic outputs.
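A NumPy sketch of the two losses that drive this minimax game, given hypothetical discriminator outputs for real and generated samples (the non-saturating generator loss -log D(G(z)) is used, as is common in practice):

import numpy as np

# Hypothetical discriminator scores in (0, 1): D(x) for real samples, D(G(z)) for fakes.
d_real = np.array([0.9, 0.8, 0.95])   # discriminator's belief that real samples are real
d_fake = np.array([0.2, 0.1, 0.3])    # discriminator's belief that generated samples are real

eps = 1e-12                            # numerical safety for log

# Discriminator maximizes log D(x) + log(1 - D(G(z)));
# equivalently it minimizes the negative of this quantity.
d_loss = -(np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean())

# Generator (non-saturating form) minimizes -log D(G(z)),
# i.e. it is rewarded when the discriminator is fooled.
g_loss = -np.log(d_fake + eps).mean()

print(f"discriminator loss: {d_loss:.4f}")
print(f"generator loss:     {g_loss:.4f}")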
13. Determine the shape of the output matrix when an image of size 19×19 is
convolved with a filter of size 5×5, using padding = 2 and stride = 2.
To determine the shape of the output matrix after a convolution with a given input size,
padding, stride, and filter size, use the formula:
Output size = floor((N + 2P - F) / S) + 1
Here N = 19 (input size), P = 2 (padding), F = 5 (filter size), and S = 2 (stride):
Output size = floor((19 + 2×2 - 5) / 2) + 1 = floor(18 / 2) + 1 = 9 + 1 = 10
The output matrix therefore has shape 10 × 10.
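A one-line check of the formula in Python (values taken directly from the question):

import math

n, p, f, s = 19, 2, 5, 2                       # input size, padding, filter size, stride
out = math.floor((n + 2 * p - f) / s) + 1      # standard convolution output-size formula
print(out, "x", out)                           # 10 x 10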
14. What are the hyperparameters that can be optimized for the batch normalization
technique?
Batch normalization (BN) is a technique used to stabilize and accelerate the training of
deep neural networks by normalizing the inputs of each layer. When using batch
normalization, several hyperparameters can be optimized to improve the model's
performance. Here are the key hyperparameters that can be optimized for batch
normalization:
1. Momentum (β):
Description: Momentum is used to compute the moving average of the mean
and variance of the features.
Typical Values: Commonly set between 0.9 and 0.99.
Impact: A higher momentum value can stabilize the training process by
providing a smoother estimate of the mean and variance, while a lower value can
allow for faster adaptation to changes in the data distribution.
2. Epsilon (ε):
Description: A small constant added to the variance during normalization to
prevent division by zero.
Typical Values: Usually set to a small value like 1e-5 or 1e-3.
Impact: Adjusting epsilon can help prevent numerical instability, especially in
cases where the variance is very small.
3. Learning Rate:
Description: The rate at which the model's weights are updated during training.
Impact: The learning rate can affect how well batch normalization works.
Sometimes, adjusting the learning rate can lead to better convergence when
using batch normalization.
4. Batch Size:
Description: The number of samples processed before the model is updated.
Impact: The size of the batch can influence the estimation of the mean and
variance. Larger batches can provide more stable estimates, while smaller
batches might lead to noisier estimates.
5. Trainable Parameters (γ and β):
Description: In batch normalization, scale (γ) and shift (β) parameters can be
learned during training.
Impact: These parameters allow the network to learn how to adjust the
normalized values, which can improve model performance. Optimizing the initial
values and whether to use them can be critical.
6. Position of Batch Normalization:
Description: Deciding where to apply batch normalization in the network (e.g.,
before or after activation functions).
Impact: The placement can affect model performance. Experimenting with
different positions (before/after activation) can yield different results.
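A hedged PyTorch sketch showing where several of the hyperparameters listed above appear when configuring a batch normalization layer (the channel count and values are illustrative, not recommendations):

import torch.nn as nn

bn = nn.BatchNorm2d(
    num_features=64,   # channels of the preceding convolution (assumed)
    eps=1e-5,          # epsilon added to the variance for numerical stability
    momentum=0.1,      # note: PyTorch's momentum is the update fraction, not the decay beta
    affine=True,       # learn gamma (scale) and beta (shift)
)
print(bn)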
Conclusion
Optimizing these hyperparameters can significantly enhance the performance and
stability of a deep learning model using batch normalization. It often requires a
combination of grid search, random search, or more advanced techniques like Bayesian
optimization to find the best values for these hyperparameters.