
Batch 1 (Marketing Ops)

Q1. What is ReLU Function? What role does it play in Deep Learning?

ReLU (Rectified Linear Unit) is an activation function defined mathematically as: f(x) = max(0, x)

This means it outputs the input directly if positive, and 0 if the input is negative.

Role in Deep Learning:

1. Alleviates the vanishing gradient problem - Unlike sigmoid or tanh functions, ReLU
doesn't saturate in the positive region, allowing for faster learning.

2. Introduces non-linearity - Without non-linear activation functions like ReLU, neural networks would simply be linear regressors.

3. Computational efficiency - ReLU is easy to compute (just a max operation).

4. Sparsity - By outputting exact zeros for negative inputs, ReLU creates sparse
activations, which can be beneficial for model representation.
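
As a quick illustration of the definition above, ReLU can be written in a couple of lines of NumPy (a minimal sketch; the sample values are made up for the example):

python

import numpy as np

def relu(x):
    # Element-wise max(0, x): passes positives through, zeroes out negatives
    return np.maximum(0, x)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])   # example pre-activations (made up)
print(relu(z))                               # [0.  0.  0.  1.5 3. ]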

Q2. Explain why backpropagation is better than brute force?

Backpropagation is vastly superior to brute force methods for neural network training:

A brute-force approach would involve:

• Testing every possible combination of weights

• Requiring exponential computational resources (O(m^n), where m is the number of possible weight values and n is the number of weights)

• Being completely infeasible even for small networks (and modern networks have millions of parameters)

Backpropagation advantages:

• Uses the chain rule of calculus to efficiently calculate gradients

• Computational complexity scales linearly with network size

• Efficiently reuses intermediate calculations (error signals)

• Takes advantage of the network structure to determine the impact of each weight on the
overall error

• Enables practical training of large, complex networks
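
To make the contrast concrete, here is a minimal NumPy sketch of backpropagation on a tiny one-hidden-layer network (illustrative only; the layer sizes, data, and learning rate are arbitrary assumptions, not part of the original answer):

python

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))            # 8 samples, 3 features (made-up data)
y = rng.normal(size=(8, 1))            # regression targets

W1 = rng.normal(size=(3, 4)) * 0.1     # hidden layer weights
W2 = rng.normal(size=(4, 1)) * 0.1     # output layer weights
lr = 0.1

for step in range(100):
    # Forward pass (intermediate values are kept for reuse in the backward pass)
    h = np.maximum(0, X @ W1)          # ReLU hidden activations
    y_hat = h @ W2                     # linear output
    loss = np.mean((y_hat - y) ** 2)   # mean squared error

    # Backward pass: chain rule, reusing the stored forward values
    d_yhat = 2 * (y_hat - y) / len(X)
    dW2 = h.T @ d_yhat
    dh = d_yhat @ W2.T
    dW1 = X.T @ (dh * (h > 0))         # ReLU derivative is 1 where h > 0, else 0

    # Gradient descent update - one cheap pass, no exhaustive search over weights
    W1 -= lr * dW1
    W2 -= lr * dW2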

Q3. Give examples of 2 activation functions & draw them.

Activation Functions: Sigmoid and Tanh


1. Sigmoid Function

• Formula: σ(x) = 1/(1 + e^(-x))

• Output range: (0, 1)

• Historically popular but less used now due to vanishing gradient problems

• Still used in output layers for binary classification problems

2. Tanh Function (Hyperbolic Tangent)

• Formula: tanh(x) = (e^x - e^(-x))/(e^x + e^(-x))

• Output range: (-1, 1)

• Zero-centered, which makes optimization easier in some cases

• Stronger gradients than sigmoid but still suffers from vanishing gradient issues
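
Since the question asks for a drawing, here is a small matplotlib sketch that plots both curves (a minimal example; any plotting setup would do):

python

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)
sigmoid = 1 / (1 + np.exp(-x))   # output range (0, 1)
tanh = np.tanh(x)                # output range (-1, 1), zero-centered

plt.plot(x, sigmoid, label='sigmoid')
plt.plot(x, tanh, label='tanh')
plt.axhline(0, color='gray', linewidth=0.5)
plt.legend()
plt.title('Sigmoid vs Tanh')
plt.show()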

Q4. What is dropout regularization? Explain with Diagram

Dropout Regularization
Dropout Regularization is a technique to prevent overfitting in neural networks by randomly
"dropping out" (temporarily removing) neurons during training.

How it works:

1. During each training iteration, neurons are randomly deactivated with probability p
(typically 0.2-0.5)

2. Forward and backward passes occur only with the remaining neurons

3. At test time, all neurons are used but their outputs are scaled by the keep probability (1 - p) to compensate (with inverted dropout, activations are instead scaled by 1/(1 - p) during training)

Benefits:

• Prevents co-adaptation of neurons (neurons becoming too dependent on each other)

• Forces the network to learn more robust features

• Acts like an ensemble of many different network architectures

• Significantly reduces overfitting

Implementation:

• During training: output = activation(input) * mask, where mask is a randomly generated vector of 0s and 1s

• During testing: output = activation(input) * (1-p) to scale appropriately
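
The two bullets above can be written out directly; the following NumPy sketch assumes p is the drop probability (a simplified illustration, not a library implementation):

python

import numpy as np

rng = np.random.default_rng(42)
p = 0.5                                   # drop probability
a = np.array([0.2, 1.3, 0.7, 2.1, 0.9])   # example activations (made up)

# Training: random binary mask drops each neuron with probability p
mask = (rng.random(a.shape) >= p).astype(float)
train_out = a * mask

# Testing: use all neurons, scale by the keep probability (1 - p)
test_out = a * (1 - p)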


Q5. Draw a typical Neural Network Diagram for handwritten digit recognition

Neural Network for Handwritten Digit Recognition (MNIST)

This neural network for handwritten digit recognition (commonly using the MNIST dataset)
features:

1. Input layer: 28×28 pixels = 784 neurons (one for each pixel in the image)

2. Hidden layers:

o First hidden layer: 256 neurons with ReLU activation

o Second hidden layer: 128 neurons with ReLU activation

3. Output layer: 10 neurons (one for each digit 0-9) with softmax activation

The architecture shows how input images are processed through multiple layers to recognize
handwritten digits. Key components include:

• Image flattening at the beginning

• Dense connections between layers

• ReLU activation in hidden layers

• Softmax function in the output layer for probability distribution across the 10 classes
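
As a sketch of the architecture described above (784 → 256 → 128 → 10), a Keras definition might look like the following (assuming the MNIST images arrive as 28×28 arrays):

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Flatten, Dense

model = Sequential([
    Input(shape=(28, 28)),            # 28x28 grayscale image
    Flatten(),                        # 784-element input vector
    Dense(256, activation='relu'),    # first hidden layer
    Dense(128, activation='relu'),    # second hidden layer
    Dense(10, activation='softmax')   # one probability per digit 0-9
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])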

Q6. Explain why and how image flattening is done in Deep Learning?

Why Image Flattening is Necessary:

1. Input Format Requirement: Traditional neural network layers (fully connected/dense layers) require 1D vector inputs.

2. Dimensional Compatibility: Convolutional layers output 3D tensors, but fully connected layers need 1D inputs.

3. Transition Between Architectures: Serves as a bridge between the convolutional and fully connected parts of a network.

How Image Flattening Works:

1. Mathematical Operation: Conversion of multi-dimensional data (typically 2D or 3D) into a 1D vector.

2. Implementation:

o For a 2D image of size H×W, the flattened vector will have H×W elements.

o For a 3D volume (H×W×C), the flattened vector will have H×W×C elements.

o Elements are arranged sequentially row by row (or channel by channel).

Example: A 28×28 grayscale image (like in MNIST):

• Original format: 28×28 matrix

• After flattening: 784-element vector (28×28 = 784)

Code Implementation (in Python/Keras):

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Flatten, Dense

# In a sequential model
model = Sequential([
    # Input layer - image is 28x28 pixels
    Input(shape=(28, 28)),
    # Flattening layer - converts to a 784-element vector
    Flatten(),
    # Hidden dense layer
    Dense(128, activation='relu'),
    # Output layer - one neuron per digit class
    Dense(10, activation='softmax')
])

Note: While flattening is common, it loses spatial information. For image tasks, CNNs preserve
spatial relationships by using convolutional layers before flattening.
Batch 2

Q1. Where and how Gradient descent is used in Deep learning?

Gradient descent is the fundamental optimization algorithm used in deep learning for training
neural networks.

Where it's used:

• In the training phase of nearly all deep learning models

• To minimize the loss/cost function that measures prediction error

• Across all major architectures: CNNs, RNNs, Transformers, etc.

How it works:

1. Calculate the gradient (direction of steepest increase) of the loss function with respect
to each model parameter

2. Update parameters in the opposite direction of the gradient (to decrease loss)

3. Apply learning rate to control step size

4. Repeat until convergence (minimal improvement in loss)

Types of gradient descent:

• Batch gradient descent: Uses entire dataset per update

• Stochastic gradient descent (SGD): Uses one sample per update

• Mini-batch gradient descent: Uses small batches (most common approach)

Gradient Descent Optimization

A typical gradient-descent diagram shows a loss-function landscape where:


• Red dots represent parameter values during training

• Blue arrows show the direction of updates

• Green dot represents the global minimum (optimal parameters)

• Each step brings the parameters closer to the minimum value

• Parameters are updated using the formula: θ = θ - α∇J(θ)
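
A minimal sketch of the update rule θ = θ - α∇J(θ) on a simple quadratic loss (the loss function and learning rate here are arbitrary choices for illustration):

python

import numpy as np

def loss(theta):
    # Simple convex loss with minimum at theta = 3
    return (theta - 3.0) ** 2

def grad(theta):
    # Analytical gradient of the loss above
    return 2 * (theta - 3.0)

theta = 0.0      # initial parameter
alpha = 0.1      # learning rate

for step in range(50):
    theta = theta - alpha * grad(theta)   # theta = theta - alpha * dJ/dtheta

print(theta)     # approaches 3.0, the minimum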

Q2. Where is the softmax function required in Deep learning?

The softmax function is primarily used in output layers of neural networks for multi-class
classification problems.

Mathematical definition: softmax(z)ᵢ = e^zᵢ / Σ(e^zⱼ) for j=1 to K

Where z is the input vector and K is the number of classes.

Where it's used:

1. Multi-class classification output layers:

o Image recognition (classifying among multiple categories)

o Natural language processing (part-of-speech tagging, named entity recognition)

o Speech recognition (identifying phonemes or words)

2. Attention mechanisms in transformers:

o Used to compute attention weights in transformer architectures

Key properties:

• Converts raw scores (logits) into probabilities (values between 0 and 1)

• Ensures all outputs sum to 1 (proper probability distribution)

• Accentuates the largest input value while suppressing smaller values

• Differentiable, allowing for gradient-based learning

Example use case: In an image classifier with 10 classes (like MNIST digits), the last layer has 10
neurons. Softmax converts these 10 values into probabilities that sum to 1, allowing the model
to predict the most likely digit class.
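
A minimal NumPy sketch of the formula above, with the usual max-subtraction trick for numerical stability (the logits are made-up values):

python

import numpy as np

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow in exp
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # raw scores from the last layer (made up)
probs = softmax(logits)
print(probs)          # approximately [0.659 0.242 0.099]
print(probs.sum())    # 1.0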

Q3. What is Activation function and why is it required?

An activation function is a mathematical function applied to the output of each neuron in a neural network.

What it does:

• Applies a non-linear transformation to the weighted sum of inputs

• Determines whether and to what extent a neuron should "fire" or activate

Why activation functions are required:


1. Introduce non-linearity:

o Without activation functions, neural networks would just be linear regression models

o Non-linearity allows modeling of complex patterns and relationships

o Enables the network to learn more complex functions

2. Enable backpropagation:

o Most activation functions are differentiable, allowing gradient-based optimization

o Their derivatives are used to compute gradients during training

3. Control neuron output:

o Normalize outputs to specific ranges

o Prevent numerical issues like exploding values

Common activation functions:

1. ReLU (Rectified Linear Unit): f(x) = max(0, x)

o Most popular in hidden layers

o Computationally efficient

o Helps mitigate vanishing gradient problem

2. Sigmoid: f(x) = 1/(1+e^(-x))

o Outputs between 0 and 1

o Used in binary classification output layers

3. Tanh: f(x) = (e^x - e^(-x))/(e^x + e^(-x))

o Outputs between -1 and 1

o Zero-centered

4. Softmax: Mentioned in previous question

o Used in multi-class classification output layers

5. Leaky ReLU: f(x) = max(αx, x) where α is a small constant

o Addresses "dying ReLU" problem

Without activation functions, deep neural networks could not model complex, non-linear
relationships in data.
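
The last point can be checked numerically: stacking layers without an activation collapses to a single linear map (a small NumPy sketch with arbitrary random weights):

python

import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 5))
W2 = rng.normal(size=(5, 3))
x = rng.normal(size=(1, 4))

# Two "layers" with no activation function in between...
two_layer = (x @ W1) @ W2
# ...are exactly equivalent to one linear layer with weights W1 @ W2
one_layer = x @ (W1 @ W2)
print(np.allclose(two_layer, one_layer))   # True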

Q4. What is F1 score and where is it used?

The F1 score is a popular evaluation metric for classification models that balances precision
and recall.
Definition: F1 = 2 * (Precision * Recall) / (Precision + Recall)

Where:

• Precision = True Positives / (True Positives + False Positives)

• Recall = True Positives / (True Positives + False Negatives)

Where it's used:

1. Imbalanced dataset evaluation:

o When classes are not equally represented

o When simple accuracy is misleading

2. Binary classification problems:

o Medical diagnosis (disease detection)

o Spam detection

o Fraud detection

o Anomaly detection

3. Multi-class classification:

o Can be calculated per class (one-vs-rest) or averaged across classes

4. Information retrieval:

o Document classification

o Search result relevance

Why F1 score is important:

• Single metric that balances false positives and false negatives

• Ranges from 0 (worst) to 1 (best)

• Particularly useful when the cost of false positives and false negatives is similar

• More informative than accuracy when classes are imbalanced

Types of F1 score for multi-class problems:

• Macro F1: Simple average of F1 scores for each class (treats all classes equally)

• Weighted F1: Weighted average based on number of samples in each class

• Micro F1: Calculated by counting global true positives, false positives, and false
negatives

F1 score is especially valuable in deep learning when working with unbalanced datasets where
positive examples are rare, such as in medical imaging or fraud detection.
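
A short scikit-learn example of the averaging variants listed above (the labels are made up for illustration):

python

from sklearn.metrics import f1_score

y_true = [0, 1, 1, 0, 1, 2, 2, 0, 1, 2]   # made-up ground truth
y_pred = [0, 1, 0, 0, 1, 2, 1, 0, 1, 2]   # made-up predictions

print(f1_score(y_true, y_pred, average='macro'))     # unweighted mean over classes
print(f1_score(y_true, y_pred, average='weighted'))  # weighted by class support
print(f1_score(y_true, y_pred, average='micro'))     # from global TP/FP/FN counts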

Q5. CNN with block diagram


Convolutional Neural Networks (CNNs) are specialized neural networks designed primarily for
processing grid-like data, especially images.

CNN Architecture Block Diagram

Key Components of a CNN:

1. Convolutional Layers:

o Apply filters (kernels) to input data

o Extract features like edges, textures, and patterns

o Parameters: filter size, stride, padding, number of filters

o Each filter produces a feature map

o Output dimensions: (input_size - filter_size + 2*padding)/stride + 1

2. Activation Function:

o Usually ReLU after each convolutional layer

o Introduces non-linearity

3. Pooling Layers:

o Reduce spatial dimensions (downsampling)

o Common types: Max Pooling, Average Pooling

o Helps achieve spatial invariance

o No learnable parameters

4. Flatten Layer:

o Converts 3D feature maps to 1D vector

o Prepares data for fully connected layers


5. Fully Connected Layers:

o Traditional neural network layers

o Process high-level features for final classification

o Usually has dropout for regularization

6. Output Layer:

o Contains neurons equal to number of classes

o Uses softmax activation for multi-class classification

Advantages of CNNs:

1. Parameter sharing: Same filter applied across the entire image

2. Translation invariance: Can detect features regardless of their position

3. Spatial hierarchy: Captures features at multiple levels of abstraction

4. Reduced parameters: Compared to fully connected networks of similar depth

CNNs are widely used in computer vision tasks like image classification, object detection, facial
recognition, and medical image analysis. They're also applied to non-image data like audio
spectrograms, time series, and even natural language processing.
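
As a sketch of the block diagram described above, a small Keras CNN for 28×28 grayscale images might look like this (the filter counts and sizes are illustrative choices, not prescribed by the answer):

python

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Input(shape=(28, 28, 1)),                       # H x W x C input
    Conv2D(32, kernel_size=3, activation='relu'),   # convolution + ReLU -> 26x26x32
    MaxPooling2D(pool_size=2),                      # downsample -> 13x13x32
    Conv2D(64, kernel_size=3, activation='relu'),   # -> 11x11x64
    MaxPooling2D(pool_size=2),                      # -> 5x5x64
    Flatten(),                                      # 3D feature maps -> 1D vector
    Dense(128, activation='relu'),                  # fully connected layer
    Dropout(0.5),                                   # regularization
    Dense(10, activation='softmax')                 # output probabilities
])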
