Unit 2 Part 01

Uploaded by

harshithkataray1

Convolution Neural Networks

• Introduction

• Relation of convolutional network with deep learning

• A Convolutional Neural Network (ConvNet/CNN) is a deep
learning algorithm that takes an input image, assigns
importance (learnable weights and biases) to various
aspects/objects in the image, and is able to differentiate one
from the other.

• The pre-processing required by a ConvNet is much lower than
for primitive methods, where filters are hand-engineered.

• ConvNets can learn these filters/characteristics automatically
on their own.
• The architecture of a ConvNet is inspired by the organization of
the Visual Cortex of the human brain.

• Individual neurons respond to stimuli only in a restricted
region of the visual field known as the Receptive Field.

• WHAT IS A CONVOLUTIONAL NEURAL NETWORK?

• A convolutional neural network is one of the main categories of
deep learning models used for image classification and image
recognition.

• Scene labelling, object detection, and face recognition are
some of the areas where convolutional neural networks are
widely used.
Architecture of CNN
• The architecture of a ConvNet is inspired by the organization
of the Visual Cortex and is akin to the connectivity pattern of
Neurons in the Human Brain.

• A ConvNet is able to successfully capture the spatial and
temporal dependencies in an image through the application
of relevant filters.

• The architecture fits the image dataset better due to the
reduction in the number of parameters involved and the
reusability of weights.

• In other words, the network can be trained to understand the
sophistication of the image better.
• Figure below shows an RGB image which has been
separated into its three color planes: Red, Green, and Blue.

• There are several such color spaces in which images exist:
Grayscale, RGB, HSV, CMYK, etc.
• When the dimensions of the image increase, it becomes
computationally expensive to handle such images.

• The ConvNet's job is to compress the images into a format
that is easier to process while preserving the elements that
are important for obtaining a decent prediction.

• This is critical when designing an architecture that is
capable of learning features and is also scalable to
large datasets.

• Figure below shows a CNN architecture used to overcome
the problems of high-dimensional image datasets.

• As shown in Figure above, a CNN takes an image as input,
processes it, and classifies it under a certain category such as
car, truck, van, etc.

• The computer sees an image as an array of pixels whose size
depends on the resolution of the image; the class is predicted
from the features found in that array.
• Based on the image resolution, the computer sees the image as
h * w * d, where h = height, w = width, and d = depth (the
number of color channels).

• For example, an RGB image may be a 6 * 6 * 3 array, and a
grayscale image a 4 * 4 * 1 array.
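
As a small illustration of the h * w * d layout, the sketch below builds placeholder images as nested Python lists. The helper names `make_image` and `shape` are hypothetical, and the pixel values are all zero because only the shapes matter here.

```python
def make_image(height, width, depth, fill=0):
    """Build a height x width x depth nested list filled with a placeholder value."""
    return [[[fill for _ in range(depth)] for _ in range(width)]
            for _ in range(height)]


def shape(image):
    """Return (h, w, d) for a nested-list image."""
    return (len(image), len(image[0]), len(image[0][0]))


rgb_image = make_image(6, 6, 3)    # 3 channels: red, green, blue
gray_image = make_image(4, 4, 1)   # 1 channel: intensity

# Total number of values the network receives for each image.
rgb_values = 6 * 6 * 3
gray_values = 4 * 4 * 1

print(shape(rgb_image), rgb_values)
print(shape(gray_image), gray_values)
```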

• Consider a simple convolutional neural network, as shown
below, which combines various components to process an
input image.

• Figure: Convolution Neural Network for handwritten digit.
• The typical CNN is made of a combination of four main
layers:

1. Convolutional layers

2. Rectified Linear Unit (ReLU for short)

3. Pooling layers

4. Fully connected layers

• However, layer 1 involves two further concepts that need to be
learned first: stride and padding.

• Consider an input matrix of size 5×5 and a filter matrix of size 3×3.
• A filter is a set of weights in a matrix applied to an
image or a matrix to obtain the required features.

• A filter can be of any depth; a filter of depth d spans
d layers of the input and convolves across them, i.e. it
sums all the (weight × input) products over the d layers.

• Given an input of size 5×5, applying a 3×3 kernel
(filter) yields a 3×3 output feature map, as shown in
Figure below:

• The output size of the feature map can be computed from
the input size and the filter dimensions.

• For a square input image (same width and height) and
a square filter (same width and height), the formula
simplifies to:

$$\text{OutputSize} = \left\lfloor \frac{\text{InputSize} - \text{FilterSize} + 2 \times \text{Padding}}{\text{Stride}} \right\rfloor + 1$$

• Where InputSize and FilterSize represent both the width
and height of the input image and filter, respectively.
• Padding is the number of pixels added to the input image
around its border to control the spatial dimensions of the
output.
• Stride is the step size or the number of pixels the filter
shifts (moves) after each convolution operation.
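
The relation between input size, filter size, padding, and stride can be checked with a few lines of Python. This is a minimal sketch: the function name `conv_output_size` is hypothetical, and it uses the standard square-convolution formula with floor division.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Width/height of the feature map for a square input and a square filter.

    Floor division drops any partial step that does not fit inside the input.
    """
    return (input_size - filter_size + 2 * padding) // stride + 1


# 5x5 input, 3x3 filter, no padding, stride 1 (the case described in the slides).
print(conv_output_size(5, 3))               # 3

# A padding of 1 keeps the output the same size as the input.
print(conv_output_size(5, 3, padding=1))    # 5

# A larger stride shrinks the output faster.
print(conv_output_size(32, 3, stride=2))    # 15
```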
1. Convolution layers

• The convolution layer is the first building block of a CNN. The
main mathematical task performed is called convolution.

• Convolution is the application of a sliding window function to a
matrix of pixels representing an image.

• The sliding function applied to the matrix is called a kernel or
filter; the two terms can be used interchangeably.

• In the convolution layer, several filters of equal size are
applied, and each filter is used to recognize a specific pattern
from the image, such as the curving of the digits, the edges, the
whole shape of the digits, etc.
• Example:

• Consider this 32x32 grayscale image of a handwritten
digit. The values in the matrix are given in the pixel
representation as shown in Figure below:

• We consider a kernel or filter used for convolution, which is a
matrix with a dimension of 3x3.

• Zero weights are represented by the black grid cells and ones by
the white grid cells.

• The weights of the kernels are determined during the
training process of the neural network.

• The convolution operation is applied between the two
matrices (input image matrix and 3x3 kernel) by taking the
dot product, and works as follows:
1. Apply the kernel matrix starting from the top-left corner,
moving column by column to the right.

2. Perform element-wise multiplication.

3. Sum the values of the products.

4. The resulting value corresponds to the first value (top-left corner)
in the convoluted matrix.

5. After a row is finished, move the kernel down by the step size
of the sliding window.

6. Repeat steps 1 to 5 until the image matrix is fully covered.
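
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not an optimized implementation: the function name `conv2d`, the 5×5 input values, and the 0/1 kernel are all made up for the example, and the kernel is assumed to be square with no padding applied.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image (lists of lists), without padding."""
    k = len(kernel)
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_rows):          # move the window down, row by row
        row = []
        for j in range(out_cols):      # move the window to the right
            total = 0
            for a in range(k):
                for b in range(k):
                    # element-wise multiplication, then summation
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x5 input and 3x3 kernel of zeros and ones.
image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 0],
    [1, 0, 1, 2, 3],
    [3, 1, 0, 1, 2],
    [2, 3, 1, 0, 1],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]

feature_map = conv2d(image, kernel)
print(len(feature_map), len(feature_map[0]))   # 3 3
print(feature_map[0][0])                       # 7
```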

• The dimension of the convoluted matrix depends on the size of the
sliding window. The larger the sliding window, the smaller the
convoluted matrix.
• The operation of convolution is shown in Figure below:

• The weight matrix behaves like a filter on an image, extracting
particular information from the original image matrix.
• The weights are learnt so that the loss function is minimized
and so that the features extracted from the original image
help the network make correct predictions.

• When we have multiple convolutional layers, the initial layers
extract more generic features, while deeper layers extract
more complex features.

2. Activation function (Rectified Linear Unit, ReLU for short)

• A ReLU activation function is applied after each convolution
operation.

• This function helps the network learn non-linear relationships
between the features in the image.
• Hence, ReLU makes the network more robust for
identifying different patterns.
• It also helps to mitigate the vanishing gradient
problem.
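
ReLU itself is a very small function: it keeps positive values and replaces negative ones with zero. A minimal sketch, applied to a made-up 2×2 feature map (the values are illustrative only):

```python
def relu(feature_map):
    """Apply ReLU element-wise: max(0, x) for every value in a 2D feature map."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical output of a convolution, containing negative responses.
feature_map = [
    [-45, 12],
    [3, -7],
]

activated = relu(feature_map)
print(activated)   # [[0, 12], [3, 0]]
```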
3. Pooling layer
• The goal of the pooling layer is to pull the most
significant features from the convoluted matrix.
• This is done by applying some aggregation
operations, which reduces the dimension of the
feature map (convoluted matrix), hence reducing the
memory used while training the network.
• Pooling also helps to mitigate overfitting.

• The most common aggregation functions that can be applied
are:

• Max pooling, which takes the maximum value within each window
of the feature map.

• Sum pooling, which takes the sum of all the values within each
window of the feature map.

• Average pooling, which takes the average of all the values
within each window.

• For example

• The dimension of the feature map becomes smaller as the
pooling function is applied in the example below:
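
In code, the shrinking effect of pooling can be sketched as below. The function name `pool2d`, the 2×2 window with stride 2, and the 4×4 feature map values are assumptions chosen for illustration; the slides do not fix a window size.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Aggregate each size x size window of a 2D feature map."""
    reducers = {
        "max": max,
        "sum": sum,
        "avg": lambda window: sum(window) / len(window),
    }
    reduce_window = reducers[mode]
    rows = (len(feature_map) - size) // stride + 1
    cols = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(rows):
        pooled_row = []
        for j in range(cols):
            # Collect the values covered by the current window.
            window = [feature_map[i * stride + a][j * stride + b]
                      for a in range(size) for b in range(size)]
            pooled_row.append(reduce_window(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 4, 3, 8],
]

print(pool2d(feature_map, mode="max"))   # [[6, 4], [7, 9]]
print(pool2d(feature_map, mode="sum"))   # [[15, 9], [14, 20]]
print(pool2d(feature_map, mode="avg"))   # [[3.75, 2.25], [3.5, 5.0]]
```

The 4×4 input becomes a 2×2 output in all three modes; only the aggregation inside each window differs.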

4. Fully connected layers

• The convolution and pooling layers only extract features and
reduce the number of parameters from the original images.
• To generate the final output, we need to apply a fully
connected layer that generates an output whose size equals the
number of classes needed.
• The fully connected layers form the last layers of the
convolutional neural network.

• We need to flatten the output of the convolutional and pooling
layers and pass it to a series of fully connected (dense) layers.

• The dense layers of the CNN take the flattened feature maps
as an input vector and generate the output indicating
whether or not the image belongs to a particular class.

• The output layer has a loss function, such as categorical
cross-entropy, to compute the error in prediction.

• Once the forward pass is complete, backpropagation begins
to update the weights and biases in order to minimize the error
and loss.
• Finally, a softmax prediction layer is used to generate
probability values for each of the possible output labels, and the
final label predicted is the one with the highest probability
score.
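
A minimal sketch of the softmax step and the cross-entropy loss is given below. The class labels, the raw scores, and the choice of "car" as the true class are made-up values for illustration; subtracting the maximum score before exponentiating is a common numerical-stability trick, not something the slides specify.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)                         # improves numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Categorical cross-entropy for a single example with a known true class."""
    return -math.log(probabilities[true_index])


labels = ["car", "truck", "van"]                # hypothetical classes
scores = [2.0, 1.0, 0.1]                        # hypothetical dense-layer outputs

probabilities = softmax(scores)
predicted = labels[probabilities.index(max(probabilities))]
loss = cross_entropy(probabilities, true_index=0)

print(predicted)                                # car
print(round(sum(probabilities), 6))             # 1.0
```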

• Dropout

• Dropout is a regularization technique applied to improve the
generalization capability of neural networks with a large
number of parameters.

• It consists of randomly dropping some neurons during the
training process, which forces the remaining neurons to learn
new features from the input data.
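
The random dropping of neurons can be sketched as follows. This assumes the common "inverted dropout" variant, in which the surviving activations are scaled up by 1 / keep probability during training; the slides do not say which variant is meant, and the activation values and drop probability here are made up.

```python
import random


def dropout(activations, drop_prob, rng):
    """Zero each activation with probability drop_prob (requires drop_prob < 1).

    Survivors are scaled by 1 / keep_prob (inverted dropout) so that the
    expected value of each activation stays the same.
    """
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)                          # fixed seed for repeatability
activations = [0.5, 1.2, 0.3, 0.9, 2.0, 0.7]    # hypothetical layer outputs

dropped = dropout(activations, drop_prob=0.5, rng=rng)
print(dropped)
```

Dropout is applied only during training; at prediction time all neurons are kept.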
Why convnets are better than feed-forward neural nets?
• There are several reasons why CNNs are important and
perform better:

1. Unlike traditional machine learning models such as SVMs
and decision trees, which require manual feature
extraction, CNNs can perform automatic feature
extraction at scale, making them efficient.

2. The convolution layers make CNNs translation
equivariant and, together with pooling, approximately
translation invariant, meaning they can recognize patterns
and extract features regardless of where they are shifted to
in the image. Robustness to rotation and scaling is not built
in and usually has to be learned, for example through data
augmentation.
3. Multiple pre-trained CNN models such as VGG-16, ResNet50,
Inceptionv3, and EfficientNet have been shown to reach
state-of-the-art results and can be fine-tuned on new tasks using a
relatively small amount of data.

4. CNNs are not limited to image classification; they can also be
used for problems such as natural language processing, time series
analysis, and speech recognition.

• ConvNets are better than feed-forward neural nets because CNNs
feature parameter sharing and dimensionality reduction.

• With parameter sharing, the number of parameters is reduced, and
thus the computation also decreases.
• Because of the dimensionality reduction in a CNN, the
computational power needed is reduced.

• Consider an input image used in ConvNets and feed-forward
neural networks as shown below:
• In a feed-forward network, each and every pixel of the image has
a different weight associated with it, and each pixel has three
values (RGB values).

• If we apply a feed-forward network to a color image of size
227*227, the number of input values becomes
227*227*3 = 154,587.

• Roughly 1.5 × 10⁵ weights are therefore associated with the
image for every neuron in the first layer, and a single layer
contains many such neurons, which makes the network impractical
and complex to work with.
• Hence, millions of parameters will be required in a single
feed-forward network, so such networks are impractical for
handling large numbers of images.

• In CNNs a kernel is built (a kernel is basically a matrix of
weights) and the weights are shared as the kernel moves
horizontally and vertically across an image.

• The max pooling operation has no weights of its own; a 2×2
window with stride 2 halves the width and height of the feature
map, which cuts the number of parameters needed by the later
fully connected layers.

• Further, a stride larger than 1 reduces the size of the feature
map, while padding controls how much the output shrinks.
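
The effect of parameter sharing can be made concrete by counting weights. The sketch below compares one fully connected layer with one convolution layer on a 227*227*3 input; the layer sizes (1000 dense neurons, 64 filters of size 3×3) are hypothetical values picked only for the comparison.

```python
def dense_params(n_inputs, n_neurons):
    """One weight per (input, neuron) pair plus one bias per neuron."""
    return n_inputs * n_neurons + n_neurons


def conv_params(kernel_size, in_channels, n_filters):
    """Each filter has kernel_size * kernel_size * in_channels shared weights plus one bias."""
    return kernel_size * kernel_size * in_channels * n_filters + n_filters


n_inputs = 227 * 227 * 3                      # 154,587 input values

dense_count = dense_params(n_inputs, 1000)    # hypothetical layer of 1000 neurons
conv_count = conv_params(3, 3, 64)            # hypothetical 64 filters of size 3x3

print(n_inputs)      # 154587
print(dense_count)   # 154588000
print(conv_count)    # 1792
```

The convolution layer's parameter count does not depend on the image size at all, because the same kernel weights are reused at every position.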
CONVOLUTIONAL OPERATION
• Convolution is a specialized kind of linear operation.

• Convolution is an operation on two functions of a real-valued
argument.

• Convnets are simply neural networks that use convolution in
place of general matrix multiplication in at least one of their
layers.

• Convolution between two functions is a mathematical
operation that produces a third function expressing how the
shape of one function is modified by the other.

• The convolution operation uses different types of kernels
on the input image to extract different features.
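
For discrete sequences, the convolution of f and g is (f * g)[n] = Σ_k f[k] · g[n − k]. A minimal one-dimensional sketch in plain Python is shown below; the function name `convolve1d` and the two input sequences are chosen only for illustration.

```python
def convolve1d(f, g):
    """Full discrete convolution: out[n] = sum over k of f[k] * g[n - k]."""
    out = [0] * (len(f) + len(g) - 1)
    for i, f_value in enumerate(f):
        for j, g_value in enumerate(g):
            # The product f[i] * g[j] contributes to output position i + j.
            out[i + j] += f_value * g_value
    return out


result = convolve1d([1, 2, 3], [0, 1, 0.5])
print(result)   # [0, 1, 2.5, 4.0, 1.5]
```

Note that the sliding-window procedure described for CNN layers does not flip the kernel, so strictly speaking it is a cross-correlation; deep learning libraries typically use that form and still call it convolution.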
• Convolution kernel types
• A kernel is a small 2D matrix whose contents depend
on the operation to be performed.
• A kernel is applied to the input image by element-wise
multiplication and addition; the output obtained has
lower dimensions and is therefore easier to work with.
• Some kernel types are shown in the Figure below:

Figure: Kernel types

• In the Figure, Gaussian blur is applied to the original image and the
result is shown.

• After applying the Gaussian blur kernel, the image becomes smooth.

• The second and third kernels show the operations of
sharpening the image (enhancing the depth of edges) and edge
detection.
• The shape of a kernel depends heavily on the input
shape of the image and the architecture of the entire network;
mostly the size of a kernel is (MxM), i.e. a square matrix.
• The movement of a kernel is always from left to right and
top to bottom as shown in the Figure below.
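
The three kernel types mentioned here can be written down as small 3×3 matrices. The values below are commonly used textbook versions and are an assumption, since the figure's exact kernels are not available in the text; the helper `apply_at` is hypothetical and evaluates a kernel on a single 3×3 patch.

```python
# Commonly used 3x3 kernels (assumed values, not taken from the figure).
GAUSSIAN_BLUR = [
    [1 / 16, 2 / 16, 1 / 16],
    [2 / 16, 4 / 16, 2 / 16],
    [1 / 16, 2 / 16, 1 / 16],
]
SHARPEN = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]
EDGE_DETECT = [
    [-1, -1, -1],
    [-1, 8, -1],
    [-1, -1, -1],
]


def apply_at(patch, kernel):
    """Element-wise multiply a 3x3 patch with a 3x3 kernel and sum the products."""
    return sum(patch[a][b] * kernel[a][b] for a in range(3) for b in range(3))


# A perfectly flat patch: every pixel has the same intensity.
flat_patch = [[10, 10, 10] for _ in range(3)]

print(apply_at(flat_patch, GAUSSIAN_BLUR))   # 10.0 (blur leaves flat regions unchanged)
print(apply_at(flat_patch, SHARPEN))         # 10   (sharpen leaves flat regions unchanged)
print(apply_at(flat_patch, EDGE_DETECT))     # 0    (no edge in a flat region)
```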
• Stride defines the step by which the kernel moves; for
example, a stride of 1 makes the kernel slide by one row/column
at a time and a stride of 2 moves the kernel by 2 rows/columns.

• Figure below shows the operation of stride

Figure: Multiple kernels with movement of stride=1

• Consider a simple kernel operation with stride = 1.

• Here the input matrix has shape 4x4x1 and the kernel is of
size 3x3. Since the input is larger than the kernel, we are able
to implement a sliding window and apply the kernel over the
entire input.

• The first entry in the convoluted result is calculated as:

• 45*0 + 12*(-1) + 5*0 + 22*(-1) + 10*5 + 35*(-1) + 88*0 +
26*(-1) + 51*0 = -45

• The above process is repeated until the entire input has been
processed.
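
The first entry can be verified in a few lines of Python. The 3×3 patch and kernel below are read off the nine products in the sum above (each product is pixel * weight); the remaining values of the 4×4 input are not given in the text, so only the first window is computed.

```python
# Top-left 3x3 window of the input, taken from the products listed above.
patch = [
    [45, 12, 5],
    [22, 10, 35],
    [88, 26, 51],
]

# Kernel weights, taken from the same products.
kernel = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]

# Element-wise multiplication followed by summation.
first_entry = sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))
print(first_entry)   # -45
```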
