DL Unit 2.3

Artificial Neural Networks

The Input Layer and Dense Layers

Building Blocks of a Neural Network: Layers and Neurons

There are two building blocks of a neural network; let's look at each of them in detail.

1. What are Layers in a Neural Network?


A neural network is made up of vertically stacked components called layers. Each dotted line in the figure represents a layer. There are three types of layers in a neural network:

Input layer – The first is the input layer. This layer accepts the input data and passes it to the rest of the network.

Hidden layer – The second type of layer is the hidden layer. A neural network has one or more hidden layers; in the network shown above, there is one. Hidden layers are the ones actually responsible for the excellent performance and complexity of neural networks. They perform multiple functions at the same time, such as data transformation and automatic feature creation.

Output layer – The last type of layer is the output layer. The output layer holds the result, or the output, of the problem. Raw images get passed to the input layer and we receive the output in the output layer. For example, if we provide an image of a vehicle, the output layer will indicate whether it is an emergency or a non-emergency vehicle, after the data has passed through the input and hidden layers, of course.

Now that we know about layers and their functions, let's talk in detail about what each of these layers is made up of.

2. What are Neurons in a Neural Network?


A layer consists of small individual units called neurons. A neuron in a neural network can be
better understood with the help of biological neurons. An artificial neuron is similar to a
biological neuron. It receives input from the other neurons, performs some processing, and
produces an output.

Now let’s see an artificial neuron-


Here, x1 and x2 are the inputs to the artificial neuron, f(x) represents the processing done on the inputs, and y represents the output of the neuron.
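To make this concrete, here is a minimal Python sketch of such a neuron. The weight, bias, and input values are made up purely for illustration, and the logistic (sigmoid) function stands in for f(x); any activation function could be used.

```python
# A minimal sketch of a single artificial neuron: a weighted sum of the
# inputs plus a bias, passed through an activation function f.
import math

def neuron(x1, x2, w1, w2, b):
    z = w1 * x1 + w2 * x2 + b          # weighted input
    return 1.0 / (1.0 + math.exp(-z))  # f: here, the logistic (sigmoid) function

# Two inputs in, one activation out (illustrative values only).
y = neuron(x1=0.5, x2=1.0, w1=0.8, w2=-0.2, b=0.1)
print(y)
```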

Next, consider a neural network with two inputs, two hidden neurons, and two output neurons. Additionally, the hidden and output neurons each include a bias.

Here’s the basic structure:

In order to have some numbers to work with, here are the initial weights, the biases, and the training inputs/outputs: w1 = 0.15 and w2 = 0.20 feed hidden neuron h1, w3 = 0.25 and w4 = 0.30 feed hidden neuron h2, and both hidden neurons share bias b1 = 0.35; w5 = 0.40 and w6 = 0.45 feed output neuron o1, w7 = 0.50 and w8 = 0.55 feed output neuron o2, and both output neurons share bias b2 = 0.60. The inputs are i1 = 0.05 and i2 = 0.10, and the target outputs are 0.01 and 0.99.
The goal of backpropagation is to optimize the weights so that the neural network can learn how
to correctly map arbitrary inputs to outputs.

For the rest of this tutorial we’re going to work with a single training set: given inputs 0.05 and
0.10, we want the neural network to output 0.01 and 0.99.

The Forward Pass

To begin, let's see what the neural network currently predicts given the weights and biases above and inputs of 0.05 and 0.10. To do this we'll feed those inputs forward through the network.

We figure out the total net input to each hidden layer neuron, squash the total net input using
an activation function (here we use the logistic function), then repeat the process with the output
layer neurons.
Here's how we calculate the total net input for h1:

net_h1 = w1·i1 + w2·i2 + b1·1 = 0.15·0.05 + 0.20·0.10 + 0.35 = 0.3775

We then squash it using the logistic function to get the output of h1:

out_h1 = 1 / (1 + e^(−net_h1)) = 1 / (1 + e^(−0.3775)) = 0.593269992

Carrying out the same process for h2 we get:

out_h2 = 0.596884378


We repeat this process for the output layer neurons, using the output from the hidden layer
neurons as inputs.

Here's the output for o1:

net_o1 = w5·out_h1 + w6·out_h2 + b2·1 = 0.40·0.593269992 + 0.45·0.596884378 + 0.60 = 1.105905967

out_o1 = 1 / (1 + e^(−net_o1)) = 0.75136507

And carrying out the same process for o2 we get:

out_o2 = 0.772928465
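As a sanity check, this forward pass can be reproduced in a few lines of Python. This is only a sketch; the variable names follow the net/out notation used in this walkthrough, and the parameter values are the initial ones listed earlier.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Inputs and initial parameters listed above
i1, i2 = 0.05, 0.10
w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60

# Hidden layer: total net input, then squash with the logistic function
net_h1 = w1 * i1 + w2 * i2 + b1
net_h2 = w3 * i1 + w4 * i2 + b1
out_h1, out_h2 = logistic(net_h1), logistic(net_h2)

# Output layer: the hidden outputs become the inputs
net_o1 = w5 * out_h1 + w6 * out_h2 + b2
net_o2 = w7 * out_h1 + w8 * out_h2 + b2
out_o1, out_o2 = logistic(net_o1), logistic(net_o2)

print(out_o1, out_o2)   # approximately 0.75136507 and 0.77292847
```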

Calculating the Total Error

We can now calculate the error for each output neuron using the squared error function and sum them to get the total error:

E_total = Σ ½(target − output)²

Some sources refer to the target as the ideal and the output as the actual.

The ½ is included so that the exponent is cancelled when we differentiate later on. The result is eventually multiplied by a learning rate anyway, so it doesn't matter that we introduce a constant here [1].

For example, the target output for o1 is 0.01 but the neural network outputs 0.75136507, therefore its error is:

E_o1 = ½(target_o1 − out_o1)² = ½(0.01 − 0.75136507)² = 0.274811083

Repeating this process for o2 (remembering that the target is 0.99) we get:

E_o2 = 0.023560026

The total error for the neural network is the sum of these errors:

E_total = E_o1 + E_o2 = 0.274811083 + 0.023560026 = 0.298371109
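The same error calculation in Python, using the outputs from the forward pass and the targets stated above:

```python
# Squared error for each output neuron, summed into the total error.
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.75136507, 0.772928465   # outputs from the forward pass above

E_o1 = 0.5 * (target_o1 - out_o1) ** 2
E_o2 = 0.5 * (target_o2 - out_o2) ** 2
E_total = E_o1 + E_o2
print(E_total)   # approximately 0.298371109
```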

The Backwards Pass

Our goal with backpropagation is to update each of the weights in the network so that they cause the actual output to be closer to the target output, thereby minimizing the error for each output neuron and for the network as a whole.

Output Layer

Consider w5. We want to know how much a change in w5 affects the total error, aka ∂E_total/∂w5.

∂E_total/∂w5 is read as "the partial derivative of E_total with respect to w5". You can also say "the gradient with respect to w5".

By applying the chain rule we know that:

∂E_total/∂w5 = ∂E_total/∂out_o1 · ∂out_o1/∂net_o1 · ∂net_o1/∂w5

Visually, here’s what we’re doing:

We need to figure out each piece in this equation.

First, how much does the total error change with respect to the output?

E_total = ½(target_o1 − out_o1)² + ½(target_o2 − out_o2)²

∂E_total/∂out_o1 = 2 · ½(target_o1 − out_o1) · (−1) + 0 = −(target_o1 − out_o1) = −(0.01 − 0.75136507) = 0.74136507

−(target_o1 − out_o1) is sometimes expressed as (out_o1 − target_o1).

When we take the partial derivative of the total error with respect to out_o1, the quantity ½(target_o2 − out_o2)² becomes zero because out_o1 does not affect it, which means we're taking the derivative of a constant, which is zero.

Next, how much does the output of o1 change with respect to its total net input?

The partial derivative of the logistic function is the output multiplied by 1 minus the output:

∂out_o1/∂net_o1 = out_o1 · (1 − out_o1) = 0.75136507 · (1 − 0.75136507) = 0.186815602

Finally, how much does the total net input of o1 change with respect to w5? Since net_o1 = w5·out_h1 + w6·out_h2 + b2·1, the answer is simply out_h1:

∂net_o1/∂w5 = out_h1 = 0.593269992

Putting it all together:

∂E_total/∂w5 = 0.74136507 · 0.186815602 · 0.593269992 = 0.082167041

You'll often see this calculation combined in the form of the delta rule:

∂E_total/∂w5 = −(target_o1 − out_o1) · out_o1(1 − out_o1) · out_h1

Alternatively, we have ∂E_total/∂out_o1 and ∂out_o1/∂net_o1, which can be written as ∂E_total/∂net_o1, aka δ_o1 (the Greek letter delta), aka the node delta. We can use this to rewrite the calculation above:

δ_o1 = ∂E_total/∂net_o1 = −(target_o1 − out_o1) · out_o1(1 − out_o1)

Therefore:

∂E_total/∂w5 = δ_o1 · out_h1

Some sources extract the negative sign from δ, so it would be written as:

∂E_total/∂w5 = −δ_o1 · out_h1

To decrease the error, we then subtract this value from the current weight (optionally multiplied by some learning rate, eta, which we'll set to 0.5):

w5⁺ = w5 − η · ∂E_total/∂w5 = 0.40 − 0.5 · 0.082167041 = 0.35891648

Some sources use α (alpha) to represent the learning rate, others use η (eta), and others even use ε (epsilon).

We can repeat this process to get the new weights w6, w7, and w8:

w6⁺ = 0.408666186
w7⁺ = 0.511301270
w8⁺ = 0.561370121

We perform the actual updates in the neural network after we have the new weights leading into the hidden layer neurons (i.e., we use the original weights, not the updated weights, when we continue the backpropagation algorithm below).
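Here is a short Python sketch of these output-layer updates using the delta rule; the values are the ones computed above, and delta_o1/delta_o2 are simply names for the node deltas.

```python
# Delta-rule updates for the four weights feeding the output layer,
# continuing from the forward-pass values above.
out_h1, out_h2 = 0.593269992, 0.596884378
out_o1, out_o2 = 0.75136507, 0.772928465
target_o1, target_o2 = 0.01, 0.99
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
eta = 0.5   # learning rate

# Node deltas: dE/dnet = -(target - out) * out * (1 - out)
delta_o1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1)
delta_o2 = -(target_o2 - out_o2) * out_o2 * (1 - out_o2)

# The gradient for each weight is its node delta times the incoming activation
w5_new = w5 - eta * delta_o1 * out_h1
w6_new = w6 - eta * delta_o1 * out_h2
w7_new = w7 - eta * delta_o2 * out_h1
w8_new = w8 - eta * delta_o2 * out_h2
print(w5_new, w6_new, w7_new, w8_new)   # ~0.3589165, 0.4086662, 0.5113013, 0.5613701
```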

Hidden Layer

Next, we'll continue the backwards pass by calculating new values for w1, w2, w3, and w4.

Big picture, here's what we need to figure out:

∂E_total/∂w1 = ∂E_total/∂out_h1 · ∂out_h1/∂net_h1 · ∂net_h1/∂w1

Visually:
We're going to use a similar process as we did for the output layer, but slightly different, to account for the fact that the output of each hidden layer neuron contributes to the output (and therefore error) of multiple output neurons. We know that out_h1 affects both out_o1 and out_o2, therefore ∂E_total/∂out_h1 needs to take into consideration its effect on both output neurons:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1

Starting with ∂E_o1/∂out_h1:

∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 · ∂net_o1/∂out_h1

We can calculate ∂E_o1/∂net_o1 using values we calculated earlier:

∂E_o1/∂net_o1 = ∂E_o1/∂out_o1 · ∂out_o1/∂net_o1 = 0.74136507 · 0.186815602 = 0.138498562

And ∂net_o1/∂out_h1 is equal to w5:

net_o1 = w5·out_h1 + w6·out_h2 + b2·1
∂net_o1/∂out_h1 = w5 = 0.40

Plugging them in:

∂E_o1/∂out_h1 = 0.138498562 · 0.40 = 0.055399425

Following the same process for ∂E_o2/∂out_h1, we get:

∂E_o2/∂out_h1 = −0.019049119

Therefore:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1 = 0.055399425 − 0.019049119 = 0.036350306
Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight:

∂out_h1/∂net_h1 = out_h1 · (1 − out_h1) = 0.593269992 · (1 − 0.593269992) = 0.241300709

We calculate the partial derivative of the total net input to h1 with respect to w1 the same as we did for the output neuron:

net_h1 = w1·i1 + w2·i2 + b1·1
∂net_h1/∂w1 = i1 = 0.05

Putting it all together:

∂E_total/∂w1 = 0.036350306 · 0.241300709 · 0.05 = 0.000438568

You might also see this written as:

∂E_total/∂w1 = (Σ_o δ_o · w_ho) · out_h1(1 − out_h1) · i1 = δ_h1 · i1

We can now update w1:

w1⁺ = w1 − η · ∂E_total/∂w1 = 0.15 − 0.5 · 0.000438568 = 0.149780716

Repeating this for w2, w3, and w4:

w2⁺ = 0.19956143
w3⁺ = 0.24975114
w4⁺ = 0.29950229

Finally, we've updated all of our weights! When we fed forward the 0.05 and 0.10 inputs originally, the error on the network was 0.298371109. After this first round of backpropagation, the total error is now down to 0.291027924. It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to 0.0000351085. At that point, when we feed forward 0.05 and 0.10, the two output neurons generate 0.015912196 (vs. 0.01 target) and 0.984065734 (vs. 0.99 target).
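For reference, the whole procedure (forward pass, node deltas, weight updates) can be rolled into one small loop. This is a sketch rather than the original tutorial's code: the bias values are held fixed, as in the walkthrough above, and the final error should land close to the figure quoted.

```python
# A compact end-to-end sketch: repeat the forward pass and weight updates
# derived above many times and watch the total error shrink.
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

i1, i2 = 0.05, 0.10
t1, t2 = 0.01, 0.99
w1, w2, w3, w4, b1 = 0.15, 0.20, 0.25, 0.30, 0.35
w5, w6, w7, w8, b2 = 0.40, 0.45, 0.50, 0.55, 0.60
eta = 0.5

for step in range(10000):
    # forward pass
    out_h1 = logistic(w1 * i1 + w2 * i2 + b1)
    out_h2 = logistic(w3 * i1 + w4 * i2 + b1)
    out_o1 = logistic(w5 * out_h1 + w6 * out_h2 + b2)
    out_o2 = logistic(w7 * out_h1 + w8 * out_h2 + b2)

    # output-layer node deltas
    d_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
    d_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)

    # hidden-layer node deltas (sum the error signals from both output neurons,
    # using the original output-layer weights)
    d_h1 = (d_o1 * w5 + d_o2 * w7) * out_h1 * (1 - out_h1)
    d_h2 = (d_o1 * w6 + d_o2 * w8) * out_h2 * (1 - out_h2)

    # now update the output-layer weights
    w5, w6 = w5 - eta * d_o1 * out_h1, w6 - eta * d_o1 * out_h2
    w7, w8 = w7 - eta * d_o2 * out_h1, w8 - eta * d_o2 * out_h2

    # and the hidden-layer weights
    w1, w2 = w1 - eta * d_h1 * i1, w2 - eta * d_h1 * i2
    w3, w4 = w3 - eta * d_h2 * i1, w4 - eta * d_h2 * i2

E_total = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2
print(E_total)   # drops to roughly 3.5e-5 after 10,000 iterations
```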

Hot Dog-Detecting Dense Network


Forward Propagation Through the First Hidden Layer
Forward Propagation Through Subsequent Layers
The Softmax Layer of a Fast Food-Classifying Network

In the previous chapter we introduced a frivolous hot dog-detecting binary classifier and the mathematical notation we use to define artificial neurons. As shown in Figure 7.1, our hot dog classifier is no longer a single neuron; in this chapter, it is a dense network of artificial neurons. More specifically, with this network architecture:
• We have reduced the number of input neurons down to two for simplicity:
  • The first input neuron, x1, represents the volume of ketchup (in, say, milliliters, which abbreviates to mL) on the object being considered by the network. (We are no longer working with perceptrons, so we are no longer restricted to binary inputs only.)
  • The second input neuron, x2, represents mL of mustard.
• We have two dense hidden layers:
  • The first hidden layer has three ReLU neurons.
  • The second hidden layer has two ReLU neurons.
• The output neuron is denoted by ŷ in the network. This is a binary classification problem, so, as outlined in the previous section, this neuron should be sigmoid. As in our perceptron examples in Chapter 6, y = 1 corresponds to the presence of a hot dog and y = 0 corresponds to the presence of some other object.

Forward Propagation through the First Hidden Layer

Having described the architecture of our hot dog-detecting network, let's turn our attention to its functionality by focusing on the neuron labelled a1. This particular neuron, like its siblings a2 and a3, receives input regarding a given object's ketchup-y-ness and mustard-y-ness from x1 and x2, respectively. Despite receiving the same data as a2 and a3, a1 treats these data uniquely by having its own unique parameters. Remembering Figure 6.7, "the most important equation in this book", w·x + b, we may grasp this behavior more concretely. Breaking this equation down for the neuron labelled a1, we consider that it has two inputs from the previous layer: x1 and x2. This neuron also has two weights: w1 (which applies to the importance of the ketchup measurement x1) and w2 (which applies to the importance of the mustard measurement x2). With these five pieces of information we can calculate z, the weighted input to that neuron (Equation 7.1):

z = w1·x1 + w2·x2 + b

In turn, with the z value for the neuron labelled a1, we can calculate the activation a it outputs. Since the neuron labelled a1 is a ReLU neuron, we use the equation introduced in Figure 6.11 (Equation 7.2):

a = max(0, z)

Let's say that:
• x1 is 4.0 mL of ketchup for a given object presented to the network
• x2 is 3.0 mL of mustard for that same object
• w1 = −0.5
• w2 = 1.5
• b = −0.9

To calculate z, let's start with Equation 7.1 and then fill in our contrived values:

z = w1·x1 + w2·x2 + b = −0.5 · 4.0 + 1.5 · 3.0 − 0.9 = 1.6

Finally, to compute a, the activation output of the neuron labelled a1, we can leverage Equation 7.2:

a = max(0, z) = max(0, 1.6) = 1.6
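In code, the arithmetic for this single ReLU neuron is only a few lines (a sketch using the contrived values above):

```python
# Forward-pass arithmetic for the neuron labelled a1, using the contrived
# values from the text (Equations 7.1 and 7.2).
x1, x2 = 4.0, 3.0        # mL of ketchup and mustard
w1, w2, b = -0.5, 1.5, -0.9

z = w1 * x1 + w2 * x2 + b   # weighted input, Equation 7.1
a = max(0.0, z)             # ReLU activation, Equation 7.2
print(z, a)                 # 1.6 and 1.6
```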

As suggested by the right-facing arrow along the bottom of Figure 7.1, executing the calculations through an artificial neural network from the input layer (the x values) through to the output layer (ŷ) is called forward propagation. Immediately above, we detailed the process for forward propagating through a single neuron in the first hidden layer of our hot dog-detecting network. To forward propagate through the remaining neurons of the first hidden layer, that is, to calculate the a values for the neurons labelled a2 and a3, we would follow the same process as we did for the neuron labelled a1. The inputs x1 and x2 are identical for all three neurons, but despite being fed the same measurements of ketchup and mustard, each neuron in the first hidden layer will output a different activation a, because the parameters w1, w2, and b vary for each of the neurons in the layer.
Forward Propagation through Subsequent Layers
The process of forward propagating through the remaining layers of the network is essentially the same as propagating through the first hidden layer, but for clarity's sake, let's work through it together. In Figure 7.2, we'll assume that we've already calculated the activation value a for each of the neurons in the first hidden layer. Returning our focus to the neuron labelled a1, the activation it outputs (a1 = 1.6) becomes one of the three inputs into the neuron labelled a4 (and, as highlighted in the figure, this same activation of a1 = 1.6 is also fed as one of the three inputs into the neuron labelled a5).

Figure 7.2 Our hot dog-detecting network from Figure 7.1, now highlighting the activation output of neuron a1, which is provided as an input to both neuron a4 and neuron a5.

To provide an example of forward propagation through the second hidden layer, let's compute a for the neuron labelled a4. Again, we employ the all-important equation w·x + b. For brevity's sake, we've combined it with the ReLU activation function (Equation 7.5):

a4 = max(0, w1·x1 + w2·x2 + w3·x3 + b)

This is sufficiently similar to Equations 7.3 and 7.4 that it would be superfluous to walk through the arithmetic again with feigned values. The only twist, as we propagate through the second hidden layer, is that the layer's inputs (i.e., x in the equation w·x + b) come not from outside the network; instead, they are provided by the first hidden layer. Thus, in Equation 7.5:
• x1 is the value a1 = 1.6, which we obtained earlier from the neuron labelled a1,
• x2 is the activation output a2 (whatever it happens to equal) from the neuron labelled a2, and
• x3 is likewise the unique activation a3 from the neuron labelled a3.
In this manner, the neuron labelled a4 is able to nonlinearly recombine the information provided by the three neurons of the first hidden layer. The neuron labelled a5 also nonlinearly recombines this information, but it would do so in its own distinctive way: the unique parameters w1, w2, w3, and b for this neuron would lead it to output a unique activation a of its own.
Having illustrated forward propagation through all of the hidden layers of our hot dog-detecting network, let's round the process off by propagating through the output layer. Figure 7.3 highlights that our single output neuron receives its inputs from the neurons labelled a4 and a5.
Let's begin by calculating z for this output neuron. The formula is identical to Equation 7.1, which we used to calculate z for the neuron labelled a1, except that the (contrived, as usual) values we plug into the variables are different. With those values, z works out to −2.0.
The activation a computed by the sigmoid neuron in the output layer is a very special case, because it is the final output of our entire hot dog-detecting neural network. Since it's so special, we assign it a distinctive designation: ŷ, which is pronounced "why hat". This value ŷ = σ(−2.0) ≈ 0.1192 is the network's guess as to whether the object presented to it was a hot dog or not, and we can express this in probabilistic language. Given the inputs x1 and x2 that we fed into the network, that is, 4.0 mL of ketchup and 3.0 mL of mustard, the network estimates that there is an 11.92% chance that an object with those particular condiment measurements is a hot dog. If the object presented to the network was indeed a hot dog (y = 1), then this ŷ of 0.1192 was pretty far off the mark. On the other hand, if the object was truly not a hot dog (y = 0), then the ŷ is quite good. We'll formalize the evaluation of ŷ in Chapter 8, but the general notion is that the closer ŷ is to the true value y, the better.
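Below is a sketch of this final stretch of forward propagation in Python. Note that only a1 = 1.6 and the final ŷ ≈ 0.1192 are given in the text; the other activations, weights, and biases in the sketch are placeholders invented purely for illustration.

```python
# Sketch of forward propagation through the second hidden layer and the
# sigmoid output neuron. a2, a3 and the layer parameters are placeholders.
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

a1, a2, a3 = 1.6, 0.0, 0.8   # first-hidden-layer activations (a2, a3 are made up)

# Each second-hidden-layer neuron recombines a1..a3 with its own weights and bias
a4 = relu(0.5 * a1 + 1.0 * a2 - 0.5 * a3 + 0.1)
a5 = relu(-0.2 * a1 + 0.4 * a2 + 0.9 * a3 - 0.3)

# The output neuron is sigmoid; a weighted input of z = -2.0 yields y_hat ~= 0.1192
z_out = -2.0
y_hat = sigmoid(z_out)
print(a4, a5, y_hat)
```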
Revisiting Our Shallow Network
With the knowledge of dense networks that we developed over the course of this chapter, we can return to our Shallow Net in Keras notebook and understand the model summary within it. Example 5.2 shows the three lines of Keras code we used to architect a shallow neural network for classifying MNIST digits. As detailed in Chapter 5, over those three lines of code we instantiated a model object and added layers of artificial neurons to it. By calling summary() on the model, we see the model-summarizing table provided in Figure 7.5. The table has three columns:
• Layer (type): the name and type of each of our layers
• Output Shape: the dimensionality of the layer
• Param #: the number of parameters (weights w and biases b) associated with the layer

Figure 7.5 A summary of the model object from our "Shallow Net in Keras" Jupyter notebook.
The input layer performs no calculations and never has any of its own parameters, so no information on it is displayed directly. The first row in the table, therefore, corresponds to the first hidden layer of the network. The table indicates that this layer:
• is called dense_1; this is a default name, as we did not designate one explicitly
• is a Dense layer, as we specified in Example 5.2
• is composed of 64 neurons, as we further specified in Example 5.2
• has 50,240 parameters associated with it, broken down into:
  • 50,176 weights, corresponding to each of the 64 neurons in this dense layer receiving input from each of the 784 neurons in the input layer (64 × 784)
  • plus 64 biases, one for each of the neurons in the layer
  • giving us a total of n_parameters = n_w + n_b = 50,176 + 64 = 50,240

The second row of the table in Figure 7.5 corresponds to the model's output layer. The table tells us that this layer:
• is called dense_2
• is a Dense layer, as we specified it to be
• consists of 10 neurons, yet again, as we specified
• has 650 parameters associated with it:
  • 640 weights, corresponding to each of the ten neurons receiving input from each of the 64 neurons in the hidden layer (64 × 10)
  • plus 10 biases, one for each of the output neurons
From the parameter counts for each layer, we can calculate for ourselves the Total params line displayed in Figure 7.5:

50,240 + 650 = 50,890
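Example 5.2 itself is not reproduced in this excerpt, but a minimal Keras model with the same shapes (784 inputs, a 64-neuron dense hidden layer, a 10-neuron dense output layer) would look something like the sketch below; the activation choices are assumptions, not necessarily those of the original notebook.

```python
# A minimal sketch of a shallow dense network matching the parameter counts
# discussed above (784 -> 64 -> 10); activation functions are assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([
    Input(shape=(784,)),              # 784 input features (e.g., 28x28 MNIST pixels)
    Dense(64, activation='sigmoid'),  # hidden layer: 784*64 + 64 = 50,240 params
    Dense(10, activation='softmax'),  # output layer: 64*10 + 10 = 650 params
])
model.summary()   # Total params: 50,890
```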

All 50,890 of these parameters are "Trainable params" because, during the subsequent model.fit() call in the Shallow Net in Keras notebook, they are permitted to be tuned during model training. This is the norm, but as we'll see in Part III, there are situations where it is fruitful to freeze some of the parameters in a model, rendering them "Non-trainable params".
