Pattern Classification
All materials in these slides were taken from Pattern Classification (2nd ed) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors and the publisher
Chapter 6: Multilayer Neural Networks
(Sections 6.1-6.3)
• Introduction
• Feedforward Operation and Classification
• Backpropagation Algorithm
Introduction
• Goal: classify objects by learning the nonlinearity
• There are many problems for which linear
discriminants are insufficient for minimum error
• In previous methods, the central difficulty was the
choice of the appropriate nonlinear functions
• A “brute force” approach might be to select a complete basis set, such as all polynomials; such a classifier would require too many parameters to be determined from a limited number of training samples
• There is no automatic method for determining the
nonlinearities when no information is provided to the
classifier
• In multilayer neural networks, the form of the nonlinearity is learned from the training data
Feedforward Operation and Classification
• A three-layer neural network consists of an input
layer, a hidden layer and an output layer
interconnected by modifiable weights
represented by links between layers
• A single “bias unit” is connected to each unit other than the
input units
• Net activation:
net_j = \sum_{i=1}^{d} x_i w_{ji} + w_{j0} = \sum_{i=0}^{d} x_i w_{ji} \equiv \mathbf{w}_j^t \cdot \mathbf{x},
where the subscript i indexes units in the input layer and j indexes units in the hidden layer; w_ji denotes the input-to-hidden weight at hidden unit j. (In neurobiology, such weights or connections are called “synapses”)
• Each hidden unit emits an output that is a nonlinear function
of its activation, that is: yj = f(netj)
Figure 6.1 shows a simple threshold function
f(net) = \mathrm{sgn}(net) = \begin{cases} +1 & \text{if } net \ge 0 \\ -1 & \text{if } net < 0 \end{cases}
• The function f(.) is also called the activation
function or “nonlinearity” of a unit. There are
more general activation functions with
desirable properties
• Each output unit similarly computes its net
activation based on the hidden unit signals as:
net_k = \sum_{j=1}^{n_H} y_j w_{kj} + w_{k0} = \sum_{j=0}^{n_H} y_j w_{kj} \equiv \mathbf{w}_k^t \cdot \mathbf{y},
where the subscript k indexes units in the output layer and n_H denotes the number of hidden units
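• The hidden-unit and output-unit net activations share the same inner-product form; as a small illustration (not from the slides), the two equivalent ways of writing it in NumPy, with the bias folded in as a weight on a constant input equal to 1. All values and names below are made up:

import numpy as np

# one hidden unit j with d = 3 inputs (illustrative values)
x = np.array([0.2, -1.0, 0.5])       # input pattern x_1 .. x_d
w_j = np.array([0.4, 0.1, -0.3])     # weights w_ji, i = 1 .. d
w_j0 = 0.7                           # bias weight

net_j = x @ w_j + w_j0               # sum_{i=1}^{d} x_i w_ji + w_j0

# equivalent augmented form: x_0 = 1 absorbs the bias, net_j = w_j^t . x
x_aug = np.concatenate(([1.0], x))
w_aug = np.concatenate(([w_j0], w_j))
assert np.isclose(net_j, x_aug @ w_aug)

y_j = np.tanh(net_j)                 # y_j = f(net_j), with an assumed tanh f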
• When there is more than one output, the outputs are denoted z_k. An output unit computes the nonlinear function of its net activation, emitting
z_k = f(net_k)
• In the case of c outputs (classes), we can view the network as computing c discriminant functions z_k = g_k(x) and classify the input x according to the largest discriminant function g_k(x), k = 1, …, c
• The three-layer network with the weights listed in
fig. 6.1 solves the XOR problem
• The hidden unit y_1 computes the boundary x_1 + x_2 + 0.5 = 0:
y_1 = +1 if x_1 + x_2 + 0.5 \ge 0, and y_1 = -1 otherwise
• The hidden unit y_2 computes the boundary x_1 + x_2 - 1.5 = 0:
y_2 = +1 if x_1 + x_2 - 1.5 \ge 0, and y_2 = -1 otherwise
• The final output unit emits z_1 = +1 if and only if y_1 = +1 and y_2 = -1:
z_1 = y_1 AND NOT y_2 = (x_1 OR x_2) AND NOT (x_1 AND x_2) = x_1 XOR x_2,
which provides the nonlinear decision of fig. 6.1
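• A small numerical check of this XOR network (not from the text): the hidden weights below follow the two boundaries above, while the output weights 0.7, -0.4 and bias -1 are one choice that realizes y_1 AND NOT y_2 and may differ from the exact values of fig. 6.1:

def sgn(net):
    # threshold activation: +1 if net >= 0, else -1
    return 1 if net >= 0 else -1

def xor_net(x1, x2):
    y1 = sgn(x1 + x2 + 0.5)                 # boundary x1 + x2 + 0.5 = 0 (acts like OR)
    y2 = sgn(x1 + x2 - 1.5)                 # boundary x1 + x2 - 1.5 = 0 (acts like AND)
    z1 = sgn(0.7 * y1 - 0.4 * y2 - 1.0)     # fires only when y1 = +1 and y2 = -1
    return z1

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    print((x1, x2), "->", xor_net(x1, x2))  # expected: -1, +1, +1, -1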
• General Feedforward Operation – case of c output units
g_k(x) \equiv z_k = f\!\left( \sum_{j=1}^{n_H} w_{kj} \, f\!\left( \sum_{i=1}^{d} w_{ji} x_i + w_{j0} \right) + w_{k0} \right) \qquad (1)
(k = 1, \ldots, c)
• Hidden units enable us to express more complicated nonlinear functions and thus extend the classification capability of the network
• The activation function does not have to be a sign function; it is often required to be continuous and differentiable
• We can allow the activation function in the output layer to be different from the activation function in the hidden layer, or allow a different activation function for each individual unit
• We assume for now that all activation functions are identical
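• As an illustrative sketch (not from the text) of equation (1): the c discriminant functions and the resulting classification rule, using tanh as the assumed continuous, differentiable activation; all array names are illustrative:

import numpy as np

def discriminants(x, W1, b1, W2, b2, f=np.tanh):
    # g_k(x) = z_k = f( sum_j w_kj f( sum_i w_ji x_i + w_j0 ) + w_k0 ),  eq. (1)
    y = f(W1 @ x + b1)        # hidden layer, shape (n_H,)
    z = f(W2 @ y + b2)        # output layer, shape (c,)
    return z

def classify(x, W1, b1, W2, b2):
    # assign x to the class with the largest discriminant function g_k(x)
    return int(np.argmax(discriminants(x, W1, b1, W2, b2)))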
• Expressive Power of multi-layer Networks
Question: Can every decision be implemented by a three-layer
network described by equation (1)?
Answer: Yes (due to A. Kolmogorov)
“Any continuous function from input to output can be implemented
in a three-layer net, given sufficient number of hidden units nH,
proper nonlinearities, and weights.”
g(x) = \sum_{j=1}^{2n+1} \Xi_j\!\left( \sum_{i=1}^{n} \psi_{ij}(x_i) \right)
x \in I^n \; (I = [0,1]; \; n \ge 2)
for properly chosen functions \Xi_j and \psi_{ij}
• Each of the 2n+1 hidden units j takes as input a sum of d nonlinear functions, one for each input feature x_i
• Each hidden unit emits a nonlinear function \Xi_j of its total input
• The output unit emits the sum of the contributions of the hidden units
Unfortunately: Kolmogorov’s theorem tells us very little about
how to find the nonlinear functions based on data; this is the
central problem in network-based pattern recognition
Backpropagation Algorithm
• Any function from input to output can be
implemented as a three-layer neural network
• These results are of greater theoretical than practical interest, since the construction of such a network requires the nonlinear functions and the weight values, which are unknown!
• Our goal now is to set the interconnection weights based on the training patterns and the desired outputs
• In a three-layer network, it is a straightforward matter to
understand how the output, and thus the error, depend on
the hidden-to-output layer weights
• The power of backpropagation is that it enables us to compute an effective error for each hidden unit, and thus derive a learning rule for the input-to-hidden weights; this is known as:
The credit assignment problem
• The network has two modes of operation:
• Feedforward
The feedforward operation consists of presenting a pattern to the input units and passing (or feeding) the signals through the network in order to obtain the outputs at the output units (no cycles!)
• Learning
The supervised learning consists of presenting an input pattern and modifying the network parameters (weights) to reduce the distance between the computed output and the desired output
• Network Learning
• Let t_k be the k-th target (or desired) output and z_k be the k-th computed output, with k = 1, …, c, and let w represent all the weights of the network
• The training error: J(w) = \frac{1}{2} \sum_{k=1}^{c} (t_k - z_k)^2 = \frac{1}{2} \| \mathbf{t} - \mathbf{z} \|^2
• The backpropagation learning rule is based on
gradient descent
• The weights are initialized with pseudo-random values and
are changed in a direction that will reduce the error:
\Delta w = -\eta \frac{\partial J}{\partial w}
where \eta is the learning rate, which indicates the relative size of the change in weights
w(m+1) = w(m) + \Delta w(m)
where m indexes the pattern presented
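• To make the rule concrete, a toy sketch (not from the text) that follows \Delta w = -\eta \partial J / \partial w with a finite-difference gradient on a one-weight "network"; the numbers are made up for illustration:

import numpy as np

def J(w, x=1.5, t=0.8):
    z = np.tanh(w * x)                     # a one-weight "network"
    return 0.5 * (t - z) ** 2              # J(w) = 1/2 (t - z)^2

eta, w, eps = 0.5, 0.0, 1e-6
for m in range(50):
    grad = (J(w + eps) - J(w - eps)) / (2 * eps)   # numerical dJ/dw
    w = w + (-eta * grad)                  # w(m+1) = w(m) + Δw(m), with Δw = -η dJ/dw
print(w, J(w))                             # J(w) approaches 0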
• Error on the hidden-to-output weights
\frac{\partial J}{\partial w_{kj}} = \frac{\partial J}{\partial net_k} \cdot \frac{\partial net_k}{\partial w_{kj}} = -\delta_k \frac{\partial net_k}{\partial w_{kj}}
where the sensitivity of unit k is defined as: \delta_k \equiv -\frac{\partial J}{\partial net_k}
and describes how the overall error changes with the activation of the unit’s net
\delta_k = -\frac{\partial J}{\partial net_k} = -\frac{\partial J}{\partial z_k} \cdot \frac{\partial z_k}{\partial net_k} = (t_k - z_k) f'(net_k)
Since net_k = \mathbf{w}_k^t \cdot \mathbf{y}, therefore: \frac{\partial net_k}{\partial w_{kj}} = y_j
Conclusion: the weight update (or learning rule) for the hidden-to-output weights is:
\Delta w_{kj} = \eta \delta_k y_j = \eta (t_k - z_k) f'(net_k) y_j
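• A sketch of this learning rule in NumPy (not from the text), assuming a tanh activation so that f'(net) = 1 - f(net)^2; the array names are illustrative:

import numpy as np

def f_prime(net):                               # derivative of the assumed tanh activation
    return 1.0 - np.tanh(net) ** 2

def hidden_to_output_update(y, net_k, z, t, eta):
    # y: hidden outputs (n_H,), net_k / z: output nets and outputs (c,), t: targets (c,)
    delta_k = (t - z) * f_prime(net_k)          # sensitivities δ_k = (t_k - z_k) f'(net_k)
    dW_kj = eta * np.outer(delta_k, y)          # Δw_kj = η δ_k y_j, shape (c, n_H)
    return delta_k, dW_kj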
• Error on the input-to-hidden units
\frac{\partial J}{\partial w_{ji}} = \frac{\partial J}{\partial y_j} \cdot \frac{\partial y_j}{\partial net_j} \cdot \frac{\partial net_j}{\partial w_{ji}}
However,
\frac{\partial J}{\partial y_j} = \frac{\partial}{\partial y_j} \left[ \frac{1}{2} \sum_{k=1}^{c} (t_k - z_k)^2 \right] = -\sum_{k=1}^{c} (t_k - z_k) \frac{\partial z_k}{\partial y_j} = -\sum_{k=1}^{c} (t_k - z_k) \frac{\partial z_k}{\partial net_k} \cdot \frac{\partial net_k}{\partial y_j} = -\sum_{k=1}^{c} (t_k - z_k) f'(net_k) w_{kj}
Similarly as in the preceding case, we define the sensitivity of a hidden unit:
\delta_j \equiv f'(net_j) \sum_{k=1}^{c} w_{kj} \delta_k
which means that: “The sensitivity at a hidden unit is simply the sum of the individual sensitivities at the output units weighted by the hidden-to-output weights w_kj, all multiplied by f'(net_j)”
Conclusion: The learning rule for the input-to-hidden
weights is:
\Delta w_{ji} = \eta x_i \delta_j = \eta \left[ \sum_{k=1}^{c} w_{kj} \delta_k \right] f'(net_j) x_i
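• Continuing the same illustrative sketch (not from the text), the hidden-unit sensitivities and the input-to-hidden update:

import numpy as np

def f_prime(net):                                    # derivative of the assumed tanh activation
    return 1.0 - np.tanh(net) ** 2

def input_to_hidden_update(x, net_j, W_kj, delta_k, eta):
    # x: input (d,), net_j: hidden nets (n_H,), W_kj: hidden-to-output weights (c, n_H)
    delta_j = f_prime(net_j) * (W_kj.T @ delta_k)    # δ_j = f'(net_j) Σ_k w_kj δ_k
    dW_ji = eta * np.outer(delta_j, x)               # Δw_ji = η δ_j x_i, shape (n_H, d)
    return delta_j, dW_ji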
• Starting with a pseudo-random weight configuration, the
stochastic backpropagation algorithm can be written as:
Begin  initialize n_H, w, criterion θ, η, m ← 0
  do  m ← m + 1
      x^m ← randomly chosen pattern
      w_ji ← w_ji + η δ_j x_i ;  w_kj ← w_kj + η δ_k y_j
  until  ||∇J(w)|| < θ
  return w
End
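• A runnable sketch of this stochastic loop (illustrative, not the book's code), with tanh units trained on the XOR patterns; n_H, η, θ, the weight-initialization scale and the stopping test (total training error instead of ||∇J(w)||) are all arbitrary choices:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])   # training patterns
T = np.array([[-1.], [1.], [1.], [-1.]])                     # XOR targets
d, n_H, c, eta, theta = 2, 3, 1, 0.1, 0.01

W1 = rng.normal(0, 0.5, (n_H, d)); b1 = rng.normal(0, 0.5, n_H)   # input-to-hidden
W2 = rng.normal(0, 0.5, (c, n_H)); b2 = rng.normal(0, 0.5, c)     # hidden-to-output

for m in range(100000):
    p = rng.integers(len(X))                     # x^m <- randomly chosen pattern
    x, t = X[p], T[p]
    net_j = W1 @ x + b1; y = np.tanh(net_j)      # feedforward, hidden layer
    net_k = W2 @ y + b2; z = np.tanh(net_k)      # feedforward, output layer
    delta_k = (t - z) * (1.0 - z ** 2)           # output sensitivities δ_k
    delta_j = (1.0 - y ** 2) * (W2.T @ delta_k)  # hidden sensitivities δ_j
    W2 += eta * np.outer(delta_k, y); b2 += eta * delta_k   # w_kj <- w_kj + η δ_k y_j
    W1 += eta * np.outer(delta_j, x); b1 += eta * delta_j   # w_ji <- w_ji + η δ_j x_i
    Z = np.tanh(np.tanh(X @ W1.T + b1) @ W2.T + b2)         # outputs on all patterns
    if 0.5 * np.sum((T - Z) ** 2) < theta:       # simple stand-in stopping test
        break

print(np.round(Z, 2))   # should approximate the XOR targets (a different seed may be needed)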
• Stopping criterion
• The algorithm terminates when the change in the criterion function J(w) is smaller than some preset value θ
• There are other stopping criteria that lead to better performance
than this one
• So far, we have considered the error on a single pattern, but we
want to consider an error defined over the entirety of patterns in
the training set
• The total training error is the sum over the errors of n individual
patterns
J = \sum_{p=1}^{n} J_p \qquad (2)
• Stopping criterion (cont.)
• A weight update may reduce the error on the single pattern
being presented but can increase the error on the full training
set
• However, given a large number of such individual updates,
the total error of equation (2) decreases
• Learning Curves
• Before training starts, the error on the training set is high; through
the learning process, the error becomes smaller
• The error per pattern depends on the amount of training data and
the expressive power (such as the number of weights) in the
network
• The average error on an independent test set is always higher
than on the training set, and it can decrease as well as increase
• A validation set is used in order to decide when to stop training; we do not want to overfit the network and decrease the generalization power of the classifier
“we stop training at a minimum of the error on the validation set”
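• As a schematic illustration of that rule (not from the text): record the validation error per epoch during training and keep the weights from its minimum; the error values below are made up:

# per-epoch validation errors and weight snapshots recorded during training (made-up values)
val_errors = [0.40, 0.31, 0.25, 0.22, 0.21, 0.23, 0.27, 0.33]
snapshots = ["weights_after_epoch_%d" % e for e in range(len(val_errors))]

best_epoch = min(range(len(val_errors)), key=val_errors.__getitem__)   # minimum of validation error
print("stop at epoch", best_epoch, "and keep", snapshots[best_epoch])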
EXERCISES
• Exercise #1.
Explain why an MLP (multilayer perceptron) does not learn if the initial weights and biases are all zeros
• Exercise #2. (#2 p. 344)