BACK PROPAGATION
NETWORK
Back propagation network (BPN)
A network trained with the back propagation learning
algorithm (BPLA).
BPLA is one of the most important developments in neural
networks.
BPLA is applied to multilayer feed-forward networks
consisting of processing elements with continuous,
differentiable activation functions.
BPN is used to classify input patterns correctly.
The weight update algorithm is based on the gradient
descent method (see the update rule below).
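For reference, the generic gradient descent update that underlies the algorithm (the standard form, not specific to these slides) is

$$ w \leftarrow w - \alpha \frac{\partial E}{\partial w}, $$

where $E$ is the error measure and $\alpha$ is the learning rate.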
Back propagation network (BPN)
The error is propagated back to the hidden units.
Aims to achieve a balance between the network's ability to
respond correctly to the training inputs and its ability to give
reasonable responses to inputs that are similar but not identical
to the training inputs.
Training stages in a BPN:
Feed-forward generation of the output for the input pattern
Calculation and back propagation of the error
Updating of the weights.
Architecture
BPN is a multilayer feed-forward neural network consisting of
Input layer
Hidden layer and
Output layer
During back propagation of the error, the signals are sent in the
reverse direction.
The inputs and outputs of a BPN may be binary or bipolar.
The activation function can be any function that is monotonically
increasing and differentiable.
Architecture (Contd.)
Notations
x – input training vector
t – target output vector
α - learning rate
v_0j – bias on the jth hidden unit
w_0k – bias on the kth output unit
z_j – hidden unit j
z_inj – net input to z_j
y_k – output unit k
Notations (Contd.)
δ_k – error-correction weight adjustment for w_jk (see the formulas below)
δ_j – error-correction weight adjustment for v_ij
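For reference, the standard feed-forward and error-correction equations of the BP algorithm, written in the notation above (where $f$ denotes the activation function and $y_{ink}$ the net input to output unit $y_k$), are

$$ z_{inj} = v_{0j} + \sum_i x_i v_{ij}, \qquad z_j = f(z_{inj}), $$
$$ y_{ink} = w_{0k} + \sum_j z_j w_{jk}, \qquad y_k = f(y_{ink}), $$
$$ \delta_k = (t_k - y_k)\, f'(y_{ink}), \qquad \delta_{inj} = \sum_k \delta_k w_{jk}, \qquad \delta_j = \delta_{inj}\, f'(z_{inj}), $$
$$ \Delta w_{jk} = \alpha\, \delta_k z_j, \qquad \Delta v_{ij} = \alpha\, \delta_j x_i. $$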
Commonly used activation functions (defined below):
Binary sigmoidal function
Bipolar sigmoidal function
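For reference, the usual definitions of these two functions and their derivatives, with steepness parameter $\lambda$ (commonly taken as 1), are

$$ f_{bin}(x) = \frac{1}{1 + e^{-\lambda x}}, \qquad f_{bin}'(x) = \lambda\, f_{bin}(x)\,[1 - f_{bin}(x)], $$
$$ f_{bip}(x) = \frac{2}{1 + e^{-\lambda x}} - 1, \qquad f_{bip}'(x) = \frac{\lambda}{2}\,[1 + f_{bip}(x)]\,[1 - f_{bip}(x)]. $$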
Properties of the activation function to be used in BPN:
Continuity
Differentiability
Nondecreasing monotonicity
Training patterns
Incremental (pattern-by-pattern) weight updating:
Weights are changed immediately after each training pattern
is presented.
Batch-mode training:
Weights are changed only after all the training patterns have
been presented.
Requires additional storage for each connection to accumulate
the individual weight changes.
Which approach is more effective depends on the problem (a small
sketch contrasting the two schedules follows below).
BPN – Equivalent to optimal Bayesian discriminant function
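As a rough illustration of the two update schedules, here is a minimal sketch. It is not taken from the slides: it uses a single linear unit trained by gradient descent rather than a full BPN, and the training set X, targets T and learning rate alpha below are made-up values.

import numpy as np

# Made-up toy data: three 2-dimensional patterns and their targets.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([1.0, 0.0, 1.0])
alpha = 0.1   # assumed learning rate for this illustration

def incremental_epoch(w):
    # Incremental: weights change immediately after each pattern.
    for x, t in zip(X, T):
        y = x @ w
        w = w + alpha * (t - y) * x
    return w

def batch_epoch(w):
    # Batch mode: changes are accumulated in delta_w (the extra storage
    # per connection) and applied only after all patterns are presented.
    delta_w = np.zeros_like(w)
    for x, t in zip(X, T):
        y = x @ w
        delta_w += alpha * (t - y) * x
    return w + delta_w

w_inc = incremental_epoch(np.zeros(2))
w_bat = batch_epoch(np.zeros(2))
print("after one epoch:", w_inc, w_bat)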
BP learning algorithm
The BP learning algorithm converges and finds proper weights for
the network, given enough learning, only if the relation between
the input and output training patterns is deterministic and the
error surface is deterministic; in practice these conditions are
rarely met exactly.
BPN is therefore a special case of stochastic approximation.
The randomness of the algorithm helps it get out of local optima.
Factors affecting the BPN
The training and convergence of a BPN depend on the choice of
various parameters, such as
Initial weights
Learning rate
Update rule
Size and nature of training set
Architecture (i.e., number of layers and number of neurons per layer)
Factors affecting the BPN
Initial weights
Initialized to random values.
The choice of initial weights determines how fast the network converges.
They cannot be very large, since the sigmoidal activation functions used
here may saturate and the system may get stuck in a local optimum.
Method 1: the initial weights can be initialized in the range
$\left[-\frac{3}{\sqrt{o_i}},\ \frac{3}{\sqrt{o_i}}\right]$,
where $o_i$ is the number of processing elements $j$ that feed forward to
processing element $i$ (its fan-in).
Factors affecting the BPN
Method 2: Nguyen–Widrow initialization (see the sketch below)
This method leads to faster convergence of the network.
The concept is based on a geometric analysis of the response of the
hidden neurons to the input space.
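A minimal sketch of the standard Nguyen–Widrow recipe for the input-to-hidden weights: draw weights uniformly in [-0.5, 0.5], rescale each hidden unit's weight vector to length beta = 0.7 * p^(1/n) (p hidden units, n inputs), and draw biases in [-beta, beta]. The layer sizes used below are made-up for illustration.

import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, rng=np.random.default_rng(0)):
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)          # scale factor
    v = rng.uniform(-0.5, 0.5, size=(n_inputs, n_hidden))
    # rescale each hidden unit's weight vector (one column) to length beta
    v = beta * v / np.linalg.norm(v, axis=0, keepdims=True)
    v0 = rng.uniform(-beta, beta, size=n_hidden)        # hidden biases
    return v, v0

v, v0 = nguyen_widrow_init(n_inputs=2, n_hidden=4)
print(np.linalg.norm(v, axis=0))   # each column now has norm beta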
Factors affecting the BPN
Learning rate:
Affects the convergence of the network.
Larger values
speed up convergence but may result in overshooting the minimum;
they lead to rapid learning but can cause oscillation of the weights.
Smaller values have the opposite effect (slower but more stable learning).
Typical range: $10^{-3}$ to $10$.
Factors affecting the BPN
Momentum
A very efficient and commonly used method that allows a larger
learning rate without oscillations is to add a momentum term to the
normal weight update rule.
The momentum factor is commonly assigned a value of 0.9.
Momentum can be used with pattern-by-pattern updating or with
batch-mode updating.
With pattern-by-pattern updating, the momentum factor carries some
useful information from earlier weight updates forward.
It helps in faster convergence.
Factors affecting the BPN
Weight update formula (with momentum)
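A standard way of writing the weight update with momentum (using $\eta$ for the momentum factor, with $\alpha$, $\delta_k$, $\delta_j$, $z_j$ and $x_i$ as in the notation above) is

$$ w_{jk}(t+1) = w_{jk}(t) + \alpha\, \delta_k z_j + \eta\,[\,w_{jk}(t) - w_{jk}(t-1)\,], $$
$$ v_{ij}(t+1) = v_{ij}(t) + \alpha\, \delta_j x_i + \eta\,[\,v_{ij}(t) - v_{ij}(t-1)\,], $$

where $\eta$ is the momentum factor (commonly 0.9, as noted above).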
Factors affecting the BPN
Generalization
A network is said to generalize when it sensibly interpolates on new
input patterns.
Over-fitting or over-training:
The network learns the training data well but does not generalize well
when there are too many trainable parameters for the amount of training
data available.
Making small changes in the input space of a pattern without changing
the output components can improve the ability of the network to
generalize to a test data set.
Smaller networks are preferred, since a network with a large number of
nodes is capable of memorizing the training set rather than generalizing
from it.
Factors affecting the BPN
Number of training data, T
The training data should be sufficient and proper.
The training data should cover the entire expected input space, and
during training the training-vector pairs should be selected randomly
from the set.
Suppose the input space is linearly separable into L disjoint regions,
and let T be the lower bound on the number of training patterns.
If a proper value of T is chosen such that T/L >> 1, then the network
is able to discriminate the pattern classes using a fine piecewise
hyperplane partitioning.
Factors affecting the BPN
Number of hidden layer nodes
If there is more than one hidden layer in a BPN, the calculations
performed for a single layer are repeated for the other layers and
summed up at the end.
For a network of reasonable size, the number of hidden nodes should be
a relatively small fraction of the size of the input layer.
For example:
If the network does not converge to a solution, it may need more hidden
nodes.
Conversely, if the network does converge, the user may try fewer hidden
nodes and then settle on a final size based on overall system
performance.
Example
Input pattern: [0, 1]. Target output: t = 1. Learning rate: α.
Example (Contd.)
Initial weights:
[v_11  v_21  v_01] = [0.6  -0.1  0.3]
[v_12  v_22  v_02] = [-0.3  0.4  0.5]
[w_1  w_2  w_0] = [0.4  0.1  -0.2]
Activation function used:
Example (Contd.)
Calculate the net inputs to the hidden units:
For $z_1$: $z_{in1} = v_{01} + x_1 v_{11} + x_2 v_{21}$
For $z_2$: $z_{in2} = v_{02} + x_1 v_{12} + x_2 v_{22}$
Example (Contd.)
Applying the activation function: $z_1 = f(z_{in1})$, $z_2 = f(z_{in2})$.
Calculate the net input to the output layer: $y_{in} = w_0 + z_1 w_1 + z_2 w_2$.
Example (Contd.)
Applying the activation function, we get $y = f(y_{in})$.
Compute the error term using $\delta_k = (t_k - y_k)\, f'(y_{ink})$.
Now, for the single output unit ($k = 1$), $\delta = (t - y)\, f'(y_{in})$.
Example (Contd.)
Therefore, $\delta$ can be evaluated from the values computed above.
Change in weights between the hidden and output layer:
$\Delta w_j = \alpha\, \delta\, z_j$ ($j = 1, 2$) and $\Delta w_0 = \alpha\, \delta$.
Example (Contd.)
Calculate the error term between the input and hidden layer using
$\delta_{inj} = \sum_k \delta_k w_{jk}$ and $\delta_j = \delta_{inj}\, f'(z_{inj})$.
Here $m = 1$ (a single output unit) and $j = 1$ to $2$.
Therefore, $\delta_{in1} = \delta\, w_1$ and $\delta_{in2} = \delta\, w_2$.
Example (Contd.)
Now, $\delta_1 = \delta_{in1}\, f'(z_{in1})$.
Example (Contd.)
Now, $\delta_2 = \delta_{in2}\, f'(z_{in2})$.
Example (Contd.)
Calculate the change in weights between the input and hidden layer:
$\Delta v_{ij} = \alpha\, \delta_j x_i$ and $\Delta v_{0j} = \alpha\, \delta_j$.
Example (Contd.)
The final weights are calculated as
$v_{ij}(\text{new}) = v_{ij}(\text{old}) + \Delta v_{ij}$, $v_{0j}(\text{new}) = v_{0j}(\text{old}) + \Delta v_{0j}$,
$w_j(\text{new}) = w_j(\text{old}) + \Delta w_j$, $w_0(\text{new}) = w_0(\text{old}) + \Delta w_0$.
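As a minimal sketch, the following code runs this worked example end to end. Two details are assumed because the slides above do not fix them: the activation function is taken to be the binary sigmoid and the learning rate is set to alpha = 0.25; the input pattern, target and initial weights come from the example.

import numpy as np

def f(x):                      # assumed activation: binary sigmoid
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(x):                # its derivative, f(x) * (1 - f(x))
    fx = f(x)
    return fx * (1.0 - fx)

x = np.array([0.0, 1.0])       # input pattern from the example
t = 1.0                        # target output
alpha = 0.25                   # assumed learning rate (not given above)

# initial weights from the example: v[i, j] connects x_i to z_j
v  = np.array([[0.6, -0.3],
               [-0.1, 0.4]])
v0 = np.array([0.3, 0.5])      # hidden biases v_01, v_02
w  = np.array([0.4, 0.1])      # hidden-to-output weights w_1, w_2
w0 = -0.2                      # output bias w_0

# feed-forward phase
z_in = v0 + x @ v              # net inputs to the hidden units
z = f(z_in)
y_in = w0 + z @ w              # net input to the output unit
y = f(y_in)

# back propagation of the error
delta_k = (t - y) * f_prime(y_in)      # output error term
delta_in = delta_k * w                 # delta_in1, delta_in2
delta_j = delta_in * f_prime(z_in)     # hidden error terms

# weight updates
w_new  = w  + alpha * delta_k * z
w0_new = w0 + alpha * delta_k
v_new  = v  + alpha * np.outer(x, delta_j)
v0_new = v0 + alpha * delta_j

print("y =", round(y, 4), "delta_k =", round(delta_k, 4))
print("new w:", w_new, "new w0:", w0_new)
print("new v:\n", v_new, "\nnew v0:", v0_new)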