Machine Learning using Neural Networks
Presentation by:
C. Vinoth Kumar
SSN College of Engineering
[Diagram: overview relating Physics, SA, EC, Computational Intelligence, ANN, decision trees (DTree) and Machine Learning]
The Family Tree
Computational Intelligence
Fuzzy Logic: Rough Sets, Grey Model, FSA, State Space
Neural Networks: SOM, MLPN, KNN, SVM, DTree, LPM, Fourier, Wavelets
Evolutionary Computation: GA, ES, GP, EP, SA, SWARM, ANT
Artificial Neural Networks
Forecast time series
Control robots
Pattern recognition
Noise removal
Digit recognition
Personal identification
Optimise portfolios
Data mining
Learning
Learning is a fundamental and essential characteristic of
biological neural networks.
The ease with which they can learn led to attempts to
emulate a biological neural network in a computer.
3 main types of learning
Supervised learning
– learning with a teacher
Unsupervised learning
– learning from patterns
Reinforcement learning
– learning through experience
Machine Learning
Machine learning involves adaptive mechanisms that
enable computers to learn from experience, learn by
example and learn by analogy. Learning capabilities can
improve the performance of an intelligent system over
time.
The most popular approaches to machine learning are
artificial neural networks and genetic algorithms.
Biological neural network
A neural network can be defined as a model of
reasoning based on the human brain. The brain consists
of a densely interconnected set of nerve cells, or basic
information-processing units, called neurons.
The human brain incorporates nearly 10 billion neurons
and 60 trillion connections, synapses, between them. By
using multiple neurons simultaneously, the brain can
perform its functions much faster than the fastest
computers in existence today.
Each neuron has a very simple structure, but an army of
such elements constitutes tremendous processing power.
Biological neural network
[Diagram: two biological neurons, each with soma, dendrites, axon and axon hillock, connected to one another through synapses]
Biological neural network
A neuron consists of a cell body, soma, a number of
fibres called dendrites, and a single long fibre called the
axon.
Our brain can be considered as a highly complex, non-linear
and parallel information-processing system.
Information is stored and processed in a neural network
simultaneously throughout the whole network, rather
than at specific locations. In other words, in neural
networks, both data and its processing are global rather
than local.
Artificial Neural Networks
An artificial neural network consists of a number of very
simple processors, also called neurons, which are
analogous to the biological neurons in the brain.
The neurons are connected by weighted links passing
signals from one neuron to another.
The output signal is transmitted through the neuron’s
outgoing connection. The outgoing connection splits into
a number of branches that transmit the same signal.
The outgoing branches terminate at the incoming
connections of other neurons in the network.
Architecture of an ANN
[Diagram: a multilayer ANN; input signals enter the input layer, pass through the middle layer, and output signals leave the output layer]
Analogy between biological and
artificial neural networks
Biological Neural Network Artificial Neural Network
Soma Neuron
Dendrite Input
Axon Output
Synapse Weight
Neuron - A simple computing element
[Diagram: input signals x1, x2, …, xn, with weights w1, w2, …, wn, feed a single neuron, which transmits the same output signal Y on all of its outgoing branches]
Neuron - A simple computing element
The neuron computes the weighted sum of the input
signals and compares the result with a threshold value, θ.
If the net input is less than the threshold, the neuron
output is –1. But if the net input is greater than or equal to
the threshold, the neuron becomes activated and its
output attains a value +1.
The neuron uses the following transfer or activation
function:
X = ∑(i=1 to n) xi wi

Y = +1, if X ≥ θ
    −1, if X < θ
This type of activation function is called a sign function.
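The weighted-sum-and-threshold computation above can be sketched in Python (the input and weight values below are illustrative assumptions, not taken from the slides):

```python
def sign_neuron(inputs, weights, theta):
    """Compute the weighted sum X of the inputs and apply the
    sign activation function with threshold theta."""
    X = sum(x * w for x, w in zip(inputs, weights))
    return +1 if X >= theta else -1

# Illustrative values: X = 1*0.5 + (-1)*0.3 = 0.2 >= 0, so Y = +1
Y = sign_neuron([1, -1], [0.5, 0.3], theta=0)
```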
Activation functions of a neuron
[Graphs: step, sign, sigmoid and linear activation functions, each plotting Y against X between −1 and +1]

Y_step    = 1, if X ≥ 0;  0, if X < 0
Y_sign    = +1, if X ≥ 0; −1, if X < 0
Y_sigmoid = 1 / (1 + e^(−X))
Y_linear  = X
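The four activation functions can be written directly from their definitions; a minimal sketch:

```python
import math

def step(X):
    # Hard limiter with outputs 0 and 1
    return 1 if X >= 0 else 0

def sign(X):
    # Hard limiter with outputs -1 and +1
    return 1 if X >= 0 else -1

def sigmoid(X):
    # Smooth, differentiable squashing function with values in (0, 1)
    return 1 / (1 + math.exp(-X))

def linear(X):
    # Identity: output equals the net input
    return X
```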
Perceptron
In 1958, Frank Rosenblatt introduced the first algorithm
for training a simple ANN: the perceptron.
The perceptron is the simplest form of a neural network.
It consists of a single neuron with adjustable synaptic
weights and a hard limiter.
The weighted sum of the inputs is applied to the hard
limiter, which produces an output equal to +1 if its input
is positive and −1 if it is negative.
Single-layer two-input Perceptron
[Diagram: inputs x1 and x2, weighted by w1 and w2, feed a linear combiner ∑; a hard limiter with threshold θ produces the output Y]
The aim of the perceptron is to classify inputs,
x1, x2, . . ., xn, into one of two classes, say
A1 and A2.
In the case of an elementary perceptron, the n-
dimensional space is divided by a hyperplane into two
decision regions. The hyperplane is defined by the
linearly separable function:
∑(i=1 to n) xi wi − θ = 0
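Which class a point belongs to follows from which side of this hyperplane it falls on; a minimal sketch (the weight and threshold values below are illustrative assumptions):

```python
def classify(inputs, weights, theta):
    """Assign a point to class 'A1' if it lies on or above the
    decision hyperplane sum(xi*wi) - theta = 0, else to 'A2'."""
    X = sum(x * w for x, w in zip(inputs, weights))
    return "A1" if X - theta >= 0 else "A2"

# With assumed weights (0.5, 0.5) and theta = 0.7:
# point (1, 1) gives 1.0 - 0.7 >= 0 -> 'A1'; point (1, 0) -> 'A2'
```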
Linear separability in the Perceptron
[Graphs: (a) a two-input perceptron, whose decision boundary x1w1 + x2w2 − θ = 0 is a line separating Class A1 from Class A2 in the (x1, x2) plane; (b) a three-input perceptron, whose boundary x1w1 + x2w2 + x3w3 − θ = 0 is a plane in (x1, x2, x3) space]
How does the perceptron learn its classification tasks?
This is done by making small adjustments in the
weights to reduce the difference between the actual and
desired outputs of the perceptron. The initial weights
are randomly assigned, usually in the range [−0.5, 0.5],
and then updated to obtain the output consistent with
the training examples.
If at iteration p, the actual output is Y(p) and the desired
output is Yd (p), then the error is given by:
e(p) = Yd(p) - Y(p) where p = 1, 2, 3, . . .
If the error, e(p), is positive, we need to increase
perceptron output Y(p), but if it is negative, we need to
decrease Y(p).
The perceptron learning rule
wi ( p + 1) = wi ( p) + α ⋅ xi ( p) ⋅ e( p)
where p = 1, 2, 3, . . .
α is the learning rate, a positive constant less
than unity.
The perceptron learning rule was first proposed by
Rosenblatt in 1960. Using this rule we can derive the
perceptron training algorithm for classification tasks.
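A single application of this rule can be traced by hand; the weight, input and error values below are illustrative assumptions, with α = 0.1:

```python
alpha = 0.1
weights = [0.2, -0.1]   # current weights w1(p), w2(p)
inputs = [1, 1]         # current inputs x1(p), x2(p)
error = 1               # e(p) = Yd(p) - Y(p)

# wi(p+1) = wi(p) + alpha * xi(p) * e(p)
weights = [w + alpha * x * error for w, x in zip(weights, inputs)]
# weights is now [0.3, 0.0], up to floating-point rounding
```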
Perceptron’s training algorithm
Step 1 Initialisation: Set initial weights w1, w2,…, wn and
threshold θ to random numbers in the range [−0.5,
0.5].
Step 2 Activation: Activate the perceptron by applying
inputs x1(p), x2(p),…, xn(p) and desired output Yd (p).
Calculate the actual output at iteration p = 1.
Step 3 Weight training: Update the weights of the
perceptron based on the error, using the delta rule and
the weight-update equation.
Step 4 Iteration: Increase iteration p by one, go back to
Step 2 and repeat the process until convergence.
Example of perceptron learning: the logical operation AND
Epoch  Inputs   Desired    Initial    Actual   Error   Final
       x1  x2   output Yd  weights    output     e     weights
                           w1    w2     Y              w1    w2
1 0 0 0 0.3 −0.1 0 0 0.3 −0.1
0 1 0 0.3 −0.1 0 0 0.3 −0.1
1 0 0 0.3 −0.1 1 −1 0.2 −0.1
1 1 1 0.2 −0.1 0 1 0.3 0.0
2 0 0 0 0.3 0.0 0 0 0.3 0.0
0 1 0 0.3 0.0 0 0 0.3 0.0
1 0 0 0.3 0.0 1 −1 0.2 0.0
1 1 1 0.2 0.0 1 0 0.2 0.0
3 0 0 0 0.2 0.0 0 0 0.2 0.0
0 1 0 0.2 0.0 0 0 0.2 0.0
1 0 0 0.2 0.0 1 −1 0.1 0.0
1 1 1 0.1 0.0 0 1 0.2 0.1
4 0 0 0 0.2 0.1 0 0 0.2 0.1
0 1 0 0.2 0.1 0 0 0.2 0.1
1 0 0 0.2 0.1 1 −1 0.1 0.1
1 1 1 0.1 0.1 1 0 0.1 0.1
5 0 0 0 0.1 0.1 0 0 0.1 0.1
0 1 0 0.1 0.1 0 0 0.1 0.1
1 0 0 0.1 0.1 0 0 0.1 0.1
1 1 1 0.1 0.1 1 0 0.1 0.1
Threshold: θ = 0.2; learning rate: α = 0.1
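The epoch-by-epoch trace above can be reproduced with a short script; this is a sketch of the training algorithm with a step activation, rounding the weights after each update so the decimal arithmetic stays exact:

```python
def train_perceptron(samples, weights, theta=0.2, alpha=0.1, max_epochs=10):
    """Train a single perceptron with a step activation on
    (inputs, desired-output) samples until an error-free epoch."""
    for epoch in range(max_epochs):
        converged = True
        for inputs, desired in samples:
            X = sum(x * w for x, w in zip(inputs, weights))
            actual = 1 if X >= theta else 0   # step activation
            error = desired - actual          # e(p) = Yd(p) - Y(p)
            if error != 0:
                converged = False
                # wi(p+1) = wi(p) + alpha * xi(p) * e(p)
                weights = [round(w + alpha * x * error, 6)
                           for w, x in zip(weights, inputs)]
        if converged:
            break
    return weights

# Logical AND training set, with the initial weights from the table
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
final = train_perceptron(AND, weights=[0.3, -0.1])
# final is [0.1, 0.1], matching the last row of the table
```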