Module - 4
Bayesian Learning:
• Introduction to Probability-based Learning, Fundamentals of
Bayes Theorem, Classification Using Bayes Model, Naïve
Bayes Algorithm for Continuous Attributes.
Artificial Neural Networks:
• Introduction, Biological Neurons, Artificial Neurons,
Perceptron and Learning Theory, Types of Artificial Neural
Networks, Popular Applications of Artificial Neural Networks,
Advantages and Disadvantages of ANN, Challenges of ANN.
Chapter-10 (10.1-10.5, 10.9-10.11)
Artificial Neural Network
(ANN)
Prof.Dr.Babu Rao K,
MITE
Overview of ANN
• Artificial Neural Networks (ANNs) imitate human brain behavior and the way in which learning happens in a human.
• The human brain consists of a mass of neurons that are all connected as a network, which is essentially a directed graph.
• These neurons are the processing units that receive information, process it and then transmit it to other neurons, which allows humans to learn almost any task.
• An ANN is a learning mechanism that models the human brain to solve non-linear and complex problems.
Introduction
• The human nervous system has billions of neurons, the processing units that enable humans to perceive things, to hear, to see and to smell.
• The human nervous system works beautifully, making us understand who we are, what we do, where we are and everything in our surroundings.
• It makes us remember, recognize and correlate things around us.
• It is a learning system that consists of functional units called nerve cells, typically known as neurons.
• The human nervous system is divided into two sections, called the Central Nervous System (CNS) and the Peripheral Nervous System (PNS).
Biological Neurons
• A typical biological neuron has four parts, called dendrites, soma, axon and synapse.
• The body of the neuron is called the soma.
• Dendrites accept the input information, which is then processed in the cell body, the soma.
Artificial Neurons
• Artificial neurons are modelled on biological neurons and are also called nodes.
• A node or neuron can receive one or more input signals and process them.
• Artificial neurons or nodes are connected to one another by connection links.
• Each connection link is associated with a synaptic weight.
• The structure of a single neuron is shown in Figure 10.2.
Simple Model of an Artificial Neuron
• The first mathematical model of a biological neuron was designed by McCulloch & Pitts in 1943.
• It works in two steps:
1. It receives weighted inputs from other neurons.
2. It operates with a threshold function or activation function.
• The received inputs are computed as a weighted sum, which is passed to the activation function; if the sum exceeds the threshold value, the neuron fires. The mathematical model of a neuron is shown in the figure.
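As a concrete illustration, here is a minimal Python sketch of the McCulloch–Pitts model just described; the unit weights and threshold of 2 are illustrative values chosen to realize the Boolean AND function, not values from the text.

```python
# McCulloch-Pitts neuron: a weighted sum of the inputs followed by a
# hard threshold. The neuron fires (outputs 1) only when the sum
# reaches the threshold theta.
def mcculloch_pitts(inputs, weights, theta):
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= theta else 0

# Illustrative values: two inputs with unit weights and threshold 2
# realize the Boolean AND function.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcculloch_pitts([x1, x2], [1, 1], theta=2))
```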
Artificial Neural Network Structure
• An Artificial Neural Network (ANN) imitates a human brain, which exhibits some intelligence.
• It has a network structure represented as a directed graph with a set of neuron nodes and connection links or edges connecting the nodes.
• The nodes in the graph are arranged in a layered manner and can process information in parallel.
Activation Functions
• Activation functions are mathematical functions associated with each neuron in the neural network that map input signals to output signals.
• An activation function decides whether to fire a neuron or not based on the input signals the neuron receives.
• These functions normalize the output value of each neuron, either between 0 and 1 or between -1 and +1.
• Typical activation functions can be linear or non-linear.
• Linear functions are useful when the input values can be classified into one of two groups and are generally used in binary perceptrons.
• Non-linear functions, on the other hand, are continuous functions that map the input into the range (0, 1) or (-1, 1), etc. These functions are useful for learning high-dimensional or complex data such as audio, video and images.
Below are some of the activation functions used in ANNs.
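For reference, a minimal Python sketch of a few standard activation functions (binary step, sigmoid, tanh and ReLU); these are the usual textbook definitions, though the exact set of functions pictured in the slides may differ.

```python
import math

# Binary step: fires (1) when the input is non-negative, else 0.
def step(x):
    return 1 if x >= 0 else 0

# Sigmoid (logistic): squashes any real input into the range (0, 1).
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hyperbolic tangent: squashes any real input into the range (-1, 1).
def tanh(x):
    return math.tanh(x)

# ReLU: passes positive values through, clips negatives to 0.
def relu(x):
    return max(0.0, x)

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  step={step(x)}  sigmoid={sigmoid(x):.3f}  "
          f"tanh={tanh(x):+.3f}  relu={relu(x):.1f}")
```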
Perceptron and Learning Theory
• The first neural network model, the 'Perceptron', designed by Frank Rosenblatt in 1958, is a linear binary classifier used for supervised learning.
• Rosenblatt modified the McCulloch & Pitts neuron model by combining two concepts: the McCulloch–Pitts model of an artificial neuron and the Hebbian learning rule for adjusting weights.
• The perceptron model consists of four parts:
1. inputs from other neurons,
2. weights and bias,
3. net sum,
4. activation function.
Perceptron Algorithm – Steps
• Set the initial weights w1, w2, …, wn and bias θ to random values in the range [-0.5, 0.5].
• For each epoch:
1. Compute the weighted sum by multiplying the inputs with the weights and adding the products.
2. Apply the activation function to the weighted sum: Y = Step((x1w1 + x2w2) - θ)
3. If the sum is above the threshold value, output the value as positive; otherwise output the value as negative.
4. Calculate the error by subtracting the estimated output Yestimated from the desired output Ydesired: error e(t) = Ydesired - Yestimated
[If the error e(t) is positive, increase the perceptron output Y; if it is negative, decrease the perceptron output Y.]
5. Update the weights if there is an error: ∆wi = α * e(t) * xi
wi = wi + ∆wi
where xi is the input value, e(t) is the error at step t, α is the learning rate and ∆wi is the weight change that has to be added to wi.
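A minimal Python sketch of this training loop, assuming the bias θ stays fixed (only the weights are updated, as in step 5); the function name train_perceptron and the max_epochs cap are illustrative.

```python
def train_perceptron(samples, weights, theta, alpha, max_epochs=10):
    """samples: list of (inputs, desired_output) pairs."""
    for epoch in range(1, max_epochs + 1):
        total_error = 0
        for inputs, y_desired in samples:
            # Steps 1-3: weighted sum, then step activation.
            weighted_sum = sum(x * w for x, w in zip(inputs, weights))
            y_estimated = 1 if weighted_sum - theta >= 0 else 0
            # Step 4: error = desired - estimated.
            error = y_desired - y_estimated
            total_error += abs(error)
            # Step 5: change each weight by alpha * error * input.
            weights = [w + alpha * error * x
                       for w, x in zip(weights, inputs)]
        print(f"epoch {epoch}: weights = {weights}")
        if total_error == 0:   # no errors this epoch: converged
            return weights
    return weights
```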
Example: Consider a perceptron to represent the Boolean function AND with the initial weights w1 = 0.3, w2 = -0.2, learning rate α = 0.2 and bias θ = 0.4.
• The activation function used here is the step function f(x), which gives binary output values, i.e., 0 or 1.
• If the value of f(x) is greater than or equal to 0, it outputs 1; otherwise it outputs 0.
• Design a perceptron that performs the Boolean function AND and update the weights until the Boolean function gives the desired output.
Solution: The desired output for the Boolean function AND is:
x1  x2  |  Ydes
0   0   |  0
0   1   |  0
1   0   |  0
1   1   |  1
• For each epoch, the weighted sum is calculated and the activation function is applied to compute the estimated output Yest. Then Yest is compared with Ydes to find the error. If there is an error, the weights are updated.
• Epochs 1–4: [per-epoch weight-update tables shown as figures in the slides; each epoch evaluates all four input pairs, computes Yest, and updates the weights on every error, until epoch 4 produces no errors]
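Running the train_perceptron sketch from the algorithm section on the AND data reproduces these tables; assuming the bias θ stays fixed at 0.4 (only the weights are updated, as in step 5), the expected per-epoch trace is shown in the comments below.

```python
# Usage of the train_perceptron sketch above on the AND example.
and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
final_weights = train_perceptron(and_samples,
                                 weights=[0.3, -0.2],
                                 theta=0.4, alpha=0.2)
# Expected trace (with theta fixed at 0.4):
#   epoch 1: weights = [0.5, 0.0]
#   epoch 2: weights = [0.5, 0.2]
#   epoch 3: weights = [0.3, 0.2]
#   epoch 4: weights = [0.3, 0.2]   <- no errors, training stops
```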
Delta Learning Rule and Gradient Descent:
• Generally, learning in neural networks is performed by adjusting the network weights in order to minimize the difference between the desired and estimated outputs.
• This delta difference is measured by an error function, also called a cost function.
• The cost function, being continuous, is differentiable.
• This way of learning, called the delta rule, is a type of back propagation applied for training the network.
• The training error of a hypothesis is half the squared difference between the desired target output and the actual output, and is given as follows:
E(w) = ½ Σd∈D (td - od)²
where D is the set of training examples, td is the desired target output and od is the actual output for training example d.
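As a sketch of how the delta rule uses gradient descent, the following trains a single linear unit by repeatedly stepping the weights in the negative direction of the gradient of E; the data, learning rate and epoch count are illustrative.

```python
# Delta rule for one linear unit with a single input: o = w0 + w1*x.
# Cost: E = 1/2 * sum((t - o)^2); gradients dE/dw0 = -sum(t - o) and
# dE/dw1 = -sum((t - o) * x).
def gradient_descent(data, w0, w1, alpha=0.05, epochs=100):
    for _ in range(epochs):
        grad_w0 = grad_w1 = 0.0
        for x, t in data:
            o = w0 + w1 * x            # linear unit output
            grad_w0 += -(t - o)        # dE/dw0
            grad_w1 += -(t - o) * x    # dE/dw1
        w0 -= alpha * grad_w0          # step against the gradient
        w1 -= alpha * grad_w1
    return w0, w1

# Illustrative data drawn from t = 2x + 1; the learned weights
# should approach w0 = 1, w1 = 2.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]
print(gradient_descent(data, w0=0.0, w1=0.0))
```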
• The principle of gradient descent is an optimization approach used to minimize the cost function by converging to a local minimum, moving in the negative direction of the gradient; each step size during the movement is determined by the learning rate and the slope of the gradient.
• Gradient descent learning is the foundation of the back propagation algorithm used in MLPs.
• Before we study MLPs, let us first understand the different types of neural networks, which differ in their structure, activation function and learning mechanism.
Types of Artificial Neural Network
• ANNs consist of multiple neurons arranged in layers.
• There are different types of ANNs that differ in network structure, the activation functions involved and the learning rules used.
• An ANN has three kinds of layers, called the input layer, hidden layer and output layer.
• Any general ANN consists of one input layer, one output layer and zero or more hidden layers.
1. Feed Forward Neural Network (FFNN)
2. Fully Connected Neural Network (FCNN)
3. Multi-Layer Perceptron (MLP)
4. Feedback Neural Network (FBNN)
Feed Forward Neural Network (FFNN):
• This is the simplest neural network: it consists of neurons arranged in layers, and information is propagated only in the forward direction.
• This model may or may not contain a hidden layer, and there is no back propagation.
• Based on the number of hidden layers, these networks are further classified into single-layer and multi-layer feed forward networks.
• These ANNs are simple to design and easy to maintain.
• They are fast but cannot be used for complex learning.
• They are used for simple classification, simple image processing, etc.
Fully Connected Neural Network (FCNN):
• Fully connected neural networks are those in which all the neurons in a layer are connected to all the neurons in the next layer.
• The model of a fully connected neural network is shown below.
Multi-Layer Perceptron (MLP):
• This ANN consists of multiple layers, with one input layer, one output layer and one or more hidden layers.
• Every neuron in a layer is connected to all neurons in the next layer, so the layers are fully connected.
• Information flows in both directions.
• In the forward direction, the inputs are multiplied by the neuron weights and passed to the activation function of the neuron, and the output is forwarded to the next layer.
• If the output is incorrect, then in the backward direction the error is back propagated to adjust the weights and biases so as to get the correct output.
• This type of ANN is used in deep learning for complex classification, speech recognition, medical diagnosis, forecasting, etc.
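A minimal numpy sketch of an MLP with one hidden layer trained by back propagation on the XOR problem (which a single-layer network cannot solve); the layer sizes, learning rate and squared-error loss here are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative task: XOR, which needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer with 4 units; weights start small and random.
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

alpha = 0.5
for epoch in range(5000):
    # Forward pass: inputs -> hidden layer -> output layer.
    H = sigmoid(X @ W1 + b1)
    O = sigmoid(H @ W2 + b2)
    # Backward pass: propagate the error, adjust weights and biases.
    dO = (O - T) * O * (1 - O)        # output-layer delta
    dH = (dO @ W2.T) * H * (1 - H)    # hidden-layer delta
    W2 -= alpha * H.T @ dO; b2 -= alpha * dO.sum(axis=0)
    W1 -= alpha * X.T @ dH; b1 -= alpha * dH.sum(axis=0)

print(np.round(O.ravel(), 2))  # typically close to [0, 1, 1, 0]
```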
Feedback Neural Network (FBNN):
• Feedback neural networks have feedback connections between the neurons, which allow information to flow in both directions in the network.
• The output signal can be sent back to the neurons in the same layer or to the neurons in the preceding layers.
• Hence, this network is more dynamic during training.
Popular Applications of Artificial Neural Networks
• Real-time applications: face recognition, emotion detection, self-driving cars, navigation systems, routing systems, target tracking, vehicle scheduling, etc.
• Business applications: stock trading, sales forecasting, customer behaviour modelling, market research and analysis, etc.
• Banking and finance: credit and loan forecasting, fraud and risk evaluation, etc.
• Education: adaptive learning software, etc.
• Healthcare: medical diagnosis, drug discovery, etc.
• Other engineering applications: robotics, aerospace, electronics, manufacturing, etc.
Advantages and Disadvantages of ANN
Advantages of ANN:
• ANNs can solve complex problems involving non-linear processes.
• ANNs can learn and recognize complex patterns and solve problems the way humans do.
• ANNs have a parallel processing capability and can produce predictions in less time.
Limitations of ANN:
• ANNs require processors with parallel processing capability, which makes them hardware dependent.
• Modelling with ANNs is also extremely complicated, and development takes a much longer time.
• ANNs require more data than traditional machine learning algorithms.
• ANNs are more computationally expensive than traditional learning techniques.
Challenges of Artificial Neural Network
• The major challenges while modelling a real-time application with an ANN are:
1. Training a neural network is the most challenging part of using this technique. Overfitting or underfitting issues may arise if the datasets used for training are not correct. Moreover, neural network models normally need a lot of training data to be robust and usable in a real-time application.
2. Finding the weight and bias parameters for a neural network is also hard, and it is difficult to arrive at an optimal model.