INTELLIGENT CONTROL OF DRIVES

Sub. Code: MEE203    No. of Credits: 03 = 03:0:0 (L - T - P)    Lecture Hours/Week: 03

Introduction to neural networks: Introduction – Biological
neurons – Artificial neurons – activation function – learning
rules – feed forward networks – supervised learning –
perceptron networks – adaline – madaline – back propagation
networks – learning factors – linear separability – Hopfield
network – discrete Hopfield networks
Introduction to Neural Networks
• Neural networks are a set of algorithms, modeled loosely after the human brain,
that are designed to recognize patterns. They interpret sensory data through a
kind of machine perception, labeling, and clustering of raw input. Neural
networks are a subset of machine learning, and they are the backbone of deep
learning algorithms. Neural networks are used in various fields, including
image recognition, speech processing, and natural language processing.

Biological Neurons
• The concept of neural networks is inspired by the human brain, which consists
of billions of cells called neurons. Each neuron is connected to others through
synapses and transmits electrical signals. These neurons receive signals from
the sensory organs or other neurons, process them, and send outputs to other
neurons. The primary components of a biological neuron include:
BIOLOGICAL NEURON STRUCTURE AND FUNCTIONS
STRUCTURE AND FUNCTIONS OF ARTIFICIAL NEURON
STATE THE MAJOR DIFFERENCES BETWEEN BIOLOGICAL AND ARTIFICIAL NEURAL NETWORKS
Structure
BNN: Made of neurons, dendrites, axons, and synapses in the brain and
nervous system
ANN: Made of artificial neurons (nodes) arranged in layers (input,
hidden, output)
Signal Type
BNN: Uses electrochemical signals for communication
ANN: Uses numerical values and mathematical functions
Learning Process
BNN: Learning through synaptic plasticity, experience, and adaptation
ANN: Learning via algorithms like backpropagation and optimization
methods
Complexity
BNN: Extremely complex with billions of neurons and trillions of
connections
ANN: Simpler and limited by architecture and computing power
Energy Efficiency
BNN: Highly energy efficient, operates on low power
ANN: Typically requires much more power and resources
Adaptability
BNN: Highly adaptable, capable of self-repair and plasticity
ANN: Limited adaptability, depends on training data and architecture
Speed
BNN: Processes information in parallel but slower on single signals
ANN: Can process data very fast using parallel computing, but
simplified compared to biology
Purpose
BNN: Controls biological functions and cognition
ANN: Designed to perform specific tasks like pattern recognition,
classification, etc
Physical Medium
BNN: Made of organic tissues
ANN: Implemented in software and hardware (computers, GPUs)
Size
BNN: The human brain contains about 86 billion neurons and more than 100 trillion synapses
(connections).
ANN: The number of “neurons” in artificial networks is far smaller.

• Dendrites: Receive signals from other neurons.
• Cell body (Soma): Integrates the received signals.
• Axon: Transmits the processed signal to other neurons.
• Synapses: The junctions where signals are transmitted between neurons.
• In an artificial neural network, a similar structure is used, where the signal is
passed through artificial neurons (also called nodes or units).
BRIEFLY EXPLAIN THE BASIC BUILDING
BLOCKS OF ARTIFICIAL NEURAL NETWORKS
Processing of ANN depends upon the following three building
blocks:
1. Network Topology
2. Adjustments of Weights or Learning
3. Activation Functions
1. Network Topology: A network topology is the arrangement
of a network along with its nodes and connecting lines.
According to the topology, ANN can be classified as the
following kinds:
A. Feed forward Network
• Single layer feed forward network
• Multilayer feed forward network
B. Feedback Network
• Recurrent networks: They are feedback networks with closed loops.
Following are the two types of recurrent networks.
• Fully recurrent network: It is the simplest neural network architecture
because all nodes are connected to all other nodes and each node
works as both input and output.
• Jordan network − It is a closed-loop network in which the output
goes back to the input again as feedback.
Artificial Neurons
• Artificial neurons are mathematical functions that simulate the behavior of
biological neurons. They receive inputs, process them using an activation
function, and produce an output. An artificial neuron has the following
components:
• Inputs (x₁, x₂, ..., xn): These are the values fed into the neuron, typically
representing features of the data.
• Weights (w₁, w₂, ..., wn): These determine the importance of each input. The
weights are adjusted during the training process.
• Bias (b): A constant added to the weighted sum of inputs to shift the activation
function.
• Activation Function (f): Determines whether the neuron should "fire" or
produce an output. The output is typically a transformed version of the weighted
sum of inputs.
• The output of an artificial neuron can be represented mathematically as:
output = f(w₁x₁ + w₂x₂ + ... + wₙxₙ + b)
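As a minimal sketch (assuming Python with NumPy; the inputs, weights, bias, and sigmoid activation below are illustrative values, not taken from the text), the same computation looks like this:

import numpy as np

def artificial_neuron(x, w, b, activation):
    # Compute the neuron output: activation(w·x + b)
    net = np.dot(w, x) + b          # weighted sum of inputs plus the bias
    return activation(net)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
x = np.array([0.5, 1.0, -0.3])      # inputs x1, x2, x3
w = np.array([0.4, -0.2, 0.7])      # weights w1, w2, w3
b = 0.1                             # bias
print(artificial_neuron(x, w, b, sigmoid))   # output lies between 0 and 1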
Activation Function

• The activation function is a critical component of neural networks. It
introduces non-linearity to the model, allowing neural networks to
learn complex patterns. Common activation functions include:
• Sigmoid Function: σ(x) = 1 / (1 + e^(−x)). The sigmoid function
outputs values between 0 and 1 and is commonly used for binary
classification.
• Hyperbolic Tangent (tanh): tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)).
This function outputs values between −1 and 1.
• ReLU (Rectified Linear Unit): ReLU(x) = max(0, x). It is widely used
because of its simplicity and effectiveness in training deep networks.
• Softmax: Often used in multi-class classification problems to output
probabilities.
SOME COMMON ACTIVATION FUNCTIONS
INCLUDE THE FOLLOWING
1. The sigmoid function has a smooth gradient and outputs
values between zero and one. For very high or low values
of the input parameters, the network can be very slow to
reach a prediction, called the vanishing gradient problem.
2. The TanH function is zero-centered, making it easier to
model inputs that are strongly negative, strongly positive, or
neutral.
3. The ReLU function is highly computationally efficient but is
not able to process inputs that are zero or negative.
4. The Leaky ReLU function has a small positive slope in its
negative area, enabling it to process zero or negative values.
5. The Parametric ReLU function allows the negative slope to be
learned, performing backpropagation to learn the most effective
slope for zero and negative input values.
6. Softmax is a special activation function used for output
neurons. It normalizes outputs for each class between 0 and 1,
and returns the probability that the input belongs to a specific
class.
7. Swish is a newer activation function discovered by Google
researchers. It performs better than ReLU with a similar level of
computational efficiency.
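The activation functions listed above can be written compactly in code. The sketch below assumes Python/NumPy; the Leaky ReLU slope of 0.01 and the function names are illustrative choices.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))        # outputs in (0, 1)

def tanh(z):
    return np.tanh(z)                      # outputs in (-1, 1), zero-centered

def relu(z):
    return np.maximum(0.0, z)              # zero for negative inputs

def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)   # small positive slope in the negative area

def softmax(z):
    e = np.exp(z - np.max(z))              # shift for numerical stability
    return e / e.sum()                     # class probabilities that sum to 1

def swish(z):
    return z * sigmoid(z)                  # Swish: x times sigmoid(x)

z = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(z), softmax(z))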
APPLICATIONS OF ANN
1. Data Mining: Discovery of meaningful patterns (knowledge) from
large volumes of data.
2. Expert Systems: A computer program for decision making that
simulates thought process of a human expert.
3. Fuzzy Logic: Theory of approximate reasoning.
4. Artificial Life: Evolutionary Computation, Swarm Intelligence.
5. Artificial Immune System: A computer program based on the
biological immune system.
6. Medical: At the moment, the research is mostly on modelling parts
of the human body and recognizing diseases from various scans (e.g.
cardiograms, CAT scans, ultrasonic scans, etc.).Neural networks are
ideal in recognizing diseases using scans since there is no need to
provide a specific algorithm on how to identify the disease. Neural
networks learn by example so the details of how to recognize the
disease are not needed. What is needed is a set of examples that are
representative of all the variations of the disease. The quantity of
examples is not as important as the quality. The examples need to be
selected very carefully if the system is to perform reliably and
efficiently.
• 7. Computer Science: Researchers in quest of artificial intelligence
have created spin offs like dynamic programming, object oriented
programming, symbolic programming, intelligent storage
management systems and many more such tools. The primary goal of
creating an artificial intelligence still remains a distant dream but
people are getting an idea of the ultimate path, which could lead to it.
• 8. Aviation: Airlines use expert systems in planes to monitor
atmospheric conditions and system status. The plane can be put on
autopilot once a course is set for the destination.
• 9. Weather Forecast: Neural networks are used for predicting
weather conditions. Previous data is fed to a neural network, which
learns the pattern and uses that knowledge to predict weather
patterns.
• 10. Neural Networks in business: Business is a diverse field with
several general areas of specialization, such as accounting or financial
analysis. Almost any neural network application would fit into one
business area or financial analysis.
• 11. There is some potential for using neural networks for business
purposes, including resource allocation and scheduling.
• 12. There is also a strong potential for using neural networks for
database mining, which is, searching for patterns implicit within the
explicitly stored information in databases. Most of the funded work in
this area is classified as proprietary. Thus, it is not possible to report
on the full extent of the work going on. Most work is applying neural
networks, such as the Hopfield-Tank network for optimization and
scheduling.
• 13. Marketing: There is a marketing application which has been
integrated with a neural network system. The Airline Marketing
Tactician (a trademark abbreviated as AMT) is a computer system
made of various intelligent technologies including expert systems. A
feed forward neural network is integrated with the AMT and was
trained using back-propagation to assist the marketing control of
airline seat allocations. The adaptive neural approach was amenable
to rule expression. Additionally, the application's environment
changed rapidly and constantly, which required a continuously
adaptive solution.
• 14. Credit Evaluation: The HNC company, founded by Robert Hecht-
Nielsen, has developed several neural network applications. One of
them is the Credit Scoring system which increases the profitability of
the existing model up to 27%. The HNC neural systems were also
applied to mortgage screening. A neural network automated
mortgage insurance underwriting system was developed by the
Nestor Company. This system was trained with 5048 applications of
which 2597 were certified. The data related to property and borrower
qualifications. In a conservative mode the system agreed with the
underwriters on 97% of the cases. In the liberal mode the system agreed
on 84% of the cases. This system ran on an Apollo DN3000 and used
250K of memory while processing a case file in approximately 1 sec.
ADVANTAGES OF ANN
1. Adaptive learning: An ability to learn how to do tasks based on the
data given for training or initial experience.
2. Self-Organisation: An ANN can create its own organisation or
representation of the information it receives during learning time.
3. Real Time Operation: ANN computations may be carried out in
parallel, and special hardware devices are being designed and
manufactured which take advantage of this capability.
4. Pattern recognition: is a powerful technique for harnessing the
information in the data and generalizing about it. Neural nets learn to
recognize the patterns which exist in the data set.
5. The system is developed through learning rather than programming.
Neural nets teach themselves the patterns in the data, freeing the
analyst for more interesting work.
6. Neural networks are flexible in a changing environment. Although
neural networks may take some time to learn a sudden drastic change,
they are excellent at adapting to constantly changing information.
7. Neural networks can build informative models whenever conventional
approaches fail. Because neural networks can handle very complex
interactions they can easily model data which is too difficult to model
with traditional approaches such as inferential statistics or programming
logic.
8. Performance of neural networks is at least as good as classical
statistical modelling, and better on most problems. The neural networks
build models that are more reflective of the structure of the data in
significantly less time.
LIMITATIONS OF ANN
• In this technological era every system has merits and demerits; in
other words, every system has limitations, and these make ANN
technology weak on some points. The various limitations of ANN are:
• 1) ANN is not a daily life general purpose problem solver.
• 2) There is no structured methodology available in ANN.
• 3) There is no single standardized paradigm for ANN development.
• 4) The Output Quality of an ANN may be unpredictable.
• 5) Many ANN systems do not describe how they solve problems.
• 6) Black-box nature.
• 7) Greater computational burden.
• 8) Proneness to overfitting.
• 9) Empirical nature of model development.
MCCULLOCH-PITTS MODEL
• In 1943 the neurophysiologist Warren McCulloch and the logician Walter Pitts
published the first paper describing what we would call a neural
network.
The McCulloch-Pitts neural model is also known as a linear threshold
gate. It is a neuron with a set of inputs I1, I2, I3, …, Im and one output y.
The linear threshold gate simply classifies the set of inputs into two
different classes, so the output y is binary. Such a function can be
described mathematically using these equations:
Sum = Σ Ii Wi (for i = 1 to m)
y = f(Sum) = 1 if Sum ≥ T, and 0 otherwise (T is the threshold)
• Where W1, W2, W3, … are weight values, normalized in the range of
either (0, 1) or (−1, 1), and associated with each input line.
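Because the model only needs weights and a threshold, it is easy to sketch in code. The example below (Python; the weight and threshold values are the usual textbook choices, added here for illustration) realizes a few Boolean functions with a McCulloch-Pitts linear threshold gate.

def mp_neuron(inputs, weights, threshold):
    # McCulloch-Pitts linear threshold gate: binary output y
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

AND = lambda x1, x2: mp_neuron([x1, x2], [1, 1], threshold=2)
OR  = lambda x1, x2: mp_neuron([x1, x2], [1, 1], threshold=1)
NOT = lambda x: mp_neuron([x], [-1], threshold=0)
NOR = lambda x1, x2: NOT(OR(x1, x2))

for a in (0, 1):
    for b in (0, 1):
        print(a, b, AND(a, b), OR(a, b), NOR(a, b))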
Assignment -1
• Boolean function
• AND function
• OR function
• NOR function
• NOT function
WHAT ARE THE LEARNING RULES IN
ANN?
• 1. Hebbian learning rule – It identifies, how to modify the weights of
nodes of a network.
• 2. Perceptron learning rule – Network starts its learning by assigning a
random value to each weight.
• 3. Delta learning rule – Modification in the synaptic weight of a node is
equal to the multiplication of error and the input.
• 4. Correlation learning rule – The correlation rule is the supervised
learning.
• 5. Outstar learning rule – We can use it when we assume that nodes
or neurons in a network are arranged in a layer.
1. Hebbian Learning Rule: The Hebbian rule was the first learning rule.
In 1949 Donald Hebb developed it as a learning algorithm for
unsupervised neural networks.

2. Perceptron Learning Rule: Each connection in a neural network has
an associated weight, which changes in the course of learning.
It is an example of supervised learning; the network starts
its learning by assigning a random value to each weight.
3. Delta Learning Rule: Developed by Widrow and Hoff, the
delta rule is one of the most common learning rules. It
depends on supervised learning. This rule states that the
modification in the synaptic weight of a node is equal to the
multiplication of error and the input.
4. Correlation Learning Rule: The correlation learning rule is
based on a similar principle as the Hebbian learning rule. It
assumes that weights between responding neurons should be
more positive, and weights between neurons with opposite
reactions should be more negative. Contrary to the Hebbian
rule, the correlation rule is supervised learning: instead of the
actual response oj, the desired response dj is used for
the weight-change calculation.
5. Outstar Learning Rule: We use the outstar learning rule
when we assume that the nodes or neurons in a network are
arranged in a layer. Here the weights connected to a certain node
should be equal to the desired outputs for the neurons
connected through those weights. The outstar rule produces
the desired response t for the layer of n nodes. This type of
learning is applied to all nodes in a particular layer, and the
weights for the nodes are updated as in Kohonen neural networks.
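For reference, the weight-update expressions behind the first three rules can be sketched as below (Python/NumPy; the learning rate α = 0.1 and the variable names are illustrative assumptions).

import numpy as np

alpha = 0.1                                  # learning rate

def hebbian_update(w, x, y):
    # Hebbian rule (unsupervised): strengthen weights when input and output are active together
    return w + alpha * y * x

def perceptron_update(w, x, t, y):
    # Perceptron rule (supervised): correct the weights only when the prediction y is wrong
    return w + alpha * (t - y) * x

def delta_update(w, x, t, y_in):
    # Delta (Widrow-Hoff) rule: change equals learning rate times error times input
    return w + alpha * (t - y_in) * x

w = np.zeros(3)
x = np.array([1.0, -1.0, 1.0])
print(delta_update(w, x, t=1.0, y_in=0.2))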
BRIEFLY EXPLAIN THE ADALINE MODEL
OF ANN.
• ADALINE (Adaptive Linear Neuron or later Adaptive Linear
Element) is an early single-layer artificial neural network and
the name of the physical device that implemented this
network. The network uses memistors. It was developed by
Professor Bernard Widrow and his graduate student Ted Hoff
at Stanford University in 1960. It is based on the McCulloch–
Pitts neuron. It consists of a weight, a bias and a summation
function.
• The difference between Adaline and the standard (McCulloch–
Pitts) perceptron is that in the learning phase, the weights are
adjusted according to the weighted sum of the inputs (the net).
In the standard perceptron, the net is passed to the activation
(transfer) function and the function's output is used for
adjusting the weights.
Some important points about Adaline are as follows:
• It uses bipolar activation function.
• It uses delta rule for training to minimize the Mean-Squared
Error (MSE) between the actual output and the desired/target
output.
• The weights and the bias are adjustable.
Architecture of ADALINE:
• Adaline is similar to perceptron having an extra feedback loop with
the help of which the actual output is compared with the
desired/target output. After comparison on the basis of training
algorithm, the weights and bias will be updated
Architecture of ADALINE:
Training Algorithm of ADALINE:
• Step 1 − Initialize the following to start the training: weights, bias, and
learning rate α. For easy calculation and simplicity, weights and bias
must be set equal to 0 and the learning rate must be set equal to 1.
• Step 2 − Continue step 3-8 when the stopping condition is not true.
• Step 3 − Continue step 4-6 for every bipolar training pair s : t.
• Step 4 − Activate each input unit as follows:
xi = si (for i = 1 to n)
• Step 5 − Obtain the net input with the following relation:
yin = b + Σi xi wi (for i = 1 to n)
Here ‘b’ is bias and ‘n’ is the total number of input neurons.
• Step 6 − Apply the following activation function to obtain the final
output:
y = f(yin) = 1 if yin ≥ 0, and −1 if yin < 0
• Step 7 − Adjust the weight and bias as follows:
wi(new) = wi(old) + α(t − yin)xi
b(new) = b(old) + α(t − yin)
• Here ‘y’ is the actual output and ‘t’ is the desired/target output.
(t − yin) is the computed error.
• Step 8 − Test for the stopping condition, which will happen when
there is no change in weight or when the highest weight change that
occurred during training is smaller than the specified tolerance.
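The steps above translate almost line-for-line into code. A minimal sketch (assuming bipolar inputs and targets; the learning rate of 0.1 and the AND example at the end are illustrative, not from the text) is given below.

import numpy as np

def train_adaline(S, T, alpha=0.1, tol=1e-4, max_epochs=100):
    # ADALINE trained with the delta rule on bipolar training pairs (s, t)
    w = np.zeros(S.shape[1])                 # Step 1: weights and bias start at 0
    b = 0.0
    for _ in range(max_epochs):              # Step 2: repeat until the stopping condition
        max_change = 0.0
        for x, t in zip(S, T):               # Steps 3-4: one bipolar training pair at a time
            y_in = b + np.dot(x, w)          # Step 5: net input
            dw = alpha * (t - y_in) * x      # Step 7: delta-rule adjustment of the weights
            db = alpha * (t - y_in)          # ... and of the bias
            w += dw
            b += db
            max_change = max(max_change, np.max(np.abs(dw)), abs(db))
        if max_change < tol:                 # Step 8: stop when the largest change is tiny
            break
    return w, b

def adaline_output(x, w, b):
    return 1 if b + np.dot(x, w) >= 0 else -1   # Step 6: bipolar activation

S = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])   # illustrative: bipolar AND function
T = np.array([1, -1, -1, -1])
w, b = train_adaline(S, T)
print([adaline_output(x, w, b) for x in S])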
EXPLAIN MULTIPLE ADAPTIVE LINEAR
NEURONS (MADALINE).
Madaline is a network which consists of many Adalines in
parallel. It will have a single output unit. Three different
training algorithms for MADALINE networks called Rule I, Rule
II and Rule III have been suggested, which cannot be learned
using backpropagation.
The first of these dates back to 1962 and cannot adapt the
weights of the hidden-output connection.[10]
The second training algorithm improved on Rule I and was
described in 1988.[8]
The third "Rule" applied to a modified network with sigmoid
activations instead of signum; it was later found to be equivalent
to backpropagation.
The Rule II training algorithm is based on a principle called
"minimal disturbance".
It proceeds by looping over training examples, then for each
example, it:
• finds the hidden layer unit (ADALINE classifier) with the lowest
confidence in its prediction, tentatively flips the sign of the unit,
• accepts or rejects the change based on whether the network's
error is reduced,
• stops when the error is zero.
Some important points about
Madaline are as follows:
• It is just like a multilayer perceptron, where Adaline will act as
a hidden unit between the input and the Madaline layer.
• The weights and the bias between the input and Adaline
layers, as we see in the Adaline architecture, are adjustable.
• The Adaline and Madaline layers have fixed weights and bias
of 1.
• Training can be done with the help of the Delta rule.
ARCHITECTURE OF MADALINE
Training Algorithm of MADALINE
• we know that only the weights and bias between the input
and the Adaline layer are to be adjusted, and the weights
and bias between the Adaline and the Madaline layer are
fixed.
Step 1 − Initialize the following to start the training:
• Weights
• Bias
• Learning rate α
For easy calculation and simplicity, weights and bias must be
set equal to 0 and the learning rate must be set equal to 1.
Step 2 − Continue step 3-8 when the stopping condition is not
true.
Step 3 − Continue step 4-6 for every bipolar training pair s:t.
Step 4 − Activate each input unit as follows:
xi = si (for i = 1 to n)
Step 5 − Obtain the net input at each hidden layer, i.e. the
Adaline layer, with the following relation:
Qinj = bj + Σi xi wij (for i = 1 to n)
Here ‘b’ is bias and ‘n’ is the total number of input neurons.
Step 6 − Apply the following activation function to obtain the
final output at the Adaline and the Madaline layer:
f(x) = 1 if x ≥ 0, and −1 if x < 0
• Output at the hidden Adaline unit: Qj = f(Qinj)
• Final output of the network: y = f(yin)
Step 7 − Calculate the error and adjust the weights as follows –
• Case 1 − if y ≠ t and t = 1 then,
wij(new) = wij(old)+α(1−Qinj)xi
bj(new) = bj(old)+α(1−Qinj)
In this case, the weights would be updated on Qj where the net
input is close to 0 because t = 1.
• Case 2 − if y ≠ t and t = -1 then
wik(new) = wik(old)+α(−1−Qink)xi
bk(new) = bk(old)+α(−1−Qink)
In this case, the weights would be updated on Qk where the
net input is positive because t = -1. Here ‘y’ is the actual
output and ‘t’ is the desired/target output.
• Case 3 − if y = t, then there would be no change in weights.

Step 8 − Test for the stopping condition, which will happen
when there is no change in weight or when the highest weight
change that occurred during training is smaller than the specified
tolerance.
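A compact sketch of this training procedure is given below (assumptions: bipolar inputs and targets, hidden Adalines feeding a fixed OR-style output unit, and a random start; the function names and the XOR demo are illustrative, and convergence is not guaranteed for every random initialization).

import numpy as np

def sign(z):
    return np.where(z >= 0, 1, -1)

def train_madaline(X, T, n_hidden=2, alpha=0.5, epochs=200, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.uniform(-0.5, 0.5, (X.shape[1], n_hidden))   # adjustable input-to-Adaline weights
    b = rng.uniform(-0.5, 0.5, n_hidden)                 # adjustable Adaline biases
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, T):
            Q_in = b + x @ W                              # net input of each hidden Adaline
            y = 1 if np.any(sign(Q_in) == 1) else -1      # fixed OR-style Madaline output
            if y == t:
                continue                                  # Case 3: correct, no weight change
            errors += 1
            if t == 1:
                # Case 1: update the Adaline whose net input is closest to zero, toward +1
                j = int(np.argmin(np.abs(Q_in)))
                W[:, j] += alpha * (1 - Q_in[j]) * x
                b[j] += alpha * (1 - Q_in[j])
            else:
                # Case 2: update every Adaline with non-negative net input, toward -1
                for k in np.flatnonzero(Q_in >= 0):
                    W[:, k] += alpha * (-1 - Q_in[k]) * x
                    b[k] += alpha * (-1 - Q_in[k])
        if errors == 0:                                   # stop once every pattern is correct
            break
    return W, b

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])   # illustrative: bipolar XOR patterns
T = np.array([-1, 1, 1, -1])
W, b = train_madaline(X, T)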
WHAT IS A PERCEPTRON?
• A perceptron is a binary classification algorithm modeled after
the functioning of the human brain—it was intended to emulate
the neuron. The perceptron, while it has a simple structure, has
the ability to learn.
What is Multilayer Perceptron?

• A multilayer perceptron (MLP) is a group of perceptrons,
organized in multiple layers, that can accurately answer
complex questions. Each perceptron in the first layer (on the
left) sends signals to all the perceptrons in the second layer,
and so on. An MLP contains an input layer, at least one hidden
layer, and an output layer.
The perceptron learns as follows:
1. Takes the inputs, which are fed into the perceptrons in the
input layer, multiplies them by their weights, and computes
the sum.
2. Adds the number one, multiplied by a “bias weight”. This
is a technical step that makes it possible to move the output
function of each perceptron (the activation function) up,
down, left and right on the number graph.
3. Feeds the sum through the activation function—in a
simple perceptron system, the activation function is a step
function.
4. The result of the step function is the output.
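As a concrete illustration of these steps, the sketch below pushes an input through a small two-layer arrangement of step-function perceptrons (Python/NumPy; the weights, bias weights, and layer sizes are made-up illustrative values).

import numpy as np

def step(z):
    return np.where(z >= 0, 1, 0)                # step activation of the classic perceptron

def perceptron_layer(x, W, b):
    # Each row of W holds one perceptron's weights; b holds its bias weights (times +1)
    return step(W @ x + b)

x  = np.array([1.0, 0.0, 1.0])                   # three inputs
W1 = np.array([[ 0.5, -0.4,  0.3],
               [-0.6,  0.9,  0.2]])              # two perceptrons in the first layer
b1 = np.array([-0.2, 0.1])
W2 = np.array([[0.7, 0.8]])                      # one output perceptron
b2 = np.array([-0.5])

hidden = perceptron_layer(x, W1, b1)             # every input feeds every first-layer perceptron
output = perceptron_layer(hidden, W2, b2)        # first-layer outputs feed the output perceptron
print(hidden, output)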
A multilayer perceptron is quite similar to a modern neural
network.
By adding a few ingredients, the perceptron architecture
becomes a full-fledged deep learning system:
• Activation functions and other hyperparameters: a full
neural network uses a variety of activation functions which
output real values, not boolean values like in the classic
perceptron.
• It is more flexible in terms of other details of the learning
process, such as the number of training iterations (iterations
and epochs), weight initialization schemes, regularization,
and so on. All these can be tuned as hyperparameters.
• Backpropagation: a full neural network uses the
backpropagation algorithm, to perform iterative backward
passes which try to find the optimal values of perceptron
weights, to generate the most accurate prediction.
• Advanced architectures: full neural networks can have a
variety of architectures that can help solve specific problems.
A few examples are Recurrent Neural Networks (RNN),
Convolutional Neural Networks (CNN), and Generative
Adversarial Networks (GAN).
WHAT IS BACKPROPAGATION AND WHY IS IT
IMPORTANT?
After a neural network is defined with initial weights, and a
forward pass is performed to generate the initial prediction,
there is an error function which defines how far away the
model is from the true prediction.
There are many possible algorithms that can minimize the
error function—for example, one could do a brute force
search to find the weights that generate the smallest error.
However, for large neural networks, a training algorithm is
needed that is very computationally efficient.
• Backpropagation is that algorithm—it can discover the
optimal weights relatively quickly, even for a network with
millions of weights
HOW BACKPROPAGATION WORKS?
1. Forward pass—weights are initialized and inputs from the
training set are fed into the network. The forward pass is
carried out and the model generates its initial prediction.
2. Error function—the error function is computed by
checking how far away the prediction is from the known
true value.
3. Backpropagation with gradient descent—the algorithm
calculates how much the output values are affected by each of
the weights in the model.
To do this, it calculates partial derivatives, going back from the
error function to a specific neuron and its weight.
This provides complete traceability from total errors, back to a
specific weight which contributed to that error.
The result of backpropagation is a set of weights that minimize
the error function
4. Weight update—weights can be updated after every sample
in the training set, but this is usually not practical.
• Typically, a batch of samples is run in one big forward pass, and then
backpropagation performed on the aggregate result.
• The batch size and number of batches used in training, called
iterations, are important hyperparameters that are tuned to get the
best results.
• Running the entire training set through the backpropagation process
is called an epoch.
Training algorithm of BPNN:
1. Inputs X arrive through the preconnected path.
2. Input is modeled using real weights W. The weights are
usually randomly selected.
3. Calculate the output for every neuron from the input layer,
to the hidden layers, to the output layer.
4. Calculate the error in the outputs: Error = Actual Output −
Desired Output.
5. Travel back from the output layer to the hidden layer to
adjust the weights such that the error is decreased. Keep
repeating the process until the desired output is achieved.
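The five steps above can be seen end-to-end in a small sketch: a network with one sigmoid hidden layer trained on XOR by gradient descent (Python/NumPy; the layer sizes, learning rate, and iteration count are illustrative choices, not from the text).

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # training inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # desired XOR outputs

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # Step 2: weights selected randomly
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)
lr = 0.5

for epoch in range(5000):
    # Steps 1 and 3: forward pass, layer by layer
    H = sigmoid(X @ W1 + b1)              # hidden-layer outputs
    Y = sigmoid(H @ W2 + b2)              # output-layer predictions
    # Step 4: error between actual and desired outputs
    error = Y - T
    # Step 5: travel back, computing partial derivatives and adjusting the weights
    dY = error * Y * (1 - Y)              # gradient at the output (sigmoid derivative)
    dH = (dY @ W2.T) * H * (1 - H)        # gradient propagated back to the hidden layer
    W2 -= lr * H.T @ dY;  b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH;  b1 -= lr * dH.sum(axis=0)

print(np.round(Y, 3))   # predictions should approach [0, 1, 1, 0]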
Architecture of back propagation
network:
ELECTRIC LOAD FORECASTING USING ANN

• ANNs were first applied to load forecasting in the late 1980s.
• ANNs have good performance in data classification and
function fitting. Some examples of utilizing ANN in power
system applications are: Load forecasting, fault classification,
power system assessment, real time harmonic evaluation,
power factor correction, load scheduling, design of
transmission lines, and power system planning.
• Load forecast has been an attractive research topic for many
decades and in many countries all over the world, especially in
fast-developing countries with a higher load growth rate. Load
forecasts can generally be classified into four categories based on
the forecasting time horizon (typically very short-term, short-term,
medium-term, and long-term).
Learning Factors
• Learning factors refer to variables or conditions that influence how
effectively a neural network can learn. These include:
• Learning Rate: Controls the step size in weight updates.
• Training Data: Quality, size, and diversity of input data.
• Network Architecture: Number of layers, neurons, activation
functions.
• Weight Initialization: Can affect convergence.
• Epochs and Iterations: Number of passes over the data.
• Loss Function: Measures the error and guides optimization.
• Regularization: Prevents overfitting (e.g., L1, L2).
• Optimization Algorithm: Such as SGD, Adam, RMSprop.
Linear Separability
• Linear separability is a property of a dataset:
• A dataset is linearly separable if a single straight line (2D), plane
(3D), or hyperplane (nD) can separate the data points of different
classes.
• Classic example: The AND and OR functions are linearly
separable, but XOR is not.
• Importance: Perceptrons can only solve problems that are
linearly separable.
• Multi-layer networks (like MLPs) can handle non-linearly
separable problems.
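This can be checked experimentally: the perceptron learning rule converges on the linearly separable AND and OR functions but keeps making errors on XOR, as in the sketch below (Python/NumPy; the learning rate and epoch count are illustrative).

import numpy as np

def train_perceptron(X, T, alpha=0.2, epochs=50):
    # Single perceptron with step activation; returns the number of errors in the last epoch
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, t in zip(X, T):
            y = 1 if np.dot(w, x) + b >= 0 else 0
            if y != t:
                w += alpha * (t - y) * x       # perceptron learning rule
                b += alpha * (t - y)
                errors += 1
        if errors == 0:
            break
    return errors

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(train_perceptron(X, np.array([0, 0, 0, 1])))   # AND: reaches 0 errors (separable)
print(train_perceptron(X, np.array([0, 1, 1, 1])))   # OR:  reaches 0 errors (separable)
print(train_perceptron(X, np.array([0, 1, 1, 0])))   # XOR: errors remain (not separable)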
Hopfield Neural Network
• The Hopfield Neural Networks, invented by Dr John J.
Hopfield consists of one layer of 'n' fully connected recurrent
neurons.
• It is generally used in performing auto-association and
optimization tasks. It is calculated using a converging
interactive process and it generates a different response than
our normal neural nets.
Discrete Hopfield Network
• It is a fully interconnected neural network where each unit is
connected to every other unit. It behaves in a discrete
manner, i.e. it gives finite distinct output, generally of two
types:
Binary (0/1)
• Bipolar (-1/1)
• The weights associated with this network are symmetric in
nature and have the following properties.
1. wij = wji
2. wii = 0
Structure & Architecture of Hopfield Network
• Each neuron has an inverting and a non-inverting output.
• Being fully connected, the output of each neuron is an input
to all other neurons but not the self.
• The below figure shows a sample representation of a
Discrete Hopfield Neural Network architecture having the
following elements.
Training Algorithm

• For storing a set of input patterns S(p) [p = 1 to P],
where S(p) = S1(p) ... Si(p) ... Sn(p), the weight matrix is given by:
• For binary patterns:
wij = Σ(p=1 to P) [2si(p) − 1][2sj(p) − 1]  (for all i ≠ j)
• For bipolar patterns:
wij = Σ(p=1 to P) [si(p) sj(p)]  (where wij = 0 for all i = j)
(i.e. the weights here have no self-connection)
Steps Involved in the training of a Hopfield
Network are as mapped below:
• Initialize weights (wij) to store patterns (using training algorithm).
• For each input vector yi, perform steps 3-7.
• Make the initial activators of the network equal to the external
input vector x.
• yi = xi (for i = 1 to n)
• For each unit yi, perform steps 5-7.
• Calculate the total input of the network yin using the equation
given below:
yini = xi + Σj [yj wji]
• Apply activation over the total input to calculate the output
as per the equation given below:
yi = 1 if yini > θi;  yi = yi (unchanged) if yini = θi;  yi = 0 if yini < θi
• (where θi is the threshold and is normally taken as 0)
• Now feed back the obtained output yi to all other units. Thus,
the activation vectors are updated.
• Test the network for convergence.
• Consider the following problem. We are required to create a
Discrete Hopfield Network in which the bipolar representation of
the input vector, [1 1 1 -1] (or [1 1 1 0] in the case of binary
representation), is stored in the network. Test the Hopfield
network with missing entries in the first and second
components of the stored vector (i.e. [0 0 1 0]).
• Given the input vector, x = [1 1 1 -1] (bipolar), we initialize the
weight matrix (wij) as: wij = Σ [sT(p) t(p)], where t(p) = s(p),
and the weight matrix with no self-connection is:
W = [ 0  1  1 -1
      1  0  1 -1
      1  1  0 -1
     -1 -1 -1  0]
As per the question, the input vector x with missing entries is
x = [0 0 1 0] ([x1 x2 x3 x4]) (binary). Make y = x = [0 0 1 0]
([y1 y2 y3 y4]). Choose a unit yi (the order doesn't matter) for
updating its activation, and take the ith column of the weight
matrix for the calculation.

(We will do the next steps for all values of yi and check if there is
convergence or not.)
yin1 = x1 + Σ(j=1 to 4) [yj wj1] = 0 + 1 = 1
Applying activation, yin1 > 0 ⟹ y1 = 1.
Giving feedback to the other units, we get y = [1 0 1 0],
which is not equal to the input vector x = [1 1 1 0].
Hence, no convergence yet.

yin3 = x3 + Σ(j=1 to 4) [yj wj3] = 1 + 1 = 2
• Applying activation, yin3 > 0 ⟹ y3 = 1.
• Giving feedback to the other units, we get y = [1 0 1 0],
• which is not equal to the input vector x = [1 1 1 0].
• Hence, no convergence yet.

yin4 = x4 + Σ(j=1 to 4) [yj wj4] = 0 − 2 = −2
• Applying activation, yin4 < 0 ⟹ y4 = 0.
• Giving feedback to the other units, we get y = [1 0 1 0],
• which is not equal to the input vector x = [1 1 1 0].
• Hence, no convergence yet.

• yin2 = x2 + Σ(j=1 to 4) [yj wj2] = 0 + 2 = 2
• Applying activation, yin2 > 0 ⟹ y2 = 1.
• Giving feedback to the other units, we get y = [1 1 1 0],
• which is equal to the input vector x = [1 1 1 0].
• Hence, convergence with vector x.
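The worked example can be verified with a short script. The sketch below (Python/NumPy; the update order is taken from the example above, while the helper names are illustrative) builds the weight matrix from the stored bipolar pattern [1 1 1 -1], zeroes the diagonal, and runs the asynchronous updates on the probe vector [0 0 1 0].

import numpy as np

s = np.array([1, 1, 1, -1])               # stored pattern (bipolar)
W = np.outer(s, s).astype(float)          # wij = si * sj
np.fill_diagonal(W, 0)                    # no self-connections: wii = 0

x = np.array([0, 0, 1, 0], dtype=float)   # probe with missing first and second entries (binary)
y = x.copy()

for i in [0, 2, 3, 1]:                    # update y1, y3, y4, y2 (0-based indices), as above
    y_in = x[i] + np.dot(y, W[:, i])      # total input yin_i = xi + sum_j yj * wji
    y[i] = 1 if y_in > 0 else (0 if y_in < 0 else y[i])   # threshold taken as 0
    print(f"unit {i + 1}: y_in = {y_in:+.0f}, y = {y}")

print("converged to stored pattern:", np.array_equal(y, [1, 1, 1, 0]))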
