
Introduction to Neural Networks

Department of Information Technology
Ambo University
What are neural networks?
 A neural network can mean
  either a real biological neural network, such as the one in your brain, or
  an artificial neural network simulated in a computer.
Neurons, cell bodies, and signals
 A neural network, whether biological or artificial, consists of a large number of simple units, neurons, that receive and transmit signals to each other.
 The neurons are very simple processors of information, consisting of
  a cell body and
  wires that connect the neurons to each other.
 Most of the time, they do nothing but sit still and watch for signals coming in through the wires.
Dendrites, axons, and synapses
 In the biological lingo, we call the wires that provide the input to the neurons dendrites.
 Sometimes, depending on the incoming signals, the neuron may fire and send a signal out for the other neurons to receive.
 The wire that transmits the outgoing signal is called an axon.
 Each axon may be connected to one or more dendrites at intersections called synapses.
Dendrites, axons, and synapses (contd)
 Isolated from its fellow neurons, a single neuron is quite unimpressive, and capable of only a very restricted set of behaviors.
 When connected to each other, however, the system resulting from their concerted action can become extremely complex.
 The behavior of the system is determined by the ways in which the neurons are wired together.
Dendrites, axons, and synapses (contd)
 Each neuron reacts to the incoming signals in a specific way that can also adapt over time.
 This adaptation is known to be the key to functions such as memory and learning.
Why develop artificial neural networks?
 One purpose of building artificial models of the brain is neuroscience, the study of the brain and the nervous system in general.
 It is tempting to think that by mapping the human brain in enough detail, we can discover the secrets of human and animal cognition and consciousness.
Why develop artificial neural networks?
 However, even though we still seem to be far from understanding the mind and consciousness, there are clear milestones that have been achieved in neuroscience.
 By better understanding the structure and function of the brain, we are already reaping some concrete rewards.
Why develop artificial neural networks?
 We can, for instance, identify abnormal functioning and try to help the brain avoid it and reinstate normal operation.
 This can lead to life-changing new medical treatments for people suffering from neurological disorders:
  epilepsy, Alzheimer's disease, problems caused by developmental disorders, or damage caused by injuries, and so on.
Why develop artificial neural networks?
 In fact, another main reason for building artificial neural networks has little to do with understanding biological systems.
 It is to use biological systems as an inspiration to build better AI and machine learning techniques.
 The idea is very natural: the brain is an amazingly complex information processing system capable of a wide range of intelligent behaviors, and therefore it makes sense to look for inspiration in it when we try to create artificial intelligence.
Why develop artificial neural networks?
 Neural networks have been a major trend in AI since the 1960s.
 We'll return to the waves of popularity in the history of AI in the final part.
 Currently neural networks are again at the very top of the list, as deep learning is used to achieve significant improvements in many areas such as natural language and image processing.
What is so special about neural networks?
 The case for neural networks in general as an approach to AI is based on a similar argument as that for logic-based approaches.
 In the latter case, it was thought that in order to achieve human-level intelligence, we need to simulate higher-level thought processes, and in particular, the manipulation of symbols representing certain concrete or abstract concepts using logical rules.
Neural network key feature
 For one, in a traditional computer, information is processed in a central processor, which can only focus on doing one thing at a time.
 The CPU can retrieve data to be processed from the computer's memory, and store the result in the memory.
Neural network key feature (contd)
 Thus, data storage and processing are handled by two separate components of the computer: the memory and the CPU.
 In neural networks, the system consists of a large number of neurons, each of which can process information on its own, so that instead of having a CPU process each piece of information one after the other, the neurons process vast amounts of information simultaneously (see the sketch below).
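To make the contrast concrete, here is a minimal sketch (ours, not from the slides) of how a whole layer of artificial neurons can process one input in a single, parallelizable operation; the sizes and random values are purely illustrative:

import numpy as np

# A layer of 1000 simple neurons, each with 25 input weights.
# Illustrative sketch: names and sizes are made up for this example.
rng = np.random.default_rng(0)
weights = rng.normal(size=(1000, 25))   # one row of weights per neuron
inputs = rng.normal(size=25)            # one incoming signal per input wire

# Every neuron computes its weighted sum in one matrix-vector product;
# on parallel hardware (e.g. a GPU) these sums are computed simultaneously.
fired = weights @ inputs > 0            # step-like on/off output per neuron
print(int(fired.sum()), "of 1000 neurons fired")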
Neural network key feature (contd)
 The second difference is that data storage (memory) and processing aren't separated like in traditional computers.
 The neurons both store and process information, so that there is no need to retrieve data from the memory for processing.
Neural network key feature (contd)
 The data can be stored short term in the neurons themselves (they either fire or not at any given time) or, for longer term storage, in the connections between the neurons, their so-called weights.
Neural network key feature (contd)
 Because of these two differences, neural networks and traditional computers are suited for somewhat different tasks.
 Even though it is entirely possible to simulate neural networks in traditional computers, which was the way they were used for a long time, their maximum capacity is achieved only when we use special hardware (computer devices) that can process many pieces of information at the same time.
 This is called parallel processing.
 Incidentally, graphics processors (or graphics processing units, GPUs) have this capability, and they have become a cost-effective solution for running massive deep learning methods.
How neural networks are built

Weights and inputs
 The basic artificial neuron model involves a set of adaptive parameters, called weights, like in linear and logistic regression.
 Just like in regression, these weights are used as multipliers on the inputs of the neuron, which are added up.
Weights and inputs (contd)
 The sum of the weights times the inputs is called the linear combination of the inputs.
 You can probably recall the shopping bill analogy: you multiply the amount of each item by its price per unit and add up to get the total.
Weights and inputs (contd)
 If we have a neuron with six inputs (analogous to the amounts of the six shopping items: potatoes, carrots, and so on), input1, input2, input3, input4, input5, and input6, we also need six weights.
 The weights are analogous to the prices of the items.
Weights and inputs (contd)
 We'll call them weight1, weight2, weight3, weight4, weight5, and weight6.
 In addition, we'll usually want to include an intercept term like we did in linear regression.
 This can be thought of as a fixed additional charge due to processing a credit card payment, for example.
Weights and inputs (contd)
 We can then calculate the linear combination like this:
  linear combination = intercept + weight1 × input1 + ... + weight6 × input6
 With some example numbers we could then get:
  10.0 + 5.4 × 8 + (-10.2) × 5 + (-0.1) × 22 + 101.4 × (-5) + 0.0 × 2 + 12.0 × (-3) = -543.0
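The same arithmetic can be checked in a few lines of Python (the variable names are ours):

# Checking the slide's example numbers in plain Python.
intercept = 10.0
weights = [5.4, -10.2, -0.1, 101.4, 0.0, 12.0]
inputs = [8, 5, 22, -5, 2, -3]

linear_combination = intercept + sum(w * x for w, x in zip(weights, inputs))
print(round(linear_combination, 1))  # -543.0, as on the slide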
Weights and inputs (contd)
 The weights are almost always learned from data, using the same ideas as in linear or logistic regression, as discussed previously.
 But before we discuss this in more detail, we'll introduce another important stage that a neuron completes before it sends out an output signal.
Activations and outputs
 Once the linear combination has been computed, the neuron does one more operation.
 It takes the linear combination and puts it through a so-called activation function.
 Typical examples of the activation function include (each is sketched in code below):
  identity function: do nothing and just output the linear combination
Activations and outputs (contd)
  step function: if the value of the linear combination is greater than zero, send a pulse (ON), otherwise do nothing (OFF)
  sigmoid function: a soft version of the step function
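Written out as plain Python, a minimal sketch of these three definitions looks as follows:

import math

def identity(z):
    return z                        # output the linear combination unchanged

def step(z):
    return 1 if z > 0 else 0        # send a pulse (ON) only above zero

def sigmoid(z):
    return 1 / (1 + math.exp(-z))   # a smooth, "soft" version of the step

for z in (-2.0, 0.0, 2.0):
    print(z, identity(z), step(z), round(sigmoid(z), 3))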
Activations and outputs (contd)
 Note that with the first activation function, the identity function, the neuron is exactly the same as linear regression.
 This is why the identity function is rarely used in neural networks: it leads to nothing new and interesting.
How neurons activate
 Real, biological neurons communicate by sending out sharp, electrical pulses called spikes, so that at any given time, their outgoing signal is either on or off (1 or 0).
 The step function imitates this behavior.
 However, artificial neural networks tend to use activation functions that output a continuous numerical activation level at all times, such as the sigmoid function.
How neurons activate (contd)
 Thus, to use a somewhat awkward figure of speech, real neurons communicate by something similar to the Morse code, whereas artificial neurons communicate by adjusting the pitch of their voice as if yodeling.
How neurons activate (contd)
 The output of the neuron, determined by the linear combination and the activation function, can be used to extract a prediction or a decision.
 For example, if the network is designed to identify a stop sign in front of a self-driving car, the input can be the pixels of an image captured by a camera, and the output can be used to activate a braking system that stops the car in front of the sign.
How neurons activate (contd)
 Learning or adaptation in the network occurs when the weights are adjusted so as to make the network produce the correct outputs, just like in linear or logistic regression.
 Many neural networks are very large, and the largest contain hundreds of billions of weights.
 Optimizing them all can be a daunting task that requires massive amounts of computing power.
Perceptron: the mother of all ANNs
 The perceptron is simply a fancy name for the simple neuron model with the step activation function we discussed above.
 It was among the very first formal models of neural computation, and because of its fundamental role in the history of neural networks, it wouldn't be unfair to call it the mother of all artificial neural networks.
Perceptron: the mother of all ANNs
 It can be used as a simple classifier in binary classification tasks.
 A method for learning the weights of the perceptron from data, called the Perceptron algorithm, was introduced by the psychologist Frank Rosenblatt in 1957.
 We will not study the Perceptron algorithm in detail here.
Perceptron: the mother of all ANNs
 Suffice to say that it is just about as simple as the nearest neighbor classifier.
 The basic principle is to feed the network training data one example at a time.
 Each misclassification leads to an update in the weights (a minimal sketch follows below).
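A minimal sketch of that principle, assuming the standard Rosenblatt update rule (the slides do not spell out the formula): on each misclassified example, the weights are nudged by the prediction error times the input. The learning rate and the toy AND task are our choices.

def step(z):
    return 1 if z > 0 else 0

def train_perceptron(examples, n_inputs, epochs=10, lr=1.0):
    weights, intercept = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        for inputs, label in examples:                 # one example at a time
            z = intercept + sum(w * x for w, x in zip(weights, inputs))
            error = label - step(z)                    # nonzero only on a misclassification
            if error:
                weights = [w + lr * error * x for w, x in zip(weights, inputs)]
                intercept += lr * error
    return weights, intercept

# Toy usage: learn the logical AND of two binary inputs.
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data, n_inputs=2)
print([step(b + sum(wi * xi for wi, xi in zip(w, x))) for x, _ in data])  # [0, 0, 0, 1]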
Putting neurons together: networks
 A single neuron would be way too simple to make decisions and predictions reliably in most real-life applications.
 To unleash the full potential of neural networks, we can use the output of one neuron as the input of other neurons, whose outputs can be the input to yet other neurons, and so on.
Putting neurons together: networks
 The output of the whole network is obtained as the output of a certain subset of the neurons, which are called the output layer.
 We'll return to this in a bit, after we discuss the way neural networks adapt to produce different behaviors by learning their parameters from data.
Layers
 Often the network architecture is composed of layers.
 The input layer consists of neurons that get their inputs directly from the data.
 So for example, in an image recognition task, the input layer would use the pixel values of the input image as its inputs.
Layers
 The network typically also has hidden layers that use the other neurons' outputs as their input, and whose output is used as the input to other layers of neurons.
 Finally, the output layer produces the output of the whole network.
 All the neurons on a given layer get inputs from neurons on the previous layer and feed their output to the next (see the forward-pass sketch below).
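As a sketch of this layered wiring, here is a forward pass through one hidden layer in NumPy; the layer sizes, random weights, and sigmoid activation are illustrative assumptions, not anything specified on the slides:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)

# 25 input-layer values (e.g. pixels) -> 10 hidden neurons -> 1 output neuron.
W_hidden, b_hidden = rng.normal(size=(10, 25)), np.zeros(10)
W_output, b_output = rng.normal(size=(1, 10)), np.zeros(1)

x = rng.integers(0, 2, size=25).astype(float)    # an input "image"
hidden = sigmoid(W_hidden @ x + b_hidden)        # hidden layer reads the input layer
output = sigmoid(W_output @ hidden + b_output)   # output layer reads the hidden layer
print(output)                                    # the whole network's output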
Layers
 A classical example of a multilayer network is the so-called multilayer perceptron.
 As we discussed above, Rosenblatt's Perceptron algorithm can be used to learn the weights of a perceptron.
 For the multilayer perceptron, the corresponding learning problem is way harder, and it took a long time before a working solution was discovered.
Layers
 But eventually, one was invented: the backpropagation algorithm led to a revival of neural networks in the late 1980s.
 It is still at the heart of many of the most advanced deep learning solutions (a minimal sketch follows below).
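Backpropagation itself is beyond the scope of these slides, but the following minimal sketch shows the idea on a tiny one-hidden-layer network learning XOR: error derivatives are propagated backwards layer by layer to obtain gradient-descent weight updates. The architecture, squared-error loss, and hyperparameters are illustrative guesses.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)     # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)     # hidden -> output
lr = 1.0

for _ in range(10000):
    # Forward pass through the two layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: error derivatives, propagated from output to hidden.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent updates for all weights and intercepts.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # typically close to [0, 1, 1, 0]; depends on the random start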
A simple neural network classifier
 To give a relatively simple example of using a neural network classifier, we'll consider a task that is very similar to the MNIST digit recognition task, namely classifying images in two classes.
 We will first create a classifier to classify whether an image shows a cross (x) or a circle (o).
A simple neural network classifier
 Our images are represented here as pixels that are either colored or white, and the pixels are arranged in a 5 × 5 grid.
 In this format, our images of a cross and a circle (more like a diamond, to be honest) are patterns of colored and white squares.
A simple neural network classifier
 In order to build a neural network classifier, we need to formalize the problem in a way where we can solve it using the methods we have learned.
 Our first step is to represent the information in the pixels by numerical values that can be used as the input to a classifier.
A simple neural network classifier
 Let's use 1 if the square is colored, and 0 if it is white.
 Note that although the symbols in the graphic are of different colors (green and blue), our classifier will ignore the color information and use only the colored/white information.
 The 25 pixels in the image make up the inputs of our classifier.
A simple neural network classifier
 To make sure that we know which pixel is which in the numerical representation, we can decide to list the pixels in the same order as you'd read text: row by row from the top, reading each row from left to right.
 The first row of the cross, for example, is represented as 1,0,0,0,1; the second row as 0,1,0,1,0, and so on.
A simple neural network classifier
 The full input for the cross image is then: 1,0,0,0,1,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,1,0,0,0,1.
 We'll use the basic neuron model where the first step is to compute a linear combination of the inputs.
 We thus need a weight for each of the input pixels, which means 25 weights in total.
A simple neural network classifier
 Finally, we use the step activation function.
 If the linear combination is negative, the neuron activation is zero, which we decide to use to signify a cross.
 If the linear combination is positive, the neuron activation is one, which we decide to signify a circle.
A simple neural network classifier
 Let's try what happens when all the weights take the same numerical value, 1.
 With this setup, our linear combination for the cross image will be 9 (9 colored pixels, so 9 × 1, and 16 white pixels, 16 × 0), and for the circle image it will be 8 (8 colored pixels, 8 × 1, and 17 white pixels, 17 × 0).
A simple neural network classifier
 In other words, the linear combination is positive for both images, and they are thus both classified as circles.
 Not a very good result, given that there are only two images to classify.
A simple neural network classifier
 To improve the result, we need to adjust the weights in such a way that the linear combination will be negative for a cross and positive for a circle.
 If we think about what differentiates images of crosses and circles, we can see that circles have no colored pixels in the center of the image, whereas crosses do.
A simple neural network classifier
 Likewise, the pixels at the corners of the image are colored in the cross, but white in the circle.
 We can now adjust the weights. There are an infinite number of weight choices that do the job.
 For example, assign weight -1 to the center pixel (the 13th pixel), and weight 1 to the pixels in the middle of each of the four sides of the image, letting all the other weights be 0.
A simple neural network classifier
 Now, for the cross input, the center pixel produces the value -1 (pixel value 1 × weight -1), while for all the other pixels either the pixel value or the weight is 0, so that -1 is also the total value.
 This leads to activation 0, and the cross is correctly classified.
 How about the circle then?
A simple neural network classifier
 Each of the pixels in the middle of the sides produces the value 1, which makes 4 × 1 = 4 in total.
 For all the other pixels either the pixel value or the weight is zero, so 4 is the total.
 Since 4 is a positive value, the activation is 1, and the circle is correctly recognized as well (see the worked example in code below).
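The whole worked example can be verified in a few lines of Python; the pixel patterns and the hand-picked weights are the ones described above, written out as lists:

# The cross and circle as 25-pixel vectors, listed row by row from the top.
cross = [1,0,0,0,1,
         0,1,0,1,0,
         0,0,1,0,0,
         0,1,0,1,0,
         1,0,0,0,1]
circle = [0,0,1,0,0,
          0,1,0,1,0,
          1,0,0,0,1,
          0,1,0,1,0,
          0,0,1,0,0]

# Weight -1 for the center pixel (the 13th, index 12), weight 1 for the
# pixels in the middle of each side, and 0 everywhere else.
weights = [0] * 25
weights[12] = -1
for i in (2, 10, 14, 22):
    weights[i] = 1

def classify(pixels):
    z = sum(w * p for w, p in zip(weights, pixels))
    return "circle" if z > 0 else "cross"   # step activation: 1 = circle, 0 = cross

print(classify(cross), classify(circle))     # cross circle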
