Neural Networks

COMP3411/9814: Artificial Intelligence


Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures



Motivation

• Great ability of cognitive beings to carry out some tasks: shape recognition, speech and image processing, etc.

• It seemed important to understand and emulate successful mechanisms from humans and animals: parallelism and high connectivity.

• A branch of artificial intelligence: Artificial Neural Networks.


Motivation

• New (non-algorithmic) paradigm to process information (neurocomputing): learning and adaptation, distributed and parallel processing.

• New computational tools (faster and cheaper).

• In the future: more caution and theoretical support are needed. Open issue: generalization.
Motivation
• The general problem of function approximation can be divided into two subproblems:

• Classification: approximate a function that represents the membership of an entity – characterized by a set of input variables, either continuous or discrete – in a particular class (output with discrete values), e.g., character recognition.

• Regression: approximate the (unknown) generating function of a process by mapping the input variables to the output variables; usually, continuous values are used.

  y = f(x, w)
Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures


Biological Neuron
Biological Neuron
• The brain is made up of neurons (nerve cells) which have
• a cell body (soma)
• dendrites (inputs)
• an axon (outputs)
• synapses (connections between cells)

• Synapses can be excitatory or inhibitory and may change over time.

• When the inputs reach some threshold an action potential (electrical


pulse) is sent along the axon to the outputs.
Biological Neuron

• The human brain has about 100 billion neurons (~10^10–10^11 neurons) with an average of 10,000 synapses each (some even with 100,000 synapses).

• Latency is about 3-6 milliseconds.

• At most a few hundred “steps” in any mental computation, but


massively parallel.
Artificial Neuron

• An automaton characterized by:

• An internal state.

• Input signals.

• Activation and transfer functions.


Artificial Neuron

• McCulloch-Pitts’
model (1943)
Artificial Neuron
• McCulloch-Pitts model:
• Inputs either 0 or 1.
• Output 0 or 1.
• Input can be either excitatory or inhibitory.

• Summing inputs: sum = x1·w1 + x2·w2 + x3·w3 + …
  • If an input is 1 and excitatory, add 1 to the sum.
  • If an input is 1 and inhibitory, subtract 1 from the sum.

• Threshold θ: if sum < θ, output 0; otherwise, output 1 (see the sketch below).
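A minimal sketch in Python (not from the slides; the function name and list-based encoding are assumptions) of a McCulloch-Pitts unit: each active excitatory input adds 1, each active inhibitory input subtracts 1, and the output is 1 only when the sum reaches the threshold θ.

```python
def mcculloch_pitts(inputs, excitatory, threshold):
    """McCulloch-Pitts unit: binary inputs, binary output.

    inputs      -- list of 0/1 input values
    excitatory  -- list of booleans; True = excitatory input, False = inhibitory
    threshold   -- the threshold theta
    """
    total = 0
    for x, is_exc in zip(inputs, excitatory):
        if x == 1:
            total += 1 if is_exc else -1   # excitatory adds 1, inhibitory subtracts 1
    return 1 if total >= threshold else 0  # output 0 if total < theta, else 1

# Example: two excitatory inputs and threshold 2 behave like an AND gate
print(mcculloch_pitts([1, 1], [True, True], threshold=2))  # -> 1
print(mcculloch_pitts([1, 0], [True, True], threshold=2))  # -> 0
```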
Learning
• The ability of a neuron (or neural net) to adjust its connections (weights) to produce the intended output, or one that meets certain criteria.

• Hebbian learning (1949): When a neuron A persistently activates


another nearby neuron B, the connection between the two neurons
becomes stronger. Specifically, a growth process occurs that
increases how effective neuron A is in activating neuron B. As a
result, the connection between those two neurons is strengthened
over time.

• “Neurons that fire together, wire together”, Hebb.


Artificial Neural Networks
• Information processing architecture loosely modelling the brain

• Consists of many interconnected processing units (neurons)


• Work in parallel to accomplish a global task

• Generally used to model relationships between inputs and outputs or


to find patterns in data

• Characterized by (i) number of neurons, (ii) interconnection


architecture, (iii) weight values, (iv) activation and transfer functions.
Artificial Neural Networks

• ANN nodes have
  • input edges with weights
  • output edges with weights
  • an activation level (a function of the inputs)

• Weights can be positive or negative and may change over time (learning).
• The input function is the weighted sum of the activation levels of inputs.
• The activation level is a non-linear transfer function g of this input:

  activation_j = g(s_j) = g( Σ_i w_ij · x_i )

Some nodes are inputs (sensing), some are outputs (action).
Artificial Neural Networks
Artificial Neural Networks

• Neural networks (NN) might work in two ways:

• Learning: adapting its weights, architecture, activation and


transfer functions.

• Simulation or recognition: it is used for information processing.

• NN learning: Supervised (through examples), unsupervised.


Activation Functions
The function g(s) takes the weighted sum of the inputs and produces the output of the node, given some threshold:

  g(s) = 1 if s ≥ 0
  g(s) = 0 if s < 0
Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures


Single-layer Perceptron

• Frank Rosenblatt, 1957

• Use a logic threshold function for classification tasks.

• Classification and discriminant functions

• How can we make the classification?

• By using discriminant functions, defined over the vector space we want to classify, that produce values which can be compared.
Single-layer Perceptron

• Suppose we have a function g_i for each class s_i. The classification rule is:

  u ∈ s_i  iff  g_i(u) > g_j(u) for all j ≠ i

• For two-class problems, this can be reduced to one function:

  g(u) = g_1(u) − g_2(u);  then u ∈ s_1 if g(u) > 0, and u ∈ s_2 otherwise


Single-layer Perceptron

• Linear classification with a discriminant function using the perceptron:

  g(u) = w_1·u_1 + w_2·u_2 + … + w_n·u_n = w · u

• The value associated with g(u) = 0 corresponds to the border, a


hyperplane, that divides the two classes.

• Perceptrons are appropriate only for linearly separable classes.

• The learning problem is reduced to finding a hyperplane that


separates the two classes.
Single-layer Perceptron

A perceptron with inputs X0 and X1, weights ω0 and ω1, and threshold θ defines a linear decision boundary of the form y = a·x + b:

  X0·ω0 + X1·ω1 = θ  ⇒  X1 = −(ω0/ω1)·X0 + θ/ω1

[Figure: the perceptron unit, and the (X0, X1) plane in which this line separates class A points from class B points.]
Single-layer Perceptron

• Simplest output function


Single-layer Perceptron

• AND gate output

  Decision boundary: w1·I1 + w2·I2 = 1.5

[Figure: the four inputs (0,0), (0,1), (1,0), (1,1) in the (Input 1, Input 2) plane, with the line separating (1,1) from the other three points; see the code sketch below.]
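As a concrete check of the boundary above (a sketch; the specific weights w1 = w2 = 1 and threshold 1.5 are just one of many choices that work), a single threshold unit reproduces the AND gate:

```python
def and_unit(i1, i2, w1=1.0, w2=1.0, theta=1.5):
    # Output 1 if the weighted sum reaches the threshold 1.5, else 0
    return 1 if w1 * i1 + w2 * i2 >= theta else 0

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", and_unit(a, b))   # only (1, 1) gives 1
```

No single line of this form separates the XOR classes, which is why the next slide introduces a hidden layer.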
Single-layer Perceptron

• XOR gate ⇒ hidden layer needed

[Figure: the four inputs (0,0), (0,1), (1,0), (1,1); no single straight line can separate the XOR-true points (0,1) and (1,0) from the XOR-false points (0,0) and (1,1).]
Single-layer Perceptron
• Linearly separable if there is a hyperplane where classification is
true on one side of the hyperplane and false on the other side
• For the sigmoid function, the hyperplane is where:
  x_1·w_1 + … + x_n·w_n = 0
Single-layer Perceptron

• Learning rule
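The slide's figure is not reproduced here; the sketch below assumes the standard perceptron learning rule, w ← w + η(target − output)·x, applied example by example (NumPy; the names are chosen for illustration).

```python
import numpy as np

def train_perceptron(X, targets, eta=0.1, epochs=100):
    """Standard perceptron learning rule on rows of X with 0/1 targets."""
    X = np.hstack([X, np.ones((len(X), 1))])      # append a constant bias input
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, t in zip(X, targets):
            y = 1 if np.dot(w, x) >= 0 else 0     # threshold activation
            w += eta * (t - y) * x                # weights change only on errors
    return w

# Example: the (linearly separable) AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(train_perceptron(X, np.array([0, 0, 0, 1])))
```

By the convergence theorem on the next slide, this loop stops making errors on linearly separable data after finitely many updates.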
Single-layer Perceptron
• Perceptron convergence theorem:

• For any data set that is linearly separable, the perceptron


learning rule is guaranteed to find a solution in a finite
number of iterations.
Historical Context
• In 1969, Minsky and Papert published a book highlighting limitations of
perceptrons.
• Funding agencies redirected funding away from neural network research
preferring instead logic-based methods such as expert systems.

• Known since 1960s that any logical function could be implemented in a 2-layer
neural network with step function activations.

• The problem was how to learn the weights of a multi-layer neural network from
training examples.

• Solution found in 1974 by Paul Werbos.


• Not widely known until rediscovered in 1986 by Rumelhart, Hinton and
Williams.
Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures


Multi-layer Neural Network

• Given an explicit logical function, we can design a multi-layer neural network by


hand to compute that function.
• But, if we are just given a set of training data, can we train a multi-layer network
to fit these data?
Multi-layer Perceptron

• Definition: a network whose neurons are organised in successive layers. Each layer receives its inputs from the previous layer (or from the external input) and sends its outputs to the next layer. There are no connections within a layer.
Feedforward Propagation

w_i,j ≡ weight between node i and node j

• Feed-forward network = a parameterised family of nonlinear functions:

  a_5 = g( W_3,5 · a_3 + W_4,5 · a_4 )
      = g( W_3,5 · g( W_1,3 · a_1 + W_2,3 · a_2 ) + W_4,5 · g( W_1,4 · a_1 + W_2,4 · a_2 ) )
• Adjusting weights changes the function
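A direct transcription of this expression (a sketch; the sigmoid choice of g and the example weights are assumptions, not from the slides) for the 2-input, 2-hidden-unit, 1-output network:

```python
import math

def g(s):
    # Non-linear activation (sigmoid here); any non-linear g could be used
    return 1.0 / (1.0 + math.exp(-s))

def forward(a1, a2, W):
    """W maps node pairs (i, j) to the weight W_{i,j} between node i and node j."""
    a3 = g(W[(1, 3)] * a1 + W[(2, 3)] * a2)
    a4 = g(W[(1, 4)] * a1 + W[(2, 4)] * a2)
    a5 = g(W[(3, 5)] * a3 + W[(4, 5)] * a4)
    return a5

# Illustrative weights only; adjusting them changes the function computed
W = {(1, 3): 0.5, (2, 3): -0.4, (1, 4): 0.3, (2, 4): 0.8, (3, 5): 1.0, (4, 5): -1.0}
print(forward(0.2, 0.7, W))
```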
Feedforward Propagation

(a) is a step function or threshold function

(b) is a sigmoid function, 1 / (1 + e^(−x))

Changing the bias weight b moves the threshold.

[Figure: a two-layer feed-forward network; each input x1(t), …, xn(t) feeds weighted sums (weights v, biases b_in) through a non-linearity f_NL, whose outputs feed further weighted sums (weights w, biases b_out) to produce the outputs y1(t), …, ym(t).]
Backpropagation

1. Forward pass: apply inputs to the "lowest layer" and feed activations forward to get the output.

2. Calculate error: the difference between the desired output and the actual output, error = y − ŷ.

3. Backward pass: propagate errors back through the network to adjust the weights.

[Figure: a network with random initial weights w11, w12, …; the forward pass produces the network output ŷ, the error y − ŷ is calculated, and the error is backpropagated to update the weights.]
Backpropagation
Gradient descent

  E = (1 / 2N) · Σ (d − y)²

If transfer functions are smooth, can use multivariate calculus to adjust weights by taking the steepest downhill direction:

  w ← w − α · ∂E/∂w

Parameter α is the learning rate.

• ∂E/∂w: how the cost function is affected by the particular weight.
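A one-parameter illustration of the update rule (a sketch with an assumed toy cost function, not the network's error):

```python
# Gradient descent on a toy cost E(w) = (w - 3)^2, whose derivative is dE/dw = 2(w - 3).
# The minimum is at w = 3.
alpha = 0.1        # learning rate
w = 0.0            # arbitrary starting weight
for step in range(50):
    grad = 2 * (w - 3)       # dE/dw at the current w
    w = w - alpha * grad     # w <- w - alpha * dE/dw
print(w)           # close to 3.0
```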


Backpropagation
The derivative of a function is the slope of the tangent at a point.

  y = f(x) = m·x + b

  m = (change in y) / (change in x) = Δy / Δx

Written dy/dx.
Backpropagation
Partial derivative

• The derivative of a function of several variables with respect to one of these variables.

• If z = f(x, y, …), the derivative with respect to x is written ∂z/∂x.
Backpropagation
[Figure: plots of the step function, the sigmoid, and the hyperbolic tangent.]

A function must be continuous to be differentiable.

Replace the (discontinuous) step function with a differentiable function, such as the sigmoid:

  g(s) = 1 / (1 + e^(−s))

or the hyperbolic tangent (outputs from −1 to 1):

  g(s) = tanh(s) = (e^s − e^(−s)) / (e^s + e^(−s)) = 2 / (1 + e^(−2s)) − 1
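Putting the pieces together, here is a minimal sketch of backpropagation with the sigmoid above and the update w ← w − α·∂E/∂w, training a small 2-2-1 network on XOR (NumPy; the architecture, learning rate, and epoch count are assumptions, and a different random seed or more epochs may be needed for convergence).

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# XOR training data: inputs X and desired outputs d
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input  -> hidden weights and biases
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden -> output weights and biases
alpha = 0.5                                     # learning rate

for epoch in range(20000):
    # 1. forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # 2. error gradient for E = 1/(2N) * sum (d - y)^2, using sigmoid' = y(1 - y)
    delta2 = (y - d) * y * (1 - y) / len(X)
    delta1 = (delta2 @ W2.T) * h * (1 - h)
    # 3. backward pass: steepest-descent weight updates
    W2 -= alpha * h.T @ delta2;  b2 -= alpha * delta2.sum(axis=0)
    W1 -= alpha * X.T @ delta1;  b1 -= alpha * delta1.sum(axis=0)

print(np.round(y, 2))   # should approach [0, 1, 1, 0]
```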
Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures


Step 1: Exhaustive Analysis of the System

• This step should determine the number and type of the input variables and of the model output, reducing the number of variables where possible.

• Is it really necessary to use a neural model? Why not use any other
existing classic model (e.g., phenomenological)?

• A neural network is often only the second-best solution; prefer a classical model where one exists.

• If a neural model is used, do we have available data representing


properly the system to be modelled? Do we have enough?
Step 2: Preprocessing
• Data: a neural network is a black-box model (a.k.a. empirical model) for interpolation (never extrapolation); therefore, it depends greatly on the quality and quantity of the available data.

• Quality: related to the degree to which the available data represents the
function being approximated. Ideal: to obtain them by following a
properly designed survey/experimental plan.

• Quantity: It is extremely important because only an adequate amount of


data will allow us to correctly identify the parameters (weights) of our
neural model.

• If the quantity of data is small, we cannot expect to develop a


complex neural model.
Step 2: Preprocessing
• Visual examination of the data.

• Detect and, if possible, eliminate outliers, empty values, etc.

• It might also help to detect correlations between variables.

• Normalization of variables: It is necessary when variables with different units and,


therefore, potentially different magnitudes are involved. Sometimes, the magnitudes
can differ by several orders of magnitude.

• Xn = (X-Xmin)/(Xmax-Xmin); Xn ∈ [0,1]

• Xn = 2*(X-Xmin)/(Xmax-Xmin) – 1; Xn ∈ [-1,1]

• It is necessary to perform the corresponding denormalization at the output stage.
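A minimal sketch of these two transforms and their inverse (the function names are assumptions):

```python
import numpy as np

def normalise(x, x_min, x_max, low=0.0, high=1.0):
    """Min-max normalisation of x into [low, high]."""
    return low + (high - low) * (x - x_min) / (x_max - x_min)

def denormalise(xn, x_min, x_max, low=0.0, high=1.0):
    """Inverse transform, applied to the network outputs."""
    return x_min + (xn - low) * (x_max - x_min) / (high - low)

x = np.array([10.0, 25.0, 40.0])
xn = normalise(x, x.min(), x.max(), low=-1.0, high=1.0)   # -> [-1, 0, 1]
print(xn, denormalise(xn, x.min(), x.max(), low=-1.0, high=1.0))
```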


Step 3: Design of the Neural Model
• Input and output neurons depend on the previous analysis of the
system.
• But, what about the number of neurons Nh in the hidden layer?

• Rule of thumb: Nh should lead to a number of parameters (weights) Nw


that:

• Nw < (Number of samples) / 10

• The number of weights Nw of an MLP, with Ni neurons in its input layer, a


hidden layer with Nh neurons, and No neurons in the output layer is:

• Nw = (Ni+1)*Nh+(Nh+1)*No
Step 3: Design of the Neural Model

• An MLP with 3 inputs, 4 units in its hidden layer, and 2 outputs, has
a number of parameters:

• Nw = (3+1)*4+(4+1)*2 = 26

• Then, at least 260 samples are required to train the network weights.
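The same bookkeeping as a small helper (a sketch; the function name is an assumption):

```python
def n_weights(n_in, n_hidden, n_out):
    """Parameter count of a one-hidden-layer MLP; the +1 terms are the bias weights."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out

nw = n_weights(3, 4, 2)
print(nw, "weights -> at least", 10 * nw, "training samples")   # 26 -> 260
```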
Step 3: Design of the Neural Model

• For MLPs, it has been shown that one hidden layer with a proper number of neurons is sufficient to approximate any non-linear function to an arbitrary degree of precision (the universal approximation theorem).

• Activation functions: A usual criterion is to use sigmoid functions or


ReLUs in the hidden layer and linear functions in the output.
However, sigmoids or softmax can also be used in the output.
Step 4: Training
• Training a neural network is a hard process due to the complexity of the
error function solution space, which can have numerous local minima,
saddle (minimax) points, etc.

• There are three main problems that can arise during training:

• Bias

• Overparameterization

• Overfitting

• The latter two might affect the network's ability to generalize (high
variance).
Step 4: Training

[Figure: training bias: the approximation y(x) is too simple to follow the underlying data.]
Step 4: Training

• To decrease bias:

• Increase (prudently) the number of neurons in the hidden layer.

• Aim to reach a better local minimum by conducting a sufficient


number of different training processes, starting from randomly
chosen initial weights (20 or more attempts).
Step 4: Training

High variance problem (overparameterization and overfitting)

[Figure: the approximation y(x) follows the noise in the data rather than the underlying function.]
Step 4: Training
• To avoid the overfitting problem, work with two sets during training:

• Training set

• Test set

• The best is to visualize the error function simultaneously on both


sets.

• Characteristics of the training and test sets:

• Both sets should be large enough, and data should be


representative on both sets.
Step 4: Training

[Figure: the error as a function of the number of epochs for the training set (solid line) and the test set (dashed line); the test error reaches a minimum and then rises again.]
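One common way to act on this curve, not named on the slide, is early stopping: keep the weights from the epoch with the minimum test error and stop once that error has not improved for a while. A sketch (the patience mechanism and the dummy error values are assumptions):

```python
def early_stopping(test_errors, patience=10):
    """Return (best_epoch, best_error), stopping once the test error has not
    improved for `patience` consecutive epochs."""
    best_epoch, best_err = 0, float("inf")
    for epoch, err in enumerate(test_errors):
        if err < best_err:
            best_epoch, best_err = epoch, err        # remember the best weights here
        elif epoch - best_epoch >= patience:
            break                                    # test error keeps rising: stop
    return best_epoch, best_err

# Illustrative curve: the test error falls, then rises as the network overfits
errors = [1.0, 0.6, 0.4, 0.3, 0.25, 0.24, 0.26, 0.30, 0.35, 0.45, 0.6, 0.8]
print(early_stopping(errors, patience=5))   # -> (5, 0.24)
```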
Step 4: Training

[Figure: the error as a function of the number of network parameters.]
Step 4: Training

• Cross-validation: Different neural network models are developed


using the available data, splitting the training and test sets in
different ways. The model that achieves the minimum error on the
test set is chosen.

• Additional training aspects:

• Weight initialisation.
• Online or batch learning.
• Adjust the parameters, e.g., learning rate and epochs to suit the
particular task.
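A minimal sketch of k-fold splitting in this spirit (NumPy; the helper name and the choice of k are assumptions):

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index arrays; each fold serves once as the test set."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Usage: train a candidate model on each `train` split, evaluate on `test`,
# and keep the configuration with the lowest test error.
for train, test in k_fold_indices(10, k=5):
    print(len(train), len(test))
```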
Step 5: Generalisation

• To test the generalisation capability of the network, that is, its


performance on a different (never seen) set of data, a small (but
representative) third set might be reserved, the generalisation set.

• This set should also be representative of the phenomenon being


modelled as the previous sets (training and test).
Step 5: Generalisation

[Figure: the approximation y(x) of the underlying function, evaluated on the generalisation data.]
Lecture Overview
• Motivation

• Biological and artificial neurons

• Single-layer perceptron

• Multi-layer perceptron

• Neural network design

• Neural network architectures


Neural Network Architectures
• Two main network structures
Neural Network Architectures

Feed-forward network has connections only in one direction:

• Every node receives input from “upstream” nodes; delivers output


to “downstream” nodes.

• No loops.

• Represents a function of its current input.

• It has no internal state other than the weights themselves.


Neural Network Architectures
• Two main network structures
Neural Network Architectures
Recurrent network feeds outputs back into its own inputs:

• Activation levels of network form a dynamical system.

• It may reach a stable state or exhibit oscillations or even chaotic


behaviour.

• Response of network to an input depends on its initial state.

• This may depend on previous inputs.

• Can support short-term memory.


Deep Learning Architectures
• Multiple layers form a hierarchical model, known as deep learning.
• Convolutional neural networks are specialised for vision tasks.
• Recurrent neural networks are used for time series.

• Typical real-world network can have 10 to 20 layers with hundreds of


millions of weights:
• It can take hours, days, or months to learn on machines with
thousands of cores.
References

• Poole & Mackworth, Artificial


Intelligence: Foundations of
Computational Agents, Chapter 7.
• Russell & Norvig, Artificial Intelligence: a
Modern Approach, Chapters 18.6, 18.7
.
• Bishop, Neural Networks and Their
Applications, Review of Scientific
Instruments, 65(6): 1803-1832.
Feedback
• In case you want to provide anonymous
feedback on these lectures, please visit:

• https://forms.gle/KBkN744QuffuAZLF8

Thank you very much!
