
Artificial Intelligence

Dr. Tran Quang Huy

OUTLINE
Chapter 1: Overview of AI
Chapter 2: Artificial Neural Networks
Chapter 3: Searching, Knowledge, Reasoning, and Planning
Chapter 4: Machine Learning

W1   W2   W3   W4    W5   W6   W7   W8   W9   W10
L    L    L    I-T   L    L    L    L    P    P

L: Lesson; I-T: In-class Test; P: Project


Objectives
1. Understand the basics of Neural Networks

2. Be able to move on to the more advanced Convolutional Neural Networks

Main contents
1. Artificial Neural Networks (ANN) and their relation to biology
2. The seminal Perceptron algorithm
3. Backpropagation
4. How to train Neural Networks using the Keras library


What are Neural Networks?


Question:
- How does your family dog recognize you, the owner, versus a complete and total stranger?
- How does a small child learn to recognize the difference between a school bus and a transit bus?
- How do our own brains subconsciously perform complex pattern recognition tasks each and every day without us even noticing?

What are Neural Networks?


Answer: Each of us contains a real-life biological neural network that is connected to our nervous system. This network is made up of a large number of interconnected neurons (nerve cells).
The word “neural” is the adjective form of “neuron”, and “network” denotes a
graph-like structure; therefore, an “Artificial Neural Network” is a computation
system that attempts to mimic (or at least, is inspired by) the neural
connections in our nervous system. Artificial neural networks are also referred
to as “neural networks” or “artificial neural systems”.
It is common to abbreviate Artificial Neural Network and refer to it as “ANN” or simply “NN”.



ANN

A simple neural network architecture. Inputs are presented to the network. Each connection carries a signal through the two hidden layers in the network. A final function computes the output class label.


Read the following and explain the meaning of each part in the figure and equations.

Activation Functions

What is an activation function?

How does the activation function work?

Why do we use activation functions?

List some types of popular activation functions.

What is an activation function?

An activation function takes a neuron's net input (the weighted sum of its inputs) and maps it to the neuron's output, typically in a non-linear way.

How does the activation function work?

It is applied after the weighted sum of inputs (plus bias) has been computed, squashing or thresholding that value before it is passed on to the next layer.

Why do we use activation functions?

1. To introduce non-linear characteristics into the model.

2. To keep the output in a specific range, such as [0, 1] or [-1, 1].

Popular Activation Functions

Find the equation of each activation function.

Activation Functions

Step function: \( f(net) = \begin{cases} 1 & \text{if } net > 0 \\ 0 & \text{otherwise} \end{cases} \)

Sigmoid function: \( f(net) = \dfrac{1}{1 + e^{-net}} \)

ReLU function: \( f(net) = \max(0, net) \)
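A minimal NumPy sketch of these three functions (the function names and vectorized style are editorial, not from the slides):

```python
import numpy as np

def step(net):
    # Output 1 where the net input is positive, 0 otherwise.
    return np.where(net > 0, 1, 0)

def sigmoid(net):
    # Smooth, differentiable squashing of the net input into (0, 1).
    return 1.0 / (1.0 + np.exp(-net))

def relu(net):
    # Zero for negative inputs, identity for positive inputs.
    return np.maximum(0.0, net)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(step(x))      # [0 0 0 1 1]
print(sigmoid(x))   # [0.119 0.378 0.5 0.622 0.881] (rounded)
print(relu(x))      # [0.  0.  0.  0.5 2. ]
```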

Activation Functions
Step function:

This is a very simple threshold function. If the weighted sum \( \sum_i w_i x_i > 0 \), we output 1; otherwise, we output 0.

The output of f is always zero when the net input is less than or equal to zero. If the net input is greater than zero, then f returns one.

What are the problems of the step function?


Activation Functions
Sigmoid function:

The sigmoid function has been one of the most commonly used activation functions in the history of neural networks.

Why?
The primary advantage here is that the smoothness of the sigmoid function makes it easier to devise learning algorithms.
The sigmoid function is a better choice for learning than the simple step function since it:
1. Is continuous and differentiable everywhere.
2. Is symmetric about the point (0, 0.5), rather than jumping discontinuously at zero.
3. Asymptotically approaches its saturation values.

Activation Functions
Sigmoid function:

Disadvantages of the sigmoid function:

1. The outputs of the sigmoid are not zero-centered.

2. Saturated neurons essentially kill the gradient, since the gradient becomes extremely small in the saturated regions.
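A quick numeric illustration of the saturation problem (a minimal sketch; the derivative formula \( \sigma'(x) = \sigma(x)\,(1 - \sigma(x)) \) is the standard one):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma(x) * (1 - sigma(x)).
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid_grad(x))
# The gradient is 0.25 at x = 0 but only ~4.5e-5 by x = 10:
# a saturated neuron passes back almost no gradient signal.
```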

Activation Functions
Tanh function:

\( \tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \)

The hyperbolic tangent, or tanh (with a similar shape to the sigmoid), was also heavily used as an activation function up until the late 1990s.
The tanh function is zero-centered, but the gradients are still killed when neurons become saturated.

Activation Functions
ReLU function:

Rectified Linear Units (ReLUs) are also called “ramp functions” due to how they look when plotted.

Activation Functions
ReLU function:

Note:

Notice how the function is zero for negative inputs but then increases linearly for positive values. The ReLU function does not saturate for positive inputs and is also extremely computationally efficient.

The ReLU activation function tends to outperform both the sigmoid and tanh functions in nearly all applications.

Activation Functions
ReLU function:

As of 2015, ReLU is the most popular activation function used in deep learning. However, a problem arises when the input is exactly zero: the gradient cannot be taken there.

Activation Functions
ReLU6 function:

ReLU6 clamps the output at 6: \( f(x) = \min(\max(0, x), 6) \). This limits the problem of exploding gradients.

Activation Functions
Leaky ReLU function:

Leaky ReLUs allow a small, non-zero gradient when the unit is not active: \( f(x) = x \) for \( x > 0 \), and \( f(x) = \alpha x \) otherwise, where \( \alpha \) is a small constant.

Activation Functions
Leaky ReLU function:

The function is indeed allowed to take on a negative value, unlike traditional ReLUs, which “clamp” the function output at zero.
Parametric ReLUs (PReLUs) build on Leaky ReLUs and allow the parameter \( \alpha \) to be learned on an activation-by-activation basis, implying that each node in the network can learn a “coefficient of leakage” different from the other nodes.
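A minimal NumPy sketch of these ReLU variants (function names and the default \( \alpha = 0.01 \) are editorial choices, not from the slides):

```python
import numpy as np

def relu6(x):
    # Standard ReLU clamped at 6, bounding how large activations can grow.
    return np.minimum(np.maximum(0.0, x), 6.0)

def leaky_relu(x, alpha=0.01):
    # A small non-zero slope (alpha) for negative inputs instead of a hard zero.
    # A Parametric ReLU has the same form, but alpha is learned during training.
    return np.where(x > 0, x, alpha * x)

x = np.array([-3.0, -0.5, 0.0, 2.0, 8.0])
print(relu6(x))       # [0. 0. 0. 2. 6.]
print(leaky_relu(x))  # [-0.03  -0.005  0.  2.  8. ]
```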

Feedforward Network Architectures

In this type of architecture, a connection between nodes is only allowed from nodes in layer i to nodes in layer i+1 (hence the term, feedforward). There are no backward or inter-layer connections allowed. When feedforward networks include feedback connections (output connections that feed back into the inputs), they are called recurrent neural networks.

Feedforward Network Architectures

This figure is a 3-2-3-2 feedforward network.

Layer 0 contains 3 inputs, our x_i values. These could be raw pixel intensities of an image or a feature vector extracted from the image.
Layers 1 and 2 are hidden layers containing 2 and 3 nodes, respectively.
Layer 3 is the output layer, or the visible layer: this is where we obtain the overall output classification from our network. The output layer typically has as many nodes as class labels, one node for each potential output.
For example, if we were to build an NN to classify handwritten digits, our output layer would consist of 10 nodes, one for each digit 0-9.
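Since the course later trains networks with Keras, here is a hedged sketch of how this 3-2-3-2 architecture could be declared; the sigmoid/softmax activation choices are illustrative assumptions, not taken from the slide:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# 3 inputs -> 2 hidden nodes -> 3 hidden nodes -> 2 outputs (3-2-3-2).
model = Sequential([
    Input(shape=(3,)),               # Layer 0: the three input values x_i
    Dense(2, activation="sigmoid"),  # Layer 1: first hidden layer
    Dense(3, activation="sigmoid"),  # Layer 2: second hidden layer
    Dense(2, activation="softmax"),  # Layer 3: output layer, one node per class
])
model.summary()
```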

PERCEPTRON ALGORITHM

The Perceptron was introduced by Frank Rosenblatt in 1957. He proposed a Perceptron learning rule based on the original MCP neuron. A Perceptron is an algorithm for supervised learning of binary classifiers. This algorithm enables neurons to learn and process elements in the training set one at a time.

https://www.javatpoint.com/perceptron-in-machine-learning

TYPES OF PERCEPTRON

1. Single layer (a): A single-layer perceptron can learn only linearly separable patterns.

2. Multilayer (b): A multilayer perceptron has two or more layers and therefore greater processing power.

https://www.javatpoint.com/perceptron-in-machine-learning

TYPES OF PERCEPTRON

A single-layer perceptron model consists of a feed-forward network and also includes a threshold transfer function. The main objective of the single-layer perceptron model is to analyze linearly separable objects with binary outcomes.

A single-layer perceptron model does not start from recorded data, so it begins with randomly allocated weight parameters. It then computes the weighted sum of all inputs. If the total weighted sum exceeds a pre-determined threshold, the model is activated and outputs the value +1.
If the outcome matches the desired value, the performance of the model is considered satisfactory, and the weights are not changed. However, discrepancies arise when input patterns are misclassified; hence, to reach the desired output and minimize error, some changes to the weights are necessary.

TYPES OF PERCEPTRON

The multi-layer perceptron model is commonly trained with the Backpropagation algorithm, which executes in two stages as follows:
• Forward Stage: activations propagate from the input layer through the network and terminate at the output layer.
• Backward Stage: weight and bias values are modified according to the model's error. The error between the actual and desired output propagates backward from the output layer to the input layer.

TYPES OF PERCEPTRON

Advantages of the Multi-Layer Perceptron:
• A multi-layer perceptron model can be used to solve complex non-linear problems.
• It works well with both small and large input data.
• It provides quick predictions after training.
• It achieves a similar accuracy ratio with large as well as small data.

Disadvantages of the Multi-Layer Perceptron:
• Computations are difficult and time-consuming.
• It is difficult to estimate how much each independent variable affects the dependent variable.
• The model's functioning depends on the quality of the training.

Basic Components of Perceptron

Frank Rosenblatt invented the perceptron model as a binary classifier, which contains three main components:
• Input Nodes or Input Layer
• Weights and Bias
• Activation Function

https://www.javatpoint.com/perceptron-in-machine-learning

Basic Components of Perceptron

Types of activation functions:

https://www.javatpoint.com/perceptron-in-machine-learning

How does Perceptron work?

In Machine Learning, the Perceptron is considered a single-layer neural network that consists of four main parameters: input values (input nodes), weights and bias, net sum, and an activation function.
The perceptron model begins by multiplying all input values by their weights, then adds these together to create the weighted sum. This weighted sum is then applied to the activation function 'f' to obtain the desired output. This activation function is also known as the step function and is represented by 'f'.

Exercise: write the final equation based on this information.

https://www.javatpoint.com/perceptron-in-machine-learning
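For reference, the equation the exercise points at is the standard perceptron output (stated here in textbook form rather than as the slide's own figure):

\[ \hat{y} = f\!\left(\sum_{i=1}^{n} w_i x_i + b\right), \qquad f(net) = \begin{cases} 1 & \text{if } net > 0 \\ 0 & \text{otherwise} \end{cases} \]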

How does Perceptron work?

For example, x1 = 2, x2 = 3, x3 = 1, the weights wn are certain numbers in the range [0, 1], and the step function is used. Estimate the output.
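A worked instance under assumed weights (the slide leaves the weights unspecified, so \( w_1 = 0.5, w_2 = 0.2, w_3 = 0.9 \) and zero bias here are purely illustrative):

\[ net = 2(0.5) + 3(0.2) + 1(0.9) = 2.5 > 0 \;\Rightarrow\; \text{output} = 1 \]

In fact, because all three inputs are positive and the weights lie in [0, 1], any choice with at least one non-zero weight gives a positive weighted sum, so the step function outputs 1.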


Problem 1: The input to a single-input neuron is 2.0, its weight is 2.3 and its bias is -3.
i. What is the net input to the transfer function?
ii. What is the neuron output?



Problem 2: The input to a single-input neuron is 2.0, its weight is 2.3 and its bias is -3.
What is the output of the neuron if it has the following transfer functions?
i. Hard limit
ii. Linear
iii. Log-sigmoid
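A quick numeric check for Problem 2 (a sketch under the standard definitions of these transfer functions, with the hard limit outputting 1 for non-negative net input):

```python
import math

n = 2.3 * 2.0 + (-3)             # net input: 1.6

hardlim = 1 if n >= 0 else 0     # hard limit
linear = n                       # purely linear
logsig = 1 / (1 + math.exp(-n))  # log-sigmoid

print(hardlim, linear, round(logsig, 4))  # 1 1.6 0.832
```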

Problem 3:

Given a two-input neuron with the following parameters: b = 1.2, W = [3 2], and p = [-5 6]^T, calculate the neuron output for the following transfer functions:

i. A symmetrical hard limit transfer function

ii. A saturating linear transfer function

iii. A hyperbolic tangent sigmoid (tansig) transfer function
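Again as a hedged sketch (standard definitions assumed: the symmetrical hard limit outputs ±1, the saturating linear function clamps to [0, 1], and tansig is tanh):

```python
import math

# Net input: n = W.p + b = 3*(-5) + 2*6 + 1.2
n = 3 * (-5) + 2 * 6 + 1.2          # -1.8

hardlims = 1 if n >= 0 else -1      # symmetrical hard limit -> -1
satlin = min(max(n, 0.0), 1.0)      # saturating linear -> 0.0
tansig = math.tanh(n)               # tangent sigmoid -> about -0.9468

print(hardlims, satlin, round(tansig, 4))
```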


An illustrative example

There is a conveyor belt on which fruit is loaded. This conveyor passes through a set of sensors, which measure three properties of the fruit: shape, texture, and weight.

Property    Value = 1    Value = -1
Shape       round        elliptical
Texture     smooth       rough
Weight      > 1 pound    <= 1 pound

The three sensor outputs will then be input to a neural network. The purpose of the network is to decide which kind of fruit is on the conveyor. Let's assume that there are only two kinds of fruit on the conveyor: apples and oranges.

An illustrative example

Apply the following perceptron model to the previous problem in the case of two inputs.

An illustrative example

If w1,1 = -1 and w1,2 = 1, find a.

An illustrative example

Therefore, if the inner product of the weight matrix (a single row vector in this case) with the input vector is greater than or equal to -b, the output will be 1. If the inner product of the weight vector and the input is less than -b, the output will be -1.
This divides the input space into two parts. The figure illustrates this for the case where b = -1. The blue line in the figure represents all points for which the net input is equal to 0:

n = [-1 1]p - 1 = 0

An illustrative example

The decision boundary between the categories is determined by the equation

Wp + b = 0

Because the boundary must be linear, the single-layer perceptron can only be used to recognize patterns that are linearly separable.

An illustrative example

Apply the following perceptron model to the previous problem in the case of three inputs. Find a.

An illustrative example

We want to choose the bias and the elements of the weight matrix so that the perceptron will be able to distinguish between apples and oranges. For example, we may want the output of the perceptron to be 1 when an apple is input and -1 when an orange is input.
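One valid choice, offered as a hedged worked answer (this appears to be the classic apples-and-oranges example from Hagan et al.'s Neural Network Design, where the texture sensor alone separates the classes): since apples are smooth (texture = 1) and oranges are rough (texture = -1), picking W = [0 1 0] and b = 0 makes the output depend only on texture.

```python
# Hedged check with prototype inputs [shape, texture, weight] under the
# table's encoding; the exact prototype vectors are assumptions here.
apple = [1, 1, -1]    # round, smooth
orange = [1, -1, -1]  # round, rough
W, b = [0, 1, 0], 0

def hardlims(n):
    # Symmetrical hard limit: +1 for non-negative net input, -1 otherwise.
    return 1 if n >= 0 else -1

for name, p in [("apple", apple), ("orange", orange)]:
    n = sum(w * x for w, x in zip(W, p)) + b
    print(name, hardlims(n))  # apple -> 1, orange -> -1
```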

AND, OR, and XOR Datasets
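A minimal sketch of the three bitwise datasets (the array layout is editorial; the key point, standard rather than from the slide, is that AND and OR are linearly separable while XOR is not):

```python
import numpy as np

# The four 2-bit input combinations, shared by all three datasets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

y_and = np.array([0, 0, 0, 1])
y_or  = np.array([0, 1, 1, 1])
y_xor = np.array([0, 1, 1, 0])  # not linearly separable
```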

Perceptron Training Procedure and the Delta Rule (step 2c)
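The standard perceptron delta update, offered here as a hedged reconstruction of the slide's figure, is, for each training sample \( (x, d) \) with learning rate \( \alpha \) and prediction \( \hat{y} \):

\[ w_j \leftarrow w_j + \alpha\,(d - \hat{y})\,x_j \]

The weights move only when the prediction differs from the target; the step 2c referenced in the title is presumably this update.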

Implementing the Perceptron in Python
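The code for this section exists only as slide images in this export; what follows is a minimal sketch in the spirit of the classic NumPy Perceptron class (class and method names are an editorial reconstruction, not the slides' verbatim code):

```python
import numpy as np

class Perceptron:
    def __init__(self, N, alpha=0.1):
        # Weight vector with one extra entry for the bias trick;
        # small random initialization scaled by sqrt(N).
        self.W = np.random.randn(N + 1) / np.sqrt(N)
        self.alpha = alpha

    def step(self, x):
        # Step activation: 1 if positive, else 0.
        return 1 if x > 0 else 0

    def fit(self, X, y, epochs=10):
        # Append a column of 1s so the bias is learned as a weight.
        X = np.c_[X, np.ones((X.shape[0],))]
        for _ in range(epochs):
            for (x, target) in zip(X, y):
                p = self.step(np.dot(x, self.W))
                if p != target:
                    # Delta rule: nudge the weights by the error.
                    self.W += -self.alpha * (p - target) * x

    def predict(self, X):
        X = np.atleast_2d(X)
        X = np.c_[X, np.ones((X.shape[0],))]
        return np.array([self.step(np.dot(x, self.W)) for x in X])
```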

Evaluating the Perceptron Bitwise Datasets
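The evaluation code is likewise only an image here; a hedged sketch of what such an evaluation typically looks like, reusing the Perceptron class and the X, y_and, y_or, y_xor arrays defined above:

```python
# Train and evaluate on each bitwise dataset. A single-layer perceptron
# should fit AND and OR perfectly but fail on XOR (not linearly separable).
for name, y in [("AND", y_and), ("OR", y_or), ("XOR", y_xor)]:
    p = Perceptron(X.shape[1], alpha=0.1)
    p.fit(X, y, epochs=20)
    preds = p.predict(X)
    print(f"{name}: targets={y.tolist()} predictions={preds.tolist()}")
```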
