
UNIT 4
Bayesian and Computational Learning

Dr Raghavendra S
Associate Professor & Coordinator
BTech (CSE) in AIML


UNIT 4 - Bayesian and Computational Learning
● Bayes Theorem
● Concept Learning
● Maximum Likelihood
● Minimum Description Length Principle
● Bayes Optimal Classifier
● Gibbs Algorithm
● Naïve Bayes Classifier
● Bayesian Belief Network
● EM Algorithm


BAYES THEOREM


Bayesian Learning

● Bayesian learning makes use of probability to model the data and to
  measure the uncertainty in the prediction.
● It uses prior knowledge to make the prediction.
● It can be used for both classification and regression problems.
● Each observed training example can incrementally decrease or increase
  the estimated probability of a hypothesis.
● It combines the prior knowledge with the observed data.

Bayes Theorem gives the conditional probability of an event A given that
another event B has occurred:

  P(A|B) = P(B|A) P(A) / P(B)

Where,
P(A|B) : Probability of A given B (posterior)
P(B|A) : Probability of B given A (likelihood)
P(A) : Prior probability of event A
P(B) : Prior probability of event B

Algorithmic Steps
1. Compute the prior probability of A
2. Compute the prior probability of B
3. Compute the likelihood probability of B given A
4. Find the posterior probability of A given B using steps 1, 2 and 3
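A minimal sketch of these steps in Python (the probability values used
below are hypothetical, purely for illustration):

```python
def posterior(prior_a, prior_b, likelihood_b_given_a):
    """Bayes theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / prior_b

# Steps 1-3: assumed priors and likelihood; Step 4: combine them.
print(posterior(prior_a=0.3, prior_b=0.5, likelihood_b_given_a=0.8))  # 0.48
```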

Bayes Theorem Derivation

From the definition of conditional probability,
P(A ∩ B) = P(A|B) P(B) and P(A ∩ B) = P(B|A) P(A).
Equating the two and dividing by P(B) gives

  P(A|B) = P(B|A) P(A) / P(B)

This is Bayes Theorem.

Example

h: Hypothesis
D: Training Data

For each hypothesis h, compute P(D|h) P(h); ignore P(D), which is common
to both cases. Taking the Max of the resulting values, the new patient is
classified as not having cancer.
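A sketch of this MAP comparison in Python. The slide's own figures are in
the table, so the numbers below are the standard textbook values for this
example and should be treated as assumptions: P(cancer) = 0.008,
P(+|cancer) = 0.98, P(+|¬cancer) = 0.03, with a new patient whose test
result is positive.

```python
# Assumed illustrative values (standard textbook cancer-test example).
p_cancer, p_not_cancer = 0.008, 0.992
p_pos_given_cancer, p_pos_given_not = 0.98, 0.03

# Compare P(D|h) * P(h) for each hypothesis; P(D) is ignored since it is
# common to both cases.
score_cancer = p_pos_given_cancer * p_cancer     # 0.0078
score_not    = p_pos_given_not * p_not_cancer    # 0.0298

h_map = "cancer" if score_cancer > score_not else "not cancer"
print(h_map)  # "not cancer": the new patient is classified as not having cancer
```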

CONCEPT LEARNING

Concept Learning

● The problem of inducing a general function from specific training
  examples is the central problem of concept learning.

● Concept learning can be formulated as a problem of searching through a
  predefined space of potential hypotheses for the hypothesis that best
  fits the training examples.

● We work with a predefined set of potential hypotheses and identify the
  one hypothesis that best fits the training examples.


Concept Learning - Case Study

Concept learning heuristically searches for the most optimal (best)
hypothesis with respect to the given training set.


Illustration: Determine when Tom enjoys his sport under the given
conditions, with the possible values for each attribute.


Add two more possibilities to each attribute ("?" - any value is
acceptable, and "∅" - no value is acceptable) and we get the full
hypothesis space.

Substitute "?" with all possible values and determine the decision, e.g.:
Sunny, Warm, Normal/High, Strong, Warm/Cool, Same/Change


FIND-S Algorithm - Finding a Maximally Specific Hypothesis
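A minimal Python sketch of FIND-S, assuming the standard four-example
EnjoySport-style training set (consistent with the attribute values listed
above). FIND-S starts with the most specific hypothesis and generalizes it
just enough to cover each positive example, ignoring negative examples.

```python
def find_s(examples):
    """Return the maximally specific hypothesis covering all positives."""
    hypothesis = None
    for attributes, label in examples:
        if label != "Yes":                 # FIND-S ignores negative examples
            continue
        if hypothesis is None:
            hypothesis = list(attributes)  # take the first positive example as-is
        else:
            for i, value in enumerate(attributes):
                if hypothesis[i] != value:
                    hypothesis[i] = "?"    # generalize any mismatching attribute
    return hypothesis

# Assumed training data (Sky, AirTemp, Humidity, Wind, Water, Forecast).
training = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
print(find_s(training))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```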


Maximum Likelihood & Least-Squared Error


Maximum Likelihood and Least-Squared Error Hypothesis

● Let us assume that the target variable is normally distributed. If the
  target variables are normally distributed, then we can use the
  probability density function of the normal distribution:

  p(x) = (1/√(2πσ²)) · e^( -(x - μ)² / (2σ²) )

  Here, μ = mean, x = input and σ = standard deviation (assumed constant).

● In the maximum-likelihood expression hML = argmax over h of the product
  of p(di | h), each probability p(di | h) is normally distributed, so we
  can replace it by the normal density with mean μ = h(xi); that is, each
  target value is di = h(xi) plus Gaussian noise.

● From the above replacement we get:

  hML = argmax over h of  Π(i=1..m)  (1/√(2πσ²)) · e^( -(di - h(xi))² / (2σ²) )

● Taking the logarithm on the RHS, we get:

  hML = argmax over h of  Σ(i=1..m)  [ ln(1/√(2πσ²)) - (di - h(xi))² / (2σ²) ]

  Taking the logarithm reduces the complexity: the product term is
  converted to a summation. Formulas used:
  ln(xy) = ln(x) + ln(y)
  ln(e^x) = x · ln(e) = x

● Discarding the constant first term, we get:

  hML = argmax over h of  Σ(i=1..m)  -(di - h(xi))² / (2σ²)

● Maximizing a negative quantity is the same as minimizing the
  corresponding positive quantity, so convert the argmax to argmin:

  hML = argmin over h of  Σ(i=1..m)  (di - h(xi))² / (2σ²)

● Finally, discarding the remaining constant, we get the least-squared
  error hypothesis:

  hML = argmin over h of  Σ(i=1..m)  (di - h(xi))²
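A small numerical sketch of this equivalence, using hypothetical data (a
linear hypothesis h(x) = w·x with Gaussian noise): the hypothesis that
maximizes the Gaussian log-likelihood is the same one that minimizes the
sum of squared errors.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
d = 2.0 * x + rng.normal(0.0, 0.1, size=x.shape)  # targets with Gaussian noise
sigma = 0.1                                        # assumed constant std deviation

def log_likelihood(w):
    """Gaussian log-likelihood of the hypothesis h(x) = w * x."""
    residuals = d - w * x
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - residuals**2 / (2 * sigma**2))

def squared_error(w):
    return np.sum((d - w * x) ** 2)

candidates = np.linspace(1.5, 2.5, 101)
w_ml  = candidates[np.argmax([log_likelihood(w) for w in candidates])]
w_lse = candidates[np.argmin([squared_error(w) for w in candidates])]
print(w_ml, w_lse)  # both criteria select the same hypothesis
```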
MLE in the Case of Discrete Values

Discrete example: the probability of Outlook = Sunny given Yes, and the
probability of Outlook = Sunny given No, estimated from the observed
frequency counts.

What if MLE is needed for continuous data?

When the target is discrete but the attributes are continuous, use the
normal density equation above for the conditional probabilities of the
attributes.

BAYES OPTIMAL CLASSIFIER
(Maximization function)


Bayes Optimal Classifier (Max function)

The Bayes optimal classification of a new instance is the class value vj
that maximizes the total posterior support from all hypotheses:

  argmax over vj in V of  Σ(hi in H)  P(vj | hi) P(hi | D)


Example - Bayes Optimum Classifier

Summing the posterior probabilities of the hypotheses voting for each
class and taking the maximum: Max{0.4, 0.6} = 0.6, so the instance belongs
to the -ve class.
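A sketch of this computation in Python. The individual hypothesis
posteriors are not listed in the text, so the values below are assumptions
chosen to be consistent with the Max{0.4, 0.6} result (one hypothesis with
posterior 0.4 predicting +, two with posterior 0.3 each predicting -).

```python
# Assumed posteriors P(h|D) and per-hypothesis predictions.
posteriors  = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "+", "h2": "-", "h3": "-"}

# Sum P(v|h) * P(h|D) over hypotheses; here each h predicts one class with
# probability 1, so this is the total posterior voting for each class.
votes = {}
for h, p in posteriors.items():
    votes[predictions[h]] = votes.get(predictions[h], 0.0) + p

print(votes)                      # {'+': 0.4, '-': 0.6}
print(max(votes, key=votes.get))  # '-' : the instance is classified as negative
```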


GIBBS ALGORITHM
(for many hypotheses)


Gibbs Algorithm
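A commonly stated form of the Gibbs algorithm: (1) choose a hypothesis h
from H at random, according to the posterior distribution P(h|D); (2) use
h to predict the classification of the next instance. A minimal sketch,
reusing the assumed posteriors from the Bayes optimal example above:

```python
import random

# Assumed posteriors P(h|D) and per-hypothesis predictions (illustrative).
posteriors  = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {"h1": "+", "h2": "-", "h3": "-"}

# Gibbs: sample ONE hypothesis according to its posterior and use it,
# instead of weighting every hypothesis as the Bayes optimal classifier does.
h = random.choices(list(posteriors), weights=list(posteriors.values()), k=1)[0]
print(h, predictions[h])  # classification given by the sampled hypothesis
```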


Naive Bayes Classifier


Posterior = (Likelihood × Prior) / Evidence


Case 1: Based on the highest probability


Case 2: Based on the most recent data


EXAMPLE 1: NAÏVE BAYES CLASSIFIER


Example 1:
Find the probability of the given instance, for the given prior and
conditional probabilities, using the Naïve Bayes theorem. There is one
target variable with a given prior probability and four attributes with
given conditional probabilities.

Prior Probabilities of Target Variable

Frequency Tables of Conditional Probabilities of Attributes wrt Y/N


The generalized Naïve Bayes formula for the probability of A given the
attributes B1, ..., Bn is:

  P(A | B1, ..., Bn) = P(B1|A) × P(B2|A) × ... × P(Bn|A) × P(A) / P(B1, ..., Bn)


Try this: Find the probability of playing when the weather is Sunny.

Problem formulation (Bayes Theorem):

  P(Yes | Sunny) = P(Sunny | Yes) × P(Yes) / P(Sunny)

                 = (2/9 × 9/14) / (5/14) = 0.40
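A quick check of this calculation in Python (the 2/9, 9/14 and 5/14 values
are the slide's own figures, corresponding to the standard Play-Tennis
counts):

```python
p_sunny_given_yes = 2 / 9    # P(Sunny | Yes)
p_yes             = 9 / 14   # prior P(Yes)
p_sunny           = 5 / 14   # evidence P(Sunny)

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(round(p_yes_given_sunny, 2))  # 0.4
```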

Example 2: (3 targets, 3 attributes)
Find the probability of a given fruit being Orange, Banana or Others, given
its prior probabilities, with three attributes (Yellow, Sweet, Long) and
their conditional probabilities, using the Naïve Bayes algorithm. The
probabilities are given by the count table below (the totals give the
priors).

  Fruit       Yellow   Sweet   Long   Total
  1) Orange      350     450      0     650
  2) Banana      400     300    350     400
  3) Others       50     100     50     150
  Total          800     850    400    1200


P(Orange | Yellow, Sweet, Long) ∝ P(Yellow|Orange) × P(Sweet|Orange) × P(Long|Orange)
                                = 0.53 × 0.69 × 0 = 0

Similarly, determine the probability of the fruit being Banana and Others:

P(Banana | Yellow, Sweet, Long) ∝ P(Yellow|Banana) × P(Sweet|Banana) × P(Long|Banana)
                                = 1 × 0.75 × 0.87 = 0.65

P(Others | Yellow, Sweet, Long) ∝ P(Yellow|Others) × P(Sweet|Others) × P(Long|Others)
                                = 0.33 × 0.66 × 0.33 = 0.072

Hence the given fruit is classified as Banana, which has the highest
probability (0.65).
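The same computation from the count table, as a small Python sketch (each
conditional probability is the attribute count divided by the class total,
exactly as above):

```python
# Count table from the example (per-class attribute counts and class totals).
counts = {
    "Orange": {"Yellow": 350, "Sweet": 450, "Long": 0,   "Total": 650},
    "Banana": {"Yellow": 400, "Sweet": 300, "Long": 350, "Total": 400},
    "Others": {"Yellow": 50,  "Sweet": 100, "Long": 50,  "Total": 150},
}

def score(fruit, attributes):
    """Product of P(attribute | fruit), as in the worked example above."""
    total = counts[fruit]["Total"]
    s = 1.0
    for a in attributes:
        s *= counts[fruit][a] / total
    return s

for fruit in counts:
    print(fruit, round(score(fruit, ["Yellow", "Sweet", "Long"]), 3))
# Orange 0.0, Banana 0.656, Others 0.074 -> classified as Banana
# (the slide's 0.65 and 0.072 come from rounding each factor first)
```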


Joint Probability Distribution
For all combinations of variables


Joint Probability Distribution

Example: for two variables X and Y, the joint distribution has one entry
for each combination of values, including the cases where X is not true
and where Y is not true.

Limitations of the Joint Probability Distribution

As the number of variables grows, the full joint table grows exponentially
(2^n entries for n Boolean variables), which makes it impractical to
specify directly; Bayesian networks address this by factorizing the joint
distribution.


Bayesian Belief Network
(also called a Bayesian network, belief network, or probabilistic network)

Bayesian Networks
● A Bayesian network is a probabilistic graphical model
which represents a set of variables and their conditional
dependencies using a directed acyclic graph.
● It is also called a Bayes network, belief network,
decision network, or Bayesian model.
● Bayesian networks are probabilistic, because these
networks are built from a probability distribution, and
also use probability theory for prediction and anomaly
detection.
● Real world applications are probabilistic in nature, and
to represent the relationship between multiple events,
we need a Bayesian network.
● A Bayesian network can be used for building models from data and
  experts' opinions, and it consists of two parts:
  1. A directed acyclic graph
  2. A table of conditional probabilities.

Bayesian Networks - Example

● A Bayesian network graph is made up of nodes and arcs (directed links),
  where:

   Each node corresponds to a random variable, and a variable can be
    continuous or discrete.

   Arcs (directed arrows) represent the causal relationships or
    conditional probabilities between random variables. These directed
    links connect pairs of nodes in the graph. A link means that one node
    directly influences the other; if there is no directed link, the nodes
    are independent of each other.

• In the diagram, A, B, C, and D are random variables represented by the
  nodes of the network graph.
• If we consider node B, which is connected to node A by a directed arrow,
  then node A is called the parent of node B.

Bayesian Networks

● Each node in the Bayesian network has a conditional probability
  distribution P(Xi | Parents(Xi)), which determines the effect of the
  parents on that node.

● A Bayesian network is based on the joint probability distribution and
  conditional probability.


Joint probability distribution

● If we have variables x1, x2, x3, ....., xn, then the probabilities of
  the different combinations of x1, x2, x3, ..., xn are known as the joint
  probability distribution.

● P[x1, x2, x3, ....., xn] can be written in the following way in terms of
  conditional probabilities:

  = P[x1 | x2, x3, ....., xn] P[x2, x3, ....., xn]

  = P[x1 | x2, x3, ....., xn] P[x2 | x3, ....., xn] .... P[xn-1 | xn] P[xn]

● In general, for each variable Xi we can write the equation as:

  P(Xi | Xi-1, ........., X1) = P(Xi | Parents(Xi))


Example: Compute the probability for the DAG by multiplying each node's
conditional probability given its parents:

  = 0.7 × 0.4 × 0.6 × 0.3 = 0.0504
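A minimal sketch of this chain-rule evaluation in Python. The slide's DAG
and its CPT entries appear only in the figure, so the structure and the
four factor values below are assumptions for illustration; they reproduce
the 0.0504 product above.

```python
# Assumed factorization: P(A, B, C, D) = P(A) * P(B|A) * P(C|A) * P(D|B, C)
factors = {
    "P(A)":     0.7,
    "P(B|A)":   0.4,
    "P(C|A)":   0.6,
    "P(D|B,C)": 0.3,
}

joint = 1.0
for name, value in factors.items():
    joint *= value            # multiply one conditional factor per node

print(round(joint, 4))        # 0.0504, matching the value on the slide
```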




EM Algorithm
Expectation Maximization, for Unsupervised Learning


EM Algorithm for clustering: iterate until the best clusters are obtained.


EM Flow Chart


EM Algorithm: Example1

Problem definition
• Let A & B : two Coins
• ϴ1: Probability of Getting Head with Coin A
• ϴ2: Probability of Getting Head with Coin B
• Find the final probability values of ϴ1 & ϴ2 from 5 experiments, in each
  of which a coin (A or B) is chosen at random and tossed 10 times.



This indicates that whenever we toss coin A there is an 80% chance of
getting HEADS, and whenever we toss coin B there is a 45% chance of getting
HEADS.

In the above scenario, 5 experiments are done and, in each experiment, we
know which coin is selected. If the experiments were conducted without
knowing the coin labels, it would be difficult to identify the coin, and we
could not fill the table and calculate the theta values.

To solve this, we make use of the EM algorithm. We consider the same
experiment, but without the coin labels.

Step 1: Assume initial probabilities ϴ1 and ϴ2.


Step 2 (E-step): Using the initial ϴ1 & ϴ2 and the head/tail counts of each
experiment, estimate how likely each coin is to have produced that
experiment.


Step 3: Compute L(H) and L(T), the expected (likelihood-weighted) numbers
of heads and tails, for each coin.

Step 4 (M-step): Compute the new ϴ1 & ϴ2 from these expected counts.

Repeat Steps 1-4 until convergence is reached.

Initial values: Theta A = 0.6, Theta B = 0.5

        # of    # of                                          Coin A         Coin B
        Heads   Tails   L(A)         L(B)         P(A)  P(B)  Heads  Tails   Heads  Tails
Exp 1     5       5     0.000796262  0.000976563  0.45  0.55   2.2    2.2     2.8    2.8
Exp 2     9       1     0.004031078  0.000976563  0.80  0.20   7.2    0.8     1.8    0.2
Exp 3     8       2     0.002687386  0.000976563  0.73  0.27   5.9    1.5     2.1    0.5
Exp 4     4       6     0.000530842  0.000976563  0.35  0.65   1.4    2.1     2.6    3.9
Exp 5     7       3     0.001791590  0.000976563  0.65  0.35   4.5    1.9     2.5    1.1
Totals                                                        21.3    8.6    11.7    8.4

New Theta A = 0.713012
New Theta B = 0.581339
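A runnable sketch of one EM iteration for this two-coin example, using the
head/tail counts from the table (5 experiments of 10 tosses each) and the
initial values ϴ1 = 0.6, ϴ2 = 0.5; it reproduces the new theta values shown
above:

```python
heads = [5, 9, 8, 4, 7]            # heads per experiment (from the table)
tails = [10 - h for h in heads]    # each experiment has 10 tosses

def em_step(theta_a, theta_b):
    ha = ta = hb = tb = 0.0
    for h, t in zip(heads, tails):
        # E-step: likelihood of this experiment under each coin (the
        # binomial coefficient cancels in the normalisation below).
        la = theta_a**h * (1 - theta_a)**t
        lb = theta_b**h * (1 - theta_b)**t
        pa = la / (la + lb)              # responsibility of coin A
        pb = 1 - pa                      # responsibility of coin B
        ha += pa * h; ta += pa * t       # expected head/tail counts for A
        hb += pb * h; tb += pb * t       # expected head/tail counts for B
    # M-step: new theta values from the expected counts.
    return ha / (ha + ta), hb / (hb + tb)

theta_a, theta_b = 0.6, 0.5
theta_a, theta_b = em_step(theta_a, theta_b)
print(round(theta_a, 6), round(theta_b, 6))   # ~0.713012 and ~0.581339
```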



Unit 4 Summary
● Bayes Theorem - Derivation
● Concept Learning
● Maximum Likelihood (Minimization)
● Minimum Description Length Principle
● Bayes Optimal Classifier (Maximization)
● Gibbs Algorithm (n Hypotheses)
● Naïve Bayes Classifier (Maximization)
● Bayesian Belief Network (Conditional Probability)
● EM Algorithm (Clustering)
