
Part-1: Bayesian Learning

Intro:

- Bayesian learning is a probabilistic approach that applies Bayes' Theorem to update the probability estimate
for a hypothesis as more evidence or information becomes available.
- Bayesian learning is suitable for problems where plenty of prior/training data is available.
- That is, the probability of each hypothesis is re-evaluated and updated every time new data is added to the
considered dataset.
- Terminologies:
- Hypothesis Space (H): A set of all possible hypotheses that can explain the observed data.
- Prior Probability (P(H)): The initial probability assigned to a hypothesis before observing any data. This
represents prior knowledge or belief about the hypothesis.
- Likelihood (P(D|H)): The probability of observing the data D given that the hypothesis H is true. This
measures how well the hypothesis explains the data.
- Posterior Probability (P(H|D)): The updated probability of the hypothesis after considering the observed
data. This is the core of Bayesian updating.
- Bayesian Learning Process:
1. Define the Prior: Start with an initial belief about the hypotheses, represented by the prior distribution
P(H).
2. Collect Data: Gather observed data D.
3. Compute the Likelihood: Evaluate how likely the observed data is under each hypothesis using P(D|H).
4. Update Beliefs: Apply Bayes' Theorem to update the prior beliefs and obtain the posterior distribution
P(H|D).
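
A minimal sketch of this update over a small discrete hypothesis space (the hypotheses, priors, and likelihood
values below are made-up illustrative numbers, not from the notes):

# Bayesian update over a discrete hypothesis space (illustrative values only).
priors = {"h1": 0.6, "h2": 0.3, "h3": 0.1}        # P(H)
likelihoods = {"h1": 0.2, "h2": 0.5, "h3": 0.9}   # P(D|H) for the observed data D

# Evidence P(D) = sum over all hypotheses of P(D|H) * P(H)
p_d = sum(likelihoods[h] * priors[h] for h in priors)

# Posterior P(H|D) = P(D|H) * P(H) / P(D)
posteriors = {h: likelihoods[h] * priors[h] / p_d for h in priors}
print(posteriors)   # the beliefs after observing D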

Bayes’ Theorem:

- It is used to find the probability of occurrence of an event, based on previously provided evidence.
- It is stated as follows: for a given dataset (evidence) D, the probability that H is the correct hypothesis
for D is:

P(H | D) = P(D | H) P(H) / P(D)      (known as the posterior probability)


Where,
P(D | H) - Probability of observing the data D given that the hypothesis H is true
P(H) - Probability of hypothesis H
P(D) - Probability of the dataset D
- P(D | H) is termed the likelihood.
- P(H) is termed the prior probability, and P(D) is termed the evidence (the marginal probability of the data).
- From the set of posterior probabilities for each hypothesis, we can determine the most probable hypothesis
by using hMAP.
- hMAP (Maximum A Posteriori Hypothesis) is the hypothesis that maximizes the posterior probability given by
Bayes' theorem.
- It is given by:

hMAP = argmax (h ∈ H) P(h | D) = argmax (h ∈ H) P(D | h) P(h)

(P(D) is dropped from the maximization because it is the same for every hypothesis.)
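
Continuing the earlier sketch, hMAP can be read off the computed posteriors; since P(D) is the same for every
hypothesis, comparing P(D|H) * P(H) is enough (again, illustrative numbers):

# Selecting the MAP hypothesis: the one with the highest posterior probability.
priors = {"h1": 0.6, "h2": 0.3, "h3": 0.1}
likelihoods = {"h1": 0.2, "h2": 0.5, "h3": 0.9}
scores = {h: likelihoods[h] * priors[h] for h in priors}   # proportional to P(H|D)
h_map = max(scores, key=scores.get)
print(h_map)   # hypothesis with the maximum a posteriori probability
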
Bayes’ Theorem and Concept Learning:

Maximum Likelihood and Least Squared Error Hypothesis:

- Maximum Likelihood Estimation (MLE) is a method used in statistics and machine learning to estimate the
parameters of a model.
- The principle behind MLE is to find the parameter values that maximize the likelihood(probability) of the
observed data.
- While hMAP considers both the likelihood and the prior probability, hML considers only the likelihood.
- It is given by:

hML = argmax (h ∈ H) P(D | h)

- The above relation also applies to learning a continuous-valued target function (the ML hypothesis), because
the target data values in D are continuous (real-valued).
- However, models such as linear regression, non-linear regression, and curve fitting cannot learn directly
from the above relation.
- Hence the maximum likelihood hypothesis relation is re-derived to obtain the LSE relation.
- The Least Squared Error (LSE) hypothesis is a method used in regression analysis to find the best-fitting line
to a set of observed data points.
- The principle behind LSE is to minimize the sum of squared differences between observed values and
predicted values.
- It is given by:

hML = argmin (h ∈ H) Σ i=1..m ( di − h(xi) )²

where di is the observed target value and h(xi) is the value predicted by hypothesis h for training example xi.
- By deriving the maximum likelihood hypothesis under the assumption that the errors in the training data follow
a normal (Gaussian) distribution, we obtain the minimized/least squared error hypothesis.
- Derivation: (Notes)
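
A minimal sketch of this MLE/LSE connection, fitting a straight line by minimizing the sum of squared errors
(the data below is synthetic and purely illustrative):

# Least squared error fit of a line y = w0 + w1*x.
# Under Gaussian noise, this fit is also the maximum likelihood hypothesis.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=0.5, size=x.size)   # illustrative noisy data

X = np.column_stack([np.ones_like(x), x])            # design matrix [1, x]
w = np.linalg.lstsq(X, y, rcond=None)[0]             # minimizes the sum of squared errors
print("w0, w1 =", w)                                 # roughly 1 and 2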
----------------------------------------------------------------------------------------------------
2. Locally Weighted Regression:
- Regression is a statistical approach used to analyze the relationship between a dependent variable (target
variable) and one or more independent variables (predictor variables). The objective is to determine the most
suitable function that characterizes the connection between these variables.
- When the data points in a dataset can be fitted with a single straight line, plain linear regression is used
and the data is said to follow a linear trend.
- Locally Weighted Regression is a technique for fitting data that does not follow a single global line
(non-linear data): a separate, locally weighted fit is built around each query point.
- In this method, a weight is assigned to each training input X (X1, X2, …… Xn) using a relation known as
kernel smoothing, based on how close X lies to X0, the query point (the value at which we are predicting);
a code sketch is given at the end of this section.

- The weight assigned to a data point is inversely related to its distance from the query point X0. That is,
the smaller the distance, the larger the weight.

- A weight matrix (holding one weight per training input X) has to be constructed for each query point.

Drawbacks:
- Computation cost is high.
- Memory requirements are high.
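
A minimal sketch of locally weighted regression at a single query point, assuming the Gaussian kernel that is
commonly used for kernel smoothing (the data and the bandwidth tau are illustrative):

# Locally weighted (linear) regression at one query point x0.
# Weights follow a Gaussian kernel, so points near x0 dominate the local fit.
import numpy as np

def lwr_predict(x, y, x0, tau=1.0):
    X = np.column_stack([np.ones_like(x), x])        # design matrix [1, x]
    w = np.exp(-((x - x0) ** 2) / (2 * tau ** 2))    # kernel-smoothed weights
    W = np.diag(w)                                   # one weight matrix per query point
    # Weighted least squares: theta = (X^T W X)^-1 X^T W y
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return theta[0] + theta[1] * x0

rng = np.random.default_rng(0)
x = np.linspace(0, 6, 80)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)   # illustrative non-linear data
print(lwr_predict(x, y, x0=3.0, tau=0.5))            # close to sin(3.0)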

3. Radial basis functions:

Data in Machine Learning can be of two types, viz:


1. Linearly Separable Data
2. Non-Linearly Separable Data
- A single-layer perceptron can be used to classify linearly separable data.
- A multi-layer perceptron is needed to classify non-linearly separable data, but it is very complex.
- Hence, the complexity of classifying non-linearly separable data with multi-layer perceptrons can be reduced
by using radial basis functions (RBFs) as activation functions.
- With RBFs, the data is compressed horizontally and expanded vertically.
- The two RBFs are:
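
A minimal sketch assuming the Gaussian RBF, one commonly used radial basis function, whose output depends only
on the distance of the input from a centre:

# Gaussian radial basis function (an assumed, commonly used example).
import numpy as np

def gaussian_rbf(x, c, gamma=1.0):
    # exp(-gamma * ||x - c||^2): 1 at the centre, decaying with distance
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(c)) ** 2))

# Inputs that are hard to separate linearly can become separable after being
# mapped through RBFs centred at representative points.
print(gaussian_rbf([0.1], [0.0]))   # close to 1 (near the centre)
print(gaussian_rbf([3.0], [0.0]))   # close to 0 (far from the centre)
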
4. Case Based Reasoning:

- Case-Based Reasoning (CBR) is a problem-solving approach in artificial intelligence and cognitive science
where new problems are solved by referencing and adapting solutions from previously encountered, similar
cases.
- In this method, every problem is treated as a separate case.
- The instances (cases) are represented symbolically, rather than as numeric feature vectors.
- CBR is similar to instance-based learning: just as new instances are classified by directly analyzing the
existing instances, CBR reuses previously solved cases while solving (classifying) a new case.
- An example CBR system is CADET (Case-based Design Tool), which works from a library of about 75 stored case
designs.
- Working Procedure of CBR:
1. Case Retrieval: When a new problem arises, the system retrieves cases from its database of previously
solved problems that are similar to the current problem. This involves finding cases that have similar features
or are in similar contexts.
2. Case Adaptation: Once similar cases are retrieved, the system adapts the solutions from these cases to fit
the new problem. This may involve modifying the solution to account for differences between the old and new
cases.
3. Solution Application: The adapted solution is then applied to the current problem. This step involves
executing the solution and potentially testing it to ensure it works in the new context.
4. Case Storage: After solving the problem, the system stores the new case and its solution in the database.
This allows the system to build up a repository of cases over time, improving its ability to solve future problems.
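
A minimal sketch of the retrieve-adapt-apply-store cycle described above (the case structure, the similarity
measure, and the adaptation step are simplified assumptions, not taken from any particular CBR system):

# A toy case-based reasoning cycle: retrieve, adapt, apply, store.
from dataclasses import dataclass

@dataclass
class Case:
    features: dict        # symbolic description of the problem
    solution: str         # solution that worked for this case

case_base = []            # repository of previously solved cases

def similarity(a, b):
    # Count matching feature-value pairs (a deliberately simple measure).
    return sum(1 for k, v in a.items() if b.get(k) == v)

def solve(problem):
    # 1. Case retrieval: find the most similar stored case.
    best = max(case_base, key=lambda c: similarity(c.features, problem))
    # 2. Case adaptation: reuse the old solution, adjusted for the new case.
    solution = best.solution + " (adapted)"
    # 3. Solution application would happen here (domain specific).
    # 4. Case storage: remember the new case for future problems.
    case_base.append(Case(problem, solution))
    return solution

case_base.append(Case({"type": "pipe", "material": "steel"}, "design-A"))
print(solve({"type": "pipe", "material": "copper"}))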
Remarks on lazy and eager learning:

Lazy Learning: Prediction is made directly by analyzing the pre-existing instances, i.e. without building a
model.
- Computation time and memory requirements are higher, because all existing instances must be examined for each
prediction, and every new instance must also be stored for use in future classifications.
Ex: k-Nearest Neighbours (KNN), etc.
Eager Learning: Prediction is made by a model trained beforehand on the provided instances, i.e. by building a
model.
- Computation time and memory requirements at prediction are relatively lower.
Ex: Decision Tree Learning, Naive Bayes Classifier, etc.
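
A minimal sketch of the lazy style using a k-nearest-neighbour classifier: "training" only stores the instances,
and all the work happens at prediction time (the data is illustrative):

# Lazy learning: no model is built; computation happens at prediction time.
import numpy as np

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" just stores the instances (hence the higher memory cost).
        self.X, self.y = np.asarray(X), np.asarray(y)
        return self

    def predict(self, x):
        # Every stored instance is examined for each query (higher compute cost).
        dists = np.linalg.norm(self.X - np.asarray(x), axis=1)
        nearest = self.y[np.argsort(dists)[:self.k]]
        values, counts = np.unique(nearest, return_counts=True)
        return values[np.argmax(counts)]

X = [[0, 0], [0, 1], [5, 5], [6, 5]]
y = ["A", "A", "B", "B"]
print(KNNClassifier(k=3).fit(X, y).predict([5, 6]))   # predicts "B"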
