
Introduction to ML

Computer Vision
Introduction to
Machine Learning

2 / 38
Recognition Problems
• The main problem: the rules must be selected manually.

[Figure: a decision tree of manually selected rules. Condition 1 and Condition 2 are checked in turn; their true/false branches assign Class 1 or Class 2. The inputs are geometric features for selected segments.]

• Consequences:
1. Meaningful and informative features should be used.
2. There are very few such features, and their complex combinations
cannot be processed.
3 / 38
Ideal Solution
• The machine gives answers to questions based on the data
already processed.
[Diagram: data is fed to a learning algorithm, which produces a trained machine; the trained machine receives a question and returns an answer.]

4 / 38
Learning Process
• Learning is not the same as memorization; memorization is not a
problem for a machine.
• The machine must learn to draw inferences from a set of training
data.
• The machine must work correctly based on new data that was
not given to it before.

5 / 38
Definition
• «A computer program is said to learn from experience 𝐸 with
respect to some task 𝑇 and some performance measure 𝑃, if its
performance on 𝑇, as measured by 𝑃, improves with experience
𝐸» (T. M. Mitchell, 1997).

6 / 38
Applications
• computer vision,
• speech recognition,
• computational linguistics and natural language processing,
• medical diagnostics,
• bioinformatics,
• technical diagnostics,
• financial applications,
• text search and categorization,
• expert systems,
• etc.
7 / 38
Machine Learning Classes
1. Deductive learning (from general to particular).
• There are formalized data.
• It is required to derive a rule applicable to a particular case
based on formalized data.
• Typical example: expert systems.
2. Inductive learning (from particular to general).
• Empirical data are given; it is required to recover some dependence.
• Subdivided into:
a. Supervised learning;
b. Unsupervised learning;
c. Reinforcement learning;
d. Active learning etc.
8 / 38
Probability Theory and Stochastic Processes

• What is a probability?
1. In the frequentist (frequency) interpretation, probability is the
relative frequency of a repeating event.
2. In the Bayesian interpretation, probability is a measure of
uncertainty about the outcome of an experiment.

9 / 38
Example: Extraction of Fruit From Two Boxes

• Experiment:
• Box selection;
• Fruit extraction;
• Putting the fruit back.
• Two random variables:
• 𝑋 – the color of the box (red or blue);
• 𝑌 – fruit (orange or apple).
$P(X = \mathrm{red}) = \frac{\text{number of times the red box was chosen}}{\text{number of experiments performed}}$
• 𝑃 is the probability of choosing the red box.
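• A minimal simulation of this frequency definition (a hypothetical Python sketch; the 40%/60% box probabilities are assumed values, not taken from the slides):

import random

random.seed(0)
N = 100_000                      # number of repeated experiments
red_count = 0
for _ in range(N):
    # choose a box at random; the 0.4 / 0.6 split is an assumed example value
    box = "red" if random.random() < 0.4 else "blue"
    if box == "red":
        red_count += 1

# frequency interpretation: probability is the relative frequency of the event
print("estimated P(X = red):", red_count / N)   # close to 0.4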

10 / 38
Example: Extraction of Fruit From Two Boxes

• We will perform an experiment and enter the number of outcomes
in the table (horizontally – the colors of the box, vertically – fruits).

• Event intersection (joint) probability:
$P(X = x_i, Y = y_j) = \frac{n_{ij}}{N}$,
where $N$ is the number of experiments and $n_{ij}$ the number of outcomes of type $(i, j)$.
• Conditional probability:
$P(Y = y_j \mid X = x_i) = \frac{n_{ij}}{c_i}$, where $c_i = \sum_j n_{ij}$.
11 / 38
Example: Extraction of Fruit From Two Boxes

[Figure: histograms of the conditional distributions P(Y | X = red) and P(Y | X = blue).]

Conditional probability:
the probability that the fruit is an orange is 75% for the red box
and 25% for the blue box.
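• These numbers can be reproduced from a count table with a short sketch (hypothetical counts, chosen only so that the conditional probabilities above come out to 75% and 25%):

# n_ij: number of outcomes for each (fruit, box) pair; assumed illustrative counts.
counts = {
    ("orange", "red"): 30, ("apple", "red"): 10,    # 40 draws from the red box
    ("orange", "blue"): 15, ("apple", "blue"): 45,  # 60 draws from the blue box
}
N = sum(counts.values())

# Joint probability P(X = x, Y = y) = n_ij / N
joint = {pair: n / N for pair, n in counts.items()}

# Conditional probability P(Y = y | X = x) = n_ij / (number of draws from box x)
def conditional(fruit, box):
    draws_from_box = sum(n for (f, b), n in counts.items() if b == box)
    return counts[(fruit, box)] / draws_from_box

print(joint[("orange", "red")])       # 0.3
print(conditional("orange", "red"))   # 0.75
print(conditional("orange", "blue"))  # 0.25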
12 / 38
Bayes Formula

• What is the probability that we have a dinosaur in front of us
(𝑥 – observation)?
$P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}$ – the Bayes formula,
where $P(x \mid y)$ – the probability that the dinosaur looks like this;
$P(y)$ – the probability of meeting a dinosaur;
$P(x)$ – the probability of seeing such a scene.
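• In code the same computation is a single line; the prior, likelihood and evidence values below are invented for illustration, since the slides give no numbers:

def bayes_posterior(likelihood, prior, evidence):
    # P(y | x) = P(x | y) * P(y) / P(x)
    return likelihood * prior / evidence

p_x_given_dino = 0.8   # P(x | y): a dinosaur would look like this (assumed)
p_dino = 1e-6          # P(y): prior probability of meeting a dinosaur (assumed)
p_scene = 0.01         # P(x): probability of seeing such a scene (assumed)

print(bayes_posterior(p_x_given_dino, p_dino, p_scene))  # 8e-05: still very unlikely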

13 / 38
Probability Theory
• Sum rule:
$P(x) = \int_y P(x, y)\,dy \;\leftrightarrow\; P(y) = \int_x P(x, y)\,dx$
• Product rule:
$P(x, y) = P(y \mid x)\,P(x) = P(x \mid y)\,P(y)$
• If two random variables are independent:
$P(x, y) = P(x)\,P(y)$
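• For discrete variables the integrals turn into sums; a quick sketch with the fruit-and-box numbers (the 0.4/0.6 box probabilities and the 75%/25% conditionals are the assumed values used earlier):

# P(x): probability of each box; P(y | x): conditional fruit probabilities (assumed values).
P_x = {"red": 0.4, "blue": 0.6}
P_y_given_x = {"red": {"orange": 0.75, "apple": 0.25},
               "blue": {"orange": 0.25, "apple": 0.75}}

# Product rule: P(x, y) = P(y | x) * P(x)
P_xy = {(x, y): P_y_given_x[x][y] * P_x[x]
        for x in P_x for y in ("orange", "apple")}

# Sum rule (discrete form): P(y) = sum over x of P(x, y)
P_y = {y: sum(P_xy[(x, y)] for x in P_x) for y in ("orange", "apple")}
print(P_y)  # roughly {'orange': 0.45, 'apple': 0.55}

# Independence would require P(x, y) = P(x) * P(y); here it does not hold:
print(P_xy[("red", "orange")], P_x["red"] * P_y["orange"])  # 0.3 vs 0.18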

14 / 38
Machine Learning Tasks
• Setting the task of supervised learning:
• 𝕏 – a set of objects or examples, situations, inputs (samples);
• 𝕐 – a set of answers or labels, outputs (responses).
• There is some dependence that allows predicting 𝑦 ∈ 𝕐 from 𝑥 ∈
𝕏.
• If the dependence is deterministic, then there is a function
𝑓*: 𝕏 → 𝕐.
• The dependence is known only on the objects of the training
sample, i.e. we know a finite set of data:
$\{(x^{(i)}, y^{(i)}): x^{(i)} \in \mathbb{X},\ y^{(i)} \in \mathbb{Y}\}\ (i = 1, \dots, N)$.

15 / 38
Machine Learning Tasks
• An ordered object-response pair $(x^{(i)}, y^{(i)}) \in (\mathbb{X} \times \mathbb{Y})$ is called a
precedent.
• The task of supervised learning is to restore the relationship
between input and output based on the existing training sample,
• i.e., it is necessary to design a function (decision rule) 𝑓: 𝕏 → 𝕐, for
new objects 𝑥 ∈ 𝕏 predicting the answer 𝑓(𝑥) ∈ 𝕐:
$y = f(x) \approx f^*(x)$.
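• As an illustration of such a decision rule (not the specific method of these slides), a sketch of a one-nearest-neighbour rule: f answers for a new object with the response of the closest precedent; the data points are made up:

import math

# Training sample of precedents (x_i, y_i): x in R^2, y a class label (made-up values).
train = [((0.0, 0.0), "apple"), ((1.0, 0.2), "apple"),
         ((5.0, 4.8), "orange"), ((6.0, 5.5), "orange")]

def f(x):
    # Decision rule f: X -> Y built from the precedents (1-nearest neighbour).
    _, label = min(train, key=lambda pair: math.dist(pair[0], x))
    return label

print(f((0.5, 0.1)))   # 'apple'
print(f((5.5, 5.0)))   # 'orange'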

16 / 38
Basic Definitions
• The functions 𝑓 are chosen from the parametric family 𝐹, i.e.
from a set of possible models.
• The process of finding the function 𝑓 is called learning, as well
as tuning or fitting the model.
• An algorithm for designing a function 𝑓 from a given training set
is called a learning algorithm.
• Some class of algorithms is called a learning method.

17 / 38
Basic Definitions
• Learning algorithms operate with object descriptions: each
sample element is described by a set of features
$x = (x_1, x_2, \dots, x_d)$ (a feature vector), where $x_j \in Q_j$, $j = 1, \dots, d$,
$\mathbb{X} = Q_1 \times Q_2 \times \cdots \times Q_d$.
• The set 𝕏 is called the feature space.
• It is necessary to design such a function 𝑦 = 𝑓(𝑥) of the
feature vector $x = (x_1, x_2, \dots, x_d)$ that would give the answer 𝑦 for
any possible observation 𝑥.
• The component $x_j$ is called the 𝑗-th feature, or property, or
attribute of the object 𝑥.

18 / 38
Basic Definitions
• If $Q_j = \mathbb{R}$, then the 𝑗-th attribute is called quantitative, or real.
• If $Q_j$ is finite, then the 𝑗-th feature is called nominal, or
categorical, or factor.
• If $Q_j$ has exactly two values ($|Q_j| = 2$), then the feature is called binary.
• If $Q_j$ is ordered, then the feature is called ordinal.

19 / 38
Regression Retrieval Task
• If 𝕐 = ℝ, then it is a regression retrieval task.
• The decision rule 𝑓 is called the regression function.
• If 𝕐 is finite 𝕐={1,2,…,𝐾}, then this is a classification task.
• The decision rule 𝑓 is called the classifier.

[Figure: an example of regression retrieval; 𝑦 is a continuous value.]
20 / 38


Binary Classification Task
• Given a training sample:
$X^m = \{(x_1, y_1), \dots, (x_m, y_m)\},\ (x_i, y_i) \in \mathbb{R}^n \times Y,\ Y = \{-1, +1\}$.
• Objects belong to one of two classes.
• We mark the main class as “+1”, the secondary “background” as “−1”.
• For every new value 𝑥, it is required to assign the class "+1" or "−1".
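• A minimal sketch of such a binary decision rule: a linear threshold that maps any new x to +1 (the main class) or −1 (the background); the weights are arbitrary illustrative values, not a trained model:

def classify(x, w=(1.0, -0.5), b=0.2):
    # Return +1 for the main class, -1 for the background (illustrative linear rule).
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return +1 if score >= 0 else -1

print(classify((2.0, 1.0)))    # +1
print(classify((-3.0, 2.0)))   # -1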

21 / 38
Multiclass Classification Task
• Given a training sample:
$X^m = \{(x_1, y_1), \dots, (x_m, y_m)\},\ (x_i, y_i) \in \mathbb{R}^n \times Y,\ Y = \{1, \dots, K\}$.
• Objects belong to one of the 𝐾 classes.
• For every new value 𝑥, it is required to assign a class label from 1
to 𝐾.

22 / 38
Remarks
1. The found decision rule should have a generalizing ability (the
constructed classifier or regression function should reflect the
overall dependence of the output on the input, based only on
known data about the precedents of the training sample).
2. Attention should be paid to the problem of effective
computability of the function 𝑓 and to the learning algorithm:
the model tuning should take place in an acceptable time.

23 / 38
Machine Learning Tasks
• We are interested in the quality of the algorithm on new data:
it is necessary to connect the existing data with the data that
we will process in the future.
• For this, the values of the features will be considered random
variables.
• We will assume that the data that will have to be processed in
the future and the available data are identically distributed.
[Figure: the training set and the data with unknown answers.]

24 / 38
Machine Learning Paradigms
• Discriminative Approach
• We will choose functions 𝑓 from the parametric family 𝐹, i.e. from some
set of possible models.
• Let us introduce some loss function 𝐿(𝑦,𝑓(𝑥)) of the true value of the
output 𝑦 and the predicted value 𝑓(𝑥):

25 / 38
Discriminative Approach
• Let us introduce some loss function 𝐿(𝑦,𝑓(𝑥)) of the true value of the
output 𝑦 and the predicted value 𝑓(𝑥):
• In a regression retrieval task,
• a quadratic error:
$L(y, f(x)) = \frac{1}{2}\,(y - f(x))^2$,
• an absolute error:
$L(y, f(x)) = |y - f(x)|$.
• In a classification task,
• a prediction error:
$L(y, f(x)) = I(y \neq f(x))$,
where 𝑓(𝑥) is the predicted class and $I(\cdot)$ is the indicator function:
$I(\text{condition}) = 1$ if the condition is met, $0$ otherwise.
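• The three loss functions, written out directly (a straightforward sketch assuming scalar outputs):

def squared_loss(y, y_pred):
    # Quadratic error for regression: L = 1/2 * (y - f(x))^2
    return 0.5 * (y - y_pred) ** 2

def absolute_loss(y, y_pred):
    # Absolute error for regression: L = |y - f(x)|
    return abs(y - y_pred)

def zero_one_loss(y, y_pred):
    # Prediction error for classification: L = I(y != f(x))
    return 1 if y != y_pred else 0

print(squared_loss(3.0, 2.0))       # 0.5
print(absolute_loss(3.0, 2.0))      # 1.0
print(zero_one_loss("cat", "dog"))  # 1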
26 / 38
Discriminative Approach
• It is necessary to design a function 𝑦=𝑓(𝑥) – a decision rule or a
classifier.
• Any decision rule divides space into decision regions separated
by decision boundaries.

27 / 38
Machine Learning Paradigms
• Generative approach
• The Bayes formula $P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}$ is used;
• Each class is modeled separately: we estimate 𝑃(𝑥|𝑦) and 𝑃(𝑦);
• The problem statement is similar to classification.
• Discriminative approach
• Since we are interested in 𝑃(𝑦│𝑥), we estimate it directly;
• The problem statement is similar to regression.

28 / 38
Risks
• The task of learning is to find the classifier parameters (the
function 𝑓) for which the losses on new data are minimal.
• Let's introduce the concept of general (average) risk – this is the
mathematical expectation of losses:

$R(f) = E[L(f(x), y)] = \int_{x, y} L(f(x), y)\,dP$.
• Unfortunately, due to the unknown probability distribution 𝑃 of
the joint random variable (𝑥,𝑦), the total risk cannot be
calculated.
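• If the distribution 𝑃 were known, the total risk could at least be approximated by sampling from it; a hypothetical sketch (the data-generating distribution and the fixed predictor f below are assumptions for illustration):

import random

random.seed(1)

def sample_xy():
    # Assumed data-generating distribution P(x, y): y = 2x plus Gaussian noise.
    x = random.uniform(0.0, 1.0)
    y = 2.0 * x + random.gauss(0.0, 0.1)
    return x, y

def f(x):
    # A fixed (not necessarily good) decision rule whose risk we estimate.
    return 1.8 * x

# Monte Carlo approximation of R(f) = E[L(f(x), y)] with the quadratic loss.
n = 100_000
R = sum(0.5 * (f(x) - y) ** 2 for x, y in (sample_xy() for _ in range(n))) / n
print(R)  # roughly 0.012 for these assumed numbers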

29 / 38
Risks
• Let us introduce the concept of empirical risk. Let $X = \{x_1, \dots, x_m\}$,
$Y = \{y_1, \dots, y_m\}$ be the training sample. Empirical risk, or training
error:
$R_{emp}(f, X) = \frac{1}{m}\sum_{i=1}^{m} L(y_i, f(x_i))$.
• To minimize the empirical risk, it is necessary to find the
function 𝑓 in accordance with the condition:
$f = \arg\min_{f \in F} R_{emp}(f, X)$.
• The condition is called the empirical risk minimization principle.
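• A tiny sketch of the principle: the empirical risk of every candidate model is computed on the training sample and the minimizer is chosen; the data and the model family (constant predictors) are toy assumptions, not from the slides:

# Toy training sample and a tiny model family of constant predictors f_c(x) = c (assumed).
X = [0.0, 1.0, 2.0, 3.0]
Y = [0.1, 0.9, 2.2, 2.8]
candidates = [lambda x, c=c: c for c in (0.0, 0.5, 1.0, 1.5, 2.0)]

def empirical_risk(f, X, Y):
    # R_emp(f, X) = (1/m) * sum_i L(y_i, f(x_i)), with the quadratic loss L.
    return sum(0.5 * (y - f(x)) ** 2 for x, y in zip(X, Y)) / len(X)

# Empirical risk minimization: pick the candidate with the smallest training error.
best = min(candidates, key=lambda f: empirical_risk(f, X, Y))
print(best(0.0), empirical_risk(best, X, Y))  # the constant 1.5 wins here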

30 / 38
Comment
• There can be an unlimited number of hypotheses that have zero
empirical risk:

[Figure: three hypotheses with zero empirical risk: the most specific (particular) one, a middle way, and the most general one.]

31 / 38
Supervised Learning Challenge
• The problem was reduced to finding a function 𝑓 from an
admissible set 𝐹 that satisfies the condition:
$f = \arg\min_{f \in F} R_{emp}(f, X)$,
𝐹 and 𝐿 are fixed and known.
• The class of models 𝐹 is parametrized, i.e. there is a description
of the form $F = \{f(x) = f(x, \theta): \theta \in \Theta\}$, where Θ is some known
set.
• Model tuning process:
• the learning algorithm selects the values of the parameters 𝜃 ∈ Θ
that ensure the fulfillment of the condition on 𝑓, i.e. minimize the
error on the precedents of the training sample.
32 / 38
Overfitting
• The considered condition is not suitable for evaluating the
generalizing ability of the algorithm.
• All available data is divided into training and test sets:
• Training is performed using a training set,
• Evaluation of the prediction quality based on test sample data.
• The values $R(f)$ and $R_{emp}(f, X)$ can differ significantly.
• The phenomenon when $R_{emp}(f, X)$ is small and $R(f)$ is too large is
called overfitting.

33 / 38
Overfitting
• Let there be a regression problem.
• $t = \sin(2\pi x) + \epsilon$, where 𝜖 is normally distributed noise, but we
don't know that.
• Let there be a training sample and it is required to restore the
dependence:

34 / 38
Overfitting
• We will choose the target dependence among polynomials of
order 𝑀 (parametrized set):
$y(x, w) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = w^{T}\phi_M(x)$.
• We introduce the loss function:
$L((x, t), y) = \frac{1}{2}\,(y(x, w) - t)^2$.
• Among the set of polynomials, we will choose the one that
brings the least total loss on the training set.
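• A sketch of this experiment with NumPy; the sample sizes and noise level are assumed, and numpy.polyfit is used as the least-squares fit, which minimizes the same total quadratic loss up to a constant factor:

import numpy as np

rng = np.random.default_rng(0)

def make_sample(n):
    # t = sin(2*pi*x) + normally distributed noise (noise level 0.3 is assumed).
    x = rng.uniform(0.0, 1.0, n)
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)
    return x, t

x_train, t_train = make_sample(10)    # small training sample
x_test, t_test = make_sample(100)     # new data the model has not seen

for M in (1, 3, 9):
    w = np.polyfit(x_train, t_train, deg=M)                      # polynomial of order M
    train_err = np.mean((np.polyval(w, x_train) - t_train) ** 2)
    test_err = np.mean((np.polyval(w, x_test) - t_test) ** 2)
    print(f"M={M}: train error {train_err:.3f}, test error {test_err:.3f}")
# Typically M=9 drives the training error to almost zero while the test error
# grows: the polynomial fits the noise, i.e. overfitting.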

35 / 38
Overfitting

36 / 38
Overfitting
• Reason: the hypothesis describes well not the properties of
objects in general, but only the objects from the training sample:
• Too many degrees of freedom in the model parameters (a too
complex model);
• Noisy data;
• Bad training set.

37 / 38
THANK YOU
FOR YOUR TIME!

[email protected]
