ML Intro

The document outlines the fundamentals of machine learning, including definitions of learning and the data science process. It distinguishes between supervised, unsupervised, and semi-supervised learning, and discusses the stages of machine learning such as training and testing. Additionally, it provides examples related to cancer diagnosis to illustrate the application of machine learning algorithms in classification tasks.

Uploaded by

rtzvdpsw2x

Machine Learning

Data Science Process

Learning?
• Herbert Simon: “Learning is any process by which a
system improves performance from experience.”

• There are two ways that a system can improve:

1. By acquiring new knowledge (e.g. acquiring new facts)
2. By adapting its behavior (e.g. solving problems more accurately)
• How can a machine learn from data?
Main types of Machine Learning
• Supervised learning (with a teacher): uses a series of labelled examples with direct feedback
• Unsupervised learning (without a teacher), e.g. clustering: no feedback
• Semi-supervised learning: in between supervised and unsupervised learning (some data is labelled, but most of it is unlabelled)
Supervised vs Unsupervised
• How many groups do we have in this figure?
• Can we apply supervised learning?
• What will you get if you apply unsupervised learning?
What do you think now?
Supervised vs Unsupervised
• Can you separate this data into two groups?
Supervised vs Unsupervised vs Semi-supervised
Example
• We have a dataset with two columns, X1 and X2:
X1   X2
1    2
5    3
…    …
• We plot the data in two-dimensional space as follows:

• Q1) Can you divide the data into two groups?
• Try to separate the points based on the distance between the data points.

X1 X2
1 2
5 3
... ...
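The distance-based grouping asked for in Q1 can be sketched as a small two-centre clustering loop. This is only an illustration under stated assumptions: the first two points are the rows shown in the table, the remaining points are hypothetical (the slide elides them), and the function name `two_means` and the choice to seed the centres with the first two points are ours, not the lecture's.

```python
import math

# Hypothetical points: the first two are the rows shown on the slide,
# the rest are made up for illustration because the slide elides them.
points = [(1, 2), (5, 3), (1.5, 1.8), (0.8, 2.4), (5.5, 3.2), (4.8, 2.7)]

def two_means(points, iterations=10):
    """Split points into two groups by repeatedly assigning each point
    to its nearest centre and moving each centre to its group's mean."""
    centres = [points[0], points[1]]  # assumption: seed with the first two points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest centre.
        labels = [
            min((0, 1), key=lambda k: math.dist(p, centres[k]))
            for p in points
        ]
        # Update step: move each centre to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centres[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centres

labels, centres = two_means(points)
print(labels)  # group index (0 or 1) for each point
```

No labels are used anywhere in this sketch: the groups come only from the distances between points, which is what makes it unsupervised.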
Q2) If we give you the labels (a new column that provides the class of each row), can you draw a line that separates the two classes?

X1   X2   X3 (Label)
1    2    normal (blue)
5    3    abnormal (red)
..   ..   ..
Examples of ML Algorithms
Usual ML stages
• Hypothesis, data
• Training or learning (requires examples/data)
• Testing or generalization
Training
• Training is the acquisition of knowledge, skills, and competencies as a result of the teaching of practical skills and knowledge that relate to specific useful competencies (Wikipedia)
• Training requires scenarios or examples (data)
• In machine learning, we learn from the available data or examples

Training: The figure shows how the separating line is updated through the several training steps.
Figure panels: initial random line; the line after one training step; training is complete.
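One way to make "updating the line after a training step" concrete is the perceptron rule, sketched below. The slides do not name the algorithm behind the figure, so treat this as a swapped-in illustration: the data points beyond the two rows of the example table are hypothetical, the encoding +1 = abnormal and -1 = normal is our choice, and the line starts at zero rather than at a random position.

```python
# Hypothetical training points (x1, x2) with labels: +1 = abnormal, -1 = normal.
# The first two rows match the example table; the rest are made up.
data = [((1, 2), -1), ((5, 3), +1), ((1.5, 1.8), -1),
        ((0.8, 2.4), -1), ((5.5, 3.2), +1), ((4.8, 2.7), +1)]

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the line w.x + b = 0 x falls."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

def train_perceptron(data, epochs=100, lr=0.1):
    w, b = [0.0, 0.0], 0.0  # assumption: start from zeros, not a random line
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                # One training step: nudge the line toward the misclassified point.
                w[0] += lr * y * x[0]
                w[1] += lr * y * x[1]
                b += lr * y
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w, b

w, b = train_perceptron(data)
print(all(predict(w, b, x) == y for x, y in data))
```

Each misclassified point moves the line a little, and training stops once a full pass over the data produces no mistakes, which mirrors the three panels of the figure.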
Testing
• How well does the learned system work?
• Generalization
• Performance on unseen or unknown scenarios or data

• Which model performs the best?

Types of testing
• Evaluate performance by testing on data NOT used for training (both should be randomly sampled)
• Cross-validation methods for small data sets
• The more (relevant) data, the better.
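Both kinds of testing can be sketched with the standard library alone. The helper names `train_test_split` and `k_fold`, the 20% test fraction, and k = 5 are illustrative assumptions, not values from the slides; the rows are placeholder integers standing in for real data.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

def k_fold(rows, k=5, seed=0):
    """Yield (train, test) pairs; each row is used for testing exactly once."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test

rows = list(range(10))  # stand-in for 10 data rows
train, test = train_test_split(rows)
print(len(train), len(test))  # prints: 8 2

for fold_train, fold_test in k_fold(rows, k=5):
    print(len(fold_train), len(fold_test))  # prints "8 2" five times
```

With a small data set, a single held-out split leaves few rows for testing, whereas cross-validation rotates the test role so that every row contributes to the performance estimate.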
Defining the Learning Task
Improve on task, T, with respect to
performance metric, P, based on experience, E.
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver

T: Categorizing email messages as spam or legitimate
P: Percentage of email messages correctly classified
E: Database of emails, some with human-given labels
Suppose that we are done with exploratory data analysis (EDA) and the data is ready for modelling. What is next?
Cancer diagnosis
This is our data (103 rows x 5 columns):
Patient ID   # of Tumors   Avg Area   Avg Density   Diagnosis
1            5             20         118           M
2            3             15         130           B
3            7             10         52            B
4            2             30         100           M
...          ...           ...        ...           ...
100          3             19         100           M
101          4             16         95            M
102          9             22         125           B
103          1             14         80            M
Recall ML stages
Supervised Learning Classification

Training Set

• Use this training set to learn how to classify patients where the diagnosis is not known:

Test Set
Patient ID   # of Tumors   Avg Area   Avg Density   Diagnosis
101          4             16         95            ?
102          9             22         125           ?
103          1             14         80            ?

Input data: # of Tumors, Avg Area, Avg Density. The Diagnosis column will be predicted by our model.
Breast Cancer Diagnosis Linear Separation

Figure: the plot of the training data in 2D, where red represents M cases and blue represents B cases. The line is produced by our model to separate the two classes.
Predict the test data

The gray circles represent the test set.

• The model predicts the test data as follows:

Patient ID   # of Tumors   Avg Area   Avg Density   Diagnosis (predicted by the model)   Actual diagnosis
101          4             16         95            M                                    M
102          9             22         125           M                                    B
103          1             14         80            M                                    M

• How good is our model?
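One simple answer is accuracy: the share of test patients whose predicted diagnosis matches the actual one. The sketch below uses the three predictions above and the actual diagnoses recorded for patients 101 to 103 in the full data table; the function name `accuracy` is ours, and accuracy is only one possible performance metric.

```python
# Predictions from the slide and the actual diagnoses recorded for the same
# patients in the full data table.
predicted = {101: "M", 102: "M", 103: "M"}
actual = {101: "M", 102: "B", 103: "M"}

def accuracy(predicted, actual):
    """Fraction of patients whose predicted diagnosis matches the actual one."""
    correct = sum(predicted[pid] == actual[pid] for pid in actual)
    return correct / len(actual)

print(f"{accuracy(predicted, actual):.0%}")  # prints: 67%
```

Two of the three test patients are classified correctly; patient 102 is predicted M but recorded as B. With only three test rows the estimate is very coarse, which is why more (relevant) test data gives a more trustworthy answer.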

Examples of ML Algorithms
