Introduction to data science
Overview of Machine Learning
Machine Learning Approaches
◼ Classical learning
◼ Ensemble learning
◼ Reinforcement learning
◼ Neural nets and deep learning
Machine Learning Approaches
◼ Classical learning
  ◼ Supervised learning
  ◼ Unsupervised learning
  ◼ Semi-supervised learning
Machine Learning Approaches
◼ Ensemble learning
  ◼ Boosting
  ◼ Bagging
  ◼ Stacking
Machine Learning Approaches
◼ Reinforcement learning
  ◼ Genetic Algorithm (GA)
  ◼ Q-Learning
  ◼ …
Machine Learning Approaches
◼ Neural nets (NN) and deep learning
  ◼ Back Propagation
  ◼ Feed forward NN
  ◼ Convolutional NN
  ◼ Recurrent NN
  ◼ …
Supervised vs. Unsupervised Learning
◼ Supervised learning (classification)
◼ Supervision: The training data (observations,
measurements, etc.) are accompanied by labels
indicating the class of the observations
◼ New data is classified based on the training set
◼ Unsupervised learning (clustering)
◼ The class labels of the training data are unknown
◼ Given a set of measurements, observations, etc. with
the aim of establishing the existence of classes or
clusters in the data
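A minimal sketch contrasting the two settings, assuming scikit-learn and its built-in Iris data; the k-NN classifier and k-means clustering used here are illustrative choices, not prescribed by the slides.

# Supervised vs. unsupervised learning on the same feature matrix (illustrative).
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised (classification): the training data is accompanied by class labels y.
clf = KNeighborsClassifier().fit(X, y)
print("Predicted class for first sample:", clf.predict(X[:1]))

# Unsupervised (clustering): class labels are unknown; aim to find clusters in the data.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Cluster assigned to first sample:", km.labels_[0])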
Supervised Learning: Classification vs. Prediction
◼ Classification
◼ predicts categorical class labels (discrete or nominal)
◼ constructs a model from the training set and the values (class labels) of a
classifying attribute, and uses it to classify new data
◼ Prediction (Regression)
◼ models continuous-valued functions, i.e., predicts
unknown or missing values
◼ Typical applications
◼ Credit approval
◼ Target marketing
◼ Medical diagnosis
◼ Fraud detection
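A minimal sketch of the distinction, assuming scikit-learn and synthetic data; decision trees are used only as an example model.

# Classification predicts a categorical label; prediction (regression) models a continuous value.
from sklearn.datasets import make_classification, make_regression
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: discrete/nominal class labels (e.g., credit approved: yes/no).
Xc, yc = make_classification(n_samples=200, n_features=4, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(Xc, yc)
print("Class label:", clf.predict(Xc[:1]))          # discrete output, e.g. [0] or [1]

# Regression: continuous-valued target (e.g., predicting a missing numeric value).
Xr, yr = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)
reg = DecisionTreeRegressor(random_state=0).fit(Xr, yr)
print("Continuous value:", reg.predict(Xr[:1]))     # real-valued output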
Supervised Learning: Drawbacks
◼ Supervised learning requires human expertise: Expert
annotators play an invaluable role in guiding your model’s
training, but they can be difficult to recruit.
◼ Supervised learning is labor-intensive: You’ll need to
have a big enough team with relevant expertise to accurately
label large datasets.
◼ Supervised learning is time-intensive: In addition to top
talent, you’ll need the bandwidth to accurately annotate the
dataset so that your model is capable of producing
predictable outcomes.
Classification: A Two-Step Process
◼ Model construction: describing a set of predetermined classes
◼ Each tuple/sample is assumed to belong to a predefined class,
as determined by the class label attribute
◼ The set of tuples used for model construction is the training set
◼ The model is represented as classification rules, decision trees,
or mathematical formulae
◼ Model usage: for classifying future or unknown objects
◼ Estimate accuracy of the model
◼ The known label of test sample is compared with the
classified result from the model
◼ Accuracy rate is the percentage of test set samples that are
correctly classified by the model
◼ Test set is independent of training set; otherwise overfitting will occur
◼ If the accuracy is acceptable, use the model to classify data
tuples whose class labels are not known
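A minimal sketch of the two-step process, assuming scikit-learn and its built-in Iris data; the dataset and the decision-tree classifier are illustrative.

# Step 1: model construction from a training set.
# Step 2: model usage on an independent test set to estimate accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
# Keep the test set independent of the training set, otherwise overfitting goes undetected.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

model = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)   # construction
y_pred = model.predict(X_test)                                          # usage

# Accuracy rate: percentage of test-set samples correctly classified by the model.
print("Test-set accuracy:", accuracy_score(y_test, y_pred))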
Process (1): Model Construction
Training data and a classification algorithm produce the classifier (model).

Training Data:
NAME   RANK            YEARS   TENURED
Mike   Assistant Prof  3       no
Mary   Assistant Prof  7       yes
Bill   Professor       2       yes
Jim    Associate Prof  7       yes
Dave   Assistant Prof  6       no
Anne   Associate Prof  3       no

Learned classifier (model):
IF rank = ‘professor’ OR years > 6
THEN tenured = ‘yes’
Process (2): Using the Model in Prediction
The classifier is applied to test data to estimate accuracy, then used on unseen data.

Testing Data:
NAME     RANK            YEARS   TENURED
Tom      Assistant Prof  2       no
Merlisa  Associate Prof  7       no
George   Professor       5       yes
Joseph   Assistant Prof  7       yes

Unseen Data: (Jeff, Professor, 4) → Tenured?
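A hypothetical reproduction of the tenure example in code; the numeric encoding of RANK and the choice of a decision tree are assumptions made for illustration.

# Build a classifier from the training table, then classify the unseen tuple (Jeff, Professor, 4).
from sklearn.tree import DecisionTreeClassifier

rank_code = {"Assistant Prof": 0, "Associate Prof": 1, "Professor": 2}   # illustrative encoding

train = [("Mike", "Assistant Prof", 3, "no"), ("Mary", "Assistant Prof", 7, "yes"),
         ("Bill", "Professor", 2, "yes"),     ("Jim", "Associate Prof", 7, "yes"),
         ("Dave", "Assistant Prof", 6, "no"), ("Anne", "Associate Prof", 3, "no")]

X_train = [[rank_code[rank], years] for _, rank, years, _ in train]
y_train = [tenured for _, _, _, tenured in train]

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Unseen data: (Jeff, Professor, 4) -> should come out 'yes', matching the learned rule
# IF rank = 'professor' OR years > 6 THEN tenured = 'yes'.
print(model.predict([[rank_code["Professor"], 4]]))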
Machine learning in data mining
Issues regarding classification and prediction
Issues: Data Preparation
◼ Data cleaning
◼ Preprocess data in order to reduce noise and handle
missing values
◼ Relevance analysis (feature selection)
◼ Remove the irrelevant or redundant attributes
◼ Data transformation
◼ Generalize and/or normalize data
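A minimal preprocessing sketch, assuming pandas and scikit-learn; the column names, the median fill, and the [0, 1] normalization are hypothetical choices.

# Data cleaning, relevance analysis, and transformation on a tiny hypothetical table.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({"age": [25, None, 47, 35],              # hypothetical attributes
                   "income": [50_000, 62_000, None, 58_000],
                   "customer_id": [1, 2, 3, 4]})           # irrelevant for learning

# Data cleaning: handle missing values (here, fill with each column's median).
df = df.fillna(df.median(numeric_only=True))

# Relevance analysis (feature selection): remove irrelevant or redundant attributes.
df = df.drop(columns=["customer_id"])

# Data transformation: normalize the remaining attributes to [0, 1].
df[df.columns] = MinMaxScaler().fit_transform(df)
print(df)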
Issues: Evaluating Classification Methods
◼ Accuracy
◼ classifier accuracy: predicting class label
◼ predictor accuracy: guessing value of predicted attributes
◼ Speed
◼ time to construct the model (training time)
◼ time to use the model (classification/prediction time)
◼ Robustness: handling noise and missing values
◼ Scalability: efficiency in disk-resident databases
◼ Interpretability
◼ understanding and insight provided by the model
◼ Other measures, e.g., goodness of rules, such as decision
tree size or compactness of classification rules
Issues: Evaluating Classification Methods
                 Actual class: +                      Actual class: –
Predicted +      True Positive (TP)                   False Positive (FP), Type I error
Predicted –      False Negative (FN), Type II error   True Negative (TN)
Issues: Evaluating Classification Methods
◼ Miss Detection Rate (false negative rate) = FN / (TP + FN)
◼ False Alarm Rate (false positive rate) = FP / (FP + TN)
Issues: Evaluating Classification Methods
◼ Accuracy = (TP + TN) / (TP + TN + FP + FN)
◼ Precision = TP / (TP + FP)
◼ Recall = TP / (TP + FN)
◼ F1-Score = 2 × Precision × Recall / (Precision + Recall)
Issues: Evaluating Classification Methods
Example: Given a confusion matrix, calculate Accuracy, Precision, Recall, and F1-Score (see the sketch below).
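A minimal sketch computing the four metrics from confusion-matrix counts; the TP/FP/FN/TN values below are hypothetical.

# Accuracy, Precision, Recall, and F1-Score from hypothetical confusion-matrix counts.
TP, FP, FN, TN = 40, 10, 5, 45   # hypothetical counts

accuracy  = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall    = TP / (TP + FN)
f1_score  = 2 * precision * recall / (precision + recall)

print(f"Accuracy={accuracy:.3f}  Precision={precision:.3f}  "
      f"Recall={recall:.3f}  F1={f1_score:.3f}")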
Issues: Evaluating Regression Methods
Mean Squared Error (MSE): MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²
Mean Absolute Error (MAE): MAE = (1/n) Σᵢ |yᵢ − ŷᵢ|
Root Mean Square Error (RMSE): RMSE = √[ (1/n) Σᵢ (yᵢ − ŷᵢ)² ]
where: yᵢ are the actual values and ŷᵢ are the predicted values
Issues: Evaluating Regression Methods
Mean Absolute Percentage Error (MAPE): MAPE = (100%/n) Σᵢ |(yᵢ − ŷᵢ) / yᵢ|
R² (R-squared): R² = 1 − SSR/SST
where: yᵢ are the actual values, ŷᵢ are the predicted values,
SSR = Σᵢ (yᵢ − ŷᵢ)² is the sum of squared residuals, and SST = Σᵢ (yᵢ − ȳ)² is the total sum of squares
Issues: Evaluating Regression Methods
Example: Given a set of actual and predicted values, calculate MSE, MAE, RMSE, and R² (see the sketch below).
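A minimal sketch of the regression measures with NumPy; the actual and predicted values are hypothetical.

# MSE, MAE, RMSE, MAPE, and R^2 on hypothetical actual/predicted values.
import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])   # hypothetical actual values
y_pred = np.array([2.5, 5.0, 3.0, 8.0])   # hypothetical predicted values

mse  = np.mean((y_true - y_pred) ** 2)
mae  = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(mse)
mape = 100 * np.mean(np.abs((y_true - y_pred) / y_true))
r2   = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)  # 1 - SSR/SST

print(f"MSE={mse:.3f} MAE={mae:.3f} RMSE={rmse:.3f} MAPE={mape:.1f}% R2={r2:.3f}")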
Issues: Overfitting and underfitting
▪ Underfitting happens when a model is too simple to capture the underlying patterns in the data
→ Poor performance on both the training and test sets
▪ Overfitting occurs when a model is too complex and memorizes the training data, including its noise
→ Good performance on the training set but poor performance on the test set
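A minimal sketch of the effect, assuming scikit-learn: polynomials of increasing degree are fit to noisy data and training error is compared with test error (the degrees and the synthetic data are illustrative).

# Compare training vs. test error as model complexity (polynomial degree) grows.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 60).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for degree in (1, 4, 15):   # too simple (underfit), reasonable, too complex (overfit)
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    train_err = mean_squared_error(y_tr, model.predict(X_tr))
    test_err  = mean_squared_error(y_te, model.predict(X_te))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")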
Other machine learning models
▪ Ensemble learning: combines multiple base models (e.g., via bagging, boosting, or stacking) so that the combined model performs better than any individual model (see the sketch below).
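A minimal ensemble-learning sketch, assuming scikit-learn; the random forest (bagging of decision trees) and the Iris data are illustrative choices.

# An ensemble aggregates many base learners; here, bagged decision trees (a random forest).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
print("Cross-validated accuracy:", cross_val_score(forest, X, y, cv=5).mean())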