6012B0419Y Machine Learning
Model Evaluation and Class Imbalance
27-11-2023
Guido van Capelleveen
(Prepared by: Stevan Rudinac)
Slide Credit
● Andreas Müller, lecturer at the Data Science Institute at Columbia University
● Author of the book we will be using for this course: “Introduction to Machine Learning with Python”
● Great materials available at:
● https://github.com/amueller/applied_ml_spring_2017/
● https://amueller.github.io/applied_ml_spring_2017/
Reading
Pages: 277 – 305
Metrics for Binary Classification
Kinds of Errors
● Example: Early cancer detection screening
– The test is negative: patient is assumed healthy
– The test is positive: patient undergoes additional test
● Possible mistakes:
– Healthy patient is classified as positive: false positive or type I error
– Sick patient is classified as negative: false negative or type II error
Review: confusion matrix
Accuracy: the sum of the diagonal (correct predictions) divided by everything (all entries).
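A minimal sketch of reading accuracy off a confusion matrix with scikit-learn; the label and prediction arrays below are hypothetical.

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 1, 0]   # hypothetical true labels
y_pred = [0, 1, 1, 1, 0, 0]   # hypothetical predictions
cm = confusion_matrix(y_true, y_pred)
print(cm)                     # rows: true class, columns: predicted class
print(cm.trace() / cm.sum())  # accuracy = diagonal divided by everything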
Problems with accuracy
● Imbalanced classes lead to hard-to-interpret accuracy: on data where 90% of samples belong to the negative class, is 90% accuracy OK?
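A small sketch of the problem, using scikit-learn's DummyClassifier on hypothetical data with 90% negatives: the "classifier" never learns anything yet still reports 90% accuracy.

import numpy as np
from sklearn.dummy import DummyClassifier

X = np.random.randn(1000, 5)           # hypothetical features
y = np.array([0] * 900 + [1] * 100)    # 90% negative class

dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
print(dummy.score(X, y))               # ~0.90 accuracy without learning anything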
Precision, Recall, f-score
● Precision = TP / (TP + FP), also known as Positive Predictive Value (PPV)
● Recall = TP / (TP + FN), also known as sensitivity, coverage, or true positive rate
● F1 = 2 · precision · recall / (precision + recall)
● All depend on the definition of positive and negative!
● The whole zoo of related metrics: https://en.wikipedia.org/wiki/Precision_and_recall
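These metrics are available in sklearn.metrics; a minimal sketch with hypothetical labels and predictions:

from sklearn.metrics import precision_score, recall_score, f1_score, classification_report

y_true = [0, 0, 0, 1, 1, 1, 1]   # hypothetical true labels
y_pred = [0, 1, 0, 1, 1, 0, 1]   # hypothetical predictions

print(precision_score(y_true, y_pred))        # TP / (TP + FP)
print(recall_score(y_true, y_pred))           # TP / (TP + FN)
print(f1_score(y_true, y_pred))               # harmonic mean of precision and recall
print(classification_report(y_true, y_pred))  # all of the above, per class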
Goal setting!
● What do I want? What do I care about? (precision, recall, something else)
● Can I assign costs to the confusion matrix? (i.e. a false positive costs me $10, a false negative $100)
● What guarantees do we want to give?
Changing Thresholds
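A minimal sketch of moving the decision threshold away from the default 0.5, assuming a logistic regression on synthetic imbalanced data (all names and values are illustrative):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# synthetic, imbalanced toy data (illustrative only)
X, y = make_classification(n_samples=500, weights=[0.9], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression().fit(X_train, y_train)
y_default = clf.predict(X_test)                               # implicit threshold of 0.5
y_low = (clf.predict_proba(X_test)[:, 1] > 0.25).astype(int)  # lower threshold: more positives,
                                                              # higher recall, lower precision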
Precision-Recall Curve
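A sketch of computing the curve with sklearn.metrics.precision_recall_curve; the labels and scores below are hypothetical.

import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

y_true = [0, 0, 1, 1, 0, 1, 0, 1]                    # hypothetical labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]  # hypothetical decision scores

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
plt.plot(recall, precision)
plt.xlabel("recall")
plt.ylabel("precision")
plt.show()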
Comparing RF and SVC
Average Precision
AP = Σ_k (R_k − R_{k−1}) · P_k, where P_k is the precision at threshold k, (R_k − R_{k−1}) is the change in recall between thresholds k and k−1, and the sum runs over the data points ranked by the decision function.
Same as the area under the precision-recall curve (depending on how you treat edge cases).
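scikit-learn implements this as average_precision_score; a tiny sketch with hypothetical labels and scores:

from sklearn.metrics import average_precision_score

y_true = [0, 0, 1, 1, 0, 1]                # hypothetical labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # hypothetical decision scores
print(average_precision_score(y_true, y_score))  # sum over k of (R_k - R_{k-1}) * P_k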
F1 vs average precision
Receiver Operating Characteristic (ROC) Curve
● Plots the true positive rate, TPR = TP / (TP + FN) (= recall), against the false positive rate, FPR = FP / (FP + TN), as the decision threshold varies.
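A sketch using sklearn.metrics.roc_curve with hypothetical labels and scores:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1, 0, 1, 0, 1]                    # hypothetical labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.9]  # hypothetical decision scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
plt.plot(fpr, tpr)
plt.xlabel("false positive rate (FPR)")
plt.ylabel("true positive rate (recall)")
plt.show()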
ROC
AUC
● Area under the ROC Curve
● Always 0.5 for random / constant prediction
● Evaluation of the ranking: the probability that a randomly picked positive sample will have a higher score than a randomly picked negative sample
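A sketch with roc_auc_score; the labels and scores are hypothetical.

from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                # hypothetical labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # hypothetical decision scores
# probability that a random positive is ranked above a random negative
print(roc_auc_score(y_true, y_score))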
The Relationship Between Precision-Recall and ROC Curves
https://www.biostat.wisc.edu/~page/rocpr.pdf
Multi-class classification
Confusion Matrix
Normalizing the confusion matrix (by rows) can be helpful.
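A sketch of row normalization using the normalize parameter of scikit-learn's confusion_matrix; the three-class labels below are hypothetical.

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 2, 2, 2]   # hypothetical multi-class labels
y_pred = [0, 1, 1, 1, 2, 0, 2]   # hypothetical predictions
# normalize="true" divides each row by the number of true samples of that class
print(confusion_matrix(y_true, y_pred, normalize="true"))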
Micro and Macro F1
● Macro-average f1: average of the f1 scores over classes
● Micro-average f1: computes the total number of FP, FN and TP over all classes, then computes P, R and f1 using these counts
● Weighted: mean of the per-class f1 scores, weighted by support
● Macro: “all classes are equally important”
● Micro: “all samples are equally important” – the same holds for averages of other metrics
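These averaging strategies correspond to the average parameter of f1_score; a sketch with hypothetical multi-class predictions:

from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2, 2]   # hypothetical multi-class labels
y_pred = [0, 1, 1, 1, 2, 0, 2]   # hypothetical predictions

print(f1_score(y_true, y_pred, average="macro"))     # unweighted mean over classes
print(f1_score(y_true, y_pred, average="micro"))     # from global TP, FP, FN counts
print(f1_score(y_true, y_pred, average="weighted"))  # per-class f1, weighted by support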
Multi-class ROC AUC
● Hand & Till, 2001: one-vs-one
● Provost & Domingos, 2000: one-vs-rest
● https://github.com/scikit-learn/scikit-learn/pull/7663
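In current scikit-learn this is exposed through the multi_class parameter of roc_auc_score; a sketch on the iris data, fit and evaluated on the same data purely for illustration:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = load_iris(return_X_y=True)
proba = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)

print(roc_auc_score(y, proba, multi_class="ovo"))  # one-vs-one (Hand & Till style)
print(roc_auc_score(y, proba, multi_class="ovr"))  # one-vs-rest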
Picking metrics?
● Accuracy is rarely what you want
● Problems are rarely balanced
● Find the right criterion for the task
● OR pick one arbitrarily, but at least think about it
● Emphasis on recall or precision?
● Which classes are the important ones?
Using metrics in cross-validation
● Pass the “scoring” parameter to cross_val_score to evaluate each fold with your chosen metric
● The same works for GridSearchCV: it will make GridSearchCV.score use your metric!
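A sketch of both, with the breast cancer data and an SVC chosen only for illustration:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# evaluate each fold with ROC AUC instead of the default accuracy
print(cross_val_score(SVC(), X, y, scoring="roc_auc"))

# GridSearchCV selects parameters and reports scores with the given metric
grid = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, scoring="average_precision")
grid.fit(X, y)
print(grid.best_score_)  # GridSearchCV.score now also uses this metric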
Built-in scoring
● “scoring” can be a string or a callable.
● Strings: for example “accuracy”, “roc_auc”, “average_precision”, “f1_macro”, “neg_mean_squared_error”, …
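In recent scikit-learn versions the full list of valid strings can be printed like this (sketch):

from sklearn.metrics import get_scorer_names

# all strings accepted by the scoring parameter
print(sorted(get_scorer_names()))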
Providing your own callable
● Takes estimator, X, y
● Returns score – higher is better (always!)
def accuracy_scoring(est, X, y):
    return (est.predict(X) == y).mean()
You can access the model!
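A sketch of plugging such a callable into cross-validation; the data set and estimator are chosen only for illustration:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

def accuracy_scoring(est, X, y):
    # est is the fitted model, so any of its attributes can be used here
    return (est.predict(X) == y).mean()

X, y = load_breast_cancer(return_X_y=True)
print(cross_val_score(LogisticRegression(max_iter=5000), X, y, scoring=accuracy_scoring))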
Metrics for regression models
Built-in standard metrics
● R^2: easy-to-understand scale
● MSE: easy to relate to input
● Mean absolute error, median absolute error: more robust
● When using “scoring”, use “neg_mean_squared_error” etc.
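A sketch of these metrics in sklearn.metrics, with hypothetical targets and predictions:

from sklearn.metrics import (r2_score, mean_squared_error,
                             mean_absolute_error, median_absolute_error)

y_true = [3.0, 5.0, 2.5, 7.0]   # hypothetical targets
y_pred = [2.5, 5.0, 4.0, 8.0]   # hypothetical predictions

print(r2_score(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))
print(mean_absolute_error(y_true, y_pred))
print(median_absolute_error(y_true, y_pred))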
Prediction plots
Residual Plots
Target vs Feature
Residual vs Feature
Absolute vs relative error:
● Mean absolute percentage error (MAPE) = mean(|y − ŷ| / |y|)
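A sketch of MAPE with hypothetical values; newer scikit-learn versions also ship mean_absolute_percentage_error.

import numpy as np

y_true = np.array([100.0, 50.0, 20.0])   # hypothetical targets
y_pred = np.array([110.0, 40.0, 25.0])   # hypothetical predictions

mape = np.mean(np.abs(y_true - y_pred) / np.abs(y_true))
print(mape)   # relative error: each deviation is scaled by the true value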
Over vs under
● Overprediction and underprediction can have different costs.
● Try to create a cost matrix: how much do overprediction and underprediction cost?
● Is it linear?