Learning Objectives
• Continuing the discussion on kNN
• Performance metrics for classification
• Significance of the different metrics
kNN: Classification
Effect of Outliers:
● Consider k=1.
● Sensitive to outliers: the decision boundary
changes drastically with outliers.
● Solution?
○ Increase k
kNN: Classification
Effect of k:
● Low k: overfitting, highly
unstable decision boundary
● Good k: smooth boundary, no
overfitting/underfitting
● High k: everything classified
as the most probable class
● How to find a good k?
[Figure: decision boundaries for k=1 and k=15]
Cross validation is our friend!
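To make this concrete, here is a minimal sketch (assumed, not the lecture's code) that picks k by 5-fold cross-validation over odd values of k, using scikit-learn's GridSearchCV on the Iris data:

```python
# A minimal sketch (assumed, not the lecture's code): choose k by 5-fold
# cross-validation over odd values of k.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 31, 2))},  # odd k values to try
    cv=5,                                                # 5-fold cross-validation
)
search.fit(X, y)
print(search.best_params_)  # the k with the best cross-validated accuracy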
kNN: Classification
What if we get the same number of votes
from both classes?
Potential solutions for tie-breaking:
● Take k odd
● Randomly select a class
● Use the class with the larger prior, as sketched below
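A minimal sketch of majority voting with prior-based tie-breaking; the helper name and the toy data are hypothetical, not from the slides:

```python
# A minimal sketch of kNN majority voting where a tie is broken by the
# class with the larger prior (estimated from the training labels).
from collections import Counter


def knn_vote(neighbor_labels, train_labels):
    votes = Counter(neighbor_labels)
    top = max(votes.values())
    tied = [c for c, v in votes.items() if v == top]
    if len(tied) == 1:
        return tied[0]
    # Tie: fall back to the class with the larger prior.
    priors = Counter(train_labels)
    return max(tied, key=lambda c: priors[c])


# Example: k=4 with a 2-2 tie between classes 0 and 1; the prior favours class 0.
train_labels = [0, 0, 0, 1, 1]
print(knn_vote([0, 1, 0, 1], train_labels))  # -> 0
```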
kNN: Classification
A probabilistic variant: probabilistic kNN
E.g. k=4, c=3 classes: if 3 of the 4 neighbours are of class y=1 and 1 is of class y=3, then
P = [3/4, 0, 1/4] over the classes y=1, y=2, y=3
kNN: Classification
A probabilistic variant: probabilistic kNN
Same example (k=4, c=3), now with pseudo-counts (add-one smoothing):
P = [(3+1)/(4+3), (0+1)/(4+3), (1+1)/(4+3)]
  = [4/7, 1/7, 2/7] over the classes y=1, y=2, y=3
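A minimal sketch (assumed, not from the slides) of the probabilistic kNN output, with and without add-one pseudo-counts; knn_class_probs is a hypothetical helper:

```python
# Class frequencies among the k neighbours, optionally smoothed with
# add-one pseudo-counts, as in the example above.
import numpy as np


def knn_class_probs(neighbor_labels, n_classes, pseudo_count=0):
    """Return P(y = c) for c = 0, ..., n_classes - 1."""
    counts = np.bincount(neighbor_labels, minlength=n_classes).astype(float)
    counts += pseudo_count
    return counts / counts.sum()


# Example from the slides: k=4, c=3, neighbour classes (0-indexed) = [0, 0, 0, 2]
print(knn_class_probs([0, 0, 0, 2], n_classes=3))                  # [0.75 0.   0.25]
print(knn_class_probs([0, 0, 0, 2], n_classes=3, pseudo_count=1))  # [4/7 1/7 2/7]
```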
kNN: Regression
A simple regression algorithm:
● Training examples {(x_i, y_i)}, i = 1, ..., n, where y_i is a continuous real-valued target
● Given a test input x:
● Find the distances from x to the n training examples using a distance metric
● Select the k closest training examples and their target values
● The output is the mean of the target values of the k neighbours
Can be used for interpolation.
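A minimal NumPy sketch of the procedure above, assuming Euclidean distance; the function and the toy data are hypothetical:

```python
# Predict the mean target of the k nearest training points.
import numpy as np


def knn_regress(X_train, y_train, x_query, k=3):
    # Euclidean distances from the query to all n training examples.
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training examples.
    nearest = np.argsort(dists)[:k]
    # Output: mean of their target values.
    return y_train[nearest].mean()


# Toy 1-D example: targets follow y = 2x, query at x = 2.5.
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])
y_train = np.array([2.0, 4.0, 6.0, 8.0])
print(knn_regress(X_train, y_train, np.array([2.5]), k=2))  # -> 5.0
```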
kNN: Challenges
Computationally expensive:
● Need to store all training examples
● Need to compute distances to all n training examples at prediction time
There are ways to optimize kNN computation:
● Reduce dimensionality using dimensionality reduction techniques
● Reduce the number of comparisons:
○ k-d tree implementation
○ Locality-sensitive hashing
kNN: Computational Complexity
Brute force method
● Training time complexity: O(1)
● Training space complexity: O(1)
● Prediction time complexity: O(k * n * d)
● Prediction space complexity: O(1)
kNN: Computational Complexity
k-d tree method
● Training time complexity: O(d * n * log(n))
● Training space complexity: O(d * n)
● Prediction time complexity: O(k * log(n))
● Prediction space complexity: O(1)
kNN: k-d Tree
● A k-dimensional tree (k-d tree) is a tree data structure used to represent
points in a k-dimensional space (here k is the dimensionality of the data, not the number of neighbours).
● Used for applications such as nearest-point search (in k-dimensional space), efficient
storage of spatial data, range search, etc.
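A short sketch (assumed, not the lecture's code) that builds a k-d tree with SciPy's cKDTree and queries the 3 nearest neighbours of a test point:

```python
# Build the tree once at "training" time, then query the k nearest
# neighbours of a test point without scanning all training points.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 3))      # n = 10,000 points in d = 3 dimensions

tree = cKDTree(X_train)                # O(d * n * log n) construction

x_query = rng.random(3)
dists, idx = tree.query(x_query, k=3)  # 3 nearest neighbours of the query
print(idx, dists)
```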
kNN: k-d Tree
Example:
[Figure: k-d tree example]
kNN: Computational Complexity
The more “traditional” application of kNN is the classification of data, which
often involves quite a lot of points; e.g., MNIST has 60k training images and 10k test
images. Classification is done offline: we first do the training
phase, then just use the results during prediction. Therefore, if we want to
construct the data structure, we only need to do so once. For the 10k test images,
let's compare the brute-force approach (which calculates all distances every time) with the
k-d tree for 3 neighbours.
kNN: Computational Complexity
● Brute force, O(k * n): 3 * 10,000 = 30,000
● k-d tree, O(k * log(n)): 3 * log2(10,000) ≈ 3 * 13 = 39
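To make the comparison concrete, here is a rough sketch (assumed setup, not the lecture's benchmark) using scikit-learn's KNeighborsClassifier with its brute-force and k-d tree back-ends on random low-dimensional data; the sizes are arbitrary:

```python
# The k-d tree pays its O(d * n * log n) build cost once and is then
# cheaper per query than brute force for low-dimensional data.
import time

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X_train = rng.random((10_000, 3))
y_train = rng.integers(0, 2, size=10_000)
X_test = rng.random((1_000, 3))

for algorithm in ("brute", "kd_tree"):
    clf = KNeighborsClassifier(n_neighbors=3, algorithm=algorithm)
    clf.fit(X_train, y_train)           # "training": stores data / builds the tree
    start = time.perf_counter()
    clf.predict(X_test)
    print(algorithm, f"{time.perf_counter() - start:.3f}s")
```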
Classification Metrics
How to measure the performance of a classification model?
Classification Metrics
Most widely used metrics and tools to assess classification models:
● Confusion matrix
● Accuracy
● Precision/Recall/F1-score
● Area under the ROC curve
Classification Metrics
Confusion Matrix
A table to summarize how successful the classification model is at
predicting examples belonging to various classes.
Classification Metrics
Confusion Matrix
E.g., for binary classification, a model predicts one of two classes, "spam" and
"not_spam", for a given email.

                        prediction
                        spam                  not_spam
actual   spam           True Positive (TP)    False Negative (FN)
         not_spam       False Positive (FP)   True Negative (TN)
Classification Metrics
Confusion Matrix
Exercise 1: Consider a cricket tournament. Map each of the following
statements to one of TP, FN, FP, TN:
1. You had predicted that India would win and it won.
2. You had predicted that England would not win and it lost.
3. You had predicted that England would win, but it lost.
4. You had predicted that India would not win, but it won.
Classification Metrics
Confusion Matrix
Exercise 2:

                        prediction
                        1        0
actual   1              TP=?     FN=?
         0              FP=?     TN=?
Classification Metrics
Confusion Matrix
Exercise 2:

                        prediction
                        1        0
actual   1              TP=6     FN=2
         0              FP=1     TN=3
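As a check, a minimal sketch using scikit-learn's confusion_matrix, with hypothetical labels chosen to reproduce the counts above (TP=6, FN=2, FP=1, TN=3):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]   # 8 positives, 4 negatives
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0]   # 6 TP, 2 FN, 1 FP, 3 TN

# labels=[1, 0] lays the matrix out as in the slides:
# rows = actual (1, 0), columns = predicted (1, 0)
print(confusion_matrix(y_true, y_pred, labels=[1, 0]))
# [[6 2]
#  [1 3]]
```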
Classification Metrics
Confusion Matrix
Multiclass Classification: e.g., emotion classification
[Empty 6x6 confusion matrix: rows = actual emotion, columns = predicted emotion
(Happy, Sad, Angry, Surprise, Disgust, Neutral)]
Classification Metrics
Accuracy
Accuracy is given by the number of correctly classified examples divided by the
total number of classified examples.
Acc = (TP + TN) / (TP + TN + FP + FN)

                        prediction
                        spam                  not_spam
actual   spam           True Positive (TP)    False Negative (FN)
         not_spam       False Positive (FP)   True Negative (TN)
Classification Metrics
Accuracy
Accuracy is given by the number of correctly classified examples divided by the
total number of classified examples.

                        prediction
                        1        0
actual   1              TP=6     FN=2
         0              FP=1     TN=3

Accuracy = ?
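A quick worked check, assuming the counts from the confusion matrix above:

```python
TP, FN, FP, TN = 6, 2, 1, 3
accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 9 / 12 = 0.75
```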
Classification Metrics
Precision
Precision is the ratio of correct positive predictions to the overall number of
positive predictions.

Precision = TP / (TP + FP)

Precision matters when FP is costly!

                        prediction
                        spam                  not_spam
actual   spam           True Positive (TP)    False Negative (FN)
         not_spam       False Positive (FP)   True Negative (TN)
Classification Metrics
Recall
Recall is the ratio of correct positive predictions to the overall number of positive
examples.

Recall = TP / (TP + FN)

Recall matters when FN is costly!

                        prediction
                        spam                  not_spam
actual   spam           True Positive (TP)    False Negative (FN)
         not_spam       False Positive (FP)   True Negative (TN)
Classification Metrics
F1-Score
● The standard F1-score is the harmonic mean of precision and recall:
F1 = 2 * Precision * Recall / (Precision + Recall)
● Best of both worlds
● A perfect model has an F1-score of 1.
● Use it when FP and FN are both costly!
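A small illustration (with hypothetical precision/recall values) of why the harmonic mean is used: F1 stays low unless both precision and recall are high.

```python
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f1(0.9, 0.9))  # 0.90 - both high -> high F1
print(f1(0.9, 0.1))  # 0.18 - the arithmetic mean would be 0.50; F1 punishes the low recall
```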
Classification Metrics
Visualizing Precision/Recall
[Figure: visualization of precision and recall (source: Wikipedia)]
Classification Metrics
Precision/Recall/F1-score
                        prediction
                        1        0
actual   1              TP=6     FN=2
         0              FP=1     TN=3

Precision = ?
Recall = ?
F1-score = ?
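A worked check of the exercise, assuming the counts above:

```python
TP, FN, FP, TN = 6, 2, 1, 3

precision = TP / (TP + FP)                          # 6 / 7 ~ 0.857
recall = TP / (TP + FN)                             # 6 / 8 = 0.750
f1 = 2 * precision * recall / (precision + recall)  # = 0.800

print(precision, recall, f1)
```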
Classification Metrics
Examples: it all depends on the problem!
Diagnosis of cancer.

                        prediction
                        cancer    no_cancer
actual   cancer         Perfect   X
         no_cancer      OK        Perfect

This matters in medical cases: it doesn't matter much
whether we raise a false alarm, but the actual positive cases
should not go undetected!
What metric would you pick?
Classification Metrics
Examples: it all depends on the problem!
Diagnosis of cancer: missed positive cases (FN) are costly, so pick recall.

Recall = TP / (TP + FN)
Classification Metrics
Examples: it all depends on the problem!
Detecting whether an email is spam or not spam.

                        prediction
                        spam      no_spam
actual   spam           Perfect   OK
         no_spam        X         Perfect

For email it is more important not to lose any important email
to the spam folder than to occasionally receive a spam email as no_spam.
What metric would you pick?
Classification Metrics
Examples: it all depends on the problem!
Spam detection: false alarms (FP) are costly, so pick precision.

Precision = TP / (TP + FP)
Classification Metrics
Multiclass Classification
[6x6 confusion matrix: rows = actual emotion, columns = predicted emotion
(Happy, Sad, Angry, Surprise, Disgust, Neutral)]
● Can you define recall (Happy)?
● Can you define precision (Happy)?
Classification Metrics
Multiclass Classification
recall(Happy) = (# examples correctly predicted as Happy) / (# examples actually Happy),
i.e. the (Happy, Happy) diagonal cell divided by the sum of the Happy row.
Classification Metrics
Multiclass Classification
precision(Happy) = (# examples correctly predicted as Happy) / (# examples predicted as Happy),
i.e. the (Happy, Happy) diagonal cell divided by the sum of the Happy column.
Classification Metrics
Multiclass Classification
Can you define accuracy?
Classification Metrics
Multiclass Classification
Accuracy = sum of the diagonal cells divided by the sum of all cells of the confusion matrix.
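A minimal sketch (with hypothetical labels, not the lecture's data) of per-class precision/recall and overall accuracy for a multiclass problem, using scikit-learn's classification_report:

```python
from sklearn.metrics import accuracy_score, classification_report

classes = ["Happy", "Sad", "Angry"]                 # reduced label set for brevity
y_true = ["Happy", "Happy", "Sad", "Angry", "Sad", "Happy"]
y_pred = ["Happy", "Sad",   "Sad", "Angry", "Sad", "Happy"]

# Per-class precision, recall and F1-score.
print(classification_report(y_true, y_pred, labels=classes))
print("accuracy:", accuracy_score(y_true, y_pred))  # diagonal sum / total = 5/6
```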
Classification Metrics
Area under the ROC Curve (AUC)
● The ROC curve (ROC stands for "receiver operating characteristic", a term
from radar engineering: the method was originally developed for operators of
military radar receivers starting in 1941, which led to its name) is
a commonly used method to assess the performance of binary classification
models.
● ROC curves use a combination of:
(1) the true positive rate (the proportion of positive examples predicted correctly,
defined exactly as recall) and
(2) the false positive rate (the proportion of negative examples predicted
incorrectly)
to build up a summary picture of the classification performance.
Classification Metrics
Area under the ROC Curve (AUC)

TPR = TP / (TP + FN)
FPR = FP / (FP + TN)
Classification Metrics
Area under the ROC Curve (AUC)

TPR = TP / (TP + FN)
FPR = FP / (FP + TN)

Sensitivity / Recall = TPR
Specificity = 1 - FPR = TN / (TN + FP)

                        prediction
                        spam                  not_spam
actual   spam           True Positive (TP)    False Negative (FN)
         not_spam       False Positive (FP)   True Negative (TN)
Classification Metrics
Area under the ROC Curve (AUC)
[Figure: ROC curve, with TPR = TP / (TP + FN) on the y-axis and FPR = FP / (FP + TN) on the x-axis]
Classification Metrics
Area under the ROC Curve (AUC)
● Many classification models use a threshold to turn
scores into class predictions
● Typically models that give a probabilistic output score
● The ROC curve is traced out by sweeping this threshold
Classification Metrics
Area under the ROC Curve (AUC)
● To compare different classifiers, it can
be useful to summarize the
performance of each classifier into a
single measure.
● One common approach is to
calculate the area under the ROC
curve, which is abbreviated to AUC.
Classification Metrics
Area under the ROC Curve (AUC)
● AUC ranges in value from 0 to 1
● A model whose predictions are 100% wrong has an AUC of 0.0
● One whose predictions are 100% correct has an AUC of 1.0
● AUC is classification-threshold-invariant and therefore suitable for
comparing models
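A minimal sketch (with hypothetical scores, not the lecture's data) of computing the ROC curve and AUC in scikit-learn by sweeping the decision threshold:

```python
from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 0, 0, 1, 1, 1, 1]                    # actual labels
y_score = [0.1, 0.3, 0.6, 0.2, 0.8, 0.7, 0.4, 0.9]   # model's probabilistic scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)    # one (FPR, TPR) point per threshold
print(list(zip(fpr, tpr)))
print("AUC:", roc_auc_score(y_true, y_score))
```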
Classification Metrics
Area under the ROC Curve (AUC)
                        prediction
                        spam      not_spam
actual   spam           10        0
         not_spam       10        0

All predictions say "spam".
(1) TPR = ?
(2) FPR = ?
(3) Where is the point on the ROC curve?
Classification Metrics
Area under the ROC Curve (AUC)
                        prediction
                        spam      not_spam
actual   spam           0         10
         not_spam       0         10

All predictions say "not_spam".
(1) TPR = ?
(2) FPR = ?
(3) Where is the point on the ROC curve?
Classification Metrics
Area under the ROC Curve (AUC)
                        prediction
                        spam      not_spam
actual   spam           10        0
         not_spam       0         10

All predictions are perfect.
(1) TPR = ?
(2) FPR = ?
(3) Where is the point on the ROC curve?
Classification Metrics
Area under the ROC Curve (AUC)
                        prediction
                        spam      not_spam
actual   spam           5         5
         not_spam       5         5

Some random predictions.
(1) TPR = ?
(2) FPR = ?
(3) Where is the point on the ROC curve?
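A quick worked check of the four cases above, using the counts from the tables:

```python
def roc_point(TP, FN, FP, TN):
    return FP / (FP + TN), TP / (TP + FN)        # (FPR, TPR)

print(roc_point(TP=10, FN=0, FP=10, TN=0))   # all "spam"     -> (1.0, 1.0)
print(roc_point(TP=0, FN=10, FP=0, TN=10))   # all "not_spam" -> (0.0, 0.0)
print(roc_point(TP=10, FN=0, FP=0, TN=10))   # perfect        -> (0.0, 1.0), top-left corner
print(roc_point(TP=5, FN=5, FP=5, TN=5))     # random         -> (0.5, 0.5), on the diagonal
```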