Confusion Matrix:
A confusion matrix provides a summary of the prediction results
in a classification problem. Correct and incorrect predictions are
summarized in a table with their counts, broken down by
each class.
Confusion Matrix for Binary Classification
Calculating a confusion matrix:
Let’s take an example:
We have a total of 10 cats and dogs and our model predicts
whether it is a cat or not.
Actual values = [‘dog’, ‘cat’, ‘dog’, ‘cat’, ‘dog’, ‘dog’, ‘cat’, ‘dog’,
‘cat’, ‘dog’]
Predicted values = [‘dog’, ‘dog’, ‘dog’, ‘cat’, ‘dog’, ‘dog’, ‘cat’, ‘cat’,
‘cat’, ‘cat’]
Remember, we describe predicted values as Positive/Negative
and actual values as True/False.
Definition of the Terms:
True Positive: You predicted positive and it’s true. You predicted
that an animal is a cat and it actually is.
True Negative: You predicted negative and it's true. You
predicted that the animal is not a cat and it actually is not (it's a
dog).
False Positive (Type 1 Error): You predicted positive and it's
false. You predicted that the animal is a cat but it actually is not
(it's a dog).
False Negative (Type 2 Error): You predicted negative and it's
false. You predicted that the animal is not a cat but it actually is.
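To connect these definitions to the cat and dog example above, the four counts can be tallied directly from the two lists. A minimal Python sketch, treating 'cat' as the positive class:

actual = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat']

# Compare each actual/predicted pair against the definitions above ('cat' = positive).
tp = sum(a == 'cat' and p == 'cat' for a, p in zip(actual, predicted))  # predicted cat, actually cat
tn = sum(a == 'dog' and p == 'dog' for a, p in zip(actual, predicted))  # predicted dog, actually dog
fp = sum(a == 'dog' and p == 'cat' for a, p in zip(actual, predicted))  # predicted cat, actually dog
fn = sum(a == 'cat' and p == 'dog' for a, p in zip(actual, predicted))  # predicted dog, actually cat

print(tp, tn, fp, fn)  # 3 4 2 1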
Classification Accuracy:
Classification Accuracy is given by the relation:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Recall (aka Sensitivity):
Recall is defined as the ratio of the total number of correctly
classified positive classes divided by the total number of actual
positive classes. Or, out of all the actual positive classes, how
many we predicted correctly. Recall should be high.
Recall = TP / (TP + FN)
Precision:
Precision is defined as the ratio of the total number of correctly
classified positive classes divided by the total number of
predicted positive classes. Or, out of all the predicted positive
classes, how many we predicted correctly. Precision should be
high.
Precision = TP / (TP + FP)
Trick to remember: Precision has Predicted results in the
denominator.
F-score or F1-score:
It is difficult to compare two models with different Precision
and Recall, so to make them comparable we use the F-score. It is
the Harmonic Mean of Precision and Recall. Compared to the
Arithmetic Mean, the Harmonic Mean punishes extreme values
more. F-score should be high.
F-score = (2 * Precision * Recall) / (Precision + Recall)
Specificity:
Specificity determines the proportion of actual negatives that
are correctly identified.
Specificity = TN / (TN + FP)
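As a quick sketch, the formulas above translate directly into Python; the helper name classification_metrics below is only illustrative:

def classification_metrics(tp, tn, fp, fn):
    # Straightforward translations of the formulas above.
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)          # out of actual positives, how many were found
    precision = tp / (tp + fp)       # out of predicted positives, how many were correct
    f_score = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)     # out of actual negatives, how many were found
    return accuracy, recall, precision, f_score, specificity

# Counts from the cat and dog example above (TP = 3, TN = 4, FP = 2, FN = 1).
print(classification_metrics(tp=3, tn=4, fp=2, fn=1))
# accuracy 0.70, recall 0.75, precision 0.60, F-score ~0.67, specificity ~0.67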
Example to interpret the confusion matrix:
Let's calculate the confusion matrix using the cat and dog
example above. Counting the four outcomes from the two lists gives:
True Positive (TP) = 3, True Negative (TN) = 4, False Positive (FP) = 2, False Negative (FN) = 1
Classification Accuracy:
Accuracy = (TP + TN) / (TP + TN + FP + FN) =
(3+4)/(3+4+2+1) = 0.70
Recall: Recall tells us, when the actual class is yes (a cat), how
often the model predicts yes.
Recall = TP / (TP + FN) = 3/(3+1) = 0.75
Precision: Precision tells us, when the model predicts yes (a cat),
how often it is correct.
Precision = TP / (TP + FP) = 3/(3+2) = 0.60
F-score:
F-score = (2*Recall*Precision)/(Recall+Precision) =
(2*0.75*0.60)/(0.75+0.60) = 0.67
Specificity:
Specificity = TN / (TN + FP) = 4/(4+2) = 0.67
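If scikit-learn is available, the same matrix and scores can be cross-checked with its built-in functions (a sketch; pos_label='cat' marks the positive class):

from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

actual = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat']

# With labels=['dog', 'cat'] the layout is [[TN, FP], [FN, TP]] for 'cat' as positive.
print(confusion_matrix(actual, predicted, labels=['dog', 'cat']))  # [[4 2], [1 3]]
print(accuracy_score(actual, predicted))                           # 0.7
print(recall_score(actual, predicted, pos_label='cat'))            # 0.75
print(precision_score(actual, predicted, pos_label='cat'))         # 0.6
print(f1_score(actual, predicted, pos_label='cat'))                # ~0.67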
AUC-ROC Curve:
The AUC-ROC curve, or Area Under the Receiver Operating
Characteristic curve, is a graphical representation of the performance of a
binary classification model at various classification thresholds. It is
commonly used in machine learning to assess the ability of a model to
distinguish between two classes, typically the positive class (e.g.,
presence of a disease) and the negative class (e.g., absence of a
disease).
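As a rough sketch of how this is computed in practice, scikit-learn's roc_curve and roc_auc_score work from predicted probabilities rather than hard class labels; the labels and scores below are made-up illustrative values, not taken from the examples in this section:

from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical ground truth (1 = positive class) and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.50, 0.70]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one (FPR, TPR) point per threshold
print(roc_auc_score(y_true, y_score))              # area under that curve, here 0.875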
Another example to interpret a confusion matrix:
Environmental scientists want to solve a two-class
classification problem: predicting whether a population
contains a specific genetic variant. They can use a confusion
matrix to see how often the machine learning classification
model they're analyzing confuses the two classes. Assuming
the scientists use 500 samples for their data analysis, a table
is constructed for the predicted and actual values before
calculating the confusion matrix.
                                    | Predicted without the variant | Predicted with the variant
Actual number without the variant   |                               |
Actual number with the variant      |                               |
Total predicted value               |                               |
After creating the matrix, the scientists analyze their
sample data. Assume the scientists predict that 350
test samples contain the genetic variant and 150
samples don't. If they determine that the actual number of
samples containing the variant is 305, then the actual
number of samples without the variant is 195. These
values become the "true" values in the matrix and the
scientists enter the data in the table:
                                        | Predicted without the variant | Predicted with the variant
Actual number without the variant = 195 | True negative = 45            | False positive = 150
Actual number with the variant = 305    | False negative = 105          | True positive = 200
Total predicted value                   | 150                           | 350
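Plugging the scientists' counts into the formulas from earlier gives an overall picture of this model; the sketch below simply reads the four values from the table above:

# From the table: TP = 200, TN = 45, FP = 150, FN = 105.
tp, tn, fp, fn = 200, 45, 150, 105

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 245 / 500 = 0.49
recall = tp / (tp + fn)                      # 200 / 305 ≈ 0.66
precision = tp / (tp + fp)                   # 200 / 350 ≈ 0.57
specificity = tn / (tn + fp)                 # 45 / 195  ≈ 0.23
print(accuracy, recall, precision, specificity)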