Albertus Aditya
Today's Agenda
Steps of Machine Learning (ML) Development
Introduction to Evaluating ML Models
Example of Evaluating Classification ML Models
Example of Evaluating Clustering ML Models
Steps of Machine Learning Model Development
1 Data Collection
2 Data Cleaning & Feature Engineering
3 Model Building & Choose Algorithm
4 Evaluation
5 Model Deployment & Visualization
Steps of Machine Learning Model Development
Let's see our challenges!
Data Collection
• Enough sources of data?
• Is it varied enough?
Data Cleansing & Feature
• Have all important variables been taken into account?
Model Building and Choose Algorithm
• Is one algorithm enough?
Steps of Machine Learning Model Development
Supervised, e.g. Classification
Unsupervised, e.g. Clustering/K-Means
The way of evaluation is not the same!
Model Evaluation: Brief Introduction
Model evaluation is the process of using different
evaluation metrics to understand a machine
learning model’s performance.
Our models must be capable of making accurate
predictions on new, unseen data, which is crucial
for their successful deployment in real-world
applications.
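One common way to approximate "new, unseen data" is to hold back part of the collected data and evaluate only on that part. The sketch below is a minimal illustration, not a prescribed method: the helper name holdout_split, the 20% test fraction, and the fixed seed are all assumptions made for the example.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))  # stand-in for 10 labelled examples
train, test = holdout_split(rows)
print(len(train), "training rows,", len(test), "held-out rows")
```

The model is fitted on train only; the metrics are then computed on test, which the model has never seen.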
Model Evaluation: Brief Introduction
Evaluation is everywhere in IT!
But the technique is different.
Software development testing involves
manually testing various cases
Machine learning testing involves testing on
various data
Take a Breath ...
1 Evaluating a machine learning model is part of developing the machine learning model itself
2 There are real challenges in developing a machine learning model
3 Evaluating a machine learning model is useful to ensure that it is applicable to new data
Evaluating a Classification Model
Confusion matrix
A table that describes the performance of a
classification model on test data
for which the true values are known
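As a minimal sketch of how such a table can be filled in, the code below counts the four cells of a binary confusion matrix: true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN). The function name confusion_counts and the tiny "yes"/"no" label lists are made up for illustration.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN and TN for a binary problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: correct (TP) or a false alarm (FP)
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # Predicted negative: a miss (FN) or correct (TN)
            counts["FN" if truth == positive else "TN"] += 1
    return {cell: counts[cell] for cell in ("TP", "FP", "FN", "TN")}


y_true = ["yes", "yes", "no", "no", "yes", "no"]  # known true values
y_pred = ["yes", "no", "no", "yes", "yes", "no"]  # model predictions
cm = confusion_counts(y_true, y_pred, positive="yes")
print(cm)
```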
Evaluating a Classification Model
Accuracy
How close a measurement is to the true or accepted value
Think of Figure C
Precision
How close measurements of the same item are to each other
Think of Figure B
Don't think of Figure A
Good to have Figure D
Others: Recall, F1 Score, ROC
Evaluating a Classification Model
Testing of 50 cases:
• Predicted: Pregnant, Real: Pregnant ==> TP = 20
• Predicted: Pregnant, Real: Not Pregnant ==> FP = 2
• Predicted: Not Pregnant, Real: Pregnant ==> FN = 4
• Predicted: Not Pregnant, Real: Not Pregnant ==> TN = 24
Results:
• Accuracy = (TP + TN) / (TP + TN + FP + FN) = 88%
• Precision = TP / (TP + FP) = 91%
• Recall = TP / (TP + FN) = 83%
• F1 Score = (2 * Prec. * Rec.) / (Prec. + Rec.) = 87%
• False Positive Rate (FPR, the x-axis of the ROC curve) = FP / (FP + TN) = 7.69%
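The arithmetic for the 50 test cases can be checked with a few lines of Python. This is only a sketch: the function name classification_metrics is an assumption, and it has no guard against division by zero (for example when no case is predicted positive).

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the metrics from the four confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),  # false positive rate, x-axis of the ROC curve
    }


m = classification_metrics(tp=20, fp=2, fn=4, tn=24)
for name, value in m.items():
    print(f"{name}: {value:.2%}")
```

With these counts the script prints 88.00%, 90.91%, 83.33%, 86.96% and 7.69%, matching the rounded values on the slide.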
Evaluating a Clustering Model
Silhouette coefficient
A metric that measures how well each data point
fits into its assigned cluster; its value ranges from -1 to 1.
It combines information about:
• Cohesion: how close a data point is to other points in its own cluster
• Separation: how far a data point is from points in other clusters
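To make cohesion and separation concrete, here is a minimal pure-Python sketch of the per-point silhouette value s = (b - a) / max(a, b), where a is the mean distance to the other points in the same cluster and b is the smallest mean distance to the points of any other cluster. The function name silhouette_values and the four sample points are assumptions; the sketch uses Euclidean distance and expects at least two clusters.

```python
import math


def mean_distance(p, others):
    """Average Euclidean distance from p to every point in others."""
    return sum(math.dist(p, q) for q in others) / len(others)


def silhouette_values(points, labels):
    """Return one silhouette value per point (requires >= 2 clusters)."""
    values = []
    for i, (p, own) in enumerate(zip(points, labels)):
        same = [q for j, (q, lab) in enumerate(zip(points, labels))
                if lab == own and j != i]
        if not same:  # a point alone in its cluster gets 0 by convention
            values.append(0.0)
            continue
        a = mean_distance(p, same)  # cohesion
        b = min(  # separation: nearest other cluster on average
            mean_distance(p, [q for q, lab in zip(points, labels) if lab == other])
            for other in set(labels) - {own}
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values


points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = [0, 0, 1, 1]
values = silhouette_values(points, labels)
print(sum(values) / len(values))  # mean silhouette over all points
```

Values close to 1 mean the point sits well inside its own cluster; values near 0 or below suggest it lies between clusters or in the wrong one.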
Dunn's Index
• Equals the minimum inter-cluster distance divided by the maximum
cluster size (diameter).
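Dunn's Index has several variants. The sketch below assumes one common choice: inter-cluster distance is the distance between the closest pair of points from two different clusters, and cluster size is the largest distance between two points of the same cluster. The function name dunn_index and the sample points are made up for the example, and the sketch does not handle the case where every cluster has a single point.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Minimum inter-cluster distance divided by maximum cluster diameter."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    # Largest distance between two points that share a cluster
    max_diameter = max(
        max((math.dist(p, q) for p, q in combinations(members, 2)), default=0.0)
        for members in clusters.values()
    )
    # Smallest distance between two points from different clusters
    min_separation = min(
        math.dist(p, q)
        for first, second in combinations(clusters.values(), 2)
        for p in first
        for q in second
    )
    return min_separation / max_diameter


points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = [0, 0, 1, 1]
dunn = dunn_index(points, labels)
print(dunn)
```

A higher value means clusters are compact and far apart relative to their size.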
Evaluating a Clustering Model
• It is easier to calculate these metrics using ready-made functions in Python,
e.g. using scikit-learn
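As a hedged illustration, the sketch below clusters six made-up points with scikit-learn's KMeans and scores the result with silhouette_score. The data, the choice of two clusters, and the random_state value are assumptions for the example; scikit-learn must be installed.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Two visibly separate groups of toy points (illustrative data only)
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
     [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]]

# Fit K-Means with 2 clusters and get one cluster label per point
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Mean silhouette coefficient over all points, between -1 and 1
score = silhouette_score(X, labels)
print(f"silhouette: {score:.3f}")
```

For classification, sklearn.metrics offers confusion_matrix, accuracy_score, precision_score, recall_score and f1_score in the same style.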
Take Away
• Accuracy & Precision
• Confusion Matrix
• True Positive, True Negative, False Positive, False Negative
• Silhouette coefficient
• Cohesion & Separation
• Python
What Our Experience Says
• There is no single way of measuring our methods.
• Consider the challenges from the initial stages of development.
• A model might be evaluated based on years of business experience.