Machine Learning
Demystifying the World of Artificial Intelligence and Exploring Its Potential
Housing prices problem
Labelled Data vs. Unlabeled Data
Categorical Data vs. Numerical Data
Logistic Regression
Estimates the probability of an event occurring.
Linear Regression As A Classifier!
Why not use linear regression as a classifier?
[Figure: scatter of Malignant? (Yes = 1, No = 0) against Tumor Size, with a fitted line and a 0.5 threshold splitting Benign from Malignant.]
Linear regression as a classifier will be sensitive to outlying points.
- As a classification problem:
[Figure: samples projected onto the Tumor Size axis (X = one class, O = the other), separated by a Benign/Malignant boundary.]
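This sensitivity can be sketched numerically (toy tumor-size data assumed for illustration): fit ordinary least squares to the 0/1 labels, threshold the line at 0.5, and watch the boundary shift when one very large malignant tumor is added.

```python
def fit_line(xs, ys):
    # Ordinary least squares for y = m*x + q with one feature
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs)
    return m, my - m * mx

def boundary(m, q):
    # Tumor size at which the fitted line crosses the 0.5 threshold
    return (0.5 - q) / m

sizes = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 0, 1, 1, 1, 1]   # 0 = benign, 1 = malignant
m, q = fit_line(sizes, labels)
print(round(boundary(m, q), 2))     # 4.5

# One extra, very large malignant tumor drags the line down and the
# 0.5 threshold to the right, misclassifying smaller malignant tumors:
m2, q2 = fit_line(sizes + [30], labels + [1])
print(round(boundary(m2, q2), 2))
```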
Star Wars!
Imagine you invade a new planet.
• The aliens have 2 words:
▪ Aack
▪ Beep
• You want to know whether an alien is happy or sad.
• You got this data from your experience with some aliens.
• What do you notice about this data?
• How do we solve this?
Star Wars!
Let's build our dataset.
Star Wars!
Draw a line that classifies the data correctly.
Line: Xaack = Xbeep
Or
Line: Xaack - Xbeep = 0
Star Wars!
What do you notice about this planet?
Line: Xcrack - Xdoink - 3.5 = 0
Happy: Xcrack - Xdoink - 3.5 >= 0
Sad: Xcrack - Xdoink - 3.5 < 0
Is it always that simple?
Star Wars!
• The model could misclassify some samples.
• We try to find the most general model with the least error.
Perceptron Classifier [Formula]
The general form of a linear classifier is the line:
ax1 + bx2 + c = 0
This equation is different from the one used in linear regression, but they are equivalent.
What do a, b, and c represent?
This formula extends to more than 2 features.
Ex (3 features, plane model):
ax1 + bx2 + cx3 + d = 0
So why do we use this formula?
Here is the answer...
How do we make the decision?
Perceptron Classifier [Step function]
step(x) = 1 if x >= 0
step(x) = 0 if x < 0
Classifier output:
y' = step(ax1 + bx2 + c)
1: happy
0: sad
How do we compare two models?
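A minimal sketch of this decision rule, reusing the planet example's boundary Xcrack - Xdoink - 3.5 = 0 (so a = 1, b = -1, c = -3.5):

```python
def step(x):
    # step(x) = 1 if x >= 0, else 0
    return 1 if x >= 0 else 0

def predict(a, b, c, x1, x2):
    # y' = step(a*x1 + b*x2 + c); 1 = happy, 0 = sad
    return step(a * x1 + b * x2 + c)

# Boundary from the planet example: Xcrack - Xdoink - 3.5 = 0
print(predict(1, -1, -3.5, 5, 1))  # 1 (happy): 5 - 1 - 3.5 = 0.5 >= 0
print(predict(1, -1, -3.5, 2, 1))  # 0 (sad): 2 - 1 - 3.5 = -2.5 < 0
```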
Perceptron Classifier [Compare Models (Error Function)]
1- Number of errors
Problem with this method:
- It doesn't give a degree of true and false, so we have no measure of convergence.
2- Distance
Perceptron Classifier [Compare Models (Error Function)]
3 Score
Properties:
- The points in the boundary have a score of 0.
- The points in the positive zone have positive scores.
- The points in the negative zone have negative scores.
- The points close to the boundary have scores of low magnitude.
- The points far from the boundary have scores of high magnitude.
Machine Learning
Perceptron Classifier [Compare Models (Error Function)]
3 Score
Properties:
- The points in the boundary have a score of 0.
- The points in the positive zone have positive scores.
- The points in the negative zone have negative scores.
- The points close to the boundary have scores of low magnitude.
- The points far from the boundary have scores of high magnitude.
Score error:
If the sentence is correctly classified, the error is 0.
If the sentence is misclassified, the error is |ax1 + bx2 + c|
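The score error can be computed directly; a minimal sketch using the boundary from the planet example:

```python
def score(a, b, c, x1, x2):
    # Signed score of a point relative to the line a*x1 + b*x2 + c = 0
    return a * x1 + b * x2 + c

def score_error(a, b, c, x1, x2, y):
    # y is the true label (1 or 0): the error is 0 when the point is
    # correctly classified, |a*x1 + b*x2 + c| when it is misclassified.
    s = score(a, b, c, x1, x2)
    predicted = 1 if s >= 0 else 0
    return 0 if predicted == y else abs(s)

# Boundary from the planet example: Xcrack - Xdoink - 3.5 = 0
print(score_error(1, -1, -3.5, 5, 1, 1))  # 0 (correctly classified)
print(score_error(1, -1, -3.5, 2, 1, 1))  # 2.5 (misclassified, |2 - 1 - 3.5|)
```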
Perceptron Classifier [Score error function example]
Perceptron Classifier [Training]
• Case 1: If the point is correctly classified, leave the line as it is.
•Case 2: If the point is incorrectly classified, that means it produces a positive error.
Adjust the weights and the bias a small amount so that this error slightly decreases.
Machine Learning
Perceptron Classifier [Training]
Machine Learning
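A minimal training sketch, under the assumption that "adjust the weights and the bias a small amount" means nudging each weight by learning_rate * (y - y') times the corresponding feature (the classic perceptron trick); the dataset is illustrative:

```python
def train_perceptron(points, labels, epochs=100, lr=0.01):
    # Assumed update rule (the classic perceptron trick): for a
    # misclassified point, move each weight by lr * (y - y') * feature,
    # which nudges the line a little toward the point.
    a, b, c = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            y_pred = 1 if a * x1 + b * x2 + c >= 0 else 0
            if y_pred != y:   # Case 2: misclassified
                a += lr * (y - y_pred) * x1
                b += lr * (y - y_pred) * x2
                c += lr * (y - y_pred)
    return a, b, c

# Illustrative, linearly separable data
points = [(1, 0), (2, 0), (0, 1), (0, 2)]
labels = [1, 1, 0, 0]
a, b, c = train_perceptron(points, labels)
print(a, b, c)
```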
Perceptron Classifier [Training] -- SKlearn
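The slide's code screenshot is not reproduced in this extract; the following is a typical scikit-learn Perceptron sketch on an alien-style dataset (data assumed for illustration):

```python
# Features: [Xaack, Xbeep]; labels: 1 = happy, 0 = sad (assumed toy data)
from sklearn.linear_model import Perceptron

X = [[1, 0], [2, 0], [3, 1], [0, 1], [0, 2], [1, 3]]
y = [1, 1, 1, 0, 0, 0]

clf = Perceptron()
clf.fit(X, y)
print(clf.score(X, y))            # training accuracy
print(clf.predict([[4, 1], [1, 4]]))
```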
Perceptron Classifier [Drawbacks]
- The algorithm is designed for classifying two categories, not more.
- The step function is discrete, which causes two problems:
  - It is hard to take its derivative.
  - Its answers offer very little information. For example, the sentences "I'm happy" and "I'm thrilled, this is the best day of my life!" are both happy, but the second is much happier than the first. A perceptron classifier labels them both simply as 'happy'.
So we need a continuous classifier, and the Logistic Regression classifier solves these problems.
Logistic Regression VS Perceptron Classifier
Why is it called logistic 'regression' when it is a classifier?!
Logistic regression [Logistic function (Sigmoid)]
The output of this function should be continuous in [0, 1]:
sigmoid(x) = 1 / (1 + e^-x)
Another advantage of the sigmoid function is its simple derivative:
Derivative of sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
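The derivative identity can be checked numerically with a central difference:

```python
import math

def sigmoid(x):
    # Logistic function: 1 / (1 + e^-x), output strictly between 0 and 1
    return 1 / (1 + math.exp(-x))

# Central-difference check of sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x))
x, h = 0.7, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
analytic = sigmoid(x) * (1 - sigmoid(x))
print(abs(numeric - analytic) < 1e-8)  # True
```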
Logistic regression [Star Wars Example]
Logistic regression [Training]
For the Perceptron classifier it was:
• Case 1: If the point is correctly classified, leave the line as it is.
• Case 2: If the point is incorrectly classified, move the line a little closer to the point.
But for our Logistic classifier:
• Case 1: If the point is correctly classified, move the line a little away from the point.
• Case 2: If the point is incorrectly classified, move the line a little closer to the point.
Observation: there is a similarity between the increase/decrease column and the (y - y') column.
Based on this observation, we can modify the weights and bias with this formula:
a <- a + learning_rate * (y - y') * x1
b <- b + learning_rate * (y - y') * x2
c <- c + learning_rate * (y - y')
Where y' = sigmoid(ax1 + bx2 + c)
Logistic regression [Training] -- SKlearn
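The slide's code screenshot is not reproduced in this extract; a typical scikit-learn LogisticRegression sketch on an alien-style dataset (data assumed):

```python
from sklearn.linear_model import LogisticRegression

# Features: [Xaack, Xbeep]; labels: 1 = happy, 0 = sad (assumed toy data)
X = [[1, 0], [2, 0], [3, 1], [0, 1], [0, 2], [1, 3]]
y = [1, 1, 1, 0, 0, 0]

clf = LogisticRegression()
clf.fit(X, y)
# Unlike the perceptron, predict_proba gives a continuous "how happy" score.
print(clf.predict([[4, 1]]))
print(clf.predict_proba([[4, 1]]))
```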
Multi-class Model
- We can use the one-vs-rest method, but we need a way to get the probability of each class.
- To get the probabilities we use the softmax function (a general form of the sigmoid).
- For the output of ax1 + bx2 + c, imagine these results:
  Dog Classifier: 3   Cat Classifier: 2   Bird Classifier: -1
- We could divide each result by the sum, but then Bird would get a negative probability of -1/4. So we need a function that is always positive; we can use e^x, which also has a simple derivative.
- Dog Classifier: e^3 = 20.085   Cat Classifier: e^2 = 7.389   Bird Classifier: e^-1 = 0.368
- Probability of dog: 20.085/27.842 = 0.721
- Probability of cat: 7.389/27.842 = 0.265
- Probability of bird: 0.368/27.842 = 0.013
- General formula of Softmax: softmax(z)_i = e^(z_i) / sum_j e^(z_j)
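The softmax computation above can be reproduced directly:

```python
import math

def softmax(scores):
    # softmax(z)_i = e^(z_i) / sum_j e^(z_j)
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Dog: 3, Cat: 2, Bird: -1 -> reproduces the slide's 0.721 / 0.265 / 0.013
probs = softmax([3, 2, -1])
print([round(p, 3) for p in probs])  # [0.721, 0.265, 0.013]
```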
Performance Evaluation
• Bias and Variance
• The Confusion Matrix
• Central idea: minimize the error on test data, i.e., the difference between the actual value of the target variable and the value the ML model predicts.
• Compare different models by calculating a performance metric on the test data.
Performance Evaluation
• Approaches:
• Classification metrics (discrete output):
▪ Confusion Matrix
• Regression metrics (continuous output):
▪ Mean squared error (MSE)
▪ Mean absolute error (MAE)
▪ Root mean squared error (RMSE)
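A minimal sketch of the three regression metrics on assumed toy values:

```python
import math

# Toy predictions (assumed values, for illustration only)
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 3.0, 6.0]

errors = [t - p for t, p in zip(y_true, y_pred)]
mse = sum(e ** 2 for e in errors) / len(errors)   # mean squared error
mae = sum(abs(e) for e in errors) / len(errors)   # mean absolute error
rmse = math.sqrt(mse)                             # root mean squared error

print(mse, mae, rmse)  # 0.5625 0.625 0.75
```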
Performance Evaluation
• Regression Metrics
Performance Evaluation
• Classification Metrics
• Confusion Matrix
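A minimal sketch of building a binary confusion matrix by counting true/false positives and negatives (toy labels assumed), with accuracy, precision, and recall derived from it:

```python
# 1 = positive class, 0 = negative class (assumed toy labels)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, tn, fp, fn, accuracy, precision, recall)
```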
Performance Evaluation
• Class Imbalance
Bias-Variance Tradeoff
Underfitting & Overfitting
Reasons for Underfitting/Overfitting
THANK YOU!
Any Questions?