The document provides an overview of logistic regression, comparing it to the linear probability model and discussing its application in classification problems. It covers key concepts such as the sigmoid function, maximum likelihood estimation, and performance metrics like accuracy, precision, recall, F1 score, ROC, and AUC. Additionally, it highlights the importance of logistic regression in predicting categorical outcomes, using examples like credit card defaults.

Uploaded by

omarfaroque910
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views20 pages

Class 10 - Logistic Regression-Checkpoint

The document provides an overview of logistic regression, comparing it to the linear probability model and discussing its application in classification problems. It covers key concepts such as the sigmoid function, maximum likelihood estimation, and performance metrics like accuracy, precision, recall, F1 score, ROC, and AUC. Additionally, it highlights the importance of logistic regression in predicting categorical outcomes, using examples like credit card defaults.

Uploaded by

omarfaroque910
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 20

Class 10 – Logistic Regression

Prof. Pedram Jahangiry



Road map

ML Algorithms
• Supervised
  • Regression: Linear / Polynomial regression, Penalized regression, KNN, SVR, CART, Random Forest
  • Classification: Logistic regression, KNN, SVC, CART, Random Forest
• Unsupervised
  • Clustering: K-Means, Hierarchical
  • Dimensionality Reduction: Principal Component Analysis (PCA)



Topics
Part I
1. Linear probability model (LPM) vs Logistic regression
2. Sigmoid function
3. Logistic regression

Part II
1. Classification performance metrics
a) Accuracy,
b) Precision,
c) Recall,
d) F1 score,
e) ROC and AUC.



Classification

• Qualitative variables can be either nominal or ordinal.


• Qualitative variables are often referred to as categorical.
• Classification is the process of predicting categorical variables.
• Classification problems are quite common, perhaps even more common than regression problems.
• Examples:
• Financial instrument tranches (investment grade or junk)
• Online transactions (fraudulent or not)
• Loan application (approved or denied)
• Credit card default (default or not)
• Car insurance customers (high, medium, low risk)



Credit card default example

• Goal: Build a classifier that performs well on both the training and test sets.



Part I
Logistic Regression



Linear Probability Model (LPM) vs Logistic Regression

Starting with a simple LPM: y = β₀ + β₁·bal + ε, where Y = 1 for default and 0 otherwise.

E(Y | bal) = Pr(Y = 1 | bal) = p(x) = β₀ + β₁·bal

• It seems that simple linear regression is perfect for this task.

• But what are the caveats?
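The main caveat can be seen numerically: a straight line fitted to a 0/1 outcome happily produces "probabilities" below 0 or above 1. A minimal sketch with made-up balance data (closed-form simple OLS, no libraries; the numbers are hypothetical, not the Default dataset):

```python
# Toy data (hypothetical): credit card balances and default indicator (1 = default)
balances = [100, 500, 1000, 1500, 2000]
defaults = [0, 0, 0, 1, 1]

# Closed-form simple OLS: slope = cov(x, y) / var(x), intercept = mean(y) - slope * mean(x)
n = len(balances)
x_bar = sum(balances) / n
y_bar = sum(defaults) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(balances, defaults))
         / sum((x - x_bar) ** 2 for x in balances))
intercept = y_bar - slope * x_bar

def lpm_predict(bal):
    """Linear probability model 'probability' for a given balance."""
    return intercept + slope * bal

# The fitted line leaves [0, 1] at the extremes:
print(lpm_predict(0))     # negative "probability"
print(lpm_predict(3000))  # "probability" above 1
```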



Sigmoid Function

• We need a monotone mapping function that has a range of (0, 1).
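The standard choice is the sigmoid (logistic) function σ(z) = 1 / (1 + e^{-z}), which is monotone and maps any real z into (0, 1). A quick sketch:

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Monotone, symmetric around 0, and bounded by (0, 1):
print(sigmoid(0))    # 0.5
print(sigmoid(5))    # close to 1
print(sigmoid(-5))   # close to 0
```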



Logistic Regression (Model)

• The model: f_{w,b}(X) = 1 / (1 + e^{-(WX + b)})

• In the case of two classes, f_{w,b}(X) = Pr(Y = 1 | x) = p(x).

• A bit of rearrangement gives

  log( p(x) / (1 − p(x)) ) = WX + b

• This monotone transformation is called the log odds or logit transformation of 𝑝(𝑥).
• Logistic regression ensures that our estimates always lie between 0 and 1
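The rearrangement above can be checked numerically: applying the log-odds (logit) transformation to p(x) recovers the linear score WX + b exactly. A small sketch with arbitrary (made-up) w, b, and x:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log odds (logit) transformation of a probability p."""
    return math.log(p / (1.0 - p))

# Arbitrary (hypothetical) parameters and input
w, b, x = 0.8, -1.3, 2.0
z = w * x + b          # linear score
p = sigmoid(z)         # model probability p(x)

# logit(p(x)) recovers the linear score wx + b
print(z, logit(p))
```

The logit is the inverse of the sigmoid, which is exactly why the model's probabilities always stay between 0 and 1.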



Logistic regression fit (Decision boundary)

• Depending on how we define WX + b, we can get any of the following fits from the logistic regression classifier.



Logistic Regression (Maximum Likelihood)

• In logistic regression, instead of minimizing the average loss, we maximize the likelihood
of the training data according to our model. This is called maximum likelihood estimation.
• A fantastic visualization!
• Can you do the same visualization with the S curve?

L(w,b) = ∏ᵢ f_{w,b}(xᵢ)^{yᵢ} · (1 − f_{w,b}(xᵢ))^{1−yᵢ}



Logistic Regression (Objective function)

• Maximizing the likelihood function:

  max_{w,b} L(w,b) = ∏ᵢ f_{w,b}(xᵢ)^{yᵢ} · (1 − f_{w,b}(xᵢ))^{1−yᵢ}

• Solution: In practice, it is more convenient to maximize the log-likelihood function. This log-likelihood maximization gives us w* and b*. There is no closed-form solution to this optimization problem, so we need to use gradient descent.

• We are now ready to make predictions:

  f_{w*,b*}(X) = 1 / (1 + e^{-(W*X + b*)})

• Depending on how we define the probability threshold, we can classify the observations.
In practice, the choice of the threshold could be different depending on the problem.
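The recipe above can be sketched end to end on a tiny made-up 1-D dataset. The gradient of the log-likelihood is Σᵢ (yᵢ − p(xᵢ))·xᵢ for w and Σᵢ (yᵢ − p(xᵢ)) for b, so we take gradient *ascent* steps (equivalently, gradient descent on the negative log-likelihood):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Tiny hypothetical dataset: feature x, binary label y
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = 0.0, 0.0
lr = 0.1  # learning rate (step size)

for _ in range(5000):
    preds = [sigmoid(w * x + b) for x in xs]
    # Gradient of the log-likelihood: sum_i (y_i - p(x_i)) * x_i  (and * 1 for b)
    grad_w = sum((y - p) * x for x, y, p in zip(xs, ys, preds))
    grad_b = sum(y - p for y, p in zip(ys, preds))
    # Ascent, since we are maximizing the likelihood
    w += lr * grad_w
    b += lr * grad_b

# Fitted probabilities separate the two classes (threshold 0.5)
print(sigmoid(w * 0.5 + b))  # low probability for a class-0 point
print(sigmoid(w * 4.0 + b))  # high probability for a class-1 point
```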



Logistic regression output for credit card default example

P(default | bal, inc) = 1 / (1 + e^{-(b + w₁·bal + w₂·inc)})
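Plugging coefficients into this formula shows how the fitted model turns a balance and income into a default probability. The parameter values below are made up for illustration, not the slide's actual estimates:

```python
import math

# Hypothetical fitted parameters -- illustrative only, not the slide's estimates
b_hat = -10.0   # intercept
w1 = 0.005      # weight on balance (higher balance -> higher default risk)
w2 = -0.00001   # weight on income

def p_default(bal, inc):
    """P(default | bal, inc) = 1 / (1 + e^-(b + w1*bal + w2*inc))"""
    return 1.0 / (1.0 + math.exp(-(b_hat + w1 * bal + w2 * inc)))

print(p_default(500, 40000))    # low balance -> low default probability
print(p_default(2500, 40000))   # high balance -> much higher probability
```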



Part II
Classification Performance Metrics



Confusion Matrix
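The confusion matrix simply tallies the four outcomes of binary predictions: true positives, false positives, false negatives, and true negatives. A from-scratch sketch on toy labels:

```python
def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) counts for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Toy labels and predictions
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
print(confusion_matrix(y_true, y_pred))  # (2, 1, 1, 2)
```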



Accuracy, Precision, Recall and F1score

• While recall expresses the ability to find all relevant instances in a dataset, precision expresses the proportion of the data points our model labels as relevant that actually are relevant.

• F1 uses the harmonic mean instead of a simple average because it punishes extreme values.
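These three metrics follow directly from the confusion-matrix counts, and the "punishes extreme values" point is easy to see numerically: with precision 1.0 and recall 0.1 the simple average still looks respectable, while F1 stays close to the weaker metric. A sketch:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 score from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Example counts: TP=2, FP=1, FN=1
print(precision_recall_f1(2, 1, 1))  # all three equal 2/3 here

# Harmonic mean punishes extremes: precision 1.0, recall 0.1
p, r = 1.0, 0.1
f1 = 2 * p * r / (p + r)
print((p + r) / 2, f1)  # simple average 0.55 vs F1 of roughly 0.18
```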



ROC (Receiver Operating Characteristic)

(Figure: ROC curve, plotting the true positive rate (TPR) against the false positive rate (FPR) as the classification threshold ρ varies; lowering ρ trades false negatives (FN) for false positives (FP).)



AUC
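AUC (the area under the ROC curve) has a convenient probabilistic reading: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. A from-scratch sketch using that rank interpretation:

```python
def auc(scores, labels):
    """AUC = P(score of random positive > score of random negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give AUC = 1.0
print(auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0

# One misranked pair out of four gives AUC = 0.75
print(auc([0.9, 0.4, 0.5, 0.2], [1, 1, 0, 0]))  # 0.75
```

An AUC of 0.5 corresponds to random guessing, since a random positive then outranks a random negative only half the time.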



Some other classification metrics



Students’ questions
1) Are we treating (classifying) ŷ = 0.51 and ŷ = 0.99 the same?
2) Does it make sense to have non-linear decision boundaries in logistic regression?
3) Is logistic regression useful for anything beyond probability prediction?
4) What do ROC and AUC tell us about our predictions?

