Logistic Regression
Slide 2
Let us look at some scenarios
• In many real life scenarios, the dependent variable is "limited."
o E.g. Is a person loyal to a brand? Will the person purchase my brand?
o The outcome here is not continuous or distributed normally.
• How to predict the outcome using linear regression?
o Do you see any problem here?
o Try to fit a regression line on this data.
o Can you decipher a relationship among these variables?
Slide 3
Fitting a Regression line
• We could severely simplify the plot by drawing a line between the
means for the two dependent variable levels,
• But this is problematic in two ways:
o the line seems to oversimplify the relationship and
o it gives predictions that cannot be observable values of Y for extreme
values of X.
Y = β0 + β1X + ε, where Y ∈ {0, 1}
(Y has a range of 0 to 1, while β0 + β1X + ε has a range of −∞ to +∞)
Slide 4
The Logistic Regression Model
ln(p / (1 − p)) = a + bX          p = e^(a+bX) / (1 + e^(a+bX))
Slide 6
• If Linear Regression is used:
− Probabilities (the dependent variable) range from 0 to 1.
− However linear predictions can be outside of this range.
Slide 7
Understanding Odds Ratio
• Problem with probabilities is that they are non-linear
o Going from 0.10 to 0.20 doubles the probability, but going from 0.80
to 0.90 barely increases the probability.
• Odds are like probability.
o Usually written as “4 to 1 odds”, which is equivalent to 1 out of
five, or 0.20 probability, or a 20% chance, etc.
odds = p / (1 − p)
Odds have a range of 0 to ∞.
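As a quick numeric illustration (a minimal sketch; the probabilities are made-up values), converting probabilities to odds and log odds shows the two scales side by side:

```python
import numpy as np

# Hypothetical probabilities (illustrative values only)
p = np.array([0.10, 0.20, 0.50, 0.80, 0.90])

odds = p / (1 - p)          # odds range from 0 to +infinity
log_odds = np.log(odds)     # log odds range from -infinity to +infinity

for prob, o, lo in zip(p, odds, log_odds):
    print(f"p = {prob:.2f}  odds = {o:5.2f}  log-odds = {lo:6.2f}")
```

Note how going from p = 0.10 to 0.20 roughly doubles the odds, while going from 0.80 to 0.90 more than doubles them, even though both are +0.10 changes in probability.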
Slide 9
Graph of the Logistic Function
Ogive function
The estimated probability is: p = e^(a+bX) / (1 + e^(a+bX))
o If {a + b X} = 0, then p = 0.50
o As {a + b X} gets really big, p approaches 1
o As {a + b X} gets really small, p approaches 0
Slide 10
Why use Logistic Regression?
• Binary Logistic Regression is a type of CLASSIFICATION
ALGORITHM where the dependent variable is a dummy
variable:
• Coded 0 (e.g. not brand loyal) or 1 (e.g. brand loyal)
o Will the customer make a purchase at the store in the next 30
days (yes vs. no)? Does it change if (s)he is a member of store
loyalty program or if the total purchase last year was above Rs
10,000?
o What is the impact of compensation, employee engagement
schemes and satisfaction scores on employee retention?
• Relationship between the DV and predictors is non-linear.
Slide 11
Understanding Logistic Regression
• Binomial logistic regression is similar to linear regression
o Except that the dependent variable is binary (not continuous)
• Unlike linear regression, you are not attempting to determine
the predicted value of the dependent variable, but the
probability of being in a particular category of the
dependent variable given the independent variables.
• As with other types of regression, binomial logistic regression
also uses interactions between independent variables to
predict the dependent variable.
• Logistic regression is a classification algorithm; do not be misled by
the word “regression” in its name.
ln(p / (1 − p)) = a + bX
P(y|x) = e^(a+bx) / (1 + e^(a+bx))
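A minimal sketch of this idea in Python, assuming a made-up brand-loyalty dataset with one predictor (statsmodels' Logit is one common way to fit such a model; the variable names and data are illustrative only):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: spend = annual spend (Rs '000), y = 1 if brand loyal, 0 otherwise
rng = np.random.default_rng(0)
spend = rng.uniform(1, 20, size=200)
true_logit = -3 + 0.4 * spend                       # assumed "true" a + bX
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

X = sm.add_constant(spend)          # adds the intercept term a
model = sm.Logit(y, X).fit()        # maximum likelihood fit
print(model.params)                 # estimates of a and b (on the log-odds scale)

# The model returns the probability of being in the "loyal" category,
# not a predicted value of y itself
p_hat = model.predict(X)
print(p_hat[:5])
```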
Slide 14
Assumptions…
• Dependent variable should be nominal, with 2 categories.
o Could be ordinal or > 2 categories, however focus here is on binary DVs
• One or more independent variables -
o Measured on either a continuous or a nominal scale.
If an independent variable is ordinal, then it must be converted into nominal.
Otherwise the software might treat it as interval-scale data.
• Observations are independent of each other.
i.e. obs should not come from repeated measurements or matched data.
• A case (or an object) should fall only in one category of DV
A person cannot be both – brand loyal and not brand loyal.
• DV should have mutually exclusive and exhaustive categories.
• Obs within each category of the DV should be independent
o i.e. values within a category should not have a relationship
o Same should also hold for the nominal IVs.
Slide 15
…Assumptions
• Limited or no multicollinearity among independent
variables.
• Linearity: A linear relationship between the log odds (logit) and the
independent variables.
o No assumption that the predictors themselves are linear or linearly related to each other
• Error Term: The error term is assumed independent.
• Logistic regression typically requires a large sample size.
• There should be no outliers in the data,
o as logistic regression uses maximum likelihood estimation (MLE) (unlike linear
regression, which uses Ordinary Least Squares).
o Treat values beyond ±3.29 standardized scores as outliers.
Slide 16
Not Required for Logistic Regression
• The dependent variable in logistic regression is simple: it is not
measured on an interval or ratio scale.
• Logistic regression does not require a linear relationship between
the dependent and independent variables.
− Linear relationship with log of odds
• OLS (linear regression) assumes a normally distributed dependent
variable, but in logistic regression the distribution may be
normal, Poisson or binomial.
• Error terms (residuals) do not need to be normally distributed.
• Homoscedasticity is not required.
o OLS assumes equal error variance across the levels of the independent
variables; logistic regression does not make this assumption.
Slide 17
Sample Size
• Overall sample size
o Should be 400 to achieve best results with maximum
likelihood estimation (MLE).
− With smaller sample sizes, the method can be less efficient in
model estimation.
• The focus is more on the size of each outcome group, which should
have at least 10 times the number of estimated model coefficients:
o Particularly problematic are situations where the actual
frequency of one outcome group is small (i.e., below 20). Actual
size of the small group is a bigger issue than the low rate of
occurrence.
o Alternate approaches are required for addressing this situation,
what may be termed “rare event” situations.
• Sample size requirements should be met in both – the training
and the holdout samples.
Slide 18
Issues in Model Estimation
o Small sample sizes
• Difficult to accurately estimate coefficients and standard errors.
o Complete Separation
• Dependent variable is perfectly predicted by an independent variable.
• The problem is that the logit is not defined at probabilities of exactly
one and zero, thus no values are available for estimation.
o Quasi-Complete Separation (Zero Cells Effect)
• Most frequently encountered.
• One or more of the groups defined by the nonmetric independent
variable have counts of zero.
• Either use specialized methods or collapse categories.
Slide 19
So, in short, for Logistic Regression
Slide 20
Maximum Likelihood Estimate
• Logistic Regression uses MLE, instead of OLS as the statistical method
for estimating coefficients of a model.
• Maximizes the likelihood that an event will occur – the event being
a respondent is assigned to one group versus another.
o An iterative procedure.
o Starts with a guess as to the best weight for each predictor variable (i.e.,
each coefficient in the model). Then adjusts these coefficients repeatedly
until there is no additional improvement in the ability to predict the
value of the outcome variable (either 0 or 1) for each case.
• OLS, by contrast, is the process of finding the line that best fits the data.
• Logistic regression is more similar to cross-tabulation given that the
outcome is categorical and the test statistic utilized is the Chi Square.
• The likelihood function (L) measures the probability of observing
the particular set of dependent variable values (p1, p2, ..., pn) that
occur in the sample:
L = p1 × p2 × ⋯ × pn
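The iterative idea can be sketched directly: write out the (log-)likelihood and let a numerical optimiser adjust the coefficients until no further improvement is possible. This is only an illustration of the principle, using made-up data and scipy's general-purpose optimiser as a stand-in for the algorithms statistical packages actually use:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(beta, X, y):
    """Negative log of L = product of p_i for observed 1s and (1 - p_i) for observed 0s."""
    z = X @ beta
    p = 1 / (1 + np.exp(-z))
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical data: an intercept column plus one predictor
rng = np.random.default_rng(1)
x = rng.normal(size=300)
X = np.column_stack([np.ones_like(x), x])
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * x))))

# Start from a guess of zero weights and iterate until no further improvement
result = minimize(neg_log_likelihood, x0=np.zeros(2), args=(X, y))
print("Estimated a, b:", result.x)
```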
Slide 21
DEVELOPING THE MODEL
Slide 22
Definition of Project Parameters
• Exclusions
– Accounts that have abnormal performance
– Adjudicated using some non-score-dependent criteria
– Designated accounts such as staff, VIPs, out of country,
preapproved, lost/stolen cards, deceased, underage, and
voluntary cancellations within the performance window
Accounts used for development should be those that are scored during
normal day-to-day credit-granting operations, and those that would
constitute the intended customer.
• Effect of Seasonality
• Definition of Good and Bad
Slide 23
Data Windows
• Observation Period – the period the independent variables come
from, i.e. the independent variables are created from this period
(window) only.
• Performance Period – the period the dependent variable comes
from; it is the period following the observation window.
• No fixed window for all the models. Depends on the type of
model and the industry.
Slide 24
Performance Window
• Factors that would determine it –
– Should be long enough to have enough events. Check the vintage
analysis.
– Depends on the product.
Take multiple lengths of the performance window and calculate the
event rate against these periods. Select the period at which the
event rate stabilizes, i.e. does not increase much further.
Slide 25
Performance Window
• Rolling Performance Window –
Multiple windows are used to build the model, but the duration of
each performance window is fixed.
– To account for seasonality
– To include impact of multiple campaigns
Slide 26
Good Model Design
A good model design should document the following:
• The unit of analysis (such as customer or product level)
• Population frame and sample size
• Operational definitions (what are ‘good’/ ‘bad’ customers?) and
modeling assumptions (did this model include/exclude fraudulent
customers?)
• Observational time horizon (such as customers’ payment history
over the last two years) and performance windows (such as the
timeframe for which the “bad” definition applies)
• Data sources and data collection methods
Slide 27
Data Augmentation Techniques
• Classification algorithms trained on imbalanced data often result
in poor predictive quality.
– Models bias heavily toward the majority class, overlooking minority
examples critical to many use cases.
• Data Augmentation Techniques are used in data analytics to
modify unequal data classes to create balanced data sets.
• Oversampling – method to rebalance classes before training.
– When one class of data is underrepresented in the sample
– When the amount of data collected is insufficient.
• Techniques for Oversampling
– Random oversampling
– Smoothed bootstrap oversampling
– SMOTE (Synthetic Minority Over-sampling Technique)
– ADASYN (Adaptive Synthetic Sampling Approach for Imbalanced
Learning)
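A minimal sketch of rebalancing before training, assuming the imbalanced-learn package (imblearn) is available; the data and the class ratio below are made up:

```python
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE, RandomOverSampler

# Hypothetical imbalanced data: roughly 2% responders
rng = np.random.default_rng(2)
X = rng.normal(size=(10_000, 5))
y = (rng.random(10_000) < 0.02).astype(int)
print("Before:", Counter(y))

# Random oversampling simply duplicates minority cases;
# SMOTE creates synthetic minority cases by interpolating between neighbours
X_ros, y_ros = RandomOverSampler(random_state=0).fit_resample(X, y)
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)
print("Random oversampling:", Counter(y_ros))
print("SMOTE:", Counter(y_sm))
```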
Slide 28
Under-Sampling
Snapshot | Non-Resp  | Resp  | Total
Sep 08   | 6,204,138 | 2,201 | 6,206,339
Dec 08   | 6,365,785 | 2,762 | 6,283,598
Jun 09   | 6,456,032 | 2,921 | 6,458,953
Slide 29
Selection of Characteristics
• Expected Predictive Power
• Reliability and Robustness
– E.g. income
• Ease in Collection
• Interpretability
– E.g. occupation
• Human Intervention
• Legal Issues Surrounding the Usage of Certain Types of
Information
• Creation of Ratios Based on Business Reasoning
• Future Availability
• Changes in Competitive Environment
Slide 30
Selecting the Independent Variables
Slide 31
Variable Selection Process
• Segregate variables into Numeric & Character (Nominal)
– Nominal variables not part of Selection process
– 8 Nominal variables segregated from analysis data
Slide 33
Weight of Evidence and Information Value
• The Weight of Evidence (WOE) indicates the predictive power of an
independent variable in relation to the dependent variable
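A minimal pandas sketch of the usual calculation, assuming the common credit-scoring convention WOE = ln(% non-events ÷ % events) per bin and IV = Σ (% non-events − % events) × WOE; the data, bin count and column names are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one continuous predictor and a binary event flag
rng = np.random.default_rng(3)
df = pd.DataFrame({"income": rng.normal(50, 15, 5_000)})
df["event"] = rng.binomial(1, 1 / (1 + np.exp(-(-3 + 0.04 * df["income"]))))

df["bin"] = pd.qcut(df["income"], q=5)                     # 5 equal-frequency bins
grp = df.groupby("bin", observed=True)["event"].agg(events="sum", total="count")
grp["non_events"] = grp["total"] - grp["events"]

grp["pct_events"] = grp["events"] / grp["events"].sum()
grp["pct_non_events"] = grp["non_events"] / grp["non_events"].sum()
grp["woe"] = np.log(grp["pct_non_events"] / grp["pct_events"])
grp["iv"] = (grp["pct_non_events"] - grp["pct_events"]) * grp["woe"]

print(grp[["pct_non_events", "pct_events", "woe", "iv"]])
print("Information Value:", grp["iv"].sum())
```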
Slide 34
Weight of Evidence and Information Value
[Table: Range | Bins | Non-Events | Events | % Non-Events | % Events | WOE | IV]
Slide 36
Reject Inference
• Typically used in Credit Risk Models
• Sample selection process –
1. Credit Application
2. Application approved or rejected
3. Approved applications given the loan
4. These accounts are observed: do they remain good or turn bad
5. Scorecard developed on these accounts
→ Applications rejected in step 2 are not considered
Slide 37
Reject Inference is a process whereby the performance of
previously rejected applications is analyzed to estimate their
behavior (i.e., to assign performance class).
Just as there are some bads in the population that is approved,
there will be some goods that have been declined.
This process gives relevance to the scorecard development
process by recreating the population performance for a 100%
approval rate.
Slide 38
Techniques for Reject Inference
• Assign All Rejects to Bads
Not satisfactory, as some of the rejected applications would actually have been good.
Slide 39
Techniques for Reject Inference
• Approve All Applications
Only method to find out the actual (as opposed to inferred)
performance of rejected accounts. It involves approving all
applications for a specific period of time – say 3 months.
This is “buying data”. While perhaps the most scientific and simple approach,
the notion of approving applicants that are known to be very high-risk
can be daunting.
• Augmentation techniques
Slide 40
Converting Probabilities into Scores
• Final Scorecard Production – the probabilities have to be scaled
• Scaling does not affect the predictive strength of the scorecard.
It is an operational decision based on considerations such as:
– Implementability of the scorecard into application processing
software.
– Ease of understanding by staff (e.g., discrete numbers are easier
to work with).
– Continuity with existing scorecards or other scorecards in the
company. This avoids retraining on scorecard usage and
interpretation of scores.
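One widely used scaling convention (not stated on the slides) maps the log odds onto a points scale using a base score and "points to double the odds" (PDO). A minimal sketch, with illustrative assumptions for the base score, base odds and PDO:

```python
import numpy as np

def probability_to_score(p, base_score=600, base_odds=50, pdo=20):
    """Map a predicted probability of being 'bad' onto a scorecard scale.

    Assumptions (illustrative): a score of `base_score` corresponds to
    good:bad odds of `base_odds`:1, and every `pdo` points doubles those odds.
    """
    odds_good = (1 - p) / p                      # good:bad odds
    factor = pdo / np.log(2)
    offset = base_score - factor * np.log(base_odds)
    return offset + factor * np.log(odds_good)

print(probability_to_score(np.array([0.01, 0.02, 0.05, 0.10])))
```

Because the mapping is a monotonic transformation of the predicted odds, it changes only the presentation of the output, not the rank-ordering or predictive strength of the scorecard.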
Slide 41
Converting Probabilities into Scorecard
Scaled Scores
Slide 42
Cautions
• Overfitting
o Adding IVs to a logistic regression model will always increase the
amount of variance explained in the log odds (similar to R²)
o Adding more independent variables to the model can result in
overfitting
o This reduces the generalizability of the model beyond the data on
which the model is fit.
• Empty cells or small cells: Check for empty/ small cells by doing
a crosstab between categorical predictors and outcome variable.
o If a cell has very few cases (a small cell), the model may become
unstable.
Slide 43
ASSESSING THE STRENGTH OF MODEL
Slide 44
Confusion Matrix – Measuring Predictive Accuracy
• Predictive Accuracy – ability to classify observations into correct
outcome group.
o All predictive accuracy measures are based on the cut-off value
selected for classification.
o The final cut-off value selected should be based on comparison of
predictive accuracy measures across cut-off values. While 0.5 is
generally the default cut-off, other values may substantially improve
predictive accuracy.
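A minimal sketch of comparing accuracy measures across cut-off values (scikit-learn's confusion_matrix is one convenient way; the labels and predicted probabilities below are made up):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(4)
y_true = rng.binomial(1, 0.3, size=1_000)
# Hypothetical predicted probabilities, loosely informative about y_true
p_hat = np.clip(0.3 * y_true + rng.normal(0.3, 0.2, size=1_000), 0.01, 0.99)

for cutoff in (0.3, 0.5, 0.7):
    y_pred = (p_hat >= cutoff).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy = (tp + tn) / len(y_true)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"cutoff={cutoff:.1f}  accuracy={accuracy:.3f}  "
          f"sensitivity={sensitivity:.3f}  specificity={specificity:.3f}")
```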
Slide 45
Overall and Outcome – Specific Measures
• Predictive Accuracy of Actual Outcomes
o Sensitivity (aka Recall): true positive rate – the percentage of positive
outcomes correctly predicted = TP / (TP + FN)
Slide 47
ROC Curve (Receiver Operating Characteristic)
• Graphical portrayal. Evaluates the trade-off between the true positive
rate (sensitivity) and the false positive rate (1 – specificity).
• Area under curve (AUC), also known as c-statistic, is a powerful
non-parametric measure.
o Higher the area under curve, better the prediction power of the model.
• Measures the ability of the model to correctly classify true
positives and true negatives. We want our model to predict the
true classes as true and false classes as false.
• We want the true positive rate to be 1. But we are also concerned
about the false positive rate. So, we are not only concerned about
predicting the Y classes as Y but we also want N classes to be
predicted as N.
• We compare it to the random line, at 45°, where the c-statistic is 0.5.
o So the c-statistic must be above 0.5.
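A minimal sketch of obtaining the c-statistic and the curve with scikit-learn, using made-up labels and predictions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(5)
y_true = rng.binomial(1, 0.3, size=1_000)
p_hat = np.clip(0.35 * y_true + rng.normal(0.3, 0.15, size=1_000), 0, 1)

auc = roc_auc_score(y_true, p_hat)           # c-statistic; 0.5 = the random 45° line
fpr, tpr, thresholds = roc_curve(y_true, p_hat)   # points of the ROC curve
print(f"AUC (c-statistic): {auc:.3f}")       # should be well above 0.5 here
```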
Slide 48
ROC Curve
So if two models are compared, the model with the higher c-statistic is better –
i.e. the greater the area under the curve (AUC), the better the model.
Slide 50
Gini Index
• Uses the Lorenz Curve.
• The Gini index is the ratio of the area between a scorecard’s Lorenz
curve and the 45-degree line to the entire triangular area under the
45-degree line.
o Equivalently, Gini = 2 × AUC − 1.
Slide 51
Other Measures
• Gains Chart. Cumulative positive predicted value vs.
distribution of predicted positives (depth)
Slide 52
Is the model good? KS Statistics
[Table: Decile | Good | Bad | Total | % Bad | Cum Good | Cum Bad | % Cum Good | % Cum Bad | KS]
Understanding KS Statistics and other Measures
Slide 54
Explanation of the KS-Table
Decile – The output of the logistic regression model is a probability. Sort it in descending order, then divide the entire sample into 10 equal parts. The deciles are therefore based on PREDICTED probabilities.
# of Non-Resp – Count of non-responders (i.e. dependent variable = 0) in each decile.
# of Resp – Count of responders (i.e. dependent variable = 1) in each decile.
Total Obs in the Decile – Total number of observations in the decile (sum of the prior two columns).
Response Rate – Response rate for each decile: total number of responders in the decile divided by total observations in the decile.
Cum Response Rate – Cumulative response rate for each decile: total number of responders up to and including that decile divided by the total number of observations up to and including that decile.
% Gain – {(Response rate for the decile) − (Overall response rate)} divided by {Overall response rate}.
Lift – Response rate for the decile divided by the overall response rate.
Cumulative Lift – Cumulative response rate for the decile divided by the overall response rate.
Cumulative # of Non-Resp – Cumulative number of non-responders up to and including that decile.
Cumulative # of Resp – Cumulative number of responders up to and including that decile.
% Cumulative Non-Resp – {Cumulative number of non-responders for the decile} divided by {Total number of non-responders}.
% Cumulative Resp – {Cumulative number of responders for the decile} divided by {Total number of responders}.
KS Statistic – {% Cumulative Responders} minus {% Cumulative Non-Responders}.
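The table above can be produced directly from predicted probabilities; a minimal pandas sketch (made-up data, deciles formed on the predicted probability) is:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(6)
df = pd.DataFrame({"y": rng.binomial(1, 0.1, size=10_000)})
df["p_hat"] = np.clip(0.15 * df["y"] + rng.normal(0.1, 0.05, 10_000), 0.001, 0.999)

# Deciles of PREDICTED probability; decile 1 holds the highest probabilities
df["decile"] = pd.qcut(df["p_hat"].rank(method="first", ascending=False),
                       10, labels=list(range(1, 11)))

ks_table = df.groupby("decile", observed=True)["y"].agg(resp="sum", total="count")
ks_table["non_resp"] = ks_table["total"] - ks_table["resp"]
ks_table["pct_cum_resp"] = ks_table["resp"].cumsum() / ks_table["resp"].sum()
ks_table["pct_cum_non_resp"] = ks_table["non_resp"].cumsum() / ks_table["non_resp"].sum()
ks_table["ks"] = ks_table["pct_cum_resp"] - ks_table["pct_cum_non_resp"]

print(ks_table)
print("KS statistic:", ks_table["ks"].max())
```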
Slide 55
Other Measures
• Cost Ratio. Ratio of the cost of misclassifying a bad credit risk as
a good risk (false negative) to the cost of misclassifying a good
risk as a bad (false positive).
– When used to calculate the cutoff, the cost ratio tends to maximize the
sum of the two proportions of correct classification.
– This is done by plotting the cost ratio against sensitivity and
specificity.
– The point where the two curves meet tends to be the point where
both sensitivity and specificity are maximized.
Slide 56
Comparing Different Models – relative measures
AIC (Akaike Information Criterion)
• Measure of relative goodness of fit
• Lower value of AIC suggests "better" model.
AIC = −2 log(L) + 2K, where L is the model likelihood and K is the number of estimated parameters
• But it is a relative measure of model fit. It is used for
model selection, i.e. it lets you compare different
models estimated on the same dataset.
• Suppose a model has an AIC of, say, 2000. This number is
meaningless on its own and says nothing about how well
the model fits. However, if another model with one more
predictor brings the AIC down to 1500, that shows that
model 2 is a better fit to the data than model 1.
Slide 57
Hosmer-Lemeshow Goodness Of Fit Test
Small p-values (significance) are indicative of poor fit. However, a large p-value does not mean the model fits well, since lack of evidence against H0 is not equivalent to evidence in favour of H1. In particular, for small sample sizes, a high p-value from the test may simply be a consequence of the test having low power to detect mis-specification, rather than being indicative of good fit.
Slide 58
Model Estimation Fit and Between Model comparisons
• Maximum Likelihood Estimation:
o Maximizes the likelihood that an event will occur – the event
being a respondent is assigned to one group versus another.
o The basic measure of how well the maximum likelihood estimation
procedure fits is the likelihood value.
• Comparisons of the likelihood values follow three steps:
o Estimate a Null Model – which acts as the “baseline” for making
comparisons of improvement in model fit.
o Estimate Proposed Model – the model containing the independent
variables to be included in the logistic regression.
o Assess –2LL (-2 Log Likelihood) Difference
• Lower -2LL implies better fit of the model
Please note: there is no such thing as a "typical" or correct likelihood for a
model. It is a relative measure used to compare how well two models
fit the data.
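A minimal statsmodels sketch of the three steps (null model, proposed model, −2LL difference), using made-up data; the chi-square test on the difference is the usual likelihood-ratio comparison:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(size=500)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.3 + 0.9 * x))))

null_model = sm.Logit(y, np.ones((len(y), 1))).fit(disp=0)    # intercept only ("baseline")
full_model = sm.Logit(y, sm.add_constant(x)).fit(disp=0)      # with the predictor

neg2ll_null = -2 * null_model.llf
neg2ll_full = -2 * full_model.llf
chi_sq = neg2ll_null - neg2ll_full                 # improvement in fit
p_value = stats.chi2.sf(chi_sq, df=1)              # 1 added coefficient
print(f"-2LL null = {neg2ll_null:.1f}, -2LL full = {neg2ll_full:.1f}, "
      f"chi-square = {chi_sq:.1f}, p = {p_value:.4f}")
```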
Slide 59
Many other measures of model fit
• Pseudo R2 Measures:
o Interpreted in a manner similar to the R2 in multiple regression.
o Use −2LL to measure: a low −2LL and a high R² indicate better fit.
o Different pseudo R2 measures vary widely in terms of magnitude and
no one version has been deemed most preferred.
o For all of the pseudo R2 measures, however, the values tend to be
much lower than for multiple regression models.
o Commonly used measures
• Cox & Snell R2
• Nagelkerke R2
• McFadden’s R2 = 1 − [−2LL(full model) / −2LL(null model)] (from software output)
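A minimal sketch of the three common pseudo-R² measures computed from the log-likelihoods of a null and a fitted model, using the standard formulas (the log-likelihood values and sample size below are hypothetical):

```python
import numpy as np

# Hypothetical log-likelihoods and sample size (illustrative values only)
ll_null, ll_model, n = -250.0, -205.0, 400

mcfadden = 1 - (ll_model / ll_null)                       # = 1 - [-2LL(model) / -2LL(null)]
cox_snell = 1 - np.exp((2 / n) * (ll_null - ll_model))    # 1 - (L_null / L_model)^(2/n)
nagelkerke = cox_snell / (1 - np.exp((2 / n) * ll_null))  # Cox & Snell rescaled to max 1
print(f"McFadden: {mcfadden:.3f}  Cox & Snell: {cox_snell:.3f}  Nagelkerke: {nagelkerke:.3f}")
```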
Slide 60
Casewise Diagnostics
• Two Types of Casewise Diagnostics Similar to Multiple
Regression
o Residuals
✓ Both residuals (Pearson and deviance) reflect standardized
differences between predicted probabilities and outcome value (0
and 1). Values above ± 2 merit further attention.
o Influential Observations
✓ Influence measures reflect impact on model fit and estimated
coefficients if an observation is deleted from the analysis.
✓ Comparable to those measures found in multiple regression.
Slide 61
Caution
• Reporting the R2
o Numerous pseudo-R2 values have been developed
• Should be interpreted with extreme caution as they have many
computational issues which cause them to be artificially high or low.
o Goodness of fit tests
• The Hosmer-Lemeshow test assesses goodness of fit using a Chi-square test. An
insignificant result is better. However, the test has issues.
Slide 62
UNDERSTANDING THE OUTPUT
Slide 63
Example
• Explaining Brand Loyalty
Slide 64
Example
• A researcher is interested in how variables such as GRE (Graduate
Record Exam scores), GPA (grade point average) and prestige of the
undergraduate institution affect admission into graduate school.
• The response variable, admit/don’t admit, is a binary variable.
Slide 65
Understanding the Output (SPSS)
• The first model in the output is a null model, that is, a model with
no predictors.
• The constant in the table labelled Variables in the Equation gives the
unconditional log odds of admission (i.e., admit=1).
• The table labelled Variables not in the Equation gives the results of a
score test. The column labelled Score gives the estimated change in
model fit if the term is added to the model, the other two columns
give the degrees of freedom, and p-value (labelled Sig.) for the
estimated change. Based on the table above, all three of the
predictors, gre, gpa, and rank, are expected to improve the fit of the
model.
Slide 66
Gives the overall test for the model that
includes the predictors. Chi-square value
of 41.459 with a p-value of less than
0.0005 ➔ implies that the model as a
whole fits significantly better than an
empty (or null) model (i.e., a model with
no predictors).
Cox & Snell and Nagelkerke R2 give the improvement from the null model to the
fitted model, as compared to –
- R2 as explained variance
- R2 as square of correlation
-2 Log likelihood measures how poorly the model predicts the decisions: the smaller the statistic, the better the model.
The stats show that the model is poor.
Slide 67
Coefficients (aka log odds) – here, the coefficient for the constant.
S.E. – standard error of the coefficient.
Wald chi-square – tests the H0 that the constant equals 0. Here this
hypothesis is rejected because the p-value ("Sig.") is smaller than the
critical p-value of .05 (or .01).
df – degrees of freedom for the Wald test.
Slide 69
Example: Interpretation of a coefficient
p / (1 − p) = odds
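For instance, with a hypothetical coefficient (not one taken from the slides): if the fitted coefficient for gpa were b = 0.804, then exp(b) ≈ 2.23, so a one-unit increase in GPA multiplies the odds of admission, p/(1 − p), by about 2.23, holding the other predictors constant.

```python
import numpy as np

b_gpa = 0.804            # hypothetical fitted coefficient (log-odds scale)
print(np.exp(b_gpa))     # ≈ 2.23: odds of admission multiply by ~2.23 per unit of GPA
```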
Slide 70
Accuracy of the Classification
• The results of our logistic regression can be used to classify subjects.
• Before we can use this information to classify subjects, we need to
have a decision rule.
• Our decision rule will be:
o If the probability of the event is >= some threshold, we shall predict that the
event will take place.
o By default, SPSS sets this threshold to 0.5.
o However, in many cases we may want to set it higher or lower than 0.5
Slide 71
Slide 72