
Lecture 16: Logistic regression diagnostics, splines and interactions

Sandy Eckel
seckel@jhsph.edu

19 May 2007

Logistic Regression Diagnostics
Graphs to check assumptions

Recall: graphing was used to check the assumptions of linear regression
Graphing binary outcomes for logistic regression is not as straightforward as graphing a continuous outcome for linear regression
Several methods have been developed to visualize the logistic regression model for use in checking the assumptions:
Tables
Graphs with lowess curves
Nepali breastfeeding study
Example: data
Breastfeeding tends to be protective for numerous
infant health risks
A study was conducted in Nepal to evaluate the odds
of breastfeeding using a number of possible factors

Outcome: breastfeeding (1=yes, 0=no)

Primary predictor: baby’s gender (1=F, 0=M)
Secondary predictors:
Child’s age (0 to 76 months)
Mother’s age (17 to 52)
Number of children (parity) (1 to 14)

How to look at the data? Binary Y and
Binary (or categorical) X

Breastfeeding vs. baby’s gender

both binary
make a table!

This method would work for any binary or categorical predictor
How to look at the data? Binary Y and
Continuous X

Breastfeeding vs. child’s age

Breastfeeding is binary
Child’s age is continuous
Could make child’s age categorical or binary by
breaking it at the quartiles
defining groups by years, e.g. <1 year, 1 year, 2-3 years, 4+ years
then use tables
Or, we could graph the relationship
How to look at the data? Binary Y and
Continuous X
A scatter plot
[Figure: scatter plot of actual breastfeeding (breast fed, 0 to 1) versus age of child (months, 0 to 80)]
This isn’t very informative… how can we fix this?
How to look at the data? Binary Y and
Continuous X
Allow a smoothed relationship
The “lowess” command draws a smoothed graph
It’s as if a window is pulled across the graph
at each position, the probability of a 1 within the window is graphed
as the window moves, the probability of a 1 is traced out as a line
changing the width of the window yields different levels of smoothing
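The sliding-window idea can be sketched in a few lines of Python. This is a simplified running proportion, not the lowess algorithm itself (lowess fits locally weighted regressions inside each window); the function name `window_proportion`, the `half_width` parameter, and the toy data are assumptions made for illustration.

```python
def window_proportion(x, y, half_width):
    """For each x value, return the share of y == 1 among points within half_width of it."""
    smoothed = []
    for center in x:
        in_window = [yi for xi, yi in zip(x, y) if abs(xi - center) <= half_width]
        smoothed.append(sum(in_window) / len(in_window))
    return smoothed

# Hypothetical toy data: child's age in months and breastfeeding status (1 = yes)
ages = [2, 5, 9, 14, 20, 26, 33, 41, 50, 62]
breastfed = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

narrow = window_proportion(ages, breastfed, half_width=8)   # little smoothing
wide = window_proportion(ages, breastfed, half_width=25)    # heavier smoothing
```

A wider window averages over more children, so the curve is smoother but responds more slowly to real changes in the probability.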
How to look at the data? Binary Y and
Continuous X
A scatter plot with lowess curve
[Figure: lowess smoother of the probability of breastfeeding (breast fed, 0 to 1) versus age of child (months, 0 to 80); bandwidth = .9]
Much more informative! Now we can talk about how the probability of breastfeeding changes with child’s age
- We want this to look like a nice ‘logistic’ curve
Checking form of the model
Lowess allows us to visualize how the probability of
our outcome varies by a certain predictor
We really want to graph log[p/(1-p)], because that function is assumed to be linear in the predictor in logistic regression
Get the lowess smooth of the probability, then transform the smoothed probability to the log odds scale
Plot the ‘smoothed’ log odds versus the continuous covariate of interest
This relationship should look linear
By looking at lowess plots within key subgroups, we
can detect whether the relationship varies across
covariates
Looking at these plots helps us decide if interactions
or splines are needed in the model
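As a minimal sketch of the transformation step, the snippet below converts smoothed probabilities to log odds. The clipping constant `eps` and the example probabilities are assumptions for illustration; clipping is one simple way to keep the log finite when a smoothed probability hits exactly 0 or 1.

```python
import math

def logit(p, eps=1e-6):
    """Log odds of a probability, clipped away from 0 and 1 so the log stays finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

# Hypothetical smoothed probabilities, e.g. values read off a lowess curve
smoothed_p = [0.90, 0.75, 0.50, 0.25, 0.10]
smoothed_log_odds = [logit(p) for p in smoothed_p]
```

Plotting `smoothed_log_odds` against the covariate and checking for a straight line is the visual check described above.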

Assumptions of logistic regression

Two assumptions:
L – the model fits the data
I – the observations are all independent

Independence still cannot be assessed graphically; we must know how the data were collected
How can we assess our model?
L – the model fits the data
3 methods for assessing model fit:
1. “Look” at the data
Binary or categorical predictors: tables
Do you see a need for interaction?
Continuous predictors: lowess curves
Do you see a need for interaction or splines?
2. Graph observed probability vs. the predicted probability
3. Use the χ² Test of Goodness of Fit to assess the predicted probabilities
Assess model fit: Method 2
Graphing observed vs. predicted probabilities
Run the model
Save the predicted probability of breastfeeding for each child
Plot observed vs. predicted probabilities
If the relationship is close to a straight line
the predicted and observed probabilities are almost the same
the model fits the data very well
If not, try to add more X’s, splines or interactions
[Figure: lowess smoother of observed breastfeeding (breast fed) versus predicted Pr(bf); bandwidth = .8]
Assess model fit: Method 3
χ² Test of Goodness of Fit

Run the model

χ² Test of Goodness of Fit
Breaks the data into groups of equal size
Compares observed and predicted numbers of observations in each group with a χ² test
(also called the Hosmer-Lemeshow χ² Test)

H0: the model fits the observed data well

We want p>0.05 so we don’t reject H0
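The test can be sketched as follows. This is an illustration under assumptions, not the lecture's software output: observations are ordered by predicted probability before grouping and the statistic is referred to a χ² distribution with (groups − 2) degrees of freedom, which are the usual Hosmer-Lemeshow conventions rather than details stated on the slide. The function names and toy data are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (k + 0.5)
    return total

def hosmer_lemeshow(y, p_hat, groups=10):
    """Return (statistic, df, p-value) comparing observed and expected counts of 1s by group."""
    pairs = sorted(zip(p_hat, y))          # order by predicted probability
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    df = groups - 2
    return stat, df, chi2_sf(stat, df)

# Hypothetical data: four blocks of 10 children with predicted probabilities 0.2 to 0.8
p_hat = [0.2] * 10 + [0.4] * 10 + [0.6] * 10 + [0.8] * 10
y_good = [1] * 2 + [0] * 8 + [1] * 4 + [0] * 6 + [1] * 6 + [0] * 4 + [1] * 8 + [0] * 2
y_bad = [1] * 8 + [0] * 2 + [1] * 6 + [0] * 4 + [1] * 4 + [0] * 6 + [1] * 2 + [0] * 8

stat_good, df_good, p_good = hosmer_lemeshow(y_good, p_hat, groups=4)
stat_bad, df_bad, p_bad = hosmer_lemeshow(y_bad, p_hat, groups=4)
```

When observed and predicted counts agree, the statistic is near 0 and the p-value is large, so H0 is not rejected; the mismatched outcome vector produces a small p-value.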

Method 3:
χ² Test of Goodness of Fit

p = 0.20 > α = 0.05
Fail to reject H0; conclude that the model fits the data reasonably well

Conclusion matches the other methods
method 1: scatter plots showed the same relationship as the model
method 2: the observed and predicted probabilities matched (straight line)
method 3: the observed and predicted data matched (p>0.05)
Summary: logistic regression model diagnostics

There are no easy graphs for looking at binary outcome data
use lowess
split according to binary/categorical covariates to see how the relationship between outcome and primary predictor varies
Assessing model fit: 3 methods
look at tables and graphs
compare graph of observed vs. predicted p
χ² Test of Goodness of Fit: want large p-value
How do we add
Flexibility in logistic regression?

Same methods as in linear regression!

Splines
are used to allow the “line” to bend

Interaction
is used to allow different effects (difference
in log odds ratio) for different groups

Example: Back to breastfeeding example

Outcome: breastfeeding (1=yes, 0=no)

Primary predictor: gender (1=F, 0=M)

Secondary predictors:
Child’s age (0 to 76 months)
Mother’s age (17 to 52) – need to center
Number of children (parity) (1 to 14) – need to center
Model A: gender

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) \;\Rightarrow\; \log\left(\frac{p}{1-p}\right) = -0.37 + 0.04(\text{Gender})$$

Logit estimates Number of obs = 472

LR chi2(1) = 0.04
Prob > chi2 = 0.8352
Log likelihood = -319.98468 Pseudo R2 = 0.0001

baby’s gender (1=F, 0=M)

Model B:
gender and mother's age
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Agemom} - 25)$$
$$\Rightarrow \log\left(\frac{p}{1-p}\right) = -0.16 + 0.06(\text{Gender}) - 0.06(\text{Agemom} - 25)$$

Logit estimates Number of obs = 472

LR chi2(2) = 16.50
Prob > chi2 = 0.0003
Log likelihood = -311.75482 Pseudo R2 = 0.0258

------------------------------------------------------------------------------
bf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gender | .0620916 .1907094 0.33 0.745 -.311692 .4358751
age_momc | -.0615396 .0156442 -3.93 0.000 -.0922016 -.0308776
(Intercept) | -.1573215 .13957 -1.13 0.260 -.4308736 .1162307
------------------------------------------------------------------------------

baby’s gender (1=F, 0=M)

Possible modification – add a spline

A plot of the log odds of the lowess smooth of breastfeeding versus mother’s age reveals
There may be a bend in the line at approximately mother’s age = 25
We’ll add a spline for mother’s age > 25
[Figure: logit-transformed lowess smooth of breastfeeding (breast fed) versus age of mother (years), in separate panels for boys and girls; bandwidth = .9]
Possible modification – add a spline

For mother’s age > 25
we center mother’s age at 25 also, for convenience
The spline is a new variable:
(agemom – 25)+
= 0 if age ≤ 25
= (agemom – 25) if age > 25
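Constructing the spline variable takes one line of code. The sketch below uses the variable names from the regression output (`age_momc`, `age_mom_sp`); the helper name `positive_part` and the example ages are assumptions for illustration.

```python
def positive_part(value, knot):
    """Linear spline term (value - knot)+ : zero up to the knot, value - knot beyond it."""
    return max(value - knot, 0)

mother_ages = [17, 22, 25, 30, 52]                         # hypothetical example ages
age_momc = [a - 25 for a in mother_ages]                   # mother's age centered at 25
age_mom_sp = [positive_part(a, 25) for a in mother_ages]   # spline term (agemom - 25)+
```

Below the knot the spline term contributes nothing, so its coefficient only changes the slope for mothers older than 25.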

Model C:
gender and mother's age with spline
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Agemom} - 25) + \beta_3(\text{Agemom} - 25)^+$$
$$\Rightarrow \log\left(\frac{p}{1-p}\right) = -0.55 + 0.08(\text{Gender}) - 0.25(\text{Agemom} - 25) + 0.23(\text{Agemom} - 25)^+$$

Logit estimates Number of obs = 472

LR chi2(3) = 26.49
Prob > chi2 = 0.0000
Log likelihood = -306.76341 Pseudo R2 = 0.0414

------------------------------------------------------------------------------
bf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gender | .0821887 .1928521 0.43 0.670 -.2957946 .4601719
age_momc | -.2467804 .0627557 -3.93 0.000 -.3697794 -.1237814
age_mom_sp | .2306511 .074613 3.09 0.002 .0844122 .3768899
(Intercept) | -.5487527 .1888302 -2.91 0.004 -.9188531 -.1786522
------------------------------------------------------------------------------

baby’s gender (1=F, 0=M)

Understanding the equation
Write separate equations by age group

log(odds) = -0.55 + 0.08(Gender) – 0.25(Age-25) + 0.23(Age-25)+

For those with mothers under 25
-0.55 + 0.08(Gender) – 0.25(Age-25)

For those with mothers over 25
-0.55 + 0.08(Gender) – 0.25(Age-25) + 0.23(Age-25)
= -0.55 + 0.08(Gender) + (-0.25+0.23)(Age-25)
= -0.55 + 0.08(Gender) – 0.02(Age-25)
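The two slopes can be checked numerically from the Model C coefficient table. This is a small sketch; the dictionary and function names are illustrative assumptions, while the coefficient values are copied from the output above.

```python
# Coefficients copied from the Model C output
MODEL_C = {"intercept": -0.5487527, "gender": 0.0821887,
           "age_momc": -0.2467804, "age_mom_sp": 0.2306511}

def log_odds_model_c(gender, mother_age):
    """Log odds of breastfeeding under Model C (gender: 1 = girl, 0 = boy)."""
    centered = mother_age - 25
    spline = max(centered, 0)
    return (MODEL_C["intercept"] + MODEL_C["gender"] * gender
            + MODEL_C["age_momc"] * centered + MODEL_C["age_mom_sp"] * spline)

# Change in log odds per extra year of mother's age, on each side of the knot
slope_under_25 = log_odds_model_c(0, 21) - log_odds_model_c(0, 20)
slope_over_25 = log_odds_model_c(0, 31) - log_odds_model_c(0, 30)
```

The slope below 25 equals the `age_momc` coefficient alone, and the slope above 25 is the sum of the two age coefficients.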

Model C: Interpretation
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Agemom} - 25) + \beta_3(\text{Agemom} - 25)^+$$

baby’s gender (1=F, 0=M)
β0: The log odds of breastfeeding for boys with 25-year-old mothers is -0.55
β1: Adjusting for mother’s age, the log odds ratio of breastfeeding for girls vs. boys is 0.08
β2: Adjusting for gender, the log odds ratio of breastfeeding corresponding to a one year difference in mother’s age for mothers under 25 years is -0.25
Model C: Interpretation
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Agemom} - 25) + \beta_3(\text{Agemom} - 25)^+$$
β2+β3: Adjusting for gender, the log odds ratio of breastfeeding corresponding to a one year difference in mother’s age for mothers over 25 years is -0.25 + 0.23 = -0.02
β3: Adjusting for gender, the difference in the log odds ratio of breastfeeding corresponding to a one year difference in mother’s age for mothers over 25 years compared with mothers under 25 years is 0.23
Tough both to put in words and to understand; it can be easier to understand mathematically!
Model C: Is the difference in the log odds ratio for mother’s age statistically significant?
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Agemom} - 25) + \beta_3(\text{Agemom} - 25)^+$$

H0: β3 = 0 in the population

i.e., the change in slope is 0, and the line does
not bend in the population
One variable added: use the Wald test
Z=3.09, p=0.002, CI for β3 = (0.08, 0.38)
Reject H0
Conclude that Model C is better than Model B
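The Wald statistic, p-value and confidence interval quoted above can be reproduced from the coefficient and standard error of `age_mom_sp` in the Model C output. A minimal sketch, assuming a two-sided test against the standard normal; the function name `wald_test` is illustrative.

```python
import math

def wald_test(coef, std_err, z_crit=1.959964):
    """Return the Wald z statistic, two-sided normal p-value, and 95% CI for a coefficient."""
    z = coef / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    ci = (coef - z_crit * std_err, coef + z_crit * std_err)
    return z, p_value, ci

# Coefficient and standard error for age_mom_sp, copied from the Model C output
z, p_value, ci = wald_test(0.2306511, 0.074613)
```

The interval excludes 0, which matches the decision to reject H0.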

Breastfeeding example conclusion
For boys and girls with mothers under 25 years of age, the odds that the mother will breastfeed the child decrease by a factor of exp(β2) = exp(-0.25) = 0.78 for each additional year of mother’s age (95% CI: 0.69, 0.88)

This relationship is significantly different for boys and girls with mothers over 25 years of age: for these children, the odds that the mother will breastfeed the child are approximately the same for each year of mother’s age; the odds decrease by a factor of only exp(β2+β3) = 0.98 for each additional year of mother’s age (95% CI: 0.95, 1.02)
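The odds ratios follow from exponentiating the Model C coefficients and confidence limits. The sketch below uses values copied from the output; the variable names are illustrative. The interval for exp(β2+β3) is not recomputed here, because it depends on the covariance between the two estimates, which the output shown does not include.

```python
import math

# Values copied from the Model C output (age_momc row and age_mom_sp coefficient)
beta2, beta2_low, beta2_high = -0.2467804, -0.3697794, -0.1237814
beta3 = 0.2306511

or_under_25 = math.exp(beta2)                                  # per year, mothers under 25
or_ci_under_25 = (math.exp(beta2_low), math.exp(beta2_high))   # exponentiated CI limits
or_over_25 = math.exp(beta2 + beta3)                           # per year, mothers over 25
```

Exponentiating the endpoints of a coefficient's interval gives the interval for the corresponding odds ratio.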

Model D: gender and number of children (parity)
Logit estimates Number of obs = 472
LR chi2(2) = 9.99
Prob > chi2 = 0.0068
Log likelihood = -315.01027 Pseudo R2 = 0.0156

------------------------------------------------------------------------------
bf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gender | .0622939 .1894771 0.33 0.742 -.3090744 .4336622
parityc | -.1180777 .0384221 -3.07 0.002 -.1933837 -.0427718
(Intercept) | -.8009664 .1937284 -4.13 0.000 -1.180667 -.4212659
------------------------------------------------------------------------------
Sketch of Model D
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Parity} - 8)$$
baby’s gender (1=F, 0=M)
[Figure: sketch of the log odds of breastfeeding versus parity, with parity = 8 marked]
Assessing the relationship in the data
The relationship between logit(bf) and parity is very different for boys and girls
Mothers of more children tend to
breastfeed boys more
breastfeed girls less
The relationship is about the same for boys and girls whose mothers have about 8 or fewer kids
Could add a spline and an interaction term for only parity > 8 so that the slopes only differ then
First we’ll just add a spline
[Figure: logit-transformed lowess smooth of breastfeeding (breast fed) versus # of kids mother had born alive, in separate panels for boys and girls; bandwidth = .9]
Model E: gender, parity,
and parity spline
Logit estimates Number of obs = 472
LR chi2(3) = 14.18
Prob > chi2 = 0.0027
Log likelihood = -312.91444 Pseudo R2 = 0.0222

------------------------------------------------------------------------------
bf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gender | .0666432 .1903717 0.35 0.726 -.3064785 .439765
parityc | -.1718923 .0465719 -3.69 0.000 -.2631716 -.080613
parity_sp | .3281222 .1562619 2.10 0.036 .0218545 .6343899
(Intercept) | -1.045415 .2291123 -4.56 0.000 -1.494466 -.5963627
------------------------------------------------------------------------------
Sketch of Model E
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Parity} - 8) + \beta_3(\text{Parity} - 8)^+$$
baby’s gender (1=F, 0=M)
[Figure: sketch of the log odds of breastfeeding versus parity, with parity = 8 marked]
Understanding the equation
Write separate equations by parity group

log(odds) = -1.05 + 0.07(Gender) – 0.17(Parity-8) + 0.33(Parity-8)+

For those with mothers with fewer than 8 children
-1.05 + 0.07(Gender) – 0.17(Parity-8)

For those with mothers with at least 8 children
-1.05 + 0.07(Gender) – 0.17(Parity-8) + 0.33(Parity-8)
= -1.05 + 0.07(Gender) + (-0.17+0.33)(Parity-8)
= -1.05 + 0.07(Gender) + 0.16(Parity-8)
Problem with the parity spline

Model E forces the “slope” to be the same for boys and girls
The lowess curve suggests the slope should differ for boys and girls whose mothers had more than around 8 children
Add an interaction term between the spline and gender
that allows the slope to differ by gender only for those whose mothers have 8 or more children
The new variable

baby’s gender (1=F, 0=M)
Gender = 0 for boys
(Parity – 8)+ = 0 for children of low parity families
(Gender)x(Parity – 8)+
= 0 for boys
= 0 for parity < 8
= (Parity – 8) for girls with parity >= 8
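The three parity-related variables can be built together, as in this short sketch. The names `parityc` and `parity_sp` follow the regression output; the function name `parity_terms` and the example inputs are assumptions for illustration.

```python
def parity_terms(gender, parity, knot=8):
    """Return (parityc, parity_sp, interaction) for one child; gender: 1 = girl, 0 = boy."""
    parityc = parity - knot              # parity centered at 8
    parity_sp = max(parityc, 0)          # spline term (Parity - 8)+
    interaction = gender * parity_sp     # (Gender) x (Parity - 8)+
    return parityc, parity_sp, interaction

boy_high = parity_terms(gender=0, parity=12)
girl_high = parity_terms(gender=1, parity=12)
girl_low = parity_terms(gender=1, parity=3)
```

Only girls from families with more than 8 children get a nonzero interaction value, which is what lets the slope differ by gender in that range alone.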

Model F:
spline + interaction with spline
Logit estimates Number of obs = 472
LR chi2(4) = 21.75
Prob > chi2 = 0.0002
Log likelihood = -309.12925 Pseudo R2 = 0.0340

------------------------------------------------------------------------------
bf | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
gender | .1806766 .1953877 0.92 0.355 -.2022763 .5636294
parityc | -.1737844 .0473172 -3.67 0.000 -.2665244 -.0810445
parity_sp | .734593 .2786475 2.64 0.008 .1884539 1.280732
parity_sp_~r | -.8665087 .3966433 -2.18 0.029 -1.643915 -.0891021
(Intercept) | -1.106983 .2343301 -4.72 0.000 -1.566261 -.647704
------------------------------------------------------------------------------

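Sketch of Model F
$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Parity} - 8) + \beta_3(\text{Parity} - 8)^+ + \beta_4\,\text{Gender} \times (\text{Parity} - 8)^+$$
baby’s gender (1=F, 0=M)
[Figure: sketch of the log odds of breastfeeding versus parity, with parity = 8 marked]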
Understanding the equation
Write separate equations by parity and gender
baby’s gender (1=F, 0=M)
log(odds) = -1.11 + 0.18(Gender) – 0.17(Parity-8) + 0.73(Parity-8)+ – 0.87(Gender)x(Parity-8)+

For those with mothers with fewer than 8 children
-1.11 + 0.18(Gender) – 0.17(Parity-8)

For boys (Gender = 0) with mothers with at least 8 children
-1.11 – 0.17(Parity-8) + 0.73(Parity-8)
= -1.11 + (-0.17+0.73)(Parity-8)

For girls (Gender = 1) with mothers with at least 8 children
-1.11 + 0.18 – 0.17(Parity-8) + 0.73(Parity-8) – 0.87(Parity-8)
= (-1.11 + 0.18) + (-0.17 + 0.73 - 0.87)(Parity-8)
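To see what these equations imply for individual children, the fitted log odds can be converted to a probability. In this sketch the coefficients are copied from the Model F output; the dictionary key `gender_x_parity_sp` is a hypothetical label for the interaction term (its name is truncated in the output), and the parity value of 10 is an arbitrary example.

```python
import math

# Coefficients copied from the Model F output; the interaction key is a made-up label
MODEL_F = {"intercept": -1.106983, "gender": 0.1806766, "parityc": -0.1737844,
           "parity_sp": 0.734593, "gender_x_parity_sp": -0.8665087}

def log_odds_model_f(gender, parity):
    """Log odds of breastfeeding under Model F (gender: 1 = girl, 0 = boy)."""
    parityc = parity - 8
    parity_sp = max(parityc, 0)
    return (MODEL_F["intercept"] + MODEL_F["gender"] * gender
            + MODEL_F["parityc"] * parityc + MODEL_F["parity_sp"] * parity_sp
            + MODEL_F["gender_x_parity_sp"] * gender * parity_sp)

def probability(log_odds):
    """Inverse logit: convert log odds to a probability."""
    return 1 / (1 + math.exp(-log_odds))

boy_10 = probability(log_odds_model_f(0, 10))    # boy whose mother has 10 children
girl_10 = probability(log_odds_model_f(1, 10))   # girl whose mother has 10 children
```

Under these coefficients, the predicted probability for the boy is noticeably higher than for the girl at the same parity, reflecting the opposite slopes above parity 8.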

Interpretation – Model F

exp(β0): The odds of breastfeeding for boys of mothers with 8 children is exp(-1.11) = 0.33
exp(β1): Adjusting for mother’s parity, the odds ratio of breastfeeding for girls vs. boys is 1.20 for children of mothers with fewer than 8 children
exp(β2): Adjusting for gender, the odds ratio of breastfeeding corresponding to a one child difference in parity for mothers with fewer than 8 children is 0.84
Interpretation – Model F

exp(β2+β3): Among boys, the odds ratio of breastfeeding corresponding to a one child difference in parity for mothers with at least 8 children is 1.75
exp(β2+β3+β4): Among girls, the odds ratio of breastfeeding corresponding to a one child difference in parity for mothers with at least 8 children is 0.74
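These per-child odds ratios are sums of Model F coefficients, exponentiated. A brief sketch with the coefficients copied from the output (the variable names are illustrative):

```python
import math

# Model F coefficients: parityc (b2), parity_sp (b3), gender-by-spline interaction (b4)
b2, b3, b4 = -0.1737844, 0.734593, -0.8665087

or_low_parity = math.exp(b2)            # fewer than 8 children, boys and girls
or_boys_high = math.exp(b2 + b3)        # boys, at least 8 children
or_girls_high = math.exp(b2 + b3 + b4)  # girls, at least 8 children
```

Adding coefficients on the log odds scale corresponds to multiplying odds ratios, which is why the sums are formed before exponentiating.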
Interpretation – Model F
Complicated to interpret the components on their own – read on your own if you want!

exp(β3): Among boys, the odds ratio of breastfeeding corresponding to a one child difference in parity is 2.08 times as high for those whose mothers have at least 8 children as for those whose mothers have fewer than 8 children
exp(β3+β4): Among girls, the odds ratio of breastfeeding corresponding to a one child difference in parity is exp(0.7346 – 0.8665) = 0.88 times as high for those whose mothers have at least 8 children as for those whose mothers have fewer than 8 children
exp(β4): Among children whose mothers have at least 8 children, the odds ratio of breastfeeding corresponding to a one child difference in parity for girls is 0.42 times that for boys

Is the difference in the log odds ratio for parity by gender statistically significant?

H0: β4 = 0 in the population
i.e. the change in slope for parity > 8 is the same for boys and girls in the population
One variable added: use the Wald test
Z=-2.18, p=0.029, CI for exp(β4) = (0.19, 0.91)
Reject H0
Conclude that Model F is better than Model E

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1(\text{Gender}) + \beta_2(\text{Parity} - 8) + \beta_3(\text{Parity} - 8)^+ + \beta_4\,\text{Gender} \times (\text{Parity} - 8)^+$$
Conclusion – Model F
For children whose mothers have fewer than 8 children, the odds that the mother will breastfeed the child are about the same for boys and girls and decrease by a factor of exp(β2)=0.84 for each additional child the mother has (95% CI: 0.77, 0.92).
This relationship is significantly different for both boys and girls whose mothers have more than 8 children:
For boys whose mothers have more than 8 children, the odds that the mother will breastfeed increase by a factor of exp{β2+β3}=1.75 for each additional child the mother has (95% CI: 1.05, 2.93).
For girls whose mothers have more than 8 children, the odds that the mother will breastfeed decrease by a factor of exp{β2+β3+β4}=0.74 for each additional child the mother has (95% CI: 0.40, 1.37).
Comparing the models
Odds ratio by model (blank = variable not in the model)
Variables             A      B      C      D      E      F
Reference*            0.69   0.85   0.58   0.45   0.35   0.33
Gender                1.04   1.06   1.09   1.06   1.07   1.20
Age-25                       0.94   0.78
(Age-25)+                           1.26
Parity – 8                                 0.89   0.84   0.84
(Parity-8)+                                       1.39   2.08
(Gender)x(Parity-8)+                                     0.42
Deviance              640.0  623.5  613.5  630.0  625.8  618.3
*The table value for the reference group is the odds, not the odds ratio
Comparing the models

Model A is nested in both Model C and Model F
Models C and F cannot be directly compared to one another, but we can see which has a smaller p-value when compared to Model A
C vs. A: χ² = 26.5 with 2 df
F vs. A: χ² = 21.7 with 3 df
Both p-values are very small (<.0001), but the p-value for Model C is slightly smaller
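Both comparisons are likelihood ratio tests on the difference in deviance. The sketch below takes the deviances from the comparison table and uses the closed-form upper-tail probabilities of the χ² distribution for 2 and 3 degrees of freedom; the function and variable names are illustrative assumptions.

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-x / 2)

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Deviances copied from the model comparison table
deviance = {"A": 640.0, "C": 613.5, "F": 618.3}

lr_c = deviance["A"] - deviance["C"]   # C adds 2 variables to A
lr_f = deviance["A"] - deviance["F"]   # F adds 3 variables to A
p_c = chi2_sf_2df(lr_c)
p_f = chi2_sf_3df(lr_f)
```

The drop in deviance is larger for Model C even though it adds fewer variables, which is why its p-value is the smaller of the two.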
What next?

Model C improves prediction beyond gender alone (Model A) more than Model F.
Model C should be the next parent model, and we should test the new variables in Model F to see if they continue to improve prediction within the context of Model C
When a tentative final model is identified, the assumptions of logistic regression should be checked
Summary of lecture 16
Logistic regression assumptions
L – the model fits the data
I – the observations are all independent
Logistic regression diagnostics
“Look” at the data: tables or logits of lowess curves
Graph observed probability vs. the predicted probability
Use the χ² Test of Goodness of Fit to assess the predicted probabilities
Splines and interactions add flexibility to the model
When comparing nested models, a table of
the coefficients and their CI’s, or
the odds ratios and their CI’s
helps the reader quickly compare models
Two models not nested in one another cannot be directly compared
One can identify a new parent model by comparing statistical significance
