
Sage Research Methods

Applied Logistic Regression Analysis

For the best reading experience, we recommend using our website:
https://methods.sagepub.com/book/mono/applied-logistic-regression-analysis/toc

Author: Scott Menard


Pub. Date: 2011
Product: Sage Research Methods
DOI: https://doi.org/10.4135/9781412983433
Methods: Logistic regression, Dependent variables, Independent variables
Keywords: cannabis, errors, equations, estimates, scale, parameters

Disciplines: Anthropology, Business and Management, Criminology and Criminal Justice, Communication
and Media Studies, Counseling and Psychotherapy, Economics, Education, Geography, Health, History,
Marketing, Nursing, Political Science and International Relations, Psychology, Social Policy and Public Policy,
Social Work, Sociology, Science, Technology, Computer Science, Engineering, Mathematics, Medicine
Publisher: SAGE Publications, Inc.
City: Thousand Oaks
Online ISBN: 9781412983433

© 2011 SAGE Publications, Inc. All Rights Reserved.


Front Matter

• Copyright
• Series Editor's Introduction
• Author's Introduction to the Second Edition

Chapters

• Linear Regression and the Logistic Regression Model
• Summary Statistics for Evaluating the Logistic Regression Model
• Interpreting the Logistic Regression Coefficients
• An Introduction to Logistic Regression Diagnostics
• Polytomous Logistic Regression and Alternatives to Logistic Regression

Back Matter

• Notes
• Appendix: Probabilities
• References
• About the Author


Copyright


Series Editor's Introduction

The linear regression model provides a powerful device for organizing data analysis. Researchers focus on the explanation of a dependent variable, Y, as a function of multiple independent variables, from X1 to Xk. Models are specified, variables are measured, and equations are estimated with ordinary least squares (OLS). All goes well if the classical linear regression assumptions are met. However, several assumptions are likely to be unmet if the dependent variable has only two or three response categories. In particular, with a dichotomous dependent variable, assumptions of homoskedasticity, linearity, and normality are violated, and OLS estimates are inefficient at best. The maximum likelihood estimation of a logistic regression overcomes this inefficiency, transforming Y (1, 0) into a logit (the log of the odds of falling into the “1” category).

Professor Menard fully explicates the estimation, interpretation, and diagnostics of such logistic regression models. The pedagogical portmanteau of his work is the parallel he continually draws to the linear regression model. The logistic counterparts to the OLS statistics—the R², the standard error of estimate, the t ratio, and the slope—are systematically presented. These parallels allow the analyst to step naturally from familiar terrain to new turf. Traditional regression diagnostics as well—the Studentized residual, leverage, dbeta—are included in an innovative logistic “protocol” for diagnostics. The last chapter dissects the problem of a polytomous dependent variable, with multiple ordered or unordered categories.

The discussion of the various computer packages is up-to-the-minute and discriminating. For example, he notes that in SPSS 10 the NOMREG routine is good for nominal dependent variables, whereas for ordered dependent variables, SAS LOGISTIC is preferred. Attention to current computer software is part of the changes made from the first edition of this monograph. Other changes include a comprehensive evaluation of the many different goodness-of-fit measures. Dr. Menard makes a convincing case for the use of R²L, at least if the goal is direct comparison to the OLS R². He also adds new material on grouped data, predictive efficiency, and risk ratios.

The Quantitative Applications in the Social Sciences series has published many papers on classical linear regression (see Lewis-Beck, Applied Regression, Vol. 22; Achen, Interpreting and Using Regression, Vol. 29; Berry and Feldman, Multiple Regression in Practice, Vol. 50; Schroeder, Sjoquist, and Stephan, Understanding Regression Analysis, Vol. 57; Fox, Regression Diagnostics, Vol. 79; Berry, Understanding Regression Assumptions, Vol. 92; and Hardy, Regression With Dummy Variables, Vol. 93). The voluminous output is justified by the dominance of the linear regression paradigm. In observational research work, this paradigm increasingly is bumping up against the reality of dependent variables that are less than continuous, less than interval. Hence the heightened attention to the logistic regression alternative. The series first published DeMaris, Logit Modeling, Vol. 86, followed by Menard, Applied Logistic Regression Analysis, 1st ed., Vol. 106, and Pampel, Logistic Regression: A Primer, Vol. 132. The last mentioned monograph provides a basic introduction to the technique. The monograph at hand goes beyond that introduction, attending to the most contemporary of complex issues and mechanics. For the social scientist who wishes to be au courant regarding this rapidly evolving topic, the Menard second edition is a must.

Michael S. Lewis-Beck
Series Editor


Author's Introduction to the Second Edition

The last sentence before the Notes section in the previous edition of this monograph was, “One can hope
that many of the ‘kludges’ for making logistic regression analysis work with existing software … will become
obsolete as the software available for logistic regression analysis is expanded and improved.” This second
edition was written because that hope has been at least partially fulfilled. There have been some relatively minor changes in SAS, including the addition of the Cox-Snell and Nagelkerke pseudo-R² measures, plus an
optional component, SAS Display Manager, a windowing shell. There are also two new instructional manuals
devoted to logistic regression analysis (Allison, 1999; SAS, 1995) that have been published since the first
edition of this monograph. SPSS PC+, a command-driven package, which was illustrated in the first edition,
has been supplanted by SPSS 10, which is highly integrated with the Windows 95/98 environment. Also, there
are two new SPSS routines, NOMREG (nominal regression) for polytomous nominal logistic regression and
PLUM (polytomous logit universal model) for ordinal logistic regression and related models (Norusis, 1999;
SPSS, 1999a). Changes from the first edition include:

• More detailed consideration of grouped as opposed to casewise data throughout the monograph;
• An updated discussion of the properties and appropriate use of goodness-of-fit measures, R² analogues, and indices of predictive efficiency (Chapter 2);
• Discussion of the misuse of odds ratios to represent risk ratios (Chapter 3);
• Discussion of overdispersion and underdispersion for grouped data (Chapter 4);
• Updated coverage of unordered and ordered polytomous logistic regression models; some material that is no longer necessary for working around the limitations of earlier versions of the software has been dropped (Chapter 5).

The focus in this second edition, as in the first, is on logistic regression models for individual level data, but
aggregate or grouped data, with multiple cases for each possible combination of values of the predictors,
are considered in more detail. As in the first edition, examples using SAS and SPSS software are provided.
Finally, observant readers informed me about places in the previous edition where there were errors or where
the clarity of presentation could be improved. For their comments, questions, and constructive criticisms, I
thank the anonymous reviewers of the first edition, Alfred De Maris, who reviewed the present edition, and
also Dennis Fisher, Tom Knapp, Michael Lewis-Beck, Fred Pampel, Hidetoshi Saito, Dan Waschbusch, Susan
White, and, especially, David Nichols of SPSS for his detailed comments. I would also like to absolve all of
them of blame for any errors, new or old, in the present edition.


Linear Regression and the Logistic Regression Model

In linear regression analysis, it is possible to test whether two variables are linearly related, and to calculate the strength of the linear relationship, if the relationship between the variables can be described by an equation of the form Y = α + βX, where Y is the variable being predicted (the dependent, criterion, outcome, or endogenous variable), X is a variable whose values are being used to predict Y (the independent, exogenous, or predictor variable),1 and α and β are population parameters to be estimated. The parameter α, called the intercept, represents the value of Y when X = 0. The parameter β represents the change in Y associated with a one-unit increase in X, or the slope of the line that provides the best linear estimate of Y from X. In multiple regression, there are several predictor variables. If k denotes the number of independent variables, the equation becomes Y = α + β1X1 + β2X2 + … + βkXk, and β1, β2, …, βk are called partial slope coefficients, reflecting the fact that any one of the k predictor variables X1, X2, …, Xk provides only a partial explanation or prediction for the value of Y. The equation is sometimes written in a form that explicitly recognizes that prediction of Y from X may be imprecise: Y = α + βX + ε or, for several predictors, Y = α + β1X1 + β2X2 + … + βkXk + ε, where ε is the error term, a random variable that represents the error in predicting Y from X. For an individual case j, Yj = α + βXj + εj or Yj = α + β1X1j + β2X2j + … + βkXkj + εj, where the subscript j indicates that the equation predicts values for specific cases, indexed by j (j = 1 for the first case, j = 2 for the second case, etc.), and Yj, X1j, …, Xkj refer to specific values of the dependent and independent variables. This last equation is used to calculate the value of Y for a particular case j, rather than to describe the relationship among the variables for all of the cases in the sample or the population.

Estimates of the intercept α and the regression coefficients β (or β1, β2, …, βk) are obtained mathematically using the method of ordinary least squares (OLS) estimation, which is discussed in many introductory statistics texts (for example, Agresti and Finlay, 1997; Bohrnstedt and Knoke, 1994). These estimates produce the equation Ŷ = a + bX or, in the case of several predictors, Ŷ = a + b1X1 + b2X2 + … + bkXk, where Ŷ is the value of Y predicted by the linear regression equation, a is the OLS estimate of the intercept α, and b (or b1, b2, …, bk) is the OLS estimate for the slope β (or the partial slopes β1, β2, …, βk). The residual for each case, ej, is equal to (Yj − Ŷj), where Ŷj is the estimated value of Y for case j. For bivariate regression, the residuals can be visually or geometrically represented by the vertical distance between each point in a bivariate scatterplot and the regression line. For multiple regression, visual representation is much more difficult because it requires several dimensions.
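
To make these quantities concrete, here is a minimal Python sketch, using invented data rather than the National Youth Survey data analyzed in this monograph, that computes the OLS estimates a and b, the predicted values Ŷ, and the residuals e for a bivariate regression.

    import numpy as np

    # Invented illustration only: X is a predictor, Y = 2 + 3X + noise.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=100)
    Y = 2.0 + 3.0 * X + rng.normal(0, 1, size=100)

    # Bivariate OLS: b = cov(X, Y) / var(X), a = mean(Y) - b * mean(X).
    b = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
    a = Y.mean() - b * X.mean()

    Y_hat = a + b * X      # predicted values
    e = Y - Y_hat          # residuals: vertical distances to the regression line

    print(f"a = {a:.3f}, b = {b:.3f}")   # close to the true 2 and 3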

An example of a bivariate regression model is given in Figure 1.1. In part A of Figure 1.1, the dependent variable is FRQMRJ5, the annual frequency of self-reported marijuana use (“How many times in the last year have you smoked marijuana?”), and the independent variable is EDF5, an index of exposure to delinquent friends, for 16-year-old respondents interviewed in 1980 in the fifth wave of a national household survey.2 The exposure to delinquent friends scale is the sum of the answers to eight questions about how many of the respondent's friends are involved in different types of delinquent behavior (theft, assault, drug use). The responses to individual items range from 1 (none of my friends) to 5 (all of my friends), resulting in a possible range from 8 to 40 for EDF5. From part A of Figure 1.1, there appears to be a positive relationship between exposure to delinquent friends and marijuana use, described by the equation

FRQMRJ5 = −49.2 + 6.2(EDF5).

In other words, for every one-unit increase in the index of exposure to delinquent friends, frequency of marijuana use increases by about six times per year, or about once every two months. The coefficient of determination (R²) indicates how much better we can predict the dependent variable from the independent variable than we could predict the dependent variable without information about the independent variable. Without information about the independent variable, we would use the mean frequency of marijuana use as our prediction for all respondents. By knowing the value of exposure to delinquent friends, however, we can base our prediction on the value of EDF5 and the relationship, represented by the regression equation, between FRQMRJ5 and EDF5. Using the regression equation reduces the sum of the squared errors of prediction by R² = .116, or about 12%.
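
The proportional-reduction-in-error reading of R² can be verified directly; a small sketch, again with invented numbers, compares the squared errors around the mean with the squared errors around the regression predictions.

    import numpy as np

    def r_squared(y, y_hat):
        """R² as a proportional reduction in squared error:
        1 - SSE(model) / SSE(baseline), where the baseline predicts mean(y) for every case."""
        sse_baseline = np.sum((y - np.mean(y)) ** 2)   # errors without the predictor
        sse_model = np.sum((y - y_hat) ** 2)           # errors using the regression equation
        return 1.0 - sse_model / sse_baseline

    y = np.array([1.0, 3.0, 5.0, 7.0])
    print(r_squared(y, np.full(4, y.mean())))              # 0.0: no improvement over the mean
    print(r_squared(y, np.array([1.5, 2.5, 5.5, 6.5])))    # 0.95: most squared error removed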


Figure 1.1. Bivariate Regression Plots

It is necessary when interpreting the results to consider the actual values of the dependent and independent variables. The intercept indicates that for an individual with 0 as the value of exposure to delinquent friends, the frequency of marijuana use would be negative. This seemingly impossible result occurs because exposure, as already noted, is measured on a scale that ranges from a minimum of 8 (no exposure at all; not one friend is involved in any of 8 delinquent activities) to 40 (extensive exposure; all friends are involved in all 8 delinquent activities). Thus, for individuals with the minimum possible exposure to delinquent friends (a value of 8, representing no exposure), the expected frequency of marijuana use is −49.2 + 6.2(8) = 0.4, which is close to 0 but indicates that even some individuals with no exposure to delinquency may use marijuana at least occasionally. The maximum value of EDF5 in this sample is 29, which corresponds to an expected frequency of marijuana use equal to −49.2 + 6.2(29) = 130.6, or use approximately every 3 days. This result makes sense substantively, in terms of real-world behavior, as well as statistically, in terms of the regression equation.

1.1. Regression Assumptions

To use the OLS method to estimate and make inferences about the coefficients in linear regression analysis,
a number of assumptions must be satisfied (Lewis-Beck, 1980, pp. 26–47; Berry & Feldman, 1985; Berry,
1993). Specific assumptions include the following:

1.
Measurement: All independent variables are interval, ratio, or dichotomous, and the dependent variable is
continuous, unbounded, and measured on an interval or ratio scale. All variables are measured without er-

ror.3
2.
Specification: (a) All relevant predictors of the dependent variable are included in the analysis, (b) no irrel-
evant predictors of the dependent variable are included in the analysis, and (c) the form of the relationship
(allowing for transformations of dependent or independent variables) is linear.
3.
Expected value of error: The expected value of the error, ε, is 0.
4.
Homoscedasticity: The variance of the error term, ε, is the same or constant for all values of the independent
variables.
5.
Normality of errors: The errors are normally distributed for each set of values of the independent variables.
6.
No autocorrelation: There is no correlation among the error terms produced by different values of the independent variables. Mathematically, E(εiεj) = 0 for i ≠ j.
7.
No correlation between the error terms and the independent variables: The error terms are uncorrelated with the independent variables. Mathematically, E(Xjεj) = 0.
8.
Absence of perfect multicollinearity: For multiple regression, none of the independent variables is a perfect linear combination of the other independent variables. Mathematically, for any i, R²i < 1, where R²i is the proportion of the variance in the independent variable Xi that is explained by all of the other independent variables X1, X2, …, Xi−1, Xi+1, …, Xk (see the sketch following this list). If there is only one predictor, multicollinearity is not an issue.
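
As a practical check on assumption 8, the following sketch (with an invented design matrix) computes each R²i by regressing each predictor on the others; 1/(1 − R²i) is the familiar variance inflation factor.

    import numpy as np

    def auxiliary_r2(X):
        """For each column i of X, the R² from regressing X_i on the remaining columns.
        Values near 1 signal (near-)perfect multicollinearity."""
        n, k = X.shape
        r2 = np.empty(k)
        for i in range(k):
            others = np.delete(X, i, axis=1)
            A = np.column_stack([np.ones(n), others])          # add an intercept
            coef, *_ = np.linalg.lstsq(A, X[:, i], rcond=None)
            resid = X[:, i] - A @ coef
            r2[i] = 1.0 - resid.var() / X[:, i].var()
        return r2

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=200)   # third predictor nearly duplicates the first
    print(auxiliary_r2(X))                            # last two entries near 1: collinearity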

1.1.1. Violations of the Measurement Assumption: Dichotomous Variables in Linear Regression

The linear regression model can be extended easily to accommodate dichotomous predictors, including sets of dummy variables (Lewis-Beck, 1980, pp. 66–71; Berry & Feldman, 1985, pp. 64–75; Hardy, 1993). An example is presented in part B of Figure 1.1. Here, the dependent variable is again self-reported annual frequency of marijuana use, but the independent variable this time is sex or gender (coded 0 = female, 1 = male). The regression equation is

FRQMRJ5 = 29.3 − 10.0(SEX).

The resulting diagram consists of two columns of values for frequency of marijuana use: one represents females and one represents males. With a dichotomous predictor, coded 0–1, the intercept and the slope have a special interpretation. It is still true that the intercept is the predicted value of the dependent variable when the independent variable is 0 (substantively, when the respondent is female), but with only two groups, the intercept now is the mean frequency of marijuana use for the group coded as 0 (females). The slope is still the change in the dependent variable associated with a one-unit change in the independent variable, but with only two categories, that value becomes the difference in the means between the first (female) and second (male) groups. The sum of the slope and the intercept, 29.3 − 10.0 = 19.3, is therefore the mean frequency of marijuana use for the second group (males). As indicated in part B of Figure 1.1, females report a higher (yes, higher) frequency of marijuana use than males, but the difference is not statistically significant (as indicated by Sig. = .3267). In part B of Figure 1.1, the regression line is simply the line that connects the mean frequency of marijuana use for females and the mean frequency of marijuana use for males, that is, the conditional means4 of marijuana use for females and males, respectively. The predicted values of Y over the observed range of X lie well within the observed (and possible) values of Y. Again, the results make substantive as well as statistical sense.
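
The equivalence between a regression on a 0/1 predictor and a comparison of two group means is easy to verify numerically; a sketch with invented data follows, where the values 29.3 and 19.3 merely mimic the example.

    import numpy as np

    rng = np.random.default_rng(2)
    sex = rng.integers(0, 2, size=200)                 # invented: 0 = female, 1 = male
    freq = np.where(sex == 0, 29.3, 19.3) + rng.normal(0, 40, size=200)

    # OLS with a single 0/1 predictor reproduces the two conditional means.
    b = np.cov(sex, freq, ddof=1)[0, 1] / np.var(sex, ddof=1)
    a = freq.mean() - b * sex.mean()

    print(a, freq[sex == 0].mean())        # intercept ~ mean for the group coded 0
    print(a + b, freq[sex == 1].mean())    # intercept + slope ~ mean for the group coded 1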


When the dependent variable is dichotomous, the interpretation of the regression equation is not as straightforward. In part C of Figure 1.1, the independent variable is again exposure to delinquent friends, but now the dependent variable is the prevalence of marijuana use: whether (yes = 1 or no = 0) the individual used marijuana at all during the past year. In part C of Figure 1.1, with a dichotomous dependent variable, there are two rows (rather than columns, as in part B). The linear regression model with a dichotomous dependent variable, coded 0–1, is called a linear probability model (Agresti, 1990, p. 84; Aldrich & Nelson, 1984). The equation from part C of Figure 1.1 is

PMRJ5 = −.41 + .064(EDF5).

When there is a dichotomous dependent variable, the mean of the variable is a function of the probability5 that a case will fall into the higher of the two categories for the variable. Coding the values of the variable as 0 and 1 produces the result that the mean of the variable is the proportion of cases in the higher of the two categories of the variable, and the predicted value of the dependent variable (the conditional mean, given the value of X and the assumption that X and Y are linearly related) can be interpreted as the predicted probability that a case falls into the higher of the two categories on the dependent variable, given its value on the independent variable. Ideally, we would like the predicted probability to lie between 0 and 1, because a probability cannot be less than 0 or more than 1.

As is evident from part C of Figure 1.1, the predicted values for the dependent variable may be higher or lower than the possible values of the dependent variable. For the minimum value of EDF5 (EDF5 = 8), the predicted prevalence of marijuana use (i.e., the predicted probability of marijuana use) is −.41 + .064(8) = .10, a reasonable result, but for the maximum value of EDF5 (EDF5 = 29), the predicted probability of marijuana use becomes −.41 + .064(29) = 1.45, or an impossibly high probability of about 1½. In addition, the variability of the residuals will depend on the size of the independent variable (Schroeder, Sjoquist, & Stephan, 1986, pp. 79–80; Aldrich & Nelson, 1984, p. 13). This condition, called heteroscedasticity, implies that the estimates for the regression coefficients, although they are unbiased (not systematically too high or too low), will not be the best estimates in the sense of having a small standard error. There is also a systematic pattern to the values of the residuals that depends on the value of X. For values of X greater than 23.5 in part C of Figure 1.1, all of the residuals will be negative because Ŷj will be greater than Yj (because for X greater than 23.5, Ŷj is greater than 1, but Yj is less than or equal to 1). Also, residuals will not be normally distributed (Schroeder et al., 1986, p. 80), and sampling variances will not be correctly estimated (Aldrich & Nelson, 1984, pp. 13–14); therefore, the results of hypothesis testing or construction of confidence intervals for the regression coefficients will not be valid.
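
These pathologies of the linear probability model are easy to reproduce. The sketch below generates invented 0/1 outcomes from an S-shaped true probability and fits an OLS line to them; the coefficients are illustrative, not those of the NYS example.

    import numpy as np

    rng = np.random.default_rng(3)
    x = rng.uniform(8, 29, size=500)
    p_true = 1 / (1 + np.exp(-(-5.5 + 0.4 * x)))       # an S-shaped true probability
    y = rng.binomial(1, p_true)                        # dichotomous outcome

    # Linear probability model: OLS of the 0/1 outcome on x.
    b = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    a = y.mean() - b * x.mean()
    p_hat = a + b * x

    print(p_hat.min(), p_hat.max())   # predictions typically escape the [0, 1] range
    # Residual variance under this model is p(1 - p), which changes with x:
    # heteroscedasticity, exactly as described in the text.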

1.1.2. Nonlinearity, Conditional Means, and Conditional Probabilities

For continuous dependent variables, Ŷj, the regression estimate of Y, may be thought of as an estimate of the conditional mean of Y for a particular value of X, given that the relationship between X and Y is linear. In bivariate regression, for continuous independent variables, the estimated value of Y may not be exactly equal to the mean value of Y for those cases, because the conditional means of Y for different values of X may not lie exactly on a straight line. For a dichotomous predictor variable, the regression line will pass exactly through the conditional means of Y for each of the two categories of X. If the conditional means of FRQMRJ5 are plotted against the dichotomous predictor SEX, the plot consists of two points (remember, the cases are aggregated by the value of the independent variable): the conditional means of Y for males and females. The simplest, most parsimonious description of this plot is a straight line between the two conditional means, and the linear regression model appears to work well.

The inherent nonlinearity of relationships that involve dichotomous dependent variables is illustrated in Figure 1.2. In Figure 1.2, the observed conditional mean of PMRJ5, the prevalence of marijuana use, is plotted for each value of the independent variable EDF5. The observed conditional mean is symbolized by the letter “C.” Since PMRJ5 is coded as either 0 or 1, the conditional means represent averages of 0s and 1s, and are interpretable as conditional probabilities. Figure 1.2 is therefore a plot of probabilities that PMRJ5 = 1 for different values of EDF5. All of the observed values of Y lie between the two vertical lines at 0 and 1, respectively, in Figure 1.2. Predicted probabilities, however, can, in principle, be infinitely large or small if we use the linear probability model.

The plot of observed conditional probabilities (C) in Figure 1.2 is overlaid with the plot of predicted conditional
probabilities based on the regression equation (R) in part C of Figure 1.1. For values of EDF5 greater than
23.5, the observed value of the conditional mean prevalence of marijuana use stops increasing and levels
off at PMRJ5 = 1. The predicted values from the regression equation, however, continue to increase past the
value of 1 for PMRJ5, to a maximum of 1.45, and the error of prediction increases as EDF5 increases from
23.5 to its maximum of 29.
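
Observed conditional probabilities of this kind are simply group means of the 0/1 outcome at each value of the predictor; a sketch with invented data:

    import numpy as np

    rng = np.random.default_rng(4)
    edf = rng.integers(8, 30, size=500)                    # invented integer scale scores
    p_true = 1 / (1 + np.exp(-(-5.5 + 0.4 * edf)))
    use = rng.binomial(1, p_true)                          # invented 0/1 outcomes

    # Observed P(Y = 1 | X = v): mean of the 0/1 outcome among cases with each X value.
    cond_prob = {v: use[edf == v].mean() for v in np.unique(edf)}
    # These are the 'C' points of Figure 1.2; they level off near 1 at high scores.
    print(cond_prob)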


Two points need to be made about Figure 1.2. First, although a linear model appears to be potentially appropriate for a continuous dependent variable, regardless of whether the independent variables are continuous or dichotomous, it is evident that a nonlinear model is better suited to the analysis of the dichotomous variable PMRJ5. In general, for very high values of X (or very low values, if the relationship is negative), the conditional probability that Y = 1 will be so close to 1 that it should change little with further increases in X. This is the situation illustrated in Figure 1.2. It is also the case that for very low values of X (or very high values, if the relationship is negative), the conditional probability that Y = 1 will be so close to 0 that it should change little with further decreases in X. The curve that represents the relationship between X and Y should, therefore, be very shallow, with a slope close to 0, for very high and very low values of X if X can, in principle, become indefinitely large or indefinitely small. If X and Y are related, then between the very high and very low values of X, the slope of the curve will be steeper, that is, significantly different from 0. The general pattern is that of an “S curve,” as depicted in Figure 1.3.
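
The S curve of Figure 1.3 is the logistic function; a sketch of its shape and its flat tails, with illustrative (not estimated) parameter values:

    import numpy as np

    def logistic(x, alpha=-5.5, beta=0.4):
        """P(Y = 1 | X = x) under a logistic model; alpha and beta are illustrative."""
        return 1 / (1 + np.exp(-(alpha + beta * x)))

    x = np.linspace(-10, 40, 11)
    p = logistic(x)
    # The slope of the curve at any x is beta * p * (1 - p): near 0 in both tails,
    # steepest where p = .5, exactly the pattern described in the text.
    slope = 0.4 * p * (1 - p)
    print(np.round(p, 3))
    print(np.round(slope, 3))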


Figure 1.2. Conditional Probabilities Observed (C) and Predicted by Linear Regression (R)
C: Observed Mean Prevalence of Marijuana Use (MPMRJ5) WITH EDF5 (Exposure to Delinquent Friends). R: Linear Regression Prediction of Prevalence of Marijuana Use (MRPEPMJ5) WITH EDF5. $: Multiple occurrence (linear regression prediction and observed value coincide). 21 cases.


Figure 1.3. Logistic Curve Model for a Dichotomous Dependent Variable

Second, for prevalence data, the observed conditional mean of Y is equal to the observed conditional probability that Y = 1, and the predicted value of Y is equal to the predicted conditional probability that Y = 1. The actual values used to identify the two categories of Y are arbitrary, a matter of convenience. They may be 0 and 1, for example, or 2 and 3 (in which case the predicted values of Y are equal to 2 plus the conditional probability that Y = 3, still a function of the conditional probability that Y has the higher of its two values for a given value of X). What is substantively important is not the numerical value of Y, but the probability that Y has one or the other of its two possible values, and the extent to which that probability depends on one or more independent variables.

The distinction between the arbitrary numerical value of Y, upon which OLS parameter estimates are based, and the probability that Y has one or the other of its two possible values is problematic for OLS linear regression, and it leads us to consider alternative methods for estimating parameters to describe the relationship between X and Y. First, however, we address the issue of nonlinearity. For continuous independent and dependent variables, the presence of nonlinearity in the relationship between X and Y may sometimes be addressed by the use of nonlinear transformations of dependent or independent variables (Berry & Feldman, 1985). Similar techniques play a part in estimating relationships that involve dichotomous dependent variables.


1.2. Nonlinear Relationships and Variable Transformations

When a relationship appears to be nonlinear, it is possible to transform either the dependent variable or one or more of the independent variables so that the substantive relationship remains nonlinear, but the form of the relationship is linear and can, therefore, be analyzed using OLS estimation. Another way to say that a relationship is substantively nonlinear but formally linear is to say that the relationship is nonlinear in terms of its variables, but linear in terms of its parameters (Berry & Feldman, 1985, p. 53). Examples of using variable transformations to achieve a linear form for the relationship are given in Berry and Feldman (1985, pp. 55–72) and Lewis-Beck (1980, pp. 43–47).

In Figure 1.2, there was some evidence of nonlinearity in the relationship between frequency of marijuana use and exposure to delinquent friends. One possible transformation that could be used to model this nonlinearity is a logarithmic transformation6 of the dependent variable FRQMRJ5. This is done by adding 1 to FRQMRJ5 and then taking the natural logarithm. (Adding 1 is necessary to avoid taking the natural logarithm of 0, which is undefined.) The regression equation then has the form ln(Y + 1) = α + βX or, equivalently, (Y + 1) = e^(α+βX) or Y = e^(α+βX) − 1, where e ≈ 2.72 is the base of the natural logarithm. This transformed equation was estimated for frequency of marijuana use and exposure to delinquent friends.

Comparing the results of the model using the logarithmic transformation with the untransformed model in part A of Figure 1.1, it is evident that the slope is still positive, but the numerical value of the slope has changed (because the units in which the dependent variable is measured have changed from frequency to log frequency). The coefficient of determination for the transformed equation is also larger (.32 instead of .12), reflecting a better fit of the linear regression model when the dependent variable is transformed. This is evidence (not conclusive proof, just evidence) that the relationship between frequency of marijuana use and exposure to delinquent friends is substantively nonlinear. A similar result occurs for the relationship between the dichotomous predictor SEX and frequency of marijuana use. With the logarithmic transformation of the dependent variable, the explained variance increases (from a puny .004 to an unimpressive .028), and the relationship between gender and frequency of marijuana use is statistically significant (p = .011) in the transformed equation. It appears that the relationship between frequency of marijuana use and both of the predictors considered so far is substantively nonlinear, but we are still able to use a formal linear model to describe those relationships, and we can still use OLS to estimate the parameters of the model.
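
A sketch of the transformation step with invented count-like data: the outcome is shifted by 1, logged, fit by OLS exactly as before, and back-transformed with the exponential.

    import numpy as np

    rng = np.random.default_rng(5)
    edf = rng.integers(8, 30, size=300)
    lam = np.exp(0.15 * edf - 1.0)        # invented rate increasing with exposure
    freq = rng.poisson(lam)               # nonnegative counts, including zeros

    log_freq = np.log(freq + 1)           # add 1 so zero counts remain defined
    b = np.cov(edf, log_freq, ddof=1)[0, 1] / np.var(edf, ddof=1)
    a = log_freq.mean() - b * edf.mean()

    # Back-transform a prediction: Y-hat = e^(a + b * x) - 1.
    x = 20
    print(np.exp(a + b * x) - 1)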


1.3. Probabilities, Odds, Odds Ratios, and the Logit Transformation for Dichotomous Dependent Variables

As noted earlier, for a dichotomous dependent variable, the numerical value of the variable is arbitrary, a matter of convenience, and is not intrinsically interesting. What is intrinsically interesting is whether the classification of cases into one or the other of the categories of the dependent variable can be predicted by the independent variable. Instead of trying to predict the arbitrary value associated with a category, it may be useful to reconceptualize the problem as an attempt to predict the probability that a case will be classified into one, as opposed to the other, of the two categories of the dependent variable. Because the probability of being classified into the first or lower-valued category, P(Y = 0), is equal to 1 minus the probability of being classified into the second or higher-valued category, P(Y = 1), if we know one probability, we know the other: P(Y = 0) = 1 − P(Y = 1).

We could try to model the probability that Y = 1 as P(Y = 1) = α + βX, but we would again run into the problem that although observed values of P(Y = 1) must lie between 0 and 1, predicted values may be less than 0 or greater than 1. A step toward solving this problem would be to replace the probability that Y = 1 with the odds that Y = 1. The odds that Y = 1, written odds(Y = 1), is the ratio of the probability that Y = 1 to the probability that Y ≠ 1: odds(Y = 1) = P(Y = 1)/[1 − P(Y = 1)]. Unlike P(Y = 1), the odds has no fixed maximum value, but like the probability, it has a minimum value of 0.

One further transformation of the odds produces a variable that varies, in principle, from negative infinity to positive infinity. The natural logarithm of the odds, ln{P(Y = 1)/[1 − P(Y = 1)]}, is called the logit of Y. The logit of Y, written logit(Y), becomes negative and increasingly large in absolute value as the odds decrease from 1 toward 0, and becomes increasingly large in the positive direction as the odds increase from 1 to infinity. If we use the natural logarithm of the odds that Y = 1 as our dependent variable, we no longer face the problem that the estimated probability may exceed the maximum or minimum possible values for the probability. The equation for the relationship between the dependent variable and the independent variables then becomes

logit(Y) = ln{P(Y = 1)/[1 − P(Y = 1)]} = α + β1X1 + β2X2 + … + βkXk.    (1.1)

We can convert logit(Y) back to the odds by exponentiation, calculating odds(Y = 1) = e^logit(Y). This results in the equation

odds(Y = 1) = P(Y = 1)/[1 − P(Y = 1)] = e^(α + β1X1 + β2X2 + … + βkXk),    (1.2)

and a change of one unit in X multiplies the odds by e^β. We can then convert the odds back to the probability that Y = 1 by the formula P(Y = 1) = [odds that Y = 1]/[1 + odds that Y = 1], that is, the probability that Y = 1 is equal to the odds that Y = 1 divided by 1 plus the odds that Y = 1. This produces the equation

P(Y = 1) = e^(α + β1X1 + … + βkXk)/[1 + e^(α + β1X1 + … + βkXk)].    (1.3)

It is important to understand that the probability, the odds, and the logit are three different ways to express exactly the same thing. Of the three measures, the probability or the odds is probably the most easily understood. Mathematically, however, the logit form of the probability best helps us to analyze dichotomous dependent variables. Just as we took the natural logarithm of the continuous dependent variable (frequency of marijuana use) to correct for the nonlinearity in the relationship between frequency of marijuana use and exposure to delinquent friends, we can also take the logit of the dichotomous dependent variable (prevalence of marijuana use) to correct for the nonlinearity in the relationship between prevalence of marijuana use and exposure to delinquent friends.
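
The three scales convert back and forth mechanically; a sketch of the conversions implied by Equations 1.1 through 1.3:

    import numpy as np

    def prob_to_odds(p):
        return p / (1 - p)

    def prob_to_logit(p):
        return np.log(p / (1 - p))

    def logit_to_prob(logit):
        # Equation 1.3: p = e^logit / (1 + e^logit)
        return np.exp(logit) / (1 + np.exp(logit))

    p = 0.75
    print(prob_to_odds(p))          # 3.0: odds of 3 to 1
    print(prob_to_logit(p))         # ~1.099
    print(logit_to_prob(1.0986))    # ~0.75: the round trip recovers the probability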

For any individual case, the observed value of Y is either 0 or 1, so the observed logit of Y is either negative or positive infinity. The logit transformation ensures that the probabilities estimated for the probability form of the model (Equation 1.3) will not be less than 0 or greater than 1, but it also means that, because the linear form of the model (Equation 1.1) has infinitely large or small values of the dependent variable, OLS cannot be used to estimate the parameters. Instead, maximum likelihood techniques are used to maximize the value of a function, the log-likelihood function, which indicates how likely it is to obtain the observed values of Y, given the values of the independent variables and parameters α, β1, …, βk. Unlike OLS, which is able to solve directly for the parameters, the solution for the logistic regression model is found by beginning with a tentative solution, revising it slightly to see if it can be improved, and repeating the process until the change in the likelihood function from one step of the process to another is negligible. This process of repeated estimation, testing, and reestimation is called iteration, and the process of obtaining a solution from repeated estimation is called an iterative process. When the change in the likelihood function from one step to another becomes negligible, the solution is said to converge. All of this is done by means of computer-implemented numerical algorithms designed to search for and identify the best set of parameters to maximize the log-likelihood function. When the assumptions of OLS regression are met, however, the OLS estimates for the linear regression coefficients are identical to the estimates that would be obtained using maximum likelihood estimation (Eliason, 1993, pp. 13–18). OLS estimation is in this sense a special case of maximum likelihood estimation, one in which it is possible to calculate a solution directly without iteration.
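
A minimal sketch of such an iterative solution, assuming a Newton-Raphson update (one common algorithm; the packages discussed in this monograph differ in their details):

    import numpy as np

    def logistic_mle(X, y, tol=1e-8, max_iter=25):
        """Newton-Raphson maximization of the logistic log-likelihood.
        X: (n, k) predictors without a constant; y: (n,) 0/1 outcomes."""
        A = np.column_stack([np.ones(len(y)), X])   # prepend an intercept column
        beta = np.zeros(A.shape[1])                 # tentative starting solution
        old_ll = -np.inf
        for _ in range(max_iter):
            p = 1 / (1 + np.exp(-A @ beta))         # fitted probabilities at current beta
            ll = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
            if abs(ll - old_ll) < tol:              # negligible change: converged
                break
            old_ll = ll
            W = p * (1 - p)                         # weights for the Hessian
            grad = A.T @ (y - p)
            hess = A.T @ (A * W[:, None])
            beta = beta + np.linalg.solve(hess, grad)   # Newton step
        return beta, ll

    rng = np.random.default_rng(6)
    X = rng.normal(size=(300, 1))
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 0]))))
    beta, ll = logistic_mle(X, y)
    print(beta)    # near the true values [0.5, 1.2]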


1.4. Logistic Regression: A First Look

Part C of Figure 1.1 showed the results of an OLS linear regression analysis of the relationship between the prevalence of marijuana use (PMRJ5) and exposure to delinquent friends (EDF5). Figure 1.4 presents the output from a bivariate logistic regression with the same two variables. This is output from SPSS LOGISTIC REGRESSION, with some deletions but nothing added. The equation for the logit of the prevalence of marijuana use from Figure 1.4 is logit(PMRJ5) = −5.487 + .407(EDF5). There are several other statistics presented in Figure 1.4 that will be discussed in the pages to follow. For the moment, however, note that the presentation of logistic regression results includes (a) some summary statistics for the goodness of fit of the model (Omnibus Tests of Model Coefficients and Model Summary), (b) a comparison of observed and predicted values (or classification) of cases according to whether they do (yes) or do not (no) report using marijuana (Classification Table), (c) the estimated parameters (B) of the logistic regression equation, along with other statistics associated with those parameters (Variables in the Equation), and (d) a plot of the observed (yes = 1 or no = 0) and predicted probabilities of “membership” as marijuana users (Observed Groups and Predicted Probabilities).
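
For comparison, a sketch of the same bivariate model fit outside SPSS, in Python with statsmodels; the data here are invented from the Figure 1.4 coefficients rather than taken from the NYS.

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    edf5 = rng.integers(8, 30, size=200).astype(float)      # invented EDF5-like scores
    logit_true = -5.487 + 0.407 * edf5                      # coefficients from Figure 1.4
    pmrj5 = rng.binomial(1, 1 / (1 + np.exp(-logit_true)))  # invented 0/1 outcomes

    X = sm.add_constant(edf5)
    fit = sm.Logit(pmrj5, X).fit()   # iterates to convergence, like SPSS LOGISTIC REGRESSION
    print(fit.params)                # intercept and slope, near -5.487 and .407
    print(fit.predict(X).min(), fit.predict(X).max())   # predicted probabilities stay in (0, 1)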


Figure 1.4. Bivariate Logistic Regression for the Prevalence of Marijuana Use


Figure 1.5. Conditional Probabilities Observed (C) and Predicted by Logistic Regression (L)
C: Mean Prevalence of Marijuana Use (MPMRJ5) WITH EDF5 (Exposure to Delinquent Friends). L: Logistic Regression Prediction of Prevalence of Marijuana Use (MLPEPMRJ5) WITH EDF5. $: Multiple occurrence (logistic regression prediction and observed value coincide). 21 cases.

Figure 1.5 plots the predicted and observed conditional probabilities (or, equivalently, the conditional means) for the logistic regression equation. The observed conditional probabilities are represented by the letter “C” and the predicted conditional probabilities are represented by the letter “L” for logistic regression. In Figure 1.2, the predicted probabilities from linear regression analysis represented a straight line, and for values of EDF5 greater than 23.5, the predicted conditional probabilities of being a marijuana user were greater than 1. The observed conditional probabilities, unlike the predicted conditional probabilities, leveled off at 1. In Figure 1.5, the conditional probabilities predicted by logistic regression analysis all lie between 0 and 1, and the pattern of the predicted probabilities follows the curve suggested by the observed conditional probabilities, a curve similar to the right half of the curve in Figure 1.3. Just from looking at the pattern, there appears to be a closer correspondence between the observed and predicted conditional means when logistic regression is used to predict the dependent variable.



Notes

1. Although the relationship being modeled often represents a causal relationship in which the single
predicted variable is believed to be an effect of the one or more predictor variables, this is not always the
case. We can as easily predict a cause from an effect (for example, predict whether different individuals are
male or female based on their income) as predict an effect from a cause (predicting income based on whether
someone is male or female). Throughout this monograph, the emphasis is on predictive rather than causal
relationships, although the language of causal relationships is sometimes employed. Describing a variable as
independent or dependent, therefore, or as an outcome or a predictor, does not necessarily imply a causal
relationship. Instead, all relationships should be regarded as definitely predictive, but only possibly causal in
nature.

2. Data are taken from the National Youth Survey, a national household probability sample of individuals who were adolescents (age 11 to 17) in 1976 and young adults (age 27 to 33) in 1992. Data were collected annually for the years 1976 to 1980, then at 3-year intervals thereafter, from 1983 to 1992. The data include information on self-reported illegal behavior, family relationships, school performance, and sociodemographic characteristics of the respondents. Details on sampling and measurement may be found in Elliott et al. (1985, 1989). For present purposes, attention is restricted to respondents who were 16 years old in 1980. In the scatterplot, the numbers and symbols refer to numbers of cases at a given point on the plot: a 1 indicates one case, a 2 indicates two cases, a 9 indicates nine cases; the letters A to Z continue the count, A = 10 cases, B = 11 cases, …, Z = 35 cases. When more than 35 cases occupy a single point, an asterisk (*) is used.

3. For a review of levels of measurement, see, for example, Agresti and Finlay (1997, pp. 12–17).


4. The unconditional mean of Y is simply the familiar mean, Ȳ = (ΣjYj)/N. The conditional mean of Y for a given value of X is calculated by selecting only those cases for which X has a certain value and calculating the mean for those cases. The conditional mean can be denoted Ȳi = (ΣjYij)/ni, where i is the value of X for which we are calculating the conditional mean of Y, Yij are the values of Y for the cases (j = 1, 2, …, ni) for which X = i, and ni is the number of cases for which X = i.

5. A brief discussion of probability, including conditional probabilities, is presented in the Appendix.

6. The logarithmic transformation is one of several possibilities discussed by Berry and Feldman (1985, pp.
63–64), Lewis-Beck (1980, p. 44), and others to deal with relationships that are nonlinear with respect to the
variables, but may be expressed as linear relationships with respect to the parameters.

7. R²L is also provided as the pseudo-R² in Stata (Stata, 1999). An earlier version of SAS PROC LOGISTIC [SAS (SUGI) PROC LOGIST (Harrell, 1986)] included a variant of R²L that was adjusted for the number of parameters in the model. This measure is analogous to the adjusted R² in linear regression, and we may denote it as R²LA to indicate its connection with R²L and to distinguish it from other R²-type measures. R²LA = (GM − 2k)/D0, where k is the number of independent variables in the model. If GM < 2k, and particularly if GM = 0, it is possible to get a negative estimate for explained variance using R²LA.
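
A sketch of the two measures as defined in this note, with illustrative deviance values:

    def r2_l(g_m, d_0):
        """R²L: proportion of the null deviance explained, G_M / D_0."""
        return g_m / d_0

    def r2_la(g_m, d_0, k):
        """Adjusted variant from this note: (G_M - 2k) / D_0; negative when G_M < 2k."""
        return (g_m - 2 * k) / d_0

    print(r2_l(30.0, 120.0))        # 0.25
    print(r2_la(30.0, 120.0, 20))   # negative: the penalty exceeds the improvement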

8. It should be noted, however, that in other contexts, it may not be appropriate. For example, in proportional hazards models, R²L is more sensitive to censoring than R²N (Schemper, 1990, 1992).

9. The designation ϕp was selected because ϕp, like ϕ, is based on comparisons between observed and expected values for individual cells (rather than rows or columns, as with λp and τp), because the numerical value of ϕp is close to the numerical value of ϕ for tables with consistent marginals (row and column totals in which the larger row total corresponds to the larger column total), and because ϕp and ϕ have the same sign (because they have the same numerator).

10. ϕp can be adjusted by adding the minimum number of errors to the expected number of errors without the model. This results in a coefficient that (a) retains the proportional change in error interpretation (because the adjustment is built into the calculation of the expected error) and (b) still may have negative values if the model is pathologically inaccurate. For extremely poor models, the revised index still has a maximum value less than 1, even when the maximum number of cases is correctly classified, and the increment over ϕp is small. Based again on similarities with ϕ, we may designate this adjusted ϕp as ϕ′p. Note, however, that ϕ′p cannot be calculated as ϕp/max(ϕp): to do so would destroy the proportional change in error interpretation for the measure and would leave the measure undefined when the maximum value of ϕp was 0.

11. For a two-tailed test, the null hypothesis is that there is no difference between the proportion of errors with
and without the prediction model. The alternative hypothesis is that the proportion of errors with the prediction
model is not equal to the proportion of errors without the prediction model. For a one-tailed test, specifying
that the model results in increased accuracy of prediction of the dependent variable, the null hypothesis is
that the proportion of errors with the prediction model is no smaller than the proportion of errors without the
prediction model. The alternative hypothesis is that the proportion of errors with the prediction model is less
than the proportion of errors without the prediction model. If we want to know whether the prediction model
improves our ability to predict the classification of the cases, the one-tailed test is more appropriate, and a
negative value of λp will result in a negative value for d and failure to reject the null hypothesis.

12. Copas and Loeber (1990) noted this property and indicated that in this situation it would be a
misinterpretation to regard a value of 1 as indicating perfect prediction. This leads to two questions. How
should we interpret the value of RIOC in this situation? What value does indicate perfect prediction for RIOC?
Ambiguity of interpretation is an undesirable quality in any measure of change, and there are enough better
alternatives that the use of the RIOC measure should be avoided.

13. It will not always be the case that logistic regression produces a higher R² than linear regression for a dichotomous dependent variable. In a parallel analysis of theft for the full National Youth Survey sample, R² for linear and logistic regression was .255 and .253, respectively.

14. This is sometimes called a Type II error or a false negative (failure to detect a relationship that exists), as
opposed to a Type I error or a false positive (concluding that there is a relationship when there really is none).

15. This was done using the backward stepwise procedure, to be discussed later in the text. In SPSS
NOMREG and PLUM there is no stepwise procedure, but it is possible to include the likelihood statistics in
the output.

16. If they were, they would indicate that non-Hispanic European Americans have the lowest rates of
marijuana use, followed by African Americans, and others have the highest prevalence of marijuana use. It is
always questionable, however, to make statements about the nature of a relationship that is not statistically
significant and may reflect nothing more than random sample error.

17. Mathematically, the omitted category is redundant or of little or no interest. In both theory testing and applied research, however, it makes more sense to provide full information about the coefficients and their statistical significance for all three categories, rather than leave one for pencil-and-paper calculation.

18. Hosmer and Lemeshow (1989) distinguished between analyzing residuals based on individuals and
analyzing residuals based on covariate patterns, the combinations of values of the independent variables
that actually occur in the sample. When the number of covariate patterns is equal to the number of cases,
or very nearly so, residuals must be analyzed for each case separately. This is the implicit approach taken
in this section, and in SAS PROC LOGISTIC and SPSS LOGISTIC REGRESSION, but SPSS NOMREG
aggregates cases by covariate pattern and produces the correct predictions, residuals, and goodness-of-fit tests based on those subpopulations (Norusis, 1999). When the number of cases is much larger than
the number of covariate patterns or when some of the covariate patterns hold for more than five cases,
Hosmer and Lemeshow recommended aggregating the cases by covariate pattern, because of potential
underestimation of the leverage statistic hj.

19. In a standard normal distribution with a mean of 0 and a standard deviation of 1, 95% of the cases should
have standardized scores (or, in this context, standardized residuals) between -2 and +2, and 99% should
have scores or residuals between -3 and +3. Having a standardized or deviance residual larger than 2 or 3
does not necessarily mean that there is something wrong with the model. We would expect about 5% of the
sample to lie outside the range -2 to +2, and 1% to lie outside the range -2.5 to +2.5. Values far outside this
range, however, are usually indications that the model fits poorly for a particular case and suggest either that
there is something unusual about the case that merits further investigation or that the model may need to be
modified to account for whatever it is that explains the poor fit for some of the cases.

20. As Fox (1991) noted, in linear regression, influence = leverage × discrepancy, where “discrepancy” refers
to being an outlier on Y with respect to the predictors. In logistic regression, in contrast to linear regression,
as fitted probabilities get close to 0 (less than .1) or 1 (greater than .9), the leverages stop increasing and turn
rapidly toward 0 (Hosmer & Lemeshow, 1989, pp. 153–154).

21. A particularly clear and concise discussion of overdispersion and underdispersion may be found in
Hutcheson and Sofroniou (1999). See also McCullagh and Nelder (1989, pp. 124–128).

22. Had this been run in SAS CATMOD, neither D0 nor GM would be directly available. For both, the likelihood X² statistics are based on comparisons of cells in a contingency table, rather than on probabilities of category membership. The two are related, however, and it is possible to derive the statistics appropriate for logistic regression analysis from the statistics provided by SAS PROC CATMOD. The appropriate steps are

1.
Compute D0 = −2 Σh nY=h ln(nY=h/N), where nY=h is the number of cases for which Y is equal to h, one of its possible values, N is the total sample size, and the sum is taken over all possible values h of Y.
2.
Examine the iteration history of the model. The “−2 Log Likelihood” from the final iteration, listed in the Maximum Likelihood Analysis table, is approximately (but not exactly) equal to DM.
3.
Compute GM = D0 − DM; compute R²L = GM/D0.

If these procedures are followed with a dichotomous dependent variable, the resulting figures are approximately equal to the GM and R²L that would be obtained in the identical analysis from SAS PROC LOGISTIC. (SAS PROC LOGISTIC uses an iteratively reweighted least squares algorithm to calculate the model parameters; PROC CATMOD uses weighted least squares or maximum likelihood estimation, depending on the type of model being calculated.) The process is a bit awkward, but if the likelihood ratio statistics provided in CATMOD are used without modification, they produce results different from those that would be obtained using SAS PROC LOGISTIC when the dependent variable is dichotomous.
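
A sketch of steps 1 and 3, computing D0 from category counts and R²L from the two deviances; the counts and DM are illustrative only.

    import numpy as np

    def null_deviance(counts):
        """D0 = -2 * sum_h n_h * ln(n_h / N), from the counts of cases in each category of Y."""
        counts = np.asarray(counts, dtype=float)
        n = counts.sum()
        return -2.0 * np.sum(counts * np.log(counts / n))

    d0 = null_deviance([130, 227])   # e.g., 130 users, 227 nonusers (invented counts)
    dm = 250.0                       # "-2 Log Likelihood" from the final iteration (illustrative)
    gm = d0 - dm
    print(gm / d0)                   # R²L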

23. For reliable estimates, this requires a large N (Hu, Bender, & Kano, 1992).


Appendix: Probabilities

The probability of an event is estimated by its relative frequency in a population or sample. For example, if nY=1 is the number of cases for which Y = 1 in a sample and N is the total number of cases in the sample, then

1.
We denote the probability that Y is equal to 1 as P(Y = 1).
2.
P(Y = 1) = nY=1/N.
3.
The probability that Y is not equal to 1 is P(Y ≠ 1) = 1 − P(Y = 1) = 1 − (nY=1/N) = (N − nY=1)/N.
4.
The minimum possible value for a probability is 0 (nY=1 = 0 implies nY=1/N = 0).
5.
The maximum possible value for a probability is 1 (nY=1 = N implies nY=1/N = 1).

The joint probability of two independent events (occurrences that are unrelated to one another) is the product of their individual probabilities. For example, the probability that both X and Y are equal to 1, if X and Y are unrelated, is P(Y = 1 and X = 1) = P(Y = 1) × P(X = 1). If X and Y are related (for example, if the probability that Y is equal to 1 depends on the value of X), then P(Y = 1 and X = 1) will not be equal to P(Y = 1) × P(X = 1). Instead, we will want to consider the conditional probability that Y = 1 when X = 1, or P(Y = 1 ∣ X = 1).
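The following sketch (Python with numpy; the 0/1 data are invented) compares the observed joint probability with the product of the individual probabilities as a rough check on independence:

    import numpy as np

    # Hypothetical dichotomous variables X and Y.
    X = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
    Y = np.array([1, 0, 1, 0, 0, 0, 1, 1, 1, 0])

    p_x1 = np.mean(X == 1)                  # P(X = 1) = .5
    p_y1 = np.mean(Y == 1)                  # P(Y = 1) = .5
    p_joint = np.mean((X == 1) & (Y == 1))  # P(Y = 1 and X = 1) = .4

    print(p_joint, p_x1 * p_y1)
    # .4 versus .25: the joint probability departs from the product of the
    # individual probabilities, so P(Y = 1) appears to depend on X.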

The conditional probability that Y = 1 is the probability that Y = 1 for a given value of some other variable. [In this context, we may sometimes refer to P(Y = 1), the probability that Y = 1 regardless of the value of any other variable, as the unconditional probability that Y = 1.] For example, the probability that the prevalence of marijuana use is equal to 1 for the data in Figure 2.1 is P(PMRJ5 = 1) = .35 (for males and females combined; detailed data not shown). The conditional probability that prevalence of marijuana use is equal to 1 is P(PMRJ5 = 1 ∣ SEX = 0) = .45 for females and P(PMRJ5 = 1 ∣ SEX = 1) = .25 for males. For a dichotomous variable, coded as 0 or 1, the probability that the variable is equal to 1 is equal to the mean for that variable, and the conditional probability that the variable is equal to 1 is equal to the conditional mean (see note 4) for the variable.
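The equivalence of probabilities and means for dichotomous 0/1 variables is easy to verify numerically. The sketch below uses hypothetical SEX and PMRJ5 codings (not the actual data behind Figure 2.1):

    import numpy as np

    SEX = np.array([0, 0, 0, 0, 1, 1, 1, 1])    # hypothetical: 0 = female, 1 = male
    PMRJ5 = np.array([1, 1, 0, 0, 1, 0, 0, 0])  # hypothetical prevalence of use

    p_uncond = PMRJ5.mean()            # unconditional P(PMRJ5 = 1) = .375
    p_female = PMRJ5[SEX == 0].mean()  # P(PMRJ5 = 1 | SEX = 0) = .5
    p_male = PMRJ5[SEX == 1].mean()    # P(PMRJ5 = 1 | SEX = 1) = .25
    print(p_uncond, p_female, p_male)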


References

AGRESTI, A. (1990). Categorical data analysis. New York: Wiley.

AGRESTI, A., and FINLAY, B. (1997). Statistical methods for the social sciences (3rd ed.). Upper Saddle River, NJ: Prentice-Hall.

ALDRICH, J. H., and NELSON, F. D. (1984). Linear probability, logit, and probit models. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–045. Beverly Hills, CA: Sage.

ALLISON, P. D. (1999). Logistic regression using the SAS system. Cary, NC: SAS Institute.

BEGG, C. B., and GRAY, R. (1984). Biometrika, 71, 11–18.

BENDEL, R. B., and AFIFI, A. A. (1977). Journal of the American Statistical Association, 72, 46–53.


BERRY, W. D. (1993). Understanding regression assumptions. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–092. Newbury Park, CA: Sage.

BERRY, W. D., and FELDMAN, S. (1985). Multiple regression in practice. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–050. Beverly Hills, CA: Sage.

BOHRNSTEDT, G. W., and KNOKE, D. (1994). Statistics for social data analysis (3rd ed.). Itasca, IL: F. E. Peacock.

BOLLEN, K. A. (1989). Structural equation models with latent variables. New York: Wiley.

BULMER, M. G. (1979). Principles of statistics. New York: Dover.

CLOGG, C. C., and SHIHADEH, E. S. (1994). Statistical models for ordinal variables. Thousand Oaks, CA: Sage.

COPAS, J. B., and LOEBER, R. (1990). British Journal of Mathematical and Statistical Psychology, 43, 293–307.

COSTNER, H. L. (1965). American Sociological Review, 30, 341–353.


COX, D. R., and SNELL, E. J. (1989). The analysis of binary data (2nd ed.). London: Chapman and Hall.

CRAGG, J. G., and UHLER, R. (1970). Canadian Journal of Economics, 3, 386–406.


DeMARIS, A. (1992). Logit modeling. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–086. Newbury Park, CA: Sage.

ELIASON, S. R. (1993). Maximum likelihood estimation: Logic and practice. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–096. Newbury Park, CA: Sage.

ELLIOTT, D. S., HUIZINGA, D., and AGETON, S. S. (1985). Explaining delinquency and drug use. Beverly Hills, CA: Sage.

ELLIOTT, D. S., HUIZINGA, D., and MENARD, S. (1989). Multiple problem youth. New York: Springer-Verlag.

FARRINGTON, D. P., and LOEBER, R. (1989). Journal of Quantitative Criminology, 5, 201–213.


FOX, J. (1991). Regression diagnostics. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–079. Newbury Park, CA: Sage.

HAGLE, T. M., and MITCHELL, G. E., II (1992). American Journal of Political Science, 36, 762–784.


HARDY, M. (1993). Regression with dummy variables. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–093. Newbury Park, CA: Sage.

HARRELL, F. E., Jr. (1986). The LOGIST procedure. In SAS Institute, Inc. (Ed.), SUGI supplemental library user's guide (Version 5, pp. 269–293). Cary, NC: SAS Institute.

HOSMER, D. W., and LEMESHOW, S. (1989). Applied logistic regression. New York: Wiley.

HU, L., BENTLER, P. M., and KANO, Y. (1992). Psychological Bulletin, 112, 351–362.


HUTCHESON, G., and SOFRONIOU, N. (1999). The multivariate social scientist: Introductory statistics using generalized linear models. Thousand Oaks, CA: Sage.

JÖRESKOG, K. G., and SÖRBOM, D. (1988). PRELIS: A program for multivariate data screening and data summarization (2nd ed.). Chicago: Scientific Software International.

JÖRESKOG, K. G., and SÖRBOM, D. (1993). LISREL 8: Structural equation modeling with the SIMPLIS command language. Chicago: Scientific Software International.

KLECKA, W. R. (1980). Discriminant analysis. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–019. Beverly Hills, CA: Sage.

KNOKE, D., and BURKE, P. J. (1980). Log-linear models. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–021. Beverly Hills, CA: Sage.

LANDWEHR, J. M., PREGIBON, D., and SHOEMAKER, A. C. (1984). Journal of the American Statistical Association, 79, 61–71.


LEWIS-BECK, M. S. (1980). Applied regression: An introduction. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–022. Beverly Hills, CA: Sage.

LOEBER, R., and DISHION, T. (1983). Psychological Bulletin, 94, 68–99.


LONG, J. S. (1997). Regression models for categorical and limited dependent variables. Thousand Oaks, CA: Sage.


MADDALA, G. S. (1983). Limited-dependent and qualitative variables in econometrics. Cambridge, UK: Cambridge University Press.

MAGEE, L. (1990). The American Statistician, 44, 250–253.

McCULLAGH, P., and NELDER, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman and Hall.

McFADDEN, D. (1974). Journal of Public Economics, 3, 303–328.

McKELVEY, R., and ZAVOINA, W. (1975). Journal of Mathematical Sociology, 4, 103–120.

MENARD, S. (2000). The American Statistician, 54, 17–24.
MIECZKOWSKI, T. (1990). The accuracy of self-reported drug use: An evaluation and analysis of new data. In R. Weisheit (Ed.), Drugs, crime, and the criminal justice system (pp. 275–302). Cincinnati: Anderson.

NAGELKERKE, N. J. D. (1991). Biometrika, 78, 691–692.

NORUSIS, M. J. (1999). SPSS regression models 10.0. Chicago: SPSS, Inc.

OHLIN, L. E., and DUNCAN, O. D. (1949). American Journal of Sociology, 54, 441–451.


SAS (1989). SAS/STAT user's guide (Version 6, 4th ed., Vols. 1 and 2). Cary, NC: SAS Institute.

SAS (1995). Logistic regression examples using the SAS system. Cary, NC: SAS Institute.

SCHAEFER, R. L. (1986). Journal of Statistical Computation and Simulation, 25, 75–91.

SCHEMPER, M. (1990). Biometrika, 77, 216–218.

SCHEMPER, M. (1992). Biometrika, 79, 202–204.


SCHROEDER, L. D., SJOQUIST, D. L., and STEPHAN, P. E. (1986). Understanding regression analysis: An introductory guide. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07–057. Beverly Hills, CA: Sage.

SIMONOFF, J. S. (1998). The American Statistician, 52, 10–14.

SODERSTROM, I., and LEITNER, D. (1997). The effects of base rate, selection ratio, sample size, and reliability of predictors on predictive efficiency indices associated with logistic regression models. Paper presented at the annual meeting of the Mid-Western Educational Research Association, Chicago.

SPSS (1991). SPSS statistical algorithms (2nd ed.). Chicago: SPSS, Inc.

SPSS (1999a). SPSS advanced models 10.0. Chicago: SPSS, Inc.

SPSS (1999b). SPSS base 10.0 applications guide. Chicago: SPSS, Inc.

STATA (1999). Stata reference manual (Release 6, Vol. 2). College Station, TX: Stata Press.

STUDENMUND, A. H., and CASSIDY, H. J. (1987). Using econometrics: A practical guide. Boston: Little, Brown.

VEALL, M. R., and ZIMMERMANN, K. F. (1996). Journal of Economic Surveys, 10, 241–260.


WIGGINS, J. S. (1973). Personality and prediction: Principles of personality assessment. Reading, MA: Addison-Wesley.

WOFFORD, S., ELLIOTT, D. S., and MENARD, S. (1994). Journal of Family Violence, 9, 195–225.


About the Author

SCOTT MENARD is Research Associate in the Institute of Behavioral Science at the University of Colorado, Boulder. He received his A.B. at Cornell University and his Ph.D. at the University of Colorado, both in sociology. His primary substantive interests are in the longitudinal study of drug use and other forms of illegal behavior. His publications include the Sage QASS monograph Longitudinal Research (1991; second edition forthcoming in 2002) and the books Perspectives on Population (with Elizabeth W. Moen, 1987), Multiple Problem Youth (with Delbert S. Elliott and David Huizinga, 1989), and Juvenile Gangs (with Herbert C. Covey and Robert J. Franzese, second edition 1997).

