Chapter 5
Regression on Dummy Dependent Variable
(LDVM)/Qualitative Response Model
• The Linear Probability Model (LPM)
• The Logit and Probit Models
• Interpreting the Probit and Logit Model Estimates
Introduction
A qualitative response model describes situations in which the
dependent variable in a regression equation represents a discrete
choice, assuming only a limited number of values.
Such a model is called a:
– Limited dependent variable model
– Discrete dependent variable model
– Qualitative response model
Categories of Qualitative Response Models
There are two broad categories of qualitative response models (QRMs):
1. Binomial models: the choice between two alternatives
e.g., the decision to participate in the labor force or not
2. Multinomial models: the choice among more than two alternatives
e.g., Y = 1 if the occupation is farming
        = 2 if the occupation is carpentry
        = 0 if a government employee
Qualitative choice models may be used when a decision
maker faces a choice among a set of alternatives meeting the
following criteria:
The number of choices is finite
The choices are mutually exclusive (the person chooses
only one of the alternatives)
The choices are exhaustive (all possible alternatives are
included)
Throughout our discussion we shall restrict ourselves to
cases of qualitative choice where the set of alternatives is
binary.
For the sake of convenience the dependent variable is
given a value of 0 or 1.
Example: Suppose the choice is whether to work or not.
The discrete dependent variable we are working with will
assume only two values, 0 and 1:

   Yi = 1 if the i-th individual is working
      = 0 if the i-th individual is not working

where i = 1, 2, …, n.
The independent variables (called factors) that are expected to
affect an individual’s choice may be X1 = age, X2 = marital
status, X3 = gender, X4 = education, and the like.
The Linear Probability Model
• We will first examine a simple and obvious, but unfortunately unsound,
method for dealing with binary dependent variables, known as the linear
probability model.
• It is based on the assumption that the probability of an event occurring,
Pi, is linearly related to a set of explanatory variables:

   Pi = p(yi = 1) = β1 + β2x2i + β3x3i + … + βkxki + ui
• The actual probabilities cannot be observed, so we would estimate a
model where the outcomes, yi (the series of zeros and ones), would be the
dependent variable.
• This is then a linear regression model and would be estimated by OLS.
• The set of explanatory variables could include either quantitative
variables or dummies or both.
• The fitted values from this regression are the estimated probabilities for
yi =1 for each observation i.
The Linear Probability Model
• The slope estimates for the linear probability model can be
interpreted as the change in the probability that the dependent
variable will equal 1 for a one-unit change in a given explanatory
variable, holding the effect of all other explanatory variables fixed.
• Suppose, for example, that we wanted to model the probability
that a firm i will pay a dividend p(yi = 1) as a function of its
market capitalisation (x2i, measured in millions of US dollars), and
we fit the following line:
   P̂i = −0.3 + 0.012x2i
where P̂i denotes the fitted or estimated probability for firm i.
• This model suggests that for every $1m increase in size, the
probability that the firm will pay a dividend increases by 0.012
(i.e., 1.2 percentage points).
• A firm whose stock is valued at $50m will have a
−0.3 + 0.012 × 50 = 0.3 (or 30%) probability of making a dividend
payment.
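The fitted line can be checked numerically; the following Python sketch evaluates the LPM at several market capitalisations (the firm sizes are illustrative; only the coefficients come from the fitted model above):

```python
# Fitted LPM from the example: P_hat = -0.3 + 0.012 * market cap (in $m)
def lpm_probability(size_millions):
    """Fitted 'probability' that a firm pays a dividend under the LPM."""
    return -0.3 + 0.012 * size_millions

for size in (10, 25, 50, 110):
    print(f"market cap ${size}m -> fitted probability {lpm_probability(size):+.3f}")
```

Small firms receive negative fitted "probabilities" and very large firms receive values above one, which is the flaw discussed next.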
The Fatal Flaw of the Linear Probability Model
• Graphically, the situation is as follows: the fitted LPM is a straight
line that is unbounded, extending below 0 and above 1.
Disadvantages of the Linear Probability Model
• While the linear probability model is simple to estimate and
intuitive to interpret, the diagram on the previous slide
should immediately signal a problem with this setup.
– For any firm whose value is less than $25m, the model-
predicted probability of dividend payment is negative,
– while for any firm worth more than about $108m, the predicted
probability is greater than one.
• Clearly, such predictions cannot be allowed to stand, since the
probabilities should lie within the range (0,1).
• An obvious solution is to truncate the probabilities at 0 or 1,
so that a probability of -0.3, say, would be set to zero, and a
probability of, say, 1.2, would be set to 1.
Non-Linear Probability Models
• We need a procedure to translate our linear regression results
into true probabilities.
• We need a function that takes a value from -∞ to +∞ and returns
a value from 0 to 1.
• Both the Probit and Logit models have the same basic structure:
1. Estimate a latent variable Z using a linear model; Z ranges
from −∞ to +∞:

   Zi = β0 + β1X1i + β2X2i + … + βKXKi + εi

2. Use a non-linear function to transform Z into a predicted Y
value between 0 and 1, where

   E(Zi) = β0 + β1X1i + β2X2i + … + βKXKi
• The predicted probability of Y is a non-linear function of E(Z).
• To predict the Prob(Y) for a given X value, begin by calculating the
fitted Z value from the predicted linear coefficients.
   E(Zi) = Ẑi = β̂0 + β̂1X1i

   Prob(Y = 1) = F(Ẑi)

   dProb(Y)/dX1 = [dProb(Y)/dẐ] × [dẐ/dX1] = (dF/dẐ) × β̂1
• Their difference:
– The probit (normit) model uses the cumulative standard normal
distribution function.
– The logit model uses the cumulative standard logistic
distribution function.
• Either model is estimated using a statistical method called the
method of maximum likelihood.
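Both transformations are easy to sketch in Python: the logistic CDF has a closed form, and the standard normal CDF can be computed with the standard library's error function. This illustrates the transformation step only, not an estimation routine:

```python
import math

def logistic_cdf(z):
    """Cumulative standard logistic distribution: maps any z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-3.0, -0.8, 0.0, 0.8, 3.0):
    print(f"z = {z:5.1f}   logit: {logistic_cdf(z):.4f}   probit: {normal_cdf(z):.4f}")
```

Both functions are monotonic and bounded in (0, 1); the logistic CDF approaches 0 and 1 more slowly, which is the "fatter tails" property noted in the comparison section.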
Logit and Probit: Better Approaches
• Both the logit and probit model
approaches are able to overcome
the limitation of the LPM that it
can produce estimated
probabilities that are negative or
greater than one.
• They do this by using a function
that effectively transforms the
regression model so that the
fitted values are bounded
within the (0,1) interval.
• Visually, the fitted regression
model will appear as an S-shape
rather than a straight line, as
was the case for the LPM.
2. The Logit Model
Logistic regression extends the ideas of linear regression to the
situation where the dependent variable, Y , is categorical.
In Logistic Regression the goal is to predict which class a new
observation will belong to, or simply to classify the observation into
one of the classes.
• Let Zi = β1 + β2Xi. Then the function

   Pi = E(Y = 1 | Xi) = 1/(1 + e^(−Zi)) = e^(Zi)/(1 + e^(Zi))

is called the (cumulative) logistic distribution function.
As Zi ranges from −∞ to +∞, Pi ranges between 0 and 1, and
Pi is nonlinearly related to Zi (i.e., to Xi).
dP/dX= β2 P(1 − P), which shows that the rate of change in
probability with respect to X involves not only β2 but also the
level of probability from which the change is measured.
• This creates an estimation problem because Pi is nonlinear not
only in X but also in the β’s; i.e., we cannot use OLS to estimate the β’s.
– But it can be linearized by taking the logit transformation.
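The dependence of dP/dX on the level of P can be seen numerically; a minimal sketch, using an illustrative slope of β2 = 0.07:

```python
import math

def logistic_p(z):
    """P as a function of the index z under the logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

beta2 = 0.07  # illustrative slope coefficient
# dP/dX = beta2 * P * (1 - P): largest at P = 0.5, smaller near 0 or 1.
for z in (-2.0, 0.0, 2.0):
    p = logistic_p(z)
    print(f"P = {p:.3f}   dP/dX = {beta2 * p * (1.0 - p):.5f}")
```

The marginal effect is greatest when P = 0.5 and shrinks as P approaches either 0 or 1.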
• If Pi is the probability of owning a house, then (1 − Pi), the probability
of not owning a house, is

   1 − Pi = 1/(1 + e^(Zi))

so that the odds ratio in favor of owning a house is

   Pi/(1 − Pi) = e^(Zi)
• Now Pi / (1− Pi) is simply the odds ratio in favor of owning a house
- the ratio of the probability that a family will own a house to the
probability that it will not own a house.
Odds express the likelihood of occurrence relative to the likelihood
of non-occurrence.
• The log of the odds ratio is the logit:

   Li = ln[Pi/(1 − Pi)] = Zi = β1 + β2Xi,   e.g., L̂i = 3.69 + 0.07X
L is not only linear in X but also linear in the parameters, and
L is called the logit model.
As the probability increases from 0 to 1, the odds increase
from 0 to ∞, and the log of the odds increases from −∞ to +∞.
β2, the slope, measures the change in L for a unit change in X,
that is, it tells how the log-odds in favor of owning a house
change as income changes by a unit.
– E.g., β2 = 0.07: an increase in X of 1 unit will increase the
log-odds by 0.07.
If we take antilogs of both sides, we can express the relationship
in terms of odds:

   odds(Y = 1) = e^(b0 + b1X)

E.g., odds(Y = 1) = 5 implies five successes for every one failure.

In terms of probability:

   Pr(Y = 1) = e^(b0 + b1X) / (1 + e^(b0 + b1X))

With odds of 5, Pr(Y = 1) = 5/6 ≈ 0.83, i.e., the probability of
success is 83%.
Thus the logit, the odds and the probability are different ways of
expressing the same thing.
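The conversions between the three representations are mechanical; a short sketch using the odds-of-5 example above:

```python
import math

def odds_from_logit(logit):
    """odds = e^logit."""
    return math.exp(logit)

def probability_from_odds(odds):
    """P = odds / (1 + odds)."""
    return odds / (1.0 + odds)

odds = 5.0
print(f"odds = {odds}")
print(f"probability = {probability_from_odds(odds):.3f}")  # 5/6
print(f"logit = {math.log(odds):.3f}")                     # ln 5
```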
The Probit Model
• Instead of using the cumulative logistic function to
transform the model, the cumulative normal distribution
is sometimes used instead.
• This gives rise to the probit model.
• As for the logistic approach, this function provides a
transformation to ensure that the fitted probabilities will
lie between zero and one.
Probit regression
Probit regression models the probability that Y = 1 using
the cumulative standard normal distribution function,
evaluated at the linear index z = β0 + β1X:

   Pr(Y = 1 | X) = Φ(β0 + β1X)

• Φ is the cumulative normal distribution function (cndf).
• z = β0 + β1X is the “z-value” or “z-index” of the probit
model.
Example: Suppose the estimated index at X = .4 equals −0.8, so
• Pr(Y = 1|X = .4) = area under the standard normal density
to the left of z = −0.8, which is
   Pr(Z ≤ −0.8) = .2119
Parameter Interpretation for Logit and Probit Models
Standard errors and t-ratios will automatically be calculated
by the econometric software package used, and hypothesis
tests can be conducted in the usual fashion.
However, interpretation of the coefficients needs slight care.
It is tempting, but incorrect, to state that a 1-unit increase in
x2i, for example, causes a β2 increase in the probability
that the outcome corresponding to yi = 1 will be realised.
This would have been the correct interpretation for the
linear probability model.
However, for logit or probit models this interpretation would
be incorrect; instead, we must compute marginal effects for each
independent variable.
Marginal Effect
However, what we really care about is not the coefficient β itself.
We want to know how a change in X will affect the probability
that Y = 1.
For the probit model, the marginal effect of Xk is

   ∂Pr(Y = 1 | X)/∂Xk = φ(β0 + β1X1 + … + βKXK) × βk

where φ is the standard normal density.
In STATA, the command dprobit reports the marginal effects
instead of the raw coefficients.
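The probit marginal effect φ(z) × βk can be computed directly; a minimal sketch in which the fitted index (0.5) and the slope coefficient (0.8) are hypothetical values chosen for illustration:

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_marginal_effect(z_index, beta_k):
    """Marginal effect of X_k in a probit model: phi(z) * beta_k."""
    return normal_pdf(z_index) * beta_k

# Hypothetical fitted index and slope, for illustration only.
me = probit_marginal_effect(0.5, 0.8)
print(f"marginal effect = {me:.4f}")
```

Note that the marginal effect depends on where the index z is evaluated, which is why probit (and logit) marginal effects are not constant across observations.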
COMPARISON OF LPM, LOGIT AND PROBIT
We would choose a logit or probit model over LPM because:
• The predicted probabilities lie between 0 and 1
• The marginal effects are not constant
How do logit and probit models compare?
• In most applications the two models give quite similar results.
• The main difference is that the logistic distribution has
slightly fatter tails;
that is, the conditional probability Pi approaches zero
or one at a slower rate in logit than in probit.
• Therefore, there is no compelling reason to choose one over the
other.
• In practice many researchers choose the logit model because of
its comparative mathematical simplicity.
Amemiya (1981) suggests the following relationship between
probit and logit coefficients:

   βprobit ≈ 0.625 βlogit   and   βlogit ≈ 1.6 βprobit

Incidentally, Amemiya has also shown that the coefficients of the
LPM and logit models are related as follows:

   βLPM ≈ 0.25 βlogit   (except for the intercept)
   βLPM ≈ 0.25 βlogit + 0.5   (for the intercept)
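Amemiya's rules of thumb are simple rescalings; a minimal sketch (the logit slope of 0.07 reuses the illustrative value from the logit example):

```python
def probit_from_logit(beta_logit):
    """Amemiya's approximation: beta_probit ~ 0.625 * beta_logit."""
    return 0.625 * beta_logit

def logit_from_probit(beta_probit):
    """Inverse approximation: beta_logit ~ 1.6 * beta_probit."""
    return 1.6 * beta_probit

b_logit = 0.07  # illustrative logit slope
print(f"approx. probit slope: {probit_from_logit(b_logit):.5f}")
```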
Logit/probit model regression, Marginal effect calculation and
Interpretation
Command: logit DV list of independent variables
Command: probit DV list of independent variables
Command: mfx (just after the logit or probit regression)
Probit regression Number of obs = 20
LR chi2(3) = 16.86
Prob > chi2 = 0.0008
Log likelihood = -4.5175929 Pseudo R2 = 0.6511
------------------------------------------------------------------------------
demanddummy | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Tax | 4.253679 2.218793 1.92 0.055 -.0950754 8.602432
Income | .3610412 .1740521 2.07 0.038 .0199053 .702177
WC | -1.443386 1.53312 -0.94 0.346 -4.448247 1.561474
_cons | -22.76094 11.80317 -1.93 0.054 -45.89473 .3728605
------------------------------------------------------------------------------
Marginal effects after probit
y = Pr(demanddummy) (predict)
= .89802238
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
Tax | .7572469 .64805 1.17 0.243 -.512913 2.02741 2.475
Income | .0642731 .04797 1.34 0.180 -.029753 .1583 40
WC*| -.2084803 .1856 -1.12 0.261 -.572242 .155281 .65
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
As tax increases by ___, the probability of having demand is expected to
increase by 75.72 percentage points.
As income increases by ___ birr, the probability of having demand is
expected to increase by 6.43 percentage points.