Lecture Note 12

The document discusses the Difference-in-Difference (DID) method, which is used to estimate program impacts by comparing outcomes before and after an intervention between treatment and control groups. It highlights the assumptions of time-invariant unobserved heterogeneity and addresses potential selection bias through various approaches, including combining DID with propensity score matching. Additionally, it outlines the advantages and disadvantages of the DID method, emphasizing the importance of controlling for initial conditions to avoid omitted variable bias.

DIFFERENCE-IN-DIFFERENCE OR DOUBLE DIFFERENCE

Dr. Muhammad Shahadat Hossain Siddiquee


Professor, Department of Economics
University of Dhaka
Email: [email protected]
Cell: +8801719397749
DIFFERENCE-IN-DIFFERENCE (DID)
• Difference-in-difference (DID), or double-difference (DD), methods, unlike
propensity score matching (PSM), allow for unobserved heterogeneity in
participation.
• They assume, however, that these unobserved factors are time invariant.
• With data on project and control observations before and after the
program intervention, this fixed component can therefore be differenced
out.
• Some variants of the DD approach have been introduced to account for
potential sources of selection bias.
• Selection bias may arise from nonrandom program placement in repeated
sampling data. In that case, combining PSM with DD methods can help
resolve the problem: only matched units in the common support are used
to estimate impact using DID.
DIFFERENCE-IN-DIFFERENCE (DID)..
• Controlling for initial area conditions can also resolve nonrandom
program placement that might bias the program effect.
• Where a baseline might not be available, using a triple-difference
method with an entirely separate control experiment after program
intervention (that is, a separate set of untreated observations) offers
an alternate calculation of the program’s impact.
LEARNING OBJECTIVES OF STUDYING DID
After completing this topic, you will be able to discuss:
• How to construct the double-difference estimate
• How to address potential violations of the assumption of time-invariant
heterogeneity
• How to account for nonrandom program placement
• What to do when a baseline is not available
ADDRESSING SELECTION BIAS: USING DIFFERENCES AS
COUNTERFACTUAL
• The two methods, randomized evaluation and PSM, focus on various
single-difference estimators that often require only an appropriate
cross-sectional survey.
• The double-difference estimation technique typically uses panel data.
However, DD can be used on repeated cross-section data as well, as long as
the composition of participant and control groups is fairly stable over
time.
• In a panel setting, DD estimation resolves the problem of missing data by
measuring outcomes and covariates for both participants and
nonparticipants in pre- and post-intervention periods.
• DD essentially compares treatment and comparison groups in terms of
outcome changes over time relative to the outcomes observed for a pre-
intervention baseline.
ADDRESSING SELECTION BIAS: USING DIFFERENCES AS
COUNTERFACTUAL…
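In a two-period setting, with t = 0 before the program and t = 1 after it, the DD estimate can be written as follows. This is a standard formulation; the outcome notation Y_t^T and Y_t^C for treated and control units is an assumption here.

```latex
DD = E\left(Y_1^T - Y_0^T \mid T_1 = 1\right) - E\left(Y_1^C - Y_0^C \mid T_1 = 0\right)
```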

T1 = 1 denotes treatment, or the presence of the program at t = 1, whereas T1 = 0
denotes untreated, or the absence of the program.
Unlike PSM alone, the DD estimator allows for unobserved heterogeneity (the
unobserved difference in mean counterfactual outcomes between treated and
untreated units) that may lead to selection bias. For example, one may want to
account for factors unobserved by the researcher, such as differences in innate
ability or personality across treated and control subjects or the effects of
nonrandom program placement at the policy-making level.
DD assumes this unobserved heterogeneity is time invariant, so the bias cancels out
through differencing. In other words, the outcome changes for nonparticipants
reveal the counterfactual outcome changes.
DID Method: Theory and Application
• The DD estimator relies on a comparison of participants and
nonparticipants before and after the intervention.
• For example, after an initial baseline survey of both nonparticipants and
(subsequent) participants, a follow-up survey can be conducted of both
groups after the intervention.
• From this information, the difference is calculated between the observed
mean outcomes for the treatment and control groups before and after
program intervention.
• When baseline data are available, one can thus estimate impacts by
assuming that unobserved heterogeneity is time invariant and
uncorrelated with the treatment over time. This assumption is weaker
than conditional exogeneity and renders the outcome changes for a
comparable group of nonparticipants the appropriate counterfactual.
DD Method: Theory and Application (Contd..)
• The DD estimate can also be calculated within a regression framework; the
regression can be weighted to account for potential biases in DD. In particular,
the estimating equation would be specified as follows:
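A form of the estimating equation consistent with the symbols used in the discussion that follows would be the one below. Here β sits on the interaction term; ρ and γ as the coefficients on treatment and time are assumed labels, chosen to match the bias terms named later in the lecture.

```latex
Y_{it} = \alpha + \beta \, T_{i1} t + \rho \, T_{i1} + \gamma \, t + \varepsilon_{it}
```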

The coefficient β on the interaction between the treatment variable (Ti1) and time (t = 1, ..., T) gives the average DD
effect of the program. Thus, β = DD. In addition to this interaction term, the variables Ti1 and t are included separately
to pick up any separate mean effects of time as well as the effect of being targeted versus not being targeted. Again,
as long as data on four different groups are available to compare, panel data are not necessary to implement the DD
approach (for example, the t subscript, normally associated with time, can be reinterpreted as a particular geographic
area, k = 1, ..., K).
DD Method: Theory and Application (Contd..)
• To better understand the intuition behind the above equation, one can write it
out in detail in expectations form:
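As a sketch, assume the regression is Y_it = α + DD·T_i1·t + ρ·T_i1 + γ·t + ε_it with a zero-mean error, where ρ and γ are assumed labels for the treatment and time coefficients. The expected outcome changes for the two groups are then:

```latex
\begin{aligned}
E\left(Y_{i1}^T - Y_{i0}^T \mid T_{i1} = 1\right) &= (\alpha + DD + \rho + \gamma) - (\alpha + \rho) = DD + \gamma \\
E\left(Y_{i1}^C - Y_{i0}^C \mid T_{i1} = 0\right) &= (\alpha + \gamma) - \alpha = \gamma
\end{aligned}
```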
DD Method: Theory and Application (Contd..)
• Subtracting the expected change for nonparticipants from the expected change for
participants gives the DD estimate.
• DD is unbiased only if the potential source of selection bias is additive and time
invariant.
• Using the same approach, if a simple pre- versus post-estimation impact on the
participant sample is calculated, the calculated program impact would be DD + γ, and
the corresponding bias would be γ.
• Without a control group, it is difficult to justify that other factors were not
responsible for the change in participant outcomes.
• One might also try comparing just the post-program difference in outcomes across
treatment and control units; however, in this case, the estimated impact of the policy
would be DD + ρ, and the bias would be ρ.
• Systematic, unmeasured differences that could be correlated with treatment cannot be
separated easily.
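The three comparisons above can be checked with a small numerical sketch. All numbers are hypothetical and purely illustrative; γ is taken to be the common time effect and ρ the fixed gap between treated and control groups, which is an assumption consistent with the bias terms named in the bullets.

```python
# Hypothetical parameters (illustrative only, not from the lecture).
alpha = 10.0    # baseline mean outcome for the control group
rho = 3.0       # fixed gap between treated and control groups
gamma = 2.0     # common change over time
dd_true = 1.5   # assumed program effect

# Mean outcomes for the four group-period cells.
treated_pre = alpha + rho
treated_post = alpha + rho + gamma + dd_true
control_pre = alpha
control_post = alpha + gamma


def double_difference(t_pre, t_post, c_pre, c_post):
    """Change for participants minus change for nonparticipants."""
    return (t_post - t_pre) - (c_post - c_pre)


dd = double_difference(treated_pre, treated_post, control_pre, control_post)
pre_post_only = treated_post - treated_pre   # participants only: DD + gamma
post_only = treated_post - control_post      # post-program gap: DD + rho

print("DD estimate:", dd)
print("Pre vs post (participants only):", pre_post_only, "bias:", pre_post_only - dd_true)
print("Post-program gap only:", post_only, "bias:", post_only - dd_true)
```

Only the double difference removes both the time effect and the fixed group gap; each single difference keeps one of them as bias.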
DD Method: Theory and Application (Contd..)
For the DD estimator to be interpreted correctly, the following assumptions must
hold:
• The model for outcome variable is correctly specified.
• The error term is uncorrelated with the other variables in the equation:
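Written out, one common statement of this condition is given below. The covariance notation is an assumption here, with T_i1 for treatment and t for time as in the regression discussion.

```latex
\operatorname{Cov}(\varepsilon_{it}, T_{i1}) = 0, \qquad
\operatorname{Cov}(\varepsilon_{it}, t) = 0, \qquad
\operatorname{Cov}(\varepsilon_{it}, T_{i1} t) = 0
```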

The last of these assumptions, also known as the parallel-trend assumption, is the most critical. It means that
unobserved characteristics affecting program participation do not vary over time with treatment status.
Panel Fixed-Effects Model
• The preceding two-period model can be generalized with multiple time periods,
which may be called the panel fixed-effects model.
• This possibility is particularly important for a model that controls not only for
the unobserved time-invariant heterogeneity but also for heterogeneity in
observed characteristics over a multiple-period setting.
• More specifically, Yit can be regressed on Tit, a range of time-varying covariates
Xit, and unobserved time-invariant individual heterogeneity ηi that may be
correlated with both the treatment and other unobserved characteristics εit.
• Consider the following revision of the above equation:
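A sketch of the revised equation, using the symbols named in the text, is given below. The program effect φ appears in the discussion that follows; δ as the coefficient on X_it is an assumed label, and η_i denotes the time-invariant individual heterogeneity.

```latex
Y_{it} = \phi \, T_{it} + \delta \, X_{it} + \eta_i + \varepsilon_{it}
```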
Panel Fixed-Effects Model (Contd..)
• Differencing both the right- and left-hand side of the above equation over time,
one would obtain the following differenced equation:
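Assuming the levels equation is Y_it = φ·T_it + δ·X_it + η_i + ε_it (with δ an assumed label for the covariate coefficient), differencing between consecutive periods can be sketched as follows. The time-invariant term η_i cancels.

```latex
\begin{aligned}
(Y_{it} - Y_{it-1}) &= \phi \, (T_{it} - T_{it-1}) + \delta \, (X_{it} - X_{it-1}) + (\eta_i - \eta_i) + (\varepsilon_{it} - \varepsilon_{it-1}) \\
\Delta Y_{it} &= \phi \, \Delta T_{it} + \delta \, \Delta X_{it} + \Delta \varepsilon_{it}
\end{aligned}
```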
Panel Fixed-Effects Model (Contd..)
• In this case, because the source of endogeneity (that is, the unobserved individual
characteristics ηi) is dropped by differencing, ordinary least squares (OLS) can
be applied to the differenced equation to estimate the unbiased effect of the
program (φ).
• With two time periods, φ is equivalent to the DD estimate, controlling for the
same covariates Xit; the standard errors, however, may need to be corrected for
serial correlation (Bertrand, Duflo, and Mullainathan 2004).
• With more than two time periods, the estimate of the program impact will
diverge from DD.
Implementing DD: Tabular Method
Implementing DD (Contd..)
• To apply a DD approach using panel data, baseline data need to be collected on
program and control areas before program implementation.
• Quantitative as well as qualitative information on these areas is helpful in
determining who is likely to participate.
• Follow-up surveys after program intervention also should be conducted on the
same units.
• Calculating the average difference in outcomes separately for participants and
nonparticipants over the periods and then taking an additional difference
between the average changes in outcomes for these two groups gives the DD
impact.
Implementing DD (Contd..)
Implementing DD (Contd..)
• The lowermost line in the figure depicts the true counterfactual outcomes, which
are never observed.
• Under the DD approach, unobserved characteristics that create a gap between
measured control outcomes and true counterfactual outcomes are assumed to
be time invariant, such that the gap between the two trends is the same over the
period. This assumption implies that (Y3 − Y2) = (Y1 − Y0). Using this equality
in the preceding DD equation, one gets DD = (Y4 − Y2).
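The substitution can be written out step by step. This sketch assumes the preceding DD equation is DD = (Y4 − Y0) − (Y3 − Y1), with Y0 and Y4 the participants' outcomes before and after the program, Y1 and Y3 the control group's outcomes before and after, and Y2 the participants' unobserved counterfactual outcome after the program. That labeling is inferred from the equality above rather than stated in the text.

```latex
\begin{aligned}
DD &= (Y_4 - Y_0) - (Y_3 - Y_1) \\
Y_3 - Y_2 = Y_1 - Y_0 \;&\Rightarrow\; Y_1 = Y_3 - Y_2 + Y_0 \\
DD &= Y_4 - Y_0 - Y_3 + (Y_3 - Y_2 + Y_0) = Y_4 - Y_2
\end{aligned}
```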
Implementing DD (Contd..)
• One application of DD estimation comes from Thomas and others (2004). They
examine a program in Indonesia that randomized the distribution of iron
supplements to individuals in primarily agricultural households, with half the
respondents receiving treatment and controls receiving a placebo.
• A baseline survey was also conducted before the intervention.
• Using DD estimation, the study found that men who were iron deficient before
the intervention experienced improved health outcomes, with more muted effects
for women.
• The baseline was also useful in addressing concerns about bias in compliance with
the intervention by comparing changes in outcomes among subjects assigned to
the treatment group relative to changes among subjects assigned to the control
group.
Implementing DD (Contd..)
• Khandker, Bakht, and Koolwal (2009) examine the impact of two rural road-paving
projects in Bangladesh, using a quasi-experimental household panel data set
surveying project and control villages before and after program implementation.
• Both project and control areas shared similar socioeconomic and community-level
characteristics before program implementation; control areas were also targets for
future rounds of the road improvement program.
• Each project had its own survey, covered in two rounds—the first in the mid-1990s
before the projects began and the second about five years later after program
completion.
• DD estimation was used to determine the program’s impacts across a range of
outcomes, including household per capita consumption (a measure of household
welfare), prices, employment outcomes for men and women, and children’s school
enrollment.
• Using an additional fixed-effects approach that accounted for initial conditions, the
study found that households had benefited in a variety of ways from road
investment.
Advantages and Disadvantages of Using DD
• The advantage of DD is that it relaxes the assumption of conditional exogeneity
or selection only on observed characteristics.
• It also provides a tractable, intuitive way to account for selection on unobserved
characteristics.
• The main drawback, however, rests precisely with this assumption: the notion of
time-invariant selection bias is implausible for many targeted programs in
developing countries.
• Empirical evidence reveals that such programs often have wide-ranging
approaches to poverty alleviation that span multiple sectors.
• Given that such programs are also targeted in areas that are very poor and have low
initial growth, one might expect over several years that the behavior and choices of
targeted areas would respond dynamically (in both observed and unobserved ways) to
the program. Training programs, which are also widely examined in the evaluation
literature, provide another example.
Advantages and Disadvantages of Using DD (Contd..)
• Suppose evaluating the impact of a training program on earnings is of interest.
• Enrollment may be more likely if a temporary (perhaps shock-induced) slump in
earnings occurs just before introduction of the program (this phenomenon is also
known as Ashenfelter’s Dip).
• Thus, the treated group might have experienced faster growth in earnings even
without participation. In this case, a DD method is likely to overestimate the
program’s effect.
• Time-varying, unobserved heterogeneity could lead to an upward or downward
bias.
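The overestimate caused by a pre-program dip can be illustrated with a small numerical sketch. The earnings figures and the size of the true effect are hypothetical, chosen only to make the arithmetic visible.

```python
def double_difference(t_pre, t_post, c_pre, c_post):
    """Change for participants minus change for nonparticipants."""
    return (t_post - t_pre) - (c_post - c_pre)


# Hypothetical earnings (illustrative numbers only).
usual_earnings = 100.0   # normal earnings level for both groups
dip = 20.0               # temporary pre-program slump among enrollees
true_effect = 5.0        # assumed true gain from training

treated_pre = usual_earnings - dip            # baseline measured during the slump
treated_post = usual_earnings + true_effect   # slump has passed, plus the gain
control_pre = usual_earnings
control_post = usual_earnings

dd = double_difference(treated_pre, treated_post, control_pre, control_post)
overstatement = dd - true_effect

print("DD estimate:", dd)
print("True effect:", true_effect)
print("Overstatement due to the dip:", overstatement)
```

Because the baseline is measured during the slump, the recovery to normal earnings is counted as program impact, so the DD estimate exceeds the assumed true effect by the size of the dip.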
Time-Varying Unobserved Heterogeneity
Advantages and Disadvantages of Using DD (Contd..)
• In practice, ex ante, time-varying unobserved heterogeneity could be accounted for with proper
program design, including ensuring that project and control areas share similar preprogram
characteristics.
• If comparison areas are not similar to potential participants in terms of observed and
unobserved characteristics, then changes in the outcome over time may be a function of this
difference.
• This factor would also bias the DD.
• For example, in the context of a school enrollment program, if control areas were
selected that were initially much farther away from local schools than targeted areas,
DD would overestimate the program’s impact on participating localities.
• Similarly, differences in agroclimatic conditions and initial infrastructural development across
treated and control areas may also be correlated with program placement and resulting changes
in outcomes over time.
• Using data from a poverty-alleviation program in China, Jalan and Ravallion (1998) show that a
large bias exists in the DD estimate of the project’s impact because changes over time are a
function of initial conditions that also influence program placement. Controlling for the area
characteristics that initially attracted the development projects can correct for this bias; by doing
so, Jalan and Ravallion found significant longer-term impacts whereas none had been evident in
the standard DD estimator.
Advantages and Disadvantages of Using DD (Contd..)

• Applying PSM could help match treatment units with observationally similar
control units before estimating the DD impact.
• Specifically, one would run PSM on the base year and then conduct a DD on the
units that remain in the common support.
• Studies show that weighting the control observations according to their propensity
score yields a fully efficient estimator (Hirano, Imbens, and Ridder 2003).
• Because effective PSM depends on a rich baseline, however, careful attention
should be given during initial data collection to the characteristics that
determine participation.
Advantages and Disadvantages of Using DD (Contd..)
• Even if comparability of control and project areas could be ensured before the program,
however, the DD approach might falter if macroeconomic changes during the
program affected the two groups differently.
• Suppose some unknown characteristics make treated and non-treated groups react
differently to a common macroeconomic shock. In this case, a simple DD might
overestimate or underestimate the true effects of a program depending on how the
treated and nontreated groups react to the common shock.
• Bell, Blundell, and van Reenen (1999) suggest a differential time-trend-adjusted DD for
such a case.
• This alternative will be discussed later in terms of the triple-difference method.
• Another approach might be through instrumental variables. If enough data are
available on other exogenous or behavior-independent factors affecting participants and
nonparticipants over time, those factors can be exploited to identify impacts when
unobserved heterogeneity is not constant. An instrumental variables panel fixed-effects
approach could be implemented.
Alternative DD Models
• The double-difference approach yields consistent estimates of project impacts if
unobserved community and individual heterogeneity are time invariant.
• However, one can conceive of several cases where unobserved characteristics of a
population may indeed change over time—stemming, for example, from changes
in preferences or norms over a longer time series. A few variants of the DD
method have therefore been proposed to control for factors affecting these
changes in unobservables.
Do Initial Conditions Matter?
• Not controlling for initial area conditions when assessing the impact of an
antipoverty program may lead to significant omitted variable bias—if local
conditions were also responsible for the improvement of household outcomes or
program targeting was correlated with such area characteristics.
• The case studies that follow describe approaches to controlling for initial area
conditions in a DD approach, using data over multiple years as well as data
covering only two time periods.
Case Study: Accounting for Initial Conditions with a DD Estimator—
Applications for Survey Data of Varying Lengths
• Long-Term Data with Multiple Rounds
Jalan and Ravallion (1998) examined the impact of a development program in a poor
area on growth in household consumption by using panel data from targeted and
nontargeted areas across four contiguous provinces in southwest China. Using data on
about 6,650 households between 1985 and 1990 (supplemented by additional field visits
in 1994–95), they employed a generalized method-of-moments time-series estimation
model for household consumption growth, including initial area conditions on the right-
hand side and using second and higher lags of consumption as instruments for lagged
consumption to obtain consistent estimates of a dynamic growth model with panel data.
Their results show that program effects are indeed influenced by initial household and
community wealth; dropping initial area conditions (such as initial wealth and fertilizer
use) caused the national program effect to lose significance completely, with provincial
program effects changing sign and becoming slightly negative. In particular, after
correcting for the area characteristics that initially attracted the development projects,
Jalan and Ravallion (1998) found significant longer term impacts than those obtained
using simple fixed-effects methods. Thus, failing to control for factors underlying
potential differences in local and regional growth trajectories can lead to a substantial
underestimation of the welfare gains from the program.
Case Study: Accounting for Initial Conditions with a DD Estimator—
Applications for Survey Data of Varying Lengths (Contd..)
• Data with Two Time Periods
With fewer time periods (for example, with two years) a simpler OLS-first difference model
can be applied on the data, incorporating a range of initial area characteristics across project
and control areas prior to program implementation. In their study on rural roads, Khandker,
Bakht, and Koolwal (2009) used two rounds of data—namely, baseline and post-program
data on treated and control areas—to compare DD results based on a household fixed-
effects approach with OLS-first difference estimations on the same outcomes and
covariates. These OLS-first difference estimates control for a number of preproject
characteristics of villages where households were located. These initial area characteristics
included local agroclimatic factors; the number of banks, schools, and hospitals serving the
village; the distance from the village to the nearest paved road; the average short-term
interest rate in the village; and the number of active microfinance institutions in the village.
Although the project estimates are similar across both specifications for a number of
outcomes, the study found that the beneficial household impact of the program was also
strengthened for many outcomes when initial area conditions were controlled for. Because
the project’s effect did not disappear for most outcomes after initial area conditions were
controlled for, the study provides one indication that program targeting was not entirely
predisposed toward certain areas with particular initial development characteristics.
PSM with DD
• Assuming that rich data on control and treatment areas exist, PSM can be combined
with DD methods to better match control and project units on preprogram
characteristics. Specifically, the propensity score can be used to match participant
and control units in the base (preprogram) year, and the treatment impact is
calculated across participant and matched control units within the common support.
For two time periods t = {1,2}, the DD estimate for each treatment area i will be
calculated as:
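One way to write the matched estimate is sketched below. The weight notation ω(i, j), for the PSM weight given to control area j matched to treatment area i, and the set C of matched controls are assumptions here.

```latex
DD_i = \left(Y_{i2}^T - Y_{i1}^T\right) - \sum_{j \in C} \omega(i, j) \left(Y_{j2}^C - Y_{j1}^C\right)
```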
PSM with DD
• In terms of a regression framework, Hirano, Imbens, and Ridder (2003) show that
a weighted least squares regression, by weighting the control observations
according to their propensity score, yields a fully efficient estimator:

The weights in the regression in equation 5.6 are equal to 1 for treated units and to
p(x) / (1 − p(x)) for comparison units.
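The weighting scheme can be sketched in a few lines. With no covariates, the weighted least squares coefficient on the treatment dummy reduces to a difference of weighted means of outcome changes, which is what this minimal example computes. The propensity scores are taken as given (estimating them is outside the sketch), and the function names and data are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def psm_weighted_dd(treated_changes, control_changes, control_pscores):
    """DD on outcome changes, with controls weighted by p / (1 - p).

    Treated units implicitly receive weight 1. Propensity scores are
    assumed to have been estimated beforehand on baseline data.
    """
    for p in control_pscores:
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
    weights = [p / (1.0 - p) for p in control_pscores]
    treated_mean = sum(treated_changes) / len(treated_changes)
    return treated_mean - weighted_mean(control_changes, weights)


# Hypothetical outcome changes (follow-up minus baseline).
treated_changes = [4.0, 6.0, 5.0]
control_changes = [2.0, 3.0, 1.0]
control_pscores = [0.5, 0.75, 0.2]   # implied weights: 1.0, 3.0, 0.25

estimate = psm_weighted_dd(treated_changes, control_changes, control_pscores)
print("Propensity-weighted DD estimate:", round(estimate, 4))
```

Control units that look more like participants (higher propensity score) count for more in the comparison, which is the intuition behind the weights.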
Case Study: PSM with DD
• In a study on rural road rehabilitation in Vietnam, van de Walle and Mu (2008)
controlled for time invariant unobserved heterogeneity and potential time-varying
selection bias attributable to differences in initial observable characteristics by
combining DD and PSM using data from 94 project and 95 control communes
over three periods: a baseline survey in 1997, followed by surveys in 2001 and
2003.
• Highlighting the importance of comparing short-term versus long-term impacts,
the study found that most outcomes were realized at different stages over the
period. Primary school completion, for example, reflected sustained growth
between 1997 and 2003, increasing by 15 to 25 percent. Other outcomes, such as
the expansion of markets and availability of non-food-related goods, took a
longer time to emerge (markets, for example, developed in about 10 percent more
project than control communes after seven years) than did short-run effects such
as the number of secondary schools and availability of food-related goods.
Moreover, van de Walle and Mu found that market impacts were greater if the
commune initially was poorly developed.
Triple-Difference Method
• What if baseline data are not available? Such might be the case during an economic
crisis, for example, where a program or safety net has to be set up quickly. In this
context, a triple-difference method can be used.
• In addition to a “first experiment” comparing certain project and control groups, this
method exploits the use of an entirely separate control experiment after program
intervention. That is, this separate control group reflects a set of nonparticipants in
treated and nontreated areas that are not part of the first control group.
• These new control units may be different from the first control group in socioeconomic
characteristics if evaluators want to examine the project’s impact on participants relative
to another socioeconomic group. Another difference from the first experiment would
then be taken from the change in the additional control sample to examine the impact of
the project, accounting for other factors changing over time (see Gruber 1994).
• This method would therefore require data on multiple years after program intervention,
even though baseline data were missing.
Triple-Difference Method (Contd..)
• Ravallion and others (2005) examine program impacts on income for “stayers”
versus “leavers” in the Trabajar workfare program in Argentina. Given that
the program was set up shortly after the 1997 financial crisis, baseline data were
not available.
• Ravallion and others therefore examine the difference in incomes for
participants leaving the program and those still participating, after differencing out
aggregate economywide changes by using an entirely separate, matched group of
nonparticipants. Without the matched group of nonparticipants, a simple DD
between stayers and leavers will be unbiased only if counterfactual earnings
opportunities outside of the program were the same for each group. However, as
Ravallion and others (2005) point out, individuals who choose to remain in the
program may intuitively be less likely to have better earnings opportunities outside
the program than those who dropped out early. As a result, a DD estimate
comparing just these two groups will underestimate the program’s impact. Only in
circumstances such as an exogenous program contraction, for example, can a
simple DD between stayers and leavers work well.
Case Study: Triple-Difference Method—Trabajar Program in Argentina
Adjusting for Differential Time Trends
• Suppose one wants to evaluate a program such as employment training
introduced during a macroeconomic crisis.
• With data available for treated and nontreated groups before and after the
program, one could use a DD approach to estimate the program’s effect on
earnings.
• However, such events are likely to create conditions where the treated and
nontreated groups would respond differently to the shock. Bell, Blundell, and van
Reenen (1999) have constructed a DD method that accounts for these
differential time-trend effects.
• Apart from the data on treated and nontreated groups before and after
treatment, another time interval is needed (say, t-1 to t) for the same treated and
nontreated groups. The recent past cycle is likely the most appropriate time
interval for such comparison. More formally, the time-trend-adjusted DD is
defined as follows:
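A form of the trend-adjusted estimator consistent with the description above is sketched below. It subtracts the double difference over the earlier interval (t − 1 to t) from the double difference over the program interval. The superscripts T and C for treated and nontreated outcomes are assumed notation.

```latex
\begin{aligned}
DD = {} & \left[ E\left(Y_1^T - Y_0^T \mid T_1 = 1\right) - E\left(Y_1^C - Y_0^C \mid T_1 = 0\right) \right] \\
      & - \left[ E\left(Y_t^T - Y_{t-1}^T \mid T_1 = 1\right) - E\left(Y_t^C - Y_{t-1}^C \mid T_1 = 0\right) \right]
\end{aligned}
```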
Adjusting for Differential Time Trends (Contd..)
3 ways to look at DD
Regression (for 2 time periods)
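$$Y_{it} = \alpha + \beta \cdot 1(t = 1) + \gamma \cdot 1(D_i = 1) + \delta \cdot 1(t = 1) \cdot 1(D_i = 1) + \varepsilon_{it}$$

$$
\begin{aligned}
E(Y_{i1} \mid D_i = 1) &= \text{???} \\
E(Y_{i0} \mid D_i = 1) &= \text{???} \\
E(Y_{i1} \mid D_i = 0) &= \text{???} \\
E(Y_{i0} \mid D_i = 0) &= \text{???}
\end{aligned}
$$

$$
\begin{aligned}
DD &= \big(E(Y_{i1} \mid D_i = 1) - E(Y_{i0} \mid D_i = 1)\big) - \big(E(Y_{i1} \mid D_i = 0) - E(Y_{i0} \mid D_i = 0)\big) \\
   &= \text{???}
\end{aligned}
$$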
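Regression (for 2 time periods) (Contd..)
$$Y_{it} = \alpha + \beta \cdot 1(t = 1) + \gamma \cdot 1(D_i = 1) + \delta \cdot 1(t = 1) \cdot 1(D_i = 1) + \varepsilon_{it}$$

$$
\begin{aligned}
E(Y_{i1} \mid D_i = 1) &= \alpha + \beta \cdot 1 + \gamma \cdot 1 + \delta \cdot 1 \cdot 1 + E(\varepsilon_{i1} \mid D_i = 1) = \alpha + \beta + \gamma + \delta \\
E(Y_{i0} \mid D_i = 1) &= \alpha + \beta \cdot 0 + \gamma \cdot 1 + \delta \cdot 0 \cdot 1 + E(\varepsilon_{i0} \mid D_i = 1) = \alpha + \gamma \\
E(Y_{i1} \mid D_i = 0) &= \alpha + \beta \cdot 1 + \gamma \cdot 0 + \delta \cdot 1 \cdot 0 + E(\varepsilon_{i1} \mid D_i = 0) = \alpha + \beta \\
E(Y_{i0} \mid D_i = 0) &= \alpha + \beta \cdot 0 + \gamma \cdot 0 + \delta \cdot 0 \cdot 0 + E(\varepsilon_{i0} \mid D_i = 0) = \alpha
\end{aligned}
$$

$$
\begin{aligned}
DD &= \big(E(Y_{i1} \mid D_i = 1) - E(Y_{i0} \mid D_i = 1)\big) - \big(E(Y_{i1} \mid D_i = 0) - E(Y_{i0} \mid D_i = 0)\big) \\
   &= (\beta + \delta) - \beta = \delta
\end{aligned}
$$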

Violation of Equal Trend Assumption
