Panel Data Methods
Fixed effects estimation
Random effects estimation
Difference-in-differences
Further issues
Fixed effects estimation
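Consider a model with a single explanatory
variable: for each i,
$y_{it} = \beta_1 x_{it} + a_i + u_{it}, \quad t = 1, 2, \dots, T.$
For each i, average this equation over time. We
get:
$\bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i,$
where $\bar{y}_i = T^{-1} \sum_{t=1}^{T} y_{it}$, with analogous notation for the
other variables.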
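Subtracting the averaged equation from the original equation, we get:
$y_{it} - \bar{y}_i = \beta_1 (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i,$
or:
$\ddot{y}_{it} = \beta_1 \ddot{x}_{it} + \ddot{u}_{it},$
where $\ddot{y}_{it} = y_{it} - \bar{y}_i$ is the time-demeaned data on y,
and similarly for $\ddot{x}_{it}$ and $\ddot{u}_{it}$.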
The fixed effects transformation is also called the
within transformation. The key point is that the
unobserved effect $a_i$ has disappeared, so the
time-demeaned equation can be estimated by pooled OLS.
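The within transformation can be sketched in a few lines of Python. This is only an illustration for the single-regressor case: the function name `within_estimator` and the small data set are hypothetical, not part of the course material.

```python
from collections import defaultdict

def within_estimator(ids, x, y):
    """Slope from OLS on time-demeaned data (single regressor, no intercept)."""
    # Collect each unit's observations so its time averages can be computed.
    groups = defaultdict(list)
    for i, xi, yi in zip(ids, x, y):
        groups[i].append((xi, yi))

    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        x_bar = sum(o[0] for o in obs) / len(obs)
        y_bar = sum(o[1] for o in obs) / len(obs)
        for xi, yi in obs:
            # Demean within the unit; this removes the unit effect a_i.
            sxy += (xi - x_bar) * (yi - y_bar)
            sxx += (xi - x_bar) ** 2
    return sxy / sxx

# Hypothetical data: y_it = 2 * x_it + a_i with a_1 = 10, a_2 = -5 and no noise.
ids = [1, 1, 1, 2, 2, 2]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0]
y = [2 * xi + (10 if i == 1 else -5) for i, xi in zip(ids, x)]

beta_hat = within_estimator(ids, x, y)
print(beta_hat)
```

Because the unit effects drop out after demeaning, the slope is recovered even though the two units have very different levels of y.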
Under a strict exogeneity assumption on the
explanatory variables, the fixed effects estimator is
unbiased: roughly, the idiosyncratic error $u_{it}$ should
be uncorrelated with each explanatory variable
across all time periods.
The fixed effects estimator allows for arbitrary
correlation between $a_i$ and the explanatory
variables in any time period.
Any explanatory variable that is constant over time
is removed by the fixed effects transformation.
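A small check of this last point, as a sketch: the variable names `educ` and `exper` and their values are hypothetical and only serve to contrast a time-constant variable with a time-varying one.

```python
def demean(values):
    """Subtract the unit's time average from each observation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# One hypothetical unit observed for three periods.
educ = [12, 12, 12]   # constant over time
exper = [5, 6, 7]     # varies over time

educ_dd = demean(educ)
exper_dd = demean(exper)

print(educ_dd)    # [0.0, 0.0, 0.0]
print(exper_dd)   # [-1.0, 0.0, 1.0]
```

After demeaning, the time-constant variable is identically zero, so its coefficient cannot be estimated by fixed effects.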
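We can also estimate the between-effects model by running OLS on the time averages:
$\bar{y}_i = \beta_1 \bar{x}_i + a_i + \bar{u}_i.$
This uses only the variation between units.
Stata's panel output reports three R-squared values:
R-sq: within (fit to the time-demeaned data)
R-sq: between (fit to the unit means)
R-sq: overall (fit to the pooled, untransformed data)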
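The between estimator can be sketched as OLS on the unit means. The function name and data below are hypothetical; including an intercept in the regression of means is an assumption of this sketch.

```python
from collections import defaultdict

def between_estimator(ids, x, y):
    """OLS of the unit means of y on the unit means of x, with an intercept."""
    groups = defaultdict(list)
    for i, xi, yi in zip(ids, x, y):
        groups[i].append((xi, yi))

    # One observation per unit: its time averages.
    x_bars = [sum(o[0] for o in obs) / len(obs) for obs in groups.values()]
    y_bars = [sum(o[1] for o in obs) / len(obs) for obs in groups.values()]

    n = len(x_bars)
    mx = sum(x_bars) / n
    my = sum(y_bars) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x_bars, y_bars))
    sxx = sum((a - mx) ** 2 for a in x_bars)
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Hypothetical panel: 3 units, 2 periods each.
ids = [1, 1, 2, 2, 3, 3]
x = [0.5, 1.5, 1.5, 2.5, 2.5, 3.5]   # unit means: 1, 2, 3
y = [2.0, 4.0, 6.0, 4.0, 7.0, 7.0]   # unit means: 3, 5, 7

intercept, slope = between_estimator(ids, x, y)
print(intercept, slope)
```

Only the cross-sectional variation in the means is used; all within-unit movement over time is averaged away.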
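The composite error $e_{it}$ is decomposed into:
$e_{it} = a_i + u_{it}.$
Since $a_i$ is constant over time, $e_{it}$ is correlated with $e_{is}$ for $t \neq s$, and
this correlation is computed by
$\operatorname{Corr}(e_{it}, e_{is}) = \dfrac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}.$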
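A short derivation of this correlation, writing the composite error as $e_{it} = a_i + u_{it}$. It assumes (not stated on the slide) that $a_i$ and $u_{it}$ are uncorrelated and that $u_{it}$ is serially uncorrelated with constant variance $\sigma_u^2$:

```latex
\begin{aligned}
\operatorname{Var}(e_{it}) &= \operatorname{Var}(a_i) + \operatorname{Var}(u_{it})
  = \sigma_a^2 + \sigma_u^2, \\
\operatorname{Cov}(e_{it}, e_{is}) &= \operatorname{Cov}(a_i + u_{it},\; a_i + u_{is})
  = \operatorname{Var}(a_i) = \sigma_a^2, \quad t \neq s, \\
\operatorname{Corr}(e_{it}, e_{is}) &= \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}.
\end{aligned}
```

The larger the variance of the unit effect relative to the idiosyncratic error, the closer this correlation is to one.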
We should not transform the data ourselves and then run
OLS on the transformed data. Although the point
estimates are correct, the standard errors are invalid,
because OLS on the manually transformed data uses
(NT - k) degrees of freedom.
However, demeaning uses up one degree of freedom per unit, so the
correct degrees of freedom are (NT - N - k). In Stata we
use the xtreg command with the fe option for the fixed-effects model.
We can also estimate the fixed-effects model by running
OLS with a dummy variable for each cross-sectional unit.
See computer example
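A minimal sketch of the dummy variable regression, checking that it gives the same slope as the within estimator. The data are hypothetical, and the small `solve` and `ols` helpers are written out only to keep the example free of external libraries.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    return solve(xtx, xty)

# Hypothetical panel: 3 units, 3 periods each.
ids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1.0, 2.0, 4.0, 3.0, 5.0, 6.0, 2.0, 2.5, 5.0]
y = [3.1, 4.9, 9.2, 1.0, 5.2, 6.9, 8.0, 9.1, 13.8]

# Design matrix: x plus one dummy per unit (no common intercept,
# which avoids perfect collinearity among the dummies).
units = sorted(set(ids))
X = [[xi] + [1.0 if i == u else 0.0 for u in units] for i, xi in zip(ids, x)]
beta_lsdv = ols(X, y)[0]

# Within estimator on the same data for comparison.
def unit_mean(values, i):
    vals = [v for j, v in zip(ids, values) if j == i]
    return sum(vals) / len(vals)

xd = [xi - unit_mean(x, i) for i, xi in zip(ids, x)]
yd = [yi - unit_mean(y, i) for i, yi in zip(ids, y)]
beta_within = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(beta_lsdv, beta_within)
```

The two slopes agree up to rounding. The dummy variable approach also reports the correct degrees of freedom, because the N unit dummies are counted as estimated parameters.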
Panel data format
id year age wage
1 2002 43 2.3
2 2002 45 4.4
3 2002 54 5.3
4 2002 32 2.3
1 2003 44 3.2
2 2003 46 4.4
3 2003 55 4.2
4 2003 33 5.1
1 2004 45 3.2
2 2004 47 4.8
3 2004 56 5.7
4 2004 34 4.1
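The table above is in long format: one row per unit and year. As a sketch, the same rows can be grouped by id to confirm the panel dimensions and that the panel is balanced. The variable names are illustrative.

```python
# Rows copied from the table above: (id, year, age, wage).
rows = [
    (1, 2002, 43, 2.3), (2, 2002, 45, 4.4), (3, 2002, 54, 5.3), (4, 2002, 32, 2.3),
    (1, 2003, 44, 3.2), (2, 2003, 46, 4.4), (3, 2003, 55, 4.2), (4, 2003, 33, 5.1),
    (1, 2004, 45, 3.2), (2, 2004, 47, 4.8), (3, 2004, 56, 5.7), (4, 2004, 34, 4.1),
]

# Group the observations by unit and sort each unit's series by year.
panel = {}
for pid, year, age, wage in rows:
    panel.setdefault(pid, []).append((year, age, wage))
for series in panel.values():
    series.sort()

n_units = len(panel)                                    # N
periods_per_unit = {len(s) for s in panel.values()}     # set of T_i values
balanced = len(periods_per_unit) == 1                   # same T for every unit

print(n_units, periods_per_unit, balanced)
print(panel[1])
```

Here N = 4 and T = 3, and every unit is observed in every year, so the panel is balanced.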
Random effects model
• Instead of treating β1i as fixed, we assume that it is a
random variable with a mean value of β1
• The intercept value for an individual company can be
expressed as
$\beta_{1i} = \beta_1 + \varepsilon_i,$
where εi is a random error term with mean zero
• The four companies have a common mean value for the
intercept (= β1), and the individual differences in the
intercept values of each company are reflected in the
error term εi
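• Substituting β1i into the equation for Yit (written here with a single regressor Xit for illustration), we obtain
$Y_{it} = \beta_1 + \beta_2 X_{it} + \varepsilon_i + u_{it} = \beta_1 + \beta_2 X_{it} + w_{it},$
where $w_{it} = \varepsilon_i + u_{it}$
• The composite error term wit consists of two
components: εi, which is the cross-section, or individual-specific,
error component, and uit, which is the combined
time series and cross-section error component.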
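Because εi appears in every period, the composite errors of the same unit are correlated over time. A small simulation sketch, assuming independent normal draws for εi and uit (the standard deviations, sample size and seed are arbitrary choices for illustration):

```python
import random

def simulate_composite_errors(n_units, sd_eps, sd_u, seed=0):
    """Draw (w_i1, w_i2) pairs with w_it = eps_i + u_it for two periods."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_units):
        eps = rng.gauss(0.0, sd_eps)        # individual-specific component
        w1 = eps + rng.gauss(0.0, sd_u)     # period 1 composite error
        w2 = eps + rng.gauss(0.0, sd_u)     # period 2 composite error
        pairs.append((w1, w2))
    return pairs

def correlation(pairs):
    """Sample correlation between the first and second elements of the pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    v1 = sum((p[0] - m1) ** 2 for p in pairs)
    v2 = sum((p[1] - m2) ** 2 for p in pairs)
    return cov / (v1 * v2) ** 0.5

sd_eps, sd_u = 1.0, 1.0
pairs = simulate_composite_errors(5000, sd_eps, sd_u)
rho_hat = correlation(pairs)
rho_theory = sd_eps ** 2 / (sd_eps ** 2 + sd_u ** 2)

print(rho_hat, rho_theory)
```

The simulated correlation should be close to the theoretical value of 0.5 for these settings. This within-unit correlation is why pooled OLS standard errors are not appropriate for the random effects model.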
Difference-in-differences
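Consider a model:
$y_{it} = \beta_1 x_{it} + \beta_2 D_{it} + u_{it},$
where:
t = 1, 2, i.e., two-period panel data (or pooled
cross-sectional data).
D is a binary variable, equal to 0 for all
observations in period t = 1.
D is correlated with the error term u.
Then we can estimate $\beta_2$ by:
$y_{it} = \beta_1 x_{it} + \beta_2 D_{it} T_{it} + \beta_3 D_{it} + \beta_4 T_{it} + u_{it},$
where $T_{it}$ equals 1 in the second period and 0 in the first, and D in this
regression marks the units that receive the treatment (in both periods).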
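Without the covariate x, the coefficient on the interaction term equals a difference of mean changes, which can be sketched directly. The function name and the eight observations are hypothetical; D marks the treated group and T the second period.

```python
def mean(values):
    return sum(values) / len(values)

def did_estimate(records):
    """records: (y, D, T) tuples, with D = 1 for the treated group and T = 1 for period 2."""
    cell = {
        (d, t): mean([y for y, dd, tt in records if dd == d and tt == t])
        for d in (0, 1)
        for t in (0, 1)
    }
    change_treated = cell[(1, 1)] - cell[(1, 0)]
    change_control = cell[(0, 1)] - cell[(0, 0)]
    return change_treated - change_control

# Hypothetical outcomes.
records = [
    (10.0, 0, 0), (12.0, 0, 0),   # control group, period 1 (mean 11)
    (13.0, 0, 1), (15.0, 0, 1),   # control group, period 2 (mean 14)
    (11.0, 1, 0), (13.0, 1, 0),   # treated group, period 1 (mean 12)
    (19.0, 1, 1), (21.0, 1, 1),   # treated group, period 2 (mean 20)
]

did = did_estimate(records)
print(did)   # 5.0
```

The treated group changes by 8 and the control group by 3, so the difference-in-differences estimate is 5. Any fixed difference between the groups and any common time trend cancel out.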
Further issues
Dependent variables are not continuous:
Multinomial logit, ordered logit.
Count model
Fractional logit model
Limited dependent variable:
Tobit models
Truncated/censored models
Sample selection models
Endogeneity issues:
Randomized design
Simultaneous equation model
Regression discontinuity
Non-linear functions:
Non-linear models
Matching
Panel and time-series models