Multiple Regression Analysis

y = β0 + β1x1 + β2x2 + . . . + βkxk + u

6. Heteroskedasticity

1
What is Heteroskedasticity?
Recall that the assumption of homoskedasticity implied that, conditional on the explanatory variables, the variance of the unobserved error, u, was constant
If this is not true, that is, if the variance of u differs for different values of the x’s, then the errors are heteroskedastic
Example: in estimating the return to education, ability is unobservable, and we might think the variance in ability differs by educational attainment

2
Example of Heteroskedasticity
[Figure: conditional densities f(y|x) around the regression line E(y|x) = β0 + β1x, drawn at x1, x2, and x3; the spread of y around the line increases with x]
3
Why Worry About
Heteroskedasticity?
OLS is still unbiased and consistent even if we do not assume homoskedasticity
The usual OLS standard errors, however, are biased if we have heteroskedasticity
If the standard errors are biased, we cannot use the usual t statistics, F statistics, or LM statistics for drawing inferences

4
Variance with Heteroskedasticity

For the simple case,

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_i (x_i - \bar{x})\, u_i}{\sum_i (x_i - \bar{x})^2},$$

so

$$\operatorname{Var}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}, \qquad \text{where } SST_x = \sum_i (x_i - \bar{x})^2.$$

A valid estimator for this when $\sigma_i^2 \neq \sigma^2$ is

$$\widehat{\operatorname{Var}}(\hat{\beta}_1) = \frac{\sum_i (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2},$$

where the $\hat{u}_i$ are the OLS residuals.

5
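As a concreteness check, here is a minimal numpy sketch of this estimator for the simple regression case. The simulated data-generating process and all variable names are illustrative, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, n)
u = rng.normal(0, 1 + 0.3 * x)          # error variance grows with x: heteroskedastic
y = 2 + 0.5 * x + u

# OLS intercept, slope, and residuals for y = b0 + b1*x + u
xbar = x.mean()
sst_x = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - y.mean())) / sst_x
b0 = y.mean() - b1 * xbar
uhat = y - b0 - b1 * x

# Heteroskedasticity-valid estimator from the slide:
# sum (x_i - xbar)^2 * uhat_i^2 / SST_x^2
var_robust = np.sum((x - xbar) ** 2 * uhat ** 2) / sst_x ** 2

# Usual OLS variance estimate, valid only under homoskedasticity
var_usual = (uhat @ uhat / (n - 2)) / sst_x

print(np.sqrt(var_robust), np.sqrt(var_usual))   # the two standard errors differ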
Variance with Heteroskedasticity

For the general multiple regression model, a valid estimator of $\operatorname{Var}(\hat{\beta}_j)$ with heteroskedasticity is

$$\widehat{\operatorname{Var}}(\hat{\beta}_j) = \frac{\sum_i \hat{r}_{ij}^2\, \hat{u}_i^2}{SSR_j^2},$$

where $\hat{r}_{ij}$ is the $i$th residual from regressing $x_j$ on all other independent variables, and $SSR_j = \sum_i \hat{r}_{ij}^2$ is the sum of squared residuals from this regression.

6
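A sketch of this partialling-out formula in numpy (the function name and the use of `lstsq` are my choices; X is assumed to include a constant column):

```python
import numpy as np

def robust_var(X, y, j):
    """Heteroskedasticity-valid variance of the OLS coefficient on column j of X,
    via the partialling-out formula above. X must include a constant column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    uhat = y - X @ beta                        # OLS residuals from the full model
    others = np.delete(X, j, axis=1)
    gamma, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r = X[:, j] - others @ gamma               # residuals from regressing x_j on the rest
    return np.sum(r ** 2 * uhat ** 2) / np.sum(r ** 2) ** 2
```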
Robust Standard Errors
Now that we have a consistent estimate of
the variance, the square root can be used as
a standard error for inference
These are typically called robust standard errors
Sometimes the estimated variance is
corrected for degrees of freedom by
multiplying by n/(n – k – 1)
As n → ∞ it’s all the same, though

7
Robust Standard Errors (cont)
Important to remember that these robust standard errors only have asymptotic justification: with small sample sizes, t statistics formed with robust standard errors will not have a distribution close to the t, and inferences will not be correct
In Stata, robust standard errors are easily obtained using the robust option of reg
8
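For readers working in Python rather than Stata, a rough equivalent using statsmodels; to my understanding, `cov_type="HC1"` applies the same small-sample correction as Stata's robust option, and the simulated data here are purely illustrative:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2 + 0.5 * x + rng.normal(0, 1 + 0.3 * x)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit(cov_type="HC1")  # robust (heteroskedasticity-consistent) covariance
print(fit.bse)                          # robust standard errors
```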
A Robust LM Statistic
Run OLS on the restricted model and save the residuals ŭ
Regress each of the excluded variables on all of the included variables (q different regressions) and save each set of residuals ř1, ř2, …, řq
Regress a variable equal to 1 on the products ř1ŭ, ř2ŭ, …, řqŭ, with no intercept
The LM statistic is n − SSR1, where SSR1 is the sum of squared residuals from this final regression; see the sketch after this slide

9
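A numpy sketch of the four steps above (the function and argument names are mine; `X_incl` holds the included regressors plus a constant, `X_excl` the q excluded ones):

```python
import numpy as np

def robust_lm(y, X_incl, X_excl):
    """Heteroskedasticity-robust LM test of H0: coefficients on X_excl are zero.
    Returns the statistic n - SSR1, to be compared with a chi-square(q)."""
    n = len(y)
    # 1) residuals from the restricted model
    b, *_ = np.linalg.lstsq(X_incl, y, rcond=None)
    u = y - X_incl @ b
    # 2) residuals from regressing each excluded variable on the included ones
    G, *_ = np.linalg.lstsq(X_incl, X_excl, rcond=None)
    R = X_excl - X_incl @ G
    # 3) regress a vector of ones on the products r_j * u, with no intercept
    Z = R * u[:, None]
    ones = np.ones(n)
    c, *_ = np.linalg.lstsq(Z, ones, rcond=None)
    ssr1 = np.sum((ones - Z @ c) ** 2)
    # 4) LM = n - SSR1
    return n - ssr1
```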
Testing for Heteroskedasticity
Essentially we want to test H0: Var(u|x1, x2, …, xk) = σ², which is equivalent to H0: E(u²|x1, x2, …, xk) = E(u²) = σ²
If we assume the relationship between u² and the xj is linear, we can test this as a linear restriction
So, for u² = δ0 + δ1x1 + … + δkxk + v, this means testing H0: δ1 = δ2 = … = δk = 0

10
The Breusch-Pagan Test
We don’t observe the error, but we can estimate it with the residuals from the OLS regression
After regressing the squared residuals on all of the x’s, we can use the R² to form an F or LM test
The F statistic is just the reported F statistic for overall significance of the regression, F = [R²/k]/[(1 − R²)/(n − k − 1)], which is distributed F(k, n − k − 1)
The LM statistic is LM = nR², which is distributed χ²(k)

11
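A minimal implementation of the LM version (my own sketch; statsmodels also ships this test as `het_breuschpagan` in `statsmodels.stats.diagnostic`):

```python
import numpy as np
from scipy import stats

def breusch_pagan(y, X):
    """Breusch-Pagan LM test; X includes a constant. Returns (LM, p-value)."""
    n, k = X.shape
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u2 = (y - X @ b) ** 2
    # regress the squared residuals on all the x's and compute R^2
    g, *_ = np.linalg.lstsq(X, u2, rcond=None)
    e = u2 - X @ g
    r2 = 1 - (e @ e) / np.sum((u2 - u2.mean()) ** 2)
    lm = n * r2
    return lm, stats.chi2.sf(lm, k - 1)   # k - 1 slope coefficients under H0
```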
The White Test
The Breusch-Pagan test will detect any linear forms of heteroskedasticity
The White test allows for nonlinearities by using squares and cross products of all the x’s
Still just using an F or LM test of whether all the xj, xj², and xjxh are jointly significant
This can get unwieldy pretty quickly

12
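Rather than building the squares and cross products by hand, statsmodels provides a ready-made White test; a usage sketch on simulated (invented) data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, (500, 2))
y = 1 + x @ np.array([0.5, -0.2]) + rng.normal(0, 1 + 0.3 * x[:, 0])

X = sm.add_constant(x)
resid = sm.OLS(y, X).fit().resid
lm, lm_pval, f, f_pval = het_white(resid, X)  # adds squares and cross products internally
print(lm, lm_pval)
```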
Alternate form of the White test
Consider that the fitted values from OLS, ŷ, are a function of all the x’s
Thus, ŷ² will be a function of the squares and cross products, so ŷ and ŷ² can proxy for all of the xj, xj², and xjxh
Regress the squared residuals on ŷ and ŷ² and use the R² to form an F or LM statistic
Note we are only testing 2 restrictions now

13
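A sketch of this special form (my own implementation; the auxiliary regression has an intercept plus ŷ and ŷ², hence the 2 restrictions):

```python
import numpy as np
from scipy import stats

def white_special(y, X):
    """Special-case White test: regress uhat^2 on yhat and yhat^2."""
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    yhat = X @ b
    u2 = (y - yhat) ** 2
    Z = np.column_stack([np.ones(n), yhat, yhat ** 2])
    g, *_ = np.linalg.lstsq(Z, u2, rcond=None)
    e = u2 - Z @ g
    r2 = 1 - (e @ e) / np.sum((u2 - u2.mean()) ** 2)
    lm = n * r2
    return lm, stats.chi2.sf(lm, 2)   # chi-square with 2 df
```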
Weighted Least Squares
While it’s always possible to estimate
robust standard errors for OLS estimates, if
we know something about the specific form
of the heteroskedasticity, we can obtain
more efficient estimates than OLS
The basic idea is going to be to transform
the model into one that has homoskedastic
errors – called weighted least squares
14
Case of form being known up to
a multiplicative constant
Suppose the heteroskedasticity can be modeled as Var(u|x) = σ²h(x), where the trick is to figure out what h(x) ≡ hi looks like
E(ui/√hi|x) = 0, because hi is only a function of x, and Var(ui/√hi|x) = σ², because we know Var(u|x) = σ²hi
So, if we divide our whole equation by √hi, we have a model where the error is homoskedastic
15
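A small numpy illustration of the transformation, assuming for concreteness that h(x) = x, so Var(u|x) = σ²x; the data-generating process is invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1, 10, n)
y = 2 + 0.5 * x + rng.normal(0, np.sqrt(x))   # Var(u|x) = x, i.e. h(x) = x

# Divide the whole equation, including the constant, by sqrt(h_i)
w = np.sqrt(x)
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X / w[:, None], y / w, rcond=None)
print(beta)   # the transformed errors u_i / sqrt(h_i) are homoskedastic
```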
Generalized Least Squares
Estimating the transformed equation by
OLS is an example of generalized least
squares (GLS)
GLS will be BLUE in this case
GLS is a weighted least squares (WLS)
procedure where each squared residual is
weighted by the inverse of Var(ui|xi)

16
Weighted Least Squares
While it is intuitive to see why performing
OLS on a transformed equation is
appropriate, it can be tedious to do the
transformation
Weighted least squares is a way of getting
the same thing, without the transformation
Idea is to minimize the weighted sum of
squares (weighted by 1/hi)

17
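In statsmodels this is the `weights` argument of `WLS`; the sketch below (same invented h(x) = x as in the earlier transformation example) should reproduce the transformed-equation estimates exactly:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 500)
y = 2 + 0.5 * x + rng.normal(0, np.sqrt(x))   # h(x) = x again

# WLS with weights 1/h_i minimizes sum (y_i - b0 - b1*x_i)^2 / h_i,
# which is exactly OLS on the sqrt(h)-transformed equation
fit = sm.WLS(y, sm.add_constant(x), weights=1.0 / x).fit()
print(fit.params)
```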
More on WLS
WLS is great if we know what Var(ui|xi) looks like
In most cases, we won’t know the form of the heteroskedasticity
One example where we do is when the data are aggregated but the model is at the individual level
Averaging over mi individuals gives an error variance of σ²/mi, so hi = 1/mi and we want to weight each aggregate observation by mi, the number of individuals

18
Feasible GLS
More typical is the case where you don’t know the form of the heteroskedasticity
In this case, you need to estimate h(xi)
Typically, we start with a fairly flexible model, such as Var(u|x) = σ²exp(δ0 + δ1x1 + … + δkxk)
Since we don’t know the δj, we must estimate them

19
Feasible GLS (continued)
Our assumption implies that u² = σ²exp(δ0 + δ1x1 + … + δkxk)v, where E(v|x) = 1
If v is independent of x, taking logs gives ln(u²) = α0 + δ1x1 + … + δkxk + e, where E(e) = 0 and e is independent of x
Now, since û is an estimate of u, we can estimate this regression by OLS
20
Feasible GLS (continued)
Now, an estimate of h is obtained as ĥ = exp(ĝ), and the inverse of this is our weight
So, what did we do?
Run the original OLS model, save the residuals û, square them, and take the log
Regress ln(û²) on all of the independent variables and get the fitted values, ĝ
Do WLS using 1/exp(ĝ) as the weight, as in the sketch below

21
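The whole FGLS recipe in a few lines of statsmodels (the simulated data are invented; the exponential variance function matches the assumption on slide 19):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, (500, 2))
y = 1 + x @ np.array([0.5, -0.3]) + rng.normal(0, np.exp(0.2 * x[:, 0]))
X = sm.add_constant(x)

uhat = sm.OLS(y, X).fit().resid                         # 1) OLS, save residuals
ghat = sm.OLS(np.log(uhat ** 2), X).fit().fittedvalues  # 2) regress ln(uhat^2) on the x's
fgls = sm.WLS(y, X, weights=1.0 / np.exp(ghat)).fit()   # 3) WLS with weight 1/exp(ghat)
print(fgls.params)
```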
WLS Wrapup
When doing F tests with WLS, form the weights
from the unrestricted model and use those weights to
do WLS on the restricted model as well as the
unrestricted model
Remember we are using WLS just for efficiency –
OLS is still unbiased & consistent
Estimates will still be different due to sampling
error, but if they are very different then it’s likely that
some other Gauss-Markov assumption is false

22
