Example of 2SLS and Hausman Test
Consider a model
$$\text{LWAGE} = \beta_0 + \beta_1\,\text{Educ} + \beta_2\,\text{Exper} + \beta_3\,\text{ExperSQ} + e$$
where Educ is endogenous and Exper and ExperSQ are exogenous.
The instruments (IVs) for Educ are motheduc, fatheduc, and huseduc.
That is,
X1 = Z1 = [Exper, ExperSQ]
Z2 = [motheduc, fatheduc, huseduc]
Z = [Z1, Z2]
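In matrix form, the 2SLS estimator for this setup can be written as below. This is the standard textbook expression, not something taken from the Stata output; the symbols X, y, and P_Z are notation assumed here for illustration, with y = LWAGE and a constant column added to both X and Z.

```latex
% X = [1, Educ, X1] collects the regressors; Z = [1, Z1, Z2] collects all instruments.
\[
P_Z = Z (Z'Z)^{-1} Z', \qquad
\hat{\beta}_{2SLS} = (X' P_Z X)^{-1} X' P_Z\, y
\]
```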
Data: “2sls_mroz_428.dta”
(1) OLS: biased if Educ is endogenous
. regress lwage educ exper expersq
      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  3,   424) =   26.29
       Model |  35.0223023     3  11.6741008           Prob > F      =  0.0000
    Residual |  188.305149   424  .444115917           R-squared     =  0.1568
-------------+------------------------------           Adj R-squared =  0.1509
       Total |  223.327451   427  .523015108           Root MSE      =  .66642
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1074896   .0141465     7.60   0.000     .0796837    .1352956
       exper |   .0415665   .0131752     3.15   0.002     .0156697    .0674633
     expersq |  -.0008112   .0003932    -2.06   0.040    -.0015841   -.0000382
       _cons |  -.5220407   .1986321    -2.63   0.009    -.9124668   -.1316145
------------------------------------------------------------------------------
(2) Hausman test for endogeneity
. regress educ exper expersq motheduc fatheduc huseduc
------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0374977   .0343102     1.09   0.275    -.0299424    .1049379
     expersq |  -.0006002   .0010261    -0.58   0.559    -.0026171    .0014167
    motheduc |   .1141532   .0307835     3.71   0.000     .0536452    .1746613
    fatheduc |   .1060801   .0295153     3.59   0.000     .0480648    .1640955
     huseduc |   .3752548   .0296347    12.66   0.000     .3170049    .4335048
       _cons |   5.538311   .4597824    12.05   0.000     4.634562     6.44206
------------------------------------------------------------------------------
. predict edu_res, res
. regress lwage educ exper expersq edu_res
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0803918   .0216362     3.72   0.000     .0378639    .1229197
       exper |   .0430973    .013181     3.27   0.001      .017189    .0690057
     expersq |  -.0008628   .0003937    -2.19   0.029    -.0016366    -.000089
     edu_res |    .047189   .0285519     1.65   0.099    -.0089322    .1033102
       _cons |  -.1868574   .2835905    -0.66   0.510    -.7442794    .3705647
------------------------------------------------------------------------------
. test edu_res
( 1) edu_res = 0
F( 1, 423) = 2.73
Prob > F = 0.0991
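The p-value of 0.0991 means that exogeneity of educ is rejected at the 10% level but not at the 5% level.
Or, simply, run the user-written ivendog command after an ivreg estimation such as the one in (3):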
. ivendog
Tests of endogeneity of: educ
H0: Regressor is exogenous
Wu-Hausman F test: 2.73157 F(1,423) P-value = 0.09912
Durbin-Wu-Hausman chi-sq test: 2.74613 Chi-sq(1) P-value = 0.09749
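The regression-based test above can be sketched outside Stata as follows. This is a minimal NumPy illustration on synthetic data, not the Mroz sample; the function names `ols` and `endogeneity_test` and the data-generating numbers are assumptions made for the example.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients, conventional standard errors, and residuals."""
    n, k = X.shape
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se, resid

def endogeneity_test(y, endog, exog, excluded):
    """Regression-based endogeneity test for a single endogenous regressor.

    exog must contain the constant column. Returns the coefficients of the
    augmented regression and the F statistic (squared t) on the residual.
    """
    Z = np.column_stack([exog, excluded])
    # Step 1: first-stage residual (the analogue of `predict edu_res, res`).
    _, _, v_hat = ols(endog, Z)
    # Step 2: add the residual to the structural equation.
    W = np.column_stack([endog, exog, v_hat])
    coef, se, _ = ols(y, W)
    f_stat = (coef[-1] / se[-1]) ** 2  # F(1, n-k) equals t squared for one restriction
    return coef, f_stat

# Synthetic data; hypothetical, NOT the Mroz sample.
rng = np.random.default_rng(0)
n = 428
excluded = rng.normal(size=(n, 3))        # three excluded instruments
exper = rng.normal(size=n)                # exogenous regressor
u = rng.normal(size=n)                    # structural error
v = 0.5 * u + rng.normal(size=n)          # correlation with u makes educ endogenous
educ = excluded @ np.array([0.3, 0.3, 0.6]) + 0.1 * exper + v
lwage = 0.08 * educ + 0.04 * exper + u
exog = np.column_stack([np.ones(n), exper])

coef, f_stat = endogeneity_test(lwage, educ, exog, excluded)
print("coefficient on educ:", coef[0])
print("F statistic on the first-stage residual:", f_stat)
```

As in the Stata output, the coefficient on the endogenous regressor in the augmented regression coincides with the 2SLS coefficient; only the residual term is new.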
(3) 2SLS
. ivreg lwage (educ = motheduc fatheduc huseduc) exper expersq
Instrumental variables (2SLS) regression
      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  3,   424) =   11.52
       Model |  33.3927427     3  11.1309142           Prob > F      =  0.0000
    Residual |  189.934709   424  .447959218           R-squared     =  0.1495
-------------+------------------------------           Adj R-squared =  0.1435
       Total |  223.327451   427  .523015108           Root MSE      =   .6693
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0803918    .021774     3.69   0.000     .0375934    .1231901
       exper |   .0430973   .0132649     3.25   0.001     .0170242    .0691704
     expersq |  -.0008628   .0003962    -2.18   0.030    -.0016415   -.0000841
       _cons |  -.1868574   .2853959    -0.65   0.513    -.7478243    .3741096
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   exper expersq motheduc fatheduc huseduc
------------------------------------------------------------------------------
(4) My own 2 stages
1st stage
. regress educ exper expersq motheduc fatheduc huseduc
      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  5,   422) =   63.30
       Model |  955.830608     5  191.166122           Prob > F      =  0.0000
    Residual |  1274.36565   422  3.01982382           R-squared     =  0.4286
-------------+------------------------------           Adj R-squared =  0.4218
       Total |  2230.19626   427  5.22294206           Root MSE      =  1.7378
------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       exper |   .0374977   .0343102     1.09   0.275    -.0299424    .1049379
     expersq |  -.0006002   .0010261    -0.58   0.559    -.0026171    .0014167
    motheduc |   .1141532   .0307835     3.71   0.000     .0536452    .1746613
    fatheduc |   .1060801   .0295153     3.59   0.000     .0480648    .1640955
     huseduc |   .3752548   .0296347    12.66   0.000     .3170049    .4335048
       _cons |   5.538311   .4597824    12.05   0.000     4.634562     6.44206
------------------------------------------------------------------------------
. predict edu_pre
(option xb assumed; fitted values)
2nd stage
. regress lwage edu_pre exper expersq edu_res
      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  4,   423) =   20.48
       Model |  36.2305033     4  9.05762583           Prob > F      =  0.0000
    Residual |  187.096948   423   .44230957           R-squared     =  0.1622
-------------+------------------------------           Adj R-squared =  0.1543
       Total |  223.327451   427  .523015108           Root MSE      =  .66506
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     edu_pre |   .0803918   .0216362     3.72   0.000     .0378639    .1229197
       exper |   .0430973    .013181     3.27   0.001      .017189    .0690057
     expersq |  -.0008628   .0003937    -2.19   0.029    -.0016366    -.000089
     edu_res |   .1275808   .0186301     6.85   0.000     .0909616       .1642
       _cons |  -.1868573   .2835905    -0.66   0.510    -.7442793    .3705647
------------------------------------------------------------------------------
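Note that the coefficients on edu_pre, exper, and expersq are the same as the 2SLS coefficients in (3), but the standard errors differ. Including edu_res in the 2nd stage does not change these coefficients, because the 1st-stage residual is orthogonal to edu_pre, exper, and expersq. The manual 2nd stage computes the error variance from the wrong residuals, so use the ivreg command, which reports the correct 2SLS standard errors.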
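The difference between a hand-rolled second stage and proper 2SLS standard errors can be sketched as below. This is a minimal NumPy illustration on synthetic data, not the Mroz sample; the function name `two_sls`, the n - k degrees-of-freedom choice, and the data-generating numbers are assumptions for the example. The naive second stage here omits the first-stage residual, so its standard errors will not match the manual regression shown above.

```python
import numpy as np

def two_sls(y, X, Z):
    """2SLS of y on X with instrument matrix Z (both include the constant).

    Returns the coefficients, the naive second-stage standard errors, and
    the corrected 2SLS standard errors.
    """
    n, k = X.shape
    # First stage: project every column of X on Z
    # (exogenous columns are reproduced exactly).
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the fitted regressors.
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    bread = np.diag(np.linalg.inv(X_hat.T @ X_hat))
    # Naive residuals use the fitted X_hat; correct residuals use the original X.
    e_naive = y - X_hat @ beta
    e_2sls = y - X @ beta
    se_naive = np.sqrt(e_naive @ e_naive / (n - k) * bread)
    se_2sls = np.sqrt(e_2sls @ e_2sls / (n - k) * bread)
    return beta, se_naive, se_2sls

# Synthetic data; hypothetical, NOT the Mroz sample.
rng = np.random.default_rng(0)
n = 428
excluded = rng.normal(size=(n, 3))        # three excluded instruments
exper = rng.normal(size=n)                # exogenous regressor
u = rng.normal(size=n)                    # structural error
v = 0.5 * u + rng.normal(size=n)          # first-stage error, correlated with u
educ = excluded @ np.array([0.3, 0.3, 0.6]) + 0.1 * exper + v
lwage = 0.08 * educ + 0.04 * exper + u

const = np.ones(n)
X = np.column_stack([const, educ, exper])
Z = np.column_stack([const, exper, excluded])
beta, se_naive, se_2sls = two_sls(lwage, X, Z)
print("coefficients:       ", beta)
print("naive std. errors:  ", se_naive)
print("2SLS std. errors:   ", se_2sls)
```

Both sets of standard errors share the same matrix term and differ only in the residual variance estimate, which is why the coefficients agree while the reported precision does not.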
(5) Weak IV test
1st stage reduced form regression
. regress educ exper expersq motheduc fatheduc huseduc
(output omitted; identical to the 1st-stage regression output in (4))
. test motheduc fatheduc huseduc
( 1) motheduc = 0
( 2) fatheduc = 0
( 3) huseduc = 0
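F( 3, 422) = 104.29
Prob > F = 0.0000
The F statistic of 104.29 for the joint significance of the three instruments is far above the common rule-of-thumb threshold of 10, so the instruments do not appear to be weak.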
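The joint F test on the excluded instruments can be sketched from restricted and unrestricted sums of squared residuals. This is a minimal NumPy illustration on synthetic data, not the Mroz sample; the function names `ssr` and `first_stage_f` and the data-generating numbers are assumptions for the example.

```python
import numpy as np

def ssr(y, X):
    """Sum of squared OLS residuals from regressing y on X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

def first_stage_f(endog, exog, excluded):
    """F statistic for H0: all coefficients on the excluded instruments are zero.

    exog must contain the constant column.
    """
    n = len(endog)
    Z = np.column_stack([exog, excluded])
    q = excluded.shape[1]                 # number of restrictions
    ssr_r = ssr(endog, exog)              # restricted: instruments dropped
    ssr_u = ssr(endog, Z)                 # unrestricted: full first stage
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - Z.shape[1]))

# Synthetic data; hypothetical, NOT the Mroz sample.
rng = np.random.default_rng(0)
n = 428
strong = rng.normal(size=(n, 3))          # instruments that drive educ
weak = rng.normal(size=(n, 3))            # pure noise, unrelated to educ
exper = rng.normal(size=n)
educ = strong @ np.array([0.3, 0.3, 0.6]) + 0.1 * exper + rng.normal(size=n)
exog = np.column_stack([np.ones(n), exper])

f_strong = first_stage_f(educ, exog, strong)
f_weak = first_stage_f(educ, exog, weak)
print("F with relevant instruments:  ", f_strong)
print("F with irrelevant instruments:", f_weak)
```

Comparing the two cases shows what the statistic is meant to detect: instruments that explain little of the endogenous regressor produce a small F.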
(6) Over-identifying restriction test
. overid
Tests of overidentifying restrictions:
Sargan N*R-sq test 1.115 Chi-sq(2) P-value = 0.5726
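Basmann test 1.102 Chi-sq(2) P-value = 0.5763
With three instruments for one endogenous regressor there are two overidentifying restrictions. Both p-values are well above 0.10, so the null hypothesis that the instruments are valid is not rejected.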
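The Sargan N*R-sq statistic can be sketched by regressing the 2SLS residuals on all instruments. This is a minimal NumPy illustration on synthetic data, not the Mroz sample; the function names `sargan` and `chi2_sf_2df` and the data-generating numbers are assumptions for the example. The p-value helper uses the closed form that holds only for a chi-square with exactly 2 degrees of freedom, which is the case here.

```python
import math
import numpy as np

def sargan(y, X, Z):
    """Sargan N*R-sq statistic: regress 2SLS residuals on all instruments.

    X and Z must both include the constant column.
    """
    n = len(y)
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    e = y - X @ beta                       # 2SLS residuals use the original X
    e_fit = Z @ np.linalg.lstsq(Z, e, rcond=None)[0]
    r2 = (e_fit @ e_fit) / (e @ e)         # uncentered R-sq; e has mean zero here
    df = Z.shape[1] - X.shape[1]           # number of overidentifying restrictions
    return n * r2, df

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Synthetic data with valid instruments; hypothetical, NOT the Mroz sample.
rng = np.random.default_rng(0)
n = 428
excluded = rng.normal(size=(n, 3))
exper = rng.normal(size=n)
u = rng.normal(size=n)
v = 0.5 * u + rng.normal(size=n)
educ = excluded @ np.array([0.3, 0.3, 0.6]) + 0.1 * exper + v
lwage = 0.08 * educ + 0.04 * exper + u

const = np.ones(n)
X = np.column_stack([const, educ, exper])
Z = np.column_stack([const, exper, excluded])
stat, df = sargan(lwage, X, Z)
print("Sargan statistic:", stat, "df:", df, "p-value:", chi2_sf_2df(stat))

# Check against the reported Sargan statistic of 1.115 with 2 df.
print("p-value for 1.115:", chi2_sf_2df(1.115))
```

Applying the helper to the reported Sargan statistic of 1.115 gives a p-value that rounds to 0.5726, in line with the overid output.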