18-Econometrics-Linear Regression

The document discusses the use of instrumental variables (IV) in econometrics, particularly in the context of endogenous regressors and their impact on regression analysis. It explains the process of transforming equations to achieve homoskedasticity, the importance of choosing appropriate instruments, and the implications of using IV estimators compared to ordinary least squares (OLS). Additionally, it provides examples and exercises related to estimating returns to schooling using parental education as instruments.

Uploaded by

Lorenzo Lucchesi


Econometrics

University of Milan-Bicocca

Course lecturer:
Maryam Ahmadi
[email protected]

Endogenous Regressors and Instrumental Variables
Problem 17 & Answer.
1- Consider a linear model to explain monthly beer consumption:
𝑏𝑒𝑒𝑟 = 𝛽0 + 𝛽1 𝑖𝑛𝑐 + 𝛽2 𝑝𝑟𝑖𝑐𝑒 + 𝛽3 𝑒𝑑𝑢𝑐 + 𝛽4 𝑓𝑒𝑚𝑎𝑙𝑒 + 𝑢
E(u|inc, price, educ, female) = 0
Var(u|inc, price, educ, female) = 𝜎²inc²
Write the transformed equation that has a homoskedastic error term.

Var(u|inc, price, educ, female) = 𝜎²inc² → h(x) = inc², where h(x) is the heteroskedasticity function. Therefore √h(x) = inc (income is positive), and the transformed equation is obtained by dividing the original equation by inc:
𝑏𝑒𝑒𝑟/𝑖𝑛𝑐 = 𝛽0 (1/𝑖𝑛𝑐) + 𝛽1 + 𝛽2 (𝑝𝑟𝑖𝑐𝑒/𝑖𝑛𝑐) + 𝛽3 (𝑒𝑑𝑢𝑐/𝑖𝑛𝑐) + 𝛽4 (𝑓𝑒𝑚𝑎𝑙𝑒/𝑖𝑛𝑐) + 𝑢/𝑖𝑛𝑐
Notice that 𝛽1, which is the slope on inc in the original model, is now the constant in the transformed equation. This is simply a consequence of the form of the heteroskedasticity and the functional forms of the explanatory variables in the original equation.
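As a quick check that the transformed error is homoskedastic, the variance of u/inc can be worked out directly from the stated assumption Var(u|x) = σ²inc², where x stands for (inc, price, educ, female). Conditional on x, inc is a constant and can be pulled out of the variance:

```latex
\operatorname{Var}\!\left(\frac{u}{\mathit{inc}} \,\middle|\, x\right)
  = \frac{1}{\mathit{inc}^{2}} \operatorname{Var}(u \mid x)
  = \frac{\sigma^{2}\,\mathit{inc}^{2}}{\mathit{inc}^{2}}
  = \sigma^{2}
```

The transformed error therefore has constant variance σ², so OLS on the transformed equation (weighted least squares) satisfies the homoskedasticity assumption.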
2- Consider the model y = 𝛽0 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + 𝑢, and suppose that cov(𝑢, 𝑥2) ≠ 0.

a) Is it possible to still make appropriate inferences based on the OLS estimator, while adjusting the standard errors appropriately?
No. If E(𝑢·𝑥2) ≠ 0, the OLS estimator is biased and inconsistent, no matter what other assumptions we make. Correcting the standard errors does not remove the bias.

b) Explain how an instrumental variable, z, leads to a new moment condition and, consequently, an alternative estimator for 𝛽.
An instrumental variable z gives rise to a new moment condition that can replace the invalid one. Written for a single regressor x: E(𝑢·𝑧) = E{(𝑦 − 𝛽𝑥)·𝑧} = 0. This follows from the exogeneity of z, that is, cov(z, u) = 0. It leads to the IV estimator 𝛽̂_IV = cov(z, y) / cov(z, x), the ratio of the sample covariance of z and y to the sample covariance of z and x.
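The contrast between the two moment conditions can be illustrated with a small simulation. This is only a sketch under assumed values: the data-generating process, the coefficients 0.8 and 0.5, the sample size and the function name `simulate_ols_vs_iv` are all hypothetical and not taken from the lecture.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def simulate_ols_vs_iv(n=20000, beta=1.0, seed=0):
    """Simulate y = beta*x + u with cov(x, u) != 0 and a valid instrument z."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument, independent of u
    u = [rng.gauss(0, 1) for _ in range(n)]  # structural error
    # x depends on z (relevance) and on u (endogeneity); values are assumptions
    x = [0.8 * zi + 0.5 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [beta * xi + ui for xi, ui in zip(x, u)]
    b_ols = cov(x, y) / cov(x, x)  # uses E(u*x) = 0, which fails here
    b_iv = cov(z, y) / cov(z, x)   # uses E(u*z) = 0, which holds here
    return b_ols, b_iv

b_ols, b_iv = simulate_ols_vs_iv()
print(f"OLS: {b_ols:.3f}   IV: {b_iv:.3f}   true beta: 1.000")
```

In this design the OLS slope settles clearly above the true value of 1, while the IV ratio stays close to it, which is the point of replacing the invalid moment condition.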
c) Why does this alternative estimator lead to a smaller R² than the OLS one? What does this say about the R² as a measure for the adequacy of the model?
OLS minimizes the residual sum of squares and therefore maximizes the R². Any other estimator, including instrumental variables, results in a lower R². Note that we are not interested in obtaining an R² that is as high as possible, but in obtaining consistent estimates of the coefficients of interest that are as accurate as possible. The R² does not tell us which estimator is the preferred one. The R² tells us how well the model fits the data (in a given sample) and is typically only interpreted in this way when the model is estimated by ordinary least squares.

d) Why can we not choose z = 𝑥1 as an instrument for 𝑥2, even if E(𝑥1·u) = 0? Would it be possible to use 𝑥1² as an instrument for 𝑥2?
We cannot use x1 as an instrument for x2 because x1 is already included in the model, so it supplies no additional moment condition.
In theory, it is possible to use x1² as an instrument for x2. However, being uncorrelated with u is a necessary condition for an instrumental variable, not a sufficient one. An instrument should be correlated with x2, uncorrelated with u, and its use should have an intuitive justification.

Example: Education in a wage equation

• Individual ability is included in u and is correlated with education
• Education is therefore endogenous
• We need an instrumental variable (z) that is correlated with education but uncorrelated with ability
• We choose father's education as the instrument (z)

Use MROZ.dta

. ivregress 2sls lwage (educ = fatheduc)

Or
. reg educ fatheduc
. predict educhat
. reg lwage educhat

With the manual two-step procedure you will get exactly the coefficients of the 2SLS/IV model, provided both regressions use the same observations (but you will get different standard errors).
In the manual second-stage regression, the residuals are:
r = y − (educhat)𝛽̂
But these are not the right residuals for 2SLS/IV. Because we are fitting a structural model, we are interested in the residuals using the actual values of the endogenous variable.

The correct two-stage least-squares residuals are:
e = y − (educ)𝛽̂
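The difference between the two sets of residuals can be seen in a simulated example. The sketch below uses made-up data rather than MROZ.dta; the coefficients, the sample size and the names `ssr_manual` and `ssr_structural` are assumptions chosen for illustration.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def ols(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

rng = random.Random(1)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]  # instrument
u = [rng.gauss(0, 1) for _ in range(n)]  # structural error
x = [0.8 * zi + 0.5 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]  # endogenous
y = [1.0 + 1.0 * xi + ui for xi, ui in zip(x, u)]

# Stage 1: regress x on z and form fitted values
p0, p1 = ols(z, x)
xhat = [p0 + p1 * zi for zi in z]

# Stage 2 by hand: regress y on the fitted values
b0, b1 = ols(xhat, y)
b_iv = cov(z, y) / cov(z, x)  # direct IV formula for comparison

# Residuals from the manual second stage (not the 2SLS residuals)
ssr_manual = sum((yi - b0 - b1 * xh) ** 2 for yi, xh in zip(y, xhat))
# Structural residuals: same coefficients, actual x instead of fitted x
ssr_structural = sum((yi - b0 - b1 * xi) ** 2 for yi, xi in zip(y, x))

print(f"two-step slope: {b1:.4f}   IV slope: {b_iv:.4f}")
print(f"SSR manual: {ssr_manual:.1f}   SSR structural: {ssr_structural:.1f}")
```

The two slopes agree up to rounding, but the residual sums of squares differ, which is why standard errors taken from the manual second stage are not the 2SLS standard errors.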
Importance of choosing the right instrument

• If x and z are only slightly correlated, the sampling variance of 𝛽̂_IV can be very large. The higher the correlation between z and x, the smaller the variance of the IV estimator.

• This highlights an important cost of performing IV estimation when x and u are actually uncorrelated: OLS would then be consistent and more precise, so IV gives up precision without any gain.

• This also highlights the importance of choosing an instrument z that satisfies the instrument relevance assumption (Cov(z,x) ≠ 0).
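The link between instrument strength and precision can be made explicit for the simple regression model. Under the additional assumption of homoskedasticity, E(u²|z) = σ², the standard asymptotic variances are as follows, where σₓ² is the variance of x and ρ_{x,z} is the correlation between x and z (the notation is introduced here for illustration):

```latex
\operatorname{Avar}\bigl(\hat{\beta}_{IV}\bigr) = \frac{\sigma^{2}}{n\,\sigma_{x}^{2}\,\rho_{x,z}^{2}},
\qquad
\operatorname{Avar}\bigl(\hat{\beta}_{OLS}\bigr) = \frac{\sigma^{2}}{n\,\sigma_{x}^{2}}
```

Since ρ²_{x,z} ≤ 1, the IV variance is never smaller than the OLS variance. With ρ_{x,z} = 0.1, for example, the IV variance is 100 times the OLS variance, so the standard error is about 10 times larger.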
An example of an irrelevant instrument, for which Cov(x,z) ≠ 0 does not hold

The log of birth weight, lbwght, is regressed on packs, the number of packs of cigarettes that the mother smoked per day during pregnancy.

We might worry that packs is correlated with other health factors or the availability of good prenatal care, as well as with the mother's education, so packs is endogenous and the zero conditional mean assumption is violated.

A possible instrumental variable for packs is the average price of cigarettes, cigprice.
We assume that cigprice is correlated with packs (instrument relevance) but uncorrelated with u, which contains the health factors (instrument exogeneity).
. ivregress 2sls lbwght (packs = cigprice), first

First-stage regressions

Number of obs = 1,388
F(1, 1386) = 0.13
Prob > F = 0.7179
R-squared = 0.0001
Adj R-squared = -0.0006
Root MSE = 0.2987

packs        Coef.       Std. Err.   t      P>|t|   [95% Conf. Interval]
cigprice     .0002829    .000783     0.36   0.718   -.0012531   .0018188
_cons        .0674257    .1025384    0.66   0.511   -.1337215   .2685728

Instrumental variables (2SLS) regression

Number of obs = 1,388
Wald chi2(1) = 0.12
Prob > chi2 = 0.7310
R-squared = .
Root MSE = .93818

lbwght       Coef.       Std. Err.   z      P>|z|   [95% Conf. Interval]
packs        2.988676    8.692619    0.34   0.731   -14.04854   20.0259
_cons        4.448136    .9075006    4.90   0.000   2.669468    6.226805

Instrumented: packs
Instruments: cigprice

The estimation results show that:

• In the first stage of estimation, there is no relationship between cigprice and packs of smoked cigarettes (the relevance assumption is violated).

• The IV estimation results show that the coefficient on packs is huge and has an unexpected sign.

• The standard error of packs is also very large, as a result of the low correlation between cigprice and packs.

• This estimation fails because there is no evidence that Cov(cigprice, packs) differs from zero, and therefore the relevance assumption (Cov(z,x) ≠ 0) is violated.

• This is a case of an irrelevant instrument; however, we can also face the problem of a weak instrument, in which the covariance between x and z is not zero but is very small.
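The behaviour of the IV estimator with a nearly irrelevant instrument can be reproduced in a small Monte Carlo experiment. This is a sketch on simulated data, not the birth weight data: the first-stage coefficients 0.8 (strong) and 0.02 (weak), the sample size and the number of replications are assumptions. The interquartile range is used as the measure of spread because IV estimates with a weak instrument have very heavy tails.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

def iv_draw(rng, n, pi):
    """IV slope from one sample of y = x + u, x = pi*z + 0.5*u + noise."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [pi * zi + 0.5 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [xi + ui for xi, ui in zip(x, u)]
    return cov(z, y) / cov(z, x)

def iqr(values):
    """Interquartile range, using simple order statistics."""
    s = sorted(values)
    return s[(3 * len(s)) // 4] - s[len(s) // 4]

rng = random.Random(42)
reps, n = 200, 500
strong = [iv_draw(rng, n, pi=0.8) for _ in range(reps)]   # relevant instrument
weak = [iv_draw(rng, n, pi=0.02) for _ in range(reps)]    # nearly irrelevant

iqr_strong, iqr_weak = iqr(strong), iqr(weak)
print(f"IQR of IV estimates, strong instrument: {iqr_strong:.3f}")
print(f"IQR of IV estimates, weak instrument:   {iqr_weak:.3f}")
```

With the weak instrument the estimates are scattered far more widely around the true slope of 1, in line with the huge coefficient and standard error on packs in the output above.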

IV estimation in the multiple regression model

𝑦 = 𝛽0 + 𝛽1 𝑥1 + 𝛽2 𝑥2 + ⋯ + 𝛽𝑘 𝑥𝑘 + 𝑢

• 𝑦 is the dependent variable
• 𝑥1 is the endogenous regressor (correlated with 𝑢)
• 𝑥2 to 𝑥𝑘 are the exogenous variables, or included exogenous regressors (uncorrelated with 𝑢)
• z is the instrumental variable. It
  - does not appear in the regression equation
  - is uncorrelated with the error term
  - is partially correlated with the endogenous explanatory variable
Stage one.

Regress 𝑥1 on all the exogenous variables, that is, on 𝑥2 to 𝑥𝑘 and z, by OLS:

𝑥1 = 𝜋1 + 𝜋2 𝑥2 + … + 𝜋𝑘 𝑥𝑘 + 𝜋𝑘+1 𝑧 + 𝑣

• This is called the “reduced form regression”. In a regression of the endogenous explanatory variable on all exogenous variables, the instrumental variable must have a non-zero coefficient.

• What matters is a statistically significant coefficient on 𝑧, since z is the instrumental variable and must be partially correlated with 𝑥1 (𝜋𝑘+1 ≠ 0). The significance of the other variables doesn’t matter.

• Moreover, for all exogenous variables, Cov(𝑥2, 𝑢) = 0, Cov(𝑥3, 𝑢) = 0, …, Cov(𝑥𝑘, 𝑢) = 0.

• Compute the predicted values of 𝑥1:

𝑥̂1 = 𝜋̂1 + 𝜋̂2 𝑥2 + … + 𝜋̂𝑘 𝑥𝑘 + 𝜋̂𝑘+1 𝑧
Stage two.

Regress 𝑦 on 𝑥̂1 and 𝑥2 to 𝑥𝑘 using OLS:

𝑦 = 𝛽0 + 𝛽1 𝑥̂1 + 𝛽2 𝑥2 + ⋯ + 𝛽𝑘 𝑥𝑘 + 𝑒𝑟𝑟𝑜𝑟

• This is Two Stage Least Squares (2SLS) estimation
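The two stages can be written out by hand for a model with one endogenous regressor x1, one included exogenous regressor x2 and one instrument z. The sketch below runs on simulated data with assumed coefficients (true 𝛽1 = 2, 𝛽2 = −1); the helper names `solve` and `ols` are hypothetical, and the small linear solver only stands in for a statistics package.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(columns, y):
    """OLS coefficients of y on the given columns; the intercept comes first."""
    X = [[1.0] + list(row) for row in zip(*columns)]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

rng = random.Random(7)
n = 5000
x2 = [rng.gauss(0, 1) for _ in range(n)]  # included exogenous regressor
z = [rng.gauss(0, 1) for _ in range(n)]   # instrument, excluded from the model
u = [rng.gauss(0, 1) for _ in range(n)]   # structural error
# x1 is endogenous: it depends on u as well as on x2 and z
x1 = [0.5 + 0.4 * a + 0.8 * b + 0.5 * e + rng.gauss(0, 1)
      for a, b, e in zip(x2, z, u)]
y = [1.0 + 2.0 * a - 1.0 * b + e for a, b, e in zip(x1, x2, u)]

# Stage one: regress x1 on all exogenous variables (x2 and z), then predict
pi = ols([x2, z], x1)
x1hat = [pi[0] + pi[1] * a + pi[2] * b for a, b in zip(x2, z)]

# Stage two: regress y on the fitted x1 and on x2
b_2sls = ols([x1hat, x2], y)
b_ols = ols([x1, x2], y)  # plain OLS for comparison

print("2SLS (const, x1, x2):", [round(v, 3) for v in b_2sls])
print("OLS  (const, x1, x2):", [round(v, 3) for v in b_ols])
```

In this design the 2SLS coefficient on x1 is close to 2, while OLS overstates it. As noted for the single-regressor case, the standard errors from a hand-run second stage are not the correct 2SLS standard errors, so in practice ivregress 2sls is the safer route.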

Example. Using the data SCHOOLING, the log of wage is regressed on a set of explanatory variables:
lwage = 𝛽0 + 𝛽1 𝑒𝑑𝑢𝑐 + 𝛽2 𝑒𝑥𝑝𝑒𝑟 + 𝛽3 𝑒𝑥𝑝𝑒𝑟² + 𝛽4 𝑏𝑙𝑎𝑐𝑘 + 𝛽5 𝑠𝑚𝑠𝑎 + 𝛽6 𝑠𝑜𝑢𝑡ℎ + 𝑢
smsa is a dummy variable for living in an SMSA (metropolitan area).

• Suppose educ is an endogenous variable, correlated with the error term: the ability of the person is in the error term and can be correlated with both education and wage.

• Ability is unobservable, so we don't have data for this variable. Therefore, we need an instrument for education that is correlated with education but uncorrelated with ability.

• We choose nearc4 as the instrument for educ. It is a dummy variable equal to 1 if the family lived near a four-year college in 1966. It is assumed to be correlated with educ and uncorrelated with ability (the error term).

• Typically, an instrument is thought of as a variable that affects the costs of schooling (and thus the choice of schooling) but not earnings.
OLS estimation results of regressing log(wage) on education, experience, experience-squared and three dummy
variables indicating whether the individual is black, lived in a metropolitan area (SMSA) and lived in the south:

The reduced form, explaining the endogenous regressors from the exogenous regressors and the instruments, should show a significant effect of the instruments. (If weak: weak instruments problem.)

IV estimates are (much) less accurate than OLS (how much depends upon the correlation of the instruments with the endogenous regressors).

The fact that the IV estimate of the returns to schooling is higher than the OLS estimate suggests that OLS underestimates the true causal effect of schooling. This is at odds with the ‘ability bias’.

The downward bias of OLS could be due to
• measurement error, or
• the possibility that the true returns to schooling vary across individuals and are negatively related to schooling.

There is no unique definition of an R² if the model is not estimated by ordinary least squares. When we estimate the model by instrumental variables methods, goodness-of-fit is not what we are after. Our goal was to consistently estimate the causal effect of schooling on wage, and that is exactly what instrumental variables is trying to do.

Again, the R² plays no role in comparing alternative estimators.

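The measurement-error explanation for a downward-biased OLS estimate can be made concrete with the classical errors-in-variables result. As an illustrative assumption (not stated in the slides), suppose observed schooling x equals true schooling x* plus an error e that is uncorrelated with x* and with u, and that there are no other regressors. Then the OLS slope is attenuated towards zero:

```latex
x = x^{*} + e, \qquad
\operatorname{plim}\, \hat{\beta}_{1}^{OLS}
  = \beta_{1}\,\frac{\sigma_{x^{*}}^{2}}{\sigma_{x^{*}}^{2} + \sigma_{e}^{2}}
```

The factor multiplying 𝛽1 lies between 0 and 1, so with a positive return to schooling OLS understates it. An instrument that is uncorrelated with the measurement error removes this attenuation, which is one way an IV estimate can exceed the OLS estimate.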
Problem 18
Consider the data SCHOOLING. The purpose of this exercise is to explore the role of
parents’ education as instruments to estimate the returns to schooling.

a. Estimate a reduced form for schooling that includes mother’s and father’s education levels instead of the lived-near-college dummy. What do these results indicate about the possibility of using parents’ education as instruments?

b. Estimate the returns to schooling, on the basis of the same specification as in the example, using mother’s and father’s education as instruments.

c. Re-estimate the model, also using the lived-near-college dummy.

d. Compare and interpret the different estimates of the returns to schooling from the example and from parts b and c of this exercise.

• The command for more than one instrument for an endogenous regressor is the same as for one instrument; just add the other instruments, e.g. ivregress 2sls lwage76 ( ed76 = nearc4 momed daded ) …