EC3090 Econometrics
Topic 3: The Multiple Regression Model
Reading:
Wooldridge, Chapter 3
Gujarati and Porter, Chapter 7
Topic 3: The Multiple Regression Model
1. The model with two independent variables
Say we have information on more variables that theory tells us may
influence Y:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
$\beta_0$: measures the average value of Y when X1 and X2 are zero
$\beta_1$ and $\beta_2$ are the partial regression coefficients/slope coefficients, which
measure the ceteris paribus effect of X1 and X2 on Y, respectively
Key assumption:
$E(u_i \mid X_{1i}, X_{2i}) = 0$
For k independent variables:
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i$
$E(u_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = 0$
$\mathrm{Cov}(u_i, X_{1i}) = \mathrm{Cov}(u_i, X_{2i}) = \cdots = \mathrm{Cov}(u_i, X_{ki}) = 0$
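The zero conditional mean assumption can be made concrete with a small simulation; the parameter values and the numpy setup below are illustrative assumptions, not taken from the notes:

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(0)
n = 10_000
beta0, beta1, beta2 = 1.0, 0.5, -2.0

X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
u = rng.normal(size=n)   # drawn independently of X1 and X2, so E(u | X1, X2) = 0

Y = beta0 + beta1 * X1 + beta2 * X2 + u

# Under the key assumption, the sample covariances Cov(u, X1) and Cov(u, X2)
# should be close to zero in a large sample.
print(np.cov(u, X1)[0, 1], np.cov(u, X2)[0, 1])
```
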
Topic 3: The Multiple Regression Model
2. OLS estimation of the multiple regression model
Simultaneously choose the values of the unknown parameters of the
population model that minimise the sum of the squared residuals:
$\min \; \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \cdots - \hat{\beta}_k X_{ki} \right)^2$
The first order conditions are given by the k+1 equations:
$\sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \cdots - \hat{\beta}_k X_{ki} \right) = 0$
$\sum_{i=1}^{n} X_{1i} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \cdots - \hat{\beta}_k X_{ki} \right) = 0$
$\vdots$
$\sum_{i=1}^{n} X_{ki} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} - \cdots - \hat{\beta}_k X_{ki} \right) = 0$
Note: these equations can also be obtained using method of moments (MM) estimation
Topic 3: The Multiple Regression Model
2. OLS estimation of the multiple regression model
Consider the case where k=2. OLS requires that
we minimise:
$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} \right)^2$
The first order conditions are given by the 3 equations:
$\sum_{i=1}^{n} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} \right) = 0$
$\sum_{i=1}^{n} X_{1i} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} \right) = 0$
$\sum_{i=1}^{n} X_{2i} \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \hat{\beta}_2 X_{2i} \right) = 0$
Solve simultaneously to find the OLS parameter estimators
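Solving the three first-order conditions simultaneously can be sketched with numpy: the conditions are the normal equations $X'X\hat{\beta} = X'Y$, where $X$ stacks a column of ones, $X_1$ and $X_2$. The data-generating values below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X1 = rng.normal(size=n)
X2 = 0.5 * X1 + rng.normal(size=n)          # correlated regressors
Y = 2.0 + 1.0 * X1 - 3.0 * X2 + rng.normal(size=n)

# Stack the three first-order conditions as the normal equations X'X b = X'Y
# and solve the 3x3 linear system for (beta0_hat, beta1_hat, beta2_hat).
X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ Y)
print(b)   # close to the true values (2, 1, -3) in a large sample
```
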
Topic 3: The Multiple Regression Model
2. OLS estimation of the multiple regression model
Algebraic Properties:
1. $\sum_{i=1}^{n} \hat{u}_i = 0$ (so the sample mean of $\hat{Y}_i$ equals $\bar{Y}$)
2. $\sum_{i=1}^{n} X_{ki} \hat{u}_i = 0$ for each regressor
3. $\sum_{i=1}^{n} \hat{Y}_i \hat{u}_i = 0$
4. $(\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_k, \bar{Y})$ is always on the regression line
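These algebraic properties hold mechanically for any OLS fit, which a short numpy check makes visible (the simulated data below are an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -1.5]) + rng.normal(size=n)

b = np.linalg.lstsq(X, Y, rcond=None)[0]   # OLS fit
Y_hat = X @ b
u_hat = Y - Y_hat

print(u_hat.sum())               # property 1: ~0
print(X[:, 1] @ u_hat)           # property 2: ~0 for each regressor
print(Y_hat @ u_hat)             # property 3: ~0
print(Y_hat.mean() - Y.mean())   # fitted values and Y share the same mean
```
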
Topic 3: The Multiple Regression Model
3. Interpreting the coefficients of the Multiple Regression Model
OLS slope coefficients depend on the relationship between each of
the individual variables and Y and on the relationship between the
Xs (illustrate)
In the two-variable example, re-write the OLS estimator for $\hat{\beta}_1$ as:
$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} \hat{r}_{1i} Y_i}{\sum_{i=1}^{n} \hat{r}_{1i}^2}$
where $\hat{r}_{1i}$ are the OLS residuals from a simple regression of X1 on X2.
Thus, $\hat{\beta}_1$ gives the pure effect of X1 on Y, i.e., netting out the effect
of X2.
Predicted Values: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$
Residuals: $\hat{u}_i = Y_i - \hat{Y}_i$
If $\hat{u}_i > 0$ the model under-predicts Y
If $\hat{u}_i < 0$ the model over-predicts Y
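The "partialling out" interpretation can be verified numerically: regressing X1 on X2, keeping the residuals, and applying the ratio formula reproduces the multiple regression slope exactly. The simulated data below are an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
X1 = rng.normal(size=n)
X2 = 0.8 * X1 + rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 1.0 * X2 + rng.normal(size=n)

# Full multiple regression of Y on (1, X1, X2).
X = np.column_stack([np.ones(n), X1, X2])
b_full = np.linalg.lstsq(X, Y, rcond=None)[0]

# Partial out X2: residuals r1 from regressing X1 on (1, X2).
Z = np.column_stack([np.ones(n), X2])
g = np.linalg.lstsq(Z, X1, rcond=None)[0]
r1 = X1 - Z @ g

# The slide's formula: beta1_hat = sum(r1 * Y) / sum(r1**2)
beta1_partial = (r1 @ Y) / (r1 @ r1)
print(b_full[1], beta1_partial)   # identical up to floating-point error
```
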
Topic 3: The Multiple Regression Model
3. Interpreting the coefficients of the Multiple Regression Model
Relationship between simple and multiple regression estimates:
$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$
where the coefficients are OLS estimates from:
$\tilde{Y}_i = \tilde{\beta}_0 + \tilde{\beta}_1 X_{1i}$ (simple regression)
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i}$ (multiple regression)
$\tilde{X}_{2i} = \tilde{\delta}_0 + \tilde{\delta}_1 X_{1i}$ (regression of X2 on X1)
The inclusion of additional regressors will affect the slope estimates.
But $\tilde{\beta}_1 = \hat{\beta}_1$ where:
1. $\hat{\beta}_2 = 0$
2. $\tilde{\delta}_1 = 0$
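The decomposition $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$ is an exact in-sample identity, which the following sketch checks (data-generating values are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400
X1 = rng.normal(size=n)
X2 = 0.6 * X1 + rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + 3.0 * X2 + rng.normal(size=n)

ones = np.ones(n)
# Simple regression of Y on X1 -> tilde beta1
tb = np.linalg.lstsq(np.column_stack([ones, X1]), Y, rcond=None)[0]
# Multiple regression of Y on X1 and X2 -> hat beta1, hat beta2
hb = np.linalg.lstsq(np.column_stack([ones, X1, X2]), Y, rcond=None)[0]
# Regression of X2 on X1 -> tilde delta1
td = np.linalg.lstsq(np.column_stack([ones, X1]), X2, rcond=None)[0]

print(tb[1], hb[1] + hb[2] * td[1])   # equal up to floating-point error
```
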
Topic 3: The Multiple Regression Model
4. Goodness-of-Fit in the Multiple Regression Model
How well does the regression line fit the observations?
As in the simple regression model, define:
SST = Total Sum of Squares $= \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
SSE = Explained Sum of Squares $= \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
SSR = Residual Sum of Squares $= \sum_{i=1}^{n} \hat{u}_i^2$
$R^2 = \dfrac{SSE}{SST} = \dfrac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = 1 - \dfrac{SSR}{SST}$
Recall: SST = SSE + SSR, so SSE $\le$ SST and SSE $\ge$ 0
$\Rightarrow$ $0 \le \text{SSE}/\text{SST} \le 1$
R2 never decreases as more independent variables are added $\Rightarrow$ use
adjusted R2:
$\bar{R}^2 = 1 - \dfrac{SSR/(n-k-1)}{SST/(n-1)}$
Includes a punishment for adding more variables to the model
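Both goodness-of-fit measures follow directly from the sums of squares above; a minimal numpy sketch (simulated data assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

b = np.linalg.lstsq(X, Y, rcond=None)[0]
u_hat = Y - X @ b

SST = ((Y - Y.mean()) ** 2).sum()      # total sum of squares
SSR = (u_hat ** 2).sum()               # residual sum of squares
R2 = 1 - SSR / SST
R2_adj = 1 - (SSR / (n - k - 1)) / (SST / (n - 1))   # degrees-of-freedom penalty
print(R2, R2_adj)   # adjusted R2 is never larger than R2
```
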
Topic 3: The Multiple Regression Model
5. Properties of OLS Estimator of Multiple Regression Model
Gauss-Markov Theorem
Under certain assumptions known as the Gauss-Markov assumptions the
OLS estimator will be the Best Linear Unbiased Estimator
Linear: estimator is a linear function of the data
Unbiased: $E(\hat{\beta}_j) = \beta_j$ for $j = 0, 1, \ldots, k$
Best: estimator is most efficient estimator, i.e., estimator has the
minimum variance of all linear unbiased estimators
Topic 3: The Multiple Regression Model
5. Properties of OLS Estimator of Multiple Regression Model
Assumptions required to prove unbiasedness:
A1: Regression model is linear in parameters
A2: The Xs are non-stochastic or fixed in repeated sampling
A3: Zero conditional mean
A4: Sample is random
A5: Variability in the Xs and there is no perfect collinearity in the Xs
Assumptions required to prove efficiency:
A6: Homoscedasticity and no autocorrelation
$V(u_i \mid X_{1i}, X_{2i}, \ldots, X_{ki}) = \sigma^2$
$\mathrm{Cov}(u_i, u_j) = 0$ for $i \ne j$
Topic 3: The Multiple Regression Model
6. Estimating the variance of the OLS estimators
Need to know dispersion (variance) of sampling distribution of OLS
estimator in order to show that it is efficient (also required for inference)
In multiple regression model:
$V(\hat{\beta}_k) = \dfrac{\sigma^2}{SST_k (1 - R_k^2)}$
Depends on:
a) $\sigma^2$: the error variance (reduces accuracy of estimates)
b) $SST_k$: the variation in $X_k$ (increases accuracy of estimates)
c) $R_k^2$: the coefficient of determination from a regression of $X_k$ on all
other independent variables (the degree of multicollinearity reduces accuracy
of estimates)
What about the variance of the error terms, $\sigma^2$?
$\hat{\sigma}^2 = \dfrac{1}{n-k-1} \sum_{i=1}^{n} \hat{u}_i^2$
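The sampling-variance formula for an OLS slope can be checked against the corresponding diagonal element of $\hat{\sigma}^2 (X'X)^{-1}$, with which it agrees exactly; the simulated data below are an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 1_000, 2
X1 = rng.normal(size=n)
X2 = 0.7 * X1 + rng.normal(size=n)      # correlated regressors, so R_k^2 > 0
Y = 1.0 + 0.5 * X1 - 0.5 * X2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, Y, rcond=None)[0]
u_hat = Y - X @ b
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)   # estimated error variance

# Variance of the slope on X2 via the formula sigma2 / (SST_k * (1 - R_k^2)) ...
SST2 = ((X2 - X2.mean()) ** 2).sum()
Z = np.column_stack([np.ones(n), X1])        # the other regressors
g = np.linalg.lstsq(Z, X2, rcond=None)[0]
e2 = X2 - Z @ g
R2_2 = 1 - (e2 @ e2) / SST2                  # R^2 from regressing X2 on X1
var_b2 = sigma2_hat / (SST2 * (1 - R2_2))

# ... matches the corresponding diagonal element of sigma2_hat * (X'X)^{-1}
var_matrix = sigma2_hat * np.linalg.inv(X.T @ X)
print(var_b2, var_matrix[2, 2])
```
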
Topic 3: The Multiple Regression Model
7. Model specification
Inclusion of irrelevant variables:
The OLS estimator remains unbiased but has a higher variance if the Xs are correlated
Exclusion of relevant variables:
Omitted variable bias arises if the omitted variables are correlated with variables
included in the estimated model
True Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$
Estimated Model: $\tilde{Y}_i = \tilde{\beta}_0 + \tilde{\beta}_1 X_{1i}$
OLS estimator: $\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$
Biased: $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$
Omitted Variable Bias: $\mathrm{Bias}(\tilde{\beta}_1) = \beta_2 \tilde{\delta}_1$
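Omitted variable bias can be seen in a simulation: when X2 is omitted, the short-regression slope on X1 converges to $\beta_1 + \beta_2 \delta_1$ rather than $\beta_1$. The true parameters and the correlation structure below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
beta0, beta1, beta2 = 1.0, 2.0, 3.0

X1 = rng.normal(size=n)
X2 = 0.5 * X1 + rng.normal(size=n)     # X2 correlated with X1 (delta1 = 0.5)
Y = beta0 + beta1 * X1 + beta2 * X2 + rng.normal(size=n)

ones = np.ones(n)
# Short regression omitting X2: the slope absorbs beta2 * delta1 = 1.5 of bias.
tb = np.linalg.lstsq(np.column_stack([ones, X1]), Y, rcond=None)[0]
# Regression of X2 on X1 gives the sample counterpart of delta1.
td = np.linalg.lstsq(np.column_stack([ones, X1]), X2, rcond=None)[0]

print(tb[1])                   # close to beta1 + beta2 * 0.5 = 3.5, not 2.0
print(beta1 + beta2 * td[1])   # the bias formula evaluated in-sample
```
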