
ECONOMETRICS I

[Econ 3061]

Madda Walabu University


College of Business and Economics
Department of Economics

Genemo Fitala (MSc.)


[email protected]

CHAPTER 4

VIOLATION OF CLASSICAL REGRESSION ASSUMPTIONS


Review of the Classical Regression (OLS) Assumptions
1. The regression model is linear, correctly specified, and has an additive error
term.
2. The error term has a zero population mean.
3. All explanatory variables are uncorrelated with the error term
4. Observations of the error term are uncorrelated with each other (no serial
correlation).
5. The error term has a constant variance (no heteroskedasticity).
6. No explanatory variable is a perfect linear function of any other explanatory
variables (no perfect multicollinearity).
7. The error term is normally distributed (optional; not required for the estimators to be BLUE, but useful for hypothesis testing).
4.1. Multicollinearity

 Multicollinearity denotes a linear relationship among the independent (explanatory) variables.
 An implicit assumption of the OLS estimation method is that the explanatory variables are not correlated with one another.
 If there is no relationship between the explanatory variables, they are said to be orthogonal to one another.
 If the explanatory variables were orthogonal to one another, adding or removing a variable from a regression equation would not change the values of the coefficients on the other variables.
 In any practical context, the correlation between explanatory variables will be non-zero: a small degree of association between explanatory variables will almost always occur, but it will not cause too much loss of precision.
Cont’d
 However, a problem arises when the explanatory variables are very highly correlated with each other; this problem is known as multicollinearity. We can distinguish between two classes of multicollinearity: perfect and near multicollinearity.

1. Perfect Multicollinearity
 Perfect multicollinearity occurs when there is an exact linear relationship between two or more variables. In this case, it is not possible to estimate all of the coefficients.
 It is usually observed only when the same independent variable is inadvertently used twice in a regression. For illustration, suppose that two variables were employed in a regression function such that the value of one variable was always twice that of the other (e.g. X3 = 2*X2). If both X3 and X2 were used as explanatory variables in the same regression, the model parameters could not be estimated.
 If explanatory variables are perfectly correlated, i.e. if the correlation coefficient is one (in absolute value), the parameters become indeterminate and their standard errors are infinite.
Cont’d
2. Near Multicollinearity
 Near multicollinearity is much more likely to occur in practice, and arises when there is a non-negligible, but not perfect, relationship between two or more of the explanatory variables.
 If multicollinearity is less than perfect, the regression coefficients are determinate, but they have large standard errors, which harms the precision of the coefficient estimates. In this case the OLS estimators are still BLUE, but their variances (and hence standard errors) are large.
 Note that a high correlation between the dependent variable and one of the independent variables is not multicollinearity.
 Testing for multicollinearity is surprisingly difficult, and hence all that is presented here is a simple method to investigate the presence or otherwise of the most easily detected forms of near multicollinearity.
Cont’d
 This method simply involves looking at the matrix of correlations between the individual explanatory variables.
 Suppose that a regression equation has three explanatory variables (plus a constant term), and that the pair-wise correlations between these explanatory variables are given in a correlation matrix (not reproduced in these notes).
 Clearly, if multicollinearity were suspected, the most likely culprit would be a high correlation between X2 and X4.
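As a rough illustration of this check, the sketch below builds three hypothetical regressors (x2, x3, x4, with x4 constructed from x2; the names and data are assumptions for demonstration only) and prints their pairwise correlation matrix:

```python
import numpy as np
import pandas as pd

# Sketch of the correlation-matrix check on synthetic data: x4 is built
# from x2 plus noise, so the pair (x2, x4) should stand out.
rng = np.random.default_rng(1)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = 0.9 * x2 + 0.1 * rng.normal(size=n)   # strongly related to x2

corr = pd.DataFrame({"x2": x2, "x3": x3, "x4": x4}).corr()
print(corr.round(2))
# An off-diagonal entry above roughly 0.8 in absolute value (here the
# x2-x4 cell) flags the most likely multicollinearity culprit.
```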
1. The Potential Sources of Multicollinearity

1. The data collection method employed: sampling over a limited range of the values taken by the regressors in the population.
2. Constraints on the model or on the population being sampled.
3. Model specification:
 Adding polynomial terms to a regression
 Use of highly dependent independent variables
 Use of many interaction terms in a model
4. An over-determined model: the model has more explanatory variables than the number of observations.
5. Use of many dummy independent variables.
2. The Detection of Multicollinearity

1. High correlation coefficients: pairwise correlations among the independent variables may be high (in absolute value). The rule of thumb is that if a correlation exceeds 0.8, severe multicollinearity may be present.
2. High R² with low t-statistic values: it is possible for the individual regression coefficients to be insignificant while the overall fit of the equation is high.
3. High Variance Inflation Factors (VIFs):
 The larger the value of VIF_j, the more "troublesome" or collinear the variable Xj.
 If the VIF of a variable exceeds 10, which happens if R²_j exceeds 0.90, that variable is said to be highly collinear.
 The closer VIF_j is to 1, the greater the evidence that Xj is not collinear with the other regressors.
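A sketch of the VIF computation using statsmodels' variance_inflation_factor on synthetic data (the variable names are illustrative, not from the notes):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Sketch of the VIF check on synthetic regressors (x4 is built from x2).
rng = np.random.default_rng(2)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = 0.9 * x2 + 0.1 * rng.normal(size=n)
X = sm.add_constant(pd.DataFrame({"x2": x2, "x3": x3, "x4": x4}))

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the others.
for j, name in enumerate(X.columns):
    if name == "const":
        continue
    print(name, round(variance_inflation_factor(X.values, j), 2))
# A VIF above 10 (i.e. R_j^2 above 0.90) marks X_j as highly collinear;
# values close to 1 indicate little collinearity with the other regressors.
```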
3. Remedies for Multicollinearity

 No single solution exists that will eliminate multicollinearity, but certain approaches may be useful:

1. Do nothing: live with what you have.

2. Drop a redundant variable
 If a variable is truly redundant, it should never have been included in the model in the first place.
 So dropping it is actually just correcting a specification error.
 Use economic theory to guide the choice of which variable to drop.
Cont’d
3. Transform the multicollinear variables
 Sometimes you can reduce multicollinearity by re-specifying the model, for instance by creating a combination of the multicollinear variables.
 As an example, rather than including the variables GDP and population in the model, include GDP/population (GDP per capita) instead.

4. Increase the sample size
 Increasing the sample size improves the precision of an estimator and reduces the adverse effects of multicollinearity, though adding data is often not feasible.

4.2. Heteroscedasticity

1. The Nature of Heteroscedasticity
 If the probability distribution of Ui remains the same over all values of the explanatory variables, this assumption is called homoscedasticity, i.e. Var(u_i) = σ_u² (constant variance).
 In this case the variation of u_i around the explanatory variables remains constant.
 But if the distribution of u_i around the explanatory variables is not constant, we say that the u_i are heteroscedastic (non-constant variance).
 Var(u_i) = σ_ui² signifies that the individual variances may be different.
Cont’d
 The assumption of homoscedasticity states that the variation of each random term (Ui) around its zero mean is constant and does not change as the explanatory variables change; whether the sample size is increasing, decreasing or staying the same does not affect the variance of Ui, which is constant:

Var(u_i) = σ_u², which is not a function of Xi.

 This means that the variation of the random term around its mean does not depend upon the explanatory variable Xi. This is called homoscedasticity (constant variance).
 But the dispersion of the random term around the regression line may not be constant, i.e. the variance of the random term Ui may be a function of the explanatory variables:

Var(u_i) = σ_ui² = f(Xi), which signifies that the individual variances may all be different. This is called heteroscedasticity (non-constant variance).
Cont’d
 Diagrammatically, in the two-variable regression model homoscedasticity can, for convenience, be shown as in Figure 4.1 (Figure 4.1: homoscedastic disturbances).
 As Figure 4.1 shows, the conditional variance of Yi (which is equal to that of u_i), conditional upon the given Xi, remains the same regardless of the values taken by the variable X.
 In contrast, consider Figure 4.2 (Figure 4.2: heteroscedastic disturbances), which shows that the conditional variance of Yi increases as X increases. Here, the variances of Yi are not the same. Hence, there is heteroscedasticity.
 Symbolically, heteroscedasticity means E(u_i²) = σ_i², i.e. the variance of the random term is no longer the same for all observations.

Cont’d
 To make the difference between homoscedasticity and heteroscedasticity clear, assume that in the two-variable model Yi = β1 + β2*Xi + ui, Y represents savings and X represents income.
 Figures 4.1 and 4.2 show that as income increases, savings on average also increase.
 But in Figure 4.1 the variance of savings remains the same at all levels of income, whereas in Figure 4.2 it increases with income.
 It seems that in Figure 4.2 the higher-income families on average save more than the lower-income families, but there is also more variability in their savings.
2. Sources of Heteroskedasticity
1. Error-learning models (Figure 4.3: illustration of heteroscedasticity)
 As people learn, their errors of behavior become smaller over time, or the number of errors becomes more consistent. In this case, σ_i² is expected to decrease.
 As an example, consider Figure 4.3, which relates the number of typing errors made in a given time period on a test to the hours put into typing practice.
 As Figure 4.3 shows, as the number of hours of typing practice increases, the average number of typing errors as well as their variance decreases.
Cont’d
2. As data collection techniques improve, σ_ui² is likely to decrease.
Example:
 Banks that have sophisticated data-processing equipment are likely to commit fewer errors in the monthly or quarterly statements of their customers than banks without such facilities.
3. Presence of outliers:
 An outlier is an observation that is very different (either very small or very large) in relation to the other observations in the sample.
3. Consequences of Heteroskedasticity

 If the assumption of homoscedasticity is violated, it has the following consequences:

1. Heteroskedasticity increases the variances of the distributions of the OLS coefficient estimates, thereby making the OLS estimators inefficient.

2. OLS estimators are inefficient: if the random term Ui is heteroskedastic, the OLS estimates do not have the minimum variance in the class of unbiased estimators. Therefore they are not efficient, in either small or large samples.

 Consequently, heteroskedasticity has a wide impact on hypothesis testing: the conventional t and F statistics are no longer reliable.
Cont’d

 The variance of β̂ under heteroscedasticity will be greater than its variance under homoscedasticity.
 As a result, the true standard error of β̂ will be underestimated by the conventional OLS formula.
 As such, the t-value associated with it will be overestimated, which might lead to the conclusion that in a specific case β̂ is statistically significant, when in fact it may not be.
 Moreover, if we proceed with our model under the false belief of homoscedasticity of the error variance, our inference and prediction about the population coefficients will be incorrect.
4. Detection of Heteroskedasticity

 There are informal and formal methods of detecting heteroskedasticity.

1. Informal Methods
A. Nature of the problem: following earlier empirical work, the nature of the problem under study often suggests whether heteroskedasticity is likely.
B. Graphical method
 Plotting the squared residuals against the fitted values of the dependent variable gives a rough indication of the existence of heteroskedasticity.
 If a systematic pattern appears in the graph, it may be an indication of the existence of heteroskedasticity.
Cont’d

 In Figure 4.4, the squared residuals û_i² are plotted against Ŷ_i, the values of Yi estimated from the regression line, the idea being to find out whether the estimated mean value of Y is systematically related to the squared residuals.
 In Figure 4.4a we see that there is no systematic pattern between the two variables, suggesting that perhaps no heteroscedasticity is present in the data.
 Figures 4.4b to 4.4e, however, exhibit definite patterns. For instance, Figure 4.4c suggests a linear relationship, whereas Figures 4.4d and 4.4e indicate a quadratic relationship between û_i² and Ŷ_i.
(Figure 4.4: hypothetical patterns of estimated squared residuals.)
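A minimal sketch of this graphical check, assuming simulated data in which the error variance grows with X; the plot of squared residuals against fitted values is the one described above:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Sketch of the informal graphical check: fit OLS, then plot the squared
# residuals against the fitted values and look for a systematic pattern.
rng = np.random.default_rng(4)
n = 300
x = np.linspace(1, 10, n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.5 * x, size=n)   # variance grows with X

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

plt.scatter(res.fittedvalues, res.resid ** 2, s=10)
plt.xlabel("fitted values of Y")
plt.ylabel("squared residuals")
plt.title("Squared residuals vs fitted values")
plt.show()
# A fan or curved shape (as in Figures 4.4b-e) hints at heteroscedasticity;
# a patternless cloud (Figure 4.4a) suggests it is absent.
```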
Cont’d
2. Formal Methods
A. The Spearman Rank-Correlation Test
 This is the simplest, approximate test for detecting heteroscedasticity, and it can be applied to either small or large samples.
 A high rank correlation coefficient between the residuals and an explanatory variable suggests the presence of heteroskedasticity.
 If we have more than one explanatory variable, we may compute the rank correlation coefficient between e_i and each of the explanatory variables separately.
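As an illustrative sketch (not the exact procedure laid out in the notes), the rank correlation between the absolute residuals and the suspected explanatory variable can be computed with SciPy's spearmanr:

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import spearmanr

# Sketch of the Spearman rank-correlation check: rank-correlate the absolute
# residuals with the explanatory variable; a significant coefficient
# suggests heteroscedasticity tied to that variable.
rng = np.random.default_rng(5)
n = 100
x = np.linspace(1, 10, n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.5 * x, size=n)

res = sm.OLS(y, sm.add_constant(x)).fit()
rho, pval = spearmanr(np.abs(res.resid), x)
print(f"Spearman rho = {rho:.2f}, p-value = {pval:.3f}")
# A high, significant rho indicates that the residual spread moves with X.
```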
Cont’d
B. The Breusch-Pagan Test

 This test is applicable for large samples, and requires that the number of observations (the sample size) be at least twice the number of explanatory variables.
 For instance, if the number of explanatory variables is 3 (X1, X2, X3), then the sample size must be at least 6.
 If the computed test statistic is greater than the critical (table) value, we reject the null hypothesis of homoscedasticity and accept the alternative that there is heteroskedasticity.
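A sketch of the Breusch-Pagan test as implemented in statsmodels (het_breuschpagan), applied to simulated data whose error variance grows with X (an assumption for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Sketch of the Breusch-Pagan test: the squared residuals are related to the
# regressors; a small p-value rejects the null of homoscedasticity.
rng = np.random.default_rng(6)
n = 200
x = np.linspace(1, 10, n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.5 * x, size=n)

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, X)
print(f"LM statistic = {lm_stat:.2f}, p-value = {lm_pval:.4f}")
# A p-value below the chosen significance level: reject homoscedasticity
# in favour of heteroskedasticity, as described above.
```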
Cont’d
C. White’s General Heteroskedasticity Test
 It is an LM test, and it has the advantage that it does not require any prior knowledge about the pattern of heteroskedasticity.
 The assumption of normality is also not required here. For these reasons, it is considered one of the more powerful tests of heteroskedasticity.
 Basic intuition: the test looks for systematic patterns between the residual variance and the explanatory variables, the squares of the explanatory variables, and their cross-products.
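A sketch of White's test using statsmodels' het_white on simulated data with two regressors; the function internally forms the squares and cross-products described above:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

# Sketch of White's general test: no particular pattern of heteroskedasticity
# (and no normality assumption) needs to be specified in advance.
rng = np.random.default_rng(7)
n = 200
x2 = rng.uniform(1, 10, n)
x3 = rng.uniform(1, 10, n)
y = 1.0 + 0.5 * x2 + 0.3 * x3 + rng.normal(scale=0.4 * x2, size=n)

X = sm.add_constant(np.column_stack([x2, x3]))
res = sm.OLS(y, X).fit()

lm_stat, lm_pval, f_stat, f_pval = het_white(res.resid, X)
print(f"White LM statistic = {lm_stat:.2f}, p-value = {lm_pval:.4f}")
# A small p-value signals a systematic link between the residual variance and
# the regressors, their squares, or their cross-products.
```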
Cont’d

Limitations of White’s test
i. When we have a large number of explanatory variables, the number of terms in the auxiliary regression model becomes so high that we may not have adequate degrees of freedom.
ii. It is basically a large-sample test, so when we work with a small sample it may fail to detect the presence of heteroskedasticity even when such a problem is present.
Cont’d

D. Goldfeld-Quandt Test
 This popular method is applicable if one assumes that the heteroscedastic variance σ_i² is positively related to one of the explanatory variables in the regression model.
 It may be applied when one of the explanatory variables is suspected to be the heteroskedasticity "culprit".
 The basic idea is that if the variances of the disturbances are the same across all observations (i.e. homoscedastic), then the variance of one part of the sample should be the same as the variance of another part of the sample.
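A sketch of the Goldfeld-Quandt test using statsmodels' het_goldfeldquandt, with the data already sorted by the suspected culprit variable and 20% of the central observations dropped (both choices are assumptions for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Sketch of the Goldfeld-Quandt test: sort by the suspected culprit variable,
# drop some central observations, and compare the residual variances of the
# two sub-samples with an F test.
rng = np.random.default_rng(8)
n = 200
x = np.sort(rng.uniform(1, 10, n))          # data sorted by the culprit X
y = 1.0 + 0.8 * x + rng.normal(scale=0.5 * x, size=n)

X = sm.add_constant(x)
# drop=0.2 removes the middle 20% of observations before splitting the sample.
f_stat, p_val, order = het_goldfeldquandt(y, X, drop=0.2, alternative="increasing")
print(f"GQ F statistic = {f_stat:.2f}, p-value = {p_val:.4f}")
# A small p-value indicates the error variance rises with X (heteroscedasticity).
```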
5. Remedies for Heteroskedasticity

1. Log-transformation of the data: log-transformation compresses the scales in which the variables are measured, so it helps to reduce the intensity of the heteroskedasticity problem. Obviously, this method cannot be used when some variables take on zero or negative values. (A brief sketch follows this list.)

2. Using a suitable deflator, if available, to transform the data series: the idea is to estimate the model using the deflated variables so that more efficient estimates of the parameters are obtained. But this process might lead to a 'spurious relationship' between the variables when a common deflator is used to deflate them.

3. When heteroskedasticity appears owing to the presence of outliers, increasing the sample size might be helpful.
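A brief sketch of the log-transformation remedy on constructed data (the data-generating process is an assumption chosen so that the log-log form is homoscedastic); Breusch-Pagan p-values before and after the transformation are compared:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Sketch of remedy 1: re-estimate the model in logs (only possible when all
# variables are strictly positive) and re-run a heteroskedasticity test to see
# whether the compression of scales has reduced the problem.
rng = np.random.default_rng(9)
n = 300
x = rng.uniform(1, 10, n)
y = np.exp(0.5 + 0.8 * np.log(x) + rng.normal(scale=0.3, size=n))

X_level = sm.add_constant(x)
X_log = sm.add_constant(np.log(x))

res_level = sm.OLS(y, X_level).fit()
res_log = sm.OLS(np.log(y), X_log).fit()

for label, res, exog in [("levels", res_level, X_level), ("logs", res_log, X_log)]:
    _, pval, _, _ = het_breuschpagan(res.resid, exog)
    print(f"{label}: Breusch-Pagan p-value = {pval:.4f}")
# In this constructed example the log-log form is homoscedastic by design,
# while the levels regression typically shows a much smaller p-value.
```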
4.3. Autocorrelation

o In our discussion of simple and multiple regression models, one of the assumptions of the classical model is that Cov(u_i, u_j) = E(u_i u_j) = 0 for i ≠ j, which implies that successive values of the disturbance term U are temporally independent, i.e. a disturbance occurring at one point of observation is not related to any other disturbance.
o This means that when observations are made over time, the effect of a disturbance occurring in one period does not carry over into another period.
o If the above assumption is not satisfied, that is, if the value of U in any particular period is correlated with its own preceding value(s), we say there is autocorrelation of the random variables.
Cont’d

 Hence, autocorrelation is defined as a 'correlation' between members of a series of observations ordered in time or space, i.e. an internal correlation within a single series.
 Autocorrelation is a special case of correlation: the association is not between elements of two or more variables but between successive values of one variable, whereas correlation refers to the relationship between the values of two or more different variables.
 Autocorrelation is also sometimes called serial correlation, although some economists distinguish between the two terms.
1. Reasons/Sources of Autocorrelation
There are several reasons why autocorrelation arises. Some of these are:

A. Cyclical fluctuations
 Time series such as GNP, price indices, production, employment and unemployment exhibit business cycles. Starting at the bottom of a recession, when economic recovery begins, most of these series move upward. In this upswing, the value of a series at one point in time is greater than its previous value.
 Thus, there is a momentum built into them, and it continues until something happens (e.g. an increase in interest rates or taxes) to slow them down.
 Therefore, in regressions involving time-series data, successive observations are likely to be interdependent.
Cont’d
B. Omitted explanatory variables
 If an autocorrelated variable has been excluded from the set of explanatory variables, then its influence will be reflected in Ui.
 If several autocorrelated explanatory variables are omitted, then the random variable U may not be autocorrelated, because the autocorrelation patterns of the omitted variables may offset each other.

C. Mis-specification of the mathematical form of the model
 If we use a mathematical form that differs from the correct form of the relationship, then the random variable may show serial correlation.
 Example: if we choose a linear function while the correct form is non-linear, then the values of U will be correlated.
Cont’d
D. Mis-specification of the true random term U
 Many random factors such as wars, droughts, weather conditions and strikes exert influences that are spread over more than one period of time.
 Example: the effect of weather conditions on the agricultural sector will influence the performance of other economic variables in several future periods.
 A strike in an organization affects its production process, and the effect will persist for several future periods.
 In such cases the values of U become serially dependent, so that autocorrelation arises.
2. Effect of Autocorrelation on OLS Estimators

 We have seen that the ordinary least squares technique is based on a set of basic assumptions.
 Some of these assumptions concern the mean, variance and covariance of the disturbance term.
 Naturally, therefore, if these assumptions do not hold, for whatever reason, the estimators derived by the OLS procedure may not be efficient.
 The following are the effects on the estimators if the OLS method is applied in the presence of autocorrelation in the data.
Cont’d

1. The OLS estimator is unbiased but not BLUE.

2. The OLS estimator is inefficient and the usual variance formula is misleading: the estimated variance of β̂ in the simple regression model will be biased downwards (i.e. underestimated) when the random terms are autocorrelated.

3. Wrong testing procedure: if var(β̂) is underestimated, SE(β̂) is also underestimated, which makes the t-ratio large. This large t-ratio may make β̂ appear statistically significant when it is not.

4. The wrong testing procedure leads to wrong predictions and inferences about the characteristics of the population.
3. Detection of Autocorrelation

The following are methods of detecting autocorrelation:

1. Graphical methods
 Since the residuals e_i = Y_i − Ŷ_i are estimates of the true disturbances Ui, if the e_i are found to be correlated with each other this suggests that the Ui are autocorrelated.
 In order to test for autocorrelation, it is necessary to investigate whether any relationships exist between the current value of u, u_t, and any of its previous values, u_{t−1}, u_{t−2}, ...
Cont’d
Plot the residuals against their own lag:
 This shows the relationship between the current residual, u_t, and the immediately previous one, u_{t−1}.
 Plot u_t horizontally and u_{t−1} vertically, i.e. plot the pairs of successive residuals.
 If most of the points fall in quadrants I and III, we say the data are autocorrelated and the autocorrelation is positive (Fig. 4.5: plot of u_t against u_{t−1} showing positive autocorrelation).
 If most of the points fall in quadrants II and IV, the autocorrelation is said to be negative (Fig. 4.6: plot of u_t against u_{t−1} showing negative autocorrelation).
 But if the points are scattered roughly equally in all the quadrants, there is said to be no autocorrelation in the given data (Fig. 4.9).
Cont’d
Plotting u_t over time:
 If the series of residuals does not cross the time axis very frequently, this indicates positive autocorrelation (Fig. 4.7: plot of u_t over time showing positive autocorrelation).
 A negatively autocorrelated series of residuals will cross the time axis more frequently than if it were distributed randomly (Fig. 4.8: plot of u_t over time showing negative autocorrelation).
 If the time-series plot of the residuals crosses the x-axis neither too frequently nor too rarely, we say there is no autocorrelation in the given data (Fig. 4.10).
Cont’d
(Figure 4.9: plot of u_t against u_{t−1} showing no autocorrelation. Figure 4.10: plot of u_t over time showing no autocorrelation.)
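A sketch of the two graphical checks on simulated AR(1) errors (rho = 0.8, an assumed value chosen for illustration); the left panel reproduces the quadrant-based lag plot and the right panel the time plot discussed above:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Sketch of the two graphical checks: a lag plot of the residuals (u_t against
# u_{t-1}) and a time plot of u_t, using residuals from a regression whose
# errors are generated as a positively autocorrelated AR(1) process.
rng = np.random.default_rng(10)
n = 200
x = np.arange(n, dtype=float)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()    # rho = 0.8 > 0
y = 2.0 + 0.1 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
e = res.resid

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(e[:-1], e[1:], s=10)            # most points in quadrants I and III
ax1.set_xlabel("u_{t-1}"); ax1.set_ylabel("u_t")
ax1.set_title("Lag plot (cf. Fig 4.5)")
ax2.plot(e)                                 # long runs on one side of the axis
ax2.axhline(0.0, linewidth=0.8)
ax2.set_xlabel("t"); ax2.set_ylabel("u_t")
ax2.set_title("Residuals over time (cf. Fig 4.7)")
plt.tight_layout()
plt.show()
```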
Cont’d
 Autocorrelation may be positive or negative, but in most practical cases it is positive.
 The main reason for this is that economic variables tend to move in the same direction over time.

Example:
 In periods of boom, employment, investment, output, GNP growth, consumption, etc. move upwards, and the random term Ui will follow the same pattern.
 Again, in periods of recession all these economic variables move downwards, and the random term will follow the same pattern.
Cont’d
2. Formal Testing Methods
 These methods are called formal because the tests are based on the formal testing procedures you have seen in your statistics course.
 Of course, a first step in testing whether the residual series from an estimated model is autocorrelated would be to plot the residuals as above, looking for any patterns.
 Graphical methods may be difficult to interpret in practice, however, and hence a formal statistical test should also be applied.
 Different econometricians and statisticians suggest different testing methods.
Cont’d
The most frequently and widely used testing methods are:

i. The Durbin-Watson (DW) Test (Durbin and Watson, 1951)
 Durbin-Watson (DW) is a test for first-order autocorrelation, i.e. it tests only for a relationship between an error and its immediately previous value.
 One way to motivate the test and to interpret the test statistic is in the context of a regression of the time-t error on its previous value.
 Under the null hypothesis, the errors at times t − 1 and t are independent of one another; if this null is rejected, it is concluded that there is evidence of a relationship between successive residuals.
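A sketch of the DW statistic computed with statsmodels' durbin_watson on simulated positively autocorrelated errors (the AR(1) coefficient of 0.8 is an assumption for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Sketch of the Durbin-Watson check on an AR(1)-style error process.
# DW is roughly 2(1 - rho_hat): values near 2 suggest no first-order
# autocorrelation, values well below 2 suggest positive autocorrelation.
rng = np.random.default_rng(11)
n = 200
x = np.arange(n, dtype=float)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 2.0 + 0.1 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
print(f"Durbin-Watson statistic = {durbin_watson(res.resid):.2f}")
# The formal decision uses the d_L and d_U bounds from the DW tables.
```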
Cont’d
ii. The Breusch–Godfrey Test
 Recall that DW is a test only of whether consecutive errors are related to one another. So, not only can the DW test not be applied if its conditions are not fulfilled, there are also many forms of residual autocorrelation that DW cannot detect.
 For example, if Corr(u_t, u_{t−1}) = 0 but Corr(u_t, u_{t−2}) ≠ 0, DW as defined above will not find any autocorrelation. One possible solution would be to replace u_{t−1} in the test with u_{t−2}.
 However, pairwise examination of the correlations (u_t, u_{t−1}), (u_t, u_{t−2}), (u_t, u_{t−3}), ... would be tedious in practice and is not typically implemented as such in econometrics software packages, and the critical values would also have to be modified somewhat in each case. Therefore, it is desirable to have a joint test for autocorrelation that allows examination of the relationship between u_t and several of its lagged values at the same time.
 The Breusch–Godfrey test is such a more general test for autocorrelation.
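A sketch of the Breusch-Godfrey test via statsmodels' acorr_breusch_godfrey, applied to errors constructed with second-order autocorrelation that a first-order DW test would tend to miss (the data-generating process is an assumption for illustration):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Sketch of the Breusch-Godfrey test: a joint test of whether the residuals are
# related to several of their own lags at once (here up to 4 lags).
rng = np.random.default_rng(12)
n = 200
x = np.arange(n, dtype=float)
u = np.zeros(n)
for t in range(2, n):
    u[t] = 0.6 * u[t - 2] + rng.normal()    # second-order autocorrelation
y = 2.0 + 0.1 * x + u

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=4)
print(f"BG LM statistic = {lm_stat:.2f}, p-value = {lm_pval:.4f}")
# A small p-value rejects the null of no autocorrelation up to the chosen lag.
```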
4. Remedies for Autocorrelation

 The remedies for removing the effect of autocorrelation depend on the source of the autocorrelation:

1. Include the omitted explanatory variables.

2. Apply the appropriate mathematical form of the model.

3. If tests prove that there is a true autocorrelation problem, we may use other estimation techniques such as GLS (a brief sketch follows this list).
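As an illustrative sketch of the third remedy, statsmodels' GLSAR provides an iterated feasible-GLS estimator for AR(1) errors; the simulated data and the AR(1) coefficient below are assumptions, not part of the notes:

```python
import numpy as np
import statsmodels.api as sm

# Sketch of remedy 3: if a genuine AR(1) autocorrelation problem is found,
# a feasible GLS estimator such as statsmodels' GLSAR (an iterated
# Cochrane-Orcutt-type procedure) can be used instead of plain OLS.
rng = np.random.default_rng(13)
n = 200
x = np.arange(n, dtype=float)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()
y = 2.0 + 0.1 * x + u

X = sm.add_constant(x)
gls_res = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print("estimated AR(1) coefficient:", np.round(gls_res.model.rho, 2))
print(gls_res.params)   # GLS estimates of the intercept and slope
```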

Ganamo Fitala
[email protected]

END OF CHAPTER FOUR

THANK YOU!
