Chapter 2

Simple Linear Regression or The Two-Variable Regression Model

10/17/2024 By: Urgessa F 1


2.1. The Basic Concept of Regression
• Regression is one of the most commonly used tools in econometric analysis.
• Regression is a statistical method that attempts to determine the strength and character of the relationship between one dependent variable and one or more independent variables.
• A regression function indicates the dependence of one variable, the dependent variable, on one or more other variables, the explanatory variables.

❑ We denote the dependent variable by Y and the explanatory variables by X1, X2, ..., Xk.
❑ The dependent variable is the variable whose average value is computed using the known values of the explanatory variable(s).
❑ The values of the explanatory variables, by contrast, are treated as fixed in repeated sampling from the population.
➢ If k = 2, there is only one explanatory variable.
➢ A regression model that considers only two variables (one dependent and one independent) is called simple linear regression.
➢ In this chapter we first consider the case of only two variables.
• On the other hand, if k > 2, that is, if there is more than one explanatory variable, we have what is known as multiple regression.
➢ Population Regression Function (PRF): the PRF represents the true relationship between the dependent variable and the independent variable(s) in the entire population.
➢ Sample Regression Function (SRF): the SRF is the regression equation estimated from a sample of data.
➢ The PRF represents the true, underlying relationship in the population, while the SRF is the estimated relationship based on a sample.
• In regression analysis our task is to estimate the PRF on the basis of the SRF.
• Example: suppose the amount of a commodity demanded by an individual depends on the price of the commodity, the income of the individual, the price of other goods, etc.
• From this statement, quantity demanded is the dependent variable, whose value is determined by the price of the commodity, the income of the individual, the price of other goods, etc.
• The price of the commodity, the income of the individual, and the price of other goods are the independent (explanatory) variables, whose values are obtained from the population using repeated sampling:
𝑄𝑑 = 𝑓(𝑃, 𝑃0, 𝑌, 𝑒𝑡𝑐)
• The relationship between these dependent and independent variables is the concern of regression analysis.



Regression analysis has the following objectives and uses:
➢ To show the relationship among variables;
➢ To estimate the average value (mean) of the dependent variable given the value of the independent variable(s);
➢ To test hypotheses about the sign and magnitude of the relationship between variables;
➢ To forecast future value(s) of the dependent variable.
❑ In short, regression explains the variation in the dependent variable based on the variation in one or more independent variables.



2.1.1 Non-Stochastic and Stochastic Relationships
❑ A non-stochastic relationship is one where the dependent variable (Y) is completely determined by the independent variable(s) (X).
✓ In a deterministic or non-stochastic model there is no random or unexplained variation in the relationship, and the dependent variable can be perfectly predicted from the independent variable(s).
✓ A relationship between X and Y, characterized as Y = f(X), is said to be deterministic or non-stochastic if, for each value of the independent variable (X), there is one and only one corresponding value of the dependent variable (Y).



❑ A stochastic relationship is one where the dependent variable (Y) is not completely determined by the independent variable(s) (X).
✓ There is an element of randomness or uncertainty in the relationship, which is captured by the error term (ε) in the regression model.
• Let us illustrate the distinction between stochastic and non-stochastic relationships with the help of a supply function.
• Assuming that the supply of a certain commodity depends on its price (other determinants taken to be constant) and that the function is linear, the relationship can be indicated as:
Q = α + βP ………………………………….(2.1)



• The above relationship between P and Q is such that for a particular value of P there is only one corresponding value of Q.
• This is, therefore, a deterministic (non-stochastic) relationship, since for each price there is always only one corresponding quantity supplied.
• This implies that all the variation in quantity supplied is due solely to changes in price, and that no other factors affect the dependent variable.
• If this were true, all the price-quantity pairs would fall on a straight line when plotted on a two-dimensional plane.
• However, if we gather observations on the quantity actually supplied in the market at various prices and plot them on a diagram, we see that they do not fall on a straight line.
The deviations of observations from the line may be attributed to several factors

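The contrast between a deterministic and a stochastic relationship can be illustrated with a short simulation. The supply-function coefficients below (intercept 10, slope 2) and the prices are made-up values for illustration only:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

alpha, beta = 10.0, 2.0          # hypothetical supply-function parameters
prices = [1.0, 2.0, 3.0, 4.0, 5.0]

# Deterministic (non-stochastic): one and only one Q for each P
q_exact = [alpha + beta * p for p in prices]

# Stochastic: a random disturbance u shifts each observation off the line
q_observed = [alpha + beta * p + random.gauss(0, 1) for p in prices]

# The deviations from the line are exactly the realized disturbances
deviations = [qo - qe for qo, qe in zip(q_observed, q_exact)]
print(q_exact)       # these points fall exactly on the line
print(deviations)    # nonzero deviations caused by u
```

Every deterministic point lies exactly on the line, while the observed points scatter around it; those deviations are what the error term is introduced to capture.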


• In order to take into account the above sources of error, we introduce into econometric functions a random variable, usually denoted by the letter 'Ui' or 'ei', called the error term, random disturbance, or stochastic term of the function; it is so called because Ui is supposed to 'disturb' the exact linear relationship assumed to exist between X and Y.
• By introducing this random variable into the function, the model is rendered stochastic:
Q = α + βP + Ui ………………………………….(2.2)
• Thus a stochastic model is a model in which the dependent variable is determined not only by the explanatory variable(s) included in the model but also by others that are not included in the model.



Causes of (reasons for) the error term
We introduce 'Ui', the random term, for the following reasons:
i. Omission of variables from the function.
• In economic reality each variable is influenced by a very large number of factors, and every factor may not be included in the function because:
a. Some of the factors may not be known, even to the researcher.
b. Even if we know them, some factors may not be statistically measurable; for example, psychological factors (tastes, preferences, expectations, etc.) are not measurable.
c. Some factors are random, appearing in an unpredictable way and time, for example epidemics, earthquakes, etc.
d. Even if all factors are known, the available data may not be adequate to measure all the factors influencing a relationship.



ii. The erratic (inconsistent) nature of human beings: random behavior of human beings.
➢ Human behavior may deviate from the normal situation to a certain extent in an unpredictable way.
iii. Misspecification of the mathematical model:
✓ We may wrongly specify the relationship between variables.
✓ We may fit a linear function to non-linearly related variables, or we may use a single-equation model for simultaneously determined relationships.
iv. Errors of measurement: when collecting data we may commit errors of measurement.



2.2. The Simple Linear Regression Model
• A stochastic relationship with only one explanatory variable is called a simple linear regression model.

➢ The term 'simple' refers to the fact that we use only two variables (one dependent and one independent variable).
➢ 'Linear' refers to linearity in the parameters; the model may or may not be linear in the variables. Each parameter appears with a power of one and is not multiplied or divided by another parameter.
The true relationship connecting the variables involved is split into two parts: a part represented by a line (the exact, systematic component) and a part represented by the random term 'u' (the random component).



Cont.…………………………………………….
• The scatter of observations represents the true relationship between Y and X.
• The line represents the exact part of the relationship, and
• the deviation of the observations from the line represents the random component of the relationship.



Some Notation
• Some alternative names for the y and x variables:

y: dependent variable, regressand, effect variable, explained variable, outcome variable
x: independent variable, regressor, causal variable, explanatory variable



2.3. Assumptions of the Linear Stochastic Regression Model
Assumptions of the Classical Linear Regression Model (CLRM)
• To make inferences about the value of Ui we make some assumptions about it, which can be divided into four groups:
a. Some refer to the distribution of the random variable Ui;
b. Some refer to the relationship between Ui and the explanatory variables (Xi);
c. Some refer to the relationship between the explanatory variables themselves;
d. Some refer to the model.



2.3.1. Assumptions about Ui
i. Randomness of the disturbance term Ui: the error term is assumed to be a random variable.
• That is, the value Ui may assume in any one period depends on chance; it may be positive, negative, or zero.
ii. Zero mean value of the disturbance Ui.
• The mean value of Ui in any particular period is zero.
• This means that for each value of X the random variable Ui may assume various values, some greater than zero and some less than zero, but if we consider all the positive and negative values of U for any given value of X, their average value equals zero.
➢ In other words, the positive and negative values of U cancel each other.
❖ Mathematically: E(Ui) = 0 for every i.



Con……………………………………
iii. Homoscedasticity (equal or constant variance)
• The variation of each Ui around all values of the explanatory variable is the same: the variance of Ui about its mean is constant at all values of X.
• In other words, for all values of X the Ui show the same dispersion around their mean. Mathematically:
Var(Ui) = E[Ui − E(Ui)]² = E(Ui²) = σu², since E(Ui) = 0
iv. Normality: Ui has a normal distribution with zero mean and constant variance.
• That is, for each X the values of Ui have zero mean and constant variance σu²:
Ui ~ N(0, σu²)
v. No autocorrelation between the disturbance terms: Ui is serially independent.
• The value of Ui in one period does not depend on the value of Ui in any other period (the covariance between Ui and Uj is zero).



Con……………………………………
➢ This means the value the random term assumes in one period does not depend on the value it assumed in any other period.
• Algebraically, for i ≠ j:
cov(Ui, Uj) = E[(Ui − E(Ui))(Uj − E(Uj))]
• By assumption (ii), E(Ui) = E(Uj) = 0, so
cov(Ui, Uj) = E[(Ui − 0)(Uj − 0)] = E(UiUj)
• By independence, E(UiUj) = E(Ui)E(Uj) = 0, hence
cov(Ui, Uj) = 0



2.3.2. Assumptions about Ui and Xi
i. Zero covariance between Ui and Xi
• The disturbance term Ui is not correlated with the explanatory variables.
➢ The random variable (U) is independent of the explanatory variable(s): there is no correlation between the random variable and the explanatory variable.
➢ If two variables are unrelated, their covariance is zero: the Ui's and Xi's do not move together.
cov(Ui, Xi) = E[(Ui − E(Ui))(Xi − E(Xi))]
            = E[(Ui − 0)(Xi − E(Xi))], given E(Ui) = 0
            = E[UiXi − Ui E(Xi)]
            = E(UiXi) − E(Ui)E(Xi)
            = E(UiXi) − 0 · E(Xi), given E(Ui) = 0
            = E(UiXi)
            = Xi E(Ui), since the values of the Xi's are fixed
cov(Ui, Xi) = Xi · 0 = 0
2.3.3. Assumptions about the explanatory variables (Xi)
i. The explanatory variables Xi are measured without error, i.e. there is no problem of aggregation.
▪ The X's are a set of fixed values that are measured without error.
▪ If there is an error of measurement, it will be absorbed by the random term Ui.
▪ That is, Ui absorbs the influence of omitted variables and possibly errors of measurement in the Y's; we assume the regressors are error-free, while the Y values may or may not include errors of measurement.
ii. Variability in the X values.
• The X values in a given sample must not all be the same.
• Technically, var(X) must be a finite positive number.
✓ If Xi = X̄ for all i, it is impossible to estimate the parameters.



Cont.………………………………………………………….
iii. No multicollinearity.
• The explanatory variables are not perfectly linearly correlated.
• If there is more than one explanatory variable, they are assumed not to be perfectly correlated with each other.
• Example: in 𝑌𝑡 = 𝛼 + 𝛽1𝑋1 + 𝛽2𝑋2 + 𝛽3𝑋3 + 𝑈𝑖, the pairs 𝑋1 & 𝑋2, 𝑋2 & 𝑋3, and 𝑋1 & 𝑋3 are not perfectly correlated with each other, i.e. there is no multicollinearity.
• That is, there are no perfect linear relationships among the explanatory variables.
iv. X values are fixed in repeated sampling.
• The values taken by the regressor X are considered fixed in repeated samples.
• More technically, X is assumed to be non-stochastic.
• In other words, X is assumed to be known with certainty.
• What all this means is that our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.



2.3.4. Assumptions about the model
i. Linear regression model.
• The model is linear in the parameters.
• The classical theorists assumed that the model should be linear in the parameters, regardless of whether the explanatory and dependent variables enter linearly or not.
• This is because if the parameters are non-linear, estimation becomes difficult. Examples of models that are linear in the parameters:

𝒀 = 𝜶 + 𝜷𝑿² + 𝒖

𝒀² = 𝜶 + 𝜷𝟏𝑿𝟏 + 𝜷𝟐𝑿𝟏𝑿𝟐
Cont.……………………………………………..
ii. The number of observations n must be greater than the number of parameters to be estimated.
• Alternatively, the number of observations n must be greater than the number of explanatory variables.
• From a single observation there is no way to estimate the two unknowns, α and β.
• We need at least two pairs of observations to estimate the two unknowns.
iii. The regression model is correctly specified.
• Alternatively, there is no specification bias or error in the model used in empirical analysis.
• Some important questions that arise in the specification of the model include the following:
a. What variables should be included in the model?
b. What is the functional form of the model? Is it linear in the parameters, the variables, or both?
c. What are the probabilistic assumptions made about the Yi, the Xi, and the ui entering the model?
2.4. Methods of estimation
• The next step is the estimation of the numerical values of the parameters of economic relationships.
• The parameters of the simple linear regression model can be estimated by various methods. Three of the most commonly used methods are:
1. The ordinary least squares method (OLS)
2. The maximum likelihood method (MLM)
3. The method of moments (MM)
▪ Here we will deal with the OLS method of estimation.



2.4.1. The ordinary least squares (OLS) method
• The model Yi = α + βXi + Ui is called the true relationship between Y and X because Y and X represent their respective population values; α and β are called the true parameters since they are computed from the population values of Y and X.
• But it is difficult to obtain the population values of Y and X for technical or economic reasons.
• So we are forced to take sample values of Y and X.
• The parameters estimated from the sample values of Y and X are called the estimators of the true parameters α and β, and are symbolized as α̂ and β̂.
• The model Yi = α̂ + β̂Xi + ei is called the estimated relationship between Y and X.
• From the estimated relationship Yi = α̂ + β̂Xi + ei, we obtain:

ei = Yi − (α̂ + β̂Xi) .......................(2.8)

Σei² = Σ(Yi − α̂ − β̂Xi)² ..............................(2.9)

To find the values of α̂ and β̂ that minimize this sum, we partially differentiate Σei² with respect to α̂ and β̂ and set the partial derivatives equal to zero:

∂Σei²/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) = 0 .......................(2.10)

which gives the first normal equation ΣYi = nα̂ + β̂ΣXi, and hence

α̂ = Ȳ − β̂X̄ ..........................................(2.11)

∂Σei²/∂β̂ = −2ΣXi(Yi − α̂ − β̂Xi) = 0 ..................(2.12)

which gives the second normal equation ΣXiYi = α̂ΣXi + β̂ΣXi².

Substituting the value of α̂ from equation (2.11) into the second normal equation:

ΣXiYi = (Ȳ − β̂X̄)ΣXi + β̂ΣXi²
      = nX̄Ȳ − β̂nX̄² + β̂ΣXi²

so that

β̂ = (ΣXiYi − nX̄Ȳ) / (ΣXi² − nX̄²) ..................(2.13)


Let us define the deviations of the variables from their means, using lowercase letters, as follows:

Σyi² = Σ(Yi − Ȳ)² = ΣYi² − nȲ²
Σxi² = Σ(Xi − X̄)² = ΣXi² − nX̄²
Σxiyi = Σ(Xi − X̄)(Yi − Ȳ) = ΣXiYi − nX̄Ȳ

where X̄ and Ȳ are the sample means of X and Y, and we define xi = Xi − X̄ and yi = Yi − Ȳ. Henceforth we adopt the convention that lowercase letters denote deviations from mean values. In deviation form the slope estimator becomes:

β̂ = Σxiyi / Σxi²

Finally, if Y is the dependent variable and X is an explanatory variable, the sample regression function (SRF) of Y on X is written formally as:

Ŷi = α̂ + β̂Xi

We can prove the equality of the numerators (and, analogously, the denominators) of the two forms of β̂:

Σxiyi = Σ(Xi − X̄)(Yi − Ȳ)
      = ΣXiYi − ȲΣXi − X̄ΣYi + nX̄Ȳ
      = ΣXiYi − nX̄Ȳ − nX̄Ȳ + nX̄Ȳ        (using ΣXi = nX̄ and ΣYi = nȲ)
      = ΣXiYi − nX̄Ȳ
      = ΣXiYi − (ΣXi)(ΣYi)/n

Multiplying both sides by n gives nΣXiYi − ΣXiΣYi.
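As a numerical check on the algebra above, the raw-sum form and the deviation form of β̂ can be computed side by side; the small data set below is invented for illustration:

```python
# Hedged sketch: OLS estimates via the two equivalent slope formulas.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Raw-sum form: (ΣXY − nX̄Ȳ) / (ΣX² − nX̄²)
beta_raw = (sum(x * y for x, y in zip(X, Y)) - n * x_bar * y_bar) / \
           (sum(x * x for x in X) - n * x_bar ** 2)

# Deviation form: Σxy / Σx², with x = X − X̄ and y = Y − Ȳ
beta_dev = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / \
           sum((x - x_bar) ** 2 for x in X)

alpha_hat = y_bar - beta_dev * x_bar      # α̂ = Ȳ − β̂X̄

print(beta_raw, beta_dev)   # the two forms agree
print(alpha_hat)
```

For these numbers both formulas give the same slope (about 1.99), confirming the equality of numerators and denominators proved above.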
Estimation of a function with zero intercept
• Suppose it is desired to fit the line Yi = α + βXi + Ui subject to the restriction α = 0. To estimate α̂ and β̂, the problem is put in the form of a restricted minimization problem and the Lagrange method is applied.

We minimize: Σei² = Σ(Yi − α̂ − β̂Xi)²
Subject to: α̂ = 0

Z = Σ(Yi − α̂ − β̂Xi)² − λα̂

We minimize the function with respect to α̂, β̂ and λ:

∂Z/∂α̂ = −2Σ(Yi − α̂ − β̂Xi) − λ = 0 --------(i)
∂Z/∂β̂ = −2Σ(Yi − α̂ − β̂Xi)Xi = 0 --------(ii)
∂Z/∂λ = −α̂ = 0 --------(iii)

Substituting (iii) in (ii) and rearranging, we obtain:

ΣXi(Yi − β̂Xi) = 0
ΣXiYi − β̂ΣXi² = 0

β̂ = ΣXiYi / ΣXi²
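A sketch of the restricted estimator on made-up data; under the restriction α = 0 the slope collapses to ΣXY/ΣX², as derived above:

```python
# Regression through the origin: β̂ = ΣXiYi / ΣXi²  (restriction α = 0)
X = [1.0, 2.0, 3.0, 4.0]
Y = [2.0, 4.1, 5.9, 8.0]

beta_hat = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)
print(beta_hat)
```

Note that no intercept is estimated at all; the fitted line is forced through the point (0, 0).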
2.4.2. PROPERTIES OF OLS ESTIMATORS
• The ideal or optimum properties that the OLS estimates possess may be summarized by the well-known Gauss-Markov theorem.
• Statement of the theorem: "Given the assumptions of the classical linear regression model, the OLS estimators, in the class of linear unbiased estimators, have the minimum variance, i.e. the OLS estimators are BLUE."
• According to this theorem, under the basic assumptions of the classical linear regression model, the least squares estimators are linear, unbiased, and have minimum variance (i.e. they are the best of all linear unbiased estimators). Sometimes the theorem is referred to as the BLUE theorem: Best, Linear, Unbiased Estimator. An estimator is called BLUE if it is:
a. Linear: a linear function of a random variable, such as the dependent variable Y.
b. Unbiased: its average or expected value is equal to the true population parameter.
c. Minimum variance: it has minimum variance in the class of linear unbiased estimators. An unbiased estimator with the least variance is known as an efficient estimator.



Assignment
• Prove that the OLS estimators are BLUE:
1. Linearity
2. Unbiasedness
3. Minimum variance of the OLS parameters



2.5. Statistical Tests of Significance of the OLS Estimators (First Order Tests)
• We divide the available criteria into three groups: the theoretical a priori criteria, the statistical criteria, and the econometric criteria. In this section our focus is on the statistical criteria (first order tests). The two most commonly used first order tests in econometric analysis are:
I. The coefficient of determination (the square of the correlation coefficient, i.e. R²). This test is used for judging the explanatory power of the independent variable(s).
II. The standard error tests of the estimators. This test is used for judging the statistical reliability of the estimates of the regression coefficients.
TESTS OF THE 'GOODNESS OF FIT' WITH R²
• R² shows the percentage of the total variation of the dependent variable that can be explained by changes in the explanatory variable(s) included in the model. To elaborate, let us draw a horizontal line corresponding to the mean value of the dependent variable, Ȳ (see figure (d) below).
• By fitting the line Ŷ = β̂0 + β̂1X we try to obtain the explanation of the variation of the dependent variable Y produced by the changes of the explanatory variable X.



• As can be seen from fig. (d) above, Y − Ȳ measures the variation of the sample observation values of the dependent variable around the mean.
• However, the variation in Y that can be attributed to the influence of X (i.e. the regression line) is given by the vertical distance Ŷ − Ȳ.
• The part of the total variation in Y about Ȳ that cannot be attributed to X is equal to e = Y − Ŷ, which is referred to as the residual variation.
• In summary:
• ei = Yi − Ŷi = deviation of the observation Yi from the regression line;
• yi = Yi − Ȳ = deviation of Y from its mean;
• ŷi = Ŷi − Ȳ = deviation of the regressed (predicted) value Ŷi from the mean.



• Now we may write the observed Y as the sum of the predicted value (Ŷ) and the residual term (ei):

Yi = Ŷi + ei   (observed Yi = predicted Yi + residual)

In deviation form: yi = ŷi + ei

By squaring and summing both sides, we obtain:

Σyi² = Σ(ŷi + ei)² = Σŷi² + Σei² + 2Σŷiei

But Σŷiei = Σei(Ŷi − Ȳ) = Σei(α̂ + β̂Xi − Ȳ) = α̂Σei + β̂Σeixi − ȲΣei, and since Σei = 0 and Σeixi = 0:

Σŷiei = 0 ................................(2.46)

Σyi² = Σŷi² + Σei² .............................(2.47)
(Total variation = Explained variation + Unexplained variation)
(Total sum of squares = Explained sum of squares + Residual sum of squares)

TSS = ESS + RSS ............................(2.48)

Mathematically, the explained variation as a percentage of the total variation is:

ESS/TSS = Σŷi²/Σyi² ..........................................(2.49)

From equation (2.37) we have ŷ = β̂x. Squaring and summing both sides gives us:

Σŷ² = β̂²Σx² ...........................(2.50)
We can substitute (2.50) in (2.49) and obtain:

ESS/TSS = β̂²Σx²/Σy² ......................................(2.51)

Since β̂ = Σxiyi/Σxi², this becomes:

ESS/TSS = (Σxy/Σx²)² · (Σx²/Σy²) = (Σxy)²/(Σx²Σy²) ............................(2.52)

Comparing (2.52) with the square of the correlation coefficient, r² = (Σxy)²/(Σx²Σy²), we see that they are exactly the same expression. Therefore ESS/TSS = r², and this ratio is the coefficient of determination R².

From (2.48), RSS = TSS − ESS. Hence R² becomes:

R² = (TSS − RSS)/TSS = 1 − RSS/TSS = 1 − Σei²/Σyi² ...............(2.55)

RSS = Σei² = Σyi²(1 − R²) ................................(2.56)

The limits of R²: the value of R² falls between zero and one, i.e. 0 ≤ R² ≤ 1.
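The decomposition TSS = ESS + RSS and both expressions for R² can be verified numerically; the data below are invented for illustration:

```python
# Verify TSS = ESS + RSS and R² = ESS/TSS = 1 − RSS/TSS on toy data.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [1.8, 4.3, 5.6, 8.2, 9.9]
n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n

# OLS fit (intercept included, as the identity requires)
beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / \
           sum((x - x_bar) ** 2 for x in X)
alpha_hat = y_bar - beta_hat * x_bar
Y_hat = [alpha_hat + beta_hat * x for x in X]

TSS = sum((y - y_bar) ** 2 for y in Y)                # total variation
ESS = sum((yh - y_bar) ** 2 for yh in Y_hat)          # explained variation
RSS = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))   # residual variation

R2 = ESS / TSS
print(TSS, ESS + RSS)       # equal up to rounding
print(R2, 1 - RSS / TSS)    # the two R² expressions agree
```

Because the fitted line passes through (X̄, Ȳ) and the residuals are orthogonal to the regressors, the cross term 2Σŷe vanishes and the decomposition holds exactly.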


Interpretation of R²
• Suppose R² = 0.9. This means that the regression line gives a good fit to the observed data, since the line explains 90% of the total variation of the Y values around their mean.
• The remaining 10% of the total variation in Y is unaccounted for by the regression line and is attributed to the factors captured by the disturbance variable Ui.



TESTING THE SIGNIFICANCE OF OLS PARAMETERS
To test the significance of the OLS parameter estimators we need the following:
• The variance of the parameter estimators;
• An unbiased estimator of σ²;
• The assumption of normality of the distribution of the error term.

var(β̂) = σ̂²/Σxi²
var(α̂) = σ̂²ΣXi²/(nΣxi²)
σ̂² = Σei²/(n − 2) = RSS/(n − 2)

• The most common tests are:
i. The standard error test
ii. Student's t-test
iii. Confidence intervals
Standard error test
• This test helps us decide whether the estimates α̂ and β̂ are significantly different from zero, i.e. whether the sample from which they have been estimated might have come from a population whose true parameters are zero: α = 0 and/or β = 0.

• Formally, we test the hypotheses:

Null hypothesis H0: βi = 0 (X and Y have no relationship)

Alternative hypothesis H1: βi ≠ 0 (X and Y have a relationship)



The standard error test may be outlined as follows.
• First: compute the standard errors of the parameters:
1. SE(β̂) = √var(β̂)
2. SE(α̂) = √var(α̂)
• Second: compare the standard errors with the numerical values of α̂ and β̂.
Decision rule:
1. If SE(β̂) > ½|β̂|, accept the null hypothesis and reject the alternative hypothesis. We conclude that β̂ is statistically insignificant.
2. If SE(β̂) < ½|β̂|, reject the null hypothesis and accept the alternative hypothesis. We conclude that β̂ is statistically significant.

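The rule of thumb above (compare SE(β̂) with half the absolute value of the estimate) is easy to mechanize; the numbers passed in below are hypothetical:

```python
def standard_error_test(beta_hat: float, se_beta: float) -> str:
    """Crude significance screen: significant iff SE(β̂) < |β̂|/2."""
    return "significant" if se_beta < abs(beta_hat) / 2 else "insignificant"

print(standard_error_test(0.70, 0.20))  # SE < β̂/2 → reject H0
print(standard_error_test(0.70, 0.60))  # SE > β̂/2 → fail to reject H0
```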


• The acceptance or rejection of the null hypothesis has a definite economic meaning.
• Namely, the acceptance of the null hypothesis β = 0 (the slope parameter is zero) implies that the explanatory variable to which this estimate relates does not in fact influence the dependent variable Y and should not be included in the function, since the conducted test provided evidence that changes in X leave Y unaffected.
• In other words, acceptance of H0 implies that the relationship between Y and X is in fact Y = α + 0(X) = α, i.e. there is no relationship between X and Y.



Student's t-test
• Like the standard error test, this test is also used to test the significance of the parameters.
i. Derive the t-value of the OLS estimates. Define the null and alternative hypotheses: null hypothesis H0: βi = 0 against the alternative hypothesis H1: βi ≠ 0.
ii. Choose the level of significance, which is the probability of rejecting the null hypothesis when it is true; in other words, the level of significance is the probability of the researcher committing a type I error.



iii. Define the number of degrees of freedom (d.f.) as N − K, where N is the sample size and K is the number of estimated parameters (the β's).
iv. Find the theoretical or table value of t, obtained from the t-table using the d.f. and the significance level chosen. This value is termed t-tabulated (t-tab) or t-critical (t-cr).
v. Find the calculated value of t using the formula t = β̂ / SE(β̂). This value is termed t-calculated (t-cal), t-computed (t-com), or the t-statistic (t-stat).
vi. Finally, the t-calculated is compared with the t-tabulated value.



Decision rule
i. If t-cal > t-tab, we reject the null hypothesis (or accept the alternative hypothesis) and conclude that the estimate β̂ is statistically significant.
ii. If t-cal < t-tab, we accept the null hypothesis (or reject the alternative hypothesis), so that the estimated parameter β̂ is not statistically significant.
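The steps above can be sketched in code. The estimate and standard error below are hypothetical; the critical value 2.101 is the two-tailed 5% point for 18 degrees of freedom (N = 20, K = 2) from a standard t-table:

```python
beta_hat = 0.49       # hypothetical slope estimate
se_beta = 0.09        # hypothetical standard error
t_crit = 2.101        # two-tailed 5% critical value, t-table, d.f. = 18

t_cal = beta_hat / se_beta            # t = β̂ / SE(β̂)
significant = abs(t_cal) > t_crit     # reject H0: β = 0 if |t-cal| > t-tab
print(t_cal, significant)
```

Here t-cal is about 5.44, well above 2.101, so the null hypothesis would be rejected.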


3. Confidence Intervals
• A confidence interval is an alternative way of expressing the significance level in percentages. For example, if the confidence interval is given as 95 percent (i.e. CI = 95%), the researcher is 95% confident of not committing a type I error by rejecting the null hypothesis when it is true.
• In other words, using a significance level of α = 5%, the researcher has a 5% probability of committing that error.
• Thus, the 95% CI can be computed and presented using the following formula.


➢ In order to define how close the estimate is to the true parameter, we must construct a confidence interval for the true parameter; in other words, we must establish limiting values around the estimate within which the true parameter is expected to lie with a certain "degree of confidence".
➢ We choose a probability in advance and refer to it as the confidence level (confidence coefficient).
➢ It is customary in econometrics to choose the 95% confidence level. This means that in repeated sampling the confidence limits computed from the sample would include the true population parameter in 95% of the cases.
➢ In the other 5% of the cases the population parameter will fall outside the confidence interval.
• The limits within which the true β lies at the (1 − α)% degree of confidence are:
(β̂ − SE(β̂)·tc , β̂ + SE(β̂)·tc)
where tc is the critical value of t at the α/2 level with n − 2 degrees of freedom.
• The test procedure is outlined as follows.
i. Null hypothesis H0: βi = 0 against the alternative hypothesis H1: βi ≠ 0.
Decision rule:
i. If the hypothesized value of β in the null hypothesis is within the confidence interval, accept H0 and reject H1. The implication is that β̂ is statistically insignificant.
ii. If the hypothesized value of β in the null hypothesis is outside the limits, reject H0 and accept H1. This indicates that β̂ is statistically significant.

Numerical Example
Given the following data from a sample of 20 observations corresponding to the regression model Yi = α + βXi + Ui:

ΣYi = 21.9
ΣXi = 186.2
Σ(Xi − X̄)(Yi − Ȳ) = 106.4
Σ(Yi − Ȳ)² = 86.9
Σ(Xi − X̄)² = 215.4

Find:
a. Estimates of α and β
b. The explained and unexplained variation in the model
c. The coefficient of determination
d. The variances of the estimates
e. The standard errors of the regression coefficients
f. A test of significance at the 5% level of significance
g. A 95% confidence interval for α
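As a worked sketch (not the instructor's answer key), parts (a) through (g) can be computed directly from the given sums using the formulas of this chapter; the 5% critical value 2.101 for 18 degrees of freedom is taken from a standard t-table:

```python
import math

n = 20
sum_Y, sum_X = 21.9, 186.2
S_xy, S_yy, S_xx = 106.4, 86.9, 215.4    # deviation sums given above

x_bar, y_bar = sum_X / n, sum_Y / n

# (a) OLS estimates
beta_hat = S_xy / S_xx                   # β̂ = Σxy / Σx²
alpha_hat = y_bar - beta_hat * x_bar     # α̂ = Ȳ − β̂X̄

# (b) explained and unexplained variation
ESS = beta_hat ** 2 * S_xx               # β̂²Σx²
RSS = S_yy - ESS

# (c) coefficient of determination
R2 = ESS / S_yy

# (d), (e) variances and standard errors
sigma2_hat = RSS / (n - 2)               # σ̂² = RSS / (n − 2)
var_beta = sigma2_hat / S_xx
var_alpha = sigma2_hat * (S_xx + n * x_bar ** 2) / (n * S_xx)  # ΣX² = Σx² + nX̄²
se_beta, se_alpha = math.sqrt(var_beta), math.sqrt(var_alpha)

# (f) t-test at the 5% level, d.f. = n − 2 = 18
t_crit = 2.101
t_beta = beta_hat / se_beta

# (g) 95% confidence interval for α
ci_alpha = (alpha_hat - t_crit * se_alpha, alpha_hat + t_crit * se_alpha)

print(round(beta_hat, 4), round(alpha_hat, 4), round(R2, 4))
print(round(t_beta, 3), abs(t_beta) > t_crit)
print(ci_alpha)
```

With these sums, β̂ is roughly 0.494, R² roughly 0.60, and the t-statistic for β̂ well above the critical value, so the slope would be judged statistically significant at the 5% level.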
