Introduction To Econometric - Tutor

Econometrics Tutor

By: Aemro Tazeze (PhD)

Department:
BSc in Agricultural Economics
BSc in Agribusiness and Value Chain Management

Haramaya University, Ethiopia

5 March 2023 Aemro T.


Core Competence for National Exit Exam

General Objective
▪ Understand, estimate and predict economic variables using regression

Specific Objectives
▪ Understand and apply the methodology of econometrics for their research
▪ Estimate and use a regression model using real data and interpret the result
▪ Use estimated equations to make predictions and forecasts

Outlines for Tutor

❖ Introduction to Econometrics
❖ Linear Regression & Econometric Problem
❖ Dummy Regression
❖ Non-linear Regression
❖ Introduction to Time Series


Introduction to Econometrics

❖ What is Econometrics?
❖ Use of Econometrics
❖ Steps in Econometric Methodology



What is Econometrics?

◼ Intuitively, econometrics (econ + metrics) is economic measurement.
▪ It combines economic theory, mathematics and statistics to explain, model and measure quantitative economic relationships.
▪ Econometrics is relevant in virtually every branch of applied economics: finance, labor, health, industrial, macro, development, international, trade, marketing strategy, etc.


What is Econometrics?

What makes econometrics different from other applications of statistics?

1. Economic data are non-experimental.
2. Economic models (either simple or sophisticated) are key to interpreting the statistical results in econometric applications.

Use of Econometrics

▪ Testing economic theories
✓ Test Keynes's hypothesis: consumption increases with income
▪ Estimation of economic relationships
✓ Demand and supply equations
✓ Production functions


Use of Econometrics

▪ Forecasting
✓ Use current and past economic data to predict future values of variables such as inflation, GDP, stock prices, etc.
▪ Evaluating government policies
✓ Impact of coffee export on economic growth


Steps in Econometrics Methodology

1. Formulation of the question(s) of interest
2. Collection of data
3. Specification of the econometric model
4. Estimation, validation, hypothesis testing and forecasting


Steps in Econometrics Methodology

Step 1: Question(s)/hypothesis

• Suppose we want to test the hypothesis:
H1: Consumption increases as income increases
• This is important for decision-makers who want to improve household consumption

Steps in Econometrics Methodology

Step 2: Collection of data

▪ Different datasets can be collected to test this hypothesis:
1. Data on consumption and data on income.
2. Data at the household level vs. the country level.
3. Types of data: time series data (over several years), cross-sectional data (at a single point in time), or panel data (many households over multiple periods).

Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ An econometric model contains components that are observable and components that are not observable to the researcher.
▪ The researcher's specification decision depends critically on what is observable.


Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ Which function we estimate depends very much on the available data:
✓ with data on consumption and data on income, we can estimate a consumption function

Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ Suppose that we decide to estimate a consumption function:

$$Y = F(X_1, X_2, \ldots, X_k, u)$$


Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ An important specification assumption is the choice of the functional form of the consumption function F(.), guided by the empirical literature.
• For instance, is it a linear, quadratic, or Cobb-Douglas (C-D) function?

Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ The linear consumption function is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + u$$

• The β's are parameters to estimate.
• u represents factors unobservable to the econometrician, e.g., climate-related shocks.

Steps in Econometrics Methodology

Step 3: Specification of the econometric model

▪ Certain conditions on the statistical properties of the error term are key for the good properties of our estimators of the parameters of interest.
▪ The economic interpretation of the error term is very important for interpreting our estimation results.


Steps in Econometrics Methodology

Step 4: Estimation, validation, hypothesis testing, forecasting

▪ We want to estimate the parameters β in the consumption function.
▪ After estimation, we run specification tests to validate some of the specification assumptions.
▪ The results of these tests may imply a re-specification and re-estimation of the model.


Steps in Econometrics Methodology

Step 4: Estimation, validation, hypothesis testing, prediction

▪ Once we have a validated model, we can interpret the results from an economic point of view, run tests, and make predictions.

Linear Regression & Econometric Problem

Introduction
▪ Economic theory specifies a set of precise, deterministic relationships among variables.
▪ An econometric model, however, combines a deterministic component with a stochastic (random) component.
▪ We start with OLS, which minimizes the sum of squared residuals.


Linear Regression

Theory: as income increases, consumption also increases, but not as much as income.

Mathematical model: $y_i = f(x_i) = b_0 + b_1 x_i$

Econometric model: $y_i = f(x_i) + e_i = b_0 + b_1 x_i + e_i$


Some Concepts: Regression, Causation, and Correlation

▪ Correlation analysis shows whether a relationship exists between two variables.
✓ It measures the strength of linear association between the variables.
• But it does not establish a cause-and-effect relationship.
▪ Correlation analysis treats y and x symmetrically: neither is designated as the dependent variable.
▪ The correlation coefficient shows the:
✓ existence, direction and magnitude of a relationship between the variables; it cannot be used to predict one variable from the other.

Cont…

▪ Regression is the estimation or prediction of the average value of a dependent variable on the basis of the fixed values of other variables.
▪ Regression presumes a direction of dependence from the independent variable x to the dependent variable y.
▪ Causation comes from theory rather than statistics.


Cont…

When do we apply regression analysis?

To study the relationship between dependent and independent variables by determining the:
◼ extent,
◼ direction and
◼ strength of the relationship


Cont…

▪ The classical linear regression population model is of the form:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k + \varepsilon$$

▪ Where
◼ Y is the dependent variable,
◼ $X_1, X_2, \ldots, X_k$ are the independent variables,
◼ ε is the unobservable random disturbance or error term, and
◼ $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ are the parameters (constants).


Cont…

▪ Why include the disturbance term ε?
✓ Measurement errors
✓ Erratic human behavior
✓ Exclusion of important variables
✓ Simultaneity
▪ For observation i in the sample, the linear regression is given by:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i$$


Cont…

▪ Why more than one predictor variable?
✓ More than one variable influences a dependent variable.
✓ Predictors may themselves be correlated (multicollinearity).


Cont…

▪ Model selection
✓ The model should explain the most variation in the dependent variable
▪ Evaluation of assumptions
✓ Have we met the assumptions of OLS?
▪ Model validation
✓ Validating the model results


Cont…

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_k X_{ik} + \varepsilon_i$$

$\beta_0$ - Intercept
$\beta_1, \ldots, \beta_k$ - Partial regression slope coefficients
$\varepsilon_i$ - Error term associated with the ith observation

This model gives the expected value of Y conditional on the fixed values of $X_1, X_2, \ldots, X_k$, plus an error.

Matrix Representation

For a sample of size n, the regression model is best described as a system of equations:

$$
\begin{aligned}
Y_1 &= \beta_0 + \beta_1 X_{11} + \cdots + \beta_k X_{1k} + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 X_{21} + \cdots + \beta_k X_{2k} + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 X_{n1} + \cdots + \beta_k X_{nk} + \varepsilon_n
\end{aligned}
$$


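Cont…

• We can re-write these equations in matrix form as:

$$
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
=
\begin{bmatrix}
1 & X_{11} & X_{12} & \cdots & X_{1k} \\
1 & X_{21} & X_{22} & \cdots & X_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_{n1} & X_{n2} & \cdots & X_{nk}
\end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
$$

$$\underset{(n \times 1)}{Y} = \underset{(n \times k)}{X}\;\underset{(k \times 1)}{\beta} + \underset{(n \times 1)}{\varepsilon}$$

In the dimension labels, k denotes the total number of columns of X, counting the column of ones for the intercept; this is how k is used in the degrees of freedom n − k.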


OLS Assumptions

▪ Assumption 1: The expected value of the error vector is 0

$$E(\varepsilon) = E\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$


OLS Assumptions

▪ Assumption 2: There is no correlation between the ith and jth error terms

$$E(\varepsilon_i \varepsilon_j) = 0, \quad i \neq j$$

▪ This is called no autocorrelation


OLS Assumptions

▪ Assumption 3: The errors exhibit constant variance

$$E(\varepsilon \varepsilon') = \sigma^2 I$$

▪ This is called homoscedasticity
▪ If the errors do not exhibit constant variance, they are heteroscedastic

OLS Assumptions

▪ Assumption 4: Covariance between the X's and error terms is 0

$$\operatorname{cov}(\varepsilon, X) = 0$$

◼ Usually satisfied if the predictor variables are fixed and non-stochastic
◼ X is then called an exogenous variable
◼ If not, it is called an endogenous variable


OLS Assumptions

▪ Assumption 5: No exact linear relationships among the X variables.
◼ Assumption of no multicollinearity


OLS Assumptions

▪ If these assumptions hold…
◼ Then the OLS estimators are linear, unbiased, and minimum-variance estimators
◼ In this case we say that the OLS estimators are BLUE (Best Linear Unbiased Estimators)


OLS Assumptions

What does it mean to be BLUE?
◼ Among all linear unbiased estimators, the OLS estimator has the smallest variance
◼ This allows us to compute a number of statistics from the OLS estimation


OLS Assumptions

▪ Assumption 6: The error terms are normally distributed.

$$\varepsilon_i \sim N(0, \sigma^2)$$

✓ Not strictly necessary for estimation, but it eases statistical inference.

▪ Assumption 7: The data generating process for X is not related to ε


OLS Estimation

▪ Regression model in matrix form, with b the estimate of β and e the vector of residuals:

$$Y = Xb + e$$

▪ OLS requires choosing values of b such that the residual sum of squares (SSR) is as small as possible.


The Normal Equations

▪ Need to differentiate with respect to the unknowns (b):

$$SSR = e'e = (Y - Xb)'(Y - Xb)$$

▪ Yields k simultaneous equations in k unknowns, also known as the Normal Equations
▪ Matrix form of the normal equations:

$$(X'X)\,b = X'Y$$

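As a sketch of the differentiation step, using standard matrix calculus and the same symbols Y, X and b as above, the residual sum of squares can be expanded and its derivative set to zero:

```latex
\begin{aligned}
e'e &= (Y - Xb)'(Y - Xb) = Y'Y - 2\,b'X'Y + b'X'Xb \\
\frac{\partial (e'e)}{\partial b} &= -2\,X'Y + 2\,X'Xb = 0 \\
\Rightarrow\; (X'X)\,b &= X'Y
\end{aligned}
```

The middle term uses the fact that $b'X'Y$ is a scalar and therefore equals its own transpose $Y'Xb$.
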
The solution for the “b’s”

• It should be apparent how to solve for the unknown parameters
• Pre-multiply by the inverse of X'X:

$$(X'X)^{-1}(X'X)\,b = (X'X)^{-1}X'Y$$

$$b = (X'X)^{-1}X'Y$$

• This is the fundamental outcome of OLS theory

Goodness-of-Fit (R2)

▪ The R2 statistic is given by:

$$R^2 = \frac{SSE}{SST}$$

where SSE is the explained (regression) sum of squares and SST is the total sum of squares.
• It is the proportion of variability in the response variable that is accounted for by the explanatory variables
▪ $0 \le R^2 \le 1$
▪ Good fit: R2 will be close to one.
▪ Poor fit: R2 will be near 0.


R2 – Coefficient of Determination

$$R^2 = 1 - \frac{SSR}{SST} = 1 - \frac{(Y - \hat{Y})'(Y - \hat{Y})}{(Y - \bar{Y})'(Y - \bar{Y})}$$


Critique of R2

▪ R2 is inflated by increasing the number of explanatory variables in the model
✓ Alternatively, use the adjusted R2


Adjusted R2

$$\bar{R}^2 = 1 - \frac{(Y - \hat{Y})'(Y - \hat{Y})/(n - k)}{(Y - \bar{Y})'(Y - \bar{Y})/(n - 1)} = 1 - \frac{MSR}{MST}$$

$$k \ge 1; \quad \bar{R}^2 \le R^2$$


How does adjusted R2 work?

▪ The total sum of squares is fixed, since it is independent of the number of explanatory variables
▪ The residual sum of squares, SSR, in the numerator decreases as the number of variables increases
▪ R2 is therefore artificially inflated by adding explanatory variables to the model
▪ Adjusted R2 takes into account the number of predictors in the model

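The two goodness-of-fit formulas can be checked numerically. The following is a minimal Python sketch (Python is not used in the original slides, and the function names are illustrative) that plugs in the sums of squares reported by the Stata regression of the expenditure example in these slides, with n = 10 observations and k = 3 estimated parameters including the intercept.

```python
def r_squared(explained_ss, total_ss):
    """R^2 = SSE / SST, with SSE the explained (regression) sum of squares."""
    return explained_ss / total_ss


def adjusted_r_squared(residual_ss, total_ss, n, k):
    """Adjusted R^2 = 1 - [SSR / (n - k)] / [SST / (n - 1)].

    k counts all estimated parameters, including the intercept.
    """
    return 1.0 - (residual_ss / (n - k)) / (total_ss / (n - 1))


# Sums of squares taken from the Stata output of the expenditure example.
sse = 237.307999   # Model (explained) sum of squares
ssr = 34.6920014   # Residual sum of squares
sst = 272.0        # Total sum of squares
n, k = 10, 3       # 10 families; intercept + income + family size

r2 = r_squared(sse, sst)
adj_r2 = adjusted_r_squared(ssr, sst, n, k)
print(round(r2, 4), round(adj_r2, 4))
```

Under these inputs the two values should agree with the R-squared and Adj R-squared lines of the Stata output up to rounding.
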
Statistical Inference

▪ Inference can be made using:
1) hypothesis testing
2) interval estimation


ANOVA Approach

▪ Decomposition of the total sum of squares into components relating to
◼ explained variance (regression)
◼ unexplained variance (error)


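ANOVA Table

Source of Variation | Sums-of-Squares       | df    | Mean Square                     | F-ratio
Regression          | $b'X'Y - n\bar{Y}^2$  | k − 1 | $(b'X'Y - n\bar{Y}^2)/(k - 1)$  | MSE/MSR
Residual            | $Y'Y - b'X'Y$         | n − k | $(Y'Y - b'X'Y)/(n - k)$         |
Total               | $Y'Y - n\bar{Y}^2$    | n − 1 |                                 |

Here MSE is the mean square explained by the regression and MSR is the residual mean square.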


F-test/Test of Multiple Restrictions

• Tests the null hypothesis:

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$

• The null hypothesis is known as a joint or simultaneous hypothesis, because it restricts the values of all the $\beta_i$ simultaneously
• This tests the overall significance of the regression model


The F-test statistic and R2 vary directly

$$F = \frac{(b'X'Y - n\bar{Y}^2)/(k - 1)}{(Y'Y - b'X'Y)/(n - k)} = \frac{SSE/(k - 1)}{SSR/(n - k)}$$

$$F = \frac{SSE/(k - 1)}{(SST - SSE)/(n - k)} = \frac{SSE/SST}{1 - (SSE/SST)} \cdot \frac{n - k}{k - 1}$$

$$F = \frac{R^2}{1 - R^2} \cdot \frac{n - k}{k - 1}$$

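To illustrate that the two forms of the F statistic coincide, here is a small Python sketch (not part of the original slides; function names are illustrative). It uses the explained and residual sums of squares from the Stata output of the expenditure example, with n = 10 and k = 3 parameters including the intercept.

```python
def f_from_sums(sse, ssr, n, k):
    """F = [SSE / (k - 1)] / [SSR / (n - k)], SSE explained and SSR residual."""
    return (sse / (k - 1)) / (ssr / (n - k))


def f_from_r2(r2, n, k):
    """F = [R^2 / (1 - R^2)] * (n - k) / (k - 1)."""
    return (r2 / (1.0 - r2)) * (n - k) / (k - 1)


sse, ssr = 237.307999, 34.6920014   # from the Stata regression output
n, k = 10, 3
r2 = sse / (sse + ssr)              # SST = SSE + SSR

f_sums = f_from_sums(sse, ssr, n, k)
f_r2 = f_from_r2(r2, n, k)
print(round(f_sums, 2), round(f_r2, 2))
```

Both routes should reproduce the F(2, 7) value reported by Stata, which is the sense in which F and R2 "vary directly".
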
Test statistic

$$t = \frac{b_i - \beta_i}{s\sqrt{c_{ii}}}$$

• Follows a t distribution with n − k df, where $c_{ii}$ is the element in the ith row and ith column of $(X'X)^{-1}$
• The 100(1 − α)% confidence interval is obtained from

$$b_i \pm t_{\alpha/2;\, n-k}\; s\sqrt{c_{ii}}$$

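A short Python sketch of the t ratio and confidence interval follows (Python and the function names are additions, not part of the slides). It takes the income coefficient and its standard error, which plays the role of $s\sqrt{c_{ii}}$, from the Stata output of the expenditure example. The critical value 2.364624 is the tabulated two-sided 5% t value for 7 degrees of freedom and is hard-coded here as an assumption rather than computed.

```python
T_CRIT_975_DF7 = 2.364624  # tabulated t value for alpha/2 = 0.025 and 7 df


def t_ratio(b, se, beta_null=0.0):
    """t = (b_i - beta_i) / (s * sqrt(c_ii)); se stands for s * sqrt(c_ii)."""
    return (b - beta_null) / se


def confidence_interval(b, se, t_crit):
    """Return (lower, upper) = b -/+ t_crit * se."""
    half_width = t_crit * se
    return b - half_width, b + half_width


b_income, se_income = 2.175534, 0.3294222   # from the Stata output
t_income = t_ratio(b_income, se_income)
lower, upper = confidence_interval(b_income, se_income, T_CRIT_975_DF7)
print(round(t_income, 2), round(lower, 6), round(upper, 6))
```

The interval excludes zero, which matches the conclusion drawn from the t ratio that income is statistically significant at the 5% level.
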
Econometric Problems

What happens if one or more of these assumptions are violated or not fulfilled?

❖ The estimators may be:
➢ Biased
➢ Inefficient
➢ Accompanied by unreliable standard errors
➢ Inconsistent


Cont…
The basic questions to be addressed for all the assumptions are:

➢ What is the nature of the problem?
➢ What are the consequences of the problem?
➢ How do we detect (diagnose) the problem?
➢ What remedies (prescriptions) are available for the problem?

Cont…

➢ The Zero Mean Assumption, i.e. $E(\varepsilon_i) = 0$

✓ If this assumption is violated, we obtain a biased estimate of the intercept term.
✓ Since the intercept term is usually not of primary interest, this bias can often be tolerated.
✓ The slope coefficients remain unaffected even if assumption one is violated.
✓ The intercept term also often has no physical interpretation.

Cont…

➢ Homoscedasticity Assumption

✓ The error terms in the regression equation have a common variance, i.e., they are homoscedastic.
✓ If they do not have a common variance, they are heteroscedastic.


Cont…

✓ In the homoscedastic case, the spread of the disturbance term around its mean is constant, i.e. $\operatorname{var}(e_i) = \sigma^2$.
✓ In the heteroscedastic case, the variance of the disturbance term changes across observations, for example with the values of an explanatory variable.


What are the causes of Heteroscedasticity?

✓ The problem is more common in cross-sectional data than in time-series data.
✓ Inappropriate or faulty sampling design, or a mix-up of random sampling methods
✓ Heterogeneous observations within a population, or re-grouping of non-overlapping samples.


What are the effects or consequences (symptoms)?

▪ Unbiased but inefficient estimates
✓ OLS estimators are still unbiased
✓ the variance of the parameter estimates increases
✓ standard errors are high
✓ confidence intervals are wider


Cont…

✓ Heteroscedasticity does affect the minimum variance property, so the OLS estimators are inefficient.
✓ Thus the test statistics (t-test and F-test) cannot be relied on in the face of heteroscedasticity.


DIAGNOSES: How do we detect non-Constant Variance or Heteroscedasticity?
➢ Breusch-Pagan (BP) test
✓ One of the most common tests for heteroscedasticity is the Breusch-Pagan (BP) test.
✓ Under the null hypothesis
◼ H0: Constant variance,
compute the χ2 statistic and compare it with the tabulated χ2 value. If the calculated value is greater than the tabulated value, reject H0: heteroscedasticity exists.

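The decision rule can be sketched in Python (an addition to the slides; the function names are illustrative). The sketch assumes a test statistic with 1 degree of freedom, as in the Stata `hettest` output of the expenditure example (chi2(1) = 0.96), for which the upper-tail probability has the closed form erfc(sqrt(x / 2)).

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(chi2(1) > x); valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(x / 2.0))


def bp_decision(chi2_stat, alpha=0.05):
    """Return (p_value, reject_h0) for H0: constant variance."""
    p_value = chi2_sf_1df(chi2_stat)
    return p_value, p_value < alpha


# chi2(1) statistic as printed (rounded) in the Stata output of the example.
p_value, reject = bp_decision(0.96)
print(round(p_value, 3), reject)
```

A large statistic (small p-value) leads to rejecting constant variance; a small statistic, as here, does not. The p-value differs slightly from Stata's in the later decimals because the statistic is entered rounded to two decimals.
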
Example

✓ An organization dealing with family planning wished to examine the relationship between expenditure, income and family size in the Oromia region.
✓ The organization drew a random sample of ten families and obtained the data given in the table below. Determine the regression equation.


Cont…
Family   Expenditure   Income   Family size
1        19            6        3
2        20            7        4
3        14            6        2
4        10            4        4
5        22            7        6
6        23            8        5
7        17            6        3
8        15            4        3
9        7             2        4
10       23            10       3

Cont…

. sum

    Variable |  Obs        Mean    Std. Dev.   Min   Max
-------------+------------------------------------------
 expenditure |   10          17    5.497474      7    23
      income |   10           6    2.260777      2    10
  familysize |   10         3.7    1.159502      2     6


Cont…
. reg expenditure income familysize

      Source |       SS       df       MS          Number of obs =      10
-------------+------------------------------       F(  2,     7) =   23.94
       Model |  237.307999     2  118.653999       Prob > F      =  0.0007
    Residual |  34.6920014     7  4.95600021       R-squared     =  0.8725
-------------+------------------------------       Adj R-squared =  0.8360
       Total |         272     9  30.2222222       Root MSE      =  2.2262

 expenditure |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
-------------+--------------------------------------------------------------
      income |  2.175534   .3294222     6.60   0.000    1.396574    2.954494
  familysize |  .9627217   .6423018     1.50   0.178   -.5560807    2.481524
       _cons |  .3847267   3.041991     0.13   0.903    -6.80844    7.577893
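
The coefficients in the Stata output can be cross-checked by solving the normal equations (X'X)b = X'Y directly on the ten-family data. The following is a minimal sketch in plain Python, which is not used in the original slides; the helper names `transpose`, `matmul` and `solve` are illustrative, and Gaussian elimination stands in for the matrix inverse.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, rhs):
    """Solve the square system a @ x = rhs by Gaussian elimination with pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


# Data from the example table (ten families).
expenditure = [19, 20, 14, 10, 22, 23, 17, 15, 7, 23]
income = [6, 7, 6, 4, 7, 8, 6, 4, 2, 10]
family_size = [3, 4, 2, 4, 6, 5, 3, 3, 4, 3]

# Design matrix with a leading column of ones for the intercept.
X = [[1.0, inc, fam] for inc, fam in zip(income, family_size)]
Xt = transpose(X)
XtX = matmul(Xt, X)
XtY = [sum(x * y for x, y in zip(row, expenditure)) for row in Xt]

b = solve(XtX, XtY)  # order: intercept, income, family size
print([round(v, 6) for v in b])
```

The three values should match the `_cons`, `income` and `familysize` coefficients reported by Stata up to rounding.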


Cont…

. hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of expenditure

         chi2(1)      =     0.96
         Prob > chi2  =   0.3266


Cont…

✓ The test for heteroskedasticity (BP test) implies that there is no problem of heteroskedasticity (non-constant variance),
✓ since the p-value of the chi-square statistic (p = 0.3266 > 0.05) suggests not rejecting the null hypothesis of constant variance.


PRESCRIPTIONS: What are the remedies for heteroskedasticity?

➢ Transform the original data (log, x², or square root) following an acceptable procedure.
➢ Deflate the values by some measure of ‘size’.
➢ Use weighted least squares (WLS) analysis


Multicollinearity

✓ Collinearity means that one of the predictors (independent variables) is an exact or near-exact linear combination of the others, so that $\operatorname{cov}(X_i, X_j) \neq 0$. This is known as the problem of multicollinearity.


CAUSES: What are the causes of multicollinearity?

✓ Use of highly related independent variables
✓ Rounding off sensitive variables
✓ Improper scaling (or choice of measurement unit) of variables
✓ Inclusion of extreme values via errors in data collection
✓ Use of many dummy independent variables
✓ Use of many interaction terms in a model

SYMPTOMS: What are the effects or consequences of multicollinearity?

✓ Inaccurate coefficient estimates/measurements
✓ Incorrect model specification or estimation
✓ High coefficient of determination


DIAGNOSES: How do we detect a multicollinearity problem?

✓ A very high R2 value alongside few individually significant coefficients
✓ OLS estimates might be insignificant, with high standard error values.
✓ Variance inflation factor (VIF)
• If VIF > 10, there is an intolerable problem of multicollinearity

Cont…

. vif

    Variable |       VIF       1/VIF
-------------+----------------------
  familysize |      1.01    0.992814
      income |      1.01    0.992814
-------------+----------------------
    Mean VIF |      1.01

. pwcorr

             | expend~e   income family~e
-------------+---------------------------
 expenditure |   1.0000
      income |   0.9119   1.0000
  familysize |   0.2789   0.0848   1.0000


Cont…

✓ The VIF values for both the family size (familysize) and income variables are far below 10, so there is no multicollinearity problem.
✓ The pairwise correlation matrix also shows very weak collinearity between the two regressors, i.e. no evidence of multicollinearity.
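
With exactly two regressors, the auxiliary R2 used in the VIF equals the squared pairwise correlation between them, so VIF = 1 / (1 − r²). The sketch below (plain Python, an addition to the slides; `pearson_r` is an illustrative helper name) recomputes the income/family-size correlation and the VIF from the example data.

```python
def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Regressors from the ten-family example table.
income = [6, 7, 6, 4, 7, 8, 6, 4, 2, 10]
family_size = [3, 4, 2, 4, 6, 5, 3, 3, 4, 3]

r = pearson_r(income, family_size)
vif = 1.0 / (1.0 - r ** 2)   # two-regressor case only
tolerance = 1.0 / vif        # the "1/VIF" column in Stata
print(round(r, 4), round(vif, 2), round(tolerance, 6))
```

With more than two regressors this shortcut no longer applies; each VIF then requires regressing one predictor on all the others.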


PRESCRIPTIONS: What are the remedies for multicollinearity?

➢ Drop one or some of the highly correlated variables
➢ Scale the variables/adjust the choice of measurement unit.
➢ Center the data set or normalize the data.
✓ For example, by adding or subtracting a constant (such as the mean) to each value.


Normality

✓ The normality assumption is that the disturbance terms are normally distributed.
✓ A violation of the normality assumption is known as the non-normality problem.


CAUSES: What are the causes of non-normality?

➢ Outliers in the data
➢ Incorrect random sampling technique
➢ Incorrect sampling methods, i.e. choosing non-random methods such as convenience sampling, purposive sampling and quota sampling


Cont…

➢ Very small sample size
➢ Mis-recorded observations (too many outliers)
➢ Omission of relevant variables
➢ Missing value problems


SYMPTOMS: What are the effects or consequences of non-normality?

➢ Coefficient estimates remain unbiased, but t- and F-tests are unreliable in small samples
➢ Confidence intervals may be misleading


Cont…

✓ Test the residuals/standardized residuals for normality.
✓ Skewness/kurtosis test for normality of the distribution: sktest resid
✓ Check normality using the previous data on expenditure, income and family size


Cont…

. sktest resid

                    Skewness/Kurtosis tests for Normality
                                                         ------- joint ------
    Variable |  Obs   Pr(Skewness)   Pr(Kurtosis)   adj chi2(2)    Prob>chi2
-------------+---------------------------------------------------------------
       resid |   10      0.9980         0.6411          0.22         0.8970


Cont…

✓ In this case, since p = 0.8970 > 0.05, the departure of the residuals from normality is not statistically significant.
✓ Therefore the Ho of no difference between the theoretical normal distribution and the distribution of the residuals cannot be rejected.
✓ Thus, the normality assumption is not violated.

Cont…

. gen lnexp=ln(exp)

. kdensity lnexp, bwidth(0.2) normal n(10)

[Figure: kernel density estimate of lnexp (x-axis: lnexp, y-axis: Density) overlaid with a normal density; kernel = epanechnikov, bandwidth = 0.2000]


Cont…
✓ The kernel density plot provides a smoother version of a histogram, which can be compared with the normal curve.
✓ The kernel density graph for the (log) expenditure data is fairly smooth and appears to closely match the normal curve.


PRESCRIPTIONS: What are the remedies for non-normality?

➢ Trim or drop the outliers
➢ Smooth or transform the original data


Autocorrelation

What is autocorrelation?

➢ Autocorrelation is the interdependence/correlation between pairs of error terms in a model.
➢ The problem should be eliminated or minimized


CAUSES: What are the causes of autocorrelation?

➢ Omission of an independent variable because of lack of data.
➢ Faulty functional form or misspecification of a model.
➢ Missing values or observations


SYMPTOMS: What are the effects or consequences of autocorrelation?

❖ Unbiased but inefficient parameter estimates with wrong standard errors.
❖ Incorrect statistical tests (for example, confidence intervals of the wrong width), because the estimated variances are biased
❖ Thus, R2, t and F statistics tend to be exaggerated.

DIAGNOSES: How Do We Detect Autocorrelation Problem?

❖ Durbin-Watson (DW) test
❖ The value of DW lies between 0 and 4 inclusive (i.e., $0 \le DW \le 4$).


Cont…

✓ If DW is equal to 2 or in its neighborhood: no evidence of autocorrelation
✓ As DW moves towards 0: positive autocorrelation
✓ As DW moves towards 4: negative autocorrelation
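
The slides do not write out the DW formula; the standard definition is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The following Python sketch (an addition, with made-up residual series that are purely hypothetical and only meant to show the two extremes) illustrates how the statistic moves towards 4 and towards 0.

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residual series, not taken from the expenditure example.
alternating = [1.0, -1.0, 1.0, -1.0]   # sign flips every period
persistent = [2.0, 1.5, 1.0, 0.5]      # slowly drifting, same sign

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
print(dw_alternating, round(dw_persistent, 3))
```

The alternating series gives a value above 2 (towards negative autocorrelation), while the persistent series gives a value close to 0 (towards positive autocorrelation).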


PRESCRIPTIONS: What are the remedies for autocorrelation?

✓ Averaging/extrapolating to estimate the missing values or observations
✓ Differencing the data.
• This induces stationarity by removing the trend and some seasonal components of a time series data set


Endogeneity

What is the endogeneity problem?

➢ Endogeneity is the condition in which any of the independent variables is correlated with the error term, that is, $\operatorname{Cov}(x_i, \varepsilon_i) \neq 0$.
➢ When $\operatorname{Cov}(x_i, \varepsilon_i) = 0$ holds, the variable is exogenous; a departure from this exogeneity assumption is known as endogeneity.

CAUSES: What are the causes of endogeneity?

➢ Endogeneity occurs where irrelevant variables or lagged dependent variable(s) are introduced as independent variable(s) in a model.


SYMPTOMS: What are the effects or consequences of endogeneity?

➢ This leads to high standard errors and inefficient parameter estimates.


DIAGNOSES: How do we detect endogeneity?

➢ The most widely applied test statistic is the Hausman test.


PRESCRIPTIONS: What are the remedies for endogeneity?

➢ Omit irrelevant dependent or independent variables
➢ Exclude lagged dependent variable(s) that were introduced as independent variable(s) in a model


Misspecification

What is misspecification or non-specification?

➢ Misspecification is usually a problem that arises from a mismatch between a model and a data set.
➢ It also occurs due to the inclusion of irrelevant variables and the exclusion of relevant variables in a regression equation.

CAUSES: What are the causes of misspecification?

➢ Incorrect functional forms

SYMPTOMS: What are the effects or consequences of misspecification?
➢ Biased estimates
➢ Incorrect statistical estimates


DIAGNOSES: How do we detect misspecification?

➢ Look for outliers
➢ Notice whether non-constant variance (heteroskedasticity) exists
➢ An unusual coefficient of determination value, close to 100%.
➢ Use the Ramsey RESET test


PRESCRIPTIONS: What are the remedies for misspecification?

➢ Re-examine the data set and check whether the right type of model is applied to it.
➢ Recall that the type of data collected determines what type of model is appropriate.
➢ Re-specify the model through the inclusion of relevant variables and the exclusion of irrelevant variables.

Cont…

✓ The following is a test for misspecification on the expenditure data (the two-variable equation estimated previously), using the STATA command:

. ovtest

Ramsey RESET test using powers of the fitted values of expenditure
       Ho:  model has no omitted variables
                  F(3, 4) =      0.63
                  Prob > F =      0.6322

➢ The Ramsey RESET test (Prob > F = 0.6322 > 0.05) indicates no evidence of omitted variables for this particular model with the two variables; therefore, there is no need to improve the specification of the model.


Cont…

➢ Conclusion: interpreting the linear regression equation for the expenditure function
➢ As demonstrated above, the model has passed all the regression diagnostic tests;
➢ we therefore conclude that the model adequately fits the data.


Dummy Regression

✓ Dummy variables are discrete variables taking a value of ‘0’ or ‘1’. They are often called ‘on/off’ variables, being ‘on’ when they are 1.
✓ Dummy variables can be used either as explanatory variables or as the dependent variable.


Cont…

✓ Variables are measured on four scales: nominal, ordinal, interval and ratio.
✓ Regression models do not deal only with ratio scale variables; they can also involve nominal and ordinal scale variables.
➢ The rule in dummy variable regression is that if there are m categories, we need only m − 1 dummy variables.

Cont…

✓ Consider a consumption function estimated over a period that includes both war years and peace years. One approach to this problem would simply be to estimate two separate consumption functions and obtain two consumption equations.
✓ Suppose instead that we hypothesize that wartime controls do not alter the marginal propensity to consume out of disposable income, but simply reduce the average propensity to consume.


Cont…

✓ By this we mean that the slope remains the same, whereas the constant term becomes smaller in the wartime case.
✓ With this assumption, the consumption function becomes

$$C_t = b_0 + b_1 Y_{dt} + b_2 D_t + u_t, \quad t = 1, 2, \ldots, n$$


Cont…

✓ Where $D_t = 0$ during peacetime years and $D_t = 1$ for war years
✓ The equation above says that during peacetime, when $D_t = 0$, we have

$$C_t = b_0 + b_1 Y_{dt} + u_t$$

✓ which in a period of war ($D_t = 1$) becomes

$$C_t = (b_0 + b_2) + b_1 Y_{dt} + u_t$$


Cont…

✓ Suppose the time period under consideration has both war and peace periods.
✓ Using the data, we could estimate the values of the coefficients in the equation with our standard multiple regression procedure.


Cont…
✓ Suppose that we in fact did this and obtained the equation

$$\hat{C}_t = 40 + 0.9\,Y_{dt} - 30\,D_t$$

✓ Let us say that the t-ratio corresponding to $D_t$ was of sufficient size to suggest that the parameter $b_2$ is not zero.
✓ We would then conclude that the war had a significant negative effect on consumption expenditures. The estimated consumption function would be


Cont…

✓ $\hat{C}_t = 40 + 0.9\,Y_{dt}$, for years of peace
✓ $\hat{C}_t = 10 + 0.9\,Y_{dt}$, for war years
✓ If consumption expenditures are measured in billions of dollars, a comparison of the above two equations suggests that, for corresponding levels of income, consumption expenditures were 30 billion dollars less during years of war.
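
The intercept shift can be seen by evaluating the estimated equation, C-hat = 40 + 0.9·Yd − 30·D, with the dummy switched off and on. In this small Python sketch (an addition to the slides) the function name and the income level of 100 are hypothetical choices for illustration; only the three coefficients come from the text.

```python
def predicted_consumption(disposable_income, war):
    """Evaluate C_hat = 40 + 0.9 * Yd - 30 * D, with D = 1 in war years, else 0."""
    d = 1 if war else 0
    return 40.0 + 0.9 * disposable_income - 30.0 * d


income_level = 100.0  # hypothetical disposable income, in billions of dollars
peace_consumption = predicted_consumption(income_level, war=False)
war_consumption = predicted_consumption(income_level, war=True)
gap = peace_consumption - war_consumption
print(peace_consumption, war_consumption, gap)
```

The gap equals the absolute value of the dummy coefficient at every income level, because the dummy changes only the intercept (from 40 to 10) and leaves the slope of 0.9 untouched.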


Non-linear Regression

✓ What type of models do we use for qualitative observations?
✓ Several methods have been developed to analyze data using regression models with a dichotomous (binary) or multi-category dependent variable.
✓ The most common ones are the Linear Probability Model (LPM), Probit (or Normit), Logit, Tobit, etc.


Cont…

There are several situation in which the outcome


variable we want to explain can take only two
possible values.

So the researchers are interested to model the


choice of an individual by using binary choice
models.

5 March 2023 Aemro T.


Cont…

Consumer economics: whether a consumer makes a


purchase or not.
Labor economics: whether an individual participates
in the labor market or not.
Agricultural economics: whether or not a farmer
adopts or uses organic practices,
marketing/production contracts, etc.



Binary Choice Models

Binary choice models are the foundation from which more complex models for ordinal, nominal, and count outcomes can be derived.
The decision/choice is whether or not to have, do, use, or adopt.
The dependent variable is a binary response: it takes on two values, 0 and 1.

Logit/Probit Regression

A logit or probit model treats the outcome as a realization of a binomial process, assigning probabilities to the occurrence or non-occurrence of an event.

Its dependent variable is a dichotomous observation.

The attraction of the logit/probit model is that it is built specifically to capture effects on a categorical dependent variable.
Logit

For the logit model, F(X'β) is the cdf of the logistic distribution.

The predicted probabilities are limited between 0 and 1.

Cont…

These probabilities indicate how often something happens (y = 1) or does not happen (y = 0).
Then the probability of the event happening is given by
$P_i = \dfrac{e^{z}}{1 + e^{z}}$

From the probability rule, the probability of the event not happening is
$1 - P_i = 1 - \dfrac{e^{z}}{1 + e^{z}} = \dfrac{1}{1 + e^{z}}$
Cont…

Where
$z = \beta_0 + \beta_1 X$
We can write the following:
$\dfrac{P_i}{1 - P_i} = \dfrac{e^{z} / (1 + e^{z})}{1 / (1 + e^{z})} = e^{z}$

Now Pi/(1 − Pi) is simply the odds ratio in favor of the event happening, i.e. the ratio of the probability that the event will happen to the probability that it will not happen.
Cont…

Now if we take the natural log of the above equation, we obtain:
$L_i = \ln\left(\dfrac{P_i}{1 - P_i}\right) = \ln\left(e^{z_i}\right) = z_i = \beta_0 + \beta_1 X_i$

That is, L, the log of the odds ratio, is not only linear in X, but also (from the estimation viewpoint) linear in the parameters.

L is called the logit. The probability Pi itself, however, is not linear in X.


Probit Model

For the probit model, F(X'β) is the cdf of the standard normal distribution.

The predicted probabilities are limited between 0 and 1.
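
To see how the two cdfs compare, the sketch below evaluates both at a few values of z using only the Python standard library. The function names and the grid of z values are illustrative assumptions.

```python
import math


def logistic_cdf(z: float) -> float:
    """Logit link: cdf of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z: float) -> float:
    """Probit link: cdf of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for z in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"z={z:+.1f}  logit={logistic_cdf(z):.4f}  probit={normal_cdf(z):.4f}")
```

Both functions stay strictly between 0 and 1 and equal 0.5 at z = 0. The logistic cdf approaches 0 and 1 more slowly, which is one reason logit and probit coefficients are on different scales.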

Probit/Logit

[Figure: cumulative distribution functions of the probit and logit models, P(Y) plotted against z for z from -10 to 10.]
Interpretation of Coefficients

An increase in x increases or decreases the likelihood that y = 1, depending on the sign of its coefficient: a positive coefficient makes the outcome of 1 more likely, a negative coefficient makes it less likely.

We interpret the sign of the coefficient but not the magnitude.

The magnitude cannot be interpreted using the coefficient because different models have different scales of coefficients.
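
One common way to get an interpretable magnitude is the marginal effect, which for the logit model is dP/dx = β1 · P · (1 − P). The sketch below evaluates it at several values of x. The coefficients and function names are assumptions for illustration only.

```python
import math


def logit_probability(x: float, beta0: float, beta1: float) -> float:
    """P(y = 1 | x) under the logit model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def logit_marginal_effect(x: float, beta0: float, beta1: float) -> float:
    """Derivative of P(y = 1 | x) with respect to x: beta1 * P * (1 - P)."""
    p = logit_probability(x, beta0, beta1)
    return beta1 * p * (1.0 - p)


beta0, beta1 = -1.0, 0.5  # assumed illustrative coefficients

for x in (-2.0, 2.0, 6.0):
    effect = logit_marginal_effect(x, beta0, beta1)
    print(f"x={x:+.1f}  marginal effect={effect:.4f}")
```

The marginal effect always has the same sign as β1, but its size changes with x, which is why the raw coefficient alone does not give the magnitude of the effect on the probability.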

Choice between the Logit and Probit Model
The choice depends on the data generating process, which is unknown.

The models produce almost identical results (different coefficients but similar marginal effects). The choice is up to the researcher.

If we reverse the categories 0 and 1, the signs of the coefficients are reversed (positive becomes negative and vice versa) but the magnitudes are the same.

Tobit

❖ This model is called Tobit because it was first proposed by Tobin (1958).
❖ The model is used when we have all observations of the explanatory variables but the continuous dependent variable is "limited" in the sense that we observe it only if it is above or below some cut-off level.
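
A small simulation can illustrate what a "limited" dependent variable looks like. The sketch below assumes censoring from below at a cut-off of zero, which is a common textbook case but is not specified on the slide. The coefficients, sample size, and variable names are likewise hypothetical.

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

beta0, beta1 = -2.0, 1.0  # assumed illustrative coefficients
cutoff = 0.0              # assumed cut-off level (censoring from below)

rows = []
for _ in range(10):
    x = random.uniform(0.0, 4.0)                         # explanatory variable, always observed
    latent = beta0 + beta1 * x + random.gauss(0.0, 1.0)  # underlying continuous variable
    observed = max(cutoff, latent)                       # recorded value is limited at the cut-off
    rows.append((x, latent, observed))

censored = sum(1 for _, latent, _ in rows if latent <= cutoff)
print("censored observations:", censored, "out of", len(rows))
for x, latent, observed in rows:
    print(f"x={x:.2f}  latent={latent:+.2f}  observed={observed:.2f}")
```

Every x is recorded, but the dependent variable equals the cut-off whenever the latent value falls at or below it. Running OLS on the observed values would ignore this pile-up, which is the situation the Tobit model is meant to handle.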



Introduction to Time Series

➢ Time series data is data collected for a single entity at multiple points in time.

➢ Time-series analysis: the statistical analysis of a sample of time-ordered, periodic observations.

➢ Example: annual data on the GDP (gross domestic product) and PCE (personal consumption expenditure) of a country.
Cont…

➢ A time series is a collection of data yt (t = 1, 2, …, T), with the interval between yt and yt+1 being fixed and constant.

➢ We can think of a time series as being generated by a stochastic process, or the data generating process (DGP).

➢ A time series (sample) is a particular realization of the DGP (population).

➢ Time series analysis is the estimation of difference equations containing stochastic (error) terms.
Cont…

➢ Regression analysis based on time series data implicitly assumes that the underlying time series are stationary.

➢ In practice, most economic time series are nonstationary.

➢ There are various potential difficulties in the statistical analysis of time-series data that can invalidate the empirical results.
Category of Time Series

➢ Time series can broadly be categorized into two:

 ◼ Univariate time series: concerned with the time series properties of a single series
   ◼ E.g. yt = β0 + β1 yt-1 + εt

 ◼ Multivariate time series: concerned with the time series properties of more than one series
   ◼ E.g. yt = β0 + β1 yt-1 + β2 xt + … + βi xt-i + εt
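
The univariate example yt = β0 + β1 yt-1 + εt can be simulated directly. The sketch below generates one realization, assuming normally distributed errors and the illustrative values β0 = 1 and β1 = 0.8. The function name and all parameter values are assumptions made for the example.

```python
import random


def simulate_ar1(beta0: float, beta1: float, n: int, sigma: float = 1.0, seed: int = 0) -> list:
    """Simulate n observations of y_t = beta0 + beta1 * y_{t-1} + e_t, with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = beta0 / (1.0 - beta1)  # start at the long-run mean (requires |beta1| < 1)
    series = []
    for _ in range(n):
        y = beta0 + beta1 * y + rng.gauss(0.0, sigma)
        series.append(y)
    return series


series = simulate_ar1(beta0=1.0, beta1=0.8, n=5000)
sample_mean = sum(series) / len(series)
print("sample mean:", round(sample_mean, 2))
print("long-run mean beta0 / (1 - beta1):", 1.0 / (1.0 - 0.8))
```

Each run with a different seed gives a different realization of the same data generating process. With |β1| < 1 the sample mean settles near β0 / (1 − β1).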



Stationarity and Weakly Dependent Time Series
➢ Stationarity is an important property that must hold before we can estimate a time-series model.
➢ Stationarity: a time series yt is strongly stationary if its joint probability density function does not depend on time, i.e. the pdf of (ys, ys+1, ys+2, …, ys+t) does not depend on the starting point s.
✓ A stationary time series process is one whose probability distributions are stable over time.
➢ Weak stationarity: a series is weakly stationary if its first and second moments do not depend on t.
Cont…

➢ Time series data often have time-dependent moments (e.g. mean, variance).
➢ A stochastic process (y1, …, yt, …, yt+n) is weakly stationary if:
✓ E(yt) = μ does not depend on t (constant mean)
✓ V(yt) = σ² does not depend on t (constant variance)
✓ Cov(yt, yt-s) = γs depends only on s, the distance (gap) between the two periods, and not on t.
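
As a worked illustration of these three conditions, consider the first-order autoregressive process $y_t = \beta_0 + \beta_1 y_{t-1} + \varepsilon_t$. The sketch below assumes $|\beta_1| < 1$ and white-noise errors with variance $\sigma_\varepsilon^2$ (notation introduced here for the example).

```latex
\begin{align*}
E(y_t) &= \frac{\beta_0}{1 - \beta_1}
  && \text{(constant mean)} \\
V(y_t) &= \frac{\sigma_\varepsilon^2}{1 - \beta_1^2}
  && \text{(constant variance)} \\
\operatorname{Cov}(y_t, y_{t-s}) &= \beta_1^{\,s}\,\frac{\sigma_\varepsilon^2}{1 - \beta_1^2}
  && \text{(depends on the gap } s \text{ only)}
\end{align*}
```

None of the three expressions involves t, so the process is weakly stationary. When $\beta_1 = 1$ (a random walk) the variance formula breaks down, since the variance then grows with t.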



Cont…
Weakly Dependent Time Series
➢ Stationarity has to do with the joint distributions of a process as it moves through time.
➢ A very different concept is that of weak dependence, which places restrictions on how strongly related the random variables xt and xt+h can be as the time distance between them, h, gets large.
Cont…

➢ The mean or variance of many time series increases over time.
➢ This is a property of time series data called nonstationarity.
➢ If two independent, nonstationary series are regressed on each other, the chances of finding a spurious relationship are very high.
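
The spurious regression problem can be demonstrated by simulation. The sketch below regresses one simulated series on another, independent one many times and counts how often the slope looks "significant" (|t| > 2). It compares independent random walks (nonstationary) with independent white-noise series (stationary). The sample size, number of replications, and function names are assumptions for illustration.

```python
import random


def ols_slope_t(x: list, y: list) -> float:
    """t-ratio of the slope in a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = (rss / (n - 2) / sxx) ** 0.5
    return b1 / se_b1


def random_walk(n: int, rng: random.Random) -> list:
    """Nonstationary series: y_t = y_{t-1} + e_t."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n: int, rng: random.Random) -> list:
    """Stationary series: y_t = e_t."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rejection_rate(make_series, reps: int = 200, n: int = 100, seed: int = 0) -> float:
    """Share of replications in which two independent series give |t| > 2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x, y = make_series(n, rng), make_series(n, rng)
        if abs(ols_slope_t(x, y)) > 2.0:
            hits += 1
    return hits / reps


rw_rate = rejection_rate(random_walk)
noise_rate = rejection_rate(white_noise)
print("share 'significant', random walks:", rw_rate)
print("share 'significant', white noise:", noise_rate)
```

Since the two series in each pair are independent by construction, every "significant" slope is spurious. The share should be far higher for the random walks than for the white-noise series.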



Great Love for you!!!

Good Luck!
