
CHAPTER 4

ECONOMETRIC ANALYSIS OF TIME SERIES DATA


4.1 Definition and Some Concepts
Time series analysis is the analysis of data organized across units of time.
• The data consist of observations on one or more variables collected at
different time periods.
Conceptual reasons to consider time series models are:
• (1) classic regression models assume that all causation is instantaneous,
which is clearly suspect, and
• (2) behaviors are dynamic: they evolve over time.
Why time series?
• Many economists believe that time series models are theoretically and
fundamentally more important than cross-sectional models, and that the models
we are really interested in are those that help us understand how systems
change across time.
8/3/2021 For MSc students by Urgaia R.(Ph.D.), OSU 1
• A researcher who uses time series data faces two problems that do not
arise with cross-sectional data:
i. One time series variable can influence another with a time lag; and
ii. If the variables are non-stationary, spurious regression may arise.
• Every variable in the regression equation should therefore be tested for a
unit root, and non-stationary variables should be transformed into stationary
ones before running the regression.
• The value of the dependent variable at a given point in time can depend not
only on the value of the explanatory variable in that period, but also on its
values in the past.
• In time series data, the effect of some explanatory variables on the
dependent variable may take time to appear.
• A time series analysis may be either univariate (description of a single
variable) or multivariate (causal explanation).



Stationary and Nonstationary Time Series (cont.)
To get a better understanding of these issues, consider the case where
Yt is generated by an equation that includes only past values of itself (an
autoregressive equation):
Yt = γYt–1 + vt
where vt is a classical error term and its expected value = 0.
If |γ| < 1, then the expected value of Yt will eventually approach 0, so
Yt is stationary as the sample size gets bigger and bigger.
Similarly, if |γ| > 1, then the expected value of Yt will continuously
increase in absolute value, making Yt nonstationary.
This is nonstationarity due to a trend, but it can still cause spurious
regression results.
If you run a regression in which the dependent variable and one or more
independent variables are spuriously correlated, the result is a spurious
regression, and the t-scores and overall fit of such regressions are
likely to be overstated and untrustworthy.

Most importantly, if |γ| = 1, then:
Yt = Yt–1 + vt
This is a random walk, in which the expected value of Yt does not
converge on any value, so Yt is nonstationary. In this case Yt is said to
have a unit root.
If a variable has a unit root, then the above equation holds, and
the variable follows a random walk and is nonstationary.
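
The three cases of γ can be illustrated with a small simulation. This is a minimal sketch, assuming Y0 = 0 and standard normal errors vt (neither is specified in the slides); the function names are illustrative. With these assumptions Var(Yt) is the sum of γ^(2j) for j = 0, ..., t–1, which settles down when |γ| < 1 and grows without bound when |γ| ≥ 1.

```python
import random


def simulate_ar1(gamma, n, seed=0):
    """Simulate Y_t = gamma * Y_{t-1} + v_t, starting from Y_0 = 0, with v_t ~ N(0, 1)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(gamma * y[-1] + rng.gauss(0.0, 1.0))
    return y


def variance_of_y(gamma, t):
    """Var(Y_t) given Y_0 = 0 and Var(v_t) = 1: the sum of gamma^(2j), j = 0..t-1."""
    return sum(gamma ** (2 * j) for j in range(t))


if __name__ == "__main__":
    for gamma in (0.5, 1.0, 1.05):
        path = simulate_ar1(gamma, 200)
        print(f"gamma={gamma}: last value {path[-1]:.2f}, "
              f"Var(Y_200) = {variance_of_y(gamma, 200):.2f}")
```

For γ = 1 the variance equals t, so it keeps growing with the sample, which is one way to see why a random walk is nonstationary.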

• An application of the unit root tests also showed that there is no unit
root in the first differences of Y. That is, the first difference of the Y
series is stationary.
• If a time series becomes stationary after we take its first differences,
we call it a difference stationary (stochastic) process (DSP).
• If a time series becomes stationary when we detrend it in the manner
suggested, it is called a trend stationary (stochastic) process (TSP).
• Note that a process with a deterministic trend is non-stationary but is
not a unit root process.
• If a time series is a DSP but we treat it as a TSP, this is called
under-differencing.
• On the other hand, if a time series is a TSP but we treat it as a DSP, this
is called over-differencing.
Standard sequence of steps for dealing with nonstationary time series
1. Specify the model (with lags vs. no lags, etc.)
2. Test all variables for nonstationarity (technically, for unit roots) using
the appropriate version of the Dickey–Fuller (DF) or augmented Dickey–Fuller
(ADF) test
3. If the variables don't have unit roots, estimate the equation in its original
units of Y and X
4. If the variables have unit roots, test the residuals of the equation for
cointegration
5. If the variables have unit roots but are not cointegrated, then change the
functional form of the model to first differences, such as ∆X and ∆Y, and
estimate the equation



4.2.1 The Dickey–Fuller Test for Unit Root
From the previous discussion of stationarity and unit roots, it makes sense to
estimate:
Yt = γYt–1 + vt
and then determine if |γ| < 1 to see if Y is stationary. This is almost exactly
how the Dickey-Fuller test works:
1. Subtract Yt–1 from both sides of the equation yielding:
(Yt – Yt–1) = (γ – 1)Yt–1 + vt
Defining ΔYt = Yt – Yt–1, we obtain the simplest form of the
Dickey–Fuller test:
ΔYt = β1Yt–1 + vt ,
where β1 = γ – 1.

2. Set up the test hypotheses:


H0: β1 = 0 (unit root)
HA: β1 < 0 (stationary)

Note: alternative versions of the Dickey–Fuller test additionally include a
constant, or a constant and a trend term.

3. Set up the decision rule:
If the estimate of β1 is statistically significantly less than 0, then we can
reject the null hypothesis of nonstationarity.
If it is not statistically significantly less than 0, then we cannot reject the
null hypothesis of nonstationarity.
The DF test can be performed in three different forms:
Random walk: ΔYt = β1Yt–1 + vt
Random walk with drift: ΔYt = β0 + β1Yt–1 + vt
Random walk with drift around a deterministic trend:
ΔYt = β0 + β1t + β2Yt–1 + vt
(in this last form the unit root hypothesis concerns β2, the coefficient on Yt–1).
Note that the standard t-table does not apply to Dickey–Fuller tests; the test
statistic must be compared with Dickey–Fuller critical values.
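
The simplest (no constant, no trend) form of the test can be sketched by hand. This is a minimal illustration, not a replacement for a statistics package: the function name is hypothetical, the simulated series assume standard normal errors, and the t-ratio it prints must be compared with Dickey–Fuller critical values (for this form the commonly tabulated large-sample 5% value is about –1.95), not with the standard t-table.

```python
import math
import random


def dickey_fuller_t(y):
    """Estimate dY_t = b1 * Y_{t-1} + v_t by OLS (no constant); return (b1, t-ratio)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    sum_sq = sum(v * v for v in lagged)
    b1 = sum(d * v for d, v in zip(diffs, lagged)) / sum_sq
    ssr = sum((d - b1 * v) ** 2 for d, v in zip(diffs, lagged))
    s2 = ssr / (len(diffs) - 1)          # one estimated coefficient
    se = math.sqrt(s2 / sum_sq)
    return b1, b1 / se


if __name__ == "__main__":
    rng = random.Random(42)
    walk, ar = [0.0], [0.0]
    for _ in range(500):
        shock = rng.gauss(0.0, 1.0)
        walk.append(walk[-1] + shock)        # gamma = 1: unit root
        ar.append(0.5 * ar[-1] + shock)      # gamma = 0.5: stationary
    for name, series in (("random walk", walk), ("AR(1), gamma = 0.5", ar)):
        b1, t = dickey_fuller_t(series)
        print(f"{name}: b1 = {b1:.3f}, t = {t:.2f}")
```

A strongly negative t-ratio leads to rejecting H0 (unit root); a t-ratio close to zero does not.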

Integrated time series
• If a time series becomes stationary after differencing it once, it is said to
be integrated of order one, denoted I(1).
• If it has to be differenced twice (the difference of the difference) to make
it stationary, it is said to be integrated of order two, denoted I(2).
• If it has to be differenced d times to make it stationary, it is said to be
integrated of order d, denoted I(d).
• Therefore the terms "stationary time series" and "time series integrated of
order zero", I(0), mean the same thing.
• An I(1) series is said to have a stochastic trend. As a result, the
autocorrelations in a correlogram of an I(0) series decline to zero very
rapidly as the lag increases, whereas for an I(1) series they decline to zero
very slowly.
• Most non-stationary economic time series do not need to be
differenced more than once or twice.
• To sum up, a non-stationary time series that becomes stationary after
differencing is known as an integrated time series or a series with a
stochastic trend.
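
The link between integration and differencing can be shown mechanically. In this minimal sketch the shock values are hypothetical and the helper names are illustrative: summing shocks once builds a series that one difference undoes, and summing twice builds a series that needs two differences.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def cumulate(shocks):
    """Build a series by cumulatively summing shocks (starting from zero)."""
    total, out = 0, []
    for s in shocks:
        total += s
        out.append(total)
    return out


shocks = [1, -2, 3, 0, -1, 2]          # hypothetical shock values
i1_series = cumulate(shocks)           # shocks summed once
i2_series = cumulate(i1_series)        # shocks summed twice

print(difference(i1_series, 1))        # the shocks from the second observation on
print(difference(i2_series, 2))        # the shocks from the third observation on
```

Each round of differencing loses one observation, which is why the recovered shocks start one period later per difference.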
4.2.3 Cointegration Test for Long-run Relationship
If the Dickey–Fuller test reveals non-stationarity, what should we do?
The traditional approach has been to take first differences,
ΔYt = Yt – Yt–1 and ΔXt = Xt – Xt–1, and use them in place of Yt and Xt in the
regressions.
Issue: first-differencing basically "throws away information" about the
possible equilibrium relationships between the variables.
Alternatively, one might want to test whether the time series are
cointegrated, which means that although the individual variables are
non-stationary, a linear combination of them is stationary.
As Granger notes, "a test for cointegration can be thought of as a pre-test to
avoid 'spurious regression' situations".
In the context of testing for cointegration, the DF and ADF tests are known
as the Engle–Granger (EG) and augmented Engle–Granger (AEG) tests, which
are now incorporated in several software packages.
The main difference between unit root tests and cointegration tests is that
tests for unit roots are performed on single time series, whereas cointegration
deals with the relationship among a group of variables, each having a unit
root.

 To see how this works, consider the following equation:

Assume that both Yt and Xt have a unit root and solving equation for ut, we
get:

• In the above equation ut is a function of two nonstationary variables, so ut
might also be expected to be nonstationary.
• Cointegration refers to the case where it is not: Yt and Xt are both
non-stationary, yet a linear combination of them, as given by the above
equation, is stationary.
• This could happen if economic theory supports the equation as an
equilibrium relationship.
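
The two steps of the Engle–Granger idea can be sketched as follows: regress Yt on Xt, then run a Dickey–Fuller regression on the residuals. This is a minimal sketch under stated assumptions: the linear form Y = a + bX + u, the parameter values 2.0 and 0.5, the normal errors and the function names are all illustrative. The resulting t-ratio has its own critical values, which are more negative than the ordinary Dickey–Fuller ones (roughly –3.3 at the 5% level is often quoted for two variables).

```python
import math
import random


def ols_line(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, resid


def df_t_ratio(e):
    """t-ratio on b1 in de_t = b1 * e_{t-1} + error (no constant)."""
    lagged = e[:-1]
    diffs = [b - a for a, b in zip(e, e[1:])]
    sum_sq = sum(v * v for v in lagged)
    b1 = sum(d * v for d, v in zip(diffs, lagged)) / sum_sq
    ssr = sum((d - b1 * v) ** 2 for d, v in zip(diffs, lagged))
    return b1 / math.sqrt(ssr / (len(diffs) - 1) / sum_sq)


if __name__ == "__main__":
    rng = random.Random(1)
    x = [0.0]
    for _ in range(499):
        x.append(x[-1] + rng.gauss(0.0, 1.0))                # X_t is a random walk
    y = [2.0 + 0.5 * xt + rng.gauss(0.0, 1.0) for xt in x]   # stationary gap to X_t
    a, b, resid = ols_line(x, y)
    print(f"a = {a:.2f}, b = {b:.2f}, EG t-ratio = {df_t_ratio(resid):.2f}")
```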



• This is because the standard linear procedures assume that the time
series involved in the analysis are stationary.


• For example, the variables can be integrated of the same order with a
stationary, I(0), combination (in this case we use the ECM, which incorporates
both short-run and long-run dynamics, and we apply OLS to estimate the
short-run coefficients and the error correction term, which is expected to be
negative and significant), or they can be of different orders (in that case we
use the ARDL model to estimate the coefficients).
• If the error correction term is zero, there is no disequilibrium between
the variables and the long-run relationship is given by the
cointegrating relationship.



4.2.4

• Autoregressive Integrated Moving Average (ARIMA) is a class of
models that explains a given time series based on its own past values and
lagged forecast errors, so that it can be used to forecast future values.
• ARIMA models are distinct from ARMA models: the I stands for
integrated.
• An integrated autoregressive process is one with a characteristic root on the
unit circle.
• Typically researchers difference the variable as necessary and then build an
ARMA model on the differenced variable.
• An ARMA(p,q) model in the variable differenced d times is equivalent to
an ARIMA(p,d,q) model on the original data.



 ARIMA is a highly refined curve-fitting device that uses current and past
values of the dependent variable to produce often accurate short-term
forecasts of that variable
◦ Examples of such forecasts are stock market price predictions
created by brokerage analysts called “chartists” or “technicians”
based entirely on past patterns of movement of the stock prices
 The use of ARIMA is appropriate when:
◦ little or nothing is known about the dependent variable being
forecasted,
◦ the independent variables known to be important cannot be
forecasted effectively
◦ all that is needed is a one or two-period forecast



 The ARIMA approach combines two different specifications called
processes into one equation:
1. An autoregressive process (AR):
 expresses a dependent variable as a function of past values of the
dependent variable
 This is similar to the serial correlation error term function and the
dynamic model
2.A moving average process (MA):
 expresses a dependent variable as a function of past values of the
error term
 Such a function is a moving average of past error term observations
that can be added to the mean of Y to obtain a moving average of
past values of Y



An autoregressive model of order p, AR(p), can be expressed as
$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + u_t$$
• Or, using the lag operator notation $L y_t = y_{t-1}$ and $L^i y_t = y_{t-i}$:
$$y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + u_t$$
or
$$y_t = \mu + \sum_{i=1}^{p} \phi_i L^i y_t + u_t$$
or
$$\phi(L) y_t = \mu + u_t, \quad \text{where } \phi(L) = 1 - (\phi_1 L + \phi_2 L^2 + \dots + \phi_p L^p)$$
• The condition for stationarity of a general AR(p) model is that the roots of
the following equation all lie outside the unit circle:
$$1 - \phi_1 z - \phi_2 z^2 - \dots - \phi_p z^p = 0$$
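
For p ≤ 2 the root condition can be checked directly with the quadratic formula. This is a minimal sketch restricted to AR(1) and AR(2); the function names and the coefficient values are illustrative, and a general AR(p) would need a numerical polynomial root finder instead.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of 1 - phi1*z - phi2*z**2 = 0 (an AR(1) is the case phi2 == 0)."""
    if phi2 == 0:
        return [] if phi1 == 0 else [1 / phi1]
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    return [(phi1 + disc) / (-2 * phi2), (phi1 - disc) / (-2 * phi2)]


def is_stationary(phi1, phi2=0.0, tol=1e-12):
    """True if every root lies strictly outside the unit circle."""
    return all(abs(z) > 1 + tol for z in ar2_roots(phi1, phi2))


print(is_stationary(0.5))          # AR(1) with phi1 = 0.5: root z = 2
print(is_stationary(1.0))          # random walk: root z = 1
print(is_stationary(0.5, 0.3))     # AR(2) with both roots outside the circle
print(is_stationary(0.5, 0.5))     # phi1 + phi2 = 1: one root at z = 1
```

The random walk fails the check because its single root lies exactly on the unit circle, which is the origin of the term "unit root".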
ARIMA Models (cont.)
To create an ARIMA model, we begin with an econometric equation with
no independent variables:
Yt = β0 + εt
and then add to it both the autoregressive and moving-average processes:
Yt = β0 + θ1Yt–1 + θ2Yt–2 + ... + θpYt–p + εt + ϕ1εt–1 + ϕ2εt–2 + ... + ϕqεt–q
where the θs and the ϕs are the coefficients of the autoregressive and
moving-average processes, respectively, and p and q are the number of
past values used of Y and ε, respectively.

• Before this equation can be applied to a time series, however, it must
be ensured that the time series is stationary.
• For example, a non-stationary series can often be converted into a
stationary one by taking the first difference:
Y*t = ΔYt = Yt – Yt–1
• If the first differences do not produce a stationary series, then first
differences of this first-differenced series can be taken, i.e. a
second-difference transformation:
Y**t = ΔY*t = Y*t – Y*t–1

 If a forecast of Y* or Y** is made, then it must be converted back into Y
terms
 For example, if d = 1 (where d is the number of differences taken to make
Y stationary), then:

• This conversion process is similar to integration in mathematics, so the "I"
in ARIMA stands for "integrated".
• ARIMA thus stands for Auto-Regressive Integrated Moving Average.
◦ An ARIMA model with p, d, and q specified is usually denoted ARIMA(p,d,q),
with the specific integers chosen inserted for p, d, and q.
◦ If the original series is stationary and d therefore equals 0, this is
sometimes shortened to ARMA.
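
Converting forecasts of first differences back into levels amounts to adding each forecast change to the previous level, starting from the last observed value. The sketch below covers the d = 1 case only; the function name and the numbers are hypothetical.

```python
def undifference(last_level, diff_forecasts):
    """Turn forecasts of first differences into forecasts of levels (d = 1)."""
    levels, level = [], last_level
    for change in diff_forecasts:
        level = level + change
        levels.append(level)
    return levels


# Hypothetical numbers: the last observed Y is 100, forecast changes are +2, -1, +3.
print(undifference(100, [2, -1, 3]))   # [102, 101, 104]
```

For d = 2 the same step would be applied twice: first to recover the forecast first differences, then to recover the levels.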



4.2.5



• Because all estimated ARCH coefficients are expected to be positive and
the mean and the variance equations are estimated simultaneously, we use the
maximum likelihood approach.
• ARCH is applicable to the volatility of asset prices and related series such
as stock prices, interest rates, foreign exchange rates and inflation rates,
where we observe autocorrelated heteroscedasticity over different periods.
• ARCH models have been widely used in financial time series analysis,
particularly in analysing the risk of holding an asset, evaluating the price
of an option, forecasting time-varying confidence intervals and obtaining
more efficient estimators in the presence of heteroscedasticity.

ML estimation of the ARCH model of the dollar/euro exchange rate
Dependent Variable: Return
Method: ML-ARCH (Marquardt), Normal distribution
Sample (adjusted): 2 2355; included observations: 2354 after adjustments
Presample variance: (Parameter=7)
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*RESID(-2)^2 + … + C(10)*RESID(-8)^2

Variable        Coefficient   Std. Error   Z-Statistic   Prob.

CONSTANT           0.000168     0.000116      1.4557     0.1454

Variance Equation
CONSTANT          -7.183338     1.015788     -7.071691   0.0000
RESID(-1)^2       -3.074875     0.364616     -8.433184   0.7846
RESID(-2)^2       -1.565313     0.509188     -3.074139   0.3989
RESID(-3)^2        1.095976     0.506078      2.165626   0.0678
RESID(-4)^2        1.370301     0.065904     20.79231    0.0000
RESID(-5)^2        0.166607     0.016048     10.38205    0.0000
RESID(-6)^2        1.095976     0.506078      2.165626   0.0678
RESID(-7)^2        1.370301     0.065904     20.79231    0.0000
RESID(-8)^2        0.166607     0.016048     10.38205    0.0000



• The ML estimates of the ARCH(8) model are given in the above table. The first
part of the table gives the estimate of the mean equation and the second
part gives the estimates of the coefficients of the variance equation.
• The coefficients on the lagged squared residuals are expected to be positive;
in the table the first three are not individually statistically
significant, while most of the later ones are.
• It seems there is an ARCH effect in the dollar/euro exchange rate return.
That is, the error variances are autocorrelated. This information can be
used for the purpose of forecasting volatilities.

Some drawbacks of the ARCH(p) model are:
• First, it requires estimating the coefficients of p autoregressive terms,
which can consume several degrees of freedom;
• Secondly, it is often difficult to interpret all the coefficients, especially
if some of them are negative.
GARCH(1,1) model of the dollar/euro exchange rate
Dependent Variable: Z
Method: ML-ARCH (Marquardt), Normal distribution
Sample (adjusted): 2 2355; included observations: 2354 after adjustments
Presample variance: (Parameter=7)
GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1)

Variable        Coefficient   Std. Error   Z-Statistic   Prob.

CONSTANT           0.000198     0.000110      1.7977     0.0722

Variance Equation
CONSTANT           0.183338     0.015788      0.071691   0.1240
RESID(-1)^2        3.074875     0.364616      8.433184   0.0000
GARCH(-1)          1.565313     0.509188      3.074139   0.0000

R-squared             0.323339    Mean dependent var      12.36585
Adjusted R-squared    0.320702    S.D. dependent var      7.896350
S.E. of regression    6.508137    Akaike info criterion   6.588627
Sum squared resid     54342.54    Schwarz criterion       6.612653
Log likelihood       -4240.370    Hannan-Quinn crit.      6.652653
Durbin-Watson stat    1.897513
Note: Z = first difference of the log of the exchange rate



Returning to the exchange rate in our example, the results of the GARCH(1,1)
model can be compared with those of the ARCH(8) model. Because GARCH(1,1) is a
shortcut for an infinite-order ARCH process, it in effect captures the eight
lagged squared error terms of the earlier ARCH table.
• As the results of the GARCH(1,1) model in this table show, in the variance
equation both the lagged squared error term and the lagged conditional variance
term are individually highly significant.
• Since the lagged conditional variance affects the current conditional
variance, there is clear evidence that the ARCH effect is pronounced.
• To sum up, there is clear evidence that the dollar/euro exchange rate returns
exhibit considerable time-varying and time-correlated volatility, whether we
use the ARCH or the GARCH model.
• The ARCH model can be further extended to the GARCH-in-mean model (GARCH-M),
threshold GARCH (TGARCH) and exponential GARCH (EGARCH), each introducing more
versatility or flexibility, and more complexity, in the estimation of volatility.
• The GARCH-M model, for instance, reflects the idea that an average investor
is interested not only in maximizing the return on his or her investment, but
also in minimizing the risk associated with such investment, so the conditional
variance enters the mean equation.
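
The GARCH(1,1) variance equation, written in the output above as GARCH = C(2) + C(3)*RESID(-1)^2 + C(4)*GARCH(-1), is a simple recursion. The sketch below only illustrates that recursion: the names omega, alpha and beta, the starting variance h0, and all numerical values are hypothetical and are not the estimates from the table.

```python
def garch11_variance(resid, omega, alpha, beta, h0):
    """Conditional variances h_t = omega + alpha * u_{t-1}^2 + beta * h_{t-1}."""
    h = [h0]
    for u in resid[:-1]:
        h.append(omega + alpha * u * u + beta * h[-1])
    return h


def long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta), valid if alpha + beta < 1."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite variance")
    return omega / (1 - alpha - beta)


# Hypothetical residuals and parameters, chosen only for illustration.
u = [0.5, -2.0, 0.1, 0.3, -0.2]
h = garch11_variance(u, omega=0.1, alpha=0.1, beta=0.8, h0=1.0)
for t, (ut, ht) in enumerate(zip(u, h)):
    print(t, ut, round(ht, 4))
```

The large shock at t = 1 raises the conditional variance at t = 2, and the lagged variance term makes that increase die out only gradually, which is the volatility clustering the model is meant to capture.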

GARCH(1,1) model of the dollar/euro exchange rate
Dependent Variable: RET
Method: ML-ARCH (Marquardt), Normal distribution
Sample (adjusted): 2 2355; included observations: 2354 after adjustments
Presample variance: (Parameter=7)
GARCH = C(3) + C(4)*RESID(-1)^2 + C(5)*GARCH(-1)

Variable        Coefficient   Std. Error   Z-Statistic   Prob.

GARCH             -0.1887       0.0959       -1.1993     0.0490
CONSTANT           0.0783       0.03158       2.4798     0.0131

Variance Equation
CONSTANT           0.163638     0.035788      0.091691   0.1048
RESID(-1)^2        4.074875     0.564616      6.433184   0.0000
GARCH(-1)          2.565313     0.309188      4.074139   0.0000

R-squared             0.323339    Mean dependent var      12.36585
Adjusted R-squared    0.320702    S.D. dependent var      7.896350
S.E. of regression    6.508137    Akaike info criterion   6.588627
Sum squared resid     54342.54    Schwarz criterion       6.612653
Log likelihood       -4240.370    Hannan-Quinn crit.      6.652653
Durbin-Watson stat    1.897513
Note: Z = first difference of the log of the exchange rate
• A model that incorporates dynamic effects, where the dependent
variable depends on its own lags as well as on current and lagged values of
the explanatory variables, is known as an
autoregressive distributed lag (ARDL) model.
• The ceteris paribus interpretation of OLS estimation results
can still be used in ARDL models.
• Another interpretation concept, the multiplier, is commonly
used in interpreting ARDL regression results, with the focus on the long-run
(or total) multiplier.
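
The long-run (total) multiplier can be computed from the estimated coefficients as the sum of the coefficients on current and lagged x divided by one minus the sum of the coefficients on lagged y. The sketch below uses a hypothetical ARDL(1,1) with made-up coefficients and also traces the period-by-period response to a permanent one-unit rise in x, which converges to that multiplier.

```python
def long_run_multiplier(ar_coefs, x_coefs):
    """Total multiplier: sum of the x coefficients over (1 - sum of the AR coefficients)."""
    return sum(x_coefs) / (1 - sum(ar_coefs))


def response_path(ar_coefs, x_coefs, periods):
    """Change in y after x rises permanently by one unit at period 0."""
    path = []
    for t in range(periods):
        own = sum(c * path[t - 1 - i] for i, c in enumerate(ar_coefs) if t - 1 - i >= 0)
        direct = sum(c for j, c in enumerate(x_coefs) if j <= t)
        path.append(own + direct)
    return path


# Hypothetical ARDL(1,1): y_t = a + 0.5*y_{t-1} + 0.2*x_t + 0.1*x_{t-1} + e_t
path = response_path([0.5], [0.2, 0.1], 30)
print(path[0], path[1], path[-1], long_run_multiplier([0.5], [0.2, 0.1]))
```

The first element of the path is the impact multiplier (the coefficient on current x); later elements accumulate the lagged effects.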



4.2.7

Normalized Cointegrating Coefficients: 1 Cointegrating Equation(s)
Consumption GDP CONSTANT
1.00000 -0.3470 95.2861
(0.0068)

Some caveats about cointegration testing using the Engle–Granger approach:
1. If we have more than two variables, there might be more than one
cointegrating relationship. However, the Engle–Granger two-step
procedure does not allow for the estimation of more than one cointegrating
regression.
2. The results of the EG test can depend on the order in which the variables
enter the cointegrating regression, and it is difficult to decide which
variable is the regressand and which one is the regressor.
3. With multivariate time series the EG methodology breaks down further: we
not only have to find more than one cointegrating relationship, but also have
to deal with an error correction term for each cointegrating relationship.
Therefore, to handle all these problems, we use the Johansen methodology
within the framework of the Vector Error Correction Model (VECM) for
multivariate time series regression.
• As an example, the demand and supply equations for a good, taken from the
economics literature, are given as:
Qdt = α + βPt + γSt + ut (4.1)
Qst = λ + μPt + κTt + vt (4.2)
where
Qdt = quantity of the good demanded, Qst = quantity of the good supplied,
Pt = price of the good, St = price of a substitute good and Tt = some variable
embodying the state of technology.
• Equations (4.1) and (4.2) are examples of structural equations: a
demand and a supply equation, respectively.
• Structural equations characterize the underlying economic theory
behind each endogenous variable by expressing it in terms of both
endogenous and exogenous variables.
• The term "predetermined variables" covers exogenous and lagged
endogenous variables, which are determined outside the system of specified
equations or prior to the current period.
Assuming that the market always clears, and dropping the time subscripts
for simplicity, we obtain the following equations, which are called the
simultaneous structural form of the model:

Q = α + βP + γS + u (4.3)
Q = λ + μP + κT + v (4.4)

• The point is that price and quantity are determined simultaneously:
price affects quantity and quantity affects price.
• P and Q are endogenous variables, while S and T are exogenous.
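
Setting demand equal to supply and solving for the two endogenous variables gives P and Q as functions of the exogenous variables and the errors only. The sketch below does this algebra numerically. The parameter names (alpha, beta, gamma for demand; lam, mu, kappa for supply) and all numerical values are assumptions made for illustration.

```python
def equilibrium(alpha, beta, gamma, lam, mu, kappa, S, T, u=0.0, v=0.0):
    """Solve Q = alpha + beta*P + gamma*S + u and Q = lam + mu*P + kappa*T + v for (P, Q)."""
    if beta == mu:
        raise ValueError("demand and supply slopes must differ")
    P = (lam - alpha + kappa * T - gamma * S + v - u) / (beta - mu)
    Q = alpha + beta * P + gamma * S + u
    return P, Q


# Hypothetical parameter values, chosen only to illustrate the algebra.
P, Q = equilibrium(alpha=10, beta=-1, gamma=0.5, lam=2, mu=1, kappa=0.5, S=2, T=4)
print(P, Q)                      # 3.5 7.5
print(2 + 1 * P + 0.5 * 4)       # the supply equation gives the same Q
```

Because the solution for P contains both u and v, P is correlated with the error term of each structural equation, which is the source of simultaneity bias when OLS is applied to those equations directly.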

• Reduced form equations indicate that the endogenous variables are
correlated with the exogenous regressors.
• In the reduced form equations, the endogenous variables are expressed
in terms of the exogenous and lagged variables.
• We can estimate the reduced form equations using OLS since all the RHS
variables are exogenous, and sometimes we can retrieve the original
coefficients from the π's (the reduced form coefficients).
• But we probably don't care what the values of the π coefficients are;
what we want are the original parameters of the structural
equations: α, β, γ, λ, μ, κ.
• There are at least three reasons for using reduced-form equations:
1. Since the reduced-form equations have no inherent simultaneity, they do
not violate the classical assumption that the explanatory variables are
uncorrelated with the error term.
– Therefore, they can be estimated with OLS without encountering the
problems discussed in this chapter.
2. The interpretation of the reduced-form coefficients as impact multipliers
means that they have economic meaning and useful applications of their
own.
3. Reduced-form equations play a crucial role in Two-Stage Least Squares,
the estimation technique most frequently used for simultaneous
equations.
4.3.1 Identification Problem in Simultaneous Systems Model
 Identification is a precondition for the application of 2SLS to equations
in simultaneous systems
 A structural equation is identified only when enough of the system’s
predetermined variables are omitted from the equation in question to
allow that equation to be distinguished from all the others in the
system
◦ Note that one equation in a simultaneous system might be identified
and another might not
 Most simultaneous systems are fairly complicated, so econometricians
need a general method by which to determine whether equations are
identified

The Order Condition of Identification
The order condition is a systematic method of determining whether a particular
equation in a simultaneous system has the potential to be identified.
If an equation meets the order condition, then it is almost always
identified.
We thus say that the order condition is a necessary but not sufficient
condition for identification.
A necessary condition for an equation to be identified is that the number of
predetermined (exogenous plus lagged endogenous) variables in the system
be greater than or equal to the number of slope coefficients in the equation
of interest.
Or, in equation form, a structural equation meets the order condition if:
# predetermined variables (in the simultaneous system) ≥ # slope coefficients
(in the equation).
Solution:
G = 3, so the number of excluded variables is compared with G – 1 = 2:
1. If # excluded variables = 2, as in eqn (4.10), the equation is just (exactly)
identified and thus we can get unique structural form coefficient estimates.

2. If # excluded variables > 2, as in eqn (4.11), the equation is over-identified
and more than one set of structural coefficients could be obtained from the
reduced form.

3. If # excluded variables < 2, as in eqn (4.9), the equation is not identified
and thus we cannot get the structural coefficients from the reduced form
estimates.
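
The counting rule used in this solution can be written as a small function: count the variables of the system that are excluded from an equation and compare the count with G – 1, where G is the number of equations. The three-equation system below is hypothetical and serves only to exercise the three possible outcomes; the function name is illustrative.

```python
def classify_equation(all_variables, included, n_equations):
    """Order condition: compare the number of excluded variables with G - 1."""
    excluded = len(set(all_variables) - set(included))
    needed = n_equations - 1
    if excluded == needed:
        return "just identified"
    if excluded > needed:
        return "over-identified"
    return "not identified"


# Hypothetical three-equation system in Y1, Y2, Y3 with exogenous X1, X2.
variables = ["Y1", "Y2", "Y3", "X1", "X2"]
system = {
    "eq A": ["Y1", "Y2", "Y3", "X1", "X2"],   # excludes nothing
    "eq B": ["Y2", "Y3", "X1"],               # excludes Y1, X2
    "eq C": ["Y3", "Y2"],                     # excludes Y1, X1, X2
}
for name, included in system.items():
    print(name, classify_equation(variables, included, n_equations=3))
```

Since the order condition is necessary but not sufficient, a result of "just identified" or "over-identified" here only says that the equation has the potential to be identified.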



4.3.2 Tests for Exogeneity or check for endogeneity problem
In an econometric model, an endogeneity problem exists when
an explanatory variable is correlated with the error term.
• Ignoring simultaneity in the estimation leads to biased estimates, as it
violates the exogeneity assumption of the Gauss–Markov theorem.
Besides simultaneity, the endogeneity problem can arise when an
unobserved or omitted variable confounds both the independent and
dependent variables, when independent variables are measured with error, or
in an autoregression with autocorrelated errors.
The problem of endogeneity is, unfortunately, often ignored by researchers
conducting non-experimental research, and doing so undermines any policy
recommendations based on the results.
Instrumental variable (IV) techniques are commonly used to address this
problem.

Y1 = π10 + π11X1 + π12X2 + v1
Y2 = π20 + π21X1 + v2
Y3 = π30 + π31X1 + v3

2. Run the regression corresponding to equation (4.9).


3. Run the regression (4.9) again, but now also including the fitted
values as additional regressors:
(4.15)



4.3.3 Estimation of Recursive Systems or of the Reduced Form of the System:
Indirect Least Squares (ILS)
Consider the following system of equations:



• Equation (4.18): Contains both Y1 and Y2; we require these to be
uncorrelated with u3. By similar arguments to the above, equations
(4.16) and (4.17) do not contain Y3, so we can use OLS on (4.18).
• This is known as a RECURSIVE or TRIANGULAR system. We do not
have a simultaneity problem here. But in practice not many systems of
equations will be recursive.
• Even if we cannot use OLS on the structural equations, we can validly apply
it to the reduced form equations.
• If the system is just identified, ILS involves estimating the reduced
form equations using OLS, and then using the estimates to substitute back and
obtain the structural parameters.
• However, ILS is not used much because:
1. Solving back to get the structural parameters can be tedious.
2. Most simultaneous equations systems are over-identified.



Estimation of Systems using Two-Stage Least Squares
• Two-Stage Least Squares (2SLS) helps mitigate simultaneity bias in
simultaneous equation systems.
• We can use this technique for both just-identified and over-identified
systems.
• 2SLS requires a variable that is:
1. a good proxy for the endogenous variable
2. uncorrelated with the error term
• Such a variable is called an instrumental variable.
• Two-stage least squares (2SLS or TSLS) is done in two stages:
Stage 1: Obtain the reduced form equations, estimate them using OLS, and save
the fitted values for the dependent variables.
Stage 2: Estimate the structural equations using OLS, but replace any RHS
endogenous variables with their stage 1 fitted values.
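
The two stages can be sketched in a few lines of NumPy. The function name `two_stage_least_squares`, the simulated data, and the parameter values are assumptions made for illustration. Note that the sketch returns coefficient estimates only: standard errors taken from a plain second-stage OLS would not be valid and need the usual 2SLS correction.

```python
import numpy as np


def ols_fit(y, X):
    """Return OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, endog, exog, instruments):
    """
    y           : (n,)   dependent variable of the structural equation
    endog       : (n, m) RHS endogenous variables
    exog        : (n, k) included exogenous variables (with the constant)
    instruments : (n, q) exogenous variables excluded from this equation
    Returns coefficients ordered as [endog..., exog...].
    """
    Z = np.column_stack([exog, instruments])   # all exogenous variables
    # Stage 1: reduced form regressions, keep the fitted values.
    endog_hat = Z @ ols_fit(endog, Z)
    # Stage 2: structural equation with fitted values in place of endog.
    X_hat = np.column_stack([endog_hat, exog])
    return ols_fit(y, X_hat)


# Hypothetical data: y2 is endogenous because it shares `common` with u1.
rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=(n, 2))                    # two instruments
common = rng.normal(size=n)
y2 = 1.0 + z @ np.array([1.0, -0.5]) + common + rng.normal(size=n)
u1 = common + rng.normal(size=n)
y1 = 2.0 + 0.8 * y2 + u1                       # true slope is 0.8

const = np.ones((n, 1))
beta_2sls = two_stage_least_squares(y1, y2[:, None], const, z)
beta_ols = ols_fit(y1, np.column_stack([y2, const]))
print("2SLS slope:", round(beta_2sls[0], 3), " OLS slope:", round(beta_ols[0], 3))
```

With these simulated data the OLS slope is pulled away from 0.8 by the correlation between y2 and u1, while the 2SLS slope stays close to it.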
Y1 ,Y2 ,Y3

Y1   0  1Y2   3Y3   4 X 1  5 X 2  u1
Y2  0  1Y3  2 X 1  u2
Y3   0   1Y2  u3

Y2 Y3 Y2


Y3
• It is still of concern in the context of simultaneous systems whether the
CLRM assumptions are supported by the data.
• If the disturbances in the structural equations are autocorrelated, the
2SLS estimator is not even consistent.
• The standard error estimates also need to be modified compared with
their OLS counterparts, but once this has been done, we can use the
usual t- and F-tests to test hypotheses about the structural form
coefficients.



The Properties of Two-Stage Least Squares
1. 2SLS estimates are still biased in small samples
◦ But consistent in large samples (get closer to true βs as N increases)
2. Bias in 2SLS for small samples typically is of the opposite sign of the bias
in OLS
3. If the fit of the reduced-form equation is poor, then 2SLS will not rid the
equation of bias even in a large sample
4. 2SLS estimates have increased variances and standard errors relative to
OLS
Note, however, that Two-Stage Least Squares cannot be applied to an equation
unless that equation is identified.
We therefore now turn to the issue of identification.



Instrumental Variables
Recall that the reason we cannot use OLS directly on the structural
equations is that the endogenous variables are correlated with the errors.
• One solution to this would be not to use Y2 or Y3 , but rather to use
some other variables instead.
• We want these other variables to be highly correlated with Y2 and Y3,
but not correlated with the errors - they are called INSTRUMENTS.
• Say we found suitable instruments for Y2 and Y3, z2 and z3 respectively.
We do not use the instruments directly, but run regressions of the
form:

$$Y_2 = \lambda_1 + \lambda_2 z_2 + \varepsilon_1 \qquad (4.22)$$
$$Y_3 = \lambda_3 + \lambda_4 z_3 + \varepsilon_2 \qquad (4.23)$$



• Obtain the fitted values $\hat{Y}_2$ and $\hat{Y}_3$ from the preceding regressions, and replace
$Y_2$ and $Y_3$ with these in the structural equation.
• We do not use the instruments directly in the structural equation.
• It is typical to use more than one instrument per endogenous variable.
• If the instruments are the variables in the reduced form equations, then IV is
equivalent to 2SLS.
What Happens if We Use IV / 2SLS Unnecessarily?
• The coefficient estimates will still be consistent, but will be inefficient
compared to those that just used OLS directly.
The Problem With IV
• It is difficult to find suitable instruments.
Solution: 2SLS is easier, because it uses the variables in the reduced form
equations as instruments.
Other Estimation Techniques
1. Three-Stage Least Squares (3SLS): allows for non-zero covariances between
the error terms of the equations.
2. Limited Information Maximum Likelihood (LIML): estimates the reduced
form equations by maximum likelihood.
3. Full Information Maximum Likelihood (FIML): estimates all the equations
simultaneously using maximum likelihood.
4.4 Vector Autoregressive Models
• Vector autoregressive (VAR) models are a natural generalisation of
univariate autoregressive models, popularised by Sims.
• A VAR is in a sense a systems regression model, i.e. there is more than
one dependent variable.
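• The simplest case is a bivariate VAR with k lags:

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \dots + \beta_{1k} y_{1t-k} + \alpha_{11} y_{2t-1} + \dots + \alpha_{1k} y_{2t-k} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \dots + \beta_{2k} y_{2t-k} + \alpha_{21} y_{1t-1} + \dots + \alpha_{2k} y_{1t-k} + u_{2t}$$

where $u_{it}$ is an iid disturbance term with $E(u_{it}) = 0$, $i = 1, 2$, and $E(u_{1t} u_{2t}) = 0$.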


• The analysis could be extended to a VAR with g variables and g
equations.



Notation and Concepts
• One important feature of VARs is the compactness with which we can
write the notation.
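• For example, consider the case from above where k = 1. We can write
this as

$$\underset{g \times 1}{y_t} = \underset{g \times 1}{\beta_0} + \underset{g \times g}{\beta_1}\,\underset{g \times 1}{y_{t-1}} + \underset{g \times 1}{u_t}$$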
• This model can be extended to the case where there are k lags of each
variable in each equation:
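$$\underset{g \times 1}{y_t} = \underset{g \times 1}{\beta_0} + \underset{g \times g}{\beta_1}\,\underset{g \times 1}{y_{t-1}} + \underset{g \times g}{\beta_2}\,\underset{g \times 1}{y_{t-2}} + \dots + \underset{g \times g}{\beta_k}\,\underset{g \times 1}{y_{t-k}} + \underset{g \times 1}{u_t}$$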
• We can also extend this to the case where the model includes first
difference terms and cointegrating relationships (a VECM).
Primitive versus Standard Form of VARs
• When the equations contain contemporaneous terms, we can take these over
to the LHS and write
$$B y_t = \beta_0 + \beta_1 y_{t-1} + u_t$$
• We can then pre-multiply both sides by $B^{-1}$ to give:
$$y_t = B^{-1}\beta_0 + B^{-1}\beta_1 y_{t-1} + B^{-1}u_t$$
or
$$y_t = A_0 + A_1 y_{t-1} + e_t$$
• This is known as a standard form VAR, which we can estimate using OLS.
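
As a minimal sketch of that last point, a standard form VAR(1) can be estimated by regressing each variable on a constant and the first lag of all variables. The helper `fit_var1` and the simulated coefficient values are hypothetical. Because every equation has the same regressors, a single least-squares call handles all equations at once.

```python
import numpy as np


def fit_var1(y):
    """
    Equation-by-equation OLS for y_t = A0 + A1 y_{t-1} + e_t.
    y : (T, g) array of observations.
    Returns A0 (g,), A1 (g, g) and the residuals (T-1, g).
    """
    Y = y[1:]                                          # y_t
    X = np.column_stack([np.ones(len(Y)), y[:-1]])     # [1, y_{t-1}]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)       # column j = equation j
    A0 = coef[0]
    A1 = coef[1:].T
    resid = Y - X @ coef
    return A0, A1, resid


# Simulate a stable bivariate VAR(1) with illustrative coefficients.
rng = np.random.default_rng(2)
A0_true = np.array([0.5, -0.2])
A1_true = np.array([[0.5, 0.1],
                    [0.2, 0.3]])
T = 5000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A0_true + A1_true @ y[t - 1] + rng.normal(size=2)

A0_hat, A1_hat, resid = fit_var1(y)
print(np.round(A1_hat, 2))
```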
Vector Autoregressive Models Compared with Structural Equations
Models
• Advantages of VAR Modelling
- We do not need to specify which variables are endogenous or
exogenous - all are endogenous in VAR Modelling
- It allows the value of a variable to depend on more than just its
own lags or combinations of white noise terms, so more general
than ARMA modelling
- Provided that there are no contemporaneous terms on the
right hand side of the equations, we can simply use OLS
separately on each equation
- Forecasts from VARs are often better than those from “traditional
structural” models.



• Problems with VARs
- VARs are a-theoretical (as are ARMA models)
- How do you decide the appropriate lag length?
- So many parameters! For example, if we have g equations for g
variables and we have k lags of each of the variables in each equation, we
have to estimate $(g + kg^2)$ parameters, e.g. g = 3, k = 3 gives 3 + 3(9) = 30 parameters
Choosing the Optimal Lag Length for a VAR
Two possible approaches for choosing the optimal lag length for a VAR:
cross-equation restrictions and information criteria.
Cross-Equation Restrictions:
• In the spirit of unrestricted VAR modelling, each equation should have
the same lag length.



• Suppose that a bivariate VAR(8) estimated using quarterly data has 8 lags
of the two variables in each equation, and we want to examine a restriction
that the coefficients on lags 5 through 8 are jointly zero. This can be done
using a likelihood ratio test.
• The likelihood ratio test for this joint hypothesis is given by

$$LR = T\left[\ln|\hat{\Sigma}_r| - \ln|\hat{\Sigma}_u|\right]$$

where $\hat{\Sigma}_r$ is the variance-covariance matrix of the residuals for the restricted
model with 4 lags, $\hat{\Sigma}_u$ is the variance-covariance matrix of residuals for the
unrestricted VAR with 8 lags, and T is the sample size.



• The test statistic is asymptotically distributed as a $\chi^2$ with degrees of
freedom equal to the total number of restrictions. In the VAR case above,
we are restricting 4 lags of two variables in each of the two equations, a
total of 4 × 2 × 2 = 16 restrictions.
• In the general case where we have a VAR with p equations, and we want
to impose the restriction that the last q lags have zero coefficients, there
would be $p^2 q$ restrictions altogether.
• Disadvantages: Conducting the LR test is cumbersome and requires a
normality assumption for the disturbances.
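
The mechanics of the test can be sketched as follows, assuming the statistic takes the usual form T[ln|Σ̂r| − ln|Σ̂u|]. The helper names and the simulated data are hypothetical, and the sketch uses the effective sample (after dropping the initial lags) for T and for both models so that they are estimated on the same observations.

```python
import numpy as np


def var_resid_cov(y, p, max_lag):
    """Residual covariance of an OLS-estimated VAR(p); the sample starts at max_lag."""
    T = y.shape[0]
    Y = y[max_lag:]
    lags = [y[max_lag - j: T - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(T - max_lag)] + lags)
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return resid.T @ resid / len(Y)


def lr_lag_test(y, p_restricted, p_unrestricted):
    """LR statistic and degrees of freedom for dropping the last lags of a VAR."""
    g = y.shape[1]
    T_eff = y.shape[0] - p_unrestricted
    sigma_r = var_resid_cov(y, p_restricted, p_unrestricted)
    sigma_u = var_resid_cov(y, p_unrestricted, p_unrestricted)
    _, logdet_r = np.linalg.slogdet(sigma_r)
    _, logdet_u = np.linalg.slogdet(sigma_u)
    lr = T_eff * (logdet_r - logdet_u)
    df = g * g * (p_unrestricted - p_restricted)   # p^2 q restrictions
    return lr, df


# Illustrative data: a bivariate VAR(1), so lags 5 to 8 are truly irrelevant.
rng = np.random.default_rng(4)
T = 400
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A1 @ y[t - 1] + rng.normal(size=2)

lr, df = lr_lag_test(y, p_restricted=4, p_unrestricted=8)
lr01, df01 = lr_lag_test(y, p_restricted=0, p_unrestricted=1)
print("LR(4 vs 8) =", round(lr, 2), "with", df, "restrictions")
```

The statistic is then compared with a chi-squared critical value for `df` degrees of freedom (about 26.3 at the 5% level for 16 restrictions).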



Information Criteria for VAR Lag Length Selection
• Multivariate versions of the information criteria are required. These can
be defined as:

where MAIC stands for the multivariate Akaike information criterion, all notation
is as above, and $k'$ is the total number of regressors in all equations, which
will be equal to $g^2 k + g$ for g equations, each with k lags of the g variables,
plus a constant term in each equation. The values of the information
criteria are constructed for 0, 1, … lags (up to some pre-specified
maximum).
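
A minimal sketch of the calculation, assuming the commonly used definitions MAIC = ln|Σ̂| + 2k'/T and MSBIC = ln|Σ̂| + (k'/T) ln T, where Σ̂ is the residual variance-covariance matrix and k' = g²k + g. The function name and the example inputs are hypothetical; in practice Σ̂ would come from the VAR estimated at each candidate lag length, and the lag length with the smallest criterion value would be chosen.

```python
import math

import numpy as np


def multivariate_ic(sigma_hat, g, k, T):
    """Return (MAIC, MSBIC) for a VAR with g equations and k lags."""
    k_total = g * g * k + g                  # all regressors, constants included
    _, logdet = np.linalg.slogdet(sigma_hat)
    maic = logdet + 2.0 * k_total / T
    msbic = logdet + (k_total / T) * math.log(T)
    return maic, msbic


# Illustrative input: identity residual covariance, so ln|Sigma| = 0 and
# only the penalty terms remain.
maic, msbic = multivariate_ic(np.eye(2), g=2, k=1, T=100)
print(maic, msbic)
```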
Does the VAR Include Contemporaneous Terms?

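• So far, we have assumed the VAR is of the form

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + u_{2t}$$

• But what if the equations had a contemporaneous feedback term?

$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + \alpha_{12} y_{2t} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + \alpha_{22} y_{1t} + u_{2t}$$

• We can write this as

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \alpha_{11} \\ \alpha_{21} & \beta_{21} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \alpha_{12} & 0 \\ 0 & \alpha_{22} \end{pmatrix} \begin{pmatrix} y_{2t} \\ y_{1t} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

• This VAR is in primitive form.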



4.5 Granger Causality
Granger causality, or precedence, is a circumstance in which one time
series variable consistently and predictably changes before another
variable.
A word of caution: even if one variable precedes (“Granger causes”)
another, this does not mean that the first variable “causes” the other to
change.
There are several tests for Granger causality. They all involve distributed
lag models in one form or another; here we discuss an expanded version of a
test originally developed by Granger.
• Granger suggested that to see if A Granger-caused Y, we should run:
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \dots + \beta_p Y_{t-p} + \alpha_1 A_{t-1} + \dots + \alpha_p A_{t-p} + \varepsilon_t$$
and test the null hypothesis that the coefficients of the lagged As (the
αs) jointly equal zero.
• If we can reject this null hypothesis using the F-test, then we have
evidence that A Granger-causes Y.
• Applications of this test involve running two Granger tests, one in each
direction.
• That is, run the equation above and also run:
$$A_t = \beta_0 + \beta_1 A_{t-1} + \dots + \beta_p A_{t-p} + \alpha_1 Y_{t-1} + \dots + \alpha_p Y_{t-p} + \varepsilon_t$$
and test the null hypothesis that the coefficients of the lagged Ys (again,
the αs) jointly equal zero.
• If the F-test is significant for the first equation (Y regressed on lagged
As) but not for the second (A regressed on lagged Ys), then we can conclude
that A Granger-causes Y.
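
The two-direction test can be sketched with a restricted and an unrestricted regression for each direction. The function `granger_f` and the simulated series are hypothetical. The sketch returns the F statistic and its degrees of freedom, which would then be compared with a critical value from an F table.

```python
import numpy as np


def rss(y, X):
    """Residual sum of squares from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def granger_f(y, a, p):
    """F statistic for H0: the p lags of `a` have zero coefficients in the y equation."""
    T = len(y)
    dep = y[p:]
    const = np.ones(T - p)
    y_lags = [y[p - j: T - j] for j in range(1, p + 1)]
    a_lags = [a[p - j: T - j] for j in range(1, p + 1)]
    X_r = np.column_stack([const] + y_lags)            # restricted: own lags only
    X_u = np.column_stack([const] + y_lags + a_lags)   # unrestricted: adds lags of a
    rss_r, rss_u = rss(dep, X_r), rss(dep, X_u)
    df_num = p
    df_den = (T - p) - (2 * p + 1)
    F = ((rss_r - rss_u) / df_num) / (rss_u / df_den)
    return F, df_num, df_den


# Illustrative data: lagged a drives y, but lagged y does not drive a.
rng = np.random.default_rng(3)
T = 500
a = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    a[t] = 0.5 * a[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * a[t - 1] + rng.normal()

F_a_to_y, df1, df2 = granger_f(y, a, p=2)   # does A Granger-cause Y?
F_y_to_a, _, _ = granger_f(a, y, p=2)       # does Y Granger-cause A?
print(round(F_a_to_y, 1), round(F_y_to_a, 1))
```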



Block Significance and Causality Tests
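• It is likely that, when a VAR includes many lags of variables, it will be
difficult to see which sets of variables have significant effects on each
dependent variable and which do not. For illustration, consider the
following bivariate VAR(3):

$$\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} \alpha_{10} \\ \alpha_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} \gamma_{11} & \gamma_{12} \\ \gamma_{21} & \gamma_{22} \end{pmatrix} \begin{pmatrix} y_{1t-2} \\ y_{2t-2} \end{pmatrix} + \begin{pmatrix} \delta_{11} & \delta_{12} \\ \delta_{21} & \delta_{22} \end{pmatrix} \begin{pmatrix} y_{1t-3} \\ y_{2t-3} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix}$$

• This VAR could be written out to express the individual equations as

$$y_{1t} = \alpha_{10} + \beta_{11} y_{1t-1} + \beta_{12} y_{2t-1} + \gamma_{11} y_{1t-2} + \gamma_{12} y_{2t-2} + \delta_{11} y_{1t-3} + \delta_{12} y_{2t-3} + u_{1t}$$
$$y_{2t} = \alpha_{20} + \beta_{21} y_{1t-1} + \beta_{22} y_{2t-1} + \gamma_{21} y_{1t-2} + \gamma_{22} y_{2t-2} + \delta_{21} y_{1t-3} + \delta_{22} y_{2t-3} + u_{2t}$$

• We might be interested in testing the following hypotheses, and their
implied restrictions on the parameter matrices:
1. Lags of $y_{1t}$ do not explain current $y_{2t}$: $\beta_{21} = 0$, $\gamma_{21} = 0$ and $\delta_{21} = 0$
2. Lags of $y_{1t}$ do not explain current $y_{1t}$: $\beta_{11} = 0$, $\gamma_{11} = 0$ and $\delta_{11} = 0$
3. Lags of $y_{2t}$ do not explain current $y_{1t}$: $\beta_{12} = 0$, $\gamma_{12} = 0$ and $\delta_{12} = 0$
4. Lags of $y_{2t}$ do not explain current $y_{2t}$: $\beta_{22} = 0$, $\gamma_{22} = 0$ and $\delta_{22} = 0$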



• Each of these four joint hypotheses can be tested within the F-test
framework, since each set of restrictions contains only parameters drawn
from one equation.
• These tests could also be referred to as Granger causality tests.
• Granger causality tests seek to answer questions such as “Do changes in
y1 cause changes in y2?” If y1 causes y2, lags of y1 should be significant in
the equation for y2. If this is the case, we say that y1 “Granger-causes” y2.
• If y2 causes y1, lags of y2 should be significant in the equation for y1.
• If both sets of lags are significant, there is “bi-directional causality”



4.6 Impulse Responses and Variance Decompositions: The
Ordering of the Variables
• For calculating impulse responses and variance decompositions, the
ordering of the variables is important.
• The main reason for this is that above, we assumed that the VAR error
terms were statistically independent of one another.
• This is generally not true, however: the error terms will typically be
correlated to some degree.
• Therefore, the notion of examining the effect of the innovations separately
has little meaning, since they have a common component.
• What is done is to “orthogonalize” the innovations.
• In the bivariate VAR, this problem would be approached by attributing all
of the effect of the common component to the first of the two variables in
the VAR.
• In the general case where there are more variables, the situation is more
complex but the interpretation is the same.
4.6.1 Impulse Responses
• VAR models are often difficult to interpret: one solution is to construct
the impulse responses and variance decompositions.
• Impulse responses trace out the responsiveness of the dependent
variables in the VAR to shocks to the error term. A unit shock is applied
to each variable and its effects are noted.
• Consider for example a simple bivariate VAR(1):
$$y_{1t} = \beta_{10} + \beta_{11} y_{1t-1} + \alpha_{11} y_{2t-1} + u_{1t}$$
$$y_{2t} = \beta_{20} + \beta_{21} y_{2t-1} + \alpha_{21} y_{1t-1} + u_{2t}$$
• A change in $u_{1t}$ will immediately change $y_1$. It will change $y_2$ and also $y_1$
during the next period.
• We can examine for how long and to what degree a shock to a given
equation affects all of the variables in the system.
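
For a VAR(1) written as y_t = A0 + A1 y_{t-1} + u_t, the response at horizon s to a unit shock is given by the s-th power of A1. The sketch below traces these responses; the function name and the coefficient values are hypothetical, and the responses are not orthogonalised, so they ignore any correlation between the error terms.

```python
import numpy as np


def impulse_responses(A1, horizons):
    """
    Responses of a VAR(1) to unit shocks in each error term.
    Element [s][i, j] is the response of variable i, s periods after a
    unit shock to the error of equation j.
    """
    g = A1.shape[0]
    out = np.empty((horizons + 1, g, g))
    out[0] = np.eye(g)                # impact period: each shock moves only its own variable
    for s in range(1, horizons + 1):
        out[s] = A1 @ out[s - 1]      # A1 raised to the power s
    return out


# Illustrative coefficients, arranged as [[beta11, alpha11], [alpha21, beta21]].
A1 = np.array([[0.5, 0.3],
               [0.2, 0.4]])
irf = impulse_responses(A1, horizons=10)

# Column 0 traces a unit shock to u1: y1 moves on impact, y2 only from the
# next period onwards.
print(irf[0][:, 0], irf[1][:, 0], irf[2][:, 0])
```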



4.6.2 Variance Decompositions
• Variance decompositions offer a slightly different method of examining
VAR dynamics.
• They give the proportion of the movements in the dependent variables
that are due to their “own” shocks, versus shocks to the other variables.
• This is done by determining how much of the s-step ahead forecast error
variance for each variable is explained by innovations to each explanatory
variable (s = 1, 2, …).
• The variance decomposition gives information about the relative
importance of each shock to the variables in the VAR.
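
A minimal sketch of the decomposition for a VAR(1), assuming the innovations are orthogonalised with a Cholesky factor of the error covariance matrix (one common choice, in which the variable ordered first absorbs the common component). The function name, coefficient matrix, and covariance values are hypothetical.

```python
import numpy as np


def fevd(A1, sigma, steps):
    """
    Forecast error variance decomposition for a VAR(1).
    Row i gives the shares of variable i's `steps`-ahead forecast error
    variance attributed to each orthogonalised shock.
    """
    g = A1.shape[0]
    P = np.linalg.cholesky(sigma)     # lower triangular: ordering matters
    theta = P.copy()                  # orthogonalised response at horizon 0
    contrib = np.zeros((g, g))
    for _ in range(steps):
        contrib += theta ** 2         # accumulate squared responses
        theta = A1 @ theta            # move to the next horizon
    return contrib / contrib.sum(axis=1, keepdims=True)


A1 = np.array([[0.5, 0.3],
               [0.2, 0.4]])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])        # correlated errors

one_step = fevd(A1, sigma, steps=1)
ten_step = fevd(A1, sigma, steps=10)
print(np.round(one_step, 2))
print(np.round(ten_step, 2))
```

At the one-step horizon the first variable's forecast error variance is attributed entirely to its own shock, which reflects the ordering assumption rather than anything in the data. Reversing the order of the variables would generally change the shares.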



4.7 Forecasting
• In general, forecasting is the act of predicting the future.
• In econometrics, forecasting is the estimation of the expected value of a
dependent variable for observations that are not part of the same data set.
• In most forecasts, the values being predicted are for time periods in the
future, but cross-sectional predictions of values for countries or people
not in the sample are also common.
• To simplify terminology, the words prediction and forecast will be used
interchangeably.
• Econometric forecasting generally uses a single linear equation to predict
or forecast, a process that can be summarized in two steps:



1. Specify and estimate an equation that has as its dependent variable
the item that we wish to forecast:

2. Obtain values for each of the independent variables for the
observations for which we want a forecast and substitute them into the
forecasting equation:
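
The two steps can be sketched for an equation with two independent variables. The data and the out-of-sample values are hypothetical, and the sample is generated without noise so that the estimated coefficients and the forecast are easy to check by hand.

```python
import numpy as np

# Hypothetical sample; Y is generated exactly as 1 + 2*X1 + 0.5*X2.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = 1.0 + 2.0 * X1 + 0.5 * X2

# Step 1: specify and estimate the equation by OLS.
X = np.column_stack([np.ones_like(X1), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Step 2: substitute the values of the independent variables for the
# forecast observation (assumed known here) into the estimated equation.
x_next = np.array([1.0, 7.0, 8.0])    # constant, X1 and X2 for period T+1
y_forecast = float(x_next @ beta_hat)
print(round(y_forecast, 4))
```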



4.7.1 More Complex Forecasting Problems
• The forecasts generated in the previous section are quite simple, however,
and most actual forecasting involves one or more additional questions,
for example:
1. Unknown Xs: It is unrealistic to expect to know the values for the
independent variables outside the sample
2. Serial Correlation: If there is serial correlation involved, the forecasting
equation may be estimated with GLS.
3. Confidence Intervals: All the previous forecasts were single values, but
such single values are almost never exactly right, so it would be more
helpful if we forecasted a confidence interval instead
4. Simultaneous Equations Models: many economic and business
equations are part of simultaneous models



Conditional Forecasting (Unknown X Values for the Forecast Period)
• Unconditional forecast: all values of the independent variables are known
with certainty, but this is rare in practice
• Conditional forecast: actual values of one or more of the independent
variables are not known. This is the more common type of forecast
• The careful selection of independent variables can sometimes help avoid
the need for conditional forecasting
• This opportunity can arise when the dependent variable can be expressed
as a function of leading indicators:
◦ A leading indicator is an independent variable the movements of which
anticipate movements in the dependent variable
◦ The best known leading indicator, the Index of Leading Economic
Indicators, is produced each month



Forecasting with Serially Correlated Error Terms
• When serial correlation is severe, one remedy is to run Generalized
Least Squares (GLS), as noted:

If this equation is estimated, the dependent variable will be:

Thus, if a GLS equation is used for forecasting, it will produce predictions
of $Y^*_{T+1}$ rather than of $Y_{T+1}$. Such predictions thus will be of the wrong
variable!



• If forecasts are to be made with a GLS equation, Equation 9.18 should
first be solved for $Y_t$ before forecasting is attempted:

Next, substitute T+1 for t (to forecast time period T+1) and insert
estimates for the coefficients, ρs and Xs into the equation to get:

• The resulting equation thus should be used for forecasting when an equation
has been estimated with GLS to correct for serial correlation.
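
A minimal sketch of such a forecast, assuming first-order serial correlation and a single independent variable, so that the GLS equation is Y_t − ρY_{t−1} = β0(1 − ρ) + β1(X_t − ρX_{t−1}) + u_t. Solving for Y_t and moving to period T+1 gives the forecast ρ̂Y_T + β̂0(1 − ρ̂) + β̂1(X_{T+1} − ρ̂X_T). The function name and the numerical inputs are hypothetical.

```python
def gls_forecast(rho_hat, beta0_hat, beta1_hat, y_T, x_T, x_next):
    """
    Forecast of Y for period T+1 from a GLS (AR(1)-corrected) equation
    with one independent variable.
    y_T, x_T : last observed values of Y and X
    x_next   : value (or forecast) of X for period T+1
    """
    return (rho_hat * y_T
            + beta0_hat * (1.0 - rho_hat)
            + beta1_hat * (x_next - rho_hat * x_T))


# Illustrative estimates and data.
forecast = gls_forecast(rho_hat=0.5, beta0_hat=2.0, beta1_hat=3.0,
                        y_T=10.0, x_T=2.0, x_next=3.0)
print(forecast)
```

With ρ̂ = 0 the expression collapses to the ordinary forecast β̂0 + β̂1X_{T+1}, which is a quick check on the algebra.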



Forecasting Confidence Intervals
• The techniques we use to test hypotheses can also be adapted to create
forecasting confidence intervals
• Given a point forecast, all we need to generate a confidence interval
around that forecast are $t_c$, the critical t-value (for the desired level of
confidence), and $S_F$, the estimated standard error of the forecast:

$$\text{Confidence interval} = \hat{Y}_{T+1} \pm S_F\, t_c$$

• The critical t-value, $t_c$, can be found in Statistical Table B-1 (for a
two-tailed test with T-K-1 degrees of freedom)



Forecasting Confidence Intervals (cont.)
• Lastly, the standard error of the forecast, $S_F$, for an equation with just one
independent variable, equals the square root of the forecast error
variance:

$$S_F = \sqrt{s^2\left[1 + \frac{1}{T} + \frac{(X_{T+1} - \bar{X})^2}{\sum_{t=1}^{T}(X_t - \bar{X})^2}\right]}$$

where:
$s^2$ = the estimated variance of the error term
T = the number of observations in the sample
$X_{T+1}$ = the forecasted value of the single independent variable
$\bar{X}$ = the arithmetic mean of the observed Xs in the sample
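
The interval calculation for the one-regressor case can be sketched as follows, assuming the usual textbook expression for the standard error of the forecast, with s² computed using T − 2 degrees of freedom. The function name and the data are hypothetical, and the critical t-value is passed in by the user (3.182 is the two-tailed 5% value for 3 degrees of freedom).

```python
import math


def forecast_interval(x, y, x_next, t_crit):
    """Return (lower, point forecast, upper) for a one-regressor OLS forecast."""
    T = len(x)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (T - 2)       # T - K - 1 with K = 1
    s_f = math.sqrt(s2 * (1.0 + 1.0 / T + (x_next - x_bar) ** 2 / sxx))
    y_hat = b0 + b1 * x_next
    return y_hat - t_crit * s_f, y_hat, y_hat + t_crit * s_f


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
lower, y_hat, upper = forecast_interval(x, y, x_next=6.0, t_crit=3.182)
print(round(lower, 3), round(y_hat, 3), round(upper, 3))
```

The interval widens as the forecast value of X moves away from the sample mean, because of the squared distance term in the standard error.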
Forecasting with Simultaneous Equations Systems
• How should forecasting be done in the context of a simultaneous model?
There are two approaches to answering this question, depending on
whether there are lagged endogenous variables on the right-hand side of
any of the equations in the system:



Forecasting with Simultaneous Equations Systems (cont.)
1. No lagged endogenous variables in the system:
• the reduced-form equation for the particular endogenous variable
can be used for forecasting because it represents the simultaneous
solution of the system for the endogenous variable being forecasted
2. Lagged endogenous variables in the system:
• then the approach must be altered to take into account the dynamic
interaction caused by the lagged endogenous variables
• For simple models, this sometimes can be done by substituting
for the lagged endogenous variables where they appear in the
reduced-form equations
• If such a manipulation is difficult, however, then a technique called
simulation analysis can be used
