CHAPTER 3
Introduction to Basic Regression
Analysis with Time Series Data
Outline
3.1 The nature of Time Series Data
3.2 Stationary and non-stationary stochastic Processes
3.3 Trend Stationary and Difference Stationary
Stochastic Processes
3.4 Integrated Stochastic Process
3.5 Tests of Stationarity: The Unit Root Test
3.1 The nature of Time Series Data
Time series data are observations on a variable collected chronologically, at regular intervals of time.
Autocorrelation between consecutive values of a time series violates the classical linear regression assumption of no autocorrelation (serially uncorrelated errors).
A sequence of random variables indexed by time is
called a stochastic process or a time series process.
(“Stochastic” is a synonym for random).
If we let Y represent GDP, for our data we have Y1, Y2,
Y3, ..., Y42, Y43, where the subscript 1 denotes the first
observation (i.e., GDP for the year 1981).
Keep in mind that each of these Y’s is a random variable.
[Figure: GDP of Ethiopia (in USD), annual observations, 1981 to 2022]
3.2 Stationary and non-stationary
stochastic Processes
What is stationarity?
Stationary time series data refers to data in which the
statistical properties remain constant over time.
The mean and variance are time invariant: they do not change over time (a mean-reverting process).
A stationary time series always reverts to the long-
term mean.
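More formally, a time-series variable, $Y_t$, is stationary if:
(1) $E(Y_t) = \mu$ (constant mean),
(2) $\operatorname{Var}(Y_t) = E(Y_t - \mu)^2 = \sigma^2$ (constant variance),
(3) $\operatorname{Cov}(Y_t, Y_{t+k}) = E[(Y_t - \mu)(Y_{t+k} - \mu)] = \gamma_k$, which depends only on $k$ and not on $t$, where $k$ is a lag length.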
The auto-covariances are not a particularly useful measure
• since the values of the auto-covariances depend on the units of measurement of $Y_t$, and hence the values that they take have no immediate interpretation.
• It is thus more convenient to use the autocorrelations, which are the auto-covariances normalized by dividing by the variance:
$$\rho_k = \frac{\gamma_k}{\gamma_0}$$
where $\gamma_k = \operatorname{Cov}(Y_t, Y_{t+k})$ and $\gamma_0 = \operatorname{Var}(Y_t)$.
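The unit-free property of the autocorrelations can be checked numerically. The sketch below is illustrative only: the function name `sample_acf`, the toy series, and the convention of dividing every sample autocovariance by n are assumptions, not part of the slides.

```python
def sample_acf(y, max_lag):
    """Return [rho_0, rho_1, ..., rho_max_lag] for the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n  # lag-0 autocovariance (the variance)
    acf = []
    for k in range(max_lag + 1):
        gamma_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
        acf.append(gamma_k / gamma0)
    return acf

# Rescaling the data (e.g. USD to thousandths of USD) changes the
# autocovariances but leaves the autocorrelations unchanged.
y = [2.0, 4.0, 3.0, 5.0, 7.0, 6.0, 8.0, 9.0]
acf_original = sample_acf(y, 3)
acf_rescaled = sample_acf([1000 * v for v in y], 3)
```

Comparing `acf_original` with `acf_rescaled` shows the same values up to floating-point rounding, which is the reason the autocorrelations are easier to interpret than the autocovariances.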
Stationary
The ACF declines to 0 quickly.
Non-stationary
The ACF declines to 0 gradually over a prolonged period of time.
Partial correlation coefficient
It measures the correlation between observations that are k time periods apart, after controlling for correlations at intermediate lags.
Why stationarity?
Shocks to a non-stationary series may have a long-lasting impact:
This makes it harder to predict future values because the series does not revert to its mean or to a stable pattern (its variance and autocovariances grow over time).
The use of non-stationary data can lead to spurious regressions.
Lag length selection criterion
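This holds for both the ACF and the PACF.
These statistics test the joint hypothesis that all the autocorrelations up to lag $m$ are simultaneously equal to zero, in which case the series is white noise and therefore stationary:
1) Q-TEST
$$Q = n \sum_{k=1}^{m} \hat{\rho}_k^2 \sim \chi^2(m)$$
where $n$ = sample size, $m$ = lag length, and $Q$ is asymptotically chi-square distributed with $m$ degrees of freedom.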
2) LJUNG-BOX TEST
This statistic is the same as the Q statistic in large samples, but has better properties in small samples:
$$LB = n(n+2) \sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{n-k} \sim \chi^2(m)$$
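Both statistics can be computed directly from the sample autocorrelations. The following is a minimal sketch, assuming the autocovariances are divided by n; the function names, the toy trending series, and the choice m = 10 are hypothetical and serve only as an illustration.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations rho_1..rho_max_lag (autocovariances divided by n)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n / gamma0
            for k in range(1, max_lag + 1)]

def box_pierce_q(y, m):
    """Q = n * sum_{k=1}^{m} rho_k^2."""
    n = len(y)
    return n * sum(r * r for r in sample_acf(y, m))

def ljung_box(y, m):
    """LB = n (n + 2) * sum_{k=1}^{m} rho_k^2 / (n - k)."""
    n = len(y)
    rho = sample_acf(y, m)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rho, start=1))

CHI2_CRIT_5PCT_DF10 = 18.307  # 5% critical value of chi-square with 10 d.f.

trending = [float(t) for t in range(50)]  # strongly autocorrelated toy series
q_stat = box_pierce_q(trending, 10)
lb_stat = ljung_box(trending, 10)
reject_zero_autocorrelation = lb_stat > CHI2_CRIT_5PCT_DF10
```

Because $(n+2)/(n-k) > 1$ for every lag, LB is never smaller than Q for the same series, and the gap between them shrinks as n grows.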
Non-stationary:
• Random walk model
• Random walk with drift
• Trend
Stationary:
• White noise
• ARIMA processes
• Strict stationary
• Difference stationary
• Trend stationary
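3.2.1 Non-stationary stochastic Processes
1) Random walk model (RWM) without a drift
$$Y_t = Y_{t-1} + u_t$$
where $u_t$ has mean 0, constant variance $\sigma^2$, and is serially uncorrelated (it is a white-noise process).
It is a non-mean-reverting (non-stationary) process that can move away from the mean in either a positive or a negative direction.
To see this nature of the process, let’s assume that the process starts at time 0 with the value $Y_0$. Then, the process evolves as follows:
$$Y_1 = Y_0 + u_1, \qquad Y_2 = Y_1 + u_2 = Y_0 + u_1 + u_2, \qquad \ldots$$
• We have by successive substitution
$$Y_t = Y_0 + \sum_{i=1}^{t} u_i$$
• Hence,
$$E(Y_t) = Y_0 + \sum_{i=1}^{t} E(u_i) = Y_0$$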
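Its variance
• We have by successive substitution
$$\operatorname{Var}(Y_t) = \operatorname{Var}\Big(Y_0 + \sum_{i=1}^{t} u_i\Big) = \sum_{i=1}^{t} \operatorname{Var}(u_i) = t\sigma^2$$
since $Y_0$ is a constant and the $u_i$ are serially uncorrelated with $\operatorname{Var}(u_i) = \sigma^2$. The variance grows with $t$, so the process is non-stationary.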
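The growth of the variance with t can be seen in a small simulation. This is a sketch under stated assumptions: normally distributed shocks with $\sigma = 1$, a starting value of 0, and 2000 simulated paths; the function names and the seed are arbitrary choices.

```python
import random

def simulate_random_walk(n_steps, rng, sigma=1.0, y0=0.0):
    """Return [Y_1, ..., Y_n] for Y_t = Y_{t-1} + u_t with u_t ~ N(0, sigma^2)."""
    path, y = [], y0
    for _ in range(n_steps):
        y += rng.gauss(0.0, sigma)
        path.append(y)
    return path

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(0)  # fixed seed so the run is reproducible
paths = [simulate_random_walk(100, rng) for _ in range(2000)]

# Variance across the simulated paths at t = 25 and t = 100;
# theory says these should be close to 25 and 100 when sigma = 1.
var_t25 = variance([p[24] for p in paths])
var_t100 = variance([p[99] for p in paths])
```

The cross-path variance at t = 100 comes out roughly four times the variance at t = 25, which is what a variance proportional to t implies.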
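2) Random walk model (RWM) with drift
$$Y_t = \beta_1 + Y_{t-1} + u_t,$$
where $\beta_1$ is the drift parameter and $u_t$ is a white-noise error term.
The value of $\beta_1$ is the average of the changes between consecutive observations.
If $\beta_1$ is positive, then the average change is an increase in the value of $Y_t$. Thus, $Y_t$ will tend to drift upwards.
However, if $\beta_1$ is negative, $Y_t$ will tend to drift downwards.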
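Mean and variance of the random walk with drift (drift $\beta_1$, starting value $Y_0$, error variance $\sigma^2$):
Mean
$$E(Y_t) = Y_0 + t\beta_1$$
Variance
$$\operatorname{Var}(Y_t) = t\sigma^2$$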
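The mean and variance of the random walk with drift follow from the same successive substitution used for the driftless case. The derivation below is a sketch under assumed notation: $\beta_1$ for the drift, $Y_0$ for a fixed starting value, and $u_i$ for white noise with variance $\sigma^2$.

```latex
\begin{aligned}
Y_t &= \beta_1 + Y_{t-1} + u_t \\
    &= 2\beta_1 + Y_{t-2} + u_{t-1} + u_t \\
    &\;\;\vdots \\
    &= Y_0 + t\beta_1 + \sum_{i=1}^{t} u_i, \\
E(Y_t) &= Y_0 + t\beta_1, \\
\operatorname{Var}(Y_t) &= \sum_{i=1}^{t} \operatorname{Var}(u_i) = t\sigma^2 .
\end{aligned}
```

Both moments change with $t$, which is why the process is non-stationary.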
3) Deterministic Trend
$$Y_t = \beta_1 + \beta_2 t + u_t$$
where $u_t$ is a white noise error term and where $t$ is time measured chronologically.
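3.3 Stationary stochastic processes
1. Difference stationarity
$$Y_t = Y_{t-1} + u_t,$$
$$\Delta Y_t = Y_t - Y_{t-1} = u_t,$$
so the first difference of a random walk is stationary.
• Symbolically
if a non-stationary time series $Y_t$ has to be differenced $d$ times (that is, $\Delta^d Y_t$) to become stationary, then we say $Y_t$ is an integrated time series of order $d$, or $Y_t \sim I(d)$.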
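The idea of differencing d times can be illustrated with a tiny deterministic example. In this sketch the `difference` helper and the list of shocks are hypothetical; cumulating the shocks once gives a series that needs one difference, and cumulating twice gives a series that needs two.

```python
from itertools import accumulate

def difference(y, d=1):
    """Apply the first-difference operator d times."""
    for _ in range(d):
        y = [b - a for a, b in zip(y, y[1:])]
    return y

shocks = [1, -2, 0, 3, -1, 2, -3, 1]

i1_series = list(accumulate(shocks))     # cumulated once: behaves like I(1)
i2_series = list(accumulate(i1_series))  # cumulated twice: behaves like I(2)

once = difference(i1_series)        # recovers the shocks (minus the first one)
twice = difference(i2_series, d=2)  # two differences are needed here
```

Each difference costs one observation, so `once` has one element fewer than `shocks` and `twice` has two fewer.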
[Figure: Example of a difference-stationary series (Dow Jones Indices for October 2015)]
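2) Trend stationarity
$$Y_t = \beta_1 + \beta_2 t + u_t$$
Differencing once
$$\Delta Y_t = \beta_2 + u_t - u_{t-1}$$
• $\Delta Y_t$ is now stationary, with constant mean $\beta_2$ (a horizontal straight line when plotted) and constant variance $2\sigma^2$.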
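A simulated trend-stationary series makes the effect of differencing concrete. The sketch below assumes the linear-trend model $Y_t = \beta_1 + \beta_2 t + u_t$ with hypothetical parameter values and standard normal errors; the OLS detrending step is shown as an alternative way of removing the trend and is an addition for illustration, not something the slides prescribe.

```python
import random

rng = random.Random(1)
beta1, beta2, n = 2.0, 0.5, 200  # hypothetical trend parameters

# Trend-stationary series: Y_t = beta1 + beta2 * t + u_t, u_t ~ N(0, 1)
y = [beta1 + beta2 * t + rng.gauss(0.0, 1.0) for t in range(n)]

# Differencing once: the mean of Delta Y_t should be close to beta2.
dy = [b - a for a, b in zip(y, y[1:])]
mean_dy = sum(dy) / len(dy)

# Alternative: detrend by fitting a straight line with OLS.
t_bar = (n - 1) / 2
y_bar = sum(y) / n
slope = (sum((t - t_bar) * (v - y_bar) for t, v in enumerate(y))
         / sum((t - t_bar) ** 2 for t in range(n)))
intercept = y_bar - slope * t_bar
detrended = [v - (intercept + slope * t) for t, v in enumerate(y)]
```

Both `dy` and `detrended` fluctuate around a constant level, whereas the original `y` climbs steadily with t.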
3.4 Stationarity tests
1. Visual detection
• Time series plot
• Correlogram
2. Unit root test
• Dickey–Fuller test
• Augmented Dickey–Fuller test
(1) VISUAL DETECTION
Time series plot
• Example 1: Determine whether the Dow Jones closing averages for the month of October 2015, as shown in columns A and B of the accompanying figure, form a stationary time series.
• As the figure shows, there is an upward trend in the data. This is an indication that the time series is not stationary.
Correlogram
A graph showing the height of the ACF/PACF at various lag lengths.
2) Unit root test
We now turn to the important problem of testing whether a time series is stationary or not.
The simplest approach to testing for a unit root begins with a random walk model (RWM):
$$Y_t = \rho Y_{t-1} + u_t, \qquad -1 \le \rho \le 1$$
The null hypothesis is that $Y_t$ has a unit root:
$$H_0: \rho = 1$$
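(1) Dickey–Fuller test
A convenient equation for carrying out the unit root test is obtained by subtracting $Y_{t-1}$ from both sides of the equation $Y_t = \rho Y_{t-1} + u_t$:
$$Y_t - Y_{t-1} = \rho Y_{t-1} - Y_{t-1} + u_t = (\rho - 1) Y_{t-1} + u_t$$
• Let us define $\delta = \rho - 1$. Hence,
$$\Delta Y_t = \delta Y_{t-1} + u_t$$
• If $Y_t$ contains a unit root, $\rho = 1$ and $\delta = 0$. If $Y_t$ is stationary, $|\rho| < 1$ and $\delta < 0$. Hence we construct a one-sided t-test on the hypothesis that $\delta = 0$:
$$H_0: \delta = 0 \quad \text{against} \quad H_1: \delta < 0$$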
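Decision
• If the computed test statistic $\tau = \hat{\delta} / \operatorname{se}(\hat{\delta})$ is more negative than the Dickey–Fuller critical value ($\tau < \tau_{critical}$), we reject the hypothesis that $\delta = 0$, in which case the time series is stationary.
• On the other hand, if $\tau \ge \tau_{critical}$, we do not reject the null hypothesis, in which case the time series is non-stationary.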
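The decision rule can be sketched for the simplest Dickey–Fuller regression, $\Delta Y_t = \delta Y_{t-1} + u_t$, with no constant and no trend. Everything named here is an assumption made for illustration: the helper `dickey_fuller_tau`, the simulated series, the seed, and the approximate 5 percent critical value of -1.95 for this specification.

```python
import random
from itertools import accumulate

def dickey_fuller_tau(y):
    """tau statistic from the regression dY_t = delta * Y_{t-1} + u_t (no constant)."""
    lagged = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(x * x for x in lagged)
    delta_hat = sum(x * d for x, d in zip(lagged, dy)) / sxx
    resid = [d - delta_hat * x for x, d in zip(lagged, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)  # residual variance
    se = (s2 / sxx) ** 0.5
    return delta_hat / se

DF_CRIT_5PCT = -1.95  # approximate 5% critical value, no constant, no trend

rng = random.Random(42)
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]

white_noise = shocks                    # stationary by construction
random_walk = list(accumulate(shocks))  # unit root by construction

tau_noise = dickey_fuller_tau(white_noise)
tau_walk = dickey_fuller_tau(random_walk)
noise_is_stationary = tau_noise < DF_CRIT_5PCT
```

For the white-noise series the statistic is far below the critical value, so the unit-root null is rejected; for the random walk it typically stays above it. Ordinary t-table critical values would not be appropriate here, since the statistic follows the Dickey–Fuller distribution under the null.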
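It is estimated in 3 different forms
1) $Y_t$ is a random walk: $\Delta Y_t = \delta Y_{t-1} + u_t$
2) $Y_t$ is a random walk with drift: $\Delta Y_t = \beta_1 + \delta Y_{t-1} + u_t$
3) $Y_t$ is a random walk with drift around a deterministic trend: $\Delta Y_t = \beta_1 + \beta_2 t + \delta Y_{t-1} + u_t$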
(2) Augmented Dickey–Fuller (ADF) test
The Dickey–Fuller specifications and the critical values for those specifications are derived under the assumption that the error term is serially uncorrelated.
The Augmented Dickey–Fuller (ADF) test adds a series of lagged values of $\Delta Y_t$ to the Dickey–Fuller regression (that is, it takes autocorrelation into account).
EXAMPLE
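The ADF test here consists of estimating the following regression:
$$\Delta Y_t = \beta_1 + \beta_2 t + \delta Y_{t-1} + \sum_{i=1}^{m} \alpha_i \Delta Y_{t-i} + \varepsilon_t$$
where $\varepsilon_t$ is a pure white noise error term and where $\Delta Y_{t-1} = Y_{t-1} - Y_{t-2}$, $\Delta Y_{t-2} = Y_{t-2} - Y_{t-3}$, etc.
The number of lagged difference terms to include is often determined empirically, the idea being to include enough terms so that the error term is serially uncorrelated.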
In ADF we still test whether $\delta = 0$, and the ADF test follows the same asymptotic distribution as the DF statistic, so the same critical values can be used.
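To show how the lagged differences enter the test regression, here is a deliberately reduced sketch: one lagged difference, no constant and no trend, solved through the 2x2 normal equations. The helper name, the simulated AR(1) series with coefficient 0.5, and the seed are assumptions; in applied work a library routine such as `adfuller` in statsmodels would normally be used instead.

```python
import random

def adf_tau_one_lag(y):
    """tau on delta in dY_t = delta*Y_{t-1} + alpha*dY_{t-1} + e_t (no constant)."""
    dep = [y[t] - y[t - 1] for t in range(2, len(y))]     # dY_t
    x1 = [y[t - 1] for t in range(2, len(y))]             # Y_{t-1}
    x2 = [y[t - 1] - y[t - 2] for t in range(2, len(y))]  # dY_{t-1}
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * d for a, d in zip(x1, dep))
    s2y = sum(b * d for b, d in zip(x2, dep))
    det = s11 * s22 - s12 * s12
    delta_hat = (s22 * s1y - s12 * s2y) / det
    alpha_hat = (s11 * s2y - s12 * s1y) / det
    resid = [d - delta_hat * a - alpha_hat * b for d, a, b in zip(dep, x1, x2)]
    s2 = sum(e * e for e in resid) / (len(dep) - 2)  # residual variance
    se_delta = (s2 * s22 / det) ** 0.5
    return delta_hat / se_delta

# Stationary AR(1) series: Y_t = 0.5 * Y_{t-1} + e_t
rng = random.Random(7)
y, level = [], 0.0
for _ in range(300):
    level = 0.5 * level + rng.gauss(0.0, 1.0)
    y.append(level)

tau_adf = adf_tau_one_lag(y)
```

The statistic on $\delta$ is read exactly as in the Dickey–Fuller case: a value well below the critical value leads to rejection of the unit-root null.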
EXAMPLE
3.6 Cointegration and ECM
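As a general rule, non-stationary time-series variables should not be used in regression models, to avoid the problem of spurious regression.
However, there is an exception to this rule. Suppose $Y_t$ and $X_t$ are non-stationary $I(1)$ variables. Then we expect their difference, or any linear combination of them, such as
$$u_t = Y_t - \beta_1 - \beta_2 X_t$$
to be $I(1)$ as well. In the important special case where $u_t$ is stationary, $I(0)$, $Y_t$ and $X_t$ are said to be cointegrated.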
Cointegration implies that $Y_t$ and $X_t$ share similar stochastic trends, and, since their difference is stationary, they never diverge too far from each other.
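Testing for Cointegration
Engle–Granger (EG) or Augmented Engle–Granger (AEG) Test
Since the estimated residuals $\hat{u}_t$ are based on the estimated cointegrating parameter $\hat{\beta}_2$, the DF and ADF critical significance values are not quite appropriate.
Steps
1) Estimate the model $Y_t = \beta_1 + \beta_2 X_t + u_t$ by OLS.
2) Predict the residuals $\hat{u}_t$.
3) Undertake a unit root test on $\hat{u}_t$, using the Engle–Granger critical values.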
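The three steps can be sketched on simulated data. All names and numbers below are assumptions made for illustration: the helpers `ols_line` and `df_tau`, the simulated pair (a random walk X and a Y built as 2 + 0.8 X plus stationary noise), the seed, and the use of the asymptotic 5 percent Engle–Granger critical value of about -3.34.

```python
import random
from itertools import accumulate

def ols_line(x, y):
    """Intercept and slope from regressing y on x with a constant."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
             / sum((a - x_bar) ** 2 for a in x))
    return y_bar - slope * x_bar, slope

def df_tau(series):
    """tau from d(series)_t = delta * series_{t-1} + error (no constant)."""
    lagged = series[:-1]
    diff = [b - a for a, b in zip(series, series[1:])]
    sxx = sum(v * v for v in lagged)
    delta_hat = sum(v * d for v, d in zip(lagged, diff)) / sxx
    rss = sum((d - delta_hat * v) ** 2 for v, d in zip(lagged, diff))
    return delta_hat / ((rss / (len(diff) - 1)) / sxx) ** 0.5

EG_CRIT_5PCT = -3.34  # asymptotic 5% Engle-Granger critical value

rng = random.Random(3)
x = list(accumulate(rng.gauss(0.0, 1.0) for _ in range(200)))  # I(1) regressor
y = [2.0 + 0.8 * xi + rng.gauss(0.0, 1.0) for xi in x]          # cointegrated with x

b1, b2 = ols_line(x, y)                                   # step 1: estimate model
residuals = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]  # step 2: predict residuals
tau_resid = df_tau(residuals)                             # step 3: unit root test
cointegrated = tau_resid < EG_CRIT_5PCT
```

The residual-based statistic is compared with the Engle–Granger value rather than the ordinary DF value, because the residuals come from an estimated regression.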
Example
1) Using the data introduced in Section 21.1 and found on the book’s website, we first regressed LPCE on LDPI and obtained the following regression:
2) Since LPCE and LDPI are individually non-stationary, there is the possibility that this regression is spurious. But when we performed a unit root test on the residuals obtained from Eq. (21.11.3), we obtained the following results:
The Engle–Granger asymptotic 5 percent and 10 percent critical values are about -3.34 and -3.04, respectively.
Error correction model (ECM)
EXAMPLE
• The negative value of the error correction
coefficient (-0.1223) indicates that the dependent
variable adjusts towards its long-run equilibrium
at a rate of approximately 12.23% per period.
This means that it takes time for the system to
correct deviations from the long-term
equilibrium, and the adjustment occurs at a
relatively slow pace.
• The negative sign of the coefficient suggests that the system tends to correct deviations from equilibrium, which aligns with the theoretical expectations of an error correction model.
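The speed of adjustment can be turned into a rough half-life. The sketch below assumes that a deviation from equilibrium shrinks geometrically by the factor (1 + coefficient) each period and that no other short-run dynamics matter; that simplification, and the variable names, are assumptions for illustration.

```python
import math

ecm_coefficient = -0.1223  # error correction coefficient from the example

# Share of last period's disequilibrium that survives into the next period,
# assuming no other short-run dynamics (a simplifying assumption).
persistence = 1.0 + ecm_coefficient

def remaining_gap(periods):
    """Fraction of an initial deviation still uncorrected after `periods`."""
    return persistence ** periods

# Number of periods needed to close half of an initial deviation.
half_life = math.log(0.5) / math.log(persistence)
```

Under these assumptions the half-life is a little over five periods, which is consistent with describing the adjustment as relatively slow.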