Time Series Analysis
EC3090 Econometrics
Gaia Narciso
Trinity College Dublin
Time Series vs. Cross Sectional
• Time series data has a temporal ordering, unlike cross-sectional data.
Time Series vs. Cross Sectional
• Question: How do we think about randomness in time
series data?
Consider the series of Irish GDP data.
We can interpret the observation for each year as a realization of a stochastic process.
We observe only one realization, because we cannot start the process over again. But if history had been different, we would have obtained a different realization of the stochastic (random) process.
Time Series vs. Cross Sectional
• A stochastic process (or time series process) is a sequence of random variables.
• When we collect time series data, we actually collect the realizations of this stochastic process.
• Sample size: the number of time periods over which we observe the variables of interest.
Time Series vs. Cross Sectional
• To sum up:
Cross-sectional data: population and sample.
Time series data: stochastic process (time series process) and realization.
We are going to consider different stochastic processes. Before doing so, we will give some definitions.
Stationary Time Series
• Covariance Stationary
$E(Y_t) = \mu$ (constant over time)
$Var(Y_t) = \sigma^2$ (constant over time)
$Cov(Y_t, Y_{t-k}) = \gamma_k$ (depends only on the lag $k$)
Autoregressive Process AR
• AR(1)
$Y_t = \rho Y_{t-1} + u_t$, where $u_t$ is white noise.
If $|\rho| < 1$, then the AR(1) process is stationary.
$E(Y_t) = E[\rho Y_{t-1} + u_t] = 0$
Autoregressive Process AR
• Derivation of $E(Y_t)$
$E(Y_t) = E[\rho Y_{t-1} + u_t] = \rho E[Y_{t-1}] + E[u_t] = \rho E[Y_t]$
(by stationarity, $E[Y_{t-1}] = E[Y_t]$)
$(1-\rho) E(Y_t) = 0 \Rightarrow E(Y_t) = 0$
Autoregressive Process AR
• AR(1)
$Var(Y_t) = Var(\rho Y_{t-1} + u_t)$
$Var(Y_t) = \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Derivation of $Var(Y_t)$
$Var(Y_t) = Var(\rho Y_{t-1} + u_t) = \rho^2 Var(Y_{t-1}) + \sigma_u^2$
By stationarity, $Var(Y_{t-1}) = Var(Y_t)$, so $(1-\rho^2) Var(Y_t) = \sigma_u^2$
$Var(Y_t) = \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1)
$Cov(Y_t, Y_{t+1}) = ?$
$Y_{t+1} = \rho Y_t + u_{t+1}$
$Cov(Y_t, Y_{t+1}) = E(Y_t Y_{t+1}) = \rho \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance Derivation
$Cov(Y_t, Y_{t+1}) = E(Y_t Y_{t+1}) = E[Y_t(\rho Y_t + u_{t+1})] = \rho E(Y_t^2) + E(Y_t u_{t+1})$
Since $u_{t+1}$ is uncorrelated with $Y_t$, the last term is zero, so
$Cov(Y_t, Y_{t+1}) = \rho Var(Y_t) = \rho \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1)
$Cov(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = \rho^2 \frac{\sigma_u^2}{1-\rho^2}$
We can generalize:
$Cov(Y_t, Y_{t+k}) = \rho^k \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance Derivation
$Cov(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = E[Y_t(\rho Y_{t+1} + u_{t+2})]$
$= \rho E(Y_t Y_{t+1}) + E(Y_t u_{t+2})$
$= \rho E[Y_t(\rho Y_t + u_{t+1})] + 0 = \rho^2 E(Y_t^2) + \rho E(Y_t u_{t+1})$
Autoregressive Process AR
Covariance Derivation (Contd.)
$Cov(Y_t, Y_{t+2}) = \rho^2 E(Y_t^2) = \rho^2 Var(Y_t) = \rho^2 \frac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1)
Since $|\rho| < 1$, the covariance between two terms which are very distant in time is low:
$\rho^k \to 0$ as $k \to \infty$
Autoregressive Process AR
• AR(1)
$Corr(Y_t, Y_{t+k}) = \frac{Cov(Y_t, Y_{t+k})}{Var(Y_t)} = \rho^k$
Since $|\rho| < 1$: as $k \to \infty$, the correlation goes to zero.
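A minimal simulation sketch of these AR(1) moments (the values of $\rho$, $\sigma_u$, the seed and the sample size are illustrative choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_u, T = 0.8, 1.0, 100_000

u = rng.normal(0.0, sigma_u, T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = rho * y[t - 1] + u[t]          # Y_t = rho*Y_{t-1} + u_t

print("mean:", round(y.mean(), 3), "(theory: 0)")
print("variance:", round(y.var(), 3),
      "(theory:", round(sigma_u**2 / (1 - rho**2), 3), ")")
for k in (1, 2, 5):
    r_k = np.corrcoef(y[k:], y[:-k])[0, 1]
    print(f"corr at lag {k}: {r_k:.3f} (theory: {rho**k:.3f})")
```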
Moving Average Process MA
• MA(1)
$Y_t = u_t + \theta u_{t-1}$
$E(Y_t) = 0$
$Var(Y_t) = Var(u_t + \theta u_{t-1}) = \sigma_u^2 + \theta^2 \sigma_u^2 = (1+\theta^2)\sigma_u^2$
Moving Average Process MA
Variance Derivation
$Var(Y_t) = Var(u_t + \theta u_{t-1}) = Var(u_t) + \theta^2 Var(u_{t-1})$
$= \sigma_u^2 + \theta^2 \sigma_u^2 = (1+\theta^2)\sigma_u^2$
Moving Average Process MA
• MA(1)
$Cov(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})]$
Moving Average Process MA
• Covariance Derivation (Contd.)
$Cov(Y_t, Y_{t-1}) = E[u_t u_{t-1} + \theta u_t u_{t-2} + \theta u_{t-1}^2 + \theta^2 u_{t-1} u_{t-2}]$
All the cross terms have expectation zero, so $Cov(Y_t, Y_{t-1}) = \theta \sigma_u^2$
Autocorrelation
$Cov(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})] = \theta \sigma_u^2$
$Cov(Y_t, Y_{t-2}) = E[(u_t + \theta u_{t-1})(u_{t-2} + \theta u_{t-3})] = 0$
Autocorrelation
MA(q)
MA(1): $Cov(Y_t, Y_{t-1}) \neq 0$, $Cov(Y_t, Y_{t-2}) = 0$
MA(q): $Cov(Y_t, Y_{t-k}) \neq 0$ up to $k = q$
$Cov(Y_t, Y_{t-s}) = 0$ if $s > q$
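The same kind of simulation check for the MA(1) autocovariances ($\theta$, the seed and the sample size are again illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, sigma_u, T = 0.6, 1.0, 100_000

u = rng.normal(0.0, sigma_u, T + 1)
y = u[1:] + theta * u[:-1]                # Y_t = u_t + theta*u_{t-1}

print("Var(Y):", round(y.var(), 3), "(theory:", (1 + theta**2) * sigma_u**2, ")")
print("Cov lag 1:", round(np.mean(y[1:] * y[:-1]), 3), "(theory:", theta * sigma_u**2, ")")
print("Cov lag 2:", round(np.mean(y[2:] * y[:-2]), 3), "(theory: 0)")
```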
Autoregressive Process
• AR(1)
$Y_t = \rho Y_{t-1} + u_t$, $|\rho| < 1$, $u_t$ is white noise
• $E(Y_t) = 0$
• $Var(Y_t) = \frac{\sigma_u^2}{1-\rho^2}$
• $Cov(Y_t, Y_{t-k}) = \rho^k \frac{\sigma_u^2}{1-\rho^2}$
• $Corr(Y_t, Y_{t-k}) = \rho^k \to 0$ as $k \to \infty$
Moving Average Process
• MA(1)
$Y_t = u_t + \theta u_{t-1}$, $t = 1, 2, \ldots$
• $E(Y_t) = 0$
• $Var(Y_t) = \sigma_u^2 + \theta^2 \sigma_u^2 = (1+\theta^2)\sigma_u^2$
How Do You Calculate Autocorrelation?
$Corr(Y_t, Y_{t-k}) = \rho_k = \frac{Cov(Y_t, Y_{t-k})}{\sqrt{Var(Y_t) \, Var(Y_{t-k})}}$
Under stationarity, $Var(Y_t) = Var(Y_{t-k})$, so
$\rho_k = \frac{Cov(Y_t, Y_{t-k})}{Var(Y_t)} = \frac{\gamma_k}{\gamma_0}$
How Do You Calculate Autocorrelation
AR(1)
$\rho_k = \frac{\gamma_k}{\gamma_0} = \rho^k$
$\rho_k = Corr(Y_t, Y_{t-k}) = \rho^k$ decreases exponentially as $k$ increases.
How Do You Calculate Autocorrelation
MA(1)
$\rho_k = \frac{Cov(Y_t, Y_{t-k})}{Var(Y_t)}$
$\rho_1 = \frac{\theta \sigma_u^2}{(1+\theta^2)\sigma_u^2} = \frac{\theta}{1+\theta^2}$
$\rho_2 = \frac{Cov(Y_t, Y_{t-2})}{Var(Y_t)} = 0$
Autocorrelation
We are going to consider the autocorrelation function: it gives the relationship between the autocorrelation and the time distance between the variables.
Autocorrelation
By looking at the autocorrelation function you can try to understand what the underlying stochastic process is.
ARMA(1,1): $Y_t = \rho Y_{t-1} + u_t + \theta u_{t-1}$
ARMA(p,q): $Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \cdots + \alpha_p Y_{t-p} + u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q}$
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps:
1. Identification
The ACF (plotted as the correlogram) will be the relevant tool.
2. Estimation
Estimate the parameters.
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps (continued):
3. Diagnostic Checking
Does the model fit the data? Are the estimated residuals white noise?
If the answer is no: start again.
4. Forecasting
This is the reason we like time series!
Box-Jenkins Methodology
1. Identification
We have seen the autocorrelation function:
$\rho_k = \frac{Cov(Y_t, Y_{t-k})}{Var(Y_t)} = \frac{\gamma_k}{\gamma_0}$
Box-Jenkins Methodology
1. Identification
We can calculate the sample autocorrelation function:
$\hat{\gamma}_k = \frac{\sum_t (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{n}$
$\hat{\gamma}_0 = \frac{\sum_t (Y_t - \bar{Y})^2}{n}$
$\hat{\rho}_k = \frac{\hat{\gamma}_k}{\hat{\gamma}_0}$
Then we plot $\hat{\rho}_k$ against the lag $k$.
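A sketch of how $\hat{\rho}_k$ can be computed directly from this definition, cross-checked against statsmodels' acf function (the white-noise test series is an illustrative choice):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

def sample_acf(y, k):
    """rho_hat_k = gamma_hat_k / gamma_hat_0, as defined above."""
    ybar = y.mean()
    gamma_k = np.sum((y[k:] - ybar) * (y[:-k] - ybar)) / len(y)
    gamma_0 = np.sum((y - ybar) ** 2) / len(y)
    return gamma_k / gamma_0

rng = np.random.default_rng(2)
y = rng.normal(size=500)                  # white noise: rho_k should be near 0
print([round(sample_acf(y, k), 3) for k in (1, 2, 3)])
print(acf(y, nlags=3)[1:])                # statsmodels' version agrees
```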
Box-Jenkins Methodology
1. Identification
2. Estimation
3. Diagnostic Checking
4. Forecasting
Box-Jenkins Methodology
1. Identification
Autocorrelation Function
[Figure: ACF of an AR(1) process plotted against the number of lags, with the spikes $Corr(Y_t, Y_{t-1}), Corr(Y_t, Y_{t-2}), \ldots$ declining gradually over lags 1-5.]
AR(p): the ACF decays exponentially, or with a damped sine-wave pattern, or both.
Box-Jenkins Methodology
1. Identification
[Figure: ACF of an MA(q) process plotted against the lag.]
MA(q): the ACF spikes through q lags, then cuts off.
Box-Jenkins Methodology
a) Autocorrelation Function
b) Partial Autocorrelation Function, $\phi_{kk}$
It measures the correlation between time series observations that are $k$ time periods apart, after controlling for the correlation at intermediate lags.
Box-Jenkins Methodology
Consider $Y_t$ and $Y_{t-k}$.
When you calculate the correlation between these 2 observations, you consider the effect of the intermediate Y's: $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-k+1}$ have an impact on $Corr(Y_t, Y_{t-k})$.
When you calculate the PACF, you net out these intermediate effects.
How do you estimate it? Run the regression
$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + e_t$
where $\hat{\beta}_2$ is the partial autocorrelation at lag 2.
Box-Jenkins Methodology
What does it look like?
[Figure: PACF $\phi_{kk}$ plotted against the lag. For an AR(2) it spikes for the first two lags and then drops; for an MA(1) it declines gradually.]
Box-Jenkins Methodology
Model       ACF                                  PACF
AR(p)       Declines exponentially, or with a    Spikes for p lags, then it drops
            damped sine-wave pattern, or both
MA(q)       Spikes through q lags                Declines exponentially
ARMA(p,q)   Exponential decay                    Exponential decay
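An illustrative simulation of this identification table, using statsmodels' ArmaProcess to generate an AR(1) and an MA(1) and comparing their ACF/PACF shapes (the coefficient 0.7 and the sample size are arbitrary choices):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(3)
T = 5_000
# ArmaProcess takes lag-polynomial coefficients: ar=[1, -0.7] means Y_t = 0.7*Y_{t-1} + u_t
ar1 = ArmaProcess(ar=[1, -0.7], ma=[1]).generate_sample(T, distrvs=rng.standard_normal)
ma1 = ArmaProcess(ar=[1], ma=[1, 0.7]).generate_sample(T, distrvs=rng.standard_normal)

print("AR(1) ACF :", np.round(acf(ar1, nlags=4)[1:], 2))   # geometric decay
print("AR(1) PACF:", np.round(pacf(ar1, nlags=4)[1:], 2))  # spike at lag 1, then ~0
print("MA(1) ACF :", np.round(acf(ma1, nlags=4)[1:], 2))   # spike at lag 1, then ~0
print("MA(1) PACF:", np.round(pacf(ma1, nlags=4)[1:], 2))  # decays
```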
Box-Jenkins Methodology
1. Identification
2. Estimation
• As usual
3. Diagnostic Checking
• Plot the residuals $\hat{u}_t$
• Look at the ACF and PACF of the residuals
4. Forecasting
Non-Stationary Process
(Weak) Stationarity
$E(Y_t) = \mu$: constant over time
$Var(Y_t) = \sigma^2$: constant over time
$Cov(Y_t, Y_{t-k}) = \gamma_k$: depends only on the time distance between the 2 observations
Non-Stationary Process
3 Types of Non-Stationary Processes
1) Random Walk
2) Random Walk with Drift
3) Trend Stationary
Non-Stationary Process
What is the difference? Their mean and variance change over time.
What does it mean that the mean/variance/covariance are constant? It means that whenever the series is hit by a shock, it diverges from its mean, but then it goes back to it. The fluctuations around the mean have a constant amplitude.
Non-Stationary Process
Non-stationary time series vary over time. But if the mean and variance change over time, each set of time series data is just a single episode.
Non-Stationary Stochastic Process
A) Random Walk
$Y_t = Y_{t-1} + u_t$
Similar to an AR(1), $Y_t = \rho Y_{t-1} + u_t$, with $\rho = 1$.
Non-Stationary Stochastic Process:
Random Walk
Let’s see why it is non-stationary:
$Y_1 = Y_0 + u_1$
$Y_2 = Y_1 + u_2 = Y_0 + u_1 + u_2$
$Y_3 = Y_2 + u_3 = Y_0 + u_1 + u_2 + u_3 = Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk
Mean
$E(Y_t) = E[Y_0 + \sum_{i=1}^{t} u_i] = Y_0$
since $\{u_t\}$ is white noise: $u_t \sim (0, \sigma_u^2)$
Non-Stationary Stochastic Process:
Random Walk
Variance
$Var(Y_t) = Var[Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$
• The variance increases with time
• Note one feature of the RW: random shocks are persistent
Non-Stationary Stochastic Process:
Random Walk
Variance Derivation
$Var(Y_t) = Var[Y_0 + \sum_{i=1}^{t} u_i] = Var[u_1 + u_2 + \cdots + u_t]$
$= \sigma_u^2 + \sigma_u^2 + \cdots + \sigma_u^2 = t\sigma_u^2$
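A simulation sketch of $Var(Y_t) = t\sigma_u^2$: generate many independent walks and look at the cross-section variance at a few dates (the path count and horizon are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, T, sigma_u = 20_000, 200, 1.0

# each row is one realization: Y_t = u_1 + ... + u_t (with Y_0 = 0)
paths = rng.normal(0.0, sigma_u, (n_paths, T)).cumsum(axis=1)

for t in (10, 50, 200):
    print(f"Var(Y_{t}) = {paths[:, t - 1].var():.1f} (theory: {t * sigma_u**2:.0f})")
```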
Non-Stationary Stochastic Process
B) Random Walk With Drift
$Y_t = \delta + Y_{t-1} + u_t$
where $\delta$ is the drift parameter.
Non-Stationary Stochastic Process:
Random Walk With Drift
$Y_t = \delta + Y_{t-1} + u_t$
$Y_1 = \delta + Y_0 + u_1$
$Y_2 = \delta + Y_1 + u_2 = \delta + (\delta + Y_0 + u_1) + u_2 = 2\delta + Y_0 + u_1 + u_2$
$Y_3 = \delta + Y_2 + u_3 = 3\delta + Y_0 + u_1 + u_2 + u_3 = 3\delta + Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = t\delta + Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk With Drift
Mean
$E(Y_t) = E[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\delta + Y_0$
• The mean depends on time
Non-Stationary Stochastic Process:
Random Walk With Drift
Variance
$Var(Y_t) = Var[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$
Non-Stationary Stochastic Process
C) Trend Stationary
$Y_t = \alpha + \beta t + u_t$
where $\beta t$ is a deterministic trend.
Non-Stationary Stochastic Process:
Trend Stationary
Mean
$E(Y_t) = E(\alpha + \beta t + u_t) = \alpha + \beta t$: not constant
Variance
$Var(Y_t) = \sigma_u^2$: constant
Non-Stationary Stochastic Process:
What Shall We Do?
Differencing
Say that you have a RW:
$Y_t = Y_{t-1} + u_t$
Subtract $Y_{t-1}$:
$Y_t - Y_{t-1} = u_t$
$\Delta Y_t = u_t$: stationary
While $Y_t$ is non-stationary, $\Delta Y_t$ is stationary.
Non-Stationary Stochastic Process:
Differencing
$Y_t = Y_{t-1} + u_t$
$\Delta Y_t = Y_t - Y_{t-1} = u_t$
You can take the first difference: it is stationary.
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
The mean of a trend stationary process is:
$Y_t = \alpha + \beta t + u_t$
$E(Y_t) = \alpha + \beta t$
If we subtract the mean of $Y_t$ from $Y_t$, the series is stationary: detrending.
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
$Y_t = \alpha + \beta t + u_t$
$\hat{Y}_t = \hat{\alpha} + \hat{\beta} t$
$Y_t - \hat{Y}_t$: stationary
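A short sketch of both fixes: differencing a simulated random walk and detrending a simulated trend-stationary series by OLS (all parameter values and seeds are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
T = 500
t = np.arange(T, dtype=float)

rw = rng.normal(size=T).cumsum()              # random walk -> difference it
d_rw = np.diff(rw)                            # Delta Y_t = u_t, stationary

ts = 2.0 + 0.5 * t + rng.normal(size=T)       # trend stationary -> detrend it
fit = sm.OLS(ts, sm.add_constant(t)).fit()    # estimate alpha_hat, beta_hat
detrended = ts - fit.fittedvalues             # Y_t - Y_hat_t, stationary

print("sd of differenced RW:  ", d_rw.std())
print("sd of detrended series:", detrended.std())
```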
Integrated Process
A non-stationary process is said to be integrated of order 1 if it has to be differenced once to make it stationary.
$Y \sim I(d)$: it has to be differenced $d$ times to make it stationary.
Integrated Process
Some properties
1) If $X_t \sim I(0)$ (stationary) and $Y_t \sim I(1)$ (non-stationary), then $Z_t = X_t + Y_t \sim I(1)$
2) If $X_t \sim I(d)$, then $Z_t = a + bX_t \sim I(d)$
Integrated Process
Some properties (continued)
3) If $X_t \sim I(d_1)$ and $Y_t \sim I(d_2)$ with $d_2 > d_1$, then $Z_t = aX_t + bY_t \sim I(d_2)$
4) If $X_t \sim I(d)$ and $Y_t \sim I(d)$, then $Z_t = aX_t + bY_t \sim I(d^*)$, where possibly $d^* < d$: cointegration
Spurious Regressions
Say that you have 2 variables, $Y_t \sim I(1)$ and $X_t \sim I(1)$. It means that they are highly trended:
$X_t = X_{t-1} + a_t$, $a_t \sim (0, \sigma_a^2)$
$Y_t = Y_{t-1} + \lambda_t$, $\lambda_t \sim (0, \sigma_\lambda^2)$
Two RW processes, where $a_t$ and $\lambda_t$ are independent. Suppose that you run:
$Y_t = \alpha + \beta X_t + u_t$
Spurious Regressions (Continued)
Say that you know they are totally independent:
$H_0: \beta = 0$, $H_a: \beta \neq 0$
You would expect not to reject $H_0$.
But…
The t-test will reject the null hypothesis far too often.
This is the spurious regression problem.
Spurious Regressions (Continued)
Why does it occur?
$Y_t = \alpha + \beta X_t + u_t$
$u_t$ should be homoskedastic and not serially correlated. Look at $u_t$:
$u_t = Y_t - \alpha - \beta X_t$
Spurious Regressions (Continued)
$u_t$ has always been assumed to be well behaved, but in this case it is a linear combination of 2 integrated processes:
It is non-stationary.
The Gauss-Markov assumptions are violated.
The t-statistic doesn't have a limiting standard normal distribution.
$R^2$ will be high (e.g. 0.99).
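A Monte Carlo sketch of the spurious regression problem: regress one simulated random walk on another, independent one, and count how often the naive t-test rejects $\beta = 0$ (sample size and replication count are illustrative choices):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
T, reps, rejections = 200, 500, 0

for _ in range(reps):
    y = rng.normal(size=T).cumsum()           # Y_t ~ I(1)
    x = rng.normal(size=T).cumsum()           # X_t ~ I(1), independent of Y
    res = sm.OLS(y, sm.add_constant(x)).fit()
    rejections += abs(res.tvalues[1]) > 1.96  # naive 5% t-test of beta = 0

print("rejection rate:", rejections / reps)   # far above the nominal 0.05
```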
Random Walk
A random walk process (with or without drift) is an example of a unit root process.
$Y_t = \rho Y_{t-1} + u_t$, $-1 \le \rho \le 1$
• Random walk process ($\rho = 1$): $Y_t = Y_{t-1} + u_t$
• Its mean is constant over time
• Its variance is not constant
⇒ unit root process
• If $|\rho| < 1$: the AR(1) is stationary
Dickey-Fuller Test
How do we detect whether a process is a unit root process or not? The Dickey-Fuller test.
Start with the general process:
$Y_t = \rho Y_{t-1} + u_t$ (*)
We would like to know whether $\rho = 1$.
Play with (*) and subtract $Y_{t-1}$ from both sides:
$Y_t - Y_{t-1} = \rho Y_{t-1} - Y_{t-1} + u_t$
$\Delta Y_t = (\rho - 1) Y_{t-1} + u_t$
$\Delta Y_t = \delta Y_{t-1} + u_t$, where $\delta = \rho - 1$
Dickey-Fuller Test (Continued)
$H_0: \delta = 0$ (i.e. $\rho = 1$)
$H_a: \delta < 0$
If $\delta = 0$: $\Delta Y_t = u_t$, the first difference is stationary.
If you cannot reject the null hypothesis, it means that we cannot reject that $Y_t$ is non-stationary: evidence in favour of non-stationarity.
If we reject $H_0$: evidence in favour of stationarity.
Dickey-Fuller Test (Continued)
Can we use the t-test? No! Why?
Under the null hypothesis, the model is non-stationary. The t-statistic of the estimated coefficient doesn't follow a t distribution, not even in large samples.
Dickey-Fuller Test (Continued)
What should we do then?
Dickey and Fuller created tables of critical values by Monte Carlo simulation: you use the tau-statistic.
The table of critical values is made of three panels. Why? Because the critical values depend on the model used.
Dickey-Fuller Test (Continued)
Upper Panel
$\Delta Y_t = \delta Y_{t-1} + u_t$ (random walk)
Middle Panel
$\Delta Y_t = \alpha + \delta Y_{t-1} + u_t$ (RW with drift)
Lower Panel
$\Delta Y_t = \alpha + \beta t + \delta Y_{t-1} + u_t$ (RW with drift and trend)
Dickey-Fuller Test (Continued)
The null hypothesis is the same:
$H_0: \delta = 0$, $H_a: \delta < 0$
tau statistic: $\tau = \frac{\hat{\delta}}{se(\hat{\delta})}$
But the critical values depend on the model.
Augmented Dickey-Fuller Test
We have seen that a way of carrying out the unit root test is to write the RW as:
$\Delta Y_t = \delta Y_{t-1} + e_t$
$H_0: \delta = 0$; $H_a: \delta < 0$
We could also have more complicated models with additional lags:
$\Delta Y_t = \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + e_t$, where $|\gamma_1| < 1$
Under $H_0: \delta = 0$, $\{\Delta Y_t\}$ follows a stable AR(1) model.
Augmented Dickey-Fuller Test
(Continued)
We can also add more lags: run a regression of
$\Delta Y_t$ on $Y_{t-1}, \Delta Y_{t-1}, \Delta Y_{t-2}, \ldots, \Delta Y_{t-p}$
and carry out the test on $\hat{\delta}$ as before.
This is the Augmented Dickey-Fuller test.
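The (augmented) Dickey-Fuller test is available as adfuller in statsmodels. A sketch on two simulated series, one with a unit root and one stationary (regression="c" corresponds to the drift variant above, "ct" to drift plus trend; all parameter values are illustrative):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(7)
T = 500
rw = rng.normal(size=T).cumsum()              # unit root: should NOT reject H0
ar = np.zeros(T)
for t in range(1, T):
    ar[t] = 0.5 * ar[t - 1] + rng.normal()    # stable AR(1): should reject H0

for name, series in [("random walk", rw), ("stable AR(1)", ar)]:
    tau, pval, *_ = adfuller(series, regression="c")
    print(f"{name:>13}: tau = {tau:.2f}, p-value = {pval:.3f}")
```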
Cointegration and Error Correction
Models
We have talked about the fact that if $X_t \sim I(1)$ and $Y_t \sim I(1)$, a linear combination of the 2 is I(1).
However, there are exceptions! It is possible that there exists a $\beta$ such that:
$Z_t = Y_t - \beta X_t \sim I(0)$
In this case we say that the 2 variables are cointegrated.
If such a $\beta$ exists, then $\beta$ is the cointegration parameter.
Cointegration and Error Correction
Models
Two variables are said to be cointegrated if they have a long-term, or equilibrium, relationship between them.
It means that they drift together: they share a common trend.
Testing For Cointegration
How do you test for cointegration?
1) If $\beta$ is known:
$Z_t = Y_t - \beta X_t$
$Z_t = \rho Z_{t-1} + e_t$
$\Delta Z_t = (\rho - 1) Z_{t-1} + e_t$
Test whether $Z_t$ has a unit root: apply the Dickey-Fuller test to $Z_t$.
Testing For Cointegration
a) If we find evidence that $Z_t$ is stationary (we reject the null hypothesis of non-stationarity):
the 2 variables share a common stochastic trend; they are cointegrated.
b) If we find evidence that $Z_t$ is non-stationary (we cannot reject the null hypothesis of non-stationarity):
the 2 variables are not cointegrated.
Testing For Cointegration
2) If $\beta$ is unknown:
We have to rely on the residuals.
a) Consider the case of a spurious regression: in a spurious regression the errors are non-stationary.
b) Then, in the presence of cointegration, it must be that the errors are stationary.
Testing For Cointegration
$Y_t = \alpha + \beta X_t + u_t$
Estimate it:
$Y_t = \hat{\alpha} + \hat{\beta} X_t + \hat{u}_t$
$\hat{u}_t = Y_t - \hat{\alpha} - \hat{\beta} X_t$
If the 2 variables are cointegrated, the $\hat{u}_t$ are stationary ($\hat{u}_t$ is a linear combination of the 2).
Testing For Cointegration
Test
$\hat{u}_t = \rho \hat{u}_{t-1} + e_t$, where $e_t$ is white noise,
and use the Dickey-Fuller critical values for cointegration.
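A sketch of this residual-based (Engle-Granger) procedure on a simulated cointegrated pair, using statsmodels' coint, which bundles the two steps and uses the appropriate cointegration critical values (the true $\alpha = 1$ and $\beta = 2$ are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(8)
T = 500
x = rng.normal(size=T).cumsum()               # X_t ~ I(1)
y = 1.0 + 2.0 * x + rng.normal(size=T)        # cointegrated with beta = 2

# step 1: estimate alpha, beta by OLS and keep the residuals u_hat
u_hat = sm.OLS(y, sm.add_constant(x)).fit().resid
print("sd of residuals:", u_hat.std())        # stable, stationary-looking

# step 2: unit-root test on the residuals, with cointegration critical values
tau, pval, _ = coint(y, x)
print(f"tau = {tau:.2f}, p-value = {pval:.4f}")  # small p-value => cointegrated
```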
Testing For Cointegration
If $Y_t$ and $X_t$ are I(1) and they are not cointegrated:
a regression of $Y_t$ on $X_t$ does not mean anything. You can still take the first differences of each variable and work with $\Delta Y_t$, $\Delta X_t$.
If $Y_t$ and $X_t$ are cointegrated:
OK, it means that a long-run equilibrium relationship exists.
Forecasting
We are at time $t$ and we want to forecast $Y_{t+1}$. What do we do?
We have an information set at time $t$, $I_t$: it means that we know all the previous values taken by $Y_t$ and the other variables.
Forecasting
Consider an AR(1):
$Y_t = \alpha + \rho Y_{t-1} + e_t$
Estimate it and get the estimated values $\hat{\alpha}, \hat{\rho}$.
Now update the process:
$Y_{t+1} = \alpha + \rho Y_t + e_{t+1}$
$E_t(Y_{t+1} \mid I_t) = \hat{\alpha} + \hat{\rho} Y_t$
We can forecast forward:
$Y_{t+2} = \alpha + \rho Y_{t+1} + e_{t+2}$
$E_t(Y_{t+2}) = \hat{\alpha} + \hat{\rho} E_t(Y_{t+1}) = \hat{\alpha} + \hat{\rho}[\hat{\alpha} + \hat{\rho} Y_t]$
Forecasting
The quality of the forecast deteriorates as we forecast farther out into the future. Of course forecasts are not perfectly accurate, so we have to consider the forecast error.
1-step-ahead forecast error:
$f_t(1) = Y_{t+1} - E_t(Y_{t+1})$
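A sketch of these iterated forecasts: estimate $\hat{\alpha}$ and $\hat{\rho}$ by OLS and roll the fitted equation forward (all numbers are simulated and illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
T, alpha, rho = 300, 1.0, 0.7
y = np.zeros(T)
for t in range(1, T):
    y[t] = alpha + rho * y[t - 1] + rng.normal()

# estimate alpha_hat, rho_hat by regressing Y_t on a constant and Y_{t-1}
res = sm.OLS(y[1:], sm.add_constant(y[:-1])).fit()
a_hat, r_hat = res.params

f = y[-1]                                 # start from the last observed value
for h in (1, 2, 3):
    f = a_hat + r_hat * f                 # E_t(Y_{t+h}) = a_hat + r_hat*E_t(Y_{t+h-1})
    print(f"{h}-step-ahead forecast: {f:.3f}")
```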