Chapter 6
Univariate time series modelling and forecasting
© Chris Brooks 2013
Introductory Econometrics for Finance
Univariate Time Series Models
Where we attempt to predict returns using only information
contained in their past values.
Some Notation and Concepts
A Strictly Stationary Process
A strictly stationary process is one where
$$P\{y_{t_1} \le b_1, \ldots, y_{t_n} \le b_n\} = P\{y_{t_1+m} \le b_1, \ldots, y_{t_n+m} \le b_n\}$$
A Weakly Stationary Process
If a series satisfies the next three equations, it is said to be weakly or covariance stationary:
(1) $E(y_t) = \mu, \quad t = 1, 2, \ldots, \infty$
(2) $E(y_t - \mu)(y_t - \mu) = \sigma^2 < \infty$
(3) $E(y_{t_1} - \mu)(y_{t_2} - \mu) = \gamma_{t_2 - t_1} \quad \forall\, t_1, t_2$
So if the process is covariance stationary, all the variances are the same and all the covariances depend on the difference between $t_1$ and $t_2$. The moments
$$E(y_t - E(y_t))(y_{t-s} - E(y_{t-s})) = \gamma_s, \quad s = 0, 1, 2, \ldots$$
are known as the covariance function.
The covariances, $\gamma_s$, are known as autocovariances.
However, the values of the autocovariances depend on the units of measurement of $y_t$.
It is thus more convenient to use the autocorrelations, which are the autocovariances normalised by dividing by the variance:
$$\tau_s = \frac{\gamma_s}{\gamma_0}, \quad s = 0, 1, 2, \ldots$$
If we plot $\tau_s$ against $s = 0, 1, 2, \ldots$ then we obtain the autocorrelation function or correlogram.
A White Noise Process
A white noise process is one with (virtually) no discernible structure. A definition of a white noise process is
$$E(y_t) = \mu$$
$$\mathrm{var}(y_t) = \sigma^2$$
$$\gamma_{t-r} = \begin{cases} \sigma^2 & \text{if } t = r \\ 0 & \text{otherwise} \end{cases}$$
Thus the autocorrelation function will be zero apart from a single peak of 1 at $s = 0$. The sample autocorrelation coefficients are approximately distributed as $\hat{\tau}_s \sim N(0, 1/T)$, where $T$ = sample size.
We can use this to do significance tests for the autocorrelation coefficients by constructing a confidence interval.
For example, a 95% confidence interval would be given by
$$\pm 1.96 \times \frac{1}{\sqrt{T}}$$
If the sample autocorrelation coefficient, $\hat{\tau}_s$, falls outside this region for any value of $s$, then we reject the null hypothesis that the true value of the coefficient at lag $s$ is zero.
Joint Hypothesis Tests
We can also test the joint hypothesis that all $m$ of the $\tau_k$ correlation coefficients are simultaneously equal to zero using the Q-statistic developed by Box and Pierce:
$$Q = T \sum_{k=1}^{m} \hat{\tau}_k^2$$
where $T$ = sample size, $m$ = maximum lag length.
The Q-statistic is asymptotically distributed as a $\chi^2_m$.
However, the Box-Pierce test has poor small sample properties, so a variant has been developed, called the Ljung-Box statistic:
$$Q^* = T(T+2) \sum_{k=1}^{m} \frac{\hat{\tau}_k^2}{T-k} \sim \chi^2_m$$
This statistic is very useful as a portmanteau (general) test of linear dependence in time series.
An ACF Example
Question:
Suppose that a researcher had estimated the first 5
autocorrelation coefficients using a series of length 100
observations, and found them to be (from 1 to 5): 0.207,
-0.013, 0.086, 0.005, -0.022.
Test each of the individual coefficients for significance, and use
both the Box-Pierce and Ljung-Box tests to establish whether
they are jointly significant.
Solution:
A coefficient would be significant if it lies outside
(-0.196,+0.196) at the 5% level, so only the first
autocorrelation coefficient is significant.
Q=5.09 and Q*=5.26
Both statistics are smaller than the tabulated $\chi^2(5) = 11.1$ at the 5% level, so the 5 coefficients are jointly insignificant.
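The arithmetic of this example can be reproduced with a few lines of Python. This is only a sketch: the function names `box_pierce` and `ljung_box` are illustrative choices, and the critical value 11.1 is simply the tabulated figure quoted above rather than one computed here.

```python
def box_pierce(acf, T):
    """Box-Pierce Q = T * sum of squared sample autocorrelations."""
    return T * sum(r ** 2 for r in acf)


def ljung_box(acf, T):
    """Ljung-Box Q* = T(T+2) * sum_k r_k^2 / (T - k), k = 1..m."""
    return T * (T + 2) * sum(r ** 2 / (T - k) for k, r in enumerate(acf, start=1))


# Sample autocorrelations at lags 1..5 and sample size from the example.
acf_hat = [0.207, -0.013, 0.086, 0.005, -0.022]
T = 100

# Two-sided 5% band for an individual coefficient: +/- 1.96 / sqrt(T).
band = 1.96 / T ** 0.5
significant_lags = [k for k, r in enumerate(acf_hat, start=1) if abs(r) > band]

Q = box_pierce(acf_hat, T)
Q_star = ljung_box(acf_hat, T)
CHI2_5_CRIT = 11.1  # tabulated 5% critical value quoted in the text

print(round(band, 3), significant_lags)
print(round(Q, 2), round(Q_star, 2), Q > CHI2_5_CRIT, Q_star > CHI2_5_CRIT)
```

Only lag 1 falls outside the band, and neither joint statistic exceeds the critical value, which matches the solution given.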
Moving Average Processes
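Let $u_t$ ($t = 1, 2, 3, \ldots$) be a sequence of independently and identically distributed (iid) random variables with $E(u_t) = 0$ and $\mathrm{var}(u_t) = \sigma^2$, then
$$y_t = \mu + u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \cdots + \theta_q u_{t-q}$$
is a $q$th order moving average model, MA($q$).
Its properties are
$$E(y_t) = \mu$$
$$\mathrm{var}(y_t) = \gamma_0 = (1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2)\sigma^2$$
Covariances
$$\gamma_s = \begin{cases} (\theta_s + \theta_{s+1}\theta_1 + \theta_{s+2}\theta_2 + \cdots + \theta_q\theta_{q-s})\sigma^2 & \text{for } s = 1, \ldots, q \\ 0 & \text{for } s > q \end{cases}$$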
Example of an MA Problem
1. Consider the following MA(2) process:
$$y_t = u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2}$$
where $u_t$ is a zero mean white noise process with variance $\sigma^2$.
i. Calculate the mean and variance of $y_t$.
ii. Derive the autocorrelation function for this process (i.e. express the autocorrelations, $\tau_1, \tau_2, \ldots$, as functions of the parameters $\theta_1$ and $\theta_2$).
iii. If $\theta_1 = -0.5$ and $\theta_2 = 0.25$, sketch the acf of $y_t$.
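Solution
i. If $E(u_t) = 0$, then $E(u_{t-i}) = 0 \;\forall\, i$. So
$$E(y_t) = E(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2}) = E(u_t) + \theta_1 E(u_{t-1}) + \theta_2 E(u_{t-2}) = 0$$
$$\mathrm{var}(y_t) = E[y_t - E(y_t)][y_t - E(y_t)]$$
But $E(y_t) = 0$, so
$$\begin{aligned}
\mathrm{var}(y_t) &= E[(y_t)(y_t)] \\
&= E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})] \\
&= E[u_t^2 + \theta_1^2 u_{t-1}^2 + \theta_2^2 u_{t-2}^2 + \text{cross-products}]
\end{aligned}$$
But $E[\text{cross-products}] = 0$ since $\mathrm{cov}(u_t, u_{t-s}) = 0$ for $s \ne 0$.
So
$$\begin{aligned}
\mathrm{var}(y_t) = \gamma_0 &= E[u_t^2 + \theta_1^2 u_{t-1}^2 + \theta_2^2 u_{t-2}^2] \\
&= \sigma^2 + \theta_1^2\sigma^2 + \theta_2^2\sigma^2 \\
&= (1 + \theta_1^2 + \theta_2^2)\sigma^2
\end{aligned}$$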
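ii. The acf of $y_t$
$$\begin{aligned}
\gamma_1 &= E[y_t - E(y_t)][y_{t-1} - E(y_{t-1})] \\
&= E[y_t][y_{t-1}] \\
&= E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_{t-1} + \theta_1 u_{t-2} + \theta_2 u_{t-3})] \\
&= E[\theta_1 u_{t-1}^2 + \theta_1\theta_2 u_{t-2}^2] \\
&= \theta_1\sigma^2 + \theta_1\theta_2\sigma^2 \\
&= (\theta_1 + \theta_1\theta_2)\sigma^2
\end{aligned}$$
$$\begin{aligned}
\gamma_2 &= E[y_t - E(y_t)][y_{t-2} - E(y_{t-2})] \\
&= E[y_t][y_{t-2}] \\
&= E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_{t-2} + \theta_1 u_{t-3} + \theta_2 u_{t-4})] \\
&= E[\theta_2 u_{t-2}^2] \\
&= \theta_2\sigma^2
\end{aligned}$$
$$\begin{aligned}
\gamma_3 &= E[y_t - E(y_t)][y_{t-3} - E(y_{t-3})] \\
&= E[y_t][y_{t-3}] \\
&= E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_{t-3} + \theta_1 u_{t-4} + \theta_2 u_{t-5})] \\
&= 0
\end{aligned}$$
So $\gamma_s = 0$ for $s > 2$.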
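We have the autocovariances, now calculate the autocorrelations:
$$\tau_0 = \frac{\gamma_0}{\gamma_0} = 1$$
$$\tau_1 = \frac{\gamma_1}{\gamma_0} = \frac{(\theta_1 + \theta_1\theta_2)\sigma^2}{(1 + \theta_1^2 + \theta_2^2)\sigma^2} = \frac{\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}$$
$$\tau_2 = \frac{\gamma_2}{\gamma_0} = \frac{\theta_2\sigma^2}{(1 + \theta_1^2 + \theta_2^2)\sigma^2} = \frac{\theta_2}{1 + \theta_1^2 + \theta_2^2}$$
$$\tau_3 = \frac{\gamma_3}{\gamma_0} = 0$$
$$\tau_s = \frac{\gamma_s}{\gamma_0} = 0 \quad \forall\, s > 2$$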
iii. For $\theta_1 = -0.5$ and $\theta_2 = 0.25$, substituting these into the formulae above gives $\tau_1 = -0.476$, $\tau_2 = 0.190$.
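As a check on part iii, the MA(q) autocovariance formula can be evaluated numerically. The sketch below assumes the parameter values $\theta_1 = -0.5$ and $\theta_2 = 0.25$; the helper name `ma_acf` is an illustrative choice, and $\sigma^2$ cancels so it is never needed.

```python
def ma_acf(theta, max_lag):
    """Theoretical autocorrelations of y_t = u_t + theta_1 u_{t-1} + ... + theta_q u_{t-q}."""
    coeffs = [1.0] + list(theta)          # the coefficient on u_t is 1
    q = len(theta)
    # gamma_s / sigma^2 = sum_i coeffs[i] * coeffs[i + s]; zero once s > q
    gamma = [
        sum(coeffs[i] * coeffs[i + s] for i in range(q - s + 1)) if s <= q else 0.0
        for s in range(max_lag + 1)
    ]
    return [g / gamma[0] for g in gamma]


tau = ma_acf([-0.5, 0.25], max_lag=4)
print([round(t, 3) for t in tau])
```

The autocorrelations at lags 1 and 2 agree with the values quoted in the solution, and every lag beyond 2 is exactly zero, as the MA(2) structure implies.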
ACF Plot
Thus the acf plot will appear as follows:
[Figure: acf of the MA(2) process plotted against the lag. Vertical axis: acf; horizontal axis: lag, s.]
Autoregressive Processes
An autoregressive model of order $p$, an AR($p$), can be expressed as
$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + u_t$$
Or using the lag operator notation, where $L y_t = y_{t-1}$ and $L^i y_t = y_{t-i}$:
$$y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + u_t$$
or
$$y_t = \mu + \sum_{i=1}^{p} \phi_i L^i y_t + u_t$$
or
$$\phi(L) y_t = \mu + u_t$$
where
$$\phi(L) = (1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p).$$
The Stationary Condition for an AR Model
The condition for stationarity of a general AR($p$) model is that the roots of $1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p = 0$ all lie outside the unit circle.
A stationary AR($p$) model is required for it to have an MA($\infty$) representation.
Example 1: Is $y_t = y_{t-1} + u_t$ stationary?
The characteristic root is 1, so it is a unit root process (so non-stationary).
Example 2: Is $y_t = 3y_{t-1} - 2.75y_{t-2} + 0.75y_{t-3} + u_t$ stationary?
The characteristic roots are 1, 2/3, and 2. Since only one of these lies outside the unit circle, the process is non-stationary.
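The quoted roots for Example 2 can be verified by substituting them back into the characteristic equation. The following is a minimal sketch; `char_poly` and `is_stationary` are illustrative helper names, and the small tolerance is an assumption added so that a root sitting exactly on the unit circle is not misclassified through rounding.

```python
def char_poly(phi, z):
    """Evaluate 1 - phi_1 z - phi_2 z^2 - ... - phi_p z^p at z."""
    return 1 - sum(p * z ** i for i, p in enumerate(phi, start=1))


def is_stationary(roots, tol=1e-9):
    """Stationary only if every characteristic root lies strictly outside the unit circle."""
    return all(abs(r) > 1 + tol for r in roots)


phi = [3.0, -2.75, 0.75]        # Example 2 coefficients
roots = [1.0, 2 / 3, 2.0]       # roots quoted in the text

# Each quoted root should make the characteristic polynomial (numerically) zero.
residuals = [char_poly(phi, r) for r in roots]
print([abs(v) < 1e-12 for v in residuals])

print(is_stationary(roots))     # Example 2
print(is_stationary([1.0]))     # Example 1: single root equal to 1
```

Both examples fail the check, in line with the conclusions above: Example 1 has a unit root, and Example 2 has one root on and one root inside the unit circle.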
Wold's Decomposition Theorem
States that any stationary series can be decomposed into the sum of two unrelated processes, a purely deterministic part and a purely stochastic part, which will be an MA($\infty$).
For the AR($p$) model, $\phi(L) y_t = u_t$, ignoring the intercept, the Wold decomposition is
$$y_t = \psi(L) u_t$$
where
$$\psi(L) = \phi(L)^{-1} = (1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)^{-1}$$
The Moments of an Autoregressive Process
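The moments of an autoregressive process are as follows. The mean is given by
$$E(y_t) = \frac{\phi_0}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$$
where $\phi_0$ denotes the intercept of the AR($p$) model.
The autocovariances and autocorrelation functions can be obtained by solving what are known as the Yule-Walker equations:
$$\begin{aligned}
\tau_1 &= \phi_1 + \tau_1\phi_2 + \cdots + \tau_{p-1}\phi_p \\
\tau_2 &= \tau_1\phi_1 + \phi_2 + \cdots + \tau_{p-2}\phi_p \\
&\;\;\vdots \\
\tau_p &= \tau_{p-1}\phi_1 + \tau_{p-2}\phi_2 + \cdots + \phi_p
\end{aligned}$$
If the AR model is stationary, the autocorrelation function will decay exponentially to zero.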
Sample AR Problem
Consider the following simple AR(1) model
$$y_t = \mu + \phi_1 y_{t-1} + u_t$$
i. Calculate the (unconditional) mean of $y_t$.
For the remainder of the question, set $\mu = 0$ for simplicity.
ii. Calculate the (unconditional) variance of $y_t$.
iii. Derive the autocorrelation function for $y_t$.
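Solution
i. Unconditional mean:
$$E(y_t) = E(\mu + \phi_1 y_{t-1}) = \mu + \phi_1 E(y_{t-1})$$
But also $E(y_{t-1}) = \mu + \phi_1 E(y_{t-2})$. So
$$\begin{aligned}
E(y_t) &= \mu + \phi_1(\mu + \phi_1 E(y_{t-2})) \\
&= \mu + \phi_1\mu + \phi_1^2 E(y_{t-2}) \\
&= \mu + \phi_1\mu + \phi_1^2(\mu + \phi_1 E(y_{t-3})) \\
&= \mu + \phi_1\mu + \phi_1^2\mu + \phi_1^3 E(y_{t-3})
\end{aligned}$$
An infinite number of such substitutions would give
$$E(y_t) = \mu(1 + \phi_1 + \phi_1^2 + \cdots) + \phi_1^{\infty} y_0$$
So long as the model is stationary, i.e. $|\phi_1| < 1$, then $\phi_1^{\infty} = 0$.
So
$$E(y_t) = \mu(1 + \phi_1 + \phi_1^2 + \cdots) = \frac{\mu}{1 - \phi_1}$$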
ii. Calculating the variance of $y_t$: $y_t = \phi_1 y_{t-1} + u_t$
From Wold's decomposition theorem:
$$\begin{aligned}
y_t(1 - \phi_1 L) &= u_t \\
y_t &= (1 - \phi_1 L)^{-1} u_t \\
y_t &= (1 + \phi_1 L + \phi_1^2 L^2 + \cdots) u_t
\end{aligned}$$
So long as $|\phi_1| < 1$, this will converge.
$$\mathrm{var}(y_t) = E[y_t - E(y_t)][y_t - E(y_t)]$$
but $E(y_t) = 0$, since $\mu$ is set to zero.
$$\begin{aligned}
\mathrm{var}(y_t) &= E[(y_t)(y_t)] \\
&= E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)] \\
&= E[u_t^2 + \phi_1^2 u_{t-1}^2 + \phi_1^4 u_{t-2}^2 + \cdots + \text{cross-products}] \\
&= E[u_t^2 + \phi_1^2 u_{t-1}^2 + \phi_1^4 u_{t-2}^2 + \cdots] \\
&= \sigma_u^2 + \phi_1^2\sigma_u^2 + \phi_1^4\sigma_u^2 + \cdots \\
&= \sigma_u^2(1 + \phi_1^2 + \phi_1^4 + \cdots) \\
&= \frac{\sigma_u^2}{1 - \phi_1^2}
\end{aligned}$$
iii. Turning now to calculating the acf, first calculate the autocovariances (writing $\sigma^2$ for the disturbance variance $\sigma_u^2$):
$$\gamma_1 = \mathrm{cov}(y_t, y_{t-1}) = E[y_t - E(y_t)][y_{t-1} - E(y_{t-1})]$$
Since $\mu$ has been set to zero, $E(y_t) = 0$ and $E(y_{t-1}) = 0$, so
$$\begin{aligned}
\gamma_1 &= E[y_t y_{t-1}] \\
&= E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_{t-1} + \phi_1 u_{t-2} + \phi_1^2 u_{t-3} + \cdots)] \\
&= E[\phi_1 u_{t-1}^2 + \phi_1^3 u_{t-2}^2 + \cdots + \text{cross-products}] \\
&= \phi_1\sigma^2 + \phi_1^3\sigma^2 + \phi_1^5\sigma^2 + \cdots \\
&= \frac{\phi_1\sigma^2}{1 - \phi_1^2}
\end{aligned}$$
For the second autocovariance,
$$\gamma_2 = \mathrm{cov}(y_t, y_{t-2}) = E[y_t - E(y_t)][y_{t-2} - E(y_{t-2})]$$
Using the same rules as applied above for the lag 1 covariance
$$\begin{aligned}
\gamma_2 &= E[y_t y_{t-2}] \\
&= E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_{t-2} + \phi_1 u_{t-3} + \phi_1^2 u_{t-4} + \cdots)] \\
&= E[\phi_1^2 u_{t-2}^2 + \phi_1^4 u_{t-3}^2 + \cdots + \text{cross-products}] \\
&= \phi_1^2\sigma^2 + \phi_1^4\sigma^2 + \cdots \\
&= \phi_1^2\sigma^2(1 + \phi_1^2 + \phi_1^4 + \cdots) \\
&= \frac{\phi_1^2\sigma^2}{1 - \phi_1^2}
\end{aligned}$$
If these steps were repeated for $\gamma_3$, the following expression would be obtained
$$\gamma_3 = \frac{\phi_1^3\sigma^2}{1 - \phi_1^2}$$
and for any lag $s$, the autocovariance would be given by
$$\gamma_s = \frac{\phi_1^s\sigma^2}{1 - \phi_1^2}$$
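The acf can now be obtained by dividing the covariances by the variance:
$$\tau_0 = \frac{\gamma_0}{\gamma_0} = 1$$
$$\tau_1 = \frac{\gamma_1}{\gamma_0} = \frac{\left(\dfrac{\phi_1\sigma^2}{1 - \phi_1^2}\right)}{\left(\dfrac{\sigma^2}{1 - \phi_1^2}\right)} = \phi_1$$
$$\tau_2 = \frac{\gamma_2}{\gamma_0} = \frac{\left(\dfrac{\phi_1^2\sigma^2}{1 - \phi_1^2}\right)}{\left(\dfrac{\sigma^2}{1 - \phi_1^2}\right)} = \phi_1^2$$
$$\tau_3 = \phi_1^3$$
$$\tau_s = \phi_1^s$$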
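The closed-form AR(1) results can be checked against a direct, truncated evaluation of the MA($\infty$) sums used in the derivation. This is a sketch under assumed values ($\phi_1 = 0.5$, $\sigma^2 = 1$, chosen only for illustration); the function names and the truncation length are likewise arbitrary choices.

```python
def ar1_autocov_truncated(phi, sigma2, s, n_terms=500):
    """gamma_s from the MA(infinity) form: sigma^2 * sum_i phi^i * phi^(i+s), truncated."""
    return sigma2 * sum(phi ** i * phi ** (i + s) for i in range(n_terms))


def ar1_autocov_closed(phi, sigma2, s):
    """Closed form from the derivation: phi^s * sigma^2 / (1 - phi^2)."""
    return phi ** s * sigma2 / (1 - phi ** 2)


phi, sigma2 = 0.5, 1.0   # illustrative values only; requires |phi| < 1
for s in range(4):
    g_sum = ar1_autocov_truncated(phi, sigma2, s)
    g_closed = ar1_autocov_closed(phi, sigma2, s)
    tau_s = g_closed / ar1_autocov_closed(phi, sigma2, 0)
    print(s, round(g_sum, 6), round(g_closed, 6), round(tau_s, 6))
```

The two columns of autocovariances coincide, and the ratio to the variance reduces to $\phi_1^s$, so the acf of a stationary AR(1) declines geometrically.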
The Partial Autocorrelation Function (denoted $\tau_{kk}$)
Measures the correlation between an observation $k$ periods ago and the current observation, after controlling for observations at intermediate lags (i.e. all lags $< k$).
So $\tau_{kk}$ measures the correlation between $y_t$ and $y_{t-k}$ after removing the effects of $y_{t-k+1}, y_{t-k+2}, \ldots, y_{t-1}$.
At lag 1, the acf = pacf always.
At lag 2,
$$\tau_{22} = \frac{\tau_2 - \tau_1^2}{1 - \tau_1^2}$$
For lags 3+, the formulae are more complex.
The pacf is useful for telling the difference between an AR process and an ARMA process.
In the case of an AR($p$), there are direct connections between $y_t$ and $y_{t-s}$ only for $s \le p$.
So for an AR($p$), the theoretical pacf will be zero after lag $p$.
In the case of an MA($q$), this can be written as an AR($\infty$), so there are direct connections between $y_t$ and all its previous values.
For an MA($q$), the theoretical pacf will be geometrically declining.
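One way to see both patterns is to compute the pacf from a given acf. The slides do not give the formulae beyond lag 2; the sketch below uses the Durbin-Levinson recursion, a standard technique that is not part of the original material, and its first two steps reproduce the lag 1 and lag 2 results stated earlier. The name `pacf_from_acf` and the example parameter values are illustrative assumptions.

```python
def pacf_from_acf(tau, max_lag):
    """Partial autocorrelations at lags 1..max_lag from autocorrelations tau[0..max_lag]."""
    pacf = []
    prev = []                      # prev[j-1] holds the lag-j coefficient from step k-1
    for k in range(1, max_lag + 1):
        num = tau[k] - sum(prev[j - 1] * tau[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * tau[j] for j in range(1, k))
        tau_kk = num / den
        # update the intermediate coefficients for lags j < k
        current = [prev[j - 1] - tau_kk * prev[k - j - 1] for j in range(1, k)]
        current.append(tau_kk)
        prev = current
        pacf.append(tau_kk)
    return pacf


# AR(1) with phi_1 = 0.5: acf is 0.5^s, so the pacf should cut off after lag 1.
ar1_pacf = pacf_from_acf([0.5 ** s for s in range(6)], max_lag=5)

# MA(1) with theta_1 = -0.5: acf is -0.4 at lag 1 and zero afterwards.
ma1_pacf = pacf_from_acf([1.0, -0.4, 0.0, 0.0, 0.0, 0.0], max_lag=5)

print([round(v, 4) for v in ar1_pacf])
print([round(v, 4) for v in ma1_pacf])
```

For the AR(1) only the first partial autocorrelation is non-zero, while for the MA(1) the values shrink steadily in absolute size without ever cutting off.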
ARMA Processes
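By combining the AR($p$) and MA($q$) models, we can obtain an ARMA($p,q$) model:
$$\phi(L) y_t = \mu + \theta(L) u_t$$
where
$$\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$$
and
$$\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$$
or
$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \cdots + \theta_q u_{t-q} + u_t$$
with
$$E(u_t) = 0; \quad E(u_t^2) = \sigma^2; \quad E(u_t u_s) = 0, \; t \ne s$$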
The Invertibility Condition
Similar to the stationarity condition, we typically require the MA($q$) part of the model to have roots of $\theta(z) = 0$ greater than one in absolute value.
The mean of an ARMA series is given by
$$E(y_t) = \frac{\mu}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$$
The autocorrelation function for an ARMA process will display combinations of behaviour derived from the AR and MA parts, but for lags beyond $q$, the acf will simply be identical to that of the individual AR($p$) model.
Summary of the Behaviour of the acf for AR and MA Processes
An autoregressive process has
a geometrically decaying acf
number of spikes of pacf = AR order
A moving average process has
number of spikes of acf = MA order
a geometrically decaying pacf
Some sample acf and pacf plots for standard
processes
The acf and pacf are not produced analytically from the
relevant formulae for a model of that type, but rather are
estimated using 100,000 simulated observations with
disturbances drawn from a normal distribution.
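A simulation of this kind might look like the sketch below. It assumes an MA(1) with coefficient $-0.5$ and standard normal disturbances; the seed, the helper names `simulate_ma1` and `sample_acf`, and the use of Python's built-in generator are all illustrative choices, not the procedure actually used to draw the figures.

```python
import random


def sample_acf(y, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t - s] for t in range(s, n)) / n / gamma0
        for s in range(max_lag + 1)
    ]


def simulate_ma1(theta, n, seed=0):
    """Simulate y_t = theta * u_{t-1} + u_t with standard normal disturbances."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    return [theta * u[t - 1] + u[t] for t in range(1, n + 1)]


y = simulate_ma1(theta=-0.5, n=100_000, seed=42)
acf_hat = sample_acf(y, max_lag=3)
print([round(r, 3) for r in acf_hat])
```

With this many observations the estimate at lag 1 should sit close to the theoretical value $\theta_1 / (1 + \theta_1^2) = -0.4$, and the estimates at higher lags close to zero, though the exact figures depend on the random draws.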
Figure: Sample autocorrelation and partial autocorrelation functions for an MA(1) model: $y_t = -0.5u_{t-1} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
ACF and PACF for an MA(2) Model: $y_t = 0.5u_{t-1} - 0.25u_{t-2} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
ACF and PACF for a slowly decaying AR(1) Model: $y_t = 0.9y_{t-1} + u_t$
[Plot of acf and pacf against lag, s]
ACF and PACF for a more rapidly decaying AR(1) Model: $y_t = 0.5y_{t-1} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
ACF and PACF for a more rapidly decaying AR(1) Model with Negative Coefficient: $y_t = -0.5y_{t-1} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
ACF and PACF for a Non-stationary Model (i.e. a unit coefficient): $y_t = y_{t-1} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
ACF and PACF for an ARMA(1,1): $y_t = 0.5y_{t-1} + 0.5u_{t-1} + u_t$
[Plot of acf and pacf against lag, s, for lags 1 to 10]
Building ARMA Models - The Box-Jenkins Approach
Box and Jenkins (1970) were the first to approach the task of estimating an ARMA model in a systematic manner. There are 3 steps to their approach:
1. Identification
2. Estimation
3. Model diagnostic checking
Step 1: Identification
Involves determining the order of the model.
Graphical procedures (plots of the acf and pacf) are used.
A better procedure, based on information criteria, is now available.
Step 2: Estimation
Estimation of the parameters.
Can be done using least squares or maximum likelihood, depending on the model.
Step 3: Model checking
Box and Jenkins suggest 2 methods:
deliberate overfitting
residual diagnostics
Some More Recent Developments in ARMA
Modelling
Identification would typically not be done using acfs.
We want to form a parsimonious model.
Reasons:
variance of estimators is inversely proportional to the number
of degrees of freedom.
models which are profligate might be inclined to fit to data-specific features
This gives motivation for using information criteria, which
embody 2 factors
a term which is a function of the RSS
some penalty for adding extra parameters
The object is to choose the number of parameters which
minimises the information criterion.
Information Criteria for Model Selection
The information criteria vary according to how stiff the
penalty term is.
The three most popular criteria are Akaike's (1974) information criterion (AIC), Schwarz's (1978) Bayesian information criterion (SBIC), and the Hannan-Quinn criterion (HQIC).
$$AIC = \ln(\hat{\sigma}^2) + \frac{2k}{T}$$
$$SBIC = \ln(\hat{\sigma}^2) + \frac{k}{T}\ln T$$
$$HQIC = \ln(\hat{\sigma}^2) + \frac{2k}{T}\ln(\ln(T))$$
where $k = p + q + 1$ and $T$ = sample size. So we minimise the IC subject to $p \le \bar{p}$, $q \le \bar{q}$.
SBIC embodies a stiffer penalty term than AIC.
Which IC should be preferred if they suggest different model orders?
SBIC is strongly consistent (but inefficient).
AIC is not consistent, and will typically pick bigger models.
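The way the penalty terms can pull the criteria apart is easy to illustrate numerically. In the sketch below the residual variances for the three candidate orders are made-up numbers, chosen only to show a disagreement between AIC and SBIC; the function name `info_criteria` is also an illustrative choice.

```python
import math


def info_criteria(sigma2_hat, k, T):
    """Return (AIC, SBIC, HQIC) using the formulae in the text."""
    base = math.log(sigma2_hat)
    aic = base + 2 * k / T
    sbic = base + k / T * math.log(T)
    hqic = base + 2 * k / T * math.log(math.log(T))
    return aic, sbic, hqic


# Hypothetical residual variances for three candidate ARMA(p, q) orders, T = 200.
T = 200
candidates = {(1, 0): 1.30, (1, 1): 1.27, (2, 2): 1.25}

table = {}
for (p, q), sigma2_hat in candidates.items():
    k = p + q + 1                      # number of estimated parameters
    table[(p, q)] = info_criteria(sigma2_hat, k, T)

best_aic = min(table, key=lambda order: table[order][0])
best_sbic = min(table, key=lambda order: table[order][1])
print(best_aic, best_sbic)
```

With these invented inputs AIC selects the ARMA(1,1) while SBIC, with its stiffer penalty, selects the smaller AR(1).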
ARIMA Models
As distinct from ARMA models. The I stands for integrated.
An integrated autoregressive process is one with a
characteristic root on the unit circle.
Typically researchers difference the variable as necessary and
then build an ARMA model on those differenced variables.
An ARMA(p,q) model in the variable differenced d times is
equivalent to an ARIMA(p,d,q) model on the original data.
Exponential Smoothing
Another modelling and forecasting technique
How much weight do we attach to previous observations?
Expect recent observations to have the most power in helping
to forecast future values of a series.
The equation for the model is
$$S_t = \alpha y_t + (1 - \alpha) S_{t-1} \qquad (1)$$
where
$\alpha$ is the smoothing constant, with $0 \le \alpha \le 1$,
$y_t$ is the current realised value,
$S_t$ is the current smoothed value.
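Lagging (1) by one period gives
$$S_{t-1} = \alpha y_{t-1} + (1 - \alpha) S_{t-2} \qquad (2)$$
and lagging again
$$S_{t-2} = \alpha y_{t-2} + (1 - \alpha) S_{t-3} \qquad (3)$$
Substituting into (1) for $S_{t-1}$ from (2)
$$S_t = \alpha y_t + (1 - \alpha)(\alpha y_{t-1} + (1 - \alpha) S_{t-2})$$
$$S_t = \alpha y_t + (1 - \alpha)\alpha y_{t-1} + (1 - \alpha)^2 S_{t-2} \qquad (4)$$
Substituting into (4) for $S_{t-2}$ from (3)
$$\begin{aligned}
S_t &= \alpha y_t + (1 - \alpha)\alpha y_{t-1} + (1 - \alpha)^2 S_{t-2} \\
&= \alpha y_t + (1 - \alpha)\alpha y_{t-1} + (1 - \alpha)^2(\alpha y_{t-2} + (1 - \alpha) S_{t-3}) \\
&= \alpha y_t + (1 - \alpha)\alpha y_{t-1} + (1 - \alpha)^2\alpha y_{t-2} + (1 - \alpha)^3 S_{t-3}
\end{aligned}$$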
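$T$ successive substitutions of this kind would lead to
$$S_t = \left(\sum_{i=0}^{T-1} \alpha(1 - \alpha)^i y_{t-i}\right) + (1 - \alpha)^T S_{t-T}$$
so that for $t = T$ the final term involves the initial smoothed value $S_0$.
Since $\alpha \ge 0$, the effect of each observation declines geometrically as it recedes another period into the past.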
Forecasts are generated by
$$f_{t,s} = S_t$$
for all steps into the future $s = 1, 2, \ldots$
This technique is called single (or simple) exponential smoothing.
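A minimal sketch of single exponential smoothing follows. The observations, the smoothing constant of 0.3 and the starting value $S_0 = 0$ are all assumed for illustration, and `exp_smooth` and `expanded_form` are hypothetical helper names. The second function writes the latest smoothed value as a weighted sum of past observations so that the recursion and its expansion can be compared.

```python
def exp_smooth(y, alpha, s0):
    """Return smoothed values S_1..S_n from S_t = alpha*y_t + (1 - alpha)*S_{t-1}."""
    smoothed = []
    prev = s0
    for value in y:
        prev = alpha * value + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def expanded_form(y, alpha, s0):
    """Latest smoothed value as geometrically declining weights on past y, plus the S_0 term."""
    n = len(y)
    weighted = sum(alpha * (1 - alpha) ** i * y[n - 1 - i] for i in range(n))
    return weighted + (1 - alpha) ** n * s0


y = [0.5, -0.2, 0.1, 0.4, -0.3]      # made-up observations
alpha, s0 = 0.3, 0.0                 # assumed smoothing constant and start value
S = exp_smooth(y, alpha, s0)
forecast = S[-1]                     # the same forecast applies at every horizon s
print(round(forecast, 6), round(expanded_form(y, alpha, s0), 6))
```

The two printed numbers agree, and because the forecast equals the latest smoothed value at every horizon, the forecast function is flat.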
It doesn't work well for financial data because
there is little structure to smooth
it cannot allow for seasonality
it is an ARIMA(0,1,1) with MA coefficient $(1 - \alpha)$ (see Granger & Newbold, p. 174)
forecasts do not converge on the long term mean as $s \to \infty$
Can modify single exponential smoothing
to allow for trends (Holt's method)
or to allow for seasonality (Winters' method).
Advantages of Exponential Smoothing
Very simple to use
Easy to update the model if a new realisation becomes available.
Forecasting in Econometrics
Forecasting = prediction.
An important test of the adequacy of a model.
e.g.
Forecasting tomorrow's return on a particular share
Forecasting the price of a house given its characteristics
Forecasting the riskiness of a portfolio over the next year
Forecasting the volatility of bond returns
We can distinguish two approaches:
Econometric (structural) forecasting
Time series forecasting
The distinction between the two types is somewhat blurred
(e.g., VARs).
In-Sample Versus Out-of-Sample
Expect the forecast of the model to be good in-sample.
Say we have some data, e.g. monthly FTSE returns for 120 months: 1990M1-1999M12. We could use all of it to build the model, or keep some observations back:
In-sample estimation period: Jan 1990 to Dec 1998
Out-of-sample forecast evaluation period: Jan 1999 to Dec 1999
This is a good test of the model since we have not used the information from 1999M1 onwards when we estimated the model parameters.
How to produce forecasts
Multi-step ahead versus single-step ahead forecasts
Recursive versus rolling windows
To understand how to construct forecasts, we need the idea of conditional expectations:
$$E(y_{t+1} \mid \Omega_t)$$
We cannot forecast a white noise process:
$$E(u_{t+s} \mid \Omega_t) = 0 \quad \forall\, s > 0$$
The two simplest forecasting methods
1. Assume no change: $f(y_{t+s}) = y_t$
2. Forecasts are the long term average: $f(y_{t+s}) = \bar{y}$
Models for Forecasting
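Structural models
e.g.
$$y = X\beta + u$$
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \cdots + \beta_k x_{kt} + u_t$$
To forecast $y$, we require the conditional expectation of its future value:
$$\begin{aligned}
E(y_t \mid \Omega_{t-1}) &= E(\beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \cdots + \beta_k x_{kt} + u_t) \\
&= \beta_1 + \beta_2 E(x_{2t}) + \beta_3 E(x_{3t}) + \cdots + \beta_k E(x_{kt})
\end{aligned}$$
But what are $E(x_{2t})$ etc.? We could use $\bar{x}_2$, so
$$E(y_t) = \beta_1 + \beta_2\bar{x}_2 + \beta_3\bar{x}_3 + \cdots + \beta_k\bar{x}_k = \bar{y} \; !!$$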
Time Series Models
The current value of a series, yt , is modelled as a function
only of its previous values and the current value of an error
term (and possibly previous values of the error term).
Models include:
simple unweighted averages
exponentially weighted averages
ARIMA models
Non-linear models e.g. threshold models, GARCH, bilinear
models, etc.
Forecasting with ARMA Models
The forecasting model typically used is of the form:
$$f_{t,s} = \sum_{i=1}^{p} a_i f_{t,s-i} + \sum_{j=1}^{q} b_j u_{t+s-j}$$
where
$$f_{t,s} = y_{t+s}, \quad s \le 0$$
$$u_{t+s} = \begin{cases} 0, & s > 0 \\ u_{t+s}, & s \le 0 \end{cases}$$
Forecasting with MA Models
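An MA($q$) only has memory of $q$.
e.g. say we have estimated an MA(3) model:
$$\begin{aligned}
y_t &= \mu + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \theta_3 u_{t-3} + u_t \\
y_{t+1} &= \mu + \theta_1 u_t + \theta_2 u_{t-1} + \theta_3 u_{t-2} + u_{t+1} \\
y_{t+2} &= \mu + \theta_1 u_{t+1} + \theta_2 u_t + \theta_3 u_{t-1} + u_{t+2} \\
y_{t+3} &= \mu + \theta_1 u_{t+2} + \theta_2 u_{t+1} + \theta_3 u_t + u_{t+3}
\end{aligned}$$
We are at time $t$ and we want to forecast 1, 2, ..., $s$ steps ahead.
We know $y_t, y_{t-1}, \ldots$, and $u_t, u_{t-1}, \ldots$
$$\begin{aligned}
f_{t,1} &= E(y_{t+1} \mid \Omega_t) = \mu + \theta_1 u_t + \theta_2 u_{t-1} + \theta_3 u_{t-2} \\
f_{t,2} &= E(y_{t+2} \mid \Omega_t) = E(\mu + \theta_1 u_{t+1} + \theta_2 u_t + \theta_3 u_{t-1} + u_{t+2} \mid \Omega_t) = \mu + \theta_2 u_t + \theta_3 u_{t-1} \\
f_{t,3} &= E(y_{t+3} \mid \Omega_t) = E(\mu + \theta_1 u_{t+2} + \theta_2 u_{t+1} + \theta_3 u_t + u_{t+3} \mid \Omega_t) = \mu + \theta_3 u_t \\
f_{t,4} &= E(y_{t+4} \mid \Omega_t) = \mu \\
f_{t,s} &= E(y_{t+s} \mid \Omega_t) = \mu \quad \forall\, s \ge 4
\end{aligned}$$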
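The MA(3) forecasts can be generated mechanically by setting every future disturbance to zero. The sketch below does this for a general MA(q); the function name `ma_forecasts`, the parameter values and the disturbance history are hypothetical and serve only to illustrate the pattern.

```python
def ma_forecasts(mu, theta, u_hist, steps):
    """Forecasts at horizons 1..steps for an MA(q).

    u_hist holds the known disturbances [u_t, u_{t-1}, ..., u_{t-q+1}],
    most recent first; future disturbances are replaced by zero.
    """
    q = len(theta)
    forecasts = []
    for s in range(1, steps + 1):
        f = mu
        for j in range(1, q + 1):      # term theta_j * u_{t+s-j}
            lag = j - s                # refers to u_{t-lag}; lag < 0 is a future shock
            if lag >= 0:
                f += theta[j - 1] * u_hist[lag]
        forecasts.append(f)
    return forecasts


# Hypothetical MA(3) estimates and the three most recent disturbances.
mu, theta = 0.1, [0.5, -0.3, 0.2]
u_hist = [1.0, -2.0, 0.5]            # u_t, u_{t-1}, u_{t-2}
print([round(f, 4) for f in ma_forecasts(mu, theta, u_hist, steps=5)])
```

From the fourth step onwards every forecast collapses to the intercept, which is the sense in which an MA(3) only has a memory of three periods.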
Forecasting with AR Models
Say we have estimated an AR(2)
$$\begin{aligned}
y_t &= \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + u_t \\
y_{t+1} &= \mu + \phi_1 y_t + \phi_2 y_{t-1} + u_{t+1} \\
y_{t+2} &= \mu + \phi_1 y_{t+1} + \phi_2 y_t + u_{t+2} \\
y_{t+3} &= \mu + \phi_1 y_{t+2} + \phi_2 y_{t+1} + u_{t+3}
\end{aligned}$$
$$\begin{aligned}
f_{t,1} = E(y_{t+1} \mid \Omega_t) &= E(\mu + \phi_1 y_t + \phi_2 y_{t-1} + u_{t+1} \mid \Omega_t) \\
&= \mu + \phi_1 E(y_t \mid \Omega_t) + \phi_2 E(y_{t-1} \mid \Omega_t) \\
&= \mu + \phi_1 y_t + \phi_2 y_{t-1}
\end{aligned}$$
$$\begin{aligned}
f_{t,2} = E(y_{t+2} \mid \Omega_t) &= E(\mu + \phi_1 y_{t+1} + \phi_2 y_t + u_{t+2} \mid \Omega_t) \\
&= \mu + \phi_1 E(y_{t+1} \mid \Omega_t) + \phi_2 E(y_t \mid \Omega_t) \\
&= \mu + \phi_1 f_{t,1} + \phi_2 y_t
\end{aligned}$$
$$\begin{aligned}
f_{t,3} = E(y_{t+3} \mid \Omega_t) &= E(\mu + \phi_1 y_{t+2} + \phi_2 y_{t+1} + u_{t+3} \mid \Omega_t) \\
&= \mu + \phi_1 E(y_{t+2} \mid \Omega_t) + \phi_2 E(y_{t+1} \mid \Omega_t) \\
&= \mu + \phi_1 f_{t,2} + \phi_2 f_{t,1}
\end{aligned}$$
We can see immediately that
$$f_{t,4} = \mu + \phi_1 f_{t,3} + \phi_2 f_{t,2}$$
etc., so
$$f_{t,s} = \mu + \phi_1 f_{t,s-1} + \phi_2 f_{t,s-2}$$
Can easily generate ARMA($p, q$) forecasts in the same way.
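The recursion for AR forecasts translates directly into a short loop in which each new forecast replaces the unknown future value. This is a sketch only: the function name `ar_forecasts`, the coefficients and the two starting observations are hypothetical.

```python
def ar_forecasts(mu, phi, y_hist, steps):
    """Forecasts at horizons 1..steps for an AR(p).

    y_hist holds [y_t, y_{t-1}, ..., y_{t-p+1}], most recent first.
    """
    p = len(phi)
    known = list(y_hist[:p])           # known[0] is the most recent value
    forecasts = []
    for _ in range(steps):
        f = mu + sum(phi[i] * known[i] for i in range(p))
        forecasts.append(f)
        known = [f] + known[:-1]       # the forecast stands in for the unknown y
    return forecasts


# Hypothetical stationary AR(2) and its two most recent observations.
mu, phi = 0.2, [0.5, 0.3]
y_hist = [1.0, 0.4]                    # y_t, y_{t-1}
fc = ar_forecasts(mu, phi, y_hist, steps=200)
print([round(f, 4) for f in fc[:3]], round(fc[-1], 6))
```

For a stationary model such as this one, the forecasts settle down to the unconditional mean $\mu / (1 - \phi_1 - \phi_2)$ as the horizon grows.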
How can we test whether a forecast is accurate or
not?
For example, say we predict that tomorrow's return on the FTSE will be 0.2, but the outcome is actually -0.4. Is this accurate? Define $f_{t,s}$ as the forecast made at time $t$ for $s$ steps ahead (i.e. the forecast made for time $t + s$), and $y_{t+s}$ as the realised value of $y$ at time $t + s$.
Some of the most popular criteria for assessing the accuracy of time series forecasting techniques are:
$$MSE = \frac{1}{N}\sum_{t=1}^{N} (y_{t+s} - f_{t,s})^2$$
MAE is given by
$$MAE = \frac{1}{N}\sum_{t=1}^{N} |y_{t+s} - f_{t,s}|$$
Mean absolute percentage error:
$$MAPE = \frac{100}{N}\sum_{t=1}^{N} \left|\frac{y_{t+s} - f_{t,s}}{y_{t+s}}\right|$$
It has, however, also recently been shown (Gerlow et al., 1993) that the accuracy of forecasts according to traditional statistical criteria is not related to trading profitability.
A measure more closely correlated with profitability:
$$\%\ \text{correct sign predictions} = \frac{1}{N}\sum_{t=1}^{N} z_{t+s}$$
where
$$z_{t+s} = \begin{cases} 1 & \text{if } (y_{t+s} \cdot f_{t,s}) > 0 \\ 0 & \text{otherwise} \end{cases}$$
Forecast Evaluation Example
Given the following forecast and actual values, calculate the MSE, MAE and percentage of correct sign predictions:
Steps ahead   Forecast   Actual
1             0.20       -0.40
2             0.15        0.20
3             0.10        0.10
4             0.06       -0.10
5             0.04       -0.05
MSE = 0.079, MAE = 0.180, % of correct sign predictions = 40
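The reported figures can be reproduced as follows. The sketch takes the actual values as -0.40, 0.20, 0.10, -0.10 and -0.05, which are the signs consistent with the stated MSE, MAE and sign-prediction results; the variable names are illustrative.

```python
forecast = [0.20, 0.15, 0.10, 0.06, 0.04]
actual = [-0.40, 0.20, 0.10, -0.10, -0.05]

n = len(forecast)
errors = [a - f for a, f in zip(actual, forecast)]

mse = sum(e ** 2 for e in errors) / n              # mean squared error
mae = sum(abs(e) for e in errors) / n              # mean absolute error
# A sign prediction is correct when actual and forecast have the same sign.
pct_correct_sign = 100 * sum(1 for a, f in zip(actual, forecast) if a * f > 0) / n

print(round(mse, 3), round(mae, 3), pct_correct_sign)
```

Only the forecasts at steps 2 and 3 have the same sign as the outcome, which gives the 40% figure, and the large miss at step 1 dominates the MSE.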
What factors are likely to lead to a good
forecasting model?
signal versus noise
data mining issues
simple versus complex models
financial or economic theory
Statistical Versus Economic or Financial loss
functions
Statistical evaluation metrics may not be appropriate.
How well does the forecast perform in doing the job we
wanted it for?
Limits of forecasting: What can and cannot be forecast?
All statistical forecasting models are essentially extrapolative
Forecasting models are prone to break down around turning
points
Series subject to structural changes or regime shifts cannot be
forecast
Predictive accuracy usually declines with forecasting horizon
Forecasting is not a substitute for judgement
Back to the original question: why forecast?
Why not use experts to make judgemental forecasts?
Judgemental forecasts bring a different set of problems:
e.g., psychologists have found that expert judgements are
prone to the following biases:
over-confidence
inconsistency
recency
anchoring
illusory patterns
group-think.
The Usually Optimal Approach
To use a statistical forecasting model built on solid theoretical
foundations supplemented by expert judgements and
interpretation.