
MA634 Financial Risk Management

Time-Series Models: Linear

1 Introduction
A time series arises when observations of a quantity are collected at successive time points. Examples
include tracking economic indicators, monitoring industrial operations, or recording meteorological
conditions. Mathematically, a time series can be represented by a stochastic process {Yt , t ∈ T },
where the index set T may be discrete (e.g., the nonnegative integers {0, 1, 2, . . .}) or continuous (e.g., [0, ∞)). In this
course, we will restrict our attention to the discrete case.

Typically, one observes a single realization of a time series over a finite time horizon, such as y1 , y2 , . . . , yn .
The objective of time series analysis is to find an appropriate stochastic model that could
plausibly generate the observed data. Such a model can serve several purposes: 1) to provide
insight into the mechanism underlying the observed phenomenon, 2) to forecast future outcomes, 3)
to monitor and control processes (e.g., in manufacturing, where adjustable systems yield sequences of
measurements), and 4) to explain the dynamics of one time series using information from another, in
a way that is closely related to regression analysis.

Time series data are encountered across a wide range of scientific domains, and the study of such data
is of fundamental importance for both theoretical research and practical decision-making:

• In economics, common examples include daily stock market prices or monthly unemployment figures.

• In the social sciences, population-related series such as birth rates or school enrollment are of
interest.

• In epidemiology, one may analyze the number of recorded influenza cases during a given period.

Asset returns, such as the log return of a stock, can be represented as a time series {rt }, where values
evolve over time. The most general formulation for the returns {rit ; i = 1, . . . , N ; t = 1, . . . , T } is
given by their joint distribution function:

Fr(r11, . . . , rN1; r12, . . . , rN2; . . . ; r1T, . . . , rNT; θ),

where θ is a parameter vector that uniquely determines the distribution Fr(·). The probability distribution
Fr(·) governs the stochastic behavior of the returns rit. In practice, the empirical analysis of
asset returns typically aims at estimating the unknown parameter vector θ and making statistical
inferences about the behavior of {rit} based on previously observed returns.

Focusing on a single stock, our interest is in the joint distribution of {rit}_{t=1}^{T}. It is useful to factorize
this distribution as

F(ri1, . . . , riT; θ) = F(ri1) F(ri2 | ri1) · · · F(riT | ri,T−1, . . . , ri1) = F(ri1) Π_{t=2}^{T} F(rit | ri,t−1, . . . , ri1).

This decomposition highlights the temporal dependence of the log returns rit. The essential modeling
problem is the specification of the conditional distribution F(rit | ri,t−1, . . .), and particularly how this
distribution evolves over time.

2 Linear Time Series Models
Linear time series analysis provides a framework for studying the dynamic behavior of a time series. For asset returns,
these models focus on the connection between current returns and information available from the
past. Historical return data as well as external economic factors can influence this relationship.
Correlation plays a crucial role, capturing the dependence between present and lagged values. In time
series analysis, these correlations, known as serial correlations or autocorrelations, form the basis for
analyzing stationary processes.

2.1 Preliminaries on Time Series and Stationarity


Time-Series

• Time series are stochastic processes. A stochastic process describes a “statistical phenomenon that
develops over time according to the laws of probability”.

• Mathematically, a stochastic process is defined as a collection of random variables indexed by
time (this course deals with discrete, equally spaced time series):

R0, R1, R2, . . .

The observed time series is a realization of this process:

r1, r2, . . . , rT.

• The purpose of time series analysis is to find the probability distribution of a stochastic process
through the observed time series.

• Instead of probability distributions, we often focus on examining the moments of the process. The
mean and autocovariance functions are defined as

µt = E(Rt) and γ(s, t) = Cov(Rs, Rt) = E[(Rs − µs)(Rt − µt)]

for all s, t. When s = t, we obtain the variance function

γ(t, t) = Cov(Rt, Rt) = E[(Rt − µt)²].

Stationary Process

A key concept in time series analysis is stationarity.

Strictly Stationary: A time series {rt } is strictly stationary if the joint distribution of (rt1 , . . . , rtk )
is unchanged under time shifts for any choice of (t1 , . . . , tk ).

This is a very strong assumption and is rarely testable empirically. A weaker and more practical
notion is weak stationarity.

Weak Stationarity: A time-series is weakly stationary if both the mean of rt and the covariance
between rt and rt−ℓ are time-invariant, where ℓ is an integer. More specifically, {rt } is weakly stationary
if

(i) E(rt ) = µ, a constant mean, and

(ii) Cov(rt , rt−ℓ ) = γℓ , depending only on the lag ℓ.

In practice, weak stationarity implies that the time series fluctuates around a fixed level with constant
variance, which allows reliable inference and prediction.

Weak stationarity assumes that the first two moments of rt are finite. If rt is strictly stationary with
finite first and second moments, then it is also weakly stationary. The converse, however, does not
necessarily hold, except when rt is normally distributed, in which case weak and strict stationarity
coincide.

Note the following key properties of autocovariance:

γ0 = Var(rt ), and γ−ℓ = γℓ .

It is often more practical to work with the autocorrelation function (ACF). The ACF at lag ℓ is
defined as

ρℓ = γℓ / γ0 = Cov(rt, rt−ℓ) / E[(rt − µ)²],

where µ is the mean of the process. This normalization ensures that ρℓ is scale-free, making it easier
to compare dependence structures across different time series.
For a given sample of returns {rt}_{t=1}^{T}, let r̄ = (1/T) Σ_{t=1}^{T} rt be the sample mean. Then the lag-1
sample autocorrelation of rt is

ρ̂1 = Σ_{t=2}^{T} (rt − r̄)(rt−1 − r̄) / Σ_{t=1}^{T} (rt − r̄)²,

and in general, the lag-ℓ sample autocorrelation of rt is defined as

ρ̂ℓ = Σ_{t=ℓ+1}^{T} (rt − r̄)(rt−ℓ − r̄) / Σ_{t=1}^{T} (rt − r̄)², 0 ≤ ℓ < T − 1.

The statistics ρ̂1, ρ̂2, . . . are called the sample autocorrelation function (ACF) of rt. The autocorrelation
function (ACF) plays a central role in the analysis of linear time series. In fact, a linear time
series model can be fully described by its ACF, and linear time series modeling relies
on the sample ACF to represent the linear dynamics present in the data.
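As an illustration of these formulas, the following is a minimal Python sketch (assuming only numpy; the simulated series and the function name sample_acf are illustrative, not from the notes) that computes the sample ACF directly from the definition above.

```python
import numpy as np

def sample_acf(r, max_lag):
    """Sample autocorrelations rho_hat_1, ..., rho_hat_max_lag,
    computed directly from the definition above."""
    r = np.asarray(r, dtype=float)
    r_bar = r.mean()
    denom = np.sum((r - r_bar) ** 2)          # sum_{t=1}^{T} (r_t - r_bar)^2
    acf = []
    for lag in range(1, max_lag + 1):
        num = np.sum((r[lag:] - r_bar) * (r[:-lag] - r_bar))  # sum over t = lag+1, ..., T
        acf.append(num / denom)
    return np.array(acf)

# Demonstration on simulated white noise: all sample ACF values should be near 0.
rng = np.random.default_rng(0)
r = rng.normal(size=500)
print(sample_acf(r, max_lag=5))
```

For white noise, all sample autocorrelations should be close to zero, which is the benchmark used later when checking model residuals.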
Remark 1. In time series modeling, the assumption of stationarity is fundamental for several reasons:

(i) Constant mean and variance: A stationary process has statistical properties, such as mean
and variance, that do not change over time. This constancy simplifies estimation, improves
interpretability, and ensures the stability of the model.

(ii) Autocorrelation structure: For stationary series, the autocorrelation function depends only
on the lag between observations, not on the specific time at which the correlation is computed.
This property is crucial since the autocorrelation function is central to most time series methods.

(iii) Model reliability: Stationarity ensures that the model’s behavior remains stable over time. In
contrast, non-stationary series often exhibit trends or seasonality, which can complicate estima-
tion and forecasting.

(iv) Theoretical basis: Many commonly used models, such as ARMA and ARIMA, rely on the
stationarity assumption. In practice, non-stationary series are frequently transformed into sta-
tionary ones before modeling.

It is important to note that not all time series are stationary. When non-stationarity is present,
specialized methods or transformations are typically required.
Remark 2. A structured procedure for analyzing time series data usually involves the following steps:

(i) Inspect the series visually to detect patterns such as trends, seasonality, or unusual observations.

(ii) Apply transformations, such as detrending, deseasonalizing, or differencing, to produce a stationary
sequence. The resulting process is often referred to as the transformed or pre-whitened series (a
differencing sketch is given after this list).

(iii) Fit candidate models to the transformed data to capture its underlying dynamics.

(iv) Choose the most suitable model using statistical criteria and diagnostic checks of model fit.

(v) Generate forecasts for the transformed series, and then apply inverse transformations to obtain
predictions for the original time series.
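As a rough illustration of this procedure, the sketch below simulates a trending series, differences it to obtain a stationary sequence, fits an AR(1) model, and transforms the forecasts back. It assumes numpy and statsmodels are available; the simulated data, the AR order, and the use of AutoReg are illustrative choices, not part of the notes.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg  # assumed available

rng = np.random.default_rng(1)

# (i) Simulated series with a linear trend plus AR(1) noise (illustrative data).
n = 300
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.6 * noise[t - 1] + rng.normal()
y = 0.05 * np.arange(n) + noise

# (ii) First-difference to remove the trend (the "pre-whitened" series).
dy = np.diff(y)

# (iii)-(iv) Fit a candidate AR model to the differenced data.
fit = AutoReg(dy, lags=1, trend="c").fit()

# (v) Forecast the differenced series, then undo the differencing.
h = 5
dy_forecast = fit.forecast(steps=h)
y_forecast = y[-1] + np.cumsum(dy_forecast)
print(y_forecast)
```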

3 White Noise and Linear Time Series


A discrete process {rt } is IID noise (independent and identically distributed noise), if it
consists of a sequence of independent and identically distributed random variables with an expected
value of 0. If the random variables have a finite variance, we denote

{rt } ∼ IID(0, σ 2 ).

A discrete process {rt} is white noise, if it consists of a sequence of uncorrelated random variables
with an expected value of 0 and a variance of σ², that is,

E(rt) = 0 and Cov(rs, rt) = σ² if s = t, and 0 if s ≠ t.

We denote
{rt} ∼ WN(0, σ²).

A time series {rt} is called linear if it can be represented as

rt = µ + Σ_{i=0}^{∞} ψi at−i,

where µ is the mean of rt, ψ0 = 1, and {at} is a white noise process. Here, at captures the new information
or innovation at time t, often referred to as a shock.

The dynamic structure of {rt} is determined by the sequence {ψi}, known as the ψ-weights. If the
series is weakly stationary, its mean and variance are

E(rt) = µ, and Var(rt) = σa² Σ_{i=0}^{∞} ψi²,

where σa² is the variance of at. For stationarity, the sum Σ_{i=0}^{∞} ψi² must be finite, implying that
ψi → 0 as i → ∞. Thus, the influence of past shocks at−i on rt diminishes as i grows larger.

The lag-ℓ autocovariance of {rt} is

γℓ = Cov(rt, rt−ℓ) = E[(Σ_{i=0}^{∞} ψi at−i)(Σ_{j=0}^{∞} ψj at−ℓ−j)]
   = E[Σ_{i,j=0}^{∞} ψi ψj at−i at−ℓ−j] = Σ_{j=0}^{∞} ψ_{j+ℓ} ψj E(a_{t−ℓ−j}²) = σa² Σ_{j=0}^{∞} ψj ψ_{j+ℓ},

and the autocorrelation function is given by

ρℓ = γℓ / γ0 = (Σ_{i=0}^{∞} ψi ψ_{i+ℓ}) / (1 + Σ_{i=1}^{∞} ψi²), ℓ ≥ 0.

Therefore, the ψ-weights are closely related to the autocorrelations of the series. For a weakly station-
ary process, ψi → 0 as i → ∞, which leads to ρℓ → 0 as ℓ → ∞. In the context of asset returns, this
indicates that the linear dependence of current returns on very distant past returns becomes negligible
for large lags.
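To make these formulas concrete, the sketch below simulates a linear process with geometrically decaying ψ-weights (an illustrative choice, not from the notes) and computes the theoretical ACF ρℓ = Σ ψi ψ_{i+ℓ} / Σ ψi²; numpy is assumed, and the sample ACF from the earlier sketch should closely match these values for a long simulated series.

```python
import numpy as np

# Illustrative psi-weights: psi_i = 0.5**i, truncated at 50 terms.
psi = 0.5 ** np.arange(50)
sigma_a = 1.0
rng = np.random.default_rng(2)

# Simulate r_t = mu + sum_i psi_i a_{t-i} (with mu = 0) by convolving shocks with the weights.
T = 5000
a = rng.normal(scale=sigma_a, size=T + len(psi) - 1)
r = np.convolve(a, psi, mode="valid")          # length T

def theoretical_acf(psi, max_lag):
    """Theoretical ACF of the linear process: rho_ell = sum_i psi_i psi_{i+ell} / sum_i psi_i^2."""
    gamma0 = np.sum(psi ** 2)
    return np.array([np.sum(psi[:-lag] * psi[lag:]) / gamma0
                     for lag in range(1, max_lag + 1)])

# For psi_i = 0.5**i this reduces to rho_ell = 0.5**ell.
print("theoretical ACF:", theoretical_acf(psi, 3))
# The sample ACF of r (e.g., via the sample_acf helper above) should be close for large T.
```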
Remark 3. Backward shift operator: for any process {ht},

B ht = ht−1, B² ht = ht−2, . . . , B^j ht = ht−j.

With the backward shift operator, the linear process above can be expressed as

rt = µ + ψ(B) at, where ψ(B) = Σ_{j=0}^{∞} ψj B^j.

The backward shift operator B in ψ(B) “shifts” the time series backwards: when B acts on a term at,
it replaces it with at−1, B² replaces at with at−2, and so on.

4 Simple Autoregressive Models
An autoregressive process of order 1, AR(1), is defined as

rt = ϕ0 + ϕ1 rt−1 + at ,

where {at } is assumed to be a white noise process with mean zero and variance σa2 . This form is similar
to a simple linear regression model, with rt as the dependent variable and rt−1 as the explanatory
variable.

The AR(1) model has properties analogous to regression but also key differences. In particular,
conditional on rt−1 , we have

E(rt | rt−1 ) = ϕ0 + ϕ1 rt−1 , and Var(rt | rt−1 ) = Var(at ) = σa2 .

Thus, given the past return, the current return is centered around ϕ0 + ϕ1 rt−1 with variance σa2 .
Importantly, the AR(1) model satisfies the Markov property, meaning that once rt−1 is known, past
returns rt−i for i > 1 provide no additional information about rt .

A natural extension of this framework is the autoregressive model of order p, AR(p), defined as

rt = ϕ0 + ϕ1 rt−1 + · · · + ϕp rt−p + at ,

where p is a nonnegative integer and {at } is again white noise. In this case, the past p returns jointly
determine the conditional expectation of rt . The AR(p) model resembles a multiple linear regression,
with lagged returns serving as explanatory variables.

There are several questions that need to be addressed. First, what are the properties of AR models?
Second, how do we identify the order p of an AR time series? Third, how do we estimate the
coefficients? Fourth, how do we check model adequacy? Finally, how do we make forecasts? Forecasting
is an important application of time-series analysis.

4.1 Properties of AR models


AR(1) Model

Consider the AR(1) model


rt = ϕ0 + ϕ1 rt−1 + at (1)

(i) Mean: Taking expectations in (1) gives

E(rt ) = ϕ0 + ϕ1 E(rt−1 ).

If the process is stationary, then E(rt ) = E(rt−1 ) = µ, so

µ = ϕ0 + ϕ1 µ ⇒ E(rt) = µ = ϕ0 / (1 − ϕ1).

This result has two key implications:

(a) The mean of rt exists only if ϕ1 ̸= 1.

(b) The mean is zero if and only if ϕ0 = 0.

Hence, for a stationary AR(1) process, the constant ϕ0 determines the long-run mean of rt . If
ϕ0 = 0, then E(rt ) = 0.
Remark 4. Using the identity ϕ0 = (1 − ϕ1 )µ, the AR(1) model can be written as

rt − µ = ϕ1 (rt−1 − µ) + at .

Applying repeated substitution leads to

rt − µ = at + ϕ1 at−1 + ϕ1² at−2 + · · · = Σ_{i=0}^{∞} ϕ1^i at−i,

which has the form of a linear time series with ψi = ϕ1^i. Thus, the AR(1) process can be expressed
as an infinite weighted sum of past shocks {at−i}.

(ii) Variance: Taking the variance in (1), we obtain

Var(rt) = ϕ1² Var(rt−1) + σa² + 2ϕ1 Cov(rt−1, at),

where σa2 is the variance of the innovation process {at }. Note that rt − µ can be expressed as a
linear combination of past shocks at−i . Since {at } is a white noise sequence, it follows that

E[(rt − µ)at+1 ] = 0.

By the stationarity assumption,

Cov(rt−1 , at ) = E[(rt−1 − µ)at ] = 0,

because rt−1 is a linear combination of shocks up to time t − 1, and at is uncorrelated with those past shocks.

Thus, we have

Var(rt) = ϕ1² Var(rt−1) + σa².

Since Var(rt) = Var(rt−1) under stationarity, this gives

Var(rt) = σa² / (1 − ϕ1²),

provided that ϕ1² < 1.

The condition ϕ1² < 1 ensures that the variance is finite and positive. This requirement leads
to the weak stationarity condition of the AR(1) model:

−1 < ϕ1 < 1.

(iii) Autocovariance function: Further, the autocovariance function is given by

γℓ = ϕ1 γ1 + σa² for ℓ = 0, and γℓ = ϕ1 γℓ−1 for ℓ > 0,

where γℓ = Cov(rt, rt−ℓ).

Thus, for a weakly stationary AR(1) model,

Var(rt) = γ0 = σa² / (1 − ϕ1²), and γℓ = ϕ1 γℓ−1, ℓ > 0.

From the recursive relation, the autocorrelation function (ACF) satisfies

ρℓ = ϕ1 ρℓ−1 , ℓ > 0.

Since ρ0 = 1, it follows that

ρℓ = ϕ1^ℓ.

This result shows that the ACF of a weakly stationary AR(1) process decays exponentially with
lag ℓ, at a rate determined by ϕ1, starting from ρ0 = 1 (see Figure 1).

Figure 1: ACF of an AR(1) process. Case (a): ϕ1 = 0.8; Case (b): ϕ1 = −0.8.
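As a quick numerical check of the exponential decay ρℓ = ϕ1^ℓ, the sketch below simulates AR(1) series with ϕ1 = 0.8 and ϕ1 = −0.8 (the two cases in Figure 1) and compares sample and theoretical ACFs; it assumes numpy and statsmodels' acf function are available.

```python
import numpy as np
from statsmodels.tsa.stattools import acf  # assumed available

def simulate_ar1(phi1, T=5000, phi0=0.0, sigma_a=1.0, seed=0):
    """Simulate r_t = phi0 + phi1 * r_{t-1} + a_t with Gaussian white-noise shocks."""
    rng = np.random.default_rng(seed)
    r = np.zeros(T)
    for t in range(1, T):
        r[t] = phi0 + phi1 * r[t - 1] + rng.normal(scale=sigma_a)
    return r

for phi1 in (0.8, -0.8):
    r = simulate_ar1(phi1)
    sample = acf(r, nlags=5)[1:]              # sample ACF at lags 1..5
    theory = phi1 ** np.arange(1, 6)          # rho_ell = phi1**ell
    print(phi1, np.round(sample, 2), np.round(theory, 2))
```

For ϕ1 = 0.8 the ACF decays monotonically, while for ϕ1 = −0.8 it alternates in sign, matching the two cases in Figure 1.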

AR(2) Model

Consider the AR(2) model


rt = ϕ0 + ϕ1 rt−1 + ϕ2 rt−2 + at (2)

(i) Mean: Taking expectations gives

E(rt) = µ = ϕ0 / (1 − ϕ1 − ϕ2),

provided that ϕ1 + ϕ2 ≠ 1.

(ii) Autocovariance Function: Using ϕ0 = (1 − ϕ1 − ϕ2)µ, we can rewrite the AR(2) model as

(rt − µ) = ϕ1 (rt−1 − µ) + ϕ2 (rt−2 − µ) + at .

Multiplying the prior equation by (rt−ℓ − µ), we have

(rt−ℓ − µ)(rt − µ) = ϕ1 (rt−ℓ − µ)(rt−1 − µ) + ϕ2 (rt−ℓ − µ)(rt−2 − µ) + (rt−ℓ − µ)at .

Taking expectations and using E[(rt−ℓ − µ)at] = 0 for ℓ > 0, we obtain

γℓ = ϕ1 γℓ−1 + ϕ2 γℓ−2, for ℓ > 0.

This result is referred to as the moment equation of a stationary AR(2) model. Dividing the
above equation by γ0 , we have the property

ρℓ = ϕ1 ρℓ−1 + ϕ2 ρℓ−2 , ℓ > 0,

for the ACF of rt . In particular, the lag-1 ACF satisfies

ρ1 = ϕ1 ρ0 + ϕ2 ρ−1 = ϕ1 + ϕ2 ρ1 .

Therefore, for a stationary AR(2) series rt, we have ρ0 = 1,

ρ1 = ϕ1 / (1 − ϕ2), and ρℓ = ϕ1 ρℓ−1 + ϕ2 ρℓ−2, ℓ ≥ 2.

Thus, the ACF of a stationary AR(2) series satisfies the second-order difference equation

(1 − ϕ1 B − ϕ2 B 2 )ρℓ = 0.

This difference equation determines the properties of the ACF of a stationary AR(2) time series. It
also determines the behavior of the forecasts of rt .

Under the stationarity condition, this ACF recursion ensures that the ACF of the model converges to
0 as the lag ℓ increases. This convergence property is a necessary condition for a stationary time
series.
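The second-order difference equation above can be iterated directly. The following is a minimal numpy sketch, with illustrative coefficients ϕ1 = 1.2 and ϕ2 = −0.35 (assumptions, not from the notes), that computes ρℓ from ρ0 = 1 and ρ1 = ϕ1/(1 − ϕ2).

```python
import numpy as np

def ar2_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2) model via the moment recursion."""
    rho = np.zeros(max_lag + 1)
    rho[0] = 1.0
    rho[1] = phi1 / (1.0 - phi2)              # lag-1 ACF from the moment equation
    for ell in range(2, max_lag + 1):
        rho[ell] = phi1 * rho[ell - 1] + phi2 * rho[ell - 2]
    return rho

# Illustrative stationary coefficients.
print(np.round(ar2_acf(phi1=1.2, phi2=-0.35, max_lag=8), 3))
```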

AR(p) Model

Extending the results obtained for the AR(1) and AR(2) processes, the mean of a stationary AR(p)
process is

E(rt) = ϕ0 / (1 − ϕ1 − · · · − ϕp),

as long as the denominator is nonzero. The stationarity of the AR(p) process depends on the
characteristic equation of the model:

1 − ϕ1 x − ϕ2 x2 − · · · − ϕp xp = 0.

If all solutions (roots) of this equation lie outside the unit circle (i.e., their moduli are greater than
one), then the process {rt } is stationary.

For a stationary AR(p) process, the autocorrelation function (ACF) satisfies the difference equation

(1 − ϕ1 B − ϕ2 B² − · · · − ϕp B^p) ρℓ = 0, ℓ > 0.

As a result, the ACF of an AR(p) process can exhibit a variety of shapes, including mixtures of
exponentially decaying patterns and damped sine or cosine waves, depending on the location of the
characteristic roots.
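A quick way to check the root condition numerically is sketched below (numpy assumed, coefficients illustrative): form the characteristic polynomial 1 − ϕ1 x − · · · − ϕp x^p and verify that all of its roots lie outside the unit circle.

```python
import numpy as np

def is_stationary(phi):
    """Check the AR(p) stationarity condition: all roots of
    1 - phi_1 x - ... - phi_p x^p lie outside the unit circle."""
    # np.roots expects coefficients from the highest power down to the constant term.
    coeffs = np.concatenate(([-c for c in phi[::-1]], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0)), roots

# Illustrative examples (assumed coefficients, not from the notes).
print(is_stationary([0.8]))          # AR(1), phi1 = 0.8  -> stationary
print(is_stationary([1.2, -0.35]))   # AR(2)              -> stationary
print(is_stationary([1.1]))          # AR(1), phi1 = 1.1  -> not stationary
```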

4.2 Identifying the order of AR model
Partial Autocorrelation Function (PACF)

A partial correlation is a conditional correlation. The partial autocorrelation function (PACF), denoted
by αℓ, at lag ℓ can be thought of as the correlation between observations rt and rt+ℓ, with the influence
of the intermediate observations rt+1, . . . , rt+ℓ−1 removed. That is, we correlate the “parts” of rt and
rt+ℓ that are not predicted by rt+1, . . . , rt+ℓ−1.

The partial autocorrelation function (PACF) of a stationary time series is closely related to the ACF
and serves as a useful tool for identifying the appropriate order p of an autoregressive (AR) model.
One way to introduce PACF is by considering AR models of increasing order:

rt = ϕ0,1 + ϕ1,1 rt−1 + e1t ,


rt = ϕ0,2 + ϕ1,2 rt−1 + ϕ2,2 rt−2 + e2t ,
rt = ϕ0,3 + ϕ1,3 rt−1 + ϕ2,3 rt−2 + ϕ3,3 rt−3 + e3t ,
rt = ϕ0,4 + ϕ1,4 rt−1 + ϕ2,4 rt−2 + ϕ3,4 rt−3 + ϕ4,4 rt−4 + e4t ,
..
.

Here, ϕ0,j is the constant, ϕi,j are the coefficients for the lagged terms, and ejt represents the error in
the AR(j) model.

These models resemble multiple regression equations and their parameters can be estimated using
least squares. Note that

• ϕ̂1,1 corresponds to the lag-1 PACF of rt ,

• ϕ̂2,2 corresponds to the lag-2 PACF,

• ϕ̂3,3 corresponds to the lag-3 PACF,

• and so forth.

The lag-2 PACF ϕ̂2,2 measures the incremental effect of rt−2 on rt , after controlling for rt−1 (the AR(1)
component). Similarly, the lag-3 PACF reflects the additional contribution of rt−3 beyond the AR(2)
model, and so on. For an AR(p) process, the lag-p PACF is nonzero, but for lags greater than p, the
PACF values should be approximately zero. This truncation property of the PACF is commonly used
to determine the appropriate order p.
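In practice, the sample PACF can be obtained either by fitting the sequence of AR regressions above by least squares or with a library routine. A minimal sketch assuming statsmodels is shown below; the AR(3) coefficients used to simulate the data are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf  # assumed available

# Simulate an AR(3) series with illustrative (stationary) coefficients.
rng = np.random.default_rng(3)
phi = [0.5, -0.3, 0.2]
T = 3000
r = np.zeros(T)
for t in range(3, T):
    r[t] = phi[0] * r[t - 1] + phi[1] * r[t - 2] + phi[2] * r[t - 3] + rng.normal()

# Sample PACF: phi_hat_{1,1}, phi_hat_{2,2}, ...
# The lag-3 value should be clearly nonzero (close to 0.2), and lags 4-6 approximately zero,
# illustrating the truncation property used to choose p.
print(np.round(pacf(r, nlags=6, method="ols")[1:], 2))
```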

Parameter Estimation

For a specified AR(p) model, the conditional least-squares method, which starts with the (p + 1)th
observation, is often used to estimate the parameters. Specifically, conditioning on the first p obser-
vations, we have
rt = ϕ0 + ϕ1 rt−1 + · · · + ϕp rt−p + at , t = p + 1, . . . , T,

which is in the form of a multiple linear regression and can be estimated by the least-squares method.
Denote the estimate of ϕi by ϕ̂i . The fitted model is

r̂t = ϕ̂0 + ϕ̂1 rt−1 + · · · + ϕ̂p rt−p ,

and the associated residual is
ât = rt − r̂t .

The series {ât} is called the residual series, from which we obtain

σ̂a² = Σ_{t=p+1}^{T} ât² / (T − 2p − 1).

After fitting the time series, it is essential to verify with diagnostic checks that the model’s residuals
{ât } have the same properties as those of a white noise process.

The residual series is an estimate of the white noise process that generates the series {rt}. If the
model is good, the empirical residuals should have the same properties as the white noise process,
namely uncorrelatedness and homoscedasticity. Additionally, if normality is assumed for the error
process, the residuals should be approximately normal.
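A minimal fitting-and-diagnostics sketch is given below, assuming statsmodels: AutoReg performs a conditional least-squares regression of the kind described above, and the Ljung–Box test checks the residuals for remaining autocorrelation. The simulated AR(2) coefficients are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg             # assumed available
from statsmodels.stats.diagnostic import acorr_ljungbox  # assumed available

# Simulate an AR(2) series with illustrative coefficients.
rng = np.random.default_rng(4)
T = 2000
r = np.zeros(T)
for t in range(2, T):
    r[t] = 0.02 + 0.6 * r[t - 1] - 0.25 * r[t - 2] + rng.normal()

# Conditional least squares: regress r_t on r_{t-1}, ..., r_{t-p} for t = p+1, ..., T.
fit = AutoReg(r, lags=2, trend="c").fit()
print(fit.params)        # estimates of phi_0, phi_1, phi_2

# Diagnostic check: residuals should behave like white noise.
resid = fit.resid
print(acorr_ljungbox(resid, lags=[10]))   # large p-value -> no residual autocorrelation
```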

Goodness of fit

A commonly used statistic for measuring the goodness of fit of a stationary model is R-squared (R²),
defined as

R² = 1 − (residual sum of squares) / (total sum of squares).

For a stationary AR(p) time series model with T observations {rt | t = 1, . . . , T}, the measure becomes

R² = 1 − (Σ_{t=p+1}^{T} ât²) / (Σ_{t=p+1}^{T} (rt − r̄)²),

where r̄ = (1/(T − p)) Σ_{t=p+1}^{T} rt. It is easy to show that 0 ≤ R² ≤ 1. Typically, a larger R² indicates
that the model provides a closer fit to the data.
the model provides a closer fit to the data.

For a given data set, it is well known that R² is a nondecreasing function of the number of parameters
used. To overcome this weakness, an adjusted R² has been proposed, defined as

Adj-R² = 1 − (variance of residuals) / (variance of rt) = 1 − σ̂a² / σ̂r²,

where σ̂r² is the sample variance of rt. This measure takes into account the number of parameters used
in the fitted model.¹ However, it is no longer guaranteed to lie between 0 and 1.
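Both measures can be computed directly from a fitted model's residuals. The following is a minimal numpy sketch under the definitions above; the function name and its arguments (the series, the residual series, and the order p) are illustrative.

```python
import numpy as np

def r_squared_measures(r, resid, p):
    """R^2 and adjusted R^2 for an AR(p) fit, following the definitions above.
    `resid` holds the residuals a_hat_{p+1}, ..., a_hat_T."""
    r = np.asarray(r, dtype=float)
    resid = np.asarray(resid, dtype=float)
    T = len(r)
    r_used = r[p:]                               # r_{p+1}, ..., r_T
    rss = np.sum(resid ** 2)                     # residual sum of squares
    tss = np.sum((r_used - r_used.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - rss / tss
    # Adjusted version: penalized residual variance over the sample variance of r_t.
    adj_r2 = 1.0 - (rss / (T - p - 1)) / (np.sum((r - r.mean()) ** 2) / (T - 1))
    return r2, adj_r2
```

For example, r_squared_measures(r, fit.resid, p=2) could be applied to the AutoReg fit from the previous sketch.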

¹ Expanding this with the penalty terms, we can write

Adj-R² = 1 − σ̂a² / σ̂r² = 1 − [ (1/(T − p − 1)) Σ_{t=p+1}^{T} ât² ] / [ (1/(T − 1)) Σ_{t=1}^{T} (rt − r̄)² ].

Thus, compared to the unadjusted R², the adjusted version divides the total sum of squares by T − 1 degrees of
freedom but penalizes the numerator by dividing the residual sum of squares by T − p − 1 instead of T, to account
for the number of estimated parameters.

4.3 Forecasting

For the AR(p) model, suppose that we are at the time index h and are interested in forecasting rh+ℓ,
where ℓ ≥ 1. The time index h is called the forecast origin and the positive integer ℓ is the forecast
horizon.

1-Step-Ahead Forecast

From the AR(p) model, we have

rh+1 = ϕ0 + ϕ1 rh + · · · + ϕp rh+1−p + ah+1.

The optimal forecast of rh+1 based on the information set Fh is the conditional expectation, given by

r̂h(1) = E(rh+1 | Fh) = ϕ0 + Σ_{i=1}^{p} ϕi rh+1−i,

and the corresponding forecast error becomes

eh (1) = rh+1 − r̂h (1) = ah+1 .

In econometric terminology, ah+1 is commonly referred to as the shock at time h + 1. The variance of
this one-step-ahead forecast error is, therefore,

Var[eh (1)] = Var(ah+1 ) = σa2 .

2-Step-Ahead Forecast

From the AR(p) representation, we can write

rh+2 = ϕ0 + ϕ1 rh+1 + · · · + ϕp rh+2−p + ah+2 .

Taking the conditional expectation with respect to Fh , the two-step-ahead forecast becomes

r̂h (2) = E(rh+2 | Fh ) = ϕ0 + ϕ1 r̂h (1) + ϕ2 rh + · · · + ϕp rh+2−p .

The corresponding forecast error is

eh(2) = rh+2 − r̂h(2) = ϕ1 (rh+1 − r̂h(1)) + ah+2 = ah+2 + ϕ1 ah+1.

Thus, the variance of the two-step-ahead forecast error is

Var[eh(2)] = (1 + ϕ1²) σa²,

which indicates that as the forecast horizon grows, the uncertainty in the forecast also increases.
This observation is consistent with intuition. The same recursive procedure, sketched below, can be
used for general multistep-ahead forecasts.
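The recursion extends naturally to any horizon: observed returns are used where available and earlier forecasts are substituted for unknown future values. A minimal numpy sketch follows; the function name, the observed values, and the AR(2) coefficients are illustrative assumptions.

```python
import numpy as np

def ar_forecast(r, phi0, phi, horizon):
    """Multistep-ahead forecasts r_hat_h(1), ..., r_hat_h(horizon) for an AR(p) model.
    `r` is the observed series up to the forecast origin h; `phi` = [phi_1, ..., phi_p]."""
    p = len(phi)
    history = list(r[-p:])                 # last p observed values r_{h-p+1}, ..., r_h
    forecasts = []
    for _ in range(horizon):
        # Conditional expectation: future shocks have mean zero.
        r_hat = phi0 + sum(phi[i] * history[-1 - i] for i in range(p))
        forecasts.append(r_hat)
        history.append(r_hat)              # substitute the forecast for the unknown value
    return np.array(forecasts)

# Illustrative use with an AR(2) model (assumed coefficients and data).
r_observed = np.array([0.01, -0.02, 0.015, 0.005])
print(ar_forecast(r_observed, phi0=0.002, phi=[0.5, -0.2], horizon=5))
```

For large horizons the forecasts approach ϕ0 / (1 − ϕ1 − · · · − ϕp), illustrating the mean reversion noted in Remark 5 below.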
Remark 5. For a stationary AR(p) process, the forecast r̂h (ℓ) approaches the expected value E(rt ) as
the forecast horizon ℓ becomes very large. In other words, over the long run, predictions for the series
converge to its unconditional mean. In finance, this behavior is known as mean reversion.
