Box-Jenkins non-seasonal and seasonal models
C. Tavera – Université Rennes
Technical tools
• Lag operator : 𝐿
• Lag polynomial : $\gamma(L) = 1 + \gamma_1 L + \gamma_2 L^2 + \cdots + \gamma_m L^m = \sum_{i=0}^{m} \gamma_i L^i$ (with $\gamma_0 = 1$)
• Cumulated lag polynomial : $\gamma(1) = 1 + \gamma_1 + \gamma_2 + \cdots + \gamma_m = \sum_{i=0}^{m} \gamma_i$ (with $\gamma_0 = 1$)
Remark : $Lc = c$ and $\gamma(L)\,c = \gamma(1)\,c$ for any constant $c$
• By heart : $1 - \gamma L$ is a first-degree polynomial; with $|\gamma| < 1$ : $\frac{1}{1 - \gamma L} = 1 + \gamma L + \gamma^2 L^2 + \cdots = \sum_{i=0}^{\infty} \gamma^i L^i$
• Stationarity
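The « by heart » expansion of $1/(1 - \gamma L)$ can be checked numerically. The sketch below is only an illustration: the value of `gamma`, the truncation length `n_terms` and the helper names are assumptions, not part of the slides. It multiplies $1 - \gamma L$ by the truncated series and shows that everything cancels except the lag-0 term and a small remainder.

```python
def poly_mul(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = power of L)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


def inverse_first_degree(gamma, n_terms):
    """First n_terms coefficients of 1 / (1 - gamma L) = sum_i gamma^i L^i."""
    if abs(gamma) >= 1:
        raise ValueError("the expansion only converges for |gamma| < 1")
    return [gamma ** i for i in range(n_terms)]


gamma = 0.6      # hypothetical value, chosen so that |gamma| < 1
n_terms = 30     # hypothetical truncation point
inverse = inverse_first_degree(gamma, n_terms)

# (1 - gamma L) * (1 + gamma L + gamma^2 L^2 + ...) should be 1,
# up to a truncation remainder of -gamma^n_terms at the last lag.
product = poly_mul([1.0, -gamma], inverse)
print(product[0], product[1], product[-1])
```

The remainder $-\gamma^{n}$ shrinks as the truncation point grows, which is why the condition $|\gamma| < 1$ matters.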
Non-seasonal ARMA models
Wold theorem
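• $(y_t)_{t=1,\cdots,T}$ non-centered and stationary
➢ $\exists\, u_t \sim WN(0 ; \sigma_u^2)$ (white noise)
➢ $\exists\, \alpha \in \mathbb{R}$
➢ $\exists\, \delta_0, \delta_1, \cdots$ with
▪ $\delta_i \in \mathbb{R}$ (for $i = 0, 1, \cdots$)
▪ $\delta_0 = 1$
▪ $\lim_{i \to \infty} \delta_i = 0$ : short memory effect
Such that : $y_t = \alpha + u_t + \delta_1 u_{t-1} + \delta_2 u_{t-2} + \cdots = \alpha + \sum_{i=0}^{\infty} \delta_i u_{t-i} = \alpha + \delta(L) u_t$
with $\delta(L) = 1 + \delta_1 L + \delta_2 L^2 + \cdots$
$y_t = \alpha + \delta(L) u_t$ : Wold MA(∞) model
• Remark : if $(y_t)_{t=1,\cdots,T}$ is stationary and centered : $\alpha = 0$ and $y_t = \sum_{i=0}^{\infty} \delta_i u_{t-i} = \delta(L) u_t$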
ARMA(p,q) model
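• Fractional approximation : $\delta(L) \approx \frac{\theta(L)}{\phi(L)}$ so that $y_t = \alpha + \frac{\theta(L)}{\phi(L)} u_t$
or $\phi(L) y_t = \alpha' + \theta(L) u_t$ with $\alpha' = \alpha \cdot \phi(1)$ : ARMA(p,q) model
with $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$ and $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$
Moreover : in applied cases $p$ and $q$ are « small »
• Remark : centered ARMA(p,q) model : $\phi(L) y_t = \theta(L) u_t$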
• Key point 1 : 𝑦𝑡 is stationary : you need to check for this hypothesis
• Key point 2 : 𝑢𝑡 is a white noise process : you need to test for this hypothesis
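To see how a « small » ARMA(p,q) reproduces an infinite Wold polynomial, the weights $\delta_j$ of $\theta(L)/\phi(L)$ can be computed recursively from $\phi(L)\delta(L) = \theta(L)$, which gives $\delta_j = \theta_j + \sum_{i=1}^{p} \phi_i \delta_{j-i}$. The following is a minimal sketch; the function names and the ARMA(1,1) coefficients are illustrative assumptions.

```python
def wold_weights(phi, theta, n_weights):
    """Return delta_0 .. delta_{n_weights-1} of delta(L) = theta(L) / phi(L).

    phi   = [phi_1, ..., phi_p]     with phi(L)   = 1 - phi_1 L - ... - phi_p L^p
    theta = [theta_1, ..., theta_q] with theta(L) = 1 + theta_1 L + ... + theta_q L^q
    """
    delta = []
    for j in range(n_weights):
        if j == 0:
            value = 1.0                      # delta_0 = 1
        elif j <= len(theta):
            value = theta[j - 1]             # theta_j
        else:
            value = 0.0                      # theta_j = 0 beyond lag q
        for i, phi_i in enumerate(phi, start=1):
            if j - i >= 0:
                value += phi_i * delta[j - i]
        delta.append(value)
    return delta


def arma_constant(alpha, phi):
    """alpha' = alpha * phi(1), with phi(1) = 1 - phi_1 - ... - phi_p."""
    return alpha * (1.0 - sum(phi))


# Hypothetical ARMA(1,1): phi_1 = 0.5, theta_1 = 0.3, alpha = 2
weights = wold_weights([0.5], [0.3], 4)
alpha_prime = arma_constant(2.0, [0.5])
print(weights, alpha_prime)
```

With these values the weights decay geometrically after the first lag, which is the short memory effect of the Wold theorem.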
ARIMA(p,d,q) model
• In case 𝑦𝑡 is not stationary, you have to make it stationary
• Many ways to do it but most of the time we use a stochastic difference of order 𝒅:
$\Delta^d y_t = (1 - L)^d y_t$
• If $\Delta^d y_t$ is stationary and can be modelled as an ARMA(p,q) model, we can say :
➢ $\Delta^d y_t$ is modelled as an ARMA(p,q)
➢ or $y_t$ is modelled as an ARIMA(p,d,q) model (I means « integrated » of order $d$)
• Important remark : with economic variables, we often have 𝑑 = 1 (ARIMA(p,1,q) model)
• With $d = 1$ : $\Delta^1 y_t = \Delta y_t = (1 - L) y_t = y_t - y_{t-1}$ : variation of $y$ between two consecutive periods.
• Remark : if $y$ is a log (for example $y = \mathrm{Log}(z)$), then $\Delta y_t = \Delta \mathrm{Log}(z_t) \approx \frac{z_t - z_{t-1}}{z_{t-1}}$
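The difference operator and the log-difference remark can be illustrated with a few lines of code. This is a sketch under assumptions: the function name `difference` and the small series `z` are made up for the example.

```python
import math


def difference(y, d=1):
    """Apply (1 - L)^d to a list of observations; each pass drops one observation."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return list(y)


# Hypothetical level series with growth rates of +2%, +1% and -1%
z = [100.0, 102.0, 103.02, 101.9898]

log_diff = difference([math.log(v) for v in z])                  # Delta Log(z_t)
growth = [(z[t] - z[t - 1]) / z[t - 1] for t in range(1, len(z))]  # (z_t - z_{t-1}) / z_{t-1}

for ld, g in zip(log_diff, growth):
    print(round(ld, 5), round(g, 5))

# A quadratic trend needs d = 2 to become constant
print(difference([1, 4, 9, 16, 25], d=2))
```

The approximation is close for small growth rates and deteriorates as the rates get larger.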
Identification tools for ARMA models
1) ACF and PACF : for « pure » AR or MA models
Definition
Confidence interval
Properties with pure AR or MA models
ACF | PACF | Model
Falls abruptly to 0 after lag q | Falls progressively towards zero | MA(q)
Falls progressively towards zero | Falls abruptly to 0 after lag p | AR(p)
2) Information criterion (AIC is often used)
$AIC = T \cdot \mathrm{Log}\left(\frac{SSR}{T}\right) + 2 \cdot (k + 2)$
3) Iterative procedure
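The ACF/PACF signatures of pure AR and MA models (tool 1 above) can be reproduced from theoretical autocorrelations. The sketch below computes the PACF from a given ACF with the Durbin-Levinson recursion; the recursion is a standard technique that the slides do not name, and the AR(1) and MA(1) coefficients are hypothetical.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[0] = 1 and rho[k] is the autocorrelation at lag k.
    Returns [phi_11, phi_22, ..., phi_KK], the partial autocorrelations.
    """
    max_lag = len(rho) - 1
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
        # phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# AR(1) with phi_1 = 0.5: ACF decays as 0.5^k, PACF cuts off after lag 1
ar1_acf = [0.5 ** k for k in range(4)]
ar1_pacf = pacf_from_acf(ar1_acf)

# MA(1) with theta_1 = 0.5: ACF cuts off after lag 1, PACF decays
theta = 0.5
ma1_acf = [1.0, theta / (1 + theta ** 2), 0.0, 0.0]
ma1_pacf = pacf_from_acf(ma1_acf)

print([round(v, 4) for v in ar1_pacf])
print([round(v, 4) for v in ma1_pacf])
```

With sample data the same patterns appear only approximately, which is why the confidence interval is needed to decide which lags are significant.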
The road to ARIMA or ARMA models (iterative procedure)
[Flowchart]
➢ Wold MA(∞) model ; y stationary ?
➢ Preliminary identification
➢ Estimation of tentative models and test : common roots ? Are residuals WN ?
➢ NO : modification of the "tentative models"
➢ YES : compare the predictive capacity of the models with WS forecasts
➢ Selection of the most predictive model
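The « are residuals WN ? » step needs a test. The slides do not say which one is used; one common choice is the Ljung-Box statistic $Q = T(T+2)\sum_{k=1}^{h} \frac{r_k^2}{T-k}$, where $r_k$ is the residual autocorrelation at lag $k$. The sketch below assumes that choice, and the alternating « residual » series is an artificial example built to fail the test.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a list of numbers."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag (assumed test, not from the slides)."""
    n = len(residuals)
    r = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, max_lag + 1))


# Artificial residuals that alternate in sign: strongly autocorrelated, clearly not WN
residuals = [(-1) ** t for t in range(100)]
q_stat = ljung_box_q(residuals, max_lag=1)

# 5% critical value of a chi-square with 1 degree of freedom
CHI2_1DF_5PCT = 3.841
print(round(q_stat, 2), q_stat > CHI2_1DF_5PCT)
```

A Q value above the critical value rejects the white noise hypothesis, which sends the procedure back to the modification of the tentative models. For residuals of an estimated ARMA(p,q), the degrees of freedom are usually reduced by $p + q$.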
Seasonal ARMA models
SARMA model
Multiplicative seasonal ARMA models : SARMA(P,Q)(p,q)
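$\Phi(L^s)\,\phi(L)\, y_t = \alpha + \Theta(L^s)\,\theta(L)\, u_t$
➢ $s$ : span of seasonality
➢ $\Phi(L^s) = 1 - \Phi_1 L^{s} - \Phi_2 L^{2s} - \cdots - \Phi_P L^{Ps}$
➢ $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$
➢ $\Theta(L^s) = 1 + \Theta_1 L^{s} + \Theta_2 L^{2s} + \cdots + \Theta_Q L^{Qs}$
➢ $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$
• Example : $y_t \sim SARMA(2,1)(1,1)$ for a monthly ($s = 12$), centered series
$(1 - \Phi_1 L^{12} - \Phi_2 L^{24})(1 - \phi_1 L)\, y_t = (1 + \Theta_1 L^{12})(1 + \theta_1 L)\, u_t$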
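« Multiplicative » means that the seasonal and non-seasonal polynomials are multiplied, which creates cross terms at lags 13 and 25 on the AR side of the example. The sketch below expands the AR product; the coefficient values and helper names are hypothetical.

```python
def lag_poly(coeffs_by_lag):
    """Coefficient list of a lag polynomial from {lag: coefficient}; the lag-0 term is 1."""
    poly = [0.0] * (max(coeffs_by_lag) + 1)
    poly[0] = 1.0
    for lag, coeff in coeffs_by_lag.items():
        poly[lag] = coeff
    return poly


def poly_mul(a, b):
    """Multiply two lag polynomials given as coefficient lists (index = power of L)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


# Hypothetical coefficients for the monthly SARMA(2,1)(1,1) example
Phi1, Phi2, phi1 = 0.5, 0.2, 0.3

seasonal_ar = lag_poly({12: -Phi1, 24: -Phi2})   # 1 - Phi1 L^12 - Phi2 L^24
regular_ar = lag_poly({1: -phi1})                # 1 - phi1 L
full_ar = poly_mul(seasonal_ar, regular_ar)

# Keep only the lags with a non-zero coefficient
nonzero = {lag: round(c, 10) for lag, c in enumerate(full_ar) if abs(c) > 1e-12}
print(nonzero)
```

The cross terms $\Phi_1\phi_1 L^{13}$ and $\Phi_2\phi_1 L^{25}$ are not free parameters: they are implied by the three estimated coefficients.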
SARIMA model
Multiplicative seasonal ARIMA models : SARIMA(P,D,Q)(p,d,q)
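$\Phi(L^s)\,\phi(L)\,(1 - L^s)^D (1 - L)^d\, y_t = \alpha + \Theta(L^s)\,\theta(L)\, u_t$
or, with $\Delta_s^D = (1 - L^s)^D$ and $\Delta^d = (1 - L)^d$ : $\Phi(L^s)\,\phi(L)\,\Delta_s^D \Delta^d\, y_t = \alpha + \Theta(L^s)\,\theta(L)\, u_t$
➢ $s$ : span of seasonality
➢ $\Phi(L^s) = 1 - \Phi_1 L^{s} - \Phi_2 L^{2s} - \cdots - \Phi_P L^{Ps}$
➢ $\phi(L) = 1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p$
➢ $\Theta(L^s) = 1 + \Theta_1 L^{s} + \Theta_2 L^{2s} + \cdots + \Theta_Q L^{Qs}$
➢ $\theta(L) = 1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q$
• Example : $y_t \sim SARIMA(2,1,1)(1,1,1)$ for a monthly ($s = 12$), centered series
$(1 - \Phi_1 L^{12} - \Phi_2 L^{24})(1 - \phi_1 L)(1 - L^{12})^1 (1 - L)^1\, y_t = (1 + \Theta_1 L^{12})(1 + \theta_1 L)\, u_t$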
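The filter $(1 - L^{12})(1 - L)$ of the example expands to $y_t - y_{t-1} - y_{t-12} + y_{t-13}$. The sketch below applies it to an artificial monthly series made of a linear trend plus a fixed seasonal pattern, both invented for the illustration, and shows that the two differences remove them completely.

```python
def lag_difference(y, lag):
    """Apply (1 - L^lag) to a list of observations; the first `lag` observations are lost."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


s = 12
# Hypothetical seasonal pattern repeated every 12 months, plus a linear trend
seasonal_pattern = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8]
y = [2 * t + seasonal_pattern[t % s] for t in range(48)]

# Seasonal difference first, then first difference (the order does not matter)
w = lag_difference(lag_difference(y, s), 1)

# Same filter written out term by term
expanded = [y[t] - y[t - 1] - y[t - s] + y[t - s - 1] for t in range(s + 1, len(y))]

print(len(y), len(w), set(w))
```

With $D = 1$ and $d = 1$ on monthly data, 13 observations are lost at the start of the sample.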
SARIMA model
• ACF and PACF do not give a clear indication for identifying the model
• Alternative identification methods
➢ Start with a simple tentative model (for example the airline model)
➢ Calculate the ACF and PACF of the estimated residuals
➢ Modify the initial tentative model until the residuals are WN
➢ Once a final model is obtained, estimate alternative models by changing the lags by 1 unit. Calculate the associated value of an information criterion (for example AIC)
➢Select the « best » model
Example: with a final model SARIMA(2,1,0)12(3,0,0), we will then try :
SARIMA(2,1,0)12(3,0,0) SARIMA(1,1,0)12(3,0,0)
SARIMA(2,1,0)12(3,0,1) SARIMA(1,1,1)12(3,0,0)
SARIMA(2,1,0)12(3,0,2) SARIMA(1,1,2)12(3,0,0)
SARIMA(2,1,0)12(2,0,0) SARIMA(0,1,1)12(3,0,0)
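Building the list of alternative models can be automated. The sketch below is one possible reading of « changing the lags by 1 unit »: it moves the AR or MA order of one triple up or down by one while keeping the differencing orders fixed. The function name is hypothetical, and the set it produces is not identical to the list on the slide.

```python
def neighbours(order):
    """Orders obtained from (p, d, q) by changing p or q by one unit, with d held fixed."""
    p, d, q = order
    result = []
    for dp, dq in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        new_p, new_q = p + dp, q + dq
        if new_p >= 0 and new_q >= 0:       # orders cannot be negative
            result.append((new_p, d, new_q))
    return result


# Final model of the example: first triple (with s = 12) and second triple
first, second = (2, 1, 0), (3, 0, 0)

candidates = [(f, second) for f in neighbours(first)]
candidates += [(first, g) for g in neighbours(second)]

for f, g in candidates:
    print(f"SARIMA{f}12{g}".replace(" ", ""))
```

Each candidate would then be estimated and ranked with the chosen information criterion, keeping only the models whose residuals are WN.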
SARIMA model : Example
• Monthly value of sales (index); food products, France (INSEE)
• Log of the series (to stabilize the variance)
• First difference, seasonal difference and first plus seasonal differences of the Log
• Retained model : SARIMA(2,1,1)12(3 (lags 1 and 3), 1, 1) without constant term
• Estimated residuals and standardized residuals : create 2 dummies for the outliers
Conclusion
Estimated residuals can be considered as WN since :
➢ No autocorrelation
➢ Normal distribution
The model is then compared to the model obtained with the automatic procedure of the software : SARIMA(1,0,1)12(0,1,2)
▪ AIC -4.375
▪ SBC -4.248
▪ Hannan-Quinn -4.323