CS2 Booklet 8 (Time Series) 2019 FINAL
Subject CS2
Revision Notes
For the 2019 exams
Time series
Booklet 8
covering
CONTENTS
Contents Page
Copyright agreement
Legal action will be taken if these terms are infringed. In addition, we may
seek to take disciplinary action through the profession or through your
employer.
These conditions remain in force after you have finished using the course.
These chapter numbers refer to the 2019 edition of the ActEd Course Notes.
The numbering of the syllabus items is the same as that used by the Institute
and Faculty of Actuaries.
2.1.9 Show that certain univariate time series models have the
Markov property and describe how to rearrange a
univariate time series model as a multivariate Markov
model.
OVERVIEW
This booklet covers Syllabus objectives 2.1 and 2.2, which relate to time
series.
In this course, we look in detail at four important types of time series:
moving average (MA) processes
autoregressive (AR) processes
autoregressive moving average (ARMA) processes, and
autoregressive integrated moving average (ARIMA) processes.
We then go on to discuss how we can fit a time series model to a data set
using the Box-Jenkins methodology, and how to use a model to forecast
future values of a process.
There are many past exam questions (from Subject CT6) that ask for the
derivation of an autocorrelation function. These questions involve standard
algebra.
CORE READING
All of the Core Reading for the topics covered in this booklet is contained in
this section.
The text given in Arial Bold Italic font is additional Core Reading that is not
directly related to the topic being discussed.
____________
x(t_1), x(t_2), ..., x(t_n), ie as { x(t_i) : i = 1, 2, 3, ..., n }
x_1, x_2, ..., x_n, ie as { x_t : t = 1, 2, 3, ..., n }
____________
For example, a list of returns of the stocks in the FTSE 100 index on a
particular day is not a time series, and the order of records in the list is
irrelevant. At the same time, a list of values of the FTSE 100 index
taken at one-minute intervals on a particular day is a time series, and
the order of records in the list is of paramount importance.
[Figure: plot of the FTSE 100 index (values in hundreds, roughly 70 to 100) against observation index]
{ X_t : t = 1, 2, 3, ..., n }
(Note, however, that in the modern literature the term ‘time series’ is
often used to mean both the data and the process of which it is a
realisation.)
____________
γ_k = cov(X_t, X_{t+k}) = E(X_t X_{t+k}) − E(X_t)E(X_{t+k})
γ_0 = var(X_t)
____________
ρ_k = corr(X_t, X_{t+k}) = γ_k / γ_0
____________
E[ (X_t − φ_{k,1}X_{t−1} − φ_{k,2}X_{t−2} − ... − φ_{k,k}X_{t−k})² ]
Figure 13.1: ACF and PACF values of some stationary time series
model.
____________
φ_1 = ρ_1,    φ_2 = det(1  ρ_1; ρ_1  ρ_2) / det(1  ρ_1; ρ_1  1) = (ρ_2 − ρ_1²) / (1 − ρ_1²)
____________
Operators
(BX)_t = X_{t−1}
____________
(∇X)_t = X_t − X_{t−1}
____________
diff(x, lag=1, differences=1) for a single difference, ∇x
diff(x, lag=1, differences=3) for a third-order difference, ∇³x
diff(x, lag=12, differences=1)
for a simple seasonal difference with period 12, ∇_12 (see later).
____________
White noise
γ_k = cov(e_t, e_{t+k}) = σ²  if k = 0,  and 0 otherwise
____________
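As a quick illustration (not part of the Core Reading; the variable names are ours), the defining property above can be checked by simulating white noise in R and inspecting its sample ACF:
# Simulate Gaussian white noise with sigma = 2 and check its sample ACF;
# all autocorrelations at non-zero lags should be negligible.
set.seed(1)
e <- rnorm(500, mean = 0, sd = 2)
acf(e, main = "Sample ACF of simulated white noise")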
21 The main linear models used for modelling stationary time series are:
Autoregressive process (AR)
Moving average process (MA)
Autoregressive moving average process (ARMA).
____________
X_t = μ + α_1(X_{t−1} − μ) + α_2(X_{t−2} − μ) + ... + α_p(X_{t−p} − μ) + e_t
X_t = μ + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}
25 The two basic processes (AR and MA) can be combined to give an
autoregressive moving average, or ARMA, process. The defining
equation of an ARMA( p, q ) process is:
X_t = μ + α_1(X_{t−1} − μ) + ... + α_p(X_{t−p} − μ) + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}
AR (1) processes
X_t = μ + α(X_{t−1} − μ) + e_t    (13.1)
____________
X_t = μ + α^t (X_0 − μ) + Σ_{j=0}^{t−1} α^j e_{t−j}    (13.2)
____________
μ_t = μ + α^t (μ_0 − μ)
____________
var(X_t) = σ² (1 − α^{2t}) / (1 − α²) + α^{2t} var(X_0)
X_t = μ + Σ_{j=0}^{∞} α^j e_{t−j}    (13.3)
____________
Σ_{j=0}^{∞} α^{2j} σ² = σ² / (1 − α²)   if |α| < 1
____________
γ_k = cov(X_t, X_{t+k}) = Σ_{j=0}^{∞} Σ_{i=0}^{∞} α^i α^j cov(e_{t−j}, e_{t+k−i})
    = Σ_{j=0}^{∞} σ² α^{2j+k} = α^k γ_0
γ_k = cov(X_t, X_{t−k}) = cov(μ + α(X_{t−1} − μ) + e_t, X_{t−k})
    = α cov(X_{t−1}, X_{t−k})
    = α γ_{k−1}
implying that:
γ_k = α^k γ_0 = α^k σ² / (1 − α²)   for k ≥ 0
____________
35 So:
ρ_k = γ_k / γ_0 = α^k   for k ≥ 0
____________
φ_1 = ρ_1 = α    φ_2 = (α² − α²) / (1 − α²) = 0
The following lines in R generate the ACF and PACF functions for an
AR (1) model:
par(mfrow=c(1,2))
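A minimal sketch along these lines (illustrative values and code, assumed rather than taken from the Core Reading) simulates an AR(1) series with α = 0.7 and plots its sample ACF and PACF:
# Simulate an AR(1) series with alpha = 0.7 and plot its sample ACF and PACF
set.seed(42)
x <- arima.sim(n = 300, model = list(ar = 0.7))
acf(x,  main = "Sample ACF of AR(1)")
pacf(x, main = "Sample PACF of AR(1)")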
r_t = μ + α(r_{t−1} − μ) + e_t
One initial condition, the value for r0 , is required for the complete
specification of the model for the force of inflation rt .
____________
AR ( p) processes
X_t = μ + α_1(X_{t−1} − μ) + α_2(X_{t−2} − μ) + ... + α_p(X_{t−p} − μ) + e_t    (13.4)
____________
(1 − α_1 B − α_2 B² − ... − α_p B^p)(X_t − μ) = e_t    (13.5)
____________
As seen for AR(1), there are some restrictions on the values of the α_j which are permitted if the process is to be stationary. In particular, we have the following result.
____________
1 − α_1 z − α_2 z² − ... − α_p z^p = 0
Proof
γ_k = cov(X_t, X_{t−k}) = cov( Σ_{j=1}^{p} α_j X_{t−j} + e_t, X_{t−k} ) = Σ_{j=1}^{p} α_j γ_{k−j}
γ_k = Σ_{j=1}^{p} A_j z_j^{−k}
The converse of this result is also true (but the proof is not given here):
if the roots of the characteristic polynomial are all greater than 1 in
absolute value, then it is possible to construct a stationary process X
satisfying (13.4). In order for an arbitrary process X satisfying (13.4)
to be stationary, the variances and covariances of the initial values
X_0, X_{−1}, ..., X_{−p+1} must also be equal to the appropriate values.
Often exact values for the γ_k are required, entailing finding the values
of the constants A_j.
____________
γ_k = α_1 γ_{k−1} + α_2 γ_{k−2} + ... + α_p γ_{k−p} + σ² 1_{k=0}
γ_3 = α_1 γ_2 + α_2 γ_1 + α_3 γ_0
γ_2 = α_1 γ_1 + α_2 γ_0 + α_3 γ_1
γ_1 = α_1 γ_0 + α_2 γ_1 + α_3 γ_2
γ_0 = α_1 γ_1 + α_2 γ_2 + α_3 γ_3 + σ²
____________
φ_1 = ρ_1 and φ_2 = (ρ_2 − ρ_1²) / (1 − ρ_1²), as we have seen before.
____________
MA(1) processes
X_t = μ + e_t + β e_{t−1}
____________
μ_t = μ
____________
γ_0 = var(e_t + β e_{t−1}) = (1 + β²) σ²
γ_1 = cov(e_t + β e_{t−1}, e_{t−1} + β e_{t−2}) = β σ²
γ_k = 0   for k > 1
____________
ρ_0 = 1
ρ_1 = β / (1 + β²)
ρ_k = 0   for k > 1
____________
X − μ = (1 + βB) e    (13.6)
(1 + βB)^{−1} (X − μ) = e
X_t − μ − β(X_{t−1} − μ) + β²(X_{t−2} − μ) − β³(X_{t−3} − μ) + ... = e_t
Although more than one MA process may share a given ACF, at most
one of the processes will be invertible.
φ_k = (−1)^{k+1} (1 − β²) β^k / (1 − β^{2(k+1)})
____________
MA(q ) processes
X − μ = (1 + β_1 B + β_2 B² + ... + β_q B^q) e
____________
γ_k = Σ_{i=0}^{q} Σ_{j=0}^{q} β_i β_j E(e_{t−i} e_{t−j−k}) = σ² Σ_{i=0}^{q−k} β_i β_{i+k}
52 Although there may be many moving average processes with the same
ACF, at most one of them is invertible, since no two invertible
processes have the same autocorrelation function. Moving average
models fitted to data by statistical packages will always be invertible.
____________
ARMA processes
X_t = μ + α_1(X_{t−1} − μ) + ... + α_p(X_{t−p} − μ) + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}
(1 − α_1 B − ... − α_p B^p)(X − μ) = (1 + β_1 B + ... + β_q B^q) e
____________
53 Neither the ACF nor the PACF of the ARMA process eventually
becomes equal to zero.
X_t = α X_{t−1} + e_t + β e_{t−1}    (13.7)
is given by:
ρ_1 = (1 + αβ)(α + β) / (1 + β² + 2αβ)
ρ_k = α^{k−1} ρ_1,   k = 2, 3, ...
Figure 13.1 shows the ACF and PACF values of such a process with α = 0.7 and β = 0.5.
____________
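A hedged R sketch (our own illustration, using the built-in ARMAacf function) that reproduces the theoretical ACF and PACF plotted in Figure 13.1 for α = 0.7 and β = 0.5:
# Theoretical ACF and PACF of an ARMA(1,1) with alpha = 0.7, beta = 0.5
acf_vals  <- ARMAacf(ar = 0.7, ma = 0.5, lag.max = 10)
pacf_vals <- ARMAacf(ar = 0.7, ma = 0.5, lag.max = 10, pacf = TRUE)
par(mfrow = c(1, 2))
plot(0:10, acf_vals,  type = "h", xlab = "Lag", ylab = "ACF")
plot(1:10, pacf_vals, type = "h", xlab = "Lag", ylab = "PACF")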
cov(X_t, e_{t−1}) = α cov(X_{t−1}, e_{t−1}) + cov(e_t, e_{t−1}) + β cov(e_{t−1}, e_{t−1})
                 = (α + β) σ²
cov(X_t, X_{t−1}) = α cov(X_{t−1}, X_{t−1}) + cov(e_t, X_{t−1}) + β cov(e_{t−1}, X_{t−1})
So:
γ_0 = α γ_1 + (1 + αβ + β²) σ²
γ_1 = α γ_0 + β σ²
γ_k = α γ_{k−1}
γ_0 = (1 + 2αβ + β²) σ² / (1 − α²)
γ_1 = (α + β)(1 + αβ) σ² / (1 − α²)
γ_k = α^{k−1} γ_1,   k = 2, 3, ...
ARIMA processes
57 Example 1
X_t = X_{t−1} + e_t
X_t = X_0 + Σ_{j=1}^{t} e_j
Y_t = ∇X_t = e_t
Example 2
Z_t = Z_{t−1} exp(μ + e_t)
Y_t = μ + Y_{t−1} + e_t
Example 3
Markov property
P[X_t ∈ A | X_{s_1} = x_1, X_{s_2} = x_2, ..., X_{s_n} = x_n, X_s = x] = P[X_t ∈ A | X_s = x]
for all times s_1 < s_2 < ... < s < t, all states x_1, x_2, ..., x_n, x in S and all subsets A of S.
____________
but we may define a vector-valued process Y_t = (X_t, X_{t−1}, ..., X_{t−p+1})^T which does.
____________
Y_t = (X_t, X_{t−1}, ..., X_{t−p−d+1})^T which does.
____________
(X_n, X_{n−1}, ..., X_{n−q+1})^T: any finite collection will never be enough to
deduce the value of e_n, on which the distribution of X_{n+1} depends.
Since a moving average has been shown to be equivalent to an
autoregression of infinite order, and since a p th order autoregression
needs to be expressed as a p -dimensional vector in order to possess
the Markov property, a moving average has no similar
finite-dimensional Markov representation.
____________
All the methods that we shall investigate apply only to a time series
which gives the appearance of stationarity. In this section, therefore,
we deal with possible sources of non-stationarity and how to
compensate for them.
ts.plot(x)
Seasonal variation:
A company which sells greetings cards will find that sales in some months of the year are much higher than in others.
Plotting the series will highlight any obvious trends in the mean and
will show up any cyclic variation, which could also form evidence of
non-stationarity. This should always be the first step in any practical
time series analysis.
____________
ts.plot(log(FTSE100$Close))
points(log(FTSE100$Close),cex=.4)
generates Figure 14.1, which shows the time series of the logs of 300
successive closing values of FTSE100 index.
The corresponding sample ACF and sample PACF are produced using:
par(mfrow=c(1,2))
acf(log(FTSE100$Close))
pacf(log(FTSE100$Close))
Figure 14.2: Sample ACF and sample PACF of the log(FTSE100) data;
dotted lines indicate cut-offs for significance if data came from some
white noise process.
____________
66 If the sample ACF decreases slowly but steadily from a value near 1,
we would conclude that the data need to be differenced before fitting
the model.
____________
Figure 14.2 shows the sample ACF of a time series which is clearly
non-stationary as the values decrease in some linear fashion;
differencing is therefore required before fitting a stationary model.
See, for example, the change of ACF and PACF for the differenced
data:
x_t = a + bt + y_t
∇x_t = b + ∇y_t
x_t = μ + θ_t + y_t    (14.1)
is a stationary process.
____________
Figure 14.4: Data plot, sample ACF and PACF of temperature data.
y_t = (1/(2h)) ( ½ x_{t−h} + x_{t−h+1} + ... + x_{t−1} + x_t + ... + x_{t+h−1} + ½ x_{t+h} )
This ensures that each period makes an equal contribution to y_t.
The same can be done with odd periods d = 2h + 1, but the end terms x_{t−h} and x_{t+h} do not need to be halved.
____________
For example, when fitting the model in Equation 14.1 to a monthly time series x extending over 10 years from January 1990, the estimate for μ is x̄ and the estimate for θ_January is:
θ̂_January = (1/10)(x_1 + x_13 + x_25 + ... + x_109) − μ̂
____________
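A minimal R sketch of the method of seasonal means (our own illustration, assuming a monthly series x of length 120 that starts in January):
# Estimate mu and the twelve monthly seasonal terms by the method of seasonal means
mu.hat    <- mean(x)
theta.hat <- rowMeans(matrix(x, nrow = 12)) - mu.hat  # row 1 = Januaries, row 2 = Februaries, ...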
ts.plot(manston1$tmax,ylab="",main="Max
temperatures")
points(manston1$tmax,cex=0.4)
The moving average can be added (in red) using the code:
lines(as.vector(decomp$trend),col="red")
The sum of seasonal and moving average trends can be added (in blue)
as follows:
lines(as.vector(decomp$seasonal+decomp$trend),
col="blue")
A further caution when using transformed data involves the final step
of turning forecasts for the transformed process into forecasts for the
original process, as some transformations introduce a systematic bias.
____________
μ̂ = (1/n) Σ_{t=1}^{n} x_t
____________
γ̂_k = (1/n) Σ_{t=k+1}^{n} (x_t − μ̂)(x_{t−k} − μ̂)
____________
r_k = γ̂_k / γ̂_0
As we have seen before, R functions acf and pacf can be used for
generating these values.
set.seed(123)
x=arima.sim(n=300,model=list(ar=0.7,ma=0.5))
Then:
par(mfrow=c(1,2))
acf(x,main="Sample ACF")
pacf(x,main="Sample PACF")
Figure 14.7: ACF and PACF of some simulated data from ARMA(1,1) .
____________
Clearly the SACF and SPACF of a white noise process are random,
being simple functions of the observations. In particular, even if the
original process was a perfectly standard white noise the SACF and
SPACF would not be identically zero. The question is what scale of
deviation from zero is to be expected.
____________
X_t = μ + e_t
80 A ‘portmanteau’ test is due to Ljung and Box, who state that, if the
white noise model is correct, then:
n(n + 2) Σ_{k=1}^{m} r_k² / (n − k) ~ χ²_m
for each m.
____________
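In R the Ljung-Box statistic is available through Box.test; a hedged sketch (the residual series, lag and fitdf values are illustrative):
# Ljung-Box test on the residuals of a fitted model, eg fit <- arima(x, order = c(1, 0, 1))
res <- residuals(fit)
Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)  # fitdf = p + q for an ARMA(1,1)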
Identification of MA(q )
81 If the data really do come from a MA(q) model, the estimators r_k for k > q will be roughly normally distributed with mean 0 and variance
(1/n) (1 + 2 Σ_{k=1}^{q} ρ_k²).
____________
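A short sketch (our own illustration) of how this bound might be computed from a sample ACF when assessing a candidate MA(q):
# Approximate 95% cut-off for r_k, k > q, under a candidate MA(q) model
r     <- acf(x, plot = FALSE)$acf[-1]   # sample ACF at lags 1, 2, ...
q     <- 1                              # candidate order (illustrative)
bound <- 1.96 * sqrt((1 + 2 * sum(r[1:q]^2)) / length(x))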
Identification of AR ( p )
Identifying p , d and q
Suppose now that the appropriate value for the parameter d has been found, and the time series { z_{d+1}, z_{d+2}, ..., z_n } is adequately stationary. (Notice that a differenced series has one fewer observation than the original series.) We shall assume throughout this section that the sample mean of the z sequence is zero; if this is not the case, obtain a new sequence by subtracting μ̂ = z̄ from each value in the sequence. We shall also assume, for the sake of simplicity in setting down the lower and upper limits of sums, that d = 0.
AIC(model) = log(σ̂²) + 2 × (number of parameters) / n
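As an illustration (assumed code, not Core Reading), candidate orders can be compared using the AIC reported by arima():
# Compare AIC across a few candidate models for a series x; smaller is better
fit1 <- arima(x, order = c(1, 0, 0))
fit2 <- arima(x, order = c(0, 0, 1))
fit3 <- arima(x, order = c(1, 0, 1))
c(AR1 = fit1$aic, MA1 = fit2$aic, ARMA11 = fit3$aic)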
Parameter estimation
Z_t = α_1 Z_{t−1} + ... + α_p Z_{t−p} + e_t + β_1 e_{t−1} + ... + β_q e_{t−q}
e_t = Z_t − α_1 Z_{t−1} − ... − α_p Z_{t−p}
Σ_{t=p+1}^{n} (z_t − α_1 z_{t−1} − ... − α_p z_{t−p})²
____________
e_t = z_t − α_1 z_{t−1} − β_1 e_{t−1}
93 First assume they are all equal to zero and estimate the α_i and β_j on
that basis, then use standard forecasting techniques on the
time-reversed process { z_n, ..., z_1 } to obtain predicted values for
(e_0, ..., e_{q−1}), a method known as backforecasting. These new values
can be used as the starting point for another application of the
estimation procedure; this continues until the estimates have
converged.
____________
In Figure 14.7, the ACF and PACF plots show some significant spikes at the early lags, suggesting the presence of both autoregressive and moving average components.
The code:
fit=arima(x,order=c(1,0,1));fit
σ̂² = (1/n) Σ_{t=p+1}^{n} ê_t²
   = (1/n) Σ_{t=p+1}^{n} (z_t − α̂_1 z_{t−1} − ... − α̂_p z_{t−p} − β̂_1 ê_{t−1} − ... − β̂_q ê_{t−q})²
Diagnostic checks
The behaviour of the sample ACF and sample PACF of a white noise
sequence have already been described.
____________
[ (2/3)(n − 2) − 1.96 √((16n − 29)/90),  (2/3)(n − 2) + 1.96 √((16n − 29)/90) ]
____________
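A hedged R sketch (illustrative names) of the turning points check on a residual series:
# Count turning points of the residuals and compare with the 95% interval above
res <- residuals(fit)
d   <- diff(res)
tp  <- sum(d[-length(d)] * d[-1] < 0)   # a turning point is a sign change in successive differences
n   <- length(res)
c(turning.points = tp,
  lower = 2/3 * (n - 2) - 1.96 * sqrt((16 * n - 29) / 90),
  upper = 2/3 * (n - 2) + 1.96 * sqrt((16 * n - 29) / 90))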
The command:
tsdiag(fit)
where the last plot shows a sequence of p-values from the Ljung-Box test; high p-values suggest a good fit, ie residuals close to white noise.
Forecasting
X_{n+k} = μ + α_1(X_{n+k−1} − μ) + ... + α_p(X_{n+k−p} − μ) + e_{n+k} + β_1 e_{n+k−1} + ... + β_q e_{n+k−q}
102 The one-step ahead and two-step ahead forecasts for an AR (2) are
given by:
x̂_n(1) = μ̂ + α̂_1(x_n − μ̂) + α̂_2(x_{n−1} − μ̂)
x̂_n(2) = μ̂ + α̂_1(x̂_n(1) − μ̂) + α̂_2(x_n − μ̂)
____________
predict(fit,n.ahead=3)
Exponential smoothing
x̂_n(1) = α ( x_n + (1 − α) x_{n−1} + (1 − α)² x_{n−2} + ... )
____________
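Simple exponential smoothing can be carried out in R with HoltWinters by switching off the trend and seasonal components; a sketch (the smoothing parameter 0.3 is illustrative):
# Simple exponential smoothing of a series x with smoothing parameter alpha = 0.3
es <- HoltWinters(x, alpha = 0.3, beta = FALSE, gamma = FALSE)
predict(es, n.ahead = 1)   # one-step ahead forecast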
107 The method lends itself easily to regular updating: it is easy to see that:
x̂_{n+1}(1) = α x_{n+1} + (1 − α) x̂_n(1)
108 This technique works for stationary series, but clearly cannot be
applied to series exhibiting a trend or seasonal variation.
____________
X_t^{(1)}, ..., X_t^{(m)}.
____________
113 In the stationary case the notation μ will be used to represent the common mean vector, and Σ_k the covariance matrix cov(X_t, X_{t+k}).
X_t = μ + Σ_{j=1}^{p} A_j (X_{t−j} − μ) + e_t    (14.2)
cov(e_t^{(i)}, e_s^{(j)}) = 0 for s ≠ t.
____________
( i_t − μ_i )   ( a_11   0   ) ( i_{t−1} − μ_i )   ( e_t^{(i)} )
( I_t − μ_I ) = ( a_21  a_22 ) ( I_{t−1} − μ_I ) + ( e_t^{(I)} )
____________
117 The theory and analysis of a VAR (1) closely parallels that of a
univariate AR (1) . Iterating from equation (14.2) in the case p = 1 , it is
clear that:
X_t = μ + Σ_{j=0}^{t−1} A^j e_{t−j} + A^t (X_0 − μ)
____________
118 In order that X should represent a stationary time series, the powers
of A should converge to zero in some sense. The appropriate
requirement is that all eigenvalues of the matrix A should be less than
1 in absolute value.
____________
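A hedged sketch of this check in R for an assumed 2x2 coefficient matrix A (the values are illustrative):
# VAR(1) stationarity check: all eigenvalues of A must lie inside the unit circle
A <- matrix(c(0.5, 0.2,
              0.1, 0.4), nrow = 2, byrow = TRUE)
all(abs(eigen(A)$values) < 1)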
(a_11 − λ)(a_22 − λ) − a_12 a_21 = 0
C_t = α Y_{t−1} + e_t^{(1)}
I_t = β (C_{t−1} − C_{t−2}) + e_t^{(2)}
where e^{(2)} is another zero-mean white noise. Finally, any part of the
national income is either consumed or invested; therefore:
Y_t = C_t + I_t
____________
C_t = α C_{t−1} + α I_{t−1} + e_t^{(1)}
I_t = β (C_{t−1} − C_{t−2}) + e_t^{(2)}
( C_t )   ( α   α ) ( C_{t−1} )   (  0   0 ) ( C_{t−2} )   ( e_t^{(1)} )
( I_t ) = ( β   0 ) ( I_{t−1} ) + ( −β   0 ) ( I_{t−2} ) + ( e_t^{(2)} )
____________
122 Two time series processes X and Y are called cointegrated if:
ln X_t = ln(P_t / Q_t) + Y_t
Y_t = μ + α(Y_{t−1} − μ) + e_t + β e_{t−1}
where e^{(1)} and e^{(2)} are zero-mean white noises, possibly correlated.
124 The general class of bilinear models can be exemplified by its simplest
representative, the random process X defined by the relation:
X_n + a(X_{n−1} − μ) = μ + e_n + β e_{n−1} + b(X_{n−1} − μ) e_{n−1}
125 The main qualitative difference between the bilinear model and models
from the ARMA class is that many bilinear models exhibit ‘bursty’
behaviour: when the process is far from its mean it tends to exhibit
larger fluctuations.
X_n = μ + α_1(X_{n−1} − μ) + e_n,  if X_{n−1} ≤ d
X_n = μ + α_2(X_{n−1} − μ) + e_n,  if X_{n−1} > d
____________
X_t = μ + α_t(X_{t−1} − μ) + e_t
129 The behaviour of these processes can vary widely, depending on the
distribution chosen for the a t , but is in general more irregular than
that of the corresponding AR (1) .
____________
X_t = μ + e_t √( α_0 + Σ_{k=1}^{p} α_k (X_{t−k} − μ)² )
132 The simplest representative of the ARCH(p) class is the ARCH(1) model defined by the relation:
X_t = μ + e_t √( α_0 + α_1 (X_{t−1} − μ)² )
____________
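A hedged R sketch (illustrative parameter values) simulating a path from this ARCH(1) model:
# Simulate X_t = mu + e_t * sqrt(a0 + a1 * (X_{t-1} - mu)^2)
set.seed(7)
n  <- 500; mu <- 0; a0 <- 0.2; a1 <- 0.6
x  <- numeric(n); x[1] <- mu
e  <- rnorm(n)
for (t in 2:n) x[t] <- mu + e[t] * sqrt(a0 + a1 * (x[t-1] - mu)^2)
ts.plot(x)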
133 As may be seen from the ARCH (1) model, a significant deviation of
X t - 1 from the mean m gives rise to an increase in the conditional
variance of X t given X t - 1 .
____________
134 The ARCH models have been used for modelling financial time series.
If Z_t is the price of an asset at the end of the t-th trading day, it is found that the ARCH model can be used to model X_t = ln(Z_t / Z_{t−1}), interpreted as the daily return on day t.
This section contains all the relevant exam questions from 2008 to 2017 that
are related to the topics covered in this booklet.
Solutions are given after the questions. These give enough information for
you to check your answer, including working, and also show you what an
outline examination answer should look like. Further information may be
available in the Examiners’ Report, ASET or Course Notes. (ASET can be
ordered from ActEd.)
We first provide you with a cross-reference grid that indicates the main
subject areas of each exam question. You can use this, if you wish, to
select the questions that relate just to those aspects of the topic that you
may be particularly interested in reviewing.
Alternatively, you can choose to ignore the grid, and attempt each question
without having any clues as to its content.
Cross-reference grid
[Table: exam questions against topic areas: MA processes, AR processes, ARMA processes, ARIMA processes, VAR processes, autocovariance / autocorrelation function, partial ACF, non-stationary series, parameter estimation, forecasting, testing / choosing a model, trends and cycles, ARCH models]
Lag ACF
1 0.73
2 0.14
3 0.37
4 0.59
5 0.24
6 0.12
7 0.07
(iii) Explain whether these results confirm the initial belief that the model
could be appropriate for these data. [3]
[Total 10]
(ii) Explain why weakly stationary multivariate normal processes are also
strictly stationary. [1]
(iii) Show that the following bivariate time series process, (X_n, Y_n)^T, is
weakly stationary:
where e_n^x and e_n^y are two independent white noise processes. [5]
is stationary. [6]
[Total 14]
X_t = μ + e_t √(α_0 + α_1 (X_{t−1} − μ)²)
Y_t = a_1 Y_{t−1} + e_t
where e_t is a white noise error term with mean zero and variance σ².
Y_t = a_1 Y_{t−1} + a_2 Y_{t−2} + e_t
where e_t is a white noise error term with mean zero and variance σ².
(iv) List two statistical tests that you should apply to the residuals after fitting
a model to time series data. [2]
[Total 15]
(i) Show that the new series X t = a + bt + Yt where a and b are fixed
non-zero constants, is not stationary. [2]
(iii) Show that if Yt is a moving average process of order 1, then the series
DX t is not invertible and has variance larger than that of Yt . [6]
[Total 15]
Y_t = 2α Y_{t−1} + Z_t,   |α| < 0.5
Express Y_t in the form Y_t = Σ_{j=0}^{∞} a_j Z_{t−j} and hence or otherwise find an
The following data is observed from n = 500 realisations from a time series:
Σ_{i=1}^{n} x_i = 13,153.32,   Σ_{i=1}^{n} (x_i − x̄)² = 3,153.67   and
Σ_{i=1}^{n−1} (x_i − x̄)(x_{i+1} − x̄) = 2,176.03
(i) Estimate, using the data above, the parameters m , a1 and s from the
model:
X_t − μ = a_1(X_{t−1} − μ) + e_t
(ii) After fitting the model with the parameters found in (i), it was calculated
that the number of turning points of the residuals series eˆt is 280.
The following two models have been suggested for representing some
quarterly data with underlying seasonality.
Model 1 Yt = aYt - 4 + et
Model 2 Yt = b et - 4 + et
(ii) State the features of the sample autocorrelation that would lead you to
prefer Model 1. [1]
[Total 5]
Observations y_1, y_2, ..., y_n are made from a random walk process given by:
Y_t = a + Y_{t−1} + e_t
(i) Derive expressions for E (Yt ) and var (Yt ) and explain why the process
is not stationary. [3]
(ii) Show that g t ,s = cov (Yt ,Yt - s ) for s < t is linear in s . [2]
(iii) Explain how you would use the observed data to estimate the
parameters a and s . [3]
(iv) Derive expressions for the one-step and two-step forecasts for Yn +1
and Yn + 2 . [2]
[Total 10]
Yt = 2a Yt -1 - a 2Yt - 2 + et
(i) Determine the values of a for which the process is stationary. [2]
(ii) Derive the auto-covariances g 0 and g 1 for this process and find a
general recursive expression for g k for k ≥ 2 . [10]
(iii) Show that the auto-covariance function can be written in the form:
γ_k = Aα^k + kBα^k
(iii) Calculate E (Yt ) and find the auto-covariance function for Yt . [6]
(1 − αB)³ X_t = e_t
where B is the backwards shift operator and e_t is a white noise process with variance σ².
(1 − B³)(1 − (α + β)B + αβB²) X_t = e_t
where B is the backward shift operator and e_t is a white noise process with variance σ².
(i) Show that for a suitable choice of s the seasonal difference series
Yt = X t - X t - s is stationary for a range of values of a and b , which
you should specify. [3]
(iii) Forecast the next two observations x̂101 and x̂102 based on the
parameters estimated in part (ii) and the observed values x1, x2 ,..., x100
of X t . [4]
[Total 14]
X_t = α X_{t−1} + e_t
L ∝ ∏_{i=1}^{n} (1/(√(2π) σ)) e^{−(x_i − α x_{i−1})²/(2σ²)}.    [3]
(ii) Show that the maximum likelihood estimate of a can also be regarded
as a least squares estimate. [2]
(iv) Derive the Yule-Walker equations for the model and hence derive
estimates of a and s 2 based on observed values of the
autocovariance function. [5]
(i) State the three main stages in the Box-Jenkins approach to fitting an
ARIMA time series model. [3]
(ii) Explain, with reasons, which ARIMA time series would fit the observed
data in the charts below. [2]
ACF PACF
X t = a1X t -1 + a 2 X t - 2 + b1et -1 + et
(iv) Explain whether the partial auto-correlation function for this model can
ever give a zero value. [2]
[Total 13]
A sequence of 100 observations was made from a time series and the
following values of the sample auto-covariance function (SACF) were
observed:
Lag SACF
1 0.68
2 0.55
3 0.30
4 0.06
The sample mean and variance of the same observations are 1.35 and 0.9
respectively.
(i) Calculate the first two values of the partial correlation function φ̂_1 and φ̂_2. [1]
(a) Yt = a0 + a1Yt -1 + et
(iv) Explain whether each of the models in part (ii) satisfies the Markov
property. [2]
[Total 17]
(i) List the main steps in the Box-Jenkins approach to fitting an ARIMA time
series to observed data. [3]
Observations x1, x2, , x200 are made from a stationary time series and
the following summary statistics are calculated:
Σ_{i=3}^{200} (x_i − x̄)(x_{i−2} − x̄) = 17.1,
(ii) Calculate the values of the sample auto-covariances γ̂_0, γ̂_1 and γ̂_2. [3]
(iii) Calculate the first two values of the partial correlation function φ̂_1 and φ̂_2.
X_t − μ = a_1(X_{t−1} − μ) + e_t
After fitting the model in part (iv) the 200 observed residual values eˆt were
calculated. The number of turning points in the residual series was 110.
(v) Carry out a statistical test at the 95% significance level to test the
hypothesis that eˆt is generated from a white noise process. [4]
[Total 18]
The following time series model is being used to model monthly data:
(i) Perform two differencing transformations and show that the result is a
moving average process which you may assume to be stationary. [3]
(iii) Derive the auto-correlation function of the model generated in part (i). [8]
[Total 12]
X t 0.5 X t 1 Yt t1
Yt 0.5Yt 1 X t t2
X X 1
M t N t 1 t
Yt Yt 1 t2
(ii) Show that the following set of equations represents a VAR( p ) (vector
auto regressive) process, by specifying the order and the relevant
parameters:
X t X t 1 Yt 1 t1
Yt X t 1 X t 2 t2
[3]
[Total 12]
Y_t = 1 + 0.6Y_{t−1} + 0.16Y_{t−2} + e_t
(1 − B^12)(1 − (α + β)B + αβB²) X_t = e_t
(ii) Determine the range of values for α and β for which the process will
be stationary after applying this seasonal difference. [3]
Assume that after the appropriate seasonal differencing the following sample
autocorrelation values for observations of Y_t are ρ̂_1 = 0 and ρ̂_2 = 0.09.
(iv) Derive the forecasts x̂_{T+1} and x̂_{T+2} for the next two observations, as a function of the existing observations. [4]
[Total 13]
Yt = m + a Yt -1 + e t
(i) State two approaches for estimating the parameters in Model A. [2]
Mary, an actuarial student, wishes to revise Model A such that the error
terms e t no longer follow a normal distribution.
(ii) Explain which of the approaches in part (i) she should now use for
parameter estimation. [2]
Mary has now constructed Model B. She has done this by multiplying both
sides of the equation above by (1 - cB) , where B is the backshift operator,
so that Model B follows the equation:
(1 - cB)Yt = (1 - cB)(a Yt -1 + e t )
Let ΔX_t = X_t − X_{t−1}.
(iv) Set out an equation for ΔX_t in terms of b, β, e_t and L, the lag operator. [1]
The solutions presented here are just outline solutions for you to use to
check your answers. See ASET for full solutions.
(i) Invertibility
We have roots:
λ = −1/β_1 and the fourth roots of −1/β_4
The time series is invertible if the roots are all greater than 1 in magnitude:
|−1/β_1| > 1  ⇒  |β_1| < 1  (and similarly |β_4| < 1)
(ii) ACF
γ_0 = cov(Y_t, Y_t)
    = cov(e_t + β_1e_{t−1} + β_4e_{t−4} + β_1β_4e_{t−5}, e_t + β_1e_{t−1} + β_4e_{t−4} + β_1β_4e_{t−5})
    = σ²(1 + β_1²)(1 + β_4²)
Similarly:
γ_1 = cov(Y_t, Y_{t−1}) = σ²β_1(1 + β_4²)
γ_2 = cov(Y_t, Y_{t−2}) = 0
γ_3 = cov(Y_t, Y_{t−3}) = β_1β_4σ²
γ_4 = cov(Y_t, Y_{t−4}) = β_4σ² + β_1²β_4σ² = σ²β_4(1 + β_1²)
γ_5 = cov(Y_t, Y_{t−5}) = β_1β_4σ²
γ_k = 0   for k > 5
Hence:
ρ_0 = γ_0/γ_0 = 1
ρ_{±1} = γ_1/γ_0 = σ²β_1(1 + β_4²) / [σ²(1 + β_1²)(1 + β_4²)] = β_1/(1 + β_1²)
ρ_{±2} = 0
ρ_{±3} = γ_3/γ_0 = β_1β_4 / [(1 + β_1²)(1 + β_4²)]
ρ_{±4} = γ_4/γ_0 = σ²β_4(1 + β_1²) / [σ²(1 + β_1²)(1 + β_4²)] = β_4/(1 + β_4²)
ρ_{±5} = γ_5/γ_0 = β_1β_4 / [(1 + β_1²)(1 + β_4²)]
ρ_{±k} = 0   for |k| > 5
So it appears that the sample ACFs are not consistent with the theoretical
ACFs.
E ( X t ) is constant
det( A - l I) = 0
For a 2×2 matrix M = (a  b; c  d), the determinant is det M = ad − bc. Hence:
l 2 - 1.3l + 0.37 = 0
Since both of these roots are less than 1 in magnitude, the multivariate
process is stationary.
Hence:
((1.3 + 2c) + √0.21)/2 < 1  ⇒  0.879 + c < 1  ⇒  c < 0.121
((1.3 + 2c) − √0.21)/2 < 1  ⇒  0.421 + c < 1  ⇒  c < 0.579
E(X_t) = μ for all t, and:
E(X_t X_{t−s}) = E[(μ + e_t √(α_0 + α_1(X_{t−1} − μ)²)) X_{t−s}]
             = μ E(X_{t−s}) + E[e_t √(α_0 + α_1(X_{t−1} − μ)²) X_{t−s}]
             = μ² + E(e_t) E[√(α_0 + α_1(X_{t−1} − μ)²) X_{t−s}]
             = μ² + 0
             = μ²
So:
E(X_t X_{t−s}) = μ² = E(X_t) E(X_{t−s})
Let Y_t = X_t − μ, so that:
Y_t = e_t √(α_0 + α_1 Y_{t−1}²)
Squaring this:
Y_t² = e_t² (α_0 + α_1 Y_{t−1}²)
We can use repeated substitution to get:
Y_t² = e_t² α_0 + e_t² α_1 Y_{t−1}²
    = e_t² α_0 + e_t² α_1 e_{t−1}² (α_0 + α_1 Y_{t−2}²)
    = α_0 e_t² + α_0 α_1 e_t² e_{t−1}² + α_0 α_1² e_t² e_{t−1}² e_{t−2}² + ... + α_1^s e_t² e_{t−1}² ⋯ e_{t−s+1}² Y_{t−s}²
so Y_t² is a function of Y_{t−s}² (and of e_t, ..., e_{t−s+1}). Then, for example:
P(Y_t² < 1 | Y_{t−s}² = 1,000,000) < P(Y_t² < 1 | Y_{t−s}² = 1)
So Y_t² is not independent of Y_{t−s}², which implies that Y_t is not independent of Y_{t−s} and hence that X_t is not independent of X_{t−s}.
From the figures given it looks like the ACF is decaying slowly and the PACF
is cutting off after lag 2. This is a characteristic of an AR (2) model.
As a starter step:
cov(Y_t, e_t) = cov(a_1Y_{t−1} + e_t, e_t) = σ²
Since we are told in the question that the sample ACF with lag 1 is 0.854, this is our estimate of ρ_1. So we have:
0.854 = â_1
Since we are told in the question that the sample ACF with lag 1 is 0.854 (our estimate of ρ_1) and the sample variance is 1.253 (our estimate of γ_0), we have:
γ̂_0 = 1.253
0.854 = γ̂_1/γ̂_0  ⇒  γ̂_1 = 0.854 × 1.253
From γ_0 = a_1γ_1 + σ²:
From this:
ρ_2 = (a_1² + (1 − a_2)a_2) / (1 − a_2)
Since we are told in the question that the sample ACF with lag 1 is 0.854 (our estimate for ρ_1) and that the sample ACF with lag 2 is 0.820 (our estimate of ρ_2), we have:
0.854 = â_1 / (1 − â_2)
0.820 = (â_1² + (1 − â_2)â_2) / (1 − â_2)
By substituting this back into the first equation above, we get â_1 = 0.568.
Since we are told in the question that the sample ACF with lag 1 is 0.854, the sample ACF with lag 2 is 0.820 and the sample variance is 1.253, we have:
γ̂_0 = 1.253
0.854 = γ̂_1/γ̂_0  ⇒  γ̂_1 = 0.854 × 1.253
0.820 = γ̂_2/γ̂_0  ⇒  γ̂_2 = 0.820 × 1.253
(iv) Tests
The appropriate tests are the Portmanteau (Ljung and Box) and Turning
Points tests.
We have:
E ( X t ) = a + bt + E (Yt )
Since Yt is stationary E (Yt ) does not depend upon time. However since
E ( X t ) contains t it depends on time and so is not constant. Hence X t is
not stationary.
(ii) Autocovariance
DX t = X t - X t -1 = a + bt + Yt - a - b(t - 1) - Yt -1 = b + Yt - Yt -1
This means that the autocovariance function with lag s depends upon the
lag only.
These are the two conditions required for the time series to be stationary.
DX t = b + et + b et -1 - et -1 - b et - 2
= b + et + ( b - 1)et -1 - b et - 2
Here:
1 + (β − 1)λ − βλ² = 0
⇒ λ = (1 − β ± √((β − 1)² + 4β)) / (−2β) = (1 − β ± √(β² + 2β + 1)) / (−2β) = (1 − β ± (β + 1)) / (−2β)
  = −1/β or 1
Since the roots of the characteristic equation are not all strictly greater than
1 in magnitude, the process is not invertible.
Also:
Y_t = 2αY_{t−1} + Z_t
    = 2α(2αY_{t−2} + Z_{t−1}) + Z_t = 4α²Y_{t−2} + 2αZ_{t−1} + Z_t
    = 4α²(2αY_{t−3} + Z_{t−2}) + 2αZ_{t−1} + Z_t = 8α³Y_{t−3} + 4α²Z_{t−2} + 2αZ_{t−1} + Z_t
    = Σ_{j=0}^{∞} (2α)^j Z_{t−j}
var(Y_t) = var(Σ_{j=0}^{∞} (2α)^j Z_{t−j}) = Σ_{j=0}^{∞} (2α)^{2j} var(Z_{t−j}) = Σ_{j=0}^{∞} (2α)^{2j} σ²
         = σ²/(1 − (2α)²) = σ²/(1 − 4α²)
var(X_t − μ) = var(a_1(X_{t−1} − μ) + e_t)
We estimate the variance of the residuals from the data given in the question:
3,153.67/500 = 0.690² × 3,153.67/500 + σ²  ⇒  σ² = 3.304
So:
s = 1.818
μ̂ = 13,153.32 / 500 = 26.31
E(T) = (2/3)(498) = 332
var(T) = (16 × 500 − 29)/90 = 88.567
Under H0 , this should come from the standard normal distribution. Since
-5.47 < -1.96 we have very strong evidence to reject H0 . This suggests
that the residuals are not consistent with white noise.
Model 1: Yt = a Yt - 4 + et
fi g 2 = 0 since a π 0 .
3 1 0 since a π 0 .
g 5 = ag 1 = 0
g 6 = ag 2 = 0
g 7 = ag 3 = 0
g 8 = ag 4 = a 2g 0 etc
Hence:
γ_k = α^{k/4} γ_0 if k = 4, 8, 12, 16, ...;  0 otherwise
ρ_k = 1 if k = 0;  α^{k/4} if k = 4, 8, 12, 16, ...;  0 otherwise
Model 2: Yt = b et - 4 + et
We have:
γ_0 = β²σ² + σ² = (β² + 1)σ²
γ_1 = cov(Y_t, Y_{t−1}) = cov(βe_{t−4} + e_t, βe_{t−5} + e_{t−1}) = 0
γ_4 = cov(Y_t, Y_{t−4}) = cov(βe_{t−4} + e_t, βe_{t−8} + e_{t−4}) = βσ²
γ_k = 0 for k ≠ 0, 4
So:
ρ_k = 1 if k = 0;  β/(β² + 1) if k = 4;  0 otherwise
ρ_k → 0 as k → ∞ for Model 1
Y_t = a + Y_{t−1} + e_t
    = a + (a + Y_{t−2} + e_{t−1}) + e_t
    = 2a + Y_{t−2} + e_{t−1} + e_t
    = 2a + (a + Y_{t−3} + e_{t−2}) + e_{t−1} + e_t
    = 3a + Y_{t−3} + e_{t−2} + e_{t−1} + e_t
    ...
    = ta + Y_0 + e_1 + e_2 + ... + e_{t−1} + e_t = ta + Σ_{j=1}^{t} e_j   (taking Y_0 = 0)
So:
E(Y_t) = E(ta + Σ_{j=1}^{t} e_j) = ta + Σ_{j=1}^{t} E(e_j) = ta
var(Y_t) = var(ta + Σ_{j=1}^{t} e_j) = Σ_{j=1}^{t} var(e_j) = tσ²
We have:
γ_{t,s} = cov(Y_t, Y_{t−s}) = cov(ta + Σ_{j=1}^{t} e_j, (t − s)a + Σ_{j=1}^{t−s} e_j) = cov(Σ_{j=1}^{t} e_j, Σ_{j=1}^{t−s} e_j)
        = cov(e_1 + e_2 + ... + e_{t−1} + e_t, e_1 + e_2 + ... + e_{t−s−1} + e_{t−s})
        = var(e_1) + var(e_2) + ... + var(e_{t−s})   since the e_i terms are independent
        = (t − s)σ²
∇Y_t = Y_t − Y_{t−1} = a + e_t
ŷ_n(1) = â + y_n + E(e_{n+1}) = â + y_n
Y_t − 2αY_{t−1} + α²Y_{t−2} = e_t
1 − 2αλ + α²λ² = 0
λ = (2α ± √(4α² − 4α²)) / (2α²) = 1/α
|1/α| > 1  ⇒  |α| < 1
We have:
Hence:
g 0 = 2ag 1 - a 2g 2 + s 2 (1)
Similarly, we get:
g 3 = cov(Yt ,Yt - 3 )
= 2ag 2 - a 2g 1
γ_1(1 + α²) = 2αγ_0  ⇒  γ_1 = 2α/(1 + α²) γ_0    (5)
γ_2 = 2α(2α/(1 + α²) γ_0) − α²γ_0 = 4α²/(1 + α²) γ_0 − α²γ_0 = (3α² − α⁴)/(1 + α²) γ_0    (6)
We can now substitute (5) and (6) into the lag 0 covariance equation (1):
γ_0 = 2α(2α/(1 + α²) γ_0) − α²((3α² − α⁴)/(1 + α²) γ_0) + σ²
    = (4α² − 3α⁴ + α⁶)/(1 + α²) γ_0 + σ²
⇒ γ_0 = (1 + α²)/(1 − 3α² + 3α⁴ − α⁶) σ²    (7)
γ_1 = 2α/(1 − 3α² + 3α⁴ − α⁶) σ²    (8)
γ_k − 2αγ_{k−1} + α²γ_{k−2} = 0
γ_k = (A + Bk)λ^k    (9)
λ² − 2αλ + α² = 0
λ = (2α ± √(4α² − 4α²))/2 = α
γ_k = (A + Bk)α^k    (10)
γ_0 = A = (1 + α²)/(1 − 3α² + 3α⁴ − α⁶) σ²
γ_1 = (A + B)α = 2α/(1 − 3α² + 3α⁴ − α⁶) σ²
Hence:
B = 2/(1 − 3α² + 3α⁴ − α⁶) σ² − A
B = 2/(1 − 3α² + 3α⁴ − α⁶) σ² − (1 + α²)/(1 − 3α² + 3α⁴ − α⁶) σ²
  = (1 − α²)/(1 − 3α² + 3α⁴ − α⁶) σ²
Yt is an ARIMA(2, 0, 0) if it is stationary.
1 − 0.4λ − 0.12λ²
which has roots -5 and 1.667. Since all roots are of magnitude greater
than 1, the process is stationary.
E(Y_t) = μ = 0.7/0.48 = 1.4583
(iv) Autocorrelations
g 1 = cov(Yt , Yt -1)
= cov(0.4Yt -1 + 0.12Yt - 2 + et , Yt -1)
= 0.4 cov(Yt -1, Yt -1) + 0.12 cov(Yt - 2,Yt -1) + cov(et , Yt -1)
= 0.4g 0 + 0.12g 1
g 2 = cov(Yt , Yt - 2 )
= cov(0.4Yt -1 + 0.12Yt - 2 + et , Yt - 2 )
= 0.4 cov(Yt -1, Yt - 2 ) + 0.12 cov(Yt - 2, Yt - 2 ) + cov(et , Yt - 2 )
= 0.4g 1 + 0.12g 0
g 3 = 0.4g 2 + 0.12g 1
g 4 = 0.4g 3 + 0.12g 2
γ_1 = (0.4/0.88) γ_0 = (5/11) γ_0
Solve to get:
ρ_1 = 5/11    ρ_2 = 83/275    ρ_3 = 241/1,375    ρ_4 = 731/6,875
1 − 0.4λ = 0  ⇒  λ = 1/0.4 = 2.5
Since the root is greater than 1 in absolute value, the process is stationary.
Hence d = 0 .
(ii)(a) Stationary?
(ii)(b) Invertible?
1 + 0.9λ = 0  ⇒  λ = −1/0.9 = −10/9
Since the process is stationary, we know that E (Yt ) = E (Yt -1) = ... = m . So:
μ − 0.4μ = 0.1 + 0 + 0  ⇒  μ = 1/6
Now since white noise is uncorrelated and future white noise is independent
of the past values of a time series:
Similarly:
and, in general:
Substituting our expression for g 0 into the expression for g 1 above, we get:
γ_1 = 0.4(0.4γ_1 + 2.17σ²) + 0.9σ²  ⇒  γ_1 = (1.768/0.84)σ² = (221/105)σ²
⇒ γ_0 = 0.4((221/105)σ²) + 2.17σ² = (253/84)σ²
⇒ γ_k = 0.4^{k−1} γ_1 = 0.4^{k−1} (221/105)σ²   for k ≥ 1
⇒ Y_t = (1 − 0.4B)^{−1}(0.1 + 0.9e_{t−1} + e_t)
However:
Y_t = (1 − 0.4B)^{−1}(0.1 + 0.9e_{t−1} + e_t)
    = (1 + (0.4B) + (0.4B)² + (0.4B)³ + ...)(0.1 + 0.9e_{t−1} + e_t)
    = 0.1/(1 − 0.4) + e_t + 1.3e_{t−1} + 0.4 × 1.3 e_{t−2} + 0.4² × 1.3 e_{t−3} + ...
    = 1/6 + e_t + 1.3 Σ_{j=1}^{∞} 0.4^{j−1} e_{t−j}
(1 − αλ)³ = 0  ⇒  λ = 1/α
Writing the time series in its long-hand form, ie by expanding out the
backward shift operator, gives:
(1 − αB)³ X_t = e_t
⇔ (1 − 3αB + 3α²B² − α³B³) X_t = e_t
⇔ X_t − 3αX_{t−1} + 3α²X_{t−2} − α³X_{t−3} = e_t
⇔ X_t = 3αX_{t−1} − 3α²X_{t−2} + α³X_{t−3} + e_t
g k = cov ( X t , X t - k )
Hence:
g 0 = cov ( X t , X t )
= cov (1.2 X t -1 - 0.48 X t - 2 + 0.064 X t - 3 + et , X t )
= 1.2 cov ( X t -1, X t ) - 0.48 cov ( X t - 2, X t ) + 0.064 cov ( X t - 3 , X t )
+ cov (et , X t )
= 1.2g 1 - 0.48g 2 + 0.064g 3 + s 2 (1)
Similarly:
g 1 = cov ( X t , X t -1)
= cov (1.2 X t -1 - 0.48 X t - 2 + 0.064 X t - 3 + et , X t -1)
= 1.2 cov ( X t -1, X t -1) - 0.48 cov ( X t - 2, X t -1) + 0.064 cov ( X t -3 , X t -1)
+ cov (et , X t -1)
= 1.2g 0 - 0.48g 1 + 0.064g 2 + 0 (2)
g 2 = cov ( X t , X t - 2 )
= cov (1.2 X t -1 - 0.48 X t - 2 + 0.064 X t - 3 + et , X t - 2 )
= 1.2cov ( X t -1, X t - 2 ) - 0.48 cov ( X t - 2, X t - 2 ) + 0.064 cov ( X t - 3 , X t - 2 )
+ cov (et , X t - 2 )
= 1.2g 1 - 0.48g 0 + 0.064g 1 + 0
= 1.264g 1 - 0.48g 0 (3)
g 3 = cov ( X t , X t -3 )
= cov (1.2 X t -1 - 0.48 X t - 2 + 0.064 X t - 3 + et , X t - 3 )
= 1.2 cov ( X t -1, X t - 3 ) - 0.48 cov ( X t - 2, X t - 3 ) + 0.064 cov ( X t - 3 , X t - 3 )
+ cov (et , X t - 3 )
= 1.2g 2 - 0.48g 1 + 0.064g 0 + 0
⇒ γ_1 = (290/347)γ_0 = 0.83573γ_0
γ_2 = 1.264((290/347)γ_0) − 0.48γ_0 = (200/347)γ_0 = 0.57637γ_0
and:
ρ_1 = γ_1/γ_0 = 290/347 = 0.83573
ρ_2 = γ_2/γ_0 = 200/347 = 0.57637
The partial autocorrelation function, φ_k, will cut off (ie be 0) for k > 3.
Since Yt = X t - X t - s = (1 - B s ) X t , we have s = 3 .
(1 − (α + β)B + αβB²) Y_t = e_t
1 − (α + β)λ + αβλ² = 0
(1 − αλ)(1 − βλ) = 0  ⇒  λ = 1/α, 1/β
|1/α| > 1 and |1/β| > 1
ie:
Y_t is an AR(2) process:
Y_t = (α + β) Y_{t−1} − αβ Y_{t−2} + e_t
γ_1 = (α + β)/(1 + αβ) γ_0
γ_2 = (α + β) × (α + β)/(1 + αβ) γ_0 − αβ γ_0
ρ_1 = (α + β)/(1 + αβ)
and ρ_2 = (α + β)(α + β)/(1 + αβ) − αβ = (α + β)ρ_1 − αβ
(α + β)/(1 + αβ) = 0.2    (3)
⇔ α − 0.2αβ = 0.2 − β
⇔ α = (0.2 − β)/(1 − 0.2β)    (5)
0.7 = ((0.2 − β)/(1 − 0.2β) + β) × 0.2 − ((0.2 − β)/(1 − 0.2β)) β
    = (0.04 − 0.2β + 0.2β − 0.04β² − 0.2β + β²)/(1 − 0.2β)
    = (0.04 − 0.2β + 0.96β²)/(1 − 0.2β)
Rearranging gives:
and:
(x − α)(x − β) = 0  ⇒  x² − (α + β)x + αβ = 0
x² − 0.0625x − 0.6875 = 0
(1 − 0.0625B − 0.6875B²) Y_t = e_t
⇔ Y_t = 0.0625Y_{t−1} + 0.6875Y_{t−2} + e_t
Substituting in Yt = X t - X t - 3 , we have:
X t - X t - 3 = 0.0625 ( X t -1 - X t - 4 ) + 0.6875 ( X t - 2 - X t - 5 ) + et
xˆ101 = 0.0625 x100 + 0.6875 x99 + x98 - 0.0625 x97 - 0.6875 x96
xˆ102 = 0.0625 xˆ101 + 0.6875 x100 + x99 - 0.0625 x98 - 0.6875 x97
N(αx_{t−1}, σ²)
(1/(√(2π) σ)) e^{−(x_t − αx_{t−1})²/(2σ²)}
L(α, σ) = ∏_{i=1}^{n} (1/(√(2π) σ)) e^{−(x_i − αx_{i−1})²/(2σ²)}
The least squares estimate will minimise the following with respect to a :
Σ_{i=1}^{n} e_i² = Σ_{i=1}^{n} (x_i − αx_{i−1})²
∏_{i=1}^{n} (1/(√(2π) σ)) e^{−(x_i − αx_{i−1})²/(2σ²)}
L(α, σ) = ∏_{i=1}^{n} (1/(√(2π) σ)) e^{−(x_i − αx_{i−1})²/(2σ²)} = const × (1/σⁿ) e^{−(1/(2σ²)) Σ_{i=1}^{n} (x_i − αx_{i−1})²}
Taking logs:
ln L(α, σ) = const − n ln σ − (1/(2σ²)) Σ_{i=1}^{n} (x_i − αx_{i−1})²
∂/∂α ln L(α, σ) = −(1/(2σ²)) Σ_{i=1}^{n} 2(x_i − αx_{i−1})(−x_{i−1})
               = (1/σ²) Σ_{i=1}^{n} x_{i−1}(x_i − αx_{i−1})
(1/σ̂²) Σ_{i=1}^{n} x_{i−1}(x_i − α̂x_{i−1}) = 0
Σ_{i=1}^{n} x_{i−1}x_i − α̂ Σ_{i=1}^{n} x_{i−1}² = 0
α̂ = Σ_{i=1}^{n} x_{i−1}x_i / Σ_{i=1}^{n} x_{i−1}²
∂/∂σ ln L(α, σ) = −n/σ + (1/σ³) Σ_{i=1}^{n} (x_i − αx_{i−1})²
−n/σ̂ + (1/σ̂³) Σ_{i=1}^{n} (x_i − α̂x_{i−1})² = 0
−nσ̂² + Σ_{i=1}^{n} (x_i − α̂x_{i−1})² = 0
σ̂² = (1/n) Σ_{i=1}^{n} (x_i − α̂x_{i−1})²
A preliminary step:
cov( X t , et ) = cov(a X t -1 + et , et )
= a cov( X t -1, et ) + cov(et , et )
= 0 +s 2
=s2
g 0 = cov( X t , X t )
= cov(a X t -1 + et , X t )
= a cov( X t -1, X t ) + cov(et , X t )
= ag 1 + s 2
g 1 = cov( X t , X t -1)
= cov(a X t -1 + et , X t -1)
= a cov( X t -1, X t -1) + cov(et , X t -1)
= ag 0
g 2 = cov( X t , X t - 2 )
= cov(a X t -1 + et , X t - 2 )
= a cov( X t -1, X t - 2 ) + cov(et , X t - 2 )
= ag 1
In general:
g k = ag k -1 k ≥1
γ̂_0 = α̂γ̂_1 + σ̂²    (1)
γ̂_1 = α̂γ̂_0    (2)
α̂ = γ̂_1 / γ̂_0
σ̂² = γ̂_0 − α̂γ̂_1 = γ̂_0 − γ̂_1²/γ̂_0
γ̂_0 = (1/n) Σ_{i=1}^{n} (x_i − x̄)²
γ̂_1 = (1/n) Σ_{i=1}^{n} (x_i − x̄)(x_{i−1} − x̄)
(v) Comment
(ii) ARIMA time series to fit the observed data in the charts
The ACF cuts off (becomes 0) at all lags greater than 1, whereas the PACF
decays towards 0. Hence we have an MA(1) .
Also:
g 0 = cov ( X t , X t )
Similarly:
g 1 = cov ( X t , X t -1)
= cov (a1X t -1 + a 2 X t - 2 + b1et -1 + et , X t -1)
= a1 cov ( X t -1, X t -1) + a 2 cov ( X t - 2, X t -1)
+ b1 cov (et -1, X t -1) + cov (et , X t -1)
= a1 g 0 + a 2 g 1 + b1s 2 + 0 (2)
g 2 = cov ( X t , X t - 2 )
= a1 g 1 + a 2 g 0 + 0 + 0 (3)
g k = a1 g k -1 + a 2 g k - 2 (4)
(iv) Can the partial auto-correlation function ever give a zero value?
For a MA (q ) process, where q ≥ 1 , the PACF tends towards 0 but does not
completely cut off.
fˆ1 = r1 = 0.68
g 0 = cov(Yt ,Yt )
= cov(a0 + a1Yt -1 + et ,Yt )
= a1g 1 + s 2 (1)
ρ_1 = γ_1/γ_0 = a_1γ_0/γ_0 = a_1
Equating the model lag 1 ACF to the sample lag 1 ACF gives:
aˆ1 = 0.68
Substituting the expression for g 1 from equation (2) into equation (1) gives:
g 0 = a1(a1g 0 ) + s 2
γ_0 = σ²/(1 − a_1²)    (3)
σ̂²/(1 − 0.68²) = 0.9  ⇒  σ̂² = 0.9(1 − 0.68²) = 0.48384
We have:
μ = a_0 + a_1μ
μ = a_0/(1 − a_1)
Equating this to the given sample mean of 1.35 and substituting in â_1 = 0.68 that we calculated, we get:
â_0/(1 − 0.68) = 1.35  ⇒  â_0 = 1.35(1 − 0.68) = 0.432
g 0 = cov(Yt ,Yt )
= cov(a0 + a1Yt -1 + a2Yt - 2 + et ,Yt )
= a1g 1 + a2g 2 + s 2 (1)
g 2 = cov(Yt ,Yt - 2 )
= cov(a0 + a1Yt -1 + a2Yt - 2 + et ,Yt - 2 )
= a1g 1 + a2g 0 (3)
γ_1 − a_2γ_1 = a_1γ_0
⇒ γ_1 = a_1/(1 − a_2) γ_0    (4)
γ_2 = a_1(a_1/(1 − a_2) γ_0) + a_2γ_0 = (a_1²/(1 − a_2) + a_2) γ_0    (5)
ρ_1 = γ_1/γ_0 = a_1/(1 − a_2)
ρ_2 = γ_2/γ_0 = a_1²/(1 − a_2) + a_2
Equating the lag 1 and lag 2 model and sample ACFs gives:
a_1/(1 − a_2) = 0.68    (6)
a_1²/(1 − a_2) + a_2 = 0.55    (7)
â_1 = 0.68 − 0.68 × 73/448 = 255/448 = 0.56920
Substituting γ_1 and γ_2 from equations (4) and (5) into equation (1) gives:
γ_0 = a_1(a_1/(1 − a_2) γ_0) + a_2((a_1²/(1 − a_2) + a_2) γ_0) + σ²
Rearranging:
γ_0 (1 − a_1²/(1 − a_2) − a_1²a_2/(1 − a_2) − a_2²) = σ²
⇒ γ_0 = σ² / (1 − a_1²/(1 − a_2) − a_1²a_2/(1 − a_2) − a_2²)
σ̂² = 0.9 (1 − (255/448)²/(1 − 73/448) − (255/448)²(73/448)/(1 − 73/448) − (73/448)²) = 0.47099
We have:
⇒ μ = a_0 + a_1μ + a_2μ
⇒ μ = a_0/(1 − a_1 − a_2)
Equating this to the given sample mean of 1.35 and substituting in â_1 = 255/448 and â_2 = 73/448, we get:
â_0/(1 − 255/448 − 73/448) = 1.35  ⇒  â_0 = 1.35(1 − 255/448 − 73/448) = 81/224 ≈ 0.36161
Yes stationarity is necessary for both models. Otherwise the mean, variance
and covariances would change over time.
(iv) Markov?
The first model satisfies the Markov property as it only depends on the
previous value of Yt -1 .
However, the second model does not satisfy the Markov property as it also
depends on Yt - 2 .
γ̂_0 = (1/n) Σ_{t=1}^{n} (x_t − μ̂)² = (1/200) × 35.4 = 0.177
γ̂_1 = (1/n) Σ_{t=2}^{n} (x_t − μ̂)(x_{t−1} − μ̂) = (1/200) × 28.4 = 0.142
γ̂_2 = (1/n) Σ_{t=3}^{n} (x_t − μ̂)(x_{t−2} − μ̂) = (1/200) × 17.1 = 0.0855
We have:
ρ̂_1 = γ̂_1/γ̂_0 = 0.142/0.177 = 0.802260
ρ̂_2 = γ̂_2/γ̂_0 = 0.0855/0.177 = 0.483051
Hence:
φ̂_1 = ρ̂_1 = 0.802260
μ̂ = x̄ = (1/200) Σ_{t=1}^{200} x_t = (1/200) × 83.7 = 0.4185
γ_0 = cov(X_t, X_t) = cov(μ + a_1(X_{t−1} − μ) + e_t, X_t) = a_1γ_1 + σ²    (1)
γ_1 = cov(X_t, X_{t−1}) = cov(μ + a_1(X_{t−1} − μ) + e_t, X_{t−1}) = a_1γ_0    (2)
ρ_1 = γ_1/γ_0 = a_1γ_0/γ_0 = a_1
Equating the model lag 1 ACF to the sample lag 1 ACF gives:
â_1 = 0.802260
γ_0 = a_1(a_1γ_0) + σ²  ⇒  γ_0 = σ²/(1 − a_1²)    (3)
σ̂²/(1 − 0.802260²) = 0.177  ⇒  σ̂² = 0.177(1 − 0.802260²) = 0.063079
We are testing:
H0 : the residuals are from a white noise process
H1 : the residuals are not from a white noise process
E(T) = (2/3)(n − 2) = (2/3)(200 − 2) = 132
and var(T) = (16n − 29)/90 = (16 × 200 − 29)/90 = 35.23
(110 − 132)/√35.23 = −3.706
The critical values are ±1.96 , so the test statistic lies in the rejection region.
Hence, we have sufficient evidence at the 5% level (and even at the 0.02%
level) to reject H0 . We have very strong evidence to suggest that the
residuals are not consistent with white noise.
(i) Differencing
We have:
γ_0 = cov(X_t, X_t)
    = cov(e_t + β_1e_{t−1} + β_12e_{t−12} + β_1β_12e_{t−13}, e_t + β_1e_{t−1} + β_12e_{t−12} + β_1β_12e_{t−13})
    = (1 + β_1² + β_12² + β_1²β_12²)σ²
Also:
γ_1 = cov(X_t, X_{t−1}) = β_1(1 + β_12²)σ²
γ_2 = γ_3 = ... = γ_10 = 0
γ_11 = cov(X_t, X_{t−11}) = β_1β_12σ²
γ_12 = cov(X_t, X_{t−12}) = β_12(1 + β_1²)σ²
γ_13 = cov(X_t, X_{t−13}) = β_1β_12σ²
γ_14 = γ_15 = ... = 0
ρ_0 = γ_0/γ_0 = 1
ρ_1 = γ_1/γ_0 = (β_1 + β_1β_12²)/(1 + β_1² + β_12² + β_1²β_12²)
ρ_2 = ρ_3 = ... = ρ_10 = 0
ρ_11 = γ_11/γ_0 = β_1β_12/(1 + β_1² + β_12² + β_1²β_12²)
ρ_12 = γ_12/γ_0 = β_12(1 + β_1²)/(1 + β_1² + β_12² + β_1²β_12²)
ρ_13 = γ_13/γ_0 = β_1β_12/(1 + β_1² + β_12² + β_1²β_12²)
ρ_14 = ρ_15 = ... = 0
1 X t 0.5 0 X t 1 t1
1 Yt 0 0.5 Yt 1 2
t
So we have:
1 0.5 0
M N
1 0 0.5
1 1
M1
1
2 1
X t M1NX t 1 M1εt
Xt 1 1 0.5 0 X t 1 1 1 t1
ie
Y
t 1 2 1 0 0.5 Yt 1 1 2 1 2
t
1 0.5 0.5 X t 1 1 1 t
1
1 0.5
2 0.5 Yt 1 1 2 1 2
t
0.5 0.5
1 2 1 2 X t 1 1 1 t1
0.5 0.5 Yt 1 1 2 1 2
t
1
2
1 2
Setting:
gives:
2 2
0.5 0.5
0
2 1 2
1
0.5 (1 )
2
0.5 0
2 2
0.25 (1 2 ) 2 (1 2 )2 0.25 2 0
2 (1 2 )2 (1 2 ) 0.25(1 2 ) 0
2 (1 2 ) 0.25 0
1 1 4 0.25(1 2 ) 1 2 1
2
2
2(1 ) 2(1 ) 2(1 2 )
1 1
ie or
2(1 ) 2(1 )
1 1
1 and 1
2(1 ) 2(1 )
1 1
2 or 1 21
1 or 3
2 2
1 1
2 or 1 21
21 or 32
1 or 3
2 2
(ii) VAR(p)
Xt X t 1 0 0 X t 2 t1
Yt 0 Yt 1 0 Yt 2 2
t
X t A1X t 1 A 2X t 2 εt
with:
0 0
A1 and A2
0 0
We have:
Y_t = 1 + 0.6Y_{t−1} + 0.16Y_{t−2} + e_t
1 − 0.6λ − 0.16λ² = 0
Solving for λ:
1 − 0.6λ − 0.16λ² = 0  ⇒  λ = −5 or 5/4
Since both |−5| > 1 and 5/4 > 1, the process is stationary.
μ = 1/(1 − 0.6 − 0.16) = 1/0.24 = 4 1/6
γ_1 = 0.6γ_0 + 0.16γ_1  ⇒  0.84γ_1 = 0.6γ_0  ⇒  γ_1 = (5/7)γ_0, ie ρ_1 = 5/7
Substituting γ_1 = (5/7)γ_0 into equation (2) gives:
γ_2 = 0.6(5/7)γ_0 + 0.16γ_0 = (103/175)γ_0, ie ρ_2 = 103/175
φ_1 = ρ_1 = 5/7
φ_2 = (ρ_2 − ρ_1²)/(1 − ρ_1²) = (103/175 − 25/49)/(1 − 25/49) = 4/25
(i) Why s = 12
(1 − (α + β)B + αβB²) Y_t = e_t
So, setting s = 12 removes the seasonal component from the time series.
1 − (α + β)λ + αβλ² = 0
(1 − αλ)(1 − βλ) = 0  ⇒  λ = 1/α, 1/β
|1/α| > 1 and |1/β| > 1
ie:
Y_t = (α + β)Y_{t−1} − αβY_{t−2} + e_t
γ_1 = cov[Y_t, Y_{t−1}] = cov[(α + β)Y_{t−1} − αβY_{t−2} + e_t, Y_{t−1}]
    = (α + β)γ_0 − αβγ_1 + 0    (1)
γ_1 = (α + β)/(1 + αβ) γ_0    (3)
Dividing equations (2) and (3) through by g 0 , we have the following two
equations:
ρ_1 = γ_1/γ_0 = (α + β)/(1 + αβ)
ρ_2 = γ_2/γ_0 = (α + β)ρ_1 − αβ
ρ_1 = (α + β)/(1 + αβ) = 0 and ρ_2 = 0 − αβ = 0.09
So:
ie:
α = −β and α² = 0.09
Hence α = 0.3 or −0.3, and the corresponding values of β are −0.3 or 0.3.
Using the estimated values of α and β from part (ii), the equation for the differenced series is:
Y_t = (α̂ + β̂)Y_{t−1} − α̂β̂Y_{t−2} + e_t = 0.09Y_{t−2} + e_t
ie:
γ_0 = cov(Y_t, Y_t) = cov(μ + αY_{t−1} + e_t, Y_t) = αγ_1 + σ²    (1)
ρ_1 = γ_1/γ_0 = αγ_0/γ_0 = α    (3)
We now equate equation (1) to the sample variance and equation (3) to the
sample ACF at lag 1.
Since the same filter has been applied to both sides of the equation, the
observations of Model A will also satisfy Model B.
However, in order to cancel the (1 - cB) term then (1 - cB)-1 must exist.
Looking at the conditions given on page 2 of the Tables for a convergent
series expansion for (1 + x )p we see that | x | < 1 which means we require
| c | < 1 . This was not included in the examiners’ solution though.
(i) Non-stationarity of X t
E ( X t ) = E (a + bt + Yt ) = a + bt + E (Yt )
E ( X t ) = a + bt + m
(ii) Stationarity of ΔX_t
We have:
γ_Y(s) = cov(Y_t, Y_{t−s})
Since ΔX_t = b + Y_t − Y_{t−1}:
cov(ΔX_t, ΔX_{t−s}) = γ_Y(s) − γ_Y(s + 1) − γ_Y(s − 1) + γ_Y(s)
                    = 2γ_Y(s) − γ_Y(s + 1) − γ_Y(s − 1)
ΔX_t = b + Y_t − Y_{t−1}
     = b + e_t + βe_{t−1} − e_{t−1} − βe_{t−2}
     = b + e_t + (β − 1)e_{t−1} − βe_{t−2}
Hence:
ΔX_t = b + (1 + (β − 1)L − βL²) e_t
(v) Variance of ΔX_t
var(Y_t) = σ² + β²σ² = (1 + β²)σ²
Similarly:
FACTSHEET
univariate (eg MA, AR, ARMA, ARIMA) – just one variable, say, X t
White noise
Stationarity
Invertibility
Purely indeterministic
Markov property
Integrated of order d
Autocovariance function
γ_0 = var(X_t)
γ_k = cov(X_t, X_{t+k})
ρ_k = corr(X_t, X_{t+k}) = γ_k/γ_0,   −1 ≤ ρ_k ≤ 1
φ_1 = ρ_1,   φ_2 = (ρ_2 − ρ_1²)/(1 − ρ_1²),   φ_k given on page 40 of the Tables
X_t = μ + e_t + β_1e_{t−1} + ... + β_q e_{t−q}
always stationary
need to check invertibility
never Markov
X_t = μ + α_1(X_{t−1} − μ) + ... + α_p(X_{t−p} − μ) + e_t
X_t = μ + α_1(X_{t−1} − μ) + ... + α_p(X_{t−p} − μ) + e_t + β_1e_{t−1} + ... + β_q e_{t−q}
X_t = μ + A_1(X_{t−1} − μ) + ... + A_p(X_{t−p} − μ) + e_t
always invertible
VAR(1) is Markov
Cointegration
y_t = (1/12)( ½x_{t−6} + ... + x_{t−1} + x_t + x_{t+1} + ... + ½x_{t+6} )
method of seasonal means, eg subtract the monthly estimate from the
appropriate month.
Box-Jenkins methodology
If the model is a good fit then the residuals, eˆt , will be white noise:
a turning points test can check that the {eˆt } are patternless
Forecasting