
Subject CS2
Revision Notes
For the 2019 exams

Time series
Booklet 8

covering

Chapter 13 Time series 1


Chapter 14 Time series 2

The Actuarial Education Company



CONTENTS

Links to the Course Notes and Syllabus
Overview
Core Reading
Past Exam Questions
Solutions to Past Exam Questions
Factsheet

Copyright agreement

All of this material is copyright. The copyright belongs to Institute and


Faculty Education Ltd, a subsidiary of the Institute and Faculty of Actuaries.
The material is sold to you for your own exclusive use. You may not hire
out, lend, give, sell, transmit electronically, store electronically or photocopy
any part of it. You must take care of your material to ensure it is not used or
copied by anyone at any time.

Legal action will be taken if these terms are infringed. In addition, we may
seek to take disciplinary action through the profession or through your
employer.

These conditions remain in force after you have finished using the course.


LINKS TO THE COURSE NOTES AND SYLLABUS

Material covered in this booklet

Chapter 13 Time series 1


Chapter 14 Time series 2

These chapter numbers refer to the 2019 edition of the ActEd Course Notes.

Syllabus objectives covered in this booklet

The numbering of the syllabus items is the same as that used by the Institute
and Faculty of Actuaries.

2.1 Concepts underlying time series models

2.1.1 Explain the concept and general properties of stationary,


I (0) , and integrated, I (1) , univariate time series.

2.1.2 Explain the concept of a stationary random series.

2.1.3 Explain the concept of a filter applied to a stationary


random series.

2.1.4 Know the notation for backwards shift operator, backwards


difference operator, and the concept of roots of the
characteristic equation of time series.

2.1.5 Explain the concepts and basic properties of


autoregressive (AR), moving average (MA), autoregressive
moving average (ARMA) and autoregressive integrated
moving average (ARIMA) time series.

2.1.6 Explain the concept and properties of discrete random


walks and random walks with normally distributed
increments, both with and without drift.

2.1.7 Explain the basic concept of a multivariate autoregressive


model.

2.1.8 Explain the concept of cointegrated time series.


2.1.9 Show that certain univariate time series models have the
Markov property and describe how to rearrange a
univariate time series model as a multivariate Markov
model.

2.2 Applications of time series models

2.2.1 Outline the processes of identification, estimation and


diagnosis of a time series, the criteria for choosing
between models and the diagnostic tests that might be
applied to the residuals of a time series after estimation.

2.2.2 Describe briefly other non-stationary, non-linear time


series models.

2.2.3 Describe simple applications of a time series model,


including random walk, autoregressive and cointegrated
models as applied to security prices and other economic
variables.

2.2.4 Develop deterministic forecasts from time series data,


using simple extrapolation and moving average models,
applying smoothing techniques and seasonal adjustment
when appropriate.


OVERVIEW

This booklet covers Syllabus objectives 2.1 and 2.2, which relate to time
series.

In this course, we look in detail at four important types of time series:
 moving average (MA) processes
 autoregressive (AR) processes
 autoregressive moving average (ARMA) processes, and
 autoregressive integrated moving average (ARIMA) processes.

For each model we consider the properties of stationarity and invertibility,


and we consider its autocorrelation function and partial autocorrelation
function.

We then go on to discuss how we can fit a time series model to a data set
using the Box-Jenkins methodology, and how to use a model to forecast
future values of a process.

In addition, we briefly consider some more complicated time series models, including multivariate time series and ARCH models.

There are many past exam questions (from Subject CT6) that ask for the
derivation of an autocorrelation function. These questions involve standard
algebra.

Questions on the Box-Jenkins methodology can involve bookwork or the


interpretation of graphs and summary statistics.


CORE READING

All of the Core Reading for the topics covered in this booklet is contained in
this section.

We have inserted paragraph numbers in some places, such as 1, 2, 3 …, to


help break up the text. These numbers do not form part of the Core
Reading.

The text given in Arial Bold font is Core Reading.

The text given in Arial Bold Italic font is additional Core Reading that is not
directly related to the topic being discussed.
____________

Chapter 13 – Time series 1

Univariate time series

1 A univariate time series is a sequence of observations of a single


process taken at a sequence of different times. Such a series can in
general be written as:

x (t1), x (t2 ),  , x (t n ) ie as { x (t i ) : i = 1, 2, 3,  , n }

Most applications involve observations taken at equally-spaced times.


In this case the series is written as:

x1, x 2 ,  , x n ie as { x t : t = 1, 2, 3,  , n }
____________

2 For instance, a sequence of daily closing prices of a given share


constitutes a time series, as does a sequence of monthly inflation
figures.
____________


3 The fact that the observations occur in time order is of prime


importance in any attempt to describe, analyse and model time series
data.

The observations are related to one another and cannot be regarded as


observations of independent random variables. It is this very
dependence amongst the members of the underlying sequence of
variables which any analysis must recognise and exploit.
____________

For example, a list of returns of the stocks in the FTSE 100 index on a
particular day is not a time series, and the order of records in the list is
irrelevant. At the same time, a list of values of the FTSE 100 index
taken at one-minute intervals on a particular day is a time series, and
the order of records in the list is of paramount importance.

Note that the observations x t can arise in different situations. For


example:
 the time scale may be inherently discrete (as in the case of a series
of ‘closing’ share prices)
 the series may arise as a sample from a series observable
continuously through time (as in the case of hourly readings of
atmospheric temperature)
 each observation may represent the results of aggregating a
quantity over a period of time (as in the case of a company’s total
premium income on new business each month).
Figure 13.0: a time series – UK women unemployed (1967-72), in hundreds, plotted against observation index


The purposes of a practical time series analysis may be summarised


as:
 description of the data
 construction of a model which fits the data
 forecasting future values of the process
 deciding whether the process is out of control, requiring action
 for vector time series, investigating connections between two or
more observed processes with the aim of using values of some of
the processes to predict those of the others
____________

4 A univariate time series is modelled as a realisation of a sequence of


random variables:

{ X t : t = 1, 2, 3,  , n }

called a time series process.

(Note, however, that in the modern literature the term ‘time series’ is
often used to mean both the data and the process of which it is a
realisation.)
____________

5 A time series process is a stochastic process indexed in discrete time


with a continuous state space.
____________

The sequence { X t : t = 1, 2,  , n } may be regarded as a subsequence


of a doubly infinite collection { X t : t =  , - 2, - 1, 0, 1, 2, } . This
interpretation will be found to be helpful in investigating notions such
as convergence to equilibrium.
____________

Properties of univariate time series

The concept of stationarity was introduced in Booklet 1, along with the


ideas of strict and weak stationarity.
____________


In the study of time series it is a convention that the word ‘stationary’


on its own is a shorthand notation for ‘weakly stationary’, though in the
case of a multivariate normal process the two forms of stationarity are
equivalent.

But we do need to be careful in our definition, as there are some


processes which we wish to exclude from consideration but which
satisfy the definition of weak stationarity.
____________

6 A process X is called purely indeterministic if knowledge of the values of X_1, ..., X_n is progressively less useful at predicting the value of X_N as N → ∞.
____________

7 When we talk of a ‘stationary time series process’ we shall mean a


weakly stationary purely indeterministic process.
____________

8 A particular form of notation is used for time series: X is said to be


I (0) (read ‘integrated of order 0’) if it is a stationary time series
process, X is I (1) if X itself is not stationary but the increments
Yt = X t - X t - 1 form a stationary process, X is I (2) if it is
non-stationary but the process Y is I (1) , and so on.
____________

The theory of stationary random processes plays an important role in


the theory of time series because the calibration of time series models
(that is, estimation of the values of the model’s parameters using
historical data) can be performed efficiently only in the case of
stationary random processes. A non-stationary random process has to
be transformed into a stationary one before the calibration can be
performed.
____________


Mean, covariance and correlation

9 The mean function (or trend) of the process is μ_t = E(X_t), and the covariance function is cov(X_s, X_t) = E(X_s X_t) - E(X_s)E(X_t).
____________

Both of these functions take a simpler form in the case where X is


stationary.
____________

10 The mean of a stationary stochastic process is constant, ie μ_t ≡ μ for all t.

The covariance of any pair of elements X_r and X_s of a stationary sequence X depends only on the difference r - s.
____________

11 We can therefore define the autocovariance function {γ_k : k ∈ ℤ} of a stationary random process X as follows:

γ_k = cov(X_t, X_{t+k}) = E(X_t X_{t+k}) - E(X_t)E(X_{t+k})

The common variance of the elements of a stationary process is given by:

γ_0 = var(X_t)
____________

12 The autocorrelation function (ACF) of a stationary process is defined by:

ρ_k = corr(X_t, X_{t+k}) = γ_k / γ_0
____________

13 The ACF of a purely indeterministic process satisfies ρ_k → 0 as k → ∞.
____________


14 The autocovariance function γ and autocorrelation function ρ of a stationary random process are even functions of k, that is, γ_k = γ_{-k} and ρ_k = ρ_{-k}.

Since the autocovariance function γ_k = cov(X_t, X_{t+k}) does not depend on t, we have:

γ_k = cov(X_{t-k}, X_{t-k+k}) = cov(X_{t-k}, X_t) = cov(X_t, X_{t-k}) = γ_{-k}

Thus γ is an even function, which in turn implies that ρ is even.


____________

15 Another important characteristic of a stationary random process is the partial autocorrelation function (PACF) {φ_k : k = 1, 2, ...}, defined as the conditional correlation of X_{t+k} with X_t given X_{t+1}, ..., X_{t+k-1}.
____________

This may be derived as the coefficient φ_{k,k} in the problem to minimise:

E[ (X_t - φ_{k,1} X_{t-1} - φ_{k,2} X_{t-2} - ... - φ_{k,k} X_{t-k})² ]

The formula for calculating φ_k involves a ratio of determinants of large matrices whose entries are determined by ρ_1, ..., ρ_k; it may be found in standard works on time series analysis, and is readily available in common computer packages like R.


Figure 13.1: ACF and PACF values of some stationary time series
model.
____________

16 In particular the formulae for φ_1 and φ_2 are as follows:

φ_1 = ρ_1,    φ_2 = det([[1, ρ_1], [ρ_1, ρ_2]]) / det([[1, ρ_1], [ρ_1, 1]]) = (ρ_2 - ρ_1²) / (1 - ρ_1²)
____________

Note that for each k, φ_k depends only on ρ_1, ρ_2, ..., ρ_k.
____________

Operators

Further discussion of the various models will be helped by the use of


two operators which operate on the whole time series process X .
____________


17 The backwards shift operator, B, acts on the process X to give a process BX such that:

(BX)_t = X_{t-1}
____________

18 The difference operator, ∇, is defined as ∇ = 1 - B, or in other words:

(∇X)_t = X_t - X_{t-1}
____________

19 Both operators can be applied repeatedly. For example:

(B²X)_t = (B(BX))_t = (BX)_{t-1} = X_{t-2}

(∇²X)_t = (∇X)_t - (∇X)_{t-1} = X_t - 2X_{t-1} + X_{t-2}

and can be combined as, for example:

(B∇X)_t = (B(1 - B)X)_t = (BX)_t - (B²X)_t = X_{t-1} - X_{t-2}
____________

The usefulness of both of these operators will become apparent in later


sections.

The R commands for generating the differenced values of some time series x are:

diff(x,lag=1,differences=1)

for the ordinary difference ∇,

diff(x,lag=1,differences=3)

for differencing three times, ∇³, and:

diff(x,lag=12,differences=1)

for a simple seasonal difference with period 12, ∇_12 (see later).
____________


White noise

A simple class of weakly stationary random processes is the white


noise processes.
____________

20 A random process {e_t : t ∈ ℤ} is a white noise process if E(e_t) = 0 for any t, and:

γ_k = cov(e_t, e_{t+k}) = σ² if k = 0, and 0 otherwise
____________

An important representative of the white noise processes is a


sequence of independent normal random variables with common
mean 0 and variance σ².
____________
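
The following lines in R (an illustrative sketch, not part of the Core Reading) simulate Gaussian white noise and plot its sample ACF, which should be close to zero at every non-zero lag:

set.seed(1)
e <- rnorm(500, mean = 0, sd = 1)   # white noise with sigma^2 = 1
acf(e, main = "Sample ACF of simulated white noise")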

Main linear models of time series

21 The main linear models used for modelling stationary time series are:
 Autoregressive process (AR)
 Moving average process (MA)
 Autoregressive moving average process (ARMA).
____________

The definitions of each of these processes, presented below, involve


the standard zero-mean white noise process {et : t = 1, 2, } defined
above.

In practice we often wish to model processes which are not I (0)


(stationary) but I (1) .
____________

22 For this purpose a further model is considered:

 Autoregressive integrated moving average (ARIMA).


____________


23 An autoregressive process of order p (the notation AR ( p) is


commonly used) is a sequence of random variables { X t } defined
consecutively by the rule:

X_t = μ + α_1(X_{t-1} - μ) + α_2(X_{t-2} - μ) + ... + α_p(X_{t-p} - μ) + e_t

Thus the autoregressive model attempts to explain the current value of


X as a linear combination of past values with some additional
externally generated random variation. The similarity to the procedure
of linear regression is clear, and explains the origin of the name
‘autoregression’.
____________

24 A moving average process of order q , denoted MA(q ) , is a sequence


{ X t } defined by the rule:

X_t = μ + e_t + β_1 e_{t-1} + ... + β_q e_{t-q}

The moving average model explains the relationship between the X t


as an indirect effect, arising from the fact that the current value of the
process results from the recently passed random error terms as well as
the current one. In this sense, X t is ‘smoothed noise’.
____________

25 The two basic processes (AR and MA) can be combined to give an
autoregressive moving average, or ARMA, process. The defining
equation of an ARMA( p, q ) process is:

X_t = μ + α_1(X_{t-1} - μ) + ... + α_p(X_{t-p} - μ) + e_t + β_1 e_{t-1} + ... + β_q e_{t-q}

Note: ARMA( p,0) is AR ( p) ; ARMA(0, q ) is MA(q ) .


____________


AR (1) processes

The simplest autoregressive process is the AR (1) , given by:

X_t = μ + α(X_{t-1} - μ) + e_t    (13.1)
____________

26 A process satisfying this recursive definition can be represented as:

X_t = μ + α^t (X_0 - μ) + Σ_{j=0}^{t-1} α^j e_{t-j}    (13.2)
____________

27 It follows that the mean function μ_t is given by:

μ_t = μ + α^t (μ_0 - μ)
____________

28 The same representation (13.2) gives the variance:

var(X_t) = σ² (1 - α^{2t}) / (1 - α²) + α^{2t} var(X_0)

where, as before, σ² denotes the common variance of the white noise terms {e_t}.
____________

29 From this it follows that a stationary process X satisfying (13.1) can only exist if |α| < 1.
____________


30 Further requirements are that μ_0 = μ and that var(X_0) = σ² / (1 - α²). Notice that this implies that X can only be stationary if X_0 is random.

If X_0 is a known constant, then var(X_0) = 0 and var(X_t) is no longer independent of t, whereas if X_0 has expectation different from μ then the process X will have non-constant expectation.
____________

31 It is easy to see that the difference μ_t - μ is a multiple of α^t and that var(X_t) - σ² / (1 - α²) is a multiple of α^{2t}. Both of these terms will decay away to zero for large t if |α| < 1, implying that X will be virtually stationary for large t.
____________

32 In this context it is often helpful to assume that X_1, ..., X_n is merely a subsequence of a process ..., X_{-1}, X_0, X_1, ..., X_n which has been going on unobserved for a long time and has already reached a 'steady state' by the time of the first observation.
____________

A double-sided infinite process satisfying (13.1) can be represented as:

X_t = μ + Σ_{j=0}^∞ α^j e_{t-j}    (13.3)
____________

33 This representation makes it clear that X_t has expectation μ and variance equal to:

Σ_{j=0}^∞ α^{2j} σ² = σ² / (1 - α²)   if |α| < 1
____________
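
As an illustrative check (a sketch with assumed values α = 0.7 and σ² = 1), the stationary variance can be confirmed by simulating a long AR(1) series in R:

set.seed(2)
x <- arima.sim(n = 100000, model = list(ar = 0.7))   # sigma^2 = 1 by default
var(x)            # sample variance of the simulated series
1 / (1 - 0.7^2)   # theoretical value sigma^2/(1 - alpha^2), approximately 1.96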


In order to deduce that X is stationary we also need to calculate the autocovariance function:

γ_k = cov(X_t, X_{t+k}) = Σ_{j=0}^∞ Σ_{i=0}^∞ α^i α^j cov(e_{t-j}, e_{t+k-i}) = Σ_{j=0}^∞ σ² α^{2j+k} = α^k γ_0

This is independent of t, and thus a stationary process exists as long as |α| < 1.

It is worth introducing here a method of more general utility for


calculating autocovariance functions.
____________

34 From (13.1) we have, assuming that X is stationary:

γ_k = cov(X_t, X_{t-k}) = cov(μ + α(X_{t-1} - μ) + e_t, X_{t-k}) = α cov(X_{t-1}, X_{t-k}) = α γ_{k-1}

implying that:

γ_k = α^k γ_0 = α^k σ² / (1 - α²)   for k ≥ 0
____________

35 So:

ρ_k = γ_k / γ_0 = α^k   for k ≥ 0
____________


36 The partial autocorrelation function φ_k is given by:

φ_1 = ρ_1 = α,    φ_2 = (α² - α²) / (1 - α²) = 0

Indeed, since the best linear estimator of X_t given X_{t-1}, X_{t-2}, X_{t-3}, ... is just αX_{t-1}, the definition of the PACF implies that φ_k = 0 for all k > 1. Notice the contrast with the ACF, which decreases geometrically towards 0.
____________

The following lines in R generate the ACF and PACF functions for an
AR (1) model:

par(mfrow=c(1,2))

barplot(ARMAacf(ar=0.7,lag.max = 12)[-1],main = "ACF


of AR(1)",col="red")

barplot(ARMAacf(ar=0.7,lag.max = 12,pacf = TRUE),main


= "PACF of AR(1)",col="red")

Figure 13.2: ACF and PACF of AR(1) with α = 0.7


____________


One of the well-known applications of a univariate autoregressive


model is the description of the evolution of the consumer price index
{Q_t : t = 1, 2, 3, ...}. The force of inflation, r_t = ln(Q_t / Q_{t-1}), is assumed to follow the AR(1) process:

r_t = μ + α(r_{t-1} - μ) + e_t

One initial condition, the value for r0 , is required for the complete
specification of the model for the force of inflation rt .
____________

AR ( p) processes

The equation of the more general AR ( p) process is:

X_t = μ + α_1(X_{t-1} - μ) + α_2(X_{t-2} - μ) + ... + α_p(X_{t-p} - μ) + e_t    (13.4)
____________

37 In terms of the backwards shift operator:

(1 - α_1 B - α_2 B² - ... - α_p B^p)(X - μ) = e    (13.5)
____________

As seen for AR(1), there are some restrictions on the values of the α_j
which are permitted if the process is to be stationary. In particular, we
have the following result.
____________

38 If the time series process X given by (13.4) is stationary, then the roots of the equation:

1 - α_1 z - α_2 z² - ... - α_p z^p = 0

are all greater than 1 in absolute value.


(The polynomial 1 - α_1 z - α_2 z² - ... - α_p z^p is called the characteristic polynomial of the autoregression.)
____________

Proof

If X is stationary then its autocovariance function satisfies:

γ_k = cov(X_t, X_{t-k}) = cov( Σ_{j=1}^p α_j X_{t-j} + e_t, X_{t-k} ) = Σ_{j=1}^p α_j γ_{k-j}

for k ≥ p. This is a pth order difference equation with constant coefficients; it has a solution of the form:

γ_k = Σ_{j=1}^p A_j z_j^{-k}

for all k ≥ 0, where z_1, ..., z_p are the p roots of the characteristic polynomial and A_1, ..., A_p are constants. As X is purely indeterministic, we must have γ_k → 0, which requires that |z_j| > 1 for each j.

The converse of this result is also true (but the proof is not given here): if the roots of the characteristic polynomial are all greater than 1 in absolute value, then it is possible to construct a stationary process X satisfying (13.4). In order for an arbitrary process X satisfying (13.4) to be stationary, the variances and covariances of the initial values X_0, X_{-1}, ..., X_{-p+1} must also be equal to the appropriate values.

Often exact values for the γ_k are required, entailing finding the values of the constants A_k.
____________


39 From (13.4) we have:

cov(X_t, X_{t-k}) = α_1 cov(X_{t-1}, X_{t-k}) + ... + α_p cov(X_{t-p}, X_{t-k}) + cov(e_t, X_{t-k})

which can be re-expressed as:

γ_k = α_1 γ_{k-1} + α_2 γ_{k-2} + ... + α_p γ_{k-p} + σ² 1_{k=0}

for 0 ≤ k ≤ p. (These are known as the Yule-Walker equations.) Here the notation 1_{k=0} denotes an indicator function, taking the value 1 if k = 0 and the value 0 otherwise.
____________

40 For p = 3 we have 4 equations:

γ_3 = α_1 γ_2 + α_2 γ_1 + α_3 γ_0
γ_2 = α_1 γ_1 + α_2 γ_0 + α_3 γ_1
γ_1 = α_1 γ_0 + α_2 γ_1 + α_3 γ_2
γ_0 = α_1 γ_1 + α_2 γ_2 + α_3 γ_3 + σ²
____________

The second and third of these equations are sufficient to deduce γ_2 and γ_1 in terms of γ_0, which is all that is required to find ρ_2 and ρ_1. The first and fourth of the equations are needed when the values of the γ_k are to be found explicitly.
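
As an illustrative check (the AR(3) coefficients below are assumed purely for the example), R's ARMAacf function returns the theoretical ACF, which satisfies the second and third equations above once they are divided through by γ_0:

a <- c(0.4, 0.2, 0.1)                      # assumed alpha_1, alpha_2, alpha_3
rho <- ARMAacf(ar = a, lag.max = 3)[-1]    # theoretical rho_1, rho_2, rho_3
rho[1] - (a[1] + a[2]*rho[1] + a[3]*rho[2])   # approximately 0
rho[2] - (a[1]*rho[1] + a[2] + a[3]*rho[1])   # approximately 0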


The PACF, {φ_k : k ≥ 1}, of the AR(p) process can be calculated from the defining equations, but is not memorable. In particular, the first three equations above can be written in terms of ρ_1, ρ_2 and ρ_3, and the resulting solution for α_3 as a function of ρ_1, ρ_2, ρ_3 is the expression for φ_3. The same idea applies to all values of k, so that φ_k is the solution for α_k in a system of k linear equations, including the expressions φ_1 = ρ_1 and φ_2 = (ρ_2 - ρ_1²) / (1 - ρ_1²) that we have seen before.
____________

41 It is important to note, though, that:

φ_k = 0 for all k > p


____________

42 This property of the PACF is characteristic of autoregressive


processes and forms the basis of the most frequently used test for
determining whether an AR ( p) model fits the data.
____________

It would be difficult to base a test on the ACF as the ACF of an


autoregressive process is a sum of geometrically decreasing
components. (See later.)
____________

MA(1) processes

A first-order moving average, denoted MA(1) , is a process given by:

X_t = μ + e_t + β e_{t-1}
____________

43 The mean of this process is:

μ_t = μ
____________


44 The variance and autocovariance are:

γ_0 = var(e_t + β e_{t-1}) = (1 + β²) σ²

γ_1 = cov(e_t + β e_{t-1}, e_{t-1} + β e_{t-2}) = β σ²

γ_k = 0   for k > 1
____________

45 Hence the ACF of the MA(1) process is:

ρ_0 = 1

ρ_1 = β / (1 + β²)

ρ_k = 0   for k > 1
____________
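
As a quick illustration (β = 0.5 is an assumed value), the theoretical ACF of an MA(1) can be generated in R and compared with the formula above:

beta <- 0.5
ARMAacf(ma = beta, lag.max = 4)   # lag-1 value is 0.4; zero thereafter
beta / (1 + beta^2)               # 0.4, agreeing with rho_1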

46 An MA(1) process is stationary regardless of the values of its


parameters.
____________

The parameters are nevertheless usually constrained by imposing the


condition of invertibility. This may be explained as follows.

It is possible to have two distinct MA(1) models with identical ACFs: consider, for example, β = 0.5 and β = 2, both of which have ρ_1 = β / (1 + β²) = 0.4.
____________


47 The defining equation of the MA(1) may be written in terms of the backwards shift operator:

X - μ = (1 + βB)e    (13.6)

In many circumstances an autoregressive model is more convenient than a moving average model. We may rewrite (13.6) as:

(1 + βB)^{-1}(X - μ) = e

and use the standard expansion of (1 + βB)^{-1} to give:

(X_t - μ) - β(X_{t-1} - μ) + β²(X_{t-2} - μ) - β³(X_{t-3} - μ) + ... = e_t

The original moving average model has therefore been transformed into an autoregression of infinite order. But this procedure is only valid if the sum on the left-hand side is convergent, in other words if |β| < 1. When this condition is satisfied the MA(1) is called invertible.
____________

Although more than one MA process may share a given ACF, at most
one of the processes will be invertible.

It is possible, at the cost of considerable effort, to calculate the PACF of the MA(1), giving:

φ_k = (-1)^{k+1} (1 - β²) β^k / (1 - β^{2(k+1)})
____________

48 This decays approximately geometrically as k Æ • , highlighting the


way in which the ACF and PACF are complementary: the PACF of a
MA(1) behaves like the ACF of an AR(1), the PACF of an AR(1) behaves
like the ACF of a MA(1).


Figure 13.3: ACF and PACF of MA(1) with β = 0.7


____________

MA(q ) processes

49 The defining equation of the general q th order moving average is, in


backwards shift notation:

X - μ = (1 + β_1 B + β_2 B² + ... + β_q B^q)e
____________

The autocovariance function is easier to find than in the case of AR(p):

γ_k = Σ_{i=0}^q Σ_{j=0}^q β_i β_j E(e_{t-i} e_{t-j-k}) = σ² Σ_{i=0}^{q-k} β_i β_{i+k}

as long as k ≤ q. (Here β_0 denotes 1.)


____________

50 For k > q it is obvious that γ_k = 0.


____________


51 Just as autoregressive processes are characterised by the property


that the partial ACF is equal to zero for sufficiently large k , moving
average processes are characterised by the property that the ACF is
equal to zero for sufficiently large k .
____________

52 Although there may be many moving average processes with the same
ACF, at most one of them is invertible, since no two invertible
processes have the same autocorrelation function. Moving average
models fitted to data by statistical packages will always be invertible.
____________

ARMA processes

A combination of the moving average and autoregressive models, an


ARMA model includes direct dependence of X t on both past values
of X and present and past values of e .

The defining equation is:

X_t = μ + α_1(X_{t-1} - μ) + ... + α_p(X_{t-p} - μ) + e_t + β_1 e_{t-1} + ... + β_q e_{t-q}

or, in backwards shift operator notation:

(1 - α_1 B - ... - α_p B^p)(X - μ) = (1 + β_1 B + ... + β_q B^q)e
____________

53 Neither the ACF nor the PACF of the ARMA process eventually
becomes equal to zero.

This makes it more difficult to identify an ARMA model than either a


pure autoregression or a pure moving average.
____________


It is possible to calculate the ACF by a method similar to the method


employing the Yule-Walker equations for the ACF of an autoregression.

We will show that the autocorrelation function of the stationary zero-mean ARMA(1,1) process:

X_t = α X_{t-1} + e_t + β e_{t-1}    (13.7)

is given by:

ρ_1 = (1 + αβ)(α + β) / (1 + β² + 2αβ)

ρ_k = α^{k-1} ρ_1,   k = 2, 3, ...

Figure 13.1 shows the ACF and PACF values of such a process with α = 0.7 and β = 0.5.
____________

54 Using equation (13.7):

cov(X_t, e_t) = α cov(X_{t-1}, e_t) + cov(e_t, e_t) + β cov(e_{t-1}, e_t) = σ²

since e_t is independent of both e_{t-1} and X_{t-1}. Similarly:

cov(X_t, e_{t-1}) = α cov(X_{t-1}, e_{t-1}) + cov(e_t, e_{t-1}) + β cov(e_{t-1}, e_{t-1}) = (α + β)σ²

This enables us to deduce the autocovariance function of X.

Again from (13.7):

cov(X_t, X_t) = α cov(X_{t-1}, X_t) + cov(e_t, X_t) + β cov(e_{t-1}, X_t)

cov(X_t, X_{t-1}) = α cov(X_{t-1}, X_{t-1}) + cov(e_t, X_{t-1}) + β cov(e_{t-1}, X_{t-1})

Also, for k > 1:

cov(X_t, X_{t-k}) = α cov(X_{t-1}, X_{t-k}) + cov(e_t, X_{t-k}) + β cov(e_{t-1}, X_{t-k})

So:

γ_0 = αγ_1 + (1 + αβ + β²)σ²

γ_1 = αγ_0 + βσ²

γ_k = αγ_{k-1}

The solution is:

γ_0 = (1 + 2αβ + β²)σ² / (1 - α²)

γ_1 = (α + β)(1 + αβ)σ² / (1 - α²)

γ_k = α^{k-1}γ_1,   k = 2, 3, ...

assuming that the process is stationary, ie that |α| < 1.


____________
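
These formulae can be checked numerically in R (a sketch using the Figure 13.1 values α = 0.7 and β = 0.5):

a <- 0.7; b <- 0.5
rho1 <- (1 + a*b)*(a + b) / (1 + b^2 + 2*a*b)
c(rho1, a*rho1, a^2*rho1)                  # rho_1, rho_2, rho_3 from the formulae
ARMAacf(ar = a, ma = b, lag.max = 3)[-1]   # should agree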

ARIMA processes

In many applications the process being modelled cannot be assumed


stationary, but can reasonably be fitted by a model with stationary
increments, that is, if the first difference of X , Y = —X , is itself a
stationary process.

A process X is called an ARIMA( p,1, q ) process if X is non-


stationary but the first difference of X is an ARMA( p, q ) process.


In certain cases it may be considered desirable to continue beyond the


first difference, if the process X is still not stationary after being
differenced once. The notation extends in a natural way.
____________

55 If X needs to be differenced at least d times in order to reduce it to stationarity and if the dth difference Y = ∇^d X is an ARMA(p, q) process, then X is termed an ARIMA(p, d, q) process.
____________

56 In terms of the backwards shift operator, the equation of the ARIMA(p, d, q) process is:

(1 - α_1 B - ... - α_p B^p)(1 - B)^d (X - μ) = (1 + β_1 B + ... + β_q B^q)e


____________

57 Example 1

The simplest example of an ARIMA process is the random walk:

X_t = X_{t-1} + e_t

This can be rewritten as:

X_t = X_0 + Σ_{j=1}^t e_j

The expectation of X_t is equal to E(X_0), but the variance is var(X_0) + tσ², so that X is not itself stationary. The first difference,
however, is given by:

Y_t = ∇X_t = e_t

which certainly is stationary. Thus the random walk is an


ARIMA(0,1, 0) process.
____________


58 To identify the values of p, d and q for which X is an


ARIMA( p, d , q ) process, where:

X_t = 0.6X_{t-1} + 0.3X_{t-2} + 0.1X_{t-3} + e_t - 0.25e_{t-1}

we can write the equation in terms of the backwards shift operator:

(1 - 0.6B - 0.3B² - 0.1B³)X = (1 - 0.25B)e

We now check whether the polynomial on the left-hand side is divisible


by 1 - B ; if so, factorise it out. Continue to do this until the remaining
polynomial is not divisible by 1 - B .

(1 - B)(1 + 0.4B + 0.1B²)X = (1 - 0.25B)e

The model can now be seen to be ARIMA(2,1,1) .


____________
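
One way to carry out this check in R (a sketch based on the example above) is to inspect the roots of the autoregressive polynomial: a root equal to 1 corresponds to a factor of 1 - B, and the remaining roots should have modulus greater than 1.

roots <- polyroot(c(1, -0.6, -0.3, -0.1))   # coefficients of 1 - 0.6z - 0.3z^2 - 0.1z^3
roots         # one root equals 1, giving the factor (1 - B)
Mod(roots)    # the other two roots lie outside the unit circle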

Other examples of ARIMA processes

Example 2

Let Zt denote the closing price of a share on day t . The evolution of


Z is frequently described by the model:

Z_t = Z_{t-1} exp(μ + e_t)

By taking logarithms we see that this model is equivalent to an I(1) model, since Y_t = ln Z_t satisfies the equation:

Y_t = μ + Y_{t-1} + e_t

which is the defining equation of a random walk with drift, because:

Y_t = Y_0 + μt + Σ_{j=1}^t e_j

The model is based on the assumption that the daily returns ln(Z_t / Z_{t-1}) are independent of the past prices Z_0, Z_1, ..., Z_{t-1}.


Example 3

The logarithm of the consumer price index can be described by the


ARIMA(1,1, 0) model:

(1 - B) ln Q_t = μ + α[(1 - B) ln Q_{t-1} - μ] + e_t

When analysing the behaviour of an ARIMA( p,1, q ) model, the


standard technique is to look at the first difference of the process and
to perform the kind of analysis which is suitable for an ARMA model.
Once complete, this can be used to provide predictions for the original,
undifferenced, process.

ARIMA models play a central role in the Box-Jenkins methodology,


which aims to provide a consistent and unified framework for analysis
and prediction using time series models. (See later.)
____________

Markov property

59 As we saw in Booklet 1, if the future development of a process can be


predicted from its present state alone, without any reference to its past
history, it possesses the Markov property. Stated precisely this reads:

P[X_t ∈ A | X_{s_1} = x_1, X_{s_2} = x_2, ..., X_{s_n} = x_n, X_s = x] = P[X_t ∈ A | X_s = x]

for all times s_1 < s_2 < ... < s < t, all states x_1, x_2, ..., x_n, x in S and all subsets A of S.
____________

60 A first-order autoregressive process possesses the Markov property,


since the conditional distribution of X n + 1 given all previous X t
depends only on X n .
____________


61 This property does not apply, however, to higher-order


autoregressions.

Suppose X is an AR(2). X does not possess the Markov property, since the conditional distribution of X_{n+1} given the history of X up until time n depends on X_{n-1} as well as on X_n. But let us define a vector-valued process Y by Y_t = (X_t, X_{t-1})^T. Given the whole history of the process X up until time n, the distribution of Y_{n+1} depends only on the values of X_n and X_{n-1} – in other words, on the value of Y_n. This means that Y possesses the Markov property.

In general an AR(p) does not possess the Markov property (for p > 1) but we may define a vector-valued process Y_t = (X_t, X_{t-1}, ..., X_{t-p+1})^T which does.
____________

Recall from Booklet 1 that a random walk possesses the Markov


property. The discussion of the Markov property for autoregressions
can be extended to include some ARIMA processes such as the
random walk, which has already been shown to be an ARIMA(0,1, 0)
process.
____________

62 An ARIMA(p, d, 0) process does not possess the Markov property (for p + d > 1) but we may define a vector-valued process Y_t = (X_t, X_{t-1}, ..., X_{t-p-d+1})^T which does.
____________


63 A moving average, or more generally an ARIMA(p, d, q) process with q > 0, can never be Markov, since knowledge of the value of X_n, or of any finite collection (X_n, X_{n-1}, ..., X_{n-q+1})^T, will never be enough to deduce the value of e_n, on which the distribution of X_{n+1} depends. Since a moving average has been shown to be equivalent to an autoregression of infinite order, and since a pth order autoregression needs to be expressed as a p-dimensional vector in order to possess the Markov property, a moving average has no similar finite-dimensional Markov representation.
____________


Chapter 14 – Time series 2

Compensating for trend and seasonality

All the methods that we shall investigate apply only to a time series
which gives the appearance of stationarity. In this section, therefore,
we deal with possible sources of non-stationarity and how to
compensate for them.

A simple time series plot in R can be generated as:

ts.plot(x)

where x is some (vector) time series data.


____________

64 Lack of stationarity may be caused by the presence of deterministic


effects in the quantity being observed.

Linear or exponential trends:

Monthly sales figures for a company, which is expanding rapidly,


would be expected to show a steady underlying increase, possibly
linear or perhaps even exponential.

Seasonal variation:

A company, which sells greetings cards, will find that the sales in some
months of the year will be much higher than in others.

In both cases there is an underlying deterministic pattern and some


(possibly stationary) random variation on top of that. In order to
predict sales figures in future months it is necessary to extrapolate the
deterministic trends as well as to analyse the stationary random
variation.

A further cause of non-stationarity may be that the process observed is


an integrated version of a more fundamental process.
In these cases, differencing the observed time series may produce a
series which is more likely to be a realisation of some stationary
process.
____________


65 The most useful tools in identifying non-stationarity are the simplest: a


plot of the series against t , and the sample ACF.

Plotting the series will highlight any obvious trends in the mean and
will show up any cyclic variation, which could also form evidence of
non-stationarity. This should always be the first step in any practical
time series analysis.
____________

The R code below uses the ts.plot function.

ts.plot(log(FTSE100$Close))
points(log(FTSE100$Close),cex=.4)

generates Figure 14.1, which shows the time series of the logs of 300
successive closing values of FTSE100 index.

Figure 14.1: 300 successive closing values of the FTSE100 index,


Jan 2017 – Mar 2018; log-transformed


The corresponding sample ACF and sample PACF are produced using:

par(mfrow=c(1,2))
acf(log(FTSE100$Close))
pacf(log(FTSE100$Close))

These are shown in Figure 14.2 below.

Figure 14.2: Sample ACF and sample PACF of the log(FTSE100) data;
dotted lines indicate cut-offs for significance if data came from some
white noise process.
____________

The sample ACF should, in the case of a stationary time series,


ultimately converge towards zero exponentially fast, as for an AR(1), where ρ_s = α^s.
____________

66 If the sample ACF decreases slowly but steadily from a value near 1,
we would conclude that the data need to be differenced before fitting
the model.
____________


If the sample ACF exhibits a periodic oscillation, however, it would be


reasonable to conclude that there is some underlying cause of the
variation.

Figure 14.2 shows the sample ACF of a time series which is clearly
non-stationary as the values decrease in some linear fashion;
differencing is therefore required before fitting a stationary model.

See, for example, the change of ACF and PACF for the differenced
data:

Figure 14.3: Data plot, sample ACF and sample PACF of


∇ln(FTSE100).
____________


67 The simplest way to remove a linear trend is by ordinary least squares.


This is equivalent to fitting the model:

x t = a + bt + y t

where a and b are constants and y is a zero-mean stationary


process. The parameters a and b can be estimated by linear
regression prior to fitting a stationary model to the residuals y t .
____________
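
A minimal sketch of this in R, using simulated data (the trend values 5 and 0.1 and the AR(1) residual model are assumed for illustration only):

set.seed(3)
t <- 1:200
x <- 5 + 0.1*t + arima.sim(n = 200, model = list(ar = 0.6))
fit <- lm(x ~ t)       # least squares estimates of a and b
y <- residuals(fit)    # detrended series, to be modelled as stationary
ts.plot(y)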

68 Differencing may well be beneficial if the sample ACF decreases slowly


from a value near 1, but has useful effects in other instances as well.
If, for instance, x t = a + bt + y t , then:

∇x_t = b + ∇y_t

so that the differencing has removed the trend in the mean.


____________

Where seasonal variation is present in the data, one way of removing it


is to take a seasonal difference.
____________

69 Suppose that the time series x records the monthly average


temperature in London. A model of the form:

x_t = μ + θ_t + y_t    (14.1)

might be applied, where θ is a periodic function with period 12 and y is a stationary series. The seasonal difference of x is defined as ∇_12 x = (1 - B^12)x and we see that:

(∇_12 x)_t = x_t - x_{t-12} = (μ + θ_t + y_t) - (μ + θ_{t-12} + y_{t-12}) = y_t - y_{t-12}

is a stationary process.
____________


Figure 14.4 below is generated from the following lines in R, where


functions ts.plot, acf and pacf are used:

layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))


ts.plot(manston1$tmax,ylab="",main="Max temperatures
observed at each month (2010-2017), Manston, UK")
points(manston1$tmax,cex=0.4)
acf(manston1$tmax,main="")
pacf(manston1$tmax,main="")

Figure 14.4: Data plot, sample ACF and PACF of temperature data.


Seasonal differencing 12 seems to have removed the seasonal


behaviour of the data. See Figure 14.5 generated from:

layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))


ts.plot(diff(manston1$tmax,lag=12),ylab="",
main="Seasonal differenced temperature data")
points(diff(manston1$tmax,lag=12),cex=0.4)
acf(diff(manston1$tmax,lag=12),main="")
pacf(diff(manston1$tmax,lag=12),main="")

Figure 14.5: Temperature data after appropriate differencing


____________

The monthly inflation figures are obtained by seasonal differencing of the Retail Prices Index. If x_t is the value of the RPI in month t, the annual inflation figure reported is:

(x_t - x_{t-12}) / x_{t-12} × 100%
____________


70 The method of moving averages makes use of a simple linear filter to eliminate the effects of periodic variation. If x is a time series with seasonal effects with even period d = 2h, then we define a smoothed process y by:

y_t = (1/2h)( ½x_{t-h} + x_{t-h+1} + ... + x_{t-1} + x_t + ... + x_{t+h-1} + ½x_{t+h} )

This ensures that each period makes an equal contribution to y_t.

The same can be done with odd periods d = 2h + 1, but the end terms x_{t-h} and x_{t+h} do not need to be halved.
____________
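
A sketch of this filter in R for monthly data (period d = 12, so h = 6), using the stats function filter with half weights at the two ends; here x is assumed to be a monthly series such as manston1$tmax:

h <- 6
w <- c(0.5, rep(1, 2*h - 1), 0.5) / (2*h)      # weights 1/24, 1/12, ..., 1/12, 1/24
y <- stats::filter(x, filter = w, sides = 2)   # centred moving average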

As with most filtering techniques, care must be taken lest the


smoothing of the data obscure the very effects which the procedure is
intended to uncover.
____________

71 The simplest method for removing seasonal variation is to subtract


from each observation the estimated mean for that period, obtained by
simply averaging the corresponding observations in the sample.

For example, when fitting the model in Equation 14.1 to a monthly time series x extending over 10 years from January 1990, the estimate for μ is the sample mean x̄ and the estimate for θ_January is:

θ̂_January = (1/10)(x_1 + x_{13} + x_{25} + ... + x_{109}) - μ̂
____________

In R the function decompose can be used to obtain both the moving


average and seasonal means.

ts.plot(manston1$tmax,ylab="",main="Max
temperatures")
points(manston1$tmax,cex=0.4)


The time series data is plotted as in Figure 14.6 below.


decomp=decompose(ts(manston1$tmax,frequency =
12),type="additive")

The decomposition is saved as decomp.

The moving average can be added (in red) using the code:
lines(as.vector(decomp$trend),col="red")

The sum of seasonal and moving average trends can be added (in blue)
as follows:
lines(as.vector(decomp$seasonal+decomp$trend),
col="blue")

Figure 14.6: Temperature data and its decomposition into moving


average (in red) and seasonal trend (in blue) added.
____________


Diagnostic procedures such as an inspection of a plot of the residuals


may suggest that even the best-fitting standard linear time series
model is failing to provide an adequate fit to the data. Before
attempting to use more advanced non-linear models it is often worth
attempting to transform the data in some straightforward way in an
attempt to find a data set on which the linear theory will work properly.
____________

72 Transformations are most commonly used when a dependence is


suspected between the variance of the residuals and the size of the
fitted values. If, for example, the standard deviation of X t + 1 - X t
appears to be proportional to X t , then it would be appropriate to use
the logarithmic transformation, to work on the time series Y = ln X .
____________

73 In certain applications it may be found that most residuals are small


and negative, with a few large positive values to offset them. This may
be taken to indicate that the distribution of the error terms is
non-normal, leading to doubts as to whether the standard time series
procedures, designed for normal errors, are applicable.

It may be possible to find a transformation which will improve the


normality of the error terms of the transformed process, but care
should be taken that this does not lead to instability in the variance.
____________

A further caution when using transformed data involves the final step
of turning forecasts for the transformed process into forecasts for the
original process, as some transformations introduce a systematic bias.
____________

Identification of MA(q ) and AR ( p) models

The treatment of this section assumes that the sequence of


observations { x1, x 2 ,  , x n } may be presumed to come from a
stationary time series process. The problems of how to tell if the
assumption of stationarity is reasonable and what to do if it is not have
been treated in the previous section.


Estimation of the ACF and PACF

The autocovariance and autocorrelation functions, as seen above, play


a central role in the analysis of time series. Other descriptive tools,
such as the partial autocorrelation function, are derived from the ACF.
Faced, then, with a sequence of observations { x1, x 2 ,  , x n } and the
task of finding a time series model to fit the sequence, a primary
concern must be to estimate the ACF of the time series process of
which the data form a realisation.
____________

74 The common mean of a stationary model can be estimated using the


sample mean:

μ̂ = (1/n) Σ_{t=1}^n x_t
____________

75 The autocovariance function γ_k can be estimated using the sample autocovariance function, denoted c_k or γ̂_k, given by:

γ̂_k = (1/n) Σ_{t=k+1}^n (x_t - μ̂)(x_{t-k} - μ̂)
____________

76 Estimates for the autocorrelation function ρ_k are given by:

r_k = γ̂_k / γ̂_0

The collection {r_k : k ∈ ℤ} is called the sample autocorrelation function (SACF). Every time series analysis involves at least one plot of r_k against k. Such a plot is called a correlogram.
____________
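
The sample autocovariance formula can be checked directly against R's acf function (a sketch with simulated data; the AR(1) model used is assumed for illustration):

set.seed(4)
x <- arima.sim(n = 200, model = list(ar = 0.5))
n <- length(x); m <- mean(x); k <- 3
sum((x[(k+1):n] - m) * (x[1:(n-k)] - m)) / n           # gamma-hat_3 from the formula
acf(x, type = "covariance", plot = FALSE)$acf[k + 1]   # same value from acf()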


77 The partial autocorrelation function φ_k can be estimated using the formula involving the ratio of determinants to which reference was made earlier, but with the ρ_k replaced by their estimates r_k. The resulting function φ̂_k, called the sample partial autocorrelation function (SPACF), and the plot of φ̂_k against k, called the partial correlogram, are as important as the SACF and the correlogram in the analysis of time series.
____________

As we have seen before, R functions acf and pacf can be used for
generating these values.

For example, the following lines simulate observations from an


ARMA(1,1) model.

Set the seed to guarantee reproducibility. The code is:

set.seed(123)

Call the simulated data x :

x=arima.sim(n=300,model=list(ar=0.7,ma=0.5))

Then:

par(mfrow=c(1,2))
acf(x,main="Sample ACF")
pacf(x,main="Sample PACF")

produces the graphs below.


Figure 14.7: ACF and PACF of some simulated data from ARMA(1,1) .
____________

Identification of white noise

A test for whether a particular sequence of observations forms a


standard white noise process may seem of doubtful usefulness, but
one of the techniques of residual analysis suggests that the
verification of goodness of fit of any model should include a test as to
whether the residuals form a white noise process. A suitable test, or
portfolio of tests, is therefore a valuable asset.

Clearly the SACF and SPACF of a white noise process are random,
being simple functions of the observations. In particular, even if the
original process was a perfectly standard white noise the SACF and
SPACF would not be identically zero. The question is what scale of
deviation from zero is to be expected.
____________


78 An asymptotic result states that, if the original model is white noise:

X_t = μ + e_t

then the estimators r_k and φ̂_k are approximately normally distributed with mean 0 and variance 1/n for each k.
____________

79 Values of the SACF or SPACF falling outside the range from -2/√n to +2/√n can be taken as suggesting that the white noise model is inappropriate. This range is indicated by dashed lines in the standard output in R for ACF and PACF.

But some care should be exercised: the cut-off points of ±2/√n give approximate 95% limits, implying that about one value in 20 will fall outside the range even when the white noise model is correct. This means that one single value of r_k or φ̂_k outside the specified range would not be regarded as significant on its own, but three such values might well be significant.
____________

80 A ‘portmanteau’ test is due to Ljung and Box, who state that, if the white noise model is correct, then:

n(n + 2) Σ_{k=1}^m r_k² / (n - k) ~ χ²_m

for each m.
____________

The standard commands for running these tests in R on some


observations (simulated white noise here) are:

x <- rnorm (100)


Box.test (x, lag = 1, type = "Ljung")
____________


Identification of MA(q )

The distinguishing characteristic of MA(q) is that ρ_k = 0 for all k > q. A test for the appropriateness of a MA(q) model, therefore, is that r_k is close to 0 for all k > q.
____________

81 If the data really do come from a MA(q ) model, the estimators r k for
k > q will be roughly normally distributed with mean 0 and variance

1Ê q ˆ
Á 1 + 2 Â rk ˜ .
2
nË k =1 ¯
____________

This asymptotic result enables a test to be formulated.


____________

Identification of AR ( p )

82 The corresponding diagnostic procedure for an autoregressive model


is based on the sample partial ACF, since the PACF of an AR ( p ) is
distinctive, being equal to zero for k > p .

The asymptotic variance of fk is 1 n for each k > p . Again a normal


approximation can be used, so that values of the SPACF outside the
range ±2 n may suggest that the AR ( p) model is inappropriate.
____________

The Box-Jenkins methodology

In this section we consider the general class of autoregressive


integrated moving average models – the ARIMA( p, d , q ) models. As
usual we assume that historical data, comprising a time series
{x_t : t = 1, 2, ..., n}, are given.
____________


83 The Box-Jenkins approach allows one to find an ARIMA model which is


reasonably simple and provides a sufficiently accurate description of
the behaviour of the historical data.

The main steps of the approach are:


 Tentative identification of a model from the ARIMA class.
 Estimation of parameters in the identified model.
 Diagnostic checks.

If the tentatively identified model passes the diagnostic tests, the


model is ready to be used for forecasting. If it does not, the diagnostic
tests should indicate how the model ought to be modified, and a new
cycle of identification, estimation and diagnosis is performed.
____________

Identifying p , d and q

An ARIMA( p, d , q ) model is completely identified by the choice of


non-negative integer values for the parameters p , d , and q . The
parameter d is the number of times we have to difference the time
series x to convert it to some stationary level. The following
principles can be used to choose the appropriate value of d .
____________

84 A time series X can be modelled by a stationary ARMA model if the


sample autocorrelation function rk decays rapidly to zero with k .
____________

85 If, on the other hand, a slowly decaying positive sample autocorrelation


function rk is observed, this should be taken to indicate that the time
series needs to be differenced to convert it into a likely realisation of a
stationary random process.
____________


86 Let σ̂_d² denote the sample variance of the process z^(d) = ∇^d x, ie the sample variance of the data after they have been differenced d times. It is normally the case that σ̂_d² first decreases with d until stationarity is achieved and then starts to increase. Therefore d can be set to the value which minimises σ̂_d². This could be d = 0 if the original time series x is already stationary.
____________
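
A sketch of this in R, using a simulated random walk for illustration (the series x is assumed):

set.seed(5)
x <- cumsum(rnorm(300))          # a random walk, so d = 1 is appropriate
var(x)                           # d = 0: large
var(diff(x, differences = 1))    # d = 1: much smaller
var(diff(x, differences = 2))    # d = 2: increases again, so choose d = 1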

Suppose now that the appropriate value for the parameter d has been found, and the time series {z_{d+1}, z_{d+2}, ..., z_n} is adequately stationary. (Notice that a differenced series has one fewer observation than the original series.) We shall assume throughout this section that the sample mean of the z sequence is zero; if this is not the case, obtain a new sequence by subtracting μ̂ = z̄ from each value in the sequence. We shall also assume, for the sake of simplicity in setting down the lower and upper limits of sums, that d = 0.

In the framework of the Box-Jenkins approach we try to find an


ARMA( p, q ) model which fits the data z .
____________

87 If either the correlogram or the partial correlogram appears to be close


to zero for sufficiently large k , an MA(q ) or AR ( p ) model is indicated.
____________

88 Otherwise we should look for an ARMA( p, q ) model with non-zero


values of p and q . A good indicator for possible values of p and q
in an ARMA( p, q ) is the number of spikes in the ACF and PACF until
some geometrical decay to zero is observed. Since models can be
readily fitted in R, it is not hard to start with a simple model like
ARMA(1,1) and to work up to more complicated models if the simple
ones are deemed inadequate.
____________


89 Every additional parameter improves the fit of the model by reducing


the residual sum of squares. Taking this to extremes, a model with n
parameters could be found to fit the data exactly. But this will result in
some spurious model with insignificant t -values of parameter
estimates and the forecasts made with such a model will be found to
be practically useless. This is known as the problem of overfitting.
____________

90 The question of when to stop adding new parameters is addressed by Akaike's information criterion (AIC), which states that we should only consider adding an extra parameter if this results in a reduction of the residual sum of squares by a factor of at least e^{-2/n}. Alternatively, one can evaluate for each possible model the value of:

AIC(model) = log(σ̂²) + 2 × (number of parameters) / n

and choose as the most appropriate the one corresponding to the lowest such value.
____________
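
As a sketch, candidate models can be compared in R using the AIC reported by arima (this is computed from the log-likelihood rather than from log(σ̂²), but it is used in the same way, ie the lowest value is preferred). The data here are simulated as in the earlier ARMA(1,1) example:

set.seed(123)
x <- arima.sim(n = 300, model = list(ar = 0.7, ma = 0.5))
AIC(arima(x, order = c(1, 0, 0)))   # AR(1)
AIC(arima(x, order = c(0, 0, 1)))   # MA(1)
AIC(arima(x, order = c(1, 0, 1)))   # ARMA(1,1), usually the smallest here
AIC(arima(x, order = c(2, 0, 2)))   # ARMA(2,2)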

Parameter estimation

Once the values of p and q have been identified, the problem


becomes to estimate the values of the parameters α_1, α_2, ..., α_p and β_1, β_2, ..., β_q for the ARMA(p, q) model:

Z_t = α_1 Z_{t-1} + ... + α_p Z_{t-p} + e_t + β_1 e_{t-1} + ... + β_q e_{t-q}

Least squares estimation suggests itself. This is equivalent to


maximum likelihood estimation if the et may be assumed to be
normally distributed.
____________


91 In the case of an AR(p) we have:

e_t = Z_t - α_1 Z_{t-1} - ... - α_p Z_{t-p}

and the estimators α̂_1, ..., α̂_p are chosen to minimise:

Σ_{t=p+1}^n (z_t - α_1 z_{t-1} - ... - α_p z_{t-p})²
____________
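
A minimal sketch of this least squares calculation in R for an AR(2), using a linear regression on lagged values (the simulated data and parameter values are assumed for illustration):

set.seed(7)
z <- arima.sim(n = 500, model = list(ar = c(0.5, 0.3)))
lagged <- embed(z, 3)                                   # columns: z_t, z_{t-1}, z_{t-2}
fit <- lm(lagged[, 1] ~ lagged[, 2] + lagged[, 3] - 1)  # no intercept, zero-mean data
coef(fit)                                               # estimates of alpha_1, alpha_2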

92 In the case of a more general ARMA process we encounter the difficulty that the e_t cannot be deduced from the z_t. For example, in the case of ARMA(1,1) we have:

e_t = z_t - α_1 z_{t-1} - β_1 e_{t-1}

an equation which can be solved iteratively for e_t as long as some starting value e_0 is assumed. For an ARMA(p, q) the list of starting values is (e_0, ..., e_{q-1}). The starting values need to be estimated, which is usually carried out by a recursive technique.
____________

93 First assume they are all equal to zero and estimate the α_i and β_j on that basis, then use standard forecasting techniques on the time-reversed process {z_n, ..., z_1} to obtain predicted values for (e_0, ..., e_{q-1}), a method known as backforecasting. These new values can be used as the starting point for another application of the estimation procedure; this continues until the estimates have converged.
____________

In Figure 14.7, the ACF and PACF plots show some significant spikes
at the early lags, suggesting the presence of both autoregressive and
moving average components.


The code:

fit=arima(x,order=c(1,0,1));fit

fits the ARIMA(1,0,1) to this data set, with standard output:

arima(x = x, order = c(1, 0, 1))

Coefficients:
         ar1     ma1  intercept
      0.6118  0.5849     0.0911
s.e.  0.0530  0.0600     0.2224

sigma^2 estimated as 0.9016:  log likelihood = -410.9,  aic = 829.8

where the estimated parameters ar1 and ma1 correspond to α and β
in the ARMA(1,1) model. The fitted model has AIC = 829.8, which is
smaller than the AIC of other candidate models such as AR(1), MA(1),
ARMA(1,2) and ARMA(2,2).
____________

94 An alternative method of estimation is based on method of moments
estimation. There are p + q parameters to be estimated. We can
calculate the theoretical ACF {ρ_k} of an ARMA(p, q) process, which
will be a function of the α’s and β’s. Then the method of moments
estimates are those values of α and β such that the theoretical ACF
ρ₁, ..., ρ_(p+q) coincides with the observed sample ACF r₁, ..., r_(p+q).
This method is easily applied to AR(p) models, since the corresponding
Yule-Walker equations are linear; moment estimation therefore requires
only solving them with respect to the unknown parameters α_i.
____________


95 The final parameter of the model is σ², the variance of the e_t, which
may be estimated using:

    σ̂² = (1/n) Σ_(t=p+1)^n ê_t²
        = (1/n) Σ_(t=p+1)^n (z_t − α̂₁ z_(t-1) − ... − α̂_p z_(t-p) − β̂₁ ê_(t-1) − ... − β̂_q ê_(t-q))²

where ê_t denotes the residual at time t.


____________
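In R the residuals of a fitted model are available directly, so σ̂² can be
computed along the following lines. This is a minimal sketch, assuming fit
is an arima object like the one fitted above; the arima output also reports
this estimate directly as sigma^2.

    res = residuals(fit)        # the residuals ê_t from the fitted model
    sum(res^2) / length(res)    # estimate of sigma^2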

If the number of observations, n , of the time series is sufficiently large


there will be little difference between the least squares estimates and
the method of moments estimates of the parameters.
____________

Diagnostic checks

After the tentative identification of an ARIMA( p, d , q ) model and


calculation of the estimates mˆ , sˆ , aˆ1, ... , aˆ p , bˆ1, ... , bˆq we have to
perform diagnostic checking.
____________

96 The principle of this is that, if the ARMA( p, q ) model is a good


approximation to the underlying time series process, then the residuals
eˆt will form a good approximation to a white noise process.
____________

The following checks are frequently used.


____________

Page 54 © IFE: 2019 Examinations


Exclusive use Batch0402p

97 The visual inspection of the graph of the residuals against t or the


graph of eˆt against zt can help to highlight a poorly fitting model. If
any pattern is evident, whether in the average level of the residuals or
in the magnitude of the fluctuations about 0, this should be taken to
mean that the model is inadequate.
____________

The behaviour of the sample ACF and sample PACF of a white noise
sequence have already been described.
____________

98 If the SACF or SPACF of the sequence of residuals has too many
values outside the range ±2/√N, we conclude that the fitted model
does not have enough parameters, and a new model with additional
parameters should be fitted. The Ljung-Box chi-squared statistic may
also be used for this purpose, but the degrees of freedom of the test
statistic need to be reduced by the number of parameters, p + q, of
the ARMA model.
____________
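In R these checks might be carried out as in the sketch below. It assumes
fit is the ARMA(1,1) model fitted earlier, so that fitdf = p + q = 2; the
choice of 10 lags for the Ljung-Box test is arbitrary.

    res = residuals(fit)
    acf(res)       # SACF of the residuals
    pacf(res)      # SPACF of the residuals
    Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 2)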

99 If y₁, y₂, ..., y_n is a sequence of numbers, then we say that the
sequence has a turning point at time k if either y_(k-1) < y_k and
y_k > y_(k+1), or y_(k-1) > y_k and y_k < y_(k+1).
____________

100 If Y₁, Y₂, ..., Y_n is a sequence of independent random variables with a
continuous distribution, then the probability of a turning point at time
k is 2/3, the expected number of turning points is (2/3)(n − 2), and the
variance is (16n − 29)/90. Therefore the number of turning points in a
realisation of Y₁, Y₂, ..., Y_n should be within the 95% confidence
interval:

    [ (2/3)(n − 2) − 1.96 √((16n − 29)/90) ,  (2/3)(n − 2) + 1.96 √((16n − 29)/90) ]
____________
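A sketch of the turning points check in R, assuming the residuals are
stored in a vector res (exact ties between successive values are ignored
here):

    n  = length(res)
    d  = diff(res)
    tp = sum(d[-length(d)] * d[-1] < 0)   # turning point = sign change in successive differences
    m  = 2/3 * (n - 2)                    # expected number of turning points
    s  = sqrt((16*n - 29)/90)             # standard deviation
    c(observed = tp, lower = m - 1.96*s, upper = m + 1.96*s)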


The command:

tsdiag(fit)

generates a graphical summary of the diagnostic checks of the residuals,
shown in Figure 14.8. The last plot shows a sequence of p-values of the
Ljung-Box test; high observed values suggest a good fit, ie residuals
close to white noise.

Figure 14.8: Diagnostic checks of residuals


____________


Forecasting

101 Using the Box-Jenkins approach, forecasting is relatively


straightforward. Having fitted an ARMA model to the data {x₁, ..., x_n}
we have the equation:

    X_(n+k) = μ + α₁(X_(n+k-1) − μ) + ... + α_p(X_(n+k-p) − μ)
              + e_(n+k) + β₁ e_(n+k-1) + ... + β_q e_(n+k-q)

The forecast value of X_(n+k) given all observations up until time n,
known as the k-step ahead forecast and denoted x̂_n(k), is obtained
from this equation by:

•  replacing all (unknown) parameters by their estimated values;

•  replacing the random variables X₁, ..., X_n by their observed values
   x₁, ..., x_n;

•  replacing the random variables X_(n+1), ..., X_(n+k-1) by their forecast
   values x̂_n(1), ..., x̂_n(k−1);

•  replacing the innovations e₁, ..., e_n by the residuals ê₁, ..., ê_n;

•  replacing the random variables e_(n+1), ..., e_(n+k-1) by their
   expectations, 0.
____________

102 The one-step ahead and two-step ahead forecasts for an AR (2) are
given by:

xˆ n (1) = mˆ + aˆ1( x n - mˆ ) + aˆ 2 ( x n - 1 - mˆ )
xˆ n (2) = mˆ + aˆ1( xˆ n (1) - mˆ ) + aˆ 2 ( x n - mˆ )
____________
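These forecasts can be reproduced directly in R. The sketch below assumes
an AR(2) model has been fitted to a series x using arima (which labels the
estimated mean as "intercept"); the results should agree with
predict(fit2, n.ahead = 2).

    fit2 = arima(x, order = c(2, 0, 0))
    mu = coef(fit2)["intercept"]; a1 = coef(fit2)["ar1"]; a2 = coef(fit2)["ar2"]
    n  = length(x)
    x1 = mu + a1*(x[n] - mu) + a2*(x[n-1] - mu)   # one-step ahead forecast
    x2 = mu + a1*(x1 - mu)  + a2*(x[n]  - mu)     # two-step ahead forecast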

Thus the k -step ahead forecast is essentially the conditional


expectation of the future value of the process given all the information
currently available at time n .


A point estimate of X n + k is less useful than a confidence interval, for


which an estimate of the variance is required. A comparison of X n + 1
with xˆ n (1) shows that the difference between them arises from
numerous sources, including en + 1 , differences between true values of
parameters and their estimates, and differences between true values of
the et and the residuals eˆt which are used to estimate them.
____________

103 Calculation of the prediction variance in any given case is complicated


and is best left to a computer. In general, though, it is possible to state
that the variance of the k -step ahead estimator is relatively small for
small values of k and converges, for large k , to g 0 , the variance of
the stationary process X .
____________

104 If X is an ARIMA(p, d, q) process, then Z = ∇^d X is ARMA(p, q), and the
techniques given above can be used to produce forecasts and
confidence intervals for future values of Z. By reversing the
differencing procedure these can be translated into forecasts of future
values of X.
____________

For example, if X is ARIMA(0,1,1) , then X n + 1 = X n + Z n + 1 , and


xˆ n (1) = x n + zˆn (1) .

An ARIMA( p, d , q ) process with d > 0 is not stationary and therefore


has no stationary variance. It should come as no surprise, then, that
the prediction variance for the k -step ahead forecast increases to
infinity as k increases. This is easily seen in the case of the random
walk process.

For predicting three steps ahead using R:

predict(fit,n.ahead=3)


The Box-Jenkins methodology is demanding, requiring a skilled


operator to produce reliable results. There are many instances in
which a company needs no more than a simple forecast of some future
value without having to employ a trained statistician to provide it.
____________

Exponential smoothing

105 A much simpler forecasting technique, introduced by Holt in 1958,


uses a weighted combination of past values to predict future
observations:

    x̂_n(1) = α( x_n + (1 − α) x_(n-1) + (1 − α)² x_(n-2) + ... )
____________

106 Here α is a single parameter, either chosen by the user or estimated
by least squares from past data. Typically a value in the range 0.2 to
0.3 is used. The geometrically decreasing weights give rise to the
name exponential smoothing.
____________

107 The method lends itself easily to regular updating: it is easy to see
that:

    x̂_n(1) = (1 − α) x̂_(n-1)(1) + α x_n = x̂_(n-1)(1) + α( x_n − x̂_(n-1)(1) )

so that the current forecast is obtained by taking the previous forecast
and compensating for the error observed when the actual figure
became available.
____________
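A minimal R sketch of this updating formula, assuming the observations are
in a vector x and taking α = 0.3; the first forecast is initialised at the
first observation, which is one common (but not the only) choice. The
built-in function HoltWinters(x, beta = FALSE, gamma = FALSE) carries out
simple exponential smoothing of this kind.

    alpha = 0.3
    fcast = numeric(length(x))
    fcast[1] = x[1]                                   # initial forecast
    for (t in 2:length(x)) {
      fcast[t] = fcast[t-1] + alpha * (x[t-1] - fcast[t-1])
    }
    # fcast[t] is the forecast of x[t] made at time t-1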

108 This technique works for stationary series, but clearly cannot be
applied to series exhibiting a trend or seasonal variation.
____________

109 There are more sophisticated versions of exponential smoothing which


are able to cope with trends or seasonal variation, and are even well
equipped to handle slowly varying trends or multiplicative, rather than
additive, seasonal variation.
____________


Multivariate time series models

110 An m -dimensional multivariate time series { x1,  , x n } is a sequence


of m -dimensional vectors. Each vector x t is a set of observations of
the values of m variables of interest at time t .
____________

111 A multivariate time series is modelled by a sequence of random


vectors { X 1, X 2 ,  } . The components of X t will be denoted

X t(1) ,  , X t(m ) .
____________

The second order properties of a sequence of random vectors are


summarised by:

 the vectors of expected values mt = E ( X t ) , and

 the covariance matrices for all pairs of random vectors,


cov( X t , X t + k ) .
____________

112 The definition of stationarity is the same in the multidimensional case


as it is for univariate time series: the vector process is (weakly)
stationary if both E ( X t ) and cov( X t , X t + k ) are independent of t .
____________

113 In the stationary case the notation μ will be used to represent the
common mean vector, and Σ_k the covariance matrix cov(X_t, X_(t+k)).

The diagonal elements of the covariance matrix Σ_k are clearly the
autocovariances at lag k of the individual components of the random
vector X_t. The off-diagonal elements Σ_k(i, j) are called the lag k
cross-covariances of X^(i) with X^(j), cov(X_t^(i), X_(t+k)^(j)).


____________


114 A multivariate white noise process is the simplest example of a
multivariate random process. Suppose e₁, e₂, ... is a sequence of
independent zero-mean random vectors, each having the same
covariance matrix Σ.

Σ need not be a diagonal matrix, though it must be symmetric. In
other words, the components of the innovations vector need not be
independent of one another. This is a multivariate analogue of
zero-mean white noise.
____________

Vector autoregressive processes

115 A vector autoregressive process of order p, denoted VAR(p), is a
sequence of m-component random vectors {X₁, X₂, ...} satisfying:

    X_t = μ + Σ_(j=1)^p A_j (X_(t-j) − μ) + e_t                (14.2)

where e is an m-dimensional white noise process and the A_j are
m × m matrices.
____________

We might believe that interest rates, i_t, and tendency to invest, I_t, are
related to one another by the equations:

    i_t − μ_i = a_11 (i_(t-1) − μ_i) + e_t^(i)
    I_t − μ_I = a_21 (i_(t-1) − μ_i) + a_22 (I_(t-1) − μ_I) + e_t^(I)        (14.3)

where e^(i) and e^(I) are zero-mean (univariate) white noises. They may
have different variances and are not necessarily uncorrelated; that is,
we do not require cov(e_t^(i), e_t^(I)) = 0, although we do require
cov(e_t^(i), e_s^(I)) = 0 for s ≠ t.
____________


116 This model can be expressed as a 2-dimensional VAR(1):

    ( i_t − μ_i )     ( a_11   0   ) ( i_(t-1) − μ_i )     ( e_t^(i) )
    ( I_t − μ_I )  =  ( a_21  a_22 ) ( I_(t-1) − μ_I )  +  ( e_t^(I) )

____________

117 The theory and analysis of a VAR(1) closely parallels that of a
univariate AR(1). Iterating from equation (14.2) in the case p = 1, it is
clear that:

    X_t = μ + Σ_(j=0)^(t-1) A^j e_(t-j) + A^t (X₀ − μ)
____________

118 In order that X should represent a stationary time series, the powers
of A should converge to zero in some sense. The appropriate
requirement is that all eigenvalues of the matrix A should be less than
1 in absolute value.
____________

119 The eigenvalues of the matrix A are the values λ such that
det(A − λI) = 0. Eg for a 2-dimensional time series this equation
reduces to:

    (a_11 − λ)(a_22 − λ) − a_12 a_21 = 0

where A[i, j] = a_ij (i = 1, 2; j = 1, 2).


____________
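The eigenvalue condition is easy to check numerically. A sketch in R,
using purely illustrative coefficient values a11 = 0.6, a21 = 0.2 and
a22 = 0.7 for the interest rate / investment example above:

    A = matrix(c(0.6, 0.0,
                 0.2, 0.7), nrow = 2, byrow = TRUE)
    abs(eigen(A)$values)    # all less than 1, so this VAR(1) is stationary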

Similar, though more complicated, requirements can be set out under


which a more general VAR ( p ) process is stationary.


Fitting a vector autoregression is very similar to the process of fitting a


univariate autoregression. Parameter estimation can be carried out by
least squares or by method of moments. Some elements of the
univariate theory, such as the use of Akaike’s Information Criterion, do
not translate unchanged into a multivariate setting, but other topics
carry across relatively easily.
____________

The following simple dynamic Keynesian model provides an example


of a multivariate autoregressive process. Denote by Yt the national
income over a certain period of time, and denote by Ct and It the total
consumption and investment over the same period. It is assumed that
the consumption, Ct , depends on the income over the previous
period:

    C_t = α Y_(t-1) + e_t^(1)

where e^(1) is a zero-mean white noise. The investment, I_t, is
determined by the ‘accelerator’ mechanism:

    I_t = β (C_(t-1) − C_(t-2)) + e_t^(2)

where e^(2) is another zero-mean white noise. Finally, any part of the
national income is either consumed or invested; therefore:

    Y_t = C_t + I_t
____________

120 Eliminating the national income we arrive at the following
two-dimensional VAR(2):

    C_t = α C_(t-1) + α I_(t-1) + e_t^(1)

    I_t = β (C_(t-1) − C_(t-2)) + e_t^(2)


Using matrix notation we can rewrite the above equation as:

    ( C_t )     ( α  α ) ( C_(t-1) )     (  0  0 ) ( C_(t-2) )     ( e_t^(1) )
    ( I_t )  =  ( β  0 ) ( I_(t-1) )  +  ( −β  0 ) ( I_(t-2) )  +  ( e_t^(2) )
____________

Cointegrated time series

121 Recall that a time series process X is called integrated of order d,
abbreviated as I(d), if the process Y = ∇^d X is stationary.
____________

122 Two time series processes X and Y are called cointegrated if:

 X and Y are I (1) random processes

 there exists a non-zero vector (a , b ) such that a X + b Y is


stationary.

The vector (a , b ) is called a cointegrating vector.


____________

123 There are a number of circumstances when it is reasonable to expect


that two processes may be cointegrated:
 if one of the processes is driving the other
 if both are being driven by the same underlying process.
____________

The following simple model of the evolution of the US dollar / GB pound
exchange rate X_t provides an example of a cointegrated model. It is
assumed that the exchange rate fluctuates around the purchasing
power ratio P_t / Q_t, where P_t and Q_t are the consumer price indices
for the US and the UK, respectively.


This is described by the following model:

    ln X_t = ln(P_t / Q_t) + Y_t

    Y_t = μ + α (Y_(t-1) − μ) + e_t + β e_(t-1)

where e is a zero-mean white noise.

The evolution of ln P and ln Q is described by ARIMA(1,1,0) models:

    (1 − B) ln P_t = μ₁ + α₁ [(1 − B) ln P_(t-1) − μ₁] + e_t^(1)

    (1 − B) ln Q_t = μ₂ + α₂ [(1 − B) ln Q_(t-1) − μ₂] + e_t^(2)

where e^(1) and e^(2) are zero-mean white noises, possibly correlated.

ln P and ln Q are both ARIMA(1,1,0) processes. The logarithm of the
exchange rate is also non-stationary. However, ln X − ln P + ln Q is the
ARMA(1,1) random process Y and, therefore, is a stationary random
process. It follows that the sequence of random vectors
{(ln X_t, ln P_t, ln Q_t) : t = 1, 2, ...} is described by a cointegrated model
with the cointegrating vector (1, −1, 1).
____________
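The idea can be illustrated by simulation in R. The sketch below (with
arbitrary noise standard deviations) constructs two I(1) series driven by
the same random walk, so that the combination with cointegrating vector
(1, −1) is stationary:

    set.seed(1)
    n = 500
    w = cumsum(rnorm(n))          # common random walk: the shared I(1) driver
    x = w + rnorm(n, sd = 0.5)    # X is I(1)
    y = w + rnorm(n, sd = 0.5)    # Y is I(1)
    plot(x - y, type = "l")       # x - y removes the common trend ...
    acf(x - y)                    # ... and should look stationary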

Special non-stationary, non-linear time series models

124 The general class of bilinear models can be exemplified by its simplest
representative, the random process X defined by the relation:

    X_n + α (X_(n-1) − μ) = μ + e_n + β e_(n-1) + b (X_(n-1) − μ) e_(n-1)

Considered only as a function of X, this relation is linear; it is also
linear when considered as a function of e only. This is why it is called
‘bilinear’.
____________


125 The main qualitative difference between the bilinear model and models
from the ARMA class is that many bilinear models exhibit ‘bursty’
behaviour: when the process is far from its mean it tends to exhibit
larger fluctuations.

The difference between this model and an ARMA(1,1) process may be
seen to lie in the last term on the right-hand side: when X_(n-1) is far
from μ and e_(n-1) is far from 0 – events which are far from being
independent – the final term assumes a much greater significance.
____________

126 A simple representative of the class of threshold autoregressive
models is the random process X defined by the relation:

    X_n = μ + α₁ (X_(n-1) − μ) + e_n    if X_(n-1) ≤ d
    X_n = μ + α₂ (X_(n-1) − μ) + e_n    if X_(n-1) > d
____________

127 The distinctive feature of some models from the threshold


autoregressive class is the limit cycle behaviour. This makes the
threshold autoregressive models suitable for the description of ‘cyclic’
phenomena.
____________

128 Another modification of the AR class of models is that of


autoregressive models for which the coefficient is random. In other
words:

X t = m + a t ( X t - 1 - m ) + et

where {a 1, a 2 ,  , a n } is a sequence of independent random variables.


____________

129 The behaviour of these processes can vary widely, depending on the
distribution chosen for the a t , but is in general more irregular than
that of the corresponding AR (1) .
____________


130 Such a model could be used to represent the behaviour of an


investment fund, with m = 0 and a t = 1 + it with it being the random
rate of return.
____________

131 The class of autoregressive models with conditional heteroscedasticity
of order p – the ARCH(p) models – is defined by the relation:

    X_t = μ + e_t √( α₀ + Σ_(k=1)^p α_k (X_(t-k) − μ)² )

where e is a sequence of independent standard normal random
variables.
____________

132 The simplest representative of the ARCH(p) class is the ARCH(1)
model defined by the relation:

    X_t = μ + e_t √( α₀ + α₁ (X_(t-1) − μ)² )

____________
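A simulation sketch of an ARCH(1) process in R (the parameter values
μ = 0, α₀ = 0.2 and α₁ = 0.7 are chosen purely for illustration):

    set.seed(1)
    n = 1000; mu = 0; a0 = 0.2; a1 = 0.7
    x = numeric(n)
    x[1] = mu
    for (t in 2:n) {
      x[t] = mu + rnorm(1) * sqrt(a0 + a1 * (x[t-1] - mu)^2)
    }
    plot(x, type = "l")    # note the bursts of high volatility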

If μ is zero, it can be shown that cov(X_t, X_s) = 0 for s ≠ t, confirming
that X_t is white noise, with uncorrelated but not independent
components.
____________

133 As may be seen from the ARCH (1) model, a significant deviation of
X t - 1 from the mean m gives rise to an increase in the conditional
variance of X t given X t - 1 .
____________

134 The ARCH models have been used for modelling financial time series.
If Z_t is the price of an asset at the end of the t th trading day, it is
found that the ARCH model can be used to model X_t = ln(Z_t / Z_(t-1)),
interpreted as the daily return on day t.


The ARCH family of models captures the feature frequently observed in


asset price data that a significant change in the price of an asset is
often followed by a period of high volatility.
____________


PAST EXAM QUESTIONS

This section contains all the relevant exam questions from 2008 to 2017 that
are related to the topics covered in this booklet.

Solutions are given after the questions. These give enough information for
you to check your answer, including working, and also show you what an
outline examination answer should look like. Further information may be
available in the Examiners’ Report, ASET or Course Notes. (ASET can be
ordered from ActEd.)

We first provide you with a cross-reference grid that indicates the main
subject areas of each exam question. You can use this, if you wish, to
select the questions that relate just to those aspects of the topic that you
may be particularly interested in reviewing.

Alternatively, you can choose to ignore the grid, and attempt each question
without having any clues as to its content.

© IFE: 2019 Examinations Page 69


Cross-reference grid

Questions 1 to 24 are cross-referenced in the grid against the following
topics, with a final column for ticking off each question as you attempt it:

MA processes; AR processes; ARMA processes; ARIMA processes;
VAR processes; autocovariance function / autocorrelation function;
partial ACF; non-stationary series; parameter estimation; forecasting;
testing / choosing a model; trends, cycles; ARCH models.

1 Subject CT6 April 2008 Question 7

Consider the following model applied to some quarterly data:

Yt = et + b1et -1 + b 4et - 4 + b1b 4et - 5

where et is a white noise process with mean zero and variance s 2 .

(i) Express in terms of b1 and b 4 the roots of the characteristic


polynomial of the MA part, and give conditions for invertibility of the
model. [2]

(ii) Derive the autocorrelation function (ACF) for Yt . [5]

For our particular data the sample ACF is:

Lag ACF

1 0.73
2 0.14
3 0.37
4 0.59
5 0.24
6 0.12
7 0.07

(iii) Explain whether these results confirm the initial belief that the model
could be appropriate for these data. [3]
[Total 10]

© IFE: 2019 Examinations Page 71


Exclusive use Batch0402p

2 Subject CT6 April 2008 Question 9

(i) Describe the difference between strictly stationary processes and


weakly stationary processes. [2]

(ii) Explain why weakly stationary multivariate normal processes are also
strictly stationary. [1]

(iii) Show that the following bivariate time series process, ( X n ,Yn )T , is
weakly stationary:

X n = 0.5 X n -1 + 0.3Yn -1 + enx


Yn = 0.1X n -1 + 0.8Yn -1 + eny

where e_n^x and e_n^y are two independent white noise processes. [5]

(iv) Determine the positive values of c for which the process

X n = (0.5 + c ) X n -1 + 0.3Yn -1 + enx


Yn = 0.1X n -1 + (0.8 + c )Yn -1 + eny

is stationary. [6]
[Total 14]

3 Subject CT6 September 2008 Question 6

Consider the ARCH(1) process:

    X_t = μ + e_t √( α₀ + α₁ (X_(t-1) − μ)² )

where the e_t are independent normal random variables with variance 1 and
mean 0. Show that, for s = 1, 2, ..., t − 1, X_t and X_(t-s) are:

(i) uncorrelated. [5]

(ii) not independent. [3]


[Total 8]

Page 72 © IFE: 2019 Examinations


Exclusive use Batch0402p

4 Subject CT6 September 2008 Question 10

From a sample of 50 consecutive observations from a stationary process,


the table below gives values for the sample autocorrelation function (ACF)
and the sample partial autocorrelation function (PACF):

Lag ACF PACF


1 0.854 0.854
2 0.820 0.371
3 0.762 0.085

The sample variance of the observations is 1.253.

(i) Suggest an appropriate model, based on this information, giving your


reasoning. [2]

(ii) Consider the AR(1) model:

    Y_t = a₁ Y_(t-1) + e_t

where e_t is a white noise error term with mean zero and variance σ².

Calculate method of moments (Yule-Walker) estimates for the
parameters a₁ and σ² on the basis of the observed sample. [4]

(iii) Consider the AR(2) model:

    Y_t = a₁ Y_(t-1) + a₂ Y_(t-2) + e_t

where e_t is a white noise error term with mean zero and variance σ².

Calculate method of moments (Yule-Walker) estimates for the
parameters a₁, a₂ and σ² on the basis of the observed sample. [7]

(iv) List two statistical tests that you should apply to the residuals after fitting
a model to time series data. [2]
[Total 15]

© IFE: 2019 Examinations Page 73


Exclusive use Batch0402p

5 Subject CT6 April 2009 Question 10

Let Yt be a stationary time series with autocovariance function g Y (s ) .

(i) Show that the new series X t = a + bt + Yt where a and b are fixed
non-zero constants, is not stationary. [2]

(ii) Express the autocovariance function of DX t = X t - X t -1 in terms of


g Y (s ) and show that this new series is stationary. [7]

(iii) Show that if Yt is a moving average process of order 1, then the series
DX t is not invertible and has variance larger than that of Yt . [6]
[Total 15]

6 Subject CT6 September 2009 Question 1

Consider the stationary autoregressive process of order 1 given by

Yt = 2a Yt -1 + Zt a < 0.5

where Zt denotes white noise with mean zero and variance s 2 .


Express Y_t in the form Y_t = Σ_(j=0)^∞ a_j Z_(t-j) and hence, or otherwise, find an
expression for the variance of Y_t in terms of a and σ. [4]

Page 74 © IFE: 2019 Examinations


Exclusive use Batch0402p

7 Subject CT6 September 2009 Question 6

The following data is observed from n = 500 realisations from a time series:

    Σ_(i=1)^n x_i = 13,153.32,    Σ_(i=1)^n (x_i − x̄)² = 3,153.67    and
    Σ_(i=1)^(n-1) (x_i − x̄)(x_(i+1) − x̄) = 2,176.03

(i) Estimate, using the data above, the parameters m , a1 and s from the
model:

X t - m = a1( X t -1 - m ) + e t

where e t is a white noise process with variance s 2 . [7]

(ii) After fitting the model with the parameters found in (i), it was calculated
that the number of turning points of the residuals series eˆt is 280.

Perform a statistical test to check whether there is evidence that eˆt is


not generated from a white noise process. [3]
[Total 10]

© IFE: 2019 Examinations Page 75


Exclusive use Batch0402p

8 Subject CT6 April 2010 Question 3

The following two models have been suggested for representing some
quarterly data with underlying seasonality.

Model 1 Yt = aYt - 4 + et

Model 2 Yt = b et - 4 + et

where et is a white noise process in each case.

(i) Determine the autocorrelation function for each model. [4]

The observed quarterly data is used to calculate the sample autocorrelation.

(ii) State the features of the sample autocorrelation that would lead you to
prefer Model 1. [1]
[Total 5]

9 Subject CT6 April 2010 Question 6

Observations y1, y 2 , , y n are made from a random walk process given by:

Y0 = 0 and Yt = a + Yt -1 + et for t > 0

where et is a white noise process with variance s 2 .

(i) Derive expressions for E (Yt ) and var (Yt ) and explain why the process
is not stationary. [3]

(ii) Show that g t ,s = cov (Yt ,Yt - s ) for s < t is linear in s . [2]

(iii) Explain how you would use the observed data to estimate the
parameters a and s . [3]

(iv) Derive expressions for the one-step and two-step forecasts for Yn +1
and Yn + 2 . [2]
[Total 10]

Page 76 © IFE: 2019 Examinations


Exclusive use Batch0402p

10 Subject CT6 September 2010 Question 11

A time series model is specified by:

Yt = 2a Yt -1 - a 2Yt - 2 + et

where et is a white noise process with variance s 2 .

(i) Determine the values of a for which the process is stationary. [2]

(ii) Derive the auto-covariances g 0 and g 1 for this process and find a
general recursive expression for g k for k ≥ 2 . [10]

(iii) Show that the auto-covariance function can be written in the form:

g k = Aa k + kBa k

for some values of A , B which you should specify in terms of the


constants a and s 2 . [5]
[Total 17]

11 Subject CT6 April 2011 Question 7

Consider the time series Yt = 0.7 + 0.4Yt -1 + 0.12Yt - 2 + et where et is a


white noise process with variance s 2 .

(i) Identify the model as an ARIMA( p, d , q ) process. [1]

(ii) Determine whether Yt is a stationary process. [2]

(iii) Calculate E (Yt ) . [2]

(iv) Calculate the auto-correlations r1 , r2 , r3 and r 4 . [4]


[Total 9]

© IFE: 2019 Examinations Page 77


Exclusive use Batch0402p

12 Subject CT6 October 2011 Question 8

Consider the time series

Yt = 0.1 + 0.4 Yt -1 + 0.9 et -1 + et

where et is a white noise process with variance s 2 .

(i) Identify the model as an ARIMA( p, d , q ) process. [1]

(ii) Determine whether Yt is:

(a) a stationary process

(b) an invertible process. [2]

(iii) Calculate E (Yt ) and find the auto-covariance function for Yt . [6]

(iv) Determine the MA(∞) representation for Y_t. [4]


[Total 13]

Page 78 © IFE: 2019 Examinations


Exclusive use Batch0402p

13 Subject CT6 April 2012 Question 9

Consider the time series model

(1 - a B )3 X t = et

where B is the backwards shift operator and et is a white noise process with
variance s 2 .

(i) Determine for which values of a the process is stationary. [2]

Now assume that a = 0.4 .

(ii) (a) Write down the Yule-Walker equations.

(b) Calculate the first two values of the auto-correlation function r1


and r2 . [7]

(iii) Describe the behaviour of ρ_k and the partial autocorrelation function
φ_k as k → ∞. [3]
[Total 12]

© IFE: 2019 Examinations Page 79


Exclusive use Batch0402p

14 Subject CT6 September 2012 Question 9

In order to model a particular seasonal data set an actuary is considering


using a model of the form:

    (1 − B³)(1 − (a + b)B + ab B²) X_t = e_t

where B is the backward shift operator and et is a white noise process with
variance s 2 .

(i) Show that for a suitable choice of s the seasonal difference series
Yt = X t - X t - s is stationary for a range of values of a and b , which
you should specify. [3]

After appropriate seasonal differencing the following sample autocorrelation


values for the series Yt are observed: rˆ1 = 0.2 and rˆ2 = 0.7 .

(ii) Estimate the parameters a and b based on this information. [7]

[HINT: let X = a + b , Y = ab and find a quadratic equation with roots a


and b .]

(iii) Forecast the next two observations x̂101 and x̂102 based on the
parameters estimated in part (ii) and the observed values x1, x2 ,..., x100
of X t . [4]
[Total 14]

Page 80 © IFE: 2019 Examinations


Exclusive use Batch0402p

15 Subject CT6 April 2013 Question 11

An actuary is considering the time series model defined by:

X t = a X t -1 + et

where et is a sequence of independent Normally distributed random


variables with mean 0 variance s 2. The series begins with the fixed value
X 0 = 0.

(i) Show that the conditional distribution of X_t given X_(t-1) is normal and
hence show that the likelihood of making observations x₁, x₂, ..., x_n
from this model is:

    L ∝ ∏_(i=1)^n (1 / (√(2π) σ)) exp( −(x_i − α x_(i-1))² / (2σ²) )    [3]

(ii) Show that the maximum likelihood estimate of a can also be regarded
as a least squares estimate. [2]

(iii) Find the maximum likelihood estimates of a and s 2. [4]

(iv) Derive the Yule-Walker equations for the model and hence derive
estimates of a and s 2 based on observed values of the
autocovariance function. [5]

(v) Comment on the difference between the estimates of a in parts (iii)


and (iv). [1]
[Total 15]

© IFE: 2019 Examinations Page 81


Exclusive use Batch0402p

16 Subject CT6 September 2013 Question 9

(i) State the three main stages in the Box-Jenkins approach to fitting an
ARIMA time series model. [3]

(ii) Explain, with reasons, which ARIMA time series would fit the observed
data in the charts below. [2]

ACF PACF

Now consider the time series model given by:

X t = a1X t -1 + a 2 X t - 2 + b1et -1 + et

where et is a white noise process with variance s 2.

(iii) Derive the Yule-Walker equations for this model. [6]

(iv) Explain whether the partial auto-correlation function for this model can
ever give a zero value. [2]
[Total 13]

Page 82 © IFE: 2019 Examinations


Exclusive use Batch0402p

17 Subject CT6 April 2014 Question 12

A sequence of 100 observations was made from a time series and the
following values of the sample auto-covariance function (SACF) were
observed:

Lag SACF
1 0.68
2 0.55
3 0.30
4 0.06

The sample mean and variance of the same observations are 1.35 and 0.9
respectively.

(i) Calculate the first two values of the partial correlation function fˆ1
and fˆ2. [1]

(ii) Estimate the parameters (including s 2 ) of the following models which


are to be fitted to the observed data and can be assumed to be
stationary.

(a) Yt = a0 + a1Yt -1 + et

(b) Yt = a0 + a1Yt -1 + a2Yt - 2 + et

In each case et is a white noise process with variance s 2 . [12]

(iii) Explain whether the assumption of stationarity is necessary for the


estimation for each of the models in part (ii). [2]

(iv) Explain whether each of the models in part (ii) satisfies the Markov
property. [2]
[Total 17]

© IFE: 2019 Examinations Page 83


Exclusive use Batch0402p

18 Subject CT6 September 2014 Question 9

(i) List the main steps in the Box-Jenkins approach to fitting an ARIMA time
series to observed data. [3]

Observations x₁, x₂, ..., x₂₀₀ are made from a stationary time series and
the following summary statistics are calculated:

    Σ_(i=1)^200 x_i = 83.7    Σ_(i=1)^200 (x_i − x̄)² = 35.4    Σ_(i=2)^200 (x_i − x̄)(x_(i-1) − x̄) = 28.4

    Σ_(i=3)^200 (x_i − x̄)(x_(i-2) − x̄) = 17.1

(ii) Calculate the values of the sample auto-covariances γ̂₀, γ̂₁ and γ̂₂. [3]

(iii) Calculate the first two values of the partial correlation function φ̂₁
and φ̂₂. [3]

The following model is proposed to fit the observed data:

    X_t = μ + a₁(X_(t-1) − μ) + e_t

where e_t is a white noise process with variance σ².

(iv) Estimate the parameters μ, a₁ and σ² in the proposed model. [5]

After fitting the model in part (iv) the 200 observed residual values ê_t were
calculated. The number of turning points in the residual series was 110.

(v) Carry out a statistical test at the 95% significance level to test the
hypothesis that ê_t is generated from a white noise process. [4]
[Total 18]

Page 84 © IFE: 2019 Examinations


Exclusive use Batch0402p

19 Subject CT6 April 2015 Question 7

The following time series model is being used to model monthly data:

Yt = Yt -1 + Yt -12 - Yt -13 + et + b1et -1 + b12et -12 + b1b12et -13

where et is a white noise process with variance s 2 .

(i) Perform two differencing transformations and show that the result is a
moving average process which you may assume to be stationary. [3]

(ii) Explain why this transformation is called seasonal differencing. [1]

(iii) Derive the auto-correlation function of the model generated in part (i). [8]
[Total 12]

20 Subject CT6 October 2015 Question 11

Consider the following pair of equations:

X t  0.5 X t 1  Yt   t1

Yt  0.5Yt 1   X t   t2

where ε_t¹ and ε_t² are independent white noise processes.

(i) (a) Show that these equations can be represented as:

    M ( X_t )  =  N ( X_(t-1) )  +  ( ε_t¹ )
      ( Y_t )       ( Y_(t-1) )     ( ε_t² )

where M and N are matrices to be determined.

(b) Determine the values of  for which these equations represent a


stationary bivariate time series model. [9]

© IFE: 2019 Examinations Page 85


Exclusive use Batch0402p

(ii) Show that the following set of equations represents a VAR( p ) (vector
auto regressive) process, by specifying the order and the relevant
parameters:

X t   X t 1  Yt 1   t1

Yt   X t 1   X t 2   t2
[3]
[Total 12]

21 Subject CT6 April 2016 Question 9

Consider the following time series model:

    Y_t = 1 + 0.6 Y_(t-1) + 0.16 Y_(t-2) + ε_t

where ε_t is a white noise process with variance σ².

(i) Determine whether Y_t is stationary and identify it as an ARMA(p, q)
process. [3]

(ii) Calculate E(Y_t). [2]

(iii) Calculate for the first four lags:

•  the autocorrelation values ρ₁, ρ₂, ρ₃, ρ₄, and

•  the partial autocorrelation values φ₁, φ₂, φ₃, φ₄. [7]


[Total 12]

Page 86 © IFE: 2019 Examinations


Exclusive use Batch0402p

22 Subject CT6 September 2016 Question 9

In order to model the seasonality of a particular data set an actuary is asked
to consider the following model:

    (1 − B¹²)(1 − (α + β)B + αβ B²) X_t = ε_t

where B is the backshift operator and ε_t is a white noise process with
variance σ².

The actuary intends to apply a seasonal difference ∇_s X_t = Y_t.

(i) Explain why s should be 12 in this case (ie Y_t = X_t − X_(t-12)). [1]

(ii) Determine the range of values for α and β for which the process will
be stationary after applying this seasonal difference. [3]

Assume that after the appropriate seasonal differencing the following sample
autocorrelation values for observations of Y_t are ρ̂₁ = 0 and ρ̂₂ = 0.09.

(iii) Estimate the parameters α and β. [5]

The actuary observes a sequence of observations x₁, x₂, ..., x_T of X_t, with
T > 12.

(iv) Derive the forecasts of the next two observations, x̂_(T+1) and x̂_(T+2),
as a function of the existing observations. [4]
[Total 13]

© IFE: 2019 Examinations Page 87


Exclusive use Batch0402p

23 Subject CT6 April 2017 Question 6

Model A is a stationary AR(1) model, which follows the equation:

Yt = m + a Yt -1 + e t

where e t is a standard white noise process.

(i) State two approaches for estimating the parameters in Model A. [2]

Mary, an actuarial student, wishes to revise Model A such that the error
terms e t no longer follow a normal distribution.

(ii) Explain which of the approaches in part (i) she should now use for
parameter estimation. [2]

(iii) Propose a method by which Mary will be able to calculate estimates of


the parameters a and s 2 , with reference to any relevant equations. [3]

Mary has now constructed Model B. She has done this by multiplying both
sides of the equation above by (1 - cB) , where B is the backshift operator,
so that Model B follows the equation:

(1 - cB)Yt = (1 - cB)(a Yt -1 + e t )

(iv) Explain why Model A and Model B are identical. [2]

(v) Explain for which values of c Model B is stationary. [2]


[Total 11]

Page 88 © IFE: 2019 Examinations


Exclusive use Batch0402p

24 Subject CT6 September 2017 Question 10

Let X t = a + bt + Yt , where Yt is a stationary time series, and a and b are


fixed non-zero constants.

(i) Show that X t is not stationary. [2]

Let DX t = X t - X t -1.

(ii) Show that DX t is stationary. [1]

(iii) Determine the autocovariance values of DX t in terms of those of Yt . [4]

Now assume that Yt is an MA(1) process, ie Yt = e t + be t -1

(iv) Set out an equation for DX t in terms of b, b , e t and L, the lag operator.
[1]

(v) Show that DX t has a variance larger than that of Yt . [4]


[Total 12]

© IFE: 2019 Examinations Page 89


Exclusive use Batch0402p

SOLUTIONS TO PAST EXAM QUESTIONS

The solutions presented here are just outline solutions for you to use to
check your answers. See ASET for full solutions.

1 Subject CT6 April 2008 Question 7

(i) Invertibility

We have

Yt = et + b1et -1 + b 4et - 4 + b1b 4et - 5


= (1 + b1B + b 4B 4 + b1b 4B5 )et

So the characteristic polynomial of the white noise terms is:

1 + b1l + b 4 l 4 + b1b 4 l 5 = (1 + b1l )(1 + b 4 l 4 )

The roots of the characteristic polynomial are:

    λ = −1/β₁   and the four (complex) fourth roots of −1/β₄

The time series is invertible if the roots are all greater than 1 in magnitude:

    |−1/β₁| > 1  ⟹  |β₁| < 1

    |−1/β₄|^(1/4) > 1  ⟹  |1/β₄| > 1  ⟹  |β₄| < 1

Page 90 © IFE: 2019 Examinations


Exclusive use Batch0402p

(ii) ACF

The autocovariance function at lag 0 is:

g 0 = cov(Yt ,Yt )
= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et + b1et -1 + b 4et - 4 + b1b 4et - 5 )

= s 2 + b12s 2 + b 42s 2 + b12 b 42s 2

= s 2 (1 + b12 )(1 + b 42 )

Similarly:

g 1 = cov(Yt ,Yt -1)


= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et -1 + b1et - 2 + b 4et - 5 + b1b 4et - 6 )


= b1s + b1b 42s 2
2

= s 2 b1(1 + b 42 )

g 2 = cov(Yt ,Yt - 2 )
= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et - 2 + b1et - 3 + b 4et - 6 + b1b 4et - 7 )

=0

g 3 = cov(Yt ,Yt - 3 )
= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et - 3 + b1et - 4 + b 4et - 7 + b1b 4et - 8 )

= b1b 4s 2

© IFE: 2019 Examinations Page 91


Exclusive use Batch0402p

g 4 = cov(Yt ,Yt - 4 )
= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et - 4 + b1et - 5 + b 4et - 8 + b1b 4et - 9 )

= b 4s 2 + b12 b 4s 2

= s 2 b 4 (1 + b12 )

g 5 = cov(Yt ,Yt - 5 )
= cov(et + b1et -1 + b 4et - 4 + b1b 4et - 5 ,

et - 5 + b1et - 6 + b 4et - 9 + b1b 4et -10 )

= b1b 4s 2

gk = 0 k >5

Hence:

    ρ₀ = γ₀/γ₀ = 1

    ρ±1 = γ₁/γ₀ = σ²β₁(1 + β₄²) / [σ²(1 + β₁²)(1 + β₄²)] = β₁ / (1 + β₁²)

    ρ±2 = 0

    ρ±3 = γ₃/γ₀ = β₁β₄ / [(1 + β₁²)(1 + β₄²)]

    ρ±4 = γ₄/γ₀ = σ²β₄(1 + β₁²) / [σ²(1 + β₁²)(1 + β₄²)] = β₄ / (1 + β₄²)

    ρ±5 = γ₅/γ₀ = β₁β₄ / [(1 + β₁²)(1 + β₄²)]

    ρ±k = 0    for |k| > 5

Page 92 © IFE: 2019 Examinations


Exclusive use Batch0402p

(iii) Results confirm belief?

Now ρ₂, ρ₆ and ρ₇ are zero, so we would expect r₂, r₆ and r₇ to be
close to zero. They do not appear to be (we have insufficient information to
carry out a formal test).

Also ρ₃ and ρ₅ should both equal the product of ρ₁ and ρ₄. However, r₃
and r₅ are not equal, and both are smaller than r₁r₄ = 0.431.

Alternatively, assuming we require an invertible model, then |β₁| < 1 and
|β₄| < 1, which implies that ρ₁ = β₁/(1 + β₁²) < 0.5, ρ₄ = β₄/(1 + β₄²) < 0.5 and
ρ₃ = ρ₅ < 0.25. Only r₅ meets these conditions.

So it appears that the sample ACFs are not consistent with the theoretical
ACFs.

2 Subject CT6 April 2008 Question 9

(i) Strictly and weakly stationary time series

A process is strictly stationary if:

f ( xt1 ,  , xtn ) ∫ f ( xt1 + k ,  , xtn + k )

A process is weakly stationary if:

 E ( X t ) is constant

 cov( X t , X t + k ) is constant for a given lag k .

(ii) Weakly stationary multivariate is strictly stationary

A normal distribution is defined by its mean, m , and its variance, s 2 only.


So if these are constant (as per the weakly stationary definition) then this will
uniquely define the process. Hence it will also be strictly stationary.

© IFE: 2019 Examinations Page 93


Exclusive use Batch0402p

(iii) Show VAR(1) stationary

A VAR(1) process X n = AX n -1 + ε n is stationary if the eigenvalues of matrix


A are less than one in magnitude. The eigenvalues are the values of l
that solve:

det( A - l I) = 0

Expressing the time series in vector form X_n = A X_(n-1) + ε_n, we have:

    ( X_n )     ( 0.5  0.3 ) ( X_(n-1) )     ( e_n^x )
    ( Y_n )  =  ( 0.1  0.8 ) ( Y_(n-1) )  +  ( e_n^y )

The eigenvalues of the matrix A solve:

    det(A − λI) = det ( 0.5 − λ     0.3    ) = 0
                      (  0.1      0.8 − λ )

For a 2 × 2 matrix M = (a b; c d), the determinant is det M = ad − bc. Hence:

    (0.5 − λ)(0.8 − λ) − 0.3 × 0.1 = 0

    λ² − 1.3λ + 0.37 = 0

Solving this gives:

    λ = [1.3 ± √(1.3² − 4 × 1 × 0.37)] / 2 = 0.421, 0.879

Since both of these roots are less than 1 in magnitude, the multivariate
process is stationary.
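As a check (not part of the written solution), the eigenvalues can be
computed numerically in R:

    A = matrix(c(0.5, 0.3,
                 0.1, 0.8), nrow = 2, byrow = TRUE)
    eigen(A)$values    # 0.879 and 0.421, both less than 1 in magnitude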

Page 94 © IFE: 2019 Examinations


Exclusive use Batch0402p

(iv) Values of c so multivariate process is stationary

Expressing the time series in vector form X n = AX n -1 + ε n , we have:

Ê X n ˆ Ê 0.5 + c 0.3 ˆ Ê X n -1ˆ Ê enx ˆ


ÁË Y ˜¯ = ÁË 0.1 +Á ˜
0.8 + c ˜¯ ÁË Yn -1 ˜¯ ÁË e y ˜¯
n n

The eigenvalues of the matrix A solve:

È Ê 0.5 + c 0.3 ˆ Ê 1 0ˆ ˘ Ê 0.5 + c - l 0.3 ˆ


det Í Á
Ë 0.1 0.8 + c ˜ - l ËÁ 0 1¯˜ ˙ = det ËÁ
¯ 0.1 0.8 + c - l ¯˜
=0
ÎÍ ˙˚

Hence:

(0.5 + c - l )(0.8 + c - l ) - 0.3 ¥ 0.1 = 0

l 2 - (1.3 + 2c )l + (c 2 + 1.3c + 0.37) = 0

Solving this gives:

(1.3 + 2c ) ± (1.3 + 2c )2 - 4 ¥ 1 ¥ (c 2 + 1.3c + 0.37)


l=
2
(1.3 + 2c ) ± (1.69 + 5.2c + 4c 2 ) - (4c 2 + 5.2c + 1.48)
=
2
(1.3 + 2c ) ± 0.21
=
2

To be stationary, we require both roots to be less than 1 in magnitude.


Since we are told that c is positive, we get:

(1.3 + 2c ) + 0.21
<1 fi 0.879 + c < 1 fi c < 0.121
2

(1.3 + 2c ) - 0.21
<1 fi 0.421 + c < 1 fi c < 0.579
2

Hence c < 0.121 .

© IFE: 2019 Examinations Page 95


Exclusive use Batch0402p

3 Subject CT6 September 2008 Question 6

(i) Show that the values are uncorrelated

Two random variables X and Y are uncorrelated if and only if
E[XY] = E[X] E[Y].

For the two values X_t and X_(t-s) of this ARCH model, we have:

    E[X_t] = E[ μ + e_t √(α₀ + α₁(X_(t-1) − μ)²) ]
           = μ + E[e_t] × E[ √(α₀ + α₁(X_(t-1) − μ)²) ]    (e_t is independent of X_(t-1))
           = μ + 0 = μ

There’s nothing special about t, so we also have, for t − s:

    E[X_(t-s)] = μ

The expectation of the product (when s ≥ 1) is:

    E[X_t X_(t-s)] = E[ ( μ + e_t √(α₀ + α₁(X_(t-1) − μ)²) ) X_(t-s) ]
                   = μ E[X_(t-s)] + E[e_t] × E[ √(α₀ + α₁(X_(t-1) − μ)²) X_(t-s) ]
                   = μ² + 0 = μ²

So:

    E[X_t X_(t-s)] = μ² = μ × μ = E[X_t] E[X_(t-s)]

ie X_t and X_(t-s) are uncorrelated.

Page 96 © IFE: 2019 Examinations


Exclusive use Batch0402p

(ii) Show that the values are not independent

If two random variables X and Y are independent, then
P[f(X) ∈ A] = P[f(X) ∈ A | Y = y].

Let Y_t = X_t − μ, so that:

    Y_t = e_t √(α₀ + α₁ Y_(t-1)²)

Squaring this:

    Y_t² = e_t² (α₀ + α₁ Y_(t-1)²)

We can use repeated substitution to get:

    Y_t² = e_t² (α₀ + α₁ Y_(t-1)²)
         = e_t² (α₀ + α₁ e_(t-1)² (α₀ + α₁ Y_(t-2)²))
         = e_t² (α₀ + α₁ e_(t-1)² (α₀ + α₁ e_(t-2)² (α₀ + α₁ Y_(t-3)²)))
         = ...
         = e_t² (α₀ + α₁ e_(t-1)² (α₀ + α₁ e_(t-2)² (α₀ + ... + α₁ Y_(t-s)²)))
         = f(Y_(t-s)²)

So the bracketed factor f(Y_(t-s)²) indicated (which contains only positive
numbers and squares) is an increasing function of Y_(t-s)². So it follows that,
for example:

    P( Y_t² < 1 | Y_(t-s)² = 1,000,000 ) < P( Y_t² < 1 | Y_(t-s)² = 1 )

So Y_t² is not independent of Y_(t-s)², which implies that Y_t is not independent
of Y_(t-s) and hence that X_t is not independent of X_(t-s).

© IFE: 2019 Examinations Page 97


Exclusive use Batch0402p

4 Subject CT6 September 2008 Question 10

(i) Appropriate model

From the figures given it looks like the ACF is decaying slowly and the PACF
is cutting off after lag 2. This is a characteristic of an AR (2) model.

(ii) Parameter estimates

As a starter step:

cov(Yt , et )  cov(a1Yt 1  et , et )   2

Consider the autocovariance with a lag of 1:

 1  cov(Yt ,Yt 1)  cov(a1Yt 1  et ,Yt 1)  a1 0  1  a1

Since we are told in the question that the sample ACF with lag 1 is 0.854,
this is our estimate of 1 . So we have:

0.854  aˆ1

Consider the autocovariance with a lag of 0:

 0  cov(Yt ,Yt )  cov(a1Yt 1  et ,Yt )  a1 1   2

Since we are told in the question that the sample ACF with lag 1 is 0.854
(our estimate of 1 ) and the sample variance is 1.253 (our estimate of  0 ),
we have:

ˆ0  1.253
ˆ1
 0.854  ˆ1  0.854  1.253
ˆ0

From  0  a1 1   2 :

1.253  0.8542  1.253  ˆ 2  ˆ 2  0.339

Page 98 © IFE: 2019 Examinations


Exclusive use Batch0402p

(iii) Parameter estimates

Consider the autocovariance with a lag of 1:

 1  cov(Yt ,Yt 1)  cov(a1Yt 1  a2Yt  2  et ,Yt 1)  a1 0  a2 1


a1
 1 
1  a2

Consider the autocovariance with a lag of 2:

 2  cov(Yt ,Yt  2 )  cov(a1Yt 1  a2Yt  2  et ,Yt  2 )  a1 1  a2 0


a12 a 2  (1  a2 )a2
  0  a2 0  1 0
1  a2 1  a2

From this:

a12  (1  a2 )a2
2 
1  a2
Since we are told in the question that the sample ACF with lag 1 is 0.854
(our estimate for 1 ) and that the sample ACF with lag 2 is 0.820 (our
estimate of 2 ), we have:

aˆ1
0.854 
1  aˆ2
aˆ12  (1  aˆ2 )aˆ2
0.820 
1  aˆ2

Replacing â1 by 0.854(1  aˆ2 ) in the second equation, we get:

0.8542 (1  aˆ2 )2  (1  aˆ2 )aˆ2


0.820   0.8542 (1  aˆ2 )  aˆ2
1  aˆ2
 0.820  0.8542  aˆ2 (1  0.8542 )  aˆ2  0.335

By substituting this back into the first equation above, we get aˆ1  0.568 .

© IFE: 2019 Examinations Page 99


Exclusive use Batch0402p

Consider the autocovariance with a lag of 0:

 0  cov(Yt ,Yt )  cov(a1Yt 1  a2Yt  2  et ,Yt )  a1 1  a2 2   2

Since we are told in the question that the sample ACF with lag 1 is 0.854,
the sample ACF with lag 2 is 0.820 and the sample variance is 1.253, we
have:

ˆ0  1.253
ˆ1
 0.854  ˆ1  0.854  1.253
ˆ0
ˆ2
 0.820  ˆ2  0.820  1.253
ˆ0

From  0  a1 1  a2 2   2 :

1.253  0.568  0.854  1.253  0.335  0.820  1.253  ˆ 2


 ˆ 2  0.301

(iv) Tests

The appropriate tests are the Portmanteau (Ljung and Box) and Turning
Points tests.

5 Subject CT6 April 2009 Question 10

(i) Showing series is not stationary

We have:

E ( X t ) = a + bt + E (Yt )

Since Yt is stationary E (Yt ) does not depend upon time. However since
E ( X t ) contains t it depends on time and so is not constant. Hence X t is
not stationary.

Page 100 © IFE: 2019 Examinations


Exclusive use Batch0402p

(ii) Autocovariance

Simplifying the given expression for DX t :

DX t = X t - X t -1 = a + bt + Yt - a - b(t - 1) - Yt -1 = b + Yt - Yt -1

Consider the autocovariance function of this series with lag s :

g DX t (s ) = cov(b + Yt - Yt -1, b + Yt - s - Yt - s -1)


= cov(Yt ,Yt - s ) - cov(Yt ,Yt - s -1) - cov(Yt -1,Yt - s ) + cov(Yt -1,Yt - s -1)
= g Y (s ) - g Y (s + 1) - g Y (s - 1) + g Y (s )
= 2g Y (s ) - g Y (s + 1) - g Y (s - 1)

This means that the autocovariance function with lag s depends upon the
lag only.

Next, consider the mean:

E ( DX t ) = E (b + Yt - Yt -1) = b + E (Yt ) - E (Yt -1)

Since Yt is stationary, E (Yt ) = E (Yt -1) , so:

E ( DX t ) = b + E (Yt ) - E (Yt -1) = b


So the mean is constant.

These are the two conditions required for the time series to be stationary.

(iii) Moving average process

Since Yt is an MA(1) , then Yt = et + b et -1 . Therefore:

DX t = b + et + b et -1 - et -1 - b et - 2
= b + et + ( b - 1)et -1 - b et - 2

© IFE: 2019 Examinations Page 101


Exclusive use Batch0402p

Here:

1 + ( b - 1)l - bl 2 = 0

1 - b ± ( b - 1)2 + 4 b 1 - b ± b 2 + 2 b + 1 1 - b ± ( b + 1)
fil = = =
-2 b -2 b -2 b
1
=- or 1
b

Since the roots of the characteristic equation are not all strictly greater than
1 in magnitude, the process is not invertible.

Now consider the variance of Yt :

var(Yt ) = var(et + b et -1) = var(et ) + b 2 var(et -1) by independence

Let var(et ) = s 2 so that:

var(Yt ) = var(et ) + b 2 var(et -1) = s 2 + b 2s 2 = (1 + b 2 )s 2

Also:

var( DX t ) = var(b + et + ( b - 1)et -1 - b et - 2 )


= var(et ) + var(( b - 1)et -1) + var( - b et - 2 )
= s 2 + ( b - 1)2s 2 + b 2s 2
= ( b - 1)2s 2 + (1 + b 2 )s 2
= ( b - 1)2s 2 + var(Yt )

This is clearly greater than the variance of Yt .

Page 102 © IFE: 2019 Examinations


Exclusive use Batch0402p

6 Subject CT6 September 2009 Question 1

We need to ‘unfold’ the equation for Yt :

Yt = 2a Yt -1 + Zt
= 2a (2a Yt - 2 + Zt -1) + Zt = 4a 2Yt - 2 + 2a Zt -1 + Zt
= 4a 2 (2a Yt - 3 + Zt - 2 ) + 2a Zt -1 + Zt = 8a 3Yt - 3 + 4a 2Zt - 2 + 2a Zt -1 + Zt

= Â (2a ) j Zt - j
j =0

In calculating the variance, we recognise it is a geometric progression:

Ê • ˆ • •
var(Yt ) = var Á Â (2a ) j Zt - j ˜ = Â (2a )2 j var(Zt - j ) = Â (2a )2 j s 2
Ë j =0 ¯ j =0 j =0
2 2
s s
= =
1 - (2a )2 1 - 4a 2

7 Subject CT6 September 2009 Question 6

(i) Estimation of parameters

The covariance with lag 1, g 1 , is:

g 1 = cov( X t , X t -1) = cov( m + a1( X t -1 - m ) + e t , X t -1)


= cov( m, X t -1) + cov(a1( X t -1 - m ), X t -1) + cov(e t , X t -1)
= 0 + cov(a1X t -1, X t -1) - cov(a1m, X t -1) + 0
= a1g 0 - 0 = a1g 0

We can substitute in the sample values given in the question:

2,176.03 = 3,153.67a1 fi a1 = 0.690

The variance is:

var( X t - m ) = var(a1( X t -1 - m ) + e t )

© IFE: 2019 Examinations Page 103


Exclusive use Batch0402p

Since m and a1 are constants, and e t is independent of X t -1 :

var( X t ) = a12 var( X t -1) + var(e t ) = a12 var( X t -1) + s 2

We estimate the variance of the residuals from the data given in the
question:

3,153.67 3,153.67
= 0.6902 ¥ +s 2 fi s 2 = 3.304
500 500

So:

s = 1.818

We also calculate the mean from the data in the question:

13,153.32
m= = 26.31
500

(ii) Turning point test

    E(T) = (2/3)(498) = 332

    var(T) = (16 × 500 − 29) / 90 = 88.567

The null and alternative hypotheses are:

H0 : the residuals are from a white noise process


H1 : the residuals are not from a white noise process

The observed value of the test statistic is:

280 + 0.5 - 332


= -5.47
88.567

Under H0 , this should come from the standard normal distribution. Since
-5.47 < -1.96 we have very strong evidence to reject H0 . This suggests
that the residuals are not consistent with white noise.

Page 104 © IFE: 2019 Examinations


Exclusive use Batch0402p

8 Subject CT6 April 2010 Question 3

(i) Autocorrelation functions

Model 1: Yt = a Yt - 4 + et

The examiners assumed this process was stationary. Had we checked it


using the characteristic equation:

1 - al 4 = 0 fi l4 = 1 a fi | a | < 1 so that | l | > 1

We will assume that a π 0 , otherwise Yt is just white noise.

For this process:

g 1 = cov (Yt ,Yt -1 ) = cov (a Yt - 4 + et ,Yt -1 )


= a cov (Yt - 4 ,Yt -1 ) + cov (et ,Yt -1 ) = ag 3 (1)

g 2 = cov (Yt ,Yt - 2 ) = cov (a Yt - 4 + et ,Yt - 2 )


= a cov (Yt - 4 ,Yt - 2 ) + cov (et ,Yt - 2 ) = ag 2

fi g 2 = 0 since a π 0 .

    γ₃ = cov(Y_t, Y_(t-3)) = cov(αY_(t-4) + e_t, Y_(t-3))
       = α cov(Y_(t-4), Y_(t-3)) + cov(e_t, Y_(t-3)) = αγ₁
       = α²γ₃    by eqn (1)

⟹ γ₃ = γ₁ = 0 since α ≠ 0.

g 4 = cov (Yt ,Yt - 4 ) = cov (a Yt - 4 + et ,Yt - 4 )


= a cov (Yt - 4 ,Yt - 4 ) + cov (et ,Yt - 4 ) = ag 0

© IFE: 2019 Examinations Page 105


Exclusive use Batch0402p

Continuing in this way, we get:

g 5 = ag 1 = 0
g 6 = ag 2 = 0
g 7 = ag 3 = 0
g 8 = ag 4 = a 2g 0 etc

Hence:

    γ_k = α^(k/4) γ₀    if k = 4, 8, 12, 16, ...
    γ_k = 0             otherwise

and:

    ρ_k = γ_k / γ₀ = 1           if k = 0
                   = α^(k/4)     if k = 4, 8, 12, 16, ...
                   = 0           otherwise


Model 2: Yt = b et - 4 + et

We have:

    γ₀ = cov(Y_t, Y_t) = cov(βe_(t-4) + e_t, βe_(t-4) + e_t) = β²σ² + σ² = σ²(1 + β²)

    γ₁ = cov(Y_t, Y_(t-1)) = cov(βe_(t-4) + e_t, βe_(t-5) + e_(t-1)) = 0

    γ₂ = cov(Y_t, Y_(t-2)) = cov(βe_(t-4) + e_t, βe_(t-6) + e_(t-2)) = 0

    γ₃ = cov(Y_t, Y_(t-3)) = cov(βe_(t-4) + e_t, βe_(t-7) + e_(t-3)) = 0

    γ₄ = cov(Y_t, Y_(t-4)) = cov(βe_(t-4) + e_t, βe_(t-8) + e_(t-4)) = βσ²

    γ_k = 0    for k > 4

Page 106 © IFE: 2019 Examinations


Exclusive use Batch0402p

So:

Ï1 if k = 0
Ô
Ô b
rk = Ì 2 if k = 4
Ôb +1
Ô0 otherwise
Ó

(ii) Features that would lead you to prefer Model 1

As we noted in part (i):

rk Æ 0 as k Æ • for Model 1

rk = 0 for k > 4 for Model 2

So Model 1 would be preferable if the sample autocorrelation function was


observed to decay to 0 rather than cutting off after lag 4.

9 Subject CT6 April 2010 Question 6

(i) Mean and variance

By repeated substitution we have:

    Y_t = a + Y_(t-1) + e_t
        = a + (a + Y_(t-2) + e_(t-1)) + e_t = 2a + Y_(t-2) + e_(t-1) + e_t
        = 2a + (a + Y_(t-3) + e_(t-2)) + e_(t-1) + e_t = 3a + Y_(t-3) + e_(t-2) + e_(t-1) + e_t
        = ...
        = ta + Y₀ + e₁ + e₂ + ... + e_(t-1) + e_t = ta + Σ_(j=1)^t e_j

© IFE: 2019 Examinations Page 107


Exclusive use Batch0402p

So:

Ê t ˆ t
E (Yt ) = E Á ta + Â e j ˜ = ta + Â E (e j ) = t a
Ë j =1 ¯ j =1

Ê t ˆ t
var (Yt ) = var Á ta + Â e j ˜ = Â var (e j ) = t s 2
Ë j =1 ¯ j =1

Since the mean and variance of Yt depend on t , the process is not


stationary.

(ii) Proof of linearity

We have:

Ê t t -s ˆ Ê t t -s ˆ
g t ,s = cov (Yt ,Yt - s ) = cov Á ta + Â e j , (t - s ) a + Â e j ˜ = cov Á Â e j , Â e j ˜
Ë j =1 j =1 ¯ Ë j =1 j =1 ¯
= cov (e1 + e2 +  + et -1 + et , e1 + e2 +  + et - s -1 + et - s )
= var (e1) + var (e2 ) +  + var (et - s ) since ei terms independent
= (t - s ) s 2

Hence g t ,s is a linear function of s .

(iii) Estimation of the parameters

Since the characteristic equation 1 − λ = 0 has root λ = 1, we can difference
this time series:

    ∇Y_t = Y_t − Y_(t-1) = a + e_t

This is stationary and has mean and variance:

    E(∇Y_t) = E(Y_t − Y_(t-1)) = a + E(e_t) = a

    var(∇Y_t) = var(Y_t − Y_(t-1)) = var(a + e_t) = var(e_t) = σ²

So the values of a and s can be estimated by differencing the sample data


and calculating the sample mean and standard deviation of the differenced
data. The sample mean can be used as an estimate of a and the sample
standard deviation can be used as an estimate of s .
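As a sketch, assuming the observed values are stored in a vector y, this
amounts in R to no more than:

    dy = diff(y)
    mean(dy)   # estimate of a
    sd(dy)     # estimate of sigma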

Page 108 © IFE: 2019 Examinations


Exclusive use Batch0402p

(iv) Forecast values

Since Yn +1 = a + Yn + en +1 , the one-step ahead forecast for Yn +1 is:

yˆ n (1) = aˆ + y n + E (en +1 ) = aˆ + y n

Also, since Yn + 2 = a + Yn +1 + en + 2 , the two-step ahead forecast for Yn +2 is:

yˆ n (2) = aˆ + yˆ n (1) + E (en + 2 ) = aˆ + yˆ n (1) = 2aˆ + y n

10 Subject CT6 September 2010 Question 11

(i) Determine the values of a for which the process is stationary

Rearranging and writing in terms of the backwards shift operator, we have:

Yt - 2a Yt -1 + a 2Yt - 2 = et

The characteristic equation of the autoregressive terms is:

1 - 2al + a 2l 2 = 0

Solving this we obtain:

    λ = [2α ± √(4α² − 4α²)] / (2α²) = 1/α

To be stationary, we require all the roots to be greater than 1 in magnitude:

    |1/α| > 1  ⟹  |α| < 1

(ii) Derive the autocovariance for lag 0, lag 1 and lag k

Starting at lag 0, we have:

g 0 = cov(Yt ,Yt ) = cov(2a Yt -1 - a 2Yt - 2 + et ,Yt )


= 2ag 1 - a 2g 2 + cov(et ,Yt )


We have:

$$\operatorname{cov}(e_t, Y_t) = \operatorname{cov}(e_t,\; 2\alpha Y_{t-1} - \alpha^2 Y_{t-2} + e_t) = 0 - 0 + \operatorname{var}(e_t) = \sigma^2$$

Hence:

$$\gamma_0 = 2\alpha\gamma_1 - \alpha^2\gamma_2 + \sigma^2 \qquad (1)$$

Similarly, we get:

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(2\alpha Y_{t-1} - \alpha^2 Y_{t-2} + e_t,\; Y_{t-1}) = 2\alpha\gamma_0 - \alpha^2\gamma_1 \qquad (2)$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}(2\alpha Y_{t-1} - \alpha^2 Y_{t-2} + e_t,\; Y_{t-2}) = 2\alpha\gamma_1 - \alpha^2\gamma_0 \qquad (3)$$

$$\gamma_3 = \operatorname{cov}(Y_t, Y_{t-3}) = 2\alpha\gamma_2 - \alpha^2\gamma_1$$

So we can see that the general recursive formula is:

$$\gamma_k = 2\alpha\gamma_{k-1} - \alpha^2\gamma_{k-2}, \qquad k \ge 2 \qquad (4)$$

Rearranging equation (2) gives:

$$\gamma_1(1 + \alpha^2) = 2\alpha\gamma_0 \;\Rightarrow\; \gamma_1 = \frac{2\alpha}{1 + \alpha^2}\,\gamma_0 \qquad (5)$$

Substituting this into equation (3) gives:

$$\gamma_2 = 2\alpha\left(\frac{2\alpha}{1+\alpha^2}\,\gamma_0\right) - \alpha^2\gamma_0 = \frac{4\alpha^2}{1+\alpha^2}\,\gamma_0 - \alpha^2\gamma_0 = \frac{3\alpha^2 - \alpha^4}{1+\alpha^2}\,\gamma_0 \qquad (6)$$


We can now substitute (5) and (6) into the lag 0 covariance equation (1):

$$\gamma_0 = 2\alpha\left(\frac{2\alpha}{1+\alpha^2}\,\gamma_0\right) - \alpha^2\left(\frac{3\alpha^2 - \alpha^4}{1+\alpha^2}\,\gamma_0\right) + \sigma^2 = \frac{4\alpha^2 - 3\alpha^4 + \alpha^6}{1+\alpha^2}\,\gamma_0 + \sigma^2$$

$$\Rightarrow\; \gamma_0 = \frac{1+\alpha^2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2 \qquad (7)$$

Substituting (7) into equation (5) gives:

$$\gamma_1 = \frac{2\alpha}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2 \qquad (8)$$

Equations (7), (8) and (4) give the required answer.

(iii) General formula for autocovariance

Rearranging equation (4) we have:

$$\gamma_k - 2\alpha\gamma_{k-1} + \alpha^2\gamma_{k-2} = 0$$

Comparing this to the difference equation on page 4 of the Tables, we see that $a = 1$, $b = -2\alpha$ and $c = \alpha^2$. Since $b^2 - 4ac = 0$, the solution is of the form:

$$\gamma_k = (A + Bk)\lambda^k \qquad (9)$$

where $\lambda$ is the root of the quadratic equation:

$$\lambda^2 - 2\alpha\lambda + \alpha^2 = 0$$

Using the quadratic formula, the root is:

$$\lambda = \frac{2\alpha \pm \sqrt{4\alpha^2 - 4\alpha^2}}{2} = \alpha$$
2


Hence, the solution of the recursive equation (4) is:

$$\gamma_k = (A + Bk)\alpha^k \qquad (10)$$

For $k = 0$, equation (10) gives us $\gamma_0 = A$. Comparing this to our solution in equation (7):

$$\gamma_0 = A = \frac{1+\alpha^2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2$$

For $k = 1$, equation (10) gives us $\gamma_1 = (A + B)\alpha$. Comparing this to our solution in equation (8):

$$\gamma_1 = (A + B)\alpha = \frac{2\alpha}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2$$

Hence:

$$B = \frac{2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2 - A$$

Substituting in our value of $A$ we get:

$$B = \frac{2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2 - \frac{1+\alpha^2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2 = \frac{1 - \alpha^2}{1 - 3\alpha^2 + 3\alpha^4 - \alpha^6}\,\sigma^2$$

Alternatively, we could substitute the given formula $\gamma_k = A\alpha^k + kB\alpha^k$ into the recursive formula $\gamma_k = 2\alpha\gamma_{k-1} - \alpha^2\gamma_{k-2}$ (equation (4)) and show that it works, and then solve for $A$ and $B$ as before.


11 Subject CT6 April 2011 Question 7

(i) Classifying Yt as an ARIMA(p,d,q)

Yt is an ARIMA(2, 0, 0) if it is stationary.

(ii) Checking for stationarity

The characteristic polynomial is:

$$1 - 0.4\lambda - 0.12\lambda^2$$

which has roots -5 and 1.667. Since all roots are of magnitude greater
than 1, the process is stationary.

(iii) Finding E(Yt )

$$E(Y_t) = E(0.7 + 0.4Y_{t-1} + 0.12Y_{t-2} + e_t) = 0.7 + 0.4E(Y_{t-1}) + 0.12E(Y_{t-2}) + E(e_t)$$

where $E(Y_{t+k}) = E(Y_t) = \mu$ for any $k$ (since $Y_t$ is stationary), and $E(e_t) = 0$. So:

$$\mu = 0.7 + 0.4\mu + 0.12\mu \;\Rightarrow\; E(Y_t) = \mu = \frac{0.7}{0.48} = 1.4583$$

(iv) Autocorrelations

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(0.4Y_{t-1} + 0.12Y_{t-2} + e_t,\; Y_{t-1}) = 0.4\gamma_0 + 0.12\gamma_1$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}(0.4Y_{t-1} + 0.12Y_{t-2} + e_t,\; Y_{t-2}) = 0.4\gamma_1 + 0.12\gamma_0$$


$$\gamma_3 = 0.4\gamma_2 + 0.12\gamma_1 \qquad\qquad \gamma_4 = 0.4\gamma_3 + 0.12\gamma_2$$

$$\gamma_1 = \frac{0.4}{0.88}\,\gamma_0 = \frac{5}{11}\,\gamma_0$$

Solving these recursively gives:

$$\rho_1 = \frac{5}{11} \qquad \rho_2 = \frac{83}{275} \qquad \rho_3 = \frac{241}{1{,}375} \qquad \rho_4 = \frac{731}{6{,}875}$$

12 Subject CT6 October 2011 Question 8

(i) Identification as an ARIMA process

Rewriting the defining equation in terms of the backwards shift operator:

(1 - 0.4B )Yt = 0.1 + (1 + 0.9B )et

The characteristic equation of the AR terms is:

$$1 - 0.4\lambda = 0 \;\Rightarrow\; \lambda = \frac{1}{0.4} = 2.5$$

Since the root is greater than 1 in absolute value, the process is stationary.
Hence d = 0 .

Also, p = 1 and q = 1 since the value of the process at time t depends on


Yt -1 and et -1 but not on earlier value of Y or e .

So the model is an ARIMA(1,0,1) process.

(ii)(a) Stationary?

We have already shown in part (i) that the process is stationary.


(ii)(b) Invertible?

The characteristic equation of the MA part is:

$$1 + 0.9\lambda = 0 \;\Rightarrow\; \lambda = -\frac{1}{0.9} = -\frac{10}{9}$$

Since this is greater than 1 in absolute value, the process is invertible.

(iii) Mean and autocovariance function

E (Yt ) - 0.4E (Yt -1) = 0.1 + 0.9E (et -1) + E (et )

Since the process is stationary, we know that $E(Y_t) = E(Y_{t-1}) = \dots = \mu$. So:

$$\mu - 0.4\mu = 0.1 + 0 + 0 \;\Rightarrow\; \mu = \frac{1}{6}$$

Calculating the autocovariance function:

$$\gamma_0 = \operatorname{cov}(Y_t, Y_t) = \operatorname{cov}(0.4Y_{t-1} + 0.9e_{t-1} + e_t,\; Y_t) = 0.4\gamma_1 + 0.9\operatorname{cov}(e_{t-1}, Y_t) + \operatorname{cov}(e_t, Y_t)$$

Now since white noise is uncorrelated and future white noise is independent
of the past values of a time series:

$$\operatorname{cov}(e_t, Y_t) = \operatorname{cov}(e_t,\; 0.4Y_{t-1} + 0.9e_{t-1} + e_t) = 0 + 0 + \operatorname{var}(e_t) = \sigma^2$$


$$\operatorname{cov}(e_{t-1}, Y_t) = \operatorname{cov}(e_{t-1},\; 0.4Y_{t-1} + 0.9e_{t-1} + e_t) = 0.4\operatorname{cov}(e_t, Y_t) + 0.9\operatorname{var}(e_t) + 0 = 0.4\sigma^2 + 0.9\sigma^2 = 1.3\sigma^2$$

Hence:

$$\gamma_0 = 0.4\gamma_1 + 0.9 \times 1.3\sigma^2 + \sigma^2 = 0.4\gamma_1 + 2.17\sigma^2$$

Similarly:

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(0.4Y_{t-1} + 0.9e_{t-1} + e_t,\; Y_{t-1}) = 0.4\gamma_0 + 0.9\operatorname{cov}(e_{t-1}, Y_{t-1}) + 0 = 0.4\gamma_0 + 0.9\sigma^2$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}(0.4Y_{t-1} + 0.9e_{t-1} + e_t,\; Y_{t-2}) = 0.4\gamma_1$$

$$\gamma_3 = \operatorname{cov}(Y_t, Y_{t-3}) = \operatorname{cov}(0.4Y_{t-1} + 0.9e_{t-1} + e_t,\; Y_{t-3}) = 0.4\gamma_2 = 0.4^2\gamma_1$$

and, in general:

$$\gamma_k = 0.4^{k-1}\gamma_1 \quad \text{for } k \ge 1$$


Substituting our expression for $\gamma_0$ into the expression for $\gamma_1$ above, we get:

$$\gamma_1 = 0.4(0.4\gamma_1 + 2.17\sigma^2) + 0.9\sigma^2 \;\Rightarrow\; \gamma_1 = \frac{1.768}{0.84}\,\sigma^2 = \frac{221}{105}\,\sigma^2$$

$$\Rightarrow\; \gamma_0 = 0.4\left(\frac{221}{105}\,\sigma^2\right) + 2.17\sigma^2 = \frac{253}{84}\,\sigma^2$$

$$\Rightarrow\; \gamma_k = 0.4^{k-1}\gamma_1 = 0.4^{k-1}\left(\frac{221}{105}\,\sigma^2\right) \quad \text{for } k \ge 1$$

(iv) MA(∞) representation

We can write the equation for Yt as:

$$(1 - 0.4B)Y_t = 0.1 + 0.9e_{t-1} + e_t \;\Rightarrow\; Y_t = (1 - 0.4B)^{-1}(0.1 + 0.9e_{t-1} + e_t)$$

However:

$$(1 - 0.4B)^{-1} = 1 + (0.4B) + (0.4B)^2 + (0.4B)^3 + \cdots$$

So:

$$Y_t = \left(1 + (0.4B) + (0.4B)^2 + (0.4B)^3 + \cdots\right)(0.1 + 0.9e_{t-1} + e_t)$$
$$= 0.1 + 0.4 \times 0.1 + 0.4^2 \times 0.1 + \cdots + 0.9e_{t-1} + 0.4 \times 0.9e_{t-2} + 0.4^2 \times 0.9e_{t-3} + \cdots + e_t + 0.4e_{t-1} + 0.4^2 e_{t-2} + \cdots$$
$$= \frac{0.1}{1 - 0.4} + e_t + 1.3e_{t-1} + 0.4 \times 1.3e_{t-2} + 0.4^2 \times 1.3e_{t-3} + \cdots = \frac{1}{6} + e_t + 1.3\sum_{j=1}^{\infty} 0.4^{\,j-1} e_{t-j}$$


13 Subject CT6 April 2012 Question 9

(i) Values of a for which the process is stationary

Setting the characteristic equation equal to 0 and solving:

$$(1 - \alpha\lambda)^3 = 0 \;\Rightarrow\; \lambda = \frac{1}{\alpha}$$

The time series is stationary if $|\lambda| > 1$, ie if $|\alpha| < 1$.

(ii)(a) Yule-Walker equations

Writing the time series in its long-hand form, ie by expanding out the
backward shift operator, gives:

$$(1 - \alpha B)^3 X_t = e_t \;\Leftrightarrow\; (1 - 3\alpha B + 3\alpha^2 B^2 - \alpha^3 B^3)X_t = e_t$$

$$\Leftrightarrow\; X_t - 3\alpha X_{t-1} + 3\alpha^2 X_{t-2} - \alpha^3 X_{t-3} = e_t \;\Leftrightarrow\; X_t = 3\alpha X_{t-1} - 3\alpha^2 X_{t-2} + \alpha^3 X_{t-3} + e_t$$

We're told that $\alpha = 0.4$, which gives:

$$X_t = 1.2X_{t-1} - 0.48X_{t-2} + 0.064X_{t-3} + e_t$$

A useful intermediate step, before computing the autocovariance function is


to work out cov ( X t , et ) :

$$\operatorname{cov}(X_t, e_t) = \operatorname{cov}(1.2X_{t-1} - 0.48X_{t-2} + 0.064X_{t-3} + e_t,\; e_t) = 0 - 0 + 0 + \operatorname{var}(e_t) = \sigma^2$$

The autocovariance function is given by:

$$\gamma_k = \operatorname{cov}(X_t, X_{t-k})$$


Hence:

$$\gamma_0 = \operatorname{cov}(X_t, X_t) = \operatorname{cov}(1.2X_{t-1} - 0.48X_{t-2} + 0.064X_{t-3} + e_t,\; X_t) = 1.2\gamma_1 - 0.48\gamma_2 + 0.064\gamma_3 + \sigma^2 \qquad (1)$$

Similarly:

$$\gamma_1 = \operatorname{cov}(X_t, X_{t-1}) = 1.2\gamma_0 - 0.48\gamma_1 + 0.064\gamma_2 + 0 \qquad (2)$$

$$\gamma_2 = \operatorname{cov}(X_t, X_{t-2}) = 1.2\gamma_1 - 0.48\gamma_0 + 0.064\gamma_1 + 0 = 1.264\gamma_1 - 0.48\gamma_0 \qquad (3)$$

$$\gamma_3 = \operatorname{cov}(X_t, X_{t-3}) = 1.2\gamma_2 - 0.48\gamma_1 + 0.064\gamma_0 + 0$$

In summary, the Yule-Walker equations are:

$$\gamma_0 = 1.2\gamma_1 - 0.48\gamma_2 + 0.064\gamma_3 + \sigma^2, \qquad \gamma_k = 1.2\gamma_{k-1} - 0.48\gamma_{k-2} + 0.064\gamma_{k-3} \quad (k \ge 1)$$


(ii)(b) Auto-correlation function at lags 1 and 2

Substituting equation (3) into equation (2) gives:

$$\gamma_1 = 1.2\gamma_0 - 0.48\gamma_1 + 0.064(1.264\gamma_1 - 0.48\gamma_0) = 1.16928\gamma_0 - 0.399104\gamma_1$$

$$\Rightarrow\; \gamma_1 = \frac{290}{347}\,\gamma_0 = 0.83573\gamma_0$$

Substituting this back into equation (3) gives:

$$\gamma_2 = 1.264\left(\frac{290}{347}\,\gamma_0\right) - 0.48\gamma_0 = \frac{200}{347}\,\gamma_0 = 0.57637\gamma_0$$

and:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{290}{347} = 0.83573 \qquad \rho_2 = \frac{\gamma_2}{\gamma_0} = \frac{200}{347} = 0.57637$$

(iii) Asymptotic behaviour of the autocorrelation and partial auto-correlation

The time series is an AR(3) series.

The autocorrelation function, $\rho_k$, will decay geometrically (ie tend to 0) as $k \to \infty$.

The partial autocorrelation function, $\phi_k$, will cut off (ie be 0) for $k > 3$.


14 Subject CT6 September 2012 Question 9

(i) Suitable choice of s

Since $Y_t = X_t - X_{t-s} = (1 - B^s)X_t$, we have $s = 3$.

Applying this seasonal differencing, $Y_t = X_t - X_{t-3}$, we have:

$$\left(1 - (\alpha + \beta)B + \alpha\beta B^2\right)Y_t = e_t$$

For $Y_t$ to be stationary, we require that the roots of the characteristic equation all be greater than 1 in magnitude. The characteristic equation is:

$$1 - (\alpha + \beta)\lambda + \alpha\beta\lambda^2 = 0$$

Solving for $\lambda$, we get:

$$(1 - \alpha\lambda)(1 - \beta\lambda) = 0 \;\Rightarrow\; \lambda = \frac{1}{\alpha},\; \frac{1}{\beta}$$

For stationarity, we need:

$$\left|\frac{1}{\alpha}\right| > 1 \quad\text{and}\quad \left|\frac{1}{\beta}\right| > 1 \qquad\text{ie}\qquad |\alpha| < 1 \quad\text{and}\quad |\beta| < 1$$

(ii) Estimates of a and b

$Y_t$ is an AR(2) process:

$$Y_t = (\alpha + \beta)Y_{t-1} - \alpha\beta Y_{t-2} + e_t$$


The Yule-Walker equations are:

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}\big((\alpha+\beta)Y_{t-1} - \alpha\beta Y_{t-2} + e_t,\; Y_{t-1}\big) = (\alpha+\beta)\gamma_0 - \alpha\beta\gamma_1 + 0 \qquad (1)$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}\big((\alpha+\beta)Y_{t-1} - \alpha\beta Y_{t-2} + e_t,\; Y_{t-2}\big) = (\alpha+\beta)\gamma_1 - \alpha\beta\gamma_0 + 0 \qquad (2)$$

Rearranging equation (1) gives:

$$\gamma_1 = \frac{\alpha+\beta}{1+\alpha\beta}\,\gamma_0$$

Substituting this into equation (2) gives:

$$\gamma_2 = (\alpha+\beta)\,\frac{\alpha+\beta}{1+\alpha\beta}\,\gamma_0 - \alpha\beta\gamma_0$$

Dividing through by $\gamma_0$, we have two equations:

$$\rho_1 = \frac{\alpha+\beta}{1+\alpha\beta} \qquad\text{and}\qquad \rho_2 = (\alpha+\beta)\,\frac{\alpha+\beta}{1+\alpha\beta} - \alpha\beta = (\alpha+\beta)\rho_1 - \alpha\beta$$

Equating these to $\hat{\rho}_1 = 0.2$ and $\hat{\rho}_2 = 0.7$ gives:

$$\frac{\alpha+\beta}{1+\alpha\beta} = 0.2 \qquad (3)$$

$$0.7 = (\alpha+\beta) \times 0.2 - \alpha\beta \qquad (4)$$


Equation (3) gives:

$$\alpha + \beta = 0.2(1 + \alpha\beta) = 0.2 + 0.2\alpha\beta \;\Leftrightarrow\; \alpha - 0.2\alpha\beta = 0.2 - \beta \;\Leftrightarrow\; \alpha = \frac{0.2 - \beta}{1 - 0.2\beta} \qquad (5)$$

Substituting this into Equation (4) gives:

$$0.7 = \left(\frac{0.2 - \beta}{1 - 0.2\beta} + \beta\right) \times 0.2 - \left(\frac{0.2 - \beta}{1 - 0.2\beta}\right)\beta = \frac{0.04 - 0.2\beta + 0.2\beta - 0.04\beta^2 - 0.2\beta + \beta^2}{1 - 0.2\beta} = \frac{0.04 - 0.2\beta + 0.96\beta^2}{1 - 0.2\beta}$$

Rearranging gives:

$$0.7(1 - 0.2\beta) = 0.04 - 0.2\beta + 0.96\beta^2 \;\Leftrightarrow\; 0.96\beta^2 - 0.06\beta - 0.66 = 0$$

$$\Leftrightarrow\; \beta = \frac{0.06 \pm \sqrt{0.06^2 + 4 \times 0.96 \times 0.66}}{2 \times 0.96} = -0.7985,\; 0.8610$$

By the symmetry of the equation $Y_t = (\alpha + \beta)Y_{t-1} - \alpha\beta Y_{t-2} + e_t$, it's clear that these two solutions are also the solutions for $\alpha$. So:

$$\alpha = -0.7985 \quad\text{and}\quad \beta = 0.8610$$


Alternatively, using the hint we substitute $X = \alpha + \beta$ and $Y = \alpha\beta$ into equations (3) and (4) to get:

$$X = 0.2 + 0.2Y \qquad (5)$$

and:

$$0.7 = 0.2X - Y \qquad (6)$$

Substituting $X$ from equation (5) into equation (6) gives:

$$0.7 = 0.04 + 0.04Y - Y \;\Rightarrow\; 0.66 = -0.96Y \;\Rightarrow\; Y = -0.6875$$

Hence $X = 0.2 + 0.2(-0.6875) = 0.0625$.

Therefore $\alpha + \beta = 0.0625$ and $\alpha\beta = -0.6875$, and since a quadratic equation with roots $\alpha$ and $\beta$ is:

$$(x - \alpha)(x - \beta) = 0 \;\Rightarrow\; x^2 - (\alpha + \beta)x + \alpha\beta = 0$$

we see that $\alpha$ and $\beta$ are roots of the quadratic equation:

$$x^2 - 0.0625x - 0.6875 = 0$$

which we solve to get:

$$x = \frac{0.0625 \pm \sqrt{0.0625^2 + 4 \times 0.6875}}{2} = -0.7985,\; 0.8610$$

ie $\alpha = -0.7985$ and $\beta = 0.8610$.

(iii) Forecast of x̂101 and x̂102

Using the parameter values $\hat{\alpha} = -0.7985$ and $\hat{\beta} = 0.8610$, we have:

$$\left(1 - 0.0625B - 0.6875B^2\right)Y_t = e_t \;\Leftrightarrow\; Y_t = 0.0625Y_{t-1} + 0.6875Y_{t-2} + e_t$$


Substituting in $Y_t = X_t - X_{t-3}$, we have:

$$X_t - X_{t-3} = 0.0625(X_{t-1} - X_{t-4}) + 0.6875(X_{t-2} - X_{t-5}) + e_t$$

$$\Leftrightarrow\; X_t = 0.0625X_{t-1} + 0.6875X_{t-2} + X_{t-3} - 0.0625X_{t-4} - 0.6875X_{t-5} + e_t$$

So, the one-step ahead forecast is:

$$\hat{x}_{101} = 0.0625x_{100} + 0.6875x_{99} + x_{98} - 0.0625x_{97} - 0.6875x_{96}$$

and the two-step ahead forecast is:

$$\hat{x}_{102} = 0.0625\hat{x}_{101} + 0.6875x_{100} + x_{99} - 0.0625x_{98} - 0.6875x_{97}$$
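A short Python sketch of these two forecasts (an added illustration; the input `x` holding the observed values $x_1, \dots, x_{100}$ is hypothetical):

```python
import numpy as np

def forecast_two_steps(x):
    """Forecasts of x_101 and x_102 from the fitted equation
    x_t = 0.0625 x_{t-1} + 0.6875 x_{t-2} + x_{t-3} - 0.0625 x_{t-4} - 0.6875 x_{t-5} + e_t,
    with the unknown future white noise terms set to zero."""
    x = np.asarray(x, dtype=float)
    step = lambda s: (0.0625 * s[-1] + 0.6875 * s[-2] + s[-3]
                      - 0.0625 * s[-4] - 0.6875 * s[-5])
    f1 = step(x)                    # forecast of x_101
    f2 = step(np.append(x, f1))     # forecast of x_102 uses the forecast of x_101
    return f1, f2
```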

15 Subject CT6 April 2013 Question 11

(i) Conditional distribution and the likelihood

The conditional distribution of $X_t \mid X_{t-1} = x_{t-1}$ is:

$$N(\alpha x_{t-1},\; \sigma^2)$$

The PDF of $X_t \mid X_{t-1} = x_{t-1}$ will be:

$$\frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x_t - \alpha x_{t-1})^2}{2\sigma^2}\right)$$

Hence the likelihood will be:

$$L(\alpha, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x_i - \alpha x_{i-1})^2}{2\sigma^2}\right)$$


(ii) Equivalence between maximum likelihood and least squares estimators

The least squares estimate will minimise the following with respect to $\alpha$:

$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (x_i - \alpha x_{i-1})^2$$

This is exactly the same as maximising the following with respect to $\alpha$, since the exponent is $-\frac{1}{2\sigma^2}$ times the sum of squares:

$$\prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x_i - \alpha x_{i-1})^2}{2\sigma^2}\right)$$

(iii) Obtain MLE

Simplifying the likelihood function:

$$L(\alpha, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\left(-\frac{(x_i - \alpha x_{i-1})^2}{2\sigma^2}\right) = \text{const} \times \frac{1}{\sigma^n}\exp\!\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \alpha x_{i-1})^2\right)$$

Taking logs:

$$\ln L(\alpha, \sigma) = \text{const} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \alpha x_{i-1})^2$$

Differentiating with respect to $\alpha$ using the chain rule:

$$\frac{\partial}{\partial\alpha}\ln L(\alpha, \sigma) = -\frac{1}{2\sigma^2}\sum_{i=1}^{n} 2(x_i - \alpha x_{i-1})(-x_{i-1}) = \frac{1}{\sigma^2}\sum_{i=1}^{n} x_{i-1}(x_i - \alpha x_{i-1})$$


Setting this equal to zero:

$$\frac{1}{\hat{\sigma}^2}\sum_{i=1}^{n} x_{i-1}(x_i - \hat{\alpha}x_{i-1}) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_{i-1}x_i - \hat{\alpha}\sum_{i=1}^{n} x_{i-1}^2 = 0 \;\Rightarrow\; \hat{\alpha} = \frac{\sum_{i=1}^{n} x_{i-1}x_i}{\sum_{i=1}^{n} x_{i-1}^2}$$

Differentiating with respect to $\sigma$:

$$\frac{\partial}{\partial\sigma}\ln L(\alpha, \sigma) = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(x_i - \alpha x_{i-1})^2$$

Setting this equal to zero:

$$-\frac{n}{\hat{\sigma}} + \frac{1}{\hat{\sigma}^3}\sum_{i=1}^{n}(x_i - \hat{\alpha}x_{i-1})^2 = 0 \;\Rightarrow\; -n\hat{\sigma}^2 + \sum_{i=1}^{n}(x_i - \hat{\alpha}x_{i-1})^2 = 0 \;\Rightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \hat{\alpha}x_{i-1})^2$$

(iv) Yule-Walker equations and estimates of a and s

A preliminary step:

$$\operatorname{cov}(X_t, e_t) = \operatorname{cov}(\alpha X_{t-1} + e_t,\; e_t) = \alpha\operatorname{cov}(X_{t-1}, e_t) + \operatorname{cov}(e_t, e_t) = 0 + \sigma^2 = \sigma^2$$


The autocovariance at lag 0 is:

$$\gamma_0 = \operatorname{cov}(X_t, X_t) = \operatorname{cov}(\alpha X_{t-1} + e_t,\; X_t) = \alpha\gamma_1 + \sigma^2$$

The autocovariance at lag 1 is:

$$\gamma_1 = \operatorname{cov}(X_t, X_{t-1}) = \operatorname{cov}(\alpha X_{t-1} + e_t,\; X_{t-1}) = \alpha\gamma_0$$

The autocovariance at lag 2 is:

$$\gamma_2 = \operatorname{cov}(X_t, X_{t-2}) = \operatorname{cov}(\alpha X_{t-1} + e_t,\; X_{t-2}) = \alpha\gamma_1$$

In general:

$$\gamma_k = \alpha\gamma_{k-1}, \qquad k \ge 1$$

Equating the autocovariances to the observed values, we get:

$$\hat{\gamma}_0 = \hat{\alpha}\hat{\gamma}_1 + \hat{\sigma}^2 \qquad (1)$$

$$\hat{\gamma}_1 = \hat{\alpha}\hat{\gamma}_0 \qquad (2)$$

Rearranging (2) gives:

$$\hat{\alpha} = \frac{\hat{\gamma}_1}{\hat{\gamma}_0}$$


Hence using (1) we have:

$$\hat{\sigma}^2 = \hat{\gamma}_0 - \hat{\alpha}\hat{\gamma}_1 = \hat{\gamma}_0 - \frac{\hat{\gamma}_1^2}{\hat{\gamma}_0}$$

where, from page 40 of the Tables:

$$\hat{\gamma}_0 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \qquad\qquad \hat{\gamma}_1 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_{i-1} - \bar{x})$$

(v) Comment

The autocovariance estimates involve the sample mean $\bar{x}$, whereas the maximum likelihood estimates don't.

16 Subject CT6 September 2013 Question 9

(i) Box-Jenkins approach

• Tentative identification of a model from the ARIMA class


• Estimation of parameters in the identified model
• Diagnostic checks

(ii) ARIMA time series to fit the observed data in the charts

The ACF cuts off (becomes 0) at all lags greater than 1, whereas the PACF
decays towards 0. Hence we have an MA(1) .

(iii) Yule-Walker equations

We start with some useful preliminary equations:

$$\operatorname{cov}(X_t, e_t) = \operatorname{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \beta_1 e_{t-1} + e_t,\; e_t) = 0 + 0 + 0 + \operatorname{cov}(e_t, e_t) = \operatorname{var}(e_t) = \sigma^2$$


Also:

$$\operatorname{cov}(X_t, e_{t-1}) = \operatorname{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \beta_1 e_{t-1} + e_t,\; e_{t-1}) = \alpha_1\operatorname{cov}(X_{t-1}, e_{t-1}) + 0 + \beta_1\operatorname{var}(e_{t-1}) + 0 = (\alpha_1 + \beta_1)\sigma^2$$

The autocovariance at lag 0 is:

$$\gamma_0 = \operatorname{cov}(X_t, X_t) = \operatorname{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \beta_1 e_{t-1} + e_t,\; X_t) = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \beta_1(\alpha_1 + \beta_1)\sigma^2 + \sigma^2 \qquad (1)$$

Similarly:

$$\gamma_1 = \operatorname{cov}(X_t, X_{t-1}) = \operatorname{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \beta_1 e_{t-1} + e_t,\; X_{t-1}) = \alpha_1\gamma_0 + \alpha_2\gamma_1 + \beta_1\sigma^2 + 0 \qquad (2)$$

$$\gamma_2 = \operatorname{cov}(X_t, X_{t-2}) = \operatorname{cov}(\alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \beta_1 e_{t-1} + e_t,\; X_{t-2}) = \alpha_1\gamma_1 + \alpha_2\gamma_0 + 0 + 0 \qquad (3)$$


In general, for lags $k \ge 2$:

$$\gamma_k = \alpha_1\gamma_{k-1} + \alpha_2\gamma_{k-2} \qquad (4)$$

These are the Yule-Walker equations.

(iv) Can the partial auto-correlation function ever give a zero value?

For a MA (q ) process, where q ≥ 1 , the PACF tends towards 0 but does not
completely cut off.

Here, we have an ARMA (2,1) process, ie q ≥ 1 . Hence the PACF tends


towards 0 but does not completely cut off. There will always be a small
partial autocorrelation.

17 Subject CT6 April 2014 Question 12

(i) Calculate PACF

$$\hat{\phi}_1 = \rho_1 = 0.68$$

$$\hat{\phi}_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \frac{0.55 - 0.68^2}{1 - 0.68^2} = 0.1629$$

(ii)(a) Estimate the AR(1) parameters

$$\gamma_0 = \operatorname{cov}(Y_t, Y_t) = \operatorname{cov}(a_0 + a_1 Y_{t-1} + e_t,\; Y_t) = a_1\gamma_1 + \sigma^2 \qquad (1)$$

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(a_0 + a_1 Y_{t-1} + e_t,\; Y_{t-1}) = a_1\gamma_0 \qquad (2)$$


Dividing equation (2) by $\gamma_0$ gives:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{a_1\gamma_0}{\gamma_0} = a_1$$

Equating the model lag 1 ACF to the sample lag 1 ACF gives:

$$\hat{a}_1 = 0.68$$

Substituting the expression for $\gamma_1$ from equation (2) into equation (1) gives:

$$\gamma_0 = a_1(a_1\gamma_0) + \sigma^2$$

Rearranging this gives:

$$\gamma_0 = \frac{\sigma^2}{1 - a_1^2} \qquad (3)$$

Equating this to the given sample variance of 0.9 and substituting in $\hat{a}_1 = 0.68$, we get:

$$\frac{\hat{\sigma}^2}{1 - 0.68^2} = 0.9 \;\Rightarrow\; \hat{\sigma}^2 = 0.9(1 - 0.68^2) = 0.48384$$

We have:

$$E(Y_t) = a_0 + a_1 E(Y_{t-1}) + 0$$

Since the time series is stationary, we have $\mu = E(Y_t) = E(Y_{t-1}) = \cdots$, so:

$$\mu = a_0 + a_1\mu \;\Rightarrow\; \mu = \frac{a_0}{1 - a_1}$$


Equating this to the given sample mean of 1.35 and substituting in $\hat{a}_1 = 0.68$, we get:

$$\frac{\hat{a}_0}{1 - 0.68} = 1.35 \;\Rightarrow\; \hat{a}_0 = 1.35(1 - 0.68) = 0.432$$

(ii)(b) Estimate the AR(2) parameters

$$\gamma_0 = \operatorname{cov}(Y_t, Y_t) = \operatorname{cov}(a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + e_t,\; Y_t) = a_1\gamma_1 + a_2\gamma_2 + \sigma^2 \qquad (1)$$

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + e_t,\; Y_{t-1}) = a_1\gamma_0 + a_2\gamma_1 \qquad (2)$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}(a_0 + a_1 Y_{t-1} + a_2 Y_{t-2} + e_t,\; Y_{t-2}) = a_1\gamma_1 + a_2\gamma_0 \qquad (3)$$

Rearranging equation (2):

$$\gamma_1 - a_2\gamma_1 = a_1\gamma_0 \;\Rightarrow\; \gamma_1 = \frac{a_1}{1 - a_2}\,\gamma_0 \qquad (4)$$

Substituting this into equation (3) gives:

$$\gamma_2 = a_1\left(\frac{a_1}{1 - a_2}\,\gamma_0\right) + a_2\gamma_0 = \left(\frac{a_1^2}{1 - a_2} + a_2\right)\gamma_0 \qquad (5)$$

Dividing equation (4) by $\gamma_0$ gives:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{a_1}{1 - a_2}$$


Dividing equation (5) by $\gamma_0$ gives:

$$\rho_2 = \frac{\gamma_2}{\gamma_0} = \frac{a_1^2}{1 - a_2} + a_2$$

Equating the lag 1 and lag 2 model and sample ACFs gives:

$$\frac{a_1}{1 - a_2} = 0.68 \qquad (6)$$

$$\frac{a_1^2}{1 - a_2} + a_2 = 0.55 \qquad (7)$$

Rearranging equation (6) gives:

$$a_1 = 0.68 - 0.68a_2 \qquad (8)$$

Substituting this, as well as equation (6), into equation (7):

$$0.68(0.68 - 0.68a_2) + a_2 = 0.55 \;\Rightarrow\; (1 - 0.68^2)a_2 = 0.55 - 0.68^2 \;\Rightarrow\; \hat{a}_2 = \frac{73}{448} = 0.16295 \qquad (9)$$

Substituting this back into equation (8) gives:

$$\hat{a}_1 = 0.68 - 0.68 \times \frac{73}{448} = \frac{255}{448} = 0.56920$$

Substituting $\gamma_1$ and $\gamma_2$ from equations (4) and (5) into equation (1) gives:

$$\gamma_0 = a_1\left(\frac{a_1}{1 - a_2}\,\gamma_0\right) + a_2\left\{\left(\frac{a_1^2}{1 - a_2} + a_2\right)\gamma_0\right\} + \sigma^2$$


Rearranging:

$$\gamma_0\left\{1 - \frac{a_1^2}{1 - a_2} - \frac{a_1^2 a_2}{1 - a_2} - a_2^2\right\} = \sigma^2 \;\Rightarrow\; \gamma_0 = \frac{\sigma^2}{1 - \dfrac{a_1^2}{1 - a_2} - \dfrac{a_1^2 a_2}{1 - a_2} - a_2^2}$$

Equating this to the given sample variance of 0.9 and substituting in $\hat{a}_1 = \frac{255}{448}$ and $\hat{a}_2 = \frac{73}{448}$, we get:

$$\hat{\sigma}^2 = 0.9\left\{1 - \frac{\left(\frac{255}{448}\right)^2}{1 - \frac{73}{448}} - \frac{\left(\frac{255}{448}\right)^2 \frac{73}{448}}{1 - \frac{73}{448}} - \left(\frac{73}{448}\right)^2\right\} = 0.47099$$

We have:

$$E(Y_t) = a_0 + a_1 E(Y_{t-1}) + a_2 E(Y_{t-2}) + 0$$

Since the time series is stationary, $\mu = E(Y_t) = E(Y_{t-1}) = E(Y_{t-2}) = \cdots$, so:

$$\mu = a_0 + a_1\mu + a_2\mu \;\Rightarrow\; \mu = \frac{a_0}{1 - a_1 - a_2}$$

Equating this to the given sample mean of 1.35 and substituting in $\hat{a}_1 = \frac{255}{448}$ and $\hat{a}_2 = \frac{73}{448}$, we get:

$$\frac{\hat{a}_0}{1 - \frac{255}{448} - \frac{73}{448}} = 1.35 \;\Rightarrow\; \hat{a}_0 = 1.35\left(1 - \frac{255}{448} - \frac{73}{448}\right) = \frac{81}{224} \approx 0.36161$$
448 448


(iii) Is stationarity necessary?

Yes stationarity is necessary for both models. Otherwise the mean, variance
and covariances would change over time.

(iv) Markov?

The first model satisfies the Markov property as it only depends on the
previous value of Yt -1 .

However, the second model does not satisfy the Markov property as it also
depends on Yt - 2 .

18 Subject CT6 September 2014 Question 9

(i) Box-Jenkins method

The steps are:


1. identify $p$, $d$ and $q$ for the ARIMA($p$, $d$, $q$) model

2. estimate the parameters (eg $\mu$, $\sigma^2$, the $\alpha_i$'s and $\beta_i$'s)

3. check the fit of the model (using diagnostic checks).

(ii) Calculate sample autocovariance function

We have $n = 200$, hence using the formulae given on page 40 of the Tables:

$$\hat{\gamma}_0 = \frac{1}{n}\sum_{t=1}^{n}(x_t - \hat{\mu})^2 = \frac{1}{200} \times 35.4 = 0.177$$

$$\hat{\gamma}_1 = \frac{1}{n}\sum_{t=2}^{n}(x_t - \hat{\mu})(x_{t-1} - \hat{\mu}) = \frac{1}{200} \times 28.4 = 0.142$$

$$\hat{\gamma}_2 = \frac{1}{n}\sum_{t=3}^{n}(x_t - \hat{\mu})(x_{t-2} - \hat{\mu}) = \frac{1}{200} \times 17.1 = 0.0855$$


(iii) Calculate the sample PACF

We have:

$$r_1 = \frac{\hat{\gamma}_1}{\hat{\gamma}_0} = \frac{0.142}{0.177} = 0.802260 \qquad\qquad r_2 = \frac{\hat{\gamma}_2}{\hat{\gamma}_0} = \frac{0.0855}{0.177} = 0.483051$$

Hence:

$$\hat{\phi}_1 = r_1 = 0.802260$$

$$\hat{\phi}_2 = \frac{r_2 - r_1^2}{1 - r_1^2} = \frac{0.483051 - 0.802260^2}{1 - 0.802260^2} = -0.45056$$

(iv) Estimate the AR(1) parameters

Equating the sample and population means gives:

$$\hat{\mu} = \bar{x} = \frac{1}{200}\sum_{t=1}^{200} x_t = \frac{1}{200} \times 83.7 = 0.4185$$

$$\gamma_0 = \operatorname{cov}(X_t, X_t) = \operatorname{cov}(\mu + a_1(X_{t-1} - \mu) + e_t,\; X_t) = a_1\gamma_1 + \sigma^2 \qquad (1)$$

$$\gamma_1 = \operatorname{cov}(X_t, X_{t-1}) = \operatorname{cov}(\mu + a_1(X_{t-1} - \mu) + e_t,\; X_{t-1}) = a_1\gamma_0 \qquad (2)$$

Taking equation (2) and dividing it by $\gamma_0$ gives:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{a_1\gamma_0}{\gamma_0} = a_1$$


Equating the model lag 1 ACF to the sample lag 1 ACF gives:

$$\hat{a}_1 = 0.802260$$

Substituting $\gamma_1$ from equation (2) into equation (1) gives:

$$\gamma_0 = a_1(a_1\gamma_0) + \sigma^2$$

Rearranging this gives:

$$\gamma_0 = \frac{\sigma^2}{1 - a_1^2} \qquad (3)$$

Equating this to the sample variance $\hat{\gamma}_0 = 0.177$ and substituting in $\hat{a}_1 = 0.802260$, we get:

$$\frac{\hat{\sigma}^2}{1 - 0.802260^2} = 0.177 \;\Rightarrow\; \hat{\sigma}^2 = 0.177(1 - 0.802260^2) = 0.063079$$

(v) Turning points test

We are testing:
H0 : the residuals are from a white noise process
H1 : the residuals are not from a white noise process

From page 42 of the Tables, we have:

$$E(T) = \frac{2}{3}(n - 2) = \frac{2}{3}(200 - 2) = 132$$

$$\operatorname{var}(T) = \frac{16n - 29}{90} = \frac{16 \times 200 - 29}{90} = 35.23$$

The observed value of the test statistic is:

$$\frac{110 - 132}{\sqrt{35.23}} = -3.706$$


The critical values are ±1.96 , so the test statistic lies in the rejection region.
Hence, we have sufficient evidence at the 5% level (and even at the 0.02%
level) to reject H0 . We have very strong evidence to suggest that the
residuals are not consistent with white noise.
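The turning points test is easy to automate; a rough Python sketch (an added illustration; `resid` is a hypothetical array of residuals):

```python
import numpy as np

def turning_points_statistic(resid):
    """Standardised turning points test statistic.
    Under H0 (white noise): E(T) = 2(n-2)/3 and var(T) = (16n - 29)/90."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    mid, left, right = resid[1:-1], resid[:-2], resid[2:]
    peaks = (mid > left) & (mid > right)
    troughs = (mid < left) & (mid < right)
    t = int(np.sum(peaks | troughs))
    return (t - 2 * (n - 2) / 3) / np.sqrt((16 * n - 29) / 90)

# With n = 200 and 110 turning points this gives (110 - 132)/sqrt(35.23) = -3.706.
```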

19 Subject CT6 April 2015 Question 7

(i) Differencing

We have:

$$Y_t - Y_{t-1} - Y_{t-12} + Y_{t-13} = e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13}$$

Taking a difference of the autoregressive terms gives:

$$(1 - B)(Y_t - Y_{t-12}) = e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13}$$

Taking a seasonal difference of order 12 gives:

$$(1 - B)(1 - B^{12})Y_t = e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13}$$

Setting $X_t = (1 - B)(1 - B^{12})Y_t$ gives:

$$X_t = e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13}$$

The differenced time series, X t , is a moving average process of order 13.

(ii) Explain why it is called seasonal differencing

Since we have monthly data then (1 - B12 ) subtracts the corresponding


monthly figure from the previous year. This will strip out any seasonal effect.

(iii) Auto-correlation function

$$\gamma_0 = \operatorname{cov}(X_t, X_t) = \operatorname{cov}(e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13},\; e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13}) = (1 + \beta_1^2 + \beta_{12}^2 + \beta_1^2\beta_{12}^2)\sigma^2$$


Also:

$$\gamma_1 = \operatorname{cov}(X_t, X_{t-1}) = \operatorname{cov}(e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13},\; e_{t-1} + \beta_1 e_{t-2} + \beta_{12} e_{t-13} + \beta_1\beta_{12} e_{t-14}) = (\beta_1 + \beta_1\beta_{12}^2)\sigma^2$$

$$\gamma_2 = \gamma_3 = \cdots = \gamma_{10} = 0$$

$$\gamma_{11} = \operatorname{cov}(X_t, X_{t-11}) = \operatorname{cov}(e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13},\; e_{t-11} + \beta_1 e_{t-12} + \beta_{12} e_{t-23} + \beta_1\beta_{12} e_{t-24}) = \beta_1\beta_{12}\sigma^2$$

$$\gamma_{12} = \operatorname{cov}(X_t, X_{t-12}) = \operatorname{cov}(e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13},\; e_{t-12} + \beta_1 e_{t-13} + \beta_{12} e_{t-24} + \beta_1\beta_{12} e_{t-25}) = (\beta_{12} + \beta_1^2\beta_{12})\sigma^2$$

$$\gamma_{13} = \operatorname{cov}(X_t, X_{t-13}) = \operatorname{cov}(e_t + \beta_1 e_{t-1} + \beta_{12} e_{t-12} + \beta_1\beta_{12} e_{t-13},\; e_{t-13} + \beta_1 e_{t-14} + \beta_{12} e_{t-25} + \beta_1\beta_{12} e_{t-26}) = \beta_1\beta_{12}\sigma^2$$

$$\gamma_{14} = \gamma_{15} = \cdots = 0$$


The ACF is therefore:

$$\rho_0 = \frac{\gamma_0}{\gamma_0} = 1 \qquad\qquad \rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\beta_1 + \beta_1\beta_{12}^2}{1 + \beta_1^2 + \beta_{12}^2 + \beta_1^2\beta_{12}^2}$$

$$\rho_2 = \rho_3 = \cdots = \rho_{10} = 0$$

$$\rho_{11} = \frac{\gamma_{11}}{\gamma_0} = \frac{\beta_1\beta_{12}}{1 + \beta_1^2 + \beta_{12}^2 + \beta_1^2\beta_{12}^2} \qquad\qquad \rho_{12} = \frac{\gamma_{12}}{\gamma_0} = \frac{\beta_{12} + \beta_1^2\beta_{12}}{1 + \beta_1^2 + \beta_{12}^2 + \beta_1^2\beta_{12}^2}$$

$$\rho_{13} = \frac{\gamma_{13}}{\gamma_0} = \frac{\beta_1\beta_{12}}{1 + \beta_1^2 + \beta_{12}^2 + \beta_1^2\beta_{12}^2} \qquad\qquad \rho_{14} = \rho_{15} = \cdots = 0$$

20 Subject CT6 October 2015 Question 11

(i)(a) Rewrite in vector format

We can rewrite the equations as:

$$\begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix}\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \begin{pmatrix} 0.5 & 0 \\ 0 & 0.5 \end{pmatrix}\begin{pmatrix} X_{t-1} \\ Y_{t-1} \end{pmatrix} + \begin{pmatrix} \epsilon_t^1 \\ \epsilon_t^2 \end{pmatrix}$$

So we have:

$$M = \begin{pmatrix} 1 & -\beta \\ -\beta & 1 \end{pmatrix} \qquad N = \begin{pmatrix} 0.5 & 0 \\ 0 & 0.5 \end{pmatrix}$$


(i)(b) Values of β for which the process is stationary

To test stationarity we require the process to be of the form $\mathbf{X}_n = A\mathbf{X}_{n-1} + \boldsymbol{\varepsilon}_n$. So we need to multiply by the inverse of matrix $M$, which is given by:

$$M^{-1} = \frac{1}{1-\beta^2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}$$

Multiplying the defining equation by $M^{-1}$ gives $\mathbf{X}_t = M^{-1}N\mathbf{X}_{t-1} + M^{-1}\boldsymbol{\varepsilon}_t$, ie:

$$\begin{pmatrix} X_t \\ Y_t \end{pmatrix} = \frac{1}{1-\beta^2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}\begin{pmatrix} 0.5 & 0 \\ 0 & 0.5 \end{pmatrix}\begin{pmatrix} X_{t-1} \\ Y_{t-1} \end{pmatrix} + \frac{1}{1-\beta^2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}\begin{pmatrix} \epsilon_t^1 \\ \epsilon_t^2 \end{pmatrix} = \begin{pmatrix} \frac{0.5}{1-\beta^2} & \frac{0.5\beta}{1-\beta^2} \\ \frac{0.5\beta}{1-\beta^2} & \frac{0.5}{1-\beta^2} \end{pmatrix}\begin{pmatrix} X_{t-1} \\ Y_{t-1} \end{pmatrix} + \frac{1}{1-\beta^2}\begin{pmatrix} 1 & \beta \\ \beta & 1 \end{pmatrix}\begin{pmatrix} \epsilon_t^1 \\ \epsilon_t^2 \end{pmatrix}$$

This time series is stationary if both eigenvalues of the matrix

$$\begin{pmatrix} \frac{0.5}{1-\beta^2} & \frac{0.5\beta}{1-\beta^2} \\ \frac{0.5\beta}{1-\beta^2} & \frac{0.5}{1-\beta^2} \end{pmatrix}$$

are less than 1 in magnitude.


Setting:

$$\det\begin{pmatrix} \frac{0.5}{1-\beta^2} - \lambda & \frac{0.5\beta}{1-\beta^2} \\ \frac{0.5\beta}{1-\beta^2} & \frac{0.5}{1-\beta^2} - \lambda \end{pmatrix} = 0$$

gives:

$$\left(\frac{0.5}{1-\beta^2} - \lambda\right)^2 - \left(\frac{0.5\beta}{1-\beta^2}\right)^2 = 0$$

Multiplying through by $(1-\beta^2)^2$ and rearranging gives:

$$\big(0.5 - \lambda(1-\beta^2)\big)^2 - (0.5\beta)^2 = 0 \;\Rightarrow\; \lambda^2(1-\beta^2)^2 - \lambda(1-\beta^2) + 0.25(1-\beta^2) = 0 \;\Rightarrow\; \lambda^2(1-\beta^2) - \lambda + 0.25 = 0$$

$$\Rightarrow\; \lambda = \frac{1 \pm \sqrt{1 - 4 \times 0.25(1-\beta^2)}}{2(1-\beta^2)} = \frac{1 \pm \beta}{2(1-\beta^2)} \qquad\text{ie}\qquad \lambda = \frac{1}{2(1-\beta)} \;\text{or}\; \frac{1}{2(1+\beta)}$$

Therefore the time series is stationary if:

$$\left|\frac{1}{2(1-\beta)}\right| < 1 \;\text{and}\; \left|\frac{1}{2(1+\beta)}\right| < 1 \;\Leftrightarrow\; |1-\beta| > \tfrac{1}{2} \;\text{and}\; |1+\beta| > \tfrac{1}{2}$$


For the first condition to be satisfied, we must have:

$$1 - \beta > \tfrac{1}{2} \;\text{or}\; 1 - \beta < -\tfrac{1}{2} \;\;\Leftrightarrow\;\; \beta < \tfrac{1}{2} \;\text{or}\; \beta > \tfrac{3}{2}$$

For the second condition to be satisfied, we must have:

$$1 + \beta > \tfrac{1}{2} \;\text{or}\; 1 + \beta < -\tfrac{1}{2} \;\;\Leftrightarrow\;\; \beta > -\tfrac{1}{2} \;\text{or}\; \beta < -\tfrac{3}{2}$$

So for both eigenvalues to be less than 1 in magnitude, we must have:

$$|\beta| < \tfrac{1}{2} \quad\text{or}\quad |\beta| > \tfrac{3}{2}$$

(ii) VAR(p)

We can rewrite the equations as:

 Xt       X t 1   0 0   X t  2    t1 
      
 Yt    0   Yt 1     0   Yt  2    2 
t 

which is a VAR(2) since it is of the form:

$$\mathbf{X}_t = A_1\mathbf{X}_{t-1} + A_2\mathbf{X}_{t-2} + \boldsymbol{\varepsilon}_t$$

with:

    0 0
A1    and A2   
 0   0


21 Subject CT6 April 2016 Question 9

(i) Testing stationarity and identifying ARMA(p,q)

We have:

$$Y_t = 0.6Y_{t-1} + 0.16Y_{t-2} + 1 + \epsilon_t$$

The characteristic equation is:

$$1 - 0.6\lambda - 0.16\lambda^2 = 0$$

Solving for $\lambda$:

$$1 - 0.6\lambda - 0.16\lambda^2 = 0 \;\Rightarrow\; \lambda = -5 \;\text{or}\; \frac{5}{4}$$

Since both $|-5| > 1$ and $\left|\frac{5}{4}\right| > 1$, the process is stationary.

The earliest $Y$ term is $Y_{t-2}$, so $p = 2$. The earliest white noise term is $\epsilon_t$, which has zero lag, so $q = 0$.

Hence this is an ARMA(2,0) process.

(ii) Calculating the mean

Taking expectations gives:

$$E(Y_t) = E(1 + 0.6Y_{t-1} + 0.16Y_{t-2} + \epsilon_t) = 1 + 0.6E(Y_{t-1}) + 0.16E(Y_{t-2}) + E(\epsilon_t)$$

Since the process is stationary, we know that $E(Y_{t-k})$ is equal to some constant $\mu$ for all values of $k$. Also $E(\epsilon_t) = 0$. This gives:

$$\mu = 1 + 0.6\mu + 0.16\mu \;\Rightarrow\; \mu = \frac{1}{0.24} = 4\tfrac{1}{6}$$


(iii) Calculating ACFs and PACFs

The Yule-Walker equations are:

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(1 + 0.6Y_{t-1} + 0.16Y_{t-2} + \epsilon_t,\; Y_{t-1}) = 0 + 0.6\gamma_0 + 0.16\gamma_1 + 0 \qquad (1)$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}(1 + 0.6Y_{t-1} + 0.16Y_{t-2} + \epsilon_t,\; Y_{t-2}) = 0 + 0.6\gamma_1 + 0.16\gamma_0 + 0 \qquad (2)$$

$$\gamma_3 = \operatorname{cov}(Y_t, Y_{t-3}) = 0.6\gamma_2 + 0.16\gamma_1 \qquad (3)$$

$$\gamma_4 = \operatorname{cov}(Y_t, Y_{t-4}) = 0.6\gamma_3 + 0.16\gamma_2 \qquad (4)$$

Rearranging equation (1) gives:

$$0.84\gamma_1 = 0.6\gamma_0 \;\Rightarrow\; \gamma_1 = \frac{0.6}{0.84}\,\gamma_0 = \frac{5}{7}\,\gamma_0 \;\Rightarrow\; \rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{5}{7}$$


Substituting $\gamma_1 = \frac{5}{7}\gamma_0$ into equation (2) gives:

$$\gamma_2 = 0.6\left(\frac{5}{7}\,\gamma_0\right) + 0.16\gamma_0 = \frac{103}{175}\,\gamma_0 \;\Rightarrow\; \rho_2 = \frac{\gamma_2}{\gamma_0} = \frac{103}{175}$$

Substituting these into equation (3) gives:

$$\gamma_3 = 0.6\left(\frac{103}{175}\,\gamma_0\right) + 0.16\left(\frac{5}{7}\,\gamma_0\right) = \frac{409}{875}\,\gamma_0 \;\Rightarrow\; \rho_3 = \frac{\gamma_3}{\gamma_0} = \frac{409}{875}$$

Substituting all the above into equation (4) gives:

$$\gamma_4 = 0.6\left(\frac{409}{875}\,\gamma_0\right) + 0.16\left(\frac{103}{175}\,\gamma_0\right) = \frac{1{,}639}{4{,}375}\,\gamma_0 \;\Rightarrow\; \rho_4 = \frac{\gamma_4}{\gamma_0} = \frac{1{,}639}{4{,}375}$$

Using the formulae on page 40 of the Tables, the PACF is:

$$\phi_1 = \rho_1 = \frac{5}{7} \qquad\qquad \phi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \frac{\frac{103}{175} - \left(\frac{5}{7}\right)^2}{1 - \left(\frac{5}{7}\right)^2} = \frac{4}{25}$$

$$\phi_3 = \phi_4 = 0 \quad\text{since the PACF of an AR(2) cuts off for lags greater than 2.}$$
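These fractions can be verified exactly (an illustrative addition, not part of the solution):

```python
from fractions import Fraction as F

a1, a2 = F(3, 5), F(4, 25)            # 0.6 and 0.16
rho = [F(1), a1 / (1 - a2)]           # rho_0 = 1, rho_1 = 5/7
for _ in range(3):
    rho.append(a1 * rho[-1] + a2 * rho[-2])
phi2 = (rho[2] - rho[1] ** 2) / (1 - rho[1] ** 2)
print(rho[1:], phi2)   # [5/7, 103/175, 409/875, 1639/4375] and 4/25
```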


22 Subject CT6 September 2016 Question 9

(i) Why s = 12

Setting $s = 12$, ie setting $Y_t = X_t - X_{t-12}$, eliminates the factor of $(1 - B^{12})$ and leaves us with an equation of the form:

$$\left(1 - (\alpha + \beta)B + \alpha\beta B^2\right)Y_t = e_t$$

So, setting $s = 12$ removes the seasonal component from the time series.

(ii) Values of α and β for which the differenced process is stationary

The characteristic equation of the differenced series is:

$$1 - (\alpha + \beta)\lambda + \alpha\beta\lambda^2 = 0$$

Factorising and solving gives:

$$(1 - \alpha\lambda)(1 - \beta\lambda) = 0 \;\Rightarrow\; \lambda = \frac{1}{\alpha},\; \frac{1}{\beta}$$

For stationarity, we need:

$$\left|\frac{1}{\alpha}\right| > 1 \quad\text{and}\quad \left|\frac{1}{\beta}\right| > 1 \qquad\text{ie}\qquad |\alpha| < 1 \quad\text{and}\quad |\beta| < 1$$

(iii) Estimates of α and β

The equation for $Y_t$ is:

$$Y_t = (\alpha + \beta)Y_{t-1} - \alpha\beta Y_{t-2} + \epsilon_t$$


The Yule-Walker equations are:

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}\big((\alpha+\beta)Y_{t-1} - \alpha\beta Y_{t-2} + \epsilon_t,\; Y_{t-1}\big) = (\alpha+\beta)\gamma_0 - \alpha\beta\gamma_1 + 0 \qquad (1)$$

$$\gamma_2 = \operatorname{cov}(Y_t, Y_{t-2}) = \operatorname{cov}\big((\alpha+\beta)Y_{t-1} - \alpha\beta Y_{t-2} + \epsilon_t,\; Y_{t-2}\big) = (\alpha+\beta)\gamma_1 - \alpha\beta\gamma_0 + 0 \qquad (2)$$

Rearranging equation (1) gives:

$$\gamma_1 = \frac{\alpha+\beta}{1+\alpha\beta}\,\gamma_0 \qquad (3)$$

Dividing equations (2) and (3) through by $\gamma_0$, we have the following two equations:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\alpha+\beta}{1+\alpha\beta} \qquad\qquad \rho_2 = \frac{\gamma_2}{\gamma_0} = (\alpha+\beta)\rho_1 - \alpha\beta$$

Equating these to $\hat{\rho}_1 = 0$ and $\hat{\rho}_2 = 0.09$ gives:

$$\rho_1 = \frac{\alpha+\beta}{1+\alpha\beta} = 0 \qquad\text{and}\qquad \rho_2 = 0 - \alpha\beta = 0.09$$

So:

$$\alpha + \beta = 0 \quad\text{and}\quad -\alpha\beta = 0.09 \qquad\text{ie}\qquad \alpha = -\beta \quad\text{and}\quad \alpha^2 = 0.09$$

Hence $\alpha = 0.3$ or $-0.3$, and the corresponding values of $\beta$ are $-0.3$ or $0.3$.


(iv) Forecast values

Using the estimated values of $\alpha$ and $\beta$ from part (iii), the equation for the differenced series is:

$$Y_t = (\hat{\alpha} + \hat{\beta})Y_{t-1} - \hat{\alpha}\hat{\beta}Y_{t-2} + \epsilon_t = 0.09Y_{t-2} + \epsilon_t$$

Also, since $Y_t = X_t - X_{t-12}$, we have:

$$X_t - X_{t-12} = 0.09(X_{t-2} - X_{t-14}) + \epsilon_t \qquad\text{ie}\qquad X_t = 0.09X_{t-2} + X_{t-12} - 0.09X_{t-14} + \epsilon_t$$

So, the one-step ahead forecast at time $T$ is:

$$\hat{x}_{T+1} = 0.09x_{T-1} + x_{T-11} - 0.09x_{T-13}$$

Similarly, the two-step ahead forecast is:

$$\hat{x}_{T+2} = 0.09x_T + x_{T-10} - 0.09x_{T-12}$$

23 Subject CT6 April 2017 Question 6

(i) State two approaches for estimating the parameters

Any TWO of the following:


 Method of moments estimation
 Maximum likelihood estimation
 Least squares estimation.

(ii) Explain which approach should be used

Any ONE of the following:


 Method of moments estimation
 Least squares estimation.


(iii) Method to calculate estimates

$$\gamma_0 = \operatorname{cov}(Y_t, Y_t) = \operatorname{cov}(\mu + \alpha Y_{t-1} + \epsilon_t,\; Y_t) = \alpha\gamma_1 + \sigma^2 \qquad (1)$$

$$\gamma_1 = \operatorname{cov}(Y_t, Y_{t-1}) = \operatorname{cov}(\mu + \alpha Y_{t-1} + \epsilon_t,\; Y_{t-1}) = \alpha\gamma_0 \qquad (2)$$

Dividing equation (2) by $\gamma_0$ gives:

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\alpha\gamma_0}{\gamma_0} = \alpha \qquad (3)$$

We now equate equation (1) to the sample variance and equation (3) to the sample ACF at lag 1.

(iv) Explain why the models are identical

There was an error in the question. It should have read:

$$(1 - cB)Y_t = (1 - cB)\mu + (1 - cB)(\alpha Y_{t-1} + \epsilon_t)$$

Since the same filter has been applied to both sides of the equation, the
observations of Model A will also satisfy Model B.

(v) Explain for which values of c Model B is stationary

Since Model A is stationary and Model B is equivalent then Model B is also


stationary.

However, in order to cancel the (1 - cB) term then (1 - cB)-1 must exist.
Looking at the conditions given on page 2 of the Tables for a convergent
series expansion for (1 + x )p we see that | x | < 1 which means we require
| c | < 1 . This was not included in the examiners’ solution though.


24 Subject CT6 September 2017 Question 10

(i) Non-stationarity of X t

For this time series:

E ( X t ) = E (a + bt + Yt ) = a + bt + E (Yt )

We are told that Yt is stationary, so E (Yt ) is equal to some constant, m


say. Therefore:

E ( X t ) = a + bt + m

The presence of the bt term indicates that E ( X t ) depends on time t,


ie E ( X t ) is not constant and hence X t is not stationary.

(ii) Stationarity of DX t

We have:

DX t = X t - X t -1 = a + bt + Yt - (a + b(t - 1) + Yt -1) = b + Yt - Yt -1 = b + DYt

Since Yt is stationary, it follows that DYt is also stationary. Adding a


constant to a stationary series produces another stationary series. So DX t
is stationary.

(iii) Autocovariance function of DX t

Since Yt is a stationary series, its covariance function depends only on the


lag. Let:

g Y (s ) = cov(Yt ,Yt - s )


Since $\Delta X_t = b + Y_t - Y_{t-1}$:

$$\operatorname{cov}(\Delta X_t, \Delta X_{t-s}) = \operatorname{cov}(b + Y_t - Y_{t-1},\; b + Y_{t-s} - Y_{t-s-1})$$
$$= \operatorname{cov}(Y_t, Y_{t-s}) - \operatorname{cov}(Y_t, Y_{t-s-1}) - \operatorname{cov}(Y_{t-1}, Y_{t-s}) + \operatorname{cov}(Y_{t-1}, Y_{t-s-1})$$
$$= \gamma_Y(s) - \gamma_Y(s+1) - \gamma_Y(s-1) + \gamma_Y(s) = 2\gamma_Y(s) - \gamma_Y(s+1) - \gamma_Y(s-1)$$

(iv) Equation for DX t

Substituting the expression for $Y_t$ into the formula for $\Delta X_t$ gives:

$$\Delta X_t = b + Y_t - Y_{t-1} = b + \epsilon_t + \beta\epsilon_{t-1} - \epsilon_{t-1} - \beta\epsilon_{t-2} = b + \epsilon_t + (\beta - 1)\epsilon_{t-1} - \beta\epsilon_{t-2}$$

Hence:

$$\Delta X_t = b + \left(1 + (\beta - 1)L - \beta L^2\right)\epsilon_t$$
(v) Variance of DX t

The white noise process e t is a set of uncorrelated random variables. So:

var(Yt ) = var(e t + be t -1) = var(e t ) + b 2 var(e t -1)

In addition, the white noise random variables are identically distributed, so


they all have the same variance. If we denote this common variance by s 2 ,
then:

var(Yt ) = s 2 + b 2s 2 = (1 + b 2 )s 2


Similarly:

$$\operatorname{var}(\Delta X_t) = \operatorname{var}(b + \epsilon_t + (\beta - 1)\epsilon_{t-1} - \beta\epsilon_{t-2}) = \operatorname{var}(\epsilon_t) + \operatorname{var}((\beta - 1)\epsilon_{t-1}) + \operatorname{var}(-\beta\epsilon_{t-2})$$
$$= \sigma^2 + (\beta - 1)^2\sigma^2 + \beta^2\sigma^2 = (\beta - 1)^2\sigma^2 + (1 + \beta^2)\sigma^2 = (\beta - 1)^2\sigma^2 + \operatorname{var}(Y_t)$$

which is greater than the variance of $Y_t$.


FACTSHEET

This factsheet summarises the main methods, formulae and information


required for tackling questions on the topics in this booklet.

Time series process

Stochastic process, { X t : t Œ J } , with a continuous state space, X t Œ S , and


discrete time set J . There are two types:

 univariate (eg MA, AR, ARMA, ARIMA) – just one variable, say, X t

 multivariate (eg VAR) – more than one variable, use vectors/matrices.

White noise

A sequence of uncorrelated (or independent) random variables, each with the same distribution. In the context of time series, these random variables are assumed to have zero mean.

Stationarity

A time series is (weakly) stationary if:


 E ( X t ) is constant

 cov( X t , X t + k ) depends only on the lag k .

We can test this by:


 writing down the characteristic polynomial of the X t ’s

 showing that the roots are all greater than 1 in magnitude.

Moving average processes are always stationary.


Invertibility

A time series is invertible if we can calculate the white noise terms


(residuals) from observed data values by inverting the process formula.

We can test this by:

 writing down the characteristic polynomial of the et ’s

 showing that the roots are all greater than 1 in magnitude.

Autoregressive processes are always invertible.

Purely indeterministic

 If knowledge of X1,  , X n is less useful in predicting X N as N Æ • .

 All the time series in this course are purely indeterministic.

Markov property

 Future probabilities depend only on the most recent value.


 Only the AR(1) and VAR(1) have this property.

 AR( p ) ’s can be converted to a VAR(1) .

Integrated of order d

 I (0) means the process, X t , is stationary

 I (d ) , d > 0 means the process, X t , is not stationary, but Yt = —d X t is.

Autocovariance function

$$\gamma_0 = \operatorname{var}(X_t) \qquad\qquad \gamma_k = \operatorname{cov}(X_t, X_{t+k})$$


Autocorrelation function (ACF)

• $\rho_k = \operatorname{corr}(X_t, X_{t+k}) = \dfrac{\gamma_k}{\gamma_0}$, with $-1 \le \rho_k \le 1$
• $\rho_k \to 0$ as $k \to \infty$ for purely indeterministic processes
• $\rho_k = 0$ for $k > q$ for an MA($q$) process.

Partial autocorrelation function (PACF)

• Conditional correlation of $X_{t+k}$ with $X_t$ given $X_{t+1}, \dots, X_{t+k-1}$

$$\phi_1 = \rho_1, \qquad \phi_2 = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}, \qquad \phi_k \text{ given on page 40 of the Tables}$$

• $\phi_k \to 0$ as $k \to \infty$ for purely indeterministic processes
• $\phi_k = 0$ for $k > p$ for an AR($p$) process.

Moving average process of order q, MA(q)

• weighted average of the past $q$ white noise terms:

$$X_t = \mu + e_t + \beta_1 e_{t-1} + \dots + \beta_q e_{t-q}$$

• always stationary
• need to check invertibility
• never Markov
• ACF, $\rho_k$, cuts off for $k > q$
• PACF, $\phi_k$, decays to zero.


Autoregressive process of order p, AR(p)

• weighted average of the past $p$ observed values:

$$X_t = \mu + \alpha_1(X_{t-1} - \mu) + \dots + \alpha_p(X_{t-p} - \mu) + e_t$$

• need to check stationarity
• always invertible
• only AR(1) is Markov
• ACF, $\rho_k$, decays to zero
• PACF, $\phi_k$, cuts off for $k > p$.

Autoregressive moving average process, ARMA(p, q)

• combination of AR($p$) and MA($q$):

$$X_t = \mu + \alpha_1(X_{t-1} - \mu) + \dots + \alpha_p(X_{t-p} - \mu) + e_t + \beta_1 e_{t-1} + \dots + \beta_q e_{t-q}$$

• need to check stationarity
• need to check invertibility
• only ARMA(1,0) is Markov
• ACF, $\rho_k$, decays to zero
• PACF, $\phi_k$, decays to zero.

Autoregressive integrated moving average process, ARIMA(p, d, q)

$\nabla^d X_t$ is a stationary ARMA($p$, $q$) process.


Vector autoregressive process of order p, VAR(p)

• $\mathbf{X}_t = \boldsymbol{\mu} + A_1(\mathbf{X}_{t-1} - \boldsymbol{\mu}) + \dots + A_p(\mathbf{X}_{t-p} - \boldsymbol{\mu}) + \mathbf{e}_t$
• always invertible
• VAR(1) is Markov
• AR($p$)'s can be converted to a VAR(1).

Cointegration

Two time series processes, X t and Yt are cointegrated if:

 X t ,Yt are both I (1)

 There exist non-zero values of a and b such that a X t + b Yt is


stationary.

Removing trends and seasonal variation

Linear trends can be removed by:

• differencing, $y_t = x_t - x_{t-1}$
• least squares trend removal, $y_t = x_t - (a + bt)$.

Exponential trends can be removed by taking logs, ie set $y_t = \ln x_t$.

Seasonal variation can be removed by:

• seasonal differencing, eg set $y_t = x_t - x_{t-12}$
• method of moving averages, eg set:

$$y_t = \frac{1}{12}\left(\tfrac{1}{2}x_{t-6} + \dots + x_{t-1} + x_t + x_{t+1} + \dots + \tfrac{1}{2}x_{t+6}\right)$$

• method of seasonal means, eg subtract the monthly estimate from the appropriate month.
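A minimal sketch of the centred moving average above, assuming a monthly series `x` (an added illustration, not part of the original factsheet):

```python
import numpy as np

def centred_ma_12(x):
    """Centred 12-term moving average with half weight on the two end points."""
    weights = np.array([0.5] + [1.0] * 11 + [0.5]) / 12.0
    return np.convolve(np.asarray(x, dtype=float), weights, mode="valid")
```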


Box-Jenkins methodology

An approach to fitting an ARIMA( p, d , q ) model to a data set.

Step 1 – identify the p , d and q

An SACF slowly decaying from 1 indicates that the series should be differenced. A good choice for $d$ is the difference with the lowest sample variance.

• We have an AR($p$) if $\hat{\phi}_k$ cuts off for $k > p$.
• We have an MA($q$) if $r_k$ cuts off for $k > q$.
• Otherwise we have an ARMA, which can be fitted by a computer.

Step 2 – estimate parameters

• Use method of moments estimation by setting $\rho_k = r_k$.
• Least squares could be used on an AR($p$) process.
• Maximum likelihood requires an assumption about the distribution of the $e_t$'s.

Step 3 – check fit of the model

If the model is a good fit then the residuals, $\hat{e}_t$, will be white noise:

• the graph of $\{\hat{e}_t\}$ against $t$ or $x_t$ should be patternless
• a turning points test can check that the $\{\hat{e}_t\}$ are patternless
• the graphs of the SACF and SPACF of $\{\hat{e}_t\}$ should be patternless and close to zero
• 95% of the SACF and SPACF values should lie within $\pm 2/\sqrt{n}$
• a Ljung-Box chi-squared test can be used to check whether the residuals are uncorrelated.


Forecasting

There are two methods available:

• The $k$-step ahead forecast, $\hat{x}_n(k)$, estimates $x_{n+k}$ given $x_0, \dots, x_n$. For an ARMA(2,1) process, $x_n = \alpha_1 x_{n-1} + \alpha_2 x_{n-2} + e_n + \beta e_{n-1}$, we get:

$$\hat{x}_n(1) = \hat{\alpha}_1 x_n + \hat{\alpha}_2 x_{n-1} + 0 + \hat{\beta}\hat{e}_n$$
$$\hat{x}_n(2) = \hat{\alpha}_1\hat{x}_n(1) + \hat{\alpha}_2 x_n + 0 + 0, \quad \text{etc}$$

• Exponential smoothing calculates a forecast based on a weighted average using smoothing parameter $\alpha$:

$$\hat{x}_n(1) = (1 - \alpha)\hat{x}_{n-1}(1) + \alpha x_n$$
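A short sketch of the exponential smoothing recursion (an added illustration; starting the recursion at the first observation is an assumption, not a prescribed choice):

```python
def exponential_smoothing(x, alpha):
    """One-step ahead forecasts: xhat_n(1) = (1 - alpha)*xhat_{n-1}(1) + alpha*x_n."""
    forecast = x[0]          # assumed starting value
    forecasts = []
    for obs in x:
        forecast = (1 - alpha) * forecast + alpha * obs
        forecasts.append(forecast)
    return forecasts         # forecasts[n-1] is the forecast made at time n
```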

