stat520ch9slides


Chapter 9: Forecasting

I One of the critical goals of time series analysis is to forecast (predict) the values of the time series at times in the future.
I When forecasting, we ideally should evaluate the precision of the forecast.
I We will consider examples of forecasts for
1. deterministic trend models;
2. ARMA- and ARIMA-type models;
3. models containing deterministic trends and ARMA
(or ARIMA) stochastic components.
I The methods we use here assume the model
(including parameter values) is known exactly.
I This is not true in practice, but for large sample sizes,
the parameter estimates should be close to the true
parameter values.
Minimum MSE Forecasting

I Assume we have observed the time series up to the present time, t, so that we have observed Y1, Y2, . . . , Yt.
I The goal is to forecast the value of Yt+`, which is the value `
time units into the future.
I In this case, time t is called the forecast origin and ` is
called the lead time of the forecast.
I The forecast (predicted future value) itself is denoted Yˆt(`).
I We will find the forecast formula that minimizes the mean square error (MSE) of the forecast, E[(Yt+` − Yˆt(`))²], for a variety of models.
Forecasting with a Deterministic Trend Model

I Consider the trend model Yt = µt + Xt, where µt is some deterministic trend and the stochastic component Xt has mean zero.
I In particular, we assume {Xt} is white noise with variance
γ0. Then

Yˆt(`) = E(µt+` + Xt+`|Y1, Y2, . . . , Yt)
       = E(µt+`|Y1, Y2, . . . , Yt) + E(Xt+`|Y1, Y2, . . . , Yt)
       = E(µt+`) + E(Xt+`) = µt+`,

since Xt+` has mean zero and is independent of the previously observed values Y1, Y2, . . . , Yt.
Forecasting with a Linear Trend Model

I In the case in which we assume a linear trend, µt = β0 + β1t.
I So the forecast of the response at ` time units into the
future is Yˆt(`) = β0 + β1(t + `).
I This forecast assumes that the same linear trend holds in
the future, which can be a dangerous assumption, since
we don’t have the (future) data (yet) to justify it.
Forecasting with Other Trend Models
I For a quadratic trend, where µt = β0 + β1t + β2t², the forecast is Yˆt(`) = β0 + β1(t + `) + β2(t + `)².
I With higher-order polynomial trends, extrapolating into the future becomes even more risky.
I For periodic seasonal means models in which µt = µt+12, the forecast is Yˆt(`) = µt+` = µt+`+12 = Yˆt(` + 12).


I So for such models, the forecast at a particular time is
the same as the forecast at the time 12 months later.
I See the examples of forecasts on real data sets on the
course web page.
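
I As a rough illustration (separate from the course web page examples), the Python sketch below evaluates these trend forecasts; the trend coefficients, forecast origin, and monthly means are all hypothetical values chosen only to show the formulas.

import numpy as np

# Sketch: forecasts from deterministic trend models (hypothetical parameter values).
beta0, beta1, beta2 = 10.0, 0.5, 0.02   # assumed fitted trend coefficients
t = 100                                 # forecast origin

def linear_forecast(lead):
    # Y_hat_t(lead) = beta0 + beta1*(t + lead)
    return beta0 + beta1 * (t + lead)

def quadratic_forecast(lead):
    # Y_hat_t(lead) = beta0 + beta1*(t + lead) + beta2*(t + lead)**2
    return beta0 + beta1 * (t + lead) + beta2 * (t + lead) ** 2

seasonal_means = np.linspace(20.0, 31.0, 12)   # assumed monthly means mu_1, ..., mu_12

def seasonal_forecast(lead):
    # Y_hat_t(lead) = mu_{t+lead}, where the means repeat with period 12
    return seasonal_means[(t + lead - 1) % 12]

print(linear_forecast(5), quadratic_forecast(5))
print(seasonal_forecast(5), seasonal_forecast(5 + 12))   # equal: forecasts 12 months apart coincide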
Forecast Error and Forecast Error Variance
I The forecast error is denoted by et(`):

et(`) = Yt+` − Yˆt(`)
      = µt+` + Xt+` − µt+` = Xt+`,

so that E[et(`)] = E[Xt+`] = 0.


I Thus the forecast is unbiased.
I And the forecast error variance is var[et(`)] = var[Xt+`] = γ0,
which does not depend on the lead time `.
Forecasting in AR(1) Models

I Consider the AR(1) process with a nonzero mean µ:


Yt − µ = φ(Yt−1 − µ) + et.

I Suppose we want to forecast the process 1 time unit into the future. Note that

Yt+1 − µ = φ(Yt − µ) + et+1.

I Taking the conditional expected value (given Y1, Y2, . . . , Yt) of both sides, we have:

Yˆt(1) − µ = φ[E(Yt|Y1, Y2, . . . , Yt) − µ] + E(et+1|Y1, Y2, . . . , Yt)
           = φ[Yt − µ] + E(et+1) = φ[Yt − µ],

since et+1 is independent of Y1, Y2, . . . , Yt and has mean zero.


Forecasting and the Difference Equation Form
I So Yˆt(1) = µ + φ(Yt − µ).
I That is, the forecast for the next value is the process
mean, plus some fraction of the current deviation from
the process mean.
I If we forecast not just 1 time unit but ` time units into the
future, we have

Yˆt(`) = µ + φ[Yˆt(` − 1) − µ] for ` ≥ 1.

I So any forecast can be found recursively: We can find Yˆt(1), which we can then use to find Yˆt(2), etc.


I This recursive formula is called the difference equation form
of the forecasts.
A General Formula for Forecasts in AR(1) Models
I Note that we can solve for a general formula for a
forecast with a lead time ` in an AR(1) process:

Yˆt(`) = φ[Yˆt(` − 1) − µ] + µ
       = φ[φ[Yˆt(` − 2) − µ] + µ − µ] + µ
       = φ²[Yˆt(` − 2) − µ] + µ
       ...
       = φ^(`−1)[Yˆt(1) − µ] + µ
       = φ^(`−1)[µ + φ(Yt − µ) − µ] + µ,

which implies that Yˆt(`) = µ + φ^`(Yt − µ).


I So the fraction of the current deviation from the process
mean that is added to µ becomes closer to zero as the
lead time gets larger.
Forecasting with the Color Property Example

I Recall that we used an AR(1) model for the color property time series.
I Via ML, we estimated φ and µ to be 0.5705 and
74.3293, respectively.
I For the purpose of the forecast, we will take these to be
the true parameter values (though they really are not).
I The last observed value, Yt, of this color property series
was 67.
Forecasting with the Color Property Example

I So forecasting 1 time unit into the future yields Yˆt(1) = 74.3293 + 0.5705(67 − 74.3293) = 70.14793.
I We get the lead ` forecast

Yˆt(`) = 74.3293 + (0.5705)^`(67 − 74.3293).

I We can implement a function to calculate Yˆt(`) for some `, as in the sketch below: Yˆt(1) = 70.14793, Yˆt(2) = 71.94383, and so on.
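
I A minimal sketch of such a function in Python, treating the ML estimates as if they were the true parameter values:

# Sketch: AR(1) forecasts for the color property series, using the ML estimates
# phi = 0.5705 and mu = 74.3293 as if they were the true values, with Y_t = 67.
phi, mu, y_t = 0.5705, 74.3293, 67.0

def ar1_forecast(lead):
    # Closed-form AR(1) forecast: Y_hat_t(lead) = mu + phi**lead * (Y_t - mu)
    return mu + phi ** lead * (y_t - mu)

for lead in (1, 2, 5, 20):
    print(lead, round(ar1_forecast(lead), 5))
# leads 1, 2, 5, 20 give 70.14793, 71.94383, 73.88636, 74.3292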
Forecasting with the Color Property Example
(continued)

I To forecast, say, 5 time units into the future, we can continue recursively, or just use the general formula to obtain Yˆt(5) = 74.3293 + 0.5705^5(67 − 74.3293) = 73.88636.
I Note that forecasting 20 time units into the future yields Yˆt(20) = 74.3293 + 0.5705^20(67 − 74.3293) = 74.3292.
I We see that for a large lead time, the forecast nearly equals µ.
I In general, for all stationary ARMA models, Yˆt(`) ≈ µ for large `.
One-step-ahead Forecast Error

I The one-step-ahead forecast error et(1) is the difference between the actual value of the process one time unit into the future and the predicted value one time unit ahead.
I For the AR(1) model, this is et(1) = Yt+1 − Yˆt(1) = [φ(Yt − µ) + µ + et+1] − [φ(Yt − µ) + µ] = et+1.
I So the one-step-ahead forecast error is simply a white-noise observation, and it is independent of Y1, Y2, . . . , Yt.
I And var[et(1)] = σ²e.
Forecast Error for General Lead Time
I The forecast error for a general lead time, `, et(`), is the difference between the actual value of the process ` time units into the future and the predicted value ` time units ahead.
I For any general linear process, it can be shown that

et(`) = et+` + ψ1et+`−1 + ψ2et+`−2 + · · · + ψ`−1et+1.

I Clearly, E[et(`)] = 0, so the forecasts are unbiased.
I And var[et(`)] = σ²e(1 + ψ1² + ψ2² + · · · + ψ`−1²).
I These results hold for all ARIMA models.
Forecast Error for General Lead Time in AR(1)
Models
I For an AR(1) process, the forecast error for a general lead time is

et(`) = et+` + φet+`−1 + φ²et+`−2 + · · · + φ^(`−1)et+1.

I And var[et(`)] = σ²e(1 − φ^(2`))/(1 − φ²).
I So for long lead times (large `), var[et(`)] ≈ σ²e/(1 − φ²).
I And since this right hand side is the variance formula for
an AR(1) process, note that var[et(`)] ≈ var(Yt) = γ0 for
large `.
I This last result holds for all stationary ARMA models.
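
I A small Python sketch of this variance formula, reusing φ = 0.5705 from the color property example and an assumed white noise variance σ²e = 1:

# Sketch: AR(1) forecast error variance var[e_t(lead)] = sigma2_e*(1 - phi**(2*lead))/(1 - phi**2).
# phi is taken from the color property example; sigma2_e = 1 is an assumed value.
phi, sigma2_e = 0.5705, 1.0
gamma0 = sigma2_e / (1 - phi ** 2)      # stationary AR(1) process variance

for lead in (1, 2, 5, 20):
    error_var = sigma2_e * (1 - phi ** (2 * lead)) / (1 - phi ** 2)
    print(lead, round(error_var, 4))    # approaches gamma0 as the lead time grows

print("gamma0 =", round(gamma0, 4))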
Forecasting with an MA(1) Model

I Consider now an MA(1) model with a nonzero mean, Yt = µ + et − θet−1.
I Replacing t by t + 1 and taking conditional expectations, we have Yˆt(1) = µ − θE(et|Y1, Y2, . . . , Yt).
I If the model is invertible, then E(et|Y1, Y2, . . . , Yt) = et (at
least approximately, since we condition on Y1, Y2, . . . , Yt
rather than on the infinite history . . . , Y0, Y1, Y2, . . . , Yt).
I If the model is not invertible, then E(et|Y1, Y2, . . . , Yt) ≠ et (not even approximately).
I For an invertible MA(1) model, the one-step-ahead forecast
is Yˆt(1) = µ − θet.
Forecast Error for MA(1) Model

I Again, the one-step-ahead forecast error is et(1) = Yt+1 − Yˆt(1) = [µ + et+1 − θet] − [µ − θet] = et+1.
I For a longer lead time, where ` > 1,

Yˆt(`) = µ + E(et+`|Y1, Y2, . . . , Yt) − θE(et+`−1|Y1, Y2, . . . , Yt).

I But for ` > 1, both et+` and et+`−1 are independent of Y1, Y2,
. . . , Yt, so these conditional expected values are both
zero.

I Therefore, in an invertible MA(1) model, Yˆt(`) = µ for ` > 1.
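
I A minimal sketch of the one-step forecast from an invertible MA(1) fit, recovering et by recursively filtering the observed series; the parameter values and data below are made up, and e0 is approximated by 0:

import numpy as np

# Sketch: one-step-ahead forecast from an invertible MA(1) model.
# mu, theta, and the observed values are hypothetical; e_0 is approximated by 0.
mu, theta = 50.0, 0.4
y = np.array([50.3, 49.1, 50.8, 50.2])      # observed series Y_1, ..., Y_t

e = 0.0
for value in y:
    e = value - mu + theta * e              # e_s = Y_s - mu + theta * e_{s-1}

print("Y_hat_t(1) =", round(mu - theta * e, 4))   # one-step-ahead forecast
print("Y_hat_t(l) = mu =", mu, "for l > 1")       # longer leads revert to the mean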


Forecasting with the Random Walk with Drift

I Now we consider forecasting with a nonstationary ARIMA process.
I Specifically, consider the random walk with drift model,
where Yt = Yt−1 + θ0 + et.
I This is basically an ARIMA(0, 1, 0) model with an
extra constant term.
I The forecast one step ahead is

Yˆt(1) = E(Yt|Y1, Y2, . . . , Yt) + θ0 + E(et+1|Y1, Y2, . . . , Yt)
       = Yt + θ0.
Forecasting with the Random Walk with Drift with
General Lead Time
I For ` > 1, Yˆt(`) = Yˆt(` − 1) + θ0.
I So by iterating backward, we see that Yˆt(`) = Yt + θ0` for
` ≥ 1.
I The forecast, as a function of the lead time `, is a straight
line with slope θ0.
I With nonstationary series, the presence of the constant
term has a major effect on the forecast, so it is important to
determine whether the constant term is truly needed
(we could check whether it is significantly different
from zero).
Forecast Error with the Random Walk with Drift
I For the random walk with drift model, the one-step-ahead forecast error is again et(1) = Yt+1 − Yˆt(1) = et+1.
I But the forecast error ` steps ahead can be shown to be et(`) = et+1 + et+2 + · · · + et+`.
I So var[et(`)] = `σ²e.
I In this nonstationary model, the variance of the forecast
error continues to increase without bound as the lead
time gets larger.
I This phenomenon will happen with all nonstationary
ARIMA models.
I On the other hand, with stationary models, the variance
of the forecast error increases as the lead time gets
larger, but there is a limit to the increase.
I And with deterministic trend models, the variance of
the forecast error is constant as the lead time gets
larger.
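
I A short Python sketch of the random walk with drift forecast and its error variance, with made-up values for the last observation, θ0, and σ²e:

# Sketch: random walk with drift. Y_hat_t(lead) = Y_t + theta0*lead, and
# var[e_t(lead)] = lead * sigma2_e. All parameter values are hypothetical.
y_t, theta0, sigma2_e = 120.0, 0.8, 2.5

for lead in (1, 5, 25):
    forecast = y_t + theta0 * lead          # straight-line forecast with slope theta0
    error_var = lead * sigma2_e             # grows without bound with the lead time
    print(lead, forecast, error_var)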
Forecasting with the ARMA(p, q) Model
I The general difference equation form for forecasts in
the ARMA(p, q) model is somewhat complicated:

Yˆt(`) = φ1Yˆt(` − 1) + φ2Yˆt(` − 2) + · · · + φpYˆt(` − p) + θ0
         − θ1et+`−1I[` ≤ 1] − θ2et+`−2I[` ≤ 2] − · · · − θqet+`−qI[` ≤ q],

where the indicator I[·] equals 1 if the condition in the brackets is true, and 0 otherwise.
I For example, with an ARMA(1, 1) model, Yˆt(1) = φYt + θ0 − θet, Yˆt(2) = φYˆt(1) + θ0, and in general, Yˆt(`) = φYˆt(` − 1) + θ0 for ` ≥ 2.
I With an ARMA(1, 1) model, an explicit general formula for a forecast ` time units ahead, in terms of µ = E(Yt), is

Yˆt(`) = µ + φ^`(Yt − µ) − φ^(`−1)θet for ` ≥ 1.
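
I As a check, the Python sketch below computes ARMA(1, 1) forecasts both from the difference equation form and from the closed-form expression; the values of φ, θ, µ, Yt, and et are assumed for illustration:

# Sketch: ARMA(1,1) forecasts, difference equation form vs. closed form.
# phi, theta, mu, y_t, and e_t are hypothetical values for illustration.
phi, theta, mu = 0.6, 0.3, 10.0
theta0 = mu * (1 - phi)                     # theta0 = mu*(1 - phi1 - ... - phip) with p = 1
y_t, e_t = 12.0, 0.5

def closed_form(lead):
    # Y_hat_t(lead) = mu + phi**lead*(Y_t - mu) - phi**(lead-1)*theta*e_t
    return mu + phi ** lead * (y_t - mu) - phi ** (lead - 1) * theta * e_t

forecast = phi * y_t + theta0 - theta * e_t         # Y_hat_t(1)
for lead in range(1, 6):
    print(lead, round(forecast, 6), round(closed_form(lead), 6))   # the two columns agree
    forecast = phi * forecast + theta0               # Y_hat_t(lead + 1) for lead + 1 >= 2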
More On Forecasting with the ARMA(p, q) Model

I For lead time ` = 1, 2, . . . , q, the noise terms appear in the formulas for the forecasts.
I For longer lead times (i.e., ` > q) the noise terms
disappear and only the autoregressive component (and
the constant term) of the model affects the forecast.
I For ` > q, the difference equation formula for the
ARMA(p, q) model reduces to

Yˆt(`) = φ1Yˆt(` − 1) + φ2Yˆt(` − 2) + · · · + φpYˆt(` − p) + θ0.


Forecasting with the ARMA(p, q) Model as Lead
Times Increase

I Since we have shown that θ0 = µ(1 − φ1 − φ2 − · · · − φp), this can be rewritten as

Yˆt(`) − µ = φ1[Yˆt(` − 1) − µ] + φ2[Yˆt(` − 2) − µ] + · · · + φp[Yˆt(` − p) − µ] for ` > q.

I For a stationary ARMA model, Yˆt(`) − µ will decay toward zero as the lead time ` increases, and thus for long lead times, the forecast will approximately equal the process mean µ.
I This is sensible because for stationary models, the
dependence grows weaker as the time between
observations increases, and µ would be the natural best
forecast to use if there were no dependence over time.
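
I The Python sketch below iterates the deviation form above for an assumed stationary case with p = 2 (made-up φ1, φ2, µ, and starting deviations) and shows the forecasts settling at µ:

# Sketch: iterate Y_hat_t(lead) - mu = phi1*[Y_hat_t(lead-1) - mu] + phi2*[Y_hat_t(lead-2) - mu].
# The coefficients, mu, and the starting deviations are hypothetical (and stationary).
phi1, phi2, mu = 0.9, -0.2, 5.0
dev = [1.8, 2.4]                            # assumed deviations Y_hat_t(lead) - mu for the last two leads

for lead in range(3, 41):
    dev.append(phi1 * dev[-1] + phi2 * dev[-2])

print(round(mu + dev[-1], 6))               # essentially mu = 5.0 at long lead times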
Forecasting with Nonstationary Models

I We have seen one example of forecasting with nonstationary models (the random walk with drift).
I For an ARIMA(1, 1, 1) model,

Yˆt(1) = (1 + φ)Yt − φYt−1 + θ0 − θet
Yˆt(2) = (1 + φ)Yˆt(1) − φYt + θ0
...
Yˆt(`) = (1 + φ)Yˆt(` − 1) − φYˆt(` − 2) + θ0

I These forecasts are unbiased, i.e., E[et(`)] = 0 for any ` ≥ 1.
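
I A brief Python sketch of these recursions, with hypothetical values for φ, θ, θ0, the last two observations, and et:

# Sketch: ARIMA(1,1,1) forecasts from the difference equation form.
# phi, theta, theta0, the observations, and e_t are all hypothetical values.
phi, theta, theta0 = 0.5, 0.3, 0.2
y_t, y_tm1, e_t = 103.0, 101.5, 0.4

forecasts = [(1 + phi) * y_t - phi * y_tm1 + theta0 - theta * e_t]   # Y_hat_t(1)
forecasts.append((1 + phi) * forecasts[0] - phi * y_t + theta0)      # Y_hat_t(2)
for lead in range(3, 11):                                            # Y_hat_t(lead), lead >= 3
    forecasts.append((1 + phi) * forecasts[-1] - phi * forecasts[-2] + theta0)

print([round(f, 3) for f in forecasts])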


Forecast Error Variance with Nonstationary Models
I But the variance of the forecast error is

var[et(`)] = σ²e(ψ0² + ψ1² + · · · + ψ`−1²) for ` ≥ 1.

I For a nonstationary series, these ψj weights do not decay to zero as j increases.
I So the forecast error variance increases without bound as
the lead time ` increases.
I Lesson: With nonstationary series, when we forecast far
into the future, we have a lot of uncertainty about the
forecast.
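
I To illustrate, the Python sketch below computes ψ weights for the ARIMA(1, 1, 1) model of the previous slides, written as an ARMA(2, 1) in Yt with AR coefficients 1 + φ and −φ, and the resulting forecast error variance; φ, θ, and σ²e are assumed values:

# Sketch: psi weights and forecast error variance for an ARIMA(1,1,1) model.
# The model is treated as an ARMA(2,1) in Y_t with AR coefficients (1 + phi, -phi).
# phi, theta, and sigma2_e are hypothetical values for illustration.
phi, theta, sigma2_e = 0.5, 0.3, 1.0

def psi_weights(n):
    # psi_0 = 1, psi_1 = (1 + phi) - theta, psi_j = (1 + phi)*psi_{j-1} - phi*psi_{j-2} for j >= 2
    psi = [1.0, (1 + phi) - theta]
    while len(psi) < n:
        psi.append((1 + phi) * psi[-1] - phi * psi[-2])
    return psi[:n]

psi = psi_weights(30)
for lead in (1, 5, 10, 25):
    error_var = sigma2_e * sum(p ** 2 for p in psi[:lead])
    print(lead, round(error_var, 3))        # increases without bound as the lead grows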
