Time Series (ARIMA) : by Hrishikesh Khaladkar Department of Mathematics Fergusson College, Pune

Time Series (ARIMA)
by
Hrishikesh Khaladkar
Department of Mathematics
Fergusson College,Pune
April 11, 2018
Hrishikesh Khaladkar,Fergusson College Time Series (ARIMA) April 11, 2018 1 / 34

Time Series
What is a Time Series
Any process that varies over time is a time series process, provided
that it is observed at a fixed interval.
Time series data is a sequence of records collected from a process
at equally spaced intervals in time.
The aim of time series analysis is to understand the historical
behaviour of the data, uncover hidden patterns in it, and finally model
those patterns so that they can be used for forecasting.

Time Series
Real-Life Example
Suppose Mr. Ajay starts his job in 2010 with a starting salary of
Rs 5,000 per month. He is appraised every year, and his salary reaches
Rs 20,000 per month in 2014. His salary, recorded once a year, can be
considered a time series, and each year's salary is clearly a function
of the previous year's salary (here the function is the appraisal rating).
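As a rough illustration of "this year's salary is a function of last year's salary", the sketch below assumes (hypothetically) that the same appraisal factor r is applied every year, so that salary_t = r * salary_{t-1}. Only the 2010 and 2014 figures come from the example; the intermediate values are implied by that assumption and are not given in the slides.

```python
# Hypothetical assumption: one constant appraisal factor r per year,
# so salary_t = r * salary_{t-1}.
start_salary = 5000.0    # Rs per month in 2010 (from the example)
end_salary = 20000.0     # Rs per month in 2014 (from the example)
years = 2014 - 2010      # four appraisals

# Solve end = start * r**years for r.
r = (end_salary / start_salary) ** (1 / years)

# Rebuild the yearly path from the recursion.
path = [start_salary]
for _ in range(years):
    path.append(r * path[-1])

for year, value in zip(range(2010, 2015), path):
    print(year, round(value))
```

Under this assumption r is the square root of 2, that is, a raise of roughly 41 percent per year.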

Time Series
Main Phases of Time Series Analysis
In the descriptive phase, you try to understand the nature of the time
series: you identify the trend, seasonal, cyclic, and irregular
variations in the data.
In the modelling (pattern discovery) phase, you model the inherent
patterns of the time series data. There are several methods for finding
these patterns.
Once the patterns are identified, forecasting is relatively easy.

Time Series
Examples: Laptop sales (Time Series Plot)
Time series data of monthly laptop sales (in thousands)

Time Series
Examples: US GDP (Source: World Bank) Time Series Plot
Time series data of GDP for the United States (in billions)

Time Series
Regression vs Time Series
Normally in predictive modelling, you predict the dependent variable
Y from a set of X variables.
In time series, you have to predict Y using the previous values of Y.
Regression: $Y = f(X_1, X_2, \ldots, X_P)$
Time series: $Y_t = g(Y_{t-1}, Y_{t-2}, \ldots, Y_1)$
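To make the contrast concrete, the sketch below turns a single series into (previous values, current value) pairs, which is the input a function g would work with. The helper name `make_lagged` and the sample values are illustrative assumptions, not part of the slides.

```python
def make_lagged(series, n_lags):
    """Pair each y[t] with its previous values [y[t-1], ..., y[t-n_lags]]."""
    features, targets = [], []
    for t in range(n_lags, len(series)):
        features.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return features, targets

y = [10, 12, 15, 19, 24]            # illustrative values only
X, target = make_lagged(y, n_lags=2)
for row, value in zip(X, target):
    print(row, "->", value)          # previous two values -> current value
```

In regression the columns of X would be other variables; here they are simply earlier observations of Y itself.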

Time Series
Components of a Time Series
Trend: The series could be steadily increasing or decreasing, or first
increasing for a considerable time period and then decreasing. The
trend is identified and then removed from the time series in the ARIMA
forecasting process.
Seasonality: A repeating pattern with a fixed period.
For example: sales in festive seasons. In the US, sales of candies
peak every October and sales of chocolates peak every December,
because Halloween and Christmas fall in those months. The time series
should be de-seasonalized in the ARIMA forecasting process.
Random Variation (Irregular Component): This is the unexplained
variation in the time series, which is totally random: erratic
movements that are not predictable because they do not follow a
pattern.
Example: an earthquake, a famine, a big economic scandal, etc. (This is
what makes time series analysis tough.)

Time Series
Measure the trend

There are several ways to measure the trend.
Graphical Method
Method of Semi Averages
Method of Curve Fitting using the Principles of Least Squares
Method of Moving Averages
We discuss the Method of Moving Averages.
It consists of obtaining a series of moving averages (arithmetic means)
of successive overlapping groups or sections of a time series.
The averaging process smooths out the ups and downs in the data.
The moving average is characterized by a constant known as the
period or extent of the moving average.
For example, if $y_1, y_2, y_3, \ldots$ is a time series and $m$ is the period, then
1st MA $= \frac{y_1 + y_2 + \cdots + y_m}{m}$,
2nd MA $= \frac{y_2 + y_3 + \cdots + y_{m+1}}{m}$,
3rd MA $= \frac{y_3 + y_4 + \cdots + y_{m+2}}{m}$,
and so on.
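The moving-average definition above can be sketched in a few lines. The function name `moving_averages` and the sample values are assumptions made for illustration.

```python
def moving_averages(y, m):
    """Return the moving averages of period m for the sequence y."""
    if m < 1 or m > len(y):
        raise ValueError("period m must be between 1 and len(y)")
    # The i-th average covers the overlapping group y[i], ..., y[i + m - 1].
    return [sum(y[i:i + m]) / m for i in range(len(y) - m + 1)]

sales = [4, 6, 5, 8, 9, 7]           # illustrative values only
ma3 = moving_averages(sales, 3)      # 1st MA = (4 + 6 + 5) / 3, and so on
print(ma3)
```

A series of n values gives n - m + 1 moving averages, so a larger period produces a smoother but shorter trend line.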
Time Series
How to identify seasonality

There are several graphical ways to identify seasonality:
Run charts
Multiple Box Plots
Seasonal plots
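Multiple box plots and seasonal plots both rest on the same idea: group the observations by their position in the cycle and compare the groups. The sketch below does only the grouping and averaging, without any plotting; the function name, the period of 4, and the quarterly values are assumptions made for illustration.

```python
from statistics import mean

def seasonal_means(y, period):
    """Average the observations that share the same position in the cycle."""
    groups = [[] for _ in range(period)]
    for t, value in enumerate(y):
        groups[t % period].append(value)
    return [mean(g) for g in groups]

# Illustrative quarterly values: the fourth quarter is always the highest.
y = [10, 12, 11, 20, 11, 13, 12, 22, 12, 14, 13, 24]
quarter_means = seasonal_means(y, period=4)
print(quarter_means)
```

If one position in the cycle has a clearly different average from the others, that is a hint of seasonality with the chosen period.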

Time Series
Stationary and Non-Stationary Time Series

A time series is said to be stationary if there is no systematic change
in mean (no trend), no systematic change in variance, and
if strictly periodic variations have been removed.
In a stationary time series process, the mean and variance each hover
around a single value.
If the mean or variance of the series changes systematically as the
series grows, the series is considered nonstationary. A nonconstant
mean or nonconstant variance is a sign of a nonstationary time series
process. Processes whose values grow without bound are also called
explosive.
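A quick informal check of this definition is to split a series in two and compare the mean and variance of each half. The sketch below is only a rough diagnostic under that assumption, not a formal test; the two series are constructed for illustration.

```python
from statistics import mean, pvariance

def half_summary(y):
    """Return (mean, variance) for the first and second half of y."""
    half = len(y) // 2
    first, second = y[:half], y[half:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

steady = [(-1) ** t for t in range(40)]                # oscillates around 0
trending = [0.5 * t + (-1) ** t for t in range(40)]    # same oscillation plus a trend

print("steady:  ", half_summary(steady))
print("trending:", half_summary(trending))
```

The steady series gives the same mean and variance in both halves, while the trending series shows a clear shift in mean, which is the sign of nonstationarity described above.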

Time Series
What is a Stationary Time Series

Some steadiness is apparent in the series plot: it does not drift too
far from its mean value line. In this case, the variance and the
mean also show steadiness, or a state of equilibrium.
The graph below shows a stationary time series.

Time Series
Testing Stationarity Using a DF Test
To test the stationarity of a series, you use the Dickey-Fuller (DF) test.
The null and alternative hypotheses of a DF test are as follows:
H0: The series is not stationary.
H1: The series is stationary.
You perform a DF test and take note of the p-value.
Considering the p-value:
If the p-value is less than 5 percent (0.05), you reject the null
hypothesis, that is, you reject the hypothesis that the
series is not stationary. So a series is concluded to be stationary when
the p-value of a DF test is less than 0.05.
On the other hand, if the p-value is more than 0.05, you fail to
reject the null hypothesis, and the series is treated as not
stationary.
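The sketch below computes only the DF test statistic, using the regression $\Delta Y_t = \alpha + \gamma Y_{t-1} + e_t$ and the t-ratio of $\gamma$. It is a minimal illustration under stated assumptions: the intercept-only form of the test, Gaussian simulated data, and no p-value (the statistic does not follow the usual t distribution, so it is compared with tabulated critical values, roughly -2.9 at the 5 percent level for samples of this size). In practice a library routine such as `adfuller` in statsmodels reports the p-value directly.

```python
import math
import random

def df_statistic(y):
    """t-ratio of gamma in dY_t = alpha + gamma * Y_{t-1} + e_t (simple OLS)."""
    x = y[:-1]                                     # Y_{t-1}
    d = [b - a for a, b in zip(y[:-1], y[1:])]     # dY_t = Y_t - Y_{t-1}
    n = len(x)
    x_bar, d_bar = sum(x) / n, sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    gamma = sxd / sxx
    alpha = d_bar - gamma * x_bar
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    se_gamma = math.sqrt(rss / (n - 2) / sxx)
    return gamma / se_gamma

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]   # stationary by construction
walk, level = [], 0.0
for e in noise:                                    # cumulative sum: a random walk
    level += e
    walk.append(level)

print("white noise:", round(df_statistic(noise), 2))
print("random walk:", round(df_statistic(walk), 2))
```

A strongly negative statistic is evidence against H0 (not stationary); the white noise series gives a far more negative value than the random walk built from it.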

Time Series
Achieving Stationarity
If a series is not stationary, you can difference it to make it
stationary.
If $Y_t$ is the original series, then compute $\Delta Y_t = Y_t - Y_{t-1}$ and work with this
new series $\Delta Y_t$. This is called differencing at lag 1.
Note that some series may not be stationary even after the first
differencing. You then need to go for a second differencing. If
even the second differencing does not work, you may have to try
some other transformation, such as a logarithm.
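A minimal sketch of first and second differencing follows. The function name `difference` and the quadratic sample series are assumptions chosen so that the effect is easy to see.

```python
def difference(y, order=1):
    """Apply the first difference Y_t - Y_{t-1} `order` times."""
    for _ in range(order):
        y = [b - a for a, b in zip(y[:-1], y[1:])]
    return y

y = [t * t for t in range(6)]     # 0, 1, 4, 9, 16, 25: a quadratic trend
d1 = difference(y, 1)             # first differences still trend upward
d2 = difference(y, 2)             # second differences are constant
print(d1)
print(d2)
```

Each round of differencing shortens the series by one observation, which is one reason not to difference more often than needed.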
Time Series
White noise
A white noise process is one with a constant mean of zero, a constant
variance, and no correlation between its values at different times.
A white noise series exhibits very erratic, jumpy, unpredictable
behavior. Since the values are uncorrelated, previous values do not help us
to forecast future values.
White noise series themselves are quite uninteresting from a
forecasting standpoint (they are not linearly forecastable).
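The three properties can be checked on a simulated series. The sketch below assumes Gaussian white noise with unit variance, which is only one possible choice; the definition above does not require normality.

```python
import random

random.seed(1)
n = 5000
eps = [random.gauss(0, 1) for _ in range(n)]   # Gaussian white noise (an assumption)

mean = sum(eps) / n
var = sum((e - mean) ** 2 for e in eps) / n
# Sample correlation between each value and the one before it.
lag1 = sum((eps[t] - mean) * (eps[t - 1] - mean) for t in range(1, n)) / (n * var)

print(round(mean, 3), round(var, 3), round(lag1, 3))
```

The sample mean and the lag-1 correlation should both come out close to zero, and the variance close to one, up to sampling error.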

Time Series
ARIMA (Box Jenkins Approach)
ARIMA : Auto Regressive Integrated Moving Average

Using the Box-Jenkins approach, the model is developed in steps by
first understanding the AR, MA, and ARMA processes.
Understanding ARIMA is similar to understanding eyesight
measurement using a Snellen chart.

Time Series
AR Process
Consider the time series given by $Y_t, Y_{t-1}, \ldots, Y_{t-k}$.
The auto-regressive process is denoted by AR(p), where p is the order
of the auto-regressive process.
In an AR process, the current value of the series is a function of its
previous values.
p determines how many previous values the current value of the
series depends on.
In an AR(1) process, $Y_t = a_1 Y_{t-1} + \epsilon_t$, where
$a_1$: quantified impact of $Y_{t-1}$ on $Y_t$.
$\epsilon_t$: white noise.
Similarly, in an AR(2) process, $Y_t = a_1 Y_{t-1} + a_2 Y_{t-2} + \epsilon_t$, where
$\epsilon_t$: white noise.
Time Series
AR Process
In general, for an AR(p) process, $Y_t = a_1 Y_{t-1} + a_2 Y_{t-2} + \cdots + a_p Y_{t-p} + \epsilon_t$,
where
$a_1, a_2, \ldots, a_p$: quantified impact of $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ on $Y_t$.
$\epsilon_t$: white noise.
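The AR(p) recursion can be simulated directly. The sketch below assumes Gaussian white noise and zero start-up values, and then recovers $a_1$ for an AR(1) series with a simple least-squares ratio; the function name and the coefficient 0.7 are illustrative choices.

```python
import random

def simulate_ar(coeffs, n, seed=0):
    """Simulate Y_t = a_1 Y_{t-1} + ... + a_p Y_{t-p} + eps_t with Gaussian noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    y = [0.0] * p                       # zero start-up values (an assumption)
    for _ in range(n):
        past = y[-1:-p - 1:-1]          # [Y_{t-1}, ..., Y_{t-p}]
        y.append(sum(a * v for a, v in zip(coeffs, past)) + rng.gauss(0, 1))
    return y[p:]

y = simulate_ar([0.7], n=2000)
# Least-squares estimate of a_1 from regressing Y_t on Y_{t-1} (no intercept).
a1_hat = sum(cur * prev for cur, prev in zip(y[1:], y[:-1])) / sum(v * v for v in y[:-1])
print(round(a1_hat, 2))
```

The estimate should land near the coefficient used in the simulation, which is the sense in which $a_1$ quantifies the impact of $Y_{t-1}$ on $Y_t$.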

Time Series
MA Process
A moving average (MA) process is a time series process in which the
series stays close to its mean, and the current deviation from the mean
depends on the current and previous white noise terms (errors or shocks).
As with an AR(p) process, the order q tells us how many previous error
values have an effect on the current value.
In an MA(1) process, the deviation at time t is a function of the errors at
t and t-1.
If the series is an MA(1) process, then $Y_t - \mu = \epsilon_t + b_1 \epsilon_{t-1}$, where
$b_1$: quantified impact of $\epsilon_{t-1}$ on $Y_t$.
$\mu$: the mean of the overall series.
$Y_t - \mu$: the deviation at time t.

Time Series
MA Process
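If the process is MA(2), then $Y_t - \mu = \epsilon_t + b_1 \epsilon_{t-1} + b_2 \epsilon_{t-2}$, where
$b_1, b_2$: quantified impact of $\epsilon_{t-1}, \epsilon_{t-2}$ on $Y_t$.
In general, if the process is MA(q), then
$Y_t - \mu = \epsilon_t + b_1 \epsilon_{t-1} + b_2 \epsilon_{t-2} + \cdots + b_q \epsilon_{t-q}$, where
$b_1, b_2, \ldots, b_q$: quantified impact of $\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q}$ on $Y_t$.
If the mean is zero, then
$Y_t = \epsilon_t + b_1 \epsilon_{t-1} + b_2 \epsilon_{t-2} + \cdots + b_q \epsilon_{t-q}$.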
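An MA(q) series can be generated by drawing the shocks first and then combining them. The sketch below assumes Gaussian shocks with unit variance; the function name, the coefficient 0.6, and the mean of 10 are illustrative.

```python
import random

def simulate_ma(coeffs, n, mu=0.0, seed=0):
    """Simulate Y_t = mu + eps_t + b_1 eps_{t-1} + ... + b_q eps_{t-q}."""
    rng = random.Random(seed)
    q = len(coeffs)
    eps = [rng.gauss(0, 1) for _ in range(n + q)]    # q extra shocks for the start
    y = []
    for t in range(q, n + q):
        shock = eps[t] + sum(b * eps[t - k] for k, b in enumerate(coeffs, start=1))
        y.append(mu + shock)
    return y

y = simulate_ma([0.6], n=5000, mu=10.0)
print(round(sum(y) / len(y), 2))     # the series hovers around mu
```

Because each value uses only the last q shocks, a shock stops influencing the series after q periods, unlike in an AR process.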

Time Series
ARMA Process
If a process shows the properties of both an auto-regressive process and a
moving average process, then it is called an ARMA process. In an ARMA
time series process, the current value of the series depends on both its previous
values and previous error terms.
Loosely, you can think of an ARMA process as a series with both long-term
trend and short-term seasonality.
ARMA(p,q) is the general notation for an ARMA process, where p is the
order of the AR process and q is the order of the MA process. In an
ARMA(1,1) series, $Y_t = a_1 Y_{t-1} + \epsilon_t + b_1 \epsilon_{t-1}$, where
$a_1$: quantified impact of $Y_{t-1}$ on $Y_t$.
$\epsilon_t$: the random error at time t.
$b_1$: quantified impact of $\epsilon_{t-1}$ on $Y_t$.

Time Series
ARMA Process
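In an ARMA(2,1) series, $Y_t = a_1 Y_{t-1} + a_2 Y_{t-2} + \epsilon_t + b_1 \epsilon_{t-1}$, where
$a_1, a_2$: quantified impact of $Y_{t-1}, Y_{t-2}$ on $Y_t$.
$b_1$: quantified impact of $\epsilon_{t-1}$ on $Y_t$.
In general, for an ARMA(p,q) series,
$Y_t = a_1 Y_{t-1} + a_2 Y_{t-2} + \cdots + a_p Y_{t-p} + \epsilon_t + b_1 \epsilon_{t-1} + b_2 \epsilon_{t-2} + \cdots + b_q \epsilon_{t-q}$, where
$a_1, a_2, \ldots, a_p$: quantified impact of $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p}$ on $Y_t$.
$b_1, \ldots, b_q$: quantified impact of $\epsilon_{t-1}, \epsilon_{t-2}, \ldots, \epsilon_{t-q}$ on $Y_t$.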
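The general ARMA(p,q) equation combines the two recursions. The sketch below assumes Gaussian shocks, zero start-up values, and a burn-in period that is discarded so the start-up values have little influence; all names and parameter values are illustrative.

```python
import random

def simulate_arma(ar, ma, n, seed=0, burn_in=200):
    """Simulate Y_t = sum(a_k Y_{t-k}) + eps_t + sum(b_k eps_{t-k})."""
    rng = random.Random(seed)
    p, q = len(ar), len(ma)
    y = [0.0] * p                      # zero start-up values (an assumption)
    eps = [0.0] * q
    for _ in range(n + burn_in):
        e = rng.gauss(0, 1)
        ar_part = sum(a * y[-k] for k, a in enumerate(ar, start=1))
        ma_part = sum(b * eps[-k] for k, b in enumerate(ma, start=1))
        y.append(ar_part + e + ma_part)
        eps.append(e)
    return y[-n:]                      # drop start-up and burn-in values

y = simulate_arma(ar=[0.5], ma=[0.4], n=5000)   # an ARMA(1,1) series
print(len(y), round(sum(y) / len(y), 2))
```

Passing an empty list for `ma` gives a pure AR series, and an empty list for `ar` gives a pure MA series, which mirrors how ARMA(p,q) contains both as special cases.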

Time Series
Comparison with Vision Test
To test eyesight and prescribe eyeglasses, doctors perform a small
test. In the past, instead of using special equipment, doctors had a
box full of lenses (of different powers).
The patient was asked to sit in a chair and was given an empty frame
to wear. The doctor would put differently powered
lenses, one by one, in the frame and ask the patient to read from the
Snellen chart. Some patients, for example, read the top seven rows
and struggled with the lower ones. The doctor then removed the first
lens and put in another.
After many such iterations, the doctor would settle on the exact
lenses to be used in the patient's glasses. Some patients were diagnosed
as nearsighted and some as farsighted.

Time Series
Analogy of Vision Test and Box Jenkins Approach
Vision Test
Assume the patient is literate.
Based on some tests, identify nearsightedness or farsightedness and
get a rough estimate of eyesight.
Estimate the exact eyesight by trying various lenses.
Use the test results to give the prescription.
Box Jenkins Approach
Assume that the time series is stationary; if not, make it stationary.
Based on ACF and PACF plots, identify whether the model is an AR,
MA, or ARMA process.
Estimate the parameters $a_1, a_2, \ldots, a_p$ and $b_1, b_2, \ldots, b_q$.
Use the final model for forecasting.

Time Series
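ARIMA Processes
An ARIMA(p,d,q) series is one whose d-th difference follows an ARMA(p,q) process.
The ARIMA(1,1,0) series is written as follows:
$\Delta Y_t = a_1 \Delta Y_{t-1} + \epsilon_t$.
The ARIMA(2,1,0) series is written as follows:
$\Delta Y_t = a_1 \Delta Y_{t-1} + a_2 \Delta Y_{t-2} + \epsilon_t$.
The ARIMA(1,1,1) series is written as follows:
$\Delta Y_t = a_1 \Delta Y_{t-1} + \epsilon_t + b_1 \epsilon_{t-1}$.
The rule of thumb is that you subtract the previous lag values while
differencing; you add the previous lag values while integrating.
Next: how to identify p and q for ARIMA models.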
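The "subtract while differencing, add while integrating" rule can be sketched as follows. The sample values and the coefficient `a1 = 0.5` are hypothetical; in a real analysis the coefficient would be estimated from the data.

```python
from itertools import accumulate

def difference(y):
    """First difference: subtract the previous value from each value."""
    return [b - a for a, b in zip(y[:-1], y[1:])]

def integrate(first_value, diffs):
    """Rebuild the levels by adding each difference to the previous level."""
    return list(accumulate(diffs, initial=first_value))

y = [100, 103, 101, 106, 110]         # illustrative values only
dy = difference(y)                    # differenced series
rebuilt = integrate(y[0], dy)         # integrating undoes the differencing
print(dy, rebuilt)

# Hypothetical ARIMA(1,1,0) one-step forecast with an assumed coefficient a1.
a1 = 0.5
next_diff = a1 * dy[-1]               # forecast of the next difference (noise has mean zero)
next_level = y[-1] + next_diff        # integrate: add it to the last observed level
print(next_level)
```

The model is fitted and forecast on the differenced series, and the forecast is then added back to the last observed level to return to the original scale.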

Time Series
Auto Correlation Function
Auto correlation is the correlation between $Y_t$ and $Y_{t-1}$. Generally
you find the correlation between two variables, but here you are finding
the correlation between Y and previous values of Y. The auto correlation
function is a function of all such correlations at different lags. The ACF is
denoted by $\rho_h$, where h indicates the lag.
ACF(0): correlation at lag 0, $\rho_0$ = correlation between $Y_t$ and $Y_t$ = 1.
ACF(1): correlation at lag 1, $\rho_1$ = correlation between $Y_t$ and $Y_{t-1}$.
The graphs created using auto correlation values are called auto
correlation plots.
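A sample version of $\rho_h$ can be computed directly from the definition. The sketch below uses the common estimator that divides every lag's sum by the lag-0 sum of squares; the function name and sample values are assumptions.

```python
def acf(y, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag of the sequence y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)       # lag-0 sum of squares
    return [sum(dev[t] * dev[t - h] for t in range(h, n)) / c0
            for h in range(max_lag + 1)]

y = [2, 4, 6, 8, 10, 8, 6, 4, 2, 4, 6, 8]   # illustrative values only
rho = acf(y, max_lag=3)
print([round(r, 3) for r in rho])
```

Plotting these values against the lag h gives the auto correlation plot.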

Time Series
Auto Correlation plots
On the x-axis you have the lag values, while the y-axis has the
autocorrelation values. The graph might vary based on the type of the
series.

Time Series
Partial Auto Correlation Function
The partial auto correlation function gives the partial correlations between Y
and its previous values, calculated at different lags. In the context of
time series analysis, the partial auto correlation is found by regressing the
current value of Y on its old values. The PACF is denoted by $\theta_h$, where h
indicates the lag.
PACF(0): partial correlation at lag 0, $\theta_0$ = correlation between $Y_t$ and $Y_t$ = 1.
PACF(1): partial correlation at lag 1, $\theta_1$ = regression coefficient of $Y_{t-1}$
when $Y_t$ is regressed on $Y_{t-1}$.
PACF(2): partial correlation at lag 2, $\theta_2$ = regression coefficient of
$Y_{t-2}$ when $Y_t$ is regressed on $Y_{t-1}$ and $Y_{t-2}$.
PACF(3): partial correlation at lag 3, $\theta_3$ = regression coefficient of
$Y_{t-3}$ when $Y_t$ is regressed on $Y_{t-1}$, $Y_{t-2}$ and $Y_{t-3}$.
The graphs created using partial auto correlation values are called partial
auto correlation plots.
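Rather than running each regression separately, the sketch below uses the Durbin-Levinson recursion, which obtains the same last-lag coefficients from the autocorrelations (the Yule-Walker form of those regressions). This is a swapped-in technique for illustration, not the procedure described in the slides; the input is the theoretical ACF of an AR(1) process with $a_1 = 0.6$, namely $\rho_h = 0.6^h$.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations rho[0..K], with rho[0] == 1."""
    max_lag = len(rho) - 1
    pacf = [1.0]
    prev = []                          # regression coefficients for lag k - 1
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den             # coefficient of Y_{t-k}: the PACF at lag k
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

rho_ar1 = [0.6 ** h for h in range(5)]     # theoretical ACF of AR(1) with a_1 = 0.6
theta = pacf_from_acf(rho_ar1)
print([round(v, 6) for v in theta])
```

For this AR(1) input the PACF equals 0.6 at lag 1 and is zero (up to rounding) at higher lags, even though the ACF itself decays only gradually.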

Time Series
Partial Auto Correlation plots
On the x-axis you have the lag values, while the y-axis has the partial
auto correlation values. As is the case with ACF, a PACF graph might vary
based on the type of series.

Time Series
Rules of thumb to identify an AR(p) process

The rule is as follows:
ACF: Slowly tails off or diminishes to zero. It either decays in one
direction or decays in a sinusoidal (sine wave) fashion.
PACF: Cuts off. The cutoff lag indicates the order of the AR process.

Time Series
Rules of thumb to identify an MA(q) process

ACF: Cuts off. The cutoff lag indicates the order of the MA process.
PACF: Slowly tails off or diminishes to zero. Either reduces in one
direction or reduces in a sinusoidal or sine wave pattern.

Time Series
Rules of Thumb for Identifying the ARMA Process

ACF: Dampens to zero.
PACF: Dampens to zero.
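The "tails off" versus "cuts off" behaviour in the rules of thumb above can be seen from the theoretical ACFs of the two simplest cases. The formulas used are the standard ones for a stationary AR(1), $\rho_h = a_1^h$, and for an MA(1), $\rho_1 = b_1 / (1 + b_1^2)$ with $\rho_h = 0$ for $h > 1$; the function names and the value 0.5 are illustrative.

```python
def acf_ar1(a1, max_lag):
    """Theoretical ACF of a stationary AR(1): rho_h = a1 ** h."""
    return [a1 ** h for h in range(max_lag + 1)]

def acf_ma1(b1, max_lag):
    """Theoretical ACF of an MA(1): rho_1 = b1 / (1 + b1^2), zero beyond lag 1."""
    rho = [0.0] * (max_lag + 1)
    rho[0] = 1.0
    if max_lag >= 1:
        rho[1] = b1 / (1 + b1 ** 2)
    return rho

print("AR(1):", acf_ar1(0.5, 4))   # tails off gradually
print("MA(1):", acf_ma1(0.5, 4))   # cuts off after lag 1
```

With real data the sample ACF never cuts off exactly, so in practice "cuts off" means the values beyond the cutoff lag are small enough to be treated as zero.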

Time Series
Checking for Model Accuracy
Ideally you would like the error to be zero, or at least less than 5 percent. The
following are some measures of accuracy.
Let $Y_i$ denote the actual value and
$\hat{Y}_i$ denote the expected (forecast) value.
1 Mean absolute deviation: MAD $= \frac{1}{n} \sum_{i=1}^{n} |Y_i - \hat{Y}_i|$
2 Mean absolute percent error: MAPE $= \frac{100}{n} \sum_{i=1}^{n} \frac{|Y_i - \hat{Y}_i|}{Y_i}$
3 Mean square error: MSE $= \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
4 Another related measure is root mean square error, which is
RMSE $= \sqrt{\mathrm{MSE}}$.
It is generally good practice to keep 5 to 10 percent of the sample data
for validation purposes.
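The four measures translate directly into code. The sketch below follows the formulas as written, so MAPE divides by $Y_i$ and assumes the actual values are positive; the function name and the sample numbers are illustrative.

```python
import math

def accuracy(actual, predicted):
    """Return MAD, MAPE, MSE and RMSE for paired actual and forecast values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mad = sum(abs(e) for e in errors) / n
    mape = 100 / n * sum(abs(e) / a for e, a in zip(errors, actual))  # needs a > 0
    mse = sum(e * e for e in errors) / n
    return {"MAD": mad, "MAPE": mape, "MSE": mse, "RMSE": math.sqrt(mse)}

actual = [100, 200, 400]          # illustrative held-out values
predicted = [110, 190, 400]       # illustrative forecasts
scores = accuracy(actual, predicted)
print(scores)
```

These measures are most informative when computed on the held-out validation portion of the data rather than on the data used to fit the model.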

Time Series
Suggestions with the Box Jenkins Approach
This is a logical end to our time series analysis and forecasting process
using the Box-Jenkins ARIMA approach. Every analyst wants to accurately
forecast future values by building the best model for the available
historical data. The Box-Jenkins approach is a systematic way to work
toward this. The following are some suggestions for building time series
models:
Have sufficient historical data, at least 30 data points. Make sure you
do not include too much history: only the historical values that will
have an impact on future forecasts should be considered.
Do not forecast too far into the future. With one year of data,
forecasting the next two years is not a good idea. A forecast horizon of 10
percent or fewer of the available data points is recommended.
Remove outliers before building the model.
