1.1 Basic Time Series Decomposition


Time Series Analysis

Basics of Time Series Analysis

Nicoleta Serban, Ph.D.


Associate Professor
Stewart School of Industrial and Systems Engineering

Basic Statistical Concepts


About this lesson
Review of Basic Statistical Concepts
• Moments of a Distribution: Fully characterizing the
distribution
• Estimation Methods: Method of Moments versus
Maximum Likelihood Estimation
• Basic Estimators: Approach and Sampling Distribution
• Multivariate data: Joint, marginal, and conditional
distribution
• Statistical Inference: Confidence intervals and
Hypothesis Testing
Moments of a Distribution
Moments of a random variable X with density f:

l-th moment: μ'_l = E[X^l] = ∫_{−∞}^{∞} x^l f(x) dx

l-th central moment: μ_l = E[(X − μ)^l] = ∫_{−∞}^{∞} (x − μ)^l f(x) dx

Examples of moments:
• Expectation: E[X]
• Variance: E[(X − μ)²]
• Skewness: S = E[((X − μ)/σ)³]
• Kurtosis: K = E[((X − μ)/σ)⁴]
Statistical Estimation
Parametric Statistics: Observe x_1, …, x_n (realizations) from a
set of random variables X_1, …, X_n ~ f(x; θ), where f(x; θ) is a
density function with parameter θ which is assumed unknown.
• Estimation: evaluate the unknown parameter θ using the
set of observations x_1, …, x_n and the distribution of
the random variables X_1, …, X_n from which we observe.
• Approaches:
1. Method of Moments (MOM)
2. Maximum Likelihood Estimation (MLE)
Examples of Classic Estimators
Given the data {X_1, …, X_n}, estimate the mean, variance,
skewness and kurtosis with

Sample mean: μ̂ = X̄ = (1/n) Σ_{i=1}^n X_i

Sample variance: σ̂² = S² = (1/(n−1)) Σ_{i=1}^n (X_i − μ̂)²

Sample skewness: Ŝ(X_1, …, X_n) = (1/((n−1) σ̂³)) Σ_{i=1}^n (X_i − μ̂)³

Sample kurtosis: K̂(X_1, …, X_n) = (1/((n−1) σ̂⁴)) Σ_{i=1}^n (X_i − μ̂)⁴
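These estimators can be sketched in base R; note the slides' skewness and kurtosis use an (n−1) denominator, which differs slightly from some textbook versions (a minimal sketch, using made-up data):

```r
# Sample moment estimators as defined above (note the (n-1) denominators)
sample.moments <- function(x) {
  n  <- length(x)
  mu <- mean(x)                               # sample mean
  s2 <- sum((x - mu)^2) / (n - 1)             # sample variance S^2
  s  <- sqrt(s2)
  skew <- sum((x - mu)^3) / ((n - 1) * s^3)   # sample skewness
  kurt <- sum((x - mu)^4) / ((n - 1) * s^4)   # sample kurtosis
  c(mean = mu, var = s2, skew = skew, kurt = kurt)
}

m <- sample.moments(c(1, 2, 3, 4, 10))
```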
Sampling Distributions of Estimators
Sampling Distributions: Under the normality assumption

X_1, …, X_n ~ N(μ, σ²)

X̄ ~ N(μ, σ²/n)

(n − 1)S²/σ² ~ χ²_{n−1}
Properties of the Estimators
1. Unbiasedness
2. Consistency
Method of Moments Estimation
MOM Approach: Equate the distribution moments to
the observed moments

E[X^p] = (1/n) Σ_{i=1}^n X_i^p

Example: X_1, …, X_n ~ N(μ, σ²)

p = 1: E[X] = μ = (1/n) Σ_{i=1}^n X_i ⇨ μ̂ = (1/n) Σ_{i=1}^n X_i = X̄

p = 2: E[X²] = σ² + μ² = (1/n) Σ_{i=1}^n X_i² ⇨ σ̂² = (1/n) Σ_{i=1}^n (X_i − X̄)²
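A quick numeric check (hypothetical data) that the MOM variance estimator divides by n, so it equals (n−1)/n times the usual sample variance S²:

```r
# Method-of-moments estimates for a sample (hypothetical data)
x <- c(2.1, 3.4, 1.9, 4.0, 2.6)
n <- length(x)
mu.mom   <- mean(x)               # from the p = 1 equation
sig2.mom <- mean((x - mu.mom)^2)  # from the p = 2 equation: divide by n
# Relation to the usual (n-1)-denominator sample variance:
# sig2.mom == (n - 1) / n * var(x)
```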
Maximum Likelihood Estimation
MLE Approach: Maximize the likelihood function of θ given the data.

Joint distribution of X_1, …, X_n ~ f(x; θ) under the independence assumption:

f(x_1, …, x_n; θ) = f(x_1; θ) ⋯ f(x_n; θ)

Likelihood function:

L(θ; x_1, …, x_n) = f(x_1, …, x_n; θ)

MLE: θ̂ = argmax_{θ∈Θ} L(θ; x_1, …, x_n)
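The maximization can be done numerically; a minimal sketch for simulated normal data, using optim over (μ, log σ) so the standard deviation stays positive (the closed-form MLEs are the sample mean and the divide-by-n standard deviation):

```r
# Numerical MLE for N(mu, sigma^2) via optim (simulated data)
set.seed(1)
x <- rnorm(200, mean = 5, sd = 2)

# Negative log-likelihood; parameterize by log(sigma) to keep sigma > 0
negloglik <- function(par) {
  -sum(dnorm(x, mean = par[1], sd = exp(par[2]), log = TRUE))
}
fit <- optim(c(0, 0), negloglik)

mu.hat  <- fit$par[1]
sig.hat <- exp(fit$par[2])
# Closed-form MLEs for comparison: mean(x) and sqrt(mean((x - mean(x))^2))
```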

Multivariate Distribution
Joint, Marginal and Conditional distributions:

Joint distribution = Conditional × Marginal: f(x, y) = f(x|y) f(y) = f(y|x) f(x)

For three variables X, Y, Z:

f(x, y, z) = f(z|x, y) f(x, y) = f(z|x, y) f(y|x) f(x)

For n variables X_1, …, X_n:

f(x_1, …, x_n) = f(x_n | x_{n−1}, …, x_1) f(x_{n−1} | x_{n−2}, …, x_1) ⋯ f(x_2 | x_1) f(x_1)

Example: if X_i | X_{i−1}, …, X_1 is normal with mean μ_i and variance σ_i², and if f(x_1) is ignored,

f(x_n, x_{n−1}, …, x_2 | x_1) = Π_{i=2}^n (1/(√(2π) σ_i)) exp(−(x_i − μ_i)²/(2σ_i²))
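This factorization is exactly how time series likelihoods are built. A sketch under an illustrative assumption that each conditional density is normal with mean φx_{i−1} (an AR(1)-type model; φ, σ, and the data are made up):

```r
# Joint log-density via the chain-rule factorization, conditioning on x_1.
# Illustrative assumption: X_i | past ~ N(phi * x_{i-1}, sigma^2) (AR(1)-type);
# phi, sigma, and the data are made up.
phi <- 0.5; sigma <- 1
x <- c(0.3, 0.1, -0.4, 0.8)

# Sum of conditional log-densities f(x_i | x_{i-1}), i = 2..n
logf <- sum(dnorm(x[-1], mean = phi * x[-length(x)], sd = sigma, log = TRUE))
```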
Statistical Inference
Hypothesis Testing
Parameter-based hypothesis testing: H_0: θ = θ_0 vs H_a: θ ≠ θ_0

Distribution-based hypothesis testing:

H_0: X_1, …, X_n ~ N(μ, σ²) vs H_a: non-normal distribution

P-value = a measure of the plausibility of H_0

Significance Level = the probability of a type I error (rejecting H_0 when it is true)

Confidence Interval: a (1 − α) confidence interval for θ is (θ̂ − c, θ̂ + c), with c chosen such that

Pr(θ ∈ (θ̂ − c, θ̂ + c)) = 1 − α
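As a concrete sketch with made-up data, the margin c for a normal mean is the t-quantile times the standard error, which is exactly what t.test() computes:

```r
# 95% confidence interval for a normal mean, by hand and via t.test()
x <- c(61.2, 62.5, 60.8, 63.1, 61.9, 62.2)   # made-up data
n <- length(x)
c.margin <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)  # the margin c
ci.hand  <- c(mean(x) - c.margin, mean(x) + c.margin)
ci.t     <- t.test(x, conf.level = 0.95)$conf.int
```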
Formal Definition
A stochastic process is a collection of random variables {X_t, t ∈ T},
defined on a probability space (Ω, F, P).

A time series is a stochastic process in which T is a set of time
points, usually

T = {0, ±1, ±2, …}, {1, 2, 3, …}, [0, ∞), or (−∞, ∞)
Note: The term “time series” is also used to refer to the realization
of such a process (observed time series).
Example: Time Series
• Monthly sales of Australian red wine
• Monthly accidental deaths in the U.S.
• Daily Average Temperature from La Harpe station in Hancock
County, Illinois
• Daily stock price of IBM stock
• US monthly interest rates
• US yearly GDP
• 1-minute intraday S&P500 return
Time Series: Characteristics
• Trend: long-term increase or decrease in the data over time
• Seasonality: influenced by seasonal factors (e.g. quarter of the year,
month, or day of the week)
• Periodicity: exact repetition in regular pattern (seasonal series often
called periodic, although they do not exactly repeat themselves)
• Cyclical trend: data exhibit rises and falls that are not of a fixed
period
• Heteroskedasticity: varying variance with time
• Dependence: positive (successive observations are similar) or
negative (successive observations are dissimilar)
Example: GDP
Example: Daily IBM Stock Price
Example: S&P500 Intraday
Is Time Series Analysis Necessary?
Time Series ⇒ Dependence
• Data redundancy: the number of degrees of freedom is smaller
than T (T is the number of observations)
• Data sampling: {x_t, t = 1, …, T} is concentrated on a small part
of the probability space
Ignoring dependence leads to
• Inefficient estimates of regression parameters
• Poor predictions
• Standard errors unrealistically small (too narrow CI ⇒
improper inferences)
Time Series: Objectives
Description
• Plot the data and obtain simple descriptive measures of the
main properties of the series.

Explanation
• Find a model to describe the time dependence in data.

Forecasting
Given a finite sample from the series (observations), forecast
the next value or the next several values.

Control/Tuning
• After forecasting, adjust various control/tune parameters.
Time Series Analysis: Approaches
Time domain approach

• Assume that correlation between adjacent points in time can
be explained through dependence of the current value on past
values.

Frequency domain approach

• Characteristics of interest relate to periodic (systematic)
sinusoidal variations in the data, often caused by biological,
physical, or environmental phenomena.
Time Series: Basics
Data: x_t, where t indexes time, e.g. minute, hour, day, month
Model: X_t = m_t + s_t + R_t
• m_t is a trend component;
• s_t is a seasonality component with known periodicity d
(s_{t+d} = s_t) such that Σ_{j=1}^d s_j = 0;
• R_t is a stationary component, i.e. its probability distribution
does not change when shifted in time.
Approach: m_t and s_t are first estimated and subtracted from X_t,
leaving the stationary process R_t to be modeled using time
series modeling approaches.
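One way to see this decomposition in practice is base R's decompose(), shown here on the built-in monthly co2 series (chosen only for illustration); it centers the estimated seasonal figure so it sums to roughly zero over one period:

```r
# Classical additive decomposition x_t = m_t + s_t + R_t on a built-in series
dec <- decompose(co2)   # monthly CO2 concentrations, period d = 12
# decompose() centers the seasonal figure, so it sums to ~0 over one period
seasonal.sum <- sum(dec$figure)
```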
Time Series: Trend Estimation
Elimination of Trend (no Seasonality)
1. Estimate trend and remove it, or
2. Difference the data to remove the trend directly.

Estimation Methods
1. Moving Average
2. Parametric Regression (Linear, Quadratic, etc.)
3. Non-Parametric Regression
Trend: Moving Average
Estimate the trend m_t with a moving window of width d:

If the width is d = 2q, use

m̂_t = (0.5 x_{t−q} + x_{t−q+1} + x_{t−q+2} + … + x_{t+q−1} + 0.5 x_{t+q}) / d

If the width is d = 2q + 1, use

m̂_t = (1/d) Σ_{j=−q}^{q} x_{t+j}

The width selection reflects the bias-variance trade-off:
• If the width is large, then the trend is smooth (i.e. low variability)
• If the width is small, then the trend is not smooth (i.e. low bias)
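Both formulas are weighted averages and can be sketched with stats::filter; the weights below encode the odd-width (d = 3) and even-width (d = 4, half weight at the ends) cases with made-up data:

```r
# Two-sided moving averages via stats::filter (made-up data)
x <- c(5, 7, 6, 8, 9, 11, 10, 12)

# Odd width d = 2q + 1 = 3: equal weights 1/d
ma.odd <- stats::filter(x, rep(1/3, 3), sides = 2)

# Even width d = 2q = 4: half weight on the two end points, divided by d
w <- c(0.5, 1, 1, 1, 0.5) / 4
ma.even <- stats::filter(x, w, sides = 2)
```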
Trend: Parametric Regression
Estimate the trend m_t assuming a polynomial in t:

m_t = β_0 + β_1 t + β_2 t² + ⋯ + β_p t^p

• Commonly use a small-order polynomial (p = 1 or 2)
• Estimation approach: fit a linear regression model where
the predicting variables are (t, t², …, t^p)
• Which terms to keep? Use model selection to select
among the predicting variables. Caution: the predicting
variables are strongly correlated.
Trend: Non-Parametric Regression
Estimate the trend m_t with t in {t_1, t_2, …, t_n}:
1. Kernel Regression
m̂_t = Σ_{i=1}^n w_i(t) x_{t_i}, where w_i(t) is a weight function
depending on a kernel function.
2. Local Polynomial Regression
• An extension of the kernel regression and the polynomial
regression: fit a local polynomial within a width of a data point
3. Other Approaches
• Splines regression
• Wavelets
• Orthogonal basis function decomposition
Trend: Non-Parametric Regression
Which one to choose?
→ Local polynomial regression is preferred over kernel
regression since it overcomes boundary problems and its
performance is not dependent on the design of the time points
→ Other methods are to be selected depending on the level of
smoothness of the function to be estimated
→ For estimating the trend in time series, local polynomial or
splines regression will perform well in most cases
Data Example:
Temperature in Atlanta, Georgia
Data: Average monthly temperature records from 1879 through 2016.

• Available from iWeatherNet.com
• The Weather Bureau (now the National Weather Service) began
keeping weather records for Atlanta on October 1, 1878, giving
138 years of records.
• Temperatures are provided in degrees Fahrenheit.

Do we find an increasing trend in temperature in Atlanta?


Data Example in R
## Time Series Plot
data = read.table("AvTempAtlanta.txt",header=T)
names(data)
[1] "Year" "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep"
[11] "Oct" "Nov" "Dec" "Annual"
temp = as.vector(t(data[,-c(1,14)]))
temp = ts(temp,start=1879,frequency=12)
ts.plot(temp, ylab="Temperature")
Data Example in R
Trend: Moving Average
## Create equally spaced time points for fitting trends
time.pts = c(1:length(temp))
time.pts = c(time.pts - min(time.pts))/max(time.pts)
## Fit a moving average
mav.fit = ksmooth(time.pts, temp, kernel = "box")
temp.fit.mav = ts(mav.fit$y,start=1879,frequency=12)
## Is there a trend?
ts.plot(temp,ylab="Temperature")
lines(temp.fit.mav,lwd=2,col="purple")
abline(temp.fit.mav[1],0,lwd=2,col="blue")
Trend: Moving Average
Trend: Parametric Regression
## Fit a parametric quadratic polynomial
x1 = time.pts
x2 = time.pts^2
lm.fit = lm(temp~x1+x2)
summary(lm.fit)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 61.4247 0.9841 62.420 <2e-16 ***
x1 -1.5723 4.5481 -0.346 0.730
x2 3.4937 4.4062 0.793 0.428
## Is there a trend?
temp.fit.lm = ts(fitted(lm.fit),start=1879,frequency=12)
ts.plot(temp,ylab="Temperature")
lines(temp.fit.lm,lwd=2,col="green")
abline(temp.fit.lm[1],0,lwd=2,col="blue")
Trend: Parametric Regression
Trend: Non-Parametric Regression
## Local Polynomial Trend Estimation
loc.fit = loess(temp~time.pts)
temp.fit.loc = ts(fitted(loc.fit),start=1879,frequency=12)
## Splines Trend Estimation
library(mgcv)
gam.fit = gam(temp~s(time.pts))
temp.fit.gam = ts(fitted(gam.fit),start=1879,frequency=12)
## Is there a trend?
ts.plot(temp,ylab="Temperature")
lines(temp.fit.loc,lwd=2,col="brown")
lines(temp.fit.gam,lwd=2,col="red")
abline(temp.fit.loc[1],0,lwd=2,col="blue")
Trend: Non-Parametric Regression
Trend: Comparison
## Compare all estimated trends
all.val = c(temp.fit.mav,temp.fit.lm,temp.fit.gam,temp.fit.loc)
ylim= c(min(all.val),max(all.val))
ts.plot(temp.fit.lm,lwd=2,col="green",ylim=ylim,ylab="Temperature")
lines(temp.fit.mav,lwd=2,col="purple")
lines(temp.fit.gam,lwd=2,col="red")
lines(temp.fit.loc,lwd=2,col="brown")
legend(x=1900,y=64,legend=c("MAV","LM","GAM","LOESS"),lty = 1,
col=c("purple","green","red","brown"))
Trend: Comparison