ECON 332 Business Forecasting Methods Prof. Kirti K. Katkar


Chapter 4
Event Models

• Useful in modeling and forecasting time series that contain special
  events such as sales promotions and natural disasters like droughts,
  storms, fires, etc.
• For each event, its effect is estimated and the data adjusted so that
  events do not distort the underlying trend and seasonality
• This is essentially a further extension of the exponential smoothing
  models
• Besides smoothing the average, trend, and seasonality, here we smooth
  each event with its own smoothing constant
  – For this purpose each event is assigned its own index
  – Events in the past and in the future are identified
Event Models (Contd.)

• First the baseline forecast is prepared – with average, trend, and
  seasonality smoothing
• Then the event smoothing equations are used to calculate the historical
  lift/drop from the baseline, and forecasts are adjusted to reflect the
  effect of planned promotions/expected disasters
• Not easy to do by hand: computerized models are essential
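The event-adjustment idea above can be sketched in a few lines. This is a minimal illustration, not the actual algorithm used by commercial forecasting packages: it assumes the baseline forecasts already come from an ordinary smoothing model, and it smooths one lift multiplier per event index. The smoothing constant `phi` and all data are made up.

```python
def smooth_event_indices(actuals, baseline, events, phi=0.3):
    """Estimate a lift/drop multiplier per event index by exponential smoothing."""
    lift = {}  # event index -> smoothed multiplier (1.0 = no effect)
    for y, f, e in zip(actuals, baseline, events):
        if e == 0 or f == 0:
            continue                       # index 0 means "no event"
        ratio = y / f                      # observed lift/drop vs. baseline
        prev = lift.get(e, 1.0)
        lift[e] = phi * ratio + (1 - phi) * prev   # smooth each event separately
    return lift

def adjust_forecast(baseline, events, lift):
    """Scale the baseline by the smoothed multiplier of each planned event."""
    return [f * lift.get(e, 1.0) for f, e in zip(baseline, events)]

# Toy history: promotions (event index 1) lifted demand ~10% above baseline.
hist_actual   = [100, 112, 100, 109, 101, 111]
hist_baseline = [100, 100, 100, 100, 100, 100]
hist_events   = [0, 1, 0, 1, 0, 1]

lift = smooth_event_indices(hist_actual, hist_baseline, hist_events)
# A promotion is planned in the first future month but not the second.
future = adjust_forecast([100, 100], [1, 0], lift)
```

The forecast for the promotion month is lifted above baseline, while the no-event month keeps the baseline value.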
Example of Event Model: Monthly Historical Demand of the Condiment
(A. C. Nielsen report)

Condiment Promotion Indices

Index  Value  Event
I0     0      No promotion
I1     1      Free-standing inserts (FSIs)
I2     2      FSI/radio, TV, print campaign
I3     3      Load (trade promotion)
I4     4      Deload (month-after effect of load)
I5     5      Thematics (themed advertising campaign)
I6     6      Instant redeemable coupon (IRC)

Association of Event Indices to Past Data and Forecast Months
Value Added of the Event Model

• Lower RMSE than Winter's model
• Utility in planning promotions
  – Expected lift/drop estimates
  – Cost effectiveness of promotions
  – Better use of precious marketing $

                     Winter's Model   Event Model
Historical RMSE      16.813           12.897
Alpha – Average      0.11             0.05
Beta – Seasonality   1.00             1.00
Gamma – Trend        0.45             0.00
FSI – I1             NA               1.01
FSI+ – I2            NA               1.00
Load – I3            NA               1.06
Deload – I4          NA               1.03
Thematics – I5       NA               0.94
IRC – I6             NA               0.99
The Bivariate aka Simple Regression Model

• Population regression model:
  Y = β0 + β1X + ε
• Normally what's available is a sample, and the estimated sample
  regression model is
  Ŷ = b0 + b1X
• The deviation of the predicted/estimated value Ŷ from the actual value
  Y is called the residual e:
  e = Y − Ŷ = Y − b0 − b1X
• The slope b1 and the intercept b0 are estimated using the least
  squares method, where we minimize Σe² = Σ(Y − b0 − b1X)²
The Bivariate aka Simple Regression Model (Contd.)

• This is a simple optimization problem to estimate b0 and b1
• By taking partial derivatives of Σ(Y − b0 − b1X)² w.r.t. b0 and b1 and
  solving the two resulting equations simultaneously, we get

  b1 = (ΣXY − nX̄Ȳ) / (ΣX² − nX̄²)   and   b0 = Ȳ − b1X̄

• Typically we use a computerized model to estimate the bivariate model
  parameters
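As a check on the formulas above, here is a minimal sketch that computes b1 and b0 directly. In practice a statistics package would be used; the toy data are chosen so the true line is Y = 2 + 3X.

```python
def bivariate_ols(x, y):
    """Least-squares slope and intercept from the closed-form formulas."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    # b1 = (Sum XY - n*Xbar*Ybar) / (Sum X^2 - n*Xbar^2)
    b1 = (sum(xi * yi for xi, yi in zip(x, y)) - n * xbar * ybar) / \
         (sum(xi * xi for xi in x) - n * xbar * xbar)
    b0 = ybar - b1 * xbar
    return b0, b1

# Data lying exactly on y = 2 + 3x should recover b0 = 2, b1 = 3.
x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]
b0, b1 = bivariate_ols(x, y)
```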
The Bivariate aka Simple Regression Model
Major Assumptions

1. The relationship between X and Y is linear. This implies Y is related
   to X and not vice versa.
2. Variance of X is non-zero and finite for any sample; the values of Xt
   are not all the same. If the variance of X were zero, it would be
   impossible to estimate the impact of ΔX on Y.
3. The error term εt has zero expected value and constant variance for
   all N observations: E(εt) = 0 and Var(εt) = σ², where
   σ² = (1/N) Σ εt²
4. The random variables εt are uncorrelated, i.e.
   Cov(εt, εt−i) = 0 for all i.
5. The error term εt is normally distributed.
Using a Regression Model instead of Holt's Smoothing Model

• A regression model can be an alternative when forecasting a time
  series with a trend in it
• Here the independent variable is simply a time index, which assumes
  successive integer values as time passes
• Time does not "cause" the dependent variable to change, but it serves
  as a proxy for other factors
• One should always examine a time-series plot to see if a linear trend
  model is appropriate.
Value of Visualization: Plot of Data Series

• The bivariate regression model is the same for all data sets
• Only one set has a meaningful linear relationship
Disposable Personal Income (DPI) per Capita Forecast

DPI does not fall on a perfectly straight line. However, it does show a
nearly linear trend.

The linear trend follows the generally upward movement in DPI rather
well. The bivariate model is
  DPI = 17,498.40 + 61.687 × T
where T = 1 for 1990Q1 and 40 for 1999Q4.
5-Step Process for Statistical Evaluation of Regression Results

1. Do the results make sense in terms of the sign of the slope?
2. Is the slope term statistically positive or negative at the desired
   significance level? Use the t test.
3. How much of the variation in the dependent variable is explained?
   Use the R-squared value.
4. Does the model exhibit serial correlation? Use the Durbin-Watson
   statistic.
5. Do the residuals exhibit heteroscedasticity?
Statistical Evaluation of Slope Signs

• Does the sign (+ or −) make sense?
  – In the DPI case, it does.
• What if the sign does not make sense?
  – A clear indication that the model is wrong
  – The model could be incomplete and may require more than one
    independent variable to explain the underlying phenomenon, i.e. it
    is under-specified
  – Never use a model whose signs do not make sense
• If the sign does make sense, is the slope significantly positive or
  negative?
  – A slope close to zero would indicate that there is no linear
    relationship between X and Y; rather, Y and X are completely
    independent of each other
Is the slope term statistically positive or negative at the desired
significance level? The t test

• The null hypothesis should be set up so as to be rejected, to minimize
  making a Type I error
• If knowledge of the relationship indicates the slope should be + or −,
  a one-tailed test is appropriate.
  – For a positive slope (e.g. DPI) the hypotheses would be
    H0: β ≤ 0    H1: β > 0
  – For a negative slope the hypotheses would be
    H0: β ≥ 0    H1: β < 0
• If there is no a priori notion about the slope whatsoever, the
  hypotheses would be
    H0: β = 0    H1: β ≠ 0
• The appropriate test here is the t test, with
    tcalc = (b1 − 0) / SE(b1)
  where SE(b1) is the standard error of the slope estimate b1
How far from zero does the slope need to be? Application of Hypothesis
Testing

• Standard Error of Estimate (SEE)

  s = √( Σ (Yi − ŷi)² / (n − 2) )

  where the Yi are actual values of Y, the ŷi are fitted values, and n
  is the number of data points.
• At an α level of significance, the critical value of t is looked up
  from the t table with n − 2 degrees of freedom: tn−2, α for one-tailed
  tests and tn−2, α/2 for two-tailed tests.
• For a positive slope, if tcalc > tn−2, α we can reject H0.
• For a negative slope, if tcalc < −tn−2, α we can reject H0.
• For the two-tailed case, if |tcalc| > tn−2, α/2 (i.e. tcalc falls
  outside the confidence interval around zero) we can reject H0.
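The SEE and tcalc computations can be sketched as below. This is an illustrative implementation for the simple-regression case (n − 2 degrees of freedom); the toy data and the one-tailed critical value t4, 0.05 ≈ 2.132 are for demonstration only.

```python
import math

def slope_t_stat(x, y):
    """Fit Y = b0 + b1*X by least squares; return (b1, t_calc) for testing the slope."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    see = math.sqrt(sse / (n - 2))          # standard error of estimate
    se_b1 = see / math.sqrt(sxx)            # standard error of the slope
    return b1, b1 / se_b1                   # t_calc = (b1 - 0) / SE(b1)

# Toy upward-trending series: the slope should be clearly positive.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
b1, t_calc = slope_t_stat(x, y)
# One-tailed test at alpha = 0.05 with n - 2 = 4 d.f.: critical t ~ 2.132
reject_h0 = t_calc > 2.132
```

For this data tcalc is far above the critical value, so H0: β ≤ 0 is rejected.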
How much of the variation in the dependent variable is explained? The
R-squared value

• R-squared is the coefficient of determination and ranges between 0
  and 1
• R-squared can be statistically tested to see if it is different from
  zero using the F statistic. We will look at this next week when we
  cover multiple regression models
• For now, the closer R-squared is to 1, the better the regression
  model: it signifies that a great deal of the variability in the
  dependent variable is explained by the model.
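R-squared can be computed from the residual and total sums of squares; a minimal sketch with made-up data:

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    ybar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
    sst = sum((yi - ybar) ** 2 for yi in y)                 # total variation
    return 1 - sse / sst

perfect = r_squared([1, 2, 3], [1, 2, 3])   # perfect fit -> 1.0
useless = r_squared([1, 2, 3], [2, 2, 2])   # mean-only prediction -> 0.0
```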
Does the model exhibit serial correlation?

• One of the major assumptions of the ordinary least squares regression
  model is that the error terms are independent and normally
  distributed. This implies that there is no pattern among the errors.
• If serial correlation between errors exists, the standard errors are
  estimated to be smaller than they really are. It does not bias the
  estimates of the slope and intercept.
• This means that the calculated t would be overstated and that we may
  reject a null hypothesis that should not be rejected.
Examples of Positive and Negative Serial Correlation

• Negative serial correlation: errors alternate in sign
• Positive serial correlation: errors follow the previous error's sign
Statistical Test for Existence of Serial Correlation:
Durbin-Watson (DW) Statistic

• The DW statistic is calculated as

  DW = Σt=2..n (et − et−1)² / Σt=1..n et²

  where et is the residual for time period t and et−1 is the residual
  for the previous time period t − 1
• The DW statistic will always range between 0 and 4
• A value close to 2, say between 1.75 and 2.25, indicates that there is
  no serial correlation
• A value closer to 0 would indicate positive serial correlation
• A value closer to 4 would indicate negative serial correlation
• For precise evaluation we use the DW table
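The DW formula translates directly into code. A quick sketch with made-up residual patterns at the two extremes:

```python
def durbin_watson(e):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2; values near 2 mean no serial correlation."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(et ** 2 for et in e)
    return num / den

alternating = durbin_watson([1, -1, 1, -1, 1, -1])   # signs flip -> DW near 4
trending    = durbin_watson([1, 1, 1, -1, -1, -1])   # signs persist -> DW near 0
```

The alternating residuals land above 2.25 (negative serial correlation) and the persistent-sign residuals below 1.75 (positive serial correlation), matching the rules of thumb above.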
Schematic for Evaluating Serial Correlation

The DW Table

The DW Table (Contd.)
DPI Model – Audit Trail – ANOVA Table

Schema to reduce Serial Correlation

If the basic hypothesis is correct (the trend is significantly different
from zero and R² is close to one) but serial correlation is present:
• Take first differences
  – ΔY = b0 + b1(ΔX), where ΔY = Yt − Yt−1 and ΔX = Xt − Xt−1
• Use multiple regression
  – The model is under-specified and requires additional causal
    independent variables
• Introduce the square of the independent variable
  – Yt = b0 + b1Xt + b2Xt²
• Introduce the lagged value of the dependent variable as an additional
  independent variable
  – Yt = b0 + b1Xt + b2Yt−1
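The first-differences remedy can be sketched as follows (toy data; after differencing one would re-estimate the regression on ΔY and ΔX and re-check the DW statistic):

```python
def first_difference(series):
    """Period-to-period changes: series[t] - series[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

y = [100, 104, 109, 115, 122]   # trending level series
x = [10, 11, 12, 13, 14]
dy = first_difference(y)        # regress these changes ...
dx = first_difference(x)        # ... on these changes, instead of the levels
```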
Heteroscedasticity

• The model is said to be homoscedastic when the error term εt has zero
  expected value and constant variance for all observations
• If the variance is not constant, the model is called heteroscedastic.
  In this case the standard error of the regression coefficient is
  underestimated, causing the calculated t statistic to be larger than
  it should be and leading us to incorrectly conclude that a variable is
  statistically significant, i.e. to reject the null hypothesis that the
  slope is zero.
• This can be evaluated by looking at a scatter-plot of the residuals.
Model Evaluation for Heteroscedasticity by Scatter-plots of Residuals

• Residuals pattern indicating homoscedasticity
• Residuals pattern indicating heteroscedasticity
Schema to reduce Heteroscedasticity

If the basic hypothesis is correct (the trend is significantly different
from zero, R² is close to one, and serial correlation is non-existent):
• Use the logarithm of the dependent variable in the estimation of the
  regression model
  – log Y = b0 + b1 log X
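A sketch of the log remedy, using made-up data in which Y is an exact multiple of X, so the log-log relationship is exactly linear with slope 1:

```python
import math

# Taking logs turns multiplicative (percentage) errors, whose spread grows
# with the level of Y, into roughly constant-variance additive errors.
x = [1, 2, 4, 8]
y = [3, 6, 12, 24]                      # y = 3x, so log y = log 3 + 1 * log x
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]
# Fit log Y = b0 + b1 log X with any regression routine; here the slope
# between consecutive transformed points is already constant:
slopes = [(log_y[i + 1] - log_y[i]) / (log_x[i + 1] - log_x[i])
          for i in range(len(x) - 1)]
```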
Causal Regression Models

• Using regression models only to estimate a linear trend does not take
  full advantage of regression
• Regression models are extremely useful in developing models that
  describe causal relationships
• Examples include several real-life phenomena:
  – Retail sales could depend on advertising expenses, consumer
    disposable income, mortgage interest rates, etc.
  – GDP could depend on consumer purchases, business purchases,
    exports, etc.
Cross-Sectional Forecasting

• All data pertain to one time period rather than a time series
• An example would be a like-stores sales forecast:
  – Sales = b0 + b1 (Population in area served)
  given the original data for several stores – their population served
  and corresponding sales
Retail Sales (RS) in $ Million: Clear Positive Trend and a Consistent
Seasonal Pattern

Months shown are end-months of quarters.

Retail Sales (RS) in $ Million

Retail Sales Forecast Based on Disposable Personal Income (DPI) per
Capita

Table 4-5 (continued)
