Statistical Modeling Dependent Variable Independent Variables


Linear regression is a powerful mathematical tool that allows you to take results from your business statistics and project them into
the future. You can take data such as sales figures, staff levels or costs and apply linear regression to estimate future values. A
typical application is to forecast sales for the next few months based on the monthly sales figures from the past year. The forecasts
are reliable only as long as past trends continue.

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables. It
includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent
variable and one or more independent variables (or 'predictors'). More specifically, regression analysis helps one understand how
the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied,
while the other independent variables are held fixed.

Identify Your Variables

The linear regression technique works with any two variables. But in forecasting, one of your variables is time and the other is the
variable for which you need the forecast. For example, for a sales forecast, assume that at the end of month one your sales were at
12,000 units. At the end of months two, three and four, sales were at 14,000, 15,000 and 17,000. The following example uses linear
regression to forecast sales for months five and six.

Calculate the Sums and Averages

Define the month number as x and your monthly sales in thousands as y. In the example, your data points (x,y) are (1,12), (2,14),
(3,15) and (4,17). The first step is to total all the x values and all the y values and find the average of each. For the example, define
your total sales in thousands as Yt, which equals 58. Define the total of the month numbers as Xt, equal to 10. The average sales, called
Ya, is 58/4 = 14.5. The average of the month numbers, called Xa, is 10/4 = 2.5.

Calculate the Squares, Products and Totals

Calculate the squares of each x value, the total of the squares of x, the products of each x and y value pair and the total of the
products. For the example, the squares of the x values are 1, 4, 9, and 16, and their sum is 30. Call this total X2t. The products of
each x and y value pair are 1 x 12, 2 x 14, 3 x 15 and 4 x 17. The results are 12, 28, 45, 68 and the sum is 153. Call this value XYt.

Perform a Linear Regression

To find b and c in the equation y = bx + c, calculate Sxx, which is the sum of the squares of x, X2t = 30, minus the square of the sum
of the x values, Xt squared = 100, divided by the number of data points, which is four. Sxx = 30 - 100/4 = 5. Calculate Sxy, which is the
sum of the products of x and y, XYt = 153, minus the sum of the x values, Xt = 10, times the sum of the y values, Yt = 58, divided by
the number of data points, which is four. Sxy = 153 - 580/4 = 8.

Find the Equation and Calculate Your Forecast

The constant b = Sxy/Sxx = 1.6. The constant c = the average of y, Ya = 14.5, minus b times the average of x, Xa = 2.5; c = 14.5 - 2.5 x
1.6 = 10.5. The equation for your sales forecast in thousands is y = 1.6x + 10.5. The sales forecast for month 5 is 1.6 times 5 plus 10.5
= 18.5 and the sales forecast for month 6 is 1.6 times 6 plus 10.5 = 20.1. Your sales under present trends will be 18,500 and 20,100 in
months five and six.
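
The arithmetic in the steps above can be reproduced with a short script. This is a minimal sketch that uses only the four data points from the example; the variable names and the `forecast` helper are illustrative choices, not part of the original method description.

```python
# Least-squares line y = b*x + c for the four monthly sales figures in the text.
xs = [1, 2, 3, 4]        # month number
ys = [12, 14, 15, 17]    # sales in thousands of units

n = len(xs)
sum_x = sum(xs)                                # Xt = 10
sum_y = sum(ys)                                # Yt = 58
sum_x2 = sum(x * x for x in xs)                # X2t = 30
sum_xy = sum(x * y for x, y in zip(xs, ys))    # XYt = 153

sxx = sum_x2 - sum_x ** 2 / n                  # 30 - 100/4 = 5
sxy = sum_xy - sum_x * sum_y / n               # 153 - 580/4 = 8

b = sxy / sxx                                  # slope
c = sum_y / n - b * sum_x / n                  # intercept = Ya - b * Xa


def forecast(month):
    """Projected sales (thousands of units) for a given month number."""
    return b * month + c


for month in (5, 6):
    print(f"month {month}: {forecast(month):.1f} thousand units")
```

The script yields the same slope (1.6), intercept (10.5) and forecasts (18.5 and 20.1 thousand units) as the hand calculation.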

What is the 'Delphi Method'


The Delphi method is a forecasting method based on the results of questionnaires sent to a panel of experts. Several rounds of
questionnaires are sent out, and the anonymous responses are aggregated and shared with the group after each round. The experts
are allowed to adjust their answers in subsequent rounds. Since multiple rounds of questions are asked and the panel is told what
the group thinks as a whole, the Delphi method seeks to reach the correct response through consensus.

BREAKING DOWN 'Delphi Method'


The Delphi method was originally conceived in the 1950s by Olaf Helmer and Norman Dalkey of the Rand Corporation. The name
refers to the Oracle of Delphi, a priestess at a temple of Apollo in ancient Greece known for her prophecies. The Delphi method
allows experts to work towards a mutual agreement by conducting a circulating series of questionnaires and releasing related
feedback to further the discussion with each subsequent round. The experts' responses shift as rounds are completed based on the
information brought forth by other experts participating in the analysis.

Using the Delphi Method

First, the group facilitator selects a group of experts based on the topic being examined. Once all participants are confirmed, each
member of the group is sent a questionnaire with the instructions to comment on each topic based on their personal opinion,
experience or previous research. The questionnaires are returned to the facilitator who groups the comments and prepares copies
of the information. A copy of the compiled comments is sent to each participant, along with the opportunity to comment further.

At the end of each comment session, all questionnaires are returned to the facilitator who decides if another round is necessary or if
the results are ready for publishing. The questionnaire rounds can be repeated as many times as necessary to achieve a general
sense of consensus.

A time series is a sequence of data points, typically measured at successive points in time spaced at uniform intervals.
Examples include the daily closing value of the Dow Jones Industrial Average and the annual flow volume of the Nile River
at Aswan. Time series are used in statistics, signal processing, pattern recognition, econometrics, mathematical finance,
weather forecasting, earthquake prediction, electroencephalography, control engineering, astronomy and communications
engineering. Time series analysis and forecasting therefore apply across many domains and businesses.

Time series forecasting methods produce forecasts based solely on historical values. They are widely used in business
situations where forecasts of a year or less are required. These methods are particularly suited to sales, marketing,
finance and production planning, and they have the advantage of relative simplicity, but certain factors need to be
considered:
• Time series methods are better suited to short-term forecasts (i.e., less than a year).
• Time series forecasting relies on sufficient past data being available and on that data being of high quality and truly representative.
• Time series methods are best suited to relatively stable situations. Where substantial fluctuations are common and underlying
conditions are subject to extreme change, time series methods may give relatively poor results.
• Averaging methods
    • If a time series is generated by a constant process subject to random error, then the mean is a useful
      statistic and can be used as a forecast for the next period.
    • Averaging methods are suitable for stationary time series data, where the series is in equilibrium around
      a constant value (the underlying mean) with a constant variance over time.
• Exponential smoothing methods
    • The simplest exponential smoothing method is single exponential smoothing (SES), where only one
      parameter needs to be estimated.
    • Holt's method makes use of two different parameters and allows forecasting for series with trend.
    • Holt-Winters' method involves three smoothing parameters to smooth the data, the trend, and the
      seasonal index.

Moving average

• A moving average of order k, MA(k), is the average of k consecutive observations. The forecast for period t+1 is the
mean of the k most recent observations:

$$F_{t+1} = \hat{y}_{t+1} = \frac{y_t + y_{t-1} + y_{t-2} + \cdots + y_{t-k+1}}{k} = \frac{1}{k} \sum_{i=t-k+1}^{t} y_i$$

• k is the number of terms in the moving average.
• A large k is desirable when there are wide, infrequent fluctuations in the series.
• A small k is most desirable when there are sudden shifts in the level of the series.
• For quarterly data, a four-quarter moving average, MA(4), eliminates or averages out seasonal effects.

The moving average model does not handle trend or seasonality very well, although it can do better than the overall mean.
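
The claim that a four-quarter moving average averages out seasonal effects can be illustrated with a small sketch. The quarterly series below is hypothetical (a constant level of 100 plus a repeating seasonal pattern that sums to zero), and `trailing_mean` is an illustrative helper name, not something defined in the text.

```python
def trailing_mean(series, k):
    """Mean of the last k observations, i.e. the MA(k) forecast for the next period."""
    if k < 1 or k > len(series):
        raise ValueError("k must be between 1 and len(series)")
    return sum(series[-k:]) / k


# Hypothetical data: level 100 plus a seasonal pattern repeating every four quarters.
seasonal = [10, -5, -15, 10]
quarterly = [100 + seasonal[i % 4] for i in range(12)]

# Rolling forecasts: each entry uses only the observations available at that time.
ma4 = [trailing_mean(quarterly[:t], 4) for t in range(4, len(quarterly) + 1)]
ma2 = [trailing_mean(quarterly[:t], 2) for t in range(2, len(quarterly) + 1)]

print("MA(4):", ma4)   # every window spans one full seasonal cycle
print("MA(2):", ma2)   # shorter window still follows the seasonal swings
```

Because every four-quarter window contains each season exactly once, the MA(4) values stay at the underlying level, while the MA(2) values keep fluctuating with the season.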

• The weekly sales figures (in millions of dollars) presented in the weekly sales table further below are used by a major
department store to determine the need for temporary sales personnel.

Exponential Smoothing Methods

• This method provides an exponentially weighted moving average of all previously observed values.
• Appropriate for data with no predictable upward or downward trend.
• The aim is to estimate the current level and use it as a forecast of future value.

Simple Exponential Smoothing Method

• Formally, the exponential smoothing equation is

$$F_{t+1} = \alpha y_t + (1 - \alpha) F_t$$

  where
    • $F_{t+1}$ = forecast for the next period.
    • $\alpha$ = smoothing constant.
    • $y_t$ = observed value of the series in period t.
    • $F_t$ = old forecast for period t.
• The forecast $F_{t+1}$ is based on weighting the most recent observation $y_t$ with a weight $\alpha$ and weighting the
most recent forecast $F_t$ with a weight of $1 - \alpha$.

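• The implication of exponential smoothing can be better seen if the previous equation is expanded by replacing $F_t$
with its components as follows:

$$\begin{aligned}
F_{t+1} &= \alpha y_t + (1 - \alpha) F_t \\
        &= \alpha y_t + (1 - \alpha)\,[\alpha y_{t-1} + (1 - \alpha) F_{t-1}] \\
        &= \alpha y_t + \alpha (1 - \alpha) y_{t-1} + (1 - \alpha)^2 F_{t-1}
\end{aligned}$$

• If this substitution process is repeated by replacing $F_{t-1}$ by its components, $F_{t-2}$ by its components, and so
on, the result is:

$$F_{t+1} = \alpha y_t + \alpha (1 - \alpha) y_{t-1} + \alpha (1 - \alpha)^2 y_{t-2} + \alpha (1 - \alpha)^3 y_{t-3} + \cdots + \alpha (1 - \alpha)^{t-1} y_1 + (1 - \alpha)^t F_1$$

  The last term carries the initial forecast $F_1$ and shrinks towards zero as t grows.
• Therefore, $F_{t+1}$ is a weighted moving average of all past observations, with weights that decrease exponentially with age.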

• The following table shows the weights assigned to past observations for α = 0.2, 0.4, 0.6, 0.8, 0.9.

Weight assigned to   α = 0.2         α = 0.4         α = 0.6         α = 0.8         α = 0.9
y_t                  0.2             0.4             0.6             0.8             0.9
y_{t-1}              0.2(1-0.2)      0.4(1-0.4)      0.6(1-0.6)      0.8(1-0.8)      0.9(1-0.9)
y_{t-2}              0.2(1-0.2)^2    0.4(1-0.4)^2    0.6(1-0.6)^2    0.8(1-0.8)^2    0.9(1-0.9)^2
y_{t-3}              0.2(1-0.2)^3    0.4(1-0.4)^3    0.6(1-0.6)^3    0.8(1-0.8)^3    0.9(1-0.9)^3
y_{t-4}              0.2(1-0.2)^4    0.4(1-0.4)^4    0.6(1-0.6)^4    0.8(1-0.8)^4    0.9(1-0.9)^4
y_{t-5}              0.2(1-0.2)^5    0.4(1-0.4)^5    0.6(1-0.6)^5    0.8(1-0.8)^5    0.9(1-0.9)^5
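
To see how quickly the weights in the table decay, they can be evaluated numerically. The sketch below assumes the weight on the observation j periods back is α(1-α)^j, as in the expansion above; `ses_weights` is an illustrative helper name.

```python
def ses_weights(alpha, n_lags):
    """Weights placed on y_t, y_{t-1}, ..., y_{t-n_lags+1} by simple exponential smoothing."""
    return [alpha * (1 - alpha) ** j for j in range(n_lags)]


for alpha in (0.2, 0.4, 0.6, 0.8, 0.9):
    weights = ses_weights(alpha, 6)
    rounded = [round(w, 4) for w in weights]
    print(f"alpha={alpha}: {rounded}  sum of first six = {sum(weights):.4f}")
```

With a large α almost all of the weight falls on the most recent one or two observations, while a small α spreads the weight over many past periods.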

• The exponential smoothing equation, rewritten in the following form, elucidates the role of the weighting factor $\alpha$:

$$F_{t+1} = F_t + \alpha (y_t - F_t)$$

• The exponential smoothing forecast is the old forecast plus an adjustment for the error that occurred in the last
forecast.
• The value of the smoothing constant $\alpha$ must lie strictly between 0 and 1; it cannot equal 0 or 1.
• If stable predictions with smoothed random variation are desired, then a small value of $\alpha$ is appropriate.
• If a rapid response to a real change in the pattern of observations is desired, a large value of $\alpha$ is appropriate.
• To estimate $\alpha$, forecasts are computed for $\alpha$ equal to 0.1, 0.2, 0.3, …, 0.9 and the sum of squared forecast errors is
computed for each.
• The value of $\alpha$ with the smallest sum of squared errors (equivalently, the smallest RMSE) is chosen for use in producing the future forecasts.

• To start the algorithm, we need $F_1$ because $F_2 = \alpha y_1 + (1 - \alpha) F_1$.
• Since $F_1$ is not known, we can either
    • set the first estimate equal to the first observation, or
    • use the average of the first five or six observations for the initial smoothed value.
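
The procedure described above (initialise F1, apply the smoothing recursion, and pick α from 0.1 to 0.9 by the smallest sum of squared errors) might be sketched as follows. The data are the 25 weekly sales figures from the department store table in this document; the function names `ses_forecasts` and `sum_squared_error` are illustrative, and initialising F1 to the first observation is just one of the two options listed.

```python
# Weekly sales (millions of dollars), weeks 1-25, from the department store table.
sales = [5.3, 4.4, 5.4, 5.8, 5.6, 4.8, 5.6, 5.6, 5.4, 6.5, 5.1, 5.8, 5.0,
         6.2, 5.6, 6.7, 5.2, 5.5, 5.8, 5.1, 5.8, 6.7, 5.2, 6.0, 5.8]


def ses_forecasts(series, alpha, initial=None):
    """Return [F_1, ..., F_{n+1}] using F_{t+1} = F_t + alpha * (y_t - F_t).

    If no initial forecast is given, F_1 is set to the first observation.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    forecasts = [series[0] if initial is None else initial]
    for y in series:
        previous = forecasts[-1]
        forecasts.append(previous + alpha * (y - previous))
    return forecasts


def sum_squared_error(series, alpha, initial=None):
    """Sum of squared one-step-ahead errors e_t = y_t - F_t over the series."""
    forecasts = ses_forecasts(series, alpha, initial)
    return sum((y - f) ** 2 for y, f in zip(series, forecasts))


candidates = [round(0.1 * i, 1) for i in range(1, 10)]   # 0.1, 0.2, ..., 0.9
sse_by_alpha = {a: sum_squared_error(sales, a) for a in candidates}
best_alpha = min(sse_by_alpha, key=sse_by_alpha.get)

next_week = ses_forecasts(sales, best_alpha)[-1]          # F_26
print(f"best alpha = {best_alpha}, forecast for week 26 = {next_week:.2f}")
```

To use the second initialisation option instead, pass the average of the first few observations, for example `ses_forecasts(sales, 0.3, initial=sum(sales[:5]) / 5)`.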

Period (t)   Sales (y)
1            5.3
2            4.4
3            5.4
4            5.8
5            5.6
6            4.8
7            5.6
8            5.6
9            5.4
10           6.5
11           5.1
12           5.8
13           5.0
14           6.2
15           5.6
16           6.7
17           5.2
18           5.5
19           5.8
20           5.1
21           5.8
22           6.7
23           5.2
24           6.0
25           5.8

[Figure: Weekly Sales. Sales plotted against weeks.]

• Use a three-week moving average (k = 3) of the department store sales to forecast weeks 24 and 26.

$$\hat{y}_{24} = \frac{y_{23} + y_{22} + y_{21}}{3} = \frac{5.2 + 6.7 + 5.8}{3} = 5.9$$

• The forecast error is

$$e_{24} = y_{24} - \hat{y}_{24} = 6 - 5.9 = 0.1$$

• The forecast for week 26 is

$$\hat{y}_{26} = \frac{y_{25} + y_{24} + y_{23}}{3} = \frac{5.8 + 6 + 5.2}{3} \approx 5.7$$
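
The two moving-average forecasts above can be checked with a few lines of code. This is a minimal sketch using the weekly sales figures from the table; `moving_average_forecast` and its 1-based `period` argument are illustrative choices.

```python
# Weekly sales (millions of dollars), weeks 1-25, from the department store table.
sales = [5.3, 4.4, 5.4, 5.8, 5.6, 4.8, 5.6, 5.6, 5.4, 6.5, 5.1, 5.8, 5.0,
         6.2, 5.6, 6.7, 5.2, 5.5, 5.8, 5.1, 5.8, 6.7, 5.2, 6.0, 5.8]


def moving_average_forecast(series, k, period):
    """MA(k) forecast for a 1-based period: mean of the k observations just before it."""
    if period - 1 > len(series) or period - k < 1:
        raise ValueError("not enough observations for this period")
    window = series[period - 1 - k: period - 1]
    return sum(window) / k


f24 = moving_average_forecast(sales, 3, 24)   # uses weeks 21, 22 and 23
e24 = sales[24 - 1] - f24                     # actual week 24 minus its forecast
f26 = moving_average_forecast(sales, 3, 26)   # uses weeks 23, 24 and 25

print(f"forecast week 24 = {f24:.1f}, error = {e24:.1f}, forecast week 26 = {f26:.1f}")
```

Rounded to one decimal place, the results agree with the hand calculation: 5.9 for week 24, an error of 0.1, and 5.7 for week 26.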
