Forecasting Techniques: Quantitative Techniques in Management
Chapter 3
Forecasting Techniques
Contents
Forecasting Techniques
3.1 Estimation Using the Regression Line
3.2 Scatter Diagram
3.3 Karl Pearson’s Coefficient of Correlation
3.4 Time Series Analysis
3.5 Forecasting
Quantitative Techniques in Management
Forecasting Techniques
So far, we have studied problems relating to one variable only. In practice we come across a large
number of problems involving two or more variables. If two quantities vary
in such a way that movements in one are accompanied by movements in the other, these quantities
are correlated.
Managers make personal and professional decisions based on predictions of future events. For this
they rely on the relationship between what is known and what is to be estimated. If decision makers can
determine how the known is related to the unknown, they can aid the decision-making process
considerably.
Regression and correlation analyses show us how to determine both the nature and the strength
of relationship between two variables. In regression analysis, we develop an estimating equation
i.e. a mathematical formula that relates the known variables to the unknown variable. Afterwards
we can apply correlation analysis to determine the degree to which the variables are related.
Relationship Type
Regression and correlation analyses are based on the relationship, or association, between two
(or more) variables. The known variable is called the independent variable. The variable that we
need to predict is called the dependent variable.
Scatter Diagrams
The first step in determining whether there is a relationship between two variables is to examine the
graph of the observed data. This graph or chart is called the scatter diagram. A scatter diagram
gives two types of information. First, we can look for patterns that indicate that the variables are
related. Second, if the variables are related, we can see what kind of estimating equation describes this
relationship. For a straight-line relationship, the estimating equation is
Y = a + bX
where Y is the dependent variable, X is the independent variable, a is the Y-intercept, which is
the point at which the regression line crosses the Y-axis (the vertical axis) and b is the slope of
the regression line. It should be noted that the values of both a and b will remain constant for any
given straight line. On the basis of this equation, we can find the value of Y for any value of X if
values of a and b are known to us.
The values of a and b are obtained by solving the following two normal equations:
∑Y = na + b∑X
∑XY = a∑X + b∑X²
where
∑Y = the total of the Y series
n = the number of observations
∑X = the total of the X series
∑XY = the sum of the XY column
∑X² = the total of the squares of the individual items in the X series
a and b are the Y-intercept and the slope of the regression line, respectively. We now take up an
example to illustrate the use of the two normal equations. Given the following data, find the
regression equation of Y on X.
Example:
X 2 3 4 5 6
Y 7 9 10 14 15
Solution
We now set up a worksheet to get the values of the terms shown earlier.
Worksheet for Computing Correlation
X        Y        XY        X²
2        7        14        4
3        9        27        9
4        10       40        16
5        14       70        25
6        15       90        36
∑X = 20  ∑Y = 55  ∑XY = 241  ∑X² = 90
Substituting these values in the normal equations given above, we get
55 = 5a + 20b (i)
241 = 20a + 90b (ii)
Multiplying (i) by 4 and subtracting the result from (ii) gives 21 = 10b. Solving, we get
a = 2.6
b = 2.1
Therefore, the regression equation of Y on X is
Y = 2.6 + 2.1X
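The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name fit_line is an assumption made here, and the closed-form expressions for b and a are simply what results from eliminating a between the two normal equations.

```python
# Minimal sketch: fit Y = a + bX by solving the two normal equations.
# The name fit_line is hypothetical; the data are those of the example above.
X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a from the normal equations gives the slope b,
    # and the first normal equation then gives the intercept a.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = fit_line(X, Y)
print(round(a, 2), round(b, 2))  # prints 2.6 2.1
```

The same function can be applied to any paired X and Y series of equal length, provided the X values are not all identical.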
Alternative Approach
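We can use an alternative approach, which involves the use of two formulae: one to calculate the
Y-intercept and the other to calculate the slope. The formula for calculating the slope is
b = (∑XY − nX̄Ȳ) / (∑X² − nX̄²)
and the formula for calculating the Y-intercept is
a = Ȳ − bX̄
where X̄ and Ȳ are the means of the X and Y series. For the data above, X̄ = 20/5 = 4 and Ȳ = 55/5 = 11, so
b = (241 − 5 × 4 × 11) / (90 − 5 × 4²) = 21/10 = 2.1
a = 11 − (2.1 × 4) = 2.6
which again gives
Y = 2.6 + (2.1 × X)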
We are now clear as to how the regression line is obtained. The question is how to check the
accuracy of our results. One method is to draw a scatter diagram with the original data pertaining to
the X and Y series and then to fit a straight line. This graph will give a visual idea of the
suitability of the fitted straight line. A more refined and, therefore, better approach is based on
the mathematical properties of a line fitted by the method of least squares: the
positive and the negative errors (i.e. the differences between the original data points and the
calculated points) must be equal in magnitude, so that when all the individual errors are added together, the result
is zero.
X      Y      Yc                         Y − Yc
2      7      2.6 + (2.1 × 2) = 6.8       0.2
3      9      2.6 + (2.1 × 3) = 8.9       0.1
4      10     2.6 + (2.1 × 4) = 11.0     −1.0
5      14     2.6 + (2.1 × 5) = 13.1      0.9
6      15     2.6 + (2.1 × 6) = 15.2     −0.2
Total                                     0
Here, the calculated value of Y is shown as Yc. We find that the sum of the positive errors, Y − Yc, is
equal to 1.2, and the sum of the negative errors is −1.2. Thus, the sum of the column Y − Yc comes to
zero, which confirms that the regression line has been computed correctly.
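The check described above can be sketched in Python as follows. The variable names are chosen here for illustration only, and the values of a and b are those of the worked example. Because of floating-point rounding, the total of the errors is compared with zero only approximately.

```python
# Minimal sketch: verify that the errors Y - Yc of the fitted line sum to zero.
X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]
a, b = 2.6, 2.1  # intercept and slope taken from the worked example

fitted = [a + b * x for x in X]                      # Yc for each X
residuals = [y - yc for y, yc in zip(Y, fitted)]     # Y - Yc

positive_total = sum(e for e in residuals if e > 0)  # about 1.2
negative_total = sum(e for e in residuals if e < 0)  # about -1.2

for x, y, yc, e in zip(X, Y, fitted, residuals):
    print(x, y, round(yc, 1), round(e, 1))

# Floating-point arithmetic leaves a tiny remainder, so compare with a tolerance.
print(abs(sum(residuals)) < 1e-9)
```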
Regression Coefficients
So far, our discussion of regression analysis has related to finding the regression of Y on X. It is
equally possible to think of X as the dependent variable and Y as the independent one. In that case,
we use X = a + bY as the estimating equation. The normal equations then become
∑X = na + b∑Y
∑XY = a∑Y + b∑Y²
Here, we have to get the values of ∑X, ∑Y, ∑XY, ∑Y² and n. Once these values are known,
we enter them in the two normal equations. The equations can then be solved in the same
manner as in the case of the regression of Y on X.
i. Regression equation of Y on X
Y − Ȳ = r(σy/σx)(X − X̄)
The term r(σy/σx) can be written as ∑xy / ∑x²
ii. Regression equation of X on Y
X − X̄ = r(σx/σy)(Y − Ȳ)
The term r(σx/σy) can be written as ∑xy / ∑y²
It may be noted that the square root of the product of the two regression coefficients is the value of
the coefficient of correlation.
We may write
bxy or r(σx/σy) = ∑xy/∑y²
byx or r(σy/σx) = ∑xy/∑x²
r = (bxy × byx)^0.5
Another point to note is that x and y are the deviations of the X and Y series from their arithmetic
means, that is, x = X − X̄ and y = Y − Ȳ.
Example:
X      Y      x = X − X̄    x²     y = Y − Ȳ    y²     xy
2      7      −2           4      −4           16     8
3      9      −1           1      −2           4      2
4      10      0           0      −1           1      0
5      14      1           1       3           9      3
6      15      2           4       4           16     8
∑X = 20  ∑Y = 55  ∑x = 0  ∑x² = 10  ∑y = 0  ∑y² = 46  ∑xy = 21
X̄ = 20/5 = 4
Ȳ = 55/5 = 11
Regression equation of Y on X:
Y − Ȳ = r(σy/σx)(X − X̄) = (∑xy/∑x²)(X − X̄)
On putting the values, we get
Y − 11 = (21/10)(X − 4)
Y = 2.6 + 2.1X
Also,
r = (bxy × byx)^0.5
= ((∑xy/∑y²) × (∑xy/∑x²))^0.5
= ((21/46) × (21/10))^0.5
= (0.9587)^0.5
= 0.98
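The two regression coefficients and r can be recomputed from the deviations in a short Python sketch. The variable names b_yx and b_xy are chosen here to mirror byx and bxy. Since the square root alone is always positive, the sketch gives r the sign of ∑xy, which is the sign shared by both regression coefficients.

```python
import math

# Minimal sketch: regression coefficients and r from deviations about the means.
X = [2, 3, 4, 5, 6]
Y = [7, 9, 10, 14, 15]

mean_x = sum(X) / len(X)
mean_y = sum(Y) / len(Y)
dx = [x - mean_x for x in X]  # x = X - mean of X
dy = [y - mean_y for y in Y]  # y = Y - mean of Y

sum_xy = sum(p * q for p, q in zip(dx, dy))  # 21
sum_x2 = sum(p * p for p in dx)              # 10
sum_y2 = sum(q * q for q in dy)              # 46

b_yx = sum_xy / sum_x2  # regression coefficient of Y on X
b_xy = sum_xy / sum_y2  # regression coefficient of X on Y

# r is the square root of the product, carrying the sign of the coefficients.
r = math.copysign(math.sqrt(b_yx * b_xy), sum_xy)
print(round(b_yx, 2), round(b_xy, 4), round(r, 2))
```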
Regression equation of X on Y, with X̄ = 40, Ȳ = 45, r = 0.5 and σx/σy = 10/9, is
X − X̄ = r(σx/σy)(Y − Ȳ)
or X = 40 + 0.5 (10/9) (Y − 45)
or X = 40 + 0.556 (Y − 45)
or X = 40 + 0.556Y − 25.02
or X = 14.98 + 0.556Y
In order to estimate the value of Y for X = 48, we have to use the regression equation of Y on X:
Y = 27 + 0.45X
When X = 48,
Y = 27 + (0.45 × 48)
or Y = 27 + 21.6
or Y = 48.6
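As a rough check, both regression lines can be rebuilt from summary statistics in Python. The means, standard deviations and r used below are read off from the worked equations above (the standard deviations 10 and 9 are an assumption consistent with the ratio 10/9 that appears there), and the variable names are illustrative.

```python
# Minimal sketch: both regression lines from means, standard deviations and r.
# The inputs are inferred from the worked equations, not from a stated data set.
mean_x, mean_y = 40.0, 45.0
sd_x, sd_y = 10.0, 9.0
r = 0.5

# Regression of Y on X: Y = a_yx + b_yx * X
b_yx = r * sd_y / sd_x
a_yx = mean_y - b_yx * mean_x

# Regression of X on Y: X = a_xy + b_xy * Y
b_xy = r * sd_x / sd_y
a_xy = mean_x - b_xy * mean_y

estimate = a_yx + b_yx * 48  # estimated Y when X = 48
print(round(a_yx, 2), round(b_yx, 2), round(estimate, 1))
print(round(a_xy, 2), round(b_xy, 3))
```

Without intermediate rounding the intercept of the X on Y line comes to 15.0; the figure 14.98 in the text arises from rounding the coefficient to 0.556 before multiplying.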
This should be obvious, as r happens to be the square root of the product of the two regression coefficients.
6. Finally, regression coefficients are not affected by a change of origin, but they are affected by a change
of scale.
The standard error of estimate is given by
Syx = (∑(Y − Yc)² / (n − 2))^0.5
where Yc is the calculated or estimated value of Y, Y is the actual or observed value of the variable,
n is the total number of observations and Syx is the standard deviation of regression of Y values on X
values. It may be noted that the sum of the squared deviations is divided by n − 2 and not by n. This
is because 2 degrees of freedom are lost in obtaining the values of a and b from the sample data
when estimating the regression line. There is a short-cut method for
finding the standard error of estimate. The formula is
Syx = ((∑Y² − a∑Y − b∑XY) / (n − 2))^0.5
where
X = values of the independent variable
Y = values of the dependent variable
a = Y-intercept
b = slope of the estimating equation
n = number of observations
This formula is a short cut because, once we have estimated the regression
equation, all the values that we need have already been determined.
Example
Suppose that we have been given the following data pertaining to two series X and Y
X 40 30 20 50 60 40 20 60
Y 100 80 60 120 150 90 70 130
X series indicates advertising expenditure in thousand rupees and Y series relates to sales in
units. We are told the regression equation is Yc =24.444 + 1.889 X. We are asked to calculate
the standard error of estimate.
Solution
X      Y      Yc     (Y − Yc)   (Y − Yc)²
40     100    100     0          0
30     80     81     −1          1
20     60     62     −2          4
50     120    119     1          1
60     150    138     12         144
40     90     100    −10         100
20     70     62      8          64
60     130    138    −8          64
Total:                           378
Syx = (378/(8 − 2))^0.5
= (378/6)^0.5
= (63)^0.5
= 7.9 (standard error)
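The calculation can be reproduced with the following Python sketch. It refits a and b from the data rather than using the rounded values, and it compares the direct formula with the usual short-cut form ((∑Y² − a∑Y − b∑XY)/(n − 2))^0.5. The function and variable names are assumptions made for illustration.

```python
import math

# Minimal sketch: standard error of estimate, direct and short-cut forms.
X = [40, 30, 20, 50, 60, 40, 20, 60]     # advertising expenditure
Y = [100, 80, 60, 120, 150, 90, 70, 130]  # sales

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Least-squares slope and intercept (about 1.889 and 24.444).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Direct form: squared deviations of Y from the fitted values, divided by n - 2.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))
se_direct = math.sqrt(sse / (n - 2))

# Short-cut form, using sums that are already available from the fit.
se_shortcut = math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))

print(round(a, 3), round(b, 3))
print(round(se_direct, 1), round(se_shortcut, 1))
```

Both forms give roughly 7.9. The worksheet total of 378 differs slightly from the unrounded sum of squares because the Yc values there are rounded to whole numbers.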
The higher the magnitude of the standard error of estimate, the greater is the dispersion or variability
of the points around the regression line. In contrast, if the standard error of estimate is zero,
the estimating equation is a perfect estimator of the dependent variable. In such a
case, all the points lie on the regression line, and no point is scattered
around it.
As the summer heat rises, hill stations are crowded with more and more visitors, and ice-cream sales
become brisker. Thus, the temperature is related to the number of visitors and to the sale of ice-creams.
Correlation analysis is a means of examining such relationships systematically. It deals with
questions such as:
• Is there any relationship between two variables?
• If the value of one variable changes, does the value of the other also change?
• Do both the variables move in the same direction?
• How strong is the relationship?
The purpose of correlation analysis is to measure and interpret the strength of a linear or
nonlinear (e.g. exponential, polynomial or logistic) relationship between two continuous
variables. When conducting correlation analysis, we use the term association to mean linear
association.
Simple linear correlation is a measure of the degree to which two variables vary together, or a
measure of the intensity of the association between two variables. The parameter being measured
is ρ (rho), and it is estimated by the statistic r, the correlation coefficient. r can range from −1 to 1
and is independent of the units of measurement. The strength of the association increases as the
absolute value of r approaches 1.0. A value of 0 indicates that there is no linear association between the
two variables tested.
A better estimate of r can usually be obtained by calculating r on treatment means averaged
across replicates. Correlation does not have to be performed only between independent and
dependent variables; it can also be computed between two dependent variables. The X and Y in the
equation to determine r do not necessarily correspond to an independent and a dependent
variable, respectively.
Methods of Measuring Correlation:
Widely used techniques for the study of correlation are scatter diagrams, Karl Pearson’s
coefficient of correlation and Spearman’s rank correlation. A scatter diagram visually presents
the nature of the association without giving any specific numerical value. A numerical measure of the
linear relationship between two variables is given by Karl Pearson’s coefficient of correlation. A
relationship is said to be linear if it can be represented by a straight line. Another measure is
Spearman’s coefficient of correlation, which measures the linear association between the ranks
assigned to individual items according to their attributes. Attributes are those variables which
cannot be measured numerically, such as the intelligence of people, physical appearance, honesty etc.
Scatter plots are a useful means of getting a better understanding of your data.
The standard deviations of X and Y are the positive square roots of their respective variances.
The covariance of X and Y is defined as
Cov(X, Y) = ∑xy / n
where x = X − X̄ and y = Y − Ȳ are the deviations of the ith values of X and Y from their mean
values, respectively, and n is the number of pairs of observations.
The sign of the covariance between X and Y determines the sign of the correlation coefficient, since the
standard deviations are always positive. If the covariance is zero, the correlation coefficient is
always zero. The product moment correlation, or Karl Pearson’s measure of correlation, is
given by
r = Cov(X, Y) / (σx σy) = ∑xy / (∑x² × ∑y²)^0.5
Strong and weak are words used to describe correlation. If the correlation is strong, the
points lie close to a line. If the correlation is weak, the points are widely scattered.
The strength of the correlation can be expressed as a number. Such measurements are
called correlation coefficients. The best known is the Pearson product-moment correlation
coefficient. If its value is 1, there is perfect positive correlation. If its value is −1, there is
perfect negative correlation. A value close to 0 indicates weak correlation. Another kind of
correlation coefficient is Spearman's rank correlation
coefficient.
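To make the interpretation of the coefficient concrete, here is a small Python sketch of the product-moment formula, r = Cov(X, Y)/(σx σy), with the covariance and standard deviations both computed using a divisor of n. The function name pearson_r and the second data set are hypothetical and serve only to show a value of −1.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation: covariance over the product of the SDs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

# Data from the regression example earlier in the chapter: strong positive.
r_positive = pearson_r([2, 3, 4, 5, 6], [7, 9, 10, 14, 15])

# Hypothetical data lying exactly on a falling straight line: perfect negative.
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])

print(round(r_positive, 2), round(r_negative, 2))
```

The second result illustrates that a coefficient of −1 describes a perfect relationship in the opposite direction, not a weak one.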
On the other hand, in non-linear correlation, the amount of change in one variable does not bear a constant
ratio to the amount of change in the other variable.
In time series analysis, it is assumed that there is a multiplicative relationship between the four
components of a time series.
Symbolically,
Y = T × S × C × I
where Y denotes the result of the four elements; T = Trend; S = Seasonal component; C =
Cyclical component; I = Irregular component.
In the multiplicative model it is assumed that the four components are due to different causes, but
they are not necessarily independent and they can affect one another. Another approach is to treat
each observation of a time series as the sum of these four components. Symbolically,
Y = T + S + C + I
The additive model assumes that all the components of the time series are independent of one
another.
The four components are:
1) Secular Trend or Long - Term movement or simply Trend
2) Seasonal Variation
3) Cyclical Variations
4) Irregular or erratic or random movements (fluctuations)
Secular Trend:
It is the long-term movement in a time series. The general tendency of a time series to increase,
decrease or stagnate over a long period of time is called the secular trend, or simply the trend.
Methods of Measuring Trend:
Trend is measured by the following methods.
1. Graphical method
2. Method of Semi-averages
3. Method of moving averages
4. Method of Least Squares
Graphical Method:
This is the easiest and simplest method of measuring trend. In this method, the given data are
plotted on a graph, taking time on the horizontal axis and values on the vertical axis. A
smooth curve is then drawn to show the direction of the trend. While fitting a trend line, the following
important points should be noted to get a good trend line.
(i) The curve should be smooth.
(ii) As far as possible, there should be an equal number of points above and below the trend line.
(iii) The sum of the squares of the vertical deviations from the trend should be as small as
possible.
(iv) If there are cycles, an equal number of cycles should lie above and below the trend line.
(v) In the case of cyclical data, the areas of the cycles above and below the trend line should be nearly equal.
Example:
Fit a trend line to the following data by graphical method.
Year 1996 1997 1998 1999 2000 2001 2002
Sales (in Rs 000) 60 72 75 65 80 85 95
Solution:
Demerits:
1. It is highly subjective. Different trend curves will be obtained by different persons for the
same set of data.
2. It is dangerous to use freehand trend for forecasting purposes.
3. It does not enable us to measure trend in precise quantitative terms.
Method of semi averages:
In this method, the given data are divided into two parts, preferably with the same number of
years. For example, if we are given data from 1981 to 1998, i.e. over a period of 18 years, the
two equal parts will be the first nine years, 1981 to 1989, and the last nine years, 1990 to 1998. In the case of an odd
number of years, such as 5, 7, 9 or 11, two equal parts can be made simply by omitting the middle
year. For example, if the data are given for 7 years from 1991 to 1997, the two equal parts would
be 1991 to 1993 and 1995 to 1997; the middle year, 1994, is omitted.
After the data have been divided into two parts, the average of each part is obtained. Thus we get
two points. Each point is plotted at the mid-point of the period covered by the respective part,
and the two points are then joined by a straight line, which gives us the required trend line. The
line can be extended downwards and upwards to get intermediate values or to predict future
values.
Example :
Draw a trend line by the method of semi-averages.
Year 1991 1992 1993 1994 1995 1996
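Sales (Rs '000) 60 75 81 110 106 117
Solution:
Divide the data into two parts, taking 3 values in each part. The first part (1991 to 1993) has the
semi-average (60 + 75 + 81)/3 = 72, which is plotted against the middle year 1992. The second part
(1994 to 1996) has the semi-average (110 + 106 + 117)/3 = 111, which is plotted against 1995.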
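The method of semi-averages can be sketched in Python as follows. The function names semi_averages and trend are assumptions made here for illustration; the sketch omits the middle year when the number of years is odd, as described earlier, and joins the two semi-average points by a straight line.

```python
# Minimal sketch of the method of semi-averages (hypothetical function names).
years = [1991, 1992, 1993, 1994, 1995, 1996]
sales = [60, 75, 81, 110, 106, 117]

def semi_averages(ts, values):
    """Return the two (mid-point, average) points; the middle item is dropped if n is odd."""
    n = len(values)
    half = n // 2
    first_t, first_v = ts[:half], values[:half]
    second_t, second_v = ts[n - half:], values[n - half:]
    p1 = (sum(first_t) / half, sum(first_v) / half)
    p2 = (sum(second_t) / half, sum(second_v) / half)
    return p1, p2

p1, p2 = semi_averages(years, sales)
slope = (p2[1] - p1[1]) / (p2[0] - p1[0])  # change in trend value per year

def trend(year):
    """Trend value on the straight line through the two semi-average points."""
    return p1[1] + slope * (year - p1[0])

print(p1, p2, slope)
print([trend(y) for y in years], trend(1997))
```

Extending the line one year beyond the data gives the trend estimate for 1997, subject to the limitations of this method.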
Merits:
1. It is simple and easy to calculate
2. By this method, everyone gets the same trend line.
3. Since the line can be extended in both ways, we can find the later and earlier estimates.
Demerits:
1. This method assumes the presence of linear trend to the values of time series which may not
exist.
2. The trend values and the predicted values obtained by this method are not very reliable.
Method of moving averages:
Example:
Calculate the three-yearly moving averages of the following data.
Year 1975 1976 1977 1978 1979 1980 1981 1982 1983 1984
Production (in tonnes) 50 36 43 45 39 38 33 42 41 34
Solution:
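A three-yearly moving average can be computed with a short Python sketch such as the one below. The function name moving_average is an assumption; each average is placed against the middle year of the three years it covers, so the first and last years have no trend value.

```python
# Minimal sketch: three-yearly moving averages, centred on the middle year.
years = [1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984]
production = [50, 36, 43, 45, 39, 38, 33, 42, 41, 34]

def moving_average(values, period):
    """Simple moving averages of the given (odd) period."""
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

averages = moving_average(production, 3)
centre_years = years[1:-1]  # each average sits against the middle year

for year, avg in zip(centre_years, averages):
    print(year, round(avg, 2))
```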
Example: The production of tea in India is given as follows. Calculate the four-yearly moving
averages.
Year 1993 1994 1995 1996 1997 1998 1999 2000 2001 2002
Production (tonnes) 464 515 518 467 502 540 557 571 586 612
Solution:
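When the period is even, a four-yearly average falls between two years, so the usual practice is to centre it by averaging each pair of adjacent four-yearly averages. The Python sketch below follows that practice; the function name centred_moving_average is an assumption made for illustration.

```python
# Minimal sketch: four-yearly centred moving averages.
years = [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002]
production = [464, 515, 518, 467, 502, 540, 557, 571, 586, 612]

def centred_moving_average(values, period):
    """Moving averages of an even period, centred by averaging adjacent pairs."""
    simple = [sum(values[i:i + period]) / period
              for i in range(len(values) - period + 1)]
    return [(simple[i] + simple[i + 1]) / 2 for i in range(len(simple) - 1)]

centred = centred_moving_average(production, 4)
half = 4 // 2
centre_years = years[half:len(years) - half]  # 1995 to 2000

for year, value in zip(centre_years, centred):
    print(year, value)
```

The first centred value, 495.75, is placed against 1995 and the last, 572.5, against 2000; the two years at each end of the series have no trend value.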
Merits:
1. The method is simple to understand and easy to adopt as compared to other methods.
2. Method is flexible as mere addition of more figures to the data will not change the entire
calculation. That will produce only some more trend values.
3. Regular cyclical variations can be completely eliminated by a period of moving average equal
to the period of cycles.
4. It is particularly effective if the trend of a series is very irregular.
Demerits:
1. It cannot be used for forecasting or predicting future trend, which is the main objective of
trend analysis.
2. The choice of the period of moving average is sometimes subjective.
3. Moving averages are generally affected by extreme values of items.
4. It cannot eliminate irregular variations completely.
Method of Least Squares:
This method is widely used. It plays an important role in finding the trend values of economic
and business time series, and it helps in forecasting future values. The trend line
obtained by this method is called the line of best fit.
The equation of the trend line is y = a + bx, where the constants a and b are estimated so as
to minimize the sum of the squares of the differences between the given values of y and the
estimated values of y obtained from the equation. The constants can be obtained by solving the two normal
equations.
∑y = na + b∑x ………. (1)
∑xy = a∑x + b∑x² ……… (2)
Here x represents the time points, y the observed values and n the number of pairs of values.
When an odd number of years is given
Step 1: Write the given years in column 1 and the corresponding sales, production etc. in column
2.
Step 2: In column 3, write 0, 1, 2, ... against the years in column 1 and denote these values by X.
Step 3: Take the middle value of X as A.
Step 4: Find the deviations u = X − A and write them in column 4.
Step 5: Find the u² values and write them in column 5.
Step 6: In column 6, write the products uy.
Now the normal equations become
∑y = na + b∑u, where u = X − A
∑uy = a∑u + b∑u²
Since ∑u = 0, the first equation gives
a = ∑y/n
and the second gives
∑uy = b∑u²
b = ∑uy/∑u²
The fitted straight line is
y = a + bu = a + b(X − A)
Example:
Fit a straight line trend by the method of least squares for the following data.
Year 1983 1984 1985 1986 1987 1988
Sales (Rs. in lakhs) 3 8 7 9 11 14
Also estimate the sales for the year 1991
Solution:
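Here the number of years is even, so the middle value of X is A = 2.5 and the deviations are measured in half-year units:
u = (X − A)/(1/2)
= 2(X − 2.5) = 2X − 5
Year     Sales (y)   X     u      u²     uy
1983     3           0    −5      25    −15
1984     8           1    −3      9     −24
1985     7           2    −1      1     −7
1986     9           3     1      1      9
1987     11          4     3      9      33
1988     14          5     5      25     70
Total    52                0      70     66
The straight line equation is
y = a + bu
From (1), since ∑u = 0,
a = 52/6
= 8.67
From (2), 66 = 70b
b = 66/70
= 0.94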
The fitted straight line equation is
y = a+bu
y = 8.67+0.94(2X-5)
y = 8.67 + 1.88X - 4.7
y = 3.97 + 1.88X
The trend values are
X = 0, y = 3.97     X = 1, y = 5.85
X = 2, y = 7.73     X = 3, y = 9.61
X = 4, y = 11.49    X = 5, y = 13.37
To estimate the sales for the year 1991, put X = year − 1983
= 1991 − 1983 = 8
y = 3.97 + 1.88 × 8
= 19.01 lakhs
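The worked example can be reproduced with the Python sketch below. It uses the coding u = 2X − (n − 1), an assumption that reduces to u = 2X − 5 for the six years here and makes ∑u = 0 for any number of years; the variable and function names are illustrative.

```python
# Minimal sketch: straight-line trend by least squares with coded time values.
years = [1983, 1984, 1985, 1986, 1987, 1988]
sales = [3, 8, 7, 9, 11, 14]  # Rs. in lakhs

n = len(sales)
X = list(range(n))                    # 0, 1, 2, ... against the years
u = [2 * x - (n - 1) for x in X]      # -5, -3, -1, 1, 3, 5 so that sum(u) == 0

a = sum(sales) / n                                                      # 52/6
b = sum(ui * y for ui, y in zip(u, sales)) / sum(ui * ui for ui in u)   # 66/70

# Convert y = a + b*u back to the form y = intercept + slope * X.
intercept = a - b * (n - 1)
slope = 2 * b

def trend(year):
    """Trend value for a calendar year, with X measured from the first year."""
    return intercept + slope * (year - years[0])

print(round(a, 2), round(b, 2), round(intercept, 2), round(slope, 2))
print([round(trend(y), 2) for y in years], round(trend(1991), 2))
```

Carried through without intermediate rounding, the estimate for 1991 is about 19.04 lakhs; the figure of 19.01 in the text reflects rounding a and b to two decimal places first.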
The following graph will show clearly the trend line.
Merits:
1. Since it is a mathematical method, it is not subjective, and so it eliminates the personal bias of the
investigator.
2. By this method we can estimate future values as well as intermediate values of the time
series.
3. By this method we can find all the trend values.
Demerits:
1. It is a difficult method. The addition of new observations requires the calculations to be done afresh.
2. The assumption of a straight line may sometimes be misleading, since economic and business time
series are not always linear.
3. It ignores cyclical, seasonal and irregular fluctuations.
4. The trend can be estimated only for the immediate future and not for the distant future.
Seasonal Variations:
Seasonal variations are fluctuations that occur within a year, from season to season. The factors that cause
seasonal variation are
i) Climate and weather conditions.
ii) Customs and traditional habits.
Measurement of seasonal variation:
The following are some of the methods more popularly used for measuring the seasonal
variations.
1. Method of simple averages.
2. Ratio to trend method.
3. Ratio to moving average method.
4. Link relative method
Method of simple averages
The steps for calculations:
i) Arrange the data season-wise.
ii) Compute the average for each season.
iii) Calculate the grand average, which is the average of the seasonal averages.
iv) Obtain the seasonal indices by expressing each seasonal average as a percentage of the grand average.
The total of these indices would be 100n, where n is the number of seasons in the year.
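The steps above can be sketched in Python as follows. The quarterly figures are hypothetical values invented purely to illustrate the arithmetic, and the variable names are likewise assumptions.

```python
# Minimal sketch: seasonal indices by the method of simple averages.
# Hypothetical quarterly data, one row per year (Q1, Q2, Q3, Q4).
data = {
    2001: [30, 40, 36, 34],
    2002: [34, 52, 50, 44],
    2003: [40, 58, 54, 48],
}

rows = list(data.values())
n_seasons = len(rows[0])

# Steps i and ii: arrange the data season-wise and average each season.
seasonal_averages = [sum(row[s] for row in rows) / len(rows)
                     for s in range(n_seasons)]

# Step iii: the grand average is the average of the seasonal averages.
grand_average = sum(seasonal_averages) / n_seasons

# Step iv: each seasonal average as a percentage of the grand average.
seasonal_indices = [avg / grand_average * 100 for avg in seasonal_averages]

print([round(i, 2) for i in seasonal_indices])
print(round(sum(seasonal_indices), 2))  # close to 100 * n_seasons
```

With four seasons the indices total 400, so an index above 100 marks a quarter that is typically above the yearly average and an index below 100 marks one that is below it.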
Example :
Find the seasonal variations by simple average method for the data given below.
Solution:
Cyclical variations:
The term cycle refers to the recurrent variations in a time series that extend over a longer period of
time, usually two or more years. Most time series relating to economics and business show
some kind of cyclical variation. A business cycle consists of the recurrence of the up and down
movements of business activity. It is a four-phase cycle, namely:
1. Prosperity
2. Decline
3. Depression
4. Recovery
Each phase changes gradually into the following phase. The following diagram illustrates a
business cycle.
The study of cyclical variation is extremely useful in framing suitable policies for stabilizing the
level of business activities. Businessmen can take timely steps in maintaining business during
booms and depression.
Irregular variation:
Irregular variations are also called erratic variations. They are not regular and do not
repeat in a definite pattern. They are caused by wars, earthquakes, strikes, floods,
revolutions etc. Such variation is short-term, but it affects all the components of the series. There
are no statistical techniques for measuring or isolating erratic fluctuations. Therefore, the residual
that remains after eliminating the systematic components is taken as representing the irregular
variations.
3.5 Forecasting
Introduction:
A very important use of time series data is in forecasting the likely value of a variable in the
future. In most cases it is the projection of a trend fitted to the values of a variable over a
sufficiently long period by any of the methods discussed earlier. Adjustments for seasonal and
cyclical character introduce further improvement in the forecasts based on the simple projection
of the trend. The importance of forecasting in business and economic fields lies in its
role in planning and evaluation. If suitably interpreted, after consideration of other forces, say
political, social and governmental policies etc., this statistical technique can be of immense help in
decision making.
The success of any business depends on its future estimates. On the basis of these estimates a
businessman plans his production, stocks, selling markets, arrangement of additional funds etc.
Forecasting is different from predictions and projections. Regression analysis, time series
analysis and index numbers are some of the techniques through which predictions and
projections are made, whereas forecasting is a method of foretelling the course of business
activity based on the analysis of past and present data mixed with the consideration of ensuing
economic policies and circumstances. In particular, forecasting means fore-warning. Forecasts
based on statistical analysis are much more reliable than guesswork.
3. R is independent of
(a). choice of origin and not of choice of scale
(b). choice of scale and not of choice of origin
(c). both choice of origin and choice of scale
(d). none of these
7. If the value of r² for a particular situation is 0.49, what is the coefficient of correlation?
(a). 0.49
(b). 0.7
(c). 0.07
(d). Cannot be determined