
Chapter 15

Multiple Regression
Analysis

Business Statistics - Naval Bajpai


Learning Objectives

Upon completion of this chapter, you will be able to:


 Understand the applications of the multiple regression model
 Understand the concept of coefficient of multiple determination, adjusted
coefficient of multiple determination, and standard error of the estimate
 Understand and use residual analysis for testing the assumptions of
multiple regression
 Use statistical significance tests for the regression model and coefficients
of regression
 Test portions of the multiple regression model
 Understand the non-linear regression model and the quadratic regression
model, and test the statistical significance of the overall quadratic
regression model
 Understand the concept of model transformation in regression models
 Understand the concept of collinearity and the use of variance inflationary
factors in multiple regression
 Understand the conceptual framework of model building in multiple
regression



The Multiple Regression Model

 Regression analysis with two or more independent variables, or with at
least one non-linear predictor, is referred to as multiple regression
analysis.
 Multiple regression model with k independent variables:
y = β0 + β1x1 + β2x2 + … + βkxk + ε
 Multiple regression equation:
E(y) = β0 + β1x1 + β2x2 + … + βkxk



Figure 15.1: Summary of the estimation process for multiple
regression



Multiple Regression Model with Two
Independent Variables

 The multiple regression model with two independent variables is the
simplest multiple regression model, in which the highest power of any
variable is equal to one.
 Multiple regression model with two independent variables:
y = β0 + β1x1 + β2x2 + ε
 Multiple regression equation with two independent variables:
E(y) = β0 + β1x1 + β2x2



Example 15.1: A consumer electronics company has adopted an aggressive
policy to increase the sales of a newly launched product. The company has
invested in advertisements and has also employed salesmen to increase
sales rapidly. Table 15.2 presents the sales, the number of salesmen
employed, and the advertisement expenditure for 24 randomly selected
months. Develop a regression model to predict the impact of advertisement
expenditure and the number of salesmen on sales.
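The slides work through this example with MS Excel, Minitab, and SPSS output. As a rough illustration only (not the textbook's worked solution), the sketch below fits such a two-predictor model in Python with statsmodels; the numbers are made-up placeholders, since Table 15.2 is not reproduced on this slide.

import numpy as np
import statsmodels.api as sm

sales = np.array([55, 62, 70, 68, 75, 80])           # hypothetical sales figures
salesmen = np.array([4, 5, 6, 6, 7, 8])              # hypothetical number of salesmen
advertisement = np.array([12, 14, 15, 16, 18, 20])   # hypothetical advertisement spend

X = sm.add_constant(np.column_stack([salesmen, advertisement]))
model = sm.OLS(sales, X).fit()
print(model.params)     # b0, b1 (salesmen), b2 (advertisement)
print(model.summary())  # full regression table, comparable to the Minitab output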



Table 15.2 : Sales, number of salesmen employed, and
advertisement expenditure for 24 randomly selected months
of a consumer electronics company



FIGURE 15.5 Minitab output (partial) for Example 15.1



Using MS Excel, Minitab and SPSS for Multiple
Regression

 Solved Examples\Excel\Ex 15.1.xls


 Solved Examples \Minitab\Ex 15.1.MPJ
 Solved Examples\SPSS\Ex 15.1.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.1.spv



Determination of Coefficient of Multiple
Determination

In the case of multiple regression, the coefficient of multiple
determination is the proportion of the variation in the dependent variable
y that is explained by the combination of the independent (explanatory)
variables.

For Example 15.1, this implies that 73.90% of the variation in sales is
explained by the variation in the number of salesmen employed and the
variation in the advertisement expenditure.
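For reference, the standard definition (not reproduced on the slide) is

R² = SSR / SST = 1 − SSE / SST

where SSR is the regression sum of squares, SSE the error sum of squares, and SST = SSR + SSE the total sum of squares, all taken from the ANOVA part of the regression output.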



Adjusted R Square

Adjusted R square is commonly used when a researcher wants to compare two
or more regression models that have the same dependent variable but a
different number of independent variables.

For Example 15.1, this indicates that 71.42% of the total variation in
sales can be explained by the multiple regression model, adjusted for the
number of independent variables and the sample size.

Standard Error of the Estimate
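The standard error of the estimate is s_e = sqrt(SSE / (n − k − 1)), where n is the sample size and k the number of independent variables. A minimal computational sketch, assuming `model` is the fitted statsmodels OLS result from the earlier sketch for Example 15.1:

import numpy as np

n = int(model.nobs)                       # sample size
k = int(model.df_model)                   # number of independent variables
sse = np.sum(model.resid ** 2)            # error sum of squares
y = model.model.endog
sst = np.sum((y - y.mean()) ** 2)         # total sum of squares

r_sq = 1 - sse / sst                                 # coefficient of multiple determination
adj_r_sq = 1 - (1 - r_sq) * (n - 1) / (n - k - 1)    # adjusted R square
se_estimate = np.sqrt(sse / (n - k - 1))             # standard error of the estimate
print(r_sq, adj_r_sq, se_estimate)        # should match model.rsquared and model.rsquared_adj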



Figures 15.8 & 15.9: Partial regression output from MS Excel and
Minitab showing coefficient of multiple determination, adjusted R
square, and standard error



Residual Analysis for the Multiple Regression
Model
 Linearity of the regression model
 Constant error variance (homoscedasticity)
 Independence of errors
 Normality of errors
(A sketch of these residual checks follows this list.)
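Assuming `model` is the fitted statsmodels OLS result from the earlier sketch, the assumptions can be inspected visually as follows:

import matplotlib.pyplot as plt
import statsmodels.api as sm

fitted = model.fittedvalues
resid = model.resid

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

axes[0].scatter(fitted, resid)                     # linearity and constant variance:
axes[0].axhline(0, color="grey")                   # look for a patternless band around zero
axes[0].set(xlabel="Fitted values", ylabel="Residuals")

axes[1].plot(resid, marker="o")                    # independence: no trend or cycle
axes[1].set(xlabel="Observation order", ylabel="Residuals")

sm.qqplot(resid, line="45", fit=True, ax=axes[2])  # normality of errors
plt.tight_layout()
plt.show()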



Testing the Statistical Significance of the Overall
Regression Model

Figure 15.18(a): Computation of the F statistic using MS Excel (partial
output for Example 15.1)
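For reference, the overall significance test takes its standard form:

H0: β1 = β2 = … = βk = 0    versus    H1: at least one βj ≠ 0

F = MSR / MSE = (SSR / k) / (SSE / (n − k − 1))

H0 is rejected when the computed F exceeds the critical value, or equivalently when the p-value in the Excel/Minitab output is smaller than the chosen level of significance.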



t Test for Testing the Statistical Significance of
Regression Coefficients

The hypotheses for testing the regression coefficient of each independent
variable can be set as H0: βj = 0 (the variable xj makes no significant
contribution to the model) versus H1: βj ≠ 0.

Figure 15.19(a): Computation of the t statistic using MS Excel (partial
output for Example 15.1)
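As an illustrative sketch (again assuming `model` is the fitted statsmodels OLS result from the earlier sketch), the t statistic and p-value for each coefficient can be read off directly:

import pandas as pd

coef_table = pd.DataFrame({
    "coefficient": model.params,
    "t statistic": model.tvalues,   # t = b_j / s_b_j
    "p-value": model.pvalues,
})
print(coef_table)                   # reject H0: beta_j = 0 when the p-value < alpha (e.g. 0.05)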



Non-linear Regression Model: The Quadratic
Regression Model

Figure 15.22: Existence of a non-linear (quadratic) relationship between
the dependent and independent variable (β2 is the coefficient of the
quadratic term)



Non-linear Regression Model: The Quadratic
Regression Model (Contd.)

 Quadratic regression model: y = β0 + β1x1 + β2x1² + ε
 Estimated quadratic regression equation: ŷ = b0 + b1x1 + b2x1²



Example 15.2: A leading consumer electronics company
has 125 retail outlets in the country. The company spent
heavily on advertisement in the previous year. It wants to
estimate the effect of advertisements on sales. This
company has taken a random sample of 21 retail stores
from the total population of 125 retail stores. Table 15.5
provides the sales and advertisement expenses (in
thousand rupees) of 21 randomly selected retail stores.



Table 15.5: Sales and advertisement expenses of 21 randomly
selected retail stores

Fit an appropriate regression model. Predict the sales when advertisement
expenditure is Rs 28,000.
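As a rough illustration only (the numbers below are hypothetical placeholders for Table 15.5, not the textbook data), a quadratic model can be fitted and the prediction at an advertisement expenditure of 28 (thousand rupees) obtained as follows:

import numpy as np
import statsmodels.api as sm

advertisement = np.array([10, 12, 15, 18, 20, 22, 25, 30])   # hypothetical (Rs thousand)
sales = np.array([40, 48, 60, 75, 85, 98, 115, 150])         # hypothetical (Rs thousand)

# quadratic regression model: y = b0 + b1*x + b2*x^2
X = sm.add_constant(np.column_stack([advertisement, advertisement ** 2]))
quad = sm.OLS(sales, X).fit()

b0, b1, b2 = quad.params
x_new = 28.0
print("Predicted sales at x = 28:", b0 + b1 * x_new + b2 * x_new ** 2)
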
Using MS Excel, Minitab, and SPSS for the
Quadratic Regression Model

 Solved Examples\Excel\Ex 15.2 quadratic1.xls


 Solved Examples\Minitab\EX 15.2 QUADRATIC
1.MPJ
 Solved Examples\SPSS\Ex 15.2 quadratic 1.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.2.spv



A Case When the Quadratic Regression Model Is a
Better Alternative to the Simple Regression Model

Figure 15.31: Fitted line plot for Example 15.2 (simple regression
model) produced using Minitab



A Case When the Quadratic Regression Model Is a
Better Alternative to the Simple Regression Model

Figure 15.33 : Fitted line plot for Example 15.2 (quadratic regression
model) produced using Minitab



Testing the Statistical Significance of the Overall
Quadratic Regression Model

The F statistic is used for testing the significance of the overall
quadratic regression model, just as it is used for the simple regression
model.

Testing the Quadratic Effect of a Quadratic Regression Model

The t statistic is used for testing the significance of the quadratic
effect of the quadratic regression model.
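For reference, the quadratic-effect test takes the standard form

H0: β2 = 0 (the quadratic term does not improve the model)    versus    H1: β2 ≠ 0

t = b2 / s_b2, with n − 3 degrees of freedom for a model containing one linear and one quadratic term.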



Indicator (Dummy) Variable Model

 Regression models are based on the assumption that all the independent
(explanatory) variables are numerical in nature.
 There may be cases when some of the variables are qualitative in nature.
 These variables generate nominal or ordinal information and are used in
multiple regression; they are referred to as indicator or dummy variables.
 Researchers usually assign the codes 0 and 1 to dummy variables in their
study.
 It is important to note that the assignment of the codes 0 and 1 is
arbitrary; the numbers merely represent a place for the category.
 A particular dummy variable xd is defined as xd = 1 if the observation
belongs to the category of interest, and xd = 0 otherwise.



Example 15.3: A company wants to test the effect of age and
gender on the productivity (in terms of units produced by the
employees per month) of its employees. The HR manager has taken
a random sample of 15 employees and collected information about
their age and gender. Table 15.6 provides data about the
productivity, age, and gender of 15 randomly selected employees. Fit
a regression model considering productivity as the dependent
variable and age and gender as the explanatory variables.



Table 15.6: Data about productivity, age, and gender of 15
randomly selected employees.

Predict the productivity of male and female employees at 45 years of age.
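A minimal sketch of the dummy-variable fit and the two predictions at age 45, with made-up numbers standing in for Table 15.6 and an arbitrary coding of 1 = male, 0 = female:

import numpy as np
import statsmodels.api as sm

productivity = np.array([52, 60, 58, 65, 70, 63, 68, 72])   # hypothetical units per month
age = np.array([28, 35, 32, 40, 45, 38, 42, 50])            # hypothetical ages
gender = np.array([0, 1, 0, 1, 1, 0, 1, 0])                 # dummy variable x_d (1 = male)

X = sm.add_constant(np.column_stack([age, gender]))
fit = sm.OLS(productivity, X).fit()

b0, b1, b2 = fit.params
print("Male employee, age 45:  ", b0 + b1 * 45 + b2 * 1)
print("Female employee, age 45:", b0 + b1 * 45 + b2 * 0)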


Using MS Excel, Minitab and SPSS for the
Dummy Variable Regression Model
 Solved Examples\Excel\Ex 15.3 dummy & interaction.xls
 Solved Examples\Minitab\Ex 15.3 DUMMY &
INTERACTION.MPJ
 Solved Examples\SPSS\Ex 15.3 dummy & interaction.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.3 dummy &
Interaction.spv



FIGURE 15.35 : Minitab output for Example 15.3



Solution (Example 15.3)



Model Transformation In Regression Models

 In many situations in regression analysis, the assumptions of regression
are violated, or researchers find that the model is not linear.
 In both cases, either the dependent variable y, the independent variable
x, or both variables are transformed to avoid the violation of the
regression assumptions or to make the regression model linear.



The Square Root Transformation

The square root transformation is often used to meet the assumption of
constant error variance (homoscedasticity) and to convert a non-linear
model into a linear model.



Example 15.4: A furniture company receives 12 lots of wooden
plates. Each lot is examined by the quality control inspector of the
firm for defective items. His report is given in Table 15.8:

Taking batch size as the independent variable and the number of defectives
as the dependent variable, fit an appropriate regression model and
transform the independent variable if required.
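A minimal sketch of the square root transformation (hypothetical numbers standing in for Table 15.8): the number of defectives is regressed on the square root of the batch size, and the fit can be compared with the untransformed model.

import numpy as np
import statsmodels.api as sm

batch_size = np.array([100, 200, 400, 600, 900, 1200, 1600, 2000])   # hypothetical
defectives = np.array([5, 8, 11, 14, 17, 19, 22, 25])                # hypothetical

plain_fit = sm.OLS(defectives, sm.add_constant(batch_size)).fit()
sqrt_fit = sm.OLS(defectives, sm.add_constant(np.sqrt(batch_size))).fit()

print("R-square without transformation:", plain_fit.rsquared)
print("R-square with sqrt(batch size): ", sqrt_fit.rsquared)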



Using MS Excel, Minitab and SPSS for the Square
Root Transformation
 Solved Examples\Excel\Ex 15.4 square root
transformation.xls
 Solved Examples\Minitab\Ex 15.4 transformation.MPJ
 Solved Examples\SPSS\Ex 15.4 square root
transformation.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.4 square root
transformation.spv



FIGURE 15.46: Minitab fitted line plot of number of defectives versus
batch size for Example 15.4



FIGURE 15.47 : Minitab fitted line plot of number of
defectives versus square root of batch size for Example 15.4



Logarithm Transformation

The logarithm transformation is often used to meet the assumption of
constant error variance (homoscedasticity) and to convert a non-linear
model into a linear model.



Example 15.5: The data related to the sales turnover and advertisement
expenditure of a company for 15 randomly selected months are given in
Table 15.10.

Taking sales as the dependent variable and advertisement as the
independent variable, fit a regression line using a log transformation of
the variables.
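A minimal sketch of the log transformation (hypothetical numbers standing in for Table 15.10): both sales and advertisement are log-transformed before fitting the regression line.

import numpy as np
import statsmodels.api as sm

advertisement = np.array([5, 8, 12, 15, 20, 26, 33, 40])   # hypothetical (Rs thousand)
sales = np.array([30, 42, 55, 63, 78, 95, 110, 130])       # hypothetical (Rs thousand)

X = sm.add_constant(np.log(advertisement))
log_fit = sm.OLS(np.log(sales), X).fit()
print(log_fit.params)   # intercept and slope on the log-log scale
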
Using MS Excel, Minitab and SPSS for
Logarithm Transformation
 Solved Examples\Excel\Ex 15.5 log transformation.xls
 Solved Examples\Minitab\EX 15.5 LOG
TRANSFORMATION.MPJ
 Solved Examples\SPSS\Ex 15.5 log transformation.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.5 log
transformation.spv



FIGURE 15.53: Minitab fitted line plot of sales
versus advertisement for Example 15.5



FIGURE 15.54 : Minitab fitted line plot of log
sales versus log advertisement for Example
15.5



Collinearity

 In multiple regression analysis, when two independent variables are
correlated, it is referred to as collinearity; when three or more
variables are correlated, it is referred to as multicollinearity.
 Collinearity is measured by the variance inflationary factor (VIF) for
each explanatory variable (a computational sketch follows this list).
 If the explanatory variables are uncorrelated, the variance inflationary
factor (VIF) is equal to 1. A variance inflationary factor (VIF) greater
than 10 indicates a serious multicollinearity problem.
 Collinearity is not simple to handle in multiple regression. One of the
best ways to overcome the problem of collinearity is to drop collinear
variables from the regression equation.
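As an illustrative sketch, VIFs can be computed as VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing x_j on the remaining explanatory variables; statsmodels provides this directly. Here `X` is assumed to be a design matrix with a constant column, such as the one built earlier for Example 15.1.

from statsmodels.stats.outliers_influence import variance_inflation_factor

for j in range(1, X.shape[1]):                # column 0 is the constant
    vif = variance_inflation_factor(X, j)     # VIF_j = 1 / (1 - R_j^2)
    print(f"VIF for explanatory variable {j}: {vif:.2f}")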



FIGURE 15.61/15.62: Minitab /SPSS output (partial)
indicating VIF for Example 15.1



Example 15.6: Table 15.13 provides the modified data for
the consumer electronics company discussed in Example 15.1.
Two new variables, number of showrooms and showroom age,
of the concerned company have been added. Fit an
appropriate regression model.

 Solved Examples\Minitab\EX 15.6 MODEL BUILDING.MPJ
 Solved Examples\SPSS\Ex 15.6 model building stepwise.sav
 Ch 15 Solved Examples\SPSS\Output Ex. 15.6 model building stepwise.spv



Table 15.13: Modified data for the consumer electronics company discussed
in Example 15.1, with two new variables (number of showrooms and showroom
age) added. Fit an appropriate regression model.
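Minitab's stepwise, forward selection, and backward elimination procedures are used in the figures that follow. As a rough stand-in only, the sketch below implements a simple forward selection based on adjusted R square, where `y` is the sales series and `candidates` is a dictionary mapping the four explanatory variable names to (hypothetical) data columns.

import numpy as np
import statsmodels.api as sm

def adjusted_r2(y, columns):
    """Adjusted R square of an OLS fit of y on the given columns plus a constant."""
    X = sm.add_constant(np.column_stack(columns))
    return sm.OLS(y, X).fit().rsquared_adj

def forward_select(y, candidates):
    """Greedy forward selection: add the variable that most improves adjusted R square."""
    chosen, pool, best = [], dict(candidates), -np.inf
    while pool:
        scores = {name: adjusted_r2(y, [candidates[c] for c in chosen] + [column])
                  for name, column in pool.items()}
        name, score = max(scores.items(), key=lambda item: item[1])
        if score <= best:                 # stop when no candidate improves the fit
            break
        chosen.append(name)
        best = score
        del pool[name]
    return chosen, best

# usage (hypothetical data columns): forward_select(sales, {"salesmen": x1,
#     "advertisement": x2, "showrooms": x3, "showroom_age": x4})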



FIGURE 15.63 : Minitab regression output for sales including
four explanatory variables for Example 15.6



FIGURE 15.64: Minitab regression output (partial) for Example 15.6 using
the stepwise method



FIGURE 15.71 Minitab regression output (partial) for Example
15.6 using the forward selection method



FIGURE 15.73: Minitab regression output (partial) for Example 15.6 using
the backward elimination method

