Business Analytics Module 8
Regression Analysis
Variables
Regression Models with Nonlinear Terms
Simple linear regression involves a single independent variable.
Multiple regression involves two or more
independent variables.
linear trend.
Use alternative approaches if the data is not linear.
Figure 9.1
X = square footage
Y = market value ($)
The scatter plot of the full
data set (42 homes)
indicates a linear trend.
Figure 9.3
Figure 9.4
Figure 9.5
Slope
=SLOPE(C4:C45, B4:B45)
Intercept
=INTERCEPT(C4:C45, B4:B45)
Estimate Y when X = 1800 square feet
Ŷ = 32,673 + 35.036(1800) = $95,737.80
=TREND(C4:C45, B4:B45, 1800)
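The same calculation can be sketched outside Excel. The following is a minimal Python sketch, not the slides' own method: `fit_line` and `predict` are hypothetical helper names, and the four-point data set is invented for illustration (it is not the 42-home data). Only the coefficients 32,673 and 35.036 come from the slide.

```python
# Minimal sketch: least-squares slope and intercept, then a point prediction.
# The small demo data set is hypothetical; it is NOT the 42-home data set.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at x."""
    return intercept + slope * x


# Hypothetical points that lie exactly on y = 10 + 2x.
demo_x = [1.0, 2.0, 3.0, 4.0]
demo_y = [12.0, 14.0, 16.0, 18.0]
demo_intercept, demo_slope = fit_line(demo_x, demo_y)

# Coefficients as reported on the slide, evaluated at 1800 square feet.
slide_estimate = predict(32673, 35.036, 1800)
print(round(slide_estimate, 2))
```

`fit_line` plays the role of SLOPE and INTERCEPT together, and `predict` plays the role of TREND for a single X value.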
Copyright © 2013 Pearson Education, Inc.
publishing as Prentice Hall 9-11
Simple Linear Regression
Excel Regression tool
Data > Data Analysis > Regression
Input Y Range
Input X Range
Labels
regression statistics.
coefficient of determination, R²
varies from 0 (no fit) to 1 (perfect fit)
Adjusted R Square
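As a rough illustration of how these two statistics relate, the sketch below computes R² as 1 − SSE/SST and adjusted R² with the standard correction for sample size n and number of independent variables k. The function names and the five data points are assumptions made for this example; they do not come from the slides.

```python
# Minimal sketch of R-squared and adjusted R-squared.
# Data are hypothetical: y observed at x = 1..5, fitted by y = 2.2 + 0.6x.

def r_squared(ys, fitted):
    """R^2 = 1 - SSE/SST, the share of variation in y explained by the model."""
    mean_y = sum(ys) / len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Adjust R^2 for sample size n and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


observed = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted_values = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(observed, fitted_values)
adj_r2 = adjusted_r_squared(r2, n=len(observed), k=1)
print(r2, adj_r2)
```

Adjusted R² never exceeds R², and the gap widens as more independent variables are added relative to the sample size.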
Figure 9.8
is to use a t-test:
Figure 9.9
Figure 9.10
Residual Analysis and Regression Assumptions
Figure 9.9
Figure 9.3
Checking Assumptions
Linearity
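A residual is the observed value minus the fitted value. The sketch below, using hypothetical data and hypothetical function names, computes residuals and one common form of standardized residual (residual divided by the standard error of the estimate). Excel's regression tool may scale its standard residuals slightly differently, so treat the scaling as an assumption.

```python
import math

# Minimal sketch of residual calculations for checking assumptions.
# Data are hypothetical: y observed at x = 1..5, fitted by y = 2.2 + 0.6x.

def residuals(ys, fitted):
    """Residual = observed value minus fitted value."""
    return [y - f for y, f in zip(ys, fitted)]


def standardized_residuals(ys, fitted, k=1):
    """Divide each residual by the standard error sqrt(SSE / (n - k - 1))."""
    res = residuals(ys, fitted)
    n = len(res)
    std_err = math.sqrt(sum(r * r for r in res) / (n - k - 1))
    return [r / std_err for r in res]


observed = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted_values = [2.8, 3.4, 4.0, 4.6, 5.2]

res = residuals(observed, fitted_values)
std_res = standardized_residuals(observed, fitted_values)
print(res)
print(std_res)
```

Plotting `res` against X is the usual visual check: a random scatter around zero supports linearity, while a curved pattern suggests a nonlinear relationship.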
Figure 9.11
Figure 9.10
Figure 9.12
Figure 9.13
Analytics in Practice:
Using Linear Regression and
Interactive Risk Simulators to
Predict Performance at ARAMARK
ARAMARK, located in Philadelphia, is an award-
Figure 9.15
Figure 9.16
Figure 9.17
Figure 9.18
Building Good Regression Models
Multicollinearity
- occurs when there are strong correlations among
the independent variables
- makes it difficult to isolate the effects of
independent variables
- signs of slope coefficients may be opposite of the
true value and p-values can be inflated
Correlations exceeding ±0.7 are an indication that
multicollinearity might exist.
Variance Inflation Factors are a better indicator.
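Both checks can be sketched in a few lines of Python with NumPy. This is an illustrative sketch on randomly generated, hypothetical data (not the Colleges and Universities or Banking data); the function names are assumptions. It relies on the standard result that the VIF of variable j, 1 / (1 − R²ⱼ), equals the j-th diagonal entry of the inverse of the correlation matrix of the independent variables.

```python
import numpy as np

# Minimal sketch: flag correlated predictors and compute variance inflation factors.
# All data are synthetic and hypothetical.

def correlation_matrix(X):
    """Pairwise correlations between the columns of X."""
    return np.corrcoef(X, rowvar=False)


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), read off the diagonal of the inverse correlation matrix."""
    return np.diag(np.linalg.inv(correlation_matrix(X)))


rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)  # nearly a copy of x1, so strongly correlated
x3 = rng.normal(size=200)             # generated independently of x1 and x2
X = np.column_stack([x1, x2, x3])

corr = correlation_matrix(X)
vifs = variance_inflation_factors(X)

# Pairs of columns whose absolute correlation exceeds the 0.7 rule of thumb.
flagged = [(i, j) for i in range(3) for j in range(i + 1, 3) if abs(corr[i, j]) > 0.7]
print(flagged)
print(vifs)
```

A VIF near 1 indicates little overlap with the other independent variables; large values indicate that a variable is nearly a linear combination of the others.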
Parsimony is an age-old principle that applies here.
Full model
Adjusted R² = 0.4921
Figure 9.13
Building Good Regression Models
Example 9.12 (continued)
Identifying Potential Multicollinearity
Correlation Matrix (Colleges and Universities data)
Dropping Expenditures
Adjusted R² drops to 0.4556
Full Model
Adjusted R² = 0.9441
Education and Home Value
are not significant.
Figure 9.17
Building Good Regression Models
Example 9.12 (continued)
Identifying Potential Multicollinearity
Correlation matrix for the Banking data
Figure 9.21
interaction.
Test for interaction by adding a new term to the
Figure 9.22
Adjusted R² = 0.949858
Figure 9.23
Regression with Categorical Variables
Example 9.14 Incorporating Interaction Terms in a
Regression Model
Define an interaction between Age and MBA and
Figure 9.24
Adjusted R² = 0.976701
Figure 9.25
Regression with Categorical Variables
Example 9.14 (continued)
Salary = 3,323 + 984(Age) for those without MBA
Salary = 3,323 + 1,410(Age) for those with MBA
Adjusted R² = 0.976727
(a slight improvement)
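The two salary equations are one model with an Age × MBA interaction term, where MBA is coded 1 or 0. The sketch below writes it that way. The interaction coefficient of 426 is not stated on the slide; it is inferred here as the difference between the two slopes (1,410 − 984), and the function name is hypothetical.

```python
# Minimal sketch of a regression equation with an interaction term.
# Intercept and Age slope are taken from the slide; the interaction
# coefficient (426) is an inferred value: 1410 - 984.

def predicted_salary(age, has_mba, b0=3323.0, b_age=984.0, b_interaction=426.0):
    """Salary = b0 + b_age * Age + b_interaction * (MBA * Age), with MBA coded 0 or 1."""
    mba = 1 if has_mba else 0
    return b0 + b_age * age + b_interaction * (mba * age)


without_mba = predicted_salary(30, has_mba=False)  # slope on Age is 984
with_mba = predicted_salary(30, has_mba=True)      # slope on Age is 984 + 426 = 1410
print(without_mba, with_mba)
```

With the interaction term, an MBA changes the slope on Age and leaves the intercept alone, which is why the two equations share the 3,323 intercept.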
Figure 9.26
Figure 9.27
Figure 9.28
Figure 9.29
Regression Models with Nonlinear Terms
Curvilinear Regression
Curvilinear models may be appropriate when
scatter charts or residual plots show nonlinear
relationships.
A second-order polynomial might be used
outside.
Figure 9.30
Figure 9.31
Residual pattern is more random
Sales = 142,850 − 3,643(temperature) + 23.3(temperature)²
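To illustrate, the sketch below evaluates the fitted second-order equation and shows how such a curve could be fitted with NumPy's `polyfit`. The temperature values are hypothetical, and the sales values are generated from the slide's equation itself, so the fit simply recovers the same coefficients. This is a demonstration of the technique, not a reproduction of the slides' data.

```python
import numpy as np

# Minimal sketch of a second-order polynomial (curvilinear) regression.
# Coefficients come from the slide; the temperatures are hypothetical.

def predicted_sales(temperature):
    """Sales = 142,850 - 3,643 * T + 23.3 * T^2."""
    return 142850 - 3643 * temperature + 23.3 * temperature ** 2


temps = np.array([60.0, 70.0, 80.0, 90.0, 100.0])
sales = predicted_sales(temps)

# Fit a degree-2 polynomial; polyfit returns coefficients from highest power to lowest.
quad_coef, linear_coef, intercept = np.polyfit(temps, sales, 2)
print(quad_coef, linear_coef, intercept)
print(predicted_sales(80.0))
```

In a spreadsheet the equivalent step is to add a temperature-squared column and run multiple regression on both columns.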
Figure 9.32