Newbold Book Solutions
Chapter 13:
Multiple Regression
13.6
13.7
13.8
13.9
a. b1 = .653: All else equal, a one unit increase in the average number
of meals eaten per week will result in an estimated .653 pounds gained
during freshman year.
b2 = -1.345: All else equal, a one unit increase in the average number of
hours of exercise per week will result in an estimated 1.345 pound
weight loss.
b3 = .613: All else equal, a one unit increase in the average number of
beers consumed per week will result in an estimated .613 pound weight
gain.
b. The intercept term b0 of 7.35 is the estimated weight gain during the
freshman year given that average meals eaten, hours of exercise, and beers
consumed per week are all zero. This extrapolates beyond the observed data
and is not a useful interpretation.
13.10 Compute the slope coefficients for the model yi = b0 + b1x1i + b2x2i, using
b1 = s_y(r_x1y - r_x1x2 r_x2y) / [s_x1(1 - r_x1x2^2)] and
b2 = s_y(r_x2y - r_x1x2 r_x1y) / [s_x2(1 - r_x1x2^2)]:
a. b1 = 400(.60 - (.50)(.70)) / [200(1 - .50^2)] = 2.000,
b2 = 400(.70 - (.50)(.60)) / [200(1 - .50^2)] = 3.200
b. b1 = 400(-.60 - (-.50)(.70)) / [200(1 - (-.50)^2)] = -.667,
b2 = 400(.70 - (-.50)(-.60)) / [200(1 - (-.50)^2)] = 1.067
c. b1 = 400(.40 - (.80)(.45)) / [200(1 - (.80)^2)] = .083,
b2 = 400(.45 - (.80)(.40)) / [200(1 - (.80)^2)] = .271
d. b1 = 400(.60 - (.60)(.50)) / [200(1 - (.60)^2)] = .9375,
b2 = 400(.50 - (.60)(.60)) / [200(1 - (.60)^2)] = -.4375
13.11 a. When r_x1x2 = 0, the formula reduces to
b1 = s_y r_x1y / s_x1
which is the equivalent of the bivariate slope coefficient (see box on bottom
of page 380).
b. When the correlation between X1 and X2 equals 1, the term (1 - r_x1x2^2) in the
denominator goes to 0 and the slope coefficient is undefined.
13.12 a. Electricity sales as a function of number of customers and price
Regression Analysis: salesmw2 versus priclec2, numcust2
The regression equation is
salesmw2 = - 647363 + 19895 priclec2 + 2.35 numcust2
Predictor      Coef  SE Coef      T      P
Constant    -647363   291734  -2.22  0.030
priclec2      19895    22515   0.88  0.380
numcust2     2.3530   0.2233  10.54  0.000

S = 66399    R-Sq = 79.2%    R-Sq(adj) = 78.5%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       2  1.02480E+12  5.12400E+11  116.22  0.000
Residual Error  61  2.68939E+11   4408828732
Total           63  1.29374E+12
All else equal, for every one unit increase in the price of electricity, we estimate
that sales will increase by 19895 mwh. Note that this estimated coefficient is not
significantly different from zero (p-value = .380).
All else equal, for every additional residential customer who uses electricity in the
heating of their home, we estimate that sales will increase by 2.353 mwh.
b. Electricity sales as a function of number of customers
Regression Analysis: salesmw2 versus numcust2
The regression equation is
salesmw2 = - 410202 + 2.20 numcust2
Predictor      Coef  SE Coef      T      P
Constant    -410202   114132  -3.59  0.001
numcust2     2.2027   0.1445  15.25  0.000

S = 66282    R-Sq = 78.9%    R-Sq(adj) = 78.6%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       1  1.02136E+12  1.02136E+12  232.48  0.000
Residual Error  62  2.72381E+11   4393240914
Total           63  1.29374E+12
c. Electricity sales as a function of price and degree days (the coefficient
table for this regression was not preserved):

R-Sq = 42.2%    R-Sq(adj) = 40.3%

Analysis of Variance
Source          DF           SS           MS      F      P
Regression       2  5.45875E+11  2.72938E+11  22.26  0.000
Residual Error  61  7.47863E+11  12260053296
Total           63  1.29374E+12
All else equal, an increase in the price of electricity will reduce electricity sales by
165,275 mwh.
All else equal, an increase in the degree days (departure from normal weather) by
one unit will increase electricity sales by 56.06 mwh.
Note that the coefficient on the price variable is now negative, as expected, and it
is significantly different from zero (p-value = .000)
d.
Regression Analysis: salesmw2 versus Yd872, degrday2
The regression equation is
salesmw2 = 293949 + 326 Yd872 + 58.4 degrday2
Predictor    Coef  SE Coef      T      P
Constant   293949    67939   4.33  0.000
Yd872      325.85    21.30  15.29  0.000
degrday2    58.36    35.79   1.63  0.108

S = 66187    R-Sq = 79.3%    R-Sq(adj) = 78.7%

Analysis of Variance
Source          DF           SS           MS       F      P
Regression       2  1.02652E+12  5.13259E+11  117.16  0.000
Residual Error  61  2.67221E+11   4380674677
Total           63  1.29374E+12
All else equal, an increase in personal disposable income by one unit will increase
electricity sales by 325.85 mwh.
All else equal, an increase in degree days by one unit will increase electricity sales
by 58.36 mwh.
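A minimal Python sketch of fitting a two-predictor model like these follows, using simulated data in place of the electricity data set (which is not reproduced here); the variable names and generating numbers are hypothetical:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 64
price = rng.normal(6.0, 0.5, n)             # stand-in for priclec2
customers = rng.normal(400_000, 50_000, n)  # stand-in for numcust2
sales = -650_000 + 20_000 * price + 2.35 * customers + rng.normal(0, 66_000, n)

X = sm.add_constant(np.column_stack([price, customers]))
fit = sm.OLS(sales, X).fit()
print(fit.params)     # b0, b1, b2
print(fit.pvalues)    # two-sided p-values, as in the P column above
print(fit.rsquared, fit.rsquared_adj)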
13.13 a. mpg as a function of horsepower and weight
Regression Analysis: milpgal versus horspwr, weight
The regression equation is
milpgal = 55.8 - 0.105 horspwr - 0.00661 weight
150 cases used 5 cases contain missing values
Predictor        Coef    SE Coef      T      P
Constant       55.769      1.448  38.51  0.000
horspwr      -0.10489    0.02233  -4.70  0.000
weight     -0.0066143  0.0009015  -7.34  0.000

S = 3.901    R-Sq = 72.3%    R-Sq(adj) = 72.0%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        2  5850.0  2925.0  192.23  0.000
Residual Error  147  2236.8    15.2
Total           149  8086.8
All else equal, a one unit increase in the horsepower of the engine will reduce fuel
mileage by .10489 mpg. All else equal, an increase in the weight of the car by 100
pounds will reduce fuel mileage by .66143 mpg.
b. Add number of cylinders
Regression Analysis: milpgal versus horspwr, weight, cylinder
The regression equation is
milpgal = 55.9 - 0.117 horspwr - 0.00758 weight + 0.726 cylinder
150 cases used 5 cases contain missing values

Predictor       Coef   SE Coef      T      P
Constant      55.925     1.443  38.77  0.000
horspwr     -0.11744   0.02344  -5.01  0.000
weight     -0.007576  0.001066  -7.10  0.000
cylinder      0.7260    0.4362   1.66  0.098

S = 3.878    R-Sq = 72.9%    R-Sq(adj) = 72.3%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        3  5891.6  1963.9  130.62  0.000
Residual Error  146  2195.1    15.0
Total           149  8086.8
All else equal, one additional cylinder in the engine of the auto will increase fuel
mileage by .726 mpg. Note that this is not significant at the .05 level (p-value = .098).
Horsepower and weight still have the expected negative signs.
c. mpg as a function of weight, number of cylinders
Regression Analysis: milpgal versus weight, cylinder
The regression equation is
milpgal = 55.9 - 0.0104 weight + 0.121 cylinder
154 cases used 1 cases contain missing values
Predictor        Coef    SE Coef       T      P
Constant       55.914      1.525   36.65  0.000
weight     -0.0103680  0.0009779  -10.60  0.000
cylinder       0.1207     0.4311    0.28  0.780

S = 4.151    R-Sq = 68.8%    R-Sq(adj) = 68.3%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        2  5725.0  2862.5  166.13  0.000
Residual Error  151  2601.8    17.2
Total           153  8326.8
All else equal, an increase in the weight of the car by 100 pounds will reduce fuel
mileage by 1.0368 mpg. All else equal, an increase in the number of cylinders in
the engine will increase mpg by .1207 mpg.
The explanatory power of the models has stayed relatively the same with a slight
drop in explanatory power for the latest regression model.
Note that the coefficient on weight has stayed negative and significant (p-values
of .000) for all of the regression models; although the value of the coefficient has
changed. The number of cylinders is not significantly different from zero in either
of the models where it was used as an independent variable. There is likely some
correlation between cylinders and the weight of the car as well as between
cylinders and the horsepower of the car.
d. mpg as a function of horsepower, weight, price
Regression Analysis: milpgal versus horspwr, weight, price
The regression equation is
milpgal = 54.4 - 0.0938 horspwr - 0.00735 weight +0.000137 price
150 cases used 5 cases contain missing values
Predictor         Coef     SE Coef      T      P
Constant        54.369       1.454  37.40  0.000
horspwr       -0.09381     0.02177  -4.31  0.000
weight      -0.0073518   0.0008950  -8.21  0.000
price       0.00013721  0.00003950   3.47  0.001

S = 3.762    R-Sq = 74.5%    R-Sq(adj) = 73.9%

Analysis of Variance
Source           DF      SS      MS       F      P
Regression        3  6020.7  2006.9  141.82  0.000
Residual Error  146  2066.0    14.2
Total           149  8086.8
All else equal, an increase by one unit in the horsepower of the auto will reduce
fuel mileage by .09381 mpg. All else equal, an increase by 100 pounds in the
weight of the auto will reduce fuel mileage by .73518 mpg and an increase in the
price of the auto by one dollar will increase fuel mileage by .00013721 mpg.
e. Horsepower and weight remain significant negative independent variables
throughout, whereas the number of cylinders has been insignificant. The sizes of the
coefficients change as the combination of independent variables changes. This is
likely due to strong correlation that may exist between the independent variables.
13.14 a. Horsepower as a function of weight and engine displacement. Only
fragments of this regression's output were preserved: both predictors have
VIF = 6.0, and the overall F test has a p-value of 0.000.
All else equal, a 100 pound increase in the weight of the car is associated with a
1.54 increase in horsepower of the auto.
All else equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.57 increase in the horsepower of the auto.
b. Horsepower as a function of weight, displacement, number of cylinders
Regression Analysis: horspwr versus weight, displace, cylinder
The regression equation is
horspwr = 16.7 + 0.0163 weight + 0.105 displace + 2.57 cylinder
151 cases used 4 cases contain missing values
Predictor      Coef   SE Coef     T      P   VIF
Constant     16.703     9.449  1.77  0.079
weight     0.016261  0.004592  3.54  0.001   6.2
displace    0.10527   0.05859  1.80  0.074  14.8
cylinder      2.574     2.258  1.14  0.256   7.8

S = 13.63    R-Sq = 69.5%    R-Sq(adj) = 68.9%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        3  62170  20723  111.55  0.000
Residual Error  147  27310    186
Total           150  89480
All else equal, a 100 pound increase in the weight of the car is associated with a
1.63 increase in horsepower of the auto.
All else equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.05 increase in the horsepower of the auto.
All else equal, one additional cylinder in the engine is associated with a 2.57
increase in the horsepower of the auto.
Note that adding the number of cylinders as an independent variable has not added
to the explanatory power of the model; R square has increased only marginally.
Engine displacement is no longer significant at the .05 level (p-value of .074), and
the estimated regression slope coefficient on the number of cylinders is not
significantly different from zero. This is due to the strong correlation that exists
between cubic inches of engine displacement and the number of cylinders.
c. Horsepower as a function of weight, displacement and fuel mileage
Regression Analysis: horspwr versus weight, displace, milpgal
The regression equation is
horspwr = 93.6 + 0.00203 weight + 0.165 displace - 1.24 milpgal
150 cases used 5 cases contain missing values
Predictor      Coef   SE Coef      T      P  VIF
Constant      93.57     15.33   6.11  0.000
weight     0.002031  0.004879   0.42  0.678  8.3
displace    0.16475   0.03475   4.74  0.000  6.1
milpgal     -1.2392    0.2474  -5.01  0.000  3.1

S = 12.55    R-Sq = 74.2%    R-Sq(adj) = 73.6%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        3  66042  22014  139.77  0.000
Residual Error  146  22994    157
Total           149  89036
All else equal, a 100 pound increase in the weight of the car is associated with a
.203 increase in horsepower of the auto.
All else equal, a 10 cubic inch increase in the displacement of the engine is
associated with a 1.6475 increase in the horsepower of the auto.
All else equal, an increase in the fuel mileage of the vehicle by 1 mile per gallon is
associated with a reduction in horsepower of 1.2392.
Note that the negative coefficient on fuel mileage indicates the trade-off that is
expected between horsepower and fuel mileage. The displacement variable is
significantly positive, as expected, however, the weight variable is no longer
significant. Again, one would expect high correlation among the independent
variables.
d. Horsepower as a function of weight, displacement, mpg and price
Regression Analysis: horspwr versus weight, displace, milpgal, price
The regression equation is
horspwr = 98.1 - 0.00032 weight + 0.175 displace - 1.32 milpgal
+0.000138 price
150 cases used 5 cases contain missing values
Predictor       Coef    SE Coef      T      P   VIF
Constant       98.14      16.05   6.11  0.000
weight     -0.000324   0.005462  -0.06  0.953  10.3
displace     0.17533    0.03647   4.81  0.000   6.8
milpgal      -1.3194     0.2613  -5.05  0.000   3.5
price      0.0001379  0.0001438   0.96  0.339   1.3

S = 12.55    R-Sq = 74.3%    R-Sq(adj) = 73.6%

Analysis of Variance
Source           DF     SS     MS       F      P
Regression        4  66187  16547  105.00  0.000
Residual Error  145  22849    158
Total           149  89036
e. Explanatory power has marginally increased from the first model to the last.
The estimated coefficient on price is not significantly different from zero.
Displacement and fuel mileage have the expected signs. The coefficient on
weight has the wrong sign; however, it is not significantly different from zero
(p-value of .953).
13.15
13.16
13.17
13.18
13.19 a. R^2 = SSR/SST = 3.549/3.881 = .9145; therefore, 91.45% of the variability
in work-hours of design effort can be explained by the variation in the plane's top
speed, weight, and percentage of parts in common with other models.
b. SSE = 3.881 - 3.549 = .332
c. Adjusted R^2 = 1 - [.332/(27 - 4)] / [3.881/26] = .9033
d. R = sqrt(.9145) = .9563. This is the sample correlation between observed and
predicted values of the design effort.
13.20 a. R^2 = 88.2/162.1 = .5441; therefore, 54.41% of the variability in milk
consumption can be explained by the variations in weekly income and family size.
b. Adjusted R^2 = 1 - [73.9/(30 - 3)] / [162.1/29] = .5103
c. R = sqrt(.5441) = .7376. This is the sample correlation between observed and
predicted values of milk consumption.
13.21 a. R^2 = 79.2/(79.2 + 45.9) = .6331; therefore, 63.31% of the variability in
weight gain can be explained by the variations in the average number of meals eaten,
number of hours exercised, and number of beers consumed weekly.
b. Adjusted R^2 = 1 - [45.9/(25 - 4)] / [125.1/24] = .5807
c. R = sqrt(.6331) = .7957. This is the sample correlation between observed and
predicted values of weight gained.
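A minimal Python sketch of these goodness-of-fit computations, using the work-hours example's sums of squares (K = 3 predictors is inferred from the 27 - 4 error degrees of freedom):

SSR, SST, n, K = 3.549, 3.881, 27, 3
SSE = SST - SSR                                      # .332
R2 = SSR / SST                                       # .9145
R2_adj = 1 - (SSE / (n - K - 1)) / (SST / (n - 1))   # .9033
R = R2 ** 0.5                                        # .9563
print(R2, SSE, R2_adj, R)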
13.22
a.
Regression Analysis: Y profit versus X2 offices
The regression equation is
Y profit = 1.55 -0.000120 X2 offices
Predictor         Coef     SE Coef      T      P
Constant        1.5460      0.1048  14.75  0.000
X2 offi    -0.00012033  0.00001434  -8.39  0.000

S = 0.07049    R-Sq = 75.4%    R-Sq(adj) = 74.3%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       1  0.34973  0.34973  70.38  0.000
Residual Error  23  0.11429  0.00497
Total           24  0.46402
b.
Regression Analysis: X1 revenue versus X2 offices
The regression equation is
X1 revenue = - 0.078 + 0.000543 X2 offices

Predictor        Coef     SE Coef      T      P
Constant      -0.0781      0.2975  -0.26  0.795
X2 offi    0.00054280  0.00004070  13.34  0.000

S = 0.2000    R-Sq = 88.5%    R-Sq(adj) = 88.1%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  7.1166  7.1166  177.84  0.000
Residual Error  23  0.9204  0.0400
Total           24  8.0370
c.
Regression Analysis: Y profit versus X1 revenue
The regression equation is
Y profit = 1.33 - 0.169 X1 revenue
Predictor      Coef  SE Coef      T      P
Constant     1.3262   0.1386   9.57  0.000
X1 reven   -0.16913  0.03559  -4.75  0.000

S = 0.1009    R-Sq = 49.5%    R-Sq(adj) = 47.4%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       1  0.22990  0.22990  22.59  0.000
Residual Error  23  0.23412  0.01018
Total           24  0.46402
d.
Regression Analysis: X2 offices versus X1 revenue
The regression equation is
X2 offices = 957 + 1631 X1 revenue
Predictor    Coef  SE Coef      T      P
Constant    956.9    476.5   2.01  0.057
X1 reven   1631.3    122.3  13.34  0.000

S = 346.8    R-Sq = 88.5%    R-Sq(adj) = 88.1%

Analysis of Variance
Source          DF        SS        MS       F      P
Regression       1  21388013  21388013  177.84  0.000
Residual Error  23   2766147    120267
Total           24  24154159
13.23
Given the following results, where the numbers in parentheses are the sample
standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope
coefficients, bj +/- t(n-K-1, alpha/2) s_bj:
95% CI for x1: 4.8 +/- 2.086(2.1) = .4194 up to 9.1806
95% CI for x2: 6.9 +/- 2.086(3.7) = -.8182 up to 14.6182
95% CI for x3: -7.2 +/- 2.086(2.8) = -13.0408 up to -1.3592
b. Test the hypotheses H0: βj = 0, H1: βj > 0
For x1: t = 4.8/2.1 = 2.286; t20,.05/.01 = 1.725, 2.528
Therefore, reject H0 at the 5% level but not at the 1% level
For x2: t = 6.9/3.7 = 1.865; t20,.05/.01 = 1.725, 2.528
Therefore, reject H0 at the 5% level but not at the 1% level
For x3: t = -7.2/2.8 = -2.571; t20,.05/.01 = 1.725, 2.528
Therefore, do not reject H0 at either level
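A minimal Python sketch of the interval and test arithmetic in 13.23, assuming b = 4.8, a standard error of 2.1, and n - K - 1 = 20 error degrees of freedom:

from scipy import stats

b, se, df = 4.8, 2.1, 20
t_crit = stats.t.ppf(0.975, df)          # 2.086 for a two-sided 95% CI
print(b - t_crit * se, b + t_crit * se)  # .4194 up to 9.1806

t_stat = b / se                          # 2.286
p_one_sided = 1 - stats.t.cdf(t_stat, df)
print(t_stat, p_one_sided)               # compare to alpha = .05 and .01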
13.24
Given the following results, where the numbers in parentheses are the sample
standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope
coefficients, bj +/- t(n-K-1, alpha/2) s_bj:
95% CI for x1: 6.8 +/- 2.042(3.1) = .4698 up to 13.1302
95% CI for x2: 6.9 +/- 2.042(3.7) = -.6554 up to 14.4554
95% CI for x3: -7.2 +/- 2.042(3.2) = -13.7344 up to -.6656
b. Test the hypotheses H0: βj = 0, H1: βj > 0
For x1: t = 6.8/3.1 = 2.194; t30,.05/.01 = 1.697, 2.457
Therefore, reject H0 at the 5% level but not at the 1% level
For x2: t = 6.9/3.7 = 1.865; t30,.05/.01 = 1.697, 2.457
Therefore, reject H0 at the 5% level but not at the 1% level
For x3: t = -7.2/3.2 = -2.25; t30,.05/.01 = 1.697, 2.457
Therefore, do not reject H0 at either level
13.25
Given the following results, where the numbers in parentheses are the sample
standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope
coefficients, bj +/- t(n-K-1, alpha/2) s_bj:
95% CI for x1: 34.8 +/- 2.000(12.1) = 10.60 up to 59.00
95% CI for x2: 56.9 +/- 2.000(23.7) = 9.50 up to 104.30
95% CI for x3: -57.2 +/- 2.000(32.8) = -122.80 up to 8.40
b. Test the hypotheses H0: βj = 0, H1: βj > 0
For x1: t = 34.8/12.1 = 2.876; t60,.05/.01 = 1.671, 2.390
Therefore, reject H0 at both the 5% and the 1% levels
For x2: t = 56.9/23.7 = 2.401; t60,.05/.01 = 1.671, 2.390
Therefore, reject H0 at both the 5% and the 1% levels
For x3: t = -57.2/32.8 = -1.744; t60,.05/.01 = 1.671, 2.390
Therefore, do not reject H0 at either level
13.26
Given the following results, where the numbers in parentheses are the sample
standard errors of the coefficient estimates:
a. Compute two-sided 95% confidence intervals for the three regression slope
coefficients, bj +/- t(n-K-1, alpha/2) s_bj:
95% CI for x1: 17.8 +/- 2.042(7.1) = 3.3018 up to 32.2982
95% CI for x2: 26.9 +/- 2.042(13.7) = -1.0754 up to 54.8754
95% CI for x3: -9.2 +/- 2.042(3.8) = -16.9596 up to -1.4404
b. Test the hypotheses H0: βj = 0, H1: βj > 0
For x1: t = 17.8/7.1 = 2.507; t35,.05/.01 = 1.697, 2.457
Therefore, reject H0 at both the 5% and the 1% levels
For x2: t = 26.9/13.7 = 1.964; t35,.05/.01 = 1.697, 2.457
Therefore, reject H0 at the 5% level but not at the 1% level
For x3: t = -9.2/3.8 = -2.421; t35,.05/.01 = 1.697, 2.457
Therefore, do not reject H0 at either level
13.27
a. H0: β1 = 0; H1: β1 > 0
t = .052/.023 = 2.26; t27,.025/.01 = 2.052, 2.473
Therefore, reject H0 at the 2.5% level but not at the 1% level
b. t27,.05/.025/.005 = 1.703, 2.052, 2.771
90% CI: 1.14 +/- 1.703(.35) = .5439 up to 1.7361
95% CI: 1.14 +/- 2.052(.35) = .4218 up to 1.8582
99% CI: 1.14 +/- 2.771(.35) = .1701 up to 2.1099
13.29
a. H0: β2 = 0; H1: β2 < 0
t = -1.345/.565 = -2.381; -t21,.025/.01 = -2.080, -2.518
Therefore, reject H0 at the 2.5% level but not at the 1% level
b. H0: β3 = 0; H1: β3 > 0
t = .613/.243 = 2.523; t21,.01/.005 = 2.518, 2.831
Therefore, reject H0 at the 1% level but not at the .5% level
c. t21,.05/.025/.005 = 1.721, 2.080, 2.831
90% CI: .653 +/- 1.721(.189) = .3277 up to .9783
95% CI: .653 +/- 2.080(.189) = .2599 up to 1.0461
99% CI: .653 +/- 2.831(.189) = .1179 up to 1.1881
13.30 a. H0: β3 = 0, H1: β3 < 0
t = -.000191/.000446 = -.428; -t16,.10 = -1.337
Therefore, do not reject H0 at the 20% level
b. H0: β1 = β2 = β3 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3)
F = [(n-K-1)/K] * R^2/(1-R^2) = (16/3)(.71/(1 - .71)) = 13.057; F3,16,.01 = 5.29
Therefore, reject H0 at the 1% level
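A minimal Python sketch of this overall F test from R-squared, with the exercise's R^2 = .71, K = 3 predictors, and 16 error degrees of freedom:

from scipy import stats

R2, K, df_err = 0.71, 3, 16
F = (df_err / K) * R2 / (1 - R2)         # 13.057
F_crit = stats.f.ppf(0.99, K, df_err)    # 5.29 at the 1% level
print(F, F_crit, F > F_crit)             # reject H0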
13.31 a. t85,.025/.005 = 2.000, 2.660
95% CI: 7.878 +/- 2.000(1.809) = 4.260 up to 11.496
99% CI: 7.878 +/- 2.660(1.809) = 3.0661 up to 12.6899
b. H0: β2 = 0; H1: β2 > 0; t = .003666/.001344 = 2.73; t85,.005 = 2.660
Therefore, reject H0 at the .5% level
13.32
a. All else being equal, an extra $1 in mean per capita personal income leads
to an expected extra $.04 of net revenue per capita from the lottery.
b. b2 = .8772, s_b2 = .3107, n = 29, t24,.025 = 2.064
95% CI: .8772 +/- 2.064(.3107) = .2359 up to 1.5185
c. H0: β3 = 0, H1: β3 < 0
t = -365.01/263.88 = -1.383; -t24,.10/.05 = -1.318, -1.711
Therefore, reject H0 at the 10% level but not at the 5% level
The completed analysis of variance tables for the regressions of Exercises
13.19-13.21 are:

Design effort (13.19):
Source      Sum of Squares  Degrees of Freedom  Mean Squares  F-Ratio
Regression           3.549                   3         1.183   81.955
Error                 .332                  23       .014435
Total                3.881                  26

Milk consumption (13.20):
Source      Sum of Squares  Degrees of Freedom  Mean Squares  F-Ratio
Regression            88.2                   2          44.1   16.113
Error                 73.9                  27        2.7370
Total                162.1                  29

Weight gain (13.21):
Source      Sum of Squares  Degrees of Freedom  Mean Squares  F-Ratio
Regression            79.2                   3          26.4   12.078
Error                 45.9                  21      2.185714
Total                125.1                  24
13.42 H0: β1 = β2 = β3 = β4 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3, 4)
Since R^2 = SSR/SST and 1 - R^2 = SSE/SST,
F = [SSR/K] / [SSE/(n-K-1)] = [(n-K-1)/K] * R^2/(1-R^2)
F = (24/4)(.51/(1 - .51)) = 6.2449; F4,24,.01 = 4.22. Therefore,
reject H0 at the 1% level
13.43 a. H0: β1 = β2 = β3 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3)
Since R^2 = SSR/SST and 1 - R^2 = SSE/SST,
F = [SSR/K] / [SSE/(n-K-1)] = [(n-K-1)/K] * R^2/(1-R^2)
F = (15/3)(.84/(1 - .84)) = 26.25; F3,15,.01 = 5.42. Therefore, reject
H0 at the 1% level
13.44 a. H0: β1 = β2 = 0, H1: At least one βi ≠ 0 (i = 1, 2)
F = [(n-K-1)/K] * R^2/(1-R^2), where the R^2 implied by the adjusted R^2 of
.96 is R^2 = [(n-K-1)(adjusted R^2) + K]/(n-1) (see 13.47b), so that
F = [16(.96) + 2] / [2(1 - .96)] = 217; F2,16,.01 = 6.23
Therefore, reject H0 at the 1% level
13.46 For testing the k1 additional variables, divide the numerator and denominator
of the F statistic by SST, where R^2* and SSE* come from the restricted model:
F = [(SSE* - SSE)/k1] / [SSE/(n-K-1)]
  = [(n-K-1)/k1] * [(1 - R^2*) - (1 - R^2)] / (1 - R^2)
  = [(n-K-1)/k1] * (R^2 - R^2*) / (1 - R^2)
13.47
a. Adjusted R^2 = 1 - [SSE/(n-K-1)] / [SST/(n-1)]
   = 1 - [(n-1)/(n-K-1)](1 - R^2) = [(n-1)R^2 - K] / (n-K-1)
b. Since adjusted R^2 = [(n-1)R^2 - K]/(n-K-1), solving for R^2 gives
   R^2 = [(n-K-1)(adjusted R^2) + K]/(n-1)
c. F = [SSR/K] / [SSE/(n-K-1)] = [(n-K-1)/K] * [SSR/SST] / [SSE/SST]
   = [(n-K-1)/K] * R^2/(1-R^2)
Substituting the expression from part b:
F = [(n-K-1)/K] * {[(n-K-1)(adjusted R^2) + K]/(n-1)} / {[n-1-(n-K-1)(adjusted R^2) - K]/(n-1)}
  = [(n-K-1)(adjusted R^2) + K] / [K(1 - adjusted R^2)]
since n - 1 - (n-K-1)(adjusted R^2) - K = (n-K-1)(1 - adjusted R^2).
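A minimal numeric check of the 13.47 identities in Python, with arbitrary assumed values of n, K, and R^2:

n, K, R2 = 30, 4, 0.60
R2_adj = ((n - 1) * R2 - K) / (n - K - 1)                      # part (a)
R2_back = ((n - K - 1) * R2_adj + K) / (n - 1)                 # part (b) inverts (a)
F_from_R2 = (n - K - 1) / K * R2 / (1 - R2)
F_from_adj = ((n - K - 1) * R2_adj + K) / (K * (1 - R2_adj))   # part (c)
print(R2_back, F_from_R2, F_from_adj)  # R2_back equals R2; the two F's agree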
13.58 There are many possible answers. Relationships that can be approximated by a
nonlinear quadratic model include many supply, production, and cost functions, such
as average cost versus the number of units produced.
13.59 To estimate the function with linear least squares, solve the constraint
β1 + β2 = 2 for β2. Since β2 = 2 - β1, plug into the equation and algebraically
manipulate:
Y = β0 + β1 X1 + (2 - β1) X2 + β3 X3
Y = β0 + β1 X1 + 2 X2 - β1 X2 + β3 X3
Y = β0 + β1 (X1 - X2) + 2 X2 + β3 X3
Y - 2 X2 = β0 + β1 (X1 - X2) + β3 X3
Conduct the variable transformations (Y - 2 X2 and X1 - X2) and estimate the model
using least squares; β2 is then recovered as 2 - b1.
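A minimal Python sketch of this transformation, with simulated data in which the true coefficients satisfy β1 + β2 = 2 (all names and numbers here are hypothetical):

import numpy as np

rng = np.random.default_rng(1)
n = 100
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 0.8 * x1 + 1.2 * x2 + 0.5 * x3 + rng.normal(0, 0.1, n)

# Transform: y - 2*x2 = b0 + b1*(x1 - x2) + b3*x3
y_star = y - 2 * x2
Z = np.column_stack([np.ones(n), x1 - x2, x3])
b0, b1, b3 = np.linalg.lstsq(Z, y_star, rcond=None)[0]
b2 = 2 - b1                    # recover the constrained coefficient
print(b0, b1, b2, b3)          # b1 + b2 = 2 by construction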
13.60
13.61
13.62
a. All else equal, a 1% increase in the price of beef will be associated with a
decrease of .529% in the tons of beef consumed annually in the U.S.
b. All else equal, a 1% increase in the price of pork will be associated with an
increase of .217% in the tons of beef consumed annually in the U.S.
c. H0: β4 = 0, H1: β4 < 0; t = -.416/.163 = -2.552; -t25,.01 = -2.485
Therefore, reject H0 at the 1% level
d. H0: β1 = β2 = β3 = β4 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3, 4)
F = [(n-K-1)/K] * R^2/(1-R^2) = (25/4)(.683/(1 - .683)) = 13.47; F4,25,.01 = 4.18
Therefore, reject H0 at the 1% level
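The elasticity interpretations in parts a and b come from a log-log (constant-elasticity) specification. A minimal Python sketch with simulated data (the actual beef consumption data are not reproduced here, and all names and numbers are hypothetical):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 30
price = rng.uniform(1, 10, n)
qty = 50 * price ** (-0.529) * np.exp(rng.normal(0, 0.05, n))

fit = sm.OLS(np.log(qty), sm.add_constant(np.log(price))).fit()
print(fit.params[1])  # near -.529: a 1% price rise -> about a .529% quantity drop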
13.64 Linear model:
Regression Plot: Salary = 20544.5 + 616.113 Experience
S = 3117.89    R-Sq = 78.0%    R-Sq(adj) = 77.9%

Quadratic model:
Regression Plot: Salary = 18683.8 + 910.807 Experience - 8.21382 Experience**2
S = 3027.17    R-Sq = 79.4%    R-Sq(adj) = 79.1%

Cubic model:
Regression Plot: Salary = 20881.1 + 344.484 Experience + 26.4323 Experience**2
- 0.582553 Experience**3
S = 2982.43    R-Sq = 80.2%    R-Sq(adj) = 79.8%

(Each plot shows Salary, 20000 to 50000, against Experience, 0 to 40, with the
fitted curve.)
All three of the models appear to fit the data well. The cubic model appears to fit
the data the best as the standard error of the estimate is lowest. In addition,
explanatory power is marginally higher for the cubic model than the other models.
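A minimal Python sketch of this comparison, fitting linear, quadratic, and cubic polynomials and comparing S and R-squared; the data are simulated, since the Salary/Experience series itself is not reproduced here:

import numpy as np

rng = np.random.default_rng(3)
exper = rng.uniform(0, 40, 100)
salary = 20000 + 900 * exper - 8 * exper**2 + rng.normal(0, 3000, 100)

for deg in (1, 2, 3):
    coefs = np.polyfit(exper, salary, deg)
    resid = salary - np.polyval(coefs, exper)
    sse = np.sum(resid**2)
    s = np.sqrt(sse / (len(exper) - deg - 1))  # standard error of the estimate
    r2 = 1 - sse / np.sum((salary - salary.mean())**2)
    print(deg, round(s, 1), round(r2, 3))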
13.66
Results for: GermanImports.xls
Regression Analysis: LogYt versus LogX1t, LogX2t
The regression equation is
LogYt = - 4.07 + 1.36 LogX1t + 0.101 LogX2t

Predictor     Coef  SE Coef       T      P  VIF
Constant   -4.0709   0.3100  -13.13  0.000
LogX1t     1.35935  0.03005   45.23  0.000  4.9
LogX2t     0.10094  0.05715    1.77  0.088  4.9

S = 0.04758    R-Sq = 99.7%    R-Sq(adj) = 99.7%

Analysis of Variance
Source          DF      SS      MS        F      P
Regression       2  21.345  10.673  4715.32  0.000
Residual Error  28   0.063   0.002
Total           30  21.409

Source  DF  Seq SS
LogX1t   1  21.338
LogX2t   1   0.007

13.67 What is the model constant when the dummy variable equals 1?
a. y = 7 + 8x1; b0 = 7
b. y = 12 + 6x1; b0 = 12
c. y = 7 + 12x1; b0 = 7
13.68 What is the model constant when the dummy variable equals 1?
a. y = 5.78 + 4.87x1
b. y = 1.15 + 9.51x1
c. y = 13.67 + 8.98x1
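A minimal Python sketch of the idea behind 13.67 and 13.68: with a 0/1 dummy x2, the effective intercept when x2 = 1 is b0 + b2. The numbers below are illustrative (chosen so the shifted constant is 7, as in 13.67a); the exercises' own b0 and b2 values are not shown here:

b0, b1, b2 = 4, 8, 3                  # hypothetical fitted coefficients
print("x2 = 0 constant:", b0)         # 4
print("x2 = 1 constant:", b0 + b2)    # 7, so y = 7 + 8*x1 when the dummy is on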
13.69 The interpretation of the dummy variable coefficient is that, for a given
difference between the spot price in the current year and the OPEC price in the
previous year, the difference between the OPEC price in the current year and the
OPEC price in the previous year is estimated to be $5.22 higher in 1974, during
the oil embargo, than in other years.
13.70
a. All else being equal, expected selling price is higher by $3,219 if condo has a
fireplace.
b. All else being equal, expected selling price is higher by $2,005 if condo has
brick siding.
c. 95% CI: 3219 +/- 1.96(947) = $1,362.88 up to $5,075.12
d. H0: β5 = 0, H1: β5 > 0; t = 2005/768 = 2.611; t809,.005 = 2.576
Therefore, reject H0 at the .5% level
13.71
a. All else being equal, the price-earnings ratio is higher by 1.23 for a regional
company than for a national company.
b. H0: β2 = 0, H1: β2 ≠ 0; t = 1.23/.496 = 2.48; t29,.01/.005 = 2.462, 2.756
Therefore, reject H0 at the 2% level but not at the 1% level
c. H0: β1 = β2 = 0, H1: At least one βi ≠ 0 (i = 1, 2)
F = [(n-K-1)/K] * R^2/(1-R^2) = (29/2)(.37/(1 - .37)) = 8.52; F2,29,.01 = 5.42
Therefore, reject H0 at the 1% level
13.72 35.6% of the variation in overall performance in law school can be explained by
the variation in undergraduate GPA, scores on the LSAT, and whether the student's
letters of recommendation are unusually strong. The overall model is significant,
since we can reject the null hypothesis that the model has no explanatory power in
favor of the alternative hypothesis that the model has significant explanatory
power. The individual regression coefficients that are significantly different from
zero include the scores on the LSAT and whether the student's letters of
recommendation were unusually strong. The coefficient on undergraduate GPA was
not found to be significant at the 5% level.
13.73
a. All else equal, the annual salary of the attorney general who can be removed is
$5,793 higher than if the attorney general cannot be removed.
b. All else equal, the annual salary of the attorney general of the state is $3,100
lower if the supreme court justices are elected on partisan ballots.
c. H0: β5 = 0, H1: β5 > 0; t = 5793/2897 = 1.9996; t43,.05/.025 = 1.68, 2.016
Therefore, reject H0 at the 5% level but not at the 2.5% level
d. H0: β6 = 0, H1: β6 < 0; t = -3100/1761 = -1.76; -t43,.05/.025 = -1.68, -2.016
Therefore, reject H0 at the 5% level but not at the 2.5% level
e. t43,.025 = 2.016
95% CI: 547 +/- 2.016(124.3) = 296.41 up to 797.59
13.74
a. All else equal, the average rating of a course is 6.21 units higher if a visiting
lecturer is brought in than if otherwise.
b. H0: β4 = 0, H1: β4 > 0; t = 6.21/3.59 = 1.73; t20,.05 = 1.725
Therefore, reject H0 at the 5% level
c. 56.9% of the variation in the average course rating can be explained by the
variation in the percentage of time spent in group discussions, the dollars spent
on preparing the course materials, the dollars spent on food and drinks, and
whether a guest lecturer is brought in.
H0: β1 = β2 = β3 = β4 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3, 4)
F = [(n-K-1)/K] * R^2/(1-R^2) = (20/4)(.569/(1 - .569)) = 6.6; F4,20,.01 = 4.43
Therefore, reject H0 at the 1% level
d. t20,.025 = 2.086
95% CI: .52 +/- 2.086(.21) = .0819 up to .9581
13.75 34.4% of the variation in a test on understanding college economics can be
explained by which course was taken, the student's GPA, the teacher who taught the
course, the gender of the student, the pre-test score, the number of credit hours
completed, and the age of the student. The regression model has significant
explanatory power:
H0: β1 = β2 = β3 = β4 = β5 = β6 = β7 = 0, H1: At least one βi ≠ 0 (i = 1, ..., 7)
F = [(n-K-1)/K] * R^2/(1-R^2) = (342/7)(.344/(1 - .344)) = 25.62, which far
exceeds any common critical value of F, so reject H0.
13.76
Results for: Student Performance.xls
Regression Analysis: Y versus X1, X2, X3, X4, X5
The regression equation is
Y = 2.00 + 0.0099 X1 + 0.0763 X2 - 0.137 X3 + 0.064 X4 + 0.138 X5
Predictor      Coef  SE Coef      T      P  VIF
Constant      1.997    1.273   1.57  0.132
X1          0.00990  0.01654   0.60  0.556  1.3
X2          0.07629  0.05654   1.35  0.192  1.2
X3         -0.13652  0.06922  -1.97  0.062  1.1
X4           0.0636   0.2606   0.24  0.810  1.4
X5          0.13794  0.07521   1.83  0.081  1.1

S = 0.5416    R-Sq = 26.5%    R-Sq(adj) = 9.0%

Analysis of Variance
Source          DF      SS      MS     F      P
Regression       5  2.2165  0.4433  1.51  0.229
Residual Error  21  6.1598  0.2933
Total           26  8.3763
The model is not significant (p-value of the F-test = .229). The model explains only
26.5% of the variation in GPA with the hours spent studying, hours spent preparing
for tests, hours spent in bars, whether or not students take notes or mark highlights
when reading texts, and the average number of credit hours taken per semester. The
only independent variables that are marginally significant (10% level but not the 5%
level) are the number of hours spent in bars and the average number of credit hours.
The other independent variables are not significant at common levels of alpha.
13.77
a. Begin the analysis with the correlation matrix to identify important independent
variables as well as correlations among the independent variables.

Correlations: Salary, Experience, yearsenior, Gender_1F (p-values beneath the
correlations; only part of the matrix was preserved)

yearseni-Experien:  0.674 (0.000)
Gender_1F:  -0.429 with Salary, -0.378 with Experience, -0.292 with yearsenior
            (all p-values 0.000)

The regression of Salary on Experience, yearsenior, and Gender_1F gives (the
coefficient table was not preserved):

R-Sq = 84.9%    R-Sq(adj) = 84.6%

Analysis of Variance
Source           DF          SS          MS       F      P
Regression        3  5559163505  1853054502  273.54  0.000
Residual Error  146   989063178     6774405
Total           149  6548226683

84.9% of the variation in annual salary (in dollars) can be explained by the variation
in the years of experience, the years of seniority, and the gender of the employee. All
of the variables are significant at the .01 level of significance (p-values of .000, .000,
and .006 respectively). The F-test of the significance of the overall model shows
that we reject H0, that all of the slope coefficients are jointly equal to zero, in favor
of H1, that at least one slope coefficient is not equal to zero; the F-test yielded a
p-value of .000.
b. H0: β3 = 0, H1: β3 < 0
t = -1443.2/519.8 = -2.78; -t146,.01 = -2.326
Therefore, reject H0 at the 1% level and conclude that the annual salaries for
females are statistically significantly lower than they are for males.
c. Add an interaction term and test for the significance of the slope coefficient on
the interaction term.
13.78 Two variables are included as predictor variables. What is the effect on the
estimated slope coefficients when these two variables have a correlation equal to:
a. .78. A large correlation between the independent variables will lead to a high
variance for the estimated slope coefficients and will tend to produce a small
Student's t statistic. Use the rule of thumb |r| > 2/sqrt(n) to determine whether
the correlation is large.
b. .08. Essentially no correlation exists between the independent variables, so
there is no effect on the estimated slope coefficients.
c. .94. A large correlation between the independent variables will lead to a high
variance for the estimated slope coefficients and will tend to produce a small
Student's t statistic.
d. .33. Use the rule of thumb |r| > 2/sqrt(n) to determine whether the correlation
is large.
13.79 n = 34 with four independent variables; the correlation between one
independent variable and the dependent variable is .23. Does this imply that this
independent variable will have a very small Student's t statistic?
Correlation between the independent variable and the dependent variable is not
necessarily evidence of a small Student's t statistic. A high correlation among the
independent variables could result in a very small Student's t statistic, as the
correlation creates a high variance.
13.80 n = 47 with three independent variables. One of the independent variables has a
correlation of .95 with the dependent variable.
Correlation between the independent variable and the dependent variable is not
necessarily evidence of a small Student's t statistic. A high correlation among the
independent variables could result in a very small Student's t statistic, as the
correlation creates a high variance.
13.81 n = 49 with two independent variables. One of the independent variables has a
correlation of .56 with the dependent variable.
Correlation between the independent variable and the dependent variable is not
necessarily evidence of a small Student's t statistic. A high correlation among the
independent variables could result in a very small Student's t statistic, as the
correlation creates a high variance.
13.82 through 13.84: Reports can be written by following the extended Case Study
on the data file Cotton; see Section 13.9.
13.85
Regression Analysis: y_deathrate versus x1_totmiles, x2_avgspeed
The regression equation is
y_deathrate = - 2.97 - 0.00447 x1_totmiles + 0.219 x2_avgspeed
Predictor       Coef   SE Coef      T      P   VIF
Constant      -2.969     3.437  -0.86  0.416
x1_totmi   -0.004470  0.001549  -2.89  0.023  11.7
x2_avgsp     0.21879   0.08391   2.61  0.035  11.7

S = 0.1756    R-Sq = 55.1%    R-Sq(adj) = 42.3%

Analysis of Variance
Source          DF       SS       MS     F      P
Regression       2  0.26507  0.13254  4.30  0.061
Residual Error   7  0.21593  0.03085
Total            9  0.48100
55.1% of the variation in death rates can be explained by the variation in total
miles traveled and in average travel speed. The overall model is significant at the
10% but not the 5% level since the p-value of the F-test is .061.
All else equal, the average speed variable has the expected sign since as average
speed increases, the death rate also increases. The total miles traveled variable is
negative which indicates that the more miles traveled, the lower the death rate.
Both of the independent variables are significant at the 5% level (p-values of .023
and .035 respectively). There appears to be some correlation between the
independent variables.
A quadratic model in total miles (x1_totmiles and its square, x1_totsq) was also
fit; the equation and the coefficient column were not preserved:

Predictor      SE Coef      T      P    VIF
Constant         1.296  -5.04  0.001
x1_totmi      0.002835   9.45  0.000  285.5
x1_totsq    0.00000153  -9.68  0.000  285.5

S = 0.06499    R-Sq = 93.9%    R-Sq(adj) = 92.1%

Analysis of Variance
Source          DF       SS       MS      F      P
Regression       2  0.45143  0.22572  53.44  0.000
Residual Error   7  0.02957  0.00422
Total            9  0.48100

Source    DF   Seq SS
x1_totmi   1  0.05534
x1_totsq   1  0.39609
13.86
Regression Analysis: y_FemaleLFPR versus x1_income, x2_yrsedu, ...
The regression equation is
y_FemaleLFPR = 0.2 + 0.000406 x1_income + 4.84 x2_yrsedu - 1.55 x3_femaleun

Predictor       Coef    SE Coef      T      P  VIF
Constant        0.16      34.91   0.00  0.996
x1_incom   0.0004060  0.0001736   2.34  0.024  1.2
x2_yrsed       4.842      2.813   1.72  0.092  1.5
x3_femal     -1.5543     0.3399  -4.57  0.000  1.3

S = 3.048    R-Sq = 54.3%    R-Sq(adj) = 51.4%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  508.35  169.45  18.24  0.000
Residual Error  46  427.22    9.29
Total           49  935.57
13.87
Regression Analysis: y_money versus x1_pcincome, x2_ir
The regression equation is
y_money = - 1158 + 0.253 x1_pcincome - 19.6 x2_ir
Predictor      Coef  SE Coef      T      P  VIF
Constant    -1158.4    587.9  -1.97  0.080
x1_pcinc    0.25273  0.03453   7.32  0.000  1.3
x2_ir        -19.56    21.73  -0.90  0.391  1.3

S = 84.93    R-Sq = 89.8%    R-Sq(adj) = 87.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       2  570857  285429  39.57  0.000
Residual Error   9   64914    7213
Total           11  635771

Source    DF  Seq SS
x1_pcinc   1  565012
x2_ir      1    5845
13.88
Regression Analysis: y_manufgrowt versus x1_aggrowth, x2_exportgro, ...
The regression equation is
y_manufgrowth = 2.15 + 0.493 x1_aggrowth + 0.270 x2_exportgrowth
- 0.117 x3_inflation
Predictor      Coef  SE Coef      T      P  VIF
Constant     2.1505   0.9695   2.22  0.032
x1_aggro     0.4934   0.2020   2.44  0.019  1.0
x2_expor    0.26991  0.06494   4.16  0.000  1.0
x3_infla   -0.11709  0.05204  -2.25  0.030  1.0

S = 3.624    R-Sq = 39.3%    R-Sq(adj) = 35.1%

Analysis of Variance
Source          DF      SS      MS     F      P
Regression       3  373.98  124.66  9.49  0.000
Residual Error  44  577.97   13.14
Total           47  951.95

Source    DF  Seq SS
x1_aggro   1   80.47
x2_expor   1  227.02
x3_infla   1   66.50
13.89
The method of least squares regression yields estimators that are BLUE (Best
Linear Unbiased Estimators). This result holds when the assumptions regarding the
behavior of the error term are true. BLUE estimators are the most efficient (best)
estimators in the class of all linear unbiased estimators. The wide availability of
computing power implementing the method of least squares has dramatically
increased its use.
13.90 The analysis of variance table identifies how the total variability of the dependent
variable (SST) is split up between the portion of variability that is explained by the
regression model (SSR) and the part that is unexplained (SSE). The Coefficient of
Determination (R2) is derived as the ratio of SSR to SST. The analysis of variance
table also computes the F statistic for the test of the significance of the overall
regression, i.e., whether all of the slope coefficients are jointly equal to zero. The
associated p-value is also generally reported in this table.
13.91
a. False. If the regression model does not explain a large enough portion of the
variability of the dependent variable, then the error sum of squares can be larger
than the regression sum of squares.
b. False. The sum of several simple linear regressions will not equal a multiple
regression, since the assumption of all else equal is violated in the simple
linear regressions. The multiple regression holds all else equal in calculating the
partial effect that a change in one of the independent variables has on the
dependent variable.
c. True
d. False. While the regular coefficient of determination (R^2) cannot be negative,
the adjusted coefficient of determination can become negative. If the
independent variables added to a regression equation have very little
explanatory power, the loss of degrees of freedom may more than offset the
added explanatory power.
e. True
13.92 If one model contains more explanatory variables, then SST remains the same for
both models but SSR will be higher for the model with more explanatory variables.
Since SST = SSR1 + SSE1 which is equivalent to SSR2 + SSE2 and given that SSR2 >
SSR1, then SSE1 > SSE2. Hence, the coefficient of determination will be higher with
a greater number of explanatory variables and the coefficient of determination must
be interpreted in conjunction with whether or not the regression slope coefficients on
the explanatory variables are significantly different from zero.
13.93
13.94 Show that the residuals sum to zero:
Sum(e_i) = Sum(y_i - a - b1 x_1i - b2 x_2i)
Since a = ybar - b1 x1bar - b2 x2bar,
Sum(e_i) = Sum(y_i - ybar + b1 x1bar + b2 x2bar - b1 x_1i - b2 x_2i)
         = n ybar - n ybar + n b1 x1bar + n b2 x2bar - n b1 x1bar - n b2 x2bar
Sum(e_i) = 0
13.95
F = [(n-K-1)/K] * R^2/(1-R^2) = (62/7)(.766/(1 - .766)) = 28.99; F7,62,.01 = 2.79
Therefore, reject H0 at the 1% level
13.96
F = [(n-K-1)/K] * R^2/(1-R^2) = (27/2)(.637/(1 - .637)) = 23.69; F2,27,.01 = 5.49
Therefore, reject H0 at the 1% level
d. t27,.005 = 2.771; 99% CI: -1.8345 +/- 2.771(.6349) = -3.5938 up to -.0752
e. t = -1.78; -t27,.05/.025 = -1.703, -2.052
Therefore, reject H0 at the 5% level but not at the 2.5% level
13.97
a. All else equal, a 1% increase in course time spent in group discussion results
in an expected increase of .3817 in the average rating of the course. All else
equal, a dollar increase in money spent on the preparation of subject matter
materials results in an expected increase of .5172 in the average rating by
participants of the course. All else equal, a unit increase in expenditure on
non-course related materials results in an expected increase of .0753 in the
average rating of the course.
b. 57.9% of the variation in the average rating can be explained by the linear
relationship with the percentage of class time spent on discussion, money spent
on the preparation of subject matter materials, and money spent on non-class
related materials.
c. H0: β1 = β2 = β3 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3)
F = [(n-K-1)/K] * R^2/(1-R^2) = (21/3)(.579/(1 - .579)) = 9.627; F3,21,.05 = 3.07
Therefore, reject H0 at the 5% level
d. t21,.05 = 1.721; 90% CI: .3817 +/- 1.721(.2018) = .0344 up to .729
e. t = 2.64; t21,.01/.005 = 2.518, 2.831
Therefore, reject H0 at the 1% level but not at the .5% level
13.98
Regression Analysis: y_rating versus x1_expgrade, x2_Numstudents
The regression equation is
y_rating = - 0.200 + 1.41 x1_expgrade - 0.0158 x2_Numstudents
Predictor       Coef   SE Coef      T      P  VIF
Constant     -0.2001    0.6968  -0.29  0.777
x1_expgr      1.4117    0.1780   7.93  0.000  1.5
x2_Numst   -0.015791  0.003783  -4.17  0.001  1.5

S = 0.1866    R-Sq = 91.5%    R-Sq(adj) = 90.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       2  6.3375  3.1687  90.99  0.000
Residual Error  17  0.5920  0.0348
Total           19  6.9295

13.99
F = [(n-K-1)/K] * R^2/(1-R^2) = 5.804; F4,55,.01 = 3.68
Therefore, reject H0 at the 1% level

13.100
a. All else equal, each extra point in the student's expected score leads to an
expected increase of .469 in the actual score.
b. t103,.025 = 1.98; 95% CI: 3.369 +/- 1.98(.456) = 2.4661 up to 4.2719
c. H0: β3 = 0, H1: β3 ≠ 0; t = 3.054/1.457 = 2.096; t103,.025 = 1.96
Therefore, reject H0 at the 5% level
d. 68.6% of the variation in the exam scores is explained by their linear
dependence on the student's expected score, hours per week spent working on
the course, and the student's grade point average.
e. H0: β1 = β2 = β3 = 0, H1: At least one βi ≠ 0 (i = 1, 2, 3)
F = [(n-K-1)/K] * R^2/(1-R^2) = (103/3)(.686/(1 - .686)) = 75.0
Therefore, reject H0
f. R = sqrt(.686) = .82825
g. yhat = 2.178 + .469(80) + 3.369(8) + 3.054(3) = 75.812
13.101 a. t22,.005 = 2.819; 99% CI: .0974 +/- 2.819(.0215) = .0368 up to .1580
b. H0: β2 = 0, H1: β2 > 0; t = .374/.209 = 1.789; t22,.05/.025 = 1.717, 2.074
Therefore, reject H0 at the 5% level but not the 2.5% level
c. R^2 = [(n-K-1)(adjusted R^2) + K]/(n-1) = [22(.91) + 2]/24 = .9175
d. H0: β1 = β2 = 0, H1: At least one βi ≠ 0 (i = 1, 2)
F = [(n-K-1)/K] * R^2/(1-R^2) = (22/2)(.9175/(1 - .9175)) = 122.3; F2,22,.01 = 5.72
Therefore, reject H0 at the 1% level
e. R = sqrt(.9175) = .9579
13.102 a. t2669,.05 = 1.645; 90% CI: 480.04 +/- 1.645(224.9) = 110.0795 up
to 850.0005
b. t2669,.005 = 2.576; 99% CI: 1350.3 +/- 2.576(212.3) = 803.4152 up
to 1897.1848
c. H0: β8 = 0, H1: β8 > 0; t = 891.67/180.87 = 4.9299
t2669,.005 = 2.576; therefore, reject H0 at the .5% level
d. H0: β9 = 0, H1: β9 > 0; t = 722.95/110.98 = 6.5142
t2669,.005 = 2.576; therefore, reject H0 at the .5% level
e. 52.39% of the variability in minutes played in the season can be explained by
the variability in all 9 variables.
f. R = sqrt(.5239) = .7238
13.103 a. H0: β1 = 0, H1: β1 > 0
t = .052/.019 = 2.737; t60,.005 = 2.66; therefore, reject H0 at the 1% level
b. H0: β2 = 0, H1: β2 ≠ 0; t = .005/.042 = .119
t60,.10 = 1.296; therefore, do not reject H0 at the 20% level
c. 17% of the variation in the growth rate in GDP can be explained by the
variations in real income per capita and the average tax rate, as a proportion of
GNP.
d. R = sqrt(.17) = .4123
13.104 A report can be written by following the Case Study and testing the
significance of the model; see Section 13.9.
13.105 a. Begin with the correlation matrix:

Correlations: EconGPA, SATverb, SATmath, HSPct (p-values beneath the
correlations; the SATmath-EconGPA entry was not preserved)

          EconGPA  SATverb
SATverb     0.427
            0.000
SATmath              0.353
                     0.003
HSPct       0.362    0.201
            0.000    0.121

          SATmath
HSPct       0.497
            0.000

The first regression, of EconGPA on SATverb, SATmath, and HSPct, was only
partially preserved; the surviving pieces are the VIFs (1.2, 1.5, 1.3), the overall
F-test p-value (0.000), and the sequential sums of squares:

Source   DF  Seq SS
SATverb   1  3.7516
SATmath   1  0.9809
HSPct     1  0.2846

The regression model indicates positive coefficients, as expected, for all three
independent variables. The greater the high school rank, and the higher the SAT
verbal and SAT math scores, the larger the Econ GPA. The high school rank
variable has the smallest t-statistic and is removed from the model:
Regression Analysis: EconGPA versus SATverb, SATmath
The regression equation is
EconGPA = 0.755 + 0.0230 SATverb + 0.0174 SATmath
67 cases used 45 cases contain missing values
Predictor      Coef   SE Coef     T      P  VIF
Constant     0.7547    0.4375  1.72  0.089
SATverb    0.022951  0.006832  3.36  0.001  1.1
SATmath    0.017387  0.006558  2.65  0.010  1.1

S = 0.4196    R-Sq = 30.5%    R-Sq(adj) = 28.3%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   4.9488  2.4744  14.05  0.000
Residual Error  64  11.2693  0.1761
Total           66  16.2181

Source   DF  Seq SS
SATverb   1  3.7109
SATmath   1  1.2379
Both SAT variables are now statistically significant at the .05 level and appear to
pick up separate influences on the dependent variable. The simple correlation
coefficient between SAT math and SAT verbal is relatively low at .353; thus,
multicollinearity will not be dominant in this regression model.
The final regression model, with conditional t-statistics in parentheses under the
coefficients, is:
Y = .755 + .023(SATverbal) + .0174(SATmath)
           (3.36)           (2.65)
S = .4196    R^2 = .305    n = 67
b. Start with the correlation matrix:

Correlations: EconGPA, Acteng, ACTmath, ACTss, ACTcomp, HSPct (p-values
beneath the correlations)

          EconGPA  Acteng  ACTmath  ACTss  ACTcomp
Acteng      0.387
            0.001
ACTmath     0.338   0.368
            0.003   0.001
ACTss       0.442   0.448    0.439
            0.000   0.000    0.000
ACTcomp     0.474   0.650    0.765  0.812
            0.000   0.000    0.000  0.000
HSPct       0.362   0.173    0.290  0.224    0.230
            0.000   0.150    0.014  0.060    0.053

The regression of EconGPA on all five variables was only partially preserved;
the overall F-test p-value is 0.000 and the sequential sums of squares are:

Source    DF  Seq SS
Acteng     1  3.5362
ACTmath    1  1.0529
ACTss      1  1.4379
ACTcomp    1  0.0001
HSPct      1  1.4983
The regression shows that only high school rank is significant at the .05 level. We
may suspect multicollinearity between the variables, particularly since there is a
total ACT score (ACT composite) as well as the components that make up the
ACT composite. Since conditional significance is dependent on which other
independent variables are included in the regression equation, drop one variable at
a time. ACTmath has the lowest t-statistic and is removed:
Regression Analysis: EconGPA versus Acteng, ACTss, ACTcomp,
HSPct
The regression equation is
EconGPA = - 0.195 + 0.0276 Acteng + 0.0224 ACTss + 0.0339 ACTcomp
+ 0.0127 HSPct
71 cases used 41 cases contain missing values
Predictor      Coef   SE Coef      T      P  VIF
Constant    -0.1946    0.6313  -0.31  0.759
Acteng      0.02756   0.02534   1.09  0.281  1.8
ACTss       0.02242   0.02255   0.99  0.324  3.0
ACTcomp     0.03391   0.04133   0.82  0.415  4.2
HSPct      0.012702  0.005009   2.54  0.014  1.1

S = 0.4996    R-Sq = 31.4%    R-Sq(adj) = 27.2%

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       4   7.5239  1.8810  7.54  0.000
Residual Error  66  16.4706  0.2496
Total           70  23.9945

Source    DF  Seq SS
Acteng     1  3.5362
ACTss      1  2.1618
ACTcomp    1  0.2211
HSPct      1  1.6048
Again, high school rank is the only conditionally significant variable. ACTcomp
has the lowest t-statistic and is removed:
Regression Analysis: EconGPA versus Acteng, ACTss, HSPct
The regression equation is
EconGPA = 0.049 + 0.0390 Acteng + 0.0364 ACTss + 0.0129 HSPct
71 cases used 41 cases contain missing values
Predictor      Coef   SE Coef     T      P  VIF
Constant     0.0487    0.5560  0.09  0.930
Acteng      0.03897   0.02114  1.84  0.070  1.3
ACTss       0.03643   0.01470  2.48  0.016  1.3
HSPct      0.012896  0.004991  2.58  0.012  1.1

S = 0.4983    R-Sq = 30.7%    R-Sq(adj) = 27.6%

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       3   7.3558  2.4519  9.87  0.000
Residual Error  67  16.6386  0.2483
Total           70  23.9945

Source   DF  Seq SS
Acteng    1  3.5362
ACTss     1  2.1618
HSPct     1  1.6579
56
Now ACTss and high school rank are conditionally significant. ACTenglish has a
t-statistic less than 2 and is removed:
The final regression, of EconGPA on ACTss and HSPct, was only partially
preserved (coefficient p-values 0.250, 0.001, and 0.009; VIFs 1.1 and 1.1):

S = 0.5070    R-Sq = 27.1%    R-Sq(adj) = 25.0%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   6.5123  3.2562  12.67  0.000
Residual Error  68  17.4821  0.2571
Total           70  23.9945

Source  DF  Seq SS
ACTss    1  4.6377
HSPct    1  1.8746
Both of the independent variables are statistically significant at the .05 level, and
hence the final regression model, with conditional t-statistics in parentheses under
the coefficients, is:
Y = .567 + .0479(ACTss) + .0137(HSPct)
           (3.53)        (2.70)
S = .5070    R^2 = .271    n = 71
c. The regression model with the SAT variables is the better predictor
because the standard error of the estimate is smaller than for the ACT model
(.4196 vs. .5070). The R2 measure cannot be directly compared due to the sample
size differences.
13.106
Correlations: Salary, age, Experien, yrs_asoc, yrs_full, Sex_1Fem, Market, C8
(p-values beneath the correlations)

          Salary     age  Experien  yrs_asoc  yrs_full  Sex_1Fem  Market
age        0.749
           0.000
Experien   0.883   0.877
           0.000   0.000
yrs_asoc   0.698   0.712     0.803
           0.000   0.000     0.000
yrs_full   0.777   0.583     0.674     0.312
           0.000   0.000     0.000     0.000
Sex_1Fem  -0.429  -0.234    -0.378    -0.367    -0.292
           0.000   0.004     0.000     0.000     0.000
Market     0.026  -0.134    -0.150    -0.113    -0.017     0.062
           0.750   0.103     0.067     0.169     0.833     0.453
C8        -0.029  -0.189    -0.117    -0.073    -0.043    -0.094  -0.107
           0.721   0.020     0.155     0.373     0.598     0.254   0.192
Regression Analysis: Salary versus age, Experien, ...
The regression equation is
Salary = 23725 - 40.3 age + 357 Experien + 263 yrs_asoc + 493 yrs_full
- 954 Sex_1Fem + 3427 Market + 1188 C8

Predictor    Coef  SE Coef      T      P   VIF
Constant    23725     1524  15.57  0.000
age        -40.29    44.98  -0.90  0.372   4.7
Experien   356.83    63.48   5.62  0.000  10.0
yrs_asoc   262.50    75.11   3.49  0.001   4.0
yrs_full   492.91    59.27   8.32  0.000   2.6
Sex_1Fem   -954.1    487.3  -1.96  0.052   1.3
Market     3427.2    754.1   4.54  0.000   1.1
C8         1188.4    597.5   1.99  0.049   1.1

S = 2332    R-Sq = 88.2%    R-Sq(adj) = 87.6%

Analysis of Variance
Source           DF          SS         MS       F      P
Regression        7  5776063882  825151983  151.74  0.000
Residual Error  142   772162801    5437766
Total           149  6548226683

Source    DF      Seq SS
age        1  3669210599
Experien   1  1459475287
yrs_asoc   1     1979334
yrs_full   1   500316356
Sex_1Fem   1    22707368
Market     1   100860164
Since age is insignificant and has the smallest t-statistic, it is removed from the
model. The conditional F test for age is:
F = (SSR_F - SSR_R) / s^2_{Y|X} = (5,776,064,000 - 5,771,700,736) / (2332)^2 = .80
which is well below any common critical value of F. Thus, age is removed from
the model. The remaining independent variables are all significant at the .05 level
of significance and hence become the final regression model. Residual analysis to
determine whether the assumption of linearity holds true follows:
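A minimal Python sketch of this conditional (partial) F test, using the sums of squares quoted above for dropping the age variable:

SSR_full, SSR_reduced = 5_776_064_000, 5_771_700_736
s2 = 2332 ** 2   # s^2 = squared standard error of the estimate, full model
q = 1            # number of restrictions (one dropped variable)
F = (SSR_full - SSR_reduced) / (q * s2)
print(F)         # about .80, well below any usual critical value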
(Residual plots: RESI1, ranging from -5000 to 10000, plotted against Experien,
yrs_asoc, yrs_full, Sex_1Fem, Market, and C8. A fragment of the reduced model's
output also survives here: VIFs of 6.7, 4.0, 2.6, 1.2, 1.1, and 1.1, and an overall
p-value of 0.000.)
The residual plot for Experience shows a relatively strong quadratic relationship
between Experience and Salary. Therefore, a new variable, taking into account the
quadratic relationship is generated and added to the model. None of the other
residual plots shows strong evidence of non-linearity.
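A minimal Python sketch of this residual check, plotting residuals against a predictor and then adding the squared term when the band is curved; the data are simulated and all numbers are hypothetical:

import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(5)
exper = rng.uniform(0, 40, 150)
salary = 19000 + 880 * exper - 16 * exper**2 + rng.normal(0, 1800, 150)

lin = sm.OLS(salary, sm.add_constant(exper)).fit()
plt.scatter(exper, lin.resid)              # a curved band signals nonlinearity
plt.xlabel("Experience"); plt.ylabel("Residual"); plt.show()

quad = sm.OLS(salary, sm.add_constant(np.column_stack([exper, exper**2]))).fit()
print(quad.params)                         # the squared term captures the curve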
Regression Analysis: Salary versus Experien, ExperSquared, ...
The regression equation is
Salary = 18915 + 875 Experien - 15.9 ExperSquared + 222 yrs_asoc + 612 yrs_full
- 650 Sex_1Fem + 3978 Market + 1042 C8

Predictor     Coef  SE Coef      T      P   VIF
Constant   18915.2    583.2  32.43  0.000
Experien    875.35    72.20  12.12  0.000  20.6
ExperSqu   -15.947    1.717  -9.29  0.000  16.2
yrs_asoc    221.58    59.40   3.73  0.000   4.0
yrs_full    612.10    48.63  12.59  0.000   2.8
Sex_1Fem    -650.1    379.6  -1.71  0.089   1.2
Market      3978.3    598.8   6.64  0.000   1.1
C8          1042.3    467.1   2.23  0.027   1.1

S = 1844    R-Sq = 92.6%    R-Sq(adj) = 92.3%

Analysis of Variance
Source           DF          SS         MS       F      P
Regression        7  6065189270  866455610  254.71  0.000
Residual Error  142   483037413    3401672
Total           149  6548226683

Source    DF      Seq SS
Experien   1  5109486518
ExperSqu   1    91663414
yrs_asoc   1    15948822
yrs_full   1   678958872
Sex_1Fem   1    12652358
Market     1   139540652
C8         1    16938635
The squared term for experience is statistically significant; however, Sex_1Fem
is no longer significant at the .05 level and hence is removed from the model:

Regression Analysis: Salary versus Experien, ExperSquared, ...
The regression equation is
Salary = 18538 + 888 Experien - 16.3 ExperSquared + 237 yrs_asoc + 624 yrs_full
+ 3982 Market + 1145 C8

Predictor     Coef  SE Coef      T      P   VIF
Constant   18537.8    543.6  34.10  0.000
Experien    887.85    72.32  12.28  0.000  20.4
ExperSqu   -16.275    1.718  -9.48  0.000  16.0
yrs_asoc    236.89    59.11   4.01  0.000   3.9
yrs_full    624.49    48.41  12.90  0.000   2.8
Market      3981.8    602.9   6.60  0.000   1.1
C8          1145.4    466.3   2.46  0.015   1.0

S = 1857    R-Sq = 92.5%    R-Sq(adj) = 92.2%

Analysis of Variance
Source           DF          SS          MS       F      P
Regression        6  6055213011  1009202168  292.72  0.000
Residual Error  143   493013673     3447648
Total           149  6548226683
This is the final model with all of the independent variables being conditionally
significant, including the quadratic transformation of Experience. This would
indicate that a non-linear relationship exists between experience and salary.
13.107
Correlations: hseval, Comper, Homper, Indper, sizehse, incom72 (p-values
beneath the correlations)

          hseval  Comper  Homper  Indper  sizehse
Comper    -0.335
           0.001
Homper     0.145  -0.499
           0.171   0.000
Indper    -0.086  -0.140  -0.564
           0.419   0.188   0.000
sizehse    0.542  -0.278   0.274  -0.245
           0.000   0.008   0.009   0.020
incom72    0.426  -0.198  -0.083   0.244   0.393
           0.000   0.062   0.438   0.020   0.000

The correlation matrix indicates that the size of the house, income, and percent
homeowners have a positive relationship with house value. There is a negative
relationship between the percent industrial and percent commercial and house
value.
The first regression, on all five independent variables, was not preserved apart
from its overall p-value of 0.000. All variables are conditionally significant with
the exception of Indper and Homper; since Homper has the smaller t-statistic, it is
removed:
Regression Analysis: hseval versus Comper, Indper, sizehse, incom72
The regression equation is
hseval = - 30.9 - 15.2 Comper - 5.73 Indper + 7.44 sizehse + 0.00418 incom72

Predictor      Coef   SE Coef      T      P  VIF
Constant     -30.88     11.07  -2.79  0.007
Comper      -15.211     7.126  -2.13  0.036  1.1
Indper       -5.735     6.194  -0.93  0.357  1.3
sizehse       7.439     2.154   3.45  0.001  1.5
incom72    0.004175  0.001569   2.66  0.009  1.4

S = 3.986    R-Sq = 38.2%    R-Sq(adj) = 35.3%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       4   836.15  209.04  13.16  0.000
Residual Error  85  1350.48   15.89
Total           89  2186.63
Indper is still not conditionally significant and is removed; the final model
regresses hseval on Comper, sizehse, and incom72 (coefficient table not
preserved):

R-Sq = 37.6%    R-Sq(adj) = 35.4%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       3   822.53  274.18  17.29  0.000
Residual Error  86  1364.10   15.86
Total           89  2186.63
This becomes the final regression model. The selection of a community with the
objective of having larger house values would include communities where the
percent of commercial property is low, the median rooms per residence is high and
the per capita income is high.
13.108
a. Correlation matrix:

Correlations: deaths, vehwt, impcars, lghttrks, carage (p-values beneath the
correlations)

          deaths   vehwt  impcars  lghttrks
vehwt      0.244
           0.091
impcars   -0.284  -0.943
           0.048   0.000
lghttrks   0.726   0.157   -0.175
           0.000   0.282    0.228
carage    -0.422   0.123    0.011    -0.329
           0.003   0.400    0.943     0.021
Crash deaths are positively related to vehicle weight and percentage of light trucks
and negatively related to percent imported cars and car age. Light trucks will have
the strongest linear association of any independent variable followed by car age.
Multicollinearity is likely to exist due to the strong correlation between impcars
and vehicle weight.
b.
Regression Analysis: deaths versus vehwt, impcars, lghttrks, carage
The regression equation is
deaths = 2.60 + 0.000064 vehwt - 0.00121 impcars + 0.00833 lghttrks
- 0.0395 carage

Predictor       Coef    SE Coef      T      P   VIF
Constant       2.597      1.247   2.08  0.043
vehwt      0.0000643  0.0001908   0.34  0.738  10.9
impcars    -0.001213   0.005249  -0.23  0.818  10.6
lghttrks    0.008332   0.001397   5.96  0.000   1.2
carage      -0.03946    0.01916  -2.06  0.045   1.4

S = 0.05334    R-Sq = 59.5%    R-Sq(adj) = 55.8%

Analysis of Variance
Source          DF        SS        MS      F      P
Regression       4  0.183634  0.045909  16.14  0.000
Residual Error  44  0.125174  0.002845
Total           48  0.308809
Light trucks is a significant positive variable. Since impcars has the smallest
t-statistic, it is removed from the model (coefficient table not preserved):

R-Sq = 59.4%    R-Sq(adj) = 56.7%

Analysis of Variance
Source          DF        SS        MS      F      P
Regression       3  0.183482  0.061161  21.96  0.000
Residual Error  45  0.125326  0.002785
Total           48  0.308809

Removing vehwt leaves light trucks and car age (coefficient table not preserved;
VIFs 1.1 and 1.1):

R-Sq = 56.5%    R-Sq(adj) = 54.6%

Analysis of Variance
Source          DF        SS        MS      F      P
Regression       2  0.174458  0.087229  29.87  0.000
Residual Error  46  0.134351  0.002921
Total           48  0.308809
The model has light trucks and car age as the significant variables. Note that car
age is marginally significant (p-value of .052) and hence could also be dropped
from the model.
c. The regression modeling indicates that the percentage of light trucks is
conditionally significant in all of the models and hence is an important predictor
in the model. Car age and imported cars are marginally significant predictors
when only light trucks is included in the model.
13.109
a. Correlation matrix:

Correlations: deaths, Purbanpop, Ruspeed, Prsurf (p-values beneath the
correlations)

          deaths  Purbanpo  Ruspeed
Purbanpo  -0.594
           0.000
Ruspeed    0.305    -0.224
           0.033     0.121
Prsurf    -0.556     0.207   -0.232
           0.000     0.153    0.109
Descriptive Statistics: deaths, Purbanpop, Prsurf, Ruspeed

Variable    N    Mean  Median  TrMean   StDev  SE Mean
deaths     49  0.1746  0.1780  0.1675  0.0802   0.0115
Purbanpo   49  0.5890  0.6311  0.5992  0.2591   0.0370
Prsurf     49  0.7980  0.8630  0.8117  0.1928   0.0275
Ruspeed    49  58.186  58.400  58.222   1.683    0.240

Variable   Minimum  Maximum      Q1      Q3
deaths      0.0569   0.5505  0.1240  0.2050
Purbanpo    0.0000   0.9689  0.4085  0.8113
Prsurf      0.2721   1.0000  0.6563  0.9485
Ruspeed     53.500   62.200  57.050  59.150
The proportion of urban population and rural roads that are surfaced are negatively
related to crash deaths. Average rural speed is positively related, but the
relationship is not as strong as the proportion of urban population and surfaced
roads. The simple correlation coefficients among the independent variables are
relatively low and hence multicollinearity should not be dominant in this model.
Note the relatively narrow range for average rural speed. This would indicate that
there is not much variability in this independent variable.
b. Multiple regression
Regression Analysis: deaths versus Purbanpop, Prsurf, Ruspeed
The regression equation is
deaths = 0.141 - 0.149 Purbanpop - 0.181 Prsurf + 0.00457 Ruspeed
Predictor      Coef   SE Coef      T      P  VIF
Constant     0.1408    0.2998   0.47  0.641
Purbanpo   -0.14946   0.03192  -4.68  0.000  1.1
Prsurf     -0.18058   0.04299  -4.20  0.000  1.1
Ruspeed    0.004569  0.004942   0.92  0.360  1.1

S = 0.05510    R-Sq = 55.8%    R-Sq(adj) = 52.8%

Analysis of Variance
Source          DF        SS        MS      F      P
Regression       3  0.172207  0.057402  18.91  0.000
Residual Error  45  0.136602  0.003036
Total           48  0.308809
The model has conditionally significant variables for percent urban population and
percent surfaced roads. Since average rural speed is not conditionally significant,
it is dropped from the model:
Dropping Ruspeed gives (coefficient table not preserved; VIFs 1.0 and 1.0):

R-Sq = 54.9%    R-Sq(adj) = 53.0%

Analysis of Variance
Source          DF        SS        MS      F      P
Regression       2  0.169612  0.084806  28.03  0.000
Residual Error  46  0.139197  0.003026
Total           48  0.308809
This becomes the final model, since both variables are conditionally significant.
c. Conclude that the proportion of urban population and the percent of rural
roads that are surfaced are important independent variables in explaining crash
deaths. All else equal, the higher the proportion of urban population, the lower
the crash deaths; all else equal, increases in the proportion of rural roads that
are surfaced will result in lower crash deaths. The average rural speed is not
conditionally significant.
13.110 a. Correlation matrix and descriptive statistics
Correlations: hseval, sizehse, Taxhse, Comper, incom72, totexp
          hseval   sizehse   Taxhse   Comper   incom72
sizehse    0.542
           0.000
Taxhse     0.248     0.289
           0.019     0.006
Comper    -0.335    -0.278   -0.114
           0.001     0.008    0.285
incom72    0.426     0.393    0.261   -0.198
           0.000     0.000    0.013    0.062
totexp     0.261    -0.022    0.228    0.269    0.376
           0.013     0.834    0.030    0.010    0.000
Descriptive Statistics: hseval, sizehse, Taxhse, Comper, incom72, totexp

Variable      N      Mean    Median    TrMean     StDev   SE Mean
hseval       90    21.031    20.301    20.687     4.957     0.522
sizehse      90    5.4778    5.4000    5.4638    0.2407    0.0254
Taxhse       90    130.13    131.67    128.31     48.89      5.15
Comper       90   0.16211   0.15930   0.16206   0.06333   0.00668
incom72      90    3360.9    3283.0    3353.2     317.0      33.4
totexp       90   1488848   1089110   1295444   1265564    133402

Variable    Minimum   Maximum        Q1        Q3
hseval       13.300    35.976    17.665    24.046
sizehse      5.0000    6.2000    5.3000    5.6000
Taxhse        35.04    399.60     98.85    155.19
Comper      0.02805   0.28427   0.11388   0.20826
incom72      2739.0    4193.0    3114.3    3585.3
totexp       361290   7062330    808771   1570275
The range for applying the regression model (variable means +/- 2 standard deviations):

hseval:    21.03 +/- 2(4.957)     =  11.11 to 30.94
sizehse:   5.48 +/- 2(.24)        =  5.00 to 5.96
Taxhse:    130.13 +/- 2(48.89)    =  32.35 to 227.91
Comper:    .16 +/- 2(.063)        =  .034 to .286
incom72:   3361 +/- 2(317)        =  2727 to 3995
totexp:    1488848 +/- 2(1265564) =  not a good approximation (the lower limit is negative)
b. Regression models:
Regression Analysis: hseval versus sizehse, Taxhse, ...
The regression equation is
hseval = - 31.1 + 9.10 sizehse - 0.00058 Taxhse - 22.2 Comper + 0.00120 incom72 + 0.000001 totexp

Predictor          Coef      SE Coef      T    VIF
Constant         -31.07        10.09  -3.08
sizehse           9.105        1.927   4.72    1.3
Taxhse        -0.000584     0.008910  -0.07    1.2
Comper          -22.197        7.108  -3.12    1.3
incom72        0.001200     0.001566   0.77    1.5
totexp       0.00000125   0.00000038   3.28    1.5

S = 3.785   R-Sq = 45.0%   R-Sq(adj) = 41.7%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5   982.98  196.60  13.72  0.000
Residual Error  84  1203.65   14.33
Total           89  2186.63
Taxhse is not conditionally significant, nor is income. Dropping one variable at a time, eliminate Taxhse first, then eliminate income:
R-Sq = 44.6%   R-Sq(adj) = 42.6%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       3   974.55  324.85  23.05  0.000
Residual Error  86  1212.08   14.09
Total           89  2186.63
This is the final regression model. All of the independent variables are conditionally significant. Both the size of house and total government expenditures enhance the market value of homes, while the percent of commercial property tends to reduce market values of homes.
c. In the final regression model, the tax variable was not found to be conditionally significant, and hence it is difficult to support the developer's claim.
13.111
a. Correlation matrix
Correlations: retsal84, Unemp84, perinc84
           retsal84   Unemp84
Unemp84      -0.370
              0.008
perinc84      0.633    -0.232
              0.000     0.101
There is a positive association between per capita income and retail sales, and a negative association between unemployment and retail sales. Multicollinearity does not appear to be a problem since the correlation between the two independent variables is relatively low.
Descriptive Statistics: retsal84, perinc84, Unemp84

Variable      N    Mean   Median  TrMean  StDev  SE Mean
retsal84     51    5536     5336    5483    812      114
perinc84     51   12277    12314   12166   1851      259
Unemp84      51   7.335    7.000   7.196  2.216    0.310

Variable    Minimum  Maximum     Q1     Q3
retsal84       4250     8348   5059   6037
perinc84       8857    17148  10689  13218
R-Sq = 45.3%   R-Sq(adj) = 43.0%

Analysis of Variance
Source          DF        SS       MS      F      P
Regression       2  14931938  7465969  19.88  0.000
Residual Error  48  18029333   375611
Total           50  32961271

Both predictors have VIF = 1.1.
This is the final model since all of the independent variables are conditionally significant at the .05 level. The 95% confidence intervals for the regression slope coefficients, bi +/- t(48, .025) s(bi):

b1 +/- t s(b1): -86.25 +/- 2.011(40.2) = -86.25 +/- 80.84
b2 +/- t s(b2): computed in the same way from the perinc84 coefficient and its standard error.
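The interval arithmetic can be verified in a few lines of Python; the numbers come from the output above, and labeling b1 as the Unemp84 slope is an assumption consistent with part b's interpretation of the perinc84 slope.

from scipy import stats

b1, se_b1 = -86.25, 40.2           # first slope and its standard error (from the output)
n, k = 51, 2                       # 51 observations, 2 predictors
t = stats.t.ppf(0.975, n - k - 1)  # ~2.011 with 48 degrees of freedom
print(f"{b1} +/- {t * se_b1:.2f}")  # -86.25 +/- 80.84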
b. All else equal, the conditional effect of a $1,000 decrease in per capita income on retail sales would be to reduce retail sales by $254.
c. The population variable is not conditionally significant and adds little explanatory power; therefore, it will not improve the multiple regression model.
13.112 a.
Correlations: FRH, FBPR, FFED, FM2, GDPH, GH
        FRH     FBPR    FFED    FM2     GDPH
FBPR    0.510
        0.000
FFED    0.244   0.957
        0.001   0.000
FM2     0.854   0.291   0.077
        0.000   0.000   0.326
GDPH    0.934   0.580   0.287   0.987
        0.000   0.000   0.000   0.000
GH      0.907   0.592   0.285   0.977   0.973
        0.000   0.000   0.000   0.000   0.000
The correlation matrix shows that both interest rates have a significant positive linear association with residential investment. The money supply, GDP, and government expenditures also have a significant linear association with residential investment. Note the high correlation between the two interest rate variables, which, as expected, would create significant problems if both variables were included in the same regression model. Hence, the two interest rates are entered in two separate models.
Regression Analysis: FRH versus FBPR, FM2, GDPH, GH
The regression equation is
FRH = 70.0 - 3.79 FBPR - 0.0542 FM2 + 0.0932 GDPH - 0.165 GH
166 cases used 52 cases contain missing values
Predictor        Coef    SE Coef      T      P    VIF
Constant        70.00      24.87   2.82  0.005
FBPR          -3.7871     0.6276  -6.03  0.000    1.2
FM2         -0.054210   0.009210  -5.89  0.000   46.8
GDPH         0.093223   0.007389  12.62  0.000   58.1
GH           -0.16514    0.03747  -4.41  0.000   28.7

S = 23.42   R-Sq = 86.7%   R-Sq(adj) = 86.3%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       4  573700  143425  261.42  0.000
Residual Error 161   88331     549
Total          165  662030
This will be the final model with the prime rate as the interest rate variable since all of the independent variables are conditionally significant. Note the significant multicollinearity among FM2, GDPH, and GH (VIFs well above 10).
Regression Analysis: FRH versus FFED, FM2, GDPH, GH
The regression equation is
FRH = 55.0 - 2.76 FFED - 0.0558 FM2 + 0.0904 GDPH - 0.148 GH
166 cases used 52 cases contain missing values
Predictor        Coef    SE Coef      T      P    VIF
Constant        55.00      26.26   2.09  0.038
FFED          -2.7640     0.6548  -4.22  0.000    1.2
FM2          -0.05578    0.01007  -5.54  0.000   50.7
GDPH         0.090402   0.007862  11.50  0.000   59.6
GH           -0.14752    0.03922  -3.76  0.000   28.5

S = 24.61   R-Sq = 85.3%   R-Sq(adj) = 84.9%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       4  564511  141128  233.00  0.000
Residual Error 161   97519     606
Total          165  662030
The model with the federal funds rate as the interest rate variable is also the final
model with all of the independent variables conditionally significant. Again, high
correlation among the independent variables will be a problem with this regression
model.
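The VIFs reported in both models can be reproduced as sketched below, assuming the predictors are columns of a pandas DataFrame; this is illustrative code, not the exercise's own.

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(X):
    # VIF for each predictor, computed with the intercept included
    # (the constant's own entry is skipped), as Minitab reports it.
    Xc = sm.add_constant(X.dropna())
    return pd.Series({col: variance_inflation_factor(Xc.values, i)
                      for i, col in enumerate(Xc.columns) if col != "const"}).round(1)

# vif_table(df[["FBPR", "FM2", "GDPH", "GH"]])   # expect roughly 1.2, 46.8, 58.1, 28.7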
b. 95% confidence intervals for the slope coefficient on the interest rate term.

Bank prime rate as the interest rate variable:
b1 +/- t s(b1): -3.7871 +/- 1.96(.6276) = -3.7871 +/- 1.23
13.113 a.
Correlations: Infmrt82, Phys82, Perinc84, Perhosp
           Infmrt82    Phys82  Perinc84
Phys82        0.434
              0.001
Perinc84      0.094     0.614
              0.511     0.000
Perhosp       0.411     0.285     0.267
              0.003     0.042     0.058
The correlation matrix shows a positive association between infant mortality and both Phys82 and Perhosp. These variables are the number of physicians per 100,000 population and the total per capita expenditures for hospitals. One would expect a negative association; therefore, examine the scatter diagram of infant mortality vs. Phys82:
[Scatterplot of Infmrt82 (vertical axis, approximately 10 to 20) against Phys82 (horizontal axis, approximately 100 to 550)]
The graph shows an obvious outlier which, upon further investigation, is the
District of Columbia. Due to the outlier status, this row is dropped from the
analysis and the correlation matrix is recalculated:
             Phys82  Perinc84
Perinc84      0.574
              0.000
Perhosp      -0.065     0.140
              0.654     0.331
The physicians per 100,000 population now has the correct sign; however, none of the independent variables has a statistically significant linear association with the dependent variable. Per capita expenditures for hospitals has an unexpected positive sign, but it is not conditionally significant. The multiple regression results are likely to yield low explanatory power with insignificant independent variables:
Regression Analysis: Infmrt82 versus Phys82, Perinc84, Perhosp
The regression equation is
Infmrt82 = 12.7 - 0.00017 Phys82 - 0.000206 Perinc84 + 6.30 Perhosp

Predictor          Coef     SE Coef      T      P    VIF
Constant         12.701       1.676   7.58  0.000
Phys82        -0.000167    0.006647  -0.03  0.980    1.5
Perinc84     -0.0002064   0.0001637  -1.26  0.214    1.6
Perhosp           6.297       3.958   1.59  0.118    1.1

S = 1.602   R-Sq = 8.9%   R-Sq(adj) = 3.0%

Analysis of Variance
Source          DF       SS     MS     F      P
Regression       3   11.546  3.849  1.50  0.227
Residual Error  46  118.029  2.566
Total           49  129.575
As expected, the model explains less than 9% of the variability in infant mortality. None of the independent variables is conditionally significant, and high correlation among the independent variables does not appear to be a significant problem. The standard error of the estimate is very large (1.602) relative to the size of the infant mortality rates, and hence the model would not be a good predictor. Sequentially dropping the independent variable with the lowest t-statistic confirms the conclusion that none of the independent variables is conditionally significant. The search is on for better independent variables.
b. The two variables to include are per capita spending on education (PerEduc)
and per capita spending on public welfare (PerPbwel). Since the conditional
significance of the independent variables is a function of other independent
variables in the model, we will include the original set of variables:
The model shows low explanatory power and only one conditionally significant independent variable (Perhosp). Sequentially dropping the independent variable with the lowest t-statistic yields a model with no conditionally significant independent variables. This problem illustrates that in some applications, the variables that have been identified as theoretically important predictors do not meet the statistical test.
13.114 a.
Correlations: Salary, age, yrs_asoc, yrs_full, Sex_1Fem, Market, C8
           Salary      age  yrs_asoc  yrs_full  Sex_1Fem   Market
age         0.749
            0.000
yrs_asoc    0.698    0.712
            0.000    0.000
yrs_full    0.777    0.583     0.312
            0.000    0.000     0.000
Sex_1Fem   -0.429   -0.234    -0.367    -0.292
            0.000    0.004     0.000     0.000
Market      0.026   -0.134    -0.113    -0.017     0.062
            0.750    0.103     0.169     0.833     0.453
C8         -0.029   -0.189    -0.073    -0.043    -0.094   -0.107
            0.721    0.020     0.373     0.598     0.254    0.192
The correlation matrix indicates several independent variables that should provide
good explanatory power in the regression model. We would expect that age, years
at Associate professor and years at full professor are likely to be conditionally
significant:
R-Sq = 85.6%   R-Sq(adj) = 85.0%

Analysis of Variance
Source          DF          SS         MS       F      P
Regression       6  5604244075  934040679  141.49  0.000
Residual Error 143   943982608    6601277
Total          149  6548226683
Dropping the variable with the smallest t-statistic (C8) and re-estimating:

R-Sq = 85.3%   R-Sq(adj) = 84.8%

Analysis of Variance
Source          DF          SS          MS       F      P
Regression       5  5585766862  1117153372  167.14  0.000
Residual Error 144   962459821     6683749
Total          149  6548226683
This is the final model. All of the independent variables are conditionally
significant and the model explains a sizeable portion of the variability in salary.
b. To test the hypothesis that the rate of change in female salaries as a function of
age is less than the rate of change in male salaries as a function of age, the
dummy variable Sex_1Fem is used to see if the slope coefficient for age (X1) is
different for males and females. The following model is used:
Y = β0 + (β1 + β6X4)X1 + β2X2 + β3X3 + β4X4 + β5X5
  = β0 + β1X1 + β6(X4X1) + β2X2 + β3X3 + β4X4 + β5X5
Create the variable X4X1 and then test for conditional significance in the regression model.
If it proves to be a significant predictor of salaries then there is strong evidence to
conclude that the rate of change in female salaries as a function of age is different than for
males:
Regression Analysis: Salary versus age, femage, ...
The regression equation is
Salary = 22082 + 85.1 age + 11.7 femage + 543 yrs_asoc + 701 yrs_full
- 1878 Sex_1Fem + 2673 Market
Predictor       Coef  SE Coef      T      P    VIF
Constant       22082     1877  11.77  0.000
age            85.07    48.36   1.76  0.081    4.4
femage         11.66    63.89   0.18  0.855   32.2
yrs_asoc      542.85    66.73   8.13  0.000    2.6
yrs_full      701.35    57.35  12.23  0.000    2.0
Sex_1Fem       -1878     2687  -0.70  0.486   31.5
Market        2672.8    825.1   3.24  0.001    1.0

S = 2594   R-Sq = 85.3%   R-Sq(adj) = 84.7%

Analysis of Variance
Source          DF          SS         MS       F      P
Regression       6  5585990999  930998500  138.36  0.000
Residual Error 143   962235684    6728921
Total          149  6548226683
The regression shows that the newly created variable of femage is not conditionally
significant. Thus, we cannot conclude that the rate of change in female salaries as a
function of age differs from that of male salaries.
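A sketch of how the interaction test in part b can be set up, assuming a hypothetical data file whose columns carry the names shown in the output above.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("salaries.csv")           # hypothetical file name
df["femage"] = df["Sex_1Fem"] * df["age"]  # slope-shift term: nonzero only for women

# The t-test on femage is the test of beta_6 in the model written above.
fit = smf.ols("Salary ~ age + femage + yrs_asoc + yrs_full + Sex_1Fem + Market",
              data=df).fit()
print(fit.summary())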
13.115
Regression Analysis: hseval versus sizehse, taxrate, incom72, Homper
The regression equation is
hseval = - 32.7 + 6.74 sizehse - 223 taxrate + 0.00464 incom72 + 11.2 Homper
Predictor       Coef   SE Coef      T      P    VIF
Constant     -32.694     8.972  -3.64  0.000
sizehse        6.740     1.880   3.58  0.001    1.4
taxrate      -222.96     45.39  -4.91  0.000    1.2
incom72     0.004642  0.001349   3.44  0.001    1.2
Homper        11.215     4.592   2.44  0.017    1.3

S = 3.610   R-Sq = 49.3%   R-Sq(adj) = 47.0%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       4  1079.08  269.77  20.70  0.000
Residual Error  85  1107.55   13.03
Total           89  2186.63
All of the independent variables are conditionally significant. Now add the percent of
commercial property to the model to see if it is significant:
With a t-statistic of only -.27, we do not have strong enough evidence to reject H0 that the slope coefficient on percent commercial property is zero. The conditional F test confirms this:

F(Comper) = (SSR_F - SSR_R) / s^2(Y|X) = (1080.07 - 1079.08) / 13.17 = 0.08

which is far below the critical value of F(1, 84) at any common level of alpha.
Next, percent industrial property is added to the four-variable base model:

R-Sq = 50.2%   R-Sq(adj) = 47.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5  1096.77  219.35  16.91  0.000
Residual Error  84  1089.86   12.97
Total           89  2186.63
Likewise, the percent industrial property is not significantly different from zero. The conditional F test:

F(Indper) = (SSR_5 - SSR_4) / s^2(Y|X) = (1096.77 - 1079.08) / 12.97 = 1.36

which is lower than the critical value of F at common levels of alpha; therefore, do not reject H0 that the percent industrial property has no effect on house values.
Tax rate models:
Regression Analysis: taxrate versus taxbase, expercap, Homper
The regression equation is
taxrate = - 0.0174 - 0.000000 taxbase + 0.000162 expercap + 0.0424 Homper

Predictor           Coef      SE Coef      T      P    VIF
Constant       -0.017399     0.007852  -2.22  0.029
taxbase      -0.00000000   0.00000000  -0.80  0.426    1.2
expercap      0.00016204   0.00003160   5.13  0.000    1.1
Homper          0.042361     0.009378   4.52  0.000    1.2

S = 0.007692   R-Sq = 31.9%   R-Sq(adj) = 29.5%

Analysis of Variance
Source          DF          SS          MS      F      P
Regression       3  0.00237926  0.00079309  13.41  0.000
Residual Error  86  0.00508785  0.00005916
Total           89  0.00746711
After taxbase is dropped, both of the remaining independent variables are significant. This becomes the base model to which we now add percent commercial property and percent industrial property sequentially:
Regression Analysis: taxrate versus expercap, Homper, Comper
The regression equation is
taxrate = - 0.0413 +0.000157 expercap + 0.0643 Homper + 0.0596 Comper
Predictor           Coef      SE Coef      T      P    VIF
Constant       -0.041343     0.008455  -4.89  0.000
expercap      0.00015660   0.00002819   5.55  0.000    1.1
Homper          0.064320     0.009172   7.01  0.000    1.4
Comper           0.05960      0.01346   4.43  0.000    1.3

S = 0.006966   R-Sq = 44.1%   R-Sq(adj) = 42.2%

Analysis of Variance
Source          DF         SS         MS      F      P
Regression       3  0.0032936  0.0010979  22.62  0.000
Residual Error  86  0.0041735  0.0000485
Total           89  0.0074671
The conditional F test for Comper:

F(Comper) = (SSR_F - SSR_R) / s^2(Y|X) = (0.0032936 - 0.00234) / (0.006966)^2 = 19.62

With 1 degree of freedom in the numerator and (90 - 3 - 1) = 86 degrees of freedom in the denominator, the critical value of F at the .05 level is 3.95. Hence we conclude that the percentage of commercial property has a statistically significant positive impact on the tax rate.
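The conditional F computation can be packaged as below; the arguments mirror the quantities in the formula above, and the Comper numbers are used as the example.

from scipy import stats

def partial_f(ssr_full, ssr_reduced, s2_full, df_denom, q=1):
    # F statistic for adding q variables, plus the .05 critical value.
    f = (ssr_full - ssr_reduced) / q / s2_full
    return f, stats.f.ppf(0.95, q, df_denom)

f, crit = partial_f(ssr_full=0.0032936, ssr_reduced=0.00234,
                    s2_full=0.006966 ** 2, df_denom=86)
print(f"F = {f:.2f}, F(.05; 1, 86) = {crit:.2f}")   # ~19.6 vs 3.95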
We now add industrial property to test the effect on tax rate:
Regression Analysis: taxrate versus expercap, Homper, Indper
The regression equation is
taxrate = - 0.0150 +0.000156 expercap + 0.0398 Homper - 0.0105 Indper
Predictor           Coef      SE Coef      T      P    VIF
Constant       -0.015038     0.009047  -1.66  0.100
expercap      0.00015586   0.00003120   5.00  0.000    1.1
Homper           0.03982      0.01071   3.72  0.000    1.6
Indper          -0.01052      0.01273  -0.83  0.411    1.5

S = 0.007690   R-Sq = 31.9%   R-Sq(adj) = 29.5%

Analysis of Variance
Source          DF          SS          MS      F      P
Regression       3  0.00238178  0.00079393  13.43  0.000
Residual Error  86  0.00508533  0.00005913
Total           89  0.00746711
The percent industrial property is insignificant with a t-statistic of only -.83. The F test confirms that the variable does not have a significant impact on the tax rate:

F(Indper) = (SSR_3 - SSR_2) / s^2(Y|X) = (0.002382 - 0.00234) / (0.007690)^2 = 0.71

which is well below the critical value of 3.95, so do not reject H0.
13.116
a. Correlation matrix:
Correlations: EconGPA, sex, Acteng, ACTmath, ACTss, ACTcomp, HSPct
          EconGPA      sex   Acteng  ACTmath    ACTss  ACTcomp
sex         0.187
            0.049
Acteng      0.387    0.270
            0.001    0.021
ACTmath     0.338   -0.170    0.368
            0.003    0.151    0.001
ACTss       0.442   -0.105    0.448    0.439
            0.000    0.375    0.000    0.000
ACTcomp     0.474   -0.084    0.650    0.765    0.812
            0.000    0.478    0.000    0.000    0.000
HSPct       0.362    0.216    0.173    0.290    0.224    0.230
            0.000    0.026    0.150    0.014    0.060    0.053
There is a positive relationship between EconGPA and all of the independent variables, as expected. Note the high correlation between the composite ACT score and its individual components, again as expected. Thus, high correlation among the independent variables is likely to be a serious concern in this regression model.
Regression Analysis: EconGPA versus sex, Acteng, ...
The regression equation is
EconGPA = - 0.050 + 0.261 sex + 0.0099 Acteng + 0.0064 ACTmath + 0.0270 ACTss + 0.0419 ACTcomp + 0.00898 HSPct
71 cases used 41 cases contain missing values
Predictor       Coef   SE Coef      T      P    VIF
Constant     -0.0504    0.6554  -0.08  0.939
sex           0.2611    0.1607   1.62  0.109    1.5
Acteng       0.00991   0.02986   0.33  0.741    2.5
ACTmath      0.00643   0.03041   0.21  0.833    4.3
ACTss        0.02696   0.02794   0.96  0.338    4.7
ACTcomp      0.04188   0.07200   0.58  0.563   12.8
HSPct       0.008978  0.005716   1.57  0.121    1.4

S = 0.4971   R-Sq = 34.1%   R-Sq(adj) = 27.9%

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       6   8.1778  1.3630  5.52  0.000
Residual Error  64  15.8166  0.2471
Total           70  23.9945
As expected, high correlation among the independent variables is affecting the results. Dropping the variable with the lowest t-statistic at each successive step removes, in order: 1) ACTmath, 2) Acteng, 3) ACTss, 4) HSPct. The two variables that remain, sex and ACTcomp, form the final model:
In the final model both predictors are conditionally significant: sex (p = 0.011) and ACTcomp (p = 0.000), each with VIF = 1.0; the constant is not significant (p = 0.538).

S = 0.4931   R-Sq = 29.4%   R-Sq(adj) = 27.3%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   7.0705  3.5352  14.54  0.000
Residual Error  70  17.0192  0.2431
Total           72  24.0897