17.1.6 General Comments On Linear Regression
Thus, the intercept, $\log a_2$, equals $-0.300$, and therefore, by taking the antilogarithm, $a_2 = 10^{-0.3} = 0.5$. The slope is $b_2 = 1.75$. Consequently, the power equation is

$$y = 0.5x^{1.75}$$

This curve, as plotted in Fig. 17.10a, indicates a good fit.
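The back-transformation from the log-log line to the power equation can be checked numerically. The following is a minimal sketch, assuming base-10 logarithms as in the passage; the names `power_from_loglog` and `power_model` are illustrative and do not come from the text.

```python
def power_from_loglog(log_a, b):
    """Return (a, b) of y = a * x**b from a base-10 log-log straight line.

    log_a is the intercept of log10(y) versus log10(x); b is the slope.
    """
    return 10.0 ** log_a, b


# Intercept and slope quoted in the text.
a2, b2 = power_from_loglog(-0.300, 1.75)


def power_model(x):
    """Evaluate the fitted power equation y = a2 * x**b2."""
    return a2 * x ** b2


# The antilogarithm of the intercept, rounded to one decimal as in the text.
print(round(a2, 1), b2)
```

Because the slope of the log-log line is already the exponent, only the intercept needs to be transformed back.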
17.2 POLYNOMIAL REGRESSION

Following the procedure of the previous section, we take the derivative of Eq. (17.18) with respect to each of the unknown coefficients of the polynomial, as in

$$\frac{\partial S_r}{\partial a_0} = -2 \sum \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$

$$\frac{\partial S_r}{\partial a_1} = -2 \sum x_i \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$

$$\frac{\partial S_r}{\partial a_2} = -2 \sum x_i^2 \left( y_i - a_0 - a_1 x_i - a_2 x_i^2 \right)$$
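These equations can be set equal to zero and rearranged to develop the following set of normal equations:

$$
\begin{aligned}
(n)\,a_0 + \left(\sum x_i\right) a_1 + \left(\sum x_i^2\right) a_2 &= \sum y_i \\
\left(\sum x_i\right) a_0 + \left(\sum x_i^2\right) a_1 + \left(\sum x_i^3\right) a_2 &= \sum x_i y_i \\
\left(\sum x_i^2\right) a_0 + \left(\sum x_i^3\right) a_1 + \left(\sum x_i^4\right) a_2 &= \sum x_i^2 y_i
\end{aligned}
$$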
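As an illustration of how the normal equations for a quadratic are assembled from data, the sketch below builds the coefficient matrix and right-hand side from power sums. The function name `quadratic_normal_equations` and the data are assumptions made for this example: the points are synthetic and noise-free, generated from $y = 1 + 2x + 3x^2$, so the known coefficients should satisfy the assembled system exactly.

```python
def quadratic_normal_equations(xs, ys):
    """Assemble the 3x3 normal equations for y = a0 + a1*x + a2*x**2."""
    # s[k] is the sum of x_i**k for k = 0..4; s[0] equals the number of points n.
    s = [sum(x ** k for x in xs) for k in range(5)]
    # Entry (i, j) of the coefficient matrix is the power sum of order i + j.
    A = [[s[i + j] for j in range(3)] for i in range(3)]
    # Right-hand side: sums of y_i, x_i*y_i, and x_i**2*y_i.
    b = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(3)]
    return A, b


# Synthetic data (illustrative only): exact values of 1 + 2x + 3x^2 at x = 0..5.
a_true = [1, 2, 3]
xs = [0, 1, 2, 3, 4, 5]
ys = [a_true[0] + a_true[1] * x + a_true[2] * x * x for x in xs]

A, b = quadratic_normal_equations(xs, ys)

# Substituting the generating coefficients should reproduce the right-hand side.
residual = [sum(A[i][j] * a_true[j] for j in range(3)) - b[i] for i in range(3)]
print(A)
print(residual)
```

The matrix is symmetric, and only five distinct power sums are needed to fill its nine entries.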
$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}} \qquad (17.20)$$
474 LEAST-SQUARES REGRESSION
TABLE 17.4 Computations for an error analysis of the quadratic least-squares fit.
FIGURE 17.11
Fit of a second-order polynomial. (Plot of the data and the least-squares parabola versus x.)
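$$
\begin{bmatrix}
6 & 15 & 55 \\
15 & 55 & 225 \\
55 & 225 & 979
\end{bmatrix}
\begin{Bmatrix} a_0 \\ a_1 \\ a_2 \end{Bmatrix}
=
\begin{Bmatrix} 152.6 \\ 585.6 \\ 2488.8 \end{Bmatrix}
$$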
Solving these equations through a technique such as Gauss elimination gives $a_0 = 2.47857$, $a_1 = 2.35929$, and $a_2 = 1.86071$. Therefore, the least-squares quadratic equation for this case is

$$y = 2.47857 + 2.35929x + 1.86071x^2$$
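The solution step can be reproduced with a short routine. The following is a minimal sketch of Gauss elimination with partial pivoting, applied to the 3×3 system given above; the function name `gauss_solve` is an assumption for this example, and pivoting is added as a common safeguard rather than something the text prescribes.

```python
def gauss_solve(A, b):
    """Solve A x = b by Gauss elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        # Partial pivoting: bring the largest entry in column k onto the diagonal.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        # Forward elimination of the entries below the pivot.
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Coefficient matrix and right-hand side of the normal equations from the text.
A = [[6, 15, 55], [15, 55, 225], [55, 225, 979]]
b = [152.6, 585.6, 2488.8]

a0, a1, a2 = gauss_solve(A, b)
print(f"a0 = {a0:.5f}, a1 = {a1:.5f}, a2 = {a2:.5f}")
```

Rounded to five decimals, the computed coefficients agree with the values quoted in the text.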
The standard error of the estimate based on the regression polynomial is [Eq. (17.20)]

$$s_{y/x} = \sqrt{\frac{3.74657}{6 - 3}} = 1.12$$
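The arithmetic of Eq. (17.20) can be written as a small helper. This is a sketch only; the function name `standard_error` and its argument names are assumptions, with `sr` standing for $S_r$, `n` for the number of data points, and `m` for the order of the polynomial.

```python
import math


def standard_error(sr, n, m):
    """Standard error of the estimate for an m-th order polynomial fit to n points."""
    # An m-th order polynomial uses m + 1 coefficients, leaving n - (m + 1)
    # degrees of freedom.
    dof = n - (m + 1)
    if dof <= 0:
        raise ValueError("need more data points than fitted coefficients")
    return math.sqrt(sr / dof)


# Values from the quadratic example: S_r = 3.74657, n = 6 points, m = 2.
s_yx = standard_error(3.74657, n=6, m=2)
print(round(s_yx, 2))
```

The guard on the degrees of freedom reflects that the formula is undefined when the number of points does not exceed the number of coefficients.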
PROBLEMS
... Determine (a) the mean, (b) the standard deviation, (c) the variance, (d) the coefficient of variation, and (e) the 90% confidence interval for the mean. (f) Construct a histogram. Use a range from 28 to 34 with increments of 0.4. (g) Assuming that the distribution is normal and that your estimate of the standard deviation is valid, compute the range (that is, the lower and the upper values) that encompasses 68% of the readings. Determine whether this is a valid estimate for the data in this problem.

17.3 Use least-squares regression to fit a straight line to

...

$$y = a_1 x + e$$

That is, determine the slope that results in the least-squares fit for a straight line with a zero intercept. Fit the following data with this model and display the result graphically:

x    2    4    6    7    10   11   14   17   20
y    1    2    5    2    8    7    6    9    12

... (a) Along with the slope and intercept, compute the standard error of the estimate and the correlation coefficient. Plot the data and the straight line. Assess the fit.
(b) Recompute (a), but use polynomial regression to fit a parabola to the data. Compare the results with those of (a).

17.7 Fit the following data with (a) a saturation-growth-rate model, (b) a power equation, and (c) a parabola. In each case, plot the data and the equation.

x    0.75   2      3      4      6      8      8.5
y    1.2    1.95   2      2.4    2.4    2.7    2.6

17.8 Fit the following data with the power model ($y = ax^b$). Use the resulting power equation to predict y at $x = 9$:

x    2.5   3.5   5   6   7.5   10   12.5   15   17.5   20

...

$$y = a_4 x e^{b_4 x}$$
Linearize this model and use it to estimate $a_4$ and $b_4$ based on the following data. Develop a plot of your fit along with the data.

x    0.1    0.2    0.4    0.6    0.9    1.3    1.5    1.7    1.8
y    0.75   1.25   1.45   1.25   0.85   0.55   0.35   0.28   0.18

17.12 An investigator has reported the data tabulated below for an experiment to determine the growth rate of bacteria k (per d), as a function of oxygen concentration c (mg/L). It is known that such data can be modeled by the following equation:

$$k = \frac{k_{\max} c^2}{c_s + c^2}$$

where $c_s$ and $k_{\max}$ are parameters. Use a transformation to linearize this equation. Then use linear regression to estimate $c_s$ and $k_{\max}$ and predict the growth rate at $c = 2$ mg/L.

c    0.5   0.8   1.5   2.5   4
k    1.1   2.4   5.3   7.6   8.9

17.13 An investigator has reported the data tabulated below. It is known that such data can be modeled by the following equation

$$x = e^{(y - b)/a}$$

where a and b are parameters. Use a transformation to linearize this equation and then employ linear regression to determine a and b. Based on your analysis predict y at $x = 2.6$.

x    1     2     3     4     5
y    0.5   2     2.9   3.5   4

17.14 It is known that the data tabulated below can be modeled by the following equation

$$y = \left( \frac{a + \sqrt{x}}{b \sqrt{x}} \right)^2$$

... Determine the coefficients by setting up and solving Eq. (17.25).

17.16 Given these data

x    5    10   15   20   25   30   35   40   45   50
y    17   24   31   33   37   37   40   40   42   41

use least-squares regression to fit (a) a straight line, (b) a power equation, (c) a saturation-growth-rate equation, and (d) a parabola. Plot the data along with all the curves. Is any one of the curves superior? If so, justify.

17.17 Fit a cubic equation to the following data:

x    3     4     5     7     8     9     11    12
y    1.6   3.6   4.4   3.4   2.2   2.8   3.8   4.6

Along with the coefficients, determine $r^2$ and $s_{y/x}$.

17.18 Use multiple linear regression to fit

x1   0      1      1      2      2      3      3      4      4
x2   0      1      2      1      2      1      2      1      2
y    15.1   17.9   12.7   25.6   20.5   35.1   29.7   45.4   40.2

Compute the coefficients, the standard error of the estimate, and the correlation coefficient.

17.19 Use multiple linear regression to fit

x1   0    0    1    2    0    1    2    2    1
x2   0    2    2    4    4    6    6    2    1
y    14   21   11   12   23   23   14   6    11

Compute the coefficients, the standard error of the estimate, and the correlation coefficient.

17.20 Use nonlinear regression to fit a parabola to the following data:
... data trend shows a linear relationship. Use least-squares regression to determine a best-fit equation for these data.

N, cycles      1      10     100    1000   10,000   100,000   1,000,000
Stress, MPa    1100   1000   925    800    625      550       420

17.25 The following data show the relationship between the viscosity of SAE 70 oil and temperature. After taking the log of the data, use linear regression to find the equation of the line that best fits the data and the $r^2$ value.

Temperature, °C         26.67   93.33   148.89   315.56
Viscosity, μ, N·s/m²    1.35    0.085   0.012    0.00075

17.26 The data below represent the bacterial growth in a liquid culture over a number of days.

Day             0    4    8    12    16    20
Amount × 10^6   67   84   98   125   149   185

Find a best-fit equation to the data trend. Try several possibilities: linear, parabolic, and exponential. Use the software package of your choice to find the best equation to predict the amount of bacteria after 40 days.

17.27 The concentration of E. coli bacteria in a swimming area is monitored after a storm:

t (hr)            4      8      12     16    20    24
c (CFU/100 mL)    1600   1320   1000   890   650   560

The time is measured in hours following the end of the storm and the unit CFU is a "colony forming unit." Use these data to estimate (a) the concentration at the end of the storm ($t = 0$) and (b) the time at which the concentration will reach 200 CFU/100 mL. Note that your choice of model should be consistent with the fact that negative concentrations are impossible and that the bacteria concentration always decreases with time.

17.28 An object is suspended in a wind tunnel and the force measured for various levels of wind velocity. The results are tabulated below.

v, m/s   10   20   30    40    50    60     70    80
F, N     25   70   380   550   610   1220   830   1450

Use least-squares regression to fit these data with (a) a straight line, (b) a power equation based on log transformations, and (c) a power model based on nonlinear regression. Display the results graphically.

17.29 Fit a power model to the data from Prob. 17.28, but use natural logarithms to perform the transformations.

17.30 Derive the least-squares fit of the following model:

$$y = a_1 x + a_2 x^2 + e$$

That is, determine the coefficients that result in the least-squares fit for a second-order polynomial with a zero intercept. Test the approach by using it to fit the data from Prob. 17.28.

17.31 In Prob. 17.11 we used transformations to linearize and fit the following model:

$$y = a_4 x e^{b_4 x}$$

Use nonlinear regression to estimate $a_4$ and $b_4$ based on the following data. Develop a plot of your fit along with the data.

x    0.1    0.2    0.4    0.6    0.9    1.3    1.5    1.7    1.8
y    0.75   1.25   1.45   1.25   0.85   0.55   0.35   0.28   0.18