ECM6Lecture11bVietnam 2014
Lecture 11b
Polynomial Regression
Brian G. Higgins
Department of Chemical Engineering & Materials Science
University of California, Davis
April 2014, Hanoi, Vietnam
Background
We will now extend the ideas from Lecture 11a to find the best fit in a least squares sense for an
arbitrary polynomial:
P(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + ... + a_n x^n
As before, we will be given a set of M data points in the form

{(x_k, y_k)},  k = 1, 2, ..., M
We know from calculus that to minimize the least-squares error

E^2(P) = Sum[(P(x_k) - y_k)^2, {k, 1, M}]

the desired values of a_i must satisfy

∂E^2(P)/∂a_i = 0,  i = 0, 1, ..., n

Note that the x_k's and y_k's are constants, so these conditions give n+1 linear equations for the n+1 parameters a_i.
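The normal equations above can also be assembled and solved numerically. Here is a minimal sketch in Python with NumPy (the function name and the illustrative data are mine, not from the lecture):

```python
import numpy as np

def polyfit_normal_equations(x, y, n):
    """Least-squares polynomial fit of degree n by solving the
    (n+1) x (n+1) normal equations dE^2/da_i = 0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Design matrix: column j holds x^j, j = 0..n
    A = np.vander(x, n + 1, increasing=True)
    # Normal equations: (A^T A) a = A^T y
    return np.linalg.solve(A.T @ A, A.T @ y)  # [a0, a1, ..., an]

# Illustrative data lying exactly on y = 1 + 2x; the fit recovers [1, 2]
a = polyfit_normal_equations([0, 1, 2, 3], [1, 3, 5, 7], 1)
```

For larger problems one would use an orthogonalization-based solver (e.g. `np.polyfit`) rather than forming A^T A explicitly, since the normal equations square the condition number.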
Example 1
Suppose we have the following data points
{{1, 5.12}, {3, 3}, {6, 2.48}, {9, 2.34}, {15, 2.18}}
and we want to find the parameters ak that give the least square error for a guess function
P(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3
Thus it is a simple matter to generate the various equations required to determine the parameters ai :
eqns =
 {D[E2[P, M], a0] == 0, D[E2[P, M], a1] == 0, D[E2[P, M], a2] == 0, D[E2[P, M], a3] == 0}

{2 (-5.12 + a0 + a1 + a2 + a3) +
   2 (-3 + a0 + 3 a1 + 9 a2 + 27 a3) + 2 (-2.48 + a0 + 6 a1 + 36 a2 + 216 a3) +
   2 (-2.34 + a0 + 9 a1 + 81 a2 + 729 a3) + 2 (-2.18 + a0 + 15 a1 + 225 a2 + 3375 a3) == 0,
 2 (-5.12 + a0 + a1 + a2 + a3) + 6 (-3 + a0 + 3 a1 + 9 a2 + 27 a3) +
   12 (-2.48 + a0 + 6 a1 + 36 a2 + 216 a3) + 18 (-2.34 + a0 + 9 a1 + 81 a2 + 729 a3) +
   30 (-2.18 + a0 + 15 a1 + 225 a2 + 3375 a3) == 0,
 2 (-5.12 + a0 + a1 + a2 + a3) + 18 (-3 + a0 + 3 a1 + 9 a2 + 27 a3) +
   72 (-2.48 + a0 + 6 a1 + 36 a2 + 216 a3) + 162 (-2.34 + a0 + 9 a1 + 81 a2 + 729 a3) +
   450 (-2.18 + a0 + 15 a1 + 225 a2 + 3375 a3) == 0,
 2 (-5.12 + a0 + a1 + a2 + a3) + 54 (-3 + a0 + 3 a1 + 9 a2 + 27 a3) +
   432 (-2.48 + a0 + 6 a1 + 36 a2 + 216 a3) + 1458 (-2.34 + a0 + 9 a1 + 81 a2 + 729 a3) +
   6750 (-2.18 + a0 + 15 a1 + 225 a2 + 3375 a3) == 0}
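The same four normal equations can be checked outside Mathematica. The following Python/NumPy sketch solves the cubic fit for the Example 1 data (variable names are mine; the exact coefficient values depend on the solver):

```python
import numpy as np

# Data points from Example 1
x = np.array([1.0, 3.0, 6.0, 9.0, 15.0])
y = np.array([5.12, 3.0, 2.48, 2.34, 2.18])

# Cubic least squares: solve the 4x4 normal equations for a0..a3
A = np.vander(x, 4, increasing=True)   # columns 1, x, x^2, x^3
a0, a1, a2, a3 = np.linalg.solve(A.T @ A, A.T @ y)

# Fitted polynomial and residual sum of squares E^2
P = a0 + a1 * x + a2 * x**2 + a3 * x**3
E2 = np.sum((P - y) ** 2)
```

At the solution the first normal equation, 2 * Sum(P(x_k) - y_k) = 0, is satisfied to rounding error, which is a quick sanity check on the solve.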
[Plot: the cubic fit P(x) together with the data points; horizontal axis x, vertical axis P(x).]
It is apparent from the plots that a cubic polynomial is not an exceptionally good fit visually. How can we evaluate the goodness of fit quantitatively? We leave this question for later.
Note as the order of the polynomial increases, so does its ability to oscillate.
Example 2
Problem Statement
Consider the following set of {x,y} data
{{2, 0.75}, {4, 0.1875}, {5, 0.1200}, {6, 0.0833}, {8, 0.0469}}
Determine the parameter values c and n in the model
y = c x^n
that gives the best fit of the data in a least squares sense.
Taking the natural logarithm of both sides gives the linear model

Y = m + n X

where the following definitions are used: Y = log_e y, X = log_e x, and m = log_e c.
Note that instead of using log_e to transform our power function into a linear model, we could have used log_10, or for that matter a log to any base.
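The transform-then-fit procedure can be sketched in Python with NumPy (variable names are mine; the lecture's plot code indicates the underlying model is y = 3 x^-2, so the fit should recover c ≈ 3 and n ≈ -2):

```python
import numpy as np

# {x, y} data from Example 2
x = np.array([2.0, 4.0, 5.0, 6.0, 8.0])
y = np.array([0.75, 0.1875, 0.1200, 0.0833, 0.0469])

# Transform: Y = ln y, X = ln x turns y = c x^n into Y = m + n X
X, Y = np.log(x), np.log(y)

# Linear least-squares fit; np.polyfit returns [slope, intercept]
n_fit, m = np.polyfit(X, Y, 1)
c = np.exp(m)   # undo the log transform to recover c
```

Note that least squares in (X, Y) is not identical to least squares in the original (x, y) variables: the log transform reweights the residuals, which is why the two fits can differ slightly.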
[Plot: the transformed data with the linear fit Y = P(X); horizontal axis X, vertical axis Y.]
Here is the plot in terms of the original variables (x, y)
plt1 = Plot[3 x^-2, {x, 2, 10}, PlotStyle -> {Blue, Thick}, Frame -> True,
   FrameLabel -> {Style["x", 16], Style["P(x)", 16]}, PlotRange -> All];
plt2 = ListPlot[data5, PlotStyle -> {PointSize[Large], Red}];
Show[plt1, plt2, PlotRange -> All]
[Plot: the fitted curve y = 3 x^-2 (blue) with the data points (red); horizontal axis x, vertical axis P(x).]
References
These notes and the examples were adapted from the following texts:

M. J. Maron, Numerical Analysis: A Practical Approach, 2nd Edition, Macmillan Publishing Company, 1987.

A. J. Pettofrezzo, Introductory Numerical Analysis, Dover Publications, 1984.