Curve Fitting - DS


CURVE FITTING

Curve fitting is the process of finding the “BEST FIT” curve for a given set of data. It is the
representation of the relationship between two variables by means of an algebraic equation.

On the basis of this mathematical equation, predictions can be made in many statistical problems.

Suppose a set of ‘n’ points of values (x1,y1), (x2,y2), ….., (xn,yn) of the two variables x and y are
given. These values are plotted on a rectangular coordinate system i.e., xy- plane. The resulting set
of points is known as a scatter diagram.

The scatter diagram exhibits the trend and it is possible to visualize a smooth curve
approximating the data. Such a curve is known as an approximating curve.

LEAST SQUARES METHOD:


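From a scatter diagram, more than one curve may generally appear to approximate the given set
of data. The method of least squares selects, among curves of a chosen form, the one for which the
sum of the squares of the deviations of the data points from the curve is smallest. The curve need
not pass through any of the data points.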

Let P (xi, yi) be a point on the scatter diagram. Let the ordinate at P meet the curve y=f(x) at
Q and the x-axis at M.

Distance QP = MP – MQ = yi – y = yi – f(xi)

The distance QP is known as deviation, error, or residual and is denoted by di. It may be positive,
negative, or zero depending upon whether P lies above, below, or on the curve.

Similar residuals or errors corresponding to the remaining (n-1) points may be obtained.

The sum of squares of residuals, denoted by E, is given by

$$E = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left[ y_i - f(x_i) \right]^2$$

If E = 0 then all the ‘n’ points will lie on the curve y = f(x). If E is not equal to zero, then f(x) is chosen
such that E is minimum, i.e., the best fitting curve to the set of points is that for which E is minimum.
(Minimizing E requires calculus, and this is a first illustration of how optimization problems
play a vital role in predictive analysis.)
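As a small illustration of the quantity E, the sketch below computes the sum of squared residuals for two candidate straight lines. The data points and both candidate lines are hypothetical and chosen only for illustration; the function name `sum_squared_residuals` is likewise an assumption, not something defined in the text.

```python
def sum_squared_residuals(f, xs, ys):
    """Return E = sum over all points of (y_i - f(x_i))^2."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, not taken from the text.
xs = [0, 1, 2, 3]
ys = [1.0, 3.1, 4.9, 7.0]


def line_a(x):
    return 1 + 2 * x        # candidate curve y = 1 + 2x


def line_b(x):
    return 2 + 1.5 * x      # candidate curve y = 2 + 1.5x


E_a = sum_squared_residuals(line_a, xs, ys)
E_b = sum_squared_residuals(line_b, xs, ys)

# The candidate with the smaller E is the better fit in the least-squares sense.
better = "line_a" if E_a < E_b else "line_b"
print(E_a, E_b, better)
```

Here the first line has the smaller E, so of these two candidates it is the better fit; the least-squares method goes further and finds the parameters that make E as small as possible.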

THIS METHOD IS KNOWN AS THE LEAST – SQUARES METHOD. This method does not attempt to
determine the form of the curve y = f(x) but determines the values of the parameters of the
equation of the curve.

FITTING A STRAIGHT LINE BY THE LEAST – SQUARES METHOD

Suppose a set of ‘n’ points of values (x1,y1), (x2,y2), ….., (xn,yn) of the two variables x and y are given.
And let the relationship between the variables ‘x’ and ‘y’ be y = a+bx. The constants ‘a’ and ‘b’ are
selected such that the straight line is the best fit to the data.
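The residual at the point (xi, yi) is di = yi – (a + bxi), so the sum of squares of residuals is

$$E = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$

For E to be minimum, the partial derivatives of E with respect to ‘a’ and ‘b’ must vanish. Setting
∂E/∂a = 0 and ∂E/∂b = 0 and simplifying gives

$$\sum y_i = na + b \sum x_i \qquad \text{(A)}$$

$$\sum x_i y_i = a \sum x_i + b \sum x_i^2 \qquad \text{(B)}$$

The above two equations (A) and (B) are known as normal equations. These equations can be solved
simultaneously to give the best values of ‘a’ and ‘b’. The best fitting straight line is obtained by
substituting the values of ‘a’ and ‘b’ in the equation y = a+bx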
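The solution of the two normal equations can be sketched in a few lines of Python. This is a minimal sketch, not a library routine: the function name `fit_line` and the sample points are assumptions made for illustration, and the closed-form expressions for ‘a’ and ‘b’ come from solving the pair Σy = na + bΣx and Σxy = aΣx + bΣx^2 by elimination.

```python
def fit_line(xs, ys):
    """Solve the normal equations for y = a + b*x and return (a, b).

    Normal equations:
        sum(y)   = n*a      + b*sum(x)
        sum(x*y) = a*sum(x) + b*sum(x^2)
    """
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    det = n * sxx - sx * sx          # determinant of the 2x2 system
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")

    b = (n * sxy - sx * sy) / det    # slope
    a = (sy - b * sx) / n            # intercept, from the first normal equation
    return a, b


# Hypothetical points lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```

Because the sample points lie exactly on a line, the fit recovers that line and E is zero.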

Example 1:

Fit a straight line to the following data. Also, estimate the value of ‘y’ at x=2.5

x 0 1 2 3 4
y 1 1.8 3.3 4.5 6.3

Solution: Let the straight line to be fitted to the data be y = a+bx

The normal equations are

$$\sum y = na + b \sum x$$

$$\sum xy = a \sum x + b \sum x^2$$

Here n = 5

        x     y      x^2   xy
        0     1      0     0
        1     1.8    1     1.8
        2     3.3    4     6.6
        3     4.5    9     13.5
        4     6.3    16    25.2
Total   10    16.9   30    47.1

Substituting these values in the normal equations

16.9 = 5a + 10b
47.1 = 10a + 30b
Solving the above system of equations
a = 0.72 and b = 1.33
Hence, the required straight line equation is y = 0.72 + 1.33 x
y (at x = 2.5) = 0.72 + 1.33(2.5) = 4.045

Example 2:

Fit a straight line to the following data taking ‘x’ as the dependent variable

X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9

Solution:

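If ‘x’ is considered as the dependent variable and ‘y’ as the independent variable, the equation of
the straight line to be fitted to the data is x = a+by

The normal equations, obtained by interchanging the roles of x and y, are

$$\sum x = na + b \sum y$$

$$\sum xy = a \sum y + b \sum y^2$$

Here n = 8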

        x     y     y^2   xy
        1     1     1     1
        3     2     4     6
        4     4     16    16
        6     4     16    24
        8     5     25    40
        9     7     49    63
        11    8     64    88
        14    9     81    126
Total   56    40    256   364

Substituting these values in the normal equations



56 = 8a + 40b

364 = 40a + 256b

Solving the above system of equations

a = -0.5 and b = 1.5

Hence the required equation of the straight line is x = -0.5 +1.5y
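The choice of dependent variable matters: fitting x on y is not the same as fitting y on x and then rearranging. The sketch below illustrates this with the data of Example 2. The helper `fit_line` and the variable names are assumptions introduced for illustration only.

```python
def fit_line(us, vs):
    """Least-squares fit of v = a + b*u; returns (a, b)."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suu = sum(u * u for u in us)
    suv = sum(u * v for u, v in zip(us, vs))
    b = (n * suv - su * sv) / (n * suu - su * su)
    a = (sv - b * su) / n
    return a, b


# Data of Example 2.
x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]

# x as the dependent variable: x = a_xy + b_xy * y (minimizes horizontal deviations).
a_xy, b_xy = fit_line(y, x)

# y as the dependent variable: y = c + d * x (minimizes vertical deviations),
# then rearranged algebraically into the form x = a_inv + b_inv * y.
c, d = fit_line(x, y)
a_inv, b_inv = -c / d, 1 / d

print(a_xy, b_xy)     # coefficients of the line of x on y
print(a_inv, b_inv)   # rearranged line of y on x; in general a different line
```

The two lines coincide only when all the points lie exactly on one straight line.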

Problem:

The results of a measurement of electric resistance R of a copper bar at various temperatures T
(in degrees Celsius) are listed below.

T (°C)   19   25   30   36   40   45   50
R        76   77   79   80   82   83   85

Find a relation R = a+bT, where ‘a’ and ‘b’ are constants to be determined.

Answer: R = 70.05 + 0.292 T

FITTING A PARABOLA BY THE LEAST – SQUARES METHOD

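Suppose a set of ‘n’ points of values (x1,y1), (x2,y2), ….., (xn,yn) of the two variables x and y are given.
And let the relationship between the variables ‘x’ and ‘y’ be y = a+bx+cx^2. The constants ‘a’, ‘b’, and
‘c’ are selected such that the parabola is the best fit to the data.

The sum of squares of residuals is

$$E = \sum_{i=1}^{n} (y_i - a - b x_i - c x_i^2)^2$$

For E to be minimum, the partial derivatives of E with respect to ‘a’, ‘b’ and ‘c’ must vanish.
Setting ∂E/∂a = 0, ∂E/∂b = 0 and ∂E/∂c = 0 and simplifying gives

$$\sum y_i = na + b \sum x_i + c \sum x_i^2 \qquad \text{(A)}$$

$$\sum x_i y_i = a \sum x_i + b \sum x_i^2 + c \sum x_i^3 \qquad \text{(B)}$$

$$\sum x_i^2 y_i = a \sum x_i^2 + b \sum x_i^3 + c \sum x_i^4 \qquad \text{(C)}$$

The equations (A), (B) and (C) are known as ‘normal equations’. These equations can be
solved simultaneously to give the best values of ‘a’, ‘b’ and ‘c’. The best fitting parabola is obtained
by substituting the values of ‘a’, ‘b’ and ‘c’ in the equation y = a+bx+cx^2.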
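The three normal equations for the parabola form a 3×3 linear system in ‘a’, ‘b’ and ‘c’. The sketch below builds that system from the sums Σx^k and Σx^k·y and solves it by Gaussian elimination with partial pivoting. The function names and the sample points are assumptions made for illustration; any linear solver could be substituted.

```python
def solve_linear_system(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least-squares parabola y = a + b*x + c*x^2."""
    s = [sum(x ** k for x in xs) for k in range(5)]                 # s[k] = sum x^k, s[0] = n
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # t[k] = sum x^k * y
    A = [[s[0], s[1], s[2]],
         [s[1], s[2], s[3]],
         [s[2], s[3], s[4]]]
    return tuple(solve_linear_system(A, t))


# Hypothetical points lying exactly on y = 1 + x^2.
a, b, c = fit_parabola([-1, 0, 1, 2], [2, 1, 2, 5])
print(a, b, c)
```

Since the sample points lie exactly on a parabola, the fit recovers its coefficients up to floating-point rounding.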

Example 3:

Fit a least – squares quadratic curve to the following data:

x 1 2 3 4
y 1.7 1.8 2.3 3.2

And also, estimate the value of y at x = 2.4 and x = 3.6

Solution:

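Let the relationship between the variables ‘x’ and ‘y’ be y = a+bx+cx^2.

The normal equations are

$$\sum y = na + b \sum x + c \sum x^2$$

$$\sum xy = a \sum x + b \sum x^2 + c \sum x^3$$

$$\sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4$$

Here n = 4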

        x     y     x^2   x^3   x^4   xy     x^2y
        1     1.7   1     1     1     1.7    1.7
        2     1.8   4     8     16    3.6    7.2
        3     2.3   9     27    81    6.9    20.7
        4     3.2   16    64    256   12.8   51.2
Total   10    9     30    100   354   25     80.8

Substituting these values in the normal equations

9 = 4a + 10b + 30c

25 = 10a + 30b + 100c

80.8 = 30a + 100b + 354c ,

Solving the above system of equations

a = 2, b = - 0.5, c = 0.2

Hence, the required equation of the least – squares quadratic curve is

y = 2 – 0.5x + 0.2x^2

y(2.4) = 1.952 and y(3.6) = 2.792
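The step "solving the above system of equations" can be written out explicitly. The elimination below is one possible route (Gaussian elimination on the three normal equations of this example); the equation labels (1) to (5) are introduced here only for reference.

```latex
\begin{aligned}
4a + 10b + 30c &= 9 && (1)\\
10a + 30b + 100c &= 25 && (2)\\
30a + 100b + 354c &= 80.8 && (3)\\[4pt]
(2) - 2.5\,(1):\quad 5b + 25c &= 2.5 && (4)\\
(3) - 7.5\,(1):\quad 25b + 129c &= 13.3 && (5)\\[4pt]
(5) - 5\,(4):\quad 4c &= 0.8 \;\Rightarrow\; c = 0.2\\
\text{from } (4):\quad b &= \frac{2.5 - 25(0.2)}{5} = -0.5\\
\text{from } (1):\quad a &= \frac{9 - 10(-0.5) - 30(0.2)}{4} = 2
\end{aligned}
```

Substituting x = 2.4 and x = 3.6 into y = 2 – 0.5x + 0.2x^2 then gives 2 – 1.2 + 1.152 = 1.952 and 2 – 1.8 + 2.592 = 2.792.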



Problem:

Fit a curve y = a+bx+cx^2 for the given data by using the method of least squares

X 3 5 7 9 11 13
Y 2 3 4 6 5 8

Answer: y = 0.7911 + 0.4000 x + 0.0089 x^2 (coefficients rounded to four decimal places)

Example 4:

Fit a curve for the following data

X 1 2 3 4 5 6
Y 2.51 5.82 9.93 14.84 20.55 27.06

Solution:

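Let the relationship between the variables ‘x’ and ‘y’ be y = ax+bx^2.

The sum of squares of residuals is E = Σ(y – ax – bx^2)^2. Setting ∂E/∂a = 0 and ∂E/∂b = 0 and
simplifying gives the normal equations

$$\sum xy = a \sum x^2 + b \sum x^3$$

$$\sum x^2 y = a \sum x^3 + b \sum x^4$$

Here n = 6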

        x     y       x^2   x^3   x^4    xy       x^2y
        1     2.51    1     1     1      2.51     2.51
        2     5.82    4     8     16     11.64    23.28
        3     9.93    9     27    81     29.79    89.37
        4     14.84   16    64    256    59.36    237.44
        5     20.55   25    125   625    102.75   513.75
        6     27.06   36    216   1296   162.36   974.16
Total   21    80.71   91    441   2275   368.41   1840.51

368.41 = 91a + 441b


1840.51 = 441a + 2275b
Solving the above system of equations: a = 2.11, b = 0.4
Hence the required equation of the curve is: y = 2.11x + 0.4x^2
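
The two normal equations of Example 4 involve only Σx^2, Σx^3, Σx^4, Σxy and Σx^2y, which is the pattern produced by a curve of the form y = ax + bx^2 (no constant term). The sketch below assumes that model and solves the resulting 2×2 system by Cramer's rule; the function name is hypothetical.

```python
def fit_ax_plus_bx2(xs, ys):
    """Return (a, b) minimizing sum (y - a*x - b*x^2)^2.

    Normal equations (assumed model y = a*x + b*x^2):
        sum(x*y)   = a*sum(x^2) + b*sum(x^3)
        sum(x^2*y) = a*sum(x^3) + b*sum(x^4)
    """
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))

    det = s2 * s4 - s3 * s3              # determinant of the 2x2 system
    if det == 0:
        raise ValueError("normal equations are singular")
    a = (t1 * s4 - t2 * s3) / det        # Cramer's rule for a
    b = (s2 * t2 - s3 * t1) / det        # Cramer's rule for b
    return a, b


# Data of Example 4.
x = [1, 2, 3, 4, 5, 6]
y = [2.51, 5.82, 9.93, 14.84, 20.55, 27.06]

a, b = fit_ax_plus_bx2(x, y)
residuals = [yi - (a * xi + b * xi ** 2) for xi, yi in zip(x, y)]
print(round(a, 2), round(b, 2), max(abs(r) for r in residuals))
```

For this data set the residuals are zero up to rounding, so every point lies on the fitted curve and E = 0.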
Problem:

Fit a second degree parabola to the following data


x 1 2 3 4 5
y 1.8 5.1 8.9 14.1 19.8
