Maths Notes


Chapter 1: Errors in Numerical Computations

1. Introduction

In this chapter, we first present some mathematical preliminary theorems that are useful throughout the study of numerical analysis. The chapter then describes the various kinds of errors that may occur in a problem. The representation of numbers in computers is introduced. General results on the propagation of errors in numerical computation are also given. Finally, the concepts of stability and conditioning of problems and a brief idea of the convergence of numerical methods are introduced.

2. Mathematical preliminaries

Theorem 1.1. (Intermediate value theorem)

Let f(x) be a real-valued continuous function on the finite interval [a, b], and define

m = inf_{a ≤ x ≤ b} f(x),   M = sup_{a ≤ x ≤ b} f(x).

Then for any number µ in [m, M], there exists at least one point ξ in [a, b] for which f(ξ) = µ.

In particular, there are points ξ1 and ξ2 in [a, b] for which m = f(ξ1) and M = f(ξ2).

Theorem 1.2. (Mean value theorem)

Let f(x) be a real-valued continuous function on the finite interval [a, b] that is differentiable in (a, b). Then there exists at least one point ξ in (a, b) for which

f(b) − f(a) = (b − a) f′(ξ).


Theorem 1.3. (Integral mean value theorem)

Let w(x) be nonnegative and integrable on [a, b], and let f(x) be continuous on [a, b]. Then

∫_a^b w(x) f(x) dx = f(ξ) ∫_a^b w(x) dx,   for some ξ ∈ [a, b].

One of the most important and useful tools in numerical analysis for approximating a function f(x) by polynomials is Taylor's theorem and the associated Taylor series. The polynomials arising in Taylor expansions are used extensively in numerical analysis.

Theorem 1.4. (Taylor’s theorem)

Let f(x) be a real-valued continuous function on the finite interval [a, b] having n + 1 continuous derivatives on [a, b] for some n ≥ 0, and let x, x0 ∈ [a, b]. Then

f(x) = Pn(x) + Rn+1(x),

where

Pn(x) = f(x0) + (x − x0)/1! f′(x0) + ... + (x − x0)^n/n! f^(n)(x0)

and

Rn+1(x) = (x − x0)^(n+1)/(n + 1)! f^(n+1)(ξ),   a < ξ < b.

• Taylor’s theorem in two dimensions

Let f(x, y) be a function of two independent variables x and y, and suppose f(x, y) possesses continuous partial derivatives of order n in some neighbourhood N of a point (a, b) in the domain of definition of f(x, y). Let (a + h, b + k) ∈ N. Then there exists a positive number θ (0 < θ < 1) such that

f(a + h, b + k) = f(a, b) + (h ∂/∂x + k ∂/∂y) f(a, b) + 1/2! (h ∂/∂x + k ∂/∂y)^2 f(a, b) + ... + 1/(n − 1)! (h ∂/∂x + k ∂/∂y)^(n−1) f(a, b) + Rn,

where

Rn = 1/n! (h ∂/∂x + k ∂/∂y)^n f(a + θh, b + θk),   0 < θ < 1.

Rn is called the remainder after n terms, and the theorem is called Taylor's theorem with remainder, or Taylor's expansion about the point (a, b).

3. Approximate numbers and significant figures

Significant figures:

A significant figure is any one of the digits 1, 2, 3, … and 0.

In the number 0.00134, the significant figures are 1, 3, 4. The zeros are used here merely to fix the decimal point and are therefore not significant. But in the number 0.1204, the significant figures are 1, 2, 0, 4. Similarly, 1.00317 has 6 significant figures.

• Rules of significant figures

Rule 1: All non-zero digits 1,2,…,9 are significant.

Rule 2: Zeros between non-zero digits are significant. E.g., in reading the measurement 9.04 cm, the zero represents a measured quantity, just as 9 and 4 do, and is therefore a significant figure. Similarly, there are four significant figures in the number 1005.

Rule 3: Zeros to the left of the first non-zero digit in a number are not significant, e.g.,

0.0026. Also, in the measurement 0.07 kg, the zeros are used merely to locate the decimal

point and are, therefore, not significant.

Rule 4: When a number ends in zeros that are to the right of the decimal point, the zeros are significant. E.g., in the number 0.0200 there are three significant figures. Another example: in reading the measurement 11.30 cm, the zero is an estimate and represents a measured quantity; it is therefore significant. Thus, zeros to the right of the decimal point and at the end of the number are significant figures.

Rule 5: When a number ends in zeros that are not to the right of the decimal point, the zeros are not necessarily significant. E.g., if a distance is reported as 1200 feet, one may assume two significant figures. However, reporting measurements in scientific notation removes all doubt, since all digits written in scientific notation are considered significant.

1200 feet = 1.2 × 10^3 feet      two significant figures
1200 feet = 1.20 × 10^3 feet     three significant figures
1200 feet = 1.200 × 10^3 feet    four significant figures

Thus, we may conclude that if a zero represents a measured quantity, it is a significant figure.

If it merely locates the decimal point, it is not a significant figure.

4. Rounding off numbers

Numbers are rounded off so as to cause the least possible errors.

The following is the general rule for rounding off a number to n significant digits:

Discard all digits to the right of the n-th place. If the discarded number is less than half a unit in the n-th place, leave the n-th digit unchanged; if the discarded number is greater than half a unit in the n-th place, add 1 to the n-th digit. If the discarded number is exactly half a unit in the n-th place, leave the n-th digit unaltered if it is an even number, but increase it by 1 if it is an odd number.

When a number is rounded off according to the rule just stated, it is said to be correct to n significant digits.

To illustrate, the following numbers are corrected to 4 significant figures:

27.1345 becomes 27.13

27.8793 becomes 27.88

27.355 becomes 27.36

27.365 becomes 27.36
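The rule stated above is the round-half-to-even convention applied at the n-th significant digit. As an illustrative sketch (not part of the original notes; the function name is our own), it can be implemented with Python's standard decimal module, and it reproduces the four worked examples:

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_significant(x, n):
    """Round x to n significant digits, resolving exact ties to the
    nearest even digit, as in the rule stated above."""
    d = Decimal(str(x))
    # d.adjusted() is the exponent of the leading digit, so the n-th
    # significant digit sits at the place 10**(d.adjusted() - n + 1).
    quantum = Decimal(1).scaleb(d.adjusted() - n + 1)
    return d.quantize(quantum, rounding=ROUND_HALF_EVEN)

# The worked examples, corrected to 4 significant figures:
for x in ("27.1345", "27.8793", "27.355", "27.365"):
    print(x, "->", round_significant(x, 4))
# 27.1345 -> 27.13, 27.8793 -> 27.88, 27.355 -> 27.36, 27.365 -> 27.36
```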

We now proceed to classify the ways in which error enters a numerical computation. Let us start with some simple definitions concerning error.

(i) Absolute error:

Let xT be the exact or true value of a number and xA be its approximate value; then |xT − xA| is called the absolute error Ea. Therefore, the absolute error is

Ea ≡ |xT − xA|.

(ii) Relative and percentage errors:

The relative error is defined by

Er ≡ |xT − xA| / |xT|,   provided xT ≠ 0 or xT is not close to zero.

The percentage relative error is defined by

Ep ≡ Er × 100 = |xT − xA| / |xT| × 100,   provided xT ≠ 0 or xT is not close to zero.

(iii) Inherent error:

The inherent error is that error which is already present in the statement of the problem before its solution. It arises either from the simplifying assumptions made in the mathematical formulation of the problem or from the physical measurements of the parameters of the problem. Inherent error cannot be completely eliminated, but it can be minimized by selecting better data or by employing high-precision computation.

(iv) Round-off and chopping errors:

A number x is said to be rounded correct to a d-decimal digit number x(d) if the error satisfies

|x − x(d)| ≤ 1/2 × 10^(−d).  (4.1)

The error arising out of rounding a number, as defined in eq. (4.1), is known as round-off error.

Let an arbitrary given real number x have a representation of the following form:

x = .d1 d2 ... dk dk+1 ... × b^e,  (4.2)

where b is the base, d1 ≠ 0, d2, …, dk are integers satisfying 0 ≤ di ≤ b − 1, and the exponent e is such that emin ≤ e ≤ emax. The fractional part .d1 d2 ... dk dk+1 ... is called the mantissa, and it lies between −1 and 1.

Now, the floating-point number fl ( x) in k -digit mantissa standard form can be obtained in

the following two ways:

(a) Chopping: In this case, we simply discard the digits dk+1, dk+2, … in eq. (4.2), and obtain

fl(x) = .d1 d2 ... dk × b^e.  (4.3)

(b) Rounding: In this case, fl(x) is chosen as the normalized floating-point number nearest to x, together with the rule of symmetric rounding, according to which, if the truncated part is exactly half a unit in the k-th position, then the k-th digit is rounded up if it is odd and left unchanged if it is even.

Thus the relative error for the k-digit mantissa standard form representation of x satisfies

|x − fl(x)| / |x| ≤ b^(1−k)        for chopping,
|x − fl(x)| / |x| ≤ 1/2 b^(1−k)    for rounding.  (4.4)

Therefore, the bound on the relative error of a floating-point number is halved when rounding is used instead of chopping. For this reason, rounding is used on most computers.

Example 1:

Approximate values of 1/6 and 1/13, correct to 4 decimal places, are 0.1667 and 0.0769 respectively. Find the possible relative error and absolute error in the sum of 0.1667 and 0.0769.

7
Solution:

Let x = 0.1667 and y = 0.0769.

The maximum absolute error in each of x and y is given by

1/2 × 10^(−4) = 0.00005.

(i) Relative error in (x + y)A:

Er[(x + y)A] = |(x + y)T − (x + y)A| / (x + y)T ≤ Ea(x)/(x + y)T + Ea(y)/(x + y)T ≤ 0.00005/0.1667 + 0.00005/0.0769 ≤ 0.000950135.

(ii) Absolute error in (x + y)A:

Ea[(x + y)A] = Er(x + y) × (x + y)T ≤ 0.000950135 × 0.2436 = 0.000231452886.

Example 2:

If the number π = 4 tan^(−1)(1) is approximated using 5 significant digits, find the percentage relative error due to

(i) Chopping,

(ii) Rounding.

Solution:

(i) The percentage relative error due to chopping is given by

|π − 3.1415| / π × 100 = 0.00294926%.

(ii) The percentage relative error due to rounding is given by

|π − 3.1416| / π × 100 = 0.000233843%.

From the above errors, it may be easily observed that rounding reduces error.
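To make the comparison concrete, here is a small Python sketch under the k-digit decimal-mantissa model of this section (an illustration only; real hardware uses binary IEEE formats). It chops and rounds π to a 5-digit mantissa and reproduces the relative errors of Example 2:

```python
import math

def fl(x, k, b=10, mode="round"):
    """Return x represented with a k-digit base-b mantissa,
    obtained either by chopping or by rounding."""
    if x == 0.0:
        return 0.0
    e = math.floor(math.log(abs(x), b)) + 1   # exponent so mantissa is in [1/b, 1)
    m = x / b**e                              # normalized mantissa
    scaled = m * b**k
    scaled = math.trunc(scaled) if mode == "chop" else round(scaled)
    return scaled * float(b)**(e - k)

x = math.pi
for mode in ("chop", "round"):
    approx = fl(x, 5, mode=mode)
    print(mode, approx, abs(x - approx) / x * 100, "%")
# chop  -> 3.1415, relative error ~0.00294926 %
# round -> 3.1416, relative error ~0.000233843 %
```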

5. Truncation errors

These are the errors due to the approximate formulae used in computations; truncation errors result from approximate formulae which are generally based on truncated series. The study of this error is usually associated with the problem of convergence.

For example, let us assume that a function f(x) and all its higher-order derivatives with respect to the independent variable x at a point, say x = x0, are known. Now, in order to find the function value at a neighbouring point, say x = x0 + ∆x, one can use the Taylor series expansion of the function f(x) about x = x0:

f(x) = f(x0) + (x − x0) f′(x0) + (x − x0)^2/2! f″(x0) + ...  (5.1)

The right-hand side of the above equation is an infinite series, and one has to truncate it after some finite number of terms to calculate f(x0 + ∆x), either with a computer or by manual calculation.

If the series is truncated after n terms, then this is equivalent to approximating f(x) by a polynomial of degree n − 1. Therefore, we have

f(x) ≈ Pn−1(x) = f(x0) + (x − x0) f′(x0) + (x − x0)^2/2! f″(x0) + ... + (x − x0)^(n−1)/(n − 1)! f^(n−1)(x0).  (5.2)

The truncation error is given by

ET(x) = f(x) − Pn−1(x) = (x − x0)^n/n! f^(n)(ξ).  (5.3)

Now, let

Mn(x) = max_{a ≤ ξ ≤ x} |f^(n)(ξ)|.  (5.4)

Then the bound of the truncation error is given by

|ET(x)| ≤ Mn(x) |x − x0|^n / n!.  (5.5)

If h = x − x0, then the truncation error ET(x) is said to be of order O(h^n).

Hence, from eq. (5.2) an approximate value of f(x0 + ∆x) can be obtained, with the truncation error estimate as given in eq. (5.3).

Example 3:

Obtain a second-degree polynomial approximation to

f(x) = (1 + x)^(2/3),   x ∈ [0, 0.1],

using the Taylor series expansion about x = 0. Use the expansion to approximate f(0.04), and find a bound of the truncation error.

Solution:

We have

f(x) = (1 + x)^(2/3),            f(0) = 1,
f′(x) = 2 / (3(1 + x)^(1/3)),    f′(0) = 2/3,
f″(x) = −2 / (9(1 + x)^(4/3)),   f″(0) = −2/9,
f‴(x) = 8 / (27(1 + x)^(7/3)).

Thus, the Taylor series expansion with the remainder term is given by

(1 + x)^(2/3) = 1 + 2x/3 − x^2/9 + 4x^3 / (81(1 + ξ)^(7/3)),   0 < ξ < 0.1.

Therefore, the truncation error is

ET(x) = (1 + x)^(2/3) − (1 + 2x/3 − x^2/9) = 4x^3 / (81(1 + ξ)^(7/3)).

The approximate value of f(0.04) is

f(0.04) ≈ 1 + 2(0.04)/3 − (0.04)^2/9 = 1.026488888889, correct to 12 decimal places.

The truncation error bound on x ∈ [0, 0.1] is given by

|ET| ≤ max_{0 ≤ x ≤ 0.1} 4x^3 / (81(1 + x)^(7/3)) ≤ 4(0.1)^3/81 = 0.493827 × 10^(−4).

The exact value of f(0.04), correct up to 12 decimal places, is 1.026491977549.
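A quick Python check of Example 3 (a sketch, using only the formulas derived above): evaluate the second-degree polynomial at x = 0.04, compare with the exact value, and confirm the actual error respects the bound.

```python
def p2(x):
    """Second-degree Taylor polynomial of (1 + x)**(2/3) about x = 0."""
    return 1 + (2/3)*x - x**2/9

x = 0.04
exact = (1 + x) ** (2/3)
print(p2(x))              # 1.026488888...
print(exact - p2(x))      # actual truncation error, about 3.09e-6
print(4 * 0.1**3 / 81)    # the bound 0.493827e-4 from the text
```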

6. Floating point representation of numbers

A floating-point number is represented in the following form:

± .d1 d2 ... dk × b^e,

where b is the base, d1 ≠ 0, d2, …, dk are digits or bits satisfying 0 ≤ di ≤ b − 1, k is the number of significant digits or bits, which indicates the precision of the number, and the exponent e is such that emin ≤ e ≤ emax. The fractional part .d1 d2 ... dk is called the mantissa or significand, and it lies between −1 and 1.

6.1 Computer representation of numbers

In numerical computation nowadays, digital computers are usually used. Most digital computers use floating-point mode for storing real numbers.

The fundamental unit of data stored in a computer memory is called a computer word. The number of bits a word can hold is called the word length. The word length is fixed for a given computer, although it varies from computer to computer. Typical word lengths are 16, 32, 64 or more bits. The largest number that can be stored in a computer depends on the word length. To store a number in floating-point representation, a computer word is divided into three fields. The first consists of one bit, called the sign bit. The next bits represent the exponent, and the remaining bits represent the mantissa. For example, in single-precision floating-point format, a 32-bit word is divided into three fields as follows: 1 bit for the sign, 8 bits for the exponent and 23 bits for the mantissa. The exponent is an 8-bit signed integer from −128 to 127. On the other hand, in double-precision floating-point format, a 64-bit word is divided into three fields as follows: 1 bit for the sign, 11 bits for the exponent and 52 bits for the mantissa.

In the normalized floating-point representation, the exponent is adjusted so that the bit d1 immediately after the binary point is always 1. Formally, a nonzero floating-point number is in normalized floating-point form if

1/b ≤ |mantissa| < 1.

The range of the exponents that a typical computer can handle is very large. The following table shows the effective range of IEEE (Institute of Electrical and Electronics Engineers) floating-point numbers:

                    Binary number               Decimal number
Single precision    ±(2 − 2^(−23)) × 2^127      ~ ±10^38.53
Double precision    ±(2 − 2^(−52)) × 2^1023     ~ ±10^308.25

Table 1: Effective Floating-Point Range

If in a numerical computation, a number lies outside the range, then the following cases arise:

(a) Overflow: It occurs when the number is larger than the above range specified in

Table 1.

(b) Underflow: It occurs when the number is smaller than the above range specified in

Table 1.

In case of underflow, the number is usually set to zero and the computation continues; but in case of overflow, the computation halts.
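The limits in Table 1, and the two failure modes, can be inspected directly in Python with NumPy (a sketch; np.finfo exposes the IEEE single- and double-precision parameters):

```python
import numpy as np

for dtype in (np.float32, np.float64):
    info = np.finfo(dtype)
    print(dtype.__name__, "max =", info.max, "smallest normal =", info.tiny)

big = np.float32(1e38)
print(big * np.float32(10))         # overflow: modern IEEE arithmetic returns
                                    # inf (with a warning) rather than halting
print(np.float32(1e-45) * np.float32(1e-10))   # underflow: result set to 0.0
```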

7. Propagation of errors

In this section, we consider the effect of arithmetic operations on numbers which involve errors. Let xA and yA be the approximate numbers used in the calculations, and suppose they are in error with respect to the true values xT and yT respectively.

Thus we can write xT = xA + εx and yT = yA + εy. Now we examine the propagated error in some particular cases:

Case 1: (Multiplication)

In multiplication, for the error in xA yA, we have

xT yT − xA yA = xT yT − (xT − εx)(yT − εy) = xT εy + yT εx − εx εy.

Thus the relative error in xA yA is

Er(xA yA) = (xT yT − xA yA) / (xT yT) = εx/xT + εy/yT − (εx/xT)(εy/yT),

so that

|Er(xA yA)| ≤ Er(xA) + Er(yA),   provided Er(xA), Er(yA) << 1.

Case 2: (Division)

Proceeding with the same argument as in multiplication, we get

Er(xA / yA) ≤ Er(xA) + Er(yA),   provided Er(yA) << 1.

Case 3: (Addition and subtraction)

In the cases of addition and subtraction, we have

(xT ± yT) − (xA ± yA) = (xT − xA) ± (yT − yA) = εx ± εy.

Thus the absolute error in (xA ± yA) is given by

Ea(xA ± yA) ≤ Ea(xA) + Ea(yA).

Notes:

(i) The relative error in a product is bounded by the sum of the relative errors in the

multiplicands; and the relative error in a quotient is bounded by the sum of the

relative errors in the dividend and divisor. The relative errors in multiplication or

division do not propagate very rapidly.

(ii) The absolute error in the sum or difference of two numbers is bounded by the sum

of the absolute values of the errors in the corresponding numbers.

8. The general formula for errors

Let us consider the differentiable function

u = f(x1, x2, ..., xn)  (8.1)

of several independent variables x1, x2, …, xn.

Suppose that ∆xi represents the error in each xi, so that the error ∆u in u is given by

u + ∆u = f(x1 + ∆x1, x2 + ∆x2, ..., xn + ∆xn).  (8.2)

Taylor series expansion of the right-hand side of eq. (8.2) gives

u + ∆u = f(x1, x2, ..., xn) + Σ_{i=1}^{n} (∂f/∂xi) ∆xi + O(∆xi^2).  (8.3)

If we assume that the errors ∆x1, ∆x2, …, ∆xn are relatively very small, we can neglect the second and higher powers of ∆xi. Thus from eq. (8.3), we get

∆u ≅ Σ_{i=1}^{n} (∂f/∂xi) ∆xi = (∂f/∂x1) ∆x1 + (∂f/∂x2) ∆x2 + ... + (∂f/∂xn) ∆xn.  (8.4)

This is the general formula for computing the error of a function u = f(x1, x2, ..., xn).

The relative error Er is then given by

Er = ∆u/u ≅ (∂f/∂x1)(∆x1/f) + (∂f/∂x2)(∆x2/f) + ... + (∂f/∂xn)(∆xn/f).

Example 4:

If u = xyz^3 + (3/2) x^2 y^3 and the errors in x, y, z are 0.005, 0.001, 0.005 respectively at x = 2, y = 1, z = 1, compute the maximum absolute and relative errors in evaluating u.

Solution:

Let

u = f(x, y, z) = xyz^3 + (3/2) x^2 y^3.

We have

∂f/∂x = yz^3 + 3xy^3,   ∂f/∂y = xz^3 + (9/2) x^2 y^2,   ∂f/∂z = 3xyz^2.

From eq. (8.4), we get

∆u ≅ (∂f/∂x) ∆x + (∂f/∂y) ∆y + (∂f/∂z) ∆z = (yz^3 + 3xy^3) ∆x + (xz^3 + (9/2) x^2 y^2) ∆y + 3xyz^2 ∆z.

Given that x = 2, y = 1, z = 1, ∆x = 0.005, ∆y = 0.001, ∆z = 0.005, we obtain

|∆u| ≤ |yz^3 + 3xy^3| ∆x + |xz^3 + (9/2) x^2 y^2| ∆y + |3xyz^2| ∆z = 7 × 0.005 + 20 × 0.001 + 6 × 0.005 = 0.085.

Hence the maximum absolute error in u is 0.085.

The maximum relative error in u is given by

(Er)max = |∆u/u|max ≈ 0.085/8 = 0.010625.

Chapter 2: Numerical Solutions of Algebraic and Transcendental equations

1. Introduction

In this chapter, we shall consider the problem of numerical computation of a real root of a given equation f(x) = 0, which may be algebraic, trigonometric or transcendental. It will be assumed that the function f(x) is continuously differentiable a sufficient number of times.

All the methods for the numerical solution of equations consist of two stages. The first stage, called location of the root, obtains rough values of the root; the second stage consists of improving the rough value of each root to any desired degree of accuracy.

In the second stage, a method of improvement of the rough value of a root will generate a sequence of successive approximations or iterates {xn : n ≥ 0}, starting with the initial rough value x0 of the root α obtained in the first stage, such that xn → α as n → ∞.

2. Basic concepts and definitions

2.1 Sequence of successive approximations

Let {xn} be a sequence of successive approximations to a desired root α of the equation f(x) = 0.

The error εn at the n-th iteration is defined by

εn = α − xn,  (2.1.1)

and we define hn by

hn = xn+1 − xn = εn − εn+1,  (2.1.2)

which may be considered as an approximation of εn.

The iteration process converges if and only if εn → 0 as n → ∞.

2.2 Order of convergence

Definition: If an iterative method converges and two constants p ≥ 1 and C > 0 exist such that

lim_{n→∞} |εn+1| / |εn|^p = C,  (2.2.1)

then p is called the order of convergence of the method and C is called the asymptotic error constant.

A sequence of iterates {xn : n ≥ 0} is said to converge with order of convergence p ≥ 1 to a root α if

|εn+1| ≤ k |εn|^p,   n ≥ 0,  (2.2.2)

for some k > 0. If p = 1, then the sequence of iterates {xn : n ≥ 0} is said to be linearly convergent. If p = 2, then the iterative method is said to have quadratic convergence.

3. Initial approximation

3.1 Graphical method

In this method, we plot the graph of the curve y = f(x) on graph paper; the point at which the curve crosses the x-axis gives a root of the equation f(x) = 0, and any value in the neighbourhood of this point may be taken as an initial approximation to the required root.

Fig 1 Graph of y = cos x − x e^x

Sometimes the equation f(x) = 0 can be written in the form g(x) = h(x), where the graphs of y = g(x) and y = h(x) may be conveniently drawn. In that case the abscissa of the point of intersection of the two graphs gives the required root of f(x) = 0, and therefore any value in the neighbourhood of this point can be taken as an initial approximation to the required root. Fig. 1 shows the graph of y = cos x − x e^x, and Fig. 2 shows the graphs of y = x and y = cos x / e^x. The abscissa of the point of intersection of these two graphs gives the required root of f(x) = 0.

Fig 2 Graphs of y = x and y = cos x / e^x

Another commonly used method is based upon the intermediate value theorem, which states:

Theorem 1: If f(x) is a continuous function on the closed interval [a, b] and c is any number such that f(a) ≤ c ≤ f(b), then there is at least one number α ∈ [a, b] such that f(α) = c.

3.2 Incremental search method

The incremental search method is a numerical method used to find an interval between two values of x in which a root is expected to exist. The incremental search method starts with an initial value x0 and a sufficiently small step ∆x. Suppose we search for the location of the root along the x-axis from left to right.

We can find the value of x1 with the following equation:

x1 = x0 + ∆x.

Converting this equation into an iterative one, we get

xn = xn−1 + ∆x.

If f(xn−1) f(xn) < 0, we can be sure that there exists a root in the interval [xn−1, xn]; a sketch of this search in code is given below.

In practice, we construct a table of values of f(x) for various values of x and choose a suitable initial approximation to the root. This method is also known as the method of tabulation.
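A minimal Python sketch of the incremental search just described (the function name and the step choice are illustrative assumptions):

```python
import math

def incremental_search(f, x0, dx, max_steps=1000):
    """Scan from x0 to the right in steps of dx and return the first
    interval (x_prev, x) on which f changes sign, or None."""
    x_prev, x = x0, x0 + dx
    for _ in range(max_steps):
        if f(x_prev) * f(x) < 0:
            return x_prev, x
        x_prev, x = x, x + dx
    return None

# Example 1 below: f(x) = cos x - x e**x changes sign in (0.5, 1)
print(incremental_search(lambda x: math.cos(x) - x * math.exp(x), 0.0, 0.5))
# (0.5, 1.0)
```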

Example 1. In order to obtain an initial approximation to the root of the equation f(x) = cos x − x e^x = 0, we prepare a table of values of f(x) for known values of x:

x:     0    0.5      1         1.5       2
f(x):  1    0.0532   -2.1780   -6.6518   -15.1942

From this table we find that the equation f(x) = 0 has at least one root in the interval (0.5, 1).

Example 2. Find a real root of the equation f(x) = 10^x − x − 4 = 0 correct to 2 significant digits by the method of tabulation.

Solution:

Let us tabulate the values of f(x) in [0, 1] with step size 0.1.

x f (x )

0 -3

0.1 -2.841

0.2 -2.615

0.3 -2.305

0.4 -1.888

0.5 -1.338

0.6 -0.6189

0.7 0.3119

0.8 1.510

0.9 3.043

1 5

So a root of the given equation lies in (0.6, 0.7). Since there is only one change of sign between 0.6 and 0.7, there is only one real root between 0.6 and 0.7.
We tabulate again between 0.6 and 0.7 with step length 0.01.

x f (x )

0.6 -0.6189

0.61 -0.5362

0.62 -0.4513

0.63 -0.3642

0.64 -0.2748

0.65 -0.1832

0.66 -0.0891

0.67 0.007351

0.68 0.1063

0.69 0.2078

0.7 0.3119

From the above table we can observe that the root lies between 0.66 and 0.67. Therefore, we may

conclude that the value of the required root is 0.67 correct to two significant figures.

4. Iterative methods

4.1 Method of Bisection

This is the simplest iterative method, based on the repeated application of the following Bolzano's theorem on continuity, which is a particular case of the intermediate value theorem.

Theorem 2 (Bolzano’s theorem):

If f(x) is continuous on the closed interval [a, b] and f(a), f(b) are of opposite signs, then there exists a number α ∈ [a, b] such that f(α) = 0, i.e. there exists a real root α of the equation f(x) = 0. ■

We first find a sufficiently small interval [ a 0 , b0 ] containing α by the method of tabulation or

graphical method such that f (a 0 ) , f (b0 ) are of opposite signs.

Next, we proceed to generate the sequence {xn} of successive approximations as follows:

We set x0 = a0 or b0 and x1 = (a0 + b0)/2, and compute f(x1).

Now, if f(a0), f(x1) are of opposite signs, we set a1 = a0 and b1 = x1, i.e. [a1, b1] = [a0, x1]. Otherwise, if f(x1), f(b0) are of opposite signs, we set a1 = x1 and b1 = b0, i.e. [a1, b1] = [x1, b0], so that in either case [a1, b1] contains the root α and b1 − a1 = (b0 − a0)/2.

We again set x2 = (a1 + b1)/2, repeat the above steps, and so on.

In general, if the interval [an, bn] containing α has been obtained so that f(an) f(bn) < 0, then we set

xn+1 = (an + bn)/2.

If f(an) f(xn+1) < 0, we set an+1 = an, bn+1 = xn+1. Otherwise, if f(xn+1) f(bn) < 0, we set an+1 = xn+1, bn+1 = bn.

So in either case [an+1, bn+1] contains α, i.e. f(an+1) f(bn+1) < 0.

Therefore,

bn+1 − an+1 = (bn − an)/2,

which implies

bn − an = (bn−1 − an−1)/2 = (bn−2 − an−2)/2^2 = ... = (b0 − a0)/2^n.  (4.1.1)

Since xn = an or bn, the approximation of εn is hn = xn+1 − xn = (bn − an)/2.

Now, the error at the n-th iteration satisfies |εn| = |α − xn| ≤ bn − an.

Therefore, we have

|εn| ≤ bn − an = (b0 − a0)/2^n.  (4.1.2)

Consequently, εn → 0 as n → ∞.

Therefore, the bisection iteration invariably converges, i.e. the bisection method converges unconditionally.

4.1.1 Order of convergence of Bisection method

We know

bn − an = (bn−1 − an−1)/2 = (bn−2 − an−2)/2^2 = ... = (b0 − a0)/2^n.

Therefore,

|εn| = |α − xn| ≤ bn − an ≤ (b0 − a0)/2^n.  (4.1.1.1)

This implies

|εn−1| ≤ (b0 − a0)/2^(n−1).

Therefore,

|εn+1| / |εn| ≅ 1/2.  (4.1.1.2)

Hence, the order of convergence of the bisection method is 1.

Example 3:

Find the real root of the equation x log10 x − 1.2 = 0 correct to five significant figures using the method of bisection.

Solution:

We first apply the method of tabulation in order to locate a rough value of the root.

Table 1: location of the root

x f (x )

1 -1.2

2 -0.6

3 0.23

We note that f(2)<0 and f(3)>0. Thus the given equation changes its sign within the interval [2,

3]. Therefore, there exists at least one real root of the equation within [2, 3].

Now, for the bisection method, we compute xn+1 = (an + bn)/2.

If f(an) f(xn+1) < 0, then the root lies in [an, xn+1], i.e. we set an+1 = an and bn+1 = xn+1. Otherwise, if f(xn+1) f(bn) < 0, then the root lies in [xn+1, bn], i.e. we set an+1 = xn+1 and bn+1 = bn. The successive iterations have been presented in Table 2.

Table 2: Table for finding real root

n    an    bn    f(an)    f(bn)    xn+1 = (an + bn)/2    f(xn+1)

0 2 3 -0.59794 0.231364 2.5 -0.20515

1 2.5 3 -0.20515 0.231364 2.75 0.00816491

2 2.5 2.75 -0.20515 0.00816491 2.625 -0.0997856

3 2.625 2.75 -0.0997856 0.00816491 2.6875 -0.046126

4 2.6875 2.75 -0.046126 0.00816491 2.718750 -0.0190585

5 2.71875 2.75 -0.0190585 0.00816491 2.734375 -0.0054662

6 2.734375 2.75 -0.0054662 0.00816491 2.742188 0.00134452

7 2.734375 2.742188 -0.0054662 0.00134452 2.738281 -0.00206205

8 2.738281 2.742188 -0.00206205 0.00134452 2.740234 -0.000359068

9 2.740234 2.742188 -0.000359068 0.00134452 2.741211 0.00049265

10 2.740234 2.741211 -0.000359068 0.00049265 2.740723 0.0000667723

11 2.740234 2.740723 -0.000359068 0.0000667723 2.740479 -0.000146153

12 2.740479 2.740723 -0.000146153 0.0000667723 2.740601 -0.0000396913

13 2.740601 2.740723 -0.0000396913 0.0000667723 2.740662 0.0000135402

14 2.740601 2.740662 -0.0000396913 0.0000135402 2.740631 -0.0000130756

At the 14th step, the root lies between 2.740662 and 2.740631. Hence, the required real root is

2.7406 correct to five significant figures.

4.1.2 Advantage and disadvantage of bisection method

• Advantage

The bisection method is very simple, because the iteration at each stage does not depend on the function values f(xn) but only on their signs. Moreover, the convergence of the method is unconditional: it always converges.

• Disadvantage

The method is very slow, since it converges only linearly. Consequently, it requires a large number of iterations to obtain even a moderately accurate result, and thus the method is laborious.

4.1.3 Algorithm for Bisection method

Step 1: Start the program.

Step 2: Define the function f (x) .

Step 3: Enter the interval [a, b] in which the root lies.

Step 4: If f(a) f(b) < 0, then go to step 5; else go to step 9.

Step 5: Calculate x = (a + b)/2.

Step 6: If f(a) f(x) < 0, set b = x; otherwise, if f(x) f(b) < 0, set a = x.

Step 7: If |a − b| < ε, ε being the prescribed accuracy, then go to step 8; else go to step 5.

Step 8: Print the value of x which is required root.

Step 9: Stop the program. □
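The algorithm translates directly into Python; the following sketch (with an illustrative tolerance) reproduces the root of Example 3:

```python
import math

def bisection(f, a, b, eps=1e-6, max_iter=100):
    """Method of bisection, following steps 4-7 of the algorithm above."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must be of opposite signs")
    for _ in range(max_iter):
        x = (a + b) / 2
        if f(a) * f(x) < 0:
            b = x
        else:
            a = x
        if abs(a - b) < eps:
            break
    return (a + b) / 2

# Example 3: x*log10(x) - 1.2 = 0 on [2, 3]; root ~ 2.7406
print(bisection(lambda x: x * math.log10(x) - 1.2, 2, 3))
```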

4.2 Regula-falsi method or method of false position

This method is the oldest method for finding the real root of an equation f (x) = 0 . It is also known

as method of chords or method of linear interpolation. Like the bisection method, the false

position method starts with two points a0 and b0 such that f (a0 ) and f (b0 ) are of opposite signs,

which implies by the intermediate value theorem that the function f has a root in the interval

[a0, b0], assuming continuity of the function f.

We first determine by the method of tabulation a sufficiently small interval [a0 , b0 ] containing the

only root α of the equation f (x ) = 0 such that f (a0 )f (b0 )< 0 .

Then, we approximate the portion of the curve y = f (x ) between the two points (a0 , f (a0 )) and

(b0 , f (b0 )) by a straight line. It has been shown in fig. 3.

Fig 3 graphical representation of regula-falsi method

Now the equation of the chord joining the two points (a0, f(a0)) and (b0, f(b0)) is given by

y − f(b0) = [f(b0) − f(a0)] / (b0 − a0) × (x − b0).  (4.2.1)

The method consists in replacing the portion of the curve between the points (a0, f(a0)) and (b0, f(b0)) by the chord joining these two points, and taking the point of intersection of the chord with the x-axis as an approximation to the root. In this case, the point of intersection is obtained by putting y = 0 in eq. (4.2.1). Thus, the first approximation x1 to the root α is obtained as

x1 = [a0 f(b0) − b0 f(a0)] / [f(b0) − f(a0)].  (4.2.2)

Equivalently, when f(a0) < 0 < f(b0), we may write

x1 = [b0 |f(a0)| + a0 f(b0)] / [|f(a0)| + f(b0)].  (4.2.3)

Next, we compute f(x1).

If f(a0) and f(x1) are of opposite signs, i.e. f(a0) f(x1) < 0, then the root lies in [a0, x1], so we set a1 = a0 and b1 = x1.

Otherwise, if f(x1) and f(b0) are of opposite signs, i.e. f(x1) f(b0) < 0, then the root lies in [x1, b0], and we set a1 = x1 and b1 = b0, so that in either case [a1, b1] contains α, i.e. f(a1) f(b1) < 0.

We repeat the above step to obtain the second approximation x2, the third approximation x3, and so on.

In general, having obtained the interval [an, bn] containing α such that f(an) f(bn) < 0, we obtain the (n+1)-th approximation xn+1 as

xn+1 = an − f(an) (bn − an) / [f(bn) − f(an)] = [bn |f(an)| + an f(bn)] / [|f(an)| + f(bn)],

the second form holding when f(an) < 0 < f(bn).

Now, we set an+1 = an and bn+1 = xn+1 if f(an) f(xn+1) < 0.

Otherwise, we set an+1 = xn+1 and bn+1 = bn if f(xn+1) f(bn) < 0, so that [an+1, bn+1] contains α, i.e. f(an+1) f(bn+1) < 0.

Example 4:

Find the real root of the equation x log10 x −1.2 = 0 correct to four decimal places using regula-falsi

method.

Solution:

We first apply the method of tabulation in order to locate a rough value of the root.

Table 3: location of the root

x f (x )

1 -1.2

2 -0.6

3 0.23

We note that f(2)<0 and f(3)>0. Thus the given equation changes its sign within the interval [2,

3]. Therefore, there exists at least one real root of the equation within [2, 3].

Now, for the regula-falsi method, we compute

xn+1 = [bn |f(an)| + an f(bn)] / [|f(an)| + f(bn)].

If f(an) f(xn+1) < 0, then the root lies in [an, xn+1], i.e. we set an+1 = an and bn+1 = xn+1. Otherwise, if f(xn+1) f(bn) < 0, then the root lies in [xn+1, bn], i.e. we set an+1 = xn+1 and bn+1 = bn. The successive iterations have been presented in Table 4.

Table 4: Table for finding real root

n    an    bn    f(an)    f(bn)    xn+1    f(xn+1)

0 2 3 -0.59794 0.231364 2.72101 -0.0170911

1 2.72101 3 -0.0170911 0.231364 2.74021 -0.000384056

2 2.74021 3 -0.000384056 0.231364 2.74064 -8.58134E-6

3 2.74064 3 -8.58134E-6 0.231364 2.74065 -1.91717E-7

Hence, the required real root is 2.7406 correct to four decimal places.

Example 5:

Find the real root of the equation x e^x − cos x = 0 correct to four significant figures using the regula-falsi method.

Solution:

We first apply the method of tabulation in order to locate a rough value of the root.

4
Table 5: location of the root

x f (x )

0 -1

0.5 -0.0532219

0.6 0.267936

1 2.17798

We note that f(0.5)<0 and f(0.6)>0. Thus the given equation changes its sign within the interval

[0.5, 0.6]. Therefore, there exists at least one real root of the equation within [0.5, 0.6].
Now, for the regula-falsi method, we compute xn+1 = [bn |f(an)| + an f(bn)] / [|f(an)| + f(bn)].

If f(an) f(xn+1) < 0, then the root lies in [an, xn+1], i.e. we set an+1 = an and bn+1 = xn+1. Otherwise, if f(xn+1) f(bn) < 0, then the root lies in [xn+1, bn], i.e. we set an+1 = xn+1 and bn+1 = bn. The successive iterations have been presented in Table 6.

Table 6: Table for finding real root

n    an    bn    f(an)    f(bn)    xn+1    f(xn+1)

0 0.5 0.6 -0.0532219 0.267936 0.516572 -0.00360274

1 0.516572 0.6 -0.00360274 0.267936 0.517679 -0.000238932

2 0.517679 0.6 -0.000238932 0.267936 0.517752 -0.0000158242

3 0.517752 0.6 -0.0000158242 0.267936 0.517757 -1.04793E-6

Hence, the required real root is 0.5178 correct to four significant figures.

4.2.1 Order of convergence of regula-falsi method

It may be shown that the error at the (n+1)-th step is related to the error at the n-th step by the expression

|εn+1| ≅ A |εn|,  (4.2.1.1)

where A is a constant depending on the function f. This shows that the sequence of successive iterates {xn} converges linearly to the root.

4.2.2 Advantage and disadvantage of regula-falsi method

• Advantage

This method is very simple and does not require calculating the derivative of f(x). Moreover, this method is certainly convergent: since the solution remains bracketed at each step, convergence is guaranteed, as was the case for the bisection method.

• Disadvantage

The method is of first order and is exact for linear f. It is very slow, since it converges only linearly. Also, the initial interval in which the root lies has to be chosen very small.

4.2.3 Algorithm for regula-falsi method

Step 1: Start the program.

Step 2: Define the function f (x) .

Step 3: Enter the interval [a, b] in which the root lies.

Step 4: If f(a) f(b) < 0, then go to step 5; else go to step 9.

Step 5: Calculate x = [a f(b) − b f(a)] / [f(b) − f(a)].

Step 6: If f(a) f(x) < 0, set b = x; otherwise, if f(x) f(b) < 0, set a = x.

Step 7: If |a − b| < ε, ε being the prescribed accuracy, then go to step 8; else go to step 5.

Step 8: Print the value of x which is required root.

Step 9: Stop the program. □
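A Python sketch of the method of false position follows. One point of judgment (our assumption, not part of the algorithm above): since one end of the bracket may stay fixed, |a − b| need not shrink below ε, so this sketch stops instead when |f(x)| is small.

```python
import math

def regula_falsi(f, a, b, eps=1e-8, max_iter=100):
    """Method of false position; the root stays bracketed in [a, b]."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must be of opposite signs")
    x = a
    for _ in range(max_iter):
        x = (a * f(b) - b * f(a)) / (f(b) - f(a))   # chord/x-axis intersection
        if abs(f(x)) < eps:
            break
        if f(a) * f(x) < 0:
            b = x
        else:
            a = x
    return x

# Example 4: x*log10(x) - 1.2 = 0 on [2, 3]; root ~ 2.7406
print(regula_falsi(lambda x: x * math.log10(x) - 1.2, 2, 3))
```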

4.3 Fixed-point iteration

Let [a0, b0] be an initial small interval containing the only root α of the given equation f(x) = 0.

Let us rewrite the given equation as

x = φ(x),  (4.3.1)

so that the root α satisfies the equation

α = φ(α).  (4.3.2)

We assume that φ(x) is continuously differentiable a sufficient number of times in [a0, b0], and that for x ∈ [a0, b0], φ(x) ∈ [a0, b0].

We take the successive approximations using the formula

xn+1 = φ(xn),   n ≥ 0,  (4.3.3)

starting with x0 = a0 or b0.

Subtracting eq. (4.3.3) from eq. (4.3.2), we get

α − xn+1 = φ(α) − φ(xn) = (α − xn) φ′(ξn)   (by Lagrange's mean value theorem),

where min{α, xn} < ξn < max{α, xn}.

Therefore, εn+1 = εn φ′(ξn), where min{α, xn} < ξn < max{α, xn}.

This is called the error equation.

4.3.1 Condition of convergence for fixed-point iteration method

Theorem 3: Let α be the root of the equation f(x) = 0, i.e. let x = α be a solution of x = φ(x), and suppose φ(x) has a continuous derivative in some interval [a0, b0] containing the root α. If |φ′(x)| ≤ K < 1 for all x ∈ [a0, b0], then the fixed-point iteration process xn+1 = φ(xn) converges for any initial approximation x0 ∈ [a0, b0].

Proof:

According to the hypothesis of the theorem

α = φ (α ) (4.3.1.1)

We know the iteration scheme of fixed-point method is

xn +1 = φ (xn ) (4.3.1.2)

By subtracting (4.3.1.2) from (4.3.1.1), we have

α − xn+1 = φ(α) − φ(xn).

Since εn+1 = α − xn+1 is the error at the (n+1)-th approximation, applying Lagrange's mean value theorem we get

εn+1 = (α − xn) φ′(ξ),   where min{α, xn} < ξ < max{α, xn}.

Therefore,

|εn+1| = |α − xn| |φ′(ξ)| ≤ K |α − xn| = K |εn|.  (4.3.1.3)

Again, the same argument gives

|εn| ≤ K |εn−1|.

Therefore,

|εn| ≤ K^2 |εn−2| ≤ K^3 |εn−3| ≤ ... ≤ K^n |ε0|.

Now, since K < 1, K^n → 0 as n → ∞.

So εn → 0 as n → ∞.

This implies |α − xn| → 0 as n → ∞, i.e. xn → α as n → ∞.

Hence, the condition of convergence of the fixed-point iteration is that K < 1, i.e. |φ′(x)| < 1 in [a0, b0].

Corollary:

It can be observed that the condition of convergence of the fixed-point iteration method is |φ′(x)| < 1 in the neighbourhood of α. This fact can be examined from the following fig. 4:

−1 < φ′(x) ≤ 0 ⇒ convergence        φ′(x) < −1 ⇒ divergence
0 < φ′(x) < 1 ⇒ convergence         φ′(x) > 1 ⇒ divergence

Fig. 4 The conditional convergence of the fixed-point iteration method

Example 6:

Find the real root of the equation x^3 + x − 1 = 0 correct to three significant figures using the fixed-point iteration method.

Solution:

Let f(x) = x^3 + x − 1 = 0.

We first apply the method of tabulation in order to locate a rough value of the root.

Table 7: location of the root


x f (x)

0 -1

1 1

We note that f(0)<0 and f(1)>0. Thus the given equation changes its sign within the interval [0,

1]. Therefore, there exists at least one real root of the equation within [0, 1].

Now, we rewrite the equation x^3 + x − 1 = 0 as

x = 1/(x^2 + 1) = φ(x), say.

Hence, φ′(x) = −2x/(x^2 + 1)^2, so that |φ′(x)| < 1 in 0 < x < 1. Therefore, according to Theorem 3, the fixed-point iteration certainly converges.

We choose initial approximation x0 = 0 .

The successive iterations generated by eq. (4.3.3) have been presented in Table 8 and are shown graphically in fig. 5.

Fig. 5 successive iterates in fixed point iteration

Table 8: Table for finding real root

n xn xn +1 = φ ( xn ) f (x n +1 )

0 0 1 1

1 1 0.5 -0.375

2 0.5 0.8 0.312

3 0.8 0.60976 -0.163535

4 0.60976 0.72897 0.116337

5 0.72897 0.653 -0.0685556

6 0.653 0.70106 0.045624

7 0.70106 0.67047 -0.0281294

8 0.67047 0.68988 0.0182119

9 0.68988 0.67754 -0.011432

10 0.67754 0.68537 0.0073189

11 0.68537 0.68039 -0.00462747

12 0.68039 0.68356 0.00294911

13 0.68356 0.68155 -0.00187003

14 0.68155 0.68282 0.00118959

15 0.68282 0.68201 -0.000755204

Hence, the required real root of the given equation is 0.682 correct to three significant digits.
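A Python sketch of the fixed-point iteration (4.3.3), stopping when successive iterates agree to within a prescribed tolerance (an illustrative choice):

```python
def fixed_point(phi, x0, eps=1e-6, max_iter=200):
    """Iterate x_{n+1} = phi(x_n) starting from x0."""
    x = x0
    for _ in range(max_iter):
        x_new = phi(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Example 6: x**3 + x - 1 = 0 rewritten as x = 1/(x**2 + 1)
print(fixed_point(lambda x: 1 / (x**2 + 1), 0))   # ~0.68233
```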

Example 7:

Find the real root of the equation 10^x + x − 4 = 0 correct to six decimal places using the fixed-point iteration method.

Solution:

Let f(x) = 10^x + x − 4 = 0.

We first apply the method of tabulation in order to locate a rough value of the root.

Table 9: location of the root

x f (x)

0 -3

1 7

We note that f(0)<0 and f(1)>0. Thus the given equation changes its sign within the interval [0,

1]. Therefore, there exists at least one real root of the equation within [0, 1].

Now, we rewrite the equation 10^x + x − 4 = 0 as

x = log10(4 − x) = φ(x), say.

Hence, φ′(x) = −1/((4 − x) ln 10), so that |φ′(x)| < 1 in 0 < x < 1. Therefore, according to Theorem 3, the fixed-point iteration certainly converges.

We choose initial approximation x0 = 0 .

The successive iterations generated by the eq. (4.3.3) have been presented in Table 10.

Table 10: Table for finding real root

n xn xn +1 = φ ( xn ) f (x n +1 )

0 0 0.60205999 0.60206

1 0.60205999 0.53121571 -0.0708443

2 0.53121571 0.54017729 0.00896159

3 0.54017729 0.53905384 -0.00112345

4 0.53905384 0.53919484 0.000140998

5 0.53919484 0.53917715 -0.0000176934

6 0.53917715 0.53917937 2.22033E-6

7 0.53917937 0.53917909 -2.78627E-7

Hence, the required real root of the given equation is 0.539179 correct to six decimal places.

4.4 Newton-Raphson method

Let [a0, b0] be an initial interval containing the only root α of the given equation f(x) = 0, and let f(x) be continuously differentiable a sufficient number of times in [a0, b0].

Let x0 ∈ [a0, b0] be an initial approximation to α; we set the initial approximation of α as x0 = a0 or b0. Let x1 = x0 + h be the exact root closer to x0, so that f(x1) = f(x0 + h) = 0, where h is sufficiently small.

Then, expanding f(x0 + h) by Taylor's series about x0, we obtain

f(x0 + h) = f(x0) + h f′(x0) + h^2/2! f″(x0) + ... = 0.

Now, neglecting the terms containing h^2 and higher powers of h, we have

f(x0) + h f′(x0) ≅ 0.

This implies

h ≅ − f(x0)/f′(x0).

Therefore, we take the first approximation to α as

x1 = x0 + h = x0 − f(x0)/f′(x0).

Continuing similarly, we get successive approximations (the 2nd approximation, 3rd approximation, and so on) to α as

x2 = x1 − f(x1)/f′(x1),
x3 = x2 − f(x2)/f′(x2),

and so on.

In general, the (n+1)-th approximation to α is given by

xn+1 = xn − f(xn)/f′(xn),   n = 0, 1, 2, ...  (4.4.1)

This is the iteration scheme of the Newton-Raphson method.
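The scheme (4.4.1) in Python (a sketch; the stopping rule on successive iterates is an illustrative choice):

```python
import math

def newton_raphson(f, fprime, x0, eps=1e-10, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Example 8 below: 10**x + x - 4 = 0, with f'(x) = 10**x ln(10) + 1
f = lambda x: 10**x + x - 4
fp = lambda x: 10**x * math.log(10) + 1
print(newton_raphson(f, fp, 0))   # ~0.5391791
```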

4.4.1 Condition of convergence

The iteration scheme shows that the Newton-Raphson method is only a particular case of the fixed-point iteration method. In this case, the equation f(x) = 0 is rewritten as

x = φ(x) = x − f(x)/f′(x).

Therefore,

φ′(x) = 1 − [(f′(x))^2 − f(x) f″(x)] / (f′(x))^2 = f(x) f″(x) / (f′(x))^2.

Thus, the condition of convergence of the Newton-Raphson method is

|φ′(x)| < 1,   for all x ∈ [a0, b0].

This implies

|f(x) f″(x)| / (f′(x))^2 < 1,   for all x ∈ [a0, b0].

Hence, the condition of convergence of the Newton-Raphson method is |f(x) f″(x)| < (f′(x))^2, for all x ∈ [a0, b0].

4.4.2 Order of convergence for Newton-Raphson method

We know that the error at the n-th iteration is εn = α − xn, i.e. α = xn + εn.

Therefore, applying Taylor series expansion,

0 = f(α) = f(xn + εn) = f(xn) + εn f′(xn) + εn^2/2 f″(ξn),

where min{α, xn} < ξn < max{α, xn}.

Now, the iteration formula is xn+1 = xn − f(xn)/f′(xn).

Therefore,

α − xn+1 = α − xn + f(xn)/f′(xn).

Substituting f(xn) = −εn f′(xn) − εn^2/2 f″(ξn) from the Taylor expansion above, this gives

εn+1 = − εn^2/2 × f″(ξn)/f′(xn),  (4.4.2.1)

where min{α, xn} < ξn < max{α, xn}.

This is the error equation.

If the iteration converges, then xn, ξn → α.

Therefore,

lim_{n→∞} |εn+1| / |εn|^2 = 1/2 |f″(α)/f′(α)|,

which shows that the order of convergence of the Newton-Raphson method is 2 and the asymptotic error constant is 1/2 |f″(α)/f′(α)|.

Example 8:

Find the real root of the equation 10^x + x − 4 = 0 correct to six significant digits using the Newton-Raphson method.

Solution:

Let f(x) = 10^x + x − 4 = 0.

We first apply the method of tabulation in order to locate a rough value of the root.

Table 13: location of the root

x    f(x)
0    -3
1    7

We note that f(0) < 0 and f(1) > 0. Thus the given equation changes its sign within the interval [0, 1]. Therefore, there exists at least one real root of the equation within [0, 1].

Now, f′(x) = 10^x ln 10 + 1.

We choose the initial approximation x0 = 0.

The successive iterations generated by eq. (4.4.1) have been presented in Table 14.

Table 14: Table for finding real root

n    xn+1 = xn − f(xn)/f′(xn)    f(xn+1)

0 0.90837932 5.00641

1 0.65355360 1.15709

2 0.55178472 0.11453

3 0.53934062 0.00144869

4 0.53917915 2.39265E-7

5 0.53917912 6.18949E-15

Hence, the required real root of the given equation is 0.539179 correct to six significant digits.

4.4.3 Geometrical significance of Newton-Raphson method

The geometrical meaning of the Newton-Raphson method is that xn+1 is the point at which the tangent to the curve y = f(x) at the point (xn, f(xn)) cuts the x-axis.

The equation of the tangent is y − f(xn) = f′(xn)(x − xn).

Fig 6 Graphical representation of the Newton-Raphson method

It intersects the x-axis (i.e. y = 0) at

x = xn − f(xn)/f′(xn) = xn+1.

Accordingly, the Newton-Raphson method is also known as the method of tangents.

4.4.4 Advantage and disadvantage of Newton-Raphson method

• Advantage

The basic advantage of the Newton-Raphson method is that it converges very rapidly, since its order of convergence is quadratic.

• Disadvantage

The Newton-Raphson method fails if f′(x) = 0 or is very small in the neighbourhood of the desired root. In such cases the regula-falsi method should be used. The initial approximation should be taken very close to the desired root, otherwise the method may diverge. Sometimes this method may not be suitable for a function f(x) whose derivative is difficult to calculate.

4.5 Secant Method

In the secant method, the derivative at xn is approximated by the following difference quotient:

f′(xn) ≈ [f(xn) − f(xn−1)] / (xn − xn−1).

Hence, the iteration scheme of the Newton-Raphson method given in (4.4.1) reduces to

xn+1 = xn − (xn − xn−1) f(xn) / [f(xn) − f(xn−1)].  (4.5.1)

This is the iteration scheme for the secant method. It is to be noted that the iteration scheme in eq. (4.5.1) requires two initial approximations to find the root.
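A Python sketch of the scheme (4.5.1); note that it needs the two starting values and, as remarked below, fails when f(xn) ≅ f(xn−1):

```python
def secant(f, x0, x1, eps=1e-10, max_iter=50):
    """Secant iteration (4.5.1), started from two approximations x0, x1."""
    for _ in range(max_iter):
        if f(x1) == f(x0):
            raise ZeroDivisionError("f(x_n) = f(x_{n-1}): method fails")
        x2 = x1 - (x1 - x0) * f(x1) / (f(x1) - f(x0))
        if abs(x2 - x1) < eps:
            return x2
        x0, x1 = x1, x2
    raise RuntimeError("iteration did not converge")

# Example 9 below: 10**x + x - 4 = 0, starting from 0 and 1
print(secant(lambda x: 10**x + x - 4, 0, 1))   # ~0.5391791
```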

4.5.1 Geometrical significance of Secant method

We approximate the curve y = f(x) between the two points (xn, f(xn)) and (xn−1, f(xn−1)) by a straight line. The equation of this straight line is

[y − f(xn)] / (x − xn) = [f(xn−1) − f(xn)] / (xn−1 − xn).

It cuts the x-axis (i.e. y = 0) at x = xn+1.

Fig 7 Graphical representation of the secant method

Therefore, the (n+1)-th approximation is

xn+1 = xn − (xn − xn−1) yn / (yn − yn−1).

In case yn, yn−1 are of the same sign, the required point of intersection falls outside the range of interpolation, i.e. we have a case of extrapolation, and in that case there is no guarantee that the iteration will converge. In fact, the secant method may or may not converge. But in case the iteration converges, it renders an efficient method for computing the required real root.

Example 9:

Find the real root of the equation 10^x + x − 4 = 0 correct to five decimal places using the secant method.

Solution:

Let f(x) = 10^x + x − 4 = 0.

We first apply the method of tabulation in order to locate a rough value of the root.

Table 17: location of the root

x f (x)

0 -3

1 7

We note that f(0)<0 and f(1)>0. Thus the given equation changes its sign within the interval [0,

1]. Therefore, there exists at least one real root of the equation within [0, 1].

The successive iterations generated by the eq. (4.5.1) have been presented in Table 18.

Table 18: Table for finding real root

n    xn−1    xn    yn−1    yn    xn+1 = xn − (xn − xn−1) yn / (yn − yn−1)    f(xn+1)

0 0 1 -3 7 0.3000000 -1.70474

1 1 0.3 7 -1.70474 0.4370881 -0.827088

2 0.3 0.4370881 -1.70474 -0.827088 0.5662786 0.24993

3 0.4370881 0.5662786 -0.827088 0.24993 0.5362989 -0.0257559

4 0.5662786 0.5362989 0.24993 -0.0257559 0.5390998 -0.000711539

5 0.5362989 0.5390998 -0.0257559 -0.000711539 0.5391794 2.0981E-6

6 0.5390998 0.5391794 -0.000711539 2.0981E-6 0.5391791 -1.70286E-10

Hence, the required real root of the given equation is 0.53918 correct to five decimal places.

4.5.3 Advantage and disadvantage of secant method

• Advantage

The advantage of the secant method is that, if it converges, it converges more rapidly than the regula-falsi method, since the order of convergence of the secant method is 1.61803.

• Disadvantage

This method does not always converge, in contrast to the regula-falsi method. When it does converge, it is faster than the regula-falsi method but slower than the Newton-Raphson method. If at any stage f(xn−1) ≅ f(xn), then the method fails.

Chapter 3: Interpolation

1. Introduction

Let us assume that f(x) is a function defined on (−∞, ∞) that is continuously differentiable a sufficient number of times. Here we are concerned with a function f(x) such that the analytical formula representing the function is unknown, but the values of f(x) are known for a given set of n + 1 distinct values of x, say x0, x1, x2, ..., xn.

x f (x)

x0 f ( x0 )

x1 f (x1 )

x2 f ( x2 )

. .

. .

. .

xn f (x n )

Our problem is to find the value of f (x) for a given value of x in the vicinity of the above

tabulated values of the arguments, but different from the above tabulated values of x. Since the

formula for f (x) is unknown, the precise value of f (x) cannot be obtained. We try to find an

approximate value of the same by the principle of interpolation. In interpolation, the unknown

function f(x) is approximated by a simple function ϕn(x) so that it takes the same values as f(x) for the given argument values x0, x1, x2, ..., xn. If the given value of x lies slightly outside the interval [min{x0, x1, ..., xn}, max{x0, x1, ..., xn}], the corresponding process is often called extrapolation.

Now the function ϕn(x) may take a variety of forms. When ϕn(x) is a polynomial, the process of approximating f(x) by ϕn(x) is called polynomial interpolation. If ϕn(x) is a piecewise polynomial, the interpolation is called piecewise polynomial interpolation; if ϕn(x) is a finite trigonometric series, the interpolation is called trigonometric interpolation. Likewise, ϕn(x) may be a series of Legendre polynomials, Bessel functions, Chebyshev polynomials, etc.

First we shall become familiar with polynomial interpolation. It is connected with the following famous theorem due to Weierstrass:

Theorem 1 (Weierstrass Approximation Theorem): Suppose f(x) is a continuous real-valued function defined on the real interval [a, b]. For every ε > 0, there exists a polynomial ϕn(x) such that for all x in [a, b], we have |f(x) − ϕn(x)| < ε.

2. Polynomial interpolation

In polynomial interpolation, f (x) is approximated by a polynomial ϕ n (x) of degree ≤ n such that

f ( xi ) ≅ ϕ n ( xi ) for all i=0,1,2,3,…,n (2.1)

Let ϕ n ( x ) = a0 + a1 x + ... + an x n .

Then from (2.1), we get

a0 + a1 xi + ... + an xi^n = f(xi),   for all i = 0, 1, 2, ..., n.  (2.2)

This is a set of (n + 1) linear equations in the (n + 1) unknowns a0, a1, a2, ..., an. The coefficient determinant of the system (2.2) is

| 1  x0  ...  x0^n |
| 1  x1  ...  x1^n |
| .  .   ...  .    |   =  ∏_{0 ≤ j < i ≤ n} (xi − xj) ≠ 0.
| 1  xn  ...  xn^n |

This determinant is known as Vandermonde's determinant. Its value is different from zero, since x0, x1, x2, ..., xn are distinct.

Therefore, by Cramer's rule, the values of a0, a1, a2, ..., an can be uniquely determined, so that the polynomial ϕn(x) exists and is unique. This polynomial ϕn(x) is called the interpolation polynomial. The given points x0, x1, x2, ..., xn are called interpolating points or nodes.
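As a sketch of how the system (2.2) determines the coefficients in practice, the following Python/NumPy snippet solves the Vandermonde system for the data of Example 1 of section 2.9 below and evaluates the resulting polynomial:

```python
import numpy as np

xs = np.array([110.0, 130.0, 160.0, 190.0])   # nodes (from Example 1 below)
ys = np.array([10.8, 8.1, 5.5, 4.8])

V = np.vander(xs, increasing=True)   # rows [1, x_i, x_i**2, x_i**3]
a = np.linalg.solve(V, ys)           # coefficients a_0, ..., a_n of (2.2)
print(np.polyval(a[::-1], 140.0))    # ~7.0333, matching Example 1
```

Solving the Vandermonde system directly is fine for a few nodes, but the system becomes ill-conditioned as n grows, which is one reason formulas such as Lagrange's (section 2.9) are preferred in practice.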

2.1.Geometric interpretation of interpolation

Geometrically, the curve representing the unknown function y = f(x) passes through the points (xi, yi), i = 0, 1, ..., n. This unknown function is approximated by the unique n-th degree parabola y = ϕn(x) which passes through the above points; this is depicted in the following fig. 1. The parabola y = ϕn(x) is called the interpolation parabola or interpolation polynomial. In this context, polynomial interpolation is also referred to as parabolic interpolation.

Fig. 1 Geometrical representation of interpolation

2.9. Lagrange’s interpolation formula

Let y0, y1, ..., yn be the values of y = f(x) corresponding to the values x0, x1, x2, ..., xn of x. In this case, the values of x need not necessarily be equally spaced.

Let ϕn(x) be a polynomial of degree n. Let us take the polynomial in the form

ϕn(x) = c0 (x − x1)(x − x2)...(x − xn) + c1 (x − x0)(x − x2)...(x − xn) + ... + cr (x − x0)(x − x1)...(x − xr−1)(x − xr+1)...(x − xn) + ... + cn (x − x0)(x − x1)(x − x2)...(x − xn−1).  (2.9.1)

We shall determine the coefficients c 0 , c1 ,..., c n so as to make

ϕ n (x0 ) = y0 , ϕ n (x1 ) = y1 ,…, ϕ n ( xn ) = y n .

Now, substituting in eq. (2.9.1) the successive values x0 , x1 ,..., xn for x at the same time putting

ϕ n (x0 ) = y0 , ϕ n (x1 ) = y1 ,…, ϕ n ( xn ) = y n , we have

y0 = ϕn(x0) = c0 (x0 − x1)(x0 − x2)...(x0 − xn).

This implies

c0 = y0 / [(x0 − x1)(x0 − x2)...(x0 − xn)].

Similarly,

c1 = y1 / [(x1 − x0)(x1 − x2)...(x1 − xn)],
...
cn = yn / [(xn − x0)(xn − x1)...(xn − xn−1)].

Substituting these values of c0, c1, ..., cn in eq. (2.9.1), we get

ϕn(x) = [(x − x1)(x − x2)...(x − xn)] / [(x0 − x1)(x0 − x2)...(x0 − xn)] y0 + [(x − x0)(x − x2)...(x − xn)] / [(x1 − x0)(x1 − x2)...(x1 − xn)] y1 + ... + [(x − x0)(x − x1)...(x − xn−1)] / [(xn − x0)(xn − x1)...(xn − xn−1)] yn.  (2.9.2)

This is known as Lagrange's interpolation formula.

Now if we treat x (dependent variable) as a function of y (independent variable), then interchanging x and y in eq. (2.9.2), we may obtain

x ≅ [(y − y1)(y − y2)...(y − yn)] / [(y0 − y1)(y0 − y2)...(y0 − yn)] x0 + [(y − y0)(y − y2)...(y − yn)] / [(y1 − y0)(y1 − y2)...(y1 − yn)] x1 + ... + [(y − y0)(y − y1)...(y − yn−1)] / [(yn − y0)(yn − y1)...(yn − yn−1)] xn.  (2.9.3)

This relation is useful for inverse interpolation, and it is sometimes referred to as Lagrange's inverse interpolation formula.

2.9.1. Error in Lagrange’s interpolation

To find the error committed in approximating f(x) by the polynomial ϕn(x), we have from eq. (2.2.5) that the remainder, or error, is

f(x) − ϕn(x) = Rn+1(x) = [(x − x0)(x − x1)(x − x2)...(x − xn)] / (n + 1)! × f^(n+1)(ξ),  (2.9.1.1)

where min{x, x0, x1, ..., xn} < ξ < max{x, x0, x1, ..., xn}.

Example 1:

The following table gives the viscosity of an oil as a function of temperature. Use Lagrange's formula to find the viscosity of the oil at a temperature of 140°.

Temperature (°):  110   130   160   190
Viscosity:        10.8  8.1   5.5   4.8

Solution:

Let x denote temperature and y denote viscosity.

Using Lagrange's interpolation formula, we have

f(x) ≅ [(x − 130)(x − 160)(x − 190)] / [(110 − 130)(110 − 160)(110 − 190)] × 10.8
     + [(x − 110)(x − 160)(x − 190)] / [(130 − 110)(130 − 160)(130 − 190)] × 8.1
     + [(x − 110)(x − 130)(x − 190)] / [(160 − 110)(160 − 130)(160 − 190)] × 5.5
     + [(x − 110)(x − 130)(x − 160)] / [(190 − 110)(190 − 130)(190 − 160)] × 4.8,

which gives

f(140) ≅ 7.03.
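Formula (2.9.2) in Python (a sketch), applied to the data of Example 1:

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolation polynomial (2.9.2) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # factor of the i-th basis polynomial
        total += term
    return total

print(lagrange([110, 130, 160, 190], [10.8, 8.1, 5.5, 4.8], 140))   # ~7.0333
```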

2.3.Finite Differences

Let y = f (x) be a real-valued function of x and y0 , y1 ,..., yn be the values of y = f (x)

corresponding to the values x0 , x1 ,..., xn of x . The values of x , i.e. x i (i = 0,1,..., n) are called the

nodes or arguments. The argument values x0 , x1 ,..., xn may or may not be equidistant or equally

spaced.

2.3.1. Forward differences

Let y0, y1, ..., yn be a given set of values of y corresponding to the equidistant values x0, x1, ..., xn of x, i.e. xi = x0 + ih, i = 0, 1, ..., n. The differences y1 − y0, y2 − y1, …, yn − yn−1 are called the first forward differences, and they are denoted by ∆y0, ∆y1, …, ∆yn−1 respectively. Thus we have

∆yi = yi+1 − yi,   i = 0, 1, ..., n − 1.  (2.3.1.1)

The operator ∆ is called first forward difference operator.

In general, the first forward difference operator is defined by

∆f ( x ) = f ( x + h ) − f ( x ) (2.3.1.2)

Similarly, we can define the second-order, third-order, fourth-order, etc. forward differences respectively as follows:

∆^2 yi = ∆(∆yi) = ∆yi+1 − ∆yi,
∆^3 yi = ∆(∆^2 yi) = ∆^2 yi+1 − ∆^2 yi,  (2.3.1.3)
∆^k yi = ∆(∆^(k−1) yi) = ∆^(k−1) yi+1 − ∆^(k−1) yi,

where i = 0, 1, ..., n − 1 and k (k = 1, ..., n) is a positive integer.


Thus, we have

∆y0 = y1 − y 0

∆2 y0 = ∆(∆y0 ) = ∆y1 − ∆y0 = y2 − 2 y1 + y0 (2.3.1.4)

∆3 y0 = ∆(∆2 y0 ) = ∆2 y1 − ∆2 y0 = y3 − 3 y2 + 3 y1 − y0

∆4 y0 = ∆(∆3 y0 ) = ∆3 y1 − ∆3 y0 = y4 − 4 y3 + 6 y2 − 4 y1 + y0

and so on.

In general,

rk
k
∆k yi = ∑ (−1)  r  y
r =0
i+ k −r , (2.3.1.5)

where i = 0,1,..., n − 1 and k ( k = 1,..., n) is a positive integer.

2.3.1.1. Forward difference table

We can calculate the above forward differences very easily with the help of the following table

which is called forward difference table. It has been shown in Table1.

x      y        ∆y       ∆²y      ∆³y      ∆⁴y      ∆⁵y

x0     y0
                ∆y0
x1     y1                ∆²y0
                ∆y1               ∆³y0
x2     y2                ∆²y1              ∆⁴y0
                ∆y2               ∆³y1              ∆⁵y0
x3     y3                ∆²y2              ∆⁴y1
                ∆y3               ∆³y2
x4     y4                ∆²y3
                ∆y4
x5     y5

Table 1 Forward difference table
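A short Python sketch that generates the columns of such a table from the ordinates yi (the helper name forward_difference_table is our own): for y = x³ at x = 0, 1, ..., 5 the third differences are the constant 3!h³ = 6 and all higher differences vanish.

def forward_difference_table(ys):
    """Return the columns [y, ∆y, ∆²y, ...] of the forward difference table."""
    cols = [list(ys)]
    while len(cols[-1]) > 1:
        prev = cols[-1]
        cols.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return cols

for col in forward_difference_table([0, 1, 8, 27, 64, 125]):   # y = x³, h = 1
    print(col)   # ..., [6, 6, 6], [0, 0], [0]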

2.4 The Shift, differentiation and averaging operators

2.4.1. The Shift operator

The shift operator or shifting operator E is defined by

Ef ( x) = f ( x + h ) (2.4.1.1)

Now, we know that

∆f ( x) = f ( x + h) − f ( x)

= Ef ( x) − f ( x)

= ( E − I ) f ( x) , where I is the identity operator.

So, it follows that

∆= E−I (2.4.1.2)

The eq. (2.4.1.2) represents a relation between the shift operator and the forward difference

operator ∆ .

Now, we have

Ef ( x) = f ( x + h)

E 2 f ( x ) = E .Ef ( x ) = Ef ( x + h ) = f ( x + 2 h )

E 3 f ( x ) = E . E 2 f ( x ) = Ef ( x + 2 h ) = f ( x + 3h )

By the method of induction, it may be easily proved that

E n f ( x ) = f ( x + nh ) , (2.4.1.3)

where n is a positive integer.

2.4.3. The averaging operator

The averaging operator µ is defined by

µf(x) = ½ [ f(x + h/2) + f(x − h/2) ]        (2.4.3.1)

      = ½ [ E^{1/2} f(x) + E^{−1/2} f(x) ]

      = ½ [ E^{1/2} + E^{−1/2} ] f(x)

This implies that

µ = ½ [ E^{1/2} + E^{−1/2} ]        (2.4.3.2)

The eq. (2.4.3.2) represents a relation between the averaging operator and the shift operator E .

 ∆2  4
Exercise 1: Prove that   x = 12x 2 + 2 , where the interval of differencing being unity.

E

Exercise 2: Find the missing terms in the following table

x:     0    1    2    3    4    5

f(x):  0    −    8    15   −    35

2.6. Backward differences

Let y0 , y1 ,..., yn be a given set of values of y corresponding to the equidistant values x0 , x1 ,..., xn of

x, i.e. xi = x0 + ih, i = 0, 1, ..., n. The differences y1 − y0, y2 − y1, …, yn − yn−1 are called first

backward differences and are denoted by ∇y1, ∇y2, …, ∇yn respectively. Thus we have

∇y i = yi − y i−1 , i = 1,..., n . (2.6.1)

The operator ∇ is called first backward difference operator.

In general, the first backward difference operator is defined by

∇f(x) = f(x) − f(x − h)        (2.6.2)

      = f(x) − E⁻¹f(x)

      = (I − E⁻¹) f(x) ,  where I is the identity operator.

So, it follows that

∇ = I − E⁻¹        (2.6.3)

The eq. (2.6.3) represents a relation between the shift operator and the backward difference

operator ∇ .


Similarly, we can define the second order, third order, fourth order, etc. backward difference

formulae as follows

∇ 2 y i = ∇ (∇ y i ) = ∇ y i − ∇ y i −1 = y i − 2 y i −1 + y i − 2

∇³y_i = ∇(∇²y_i) = ∇²y_i − ∇²y_{i−1} = y_i − 3y_{i−1} + 3y_{i−2} − y_{i−3}        (2.6.4)

∇^k y_i = ∇(∇^{k−1} y_i) = ∇^{k−1} y_i − ∇^{k−1} y_{i−1} = Σ_{r=0}^{k} (−1)^r C(k, r) y_{i−r}

where i = 1, ..., n and k (k = 1, ..., n) is a positive integer.

2.6.1. Relation between the forward and backward difference operators

We have

∇y_i = y_i − y_{i−1} = ∆y_{i−1}

∇²y_i = y_i − 2y_{i−1} + y_{i−2} = ∆²y_{i−2}

∇³y_i = y_i − 3y_{i−1} + 3y_{i−2} − y_{i−3} = ∆³y_{i−3}

In general, ∇^k y_i = ∆^k y_{i−k}        (2.6.1.1)

where i = 1,..., n and k ( k = 1,..., n ) is a positive integer.

The eq. (2.6.1.1) represents a relation between the forward and the backward difference

operators.

2.6.2. Backward difference table

We can calculate the above backward differences very easily with the help of the following table

which is called backward difference table. It has been shown in Table 3. However, according to

the result in eq. (2.6.1.1), the backward differences may also be derived from the forward

difference table. In that case, the following backward difference table is not required.

x      y        ∇y       ∇²y      ∇³y      ∇⁴y

x0     y0
                ∇y1
x1     y1                ∇²y2
                ∇y2               ∇³y3
x2     y2                ∇²y3              ∇⁴y4
                ∇y3               ∇³y4
x3     y3                ∇²y4
                ∇y4
x4     y4

Table 3 Backward difference table

2.7. Newton’s forward difference interpolation

Let y = f(x) be a function which takes the values y0, y1, ..., yn for the equidistant values x0, x1,

x2, ..., xn, i.e. xi = x0 + ih for all i = 0, 1, 2, …, n.

Let, ϕ n ( x) be a polynomial of degree n. This polynomial may be written as

ϕ n ( x ) = a0 + a1 ( x − x0 ) + a2 ( x − x1 )( x − x0 ) + ... + an ( x − x0 )( x − x1 )( x − x2 )...( x − xn−1 ) (2.7.1)

We shall now determine the coefficients a0, a1, ..., an so as to make

ϕ n (x0 ) = y0 , ϕ n ( x1 ) = y1 ,…, ϕ n (xn ) = y n .

Substituting in eq.(2.7.1) the successive values x0 , x1 , x2 ,..., x n for x at the same time putting

ϕ n (x0 ) = y0 , ϕ n ( x1 ) = y1 ,…, ϕ n (xn ) = y n , we have

a0 = y0 ;  a1 = (y1 − y0)/(x1 − x0) = ∆y0/h ;  a2 = ∆²y0/(2!h²) ;  a3 = ∆³y0/(3!h³) ; … ;  an = ∆ⁿy0/(n!hⁿ) .

Substituting these values of a0 , a1 ,..., a n in (2.7.1), we have

ϕn(x) = y0 + [(x − x0)/(1!h)] ∆y0 + [(x − x0)(x − x1)/(2!h²)] ∆²y0 + ... + [(x − x0)(x − x1)...(x − xn−1)/(n!hⁿ)] ∆ⁿy0        (2.7.2)

which is Gregory-Newton’s forward difference interpolation formula and it is useful to

interpolate near the beginning of a set of tabular values.

Now, setting u = (x − x0)/h,

from eq. (2.7.2), we obtain

ϕn(x) = y0 + u∆y0 + [u(u − 1)/2!] ∆²y0 + ... + [u(u − 1)...(u − n + 1)/n!] ∆ⁿy0

      = y0 + C(u, 1)∆y0 + C(u, 2)∆²y0 + ... + C(u, n)∆ⁿy0        (2.7.3)

In practical numerical computation, we should use eq. (2.7.3) instead of eq. (2.7.2), in order to

ease the calculations involved.

2.7.1. Error in Newton’s forward difference interpolation

To find the error committed in approximating f (x) by the polynomial ϕ n (x) , we have from eq.

(2.2.5), the remainder or truncation error or simply error is

Rn+1(x) = [(x − x0)(x − x1)(x − x2)...(x − xn)/(n + 1)!] f^(n+1)(ξ) ,        (2.7.1.1)

where min{ x, x0 , x1 ,..., xn } < ξ < max{x, x0 , x1 ,..., xn } .

Now, eq. (2.7.1.1) can be written as

Rn+1(x) = [(x − x0)(x − x1)(x − x2)...(x − xn)/((n + 1)! h^{n+1})] ∆^{n+1} f(ξ)

        = [u(u − 1)...(u − n)/(n + 1)!] ∆^{n+1} f(ξ)

Example 2:

Calculate the values of f (218) from the following data

x:  200     250     300     350     400

y:  15.04   16.81   18.42   19.90   21.27

Solution:

The forward difference table is

x      y        ∆y      ∆²y      ∆³y      ∆⁴y

200    15.04
                1.77
250    16.81            −0.16
                1.61              0.03
300    18.42            −0.13             −0.01
                1.48              0.02
350    19.90            −0.11
                1.37
400    21.27

Since x = 218 is near the beginning of the table, we use Newton's forward interpolation formula.

From the above forward difference table, only the values along the upper diagonal (15.04, 1.77,

−0.16, 0.03, −0.01) are used in Newton's forward interpolation formula.

Here, u = (x − x0)/h = (218 − 200)/50 = 18/50 = 0.36 .

Using Newton’s forward interpolation formula, we get

f(218) ≅ y0 + u∆y0 + [u(u − 1)/2!] ∆²y0 + [u(u − 1)(u − 2)/3!] ∆³y0 + [u(u − 1)(u − 2)(u − 3)/4!] ∆⁴y0

= 15.04 + 0.36 × 1.77 + [0.36(0.36 − 1)/2!] × (−0.16) + [0.36(0.36 − 1)(0.36 − 2)/3!] × 0.03

  + [0.36(0.36 − 1)(0.36 − 2)(0.36 − 3)/4!] × (−0.01)

= 15.04 + 0.6372 + 0.018432 + 0.00188928 + 0.000415641

= 15.6979
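A minimal Python sketch of eq. (2.7.3) reproduces this value, building the difference columns on the fly; the function name newton_forward is our own.

def newton_forward(xs, ys, x):
    """Newton's forward difference interpolation at x (equally spaced xs)."""
    h = xs[1] - xs[0]
    u = (x - xs[0]) / h
    diffs = list(ys)
    result = diffs[0]
    coeff = 1.0
    for k in range(1, len(xs)):
        diffs = [diffs[i + 1] - diffs[i] for i in range(len(diffs) - 1)]
        coeff *= (u - (k - 1)) / k        # u(u-1)...(u-k+1)/k!
        result += coeff * diffs[0]        # ∆^k y0 is the head of the k-th column
    return result

print(newton_forward([200, 250, 300, 350, 400],
                     [15.04, 16.81, 18.42, 19.90, 21.27], 218))   # ≈ 15.6979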

2.8. Newton’s backward difference interpolation

Let y = f(x) be a function which takes the values y0, y1, ..., yn for the equidistant values x0, x1,

x2, ..., xn, i.e. xi = x0 + ih for all i = 0, 1, 2, …, n.

Let, ϕ n (x) be a polynomial of degree n. This polynomial may be written as

ϕ n (x ) = a0 + a1 ( x − xn ) + a2 ( x − xn )( x − xn −1 ) + ... + an ( x − xn )( x − xn −1 )...( x − x1 ) (2.8.1)

We shall now determine the coefficients a0 , a1 ,..., a n so as to make

ϕ n ( xn ) = y n , ϕ n (xn −1 ) = y n −1 ,…, ϕ n (x0 ) = y0

Substituting in eq. (2.8.1) the successive values x n , x n −1 ,... for x at the same time putting

ϕ n ( xn ) = y n , ϕ n (xn −1 ) = y n −1 ,…, ϕ n (x0 ) = y0 etc., we have

a0 = yn ;  a1 = (yn − yn−1)/(xn − xn−1) = ∇yn/h ;  a2 = ∇²yn/(2!h²) ;  a3 = ∇³yn/(3!h³) ; … ;  an = ∇ⁿyn/(n!hⁿ) .

Substituting these values of a0 , a1 ,..., a n in (2.8.1), we have

ϕn(x) = yn + [(x − xn)/(1!h)] ∇yn + [(x − xn)(x − xn−1)/(2!h²)] ∇²yn + ... + [(x − xn)(x − xn−1)...(x − x1)/(n!hⁿ)] ∇ⁿyn        (2.8.2)

which is Gregory-Newton’s backward difference interpolation formula and it is useful to

interpolate near the end of a set of tabular values.

Now, setting u = (x − xn)/h,
we can obtain

ϕn(x) = yn + u∇yn + [u(u + 1)/2!] ∇²yn + ... + [u(u + 1)...(u + n − 1)/n!] ∇ⁿyn        (2.8.3)

In this case also, in practical numerical computation we should use eq. (2.8.3) instead of

eq. (2.8.2), in order to ease the calculations involved.

2.8.1. Error in Newton’s backward difference interpolation

To find the error committed in approximating f (x) by the polynomial ϕ n (x) , we have from eq.

(2.2.5), the remainder or truncation error or simply error is

Rn+1(x) = [(x − x0)(x − x1)(x − x2)...(x − xn)/(n + 1)!] f^(n+1)(ξ) ,        (2.8.1.1)

where min{ x, x0 , x1 ,..., xn } < ξ < max{x, x0 , x1 ,..., xn } .

Example 3:

Calculate the value of f(410) from the following data

x 200 250 300 350 400

y 15.04 16.81 18.42 19.90 21.27

Solution:

The forward difference table is

x      y        ∆y      ∆²y      ∆³y      ∆⁴y

200    15.04
                1.77
250    16.81            −0.16
                1.61              0.03
300    18.42            −0.13             −0.01
                1.48              0.02
350    19.90            −0.11
                1.37
400    21.27

Since x = 410 is near the end of the table, we use Newton's backward interpolation formula. From

the above forward difference table, only the values along the lower diagonal (21.27, 1.37, −0.11,

0.02, −0.01) are used in Newton's backward interpolation formula.

Here, u = (x − xn)/h = (410 − 400)/50 = 10/50 = 0.2 .

Using Newton’s backward interpolation formula, we get

f(410) ≅ yn + u∇yn + [u(u + 1)/2!] ∇²yn + [u(u + 1)(u + 2)/3!] ∇³yn + [u(u + 1)(u + 2)(u + 3)/4!] ∇⁴yn

= 21.27 + 0.2 × 1.37 + [0.2 × 1.2/2!] × (−0.11) + [0.2 × 1.2 × 2.2/3!] × 0.02 + [0.2 × 1.2 × 2.2 × 3.2/4!] × (−0.01)

= 21.27 + 0.274 − 0.0132 + 0.00176 − 0.000704

= 21.532
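The backward formula (2.8.3) can be sketched the same way, now reading the last entry of each difference column, since ∇^k yn = ∆^k y_{n−k}; the name newton_backward is our own.

def newton_backward(xs, ys, x):
    """Newton's backward difference interpolation at x (equally spaced xs)."""
    h = xs[1] - xs[0]
    u = (x - xs[-1]) / h
    diffs = list(ys)
    result = diffs[-1]
    coeff = 1.0
    for k in range(1, len(xs)):
        diffs = [diffs[i + 1] - diffs[i] for i in range(len(diffs) - 1)]
        coeff *= (u + (k - 1)) / k        # u(u+1)...(u+k-1)/k!
        result += coeff * diffs[-1]       # ∇^k yn is the tail of the k-th column
    return result

print(newton_backward([200, 250, 300, 350, 400],
                      [15.04, 16.81, 18.42, 19.90, 21.27], 410))   # ≈ 21.532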

2.9. Divided difference

Let us consider y 0 = f ( x0 ) , y1 = f ( x1 ) ,..., y n = f ( x n ) be the values of the function y = f (x)

corresponding to the values x0 , x1 , x2 ,..., xn , which are not necessarily equally spaced.

The first order divided difference of f(x) for two arguments x0, x1 is denoted by f[x0, x1] and is

defined by

f[x0, x1] = (f(x0) − f(x1))/(x0 − x1) = (f(x1) − f(x0))/(x1 − x0)

Similarly,

f[x1, x2] = (f(x1) − f(x2))/(x1 − x2) = (f(x2) − f(x1))/(x2 − x1) ,

f[x2, x3] = (f(x2) − f(x3))/(x2 − x3) = (f(x3) − f(x2))/(x3 − x2)

and so on.

The second order divided difference of f(x) for three arguments x0, x1, x2 is denoted by

f[x0, x1, x2] and is defined by

f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2)

In general, the n-th order divided difference of f(x) with n + 1 arguments x0, x1, x2, ..., xn is

defined by

f[x0, x1, x2, ..., xn] = (f[x0, x1, x2, ..., xn−1] − f[x1, x2, ..., xn])/(x0 − xn) .


Symmetric property of divided differences

The divided differences are symmetric with respect to their arguments.

f[x0, x1] = (f(x0) − f(x1))/(x0 − x1) = (f(x1) − f(x0))/(x1 − x0) = f[x1, x0]

Also, we have

f[x0, x1] = f(x0)/(x0 − x1) + f(x1)/(x1 − x0) = Σ_{i=0}^{1} f(x_i) / ∏_{j=0, j≠i}^{1} (x_i − x_j)

Again,

f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2)

= [1/(x0 − x2)] [ f(x0)/(x0 − x1) + f(x1)/(x1 − x0) − f(x1)/(x1 − x2) − f(x2)/(x2 − x1) ]

= f(x0)/[(x0 − x1)(x0 − x2)] + f(x1)/[(x1 − x0)(x1 − x2)] + f(x2)/[(x2 − x0)(x2 − x1)]

= Σ_{i=0}^{2} f(x_i) / ∏_{j=0, j≠i}^{2} (x_i − x_j)

2.8.1 Divided difference table

The following Table 4 shows the divided differences; the entries along the top diagonal,

f[x0, x1], f[x0, x1, x2], …, f[x0, x1, x2, ..., xn], are the ones used in Newton's divided difference

interpolation formula.

x       y        1st order            2nd order                    n-th order
                 divided difference   divided difference     …     divided difference

x0      y0
                 f[x0, x1]
x1      y1                            f[x0, x1, x2]
                 f[x1, x2]
x2      y2                            ⋮
⋮       ⋮        ⋮                    ⋮                            f[x0, x1, x2, ..., xn]
                 ⋮                    f[xn−2, xn−1, xn]
xn−1    yn−1
                 f[xn−1, xn]
xn      yn

Table 4 Divided difference table

2.9 Newton’s divided difference interpolation formula

Let us consider y0 = f ( x0 ) , y1 = f ( x1 ) ,..., y n = f ( xn ) be the values of the function y = f (x)

corresponding to the arguments x0 , x1 , x2 ,..., xn , which are not necessarily equally spaced.

Let ϕn(x) be the interpolating polynomial at the n + 1 distinct points x0, x1, …, xn. Let us

consider this polynomial as

ϕ n ( x) = c0 + c1 ( x − x0 ) + c2 ( x − x0 )( x − x1 ) + ... + cn ( x − x0 )( x − x1 )...(x − xn −1 ) . (2.9.1)

Substituting successively x = x0 , x1 ,..., xn in eq. (2.9.1), we obtain

ϕ n ( x0 ) = f ( x0 ) = c0 ,

ϕn(x1) = f(x1) = c0 + c1(x1 − x0) = f(x0) + c1(x1 − x0), which gives

c1 = (f(x1) − f(x0))/(x1 − x0) = f[x0, x1] ,

ϕn(x2) = f(x2) = c0 + c1(x2 − x0) + c2(x2 − x0)(x2 − x1)

       = f(x0) + [(f(x1) − f(x0))/(x1 − x0)](x2 − x0) + c2(x2 − x0)(x2 − x1), which gives

c2 = [1/((x2 − x0)(x2 − x1))] [ f(x2) − f(x0) − ((f(x1) − f(x0))/(x1 − x0))(x2 − x0) ]

   = f(x0)/[(x0 − x1)(x0 − x2)] + f(x1)/[(x1 − x0)(x1 − x2)] + f(x2)/[(x2 − x0)(x2 − x1)]

   = f[x0, x1, x2] .

Using induction, it can be obtained that

cn = f [ x0 , x1 ,..., xn ] . (2.9.2)

Therefore, the divided difference interpolating polynomial becomes

f ( x) ≅ ϕ n ( x) = f ( x0 ) + ( x − x0 ) f [ x0 , x1 ] + ( x − x0 )( x − x1 ) f [ x0 , x1 , x2 ] + ... + ( x − x0 )( x − x1 )...(x − xn −1 ) f [ x0 , x1 ,..., xn ]

which is called Newton’s divided difference interpolation formula or Newton’s general

interpolation formula with divided differences.

Example 4:

Use Newton's divided difference formula to calculate f(3) from the following table:

x 0 1 2 4 5 6

f (x) 1 14 15 5 6 19

Solution:

To use the Newton’s divided difference interpolation formula, we first construct the divided

difference table

x    y     1st order    2nd order    3rd order    4th order    5th order
           divided      divided      divided      divided      divided
           difference   difference   difference   difference   difference

0    1
           13
1    14                 −6
           1                         1
2    15                 −2                        0
           −5                        1                         0
4    5                  2                         0
           1                         1
5    6                  6
           13
6    19

From this divided difference table, only the values along the top diagonal (1, 13, −6, 1, 0, 0) are

used in the Newton divided difference interpolation formula.

Now, we obtain the Newton divided difference interpolating polynomial as

f(x) ≅ 1 + 13x + x(x − 1) × (−6) + x(x − 1)(x − 2) × 1

     = 1 + 13x − 6x(x − 1) + x(x − 1)(x − 2)

     = x³ − 9x² + 21x + 1

which gives

f(3) ≅ 27 − 81 + 63 + 1 = 10 .
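A minimal Python sketch of Newton's divided difference formula, computing the top-diagonal coefficients in place and evaluating the polynomial by nested multiplication (the name newton_divided is our own); it reproduces f(3) = 10 for the table above.

def newton_divided(xs, ys, x):
    """Newton's divided difference interpolation at x (xs need not be equispaced)."""
    n = len(xs)
    coeffs = list(ys)                     # becomes f[x0], f[x0,x1], ..., f[x0,...,xn]
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - k])
    result = coeffs[-1]
    for i in range(n - 2, -1, -1):        # Horner-like evaluation over the nodes
        result = result * (x - xs[i]) + coeffs[i]
    return result

print(newton_divided([0, 1, 2, 4, 5, 6], [1, 14, 15, 5, 6, 19], 3))   # 10.0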

2.10. Theorem (Relation between divided differences and forward differences):

Let the arguments x0 , x1 ,..., xn be equally spaced with step length h , i.e. xi = x0 + ih (i = 0,1,2,...,n) ,

then a divided difference reduces to a finite difference given by

f[x0, x1, ..., xn] = ∆ⁿf(x0)/(n! hⁿ) .        (2.10.1)

Proof:

We shall prove the result in eq. (2.10.1) by the method of induction on n.

For n=1

f[x0, x1] = (f(x0) − f(x1))/(x0 − x1) = (f(x1) − f(x0))/(x1 − x0) = ∆f(x0)/(1! h¹)

For n=2

f[x0, x1, x2] = (f[x0, x1] − f[x1, x2])/(x0 − x2) = [∆f(x1) − ∆f(x0)]/(2h²) = ∆²f(x0)/(2! h²)

Thus the result is true for n = 1, 2 .

Let us assume that the result is true for n = k , so

f[x0, x1, x2, ..., xk] = ∆^k f(x0)/(k! h^k)

Now, for n = k + 1, we have

f[x0, x1, ..., xk+1] = (f[x0, x1, ..., xk] − f[x1, x2, ..., xk+1])/(x0 − xk+1)

= [∆^k f(x1) − ∆^k f(x0)]/((k + 1)! h^{k+1}) ,  using the induction hypothesis and x0 − xk+1 = −(k + 1)h

= ∆^{k+1} f(x0)/((k + 1)! h^{k+1})

Therefore, it follows that the result is also true for n=k+1.

Hence, by the method of induction, the result holds for all positive integers n. ■

2.11 Cubic spline interpolation

Spline interpolation is a piecewise polynomial interpolation. Let f (x) be a function defined in

a ≤ x ≤ b and we partition [a, b] into finite number of subintervals by the node points

a = x0 < x1 < ... < x n = b (2.11.1)

Now, we shall find a cubic spline function ϕ (x) that approximates f (x) such that

ϕ ( x0 ) = f ( x0 ) = f 0 , ϕ ( x1 ) = f ( x1 ) = f1 ,…, ϕ ( xn ) = f ( xn ) = f n . (2.11.2)

2.11.1 Cubic spline

A cubic spline ϕ (x) defined on a ≤ x ≤ b corresponding to the partition eq. (2.11.1) is a continuous

function. ϕ (x) has continuous first and second derivatives everywhere in that interval. Also, in

each subinterval [xi −1 , xi ] , i = 1,2,...n , of that partition eq. (2.11.1), ϕ (x) is represented by a

polynomial si (x) of degree not exceeding three. Hence ϕ (x) consists of n such polynomials, one

in each subinterval. Therefore, in each subinterval [xi −1 , xi ] , the spline ϕ (x) must agree with the

polynomial si (x) of degree not exceeding three, such that

(i) si ( xi −1 ) = yi −1 and si ( xi ) = yi , i = 1,2,..., n , (2.11.1.1)

(ii) si (x) , si′ (x) and si′′(x) are continuous functions in [a, b] . This implies that

si ( xi ) = si +1 ( xi ) ,

si′ ( xi ) = si′+1 ( xi ) ,

si′′( xi ) = si′′+1 ( xi ) , i = 1,2,..., n − 1 . (2.11.1.2)

From the following fig. 2, the above conditions can be easily perceived.

Fig. 2 Geometrical representation of spline interpolation

In the subinterval [xi −1, xi ] , the spline ϕ (x) is given by a cubic polynomial which can be written in

the form

si ( x) = ai0 + ai1 ( x − xi ) + ai 2 ( x − xi ) 2 + ai3 ( x − xi )3 , i = 1,2,.., n (2.11.1.3)

We shall now determine the coefficient of the spline si (x) .

Now, using Taylor’s formula, from eq. (2.11.1.3) we get

a_{i0} = s_i(x_i) = y_i

a_{i1} = s_i′(x_i) = k_i

a_{i2} = ½ s_i″(x_i) = (3/h²)(y_{i−1} − y_i) + (1/h)(k_{i−1} + 2k_i)        (2.11.1.4)

a_{i3} = (1/6) s_i‴(x_i) = (2/h³)(y_{i−1} − y_i) + (1/h²)(k_{i−1} + k_i)

Hence, we can obtain the required cubic spline ϕ (x) by determining si (x) in each subinterval

[xi−1 , xi ] , i = 1,2,..,n .

For equispaced points h_i = x_i − x_{i−1} = h for all i, we can obtain

k_{i−1} + 4k_i + k_{i+1} = (3/h)(f_{i+1} − f_{i−1}) ,  i = 1, 2, ..., n − 1        (2.11.1.5)

Now, eq. (2.11.1.5) constitutes a system of n − 1 linear equations in n + 1 unknowns k 0 , k1 ,..., k n .

The remaining two equations may be derived using either the free or the clamped boundary

conditions, as discussed below.

• In case of free or natural boundary conditions

s1′′( x0 ) = 0 , s′n′ ( xn ) = 0 , (2.11.1.6)

We can derive

2y0′ + y1′ = 3(y1 − y0)/h1        (2.11.1.7)

yn−1′ + 2yn′ = 3(yn − yn−1)/hn        (2.11.1.8)

Eqs. (2.11.1.7) and (2.11.1.8) can be written as

2k0 + k1 = 3(y1 − y0)/h1        (2.11.1.9)

kn−1 + 2kn = 3(yn − yn−1)/hn        (2.11.1.10)

Solving eqs. (2.11.1.5), (2.11.1.9) and (2.11.1.10), we can determine the unknowns k0 , k1,..., kn .

It may be noted that, we need to solve an (n + 1) × ( n + 1) tridiagonal system of equations in order

to find the values of k i , i = 0,1,2,.., n .

• In case of clamped boundary conditions

s1′ ( x0 ) = y0′ = k 0 , sn′ ( xn ) = y ′n = k n (2.11.1.11)

are known explicitly. Substituting these values of k0 and kn in eq. (2.11.1.5), a system of n − 1

linear equations in n − 1 unknowns k i , i = 1,2,.., n − 1 are obtained. Solving this (n − 1) × (n − 1)

tridiagonal system of equations, we can determine the values of k i , i = 1,2,.., n − 1 .

Example 5: Interpolate f ( x) = x 4 on the interval − 1 ≤ x ≤ 1 by the cubic spline g (x)

corresponding to the partition x 0 = −1 , x1 = 0 , x 2 = 1 and satisfying the clamped conditions

g ′(−1) = f ′( −1) , g ′(1) = f ′(1)

Solution:

We write f (−1) = f 0 = 1 , f (0) = f1 = 0 , f (1) = f 2 = 1 . Here, x0 = −1 , x1 = 0 and x2 = 1 .

The given interval − 1 ≤ x ≤ 1 is partitioned into n = 2 parts, each of length h = 1 . Hence the spline

g (x) consists of n = 2 cubic polynomials.

From eq. (2.11.1.3), we have

si ( x) = ai0 + ai1 ( x − xi ) + ai 2 ( x − xi ) 2 + ai3 ( x − xi )3 , i = 1,2 (1)

Putting i = 1 in eq. (1), we get

s1( x) = a10 + a11( x − x1 ) + a12 ( x − x1 ) 2 + a13( x − x1)3

= a10 + a11x + a12 x 2 + a13x3 , −1 ≤ x ≤ 0 (2)

Again, putting i = 2 in eq. (1), we get

s2 ( x) = a20 + a21( x − x2 ) + a22 ( x − x2 ) 2 + a23( x − x2 )3

= a20 + a21( x − 1) + a22 ( x − 1) 2 + a23( x − 1)3 , 0 ≤ x ≤1 (3)

From eq. (2.11.1.5), we have

k0 + 4k1 + k2 = (3/h)(f2 − f0) = 3(1 − 1)/1 = 0 ,        (4)

where s1′ ( x0 ) = k0 , s1′ ( x1 ) = k1 and s′2 ( x2 ) = k 2 .

Now, given that, g ′ = f ′ at x = ±1 .

Therefore,

f′(−1) = −4 = g′(−1) = s1′(−1) = k0

f′(1) = 4 = g′(1) = s2′(1) = k2

Substituting k0 and k 2 into eq. (4), we get k1 = 0

Now, from eq. (2.11.1.4), we obtain

a10 = f1 = 0

a11 = k1 = 0

a12 = (3/h²)(f0 − f1) + (k0 + 2k1)/h = 3(1 − 0)/1² + (−4 + 0)/1 = −1

a13 = (2/h³)(f0 − f1) + (k0 + k1)/h² = 2(1 − 0)/1³ + (−4 + 0)/1² = −2

Similarly,

a20 = f 2 = 1

a21 = k2 = 4

a22 = (3/h²)(f1 − f2) + (k1 + 2k2)/h = 3(0 − 1)/1² + (0 + 8)/1 = 5

a23 = (2/h³)(f1 − f2) + (k1 + k2)/h² = 2(0 − 1)/1³ + (0 + 4)/1² = 2

Thus the cubic polynomials are

s1( x) = a10 + a11x + a12 x2 + a13x3 , for −1 ≤ x ≤ 0

= − x 2 − 2x 3 , for −1 ≤ x ≤ 0

and

s2 ( x) = a20 + a21( x − 1) + a22 ( x − 1) 2 + a23 ( x − 1)3 , for 0 ≤ x ≤ 1

= 1 + 4 ( x − 1) + 5( x − 1) 2 + 2 ( x − 1) 3 , for 0 ≤ x ≤ 1

= − x 2 + 2x 3 , for 0 ≤ x ≤ 1

Therefore, the required cubic spline g (x) is given by

g( x) = − x2 − 2 x3 if −1 ≤ x ≤ 0

= − x 2 + 2x3 if 0 ≤ x ≤ 1 .
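The small linear algebra in this example can be checked directly in Python. The sketch below solves the single interior equation (2.11.1.5) for k1 under the clamped conditions and then evaluates the coefficients of s1 from eq. (2.11.1.4); the variable names are our own.

# Clamped cubic spline of Example 5: f(x) = x**4 on [-1, 1] with h = 1
f0, f1, f2 = 1.0, 0.0, 1.0                 # f(-1), f(0), f(1)
h = 1.0
k0, k2 = -4.0, 4.0                         # clamped end slopes f'(-1), f'(1)

# Interior equation (2.11.1.5): k0 + 4*k1 + k2 = (3/h)*(f2 - f0)
k1 = ((3.0 / h) * (f2 - f0) - k0 - k2) / 4.0        # -> 0.0

# Coefficients of s1 on [-1, 0] from eq. (2.11.1.4), with i = 1
a10 = f1
a11 = k1
a12 = 3.0 / h**2 * (f0 - f1) + (k0 + 2.0 * k1) / h  # -> -1.0
a13 = 2.0 / h**3 * (f0 - f1) + (k0 + k1) / h**2     # -> -2.0
print(k1, a10, a11, a12, a13)              # s1(x) = -x**2 - 2*x**3, as derived above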

5. Numerical Integration formula from Newton’s forward interpolation formula

Let the values of the function f(x) be given at the (n + 1) equispaced points a = x0, x1, x2, ..., xn = b,

where xi = x0 + ih, i = 0(1)n and h = (b − a)/n = (xn − x0)/n .

Now, to find the definite integral ∫_a^b f(x)dx, we use an interpolating polynomial for f(x)

over [a, b] and then integrate it.

We first divide the range of integration [a, b] into n equal parts, each of length h by the (n + 1)

equidistant points a = x0, x1, x2,...,xn = b .

Using Newton's forward interpolation formula,

f(x) ≅ y0 + u∆y0 + [u(u − 1)/2!] ∆²y0 + ... + [u(u − 1)...(u − n + 1)/n!] ∆ⁿy0 ,

where u = (x − x0)/h, so that dx = h du.

Thus,

∫_a^b f(x)dx = h ∫_0^n [ y0 + u∆y0 + (u(u − 1)/2!) ∆²y0 + ... + (u(u − 1)...(u − n + 1)/n!) ∆ⁿy0 ] du + Rn+1(f) ,        (5.1)

where Rn+1(f) is the error term.

We may now deduce different quadrature formulae from eq. (5.1).

I. Trapezoidal Rule

Putting n = 1 in eq. (5.1), we get

∫_a^b f(x)dx = h ∫_0^1 [y0 + u∆y0] du + R2(f)

= h [ y0 + ½(y1 − y0) ] + R2(f)

= (h/2)[y0 + y1] + R2(f) ,  where R2(f) ≅ −(h³/12) f″(x0) .

• Composite trapezoidal Rule

We divide the range of integration [a, b] into n equal subintervals, each of width h by the n + 1

equidistant points of x , say a = x 0 , x1 = x 0 + h , x 2 = x 0 + 2h ,…, x n = x 0 + nh so that b − a = nh .

Let the values of the integrand at these equidistant points be yi = f ( xi ) , i = 0,1,2,...n .

Let, I = ∫_a^b y dx = ∫_{x0}^{xn} y dx = ∫_{x0}^{x1} y dx + ∫_{x1}^{x2} y dx + ... + ∫_{xn−1}^{xn} y dx

Taking the first subinterval [ x0 , x1 ] and using Newton’s forward interpolation formula, from eq.

(5.1), we get

I1 = ∫_{x0}^{x1} y dx = ∫_{x0}^{x1} [y0 + u∆y0] dx ,  where x = x0 + uh

   = h ∫_0^1 (y0 + u∆y0) du

   = (h/2)(y0 + y1)

For the next subinterval [ x1 , x2 ] , we can obtain

I2 = ∫_{x1}^{x2} y dx = (h/2)(y1 + y2)

and so on.

Finally, for the last subinterval [ x n −1 , x n ] , we have

In = ∫_{xn−1}^{xn} y dx = (h/2)(yn−1 + yn)

Adding I1 , I 2 ,…, I n , we have

I_TC = I1 + I2 + ... + In

     = (h/2)[y0 + yn + 2(y1 + y2 + ... + yn−1)] ,

which is known as composite trapezoidal rule for numerical integration.

Therefore, the total error is

E_TC ≅ −(h³/12)[ f″(x0) + f″(x1) + ... + f″(xn−1) ]        (5.2)

Let f″(ξ) = max_{a≤x≤b} f″(x) , where a < ξ < b.

So, from eq. (5.2), we get the error estimate

E_TC ≅ −(nh³/12) f″(ξ) = −((b − a)h²/12) f″(ξ) ~ O(h²)

which is called global truncation error in composite trapezoidal rule.

Now we may observe that global error in trapezoidal rule is O (h 2 ) whereas the corresponding

local error is O (h3 ) .

Hence, the composite trapezoidal rule with error term is

∫_a^b f(x)dx = (h/2)[y0 + yn + 2(y1 + y2 + ... + yn−1)] − ((b − a)h²/12) f″(ξ) ,  a < ξ < b.        (5.3)

5.1 Geometrical interpretation

Fig. 1 Geometrical interpretation of composite trapezoidal rule

From fig. 1, it can be observed that the geometrical significance of trapezoidal rule is that the

curve y = f (x) is replaced by n straight line segments joining the points ( x0 , y0 ) , ( x1 , y1 ) , …,

( xn −1 , yn−1 ) and ( xn , yn ) . The area bounded by the curve y = f (x ) , the ordinates x = x0 = a ,

x = xn = b and the x -axis is then approximately equal to the sum of the areas of the n trapeziums

obtained as shown in fig. 1.

5.3 Derivation of Composite Simpson’s one-third Rule

II. Simpson's one-third rule

Putting n = 2 in eq.(5.1), we get

∫_a^b f(x)dx = h ∫_0^2 [ y0 + u∆y0 + (u(u − 1)/2!) ∆²y0 ] du + R3(f)

= h [ 2y0 + 2(y1 − y0) + (1/3)(y0 − 2y1 + y2) ] + R3(f)

= (h/3)[y0 + 4y1 + y2] + R3(f) ,        (5.12)

where R3(f) ≅ −(h⁵/90) f^{iv}(ξ) ,  a = x0 < ξ < x2 = b .

• Composite Simpson’s one-third Rule

We divide the range of integration [a, b] into even number of equal subintervals, i.e. n = 2 m ,

m ∈ Z + say, by the n + 1 equidistant points of x , say a = x 0 , x1 = x 0 + h , x 2 = x 0 + 2h ,…,

x n = x 0 + nh so that b − a = nh . Let the values of the integrand at these equidistant points be

yi = f ( xi ) , i = 0,1,2,...n . Now, we apply Simpson’s one-third rule eq.(5.12) in each of the

subintervals [ x0 , x2 ] , [ x2 , x4 ] , [ x4 , x6 ] ,…, [ xn −2 , xn ] so that

I = ∫_a^b y dx = ∫_{x0}^{xn} y dx = ∫_{x0}^{x2} y dx + ∫_{x2}^{x4} y dx + ∫_{x4}^{x6} y dx + ... + ∫_{xn−2}^{xn} y dx

Taking the subinterval [ x0 , x2 ] and using Newton’s forward interpolation formula, from eq. (5.1),

we get

I1 = ∫_{x0}^{x2} y dx

   = h ∫_0^2 [ y0 + u∆y0 + (u(u − 1)/2!) ∆²y0 ] du

   = h [ 2y0 + 2∆y0 + (1/3)∆²y0 ]

   = (h/3)[y0 + 4y1 + y2]

Similarly, for the interval [ x 2 , x 4 ] , we can obtain

I2 = ∫_{x2}^{x4} y dx = (h/3)[y2 + 4y3 + y4]

and finally, for the last interval [ xn −2 , xn ] , we have

I_{n/2} = ∫_{xn−2}^{xn} y dx = (h/3)[yn−2 + 4yn−1 + yn]

Adding I1, I2, ..., I_{n/2}, we get

I_SC = I1 + I2 + ... + I_{n/2}

     = (h/3)[y0 + yn + 4(y1 + y3 + ... + yn−1) + 2(y2 + y4 + ... + yn−2)]

which is known as composite Simpson’s one-third rule for numerical integration.
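A minimal Python sketch of the composite one-third rule (the helper name composite_simpson is our own); the test integral of sin x over [0, π] has exact value 2.

import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's one-third rule; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    return (h / 3) * (ys[0] + ys[n] + 4 * sum(ys[1:n:2]) + 2 * sum(ys[2:n - 1:2]))

print(composite_simpson(math.sin, 0.0, math.pi, 10))   # ≈ 2.00011, exact value 2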

The global error estimate is given by

E_SC ≅ −(nh⁵/180) f^{iv}(ξ) = −((b − a)h⁴/180) f^{iv}(ξ) ~ O(h⁴)        (5.13)

Hence, the composite Simpson’s one-third rule with error term is

∫_a^b f(x)dx = (h/3)[y0 + yn + 4(y1 + y3 + y5 + ... + yn−1) + 2(y2 + y4 + ... + yn−2)]

             − ((b − a)h⁴/180) f^{iv}(ξ) ,  a < ξ < b.        (5.14)

5.4 Geometrical Significance of Simpson’s one-third rule

Fig. 2 Geometrical interpretation of composite Simpson’s one-third rule

From fig. 2, it can be observed that the geometrical significance of Simpson's one-third rule is that

the curve y = f(x) is replaced by n/2 parabolic arcs, joining three consecutive points

(x0, y0), (x1, y1), (x2, y2) ; (x2, y2), (x3, y3), (x4, y4) ; … ; (xn−2, yn−2), (xn−1, yn−1), (xn, yn) ,

where n = 2m, m ∈ Z⁺.

The sum of the n/2 areas thus obtained, as shown in fig. 2, is then approximately equal to the

area bounded by the curve y = f(x), the ordinates x = x0 = a, x = xn = b and the x-axis.

5.5 Truncation error in Simpson’s one-third rule

The truncation error term in this formula is given by

R3(f) = ∫_{x0}^{x2} f(x)dx − (h/3)[y0 + 4y1 + y2]        (5.15)

where y0 = f ( x0 ), y1 = f ( x0 + h), y2 = f ( x0 + 2h).

Now, let f(x) and its derivatives up to the fourth order be continuous in [a, b], i.e. in [x0, x2].

Expanding f(x) in a Taylor series about x0, it follows that

∫_{x0}^{x2} f(x)dx = 2h f(x0) + 2h² f′(x0) + (4/3)h³ f″(x0) + (2/3)h⁴ f‴(x0) + (4/15)h⁵ f^{iv}(x0) + ...

Again, expanding y1 and y2 in Taylor series expansion about x0 , it follows

y0 + 4y1 + y2 = f(x0) + 4 f(x0 + h) + f(x0 + 2h)

= f(x0) + 4 [ f(x0) + h f′(x0) + (h²/2!) f″(x0) + (h³/3!) f‴(x0) + (h⁴/4!) f^{iv}(x0) + ... ]

  + [ f(x0) + 2h f′(x0) + ((2h)²/2!) f″(x0) + ((2h)³/3!) f‴(x0) + ((2h)⁴/4!) f^{iv}(x0) + ... ]

= 6 f(x0) + 6h f′(x0) + 4h² f″(x0) + 2h³ f‴(x0) + (5/6)h⁴ f^{iv}(x0) + ...

Therefore, from eq. (5.15), we obtain

R3(f) = ∫_{x0}^{x2} f(x)dx − (h/3)[y0 + 4y1 + y2] = −(h⁵/90) f^{iv}(x0) + ...

So, R3(f) ≈ −(h⁵/90) f^{iv}(x0)        (5.16)

5.5.1 Composite Simpson’s one-third rule

In this case we divide the interval [a, b] into an even number of equal subintervals by the points

a = x0, x1, x2, ..., xn = b, where n = 2k is even and the length of each subinterval is h = (b − a)/n = (b − a)/(2k).

Then we apply Simpson's one-third rule to each of the n/2 subintervals [xi−2, xi], i = 2, 4, ..., n = 2k,

and get the value of the definite integral as

∫_a^b f(x)dx = ∫_{x0}^{x2} f(x)dx + ∫_{x2}^{x4} f(x)dx + ... + ∫_{xn−2}^{xn} f(x)dx

Now, according to eq. (5.16), the leading term of truncation error or local error in subinterval

[ x0 , x2 ] is

−(h⁵/90) f^{iv}(x0) ~ O(h⁵)        (5.17)

Similarly, the local errors in the remaining subintervals [xi, xi+2], (i = 2, 4, ..., n − 2) are given

respectively by

−(h⁵/90) f^{iv}(x2) , −(h⁵/90) f^{iv}(x4) , … , −(h⁵/90) f^{iv}(xn−2)

Therefore,

∫_{x0}^{x2} f(x)dx = (h/3)[y0 + 4y1 + y2] − (h⁵/90) f^{iv}(x0)

∫_{x2}^{x4} f(x)dx = (h/3)[y2 + 4y3 + y4] − (h⁵/90) f^{iv}(x2)

Finally, ∫_{xn−2}^{xn} f(x)dx = (h/3)[yn−2 + 4yn−1 + yn] − (h⁵/90) f^{iv}(xn−2)

Adding both sides of the above equations we have

∫_a^b f(x)dx = (h/3)[y0 + 4y1 + y2] + (h/3)[y2 + 4y3 + y4] + ... + (h/3)[yn−2 + 4yn−1 + yn]

             − (h⁵/90)[ f^{iv}(x0) + f^{iv}(x2) + ... + f^{iv}(xn−2) ]

= (h/3)[y0 + 4(y1 + y3 + y5 + ... + yn−1) + 2(y2 + y4 + ... + yn−2) + yn]

  − (h⁵/90)[ f^{iv}(x0) + f^{iv}(x2) + ... + f^{iv}(xn−2) ]        (5.18)

Therefore, the total error is

E_SC ≅ −(h⁵/90)[ f^{iv}(x0) + f^{iv}(x2) + ... + f^{iv}(xn−2) ]        (5.19)

Let f^{iv}(ξ) = max_{a≤x≤b} f^{iv}(x) , where a < ξ < b.

So, from eq. (5.19), we get the error estimate

E_SC ≅ −(nh⁵/180) f^{iv}(ξ) = −((b − a)h⁴/180) f^{iv}(ξ) ~ O(h⁴)        (5.20)

which is called the global truncation error in composite Simpson’s one-third rule.

Now we may observe that global error in Simpson’s one-third rule is O(h4 ) whereas the

corresponding local error is O ( h 5 ) .

Thus,

∫_a^b f(x)dx = (h/3)[y0 + 4(y1 + y3 + y5 + ... + yn−1) + 2(y2 + y4 + ... + yn−2) + yn]

             − ((b − a)h⁴/180) f^{iv}(ξ) ,  a < ξ < b.        (5.21)

This is called composite Simpson’s one-third rule for numerical integration with error term.

5.6 Derivation of Composite Simpson’s three-eight Rule

III. Simpson’s three-eight rule

Putting n = 3 in eq.(5.1), we get

∫_a^b f(x)dx = h ∫_0^3 [ y0 + u∆y0 + (u(u − 1)/2!) ∆²y0 + (u(u − 1)(u − 2)/3!) ∆³y0 ] du + R4(f)

= h [ 3y0 + (9/2)∆y0 + (9/4)∆²y0 + (3/8)∆³y0 ] + R4(f)

= (3h/8)[y0 + 3y1 + 3y2 + y3] + R4(f) ,        (5.22)

where R4(f) ≅ −(3h⁵/80) f^{iv}(x0) .

• Composite Simpson’s three-eight Rule

We divide the range of integration [a, b] into n = 3m , m ∈ Z + number of equal subintervals by the

n +1 equidistant points of x , say a = x 0 , x1 = x 0 + h , x 2 = x 0 + 2h ,…, x n = x 0 + nh so that

b − a = nh . Let the values of the integrand at these equidistant points be yi = f ( xi ) , i = 0,1,2,...n .

Now, we apply Simpson’s three-eight rule eq.(5.22) in each of the subintervals [ x0 , x3 ] , [ x3 , x6 ] ,…,

[ xn −3 , xn ] so that

I = ∫_a^b y dx = ∫_{x0}^{xn} y dx = ∫_{x0}^{x3} y dx + ∫_{x3}^{x6} y dx + ... + ∫_{xn−3}^{xn} y dx

Taking the subinterval [ x0 , x3 ] and using Newton’s forward interpolation formula, from eq. (5.1),

we get

I1 = ∫_{x0}^{x3} y dx

   = h ∫_0^3 [ y0 + u∆y0 + (u(u − 1)/2!) ∆²y0 + (u(u − 1)(u − 2)/3!) ∆³y0 ] du

   = (3h/8)[y0 + 3y1 + 3y2 + y3]

Similarly, for the interval [ x3 , x6 ] , we can obtain

I2 = ∫_{x3}^{x6} y dx = (3h/8)[y3 + 3y4 + 3y5 + y6]

and finally, for the last interval [ xn − 3 , x n ] , we have

I_{n/3} = ∫_{xn−3}^{xn} y dx = (3h/8)[yn−3 + 3yn−2 + 3yn−1 + yn]

Adding I1, I2, ..., I_{n/3}, we get

I_SC1 = I1 + I2 + ... + I_{n/3}

      = (3h/8)[y0 + yn + 3(y1 + y2 + y4 + y5 + ... + yn−2 + yn−1) + 2(y3 + y6 + ... + yn−3)]

which is known as composite Simpson’s three-eight rule for numerical integration.

The global error estimate is given by

E_SC1 ≅ −(nh⁵/80) f^{iv}(ξ) = −((b − a)h⁴/80) f^{iv}(ξ) ~ O(h⁴)        (5.23)

Hence, the composite Simpson’s three-eight rule with error term is

∫_a^b f(x)dx = (3h/8)[y0 + yn + 3(y1 + y2 + y4 + y5 + ... + yn−2 + yn−1) + 2(y3 + y6 + ... + yn−3)]

             − ((b − a)h⁴/80) f^{iv}(ξ) ,  a < ξ < b.        (5.24)
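A companion Python sketch of the composite three-eight rule (interior ordinates whose index is divisible by 3 get weight 2, the rest weight 3); composite_simpson_38 is our own name.

import math

def composite_simpson_38(f, a, b, n):
    """Composite Simpson's three-eight rule; n must be a multiple of 3."""
    if n % 3:
        raise ValueError("n must be a multiple of 3")
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    s = ys[0] + ys[n]
    for i in range(1, n):
        s += (2 if i % 3 == 0 else 3) * ys[i]
    return 3 * h / 8 * s

print(composite_simpson_38(math.sin, 0.0, math.pi, 9))   # slightly above the exact value 2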

6. Gauss Quadrature formula

In the integration formulae derived so far to evaluate I = ∫_a^b f(x)dx, we used the values of f(x) at

equally spaced points of the interval [a, b]. Gauss’s quadrature formula uses the same number of

functional values f (x) with different spacing in order to obtain better accuracy.

Let,

I = ∫_a^b w(x) y dx = ∫_a^b w(x) f(x) dx        (6.1)

be the integral to be evaluated, where w( x) > 0 ( a ≤ x ≤ b) is the weight function.

b −a b + a
Let us introduce transformation x =  u +   , so that the interval [a, b] is transformed to
 2   2 

[−1,1] .

Now, we have

y = f(x) = f( ((b − a)/2)u + ((b + a)/2) ) = φ(u) ,

w(x) = w( ((b − a)/2)u + ((b + a)/2) ) = ψ(u)

and dx = ((b − a)/2) du .

Therefore, from eq. (6.1), we obtain

b−a 1
I = ∫
 ψ (u )φ (u )du
 2  −1
(6.2)

The n-point Gauss’s quadrature formula is expressed in the form

∫_{−1}^{1} ψ(u)φ(u)du = w1φ(u1) + w2φ(u2) + ... + wnφ(un) = Σ_{k=1}^{n} wk φ(uk)        (6.3)

where u1, u2, ..., un are points in the interval (−1, 1). These points are known as

nodes or abscissae and w1 , w2 ,..., wn are the weights to be determined.

Now, suppose that, ψ (u ) = 1 . To evaluate eq. (6.3), we have to determine 2n unknowns ui and wi ,

i = 1, 2 ,..., n . Thus 2n equations are required and these are obtained such that the formula in eq.

(6.3) is exact for all polynomials of degree less than or equal to 2n − 1 .

Therefore, we can take

φ (u) = c0 + c1u + ... + c2n−1u 2n−1 (6.4)

and so

∫_{−1}^{1} φ(u)du = 2c0 + (2/3)c2 + ... + (2/(2n − 1))c_{2n−2} = 2 Σ_{k=1}^{n} c_{2k−2}/(2k − 1)        (6.5)

Thus, from eqs. (6.3) and (6.5), we get

w1φ(u1) + w2φ(u2) + ... + wnφ(un) = 2 Σ_{k=1}^{n} c_{2k−2}/(2k − 1)

This implies that

w1(c0 + c1u1 + ... + c_{2n−1}u1^{2n−1}) + w2(c0 + c1u2 + ... + c_{2n−1}u2^{2n−1}) + ...

+ wn(c0 + c1un + ... + c_{2n−1}un^{2n−1}) = 2 Σ_{k=1}^{n} c_{2k−2}/(2k − 1)        (6.6)

Now, comparing the different co-efficients of c j , j = 0,1,..., 2n − 1 from both sides of eq. (6.6), we

get

w1 + w2 + ... + wn = 2

w1u1 + w2u2 + ... + wnun = 0

w1u1² + w2u2² + ... + wnun² = 2/3        (6.7)

⋮

w1u1^{2n−2} + w2u2^{2n−2} + ... + wnun^{2n−2} = 2/(2n − 1)

w1u1^{2n−1} + w2u2^{2n−1} + ... + wnun^{2n−1} = 0

The above eqs. (6.7) form a system of 2n nonlinear equations in 2n unknowns ui and wi ,

i = 1, 2 ,..., n . By solving these 2n equations, we can obtain the values of the 2n unknowns ui and wi

, i = 1, 2, ..., n. Though it is theoretically possible to solve eqs. (6.7), in practice it is

extremely complicated to solve these equations, in general.


6.1 Gauss-Legendre Integration Method

It is found that when u1, u2, ..., un occurring in eqs. (6.7) are the roots of the Legendre equation

Pn(u) = 0, then the values of the wi's are easily obtained.

Now, by Rodrigues' formula, we have

Pn(u) = [1/(2ⁿ n!)] dⁿ/duⁿ (u² − 1)ⁿ

Thus u1 , u 2 ,..., u n are the roots of the equation

dⁿ/duⁿ (u² − 1)ⁿ = 0        (6.8)

whose roots are symmetrical about u = 0 , since the power of u is even.

Case I.

When n = 1, eq. (6.8) gives u = 0 , i.e. u1 = 0 . From eq. (6.7), we have w1 = 2 . Therefore, from eq.

(6.3), we get

∫_{−1}^{1} φ(u)du = 2φ(u1) = 2φ(0)

so that from eq. (6.2), we have

b a+b
∫ a
f ( x ) dx ≅ (b − a ) f  
 2 
(6.9)

This is called Gauss-Legendre one-point formula.

Case II.

When n = 2, eq. (6.8) gives u = ±1/√3. We take u1 = −1/√3 and u2 = 1/√3.

Now, from eq. (6.7), we have

w1 + w2 = 2

−(1/√3)w1 + (1/√3)w2 = 0

Solving the above equations, we obtain

w1 = w2 = 1

Therefore, from eq. (6.3), we get

∫_{−1}^{1} φ(u)du = φ(−1/√3) + φ(1/√3)

so that from eq. (6.2), we have

∫_a^b f(x)dx ≅ ((b − a)/2) [ f( −(b − a)/(2√3) + (b + a)/2 ) + f( (b − a)/(2√3) + (b + a)/2 ) ]        (6.10)

This is called Gauss-Legendre two-point formula.

Example 2: Evaluate the integral I = ∫_0^1 dx/(1 + x²) using the two-point Gauss-Legendre quadrature

formula.

Solution:

Putting x = u/2 + 1/2, we get

I = ∫_{−1}^{1} 2du/(u² + 2u + 5) = ∫_{−1}^{1} φ(u)du ,

where φ(u) = 2/(u² + 2u + 5) .

The 2-point Gauss quadrature formula is

I = w1φ(u1) + w2φ(u2) ,

where u1, u2 are the two roots of the equation

d²/du² (u² − 1)² = 0

This implies u = ±1/√3. Therefore, u1 = −1/√3 and u2 = 1/√3.

w1 and w2 are given by the equations

w1 + w2 = 2 and −(1/√3)w1 + (1/√3)w2 = 0

Solving the above equations, we get w1 = w2 = 1.

Thus, the 2-point Gauss-Legendre quadrature formula gives

I = φ(−1/√3) + φ(1/√3) = 0.786885


Case III.

When n = 3, eq. (6.8) gives u = 0, ±√(3/5). We take u1 = −√(3/5), u2 = 0 and u3 = √(3/5).

Now, from eq. (6.7), we have

w1 + w2 + w3 = 2

−√(3/5) w1 + √(3/5) w3 = 0

(3/5) w1 + (3/5) w3 = 2/3

Solving these equations, we get w1 = 5/9, w2 = 8/9 and w3 = 5/9.

Therefore, from eq. (6.3), we get

∫_{−1}^{1} φ(u)du = (5/9) φ(−√(3/5)) + (8/9) φ(0) + (5/9) φ(√(3/5))

so that from eq. (6.2), we have

∫_a^b f(x)dx ≅ ((b − a)/18) [ 5 f( −((b − a)/2)√(3/5) + (b + a)/2 ) + 8 f( (b + a)/2 )

             + 5 f( ((b − a)/2)√(3/5) + (b + a)/2 ) ]        (6.11)

This is called Gauss-Legendre three-point formula.

The error of the Gauss-Legendre quadrature formula is given by

En(f) = [2^{2n+1} (n!)⁴ / ((2n + 1)[(2n)!]³)] f^{(2n)}(ξ) ,  −1 < ξ < 1        (6.12)

The nodes and the corresponding weights for the Gauss-Legendre integration method are given

as follows:

n    nodes uk          weights wk

2    ±0.5773502692     1.0000000000

3     0.0000000000     0.8888888889
     ±0.7745966692     0.5555555556

4    ±0.3399810436     0.6521451549
     ±0.8611363116     0.3478548451

5     0.0000000000     0.5688888889
     ±0.5384693101     0.4786286705
     ±0.9061798459     0.2369268851

6    ±0.2386191861     0.4679139346
     ±0.6612093865     0.3607615730
     ±0.9324695142     0.1713244924

Table 1 Nodes and weights for Gauss-Legendre quadrature method
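In practice these nodes and weights need not be typed in by hand: NumPy tabulates them via numpy.polynomial.legendre.leggauss, a standard NumPy routine. A minimal sketch of an n-point rule on [a, b] (the wrapper gauss_legendre is our own):

import numpy as np

def gauss_legendre(f, a, b, n):
    """n-point Gauss-Legendre quadrature of f over [a, b]."""
    u, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    x = 0.5 * (b - a) * u + 0.5 * (b + a)       # map nodes to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(x))

print(gauss_legendre(lambda x: 1.0 / (1.0 + x**2), 0.0, 1.0, 3))   # ≈ 0.78527 (π/4 ≈ 0.78540)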

5. Numerical solution of simultaneous nonlinear equations

5.1 Newton’s Method

Newton-Raphson method can be extended to find the solutions of simultaneous equations in

several unknowns. Let us consider a system of two nonlinear equations in two unknowns

f ( x, y ) = 0 , g ( x, y ) = 0 (5.1.1)

Let ( x0 , y0 ) be an initial approximation to the roots of the system given in eq. (5.1.1). If

( x0 + ∆x, y0 + ∆y ) is the root of the system, then we must have

f ( x0 + ∆x, y0 + ∆y ) = 0 , g ( x0 + ∆x, y0 + ∆y ) = 0 (5.1.2)

Assuming the functions f and g to be sufficiently differentiable, and expanding f and g in Taylor

series in the neighbourhood of (x0, y0), we have

f(x0, y0) + ( ∆x ∂/∂x + ∆y ∂/∂y ) f(x0, y0) + (1/2!) ( ∆x ∂/∂x + ∆y ∂/∂y )² f(x0, y0) + ... = 0

g(x0, y0) + ( ∆x ∂/∂x + ∆y ∂/∂y ) g(x0, y0) + (1/2!) ( ∆x ∂/∂x + ∆y ∂/∂y )² g(x0, y0) + ... = 0

Neglecting second and higher powers of ∆x and ∆y , we have

f(x0, y0) + ∆x ∂f(x0, y0)/∂x + ∆y ∂f(x0, y0)/∂y = 0 ,

g(x0, y0) + ∆x ∂g(x0, y0)/∂x + ∆y ∂g(x0, y0)/∂y = 0 .        (5.1.3)

Solving (5.1.3) for ∆x and ∆y, we obtain

∆x = −(1/J0) | f(x0, y0)        ∂f(x0, y0)/∂y |
             | g(x0, y0)        ∂g(x0, y0)/∂y |

and

∆y = −(1/J0) | ∂f(x0, y0)/∂x   f(x0, y0) |
             | ∂g(x0, y0)/∂x   g(x0, y0) | ,

where the Jacobian

J0 = | ∂f(x0, y0)/∂x   ∂f(x0, y0)/∂y |
     | ∂g(x0, y0)/∂x   ∂g(x0, y0)/∂y |  ≠ 0 .

Thus the first approximation to the required root is given by

x1 = x0 + ∆x = x0 − (1/J0) | f(x0, y0)        ∂f(x0, y0)/∂y |
                           | g(x0, y0)        ∂g(x0, y0)/∂y | ,

y1 = y0 + ∆y = y0 − (1/J0) | ∂f(x0, y0)/∂x   f(x0, y0) |
                           | ∂g(x0, y0)/∂x   g(x0, y0) | .

Repeating the above procedure, the (n + 1) -th approximation to the roots is given by

xn+1 = xn + ∆x = xn − (1/Jn) | f(xn, yn)        ∂f(xn, yn)/∂y |
                             | g(xn, yn)        ∂g(xn, yn)/∂y | ,

yn+1 = yn + ∆y = yn − (1/Jn) | ∂f(xn, yn)/∂x   f(xn, yn) |
                             | ∂g(xn, yn)/∂x   g(xn, yn) | ,

where, in each iteration, it is assumed that the Jacobian

Jn = | ∂f(xn, yn)/∂x   ∂f(xn, yn)/∂y |
     | ∂g(xn, yn)/∂x   ∂g(xn, yn)/∂y |  ≠ 0 ,  (n = 0, 1, 2, ...) .

This iteration process will continue until |xn+1 − xn| < ε and |yn+1 − yn| < ε, where ε is the given

accuracy.

Example 10:

Solve the system of equations

x2 + y 2 − 1 = 0

x3 − y = 0 ,

correct to four decimal places, using Newton’s method, given that x0 = 1 and y0 = 0.5 .

Solution:

Let,

f ( x, y) = x 2 + y 2 − 1 = 0 , g ( x, y ) = x3 − y = 0

Here, the Jacobian is given by

Jn = | 2xn    2yn |
     | 3xn²   −1  |  =  −2xn − 6xn² yn

Now, according to Newton’s method, we have

xn+1 = xn − (1/Jn) | xn² + yn² − 1   2yn |
                   | xn³ − yn        −1  | ,

yn+1 = yn − (1/Jn) | 2xn    xn² + yn² − 1 |
                   | 3xn²   xn³ − yn      | .

In compact form, this is

( xn+1 , yn+1 ) = ( xn , yn ) − (1/Jn) ( 1 − xn² + yn² − 2xn³yn ,  3xn² − xn⁴ − 2xnyn − 3xn²yn² ) .

Starting with x0 = 1 and y0 = 0.5 , we get

( x1 , y1 ) = ( 1 , 0.5 ) − (1/J0) ( −0.75 , 0.25 ) = ( 0.85 , 0.55 ) ,  with J0 = −5 ,

( x2 , y2 ) = ( 0.85 , 0.55 ) − (1/J1) ( −0.0955375 , 0.054825 ) = ( 0.826608 , 0.563424 ) ,  with J1 = −4.08425 .
4 2 2

Similarly,

( x3 , y3 ) = ( 0.826032 , 0.563624 ) ,

( x4 , y4 ) = ( 0.826031 , 0.563624 ) .

Therefore, the required roots of the given equations are x = 0.8260 and y = 0.5636, correct to four

decimal places.
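A minimal Python sketch of the iteration for Example 10, writing the Cramer-rule determinants out explicitly (the function name newton_system is our own):

def newton_system(x, y, eps=1e-6, max_iter=50):
    """Newton's method for f = x**2 + y**2 - 1 = 0, g = x**3 - y = 0 (Example 10)."""
    for _ in range(max_iter):
        f, g = x * x + y * y - 1.0, x ** 3 - y
        fx, fy = 2.0 * x, 2.0 * y          # partial derivatives of f
        gx, gy = 3.0 * x * x, -1.0         # partial derivatives of g
        J = fx * gy - fy * gx              # Jacobian determinant
        dx = -(f * gy - g * fy) / J        # Cramer's rule for ∆x
        dy = -(fx * g - gx * f) / J        # Cramer's rule for ∆y
        x, y = x + dx, y + dy
        if abs(dx) < eps and abs(dy) < eps:
            break
    return x, y

print(newton_system(1.0, 0.5))   # ≈ (0.826031, 0.563624)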

5.2 Truncation error in trapezoidal rule

The truncation error in trapezoidal rule is given by

R2(f) = ∫_a^b f(x)dx − (h/2)(y0 + y1)        (5.4)

which is obviously a function of h where

x0 = a, x1 = b, h = b − a = x1 − x0 , y0 = f ( x0 ), y1 = f ( x1 ) = f ( x0 + h) .

Now, let us assume that f (x) and its first, second and higher order derivatives are continuous in

[ x0 , x1 ] . Expanding f (x) and f ( x0 + h) by Taylor series about x = x0 , it follows

R2(f) = ∫_{x0}^{x0+h} f(x)dx − (h/2)[ f(x0) + f(x0 + h) ]

= ∫_{x0}^{x0+h} [ f(x0) + (x − x0) f′(x0) + ((x − x0)²/2!) f″(x0) + ((x − x0)³/3!) f‴(x0) + ... ] dx

  − (h/2)[ f(x0) + f(x0 + h) ]

= h f(x0) + (h²/2!) f′(x0) + (h³/3!) f″(x0) + (h⁴/4!) f‴(x0) + ...

  − (h/2) [ 2 f(x0) + h f′(x0) + (h²/2!) f″(x0) + ... ]

= (1/6 − 1/4) h³ f″(x0) + (1/24 − 1/12) h⁴ f‴(x0) + ...

≈ −(h³/12) f″(x0) ,  if h is small        (5.5)

Since f ′′(x) is assumed to be continuous in [a, b] and h = x1 − x0 = b − a is small, we get

approximately

R2(f) ≈ −(h³/12) f″(ξ) ,  where a < ξ < b        (5.6)

Hence, the trapezoidal rule with error term is given by

∫_a^b f(x)dx = (h/2)(y0 + y1) − (h³/12) f″(ξ)        (5.7)

5.2.1 Global error in composite trapezoidal rule

If h is not small, so that f(x) varies much in the interval [a, b], we divide [a, b] into n parts by

the node points xi = x0 + ih, where x0 = a, xn = b and h = (b − a)/n is the width of each subinterval.

We then apply trapezoidal rule for each of the subintervals [ xi −1, xi ] , i = 1,2,..., n , yielding

∫_a^b f(x)dx = ∫_{x0}^{x1} f(x)dx + ∫_{x1}^{x2} f(x)dx + ... + ∫_{xn−1}^{xn} f(x)dx = Σ_{i=1}^{n} ∫_{xi−1}^{xi} f(x)dx

Now, according to eq. (5.5), the leading term of truncation error or local error in subinterval

[ x0 , x1 ] is

−(h³/12) f″(x0) ~ O(h³)        (5.8)

Similarly, the local errors in the remaining subintervals [ xi , xi +1 ] , (i = 1,2,..., n − 1) are given

respectively by

−(h³/12) f″(x1) , −(h³/12) f″(x2) , … , −(h³/12) f″(xn−1) .

Thus,

∫_a^b f(x)dx = Σ_{i=1}^{n} (h/2)[yi−1 + yi] − (h³/12) Σ_{i=0}^{n−1} f″(xi)

= (h/2)[y0 + yn + 2(y1 + y2 + ... + yn−1)] − (h³/12) Σ_{i=0}^{n−1} f″(xi) .        (5.9)

Therefore, the total error is

E_TC ≅ −(h³/12)[ f″(x0) + f″(x1) + ... + f″(xn−1) ] .        (5.10)

Let f″(ξ) = max_{a≤x≤b} f″(x) , where a < ξ < b.

So, from eq. (5.10), we get the error estimate

E_TC ≅ −(nh³/12) f″(ξ) = −((b − a)h²/12) f″(ξ) ~ O(h²)        (5.11)

which is called global truncation error in composite trapezoidal rule.

Now we may observe that global error in trapezoidal rule is O (h 2 ) whereas the corresponding

local error is O (h3 ) .

Example 1: Evaluate ∫_0^{π/2} √(sin x) dx correct to five significant digits, taking n = 6 subintervals

and using the trapezoidal rule.

Solution: Here, the step size is h = π/12. We first tabulate the values of f(x) = √(sin x) in the

following table.

x          y

0          0

π/12       0.5087426

π/6        0.7071068

π/4        0.8408964

π/3        0.9306048

5π/12      0.9828152

π/2        1

Using trapezoidal rule, we get

I_T = (π/24) [ 0 + 1 + 2 × (0.5087426 + 0.7071068 + 0.8408964 + 0.9306048 + 0.9828152) ]

    = 1.1703 (rounded off to five significant digits)
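A minimal Python sketch of the composite trapezoidal rule reproduces this value (composite_trapezoid is our own helper name):

from math import pi, sin, sqrt

def composite_trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

print(composite_trapezoid(lambda x: sqrt(sin(x)), 0.0, pi / 2, 6))   # ≈ 1.1703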

6.1 Gauss elimination method

Example 1: Solve the following linear system of equations by Gauss elimination method

x1 + 2 x 2 + x3 = 0

2 x1 + 2 x2 + 3x3 = 3

− x1 − 3 x2 = 2

Solution:

First we represent the given system of linear equations with the following augmented matrix

1 2 1 0
 
[A| b]=  2 2 3 3
− 1 − 3 0 2

Now, we shall apply the Gauss elimination method (without partial pivoting) in the following steps.

Since there are 3 unknowns x1, x2, x3, the number of elimination steps required is 2.

Step 1:

1 2 1 0 R′2 ←R2 −2 R1 1 2 1 0
  R3′ ← R3 + R1  
2 2 3 3   → 0 − 2 1 3
− 1 − 3 0 2 0 − 1 1 2
  

Step 2:

1 2 1 0 1 1 2 1 0
  R3′ ←R3 − 2 R2  
 0 − 2 1 3     → 0 − 2 1 3 
0 − 1 1 2 0 0 0.5 0.5
   

Thus, starting with the augmented matrix [A | b], we arrive at the following upper triangular

matrix [U | g], viz.

[ 1   2   1  |  0  ]
[ 0  −2   1  |  3  ]
[ 0   0  0.5 | 0.5 ]

after completion of Gauss elimination method in 2 steps.

Now, the equivalent system of equations is

x1 + 2 x 2 + x3 = 0

−2 x2 + x3 = 3

0.5 x3 = 0.5

We solve the above system of equations by back substitution method.

Hence, the required solution of the system of linear equations is

x1 = 1 ,  x2 = −1 ,  x3 = 1 .

Example 2: Using Gauss elimination method, solve the following linear system of equations

x1 + x2 + x3 = 9

2 x1 − 3x2 + 4 x3 = 13

3x1 + 4 x 2 + 5 x3 = 40

Solution:

First we represent the given system of linear equations with the following augmented matrix

            [ 1   1   1 |  9 ]
[A | b]  =  [ 2  −3   4 | 13 ]
            [ 3   4   5 | 40 ]

Now, we shall apply the Gauss elimination method (with partial pivoting) in the following steps.

Since there are 3 unknowns x1, x2, x3, the number of elimination steps required is 2.

Step 1:

[ 1   1   1 |  9 ]   R1 ↔ R3   [ 3   4   5 | 40 ]   R2′ ← R2 − (2/3)R1   [ 3     4      5  |  40   ]
[ 2  −3   4 | 13 ]   ────→     [ 2  −3   4 | 13 ]   R3′ ← R3 − (1/3)R1   [ 0  −17/3   2/3  | −41/3 ]
[ 3   4   5 | 40 ]             [ 1   1   1 |  9 ]   ──────────→          [ 0   −1/3  −2/3  | −13/3 ]

Step 2:

   
3 4 5 40  3 4 5 40 
 17 2 41 
1
R3′ ←R3 − R2  17 2 41 
0 − −   17 → 0 − − 
 3 3 3  3 3 3 
 1 2 13   12 180 
0 − 3 − −  0 0 − −
17 51 
 3 3 

Thus, starting with the augmented matrix [A | b], we arrive at the following upper triangular

matrix [U | g], viz.

[ 3     4      5   |   40    ]
[ 0  −17/3    2/3  |  −41/3  ]
[ 0    0    −12/17 | −180/51 ]

after completion of Gauss elimination method in 2 steps.

Now, the equivalent system of equations is

3x1 + 4x2 + 5x3 = 40

−(17/3)x2 + (2/3)x3 = −41/3

−(12/17)x3 = −180/51

We solve the above system of equations by back substitution method.

Hence, the required solution of the system of linear equations is

x1 = 1 ,  x2 = 3 ,  x3 = 5 .

6.2 Gauss-Jordan method

This method is much similar to Gauss elimination method including the possible use of partial

pivoting. In this method, elimination of unknowns is performed not in the equations below the

diagonal but in the equations above the diagonal also, as a result of which the coefficient matrix

of the given system of equations reduces to a diagonal matrix or identity matrix form.

Example 3: Using Gauss-Jordan method, solve the following equations

2 x1 + 4 x2 + x3 = 3

3x1 + 2 x 2 − 2 x3 = −2

x1 − x 2 + x3 = 6

Solution:

First we represent the given system of linear equations with the following augmented matrix

2 4 1 3
 
[A| b]=  3 2 − 2 − 2
1 − 1 1 6 

Now, we shall apply the Gauss-Jordan method in the following steps. Since there are 3 unknowns

x1, x2, x3, the number of steps required is 3.

Step 1:

[ 2   4   1 |  3 ]            [ 3   2  −2 | −2 ]
[ 3   2  −2 | −2 ]  R1 ↔ R2   [ 2   4   1 |  3 ]
[ 1  −1   1 |  6 ]  ────→     [ 1  −1   1 |  6 ]

          [ 1  2/3  −2/3 | −2/3 ]  R2′ ← R2 − 2R1   [ 1   2/3  −2/3 | −2/3 ]
(1/3)R1   [ 2   4     1  |   3  ]  R3′ ← R3 − R1    [ 0   8/3   7/3 | 13/3 ]
────→     [ 1  −1     1  |   6  ]  ──────────→      [ 0  −5/3   5/3 | 20/3 ]

Step 2:

 2 2 2  2 2 2  5 7
1 3 − −  1 3 − 3 − 3  R1′ ←R1 − 23 R2 1 0 − 4 − 4 
3 3 
 8 7 13  8 R2
3
7 13  R3′ ←R3 + 53 R2  7 13 
0  →
 0 1   → 0 1 
 3 3 3   8 8   8 8 
 5 5 20   5 5 20   25 75 
0 − 3 3  0 − 3 3 3  0 0 8 8 
 3    

Step 3:

 5 7  5 7
1 0 − 4 − 4  1 0 − 4 − 4  R1′ ←R1 + 54 R3
 1 0 0 2 
7 13  258 R3  7 13  R2′ ← R2 − 78 R3  
0 1  
→ 0 1     → 0 1 0 − 1
 8 8   8 8 
 25 75  0 0 1 3  0 0 1 3 
0 0 8 8   
   

Therefore, the required solution of the system of linear equations is

x1 = 2 ,  x2 = −1 ,  x3 = 3 .

6.3 Inversion of a matrix by Gaussian method

Usually, the Gauss-Jordan method is used to compute the inverse of a matrix. We start with the matrix A augmented by the identity matrix I of the same order, viz. [A | I]. When the Gauss-Jordan procedure is completed, we obtain [I | A⁻¹]. Therefore, using the Gauss-Jordan method, we have

[A | I]  --(Gauss-Jordan method)-->  [I | A⁻¹]

Example 4: Using Gauss-Jordan method, find the inverse of the following matrix

 4 − 1 − 5
15 1 − 5
 
 5 4 9 

Solution: Let

    [ 4  -1  -5]
A = [15   1  -5]
    [ 5   4   9]

We start with the following augmented matrix

[ 4  -1  -5 | 1  0  0]
[15   1  -5 | 0  1  0]
[ 5   4   9 | 0  0  1]

Step 1:

Interchanging R1 and R2 (partial pivoting) and normalizing the pivot row by (1/15)R1:

[ 4  -1  -5 | 1  0  0]         [15   1  -5 | 0  1  0]              [1   1/15  -1/3 | 0  1/15  0]
[15   1  -5 | 0  1  0]  R12 →  [ 4  -1  -5 | 1  0  0]  (1/15)R1 →  [4  -1     -5   | 1  0     0]
[ 5   4   9 | 0  0  1]         [ 5   4   9 | 0  0  1]              [5   4      9   | 0  0     1]

Then R2′ ← R2 − 4R1 and R3′ ← R3 − 5R1 give

[1    1/15   -1/3 | 0   1/15  0]
[0  -19/15  -11/3 | 1  -4/15  0]
[0   11/3    32/3 | 0  -1/3   1]

Step 2:

 1 1 1   1 1 1 
1 15 − 0 0 1 15 − 0 0
3 15 3 15
 19 11 4   11 32 1  3
R2
0 − − 1 − 0 →R23
0 0 − 1 11
→
 15 3 15   3 3 3 
 11 32 1   19 11 4 
0 0 − 1 0 − 15 − 1 − 0
 3 3 3   3 15 
 1 1 1   29 4 1
1 15 − 0 0  R1′ ←R1 − 1 R2 1 0 − 0 − 
3 15 15 55 55 55
 32 1 3  R3′ ←R3 +15 R2 
19
32 1 3 
0 1 0 −  → 0 1 0 − 
 11 11 11  11 11 11 
 19 11 4   1 21 19 
0 − 15 − 3 1 − 15 0 0 0 1 −
55 55 
   55

Step 3:

 29 4 1  29 4 1
1 0 − 55 0 55 − 55  1 0 − 55 0 55 − 55 
 32 1 3  55R3  32 1 3 
0 1 0 −  
→ 0 1 0 − 
 11 11 11   11 11 11 
 1 21 19  0 0 1 55 − 21 19 
 0 0 1 −
 55 55 55  


29
R1′ ←R1 + R3
1 0 0 29
55
32
− 11 10 

R2′ ←R2 − R3 
→ 0 1 0 − 160 61 − 55
11

0 0 1 55 − 21 19 

Hence, the required inverse of the given matrix is

 29 − 11 10 
 − 160 61 − 55 .
 
 55 − 21 19 

6.4 Triangularization method

This method is also known as triangular decomposition or factorization method or LU-

decomposition method.

This method is based on the fact that every square matrix A = [aij]n×n can be expressed as the product of a lower and an upper triangular matrix, provided all the leading principal minors of A are non-zero, i.e.

a11 ≠ 0,   |a11  a12; a21  a22| ≠ 0,   …,   det A = |A| ≠ 0.

Further, if the matrix A can be so factorized, then the factorization is unique. When such a factorization is possible, the coefficient matrix A of the system of equations Ax=b is

decomposed into the product of a lower triangular matrix L and an upper triangular matrix U so

that

A=LU (3.3.1)

where

    [l11   0    0   ...   0 ]         [u11  u12  u13  ...  u1n]
    [l21  l22   0   ...   0 ]         [ 0   u22  u23  ...  u2n]
L = [l31  l32  l33  ...   0 ]  and U =[ 0    0   u33  ...  u3n]        (3.3.2)
    [ ⋮    ⋮    ⋮   ...   ⋮ ]         [ ⋮    ⋮    ⋮   ...   ⋮ ]
    [ln1  ln2  ln3  ...  lnn]         [ 0    0    0   ...  unn]

Thus, the system of equations Ax=b becomes

LUx=b (3.3.3)

Let us take

Ux=z (3.3.4)

Substituting this in eq. (3.3.3), we get

Lz=b (3.3.5)

The unknowns z1 , z2 ,…, zn in eq.(3.3.5) are determined by forward substitution and then

consequently the unknowns x1, x2 ,...,xn in (3.3.4) are determined by backward substitution.

To determine L and U in LU-decomposition method, we can apply any of the following two

methods.

6.4.1 Doolittle’s method

If we choose all the elements of principal diagonal of lower triangular matrix L as 1, i.e. lii = 1 ,

i = 1,2,..., n , then the corresponding method is called Doolittle’s method.

Example 5: Find the solution of the system of equations

x1 + x2 + x3 = 1
4x1 + 3x2 − x3 = 6
3x1 + 5x2 + 3x3 = 4

by LU-decomposition method.

Solution:

We apply Doolittle’s method. However, Crout’s method can also be applied. Using Doolittle’s

method, the coefficient matrix A can be decomposed as follows

1 1 1   1 0 0 u11 u12 u13   u11 u12 u13 


  
A = 4 3 − 1 = l 21 1 0  0 u 22  
u 23  = l21u11 l21u12 + u 22 l21u13 + u 23 

3 5 3  l31 l32 1   0 0 u33  l31u11 l31u12 + l32 u 22 l31u13 + l32 u 23 + u 33 

Therefore, we have

u11 = 1 , u12 = 1 , u13 = 1 , l 21u11 = 4 , l 21u12 + u 22 = 3 , l 21u13 + u 23 = −1 , l31u11 = 3 , l31u12 + l32 u 22 = 5 and

l 31u13 + l 32 u 23 + u 33 = 3 .

Solving the above equations, we get

u11 = u12 = u13 = 1, l 21 = 4, u 22 = −1, u 23 = −5, u 33 = −10


l 31 = 3, l 32 = −2

Now, using forward substitution method, solution of the system of equations Lz=b, i.e.

1 0 0   z1  1 
4 1 0  z  = 6 ,
  2   
3 − 2 1   z 3   4

yields z1 = 1, z 2 = 2, z 3 = 5 .

Again, solving the system of equations Ux=z, i.e.

1 1 1   x1  1 
0 − 1 − 5   x  = 2
  2   
0 0 − 10  x 3  5 

by back substitution method, we get

1 
x= 1 / 2 

− 1 / 2

Therefore, the required solution is

x1 = 1, x2 = 1/2, x3 = −1/2.
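A minimal Python sketch of Doolittle's factorization (unit diagonal in L, no pivoting) is given below; the function name doolittle_lu and the NumPy dependency are our own illustrative choices.

import numpy as np

def doolittle_lu(A):
    # Doolittle factorization A = LU with unit diagonal in L (no pivoting).
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):            # row i of U
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for j in range(i + 1, n):        # column i of L
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U

A = np.array([[1., 1., 1.], [4., 3., -1.], [3., 5., 3.]])
L, U = doolittle_lu(A)
z = np.linalg.solve(L, np.array([1., 6., 4.]))   # solve Lz = b
x = np.linalg.solve(U, z)                        # solve Ux = z
print(x)   # expected: [ 1.   0.5 -0.5]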

6.4.2 Crout’s method

If we choose all the elements of principal diagonal of upper triangular matrix U as 1, i.e. u ii = 1 ,

i = 1,2,..., n , then the corresponding method is called Crout’s method.

Example 6 : Find the solution of the system of equations

2x1 + x2 + 4x3 = 12
8x1 − 3x2 + 2x3 = 20
4x1 + 11x2 − x3 = 33

by Crout’s decomposition method.

Solution:

Using Crout’s method, the coefficient matrix A can be decomposed as follows

2 1 4  l11 0 0  1 u12 u13  l11 l11u12 l11u13 


A = 8 − 3 2  = l21
 l22 0  0 1 u 23  = l21 l21u12 + l22 l 21u13 + l 22 u 23 
4 11 − 1 l31 l32 l33  0 0 1  l31 l31u12 + l32 l31u13 + l32 u 23 + l33 

Therefore, we have

l11 = 2, l21 = 8, l31 = 4, l11u12 = 1, l11u13 = 4, l21u12 + l22 = −3, l21u13 + l22u23 = 2, l31u12 + l32 = 11 and l31u13 + l32u23 + l33 = −1.

Solving the above equations, we get

l11 = 2, l21 = 8, l31 = 4, u12 = 1/2, u13 = 2, l22 = −7, u23 = 2, l32 = 9, l33 = −27

Now, using forward substitution method, solution of the system of equations Lz=b, i.e.

2 0 0   z1  12 
8 − 7  
 0   z 2  = 20  ,
4 9 − 27   z 3  33 

yields z1 = 6, z 2 = 4, z3 = 1 .

Again, solving the system of equations Ux=z, i.e.

1 0.5 2   x1  6 
 0 1 2   x  =  4
  2   
0 0 1   x 3  1 

by back substitution method, we get

3 
x= 2
1 

Therefore, the required solution is

x1 = 3 , x2 = 2 , x3 = 1 .
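For comparison, Crout's factorization (unit diagonal in U) can be sketched analogously; the function name crout_lu is again our own hypothetical choice, assuming NumPy.

import numpy as np

def crout_lu(A):
    # Crout factorization A = LU with unit diagonal in U.
    n = A.shape[0]
    L, U = np.zeros((n, n)), np.eye(n)
    for j in range(n):
        for i in range(j, n):            # column j of L
            L[i, j] = A[i, j] - L[i, :j] @ U[:j, j]
        for i in range(j + 1, n):        # row j of U
            U[j, i] = (A[j, i] - L[j, :j] @ U[:j, i]) / L[j, j]
    return L, U

A = np.array([[2., 1., 4.], [8., -3., 2.], [4., 11., -1.]])
L, U = crout_lu(A)
z = np.linalg.solve(L, np.array([12., 20., 33.]))   # solve Lz = b
print(np.linalg.solve(U, z))                        # expected: [3. 2. 1.]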

6.5 Cholesky’s method

Let A = [aij]n×n be a symmetric positive definite matrix of order n, i.e.

xᵀAx = Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi xj > 0 for every x ≠ 0.

The method states that there exists always a lower triangular matrix L such that

A=LL T , (6.5.1)

where L = [lij ]n×n and lij = 0 , i< j

The equation Ax=b gives

LL T x=b (6.5.2)

Let us take

LT x=z, (6.5.3)

then

Lz=b (6.5.4)

The solutions zi (i=1,2,…,n) are obtained from eq. (6.5.4) by forward substitution method and

then the solutions xi (i=1,2,…,n) are determined from eq. (6.5.3) by back substitution process.

The elements lij (i, j = 1, 2, …, n) of the lower triangular matrix L are given by

l11 = √a11 ,

li1 = ai1 / l11 ,  i = 2, …, n

lii = √( aii − Σ_{j=1}^{i−1} lij² ) ,  i = 2, …, n        (6.5.5)

lij = (1/ljj)( aij − Σ_{k=1}^{j−1} lik ljk ) ,  i = j + 1, …, n,  j ≥ 2

and lij = 0 , i < j.

This method is also known as square root method.

If the coefficient matrix A is symmetric but not positive definite, this method can still be applied. But in that case it leads to a complex matrix L, so that the method becomes impractical.

Example 7:

Solve the following system of equations by Cholesky’s method

4x1 + 6x2 + 8x3 = 0

6x1 + 34x2 + 52x3 = −160

8x1 + 52x2 +129x3 = −452

Solution:

The given system of equations can be written as Ax=b

4 6 8   x1   0 
where A= 6 34 52  ,
 x=  x 2  , b=  − 160 
8 52 129  x 3   − 452 

Since the coefficient matrix A is symmetric and positive definite, we can apply Cholesky’s method.

We decompose the matrix A as A=LL T , where L is a 3 × 3 lower triangular matrix given by

    [l11   0    0 ]
L = [l21  l22   0 ]
    [l31  l32  l33]

Using the formulae in eq. (6.5.5), we get

l11 = 2 , l 21 = 3 , l 22 = 5 , l31 = 4 , l32 = 8 , l33 = 7 .

Alternatively, we can determine the lij directly by solving the equation A = LLᵀ, i.e.

[l11   0    0 ] [l11  l21  l31]   [4   6    8 ]
[l21  l22   0 ] [ 0   l22  l32] = [6  34   52 ]
[l31  l32  l33] [ 0    0   l33]   [8  52  129 ]

Then we solve the system Lz=b, i.e.

 2 0 0   z1   0 
3 5 0  z  =  − 160 
  2   
4 8 7  z 3  − 452

by forward substitution method yielding z1 = 0 , z 2 = −32 , z3 = −28 .

The resulting solution is now obtained from L T x=z i.e.

2 3 4  x1   0 
0 5 8  x  = − 32
  2   
0 0 7  x 3  − 28

using back substitution method.

Therefore, the required solution is x1 = 8, x2 = 0, x3 = −4.
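NumPy exposes a Cholesky factorization directly, so the whole example can be checked in a few lines. The sketch below is our own illustration, assuming numpy.linalg is available; it is not part of the text.

import numpy as np

A = np.array([[4., 6., 8.], [6., 34., 52.], [8., 52., 129.]])
b = np.array([0., -160., -452.])

L = np.linalg.cholesky(A)       # lower triangular factor, A = L L^T
z = np.linalg.solve(L, b)       # solve Lz = b (forward substitution)
x = np.linalg.solve(L.T, z)     # solve L^T x = z (back substitution)
print(L)   # expected: [[2, 0, 0], [3, 5, 0], [4, 8, 7]]
print(x)   # expected: [ 8.  0. -4.]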

6.6 Gauss-Jacobi iteration

Initially, the equations of the given system are rearranged so that aii ≠ 0 for i = 1, 2, …, n. Suppose that the rearranged system of equations is

a11x1 + a12x2 + … + a1nxn = b1

a21x1 + a22x2 + … + a2nxn = b2        (6.6.1)

⋮

an1x1 + an2x2 + … + annxn = bn

Now the equations in (6.6.1) can be written as

x1 = −(1/a11)(a12x2 + a13x3 + … + a1nxn − b1)

x2 = −(1/a22)(a21x1 + a23x3 + … + a2nxn − b2)        (6.6.2)

⋮

xn = −(1/ann)(an1x1 + an2x2 + … + a(n,n−1)x(n−1) − bn)

In the Gauss-Jacobi method the iteration is generated by the formula

 
xi^(k+1) = −(1/aii)( Σ_{j=1, j≠i}^{n} aij xj^(k) − bi ),  i = 1, 2, …, n and k = 0, 1, 2, …        (6.6.3)

where the initial guess xi( 0) (i=1, 2, …, n) can be chosen arbitrarily.

Thus, we have

x1^(k+1) = −(1/a11)( a12x2^(k) + a13x3^(k) + … + a1nxn^(k) − b1 )

x2^(k+1) = −(1/a22)( a21x1^(k) + a23x3^(k) + … + a2nxn^(k) − b2 )        (6.6.4)

⋮

xn^(k+1) = −(1/ann)( an1x1^(k) + an2x2^(k) + … + a(n,n−1)x(n−1)^(k) − bn )

where k=0,1,2,…

Example 8: Solve the following system of equations by Gauss-Jacobi method

4x1 + 5x3 = 12.5
x1 + 6x2 + 2x3 = 18.5
8x1 + 2x2 + x3 = −11.5

Solution: We first rearrange the given system of equations so that the resulting system is as

follows

8x1 + 2x2 + x3 = −11.5
x1 + 6x2 + 2x3 = 18.5
4x1 + 5x3 = 12.5

Now the system of equations is strictly diagonally dominant. So, the Gauss-Jacobi method will certainly

converge.

Again, we rewrite the above equations in the following form

x1 = (1/8)(−11.5 − 2x2 − x3)

x2 = (1/6)(18.5 − x1 − 2x3)

x3 = (1/5)(12.5 − 4x1)

The successive iterations in Gauss-Jacobi method will stop if

║x(k+1) − x(k)║∞ < ɛ, where k ≥ 0 and ɛ is the prescribed error tolerance.

Here, we take ɛ=0.01

Initial Step:

x1( 0 ) = x 2( 0) = x3( 0 ) = 0

First iteration:

x1^(1) = (1/8)(−11.5 − 2x2^(0) − x3^(0)) = −11.5/8 = −1.4375
x2^(1) = (1/6)(18.5 − x1^(0) − 2x3^(0)) = 18.5/6 = 3.0833
x3^(1) = (1/5)(12.5 − 4x1^(0)) = 12.5/5 = 2.5

Here,

║x(1) − x(0)║∞ = Max{1.4375, 3.0833, 2.5} = 3.0833 > ɛ

Second iteration:

x1^(2) = (1/8)(−11.5 − 2x2^(1) − x3^(1)) = (1/8)[−11.5 − 2 × (3.0833) − 2.5] = −2.5208
x2^(2) = (1/6)(18.5 − x1^(1) − 2x3^(1)) = (1/6)(18.5 + 1.4375 − 5) = 2.4896
x3^(2) = (1/5)(12.5 − 4x1^(1)) = (1/5)[12.5 − 4 × (−1.4375)] = 3.65

║x(2) − x(1)║∞ = Max {1.0833, 0.5937, 1.15} = 1.15 > ɛ

Third iteration:

x1^(3) = (1/8)(−11.5 − 2x2^(2) − x3^(2)) = (1/8)[−11.5 − 2 × (2.4896) − 3.65] = −2.5162
x2^(3) = (1/6)(18.5 − x1^(2) − 2x3^(2)) = (1/6)[18.5 + 2.5208 − 2 × (3.65)] = 2.2868
x3^(3) = (1/5)(12.5 − 4x1^(2)) = (1/5)[12.5 + 4 × (2.5208)] = 4.5167

In this case,

║x(3) − x(2)║∞ = Max {0.0046, 0.2028, 0.8667} = 0.8667 > ɛ

Fourth iteration:

x1^(4) = (1/8)(−11.5 − 2x2^(3) − x3^(3)) = (1/8)[−11.5 − 2 × (2.2868) − 4.5167] = −2.5738
x2^(4) = (1/6)(18.5 − x1^(3) − 2x3^(3)) = (1/6)[18.5 + 2.5162 − 2 × (4.5167)] = 1.9971
x3^(4) = (1/5)(12.5 − 4x1^(3)) = (1/5)[12.5 − 4 × (−2.5162)] = 4.5129

Also,

║x(4) − x(3)║∞ = Max {0.0576, 0.2897, 0.0038} = 0.2897 > ɛ

Fifth iteration:

x1^(5) = (1/8)(−11.5 − 2x2^(4) − x3^(4)) = (1/8)[−11.5 − 2 × (1.9971) − 4.5129] = −2.5009
x2^(5) = (1/6)(18.5 − x1^(4) − 2x3^(4)) = (1/6)[18.5 + 2.5738 − 2 × (4.5129)] = 2.0080
x3^(5) = (1/5)(12.5 − 4x1^(4)) = (1/5)[12.5 − 4 × (−2.5738)] = 4.5590

║x(5) − x(4)║∞ = Max {0.0729, 0.0109, 0.0461} = 0.0729 > ɛ

Sixth iteration:

x1^(6) = (1/8)(−11.5 − 2x2^(5) − x3^(5)) = (1/8)[−11.5 − 2 × (2.0080) − 4.5590] = −2.5094
x2^(6) = (1/6)(18.5 − x1^(5) − 2x3^(5)) = (1/6)[18.5 + 2.5009 − 2 × (4.5590)] = 1.9805
x3^(6) = (1/5)(12.5 − 4x1^(5)) = (1/5)[12.5 − 4 × (−2.5009)] = 4.5007

Here,

║x(6) − x(5)║∞ = Max {0.0085, 0.0275, 0.0583} = 0.0583 > ɛ

Seventh iteration:

x1^(7) = (1/8)(−11.5 − 2x2^(6) − x3^(6)) = (1/8)[−11.5 − 2 × (1.9805) − 4.5007] = −2.4952
x2^(7) = (1/6)(18.5 − x1^(6) − 2x3^(6)) = (1/6)[18.5 + 2.5094 − 2 × (4.5007)] = 2.0013
x3^(7) = (1/5)(12.5 − 4x1^(6)) = (1/5)[12.5 − 4 × (−2.5094)] = 4.5075

Also,

║x(7) − x(6)║∞ = Max {0.0142, 0.0208, 0.0068} = 0.0208 > ɛ

Eighth iteration:

x1^(8) = (1/8)(−11.5 − 2x2^(7) − x3^(7)) = (1/8)[−11.5 − 2 × (2.0013) − 4.5075] = −2.5013
x2^(8) = (1/6)(18.5 − x1^(7) − 2x3^(7)) = (1/6)[18.5 + 2.4952 − 2 × (4.5075)] = 1.9967
x3^(8) = (1/5)(12.5 − 4x1^(7)) = (1/5)[12.5 − 4 × (−2.4952)] = 4.4962

║x(8) − x(7)║∞ = Max {0.0061, 0.0046, 0.0113} = 0.0113 > ɛ

Ninth iteration:

x1^(9) = (1/8)(−11.5 − 2x2^(8) − x3^(8)) = (1/8)[−11.5 − 2 × (1.9967) − 4.4962] = −2.4987
x2^(9) = (1/6)(18.5 − x1^(8) − 2x3^(8)) = (1/6)[18.5 + 2.5013 − 2 × (4.4962)] = 2.0015
x3^(9) = (1/5)(12.5 − 4x1^(8)) = (1/5)[12.5 − 4 × (−2.5013)] = 4.5010

Finally,

║x(9) − x(8)║∞ = Max {0.0026, 0.0048, 0.0048} = 0.0048 < ɛ

Hence, we shall stop here.

So, the sequence of successive approximations terminates at the ninth iteration.

Therefore, the required solution of the given system of equations is

x1 = −2.5 , x2 = 2 and x3 = 4.5 .
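A minimal Python sketch of the Gauss-Jacobi iteration with the same stopping criterion is given below; the function name jacobi and the NumPy dependency are our own illustrative choices, not part of the text.

import numpy as np

def jacobi(A, b, x0, eps=0.01, max_iter=100):
    # Gauss-Jacobi iteration; stops when the max-norm update is below eps.
    D = np.diag(A)
    R = A - np.diagflat(D)            # off-diagonal part of A
    x = x0.astype(float)
    for k in range(max_iter):
        x_new = (b - R @ x) / D
        if np.max(np.abs(x_new - x)) < eps:
            return x_new, k + 1
        x = x_new
    return x, max_iter

A = np.array([[8., 2., 1.], [1., 6., 2.], [4., 0., 5.]])   # rearranged system
b = np.array([-11.5, 18.5, 12.5])
x, its = jacobi(A, b, np.zeros(3))
print(x, its)   # approx [-2.5  2.  4.5] after 9 iterations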

6.7 Gauss-Seidel iteration method

This is an iterative method of great practical importance. In this method, the iteration is

generated by the following formulae

x1^(k+1) = −(1/a11)( a12x2^(k) + a13x3^(k) + … + a1nxn^(k) − b1 )

x2^(k+1) = −(1/a22)( a21x1^(k+1) + a23x3^(k) + … + a2nxn^(k) − b2 )        (6.7.1)

⋮

xn^(k+1) = −(1/ann)( an1x1^(k+1) + an2x2^(k+1) + … + a(n,n−1)x(n−1)^(k+1) − bn )

where the initial guess xi( 0) (i=1, 2, …, n) can be chosen arbitrarily.

In brief

 
xi^(k+1) = −(1/aii)( Σ_{j<i} aij xj^(k+1) + Σ_{j>i} aij xj^(k) − bi ),  i = 1, 2, …, n and k = 0, 1, 2, …        (6.7.2)

which can be rearranged in the following form

a11x1^(k+1) = −Σ_{j=2}^{n} a1j xj^(k) + b1

a21x1^(k+1) + a22x2^(k+1) = −Σ_{j=3}^{n} a2j xj^(k) + b2

⋮

an1x1^(k+1) + an2x2^(k+1) + … + annxn^(k+1) = bn


Example 9: Solve the following system of equations by Gauss-Seidel method

5x1 + x2 + 2x3 = 19
x1 + 4x2 − 2x3 = −2
2x1 + 3x2 + 8x3 = 39

Solution:

Clearly, the given system of equations is strictly diagonally dominant. So, the Gauss-Seidel method will

certainly converge.

Again, we rewrite the above equations in the following form

x1 = (1/5)(19 − x2 − 2x3)

x2 = (1/4)(−2 − x1 + 2x3)

x3 = (1/8)(39 − 2x1 − 3x2)

The successive iterations in Gauss-Seidel method will stop if

║x(k+1) − x(k)║∞ < ɛ, where k ≥ 0 and ɛ is the prescribed error tolerance.

Here, we take ɛ=0.01

Initial Step:

x1( 0 ) = x 2( 0) = x 3( 0) = 1

First iteration:
x1^(1) = (1/5)(19 − x2^(0) − 2x3^(0)) = (1/5)(19 − 1 − 2) = 3.2
x2^(1) = (1/4)(−2 − x1^(1) + 2x3^(0)) = (1/4)(−2 − 3.2 + 2 × 1) = −0.8
x3^(1) = (1/8)(39 − 2x1^(1) − 3x2^(1)) = (1/8)(39 − 6.4 + 2.4) = 4.375

Here,

║x(1) − x(0)║∞ = Max{2.2, 1.8, 3.375}= 3.375 > ɛ

Second iteration:

x1^(2) = (1/5)(19 − x2^(1) − 2x3^(1)) = (1/5)(19 + 0.8 − 2 × 4.375) = 2.21
x2^(2) = (1/4)(−2 − x1^(2) + 2x3^(1)) = (1/4)(−2 − 2.21 + 2 × 4.375) = 1.135
x3^(2) = (1/8)(39 − 2x1^(2) − 3x2^(2)) = (1/8)(39 − 2 × 2.21 − 3 × 1.135) = 3.89687

║x(2) − x(1)║∞ = Max {0.99, 1.935, 0.478125} = 1.935 > ɛ

Third iteration:

x1^(3) = (1/5)(19 − x2^(2) − 2x3^(2)) = (1/5)[19 − 1.135 − 2 × (3.89687)] = 2.01425
x2^(3) = (1/4)(−2 − x1^(3) + 2x3^(2)) = (1/4)[−2 − 2.01425 + 2 × (3.89687)] = 0.94487
x3^(3) = (1/8)(39 − 2x1^(3) − 3x2^(3)) = (1/8)[39 − 2 × (2.01425) − 3 × (0.94487)] = 4.01711

In this case,

║x(3) − x(2)║∞ = Max {0.19575, 0.190125, 0.120234}= 0.19575 > ɛ

Fourth iteration:
x1^(4) = (1/5)(19 − x2^(3) − 2x3^(3)) = (1/5)[19 − 0.944875 − 2 × (4.01711)] = 2.00418
x2^(4) = (1/4)(−2 − x1^(4) + 2x3^(3)) = (1/4)[−2 − 2.00418 + 2 × (4.01711)] = 1.00751
x3^(4) = (1/8)(39 − 2x1^(4) − 3x2^(4)) = (1/8)[39 − 2 × (2.00418) − 3 × (1.00751)] = 3.99614

Also,

║x(4) − x(3)║∞ = Max {0.0100688, 0.0626344, 0.0209707} = 0.0626344 > ɛ

Fifth iteration:

x1^(5) = (1/5)(19 − x2^(4) − 2x3^(4)) = (1/5)[19 − 1.00751 − 2 × (3.99614)] = 2.00004
x2^(5) = (1/4)(−2 − x1^(5) + 2x3^(4)) = (1/4)[−2 − 2.00004 + 2 × (3.99614)] = 0.998059
x3^(5) = (1/8)(39 − 2x1^(5) − 3x2^(5)) = (1/8)[39 − 2 × (2.00004) − 3 × (0.998059)] = 4.00072

Finally,

║x(5) − x(4)║∞ = Max {0.00413859, 0.0094507, 0.00457866} = 0.0094507< ɛ

Hence, we shall stop here.

So, the sequence of successive approximations terminates at fifth iteration.

Therefore, the required solution of the given system of equations is

x1 = 2, x2 = 1 and x3 = 4.
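As with the Jacobi example, the Gauss-Seidel iteration can be sketched in a few lines of Python; the function name gauss_seidel is hypothetical, and the sketch assumes NumPy.

import numpy as np

def gauss_seidel(A, b, x0, eps=0.01, max_iter=100):
    # Gauss-Seidel iteration: use the newest components as soon as available.
    n = len(b)
    x = x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x_old[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.max(np.abs(x - x_old)) < eps:
            return x, k + 1
    return x, max_iter

A = np.array([[5., 1., 2.], [1., 4., -2.], [2., 3., 8.]])
b = np.array([19., -2., 39.])
x, its = gauss_seidel(A, b, np.ones(3))
print(x, its)   # approx [2. 1. 4.] after 5 iterations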
Chapter 7: Numerical solutions of ordinary differential equations

We shall study the numerical solution for the initial value problem. Let us consider the numerical

solution of an ordinary differential equation of the first order

dy/dx = f(x, y)        (1.2)

subject to a given initial condition y ( x0 ) = y0 .

The function f ( x, y) is a continuous function for all ( x, y) in some domain D of the xy -plane, say

D : a = x0 ≤ x ≤ b , −∞ < y < ∞ , and ( x0 , y0 ) is a given point in D .

The first step in obtaining a numerical solution of the differential equation (1.2) is to partition the

interval [a, b] on which the solution is desired into a finite number of subintervals by the points

a = x0 < x1 < x2 < ... < xn = b

These points are called the mesh points or grid points. Usually, the grid points are taken to be

equally spaced that is

xi = x0 + ih,  h = (b − a)/n  (i = 0, 1, 2, …, n)

where h is called the step length or step size.

7.1 Taylor’s series method

We consider the initial value problem

dy/dx = f(x, y)        (7.1.1)

with the initial conditions y ( x0 ) = y0 .

Let us assume that the given function f ( x, y) is continuously differentiable a sufficient number of

times in the region D containing the point ( x0 , y0 ) , so that the solution y( x) is also continuously

differentiable a sufficient number of times in D .

Now,

y ′ = f ( x, y ) (7.1.2)

Differentiating eq. (7.1.2) with respect to x, we get

y″ = ∂f/∂x + (∂f/∂y)(dy/dx), using the chain rule.

This implies

y″ = fx + fy y′ = fx + f fy        (7.1.3)

Similarly, again using chain rule, we can obtain

y‴ = [fxx + fyx f] + f[fxy + fyy f] + [fx + f fy] fy
   = fxx + 2f fxy + f² fyy + fx fy + f fy²        (7.1.4)

and so on.

Putting x = x0 and y = y 0 , we get y′0 , y 0′′ , y0′′′ , etc. from eqs. (7.1.2) to (7.1.4) respectively.

Now, the Taylor’s series expansion of y( x) in the neighbourhood of x = x 0 is given by

y(x) = y0 + (x − x0) y0′ + ((x − x0)²/2!) y0″ + …        (7.1.5)

Substituting the values of y0′, y0″, y0‴, … in eq. (7.1.5), we can obtain y(x) for all values of x.

• Error estimate

If xi and yi = y(xi) are known exactly, then the Taylor series expansion

y(x_{i+1}) = y(xi + h) = y(xi) + h y′(xi) + (h²/2!) y″(xi) + … + (h^p/p!) y^(p)(xi)        (7.1.6)

can be used to compute yi +1 = y( xi +1 ) with an error

T_{i+1} = (h^{p+1}/(p + 1)!) y^{(p+1)}(ξ),  xi < ξ < x_{i+1}        (7.1.7)

The number of terms to be included in eq. (7.1.6) is fixed by the permissible error ε . If the series

is truncated at the term y ( p ) ( xi ) , then

(h^{p+1}/(p + 1)!) |y^{(p+1)}(ξ)| < ε

Example 1:

Using Taylor’s series method, obtain y(0.1), y(0.2), if y(x) satisfies the following differential equation

dy/dx = y² − x² with y(0) = 1.

Solution:

Here, x0 = 0 and y0 = 1 .

Now,

y′ = y 2 − x 2 , y ′0 = 1

y ′′ = 2 yy ′ − 2 x , y ′0′ = 2

y ′′′ = 2 y ′ 2 + 2 yy ′′ − 2 , y 0′′′ = 4

y iv = 6 y ′y′′ + 2 yy ′′′ , y 0iv = 20

y v = 6 y ′′2 + 8 y ′y ′′′ + 2 yy iv , y0v = 96

and so on.

To compute y(0.1) correct to four decimal places, we consider the Taylor series of order 4, i.e. up to the term x⁴.

The Taylor’s series expansion for y( x) about x = x0 is

y(x) = y0 + x y0′ + (x²/2!) y0″ + (x³/3!) y0‴ + (x⁴/4!) y0⁽ⁱᵛ⁾
     = 1 + x + x² + (2/3)x³ + (5/6)x⁴        (1)

Hence, putting x = 0.1 and 0.2 in (1), we get

y (0.1) = 1.1108 and y (0.2) = 1.2467
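The successive total derivatives above can also be generated automatically with a computer algebra system. The following is a minimal sketch assuming SymPy is available; the code mirrors the fourth-order truncation of the example but is our own illustration, not part of the text.

import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
f = y(x)**2 - x**2                       # right-hand side of y' = f(x, y)

# Build successive total derivatives y'', y''', y'''' by differentiating
# and substituting y' = f each time.
derivs = [f]
for _ in range(3):
    d = sp.diff(derivs[-1], x).subs(sp.Derivative(y(x), x), f)
    derivs.append(sp.simplify(d))

# Evaluate the derivatives at (x0, y0) = (0, 1) and sum the Taylor series.
vals = [sp.N(d.subs({y(x): 1, x: 0})) for d in derivs]
series = lambda h: 1 + sum(v * h**(k + 1) / sp.factorial(k + 1)
                           for k, v in enumerate(vals))
print(series(0.1), series(0.2))          # approx 1.1108 and 1.2467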

7.2 Euler Method

Expanding y( x + h) by Taylor’s theorem, we get

y(x + h) = y(x) + h y′(x) + (h²/2!) y″(x) + … + (h^p/p!) y^(p)(x) + (h^{p+1}/(p + 1)!) y^{(p+1)}(x + θh),  0 < θ < 1

Now, we can write

y( x + h) = y( x) + hFT ( x, y; h) + O(h p +1 ) , (7.2.1)

where

FT(x, y; h) = y′(x) + (h/2!) y″(x) + … + (h^{p−1}/p!) y^(p)(x)        (7.2.2)

The recursive formula of this method is given by

y i +1 = y i + hFT ( x i , y i ; h ) , (i=0, 1, 2, …, n-1) (7.2.3)

with the initial condition y0 = y ( x0 ) .

This is called the Taylor’s series method of order p.

Substituting p = 1 in eq. (7.2.2), we get

FT(x, y; h) = y′ = f(x, y)

This single-step method of order 1 is known as Euler’s method. Therefore, Euler’s method can also be regarded as the Taylor series method of order 1.

In Euler’s method, the recursion is given by

y i +1 = y i + hFT ( x i , y i ; h ) = y i + hf ( x i , y i ) , (i=0, 1, 2, …, n-1) (7.2.4)

with the initial condition y0 = y ( x0 ) .

The grid error or global error is E = O (h) .

Euler’s method is the simplest approach to compute a numerical solution of an initial value

problem. However, if we wish to compute very accurate solutions, or solutions that are accurate

over a long interval, then Euler’s method requires a large number of small steps. For sufficiently

small h, the method yields better results.

Problem 2: Consider the initial value problem

y ′ = x ( x + y ) − 2 , y (0) = 2 ,

use Euler’s method with step sizes h = 0.3 , h = 0.2 and h = 0.15 to compute approximations to

y (0.6) up to five decimal places.

Solution:

Here, f(x, y) = x(x + y) − 2, x0 = 0, y0 = 2 and xi = x0 + ih, i = 0, 1, 2, …, n.

When h = 0 . 15 ,

y(0.15) = y1 = y0 + hf ( x0 , y 0 ) = 1.7

y (0.3) = y 2 = y1 + hf ( x1 , y1 ) = 1.441625

y (0.45) = y 3 = y 2 + hf ( x 2 , y 2 ) = 1.2199981

y (0.6) = y 4 = y3 + hf ( x3 , y 3 ) = 1.032723

Therefore,

y (0.6) = 1.03272

When h = 0.2,

y(0.2) = y1 = y0 + hf(x0, y0) = 1.6

y(0.4) = y2 = y1 + hf(x1, y1) = 1.272

y(0.6) = y3 = y2 + hf(x2, y2) = 1.00576

Therefore,

y(0.6) = 1.00576

When h = 0.3,

y(0.3) = y1 = y0 + hf(x0, y0) = 1.4

y(0.6) = y2 = y1 + hf(x1, y1) = 0.953

Therefore,

y(0.6) = 0.953
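A minimal Python sketch of Euler's method that reproduces the three runs above is given below; the function name euler is our own illustrative choice, not part of the text.

def euler(f, x0, y0, h, x_end):
    # Euler's method: y_{i+1} = y_i + h f(x_i, y_i).
    x, y = x0, y0
    while x < x_end - 1e-12:      # small guard against round-off at the endpoint
        y += h * f(x, y)
        x += h
    return y

f = lambda x, y: x * (x + y) - 2
for h in (0.3, 0.2, 0.15):
    print(h, round(euler(f, 0.0, 2.0, h, 0.6), 5))
# h = 0.3 -> 0.953, h = 0.2 -> 1.00576, h = 0.15 -> 1.03272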

7.3 Runge-Kutta methods

Runge-Kutta methods are the most widely used numerical methods for solving ordinary differential equations. They are especially suitable for ordinary differential equations in which the computation of higher derivatives is complicated. As we have already seen, Euler’s method is less efficient in practical problems, since it requires the step length h to be very small to obtain reasonable accuracy.

7.3.1 2nd order Runge-Kutta method or Heun’s method:

In this method the recursive formula is given by

yi+1 = yi + (1/2)(k1 + k2),  i = 0, 1, 2, …        (7.3.1.1)

where

k1 = hf ( xi , y i ) , (7.3.1.2)

k 2 = hf ( xi + h, yi + k1 ) . (7.3.1.3)

The local error of this method is O (h 3 ) and the global error of this method is O (h 2 ) .

7.3.2 Third-order Runge-Kutta method:

The recursive formula for standard third-order Runge-Kutta method is as follows:

yi+1 = yi + (1/6)(k1 + 4k2 + k3),  i = 0, 1, 2, …        (7.3.1.4)

where

k1 = hf ( x i , y i ) , (7.3.1.5)

k2 = hf(xi + h/2, yi + k1/2),        (7.3.1.6)

k3 = hf(xi + h, yi − k1 + 2k2).        (7.3.1.7)

The local error of this method is O (h 4 ) and the global error of this method is O (h 3 ) .

Example 3: Given dy/dx = x² + y, y(0) = 1. Using Heun’s method find y(0.2).

Solution:

Let us take step size h = 0.1 . Here, f ( x, y ) = x 2 + y . From the initial condition, it follows that

x 0 = 0 and y 0 = 1 .

For x1 = 0.1 ,

k1 = hf ( x0 , y 0 )

= 0.1

k 2 = hf ( x0 + h, y 0 + k1 )

= 0.111

k = (1/2)(k1 + k2) = 0.1055

Therefore,

y(0.1) = y1 = y0 + k = 1.1055.

Now, for x 2 = 0.2 ,

k1 = hf ( x1 , y1 )

= 0.11155

k 2 = hf ( x1 + h, y1 + k1 )

= 0.125705

k = (1/2)(k1 + k2) = 0.118628

Therefore,

y(0.2) = y2 = y1 + k = 1.22413.
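A minimal Python sketch of one Heun (2nd-order Runge-Kutta) step, applied twice as in the example, might look as follows; the function name heun_step is hypothetical.

def heun_step(f, x, y, h):
    # One step of Heun's (2nd-order Runge-Kutta) method.
    k1 = h * f(x, y)
    k2 = h * f(x + h, y + k1)
    return y + 0.5 * (k1 + k2)

f = lambda x, y: x**2 + y
x, y, h = 0.0, 1.0, 0.1
for _ in range(2):
    y = heun_step(f, x, y, h)
    x += h
print(round(y, 5))   # approx 1.22413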


Example 4: Solve dy/dx = f(x, y) = (y − x)/(y + x) with y(0) = 1 for finding y(0.6) by using the third-order Runge-Kutta method taking h = 0.1.

Solution:

Here, the step size is h = 0.1 and f(x, y) = (y − x)/(y + x). From the initial condition, it follows that x0 = 0 and y0 = 1.

For x1 = 0.1 ,

k1 = hf ( x0 , y 0 )

= 0.1

k2 = hf(x0 + h/2, y0 + k1/2)

= 0.0909091

k3 = hf(x0 + h, y0 − k1 + 2k2)

= 0.0830769

k = (1/6)(k1 + 4k2 + k3) = 0.0911189

Therefore, y(0.1) = y1 = y0 + k = 1.09112.

Now, for x 2 = 0.2 ,

k1 = hf ( x1 , y1 )

= 0.0832091

k2 = hf(x1 + h/2, y1 + k1/2)

= 0.0766123

k 3 = hf ( x1 + h, y1 − k1 + 2k 2 )

= 0.0706127

k = (1/6)(k1 + 4k2 + k3) = 0.0767118

Therefore, y(0.2) = y2 = y1 + k = 1.16783.

2
Next, for x3 = 0.3 ,

k1 = hf ( x 2 , y 2 )

= 0.0707566

k2 = hf(x2 + h/2, y2 + k1/2)

= 0.0655934

k 3 = hf ( x 2 + h, y 2 − k1 + 2k 2 )

= 0.0607397

k = (1/6)(k1 + 4k2 + k3) = 0.065645

Therefore, y(0.3) = y3 = y2 + k = 1.23348.

Then, for x 4 = 0.4 ,

k1 = hf ( x3 , y 3 )

= 0.0608732

k2 = hf(x3 + h/2, y3 + k1/2)

= 0.0566271

k 3 = hf ( x 3 + h, y 3 − k1 + 2k 2 )

= 0.0525464

k = (1/6)(k1 + 4k2 + k3) = 0.0566547

Therefore, y(0.4) = y4 = y3 + k = 1.29013.

Next, for x5 = 0.5 ,

k1 = hf ( x4 , y4 )

= 0.0526664

k2 = hf(x4 + h/2, y4 + k1/2)

= 0.0490507

k3 = hf ( x4 + h, y4 − k1 + 2k 2 )

= 0.0455209

k = (1/6)(k1 + 4k2 + k3) = 0.049065

Therefore, y(0.5) = y5 = y4 + k = 1.3392.

Finally, for x6 = 0.6 ,

k1 = hf ( x5 , y5 )

= 0.0456284

k2 = hf(x5 + h/2, y5 + k1/2)

= 0.0424689

k3 = hf ( x5 + h, y5 − k1 + 2k 2 )

= 0.0393481

k = (1/6)(k1 + 4k2 + k3) = 0.0424754

Therefore, y(0.6) = y6 = y5 + k = 1.38167.

7.3.3 Fourth-order Runge-Kutta method

The recursive formula for fourth-order Runge-Kutta method is as follows:

yi+1 = yi + (1/6)(k1 + 2k2 + 2k3 + k4),  i = 0, 1, 2, …        (7.3.1)

where

k1 = hf ( xi , y i ) , (7.3.2)

k2 = hf(xi + h/2, yi + k1/2),        (7.3.3)

k3 = hf(xi + h/2, yi + k2/2),        (7.3.4)

k 4 = hf ( x i + h, y i + k 3 ) . (7.3.5)

The truncation error of this method is O( h 5 ) and the global error of this method is O(h 4 ) .

Example 5: Consider the initial value problem

dy/dx + y = y²,  y(0) = −1.

Use the Runge-Kutta method with step sizes h = 0.05 and h = 0.1 to compute the approximate value of y(0.4).

Solution:

Here f ( x, y) = − y + y 2 .

The 4th order Runge-Kutta method is given by

yi+1 = yi + (1/6)(k1 + 2k2 + 2k3 + k4),  i = 0, 1, 2, …

where y0 = −1 at x = 0 .

For h = 0.05:

Step 1: k1 = 0.1, k2 = 0.092625, k3 = 0.0931604, k4 = 0.0864599
y1 = y(0.05) = −1 + (1/6)(0.1 + 2 × 0.092625 + 2 × 0.0931604 + 0.0864599) = −0.90699490

Step 2: k1 = 0.0864817, k2 = 0.0804913, k3 = 0.0809002, k4 = 0.0754264
y2 = y(0.1) = −0.90699490 + (1/6)(0.0864817 + 2 × 0.0804913 + 2 × 0.0809002 + 0.0754264) = −0.82621307

Step 3: k1 = 0.0754421, k2 = 0.0705106, k3 = 0.0708286, k4 = 0.0662995
y3 = y(0.15) = −0.82621307 + (1/6)(0.0754421 + 2 × 0.0705106 + 2 × 0.0708286 + 0.0662995) = −0.75547641

Step 4: k1 = 0.0663111, k2 = 0.0622034, k3 = 0.0624547, k4 = 0.058665
y4 = y(0.2) = −0.75547641 + (1/6)(0.0663111 + 2 × 0.0622034 + 2 × 0.0624547 + 0.058665) = −0.69309437

Step 5: k1 = 0.0586737, k2 = 0.0552166, k3 = 0.0554179, k4 = 0.0522154
y5 = y(0.25) = −0.69309437 + (1/6)(0.0586737 + 2 × 0.0552166 + 2 × 0.0554179 + 0.0522154) = −0.63773470

Step 6: k1 = 0.052222, k2 = 0.0492854, k3 = 0.0494487, k4 = 0.0467183
y6 = y(0.3) = −0.63773470 + (1/6)(0.052222 + 2 × 0.0492854 + 2 × 0.0494487 + 0.0467183) = −0.58833329

Step 7: k1 = 0.0467235, k2 = 0.0442082, k3 = 0.0443422, k4 = 0.0419959
y7 = y(0.35) = −0.58833329 + (1/6)(0.0467235 + 2 × 0.0442082 + 2 × 0.0443422 + 0.0419959) = −0.54402992

Step 8: k1 = 0.0419999, k2 = 0.0398295, k3 = 0.0399406, k4 = 0.0379098
y8 = y(0.4) = −0.54402992 + (1/6)(0.0419999 + 2 × 0.0398295 + 2 × 0.0399406 + 0.0379098) = −0.50412160

For h = 0.1:

Step 1: k1 = 0.2, k2 = 0.171, k3 = 0.175081, k4 = 0.150541
y1 = y(0.1) = −1 + (1/6)(0.2 + 2 × 0.171 + 2 × 0.175081 + 0.150541) = −0.82621615

Step 2: k1 = 0.150885, k2 = 0.131443, k3 = 0.133885, k4 = 0.117165
y2 = y(0.2) = −0.82621615 + (1/6)(0.150885 + 2 × 0.131443 + 2 × 0.133885 + 0.117165) = −0.69309839

Step 3: k1 = 0.117348, k2 = 0.103692, k3 = 0.105246, k4 = 0.0933423
y3 = y(0.3) = −0.69309839 + (1/6)(0.117348 + 2 × 0.103692 + 2 × 0.105246 + 0.0933423) = −0.58833742

Step 4: k1 = 0.0934478, k2 = 0.0834959, k3 = 0.084535, k4 = 0.0757619
y4 = y(0.4) = −0.58833742 + (1/6)(0.0934478 + 2 × 0.0834959 + 2 × 0.084535 + 0.0757619) = −0.50412552
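A minimal Python sketch of the classical fourth-order Runge-Kutta method, applied to this example for both step sizes, is given below; the function name rk4 is our own illustrative choice.

def rk4(f, x0, y0, h, n_steps):
    # Classical 4th-order Runge-Kutta method.
    x, y = x0, y0
    for _ in range(n_steps):
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        y += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

f = lambda x, y: -y + y**2              # from y' + y = y^2
print(rk4(f, 0.0, -1.0, 0.05, 8))       # approx -0.50412160
print(rk4(f, 0.0, -1.0, 0.1, 4))        # approx -0.50412552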
6

7.3.4 Improved Euler method

The Euler forward scheme may be very easy to implement, but one main drawback of Euler’s method is its low order of convergence: a very small step size is required for a satisfactory result. Next, we present a method that has a higher order of convergence. This method is a modification of Euler’s method.

In the improved Euler method or improved Euler-Cauchy method, also known as Heun’s method, we first compute an approximate value of yi+1 at the point xi+1.

The usual choice of the initial guess or initial approximation yi+1^(0) is based on Euler’s method

yi+1^(0) = yi + hf(xi, yi),  i = 0, 1, 2, …        (7.3.4.1)

with y0 = y(x0).

This is called the predictor formula.

Then this value of yi+1 is improved using the average of the slopes at the two end points of the step. This leads to the following iteration formula

yi+1^(k+1) = yi + (h/2)[ f(xi, yi) + f(x_{i+1}, yi+1^(k)) ],  k = 0, 1, 2, …; i = 0, 1, 2, …        (7.3.4.2)

This is called the corrector formula.

Thus, the improved Euler method is a predictor-corrector method, because in each step we first predict the value yi+1^(0) by the predictor formula in eq. (7.3.4.1) and then correct it by the corrector formula in eq. (7.3.4.2). It will thus generate a sequence of successive approximations yi+1^(k) for k = 0, 1, 2, ….

The iteration process generated by eq. (7.3.4.2) is terminated at each step if the following

condition is satisfied

|yi+1^(k+1) − yi+1^(k)| < ε,  k = 0, 1, 2, …        (7.3.4.3)

where ε is the error tolerance depending on the level of accuracy to be accomplished.

The local truncation error for the improved Euler formula is O(h 3 ) as opposed to O( h 2 ) for the

Euler's method. It can also be shown that for a finite interval, the global truncation error for the

improved Euler formula is bounded by O( h 2 ) , so this method is a second order method.

Example 6:

Determine the value of y when x = 0.1 given that

y ′ = x 2 + y 2 , y (0) = 1

Solution:

We take, h = 0.05

Here, x0 = 0 and y 0 = 1 . Therefore, f ( x0 , y0 ) = 1

By Euler’s formula, we get

y1(0) = y 0 + hf ( x0 , y 0 ) = 1.05 .

This predicted value of y1(0) is used in the corrector formula as an initial approximation.

Using corrector formula of improved Euler formula, we get the first approximation to y1 as

y1^(1) = y0 + (h/2)[ f(x0, y0) + f(x1, y1^(0)) ] = 1.05263.

Again, the second approximation to y1 is

y1^(2) = y0 + (h/2)[ f(x0, y0) + f(x1, y1^(1)) ] = 1.0527

Hence, we take y1 = 1.053 correct to 3 decimal places i.e. y(0.05) = 1.053 .

Next, again by using Euler’s formula, we get

y 2( 0) = y1 + hf ( x1 , y1 ) = 1.10857

This predicted value of y2(0) is used in the corrector formula as an initial approximation.

Using corrector formula of improved Euler formula, we get the second approximation to y 2 as

y2^(1) = y1 + (h/2)[ f(x1, y1) + f(x2, y2^(0)) ] = 1.11157

y2^(2) = y1 + (h/2)[ f(x1, y1) + f(x2, y2^(1)) ] = 1.11173

Therefore, we take y 2 = 1.112 correct to 3 decimal places i.e. y(0.1) = 1.112 .

So, the required solution is

y (0.1) = 1.112 .
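A minimal Python sketch of the improved Euler predictor-corrector scheme, with the corrector iterated to a tolerance as in the example, might look as follows; the function name improved_euler and the tolerance value are our own assumptions.

def improved_euler(f, x0, y0, h, x_end, eps=1e-4):
    # Improved Euler (Heun) as predictor-corrector, iterating the corrector.
    x, y = x0, y0
    while x < x_end - 1e-12:
        y_pred = y + h * f(x, y)                       # predictor (Euler)
        while True:                                    # iterate the corrector
            y_corr = y + h / 2 * (f(x, y) + f(x + h, y_pred))
            if abs(y_corr - y_pred) < eps:
                break
            y_pred = y_corr
        x, y = x + h, y_corr
    return y

f = lambda x, y: x**2 + y**2
print(round(improved_euler(f, 0.0, 1.0, 0.05, 0.1), 3))   # approx 1.112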

7.3.5 Adams-Bashforth and Adams-Moulton Predictor-Corrector Method

We assume that the values y1 , y2 , . . . , yi −1, yi have already been calculated by some other method.

The following formula, known as the Adams-Bashforth formula, is given by

yi+1^P = yi + (h/24)[ 55fi − 59f_{i−1} + 37f_{i−2} − 9f_{i−3} ],        (7.3.5.1)

where f i = f ( xi , yi ) .

This formula is used as a predictor formula. The superscript P indicates that it provides a

predicted value. The local truncation error of Adams-Bashforth method is O (h 5 ) .

The following formula, known as the Adams-Moulton formula, is given by

yi+1^C = yi + (h/24)[ 9f_{i+1} + 19fi − 5f_{i−1} + f_{i−2} ]        (7.3.5.2)

where f i = f ( xi , yi ) .

This formula is used as a corrector formula. The superscript C indicates that it provides a

corrected value. The local truncation error of Adams-Moulton method is O (h 5 ) .

In the above predictor-corrector method, the initial approximation to yi +1 is provided by the

predictor formula eq. (7.3.5.1) and then a sequence of successive approximations to yi +1 is

generated by the corrector formula eq. (7.3.5.2).

Dahlquist’s equivalence theorem (Süli and Mayers 2003): For a linear multi-step method that

is consistent with the ordinary differential equation

y ′ = f ( x, y ) ,

where f is assumed to satisfy a Lipschitz condition, and starting with consistent initial data,

zero-stability is necessary and sufficient for convergence. Moreover if the solution y (x) has

continuous derivative of order ( p + 1) and truncation error O (h p ) , then the global error

en = y ( x n ) − y n is also O (h p ) .

Example 7: Use the Adams-Moulton method to determine y(1.4) given that dy/dx = x²(1 + y), y(1) = 1.

Solution:

Here,

x0 = 1 , y0 = 1 and h = 0.1 , say

y ′ = x 2 (1 + y ) , y′0 = 2

y ′′ = ( 2 x + x 4 )(1 + y ) , y′0′ = 6

y′′′ = (2 + 6 x 3 + x 6 )(1 + y ) , y′0′′ = 18

y iv = (20x 2 + 12 x 5 + x 8 )(1 + y ) , y 0iv = 66

y v = (40x + 80x 4 + 20x 7 + x10 )(1 + y ) , y0v = 282

y(1.1) = y1 = y0 + h y0′ + (h²/2!) y0″ + (h³/3!) y0‴ + (h⁴/4!) y0⁽ⁱᵛ⁾ + (h⁵/5!) y0⁽ᵛ⁾
       = 1 + 0.1 × 2 + ((0.1)²/2!) × 6 + ((0.1)³/3!) × 18 + ((0.1)⁴/4!) × 66 + ((0.1)⁵/5!) × 282
       = 1.2333

y(1.2) = y2 = y0 + 2h y0′ + ((2h)²/2!) y0″ + ((2h)³/3!) y0‴ + ((2h)⁴/4!) y0⁽ⁱᵛ⁾ + ((2h)⁵/5!) y0⁽ᵛ⁾
       = 1 + 0.2 × 2 + ((0.2)²/2!) × 6 + ((0.2)³/3!) × 18 + ((0.2)⁴/4!) × 66 + ((0.2)⁵/5!) × 282
       = 1.54915

Similarly,

y(1.3) = y3 = y0 + 3h y0′ + ((3h)²/2!) y0″ + ((3h)³/3!) y0‴ + ((3h)⁴/4!) y0⁽ⁱᵛ⁾ + ((3h)⁵/5!) y0⁽ᵛ⁾
       = 1 + 0.3 × 2 + ((0.3)²/2!) × 6 + ((0.3)³/3!) × 18 + ((0.3)⁴/4!) × 66 + ((0.3)⁵/5!) × 282
       = 1.97899

Now, let us construct the following table:

x       y         f(x, y) = x²(1 + y)
1       1         2
1.1     1.2333    2.70229
1.2     1.54915   3.67078
1.3     1.97899   5.03449

The values in the above table will be used in the Adams predictor-corrector formulae.

Using the Adams-Bashforth predictor formula in eq. (7.3.5.1), we get

y^P(1.4) = y4^P = y3 + (h/24)[ 55f3 − 59f2 + 37f1 − 9f0 ],  where fi = f(xi, yi)
         = 1.97899 + (0.1/24)[55 × 5.03449 − 59 × 3.67078 + 37 × 2.70229 − 9 × 2]
         = 2.57193

y′(1.4) = (1.4) 2 × (1 + 2.57193) = 7.0009828

Again, using the Adams-Moulton corrector formula in eq. (7.3.5.2), we get

1st iteration:

y^C(1.4) = y4^C = y3 + (h/24)[ 9f4 + 19f3 − 5f2 + f1 ]
         = 1.97899 + (0.1/24)[9 × 7.0009828 + 19 × 5.03449 − 5 × 3.67078 + 2.70229]
         = 2.57488

y′(1.4) = (1.4) 2 × (1 + 2.57488) = 7.0067648

2nd iteration:

y^C(1.4) = y4^C = y3 + (h/24)[ 9f4 + 19f3 − 5f2 + f1 ]
         = 1.97899 + (0.1/24)[9 × 7.0067648 + 19 × 5.03449 − 5 × 3.67078 + 2.70229]
         = 2.57509

y′(1.4) = (1.4) 2 × (1 + 2.57509) = 7.0071764

3rd iteration:

y^C(1.4) = y4^C = y3 + (h/24)[ 9f4 + 19f3 − 5f2 + f1 ]
         = 1.97899 + (0.1/24)[9 × 7.0071764 + 19 × 5.03449 − 5 × 3.67078 + 2.70229]
         = 2.57511

y′(1.4) = (1.4) 2 × (1 + 2.57511) = 7.0072156

4th iteration:

y^C(1.4) = y4^C = y3 + (h/24)[ 9f4 + 19f3 − 5f2 + f1 ]
         = 1.97899 + (0.1/24)[9 × 7.0072156 + 19 × 5.03449 − 5 × 3.67078 + 2.70229]
         = 2.57511

Since the value of y(1.4) in the 4th iteration coincides with that of the 3rd iteration, we can stop here. The required solution is y(1.4) = 2.5751 correct to four decimal places.
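A minimal Python sketch of the Adams-Bashforth/Adams-Moulton step for this example is given below, taking the four starting values from the Taylor series computation above; the variable names and the convergence tolerance are our own assumptions.

def f(x, y):
    return x**2 * (1 + y)

h = 0.1
xs = [1.0, 1.1, 1.2, 1.3]
ys = [1.0, 1.2333, 1.54915, 1.97899]       # starting values (Taylor series)
fs = [f(x, y) for x, y in zip(xs, ys)]

# Adams-Bashforth predictor for y(1.4)
y_pred = ys[3] + h / 24 * (55 * fs[3] - 59 * fs[2] + 37 * fs[1] - 9 * fs[0])

# Adams-Moulton corrector, iterated until two values agree
y_corr = y_pred
while True:
    f4 = f(1.4, y_corr)
    y_new = ys[3] + h / 24 * (9 * f4 + 19 * fs[3] - 5 * fs[2] + fs[1])
    if abs(y_new - y_corr) < 1e-5:
        break
    y_corr = y_new
print(round(y_pred, 5), round(y_new, 5))   # approx 2.57193 and 2.57511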

1. Power method

The power method is an iterative technique for approximating the dominant eigenvalue of a

matrix together with an associated eigenvector. This method applies to any n× n matrix A that

has a dominant eigenvalue, i.e. an eigenvalue λ with |λ| = max_{1≤i≤n} |λi|.

To apply the power method, we assume that the n × n matrix A has n eigenvalues λ1, λ2, …, λn with associated linearly independent eigenvectors {v1, v2, …, vn}.

Let A be an n × n matrix with eigenvalues satisfying

|λ1| > |λ2| ≥ … ≥ |λn| ≥ 0.        (1.1)

The eigenvalue λ1 of largest magnitude is called the dominant eigenvalue.

Example 1: Determine the largest eigenvalue and the corresponding eigenvector of the

following matrix

 1 3 − 1
A =  3 2 4  .
 − 1 4 10 

Solution:

− 1
Let us choose the initial vector x (0) =  0  such that x (0) = 1.

 1 

Then

 1 3 − 1 − 1 − 2 − 0.181818
y (1)
= Ax ( 0)
=  3 2 4   0  =  1  = 11 0.0909091 = 11x (1) .
     
− 1 4 10   1   11   1 

Next,

y(2) = Ax(1) = (−0.909091, 3.63636, 10.5455)ᵀ = 10.5455(−0.0862069, 0.344828, 1)ᵀ = 10.5455x(2).

In a similar manner, we can obtain

y(3) = Ax(2) = (−0.0517241, 4.43103, 11.4655)ᵀ = 11.4655(−0.00451128, 0.386466, 1)ᵀ = 11.4655x(3),

y(4) = Ax(3) = (0.154887, 4.7594, 11.5504)ᵀ = 11.5504(0.0134097, 0.412056, 1)ᵀ = 11.5504x(4),

y(5) = Ax(4) = (0.249577, 4.86434, 11.6348)ᵀ = 11.6348(0.0214509, 0.418085, 1)ᵀ = 11.6348x(5),

y(6) = Ax(5) = (0.275706, 4.90052, 11.6509)ᵀ = 11.6509(0.0236639, 0.420614, 1)ᵀ = 11.6509x(6),

y(7) = Ax(6) = (0.285505, 4.91222, 11.6588)ᵀ = 11.6588(0.0244884, 0.421332, 1)ᵀ = 11.6588x(7),

y(8) = Ax(7) = (0.288484, 4.91613, 11.6608)ᵀ = 11.6608(0.0247395, 0.421593, 1)ᵀ = 11.6608x(8),

y(9) = Ax(8) = (0.289519, 4.9174, 11.6616)ᵀ = 11.6616(0.0248266, 0.421674, 1)ᵀ = 11.6616x(9),

y(10) = Ax(9) = (0.289848, 4.91783, 11.6619)ᵀ = 11.6619(0.0248543, 0.421701, 1)ᵀ = 11.6619x(10),

y(11) = Ax(10) = (0.289959, 4.91797, 11.662)ᵀ = 11.662(0.0248637, 0.42171, 1)ᵀ = 11.662x(11),

y(12) = Ax(11) = (0.289995, 4.91801, 11.662)ᵀ = 11.662(0.0248667, 0.421713, 1)ᵀ = 11.662x(12)

and so on.

The above results can be summarized in the following computation table:

Step m    (x(m−1))ᵀ                       (y(m))ᵀ                          ‖y(m)‖∞
1         (−1, 0, 1)                      (−2, 1, 11)                      11
2         (−0.0862069, 0.344828, 1)       (−0.909091, 3.63636, 10.5455)    10.5455
3         (−0.00451128, 0.386466, 1)      (−0.0517241, 4.43103, 11.4655)   11.4655
4         (0.0134097, 0.412056, 1)        (0.154887, 4.7594, 11.5504)      11.5504
5         (0.0214509, 0.418085, 1)        (0.249577, 4.86434, 11.6348)     11.6348
6         (0.0236639, 0.420614, 1)        (0.275706, 4.90052, 11.6509)     11.6509
7         (0.0244884, 0.421332, 1)        (0.285505, 4.91222, 11.6588)     11.6588
8         (0.0247395, 0.421593, 1)        (0.288484, 4.91613, 11.6608)     11.6608
9         (0.0248266, 0.421674, 1)        (0.289519, 4.9174, 11.6616)      11.6616
10        (0.0248543, 0.421701, 1)        (0.289848, 4.91783, 11.6619)     11.6619
11        (0.0248637, 0.42171, 1)         (0.289959, 4.91797, 11.662)      11.662
12        (0.0248667, 0.421713, 1)        (0.289995, 4.91801, 11.662)      11.662

Since x(11) ≈ x(12) correct to five decimal places, the approximate largest eigenvalue is 11.66 correct to four significant figures, and the corresponding approximate normalized eigenvector for the eigenvalue 11.66 is (0.025, 0.422, 1)ᵀ.
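A minimal Python sketch of the power method, normalized by the component of largest magnitude as in the computation above, is given below; the function name power_method and the tolerance are our own assumptions.

import numpy as np

def power_method(A, x0, tol=1e-5, max_iter=100):
    # Power method; the dominant component of y serves as the eigenvalue estimate.
    x = x0.astype(float)
    for _ in range(max_iter):
        y = A @ x
        lam = y[np.argmax(np.abs(y))]   # component of largest magnitude
        x_new = y / lam                 # normalize so that this component is 1
        if np.max(np.abs(x_new - x)) < tol:
            return lam, x_new
        x = x_new
    return lam, x

A = np.array([[1., 3., -1.], [3., 2., 4.], [-1., 4., 10.]])
lam, v = power_method(A, np.array([-1., 0., 1.]))
print(round(lam, 4), np.round(v, 3))   # approx 11.662, (0.025, 0.422, 1)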

2. Symmetric Power Method

To approximate the dominant eigenvalue and an associated eigenvector of the n× n symmetric

matrix A, given a nonzero vector x , we have the following theorem.

Theorem 1: Let A be an n × n real symmetric matrix, and let x ≠ 0 be any real vector with n components. Then the quotient

q = (xᵀAx)/(xᵀx)        (Rayleigh quotient)        (2.1)

is an approximation for an eigenvalue λ of A, and if we set q = λ − ε, so that ε is the error of q, then the error bound is given by

|ε| ≤ δ = √( (Ax)ᵀ(Ax)/(xᵀx) − q² ).        (2.2)

Example 2: Determine the largest eigenvalue and the corresponding eigenvector of the

following symmetric matrix

9 4 
 4 3 .
 

Solution:

9 4 1
Let, A =   and initial guess x0 =   .
4 3 1

Step 1:

y(1) = Ax(0) = (13, 7)ᵀ = 13(1, 0.538462)ᵀ = 13x(1)

q = (x(0)ᵀ A x(0)) / (x(0)ᵀ x(0)) = ((1, 1)·(13, 7)ᵀ) / 2 = 20/2 = 10

δ = √( (Ax(0))ᵀ(Ax(0)) / (x(0)ᵀx(0)) − q² ) = √( (13² + 7²)/2 − 10² ) = √(109 − 100) = 3
Step 2:

9 4  1  11.153848  1 
y ( 2) = Ax (1) =   =  = 11.153848  = 11.153848 x
( 2)

4 3 0.538462  5.615386  0.503448

T
x (1) Ax(1)
q= T
= 10.990827
x (1) x (1)

δ=
(Ax ) (1) T
Ax (1)
− q 2 = 0.3027
T
x (1) x (1)

Step 3:

9 4  1  11.013792  1 
y (3) = Ax (2) ==   =  = 11.013792  = 11.013792 x
( 3)

4 3 0 . 503448  5 . 510344  0 .500313

T
x ( 2) Ax (2)
q= T
= 10.999923
x ( 2) x ( 2 )

δ=
(Ax ) Ax (2) T (2)
− q 2 = 0.027548
( 2) T (2)
x x

Step 4:

9 4  1  11.0013   1 
y ( 4) = Ax (3) ==   =  = 11.0013  = 11.0013 x
( 4)

 4 3 0 . 500313 5.50094  0 .500028 

T
x (3) Ax (3)
q= T
= 11
x (3) x (3)

δ=
(Ax ) (3 ) T
Ax ( 3)
− q 2 = 0.00250438
( 3)T ( 3)
x x

So, the required dominant eigenvalue is 11 and the corresponding eigenvector is approximately (1, 0.500028)ᵀ, i.e. close to the exact eigenvector (1, 0.5)ᵀ.

The above results can be summarized in the following computation table:

Step m    (x(m−1))ᵀ        (y(m))ᵀ                  q            δ
1         (1, 1)           (13, 7)                  10           3
2         (1, 0.538462)    (11.153848, 5.615386)    10.990827    0.302752
3         (1, 0.503448)    (11.013792, 5.510344)    10.999923    0.027548
4         (1, 0.500313)    (11.0013, 5.50094)       11           0.00250438
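A minimal Python sketch of the symmetric power method with the Rayleigh quotient and error bound of Theorem 1 might look as follows; the function name symmetric_power and the tolerance are our own assumptions.

import numpy as np

def symmetric_power(A, x0, tol=1e-3, max_iter=100):
    # Power iteration with Rayleigh-quotient estimate q and error bound delta.
    x = x0.astype(float)
    for _ in range(max_iter):
        Ax = A @ x
        q = (x @ Ax) / (x @ x)                        # Rayleigh quotient (2.1)
        delta = np.sqrt((Ax @ Ax) / (x @ x) - q**2)   # error bound (2.2)
        x = Ax / np.max(np.abs(Ax))                   # normalize the iterate
        if delta < tol:
            return q, x, delta
    return q, x, delta

A = np.array([[9., 4.], [4., 3.]])
q, v, d = symmetric_power(A, np.array([1., 1.]))
print(round(q, 6), np.round(v, 6), d)   # q approaches 11, v approaches (1, 0.5)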
