CS321 Numerical Analysis and Computing: Locating Roots of Equations


CS321

Numerical Analysis
and Computing

Lecture 2

Locating Roots of Equations

Professor Jun Zhang

Department of Computer Science


University of Kentucky
Lexington, KY 40506-0633

September 8, 2015
What is a Root

Many physical systems can be described by an equation that expresses the relationship between the dependent and the independent variables.

The root of a nonlinear equation is the solution of that equation; likewise, a root of a nonlinear system is a solution of that system.

Most nonlinear equations are too complicated to have an analytical solution. In practice, we are more interested in finding numerical solutions: explicit numbers, approximate solutions.
[Figures: roots (zeros) of a function]
Roots of a Function

Let f(x) be a function that has values of opposite signs at the two ends of a given interval [a,b] with a < b, i.e., f(a)·f(b) < 0. If f(x) is continuous on [a,b], then there exists a number c in [a,b] such that f(c) = 0.

c is called a root of the equation f(x) = 0.

Example. The function

f(x) = x^2 + 2x − 3 = (x + 3)(x − 1)

has one root, x = 1, in the interval [0,2], and two roots, x = 1 and x = −3, in the interval [−5,2].

Remark: roots are not necessarily unique in a given interval.

We need root-finding algorithms for general functions.
[Figure: a root of a function]

[Figure: a function with four roots]
Bisection Method

Given an interval [a,b] and a continuous function f(x), if f(a)·f(b) < 0, then f(x) must have a root in [a,b]. How do we find it?

Suppose f(a) > 0 and f(b) < 0.

Step 1. Compute the midpoint c = (a + b)/2; stop if (b − a)/2 is small, and take c as the root.

Step 2. Evaluate f(c); if f(c) = 0, a root is found.

Step 3. If f(c) ≠ 0, then either f(c) > 0 or f(c) < 0.

Step 4. If f(c) < 0, a root must be in [a,c]: let b ← c, f(b) ← f(c). Otherwise f(c) > 0 and a root must be in [c,b]: let a ← c, f(a) ← f(c).

Step 5. Go to Step 1.
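A minimal Python sketch of this procedure (the function name, tolerance, and iteration cap are illustrative choices, not from the lecture; unlike the steps above, which assume f(a) > 0 and f(b) < 0, the sketch tracks signs so either orientation works):

```python
def bisect(f, a, b, tol=1e-12, max_iter=200):
    """Bisection: f must be continuous with f(a), f(b) of opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = a + (b - a) / 2               # midpoint
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:  # interval small enough: take c as the root
            return c
        if fa * fc < 0:                   # sign change in [a, c]
            b, fb = c, fc
        else:                             # sign change in [c, b]
            a, fa = c, fc
    return a + (b - a) / 2

# Example from above: x^2 + 2x - 3 has the root x = 1 in [0, 2]
print(bisect(lambda x: x**2 + 2*x - 3, 0.0, 2.0))  # ~1.0
```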
[Figures: the bisection process applied to an example function]
Convergence Analysis
Let r be a root of f(x) in the interval [a_0, b_0]. Let c_0 = (a_0 + b_0)/2 be the midpoint; then

|r − c_0| ≤ (b_0 − a_0)/2

If we use the bisection algorithm, we compute a_0, b_0, c_0, a_1, b_1, c_1, …, with

|r − c_n| ≤ (b_n − a_n)/2

Since the interval length is halved at each step, we have

b_n − a_n = (b_{n−1} − a_{n−1})/2 = … = (b_0 − a_0)/2^n

Hence

|r − c_n| ≤ (b_0 − a_0)/2^{n+1}

which is the maximum error if we take c_n as an approximation to the root r.
Linear Convergence

A sequence {x_n} converges linearly to a limit x if there exists a constant C in the interval [0,1) such that

|x_{n+1} − x| ≤ C |x_n − x|   (n ≥ 1)

By recursion, we have

|x_{n+1} − x| ≤ C |x_n − x| ≤ C^2 |x_{n−1} − x| ≤ … ≤ C^n |x_1 − x|

Equivalently, a linearly convergent sequence satisfies

|x_{n+1} − x| ≤ A C^n   (0 ≤ C < 1)

for some positive number A.

The bisection algorithm has a linear convergence rate with C = 1/2 and A = (b_0 − a_0)/2.
Stopping Criterion

What is our goal? When do we stop? How many iterations do we need?

Our goal is to find r ∈ [a,b] such that f(r) = 0.

With the bisection algorithm, we generate a sequence {c_n} such that

|r − c_n| < ε

for some prescribed number ε > 0,

i.e., we find a point c_n inside the interval [a,b] that is very close to the root r. We then use c_n as an approximation to r.

It is not guaranteed, however, that f(c_n) is very close to 0.
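A small Python illustration of the last point: for a steep function, a point within ε of the root can still have a large residual (the function and numbers here are illustrative, not from the lecture):

```python
# f is very steep near its root r = 1
f = lambda x: 1.0e8 * (x - 1.0)
c = 1.0 + 1.0e-6        # |r - c| = 1e-6: close to the root
print(abs(c - 1.0))     # ~1e-06
print(abs(f(c)))        # ~100.0: f(c) is nowhere near 0
```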
How Many Iterations

If we want the approximate root c_n to be close to the true root r, i.e., we want

|r − c_n| < ε,

then the number of bisection steps n satisfies

(b − a)/2^{n+1} < ε

or

n > (log(b − a) − log(2ε)) / log 2

Example. Find a root in [16,17] up to machine single precision.

a = (10000.0)_2 and b = (10001.0)_2, so r must have the binary form r = (10000.***…)_2. We have a total of 24 bits in the significand, of which 5 are already fixed; the accuracy extends over the remaining 19 bits, i.e., lies between 2^{−19} and 2^{−20}. We choose ε = 2^{−20}. Since b − a = 1, we need 2^{n+1} > 2^{20}, yielding n ≥ 20.
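A quick Python check of the iteration-count formula on this example (variable names are illustrative):

```python
import math

a, b, eps = 16.0, 17.0, 2.0**-20
# smallest integer n with (b - a) / 2^(n+1) < eps
n = math.floor(math.log2((b - a) / (2 * eps))) + 1
print(n)  # 20
```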
Newton’s Method

Given a function f(x) and a point x_0, if we know the derivative of f(x) at x_0, we can construct a linear function that passes through (x_0, f(x_0)) with slope f'(x_0) ≠ 0:

l(x) = f'(x_0)(x − x_0) + f(x_0)

Since l(x) is close to f(x) near x_0, if x_0 is close to r we can use the root of l(x) as an approximation to r, the root of f(x):

x_1 = x_0 − f(x_0)/f'(x_0)

x_1 may still not be close enough to r, so we repeat the procedure to find

x_2 = x_1 − f(x_1)/f'(x_1)

Under certain conditions, {x_n} converges to r.
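A minimal Python sketch of Newton's method (the function name, tolerance, and iteration cap are illustrative assumptions):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: iterate x <- x - f(x)/f'(x) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        dfx = fprime(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) = 0: Newton step undefined")
        step = f(x) / dfx
        x -= step
        if abs(step) < tol:
            return x
    return x
```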
[Figure: Newton's method]
From Taylor Series
If f(x_0) ≠ 0 but x_0 is close to r, we may assume that they differ by h, i.e., x_0 + h = r, or f(x_0 + h) = f(r) = 0.

Using the Taylor series expansion,

f(x_0) + h f'(x_0) + (h^2/2) f''(x_0) + … = 0

Ignoring the higher-order terms, we have

f(x_0) + h f'(x_0) = 0

or

h = −f(x_0)/f'(x_0)

Since this truncated h does not exactly satisfy f(x_0 + h) = 0, we use

x_1 = x_0 + h = x_0 − f(x_0)/f'(x_0)

as an approximation to r, and repeat the process.
[Figure: the first few Newton approximations]
Fast Convergence
Find a root for the following function f(x), starting at x0 = 4

f ( x) = x 3 − 2 x 2 + x − 3
f '( x) = 3 x 2 − 4 x + 1

Each iteration gains double digits of accuracy and f(xn)


decreases quadratically to 0
20
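Running the newton sketch from above on this example shows the rapid decrease of f(x_n) (the loop below is illustrative):

```python
f = lambda x: x**3 - 2*x**2 + x - 3
fp = lambda x: 3*x**2 - 4*x + 1
x = 4.0
for _ in range(6):
    x -= f(x) / fp(x)   # one Newton step
    print(x, f(x))      # f(x) shrinks quadratically toward 0
```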
Convergence Analysis

Let the function f have continuous first and second derivatives f' and f'', and let r be a simple root of f with f'(r) ≠ 0. If x_0 is sufficiently close to r, then Newton's method converges to r quadratically:

|x_{n+1} − r| ≤ c |x_n − r|^2

If x_n differs from r by at most one unit in the kth decimal place, i.e.,

|x_n − r| ≤ 10^{−k}

then, for c = 1, we have

|x_{n+1} − r| ≤ 10^{−2k}

The number of correct decimal digits doubles with each iteration.
Convergence Proof

Let e_n = r − x_n. Newton's method generates a sequence {x_n} such that

e_{n+1} = r − x_{n+1} = r − x_n + f(x_n)/f'(x_n)
        = e_n + f(x_n)/f'(x_n)
        = (e_n f'(x_n) + f(x_n)) / f'(x_n)

Using Taylor's expansion, there exists a point ξ_n between x_n and r for which

0 = f(r) = f(x_n + e_n) = f(x_n) + e_n f'(x_n) + (1/2) e_n^2 f''(ξ_n)

It follows that

e_n f'(x_n) + f(x_n) = −(1/2) e_n^2 f''(ξ_n)
Convergence Proof Cont.

We thus have

e_{n+1} = −(1/2) (f''(ξ_n)/f'(x_n)) e_n^2

Define an upper bound

c(δ) = (1/2) max_{|x−r|≤δ} |f''(x)| / min_{|x−r|≤δ} |f'(x)|,   (δ > 0)

We can choose δ small enough that

|e_n| = |x_n − r| ≤ δ  and  |ξ_n − r| ≤ δ

This guarantees that x_n is within a distance δ of r.
Convergence Proof Cont.

For very small δ > 0, we have

|e_{n+1}| = (1/2) |f''(ξ_n)/f'(x_n)| e_n^2 ≤ c(δ) e_n^2 ≤ δ c(δ) |e_n| = ρ |e_n|

with ρ = δ c(δ) < 1 if δ is small enough. Therefore

|x_{n+1} − r| = |e_{n+1}| ≤ ρ |e_n| ≤ |e_n| ≤ δ

so x_{n+1} is also within a distance δ of r. By recursion, if x_0 is close to r, then

|e_n| ≤ ρ |e_{n−1}| ≤ ρ^2 |e_{n−2}| ≤ … ≤ ρ^n |e_0|

Since ρ < 1, this means

e_n → 0 as n → ∞
Weakness of Newton’s Method

Newton's method converges fast only when x_0 is chosen close to r. In practice, there may also be a number of problems:

1.) it needs the derivative f'(x), which must be available and evaluated

2.) the starting point must be close to r

3.) quadratic convergence is lost at a multiple root

4.) the iterates may run away (leave the convergence domain)

5.) a flat spot with f'(x_n) = 0 leaves the step undefined

6.) the iterates may cycle around r
[Figure: problems of Newton's method]

[Figure: Newton's method cycling]
Systems of Nonlinear Equations

Newton's method is especially useful for finding a zero of a system of nonlinear equations:

f_1(x_1, x_2, …, x_n) = 0
f_2(x_1, x_2, …, x_n) = 0
⋮
f_n(x_1, x_2, …, x_n) = 0

written in vector form as

f(x) = 0

where

f = (f_1, f_2, …, f_n)^T,  x = (x_1, x_2, …, x_n)^T

We have

x^(k+1) = x^(k) − [f'(x^(k))]^{−1} f(x^(k))

where f'(x^(k)) = J(x^(k)) is the Jacobian matrix.
A Three-Equation Example

f_1(x_1, x_2, x_3) = 0
f_2(x_1, x_2, x_3) = 0
f_3(x_1, x_2, x_3) = 0

Using the Taylor expansion,

f_i(x_1 + h_1, x_2 + h_2, x_3 + h_3) = f_i(x_1, x_2, x_3) + (∂f_i/∂x_1) h_1 + (∂f_i/∂x_2) h_2 + (∂f_i/∂x_3) h_3 + …

Let x^(0) = (x_1^(0), x_2^(0), x_3^(0))^T be an approximate solution and the computed correction be h = (h_1, h_2, h_3)^T. Hence

0 ≈ f(x^(0) + h) = f(x^(0)) + f'(x^(0)) h
Example Cont.
The Jacobian matrix is

J = [ ∂f_1/∂x_1  ∂f_1/∂x_2  ∂f_1/∂x_3 ]
    [ ∂f_2/∂x_1  ∂f_2/∂x_2  ∂f_2/∂x_3 ]
    [ ∂f_3/∂x_1  ∂f_3/∂x_2  ∂f_3/∂x_3 ]

It follows that

h ≈ −[f'(x^(0))]^{−1} f(x^(0))

Hence, the new iterate is

x^(1) = x^(0) − [f'(x^(0))]^{−1} f(x^(0))

In practice, instead of inverting the Jacobian we solve the linear system

[J(x^(k))] h^(k) = −f(x^(k))

so that

x^(k+1) = x^(k) + h^(k)
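A minimal Python/NumPy sketch of Newton's method for systems, solving J h = −f rather than forming an inverse (the helper name and the 2×2 sample system are illustrative assumptions):

```python
import numpy as np

def newton_system(f, jac, x0, tol=1e-12, max_iter=50):
    """Newton for systems: solve J(x) h = -f(x), then update x <- x + h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        h = np.linalg.solve(jac(x), -f(x))   # linear solve, no explicit inverse
        x = x + h
        if np.linalg.norm(h) < tol:
            return x
    return x

# Illustrative system: x^2 + y^2 = 4, x*y = 1
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
jac = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton_system(f, jac, [2.0, 0.5]))  # converges to about (1.932, 0.518)
```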
Secant Method

In Newton's method,

x_{n+1} = x_n − f(x_n)/f'(x_n)

we need to evaluate both f(x_n) and f'(x_n) at each iteration.

We can approximate the derivative at x = x_n by

f'(x_n) ≈ (f(x_{n−1}) − f(x_n)) / (x_{n−1} − x_n)

Thus, the secant method generates the iterates

x_{n+1} = x_n − ( (x_n − x_{n−1}) / (f(x_n) − f(x_{n−1})) ) f(x_n)

It requires only one new function evaluation at each iteration.
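A minimal Python sketch of the secant method (names and defaults are illustrative):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton with f'(x_n) replaced by a difference quotient."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("f(x_n) = f(x_{n-1}): secant step undefined")
        x2 = x1 - (x1 - x0) / (f1 - f0) * f1
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)               # only one new evaluation per step
        if abs(x1 - x0) < tol:
            return x1
    return x1

# Same example as before: the root of x^3 - 2x^2 + x - 3 near 2.17
print(secant(lambda x: x**3 - 2*x**2 + x - 3, 3.0, 4.0))
```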
[Figure: the secant method]
Comments

The secant method needs two iterates to start with; we can use the bisection method to generate the second iterate.

The secant method does not need to know the derivative of f(x).

If |f(x_n) − f(x_{n−1})| is small, the computation may lose significant digits and become unstable.

The convergence rate of the secant method is superlinear:

|e_{n+1}| ≤ C |e_n|^α

with α = (1 + √5)/2 ≈ 1.62. Its convergence rate lies between those of the bisection method and Newton's method.
Hybrid Approaches
In practice, hybrid methods are usually used.

For example, we can use the bisection method to generate initial iterates that are close to the root, so that Newton's method can then be applied (see the sketch below).

When the evaluation of the derivative is expensive, the secant method can be used in place of Newton's method.

The trade-off between the bisection method and Newton's method is robustness versus fast convergence.

The secant method sits between these two, aiming to combine the advantages of both and avoid the disadvantages of either.
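A minimal sketch of one such hybrid, reusing the bisect and newton functions defined earlier (the coarse switch-over tolerance is an illustrative assumption):

```python
def hybrid_root(f, fprime, a, b, coarse_tol=1e-3, tol=1e-12):
    """Bisection to get near the root, then Newton to finish quickly."""
    x0 = bisect(f, a, b, tol=coarse_tol)    # robust: narrows the bracket
    return newton(f, fprime, x0, tol=tol)   # fast: quadratic once close

print(hybrid_root(lambda x: x**3 - 2*x**2 + x - 3,
                  lambda x: 3*x**2 - 4*x + 1, 0.0, 4.0))
```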
