CS321 Numerical Analysis and Computing: Locating Roots of Equations


CS321

Numerical Analysis
and Computing

Lecture 2

Locating Roots of Equations

Professor Jun Zhang

Department of Computer Science


University of Kentucky
Lexington, KY 40506-0633

September 8, 2015
What is a Root

Many physical systems can be described by an equation that expresses the relationship between the dependent and the independent variables.

The root of a nonlinear equation is the solution of that equation; likewise, a root of a nonlinear system is a solution of that system.

Most nonlinear equations are too complicated to have an analytical solution. In practice, we are more interested in finding numerical solutions: explicit numbers, approximate solutions.
[Figures: roots (zeros) of a function]
Roots of a Function

Let f(x) be a function that has values of opposite signs at the two ends of a given interval [a,b] with a < b, i.e., f(a)·f(b) < 0. If f(x) is continuous on [a,b], then there exists a number c in [a,b] such that f(c) = 0.

c is called a root of the equation f(x) = 0.

Example. The function

f(x) = x^2 + 2x − 3 = (x + 3)(x − 1)

has one root, x = 1, in the interval [0,2], and two roots, x = 1 and x = −3, in the interval [−5,2].

Remark: roots are not necessarily unique in a given interval.

We need root-finding algorithms for general functions.
[Figure: a root of a function]

[Figure: a function with four roots]
Bisection Method

Given an interval [a,b] and a continuous function f(x), if f(a)·f(b) < 0, then f(x) must have a root in [a,b]. How do we find it?

Suppose f(a) > 0 and f(b) < 0.

Step 1. Compute the midpoint c = (a + b)/2; stop if (b − a)/2 is small, and take c as the root.

Step 2. Evaluate f(c); if f(c) = 0, a root is found.

Step 3. If f(c) ≠ 0, then either f(c) > 0 or f(c) < 0.

Step 4. If f(c) < 0, a root must be in [a,c]: let b ← c, f(b) ← f(c). Otherwise f(c) > 0 and a root must be in [c,b]: let a ← c, f(a) ← f(c).

Step 5. Go to Step 1.
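A minimal Python sketch of this procedure (the function name, tolerance, and iteration cap are illustrative choices, not from the lecture; unlike the steps above, which assume f(a) > 0 and f(b) < 0, the sketch tracks signs so either orientation works):

```python
def bisect(f, a, b, tol=1e-12, max_iter=200):
    """Bisection: f must be continuous with f(a), f(b) of opposite signs."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = a + (b - a) / 2               # midpoint
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:  # interval small enough: take c as the root
            return c
        if fa * fc < 0:                   # sign change in [a, c]
            b, fb = c, fc
        else:                             # sign change in [c, b]
            a, fa = c, fc
    return a + (b - a) / 2

# Example from above: x^2 + 2x - 3 has the root x = 1 in [0, 2]
print(bisect(lambda x: x**2 + 2*x - 3, 0.0, 2.0))  # ~1.0
```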
[Figures: the bisection process applied to an example function]
Convergence Analysis
Let r be a root of f(x) in the interval [a_0, b_0]. Let c_0 = (a_0 + b_0)/2 be the midpoint; then

|r − c_0| ≤ (b_0 − a_0)/2

If we use the bisection algorithm, we compute a_0, b_0, c_0, a_1, b_1, c_1, …, with

|r − c_n| ≤ (b_n − a_n)/2

Since the interval length is halved at each step, we have

b_n − a_n = (b_{n−1} − a_{n−1})/2 = … = (b_0 − a_0)/2^n

Hence

|r − c_n| ≤ (b_0 − a_0)/2^{n+1}

which is the maximum error if we take c_n as an approximation to the root r.
Linear Convergence

A sequence {x_n} converges linearly to a limit x if there exists a constant C in the interval [0,1) such that

|x_{n+1} − x| ≤ C |x_n − x|   (n ≥ 1)

By recursion, we have

|x_{n+1} − x| ≤ C |x_n − x| ≤ C^2 |x_{n−1} − x| ≤ … ≤ C^n |x_1 − x|

Equivalently, a linearly convergent sequence satisfies

|x_{n+1} − x| ≤ A C^n   (0 ≤ C < 1)

for some positive number A.

The bisection algorithm has a linear convergence rate with C = 1/2 and A = (b_0 − a_0)/2.
Stopping Criterion

What is our goal? When do we stop? How many iterations do we need?

Our goal is to find r ∈ [a,b] such that f(r) = 0.

With the bisection algorithm, we generate a sequence {c_n} such that

|r − c_n| < ε

for some prescribed number ε > 0,

i.e., we find a point c_n inside the interval [a,b] that is very close to the root r. We then use c_n as an approximation to r.

It is not guaranteed, however, that f(c_n) is very close to 0.
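A small Python illustration of the last point: for a steep function, a point within ε of the root can still have a large residual (the function and numbers here are illustrative, not from the lecture):

```python
# f is very steep near its root r = 1
f = lambda x: 1.0e8 * (x - 1.0)
c = 1.0 + 1.0e-6        # |r - c| = 1e-6: close to the root
print(abs(c - 1.0))     # ~1e-06
print(abs(f(c)))        # ~100.0: f(c) is nowhere near 0
```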
How Many Iterations

If we want the approximate root c_n to be close to the true root r, i.e., we want

|r − c_n| < ε,

then the number of bisection steps n satisfies

(b − a)/2^{n+1} < ε

or

n > (log(b − a) − log(2ε)) / log 2

Example. Find a root in [16,17] up to machine single precision.

a = (10000.0)_2 and b = (10001.0)_2, so r must have the binary form r = (10000.***…)_2. We have a total of 24 bits in the significand, of which 5 are already fixed; the accuracy extends over the remaining 19 bits, i.e., lies between 2^{−19} and 2^{−20}. We choose ε = 2^{−20}. Since b − a = 1, we need 2^{n+1} > 2^{20}, yielding n ≥ 20.
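A quick Python check of the iteration-count formula on this example (variable names are illustrative):

```python
import math

a, b, eps = 16.0, 17.0, 2.0**-20
# smallest integer n with (b - a) / 2^(n+1) < eps
n = math.floor(math.log2((b - a) / (2 * eps))) + 1
print(n)  # 20
```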
Newton’s Method

Given a function f(x) and a point x_0, if we know the derivative of f(x) at x_0, we can construct a linear function that passes through (x_0, f(x_0)) with slope f'(x_0) ≠ 0:

l(x) = f'(x_0)(x − x_0) + f(x_0)

Since l(x) is close to f(x) near x_0, if x_0 is close to r we can use the root of l(x) as an approximation to r, the root of f(x):

x_1 = x_0 − f(x_0)/f'(x_0)

x_1 may still not be close enough to r, so we repeat the procedure to find

x_2 = x_1 − f(x_1)/f'(x_1)

Under certain conditions, {x_n} converges to r.
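A minimal Python sketch of Newton's method (the function name, tolerance, and iteration cap are illustrative assumptions):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: iterate x <- x - f(x)/f'(x) until the step is tiny."""
    x = x0
    for _ in range(max_iter):
        dfx = fprime(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) = 0: Newton step undefined")
        step = f(x) / dfx
        x -= step
        if abs(step) < tol:
            return x
    return x
```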
[Figure: Newton's method]
From Taylor Series
If f(x_0) ≠ 0 but x_0 is close to r, we may assume that they differ by h, i.e., x_0 + h = r, or f(x_0 + h) = f(r) = 0.

Using the Taylor series expansion,

f(x_0) + h f'(x_0) + (h^2/2) f''(x_0) + … = 0

Ignoring the higher-order terms, we have

f(x_0) + h f'(x_0) = 0

or

h = −f(x_0)/f'(x_0)

Since this truncated h does not exactly satisfy f(x_0 + h) = 0, we use

x_1 = x_0 + h = x_0 − f(x_0)/f'(x_0)

as an approximation to r, and repeat the process.
[Figure: the first few Newton approximations]
Fast Convergence
Find a root for the following function f(x), starting at x0 = 4

f ( x) = x 3 − 2 x 2 + x − 3
f '( x) = 3 x 2 − 4 x + 1

Each iteration gains double digits of accuracy and f(xn)


decreases quadratically to 0
20
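Running the newton sketch from above on this example shows the rapid decrease of f(x_n) (the loop below is illustrative):

```python
f = lambda x: x**3 - 2*x**2 + x - 3
fp = lambda x: 3*x**2 - 4*x + 1
x = 4.0
for _ in range(6):
    x -= f(x) / fp(x)   # one Newton step
    print(x, f(x))      # f(x) shrinks quadratically toward 0
```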
Convergence Analysis

Let the function f have continuous first and second derivatives f' and f'', and let r be a simple root of f with f'(r) ≠ 0. If x_0 is sufficiently close to r, then Newton's method converges to r quadratically:

|x_{n+1} − r| ≤ c |x_n − r|^2

If x_n differs from r by at most one unit in the kth decimal place, i.e.,

|x_n − r| ≤ 10^{−k}

then, for c = 1, we have

|x_{n+1} − r| ≤ 10^{−2k}

The number of correct decimal digits doubles with each iteration.
Convergence Proof

Let e_n = r − x_n. Newton's method generates a sequence {x_n} such that

e_{n+1} = r − x_{n+1} = r − x_n + f(x_n)/f'(x_n)
        = e_n + f(x_n)/f'(x_n)
        = (e_n f'(x_n) + f(x_n)) / f'(x_n)

Using Taylor's expansion, there exists a point ξ_n between x_n and r for which

0 = f(r) = f(x_n + e_n) = f(x_n) + e_n f'(x_n) + (1/2) e_n^2 f''(ξ_n)

It follows that

e_n f'(x_n) + f(x_n) = −(1/2) e_n^2 f''(ξ_n)
Convergence Proof Cont.

We thus have

e_{n+1} = −(1/2) (f''(ξ_n)/f'(x_n)) e_n^2

Define an upper bound

c(δ) = (1/2) max_{|x−r|≤δ} |f''(x)| / min_{|x−r|≤δ} |f'(x)|,   (δ > 0)

We can choose δ small enough that

|e_n| = |x_n − r| ≤ δ  and  |ξ_n − r| ≤ δ

This guarantees that x_n is within a distance δ of r.
Convergence Proof Cont.

For very small δ > 0, we have

|e_{n+1}| = (1/2) |f''(ξ_n)/f'(x_n)| e_n^2 ≤ c(δ) e_n^2 ≤ δ c(δ) |e_n| = ρ |e_n|

with ρ = δ c(δ) < 1 if δ is small enough. Therefore

|x_{n+1} − r| = |e_{n+1}| ≤ ρ |e_n| ≤ |e_n| ≤ δ

so x_{n+1} is also within a distance δ of r. By recursion, if x_0 is close to r, then

|e_n| ≤ ρ |e_{n−1}| ≤ ρ^2 |e_{n−2}| ≤ … ≤ ρ^n |e_0|

Since ρ < 1, this means

e_n → 0 as n → ∞
Weakness of Newton’s Method

Newton's method converges fast only when x_0 is chosen close to r. In practice, there may also be a number of problems:

1.) it needs the derivative f'(x), which must be available and evaluated

2.) the starting point must be close to r

3.) quadratic convergence is lost at a multiple root

4.) the iterates may run away (leave the convergence domain)

5.) a flat spot with f'(x_n) = 0 leaves the step undefined

6.) the iterates may cycle around r
[Figure: problems of Newton's method]

[Figure: Newton's method cycling]
Systems of Nonlinear Equations

Newton's method is especially useful for finding a zero of a system of nonlinear equations:

f_1(x_1, x_2, …, x_n) = 0
f_2(x_1, x_2, …, x_n) = 0
⋮
f_n(x_1, x_2, …, x_n) = 0

written in vector form as

f(x) = 0

where

f = (f_1, f_2, …, f_n)^T,  x = (x_1, x_2, …, x_n)^T

We have

x^(k+1) = x^(k) − [f'(x^(k))]^{−1} f(x^(k))

where f'(x^(k)) = J(x^(k)) is the Jacobian matrix.
A Three-Equation Example

f_1(x_1, x_2, x_3) = 0
f_2(x_1, x_2, x_3) = 0
f_3(x_1, x_2, x_3) = 0

Using the Taylor expansion,

f_i(x_1 + h_1, x_2 + h_2, x_3 + h_3) = f_i(x_1, x_2, x_3) + (∂f_i/∂x_1) h_1 + (∂f_i/∂x_2) h_2 + (∂f_i/∂x_3) h_3 + …

Let x^(0) = (x_1^(0), x_2^(0), x_3^(0))^T be an approximate solution and the computed correction be h = (h_1, h_2, h_3)^T. Hence

0 ≈ f(x^(0) + h) = f(x^(0)) + f'(x^(0)) h
Example Cont.
The Jacobian matrix is

J = [ ∂f_1/∂x_1  ∂f_1/∂x_2  ∂f_1/∂x_3 ]
    [ ∂f_2/∂x_1  ∂f_2/∂x_2  ∂f_2/∂x_3 ]
    [ ∂f_3/∂x_1  ∂f_3/∂x_2  ∂f_3/∂x_3 ]

It follows that

h ≈ −[f'(x^(0))]^{−1} f(x^(0))

Hence, the new iterate is

x^(1) = x^(0) − [f'(x^(0))]^{−1} f(x^(0))

In practice, instead of inverting the Jacobian we solve the linear system

[J(x^(k))] h^(k) = −f(x^(k))

so that

x^(k+1) = x^(k) + h^(k)
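A minimal Python/NumPy sketch of Newton's method for systems, solving J h = −f rather than forming an inverse (the helper name and the 2×2 sample system are illustrative assumptions):

```python
import numpy as np

def newton_system(f, jac, x0, tol=1e-12, max_iter=50):
    """Newton for systems: solve J(x) h = -f(x), then update x <- x + h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        h = np.linalg.solve(jac(x), -f(x))   # linear solve, no explicit inverse
        x = x + h
        if np.linalg.norm(h) < tol:
            return x
    return x

# Illustrative system: x^2 + y^2 = 4, x*y = 1
f = lambda v: np.array([v[0]**2 + v[1]**2 - 4.0, v[0]*v[1] - 1.0])
jac = lambda v: np.array([[2*v[0], 2*v[1]], [v[1], v[0]]])
print(newton_system(f, jac, [2.0, 0.5]))  # converges to about (1.932, 0.518)
```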
Secant Method

In Newton's method,

x_{n+1} = x_n − f(x_n)/f'(x_n)

we need to evaluate both f(x_n) and f'(x_n) at each iteration.

We can approximate the derivative at x = x_n by

f'(x_n) ≈ (f(x_{n−1}) − f(x_n)) / (x_{n−1} − x_n)

Thus, the secant method generates the iterates

x_{n+1} = x_n − ( (x_n − x_{n−1}) / (f(x_n) − f(x_{n−1})) ) f(x_n)

It requires only one new function evaluation at each iteration.
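A minimal Python sketch of the secant method (names and defaults are illustrative):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton with f'(x_n) replaced by a difference quotient."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("f(x_n) = f(x_{n-1}): secant step undefined")
        x2 = x1 - (x1 - x0) / (f1 - f0) * f1
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)               # only one new evaluation per step
        if abs(x1 - x0) < tol:
            return x1
    return x1

# Same example as before: the root of x^3 - 2x^2 + x - 3 near 2.17
print(secant(lambda x: x**3 - 2*x**2 + x - 3, 3.0, 4.0))
```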
[Figure: the secant method]
Comments

The secant method needs two iterates to start with; we can use the bisection method to generate the second iterate.

The secant method does not need to know the derivative of f(x).

If |f(x_n) − f(x_{n−1})| is small, the computation may lose significant digits and become unstable.

The convergence rate of the secant method is superlinear:

|e_{n+1}| ≤ C |e_n|^α

with α = (1 + √5)/2 ≈ 1.62. Its convergence rate lies between those of the bisection method and Newton's method.
Hybrid Approaches
In practice, hybrid methods are usually used.

For example, we can use the bisection method to generate initial iterates that are close to the root, so that Newton's method can then be applied (see the sketch below).

When the evaluation of the derivative is expensive, the secant method can be used in place of Newton's method.

The trade-off between the bisection method and Newton's method is robustness versus fast convergence.

The secant method sits between these two, aiming to combine the advantages of both and avoid the disadvantages of either.
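A minimal sketch of one such hybrid, reusing the bisect and newton functions defined earlier (the coarse switch-over tolerance is an illustrative assumption):

```python
def hybrid_root(f, fprime, a, b, coarse_tol=1e-3, tol=1e-12):
    """Bisection to get near the root, then Newton to finish quickly."""
    x0 = bisect(f, a, b, tol=coarse_tol)    # robust: narrows the bracket
    return newton(f, fprime, x0, tol=tol)   # fast: quadratic once close

print(hybrid_root(lambda x: x**3 - 2*x**2 + x - 3,
                  lambda x: 3*x**2 - 4*x + 1, 0.0, 4.0))
```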
