Rounding Errors and Stability of Algorithms


Backward Error Analysis of Gaussian Elimination & Cholesky Decomposition Methods
Backward error analysis of backward and forward substitution

Some notation:
• For any two matrices C = [c_ij]_{n×m} and F = [f_ij]_{n×m}, we write C ≤ F if c_ij ≤ f_ij for all 1 ≤ i ≤ n, 1 ≤ j ≤ m.
• For any C = [c_ij]_{n×m}, |C| = [|c_ij|]_{n×m}.
Theorem. Let G be any nonsingular lower (upper) triangular n × n matrix and let b be any nonzero column vector of length n. If y_c is the computed solution of the system Gy = b obtained by any variant of forward (backward) substitution in floating point arithmetic, then y_c satisfies

    (G + δG)y_c = b                          (6)

where δG is lower (upper) triangular such that

    |δG| ≤ 2nu|G| + O(u²)                    (7)
Here (7) means |δg_ij| ≤ 2nu|g_ij| + O(u²) for all 1 ≤ i, j ≤ n, and O(u²) stands for a matrix whose entries are O(u²).
For a proof, see pp. 159–161 of D. S. Watkins, Fundamentals of Matrix Computations, 2nd Edition.
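The theorem can be checked numerically. The sketch below (Python with NumPy, an assumed environment; the matrix, its size, and the diagonal shift are arbitrary choices, not from the text) solves a lower triangular system by forward substitution and verifies that the normwise backward error ‖b − G y_c‖∞ / (‖G‖∞ ‖y_c‖∞) stays below the 2nu level implied by bound (7).

```python
import numpy as np

def forward_substitution(G, b):
    """Solve Gy = b for lower triangular G (row-oriented variant)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - G[i, :i] @ y[:i]) / G[i, i]
    return y

rng = np.random.default_rng(0)
n = 200
# Shift the diagonal so G is safely nonsingular (an arbitrary test matrix)
G = np.tril(rng.standard_normal((n, n))) + n * np.eye(n)
b = rng.standard_normal(n)

y_c = forward_substitution(G, b)

u = np.finfo(float).eps / 2          # unit roundoff
r = b - G @ y_c                      # residual
# Normwise backward error: the smallest eps with (G + dG)y_c = b, |dG| <= eps|G| in norm
eta = np.linalg.norm(r, np.inf) / (np.linalg.norm(G, np.inf) * np.linalg.norm(y_c, np.inf))
print(f"backward error {eta:.2e}  vs bound 2nu = {2 * n * u:.2e}")
```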
Backward error analysis of backward and forward substitution

Exercise. Given any two matrices C = [c_ij]_{n×m} and F = [f_ij]_{n×m}, show that
1. |C| ≤ |F| ⇒ ‖C‖_p ≤ ‖F‖_p for p = 1, F, ∞.
2. ‖C‖_p = ‖|C|‖_p for p = 1, F, ∞.
Corollary. Let G be any nonsingular lower (upper) triangular n × n matrix and let b be any nonzero column vector of length n. If y_c is the computed solution of the system Gy = b obtained by any variant of forward (backward) substitution in floating point arithmetic, then y_c satisfies

    (G + δG)y_c = b

where δG is lower (upper) triangular such that

    ‖δG‖∞ ≤ 2nu‖G‖∞ + O(u²)                  (8)


Therefore the backward and forward substitution processes are unconditionally backward stable.
Backward error analysis of Gaussian Elimination

Theorem. Let A be an n × n matrix and let L_c and U_c be the LU factors of A computed via Gaussian elimination in floating point arithmetic. If no zero pivots were encountered in the process, then

    A + δA = L_c U_c

where

    |δA| ≤ 2nu|L_c||U_c| + O(u²)             (9)

and consequently,

    ‖δA‖∞ ≤ 2nu‖L_c‖∞ ‖U_c‖∞ + O(u²)        (10)

Further, if x_c is the computed solution of Ax = b, b ≠ 0, obtained by solving the lower and upper triangular systems with L_c and U_c as coefficient matrices via forward and backward substitution respectively, then

    (A + E)x_c = b

where

    |E| ≤ 6nu|L_c||U_c| + O(u²)              (11)

and consequently,

    ‖E‖∞ ≤ 6nu‖L_c‖∞ ‖U_c‖∞ + O(u²)         (12)

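Bound (10) can be observed in practice. The sketch below (Python with NumPy/SciPy, an assumed environment; the random matrix and its size are arbitrary) factors A with partial pivoting, so the theorem applies to the row-permuted matrix PᵀA, and compares ‖PᵀA − L_c U_c‖∞ with the right-hand side of (10).

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(1)
n = 300
A = rng.standard_normal((n, n))

# scipy.linalg.lu returns P, L, U with A = P @ L @ U, so P.T @ A = L @ U
P, Lc, Uc = lu(A)
dA = P.T @ A - Lc @ Uc               # the backward error for the permuted matrix

u = np.finfo(float).eps / 2
lhs = np.linalg.norm(dA, np.inf)
rhs = 2 * n * u * np.linalg.norm(Lc, np.inf) * np.linalg.norm(Uc, np.inf)
print(f"‖δA‖∞ = {lhs:.2e}  ≤  2nu‖Lc‖∞‖Uc‖∞ = {rhs:.2e}")
```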
Therefore the process is backward stable if ‖L_c‖∞ ‖U_c‖∞ / ‖A‖∞ is small.
In GENP, the presence of small pivots can make ‖L_c‖∞ very large. Therefore, GENP is unstable.
In GEPP and GECP, ‖L_c‖∞ ≤ n. Therefore,

    ‖L_c‖∞ ‖U_c‖∞ / ‖A‖∞ ≤ n ‖U_c‖∞ / ‖A‖∞.
For GEPP it is observed that, for almost all matrices arising in practice, ‖U_c‖∞ / ‖A‖∞ ≈ √n.
However, ‖U_c‖∞ / ‖A‖∞ = 2^(n−1)/n when A is the n × n Wilkinson matrix and GEPP is used.
Therefore GEPP is conditionally backward stable, as stability is subject to ‖U_c‖∞ / ‖A‖∞ being small.
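The growth on the Wilkinson matrix can be reproduced directly. A minimal sketch (Python with NumPy/SciPy, an assumed environment; n = 20 is an arbitrary modest size): the matrix has ones on the diagonal and in the last column and −1 below the diagonal, GEPP performs no row interchanges on it, the last column doubles at every elimination step, and the ratio ‖U_c‖∞/‖A‖∞ comes out as 2^(n−1)/n.

```python
import numpy as np
from scipy.linalg import lu

n = 20
# Wilkinson matrix: 1 on the diagonal and in the last column, -1 below the diagonal
A = np.eye(n) - np.tril(np.ones((n, n)), -1)
A[:, -1] = 1.0

P, Lc, Uc = lu(A)    # partial pivoting; no swaps actually occur for this matrix
ratio = np.linalg.norm(Uc, np.inf) / np.linalg.norm(A, np.inf)
print(f"n = {n}:  ‖Uc‖∞/‖A‖∞ = {ratio:.6g}   (2^(n-1)/n = {2 ** (n - 1) / n:.6g})")
```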
For GECP it can be proved that ‖U_c‖∞ / ‖A‖∞ = O(n).
Hence GECP is unconditionally backward stable.
GEPP: the algorithm of choice for solving square systems

However, despite the verdict of conditional backward stability on GEPP, it remains the algorithm of choice among direct methods for solving a square system of equations that is not too large, when the coefficient matrix is not positive definite and not sparse (that is, most of its entries are nonzero).
This is because it is cheaper than GECP and has excellent stability properties for almost all matrices.
Therefore the MATLAB command A \ b runs GEPP to solve Ax = b when A is not too large, not sparse and not positive definite.
A complete understanding of the reasons for the backward stability of GEPP in almost all cases is an open problem.
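The NumPy/SciPy analogue of MATLAB's A \ b for a dense square system (a sketch under the assumption of a SciPy environment; the random system and its size are arbitrary): lu_factor runs GEPP once, and lu_solve performs the forward and backward substitutions with the computed factors.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(2)
n = 100                               # arbitrary size for illustration
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

lu_piv = lu_factor(A)                 # PA = LU via Gaussian elimination with partial pivoting
x = lu_solve(lu_piv, b)               # forward substitution with L, then backward with U

rel_res = np.linalg.norm(A @ x - b) / (np.linalg.norm(A) * np.linalg.norm(x))
print("relative residual:", rel_res)
```

Factoring once and reusing lu_piv is also the standard way to solve several systems with the same A cheaply.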
Backward error analysis of Cholesky factorization

Theorem. Let A be an n × n positive definite matrix and let G_c be the Cholesky factor computed in floating point arithmetic via some version of the Cholesky factorization algorithm. Then

    A + δA = G_cᵀ G_c

where |δA| ≤ 2nu|G_cᵀ||G_c| + O(u²).



Consequently,

    ‖δA‖F / ‖A‖F ≤ 2n^(3/2) u / (1 − 2n^(3/2) u) + O(u²).
Thus, the algorithm of computing a Cholesky factor of a positive definite matrix and using it to solve a system of equations is unconditionally backward stable.
Consequently, solving a system of equations with a positive definite coefficient matrix is unconditionally backward stable.
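The Frobenius-norm bound above is easy to verify numerically. A minimal sketch (Python with NumPy assumed; the random positive definite matrix and its size are arbitrary choices; numpy.linalg.cholesky returns the lower triangular factor, so it is transposed to match the A = G_cᵀG_c convention used here):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 250
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)          # symmetric positive definite test matrix

Gc = np.linalg.cholesky(A).T         # upper triangular, so A ≈ Gc.T @ Gc

u = np.finfo(float).eps / 2
rel_err = np.linalg.norm(A - Gc.T @ Gc, 'fro') / np.linalg.norm(A, 'fro')
bound = 2 * n ** 1.5 * u / (1 - 2 * n ** 1.5 * u)
print(f"relative backward error {rel_err:.2e}  ≤  bound {bound:.2e}")
```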