CombOptim_Ch3


Linear Algebra and

Complexity

Chris Dickson
CAS 746 - Advanced Topics in Combinatorial
Optimization

McMaster University, January 23, 2006


Introduction
• Review of linear algebra
• Complexity of the determinant, inverse, systems of equations, etc.
• Gaussian elimination method for solving linear systems
  • complexity
  • solvability
  • numerical issues
• Iterative methods for solving linear systems
  • Jacobi Method
  • Gauss-Seidel Method
  • Successive Over-Relaxation Methods

2
Review
• A subset L of Rn (respectively Qn) is called a linear hyperplane if
L = { x | ax = 0 } for some nonzero row vector a in Rn (Qn).
• A set U of vectors in Rn (respectively Qn) is called a linear subspace of Rn
if the following properties hold:
  • The zero vector is in U
  • If x ∈ U then λx ∈ U for any scalar λ
  • If x, y ∈ U then x + y ∈ U
• Important subspaces
  • Null Space: null(A) = { x | Ax = 0 }
  • Image: im(A) = { b | Ax = b for some vector x }
• A basis for a subspace L is a set of linearly independent vectors that spans
L.

3
Review
• Theorem 3.1. Each linear subspace L of Rn is generated by finitely
many vectors and is also the intersection of finitely many linear
hyperplanes.
• Proof:
  • If L is a subspace, then a maximal set of linearly independent vectors
from L will form a basis.
  • Also, L is the intersection of the hyperplanes { x | aix = 0 }, where the
ai form a linearly independent set generating L* := { z | zx = 0 for all x in L }.

4
Review

• Corollary 3.1a. For each matrix A there exist vectors x1, … , xt
such that: Ax = 0 ⟺ x = λ1x1 + … + λtxt for certain rationals λ1, … , λt.

• If x1, … , xt are linearly independent, they form a fundamental
system of solutions to Ax = 0.

• Corollary 3.1b. The system Ax = b has a solution iff yb = 0 for each
vector y with yA = 0. [Fundamental theorem of linear algebra,
Gauss 1809]

5
Review
• Proof:
  • (⟹) If Ax = b, then for any y with yA = 0 we have yb = yAx = 0.
  • (⟸) Let im(A) = { z | Ax = z for some x }. By Theorem 3.1 there
exists a matrix C such that im(A) = { z | Cz = 0 }. Since CAx = 0
for every x, the matrix CA is all-zero. Thus:
      (yA = 0 ⟹ yb = 0)
    ⟹ Cb = 0
    ⟹ b ∈ im(A)
    ⟹ there is a solution to Ax = b

6
Review

• Corollary 3.1c. If x0 is a solution to Ax = b, then there are vectors
x1, … , xt such that Ax = b iff x = x0 + λ1x1 + … + λtxt for certain
rationals λ1, … , λt.

• Proof: Ax = b iff A(x – x0) = 0. Corollary 3.1a implies:

  x – x0 = λ1x1 + … + λtxt
  x = x0 + λ1x1 + … + λtxt

7
Review
• Cramer’s Rule: If A is an invertible (n x n)-matrix, the solution to
Ax = b is given by:

  x1 = det A1 / det A
  ...
  xn = det An / det A

• Ai is defined by replacing the ith column of A by the column vector b.

8
Review
• Example:

  5x1 + x2 – x3 = 4
  9x1 + x2 – x3 = 1
  x1 – x2 + 5x3 = 2

Here, x2 = det(A2) / det(A) = 166 / 16 = 10.375. (A small numerical check appears below.)
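As a quick numerical check of this example, here is a minimal NumPy sketch of Cramer’s Rule; the column replacement inside the loop builds each Ai.

```python
import numpy as np

# The 3 x 3 system from this slide: Ax = b.
A = np.array([[5.0, 1.0, -1.0],
              [9.0, 1.0, -1.0],
              [1.0, -1.0, 5.0]])
b = np.array([4.0, 1.0, 2.0])

det_A = np.linalg.det(A)
x = np.empty(3)
for i in range(3):
    A_i = A.copy()
    A_i[:, i] = b                       # replace the i-th column by b
    x[i] = np.linalg.det(A_i) / det_A

print(x[1])   # about 10.375, matching det(A2) / det(A) above
```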

9
Sizes and Characterizations
• Rational Number: r = p/q (p ∈ Z, q ∈ N, p and q relatively prime)

• Rational Vector: c = ( c1 , c2 , ... , cn ) with each ci rational

• Rational Matrix: a matrix all of whose entries are rational

We only consider rational entries for vectors and matrices.

10
Sizes and Characterizations
• Theorem 3.2. Let A be a square rational matrix of size σ. Then the
size of det(A) is less than 2σ.
• Proof: Let A = (pij/qij) for i, j = 1..n, and let det A = p/q, where p, q
and each pair pij, qij are relatively prime. Then q divides the common
denominator of the entries.
• From the definition of the determinant, |det A| can also be bounded; both
bounds are sketched below.
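A sketch of the two bounds, assuming each entry is written aij = pij/qij in lowest terms with qij ≥ 1:

```latex
% Putting det A over the common denominator \prod_{i,j} q_{ij} shows that
% the reduced denominator q divides that product; expanding the determinant
% as a signed sum over permutations bounds the magnitude of det A.
\[
  q \;\le\; \prod_{i,j} q_{ij},
  \qquad
  |\det A| \;\le\; \prod_{i,j} \bigl( |p_{ij}| + 1 \bigr).
\]
```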

11
Sizes and Characterizations
• Noting that |p| = |det A| · q, we combine the two bounds above to bound p
and q.

• Plugging the bounds for p and q into the definition of the size of a
rational number, we get the claimed bound (sketch below).
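A hedged sketch of how the combination might go, under the convention that the size of A adds up the sizes of its entries:

```latex
\[
  q \;\le\; \prod_{i,j} q_{ij} \;\le\; 2^{\sigma},
  \qquad
  |p| \;=\; |\det A|\, q
      \;\le\; \prod_{i,j}\bigl(|p_{ij}|+1\bigr)\prod_{i,j} q_{ij}
      \;\le\; 2^{\sigma}.
\]
```

Since p and q then each need roughly σ bits, counting against the definition of the size of a rational number gives the stated bound of 2σ.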

12
Sizes and Characterizations
• Several important corollaries follow from the theorem.

• Corollary 3.2a. The inverse A-1 of a nonsingular (rational) matrix A
has size polynomially bounded by the size of A.
  • Proof: The entries of A-1 are quotients of subdeterminants of A. Now
apply the theorem.

• Corollary 3.2b. If the system Ax = b has a solution, it has one of size
polynomially bounded by the sizes of A and b.
  • Proof: Since the size of A-1 is polynomially bounded, so is the size of A-1b.

This tells us that Corollary 3.1b provides a good characterization.

13
Sizes and Characterizations
• What if there isn’t a solution to Ax = b? Is this problem still
polynomially bounded? Yes!

• Corollary 3.2c says so: if there is no solution, we can specify a
vector y with yA = 0 and yb = 1. By Corollary 3.2b, such a vector
exists and is polynomially bounded by the sizes of A and b.

• Gaussian elimination provides a polynomial method for finding a
solution (or certifying that none exists).

• One more thing left to show: do we know the solution will be
polynomially bounded in size?

14
Sizes and Characterizations
• Corollary 3.2d. Let A be a rational m x n matrix and let b be a
rational column vector such that each row of the matrix [A b] has
size at most φ. If Ax = b has a solution, then:

  { x | Ax = b } = { x0 + λ1x1 + … + λtxt | λi ∈ R }   (*)

for certain rational vectors x0, … , xt of size at most 4n²φ.

• Proof: By Cramer’s Rule, there exist x0, … , xt satisfying (*) whose
components are quotients of subdeterminants of [A b] of order at most n.
By Theorem 3.2, each such determinant has size less than 2nφ, so each
component of xi has size less than 4nφ. Since each xi has n components,
the size of xi is at most 4n²φ.

15
Gaussian Elimination
• Given a system of linear equations:

  a11x1 + a12x2 + .... + a1nxn = b1
  a21x1 + a22x2 + .... + a2nxn = b2
  .........
  am1x1 + am2x2 + .... + amnxn = bm

we subtract appropriate multiples of the first equation from the other
equations to get:

  a11x1 + a12x2 + .... + a1nxn = b1
          a’22x2 + .... + a’2nxn = b’2
  .........
          a’m2x2 + .... + a’mnxn = b’m

We then recursively solve the system of the last m-1 equations, as in the
sketch below.
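A minimal sketch of one elimination step in Python, using exact Fractions to mirror the rational-arithmetic setting; the function name and the no-pivoting assumption are my own choices:

```python
from fractions import Fraction

def eliminate_first_column(rows):
    """One forward-elimination step: subtract multiples of the first row
    from the remaining rows so that their first entries become zero.
    Each row is [a_1, ..., a_n, b]; assumes rows[0][0] != 0 (naive pivot)."""
    pivot = rows[0]
    reduced = []
    for row in rows[1:]:
        factor = row[0] / pivot[0]
        reduced.append([r - factor * p for r, p in zip(row, pivot)])
    return pivot, reduced

# Example: the 3 x 3 system from the Cramer's Rule slide.
system = [[Fraction(v) for v in row] for row in
          [[5, 1, -1, 4], [9, 1, -1, 1], [1, -1, 5, 2]]]
kept, rest = eliminate_first_column(system)
# 'rest' is the smaller system in x2, ..., xn that is solved recursively.
```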

16
Gaussian Elimination
• Operations allowed on matrices:
  • Adding a multiple of one row to another row
  • Permuting rows or columns
• Forward Elimination: Given a matrix Ak in block form whose upper-left
block B is non-singular, upper triangular, and of order k, with zeros below
B and a remaining lower-right block D:
  • We choose a nonzero component of D, called the pivot element.
  • Without loss of generality, assume that this pivot element is D11.
  • Next, add rational multiples of the first row of D to the other rows of
D so that D11 becomes the only non-zero element in the first column of D.

17
Gaussian Elimination
• Back Substitution: After the FE stage, we wish to transform the matrix
into a form whose leading block is non-singular and diagonal.

• We accomplish this by adding multiples of the kth row to rows
1, ... , k-1 so that the element Akk is the only non-zero entry in the kth
row and column, for k = r, r-1, ... , 1 (r = rank(A)). A sketch follows.
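To complete the solver, a minimal back-substitution sketch for the square case; U is the augmented upper-triangular system [A | b] produced by forward elimination, and the helper name is my own:

```python
def back_substitute(U):
    """Solve an upper-triangular augmented system U = [A | b],
    where A is n x n with nonzero diagonal entries."""
    n = len(U)
    x = [0] * n
    for k in range(n - 1, -1, -1):
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (U[k][n] - s) / U[k][k]   # divide by the diagonal pivot
    return x
```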

18
Gaussian Elimination
• Theorem 3.3. For rational data, the Gaussian elimination method
is a polynomial-time method.
  • The proof is in the book; it is quite long-winded!
  • Since the operations allowed in GE are polynomially bounded, it only
remains to show that the numbers do not grow exponentially.
• A corollary of the theorem is that other matrix operations are
polynomially solvable, as they reduce to solving systems of linear
equations:
  • calculating the determinant of a rational matrix
  • determining the rank of a rational matrix
  • calculating the inverse of a rational matrix
  • testing a set of vectors for linear independence
  • solving a system of rational linear equations

19
Gaussian Elimination
• Numerical Considerations
  • A naive pivot strategy always uses D11 as the pivot. However, this can
lead to numerical problems when we do not have exact arithmetic (i.e.,
floating-point numbers).
  • Partial pivoting always selects the largest element (in absolute value) in
the first column of D as the pivot (swap rows to make this element D11).
  • Scaled partial pivoting is like partial pivoting, but it scales the pivot
column and row so that D11 = 1.
  • Full pivoting uses the largest value in the entire D submatrix as the
pivot. We then swap rows and columns to put this value in D11.
  • Studies show that the extra overhead of full pivoting usually outweighs
its benefit. A partial-pivoting sketch follows.
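A small sketch of the partial-pivoting choice on the remaining rows; the helper name is my own, and the swap would happen just before the elimination step shown earlier:

```python
def partial_pivot(rows, col=0):
    """Swap the row with the largest absolute entry in column `col`
    to the top, so that it becomes the pivot row (partial pivoting)."""
    best = max(range(len(rows)), key=lambda i: abs(rows[i][col]))
    rows[0], rows[best] = rows[best], rows[0]
    return rows
```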

20
Gaussian Elimination
• Other methods:
  • Gauss-Jordan Method: combines the FE and BS stages of Gaussian
elimination.
    • Generally inferior to ordinary Gaussian elimination.
  • LU Decomposition: uses GE to decompose the original matrix A into the
product of a lower and an upper triangular matrix.
    • This is the method of choice when we must solve many systems with the
same matrix A but with different b’s.
    • We only need to perform FE once, then do back substitution for each
right-hand side (see the sketch below).
  • Iterative Methods: we want to solve Ax = b by determining a sequence of
vectors x0, x1, … which eventually converges to a solution x*.
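One concrete way to exploit that reuse, sketched with SciPy’s LU routines; the matrix and right-hand sides are made-up example data:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])                          # made-up example
rhs_list = [np.array([1.0, 2.0, 3.0]),
            np.array([0.0, 1.0, 0.0])]                   # several b's

lu, piv = lu_factor(A)                                   # factor A once
solutions = [lu_solve((lu, piv), b) for b in rhs_list]   # cheap solve per b
```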

21
Iterative Methods
• Given Ax = b and x0, the goal is to find a sequence of iterates x1, x2, …
converging to an x* such that Ax* = b.

• Most methods split A into A = L + D + U, where L is strictly lower
triangular, D is diagonal, and U is strictly upper triangular.

• Example: a matrix written as the sum of its L, D, and U parts (see the
splitting sketch below).
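A minimal NumPy sketch of the splitting; the example matrix is made up:

```python
import numpy as np

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])   # made-up example

D = np.diag(np.diag(A))    # diagonal part
L = np.tril(A, k=-1)       # strictly lower triangular part
U = np.triu(A, k=1)        # strictly upper triangular part
assert np.allclose(L + D + U, A)
```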

22
Iterative Methods
• Now we can define how the iteration proceeds. In general, split A = M - N
and iterate M xk+1 = N xk + b, i.e. xk+1 = M-1(N xk + b).

• Jacobi Iteration: MJ = D, NJ = -(L + U). Alternatively, the book
presents the iteration xk+1 = (I - A)xk + b.

• These two are equivalent because the book assumes aii = 1, which means
D = D-1 = I (the identity). A sketch of the iteration follows.
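A minimal Jacobi sketch in NumPy, using the MJ = D, NJ = -(L + U) splitting above; the tolerance and iteration cap are my own choices:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D = np.diag(A)                    # diagonal entries of A
    R = A - np.diag(D)                # L + U, the off-diagonal part
    x = x0.astype(float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D       # componentwise division by a_ii
        if np.linalg.norm(x_new - x, ord=np.inf) < tol:
            return x_new
        x = x_new
    return x                          # may not have converged
```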

23
Iterative Methods
• Example: Solve for x in Ax = b using Jacobi’s Method where:

• Using the iteration on the previous slide and x0 = b, we obtain:

  x1 = [ -5 0.667 1.5 ]T
  x2 = [ 0.833 -1.0 -1.0 ]T
  ...
  x10 = [ 0.9999 0.9985 1.9977 ]T   ( x* = [ 1 1 2 ]T )

24
Iterative Methods
• Example 2: Solve for x in Ax = b using Jacobi’s Method where:

• Using x0 = b, we obtain:

  x1 = [ -44.5 -49.5 -13.4 ]T
  x2 = [ 112.7 235.7 67.7 ]T
  ...
  x10 = [ 2.82 4.11 1.39 ]T × 10^6   ( x* = [ 1 1 1 ]T )

25
Iterative Methods
• What happened? The sequence diverged!

• The matrix in the second example was NOT diagonally dominant.

• Strict diagonal row dominance is sufficient for convergence.
  • Each diagonal element is larger in absolute value than the sum of the
absolute values of the other elements in its row.
  • It is not necessary, though, as example 1 did not have strict diagonal
dominance.

• Alternatively, the book says that the Jacobi Method will converge if all
eigenvalues of (I - A) are less than 1 in absolute value.
  • Again, this is a sufficient condition. It can be checked that the
eigenvalues of the matrix in the first example do not satisfy this condition.
  • A sketch of both checks follows.
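A hedged sketch of both checks for a given square matrix A; the second check presumes aii = 1, as the book does:

```python
import numpy as np

def strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    off_diag = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.abs(np.diag(A)) > off_diag))

def jacobi_spectral_condition(A):
    """True if every eigenvalue of (I - A) has absolute value < 1."""
    eigvals = np.linalg.eigvals(np.eye(len(A)) - A)
    return bool(np.max(np.abs(eigvals)) < 1.0)
```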

26
Iterative Methods
• Gauss-Seidel Method: computes the (i+1)th component of xk+1 using the
components of xk+1 already computed in the current sweep, in place of the
corresponding components of xk.

27
Iterative Methods
• Gauss-Seidel Method: alternatively, if we split A = L + D + U, then we
obtain the iteration MG xk+1 = NG xk + b,

• where MG = D + L and NG = -U.

• Again, it can be shown that the book’s componentwise method is equivalent
to this splitting. A sketch follows.
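A minimal Gauss-Seidel sketch, written componentwise so that each freshly updated entry is used immediately; the tolerance and iteration cap are my own choices:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel: sweep through the components, using each updated x[i]
    in the remaining updates of the same sweep."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x
```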

28
Iterative Methods
• Example: Gauss-Seidel Method

• Starting with x0 = b, we obtain the sequence:

  x1 = [ -4.666 -5.116 3.366 ]T
  x2 = [ 1.477 -2.788 1.662 ]T
  x3 = [ 1.821 -0.404 1.116 ]T
  ...
  x12 = [ 1.000 1.000 1.000 ]T

29
Iterative Methods
• It should also be noted that in both the Gauss-Seidel and Jacobi Methods,
the choice of M is important.

• Since each iteration involves solving a system with M, we should choose
M so that this is easy.

• In the Jacobi Method, M is a diagonal matrix.
  • Easy to invert!

• In the Gauss-Seidel Method, M is lower triangular.
  • Slightly harder: the inverse is still lower triangular, and forward
substitution takes a little more work, but we only have to set it up once.
  • Alternatively, use the book’s strategy of computing the iterates component
by component. A triangular-solve sketch follows.
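A small sketch of solving with a lower-triangular M without forming its inverse, via SciPy’s triangular solver; the matrix is a made-up example of MG = D + L:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])        # made-up example
M = np.tril(A)                            # Gauss-Seidel choice: M = D + L
rhs = np.array([1.0, 2.0, 3.0])

y = solve_triangular(M, rhs, lower=True)  # forward substitution, no inverse
```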

30
Iterative Methods
• Successive Over-Relaxation Method (SOR): write matrices L and U
(L lower triangular, including the diagonal) such that they split A,

for some appropriately chosen ω > 0 (D is the diagonal of A).

• A related class of iterative relaxation methods for approximating the
solution to Ax = b can now be defined.

31
Iterative Methods
• Algorithm:

• For a certain ω with 0 < ω ≤ 2,
  • the method is convergent if 0 < ω < 2 and Ax = b has a solution.
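As a hedged stand-in for the displayed algorithm, here is the standard componentwise SOR update, which may differ in detail from the book’s formulation; ω is the relaxation parameter, and the tolerance and cap are my own choices:

```python
import numpy as np

def sor(A, b, x0, omega=1.5, tol=1e-10, max_iter=1000):
    """Successive Over-Relaxation: a Gauss-Seidel sweep blended with the
    previous iterate through the relaxation parameter omega (0 < omega < 2)."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs_value = (b[i] - s) / A[i, i]            # plain Gauss-Seidel value
            x[i] = (1 - omega) * x[i] + omega * gs_value
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x
```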

32
Iterative Methods
• There are many different choices that can be made for the
relaxation methods.

• Choice of ω
  • ω = 2: xk+1 is the reflection of xk in the violated hyperplane aix = bi
  • ω = 1: xk+1 is the projection of xk onto the hyperplane aix = bi

• How to choose the violated equation
  • Is it best to just pick any violated equation?
  • Should we have a strategy? (first, best, worst, etc.)

• Stopping Criteria:
  • How close is close enough?

33
Iterative Methods
• Why use iterative methods?
  • Sometimes they are more numerically stable,
particularly when the matrix A is positive semidefinite and diagonally
dominant (this is the case for the linear least-squares problem).
  • When an exact solution is not necessary and we only need to be ‘close enough’.

• Which one to use?
  • Compared with Jacobi iteration, the Gauss-Seidel Method converges faster,
usually by a factor of 2.

34
References
• Schrijver, A. Theory of Linear and Integer Programming. John
Wiley & Sons, 1998.

• Qiao, S. Linear Systems. CAS 708 Lecture Slides, 2005.

35
THE END

36
