CombOptim_Ch3


Linear Algebra and

Complexity

Chris Dickson
CAS 746 - Advanced Topics in Combinatorial
Optimization

McMaster University, January 23, 2006


Introduction
• Review of linear algebra
• Complexity of the determinant, inverse, systems of equations, etc.
• Gaussian elimination method for solving linear systems
  • complexity
  • solvability
  • numerical issues
• Iterative methods for solving linear systems
  • Jacobi Method
  • Gauss-Seidel Method
  • Successive Over-Relaxation Methods

2
Review
• A subset L of Rn (respectively Qn) is called a linear hyperplane if
L = { x | ax = 0 } for some nonzero row vector a in Rn (Qn).
• A set U of vectors in Rn (respectively Qn) is called a linear subspace of Rn
if the following properties hold:
  • The zero vector is in U
  • If x ∈ U then λx ∈ U for any scalar λ
  • If x, y ∈ U then x + y ∈ U
• Important subspaces
  • Null Space: null(A) = { x | Ax = 0 }
  • Image: im(A) = { b | Ax = b for some vector x }
• A basis for a subspace L is a set of linearly independent vectors that spans
L.

3
Review
• Theorem 3.1. Each linear subspace L of Rn is generated by finitely
many vectors and is also the intersection of finitely many linear
hyperplanes.
• Proof:
  • If L is a subspace, then a maximal set of linearly independent vectors
from L will form a basis.
  • Also, L is the intersection of the hyperplanes { x | aix = 0 }, where the
ai form a linearly independent set generating L* := { z | zx = 0 for all x in L }.

4
Review

• Corollary 3.1a. For each matrix A there exist vectors x1, … , xt
such that: Ax = 0 ⟺ x = λ1x1 + … + λtxt for certain rationals λ1, … , λt.

• If x1, … , xt are linearly independent, they form a fundamental
system of solutions to Ax = 0.

• Corollary 3.1b. The system Ax = b has a solution iff yb = 0 for each
vector y with yA = 0. [Fundamental theorem of linear algebra,
Gauss 1809]

5
Review
• Proof:
  • (⟹) If Ax = b, then for any y with yA = 0 we have yb = yAx = 0.
  • (⟸) Let im(A) = { z | Ax = z for some x }. By Theorem 3.1 there
exists a matrix C such that im(A) = { z | Cz = 0 }. Since CAx = 0
for every x, the matrix CA is all-zero. Thus:
      (yA = 0 ⟹ yb = 0)
    ⟹ Cb = 0
    ⟹ b ∈ im(A)
    ⟹ there is a solution to Ax = b

6
Review

• Corollary 3.1c. If x0 is a solution to Ax = b, then there are vectors
x1, … , xt such that Ax = b iff x = x0 + λ1x1 + … + λtxt for certain
rationals λ1, … , λt.

• Proof: Ax = b iff A(x – x0) = 0. Corollary 3.1a implies:

  x – x0 = λ1x1 + … + λtxt
  x = x0 + λ1x1 + … + λtxt

7
Review
• Cramer’s Rule: If A is an invertible (n x n)-matrix, the solution to
Ax = b is given by:

  x1 = det A1 / det A
  ...
  xn = det An / det A

• Ai is defined by replacing the ith column of A by the column vector b.

8
Review
• Example:

  5x1 + x2 – x3 = 4
  9x1 + x2 – x3 = 1
  x1 – x2 + 5x3 = 2

Here, x2 = det(A2) / det(A) = 166 / 16 = 10.375. (A small numerical check appears below.)
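As a quick numerical check of this example, here is a minimal NumPy sketch of Cramer’s Rule; the column replacement inside the loop builds each Ai.

```python
import numpy as np

# The 3 x 3 system from this slide: Ax = b.
A = np.array([[5.0, 1.0, -1.0],
              [9.0, 1.0, -1.0],
              [1.0, -1.0, 5.0]])
b = np.array([4.0, 1.0, 2.0])

det_A = np.linalg.det(A)
x = np.empty(3)
for i in range(3):
    A_i = A.copy()
    A_i[:, i] = b                       # replace the i-th column by b
    x[i] = np.linalg.det(A_i) / det_A

print(x[1])   # about 10.375, matching det(A2) / det(A) above
```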

9
Sizes and Characterizations
• Rational Number: r = p/q (p ∈ Z, q ∈ N, p and q relatively prime)

• Rational Vector: c = ( c1 , c2 , ... , cn ) with each ci rational

• Rational Matrix: a matrix all of whose entries are rational

We only consider rational entries for vectors and matrices.

10
Sizes and Characterizations
• Theorem 3.2. Let A be a square rational matrix of size σ. Then the
size of det(A) is less than 2σ.
• Proof: Let A = (pij/qij) for i, j = 1..n, and let det A = p/q, where p, q
and each pair pij, qij are relatively prime. Then q divides the common
denominator of the entries.
• From the definition of the determinant, |det A| can also be bounded; both
bounds are sketched below.
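A sketch of the two bounds, assuming each entry is written aij = pij/qij in lowest terms with qij ≥ 1:

```latex
% Putting det A over the common denominator \prod_{i,j} q_{ij} shows that
% the reduced denominator q divides that product; expanding the determinant
% as a signed sum over permutations bounds the magnitude of det A.
\[
  q \;\le\; \prod_{i,j} q_{ij},
  \qquad
  |\det A| \;\le\; \prod_{i,j} \bigl( |p_{ij}| + 1 \bigr).
\]
```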

11
Sizes and Characterizations
• Noting that |p| = |det A| · q, we combine the two bounds above to bound p
and q.

• Plugging the bounds for p and q into the definition of the size of a
rational number, we get the claimed bound (sketch below).
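A hedged sketch of how the combination might go, under the convention that the size of A adds up the sizes of its entries:

```latex
\[
  q \;\le\; \prod_{i,j} q_{ij} \;\le\; 2^{\sigma},
  \qquad
  |p| \;=\; |\det A|\, q
      \;\le\; \prod_{i,j}\bigl(|p_{ij}|+1\bigr)\prod_{i,j} q_{ij}
      \;\le\; 2^{\sigma}.
\]
```

Since p and q then each need roughly σ bits, counting against the definition of the size of a rational number gives the stated bound of 2σ.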

12
Sizes and Characterizations
• Several important corollaries follow from the theorem.

• Corollary 3.2a. The inverse A-1 of a nonsingular (rational) matrix A
has size polynomially bounded by the size of A.
  • Proof: The entries of A-1 are quotients of subdeterminants of A. Now
apply the theorem.

• Corollary 3.2b. If the system Ax = b has a solution, it has one of size
polynomially bounded by the sizes of A and b.
  • Proof: Since the size of A-1 is polynomially bounded, so is the size of A-1b.

This tells us that Corollary 3.1b provides a good characterization.

13
Sizes and Characterizations
• What if there isn’t a solution to Ax = b? Is this problem still
polynomially bounded? Yes!

• Corollary 3.2c says so: if there is no solution, we can specify a
vector y with yA = 0 and yb = 1. By Corollary 3.2b, such a vector
exists and is polynomially bounded by the sizes of A and b.

• Gaussian elimination provides a polynomial method for finding a
solution (or certifying that none exists).

• One more thing left to show: do we know the solution will be
polynomially bounded in size?

14
Sizes and Characterizations
• Corollary 3.2d. Let A be a rational m x n matrix and let b be a
rational column vector such that each row of the matrix [A b] has
size at most φ. If Ax = b has a solution, then:

  { x | Ax = b } = { x0 + λ1x1 + … + λtxt | λi ∈ R }   (*)

for certain rational vectors x0, … , xt of size at most 4n²φ.

• Proof: By Cramer’s Rule, there exist x0, … , xt satisfying (*) whose
components are quotients of subdeterminants of [A b] of order at most n.
By Theorem 3.2, each such determinant has size less than 2nφ, so each
component of xi has size less than 4nφ. Since each xi has n components,
the size of xi is at most 4n²φ.

15
Gaussian Elimination
• Given a system of linear equations:

  a11x1 + a12x2 + .... + a1nxn = b1
  a21x1 + a22x2 + .... + a2nxn = b2
  .........
  am1x1 + am2x2 + .... + amnxn = bm

we subtract appropriate multiples of the first equation from the other
equations to get:

  a11x1 + a12x2 + .... + a1nxn = b1
          a’22x2 + .... + a’2nxn = b’2
  .........
          a’m2x2 + .... + a’mnxn = b’m

We then recursively solve the system of the last m-1 equations, as in the
sketch below.
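A minimal sketch of one elimination step in Python, using exact Fractions to mirror the rational-arithmetic setting; the function name and the no-pivoting assumption are my own choices:

```python
from fractions import Fraction

def eliminate_first_column(rows):
    """One forward-elimination step: subtract multiples of the first row
    from the remaining rows so that their first entries become zero.
    Each row is [a_1, ..., a_n, b]; assumes rows[0][0] != 0 (naive pivot)."""
    pivot = rows[0]
    reduced = []
    for row in rows[1:]:
        factor = row[0] / pivot[0]
        reduced.append([r - factor * p for r, p in zip(row, pivot)])
    return pivot, reduced

# Example: the 3 x 3 system from the Cramer's Rule slide.
system = [[Fraction(v) for v in row] for row in
          [[5, 1, -1, 4], [9, 1, -1, 1], [1, -1, 5, 2]]]
kept, rest = eliminate_first_column(system)
# 'rest' is the smaller system in x2, ..., xn that is solved recursively.
```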

16
Gaussian Elimination
• Operations allowed on matrices:
  • Adding a multiple of one row to another row
  • Permuting rows or columns
• Forward Elimination: Given a matrix Ak in block form whose upper-left
block B is non-singular, upper triangular, and of order k, with zeros below
B and a remaining lower-right block D:
  • We choose a nonzero component of D, called the pivot element.
  • Without loss of generality, assume that this pivot element is D11.
  • Next, add rational multiples of the first row of D to the other rows of
D so that D11 becomes the only non-zero element in the first column of D.

17
Gaussian Elimination
• Back Substitution: After the FE stage, we wish to transform the matrix
into a form whose leading block is non-singular and diagonal.

• We accomplish this by adding multiples of the kth row to rows
1, ... , k-1 so that the element Akk is the only non-zero entry in the kth
row and column, for k = r, r-1, ... , 1 (r = rank(A)). A sketch follows.
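To complete the solver, a minimal back-substitution sketch for the square case; U is the augmented upper-triangular system [A | b] produced by forward elimination, and the helper name is my own:

```python
def back_substitute(U):
    """Solve an upper-triangular augmented system U = [A | b],
    where A is n x n with nonzero diagonal entries."""
    n = len(U)
    x = [0] * n
    for k in range(n - 1, -1, -1):
        s = sum(U[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (U[k][n] - s) / U[k][k]   # divide by the diagonal pivot
    return x
```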

18
Gaussian Elimination
• Theorem 3.3. For rational data, the Gaussian elimination method
is a polynomial-time method.
  • The proof is in the book; it is quite long-winded!
  • Since the operations allowed in GE are polynomially bounded, it only
remains to show that the numbers do not grow exponentially.
• A corollary of the theorem is that other matrix operations are
polynomially solvable, as they reduce to solving systems of linear
equations:
  • calculating the determinant of a rational matrix
  • determining the rank of a rational matrix
  • calculating the inverse of a rational matrix
  • testing a set of vectors for linear independence
  • solving a system of rational linear equations

19
Gaussian Elimination
• Numerical Considerations
  • A naive pivot strategy always uses D11 as the pivot. However, this can
lead to numerical problems when we do not have exact arithmetic (i.e.,
floating-point numbers).
  • Partial pivoting always selects the largest element (in absolute value) in
the first column of D as the pivot (swap rows to make this element D11).
  • Scaled partial pivoting is like partial pivoting, but it scales the pivot
column and row so that D11 = 1.
  • Full pivoting uses the largest value in the entire D submatrix as the
pivot. We then swap rows and columns to put this value in D11.
  • Studies show that the extra overhead of full pivoting usually outweighs
its benefit. A partial-pivoting sketch follows.
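A small sketch of the partial-pivoting choice on the remaining rows; the helper name is my own, and the swap would happen just before the elimination step shown earlier:

```python
def partial_pivot(rows, col=0):
    """Swap the row with the largest absolute entry in column `col`
    to the top, so that it becomes the pivot row (partial pivoting)."""
    best = max(range(len(rows)), key=lambda i: abs(rows[i][col]))
    rows[0], rows[best] = rows[best], rows[0]
    return rows
```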

20
Gaussian Elimination
• Other methods:
  • Gauss-Jordan Method: combines the FE and BS stages of Gaussian
elimination.
    • Generally inferior to ordinary Gaussian elimination.
  • LU Decomposition: uses GE to decompose the original matrix A into the
product of a lower and an upper triangular matrix.
    • This is the method of choice when we must solve many systems with the
same matrix A but with different b’s.
    • We only need to perform FE once, then do back substitution for each
right-hand side (see the sketch below).
  • Iterative Methods: we want to solve Ax = b by determining a sequence of
vectors x0, x1, … which eventually converges to a solution x*.
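One concrete way to exploit that reuse, sketched with SciPy’s LU routines; the matrix and right-hand sides are made-up example data:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])                          # made-up example
rhs_list = [np.array([1.0, 2.0, 3.0]),
            np.array([0.0, 1.0, 0.0])]                   # several b's

lu, piv = lu_factor(A)                                   # factor A once
solutions = [lu_solve((lu, piv), b) for b in rhs_list]   # cheap solve per b
```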

21
Iterative Methods
• Given Ax = b and x0, the goal is to find a sequence of iterates x1, x2, …
converging to an x* such that Ax* = b.

• Most methods split A into A = L + D + U, where L is strictly lower
triangular, D is diagonal, and U is strictly upper triangular.

• Example: a matrix written as the sum of its L, D, and U parts (see the
splitting sketch below).
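A minimal NumPy sketch of the splitting; the example matrix is made up:

```python
import numpy as np

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])   # made-up example

D = np.diag(np.diag(A))    # diagonal part
L = np.tril(A, k=-1)       # strictly lower triangular part
U = np.triu(A, k=1)        # strictly upper triangular part
assert np.allclose(L + D + U, A)
```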

22
Iterative Methods
• Now we can define how the iteration proceeds. In general, split A = M - N
and iterate M xk+1 = N xk + b, i.e. xk+1 = M-1(N xk + b).

• Jacobi Iteration: MJ = D, NJ = -(L + U). Alternatively, the book
presents the iteration xk+1 = (I - A)xk + b.

• These two are equivalent because the book assumes aii = 1, which means
D = D-1 = I (the identity). A sketch of the iteration follows.
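A minimal Jacobi sketch in NumPy, using the MJ = D, NJ = -(L + U) splitting above; the tolerance and iteration cap are my own choices:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (L + U) x_k)."""
    D = np.diag(A)                    # diagonal entries of A
    R = A - np.diag(D)                # L + U, the off-diagonal part
    x = x0.astype(float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D       # componentwise division by a_ii
        if np.linalg.norm(x_new - x, ord=np.inf) < tol:
            return x_new
        x = x_new
    return x                          # may not have converged
```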

23
Iterative Methods
• Example: Solve for x in Ax = b using Jacobi’s Method where:

• Using the iteration on the previous slide and x0 = b, we obtain:

  x1 = [ -5 0.667 1.5 ]T
  x2 = [ 0.833 -1.0 -1.0 ]T
  ...
  x10 = [ 0.9999 0.9985 1.9977 ]T   ( x* = [ 1 1 2 ]T )

24
Iterative Methods
• Example 2: Solve for x in Ax = b using Jacobi’s Method where:

• Using x0 = b, we obtain:

  x1 = [ -44.5 -49.5 -13.4 ]T
  x2 = [ 112.7 235.7 67.7 ]T
  ...
  x10 = [ 2.82 4.11 1.39 ]T × 10^6   ( x* = [ 1 1 1 ]T )

25
Iterative Methods
• What happened? The sequence diverged!

• The matrix in the second example was NOT diagonally dominant.

• Strict diagonal row dominance is sufficient for convergence.
  • Each diagonal element is larger in absolute value than the sum of the
absolute values of the other elements in its row.
  • It is not necessary, though, as example 1 did not have strict diagonal
dominance.

• Alternatively, the book says that the Jacobi Method will converge if all
eigenvalues of (I - A) are less than 1 in absolute value.
  • Again, this is a sufficient condition. It can be checked that the
eigenvalues of the matrix in the first example do not satisfy this condition.
  • A sketch of both checks follows.
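A hedged sketch of both checks for a given square matrix A; the second check presumes aii = 1, as the book does:

```python
import numpy as np

def strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    off_diag = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.abs(np.diag(A)) > off_diag))

def jacobi_spectral_condition(A):
    """True if every eigenvalue of (I - A) has absolute value < 1."""
    eigvals = np.linalg.eigvals(np.eye(len(A)) - A)
    return bool(np.max(np.abs(eigvals)) < 1.0)
```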

26
Iterative Methods
• Gauss-Seidel Method: computes the (i+1)th component of xk+1 using the
components of xk+1 already computed in the current sweep, in place of the
corresponding components of xk.

27
Iterative Methods
• Gauss-Seidel Method: alternatively, if we split A = L + D + U, then we
obtain the iteration MG xk+1 = NG xk + b,

• where MG = D + L and NG = -U.

• Again, it can be shown that the book’s componentwise method is equivalent
to this splitting. A sketch follows.
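A minimal Gauss-Seidel sketch, written componentwise so that each freshly updated entry is used immediately; the tolerance and iteration cap are my own choices:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel: sweep through the components, using each updated x[i]
    in the remaining updates of the same sweep."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x
```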

28
Iterative Methods
• Example: Gauss-Seidel Method

• Starting with x0 = b, we obtain the sequence:

  x1 = [ -4.666 -5.116 3.366 ]T
  x2 = [ 1.477 -2.788 1.662 ]T
  x3 = [ 1.821 -0.404 1.116 ]T
  ...
  x12 = [ 1.000 1.000 1.000 ]T

29
Iterative Methods
• It should also be noted that in both the Gauss-Seidel and Jacobi Methods,
the choice of M is important.

• Since each iteration involves solving a system with M, we should choose
M so that this is easy.

• In the Jacobi Method, M is a diagonal matrix.
  • Easy to invert!

• In the Gauss-Seidel Method, M is lower triangular.
  • Slightly harder: the inverse is still lower triangular, and forward
substitution takes a little more work, but we only have to set it up once.
  • Alternatively, use the book’s strategy of computing the iterates component
by component. A triangular-solve sketch follows.
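A small sketch of solving with a lower-triangular M without forming its inverse, via SciPy’s triangular solver; the matrix is a made-up example of MG = D + L:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])        # made-up example
M = np.tril(A)                            # Gauss-Seidel choice: M = D + L
rhs = np.array([1.0, 2.0, 3.0])

y = solve_triangular(M, rhs, lower=True)  # forward substitution, no inverse
```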

30
Iterative Methods
• Successive Over-Relaxation Method (SOR): write matrices L and U
(L lower triangular, including the diagonal) such that they split A,

for some appropriately chosen ω > 0 (D is the diagonal of A).

• A related class of iterative relaxation methods for approximating the
solution to Ax = b can now be defined.

31
Iterative Methods
• Algorithm:

• For a certain ω with 0 < ω ≤ 2,
  • the method is convergent if 0 < ω < 2 and Ax = b has a solution.
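As a hedged stand-in for the displayed algorithm, here is the standard componentwise SOR update, which may differ in detail from the book’s formulation; ω is the relaxation parameter, and the tolerance and cap are my own choices:

```python
import numpy as np

def sor(A, b, x0, omega=1.5, tol=1e-10, max_iter=1000):
    """Successive Over-Relaxation: a Gauss-Seidel sweep blended with the
    previous iterate through the relaxation parameter omega (0 < omega < 2)."""
    n = len(b)
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs_value = (b[i] - s) / A[i, i]            # plain Gauss-Seidel value
            x[i] = (1 - omega) * x[i] + omega * gs_value
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x
```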

32
Iterative Methods
• There are many different choices that can be made for the
relaxation methods.

• Choice of ω
  • ω = 2: xk+1 is the reflection of xk in the violated hyperplane aix = bi
  • ω = 1: xk+1 is the projection of xk onto the hyperplane aix = bi

• How to choose the violated equation
  • Is it best to just pick any violated equation?
  • Should we have a strategy? (first, best, worst, etc.)

• Stopping Criteria:
  • How close is close enough?

33
Iterative Methods
• Why use iterative methods?
  • Sometimes they are more numerically stable,
particularly when the matrix A is positive semidefinite and diagonally
dominant (this is the case for the linear least-squares problem).
  • When an exact solution is not necessary and we only need to be ‘close enough’.

• Which one to use?
  • Compared with Jacobi iteration, the Gauss-Seidel Method converges faster,
usually by a factor of 2.

34
References
• Schrijver, A. Theory of Linear and Integer Programming. John
Wiley & Sons, 1998.

• Qiao, S. Linear Systems. CAS 708 Lecture Slides, 2005.

35
THE END

36
