2.7. LU Decomposition
2.7.1 Definition: An m×n matrix A is said to have an LU-decomposition if there exist matrices L and U with the following properties: (i) L is an m×m lower triangular matrix with all diagonal entries equal to 1; (ii) U is an m×n matrix in some echelon form; (iii) A = LU.
2.7.2 Advantage of LU-decomposition: Suppose we want to solve an m×n system AX = b. If we can find an LU-decomposition of A, then to solve AX = b it is enough to solve the two triangular systems LY = b and UX = Y.
Thus the system LY = b can be solved by the method of forward substitution and the system UX = Y can be solved by the method of backward substitution. To illustrate, we give some examples.
2.7.3 Example: Consider the given system AX = b, where A = LU. Solving LY = b by forward substitution gives Y, and then solving UX = Y by backward substitution gives the solution X of the original system.
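To make this concrete, here is a minimal Python/NumPy sketch (not part of the original notes) of forward and backward substitution for the square, invertible case; the matrices L, U and the vector b below are illustrative choices, not the ones from the example.

import numpy as np

def forward_substitution(L, b):
    # Solve L y = b, where L is lower triangular with unit diagonal.
    m = len(b)
    y = np.zeros(m)
    for i in range(m):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def backward_substitution(U, y):
    # Solve U x = y, where U is square upper triangular with nonzero diagonal.
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Illustrative data: a 3x3 system A x = b with A = L U.
L = np.array([[1.0, 0, 0], [2, 1, 0], [-3, 1, 1]])
U = np.array([[2.0, 1, 1], [0, 3, -1], [0, 0, 4]])
b = np.array([4.0, 11, 1])

y = forward_substitution(L, b)    # solve L y = b
x = backward_substitution(U, y)   # solve U x = y
print(np.allclose(L @ U @ x, b))  # True: x solves A x = b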
The above discussion raises the following question: which matrices have an LU-decomposition, and how does one find it? The answer is given by the following.
2.7.4 Theorem: Let A be an m×n matrix and let E_1, E_2, ..., E_k be elementary matrices such that E_k ⋯ E_2 E_1 A = U is in row echelon form. If none of the E_i corresponds to the operation of row interchange, then L := E_1⁻¹ E_2⁻¹ ⋯ E_k⁻¹ is a lower triangular matrix with all diagonal entries 1, and A = LU is an LU-decomposition of A.
Proof
2.7.5 Example:
Let A be the given matrix. Reducing A to a row echelon form U by elementary row operations, none of which is a row interchange, and collecting the corresponding multipliers gives a unit lower triangular matrix L with A = LU.
Note that the entries below the main diagonal of L in the pivotal columns are the negatives of the scalars used in the respective column to make all the entries in that column below the pivot zero. For example, in the first column we operated (−2)R1 + R2 and 3R1 + R3; thus, in the first column of L, the second entry is 2 and the third entry is −3.
2.7.6 Example: Let A be the given matrix.
To reduce A to a row echelon form, we first operate on the first column, to make all the entries below its pivot zero, and then on the second column. Next we operate (−3) times the second row added to the third row, to clear the entry below the pivot in the second column, and get U. This U has only two non-zero rows. Thus L is given by replacing the entries below the diagonal in the pivotal columns by the negatives of the multiples used: in column 1 these are the negatives of the two multipliers used there, and in the second column this is 3. Thus A = LU. Let us check this by multiplying L and U.
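The bookkeeping described in examples 2.7.5 and 2.7.6 (store, below the diagonal, the negatives of the scalars used to clear each pivotal column) can be sketched in Python/NumPy as follows. This is an illustrative implementation under the assumption that no row interchanges are needed; it is not code from the notes, and the test matrix A is made up.

import numpy as np

def lu_no_pivot(A):
    # Return L (unit lower triangular) and U with A = L U,
    # assuming the reduction needs no row interchange.
    A = A.astype(float)
    m, n = A.shape
    L = np.eye(m)
    U = A.copy()
    for k in range(min(m, n)):
        if U[k, k] == 0:
            if np.any(U[k + 1:, k] != 0):
                raise ValueError("a row interchange would be needed here")
            continue  # column already zero below the pivot position
        for i in range(k + 1, m):
            mult = U[i, k] / U[k, k]
            U[i, :] -= mult * U[k, :]  # row operation R_i -> R_i + (-mult) R_k
            L[i, k] = mult             # store the negative of the scalar -mult used
    return L, U

A = np.array([[2.0, 1, 1], [4, 5, 1], [-6, 0, 0]])  # illustrative matrix
L, U = lu_no_pivot(A)
print(np.allclose(L @ U, A))  # True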
The method followed in examples 2.7.5 and 2.7.6 can in fact be justified mathematically as follows:
2.7.7 Theorem: Let E be the m×m elementary matrix which corresponds to the elementary row operation of multiplying the ith row by a scalar c and adding it to the jth row, where j > i. Then E is lower triangular with all diagonal entries 1, its inverse corresponds to multiplying the ith row by −c and adding it to the jth row, and any product of such matrices is again lower triangular with all diagonal entries 1.
Proof
Till now we have assumed that no row interchanges were required to transform A to upper triangular form. If such row operations are needed, we can proceed as follows.
2.7.8 Theorem: Let A be an m×n matrix which requires row interchanges, in addition to the other elementary row operations, to transform it to a row echelon form U, and let P be the product of the elementary matrices corresponding to these row interchanges. Then the matrix PA requires no row interchanges to reduce it to row echelon form, and hence can be written as PA = LU. Proof
2.7.9 Corollary: The system AX = b is equivalent to the system PAX = Pb, where PA has an LU-decomposition.
2.7.10 Definition: An m×m matrix which arises from the identity matrix by a finite number of row interchanges is called a permutation matrix.
2.7.11 Example: For a matrix A that needs row interchanges, we first apply the required interchanges, collected in a permutation matrix P, to get PA, and then reduce PA to row echelon form without further interchanges. Hence, PA = LU, and the system AX = b can be solved via PAX = Pb.
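When row interchanges are unavoidable, theorem 2.7.8 says a permutation matrix P can be applied first. As an illustration (assuming SciPy is available; the matrix A is made up), scipy.linalg.lu computes such a factorization. Note SciPy's convention: it returns p, l, u with A = p @ l @ u, so the P of theorem 2.7.8 is p.T.

import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 1, 2],   # illustrative matrix; its (1,1) entry is 0,
              [3, 4, 5],     # so a row interchange is required
              [6, 7, 9]])

p, l, u = lu(A)              # A = p @ l @ u
P = p.T                      # permutation matrix with P @ A = l @ u
print(np.allclose(P @ A, l @ u))  # True: PA = LU as in theorem 2.7.8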
In theorem 8.1.5 we saw that every set of nonzero orthogonal vectors is linearly independent. This motivates our next definition.
8.2.1 Definition: Let V be an inner product space and W a subspace of V. An orthonormal set S contained in W is called an orthonormal basis of W if S is a basis of W. As an immediate application of theorem 8.1.5, we have the following:
8.2.2 Theorem: Let W be any subspace of V. Then W has an orthonormal basis. Proof
8.2.3 Example: For V = R^n with the usual inner product, the set {e_1, ..., e_n}, where e_i is the vector with 1 at the i-th place and 0 elsewhere, is an orthonormal basis of V.
8.2.4 Gram-Schmidt algorithm: Based on the proof of theorem 8.1.5(iv), the procedure for finding an orthonormal basis for an inner product space V is as follows. Let {v_1, v_2, ..., v_n} be a basis of V.
Step 1: Define u_1 = v_1 / ||v_1||.
Step 2: Having defined u_1, ..., u_{k-1}, define w_k = v_k − <v_k, u_1>u_1 − ... − <v_k, u_{k-1}>u_{k-1}.
Step 3: Define u_k = w_k / ||w_k||, and repeat steps 2 and 3 for k = 2, ..., n.
Then {u_1, u_2, ..., u_n} is an orthonormal basis of V.
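A minimal NumPy sketch of these steps for the standard dot product on R^n (illustrative, not the notes' own code; the tolerance tol is an assumption used to detect vectors that become numerically zero) is:

import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    # Orthonormalize vectors of R^n with the standard dot product.
    # Vectors that become (numerically) zero are discarded, so the
    # input need not be linearly independent.
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in basis:
            w = w - np.dot(w, u) * u   # subtract the component along u
        norm = np.linalg.norm(w)
        if norm > tol:                 # keep only genuinely new directions
            basis.append(w / norm)
    return basis

# The basis from example 8.2.5(i) below:
S = [(1, 1, 1), (-1, 0, -1), (-1, 2, 3)]
for u in gram_schmidt(S):
    print(np.round(u, 4))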
8.2.5 Examples: (i) In example 8.1.6, the basis S = {(1, 1, 1), (−1, 0, −1), (−1, 2, 3)} of R^3 gave the orthonormal basis {(1/√3)(1, 1, 1), (1/√6)(−1, 2, −1), (1/√2)(−1, 0, 1)}.
(ii) Consider the vector space V of all real polynomial functions p(x) of degree at most 3, with the inner product on V given by an integral of the product p(x)q(x). Applying the Gram-Schmidt process to the basis S = {1, x, x², x³} of V, and normalizing the resulting orthogonal polynomials, gives an orthonormal basis of V.
(iii) Let W denote the subspace of R^3 spanned by the set of vectors S = {(1, 1, 1), (2, 2, 3), (0, 0, 1), (1, 2, 3)}. Clearly S is not a basis of W, since S has four elements. Thus, to find an orthonormal basis of W, one way is to first select a basis of W from the vectors in S and then apply the Gram-Schmidt process to it. Another, more straightforward method is to apply the Gram-Schmidt process directly to the set of vectors in S and discard those vectors which become zero; a numerical sketch of this is given below.
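For comparison (an aside, not a method from the notes, and assuming SciPy is available), an orthonormal basis of W = [S] can also be obtained numerically from the singular value decomposition via scipy.linalg.orth, which likewise discards dependent directions:

import numpy as np
from scipy.linalg import orth

# Columns are the four vectors of S from (iii).
S = np.array([[1, 2, 0, 1],
              [1, 2, 0, 2],
              [1, 3, 1, 3]], dtype=float)
Q = orth(S)                      # columns form an orthonormal basis of W = [S]
print(Q.shape[1])                # 3 = dim W; one dependent direction was dropped
print(np.allclose(Q.T @ Q, np.eye(Q.shape[1])))  # columns are orthonormal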
8.2.6 Note: Suppose a unit vector X is given and we want an orthonormal basis of R^n containing X. Of course, one method is to first extend {X} to a basis of R^n and then orthonormalize it using the Gram-Schmidt process. An efficient way of organizing this computation is to assemble the resulting vectors as the columns of a matrix U; it is easy to check that the matrix U has the property UᵗU = I, i.e., the columns of U are orthonormal.
8.3. Advantages of an orthonormal basis
Recall that if B = {v_1, ..., v_n} is an ordered basis for a vector space V, then every vector v ∈ V can be assigned unique coordinates, i.e., there exist unique scalars α_1, ..., α_n such that v = α_1 v_1 + ... + α_n v_n.
However, we do not know, in general, how to compute these scalars for a given v. Suppose V is an inner product space and B is an ordered orthonormal basis of V. Then the coordinates of any vector v ∈ V can be computed easily, as explained in the next theorem.
8.3.1 Theorem: Let V be an inner product space with ordered orthonormal basis B = {u_1, ..., u_n}. Then the following hold.
(i) For every v ∈ V, v = <v, u_1>u_1 + ... + <v, u_n>u_n.   (8.1)
(ii) For every u, v ∈ V, <u, v> = <u, u_1><v, u_1> + ... + <u, u_n><v, u_n>.   (8.2)
(iii) For every v ∈ V, ||v||² = <v, u_1>² + ... + <v, u_n>².   (8.3)
Proof
8.3.2 Note: Formula (8.1) tells us how to compute the coordinates of a vector, and formulae (8.2) and (8.3) allow us to compute the inner product and the norm in terms of these coordinates.
8.3.3 Examples: (i) Consider an orthonormal set in R^n with the standard inner product; by formula (8.1), the coordinates of a vector with respect to it are simply its dot products with the elements of the set.
(ii) Consider the space of all real polynomials of degree at most 4, with an inner product defined for p, q by integrating the product pq. For a subspace with a given orthonormal basis, formula (8.1) likewise gives the coordinates of any element with respect to that basis.
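A small numerical illustration of formulas (8.1) and (8.3) for R^3 with the standard inner product (the vector v is chosen here for illustration; the orthonormal basis is the one from example 8.2.5(i)):

import numpy as np

u1 = np.array([1, 1, 1]) / np.sqrt(3)
u2 = np.array([-1, 2, -1]) / np.sqrt(6)
u3 = np.array([-1, 0, 1]) / np.sqrt(2)
B = [u1, u2, u3]

v = np.array([2.0, -1.0, 3.0])                 # an arbitrary vector
coords = np.array([np.dot(v, u) for u in B])   # formula (8.1): <v, u_i>

print(np.allclose(sum(c * u for c, u in zip(coords, B)), v))  # v = sum <v, u_i> u_i
print(np.isclose(np.dot(v, v), np.dot(coords, coords)))       # formula (8.3)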
Next we show that the matrix representation of a linear transformation is easy to compute with respect to an ordered orthonormal basis.
8.3.4 Theorem: Suppose T is a linear transformation on an inner product space V. Let B = {u_1, ..., u_n} be an ordered orthonormal basis of V. Then the (i, j)-th entry of the matrix [T]_B is <T(u_j), u_i>.
Proof
8.3.5 Example: Let us compute the matrix of the orthogonal projection onto the line along a unit vector u. Since the projection sends a vector v to <v, u>u, theorem 8.3.4 shows that, with respect to the standard orthonormal basis, the (i, j)-th entry of this matrix is <u, e_j><u, e_i>, i.e., the matrix is the outer product u uᵗ.
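As a numerical check (the unit vector u below is chosen for illustration), the projection matrix u uᵗ has entries <P(e_j), e_i>, in line with theorem 8.3.4, and satisfies P² = P:

import numpy as np

u = np.array([2.0, -1.0, 2.0])
u = u / np.linalg.norm(u)        # a unit vector (chosen for illustration)

P = np.outer(u, u)               # matrix of the projection onto span{u}

e = np.eye(3)
entries = np.array([[np.dot(P @ e[j], e[i]) for j in range(3)] for i in range(3)])
print(np.allclose(P, entries))   # entries agree with <P(e_j), e_i>
print(np.allclose(P @ P, P))     # a projection satisfies P^2 = P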
10.1. Diagonalizability
Let A be an n×n matrix with entries from F. In 6.4.4 we raised the question: when is A similar to a diagonal matrix? Equivalently, given a linear transformation T : V → V with dim(V) = n, when can we find an ordered basis B of V such that [T]_B is diagonal? Let us make the following definition.
10.1.1 Definition: (i) A matrix A is said to be diagonalizable if A is similar to a diagonal matrix, i.e., there exists an invertible matrix P such that P⁻¹AP is a diagonal matrix. (ii) Let V be a finite dimensional vector space and T : V → V be a linear transformation. We say T is diagonalizable if there exists a basis B of V such that the matrix [T]_B is a diagonal matrix.
We would like to know when a given matrix A is diagonalizable and, if so, how to find P such that P⁻¹AP is diagonal. The next theorem answers this question.
10.1.2 Theorem: Let A be an n×n matrix. If A is diagonalizable, then there exist scalars λ_1, ..., λ_n and linearly independent vectors X_1, ..., X_n in F^n such that AX_i = λ_i X_i for every i = 1, ..., n.
Proof
Theorem 10.1.2 says that if A is diagonalizable, then not only does A have n eigenvalues, it also has a basis consisting of eigenvectors. In fact, the converse of this is also true.
10.1.3 Theorem: (i) Let A be an n×n matrix. If A has n linearly independent eigenvectors, then A is diagonalizable. (ii) Let A be an n×n matrix. If A is diagonalizable and P is any n×n invertible matrix, then B := P⁻¹AP is also diagonalizable. (iii) Let V be a finite dimensional vector space and T : V → V be a linear transformation. If there exists a basis B of V consisting of eigenvectors of T, then T is diagonalizable. Proof
10.1.4 Note: Theorem 10.1.3 not only tells us when A is diagonalizable, it also gives us a matrix P which will diagonalize A, i.e., P⁻¹AP = D, together with the resulting diagonal matrix D. The column vectors of the matrix P are n linearly independent eigenvectors of A, and the diagonal entries of D are the eigenvalues of A corresponding to these n eigenvectors.
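Numerically, the recipe of note 10.1.4 can be tried with NumPy (an illustrative sketch; the matrix A below is made up): numpy.linalg.eig returns the eigenvalues together with a matrix P whose columns are eigenvectors, and P⁻¹AP is then diagonal whenever those eigenvectors are linearly independent.

import numpy as np

A = np.array([[2.0, 0, 0],
              [1, 3, 0],
              [0, 1, 5]])                  # illustrative matrix with distinct eigenvalues

eigvals, P = np.linalg.eig(A)              # columns of P are eigenvectors of A
D = np.linalg.inv(P) @ A @ P               # should be diagonal

print(np.round(eigvals, 6))                # 2, 3, 5 in some order
print(np.allclose(D, np.diag(eigvals)))    # True: P^{-1} A P = D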
10.1.5 Example: Let T be a linear transformation on the space of polynomials of degree at most 1, and let A be its matrix with respect to the standard basis {1, x}. Computing det(A − λI) shows that the eigenvalues of T are λ = 1 and λ = 2. For λ = 1, (A − I)X = 0 has a non-zero solution X, which corresponds to the polynomial −2 + x; thus −2 + x is an eigenvector for λ = 1. Similarly, for λ = 2, solving (A − 2I)X = 0 gives the eigenvector −1 + x. Indeed, T(−2 + x) = −2 + x = 1(−2 + x) + 0(−1 + x) and T(−1 + x) = −2 + 2x = 0(−2 + x) + 2(−1 + x). Thus, with respect to the ordered basis B = {−2 + x, −1 + x} of eigenvectors, [T]_B is the diagonal matrix with diagonal entries 1 and 2.
To apply theorem 10.1.3 to diagonalize a matrix, it is important to know how many eigenvalues the matrix has and the relations between the corresponding eigenvectors.
10.1.6 Theorem: Let A be an n×n matrix.
(i) Let λ be an eigenvalue of A and W_λ := {X ∈ F^n | AX = λX}. Then W_λ is a subspace of F^n, called the eigen subspace for the eigenvalue λ.
(ii) Eigenvectors corresponding to distinct eigenvalues are linearly independent: if λ_1, ..., λ_k are distinct eigenvalues of A with corresponding eigenvectors X_1, ..., X_k, then {X_1, ..., X_k} is a linearly independent set.
Proof
10.1.7 Corollary: Let A be an n×n matrix. If A has n distinct eigenvalues, then A is diagonalizable. Proof
10.1.8 Example: For the given matrix A, the characteristic polynomial det(A − λI) has roots 1, 2 and 3. Thus λ = 1, 2, 3 are the eigenvalues of A; since these are distinct, A is diagonalizable by corollary 10.1.7.
10.1.9 Note: If an n×n matrix A does not have all its eigenvalues distinct, it need not imply that A is not diagonalizable. For example, in 10.1.4(ii), though A had only two distinct eigenvalues, it was diagonalizable, for we were able to find 3 linearly independent eigenvectors. When an n×n matrix fails to have n distinct eigenvalues, this happens because either the characteristic polynomial does not have all its roots in F (which can occur when F = R), or some of the roots of det(A − λI) are repeated. This motivates our next definitions.
10.1.10 Definition:
Let A be an n×n matrix and let λ ∈ F be an eigenvalue of A.
(i) We say a positive integer r ≤ n is the algebraic multiplicity of λ if λ is a root of det(A − tI) of order r.
(ii) The dimension of the eigen subspace W_λ is called the geometric multiplicity of λ.
10.1.11 Example: For the given matrix A, expanding det(A − λI) shows that A has three distinct eigenvalues, λ = 5, 0 and −2. Each has algebraic multiplicity one. In fact, each has geometric multiplicity one as well. Let us verify that. For λ = 5, the system (A − 5I)X = 0 determines X up to a scalar multiple, so dim W_5 = 1. Similarly for λ = 0 and, finally, for λ = −2. Hence, in each case dim W_λ = 1, i.e., each eigenvalue has geometric multiplicity 1.
The notions of algebraic multiplicity, geometric multiplicity and diagonalizability are related as follows:
10.1.12 Theorem: Let A be an n×n matrix with distinct eigenvalues λ_1, ..., λ_k having algebraic multiplicities r_1, ..., r_k and geometric multiplicities d_1, ..., d_k, respectively. Then d_i ≤ r_i for every i, and A is diagonalizable if and only if d_1 + ... + d_k = n, equivalently, if and only if the characteristic polynomial of A has all its roots in F and d_i = r_i for every i.
Proof
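The criterion of theorem 10.1.12 can be checked numerically (a sketch under the assumption that the eigenvalues are real and well separated; the matrix A is illustrative): the geometric multiplicity of λ is n − rank(A − λI).

import numpy as np

A = np.array([[5.0, 0, 0],
              [0, 5, 0],
              [0, 0, 2]])                        # illustrative: eigenvalue 5 repeated

eigvals = np.linalg.eigvals(A)
n = A.shape[0]
for lam in np.unique(np.round(eigvals, 8)):
    alg = int(np.sum(np.isclose(eigvals, lam)))               # algebraic multiplicity
    geo = n - np.linalg.matrix_rank(A - lam * np.eye(n))      # geometric multiplicity
    print(lam, alg, geo)
# A is diagonalizable exactly when geo == alg for every eigenvalue
# and the multiplicities add up to n.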
8.4. Orthogonal decomposition
8.4.1 Definition: Let S be any subset of V and let S^⊥ := {v ∈ V | <v, s> = 0 for every s ∈ S}. The set S^⊥ is called the orthogonal complement of S.
8.4.2 Theorem (Orthogonal complement): Let W be any subspace of V. Then W^⊥ is a subspace of V, and every v ∈ V can be uniquely written as v = w + u for some w ∈ W and u ∈ W^⊥; we write this as V = W ⊕ W^⊥.
Proof
8.4.3 Example: (i) Let V = R^3 with the standard inner product. Let S = {(1, 0, 0), (0, 1, 0)}. Then S^⊥ = {(x, y, z) | (x, y, z)·(1, 0, 0) = 0 = (0, 1, 0)·(x, y, z)} = {(0, 0, z) | z ∈ R}. If we write W = [S] = {(x, y, 0) | x, y ∈ R}, then W^⊥ = {(0, 0, z) | z ∈ R}.
(ii) Let V be the space of all real polynomials of degree at most 3, with an inner product given by an integral of the product of two polynomials. For a subspace W of V spanned by finitely many polynomials, a polynomial p lies in W^⊥ if and only if p is orthogonal to each of the spanning polynomials; writing out these conditions on the coefficients of p gives a system of linear equations, and hence W^⊥ is the solution space of that system.
(iii) Let W be the subspace of R^3 spanned by the vectors (2, 5, −1) and (−2, 1, 1). These vectors are orthogonal, so an orthonormal basis of W is given by {(1/√30)(2, 5, −1), (1/√6)(−2, 1, 1)}. The orthogonal projection of a vector v ∈ R^3 onto W is then given by <v, u_1>u_1 + <v, u_2>u_2, where u_1, u_2 are these two basis vectors. Thus v decomposes as this projection plus a vector of W^⊥.
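A NumPy check of example (iii) (the vector v below is chosen here for illustration): the projection of v onto W plus the remainder reproduces v, and the remainder is orthogonal to W.

import numpy as np

w1 = np.array([2.0, 5.0, -1.0])
w2 = np.array([-2.0, 1.0, 1.0])
u1, u2 = w1 / np.linalg.norm(w1), w2 / np.linalg.norm(w2)   # orthonormal basis of W

v = np.array([1.0, 2.0, 3.0])                  # an arbitrary vector (chosen here)
w = np.dot(v, u1) * u1 + np.dot(v, u2) * u2    # projection of v onto W
u = v - w                                      # component in W-perp

print(np.allclose(v, w + u))                                       # v = w + u
print(np.isclose(np.dot(u, w1), 0), np.isclose(np.dot(u, w2), 0))  # u is orthogonal to W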
Let V be an inner product space and let W be a subspace of V. Theorem 8.4.2 tells us that V = W ⊕ W^⊥, i.e., the space V decomposes as a sum of two orthogonal subspaces. Decompositions as given by theorem 8.4.2 are very useful in understanding linear transformations on inner product spaces (see proposition 9.3.2 and theorem 10.2.4). This idea can be extended as follows:
8.4.4 Definition: Suppose W_1, W_2, ..., W_k are subspaces of V such that every v ∈ V can be written uniquely as v = w_1 + w_2 + ... + w_k with w_i ∈ W_i, and <w_i, w_j> = 0 whenever i ≠ j. In that case, we say V is an orthogonal sum of the subspaces W_1, ..., W_k, and we write it as V = W_1 ⊕ W_2 ⊕ ... ⊕ W_k. The following is easy to prove:
Proof
8.4.6 Theorem (Best approximation): Let W be any subspace of V and let v ∈ V. Then there exists a unique w_0 ∈ W such that ||v − w_0|| ≤ ||v − w|| for every w ∈ W; moreover, w_0 is precisely the component of v in W in the decomposition of theorem 8.4.2, i.e., the orthogonal projection of v onto W.
Proof
8.4.7 Example: (i) Let V = R^3 and W be the subspace of V generated by {(2, −1, −2), (1, 0, 1)}. These two vectors are orthogonal, so the best approximator for u = (2, 1, 3) from W is given by the projection w_0 = (<u, w_1>/||w_1||²) w_1 + (<u, w_2>/||w_2||²) w_2 = (−1/3)(2, −1, −2) + (5/2)(1, 0, 1) = (11/6, 1/3, 19/6).
(ii) Consider a space V of polynomials with an inner product under which the polynomials 1 and t are orthogonal. Let W denote the subspace of V spanned by {1, t}. Since <1, t> = 0, {1, t} forms an orthogonal basis of W, and the best approximation of a given p ∈ V from W is obtained from the same projection formula as in (i), namely (<p, 1>/||1||²)·1 + (<p, t>/||t||²)·t.
8.5. Least square approximations
To show that theorem 8.4.2 is not only of theoretical interest but has practical applications as well, let us consider the following problem. An experimenter collects data by taking measurements y_1, y_2, ..., y_m at time points t_1, t_2, ..., t_m, respectively. Geometrically, this gives the m points (t_1, y_1), ..., (t_m, y_m) in the plane.
Because of the distribution of these points, or otherwise, the experimenter feels that there exists a linear relation between the time variable t and the observation variable y, say y = ct + d. The problem is to find c and d such that the line y = ct + d best fits the experimental data. Equivalently, one wants to minimize the error E = (y_1 − ct_1 − d)² + ... + (y_m − ct_m − d)².
The error will be zero, i.e., the line will be an exact fit, if and only if y_i = ct_i + d for every i = 1, ..., m. We can reformulate the problem as follows: let A be the m×2 matrix whose i-th row is (t_i, 1), let X = (c, d)ᵗ and let Y = (y_1, ..., y_m)ᵗ.
Then E = ||Y − AX||². Thus, given the matrix A and the vector Y, the line ct + d will be an exact fit (error zero) if and only if AX = Y has a solution. This leads to the problem of solving the system AX = Y. However, in general, the number of equations greatly exceeds the number of variables and it is unrealistic to expect a solution of AX = Y. So, one would like to find X_0 such that it gives the least error, i.e.,
||Y − AX_0|| ≤ ||Y − AX|| for every X.   (8.4)
This leads to the following definition:
8.5.1 Definition: A vector X_0 satisfying (8.4) is called a least square solution of the system AX = Y.
8.5.2 Theorem: The set of least square solutions of AX = Y is the same as the set of solutions of the system AᵗAX = AᵗY (the normal equations).
8.5.3 Note: For a matrix A and a given vector Y, let X_0 be a least square solution of AX = Y. Then AX_0 is the projection of the vector Y onto the column space of A. A least square solution need not be unique; however, if X_0 and X_1 are both least square solutions of AX = Y, then AX_0 = AX_1.
8.5.4 Corollary: The system AX = Y has a unique least square solution if and only if AᵗA is invertible, and in that case the least square solution is given by X_0 = (AᵗA)⁻¹AᵗY.
Proof
The above corollary tells us that the least square approximation can be computed if AᵗA is invertible. Necessary and sufficient conditions for the invertibility of AᵗA are given below.
8.5.5 Example: For the given A and Y, it is easy to check that the system AX = Y is inconsistent. To find a least square approximation, we solve the normal equations AᵗAX = AᵗY. Here AᵗA turns out to be invertible; hence the system has a unique least square approximation, given by the solution of the normal equations.
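A small worked sketch of the normal equations for the line-fitting problem (the data points below are made up for illustration; NumPy assumed):

import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0])      # time points (illustrative data)
y = np.array([1.1, 1.9, 3.2, 3.9])      # observations

A = np.column_stack([t, np.ones_like(t)])   # rows (t_i, 1); unknowns X = (c, d)
X0 = np.linalg.solve(A.T @ A, A.T @ y)      # solve the normal equations A^t A X = A^t Y

c, d = X0
print(c, d)                                  # slope and intercept of the best-fit line
print(np.allclose(X0, np.linalg.lstsq(A, y, rcond=None)[0]))  # agrees with lstsq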
8.5.6 Theorem: For any m×n matrix A, the matrix AᵗA is invertible if and only if rank(A) = n.
Proof
In the particular situation when the columns of A are linearly independent, it is possible to compute the least square approximation by another method, based on the following decomposition.
8.5.8 Definition: A matrix A is said to have a QR decomposition if there exist matrices Q and R with the following properties: (i) Q is a matrix whose columns form an orthonormal basis for the column space of A. (ii) R is an upper triangular invertible matrix with positive diagonal entries. (iii) A = QR.
8.5.9 Theorem: Let A be an m×n matrix and Y ∈ R^m. If A admits a QR decomposition A = QR, then the least square approximation X_0 for AX = Y is given by X_0 = R⁻¹QᵗY.
Proof
8.5.10 Note: The above theorem tells us that if A = QR, then to find the least square approximation we do not have to compute (AᵗA)⁻¹, as suggested by corollary 8.5.4; instead one can compute R⁻¹, which is easier since R is upper triangular (equivalently, one can solve RX = QᵗY by back substitution), and use the above theorem.
8.5.11 Example: Let A = QR, where Q has orthonormal columns and R is invertible and upper triangular. Then, for the given Y, the system AX = Y has least square approximation given by X_0 = R⁻¹QᵗY.
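Using theorem 8.5.9 with NumPy's QR routine (an illustrative sketch: numpy.linalg.qr in its default "reduced" mode returns Q with orthonormal columns and an upper triangular R, though it does not enforce positive diagonal entries in R, so signs may differ from definition 8.5.8; A and Y are the made-up line-fit data from before):

import numpy as np

A = np.column_stack([np.array([0.0, 1, 2, 3]), np.ones(4)])
Y = np.array([1.1, 1.9, 3.2, 3.9])

Q, R = np.linalg.qr(A)               # reduced QR: A = Q R, columns of Q orthonormal
X0 = np.linalg.solve(R, Q.T @ Y)     # least square solution R^{-1} Q^t Y

print(X0)
print(np.allclose(A.T @ A @ X0, A.T @ Y))   # X0 satisfies the normal equations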
The existence and uniqueness of the QR decomposition of a matrix are given by the next theorem.
8.5.12 Theorem: Let A be a matrix with linearly independent columns. Then A admits a QR decomposition. Further, such a decomposition is unique. Proof
8.5.13 Example: For the given matrix A, it is easy to verify that the columns of A are linearly independent. Application of the Gram-Schmidt process to the column vectors of A gives an orthonormal basis for the column space; these vectors form the columns of Q, and R is obtained from the inner products of the columns of A with them, so that A = QR. We verify by multiplying that QR is indeed equal to A.
8.5.14 Note: The method of least square approximation applies equally well to fitting a polynomial to given data.
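For instance (an illustrative sketch, not from the notes; the data are made up), fitting a quadratic y = a t² + b t + c by least squares just replaces the two-column matrix A of the line fit by a three-column one:

import numpy as np

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 0.4, 1.1, 3.2, 7.1])           # made-up measurements

A = np.column_stack([t**2, t, np.ones_like(t)])   # columns t_i^2, t_i, 1
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)    # least square solution for (a, b, c)
print(coeffs)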