
396 Chapter 7. Applied Mathematics and $A^T A$

7.2 Positive Definite Matrices and the SVD


This chapter about applications of $A^T A$ depends on two important ideas in linear algebra.
These ideas have big parts to play, and we focus on them now.

1. Positive definite symmetric matrices (both $A^T A$ and $A^T CA$ are positive definite)

2. Singular Value Decomposition ($A = U\Sigma V^T$ gives perfect bases for the 4 subspaces)

Those are orthogonal matrices $U$ and $V$ in the SVD. Their columns are orthonormal
eigenvectors of $AA^T$ and $A^T A$. The entries in the diagonal matrix $\Sigma$ are the square
roots of the eigenvalues. The matrices $AA^T$ and $A^T A$ have the same nonzero eigenvalues.
Section 6.5 showed that the eigenvectors of these symmetric matrices are orthogonal.
I will show now that the eigenvalues of $A^T A$ are positive, if $A$ has independent columns.

Start with $A^T Ax = \lambda x$. Then $x^T A^T Ax = \lambda x^T x$. Therefore $\lambda = \|Ax\|^2 / \|x\|^2 > 0$.

I separated $x^T A^T Ax$ into $(Ax)^T(Ax) = \|Ax\|^2$. We don't have $\lambda = 0$ because $A^T A$ is
invertible (since $A$ has independent columns). The eigenvalues $\lambda$ must be positive.
Those are the key steps to understanding positive definite matrices. They give us three
tests on $S$, three ways to recognize when a symmetric matrix $S$ is positive definite:

Positive definite symmetric matrix $S$:

1. All the eigenvalues of $S$ are positive.
2. The energy $x^T S x$ is positive for all nonzero vectors $x$.
3. $S$ has the form $S = A^T A$ with independent columns in $A$.

There is also a test on the pivots (all $> 0$) and a test on $n$ determinants (all $> 0$).
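These tests are easy to try numerically. A minimal sketch with NumPy (the matrix `S` and the helper names are my own, not from the text): test 1 by eigenvalues, test 2 by sampling the energy, test 3 by attempting a Cholesky factorization, which succeeds exactly for symmetric positive definite matrices.

```python
import numpy as np

# Illustrative matrix, chosen to be positive definite (eigenvalues 1 and 9)
S = np.array([[5.0, 4.0], [4.0, 5.0]])

# Test 1: all eigenvalues positive (eigvalsh assumes a symmetric matrix)
eigenvalues = np.linalg.eigvalsh(S)
test1 = bool(np.all(eigenvalues > 0))

# Test 2: energy x^T S x > 0 for a sample of nonzero vectors x
rng = np.random.default_rng(0)
xs = rng.standard_normal((100, 2))
test2 = bool(np.all(np.einsum('ij,jk,ik->i', xs, S, xs) > 0))

# Test 3: S = A^T A with independent columns -- Cholesky succeeds
# exactly when S is symmetric positive definite (then A = L^T works)
try:
    L = np.linalg.cholesky(S)   # S = L L^T
    test3 = True
except np.linalg.LinAlgError:
    test3 = False

print(test1, test2, test3)      # all three tests agree
```

Sampling the energy is of course not a proof; the eigenvalue and Cholesky tests are the decisive ones.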
Example 1 Are these matrices positive definite? When their eigenvalues are positive,
construct matrices $A$ with $S = A^T A$ and find the positive energy $x^T S x$.

$$\text{(a)}\ S = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{(b)}\ S = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \qquad \text{(c)}\ S = \begin{bmatrix} 4 & 5 \\ 5 & 4 \end{bmatrix}$$

Solution The answers are yes, yes, and no. The eigenvalues of those matrices $S$ are

(a) 4 and 1: positive   (b) 9 and 1: positive   (c) 9 and $-1$: not positive.

A quicker test than eigenvalues uses two determinants: the 1 by 1 determinant $S_{11}$ and
the 2 by 2 determinant of $S$. Example (b) has $S_{11} = 5$ and $\det S = 25 - 16 = 9$ (pass).
Example (c) has $S_{11} = 4$ but $\det S = 16 - 25 = -9$ (fail the test).
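The determinant shortcut is one line of code. A sketch (the function name is my own) applying the 1 by 1 and 2 by 2 determinant tests to the three matrices of Example 1:

```python
import numpy as np

Sa = np.array([[4.0, 0.0], [0.0, 1.0]])
Sb = np.array([[5.0, 4.0], [4.0, 5.0]])
Sc = np.array([[4.0, 5.0], [5.0, 4.0]])

def passes_determinant_test(S):
    # For a symmetric 2 by 2 matrix: S[0,0] > 0 and det(S) > 0
    return bool(S[0, 0] > 0 and np.linalg.det(S) > 0)

print(passes_determinant_test(Sa))  # True
print(passes_determinant_test(Sb))  # True: det = 25 - 16 = 9
print(passes_determinant_test(Sc))  # False: det = 16 - 25 = -9
```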

Positive energy is equivalent to positive eigenvalues, when $S$ is symmetric. Let me
test the energy $x^T S x$ in all three examples. Two examples pass and the third fails:

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 + x_2^2 > 0 \qquad \text{Positive energy when } x \neq 0$$

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 5x_1^2 + 8x_1 x_2 + 5x_2^2 \qquad \text{Positive energy when } x \neq 0$$

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 4 & 5 \\ 5 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 4x_1^2 + 10x_1 x_2 + 4x_2^2 \qquad \text{Energy } -2 \text{ when } x = (1, -1)$$

Positive energy is a fundamental property. This is the best definition of positive definiteness.
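A quick numerical check of the three energies above, using the vector $x = (1, -1)$ that exposes matrix (c):

```python
import numpy as np

Sa = np.array([[4.0, 0.0], [0.0, 1.0]])
Sb = np.array([[5.0, 4.0], [4.0, 5.0]])
Sc = np.array([[4.0, 5.0], [5.0, 4.0]])

# The energy x^T S x is a plain quadratic form: x @ S @ x in NumPy
x = np.array([1.0, -1.0])
for name, S in [("a", Sa), ("b", Sb), ("c", Sc)]:
    print(name, x @ S @ x)   # a: 5.0, b: 2.0, c: -2.0
```

The negative energy $-2$ for matrix (c) confirms that it is not positive definite, exactly as the eigenvalue and determinant tests said.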

When the eigenvalues are positive, there will be many matrices $A$ that give $A^T A = S$.
One choice of $A$ is symmetric and positive definite! Then $A^T A$ is $A^2$, and this choice
$A = \sqrt{S}$ is a true square root of $S$. The successful examples (a) and (b) have $S = A^2$:

$$\begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$

We know that all symmetric matrices have the form $S = V\Lambda V^T$ with orthonormal
eigenvectors in $V$. The diagonal matrix $\Lambda$ has a square root $\sqrt{\Lambda}$, when all eigenvalues are
positive. In this case $A = \sqrt{S} = V\sqrt{\Lambda}\,V^T$ is the symmetric positive definite square root:

$$A^T A = \sqrt{S}\sqrt{S} = (V\sqrt{\Lambda}\,V^T)(V\sqrt{\Lambda}\,V^T) = V\Lambda V^T = S \quad \text{because } V^T V = I.$$

Starting from this unique square root $\sqrt{S}$, other choices of $A$ come easily. Multiply $\sqrt{S}$
by any matrix $Q$ that has orthonormal columns (so that $Q^T Q = I$). Then $Q\sqrt{S}$ is another
choice for $A$ (not a symmetric choice). In fact all choices come this way:

$$A^T A = (Q\sqrt{S})^T(Q\sqrt{S}) = \sqrt{S}\,Q^T Q\,\sqrt{S} = S. \tag{1}$$
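The square-root construction can be checked numerically. A sketch, assuming the eigendecomposition route through `numpy.linalg.eigh` (which returns the eigenvalues and orthonormal eigenvectors of a symmetric matrix):

```python
import numpy as np

# Symmetric square root sqrt(S) = V sqrt(Lambda) V^T, then Q sqrt(S)
# as another (non-symmetric) choice of A with A^T A = S
S = np.array([[5.0, 4.0], [4.0, 5.0]])

lam, V = np.linalg.eigh(S)               # S = V diag(lam) V^T
sqrtS = V @ np.diag(np.sqrt(lam)) @ V.T  # symmetric positive definite root

Q = np.array([[0.0, 1.0], [1.0, 0.0]])   # any Q with Q^T Q = I works
A = Q @ sqrtS

print(np.allclose(sqrtS @ sqrtS, S))     # True: a true square root
print(np.allclose(A.T @ A, S))           # True: equation (1)
```

For this $S$ the symmetric root comes out as $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, matching example (b) above.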

I will choose a particular $Q$ in Example 1, to get particular choices of $A$.

Example 1 (continued) Choose $Q = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ to multiply $\sqrt{S}$. Then $A = Q\sqrt{S}$.

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix} \quad \text{has} \quad S = A^T A = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}$$

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{has} \quad S = A^T A = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}.$$

Positive Semidefinite Matrices


Positive semidefinite matrices include positive definite matrices, and more. Eigenvalues of
$S$ can be zero. Columns of $A$ can be dependent. The energy $x^T S x$ can be zero, but not
negative. This gives new equivalent conditions on a (possibly singular) matrix $S = S^T$.

1′ All eigenvalues of $S$ satisfy $\lambda \geq 0$ (semidefinite allows zero eigenvalues).

2′ The energy is nonnegative for every $x$: $x^T S x \geq 0$ (zero energy is allowed).

3′ $S$ has the form $A^T A$ (every $A$ is allowed; its columns can be dependent).
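A small sketch of the semidefinite test by eigenvalues. The tolerance is my own addition to absorb floating-point round-off, and matrix (f) is written with the minus signs that make it negative semidefinite:

```python
import numpy as np

def is_positive_semidefinite(S, tol=1e-10):
    # Semidefinite allows zero eigenvalues; tol guards against round-off
    return bool(np.all(np.linalg.eigvalsh(S) >= -tol))

Sd = np.array([[0.0, 0.0], [0.0, 1.0]])
Se = np.array([[4.0, 4.0], [4.0, 4.0]])
Sf = np.array([[-4.0, 4.0], [4.0, -4.0]])

print(is_positive_semidefinite(Sd))  # True  (eigenvalues 1, 0)
print(is_positive_semidefinite(Se))  # True  (eigenvalues 8, 0)
print(is_positive_semidefinite(Sf))  # False (eigenvalues 0, -8)
```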

Example 2 The first two matrices are singular and positive semidefinite, but not the
third:

$$\text{(d)}\ S = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{(e)}\ S = \begin{bmatrix} 4 & 4 \\ 4 & 4 \end{bmatrix} \qquad \text{(f)}\ S = \begin{bmatrix} -4 & 4 \\ 4 & -4 \end{bmatrix}.$$

The eigenvalues are $1, 0$ and $8, 0$ and $-8, 0$. The energies $x^T S x$ are $x_2^2$ and $4(x_1 + x_2)^2$
and $-4(x_1 - x_2)^2$. So the third matrix is actually negative semidefinite.

Singular Value Decomposition


Now we start with $A$, square or rectangular. Applications also start this way: the matrix
comes from the model. The SVD splits any matrix into orthogonal $U$ times diagonal $\Sigma$
times orthogonal $V^T$. Those orthogonal factors will give orthogonal bases for the four
fundamental subspaces associated with $A$.

Let me describe the goal for any $m$ by $n$ matrix, and then how to achieve that goal.
Find orthonormal bases $v_1, \ldots, v_n$ for $\mathbf{R}^n$ and $u_1, \ldots, u_m$ for $\mathbf{R}^m$ so that

$$Av_1 = \sigma_1 u_1 \quad \ldots \quad Av_r = \sigma_r u_r \qquad Av_{r+1} = 0 \quad \ldots \quad Av_n = 0 \tag{2}$$

The rank of $A$ is $r$. Those requirements in (2) are expressed by a multiplication $AV = U\Sigma$.
The $r$ nonzero singular values $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_r > 0$ are on the diagonal of $\Sigma$:

$$AV = U\Sigma \qquad A \begin{bmatrix} v_1 & \ldots & v_r & \ldots & v_n \end{bmatrix} = \begin{bmatrix} u_1 & \ldots & u_r & \ldots & u_m \end{bmatrix} \begin{bmatrix} \sigma_1 & & & \\ & \ddots & & \\ & & \sigma_r & \\ & & & 0 \end{bmatrix} \tag{3}$$

The last $n - r$ vectors in $V$ are a basis for the nullspace of $A$. The last $m - r$ vectors in $U$
are a basis for the nullspace of $A^T$. The diagonal matrix $\Sigma$ is $m$ by $n$, with $r$ nonzeros.
Remember that $V^{-1} = V^T$, because the columns $v_1, \ldots, v_n$ are orthonormal in $\mathbf{R}^n$:

$$\text{Singular Value Decomposition} \qquad AV = U\Sigma \text{ becomes } A = U\Sigma V^T. \tag{4}$$
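NumPy computes this factorization directly. A sketch using the matrix from Example 3 below (assuming the minus sign in its second row); note that `numpy.linalg.svd` returns $V^T$, not $V$:

```python
import numpy as np

A = np.array([[2.0, 2.0, 0.0], [1.0, -1.0, 0.0]])

# Full SVD: U is 2x2, sigma holds the singular values, VT is 3x3
U, sigma, VT = np.linalg.svd(A)
Sigma = np.zeros_like(A)
Sigma[:2, :2] = np.diag(sigma)          # Sigma has the same shape as A

print(np.allclose(A, U @ Sigma @ VT))   # True: A = U Sigma V^T
print(sigma)                            # sqrt(8) and sqrt(2)
```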


The SVD has orthogonal matrices $U$ and $V$, containing eigenvectors of $AA^T$ and $A^T A$.

Comment. A square matrix is diagonalized by its eigenvectors: $Ax_i = \lambda_i x_i$ is like
$Av_i = \sigma_i u_i$. But even if $A$ has $n$ eigenvectors, they may not be orthogonal. We need two
bases: an input basis of $v$'s in $\mathbf{R}^n$ and an output basis of $u$'s in $\mathbf{R}^m$. With two bases, any
$m$ by $n$ matrix can be diagonalized. The beauty of those bases is that they can be chosen
orthonormal. Then $U^T U = I$ and $V^T V = I$.
The $v$'s are eigenvectors of the symmetric matrix $S = A^T A$. We can guarantee their
orthogonality, so that $v_j^T v_i = 0$ for $j \neq i$. That matrix $S$ is positive semidefinite, so its
eigenvalues are $\sigma_i^2 \geq 0$. The key to the SVD is that $Av_j$ is orthogonal to $Av_i$:

$$\text{Orthogonal } u\text{'s} \qquad (Av_j)^T(Av_i) = v_j^T(A^T A v_i) = v_j^T(\sigma_i^2 v_i) = \begin{cases} \sigma_i^2 & \text{if } j = i \\ 0 & \text{if } j \neq i \end{cases} \tag{5}$$

This says that the vectors $u_i = Av_i/\sigma_i$ are orthonormal for $i = 1, \ldots, r$. They are a basis
for the column space of $A$. And the $u$'s are eigenvectors of the symmetric matrix $AA^T$,
which is usually different from $S = A^T A$ (but the eigenvalues $\sigma_1^2, \ldots, \sigma_r^2$ are the same).
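The construction in this paragraph ($v$'s from $A^T A$, then $u_i = Av_i/\sigma_i$) can be carried out step by step. A sketch, with the rank $r = 2$ assumed known for this matrix:

```python
import numpy as np

A = np.array([[2.0, 2.0, 0.0], [1.0, -1.0, 0.0]])

# Eigenvectors of S = A^T A give the v's; eigh returns ascending eigenvalues
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]           # sort so sigma_1 >= sigma_2 >= ...
lam, V = lam[order], V[:, order]

r = 2                                   # rank of A (assumed known here)
sigmas = np.sqrt(lam[:r])
U = (A @ V[:, :r]) / sigmas             # u_i = A v_i / sigma_i, column by column

print(np.allclose(U.T @ U, np.eye(r)))  # True: the u's are orthonormal
print(sigmas)                           # sqrt(8) and sqrt(2)
```

The orthonormality of the computed $u$'s is exactly the content of equation (5).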

Example 3 Find the input and output eigenvectors $v$ and $u$ for the rectangular matrix $A$:

$$A = \begin{bmatrix} 2 & 2 & 0 \\ 1 & -1 & 0 \end{bmatrix} = U\Sigma V^T.$$

Solution Compute $S = A^T A$ and its unit eigenvectors $v_1, v_2, v_3$. The eigenvalues $\sigma^2$
are $8, 2, 0$ so the positive singular values are $\sigma_1 = \sqrt{8}$ and $\sigma_2 = \sqrt{2}$:

$$A^T A = \begin{bmatrix} 5 & 3 & 0 \\ 3 & 5 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad \text{has} \quad v_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad v_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

The outputs $u_1 = Av_1/\sigma_1$ and $u_2 = Av_2/\sigma_2$ are also orthonormal, with $\sigma_1 = \sqrt{8}$ and
$\sigma_2 = \sqrt{2}$. Those vectors $u_1$ and $u_2$ are in the column space of $A$:

$$u_1 = \begin{bmatrix} 2 & 2 & 0 \\ 1 & -1 & 0 \end{bmatrix} \frac{v_1}{\sqrt{8}} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad u_2 = \begin{bmatrix} 2 & 2 & 0 \\ 1 & -1 & 0 \end{bmatrix} \frac{v_2}{\sqrt{2}} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

Then $U = I$ and the Singular Value Decomposition for this 2 by 3 matrix is $U\Sigma V^T$:

$$A = \begin{bmatrix} 2 & 2 & 0 \\ 1 & -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{8} & 0 & 0 \\ 0 & \sqrt{2} & 0 \end{bmatrix} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 & 0 \\ 1 & -1 & 0 \\ 0 & 0 & \sqrt{2} \end{bmatrix}^T.$$

The Fundamental Theorem of Linear Algebra

I think of the SVD as the final step in the Fundamental Theorem. First come the dimensions
of the four subspaces in Figure 7.3. Then comes the orthogonality of those pairs of
subspaces. Now come the orthonormal bases of $v$'s and $u$'s that diagonalize $A$:


$$\text{SVD} \qquad \begin{aligned} Av_j &= \sigma_j u_j \ \text{for}\ j \leq r & \qquad A^T u_j &= \sigma_j v_j \ \text{for}\ j \leq r \\ Av_j &= 0 \ \text{for}\ j > r & \qquad A^T u_j &= 0 \ \text{for}\ j > r \end{aligned}$$

Multiplying $Av_j = \sigma_j u_j$ by $A^T$ and dividing by $\sigma_j$ gives that equation $A^T u_j = \sigma_j v_j$.
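Both relations can be verified on any matrix. A sketch on a small random matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))

# Reduced SVD: U is 3x2, V is 2x2 for this full-rank 3x2 matrix
U, sigma, VT = np.linalg.svd(A, full_matrices=False)
V = VT.T

for j in range(len(sigma)):
    assert np.allclose(A @ V[:, j], sigma[j] * U[:, j])    # A v_j = sigma_j u_j
    assert np.allclose(A.T @ U[:, j], sigma[j] * V[:, j])  # A^T u_j = sigma_j v_j
print("both relations hold")
```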

[Figure 7.3 shows the row space of $A$ (dimension $r$) and the nullspace of $A$ (dimension $n - r$) in $\mathbf{R}^n$, and the column space of $A$ (dimension $r$) and the nullspace of $A^T$ (dimension $m - r$) in $\mathbf{R}^m$. The basis vectors obey $Av_1 = \sigma_1 u_1, \ldots, Av_r = \sigma_r u_r$ and $Av_{r+1} = \cdots = Av_n = 0$.]

Figure 7.3: Orthonormal bases of $v$'s and $u$'s that diagonalize $A$: $m$ by $n$ with rank $r$.

The norm of $A$ is its largest singular value: $\|A\| = \sigma_1$. This measures the largest
possible ratio of $\|Av\|$ to $\|v\|$. That ratio of lengths is a maximum when $v = v_1$ and
$Av = \sigma_1 u_1$. This singular value $\sigma_1$ is a much better measure for the size of a matrix than
the largest eigenvalue. An extreme case can have zero eigenvalues and just one eigenvector
$(1, 1)$ for $A$. But $Av$ can still be large: if $v = (1, -1)$ then $Av$ is 200 times larger.

$$A = \begin{bmatrix} 100 & -100 \\ 100 & -100 \end{bmatrix} \quad \text{has} \quad \lambda_{\max} = 0 \quad \text{but} \quad \sigma_{\max} = \text{norm of } A = 200. \tag{6}$$
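A sketch of this extreme case, with the minus signs restored so that both eigenvalues are zero while the norm is 200:

```python
import numpy as np

A = np.array([[100.0, -100.0], [100.0, -100.0]])

eigenvalues = np.linalg.eigvals(A)   # both eigenvalues are 0
norm = np.linalg.norm(A, 2)          # largest singular value

print(np.max(np.abs(eigenvalues)))   # 0 (up to round-off)
print(norm)                          # 200.0
```

The eigenvalues miss the size of $A$ completely; the singular value does not.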

The Condition Number


A valuable property of $A = U\Sigma V^T$ is that it puts the pieces of $A$ in order of importance.
Multiplying a column $u_i$ times a row $\sigma_i v_i^T$ produces one piece of the matrix. There will be
$r$ nonzero pieces from $r$ nonzero $\sigma$'s, when $A$ has rank $r$. The pieces add up to $A$, when
we multiply columns of $U$ times rows of $\Sigma V^T$:

$$\text{The pieces have rank 1} \qquad A = \begin{bmatrix} u_1 & \ldots & u_r \end{bmatrix} \begin{bmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_r v_r^T \end{bmatrix} = u_1(\sigma_1 v_1^T) + \cdots + u_r(\sigma_r v_r^T). \tag{7}$$
The first piece gives the norm of $A$, which is $\sigma_1$. The last piece gives the norm of $A^{-1}$,
which is $1/\sigma_n$ when $A$ is invertible. The condition number is $\sigma_1$ times $1/\sigma_n$:

$$\text{Condition number of } A \qquad c(A) = \|A\|\,\|A^{-1}\| = \frac{\sigma_1}{\sigma_n}. \tag{8}$$
This number $c(A)$ is the key to numerical stability in solving $Av = b$. When $A$ is an
orthogonal matrix, the symmetric $S = A^T A$ is the identity matrix. So all singular values of
an orthogonal matrix are $\sigma = 1$. At the other extreme, a singular matrix has $\sigma_n = 0$.
In that case $c = \infty$. Orthogonal matrices have the best condition number $c = 1$.
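Equations (7) and (8) are easy to confirm numerically. A sketch on a random invertible matrix; `numpy.linalg.cond` uses the 2-norm ratio $\sigma_1/\sigma_n$ by default:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

U, sigma, VT = np.linalg.svd(A)

# Equation (7): A is the sum of its rank-one pieces sigma_i u_i v_i^T
pieces = [sigma[i] * np.outer(U[:, i], VT[i]) for i in range(len(sigma))]
print(np.allclose(A, sum(pieces)))                          # True

# Equation (8): the condition number is sigma_1 / sigma_n
print(np.isclose(np.linalg.cond(A), sigma[0] / sigma[-1]))  # True
```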

Data Matrices : Application of the SVD


Big data is the linear algebra problem of this century (and we won't solve it here).
Sensors and scanners and imaging devices produce enormous volumes of information.
Making decisive sense of that data is the problem for a world of analysts (mathematicians
and statisticians of a new type). Most often the data comes in the form of a matrix.

The usual approach is by PCA, Principal Component Analysis. That is essentially
the SVD. The first piece $\sigma_1 u_1 v_1^T$ holds the most information (in statistics this piece has
the greatest variance). It tells us the most. The Chapter 7 Notes include references.
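A hedged sketch of PCA via the SVD on synthetic data (the data and its variances are illustrative only): center the columns, then the first right singular vector is the direction of greatest variance.

```python
import numpy as np

# Synthetic 200x2 data matrix, stretched so one direction dominates
rng = np.random.default_rng(3)
data = rng.standard_normal((200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

# PCA: center the data, then take the SVD of the centered matrix
centered = data - data.mean(axis=0)
U, sigma, VT = np.linalg.svd(centered, full_matrices=False)

# sigma[0]**2 / (n - 1) is the variance along the first principal direction
print(VT[0])                 # first principal component (a unit vector)
print(sigma[0] > sigma[1])   # True: the first piece holds the most variance
```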

REVIEW OF THE KEY IDEAS

1. Positive definite symmetric matrices have positive eigenvalues and pivots and energy.

2. $S = A^T A$ is positive definite if and only if $A$ has independent columns.

3. $x^T A^T Ax = (Ax)^T(Ax)$ is zero when $Ax = 0$. $A^T A$ can be positive semidefinite.

4. The SVD is a factorization $A = U\Sigma V^T$ = (orthogonal)(diagonal)(orthogonal).

5. The columns of $V$ and $U$ are eigenvectors of $A^T A$ and $AA^T$ (singular vectors of $A$).

6. Those orthonormal bases achieve $Av_i = \sigma_i u_i$ and $A$ is diagonalized.

7. The largest piece of $A = \sigma_1 u_1 v_1^T + \cdots + \sigma_r u_r v_r^T$ gives the norm $\|A\| = \sigma_1$.

Problem Set 7.2


1 For a 2 by 2 matrix, suppose the 1 by 1 and 2 by 2 determinants $a$ and $ac - b^2$ are
positive. Then $c > b^2/a$ is also positive.

(i) $\lambda_1$ and $\lambda_2$ have the same sign because their product $\lambda_1 \lambda_2$ equals ___.

(ii) That sign is positive because $\lambda_1 + \lambda_2$ equals ___.

Conclusion: The tests $a > 0$, $ac - b^2 > 0$ guarantee positive eigenvalues $\lambda_1, \lambda_2$.
2 Which of $S_1$, $S_2$, $S_3$, $S_4$ has two positive eigenvalues? Use $a$ and $ac - b^2$, don't
compute the $\lambda$'s. Find an $x$ with $x^T S_1 x < 0$, confirming that $S_1$ fails the test.

$$S_1 = \begin{bmatrix} 5 & 6 \\ 6 & 7 \end{bmatrix} \quad S_2 = \begin{bmatrix} -1 & -2 \\ -2 & -5 \end{bmatrix} \quad S_3 = \begin{bmatrix} 1 & 10 \\ 10 & 100 \end{bmatrix} \quad S_4 = \begin{bmatrix} 1 & 10 \\ 10 & 101 \end{bmatrix}.$$

3 For which numbers $b$ and $c$ are these matrices positive definite?

$$S = \begin{bmatrix} 1 & b \\ b & 9 \end{bmatrix} \qquad S = \begin{bmatrix} 2 & 4 \\ 4 & c \end{bmatrix} \qquad S = \begin{bmatrix} c & b \\ b & c \end{bmatrix}.$$

4 What is the energy $q = ax^2 + 2bxy + cy^2 = x^T S x$ for each of these matrices?
Complete the square to write $q$ as a sum of squares $d_1(\ \ )^2 + d_2(\ \ )^2$.

$$S = \begin{bmatrix} 1 & 2 \\ 2 & 9 \end{bmatrix} \qquad \text{and} \qquad S = \begin{bmatrix} 1 & 3 \\ 3 & 9 \end{bmatrix}.$$

5 $x^T S x = 2x_1 x_2$ certainly has a saddle point and not a minimum at $(0, 0)$. What
symmetric matrix $S$ produces this energy? What are its eigenvalues?
6 Test to see if $A^T A$ is positive definite in each case:

$$A = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 2 & 1 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 1 \end{bmatrix}.$$

7 Which 3 by 3 symmetric matrices $S$ and $T$ produce these quadratic energies?

$x^T S x = 2(x_1^2 + x_2^2 + x_3^2 - x_1 x_2 - x_2 x_3)$. Why is $S$ positive definite?

$x^T T x = 2(x_1^2 + x_2^2 + x_3^2 - x_1 x_2 - x_1 x_3 - x_2 x_3)$. Why is $T$ semidefinite?
8 Compute the three upper left determinants of $S$ to establish positive definiteness.
(The first is 2.) Verify that their ratios give the second and third pivots.

$$\text{Pivots} = \text{ratios of determinants} \qquad S = \begin{bmatrix} 2 & 2 & 0 \\ 2 & 5 & 3 \\ 0 & 3 & 8 \end{bmatrix}.$$

9 For what numbers $c$ and $d$ are $S$ and $T$ positive definite? Test the 3 determinants:

$$S = \begin{bmatrix} c & 1 & 1 \\ 1 & c & 1 \\ 1 & 1 & c \end{bmatrix} \quad \text{and} \quad T = \begin{bmatrix} 1 & 2 & 3 \\ 2 & d & 4 \\ 3 & 4 & 5 \end{bmatrix}.$$

10 If $S$ is positive definite then $S^{-1}$ is positive definite. Best proof: The eigenvalues
of $S^{-1}$ are positive because ___. Second proof (only for 2 by 2):

$$\text{The entries of } S^{-1} = \frac{1}{ac - b^2} \begin{bmatrix} c & -b \\ -b & a \end{bmatrix} \text{ pass the determinant tests: ___}$$

11 If $S$ and $T$ are positive definite, their sum $S + T$ is positive definite. Pivots and
eigenvalues are not convenient for $S + T$. Better to prove $x^T(S + T)x > 0$.
12 A positive definite matrix cannot have a zero (or even worse, a negative number)
on its diagonal. Show that this matrix fails to have $x^T S x > 0$:

$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 4 & 1 & 1 \\ 1 & 0 & 2 \\ 1 & 2 & 5 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \text{ is not positive when } (x_1, x_2, x_3) = (\ ,\ ,\ ).$$

13 A diagonal entry $a_{jj}$ of a symmetric matrix cannot be smaller than all the $\lambda$'s. If it
were, then $A - a_{jj}I$ would have ___ eigenvalues and would be positive definite.
But $A - a_{jj}I$ has a ___ on the main diagonal.
14 Show that if all $\lambda > 0$ then $x^T S x > 0$. We must do this for every nonzero $x$,
not just the eigenvectors. So write $x$ as a combination of the eigenvectors and
explain why all cross terms are $x_i^T x_j = 0$. Then $x^T S x$ is

$$(c_1 x_1 + \cdots + c_n x_n)^T (c_1 \lambda_1 x_1 + \cdots + c_n \lambda_n x_n) = c_1^2 \lambda_1 x_1^T x_1 + \cdots + c_n^2 \lambda_n x_n^T x_n > 0.$$

15 Give a quick reason why each of these statements is true:


(a) Every positive definite matrix is invertible.
(b) The only positive definite projection matrix is P D I .
(c) A diagonal matrix with positive diagonal entries is positive definite.
(d) A symmetric matrix with a positive determinant might not be positive definite !
16 With positive pivots in $D$, the factorization $S = LDL^T$ becomes $L\sqrt{D}\sqrt{D}L^T$.
(Square roots of the pivots give $D = \sqrt{D}\sqrt{D}$.) Then $A = \sqrt{D}L^T$ yields the
Cholesky factorization $S = A^T A$, which is a symmetrized $LU$:

$$\text{From } A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \text{ find } S. \qquad \text{From } S = \begin{bmatrix} 4 & 8 \\ 8 & 25 \end{bmatrix} \text{ find } A = \text{chol}(S).$$
   
cos  sin  2 0 cos  sin 
17 Without multiplying S D , find
sin  cos  0 5 sin  cos 

(a) the determinant of S (b) the eigenvalues of S


(c) the eigenvectors of S (d) a reason why S is symmetric positive definite.

18 For $F_1(x, y) = \frac{1}{4}x^4 + x^2 y + y^2$ and $F_2(x, y) = x^3 + xy - x$ find the second
derivative matrices $H_1$ and $H_2$:

$$\text{Test for minimum} \qquad H = \begin{bmatrix} \partial^2 F/\partial x^2 & \partial^2 F/\partial x \partial y \\ \partial^2 F/\partial y \partial x & \partial^2 F/\partial y^2 \end{bmatrix} \text{ is positive definite}$$

$H_1$ is positive definite so $F_1$ is concave up ($=$ convex). Find the minimum point
of $F_1$ and the saddle point of $F_2$ (look only where first derivatives are zero).

19 The graph of $z = x^2 + y^2$ is a bowl opening upward. The graph of $z = x^2 - y^2$ is
a saddle. The graph of $z = -x^2 - y^2$ is a bowl opening downward. What is a test
on $a, b, c$ for $z = ax^2 + 2bxy + cy^2$ to have a saddle point at $(0, 0)$?

20 Which values of c give a bowl and which c give a saddle point for the graph of
z D 4x 2 C 12xy C cy 2 ? Describe this graph at the borderline value of c.

21 When $S$ and $T$ are symmetric positive definite, $ST$ might not even be symmetric.
But its eigenvalues are still positive. Start from $STx = \lambda x$ and take dot products
with $Tx$. Then prove $\lambda > 0$.

22 Suppose $C$ is positive definite (so $y^T Cy > 0$ whenever $y \neq 0$) and $A$ has independent
columns (so $Ax \neq 0$ whenever $x \neq 0$). Apply the energy test to $x^T A^T CAx$
to show that $A^T CA$ is positive definite: the crucial matrix in engineering.

23 Find the eigenvalues and unit eigenvectors $v_1, v_2$ of $A^T A$. Then find $u_1 = Av_1/\sigma_1$:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} \quad \text{and} \quad A^T A = \begin{bmatrix} 10 & 20 \\ 20 & 40 \end{bmatrix} \quad \text{and} \quad AA^T = \begin{bmatrix} 5 & 15 \\ 15 & 45 \end{bmatrix}.$$

Verify that $u_1$ is a unit eigenvector of $AA^T$. Complete the matrices $U, \Sigma, V$.

$$\text{SVD} \qquad \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & \\ & 0 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^T.$$

24 Write down orthonormal bases for the four fundamental subspaces of this A.
25 (a) Why is the trace of $A^T A$ equal to the sum of all $a_{ij}^2$?

(b) For every rank-one matrix, why is $\sigma_1^2 = $ sum of all $a_{ij}^2$?

26 Find the eigenvalues and unit eigenvectors of $A^T A$ and $AA^T$. Keep each $Av = \sigma u$:

$$\text{Fibonacci matrix} \qquad A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$

Construct the singular value decomposition and verify that $A$ equals $U\Sigma V^T$.


27 Compute $A^T A$ and $AA^T$ and their eigenvalues and unit eigenvectors for $V$ and $U$:

$$\text{Rectangular matrix} \qquad A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}.$$

Check $AV = U\Sigma$ (this will decide $\pm$ signs in $U$). $\Sigma$ has the same shape as $A$.
28 Construct the matrix with rank one that has $Av = 12u$ for $v = \frac{1}{2}(1, 1, 1, 1)$ and
$u = \frac{1}{3}(2, 2, 1)$. Its only singular value is $\sigma_1 = $ ___.
29 Suppose $A$ is invertible (with $\sigma_1 > \sigma_2 > 0$). Change $A$ by as small a matrix as
possible to produce a singular matrix $A_0$. Hint: $U$ and $V$ do not change.

$$\text{From } A = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} \sigma_1 & \\ & \sigma_2 \end{bmatrix} \begin{bmatrix} v_1 & v_2 \end{bmatrix}^T \text{ find the nearest } A_0.$$

30 The SVD for $A + I$ doesn't use $\Sigma + I$. Why is $\Sigma(A + I)$ not just $\Sigma(A) + I$?
31 Multiply $A^T Av = \sigma^2 v$ by $A$. Put in parentheses to show that $Av$ is an eigenvector
of $AA^T$. We divide by its length $\|Av\| = \sigma$ to get the unit eigenvector $u$.
32 My favorite example of the SVD is when $Av(x) = dv/dx$, with the endpoint conditions
$v(0) = 0$ and $v(1) = 0$. We are looking for orthogonal functions $v(x)$
so that their derivatives $Av = dv/dx$ are also orthogonal. The perfect choice is
$v_1 = \sin \pi x$ and $v_2 = \sin 2\pi x$ and $v_k = \sin k\pi x$. Then each $u_k$ is a cosine.

The derivative of $v_1$ is $Av_1 = \pi \cos \pi x = \pi u_1$. The singular values are $\sigma_1 = \pi$
and $\sigma_k = k\pi$. Orthogonality of the sines (and orthogonality of the cosines) is the
foundation for Fourier series.

You may object to $AV = U\Sigma$. The derivative $A = d/dx$ is not a matrix! The
orthogonal factor $V$ has functions $\sin k\pi x$ in its columns, not vectors. The matrix
$U$ has cosine functions $\cos k\pi x$. Since when is this allowed? One answer is to
refer you to the chebfun package on the web. This extends linear algebra to matrices
whose columns are functions, not vectors.

Another answer is to replace $d/dx$ by a first difference matrix $A$. Its shape will be
$N + 1$ by $N$. $A$ has 1's down the diagonal and $-1$'s on the diagonal below. Then
$AV = U\Sigma$ has discrete sines in $V$ and discrete cosines in $U$. For $N = 2$ those will
be sines and cosines of $30°$ and $60°$ in $v_1$ and $u_1$.

Can you construct the difference matrix $A$ (3 by 2) and $A^T A$ (2 by 2)? The discrete
sines are $v_1 = (\sqrt{3}/2, \sqrt{3}/2)$ and $v_2 = (\sqrt{3}/2, -\sqrt{3}/2)$. Test that $Av_1$ is
orthogonal to $Av_2$. What are the singular values $\sigma_1$ and $\sigma_2$ in $\Sigma$?
