Polar Decomposition
Contents
1 Introduction
2 Orthogonal decompositions
3 Length of Tv
1 Introduction
The polar decomposition proven in the book (Theorem 7.45 on page 233)
concerns a linear map T ∈ L(V ) from a single inner product space to itself.
Exactly the same ideas treat the case of a map

T ∈ L(V, W) (1.1a)

between two (possibly different) finite-dimensional inner product spaces.
I stated the result in class; the point of these notes is to write down some
details of the proof, as well as to talk a bit more about why the result is
useful. To get to the statement, we need some notation. I’ll think of the
linear map T as an arrow going from V to W :
T : V −→ W; (1.1b)
this is just another notation for saying that T is a function that takes any v ∈ V and gives you something T(v) ∈ W. Because these are inner
product spaces, we get also the adjoint of T
T* : W −→ V,    ⟨Tv, w⟩ = ⟨v, T*w⟩    (v ∈ V, w ∈ W). (1.1c)
The property written on the right defines the linear map T*. Using these
two maps, we immediately get two subspaces of each of V and W:

Null(T), Range(T*) ⊆ V;    Null(T*), Range(T) ⊆ W.
The first basic fact is that these spaces provide orthogonal direct sum de-
compositions of V and W .
6. The natural isomorphism T̃ : V/Null(T) −→ Range(T) (text, 3.91) restricts to an isomorphism of vector spaces

   T : Range(T*) −→ Range(T).
8. The natural isomorphism T̃* : W/Null(T*) −→ Range(T*) (see text, 3.91) restricts to an isomorphism of vector spaces

   T* : Range(T) −→ Range(T*).
I’ll outline proofs in Section 2. (Even better, you should try to write
down proofs yourself.)
Here is the polar decomposition I stated in class.
T = SP    (S ∈ L(V, W), P ∈ L(V))
3. P|Null(T) = S|Null(T) = 0.
Write λmax and λmin for the largest and smallest eigenvalues of P (which are the square roots of the corresponding eigenvalues of T*T). Then
5. ‖Tv‖/‖v‖ ≤ λmax (0 ≠ v ∈ V), with the maximum attained exactly on the λmax eigenspace of P.
6. ‖Tv‖/‖v‖ ≥ λmin (0 ≠ v ∈ V), with the minimum attained exactly on the λmin eigenspace of P.
T = S′P
It is the factorization T = S′P that is established in the text (when V = W). Even in that case, I like T = SP better because it's unique, and unique is (almost) always better. The two maps S and S′ differ only on Null(T); so if T is invertible, S = S′.
Items (5) and (6) correspond to an application that I discussed in class:
often one cares about controlling how a linear map can change the sizes of
vectors. This result answers that question very precisely. I’ll discuss this in
Section 3.
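As a concrete illustration of the theorem (my own numerical sketch, not part of the notes), the factors S and P can be computed from the singular value decomposition; the matrix T and the tolerance 1e-12 below are arbitrary choices.

```python
import numpy as np

# A rank-deficient T : R^3 -> R^3, so that S|Null(T) = 0 is visible.
T = np.array([[2., 0., 0.],
              [0., 3., 0.],
              [0., 0., 0.]])

U, sigma, Vt = np.linalg.svd(T)

# P = (T*T)^{1/2}, the unique positive square root of T*T.
P = (Vt.T * sigma) @ Vt

# S: an isometry on Range(T*) and zero on Null(T); in SVD terms,
# replace each nonzero singular value by 1 and keep the zeros.
S = (U * (sigma > 1e-12).astype(float)) @ Vt

assert np.allclose(S @ P, T)                     # the factorization T = SP
assert np.allclose(P, P.T)                       # P is self-adjoint
assert np.all(np.linalg.eigvalsh(P) >= -1e-12)   # and positive
```

Here the extreme eigenvalues of P are λmax = 3 and λmin = 0, matching items (5) and (6): this T stretches vectors by at most a factor of 3, and kills some of them entirely.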
The proof will be outlined in Section 4. This is harder than Proposition 1.2, but it's still very worthwhile to first try to write the proofs yourself.
2 Orthogonal decompositions
This section is devoted to the proof of Proposition 1.2.
Proof of Proposition 1.2. The idea that's used again and again is that in an inner product space, a vector u is equal to 0 if and only if ⟨u, x⟩ = 0 for all x in the space (2.1). Using this,
Null(T) = {v ∈ V | Tv = 0}                       (definition)
        = {v ∈ V | ⟨Tv, w⟩ = 0 (all w ∈ W)}      (by (2.1))
        = {v ∈ V | ⟨v, T*w⟩ = 0 (all w ∈ W)}     (by (1.1c))
        = Range(T*)⊥                             (definition of U⊥).
This proves 1.
Part 2 is just 1 applied to T*, together with the (very easy; you should write a proof!) fact that T** = T.
Part 3 follows from 1 and Theorem 6.47 in the text; and part 4 is 3 applied to T*.
For part 5, suppose V = V1 ⊕ V2 is any direct sum decomposition of any
finite-dimensional V over any field F ; the claim is that the natural quotient
map
π : V → V /V1
restricts to an isomorphism

V2 −→ V/V1.
To see this, notice first that

dim V2 = dim V − dim V1 = dim(V/V1)

(Theorem 3.78 on page 93 and 3.89 on page 97), so the two vector spaces V2 and V/V1 have the same dimension. Second, Null(π) = V1 (text, proof of
Theorem 3.89), so
Null(V2 → V /V1 ) = V2 ∩ V1 = 0
(text, 1.45). So our map is an injection between vector spaces of the same
dimension; so it is an isomorphism.
For 6, the map we are looking at is the composition of the isomorphism
from 5 and the isomorphism from Proposition 3.91 in the text; so it is also
an isomorphism.
Parts 7 and 8 are just 5 and 6 applied to T ∗ .
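The decompositions in Proposition 1.2 are easy to check numerically. The following sketch (my own illustration, with an arbitrary random T) verifies Null(T) = Range(T*)⊥ and the dimension count for a map T : R^5 → R^3.

```python
import numpy as np

# A generic (rank-3) linear map T : R^5 -> R^3.
rng = np.random.default_rng(0)
T = rng.standard_normal((3, 5))

U, sigma, Vt = np.linalg.svd(T)    # rows of Vt: an orthonormal basis of V
rank = int((sigma > 1e-10).sum())

range_Tstar = Vt[:rank]            # orthonormal basis of Range(T*)
null_T = Vt[rank:]                 # orthonormal basis of Null(T)

# Null(T) really is killed by T, and is orthogonal to Range(T*).
assert np.allclose(T @ null_T.T, 0)
assert np.allclose(range_Tstar @ null_T.T, 0)

# Dimensions add up: dim Null(T) + dim Range(T*) = dim V.
assert null_T.shape[0] + range_Tstar.shape[0] == 5
```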
3 Length of Tv
This section concerns the general question “how big is T v?” whenever this
question makes sense: for us, that’s the setting (1.1a). I’ll start with a
slightly different result, proved in class.
Proposition 3.1. Suppose S ∈ L(V) is a self-adjoint linear operator. Define

µmin = smallest eigenvalue of S
µmax = largest eigenvalue of S
Then for all nonzero v ∈ V,

µmin ≤ ⟨Sv, v⟩/⟨v, v⟩ ≤ µmax.
All values in this range are attained. The first inequality is an equality if
and only if v is an eigenvector for µmin ; that is, if and only if v ∈ Vµmin .
The second inequality is an equality if and only if v is an eigenvector for
µmax ; that is, if and only if v ∈ Vµmax .
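Before the proof, here is a quick numerical illustration (my example, not the text's): for a symmetric matrix S, the quotient ⟨Sv, v⟩/⟨v, v⟩ stays between µmin and µmax, and equality is attained on the eigenspaces.

```python
import numpy as np

# A self-adjoint (symmetric) operator on R^3 with eigenvalues 1, 3, 5.
S = np.array([[2., 1., 0.],
              [1., 2., 0.],
              [0., 0., 5.]])
mu = np.linalg.eigvalsh(S)          # ascending: mu[0] = mu_min, mu[-1] = mu_max

# The Rayleigh quotient of random vectors stays inside [mu_min, mu_max].
rng = np.random.default_rng(1)
for _ in range(100):
    v = rng.standard_normal(3)
    q = (S @ v) @ v / (v @ v)
    assert mu[0] - 1e-12 <= q <= mu[-1] + 1e-12

# Equality at the top is attained on the mu_max eigenspace.
vmax = np.linalg.eigh(S)[1][:, -1]
assert np.isclose((S @ vmax) @ vmax / (vmax @ vmax), mu[-1])
```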
Proof. According to the Spectral Theorem for self-adjoint operators (which
is in the text, but not so easy to point to; there is a simple statement in the
notes on the spectral theorem on the class web site) the eigenvalues of S are
all real, so they can be arranged as

µmin = µ1 < µ2 < · · · < µr = µmax.
Furthermore V is the orthogonal direct sum of the eigenspaces:

V = Vµ1 ⊕ · · · ⊕ Vµr.
So far this calculation would have worked for any diagonalizable S. Now we
use the fact that S is self-adjoint, and therefore that we could choose the e_ij to be orthonormal. This allows us to calculate

⟨µi v_ij e_ij, v_i′j′ e_i′j′⟩ = µi v_ij v̄_i′j′ ⟨e_ij, e_i′j′⟩
                              = µi |v_ij|²   (i = i′, j = j′)
                              = 0            (i ≠ i′ or j ≠ j′).
0 ≤ wi ≤ 1, w1 + · · · + wr = 1 (3.2g)
These numbers are the weights. The simplest weights are the uniform weights (1/r, . . . , 1/r). The weighted average is

µ = w1µ1 + · · · + wrµr. (3.2h)
For the uniform weights, this is µ = (µ1 + · · · + µr)/r, the ordinary average. The opposite extreme is the teacher's pet weight
wi = 1 (i = p),    wi = 0 (i ≠ p),
where all the weight is on a single value µp . The weighted average for the
teacher’s pet weight is
µ = µp .
No matter what weights you use, it is always true that

µmin ≤ µ ≤ µmax. (3.2i)
The first inequality is an equality if and only if the weights are concentrated
on the minimum values:
wi ≠ 0 ⇐⇒ µi = µmin,
and similarly for the second inequality. I won’t write out a proof of (3.2i),
but here’s how to start. For each i, the definitions of min and max say that
µmin ≤ µi ≤ µmax .
Now return to the vector v expanded in the orthonormal eigenbasis, and define

wi = (Σj |v_ij|²)/‖v‖². (3.2j)

You should convince yourself that this really is a set of weights (that they are non-negative real numbers adding up to 1). Now (3.2f) says that
⟨Sv, v⟩/⟨v, v⟩ = w1µ1 + · · · + wrµr = weighted average of the eigenvalues of S. (3.2k)
Now (3.2i) gives the inequalities in the proposition. You should also convince
yourself that the conditions for equality given for weighted averages after
(3.2i) lead exactly to the conditions for equality stated in the proposition.
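In coordinates the weights are quite concrete. The sketch below (my construction, following the calculation above) expands v in an orthonormal eigenbasis of S and checks that the Rayleigh quotient really is the weighted average (3.2k).

```python
import numpy as np

# S with eigenvalues 1 (multiplicity 2), 4, 9.
S = np.diag([1., 1., 4., 9.])
mu, E = np.linalg.eigh(S)           # columns of E: an orthonormal eigenbasis

v = np.array([1., 2., 2., 0.])
coeffs = E.T @ v                    # coordinates of v in the eigenbasis
w = coeffs**2 / (v @ v)             # the weights: non-negative, summing to 1

assert np.isclose(w.sum(), 1.0)

# Rayleigh quotient = weighted average of the eigenvalues, as in (3.2k).
rayleigh = (S @ v) @ v / (v @ v)
weighted_avg = w @ mu
assert np.isclose(rayleigh, weighted_avg)
```

For this v the weight on the eigenvalue 9 is zero, so the quotient (here 21/9) sits strictly between µmin = 1 and µmax = 9.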
Proposition 3.3. Suppose we are in the setting (1.1). Write λmax² and λmin² for the largest and smallest eigenvalues of T*T, with 0 ≤ λmin ≤ λmax. Then
1. ‖Tv‖/‖v‖ ≤ λmax (0 ≠ v ∈ V), with the maximum attained exactly on the λmax² eigenspace of T*T.
2. ‖Tv‖/‖v‖ ≥ λmin (0 ≠ v ∈ V), with the minimum attained exactly on the λmin² eigenspace of T*T.
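Numerically, λmax and λmin are the extreme singular values of T. The following check (my example, with an arbitrary T and sample count) samples random vectors and confirms the bounds of Proposition 3.3.

```python
import numpy as np

# An invertible T : R^2 -> R^2.
T = np.array([[1., 2.],
              [0., 2.]])

# lambda_max, lambda_min: square roots of the extreme eigenvalues of T*T,
# i.e. the largest and smallest singular values of T.
sigma = np.linalg.svd(T, compute_uv=False)
lam_max, lam_min = sigma[0], sigma[-1]

# Every nonzero v is stretched by a factor in [lambda_min, lambda_max].
rng = np.random.default_rng(2)
for _ in range(100):
    v = rng.standard_normal(2)
    stretch = np.linalg.norm(T @ v) / np.linalg.norm(v)
    assert lam_min - 1e-12 <= stretch <= lam_max + 1e-12
```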
‖Tv‖ = ‖Rv‖ (v ∈ V).
Using the definition of adjoint, these can be written

⟨Tv, Tv⟩ = ⟨T*Tv, v⟩.
4 Proof of Theorem 1.3

Suppose T = SP = S̃P̃ are two factorizations as in the theorem, and write S1 = S|Range(T*) and S̃1 = S̃|Range(T*). Because S and S̃ are isometries on Range(T*),

S1*S1 = I_Range(T*),    S̃1*S̃1 = I_Range(T*). (4.1a)

Equivalently,

S*S|Range(T*) = S̃*S̃|Range(T*) = I_Range(T*). (4.1b)

Because S and S̃ vanish on Null(T),

S*S|Null(T) = S̃*S̃|Null(T) = 0_Null(T). (4.1c)
To continue, we need to know that S*S is the orthogonal projection P_Range(T*) onto Range(T*); that is exactly what (4.1b) and (4.1c) say. Since P is self-adjoint and vanishes on Null(T), its range lies in Range(T*), and therefore

T*T = PS*SP = P². (4.1k)

Similarly,

T*T = P̃². (4.1l)
Now Theorem 1.3(4) follows immediately; and in particular this shows that

P = P̃. (4.1m)

Since P|Range(T*) is invertible,

S1 = T(P|Range(T*))⁻¹ = S̃1. (4.1n)

Since S and S̃ also agree (both are 0) on Null(T), it follows that

S = S̃, (4.1o)
completing the uniqueness proof for the decomposition. (We also proved (4) in the process.)
For the existence of the decomposition, we define P = (T ∗ T )1/2 , the
unique positive square root, as we must. By Proposition 3.3(3)
‖Tv‖ = ‖Pv‖ (v ∈ V). (4.1p)
Therefore

P1 =def P|Range(T*) ∈ L(Range(T*)) (4.1r)

is invertible; and P1 inherits from P the property

‖Tv′‖ = ‖P1v′‖ (v′ ∈ Range(T*)). (4.1s)

Define

S1 =def (T|Range(T*))P1⁻¹ ∈ L(Range(T*), W);

by (4.1s), S1 is an isometry. Using the orthogonal decomposition

V = Null(T) ⊕ Range(T*),

define S ∈ L(V, W) by

S(n + v′) =def S1(v′)    (n ∈ Null(T), v′ ∈ Range(T*)). (4.1u)

Then Theorem 1.3(2) and (3) are true by these definitions. Furthermore
T(n + v′) = (T|Range(T*))(v′)
          = (T|Range(T*))P1⁻¹P1v′
          = S1P1v′                          (4.1w)
          = SP(n + v′).
This proves the factorization T = SP .
Parts (5) and (6) of Theorem 1.3 are contained in Proposition 3.3.
For the last assertion (about an alternate factorization), it's clear from the preceding proof that we can achieve the factorization using any S′ which agrees with S on Range(P) = Null(T)⊥. To complete the definition of S′, we just need to define it (as any linear map to W) on Null(T). The hypothesis dim V = dim W and Proposition 1.2 guarantee that dim Null(T) = dim Null(T*); so we can choose S′ on Null(T) to be an isometry to Null(T*).
I’ll omit the rest of the details.
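For a concrete instance of that last construction (my sketch, with an arbitrary rank-1 matrix): the SVD T = UΣV* supplies one admissible choice of S′, namely S′ = UV*, which is an isometry of all of V agreeing with S on Range(P).

```python
import numpy as np

# A rank-1 map T : R^2 -> R^2, so S' involves a genuine choice on Null(T).
T = np.array([[2., 0.],
              [0., 0.]])

U, sigma, Vt = np.linalg.svd(T)
P = (Vt.T * sigma) @ Vt             # P = (T*T)^{1/2}, as before
Sprime = U @ Vt                     # one choice of isometry S' = U V*

assert np.allclose(Sprime @ P, T)                   # T = S'P
assert np.allclose(Sprime.T @ Sprime, np.eye(2))    # S' is an isometry
```

Unlike S (which is 0 on Null(T) and hence unique), any other orthogonal completion on Null(T) would do here, which is exactly why the T = S′P factorization is not unique.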