
Further linear algebra. Chapter IV. Jordan normal form.

Andrei Yafaev

In what follows, V is a vector space of dimension n over a field k and B is a basis of V.

In this chapter we are concerned with linear maps T : V −→ V.
Let A be the matrix representing T in the basis B. Because A is an n × n matrix, we can form powers A^k for any k ≥ 0, with the convention that A^0 = I_n. Note that A^k represents the transformation T composed with itself k times.
Notice for example that when the matrix is diagonal with coefficients λ_i on the diagonal, then A^k is diagonal with coefficients λ_i^k. Notice also that such a matrix is invertible if and only if all the λ_i are non-zero, in which case A^{-1} is the diagonal matrix with coefficients λ_i^{-1} on the diagonal.
Definition 0.1 Let f(X) = Σ_i a_i X^i ∈ k[X]. We define

f(T) = Σ_i a_i T^i,

where we define T^0 = Id. This is a linear transformation.


If A ∈ M_n(k) is a matrix then we define

f(A) = Σ_i a_i A^i.

This matrix f(A) represents f(T) in the basis B.


What this means is that we can 'evaluate' a polynomial at an n × n matrix and get another n × n matrix. We write [f(T)]_B for this matrix in the basis B; it is obviously the same as f([T]_B).
Let's look at an example. Take

A = [ −1  3 ]
    [  4  7 ]

and f(x) = x^2 − 5x + 3. Then

f(A) = A^2 − 5A + 3I_2 = [ 21  3 ]
                         [  4 29 ]
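As a quick check of this computation, here is a short sketch using the sympy library (not part of the original notes; the script is illustrative):

```python
from sympy import Matrix, eye

A = Matrix([[-1, 3],
            [4, 7]])

# Evaluate f(x) = x^2 - 5x + 3 at A; the constant term becomes 3*I_2
f_A = A**2 - 5*A + 3*eye(2)
print(f_A)  # Matrix([[21, 3], [4, 29]])
```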

Another example: V = M_n(k) and T sends M to M^t. Consider f(x) = x^2 − 1. As T^2 = Id, we see that f(T) = 0.

Notice that

[f(T)]_B = f([T]_B).

It follows that if T is a linear map, then f(T) = 0 if and only if f(A) = 0, where A is the matrix of T in some (equivalently, any) basis.
Another property worth noting is that if f, g ∈ k[x], then

f(T)g(T) = (fg)(T) = (gf)(T) = g(T)f(T).

Definition 0.2 Recall that the characteristic polynomial of an n × n matrix A is defined by

ch_A(x) = det(x·I_n − A) = (−1)^n det(A − x·I_n).

This is a monic polynomial of degree n over k. Now suppose T : V → V is a linear map. We can define ch_T to be ch_{[T]_B}, but we need to check that this does not depend on the basis B. If C is another basis with transition matrix M then we have:

ch_{[T]_C}(X) = det(X·I_n − M^{-1}[T]_B M)
             = det(M^{-1}(X·I_n − [T]_B)M)
             = det(M)^{-1} det(X·I_n − [T]_B) det(M)
             = det(X·I_n − [T]_B)
             = ch_{[T]_B}(X)

In other words, the characteristic polynomial does not depend on the choice of the basis in which we write our matrix.
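This invariance is easy to illustrate numerically; a sketch using sympy (the matrices here are arbitrary choices, not from the notes):

```python
from sympy import Matrix, symbols

x = symbols('x')
A = Matrix([[1, 2], [3, 4]])
M = Matrix([[1, 1], [0, 1]])  # an invertible change-of-basis matrix

# Conjugation by M leaves the characteristic polynomial unchanged
print(A.charpoly(x) == (M.inv() * A * M).charpoly(x))  # True
```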

The following (important!) theorem was proved in the first year courses.

Theorem 0.1 (Cayley–Hamilton Theorem) For any n × n matrix A, we have ch_A(A) = 0.

We therefore have:

Theorem 0.2 (Cayley–Hamilton Theorem) For any linear map T : V −→ V, we have ch_T(T) = 0.

Example 0.3 Take

A = [ λ_1  0  ]
    [  0  λ_2 ]

Then ch_A(x) = (x − λ_1)(x − λ_2) and clearly ch_A(A) = 0.
Take

A = [ 1 2 ]
    [ 3 4 ]

Calculate ch_A(x) and check that ch_A(A) = 0.
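For the second matrix, ch_A(x) = x^2 − 5x − 2, and the Cayley–Hamilton identity can be confirmed with a short sympy sketch (illustrative, not part of the notes):

```python
from sympy import Matrix, symbols, eye, zeros

x = symbols('x')
A = Matrix([[1, 2], [3, 4]])

print(A.charpoly(x).as_expr())               # x**2 - 5*x - 2
# Substitute A for x by hand: the constant term c becomes c*I_2
print(A**2 - 5*A - 2*eye(2) == zeros(2, 2))  # True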

There are plenty of polynomials f such that f(A) = 0: all the multiples of ch_A, for example.
What can also happen is that some proper divisor g of ch_A already satisfies g(A) = 0. Take the identity I_n for example. Its characteristic polynomial is (x − 1)^n, but in fact g = x − 1 already satisfies g(I_n) = 0. This leads us to the notion of the minimal polynomial.

0.1 Minimal polynomials.


Definition 0.3 Let V be a finite dimensional vector space over a field k and T : V → V a linear map. A minimal polynomial of T is a monic polynomial m ∈ k[X] such that
• m(T) = 0;
• if f(T) = 0 and f ≠ 0 then deg f ≥ deg m.

Notice that if m is a polynomial as in the definition, and f is a polynomial such that f(T) = 0 and deg(f) < deg(m), then f = 0.
Indeed, if f were not zero, then dividing f by the coefficient of its leading term would give a monic polynomial vanishing at T of degree strictly less than deg(m), which contradicts the definition of m.

Theorem 0.4 Every linear map T : V → V has a unique minimal polynomial m_T. Furthermore, for f ∈ k[x] we have f(T) = 0 if and only if m_T | f.

Proof. Firstly, the Cayley–Hamilton theorem implies that there exists a polynomial f satisfying f(T) = 0, namely f = ch_T. Among monic polynomials satisfying f(T) = 0 we choose one of smallest degree and call it m_T. This shows that the minimal polynomial exists.
Suppose that m_T is not unique. Then there exists another monic polynomial n(x) ∈ k[X] satisfying the conditions of the definition. Because n(T) = 0 we have deg(n) ≥ deg(m_T), and because m_T(T) = 0 we have deg(m_T) ≥ deg(n); hence deg(m_T) = deg(n).

If f(x) = m_T(x) − n(x) then

f(T) = m_T(T) − n(T) = 0.

Moreover, since m_T and n are monic of the same degree, their leading terms cancel, so deg(f) < deg(m_T) = deg(n). By the remark following the definition of the minimal polynomial, we see that f = 0, i.e. m_T = n. This proves the uniqueness.
Suppose f ∈ k[X] and f(T) = 0. Write m = m_T. By the Division Algorithm for polynomials there exist unique q, r ∈ k[X] with r = 0 or deg(r) < deg(m), and

f = qm + r.
Then
r(T ) = f (T ) − q(T )m(T ) = 0 − q(T ) · 0 = 0.
So r is the zero polynomial (by the remark following the definition). Hence
f = qm and so m|f .
Conversely if f ∈ k[X] and m|f then f = qm for some q ∈ k[X] and so
f (T ) = q(T )m(T ) = q(T ) · 0 = 0. 

Corollary 0.5 If T : V → V is a linear map then m_T | ch_T.

Proof. By the Cayley–Hamilton Theorem, ch_T(T) = 0, so m_T | ch_T by Theorem 0.4. □

Using the corollary we can calculate the minimal polynomial as follows:

• Calculate ch_T and factorise it into irreducibles.
• Make a list of all the monic factors.
• Find the monic factor m of smallest degree such that m(T) = 0.
Example 0.6 Suppose T is represented by the matrix

[ 2 1 0 ]
[ 0 2 0 ]
[ 0 0 2 ]

The characteristic polynomial is

ch_T(X) = (X − 2)^3.

The monic factors of this are:

1, (X − 2), (X − 2)^2, (X − 2)^3.

The minimal polynomial is (X − 2)^2.
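The search for the smallest vanishing factor can be automated. Here is a sketch (sympy, illustrative) for the matrix above, where every candidate factor is a power of (X − 2):

```python
from sympy import Matrix, eye, zeros

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 2]])
n = A.shape[0]

# ch_A(X) = (X - 2)^3, so m_A = (X - 2)^b for the smallest b
# with (A - 2I)^b = 0
for b in range(1, n + 1):
    if (A - 2*eye(n))**b == zeros(n, n):
        print(f"minimal polynomial: (X - 2)^{b}")  # prints b = 2
        break
```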

In fact this method can be sped up: there are certain factors of the characteristic polynomial which cannot arise. To explain this we recall the definition of an eigenvalue.

Definition 0.4 Recall that a number λ ∈ k is called an eigenvalue of T if there is a non-zero vector v satisfying

T(v) = λ·v.

The non-zero vector v is called an eigenvector.

Remark 0.7 It is important that an eigenvector be non-zero. If you allow zero to be an eigenvector, then any λ would be an eigenvalue.

Proposition 0.8 Let v be an eigenvector of T with eigenvalue λ ∈ k. Then for any polynomial f ∈ k[X],

(f(T))(v) = f(λ)·v.

Proof. Applying T(v) = λv repeatedly gives T^i(v) = λ^i v; now extend by linearity. □

Theorem 0.9 If T : V → V is linear and λ ∈ k then the following are equivalent:
(i) λ is an eigenvalue of T.
(ii) m_T(λ) = 0.
(iii) ch_T(λ) = 0.

Proof. (i) ⇒ (ii): Assume T(v) = λv with v ≠ 0. Then by the proposition,

(m_T(T))(v) = m_T(λ)·v.

But m_T(T) = 0, so we have m_T(λ)·v = 0. Since v ≠ 0, this implies m_T(λ) = 0.
(ii) ⇒ (iii): This is immediate since we have already shown that m_T is a factor of ch_T.
(iii) ⇒ (i): Suppose ch_T(λ) = 0. Then det(λ·Id − T) = 0. It follows that λ·Id − T is not invertible, hence λ·Id − T has a non-zero kernel. Therefore there exists a non-zero v ∈ V such that (λ·Id − T)(v) = 0. But then T(v) = λ·v. □

Now suppose the characteristic polynomial of T factorises into irreducibles as

ch_T(X) = ∏_{i=1}^r (X − λ_i)^{a_i}.

By the fundamental theorem of algebra, if k = C we can always factorise it like this.
Then the minimal polynomial has the form

m_T(X) = ∏_{i=1}^r (X − λ_i)^{b_i},  1 ≤ b_i ≤ a_i.

This makes it much quicker to calculate the minimal polynomial. Indeed, in practice the number of factors and the exponents a_i are quite small.

Example 0.10 Suppose T is represented by the matrix diag(2, 2, 3). The characteristic polynomial is

ch_T(X) = (X − 2)^2 (X − 3).

The possibilities for the minimal polynomial are:

(X − 2)(X − 3), (X − 2)^2 (X − 3).

The minimal polynomial is (X − 2)(X − 3).

0.2 Generalised Eigenspaces


Definition 0.5 Let V be a finite dimensional vector space over a field k, and let λ ∈ k be an eigenvalue of a linear map T : V → V. We define for t ∈ N the t-th generalised eigenspace by:

V_t(λ) = ker((λ·Id − T)^t).

Note that V_1(λ) is the usual eigenspace (i.e. the set of eigenvectors together with zero).

Remark 0.11 We obviously have

V_1(λ) ⊆ V_2(λ) ⊆ …

and by definition,

dim V_t(λ) = nullity((λ·Id − T)^t).

Another property is that generalised eigenspaces are T-invariant:

T(V_i(λ)) ⊆ V_i(λ).

To see this, let v ∈ V_i(λ). Since T commutes with (T − λ·Id)^i, we have (T − λ·Id)^i (Tv) = T((T − λ·Id)^i v) = T(0) = 0, therefore T(v) ∈ V_i(λ).

Example 0.12 Let

A = [ 2 2 2 ]
    [ 0 2 2 ]
    [ 0 0 2 ]

We have ch_A(X) = (X − 2)^3, so 2 is the only eigenvalue. We'll now calculate the generalised eigenspaces V_t(2):

V_1(2) = ker [ 0 2 2 ]
             [ 0 0 2 ]
             [ 0 0 0 ]

We calculate the kernel by row-reducing the matrix:

V_1(2) = ker [ 0 1 0 ] = Span{ (1, 0, 0)^T }.
             [ 0 0 1 ]
             [ 0 0 0 ]

Similarly,

V_2(2) = ker [ 0 0 1 ] = Span{ (1, 0, 0)^T, (0, 1, 0)^T },
             [ 0 0 0 ]
             [ 0 0 0 ]

V_3(2) = ker [ 0 0 0 ] = Span{ (1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T }.
             [ 0 0 0 ]
             [ 0 0 0 ]
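The growing chain of kernels can be computed directly; a sympy sketch (illustrative) whose output matches the dimensions 1, 2, 3 found above:

```python
from sympy import Matrix, eye

A = Matrix([[2, 2, 2],
            [0, 2, 2],
            [0, 0, 2]])

# dim V_t(2) is the nullity of (A - 2I)^t
for t in (1, 2, 3):
    basis = ((A - 2*eye(3))**t).nullspace()
    print(t, len(basis))   # prints 1 1, then 2 2, then 3 3
```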

Example 0.13 Let

A = [ 1 1 −2 ]
    [ 1 1 −2 ]
    [ 1 1 −2 ]

(This matrix reappears as A − I in Example 0.22 below.)

Let V be a vector space and U and W be two subspaces. Then (exercise on Sheet 4) U + W is a subspace of V. If furthermore U ∩ W = {0}, then we call this subspace the direct sum of U and W and denote it by U ⊕ W. In this case

dim(U ⊕ W) = dim U + dim W.

Theorem 0.14 (Primary Decomposition Theorem) Let V be a finite dimensional vector space over k and let T : V → V be linear, with distinct eigenvalues λ_1, …, λ_r ∈ k. Let f be a monic polynomial such that f(T) = 0, and suppose that f factorises in k[x] into a product of degree one polynomials:

f(X) = ∏_{i=1}^r (X − λ_i)^{b_i}.

Then

V = V_{b_1}(λ_1) ⊕ ··· ⊕ V_{b_r}(λ_r).

Lemma 0.15 Let k be a field. If f, g ∈ k[x] satisfy gcd(f, g) = 1 and T is as above, then

ker((fg)(T)) = ker(f(T)) ⊕ ker(g(T)).

Proof (of the theorem). We have f(T) = 0, so ker(f(T)) = V. We have a factorisation of f into pairwise coprime factors of the form (x − λ_i)^{b_i}, so applying the lemma repeatedly gives

V = ker(f(T)) = ker ∏_{i=1}^r (T − λ_i·Id)^{b_i} = ker((T − λ_1·Id)^{b_1}) ⊕ ··· ⊕ ker((T − λ_r·Id)^{b_r})
  = V_{b_1}(λ_1) ⊕ ··· ⊕ V_{b_r}(λ_r). □

Proof (of the lemma). Let f, g ∈ k[x] satisfy gcd(f, g) = 1.
Firstly, if v ∈ ker f(T) + ker g(T), say v = w_1 + w_2 with w_1 ∈ ker f(T) and w_2 ∈ ker g(T), then

(fg)(T)v = (fg)(T)w_1 + (fg)(T)w_2 = g(T)(f(T)w_1) + f(T)(g(T)w_2) = 0,

using that f and g are polynomials in k[x], hence fg = gf.
Therefore ker(f(T)) + ker(g(T)) ⊆ ker((fg)(T)).
We need to prove the reverse inclusion; here we will use that gcd(f, g) = 1.
Now since gcd(f, g) = 1 there exist a, b ∈ k[x] such that

af + bg = 1.

So

a(T)f(T) + b(T)g(T) = 1 (the identity map).

Let v ∈ ker((fg)(T)). If

v_1 = a(T)f(T)v,  v_2 = b(T)g(T)v,

then v = v_1 + v_2 and

g(T)v_1 = (gaf)(T)v = a(T)((fg)(T)v) = a(T)·0 = 0.

So v_1 ∈ ker(g(T)). Similarly v_2 ∈ ker(f(T)) since

f(T)v_2 = (fbg)(T)v = b(T)((fg)(T)v) = b(T)·0 = 0.

Hence ker (fg)(T) = ker f(T) + ker g(T). Moreover, if v ∈ ker f(T) ∩ ker g(T) then v_1 = 0 = v_2, so v = 0. Hence

ker (fg)(T) = ker f(T) ⊕ ker g(T). □
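A small illustration of the lemma (a sympy sketch with f(x) = x − 1, g(x) = x − 2 and a triangular matrix having both eigenvalues; all choices here are illustrative):

```python
from sympy import Matrix, eye

A = Matrix([[1, 1],
            [0, 2]])          # eigenvalues 1 and 2
fA = A - eye(2)               # f(x) = x - 1 evaluated at A
gA = A - 2*eye(2)             # g(x) = x - 2 evaluated at A

print((fA * gA).nullspace())  # 2 vectors: ker((fg)(A)) is the whole space
print(fA.nullspace())         # Span{(1, 0)^T}
print(gA.nullspace())         # Span{(1, 1)^T}
```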

Notice that, by the fundamental theorem of algebra, in C[x] every polynomial factorises into a product of degree one factors. The theorem therefore applies to ch_T(x) and m_T(x).

Definition 0.6 Recall that a linear map T : V → V is diagonalisable if there is a basis B of V such that [T]_B is a diagonal matrix. This is equivalent to saying that the basis vectors in B are all eigenvectors.

Theorem 0.16 Let V be a finite dimensional vector space over a field k and let T : V → V be a linear map with distinct eigenvalues λ_1, …, λ_r ∈ k. Then T is diagonalisable if and only if (in k[X]):

m_T(X) = (X − λ_1)…(X − λ_r).

Proof. First suppose that T is diagonalisable and let B be a basis of eigenvectors. Let f(X) = (X − λ_1)…(X − λ_r). We already know that f | m_T, so to prove that f = m_T we just have to check that f(T) = 0. To show this, it is sufficient to check that f(T)(v) = 0 for each basis vector v ∈ B. Suppose v ∈ B, so v is an eigenvector with some eigenvalue λ_i. Then we have

f(T)(v) = f(λ_i)·v = 0·v = 0.

Therefore m_T = f.
Conversely, if m_T = f then by the primary decomposition theorem we have

V = V_1(λ_1) ⊕ … ⊕ V_1(λ_r).

Let B_i be a basis for V_1(λ_i). Then obviously the elements of B_i are eigenvectors and B = B_1 ∪ … ∪ B_r is a basis of V. Therefore T is diagonalisable. □

Remark 0.17 Observe that if the characteristic polynomial splits as ch_T(x) = (x − λ_1)···(x − λ_r), where the λ_i are distinct eigenvalues, then m_T = ch_T and the matrix is diagonalisable.
The converse, of course, does not hold: the identity matrix is diagonal, but for n ≥ 2 its characteristic polynomial (x − 1)^n has a repeated root.

Example 0.18 Let k = C and let

A = [ 4 2 ]
    [ 3 3 ]

The characteristic polynomial is (x − 1)(x − 6). The minimal polynomial is the same. The matrix is diagonalisable.
One finds that a basis of eigenvectors is

(2, −3)^T,  (1, 1)^T.

In fact this matrix is diagonalisable over R or even Q.
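This can be confirmed mechanically; a sympy sketch (diagonalize returns a matrix of eigenvectors P and a diagonal D with A = P·D·P^{-1}):

```python
from sympy import Matrix, symbols, factor

x = symbols('x')
A = Matrix([[4, 2], [3, 3]])

print(factor(A.charpoly(x).as_expr()))  # (x - 1)*(x - 6)
print(A.is_diagonalizable())            # True

P, D = A.diagonalize()                  # columns of P are eigenvectors
print(D)                                # diagonal entries 1 and 6 (up to ordering)
```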

Example 0.19 Let k = R and let

A = [ 0 −1 ]
    [ 1  0 ]

The characteristic polynomial is x^2 + 1. It is irreducible over R. The minimal polynomial is the same. The matrix is not diagonalisable over R; however, over C we have x^2 + 1 = (x − i)(x + i) and the matrix is diagonalisable.

Example 0.20 Let k = C and let

A = [ 1 −1 ]
    [ 1 −1 ]

The characteristic polynomial is X^2. Since A ≠ 0, the minimal polynomial is also X^2. Since this is not a product of distinct linear factors, A is not diagonalisable over C.

Example 0.21 Let k = C and let

A = [ 1 0 ]
    [ 1 1 ]

The minimal polynomial is (x − 1)^2. Not diagonalisable.

0.3 Jordan Bases in the one eigenvalue case

Let T : V → V be a linear map. Fix a basis for V and let A be the matrix of T in this basis. As we have seen above, it is not always the case that T can be diagonalised; i.e. there is not always a basis of V consisting of eigenvectors of T. In the case that there is no basis of eigenvectors, the best kind of basis is a Jordan basis. We shall define a Jordan basis in several steps.
Suppose λ ∈ k is the only eigenvalue of a linear map T : V → V. We have defined generalised eigenspaces:

V_1(λ) ⊆ V_2(λ) ⊆ … ⊆ V_b(λ),

where b is the power of X − λ in the minimal polynomial m_T.
We can choose a basis B_1 for V_1(λ). Then we can choose B_2 so that B_1 ∪ B_2 is a basis for V_2(λ), etc. Eventually we end up with a basis B_1 ∪ … ∪ B_b for V_b(λ). We'll call such a basis a pre-Jordan basis.

Consider

A = [ 3 −2 ]
    [ 8 −5 ]

One calculates the characteristic polynomial and finds (x + 1)^2, hence λ = −1 is the only eigenvalue. Up to scaling, the unique eigenvector is v_1 = (1, 2)^T. Hence V_1(−1) = Span(v_1). Of course we have (A − λI_2)^2 = (A + I_2)^2 = 0, hence V_2(−1) = C^2, and we complete v_1 to a basis of C^2 = V_2(−1), by v_2 = (0, 1)^T for example. We have Av_2 = −2v_1 − v_2, and hence in the basis {v_1, v_2} the matrix of A is

[ −1 −2 ]
[  0 −1 ]

The basis {v_1, v_2} is a pre-Jordan basis for A.

Example 0.22 Let

A = [ 2 1 −2 ]
    [ 1 2 −2 ]
    [ 1 1 −1 ]

We have ch_A(X) = (X − 1)^3 and m_A(X) = (X − 1)^2. There is only one eigenvalue λ = 1, and we have generalised eigenspaces

V_1(1) = ker [ 1 1 −2 ]  (all three rows of A − I equal this row),  V_2(1) = ker(0) = C^3.

So we can choose a pre-Jordan basis as follows:

B_1 = { (1, −1, 0)^T, (0, 2, 1)^T },  B_2 = { (1, 0, 0)^T }.

This in fact also works over R.

Now note the following:

Lemma 0.23 If v ∈ V_t(λ) with t > 1 then

(T − λ·Id)(v) ∈ V_{t−1}(λ).

Proof. Clear from the definition of the generalised eigenspaces. □

Now suppose we have a pre-Jordan basis B_1 ∪ … ∪ B_b. We call this a Jordan basis if in addition we have the condition:

(T − λ·Id)B_t ⊆ B_{t−1},  t = 2, 3, …, b.

If we have a pre-Jordan basis B_1 ∪ … ∪ B_b, then to find a Jordan basis, we do the following:

• For each basis vector v ∈ B_b, replace one of the vectors in B_{b−1} by (T − λ·Id)(v). When choosing which vector to replace, we just need to take care that we still have a basis at the end.

• For each basis vector v ∈ B_{b−1}, replace one of the vectors in B_{b−2} by (T − λ·Id)(v). When choosing which vector to replace, we just need to take care that we still have a basis at the end.

• etc.

• For each basis vector v ∈ B_2, replace one of the vectors in B_1 by (T − λ·Id)(v). When choosing which vector to replace, we just need to take care that we still have a basis at the end.

Once finished, order the vectors of the basis appropriately.

We’ll prove later that this method always works.

Example 0.24 Let's look again at

A = [ 3 −2 ]
    [ 8 −5 ]

We have seen that {v_1, v_2} is a pre-Jordan basis, where v_2 is the second vector in the standard basis.
Replace v_1 by the vector (A + I_2)v_2 = (−2, −4)^T. Then {v_1, v_2} still forms a basis of C^2. This is the Jordan basis for A.
We have Av_1 = −v_1 and Av_2 = v_1 − v_2 (you do not need to calculate, just use (A + I_2)v_2 = v_1). Hence in the new basis the matrix is

[ −1  1 ]
[  0 −1 ]
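The change of basis can be verified directly (a sympy sketch; P has the Jordan basis as its columns):

```python
from sympy import Matrix

A = Matrix([[3, -2],
            [8, -5]])
P = Matrix([[-2, 0],
            [-4, 1]])      # columns: v1 = (A + I)v2, v2 = (0, 1)^T
print(P.inv() * A * P)     # Matrix([[-1, 1], [0, -1]])
```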

Example 0.25 In Example 0.22 above, we replace one of the vectors in B_1 by

(A − I_3)(1, 0, 0)^T = (1, 1, 1)^T.

So we can choose a Jordan basis as follows:

B_1 = { (1, −1, 0)^T, (1, 1, 1)^T },  B_2 = { (1, 0, 0)^T }.

Example 0.26 Take k = R and

A = [ 1 −1 ]
    [ 1 −1 ]

Here, we have seen that the characteristic and minimal polynomials are x^2. Therefore 0 is the only eigenvalue.
Clearly v_1 = (1, 1)^T generates the eigenspace, and V_2(0) = R^2. We complete the basis by taking v_2 = (0, 1)^T. We get a pre-Jordan basis.
Let's construct a Jordan basis. Replace v_1 by Av_2 = (−1, −1)^T. This is a Jordan basis. The matrix of A in the new basis is

[ 0 1 ]
[ 0 0 ]

Example 0.27 Let

A = [ 2 2 2 ]
    [ 0 2 2 ]
    [ 0 0 2 ]

Clearly the characteristic polynomial is (x − 2)^3 and it is equal to the minimal polynomial; 2 is the only eigenvalue.
V_1(2) has equations y = z = 0, hence it is spanned by v_1 = (1, 0, 0)^T. V_2(2) is given by z = 0, and V_3(2) is R^3. Therefore the standard basis is a pre-Jordan basis.
We have

A − 2I_3 = [ 0 2 2 ]
           [ 0 0 2 ]
           [ 0 0 0 ]

and

(A − 2I_3)^2 = [ 0 0 4 ]
               [ 0 0 0 ]
               [ 0 0 0 ]

We have (A − 2I_3)v_3 = (2, 2, 0)^T, and we replace v_2 by this vector. Now (A − 2I_3)v_2 = (4, 0, 0)^T, and we replace v_1 by it. We get:

v_1 = (4, 0, 0)^T,  v_2 = (2, 2, 0)^T,  v_3 = (0, 0, 1)^T.

We have

Av_1 = 2v_1,  Av_2 = v_1 + 2v_2,  Av_3 = v_2 + 2v_3.

In this basis the matrix of A is:

[ 2 1 0 ]
[ 0 2 1 ]
[ 0 0 2 ]

This is a Jordan basis.
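Since the minimal polynomial has degree 3 here, there is a single 3 × 3 block, and sympy's built-in routine confirms this (jordan_form returns P and J with A = P·J·P^{-1}; illustrative sketch):

```python
from sympy import Matrix

A = Matrix([[2, 2, 2],
            [0, 2, 2],
            [0, 0, 2]])
P, J = A.jordan_form()
print(J)   # Matrix([[2, 1, 0], [0, 2, 1], [0, 0, 2]])
```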

0.4 Jordan Canonical (or Normal) Form in the one eigenvalue case

The Jordan canonical form of a linear map T : V → V is essentially the matrix of T with respect to a Jordan basis. We just need to order the vectors appropriately. Everything is over a field k; often k will be C.
Suppose for the moment that m_T = (x − λ)^b; in particular T has only one eigenvalue λ. Choose a Jordan basis:

B = B_1 ∪ … ∪ B_b.

Of course, as (T − λ·Id)^b = 0, we have V_b(λ) = V.
We have a chain of subspaces

V_1(λ) ⊆ V_2(λ) ⊆ ··· ⊆ V_b(λ) = V

and the pre-Jordan basis was constructed by starting with a basis of V_1(λ) and completing it successively to get a basis of V_b(λ) = V. We then altered this basis so that

(T − λ·Id)B_i ⊆ B_{i−1}.

Notice that we can arrange the basis elements in chains: starting with a vector v ∈ B_b we get a chain

v, (T − λ·Id)v, (T − λ·Id)^2 v, …, (T − λ·Id)^{b−1} v.

The last vector w = (T − λ·Id)^{b−1} v is in V_1(λ). Indeed

(T − λ·Id)^b v = 0,

hence (T − λ·Id)w = 0, therefore Tw = λw, therefore w ∈ V_1(λ).
We have the following

Lemma 0.28 For any v ∈ B_b (in particular v ∉ V_{b−1}(λ), since B_b extends a basis of V_{b−1}(λ)), the vectors

v, (T − λ·Id)v, (T − λ·Id)^2 v, …, (T − λ·Id)^{b−1} v

are linearly independent.

Proof. Suppose that

Σ_i μ_i (T − λ·Id)^i v = 0.

Then

μ_0 v + (T − λ·Id)w = 0,

where w is a linear combination of the vectors (T − λ·Id)^k v. Applying (T − λ·Id)^{b−1} to this relation, we get

μ_0 (T − λ·Id)^{b−1} v = 0,

but, as v ∉ V_{b−1}(λ), we see that

(T − λ·Id)^{b−1} v ≠ 0,

hence μ_0 = 0.
Repeating the process inductively, we get that μ_i = 0 for all i, and the vectors we consider are linearly independent. □

Let us number the vectors in this chain as v_1 = (T − λ·Id)^{b−1} v, …, v_b = v. In other words,

v_i = (T − λ·Id)^{b−i} v.

Then

(T − λ·Id)v_i = v_{i−1},

i.e.

T v_i = λv_i + v_{i−1}.

In other words, in the basis formed by this chain, the coordinate vector of T(v_i) has the entry 1 in position i − 1, the entry λ in position i, and zeros elsewhere.
This gives a Jordan block, i.e. the b × b matrix:

J_b(λ) = [ λ 1          ]
         [   λ 1        ]
         [     ⋱ ⋱      ]
         [         λ 1  ]
         [           λ  ]
In this way, we arrange our Jordan basis in chains, starting with B_i (for i = b, b − 1, …, 1) and terminating in V_1(λ).
By putting the chains together, we get that in the Jordan basis the matrix is block diagonal with Jordan blocks on the diagonal; for example, with chains of lengths 4, 4, 2, 1, 1 it is

diag(J_4(λ), J_4(λ), J_2(λ), J_1(λ), J_1(λ)).

In general we can write it as

[T]_B = diag(J_{h_1}(λ), …, J_{h_w}(λ)),

where J_{h_i} is the block corresponding to a chain of length h_i.


We can actually say more; in fact the following result determines the number of blocks:

Lemma 0.29 The number of blocks is the dimension of the eigenspace V_1(λ).

Proof. Let (v_1, …, v_k) be the part of the Jordan basis spanning the subspace U corresponding to one block. It is a chain: we have

T v_1 = λv_1

and

T v_i = λv_i + v_{i−1}

for 2 ≤ i ≤ k.
Let v ∈ U be an eigenvector: Tv = λv. Write v = Σ_{i=1}^k c_i v_i. Then

T v = c_1 λv_1 + Σ_{i≥2} c_i (λv_i + v_{i−1}) = λv + Σ_{i≥2} c_i v_{i−1}.

It follows that Tv = λv if and only if Σ_{i≥2} c_i v_{i−1} = 0, which implies that c_2 = ··· = c_k = 0, and hence v is in the subspace generated by v_1. Therefore each block determines exactly one eigenvector (up to scaling) for the eigenvalue λ. Eigenvectors from different blocks are linearly independent, as they are members of a basis, so the number of blocks is exactly the dimension of the eigenspace V_1(λ). □

SUMMARY:
To summarise what we have seen so far. Suppose T has one eigenvalue λ, and let m_T(x) = (x − λ)^b be its minimal polynomial. We construct a pre-Jordan basis by choosing a basis B_1 for the eigenspace V_1(λ) and then completing by B_2 (a certain number of vectors in V_2(λ)), and then by B_3, …, B_b. Note that V_b(λ) = V. We get a pre-Jordan basis B = B_1 ∪ ··· ∪ B_b.
Now we alter the pre-Jordan basis by doing the following. Start with a vector v_b ∈ B_b, and replace one of the vectors in B_{b−1} by v_{b−1} = (T − λ·Id)v_b, making sure that this v_{b−1} is linearly independent of the other vectors in B_{b−1}. Then replace a vector in B_{b−2} by v_{b−2} = (T − λ·Id)v_{b−1} (again, choose the vector to replace so that v_{b−2} is linearly independent of the others). Continue until you get to V_1(λ). The last vector will be v_1 ∈ V_1(λ), i.e. v_1 is an eigenvector.
We obtain a chain of vectors

v_1 = (T − λ·Id)v_2, v_2 = (T − λ·Id)v_3, …, v_{b−1} = (T − λ·Id)v_b, v_b.

Hence in particular

T v_k = v_{k−1} + λv_k.

The subspace U spanned by this chain is T-stable (because T v_k = v_{k−1} + λv_k), and the chain is linearly independent, hence it forms a basis of U.

In restriction to U and with respect to this basis, the matrix of T is the single b × b Jordan block

J_b(λ) = [ λ 1          ]
         [   λ 1        ]
         [     ⋱ ⋱      ]
         [         λ 1  ]
         [           λ  ]

One constructs such chains starting from all elements of B_b. Once done, one looks for elements of B_{b−1} which are not in the previously constructed chains, and constructs chains with them. Then with B_{b−2}, etc.
In the end, the union of the chains is a Jordan basis, and in it the matrix of T is of the form

diag(J_{h_1}(λ), …, J_{h_w}(λ)).


Notice the following two observations:
1. There is always a block of size b × b. Hence, knowing the degree of the minimal polynomial, in some cases it is possible to determine the shape of the Jordan normal form.
2. The number of blocks is the dimension of the eigenspace V_1(λ).
Here are some examples. Suppose you have a matrix such that

ch_A = (x − λ)^5

and

m_A(x) = (x − λ)^4.

There is always a block of size 4 × 4, hence the Jordan normal form has one 4 × 4 block and one 1 × 1 block.
Suppose ch_A is the same but m_A(x) = (x − λ)^3. Here you need to know more: there is one 3 × 3 block, and then either two 1 × 1 blocks or one 2 × 2 block. This is determined by the dimension of V_1(λ): if it is three then the first possibility holds, if it is two then the second.
Consider
A = [  0 1 0 1 ]
    [  0 1 0 0 ]
    [ −1 1 1 1 ]
    [ −1 1 0 2 ]

One calculates that ch_A(x) = (x − 1)^4. We have

A − I = [ −1 1 0 1 ]
        [  0 0 0 0 ]
        [ −1 1 0 1 ]
        [ −1 1 0 1 ]

Clearly the rank of A − I is 1, hence dim V_1(1) = 3.
This means that the Jordan normal form will have three blocks. Therefore there will be two blocks of size 1 × 1 and one of size 2 × 2. The Jordan normal form is

[ 1 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 1 ]
[ 0 0 0 1 ]
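The block count can be read off the rank, and sympy's jordan_form agrees (a sketch; the ordering of blocks inside J may differ from ours):

```python
from sympy import Matrix, eye

A = Matrix([[0, 1, 0, 1],
            [0, 1, 0, 0],
            [-1, 1, 1, 1],
            [-1, 1, 0, 2]])

print((A - eye(4)).rank())   # 1, so dim V1(1) = 4 - 1 = 3: three blocks
P, J = A.jordan_form()
print(J)                     # two 1x1 blocks and one 2x2 block, eigenvalue 1
```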
Another example:

A = [ −2  0 −1  1 ]
    [  0 −2  1  0 ]
    [  0  0 −2  0 ]
    [  0  0  0 −2 ]

One calculates that ch_A(x) = (x + 2)^4. We have

A + 2I = [ 0 0 −1 1 ]
         [ 0 0  1 0 ]
         [ 0 0  0 0 ]
         [ 0 0  0 0 ]

We see that (A + 2I)^2 = 0 and therefore m_A(x) = (x + 2)^2. As there is always a block of size two, there are two possibilities: either two 2 × 2 blocks, or one 2 × 2 and two 1 × 1.
To decide which one it is, we see that the rank of A + 2I is 2, hence the dimension of the kernel is 2. There are therefore two blocks, and the Jordan normal form is

[ −2  1  0  0 ]
[  0 −2  0  0 ]
[  0  0 −2  1 ]
[  0  0  0 −2 ]

0.5 Jordan canonical form in general.

Once we know how to determine the Jordan canonical form in the one eigenvalue case, the general case is easy. Let T be a linear transformation and λ_1, …, λ_r its distinct eigenvalues. Suppose that the minimal polynomial decomposes as

m_T(x) = ∏_{i=1}^r (x − λ_i)^{b_i}

(recall again that this is always true over C). Then we have seen that

V = V_{b_1}(λ_1) ⊕ ··· ⊕ V_{b_r}(λ_r)

and each V_{b_i}(λ_i) is stable under T. Therefore, in restriction to each V_{b_i}(λ_i), T is a linear transformation with one eigenvalue, namely λ_i, and the minimal polynomial of T restricted to V_{b_i}(λ_i) is (x − λ_i)^{b_i}.
One gets a Jordan basis by taking the union of Jordan bases for each V_{b_i}(λ_i), which are constructed as previously.
Let's look at an example.
 
A = [ −1 1 1 ]
    [ −2 2 1 ]
    [ −1 1 1 ]

One calculates that ch_A(x) = x(x − 1)^2. Then 0 and 1 are the only eigenvalues, and

V = V_1(0) ⊕ V_2(1).

One finds that V_1(0) is generated by

v_0 = (1, 1, 0)^T.

That will be the first vector of the basis.
For λ = 1: we have

A − I = [ −2 1 1 ]
        [ −2 1 1 ]
        [ −1 1 0 ]

We find that V_1(1) is spanned by

v_1 = (1, 1, 1)^T.

Then

(A − I)^2 = [ 1 0 −1 ]
            [ 1 0 −1 ]
            [ 0 0  0 ]

and V_2(1) is spanned by v_1 = (1, 1, 1)^T and v_2 = (0, 1, 0)^T.
Notice that (A − I)v_2 = v_1, and therefore this is already a Jordan basis. The matrix in the basis (v_0, v_1, v_2) is

[ 0 0 0 ]
[ 0 1 1 ]
[ 0 0 1 ]
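Again this can be checked mechanically (a sympy sketch; the ordering of blocks inside J may differ from ours):

```python
from sympy import Matrix

A = Matrix([[-1, 1, 1],
            [-2, 2, 1],
            [-1, 1, 1]])
P, J = A.jordan_form()
print(J)   # a 1x1 block for eigenvalue 0 and a 2x2 block for eigenvalue 1
```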
Another example:

A = [ 5 4 2 ]
    [ 4 5 2 ]
    [ 2 2 2 ]

One calculates that ch_A(x) = (x − 1)^2 (x − 10). Then 1 and 10 are the only eigenvalues.
One finds

V_1(1) = Span(u_1 = (−1, 0, 2)^T, u_2 = (−1, 1, 0)^T).

The dimension is two, therefore there will be two blocks of size 1 × 1 corresponding to the eigenvalue 1.
For V_1(10), one finds

V_1(10) = Span(u_3 = (2, 2, 1)^T).

In the basis (u_1, u_2, u_3) the matrix is

[ 1 0  0 ]
[ 0 1  0 ]
[ 0 0 10 ]

It is diagonal; the matrix is diagonalisable, and in fact m_A = (x − 1)(x − 10).


And a last example: find a Jordan basis and the normal form of

A = [  4 0 1 0 ]
    [  2 2 3 0 ]
    [ −1 0 2 0 ]
    [  4 0 1 2 ]

One finds that the characteristic polynomial is ch_A(x) = (x − 2)^2 (x − 3)^2. Hence 2 and 3 are the eigenvalues, and we have

A − 2I = [  2 0 1 0 ]
         [  2 0 3 0 ]
         [ −1 0 0 0 ]
         [  4 0 1 0 ]

Clearly the kernel has dimension 2 and is spanned by e_2 and e_4: the eigenspace for 2 has dimension two.
So we will have two blocks of size 1 × 1 corresponding to the eigenvalue 2.
For the eigenvalue 3:

A − 3I = [  1  0  1  0 ]
         [  2 −1  3  0 ]
         [ −1  0 −1  0 ]
         [  4  0  1 −1 ]

Rows one and three are proportional, and the remaining rows are linearly independent. It follows that the eigenspace is one-dimensional, spanned by

u = (1, −1, −1, 3)^T.

We will have one block.
Let us calculate:

(A − 3I)^2 = [  0 0  0 0 ]
             [ −3 1 −4 0 ]
             [  0 0  0 0 ]
             [ −1 0  2 1 ]

We take the vector v = (0, 4, 1, −2)^T to complete u to a basis of ker((A − 3I)^2).
Now, we have (A − 3I)v = u, hence we already have a Jordan basis.
The basis (e_2, e_4, u, v) is a Jordan basis, and in this basis the matrix is

[ 2 0 0 0 ]
[ 0 2 0 0 ]
[ 0 0 3 1 ]
[ 0 0 0 3 ]
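As a final check, a sympy sketch with this Jordan basis as the columns of P (illustrative, not part of the notes):

```python
from sympy import Matrix

A = Matrix([[4, 0, 1, 0],
            [2, 2, 3, 0],
            [-1, 0, 2, 0],
            [4, 0, 1, 2]])
# Columns: e2, e4, then the chain u = (A - 3I)v, v
P = Matrix([[0, 0, 1, 0],
            [1, 0, -1, 4],
            [0, 0, -1, 1],
            [0, 1, 3, -2]])
print(P.inv() * A * P)  # diag(2, 2) followed by the 2x2 Jordan block J_2(3)
```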
