Smith

6 The Smith Canonical Form
6.1 Equivalence of Polynomial Matrices

DEFINITION 6.1
A matrix P ∈ Mn×n (F [x]) is called a unit in Mn×n (F [x]) if ∃ Q ∈
Mn×n (F [x]) such that
P Q = In .
Clearly if P and Q are units, so is P Q.
THEOREM 6.1
A matrix P ∈ Mn×n (F [x]) is a unit in Mn×n (F [x]) if and only if det P =
c, where c ∈ F and c 6= 0.
proof
“only if”. Suppose P is a unit. Then P Q = In and
det P Q = det P det Q = det In = 1.
However det P and det Q belong to F [x], so both are in fact non–zero ele-
ments of F .
“if”. Suppose P ∈ Mn×n (F [x]) satisfies det P = c, where c ∈ F and
c 6= 0. Then
P adj P = (det P )In = cIn .
Hence P Q = In , where Q = c−1 adj P ∈ Mn×n (F [x]). Hence P is a unit in
Mn×n (F [x]).
EXAMPLE
6.1
1 + x −x
P = ∈ M2×2 (F [x]) is a unit, as det P = 1.
x 1−x
THEOREM 6.2
Elementary row matrices in Mn×n (F [x]) are units:
(i) Eij : interchange rows i and j of In ;
(ii) Ei (t): multiply row i of In by t ∈ F, t 6= 0;
(iii) Eij (f ): add f times row j of In to row i, f ∈ F [x].
In fact det Eij = −1; det Ei (t) = t; det Eij (f ) = 1.
Similarly for elementary column matrices in Mn×n (F [x]):
Fij , Fi (t), Fij (f ).
120
Remark: It follows that a product of elementary matrices in Mn×n (F [x])
is a unit. Later we will be able to prove that the converse is also true.
DEFINITION 6.2
Let A, B ∈ Mm×n (F [x]). Then A is equivalent to B over F [x] if units
P ∈ Mm×m (F [x]) and Q ∈ Mn×n (F [x]) exist such that
P AQ = B.
THEOREM 6.3
Equivalence of matrices over F [x] defines an equivalence relation on
Mm×n (F [x]).
6.1.1 Determinantal Divisors

DEFINITIONS 6.1
Let A ∈ Mm×n (F [x]). Then for 1 ≤ k ≤ min (m, n), let dk (A) denote the
gcd of all k × k minors of A.
dk (A) is sometimes called the k th determinantal divisor of A.
Note: gcd (f1 , . . . , fn ) 6= 0 ⇔ at least one of f1 , . . . , fn is non–zero.
ρ(A), the determinantal rank of A, is defined to be the largest integer r
for which there exists a non–zero r × r minor of A.
THEOREM 6.4
For 1 ≤ k ≤ ρ(A), we have dk (A) 6= 0. Also dk (A) divides dk+1 (A) for
1 ≤ k ≤ ρ(A) − 1.
proof
Let r = ρ(A). Then there exists an r × r non–zero minor and hence
dr (A) 6= 0. Then because each r × r minor is a linear combination over F [x]
of (r − 1) × (r − 1) minors of A, it follows that some (r − 1) × (r − 1) minor of
A is also non–zero and hence dr−1 (A) 6= 0; also dr−1 (A) divides each minor
of size r − 1 and consequently divides each minor of size r; hence dr−1 (A)
divides dr (A), the gcd of all minors of size r. This argument can be repeated
with r replaced by r − 1 and so on.
THEOREM 6.5
Let A, B ∈ Mm×n (F [x]). Then if A is equivalent to B over F [x], we
have
(i) ρ(A) = ρ(B) = r;
121
(ii) dk (A) = dk (B) for 1 ≤ k ≤ r.
proof
Suppose P AQ = B, where P and Q are units. First consider P A. The
rows of P A are linear combinations over F [x] of the rows of A, so it follows
that each k × k minor of P A is a linear combination of the k × k minors of
A. Similarly each column of (P A)Q is a linear combinations over F [x] of
the columns of P A, so it follows that each k × k minor of B = (P A)Q is a
linear combination over F [x] of the k × k minors of P A and consequently of
the k × k minors of A.
It follows that all minors of B with size k > ρ(A) must be zero and hence
ρ(B) ≤ ρ(A). However B is equivalent to A, so we deduce that ρ(A) ≤ ρ(B)
and hence ρ(A) = ρ(B).
Also dk (B) is a linear combination over F [x] of all k × k minors of B
and hence of all k × k minors of A. Hence dk (A)|dk (B) and by symmetry,
dk (B)|dk (A). Hence dk (A) = dk (B) if 1 ≤ k ≤ r.
6.2 Smith Canonical Form

THEOREM 6.6 (Smith canonical form)
Every non–zero matrix A ∈ Mm×n (F [x]) with r = ρ(A) is equivalent to
a matrix of the form
 
f1 0 · · · 0 · · · 0
 0 f2 · · · 0 · · · 0 
 
 .. .. .. .. .. .. 
 . . . . . . 
D=  0 0 · · · fr · · · 0  = P AQ

 
 .. .. .. .. . . .. 
 . . . . . . 
0 0 ··· 0 ··· 0
where f1 , . . . , fr ∈ F [x] are monic, fk |fk+1 for 1 ≤ k ≤ r−1, P is a product of

elementary row matrices, and Q is a product of elementary column matrices.
DEFINITION 6.3
The matrix D is said to be in Smith canonical form.
proof
This is presented in the form of an algorithm which is in fact used by
Cmat to find unit matrices P and Q such that P AQ is in Smith canonical
form.
122
Our account is based on that in the book “Rings, Modules and Linear
Algebra,” by B. Hartley and T.O. Hawkes.
We describe a sequence of elementary row and column operations over
F [x], which when applied to a matrix A with a11 6= 0 either yields a matrix
C of the form  
f1 0 · · · 0
 0 
C= .
 
. C ∗ 
 . 
0
where f1 is monic and divides every element of C ∗ , or else yields a matrix
B in which b11 6= 0 and
deg b11 < deg a11 . (28)
Assuming this, we start with our non–zero matrix A. By performing suitable

row and column interchanges, we can assume that a11 6= 0. Now repeatedly
perform the algorithm mentioned above. Eventually we must reach a ma-
trix of type C, otherwise we would produce an infinite strictly decreasing
sequence of non–negative integers by virtue of inequalities of type (28).
On reaching a matrix of type C, we stop if C ∗ = 0. Otherwise we perform
the above argument on C ∗ and so on, leaving a trail of diagonal elements as
we go.
Two points must be made:
(i) Any elementary row or column operation on C ∗ corresponds to an
elementary operation on C, which does not affect the first row or
column of C.
(ii) Any elementary operation on C ∗ gives a new C ∗ whose new entries

are linear combinations over F [x] of the old ones; consequently these
new entries will still be divisible by f1 .
Hence in due course we will reach a matrix D which is in Smith canonical
form.
We now detail the sequence of elementary operations mentioned above.
Case 1. ∃ a1j in row 1 with a11 not dividing a1j . Then
a1j = a11 q + b,
by Euclid’s division theorem, where b 6= 0 and deg b < deg a11 . Subtract q
times column 1 from column j and then interchange columns 1 and j. This
yields a matrix of type B mentioned above.
123
Case 2. ∃ ai1 in column 1 with a11 not dividing ai1 . Proceed as in Case 1,
operating on rows rather than columns, again reaching a matrix of type B.
Case 3. Here a11 divides every element in the first row and first column.
Then by subtracting suitable multiples of column 1 from the other columns,
we can replace all the entries in the first row other than a11 by 0. Similarly
for the first column. We then have a matrix of the form
 
e11 0 · · · 0
 0 
E= . .
 
. E ∗
 . 
0
If e11 divides every element of E ∗ , we have reached a matrix of type C.

Otherwise ∃ eij not divisible by e11 . We then add row i to row 1, thereby
reaching Case 1.
EXAMPLE 6.2
(of the Smith Canonical Form)
1 + x2

x
A=
x 1+x
We want D = P AQ in Smith canonical form. So we construct the augmented
matrix
work on rows work on columns
↓ ↓
1 0 1 + x2 x 1 0
0 1 x 1+x 0 1
R1 → R1 − xR2 ⇒ 1 −x 1 −x2 1 0
0 1 x 1+x 0 1
C2 → C2 + x2 C1 ⇒ 1 −x 1 0 1 x2
0 1 x 1 + x + x3 0 1
R2 → R2 − xR1 ⇒ 1 −x 1 0 1 x2
−x 1 + x2 0 1 + x + x3 0 1
↑ ↑ ↑
P D Q
Invariants are f1 = 1, f2 = 1 + x + x3 . Note also
d2 (A)
f1 = d1 (A), f2 = .
d1 (A)
124
6.2.1 Uniqueness of the Smith Canonical Form
THEOREM 6.7
Every matrix A ∈ Mm×n (F [x]) is equivalent to precisely one matrix is
Smith canonical form.
proof Suppose A is equivalent to a matrix B in Smith canonical form.

That is,  
f1
 .. 0 

.
B=  and f1 | f2 | · · · | fr .

 fr 
0 0
Then r = ρ(A), the determinantal rank of A. But if 1 ≤ k ≤ r,
dk (A) = dk (B) = f1 f2 . . . fk
and so the fi are uniquely determined by
f1 = d1 (A)
d2 (A)
f2 =
d1 (A)
..
.
dr (A)
fr = .
dr−1 (A)
6.3 Invariant factors of a polynomial matrix

DEFINITION 6.4
The polynomials f1 , . . . , fr in the Smith canonical form of A are called
the invariant factors of A.3
Note: Cmat calls the invariant factors of xI − B, where B ∈ Mn×n (F ), the
“similarity invariants” of B.
We next find these similarity invariants. They are
1, 1, . . . , 1, d1 , . . . , ds
| {z }
n−s
where d1 , . . . , ds are what earlier called the invariant factors of TB .

3
NB. This is a slightly different, though similar, form of “invariant factor” to that we
met a short while ago.
125
LEMMA 6.1
The Smith canonical form of xIn − C(d) where d is a monic polynomial
of degree n is
diag (1, . . . , 1, d).
| {z }
n−1
proof Let d = xn + an−1 xn−1 + · · · + a0 ∈ F [x], so

 
x 0 a0
 −1 x · · · a1 
 
 0 −1 a2 
xIn − C(d) =  .
 
.. . . ..

 . . . 

 x an−2 
0 · · · −1 x + an−1
Now use the row operation
R1 → R1 + xR2 + x2 R3 + · · · + xn−1 Rn
to obtain  
0 0 d

 −1 x · · · a1 


 0 −1 a2 

 .. . . .. 

 . . . 

 x an−2 
0 · · · −1 x + an−1
(think about it!) and then column operations
C2 → C2 + xC1 , . . . , Cn−1 → Cn−1 + xCn−2
and then
Cn → Cn + a1 C1 + a2 C2 + · · · + an−2 Cn−2 + (x + an−1 )Cn−1
yielding  
0 0 ... 0 d

 −1 0 0 

 0 −1 .

 .. .. 
 . . 
0 · · · −1 0
126
Trivially, elementary operations now form the matrix
diag (1, . . . , 1, d).

| {z }
n−1
THEOREM 6.8
Let B ∈ Mn×n (F ). Then if the invariant factors of B are d1 , . . . , ds , then
the invariant factors of xIn − B are
1, . . . , 1, d , d , . . . , ds .
| {z } 1 2
n−s
proof There exists non-singular P ∈ Mn×n (F ) such that

s
M
P −1 BP = C(dk ).
k=1
Then
s
M
P −1 (xIn − B)P = xIn − C(dk )
k=1
s
M
= (xImk − C(dk )) where mk = deg dk .
k=1
But by the lemma, each xImk − C(dk ) is equivalent over F [x] to

diag (1, . . . , 1, dk ) and hence xIn − B is equivalent to
 
1
 .. 
 . 
s  
M  1 
diag (1, . . . , 1, dk ) ∼  .
 d1 
k=1  
 .. 
 . 
ds
EXAMPLE 6.3
Find the invariant factors of
 
2 0 0 0
 −1 1 0 0 
B=  ∈ M4×4 (Q)
 0 −1 0 −1 
1 1 1 2
127
by finding the Smith canonical form of xI4 − B.
Solution:  
x−2 0 0 0
 1 x − 1 0 0 
xI4 − B = 
 0

1 x 1 
−1 −1 −1 x − 2
We start off with the row operations
R1 → R1 − (x − 2)R2
R1 ↔ R2
R4 → R4 + R1
and get
 
1 x−1 0 0
 0 −(x − 1)(x − 2) 0 0 
 
 0 1 x 1 
0 x−2 −1 x − 2
 
1 0 0 0
 0 −(x − 1)(x − 2) 0 0 
(column ops.) ⇒ 
 0

1 x 1 
0 x − 2 −1 x − 2
 
1 0 0 0
 0 1 x 1 
⇒ 
 0 −(x − 1)(x − 2) 0

0 
0 x−2 −1 x − 2
 
1 0 0 0
 0 1 x 1 
 
⇒  0 0 x(x − 1)(x − 2) (x − 1)(x − 2) 
 
 0 0 −1 − x(x − 2) 0 
2
{= −(x − 1) }
 
1 0 0 0
 0 1 0 0 
⇒  0 0 x(x − 1)(x − 2) (x − 1)(x − 2)  .
 
0 0 −(x − 1)2 0
Now, for brevity, we work just on the 2 × 2 block in the bottom right corner:

(x − 1)(x − 2) x(x − 1)(x − 2)
⇒
0 −(x − 1)2
128

(x − 1)(x − 2) 0
C2 → C2 − xC1 ⇒
0 −(x − 1)2
(x − 1)(x − 2) (x − 1)2

R1 → R1 + R2 ⇒
0 −(x − 1)2

(x − 1)(x − 2) x−1
C2 → C2 − C1 ⇒
0 −(x − 1)2

x−1 (x − 1)(x − 2)
C1 ↔ C2 ⇒
−(x − 1)2 0

x−1 0
C2 → C2 − (x − 2)C1 ⇒
−(x − 1) (x − 2)(x − 1)2
2

x−1 0
R2 → R2 + (x − 1)R1 ⇒
0 (x − 2)(x − 1)2
and here we stop, as we have a matrix in Smith canonical form. Thus

 
1
 1 
xI4 − B ∼  
 x−1 
(x − 1)2 (x − 2)
so the invariant factors of B are the non-trivial ones of xI4 − B, i.e.
(x − 1) and (x − 1)2 (x − 2).
Also, the elementary divisors of B are
(x − 1), (x − 1)2 and (x − 2)
so the Jordan canonical form of B is
J2 (1) ⊕ J1 (1) ⊕ J1 (2).
THEOREM 6.9
Let A, B ∈ Mn×n (F ). Then A is similar to B
⇔ xIn − A is equivalent to xIn − B

⇔ xIn − A and xIn − B have the same
Smith canonical form.
proof
129
⇒ Obvious. If P −1 AP = B, P ∈ Mn×n (F ) then
P −1 (xIn − A)P = xIn − P −1 AP

= xIn − B.
⇐ If xIn − A and xIn − B are equivalent over F [x], then they have the
same invariant factors and so have the same non-trivial invariant fac-
tors. That is, A and B have the same invariant factors and hence are
similar.
Note: It is possible to start from xIn − A and find P ∈ Mn×n (F ) such that
s
M
P −1 AP = C(dk )
k=1
where
P1 (xIn − B)Q1 = diag (1, . . . , 1, d1 , . . . , ds ).
(See Perlis, Theory of matrices, p. 144, Corollary 8–1 and p. 137, Theorem
7–9.)
THEOREM 6.10
Every unit in Mn×n (F [x]) is a product of elementary row and column
matrices.
Proof: Problem sheet 7, Question 12.
130

Smith

Uploaded by

Copyright:

Available Formats

Smith

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Smith

Uploaded by

Copyright:

Available Formats

6 The Smith Canonical Form

6.1 Equivalence of Polynomial Matrices

det P Q = det P det Q = det In = 1.

Fij , Fi (t), Fij (f ).

6.1.1 Determinantal Divisors

(i) ρ(A) = ρ(B) = r;

6.2 Smith Canonical Form

where f1 , . . . , fr ∈ F [x] are monic, fk |fk+1 for 1 ≤ k ≤ r−1, P is a product of

deg b11 < deg a11 . (28)

Assuming this, we start with our non–zero matrix A. By performing suitable

(ii) Any elementary operation on C ∗ gives a new C ∗ whose new entries

If e11 divides every element of E ∗ , we have reached a matrix of type C.

Invariants are f1 = 1, f2 = 1 + x + x3 . Note also

proof Suppose A is equivalent to a matrix B in Smith canonical form.

and so the fi are uniquely determined by

6.3 Invariant factors of a polynomial matrix

We next find these similarity invariants. They are

where d1 , . . . , ds are what earlier called the invariant factors of TB .

proof Let d = xn + an−1 xn−1 + · · · + a0 ∈ F [x], so

Now use the row operation

C2 → C2 + xC1 , . . . , Cn−1 → Cn−1 + xCn−2

Cn → Cn + a1 C1 + a2 C2 + · · · + an−2 Cn−2 + (x + an−1 )Cn−1

diag (1, . . . , 1, d).

proof There exists non-singular P ∈ Mn×n (F ) such that

But by the lemma, each xImk − C(dk ) is equivalent over F [x] to

and here we stop, as we have a matrix in Smith canonical form. Thus

so the invariant factors of B are the non-trivial ones of xI4 − B, i.e.

(x − 1) and (x − 1)2 (x − 2).

Also, the elementary divisors of B are

(x − 1), (x − 1)2 and (x − 2)

so the Jordan canonical form of B is

J2 (1) ⊕ J1 (1) ⊕ J1 (2).

⇔ xIn − A is equivalent to xIn − B

P −1 (xIn − A)P = xIn − P −1 AP

Proof: Problem sheet 7, Question 12.

You might also like