Smith
Smith
Smith
THEOREM 6.1
A matrix P ∈ Mn×n (F [x]) is a unit in Mn×n (F [x]) if and only if det P =
c, where c ∈ F and c 6= 0.
proof
“only if”. Suppose P is a unit. Then P Q = In and
However det P and det Q belong to F [x], so both are in fact non–zero ele-
ments of F .
“if”. Suppose P ∈ Mn×n (F [x]) satisfies det P = c, where c ∈ F and
c 6= 0. Then
P adj P = (det P )In = cIn .
Hence P Q = In , where Q = c−1 adj P ∈ Mn×n (F [x]). Hence P is a unit in
Mn×n (F [x]).
EXAMPLE
6.1
1 + x −x
P = ∈ M2×2 (F [x]) is a unit, as det P = 1.
x 1−x
THEOREM 6.2
Elementary row matrices in Mn×n (F [x]) are units:
(i) Eij : interchange rows i and j of In ;
(ii) Ei (t): multiply row i of In by t ∈ F, t 6= 0;
(iii) Eij (f ): add f times row j of In to row i, f ∈ F [x].
In fact det Eij = −1; det Ei (t) = t; det Eij (f ) = 1.
Similarly for elementary column matrices in Mn×n (F [x]):
120
Remark: It follows that a product of elementary matrices in Mn×n (F [x])
is a unit. Later we will be able to prove that the converse is also true.
DEFINITION 6.2
Let A, B ∈ Mm×n (F [x]). Then A is equivalent to B over F [x] if units
P ∈ Mm×m (F [x]) and Q ∈ Mn×n (F [x]) exist such that
P AQ = B.
THEOREM 6.3
Equivalence of matrices over F [x] defines an equivalence relation on
Mm×n (F [x]).
THEOREM 6.4
For 1 ≤ k ≤ ρ(A), we have dk (A) 6= 0. Also dk (A) divides dk+1 (A) for
1 ≤ k ≤ ρ(A) − 1.
proof
Let r = ρ(A). Then there exists an r × r non–zero minor and hence
dr (A) 6= 0. Then because each r × r minor is a linear combination over F [x]
of (r − 1) × (r − 1) minors of A, it follows that some (r − 1) × (r − 1) minor of
A is also non–zero and hence dr−1 (A) 6= 0; also dr−1 (A) divides each minor
of size r − 1 and consequently divides each minor of size r; hence dr−1 (A)
divides dr (A), the gcd of all minors of size r. This argument can be repeated
with r replaced by r − 1 and so on.
THEOREM 6.5
Let A, B ∈ Mm×n (F [x]). Then if A is equivalent to B over F [x], we
have
121
(ii) dk (A) = dk (B) for 1 ≤ k ≤ r.
proof
Suppose P AQ = B, where P and Q are units. First consider P A. The
rows of P A are linear combinations over F [x] of the rows of A, so it follows
that each k × k minor of P A is a linear combination of the k × k minors of
A. Similarly each column of (P A)Q is a linear combinations over F [x] of
the columns of P A, so it follows that each k × k minor of B = (P A)Q is a
linear combination over F [x] of the k × k minors of P A and consequently of
the k × k minors of A.
It follows that all minors of B with size k > ρ(A) must be zero and hence
ρ(B) ≤ ρ(A). However B is equivalent to A, so we deduce that ρ(A) ≤ ρ(B)
and hence ρ(A) = ρ(B).
Also dk (B) is a linear combination over F [x] of all k × k minors of B
and hence of all k × k minors of A. Hence dk (A)|dk (B) and by symmetry,
dk (B)|dk (A). Hence dk (A) = dk (B) if 1 ≤ k ≤ r.
DEFINITION 6.3
The matrix D is said to be in Smith canonical form.
proof
This is presented in the form of an algorithm which is in fact used by
Cmat to find unit matrices P and Q such that P AQ is in Smith canonical
form.
122
Our account is based on that in the book “Rings, Modules and Linear
Algebra,” by B. Hartley and T.O. Hawkes.
We describe a sequence of elementary row and column operations over
F [x], which when applied to a matrix A with a11 6= 0 either yields a matrix
C of the form
f1 0 · · · 0
0
C= .
. C ∗
.
0
where f1 is monic and divides every element of C ∗ , or else yields a matrix
B in which b11 6= 0 and
a1j = a11 q + b,
by Euclid’s division theorem, where b 6= 0 and deg b < deg a11 . Subtract q
times column 1 from column j and then interchange columns 1 and j. This
yields a matrix of type B mentioned above.
123
Case 2. ∃ ai1 in column 1 with a11 not dividing ai1 . Proceed as in Case 1,
operating on rows rather than columns, again reaching a matrix of type B.
Case 3. Here a11 divides every element in the first row and first column.
Then by subtracting suitable multiples of column 1 from the other columns,
we can replace all the entries in the first row other than a11 by 0. Similarly
for the first column. We then have a matrix of the form
e11 0 · · · 0
0
E= . .
. E ∗
.
0
EXAMPLE 6.2
(of the Smith Canonical Form)
1 + x2
x
A=
x 1+x
We want D = P AQ in Smith canonical form. So we construct the augmented
matrix
work on rows work on columns
↓ ↓
1 0 1 + x2 x 1 0
0 1 x 1+x 0 1
R1 → R1 − xR2 ⇒ 1 −x 1 −x2 1 0
0 1 x 1+x 0 1
C2 → C2 + x2 C1 ⇒ 1 −x 1 0 1 x2
0 1 x 1 + x + x3 0 1
R2 → R2 − xR1 ⇒ 1 −x 1 0 1 x2
−x 1 + x2 0 1 + x + x3 0 1
↑ ↑ ↑
P D Q
d2 (A)
f1 = d1 (A), f2 = .
d1 (A)
124
6.2.1 Uniqueness of the Smith Canonical Form
THEOREM 6.7
Every matrix A ∈ Mm×n (F [x]) is equivalent to precisely one matrix is
Smith canonical form.
dk (A) = dk (B) = f1 f2 . . . fk
f1 = d1 (A)
d2 (A)
f2 =
d1 (A)
..
.
dr (A)
fr = .
dr−1 (A)
1, 1, . . . , 1, d1 , . . . , ds
| {z }
n−s
125
LEMMA 6.1
The Smith canonical form of xIn − C(d) where d is a monic polynomial
of degree n is
diag (1, . . . , 1, d).
| {z }
n−1
R1 → R1 + xR2 + x2 R3 + · · · + xn−1 Rn
to obtain
0 0 d
−1 x · · · a1
0 −1 a2
.. . . ..
. . .
x an−2
0 · · · −1 x + an−1
(think about it!) and then column operations
and then
yielding
0 0 ... 0 d
−1 0 0
0 −1 .
.. ..
. .
0 · · · −1 0
126
Trivially, elementary operations now form the matrix
THEOREM 6.8
Let B ∈ Mn×n (F ). Then if the invariant factors of B are d1 , . . . , ds , then
the invariant factors of xIn − B are
1, . . . , 1, d , d , . . . , ds .
| {z } 1 2
n−s
Then
s
M
P −1 (xIn − B)P = xIn − C(dk )
k=1
s
M
= (xImk − C(dk )) where mk = deg dk .
k=1
EXAMPLE 6.3
Find the invariant factors of
2 0 0 0
−1 1 0 0
B= ∈ M4×4 (Q)
0 −1 0 −1
1 1 1 2
127
by finding the Smith canonical form of xI4 − B.
Solution:
x−2 0 0 0
1 x − 1 0 0
xI4 − B =
0
1 x 1
−1 −1 −1 x − 2
We start off with the row operations
R1 → R1 − (x − 2)R2
R1 ↔ R2
R4 → R4 + R1
and get
1 x−1 0 0
0 −(x − 1)(x − 2) 0 0
0 1 x 1
0 x−2 −1 x − 2
1 0 0 0
0 −(x − 1)(x − 2) 0 0
(column ops.) ⇒
0
1 x 1
0 x − 2 −1 x − 2
1 0 0 0
0 1 x 1
⇒
0 −(x − 1)(x − 2) 0
0
0 x−2 −1 x − 2
1 0 0 0
0 1 x 1
⇒ 0 0 x(x − 1)(x − 2) (x − 1)(x − 2)
0 0 −1 − x(x − 2) 0
2
{= −(x − 1) }
1 0 0 0
0 1 0 0
⇒ 0 0 x(x − 1)(x − 2) (x − 1)(x − 2) .
0 0 −(x − 1)2 0
Now, for brevity, we work just on the 2 × 2 block in the bottom right corner:
(x − 1)(x − 2) x(x − 1)(x − 2)
⇒
0 −(x − 1)2
128
(x − 1)(x − 2) 0
C2 → C2 − xC1 ⇒
0 −(x − 1)2
(x − 1)(x − 2) (x − 1)2
R1 → R1 + R2 ⇒
0 −(x − 1)2
(x − 1)(x − 2) x−1
C2 → C2 − C1 ⇒
0 −(x − 1)2
x−1 (x − 1)(x − 2)
C1 ↔ C2 ⇒
−(x − 1)2 0
x−1 0
C2 → C2 − (x − 2)C1 ⇒
−(x − 1) (x − 2)(x − 1)2
2
x−1 0
R2 → R2 + (x − 1)R1 ⇒
0 (x − 2)(x − 1)2
THEOREM 6.9
Let A, B ∈ Mn×n (F ). Then A is similar to B
proof
129
⇒ Obvious. If P −1 AP = B, P ∈ Mn×n (F ) then
⇐ If xIn − A and xIn − B are equivalent over F [x], then they have the
same invariant factors and so have the same non-trivial invariant fac-
tors. That is, A and B have the same invariant factors and hence are
similar.
Note: It is possible to start from xIn − A and find P ∈ Mn×n (F ) such that
s
M
P −1 AP = C(dk )
k=1
where
P1 (xIn − B)Q1 = diag (1, . . . , 1, d1 , . . . , ds ).
(See Perlis, Theory of matrices, p. 144, Corollary 8–1 and p. 137, Theorem
7–9.)
THEOREM 6.10
Every unit in Mn×n (F [x]) is a product of elementary row and column
matrices.
130