MTH6140 Linear Algebra II: Notes 6 25th November 2010
6 Quadratic forms
A lot of applications of mathematics involve dealing with quadratic forms: you meet
them in statistics (analysis of variance) and mechanics (energy of rotating bodies),
among other places. In this section we begin the study of quadratic forms.
6.1 Quadratic forms

Throughout this chapter we assume that the characteristic of the field K is not equal to 2. A quadratic form in the variables x_1, ..., x_n over K is a polynomial

Q(x_1, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j

in the variables in which every term has degree two (that is, is a multiple of x_i x_j for some i, j), and each A_{ij} belongs to K.
In the above representation of a quadratic form, we see that if i ≠ j, then the term in x_i x_j comes twice, so that the coefficient of x_i x_j is A_{ij} + A_{ji}. We are free to choose any two values for A_{ij} and A_{ji} as long as they have the right sum; but we will always make the choice so that the two values are equal. That is, to obtain a term c x_i x_j, we take A_{ij} = A_{ji} = c/2. (This is why we require that the characteristic of the field is not 2.)
Any quadratic form is thus represented by a symmetric matrix A with (i, j) entry A_{ij} (that is, a matrix satisfying A = A^T). This is the third job of matrices in linear algebra: Symmetric matrices represent quadratic forms.
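As a concrete illustration (this sketch is ours, not part of the original notes), the following Python fragment builds the symmetric matrix of a form from its coefficients; the function name symmetric_matrix and the dictionary encoding of the coefficients are our own choices.

    from fractions import Fraction

    def symmetric_matrix(coeffs, n):
        """Build the symmetric matrix A of a quadratic form.

        coeffs maps (i, j) with i <= j to the coefficient of x_i x_j
        (0-based indices).  A diagonal term c*x_i^2 gives A[i][i] = c;
        a cross term c*x_i*x_j is split as A[i][j] = A[j][i] = c/2,
        exactly as described above."""
        A = [[Fraction(0)] * n for _ in range(n)]
        for (i, j), c in coeffs.items():
            c = Fraction(c)
            if i == j:
                A[i][i] = c
            else:
                A[i][j] = A[j][i] = c / 2
        return A

    # Q(x, y, z) = x^2 + 2xy + 4xz + y^2 + 4z^2  (the form of Example 6.1 below)
    A = symmetric_matrix({(0, 0): 1, (0, 1): 2, (0, 2): 4, (1, 1): 1, (2, 2): 4}, 3)
    # A == [[1, 1, 2], [1, 1, 0], [2, 0, 4]]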
We think of a quadratic form as defined above as being a function from the vector space K^n to the field K. It is clear from the definition that

Q(x_1, \ldots, x_n) = v^T A v, \quad \text{where } v = (x_1, \ldots, x_n)^T.
Now if we change the basis for V, we obtain a different representation for the same function Q. The effect of a change of basis is a linear substitution v = Pv' on the variables, where P is the transition matrix between the bases. Then

Q = v^T A v = (Pv')^T A (Pv') = (v')^T (P^T A P) v',

and thus we have

Proposition 6.1 A basis change with transition matrix P replaces the symmetric matrix A representing a quadratic form by the matrix P^T A P.
Definition Two symmetric matrices A, A' are congruent if A' = P^T A P for some invertible matrix P.

Proposition 6.2 Two symmetric matrices are congruent if and only if they represent the same quadratic form with respect to different bases.
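The following numpy sketch (our illustration, not part of the notes) evaluates Q as v^T A v and checks Proposition 6.1 on a hypothetical transition matrix P: the value of the form is unchanged when we pass to new coordinates and replace A by P^T A P.

    import numpy as np

    A = np.array([[1., 1., 2.],
                  [1., 1., 0.],
                  [2., 0., 4.]])     # the symmetric matrix built in the sketch above

    def Q(v, A):
        return v @ A @ v             # Q(v) = v^T A v

    P = np.array([[1., 0., 1.],
                  [0., 1., 0.],
                  [0., 0., 1.]])     # a hypothetical invertible transition matrix

    vp = np.array([2., -1., 3.])     # coordinates in the new basis
    assert np.isclose(Q(P @ vp, A), Q(vp, P.T @ A @ P))   # same value either way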
Our next job, as you may expect, is to find a canonical form for symmetric matrices
under congruence; that is, a choice of basis so that a quadratic form has a particularly
simple shape. We will see that the answer to this question depends on the field over
which we work. We will solve this problem for the fields of real and complex numbers.
6.2 Reduction of quadratic forms
Even if we cannot find a canonical form for quadratic forms, we can simplify them very greatly.

Theorem Any quadratic form in the variables x_1, ..., x_n, over a field K whose characteristic is not 2, can be brought by a linear substitution to the form

\alpha_1 x_1^2 + \cdots + \alpha_n x_n^2.

Proof We use induction on n. Write

Q = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j,

where A_{ij} = A_{ji} for i ≠ j.
Case 1: Assume that A_{ii} ≠ 0 for some i. By a permutation of the variables (which is certainly a linear substitution), we can assume that A_{11} ≠ 0. Let

y_1 = x_1 + \sum_{i=2}^{n} (A_{1i}/A_{11}) x_i.

Then we have

A_{11} y_1^2 = A_{11} x_1^2 + 2 \sum_{i=2}^{n} A_{1i} x_1 x_i + Q'(x_2, \ldots, x_n),

where Q' is a quadratic form in x_2, ..., x_n. That is, all the terms involving x_1 in Q have been incorporated into A_{11} y_1^2. So we have

Q = A_{11} y_1^2 + Q''(x_2, \ldots, x_n)

for some quadratic form Q'' in x_2, ..., x_n. By the inductive hypothesis, a further linear substitution brings Q'' to the form \alpha_2 x_2^2 + \cdots + \alpha_n x_n^2, and we are done (with \alpha_1 = A_{11}).
Case 2: All A_{ii} are zero, but A_{ij} ≠ 0 for some i ≠ j. Now

x_i x_j = \tfrac{1}{4}\left( (x_i + x_j)^2 - (x_i - x_j)^2 \right),

so taking x'_i = \tfrac{1}{2}(x_i + x_j) and x'_j = \tfrac{1}{2}(x_i - x_j), we obtain a new form for Q which does contain a non-zero diagonal term. Now we apply the method of Case 1.

Case 3: All A_{ij} are zero. Now Q is the zero form, and there is nothing to prove: take \alpha_1 = \cdots = \alpha_n = 0.
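The proof above is effectively an algorithm. Here is a minimal Python sketch of it (the name reduce_form and the use of exact rational arithmetic via Fraction are our own choices): it returns an invertible P with P^T A P diagonal, handling the three cases as in the proof.

    from fractions import Fraction

    def reduce_form(A):
        """Return (P, D) with P invertible and D = P^T A P diagonal.

        A minimal sketch of the inductive proof, over the rationals."""
        n = len(A)
        A = [[Fraction(x) for x in row] for row in A]
        P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

        def col_op(j, k, c):
            # A := E^T A E and P := P E, where E = I + c*e_{kj}; this adds
            # c times column k to column j, and the same for the rows.
            for i in range(n):
                A[i][j] += c * A[i][k]
            for i in range(n):
                A[j][i] += c * A[k][i]
            for i in range(n):
                P[i][j] += c * P[i][k]

        def swap(k, j):
            # A := E^T A E and P := P E for the permutation E swapping k and j.
            A[k], A[j] = A[j], A[k]
            for row in A:
                row[k], row[j] = row[j], row[k]
            for row in P:
                row[k], row[j] = row[j], row[k]

        for k in range(n):
            if A[k][k] == 0:
                j = next((j for j in range(k + 1, n) if A[j][j] != 0), None)
                if j is not None:
                    swap(k, j)                    # Case 1 after a permutation
                else:
                    j = next((j for j in range(k + 1, n) if A[k][j] != 0), None)
                    if j is None:
                        continue                  # Case 3: x_k no longer occurs
                    col_op(k, j, Fraction(1))     # Case 2: new A[k][k] = 2*A[k][j] != 0
            for j in range(k + 1, n):
                if A[k][j] != 0:                  # Case 1: complete the square
                    col_op(j, k, -A[k][j] / A[k][k])
        return P, A

    A0 = [[1, 1, 2], [1, 1, 0], [2, 0, 4]]        # the matrix of Example 6.1 below
    P, D = reduce_form(A0)
    # D is diag(1, -1, 4), i.e. the form  x^2 - y^2 + 4z^2  in the new variables.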
Example 6.1 Consider the quadratic form Q(x, y, z) = x^2 + 2xy + 4xz + y^2 + 4z^2. We have

(x + y + 2z)^2 = x^2 + 2xy + 4xz + y^2 + 4z^2 + 4yz,

and so

Q = (x + y + 2z)^2 - 4yz
  = (x + y + 2z)^2 + (y - z)^2 - (y + z)^2
  = X^2 + Y^2 - Z^2,

where X = x + y + 2z, Y = y - z and Z = y + z.
How do we find an invertible matrix P such that P^T A P = A'? Here is how. If v is the vector consisting of the ‘original’ variables, so v = (x, y, z)^T, and v' is the vector consisting of the ‘new’ variables, so v' = (X, Y, Z)^T, then P is defined by v = Pv' (see the argument before Proposition 6.1).
Now in the current example we have

\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix},

that is, v' = Mv, where M is the matrix displayed; so P = M^{-1}, and then P^T A P = diag(1, 1, -1), the matrix A' of the form X^2 + Y^2 - Z^2.
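A quick numerical check of this (our own illustration, using numpy):

    import numpy as np

    A = np.array([[1., 1., 2.],
                  [1., 1., 0.],
                  [2., 0., 4.]])       # matrix of Q(x,y,z) = x^2+2xy+4xz+y^2+4z^2
    M = np.array([[1., 1., 2.],
                  [0., 1., -1.],
                  [0., 1., 1.]])       # v' = M v, read off from the display above
    P = np.linalg.inv(M)               # v = P v'
    print(np.round(P.T @ A @ P))       # diag(1, 1, -1): the form X^2 + Y^2 - Z^2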
We have now seen that any quadratic form can be reduced to the shape

\alpha_1 x_1^2 + \cdots + \alpha_n x_n^2

by a linear substitution. But this is still not a “canonical form for congruence”. For example, if y_1 = x_1/c, then \alpha_1 x_1^2 = (\alpha_1 c^2) y_1^2. In other words, we can multiply any \alpha_i by any factor which is a perfect square in K.
Over the complex numbers C, every element has a square root. Suppose that \alpha_1, ..., \alpha_r ≠ 0, and \alpha_{r+1} = \cdots = \alpha_n = 0. Putting

y_i = \begin{cases} (\sqrt{\alpha_i})\, x_i & \text{for } 1 \le i \le r, \\ x_i & \text{for } r+1 \le i \le n, \end{cases}

we have

Q = y_1^2 + \cdots + y_r^2.
We will see later that r is an “invariant” of Q: however we do the reduction, we arrive
at the same value of r.
Over the real numbers R, things are not much worse. Since any positive real number has a square root, we may suppose that \alpha_1, ..., \alpha_s > 0, \alpha_{s+1}, ..., \alpha_{s+t} < 0, and \alpha_{s+t+1} = \cdots = \alpha_n = 0. Now putting

y_i = \begin{cases} (\sqrt{\alpha_i})\, x_i & \text{for } 1 \le i \le s, \\ (\sqrt{-\alpha_i})\, x_i & \text{for } s+1 \le i \le s+t, \\ x_i & \text{for } s+t+1 \le i \le n, \end{cases}

we get

Q = y_1^2 + \cdots + y_s^2 - y_{s+1}^2 - \cdots - y_{s+t}^2.
Again, we will see later that s and t don’t depend on how we do the reduction. [This
is the theorem known as Sylvester’s Law of Inertia.]
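In computational terms (our own illustration, not the notes' method), both normalisations are just further diagonal congruences; here the \alpha_i come from a hypothetical reduction.

    import numpy as np

    alphas = np.array([1., -1., 4.])   # hypothetical diagonal entries alpha_i
    nz = alphas != 0

    # Over C: y_i = sqrt(alpha_i) x_i, a congruence by S = diag(1/sqrt(alpha_i))
    c = np.ones_like(alphas, dtype=complex)
    c[nz] = 1 / np.sqrt(alphas[nz].astype(complex))
    print(np.real_if_close(np.diag(c) @ np.diag(alphas) @ np.diag(c)))
    # identity matrix: here r = 3

    # Over R: y_i = sqrt(|alpha_i|) x_i, leaving only the signs +1 / -1
    r = np.ones_like(alphas)
    r[nz] = 1 / np.sqrt(np.abs(alphas[nz]))
    print(np.diag(r) @ np.diag(alphas) @ np.diag(r))
    # diag(1, -1, 1): after reordering the variables, s = 2 and t = 1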
6.3 Linear forms and dual space
Now we begin dealing with quadratic forms in a more abstract way. We begin with
linear forms, that is, functions of degree 1. The definition is simple:
Definition 6.3 Let V be a vector space over K. A linear form on V is a linear map from V to K, where K is regarded as a 1-dimensional vector space over K: that is, it is a function f from V to K satisfying

f(v + w) = f(v) + f(w), \quad f(cv) = c\, f(v)

for all v, w ∈ V and c ∈ K.
Definition 6.4 Linear forms can be added and multiplied by scalars in the obvious way:

(f_1 + f_2)(v) = f_1(v) + f_2(v), \quad (cf)(v) = c\, f(v).

So they form a vector space, which is called the dual space of V and is denoted by V*.

Let (v_1, ..., v_n) be a basis for V. For any scalars a_1, ..., a_n ∈ K there is a unique linear form f on V with f(v_i) = a_i for i = 1, ..., n; it is given by

f(c_1 v_1 + \cdots + c_n v_n) = a_1 c_1 + \cdots + a_n c_n.
Now let f_i be the linear map defined by the rule that

f_i(v_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}
Then (f_1, ..., f_n) form a basis for V*; indeed, the linear form f defined in the preceding paragraph is a_1 f_1 + \cdots + a_n f_n. This basis is called the dual basis of V* corresponding to the given basis for V. Since it has n elements, we see that dim(V*) = n = dim(V).
Definition 6.5 The Kronecker delta δ_{ij} for i, j ∈ {1, ..., n} is defined by the rule that

\delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j. \end{cases}

Note that δ_{ij} is the (i, j) entry of the identity matrix. Now, if (v_1, ..., v_n) is a basis for V, then the dual basis for the dual space V* is the basis (f_1, ..., f_n) satisfying f_i(v_j) = δ_{ij}.
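Concretely, for V = R^n (a numerical illustration of ours): if the basis vectors v_j are the columns of an invertible matrix B, and a linear form is represented by a row vector, then the dual basis consists of the rows of B^{-1}.

    import numpy as np

    B = np.array([[1., 1., 0.],
                  [0., 1., 1.],
                  [0., 0., 1.]])   # basis vectors v_1, v_2, v_3 as columns

    F = np.linalg.inv(B)            # row i of F represents the linear form f_i
    print(F @ B)                    # identity matrix: f_i(v_j) = delta_ij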
There are some simple properties of the Kronecker delta with respect to summation. For example,

\sum_{i=1}^{n} \delta_{ij} a_i = a_j

for fixed j ∈ {1, ..., n}. This is because all terms of the sum except the term i = j are zero.
Proposition 6.5 Let B and B' be bases for V, and B* and (B')* the dual bases of the dual space. Then

P_{B*,(B')*} = \left( (P_{B,B'})^T \right)^{-1}.
Proof Write B = (v_1, ..., v_n) and B' = (v'_1, ..., v'_n), and let B* = (f_1, ..., f_n) and (B')* = (f'_1, ..., f'_n) be the corresponding dual bases. If P = P_{B,B'} has (i, j) entry p_{ij}, and Q = P_{B*,(B')*} has (i, j) entry q_{ij}, we have

v'_i = \sum_{k=1}^{n} p_{ki} v_k, \quad f'_j = \sum_{l=1}^{n} q_{lj} f_l,

and so

\delta_{ij} = f'_j(v'_i)
            = \left( \sum_{l=1}^{n} q_{lj} f_l \right) \left( \sum_{k=1}^{n} p_{ki} v_k \right)
            = \sum_{l=1}^{n} \sum_{k=1}^{n} q_{lj} p_{ki} f_l(v_k)
            = \sum_{l=1}^{n} \sum_{k=1}^{n} q_{lj} p_{ki} \delta_{lk}
            = \sum_{k=1}^{n} q_{kj} p_{ki}.

In matrix form, this says that

I = Q^T P,

whence Q^T = P^{-1}, so that Q = (P^{-1})^T = (P^T)^{-1}, as required.
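A numerical sanity check of Proposition 6.5 (an illustration of ours), with a hypothetical pair of bases of R^2:

    import numpy as np

    B = np.array([[1., 1.],
                  [0., 1.]])               # old basis as columns
    P = np.array([[2., 1.],
                  [1., 1.]])               # transition matrix P = P_{B,B'}
    Bp = B @ P                             # new basis: v'_i = sum_k p_{ki} v_k

    F  = np.linalg.inv(B)                  # dual basis of B,  as rows
    Fp = np.linalg.inv(Bp)                 # dual basis of B', as rows
    Q = (Fp @ np.linalg.inv(F)).T          # f'_j = sum_l q_{lj} f_l means Fp = Q^T F
    assert np.allclose(Q, np.linalg.inv(P).T)   # Q = (P^T)^{-1}, as claimed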
Definition (a) A function b : V × V → K is a bilinear form on V if it is linear in each variable when the other is held fixed; b is symmetric if b(v, w) = b(w, v) for all v, w ∈ V.

(b) Let Q : V → K be a function. We say that Q is a quadratic form if

(i) Q(cv) = c^2 Q(v) for all c ∈ K, v ∈ V, and

(ii) the function b defined by

b(v, w) = Q(v + w) - Q(v) - Q(w)

is a (symmetric) bilinear form on V.

If we think of the prototype of a quadratic form as being the function x^2, then the first equation says (cx)^2 = c^2 x^2, while the second has the form

(x + y)^2 - x^2 - y^2 = 2xy.

The equation defining b in (ii) (which is known as the polarisation formula) says that the bilinear form is determined by the quadratic form Q. Conversely, if we know the symmetric bilinear form b, then we have

b(v, v) = Q(2v) - Q(v) - Q(v) = 4Q(v) - 2Q(v) = 2Q(v),

so that Q(v) = \tfrac{1}{2} b(v, v), and we see that the quadratic form is determined by the symmetric bilinear form. So these are equivalent objects.
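The equivalence is easy to check numerically. In this sketch (ours, not the notes'), Q comes from a symmetric matrix A, and polarisation recovers the bilinear form, which here works out to b(v, w) = 2 v^T A w.

    import numpy as np

    A = np.array([[1., 1., 2.],
                  [1., 1., 0.],
                  [2., 0., 4.]])

    def Q(v):
        return v @ A @ v                       # the quadratic form

    def b(v, w):
        return Q(v + w) - Q(v) - Q(w)          # polarisation

    v, w = np.array([1., 2., -1.]), np.array([0., 3., 1.])
    assert np.isclose(b(v, w), 2 * v @ A @ w)  # b is symmetric and bilinear
    assert np.isclose(Q(v), b(v, v) / 2)       # Q(v) = (1/2) b(v, v)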
Proposition 6.6 Any n × n complex symmetric matrix A is congruent to a matrix of the form

\begin{pmatrix} I_r & O \\ O & O \end{pmatrix}

for some r. Moreover, if A is congruent to two matrices of this form, then they have the same value of r; indeed, r = rank(A).

Proof We already saw that A is congruent to a matrix of this form. Moreover, if P is invertible, then so is P^T, and so

r = rank(P^T A P) = rank(A),

as claimed.
The next result is Sylvester’s Law of Inertia.
Theorem 6.7 Any n × n real symmetric matrix A is congruent to a matrix of the form
\begin{pmatrix} I_s & O & O \\ O & -I_t & O \\ O & O & O \end{pmatrix}

for some s, t. Moreover, if A is congruent to two matrices of this form, then they have the same values of s and of t.
Proof Again we have seen that A is congruent to a matrix of this form. Arguing as in
the complex case, we see that s + t = rank(A), and so any two matrices of this form
congruent to A have the same values of s + t. Moreover, by restricting to a subspace
on which A is invertible, we may assume without loss of generality that s + t = n.
Suppose that two different reductions give the values s, t and s', t' respectively, with s + t = s' + t' = n. Suppose (in order to obtain a contradiction) that s < s'. Now let Q be the quadratic form represented by A. Then we are told that there are linear functions y_1, ..., y_n and z_1, ..., z_n of the original variables x_1, ..., x_n of Q such that

Q = y_1^2 + \cdots + y_s^2 - y_{s+1}^2 - \cdots - y_n^2 = z_1^2 + \cdots + z_{s'}^2 - z_{s'+1}^2 - \cdots - z_n^2.

Now consider the equations

y_1 = 0, \ldots, y_s = 0, z_{s'+1} = 0, \ldots, z_n = 0,

regarded as homogeneous linear equations in the original variables x_1, ..., x_n. The number of equations is s + (n - s') = n - (s' - s) < n. According to a lemma from much earlier in the course (we used it in the proof of the Exchange Lemma!), a system of fewer than n homogeneous linear equations in n unknowns has a non-zero solution. That is, there are values of x_1, ..., x_n, not all zero, such that the variables y_1, ..., y_s and z_{s'+1}, ..., z_n are all zero.
Since y_1 = \cdots = y_s = 0, we have for these values

Q = -y_{s+1}^2 - \cdots - y_n^2 \le 0;

indeed Q < 0, since the substitution taking the x's to the y's is invertible, so not all of y_1, ..., y_n vanish, and y_1, ..., y_s are already zero. But since z_{s'+1} = \cdots = z_n = 0, we also have

Q = z_1^2 + \cdots + z_{s'}^2 \ge 0.

But this is a contradiction. So we cannot have s < s'. Similarly we cannot have s' < s either. So we must have s = s', and hence t = t', as required.
Definition The rank of a real quadratic form (or of the real symmetric matrix representing it) is s + t, and its signature is s - t, where s and t are as in Theorem 6.7.

We saw that s + t is the rank of A. Of course, both the rank and the signature are independent of how we reduce the matrix (or quadratic form); and if we know the rank and signature, we can easily recover s and t: namely, s = (rank + signature)/2 and t = (rank - signature)/2.
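For instance (our own sketch), given any diagonalisation of a real symmetric matrix, such as the one produced by the reduction sketch earlier, the rank and signature, and hence s and t, can be read off from the signs of the diagonal entries.

    import numpy as np

    d = np.array([1., -1., 4.])                # diagonal entries from a reduction
    s, t = int(np.sum(d > 0)), int(np.sum(d < 0))
    rank, signature = s + t, s - t             # here rank = 3, signature = 1
    assert s == (rank + signature) // 2 and t == (rank - signature) // 2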