The Full Pythagorean Theorem
Charles Frohman
January 1, 2010
Abstract
This note motivates a version of the generalized Pythagorean theorem, which says: if $A$ is an $n \times k$ matrix, then
$$\det(A^t A) = \sum_I \det(A_I)^2,$$
where the sum is over the $k \times k$ submatrices $A_I$ formed from the rows of $A$.
1 Introduction
The Pythagorean theorem is one of the first theorems of geometry that people learn. If a right triangle has legs of length $a$ and $b$ and its hypotenuse has length $c$, then
$$a^2 + b^2 = c^2.$$
The Playfair proof of the Pythagorean theorem is easy to explain, but somehow mysterious.
[Figure: the dissection proof, four copies of the right triangle arranged in a square of side $a + b$, leaving a square of side $c$. Behold!]
The theorem also underlies the distance formula in the plane:

[Figure: the distance between $(x_1, y_1)$ and $(x_2, y_2)$ computed from the right triangle with vertices $(x_1, y_1)$, $(x_2, y_1)$, and $(x_2, y_2)$.]
The Pythagorean theorem also has a three-dimensional analogue, illustrated by the right tetrahedron below.

[Figure: a right tetrahedron with vertices $(0, 0, 0)$, $(a, 0, 0)$, $(0, b, 0)$, and $(0, 0, c)$.]
You can imagine that the tetrahedron has three legs, which are right triangles that lie in the $xy$-plane, $xz$-plane, and $yz$-plane. The hypotenuse is the triangle having vertices $(a, 0, 0)$, $(0, b, 0)$ and $(0, 0, c)$. The sum of the squares of the areas of the three legs is
$$\frac{1}{4}a^2b^2 + \frac{1}{4}a^2c^2 + \frac{1}{4}b^2c^2.$$
The base of the hypotenuse is $\sqrt{a^2 + b^2}$ and its height is $\sqrt{\frac{a^2b^2}{a^2+b^2} + c^2}$. The area of the hypotenuse is one half of its base times its height,
$$\frac{1}{2}\sqrt{a^2b^2 + a^2c^2 + b^2c^2}.$$
Squaring this gives the sum of the squares of the areas of the legs! This computation is sometimes referred to as de Gua's theorem, after Jean Paul de Gua de Malves, an 18th-century French mathematician [2].
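As a quick numerical sanity check of de Gua's theorem, here is a short Python sketch; the leg lengths are arbitrary sample values, and numpy is the only dependency.

import numpy as np

a, b, c = 2.0, 3.0, 5.0  # arbitrary sample leg lengths

# Sum of the squares of the areas of the three legs.
legs_sq = (0.5 * a * b) ** 2 + (0.5 * a * c) ** 2 + (0.5 * b * c) ** 2

# Area of the hypotenuse face with vertices (a,0,0), (0,b,0), (0,0,c),
# computed as half the norm of the cross product of two edge vectors.
P, Q, R = np.array([a, 0, 0]), np.array([0, b, 0]), np.array([0, 0, c])
hyp_sq = (0.5 * np.linalg.norm(np.cross(Q - P, R - P))) ** 2

print(np.isclose(hyp_sq, legs_sq))  # True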
The statement that this tetrahedron is right boils down to the legs of the tetrahedron being the orthogonal projections of the hypotenuse into the coordinate hyperplanes. The shapes we have been using so far, triangles and tetrahedra, are examples of simplices. Simplices are well and good, but for the sake of discussing the Pythagorean theorem in the language of linear algebra, parallelepipeds are better.
Suppose that $\vec{v} = (a, b, c)$ and $\vec{w} = (d, e, f)$ are vectors in space. The parallelogram spanned by $\vec{v}$ and $\vec{w}$ is everything of the form $s\vec{v} + t\vec{w}$ where $s, t \in [0, 1]$. The orthogonal projection of the parallelogram into the $xy$-plane is spanned by $(a, b)$ and $(d, e)$. Using the standard area formula via determinants, its area is $|ae - bd|$. Its orthogonal projection into the $yz$-plane is spanned by $(b, c)$ and $(e, f)$ and has area $|bf - ce|$. Finally, its projection into the $xz$-plane is spanned by $(a, c)$ and $(d, f)$ and has area $|af - cd|$. The Pythagorean theorem says that the square of the area of the parallelogram in space is the sum of the squares of the areas of the projections into the coordinate hyperplanes.
That is, the area $\mathcal{A}$ of the parallelogram satisfies
$$\mathcal{A}^2 = (ae - bd)^2 + (bf - ce)^2 + (af - cd)^2,$$
but this is just the norm squared of the cross product $(a, b, c) \times (d, e, f)$, which confirms a well-known formula for the area of a parallelogram in space.
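A companion sketch in the same spirit, with two arbitrary sample vectors, checks that the squared area of the parallelogram equals the sum of the squared areas of its three projections.

import numpy as np

v = np.array([1.0, 2.0, 3.0])  # arbitrary sample vectors
w = np.array([4.0, 5.0, 6.0])

# Squared areas of the projections into the xy-, yz-, and xz-planes.
proj_sq = ((v[0] * w[1] - v[1] * w[0]) ** 2
           + (v[1] * w[2] - v[2] * w[1]) ** 2
           + (v[0] * w[2] - v[2] * w[0]) ** 2)

# Squared area of the parallelogram: the norm squared of v x w.
area_sq = np.linalg.norm(np.cross(v, w)) ** 2

print(np.isclose(area_sq, proj_sq))  # True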
In general, the parallelepiped spanned by vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k$ is everything of the form $\sum_{i=1}^{k} \lambda_i \vec{v}_i$ where the $\lambda_i$ vary over the unit interval $[0, 1]$. The Pythagorean theorem says that if $P$ is a parallelepiped in $\mathbb{R}^n$ spanned by $k$ vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k$, then the square of the $k$-dimensional content of $P$ is the sum of the squares of the $k$-dimensional contents of the orthogonal projections of $P$ into the $k$-dimensional coordinate hyperplanes in $\mathbb{R}^n$. The point of this note is to state this theorem in linear algebraic terms, and to prove it.
2 Content
Let $V, W$ be inner product spaces, and $L : V \to W$ a linear map. We can restrict $L$ to get
$$L : \ker L^{\perp} \to \mathrm{im}\, L,$$
where by $\ker L^{\perp}$ we mean the subspace of $V$ made up of all vectors that are perpendicular to the kernel of $L$, and by $\mathrm{im}\, L$ we mean the image of $L$. As $\ker L^{\perp}$ and $\mathrm{im}\, L$ are subspaces of inner product spaces, they are themselves inner product spaces. Hence we can choose orthonormal bases for them and represent $L$ as a matrix $l_{ij}$ with respect to those bases. The matrix $l_{ij}$ is square. The content of $L$, denoted $c(L)$, is the absolute value of the determinant of any such matrix $l_{ij}$.
The content is not as well behaved as the determinant; for instance, it doesn't have a sign. Also, the content of the composition of two linear maps is not in general the product of their contents. However:
Proposition 1. If $L : V \to W$ and $M : W \to U$ are linear maps of inner product spaces and $\mathrm{im}\, L = \ker M^{\perp}$, then $c(M \circ L) = c(M)c(L)$.
Proof. Since $\mathrm{im}\, L = \ker M^{\perp}$, one of the orthonormal bases you use to compute $c(L)$ can be chosen to coincide with one of the orthonormal bases used to compute $c(M)$. If $l_{ij}$ and $m_{jk}$ are the matrices that represent $L$ and $M$ with respect to this choice, then the matrix representing $M \circ L$ is the product of the matrices $m_{jk}$ and $l_{ij}$. The determinant of the product is the product of the determinants.
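Numerically, the content of a matrix can be computed as the product of its nonzero singular values, which agrees with the definition above. The following sketch (the helper name and sample dimensions are my own choices) checks Proposition 1 in a case where $\mathrm{im}\, L = \ker M^{\perp}$; taking $M = CL^t$ with $C$ invertible forces $\ker M = (\mathrm{im}\, L)^{\perp}$.

import numpy as np

def content(T, tol=1e-12):
    # |det| of T restricted to (ker T)^perp -> im T:
    # the product of the nonzero singular values of T.
    s = np.linalg.svd(T, compute_uv=False)
    return np.prod(s[s > tol])

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 2))  # L : R^2 -> R^3, injective with probability 1
C = rng.standard_normal((2, 2))  # invertible with probability 1
M = C @ L.T                      # ker M = ker L^t = (im L)^perp

print(np.isclose(content(M @ L), content(M) * content(L)))  # True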
Proposition 2. For any linear map of inner product spaces $L : V \to W$,
$$c(L) = c(L^*).$$

Using Proposition 1, and the fact that the image of $L^*$ is the perpendicular of the kernel of $L$, we arrive at $c(L \circ L^*) = c(L)^2$, and using the fact that $L^{**} = L$, we have $c(L^* \circ L) = c(L)^2$.
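These identities are easy to spot-check numerically, reusing the singular-value content helper from the previous sketch:

import numpy as np

def content(T, tol=1e-12):
    # product of the nonzero singular values, as before
    s = np.linalg.svd(T, compute_uv=False)
    return np.prod(s[s > tol])

A = np.random.default_rng(1).standard_normal((4, 2))  # A : R^2 -> R^4
print(np.isclose(content(A), content(A.T)))           # c(L) = c(L*)
print(np.isclose(content(A.T @ A), content(A) ** 2))  # c(L* o L) = c(L)^2
print(np.isclose(content(A @ A.T), content(A) ** 2))  # c(L o L*) = c(L)^2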
Suppose $f$ is a differentiable map from a region with coordinates $x^1, \ldots, x^k$ into an $n$-dimensional Riemannian manifold $N$, and let $g(p)$ be the $k \times k$ matrix with entries
$$g_{ij}(p) = \left\langle df_p\!\left(\frac{\partial}{\partial x^i}\Big|_p\right), df_p\!\left(\frac{\partial}{\partial x^j}\Big|_p\right) \right\rangle.$$
You can think of $\sqrt{\det g(p)}$ as the $k$-dimensional content of the parallelepiped spanned by the vectors
$$df_p\!\left(\frac{\partial}{\partial x^1}\Big|_p\right), \ldots, df_p\!\left(\frac{\partial}{\partial x^k}\Big|_p\right).$$
Choose an orthonormal basis for $T_{f(p)}N$ and let $\vec{v}_i$ be the column vector representing
$$df_p\!\left(\frac{\partial}{\partial x^i}\Big|_p\right)$$
with respect to this basis. Let $A$ be the $n \times k$ matrix whose columns are the $\vec{v}_i$. Notice that $A^t A = g$, so
$$c(A) = \sqrt{\det(g)}.$$
The Pythagorean theorem can now be stated in linear algebraic terms.

Theorem 1. If $A$ is an $n \times k$ matrix, then
$$\det(A^t A) = \sum_{I \subset \{1,\ldots,n\},\, |I| = k} \det(A_I)^2,$$
where $A_I$ is the $k \times k$ matrix whose rows are the rows of $A$ indexed by $I$, taken in order.

That is, the square of the content of the parallelepiped spanned by the columns of $A$ is equal to the sum of the squares of the contents of the orthogonal projections of that parallelepiped into the $k$-dimensional coordinate hyperplanes. We will prove this theorem after we develop some vocabulary for manipulating determinants.
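Before developing that vocabulary, here is a direct numerical check of Theorem 1 (Python with numpy and itertools; the matrix and its dimensions are arbitrary sample choices).

import numpy as np
from itertools import combinations

n, k = 5, 3
A = np.random.default_rng(2).standard_normal((n, k))  # arbitrary n x k matrix

lhs = np.linalg.det(A.T @ A)

# Sum of the squared determinants of the k x k row-submatrices A_I.
rhs = sum(np.linalg.det(A[list(I), :]) ** 2
          for I in combinations(range(n), k))

print(np.isclose(lhs, rhs))  # True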
3 Exterior Algebra

The exterior algebra $\Lambda(V)$ of a vector space $V$ over a field $k$ is the quotient of the tensor algebra over $k$ on $V$ by the relations that $v \wedge v = 0$ for every $v \in V$, where we are denoting the multiplication by a wedge [1]. It is an elementary fact that the relation $v \wedge v = 0$ for all $v$ implies that $v \wedge w = -w \wedge v$ for all $v$ and $w$.
We will be working with inner product spaces. Suppose that $e_i$, $i \in \{1, \ldots, n\}$, is an orthonormal basis of $V$. If $I = \{i_1, i_2, \ldots, i_k\} \subset \{1, 2, \ldots, n\}$ with $i_1 < i_2 < \ldots < i_k$, let
$$e_I = e_{i_1} \wedge e_{i_2} \wedge \cdots \wedge e_{i_k}.$$
The $e_I$, as $I$ ranges over the subsets with $k$ elements, form an orthonormal basis of $\Lambda^k(V)$. A linear map $L : V \to W$ induces linear maps $\Lambda^i(L) : \Lambda^i(V) \to \Lambda^i(W)$ determined by
$$\Lambda^i(L)(v_1 \wedge \cdots \wedge v_i) = L(v_1) \wedge \cdots \wedge L(v_i).$$
This operation is functorial, meaning that
$$\Lambda^i(\mathrm{Id}_V) = \mathrm{Id}_{\Lambda^i(V)},$$
and
$$\Lambda^i(M \circ L) = \Lambda^i(M) \circ \Lambda^i(L),$$
where $L : V \to W$ and $M : W \to U$ are any linear mappings between inner product spaces.
Proposition 3. If we choose orthonormal bases $e_j$ for $V$ and $f_i$ for $W$, and let $l_{ij}$ be the matrix of $L$ with respect to these bases, then for $|J| = i$,
$$\Lambda^i(L)(e_J) = \sum_{|I| = i} M_{IJ} f_I,$$
where $M_{IJ}$ is the determinant of the $i \times i$ submatrix of $l_{ij}$ whose rows and columns come respectively from the sets $I$ and $J$, in order.
Proof. Notice that $L(e_j) = \sum_i l_{ij} f_i$. Expanding,
$$\Lambda^i(L)(e_J) = L(e_{j_1}) \wedge \cdots \wedge L(e_{j_i}) = \sum_{|I|=i} \left( \sum_{\sigma \in S_i} \mathrm{sgn}(\sigma) \prod_{m=1}^{i} l_{i_{\sigma(m)} j_m} \right) f_I,$$
where $S_i$ denotes the symmetric group on $i$ letters, and $\mathrm{sgn}(\sigma)$ is the sign of the permutation $\sigma$. Notice that the sum over $S_i$ of the signed products is a classical formula for the determinant, so this reduces to
$$\Lambda^i(L)(e_J) = \sum_{|I|=i} M_{IJ} f_I,$$
where $M_{IJ}$ is the determinant of the minor whose entries are indexed by $ij$ with $i \in I$ and $j \in J$.
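In matrix terms, Proposition 3 says the matrix of $\Lambda^i(L)$ is the $i$-th compound matrix of $l_{ij}$. A sketch (the helper name compound is my own) computes it directly from minors and checks the functoriality $\Lambda^i(M \circ L) = \Lambda^i(M) \circ \Lambda^i(L)$, which is Cauchy-Binet in disguise.

import numpy as np
from itertools import combinations

def compound(T, i):
    # Matrix of Lambda^i(T): the (I, J) entry is the determinant of the
    # i x i submatrix of T with rows I and columns J, in order.
    rows = list(combinations(range(T.shape[0]), i))
    cols = list(combinations(range(T.shape[1]), i))
    return np.array([[np.linalg.det(T[np.ix_(I, J)]) for J in cols]
                     for I in rows])

rng = np.random.default_rng(3)
L = rng.standard_normal((4, 3))  # L : R^3 -> R^4
M = rng.standard_normal((5, 4))  # M : R^4 -> R^5

print(np.allclose(compound(M @ L, 2), compound(M, 2) @ compound(L, 2)))  # True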
Now suppose that $\dim(V) = \dim(W) = k$. Notice $\dim \Lambda^k(V) = \dim \Lambda^k(W) = 1$, and
$$\Lambda^k(L)(e_{\{1,\ldots,k\}}) = \det(l_{ij})\, f_{\{1,\ldots,k\}}.$$
Proposition 4. If $L : V \to W$ is a linear map of inner product spaces, then the map $\Lambda^i(L^*)$ is equal to $\Lambda^i(L)^*$.
Proof. Choose orthonormal bases $e_j$ and $f_i$ for $V$ and $W$ respectively. That way, if $l_{ij}$ is the matrix of $L$ with respect to those bases, then the matrix of $L^*$ is $l_{ji}$. Notice $e_J$ and $f_I$ are orthonormal bases of $\Lambda^i(V)$ and $\Lambda^i(W)$, where we let $I$ and $J$ range over subsets with $i$ elements in their respective index sets. Hence if $L_{IJ}$ is the matrix of $\Lambda^i(L)$ with respect to the bases $e_J$ and $f_I$, then the matrix of $\Lambda^i(L^*)$ is $L_{JI}$. Here I am using the fact that the determinant of the transpose of a matrix is equal to the determinant of the matrix, in computing the coefficients of the matrix for $\Lambda^i(L^*)$.
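In the same matrix terms, Proposition 4 says the compound of the transpose is the transpose of the compound; with the compound helper from the previous sketch:

import numpy as np
from itertools import combinations

def compound(T, i):
    # as in the previous sketch
    rows = list(combinations(range(T.shape[0]), i))
    cols = list(combinations(range(T.shape[1]), i))
    return np.array([[np.linalg.det(T[np.ix_(I, J)]) for J in cols]
                     for I in rows])

L = np.random.default_rng(4).standard_normal((4, 3))
print(np.allclose(compound(L.T, 2), compound(L, 2).T))  # True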
We are ready.
Proof of Theorem 1. Let $A$ be an $n \times k$ matrix, which we can view as the matrix of a linear map $A : \mathbb{R}^k \to \mathbb{R}^n$ with respect to the standard orthonormal bases. The determinant of $A^t A$ is the coefficient appearing in
$$\Lambda^k(A^*) \circ \Lambda^k(A)(e_{\{1,\ldots,k\}}) = \det(A^t A)\, e_{\{1,\ldots,k\}}.$$
However, by Proposition 3 we have
$$\Lambda^k(A)(e_{\{1,\ldots,k\}}) = \sum_{I \subset \{1,\ldots,n\},\, |I| = k} \det(A_I)\, e_I.$$
Therefore,
$$\Lambda^k(A^*) \circ \Lambda^k(A)(e_{\{1,\ldots,k\}}) = \Lambda^k(A^*)\Big(\sum_I \det(A_I)\, e_I\Big) = \sum_I \det(A_I)\, \Lambda^k(A^*)(e_I) = \sum_{I \subset \{1,\ldots,n\},\, |I| = k} \det(A_I)^2\, e_{\{1,\ldots,k\}},$$
where the last equality holds because, by Propositions 3 and 4, $\Lambda^k(A^*)(e_I) = \det(A_I)\, e_{\{1,\ldots,k\}}$.
4 Epilogue
This is a theorem that gets rediscovered over and over again. Via a web search
I found a note by Alvarez [3] that proves the theorem for right n-simplices,
but better than that has a nice bibliography with some references to proofs
and historical texts. He cites evidence that the theorem first appeared in
a book on analytic geometry by Monge and Hatchet written in the 19th
century. There is also a paper by Atzema [4] that proves the same theorem I
proved here, via a more general result about the determinant of a product of
matrices due to Cauchy. I would be surprised if a proof along the lines that
I gave here, didn’t appear elsewhere.
I think the theorem might be of pedagogical interest, as it gives a unified paradigm for the integrals that compute arc length and area, and could leave students in a position to set up integrals that compute higher-dimensional content.
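To illustrate that last point, here is a rough numerical sketch (all names, step sizes, and grid sizes are my own choices) that sets up the area integral of the unit sphere as the integral of $\sqrt{\det g}$, with $g = (df)^t (df)$ approximated by finite differences.

import numpy as np

def f(u, v):
    # parametrization of the unit sphere
    return np.array([np.sin(u) * np.cos(v),
                     np.sin(u) * np.sin(v),
                     np.cos(u)])

def sqrt_det_g(u, v, h=1e-6):
    # columns of J approximate df(d/du) and df(d/dv); g = J^t J
    J = np.column_stack([(f(u + h, v) - f(u - h, v)) / (2 * h),
                         (f(u, v + h) - f(u, v - h)) / (2 * h)])
    return np.sqrt(np.linalg.det(J.T @ J))

# Midpoint Riemann sum of sqrt(det g) du dv over [0, pi] x [0, 2 pi].
m = 200
us = (np.arange(m) + 0.5) * np.pi / m
vs = (np.arange(m) + 0.5) * 2 * np.pi / m
du, dv = np.pi / m, 2 * np.pi / m
area = sum(sqrt_det_g(u, v) for u in us for v in vs) * du * dv

print(area, 4 * np.pi)  # approximately equal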
References
[1] Spivak, Michael, Calculus on Manifolds, Perseus Books, Cambridge, MA, 1965.