Determinants
Determinants
Determinants
September 7, 2017
1 Determinants
One of the first things that most students learn about in linear algebra is the determinant of a matrix. Lots
of useful formulas for 2×2 and 3×3 matrices can be expressed in terms of determinants, and determinants
played a central role in linear algebra 100 years ago when most matrices were tiny.
Nowadays, determinants are much less useful as a practical tool, although they still occasionally show up.
Determinant-related formulas are also useful in proving theorems in linear algebra. The basic computational
problem, however, is that the determinant formulas don’t scale — for a big matrix, there is almost always
a better way of computing something than using explicit determinants, cofactors, Cramer’s rule, and other
tricks useful for small matrices.
Still, it is important to know what determinants are, and their basic properties. In 18.06, we mainly use
determinants as a conceptual tool to help us understand eigenvalues via the characteristic polynomial —
although, again, this is not a practical computational tool for eigenvalues, which are nowadays computed by
very different methods.
In [1]: A = [1 3
2 4]
det(A)
Out[1]: -2.0
(You may even remember the formula for the 2×2 determinant: 1 × 4 − 3 × 2 = −2.
But this matrix is singular (the second column is twice the first), and so its determinant is zero:
In [2]: A = [1 2
2 4]
det(A)
Out[2]: 0.0
• By the way, many authors, including Strang’s book, use the abbreviated notation |A| = det A. I won’t
use this notation here, mainly because I don’t think the determinant is important enough anymore
to deserve its own punctuation. Anyway, |A| looks too much like an absolute value, even though the
determinant can have any sign.
1
2.1 A lucky guess for the determinant
In 18.06, we know have another way to check whether a matrix is zero: perform Gaussian elimination, and
then check whether any pivots (diagonal entries of U) are zero.
But this gives us an obvious way to construct a single determinant-like number: just multiply the
pivots together, and the result will be zero if and only if the matrix is singular.
In fact, this intuition turns out to be almost exactly the right guess:
• The determinant is ± the product of the pivots, with a minus sign if elimination involved an odd
number of row swaps and a plus sign if there were an even number of swaps (including zero swaps).
In [3]: A = randn(5,5)
det(A)
Out[3]: 1.4856724724450872
Out[5]: 1.4856724724450876
Note that this matches det(A) (up to roundoff errors in the last few digits).
This immediately gives you a hint of why the determinant is not such a useful computational tool as you
might have thought:
• The most efficient way to compute a determinant, in general, is to do Gaussian elimination and then
multiply the pivots together.
• Once you have done elimination, you already know whether the matrix is singular and you can already
solve Ax = b efficiently, so the determinant is mostly superfluous.
2
3.1 1. det(I) = 1
It is clear that the identity matrix I is not singular, and all its pivots are 1. A reasonable starting point for
defining determinants, therefore, is to require:
For example:
In [6]: eye(5)
In [7]: det(eye(5))
Out[7]: 1.0
In [9]: B = copy(A)
B[1,:] = A[2,:]
B[2,:] = A[1,:]
B
Out[10]: (-27.99999999999998,27.99999999999999)
3
3.3 3. Linearity in any individual row
The determinant will not be a linear operation on the whole matrix: det(A + B) 6= det A + det B!! But, we
would like it to be linear with respect to operations on individual rows.
This means two things:
This axiom actually makes a lot of sense if you think about the example of the identity matrix. Multiplying
the first row of I by α leads to the matrix:
α 0 0 0 ···
0 1 0 0 · · ·
0 0 1 0 · · ·
0 0 0 1 · · ·
.. .. .. .. . .
. . . . .
The determinant of this matrix is exactly α! As α → 0, this matrix becomes singular, and the determinant
goes to zero at the same rate. It is also consistent with our “product of the pivots” intuitive guess above,
because the pivots here are (α, 1, 1, · · · ).
We can also try this with our random matrix A from above. Let’s multiply the second row by 2:
In [11]: C = copy(A)
C[2,:] = 2*A[2,:]
C
Out[12]: (-27.99999999999998,-55.99999999999996)
• det(αA) = αm det A
This is not an axiom, it is a consequence of the axiom above: we pick up a factor of α for each row that
we scale.
For our 5 × 5 matrix A, this means that det(2A) = 25 det A = 32 det A:
Out[13]: 32.0
4
3.3.2 Adding a row vector to a row
There is a second property of linearity, corresponding to vector addition:
• If we add a row vector r to a row of A, then the determinant becomes det(A) + det(A0 ), where A0
is the matrix with that row replaced by r (with other rows unchanged).
a + a0 b + b0
0
b0
a b a
det = det + det .
c d c d c d
Or, in terms of our matrix A from above, let’s add (1, 2, 3, 4, 5) to the first row:
In [14]: A
Out[16]: 307.99999999999994
This should be the same as det A plus the determinant of A with the first row replaced by (1, 2, 3, 4, 5):
In [17]: A 0 = copy(A)
A 0 [1,:] = [1,2,3,4,5] # replace first row
A0
Out[18]: 307.99999999999994
5
4.1 4. If two rows are equal, det = 0
It’s easy to see why this follows from property 2: if we swap two equal rows, the matrix doesn’t change,
but the determinant must flip sign. But this means:
In [19]: det([ 1 2 3
4 5 6
1 2 3 ])
Out[19]: 0.0
This property also makes sense if our expectation is that the determinant is zero for singular matrices:
if two rows are equal, the matrix is singular.
4.2 5. Subtracting a multiple of one row from another doesn’t change det
Suppose we take a matrix A, and subtract (or add) a multiple of one row from another. For example:
a b a b a b a b
det = det − α det = det +0
c − αa d − αb c d a b c d
Here, we applied axiom 3 (linearity), and then property 4 (repeated rows).
The same thing happens for any size of matrix.
But this is precisely the kind of operation that we perform during Gaussian elimination. It has the crucial
implications:
And, by axiom 2:
• Gaussian elimination with row swaps gives the same determinant but with flipped sign for
each row swap.
For example:
Out[21]: (-27.99999999999998,-27.99999999999998)
6
4.3 6. A matrix with a row of zeros has det = 0
This is easy to see from axiom 3 (linearity): if we multiply the row of zeros by zero, it doesn’t change the
matrix but multiplies the determinant by zero, hence:
In [22]: det([1 2 3
4 5 6
0 0 0])
Out[22]: 0.0
1. Eliminate “upward” above the pivots to get a diagonal matrix D. This doesn’t change the determinant
by property 5.
2. Pull out each diagonal element by axiom 3 (linearity) until you get the identity matrix I whose
determinant is 1 by axiom 1:
α1 1
α2 α2
det = α det = · · · = α1 α2 α3 · · · det I = α1 α2 α3 · · ·
α3
1
α3
.. ..
. .
If we have a zero diagonal entry, we can’t eliminate upward above it (we can’t divide by the diagonal
“pivot”). But in that case we end up with a row of zeros after eliminating above the other diagonals, and
by property 6 we get a zero determinant. So it still matches the product of the diagonal entries.
Similarly for a lower triangular matrix, except that we eliminate “downward”.
We already saw an example of this earlier, but let’s do it again. We got our U matrix from elimination
on A:
In [23]: U
In [24]: diag(U)
7
The product of these is:
In [25]: prod(diag(U))
Out[25]: -27.99999999999998
which matches det U (and det A):
In [26]: det(U), det(A)
Out[26]: (-27.99999999999998,-27.99999999999998)
If we do need to compute the determinant, this gives us a very practical way to do it: compute det(A)
by taking the product of the pivots after elimination, with a sign flip for every row swap.
This is, in fact exactly what the Julia det function does, as you can check by looking at the source code:
In [27]: @which det(A)
Out[27]: det{T}(A::AbstractArray{T,2}) at linalg/generic.jl:574
From the source code, this calls det(lufact(A)), which calls:
In [28]: @which det(lufact(A))
Out[28]: det{T,S}(A::Base.LinAlg.LU{T,S}) at linalg/lu.jl:196
8
4.6.1 Matrix inverses
This rule has important consequences for matrix inverses. First:
det(A−1 ) = 1/ det(A)
Proof: 1 = det(I) = det(AA−1 ) = det(A) det(A−1 ).
For example:
Out[32]: (-0.03571428571428567,-0.03571428571428574)
Recall from last lecture that XAX −1 corresponds simply to a change of basis from A. (We will later
call this a similar matrix to A). Now we know:
det P = ±1
(depending on how many row swaps there were). Furthermore, recall that P T P = I (the permution P is
orthogonal since its rows/columns are orthogonal (a re-ordering of the rows/columns of I). By property 9,
this means that
1 = det I = det(P T P ) = det(P T ) det(P )
Since det P = ±1, this means that det(P T ) = det(P ).
It follows that
Out[33]: (-27.99999999999998,-27.999999999999968)
9
5 Useful applications of determinants
Ignoring formulas (e.g. Cramer’s rule, a formula for A−1 — see Strang, section 5.3) that are mainly useful
for tiny matrices, here are some examples of real usages of determinants even for large matrices:
• Understanding eigenvalues: determinants will turn eigenvalues into polynomial roots, and since we
know about polynomial roots, that tells us a lot about eigenvalues. (This is not how eigenvalues are
computed in practice, however!)
• There is also something called a nonlinear eigenproblem, arising in many science and engineering
problems, in which the determinant plays a basic conceptual role. Again, however, computational
methods typically avoid computing determinants explicitly except for tiny matrices.
• Proofs: Determinants show up in a lot of proofs in matrix theory, because they reduce matrices to
numbers that have nice properties and are easy to reason about. One also often sees things like the
adjugate matrix and the Cayley–Hamilton theorem, both related to determinants.
• That is, we often use determinants to help use understand and derive things in linear algebra, even if
the final result doesn’t require us to compute the determinant for any practical purpose.
• Jacobian factors: in multivariable calculus, a factor of | det J| arises when you perform a change of
variables in integration, where J is a Jacobian matrix.
• The reason a determinant arises here is that, more generally, det(A) is the volume of a paral-
lelepiped (“box”) whose edges are given by the columns of A.
• Integration may sound like something that only happens in a few dimensions (= tiny matrices J), but
extremely high dimensional (even infinite-dimensional) integrals appear in statistics, quantum field
theory, bioinformatics, and other fields.
• High-dimensional Gaussian integrals often arise in statistics and related areas of science (e.g. quantum
field theory), and the inverse of the square root of a determinant appears in the answer. Often, one
wants the logarithm of the result, in which case what arises is the log determinant log det A, an
important matrix function.
This is no doubt an incomplete list. Nevertheless, although determinants are a much more marginal topic
in modern linear algebra than they were in the 19th century, they have hardly disappeared.
10
That means that this formula requires ∼ n × n! scalar operations, which is worse than exponential in
n. This is far more expensive than elimination (∼ n3 ), making this formula computationally useless
for n > 3.
(There is also another computationally useless formula involving minors and cofactors; see Strang, section
5.2.)
The permutation formula is still sometimes useful conceptually, however.
11