DIAGONALIZATION
Recall that an eigenvalue-eigenvector pair (λ, v) of the square matrix A is a scalar-vector pair such that
Av = λv.
In the Week 07 extra homework sheet, you proved the following three facts:
1. If the eigenvalues of $A$ are distinct (i.e., $\lambda_1 \neq \lambda_2$), then the eigenvectors are linearly independent:
\[ \operatorname{Span}\{v_1, v_2\} = \mathbb{R}^2. \]
Further recall that a vector space can have more than one basis. The simplest and most familiar basis for the vector space $\mathbb{R}^2$ is the set of canonical vectors
\[ \{i, j\} = \left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}. \]
A natural question arises: how do we convert from one basis to another? (Think Relativity here: if I am
in one coordinate frame (inside a train) and you are in another (on the station platform), how do I interpret
observations you made in your frame of reference in my own frame of reference?) The answer is given
(implicitly) by the following theorem (which we formulate in the general case).
Definition 1. A square matrix $D \in \mathbb{R}^{n \times n}$ whose entries off the main diagonal are all zero is called a diagonal matrix. That is, $D$ takes the form
\[ D = \operatorname{diag}(d_1, d_2, \ldots, d_n) := \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}. \]
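For concreteness, Definition 1 can be mirrored in a few lines of Python. This is only a sketch: the helper name `diag` is ours, chosen to match the notation above, and matrices are stored as lists of rows.

```python
def diag(*entries):
    """Build diag(d1, ..., dn) as a list of rows: entry (i, j) is di if i == j, else 0."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

D = diag(1, 2, 3)  # [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
```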
Theorem 1 (Eigendecomposition Theorem). Let $A \in \mathbb{R}^{n \times n}$ have distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (i.e., $\lambda_i \neq \lambda_j$ for all $i \neq j$). Let $P = (v_1 \; v_2 \; \cdots \; v_n)$ be the matrix of eigenvectors, where $v_1, v_2, \ldots, v_n$ are the eigenvectors associated with $\lambda_1, \lambda_2, \ldots, \lambda_n$, respectively, and let $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then
\[ A = P D P^{-1}. \]
Proof. We present the proof in the 2 × 2 case and note that the proof is exactly the same for the n × n case.
We will write the eigenvectors as $v_1 = \begin{pmatrix} u_1 \\ v_1 \end{pmatrix}$ and $v_2 = \begin{pmatrix} u_2 \\ v_2 \end{pmatrix}$. So, standard matrix multiplication gives
\begin{align*}
AP = A(v_1 \; v_2) &= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u_1 & u_2 \\ v_1 & v_2 \end{pmatrix} \\
&= \begin{pmatrix} a_{11} u_1 + a_{12} v_1 & a_{11} u_2 + a_{12} v_2 \\ a_{21} u_1 + a_{22} v_1 & a_{21} u_2 + a_{22} v_2 \end{pmatrix} \\
&= \begin{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u_1 \\ v_1 \end{pmatrix} & \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} u_2 \\ v_2 \end{pmatrix} \end{pmatrix} \\
&= (A v_1 \; A v_2) \\
&= (\lambda_1 v_1 \; \lambda_2 v_2) \\
&= \begin{pmatrix} \lambda_1 u_1 & \lambda_2 u_2 \\ \lambda_1 v_1 & \lambda_2 v_2 \end{pmatrix} = \begin{pmatrix} u_1 & u_2 \\ v_1 & v_2 \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \\
&= P D.
\end{align*}
Now, since the eigenvalues of A are distinct, the eigenvectors are linearly independent. This implies that the
matrix P of eigenvectors is invertible. Hence we can multiply by P −1 :
AP = P D =⇒ A = P DP −1 .
Remark 1. Note that the order in which the eigenvalues appear in the diagonal matrix $D$ corresponds to the ordering of the eigenvectors as columns of $P$. That is, if $P = (v_2 \; v_1)$, then $D = \begin{pmatrix} \lambda_2 & 0 \\ 0 & \lambda_1 \end{pmatrix}$.
Observe that an alternative way of writing the Eigendecomposition Theorem is
D = P −1 AP.
The matrix A has intrinsic features we want to understand. The eigendecomposition theorem tells us that if
the eigenvalues are distinct, we can always switch to a coordinate system where the non-essential features
of A are stripped away and we are left with a diagonal matrix D = diag (λ1 , λ2 ). This process is known as
diagonalization.
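In the $2 \times 2$ case, the first step of diagonalization, finding the eigenvalues, amounts to solving the quadratic characteristic equation $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$. A minimal Python sketch (our own helper, assuming real and distinct eigenvalues so the discriminant is positive):

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]]: roots of x^2 - (a + d)x + (ad - bc).

    Assumes the eigenvalues are real and distinct (positive discriminant).
    """
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr - disc) / 2, (tr + disc) / 2

# The matrix of Example 1 below has eigenvalues -1 and 2.
print(eigenvalues_2x2(2, 1, 0, -1))  # (-1.0, 2.0)
```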
Example 1. Diagonalize the matrix $A = \begin{pmatrix} 2 & 1 \\ 0 & -1 \end{pmatrix}$.
Our first task is to compute the eigenvalues and eigenvectors of $A$. Since $A$ is upper triangular, its eigenvalues are its diagonal entries, $\lambda_1 = -1$ and $\lambda_2 = 2$. Solving $(A - \lambda I)v = 0$ for each eigenvalue gives the eigenvectors $v_1 = \begin{pmatrix} 1 \\ -3 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, so
\[ P = \begin{pmatrix} 1 & 1 \\ -3 & 0 \end{pmatrix}, \qquad P^{-1} = \frac{1}{3} \begin{pmatrix} 0 & -1 \\ 3 & 1 \end{pmatrix}. \]
We can now diagonalize $A$ as follows:
\[ D = P^{-1} A P = \frac{1}{3} \begin{pmatrix} 0 & -1 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ -3 & 0 \end{pmatrix} = \frac{1}{3} \begin{pmatrix} 0 & -1 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} -1 & 2 \\ 3 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}. \]
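As a sanity check, the computation in Example 1 can be reproduced exactly with rational arithmetic. This is a sketch using only the standard library; `matmul` is our own helper, and matrices are lists of rows.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[F(2), F(1)], [F(0), F(-1)]]
P = [[F(1), F(1)], [F(-3), F(0)]]
Pinv = [[F(0), F(-1, 3)], [F(1), F(1, 3)]]  # (1/3) * [[0, -1], [3, 1]]

D = matmul(Pinv, matmul(A, P))  # D == [[-1, 0], [0, 2]], i.e. diag(-1, 2)
```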
Powers of A
One practical use of the Eigendecomposition Theorem is in computing powers of the matrix A, which we
often need to do.
Lemma 3. Let A ∈ Rn×n with distinct eigenvalues. Let P = (v 1 v 2 · · · v n ) be the matrix of eigenvectors
and D = diag (λ1 , λ2 , . . . , λn ) the diagonal matrix with eigenvalues along the main diagonal. Then
\[ A^k = P D^k P^{-1}. \]
Proof.
\begin{align*}
A^k &= (P D P^{-1})^k \\
&= (P D P^{-1})(P D P^{-1}) \cdots (P D P^{-1}) \\
&= P D \underbrace{P^{-1} P}_{=I} D \underbrace{P^{-1} P}_{=I} D \cdots D \underbrace{P^{-1} P}_{=I} D P^{-1} \\
&= P \underbrace{D D \cdots D}_{k \text{ factors of } D} P^{-1} \\
&= P D^k P^{-1}.
\end{align*}
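The lemma can be checked numerically. The sketch below (standard library only, reusing the matrices of Example 1) compares $P D^k P^{-1}$ against naive repeated multiplication; `matmul` and `matpow` are our own helper names.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matpow(X, k):
    """Naive k-fold product X * X * ... * X."""
    R = [[F(1), F(0)], [F(0), F(1)]]  # 2x2 identity
    for _ in range(k):
        R = matmul(R, X)
    return R

# A = P D P^{-1} with D = diag(-1, 2), as in Example 1.
A = [[F(2), F(1)], [F(0), F(-1)]]
P = [[F(1), F(1)], [F(-3), F(0)]]
Pinv = [[F(0), F(-1, 3)], [F(1), F(1, 3)]]

k = 5
Dk = [[F(-1) ** k, F(0)], [F(0), F(2) ** k]]  # powers of a diagonal matrix are entrywise
Ak = matmul(P, matmul(Dk, Pinv))
print(Ak == matpow(A, k))  # True
```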
Example 2. Compute the powers of the matrix $A = \begin{pmatrix} -1 & 1 \\ 2 & -1 \end{pmatrix}$.
The eigenvalues of $A$ are the zeros of the characteristic polynomial:
\[ \det(A - \lambda I) = (-1 - \lambda)^2 - 2 = \lambda^2 + 2\lambda - 1 = 0 \implies \lambda = -1 \pm \sqrt{2}. \]
Eigenvector for $\lambda = -1 - \sqrt{2}$:
\[ (A - \lambda I)v = \begin{pmatrix} \sqrt{2} & 1 \\ 2 & \sqrt{2} \end{pmatrix} v = 0. \]
By definition of eigenvalue-eigenvector pairs, the rows of the matrix $A - \lambda I$ must be linearly dependent (scalar multiples of each other). This means row reduction will always lead to a row of zeros. So we can take a slight shortcut here and just look at one of the rows to get the desired relation:
\[ \sqrt{2}\,u + v = 0 \implies v_1 = \begin{pmatrix} 1 \\ -\sqrt{2} \end{pmatrix}. \]
Eigenvector for $\lambda = -1 + \sqrt{2}$:
\[ (A - \lambda I)v = \begin{pmatrix} -\sqrt{2} & 1 \\ 2 & -\sqrt{2} \end{pmatrix} v = 0. \]
Again, we can take a slight shortcut and just look at one of the rows to get the desired relation:
\[ -\sqrt{2}\,u + v = 0 \implies v_2 = \begin{pmatrix} 1 \\ \sqrt{2} \end{pmatrix}. \]
Let $P = \begin{pmatrix} 1 & 1 \\ -\sqrt{2} & \sqrt{2} \end{pmatrix}$. Then $P^{-1} = \frac{1}{2\sqrt{2}} \begin{pmatrix} \sqrt{2} & -1 \\ \sqrt{2} & 1 \end{pmatrix}$. By the Eigendecomposition Theorem, we have
\[ A = P D P^{-1}, \]
where $D = \operatorname{diag}(-1 - \sqrt{2},\, -1 + \sqrt{2})$. We can now compute the powers of $A$:
\begin{align*}
A^k &= P D^k P^{-1} \\
&= \begin{pmatrix} 1 & 1 \\ -\sqrt{2} & \sqrt{2} \end{pmatrix} \begin{pmatrix} (-1 - \sqrt{2})^k & 0 \\ 0 & (-1 + \sqrt{2})^k \end{pmatrix} \frac{1}{2\sqrt{2}} \begin{pmatrix} \sqrt{2} & -1 \\ \sqrt{2} & 1 \end{pmatrix} \\
&= \frac{1}{2\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -\sqrt{2} & \sqrt{2} \end{pmatrix} \begin{pmatrix} (-1 - \sqrt{2})^k & 0 \\ 0 & (-1 + \sqrt{2})^k \end{pmatrix} \begin{pmatrix} \sqrt{2} & -1 \\ \sqrt{2} & 1 \end{pmatrix} \\
&= \frac{1}{2\sqrt{2}} \begin{pmatrix} \sqrt{2}\bigl((-1 - \sqrt{2})^k + (-1 + \sqrt{2})^k\bigr) & (-1 + \sqrt{2})^k - (-1 - \sqrt{2})^k \\ 2\bigl((-1 + \sqrt{2})^k - (-1 - \sqrt{2})^k\bigr) & \sqrt{2}\bigl((-1 - \sqrt{2})^k + (-1 + \sqrt{2})^k\bigr) \end{pmatrix}.
\end{align*}
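The closed form above can be checked against direct matrix multiplication in floating point. A sketch (standard library only; `A_power` and `matmul` are our own names for the formula and the helper):

```python
import math

SQRT2 = math.sqrt(2)
MU, NU = -1 - SQRT2, -1 + SQRT2  # the two eigenvalues of A

def A_power(k):
    """Entrywise closed form for A^k derived in Example 2."""
    c = 1 / (2 * SQRT2)
    return [[c * SQRT2 * (MU ** k + NU ** k), c * (NU ** k - MU ** k)],
            [c * 2 * (NU ** k - MU ** k), c * SQRT2 * (MU ** k + NU ** k)]]

def matmul(X, Y):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[-1.0, 1.0], [2.0, -1.0]]
Ak = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(7):  # direct computation of A^7
    Ak = matmul(Ak, A)

close = all(abs(A_power(7)[i][j] - Ak[i][j]) < 1e-9
            for i in range(2) for j in range(2))
print(close)  # True
```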