PCA
By
Dr. Pawan Kumar Tiwari
Suppose we seek a single vector x_0 to represent the entire set of samples x_1, \ldots, x_n, in the sense of minimizing the sum of squared distances

J_0(x_0) = \sum_{k=1}^{n} \| x_0 - x_k \|^2    (1)

Writing x_0 - x_k = (x_0 - m) - (x_k - m), where m is the sample mean, and expanding the square:

J_0(x_0) = \sum_{k=1}^{n} \| (x_0 - m) - (x_k - m) \|^2
         = \sum_{k=1}^{n} \| x_0 - m \|^2 - 2 \sum_{k=1}^{n} (x_0 - m)^T (x_k - m) + \sum_{k=1}^{n} \| x_k - m \|^2    (2)
         = n \| x_0 - m \|^2 - 2 (x_0 - m)^T \sum_{k=1}^{n} (x_k - m) + \sum_{k=1}^{n} \| x_k - m \|^2

Since \sum_{k=1}^{n} (x_k - m) = 0, the cross term vanishes, and

J_0(x_0) = n \| x_0 - m \|^2 + \sum_{k=1}^{n} \| x_k - m \|^2    (3)

The second term is independent of x_0, so J_0 is minimized by the choice x_0 = m, the sample mean.
Cont..
The sample mean is a zero-dimensional representation of the data set.
It is simple, but it does not reveal any of the variability in the data.
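As a numerical sanity check on Eqns. (1)–(3), here is a minimal NumPy sketch; the data and all names are illustrative, not from the original:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))        # n = 100 samples in d = 3 dimensions
m = X.mean(axis=0)                   # sample mean

def J0(x0):
    # Sum of squared distances from x0 to every sample, Eqn. (1).
    return np.sum(np.linalg.norm(x0 - X, axis=1) ** 2)

# Consistent with Eqn. (3), J0 at the mean is smaller than at any other point.
print(J0(m))
print(J0(m + 0.5))                   # any perturbation increases the error
```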
We can obtain a more interesting, one-dimensional representation by projecting the data onto a line running through the sample mean.
Let e be a unit vector in the direction of the line. Then the equation of the line can be written as

x = m + a e    (4)

where the scalar a corresponds to the distance of any point x on the line from the mean m.
Cont..
If we represent x_k by m + a_k e, we can find an "optimal" set of coefficients a_k by minimizing the squared-error criterion function
J_1(a_1, \ldots, a_n, e) = \sum_{k=1}^{n} \| (m + a_k e) - x_k \|^2 = \sum_{k=1}^{n} \| a_k e - (x_k - m) \|^2
                         = \sum_{k=1}^{n} a_k^2 \| e \|^2 - 2 \sum_{k=1}^{n} a_k e^T (x_k - m) + \sum_{k=1}^{n} \| x_k - m \|^2    (5)
Cont…
Recognizing that \| e \| = 1, partially differentiating with respect to a_k, and setting the derivative to zero, we obtain

a_k = e^T (x_k - m)    (6)
Geometrically, this result merely says that we obtain a least-squares solution by projecting the vector x_k onto the line in the direction of e that passes through the sample mean.
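The following minimal NumPy sketch (illustrative data and names) evaluates the criterion of Eqn. (5) and checks that the coefficients of Eqn. (6) minimize it:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # n = 100 samples in d = 3 dimensions
m = X.mean(axis=0)
e = np.array([1.0, 0.0, 0.0])             # an arbitrary unit direction

def J1(a, e):
    # Squared error of representing each x_k by m + a_k e, Eqn. (5).
    recon = m + np.outer(a, e)
    return np.sum(np.linalg.norm(recon - X, axis=1) ** 2)

a_opt = (X - m) @ e                       # a_k = e^T (x_k - m), Eqn. (6)
print(J1(a_opt, e))                       # minimal over all choices of the a_k
print(J1(a_opt + 0.1, e))                 # perturbing the coefficients does worse
```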
Cont..
This brings us to the more interesting problem of finding the best direction e for the line.
The solution to this problem involves the so-called scatter matrix S, defined by

S = \sum_{k=1}^{n} (x_k - m)(x_k - m)^T    (7)

It arises when we substitute the value of a_k from equation (6) into equation (5).
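The scatter matrix of Eqn. (7) can be computed as a single matrix product; a sketch with NumPy on made-up data, also noting that S equals (n - 1) times the usual sample covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
m = X.mean(axis=0)

# S = sum_k (x_k - m)(x_k - m)^T, Eqn. (7), as one matrix product.
S = (X - m).T @ (X - m)

# Equivalently, S is (n - 1) times the sample covariance matrix.
assert np.allclose(S, (len(X) - 1) * np.cov(X, rowvar=False))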
Cont…
J_1(e) = \sum_{k=1}^{n} a_k^2 - 2 \sum_{k=1}^{n} a_k^2 + \sum_{k=1}^{n} \| x_k - m \|^2
       = - \sum_{k=1}^{n} [ e^T (x_k - m) ]^2 + \sum_{k=1}^{n} \| x_k - m \|^2
       = - \sum_{k=1}^{n} e^T (x_k - m)(x_k - m)^T e + \sum_{k=1}^{n} \| x_k - m \|^2
       = - e^T S e + \sum_{k=1}^{n} \| x_k - m \|^2    (8)
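A quick numerical check of Eqn. (8) under the same illustrative setup: the directly computed squared error agrees with -e^T S e plus the constant term:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
m = X.mean(axis=0)
S = (X - m).T @ (X - m)                   # scatter matrix, Eqn. (7)
e = np.array([0.0, 1.0, 0.0])             # any unit vector

a = (X - m) @ e                           # optimal coefficients, Eqn. (6)
lhs = np.sum(np.linalg.norm(m + np.outer(a, e) - X, axis=1) ** 2)
rhs = -e @ S @ e + np.sum(np.linalg.norm(X - m, axis=1) ** 2)
assert np.isclose(lhs, rhs)               # Eqn. (8) holds numerically
```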
Cont..
Clearly, the vector e that minimizes J_1 also maximizes e^T S e.
We apply the method of Lagrange multipliers to maximize e^T S e subject to the constraint \| e \| = 1.
Cont..
Let \lambda be the undetermined multiplier. We differentiate

u = e^T S e - \lambda (e^T e - 1)    (9)

with respect to e to obtain

\partial u / \partial e = 2 S e - 2 \lambda e    (10)

Setting this gradient vector equal to zero, we see that e must be an eigenvector of the scatter matrix:

S e = \lambda e    (11)
Cont...
In particular, because e^T S e = \lambda e^T e = \lambda, it follows that to maximize e^T S e we want to select the eigenvector corresponding to the largest eigenvalue of the scatter matrix.
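In code this amounts to an eigendecomposition of S; a sketch assuming NumPy, whose symmetric eigensolver eigh returns eigenvalues in ascending order:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
m = X.mean(axis=0)
S = (X - m).T @ (X - m)                   # scatter matrix, Eqn. (7)

# eigh is the right tool because S is real and symmetric;
# it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
e = eigvecs[:, -1]                        # eigenvector of the largest eigenvalue

# e maximizes e^T S e over unit vectors: a random unit vector does no better.
u = rng.normal(size=3)
u /= np.linalg.norm(u)
assert u @ S @ u <= e @ S @ e + 1e-9
```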
This result can be readily extended from a one-dimensional projection to a d'-dimensional projection. In place of Eqn. (4) we write

x = m + \sum_{i=1}^{d'} a_i e_i    (12)

where d' \le d.
Cont..
It is not difficult to show that the criterion function

J_{d'} = \sum_{k=1}^{n} \left\| \left( m + \sum_{i=1}^{d'} a_{ki} e_i \right) - x_k \right\|^2    (13)

is minimized when the vectors e_1, \ldots, e_{d'} are the d' eigenvectors of the scatter matrix having the largest eigenvalues.
Because the scatter matrix is real and symmetric, these eigenvectors are orthogonal.
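Putting the pieces together, a minimal PCA sketch (NumPy, illustrative data) that evaluates Eqn. (13) using the d' leading eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))             # n = 100 samples in d = 5 dimensions
m = X.mean(axis=0)
S = (X - m).T @ (X - m)                   # scatter matrix, Eqn. (7)

eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
d_prime = 2
E = eigvecs[:, -d_prime:]                 # d' eigenvectors with largest eigenvalues

A = (X - m) @ E                           # coefficients a_ki, one row per sample
X_hat = m + A @ E.T                       # reconstruction m + sum_i a_ki e_i, Eqn. (12)
J = np.sum(np.linalg.norm(X_hat - X, axis=1) ** 2)
print(J)                                  # the minimized criterion of Eqn. (13)
```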
Cont..
They form a natural set of basis vectors for representing any vector x.
The coefficients a_i in Eqn. (12) are the components of x in that basis, and they are called the principal components.
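Continuing the previous sketch (m and E as defined there), the principal components of an individual vector x are its coordinates in the eigenvector basis:

```python
x = X[0]                                  # any sample (or a new d-dimensional vector)
a = E.T @ (x - m)                         # the principal components a_i of Eqn. (12)
x_hat = m + E @ a                         # projection onto the principal subspace
```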
Example