Cumulative Function of N-Dimensional Gaussians (12.2013)
\[ F(t) = \int_{-\infty}^{t} N(x, 0, 1)\,dx. \]
The Mahalanobis distance of a point $x$ from the Gaussian $N(\mu, \sigma)$ is
\[ \frac{|x - \mu|}{\sigma}; \]
it measures how many standard deviations the point $x$ lies away from the Gaussian.
Given $R > 0$, let us compute the probability that a point falls at a Mahalanobis distance $r \le R$ from the Gaussian above. This probability is equal to
\[ \int_{\mu - \sigma R}^{\mu + \sigma R} N(x, \mu, \sigma)\,dx. \]
Conversely, if the probability p that a point falls within a certain Mahalanobis distance R from the Gaussian is known, one can compute R explicitly:
\[ p = 2F(\mu + \sigma R) - 1 \;\Longrightarrow\; F(\mu + \sigma R) = \frac{p + 1}{2} \;\Longrightarrow\; R = \frac{1}{\sigma}\left[ F^{-1}\!\left(\frac{p + 1}{2}\right) - \mu \right], \]
where $F$ here denotes the cumulative function of $N(\mu, \sigma)$.
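As a quick numerical illustration, the last formula can be evaluated with SciPy's quantile function \texttt{norm.ppf} (a minimal sketch; the numeric values are arbitrary examples):

\begin{verbatim}
from scipy.stats import norm

p = 0.95                 # desired probability mass
mu, sigma = 2.0, 3.0

# R = (F^{-1}((p+1)/2) - mu) / sigma, F being the cumulative of N(mu, sigma)
R = (norm.ppf((p + 1) / 2, loc=mu, scale=sigma) - mu) / sigma
print(R)   # ~1.96: the familiar 95% two-sided threshold, independent of mu, sigma
\end{verbatim}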
B. 2-d Gaussians and bivariate normal densities
The bivariate normal density with mean $\mu = (\mu_1, \mu_2)$ and covariance matrix
\[ \Sigma = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix} \]
is (with $z = (x, y)$)
\[ p(x, y) = \frac{1}{2\pi\sqrt{|\Sigma|}} \exp\!\left[ -\tfrac{1}{2}(z - \mu)\Sigma^{-1}(z - \mu)^t \right], \]
where the squared Mahalanobis distance appearing in the exponent expands to
\[ r^2 = (z - \mu)\Sigma^{-1}(z - \mu)^t = \frac{1}{1 - \rho^2}\left[ \frac{(x - \mu_1)^2}{\sigma_1^2} - \frac{2\rho(x - \mu_1)(y - \mu_2)}{\sigma_1\sigma_2} + \frac{(y - \mu_2)^2}{\sigma_2^2} \right]. \]
Since $|\rho| < 1$, the locus of the points for which $r$ is constant is an ellipse. It is not difficult to see that a parametrization of such an ellipse at level $r$ is
\[ x = r\sigma_1\cos\theta + \mu_1, \qquad y = r\sigma_2\left( \rho\cos\theta + \sqrt{1 - \rho^2}\,\sin\theta \right) + \mu_2. \]
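This parametrization is easy to check numerically (a small sketch assuming NumPy; all numeric values are arbitrary examples):

\begin{verbatim}
import numpy as np

r = 2.0                          # Mahalanobis level
mu1, mu2 = 1.0, -1.0
s1, s2, rho = 2.0, 1.0, 0.6

theta = np.linspace(0.0, 2.0 * np.pi, 200)
x = r * s1 * np.cos(theta) + mu1
y = r * s2 * (rho * np.cos(theta) + np.sqrt(1 - rho**2) * np.sin(theta)) + mu2

# plugging back into the quadratic form recovers r^2 at every angle
q = ((x - mu1)**2 / s1**2
     - 2 * rho * (x - mu1) * (y - mu2) / (s1 * s2)
     + (y - mu2)**2 / s2**2) / (1 - rho**2)
assert np.allclose(q, r**2)
\end{verbatim}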
Interestingly, if $M$ is the Cholesky decomposition of $\Sigma$ ($M$ is an upper triangular matrix such that $M^t M = \Sigma$), then this parametrization can be written
\[ (x, y) = r(\cos\theta, \sin\theta)M + (\mu_1, \mu_2); \]
this can be shown with ease by the explicit computation of $M$, namely
\[ M = \begin{bmatrix} \sigma_1 & \rho\sigma_2 \\ 0 & \sigma_2\sqrt{1 - \rho^2} \end{bmatrix}, \]
or even more simply by using the fact that the parametrization above is nothing else than the locus of the points $ruM + \mu$, where $u = (\cos\theta, \sin\theta)$ ranges over the set of points such that $uu^t = 1$. To see this, it suffices to put $z = ruM + \mu$ in the expression $r^2 = (z - \mu)\Sigma^{-1}(z - \mu)^t$. This Cholesky form of the parametrization also holds in higher dimensions, and can be obtained efficiently on computers.
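For instance, with NumPy (a sketch; note that \texttt{np.linalg.cholesky} returns the lower triangular factor $L$ with $LL^t = \Sigma$, so the upper triangular $M$ of the text is $L^t$):

\begin{verbatim}
import numpy as np

s1, s2, rho = 2.0, 1.0, 0.6
Sigma = np.array([[s1**2,      rho*s1*s2],
                  [rho*s1*s2,  s2**2    ]])

L = np.linalg.cholesky(Sigma)    # lower triangular, L @ L.T == Sigma
M = L.T                          # upper triangular, M.T @ M == Sigma

r, theta, mu = 2.0, 0.7, np.array([1.0, -1.0])
z = r * np.array([np.cos(theta), np.sin(theta)]) @ M + mu

# the point z lies at Mahalanobis distance exactly r, as claimed
r2 = (z - mu) @ np.linalg.inv(Sigma) @ (z - mu)
assert np.isclose(r2, r**2)
\end{verbatim}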
Another important fact is that the directions of the two axes of the ellipse above are given by the eigenvectors of $\Sigma$, while the lengths of the great and small radii are $r\sqrt{\lambda_1}$ and $r\sqrt{\lambda_2}$, where $\lambda_1 \ge \lambda_2$ are the eigenvalues of $\Sigma$. To see this, assume the ellipse centered at the origin and write its points as $z = ruM$, with $uu^t = 1$.
The great radius of the ellipse is the segment $OP$, where $O$ is the center of the ellipse (that is, $(0, 0)$), and $P = (x, y)$ is one of the two points of the ellipse that are located at maximal distance from the center of the ellipse. Similarly, the small radius of the ellipse is given by one of the two points of the ellipse that are located at minimal distance from the center. In other words, we have to find $z$ and $u$ such that $uu^t = 1$ and
\[ zz^t = x^2 + y^2 = r^2\, uMM^t u^t = \max. \]
But this is the classic Rayleigh problem, and it is well known that in the maximum case, $zz^t/r^2 = \lambda_1$ is the largest eigenvalue of $MM^t$, and $u^t = u_1^t$ is its corresponding normalized eigenvector. In the minimum case, $zz^t/r^2 = \lambda_2$ is the smallest eigenvalue of $MM^t$ and $u^t = u_2^t$ is its corresponding normalized eigenvector. Note that $MM^t$ has the same eigenvalues as $M^tM = \Sigma$, and that the axis direction $z = ru_1M$ is indeed an eigenvector of $\Sigma$, since $\Sigma(u_1M)^t = M^tMM^tu_1^t = \lambda_1(u_1M)^t$.
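In practice the axes can be read off directly from an eigendecomposition (a sketch with NumPy; \texttt{np.linalg.eigh} returns the eigenvalues in ascending order):

\begin{verbatim}
import numpy as np

Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
r = 2.0

lam, vec = np.linalg.eigh(Sigma)     # ascending eigenvalues, orthonormal columns
small_radius = r * np.sqrt(lam[0])   # length of the small radius
great_radius = r * np.sqrt(lam[1])   # length of the great radius
great_axis_dir = vec[:, 1]           # unit vector along the great axis
\end{verbatim}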
Now let $p(x, y)$ be the general bivariate normal density, with $\Sigma$ as above. We wish to compute the cumulative function of this Gaussian as a function of $r$. More precisely, our aim is to compute the probability that a point falls inside the ellipse given by the parametrization above, for a given Mahalanobis distance $r = R$.
To this end, let us first compute the Jacobian of the parametrization above. It is equal to
\[ \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{vmatrix} \partial x/\partial r & \partial x/\partial\theta \\ \partial y/\partial r & \partial y/\partial\theta \end{vmatrix} = \begin{vmatrix} \sigma_1\cos\theta & -\sigma_1 r\sin\theta \\ \sigma_2(\rho\cos\theta + \sqrt{1 - \rho^2}\,\sin\theta) & \sigma_2 r(-\rho\sin\theta + \sqrt{1 - \rho^2}\,\cos\theta) \end{vmatrix} = \sigma_1\sigma_2\sqrt{1 - \rho^2}\; r = \sqrt{|\Sigma|}\; r, \]
as can be easily checked.
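The determinant can also be double-checked symbolically (a throwaway SymPy sketch; all symbol names are ours):

\begin{verbatim}
import sympy as sp

r, th, s1, s2, rho = sp.symbols('r theta sigma1 sigma2 rho', positive=True)

x = r * s1 * sp.cos(th)
y = r * s2 * (rho * sp.cos(th) + sp.sqrt(1 - rho**2) * sp.sin(th))

J = sp.Matrix([[sp.diff(x, r), sp.diff(x, th)],
               [sp.diff(y, r), sp.diff(y, th)]])
print(sp.simplify(J.det()))   # simplifies to r*sigma1*sigma2*sqrt(1 - rho**2)
\end{verbatim}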
To compute the integral of the Gaussian, we can suppose without loss of generality that $\mu = 0$ (since integrals are invariant under translation). The integral to compute is
\[ \frac{1}{2\pi\sqrt{|\Sigma|}} \int_0^{2\pi} d\theta \int_0^R \sqrt{|\Sigma|}\; r e^{-r^2/2}\, dr. \]
The factor $2\pi$ from the $\theta$ integral cancels the normalization, and with the substitution $r' = r^2/2$, $dr' = r\,dr$, it is equal to
\[ \int_0^R r e^{-r^2/2}\, dr = \int_0^{R^2/2} e^{-r'}\, dr' = 1 - e^{-R^2/2}. \]
Then, setting $p = F(r) = 1 - e^{-r^2/2}$ and solving for $r$,
\[ r^2/2 = -\ln(1 - p), \]
or
\[ r = \sqrt{-2\ln(1 - p)}. \]
Hence,
\[ r = F^{-1}(p) = \sqrt{-2\ln(1 - p)}. \]
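A quick Monte Carlo sanity check of $F(R) = 1 - e^{-R^2/2}$ (a sketch assuming NumPy; the covariance matrix is an arbitrary example):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])
R = 1.5

Z = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)
# squared Mahalanobis distance of each sample
r2 = np.einsum('ij,jk,ik->i', Z, np.linalg.inv(Sigma), Z)

print((r2 <= R**2).mean())     # empirical proportion inside the ellipse
print(1 - np.exp(-R**2 / 2))   # predicted F(R), ~0.675
\end{verbatim}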
C. $n$-d Gaussians

The $n$-variate normal density with mean $\mu = (\mu_1, \ldots, \mu_n)$ and covariance matrix $\Sigma$ is
\[ p(z) = \frac{1}{(2\pi)^{n/2}\sqrt{|\Sigma|}} \exp\!\left[ -\tfrac{1}{2}(z - \mu)\Sigma^{-1}(z - \mu)^t \right]. \]
To verify that the covariance matrix of this density is indeed $\Sigma$, we can assume $\mu = 0$; writing $C = (2\pi)^{n/2}\sqrt{|\Sigma|}$ for the normalization constant, we have to compute
\[ \frac{1}{C} \int Z^T Z \exp(-\tfrac{1}{2}\, Z\Sigma^{-1}Z^T)\, dx_1 dx_2 \cdots dx_n. \]
With the change of variables $Z = YM$, where $M$ is the Cholesky decomposition of $\Sigma$ (so that $Z\Sigma^{-1}Z^T = YY^T$, the Jacobian being $|M|$), this becomes
\[ \frac{1}{C} \int M^T Y^T Y M \exp(-\tfrac{1}{2}\, YY^T)\,|M|\, dy_1 dy_2 \cdots dy_n. \]
Extracting the matrices $M^T$ and $M$ from the integral, and taking into account that $|M| = \sqrt{|\Sigma|}$, the expression above becomes
\[ M^T \left( \frac{1}{(2\pi)^{n/2}} \int Y^T Y \exp(-\tfrac{1}{2}\, YY^T)\, dy_1 dy_2 \cdots dy_n \right) M. \]
The expression inside the parentheses can be shown to be the identity matrix (sketch of the proof: non-diagonal components cancel by symmetry, and diagonal elements can be computed using the volume of the $k$-dimensional sphere). Hence the whole expression is equal to $M^T I M = M^T M = \Sigma$, as was claimed.
Notice that the argument above shows that if an $n$-dimensional centered stochastic variable $X$ is normally distributed with covariance matrix $\Sigma$, and if $\Sigma = M^T M$ is the Cholesky decomposition of $\Sigma$, then $Y = XM^{-1}$ is normally distributed with covariance matrix equal to $I$ (that is, its Gaussian is the product of $n$ independent Gaussians with standard deviation $\sigma = 1$). By the way, this gives a good means to implement stochastic generators of the variable $X$: it suffices to generate $n$ samples $y_1, y_2, \ldots, y_n$ following separately the 1-d normal law with standard deviation $\sigma = 1$, and to multiply $Y = (y_1, y_2, \ldots, y_n)$ by $M$ from the right; this gives an $n$-dimensional stochastic sample of the $n$-variate normal law with covariance matrix $\Sigma$.
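This recipe translates directly into code (a sketch with NumPy; the covariance matrix is an arbitrary example):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[4.0, 1.2, 0.5],
                  [1.2, 1.0, 0.3],
                  [0.5, 0.3, 2.0]])

M = np.linalg.cholesky(Sigma).T          # upper triangular, M.T @ M == Sigma

# rows of Y: n independent standard normal samples; multiply by M from the right
Y = rng.standard_normal((100_000, 3))
X = Y @ M

print(np.cov(X, rowvar=False))           # close to Sigma
\end{verbatim}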
Now, our aim is to find the cumulative function of the Gaussian which composes $X$ as a function of the Mahalanobis distance, that is, to find, for each $r$, the proportion of instances of $X$ that statistically fall at a Mahalanobis distance $\le r$ from the Gaussian composing $X$. Without loss of generality, the variable $X$ can be assumed to be centered. Recall that the Mahalanobis distance of a point $p = (p_1, \ldots, p_n)$ to this Gaussian is given by $r^2 = p\Sigma^{-1}p^T$. Letting $p$ be an instance of $X$, we can see $r^2$ itself as a random variable, that is, $r^2 = X\Sigma^{-1}X^T$, giving rise to the variable
\[ r = \sqrt{X\Sigma^{-1}X^T}. \]
Since $r^2 = X\Sigma^{-1}X^T = YY^T = y_1^2 + \cdots + y_n^2$ is a sum of $n$ squared independent standard normal variables, it follows a chi-squared law with $n$ degrees of freedom, whose density is
\[ \frac{x^{(n-2)/2}\, e^{-x/2}}{2^{n/2}\,\Gamma(n/2)}; \]
by the change of variable $x = y^2$, the density of $r$ itself is
\[ \frac{y^{n-1}\, e^{-y^2/2}}{2^{n/2 - 1}\,\Gamma(n/2)}, \]
so that
\[ F(r) = \frac{1}{2^{n/2 - 1}\,\Gamma(n/2)} \int_0^r y^{n-1} e^{-y^2/2}\, dy. \]
For odd $n = 2m + 1$, we have $\Gamma(n/2) = (m - 1/2)(m - 3/2) \cdots (3/2)(1/2)\sqrt{\pi}$, so that
\[ F(r) = \frac{1}{2^{m - 1/2}\,(m - 1/2)(m - 3/2) \cdots (3/2)(1/2)\sqrt{\pi}} \int_0^r y^{n-1} e^{-y^2/2}\, dy. \]
Simplifying by the factors 2 leads to
\[ F(r) = \frac{\sqrt{2/\pi}}{3 \cdot 5 \cdot 7 \cdots (n - 2)} \int_0^r y^{n-1} e^{-y^2/2}\, dy. \]
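The closed form can be checked against SciPy's chi distribution, which implements exactly this $F$ (a sketch; the odd dimension $n = 5$ is an arbitrary example):

\begin{verbatim}
import numpy as np
from scipy.stats import chi
from scipy.integrate import quad

n, r = 5, 2.0   # odd dimension, Mahalanobis level

integral, _ = quad(lambda y: y**(n - 1) * np.exp(-y**2 / 2), 0.0, r)
closed_form = np.sqrt(2 / np.pi) / 3 * integral   # 3*5*...*(n-2) = 3 for n = 5

print(closed_form)        # ~0.4506
print(chi.cdf(r, df=n))   # same value
\end{verbatim}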
In practice $F$ can be inverted via the inverse cumulative of the chi-squared law, available for instance in MATLAB as
\[ \texttt{chi2inv}(c, n). \]
Since $r^2$ follows a chi-squared law with $n$ degrees of freedom, $\sqrt{\texttt{chi2inv}(c, n)}$ is the Mahalanobis distance threshold below which a proportion $c$ of the points emitted by any $n$-variate normal law statistically falls.
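The same threshold in Python (a sketch; \texttt{scipy.stats.chi2.ppf} is the counterpart of \texttt{chi2inv}):

\begin{verbatim}
import numpy as np
from scipy.stats import chi2

c, n = 0.95, 3
R = np.sqrt(chi2.ppf(c, df=n))   # Mahalanobis threshold enclosing a proportion c
print(R)                         # ~2.7955 for c = 0.95, n = 3
\end{verbatim}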