Mathematics for Informatics 4a: The Story of the Film So Far...


Mathematics for Informatics 4a
José Figueroa-O’Farrill
Lecture 14, 9 March 2012

The story of the film so far...

- X, Y independent random variables and Z = X + Y:
  f_Z = f_X ∗ f_Y, where ∗ is the convolution
- X, Y with joint density f(x, y) and Z = g(X, Y):
  E(Z) = ∬ g(x, y) f(x, y) dx dy
- X, Y independent:
  E(XY) = E(X) E(Y)
  Var(X + Y) = Var(X) + Var(Y)
  M_{X+Y}(t) = M_X(t) M_Y(t), where M_X(t) = E(e^{tX})
- We defined covariance and correlation of two r.v.s
- Proved Markov’s and Chebyshev’s inequalities
- Proved the (weak) law of large numbers and the Chernoff bound
- Waiting times of Poisson processes are exponentially distributed

More approximations

In Lecture 7 we saw that the binomial distribution with parameters n, p can
be approximated by a Poisson distribution with parameter λ in the limit as
n → ∞, p → 0 but np → λ:

  C(n, k) p^k (1 - p)^{n-k} ∼ e^{-λ} λ^k / k!

But what about if n → ∞ but p ↛ 0?

For example, consider flipping a fair coin n times and let X denote the
discrete random variable which counts the number of heads. Then

  P(X = k) = C(n, k) / 2^n

Lectures 6 and 7: this distribution has µ = n/2 and σ² = n/4.

[Plots: the distribution of X for n = 5, 10, 20, 50.]
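The Poisson approximation can be checked numerically. The following Python sketch (illustrative only, not part of the lecture) fixes λ = np = 2 and lets n grow with p = λ/n → 0; the binomial probabilities approach the Poisson ones:

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    """Exact binomial probability C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(lam, k):
    """Poisson probability e^(-lam) lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

# Fix lam = n*p = 2 and let n grow: the binomial pmf converges to the
# Poisson pmf pointwise in k, and the worst-case gap shrinks with n.
lam = 2.0
for n in (10, 100, 1000):
    p = lam / n
    err = max(abs(binom_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(10))
    print(f"n = {n:4d}: max_k |binomial - Poisson| = {err:.5f}")
```

The discrepancy shrinks roughly like λ²/n, which is the standard rate for this approximation.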
Normal limit of (symmetric) binomial distribution

Theorem
Let X be binomial with parameters n and p = 1/2. Then for n large and
k - n/2 not too large,

  C(n, k) (1/2)^n ≃ √(2/(nπ)) e^{-2(k - n/2)²/n} = (1/√(2πσ²)) e^{-(k - µ)²/(2σ²)}

for µ = n/2 and σ² = n/4.

The proof rests on the de Moivre/Stirling formula for the factorial of a
large number:

  n! ≃ √(2πn) n^n e^{-n}

which implies that

  C(n, n/2) = n! / ((n/2)! (n/2)!) ≃ √(2/(πn)) 2^n

Proof
Let k = n/2 + x. Then

  C(n, k) 2^{-n} = n! 2^{-n} / ((n/2 + x)! (n/2 - x)!)
                 = [n! 2^{-n} / ((n/2)! (n/2)!)]
                   × [(n/2)(n/2 - 1) ··· (n/2 - (x - 1))]
                   / [(n/2 + 1)(n/2 + 2) ··· (n/2 + x)]
                 ≃ √(2/(nπ))
                   × [1 · (1 - 2/n) ··· (1 - (x - 1)(2/n))]
                   / [(1 + 2/n)(1 + 2(2/n)) ··· (1 + x(2/n))]

Now we use the exponential approximation

  1 - z ≃ e^{-z}   and   1/(1 + z) ≃ e^{-z}

(valid for z small) to rewrite the big fraction on the RHS.
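Before continuing the proof, the theorem's approximation is easy to sanity-check numerically (an illustrative Python sketch, not from the slides). For n = 100 the exact symmetric binomial probabilities and the normal curve agree to about three decimal places near k = n/2:

```python
from math import comb, exp, pi, sqrt

def exact_pmf(n, k):
    """Exact P(X = k) for X ~ Binomial(n, 1/2)."""
    return comb(n, k) / 2**n

def normal_approx(n, k):
    """The theorem's approximation: sqrt(2/(n*pi)) * exp(-2*(k - n/2)^2 / n)."""
    return sqrt(2 / (n * pi)) * exp(-2 * (k - n / 2) ** 2 / n)

# Compare at k near n/2, where the theorem applies.
n = 100
for k in (50, 55, 60):
    print(f"k = {k}: exact = {exact_pmf(n, k):.6f}, "
          f"approx = {normal_approx(n, k):.6f}")
```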

Proof – continued

  C(n, k) 2^{-n} ≃ √(2/(nπ)) exp(-4/n - 8/n - ··· - 4(x - 1)/n - 2x/n)
                 = √(2/(nπ)) exp(-(4/n)(1 + 2 + ··· + (x - 1)) - 2x/n)
                 = √(2/(nπ)) exp(-(4/n)(x(x - 1)/2) - 2x/n)
                 = √(2/(nπ)) e^{-2x²/n}

which is indeed a normal distribution with σ² = n/4.

A similar proof shows that the general binomial distribution with µ = np
and σ² = np(1 - p) is also approximated by a normal distribution with the
same µ and σ².

Example (Rolling a die ad nauseam)
It’s raining outside, you are bored and you roll a fair die 12000 times.
Let X be the number of sixes. What is P(1900 < X < 2200)?

The variable X is the sum X_1 + ··· + X_{12000}, where X_i is the number of
sixes on the ith roll. This means that X is binomially distributed with
parameters n = 12000 and p = 1/6, so µ = np = 2000 and
σ² = np(1 - p) = 5000/3.

X ∈ (1900, 2200) iff (X - 2000)/σ ∈ (-√6, 2√6), whence

  P(1900 < X < 2200) ≃ Φ(2√6) - Φ(-√6) ≃ 0.992847

The exact result is

  Σ_{k=1901}^{2199} C(12000, k) (1/6)^k (5/6)^{12000-k} ≃ 0.992877
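Both numbers in the die example can be reproduced with a short Python sketch (illustrative, not from the lecture). One practical wrinkle: (1/6)^2000 underflows to 0 in floating point, so the binomial pmf is evaluated in log space via `lgamma`:

```python
from math import erf, exp, lgamma, log, sqrt

def Phi(x):
    """Standard normal c.d.f., via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def log_binom_pmf(n, p, k):
    """log of C(n, k) p^k (1-p)^(n-k); log space avoids float underflow."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

n, p = 12000, 1 / 6

# Exact: sum the binomial pmf over 1901 <= k <= 2199.
exact = sum(exp(log_binom_pmf(n, p, k)) for k in range(1901, 2200))

# Normal approximation from the slide: Phi(2*sqrt(6)) - Phi(-sqrt(6)).
approx = Phi(2 * sqrt(6)) - Phi(-sqrt(6))

print(f"exact  = {exact:.6f}")   # ~ 0.992877 as on the slide
print(f"approx = {approx:.6f}")  # ~ 0.992847
```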
Normal limit of Poisson distribution

[Plots: Poisson distributions for λ = 5, 10, 20, 50.]

We have just shown that in certain limits of the defining parameters, two
discrete probability distributions tend to normal distributions:

- the binomial distribution in the limit n → ∞,
- the Poisson distribution in the limit λ → ∞.

What about continuous probability distributions? We could try with the
uniform or exponential distributions:

[Plots: the uniform and exponential probability density functions.]

No amount of rescaling is going to work. Why?

The binomial and Poisson distributions have the following property:

- if X, Y are binomially distributed with parameters (n, p) and (m, p),
  X + Y is binomially distributed with parameters (n + m, p)
- if X, Y are Poisson distributed with parameters λ and µ, X + Y is Poisson
  distributed with parameter λ + µ

It follows that if X_1, X_2, ... are i.i.d. with binomial distribution with
parameters (m, p), then X_1 + ··· + X_n is binomial with parameters (nm, p).
Therefore m large is equivalent to adding many of the X_i.

It also follows that if X_1, X_2, ... are i.i.d. with Poisson distribution
with parameter λ, then X_1 + ··· + X_n is Poisson distributed with parameter
nλ, and again λ large is equivalent to adding a large number of the X_i.

The situation with the uniform and exponential distributions is different.

Sum of uniformly distributed variables

X_i i.i.d. uniformly distributed on [0, 1]: then X_1 + ··· + X_n is
distributed as in the plots below.

[Plots: the density of X_1 + ··· + X_n for n = 1, 2, 3, 4.]
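The emerging bell shape of the sum of uniforms can be seen in a quick Monte Carlo sketch (illustrative Python, not from the lecture). Here the empirical mean and variance of the sum are compared with the exact values n/2 and n/12:

```python
import random

random.seed(0)   # fixed seed for reproducibility
N = 100_000      # samples per experiment

# Sample X_1 + ... + X_n for X_i ~ Uniform[0, 1] and compare the empirical
# mean and variance of the sum with the exact n/2 and n/12.
for n in (1, 2, 3, 4):
    sums = [sum(random.random() for _ in range(n)) for _ in range(N)]
    mean = sum(sums) / N
    var = sum((s - mean) ** 2 for s in sums) / N
    print(f"n = {n}: mean = {mean:.3f} (exact {n/2:.3f}), "
          f"var = {var:.3f} (exact {n/12:.3f})")
```

A histogram of `sums` for n = 3 or 4 already looks close to the normal curve, matching the plots above.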
Sum of exponentially distributed variables

If X_i are i.i.d. exponentially distributed with parameter λ, we already
saw that Z_2 = X_1 + X_2 has a “gamma” probability density function:

  f_{Z_2}(z) = λ² z e^{-λz}

It is not hard to show that Z_n = X_1 + ··· + X_n has probability density
function

  f_{Z_n}(z) = λ^n (z^{n-1} / (n - 1)!) e^{-λz}

What happens when we take n large?

[Plots: f_{Z_n} for n = 5, 10, 20, 75.]
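The gamma density above can be sanity-checked by crude numerical integration (an illustrative Python sketch, not part of the slides): it should integrate to 1 and have mean n/λ, since it is the density of a sum of n exponentials each with mean 1/λ:

```python
from math import exp, factorial

def f_Zn(n, lam, z):
    """Density of Z_n = X_1 + ... + X_n from the slide:
    lam^n z^(n-1) e^(-lam z) / (n-1)!."""
    return lam**n * z ** (n - 1) * exp(-lam * z) / factorial(n - 1)

# Crude Riemann-sum check on [0, 60], where essentially all the mass lives
# for these parameters: total mass ~ 1 and mean ~ n/lam.
lam, n, dz = 1.0, 5, 0.001
zs = [i * dz for i in range(1, 60_000)]
total = sum(f_Zn(n, lam, z) * dz for z in zs)
mean = sum(z * f_Zn(n, lam, z) * dz for z in zs)
print(f"integral = {total:.4f}, mean = {mean:.4f}")  # ~ 1.0 and ~ 5.0
```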

The Central Limit Theorem

Let X_1, X_2, ... be i.i.d. random variables with mean µ and variance σ².
Let Z_n = X_1 + ··· + X_n.

Then Z_n has mean nµ and variance nσ², but in addition we have

Theorem (Central Limit Theorem)
In the limit as n → ∞,

  P((Z_n - nµ)/(√n σ) ≤ x) → Φ(x)

with Φ the c.d.f. of the standard normal distribution. In other words, for
n large, Z_n is normally distributed.

Our 4-line proof of the CLT rests on Lévy’s continuity law, which we will
not prove. Paraphrasing: “the m.g.f. determines the c.d.f.” It is then
enough to show that the limit n → ∞ of the m.g.f. of (Z_n - nµ)/(√n σ) is
the m.g.f. of the standard normal distribution.

Proof of CLT
- We shift the mean: the variables Y_i = X_i - µ are i.i.d. with mean 0 and
  variance σ², and Z_n - nµ = Y_1 + ··· + Y_n.
- M_{Z_n - nµ}(t) = M_{Y_1}(t) ··· M_{Y_n}(t) = M_{Y_1}(t)^n, by i.i.d.
- M_{(Z_n - nµ)/(√n σ)}(t) = M_{Z_n - nµ}(t/(√n σ)) = M_{Y_1}(t/(√n σ))^n
- M_{(Z_n - nµ)/(√n σ)}(t) = (1 + σ²t²/(2nσ²) + ···)^n → e^{t²/2}, which is
  the m.g.f. of a standard normal variable.
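The limit in the last step of the proof can be watched numerically. As an illustrative sketch (the choice of distribution is mine, not from the slides), take Y_i = ±1 with probability 1/2 each, so the mean is 0, σ = 1 and M_Y(t) = cosh(t); the proof says M_Y(t/√n)^n → e^{t²/2}:

```python
from math import cosh, exp, sqrt

# Y_i = ±1 with probability 1/2 each: mean 0, variance 1, and
# m.g.f. M_Y(t) = E(e^{tY}) = (e^t + e^{-t})/2 = cosh(t).
t = 1.0
target = exp(t**2 / 2)   # m.g.f. of the standard normal at t
for n in (10, 100, 10_000):
    mgf_n = cosh(t / sqrt(n)) ** n   # M_Y(t/sqrt(n))^n from the proof
    print(f"n = {n:5d}: {mgf_n:.6f}  (target {target:.6f})")
```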
Crucial observation
The CLT holds regardless of how the X_i are distributed! The sum of any
large number of i.i.d. random variables always tends to a normal
distribution. This also explains why normal distributions are so popular in
probabilistic modelling. Let us look at a few examples.

Example (Rounding errors)
Suppose that you round off 108 numbers to the nearest integer, and then add
them to get the total S. Assume that the rounding errors are independent
and uniform on [-1/2, 1/2]. What is the probability that S is wrong by more
than 3? more than 6?

Let Z = X_1 + ··· + X_{108}. We may approximate it by a normal distribution
with µ = 0 and σ² = 108/12 = 9, whence σ = 3. S is wrong by more than 3 iff
|Z| > 3, i.e. |Z - µ|/σ > 1, and hence

  P(|Z - µ| > σ) = 1 - P(|Z - µ| ≤ σ) = 1 - (2Φ(1) - 1) = 2(1 - Φ(1)) ≃ 0.3174

S is wrong by more than 6 iff |Z - µ|/σ > 2 and hence

  P(|Z - µ| > 2σ) = 1 - P(|Z - µ| ≤ 2σ) = 1 - (2Φ(2) - 1) = 2(1 - Φ(2)) ≃ 0.0456
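The two tail probabilities follow from the standard normal c.d.f., which Python exposes through the error function (an illustrative sketch, not part of the lecture):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal c.d.f.: Phi(x) = (1 + erf(x/sqrt(2)))/2."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Z = sum of 108 rounding errors, approximately N(0, 9), so sigma = 3.
sigma = sqrt(108 / 12)
p3 = 2 * (1 - Phi(3 / sigma))   # P(S off by more than 3) = P(|Z - mu| > sigma)
p6 = 2 * (1 - Phi(6 / sigma))   # P(S off by more than 6) = P(|Z - mu| > 2 sigma)
print(f"P(off by > 3) = {p3:.4f}")   # ~ 0.3174
print(f"P(off by > 6) = {p6:.4f}")   # ~ 0.0456
```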

Place your bets!

Example (Roulette)
A roulette wheel has 38 slots: the numbers 1 to 36 (18 black, 18 red) and
the numbers 0 and 00 in green. You place a £1 bet on whether the ball will
land on a red or black slot and win £1 if it does. Otherwise you lose the
bet. Therefore you win £1 with probability 18/38 = 9/19 and you “win” -£1
with probability 20/38 = 10/19. After 361 spins of the wheel, what is the
probability that you are ahead? (Notice that 361 = 19².)

Example (Roulette – continued)
Let X_i denote your winnings on the ith spin of the wheel. Then
P(X_i = 1) = 9/19 and P(X_i = -1) = 10/19. The mean is therefore

  µ = P(X_i = 1) - P(X_i = -1) = -1/19

and the variance is

  σ² = P(X_i = 1) + P(X_i = -1) - µ² = 1 - 1/361 = 360/361

Then Z = X_1 + ··· + X_{361} has mean -19 and variance 360. This means that
after 361 spins you are down £19 on average. We are after the probability
P(Z > 0):

  P(Z > 0) = P((Z + 19)/√360 > 19/√360)
           = 1 - P((Z + 19)/√360 ≤ 19/√360)
           ≃ 1 - Φ(1) ≃ 0.1587

So there is about a 16% chance that you are ahead.
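Since Z = 2W - 361, where W is the number of winning spins, Z > 0 exactly when W ≥ 181, and the exact answer is a binomial tail sum. This Python sketch (illustrative, not from the slides) compares it with the CLT estimate; note the slide rounds 19/√360 ≈ 1.0014 to 1:

```python
from math import comb, erf, sqrt

def Phi(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# You are ahead after 361 spins iff you win at least 181 of them, where
# the number of wins W is Binomial(361, 9/19).
p_win = 9 / 19
exact = sum(comb(361, k) * p_win**k * (1 - p_win) ** (361 - k)
            for k in range(181, 362))

# CLT estimate from the slide, without rounding 19/sqrt(360) to 1.
approx = 1 - Phi(19 / sqrt(360))
print(f"exact  = {exact:.4f}")
print(f"approx = {approx:.4f}")
```

Both come out near 0.16, confirming the slide's "about a 16% chance".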
Example (Measurements in astronomy)
Astronomical measurements are subject to the vagaries of weather conditions
and other sources of errors. Hence in order to estimate, say, the distance
to a star one takes the average of many measurements. Let us assume that
different measurements are i.i.d. with mean d (the distance to the star)
and variance 4 (light-years²). How many measurements should we take to be
“reasonably sure” that the estimated distance is accurate to within half a
light-year?

Let X_i denote the measurements and Z_n = X_1 + ··· + X_n. Let’s say that
“reasonably sure” means 95%, which is 2σ in the standard normal
distribution. (In Particle Physics, “reasonably sure” means 5σ, but this is
Astronomy.) Then we are after n such that

  P(|Z_n/n - d| ≤ 0.5) ≃ 0.95

Example (Measurements in astronomy – continued)
By the CLT we can assume that (Z_n - nd)/(2√n) is standard normal, so we
are after

  P(|(Z_n - nd)/(2√n)| ≤ √n/4) ≃ 0.95

or, equivalently, √n/4 = 2, so that n = 64.

A question remains: is n = 64 large enough for the CLT? To answer it, we
need to know more about the distribution of the X_i. However Chebyshev’s
inequality can be used to provide a safe n. Since E(Z_n/n) = d and
Var(Z_n/n) = 4/n, Chebyshev’s inequality says

  P(|Z_n/n - d| > 0.5) ≤ 4/(n(0.5)²) = 16/n

so choosing n = 320 gives P(|Z_n/n - d| > 0.5) ≤ 0.05, or
P(|Z_n/n - d| ≤ 0.5) ≥ 0.95 as desired.
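Both sample-size calculations are one-liners; this sketch (illustrative Python, not from the lecture) recovers n = 64 from the CLT criterion and n = 320 from the Chebyshev bound:

```python
from math import ceil, sqrt

var, half = 4.0, 0.5   # measurement variance and desired accuracy
sigma = sqrt(var)

# CLT: the sample mean has standard deviation sigma/sqrt(n); requiring
# half = 2 * sigma/sqrt(n) (a 2-sigma, ~95% interval) gives n.
n_clt = ceil((2 * sigma / half) ** 2)
print(n_clt)    # 64

# Chebyshev: P(|Zn/n - d| > half) <= var/(n * half^2); require <= 0.05.
n_cheb = ceil(var / (half**2 * 0.05))
print(n_cheb)   # 320
```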

Summary

- The binomial distribution with parameters n, p can be approximated by a
  normal distribution (with the same mean and variance) for n large.
- Similarly for the Poisson distribution with parameter λ as λ → ∞.
- These are special cases of the Central Limit Theorem: if the X_i are
  i.i.d. with mean µ and (nonzero) variance σ², the sum Z_n = X_1 + ··· + X_n
  for n large is normally distributed.
- We saw some examples of the use of the CLT: rounding errors, a roulette
  game, and astronomical measurements.
