Basics of Probability and Statistics
Outline
Probability
Statistical Measures
Probability
Idea of Probability
Probability is the science of chance behavior
number of repetitions.
{1,2,3,…,100}
Random Experiment
If an experiment has n possible outcomes, all equally likely to occur, each outcome has probability 1/n.
A random experiment is an action or process that leads to one of several possible outcomes. For example:
Experiment Outcomes
simulation
Relative-Frequency Probabilities
Coin flipping:
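The relative-frequency idea can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the function name `relative_frequency_of_heads` and the fixed seed are assumptions made for illustration. It estimates P(Head) as the proportion of heads in a growing number of simulated flips of a fair coin.

```python
import random


def relative_frequency_of_heads(num_flips, seed=0):
    """Estimate P(Head) as (number of heads) / (number of flips)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


# The estimate should settle near 0.5 as the number of flips grows.
for n in (10, 100, 10_000):
    print(n, relative_frequency_of_heads(n))
```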
Probability Models
[Figure: Sample Space containing Event 1, Event 2, Event 3, Event 4, and Event 5]
Example
Sample Space = {1,2,3,4,5,6}
Rolling an odd number = {1,3,5}
Rolling an even number = {2,4,6}
Rolling a prime number = {2,3,5}
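Events are subsets of the sample space, so they can be written as Python sets. A minimal sketch, assuming a fair six-sided die; the helper name `probability` is hypothetical.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

# Each event is the subset of outcomes for which the event occurs.
odd = {x for x in sample_space if x % 2 == 1}
even = {x for x in sample_space if x % 2 == 0}
prime = {2, 3, 5}


def probability(event, space):
    """Equally likely outcomes: favorable outcomes / possible outcomes."""
    return Fraction(len(event & space), len(space))


print(sorted(odd), sorted(even), sorted(prime))
print(probability(odd, sample_space), probability(prime, sample_space))
```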
Probability Model for Two Dice
Random phenomenon: roll pair of fair dice.
Sample space:
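The sample space for a pair of dice can be enumerated directly. A minimal sketch, assuming two distinguishable fair dice so that all ordered pairs are equally likely; the helper `prob` and the sum-equals-7 event are illustrative choices, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate on one roll."""
    favorable = sum(1 for roll in sample_space if event(roll))
    return Fraction(favorable, len(sample_space))


p_sum_7 = prob(lambda roll: sum(roll) == 7)
print(len(sample_space), p_sum_7)
```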
What is a PROBABILITY?
0 ¼ or .25 ½ or .5 ¾ or .75 1
$P(\text{even \#}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{3}{6} = \frac{1}{2}$
$P(\text{Head}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{1}{2}$
Key Concepts:
Probability of Simple Events
1) P(black) = 4/8
2) P(1) = 1/8
3) P(odd) = 1/2
4) P(prime) = 1/2
Probability of Simple Events
1) P(red) =1/2
2) P(2) = 1/4
3) P(not red) = 1/2
4) P(even) = 1/2
Probability of Simple Events
Properties of Probability:
$0 \le P(E) \le 1$
$P(E) + P(\bar{E}) = 1$
$P(\bar{E}) = 1 - P(E)$
$P(E) = 1 - P(\bar{E})$
Complementary Events
Example I: A sequence of 5 bits is randomly generated. What is
the probability that at least one of these bits is zero?
Solution: There are $2^5 = 32$ possible outcomes of generating
such a sequence.
Define event E as "at least one of the bits is zero".
Then the event $\bar{E}$, "none of the bits is zero", includes only one
of these outcomes, namely the sequence 11111.
Therefore, $p(\bar{E}) = 1/32$.
Now p(E) can easily be computed as
$p(E) = 1 - p(\bar{E}) = 1 - 1/32 = 31/32$.
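The complement argument can be checked by brute force. A minimal sketch that enumerates all 5-bit sequences, assuming each sequence is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2**5 sequences of five bits.
sequences = list(product("01", repeat=5))

# Complement event: none of the bits is zero (only 11111 qualifies).
p_no_zero = Fraction(sum(1 for s in sequences if "0" not in s), len(sequences))
p_at_least_one_zero = 1 - p_no_zero

print(len(sequences), p_no_zero, p_at_least_one_zero)
```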
Complementary Events
Example II: What is the probability that at least two out of 36
people have the same birthday?
Solution: The sample space S encompasses all possibilities
for the birthdays of the 36 people, so $|S| = 365^{36}$.
Let us consider the event $\bar{E}$ ("no two people out of 36 have the
same birthday").
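The remaining computation can be sketched numerically. This is an illustration under stated assumptions, not the slide's own worked solution: 365 equally likely birthdays, independent people, and a hypothetical helper `p_all_distinct`. The complement event has probability (365/365)(364/365)...(330/365), and the answer is one minus that product.

```python
from fractions import Fraction


def p_all_distinct(people, days=365):
    """Probability that no two of `people` share a birthday."""
    p = Fraction(1)
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p *= Fraction(days - k, days)
    return p


p_shared = 1 - p_all_distinct(36)
print(float(p_shared))
```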
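Independent events: $P(A \cap B) = P(A)\,P(B)$
This rule can extend to any number of independent
events.
Mutually exclusive events: $P(A \cap B) = 0$
In a Venn diagram this means that event A is disjoint from event B.
General addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Addition rule for disjoint events: $P(A \cup B) = P(A) + P(B)$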
e.g. There are 2 red and 3 blue counters in a bag and, without
looking, we take out one counter and do not replace it.
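The probability of a 2nd counter taken from the bag being red
depends on whether the 1st counter was red or blue.
P(A | B) means the probability that A occurs given that B has occurred.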
e.g. 1. The following table gives data on the type of car, grouped
by petrol consumption, owned by 100 people.
100
However, I haven’t proved the formula, just shown that it works for
one particular problem.
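e.g. 2. I have 2 packets of seeds. One contains 20 seeds and although they
look the same, 8 will give red flowers and 12 blue. The 2nd packet has 25
seeds of which 15 will be red and 10 blue.
Draw a Venn diagram and use it to illustrate the conditional probability formula.
Solution: Let R be the event "Red flower" and F be the event "First packet".
Venn diagram counts (total: 20 + 25 = 45 seeds):
F only (first packet, blue): 12
F and R (first packet, red): 8
R only (second packet, red): 15
Neither (second packet, blue): 10
$P(R \text{ and } F) = \frac{8}{45}$
$P(R \mid F) = \frac{8}{20}$
$P(F) = \frac{20}{45}$
$P(R \mid F) \times P(F) = \frac{8}{20} \times \frac{20}{45} = \frac{8}{45}$
So, $P(R \text{ and } F) = P(R \mid F) \times P(F)$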
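The seed-packet numbers can be checked with exact fractions. A minimal sketch using the counts from the example; the variable names are illustrative.

```python
from fractions import Fraction

total_seeds = 20 + 25           # both packets together
red_in_first = 8                # seeds that are red AND in the first packet
first_packet = 20               # seeds in the first packet

p_r_and_f = Fraction(red_in_first, total_seeds)
p_f = Fraction(first_packet, total_seeds)
p_r_given_f = Fraction(red_in_first, first_packet)

# Multiplication rule: P(R and F) = P(R | F) * P(F)
print(p_r_and_f, p_r_given_f * p_f)
```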
Summary
The probability that both event A and event B occur is given by
$P(A \text{ and } B) = P(A \mid B) \times P(B)$
We often use this in the form
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
$P(A \cap B) = P(A \mid B)\,P(B)$
Computing Probabilities
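$P(J_1 \cap R) = P(R \mid J_1)\,P(J_1) = \frac{3}{8} \cdot \frac{1}{3} = \frac{1}{8}$
Similar calculations show:
$P(J_2 \cap R) = P(R \mid J_2)\,P(J_2) = \frac{1}{6} \cdot \frac{1}{3} = \frac{1}{18}$
$P(J_3 \cap R) = P(R \mid J_3)\,P(J_3) = \frac{4}{9} \cdot \frac{1}{3} = \frac{4}{27}$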
Venn Diagram
Updating our Venn Diagram with these
probabilities:
Where are we going with this?
Our original problem was:
One jar is chosen at random and a ball is selected.
If the ball is red, what is the probability that it came
from the 2nd jar?
In terms of the events we’ve defined we want:
$P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(R)}$
Finding our Probability
We already know what the numerator portion is
from our Venn Diagram
What is the denominator portion?
$P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(R)} = \frac{P(J_2 \cap R)}{P(J_1 \cap R) + P(J_2 \cap R) + P(J_3 \cap R)}$
Arithmetic!
Plugging in the appropriate values:
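$P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(J_1 \cap R) + P(J_2 \cap R) + P(J_3 \cap R)} = \frac{\frac{1}{18}}{\frac{1}{8} + \frac{1}{18} + \frac{4}{27}} = \frac{12}{71} \approx 0.17$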
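The arithmetic can be reproduced with exact fractions. A minimal sketch using the values from the jar calculations (each jar chosen with probability 1/3, and P(R | J1) = 3/8, P(R | J2) = 1/6, P(R | J3) = 4/9); the dictionary names are illustrative.

```python
from fractions import Fraction

prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}            # P(J_k)
likelihood_red = {1: Fraction(3, 8), 2: Fraction(1, 6), 3: Fraction(4, 9)}   # P(R | J_k)

# Joint probabilities P(J_k and R) = P(R | J_k) * P(J_k)
joint = {k: likelihood_red[k] * prior[k] for k in prior}

# Total probability of red: sum over the three jars
p_red = sum(joint.values())

# Posterior P(J_k | R) = P(J_k and R) / P(R)
posterior = {k: joint[k] / p_red for k in joint}

print(p_red, posterior[2], float(posterior[2]))
```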
Bayes’ Theorem:
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{\sum_{n} P(B \mid A_n)\,P(A_n)}$
Discrete Random Variables
Random variables that have a finite (countable) list of
possible outcomes, with probabilities assigned to each
of these outcomes, are called discrete
Discrete example: roll of a die
[Figure: bar chart of p(x) against x, with p(x) = 1/6 for x = 1, 2, 3, 4, 5, 6]
$\sum_{\text{all } x} p(x) = 1$
Probability Distribution Function (PDF)
x        p(x)
1        p(x=1) = 1/6
2        p(x=2) = 1/6
3        p(x=3) = 1/6
4        p(x=4) = 1/6
5        p(x=5) = 1/6
6        p(x=6) = 1/6
Total    1.0
Cumulative Distribution Function (CDF)
[Figure: step plot of the cumulative probability P(x) against x = 1, ..., 6, with levels 1/6, 1/3, 1/2, 2/3, 5/6, and 1.0]
Cumulative Distribution Function (CDF)
x P(x≤A)
1 P(x≤1)=1/6
2 P(x≤2)=2/6
3 P(x≤3)=3/6
4 P(x≤4)=4/6
5 P(x≤5)=5/6
6 P(x≤6)=6/6
Examples
P(x≤3)=1/2
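The two tables for the die can be generated rather than typed. A minimal sketch, assuming a fair die; `pmf` and `cdf` are illustrative names for the probability table and its running total.

```python
from fractions import Fraction
from itertools import accumulate

faces = range(1, 7)

# Probability of each face of a fair die.
pmf = {x: Fraction(1, 6) for x in faces}

# Cumulative probability P(x <= A): running sum of the pmf.
cdf = dict(zip(faces, accumulate(pmf[x] for x in faces)))

for x in faces:
    print(x, pmf[x], cdf[x])
```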
Binomial
Yes/no outcomes (dead/alive,
treated/untreated, smoker/non-smoker,
sick/well, etc.)
Poisson
Counts (e.g., how many cases of disease
in a given area)
Continuous Random Variables
Random variables that can take on any
value in an interval, with probabilities given
as areas under a density curve, are called
continuous
Continuous random variables
weight
temperature
Probability Density Function (PDF)
The probability function that accompanies a continuous
particular.
Probability Density Function (PDF)
$f(x) = e^{-x}$
$\int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = 0 + 1 = 1$
Probability Density Function (PDF)
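$p(x) = e^{-x}$
[Figure: density curve p(x) against x, with x = 1 and x = 2 marked]
$P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left[-e^{-x}\right]_1^2 = -e^{-2} + e^{-1} = -.135 + .368 = .23$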
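The area under the density between 1 and 2 can also be approximated numerically. A minimal sketch using a midpoint rule; the helper `integrate` and the step count are assumptions for illustration, and the closed form is included for comparison.

```python
import math


def density(x):
    """Exponential density p(x) = e^(-x) for x >= 0."""
    return math.exp(-x)


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


p_1_to_2 = integrate(density, 1, 2)
exact = math.exp(-1) - math.exp(-2)
print(p_1_to_2, exact)
```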
Cumulative Distribution Function (CDF)
$P(x \le A) = \int_0^A e^{-x}\,dx = \left[-e^{-x}\right]_0^A = -e^{-A} + e^{0} = -e^{-A} + 1 = 1 - e^{-A}$
Cumulative Distribution Function (CDF)
[Figure: plot of p(x) against x, with x = 2 marked]
$P(x \le 2) = 1 - e^{-2} = 1 - .135 = .865$
Uniform Density
[Figure: uniform density p(x) = 1 for x between 0 and 1]
$\int_0^1 1\,dx = \left[x\right]_0^1 = 1 - 0 = 1$
Uniform Density
[Figure: uniform density p(x) against x, with x = ¼ and x = ½ marked]
P(1/4 ≤ x ≤ 1/2) = 1/4
The Normal Density Function
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$
This is a bell shaped curve with different centers and spreads depending on $\mu$ and $\sigma$.
Note constants:
$\pi = 3.14159$
$e = 2.71828$
The Normal Density Function
$\int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\,dx = 1$
The Shape of Normal Density
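The normal distribution is bell shaped and symmetrical around μ.
[Figure: normal curve with x = 90 and x = 110 marked]
Why symmetrical? Let μ = 100. Suppose x = 110. Now suppose x = 90.
$f(110) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{110-100}{\sigma}\right)^2} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{10}{\sigma}\right)^2}$
$f(90) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{90-100}{\sigma}\right)^2} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{-10}{\sigma}\right)^2}$
Since $(10/\sigma)^2 = (-10/\sigma)^2$, f(110) = f(90).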
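Both the symmetry and the unit area can be checked numerically. A minimal sketch: μ = 100 comes from the example, while σ = 15, the integration range of μ ± 10σ, and the step count are assumed values chosen only for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 100.0, 15.0  # sigma = 15 is an assumption, not from the slides

# Symmetry: points equally far from mu have the same density.
print(normal_pdf(110, mu, sigma), normal_pdf(90, mu, sigma))

# Midpoint-rule approximation of the total area over mu +/- 10 sigma.
steps = 20_000
lo = mu - 10 * sigma
h = 20 * sigma / steps
area = h * sum(normal_pdf(lo + (i + 0.5) * h, mu, sigma) for i in range(steps))
print(area)
```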
Normal Probability Density
The expected value (also called the mean) E(X) (or μ)
can be any number
The standard deviation σ can be any positive
number
The total area under every normal curve is 1
There are infinitely many normal distributions
Normal Probability Density
$E(\vec{x}) = p_1\vec{x}_1 + p_2\vec{x}_2 + \cdots + p_n\vec{x}_n = \sum_{i=1}^{n} p_i\,\vec{x}_i$
Mean or Average
$\frac{1}{11}\left[(1,2)+(3,4)+(5,6)+(2,4)+(1,1)+(4,2)+(6,5)+(3,1)+(2,1)+(5,3)+(5,5)\right] = (3.3636, 3.0909)$
[Figure: scatter plot of the 11 points with the Mean (3.3636, 3.0909) marked]
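The mean of the 11 two-dimensional points is the coordinate-wise average. A minimal sketch using the points from the slide; the variable names are illustrative.

```python
points = [(1, 2), (3, 4), (5, 6), (2, 4), (1, 1), (4, 2),
          (6, 5), (3, 1), (2, 1), (5, 3), (5, 5)]

n = len(points)

# Average each coordinate separately.
mean = tuple(sum(p[k] for p in points) / n for k in range(2))

print(n, tuple(round(m, 4) for m in mean))
```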
Median (M)
A resistant measure of the data’s center
At least half of the ordered values are less than or equal to
the median value
At least half of the ordered values are greater than or equal
to the median value
If n is odd, the median is the middle ordered value
If n is even, the median is the average of the two middle
ordered values
Median (M)
Location of the median: L(M) = (n+1)/2 ,
where n = sample size.
Example 2 data: 2 4 6 8
Median = 5 (average of 4 and 6)
Example 3 data: 6 2 4
Median ≠ 2
(order the values: 2 4 6 , so Median = 4)
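The median rule can be written as a short function. A minimal sketch that follows the odd/even cases described for the median; the function name `median` is illustrative (Python's standard library also offers `statistics.median`).

```python
def median(values):
    """Middle ordered value, or the average of the two middle values."""
    ordered = sorted(values)      # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 6, 8]))   # Example 2 data
print(median([6, 2, 4]))      # Example 3 data
```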
Comparing the Mean & Median
Computation of mean is easier.
Deviations
Deviation of each value from the mean: $x_i - \bar{x}$
what is a typical deviation from the mean?
(standard deviation)
small values of this typical deviation indicate
small variability in the data
large values of this typical deviation indicate
large variability in the data
Variance
[Figure: variance built up step by step as (1 / No. of Data Points) × the sum of the squared differences between each data point and the Mean]
Variance Formula
$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Standard Deviation
$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
$\sigma^2 = \frac{214{,}870}{7} = 30695.71$
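The variance and standard deviation formulas translate directly into code. A minimal sketch that divides by n, as in the formulas above; the data values are hypothetical, since the slide's original data set is not shown in the text.

```python
import math


def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def std_dev(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, not from the slides
print(variance(data), std_dev(data))
```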
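$\mathrm{Variance}(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})$
$\mathrm{Covariance}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
$\mathrm{Covariance}(x, x) = \mathrm{Var}(x)$
$\mathrm{Covariance}(x, y) = \mathrm{Covariance}(y, x)$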
Covariance
$\mathrm{Covariance}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
[Figure: scatter plots divided into four quadrants by the lines $x = \bar{x}$ and $y = \bar{y}$]
Lower-left quadrant: $x_1 - \bar{x} < 0$ and $y_1 - \bar{y} < 0$, so $(x_i - \bar{x})(y_i - \bar{y}) > 0$
Upper-right quadrant: $x_1 - \bar{x} > 0$ and $y_1 - \bar{y} > 0$, so $(x_i - \bar{x})(y_i - \bar{y}) > 0$
Upper-left quadrant: $x_1 - \bar{x} < 0$ and $y_1 - \bar{y} > 0$, so $(x_i - \bar{x})(y_i - \bar{y}) < 0$
Lower-right quadrant: $x_1 - \bar{x} > 0$ and $y_1 - \bar{y} < 0$, so $(x_i - \bar{x})(y_i - \bar{y}) < 0$
Positive Relation: the points lie mainly in the lower-left and upper-right quadrants, so the covariance is positive.
Negative Relation: the points lie mainly in the upper-left and lower-right quadrants, so the covariance is negative.
No Relation: the points are spread across all four quadrants, the positive and negative products roughly cancel, and the covariance is close to zero.
Covariance
$\mathrm{Covariance}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
(x, y)                     (x − x̄, y − ȳ)
(2 , 1)                    (-2.4545, -2.8182)
(2 , 2)                    (-2.4545, -1.8182)
(4 , 3)                    (-0.4545, -0.8182)
(6 , 1)                    (1.5455, -2.8182)
(8 , 3)                    (3.5455, -0.8182)
(1 , 5)                    (-3.4545, 1.1818)
(4 , 6)                    (-0.4545, 2.1818)
(4 , 7)                    (-0.4545, 3.1818)
(6 , 3)                    (1.5455, -0.8182)
(6 , 5)                    (1.5455, 1.1818)
(6 , 6)                    (1.5455, 2.1818)
Mean: (4.4545, 3.8182)     (0, 0)
$\mathrm{Covariance}(x, y) = \frac{1}{11}(x - \bar{x})^T (y - \bar{y})$
$\mathrm{Covariance}(x, y) = E[(x - \bar{x})^T (y - \bar{y})]$
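The covariance of the 11 tabulated points can be computed directly from the definition. A minimal sketch that divides by n, matching the formula above; the function name `covariance` is illustrative.

```python
xs = [2, 2, 4, 6, 8, 1, 4, 4, 6, 6, 6]
ys = [1, 2, 3, 1, 3, 5, 6, 7, 3, 5, 6]


def covariance(a, b):
    """Average product of deviations from the two means (divides by n)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n


cov_xy = covariance(xs, ys)
print(round(sum(xs) / len(xs), 4), round(sum(ys) / len(ys), 4), cov_xy)
```

Calling `covariance(xs, xs)` returns the variance of x, and swapping the arguments leaves the result unchanged, which mirrors the two identities Covariance(x, x) = var(x) and Covariance(x, y) = Covariance(y, x).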
Covariance Matrix
$-1 \le \mathrm{Correlation}(x, y) \le +1$
Multivariate Gaussians (or "multinormal distribution" or
"multivariate normal distribution")
Multivariate case:
Vector of observations x,
vector of means μ and covariance matrix Σ
[Formula annotations: Dimension of x; Determinant of Σ]
Multivariate Gaussians
Univariate case and multivariate case:
the normalization constants do not depend on x;
the exponential factor depends on x and is positive.
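For reference, the standard forms of the two densities are sketched below. The symbols are assumptions chosen to match the rest of the deck: x has dimension m (as in the mean vector μ_1, ..., μ_m), Σ is a positive definite covariance matrix, and |Σ| is its determinant. The slide's own notation may have differed.

```latex
% Univariate case
p(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Multivariate case (x of dimension m)
p(\mathbf{x}) = \frac{1}{(2\pi)^{m/2}\,|\Sigma|^{1/2}}
       \exp\!\left(-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}
       \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)
```

In both lines the fraction in front is the normalization constant, and the exponential is the factor that varies with x.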
The mean vector
$\mu = E(x) = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \end{pmatrix}$
Covariance of two random variables
Recall for two random variables xi, xj
$\mathrm{Cov}(x_i, x_j) = \sigma_{ij}^2 = E[(x_i - \mu_i)(x_j - \mu_j)] = E(x_i x_j) - E(x_i)\,E(x_j)$
The covariance matrix
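$\Sigma = E[(x - \mu)(x - \mu)^T]$ (T is the transpose operator)
$= E\left[\begin{pmatrix} x_1 - \mu_1 \\ \vdots \\ x_m - \mu_m \end{pmatrix}\begin{pmatrix} x_1 - \mu_1 & \cdots & x_m - \mu_m \end{pmatrix}\right] = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_m^2 \end{pmatrix}$
$\mathrm{Var}(x_m) = \mathrm{Cov}(x_m, x_m)$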
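A covariance matrix can be estimated from data by applying the covariance formula to every pair of coordinates. A minimal sketch in plain Python that divides by n; the function name `covariance_matrix` is illustrative and the rows reuse the 11 (x, y) points from the covariance table.

```python
def covariance_matrix(rows):
    """Entry (i, j) is the covariance of coordinate i and coordinate j."""
    n = len(rows)
    m = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
         for j in range(m)]
        for i in range(m)
    ]


rows = [(2, 1), (2, 2), (4, 3), (6, 1), (8, 3), (1, 5),
        (4, 6), (4, 7), (6, 3), (6, 5), (6, 6)]

sigma = covariance_matrix(rows)
for row in sigma:
    print([round(v, 4) for v in row])
```

The diagonal entries are the variances of x and y, and the matrix is symmetric because Cov(x, y) = Cov(y, x).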
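An example: 2 variate case
[Formula: 2-variate Gaussian density, with the Determinant of Σ annotated]
An example: 2 variate case, with zero off-diagonal entries
$\Sigma = \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix}$
$\sigma_{ij}^2 = E[(x_i - \mu_i)(x_j - \mu_j)] = 0, \quad i \ne j$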
Gaussian Intuitions: Size of Σ
[Figure: three Gaussian surfaces; I is the identity matrix]
μ = [0 0], Σ = I
μ = [0 0], Σ = 0.6I
μ = [0 0], Σ = 2I
As Σ becomes larger,
Gaussian becomes more spread out
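The effect of scaling Σ can be seen by evaluating the density at a few points. A minimal sketch for the special case μ = [0 0] and Σ = cI in two dimensions, where the standard density reduces to exp(-(x² + y²) / (2c)) / (2πc); the function name and the probe point (2, 0) are assumptions for illustration.

```python
import math


def isotropic_gaussian_2d(x, y, c):
    """Density of a 2-D Gaussian with mean [0, 0] and covariance c * I."""
    return math.exp(-(x * x + y * y) / (2 * c)) / (2 * math.pi * c)


# A larger c lowers the peak at the mean and raises the density far from it.
for c in (0.6, 1.0, 2.0):
    peak = isotropic_gaussian_2d(0, 0, c)
    tail = isotropic_gaussian_2d(2, 0, c)
    print(c, round(peak, 4), round(tail, 4))
```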
Gaussian Intuitions: Off-diagonal
As the off-diagonal entries increase, there is more correlation between the value of x and the value of y.
Gaussian Intuitions: off-diagonal and diagonal