Basics of Probability and Statistics

Outline
 Probability

 Statistical Measures
Probability
Idea of Probability
 Probability is the science of chance behavior
 Chance behavior is unpredictable in the short run but has a regular and predictable pattern in the long run
Randomness
 Random: individual outcomes are uncertain
 But there is a regular distribution of outcomes in a large number of repetitions.
 Example: select any number from a bag of numbers {1,2,3,…,100}
Random Experiment…
 A random experiment is an action or process that leads to one of several possible outcomes [in the cases here, n possible outcomes, all equally likely to occur]. For example:

Experiment                    Outcomes
Flip a coin                   Heads, Tails
Selecting a color ball        Green, red, blue
Rolling a die                 1,2,3,4,5,6
Picking a card from a deck    52 cards
Relative-Frequency Probabilities
 Relative frequency (proportion of occurrences) of an outcome settles down to one value over the long run. That one value is then defined to be the probability of that outcome.
 Can be determined (or checked) by observing a long series of independent trials (empirical data)
 experience with many samples
 simulation
Relative-Frequency Probabilities

Coin flipping:
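A minimal simulation sketch of the coin-flipping idea, assuming a fair coin and Python's standard `random` module. The function name, the seed and the flip counts are illustrative choices, not part of the original slides:

```python
import random

def relative_frequency_of_heads(num_flips, seed=0):
    """Simulate num_flips fair coin flips and return the proportion of heads."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# The proportion of heads should settle near 0.5 as the number of flips grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Short runs can land far from 0.5, while long runs should stay close to it, which is the long-run regularity described above.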
Probability Models

 The sample space S of a random phenomenon is the set of all possible outcomes.
 An event is an outcome or a set of outcomes (subset of the sample space).
 A probability model is a mathematical description of long-run regularity consisting of a sample space S and a way of assigning probabilities to events.
Sample Space and Events

[Figure: Venn diagram of a Sample Space containing Event 1 through Event 5]
Example

Sample Space = {1,2,3,4,5,6}
Rolling an odd number = {1,3,5}
Rolling an even number = {2,4,6}
Rolling a prime number = {2,3,5}
Probability Model for Two Dice
Random phenomenon: roll pair of fair dice.
Sample space:

Event: rolling even numbers on both dice
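The sample space for two dice can be enumerated directly. The sketch below assumes fair dice (all 36 ordered pairs equally likely); the helper name `probability` is hypothetical:

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """P(event) = favorable outcomes / possible outcomes, for equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

# Event: even numbers on both dice.
both_even = probability(lambda o: o[0] % 2 == 0 and o[1] % 2 == 0)
print(len(sample_space), both_even)
```

Counting outcomes this way only works because every outcome in the sample space is assumed to be equally likely.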


Probability Model for 52 card deck
Random phenomenon: Arrange 52 card deck in a zigzag way
Sample space:

Event: pick an ace


Probability
What is a PROBABILITY?

- Probability is the chance that some event will happen

- It is the ratio of the number of ways a certain event can occur to the number of possible outcomes
Probability

What is a PROBABILITY?

\[ P(\text{event}) = \frac{\text{number of favorable outcomes}}{\text{number of possible outcomes}} \]

Examples that use Probability:
(1) Dice, (2) Spinners, (3) Coins, (4) Deck of Cards, (5) Evens/Odds, (6) Alphabet, etc.
Probability
What is a PROBABILITY?

Probability    Meaning
0              Impossible
¼ or .25       Not Very Likely
½ or .5        Equally Likely
¾ or .75       Somewhat Likely
1              Certain
Probability of Simple Events
Probability of Simple Events
Example 2: Roll a die.
What is the probability of rolling an even number?

\[ P(\text{even}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{3}{6} = \frac{1}{2} \]

The probability of rolling an even number is 3 out of 6.
Probability of Simple Events
Example 3: Roll a pair of dice.
Random phenomenon: roll a pair of fair dice and count the number of pips on the up-faces.
Find the probability of rolling a 5.

P(roll a 5) = P(1,4) + P(2,3) + P(3,2) + P(4,1)
= 1/36 + 1/36 + 1/36 + 1/36
= 4/36
= 0.111
Probability of Simple Events
Example 4: Spinners.
What is the probability of spinning green?

\[ P(\text{green}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{1}{4} \]

The probability of spinning green is 1 out of 4.


Probability of Simple Events
Example 5: Flip a coin.
What is the probability of flipping a tail?

\[ P(\text{tail}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{1}{2} \]

The probability of flipping a tail is 1 out of 2.


Probability of Simple Events
Example 6: Deck of Cards.
 What is the probability of picking a heart?

\[ P(\text{heart}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{13}{52} = \frac{1}{4} \]

The probability of picking a heart is 1 out of 4.

 What is the probability of picking a non heart?

\[ P(\text{non-heart}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{39}{52} = \frac{3}{4} \]

The probability of picking a non heart is 3 out of 4.


Probability of Simple Events

Key Concepts:

- Probability is the chance that some event will happen

- It is the ratio of the number of ways a certain event can occur to the total number of possible outcomes
Probability of Simple Events

Guided Practice: Calculate the probability of each independent event.

1) P(black) =
2) P(1) =
3) P(odd) =
4) P(prime) =
Probability of Simple Events

Guided Practice: Answers

1) P(black) = 4/8
2) P(1) = 1/8
3) P(odd) = 1/2
4) P(prime) = 1/2
Probability of Simple Events

Independent Practice: Calculate the probability of each independent event.

1) P(red) =
2) P(2) =
3) P(not red) =
4) P(even) =
Probability of Simple Events

Independent Practice: Answers

1) P(red) =1/2
2) P(2) = 1/4
3) P(not red) = 1/2
4) P(even) = 1/2
Probability of Simple Events

Real World Example:


A computer company manufactures 2,500 computers each day. An average of 100 of these computers are returned with defects. What is the probability that the computer you purchased is not defective?

\[ P(\text{not defective}) = \frac{\#\ \text{favorable outcomes}}{\#\ \text{possible outcomes}} = \frac{2400}{2500} = \frac{24}{25} \]
Complementary Events
 The complement of an event E is the set of all outcomes in a sample space that are not included in event E.

 The complement of an event E is denoted by E′ or Ē

Properties of Probability:
\[ 0 \le P(E) \le 1 \]
\[ P(E) + P(\bar{E}) = 1 \]
\[ P(E) = 1 - P(\bar{E}) \]
\[ P(\bar{E}) = 1 - P(E) \]
Complementary Events
 Example I: A sequence of 5 bits is randomly generated. What is the probability that at least one of these bits is zero?
 Solution: There are 2^5 = 32 possible outcomes of generating such a sequence.
Define event E as “at least one of the bits is zero”.
Then event Ē, “none of the bits is zero”, includes only one of these outcomes, namely the sequence 11111.
Therefore, p(Ē) = 1/32.
Now p(E) can easily be computed as
p(E) = 1 – p(Ē) = 1 – 1/32 = 31/32.
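The complement calculation can be checked by brute force. This sketch assumes all 32 bit sequences are equally likely; the variable names are illustrative:

```python
from fractions import Fraction
from itertools import product

# All 2^5 sequences of five bits, each assumed equally likely.
sequences = list(product("01", repeat=5))

# Complement event: none of the bits is zero.
no_zero = [s for s in sequences if "0" not in s]
p_complement = Fraction(len(no_zero), len(sequences))

# P(E) via the complement rule, and again by direct counting.
p_at_least_one_zero = 1 - p_complement
p_direct = Fraction(sum(1 for s in sequences if "0" in s), len(sequences))
print(p_complement, p_at_least_one_zero, p_direct)
```

Both routes should agree; the complement route only needs one outcome to be counted.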
Complementary Events
 Example II: What is the probability that at least two out of 36 people have the same birthday?
 Solution: The sample space S encompasses all possibilities for the birthdays of the 36 people, so |S| = 365^36.

Let us consider the event Ē (“no two people out of 36 have the same birthday”).

Ē includes P(365, 36) outcomes (365 possibilities for the first person’s birthday, 364 for the second, and so on).
Then p(Ē) = P(365, 36)/365^36 = 0.168,
so p(E) = 0.832
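A short numerical check of the birthday example, assuming 365 equally likely birthdays and no leap years. `math.perm` (Python 3.8+) computes the permutation count P(365, 36); the function name is hypothetical:

```python
import math

def p_all_distinct(people, days=365):
    """Probability that `people` birthdays are all different."""
    return math.perm(days, people) / days ** people

p_complement = p_all_distinct(36)       # no two people share a birthday
p_shared = 1 - p_complement             # at least two people share a birthday
print(round(p_complement, 3), round(p_shared, 3))
```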
The Multiplication Rule
 If events A and B are independent, then the probability of two events, A and B, occurring in a sequence (or simultaneously) is:

\[ P(A \cap B) = P(A) \cdot P(B) \]

 This rule can extend to any number of independent events.

 Two events are independent if the occurrence of the first event does not affect the probability of the occurrence of the second event.
Mutually Exclusive
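Two events A and B are mutually exclusive if they cannot both occur, i.e. A ∩ B = ∅, which gives:

\[ P(A \cap B) = 0 \]

In a Venn diagram this means that event A is disjoint from event B.

[Figure: two Venn diagrams. Left: disjoint circles, A and B are M.E. Right: overlapping circles, A and B are not M.E.]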


The Addition Rule
 The probability that at least one of the events A or B will occur, P(A or B), is given by:

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

 If events A and B are mutually exclusive, then the addition rule is simplified to:

\[ P(A \cup B) = P(A) + P(B) \]

 This simplified rule can be extended to any number of mutually exclusive events.
The Addition and Multiplication Rule
 Example: What is the probability that a positive integer selected at random from the set {1,2,…,100} is divisible by 2 or 5?
 Solution:
E2: “integer is divisible by 2”
E5: “integer is divisible by 5”

 E2 = {2, 4, 6, …, 100} and |E2| = 50
p(E2) = 0.5

 E5 = {5, 10, 15, …, 100} and |E5| = 20
p(E5) = 0.2
The Addition and Multiplication Rule
E2 ∩ E5 = {10, 20, 30, …, 100} and |E2 ∩ E5| = 10
p(E2 ∩ E5) = 0.1

p(E2 ∪ E5) = p(E2) + p(E5) – p(E2 ∩ E5)

p(E2 ∪ E5) = 0.5 + 0.2 – 0.1 = 0.6
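The addition rule in this example can be verified with Python sets. This is a sketch that assumes each integer from 1 to 100 is equally likely to be selected; the helper `p` is hypothetical:

```python
from fractions import Fraction

numbers = range(1, 101)
e2 = {n for n in numbers if n % 2 == 0}   # divisible by 2
e5 = {n for n in numbers if n % 5 == 0}   # divisible by 5

def p(event):
    """Probability of an event (a set of integers) under equally likely selection."""
    return Fraction(len(event), len(numbers))

# Addition rule: P(E2 or E5) = P(E2) + P(E5) - P(E2 and E5)
by_rule = p(e2) + p(e5) - p(e2 & e5)
by_counting = p(e2 | e5)
print(by_rule, by_counting)
```

Subtracting the intersection avoids counting the multiples of 10 twice.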
Conditional Probability
 We talk about conditional probability when the probability of one event depends on whether or not another event has occurred.

 e.g. There are 2 red and 3 blue counters in a bag and, without looking, we take out one counter and do not replace it.

 The probability of a 2nd counter taken from the bag being red depends on whether the 1st was red or blue.

 Conditional probability problems can be solved by considering the individual possibilities or by using a table, a Venn diagram, a tree diagram or a formula.
Notation

P(A | B) means “the probability that event A occurs given that B has occurred”. This is conditional probability.
Example

e.g. 1. The following table gives data on the type of car, grouped by petrol consumption, owned by 100 people.

          Low   Medium   High   Total
Male       12       33      7
Female     23       21      4
Total      35                     100

One person is selected at random.

L is the event “the person owns a low rated car”
F is the event “a female is chosen”.
Find (i) P(L) (ii) P(F and L) (iii) P(F | L)

We need to be careful which row or column we look at.

Solution:
(i) P(L) = 35/100 = 7/20
(ii) P(F and L) = 23/100, the probability of selecting a female with a low rated car.
(iii) P(F | L) = 23/35, the probability of selecting a female given the car is low rated.

We must be careful with the denominators in (ii) and (iii). Here we are given the car is low rated. We want the total of that column.

Notice that
\[ P(L) \times P(F \mid L) = \frac{7}{20} \times \frac{23}{35} = \frac{23}{100} = P(F \text{ and } L) \]
So, P(F and L) = P(F | L) × P(L)
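The three probabilities can be recomputed from the table counts. A minimal sketch, assuming each of the 100 people is equally likely to be selected; the dictionary layout and the helper name `count` are illustrative:

```python
from fractions import Fraction

# Counts from the table: (sex, petrol consumption rating) -> number of people.
counts = {
    ("Male", "Low"): 12, ("Male", "Medium"): 33, ("Male", "High"): 7,
    ("Female", "Low"): 23, ("Female", "Medium"): 21, ("Female", "High"): 4,
}
total = sum(counts.values())

def count(sex=None, rating=None):
    """Number of people matching the given sex and/or rating (None matches anything)."""
    return sum(n for (s, r), n in counts.items()
               if sex in (None, s) and rating in (None, r))

p_l = Fraction(count(rating="Low"), total)                    # P(L)
p_f_and_l = Fraction(count("Female", "Low"), total)           # P(F and L)
p_f_given_l = Fraction(count("Female", "Low"), count(rating="Low"))  # P(F | L)
print(p_l, p_f_and_l, p_f_given_l)
```

The only difference between (ii) and (iii) is the denominator: the whole group of 100 versus the 35 people in the Low column.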
Conditional Probability

P(F and L) = P(F | L) × P(L)

 This result can be used to help solve harder conditional probability problems.

 However, I haven’t proved the formula, just shown that it works for one particular problem.

 We’ll just illustrate it again on a simple problem using a Venn diagram.
e.g. 2. I have 2 packets of seeds. One contains 20 seeds and although they look the same, 8 will give red flowers and 12 blue. The 2nd packet has 25 seeds of which 15 will be red and 10 blue.

Draw a Venn diagram and use it to illustrate the conditional probability formula.

Solution: Let R be the event “Red flower” and F be the event “First packet”.

Venn diagram regions:
F only (blue in the 1st packet): 12
F and R (red in the 1st packet): 8
R only (red in the 2nd packet): 15
Outside F and R (blue in the 2nd packet): 10
Total: 20 + 25 = 45

P(R and F) = 8/45
P(R | F) = 8/20
P(F) = 20/45

\[ P(R \mid F) \times P(F) = \frac{8}{20} \times \frac{20}{45} = \frac{8}{45} \]
So, P(R and F) = P(R | F) × P(F)
Summary
The probability that both event A and event B occur is given by
P(A and B) = P(A | B) × P(B)
We often use this in the form

\[ P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)} \]

In words, this is “the probability of event A given that B has occurred, equals the probability of both A and B occurring divided by the probability of B”.
Reminder:
P(A and B) can also be written as P(A ∩ B)
Example

 Three jars contain colored balls as described in the table below.
 One jar is chosen at random and a ball is selected. If the ball is red, what is the probability that it came from the 2nd jar?

Jar #   Red   White   Blue
1       3     4       1
2       1     2       3
3       4     3       2
Example
 We will define the following events:

 J1 is the event that first jar is chosen

 J2 is the event that second jar is chosen

 J3 is the event that third jar is chosen

 R is the event that a red ball is selected


Example
 The events J1, J2, and J3 are mutually exclusive

 Why?

 You can’t choose two different jars at the same time

 Because of this, our sample space has been divided or partitioned along these three events


Venn Diagram
 Let’s look at the Venn Diagram
Venn Diagram
 All of the red balls are in the first, second, and
third jar so their set overlaps all three sets of our
partition
Finding Probabilities
 What are the probabilities for each of the events in our sample space?
 How do we find them?

\[ P(A \cap B) = P(A \mid B)\,P(B) \]
Computing Probabilities

P J 1  R   P R | J 1 P J 1    
3 1 1
8 3 8
 Similar calculations show:

      1 1
P J2  R  P R | J2 P J2   
1
6 3 18
P J 3  R   P R | J 3 P J 3    
4 1 4
9 3 27
Venn Diagram
 Updating our Venn Diagram with these
probabilities:
Where are we going with this?
 Our original problem was:
 One jar is chosen at random and a ball is selected. If the ball is red, what is the probability that it came from the 2nd jar?
 In terms of the events we’ve defined we want:

\[ P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(R)} \]
Finding our Probability
 We already know what the numerator portion is from our Venn Diagram
 What is the denominator portion?

\[ P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(R)} = \frac{P(J_2 \cap R)}{P(J_1 \cap R) + P(J_2 \cap R) + P(J_3 \cap R)} \]
Arithmetic!
 Plugging in the appropriate values:

\[ P(J_2 \mid R) = \frac{P(J_2 \cap R)}{P(J_1 \cap R) + P(J_2 \cap R) + P(J_3 \cap R)} = \frac{\frac{1}{18}}{\frac{1}{8} + \frac{1}{18} + \frac{4}{27}} = \frac{12}{71} \approx 0.17 \]
Bayes’ Theorem:
\[ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \]

\[ P(A \mid B) = \frac{P(B \mid A)\,P(A)}{\sum_{n} P(B \mid A_n)\,P(A_n)} \]

The important consequence of Bayes’ Theorem is that it relates inverse probabilities: P(A|B) and P(B|A)
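The three-jar example is an instance of Bayes’ Theorem and can be reproduced with exact fractions. A sketch, assuming each jar is chosen with probability 1/3 and each ball within a jar is equally likely; the variable names are illustrative:

```python
from fractions import Fraction

# Ball counts per jar, taken from the table in the example.
jars = {
    1: {"red": 3, "white": 4, "blue": 1},
    2: {"red": 1, "white": 2, "blue": 3},
    3: {"red": 4, "white": 3, "blue": 2},
}
prior = Fraction(1, len(jars))  # P(J_k): one jar chosen at random

# Joint probabilities P(J_k and R) = P(R | J_k) * P(J_k).
joint = {k: Fraction(balls["red"], sum(balls.values())) * prior
         for k, balls in jars.items()}

# Total probability of red, then the posterior P(J_k | R).
p_red = sum(joint.values())
posterior = {k: joint[k] / p_red for k in jars}
print(posterior[2], float(posterior[2]))
```

The denominator `p_red` is the sum over the partition J1, J2, J3, which is the same sum that appears in the second form of the theorem.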
Random Variables
 A random variable is a variable whose value is a numerical outcome of a random experiment

 often denoted with capital alphabetic symbols (X, Y, etc.)

 a normal random variable may be denoted as X ~ N(µ, σ)

 The probability distribution of a random variable X tells us what values X can take and how to assign probabilities to those values
Discrete Random Variables
 Random variables that have a finite or countably infinite list of possible outcomes, with probabilities assigned to each of these outcomes, are called discrete

 Discrete random variables

 number of pets owned (0, 1, 2, … )
 numerical day of the month (1, 2, …, 31)
 the total number of tails you get if you flip 100 coins
Discrete example: roll of a die

[Figure: bar chart of p(x) against x; each of x = 1, 2, 3, 4, 5, 6 has height 1/6]

\[ \sum_{\text{all } x} P(x) = 1 \]
Probability Distribution Function (PDF)
x       p(x)
1       p(x=1) = 1/6
2       p(x=2) = 1/6
3       p(x=3) = 1/6
4       p(x=4) = 1/6
5       p(x=5) = 1/6
6       p(x=6) = 1/6
Total   1.0
Cumulative Distribution Function (CDF)

[Figure: step plot of the cumulative probability P(x) against x = 1, ..., 6, rising through 1/6, 1/3, 1/2, 2/3, 5/6 to 1.0]
Cumulative Distribution Function (CDF)

x    P(x≤A)
1    P(x≤1) = 1/6
2    P(x≤2) = 2/6
3    P(x≤3) = 3/6
4    P(x≤4) = 4/6
5    P(x≤5) = 5/6
6    P(x≤6) = 6/6
Examples

1. What’s the probability that you roll a 3 or less?

P(x≤3) = 1/2

2. What’s the probability that you roll a 5 or higher?

P(x≥5) = 1 – P(x≤4) = 1 – 2/3 = 1/3
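The two die examples follow directly from the probability table and its cumulative sums. A minimal sketch for a fair six-sided die; the names `pmf` and `cdf` are illustrative:

```python
from fractions import Fraction

# Probability of each face of a fair die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(a):
    """Cumulative probability P(x <= a): sum of p(x) over all faces up to a."""
    return sum(p for x, p in pmf.items() if x <= a)

print(cdf(3))       # probability of rolling a 3 or less
print(1 - cdf(4))   # probability of rolling a 5 or higher
```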


Important discrete distributions in
epidemiology…

 Binomial
 Yes/no outcomes (dead/alive,
treated/untreated, smoker/non-smoker,
sick/well, etc.)
 Poisson
 Counts (e.g., how many cases of disease
in a given area)
Continuous Random Variables
 Random variables that can take on any
value in an interval, with probabilities given
as areas under a density curve, are called
continuous
 Continuous random variables
 weight
 temperature

Probability Density Function (PDF)
 The probability function that accompanies a continuous random variable is a continuous mathematical function that integrates to 1.

 The probabilities associated with continuous functions are just areas under the curve (integrals!).

 Probabilities are given for a range of values, rather than a particular value.
Probability Density Function (PDF)

 For example, the negative exponential function (in probability, this is called an “exponential distribution”):

\[ f(x) = e^{-x} \]

 This function integrates to 1:

\[ \int_0^{+\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{+\infty} = 0 + 1 = 1 \]
Probability Density Function (PDF)

\[ p(x) = e^{-x} \]

The probability that x is any exact particular value (such as 1.9976) is 0; we can only assign probabilities to possible ranges of x.
Probability Density Function (PDF)
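For example, the probability of x falling within 1 to 2:

[Figure: the curve p(x) = e^{-x} with the area between x = 1 and x = 2 shaded]

\[ P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left[-e^{-x}\right]_1^2 = -e^{-2} + e^{-1} = -.135 + .368 = .23 \]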
Cumulative Distribution Function (CDF)

As in the discrete case, we can specify the “cumulative distribution function” (CDF):

The CDF here = P(X≤A) =

\[ \int_0^A e^{-x}\,dx = \left[-e^{-x}\right]_0^A = -e^{-A} + e^{0} = -e^{-A} + 1 = 1 - e^{-A} \]
Cumulative Distribution Function (CDF)

[Figure: the density p(x) with the area to the left of x = 2 shaded]

\[ P(x \le 2) = 1 - e^{-2} = 1 - .135 = .865 \]
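The closed-form results for the exponential density can be compared against a numerical area under the curve. This sketch uses a simple midpoint rule; the function names and the step count are assumptions made for illustration:

```python
import math

def pdf(x):
    """Exponential density f(x) = e^(-x), for x >= 0."""
    return math.exp(-x)

def cdf(a):
    """Closed-form cumulative probability P(X <= a) = 1 - e^(-a)."""
    return 1 - math.exp(-a)

def integrate_midpoint(f, lo, hi, steps=100_000):
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

print(round(cdf(2), 3))                          # P(x <= 2)
print(round(cdf(2) - cdf(1), 2))                 # P(1 <= x <= 2) from the CDF
print(round(integrate_midpoint(pdf, 1, 2), 2))   # same probability as an area
```

A probability over a range is the difference of two CDF values, which is why the single-point probability is zero.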
Uniform Density

The uniform distribution: all values are equally likely

\[ f(x) = 1, \quad \text{for } 0 \le x \le 1 \]

[Figure: flat density p(x) = 1 between x = 0 and x = 1]

We can see it’s a probability distribution because it integrates to 1 (the area under the curve is 1):

\[ \int_0^1 1\,dx = \left[x\right]_0^1 = 1 - 0 = 1 \]
Uniform Density

What’s the probability that x is between ¼ and ½?

[Figure: uniform density on 0 to 1 with the area between x = ¼ and x = ½ shaded]

P(1/4 ≤ x ≤ 1/2) = ¼
The Normal Density Function

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]

This is a bell shaped curve with different centers and spreads depending on µ and σ.
Note constants:
π = 3.14159
e = 2.71828
The Normal Density Function

[Figure: bell-shaped normal density curve centered at μ]
The Normal Density Function

It’s a probability function, so no matter what the values of µ and σ, it must integrate to 1!

\[ \int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx = 1 \]
The Shape of Normal Density
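Normal distribution is bell shaped, and symmetrical around µ.

[Figure: bell curve with 90, µ and 110 marked on the x-axis]
Why symmetrical? Let µ = 100. Suppose x = 110. Now suppose x = 90.
\[ f(110) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{110-100}{\sigma}\right)^2} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{10}{\sigma}\right)^2} \]
\[ f(90) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{90-100}{\sigma}\right)^2} = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{-10}{\sigma}\right)^2} \]
Squaring removes the sign of the deviation, so f(110) = f(90).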
Normal Probability Density
 The expected value (also called the mean) E(X) (or µ) can be any number
 The standard deviation σ can be any positive number
 The total area under every normal curve is 1
 There are infinitely many normal distributions
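These properties can be checked numerically. The sketch below assumes µ = 100 and σ = 10 purely for illustration (the slides fix µ = 100 in the symmetry example but do not state σ) and approximates the total area with a midpoint rule over µ ± 8σ:

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma (sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate_midpoint(f, lo, hi, steps=100_000):
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

mu, sigma = 100.0, 10.0  # assumed values, not taken from the slides
area = integrate_midpoint(lambda x: normal_pdf(x, mu, sigma),
                          mu - 8 * sigma, mu + 8 * sigma)
print(area)                                                       # close to 1
print(normal_pdf(110, mu, sigma), normal_pdf(90, mu, sigma))      # equal heights
```

Changing `mu` shifts the curve and changing `sigma` rescales it, but the area stays at 1 in both cases.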
Normal Probability Density

Total area = 1; symmetric around µ

The effects of µ and σ

How does the standard deviation affect the shape of f(x)?
[Figure: normal curves with σ = 2, σ = 3 and σ = 4]

How does the expected value affect the location of f(x)?
[Figure: normal curves with µ = 10, µ = 11 and µ = 12]
Statistical Measures
Statistical Measures
 Center of the data
 Mean
 Median
 Variation
 Range
 Quartiles
 Variance
 Standard Deviation
 Covariance
 Correlation
Mean or Average or Expectation
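 Traditional measure of center
 Sum the values and divide by the number of values
\[ E(\vec{x}) = \frac{1}{n}\left(\vec{x}_1 + \vec{x}_2 + \cdots + \vec{x}_n\right) = \frac{1}{n}\sum_{i=1}^{n} \vec{x}_i \]
 In general
\[ E(\vec{x}) = p_1\vec{x}_1 + p_2\vec{x}_2 + \cdots + p_n\vec{x}_n = \sum_{i=1}^{n} p_i\,\vec{x}_i \]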
Mean or Average
\[ \text{Mean} = \frac{1}{11}\left[(1,2) + (3,4) + (5,6) + (2,4) + (1,1) + (4,2) + (6,5) + (3,1) + (2,1) + (5,3) + (5,5)\right] = (3.3636,\ 3.0909) \]

[Figure: scatter plot of the 11 points with the mean (3.3636, 3.0909) marked]
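The mean of the 11 two-dimensional points can be reproduced by averaging each coordinate separately. A small sketch using the points listed in this example:

```python
# The 11 two-dimensional points from the example.
points = [(1, 2), (3, 4), (5, 6), (2, 4), (1, 1), (4, 2),
          (6, 5), (3, 1), (2, 1), (5, 3), (5, 5)]

n = len(points)
# Average each coordinate separately: the mean of vectors is taken component-wise.
mean = (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)
print(tuple(round(c, 4) for c in mean))
```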
Median (M)
 A resistant measure of the data’s center
 At least half of the ordered values are less than or equal to
the median value
 At least half of the ordered values are greater than or equal
to the median value
 If n is odd, the median is the middle ordered value
 If n is even, the median is the average of the two middle
ordered values
Median (M)
Location of the median: L(M) = (n+1)/2, where n = sample size.

Example: If 25 data values are recorded, the Median would be the (25+1)/2 = 13th ordered value.
Median
 Example 1 data: 2 4 6
Median (M) = 4

 Example 2 data: 2 4 6 8
Median = 5 (average of 4 and 6)

 Example 3 data: 6 2 4
Median ≠ 2
(order the values: 2 4 6, so Median = 4)
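The median rules above can be sketched in a few lines. The function name is illustrative; Python's standard library also offers `statistics.median`, which follows the same odd/even rule:

```python
def median(values):
    """Middle ordered value (odd n) or average of the two middle values (even n)."""
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 4, 6]), median([2, 4, 6, 8]), median([6, 2, 4]))
```

Sorting first is what makes Example 3 come out as 4 rather than the middle value of the unsorted list.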
Comparing the Mean & Median
 Computation of the mean is easier.

 Finding the median in higher dimensions is much more complex.

 The mean is sensitive to noise (outliers).

 The mean and median of data from a symmetric distribution should be close together. The actual (true) mean and median of a symmetric distribution are exactly the same.
Spread or Variability
 If all values are the same, then they are all equal to the mean. There is no variability.
 Eg: 2, 2, 2, 2, 2, 2; mean = 2

 Variability exists when some values are different from (above or below) the mean.
 Eg: 10, 15,-20,-22,30, 22

 We will discuss the following measures of spread: range, quartiles, variance, and standard deviation
Range
 One way to measure spread is to give the smallest (minimum) and largest (maximum) values in the data set;
Range = max − min
 Eg: 10,-2,-7,22,0,11; Range = 22-(-7)=29

 The range is strongly affected by outliers


Quartiles
 Three numbers which divide the ordered data into four equal
sized groups.
 Q1 has 25% of the data below it.
 Q2 has 50% of the data below it. (Median)
 Q3 has 75% of the data below it.
Quartiles

[Figure: uniform distribution divided into 1st Qtr, 2nd Qtr, 3rd Qtr and 4th Qtr by Q1, Q2 and Q3]


Obtaining the Quartiles
 Order the data.
 For Q2, just find the median.
 For Q1, look at the lower half of the data values, those to the left
of the median location; find the median of this lower half.
 For Q3, look at the upper half of the data values, those to the
right of the median location; find the median of this upper half.
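The steps above can be sketched as follows. This is one common convention (the median itself is excluded from both halves when n is odd); other quartile conventions exist and can give slightly different values. The function names are illustrative, and the sample data are the metabolic rates used later in the variance example:

```python
def median(values):
    """Median of a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)              # step 1: order the data
    half = len(ordered) // 2
    lower = ordered[:half]                # values left of the median location
    upper = ordered[-half:]               # values right of the median location
    return median(lower), median(ordered), median(upper)

print(quartiles([1792, 1666, 1362, 1614, 1460, 1867, 1439]))
```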
Variance and Standard Deviation
 Recall that variability exists when some values are different from (above or below) the mean.
 Each data value has an associated deviation from the mean:

\[ x_i - \bar{x} \]
Deviations
 what is a typical deviation from the mean?
(standard deviation)
 small values of this typical deviation indicate
small variability in the data
 large values of this typical deviation indicate
large variability in the data
Variance

Variance is the average squared deviation from the mean of a set of data. It is used to find the standard deviation.

[Figure sequence: the mean is marked; the deviation of each data point from the mean is squared; the squared deviations are summed and divided by the number of data points]
Variance Formula

\[ \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \]
Standard Deviation

\[ \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2} \]

[ standard deviation = square root of the variance ]


Variance and Standard Deviation
Metabolic rates of 7 men (cal./24hr.):
1792 1666 1362 1614 1460 1867 1439

\[ \bar{x} = \frac{1792 + 1666 + 1362 + 1614 + 1460 + 1867 + 1439}{7} = \frac{11{,}200}{7} = 1600 \]
Variance and Standard Deviation

Observations   Deviations             Squared deviations
x_i            x_i − x̄                (x_i − x̄)^2
1792           1792 − 1600 = 192      (192)^2 = 36,864
1666           1666 − 1600 = 66       (66)^2 = 4,356
1362           1362 − 1600 = -238     (-238)^2 = 56,644
1614           1614 − 1600 = 14       (14)^2 = 196
1460           1460 − 1600 = -140     (-140)^2 = 19,600
1867           1867 − 1600 = 267      (267)^2 = 71,289
1439           1439 − 1600 = -161     (-161)^2 = 25,921
               sum = 0                sum = 214,870
Variance and Standard Deviation

\[ \sigma^2 = \frac{214{,}870}{7} = 30695.71 \]

\[ \sigma = \sqrt{30695.71} = 175.20 \text{ calories} \]
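The metabolic-rate calculation can be reproduced step by step. This sketch divides by n, matching the formula used in these slides (the sample variance that divides by n − 1 would give a different value); the variable names are illustrative:

```python
import math

# Metabolic rates of 7 men (cal./24hr.), from the example.
rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

n = len(rates)
mean = sum(rates) / n
deviations = [x - mean for x in rates]            # x_i - mean
squared_deviations = [d ** 2 for d in deviations]

variance = sum(squared_deviations) / n            # average squared deviation
std_dev = math.sqrt(variance)                     # square root of the variance
print(mean, sum(squared_deviations), round(variance, 2), round(std_dev, 2))
```

The deviations themselves always sum to zero, which is why they are squared before averaging.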


Variance (2D)

[Figure sequence: variance illustrated on two-dimensional data]

Variance doesn’t explore the relationship between variables
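Covariance

\[ \mathrm{Variance}(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x}) \]

\[ \mathrm{Covariance}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) \]

 Covariance(x, x) = var(x)
 Covariance(x, y) = Covariance(y, x)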
Covariance

[Figure sequence: scatter plots divided into four quadrants by the lines x = x̄ and y = ȳ]

For a point below both means, x_1 − x̄ < 0 and y_1 − ȳ < 0; for a point above both means, x_1 − x̄ > 0 and y_1 − ȳ > 0. In both of these quadrants (x_i − x̄)(y_i − ȳ) > 0.
For a point with x_1 − x̄ < 0 and y_1 − ȳ > 0, or with x_1 − x̄ > 0 and y_1 − ȳ < 0, the product (x_i − x̄)(y_i − ȳ) < 0.

Positive Relation: most points lie in the quadrants where (x_i − x̄)(y_i − ȳ) > 0, so the covariance is positive.
Negative Relation: most points lie in the quadrants where (x_i − x̄)(y_i − ȳ) < 0, so the covariance is negative.
No Relation: points are spread over all four quadrants, so positive and negative products largely cancel.
Covariance

(x, y)             (x − x̄, y − ȳ)
(2, 1)             (-2.4545, -2.8182)
(2, 2)             (-2.4545, -1.8182)
(4, 3)             (-0.4545, -0.8182)
(6, 1)             (1.5455, -2.8182)
(8, 3)             (3.5455, -0.8182)
(1, 5)             (-3.4545, 1.1818)
(4, 6)             (-0.4545, 2.1818)
(4, 7)             (-0.4545, 3.1818)
(6, 3)             (1.5455, -0.8182)
(6, 5)             (1.5455, 1.1818)
(6, 6)             (1.5455, 2.1818)
Mean: (4.4545, 3.8182)   (0, 0)

\[ \mathrm{Covariance}(x, y) = \frac{1}{11}(x - \bar{x})^T (y - \bar{y}) \]

\[ \mathrm{Covariance}(x, y) = E\left[(x - \bar{x})^T (y - \bar{y})\right] \]
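The covariance of the 11 points in this table can be computed exactly with fractions. A sketch that divides by n, as in the formula above; the function names are illustrative:

```python
from fractions import Fraction

# The 11 (x, y) points from the table.
data = [(2, 1), (2, 2), (4, 3), (6, 1), (8, 3), (1, 5),
        (4, 6), (4, 7), (6, 3), (6, 5), (6, 6)]
xs = [x for x, _ in data]
ys = [y for _, y in data]

def mean(values):
    return Fraction(sum(values), len(values))

def covariance(a, b):
    """Average product of deviations from the means (divides by n)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

cov_xy = covariance(xs, ys)
print(float(mean(xs)), float(mean(ys)), cov_xy, float(cov_xy))
```

The result is small and positive, so for these points the products of deviations nearly cancel.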
Covariance Matrix

\[ \mathrm{Cov}(\Sigma) = \begin{bmatrix} cov(x_1, x_1) & cov(x_1, x_2) & \cdots & cov(x_1, x_n) \\ cov(x_2, x_1) & cov(x_2, x_2) & \cdots & cov(x_2, x_n) \\ \vdots & \vdots & \ddots & \vdots \\ cov(x_n, x_1) & cov(x_n, x_2) & \cdots & cov(x_n, x_n) \end{bmatrix} \]

 Diagonal elements are variances, i.e. Cov(x, x) = var(x).
 Covariance Matrix is symmetric.
 It is a positive semi-definite matrix.
Correlation

[Figure: three scatter plots labelled Positive relation, Negative relation and No relation]

• Covariance determines whether the relation is positive or negative, but it does not measure the degree to which the variables are related.
• Correlation is another way to determine how two variables are related.
• In addition to whether variables are positively or negatively related, correlation also tells the degree to which the variables are related to each other.
Correlation
\[ \rho_{xy} = \mathrm{Correlation}(x, y) = \frac{cov(x, y)}{\sqrt{var(x)}\,\sqrt{var(y)}} \]

\[ -1 \le \mathrm{Correlation}(x, y) \le +1 \]
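A minimal sketch of the correlation formula, reusing the 11 points from the covariance example. The function names are illustrative, and variance and covariance both divide by n as in the slides:

```python
import math

def covariance(a, b):
    """Average product of deviations from the means (divides by n)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)

def correlation(a, b):
    """Covariance scaled by both standard deviations; lies between -1 and +1."""
    return covariance(a, b) / (math.sqrt(covariance(a, a)) * math.sqrt(covariance(b, b)))

xs = [2, 2, 4, 6, 8, 1, 4, 4, 6, 6, 6]
ys = [1, 2, 3, 1, 3, 5, 6, 7, 3, 5, 6]
print(correlation(xs, ys))                  # weak positive relation
print(correlation([1, 2, 3], [2, 4, 6]))    # perfectly linear, increasing
print(correlation([1, 2, 3], [6, 4, 2]))    # perfectly linear, decreasing
```

Dividing by the standard deviations removes the units, which is what makes the degree of relation comparable across data sets.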
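Multivariate Gaussians (or “multinormal distribution” or “multivariate normal distribution”)

Univariate case: single mean µ and variance σ²
\[ p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \]

Multivariate case:
Vector of observations x, vector of means µ and covariance matrix Σ
\[ p(x) = \frac{1}{(2\pi)^{m/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T \Sigma^{-1} (x-\mu)\right) \]
where m is the dimension of x and |Σ| is the determinant of Σ.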
Multivariate Gaussians
Univariate case

Multivariate case

[Equation annotations: in both cases the leading factors are normalization constants that do not depend on x; the term in the exponent depends on x and is positive]
The mean vector

\[ \mu = E(x) = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \end{pmatrix} \]
Covariance of two random variables
 Recall for two random variables x_i, x_j

\[ \sigma_{ij}^2 = \mathrm{Cov}(x_i, x_j) = E[(x_i - \mu_i)(x_j - \mu_j)] = E(x_i x_j) - E(x_i)E(x_j) \]
The covariance matrix
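\[ \Sigma = E[(x - \mu)(x - \mu)^T] \]
(T is the transpose operator)

\[ \Sigma = E\left[ \begin{pmatrix} x_1 - \mu_1 \\ \vdots \\ x_m - \mu_m \end{pmatrix} \begin{pmatrix} x_1 - \mu_1 & \cdots & x_m - \mu_m \end{pmatrix} \right] = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1m} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_m^2 \end{pmatrix} \]

Here the off-diagonal entry in row i and column j is Cov(x_i, x_j), and the diagonal entries are the variances:
Var(x_m) = Cov(x_m, x_m)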
An example: 2 variate case

The pdf of the multivariate Gaussian will be:

[Equation not recovered; its annotated parts are the covariance matrix and its determinant]
An example: 2 variate case

Factorized into two independent Gaussians!

They are independent!
Recall that in the general case independence implies uncorrelatedness, but uncorrelatedness does not necessarily imply independence.
Multivariate Gaussians are a special case where uncorrelatedness implies independence as well.
Diagonal covariance matrix
If all the variables are independent of each other, the covariance matrix will be a diagonal one.
The reverse is also true for multivariate Gaussians: if the covariance matrix is diagonal, the variables are independent.

\[ \Sigma = \begin{pmatrix} \sigma_1^2 & & 0 \\ & \sigma_2^2 & \\ 0 & & \ddots \end{pmatrix} \]

Diagonal matrix: an m × m matrix where off-diagonal terms are zero

\[ \sigma_{ij}^2 = E[(x_i - \mu_i)(x_j - \mu_j)] = 0, \quad i \neq j \]
Gaussian Intuitions: Size of 

[Figure: three Gaussian density plots, each with µ = [0 0], for Σ = I (the identity matrix), Σ = 0.6I and Σ = 2I]

As Σ becomes larger, the Gaussian becomes more spread out
Gaussian Intuitions: Off-diagonal

As the off-diagonal entries increase, there is more correlation between the value of x and the value of y
Gaussian Intuitions: off-diagonal and diagonal

 Decreasing non-diagonal entries (#1-2)


 Increasing variance of one dimension in diagonal (#3)
