Topic01 - Probability
Introduction
• Life is uncertain; we are not sure what the future will bring
• Probability is a numerical statement about the likelihood
that an event will occur
– 10% chance of rain tomorrow
– 20% chance the Hang Seng Index will not go down next week
– 30% chance the launch of the new iPhone will be delayed
– 70% chance Donald Trump will win the election again
– > 99.99% chance that the sun will rise tomorrow morning.
Diversey Paint Example
Drawing a Card Example
Two events, A and B: are they mutually exclusive? Collectively exhaustive?
1. Draw a spade and a club
2. Draw a face card and a number card
3. Draw an ace and a 3
4. Draw a club and a non-club
5. Draw a 5 and a diamond
6. Draw a red card and a non-diamond
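The six pairs above can be checked by enumerating a standard 52-card deck. The following is a minimal sketch, not part of the original slides: the helper names `event` and `classify` are hypothetical, and it assumes that an ace counts as a number card (the slide does not say).

```python
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = set(product(RANKS, SUITS))  # 52 (rank, suit) outcomes

FACE = {"J", "Q", "K"}
RED = {"hearts", "diamonds"}


def event(predicate):
    """Set of cards (outcomes) for which predicate(rank, suit) is true."""
    return {card for card in DECK if predicate(*card)}


def classify(a, b):
    """Return (mutually_exclusive, collectively_exhaustive) for events a and b."""
    return (not (a & b), (a | b) == DECK)


pairs = {
    "spade and club": (event(lambda r, s: s == "spades"),
                       event(lambda r, s: s == "clubs")),
    # Assumption: every non-face card, including the ace, is a number card.
    "face card and number card": (event(lambda r, s: r in FACE),
                                  event(lambda r, s: r not in FACE)),
    "ace and 3": (event(lambda r, s: r == "A"),
                  event(lambda r, s: r == "3")),
    "club and non-club": (event(lambda r, s: s == "clubs"),
                          event(lambda r, s: s != "clubs")),
    "5 and diamond": (event(lambda r, s: r == "5"),
                      event(lambda r, s: s == "diamonds")),
    "red card and non-diamond": (event(lambda r, s: s in RED),
                                 event(lambda r, s: s != "diamonds")),
}

results = {name: classify(a, b) for name, (a, b) in pairs.items()}
for name, (exclusive, exhaustive) in results.items():
    print(f"{name:28s} exclusive={exclusive!s:5s} exhaustive={exhaustive}")
```

Two events are mutually exclusive when their intersection is empty, and collectively exhaustive when their union is the whole deck; the code tests exactly those two set conditions.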
Union and Intersection
• The union of two events is the set of all outcomes that are contained in either of the two events.
$P(\text{Union of } A \text{ and } B) = P(A \text{ or } B) = P(A \cup B)$
Unions and Intersections
• General rule for the union of two events (the additive rule):
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Conditional Probability
• Conditional probability: the probability that event A occurs given that event B has already happened
$P(A \mid B) = \frac{P(AB)}{P(B)}$
$P(AB) = P(A \mid B) \, P(B)$
– Probability of a King given that a heart has been drawn:
$P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{1/52}{13/52} = \frac{1}{13}$
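As a quick check of the King-given-heart calculation, here is a small sketch using exact fractions. The variable names are illustrative only and do not come from the slides.

```python
from fractions import Fraction

# A = "the card is a King", B = "the card is a heart"
p_king_and_heart = Fraction(1, 52)   # only the King of hearts is in both events
p_heart = Fraction(13, 52)           # 13 hearts in a 52-card deck

# Conditional probability: P(A | B) = P(AB) / P(B)
p_king_given_heart = p_king_and_heart / p_heart
print(p_king_given_heart)
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the 1/13 on the slide.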
Independent events
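• Independent: the occurrence of one event has no effect on the probability of the other event
P(A | B) = P(A)
P(A and B) = P(A)P(B)
– Denoted by $A \perp B$
– For a fair coin tossed twice, the outcome of the first toss has no effect on the outcome of the second
• Also, $0 \le P(A \mid B) \le 1$, $0 \le P(AB) \le 1$, $0 \le P(A \cup B) \le 1$.
• $P(A \cup B) + P(A \cap B) = P(A) + P(B)$.
• $P(A \cap B) = P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$.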
Bayes’ Theorem
• Bayes’ theorem is used to incorporate additional
information and help create posterior probabilities from
original or prior probabilities
• What is the probability that a person is infected with HIV (Human Immunodeficiency Virus)?
– Before any test: HK prevalence rate = 0.1%
– After a lab test: $P(\text{Infected} \mid \text{Test Positive}) = ?$ and $P(\text{Infected} \mid \text{Test Negative}) = ?$
Bayes’ Theorem
• Test accuracy
– If a person is infected, the test will return positive result
90% of the time
– If a person is NOT infected, the test will return negative
result 95% of the time
• How to revise your probability assessment when you have
new information?
Diagnostic test for the Human Immuno-deficiency Virus (HIV)
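• $P(A \mid B) = \frac{P(AB)}{P(B)} = \frac{P(B \mid A) P(A)}{P(BA) + P(BA')} = \frac{P(B \mid A) P(A)}{P(B \mid A) P(A) + P(B \mid A') P(A')}$
• $P(\text{Infected} \mid \text{Test Positive}) = \frac{0.9 \times 0.001}{0.9 \times 0.001 + 0.05 \times 0.999} = 1.77\%$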
Bayes’ Theorem
Tabular Form of Bayes’ Calculations Given That Event B Has
Occurred
STATE OF NATURE   P(B | STATE OF NATURE)   PRIOR PROBABILITY   JOINT PROBABILITY   POSTERIOR PROBABILITY
A                 P(B | A)                 × P(A)              = P(B and A)        P(B and A) ÷ P(B) = P(A | B)
A′                P(B | A′)                × P(A′)             = P(B and A′)       P(B and A′) ÷ P(B) = P(A′ | B)
                                                               P(B)
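The tabular calculation can be sketched in a few lines of Python, applied to the HIV test numbers quoted earlier (prevalence 0.1%, 90% positive rate if infected, 95% negative rate if not infected). The function name `bayes_table` and the dictionary layout are assumptions made for this illustration.

```python
def bayes_table(priors, likelihoods):
    """Bayes' calculation in tabular form.

    priors:      {state: P(state)}
    likelihoods: {state: P(B | state)} for the observed event B
    Returns (joint, p_b, posterior), one entry per state of nature.
    """
    joint = {s: likelihoods[s] * priors[s] for s in priors}   # P(B and state)
    p_b = sum(joint.values())                                 # P(B), the column total
    posterior = {s: joint[s] / p_b for s in joint}            # P(state | B)
    return joint, p_b, posterior


priors = {"infected": 0.001, "not infected": 0.999}

# B = "test positive": 90% if infected, 5% (= 1 - 95%) if not infected
joint_pos, p_pos, post_pos = bayes_table(
    priors, {"infected": 0.90, "not infected": 0.05})

# B = "test negative": 10% (= 1 - 90%) if infected, 95% if not infected
joint_neg, p_neg, post_neg = bayes_table(
    priors, {"infected": 0.10, "not infected": 0.95})

print(f"P(Infected | Test Positive) = {post_pos['infected']:.4f}")
print(f"P(Infected | Test Negative) = {post_neg['infected']:.6f}")
```

Each dictionary corresponds to one column of the table: the joint column is likelihood times prior, its total is P(B), and dividing by that total gives the posterior column.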
Random Variables
• A random variable (RV) assigns a number to every
possible outcome or event in a statistical experiment.
• A discrete RV can assume only a finite or countable set of
values
• E.g., X = the number of newspapers sold during the
day.
• A continuous RV can assume any value in an interval (an
uncountably infinite set of values)
• E.g., Y = the lifespan of a light bulb.
Random Variables
Examples of Random Variables
RV with Qualitative Outcomes
• When the outcome itself is not numerical or quantitative, it
is necessary to define an RV that associates each
outcome with a unique real number. E.g.
– For tossing a coin, X = 1 if head and 0 if tail;
– For consumers’ response to how they like a product,
Y = 1 if poor, 2 if average, and 3 if good;
– For the brand of soda purchased by a consumer, Z = 1
if Pepsi, 2 if Coca-Cola, and 3 if Dr. Pepper.
Random Variables
Random Variables for Outcomes That Are Not Numbers
Probability Distributions
• For a discrete random variable, a probability value is assigned
to each possible value
– Quiz with five problems with 1 point for each correct
answer
– Lowest score = 1, highest score = 5
• Follows the three rules:
1. Outcomes are mutually exclusive and collectively
exhaustive
2. Individual probability values between 0 and 1
3. Total probability sums to 1
Probability Distributions
Probability Distribution: Discrete RV
• For each possible outcome $X_i$, there is a probability value $P(X_i)$.
• These values must be between 0 and 1: $0 \le P(X_i) \le 1$.
• They must sum to 1: $\sum_{i=1}^{n} P(X_i) = 1$.
• $P(X_i)$: probability mass function (pmf)
• $F(x) = P(X \le x) = \sum_{X_i \le x} P(X_i)$: cumulative distribution function (cdf)
Probability Distribution: Continuous RV
• For each possible interval $(x, x + \Delta)$, there is a probability
value $P(X \in (x, x + \Delta)) \approx f(x)\Delta$ for small $\Delta$
– The probability of each individual value of the random
variable occurring must equal 0
• $f(x)$: probability density function (pdf)
– Note $f(x)$ is not a probability (it is possible that $f(x) > 1$)
– The probabilities: $P(X \in (a, b)) = \int_a^b f(x)\,dx \in [0, 1]$.
– The density must integrate to 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
• $F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$: cumulative distribution
function (cdf)
Probability Distribution: Continuous RV
• For any continuous distribution, the probability does not change if a
single point is added to the range of values that is being considered.
$P(5.22 < X < 5.26) = P(5.22 < X \le 5.26) = P(5.22 \le X < 5.26) = P(5.22 \le X \le 5.26) = F(5.26) - F(5.22)$
Figure: Probability Density Function
Summary Statistics: Expected Value
• Expected value is a measure of the central tendency of
the distribution
• For a discrete random variable:
$\mu = E(X) = \sum_{i=1}^{n} X_i \cdot P(X_i) = X_1 \cdot P(X_1) + X_2 \cdot P(X_2) + \cdots + X_n \cdot P(X_n)$
where
$X_i$ = random variable's possible values
$P(X_i)$ = probability of each possible value of the random variable
$\sum_{i=1}^{n}$ = summation sign indicating we are adding all $n$ possible values
$E(X)$ = expected value or mean of the random variable
Summary Statistics Expected Value Example
$E(X) = \sum_{i=1}^{n} X_i P(X_i)$
• Variance of a discrete random variable:
$\sigma^2 = V(X) = \sum_{i=1}^{n} [X_i - E(X)]^2 \cdot P(X_i)$
where
$X_i$ = random variable's possible values
$E(X)$ = expected value of the random variable
$[X_i - E(X)]$ = difference between each value of the random
variable and the expected value
$P(X_i)$ = probability of each possible value of the random
variable
• Standard deviation: $\sigma = \sqrt{\text{Variance}} = \sqrt{\sigma^2}$
Summary Statistics Variance and
Standard Deviation Example
• For quiz scores $E(X) = 2.9$
$\text{Variance} = \sum_{i=1}^{n} [X_i - E(X)]^2 P(X_i)$
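The expected value, variance, and standard deviation formulas can be sketched as follows. The quiz-score probabilities in `quiz_pmf` are an assumption for illustration only (the slide's table is not reproduced in this text); they were chosen so that the mean matches the $E(X) = 2.9$ quoted above.

```python
from math import sqrt

# ASSUMED distribution of quiz scores (score -> probability), for illustration.
quiz_pmf = {5: 0.1, 4: 0.2, 3: 0.3, 2: 0.3, 1: 0.1}


def expected_value(pmf):
    """E(X) = sum of X_i * P(X_i) over all possible values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """V(X) = sum of [X_i - E(X)]^2 * P(X_i) over all possible values."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mean = expected_value(quiz_pmf)
var = variance(quiz_pmf)
std_dev = sqrt(var)   # standard deviation is the square root of the variance

print(f"E(X) = {mean:.2f}, V(X) = {var:.2f}, sigma = {std_dev:.3f}")
```

Storing the pmf as a dictionary keeps each value next to its probability, which mirrors the term-by-term sums in the formulas.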
In-class Exercises
• True or False?
– Define X to be a random variable as follows: X = 0 if it rains
tomorrow and X = 1 if it is sunny tomorrow.
– Suppose a random variable X has the following three possible
values: -1, 0, and 1. Then the expectation of X is 0.
– If $f(x)$ is the pdf of a random variable X, we must have $f(x) \le 1$.
– If X is a discrete random variable, then the possible values of X
must be integers.
– If X is a discrete random variable that can take integer values from
1 to 5, then $P(2 \le X \le 3) = P(X \le 3) - P(X \le 2)$.
4 Common Distributions
The Binomial Distribution
• The binomial distribution is used to find the probability of a
specific number ($r$) of successes in $n$ independent trials
$P(r; n, p) = \frac{n!}{r!(n - r)!} \cdot p^r \cdot (1 - p)^{n - r}$
n = number of trials
p = the probability of success on any single trial
r = number of successes
The symbol ! means factorial, and n! = n × (n − 1) × (n − 2) × … × 1. For
example: 4! = 4 × 3 × 2 × 1 = 24. Also, 1! = 1 and 0! = 1 by definition
Binomial Distribution Examples
• Toss a fair coin n times. Let X be the number of heads you get. X ~
Binomial(n, 0.5)
– With n = 5 tosses, the probability of getting 4 heads:
$P(X = 4) = \frac{5!}{4!(5-4)!} \, 0.5^4 \, 0.5^{5-4} = \frac{5(4)(3)(2)(1)}{4(3)(2)(1) \, 1!} (0.0625)(0.5) = 0.15625$
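The coin-toss calculation can be reproduced with a short function. This is a minimal sketch using Python's standard `math.comb` for the binomial coefficient; the function name `binomial_pmf` is chosen here for illustration.

```python
from math import comb


def binomial_pmf(r, n, p):
    """P(r; n, p): probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# 4 heads in 5 tosses of a fair coin
p_four_heads = binomial_pmf(4, 5, 0.5)
print(p_four_heads)

# The probabilities over r = 0, ..., n should sum to 1
total = sum(binomial_pmf(r, 5, 0.5) for r in range(6))
```

`comb(n, r)` computes n! / (r!(n − r)!) directly, which avoids evaluating large factorials separately.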
Using Excel
Excel Output for the Binomial Example
Poisson Distribution
• A discrete probability distribution
– Models the number of independent arrivals during a unit period of
time.
– Probability mass function given by
$P(x; \lambda) = \frac{e^{-\lambda} \cdot \lambda^x}{x!}$
where
$P(x; \lambda)$ = probability of exactly $x$ arrivals or occurrences
$\lambda$ = average number of arrivals per unit of time
(the mean arrival rate)
e ≈ 2.718, the base of natural logarithms
$x$ = number of occurrences (0, 1, 2, 3, …)
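A minimal sketch of the Poisson formula follows. The function name `poisson_pmf` is illustrative, and the rates λ = 2 and λ = 4 are used only because they are the two rates named in the sample-distribution caption of this deck.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(x; lambda): probability of exactly x arrivals when the mean rate is lam."""
    return exp(-lam) * lam ** x / factorial(x)


for lam in (2, 4):
    probs = [poisson_pmf(x, lam) for x in range(10)]
    print(f"lambda = {lam}: " + ", ".join(f"{p:.4f}" for p in probs))

# Sanity checks: the pmf sums to (almost) 1 and its mean is lambda.
# The sums are truncated at x = 59, where the remaining tail is negligible.
total = sum(poisson_pmf(x, 2) for x in range(60))
mean = sum(x * poisson_pmf(x, 2) for x in range(60))
```

Truncating the infinite sum is a simplification; for small λ the probabilities beyond a few dozen arrivals are far below floating-point precision.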
Poisson Distribution Examples
Sample Poisson Distributions with λ = 2 and λ = 4
Using Excel
Excel Output for the Poisson Distribution
Uniform Distribution
• A random variable $X$ is uniformly distributed if it is equally
likely to take any value in a finite interval $[a, b]$.
• If $X \sim U(a, b)$, then $E(X) = \frac{a + b}{2}$ and $V(X) = \frac{(b - a)^2}{12}$
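The mean and variance formulas can be checked numerically. The sketch below assumes the standard uniform density $f(x) = 1/(b - a)$ on $[a, b]$, which is not written out on the slide, and approximates the integrals with a midpoint rule; all function names are illustrative.

```python
def uniform_pdf(x, a, b):
    """Density of U(a, b): constant 1 / (b - a) inside the interval, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_mean(a, b):
    return (a + b) / 2


def uniform_variance(a, b):
    return (b - a) ** 2 / 12


def midpoint_moments(a, b, steps=100_000):
    """Approximate E(X) and V(X) by a midpoint-rule integral of the density."""
    width = (b - a) / steps
    xs = [a + (i + 0.5) * width for i in range(steps)]
    mean = sum(x * uniform_pdf(x, a, b) * width for x in xs)
    var = sum((x - mean) ** 2 * uniform_pdf(x, a, b) * width for x in xs)
    return mean, var


a, b = 0.0, 0.5
approx_mean, approx_var = midpoint_moments(a, b)
print(uniform_pdf(0.25, a, b))           # density value, not a probability
print(approx_mean, uniform_mean(a, b))
print(approx_var, uniform_variance(a, b))
```

The interval [0, 0.5] was picked deliberately: the density there equals 2, a concrete case of a pdf value greater than 1.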
Exponential Distribution
• The exponential distribution often describes the time required to
service a customer or the lifespan of a product (e.g., a light bulb)
Exponential Distribution Examples
Using Excel
Excel Output for the Exponential Distribution
5 Normal Distribution
• Normal Distribution
• Standard Normal Distribution
• Using the Standard Normal Table
Normal Distribution
• One of the most popular and useful continuous probability
distributions
– E.g., return of a stock portfolio, forecast errors, and test
scores.
– The probability density function
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \cdot e^{-\frac{(x - \mu)^2}{2\sigma^2}}$
– Completely specified by the mean, μ, and the standard
deviation, σ
– We often use the notation $N(\mu, \sigma^2)$.
Normal Distribution
Normal Distribution with Different Values for μ
Normal Distribution
Normal Distribution with Different Values for σ
Normal Distribution
• Symmetrical with the midpoint representing the mean
• Shifting the mean does not change the shape
• Values on the X axis measured in the number of standard
deviations away from the mean
• As standard deviation becomes larger, curve flattens
• As standard deviation becomes smaller, curve becomes
steeper
Normal Distribution
Standard Normal Distribution
If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ follows the standard normal distribution.
Using the Standard Normal Table Example
$Z = \frac{X - \mu}{\sigma} = \frac{130 - 100}{15} = \frac{30}{15} = 2$ std dev
Using the Standard Normal Table
Step 1
• Convert the normal distribution into a standard normal
distribution
– Mean of 0 and a standard deviation of 1
– The new standard random variable is Z
$Z = \frac{X - \mu}{\sigma}$
where
X = value of the random variable we want to measure
μ = mean of the distribution
σ = standard deviation of the distribution
Z = number of standard deviations from X to the mean, μ
Using the Standard Normal Table
Step 2
• Look up the probability from a table of normal curve areas
• Use Standard Normal Table (Provided in Exam)
• Column on the left is Z value
• Row at the top has second decimal places for Z values
Standard Normal Table: Z Table
For Z = 2.00
P(X < 130) = P(Z < 2.00) = 0.97725
P(X > 130) = 1 − P(X ≤ 130) = 1 − P(Z ≤ 2)
= 1 − 0.97725 = 0.02275
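The table lookup for Z = 2.00 can be cross-checked with Python's standard-library `statistics.NormalDist`. This is a sketch alongside the printed Z table, not a replacement for it; the mean of 100 and standard deviation of 15 are taken from the Z calculation in the example.

```python
from statistics import NormalDist

mu, sigma = 100, 15          # parameters used in the example's Z calculation
x = 130

# Step 1: convert X to a standard normal value
z = (x - mu) / sigma

# Step 2: "look up" the cumulative probability for Z
p_below = NormalDist().cdf(z)        # P(Z < z) for the standard normal
p_above = 1 - p_below                # P(X > 130)

# The same probability computed directly on the original scale
p_below_direct = NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.2f}, P(X < {x}) = {p_below:.5f}, P(X > {x}) = {p_above:.5f}")
```

Both routes give the same probability, which is the point of standardizing: one table for Z serves every normal distribution.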
Using the Standard Normal Table
• Find a specified percentile
– $X \sim N(\mu, \sigma^2)$
– Find the 80th percentile of $X$: find a value $x$ such that
80% of the possible values of $X$ are no larger than $x$
▪ E.g., 50th percentile of $X$ is $\mu$
– Equivalent: Find a number $x$ such that $X$ will be less
than $x$ with probability 0.8, i.e., $P(X < x) = 0.8$.
– How to find this value 𝑥?
Finding the x value
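• Step 1: Find the z value corresponding to 0.8
– If we need $x$ such that $P(X < x) = p$, find the $z$ in the standard
normal table with $P(Z < z) = p$.
– $z_{0.8} = 0.84$
• Step 2: Set $x$ to $x_z = \mu + z \cdot \sigma$.
– $P(X < x_z) = P(X < \mu + z \cdot \sigma) = P\left(\frac{X - \mu}{\sigma} < z\right) = P(Z < z) = P(X < x)$
– With $\mu = 100$ and $\sigma = 15$: $P(X < 100 + z_{0.8} \times 15) = P(X < 112.6) = 0.8$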
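The two steps can be sketched with `statistics.NormalDist.inv_cdf`, which inverts the cumulative distribution instead of searching the printed table. The values μ = 100 and σ = 15 come from the last line of the slide; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 100, 15
p = 0.80

# Step 1: z value with P(Z < z) = 0.8 (the table gives 0.84 to two decimals)
z = NormalDist().inv_cdf(p)

# Step 2: map back to the original scale, x = mu + z * sigma
x = mu + z * sigma

# Cross-check: invert the N(mu, sigma^2) distribution directly
x_direct = NormalDist(mu, sigma).inv_cdf(p)

print(f"z = {z:.4f}, 80th percentile x = {x:.2f}")
```

The result differs from the slide's 112.6 only in the later decimals, because the table rounds z to 0.84.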
Haynes Construction Company
Haynes Construction Company
• Compute Z
$Z = \frac{X - \mu}{\sigma} = \frac{125 - 100}{20} = \frac{25}{20} = 1.25$
$P(X < 125) = P(Z < 1.25) = 0.8944$
Haynes Construction Company
• If finished in 75 days or less, bonus = $5,000
– Probability of bonus?
$Z = \frac{X - \mu}{\sigma} = \frac{75 - 100}{20} = \frac{-25}{20} = -1.25$
• Because the distribution is symmetrical:
$P(X < 75) = P(Z < -1.25) = P(Z > 1.25) = 1 - P(Z < 1.25) = 1 - 0.8944 = 0.1056$
• The probability of completing the contract in 75 days or less
is about 11%
Haynes Construction Company
$Z = \frac{X - \mu}{\sigma} = \frac{110 - 100}{20} = \frac{10}{20} = 0.5$
– $P(X < 110) = P(Z < 0.5) = 0.6915$
• $P(110 < X < 125) = 0.8944 - 0.6915 = 0.2029$
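The three Haynes probabilities can be recomputed in one place. This sketch assumes, as the Z calculations indicate, that completion time is normal with mean 100 days and standard deviation 20 days; the variable names are illustrative.

```python
from statistics import NormalDist

completion = NormalDist(mu=100, sigma=20)   # completion time in days

p_within_125 = completion.cdf(125)          # P(X < 125)
p_bonus = completion.cdf(75)                # P(X < 75), finish early enough for the bonus
p_between = completion.cdf(125) - completion.cdf(110)   # P(110 < X < 125)

print(f"P(X < 125)       = {p_within_125:.4f}")
print(f"P(X < 75)        = {p_bonus:.4f}")
print(f"P(110 < X < 125) = {p_between:.4f}")
```

An interval probability is the difference of two cumulative probabilities, which is why the last line subtracts `cdf(110)` from `cdf(125)`.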
Haynes Construction Company **
• Objective: find $\mu$ such that $P(X' < 125) = P\left(\frac{X' - \mu}{20} < \frac{125 - \mu}{20}\right) = 0.95$
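One way to solve for μ is to note that the objective requires $(125 - \mu)/20$ to equal the 95th-percentile z value, so $\mu = 125 - 20 \cdot z_{0.95}$. The sketch below assumes the standard deviation stays at 20 days, as the formula above suggests; the slides' own solution is not reproduced in this text, so treat the result as an illustration of the method.

```python
from statistics import NormalDist

sigma = 20          # assumed unchanged standard deviation (days)
deadline = 125      # days
target = 0.95       # required probability of finishing before the deadline

# z value with P(Z < z) = 0.95
z_95 = NormalDist().inv_cdf(target)

# (deadline - mu) / sigma = z_95  =>  mu = deadline - z_95 * sigma
required_mu = deadline - z_95 * sigma

# Verify: with this mean, P(X' < 125) should be 0.95
check = NormalDist(required_mu, sigma).cdf(deadline)

print(f"z_0.95 = {z_95:.3f}, required mean = {required_mu:.1f} days, check = {check:.3f}")
```

The required mean comes out below the current 100 days, which makes sense: raising P(X < 125) from 0.8944 to 0.95 requires the project to finish faster on average.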
Appendix:
Standard Normal Table (Part 1)
Appendix:
Standard Normal Table (Part 2)
Summary
1 Fundamental Concepts
4 Common Distributions