23 4 14 Prob Distribution
Economics
Chapter
Random Variables &
Probability Distributions
Content
1. Two Types of Random Variables
2. Probability Distributions for Discrete
Random Variables
3. The Binomial Distribution
4. Poisson and Hypergeometric Distributions
5. Probability Distributions for Continuous
Random Variables
6. The Normal Distribution
Content (continued)
7. Descriptive Methods for Assessing
Normality
8. Approximating a Binomial Distribution with
a Normal Distribution
9. Uniform and Exponential Distributions
10. Sampling Distributions
11. The Sampling Distribution of a Sample
Mean and the Central Limit Theorem
Learning Objectives
1. Develop the notion of a random variable
2. Learn that numerical data are observed values of
either discrete or continuous random variables
3. Study two important types of random variables
and their probability models: the binomial and
normal model
4. Define a sampling distribution as the probability distribution of
a sample statistic
5. Learn that the sampling distribution of $\bar{x}$ follows a
normal model
2011 Pearson Education, Inc
Thinking Challenge
You're taking a 33-question multiple-
choice test. Each question has 4
choices. Clueless on 1 question, you
decide to guess. What's the chance
you'll get it right?
If you guessed on all 33 questions, what
would be your grade? Would you pass?
Two Types of
Random Variables
Random Variable
A random variable is a variable that assumes
numerical values associated with the random
outcomes of an experiment, where one (and only
one) numerical value is assigned to each sample
point.
Discrete
Random Variable
Random variables that can assume a countable
number (finite or infinite) of values are called
discrete.
Experiment             Random Variable    Possible Values
-                      # Sales            0, 1, 2, ..., 100
Inspect 70 Radios      # Defective        0, 1, 2, ..., 70
Answer 33 Questions    # Correct          0, 1, 2, ..., 33
-                      # Cars Arriving    0, 1, 2, ..., ∞
Continuous
Random Variable
Random variables that can assume
values corresponding to any of the
points contained in one or more
intervals (i.e., values that are infinite
and uncountable) are called
continuous.
Continuous Random
Variable Examples
Experiment                       Random Variable    Possible Values
-                                Weight             -
-                                Hours              -
-                                $ amount           -
Measure Time Between Arrivals    -                  -
Probability Distributions
for Discrete Random
Variables
Discrete
Probability Distribution
The probability distribution of a
discrete random variable is a
graph, table, or formula that specifies
the probability associated with each
possible value the random variable
can assume.
Discrete Probability
Distribution Example
Experiment: Toss 2 coins. Count the number of tails.
Probability Distribution
Values, x    Probabilities, p(x)
0            1/4 = .25
1            2/4 = .50
2            1/4 = .25
Visualizing Discrete
Probability Distributions
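Listing
{ (0, .25), (1, .50), (2, .25) }
Table
# Tails, x    f(x) Count    p(x)
0             1             .25
1             2             .50
2             1             .25
Graph
p(x) plotted against x = 0, 1, 2 (vertical axis from .00 to .50)
Formula
$p(x) = \dfrac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$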
Summary Measures
1. Expected Value (mean of the probability distribution)
Weighted average of all possible values
$\mu = E(x) = \sum x\,p(x)$
2. Variance
Weighted average of squared deviations about the mean
$\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2\,p(x)$
3. Standard Deviation
$\sigma = \sqrt{\sigma^2}$
Summary Measures
Calculation Table
x        p(x)    $x\,p(x)$         $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2\,p(x)$
Total            $\sum x\,p(x)$                                  $\sum (x - \mu)^2\,p(x)$
Thinking Challenge
You toss 2 coins. You're interested in
the number of tails. What are the
expected value, variance, and
standard deviation of this random
variable, number of tails?
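x    p(x)    $x\,p(x)$    $x - \mu$    $(x - \mu)^2$    $(x - \mu)^2\,p(x)$
0    .25     0            -1.00        1.00             .25
1    .50     .50          0            0                0
2    .25     .50          1.00         1.00             .25
$\mu = 1.0$, $\sigma^2 = .50$, $\sigma = \sqrt{.50} \approx .71$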
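The calculation table for the coin-toss challenge can be checked numerically. The following is a minimal Python sketch, assuming the distribution is stored as a dictionary mapping each value x to p(x); the function name `summarize` is illustrative and does not come from the slides.

```python
import math

# Number of tails in 2 coin tosses: value x mapped to p(x), as in the slide.
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())               # sum of x * p(x)
    var = sum((x - mu) ** 2 * p for x, p in dist.items())  # sum of (x - mu)^2 * p(x)
    return mu, var, math.sqrt(var)

mu, var, sigma = summarize(dist)
print(f"mean = {mu:.2f}, variance = {var:.2f}, std dev = {sigma:.2f}")
```

The same helper works for any finite discrete distribution whose probabilities sum to 1.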
Empirical Rule
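$P(\mu - \sigma < x < \mu + \sigma) \approx .68$ (Chebyshev's rule: $\geq 0$)
$P(\mu - 2\sigma < x < \mu + 2\sigma) \approx .95$ (Chebyshev's rule: $\geq 3/4$)
$P(\mu - 3\sigma < x < \mu + 3\sigma) \approx 1.00$ (Chebyshev's rule: $\geq 8/9$)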
The Binomial
Distribution
Binomial Distribution
Number of successes in a sample of n
observations (trials)
Number of reds in 15 spins of roulette wheel
Number of defective items in a batch of 5 items
Number correct on a 33 question exam
Number of customers who purchase out of 100
customers who enter store (each customer is
equally likely to purchase)
Binomial Probability
Characteristics of a Binomial Experiment
1. The experiment consists of n identical trials.
2. There are only two possible outcomes on each trial. We
will denote one outcome by S (for success) and the other
by F (for failure).
3. The probability of S remains the same from trial to trial.
This probability is denoted by p, and the probability of
F is denoted by q. Note that $q = 1 - p$.
4. The trials are independent.
5. The binomial random variable x is the number of S's in
n trials.
Binomial Probability
Distribution
$p(x) = \binom{n}{x} p^x q^{n - x} = \dfrac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$
p(x) = Probability of x Successes
p = Probability of a Success on a single trial
q = $1 - p$
n = Number of trials
x = Number of Successes in n trials
(x = 0, 1, 2, ..., n)
$n - x$ = Number of failures in n trials
Binomial Probability
Distribution Example
Experiment: Toss 1 coin 5 times in a row. Note the
number of tails. What's the probability of 3 tails?
$p(x) = \dfrac{n!}{x!\,(n - x)!}\, p^x (1 - p)^{n - x}$
$p(3) = \dfrac{5!}{3!\,(5 - 3)!}\,(.5)^3 (1 - .5)^{5 - 3} = .3125$
Binomial table (portion) for n = 5; entries are $P(x \leq k)$
k    p = .01    p = .50    p = .99
0    .951       .031       .000
1    .999       .188       .000
2    1.000      .500       .000
3    1.000      .812       .001
4    1.000      .969       .049
Cumulative Probabilities
$p(3) = P(x \leq 3) - P(x \leq 2) = .812 - .500 = .312$
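The five-toss example can be reproduced both from the formula and from cumulative probabilities. This is a short Python sketch using only the standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative assumptions, not part of the slides.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(k, n, p):
    """P(x <= k), built by summing the pmf from 0 to k."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1))

# Direct formula: 3 tails in 5 tosses of a fair coin.
p3_direct = binom_pmf(3, 5, 0.5)

# Same probability as a difference of cumulative probabilities.
p3_from_cdf = binom_cdf(3, 5, 0.5) - binom_cdf(2, 5, 0.5)

print(f"p(3) = {p3_direct:.4f}, from cumulative table = {p3_from_cdf:.4f}")
```

The table-based answer (.312) differs from .3125 only because the table entries are rounded to three decimals.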
Binomial Distribution
Characteristics
Mean
$\mu = E(x) = np$
Standard Deviation
$\sigma = \sqrt{npq}$
Graphs: p(x) for n = 5, p = 0.1 and for n = 5, p = 0.5
Binomial Distribution
Thinking Challenge
You're a telemarketer selling service contracts for
Macy's. You've sold 20 in your last 100 calls
(p = .20). If you
call 12 people tonight, what's the probability of
A. No sales?
B. Exactly 2 sales?
C. At most 2 sales?
D. At least 2 sales?
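One way to work through parts A to D is to evaluate the binomial formula with n = 12 and p = .20. The Python sketch below is an illustration under that assumption (12 independent calls, constant p); the variable names are hypothetical and not taken from the slides.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 12, 0.20

no_sales = binom_pmf(0, n, p)                              # A. P(x = 0)
exactly_two = binom_pmf(2, n, p)                           # B. P(x = 2)
at_most_two = sum(binom_pmf(x, n, p) for x in range(3))    # C. P(x <= 2)
at_least_two = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p) # D. 1 - P(x <= 1)

for label, value in [("A", no_sales), ("B", exactly_two),
                     ("C", at_most_two), ("D", at_least_two)]:
    print(f"{label}: {value:.4f}")
```

Part D uses the complement rule, which avoids summing the eleven terms from x = 2 to x = 12.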
4.4
Other Discrete Distributions:
Poisson and Hypergeometric
Poisson Distribution
1. Number of events that occur in an interval:
events per unit
(time, length, area, space)
2. Examples
Number of customers arriving in 20 minutes
Number of strikes per year in the U.S.
Number of defects per lot (group) of DVDs
Characteristics of a Poisson
Random Variable
1. Consists of counting number of times an event
occurs during a given unit of time or in a given
area or volume (any unit of measurement).
2. The probability that an event occurs in a given unit
of time, area, or volume is the same for all units.
3. The number of events that occur in one unit of
time, area, or volume is independent of the number
that occur in any other mutually exclusive unit.
4. The mean number of events in each unit is denoted
by $\lambda$.
Poisson Probability
Distribution Function
$p(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$
(x = 0, 1, 2, 3, . . .)
p(x) = Probability of x given $\lambda$
$\lambda$ = Mean (expected) number of events in unit
e = 2.71828 . . . (base of natural logarithm)
x = Number of events per unit
Poisson Probability
Distribution Function
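Mean
$\mu = E(x) = \lambda$
Standard Deviation
$\sigma = \sqrt{\lambda}$
Graphs: p(x) for $\lambda = 0.5$ and for $\lambda = 6$
Example with $\lambda = 3.6$:
$p(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$
$p(4) = \dfrac{(3.6)^4 e^{-3.6}}{4!} = .1912$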
Cumulative Probabilities
$p(4) = P(x \leq 4) - P(x \leq 3) = .706 - .515 = .191$
Thinking Challenge
You work in Quality Assurance for an
investment firm. A clerk enters 75
words per minute with 6 errors per
hour. What is the probability of 0
errors in a 255-word bond
transaction?
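A 255-word transaction takes 255/75 = 3.4 minutes, and 6 errors per hour is .1 errors per minute, so $\lambda = (.1)(3.4) = .34$ errors per transaction.
$p(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$
$p(0) = \dfrac{(.34)^0 e^{-.34}}{0!} = .7118$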
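Both Poisson calculations above (the $\lambda = 3.6$ example and the clerk's error rate) can be reproduced with a few lines of Python. This is a sketch under the assumption that errors occur at a constant rate per minute of typing; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(exactly x events) when the mean number of events per unit is lam."""
    return lam**x * exp(-lam) / factorial(x)

# Example from the slide: lambda = 3.6, probability of exactly 4 events.
p4 = poisson_pmf(4, 3.6)

# Clerk problem: convert the hourly error rate to errors per transaction.
words_per_minute = 75
errors_per_hour = 6
transaction_words = 255

minutes = transaction_words / words_per_minute   # 3.4 minutes of typing
lam = (errors_per_hour / 60) * minutes           # 0.34 expected errors
p0 = poisson_pmf(0, lam)

print(f"p(4 | lambda=3.6) = {p4:.4f}")
print(f"lambda = {lam:.2f}, p(0) = {p0:.4f}")
```

The key step is expressing $\lambda$ in the same unit as the question (one 255-word transaction) before applying the formula.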
Characteristics of a
Hypergeometric
Random Variable
1. The experiment consists of randomly drawing n
elements without replacement from a set of N
elements, r of which are S's (for success) and
$(N - r)$ of which are F's (for failure).
2. The hypergeometric random variable x is the
number of S's in the draw of n elements.
Hypergeometric Probability
Distribution Function
$p(x) = \dfrac{\binom{r}{x}\binom{N - r}{n - x}}{\binom{N}{n}}$
$\mu = \dfrac{nr}{N}$
$\sigma^2 = \dfrac{r(N - r)\,n(N - n)}{N^2 (N - 1)}$
where . . .
Hypergeometric Probability
Distribution Function
N = Total number of elements
r = Number of S's in the N elements
n = Number of elements drawn
x = Number of S's drawn in the n
elements
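To see the hypergeometric formulas in action, the Python sketch below evaluates them for hypothetical values N = 10, r = 4, n = 3. These numbers are chosen only for illustration and do not appear in the slides; the function name `hypergeom_pmf` is likewise an assumption.

```python
from math import comb

def hypergeom_pmf(x, N, r, n):
    """P(x successes) when drawing n of N elements without replacement,
    where r of the N elements are successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Hypothetical illustration values (not from the slides).
N, r, n = 10, 4, 3

pmf = {x: hypergeom_pmf(x, N, r, n) for x in range(0, min(r, n) + 1)}

# Mean and variance from the closed-form expressions.
mean_formula = n * r / N
var_formula = r * (N - r) * n * (N - n) / (N**2 * (N - 1))

# The same quantities computed directly from the distribution.
mean_direct = sum(x * p for x, p in pmf.items())
var_direct = sum((x - mean_direct) ** 2 * p for x, p in pmf.items())

print(pmf)
print(mean_formula, mean_direct, var_formula, var_direct)
```

Agreement between the closed-form and direct values is a quick sanity check on the formulas for $\mu$ and $\sigma^2$.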
Probability
Distributions for
Continuous Random
Variables
Continuous Probability
Density Function
The graphical form of the probability distribution for a
continuous random variable x is a smooth curve
The Normal
Distribution
Importance of
Normal Distribution
1. Describes many random processes or
continuous phenomena
2. Can be used to approximate discrete
probability distributions
Example: binomial
Normal Distribution
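1. Bell-shaped &
symmetrical
2. Mean, median, and mode are equal
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
where
$\mu$ = Mean of the normal random variable x
$\sigma$ = Standard deviation
$\pi$ = 3.1415 . . .
e = 2.71828 . . .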
P(x < a) is obtained from a table of normal
probabilities
Effect of Varying
Parameters ($\mu$ & $\sigma$)
Normal Distribution
Probability
Probability is
area under
curve!
$P(c \leq x \leq d) = \int_c^d f(x)\,dx$
Standard normal examples ($\mu = 0$, $\sigma = 1$; shaded areas in the figures are exaggerated):
$P(0 \leq z \leq 1.96) = .4750$ (table row 1.9, column .06)
$P(-1.26 \leq z \leq 1.26) = .3962 + .3962 = .7924$
$P(z > 1.26) = .5000 - .3962 = .1038$
$P(-2.78 \leq z \leq -2.00) = .4973 - .4772 = .0201$
$P(z > -2.13) = .4834 + .5000 = .9834$
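The table lookups for the standard normal distribution can be cross-checked in code. The sketch below assumes the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$ for the cumulative distribution; the helper name `phi` is illustrative.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

areas = {
    "P(0 <= z <= 1.96)": phi(1.96) - phi(0.0),
    "P(-1.26 <= z <= 1.26)": phi(1.26) - phi(-1.26),
    "P(z > 1.26)": 1.0 - phi(1.26),
    "P(-2.78 <= z <= -2.00)": phi(-2.00) - phi(-2.78),
    "P(z > -2.13)": 1.0 - phi(-2.13),
}

for label, area in areas.items():
    print(f"{label} = {area:.4f}")
```

Small differences in the fourth decimal place relative to the slide values come from rounding in the printed table.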
Non-standard Normal
Distribution
Normal distributions differ by
mean & standard deviation.
f(x)
That's an infinite
number of tables!
Standardize the
Normal Distribution
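$z = \dfrac{x - \mu}{\sigma}$
A normal distribution with mean $\mu$ and standard deviation $\sigma$ becomes the standardized normal distribution with $\mu = 0$ and $\sigma = 1$.
One table!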
Examples with $\mu = 5$ and $\sigma = 10$ (shaded areas in the figures are exaggerated):
$P(5 \leq x \leq 6.2)$: $z = \dfrac{x - \mu}{\sigma} = \dfrac{6.2 - 5}{10} = .12$, so $P(0 \leq z \leq .12) = .0478$
$P(3.8 \leq x \leq 5)$: $z = \dfrac{3.8 - 5}{10} = -.12$, so $P(-.12 \leq z \leq 0) = .0478$
$P(2.9 \leq x \leq 7.1)$: $z = \dfrac{2.9 - 5}{10} = -.21$ and $z = \dfrac{7.1 - 5}{10} = .21$, so $P(-.21 \leq z \leq .21) = .0832 + .0832 = .1664$
$P(x \geq 8)$: $z = \dfrac{8 - 5}{10} = .30$, so $P(z \geq .30) = .5000 - .1179 = .3821$
$P(7.1 \leq x \leq 8)$: $z = .21$ and $z = .30$, so $P(.21 \leq z \leq .30) = .1179 - .0832 = .0347$
Normal Distribution
Thinking Challenge
You work in Quality Control for GE. Light bulb life
has a normal distribution with $\mu$ = 2000 hours and
$\sigma$ = 200 hours. What's the probability that a bulb will
last
A. between 2000 and 2400
hours?
B. less than 1470 hours?
A. $z = \dfrac{x - \mu}{\sigma} = \dfrac{2400 - 2000}{200} = 2.0$, so $P(2000 \leq x \leq 2400) = P(0 \leq z \leq 2.0) = .4772$
B. $z = \dfrac{x - \mu}{\sigma} = \dfrac{1470 - 2000}{200} = -2.65$, so $P(x < 1470) = P(z < -2.65) = .5000 - .4960 = .0040$
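The light bulb answers can be verified without a table by using `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later). This sketch assumes the stated model of $\mu = 2000$ and $\sigma = 200$ hours; the variable names are illustrative.

```python
from statistics import NormalDist

# Bulb life model from the challenge: mean 2000 hours, std dev 200 hours.
bulb_life = NormalDist(mu=2000, sigma=200)

# Standardized values, shown for comparison with the table-based solution.
z_a = (2400 - bulb_life.mean) / bulb_life.stdev
z_b = (1470 - bulb_life.mean) / bulb_life.stdev

# A. P(2000 <= x <= 2400) as a difference of cumulative probabilities.
prob_a = bulb_life.cdf(2400) - bulb_life.cdf(2000)

# B. P(x < 1470) is a single cumulative probability.
prob_b = bulb_life.cdf(1470)

print(f"A: z = {z_a:.2f}, probability = {prob_a:.4f}")
print(f"B: z = {z_b:.2f}, probability = {prob_b:.4f}")
```

Working with the non-standard distribution directly gives the same areas as standardizing first, which is the point of the z transformation.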
Finding z-Values
for Known Probabilities
Standardized Normal
Probability Table (Portion)
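What is z, given that the area between 0 and z is .1217?
In the standard normal table ($\mu = 0$, $\sigma = 1$), .1217 lies in row 0.3, column .01, so z = .31.
(Shaded area in the figure is exaggerated.)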
Finding x Values
for Known Probabilities
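Normal distribution with $\mu = 5$ and $\sigma = 10$: the area between $\mu$ and x is .1217, which corresponds to z = .31 on the standardized normal distribution ($\mu = 0$, $\sigma = 1$).
$x = \mu + z\sigma = 5 + (.31)(10) = 8.1$
(Shaded areas in the figures are exaggerated.)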
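Reading the table backwards corresponds to evaluating the inverse of the normal CDF. A minimal Python sketch, assuming the area .1217 is measured from the mean (so the cumulative probability is .5 + .1217), might look like this:

```python
from statistics import NormalDist

area_from_mean = 0.1217   # area between 0 and z, as on the slide
mu, sigma = 5, 10         # parameters of the non-standard normal

# inv_cdf expects a cumulative probability P(Z <= z), so add the lower half.
z = NormalDist().inv_cdf(0.5 + area_from_mean)

# Convert the z value back to the original scale.
x = mu + z * sigma

print(f"z = {z:.2f}, x = {x:.1f}")
```

`NormalDist.inv_cdf` is part of the standard library; the offset of .5 is needed only because the slide's table reports areas measured from the mean rather than cumulative areas.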
4.7
Descriptive Methods for
Assessing Normality
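3. Compute IQR/s, where $\text{IQR} = Q_3 - Q_1$ and s is the sample standard deviation. If the data are approximately normal, $\text{IQR}/s \approx 1.3$.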
4. Examine a normal
probability plot for the
data. If the data are
approximately normal,
the points will fall
(approximately) on a
straight line.
Plot axes: expected z-score vs. observed value
Approximating a
Binomial Distribution
with a Normal
Distribution
Normal Approximation of
Binomial Distribution
1. Useful because not all
binomial tables exist
2. Requires large sample
size
3. Gives approximate
probability only
4. Need correction for
continuity
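Figure: binomial distribution p(x) for n = 10, p = 0.50 (x from 0 to 10) with a normal curve overlaid.
Binomial probability: bar height. Some probability is lost by the normal curve.
Correction for continuity: the bar at x = 4 extends to 4.5 (4 + .5).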
$\mu \pm 3\sigma = np \pm 3\sqrt{np(1 - p)}$
If this interval lies in the range 0 to n, the normal
distribution will provide a reasonable
approximation to the probabilities of most
binomial events.
$P(x \geq 5) = 1 - P(x \leq 4)$
$P(7 \leq x \leq 10) = P(x \leq 10) - P(x \leq 6)$
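Example: $P(x = 4)$ for n = 10, p = .5, approximated by the normal area from 3.5 to 4.5.
$z = \dfrac{(a \pm .5) - np}{\sqrt{np(1 - p)}}$
$z = \dfrac{3.5 - 10(.5)}{\sqrt{10(.5)(1 - .5)}} = -.95$
$z = \dfrac{4.5 - 10(.5)}{\sqrt{10(.5)(1 - .5)}} = -.32$
$P(-.95 \leq z \leq -.32) = .3289 - .1255 = .2034$ (standardized normal, $\mu = 0$, $\sigma = 1$)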
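To judge how good the approximation is, the normal area with the continuity correction can be compared against the exact binomial probability. The Python sketch below does this for n = 10, p = .5, x = 4; it is an illustration only, and the variable names are assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, a = 10, 0.5, 4

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Check that mu +/- 3 sigma stays inside the range 0 to n.
interval_ok = (mu - 3 * sigma >= 0) and (mu + 3 * sigma <= n)

# Exact binomial probability P(x = 4).
exact = comb(n, a) * p**a * (1 - p) ** (n - a)

# Normal approximation with the continuity correction: area from 3.5 to 4.5.
z_low = (a - 0.5 - mu) / sigma
z_high = (a + 0.5 - mu) / sigma
std_normal = NormalDist()
approx = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(f"interval within [0, n]: {interval_ok}")
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The approximation computed here uses unrounded z values, so it differs slightly from the table-based .2034.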
Other Continuous
Distributions:
Uniform and
Exponential
Uniform Distribution
Continuous random variables that appear to have
equally likely outcomes over their range of possible
values possess a uniform probability distribution.
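Suppose the random variable x can assume values only in an interval $c \leq x \leq d$. Then the uniform frequency function has a rectangular shape.
$f(x) = \dfrac{1}{d - c}, \quad c \leq x \leq d$
Mean: $\mu = \dfrac{c + d}{2}$
Standard Deviation: $\sigma = \dfrac{d - c}{\sqrt{12}}$
$P(a \leq x \leq b) = \dfrac{b - a}{d - c}, \quad c \leq a < b \leq d$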
Uniform Distribution
Example
You're the production manager of a soft drink bottling
company. You believe that when a machine is set
to dispense 12 oz., it really dispenses between
11.5 and 12.5 oz. inclusive. Suppose the
amount dispensed has a uniform distribution.
What is the probability that less than 11.8 oz. is
dispensed?
$f(x) = \dfrac{1}{d - c} = \dfrac{1}{12.5 - 11.5} = \dfrac{1}{1.0} = 1$
$P(11.5 \leq x \leq 11.8)$
= (Base)(Height)
= (11.8 - 11.5)(1) = .30
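The rectangle-area calculation generalizes to any sub-interval of a uniform distribution. A short Python sketch, with the illustrative helper name `uniform_prob` and the bottling example's limits as inputs:

```python
from math import sqrt

def uniform_prob(a, b, c, d):
    """P(a <= x <= b) for x uniform on [c, d]: base (b - a) times height 1/(d - c)."""
    if not (c <= a <= b <= d):
        raise ValueError("require c <= a <= b <= d")
    return (b - a) / (d - c)

c, d = 11.5, 12.5           # dispensing range in ounces

prob = uniform_prob(c, 11.8, c, d)   # P(less than 11.8 oz)
mean = (c + d) / 2
std_dev = (d - c) / sqrt(12)

print(f"P(x < 11.8) = {prob:.2f}, mean = {mean:.1f}, std dev = {std_dev:.3f}")
```

Because the variable is continuous, P(x < 11.8) and P(x ≤ 11.8) are the same number.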
Exponential Distribution
The length of time between emergency
arrivals at a hospital, the length of time
between breakdowns of manufacturing
equipment, and the length of time between
catastrophic events (e.g., a stock market
crash) are all continuous random
phenomena that we might want to describe
probabilistically.
The length of time or the distance between
occurrences of random events like these
can often be described by the exponential
probability distribution. For this reason,
the exponential distribution is sometimes
called the waiting-time distribution.
Probability Distribution
for an Exponential Random
Variable x
Probability density function: $f(x) = \dfrac{1}{\theta}\, e^{-x/\theta}, \quad x > 0$
Mean: $\mu = \theta$
Standard Deviation: $\sigma = \theta$
Exponential Distribution
Example
Suppose the length of time (in hours) between
emergency arrivals at a certain hospital is modeled as
an exponential distribution with $\theta$ = 2. What is the
probability that more than 5 hours pass without an
emergency arrival?
Mean: $\mu = \theta = 2$
Standard Deviation: $\sigma = \theta = 2$
$f(x) = \dfrac{1}{2}\, e^{-x/2}, \quad x > 0$
Exponential Distribution
Solution
Probability is the area A to
the right of a = 5.
$A = e^{-a/\theta} = e^{-5/2} = e^{-2.5}$
From Table V:
$A = e^{-2.5} = .082085$
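The table value can be confirmed directly, since the area to the right of a under an exponential density with mean $\theta$ is $e^{-a/\theta}$. A minimal Python sketch (the function name `exponential_tail` is an illustrative choice):

```python
from math import exp

def exponential_tail(a, theta):
    """P(x > a) for an exponential random variable with mean theta."""
    if a < 0 or theta <= 0:
        raise ValueError("require a >= 0 and theta > 0")
    return exp(-a / theta)

theta = 2   # mean hours between emergency arrivals
a = 5       # hours without an arrival

area = exponential_tail(a, theta)
print(f"P(x > {a}) = {area:.6f}")
```

The complement, `1 - exponential_tail(a, theta)`, gives the probability that the next arrival occurs within a hours.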
4.10
Sampling Distributions
                       Population Parameter    Sample Statistic
Mean                   $\mu$                   $\bar{x}$
Standard Deviation     $\sigma$                s
Variance               $\sigma^2$              $s^2$
Binomial Proportion    p                       $\hat{p}$
Sampling Distribution
The sampling distribution of a sample statistic
calculated from a sample of n measurements is
the probability distribution of the statistic.
Developing
Sampling Distributions
Suppose There's a Population ...
Population size, N = 4
Random variable, x
Values of x: 1, 2, 3, 4
Uniform distribution
Population Characteristics
Summary Measure
$\mu = \dfrac{\sum_{i=1}^{N} x_i}{N} = 2.5$
Population Distribution
Figure: P(x) plotted against x = 1, 2, 3, 4 (uniform; vertical axis from .0 to .3)
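Sampling Distribution
of All Sample Means
16 Sample Means (samples of size n = 2, drawn with replacement)
1st Obs    2nd Observation
           1      2      3      4
1          1.0    1.5    2.0    2.5
2          1.5    2.0    2.5    3.0
3          2.0    2.5    3.0    3.5
4          2.5    3.0    3.5    4.0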
Sampling Distribution
of the Sample Mean
Figure: $P(\bar{x})$ plotted against $\bar{x}$ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0 (vertical axis from .0 to .3)
Summary Measure of
All Sample Means
$\mu_{\bar{x}} = \dfrac{\sum_{i=1}^{N} \bar{x}_i}{N} = \dfrac{1.0 + 1.5 + \cdots + 4.0}{16} = 2.5$
Comparison
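Population: P(x) vs. x, with $\mu = 2.5$
Sampling Distribution: $P(\bar{x})$ vs. $\bar{x}$ = 1.0, 1.5, ..., 4.0, with $\mu_{\bar{x}} = 2.5$
That is, $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.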
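The small population above makes it possible to list every sample and build the sampling distribution exhaustively. The Python sketch below assumes samples of size n = 2 drawn with replacement, which is what yields 16 equally likely samples; the variable names are illustrative.

```python
from collections import Counter
from itertools import product
from math import sqrt

population = [1, 2, 3, 4]
n = 2  # sample size; drawing with replacement gives 4**2 = 16 ordered samples

samples = list(product(population, repeat=n))
means = [sum(s) / n for s in samples]

# Sampling distribution of the sample mean: each distinct mean and its probability.
sampling_dist = {m: count / len(means) for m, count in sorted(Counter(means).items())}

# Population mean and standard deviation.
mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# Mean and standard deviation of the 16 sample means.
mu_xbar = sum(means) / len(means)
sigma_xbar = sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

print(sampling_dist)
print(f"mu = {mu}, mu_xbar = {mu_xbar}")
print(f"sigma/sqrt(n) = {sigma / sqrt(n):.4f}, sigma_xbar = {sigma_xbar:.4f}")
```

The output illustrates both relationships at once: the mean of the sample means equals the population mean, and their standard deviation equals $\sigma/\sqrt{n}$.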
Theorem 4.1
If a random sample of n observations is selected from
a population with a normal distribution, the sampling
distribution of $\bar{x}$ will be a normal distribution.
Sampling from
Normal Populations
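Central Tendency: $\mu_{\bar{x}} = \mu$
Dispersion: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ (sampling with replacement)
Population Distribution: $\mu = 50$, $\sigma = 10$
Sampling Distribution: $\mu_{\bar{x}} = 50$; n = 4 gives $\sigma_{\bar{x}} = 5$; n = 16 gives $\sigma_{\bar{x}} = 2.5$
Standardizing the sampling distribution of $\bar{x}$ gives the standardized normal distribution ($\mu = 0$, $\sigma = 1$):
$z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$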
Thinking Challenge
You're an operations analyst for
AT&T. Long-distance telephone calls
are normally distributed with $\mu$ = 8
min. and $\sigma$ = 2 min. If you select
random samples of 25 calls, what
percentage of the sample means
would be between 7.8 & 8.2
minutes?
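$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{7.8 - 8}{2 / \sqrt{25}} = -.50$
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{8.2 - 8}{2 / \sqrt{25}} = .50$
Sampling distribution: $\sigma_{\bar{x}} = .4$; standardized normal distribution: $\sigma = 1$
$P(7.8 \leq \bar{x} \leq 8.2) = P(-.50 \leq z \leq .50) = .1915 + .1915 = .3830$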
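The same answer follows from modelling the sample mean directly as a normal variable with standard deviation $\sigma/\sqrt{n}$. A brief Python sketch under the problem's assumptions ($\mu = 8$, $\sigma = 2$, n = 25), with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 2, 25

# Standard error of the mean: sigma / sqrt(n).
sigma_xbar = sigma / sqrt(n)

# Sampling distribution of the sample mean for a normal population.
xbar_dist = NormalDist(mu=mu, sigma=sigma_xbar)

z_low = (7.8 - mu) / sigma_xbar
z_high = (8.2 - mu) / sigma_xbar
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)

print(f"sigma_xbar = {sigma_xbar:.2f}, z from {z_low:.2f} to {z_high:.2f}")
print(f"P(7.8 <= xbar <= 8.2) = {prob:.4f}")
```

Increasing n in this sketch shrinks `sigma_xbar` and pushes the probability toward 1, which is the practical meaning of the dispersion formula.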
Sampling from
Non-Normal Populations
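Central Tendency: $\mu_{\bar{x}} = \mu$
Dispersion: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ (sampling with replacement)
Population Distribution: $\mu = 50$, $\sigma = 10$
Sampling Distribution: $\mu_{\bar{x}} = 50$; n = 4 gives $\sigma_{\bar{x}} = 5$; n = 30 gives $\sigma_{\bar{x}} = 1.8$
As the sample size grows (for example, n = 30), the sampling distribution becomes almost normal.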
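Example with $\mu = 12$, $\sigma = .2$, n = 50:
$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{11.95 - 12}{.2 / \sqrt{50}} = -1.77$
Sampling distribution: $\sigma_{\bar{x}} = .03$; standardized normal distribution: $\sigma = 1$
$P(\bar{x} < 11.95) = P(z < -1.77) = .5000 - .4616 = .0384$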
Key Ideas
Properties of Probability Distributions
Discrete Distributions
1. $p(x) \geq 0$
2. $\sum_{\text{all } x} p(x) = 1$
Continuous Distributions
1. P(x = a) = 0
2. P(a < x < b) = area under curve between a and b
Key Ideas
Normal Approximation to Binomial
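x is binomial (n, p)
$P(x \leq a) \approx P\left(z \leq \dfrac{(a + .5) - \mu}{\sigma}\right)$, where $\mu = np$ and $\sigma = \sqrt{np(1 - p)}$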
Key Ideas
Methods for Assessing Normality
1. Histogram
2. Stem-and-leaf display
3. $\text{IQR}/s \approx 1.3$
4. Normal probability plot
Key Ideas
Generating the Sampling Distribution of $\bar{x}$