Expected Value and Variance


Random Variables

It frequently occurs that in performing an experiment we are mainly interested in some
function of the outcome as opposed to the outcome itself. For instance, in tossing dice
we are often interested in the sum of the two dice and are not really concerned about
the actual outcome. That is, we may be interested in knowing that the sum is seven and
not be concerned over whether the actual outcome was (1, 6) or (2, 5) or (3, 4) or
(4, 3) or (5, 2) or (6, 1). These quantities of interest, or more formally, these
real-valued functions defined on the sample space, are known as random variables.

Since the value of a random variable is determined by the outcome of the experiment,
we may assign probabilities to the possible values of the random variable.

Example Letting X denote the random variable that is defined as the sum of two fair
dice, then

\begin{align*}
P\{X = 2\}  &= P\{(1, 1)\} = \tfrac{1}{36}, \\
P\{X = 3\}  &= P\{(1, 2), (2, 1)\} = \tfrac{2}{36}, \\
P\{X = 4\}  &= P\{(1, 3), (2, 2), (3, 1)\} = \tfrac{3}{36}, \\
P\{X = 5\}  &= P\{(1, 4), (2, 3), (3, 2), (4, 1)\} = \tfrac{4}{36}, \\
P\{X = 6\}  &= P\{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)\} = \tfrac{5}{36}, \\
P\{X = 7\}  &= P\{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)\} = \tfrac{6}{36}, \\
P\{X = 8\}  &= P\{(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)\} = \tfrac{5}{36}, \\
P\{X = 9\}  &= P\{(3, 6), (4, 5), (5, 4), (6, 3)\} = \tfrac{4}{36}, \\
P\{X = 10\} &= P\{(4, 6), (5, 5), (6, 4)\} = \tfrac{3}{36}, \\
P\{X = 11\} &= P\{(5, 6), (6, 5)\} = \tfrac{2}{36}, \\
P\{X = 12\} &= P\{(6, 6)\} = \tfrac{1}{36}
\end{align*}
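The probabilities in this example can be checked by brute-force enumeration. The following is a minimal Python sketch, assuming the 36 ordered outcomes of two fair dice are equally likely; the names `counts` and `pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely ordered outcomes (d1, d2) of two fair dice
# and count how many outcomes produce each sum.
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))

# pmf[s] = P{X = s}. Fraction reduces automatically (6/36 is stored as 1/6).
pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s in pmf:
    print(f"P{{X = {s}}} = {counts[s]}/36")
```

Using exact fractions rather than floats keeps the check free of rounding error, so the probabilities sum to exactly 1.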

The Cumulative Distribution Function


For some fixed value x, we often wish to compute the probability that the observed value
of X will be at most x. For example, let X be the number of beds occupied in
a hospital's emergency room at a certain time of day; suppose the pmf of X is given by

x       0     1     2     3     4
p(x)   .20   .25   .30   .15   .10

Then the probability that at most two beds are occupied is

\[ P(X \le 2) = p(0) + p(1) + p(2) = .75 \]

Furthermore, since X ≤ 2.7 if and only if X ≤ 2, we also have P(X ≤ 2.7) = .75, and
similarly P(X ≤ 2.999) = .75. Since 0 is the smallest possible X value, P(X ≤ −1.5) = 0,
P(X ≤ −10) = 0, and in fact for any negative number x, P(X ≤ x) = 0. And because 4
is the largest possible value of X, P(X ≤ 4) = 1, P(X ≤ 9.8) = 1, and so on.

Very importantly,

\[ P(X < 2) = p(0) + p(1) = .45 < .75 = P(X \le 2) \]

because the latter probability includes the probability mass at the x value 2 whereas
the former probability does not. More generally, P(X < x) < P(X ≤ x) whenever x
is a possible value of X. Furthermore, P(X ≤ x) is a well-defined and computable
probability for any number x.
DEFINITION The cumulative distribution function (cdf) F(x) of a discrete random variable X
with pmf p(x) is defined for every number x by

\[ F(x) = P(X \le x) = \sum_{y:\, y \le x} p(y) \]

For any number x, F(x) is the probability that the observed value of X will be
at most x.

EXAMPLE A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of
memory. The accompanying table gives the distribution of Y = the amount of memory in
a purchased drive:

y       1     2     4     8     16
p(y)   .05   .10   .35   .40   .10
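The cdf of Y follows from this table by accumulating the probabilities. The sketch below is one possible way to evaluate F(y) at any real number y; the helper name `cdf` and the use of `bisect_right` are illustrative choices, not part of the example itself.

```python
from bisect import bisect_right
from itertools import accumulate

values = [1, 2, 4, 8, 16]                  # possible values of Y (GB)
probs = [0.05, 0.10, 0.35, 0.40, 0.10]     # p(y) from the table
cumulative = list(accumulate(probs))       # running sums of p(y)

def cdf(y):
    """F(y) = P(Y <= y): total probability of the possible values not exceeding y."""
    k = bisect_right(values, y)            # how many possible values are <= y
    return cumulative[k - 1] if k > 0 else 0.0

for y in (0.5, 1, 2.7, 4, 8, 25):
    print(y, round(cdf(y), 2))
```

Because Y is discrete, F is a step function: it is flat between possible values (so F(2.7) equals F(2)) and jumps by p(y) at each possible value y.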

For any two numbers a and b with a ≤ b,

\[ P(a \le X \le b) = F(b) - F(a-) \]

where "a−" represents the largest possible X value that is strictly less than a. In
particular, if the only possible values are integers and if a and b are integers, then

\begin{align*}
P(a \le X \le b) &= P(X = a \text{ or } a + 1 \text{ or } \dots \text{ or } b) \\
                 &= F(b) - F(a - 1)
\end{align*}

Taking a = b yields P(X = a) = F(a) − F(a − 1) in this case.
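As a quick check of the integer-valued case, the sketch below applies F(b) − F(a − 1) to the emergency-room pmf from earlier in this section, assuming the possible values are x = 0, 1, 2, 3, 4. The names `beds_pmf`, `F`, and `prob_between` are hypothetical.

```python
# pmf of X = number of occupied beds (possible values assumed to be 0..4)
beds_pmf = {0: 0.20, 1: 0.25, 2: 0.30, 3: 0.15, 4: 0.10}

def F(x):
    """cdf: F(x) = P(X <= x)."""
    return sum(p for v, p in beds_pmf.items() if v <= x)

def prob_between(a, b):
    """P(a <= X <= b) for integers a <= b, computed as F(b) - F(a - 1)."""
    return F(b) - F(a - 1)

# Compare with the direct sum p(1) + p(2) + p(3).
direct = sum(p for v, p in beds_pmf.items() if 1 <= v <= 3)
print(round(prob_between(1, 3), 2), round(direct, 2))

# Taking a = b recovers a single pmf value: P(X = 2) = F(2) - F(1).
print(round(prob_between(2, 2), 2))
```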
Expectation of a Random Variable

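If X is a discrete random variable having a probability mass function p(x), or a
continuous random variable having a probability density function f(x), then the
expected value of X is defined by

\[ E[X] = \sum_{x:\, p(x) > 0} x\, p(x) \qquad \text{if } X \text{ is discrete} \]

\[ E[X] = \int_{-\infty}^{\infty} x f(x)\, dx \qquad \text{if } X \text{ is continuous} \]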

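The expected value of a random variable X, E[X], is also referred to as the mean
or the first moment of X. The quantity E[X^n], n ≥ 1, is called the nth moment of X.

\[
E[X^n] =
\begin{cases}
\displaystyle \sum_{x:\, p(x) > 0} x^n p(x), & \text{if } X \text{ is discrete} \\[2ex]
\displaystyle \int_{-\infty}^{\infty} x^n f(x)\, dx, & \text{if } X \text{ is continuous}
\end{cases}
\]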

For any constants a and b,

\[ E(aX + b) = a \cdot E(X) + b \]

(Or, using alternative notation, $\mu_{aX+b} = a \cdot \mu_X + b$.)

Another quantity of interest is the variance of a random variable X, denoted by
Var(X), which is defined by

\[ \operatorname{Var}(X) = E\left[(X - E[X])^2\right] \]

Thus, the variance of X measures the expected square of the deviation of X from its
expected value.

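Let X have pmf p(x) and expected value μ. Then the variance of X, denoted
by V(X) or $\sigma_X^2$, or just $\sigma^2$, is

\[ V(X) = \sum_{D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2] \]

where D is the set of possible values of X. The standard deviation (SD) of X is

\[ \sigma_X = \sqrt{\sigma_X^2} \]

The variance can also be computed from the second moment:

\[ V(X) = \sigma^2 = \left[ \sum_{D} x^2 \cdot p(x) \right] - \mu^2 = E(X^2) - [E(X)]^2 \]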
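To see the definitions at work, the sketch below computes the mean and variance of the sum of two fair dice both from the definition and from E(X²) − [E(X)]², and also checks E(aX + b) = a·E(X) + b for one arbitrary choice of a and b. It uses exact fractions; all variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# pmf of X = sum of two fair dice, built by enumerating the 36 outcomes.
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    pmf[d1 + d2] = pmf.get(d1 + d2, 0) + Fraction(1, 36)

mu = sum(x * p for x, p in pmf.items())                     # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E[(X - mu)^2]
second_moment = sum(x ** 2 * p for x, p in pmf.items())     # E(X^2)
var_shortcut = second_moment - mu ** 2                      # E(X^2) - [E(X)]^2
sd = float(var_def) ** 0.5                                  # standard deviation

# Linearity of expectation with a = 2, b = 3 (arbitrary illustrative constants).
a, b = 2, 3
lhs = sum((a * x + b) * p for x, p in pmf.items())          # E(aX + b)

print(mu, var_def, var_shortcut, round(sd, 3), lhs)
```

Both variance computations agree exactly, which is the point of the second formula: it avoids subtracting μ inside the sum.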
NOTATION For X ~ Bin(n, p), the cdf will be denoted by

\[ B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p), \qquad x = 0, 1, \dots, n \]

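In a binomial experiment, the probability of exactly x successes in n trials is

\[ P(x) = {}_nC_x\, p^x q^{n-x} = \frac{n!}{(n - x)!\, x!}\, p^x q^{n-x}, \]

where p is the probability of success on a single trial and q = 1 − p.

Note that the number of failures is n − x.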

EXAMPLE Finding Binomial Probabilities Using Formulas


A survey of U.S. adults found that 62% of women believe that there is a link
between playing violent video games and teens exhibiting violent behavior. You
randomly select four U.S. women and ask them whether they believe that there is
a link between playing violent video games and teens exhibiting violent behavior.
Find the probability that (1) exactly two of them respond yes, (2) at least two of
them respond yes, and (3) fewer than two of them respond yes. (Source: Harris
Interactive)
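One way to work this example is to apply the binomial formula directly with n = 4 and p = 0.62. The sketch below assumes the four responses are independent; `binom_pmf` is a hypothetical helper name.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.62

exactly_two = binom_pmf(2, n, p)                                  # part (1)
at_least_two = sum(binom_pmf(x, n, p) for x in range(2, n + 1))   # part (2)
fewer_than_two = sum(binom_pmf(x, n, p) for x in range(0, 2))     # part (3)

print(round(exactly_two, 3), round(at_least_two, 3), round(fewer_than_two, 3))
```

Parts (2) and (3) are complementary events, so their probabilities add to 1; computing both is a useful consistency check.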

PROPOSITION If X ~ Bin(n, p), then E(X) = np, V(X) = np(1 − p) = npq, and
$\sigma_X = \sqrt{npq}$ (where q = 1 − p).

EXAMPLE Finding and Interpreting Mean, Variance, and Standard Deviation


In Pittsburgh, Pennsylvania, about 56% of the days in a year are cloudy. Find
the mean, variance, and standard deviation for the number of cloudy days
during the month of June. Interpret the results and determine any unusual
values. (Source: National Climatic Data Center)
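A sketch of the computation, treating each of the 30 days of June as an independent trial with p = 0.56. The "unusual value" cutoff of two standard deviations from the mean is a common rule of thumb and is an assumption here; the text above does not state which rule to use.

```python
from math import sqrt

n, p = 30, 0.56      # June has 30 days; P(a given day is cloudy) = 0.56
q = 1 - p

mean = n * p                 # E(X) = np
variance = n * p * q         # V(X) = npq
sd = sqrt(variance)          # sigma = sqrt(npq)

# Assumed rule of thumb: values more than 2 standard deviations
# from the mean are flagged as unusual.
low, high = mean - 2 * sd, mean + 2 * sd

print(round(mean, 1), round(variance, 1), round(sd, 1))
print(round(low, 1), round(high, 1))
```

Under these assumptions, June averages about 16.8 cloudy days with a standard deviation of about 2.7 days, so a June with fewer than about 11 or more than about 22 cloudy days would be flagged as unusual.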
DEFINITION
A geometric distribution is a discrete probability distribution of a random
variable x that satisfies these conditions.
1. A trial is repeated until a success occurs.
2. The repeated trials are independent of each other.
3. The probability of success p is the same for each trial.
4. The random variable x represents the number of the trial in which the first
success occurs.
The probability that the first success will occur on trial number x is

\[ P(x) = p\, q^{x-1}, \qquad \text{where } q = 1 - p. \]

Using the Geometric Distribution


Basketball player LeBron James makes a free throw shot about 75% of the
time. Find the probability that the first free throw shot he makes occurs on the
third or fourth attempt. (Source: National Basketball Association)
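A minimal sketch of this calculation, assuming each attempt is independent with success probability p = 0.75. The event "first made shot on the third or fourth attempt" is the union of two mutually exclusive events, so the two geometric probabilities are added. `geom_pmf` is a hypothetical helper name.

```python
def geom_pmf(x, p):
    """P(first success occurs on trial x) = p * q**(x - 1), with q = 1 - p."""
    return p * (1 - p) ** (x - 1)

p = 0.75
prob = geom_pmf(3, p) + geom_pmf(4, p)   # P(3) + P(4)
print(round(prob, 3))
```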

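Expectation of a Geometric Random Variable: Calculate the expectation of a
geometric random variable having parameter p.

\begin{align*}
E[X] &= \sum_{n=1}^{\infty} n p (1 - p)^{n-1} \\
     &= p \sum_{n=1}^{\infty} n q^{n-1}, \qquad \text{where } q = 1 - p, \\
     &= p \sum_{n=1}^{\infty} \frac{d}{dq}\left(q^n\right) \\
     &= p\, \frac{d}{dq} \left( \sum_{n=1}^{\infty} q^n \right) \\
     &= p\, \frac{d}{dq} \left( \frac{q}{1 - q} \right) \\
     &= \frac{p}{(1 - q)^2} = \frac{1}{p}
\end{align*}

To obtain the variance, note that P(X ≥ i) = (1 − p)^(i−1), so

\begin{align*}
\sum_{i=1}^{\infty} i\, P(X \ge i)
  &= \sum_{i=1}^{\infty} i (1 - p)^{i-1}
   = \frac{1}{p} \sum_{i=1}^{\infty} i\, p (1 - p)^{i-1}
   = \frac{1}{p} \sum_{i=1}^{\infty} i\, P(X = i) \\
  &= \frac{1}{p} E[X] = \frac{1}{p^2}
\end{align*}

For any random variable taking positive integer values, the left-hand side equals
(E[X^2] + E[X]) / 2. Setting this equal to 1/p^2 and using E[X] = 1/p gives

\[ E[X^2] = \frac{2}{p^2} - \frac{1}{p}, \qquad
   \operatorname{Var}(X) = \frac{2}{p^2} - \frac{1}{p} - \frac{1}{p^2} = \frac{1 - p}{p^2} \]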
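The closed forms E[X] = 1/p and Var(X) = (1 − p)/p² can be checked numerically by summing the series directly. The sketch below truncates the infinite sums at N = 200 terms, which is an assumption that works for p = 0.75 because the tail is negligible; a smaller p would need a larger N.

```python
p = 0.75
q = 1 - p
N = 200   # truncation point (assumed large enough that q**N is negligible)

# pmf[n - 1] = P(X = n) = p * q**(n - 1) for n = 1, ..., N
pmf = [p * q ** (n - 1) for n in range(1, N + 1)]

mean = sum(n * pn for n, pn in enumerate(pmf, start=1))             # E[X]
second_moment = sum(n * n * pn for n, pn in enumerate(pmf, start=1))  # E[X^2]
variance = second_moment - mean ** 2

print(mean, 1 / p)
print(variance, (1 - p) / p ** 2)
```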
DEFINITION
The Poisson distribution is a discrete probability distribution of a random
variable x that satisfies these conditions.
1. The experiment consists of counting the number of times x an event occurs
in a given interval. The interval can be an interval of time, area, or volume.
2. The probability of the event occurring is the same for each interval.
3. The number of occurrences in one interval is independent of the number of
occurrences in other intervals.
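The probability of exactly x occurrences in an interval is

\[ P(x) = \frac{\mu^x e^{-\mu}}{x!} \]

where e is an irrational number approximately equal to 2.71828 and μ is the
mean number of occurrences per interval unit.

These probabilities sum to 1, as the series expansion of $e^{\mu}$ shows:

\[ e^{\mu} = 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \dots
           = \sum_{x=0}^{\infty} \frac{\mu^x}{x!}
   \qquad \text{so that} \qquad
   1 = \sum_{x=0}^{\infty} \frac{e^{-\mu} \cdot \mu^x}{x!} \]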

Using the Poisson Distribution


The mean number of accidents per month at a certain intersection is three.
What is the probability that in any given month four accidents will occur at
this intersection?
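This example is a direct application of the Poisson formula with mean 3 and x = 4. A minimal sketch, with `poisson_pmf` as a hypothetical helper name:

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(x) = mu^x * e^(-mu) / x! for a Poisson variable with mean mu."""
    return mu ** x * exp(-mu) / factorial(x)

prob_four = poisson_pmf(4, 3)   # mean of 3 accidents per month, x = 4
print(round(prob_four, 3))
```

The result is roughly 0.168, so about one month in six would be expected to have exactly four accidents under this model.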
The Poisson random variable has a wide range of applications in diverse areas.
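An important property of the Poisson random variable is that it may be used to
approximate a binomial random variable when the binomial parameter n is large and
p is small. To see this, suppose that X is a binomial random variable with parameters
(n, p), and let λ = np be its mean. Then

\begin{align*}
P\{X = i\} &= \frac{n!}{(n - i)!\, i!}\, p^i (1 - p)^{n-i} \\
           &= \frac{n!}{(n - i)!\, i!} \left( \frac{\lambda}{n} \right)^i
              \left( 1 - \frac{\lambda}{n} \right)^{n-i} \\
           &= \frac{n(n - 1) \cdots (n - i + 1)}{n^i}\, \frac{\lambda^i}{i!}\,
              \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^i}
\end{align*}

Now, for n large and p small,

\[ \left( 1 - \frac{\lambda}{n} \right)^n \approx e^{-\lambda}, \qquad
   \frac{n(n - 1) \cdots (n - i + 1)}{n^i} \approx 1, \qquad
   \left( 1 - \frac{\lambda}{n} \right)^i \approx 1 \]

Hence, for n large and p small,

\[ P\{X = i\} \approx e^{-\lambda} \frac{\lambda^i}{i!} \]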
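The quality of this approximation can be seen numerically by holding λ = np fixed and letting n grow. The sketch below uses λ = 3 and n = 10, 100, 1000 (arbitrary illustrative values) and reports the largest absolute difference between the binomial and Poisson probabilities over i = 0, ..., 10.

```python
from math import comb, exp, factorial

def binom_pmf(i, n, p):
    """Binomial probability P{X = i} with parameters (n, p)."""
    return comb(n, i) * p ** i * (1 - p) ** (n - i)

def poisson_pmf(i, lam):
    """Poisson probability e^(-lam) * lam^i / i!."""
    return exp(-lam) * lam ** i / factorial(i)

lam = 3.0
max_diffs = []
for n in (10, 100, 1000):
    p = lam / n   # keep np = lam fixed while n grows
    diff = max(abs(binom_pmf(i, n, p) - poisson_pmf(i, lam)) for i in range(11))
    max_diffs.append(diff)
    print(n, p, diff)
```

The largest discrepancy shrinks as n increases with np held fixed, which is the behaviour the derivation predicts.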

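Expectation of a Poisson Random Variable: Calculate E[X] if X is a Poisson random
variable with parameter λ.

\begin{align*}
E[X] &= \sum_{i=0}^{\infty} \frac{i\, e^{-\lambda} \lambda^i}{i!} \\
     &= \sum_{i=1}^{\infty} \frac{e^{-\lambda} \lambda^i}{(i - 1)!} \\
     &= \lambda e^{-\lambda} \sum_{i=1}^{\infty} \frac{\lambda^{i-1}}{(i - 1)!} \\
     &= \lambda e^{-\lambda} \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} \\
     &= \lambda e^{-\lambda} e^{\lambda} \\
     &= \lambda
\end{align*}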
Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in such
a way that np approaches a value μ > 0. Then b(x; n, p) → p(x; μ), where p(x; μ)
denotes the Poisson pmf with mean μ.

According to this result, in any binomial experiment in which n is large and
p is small, b(x; n, p) ≈ p(x; μ), where μ = np. As a rule of thumb, this
approximation can safely be applied if n > 50 and np < 5.

EXAMPLE If a publisher of nontechnical books takes great pains to ensure that its books are free
of typographical errors, so that the probability of any given page containing at least one
such error is .005 and errors are independent from page to page, what is the probability
that one of its 600-page novels will contain exactly one page with errors? At most three
pages with errors?
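Here n = 600 and p = .005, so np = 3 and the rule of thumb above (n > 50, np < 5) is satisfied. The sketch below computes both requested probabilities with the Poisson approximation and, for comparison, with the exact binomial pmf; the helper names are illustrative.

```python
from math import comb, exp, factorial

n, p = 600, 0.005
mu = n * p   # mean number of pages with errors (approximately 3)

def binom_pmf(x):
    """Exact binomial probability b(x; n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x):
    """Poisson approximation p(x; mu) with mu = np."""
    return exp(-mu) * mu ** x / factorial(x)

# Exactly one page with errors
pois_one, binom_one = poisson_pmf(1), binom_pmf(1)

# At most three pages with errors
pois_le3 = sum(poisson_pmf(x) for x in range(4))
binom_le3 = sum(binom_pmf(x) for x in range(4))

print(round(pois_one, 4), round(binom_one, 4))
print(round(pois_le3, 4), round(binom_le3, 4))
```

The approximate and exact answers agree to about three decimal places, consistent with the rule of thumb.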

Table

Discrete probability        Probability mass                Moment generating            Mean    Variance
distribution                function, p(x)                  function, φ(t)

Binomial with               nCx p^x (1 − p)^(n−x),          (pe^t + (1 − p))^n           np      np(1 − p)
parameters n, p,            x = 0, 1, . . . , n
0 ≤ p ≤ 1

Poisson with                e^(−λ) λ^x / x!,                exp{λ(e^t − 1)}              λ       λ
parameter λ                 x = 0, 1, 2, . . .

Geometric with              p(1 − p)^(x−1),                 pe^t / (1 − (1 − p)e^t)      1/p     (1 − p)/p^2
parameter                   x = 1, 2, . . .
0 ≤ p ≤ 1
