PR Student Manual
\[
\begin{cases}
0 & x < 0 \\
0.25 & 0 \le x < 1 \\
0.5 & 1 \le x < 2 \\
0.75 & 2 \le x < 3 \\
1 & x \ge 3
\end{cases}
\]
See the graph below.
[Figure: graph of the step function F(x) against x, with jumps at x = 0, 1, 2, 3 to the values 0.25, 0.5, 0.75, 1.]
29. The values of $F(x)$ are zero until $x$ (moving from $-\infty$ to $\infty$) reaches the smallest value in the
range of $X$, which is $x = 0$. There, $F(0) = P(X \le 0) = P(X = 0) = 0.8$. Then $F(x)$ remains constant
until $x$ reaches the next value in the range of $X$, which is $x = 1$. The value of $F$ there is
\[ F(1) = P(X \le 1) = P(X = 0) + P(X = 1) = 0.8 + 0.05 = 0.85 \]
Continuing in the same way, we obtain the following:
\[
F(x) =
\begin{cases}
0 & x < 0 \\
0.8 & 0 \le x < 1 \\
0.85 & 1 \le x < 2 \\
0.9 & 2 \le x < 3 \\
0.95 & 3 \le x < 4 \\
1 & x \ge 4
\end{cases}
\]
See the graph below.
[Figure: graph of the step function F(x) against x, with jumps at x = 0, 1, 2, 3, 4 to the values 0.8, 0.85, 0.9, 0.95, 1.]
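The construction of F in Exercise 29 can be checked numerically. What follows is a minimal Python sketch, assuming the probability mass function P(X = 0) = 0.8 and P(X = k) = 0.05 for k = 1, 2, 3, 4 used in the solution; the helper name `cdf` is hypothetical and not part of the text.

```python
from fractions import Fraction

# Probability mass function of X in Exercise 29 (values assumed from the solution).
pmf = {0: Fraction(8, 10), 1: Fraction(1, 20), 2: Fraction(1, 20),
       3: Fraction(1, 20), 4: Fraction(1, 20)}

def cdf(x):
    """Return F(x) = P(X <= x) by adding the pmf over all values not exceeding x."""
    return sum(p for value, p in pmf.items() if value <= x)

# F is constant between consecutive values of X and jumps at each of them.
for x in [-1, 0, 0.5, 1, 2, 3, 4, 10]:
    print(x, float(cdf(x)))
```

Exact fractions are used so that the final value F(4) is exactly 1, with no rounding error.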
31. (a) Initial location: 0; locations after 1 step: $-1$ and 1; locations after 2 steps: $-2$, 0, and 2;
locations after 3 steps: $-3$, $-1$, 1, and 3; locations after 4 steps: $-4$, $-2$, 0, 2, and 4; locations
P1-20 Probability and Statistics [Solutions]
after 5 steps: $-5$, $-3$, $-1$, 1, 3, and 5. To move one step ahead, we add 1 to the locations in the
previous step or subtract 1 from the locations in the previous step. Adding 1 to an even number
(or subtracting 1 from an even number) makes it odd, and vice versa. A particle starts at an even-
numbered location ($x = 0$). Thus, after an even (odd) number of steps, the particle arrives at an
even-numbered (odd-numbered) location.
(b) To be absorbed, the particle needs to reach 3 or $-3$, which are odd numbers. The particle can
reach 3 or $-3$ in 3 steps (in which case $X = 3$). If it does not, it means that it ended at 1 or $-1$ after 3
steps (since after an odd number of steps a particle can only be in an odd-numbered location). Thus,
the particle needs 2 more steps to reach 3 or $-3$ (in which case $X = 5$); if it does not, it means that
it ended at 1 or $-1$; repeating this routine, we see that $X$ can assume only odd-numbered values.
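The parity argument in part (a) can be illustrated by listing the reachable locations step by step. This is a small Python sketch under the assumption of an unrestricted walk (no absorption, as in part (a)); the function name `reachable` is hypothetical.

```python
def reachable(n_steps):
    """Locations a walker starting at 0 can occupy after n_steps moves of +1 or -1."""
    locations = {0}
    for _ in range(n_steps):
        # Every current location spawns its two neighbours.
        locations = {x + d for x in locations for d in (-1, 1)}
    return sorted(locations)

for n in range(6):
    print(n, reachable(n))
```

In every printed row, the locations have the same parity as the number of steps, which is the observation used in part (b).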
33. We read the values from the table. The probability mass function of $X$ is given by $P(X = 1) = 0.3$,
$P(X = 2) = 0.1$, $P(X = 3) = 0.2$, $P(X = 4) = 0.1$, and $P(X = 5) = 0.3$. The discontinuities of the
cumulative distribution function $F(x)$ of $X$ occur at $x = 1, 2, 3, 4$, and 5. We find that
\[
F(x) =
\begin{cases}
0 & x < 1 \\
0.3 & 1 \le x < 2 \\
0.4 & 2 \le x < 3 \\
0.6 & 3 \le x < 4 \\
0.7 & 4 \le x < 5 \\
1 & x \ge 5
\end{cases}
\]
Section 7 [Solutions] P1-21
Section 7 The Mean, the Median, and the Mode
1. Ordering $S_1$, we get $S_1 = \{2, 3, 4, 5, 6, 7, 10\}$; the median is 5. Ordering $S_2$, we get
$S_2 = \{2, 3, 4, 5, 6, 700{,}000, 1{,}000{,}000\}$; the median is 5 as well. The median fails to capture the large
difference in the values at the right ends of the two distributions.
3. Adding up the values of all elements in $S_1$ and dividing by the number of elements in $S_1$, we get the
mean of $S_1$. To calculate the mean of $S_2$, the numerator doubles whereas the denominator remains
the same. Thus, the mean of $S_2$ is double the mean of $S_1$.
The location of the midpoint of the two distributions does not change, since multiplication by 2
does not change the order (assume that $S_1$ and $S_2$ are ordered; if $a$ is before $b$ in the list for $S_1$, then
$2a$ is before $2b$ in the list for $S_2$). Thus, the median of $S_2$ is double the median of $S_1$.
If $a$ is the value that appears most often in $S_1$, then the value $2a$ appears most often in $S_2$. So,
the mode of $S_2$ is double the mode of $S_1$.
5. Intuitively: since all outcomes are equally likely, the mean is $(1 + 2 + 3 + \cdots + 10)/10 = 55/10 = 5.5$.
Formally,
\[
E(X) = \sum_{k=1}^{10} k \, P(X = k) = \sum_{k=1}^{10} k \cdot \frac{1}{10}
= \frac{1}{10}(1 + 2 + 3 + \cdots + 10) = \frac{1}{10} \cdot \frac{10 \cdot 11}{2} = 5.5
\]
(Recall the formula: $\sum_{k=1}^{n} k = 1 + 2 + 3 + \cdots + n = n(n + 1)/2$.)
7. No. Consider the random variable $X$ given by $P(X = 0) = 0.5$ and $P(X = 6) = 0.5$. Then
$E(X) = 0 \cdot 0.5 + 6 \cdot 0.5 = 3$. The distribution of $X^2$ is $P(X^2 = 0) = 0.5$ and $P(X^2 = 36) = 0.5$, and so
$E(X^2) = 0 \cdot 0.5 + 36 \cdot 0.5 = 18$. (This is just one of many counterexamples.)
9. Using properties of the expected value,
\[ E(2X^2 - 4X + 1) = E(2X^2) - E(4X) + E(1) = 2E(X^2) - 4E(X) + E(1) \]
Since
\[ E(1) = \sum_x 1 \cdot P(X = x) = \sum_x P(X = x) = 1 \]
we get $E(2X^2 - 4X + 1) = 2(3) - 4(2) + 1 = -1$.
11. Using the properties of the expected value,
\[
E(Y) = E\left[ \frac{1}{\sigma}(X - \mu) \right] = \frac{1}{\sigma} E(X - \mu) = \frac{1}{\sigma}\bigl(E(X) - \mu\bigr) = 0,
\]
since, by assumption, $E(X) = \mu$. (Recall that $E(X + b) = E(X) + b$ for a real number $b$; replacing $b$
by $-\mu$, we get $E(X - \mu) = E(X) - \mu$, which is how the second-to-last equality was obtained.)
13. We compute
\[ E(X) = \sum_{x=0}^{3} x \, P(X = x) = 0 \cdot 0.25 + 1 \cdot 0.25 + 2 \cdot 0.25 + 3 \cdot 0.25 = 6(0.25) = 1.5 \]
\[ E(X^2) = \sum_{x=0}^{3} x^2 P(X = x) = 0 \cdot 0.25 + 1 \cdot 0.25 + 4 \cdot 0.25 + 9 \cdot 0.25 = 14(0.25) = 3.5 \]
\[ E(X(X - 1)) = \sum_{x=0}^{3} x(x - 1) P(X = x) = 0 \cdot 0.25 + 0 \cdot 0.25 + 2 \cdot 0.25 + 6 \cdot 0.25 = 8(0.25) = 2 \]
To check:
\[ E(X(X - 1)) = E(X^2 - X) = E(X^2) - E(X) = 3.5 - 1.5 = 2 \]
15. We compute
\[ E(X) = \sum_{x=0}^{4} x \, P(X = x) = 0 \cdot 0.8 + 1 \cdot 0.05 + 2 \cdot 0.05 + 3 \cdot 0.05 + 4 \cdot 0.05 = 10(0.05) = 0.5 \]
\[ E(X^2) = \sum_{x=0}^{4} x^2 P(X = x) = 0 \cdot 0.8 + 1 \cdot 0.05 + 4 \cdot 0.05 + 9 \cdot 0.05 + 16 \cdot 0.05 = 30(0.05) = 1.5 \]
\[ E(X(X - 1)) = \sum_{x=0}^{4} x(x - 1) P(X = x) = 0 \cdot 0.8 + 0 \cdot 0.05 + 2 \cdot 0.05 + 6 \cdot 0.05 + 12 \cdot 0.05 = 20(0.05) = 1 \]
To check:
\[ E(X(X - 1)) = E(X^2 - X) = E(X^2) - E(X) = 1.5 - 0.5 = 1 \]
17. Instead of ordering the list (call it $S$), we record the frequencies:
Value       14   18   19   20   22   25   27   29   30
Frequency    1   15    3    5    1    1    3    2    5
Clearly, the mode is 18.
The data set $S$ contains 36 elements. To identify the median, we calculate the mean of the
18th and the 19th entries. Since both are equal to 19, the median of $S$ is 19. The mean is
\[
\bar{S} = \frac{1}{36}(1 \cdot 14 + 15 \cdot 18 + 3 \cdot 19 + 5 \cdot 20 + 1 \cdot 22 + 1 \cdot 25 + 3 \cdot 27 + 2 \cdot 29 + 5 \cdot 30)
= \frac{777}{36} \approx 21.58
\]
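The three summary values in Exercise 17 can be recomputed directly from the frequency table. This is a minimal Python sketch, assuming the value/frequency pairs listed in the solution; all variable names are illustrative.

```python
from fractions import Fraction

# Frequency table from Exercise 17: value -> number of occurrences.
freq = {14: 1, 18: 15, 19: 3, 20: 5, 22: 1, 25: 1, 27: 3, 29: 2, 30: 5}

n = sum(freq.values())                      # size of the data set
mode = max(freq, key=freq.get)              # value with the largest frequency
mean = Fraction(sum(v * c for v, c in freq.items()), n)

# Expand the table into the ordered list to locate the two middle entries.
data = sorted(v for v, c in freq.items() for _ in range(c))
median = Fraction(data[n // 2 - 1] + data[n // 2], 2)   # n is even here

print(n, mode, median, float(mean))
```

The median line averages the 18th and 19th entries (indices 17 and 18), which is valid only because the number of elements is even.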
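19. From
\[ E(X) = \sum_{x=1}^{4} x \, P(X = x) = 1 \cdot 0.2 + 2 \cdot 0.4 + 3 \cdot 0.3 + 4 \cdot 0.1 = 2.3 \]
\[ E(\sin X) = \sum_{x=1}^{4} \sin x \, P(X = x) = (\sin 1)(0.2) + (\sin 2)(0.4) + (\sin 3)(0.3) + (\sin 4)(0.1) \approx 0.49867 \]
we compute $E(\sin X) - \sin(E(X)) \approx 0.49867 - \sin 2.3 \approx -0.24704$.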
21. From
\[ E(\ln X) = \sum_{x=1}^{4} \ln x \, P(X = x) = (\ln 1)(0.2) + (\ln 2)(0.4) + (\ln 3)(0.3) + (\ln 4)(0.1) \approx 0.74547 \]
we compute $e^{E(\ln X)} = e^{0.74547} \approx 2.10743$.
23. We compute
\[
E(1/X) = \sum_{x=1}^{4} \frac{1}{x} P(X = x) = \frac{1}{1}(0.2) + \frac{1}{2}(0.4) + \frac{1}{3}(0.3) + \frac{1}{4}(0.1) = 0.525
\]
25. Let $R$ represent the per capita production rate of the fish population. Its probability mass
function is $P(R = 1.25) = 0.7$ and $P(R = 0.1) = 0.3$. From
\[
E(\ln R) = \ln 1.25 \cdot P(R = 1.25) + \ln 0.1 \cdot P(R = 0.1) = (\ln 1.25)(0.7) + (\ln 0.1)(0.3) \approx -0.53458
\]
we get the geometric mean $e^{E(\ln R)} = e^{-0.53458} \approx 0.58591$. The geometric mean predicts a decline in
the population at the rate of $1 - 0.58591 = 0.41409$ per year.
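The geometric mean in Exercise 25 is the exponential of the expected logarithm, which a few lines of Python can reproduce. This is a sketch assuming the two-point distribution of R given in the solution; the arithmetic mean is computed only for comparison and is not part of the original answer.

```python
import math

# Distribution of the per capita production rate R: value -> probability.
rates = {1.25: 0.7, 0.1: 0.3}

mean_log = sum(p * math.log(r) for r, p in rates.items())   # E(ln R)
geometric_mean = math.exp(mean_log)                         # e^{E(ln R)}
arithmetic_mean = sum(p * r for r, p in rates.items())      # E(R), for comparison

print(round(mean_log, 5), round(geometric_mean, 5), round(arithmetic_mean, 5))
```

Both means are below 1 here, but the geometric mean is considerably smaller because it weighs the occasional very bad year (R = 0.1) multiplicatively.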
27. The mode consists of three values: 2, 6, and 7. The probability mass function is given below.
x    P(X = x)
1    0.15
2    0.2
5    0.1
6    0.2
7    0.2
8    0.15
The mean is
\[ E(X) = (1)(0.15) + 2(0.2) + 5(0.1) + 6(0.2) + 7(0.2) + 8(0.15) = 4.85 \]
To find the median, we keep calculating the values of the cumulative distribution function until we
reach 0.5: $F(1) = 0.15$, $F(2) = 0.35$, $F(5) = 0.45$, $F(6) = 0.65$. The median is $(5 + 6)/2 = 5.5$.
29. The mode consists of two values: 3 and 6. The probability mass function is given in the table
below.
x    P(X = x)
1    0.2
3    0.3
6    0.3
8    0.2
The mean is
\[ E(X) = (1)(0.2) + 3(0.3) + 6(0.3) + 8(0.2) = 4.5 \]
To find the median, we keep calculating the values of the cumulative distribution function until we
reach 0.5: $F(1) = 0.2$, $F(3) = 0.5$, $F(6) = 0.8$. The median is $(3 + 6)/2 = 4.5$.
Section 8 The Spread of a Distribution
1. All three samples share the same mean: $\mu_A = \mu_B = \mu_C = 3$. The sample $B$ is the least spread out
(the two values which differ from the mean are one unit away from it). The sample $A$ is less spread
out than $C$: the four values in $A$ which differ from 3 are closer to the mean than the four values in $C$
which differ from 3. Thus, $B$ has the smallest standard deviation, followed by $A$; the sample $C$ has
the largest standard deviation of the three samples.
We confirm our reasoning by calculating the three standard deviations:
\[
\mathrm{var}(A) = \sum_a (a - \mu_A)^2 P(A = a) = \frac{1}{5} \sum_a (a - 3)^2
= \frac{1}{5}\left[ (2 - 3)^2 + (2 - 3)^2 + (3 - 3)^2 + (4 - 3)^2 + (4 - 3)^2 \right] = \frac{4}{5}
\]
and $\sigma_A = \sqrt{\mathrm{var}(A)} = 2/\sqrt{5}$. Likewise,
\[
\mathrm{var}(B) = \sum_b (b - \mu_B)^2 P(B = b) = \frac{1}{5} \sum_b (b - 3)^2
= \frac{1}{5}\left[ (2 - 3)^2 + (3 - 3)^2 + (3 - 3)^2 + (3 - 3)^2 + (4 - 3)^2 \right] = \frac{2}{5}
\]
and $\sigma_B = \sqrt{\mathrm{var}(B)} = \sqrt{2}/\sqrt{5}$. Finally,
\[
\mathrm{var}(C) = \sum_c (c - \mu_C)^2 P(C = c) = \frac{1}{5} \sum_c (c - 3)^2
= \frac{1}{5}\left[ (1 - 3)^2 + (1 - 3)^2 + (3 - 3)^2 + (5 - 3)^2 + (5 - 3)^2 \right] = \frac{16}{5}
\]
and $\sigma_C = \sqrt{\mathrm{var}(C)} = 4/\sqrt{5}$. Thus, $\sigma_B < \sigma_A < \sigma_C$.
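3. Consider multiplying $X$ by a real number $a$. The formula $\mathrm{var}(aX) = a^2 \mathrm{var}(X)$ gives $2 = a^2(22)$,
so $a^2 = 2/22 = 1/11$ and $a = 1/\sqrt{11}$. Indeed,
\[
\mathrm{var}\left( \frac{1}{\sqrt{11}} X \right) = \left( \frac{1}{\sqrt{11}} \right)^2 \mathrm{var}(X) = \frac{1}{11} \cdot 22 = 2
\]
(Note that adding a real number to $X$ does not change its variance; that's why we considered
multiplication by a real number.)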
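5. The expected value of $X$ is zero:
\[
E(X) = \sum_{k=-4}^{4} k \, P(X = k) = \frac{1}{9} \sum_{k=-4}^{4} k = \frac{1}{9}(-4 - 3 - 2 - 1 + 0 + 1 + 2 + 3 + 4) = 0
\]
Therefore
\[
\mathrm{var}(X) = \sum_{k=-4}^{4} (k - E(X))^2 P(X = k) = \frac{1}{9} \sum_{k=-4}^{4} k^2
= \frac{1}{9}(16 + 9 + 4 + 1 + 0 + 1 + 4 + 9 + 16) = \frac{60}{9}
\]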
7. From
\[ E(X) = \sum_x x \, P(X = x) = (0)(0.15) + (1)(0.15) + (2)(0.15) + (4)(0.15) = (7)(0.15) = 1.05 \]
and
\[ E(X^2) = \sum_x x^2 P(X = x) = (0)(0.15) + (1)(0.15) + (4)(0.15) + (16)(0.15) = (21)(0.15) = 3.15 \]
we compute
\[ \mathrm{var}(X) = E(X^2) - [E(X)]^2 = 3.15 - (1.05)^2 = 2.0475 \]
and $\sigma_X = \sqrt{2.0475} \approx 1.43091$.
9. From
\[ E(X) = \sum_x x \, P(X = x) = (-2)(0.25) + (-1)(0.2) + (0)(0.1) + (1)(0.2) + (2)(0.25) = 0 \]
and
\[ E(X^2) = \sum_x x^2 P(X = x) = (4)(0.25) + (1)(0.2) + (0)(0.1) + (1)(0.2) + (4)(0.25) = 2.4 \]
we compute
\[ \mathrm{var}(X) = E(X^2) - [E(X)]^2 = 2.4 - (0)^2 = 2.4 \]
and $\sigma_X = \sqrt{2.4} \approx 1.54919$.
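The shortcut formula var(X) = E(X^2) - [E(X)]^2 used in Exercise 9 can be compared against the definition of the variance. This is a minimal Python sketch, assuming the probability mass function read off from the sums above (values -2, -1, 0, 1, 2 with probabilities 0.25, 0.2, 0.1, 0.2, 0.25).

```python
import math

# Probability mass function assumed from the solution of Exercise 9.
pmf = {-2: 0.25, -1: 0.2, 0: 0.1, 1: 0.2, 2: 0.25}

mean = sum(x * p for x, p in pmf.items())                 # E(X)
second_moment = sum(x**2 * p for x, p in pmf.items())     # E(X^2)

var_shortcut = second_moment - mean**2                    # E(X^2) - [E(X)]^2
var_definition = sum((x - mean)**2 * p for x, p in pmf.items())
sd = math.sqrt(var_shortcut)

print(mean, var_shortcut, var_definition, round(sd, 5))
```

Both routes give the same variance up to floating-point rounding; the shortcut is usually less work by hand because it avoids subtracting the mean from every value.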
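11. Let $E(X) = \mu$ and $X_1 = X - E(X) = X - \mu$.
Direct proof:
\begin{align*}
E(X_1) &= \sum_{x_1} x_1 P(X_1 = x_1) \\
&= \sum_x (x - \mu) P(X - \mu = x - \mu) \\
&= \sum_x (x - \mu) P(X = x) \\
&= \sum_x x P(X = x) - \mu \sum_x P(X = x) \\
&= E(X) - \mu \sum_x P(X = x) \\
&= \mu - \mu \cdot 1 = 0
\end{align*}
Using Theorem 7:
\[ E(X_1) = E(X - \mu) = E(X) - \mu = \mu - \mu = 0 \]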
13. Replacing $X$ in $\mathrm{var}(X) = E(X^2) - [E(X)]^2$ by $aX$, we get
\begin{align*}
\mathrm{var}(aX) &= E[(aX)^2] - [E(aX)]^2 \\
&= E[a^2 X^2] - [aE(X)]^2 \\
&= a^2 E[X^2] - a^2 [E(X)]^2 \\
&= a^2 \left( E(X^2) - [E(X)]^2 \right) = a^2 \mathrm{var}(X)
\end{align*}
15. The sample of 12 healthy adults, sorted:
110, 116, 120, 122, 123, 125, 125, 128, 132, 138, 138, 140
The minimum is 110, and the maximum is 140. The median is the mean of the sixth and the
seventh numbers: 125. The lower quartile is the mean of the third and the fourth numbers:
$Q_1 = (120 + 122)/2 = 121$, and the upper quartile is the mean of the ninth and the tenth numbers:
$Q_3 = (132 + 138)/2 = 135$.
The sample of 12 adults with a history of cardiovascular problems, sorted:
136, 142, 148, 150, 154, 154, 154, 158, 160, 160, 162, 166
The minimum is 136, and the maximum is 166. The median is the mean of the sixth and the
seventh numbers: 154. The lower quartile is the mean of the third and the fourth numbers:
$Q_1 = (148 + 150)/2 = 149$, and the upper quartile is the mean of the ninth and the tenth numbers:
$Q_3 = 160$.
See the figure below for the boxplots.
[Figure: boxplots of blood pressure. Healthy: minimum 110, Q1 = 121, median 125, Q3 = 135, maximum 140. Cardiovascular problems: minimum 136, Q1 = 149, median 154, Q3 = 160, maximum 166.]
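The five numbers behind each boxplot in Exercise 15 can be recomputed as follows. This Python sketch assumes the quartile convention used in these solutions (each quartile is the median of the lower or upper half of the sorted sample, with the middle entry left out when the sample size is odd); other conventions exist and may give different quartiles. The function name `five_number_summary` is hypothetical.

```python
from statistics import median

def five_number_summary(sample):
    """Return (min, Q1, median, Q3, max), with quartiles as medians of the two halves."""
    data = sorted(sample)
    half = len(data) // 2
    lower = data[:half]
    # For an odd sample size the middle entry belongs to neither half.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

healthy = [110, 116, 120, 122, 123, 125, 125, 128, 132, 138, 138, 140]
cardio = [136, 142, 148, 150, 154, 154, 154, 158, 160, 160, 162, 166]

print(five_number_summary(healthy))
print(five_number_summary(cardio))
```

The same function reproduces the quartiles quoted for the 14-element samples in Exercises 17 and 19, where each half has seven entries and the quartile is its fourth entry.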
17. The sample, sorted:
14, 16, 17, 18, 19, 20, 20, 20, 22, 22, 24, 24, 24, 25
The sample contains 14 numbers. The minimum is 14, and the maximum is 25. The median is the
mean of the seventh and the eighth numbers: 20. The lower quartile is the fourth number: $Q_1 = 18$,
and the upper quartile is the eleventh number: $Q_3 = 24$.
[Figure: boxplot of lifespan for lions in captivity: minimum 14, Q1 = 18, median 20, Q3 = 24, maximum 25.]
19. The sample, sorted:
12, 20, 20, 20, 21, 23, 23, 24, 24, 25, 25, 26, 27, 28
The sample contains 14 numbers. The minimum is 12, and the maximum is 28. The median is the
mean of the seventh and the eighth numbers: 23.5. The lower quartile is the fourth number: $Q_1 = 20$,
and the upper quartile is the eleventh number: $Q_3 = 25$.
[Figure: boxplot of lifespan for moose: minimum 12, Q1 = 20, median 23.5, Q3 = 25, maximum 28.]
21. We extract the probability mass function from the histogram.
x    P(X = x)
1    0.15
2    0.1
3    0.05
4    0.15
5    0.1
6    0.2
7    0.05
8    0.2
From
\begin{align*}
E(X) &= \sum_x x P(X = x) \\
&= (1)(0.15) + (2)(0.1) + (3)(0.05) + (4)(0.15) + (5)(0.1) + (6)(0.2) + (7)(0.05) + (8)(0.2) \\
&= 4.75
\end{align*}
and
\begin{align*}
E(X^2) &= \sum_x x^2 P(X = x) \\
&= (1)(0.15) + (4)(0.1) + (9)(0.05) + (16)(0.15) + (25)(0.1) + (36)(0.2) + (49)(0.05) + (64)(0.2) \\
&= 28.35
\end{align*}
we compute
\[ \mathrm{var}(X) = E(X^2) - [E(X)]^2 = 28.35 - (4.75)^2 = 5.7875 \]
and $\sigma_X = \sqrt{5.7875} \approx 2.40572$.
23. We extract the probability mass function from the histogram.
x    P(X = x)
1    0.05
2    0.05
3    0.1
4    0.1
5    0.15
6    0.15
7    0.2
8    0.2
From
\begin{align*}
E(X) &= \sum_x x P(X = x) \\
&= (1)(0.05) + (2)(0.05) + (3)(0.1) + (4)(0.1) + (5)(0.15) + (6)(0.15) + (7)(0.2) + (8)(0.2) = 5.5
\end{align*}
and
\begin{align*}
E(X^2) &= \sum_x x^2 P(X = x) \\
&= (1)(0.05) + (4)(0.05) + (9)(0.1) + (16)(0.1) + (25)(0.15) + (36)(0.15) + (49)(0.2) + (64)(0.2) \\
&= 34.5
\end{align*}
we compute
\[ \mathrm{var}(X) = E(X^2) - [E(X)]^2 = 34.5 - (5.5)^2 = 4.25 \]
and $\sigma_X = \sqrt{4.25} \approx 2.06155$.
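25. The mean of all three distributions is 24.5. For the Milky Way Farm,
\begin{align*}
\mathrm{MAD}(X_1) &= E(|X_1 - E(X_1)|) = E(|X_1 - 24.5|) \\
&= |18 - 24.5| \tfrac{6}{30} + |20 - 24.5| \tfrac{5}{30} + |22 - 24.5| \tfrac{2}{30} + |24 - 24.5| \tfrac{1}{30} \\
&\quad + |25 - 24.5| \tfrac{1}{30} + |27 - 24.5| \tfrac{4}{30} + |29 - 24.5| \tfrac{4}{30} + |30 - 24.5| \tfrac{7}{30} \\
&= \frac{134}{30}
\end{align*}
For the Milkshake Farm,
\begin{align*}
\mathrm{MAD}(X_2) &= E(|X_2 - 24.5|) \\
&= |22 - 24.5| \tfrac{4}{30} + |23 - 24.5| \tfrac{6}{30} + |24 - 24.5| \tfrac{3}{30} \\
&\quad + |25 - 24.5| \tfrac{8}{30} + |26 - 24.5| \tfrac{6}{30} + |27 - 24.5| \tfrac{3}{30} \\
&= \frac{41}{30}
\end{align*}
For the Butterscotch Farm,
\begin{align*}
\mathrm{MAD}(X_3) &= E(|X_3 - 24.5|) \\
&= |17 - 24.5| \tfrac{2}{30} + |18 - 24.5| \tfrac{7}{30} + |19 - 24.5| \tfrac{4}{30} + |20 - 24.5| \tfrac{3}{30} \\
&\quad + |30 - 24.5| \tfrac{2}{30} + |31 - 24.5| \tfrac{5}{30} + |32 - 24.5| \tfrac{7}{30} \\
&= \frac{192}{30}
\end{align*}
Thus, the MAD is able to detect the differences in the spreads of the three distributions. Note that
the order of the three distributions from the smallest to the largest standard deviation is the same as
the order of the three distributions from the smallest to the largest mean absolute deviation.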
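The three mean absolute deviations in Exercise 25 can be reproduced exactly with rational arithmetic. This is a Python sketch assuming the counts (out of 30) that appear as numerators in the sums above; the dictionary layout and the name `mean_and_mad` are illustrative choices.

```python
from fractions import Fraction

# value -> number of observations (out of 30), read off from the sums in the solution.
farms = {
    "Milky Way": {18: 6, 20: 5, 22: 2, 24: 1, 25: 1, 27: 4, 29: 4, 30: 7},
    "Milkshake": {22: 4, 23: 6, 24: 3, 25: 8, 26: 6, 27: 3},
    "Butterscotch": {17: 2, 18: 7, 19: 4, 20: 3, 30: 2, 31: 5, 32: 7},
}

def mean_and_mad(counts):
    """Return (E(X), E|X - E(X)|) for a distribution given as value -> count."""
    n = sum(counts.values())
    mean = Fraction(sum(x * c for x, c in counts.items()), n)
    mad = sum(abs(x - mean) * Fraction(c, n) for x, c in counts.items())
    return mean, mad

for name, counts in farms.items():
    mean, mad = mean_and_mad(counts)
    print(name, float(mean), mad)
```

Note that `Fraction` prints results in lowest terms, so 134/30 appears as 67/15 and 192/30 as 32/5.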
Section 9 Joint Distributions
1. Using independence, we find
P(X = 1, Y = 1) = P(X = 1)P(Y = 1) = (0.2)(0.7) = 0.14
P(X = 1, Y = 2) = P(X = 1)P(Y = 2) = (0.2)(0.3) = 0.06
P(X = 2, Y = 1) = P(X = 2)P(Y = 1) = (0.8)(0.7) = 0.56
P(X = 2, Y = 2) = P(X = 2)P(Y = 2) = (0.8)(0.3) = 0.24
These four probabilities form the joint probability distribution of X and Y. See below.
         X = 1    X = 2
Y = 1    0.14     0.56
Y = 2    0.06     0.24
3. Denote the missing entries by a and b and expand the table to include the horizontal and the
vertical totals:
         X = 0                 X = 1
Y = 0    0.1                   0.3                   P(Y = 0) = 0.4
Y = 1    a                     b                     P(Y = 1) = a + b = 0.6
         P(X = 0) = 0.1 + a    P(X = 1) = 0.3 + b
By independence
P(X = 0, Y = 0) = P(X = 0)P(Y = 0)
0.1 = (0.1 + a)(0.4)
0.25 = 0.1 + a
and thus a = 0.15. From P(Y = 1) = a + b = 0.6 we get b = 0.45.
5. Using independence, we find
P(X = 1, Y = 1) = P(X = 1)P(Y = 1) = (0.2)(0.9) = 0.18
P(X = 1, Y = 2) = P(X = 1)P(Y = 2) = (0.2)(0.1) = 0.02
P(X = 2, Y = 1) = P(X = 2)P(Y = 1) = (0.8)(0.9) = 0.72
P(X = 2, Y = 2) = P(X = 2)P(Y = 2) = (0.8)(0.1) = 0.08
The four probabilities form the joint probability distribution of X and Y, shown in the table below
(expanded to include the horizontal and vertical totals):
         X = 1             X = 2
Y = 1    0.18              0.72              P(Y = 1) = 0.9
Y = 2    0.02              0.08              P(Y = 2) = 0.1
         P(X = 1) = 0.2    P(X = 2) = 0.8
We find
\[ P(X = 1 \mid Y = 1) = \frac{P(X = 1, Y = 1)}{P(Y = 1)} = \frac{0.18}{0.9} = 0.2 \]
\[ P(X = 1 \mid Y = 2) = \frac{P(X = 1, Y = 2)}{P(Y = 2)} = \frac{0.02}{0.1} = 0.2 \]
Recall the law of total probability: If $A$ is an event, and $E_1$ and $E_2$ form a partition, then
\[ P(A) = P(A \mid E_1)P(E_1) + P(A \mid E_2)P(E_2) \]
Substituting $A = \{X = 1\}$, $E_1 = \{Y = 1\}$, and $E_2 = \{Y = 2\}$, we obtain the desired relation
\[ P(X = 1) = P(X = 1 \mid Y = 1)P(Y = 1) + P(X = 1 \mid Y = 2)P(Y = 2) \]
between $P(X = 1 \mid Y = 1)$, $P(X = 1 \mid Y = 2)$, and $P(X = 1)$. We illustrate it by substituting the
probabilities we calculated:
\[ 0.2 = (0.2)(0.9) + (0.2)(0.1) \]
7. We need to find $P(R = + \text{ and } G = B \mid G = B)$. From $P(G = B) = 0.076 + 0.014 = 0.09$ we get
\[
P(R = + \text{ and } G = B \mid G = B) = \frac{P(R = +, G = B)}{P(G = B)} = \frac{0.076}{0.09} \approx 0.844
\]
9. The two probabilities
\[ P(R = + \mid G = B) = \frac{P(R = +, G = B)}{P(G = B)} = \frac{0.076}{0.09} = \frac{76}{90} \]
\[ P(R = - \mid G = B) = \frac{P(R = -, G = B)}{P(G = B)} = \frac{0.014}{0.09} = \frac{14}{90} \]
define the distribution of $R$ conditional on $G = B$. (Note that $P(G = B) = 0.076 + 0.014 = 0.09$.)
11. By adding up the entries horizontally, we obtain the marginal distribution for A:
P(A = allergy) = P(A = allergy, T = positive)
+P(A = allergy, T = negative) +P(A = allergy, T = inconclusive)
= 0.3 + 0.07 + 0.1 = 0.47
P(A = no allergy) = P(A = no allergy, T = positive)
+P(A = no allergy, T = negative) +P(A = no allergy, T = inconclusive)
= 0.03 + 0.45 + 0.05 = 0.53
Thus, there is a 47% chance that a randomly selected person from the group has allergy.
By adding up the entries vertically, we obtain the marginal distribution for T:
P(T = positive) = P(A = allergy, T = positive) +P(A = no allergy, T = positive)
= 0.3 + 0.03 = 0.33
P(T = negative) = P(A = allergy, T = negative) +P(A = no allergy, T = negative)
= 0.07 + 0.45 = 0.52
P(T = inconclusive) = P(A = allergy, T = inconclusive) +P(A = no allergy, T = inconclusive)
= 0.1 + 0.05 = 0.15
Thus, for 33% of the population the test is positive, and for 52% it is negative; for 15% of the
population the test is inconclusive.
13. We compute
\[
P(A = \text{allergy} \mid T = \text{negative}) = \frac{P(A = \text{allergy}, T = \text{negative})}{P(T = \text{negative})}
= \frac{0.07}{0.07 + 0.45} = \frac{7}{52} \approx 0.13462
\]
15. We need to find the probabilities a, b, c, and d which define the joint distribution:
         X = 1    X = 2
Y = 3    a        b
Y = 4    c        d
From the given information, we get the following equations:
$P(X = 1) = 0.4$ implies that $a + c = 0.4$
$P(X = 2) = 0.6$ implies that $b + d = 0.6$
$P(Y = 3 \mid X = 1) = 0.7$ implies that $\dfrac{P(Y = 3, X = 1)}{P(X = 1)} = \dfrac{a}{a + c} = 0.7$
$P(Y = 3 \mid X = 2) = 0.1$ implies that $\dfrac{P(Y = 3, X = 2)}{P(X = 2)} = \dfrac{b}{b + d} = 0.1$
Combining the first and the third equation we get a/0.4 = 0.7 and thus a = 0.28. From a + c = 0.4
it follows that c = 0.12. Combining the second and the fourth equation we get b/0.6 = 0.1 and thus
b = 0.06. From b + d = 0.6 it follows that d = 0.54. The joint distribution is
         X = 1    X = 2
Y = 3    0.28     0.06
Y = 4    0.12     0.54
17. By adding up the entries along the rows we obtain the distribution for F:
P(F = fish) = P(F = fish, P = brown bear)
+ P(F = fish, P = wolf) + P(F = fish, P = fox)
= 0.2 + 0.02 + 0.03 = 0.25
P(F = insects) = P(F = insects, P = brown bear)
+ P(F = insects, P = wolf) + P(F = insects, P = fox)
= 0.1 + 0.05 + 0.05 = 0.2
P(F = small mammals) = P(F = small mammals, P = brown bear)
+ P(F = small mammals, P = wolf) + P(F = small mammals, P = fox)
= 0.2 + 0.25 + 0.1 = 0.55
By adding up the entries vertically we obtain the distribution for P:
P(P = brown bear) = P(P = brown bear, F = fish)
+ P(P = brown bear, F = insects)
+ P(P = brown bear, F = small mammals)
= 0.2 + 0.1 + 0.2 = 0.5
P(P = wolf) = P(P = wolf, F = fish)
+ P(P = wolf, F = insects) + P(P = wolf, F = small mammals)
= 0.02 + 0.05 + 0.25 = 0.32
P(P = fox) = P(P = fox, F = fish)
+ P(P = fox, F = insects) + P(P = fox, F = small mammals)
= 0.03 + 0.05 + 0.1 = 0.18
19. The conditional probabilities are:
\[
P(F = \text{fish} \mid P = \text{wolf}) = \frac{P(F = \text{fish}, P = \text{wolf})}{P(P = \text{wolf})}
= \frac{0.02}{0.02 + 0.05 + 0.25} = \frac{0.02}{0.32}
\]
\[
P(F = \text{insects} \mid P = \text{wolf}) = \frac{P(F = \text{insects}, P = \text{wolf})}{P(P = \text{wolf})}
= \frac{0.05}{0.02 + 0.05 + 0.25} = \frac{0.05}{0.32}
\]
\[
P(F = \text{small mammals} \mid P = \text{wolf}) = \frac{P(F = \text{small mammals}, P = \text{wolf})}{P(P = \text{wolf})}
= \frac{0.25}{0.02 + 0.05 + 0.25} = \frac{0.25}{0.32}
\]
The probabilities add up to 1, because a wolf will have one of the three for food.
21. The probability that a bear will prey on a small mammal is
\[
P(F = \text{small mammals} \mid P = \text{bear}) = \frac{P(F = \text{small mammals}, P = \text{bear})}{P(P = \text{bear})}
= \frac{0.2}{0.2 + 0.1 + 0.2} = \frac{2}{5}
\]
or 40%.
23. (a) The marginal distribution for X is given by
P(X = 0) = P(X = 0, Y = 0) +P(X = 0, Y = 1) = 0.05 + 0.45 = 0.5
P(X = 1) = P(X = 1, Y = 0) +P(X = 1, Y = 1) = 0.1 + 0.4 = 0.5
The marginal distribution for Y is given by
P(Y = 0) = P(X = 0, Y = 0) +P(X = 1, Y = 0) = 0.05 + 0.1 = 0.15
P(Y = 1) = P(X = 0, Y = 1) +P(X = 1, Y = 1) = 0.45 + 0.4 = 0.85
(b) The random variables X and Y are not independent; for instance, P(X = 0, Y = 0) = 0.05 is not
equal to P(X = 0)P(Y = 0) = (0.5)(0.15) = 0.075.
25. (a) The marginal distribution for X is given by
P(X = 0) = P(X = 0, Y = 0) +P(X = 0, Y = 1) +P(X = 0, Y = 2) = 0.12 + 0.22 + 0.02 = 0.36
P(X = 1) = P(X = 1, Y = 0) +P(X = 1, Y = 1) +P(X = 1, Y = 2) = 0.18 + 0.28 + 0.18 = 0.64
The marginal distribution for Y is given by
P(Y = 0) = P(X = 0, Y = 0) +P(X = 1, Y = 0) = 0.12 + 0.18 = 0.3
P(Y = 1) = P(X = 0, Y = 1) +P(X = 1, Y = 1) = 0.22 + 0.28 = 0.5
P(Y = 2) = P(X = 0, Y = 2) +P(X = 1, Y = 2) = 0.02 + 0.18 = 0.2
(b) The random variables X and Y are not independent; for instance, P(X = 0, Y = 1) = 0.22 is not
equal to P(X = 0)P(Y = 1) = (0.36)(0.5) = 0.18.
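The marginal sums and the independence test in Exercise 25 follow a fixed recipe, sketched here in Python. The joint table is the one used in the solution; the helper names `marginals` and `independent` are hypothetical. Exact fractions are used so that the comparison p(x, y) = p(x)p(y) is not affected by rounding.

```python
from fractions import Fraction as F

# Joint distribution from Exercise 25: (x, y) -> P(X = x, Y = y).
joint = {
    (0, 0): F(12, 100), (0, 1): F(22, 100), (0, 2): F(2, 100),
    (1, 0): F(18, 100), (1, 1): F(28, 100), (1, 2): F(18, 100),
}

def marginals(joint):
    """Sum the joint probabilities over the other variable."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def independent(joint):
    """X and Y are independent iff every joint entry equals the product of its marginals."""
    px, py = marginals(joint)
    return all(p == px[x] * py[y] for (x, y), p in joint.items())

px, py = marginals(joint)
print({x: float(p) for x, p in px.items()})
print({y: float(p) for y, p in py.items()})
print(independent(joint))
```

A single failing entry, such as (0, 1) here, is enough to rule out independence; showing independence requires checking every entry.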
27. We find
\[
P(Y = 0 \mid X = 0) = \frac{P(Y = 0, X = 0)}{P(X = 0)} = \frac{0.2}{0.2 + 0.08 + 0.12} = \frac{0.2}{0.4} = 0.5
\]
\[
P(Y = 0 \mid X = 1) = \frac{P(Y = 0, X = 1)}{P(X = 1)} = \frac{0.3}{0.3 + 0.12 + 0.18} = \frac{0.3}{0.6} = 0.5
\]
We see that $P(Y = 0 \mid X = 0) + P(Y = 0 \mid X = 1) = 0.5 + 0.5 = 1$. From the joint distribution table we
compute $P(Y = 0) = 0.2 + 0.3 = 0.5$. Examining the joint distribution more closely, we realize that $X$ and
$Y$ are independent. Thus $P(Y = 0 \mid X = 0) + P(Y = 0 \mid X = 1) = P(Y = 0) + P(Y = 0) = 2P(Y = 0)$,
which agrees with the numeric values above.
29. The probabilities P(X = 0) = 0.05 + 0.1 + 0.4 = 0.55 and P(X = 1) = 0.1 + 0.1 + 0.25 = 0.45
define the marginal probability distribution of X.
31. The probabilities
\[
P(X = 0 \mid Y = 2) = \frac{P(X = 0, Y = 2)}{P(Y = 2)} = \frac{0.4}{0.4 + 0.25} = \frac{0.4}{0.65} = \frac{8}{13}
\]
\[
P(X = 1 \mid Y = 2) = \frac{P(X = 1, Y = 2)}{P(Y = 2)} = \frac{0.25}{0.4 + 0.25} = \frac{0.25}{0.65} = \frac{5}{13}
\]
define the distribution of $X$ conditional on $Y = 2$.
33. The marginal probability distributions of X and Y are given in the last column and the last row
of the table below.
         Y = 1              Y = 2
X = 2    0                  0.12               P(X = 2) = 0.12
X = 1    0.1                0.38               P(X = 1) = 0.48
X = 0    0.26               0.14               P(X = 0) = 0.4
         P(Y = 1) = 0.36    P(Y = 2) = 0.64
35. Assume that $P(X = 1) = p_1$ and $P(X = 2) = p_2$ is the probability distribution of $X$, and
$P(Y = 3) = q_1$, $P(Y = 4) = q_2$, and $P(Y = 5) = q_3$ is the probability distribution of $Y$. Then
$E(X) = p_1 + 2p_2$ and $E(Y) = 3q_1 + 4q_2 + 5q_3$. The range of $XY$ is $\{3, 4, 5, 6, 8, 10\}$ and its distribution
is given by (here we use independence)
\begin{align*}
P(XY = 3) &= P(X = 1 \text{ and } Y = 3) = P(X = 1)P(Y = 3) = p_1 q_1 \\
P(XY = 4) &= P(X = 1 \text{ and } Y = 4) = P(X = 1)P(Y = 4) = p_1 q_2 \\
P(XY = 5) &= P(X = 1 \text{ and } Y = 5) = P(X = 1)P(Y = 5) = p_1 q_3 \\
P(XY = 6) &= P(X = 2 \text{ and } Y = 3) = P(X = 2)P(Y = 3) = p_2 q_1 \\
P(XY = 8) &= P(X = 2 \text{ and } Y = 4) = P(X = 2)P(Y = 4) = p_2 q_2 \\
P(XY = 10) &= P(X = 2 \text{ and } Y = 5) = P(X = 2)P(Y = 5) = p_2 q_3
\end{align*}
It follows that
\[ E(XY) = 3p_1 q_1 + 4p_1 q_2 + 5p_1 q_3 + 6p_2 q_1 + 8p_2 q_2 + 10p_2 q_3 \]
Since
\[
E(X)E(Y) = (p_1 + 2p_2)(3q_1 + 4q_2 + 5q_3) = 3p_1 q_1 + 4p_1 q_2 + 5p_1 q_3 + 6p_2 q_1 + 8p_2 q_2 + 10p_2 q_3
\]
we conclude that $E(XY) = E(X)E(Y)$.
In general: the range of $X$ is $\{x_1, x_2, \ldots, x_m\}$; assume that its distribution is given by
$P(X = x_i) = p_i$. The range of $Y$ is $\{y_1, y_2, \ldots, y_n\}$; assume that its distribution is given by $P(Y = y_j) = q_j$.
Then $E(X) = p_1 x_1 + p_2 x_2 + \cdots + p_m x_m$ and $E(Y) = q_1 y_1 + q_2 y_2 + \cdots + q_n y_n$. The range of $XY$
consists of all products $x_i y_j$, where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$. The probability distribution is
(by independence)
\[ P(XY = x_i y_j) = P(X = x_i \text{ and } Y = y_j) = P(X = x_i)P(Y = y_j) = p_i q_j \]
and
\begin{align*}
E(XY) &= x_1 y_1 p_1 q_1 + x_1 y_2 p_1 q_2 + \cdots + x_1 y_n p_1 q_n \\
&\quad + x_2 y_1 p_2 q_1 + x_2 y_2 p_2 q_2 + \cdots + x_2 y_n p_2 q_n + \cdots \\
&\quad + x_m y_1 p_m q_1 + x_m y_2 p_m q_2 + \cdots + x_m y_n p_m q_n
\end{align*}
Computing the product
\[ E(X)E(Y) = (p_1 x_1 + p_2 x_2 + \cdots + p_m x_m)(q_1 y_1 + q_2 y_2 + \cdots + q_n y_n) \]
we see that $E(XY) = E(X)E(Y)$.
Section 10 The Binomial Distribution
1. No. The binomial distribution requires that the same experiment (with the same probability of
success) be repeated. In this case, the probability of success (a male is interviewed) changes: initially,
the probability that a male is selected for an interview is 1/2. Assuming independence, the probability
that the second interviewee is a male is 10/19 (if the first interviewee was a woman) or 9/19 (if the
first interviewee was a man); however, neither is equal to 50%.
3. Define
\[
B = \begin{cases} 1 & \text{goshawk catches a small mammal (success)} \\ 0 & \text{goshawk does not catch a small mammal} \end{cases}
\]
$B$ is a Bernoulli random variable with the probability of success equal to 0.6. Repeat the experiment
10 times; by assumption, the outcomes are independent. The random variable $X$ = "number of
small mammals captured" counts the number of successes in ten independent repetitions of the same
experiment. Thus, $X$ is a binomial variable.
5. Let S represent success and F represent a no-success (failure). Exactly two successes in four trials
occur in the following six cases: SSFF, SFSF, SFFS, FSSF, FSFS, and FFSS. They are all equally
likely, with the probability
\[ P(SSFF) = P(S)P(S)P(F)P(F) = (0.3)(0.3)(0.7)(0.7) = (0.3)^2 (0.7)^2 \]
Thus, the probability of obtaining exactly two successes in four trials is
$6P(SSFF) = 6 (0.3)^2 (0.7)^2 = 0.2646$.
Now the binomial distribution approach: the probability of success in a single experiment is 0.3.
The probability of 2 successes in 4 independent repetitions of the experiment is given by
\[ b(2, 4; 0.3) = \binom{4}{2} (0.3)^2 (0.7)^2 = 6 (0.3)^2 (0.7)^2 \]
Clearly, the two answers match.
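Both approaches of Exercise 5 can be carried out mechanically: listing all sequences of S and F, and applying the binomial formula. This is a Python sketch; `binomial_pmf(k, n, p)` mirrors the notation b(k, n; p) of the text, and `by_enumeration` is a hypothetical helper that is only practical for small n.

```python
from itertools import product
from math import comb

def binomial_pmf(k, n, p):
    """b(k, n; p): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def by_enumeration(k, n, p):
    """Add the probabilities of all S/F sequences of length n with exactly k successes."""
    total = 0.0
    for outcome in product("SF", repeat=n):
        if outcome.count("S") == k:
            total += p**k * (1 - p)**(n - k)
    return total

print(binomial_pmf(2, 4, 0.3), by_enumeration(2, 4, 0.3))
```

The enumeration visits all 2^n sequences, so the closed formula is the only practical route once n grows beyond a few dozen.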
7. $\binom{12}{3}$ represents the number of ways to choose a group of three objects from a group of 12 objects
(say, the number of ways of picking three shirts from a collection of 12 shirts in different colours). We
compute
\[ \binom{12}{3} = \frac{12 \cdot 11 \cdot 10}{1 \cdot 2 \cdot 3} = 2 \cdot 11 \cdot 10 = 220 \]
9. We compute
\[ C(8, 0) = \binom{8}{0} = \frac{8!}{0! \, (8 - 0)!} = 1 \]
since $0! = 1$. In theory, $C(8, 0)$ represents the number of ways to choose zero objects from a group
of eight objects (we can define that there is one way of not picking any object from a group of 8
objects).
11. The number $b(1, 4; 0.6)$ represents the probability of one success in 4 independent repetitions of
the same Bernoulli experiment with the probability of success equal to 0.6. We compute
\[ b(1, 4; 0.6) = \binom{4}{1} (0.6)^1 (1 - 0.6)^{4-1} = 4 (0.6)(0.4)^3 = 0.1536 \]
13. The number $b(1, 7; 0.2)$ represents the probability of one success in 7 independent repetitions of
the same Bernoulli experiment whose probability of success is 0.2. We compute
\[ b(1, 7; 0.2) = \binom{7}{1} (0.2)^1 (1 - 0.2)^{7-1} = 7 (0.2)(0.8)^6 \approx 0.3670 \]
15. Label the tosses by numbers $S = \{1, 2, 3, 4, 5, 6, 7, 8\}$. Picking a group of three numbers from $S$
corresponds to one event in which 3 tails occurred in 8 tosses (for instance, picking 4, 5, and 8 describes
the event in which tails occurred on the 4th, 5th, and 8th tosses). Thus, the number of ways of getting
three tails in eight tosses of a coin is equal to the number of ways of selecting a group of 3 numbers
from the set $S$ of 8 numbers, which is
\[ \binom{8}{3} = \frac{8 \cdot 7 \cdot 6}{1 \cdot 2 \cdot 3} = 56 \]
17. The number of ways of selecting a team of 4 students from a group of 20 students is
\[ \binom{20}{4} = \frac{20 \cdot 19 \cdot 18 \cdot 17}{1 \cdot 2 \cdot 3 \cdot 4} = 5 \cdot 19 \cdot 3 \cdot 17 = 4845 \]
19. At least three successes means 3, 4, or 5 successes. Thus, the probability of at least three
successes in five trials is given by b(3, 5; 0.6) + b(4, 5; 0.6) + b(5, 5; 0.6).
21. The number of successes could be 5, 6, 7, 8, or 9. The probability is given by b(5, 25; 0.6) +
b(6, 25; 0.6) +b(7, 25; 0.6) +b(8, 25; 0.6) +b(9, 25; 0.6).
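23. Formula (10.3) states that
\[ \binom{n}{k} = \frac{n!}{k!(n - k)!} \]
Replacing $k$ by $n - k$, we get
\[ \binom{n}{n - k} = \frac{n!}{(n - k)!(n - (n - k))!} = \frac{n!}{(n - k)! \, k!} = \binom{n}{k} \]
Thus,
\[ \binom{22}{20} = \binom{22}{22 - 20} = \binom{22}{2} = \frac{22 \cdot 21}{1 \cdot 2} = 231 \]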
25. (a) The probability distribution of $X$ is given by:
\begin{align*}
P(X = 0) &= b(0, 2; 0.4) = \binom{2}{0} (0.4)^0 (1 - 0.4)^2 = (0.6)^2 = 0.36 \\
P(X = 1) &= b(1, 2; 0.4) = \binom{2}{1} (0.4)^1 (1 - 0.4)^1 = 2 (0.4)(0.6) = 0.48 \\
P(X = 2) &= b(2, 2; 0.4) = \binom{2}{2} (0.4)^2 (1 - 0.4)^0 = (0.4)^2 = 0.16
\end{align*}
(b) See below.
[Figure: histogram of the distribution, with bars of height 0.36, 0.48, and 0.16 at k = 0, 1, 2; axes labelled k and P(N = k).]
(c) The mean is
\[ E(X) = 0 \cdot 0.36 + 1 \cdot 0.48 + 2 \cdot 0.16 = 0.8 \]
From
\[ E(X^2) = 0 \cdot 0.36 + 1 \cdot 0.48 + 4 \cdot 0.16 = 1.12 \]
we compute the variance
\[ \mathrm{var}(X) = E(X^2) - (E(X))^2 = 1.12 - 0.8^2 = 0.48 \]
(d) Using (10.4), $E(X) = np = 2 \cdot 0.4 = 0.8$; using (10.5), $\mathrm{var}(X) = np(1 - p) = 2 \cdot 0.4 \cdot 0.6 = 0.48$.
27. (a) The probability distribution of $X$ is given by:
\begin{align*}
P(X = 0) &= b(0, 4; 0.4) = \binom{4}{0} (0.4)^0 (0.6)^4 = (0.6)^4 = 0.1296 \\
P(X = 1) &= b(1, 4; 0.4) = \binom{4}{1} (0.4)^1 (0.6)^3 = 4 (0.4)(0.6)^3 = 0.3456 \\
P(X = 2) &= b(2, 4; 0.4) = \binom{4}{2} (0.4)^2 (0.6)^2 = 6 (0.4)^2 (0.6)^2 = 0.3456 \\
P(X = 3) &= b(3, 4; 0.4) = \binom{4}{3} (0.4)^3 (0.6)^1 = 4 (0.4)^3 (0.6) = 0.1536 \\
P(X = 4) &= b(4, 4; 0.4) = \binom{4}{4} (0.4)^4 (0.6)^0 = (0.4)^4 = 0.0256
\end{align*}
(b) See below.
[Figure: histogram of the distribution, with bars of height 0.1296, 0.3456, 0.3456, 0.1536, and 0.0256 at k = 0, 1, 2, 3, 4; axes labelled k and P(N = k).]
(c) The mean is
\[ E(X) = 0 \cdot 0.1296 + 1 \cdot 0.3456 + 2 \cdot 0.3456 + 3 \cdot 0.1536 + 4 \cdot 0.0256 = 1.6 \]
From
\[ E(X^2) = 0 \cdot 0.1296 + 1 \cdot 0.3456 + 4 \cdot 0.3456 + 9 \cdot 0.1536 + 16 \cdot 0.0256 = 3.52 \]
we compute the variance
\[ \mathrm{var}(X) = E(X^2) - (E(X))^2 = 3.52 - 1.6^2 = 0.96 \]
(d) Using (10.4), $E(X) = np = 4 \cdot 0.4 = 1.6$; using (10.5), $\mathrm{var}(X) = np(1 - p) = 4 \cdot 0.4 \cdot 0.6 = 0.96$.
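Parts (c) and (d) of Exercise 27 compute the mean and variance twice: from the distribution and from the formulas np and np(1 - p). The comparison can be automated with a short Python sketch; the function names are illustrative.

```python
from math import comb

def binomial_distribution(n, p):
    """Return the list [b(0, n; p), b(1, n; p), ..., b(n, n; p)]."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def mean_and_variance(pmf):
    """Mean and variance of a distribution on 0, 1, ..., len(pmf) - 1."""
    mean = sum(k * q for k, q in enumerate(pmf))
    second_moment = sum(k * k * q for k, q in enumerate(pmf))
    return mean, second_moment - mean**2

n, p = 4, 0.4
pmf = binomial_distribution(n, p)
mean, variance = mean_and_variance(pmf)

print([round(q, 4) for q in pmf])
print(round(mean, 10), n * p)                     # both routes to E(X)
print(round(variance, 10), n * p * (1 - p))       # both routes to var(X)
```

Changing `n` and `p` to 2 and 0.4 reproduces the numbers of Exercise 25 in the same way.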
29. (a) Define
\[
H = \begin{cases} 1 & \text{a chocolate has a hazelnut (success)} \\ 0 & \text{a chocolate has no hazelnut} \end{cases}
\]
$H$ is a Bernoulli random variable with probability of success equal to 0.03. Let $N$ = "number of
chocolates with a hazelnut in a box of 20 chocolates." Since $N$ counts the number of successes in
repetitions of $H$ (assumed independent), $N$ is a binomially distributed random variable with $n = 20$
and $p = 0.03$.
The expected number of chocolates with a hazelnut per box is $E(N) = np = 20(0.03) = 0.6$.
(b) The probability that there are no chocolates with a hazelnut in one box of 20 chocolates is the
probability of no successes in 20 repetitions:
\[ b(0, 20; 0.03) = \binom{20}{0} (0.03)^0 (0.97)^{20} \approx 0.54379 \]
i.e., a bit over 54%.
(c) Define
\[
B = \begin{cases} 1 & \text{a box of chocolates has no chocolates with a hazelnut (success)} \\ 0 & \text{a box of chocolates has a chocolate with a hazelnut} \end{cases}
\]
$B$ is a Bernoulli random variable with probability of success equal to 0.54379. Let $M$ = "number of
boxes of chocolates which do not contain a chocolate with a hazelnut." Since $M$ counts the number
of successes in 15 independent repetitions of $B$, $M$ is a binomially distributed random variable with
$n = 15$ and $p = 0.54379$.
The expected number of boxes that contain no chocolates with a hazelnut is $E(M) = np =
15(0.54379) = 8.15685$; i.e., about 8 boxes.
31. Define
\[
T = \begin{cases} 1 & \text{a tomato plant has been infested with hornworms (success)} \\ 0 & \text{a tomato plant has not been infested with hornworms} \end{cases}
\]
$T$ is a Bernoulli random variable with probability of success equal to 0.15. Let $N$ = "number of
tomato plants which have been infested with hornworms." Since $N$ counts the number of successes in
independent repetitions of $T$, $N$ is a binomially distributed random variable; it is given that $n = 10$
and $p = 0.15$. The probability that none of the ten randomly picked tomato plants have been infested
with hornworms is (zero successes in ten trials)
\[ b(0, 10; 0.15) = \binom{10}{0} (0.15)^0 (0.85)^{10} \approx 0.19687 \]
33. The probability distribution of the genotype of a puppy of SC parents is P(SS) = 0.25, P(SC) =
0.5, and P(CC) = 0.25. Thus,
P(puppy has straight hair) = P(SS) + P(SC) = 0.75
P(puppy has curly hair) = P(CC) = 0.25
Define
\[
H = \begin{cases} 1 & \text{a puppy has curly hair (success)} \\ 0 & \text{a puppy does not have curly hair} \end{cases}
\]
$H$ is a Bernoulli random variable with probability of success equal to 0.25. Let $N$ = "number of
puppies which have curly hair." Since $N$ counts the number of successes in independent repetitions
of $H$, $N$ is a binomially distributed random variable; it is given that $n = 8$ and $p = 0.25$.
The expected number of puppies with curly hair is $E(N) = np = 8(0.25) = 2$. The probability
that exactly 2 puppies have curly hair is
\[ b(2, 8; 0.25) = \binom{8}{2} (0.25)^2 (0.75)^6 \approx 0.31146 \]
35. The probability distribution of the genotype of an offspring of LS parents is P(LL) = 0.25 =
P(long), P(LS) = 0.5 = P(medium-sized), and P(SS) = 0.25 = P(short). Define
\[
T = \begin{cases} 1 & \text{an offspring is medium-sized (success)} \\ 0 & \text{an offspring is not medium-sized} \end{cases}
\]
$T$ is a Bernoulli random variable with probability of success equal to 0.5. Let $N$ = "number of
medium-sized offspring." Since $N$ counts the number of successes in repetitions of $T$, $N$ is a binomially
distributed random variable; it is given that $n = 12$ and $p = 0.5$.
(a) The expected number of medium-sized offspring is $E(N) = np = 12(0.5) = 6$.
(b) The probability that there are at most two medium-sized offspring is
\begin{align*}
b(0, 12; 0.5) + b(1, 12; 0.5) + b(2, 12; 0.5)
&= \binom{12}{0} (0.5)^0 (0.5)^{12} + \binom{12}{1} (0.5)^1 (0.5)^{11} + \binom{12}{2} (0.5)^2 (0.5)^{10} \\
&= (1 + 12 + 66)(0.5)^{12} \approx 0.01929
\end{align*}
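37. (a) We approximate
$$50! \approx \sqrt{2\pi \cdot 50}\left(\frac{50}{e}\right)^{50} = 10\sqrt{\pi}\left(\frac{50}{e}\right)^{50} \approx 3.036344619 \times 10^{64}$$
The true value is
50! = 30414093201713378043612608166064768844377641568960512000000000000
(b) We get
$$\begin{aligned}
\log_{10}\left(\sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}\right)
&= \log_{10}\sqrt{2\pi n} + \log_{10}\left(\frac{n}{e}\right)^{n} \\
&= \frac{1}{2}\left(\log_{10}(2\pi) + \log_{10} n\right) + n\left(\log_{10} n - \log_{10} e\right) \\
&= \frac{1}{2}\log_{10}(2\pi) + \left(n + \frac{1}{2}\right)\log_{10} n - n\log_{10} e
\end{aligned}$$
When n = 120,
$$\log_{10} 120! \approx \frac{1}{2}\log_{10}(2\pi) + 120.5\log_{10} 120 - 120\log_{10} e \approx 198.8250922$$
(c) Using (b), we get
$$\begin{aligned}
\log_{10}\binom{120}{36} &= \log_{10}\frac{120!}{36!\,84!} = \log_{10} 120! - \left[\log_{10} 36! + \log_{10} 84!\right] \\
&\approx \frac{1}{2}\log_{10}(2\pi) + 120.5\log_{10} 120 - 120\log_{10} e \\
&\quad - \left[\frac{1}{2}\log_{10}(2\pi) + 36.5\log_{10} 36 - 36\log_{10} e + \frac{1}{2}\log_{10}(2\pi) + 84.5\log_{10} 84 - 84\log_{10} e\right] \\
&= -\frac{1}{2}\log_{10}(2\pi) + 120.5\log_{10} 120 - 36.5\log_{10} 36 - 84.5\log_{10} 84 \\
&\approx 30.7356092
\end{aligned}$$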
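Stirling's approximation in logarithmic form is easy to compare against exact values. The sketch below is one possible check in Python; the helper names are hypothetical, and the exact reference values come from the standard library (`math.lgamma` and `math.comb`) rather than from the text.

```python
import math


def log10_factorial_stirling(n: int) -> float:
    """Stirling: log10(n!) ~ (1/2) log10(2 pi) + (n + 1/2) log10(n) - n log10(e)."""
    return (0.5 * math.log10(2 * math.pi)
            + (n + 0.5) * math.log10(n)
            - n * math.log10(math.e))


def log10_factorial_exact(n: int) -> float:
    """log10(n!) via the log-gamma function, since n! = Gamma(n + 1)."""
    return math.lgamma(n + 1) / math.log(10)


def log10_binom_stirling(n: int, k: int) -> float:
    """log10 of the binomial coefficient C(n, k), with all factorials by Stirling."""
    return (log10_factorial_stirling(n)
            - log10_factorial_stirling(k)
            - log10_factorial_stirling(n - k))


approx_120 = log10_factorial_stirling(120)
exact_120 = log10_factorial_exact(120)
approx_binom = log10_binom_stirling(120, 36)
exact_binom = math.log10(math.comb(120, 36))  # math.log10 accepts large integers

print(approx_120, exact_120)
print(approx_binom, exact_binom)
```

Working with logarithms avoids forming 120! as a floating-point number, which is the same reason the hand calculation is carried out in base-10 logarithms.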
Section 11 The Multinomial and the Geometric Distributions
1. (a) We can do it in $\frac{4!}{1!\,3!} = \frac{24}{6} = 4$ ways: {1 | 2, 3, 4}, {2 | 1, 3, 4}, {3 | 1, 2, 4}, and {4 | 1, 2, 3}.
(b) We can do it in $\frac{4!}{2!\,2!} = \frac{24}{4} = 6$ ways: {1, 2 | 3, 4}, {1, 3 | 2, 4}, {1, 4 | 2, 3}, {2, 3 | 1, 4}, {2, 4 | 1, 3},
and {3, 4 | 1, 2}.
(c) We can do it in $\frac{4!}{1!\,1!\,2!} = \frac{24}{2} = 12$ ways: {1 | 2 | 3, 4}, {2 | 1 | 3, 4}, {1 | 3 | 2, 4}, {3 | 1 | 2, 4}, {1 | 4 | 2, 3},
{4 | 1 | 2, 3}, {2 | 3 | 1, 4}, {3 | 2 | 1, 4}, {2 | 4 | 1, 3}, {4 | 2 | 1, 3}, {3 | 4 | 1, 2}, and {4 | 3 | 1, 2}.
3. (a) The probability that the 80 wolves will prey on 10 deer, 70 beavers, no moose, and no animals
from the other group is
$$P(N_1 = 10, N_2 = 70, N_3 = 0, N_4 = 0) = \frac{80!}{10!\,70!\,0!\,0!}\,0.33^{10}\,0.55^{70}\,0.05^{0}\,0.07^{0} = \frac{80!}{10!\,70!}\,0.33^{10}\,0.55^{70}$$
(b) It is given that $N_2 = 60$, $N_4 = 16$ and $N_1 + N_3 = 4$. The probability is (we go through all
combinations of $N_1$ and $N_3$ whose sum is 4):
$$\begin{aligned}
&P(N_1 = 0, N_2 = 60, N_3 = 4, N_4 = 16) + P(N_1 = 1, N_2 = 60, N_3 = 3, N_4 = 16) \\
&\quad + P(N_1 = 2, N_2 = 60, N_3 = 2, N_4 = 16) + P(N_1 = 3, N_2 = 60, N_3 = 1, N_4 = 16) \\
&\quad + P(N_1 = 4, N_2 = 60, N_3 = 0, N_4 = 16) \\
&= \frac{80!}{0!\,60!\,4!\,16!}\,0.33^{0}\,0.55^{60}\,0.05^{4}\,0.07^{16}
 + \frac{80!}{1!\,60!\,3!\,16!}\,0.33^{1}\,0.55^{60}\,0.05^{3}\,0.07^{16} \\
&\quad + \frac{80!}{2!\,60!\,2!\,16!}\,0.33^{2}\,0.55^{60}\,0.05^{2}\,0.07^{16}
 + \frac{80!}{3!\,60!\,1!\,16!}\,0.33^{3}\,0.55^{60}\,0.05^{1}\,0.07^{16} \\
&\quad + \frac{80!}{4!\,60!\,0!\,16!}\,0.33^{4}\,0.55^{60}\,0.05^{0}\,0.07^{16}
\end{aligned}$$
5. The probability distribution of the genotype of an offspring of AB parents is P(AA) = 0.25,
P(AB) = 0.5, and P(BB) = 0.25. There is a total of 9 offspring. The probability is
$$P(\text{three AA, two AB, four BB}) = \frac{9!}{3!\,2!\,4!}\,0.25^{3}\,0.5^{2}\,0.25^{4} = \frac{9!}{6 \cdot 2 \cdot 24}\,0.25^{8} \approx 0.01923$$
i.e., close to 2%.
7. The probability distribution of the genotype of an offspring of LS parents is P(LL) = 0.25 =
P(long), P(LS) = 0.5 = P(medium length), and P(SS) = 0.25 = P(short).
(a) The probability is
$$P(\text{two LL, two LS, two SS}) = \frac{6!}{2!\,2!\,2!}\,0.25^{2}\,0.5^{2}\,0.25^{2} = \frac{6!}{8}\,0.25^{5} \approx 0.08789$$
(b) The probability is
$$\begin{aligned}
&P(\text{two LL, zero LS, four SS}) + P(\text{two LL, one LS, three SS}) \\
&= \frac{6!}{2!\,0!\,4!}\,0.25^{2}\,0.5^{0}\,0.25^{4} + \frac{6!}{2!\,1!\,3!}\,0.25^{2}\,0.5^{1}\,0.25^{3} \\
&\approx 0.00366 + 0.02930 = 0.03296
\end{aligned}$$
9. The probability distribution of the genotype of an offspring of AB parents is
P(AA) = 0.25 = P(neither carrier nor has the trait)
P(AB) = 0.5 = P(carrier)
P(BB) = 0.25 = P(has the trait)
The probability that one child will have attached earlobes, two will be carriers, and one will neither
be a carrier nor have attached earlobes is
$$P(\text{one AA, two AB, one BB}) = \frac{4!}{1!\,2!\,1!}(0.25)^{1}(0.5)^{2}(0.25)^{1} = 12\,(0.25)^{3} = 0.1875$$
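The multinomial probabilities in the genotype and wolf-prey exercises all follow one pattern, which can be sketched in Python as below. The function name `multinomial_pmf` and the variable names are illustrative assumptions; the probabilities 0.33, 0.55, 0.05, 0.07 are the ones used in the wolf-prey solution.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """n! / (n1! n2! ... nk!) * p1**n1 * ... * pk**nk, where n = n1 + ... + nk."""
    n = sum(counts)
    coeff = factorial(n)
    for c in counts:
        coeff //= factorial(c)  # each division is exact, so integer division is safe
    return coeff * prod(p**c for p, c in zip(probs, counts))


genotype = (0.25, 0.5, 0.25)  # P(AA), P(AB), P(BB), or P(LL), P(LS), P(SS)

p_three_two_four = multinomial_pmf((3, 2, 4), genotype)
p_two_two_two = multinomial_pmf((2, 2, 2), genotype)
p_one_two_one = multinomial_pmf((1, 2, 1), genotype)

# Wolf-prey case with N2 = 60, N4 = 16 and N1 + N3 = 4: add up the five combinations
prey = (0.33, 0.55, 0.05, 0.07)
p_wolves = sum(multinomial_pmf((n1, 60, 4 - n1, 16), prey) for n1 in range(5))

print(p_three_two_four, p_two_two_two, p_one_two_one, p_wolves)
```

Keeping the multinomial coefficient as an exact integer and converting to floating point only in the final product avoids overflow for counts as large as 80.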
11. (a) Consider the geometric distribution with probability of success p = 0.15. The probability of
the first success occurring on the fourth trial is
$$P(X = 4) = (1 - 0.15)^{3}(0.15) = (0.85)^{3}(0.15) \approx 0.092$$
(b) See below.
[Figure: histogram of the geometric distribution with p = 0.15; horizontal axis from 1 to 20, vertical axis from 0 to 0.18.]
13. (a) Consider the geometric distribution with probability of success p = 0.2. The probability of
the first success occurring on the third trial is
$$P(X = 3) = (1 - 0.2)^{2}(0.2) = (0.8)^{2}(0.2) = 0.128$$
(b) See below.
[Figure: histogram of the geometric distribution with p = 0.2; horizontal axis from 1 to 12, vertical axis from 0 to 0.2.]
15. (a) Consider the geometric distribution with probability of success p = 0.6. The probability that
the first success occurs on or after the fourth trial is (use the complementary event "success occurs
before the fourth trial")
$$\begin{aligned}
P(X \ge 4) &= 1 - P(X < 4) \\
&= 1 - [P(X = 1) + P(X = 2) + P(X = 3)] \\
&= 1 - [0.6 + (1 - 0.6)(0.6) + (1 - 0.6)^{2}(0.6)] = 0.064
\end{aligned}$$
(b) See below.
[Figure: histogram of the geometric distribution with p = 0.6; horizontal axis from 1 to 8, vertical axis from 0 to 0.6.]
17. (a) Consider the geometric distribution with probability of success p = 0.6. The probability that
the first success occurs on or before the fourth trial is
$$\begin{aligned}
P(X \le 4) &= P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) \\
&= 0.6 + (1 - 0.6)(0.6) + (1 - 0.6)^{2}(0.6) + (1 - 0.6)^{3}(0.6) \\
&= 0.6\,[1 + 0.4 + 0.4^{2} + 0.4^{3}] \\
&= 0.6\,\frac{1 - 0.4^{4}}{1 - 0.4} = 1 - 0.4^{4} = 0.9744
\end{aligned}$$
(In calculating the sum in the end, we used the formula $1 + q + q^{2} + q^{3} + \cdots + q^{n} = (1 - q^{n+1})/(1 - q)$
with q = 0.4 and n = 3.)
(b) See below.
[Figure: histogram of the geometric distribution with p = 0.6; horizontal axis from 1 to 8, vertical axis from 0 to 0.6.]
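The geometric probabilities in the exercises above can be reproduced with two short functions. This is a sketch in Python under the convention used in these solutions (X counts the trial on which the first success occurs, so X = 1, 2, 3, ...); the names `geom_pmf` and `geom_cdf` are illustrative.

```python
def geom_pmf(k: int, p: float) -> float:
    """P(X = k) = (1 - p)**(k - 1) * p: first success occurs on trial k."""
    return (1 - p) ** (k - 1) * p


def geom_cdf(k: int, p: float) -> float:
    """P(X <= k) = 1 - (1 - p)**k, the closed form of the finite geometric sum."""
    return 1 - (1 - p) ** k


p_fourth_trial = geom_pmf(4, 0.15)          # first success on the fourth trial
p_third_trial = geom_pmf(3, 0.2)            # first success on the third trial
p_on_or_after_4 = 1 - geom_cdf(3, 0.6)      # P(X >= 4) = 1 - P(X <= 3)
p_on_or_before_4 = geom_cdf(4, 0.6)         # P(X <= 4)

print(p_fourth_trial, p_third_trial, p_on_or_after_4, p_on_or_before_4)
```

The closed form in `geom_cdf` is the same geometric-sum formula used in the hand calculation, so summing `geom_pmf` over k = 1, ..., 4 should agree with it.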
19. A geometric distribution with larger p is less spread out than one with smaller p (look at the
histograms in Figure 11.1). Thus, the geometric distribution with $p_2 = p/2$ is more spread out than
the one with $p_1 = p$.
Formally: the variances are $\mathrm{var}_1 = (1 - p)/p^{2}$ and
$$\mathrm{var}_2 = \frac{1 - \frac{p}{2}}{\left(\frac{p}{2}\right)^{2}} = \frac{1 - \frac{p}{2}}{\frac{p^{2}}{4}} = \frac{4 - 2p}{p^{2}}$$
From $4 - 2p > 1 - p$ (which is true whenever p < 3) we conclude
$$\mathrm{var}_1 = \frac{1 - p}{p^{2}} < \frac{4 - 2p}{p^{2}} = \mathrm{var}_2$$
So, $p_2 = p/2$ yields larger variance (thus, wider spread) than $p_1 = p$.
21. From E(X) = 1/p = 5 we get p = 0.2. The variance is $\mathrm{var}(X) = (1 - p)/p^{2} = 0.8/0.04 = 20$ and
the standard deviation is $\sqrt{20} \approx 4.472$.
23. From $\mathrm{var} = (1 - p)/p^{2} = 2$ we get $2p^{2} = 1 - p$ and $2p^{2} + p - 1 = (2p - 1)(p + 1) = 0$. Thus p = 1/2
(the remaining solution p = −1 makes no sense) and so the mean is 1/(1/2) = 2.
25. Let X = number of trials until gene mutates. X is a geometrically distributed random variable
with the probability of success p = 0.001.
The probability that a gene will mutate during the 20th cell division is
$$P(X = 20) = (1 - 0.001)^{19}(0.001) \approx 0.00098$$
The probability that the gene will mutate before or during the 20th cell division is
$$\begin{aligned}
P(X \le 20) &= P(X = 1) + P(X = 2) + P(X = 3) + \cdots + P(X = 20) \\
&= 0.001 + (0.999)(0.001) + (0.999)^{2}(0.001) + \cdots + (0.999)^{19}(0.001) \\
&= 0.001\,[1 + 0.999 + (0.999)^{2} + \cdots + (0.999)^{19}] \\
&= 0.001\,\frac{1 - 0.999^{20}}{1 - 0.999} = 1 - 0.999^{20} \approx 0.0198
\end{aligned}$$
(In calculating the sum we used the formula $1 + q + q^{2} + q^{3} + \cdots + q^{n} = (1 - q^{n+1})/(1 - q)$ with
q = 0.999 and n = 19.)
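27. (a) We compute
$$\begin{aligned}
s_n - q s_n &= 1 + q + q^{2} + q^{3} + \cdots + q^{n} - q(1 + q + q^{2} + q^{3} + \cdots + q^{n}) \\
s_n(1 - q) &= 1 + q + q^{2} + q^{3} + \cdots + q^{n} - q - q^{2} - q^{3} - \cdots - q^{n} - q^{n+1} \\
s_n(1 - q) &= 1 - q^{n+1} \\
s_n &= \frac{1 - q^{n+1}}{1 - q}
\end{aligned}$$
(b) Since |q| < 1, it follows that the limit of $q^{n+1}$ as $n \to \infty$ is zero. Thus,
$$\lim_{n \to \infty} s_n = \lim_{n \to \infty} \frac{1 - q^{n+1}}{1 - q} = \frac{1}{1 - q}$$
i.e.,
$$1 + q + q^{2} + q^{3} + \cdots = \frac{1}{1 - q}$$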
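29. (a) Differentiating
$$1 + q + q^{2} + q^{3} + \cdots = \frac{1}{1 - q}$$
with respect to q, we get
$$1 + 2q + 3q^{2} + 4q^{3} + \cdots = -(1 - q)^{-2}(-1) = \frac{1}{(1 - q)^{2}}$$
Replacing q by 1 − p yields
$$\begin{aligned}
1 + 2(1 - p) + 3(1 - p)^{2} + 4(1 - p)^{3} + \cdots &= \frac{1}{(1 - (1 - p))^{2}} \\
\sum_{k=1}^{\infty} k(1 - p)^{k-1} &= \frac{1}{p^{2}}
\end{aligned}$$
(b) Differentiating
$$1 + q + q^{2} + q^{3} + \cdots = \frac{1}{1 - q}$$
with respect to q, then multiplying by q and differentiating with respect to q again, we obtain
$$\begin{aligned}
1 + 2q + 3q^{2} + 4q^{3} + \cdots &= \frac{1}{(1 - q)^{2}} \\
q + 2q^{2} + 3q^{3} + 4q^{4} + \cdots &= \frac{q}{(1 - q)^{2}} \\
1 + 2^{2}q + 3^{2}q^{2} + 4^{2}q^{3} + \cdots &= \frac{(1 - q)^{2} - q \cdot 2(1 - q)(-1)}{(1 - q)^{4}} = \frac{(1 - q) + 2q}{(1 - q)^{3}} \\
1 + 2^{2}q + 3^{2}q^{2} + 4^{2}q^{3} + \cdots &= \frac{q + 1}{(1 - q)^{3}}
\end{aligned}$$
Replacing q by 1 − p and then multiplying by p yields
$$\begin{aligned}
1 + 2^{2}(1 - p) + 3^{2}(1 - p)^{2} + 4^{2}(1 - p)^{3} + \cdots &= \frac{(1 - p) + 1}{(1 - (1 - p))^{3}} \\
\sum_{k=1}^{\infty} k^{2}(1 - p)^{k-1} &= \frac{2 - p}{p^{3}} \\
p\sum_{k=1}^{\infty} k^{2}(1 - p)^{k-1} &= \frac{2 - p}{p^{2}}
\end{aligned}$$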
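The two series identities obtained by differentiation can be sanity-checked numerically by comparing a long partial sum with the closed form. The Python sketch below does this for the sample value p = 0.2; the function names, the choice of p, and the number of terms are arbitrary assumptions made for illustration.

```python
def series_k(p: float, terms: int = 5000) -> float:
    """Partial sum of k * (1 - p)**(k - 1) for k = 1, ..., terms."""
    q = 1.0 - p
    return sum(k * q ** (k - 1) for k in range(1, terms + 1))


def series_k_squared(p: float, terms: int = 5000) -> float:
    """Partial sum of k**2 * (1 - p)**(k - 1) for k = 1, ..., terms."""
    q = 1.0 - p
    return sum(k * k * q ** (k - 1) for k in range(1, terms + 1))


p = 0.2

lhs_a = series_k(p)               # should be close to 1 / p**2
rhs_a = 1 / p**2

lhs_b = p * series_k_squared(p)   # should be close to (2 - p) / p**2
rhs_b = (2 - p) / p**2

print(lhs_a, rhs_a)
print(lhs_b, rhs_b)
```

Because |1 − p| < 1, the neglected tail of each series is geometrically small, so a few thousand terms are far more than enough for this value of p.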
Section 12 The Poisson Distribution
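1. It is given that $X \sim \mathrm{Po}(2.5)$. Using
$$P(X = k) = \frac{e^{-\lambda}\lambda^{k}}{k!} = \frac{e^{-2.5}(2.5)^{k}}{k!}$$
we obtain
$$\begin{aligned}
P(X = 0) &= \frac{e^{-2.5}(2.5)^{0}}{0!} = e^{-2.5} \approx 0.0820850 \\
P(X = 1) &= \frac{e^{-2.5}(2.5)^{1}}{1!} \approx 0.205212 \\
P(X = 2) &= \frac{e^{-2.5}(2.5)^{2}}{2!} \approx 0.256516 \\
P(X = 3) &= \frac{e^{-2.5}(2.5)^{3}}{3!} \approx 0.213763 \\
P(X = 4) &= \frac{e^{-2.5}(2.5)^{4}}{4!} \approx 0.133602
\end{aligned}$$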
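3. It is given that $X \sim \mathrm{Po}(12)$. We find
$$\begin{aligned}
P(4 \le X \le 7) &= P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) \\
&= \frac{e^{-12}12^{4}}{4!} + \frac{e^{-12}12^{5}}{5!} + \frac{e^{-12}12^{6}}{6!} + \frac{e^{-12}12^{7}}{7!} \\
&= e^{-12}\left(\frac{12^{4}}{4!} + \frac{12^{5}}{5!} + \frac{12^{6}}{6!} + \frac{12^{7}}{7!}\right) \approx 0.087213
\end{aligned}$$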
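5. It is given that $X \sim \mathrm{Po}(4)$. We find
$$\begin{aligned}
P(0 \le X \le 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= \frac{e^{-4}4^{0}}{0!} + \frac{e^{-4}4^{1}}{1!} + \frac{e^{-4}4^{2}}{2!} + \frac{e^{-4}4^{3}}{3!} \\
&= e^{-4}\left(1 + 4 + 8 + \frac{32}{3}\right) \approx 0.433470
\end{aligned}$$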
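The Poisson probabilities computed above can be verified with a direct implementation of the probability mass function. The following Python sketch uses illustrative names (`poisson_pmf`, `poisson_range`) that do not come from the text.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = exp(-lam) * lam**k / k! for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_range(lo: int, hi: int, lam: float) -> float:
    """P(lo <= X <= hi), adding the mass function over lo, lo + 1, ..., hi."""
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))


first_five = [poisson_pmf(k, 2.5) for k in range(5)]   # P(X = 0), ..., P(X = 4) for Po(2.5)
p_4_to_7 = poisson_range(4, 7, 12)                      # P(4 <= X <= 7) for Po(12)
p_0_to_3 = poisson_range(0, 3, 4)                       # P(0 <= X <= 3) for Po(4)

print(first_five)
print(p_4_to_7, p_0_to_3)
```

The same two functions cover the "at least" questions as well, through the complement: for example P(X ≥ 2) = 1 − `poisson_range(0, 1, lam)`.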
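7. Look at Figure 12.1. There are two identical probabilities, corresponding to the values $P(X = \lambda - 1)$
and $P(X = \lambda)$. Thus, the given graph represents the Poisson distribution with $\lambda = 4$.
We now prove that the observation we made is indeed true. Assume that $\lambda$ is an integer, $\lambda \ge 1$,
and that $X \sim \mathrm{Po}(\lambda)$. Then
$$P(X = \lambda - 1) = \frac{e^{-\lambda}\lambda^{\lambda - 1}}{(\lambda - 1)!} = \frac{e^{-\lambda}\lambda^{\lambda - 1}}{(\lambda - 1)!} \cdot \frac{\lambda}{\lambda} = \frac{e^{-\lambda}\lambda^{\lambda}}{\lambda!} = P(X = \lambda)$$
(In the above, we used the fact that $(\lambda - 1)!\,\lambda = \lambda!$.)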
9. Define X = number of people with a respiratory infection in a group of 5,000 people. The
occurrence of 3 out of 2,000 translates to (multiply by 2.5) 7.5 out of 5,000. Thus, X has a Poisson
distribution with parameter $\lambda = 7.5$. The probability that 12 out of 5,000 people are diagnosed with
the infection is
$$P(X = 12) = \frac{e^{-7.5}(7.5)^{12}}{12!} \approx 0.036575$$
11. Let X = number of more serious traffic accidents per week. Then X has a Poisson distribution
with parameter $\lambda = 4$. The probability that at least two more serious accidents happen in a week is
$$\begin{aligned}
P(X \ge 2) &= 1 - P(X < 2) = 1 - (P(X = 0) + P(X = 1)) \\
&= 1 - \left(\frac{e^{-4}(4)^{0}}{0!} + \frac{e^{-4}(4)^{1}}{1!}\right) = 1 - 5e^{-4} \approx 0.908422
\end{aligned}$$
13. Define X = number of spoiled apples in a bag of 15 apples. The occurrence of 2 spoiled
apples in a bag of 30 apples translates to 1 spoiled apple in a bag of 15 apples. Thus, X has a Poisson
distribution with parameter $\lambda = 1$. The probability that there are no more than two spoiled apples in
the bag of 15 apples is
$$\begin{aligned}
P(X \le 2) &= P(X = 0) + P(X = 1) + P(X = 2) \\
&= \frac{e^{-1}(1)^{0}}{0!} + \frac{e^{-1}(1)^{1}}{1!} + \frac{e^{-1}(1)^{2}}{2!} \\
&= e^{-1}(1 + 1 + 0.5) = 2.5e^{-1} \approx 0.919699
\end{aligned}$$
15. Define X = number of heavy metal particles in a half-litre glass of tap water. The occurrence
of six heavy metal particles in 1 L of tap water translates to three heavy metal particles in 1/2 L of
tap water. Thus, X has a Poisson distribution with parameter $\lambda = 3$. The probability that there are no
heavy metal particles in a half-litre glass of tap water is
$$P(X = 0) = \frac{e^{-3}(3)^{0}}{0!} = e^{-3} \approx 0.049787$$
i.e., a bit less than 5%.
17. Define X = number of molecules leaving the region by the end of the second hour. The rate
of 0.4 molecules per hour translates to the rate of 0.8 molecules per two hours. Thus, X has a Poisson
distribution with parameter $\lambda = 0.8$. The probability that three or fewer molecules leave by the end
of the second hour is
$$\begin{aligned}
P(X \le 3) &= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \\
&= \frac{e^{-0.8}(0.8)^{0}}{0!} + \frac{e^{-0.8}(0.8)^{1}}{1!} + \frac{e^{-0.8}(0.8)^{2}}{2!} + \frac{e^{-0.8}(0.8)^{3}}{3!} \approx 0.990920
\end{aligned}$$
19. Define X = number of hits by cosmic rays in an eight-hour interval. The rate of one cosmic ray
per day translates to the rate of 1/3 cosmic rays per eight hours. Thus, X has a Poisson distribution
with parameter $\lambda = 1/3$. The probability that we will be hit at least once in an eight-hour interval is
$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - \frac{e^{-1/3}(1/3)^{0}}{0!} = 1 - e^{-1/3} \approx 0.283469$$
21. Let X = number of text messages received in an hour. The context implies that X has a Poisson
distribution with parameter $\lambda = 3$. The probability that the student receives more than five messages
in an hour is
$$\begin{aligned}
P(X > 5) &= 1 - P(X \le 5) \\
&= 1 - (P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)) \\
&= 1 - \left(\frac{e^{-3}(3)^{0}}{0!} + \frac{e^{-3}(3)^{1}}{1!} + \frac{e^{-3}(3)^{2}}{2!} + \frac{e^{-3}(3)^{3}}{3!} + \frac{e^{-3}(3)^{4}}{4!} + \frac{e^{-3}(3)^{5}}{5!}\right) \\
&= 1 - \frac{92}{5}e^{-3} \approx 0.083918
\end{aligned}$$
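23. Since $X \sim \mathrm{Po}(1)$ and $Y \sim \mathrm{Po}(9)$, it follows that (assuming independence) $X + Y \sim \mathrm{Po}(1 + 9)$,
i.e., $X + Y \sim \mathrm{Po}(10)$. Thus
$$P(X + Y = 2) = \frac{e^{-10}(10)^{2}}{2!} = 50e^{-10} \approx 0.002270$$
and
$$\begin{aligned}
P(Y = 2 \mid X + Y = 2) &= \frac{P(Y = 2 \text{ and } X + Y = 2)}{P(X + Y = 2)} = \frac{P(Y = 2 \text{ and } X = 0)}{P(X + Y = 2)} \\
&= \frac{P(Y = 2)\,P(X = 0)}{P(X + Y = 2)} = \frac{\dfrac{e^{-9}(9)^{2}}{2!} \cdot \dfrac{e^{-1}(1)^{0}}{0!}}{50e^{-10}} = \frac{(9)^{2}}{50 \cdot 2!} = 0.81
\end{aligned}$$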
25. Define T = number of text messages in an hour and C = number of phone calls in an hour.
It is given that $T \sim \mathrm{Po}(4)$ and $C \sim \mathrm{Po}(2)$. Let I = T + C = number of interruptions in an hour.
Assuming independence, $I \sim \mathrm{Po}(6)$. The probability that the student will experience no interruptions
in 1 hour is
$$P(I = 0) = \frac{e^{-6}(6)^{0}}{0!} = e^{-6} \approx 0.002479$$
Let J = number of interruptions in a ten-minute interval. Then $J \sim \mathrm{Po}(1)$, and the probability
that the student will experience one interruption every 10 minutes is
$$P(J = 1) = \frac{e^{-1}(1)^{1}}{1!} = e^{-1} \approx 0.367879$$
27. Define
$$A = \begin{cases} 1 & \text{a person experiences serious side effects from allergy medication (success)} \\ 0 & \text{a person does not experience serious side effects from allergy medication} \end{cases}$$
A is a Bernoulli random variable with probability of success equal to 0.003. Let N = number of
people experiencing serious side effects from allergy medication in a group of 200 people. Since N
counts the number of successes in 200 independent repetitions of the event A, it follows that N is a
binomially distributed random variable with n = 200 and p = 0.003. Thus, the probability that in a
group of 200 people nobody experiences serious side effects is
$$b(0, 200; 0.003) = \binom{200}{0}(0.003)^{0}(0.997)^{200} = (0.997)^{200} \approx 0.548317$$
Using Poisson approximation (recall that $b(k, n; p) \approx P(X = k)$ if $X \sim \mathrm{Po}(np)$), we get
$$b(0, 200; 0.003) \approx P(X = 0)$$
where $X \sim \mathrm{Po}(200 \cdot 0.003 = 0.6)$. Thus
$$P(X = 0) = \frac{e^{-0.6}(0.6)^{0}}{0!} = e^{-0.6} \approx 0.548812$$
29. Define
$$L = \begin{cases} 1 & \text{a person has serious consequences from lactose intolerance (success)} \\ 0 & \text{a person does not have serious consequences from lactose intolerance} \end{cases}$$
L is a Bernoulli random variable with probability of success equal to 0.002. Let N = number of
people who have serious consequences from lactose intolerance in a group of 500 people. Since N
counts the number of successes in 500 independent repetitions of the event L, it follows that N is a
binomially distributed random variable with n = 500 and p = 0.002. Thus, the probability that in a
group of 500 people one person experiences serious consequences from lactose intolerance is
$$b(1, 500; 0.002) = \binom{500}{1}(0.002)^{1}(0.998)^{499} = (0.998)^{499} \approx 0.368248$$
Using Poisson approximation (recall that $b(k, n; p) \approx P(X = k)$ if $X \sim \mathrm{Po}(np)$), we get
$$b(1, 500; 0.002) \approx P(X = 1)$$
where $X \sim \mathrm{Po}(500 \cdot 0.002 = 1)$. Thus
$$P(X = 1) = \frac{e^{-1}(1)^{1}}{1!} = e^{-1} \approx 0.367879$$
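The quality of the Poisson approximation to the binomial distribution in the two exercises above can be seen by computing both quantities side by side. The Python sketch below is one way to do it; the function and variable names are illustrative assumptions.

```python
from math import comb, exp, factorial


def binom_pmf(k: int, n: int, p: float) -> float:
    """b(k, n; p): exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


# Allergy medication: n = 200, p = 0.003, nobody affected
exact_allergy = binom_pmf(0, 200, 0.003)
approx_allergy = poisson_pmf(0, 200 * 0.003)

# Lactose intolerance: n = 500, p = 0.002, exactly one person affected
exact_lactose = binom_pmf(1, 500, 0.002)
approx_lactose = poisson_pmf(1, 500 * 0.002)

print(exact_allergy, approx_allergy)
print(exact_lactose, approx_lactose)
```

In both cases n is large and p is small, which is the regime in which replacing b(k, n; p) by the Poisson probability with parameter np is expected to work well.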
Section 13 Continuous Random Variables
1. The function $f(x) = 1 - x^{2}$, $x \in [0, 2]$, cannot be a probability density function because $f(x) \ge 0$
does not hold on [0, 2]. For instance, $f(1.5) = 1 - (1.5)^{2} = -1.25$.
3. To satisfy $f(x) \ge 0$, we need $a \ge 0$ (actually, we need a > 0; if a = 0, then f is identically zero and
cannot be a probability density function). As well, the integral of f has to be equal to 1:
$$\int_{1}^{10} \frac{a}{x}\,dx = a\ln|x|\,\Big|_{1}^{10} = a\ln 10 - a\ln 1 = a\ln 10 = 1$$
Thus, a = 1/ln 10.
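5. To satisfy $f(x) \ge 0$, we need $a \ge 0$ (actually we need a > 0; if a = 0, then f is identically zero and
cannot be a probability density function). As well, the integral of f has to be 1:
$$\int_{0}^{\infty} \frac{a}{1 + x^{2}}\,dx = a\arctan x\,\Big|_{0}^{\infty} = a\arctan(\infty) - a\arctan 0 = a\,\frac{\pi}{2} = 1$$
(since arctan 0 = 0). Thus, $a = 2/\pi$.
In the above, we abbreviated the calculation of the improper integral. Without skipping steps:
$$\begin{aligned}
\int_{0}^{\infty} \frac{a}{1 + x^{2}}\,dx &= \lim_{T \to \infty} \int_{0}^{T} \frac{a}{1 + x^{2}}\,dx \\
&= a\lim_{T \to \infty} \arctan x\,\Big|_{0}^{T} = a\lim_{T \to \infty}(\arctan T - \arctan 0) = a\,\frac{\pi}{2}
\end{aligned}$$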
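7. Clearly, $f(x) = 2/x^{3}$ is positive for $x \in [1, \infty)$. As well,
$$\begin{aligned}
\int_{1}^{\infty} \frac{2}{x^{3}}\,dx &= \lim_{T \to \infty} \int_{1}^{T} \frac{2}{x^{3}}\,dx \\
&= \lim_{T \to \infty} \frac{2x^{-2}}{-2}\,\Big|_{1}^{T} = \lim_{T \to \infty} \left(-\frac{1}{x^{2}}\right)\Big|_{1}^{T} = \lim_{T \to \infty}\left(-\frac{1}{T^{2}} + \frac{1}{1^{2}}\right) = 0 + 1 = 1
\end{aligned}$$
The mean is
$$\begin{aligned}
\mu &= \int_{1}^{\infty} x\,\frac{2}{x^{3}}\,dx = \lim_{T \to \infty} \int_{1}^{T} \frac{2}{x^{2}}\,dx \\
&= \lim_{T \to \infty} \frac{2x^{-1}}{-1}\,\Big|_{1}^{T} = \lim_{T \to \infty}\left(-\frac{2}{x}\right)\Big|_{1}^{T} = \lim_{T \to \infty}\left(-\frac{2}{T} + \frac{2}{1}\right) = 0 + 2 = 2
\end{aligned}$$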
9. No. Let f(x) = a for $x \in [0, \infty)$, where a > 0 is a constant. Since
$$\int_{0}^{\infty} a\,dx = \lim_{T \to \infty} \int_{0}^{T} a\,dx = \lim_{T \to \infty} ax\,\Big|_{0}^{T} = \lim_{T \to \infty}(aT) = \infty$$
the integral of f cannot be equal to 1, no matter what value of a is used.
11. Using the probability density function, we compute
$$P(0.5 \le X \le 2) = \int_{0.5}^{2} (0.3 + 0.2x)\,dx = (0.3x + 0.1x^{2})\,\Big|_{0.5}^{2} = (0.6 + 0.4) - (0.15 + 0.025) = 0.825$$
The cumulative distribution function of f(x) is
$$F(x) = \int_{0}^{x} (0.3 + 0.2t)\,dt = (0.3t + 0.1t^{2})\,\Big|_{0}^{x} = (0.3x + 0.1x^{2}) - 0 = 0.3x + 0.1x^{2}$$
for x in [0, 2]. Thus,
$$P(0.5 \le X \le 2) = F(2) - F(0.5) = [(0.3)(2) + (0.1)(2)^{2}] - [(0.3)(0.5) + (0.1)(0.5)^{2}] = 0.825$$
13. Using the probability density function, we compute
$$P(1 \le X \le 2) = \int_{1}^{2} \frac{1}{x}\,dx = \ln|x|\,\Big|_{1}^{2} = \ln 2 - \ln 1 = \ln 2$$
The cumulative distribution function of f(x) is
$$F(x) = \int_{1}^{x} \frac{1}{t}\,dt = \ln|t|\,\Big|_{1}^{x} = \ln x - \ln 1 = \ln x$$
for x in [1, e]. Thus,
$$P(1 \le X \le 2) = F(2) - F(1) = \ln 2 - \ln 1 = \ln 2$$
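15. We check the properties listed in Theorem 13:
(a) Since $e^{-2x} \le 1$ for $x \ge 0$, it follows that $F(x) = 1 - e^{-2x} \ge 0$ for all $x \in [0, \infty)$. As well, $e^{-2x} > 0$,
and thus $F(x) = 1 - e^{-2x} \le 1$ for all $x \in [0, \infty)$.
(b) The function F(x) is continuous for all x, as the difference of two continuous functions. The fact
that $F'(x) = -e^{-2x}(-2) = 2e^{-2x} > 0$ implies that F(x) is increasing (thus, it is non-decreasing) for
all $x \in [0, \infty)$.
(c) The limits:
$$\lim_{x \to 0} F(x) = \lim_{x \to 0} (1 - e^{-2x}) = 1 - e^{0} = 0$$
and
$$\lim_{x \to \infty} F(x) = \lim_{x \to \infty} (1 - e^{-2x}) = 1 - e^{-\infty} = 1$$
Thus $F(x) = 1 - e^{-2x}$, $x \in [0, \infty)$, is indeed a cumulative distribution function. The corresponding
probability density function is $f(x) = F'(x) = 2e^{-2x}$.
The expected value is given by
$$\mu = \int_{0}^{\infty} x(2e^{-2x})\,dx = 2\int_{0}^{\infty} xe^{-2x}\,dx$$
First we calculate the indefinite integral (using integration by parts): let $u = x$ and $v' = e^{-2x}$. Then
$u' = 1$, $v = -e^{-2x}/2$, and
$$\begin{aligned}
\int xe^{-2x}\,dx &= uv - \int vu'\,dx \\
&= -\frac{1}{2}xe^{-2x} + \frac{1}{2}\int e^{-2x}\,dx \\
&= -\frac{1}{2}xe^{-2x} - \frac{1}{4}e^{-2x} = -\frac{1}{4}(2x + 1)e^{-2x}
\end{aligned}$$
Thus
$$\begin{aligned}
\mu &= 2\int_{0}^{\infty} xe^{-2x}\,dx = 2\lim_{T \to \infty} \int_{0}^{T} xe^{-2x}\,dx \\
&= 2\lim_{T \to \infty} \left(-\frac{1}{4}(2x + 1)e^{-2x}\right)\Big|_{0}^{T} \\
&= 2\left(\lim_{T \to \infty}\left(-\frac{1}{4}(2T + 1)e^{-2T}\right) - \left(-\frac{1}{4}\right)\right) = 2\left(0 + \frac{1}{4}\right) = \frac{1}{2}
\end{aligned}$$
Recall that
$$\lim_{T \to \infty} e^{-2T} = 0$$
and, by L'Hôpital's rule,
$$\lim_{T \to \infty} Te^{-2T} = \lim_{T \to \infty} \frac{T}{e^{2T}} = \lim_{T \to \infty} \frac{1}{2e^{2T}} = 0$$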
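17. (a) Clearly, $f(x) = 2x \ge 0$ for $x \in [0, 1]$. As well,
$$\int_{0}^{1} f(x)\,dx = \int_{0}^{1} 2x\,dx = x^{2}\,\Big|_{0}^{1} = 1 - 0 = 1$$
(b) The cumulative distribution function is
$$F(x) = \int_{0}^{x} 2t\,dt = t^{2}\,\Big|_{0}^{x} = x^{2}$$
for $x \in [0, 1]$.
(c) The expected value of X is
$$\mu = E(X) = \int_{0}^{1} x(2x)\,dx = \frac{2x^{3}}{3}\,\Big|_{0}^{1} = \frac{2}{3}$$
(d) We find
$$P(X \le \mu) = P(X \le 2/3) = F(2/3) = \left(\frac{2}{3}\right)^{2} = \frac{4}{9}$$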
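19. (a) Clearly, $f(x) = 3x^{2} \ge 0$ for $x \in [0, 1]$. As well,
$$\int_{0}^{1} f(x)\,dx = \int_{0}^{1} 3x^{2}\,dx = x^{3}\,\Big|_{0}^{1} = 1 - 0 = 1$$
(b) The cumulative distribution function is
$$F(x) = \int_{0}^{x} 3t^{2}\,dt = t^{3}\,\Big|_{0}^{x} = x^{3}$$
for $x \in [0, 1]$.
(c) The expected value of X is
$$\mu = E(X) = \int_{0}^{1} x(3x^{2})\,dx = \frac{3x^{4}}{4}\,\Big|_{0}^{1} = \frac{3}{4}$$
(d) We find
$$P(X \le \mu) = P(X \le 3/4) = F(3/4) = \left(\frac{3}{4}\right)^{3} = \frac{27}{64}$$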
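21. (a) From $0 \le x \le 3$ we get (after multiplying by 2/9) $0 \le 2x/9 \le 2/3$. Thus, $2/3 - 2x/9 \ge 0$ for
$x \in [0, 3]$. As well,
$$\int_{0}^{3} f(x)\,dx = \int_{0}^{3} \left(\frac{2}{3} - \frac{2x}{9}\right)dx = \left(\frac{2x}{3} - \frac{x^{2}}{9}\right)\Big|_{0}^{3} = (2 - 1) - 0 = 1$$
(b) The cumulative distribution function is
$$F(x) = \int_{0}^{x} \left(\frac{2}{3} - \frac{2t}{9}\right)dt = \left(\frac{2t}{3} - \frac{t^{2}}{9}\right)\Big|_{0}^{x} = \frac{2x}{3} - \frac{x^{2}}{9}$$
for $x \in [0, 3]$.
(c) The expected value of X is
$$\mu = E(X) = \int_{0}^{3} x\left(\frac{2}{3} - \frac{2x}{9}\right)dx = \int_{0}^{3} \left(\frac{2x}{3} - \frac{2x^{2}}{9}\right)dx = \left(\frac{x^{2}}{3} - \frac{2x^{3}}{27}\right)\Big|_{0}^{3} = (3 - 2) - 0 = 1$$
(d) We find
$$P(X \le \mu) = P(X \le 1) = F(1) = \left(\frac{2}{3} - \frac{1}{9}\right) = \frac{5}{9}$$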
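23. We compute
$$\begin{aligned}
\mu = E(X) &= \int_{0}^{1} x(3x^{2})\,dx = \frac{3x^{4}}{4}\,\Big|_{0}^{1} = \frac{3}{4} = 0.75 \\
E(X^{2}) &= \int_{0}^{1} x^{2}(3x^{2})\,dx = \frac{3x^{5}}{5}\,\Big|_{0}^{1} = \frac{3}{5} \\
\mathrm{var}(X) &= E(X^{2}) - (E(X))^{2} = \frac{3}{5} - \frac{9}{16} = \frac{3}{80}
\end{aligned}$$
and $\sigma = \sqrt{\mathrm{var}(X)} = \sqrt{3/80} \approx 0.19365$. The probability that the values of X are at most one standard
deviation away from the mean is
$$\begin{aligned}
P(\mu - \sigma \le X \le \mu + \sigma) &= \int_{\mu - \sigma}^{\mu + \sigma} 3x^{2}\,dx \\
&= x^{3}\,\Big|_{\mu - \sigma}^{\mu + \sigma} = (\mu + \sigma)^{3} - (\mu - \sigma)^{3} \\
&= (0.75 + 0.19365)^{3} - (0.75 - 0.19365)^{3} \approx 0.668093
\end{aligned}$$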
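The mean, variance, and one-standard-deviation probability for the density f(x) = 3x² on [0, 1] can also be checked by numerical integration. The sketch below uses a plain midpoint rule written in Python; the helper `integrate`, its step count, and the variable names are assumptions made for illustration and are not part of the solution.

```python
from math import sqrt


def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def density(x: float) -> float:
    """Probability density f(x) = 3x^2 on [0, 1]."""
    return 3 * x**2


mean = integrate(lambda x: x * density(x), 0.0, 1.0)
second_moment = integrate(lambda x: x**2 * density(x), 0.0, 1.0)
variance = second_moment - mean**2
sigma = sqrt(variance)

# Probability of falling within one standard deviation of the mean
p_within = integrate(density, mean - sigma, mean + sigma)

print(mean, variance, sigma, p_within)
```

The numerical probability agrees with the value in the text to about four decimal places; the small remaining difference comes from the text rounding σ to 0.19365 before cubing.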
25. The cumulative distribution function is
$$F(x) = \int_{0}^{x} 3t^{2}\,dt = t^{3}\,\Big|_{0}^{x} = x^{3}$$
for $x \in [0, 1]$. The median is the value x where F(x) = 1/2, i.e., where $x^{3} = 1/2$. Thus, the median is
$\sqrt[3]{1/2}$.
27. We are looking for a number $Q_3$ such that $P(X \le Q_3) = 0.75$.
$$\begin{aligned}
\int_{0}^{Q_3} \left(\frac{2}{3} - \frac{2x}{9}\right)dx &= 0.75 \\
\left(\frac{2x}{3} - \frac{x^{2}}{9}\right)\Big|_{0}^{Q_3} &= 0.75 \\
\frac{2Q_3}{3} - \frac{Q_3^{2}}{9} &= 0.75
\end{aligned}$$
Multiplying by 9 and using the quadratic formula, we get
$$\begin{aligned}
Q_3^{2} - 6Q_3 + 6.75 &= 0 \\
Q_3 &= \frac{6 \pm \sqrt{36 - 27}}{2}
\end{aligned}$$
So $Q_3 = 1.5$ or $Q_3 = 4.5$. Since the probability density function and the cumulative distribution
function are defined on $0 \le x \le 3$, the upper quartile is 1.5.
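Medians and quartiles like the ones above solve F(x) = q for a continuous, non-decreasing cumulative distribution function F, so they can also be located by bisection. The Python sketch below illustrates this; the function name `quantile`, the tolerance, and the lambda expressions for the two distribution functions are illustrative assumptions.

```python
def quantile(cdf, q: float, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Solve cdf(x) = q on [lo, hi] by bisection, assuming cdf is non-decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return (lo + hi) / 2


# Upper quartile for F(x) = 2x/3 - x^2/9 on [0, 3]
upper_quartile = quantile(lambda x: 2 * x / 3 - x**2 / 9, 0.75, 0.0, 3.0)

# Median for F(x) = x^3 on [0, 1]
median = quantile(lambda x: x**3, 0.5, 0.0, 1.0)

print(upper_quartile, median)
```

Restricting the search to the interval on which the distribution is defined automatically discards the extraneous root 4.5 of the quadratic equation.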
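29. The average lifetime of the tree is given by the integral
$$\int_{0}^{\infty} t\,f(t)\,dt = \int_{0}^{\infty} t \cdot 0.01e^{-0.01t}\,dt = \lim_{T \to \infty} \int_{0}^{T} 0.01te^{-0.01t}\,dt$$
To calculate the indefinite integral, we use integration by parts with $u = t$ and $v' = e^{-0.01t}$. Then
$u' = 1$, $v = -e^{-0.01t}/0.01 = -100e^{-0.01t}$, and
$$\begin{aligned}
\int 0.01te^{-0.01t}\,dt &= 0.01\left(uv - \int vu'\,dt\right) \\
&= 0.01\left(-100te^{-0.01t} + \int 100e^{-0.01t}\,dt\right) \\
&= 0.01\left(-100te^{-0.01t} - 10000e^{-0.01t}\right) \\
&= -te^{-0.01t} - 100e^{-0.01t}
\end{aligned}$$
Thus,
$$\begin{aligned}
\lim_{T \to \infty} \int_{0}^{T} 0.01te^{-0.01t}\,dt &= \lim_{T \to \infty} \left(-te^{-0.01t} - 100e^{-0.01t}\right)\Big|_{0}^{T} \\
&= \lim_{T \to \infty} \left[\left(-Te^{-0.01T} - 100e^{-0.01T}\right) - (0 - 100)\right] \\
&= 100
\end{aligned}$$
because
$$\lim_{T \to \infty} e^{-0.01T} = 0$$
and, by L'Hôpital's rule,
$$\lim_{T \to \infty} Te^{-0.01T} = \lim_{T \to \infty} \frac{T}{e^{0.01T}} = \lim_{T \to \infty} \frac{1}{0.01e^{0.01T}} = 0$$
Thus, the average lifetime of a tree is 100 years.
The probability that a tree will live longer than 70 years is
$$\begin{aligned}
P &= \int_{70}^{\infty} f(t)\,dt = \int_{70}^{\infty} 0.01e^{-0.01t}\,dt \\
&= \lim_{T \to \infty} \int_{70}^{T} 0.01e^{-0.01t}\,dt \\
&= \lim_{T \to \infty} \left(-e^{-0.01t}\right)\Big|_{70}^{T} \\
&= \lim_{T \to \infty} \left(-e^{-0.01T} + e^{-0.01(70)}\right) \\
&= e^{-0.7} \approx 0.49659
\end{aligned}$$
i.e., about 50%.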
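31. The probability is
$$\begin{aligned}
P(\text{distance} \le 10) &= \int_{0}^{10} \frac{2}{\pi(1 + x^{2})}\,dx \\
&= \frac{2}{\pi}\arctan x\,\Big|_{0}^{10} = \frac{2}{\pi}\arctan 10 - 0 \approx 0.93655
\end{aligned}$$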
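33. (a) When $x \ge 0$, $f(x) = 1 - |x| = 1 - x$; thus,
$$\begin{aligned}
P(1/2 \le X \le 3/4) &= \int_{1/2}^{3/4} (1 - |x|)\,dx = \int_{1/2}^{3/4} (1 - x)\,dx \\
&= \left(x - \frac{x^{2}}{2}\right)\Big|_{1/2}^{3/4} = \left(\frac{3}{4} - \frac{9}{32}\right) - \left(\frac{1}{2} - \frac{1}{8}\right) = \frac{3}{32}
\end{aligned}$$
When x < 0, $f(x) = 1 - |x| = 1 - (-x) = 1 + x$; so
$$\begin{aligned}
P(-1/2 \le X \le 0) &= \int_{-1/2}^{0} (1 - |x|)\,dx = \int_{-1/2}^{0} (1 + x)\,dx \\
&= \left(x + \frac{x^{2}}{2}\right)\Big|_{-1/2}^{0} = (0) - \left(-\frac{1}{2} + \frac{1}{8}\right) = \frac{3}{8}
\end{aligned}$$
(b) To find the expected value, we need to find the integral
$$E(X) = \int_{-1}^{1} x(1 - |x|)\,dx = \int_{-1}^{0} x(1 + x)\,dx + \int_{0}^{1} x(1 - x)\,dx$$
We can proceed as usual, calculating antiderivatives and evaluating. But, there is a shortcut: the
function $x(1 - |x|)$ is odd, and therefore its integral from −1 to 1 is zero. Thus, E(X) = 0. We need
to find
$$\begin{aligned}
E(X^{2}) &= \int_{-1}^{1} x^{2}(1 - |x|)\,dx = \int_{-1}^{0} x^{2}(1 + x)\,dx + \int_{0}^{1} x^{2}(1 - x)\,dx \\
&= \int_{-1}^{0} (x^{2} + x^{3})\,dx + \int_{0}^{1} (x^{2} - x^{3})\,dx \\
&= \left(\frac{x^{3}}{3} + \frac{x^{4}}{4}\right)\Big|_{-1}^{0} + \left(\frac{x^{3}}{3} - \frac{x^{4}}{4}\right)\Big|_{0}^{1} \\
&= (0) - \left(-\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) - (0) = \frac{1}{6}
\end{aligned}$$
Thus, the variance is $\mathrm{var}(X) = E(X^{2}) - (E(X))^{2} = 1/6$.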
35. The Intermediate Value Theorem states that a continuous function defined on a closed interval
[a, b] assumes all values between f(a) and f(b). The cumulative distribution function is continuous,
and by assumption in this exercise, it is defined on a closed interval [a, b] (and not on an interval that
includes $-\infty$ or $\infty$). Any cumulative distribution function F(x) satisfies F(a) = 0 and F(b) = 1. Thus,
the Intermediate Value Theorem implies that F assumes all values between 0 and 1, in particular the
value 1/2. In other words, there is a number x where F(x) = 1/2; this number is the median of X.
Section 14 The Normal Distribution
1. Assume that X is normally distributed with mean $\mu$ and variance $\sigma^{2}$. The z-score of a number
a is the number $(a - \mu)/\sigma$; it is used to convert a probability related to a normal distribution to a
probability related to the standard normal distribution.
If $X \sim N(3, 16)$ then $\mu = 3$ and $\sigma = 4$. To calculate $P(0 \le X \le 7)$ we convert the numbers to
their z-scores:
$$P(0 \le X \le 7) = P\left(\frac{0 - 3}{4} \le \frac{X - 3}{4} \le \frac{7 - 3}{4}\right) = P(-3/4 \le Z \le 1)$$
The random variable Z = (X − 3)/4 has the standard normal distribution.
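3. The notation $X \sim N(0, 2^{2})$ says that the mean is $\mu = 0$ and the standard deviation is $\sigma = 2$. Thus,
$$P(-1 \le X \le 2) = P\left(\frac{-1 - 0}{2} \le \frac{X - 0}{2} \le \frac{2 - 0}{2}\right) = P(-1/2 \le Z \le 1)$$
Below is the graph of the standard normal distribution; the area of the shaded region is equal to
$P(-1 \le X \le 2)$.
[Figure: graph of the standard normal density on the interval from −5 to 5 (vertical axis from 0 to 0.4), with the region between −1/2 and 1 shaded.]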
5. It is given that $\mu = 5$ and $\sigma = 10$. Thus
$$P(X < 9) = P\left(\frac{X - 5}{10} < \frac{9 - 5}{10}\right) = P(Z < 0.4) = F(0.4) = 0.655422$$
7. It is given that $\mu = 0$ and $\sigma = 10$. Thus
$$\begin{aligned}
P(X > 25) &= P\left(\frac{X - 0}{10} > \frac{25 - 0}{10}\right) = P(Z > 2.5) \\
&= 1 - P(Z \le 2.5) \\
&= 1 - F(2.5) = 1 - 0.993790 = 0.006210
\end{aligned}$$
9. It is given that $\mu = -5$ and $\sigma = 10$. We find
$$\begin{aligned}
P(X < -10) &= P\left(\frac{X - (-5)}{10} < \frac{-10 - (-5)}{10}\right) = P(Z < -0.5) \\
&= F(-0.5) \\
&= 1 - F(0.5) = 1 - 0.691462 = 0.308538
\end{aligned}$$
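11. It is given that $\mu = 2$ and $\sigma = 5$. Thus
$$\begin{aligned}
P(0 \le X \le 5) &= P\left(\frac{0 - 2}{5} \le \frac{X - 2}{5} \le \frac{5 - 2}{5}\right) = P(-0.4 \le Z \le 0.6) \\
&= F(0.6) - F(-0.4) \\
&= F(0.6) - (1 - F(0.4)) = 0.725747 - (1 - 0.655422) = 0.381169
\end{aligned}$$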
13. Let W denote the weight of a pink salmon. It is given that $W \sim N(1.7, 0.1^{2})$. The proportion of pink
salmon that are heavier than 1.9 kg is given by
$$\begin{aligned}
P(W > 1.9) &= P\left(\frac{W - 1.7}{0.1} > \frac{1.9 - 1.7}{0.1}\right) = P(Z > 2) \\
&= 1 - P(Z \le 2) \\
&= 1 - F(2) = 1 - 0.977250 = 0.022750
\end{aligned}$$
So, about 2.3% of salmon are heavier than 1.9 kg.
15. The mean of I is $\mu = 100$ and the standard deviation is $\sigma = 15$. We compute
$$\begin{aligned}
P(I > 120) &= P\left(Z > \frac{120 - 100}{15}\right) = P(Z > 4/3) \\
&\approx 1 - P(Z \le 1.33) \\
&= 1 - F(1.33) \\
&\approx 1 - F(1.35) = 1 - 0.911492 = 0.088508
\end{aligned}$$
The probability that someone's IQ is more than 120 is about 8.85%.
17. Given $S \sim N(44, 5^{2})$, we compute
$$\begin{aligned}
P(S > 50) &= P\left(Z > \frac{50 - 44}{5}\right) = P(Z > 1.2) \\
&= 1 - P(Z \le 1.2) \\
&= 1 - F(1.2) = 1 - 0.884930 = 0.115070
\end{aligned}$$
About 11.5% of moose can run faster than 50 km/h.
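The table look-ups of F(z) used in the normal-distribution exercises above can be reproduced without a table, because the standard normal cumulative distribution function can be written in terms of the error function. The Python sketch below relies on `math.erf` from the standard library; the function names `std_normal_cdf` and `normal_cdf` are illustrative.

```python
from math import erf, sqrt


def std_normal_cdf(z: float) -> float:
    """F(z) = P(Z <= z) for the standard normal distribution, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), obtained by converting x to its z-score."""
    return std_normal_cdf((x - mu) / sigma)


p_below_9 = normal_cdf(9, 5, 10)                 # P(X < 9),   mean 5, sd 10
p_above_25 = 1 - normal_cdf(25, 0, 10)           # P(X > 25),  mean 0, sd 10
p_salmon = 1 - normal_cdf(1.9, 1.7, 0.1)         # P(W > 1.9), mean 1.7, sd 0.1
p_moose = 1 - normal_cdf(50, 44, 5)              # P(S > 50),  mean 44, sd 5

print(p_below_9, p_above_25, p_salmon, p_moose)
```

A direct computation uses the exact z-score, so results can differ slightly from answers in which the z-score was first rounded to the nearest table entry.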
19. The fraction of the population in the interval $(\mu - \sigma, \mu + \sigma)$ is 0.683. The fraction of the population
in the interval $(\mu - 2\sigma, \mu + 2\sigma)$ which is outside of $(\mu - \sigma, \mu + \sigma)$ is 0.955 − 0.683 = 0.272. The fraction
of population in $(\mu - \sigma, \mu + 2\sigma)$ is the fraction of the population in $(\mu - \sigma, \mu + \sigma)$ plus (because of
symmetry) one half of the population in the interval $(\mu - 2\sigma, \mu + 2\sigma)$ which is outside of $(\mu - \sigma, \mu + \sigma)$.
Thus, the required fraction is 0.683 + 0.272/2 = 0.819.
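21. Let X denote the given population. From $P(\mu - \sigma \le X \le \mu + \sigma) = 0.683$ it follows that
$P(\mu \le X \le \mu + \sigma) = 0.683/2$ (that's because of the symmetry of the graph). Thus,
$$P(X \le \mu + \sigma) = P(X \le \mu) + P(\mu \le X \le \mu + \sigma) = 0.5 + 0.683/2 = 0.8415$$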
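23. Let X denote the given population. From $P(\mu - \sigma \le X \le \mu + \sigma) = 0.683$ it follows that
$P(\mu - \sigma \le X \le \mu) = 0.683/2$ (because of the symmetry of the graph). Thus,
$$P(X \le \mu - \sigma) = P(X \le \mu) - P(\mu - \sigma \le X \le \mu) = 0.5 - 0.683/2 = 0.1585$$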
25. X is normally distributed with mean E(X) = 2 + 4 = 6 and variance $\mathrm{var}(X) = 12^{2} + 6^{2} = 180$
(so the standard deviation of X is $\sigma = \sqrt{180}$).
27. Reducing to z-scores, we obtain
$$\begin{aligned}
P(X \le x) &= 0.56 \\
P\left(Z \le \frac{x - 2}{12}\right) &= 0.56
\end{aligned}$$
In Table 14.4 we find
$$P(Z \le 0.15) = 0.559618$$
which is the closest value to 0.56. Thus, $(x - 2)/12 \approx 0.15$, and $x \approx 12(0.15) + 2 = 3.8$.
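29. Reducing to z-scores, we obtain
$$\begin{aligned}
P(X > x) &= 0.2 \\
P\left(Z > \frac{x - 2}{12}\right) &= 0.2 \\
1 - P\left(Z \le \frac{x - 2}{12}\right) &= 0.2 \\
P\left(Z \le \frac{x - 2}{12}\right) &= 0.8
\end{aligned}$$
In Table 14.4 we find
$$P(Z \le 0.85) = 0.802337$$
which is the closest value to 0.8. Thus, $(x - 2)/12 \approx 0.85$, and $x \approx 12(0.85) + 2 = 12.2$.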
31. Denote by S the grades on the test. It is given that $S \sim N(72, 8^{2})$. The proportion of students who
scored more than 90% on the test is
$$\begin{aligned}
P(S > 90) &= P\left(Z > \frac{90 - 72}{8}\right) = P(Z > 18/8 = 2.25) \\
&= 1 - P(Z \le 2.25) \\
&= 1 - F(2.25) \\
&= 1 - 0.987776 = 0.012224
\end{aligned}$$
Thus, about 1.2% of students scored more than 90% on the test.
33. Denote by S the grades on the test. It is given that $S \sim N(72, 8^{2})$. We are asked to find s so that
$P(S \ge s) = 0.05$. We compute
$$\begin{aligned}
P(S \ge s) &= 0.05 \\
P\left(Z > \frac{s - 72}{8}\right) &= 0.05 \\
1 - P\left(Z \le \frac{s - 72}{8}\right) &= 0.05 \\
P\left(Z \le \frac{s - 72}{8}\right) &= 0.95
\end{aligned}$$
In Table 14.4 we find $P(Z \le 1.65) = 0.950529$, which is the closest value to 0.95. Thus, (s − 72)/8 =
1.65, and s = 8(1.65) + 72 = 85.2. So the minimum score of the highest 5% of the test scores is 85.2
(of 100).
35. Define the Bernoulli experiment
$$T_i = \begin{cases} 1 & \text{$i$th tree is infested by canker-rot fungus (success)} \\ 0 & \text{$i$th tree is not infested by canker-rot fungus} \end{cases}$$
It is given that $p = P(T_i = 1) = 0.014$ and $P(T_i = 0) = 0.986$ for i = 1, 2, . . . , 200 (we find
$E(T_i) = p = 0.014$ and $\mathrm{var}(T_i) = p(1 - p) = (0.014)(0.986) = 0.013804$ for all i). The random variable
$$M = \sum_{i=1}^{200} T_i$$
counts the number of trees infested by canker-rot fungus.
The random variables $T_i$ are identically distributed and (assumed to be) independent. The mean
of M is (see Theorem 7 in Section 7) E(M) = np = (200)(0.014) = 2.8 and the variance is (see
Theorem 9 in Section 9) var(M) = np(1 − p) = 200(0.014)(0.986) = 2.7608. Using the Central Limit
Theorem, we approximate M by the normal distribution $M \sim N(2.8, 2.7608)$.
The probability that fewer than 25 trees are infested with the fungus is (approximately)
$$\begin{aligned}
P(M \le 25) &= P\left(Z \le \frac{25 - 2.8}{\sqrt{2.7608}}\right) \approx P(Z \le 13.36089) \\
&= F(13.36089) \\
&\approx 0.999999
\end{aligned}$$
(We don't have this value in the tables, but know that it's very close to 1.)
37. Consider the random variable $B_i$ = number of surviving offspring from the ith bacterium, where
i = 1, 2, 3, . . . , 10,000. It is given that, for all i, $P(B_i = 2) = 0.15$, $P(B_i = 1) = 0.75$, and
$P(B_i = 0) = 0.1$. We compute
$$E(B_i) = 2(0.15) + 1(0.75) + 0(0.1) = 1.05$$
From
$$E(B_i^{2}) = 4(0.15) + 1(0.75) + 0(0.1) = 1.35$$
we compute the variance
$$\mathrm{var}(B_i) = E(B_i^{2}) - (E(B_i))^{2} = 1.35 - 1.05^{2} = 0.2475.$$
The random variable
$$M = \sum_{i=1}^{10{,}000} B_i$$
counts the number of surviving offspring.
The random variables $B_i$ are identically distributed (with mean $\mu = 1.05$ and variance $\sigma^{2} = 0.2475$)
and assumed to be independent. The mean of M is (see Theorem 7 in Section 7)
$E(M) = n\mu = (10{,}000)(1.05) = 10{,}500$ and the variance is (see Theorem 9 in Section 9)
$\mathrm{var}(M) = n\sigma^{2} = 10{,}000(0.2475) = 2{,}475$. Using the Central Limit Theorem, we approximate M by the
normal distribution $M \sim N(10{,}500,\ 2{,}475)$.
The probability that the population will be larger than 10,000 is (approximately)
$$\begin{aligned}
P(M > 10{,}000) &= P\left(Z > \frac{10{,}000 - 10{,}500}{\sqrt{2{,}475}}\right) = P(Z > -10.05) \\
&= 1 - P(Z \le -10.05) \\
&= 1 - F(-10.05) \\
&= 1 - (1 - F(10.05)) = F(10.05) \approx 0.999999
\end{aligned}$$
(F(10.05) is very close to 1.)
39. Define the Bernoulli experiment
$$V = \begin{cases} 1 & \text{virus is present (success)} \\ 0 & \text{virus is absent} \end{cases}$$
It is given that p = P(V = 1) = 0.2 and P(V = 0) = 0.8. Repeat the experiment 120 times, and let
N count the number of successes (number of months the virus is present). The probability that the
virus will be present in between 30 and 36 months during a 10-year period is given by the sum of the
probabilities of 30, 31, 32, 33, 34, 35 and 36 successes in 120 repetitions:
$$\begin{aligned}
P(30 \le N \le 36) &= b(30, 120; 0.2) + b(31, 120; 0.2) + b(32, 120; 0.2) \\
&\quad + b(33, 120; 0.2) + b(34, 120; 0.2) + b(35, 120; 0.2) + b(36, 120; 0.2)
\end{aligned}$$
The major difficulty in evaluating the seven expressions consists of dealing with products of very large
numbers (factorials) with very small numbers (coming from the probabilities). For instance,
$$b(33, 120; 0.2) = \binom{120}{33}(0.2)^{33}(0.8)^{87} = \frac{120!}{33!\,87!}(0.2)^{33}(0.8)^{87}$$
However, this is not a real problem if instead of a pocket calculator we use Maple, Matlab, Mathematica,
or similar software.
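As an illustration of the remark about software, the seven binomial terms can be added exactly in a few lines. The sketch below uses Python (a choice made here, not one of the packages named in the text) and the standard-library function `math.comb`, which computes the binomial coefficient as an exact integer; the names `binom_pmf` and `p_30_to_36` are illustrative.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """b(k, n; p): probability of exactly k successes in n independent trials."""
    # comb(n, k) is an exact (possibly very large) integer; the product with the
    # small floating-point powers stays well within floating-point range here.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# P(30 <= N <= 36) for n = 120 monthly trials with p = 0.2
p_30_to_36 = sum(binom_pmf(k, 120, 0.2) for k in range(30, 37))

# One representative term, b(33, 120; 0.2)
term_33 = binom_pmf(33, 120, 0.2)

print(p_30_to_36, term_33)
```

Because the binomial coefficient is computed in integer arithmetic, the "very large times very small" difficulty only appears in the final multiplication, where the magnitudes are manageable.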
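41. Let $t = -u^{2}$. Then $dt/du = -2u$, and $u\,du = -dt/2$; we get
$$\int ue^{-u^{2}}\,du = \int e^{t}\left(-\frac{dt}{2}\right) = -\frac{1}{2}\int e^{t}\,dt = -\frac{1}{2}e^{t} + C = -\frac{1}{2}e^{-u^{2}} + C.$$
The definite integral is computed to be
$$\begin{aligned}
\int_{0}^{\infty} ue^{-u^{2}}\,du &= \lim_{T \to \infty} \int_{0}^{T} ue^{-u^{2}}\,du \\
&= \lim_{T \to \infty} \left(-\frac{1}{2}e^{-u^{2}}\right)\Big|_{0}^{T} = -\frac{1}{2}\lim_{T \to \infty} \left(e^{-T^{2}} - e^{0}\right) = -\frac{1}{2}(0 - 1) = \frac{1}{2}
\end{aligned}$$
Let $f(u) = ue^{-u^{2}}$. Then $f(-u) = (-u)e^{-(-u)^{2}} = -ue^{-u^{2}} = -f(u)$; i.e., f(u) is an odd function.
Because $\int_{0}^{\infty} ue^{-u^{2}}\,du$ is a convergent integral (equal to 1/2) it follows that $\int_{-\infty}^{0} ue^{-u^{2}}\,du$ is convergent
as well, and equal to −1/2. Thus,
$$\int_{-\infty}^{\infty} ue^{-u^{2}}\,du = \int_{0}^{\infty} ue^{-u^{2}}\,du + \int_{-\infty}^{0} ue^{-u^{2}}\,du = \frac{1}{2} + \left(-\frac{1}{2}\right) = 0$$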
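43. (a) The calculation $g(-x) = e^{-(-x)^{2}} = e^{-x^{2}} = g(x)$ proves that g is an even function.
(b) We compute $g'(x) = -2xe^{-x^{2}}$. If x > 0, then $g'(x) < 0$ and g is decreasing; if x < 0, then $g'(x) > 0$
and g is increasing. The equation $g'(x) = -2xe^{-x^{2}} = 0$ implies that x = 0 is the only critical point of g. Since g
changes from increasing to decreasing at x = 0, it follows that g(0) = 1 is a local maximum.
Because $-x^{2} \le 0$ for all x, we conclude that $e^{-x^{2}} \le e^{0} = 1$ for all real numbers x. Thus, g(0) = 1
is also a global maximum of g.
(c) Differentiating $g'$, we obtain
$$g''(x) = -2e^{-x^{2}} - 2xe^{-x^{2}}(-2x) = -2e^{-x^{2}}(1 - 2x^{2})$$
From $g''(x) = 0$ we get $1 - 2x^{2} = 0$, $x^{2} = 1/2$ and $x = \pm 1/\sqrt{2}$.
If $x < -1/\sqrt{2}$, then $g''(x) > 0$ and g is concave up; if $-1/\sqrt{2} < x < 1/\sqrt{2}$, then $g''(x) < 0$ and g is
concave down; if $x > 1/\sqrt{2}$, then $g''(x) > 0$ and g is concave up. Thus, $x = \pm 1/\sqrt{2}$ are points of
inflection of g.
(d) We find
$$\lim_{x \to \infty} g(x) = \lim_{x \to \infty} e^{-x^{2}} = e^{-\infty} = 0$$
$$\lim_{x \to -\infty} g(x) = \lim_{x \to -\infty} e^{-x^{2}} = e^{-\infty} = 0$$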
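45. It is assumed that $X \sim N(\mu, \sigma^{2})$. We use z-scores to convert to calculations involving the standard
normal distribution:
$$\begin{aligned}
P(\mu - \sigma \le X \le \mu + \sigma) &= P\left(\frac{\mu - \sigma - \mu}{\sigma} \le Z \le \frac{\mu + \sigma - \mu}{\sigma}\right) \\
&= P(-1 \le Z \le 1) \\
&= F(1) - F(-1) \\
&= F(1) - (1 - F(1)) \\
&= 2F(1) - 1 = 2(0.841345) - 1 = 0.682690 \approx 0.683
\end{aligned}$$
Likewise,
$$\begin{aligned}
P(\mu - 2\sigma \le X \le \mu + 2\sigma) &= P\left(\frac{\mu - 2\sigma - \mu}{\sigma} \le Z \le \frac{\mu + 2\sigma - \mu}{\sigma}\right) \\
&= P(-2 \le Z \le 2) \\
&= F(2) - F(-2) \\
&= 2F(2) - 1 = 2(0.977250) - 1 = 0.9545 \approx 0.955
\end{aligned}$$
and
$$\begin{aligned}
P(\mu - 3\sigma \le X \le \mu + 3\sigma) &= P\left(\frac{\mu - 3\sigma - \mu}{\sigma} \le Z \le \frac{\mu + 3\sigma - \mu}{\sigma}\right) \\
&= P(-3 \le Z \le 3) \\
&= F(3) - F(-3) \\
&= 2F(3) - 1 = 2(0.998650) - 1 = 0.9973 \approx 0.997
\end{aligned}$$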
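The three probabilities just derived depend only on the number of standard deviations, so they can be checked with a one-line formula: since 2F(k) − 1 equals erf(k/√2), no table is needed. The Python sketch below uses this identity; the function name `within_k_sigma` is an illustrative choice.

```python
from math import erf, sqrt


def within_k_sigma(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) = 2 F(k) - 1 = erf(k / sqrt(2))."""
    return erf(k / sqrt(2))


one_sigma = within_k_sigma(1)
two_sigma = within_k_sigma(2)
three_sigma = within_k_sigma(3)

print(one_sigma, two_sigma, three_sigma)
```

The result does not depend on μ or σ, which is the point of converting to z-scores in the derivation.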
Section 15 The Uniform and the Exponential Distributions
1. From $\mathrm{var}(U) = (b - 0)^{2}/12 = 12$ we get $b^{2} = 12^{2}$ and b = 12 (since b > 0). The mean of U is
E(U) = (0 + 12)/2 = 6.
3. (a) The probability density function is $f(t) = 0.2e^{-0.2t}$ and the cumulative distribution function is
$F(t) = 1 - e^{-0.2t}$. The probability that the first event occurs between times 2 and 6 is
$$\begin{aligned}
P(2 \le T \le 6) &= \int_{2}^{6} 0.2e^{-0.2t}\,dt \\
&= \left(-e^{-0.2t}\right)\Big|_{2}^{6} = -e^{-1.2} + e^{-0.4} \approx 0.369126
\end{aligned}$$
Alternatively, using the cumulative distribution function,
$$P(2 \le T \le 6) = F(6) - F(2) = \left(1 - e^{-0.2(6)}\right) - \left(1 - e^{-0.2(2)}\right) = -e^{-1.2} + e^{-0.4} \approx 0.369126$$
(b) See below.
[Figure: plot with horizontal axis from 0 to 10 and vertical axis from 0 to 0.2.]
5. (a) The probability density function is $f(t) = 1.5e^{-1.5t}$ and the cumulative distribution function is
$F(t) = 1 - e^{-1.5t}$. The probability that the first event occurs before $t = 3$ is
\[
\begin{aligned}
P(T < 3) &= \int_{0}^{3} 1.5e^{-1.5t}\,dt \\
&= \left[ -e^{-1.5t} \right]_{0}^{3} = -e^{-4.5} + 1 \approx 0.988891
\end{aligned}
\]
Alternatively, using the cumulative distribution function, we obtain
\[ P(T < 3) = F(3) = 1 - e^{-1.5(3)} = 1 - e^{-4.5} \approx 0.988891 \]
(b) See below.
[Figure: graph with horizontal axis from 0 to 5 and vertical axis from 0 to 1.5.]
7. (a) The probability density function is $f(t) = 2.4e^{-2.4t}$ and the cumulative distribution function is
$F(t) = 1 - e^{-2.4t}$. The probability that the first event occurs before $t = 0.3$ or after $t = 1.2$ is
\[
\begin{aligned}
P(T < 0.3) + P(T > 1.2) &= P(T < 0.3) + (1 - P(T \le 1.2)) \\
&= F(0.3) + 1 - F(1.2) \\
&= \left( 1 - e^{-2.4(0.3)} \right) + 1 - \left( 1 - e^{-2.4(1.2)} \right) \approx 0.569383
\end{aligned}
\]
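A simulation offers an independent check of this two-tail probability. The sketch below is an illustration under stated assumptions: it draws samples with random.expovariate from Python's standard library, and the sample size, the seed, and the function names are arbitrary choices, not part of the manual.

```python
import math
import random


def two_tail_probability(rate: float, a: float, b: float) -> float:
    """P(T < a) + P(T > b) for exponential T with the given rate, assuming 0 <= a <= b."""
    return (1.0 - math.exp(-rate * a)) + math.exp(-rate * b)


def simulate_two_tail(rate: float, a: float, b: float,
                      n: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate: fraction of samples falling outside [a, b]."""
    rng = random.Random(seed)
    outside = sum(1 for _ in range(n) if not (a <= rng.expovariate(rate) <= b))
    return outside / n


exact = two_tail_probability(2.4, 0.3, 1.2)
estimate = simulate_two_tail(2.4, 0.3, 1.2)
print(f"exact {exact:.6f}, simulated {estimate:.4f}")
```

The simulated value fluctuates with the seed and sample size, so it should only be expected to match the exact value to about two decimal places.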
(b) See below.
[Figure: graph with horizontal axis from 0 to 3 (the values 0.3 and 1.2 are marked) and vertical axis from 0 to 2.5.]
9. Given $s(t) = e^{-0.4t}$, we identify $\lambda = 0.4$/month. The mean lifetime is $1/\lambda = 1/0.4 = 2.5$ months.
From $s(3) = e^{-0.4(3)} = e^{-1.2} \approx 0.301194$ we conclude that about 30.1% of insects will survive 3
months.
11. Denote the lifespan of the atom by $T$. Since the expected lifespan is 4 hours, it follows that
$\lambda = 1/4 = 0.25$/hour. The probability density function of $T$ is $f(t) = 0.25e^{-0.25t}$, the cumulative
distribution function is $F(t) = 1 - e^{-0.25t}$, and the survivorship function is $s(t) = e^{-0.25t}$.
The probability that the atom will not decay during the first 3 hours is
\[ P(T > 3) = 1 - P(T \le 3) = 1 - F(3) = s(3) = e^{-0.25(3)} = e^{-0.75} \approx 0.472367 \]
Repeating this calculation, we obtain the probability that the atom will decay after 6 hours:
\[ P(T > 6) = s(6) = e^{-0.25(6)} = e^{-1.5} \approx 0.223130 \]
13. (a) The average lifespan of a guinea pig is $1/0.18 \approx 5.56$ years.
(b) The survivorship function for the guinea pig is $s(t) = e^{-0.18t}$. Thus, the chance that a guinea pig
will live longer than 6 years is $s(6) = e^{-0.18(6)} = e^{-1.08} \approx 0.339596$.
(c) Let $T$ represent the lifetime of a guinea pig. We find
\[ P(T > 8 \mid T > 2) = \frac{P((T > 8) \cap (T > 2))}{P(T > 2)} = \frac{P(T > 8)}{P(T > 2)} = \frac{s(8)}{s(2)} = \frac{e^{-0.18(8)}}{e^{-0.18(2)}} = e^{-0.18(6)} = s(6) \]
The answer is the same as in (b).
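The equality of the conditional and unconditional probabilities is the memoryless property of the exponential distribution. A minimal sketch of the check, with illustrative function names that do not appear in the manual:

```python
import math


def survivorship(rate: float, t: float) -> float:
    """s(t) = P(T > t) = exp(-rate * t) for an exponential lifetime T."""
    return math.exp(-rate * t)


def conditional_survival(rate: float, already: float, additional: float) -> float:
    """P(T > already + additional | T > already), computed as a ratio of survivorships."""
    return survivorship(rate, already + additional) / survivorship(rate, already)


rate = 0.18  # per year, as in the exercise
lhs = conditional_survival(rate, already=2, additional=6)  # P(T > 8 | T > 2)
rhs = survivorship(rate, 6)                                # P(T > 6)
print(math.isclose(lhs, rhs))
```

Changing the value of already leaves lhs unchanged, which is the content of the property: the time already survived does not affect the chance of surviving a further 6 years.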
15. Young and old organisms are more likely to die, since the survivorship curve is sharply decreasing
for them. After the initial sharp drop, the curve continues with a small negative slope. Thus, an adult
organism has a good chance of living a bit longer (until it reaches the age where the survivorship curve
drops quickly again).
17. This is not hard to guess: the function f(x) = 5x stretches by a factor of 5: it maps the interval
(0, 1) to the interval (0, 5). Now we shift by 3 units, so the answer is f(x) = 5x + 3.
(Formally: we are looking for a linear function that maps the initial point of the first interval (0)
to the initial point of the second interval (3) and the terminal point of the first interval (1) to the
terminal point of the second interval (8). In other words, we are looking for an equation of a line
through the points (0, 3) and (1, 8). Using the point-slope equation, we get $y - 3 = \frac{8 - 3}{1 - 0}(x - 0)$, i.e.,
$y = 5x + 3$.)
By generating random numbers in the interval (0, 1) and then applying f(x) to them, we generate
random numbers in the interval (3, 8).
The length of the interval (a, b) is $b - a$. Thus $f(x) = (b - a)x$ transforms the interval (0, 1) to
$(0, b - a)$. Now we move it so that it starts at $a$; the function $f(x) = (b - a)x + a$ maps the interval
(0, 1) to $(0 + a, b - a + a) = (a, b)$. So, composing a random number generator on the interval (0, 1)
with f(x) we obtain a random number generator on the interval (a, b).
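The composition described above can be sketched in a few lines. This is an illustration only: the function name uniform_on_interval, the seed, and the sample size are assumptions. Note that Python's random.random() returns values in [0, 1), so the left endpoint can occur while the right endpoint is excluded up to rounding.

```python
import random


def uniform_on_interval(a: float, b: float, rng=random) -> float:
    """Apply f(x) = (b - a) * x + a to a uniform sample x from [0, 1)."""
    return (b - a) * rng.random() + a


rng = random.Random(42)  # seeded generator so the run is reproducible
samples = [uniform_on_interval(3, 8, rng) for _ in range(10_000)]

print(min(samples), max(samples))
print(sum(samples) / len(samples))  # should be near the midpoint (3 + 8) / 2
```

Python's standard library also provides random.uniform(a, b), which performs the same linear transformation internally.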
19. The half-life of a radioactive substance is the time $t_h$ for which $P(T > t_h) = s(t_h) = 1/2$. From
$e^{-\lambda t_h} = 1/2$ we obtain
\[ -\lambda t_h = \ln(1/2) = \ln 1 - \ln 2 = -\ln 2 \]
and $t_h = \ln 2/\lambda$.
The median is the time $t_m$ such that $F(t_m) = 1 - e^{-\lambda t_m} = 1/2$, i.e., $e^{-\lambda t_m} = 1/2$.
We see that $t_m = t_h$. From $F(t) = 1 - e^{-\lambda t} = 1 - s(t)$ we conclude that $F(t) + s(t) = 1$. So, if
one of $F(t)$ or $s(t)$ is 1/2, so is the other. Or: the half-life is the time $t$ when the probability of
surviving $s(t)$ is the same as the probability of dying $F(t)$.
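The identity between half-life and median can be confirmed for a concrete rate. In this sketch the rate of 0.25 per hour is a hypothetical value chosen for illustration, and the function name half_life is likewise an assumption.

```python
import math


def half_life(rate: float) -> float:
    """Half-life t_h = ln(2) / rate of an exponential lifetime."""
    return math.log(2.0) / rate


rate = 0.25  # hypothetical decay rate per hour, for illustration only
t_h = half_life(rate)

survive = math.exp(-rate * t_h)  # s(t_h)
decay = 1.0 - survive            # F(t_h)
print(t_h, survive, decay)
```

Both probabilities come out equal to one half (up to floating-point rounding), in agreement with F(t) + s(t) = 1.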