Theories Joint Distribution PDF
1 Joint Distribution
The joint cumulative distribution function (cdf) of X and Y is F(x, y) = P(X ≤ x, Y ≤ y). For x1 ≤ x2 and y1 ≤ y2,

P(x1 ≤ X ≤ x2, y1 ≤ Y ≤ y2) = F(x2, y2) − F(x2, y1) − F(x1, y2) + F(x1, y1).
In general, if X1, · · · , Xn are jointly distributed random variables, the joint cdf is

F(x1, · · · , xn) = P(X1 ≤ x1, · · · , Xn ≤ xn).

Joint probability mass function: if X and Y are discrete random variables, their joint pmf is

(1.2) p(xi, yj) = P(X = xi, Y = yj).
Solution: for a fair coin tossed three times, with X the number of heads on the first toss and Y the total number of heads (see also Example 10 below), the joint distribution of (X, Y) can be summarized in the following table:

x/y    0      1      2      3
 0    1/8    2/8    1/8     0
 1     0     1/8    2/8    1/8
Marginal probability mass functions: Suppose that we wish to find the pmf
of Y from the joint pmf of X and Y in the previous example:
pY(0) = P(Y = 0) = P(Y = 0, X = 0) + P(Y = 0, X = 1) = 1/8 + 0 = 1/8,
pY(1) = P(Y = 1) = P(Y = 1, X = 0) + P(Y = 1, X = 1) = 2/8 + 1/8 = 3/8.

In general, the marginal pmf of Y is pY(y) = Σx p(x, y), and similarly pX(x) = Σy p(x, y).
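As a numerical illustration (a sketch, not part of the original notes), the marginals can be computed by summing the joint table:

```python
# Joint pmf of (X, Y) from the coin-tossing table above, indexed as p[x][y].
p = {
    0: {0: 1/8, 1: 2/8, 2: 1/8, 3: 0},
    1: {0: 0,   1: 1/8, 2: 2/8, 3: 1/8},
}

# Marginal pmf of Y: sum the joint pmf over all values of X.
p_Y = {y: sum(p[x][y] for x in p) for y in range(4)}
print(p_Y)  # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```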
Joint PDF and Joint CDF: Suppose that X and Y are continuous random
variables. The joint probability density function (pdf) of X and Y is the
function f (x, y) such that for every set C of pairs of real numbers
(1.3) P((X, Y) ∈ C) = ∫∫_{(x,y)∈C} f(x, y) dx dy.
Another interpretation of the joint pdf is obtained as follows:
P{a < X < a + da, b < Y < b + db} = ∫_b^{b+db} ∫_a^{a+da} f(x, y) dx dy ≈ f(a, b) da db.
The marginal distributions can be recovered from the joint pdf. First,

P(X ≤ x) = P{X ≤ x, Y ∈ (−∞, ∞)} = ∫_{−∞}^{x} ∫_{−∞}^{∞} f(u, y) dy du.

Then we have

fX(x) = (d/dx) P(X ≤ x) = ∫_{−∞}^{∞} f(x, y) dy.
Similarly, the pdf of Y is given by

fY(y) = ∫_{−∞}^{∞} f(x, y) dx.
(b) For 0 ≤ x ≤ 1, with joint pdf f(x, y) = (12/7)(x² + xy) on the unit square,

fX(x) = ∫_0^1 (12/7)(x² + xy) dy = (12/7)x² + (6/7)x.

For x < 0 or x > 1, we have fX(x) = 0.
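A numerical check of this marginal (assuming, as reconstructed above, the joint pdf f(x, y) = (12/7)(x² + xy) on the unit square):

```python
from scipy import integrate

f = lambda y, x: 12/7 * (x**2 + x*y)   # joint pdf on the unit square

x = 0.5
fX_numeric, _ = integrate.quad(lambda y: f(y, x), 0, 1)  # integrate out y
fX_closed = 12/7 * x**2 + 6/7 * x
print(fX_numeric, fX_closed)  # both ~0.857
```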
Example 3 Suppose the set of possible values for (X, Y ) is the rectangle
D = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1}. Let the joint pdf of (X, Y ) be
f(x, y) = (6/5)(x + y²), for (x, y) ∈ D.
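As a quick sanity check (a sketch, not part of the original example), f integrates to 1 over D:

```python
from scipy import integrate

# Integrate the joint pdf over the unit square D.
total, _ = integrate.dblquad(lambda y, x: 6/5 * (x + y**2),
                             0, 1, lambda x: 0, lambda x: 1)
print(total)  # ~1.0, confirming f is a valid joint pdf on D
```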
2 Independent Random Variables
The random variables X and Y are said to be independent if for any two
sets of real numbers A and B,

P{X ∈ A, Y ∈ B} = P{X ∈ A} P{Y ∈ B}.

Equivalently, in the continuous case, X and Y are independent if and only if f(x, y) = fX(x) fY(y) for all x and y.
Example 5 Suppose that a man and a woman decide to meet at a certain location, and that each person independently arrives at a time uniformly distributed on [0, T].
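The question posed in the original example is not preserved here; a common version asks for the probability that the two arrivals are within time t of each other, which equals 1 − (1 − t/T)² by a geometric argument. A Monte Carlo sketch with illustrative values of T and t:

```python
import numpy as np

rng = np.random.default_rng(0)
T, t, n = 60.0, 10.0, 1_000_000      # e.g. minutes; values chosen for illustration
x = rng.uniform(0, T, n)             # first person's arrival time
y = rng.uniform(0, T, n)             # second person's arrival time
print(np.mean(np.abs(x - y) <= t))   # ~0.3056
print(1 - (1 - t/T)**2)              # exact: 11/36 ~ 0.3056
```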
2.1 More than two random variables
If X1, · · · , Xn are all discrete random variables, the joint pmf of the variables is the function

p(x1, · · · , xn) = P(X1 = x1, · · · , Xn = xn).

If the variables are continuous, the joint pdf of the variables is the function f(x1, · · · , xn) such that

P(a1 ≤ X1 ≤ b1, · · · , an ≤ Xn ≤ bn) = ∫_{a1}^{b1} · · · ∫_{an}^{bn} f(x1, · · · , xn) dx1 · · · dxn.
Example 7 When a certain method is used to collect a fixed volume of rock
samples in a region, there are four rock types. Let X1 , X2 and X3 denote the
proportion by volume of rock types 1, 2, and 3 in a randomly selected sample
(the proportion of rock type 4 is redundant because X4 = 1−X1 −X2 −X3 ).
Suppose the joint pdf of (X1 , X2 , X3 ) is
f (x1 , x2 , x3 ) = kx1 x2 (1 − x3 )
• What is k?
• What is the probability that types 1 and 2 together account for at most
50%?
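A numerical sketch for the first question, assuming the support implied by X4 ≥ 0, namely {xi ≥ 0, x1 + x2 + x3 ≤ 1}: the integral of x1 x2 (1 − x3) over this region is 1/144, so k = 144.

```python
from scipy import integrate

# Integrate x1*x2*(1 - x3) over the simplex x_i >= 0, x1 + x2 + x3 <= 1.
# tplquad integrates the innermost variable (here x1) first.
val, _ = integrate.tplquad(
    lambda x1, x2, x3: x1 * x2 * (1 - x3),
    0, 1,                                          # x3 in [0, 1]
    lambda x3: 0, lambda x3: 1 - x3,               # x2 in [0, 1 - x3]
    lambda x3, x2: 0, lambda x3, x2: 1 - x3 - x2,  # x1 in [0, 1 - x3 - x2]
)
print(val, 1/144)  # both ~0.006944, hence k = 144
```

The second probability can be computed the same way by restricting the x1 and x2 limits to x1 + x2 ≤ 0.5.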
The random variables X1 , · · · , Xn are independent if for every subset
Xi1 , · · · , Xik of the variables, the joint pmf (pdf) is equal to the product of
the marginal pmf’s (pdf’s).
2.2 Sum of Independent Random Variables
Suppose that X and Y are independent continuous random variables with densities fX and fY, and let Z = X + Y. The cdf of Z is

FZ(a) = P(X + Y ≤ a)
      = ∫∫_{x+y≤a} fXY(x, y) dx dy
      = ∫∫_{x+y≤a} fX(x) fY(y) dx dy
      = ∫_{−∞}^{∞} ∫_{−∞}^{a−y} fX(x) dx fY(y) dy
      = ∫_{−∞}^{∞} FX(a − y) fY(y) dy.

Differentiating with respect to a gives the convolution formula for the pdf of Z:

fZ(a) = ∫_{−∞}^{∞} fX(a − y) fY(y) dy.
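A numerical sketch of this formula: for two independent Uniform(0, 1) variables, Z = X + Y has the triangular density fZ(a) = a on [0, 1] and 2 − a on [1, 2].

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.uniform(size=1_000_000) + rng.uniform(size=1_000_000)  # Z = X + Y

# P(Z <= 1) should be 1/2 by symmetry of the triangular density.
print(np.mean(z <= 1.0))   # ~0.5

# The density near a = 1 should be close to the triangle's peak value 1.
h = 0.01
print(np.mean(np.abs(z - 1.0) <= h) / (2 * h))  # ~1.0
```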
Group project (optional):
Example 9 Let X and Y denote the lifetimes (in years) of two bulbs. Suppose that X and Y are independent and that each has an exponential distribution with parameter λ = 1.
• What is the probability that the total lifetime is between 1 and 2 years?
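A sketch of one route: X + Y has a Gamma(2, 1) distribution with cdf F(a) = 1 − e^(−a) − a e^(−a), so P(1 ≤ X + Y ≤ 2) = 2e^(−1) − 3e^(−2) ≈ 0.330. A Monte Carlo check:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(1.0, 1_000_000)   # lifetime of bulb 1 (mean 1 year)
y = rng.exponential(1.0, 1_000_000)   # lifetime of bulb 2
print(np.mean((1 <= x + y) & (x + y <= 2)))   # ~0.3298
print(2 * np.exp(-1) - 3 * np.exp(-2))        # exact value
```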
Example 10 Consider the situation in which a fair coin is tossed three times independently. Let X denote the number of heads on the first toss and Y denote the total number of heads.
(1) The Discrete Case: if X and Y are discrete, the conditional probability mass function of X, given that Y = y, is defined for all y with pY(y) > 0 by

pX|Y(x|y) = P(X = x, Y = y) / P(Y = y) = p(x, y) / pY(y).

If X and Y are independent random variables, then the conditional probability mass function is the same as the unconditional one. This follows because if X is independent of Y, then

pX|Y(x|y) = P(X = x)P(Y = y) / P(Y = y) = P(X = x) = pX(x).
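For the coin-tossing example, the conditional pmf of Y given X = 1 can be read off the joint table (a sketch):

```python
# Joint pmf p[x][y] from the coin-tossing table in Section 1.
p = {0: {0: 1/8, 1: 2/8, 2: 1/8, 3: 0},
     1: {0: 0,   1: 1/8, 2: 2/8, 3: 1/8}}

p_X1 = sum(p[1].values())                        # marginal P(X = 1) = 0.5
p_Y_given_X1 = {y: p[1][y] / p_X1 for y in p[1]}
print(p_Y_given_X1)  # {0: 0.0, 1: 0.25, 2: 0.5, 3: 0.25}
```

Note that this differs from the marginal pmf of Y computed earlier, confirming that X and Y are dependent.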
(2) The Continuous Case: If X and Y have a joint probability density func-
tion f (x, y), then the conditional pdf of X, given that Y = y, is defined for
all values of y such that fY (y) > 0, by
(2.9) fX|Y(x|y) = fX,Y(x, y) / fY(y).
To motivate this definition, multiply the left-hand side by dx and the right-hand side by (dx dy)/dy to obtain

fX|Y(x|y) dx = fX,Y(x, y) dx dy / (fY(y) dy)
             ≈ P{x ≤ X ≤ x + dx, y ≤ Y ≤ y + dy} / P{y ≤ Y ≤ y + dy}
             = P{x ≤ X ≤ x + dx | y ≤ Y ≤ y + dy}.
In other words, for small values of dx and dy, fX|Y(x|y) represents the
conditional probability that X is between x and x + dx given that Y is
between y and y + dy.
That is, if X and Y are jointly continuous, then for any set A,
P{X ∈ A | Y = y} = ∫_A fX|Y(x|y) dx.
Similarly, we define the conditional cdf of X given that Y = y by

FX|Y(a|y) = P(X ≤ a | Y = y) = ∫_{−∞}^{a} fX|Y(x|y) dx.
Group project (optional): Suppose X and Y are two independent random
variables, both uniformly distributed on (0,1). Let T1 = min(X, Y ) and
T2 = max(X, Y ).
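A simulation sketch for this project (it checks, rather than derives, the answers): E[T1] = 1/3, E[T2] = 2/3, and the minimum and maximum are positively correlated.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(size=1_000_000)
y = rng.uniform(size=1_000_000)
t1, t2 = np.minimum(x, y), np.maximum(x, y)
print(t1.mean(), t2.mean())     # ~0.3333 and ~0.6667
print(np.cov(t1, t2)[0, 1])     # ~1/36 ~ 0.0278: positively correlated
```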
3 Expected Values
Let X and Y be jointly distributed rv’s with pmf p(x, y) or pdf f (x, y)
according to whether the variables are discrete or continuous. Then the
expected value of a function h(X, Y ), denoted by E[h(X, Y )], is given by
(3.10) E[h(X, Y)] = Σx Σy h(x, y) p(x, y) in the discrete case, and

E[h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) f(x, y) dx dy in the continuous case.
Rule of expected values: If X and Y are independent random variables, then
we have
E(XY ) = E(X)E(Y ).
This is in general not true for correlated random variables.
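For instance, in the coin-tossing example X and Y are dependent, and the product rule fails: E(XY) = 1 while E(X)E(Y) = (1/2)(3/2) = 3/4. A direct computation from the joint table (sketch):

```python
# Joint pmf p[x][y] from the coin-tossing table in Section 1.
p = {0: {0: 1/8, 1: 2/8, 2: 1/8, 3: 0},
     1: {0: 0,   1: 1/8, 2: 2/8, 3: 1/8}}

E_XY = sum(x * y * p[x][y] for x in p for y in p[x])
E_X = sum(x * p[x][y] for x in p for y in p[x])
E_Y = sum(y * p[x][y] for x in p for y in p[x])
print(E_XY, E_X * E_Y)  # 1.0 vs 0.75: unequal, so X and Y are not independent
```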
4 Covariance and Correlation
Covariance and correlation are related parameters that indicate the extent
to which two random variables co-vary. Suppose there are two technology
stocks. If they are affected by the same industry trends, their prices will
tend to rise or fall together. They co-vary. Covariance and correlation
measure such a tendency. We will begin with the problem of calculating the
expected values of a function of two random variables.
[Figure: three scatter plots, (x1, y1), (x2, y2), and (x3, y3), illustrating different degrees of co-variation between two variables.]
4.1 Covariance
When two random variables are not independent, we can measure how strongly they are related to each other. The covariance between two rv's X and Y is

Cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E(XY) − E(X)E(Y).

Covariance is unchanged by shifts in either variable and scales bilinearly:

Cov(aX + b, cY + d) = ac Cov(X, Y).
Variance of linear combinations: Let X, Y be two random variables, and a and b be two constants. Then

Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y).

In particular, if X and Y are independent, Var(aX + bY) = a² Var(X) + b² Var(Y).
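A quick numerical check of this identity (a sketch with arbitrarily chosen constants and a correlated pair):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)   # y is correlated with x
a, b = 2.0, 3.0

lhs = np.var(a * x + b * y)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * np.cov(x, y)[0, 1]
print(lhs, rhs)  # agree up to Monte Carlo error
```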
Example 15 The joint pdf of X and Y is f (x, y) = 24xy when 0 ≤ x ≤ 1,
0 ≤ y ≤ 1 and x + y ≤ 1, and f (x, y) = 0 otherwise. Find Cov(X, Y ).
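A numerical sketch (the exact answer works out to −2/75 ≈ −0.0267), integrating over the triangle 0 ≤ x, 0 ≤ y, x + y ≤ 1:

```python
from scipy import integrate

f = lambda y, x: 24 * x * y   # joint pdf on the triangle x + y <= 1

# E[h(X, Y)] = double integral of h * f over the triangle.
E = lambda h: integrate.dblquad(lambda y, x: h(x, y) * f(y, x),
                                0, 1, lambda x: 0, lambda x: 1 - x)[0]
E_X, E_Y, E_XY = E(lambda x, y: x), E(lambda x, y: y), E(lambda x, y: x * y)
print(E_XY - E_X * E_Y)   # ~-0.0267 = -2/75
```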
4.2 Correlation Coefficient
The defect of covariance is that its computed value depends critically on the
units of measurement (e.g., kilograms versus pounds, meters versus feet).
Ideally, the choice of units should have no effect on a measure of strength of
relationship. This can be achieved by scaling the covariance by the standard
deviations of X and Y .
The correlation coefficient of X and Y , denoted by ρX,Y , is defined by
(4.1) ρX,Y = Cov(X, Y) / (σX σY).
The correlation coefficient is not affected by a linear change in the units of measurement. Specifically, we can show that ρ(aX + b, cY + d) = ρ(X, Y) whenever a and c have the same sign; if the signs differ, the correlation changes sign but not magnitude.
Correlation and dependence: Zero correlation coefficient does not imply that
X and Y are independent, but only that there is complete absence of a
linear relationship. When ρ = 0, X and Y are said to be uncorrelated. Two
random variables could be uncorrelated yet highly dependent.
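A standard illustration (a sketch, not taken from the original notes): if X is uniform on (−1, 1) and Y = X², then Cov(X, Y) = E[X³] − E[X]E[X²] = 0, so they are uncorrelated, yet Y is a deterministic function of X.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1, 1, 1_000_000)
y = x**2                                   # completely determined by x

print(np.corrcoef(x, y)[0, 1])             # ~0: uncorrelated
print(np.corrcoef(np.abs(x), y)[0, 1])     # ~0.97: strongly dependent
```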
Correlation and Causation: A large correlation does not imply that increas-
ing values of X causes Y to increase, but only that large X values are
associated with large Y values. Examples: