Random Variables
We shall use a capital letter, say X, to denote a random variable and its corresponding small letter, x in this case,
for one of its values.
Example 2: Suppose a sampling plan involves sampling items from a process until a defective is observed. The
evaluation of the process will depend on how many consecutive items are observed. In that regard, let X be a
random variable defined by the number of items observed before a defective is observed. Labeling N a non-
defective and D a defective, sample points are (D) given X = 0, (ND) given X = 1, (NND) given X = 2, and so on.
2- If a sample space contains a finite number of possibilities or an unending sequence with as many elements as
there are whole numbers, it is called a discrete sample space.
3- If a sample space contains an infinite number of possibilities equal to the number of points on a line segment,
it is called a continuous sample space.
4- A random variable is called a discrete random variable if its set of possible outcomes is countable. It
represents countable data. A discrete random variable assumes each of its values with a certain probability.
When a random variable can take on values on a continuous scale, it is called a continuous random variable. It
represents measured data.
Example 3: A shipment of 8 similar microcomputers to a retail outlet contains 3 that are defective. If a school
makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.
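Example 3 can be checked by direct counting. The following is a minimal sketch, assuming the purchase is made without replacement, so that the number of defectives follows the counting formula C(3, x) C(5, 2 − x) / C(8, 2). The function name and its parameters are illustrative, not part of the example.

```python
from fractions import Fraction
from math import comb

def defectives_pmf(total=8, defective=3, drawn=2):
    """Return {x: P(X = x)} for the number of defectives in the purchase."""
    good = total - defective
    ways = comb(total, drawn)  # equally likely ways to choose the purchase
    return {
        x: Fraction(comb(defective, x) * comb(good, drawn - x), ways)
        for x in range(min(defective, drawn) + 1)
    }

pmf = defectives_pmf()
for x, p in pmf.items():
    print(f"f({x}) = {p}")
```

The three probabilities sum to 1, as any probability distribution must.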
6- The cumulative distribution function F(x) of a random variable X with probability distribution f(x) is
$$F(a) = P(X \le a) = \sum_{x \le a} f(x), \qquad -\infty < a < \infty.$$
The cumulative distribution function is a monotone non-decreasing function defined not only for the values
assumed by the given random variable but for all real numbers. The cumulative distribution function of a discrete
random variable is a step function: it is piecewise constant and jumps by f(x) at each value x. (When writing it
out piecewise, take subintervals that include their lower bound and exclude their upper bound; the first
subinterval, which extends to −∞, has no lower bound.)
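To illustrate the step behaviour, here is a small sketch that evaluates F(a) at arbitrary real numbers. It assumes the distribution of the number of defectives in Example 3, f(0) = 10/28, f(1) = 15/28, f(2) = 3/28, obtained by counting; the names `f` and `cdf` are illustrative.

```python
from fractions import Fraction

# Probability distribution assumed for Example 3 (number of defectives).
f = {0: Fraction(10, 28), 1: Fraction(15, 28), 2: Fraction(3, 28)}

def cdf(a):
    """F(a) = P(X <= a): sum f(x) over all values x with x <= a."""
    return sum((p for x, p in f.items() if x <= a), Fraction(0))

# F is constant between the values of X and jumps at 0, 1 and 2.
for a in (-1, 0, 0.5, 1, 1.99, 2, 10):
    print(a, cdf(a))
```

Note that F(0.5) equals F(0): the function is defined between the values of X and stays flat there.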
7- Bar chart: By joining the points (x,f(x)) to the x axis either with a dashed or solid line, we obtain what is
commonly called a bar chart.
8- Probability histogram: The rectangles are constructed so that their bases of equal width are centered at each
value x and their heights are equal to the corresponding probabilities given by f(x). The diagram is constructed so
as to leave no space between the rectangles. With bases of unit width, P(X = x) is equal to the area of the
rectangle centered at x.
Example 6: Two refills for a ballpoint pen are selected at random from a box that contains 3 blue refills, 2 red
refills, and 3 green refills. If X is the number of blue refills and Y is the number of red refills selected, find
(a) The joint probability function f(x,y),
(b) P[(X, Y) ∈ A], where A is the region {(x, y) | x + y ≤ 1},
(c) P[(X, Y) ∈ B], where B is the region {(x, y) | x + y ≥ 1}.
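Example 6 can be worked by counting. The sketch below assumes the two refills are drawn without replacement, so that f(x, y) = C(3, x) C(2, y) C(3, 2 − x − y) / C(8, 2); the constant and variable names are illustrative.

```python
from fractions import Fraction
from math import comb

BLUE, RED, GREEN, DRAWN = 3, 2, 3, 2
WAYS = comb(BLUE + RED + GREEN, DRAWN)  # equally likely selections

def f(x, y):
    """Joint probability of x blue and y red refills among the two selected."""
    g = DRAWN - x - y  # number of green refills selected
    if x < 0 or y < 0 or g < 0:
        return Fraction(0)
    return Fraction(comb(BLUE, x) * comb(RED, y) * comb(GREEN, g), WAYS)

support = [(x, y) for x in range(DRAWN + 1) for y in range(DRAWN + 1)
           if x + y <= DRAWN]
p_A = sum(f(x, y) for x, y in support if x + y <= 1)  # region A
p_B = sum(f(x, y) for x, y in support if x + y >= 1)  # region B
print(p_A, p_B)
```

Region B is the complement of the single point (0, 0), so P[(X, Y) ∈ B] can also be obtained as 1 − f(0, 0).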
Joint density function for continuous random variables
The function f(x,y) is a joint density function of the continuous random variables X and Y if:
1. $f(x, y) \ge 0$ for all $(x, y)$,
2. $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$,
3. $P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$ for any region A in the xy plane.
Example 7: A candy company distributes boxes of chocolates with a mixture of creams, toffees, and nuts coated
in both light and dark chocolate. For a randomly selected box, let X and Y, respectively, be the proportions of the
light and dark chocolates that are creams and suppose that the joint density function is:
$$f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$
a) Verify that f(x, y) is a joint density function.
b) Find P[(X, Y) ∈ A], where A is the region {(x, y) | 0 < x < 1/2, 1/4 < y < 1/2}.
c) Find P[(X, Y) ∈ B], where B is the region {(x, y) | 1/2 < x < 3, −1 ≤ y < 1/4}.
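The three parts of Example 7 reduce to integrating f over a rectangle. The sketch below is one way to check them exactly. It uses the closed form of the double integral of (2/5)(2x + 3y) over [a, b] × [c, d], and clips each region to the unit square because f is zero elsewhere. The function name is illustrative.

```python
from fractions import Fraction as Fr

def prob_rectangle(a, b, c, d):
    """P(a < X < b, c < Y < d) for f(x, y) = (2/5)(2x + 3y) on the unit square.

    The rectangle is first clipped to the unit square, where f is nonzero.
    On the clipped rectangle the double integral has the closed form
    (2/5) * [(b^2 - a^2)(d - c) + (3/2)(d^2 - c^2)(b - a)].
    """
    a, b, c, d = (min(max(Fr(t), Fr(0)), Fr(1)) for t in (a, b, c, d))
    if a >= b or c >= d:
        return Fr(0)
    return Fr(2, 5) * ((b**2 - a**2) * (d - c)
                       + Fr(3, 2) * (d**2 - c**2) * (b - a))

total = prob_rectangle(0, 1, 0, 1)                     # part a)
p_A = prob_rectangle(0, Fr(1, 2), Fr(1, 4), Fr(1, 2))  # part b)
p_B = prob_rectangle(Fr(1, 2), 3, -1, Fr(1, 4))        # part c)
print(total, p_A, p_B)
```

In part c) only the portion 1/2 < x ≤ 1, 0 ≤ y < 1/4 of region B lies inside the unit square, so only that portion contributes to the probability.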
Marginal distributions
Given the joint probability distribution f(x,y) of the discrete random variables X and Y, the probability
distribution g(x) of X alone is obtained by summing f(x, y) over the values of Y. Similarly, the
probability distribution h(y) of Y alone is obtained by summing f(x,y) over the values of X. We define
g(x) and h(y) to be the marginal distributions of X and Y, respectively.
For the discrete case: $g(x) = \sum_y f(x, y)$ and $h(y) = \sum_x f(x, y)$.
For the continuous case: $g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$ and $h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$.
The values of g(x) and h(y) are just the marginal totals of the respective columns and rows when the
values of f(x, y) are displayed in a rectangular table.
Example 8: Find the marginal distributions of X and Y respectively for:
a) The case of Example 6.
b) The case of Example 7.
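For part a), the marginal distributions are the row and column totals of the joint table. The sketch below assumes the joint probabilities of Example 6 obtained by counting (written in 28ths); the dictionary layout and names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Joint probabilities f(x, y) assumed for Example 6, in 28ths
# (x = number of blue refills, y = number of red refills).
counts = {(0, 0): 3, (0, 1): 6, (0, 2): 1, (1, 0): 9, (1, 1): 6, (2, 0): 3}
joint = {xy: Fraction(n, 28) for xy, n in counts.items()}

g = defaultdict(Fraction)  # marginal distribution of X
h = defaultdict(Fraction)  # marginal distribution of Y
for (x, y), p in joint.items():
    g[x] += p  # sum over y for fixed x
    h[y] += p  # sum over x for fixed y

print(dict(g))
print(dict(h))
```

For part b), integrating the density of Example 7 over the other variable gives g(x) = (4x + 3)/5 for 0 ≤ x ≤ 1 and h(y) = 2(1 + 3y)/5 for 0 ≤ y ≤ 1, with both equal to zero elsewhere.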
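$$P(a < X < b \mid Y = y) = \int_a^b f(x \mid y)\,dx$$
where f(x | y) = f(x, y)/h(y), for h(y) > 0, is the conditional density of X given Y = y.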
Example 9: Referring to Example 6, find the conditional distribution of X, given that Y =1, and use it
to determine P(X = 0|Y= 1).
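Example 9 can be checked by dividing the column y = 1 of the joint table by its total h(1). The sketch assumes the joint probabilities of Example 6 obtained by counting (in 28ths); the function name is illustrative.

```python
from fractions import Fraction

# Joint probabilities f(x, y) assumed for Example 6, in 28ths.
counts = {(0, 0): 3, (0, 1): 6, (0, 2): 1, (1, 0): 9, (1, 1): 6, (2, 0): 3}
joint = {xy: Fraction(n, 28) for xy, n in counts.items()}

def conditional_x_given_y(y):
    """Return {x: f(x | y)} with f(x | y) = f(x, y) / h(y)."""
    h_y = sum(p for (_, yy), p in joint.items() if yy == y)  # marginal h(y)
    return {x: joint.get((x, y), Fraction(0)) / h_y for x in range(3)}

cond = conditional_x_given_y(1)
print(cond[0])  # P(X = 0 | Y = 1)
```

The value f(2 | 1) is zero because two blue refills and one red refill cannot both occur in a selection of two.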
Example 10: Referring to Example 8,
a) Find the conditional density f(y | x).
b) Evaluate P(Y > 0.25 | X = 0.5).
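A possible check of Example 10 follows, under the assumption that it refers to the continuous case, that is, the density f(x, y) = (2/5)(2x + 3y) of Example 7 with marginal g(x) = (4x + 3)/5. Under that assumption f(y | x) = 2(2x + 3y)/(4x + 3) for 0 ≤ y ≤ 1. The function names are illustrative.

```python
from fractions import Fraction as Fr

def cond_density_y_given_x(y, x):
    """f(y | x) = f(x, y) / g(x) with f = (2/5)(2x + 3y), g(x) = (4x + 3)/5."""
    if not (0 <= x <= 1 and 0 <= y <= 1):
        return Fr(0)
    return Fr(2, 5) * (2 * x + 3 * y) / ((4 * x + 3) / Fr(5))

def prob_y_greater(c, x):
    """P(Y > c | X = x) for 0 <= c <= 1 and 0 <= x <= 1.

    Obtained by integrating f(y | x) in y from c to 1:
    [4x(1 - c) + 3(1 - c^2)] / (4x + 3).
    """
    return (4 * x * (1 - c) + 3 * (1 - c**2)) / (4 * x + 3)

p = prob_y_greater(Fr(1, 4), Fr(1, 2))
print(p, float(p))
```

Setting c = 0 returns 1, which confirms that f(y | x) integrates to 1 over 0 ≤ y ≤ 1 for every admissible x.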
Example 11: A tobacco company produces blends of tobacco with each blend containing various
proportions of Turkish, domestic, and other tobaccos. The proportion of Turkish and domestic in a
blend are random variables with joint density function (X=Turkish and Y=domestic)
$$f(x, y) = \begin{cases} kxy, & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1, \\ 0, & \text{elsewhere.} \end{cases}$$
a) Find k so that f(x, y) is a probability density function.
b) Find the marginal density function for the proportion of domestic tobacco.
c) Find the marginal density function for the proportion of Turkish tobacco.
d) Find the probability that in a given box the Turkish tobacco accounts for over half the blend.
e) Find the probability that the proportion of Turkish tobacco is less than 1/8 if it is known that the blend
contains 3/4 domestic tobacco.
f) Find the probability that the proportion of domestic tobacco is less than 1/8.
g) Find the probability that the proportion of domestic tobacco and Turkish tobacco is greater than ½.
h) Are the random variables X and Y dependent or independent?
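Parts a) to f) of Example 11 involve only polynomial integrals, so they can be checked exactly. The sketch below assumes the inner integral has been done by hand: integrating xy over 0 ≤ y ≤ 1 − x gives x(1 − x)²/2. The helper `poly_integral` and the variable names are illustrative.

```python
from fractions import Fraction as Fr

def poly_integral(coeffs, a, b):
    """Exact integral over [a, b] of the polynomial sum(coeffs[i] * t**i)."""
    def antiderivative(t):
        return sum(Fr(c) * Fr(t) ** (i + 1) / (i + 1)
                   for i, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

# x(1 - x)^2 / 2 = (x - 2x^2 + x^3) / 2, as coefficients of 1, x, x^2, x^3.
inner = [0, Fr(1, 2), -1, Fr(1, 2)]
k = 1 / poly_integral(inner, 0, 1)  # part a): normalising constant

# Parts b), c): marginal density k * t(1 - t)^2 / 2 on [0, 1].
# f is symmetric in x and y, so both marginals have this form.
marginal = [k * c for c in inner]

p_turkish_over_half = poly_integral(marginal, Fr(1, 2), 1)      # part d)
p_domestic_under_eighth = poly_integral(marginal, 0, Fr(1, 8))  # part f)

# Part e): f(x | y) = 2x / (1 - y)^2 on 0 <= x <= 1 - y, here with y = 3/4.
y = Fr(3, 4)
p_cond = poly_integral([0, 2 / (1 - y) ** 2], 0, Fr(1, 8))

print(k, p_turkish_over_half, p_cond, p_domestic_under_eighth)
```

For part h), the region where f is positive is a triangle rather than a rectangle. For instance f(3/4, 3/4) = 0 while g(3/4) h(3/4) > 0, so X and Y are dependent.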
Statistical independence
Let X and Y be two random variables, discrete or continuous, with joint probability distribution f(x,y)
and marginal distributions g(x) and h(y), respectively.
The random variables X and Y are said to be statistically independent if and only if
f(x,y) = g(x)h(y) for all (x, y) within their range.
In such a case: f(x | y) = g(x) and f(y | x) = h(y).
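The definition can be tested directly on a discrete table by comparing f(x, y) with g(x)h(y) at every point. The sketch below assumes the joint probabilities of Example 6 obtained by counting (in 28ths); the function `is_independent` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Joint probabilities f(x, y) assumed for Example 6, in 28ths.
counts = {(0, 0): 3, (0, 1): 6, (0, 2): 1, (1, 0): 9, (1, 1): 6, (2, 0): 3}
joint = {xy: Fraction(n, 28) for xy, n in counts.items()}

xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})
g = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}  # marginal of X
h = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}  # marginal of Y

def is_independent(joint, g, h):
    """True if f(x, y) == g(x) * h(y) for every pair (x, y) in the range."""
    return all(joint.get((x, y), 0) == g[x] * h[y] for x, y in product(g, h))

independent = is_independent(joint, g, h)
print(independent)
```

A single point where the product rule fails is enough to establish dependence; here f(0, 1) = 3/14 while g(0)h(1) = 15/98.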
All the preceding definitions concerning two random variables can be generalized to the case of n
random variables. Let f(x1, x2, …, xn) be the joint probability function of the random variables X1, X2, …,
Xn. The marginal distribution of X1, for example, is:
For the discrete case: $g(x_1) = \sum_{x_2} \cdots \sum_{x_n} f(x_1, x_2, \ldots, x_n)$
And for the continuous case: $g(x_1) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\,dx_2 \cdots dx_n$
We obtain in the same way joint marginal distributions:
For the discrete case: $g(x_1, x_2) = \sum_{x_3} \cdots \sum_{x_n} f(x_1, x_2, \ldots, x_n)$
And for the continuous case: $g(x_1, x_2) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\,dx_3 \cdots dx_n$
We could consider numerous conditional distributions. For example, the joint conditional
distribution of X1, X2 , and X3, given that X4 = x4 , X5 = x5 , . . . , Xn = xn, is written as:
$$f(x_1, x_2, x_3 \mid x_4, x_5, \ldots, x_n) = \frac{f(x_1, x_2, \ldots, x_n)}{g(x_4, x_5, \ldots, x_n)}$$
Let X1, X2, …, Xn be n random variables, discrete or continuous, with joint probability distribution
f(x1, x2, …, xn) and marginal distributions f1(x1), f2(x2), …, fn(xn), respectively. The random variables
X1, X2, …, Xn are said to be mutually statistically independent if and only if
$$f(x_1, x_2, \ldots, x_n) = f_1(x_1)\, f_2(x_2) \cdots f_n(x_n)$$
for all (x1, x2, …, xn) within their range.
Example 12: Suppose that the shelf life, in years, of a certain perishable food product packaged in
cardboard containers is a random variable whose probability density function is given by
$$f(x) = \begin{cases} e^{-x}, & x > 0, \\ 0, & \text{elsewhere.} \end{cases}$$
Let X1, X2, and X3 represent the shelf lives for three of these containers selected independently and
find P(X1 < 2, 1 < X2 < 3, X3 > 2).
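Because the containers are selected independently, the joint density is the product f(x1) f(x2) f(x3), and the required probability factors into three one-dimensional integrals. The sketch below evaluates each factor from the antiderivative of e^(−x); the function name is illustrative.

```python
import math

def exp_interval_prob(a, b=math.inf):
    """P(a < X < b) for the density f(x) = exp(-x), x > 0."""
    a = max(a, 0.0)  # the density is zero for x <= 0
    if b <= a:
        return 0.0
    return math.exp(-a) - math.exp(-b)

# P(X1 < 2, 1 < X2 < 3, X3 > 2) = P(X1 < 2) * P(1 < X2 < 3) * P(X3 > 2)
p = exp_interval_prob(0, 2) * exp_interval_prob(1, 3) * exp_interval_prob(2)
print(round(p, 4))  # → 0.0372
```

In closed form the product is (1 − e^(−2))(e^(−1) − e^(−3)) e^(−2).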