Hacettepe Journal of Mathematics and Statistics
Volume 39 (1) (2010), 109 – 120
DISCRETE DISTRIBUTIONS
CONNECTED WITH THE
BIVARIATE BINOMIAL
I. Bairamov∗† and O. Elmastas Gultekin‡
Received 19:02:2009 : Accepted 02:12:2009
Abstract
A new class of multivariate discrete distributions with binomial and
multinomial marginals is studied. This class of distributions is obtained in a natural manner using probabilistic properties of the sampling model considered. Some possible applications in game theory, life
testing and exceedance models for order statistics are discussed.
Keywords: Discrete multivariate distributions, Bivariate binomial distribution, Multinomial distribution, Probability density function, Poisson approximation.
2000 AMS Classification: 62 E 15, 62 E 17.
1. Introduction
Bivariate and multivariate binomial distributions have aroused the interest of many
authors as a natural extension of the univariate binomial distribution. Aitken and Gonin [1] derived bivariate binomial probability functions by considering sampling with
replacement from a fourfold population, and expressed the bivariate probability function as products of the corresponding univariate functions, multiplied by a terminating
series bilinear in the appropriate orthogonal polynomials. Krishnamoorthy [17] studied
the multivariate binomial distribution and extended the series of Aitken and Gonin [1]
for a bivariate binomial distribution to any number of variables. In the papers of Hamdan [10, 11], Hamdan and Al-Bayati [12], Hamdan and Jensen [13], Papageorgiou and
David [19, 20], Doss and Graham [8], Shanbhag and Basawa [21], the conditional distributions associated with trivariate and multivariate binomial distributions were studied,
and characterizations of the multivariate binomial distribution by its univariate marginals were established. For some discussions on bivariate binomial distributions see Kocherlakota and
Kocherlakota [16] and Johnson et al. [15].
∗ Izmir University of Economics, Department of Mathematics, Izmir, Turkey. E-mail: [email protected]
† Corresponding Author.
‡ Ege University, Department of Statistics, Izmir, Turkey. E-mail: [email protected]
Biswas and Hwang [4] provided a new formulation of the bivariate binomial distribution, in the sense that each of the two random variables is marginally binomial and the two have some non-zero correlation in the joint distribution. Chandrasekar and Balakrishnan [6] considered a trivariate binomial distribution and obtained
regression equations of this distribution. They provided a set of necessary and sufficient
conditions for the regression to be linear, and also established a characterization of the
trivariate binomial distribution based on the distribution of the sum of two trivariate
random vectors.
In the present paper we consider new trivariate and quadrivariate distributions constructed on the basis of a bivariate binomial distribution. These distributions appear in
several models in the contexts of life-testing and exceedances, and can also be applied in strategic games. We also consider an extension of the bivariate binomial model to the case when each individual of a population is classified as one of $A_1, A_2, \ldots, A_m$ and simultaneously as one of $B_1, B_2, \ldots, B_m$, with probabilities given by $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, m$, $\sum_{i,j} p_{ij} = 1$, $P(A_i) = \sum_{j=1}^{m} P(A_iB_j)$, $P(B_j) = \sum_{i=1}^{m} P(A_iB_j)$. Let the
experiment be repeated n times. Assume that χ1 , χ2 , χ11 , χ12 and χ21 are the numbers of occurrences of the events A1 , B1 , A1 B1 , A1 B2 and A2 B1 in these n repetitions,
respectively. We study the joint distributions of the random variables and discuss their
possible applications.
For a description of a simple bivariate binomial distribution consider the fourfold
model:
A\B    B1     B2
A1     π11    π12
A2     π21    π22
wherein each individual of a population can be classified as being one of $A_1, A_2$ and at the same time as one of $B_1, B_2$, with probabilities $P(A_iB_j) = \pi_{ij}$, $i, j = 1, 2$; $\sum_{i,j} \pi_{ij} = 1$.
Under random sampling with replacement n times, let ξ1 and ξ2 denote the number of
occurrences of A1 and B1 , respectively. It is well known that
$$p_1(k, l) = P\{\xi_1 = k, \xi_2 = l\} = \sum_{i=\max(0,\,k+l-n)}^{\min(k,\,l)} \frac{n!}{i!\,(k-i)!\,(l-i)!\,(n-k-l+i)!}\; \pi_{11}^{i}\,\pi_{12}^{k-i}\,\pi_{21}^{l-i}\,\pi_{22}^{n-k-l+i} \tag{1}$$
(see Aitken and Gonin [1] and Johnson, Kotz and Balakrishnan [15]). The bivariate
discrete distribution given in (1) is called a bivariate binomial distribution. The corresponding probability generating function (pgf) is
$$\Phi_1(t, s) = (\pi_{11}ts + \pi_{12}t + \pi_{21}s + \pi_{22})^n. \tag{2}$$
A connection between a bivariate binomial distribution and a multinomial distribution
can be shown as follows. In the fourfold model described above, let A1 B1 = C1 , A1 B2 =
C2 , A2 B1 = C3 , A2 B2 = C4 and P (C1 ) = p11 , P (C2 ) = p12 , P (C3 ) = p21 , P (C4 ) = p22 .
Let ζi be the number of cases in which Ci occurs in n repetitions, i = 1, 2, 3, 4. Clearly,
(ζ1 , ζ2 , ζ3 , ζ4 ) is multinomial. Then ξ1 = ζ1 + ζ2 and ξ2 = ζ1 + ζ3 .
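To make this connection concrete, the following sketch (with a hypothetical fourfold table and a small $n$, chosen only for illustration) computes $p_1(k, l)$ directly from (1) and checks it against the multinomial representation $\xi_1 = \zeta_1 + \zeta_2$, $\xi_2 = \zeta_1 + \zeta_3$ by enumerating all outcomes of $(\zeta_1, \zeta_2, \zeta_3, \zeta_4)$.

```python
from math import factorial
from itertools import product

# Hypothetical fourfold cell probabilities (any nonnegative values summing to 1 will do).
pi = {(1, 1): 0.2, (1, 2): 0.3, (2, 1): 0.1, (2, 2): 0.4}
n = 5

def p1(k, l):
    """Bivariate binomial pmf (1): P{xi_1 = k, xi_2 = l}."""
    total = 0.0
    for i in range(max(0, k + l - n), min(k, l) + 1):
        coef = factorial(n) // (factorial(i) * factorial(k - i)
                                * factorial(l - i) * factorial(n - k - l + i))
        total += (coef * pi[1, 1] ** i * pi[1, 2] ** (k - i)
                  * pi[2, 1] ** (l - i) * pi[2, 2] ** (n - k - l + i))
    return total

def p1_via_multinomial(k, l):
    """Same probability through the multinomial vector (zeta_1, ..., zeta_4)."""
    total = 0.0
    for z1, z2, z3 in product(range(n + 1), repeat=3):
        z4 = n - z1 - z2 - z3
        if z4 < 0 or z1 + z2 != k or z1 + z3 != l:   # xi_1 = z1+z2, xi_2 = z1+z3
            continue
        coef = factorial(n) // (factorial(z1) * factorial(z2)
                                * factorial(z3) * factorial(z4))
        total += (coef * pi[1, 1] ** z1 * pi[1, 2] ** z2
                  * pi[2, 1] ** z3 * pi[2, 2] ** z4)
    return total

assert abs(p1(2, 3) - p1_via_multinomial(2, 3)) < 1e-12
print(p1(2, 3))
```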
A simple trivariate distribution in the fourfold model described above is of interest.
Under random sampling n times, let ξ1 , ξ2 and ξ11 be the number of occurrences of A1 , B1
and A1 B1 , respectively. The joint probability function of the random variables ξ1 , ξ2 and
ξ11 can be obtained easily from combinatorial considerations, and it is
$$p_n(k, l, r) = P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\} = \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\; \pi_{11}^{r}\,\pi_{12}^{k-r}\,\pi_{21}^{l-r}\,\pi_{22}^{n-k-l+r}, \tag{3}$$
$(k, l = 0, 1, 2, \ldots, n$ and $r = \max(0, k+l-n), \ldots, \min(k, l))$.
The corresponding probability generating function is
$$\Psi(t, s, z) = (\pi_{11}tsz + \pi_{12}t + \pi_{21}s + \pi_{22})^n.$$
It is clear that the univariate marginals of the discrete random vector (ξ1 , ξ2 , ξ11 ) are
binomial, with cell probabilities (π11 + π12 ), (π11 + π21 ) and π11 , respectively. The joint
distribution of (ξ1 , ξ2 ) is obviously the bivariate binomial distribution with probability
function (1) and pgf $\Psi(t, s, 1) = (\pi_{11}ts + \pi_{12}t + \pi_{21}s + \pi_{22})^n$, as in (2).
The joint probability function of $(\xi_1, \xi_{11})$ is
$$P\{\xi_1 = k, \xi_{11} = r\} = \sum_{l=0}^{n} \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\; \pi_{11}^{r}\,\pi_{12}^{k-r}\,\pi_{21}^{l-r}\,\pi_{22}^{n-k-l+r},$$
and the pgf is
$$\Psi_{1,11}(t, z) = \Psi(t, 1, z) = (\pi_{11}tz + \pi_{12}t + \pi_{21} + \pi_{22})^n.$$
Similarly, the joint probability function of (ξ2 , ξ11 ) is
$$P\{\xi_2 = l, \xi_{11} = r\} = \sum_{k=0}^{n} \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\; \pi_{11}^{r}\,\pi_{12}^{k-r}\,\pi_{21}^{l-r}\,\pi_{22}^{n-k-l+r},$$
and the corresponding pgf is
$$\Psi_{2,11}(s, z) = \Psi(1, s, z) = (\pi_{11}sz + \pi_{21}s + \pi_{12} + \pi_{22})^n.$$
The Poisson procedure allows us to obtain a formula that approximates the joint probability function $p_n(k, l, r)$ when the number of trials is large, i.e. $n \to \infty$ with $n\pi_{11} \to \lambda_{11}$, $n\pi_{12} \to \lambda_{12}$, $n\pi_{21} \to \lambda_{21}$. We have
$$\lim_{n\to\infty} P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\} = \lim_{n\to\infty} \frac{(1 - \tfrac{1}{n})(1 - \tfrac{2}{n}) \cdots (1 - \tfrac{k+l-r-1}{n})}{r!\,(k-r)!\,(l-r)!}\; \lambda_{11}^{r}\,\lambda_{12}^{k-r}\,\lambda_{21}^{l-r} \left(1 - \frac{\lambda_{11}+\lambda_{12}+\lambda_{21}}{n}\right)^{\!n} \left(1 - \frac{\lambda_{11}+\lambda_{12}+\lambda_{21}}{n}\right)^{\!-(k+l-r)}$$
$$= \frac{\lambda_{11}^{r}\,\lambda_{12}^{k-r}\,\lambda_{21}^{l-r}}{r!\,(k-r)!\,(l-r)!}\, \exp(-(\lambda_{11}+\lambda_{12}+\lambda_{21})).$$
Therefore $p_n(k, l, r) \to p(k, l, r)$, where
$$p(k, l, r) = \frac{\lambda_{11}^{r}\,\lambda_{12}^{k-r}\,\lambda_{21}^{l-r}}{r!\,(k-r)!\,(l-r)!}\, \exp(-(\lambda_{11}+\lambda_{12}+\lambda_{21}));$$
$(k, l = 0, 1, 2, \ldots$ and $r = 0, 1, 2, \ldots, \min(k, l))$.
This distribution is a trivariate Poisson distribution.
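A quick numerical check of this limit (a sketch with hypothetical rates $\lambda_{11}, \lambda_{12}, \lambda_{21}$, setting $\pi_{ij} = \lambda_{ij}/n$ for a large $n$): the exact pmf (3) and its Poisson limit should nearly coincide.

```python
from math import comb, exp, factorial

lam11, lam12, lam21 = 0.5, 1.0, 0.8   # hypothetical rates
n = 2000                              # "large" number of trials

def p_n(k, l, r):
    """Exact trivariate pmf (3) with pi_11 = lam11/n, pi_12 = lam12/n, pi_21 = lam21/n."""
    pi11, pi12, pi21 = lam11 / n, lam12 / n, lam21 / n
    pi22 = 1.0 - pi11 - pi12 - pi21
    # Multinomial coefficient n! / (r! (k-r)! (l-r)! (n-k-l+r)!) via binomials.
    coef = comb(n, r) * comb(n - r, k - r) * comb(n - k, l - r)
    return coef * pi11 ** r * pi12 ** (k - r) * pi21 ** (l - r) * pi22 ** (n - k - l + r)

def p_poisson(k, l, r):
    """Limiting trivariate Poisson pmf p(k, l, r)."""
    return (lam11 ** r * lam12 ** (k - r) * lam21 ** (l - r)
            / (factorial(r) * factorial(k - r) * factorial(l - r))
            * exp(-(lam11 + lam12 + lam21)))

print(p_n(3, 2, 1), p_poisson(3, 2, 1))   # the two values should nearly coincide
```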
2. Extensions of the bivariate binomial distribution
2.1. Example. Consider a strategic game between two players A and B. Player A uses one of the strategies $A_1, A_2, \ldots, A_n$ while, simultaneously, Player B uses one of the strategies $B_1, B_2, \ldots, B_n$. The probability of the event "A uses strategy $A_i$ and B uses strategy $B_j$" is $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, n$. If A uses strategy $A_i$ against strategy $B_j$ used by B,
then A wins aij units and B loses aij units. If the game is repeated n times, then we are
interested in the joint distribution of the random variables χ1 and χ2 , where χ1 is the
number of cases in which the strategy A1 was used, χ2 is the number of cases in which
the strategy B1 was used. Clearly, (χ1 , χ2 ) is bivariate binomial.
Now assume that a third party is interested in this game, and makes some profit in all cases when the strategy $A_1$ of the first player or the strategy $B_1$ of the second player is used. Let $\chi_{11}$ be the number of cases in which $A_1$ and $B_1$ were used simultaneously. Then the quantity of interest to the third party is $\chi_1 + \chi_2 - \chi_{11}$, the number of cases in which $A_1$ or $B_1$ was used in the $n$ repetitions of the game. It is clear that
$$P\{\chi_1 + \chi_2 - \chi_{11} = m\} = \sum_{k,\,l} P\{\chi_1 = k, \chi_2 = l, \chi_{11} = k + l - m\}.$$
Therefore, the joint probability function of $\chi_1$, $\chi_2$ and $\chi_{11}$ is required.
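A minimal sketch of this calculation, assuming a hypothetical 3 × 3 table of strategy-pair probabilities (the values below are illustrative, not from the paper): the cells are aggregated to the four probabilities governing $(\chi_1, \chi_2, \chi_{11})$, and the distribution of $\chi_1 + \chi_2 - \chi_{11}$ then follows from the trivariate pmf (3).

```python
from math import comb

# Hypothetical strategy-pair probabilities P(A_i B_j) for a 3 x 3 game.
p = [[0.10, 0.05, 0.05],
     [0.10, 0.20, 0.10],
     [0.05, 0.15, 0.20]]
n = 4  # number of plays

# Aggregate to the fourfold cells that govern (chi_1, chi_2, chi_11).
pi11 = p[0][0]                          # A_1 and B_1 together
pi12 = sum(p[0][1:])                    # A_1 with some B_j, j >= 2
pi21 = sum(row[0] for row in p[1:])     # B_1 with some A_i, i >= 2
pi22 = 1.0 - pi11 - pi12 - pi21

def p3(k, l, r):
    """Joint pmf of (chi_1, chi_2, chi_11), as in (3)."""
    if not (max(0, k + l - n) <= r <= min(k, l)):
        return 0.0
    coef = comb(n, r) * comb(n - r, k - r) * comb(n - k, l - r)
    return coef * pi11 ** r * pi12 ** (k - r) * pi21 ** (l - r) * pi22 ** (n - k - l + r)

# Distribution of the number of plays in which A_1 or B_1 was used.
for m in range(n + 1):
    prob = sum(p3(k, l, k + l - m) for k in range(n + 1) for l in range(n + 1))
    print(m, round(prob, 6))
```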
2.2. Example. Suppose n independent units, each consisting of two components, are
placed on a life-test with the corresponding failure times (X1 , Y1 ), (X2 , Y2 ), . . ., (Xn , Yn )
being identically distributed with cumulative distribution function F (x, y) and probability density function f (x, y). For predefined numbers a1 < a2 , if Xi ≤ a1 we say that the
ith unit fails test A1 . If a1 < Xi ≤ a2 , then we say that the ith unit is successful in test
A1 , but fails test A2 . Similarly, for b1 < b2 , if Yi ≤ b1 , then we say that the ith unit fails
test B1, while b1 < Yi ≤ b2 means that it passes test B1 but fails test B2. Under this setup we
are interested in the following probabilities.
a) What is the probability that q1 units fail test A1, q2 units fail test B1, and
fewer than q3 units fail both tests A1 and B1? If the number of units that
fail test A1 is χ1, the number of units that fail test B1 is χ2, the number of
units that fail both tests A1 and B1 is χ11, the number of units that fail both
tests A2 and B1 is χ21, and the number of units that fail both tests A1 and
B2 is χ12, then the required probability is P {χ1 = q1, χ2 = q2, χ11 < q3}.
b) What is the probability that q1 units fail test A1 , q2 units fail test B1 , q3
units fail both tests A1 and B2 , and q4 units fail both tests A2 and B1 ? This
probability is P {χ1 = q1 , χ2 = q2 , χ12 = q3 , χ21 = q4 }. Therefore the joint pmf
of (χ1 , χ2 , χ12 , χ21 ) is required.
2.1. A model. A general model, including the two examples above, can be described as follows. Suppose that in an experiment the results are observed as one of the events $A_1, A_2, \ldots, A_m$, and at the same time as one of the events $B_1, B_2, \ldots, B_m$, with probabilities $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, m$; $\sum_{i,j} p_{ij} = 1$, $m \ge 3$. This means that the
outcomes in the experiment are pairs Ai Bj , i, j = 1, 2, . . . , m. Assume that we repeat
the experiment n times and that the trials are independent.
2.3. Definition. Let χ1 , χ2 , χ11 , χ12 , χ21 denote the number of occurrences of A1 , B1 ,
A1 B1 , A1 B2 , A2 B1 , respectively.
From (3) it is clear that the joint probability function of the random variables χ1 , χ2
and χ11 is
$$P\{\chi_1 = k, \chi_2 = l, \chi_{11} = r\} = \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\; \Pi_{11}^{r}\,\Pi_{12}^{k-r}\,\Pi_{21}^{l-r}\,\Pi_{22}^{n-k-l+r}, \tag{4}$$
where $\Pi_{11} = p_{11}$, $\Pi_{12} = \sum_{j=2}^{m} p_{1j}$, $\Pi_{21} = \sum_{i=2}^{m} p_{i1}$ and $\Pi_{22} = 1 - p_{11} - \sum_{j=2}^{m} p_{1j} - \sum_{i=2}^{m} p_{i1}$.
2.4. Theorem. Let $P_{11} = p_{11}$, $P_{12} = p_{12}$, $P_{21} = p_{21}$, $P_{13} = \sum_{j=3}^{m} p_{1j}$ and $P_{31} = \sum_{i=3}^{m} p_{i1}$. Then

a) The joint probability function of the random variables $\chi_1$, $\chi_2$, $\chi_{12}$ and $\chi_{21}$ is
$$P_n\{k, l, r, h\} = P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$$
$$= \sum_{i=\max(0,\,k+l-n)}^{\min(k-r,\,l-h)} \frac{n!}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!\,(n-k-l+i)!}\; P_{11}^{i}\,P_{12}^{r}\,P_{13}^{k-i-r}\,P_{21}^{h}\,P_{31}^{l-i-h}\,(1 - P_{11} - P_{12} - P_{13} - P_{21} - P_{31})^{n-k-l+i} \tag{5}$$
$(k = r, \ldots, n-h;\ l = h, \ldots, n-r;\ r = 0, \ldots, n-h;\ h = 0, \ldots, n)$.

b) The joint probability generating function is given by
$$\Phi(t, s, z, c) = (\alpha_1 ts + \alpha_2 tz + \alpha_3 sc + \alpha_4 t + \alpha_5 s + \alpha_6)^n, \tag{6}$$
where $\alpha_1 = P_{11}$, $\alpha_2 = P_{12}$, $\alpha_3 = P_{21}$, $\alpha_4 = P_{13}$, $\alpha_5 = P_{31}$ and $\alpha_6 = \sum_{i=2}^{m}\sum_{j=2}^{m} p_{ij}$.
Proof. It is clear that without loss of generality we can take m = 3. The model can be
described symbolically as follows:
A\B    B1             B2        B3
A1     p11 (i)        p12 (r)   p13 (k−i−r)
A2     p21 (h)        p22       p23
A3     p31 (l−i−h)    p32       p33

(The parenthesized quantities indicate how many of the n trials fall in each cell: i in A1B1, r in A1B2, k−i−r in A1B3, h in A2B1 and l−i−h in A3B1.)
It is clear that if we repeat the experiment $n$ times, then the $r$ outcomes of the event $A_1$ observed together with $B_2$ can be chosen in $\binom{n}{r}$ ways, the $i$ outcomes observed together with $B_1$ in $\binom{n-r}{i}$ ways, and the $k-r-i$ outcomes observed together with $B_3$ in $\binom{n-r-i}{k-r-i}$ ways. Then the $h$ outcomes of the event $B_1$ realized together with $A_2$ can be chosen in $\binom{n-r-i-(k-r-i)}{h} = \binom{n-k}{h}$ ways, and the $l-i-h$ outcomes realized together with $A_3$ in $\binom{n-k-h}{l-i-h}$ ways. Therefore, in $n$ repeated independent trials, the number of possible cases in which $A_1B_1$ appears $i$ times, $A_1$ appears $k$ times, $B_1$ appears $l$ times, $A_1B_2$ appears $r$ times and $A_2B_1$ appears $h$ times is
$$\binom{n}{r}\binom{n-r}{i}\binom{n-r-i}{k-i-r}\binom{n-k}{h}\binom{n-k-h}{l-i-h} = \frac{n!}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!\,(n-k-l+i)!},$$
and each case has the same probability
$$P_{11}^{i}\,P_{12}^{r}\,P_{13}^{k-i-r}\,P_{21}^{h}\,P_{31}^{l-i-h}\,(1 - P_{11} - P_{12} - P_{13} - P_{21} - P_{31})^{n-k-l+i}.$$
It is then easy to see that i changes from max(0, k + l − n) to min(k − r, l − h), and
consequently we obtain (5).
To derive the joint probability generating function, let us write
$$\xi_1^{i} = \begin{cases} 1 & \text{if in the $i$th trial $A_1$ appears}, \\ 0 & \text{otherwise}, \end{cases} \qquad
\xi_2^{i} = \begin{cases} 1 & \text{if in the $i$th trial $B_1$ appears}, \\ 0 & \text{otherwise}, \end{cases}$$
$$\xi_{12}^{i} = \begin{cases} 1 & \text{if in the $i$th trial $A_1B_2$ appears}, \\ 0 & \text{otherwise}, \end{cases} \qquad
\xi_{21}^{i} = \begin{cases} 1 & \text{if in the $i$th trial $A_2B_1$ appears}, \\ 0 & \text{otherwise}, \end{cases}$$
for $i = 1, 2, \ldots, n$. It is then clear that $\chi_1 = \sum_{i=1}^{n} \xi_1^{i}$, $\chi_2 = \sum_{i=1}^{n} \xi_2^{i}$, $\chi_{12} = \sum_{i=1}^{n} \xi_{12}^{i}$ and $\chi_{21} = \sum_{i=1}^{n} \xi_{21}^{i}$. Since the $n$ trials are independent, the pgf of the random vector $(\chi_1, \chi_2, \chi_{12}, \chi_{21})$ can then be written as
$$\Phi(t, s, z, c) = \Bigg( \sum_{x_1, x_2, x_3, x_4 = 0}^{1} t^{x_1} s^{x_2} z^{x_3} c^{x_4}\, q_{x_1, x_2, x_3, x_4} \Bigg)^{\!n}, \tag{7}$$
where
$$q_{x_1, x_2, x_3, x_4} = P\{\xi_1^{i} = x_1,\ \xi_2^{i} = x_2,\ \xi_{12}^{i} = x_3,\ \xi_{21}^{i} = x_4\};\qquad x_1, x_2, x_3, x_4 = 0, 1.$$
We have
$q_{1,1,1,1} = P(A_1B_1(A_1B_2)(A_2B_1)) = 0$
$q_{0,1,1,1} = P(A_1^{c}B_1(A_1B_2)(A_2B_1)) = 0$
$q_{1,1,1,0} = P(A_1B_1(A_1B_2)(A_2B_1)^{c}) = 0$
$q_{0,1,1,0} = P(A_1^{c}B_1(A_1B_2)(A_2B_1)^{c}) = 0$
$q_{1,1,0,1} = P(A_1B_1(A_1B_2)^{c}(A_2B_1)) = 0$
$q_{0,1,0,1} = P(A_1^{c}B_1(A_1B_2)^{c}(A_2B_1)) = P_{21}$
$q_{1,1,0,0} = P(A_1B_1(A_1B_2)^{c}(A_2B_1)^{c}) = P_{11}$
$q_{0,1,0,0} = P(A_1^{c}B_1(A_1B_2)^{c}(A_2B_1)^{c}) = P_{31}$
$q_{1,0,1,1} = P(A_1B_1^{c}(A_1B_2)(A_2B_1)) = 0$
$q_{0,0,1,1} = P(A_1^{c}B_1^{c}(A_1B_2)(A_2B_1)) = 0$
$q_{1,0,1,0} = P(A_1B_1^{c}(A_1B_2)(A_2B_1)^{c}) = P_{12}$
$q_{0,0,1,0} = P(A_1^{c}B_1^{c}(A_1B_2)(A_2B_1)^{c}) = 0$
$q_{1,0,0,1} = P(A_1B_1^{c}(A_1B_2)^{c}(A_2B_1)) = 0$
$q_{0,0,0,1} = P(A_1^{c}B_1^{c}(A_1B_2)^{c}(A_2B_1)) = 0$
$q_{1,0,0,0} = P(A_1B_1^{c}(A_1B_2)^{c}(A_2B_1)^{c}) = P_{13}$
$q_{0,0,0,0} = P(A_1^{c}B_1^{c}(A_1B_2)^{c}(A_2B_1)^{c}) = \sum_{i=2}^{3}\sum_{j=2}^{3} p_{ij}.$
Substituting these values in (7) and simplifying, we obtain (6). Observe that $k = r, r+1, \ldots, n-h$; $l = h, h+1, \ldots, n-r$; $r = 0, 1, \ldots, n-h$; $h = 0, 1, \ldots, n$.
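As a sanity check on Theorem 2.4 a), the following sketch (hypothetical cell probabilities $p_{ij}$ for $m = 3$ and a small $n$) implements formula (5) and verifies that the resulting pmf sums to one over its support.

```python
from math import factorial

# Hypothetical cell probabilities p[i][j] = P(A_{i+1} B_{j+1}) for m = 3.
p = [[0.10, 0.05, 0.05],
     [0.10, 0.20, 0.10],
     [0.05, 0.15, 0.20]]
n = 3

P11, P12, P21 = p[0][0], p[0][1], p[1][0]
P13 = sum(p[0][2:])                       # P(A_1 B_j), j >= 3
P31 = sum(row[0] for row in p[2:])        # P(A_i B_1), i >= 3
alpha6 = 1.0 - P11 - P12 - P13 - P21 - P31

def pmf5(k, l, r, h):
    """Joint pmf of (chi_1, chi_2, chi_12, chi_21), formula (5)."""
    total = 0.0
    for i in range(max(0, k + l - n), min(k - r, l - h) + 1):
        coef = factorial(n) // (factorial(i) * factorial(r) * factorial(h)
                                * factorial(k - i - r) * factorial(l - i - h)
                                * factorial(n - k - l + i))
        total += (coef * P11 ** i * P12 ** r * P13 ** (k - i - r)
                  * P21 ** h * P31 ** (l - i - h) * alpha6 ** (n - k - l + i))
    return total

# The probabilities over the full support should sum to 1.
s = sum(pmf5(k, l, r, h)
        for k in range(n + 1) for l in range(n + 1)
        for r in range(k + 1) for h in range(l + 1))
print(s)   # ~1.0
```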
2.1.1. Marginal distributions. The univariate marginals of the discrete random vector
(χ1 , χ2 , χ12 , χ21 ) are binomial with cell probabilities (P11 + P12 + P13 ), (P11 + P21 + P31 ),
P12 and P21 , respectively.
The joint distribution of (χ1 , χ2 ) is obviously the bivariate binomial distribution with
probability function (1) and pgf
$$\Psi_{1,2}(t, s) = \Phi(t, s, 1, 1) = (b_1 ts + b_2 t + b_3 s + b_4)^n,$$
where $b_1 = \alpha_1$, $b_2 = \alpha_2 + \alpha_4$, $b_3 = \alpha_3 + \alpha_5$ and $b_4 = \alpha_6$, as in (2).
The joint pgf of $(\chi_1, \chi_{12})$ is
$$\Psi_{1,12}(t, z) = \Phi(t, 1, z, 1) = (c_1 tz + c_2 t + c_3)^n,$$
where $c_1 = \alpha_2$, $c_2 = \alpha_1 + \alpha_4$, $c_3 = \alpha_3 + \alpha_5 + \alpha_6$.
The joint pgf of (χ1 , χ21 ) is
$$\Psi_{1,21}(t, c) = \Phi(t, 1, 1, c) = (d_1 t + d_2 c + d_3)^n,$$
where d1 = α1 + α2 + α4 , d2 = α3 , d3 = α5 + α6 , which is the pgf of a trinomial
distribution.
The joint pgf of (χ2 , χ12 ) is
$$\Psi_{2,12}(s, z) = \Phi(1, s, z, 1) = (e_1 s + e_2 z + e_3)^n,$$
where e1 = α1 + α3 + α5 , e2 = α2 , e3 = α4 + α6 , which is the pgf of a trinomial
distribution.
The joint pgf of (χ2 , χ21 ) is
$$\Psi_{2,21}(s, c) = \Phi(1, s, 1, c) = (f_1 cs + f_2 s + f_3)^n,$$
where f1 = α3 , f2 = α1 + α5 , f3 = α2 + α4 + α6 .
The joint pgf of $(\chi_{12}, \chi_{21})$ is
$$\Psi_{12,21}(z, c) = \Phi(1, 1, z, c) = (g_1 c + g_2 z + g_3)^n,$$
where g1 = α3 , g2 = α2 , g3 = α1 + α4 + α5 + α6 , which is the pgf of a trinomial
distribution.
The trivariate marginals of the discrete random vector (χ1 , χ2 , χ12 , χ21 ) are as follows.
The joint pgf of (χ1 , χ2 , χ12 ) is
$$\Psi_{1,2,12}(t, s, z) = \Phi(t, s, z, 1) = (h_1 ts + h_2 tz + h_3 t + h_4 s + h_5)^n,$$
where h1 = α1 , h2 = α2 , h3 = α4 , h4 = α3 + α5 , h5 = α6 .
The joint pgf of (χ1 , χ2 , χ21 ) is
$$\Psi_{1,2,21}(t, s, c) = \Phi(t, s, 1, c) = (j_1 ts + j_2 cs + j_3 t + j_4 s + j_5)^n,$$
where j1 = α1 , j2 = α3 , j3 = α2 + α4 , j4 = α5 , j5 = α6 .
The joint pgf of (χ1 , χ12 , χ21 ) is
$$\Psi_{1,12,21}(t, z, c) = \Phi(t, 1, z, c) = (k_1 tz + k_2 t + k_3 c + k_4)^n,$$
where k1 = α2 , k2 = α1 + α4 , k3 = α3 , k4 = α5 + α6 .
The joint pgf of (χ2 , χ12 , χ21 ) is
$$\Psi_{2,12,21}(s, z, c) = \Phi(1, s, z, c) = (n_1 cs + n_2 s + n_3 z + n_4)^n,$$
where n1 = α3 , n2 = α1 + α5 , n3 = α2 , n4 = α4 + α6 .
Example 2.2 (continued). a) It is clear that
$$P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{11} < q_3\} = \sum_{r=0}^{q_3 - 1} P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{11} = r\}.$$
It can be calculated from (4), for m = 3 and the probabilities

(8)
$p_{11} = P(A_1B_1) = P\{X \le a_1,\ Y \le b_1\},$
$p_{12} = P(A_1B_2) = P\{X \le a_1,\ b_1 < Y \le b_2\},$
$p_{13} = P(A_1B_3) = P\{X \le a_1,\ Y > b_2\},$
$p_{21} = P(A_2B_1) = P\{a_1 < X \le a_2,\ Y \le b_1\},$
$p_{22} = P(A_2B_2) = P\{a_1 < X \le a_2,\ b_1 < Y \le b_2\},$
$p_{23} = P(A_2B_3) = P\{a_1 < X \le a_2,\ Y > b_2\},$
$p_{31} = P(A_3B_1) = P\{X > a_2,\ Y \le b_1\},$
$p_{32} = P(A_3B_2) = P\{X > a_2,\ b_1 < Y \le b_2\},$
$p_{33} = P(A_3B_3) = P\{X > a_2,\ Y > b_2\}.$
b) The probability is
P {χ1 = q1 , χ2 = q2 , χ12 = q3 , χ21 = q4 },
which can be calculated from (5) for m = 3 by using the probabilities (8).
Below, in Table 1, we provide some numerical values of
$$f(n, k, l, r, h) = P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$$
for $n = 2$, $m = 3$ and $p_{ij} = 1/9$, $i, j = 1, 2, 3$.
Table 1. Numerical values of f (n, k, l, r, h) = P {χ1 = k, χ2 = l, χ12 = r, χ21 = h}
for n = 2, m = 3.
n  k  l  r  h  f(n,k,l,r,h)      n  k  l  r  h  f(n,k,l,r,h)
2  0  0  0  0  0.198              2  1  1  1  0  0.025
2  0  1  0  0  0.099              2  2  0  1  0  0.025
2  0  2  0  0  0.012              2  2  1  1  0  0.025
2  1  0  0  0  0.099              2  2  0  2  0  0.012
2  1  1  0  0  0.123              2  0  1  0  1  0.099
2  1  2  0  0  0.025              2  0  2  0  1  0.025
2  2  0  0  0  0.012              2  1  1  0  1  0.025
2  2  1  0  0  0.025              2  1  2  0  1  0.025
2  2  2  0  0  0.012              2  1  1  1  1  0.025
2  1  0  1  0  0.099              2  0  2  0  2  0.012
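The entries of Table 1 can be reproduced either from formula (5) or by direct simulation; a minimal Monte Carlo sketch (assuming the uniform cells $p_{ij} = 1/9$ used in the table) follows.

```python
import random
from collections import Counter

random.seed(1)
n, m, reps = 2, 3, 200_000
cells = [(i, j) for i in range(1, m + 1) for j in range(1, m + 1)]  # each p_ij = 1/9

counts = Counter()
for _ in range(reps):
    draws = [random.choice(cells) for _ in range(n)]
    chi1 = sum(1 for (i, j) in draws if i == 1)       # occurrences of A_1
    chi2 = sum(1 for (i, j) in draws if j == 1)       # occurrences of B_1
    chi12 = sum(1 for ij in draws if ij == (1, 2))    # occurrences of A_1 B_2
    chi21 = sum(1 for ij in draws if ij == (2, 1))    # occurrences of A_2 B_1
    counts[chi1, chi2, chi12, chi21] += 1

# Compare with Table 1: f(2, 1, 1, 0, 0) = 0.123 (exactly 10/81).
print(counts[1, 1, 0, 0] / reps)
```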
2.2. The Poisson approximation. The Poisson procedure allows us to obtain a formula that approximates the probability mass function $P_n(k, l, r, h)$ when the number of trials is large $(n \to \infty)$ and $P_{11}, P_{12}, P_{21}, P_{13}, P_{31} \to 0$ with $nP_{11} \to \lambda_{11}$, $nP_{12} \to \lambda_{12}$, $nP_{21} \to \lambda_{21}$, $nP_1 \to \lambda_1$, $nP_2 \to \lambda_2$, where $P_1 = P_{11} + P_{12} + P_{13}$ and $P_2 = P_{11} + P_{21} + P_{31}$.
The limiting form of $P_n\{k, l, r, h\}$ is given by
$$\lim_{n\to\infty} P\{\chi_1 = k,\ \chi_2 = l,\ \chi_{12} = r,\ \chi_{21} = h\}$$
$$= \lim_{n\to\infty} \sum_{i=\max(0,\,k+l-n)}^{\min(k-r,\,l-h)} \frac{n(n-1)\cdots(n-k-l+i+1)\,(n-k-l+i)!}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!\,(n-k-l+i)!} \left(\frac{\lambda_{11}}{n}\right)^{\!i} \left(\frac{\lambda_{12}}{n}\right)^{\!r} \left(\frac{\lambda_1-\lambda_{11}-\lambda_{12}}{n}\right)^{\!k-i-r} \left(\frac{\lambda_{21}}{n}\right)^{\!h} \left(\frac{\lambda_2-\lambda_{11}-\lambda_{21}}{n}\right)^{\!l-i-h} \left(1-\frac{\lambda_1+\lambda_2-\lambda_{11}}{n}\right)^{\!n-k-l+i}$$
$$= \lim_{n\to\infty} \sum_{i=\max(0,\,k+l-n)}^{\min(k-r,\,l-h)} \frac{1\,(1-\tfrac{1}{n})\cdots(1-\tfrac{k+l-i-1}{n})}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!}\; \lambda_{11}^{i}\,\lambda_{12}^{r}\,(\lambda_1-\lambda_{11}-\lambda_{12})^{k-i-r}\,\lambda_{21}^{h}\,(\lambda_2-\lambda_{11}-\lambda_{21})^{l-i-h} \left(1-\frac{\lambda_1+\lambda_2-\lambda_{11}}{n}\right)^{\!n-k-l+i}$$
$$= e^{-(\lambda_1+\lambda_2-\lambda_{11})} \sum_{i=0}^{\min(k-r,\,l-h)} \frac{\lambda_{11}^{i}}{i!}\,\frac{\lambda_{12}^{r}}{r!}\,\frac{\lambda_{21}^{h}}{h!}\,\frac{(\lambda_1-\lambda_{11}-\lambda_{12})^{k-i-r}}{(k-i-r)!}\,\frac{(\lambda_2-\lambda_{11}-\lambda_{21})^{l-i-h}}{(l-i-h)!},$$
$(k = r, r+1, \ldots;\ l = h, h+1, \ldots;\ r = 0, 1, 2, \ldots;\ h = 0, 1, 2, \ldots)$.

Therefore $P_n(k, l, r, h) \to p(k, l, r, h)$, where
$$p(k, l, r, h) = e^{-(\lambda_1+\lambda_2-\lambda_{11})} \sum_{i=0}^{\min(k-r,\,l-h)} \frac{\lambda_{11}^{i}}{i!}\,\frac{\lambda_{12}^{r}}{r!}\,\frac{\lambda_{21}^{h}}{h!}\,\frac{(\lambda_1-\lambda_{11}-\lambda_{12})^{k-i-r}}{(k-i-r)!}\,\frac{(\lambda_2-\lambda_{11}-\lambda_{21})^{l-i-h}}{(l-i-h)!};$$
$(k = r, r+1, \ldots;\ l = h, h+1, \ldots;\ r = 0, 1, 2, \ldots;\ h = 0, 1, 2, \ldots)$.
This distribution is a version of the quadrivariate Poisson distribution.
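As a quick plausibility check (a sketch with hypothetical rates satisfying $\lambda_1 \ge \lambda_{11} + \lambda_{12}$ and $\lambda_2 \ge \lambda_{11} + \lambda_{21}$), the limiting pmf should sum to one over a generously truncated support:

```python
from math import exp, factorial

lam11, lam12, lam21 = 0.4, 0.3, 0.2   # hypothetical rates
lam1, lam2 = 1.2, 1.1                 # require lam1 >= lam11 + lam12, lam2 >= lam11 + lam21

def p_quad(k, l, r, h):
    """Limiting quadrivariate Poisson pmf p(k, l, r, h)."""
    if r > k or h > l:
        return 0.0
    total = 0.0
    for i in range(min(k - r, l - h) + 1):
        total += (lam11 ** i * lam12 ** r * lam21 ** h
                  * (lam1 - lam11 - lam12) ** (k - i - r)
                  * (lam2 - lam11 - lam21) ** (l - i - h)
                  / (factorial(i) * factorial(r) * factorial(h)
                     * factorial(k - i - r) * factorial(l - i - h)))
    return exp(-(lam1 + lam2 - lam11)) * total

s = sum(p_quad(k, l, r, h)
        for k in range(15) for l in range(15)
        for r in range(k + 1) for h in range(l + 1))
print(s)   # ~1.0 up to truncation error
```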
2.5. Remark. It should be noted that Theorem 2.4 enables one to calculate the joint
probability function of any random variables Xi , Xj , Xij counting, respectively, the
number of occurrences of Ai , Bj and Ai Bj in n repetitions of the experiment.
2.6. Remark. In this paper we do not deal with statistical inferences for the proposed
family of distributions. The estimating and testing techniques for similar multivariate
distributions are discussed in the statistical literature, see e.g. Voinov and Nikulin [22],
Voinov et al. [23].
2.7. Remark. The problem addressed in this paper is that associated with determining
the joint distribution of overlapping sums of the coordinates of a multinomial density in
certain specific cases. The general problem would begin with
$$X \sim \text{Multinomial}(n;\ p_1, p_2, \ldots, p_m),$$
where $p_i \ge 0$ and $\sum_{i=1}^{m} p_i = 1$, and for each $i$, $X_i$ denotes the number of outcomes of type $i$, $i = 1, 2, \ldots, m$. In such a setting, we can consider the joint density of the random vector
Z = AX, where A is a k × m matrix of zeroes and ones. It is random vectors of this type
that are considered in the paper. In particular, we focus on a cross-classified version in which there are $m^2$ possible outcomes of the experiment, indexed by $\{(i, j) : i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m\}$ with corresponding probabilities $p_{ij}$.
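A short sketch of this formulation, with hypothetical uniform cell probabilities: one multinomial draw over the $m^2$ cells, followed by a 0-1 aggregation matrix $A$ that extracts $Z = (\chi_1, \chi_2, \chi_{12}, \chi_{21}) = AX$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 100
pij = np.full(m * m, 1.0 / m**2)      # hypothetical cell probabilities, row-major order

X = rng.multinomial(n, pij)           # counts for the m*m cells (i, j)

# 0-1 matrix A: row u of A sums the cells that make up coordinate u of Z.
A = np.zeros((4, m * m), dtype=int)
A[0, 0:m] = 1        # chi_1: all cells with i = 1
A[1, 0::m] = 1       # chi_2: all cells with j = 1
A[2, 1] = 1          # chi_12: the single cell (1, 2)
A[3, m] = 1          # chi_21: the single cell (2, 1)

Z = A @ X
print(Z)             # one realization of (chi_1, chi_2, chi_12, chi_21)
```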
3. Some possible applications
3.1. Empirical distribution function and dependence measures. Let $(X_1, Y_1)$, $(X_2, Y_2), \ldots, (X_n, Y_n)$ be a sample from a bivariate distribution with distribution function (df) $F(x, y)$ and marginal df's $F_X(x)$ and $F_Y(y)$. Write $\xi_1 = \#\{i : X_i \le x\}$, $\xi_2 = \#\{j : Y_j \le y\}$, $\xi_{11} = \#\{i : X_i \le x,\ Y_i \le y\}$, $\xi_{12} = \#\{i : X_i \le x,\ Y_i > y\}$, $A_1 = \{X \le x\}$, $A_2 = \{X > x\}$ and $B_1 = \{Y \le y\}$, $B_2 = \{Y > y\}$. It is easy to observe that $\xi_1$ represents the number of elements of the sample $X_1, X_2, \ldots, X_n$ falling below the threshold $x$, $\xi_2$ represents the number of elements of the sample $Y_1, Y_2, \ldots, Y_n$ falling below the threshold $y$, and $\xi_{11}$ represents the number of elements of the sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ belonging to the set $\{(u, v) : u \le x,\ v \le y\}$. It can also be observed that $\xi_1 \equiv nF_X^{*}(x)$, where $F_X^{*}$ is the empirical df of the sample $X_1, X_2, \ldots, X_n$; $\xi_2 \equiv nF_Y^{*}(y)$, where $F_Y^{*}$ is the empirical df of the sample $Y_1, Y_2, \ldots, Y_n$; and $\xi_{11} = nF_{X,Y}^{*}(x, y)$, where $F_{X,Y}^{*}$ is the empirical df of the bivariate sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$. In this case we have
$$P\{nF_X^{*}(x) = k,\ nF_Y^{*}(y) = l,\ nF_{X,Y}^{*}(x, y) = r\} = P\{\xi_1 = k,\ \xi_2 = l,\ \xi_{11} = r\}, \tag{9}$$
where the joint probability function of the exceedances $(\xi_1, \xi_2, \xi_{11})$ is as given in (4) with the probabilities
$$\pi_{11} = P(A_1B_1) = P\{X \le x,\ Y \le y\} = F(x, y),$$
$$\pi_{12} = P(A_1B_2) = P\{X \le x,\ Y > y\} = F_X(x) - F(x, y),$$
$$\pi_{21} = P(A_2B_1) = P\{X > x,\ Y \le y\} = F_Y(y) - F(x, y),$$
$$\pi_{22} = P(A_2B_2) = P\{X > x,\ Y > y\} = 1 - F_X(x) - F_Y(y) + F(x, y).$$
The probabilities (9) can be used to construct a criterion for testing the independence of the random variables $X$ and $Y$, based on the empirical distribution functions $F_X^{*}(x)$, $F_Y^{*}(y)$ and $F_{X,Y}^{*}(x, y)$. In recent years, several statistical papers have appeared discussing local dependence measures that can characterize the dependence structure of two random variables localized at a fixed point. For more details on local dependence functions, see Bjerve and Doksum [5], Jones [14], Bairamov and Kotz [2], Bairamov et al. [3], Kotz and Nadarajah [18]. Assume that $J(u, v, w)$ is any function on the unit cube and that $J(F_X^{*}(x), F_Y^{*}(y), F_{X,Y}^{*}(x, y))$ leads to a test statistic for testing independence between $X$ and $Y$ at the point $(x, y)$. Then the distribution of this statistic is given by
$$P\{J(F_X^{*}(x), F_Y^{*}(y), F_{X,Y}^{*}(x, y)) \le t\} = \sum_{\{(k, l, r)\,:\,J(k/n,\ l/n,\ r/n) \le t\}} P\{\xi_1 = k,\ \xi_2 = l,\ \xi_{11} = r\}.$$
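As an illustration, take the hypothetical test function $J(u, v, w) = w - uv$, which is centred near zero under independence; the sketch below evaluates $P\{J \le t\}$ from the pmf of $(\xi_1, \xi_2, \xi_{11})$ computed under the null $F(x, y) = F_X(x)F_Y(y)$ (the threshold values are illustrative).

```python
from math import comb

Fx, Fy = 0.6, 0.5      # hypothetical values of F_X(x) and F_Y(y) at the chosen point
F = Fx * Fy            # joint df at (x, y) under the null of independence
n = 10

pi11, pi12, pi21 = F, Fx - F, Fy - F
pi22 = 1.0 - Fx - Fy + F

def p3(k, l, r):
    """Joint pmf of (xi_1, xi_2, xi_11) = n (F*_X(x), F*_Y(y), F*_{X,Y}(x, y))."""
    if not (max(0, k + l - n) <= r <= min(k, l)):
        return 0.0
    coef = comb(n, r) * comb(n - r, k - r) * comb(n - k, l - r)
    return coef * pi11 ** r * pi12 ** (k - r) * pi21 ** (l - r) * pi22 ** (n - k - l + r)

def J(u, v, w):
    """Hypothetical local independence statistic on the unit cube."""
    return w - u * v

t = 0.05
prob = sum(p3(k, l, r)
           for k in range(n + 1) for l in range(n + 1) for r in range(n + 1)
           if J(k / n, l / n, r / n) <= t)
print(prob)   # null probability that the statistic does not exceed t
```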
3.2. Exceedances. Let (X1 , Y1 ), (X2 , Y2 ), . . . , (Xn , Yn ) be a sample as in Example 2.2.
Denote by $X_{r:n}$ the $r$th order statistic constructed from the sample $X_1, X_2, \ldots, X_n$ and let $Y_{[r:n]}$ be the corresponding concomitant of $X_{r:n}$. The joint probability density function of the $r$th order statistic and its concomitant $Y_{[r:n]}$ is
$$f_{X_{r:n},\,Y_{[r:n]}}(x, y) = f(y \mid x)\, f_{r:n}(x).$$
The concomitants of order statistics arise in different selection procedures (see David
[7]). Assume that $(X_{n+1}, Y_{n+1}), (X_{n+2}, Y_{n+2}), \ldots, (X_{n+m}, Y_{n+m})$ are the next $m$ observations obtained from the same population with df $F(x, y)$, independent of the first sample. In this case let $r < s$ and, counting over the new observations, $\eta_1 = \#\{i : X_i \le X_{r:n}\}$, $\eta_2 = \#\{i : Y_i \le Y_{[r:n]}\}$, $\eta_{11} = \#\{i : X_i \le X_{r:n},\ Y_i \le Y_{[r:n]}\}$, $\eta_{12} = \#\{i : X_i \le X_{r:n},\ Y_i \le Y_{[s:n]}\}$ and $\eta_{21} = \#\{i : X_i \le X_{s:n},\ Y_i \le Y_{[r:n]}\}$. The random variable $\eta_1 + 1$ shows the rank of $X_{r:n}$ among $X_{n+1}, X_{n+2}, \ldots, X_{n+m}$, and $\eta_2 + 1$ shows the rank of $Y_{[r:n]}$ among $Y_{n+1}, Y_{n+2}, \ldots, Y_{n+m}$. The joint distribution of $\eta_1 + 1$ and $\eta_2 + 1$ can be obtained from (1) as follows:
$$P\{\eta_1 = k - 1,\ \eta_2 = l - 1\} = \sum_{i=\max(0,\,k+l-m-2)}^{\min(k-1,\,l-1)} \frac{m!}{i!\,(k-1-i)!\,(l-1-i)!\,(m-k-l+i+2)!} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} \pi_{11}^{i}(x, y)\,\pi_{12}^{k-i-1}(x, y)\,\pi_{21}^{l-i-1}(x, y)\,\pi_{22}^{m-k-l+i+2}(x, y)\, f(y \mid x)\, f_{r:n}(x)\, dx\, dy. \tag{10}$$
Formula (10) is obtained in Eryilmaz and Bairamov [9] by conditioning on $X_{r:n}$ and $Y_{[r:n]}$. The joint distributions of the exceedance statistics $(\eta_1, \eta_2, \eta_{11})$ and $(\eta_1, \eta_2, \eta_{12}, \eta_{21})$ can be obtained in a way similar to (4) and (5).
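Evaluating (10) requires the double integral above; in practice the joint law of the exceedance statistics can also be approximated by simulation. A minimal sketch, assuming a bivariate normal population with an illustrative correlation:

```python
import numpy as np

rng = np.random.default_rng(42)
n, m, r, reps = 20, 10, 5, 50_000
cov = [[1.0, 0.5], [0.5, 1.0]]        # hypothetical bivariate normal dependence

counts = np.zeros((m + 1, m + 1))
for _ in range(reps):
    first = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    future = rng.multivariate_normal([0.0, 0.0], cov, size=m)
    idx = np.argsort(first[:, 0])[r - 1]
    x_r, y_r = first[idx, 0], first[idx, 1]    # X_{r:n} and its concomitant Y_{[r:n]}
    eta1 = int(np.sum(future[:, 0] <= x_r))    # exceedance count for the X's
    eta2 = int(np.sum(future[:, 1] <= y_r))    # exceedance count for the Y's
    counts[eta1, eta2] += 1

print(counts / reps)   # Monte Carlo estimate of the joint pmf of (eta_1, eta_2)
```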
Acknowledgement. We are grateful to the editor and anonymous referee for their
valuable comments which improved the presentation of this paper.
References
[1] Aitken, A. C. and Gonin, H. T. On fourfold sampling with and without replacement, Proc.
Roy. Soc. Edinburgh 55, 114–125, 1935.
[2] Bairamov, I. and Kotz, S. On local dependence function for multivariate distributions, New Trends in Probability and Statistics 5, 27–44, 2000.
[3] Bairamov, I. G., Kotz, S. and Kozubowski, T. A new measure of linear local dependence, Statistics 37 (3), 243–258, 2003.
[4] Biswas, A. and Hwang, J. S. A new bivariate binomial distribution, Statistics and Probability Letters 60, 231–240, 2002.
[5] Bjerve, S. and Doksum, K. Correlation curves: measures of association as functions of
covariate values, Ann. Statist. 21, 890–902, 1993.
[6] Chandrasekar, B. and Balakrishnan, N. Some properties and a characterization of trivariate
and multivariate binomial distributions, Statistics 36, 211–218, 2002.
[7] David, H. A. Concomitants of order statistics, theory and applications, in: Krishnaiah, P. R. and Sen, P. K. (Eds.), Handbook of Statistics 4 (North-Holland, Amsterdam, 1993), 383–403.
[8] Doss, D. C. and Graham, R. C. A characterization of multinomial bivariate distribution by
univariate marginals, Bull. Calcutta Statist. Assoc. 24, 93–99, 1975.
[9] Eryilmaz, S. and Bairamov, I. G. On a new sample rank of an order statistic and its concomitant, Statist. Probab. Lett. 63, 123–131, 2003.
[10] Hamdan, M. A. Canonical expansions of the bivariate binomial distribution with unequal
marginal indices, Int. Statist. Rev. 40, 277–280, 1972.
[11] Hamdan, M. A. A note on the trinomial distribution, Int. Statist. Rev. 43, 219–220, 1975.
[12] Hamdan, M. A. and Al-Bayati, H. A. A note on the bivariate Poisson distribution, Amer.
Statist. 23, 32–33, 1969.
[13] Hamdan, M. A. and Jensen, D. R. A bivariate binomial distribution and some applications,
Austral. J. Statist. 18, 163–169, 1976.
[14] Jones, M. C. The local dependence function, Biometrika 83, 899–904, 1996.
[15] Johnson, N. L., Kotz, S. and Balakrishnan, N. Discrete Multivariate Distributions (John Wiley & Sons, New York, 1997).
[16] Kocherlakota, S. and Kocherlakota, K. Bivariate Discrete Distributions (Marcel Dekker,
New York, 1992).
[17] Krishnamoorthy, A. S. Multivariate Poisson and binomial distributions, Sankhya 11, 117–
124, 1951.
[18] Kotz, S. and Nadarajah, S. Local dependence functions for the elliptically symmetric distributions, Sankhya 65 (1), 207–223, 2003.
[19] Papageorgiou, H. and David, K. M. On countable mixtures of bivariate binomial distributions, Biom. J. 36, 581–601, 1994.
[20] Papageorgiou, H. and David, K. M. The structure of compounded trivariate binomial distributions, Biom. J. 37, 81–95, 1995.
[21] Shanbhag, D. N. and Basawa, I. V. On a characterization property of multinomial distribution, Trab. Estad. 25, 109–112, 1974.
[22] Voinov, V. and Nikulin, M. Unbiased Estimators and their Applications 2: Multivariate
case (Kluwer, 1996).
[23] Voinov, V., Nikulin, M. and Smirnov, T. Multivariate discrete distributions induced by an
urn scheme, linear diophantine equations, unbiased estimating and testing, J. Statist. Plann.
Inference 101 (1-2), 255–266, 2002.