Hacettepe Journal of Mathematics and Statistics, Volume 39 (1) (2010), 109-120

DISCRETE DISTRIBUTIONS CONNECTED WITH THE BIVARIATE BINOMIAL

I. Bairamov (Izmir University of Economics, Department of Mathematics, Izmir, Turkey, [email protected]; corresponding author) and O. Elmastas Gultekin (Ege University, Department of Statistics, Izmir, Turkey, [email protected])

Received 19:02:2009; Accepted 02:12:2009

Abstract. A new class of multivariate discrete distributions with binomial and multinomial marginals is studied. This class of distributions is obtained in a natural manner using probabilistic properties of the sampling model considered. Some possible applications in game theory, life testing and exceedance models for order statistics are discussed.

Keywords: Discrete multivariate distributions, Bivariate binomial distribution, Multinomial distribution, Probability density function, Poisson approximation.

2000 AMS Classification: 62 E 15, 62 E 17.

1. Introduction

Bivariate and multivariate binomial distributions have aroused the interest of many authors as a natural extension of the univariate binomial distribution. Aitken and Gonin [1] derived bivariate binomial probability functions by considering sampling with replacement from a fourfold population, and expressed the bivariate probability function as the product of the corresponding univariate functions, multiplied by a terminating series bilinear in the appropriate orthogonal polynomials. Krishnamoorthy [17] studied the multivariate binomial distribution and extended the series of Aitken and Gonin [1] for a bivariate binomial distribution to any number of variables. In the papers of Hamdan [10, 11], Hamdan and Al-Bayati [12], Hamdan and Jensen [13], Papageorgiou and David [19, 20], Doss and Graham [8], and Shanbhag and Basawa [21], the conditional distributions associated with trivariate and multivariate binomial distributions were studied, and characterizations of the multivariate binomial distribution by its univariate marginals were established. For some discussions of bivariate binomial distributions see Kocherlakota and Kocherlakota [16] and Johnson et al. [15].

Biswasa and Hwang [4] provide a new formulation of the bivariate binomial distribution in the sense that marginally each of the two random variables has a binomial distribution, and they have some non-zero correlation in the joint distribution. Chandrasekar and Balakrishnan [6] considered a trivariate binomial distribution and obtained regression equations for this distribution. They provided a set of necessary and sufficient conditions for the regression to be linear, and also established a characterization of the trivariate binomial distribution based on the distribution of the sum of two trivariate random vectors.

In the present paper we consider new trivariate and quadrivariate distributions constructed on the basis of a bivariate binomial distribution. These distributions appear in several models in the contexts of life testing and exceedances, and can also be applied in strategic games. We also consider an extension of the bivariate binomial model to the case where each individual of a population is classified as one of $A_1, A_2, \ldots, A_m$ and simultaneously as one of $B_1, B_2, \ldots, B_m$, with probabilities $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, m$, $\sum_{i,j} p_{ij} = 1$, $P(A_i) = \sum_{j=1}^{m} P(A_iB_j)$ and $P(B_j) = \sum_{i=1}^{m} P(A_iB_j)$. Let the experiment be repeated $n$ times. Assume that $\chi_1$, $\chi_2$, $\chi_{11}$, $\chi_{12}$ and $\chi_{21}$ are the numbers of occurrences of the events $A_1$, $B_1$, $A_1B_1$, $A_1B_2$ and $A_2B_1$ in these $n$ repetitions, respectively. We study the joint distributions of these random variables and discuss their possible applications.

For a description of a simple bivariate binomial distribution consider the fourfold model:

  A \ B |  B_1    |  B_2
  ------+---------+---------
   A_1  | pi_{11} | pi_{12}
   A_2  | pi_{21} | pi_{22}

wherein each individual of a population can be classified as being one of $A_1, A_2$ and at the same time as one of $B_1, B_2$, with probabilities $P(A_iB_j) = \pi_{ij}$, $i, j = 1, 2$; $\sum_{i,j} \pi_{ij} = 1$. Under random sampling with replacement $n$ times, let $\xi_1$ and $\xi_2$ denote the numbers of occurrences of $A_1$ and $B_1$, respectively. It is well known that

(1)  $p_1(k, l) = P\{\xi_1 = k, \xi_2 = l\} = \sum_{i=\max(0,\,k+l-n)}^{\min(k,\,l)} \frac{n!}{i!\,(k-i)!\,(l-i)!\,(n-k-l+i)!}\, \pi_{11}^{i}\, \pi_{12}^{k-i}\, \pi_{21}^{l-i}\, \pi_{22}^{n-k-l+i}$

(see Aitken and Gonin [1] and Johnson, Kotz and Balakrishnan [15]). The bivariate discrete distribution given in (1) is called a bivariate binomial distribution. The corresponding probability generating function (pgf) is

(2)  $\Phi_1(t, s) = (\pi_{11} ts + \pi_{12} t + \pi_{21} s + \pi_{22})^n$.
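As a numerical illustration of (1), the Python sketch below evaluates the pmf by direct summation and checks that the $\xi_1$-marginal is Binomial$(n, \pi_{11} + \pi_{12})$; the values $n = 5$, $\pi_{11} = 0.2$, $\pi_{12} = 0.3$, $\pi_{21} = 0.1$, $\pi_{22} = 0.4$ are illustrative choices, not taken from the paper.

    from math import comb, factorial

    def bivariate_binomial_pmf(k, l, n, p11, p12, p21, p22):
        # Formula (1): joint pmf of (xi1, xi2), the counts of A1 and B1.
        total = 0.0
        for i in range(max(0, k + l - n), min(k, l) + 1):
            coef = factorial(n) // (factorial(i) * factorial(k - i)
                                    * factorial(l - i) * factorial(n - k - l + i))
            total += coef * p11**i * p12**(k - i) * p21**(l - i) * p22**(n - k - l + i)
        return total

    n, p11, p12, p21, p22 = 5, 0.2, 0.3, 0.1, 0.4   # illustrative values

    # The xi1-marginal of (1) should be Binomial(n, p11 + p12).
    pA1 = p11 + p12
    for k in range(n + 1):
        marginal = sum(bivariate_binomial_pmf(k, l, n, p11, p12, p21, p22)
                       for l in range(n + 1))
        assert abs(marginal - comb(n, k) * pA1**k * (1 - pA1)**(n - k)) < 1e-12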
A connection between a bivariate binomial distribution and a multinomial distribution can be shown as follows. In the fourfold model described above, let $A_1B_1 = C_1$, $A_1B_2 = C_2$, $A_2B_1 = C_3$, $A_2B_2 = C_4$ and $P(C_1) = p_{11}$, $P(C_2) = p_{12}$, $P(C_3) = p_{21}$, $P(C_4) = p_{22}$. Let $\zeta_i$ be the number of cases in which $C_i$ occurs in $n$ repetitions, $i = 1, 2, 3, 4$. Clearly, $(\zeta_1, \zeta_2, \zeta_3, \zeta_4)$ is multinomial. Then $\xi_1 = \zeta_1 + \zeta_2$ and $\xi_2 = \zeta_1 + \zeta_3$.

A simple trivariate distribution in the fourfold model described above is of interest. Under random sampling $n$ times, let $\xi_1$, $\xi_2$ and $\xi_{11}$ be the numbers of occurrences of $A_1$, $B_1$ and $A_1B_1$, respectively. The joint probability function of the random variables $\xi_1$, $\xi_2$ and $\xi_{11}$ can be obtained easily from combinatorial considerations, and it is

(3)  $p_n(k, l, r) = P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\} = \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\, \pi_{11}^{r}\, \pi_{12}^{k-r}\, \pi_{21}^{l-r}\, \pi_{22}^{n-k-l+r}$

$(k, l = 0, 1, 2, \ldots, n$ and $r = \max(0, k+l-n), \ldots, \min(k, l))$. The corresponding probability generating function is $\Psi(t, s, z) = (\pi_{11} tsz + \pi_{12} t + \pi_{21} s + \pi_{22})^n$.

It is clear that the univariate marginals of the discrete random vector $(\xi_1, \xi_2, \xi_{11})$ are binomial, with cell probabilities $(\pi_{11} + \pi_{12})$, $(\pi_{11} + \pi_{21})$ and $\pi_{11}$, respectively. The joint distribution of $(\xi_1, \xi_2)$ is obviously the bivariate binomial distribution with probability function (1) and pgf $\Psi(t, s, 1) = (\pi_{11} ts + \pi_{12} t + \pi_{21} s + \pi_{22})^n$, as in (2). The joint probability function of $(\xi_1, \xi_{11})$ is

$P\{\xi_1 = k, \xi_{11} = r\} = \sum_{l=0}^{n} \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\, \pi_{11}^{r}\, \pi_{12}^{k-r}\, \pi_{21}^{l-r}\, \pi_{22}^{n-k-l+r}$

(where summands with negative factorial arguments are interpreted as zero), and the pgf is $\Psi_{1,11}(t, z) = \Psi(t, 1, z) = (\pi_{11} tz + \pi_{12} t + \pi_{21} + \pi_{22})^n$. Similarly, the joint probability function of $(\xi_2, \xi_{11})$ is

$P\{\xi_2 = l, \xi_{11} = r\} = \sum_{k=0}^{n} \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\, \pi_{11}^{r}\, \pi_{12}^{k-r}\, \pi_{21}^{l-r}\, \pi_{22}^{n-k-l+r}$,

and the corresponding pgf is $\Psi_{2,11}(s, z) = \Psi(1, s, z) = (\pi_{11} sz + \pi_{21} s + \pi_{12} + \pi_{22})^n$.
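Since (3) follows from combinatorial considerations, it is easy to check by simulating the fourfold sampling scheme directly. A minimal sketch, with illustrative cell probabilities and a replication count chosen only for speed:

    import random
    from math import factorial

    def trivariate_pmf(k, l, r, n, pi11, pi12, pi21, pi22):
        # Formula (3): joint pmf of (xi1, xi2, xi11).
        if r < max(0, k + l - n) or r > min(k, l):
            return 0.0
        coef = factorial(n) // (factorial(r) * factorial(k - r)
                                * factorial(l - r) * factorial(n - k - l + r))
        return coef * pi11**r * pi12**(k - r) * pi21**(l - r) * pi22**(n - k - l + r)

    random.seed(0)
    n, pi = 3, (0.2, 0.3, 0.1, 0.4)        # illustrative values
    cells = ["A1B1", "A1B2", "A2B1", "A2B2"]
    trials, hits = 200_000, 0
    for _ in range(trials):
        draws = random.choices(cells, weights=pi, k=n)
        k = sum(d.startswith("A1") for d in draws)   # xi1: occurrences of A1
        l = sum(d.endswith("B1") for d in draws)     # xi2: occurrences of B1
        r = draws.count("A1B1")                      # xi11: occurrences of A1B1
        hits += (k, l, r) == (1, 1, 1)
    print(hits / trials, trivariate_pmf(1, 1, 1, n, *pi))  # should be close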
The Poisson procedure allows us to obtain a formula that approximates the joint probability function $p_n(k, l, r)$ when the number of trials is large $(n \to \infty)$ and $n\pi_{11} \to \lambda_{11}$, $n\pi_{12} \to \lambda_{12}$, $n\pi_{21} \to \lambda_{21}$. We have

$\lim_{n\to\infty} P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\}$
$= \lim_{n\to\infty} \frac{(1 - \frac{1}{n})(1 - \frac{2}{n}) \cdots (1 - \frac{k+l-r-1}{n})}{r!\,(k-r)!\,(l-r)!}\, \lambda_{11}^{r}\, \lambda_{12}^{k-r}\, \lambda_{21}^{l-r} \left(1 - \frac{\lambda_{11}+\lambda_{12}+\lambda_{21}}{n}\right)^{n} \left(1 - \frac{\lambda_{11}+\lambda_{12}+\lambda_{21}}{n}\right)^{-(k+l-r)}$
$= \frac{\lambda_{11}^{r}\, \lambda_{12}^{k-r}\, \lambda_{21}^{l-r}}{r!\,(k-r)!\,(l-r)!}\, \exp(-(\lambda_{11}+\lambda_{12}+\lambda_{21}))$.

Therefore $p_n(k, l, r) \to p(k, l, r)$, where

$p(k, l, r) = \frac{\lambda_{11}^{r}\, \lambda_{12}^{k-r}\, \lambda_{21}^{l-r}}{r!\,(k-r)!\,(l-r)!}\, \exp(-(\lambda_{11}+\lambda_{12}+\lambda_{21}))$;

$(k, l = 0, 1, 2, \ldots$ and $r = 0, 1, 2, \ldots, \min(k, l))$. This distribution is a trivariate Poisson distribution.
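The rate at which $p_n(k, l, r)$ approaches this limit can be inspected numerically; in the sketch below, the intensities $\lambda_{11} = 0.5$, $\lambda_{12} = 1.0$, $\lambda_{21} = 0.8$ and the evaluation point $(k, l, r) = (2, 2, 1)$ are assumed for illustration.

    from math import exp, factorial

    def p_exact(k, l, r, n, pi11, pi12, pi21):
        # Formula (3) with pi22 = 1 - pi11 - pi12 - pi21.
        pi22 = 1 - pi11 - pi12 - pi21
        coef = factorial(n) // (factorial(r) * factorial(k - r)
                                * factorial(l - r) * factorial(n - k - l + r))
        return coef * pi11**r * pi12**(k - r) * pi21**(l - r) * pi22**(n - k - l + r)

    def p_limit(k, l, r, lam11, lam12, lam21):
        # Trivariate Poisson limit p(k, l, r).
        return (lam11**r * lam12**(k - r) * lam21**(l - r)
                / (factorial(r) * factorial(k - r) * factorial(l - r))
                * exp(-(lam11 + lam12 + lam21)))

    lam11, lam12, lam21 = 0.5, 1.0, 0.8     # assumed intensities
    k, l, r = 2, 2, 1
    for n in (20, 200, 2000):
        print(n, p_exact(k, l, r, n, lam11 / n, lam12 / n, lam21 / n))
    print("limit", p_limit(k, l, r, lam11, lam12, lam21))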
2. Extensions of the bivariate binomial distribution

2.1. Example. Consider a strategic game of two players A and B. Player A uses one of the strategies $A_1, A_2, \ldots, A_m$ while Player B uses one of the strategies $B_1, B_2, \ldots, B_m$. The probability of the event "A uses strategy $A_i$ and B uses strategy $B_j$" is $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, m$. If A uses strategy $A_i$ against strategy $B_j$ used by B, then A wins $a_{ij}$ units and B loses $a_{ij}$ units. If the game is repeated $n$ times, then we are interested in the joint distribution of the random variables $\chi_1$ and $\chi_2$, where $\chi_1$ is the number of cases in which strategy $A_1$ was used and $\chi_2$ is the number of cases in which strategy $B_1$ was used. Clearly, $(\chi_1, \chi_2)$ is bivariate binomial. Now assume that a third party is interested in this game, and has some profit in all cases when strategy $A_1$ of the first player or strategy $B_1$ of the second player is used. Let $\chi_{11}$ be the number of cases in which $A_1$ and $B_1$ were used simultaneously. Then the quantity in which the third party is interested is $\chi_1 + \chi_2 - \chi_{11}$, the number of cases in which $A_1$ or $B_1$ was used in the $n$ repetitions of the game. It is clear that

$P\{\chi_1 + \chi_2 - \chi_{11} = v\} = \sum_{k,l} P\{\chi_1 = k, \chi_2 = l, \chi_{11} = k + l - v\}$.

Therefore, the joint probability function of $\chi_1$, $\chi_2$ and $\chi_{11}$ is required.

2.2. Example. Suppose $n$ independent units, each consisting of two components, are placed on a life test, with the corresponding failure times $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ being identically distributed with cumulative distribution function $F(x, y)$ and probability density function $f(x, y)$. For predefined numbers $a_1 < a_2$, if $X_i \le a_1$ we say that the $i$th unit fails test $A_1$. If $a_1 < X_i \le a_2$, then we say that the $i$th unit is successful in test $A_1$, but fails test $A_2$. Similarly, for $b_1 < b_2$, if $Y_i \le b_1$ then we say that the $i$th unit fails test $B_1$, while $b_1 < Y_i \le b_2$ means that it passes test $B_1$ but fails test $B_2$. Under this setup we are interested in the following probabilities.

a) What is the probability that $q_1$ units fail test $A_1$, $q_2$ units fail test $B_1$, and fewer than $q_3$ units fail both tests $A_1$ and $B_1$? If the number of units that fail test $A_1$ is $\chi_1$, the number of units that fail test $B_1$ is $\chi_2$, the number of units that fail both tests $A_1$ and $B_1$ is $\chi_{11}$, the number of units that fail both tests $A_2$ and $B_1$ is $\chi_{21}$, and the number of units that fail both tests $A_1$ and $B_2$ is $\chi_{12}$, then the required probability is $P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{11} < q_3\}$.

b) What is the probability that $q_1$ units fail test $A_1$, $q_2$ units fail test $B_1$, $q_3$ units fail both tests $A_1$ and $B_2$, and $q_4$ units fail both tests $A_2$ and $B_1$? This probability is $P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{12} = q_3, \chi_{21} = q_4\}$. Therefore the joint pmf of $(\chi_1, \chi_2, \chi_{12}, \chi_{21})$ is required.

2.1. A model. A general model, including the two examples above, can be described as follows. Suppose that in an experiment the results are observed as one of the events $A_1, A_2, \ldots, A_m$ and at the same time as one of the events $B_1, B_2, \ldots, B_m$, with probabilities $P(A_iB_j) = p_{ij}$, $i, j = 1, 2, \ldots, m$; $\sum_{i,j} p_{ij} = 1$, $m \ge 3$. This means that the outcomes of the experiment are the pairs $A_iB_j$, $i, j = 1, 2, \ldots, m$. Assume that we repeat the experiment $n$ times and that the trials are independent.

2.3. Definition. Let $\chi_1$, $\chi_2$, $\chi_{11}$, $\chi_{12}$, $\chi_{21}$ denote the numbers of occurrences of $A_1$, $B_1$, $A_1B_1$, $A_1B_2$, $A_2B_1$, respectively.

From (3) it is clear that the joint probability function of the random variables $\chi_1$, $\chi_2$ and $\chi_{11}$ is

(4)  $P\{\chi_1 = k, \chi_2 = l, \chi_{11} = r\} = \frac{n!}{r!\,(k-r)!\,(l-r)!\,(n-k-l+r)!}\, \Pi_{11}^{r}\, \Pi_{12}^{k-r}\, \Pi_{21}^{l-r}\, \Pi_{22}^{n-k-l+r}$,

where $\Pi_{11} = p_{11}$, $\Pi_{12} = \sum_{j=2}^{m} p_{1j}$, $\Pi_{21} = \sum_{i=2}^{m} p_{i1}$ and $\Pi_{22} = 1 - p_{11} - \sum_{j=2}^{m} p_{1j} - \sum_{i=2}^{m} p_{i1}$.

2.4. Theorem. Let $P_{11} = p_{11}$, $P_{12} = p_{12}$, $P_{21} = p_{21}$, $P_{13} = \sum_{j=3}^{m} p_{1j}$ and $P_{31} = \sum_{i=3}^{m} p_{i1}$. Then

a) The joint probability function of the random variables $\chi_1$, $\chi_2$, $\chi_{12}$ and $\chi_{21}$ is

(5)  $P_n\{k, l, r, h\} = P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$
$= \sum_{i=\max(0,\,k+l-n)}^{\min(k-r,\,l-h)} \frac{n!}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!\,(n-k-l+i)!}\, P_{11}^{i}\, P_{12}^{r}\, P_{13}^{k-i-r}\, P_{21}^{h}\, P_{31}^{l-i-h}\, (1 - P_{11} - P_{12} - P_{13} - P_{21} - P_{31})^{n-k-l+i}$

$(k = r, \ldots, n-h;\ l = h, \ldots, n-r;\ r = 0, \ldots, n-h;\ h = 0, \ldots, n)$.

b) The joint probability generating function is given by

(6)  $\Phi(t, s, z, c) = (\alpha_1 ts + \alpha_2 tz + \alpha_3 sc + \alpha_4 t + \alpha_5 s + \alpha_6)^n$,

where $\alpha_1 = P_{11}$, $\alpha_2 = P_{12}$, $\alpha_3 = P_{21}$, $\alpha_4 = P_{13}$, $\alpha_5 = P_{31}$ and $\alpha_6 = \sum_{i=2}^{m} \sum_{j=2}^{m} p_{ij}$.

Proof. It is clear that without loss of generality we can take $m = 3$. The model can be described symbolically as follows, where the parenthesized quantities are the numbers of occurrences of the corresponding cells among the $n$ trials:

  A \ B |      B_1        |     B_2    |      B_3
  ------+-----------------+------------+-----------------
   A_1  | p_11  (i)       | p_12  (r)  | p_13  (k-i-r)
   A_2  | p_21  (h)       | p_22       | p_23
   A_3  | p_31  (l-i-h)   | p_32       | p_33

Fix $i$, the number of occurrences of $A_1B_1$. If we repeat the experiment $n$ times, then $r$ outcomes of the event $A_1$ can be observed together with $B_2$ in $\binom{n}{r}$ ways, together with $B_1$ in $\binom{n-r}{i}$ ways, and together with $B_3$ in $\binom{n-r-i}{k-r-i}$ ways. Then $h$ outcomes of the event $B_1$ can be realized together with $A_2$ in $\binom{n-r-i-(k-r-i)}{h} = \binom{n-k}{h}$ ways, and together with $A_3$ in $\binom{n-k-h}{l-i-h}$ ways. Therefore, in $n$ repeated independent trials, the number of possible cases in which $A_1$ appears $k$ times, $B_1$ appears $l$ times, $A_1B_1$ appears $i$ times, $A_1B_2$ appears $r$ times and $A_2B_1$ appears $h$ times is

$\binom{n}{r}\binom{n-r}{i}\binom{n-r-i}{k-i-r}\binom{n-k}{h}\binom{n-k-h}{l-i-h} = \frac{n!}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!\,(n-k-l+i)!}$,

and each such case has the same probability,

$P_{11}^{i}\, P_{12}^{r}\, P_{13}^{k-i-r}\, P_{21}^{h}\, P_{31}^{l-i-h}\, (1 - P_{11} - P_{12} - P_{13} - P_{21} - P_{31})^{n-k-l+i}$.

It is then easy to see that $i$ ranges from $\max(0, k+l-n)$ to $\min(k-r, l-h)$, and consequently we obtain (5).

To derive the joint probability generating function, let us write, for $i = 1, 2, \ldots, n$,

$\xi_1^i = 1$ if $A_1$ appears in the $i$th trial, and $0$ otherwise;
$\xi_2^i = 1$ if $B_1$ appears in the $i$th trial, and $0$ otherwise;
$\xi_{12}^i = 1$ if $A_1B_2$ appears in the $i$th trial, and $0$ otherwise;
$\xi_{21}^i = 1$ if $A_2B_1$ appears in the $i$th trial, and $0$ otherwise.

It is then clear that $\chi_1 = \sum_{i=1}^{n} \xi_1^i$, $\chi_2 = \sum_{i=1}^{n} \xi_2^i$, $\chi_{12} = \sum_{i=1}^{n} \xi_{12}^i$ and $\chi_{21} = \sum_{i=1}^{n} \xi_{21}^i$. Since the $n$ trials are independent, the pgf of the random vector $(\chi_1, \chi_2, \chi_{12}, \chi_{21})$ can be written as

(7)  $\Phi(t, s, z, c) = \left( \sum_{x_1, x_2, x_3, x_4 = 0}^{1} t^{x_1} s^{x_2} z^{x_3} c^{x_4}\, q_{x_1, x_2, x_3, x_4} \right)^{n}$,

where $q_{x_1,x_2,x_3,x_4} = P\{\xi_1^i = x_1, \xi_2^i = x_2, \xi_{12}^i = x_3, \xi_{21}^i = x_4\}$; $x_1, x_2, x_3, x_4 = 0, 1$. We have

$q_{1,1,1,1} = P(A_1 B_1 (A_1B_2)(A_2B_1)) = 0$
$q_{1,1,1,0} = P(A_1 B_1 (A_1B_2)(A_2B_1)^c) = 0$
$q_{1,1,0,1} = P(A_1 B_1 (A_1B_2)^c (A_2B_1)) = 0$
$q_{1,1,0,0} = P(A_1 B_1 (A_1B_2)^c (A_2B_1)^c) = P_{11}$
$q_{1,0,1,1} = P(A_1 B_1^c (A_1B_2)(A_2B_1)) = 0$
$q_{1,0,1,0} = P(A_1 B_1^c (A_1B_2)(A_2B_1)^c) = P_{12}$
$q_{1,0,0,1} = P(A_1 B_1^c (A_1B_2)^c (A_2B_1)) = 0$
$q_{1,0,0,0} = P(A_1 B_1^c (A_1B_2)^c (A_2B_1)^c) = P_{13}$
$q_{0,1,1,1} = P(A_1^c B_1 (A_1B_2)(A_2B_1)) = 0$
$q_{0,1,1,0} = P(A_1^c B_1 (A_1B_2)(A_2B_1)^c) = 0$
$q_{0,1,0,1} = P(A_1^c B_1 (A_1B_2)^c (A_2B_1)) = P_{21}$
$q_{0,1,0,0} = P(A_1^c B_1 (A_1B_2)^c (A_2B_1)^c) = P_{31}$
$q_{0,0,1,1} = P(A_1^c B_1^c (A_1B_2)(A_2B_1)) = 0$
$q_{0,0,1,0} = P(A_1^c B_1^c (A_1B_2)(A_2B_1)^c) = 0$
$q_{0,0,0,1} = P(A_1^c B_1^c (A_1B_2)^c (A_2B_1)) = 0$
$q_{0,0,0,0} = P(A_1^c B_1^c (A_1B_2)^c (A_2B_1)^c) = \sum_{i=2}^{3} \sum_{j=2}^{3} p_{ij}$.

Substituting these values in (7) and simplifying, we obtain (6). Observe that $k = r, r+1, \ldots, n-h$; $l = h, h+1, \ldots, n-r$; $r = 0, 1, \ldots, n-h$; $h = 0, 1, \ldots, n$.
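As a sanity check on (5), the sketch below implements the pmf directly and verifies that it sums to one over the support; the configuration $n = 2$, $m = 3$ with all $p_{ij} = 1/9$, under which each of $P_{11}, P_{12}, P_{13}, P_{21}, P_{31}$ equals $1/9$, anticipates Table 1 below.

    from math import factorial

    def pmf5(k, l, r, h, n, P11, P12, P13, P21, P31):
        # Formula (5): joint pmf of (chi1, chi2, chi12, chi21).
        P0 = 1 - P11 - P12 - P13 - P21 - P31   # alpha6: outcomes AiBj with i, j >= 2
        total = 0.0
        for i in range(max(0, k + l - n), min(k - r, l - h) + 1):
            denom = (factorial(i) * factorial(r) * factorial(h)
                     * factorial(k - i - r) * factorial(l - i - h)
                     * factorial(n - k - l + i))
            total += (factorial(n) / denom * P11**i * P12**r * P13**(k - i - r)
                      * P21**h * P31**(l - i - h) * P0**(n - k - l + i))
        return total

    n, P = 2, (1/9,) * 5
    total = sum(pmf5(k, l, r, h, n, *P)
                for k in range(n + 1) for l in range(n + 1)
                for r in range(k + 1) for h in range(l + 1))
    print(total)   # should be very close to 1.0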
2.1.1. Marginal distributions. The univariate marginals of the discrete random vector $(\chi_1, \chi_2, \chi_{12}, \chi_{21})$ are binomial with cell probabilities $(P_{11} + P_{12} + P_{13})$, $(P_{11} + P_{21} + P_{31})$, $P_{12}$ and $P_{21}$, respectively. The joint distribution of $(\chi_1, \chi_2)$ is obviously the bivariate binomial distribution with probability function (1) and pgf

$\Psi_{1,2}(t, s) = \Phi(t, s, 1, 1) = (b_1 ts + b_2 t + b_3 s + b_4)^n$,

where $b_1 = \alpha_1$, $b_2 = \alpha_2 + \alpha_4$, $b_3 = \alpha_3 + \alpha_5$ and $b_4 = \alpha_6$, as in (2). The joint pgf of $(\chi_1, \chi_{12})$ is

$\Psi_{1,12}(t, z) = \Phi(t, 1, z, 1) = (c_1 tz + c_2 t + c_3)^n$,

where $c_1 = \alpha_2$, $c_2 = \alpha_1 + \alpha_4$, $c_3 = \alpha_3 + \alpha_5 + \alpha_6$. The joint pgf of $(\chi_1, \chi_{21})$ is

$\Psi_{1,21}(t, c) = \Phi(t, 1, 1, c) = (d_1 t + d_2 c + d_3)^n$,

where $d_1 = \alpha_1 + \alpha_2 + \alpha_4$, $d_2 = \alpha_3$, $d_3 = \alpha_5 + \alpha_6$, which is the pgf of a trinomial distribution. The joint pgf of $(\chi_2, \chi_{12})$ is

$\Psi_{2,12}(s, z) = \Phi(1, s, z, 1) = (e_1 s + e_2 z + e_3)^n$,

where $e_1 = \alpha_1 + \alpha_3 + \alpha_5$, $e_2 = \alpha_2$, $e_3 = \alpha_4 + \alpha_6$, which is the pgf of a trinomial distribution. The joint pgf of $(\chi_2, \chi_{21})$ is

$\Psi_{2,21}(s, c) = \Phi(1, s, 1, c) = (f_1 cs + f_2 s + f_3)^n$,

where $f_1 = \alpha_3$, $f_2 = \alpha_1 + \alpha_5$, $f_3 = \alpha_2 + \alpha_4 + \alpha_6$. The joint pgf of $(\chi_{12}, \chi_{21})$ is

$\Psi_{12,21}(z, c) = \Phi(1, 1, z, c) = (g_1 c + g_2 z + g_3)^n$,

where $g_1 = \alpha_3$, $g_2 = \alpha_2$, $g_3 = \alpha_1 + \alpha_4 + \alpha_5 + \alpha_6$, which is the pgf of a trinomial distribution.

The trivariate marginals of the discrete random vector $(\chi_1, \chi_2, \chi_{12}, \chi_{21})$ are as follows. The joint pgf of $(\chi_1, \chi_2, \chi_{12})$ is

$\Psi_{1,2,12}(t, s, z) = \Phi(t, s, z, 1) = (h_1 ts + h_2 tz + h_3 t + h_4 s + h_5)^n$,

where $h_1 = \alpha_1$, $h_2 = \alpha_2$, $h_3 = \alpha_4$, $h_4 = \alpha_3 + \alpha_5$, $h_5 = \alpha_6$. The joint pgf of $(\chi_1, \chi_2, \chi_{21})$ is

$\Psi_{1,2,21}(t, s, c) = \Phi(t, s, 1, c) = (j_1 ts + j_2 cs + j_3 t + j_4 s + j_5)^n$,

where $j_1 = \alpha_1$, $j_2 = \alpha_3$, $j_3 = \alpha_2 + \alpha_4$, $j_4 = \alpha_5$, $j_5 = \alpha_6$. The joint pgf of $(\chi_1, \chi_{12}, \chi_{21})$ is

$\Psi_{1,12,21}(t, z, c) = \Phi(t, 1, z, c) = (k_1 tz + k_2 t + k_3 c + k_4)^n$,

where $k_1 = \alpha_2$, $k_2 = \alpha_1 + \alpha_4$, $k_3 = \alpha_3$, $k_4 = \alpha_5 + \alpha_6$. The joint pgf of $(\chi_2, \chi_{12}, \chi_{21})$ is

$\Psi_{2,12,21}(s, z, c) = \Phi(1, s, z, c) = (n_1 cs + n_2 s + n_3 z + n_4)^n$,

where $n_1 = \alpha_3$, $n_2 = \alpha_1 + \alpha_5$, $n_3 = \alpha_2$, $n_4 = \alpha_4 + \alpha_6$.

Example 2.2 (continued). a) It is clear that

$P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{11} < q_3\} = \sum_{r=0}^{q_3 - 1} P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{11} = r\}$,

which can be calculated from (4), for $m = 3$ and the probabilities

(8)
$p_{11} = P(A_1B_1) = P\{X \le a_1, Y \le b_1\}$,
$p_{12} = P(A_1B_2) = P\{X \le a_1, b_1 < Y \le b_2\}$,
$p_{13} = P(A_1B_3) = P\{X \le a_1, Y > b_2\}$,
$p_{21} = P(A_2B_1) = P\{a_1 < X \le a_2, Y \le b_1\}$,
$p_{22} = P(A_2B_2) = P\{a_1 < X \le a_2, b_1 < Y \le b_2\}$,
$p_{23} = P(A_2B_3) = P\{a_1 < X \le a_2, Y > b_2\}$,
$p_{31} = P(A_3B_1) = P\{X > a_2, Y \le b_1\}$,
$p_{32} = P(A_3B_2) = P\{X > a_2, b_1 < Y \le b_2\}$,
$p_{33} = P(A_3B_3) = P\{X > a_2, Y > b_2\}$.

b) The probability is $P\{\chi_1 = q_1, \chi_2 = q_2, \chi_{12} = q_3, \chi_{21} = q_4\}$, which can be calculated from (5) for $m = 3$ by using the probabilities (8).

In Table 1 below we provide some numerical values of $f(n, k, l, r, h) = P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$ for $n = 2$, $m = 3$ and $p_{ij} = 1/9$, $i, j = 1, 2, 3$.

Table 1. Numerical values of $f(n, k, l, r, h) = P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$ for $n = 2$, $m = 3$.

  n  k  l  r  h  f(n,k,l,r,h)     n  k  l  r  h  f(n,k,l,r,h)
  2  0  0  0  0  0.198            2  1  1  1  0  0.025
  2  0  1  0  0  0.099            2  2  0  1  0  0.025
  2  0  2  0  0  0.012            2  2  1  1  0  0.025
  2  1  0  0  0  0.099            2  2  0  2  0  0.012
  2  1  1  0  0  0.123            2  0  1  0  1  0.099
  2  1  2  0  0  0.025            2  0  2  0  1  0.025
  2  2  0  0  0  0.012            2  1  1  0  1  0.025
  2  2  1  0  0  0.025            2  1  2  0  1  0.025
  2  2  2  0  0  0.012            2  1  1  1  1  0.025
  2  1  0  1  0  0.099            2  0  2  0  2  0.012
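The entries of Table 1 can also be reproduced by Monte Carlo simulation of the $m = 3$ scheme; the seed and the number of replications in the sketch below are arbitrary choices.

    import random
    from collections import Counter

    random.seed(1)
    n, trials = 2, 400_000
    cells = [(i, j) for i in (1, 2, 3) for j in (1, 2, 3)]   # all p_ij = 1/9
    counts = Counter()
    for _ in range(trials):
        draws = random.choices(cells, k=n)                   # uniform over 9 cells
        k = sum(a == 1 for a, b in draws)    # chi1: occurrences of A1
        l = sum(b == 1 for a, b in draws)    # chi2: occurrences of B1
        r = draws.count((1, 2))              # chi12: occurrences of A1B2
        h = draws.count((2, 1))              # chi21: occurrences of A2B1
        counts[(k, l, r, h)] += 1

    for key in [(0, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 1)]:
        print(key, round(counts[key] / trials, 3))
    # expected: about 0.198, 0.123 and 0.025, as in Table 1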
2.2. The Poisson approximation. The Poisson procedure allows us to obtain a formula that approximates the probability mass function $P_n\{k, l, r, h\}$ when the number of trials is large $(n \to \infty)$ and $P_{11}, P_{12}, P_{21}, P_{13}, P_{31} \to 0$ with $nP_{11} \to \lambda_{11}$, $nP_{12} \to \lambda_{12}$, $nP_{21} \to \lambda_{21}$, $nP_1 \to \lambda_1$, $nP_2 \to \lambda_2$, where $P_1 = P_{11} + P_{12} + P_{13}$ and $P_2 = P_{11} + P_{21} + P_{31}$. Note that then $nP_{13} \to \lambda_1 - \lambda_{11} - \lambda_{12}$, $nP_{31} \to \lambda_2 - \lambda_{11} - \lambda_{21}$ and $n(P_1 + P_2 - P_{11}) \to \lambda_1 + \lambda_2 - \lambda_{11}$. The limiting form of $P_n\{k, l, r, h\}$ is given by

$\lim_{n\to\infty} P\{\chi_1 = k, \chi_2 = l, \chi_{12} = r, \chi_{21} = h\}$
$= \lim_{n\to\infty} \sum_{i=\max(0,\,k+l-n)}^{\min(k-r,\,l-h)} \frac{(1 - \frac{1}{n}) \cdots (1 - \frac{k+l-i-1}{n})}{i!\,r!\,h!\,(k-i-r)!\,(l-i-h)!}\, \lambda_{11}^{i}\, \lambda_{12}^{r}\, (\lambda_1 - \lambda_{11} - \lambda_{12})^{k-i-r}\, \lambda_{21}^{h}\, (\lambda_2 - \lambda_{11} - \lambda_{21})^{l-i-h} \left(1 - \frac{\lambda_1 + \lambda_2 - \lambda_{11}}{n}\right)^{n-k-l+i}$
$= e^{-(\lambda_1 + \lambda_2 - \lambda_{11})} \sum_{i=0}^{\min(k-r,\,l-h)} \frac{\lambda_{11}^{i}}{i!}\, \frac{\lambda_{12}^{r}}{r!}\, \frac{\lambda_{21}^{h}}{h!}\, \frac{(\lambda_1 - \lambda_{11} - \lambda_{12})^{k-i-r}}{(k-i-r)!}\, \frac{(\lambda_2 - \lambda_{11} - \lambda_{21})^{l-i-h}}{(l-i-h)!}$

$(k = r, r+1, \ldots;\ l = h, h+1, \ldots;\ r = 0, 1, 2, \ldots;\ h = 0, 1, 2, \ldots)$. Therefore $P_n\{k, l, r, h\} \to p(k, l, r, h)$, where

$p(k, l, r, h) = e^{-(\lambda_1 + \lambda_2 - \lambda_{11})} \sum_{i=0}^{\min(k-r,\,l-h)} \frac{\lambda_{11}^{i}}{i!}\, \frac{\lambda_{12}^{r}}{r!}\, \frac{\lambda_{21}^{h}}{h!}\, \frac{(\lambda_1 - \lambda_{11} - \lambda_{12})^{k-i-r}}{(k-i-r)!}\, \frac{(\lambda_2 - \lambda_{11} - \lambda_{21})^{l-i-h}}{(l-i-h)!}$;

$(k = r, r+1, \ldots;\ l = h, h+1, \ldots;\ r = 0, 1, 2, \ldots;\ h = 0, 1, 2, \ldots)$. This distribution is a version of the quadrivariate Poisson distribution.

2.5. Remark. It should be noted that Theorem 2.4 enables one to calculate the joint probability function of any random variables $X_i$, $X_j$, $X_{ij}$ counting, respectively, the numbers of occurrences of $A_i$, $B_j$ and $A_iB_j$ in $n$ repetitions of the experiment.

2.6. Remark. In this paper we do not deal with statistical inference for the proposed family of distributions. Estimation and testing techniques for similar multivariate distributions are discussed in the statistical literature; see e.g. Voinov and Nikulin [22], Voinov et al. [23].

2.7. Remark. The problem addressed in this paper is that of determining the joint distribution of overlapping sums of the coordinates of a multinomial density in certain specific cases. The general problem would begin with $X \sim \text{Multinomial}(n; p_1, p_2, \ldots, p_m)$, where $p_i \ge 0$, $\sum_{i=1}^{m} p_i = 1$, and for each $i$, $X_i$ denotes the number of outcomes of type $i$, $i = 1, 2, \ldots, m$. In such a setting, we can consider the joint density of the random vector $Z = AX$, where $A$ is a $k \times m$ matrix of zeroes and ones. It is random vectors of this type that are considered in this paper. In particular, we focus on a cross-classified version in which there are $m^2$ possible outcomes of the experiment, indexed by $\{(i, j) : i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, m\}$ with corresponding probabilities $p_{ij}$.
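A minimal sketch of this formulation: the $m^2 = 9$ cross-classified cells are laid out as a flat multinomial count vector $X$, and a 0-1 matrix $A$ (whose rows, an illustrative choice here, correspond to $\chi_1$, $\chi_2$, $\chi_{12}$, $\chi_{21}$) yields $Z = AX$.

    import random

    outcomes = [(i, j) for i in (1, 2, 3) for j in (1, 2, 3)]   # cells AiBj, m = 3
    # Rows of the 0-1 matrix A pick out the overlapping sums:
    A = [
        [1 if i == 1 else 0 for (i, j) in outcomes],             # chi1: A1 occurs
        [1 if j == 1 else 0 for (i, j) in outcomes],             # chi2: B1 occurs
        [1 if (i, j) == (1, 2) else 0 for (i, j) in outcomes],   # chi12: A1B2 occurs
        [1 if (i, j) == (2, 1) else 0 for (i, j) in outcomes],   # chi21: A2B1 occurs
    ]

    def draw_Z(n, probs):
        # Sample X ~ Multinomial(n; probs) and return Z = A X.
        idx = random.choices(range(len(outcomes)), weights=probs, k=n)
        X = [idx.count(c) for c in range(len(outcomes))]
        return [sum(a * x for a, x in zip(row, X)) for row in A]

    random.seed(2)
    print(draw_Z(2, [1 / 9] * 9))   # one realization of (chi1, chi2, chi12, chi21)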
3. Some possible applications

3.1. Empirical distribution function and dependence measures. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be a sample from a bivariate distribution with distribution function (df) $F(x, y)$ and marginal dfs $F_X(x)$ and $F_Y(y)$. Write $\xi_1 = \#\{i : X_i \le x\}$, $\xi_2 = \#\{j : Y_j \le y\}$, $\xi_{11} = \#\{i : X_i \le x, Y_i \le y\}$, $\xi_{12} = \#\{i : X_i \le x, Y_i > y\}$, and let $A_1 = \{X \le x\}$, $A_2 = \{X > x\}$, $B_1 = \{Y \le y\}$, $B_2 = \{Y > y\}$. It is easy to observe that $\xi_1$ represents the number of elements of the sample $X_1, X_2, \ldots, X_n$ falling below the threshold $x$, $\xi_2$ represents the number of elements of the sample $Y_1, Y_2, \ldots, Y_n$ falling below the threshold $y$, and $\xi_{11}$ represents the number of elements of the sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ belonging to the set $\{(u, v) : u \le x, v \le y\}$. It can also be observed that $\xi_1 \equiv nF_X^*(x)$, where $F_X^*$ is the empirical df of the sample $X_1, X_2, \ldots, X_n$; $\xi_2 \equiv nF_Y^*(y)$, where $F_Y^*$ is the empirical df of the sample $Y_1, Y_2, \ldots, Y_n$; and $\xi_{11} = nF_{X,Y}^*(x, y)$, where $F_{X,Y}^*$ is the empirical df of the bivariate sample $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$. In this case we have

(9)  $P\{nF_X^*(x) = k,\ nF_Y^*(y) = l,\ nF_{X,Y}^*(x, y) = r\} = P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\}$,

where the joint probability function of the exceedances $(\xi_1, \xi_2, \xi_{11})$ is as given in (4), with the probabilities

$\pi_{11} = P(A_1B_1) = P\{X \le x, Y \le y\} = F(x, y)$,
$\pi_{12} = P(A_1B_2) = P\{X \le x, Y > y\} = F_X(x) - F(x, y)$,
$\pi_{21} = P(A_2B_1) = P\{X > x, Y \le y\} = F_Y(y) - F(x, y)$,
$\pi_{22} = P(A_2B_2) = P\{X > x, Y > y\} = 1 - F_X(x) - F_Y(y) + F(x, y)$.

The probabilities (9) can be used to construct a criterion for testing the independence of the random variables $X$ and $Y$, based on the empirical distribution functions $F_X^*(x)$, $F_Y^*(y)$ and $F_{X,Y}^*(x, y)$. In recent years several statistical papers have appeared discussing local dependence measures that can characterize the dependence structure of two random variables localized at a fixed point. For more details on local dependence functions, see Bjerve and Doksum [5], Jones [14], Bairamov and Kotz [2], Bairamov et al. [3], and Kotz and Nadarajah [18]. Assume that $J(u, v, w)$ is any function on the unit cube and that $J(F_X^*(x), F_Y^*(y), F_{X,Y}^*(x, y))$ leads to a test statistic for testing independence between $X$ and $Y$ at the point $(x, y)$. Then the distribution of this statistic is given by

$P\{J(F_X^*(x), F_Y^*(y), F_{X,Y}^*(x, y)) \le t\} = \sum_{\{(k,l,r)\,:\,J(\frac{k}{n}, \frac{l}{n}, \frac{r}{n}) \le t\}} P\{\xi_1 = k, \xi_2 = l, \xi_{11} = r\}$.
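To illustrate (9), the sketch below computes the joint probability of the three empirical distribution functions at a fixed point from the trivariate pmf; the assumed example takes $X$ and $Y$ independent Uniform(0, 1) and $x = y = 0.5$, so that $F_X(x) = F_Y(y) = 0.5$ and $F(x, y) = 0.25$.

    from math import factorial

    def ecdf_joint_pmf(k, l, r, n, Fx, Fy, Fxy):
        # Probability (9) at a fixed point (x, y), via the trivariate pmf with
        # pi11 = F(x,y), pi12 = Fx - F, pi21 = Fy - F, pi22 = 1 - Fx - Fy + F.
        pi11, pi12, pi21 = Fxy, Fx - Fxy, Fy - Fxy
        pi22 = 1 - Fx - Fy + Fxy
        if r < max(0, k + l - n) or r > min(k, l):
            return 0.0
        coef = factorial(n) // (factorial(r) * factorial(k - r)
                                * factorial(l - r) * factorial(n - k - l + r))
        return coef * pi11**r * pi12**(k - r) * pi21**(l - r) * pi22**(n - k - l + r)

    n = 10
    print(ecdf_joint_pmf(5, 5, 3, n, 0.5, 0.5, 0.25))
    # P{ n F*_X(0.5) = 5, n F*_Y(0.5) = 5, n F*_{X,Y}(0.5, 0.5) = 3 }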
3.2. Exceedances. Let $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ be a sample as in Example 2.2. Denote by $X_{r:n}$ the $r$th order statistic constructed from the sample $X_1, X_2, \ldots, X_n$, and let $Y_{[r:n]}$ be the corresponding concomitant of $X_{r:n}$. The joint probability density function of the $r$th order statistic and its concomitant $Y_{[r:n]}$ is $f_{X_{r:n}, Y_{[r:n]}}(x, y) = f(y \mid x)\, f_{r:n}(x)$. The concomitants of order statistics arise in various selection procedures (see David [7]). Assume that $(X_{n+1}, Y_{n+1}), (X_{n+2}, Y_{n+2}), \ldots, (X_{n+m}, Y_{n+m})$ are the next $m$ observations obtained from the same population with df $F(x, y)$, independent of the first sample. In this case, let $r < s$ and

$\eta_1 = \#\{i : X_i \le X_{r:n}\}$, $\eta_2 = \#\{j : Y_j \le Y_{[r:n]}\}$, $\eta_{11} = \#\{i : X_i \le X_{r:n}, Y_i \le Y_{[r:n]}\}$, $\eta_{12} = \#\{i : X_i \le X_{r:n}, Y_i \le Y_{[s:n]}\}$ and $\eta_{21} = \#\{i : X_i \le X_{s:n}, Y_i \le Y_{[r:n]}\}$,

where the counts run over the indices $i, j = n+1, \ldots, n+m$ of the new observations. The random variable $\eta_1 + 1$ gives the rank of $X_{r:n}$ among $X_{n+1}, X_{n+2}, \ldots, X_{n+m}$, and $\eta_2 + 1$ gives the rank of $Y_{[r:n]}$ among $Y_{n+1}, Y_{n+2}, \ldots, Y_{n+m}$. The joint distribution of $\eta_1 + 1$ and $\eta_2 + 1$ can be obtained from (1) as follows:

(10)  $P\{\eta_1 = k-1, \eta_2 = l-1\} = \sum_{i=\max(0,\,k+l-m-2)}^{\min(k-1,\,l-1)} \frac{m!}{i!\,(k-1-i)!\,(l-1-i)!\,(m-k-l+i+2)!} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \pi_{11}^{i}(x, y)\, \pi_{12}^{k-i-1}(x, y)\, \pi_{21}^{l-i-1}(x, y)\, \pi_{22}^{m-k-l+i+2}(x, y)\, f(y \mid x)\, f_{r:n}(x)\, dx\, dy$,

where $m$ is the size of the new sample. Formula (10) is obtained in Eryilmaz and Bairamov [9] by conditioning on $X_{r:n}$ and $Y_{[r:n]}$. The joint distributions of the exceedance statistics $(\eta_1, \eta_2, \eta_{11})$ and $(\eta_1, \eta_2, \eta_{12}, \eta_{21})$ can be obtained in a manner similar to (4) and (5).

Acknowledgement. We are grateful to the editor and the anonymous referee for their valuable comments, which improved the presentation of this paper.

References

[1] Aitken, A. C. and Gonin, H. T. On fourfold sampling with and without replacement, Proc. Roy. Soc. Edinburgh 55, 114–125, 1935.
[2] Bairamov, I. and Kotz, S. On local dependence function for multivariate distributions, New Trends in Probability and Statistics 5, 27–44, 2000.
[3] Bairamov, I. G., Kotz, S. and Kozubowski, T. A new measure of linear local dependence, Statistics 37 (3), 243–258, 2003.
[4] Biswasa, A. and Hwang, J. S. A new bivariate binomial distribution, Statistics and Probability Letters 60, 231–240, 2002.
[5] Bjerve, S. and Doksum, K. Correlation curves: measures of association as functions of covariate values, Ann. Statist. 21, 890–902, 1993.
[6] Chandrasekar, B. and Balakrishnan, N. Some properties and a characterization of trivariate and multivariate binomial distributions, Statistics 36, 211–218, 2002.
[7] David, H. A. Concomitants of order statistics: theory and applications, in: Krishnaiah, P. R. and Sen, P. K. (Eds.), Handbook of Statistics 4 (North-Holland, Amsterdam, 1993), 383–403.
[8] Doss, D. C. and Graham, R. C. A characterization of multinomial bivariate distribution by univariate marginals, Bull. Calcutta Statist. Assoc. 24, 93–99, 1975.
[9] Eryilmaz, S. and Bairamov, I. G. On a new sample rank of an order statistic and its concomitant, Statist. Probab. Lett. 63, 123–131, 2003.
[10] Hamdan, M. A. Canonical expansions of the bivariate binomial distribution with unequal marginal indices, Int. Statist. Rev. 40, 277–280, 1972.
[11] Hamdan, M. A. A note on the trinomial distribution, Int. Statist. Rev. 43, 219–220, 1975.
[12] Hamdan, M. A. and Al-Bayati, H. A. A note on the bivariate Poisson distribution, Amer. Statist. 23, 32–33, 1969.
[13] Hamdan, M. A. and Jensen, D. R. A bivariate binomial distribution and some applications, Austral. J. Statist. 18, 163–169, 1976.
[14] Jones, M. C. The local dependence function, Biometrika 83, 899–904, 1996.
[15] Johnson, N. L., Kotz, S. and Balakrishnan, N. Discrete Multivariate Distributions (John Wiley & Sons, New York, 1997).
[16] Kocherlakota, S. and Kocherlakota, K. Bivariate Discrete Distributions (Marcel Dekker, New York, 1992).
[17] Krishnamoorthy, A. S. Multivariate Poisson and binomial distributions, Sankhya 11, 117–124, 1951.
[18] Kotz, S. and Nadarajah, S. Local dependence functions for the elliptically symmetric distributions, Sankhya 65 (1), 207–223, 2003.
[19] Papageorgiou, H. and David, K. M. On countable mixtures of bivariate binomial distributions, Biom. J. 36, 581–601, 1994.
[20] Papageorgiou, H. and David, K. M. The structure of compounded trivariate binomial distributions, Biom. J. 37, 81–95, 1995.
[21] Shanbhag, D. N. and Basawa, I. V. On a characterization property of multinomial distribution, Trab. Estad. 25, 109–112, 1974.
[22] Voinov, V. and Nikulin, M. Unbiased Estimators and their Applications 2: Multivariate Case (Kluwer, 1996).
[23] Voinov, V., Nikulin, M. and Smirnov, T. Multivariate discrete distributions induced by an urn scheme, linear diophantine equations, unbiased estimating and testing, J. Statist. Plann. Inference 101 (1–2), 255–266, 2002.