
Technical Report No. 656
Pure Significance Tests for Multinomial and
Binomial Distributions: the Uniform
Alternative*
Michael D. Perlman†
Department of Statistics
University of Washington
Seattle WA 98195, U.S.A.

April 18, 2024

Abstract

A pure significance test (PST) tests a simple null hypothesis H_f : Y ∼ f
without specifying an alternative hypothesis by rejecting H_f for small values
of f(Y). When the sample space supports a proper uniform pmf f_unif, the
PST can be viewed as a classical likelihood ratio test for testing H_f against
this uniform alternative. Under this interpretation, standard test features
such as power, Kullback-Leibler divergence, and expected p-value can be
considered. This report focuses on PSTs for multinomial and binomial
distributions, and for the related goodness-of-fit testing problems with the
uniform alternative. The case of repeated observations cannot be reduced to
the single observation case via sufficiency. The ordered binomial distribution,
apparently new, arises in the course of this study.


Key words: Pure significance test, multinomial and binomial distribution, likelihood
ratio test, Kullback-Leibler divergence, expected p-value, ordered binomial distribution.

[email protected].

1. Pure Significance Tests.

Let Y denote a random vector (rvtr) with sample space Y and let f be a
probability mass function (pmf) on Y. A pure significance test (PST)
for testing the simple null hypothesis H_f : Y ∼ f without specifying an
alternative hypothesis rejects H_f for small values of f(Y). If Y = y_0 is
observed, the attained p-value is

(1)  P_f[ f(Y) ≤ f(y_0) ].

Pure significance tests long have been a contentious subject, since many
believe that relative likelihood, not likelihood, is appropriate for non-Bayesian
statistical analysis, hence an alternative hypothesis must be specified. See
Hodges (1990) and Howard (2009) for informative reviews and see Appendix
2 of this report for a critique of PSTs. Here we attempt to reconcile, or
at least reduce, this contention by noting that when the sample space Y
supports a proper uniform pmf f_unif, the PST can be viewed as a classical
likelihood ratio test (LRT) for testing

(2)  H_f : Y ∼ f   vs.   H_unif : Y ∼ f_unif,

the uniform alternative. This interpretation allows consideration of standard
test features for PSTs, such as power, Kullback-Leibler divergence (KLD),
and expected p-value (EPV).
This report studies PSTs for multinomial distributions, with binomial
distributions as a special case.¹ After briefly reviewing the geometry of the
multinomial family in Section 2, problem (2) for testing a simple multino-
mial hypothesis against uniformity is discussed in Section 3. It is shown in
Proposition 3.1 that under the KLD criterion, the equal-cell-probability (ecp)
multinomial is closest to the uniform alternative, which is not a member of
the multinomial family, so that the p-value for this case may provide an upper
bound for the p-value in the general case. The p-value is then examined di-
rectly by studying the EPV criterion, where Conjecture 3.3 with supporting
results provides further evidence for this upper bound property.

A related testing problem is treated in Section 4, where the simple null
hypothesis in (2) is replaced by the composite hypothesis consisting of the
entire multinomial family; again the alternative hypothesis is the uniform
distribution. This can be viewed as testing goodness-of-fit for the multino-
mial family, but with a specified alternative, namely the uniform distribution.
Here the ecp multinomial distribution is not the least favorable distribution
for this problem (cf. Proposition 4.1), so the PST with the ecp distribution
as the null hypothesis is inappropriate for the composite multinomial null
hypothesis. Instead, the LRT for the composite hypothesis is derived and
shown to be unbiased, with its p-value determined by the ecp distribution
(cf. Proposition 4.2 and Remark 4.3).

¹This topic arose in Problem 2023-03 under the UW Statistics Department's NSASAG
study project.
The first testing problem is revisited in Sections 5-7 for the apparently
simpler but still challenging binomial case, where Conjecture 3.3 for the EPV
criterion is refined and verified in several cases. Hopefully this will lead to
further insight for the general multinomial case. The binomial case leads to
the introduction of the ordered binomial distribution (OBD) (Definition 7.1),
which appears to be new and of interest in its own right.
Although the preceding results are stated for a single observation from
a multinomial or binomial distribution, at first glance it might appear that
they extend directly to the case of repeated observations, since the sum of
the observations is a sufficient statistic whose distribution remains in the
multinomial or binomial family. In Section 8, however, we note that this
is invalid for the PST when the uniform alternative is introduced, since the
sum statistic is not sufficient for the multinomial/binomial + uniform model.
The repeated observations case appears to be substantially more challenging.
Most proofs appear in Appendix 1, while Appendix 2 presents a general
critique of pure significance tests. Warm thanks go to Steven W. Knox for
helpful discussions.

2. Some geometry of the multinomial family.


Let Mult_p(k, n) denote the k-cell multinomial distribution based on n trials
with cell probabilities (p_1, . . . , p_k) ≡ p. Its sample space is the integer simplex

(3)  S_{k,n} = { x ≡ (x_1, . . . , x_k) | x_1 ≥ 0, . . . , x_k ≥ 0, Σ_{j=1}^k x_j = n },

while its pmf is given by

(4)  f_p(x) = \binom{n}{x} ∏_{j=1}^k p_j^{x_j},   x ∈ S_{k,n},

where \binom{n}{x} = n! / ∏_{j=1}^k x_j! is the multinomial coefficient. Denote the set of all such
multinomial pmfs by

(5)  M(k, n) = { f_p | p ∈ P_k },

where P_k is the (k−1)-dimensional probability simplex:

(6)  P_k = { p ≡ (p_1, . . . , p_k) | 0 ≤ p_j ≤ 1, Σ_{j=1}^k p_j = 1 }.

Because |S_{k,n}| = \binom{n+k−1}{k−1}, the multinomial pmf f_p can be viewed as a vec-
tor in the probability simplex P_{\binom{n+k−1}{k−1}} and the multinomial family M(k, n) is
a (k−1)-dimensional curved surface in P_{\binom{n+k−1}{k−1}}. For example, in the simplest
case k = n = 2 (binomial, 2 trials), p = (p, 1−p), 0 ≤ p ≤ 1, and

f_p = ( (1−p)², 2p(1−p), p² ),

so M(2, 2) is a symmetric section of a parabola in P_3, with endpoints (1, 0, 0)
and (0, 0, 1) and stationary point (1/4, 1/2, 1/4).

Because S_{k,n} is finite, the uniform distribution Unif(k, n) on S_{k,n} exists
and is given by the constant pmf

(7)  f_unif(x) = 1 / \binom{n+k−1}{k−1},   x ∈ S_{k,n}.

This uniform pmf also can be viewed as a vector in P_{\binom{n+k−1}{k−1}}:

(8)  f_unif = ( 1 / \binom{n+k−1}{k−1} ) 1,   1 := (1, . . . , 1) ∈ R^{\binom{n+k−1}{k−1}}.

It is important to note that f_unif ∉ M(k, n),² but f_unif lies in the interior of
the convex hull of M(k, n). The latter follows from the well-known fact that
if ν is the uniform distribution on P_k then

(9)  \binom{n}{x} ∫_{P_k} ( ∏_{j=1}^k p_j^{x_j} ) dν(p) = 1 / \binom{n+k−1}{k−1}.

²Suppose to the contrary that f_unif = f_p for some p, that is,

\binom{n}{x} ∏_{j=1}^k p_j^{x_j} = constant   ∀x ∈ S_{k,n}.

Take (x_1, . . . , x_k) = (0, . . . , 0, n, 0, . . . , 0) to see that p_1^n = · · · = p_k^n, so p_1 = · · · = p_k.
Then take (x_1, . . . , x_k) = (n−1, 1, 0, . . . , 0) to see that p_1^n = n p_1^{n−1} p_2, so p_1 = n p_2 ≠ p_2,
a contradiction.
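Because S_{k,n} is finite, the objects in (3)-(7) can be enumerated and checked directly for small k and n. The Python sketch below is illustrative only (the helper names simplex and f_multinomial are mine, not from the report); it verifies |S_{k,n}| = \binom{n+k−1}{k−1} and that the pmf (4) sums to one.

```python
# Illustrative sketch: enumerate S_{k,n}, evaluate the multinomial pmf (4),
# and check the count |S_{k,n}| = C(n+k-1, k-1) and the uniform pmf (7).
from math import comb, factorial, prod

def simplex(k, n):
    """All k-tuples of nonnegative integers summing to n, i.e. S_{k,n} in (3)."""
    if k == 1:
        return [(n,)]
    return [(x,) + rest for x in range(n + 1) for rest in simplex(k - 1, n - x)]

def f_multinomial(x, p):
    """The pmf f_p(x) in (4)."""
    n = sum(x)
    coef = factorial(n) // prod(factorial(xj) for xj in x)
    return coef * prod(pj ** xj for pj, xj in zip(p, x))

k, n = 3, 4
S = simplex(k, n)
assert len(S) == comb(n + k - 1, k - 1)          # |S_{k,n}| = C(n+k-1, k-1)
f_unif = 1 / comb(n + k - 1, k - 1)              # the uniform pmf (7)
assert abs(sum(f_multinomial(x, (1/3, 1/3, 1/3)) for x in S) - 1) < 1e-12
```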

3. The first testing problem.


Let X ≡ (X_1, . . . , X_k) be a rvtr with pmf f on the sample space S_{k,n}. We
shall study two problems. The first is that of testing the simple hypothesis

(10)  H_p : f = f_p   vs.   H_unif : f = f_unif

for any fixed p ∈ P_k. The classical LRT for this problem rejects H_p for
large values of f_unif(X)/f_p(X), equivalently, for small values of f_p(X), so is
equivalent to the PST for H_p. If X = x_0 is observed, the attained p-value is

(11)  π_p(x_0) := P_p[ f_p(X) ≤ f_p(x_0) ].

Determination of π_p(x_0) is a challenging computational question that is
not addressed here. However, Proposition 3.1 and Conjecture 3.3 (if true)
indicate that the LRT ≡ PST for (10) when p ≠ p_ecp is more sensitive, i.e.,
more powerful (see Remark 3.2), than the LRT ≡ PST when p = p_ecp, the
equal-cell-probability (ecp) case, where

(12)  p_ecp = (1/k, . . . , 1/k) ∈ P_k,

(13)  f_ecp(x) = \binom{n}{x} (1/k^n),   x ∈ S_{k,n}.

This test rejects H_ecp ≡ H_{p_ecp} in favor of H_unif for large values of

(14)  f_unif(X) / f_ecp(X)  ∝  \binom{n}{X}^{−1}  ∝  ∏_{j=1}^k X_j!.

Thus the ecp case should provide a floor for the general case.
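For small k and n, the p-value (11) can nevertheless be computed by brute-force enumeration. A sketch, illustrative only, reusing the hypothetical simplex and f_multinomial helpers from the Section 2 sketch; the floating-point tolerance is an implementation detail, not part of the report:

```python
# Brute-force sketch of the attained p-value (11): pi_p(x0) = P_p[f_p(X) <= f_p(x0)].
def pst_p_value_multinomial(p, x0):
    k, n = len(p), sum(x0)
    S = simplex(k, n)                       # helper from the Section 2 sketch
    f0 = f_multinomial(x0, p)
    tol = 1e-12                             # guard against floating-point ties
    return sum(f_multinomial(x, p) for x in S if f_multinomial(x, p) <= f0 + tol)

# Example: the ecp case p = (1/3, 1/3, 1/3) with n = 4 and x0 = (4, 0, 0).
print(pst_p_value_multinomial((1/3, 1/3, 1/3), (4, 0, 0)))   # 3/81, about 0.037
```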
Proposition 3.1. Under the KLD criterion, f_ecp is the closest pmf in
M(k, n) to f_unif. That is,

(15)  E_unif[ log( f_unif(X) / f_p(X) ) ]  >  E_unif[ log( f_unif(X) / f_ecp(X) ) ]   ∀p ∈ P_k \ {p_ecp}.

The left side and right side of (15) are the (positive) KLDs from f_p to f_unif
and from f_ecp to f_unif, respectively.

Proof. Inequality (15) is derived as follows. From (4) and (13),

E_unif[ log( f_unif(X) / f_p(X) ) ] − E_unif[ log( f_unif(X) / f_ecp(X) ) ]
  = E_unif[ log( f_ecp(X) / f_p(X) ) ]
  = E_unif[ log( k^{−n} ∏_{j=1}^k p_j^{−X_j} ) ]
  = −n log k − Σ_{j=1}^k E_unif(X_j) log p_j
  = −n [ log k + (1/k) Σ_{j=1}^k log p_j ] > 0

by the symmetry of f_unif (so E_unif(X_j) = n/k) and the strict convexity of −log x.  □
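Proposition 3.1 can also be checked numerically for small k and n. The sketch below is illustrative only; it reuses the hypothetical helpers from the Section 2 sketch and compares the KLD in (15) at randomly drawn p with its value at p = p_ecp.

```python
# Numerical spot-check of (15): KLD from f_p to f_unif is minimized over M(k,n) at p_ecp.
import math, random

def kld_unif_to_p(p, k, n):
    S = simplex(k, n)                      # helper from the Section 2 sketch
    f_unif = 1 / len(S)
    # E_unif[ log(f_unif(X) / f_p(X)) ]
    return sum(f_unif * math.log(f_unif / f_multinomial(x, p)) for x in S)

k, n = 3, 4
ecp = tuple(1 / k for _ in range(k))
kld_ecp = kld_unif_to_p(ecp, k, n)
for _ in range(100):
    w = [random.random() for _ in range(k)]
    p = tuple(wj / sum(w) for wj in w)
    assert kld_unif_to_p(p, k, n) >= kld_ecp - 1e-12   # inequality (15), Proposition 3.1
```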

Remark 3.2. Chernoff (1956) showed that the logarithm of the Type 2 error
probability of the LRT ≡ PST is asymptotically proportional to the KLD.  □

Now consider the EPV criterion for the PSTs. For a discussion of the role
of the EPV in hypothesis testing, see Sackrowitz and Samuel-Cahn (1999).
Abbreviate S_{k,n} by S, let Y ∼ f_unif on S, and consider the p-values π_p and
π_ecp ≡ π_{p_ecp} given by (11).

Conjecture 3.3: Under the EPV criterion, f_ecp is the closest pmf in M(k, n)
to f_unif. That is, the EPV E_unif[π_p(Y)] is maximized when p = p_ecp, i.e.,

(16)  E_unif[π_ecp(Y)] > E_unif[π_p(Y)]   ∀p ∈ P_k \ {p_ecp}.  □

From (11), under the uniform alternative H_unif, the EPV of the LRT ≡
PST for (10) with a general p is given by

E_unif[π_p(Y)] = E_unif{ P_p[ f_p(X) ≤ f_p(Y) | Y ] }
             = P_{p,unif}[ f_p(X) ≤ f_p(Y) ]
             = ( 1 / \binom{n+k−1}{k−1} ) Σ_{ (x,y) ∈ S×S : f_p(x) ≤ f_p(y) } f_p(x),

while under H_unif, the EPV of the LRT ≡ PST for (10) with p = p_ecp is

E_unif[π_ecp(Y)] = ( 1 / \binom{n+k−1}{k−1} ) Σ_{ (x,y) ∈ S×S : f_ecp(x) ≤ f_ecp(y) } f_ecp(x).

Thus the conjectured inequality (16) holds iff

(17)  Σ_{ (x,y) ∈ S×S : \binom{n}{x} ≤ \binom{n}{y} } f_ecp(x)  >  Σ_{ (x,y) ∈ S×S : f_p(x) ≤ f_p(y) } f_p(x).

By conditioning on x, inequality (17) can be rewritten as

(18)  Σ_{x∈S} |{ y ∈ S : \binom{n}{x} ≤ \binom{n}{y} }| f_ecp(x)
        >  Σ_{x∈S} |{ y ∈ S : f_p(x) ≤ f_p(y) }| f_p(x),

where |A| denotes the number of elements of A. Verification of (18) has
proven to be elusive, although it is straightforward for the extreme case
where p is a permutation of p_0 ≡ (1, 0, . . . , 0):

Proposition 3.4. Inequality (18) holds when p is any permutation of p_0.

Proof. Because S is invariant under permutations of x, the right-hand side
of (18) is invariant under permutations of p, so it suffices to consider p = p_0.
Because f_{p_0}(x) = 1 (0) if x = (≠) x_0 := (n, 0, . . . , 0), the right-hand side of
(18) = 1 when p = p_0. But the left-hand side > 1, since

|{ y ∈ S : \binom{n}{x} ≤ \binom{n}{y} }| ≥ 1   ∀x ∈ S,

while \binom{n}{x_0} = 1, so

|{ y ∈ S : \binom{n}{x_0} ≤ \binom{n}{y} }| = |S| > 1.  □

After Proposition 3.4, the next simplest case of Conjecture 3.3 occurs
when p has exactly two positive components, which by symmetry can be
taken to be p_1, p_2 > 0, with p_3 = · · · = p_k = 0. This reduces the multino-
mial family to the binomial family, which will be examined in Sections 5-7,
providing further evidence for the validity of Conjecture 3.3 in general.
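Conjecture 3.3 can at least be explored numerically for small k and n. The sketch below is illustrative only (again reusing the hypothetical Section 2 helpers); it computes the EPV by enumeration and compares the ecp case with randomly drawn p. It supplies evidence, not a proof.

```python
# Numerical exploration of Conjecture 3.3: compare E_unif[pi_p(Y)] at p = p_ecp
# with its value at randomly drawn p in P_k.
import random

def epv_uniform(p, k, n, tol=1e-12):
    S = simplex(k, n)                                   # from the Section 2 sketch
    f = {x: f_multinomial(x, p) for x in S}
    pi = lambda y: sum(fx for fx in f.values() if fx <= f[y] + tol)   # p-value (11) at y
    return sum(pi(y) for y in S) / len(S)               # Y ~ uniform on S_{k,n}

k, n = 3, 3
ecp = tuple(1 / k for _ in range(k))
epv_ecp = epv_uniform(ecp, k, n)
draws = []
for _ in range(200):
    w = [random.random() for _ in range(k)]
    draws.append(epv_uniform(tuple(wj / sum(w) for wj in w), k, n))
print(epv_ecp, max(draws))   # Conjecture 3.3 predicts the first value to be the larger one
```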

Remark 3.5. Since we are testing f = f_p vs. f = f_unif (cf. (10)), it
would also seem to be of interest to compare the expected p-values un-
der the null hypotheses f_ecp and f_p, that is, to compare E_ecp[π_ecp(Y)] and
E_p[π_p(Y)]. However, if Y had a continuous distribution then π_ecp(Y) would
have the Uniform(0, 1) distribution under f_ecp, as would π_p(Y) under f_p,
hence E_ecp[π_ecp(Y)] = E_p[π_p(Y)] = 1/2. Thus under the actual discrete dis-
tribution of Y, both these expectations are approximately 1/2, at least for
moderate or large n, so this comparison would be uninformative.  □

4. The second testing problem.


Now consider the problem of testing the composite null hypothesis

(19)  H_mult : f ∈ M(k, n)   vs.   H_unif : f = f_unif.

This can be viewed as testing goodness-of-fit for the multinomial distribution,
but with a specified alternative, namely the uniform distribution. Because
f_unif ∉ M(k, n) while f_ecp is the pmf in M(k, n) closest to f_unif according to
KLD (Proposition 3.1) and possibly EPV (Conjecture 3.3), one might ask if
f_ecp is the least favorable distribution for the testing problem (19).³ If this
were true then the LRT ≡ PST (14) would be the most powerful test of its
size for (19). By Corollary 3.8.1 of [LR] Lehmann and Romano (2005), f_ecp
would be the least favorable distribution if, for all c > 0,

P_p[ f_ecp(X) ≤ c ] ≤ P_ecp[ f_ecp(X) ≤ c ]   ∀p ∈ P_k.

However, exactly the opposite is true:


Proposition 4.1. Pp [fecp (X)  c] is a Schur-convex4 function of p. Thus

(20) Pp [fecp (X)  c] Pecp [fecp (X)  c] 8p 2


/ Pk .

Proof. From (??),


n Pk
(21) Pp [fecp (X)  c] = Pp [ X
 c 0 ] = Pp [ j=1 log(Xj !) c00 ]
3
Phrased more properly, is the prior distribution on M(k, n) that assigns mass 1 to
fecp a least favorable prior distribution for (??) (see [LR], §3.8)).
4
Refer to [MOA] Marshall, Olkin, Arnold (2011) for definitions of Schur-convexity,
Schur-concavity, majorization, and T -transforms.

8
0
for some constants cP and c00 . Since x! = (x + 1) and the gamma func-
k
tion is log convex, j=1 log(xj !) is convex and symmetric in x, so is a
Schur-convex
Pk function. This implies that the indicator function of the set
00
{x | j=1 log(xj !) < c } is Schur-concave in x, hence its expected value un-
der Pp is Schur-concave in p by the Proposition in Example 2 of [R] Rinott
(1973) (see [MOA] Proposition 11.E.11.). By (??) this expected value is

Pp [fecp (X) > c] = 1 Pp [fecp (X)  c],

so the first assertion holds. Because every p majorizes pecp , (??) follows. ⇤

In view of Proposition 4.1, the LRT ≡ PST (14) is inappropriate for (19).
Instead we propose the actual LRT for (19), which rejects H_mult in favor of
H_unif for large values of

(22)  f_unif(X) / sup_{p∈P_k} f_p(X)  ∝  ∏_{j=1}^k ( X_j! / X_j^{X_j} )  =:  L(X).

By Proposition 4.2, this test has the desirable property that its power func-
tion is a Schur-concave function of p, hence attains its maximum over the
null hypothesis H_mult at p = p_ecp, in conformity with the minimum property
of the KLD in (15). This implies that the Type 2 error probability attains
its maximum over H_mult at p = p_ecp, hence this LRT is unbiased for (19).
Proposition 4.2. P_p[ L(X) ≥ c ] is a Schur-concave function of p ∈ P_k.

Proof. We will show that L(x) is Schur-concave, hence the indicator function
I_c(x) of the set { x | L(x) ≥ c } is also Schur-concave. Then by the Proposition
in Example 2 of [R], E_p[I_c(X)] ≡ P_p[ L(X) ≥ c ] is Schur-concave in p.

To show that L(x) is Schur-concave, we must show that L(y) ≤ L(x)
whenever y majorizes x. By a classical result of Muirhead (cf. [MOA] Lemma
3.1), x can be obtained from y by a finite number of linear T-transforms,
which act on pairs of the coordinates of y by moving both members of the
pair toward their average. Thus by the symmetry of L, it suffices to show

(23)  L(x_1 − 1, x_2 + 1, x_3, . . . , x_k)  ≤  L(x_1, x_2, x_3, . . . , x_k)

when x_1 ≤ x_2. But this is equivalent to each of the inequalities

(x_1 − 1)! (x_2 + 1)! / [ (x_1 − 1)^{x_1−1} (x_2 + 1)^{x_2+1} ]  ≤  x_1! x_2! / [ x_1^{x_1} x_2^{x_2} ],

(24)  ( 1 + 1/(x_1 − 1) )^{x_1−1}  ≤  ( 1 + 1/x_2 )^{x_2},

and (24) holds since (1 + 1/t)^t increases for t > 0 and x_1 − 1 < x_1 ≤ x_2.  □
Remark 4.3. If X = x_0 is observed, the attained p-value of the LRT (22)
can be expressed in a tractable form:

π_L(x_0) := sup_{p∈P_k} P_p[ L(X) ≥ L(x_0) ]

(25)        = P_ecp[ L(X) ≥ L(x_0) ].

Because p majorizes p_ecp ∀p ∈ P_k, (25) follows from Proposition 4.2. Eval-
uation of (25), based on f_ecp(x), is not addressed here.  □
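Although a closed form for (25) is not pursued here, for small k and n it can be computed by enumerating S_{k,n} under f_ecp. A sketch, illustrative only (the function names are mine, and simplex is the hypothetical helper from the Section 2 sketch):

```python
# Brute-force evaluation of (25): pi_L(x0) = P_ecp[L(X) >= L(x0)], with L(x) as in (22).
from math import factorial, prod

def L_stat(x):
    # L(x) = prod_j x_j! / x_j^{x_j}, with the conventions 0! = 1 and 0**0 = 1.
    return prod(factorial(xj) / xj ** xj if xj > 0 else 1.0 for xj in x)

def f_ecp(x):
    n, k = sum(x), len(x)
    return (factorial(n) // prod(factorial(xj) for xj in x)) / k ** n

def p_value_L(x0):
    k, n = len(x0), sum(x0)
    S = simplex(k, n)                      # from the Section 2 sketch
    L0 = L_stat(x0)
    return sum(f_ecp(x) for x in S if L_stat(x) >= L0 - 1e-12)

print(p_value_L((3, 1, 0)))                # k = 3, n = 4, observed x0 = (3, 1, 0)
```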
Remark 4.4. Evaluation of (25) is easy in the binomial case k = 2, where
x = (x_1, n − x_1), x_0 = (x_{10}, n − x_{10}), and

(26)  L(x_1, n − x_1) = x_1! (n − x_1)! / [ x_1^{x_1} (n − x_1)^{n−x_1} ],   x_1 = 0, . . . , n.

Here the Schur-concavity of L(x), shown in the proof of Proposition 4.2, im-
plies that L(x_1, n − x_1) is unimodal and symmetric about x_1 = n/2. Therefore

(27)  P_ecp[ L(X) ≥ L(x_0) ] = P_ecp[ |X_1 − n/2| ≤ |x_{10} − n/2| ].

Because X_1 ∼ Binomial(n, 1/2) in the ecp case, this is readily evaluated.  □
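A short illustrative Python check of (27), not part of the report:

```python
# The p-value (25) in the binomial case k = 2, evaluated via (27) with X1 ~ Bin(n, 1/2).
from math import comb

def p_value_binomial_lrt(n, x10):
    d = abs(x10 - n / 2)
    return sum(comb(n, x) for x in range(n + 1) if abs(x - n / 2) <= d) / 2 ** n

print(p_value_binomial_lrt(10, 8))   # P[|X1 - 5| <= 3] for X1 ~ Binomial(10, 1/2)
```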

5. The binomial case (k = 2).


In Sections 5-7 we examine the validity of Conjecture 3.3 for the first testing
problem (10) in the apparently simple but still challenging binomial case,
which hopefully might suggest an approach to the general multinomial case.
Denote the Binomial(n, p) pmf by

(28)  f_p(x) = \binom{n}{x} p^x (1 − p)^{n−x},   x = 0, 1, . . . , n,   0 ≤ p ≤ 1.

When k = 2, the inequality (18) in Conjecture 3.3 reduces to

(29)  Σ_{x=0}^n q_{1/2}(x) f_{1/2}(x)  ≥  Σ_{x=0}^n q_p(x) f_p(x),

where

(30)  q_p(x) = |{ y : 0 ≤ y ≤ n, f_p(x) ≤ f_p(y) }|.

Note that 1 ≤ q_p(x) ≤ n + 1. I believe that (29) can be sharpened as follows:

Conjecture 5.1. If 1/2 < p ≤ 1 then

(31)  Σ_{x=0}^n q_{1/2}(x) f_{1/2}(x)  >  Σ_{x=0}^n q_p(x) f_p(x).  □

It is easy to see that

(32)  q_{1/2}(x) = 1 + |n − 2x|,

so (31) can be written as

(33)  1 + Σ_{x=0}^n |n − 2x| f_{1/2}(x)  >  Σ_{x=0}^n q_p(x) f_p(x).

Because f_p(x) = f_{1−p}(n − x) and q_p(x) = q_{1−p}(n − x), only the case 1/2 ≤ p ≤ 1
need be considered. The difficulty in verifying (33) stems mainly from the
difficulty in determining q_p(x) as p increases from 1/2 to 1. Two special cases
are somewhat amenable (proved in Appendix 1):

Proposition 5.2. For the binomial case, Conjecture 5.1 holds when

(i) n/(n+1) ≤ p ≤ 1 (i.e., p near the extreme case p = 1);

(ii) 1/2 < p < 1/2 + δ_n for some sufficiently small δ_n ≤ ε_n (i.e., p near the ecp
case p = 1/2), where

(34)  ε_n = (ǎ_n − 1) / (2[ǎ_n + 1]);

(35)  ǎ_n = min{ a_n(x) : x = (n+3)/2, . . . , n }   if n is odd,
      ǎ_n = min{ a_n(x) : x = n/2 + 1, . . . , n }    if n is even;

(36)  a_n(x) = ( x / (n − x + 1) )^{1/(2x−n−1)} > 1.  □

6. Three binomial examples.


In the binomial case the conjectured inequality (31) holds for the first non-
trivial cases n = 3, 4, 5. To verify this, first determine q_p(x) as in Tables
1, 2, 3 below, then determine the n-th degree polynomial Σ_{x=0}^n q_p(x) f_p(x).
For n = 3 there are six such polynomials, each of degree 3 or less, one for
each of the six sub-ranges of p in Table 1. For n = 4 (n = 5) there are ten
(fourteen) sub-ranges in Table 2 (Table 3), hence ten (fourteen) such poly-
nomials, each of degree 4 (5) or less but not shown. It is straightforward to
verify numerically that (31) holds for each of these polynomials over their
respective sub-ranges of p in Tables 1, 2, 3. (However, (31) usually does not
extend from the sub-range to the entire range [1/2, 1].)
These three examples begin to illustrate the complexity of determining
the values of q_p(x), but perhaps suggest a pattern which might be extended
for general n. However, caution must be exercised, as demonstrated now.

In Tables 1, 2, 3, as p begins to increase above 1/2, the values of q_p(x) remain
unchanged for x ≤ n/2 and decrease by 1 for x > n/2. As p continues to
increase, the values of q_p(x) change in pairs: in Table 1, n = 3 is odd and
the first pair to change is (q_p(1), q_p(3)), from (2,3) to (3,3), to (3,2); in Table
3, n = 5 is odd and the first pair to change is (q_p(2), q_p(4)), again from (2,3)
to (3,3), to (3,2). This suggests that for odd n, the first pair to change is
always (q_p((n−1)/2), q_p((n+3)/2)). However, this is not the case:

q_p((n−1)/2) = q_p((n+3)/2)  ⟺  f_p((n−1)/2) = f_p((n+3)/2)
                            ⟺  t = ( (n+3)/(n−1) )^{1/2},

where t = p/(1−p), while

q_p(1) = q_p(n)  ⟺  f_p(1) = f_p(n)  ⟺  t = n^{1/(n−1)},

hence the pair (q_p((n−1)/2), q_p((n+3)/2)) changes before the pair (q_p(1), q_p(n)) iff

( (n+3)/(n−1) )^{1/2} < n^{1/(n−1)}.

This holds for n ≤ 45 but fails for n ≥ 47.

t ≡ p/(1−p)   | p                                  | q_p(0) q_p(1) q_p(2) q_p(3) | Σ_{x=0}^3 q_p(x) f_p(x)
1             | 1/2 ≡ 1/(1+1)                      |   4      2      2      4   | 4 − 6p + 6p²
(1, 3^{1/2})  | (1/(1+1), 3^{1/2}/(3^{1/2}+1))     |   4      2      1      3   | 4 − 6p + 3p² + 2p³
3^{1/2}       | 3^{1/2}/(3^{1/2}+1)                |   4      3      1      3   | 4 − 3p − 3p² + 5p³
(3^{1/2}, 3)  | (3^{1/2}/(3^{1/2}+1), 3/(3+1))     |   4      3      1      2   | 4 − 3p − 3p² + 4p³
3             | 3/(3+1)                            |   4      3      2      2   | 4 − 3p + p³
(3, ∞]        | (3/(3+1), 1]                       |   4      3      2      1   | 4 − 3p

Table 1: Determination of q_p(x) for n = 3.

By contrast, when n is even, the unimodality of f_p(0), . . . , f_p(n) and the
symmetry of f_{1/2}(x) about x = n/2 imply that the first pair to change must
be one of the pairs (q_p(n − x + 1), q_p(x)), x = n/2 + 1, . . . , n. For such pairs, the
change occurs when

q_p(n − x + 1) = q_p(x)  ⟺  f_p(n − x + 1) = f_p(x)
                        ⟺  t = ( x / (n − x + 1) )^{1/(2x−n−1)} =: Q(x).

However,

log Q(x) = [ log x − log(n − x + 1) ] / (2x − n − 1)
         = ( 1 / (2x − n − 1) ) Σ_{m=n−x+2}^{x} d_m,
d_m := log(m) − log(m − 1),

and d_m is strictly convex in m, hence log Q(x) is strictly increasing in x for
x = n/2 + 1, . . . , n by Lemma 6.1 below (proved in Appendix 1). This implies
that for even n, (q_p(n/2), q_p(n/2 + 1)) is always the first pair to change.

t ≡ p/(1−p)        | p                                            | q_p(0) q_p(1) q_p(2) q_p(3) q_p(4)
1                  | 1/2 ≡ 1/(1+1)                                |   5      3      1      3      5
(1, 3/2)           | (1/(1+1), (3/2)/((3/2)+1))                   |   5      3      1      2      4
3/2                | (3/2)/((3/2)+1)                              |   5      3      2      2      4
(3/2, 4^{1/3})     | ((3/2)/((3/2)+1), 4^{1/3}/(4^{1/3}+1))       |   5      3      2      1      4
4^{1/3}            | 4^{1/3}/(4^{1/3}+1)                          |   5      4      2      1      4
(4^{1/3}, 6^{1/2}) | (4^{1/3}/(4^{1/3}+1), 6^{1/2}/(6^{1/2}+1))   |   5      4      2      1      3
6^{1/2}            | 6^{1/2}/(6^{1/2}+1)                          |   5      4      3      1      3
(6^{1/2}, 4)       | (6^{1/2}/(6^{1/2}+1), 4/(4+1))               |   5      4      3      1      2
4                  | 4/(4+1)                                      |   5      4      3      2      2
(4, ∞]             | (4/(4+1), 1]                                 |   5      4      3      2      1

Table 2: Determination of q_p(x) for n = 4.

Lemma 6.1. For n ≥ 4 let d_1, . . . , d_n be a strictly convex sequence. For
n/2 + 1 ≤ x ≤ n, define

D_x = ( 1 / (2x − n − 1) ) Σ_{m=n−x+2}^{x} d_m.

Then D_x is strictly increasing in x.  □

t ≡ p/(1−p)           | p                                                | q_p(0) q_p(1) q_p(2) q_p(3) q_p(4) q_p(5)
1                     | 1/2 ≡ 1/(1+1)                                    |   6      4      2      2      4      6
(1, 2^{1/2})          | (1/(1+1), 2^{1/2}/(2^{1/2}+1))                   |   6      4      2      1      3      5
2^{1/2}               | 2^{1/2}/(2^{1/2}+1)                              |   6      4      3      1      3      5
(2^{1/2}, 5^{1/4})    | (2^{1/2}/(2^{1/2}+1), 5^{1/4}/(5^{1/4}+1))       |   6      4      3      1      2      5
5^{1/4}               | 5^{1/4}/(5^{1/4}+1)                              |   6      5      3      1      2      5
(5^{1/4}, 2)          | (5^{1/4}/(5^{1/4}+1), 2/(2+1))                   |   6      5      3      1      2      4
2                     | 2/(2+1)                                          |   6      5      3      2      2      4
(2, 10^{1/3})         | (2/(2+1), 10^{1/3}/(10^{1/3}+1))                 |   6      5      3      2      1      4
10^{1/3}              | 10^{1/3}/(10^{1/3}+1)                            |   6      5      4      2      1      4
(10^{1/3}, 10^{1/2})  | (10^{1/3}/(10^{1/3}+1), 10^{1/2}/(10^{1/2}+1))   |   6      5      4      2      1      3
10^{1/2}              | 10^{1/2}/(10^{1/2}+1)                            |   6      5      4      3      1      3
(10^{1/2}, 5)         | (10^{1/2}/(10^{1/2}+1), 5/(5+1))                 |   6      5      4      3      1      2
5                     | 5/(5+1)                                          |   6      5      4      3      2      2
(5, ∞]                | (5/(5+1), 1]                                     |   6      5      4      3      2      1

Table 3: Determination of q_p(x) for n = 5.

7. The ordered binomial distribution.


There is an interesting relation between the binomial Conjecture 5.1 and
what I shall call the ordered binomial distribution (OBD), which appears to
be new and of interest in its own right.

For n = 1, 2, . . . and 0 ≤ p ≤ 1, let X ≡ X_{n,p} be a random variable having
the Binomial(n, p) distribution, with pmf f_p(x) given in (28). Rearrange the
n + 1 probabilities f_p(x) in ascending order to obtain

(37)  f̃_p(0) ≤ f̃_p(1) ≤ · · · ≤ f̃_p(n − 1) ≤ f̃_p(n).

The relation between f̃_p(·) and f_p(·) is given by

(38)  f̃_p(r_p(x)) = f_p(x),   x = 0, . . . , n,

where

(39)  r_p(x) = |{ y : 0 ≤ y ≤ n, f_p(y) < f_p(x) }| = n + 1 − q_p(x)

and |A| denotes the number of elements of A. Here r_p(x) + 1 is the rank of
f_p(x) among f_p(0), . . . , f_p(n), where, in the case of a tie f_p(x_1) = f_p(x_2), the
lower rank is assigned to both. Clearly 0 ≤ r_p(x) ≤ n.

Definition 7.1. The ordered binomial distribution (OBD) is the distribution
of the random variable (rv) X̃ ≡ X̃_{n,p} defined by

(40)  P[X̃ = x] = f̃_p(x),   x = 0, 1, . . . , n.
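The OBD is straightforward to compute by sorting the binomial probabilities. A small Python sketch, illustrative only (the function names are mine, not from the report):

```python
# The ordered binomial distribution (40): sort the Binomial(n, p) probabilities in
# ascending order and treat the sorted vector as a pmf on {0, 1, ..., n}.
from math import comb

def obd_pmf(n, p):
    probs = sorted(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1))
    return probs                      # probs[x] = f~_p(x) in (37)

def obd_mean(n, p):
    """E(X~_{n,p}), the quantity appearing in Conjectures 7.4-7.6."""
    return sum(x * fx for x, fx in enumerate(obd_pmf(n, p)))

print(obd_pmf(5, 0.7))
print(obd_mean(5, 0.5), obd_mean(5, 0.7), obd_mean(5, 0.9))
# increasing in p here, consistent with Conjecture 7.5
```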


Clearly X̃_{n,1−p} =_d X̃_{n,p} because f_p(x) = f_{1−p}(n − x), so it suffices to study
the OBD only for 1/2 ≤ p ≤ 1. By (39), Conjecture 5.1 can be restated
equivalently as follows:

Conjecture 7.2. If 1/2 < p ≤ 1 then

(41)  Σ_{x=0}^n r_p(x) f_p(x)  >  Σ_{x=0}^n r_{1/2}(x) f_{1/2}(x).  □

By (38),

(42)  Σ_{x=0}^n r_p(x) f_p(x) = Σ_{x=0}^n r_p(x) f̃_p(r_p(x)).

Lemma 7.3 (proved in Appendix 1) shows that the sums in (42) are either
equal to, or very close to, E(X̃_p).

Lemma 7.3. (i) Σ_{x=0}^n r_p(x) f_p(x) = E(X̃_p) if no ties occur⁵ among f_p(0), . . . , f_p(n).
This includes the case n/(n+1) < p < 1.

(ii) Σ_{x=0}^n r_{1/2}(x) f_{1/2}(x) = E(X̃_{1/2}) − 1/2.

(iii) Σ_{x=0}^n r_p(x) f_p(x) = E(X̃_p) − δ_{n,p} if 1/2 < p ≤ n/(n+1) and ties occur among
f_p(0), . . . , f_p(n); here δ_{n,p} is some number such that 0 < δ_{n,p} < 1/2.

(iv) Σ_{x=0}^n r_1(x) f_1(x) = E(X̃_1).  □

By Lemma 7.3, the inequality (41) in Conjecture 7.2 can be restated
equivalently in terms of the OBD rv X̃_p:

(43)  E(X̃_p) >  E(X̃_{1/2}) − 1/2        if 1/2 < p < 1 and no ties occur,
      E(X̃_p) >  E(X̃_{1/2}) − δ*_{n,p}    if 1/2 < p < 1 and ties occur,
      E(X̃_p) >  E(X̃_{1/2}) − 1/2        if p = 1,

where δ*_{n,p} = 1/2 − δ_{n,p}, so 0 < δ*_{n,p} < 1/2. By Proposition 5.2, (43) holds for p
near 1/2 and near 1. The three examples show that (43) holds for n = 3, 4, 5.
Lastly, the conjectured inequality (43) can be strengthened successively:

Conjecture 7.4. E(X̃_p) > E(X̃_{1/2}) for 1/2 < p ≤ 1.  □

Conjecture 7.5. E(X̃_p) is strictly increasing for 1/2 ≤ p ≤ 1.  □

Conjecture 7.6: The OBD rv X̃_p is strictly stochastically increasing in
p. Equivalently, the OBD probability vector (f̃_p(0), . . . , f̃_p(n)) is strictly
increasing in the majorization ordering for 1/2 ≤ p ≤ 1.  □

I will continue to investigate these conjectures for the binomial case, hope-
fully to gain some insight for the general multinomial case. For now I conclude
with two miscellaneous remarks about the OBD.

⁵Because f_p(x) is a polynomial in p, no ties can occur among f_p(0), . . . , f_p(n) if p is
non-algebraic, i.e., transcendental, and almost all real numbers are transcendental.

Remark 7.7. Lemma 7.3(ii) provides an explicit expression for E(X̃_{1/2}).
Because r_{1/2}(x) = n − |n − 2x| by (32) and (39), the identity (65) and Lemma 7.3(ii) yield

Σ_{x=0}^n r_{1/2}(x) f_{1/2}(x) = n − (1/2^n) Σ_{x=0}^n |n − 2x| \binom{n}{x}
   = n [ 1 − (1/2^n) \binom{n}{n/2} ]                 if n is even,
   = n [ 1 − (1/2^{n−1}) \binom{n−1}{(n−1)/2} ]       if n is odd;

E(X̃_{1/2}) = n [ 1 − (1/2^n) \binom{n}{n/2} ] + 1/2                 if n is even,
           = n [ 1 − (1/2^{n−1}) \binom{n−1}{(n−1)/2} ] + 1/2       if n is odd.  □

Remark 7.8. For 12  p  1, the minimum OBD probability f˜p (0) =


(1 p)n , the minimum binomial probability. The maximum OBD probabil-
ity f˜p (n) = fp (x̂n,p ), the maximal ⌘ modal binomial probability, where the
binomial mode x̂n,p satisfies

fp (x̂n,p ) max(fp (x̂n,p 1), fp (x̂n,p + 1)),


x̂n,p x̂n,p + 1
(44) equivalently, p .
n+1 n+1
Thus x̂n,p is the unique integer in the interval ((n+1)p 1, (n+1)p) if (n+1)p
is not an integer, while x̂n,p occurs at both (n+1)p 1 and (n+1)p if (n+1)p
is an integer. In particular,
(
n
˜ n/2
/2n if n is even,
f 1 (n) = n n n n
2 (n 1)/2
/2 = (n+1)/2 /2 if n is odd;
✓ ◆
n
f˜p (n) = p(n+1)p 1 (1 p)n (n+1)p+1
(n + 1)p 1
✓ ◆
n
= p(n+1)p (1 p)n (n+1)p if (n + 1)p is an integer. ⇤
(n + 1)p

8. Repeated observations.

Now suppose we observe the random matrix

(45)  X = ( X_1 ; . . . ; X_r ),   the r × k matrix whose i-th row is X_i ≡ (X_{i1}, . . . , X_{ik}),

where X_1, . . . , X_r are independent, identically distributed rvtrs with common
pmf f on S_{k,n}. The range of X is

S^r_{k,n} := S_{k,n} × · · · × S_{k,n}  (r times).

Reconsider the testing problems (10) in Section 3 and (19) in Section 4,
repeated here for convenience: test

(46)  H_p : f = f_p   vs.   H_unif : f = f_unif,

(47)  H_mult : f ∈ M(k, n)   vs.   H_unif : f = f_unif,

but now based on the repeated observations represented by X.


If we define

(48)  X_+ := Σ_{i=1}^r X_i ≡ (X_{+1}, . . . , X_{+k}),

then X_+ is a sufficient statistic for X under the multinomial model M(k, n)
and X_+ ∼ Mult_p(k, rn) when f = f_p. Therefore it may seem that the results
of Sections 3 and 4 apply directly: just replace n and X therein by rn and
X_+ throughout. For example, for problem (47) the LRT statistic L(X) in
(22) apparently would be replaced by

(49)  f_unif(X_+) / sup_{p∈P_k} f_p(X_+)  ∝  ∏_{j=1}^k ( X_{+j}! / X_{+j}^{X_{+j}} )  =:  L̃(X_+).

However, this approach is incorrect, both because X_+ is not uniformly dis-
tributed on S_{k,rn} under H_unif and because X_+ is not a sufficient statistic
under the combined model M(k, n) ∪ {f_unif}; information would be lost by
considering X_+ alone. Therefore, tests for (46) and (47) must be based on
the pmf of X itself, whose range is S^r_{k,n}, not S_{k,rn}.

For x ∈ S^r_{k,n}, the pmfs of X under H_p, H_ecp, and H_unif, respectively, are

(50)  f_p(x) = ∏_{i=1}^r [ \binom{n}{x_i} ∏_{j=1}^k p_j^{x_{ij}} ]
            = [ ∏_{i=1}^r \binom{n}{x_i} ] [ ∏_{j=1}^k p_j^{x_{+j}} ],

(51)  f_ecp(x) = ( 1 / k^{rn} ) ∏_{i=1}^r \binom{n}{x_i},

(52)  f_unif(x) = \binom{n+k−1}{k−1}^{−r}.

Because f_unif(x) is uniform on S^r_{k,n}, the LRT ≡ PST for (46) based on X
rejects H_p for small values of f_p(X). Because the KLD based on X is r × the
KLD based on a single observation X_i, Proposition 3.1 remains valid here,
again indicating that this test is least sensitive when p = p_ecp. For this case
the LRT ≡ PST rejects H_ecp for large values of

(53)  [ ∏_{i=1}^r \binom{n}{X_i} ]^{−1}  ∝  ∏_{i=1}^r ∏_{j=1}^k X_{ij}!,

which should be compared to (14). We believe that Conjecture 3.3 regarding
the EPV criterion is also valid for repeated observations.
Next, the LRT for (47) based on X rejects H_mult for large values of

(54)  f_unif(X) / sup_{p∈P_k} f_p(X)  ∝  ( ∏_{i=1}^r ∏_{j=1}^k X_{ij}! ) / ∏_{j=1}^k X_{+j}^{X_{+j}}  =:  L*(X),

which should be compared to (22) and (49). We conjecture that Proposition
4.2 extends to the repeated observations case as follows:

Conjecture 8.1. P_p[ L*(X) ≥ c ] is a Schur-concave function of p ∈ P_k.  □

If Conjecture 8.1 is true then, as in Remark 4.3, if X = x_0 is observed,
the attained p-value of the LRT (54) can be expressed in a tractable form:

π_{L*}(x_0) = sup_{p∈P_k} P_p[ L*(X) ≥ L*(x_0) ]

(55)        = P_ecp[ L*(X) ≥ L*(x_0) ].

Here (55) would follow from Conjecture 8.1 since p majorizes p_ecp ∀p ∈ P_k.
As with (25), evaluation of (55) is not addressed here.
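If Conjecture 8.1 holds, (55) can be approximated by simulating X under the ecp distribution. The Monte Carlo sketch below is illustrative only (function names are mine), and it presumes Conjecture 8.1 when interpreting its output as the attained p-value:

```python
# Monte Carlo approximation of (55): P_ecp[L*(X) >= L*(x0)], with L*(x) as in (54).
import random
from math import factorial, prod

def L_star(x_rows):
    """L*(x) for an r x k array of cell counts, given as a tuple of row tuples."""
    k = len(x_rows[0])
    col = [sum(row[j] for row in x_rows) for j in range(k)]      # column sums x_+
    num = prod(factorial(c) for row in x_rows for c in row)
    den = prod(cj ** cj if cj > 0 else 1 for cj in col)
    return num / den

def sample_row_ecp(k, n):
    """One Mult_ecp(k, n) observation: n balls dropped uniformly into k cells."""
    counts = [0] * k
    for _ in range(n):
        counts[random.randrange(k)] += 1
    return tuple(counts)

def p_value_L_star(x0_rows, reps=20000):
    k, n, r = len(x0_rows[0]), sum(x0_rows[0]), len(x0_rows)
    L0 = L_star(x0_rows)
    hits = sum(L_star(tuple(sample_row_ecp(k, n) for _ in range(r))) >= L0 - 1e-12
               for _ in range(reps))
    return hits / reps

print(p_value_L_star(((3, 1, 0), (0, 2, 2))))   # r = 2 observations, k = 3, n = 4
```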
Suppose, however, that we attempt to apply the method used for the
proof of Proposition 4.2 in order to verify Conjecture 8.1. Write

(56)  P_p[ L*(X) ≥ c ] = E_p{ P[ L*(X) ≥ c | X_+ ] },

where the conditional pmf f_mult(x | x_+) under the multinomial model M(k, n)
does not depend on p:

(57)  f_mult(x | x_+) = ∏_{i=1}^r \binom{n}{x_i} ∏_{j=1}^k p_j^{x_{+j}} / [ \binom{rn}{x_+} ∏_{j=1}^k p_j^{x_{+j}} ]
                      = ∏_{i=1}^r \binom{n}{x_i} / \binom{rn}{x_+},   x ∈ R(x_+),

(58)  R(x_+) := S^r_{k,n} ∩ { x | 1_{1×r} x = x_+ };

f_mult(x | x_+) is a multivariate hypergeometric pmf. To show that P_p[ L*(X) ≥ c ]
is Schur-concave by applying the Proposition in Example 2 of [R] to the
expectation in (56), it must be shown that

ψ_c(X_+) ≡ P_p[ L*(X) ≥ c | X_+ ]

is Schur-concave in X_+. Unfortunately this is not true in general: for the
simplest case k = n = r = 2, it can be shown that

(59)  ( ψ_c(0, 4), ψ_c(1, 3), ψ_c(2, 2), ψ_c(3, 1), ψ_c(4, 0) ) = ( 0, 1, 1/3, 1, 0 )

when 1/16 < c < 2/27, which violates Schur-concavity.
An alternative, more tractable procedure for testing problem (47) is ob-
tained as follows. Factor the LRT statistic L* in (54) as

L*(X) = [ ∏_{i=1}^r ∏_{j=1}^k X_{ij}! / ∏_{j=1}^k X_{+j}! ] · [ ∏_{j=1}^k X_{+j}! / ∏_{j=1}^k X_{+j}^{X_{+j}} ]
      ∝ [ f_unif(X | X_+) / f_mult(X | X_+) ] · L̃(X_+)

(60)  ≡ V(X | X_+) · L̃(X_+),

where we use the fact that the conditional pmf f_unif(X | X_+) remains uniform
(constant) over the conditional range R(x_+). As noted above, L̃(X_+) by itself
may lose information relevant for (47), but some of this missing information
can be obtained from V(X | X_+), as now shown.

A simple but reasonable approach to combining V(X | X_+) with L̃(X_+)
for testing (47) is the following:

(*) reject H_mult if V(X | X_+) or L̃(X_+) is large.

As in (55), we seek a tractable upper bound for the overall significance level.
This can be accomplished by a hybrid procedure that combines the non-
dependence on p of the conditional pmf f_mult(x | x_+) under M(k, n) with
the Schur-concavity in p of P_p[ L̃(X_+) ≥ c ]. (Because X_+ ∼ Mult_p(k, rn),
Proposition 4.2 and Remark 4.3 apply to L̃(X_+).) This procedure is now
described.

For specified α, β > 0, use f_mult(x | x_+) to determine c_α(x_+) such that

(61)  P[ V(X | X_+) ≥ c_α(X_+) | X_+ ] = α  (or ≈ α),

and, as in (25), use f_ecp(x_+) to determine d_β such that

(62)  P_ecp[ L̃(X_+) ≥ d_β ] = β  (or ≈ β).

Apply (*) to obtain the following test procedure for (47):

T_{α,β} : reject H_mult if V(X | X_+) ≥ c_α(X_+) or L̃(X_+) ≥ d_β.
Proposition 8.2. The overall significance level of T_{α,β} is ≤ α + β.

Proof. By Bonferroni's inequality, the overall significance level of T_{α,β} is

sup_{p∈P_k} P_p[ T_{α,β} rejects H_mult ]
  = sup_{p∈P_k} P_p[ V(X | X_+) ≥ c_α(X_+) or L̃(X_+) ≥ d_β ]
  ≤ sup_{p∈P_k} { P_p[ V(X | X_+) ≥ c_α(X_+) ] + P_p[ L̃(X_+) ≥ d_β ] }
  = α + sup_{p∈P_k} P_p[ L̃(X_+) ≥ d_β ]
  = α + P_ecp[ L̃(X_+) ≥ d_β ]
  = α + β,

which can be controlled by choosing α and β appropriately.  □

Appendix 1

Proof of Proposition 5.2. (i) The binomial pmf f_p(x) is unimodal and

n/(n+1) ≤ p ≤ 1  ⟺  \binom{n}{n−1} p^{n−1}(1 − p) ≤ \binom{n}{n} p^n,

so f_p(x) is nondecreasing in x, hence the second sum in (31) is

Σ_{x=0}^n (n − x + 1) f_p(x) = n(1 − p) + 1.

Therefore the conjectured inequality (31) is equivalent to

(63)  Σ_{x=0}^n |n − 2x| f_{1/2}(x) > n(1 − p),

which, since 1 − p ≤ 1/(n+1), will hold if

(64)  Σ_{x=0}^n |n − 2x| \binom{n}{x} > n 2^n / (n+1).

Now apply the following identity, proved below:

(65)  Σ_{x=0}^n |n − 2x| \binom{n}{x} = n \binom{n}{n/2}           if n is even,
                                      = 2n \binom{n−1}{(n−1)/2}   if n is odd.

If n is even, it follows from this identity that (64) will hold if

(66)  \binom{n}{n/2} > 2^n / (n+1).

This inequality follows from a lower bound for \binom{n}{n/2} in Krafft (2000):

\binom{n}{n/2} ≥ 2^{n−1} √(2/n) > 2^n / (n+1).

Similarly, if n is odd then (64) will hold if

(67)  \binom{n−1}{(n−1)/2} > 2^{n−1} / (n+1),

which again follows from Krafft's inequality:

\binom{n−1}{(n−1)/2} ≥ 2^{n−2} √(2/(n−1)) > 2^{n−1} / (n+1).

(ii) Recall that q_{1/2}(x) = 1 + |n − 2x| and f_{1/2}(x) = f_{1/2}(n − x) for x = 0, . . . , n.
Furthermore, if p > 1/2 and n/2 < x ≤ n then

(68)  f_p(x) / f_p(n − x) = t^{2x−n} > 1;

(69)  f_p(x) / f_p(n − x + 1) = ( (n − x + 1)/x ) t^{2x−n−1}
        < 1  if t < a_n(x)  ⟺  p < 1/2 + b_n(x),
        = 1  if t = a_n(x)  ⟺  p = 1/2 + b_n(x),
        > 1  if t > a_n(x)  ⟺  p > 1/2 + b_n(x),

where t ≡ t(p) = p/(1 − p) > 1 and b_n(x) = ( a_n(x) − 1 ) / ( 2[a_n(x) + 1] ).
First suppose that n is odd. Then (68), (69), and the unimodality of the
binomial pmf imply that when 1/2 < p < 1/2 + ε_n,

(70)  q_p(x) = q_{1/2}(x),       x = 0, . . . , (n−1)/2,
      q_p(x) = q_{1/2}(x) − 1,   x = (n+1)/2, . . . , n.

Thus in this case the conjectured inequality (31) is equivalent to

(71)  Σ_{x=0}^n |n − 2x| f_{1/2}(x)  >  Σ_{x=0}^n |n − 2x| f_p(x) − Σ_{x=(n+1)/2}^n f_p(x).

Because

(72)  lim_{p↓1/2} { Σ_{x=0}^n |n − 2x| f_p(x) − Σ_{x=(n+1)/2}^n f_p(x) } = Σ_{x=0}^n |n − 2x| f_{1/2}(x) − 1/2,

the inequality (71) holds when 1/2 < p < 1/2 + δ_n for sufficiently small δ_n < ε_n.

If n is even then (68), (69), and the unimodality of the binomial pmf
imply that when 1/2 < p < 1/2 + ε_n,

(73)  q_p(x) = q_{1/2}(x),       x = 0, . . . , n/2,
      q_p(x) = q_{1/2}(x) − 1,   x = n/2 + 1, . . . , n.

Thus in this case the conjectured inequality (31) is equivalent to

(74)  Σ_{x=0}^n |n − 2x| f_{1/2}(x)  >  Σ_{x=0}^n |n − 2x| f_p(x) − Σ_{x=n/2+1}^n f_p(x).

Because

(75)  lim_{p↓1/2} { Σ_{x=0}^n |n − 2x| f_p(x) − Σ_{x=n/2+1}^n f_p(x) }
        = Σ_{x=0}^n |n − 2x| f_{1/2}(x) − (1/2)[ 1 − 2^{−n} \binom{n}{n/2} ],

the inequality (74) holds when 1/2 < p < 1/2 + δ_n for sufficiently small δ_n < ε_n.  □
n n
Proof of identity (65). If n ≥ 4 is even then, because \binom{n}{x} = \binom{n}{n−x},

Σ_{x=0}^n |n − 2x| \binom{n}{x}
  = Σ_{x=0}^{n/2−1} (n − 2x) \binom{n}{x} + Σ_{x=n/2+1}^{n} (2x − n) \binom{n}{x}
  = 2 [ Σ_{x=n/2+1}^{n} x \binom{n}{x} − Σ_{x=0}^{n/2−1} x \binom{n}{x} ]
  = 2 [ Σ_{x=0}^{n/2−1} (n − x) \binom{n}{x} − Σ_{x=0}^{n/2−1} x \binom{n}{x} ]
  = 2 [ n Σ_{x=0}^{n/2−1} \binom{n}{x} − 2 Σ_{x=0}^{n/2−1} x \binom{n}{x} ]
  = 2 [ (n/2) { 2^n − \binom{n}{n/2} } − 2n { 2^{n−2} − \binom{n−1}{n/2−1} } ]
  = 2 [ 2n \binom{n−1}{n/2−1} − (n/2) \binom{n}{n/2} ]
  = n \binom{n}{n/2},

where the second-to-last step uses Σ_{x=0}^{n/2−1} \binom{n}{x} = (1/2)[ 2^n − \binom{n}{n/2} ] and
Σ_{x=0}^{n/2−1} x \binom{n}{x} = n Σ_{y=0}^{n/2−2} \binom{n−1}{y} = n [ 2^{n−2} − \binom{n−1}{n/2−1} ],
and the last step uses \binom{n}{n/2} = 2 \binom{n−1}{n/2−1}.

Next, if n ≥ 3 is odd then, similarly to the even case,

Σ_{x=0}^n |n − 2x| \binom{n}{x}
  = Σ_{x=0}^{(n−1)/2} (n − 2x) \binom{n}{x} + Σ_{x=(n+1)/2}^{n} (2x − n) \binom{n}{x}
  = 2 [ Σ_{x=(n+1)/2}^{n} x \binom{n}{x} − Σ_{x=0}^{(n−1)/2} x \binom{n}{x} ]
  = 2 [ Σ_{x=0}^{(n−1)/2} (n − x) \binom{n}{x} − Σ_{x=0}^{(n−1)/2} x \binom{n}{x} ]
  = 2 [ n Σ_{x=0}^{(n−1)/2} \binom{n}{x} − 2 Σ_{x=0}^{(n−1)/2} x \binom{n}{x} ]
  = 2 [ n 2^{n−1} − 2n { 2^{n−2} − (1/2) \binom{n−1}{(n−1)/2} } ]
  = 2n \binom{n−1}{(n−1)/2},

where Σ_{x=0}^{(n−1)/2} \binom{n}{x} = 2^{n−1} by symmetry and
Σ_{x=0}^{(n−1)/2} x \binom{n}{x} = n Σ_{y=0}^{(n−3)/2} \binom{n−1}{y} = n [ 2^{n−2} − (1/2) \binom{n−1}{(n−1)/2} ].

This completes the proof of identity (65).  □

Proof of Lemma 6.1. For n/2 + 1 ≤ x ≤ n − 1,

(2x − n − 1)[2(x + 1) − n − 1] (D_{x+1} − D_x)
  = (2x − n − 1) Σ_{m=n−x+1}^{x+1} d_m − [2(x + 1) − n − 1] Σ_{m=n−x+2}^{x} d_m
  = (2x − n − 1) Σ_{m=n−x+1}^{x+1} d_m − (2x − n + 1) Σ_{m=n−x+2}^{x} d_m
  = (2x − n − 1)( d_{n−x+1} + d_{x+1} ) − 2 Σ_{m=n−x+2}^{x} d_m > 0,

by the strict convexity of d_m in m.  □

Proof of Lemma 7.3. (i) If no ties occur among f_p(0), . . . , f_p(n) then
(r_p(0), . . . , r_p(n)) is a permutation of (0, . . . , n), so Σ_{x=0}^n r_p(x) f̃_p(r_p(x)) = E(X̃_p).
Therefore (i) holds by (42). When n/(n+1) < p < 1, f_p(n − 1) < f_p(n), so by
unimodality no ties can occur.

(ii) Recall that f_{1/2}(x) = f_{1/2}(n − x) and, by (32) and (39),

r_{1/2}(x) = n − |n − 2x| = r_{1/2}(n − x).

Thus, when n is odd the tied pairs are (0, n), (1, n−1), . . . , ((n−1)/2, (n+1)/2), so

(76)  Σ_{x=0}^n r_{1/2}(x) f_{1/2}(x) = 2 Σ_{x=0}^{(n−1)/2} r_{1/2}(x) f_{1/2}(x) = 4 Σ_{x=0}^{(n−1)/2} x f_{1/2}(x).

Furthermore,

f̃_{1/2}(x) = f_{1/2}(x/2),       x = 0, 2, 4, . . . , n − 1,
           = f_{1/2}((x−1)/2),   x = 1, 3, 5, . . . , n,

so

E(X̃_{1/2}) = Σ_{x=0}^n x f̃_{1/2}(x)
  = Σ_{x=0,2,...,n−1} x f_{1/2}(x/2) + Σ_{x=1,3,...,n} x f_{1/2}((x−1)/2)
  = Σ_{x=0}^{(n−1)/2} 2x f_{1/2}(x) + Σ_{x=0}^{(n−1)/2} (2x + 1) f_{1/2}(x)
  = 4 Σ_{x=0}^{(n−1)/2} x f_{1/2}(x) + 1/2.

By comparing this to (76), we see that (ii) holds.


When n is even, the tied pairs are (0, n), (1, n−1), . . . , (n/2 − 1, n/2 + 1), so

(77)  Σ_{x=0}^n r_{1/2}(x) f_{1/2}(x) = 2 Σ_{x=0}^{n/2−1} r_{1/2}(x) f_{1/2}(x) + n f_{1/2}(n/2)
                                     = 4 Σ_{x=0}^{n/2−1} x f_{1/2}(x) + n f_{1/2}(n/2).

Furthermore,

f̃_{1/2}(x) = f_{1/2}(x/2),       x = 0, 2, 4, . . . , n − 2, n,
           = f_{1/2}((x−1)/2),   x = 1, 3, 5, . . . , n − 1,

so

E(X̃_{1/2}) = Σ_{x=0}^n x f̃_{1/2}(x)
  = Σ_{x=0,2,...,n−2} x f_{1/2}(x/2) + Σ_{x=1,3,...,n−1} x f_{1/2}((x−1)/2) + n f_{1/2}(n/2)
  = Σ_{x=0}^{n/2−1} 2x f_{1/2}(x) + Σ_{x=0}^{n/2−1} (2x + 1) f_{1/2}(x) + n f_{1/2}(n/2)
  = 4 Σ_{x=0}^{n/2−1} x f_{1/2}(x) + 1/2 + n f_{1/2}(n/2).

By comparing this to (77), we see that (ii) again holds.

(iii) If ties occur among f_p(0), . . . , f_p(n) then by the unimodality of f_p(x)
in x, these ties must occur in nested non-overlapping pairs f_p(y_i) = f_p(x_i),
where 1 ≤ · · · < y_2 < y_1 < x_1 < x_2 < · · · ≤ n. For fixed n, the most such
tied pairs occur when p = 1/2, as in (ii). Here we consider the case 1/2 < p ≤ n/(n+1).
Partition the half-open interval (1/2, n/(n+1)] as follows:

( 1/2, n/(n+1) ] = ∪_{x=(n+1)/2}^{n−1} ( x/(n+1), (x+1)/(n+1) ]                              if n is odd,
                 = ( 1/2, (n/2+1)/(n+1) ] ∪ ∪_{x=n/2+1}^{n−1} ( x/(n+1), (x+1)/(n+1) ]        if n is even.

Because

p ∈ ( x/(n+1), (x+1)/(n+1) ]  ⟹  f_p(x − 1) < f_p(x) ≥ f_p(x + 1),

by unimodality, ties can occur only for (y_i, x_i) pairs with x + 1 ≤ x_i ≤ n.
For each such tie, the coefficient of f_p(x_i) in E(X̃_p) is greater by 1 than
its coefficient r_p(x_i) in Σ_{x=0}^n r_p(x) f_p(x). Therefore, since p ≤ (x+1)/(n+1) and the
Binomial(n, p) distribution is stochastically increasing in p,

(78)  δ_{n,p} := E(X̃_p) − Σ_{z=0}^n r_p(z) f_p(z) ≤ Σ_{z=x+1}^n f_p(z) ≤ Σ_{z=x+1}^n f_{(x+1)/(n+1)}(z).

However, by the relationship between the binomial and beta distributions,

(79)  Σ_{z=x+1}^n f_{(x+1)/(n+1)}(z) = Pr[ Beta(x + 1, n − x) ≤ (x+1)/(n+1) ] < 1/2.

This inequality holds because median[Beta(α, β)] > α/(α+β) when α > β ≥ 1
(cf. Kerman (2011)), hence δ_{n,p} < 1/2. Since δ_{n,p} > 0 because at least one
tie is present, (iii) is verified for p ∈ ( x/(n+1), (x+1)/(n+1) ].

Lastly, because

p ∈ ( 1/2, (n/2+1)/(n+1) ]  ⟹  f_p(n/2 − 1) < f_p(n/2) ≥ f_p(n/2 + 1),

by unimodality, ties can occur only for (y_i, x_i) pairs with n/2 + 1 ≤ x_i ≤ n.
Since p ≤ (n/2+1)/(n+1), it follows as in (78) and (79) with x = n/2 that

δ_{n,p} ≤ Σ_{z=n/2+1}^n f_{(n/2+1)/(n+1)}(z) < 1/2.

Again δ_{n,p} > 0 because at least one tie is present, so (iii) is verified for
p ∈ ( 1/2, (n/2+1)/(n+1) ].

(iv) If p = 1 then n − 1 ties occur, namely

f_1(0) = · · · = f_1(n − 1) = 0,   f_1(n) = 1,
f̃_1(0) = · · · = f̃_1(n − 1) = 0,   f̃_1(n) = 1,
r_1(0) = · · · = r_1(n − 1) = 0,   r_1(n) = n,

so (iv) holds, since both sides of (iv) equal n.  □

Appendix 2

A critique of pure significance tests. The PST method described in
Section 1 possesses a seemingly irreparable ambiguity, as now demonstrated.

Example 1. Suppose that Mike observes the rv U whose range is the interval
(0, 1), Joe observes X = −log U whose range is (0, ∞), and Steve observes
Y = −log(1 − U) whose range also is (0, ∞). Note that U, X, and Y are
mutually equivalent, so Mike, Joe, and Steve possess the same information.

Suppose that Mike wishes to test H_0 : U ∼ Uniform(0, 1), equivalently,
f_U(u) = 1_{(0,1)}(u), the uniform pdf on (0, 1), without specifying any alterna-
tive distribution(s). Because f_U is constant, every PST based on U is trivial,
either rejecting H_0 for all U or accepting H_0 for all U.

However, the situation is much worse for Joe and Steve. It is straight-
forward to show that for Joe, H_0 is equivalent to f_X(x) = e^{−x} 1_{(0,∞)}(x),
the standard exponential pdf, while for Steve, H_0 is equivalent to f_Y(y) =
e^{−y} 1_{(0,∞)}(y), also standard exponential. Thus for Joe the PST rejects H_0 if
e^{−X} ≤ τ, where 0 < τ < 1, equivalently, if

(80)  X ≥ −log τ.

Similarly, for Steve the PST rejects H_0 if e^{−Y} ≤ τ, equivalently, if Y ≥
−log τ. However,

Y = −log(1 − U) = −log(1 − e^{−X}),

so Steve's PST rejects H_0 if

(81)  X ≤ −log(1 − τ),

which is essentially the opposite of Joe's PST (80). In fact, in terms of U,
(80) and (81) become, respectively,

(82)  U ≤ τ,
(83)  U ≥ 1 − τ.

Thus if τ = 1/2 then (82) and (83) become

(84)  U ≤ 1/2,
(85)  U ≥ 1/2,

so the two PSTs reach exactly opposite conclusions, although based on equiv-
alent evidence. Clearly this is an undesirable property for an inference pro-
cedure.

And the situation is actually far worse. Suppose that Marina observes
Z = −log η(U), where η is an arbitrary measure-preserving bijection of
(0, 1) → (0, 1). Under H_0, η(U) ∼ U, so for Marina H_0 is equivalent to
f_Z(z) = e^{−z} 1_{(0,∞)}(z), again standard exponential. Thus her PST rejects H_0
if Z ≥ −log τ, equivalently, if

(86)  η(U) ≤ τ.

Because this holds for all η, in terms of U every measurable
subset of (0, 1) is a rejection region for a PST of H_0. Thus, in terms of Z
essentially every measurable subset of (0, ∞) is the rejection region of a PST
for H_0, again an unacceptable conclusion.  □
Example 2. For the coup de grâce, we show that the above discussion
extends to essentially all pdfs f_0 on R^1, not just standard exponentials. Sup-
pose it is wished to test H_0 : f_X = f_0 without specifying any alternative(s).
For simplicity, assume that the support of f_0 is an interval (a, b), where
−∞ ≤ a < b ≤ ∞, and let F_0 be the cdf corresponding to f_0. Under H_0,
U := F_0(X) ∼ Uniform(0, 1). (U is the probability integral transform of X.)
The rejection region of the PST for H_0 is

(87)  R_τ = { x ∈ (a, b) | f_0(x) ≤ τ }
          = { u ∈ (0, 1) | f_0(F_0^{−1}(u)) ≤ τ }.

Now define

(88)  W_η = F_0^{−1}( η( F_0(X) ) ) = F_0^{−1}( η(U) ),

where again η is an arbitrary measure-preserving bijection of (0, 1) → (0, 1).
It is straightforward to show that f_{W_η} = f_0 under H_0, so the PST for H_0
based on W_η has rejection region

(89)  { w ∈ (a, b) | f_0(w) ≤ τ } = { u ∈ (0, 1) | f_0( F_0^{−1}( η(u) ) ) ≤ τ }
                                  = η^{−1}( { v ∈ (0, 1) | f_0( F_0^{−1}(v) ) ≤ τ } )
                                  = η^{−1}( R_τ ).

Because this holds for all η, in terms of U essentially every measurable subset
of (0, 1) is the rejection region of a PST for H_0. Thus, in terms of X, essentially
every measurable subset of (a, b) is the rejection region of a PST for H_0, again
an undesirable property.  □

References

[C] Chernoff, H. (1956). Large sample theory: parametric case. Ann. Math.
Statist. 27 1-22.

[H1] Hodges, J. S. (1990). Can/may Bayesians do pure tests of significance?
In Bayesian and Likelihood Methods in Statistics and Econometrics, S.
Geisser, J. S. Hodges, S. J. Press, A. Zellner, eds., 75-90. Elsevier (North
Holland).

[H2] Howard, J. V. (2009). Significance testing with no alternative hypothesis:
a measure of surprise. Erkenntnis 70 253-270.

[K1] Kerman, J. (2011). A closed-form expression for the median of the beta
distribution. arXiv:1111.0433.

[K2] Krafft, O. (2000). Problem 10819. Amer. Math. Monthly 107 p. 652.

[LR] Lehmann, E. L. and Romano, J. P. (2005). Testing Statistical Hypotheses
(3rd ed.), Springer, New York.

[MOA] Marshall, A. W., I. Olkin, B. C. Arnold (2011). Inequalities: Theory
of Majorization and its Applications (2nd ed.), Springer, New York.

[R] Rinott, Y. (1973). Multivariate majorization and rearrangement inequal-
ities with some applications to probability and statistics. Israel J. Math.
15 60-77.

[SS] Sackrowitz, H. and E. Samuel-Cahn (1999). P values as random vari-
ables - expected p values. Amer. Statistician 53 326-331.
