Pass 11 - Non-Inferiority Tests For Two Proportions

Chapter 210
Non-Inferiority Tests for Two Proportions
Introduction
This module provides power analysis and sample size calculation for non-inferiority and
superiority tests in two-sample designs in which the outcome is binary. Users may choose from
among eight popular test statistics commonly used for running the hypothesis test.
The power calculations assume that independent, random samples are drawn from two
populations.
Four Procedures Documented Here

There are four procedures in the menus that use the program module described in this chapter.
These procedures are identical except for the type of parameterization. The parameterization can
be in terms of proportions, differences in proportions, ratios of proportions, and odds ratios. Each
of these options is listed separately on the menus.
Example
A non-inferiority test example will set the stage for the discussion of the terminology that
follows. Suppose that the current treatment for a disease works 70% of the time. Unfortunately,
this treatment is expensive and occasionally exhibits serious side-effects. A promising new
treatment has been developed to the point where it can be tested. One of the first questions that
must be answered is whether the new treatment is as good as the current treatment. In other
words, do at least 70% of treated subjects respond to the new treatment?
Because of the many benefits of the new treatment, clinicians are willing to adopt the new
treatment even if it is slightly less effective than the current treatment. They must determine,
however, how much less effective the new treatment can be and still be adopted. Should it be
adopted if 69% respond? 68%? 65%? 60%? There is a percentage below 70% at which the
difference between the two treatments is no longer considered ignorable. After thoughtful
discussion with several clinicians, it was decided that if a response of at least 63% were achieved,
the new treatment would be adopted. The difference between these two percentages is called the
margin of equivalence. The margin of equivalence in this example is 7%.
The developers must design an experiment to test the hypothesis that the response rate of the new
treatment is at least 0.63. The statistical hypothesis to be tested is
H0: p1 − p2 ≤ −0.07 versus H1: p1 − p2 > −0.07
Notice that when the null hypothesis is rejected, the conclusion is that the response rate is at least
0.63. Note that even though the response rate of the current treatment is 0.70, the hypothesis test
is about a response rate of 0.63. Also notice that a rejection of the null hypothesis results in the
conclusion of interest.
Technical Details
The details of sample size calculation for the two-sample design for binary outcomes are
presented in the chapter “Two Proportions Non-Null Case,” and they will not be duplicated here.
Instead, this chapter only discusses those changes necessary for non-inferiority and superiority
tests.
Approximate sample size formulas for non-inferiority tests of two proportions are presented in
Chow et al. (2003), page 90. Only large sample (normal approximation) results are given there.
The results available in this module use exact calculations based on the enumeration of all
possible values in the binomial distribution.
Suppose you have two populations from which dichotomous (binary) responses will be recorded.
Assume without loss of generality that the higher proportions are better. The probability (or risk)
of cure in population 1 (the treatment group) is p1 and in population 2 (the reference group)
is p2. Random samples of n1 and n2 individuals are obtained from these two populations. The
data from these samples can be displayed in a 2-by-2 contingency table as follows:

Group       Success   Failure   Total
Treatment   x11       x12       n1
Control     x21       x22       n2
Totals      m1        m2        N

The binomial proportions, p1 and p2, are estimated from these data using the formulae

$$\hat{p}_1 = \frac{x_{11}}{n_1} \quad \text{and} \quad \hat{p}_2 = \frac{x_{21}}{n_2}$$
Let p1.0 represent the group 1 proportion tested by the null hypothesis, H0. The power of a test is
computed at a specific value of the proportion, which we will call p1.1. Let δ represent the
smallest difference (margin of equivalence) between the two proportions that still results in the
conclusion that the new treatment is not inferior to the current treatment. For a non-inferiority
test, δ < 0. The set of statistical hypotheses that are tested is
H0: p1.0 − p2 ≤ δ versus H1: p1.0 − p2 > δ
which can be rearranged to give
H0: p1.0 ≤ p2 + δ versus H1: p1.0 > p2 + δ
There are three common methods of specifying the margin of equivalence. The most direct is to
simply give values for p2 and p1.0 . However, it is often more meaningful to give p2 and then
specify p1.0 implicitly by specifying the difference, ratio, or odds ratio. Mathematically, the
definitions of these parameterizations are
Parameter    Computation            Hypotheses
Difference   δ = p1.0 − p2          H0: p1.0 − p2 ≤ δ0 vs. H1: p1.0 − p2 > δ0, δ0 < 0
Ratio        φ = p1.0 / p2          H0: p1.0 / p2 ≤ φ0 vs. H1: p1.0 / p2 > φ0, φ0 < 1
Odds Ratio   ψ = Odds1.0 / Odds2    H0: o1.0 / o2 ≤ ψ0 vs. H1: o1.0 / o2 > ψ0, ψ0 < 1
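
The three parameterizations all determine p1.0 once p2 is given. The following is a minimal sketch of that conversion, not PASS code; the function names are hypothetical, and the odds ratio formula is obtained by solving ψ = [p1.0 / (1 − p1.0)] / [p2 / (1 − p2)] for p1.0.

```python
def p10_from_difference(p2, d0):
    """p1.0 implied by a difference margin: p1.0 = p2 + d0."""
    return p2 + d0


def p10_from_ratio(p2, r0):
    """p1.0 implied by a ratio margin: p1.0 = r0 * p2."""
    return r0 * p2


def p10_from_odds_ratio(p2, or0):
    """p1.0 implied by an odds ratio margin (solve the odds ratio for p1.0)."""
    return or0 * p2 / (1.0 + p2 * (or0 - 1.0))


if __name__ == "__main__":
    p2 = 0.60
    print(p10_from_difference(p2, -0.05))  # difference margin of -0.05
    print(p10_from_ratio(p2, 0.90))        # ratio margin of 0.90
    print(p10_from_odds_ratio(p2, 0.90))   # odds ratio margin of 0.90
```

Note that the same numeric margin (here 0.90) implies a different p1.0 under the ratio and the odds ratio, which is one reason the parameterization should be chosen before the margin.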
Difference
The difference is perhaps the most direct method of comparison between two proportions. It is
easy to interpret and communicate. It gives the absolute impact of the treatment. However, there
are subtle difficulties that can arise with its interpretation.
One difficulty arises when the event of interest is rare. If a difference of 0.001 occurs when the
baseline probability is 0.40, it would be dismissed as being trivial. However, if the baseline
probability of a disease is 0.002, a 0.001 decrease would represent a reduction of 50%. Thus
interpretation of the difference depends on the baseline probability of the event.
Note that if δ < 0 , the procedure is called a non-inferiority test while if δ > 0 the procedure is
called a superiority test.
Non-Inferiority using a Difference

The following example might help you understand the concept of a non-inferiority test. Suppose
60% of patients respond to the current treatment method (p2 = 0.60). If the response rate of the
new treatment is no more than 5 percentage points worse (δ = −0.05) than that of the existing
treatment, it will be considered to be noninferior. Substituting these figures into the statistical
hypotheses gives
H0: δ ≤ −0.05 versus H1: δ > −0.05
Using the relationship
p1.0 = p2 + δ
gives
H0: p1.0 ≤ 0.55 versus H1: p1.0 > 0.55
In this example, when the null hypothesis is rejected, the concluded alternative is that the
response rate is at least 55%, which means that the new treatment is not inferior to the current
treatment.
Superiority using a Difference

The following example is intended to help you understand the concept of a superiority test.
Suppose 60% of patients respond to the current treatment method (p2 = 0.60). If the response
rate of the new treatment is at least 10 percentage points better (δ = 0.10), it will be considered to
be superior to the existing treatment. Substituting these figures into the statistical hypotheses
gives
H0: δ ≤ 0.10 versus H1: δ > 0.10
Using the relationship
p1.0 = p2 + δ
gives
H0: p1.0 ≤ 0.70 versus H1: p1.0 > 0.70
In this example, when the null hypothesis is rejected, the concluded alternative is that the
response rate is at least 0.70. That is, the conclusion of superiority is that the new treatment’s
response rate is at least 0.10 more than that of the existing treatment.
Ratio
The ratio, φ = p1.0 / p2, gives the relative change in the probability of the response. Tests of
non-inferiority and superiority use the formulation
H0: p1.0 / p2 ≤ φ0 versus H1: p1.0 / p2 > φ0
The only subtlety is that for non-inferiority tests φ0 < 1, while for superiority tests φ0 > 1.
Non-Inferiority using a Ratio

The following example might help you understand the concept of non-inferiority as defined by
the ratio. Suppose that 60% of patients (p2 = 0.60) respond to the current treatment method. If a
new treatment decreases the response rate by no more than 10% (φ0 = 0.90), it will be
considered to be noninferior to the standard treatment. Substituting these figures into the
statistical hypotheses gives
H0: φ ≤ 0.90 versus H1: φ > 0.90
Using the relationship
p1.0 = φ0 p2
gives
H0: p1.0 ≤ 0.54 versus H1: p1.0 > 0.54
In this example, when the null hypothesis is rejected, the concluded alternative is that the
response rate is at least 54%. That is, the conclusion of non-inferiority is that the new treatment’s
response rate is no more than 10% lower than that of the standard treatment.
Odds Ratio
The odds ratio, ψ = [p1.0 / (1 − p1.0)] / [p2 / (1 − p2)], gives the relative change in the odds of the
response. Tests of non-inferiority and superiority use the same formulation
H0: ψ ≤ ψ0 versus H1: ψ > ψ0
The only difference is that for non-inferiority tests ψ0 < 1, while for superiority tests ψ0 > 1.
A Note on Setting the Significance Level, Alpha

Setting the significance level has always been somewhat arbitrary. For planning purposes, the
standard has become to set alpha to 0.05 for two-sided tests. Almost universally, when someone
states that a result is statistically significant, they mean statistically significant at the 0.05 level.
Although 0.05 may be the standard for two-sided tests, it is not always the standard for one-sided
tests, such as non-inferiority tests. Statisticians often recommend that the alpha level for one-
sided tests be set at 0.025 since this is the amount put in each tail of a two-sided test.
Power Calculation
The power for a test statistic that is based on the normal approximation can be computed exactly
using two binomial distributions. The following steps are taken to compute the power of these
tests.
1. Find the critical value using the standard normal distribution. The critical value, zcritical , is
that value of z that leaves exactly the target value of alpha in the appropriate tail of the
normal distribution.
2. Compute the value of the test statistic, zt , for every combination of x11 and x21 . Note that
x11 ranges from 0 to n1 , and x21 ranges from 0 to n2 . A small value (around 0.0001) can
be added to the zero-cell counts to avoid numerical problems that occur when the cell value
is zero.
3. If zt > zcritical, the combination is in the rejection region. Call all combinations of x11
and x21 that lead to a rejection the set A.
4. Compute the power for given values of p1.1 and p2 as

$$1 - \beta = \sum_{A} \binom{n_1}{x_{11}} p_{1.1}^{x_{11}} q_{1.1}^{n_1 - x_{11}} \binom{n_2}{x_{21}} p_2^{x_{21}} q_2^{n_2 - x_{21}}$$

where q = 1 − p for each proportion.
5. Compute the actual value of alpha achieved by the design by substituting p2 for p1.1 to
obtain

$$\alpha^* = \sum_{A} \binom{n_1}{x_{11}} \binom{n_2}{x_{21}} p_2^{x_{11} + x_{21}} q_2^{n_1 + n_2 - x_{11} - x_{21}}$$
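
The enumeration steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the PASS implementation: it assumes an upper-tailed test, uses the unpooled z statistic as the test statistic (any of the statistics in this chapter could be substituted), adds 0.0001 to zero cells only, and follows step 5 as written by reusing the same sum with p2 in place of p1.1. The function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

ZERO_ADJ = 0.0001  # small value added to zero cells (step 2)


def z_unpooled(x11, n1, x21, n2, delta0):
    """Unpooled z statistic for H0: p1 - p2 <= delta0 (illustrative choice)."""
    # Build the four cells and nudge zero cells so the standard error is never zero.
    cells = [x11, n1 - x11, x21, n2 - x21]
    a, b, c, d = [v if v > 0 else ZERO_ADJ for v in cells]
    p1_hat = a / (a + b)
    p2_hat = c / (c + d)
    se = sqrt(p1_hat * (1 - p1_hat) / (a + b) + p2_hat * (1 - p2_hat) / (c + d))
    return (p1_hat - p2_hat - delta0) / se


def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def exact_power(n1, n2, p1, p2, delta0, alpha):
    """Sum the joint binomial probabilities over the rejection set A."""
    z_crit = NormalDist().inv_cdf(1 - alpha)      # step 1
    total = 0.0
    for x11 in range(n1 + 1):                     # step 2: every (x11, x21)
        pmf1 = binom_pmf(x11, n1, p1)
        for x21 in range(n2 + 1):
            if z_unpooled(x11, n1, x21, n2, delta0) > z_crit:  # step 3
                total += pmf1 * binom_pmf(x21, n2, p2)          # step 4
    return total


# Power at p1.1 = 0.65, and the step 5 quantity with p2 substituted for p1.1.
power = exact_power(50, 50, p1=0.65, p2=0.60, delta0=-0.05, alpha=0.025)
alpha_star = exact_power(50, 50, p1=0.60, p2=0.60, delta0=-0.05, alpha=0.025)
print(power, alpha_star)
```

The double loop visits (n1 + 1)(n2 + 1) tables, which is why the exact calculation becomes slow for large samples.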
Asymptotic Approximations
When the values of n1 and n2 are large (say over 200), these formulas often take a long time to
evaluate. In this case, a large sample approximation can be used. The large sample approximation
is made by replacing the values of p̂1 and p̂2 in the z statistic with the corresponding values of
p1.1 and p2, and then computing the results based on the normal distribution. Note that in large
samples, the Farrington and Manning statistic is substituted for the Gart and Nam statistic.
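
As a rough illustration of the large sample approach, the sketch below plugs p1.1 and p2 into the unpooled z statistic and reads the power from the standard normal distribution. The text does not spell out the final step, so the expression 1 − Φ(zcritical − z) used here is an assumption (it is a common textbook form for an upper-tailed test), and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_unpooled(n1, n2, p11, p2, delta0, alpha):
    """Normal-approximation power for an upper-tailed unpooled z test."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Standard error evaluated at the assumed true proportions.
    se = sqrt(p11 * (1 - p11) / n1 + p2 * (1 - p2) / n2)
    z = (p11 - p2 - delta0) / se
    return 1 - NormalDist().cdf(z_crit - z)


if __name__ == "__main__":
    for n in (100, 200, 500):
        print(n, approx_power_unpooled(n, n, 0.60, 0.60, -0.05, 0.025))
```

When p1.1 − p2 equals the margin δ0, the approximation returns alpha, as expected for a point on the boundary of the null hypothesis.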
Test Statistics
Several test statistics have been proposed for testing whether the difference, ratio, or odds ratio
is different from a specified value. The main difference among the several test statistics is in the
formula used to compute the standard error used in the denominator. These tests are based on the
following z-test

$$z_t = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0 - c}{\hat{\sigma}}$$

The constant, c, represents a continuity correction that is applied in some cases. When the
continuity correction is not used, c is zero. In power calculations, the values of p̂1 and p̂2 are not
known. The corresponding values of p1.1 and p2 may be reasonable substitutes.
Following is a list of the test statistics available in PASS. The availability of several test statistics
raises the question of which test statistic one should use. The answer is simple: one should use the
test statistic that will be used to analyze the data. You may choose a method because it is a
standard in your industry, because it seems to have better statistical properties, or because your
statistical package calculates it. Whatever your reasons for selecting a certain test statistic, you
should use the same test statistic when doing the analysis after the data have been collected.
Z Test (Pooled)
This test was first proposed by Karl Pearson in 1900. Although this test is usually expressed
directly as a chi-square statistic, it is expressed here as a z statistic so that it can be more easily
used for one-sided hypothesis testing. The proportions are pooled (averaged) in computing the
standard error. The formula for the test statistic is
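
$$z_t = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0}{\hat{\sigma}_1}$$

where

$$\hat{\sigma}_1 = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$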
Z Test (Unpooled)
This test statistic does not pool the two proportions in computing the standard error.

$$z_t = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0}{\hat{\sigma}_2}$$

where

$$\hat{\sigma}_2 = \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$
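
For concreteness, the pooled and unpooled statistics can be computed from the cell counts as sketched below. This is an illustration of the two formulas only; the function names are hypothetical, and no zero-count adjustment is applied, so tables that make the standard error zero are not handled.

```python
from math import sqrt


def z_pooled(x11, n1, x21, n2, delta0):
    """Z Test (Pooled): the standard error uses the pooled proportion."""
    p1_hat, p2_hat = x11 / n1, x21 / n2
    p_bar = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat - delta0) / se


def z_unpooled(x11, n1, x21, n2, delta0):
    """Z Test (Unpooled): each group contributes its own variance."""
    p1_hat, p2_hat = x11 / n1, x21 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat - delta0) / se


if __name__ == "__main__":
    # 58 of 100 responders on treatment, 60 of 100 on control, margin -0.05.
    print(z_pooled(58, 100, 60, 100, -0.05))
    print(z_unpooled(58, 100, 60, 100, -0.05))
```

With equal group sizes and similar proportions the two statistics are nearly identical; they diverge as the sample sizes or proportions become more unequal.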
Z Test with Continuity Correction (Pooled)

This test is the same as Z Test (Pooled), except that a continuity correction is used. Remember
that in the null case, the continuity correction makes the results closer to those of Fisher’s Exact
test.

$$z_t = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0 - \frac{F}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}{\hat{\sigma}_1}$$

$$\hat{\sigma}_1 = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$\bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$

where F is -1 for lower-tailed hypotheses and 1 for upper-tailed hypotheses.
Z Test with Continuity Correction (Unpooled)

This test is the same as the Z Test (Unpooled), except that a continuity correction is used.
Remember that in the null case, the continuity correction makes the results closer to those of
Fisher’s Exact test.

$$z_t = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0 - \frac{F}{2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}{\hat{\sigma}_2}$$

$$\hat{\sigma}_2 = \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$

where F is -1 for lower-tailed hypotheses and 1 for upper-tailed hypotheses.
T-Test of Difference
Based on a detailed comparative study of the behavior of several tests, D’Agostino (1988) and
Upton (1982) proposed using the usual two-sample t-test for testing whether the two proportions
are equal. One substitutes a ‘1’ for a success and a ‘0’ for a failure in the usual two-sample t-test
formula.
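
The substitution can be shown directly by expanding the counts into 0/1 data and applying the equal-variance two-sample t formula. This is a minimal sketch; the function name is hypothetical, and the pooled-variance form of the t-test is assumed because the text refers to the "usual" two-sample t-test.

```python
from math import sqrt
from statistics import mean, variance


def t_from_counts(x11, n1, x21, n2):
    """Two-sample pooled-variance t statistic on 0/1 coded outcomes."""
    # Code each success as 1 and each failure as 0.
    g1 = [1] * x11 + [0] * (n1 - x11)
    g2 = [1] * x21 + [0] * (n2 - x21)
    # Pooled sample variance with n1 + n2 - 2 degrees of freedom.
    sp2 = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    t = (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


if __name__ == "__main__":
    t, df = t_from_counts(58, 100, 60, 100)
    print(t, df)
```

Because the sum of squared deviations of 0/1 data equals n p̂ (1 − p̂), the statistic differs from the pooled z-test mainly in its variance divisor and its reference distribution.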
Miettinen and Nurminen’s Likelihood Score Test of the Difference

Miettinen and Nurminen (1985) proposed a test statistic for testing whether the difference is equal
to a specified, non-zero value, δ0. The regular MLE’s, p̂1 and p̂2, are used in the numerator of
the score statistic while MLE’s p̃1 and p̃2, constrained so that p̃1 − p̃2 = δ0, are used in the
denominator. A correction factor of N/(N-1) is applied to make the variance estimate less biased.
The significance level of the test statistic is based on the asymptotic normality of the score
statistic. The formula for computing this test statistic is

$$z_{MND} = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0}{\hat{\sigma}_{MND}}$$

where

$$\hat{\sigma}_{MND} = \sqrt{\left(\frac{\tilde{p}_1 \tilde{q}_1}{n_1} + \frac{\tilde{p}_2 \tilde{q}_2}{n_2}\right)\left(\frac{N}{N - 1}\right)}$$

$$\tilde{p}_1 = \tilde{p}_2 + \delta_0$$

$$\tilde{p}_2 = 2B\cos(A) - \frac{L_2}{3L_3}$$

$$A = \frac{1}{3}\left[\pi + \cos^{-1}\left(\frac{C}{B^3}\right)\right]$$

$$B = \operatorname{sign}(C)\sqrt{\frac{L_2^2}{9L_3^2} - \frac{L_1}{3L_3}}$$

$$C = \frac{L_2^3}{27L_3^3} - \frac{L_1 L_2}{6L_3^2} + \frac{L_0}{2L_3}$$

$$L_0 = x_{21}\delta_0(1 - \delta_0)$$

$$L_1 = [N_2\delta_0 - N - 2x_{21}]\delta_0 + M_1$$

$$L_2 = (N + N_2)\delta_0 - N - M_1$$

$$L_3 = N$$

Here N2 = n2 is the group 2 sample size and M1 = m1 = x11 + x21 is the total number of successes.
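
The closed-form constrained estimate can be checked numerically. The sketch below evaluates the trigonometric solution for the constrained estimate of p2 (the root of the cubic L3 p³ + L2 p² + L1 p + L0 = 0) and then the score statistic with the N/(N-1) factor. It is an illustration, not PASS code: the function name is hypothetical, N2 and M1 are taken to be n2 and x11 + x21, and degenerate tables (for example B = 0) are not handled.

```python
from math import acos, copysign, cos, pi, sqrt


def mn_difference(x11, n1, x21, n2, delta0):
    """Miettinen-Nurminen score statistic for the difference (sketch)."""
    N = n1 + n2
    m1 = x11 + x21
    # Cubic coefficients for the constrained MLE of p2.
    L3 = N
    L2 = (N + n2) * delta0 - N - m1
    L1 = (n2 * delta0 - N - 2 * x21) * delta0 + m1
    L0 = x21 * delta0 * (1 - delta0)
    # Trigonometric solution of the cubic.
    C = L2**3 / (27 * L3**3) - L1 * L2 / (6 * L3**2) + L0 / (2 * L3)
    B = copysign(sqrt(L2**2 / (9 * L3**2) - L1 / (3 * L3)), C)
    ratio = max(-1.0, min(1.0, C / B**3))  # guard acos against rounding
    A = (pi + acos(ratio)) / 3
    p2_t = 2 * B * cos(A) - L2 / (3 * L3)
    p1_t = p2_t + delta0
    # Variance at the constrained estimates, with the N/(N-1) correction.
    var = (p1_t * (1 - p1_t) / n1 + p2_t * (1 - p2_t) / n2) * N / (N - 1)
    z = (x11 / n1 - x21 / n2 - delta0) / sqrt(var)
    return z, p1_t, p2_t


if __name__ == "__main__":
    print(mn_difference(58, 100, 60, 100, -0.05))
```

Dropping the factor N / (N - 1) from `var` gives the corresponding Farrington and Manning statistic for the difference.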
Miettinen and Nurminen’s Likelihood Score Test of the Ratio

Miettinen and Nurminen (1985) proposed a test statistic for testing whether the ratio is equal to a
specified value φ0. The regular MLE’s, p̂1 and p̂2, are used in the numerator of the score statistic
while MLE’s p̃1 and p̃2, constrained so that p̃1 / p̃2 = φ0, are used in the denominator. A
correction factor of N/(N-1) is applied to make the variance estimate less biased. The significance
level of the test statistic is based on the asymptotic normality of the score statistic.
The formula for computing the test statistic is

$$z_{MNR} = \frac{\hat{p}_1 / \hat{p}_2 - \phi_0}{\sqrt{\left(\frac{\tilde{p}_1 \tilde{q}_1}{n_1} + \phi_0^2 \frac{\tilde{p}_2 \tilde{q}_2}{n_2}\right)\left(\frac{N}{N - 1}\right)}}$$

where

$$\tilde{p}_1 = \tilde{p}_2 \phi_0$$

$$\tilde{p}_2 = \frac{-B - \sqrt{B^2 - 4AC}}{2A}$$

$$A = N\phi_0$$

$$B = -[N_1\phi_0 + x_{11} + N_2 + x_{21}\phi_0]$$

$$C = M_1$$

Here N1 = n1, N2 = n2, and M1 = m1 = x11 + x21.
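
The constrained estimates for the ratio come from a quadratic, so they are easy to verify numerically. The sketch below computes only p̃1 and p̃2 (not the full statistic); the function name is hypothetical, and N1, N2, and M1 are taken to be n1, n2, and x11 + x21.

```python
from math import sqrt


def constrained_mle_ratio(x11, n1, x21, n2, phi0):
    """Constrained MLEs of p1 and p2 subject to p1 / p2 = phi0 (sketch)."""
    N = n1 + n2
    m1 = x11 + x21
    A = N * phi0
    B = -(n1 * phi0 + x11 + n2 + x21 * phi0)
    C = m1
    # Smaller root of A p^2 + B p + C = 0, as in the formula above.
    p2_t = (-B - sqrt(B * B - 4 * A * C)) / (2 * A)
    p1_t = phi0 * p2_t
    return p1_t, p2_t


if __name__ == "__main__":
    # 52 of 100 responders on treatment, 60 of 100 on control, phi0 = 0.90.
    print(constrained_mle_ratio(52, 100, 60, 100, 0.90))
```

A quick check is that the derivative of the constrained log-likelihood with respect to p2 is zero at the returned value.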
Miettinen and Nurminen’s Likelihood Score Test of the Odds Ratio

Miettinen and Nurminen (1985) proposed a test statistic for testing whether the odds ratio is equal
to a specified value, ψ0. Because the approach they used with the difference and ratio does not
easily extend to the odds ratio, they used a score statistic approach for the odds ratio. The regular
MLE’s are p̂1 and p̂2. The constrained MLE’s are p̃1 and p̃2. These estimates are constrained
so that ψ̃ = ψ0. A correction factor of N/(N-1) is applied to make the variance estimate less
biased. The significance level of the test statistic is based on the asymptotic normality of the score
statistic.

$$z_{MNO} = \frac{\dfrac{\hat{p}_1 - \tilde{p}_1}{\tilde{p}_1 \tilde{q}_1} - \dfrac{\hat{p}_2 - \tilde{p}_2}{\tilde{p}_2 \tilde{q}_2}}{\sqrt{\left(\dfrac{1}{N_1 \tilde{p}_1 \tilde{q}_1} + \dfrac{1}{N_2 \tilde{p}_2 \tilde{q}_2}\right)\left(\dfrac{N}{N - 1}\right)}}$$

where

$$\tilde{p}_1 = \frac{\tilde{p}_2 \psi_0}{1 + \tilde{p}_2(\psi_0 - 1)}$$

$$\tilde{p}_2 = \frac{-B + \sqrt{B^2 - 4AC}}{2A}$$

$$A = N_2(\psi_0 - 1)$$

$$B = N_1\psi_0 + N_2 - M_1(\psi_0 - 1)$$

$$C = -M_1$$
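
A short numerical sketch of the odds ratio score statistic follows. It is illustrative only: the function name is hypothetical, N1, N2, and M1 are taken to be n1, n2, and x11 + x21, the first variance term is assumed to use the group 1 sample size and the second the group 2 sample size, and ψ0 = 1 is excluded because A would then be zero.

```python
from math import sqrt


def mn_odds_ratio(x11, n1, x21, n2, psi0):
    """Miettinen-Nurminen score statistic for the odds ratio (sketch)."""
    N = n1 + n2
    m1 = x11 + x21
    # Quadratic for the constrained MLE of p2 (requires psi0 != 1).
    A = n2 * (psi0 - 1)
    B = n1 * psi0 + n2 - m1 * (psi0 - 1)
    C = -m1
    p2_t = (-B + sqrt(B * B - 4 * A * C)) / (2 * A)
    p1_t = p2_t * psi0 / (1 + p2_t * (psi0 - 1))
    v1 = p1_t * (1 - p1_t)
    v2 = p2_t * (1 - p2_t)
    num = (x11 / n1 - p1_t) / v1 - (x21 / n2 - p2_t) / v2
    den = sqrt((1 / (n1 * v1) + 1 / (n2 * v2)) * N / (N - 1))
    return num / den, p1_t, p2_t


if __name__ == "__main__":
    print(mn_odds_ratio(58, 100, 60, 100, 0.80))
```

The constrained estimates reproduce the observed total number of successes, n1 p̃1 + n2 p̃2 = x11 + x21, which gives a convenient sanity check.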
Farrington and Manning’s Likelihood Score Test of the Difference

Farrington and Manning (1990) proposed a test statistic for testing whether the difference is equal
to a specified value δ0. The regular MLE’s, p̂1 and p̂2, are used in the numerator of the score
statistic while MLE’s p̃1 and p̃2, constrained so that p̃1 − p̃2 = δ0, are used in the denominator.
The significance level of the test statistic is based on the asymptotic normality of the score
statistic.

$$z_{FMD} = \frac{\hat{p}_1 - \hat{p}_2 - \delta_0}{\sqrt{\dfrac{\tilde{p}_1 \tilde{q}_1}{n_1} + \dfrac{\tilde{p}_2 \tilde{q}_2}{n_2}}}$$

where the estimates p̃1 and p̃2 are computed as in the corresponding test of Miettinen and
Nurminen (1985) given above.
Farrington and Manning’s Likelihood Score Test of the Ratio

Farrington and Manning (1990) proposed a test statistic for testing whether the ratio is equal to a
specified value φ0. The regular MLE’s, p̂1 and p̂2, are used in the numerator of the score
statistic while MLE’s p̃1 and p̃2, constrained so that p̃1 / p̃2 = φ0, are used in the denominator.
Unlike the Miettinen and Nurminen statistic, no N/(N-1) correction factor is applied. The
significance level of the test statistic is based on the asymptotic normality of the score statistic.

$$z_{FMR} = \frac{\hat{p}_1 / \hat{p}_2 - \phi_0}{\sqrt{\dfrac{\tilde{p}_1 \tilde{q}_1}{n_1} + \phi_0^2 \dfrac{\tilde{p}_2 \tilde{q}_2}{n_2}}}$$

where the estimates p̃1 and p̃2 are computed as in the corresponding test of Miettinen and
Nurminen (1985) given above.
Farrington and Manning’s Likelihood Score Test of the Odds Ratio

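Farrington and Manning (1990) indicate that the Miettinen and Nurminen statistic may be
modified by removing the factor N/(N-1).
The formula for computing this test statistic is

$$z_{FMO} = \frac{\dfrac{\hat{p}_1 - \tilde{p}_1}{\tilde{p}_1 \tilde{q}_1} - \dfrac{\hat{p}_2 - \tilde{p}_2}{\tilde{p}_2 \tilde{q}_2}}{\sqrt{\dfrac{1}{N_1 \tilde{p}_1 \tilde{q}_1} + \dfrac{1}{N_2 \tilde{p}_2 \tilde{q}_2}}}$$

where the estimates p̃1 and p̃2 are computed as in the corresponding test of Miettinen and
Nurminen (1985) given above.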
Gart and Nam’s Likelihood Score Test of the Difference

Gart and Nam (1990), page 638, proposed a modification to the Farrington and Manning (1990)
difference test that corrects for skewness. Let zFMD(δ) stand for the Farrington and Manning
difference test statistic described above. The skewness corrected test statistic, zGND, is the
appropriate solution to the quadratic equation

$$(-\tilde{\gamma}) z_{GND}^2 + (-1) z_{GND} + (z_{FMD}(\delta) + \tilde{\gamma}) = 0$$

where

$$\tilde{\gamma} = \frac{\tilde{V}^{3/2}(\delta)}{6}\left(\frac{\tilde{p}_1 \tilde{q}_1 (\tilde{q}_1 - \tilde{p}_1)}{n_1^2} - \frac{\tilde{p}_2 \tilde{q}_2 (\tilde{q}_2 - \tilde{p}_2)}{n_2^2}\right)$$
Gart and Nam’s Likelihood Score Test of the Ratio

Gart and Nam (1988), page 329, proposed a modification to the Farrington and Manning (1990)
ratio test that corrects for skewness. Let zFMR(φ) stand for the Farrington and Manning ratio test
statistic described above. The skewness corrected test statistic, zGNR, is the appropriate solution to
the quadratic equation

$$(-\tilde{\varphi}) z_{GNR}^2 + (-1) z_{GNR} + (z_{FMR}(\phi) + \tilde{\varphi}) = 0$$

where

$$\tilde{\varphi} = \frac{1}{6\tilde{u}^{3/2}}\left(\frac{\tilde{q}_1 (\tilde{q}_1 - \tilde{p}_1)}{n_1^2 \tilde{p}_1^2} - \frac{\tilde{q}_2 (\tilde{q}_2 - \tilde{p}_2)}{n_2^2 \tilde{p}_2^2}\right)$$

$$\tilde{u} = \frac{\tilde{q}_1}{n_1 \tilde{p}_1} + \frac{\tilde{q}_2}{n_2 \tilde{p}_2}$$
Procedure Options
This section describes the options that are specific to this procedure. These are located on the
Data tab. For more information about the options of other tabs, go to the Procedure Window
chapter.
Data Tab (Common Options)

The Data tab contains the parameters associated with this test such as the proportions, sample
sizes, alpha, and power. This chapter covers four procedures, each of which has different options.
This section documents options that are common to all four procedures. Later, unique options for
each procedure will be documented.
Solve For
Find (Solve For)
This option specifies the parameter to be solved for using the other parameters. The parameters
that may be selected are P1.1, Alpha, Power and Beta, N1, and N2. Under most situations, you
will select either Power and Beta or N1.
Select N1 when you want to calculate the sample size needed to achieve a given power and alpha
level.
Select Power and Beta when you want to calculate the power of an experiment.
Error Rates
Power or Beta
This option specifies one or more values for power or for beta (depending on the chosen setting).
Power is the probability of rejecting a false null hypothesis, and is equal to one minus Beta. Beta
is the probability of a type-II error, which occurs when a false null hypothesis is not rejected.
Values must be between zero and one. Historically, the value of 0.80 (Beta = 0.20) was used for
power. Now, 0.90 (Beta = 0.10) is also commonly used.
A single value may be entered here or a range of values such as 0.8 to 0.95 by 0.05 may be
entered.
Alpha (Significance Level)
This option specifies one or more values for the probability of a type-I error. A type-I error occurs
when a true null hypothesis is rejected.
Values must be between zero and one. Historically, the value of 0.05 has been used for alpha.
This means that about one test in twenty will falsely reject the null hypothesis. You should pick a
value for alpha that represents the risk of a type-I error you are willing to take in your
experimental situation.
You may enter a range of values such as 0.01 0.05 0.10 or 0.01 to 0.10 by 0.01.
Sample Size
N1 (Sample Size Group 1)
Enter a value (or range of values) for the sample size of this group. You may enter a range of
values such as 10 to 100 by 10.
N2 (Sample Size Group 2)
Enter a value (or range of values) for the sample size of group 2 or enter Use R to base N2 on the
value of N1. You may enter a range of values such as 10 to 100 by 10.
• Use R
When Use R is entered here, N2 is calculated using the formula
N2 = [R(N1)]
where R is the Sample Allocation Ratio, and [Y] is the first integer greater than or equal to Y.
For example, if you want N1 = N2, select Use R and set R = 1.
R (Sample Allocation Ratio)

Enter a value (or range of values) for R, the allocation ratio between samples. This value is only
used when N2 is set to Use R.
When used, N2 is calculated from N1 using the formula: N2= [R(N1)] where [Y] is the next
integer greater than or equal to Y. Note that setting R = 1.0 forces N2 = N1.
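
The allocation rule amounts to rounding R × N1 up to the next whole subject. A minimal sketch follows; the function name is hypothetical, and floating-point values of R that are not exactly representable may round up unexpectedly, so exact fractions may be preferable in practice.

```python
from math import ceil


def n2_from_ratio(n1, r):
    """Group 2 sample size: first integer greater than or equal to r * n1."""
    return ceil(r * n1)


if __name__ == "__main__":
    print(n2_from_ratio(50, 1.0))   # equal allocation
    print(n2_from_ratio(25, 1.5))   # 37.5 is rounded up
```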
Effect Size – Reference (Group 2)

P2 (Reference Group Proportion)
Specify the value of p2 , the reference, baseline, or control group’s proportion. The null
hypothesis is that the two proportions differ by no more than a specified amount. Since P2 is a
proportion, these values must be between 0 and 1.
Test
Higher Proportions Are
This option specifies whether proportions represent successes (better) or failures (worse).
• Better (Successes)
When proportions represent successes, higher proportions are better. A noninferior treatment
is one whose proportion is at least almost as high as that of the reference group.
For testing non-inferiority, D0 is negative, R0 is less than 1, and OR0 is less than 1. For
testing superiority, D0 is positive, R0 is greater than 1, and OR0 is greater than 1.
• Worse (Failures)
When proportions represent failures, lower proportions are better. A noninferior treatment is
one whose proportion is at most only slightly higher than that of the reference group.
For testing non-inferiority, D0 is positive, R0 is greater than 1, and OR0 is greater than 1. For
testing superiority, D0 is negative, R0 is less than 1, and OR0 is less than 1.
Test Type
Specify which test statistic is used in searching and reporting. Although the pooled z-test is
commonly shown in elementary statistics books, the likelihood score test is arguably the best
choice.
Note that C.C. is an abbreviation for Continuity Correction. This refers to adding or
subtracting 1/(2n) to (or from) the numerator of the z-value to bring the normal approximation
closer to the binomial distribution.
Data Tab (Proportions)

This section documents options that are used when the parameterization is in terms of the values
of the two proportions, P1 and P2. P1.0 is the value of the P1 assumed by the null hypothesis and
P1.1 is the value of P1 at which the power is calculated.
Effect Size – Treatment (Group 1)

P1.0 (Equivalence Proportion)
This option allows you to specify the value P1.0 directly. This is the value of the treatment group’s
proportion above which the treatment group is considered noninferior to the reference group.
When Higher Proportions Are is set to Better, the trivial proportion is the smallest value of P1 for
which the treatment group is declared noninferior to the reference group. In this case, P1.0 should
be less than P2 for non-inferiority tests and greater than P2 for superiority tests. The reverse is the
case when Higher Proportions Are is set to Worse.
Proportions must be between 0 and 1. They cannot take on the values 0 or 1. This value should
not be set to exactly the value of P2.
P1.1 (Actual Proportion)
This option specifies the value of P1.1 which is the value of the treatment proportion at which the
power is to be calculated. Proportions must be between 0 and 1. They cannot take on the values 0
or 1.
Data Tab (Differences)

This section documents options that are used when the parameterization is in terms of the
difference, P1 – P2. P1.0 is the value of P1 assumed by the null hypothesis and P1.1 is the value
of P1 at which the power is calculated. Once P2, D0, and D1 are given, the values of P1.1 and
P1.0 can be calculated.
Effect Size – Differences

D0 (Equivalence Difference)
This option specifies the trivial difference (often called the margin of equivalence) between P1.0 (the
value of P1 under H0) and P2. This difference is used with P2 to calculate the value of P1.0 using
the formula: P1.0 = P2 + D0.
When Higher Proportions Are is set to Better, the trivial difference is that amount by which P1
can be less than P2 and still have the treatment group declared noninferior to the reference group.
In this case, D0 should be negative for non-inferiority tests and positive for superiority tests.
The reverse is the case when Higher Proportions Are is set to Worse.
You may enter a range of values such as -.03 -.05 -.10 or -.05 to -.01 by .01. Differences must be
between -1 and 1. D0 cannot take on the values -1, 0, or 1.
D1 (Actual Difference)
This option specifies the actual difference between P1.1 (the actual value of P1) and P2. This is
the value of the difference at which the power is calculated. In non-inferiority trials, this
difference is often set to 0.
The power calculations assume that P1.1 is the actual value of the proportion in group 1
(experimental or treatment group). This difference is used with P2 to calculate the value of P1.1
using the formula: P1.1 = D1 + P2.
You may enter a range of values such as -.05 0 .5 or -.05 to .05 by .02. Actual differences must be
between -1 and 1. They cannot take on the values -1 or 1.
Data Tab (Ratios)

This section documents options that are used when the parameterization is in terms of the ratio,
P1 / P2. P1.0 is the value of P1 assumed by the null hypothesis and P1.1 is the value of P1 at
which the power is calculated. Once P2, R0, and R1 are given, the values of P1.0 and P1.1 can be
calculated.
Effect Size – Ratios

R0 (Equivalence Ratio)
This option specifies the trivial ratio (also called the Relative Margin of Equivalence) between
P1.0 and P2. The power calculations assume that P1.0 is the value of the P1 under the null
hypothesis. This value is used with P2 to calculate the value of P1.0 using the formula: P1.0 = R0
x P2.
When Higher Proportions Are is set to Better, the trivial ratio is the relative amount by which P1
can be less than P2 and still have the treatment group declared noninferior to the reference group.
In this case, R0 should be less than one for non-inferiority tests and greater than 1 for superiority
tests. The reverse is the case when Higher Proportions Are is set to Worse.
Ratios must be positive. R0 cannot take on the value of 1. You may enter a range of values such
as 0.95 .97 .99 or .91 to .99 by .02.
R1 (Actual Ratio)
This option specifies the ratio of P1.1 and P2, where P1.1 is the actual proportion in the treatment
group. The power calculations assume that P1.1 is the actual value of the proportion in group 1.
This ratio is used with P2 to calculate the value of P1.1 using the formula: P1.1 = R1 x P2. In
non-inferiority trials, this ratio is often set to 1.
Ratios must be positive. You may enter a range of values such as 0.95 1 1.05 or 0.9 to 1.9 by
0.02.
Data Tab (Odds Ratios)

This section documents options that are used when the parameterization is in terms of the odds
ratios, O1.1 / O2 and O1.0 / O2. Note that the odds are defined as O2 = P2 / (1 – P2), O1.0 = P1.0
/ (1 – P1.0), etc. P1.0 is the value of P1 assumed by the null hypothesis and P1.1 is the value of
P1 at which the power is calculated. Once P2, OR0, and OR1 are given, the values of P1.1 and
P1.0 can be calculated.
Effect Size – Odds Ratios

OR0 (Equivalence Odds Ratio)
This option specifies the trivial odds ratio between P1.0 and P2. The power calculations assume
that P1.0 is the value of the P1 under the null hypothesis. OR0 is used with P2 to calculate the
value of P1.0.
When Higher Proportions Are is set to Better, the trivial odds ratio implicitly gives the amount
by which P1 can be less than P2 and still have the treatment group declared noninferior to the
reference group. In this case, OR0 should be less than 1 for non-inferiority tests and greater than
1 for superiority tests. The reverse is the case when Higher Proportions Are is set to Worse.
Odds ratios must be positive. OR0 cannot take on the value of 1.
OR1 (Actual Odds Ratio)
This option specifies the odds ratio of P1.1 and P2, where P1.1 is the actual proportion in the
treatment group. The power calculations assume that P1.1 is the actual value of the proportion in
group 1. This value is used with P2 to calculate the value of P1.1. In non-inferiority trials, this odds
ratio is often set to 1.
Odds ratios must be positive. You may enter a range of values such as 0.95 1 1.05 or 0.9 to 1 by
0.02.
Options Tab
The Options tab contains various limits and options.
Maximum Iterations
Maximum Iterations Before Search Termination
Specify the maximum number of iterations before the search for the criterion of interest is
aborted. When the maximum number of iterations is reached without convergence, the criterion is
not reported. A value of at least 500 is recommended.
Zero Counts
Zero Count Adjustment Method
Zero cell counts cause many calculation problems. To compensate for this, a small value (called
the Zero Count Adjustment Value) can be added either to all cells or to all cells with zero counts.
This option specifies whether you want to use the adjustment and which type of adjustment you
want to use. We recommend that you use the option ‘Add to zero cells only.’
Zero cell values often do not occur in practice. However, since power calculations are based on
total enumeration, they will occur in power and sample size estimation.
Adding a small value is controversial, but can be necessary for computational considerations.
Statisticians have recommended adding various fractions to zero counts. We have found that
adding 0.0001 seems to work well.
Zero Count Adjustment Value
Zero cell counts cause many calculation problems when computing power or sample size. To
compensate for this, a small value may be added either to all cells or to all zero cells. This is the
amount that is added. We have found that 0.0001 works well.
Be warned that the value of the ratio and the odds ratio will be affected by the amount specified
here!
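The following Python sketch illustrates the two adjustment types and why the warning above matters. It is an assumption-laden illustration, not the PASS implementation: the function names and the mode labels are hypothetical, and the 2-by-2 table is made up for the demonstration.

```python
def adjust_zero_counts(table, delta=0.0001, mode="zero_cells_only"):
    """Return a copy of a 2x2 table with the zero count adjustment applied.

    table is [[s1, f1], [s2, f2]]: successes and failures for groups 1 and 2.
    The mode names are hypothetical labels for the choices described above.
    """
    if mode == "none":
        return [list(row) for row in table]
    if mode == "all_cells":
        return [[cell + delta for cell in row] for row in table]
    if mode == "zero_cells_only":
        return [[cell + delta if cell == 0 else cell for cell in row]
                for row in table]
    raise ValueError(f"unknown mode: {mode}")


def odds_ratio(table):
    """Odds ratio O1 / O2 = (s1 / f1) / (s2 / f2)."""
    (s1, f1), (s2, f2) = table
    return (s1 * f2) / (f1 * s2)


# Made-up table: group 1 has no failures, so the raw odds ratio is undefined.
observed = [[10, 0], [7, 3]]
for delta in (0.0001, 0.01, 0.5):
    adjusted = adjust_zero_counts(observed, delta)
    print(delta, odds_ratio(adjusted))
```

The printed odds ratios differ by orders of magnitude across the three adjustment values, which is the sensitivity that the warning refers to.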
Exact Test Options

Maximum N1 or N2 for Exact Calculations
When either N1 or N2 is above this amount, power calculations are based on the normal
approximation to the binomial. In this case, the actual value of alpha is not calculated. Currently,
for three-gigahertz computers, a value near 200 is reasonable. As computers get faster, this
number may be increased.
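As a rough illustration of what a normal-approximation power calculation looks like, the Python sketch below handles the difference parameterization when higher proportions are better. It assumes the usual large-sample power expression for the Farrington and Manning score test, in which the null standard deviation uses constrained maximum likelihood estimates evaluated at the assumed true proportions. The function names are hypothetical, the constrained estimates are found by bisection rather than by a closed-form solution, and the code is not taken from PASS.

```python
from math import sqrt
from statistics import NormalDist


def constrained_mle_p2(p1, p2, n1, n2, d0, tol=1e-12):
    """Estimate of P2 under the constraint P1 - P2 = d0 (hypothetical helper).

    Solves the binomial score equation by bisection; assumes 0 < p1, p2 < 1.
    """
    lo, hi = max(0.0, -d0), min(1.0, 1.0 - d0)

    def score(x):
        return (n1 * (p1 / (x + d0) - (1 - p1) / (1 - x - d0))
                + n2 * (p2 / x - (1 - p2) / (1 - x)))

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if score(mid) > 0:   # likelihood still increasing: root is to the right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def fm_power_normal_approx(n1, n2, p2, d0, d1, alpha):
    """Approximate power for H0: P1 - P2 <= d0 versus H1: P1 - P2 = d1 > d0."""
    p1 = p2 + d1
    p2_null = constrained_mle_p2(p1, p2, n1, n2, d0)
    p1_null = p2_null + d0
    sd_null = sqrt(p1_null * (1 - p1_null) / n1 + p2_null * (1 - p2_null) / n2)
    sd_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(((d1 - d0) - z_crit * sd_null) / sd_alt)


# Inputs of Example 4 (the report there shows a power of 0.8001).
print(fm_power_normal_approx(55, 55, p2=0.5, d0=-0.2, d1=0.0, alpha=0.10))
```

Because this calculation never enumerates the binomial outcomes, it has no actual alpha to report, which is consistent with that column being left blank when the approximation is used.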
Example 1 – Finding Power

A study is being designed to establish the non-inferiority of a new treatment compared to the
current treatment. Historically, the current treatment has enjoyed a 60% cure rate. The new
treatment reduces the seriousness of certain side effects that occur with the current treatment.
Thus, the new treatment will be adopted even if it is slightly less effective than the current
treatment. The researchers will recommend adoption of the new treatment if it has a cure rate of
at least 55%.
The researchers plan to use the Farrington and Manning likelihood score test statistic to analyze
the data. They want to study the power of the Farrington and Manning test at group sample sizes
ranging from 50 to 500, with a non-inferiority margin of -0.05, when the actual cure rate of the
new treatment ranges from 57% to 70%. The significance level will be 0.025.
Setup
This section presents the values of each of the parameters needed to run this example. First, from
the PASS Home window, load the Non-Inferiority Tests for Two Proportions [Differences]
procedure window by expanding Proportions, then Two Independent Proportions, then
clicking on Non-Inferiority, and then clicking on Non-Inferiority Tests for Two Proportions
[Differences]. You may then make the appropriate entries as listed below, or open Example 1 by
going to the File menu and choosing Open Example Template.
Option Value
Data Tab
Find (Solve For) ...................................... Power and Beta
Power ...................................................... Ignored since this is the Find setting
Alpha ....................................................... 0.025
N1 (Sample Size Group 1) ...................... 50 to 500 by 50
N2 (Sample Size Group 2) ...................... Use R
R (Sample Allocation Ratio) .................... 1.0
D0 (Non-Inferiority Difference) ................ -0.05
D1 (Actual Difference) ............................. -0.03 0.00 0.05 0.10
P2 (Reference Group Proportion) ........... 0.6
Test Type ................................................ Likelihood Score (Farr. & Mann.)
Higher Proportions Are............................ Better
Options Tab
Maximum N1 or N2 Exact ....................... 300
Annotated Output
Click the Run button to perform the calculations and generate the following output.
Numeric Results
Numeric Results of Non-Inferiority Tests Based on the Difference: P1 – P2
H0: P1-P2<=D0. H1: P1-P2=D1>D0. Test Statistic: Score test (Farrington & Manning)
         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.0380   50      50      0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0236  0.9620
0.0494   100     100     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0267  0.9506
0.0525   150     150     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0241  0.9475
0.0588   200     200     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0244  0.9412
0.0650   250     250     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0241  0.9350
0.0735   300     300     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250  0.0261  0.9265
0.0776   350     350     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250          0.9224
0.0832   400     400     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250          0.9168
0.0886   450     450     0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250          0.9114
Report continues …
Note: exact results based on the binomial were only calculated when both N1 and N2 were less than 300.
Report Definitions
'Power' is the probability of rejecting a false null hypothesis. It should be close to one.
'N1 and N2' are the sizes of the samples drawn from the corresponding groups.
'P2' is the response rate for group two which is the standard, reference, baseline, or control group.
'P1.0' is the smallest treatment-group response rate that still yields a non-inferiority conclusion.
'P1.1' is the treatment-group response rate at which the power is calculated.
'D0' is the non-inferiority margin. It is the difference P1-P2 assuming H0.
'D1' is the actual difference, P1-P2, at which the power is calculated.
'Target Alpha' is the probability of rejecting a true null hypothesis that was desired.
'Actual Alpha' is the value of alpha that is actually achieved.
'Beta' is the probability of accepting a false H0. Beta = 1 - Power.
'Grp 1' refers to Group 1 which is the treatment or experimental group.
'Grp 2' refers to Group 2 which is the reference, standard, or control group.
'Equiv.' refers to a small amount that is not of practical importance.
'Actual' refers to the true value at which the power is computed.
Summary Statements
Sample sizes of 50 in group one and 50 in group two achieve 4% power to detect a
non-inferiority margin difference between the group proportions of -0.0500. The reference group
proportion is 0.6000. The treatment group proportion is assumed to be 0.5500 under the null
hypothesis of inferiority. The power was computed for the case when the actual treatment
group proportion is 0.5700. The test statistic used is the one-sided Score test (Farrington &
Manning). The significance level of the test was targeted at 0.0250. The significance level
actually achieved by this design is 0.0236.
This report shows the values of each of the parameters, one scenario per row. Note that the actual
alpha value is blank for sample sizes greater than 300, which was the limit set for exact
computation.
Most of the report columns have obvious interpretations. Those that may not be obvious are
presented here. Note that the discussion below assumes that higher response rates are better and
that non-inferiority testing (rather than superiority testing) is planned.
Prop Grp 2 P2
This is the value of P2, the response rate in the control group.
Equiv. Grp 1 Prop P1.0
This is the value of P1.0, the response rate of the treatment group, as specified by the null
hypothesis of inferiority. Values of P1 less than this amount are considered inferior to P2.
Values of P1 greater than this are considered noninferior to the reference group. The difference
between this value and P2 is D0, the difference specified by the null hypothesis.
Actual Grp 1 Prop P1.1
This is the value of P1.1, the response rate of the treatment group, at which the power is
computed. This is the value of P1 under the alternative hypothesis. The difference between this
value and P2 is D1, the difference specified by the alternative hypothesis.
Equiv. Margin Diff D0
This is the value of D0, the difference between the two group proportions under the null
hypothesis. This value is often called the margin of non-inferiority.
Actual Margin Diff D1
This is the value of D1, the difference between the two group proportions at which the power is
computed. This is the value of the difference under the alternative hypothesis.
Target Alpha
This is the value of alpha that was targeted by the design. Note that the target alpha is not usually
achieved exactly. For one-sided tests, this value should usually be 0.025.
Actual Alpha
This is the value of alpha that was actually achieved by this design. Note that since the limit on
exact calculations was set to 300, and since this value is calculated exactly, it is not shown for
values of N1 greater than 300.
The difference between the Target Alpha and the Actual Alpha is caused by the discrete nature of
the binomial distribution and the use of the normal approximation to the binomial in determining
the critical value of the test statistic.
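The effect of discreteness can be seen in a small enumeration sketch in Python. It makes several assumptions that are not taken from PASS: it uses the unpooled Z test, it evaluates the actual alpha at the boundary of the null hypothesis (P1 = P2 + D0), and it treats tables with a zero standard error as non-rejections instead of applying a zero count adjustment. All function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(n, p):
    """Probabilities of 0..n successes in n trials."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]


def rejects_h0(x1, n1, x2, n2, d0, z_crit):
    """Unpooled Z test of H0: P1 - P2 <= d0, using a normal critical value."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    if se == 0.0:
        return False  # assumption: degenerate tables never reject
    return (p1_hat - p2_hat - d0) / se >= z_crit


def rejection_probability(n1, n2, p1, p2, d0, target_alpha):
    """Sum the binomial probabilities of every outcome that rejects H0."""
    z_crit = NormalDist().inv_cdf(1 - target_alpha)
    pmf1, pmf2 = binomial_pmf(n1, p1), binomial_pmf(n2, p2)
    total = 0.0
    for x1 in range(n1 + 1):
        for x2 in range(n2 + 1):
            if rejects_h0(x1, n1, x2, n2, d0, z_crit):
                total += pmf1[x1] * pmf2[x2]
    return total


n1 = n2 = 50
p2, d0, d1, target_alpha = 0.60, -0.05, 0.10, 0.025
actual_alpha = rejection_probability(n1, n2, p2 + d0, p2, d0, target_alpha)
power = rejection_probability(n1, n2, p2 + d1, p2, d0, target_alpha)
print(actual_alpha, power)
```

The rejection region is a finite set of (x1, x2) outcomes chosen with a normal critical value, so its probability under the null hypothesis generally lands near, but not exactly on, the target alpha.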
Plots Section
[Plot: Power vs N1 by D1 (P2=0.60, A=0.03, N2=N1, D0=-0.05, 1-Sided LS FM Test). Horizontal axis: N1, from 50 to 500. Vertical axis: Power, from 0.0 to 1.0. One curve for each value of D1: -0.0300, 0.0000, 0.0500, 0.1000.]
The values from the table are displayed in the above chart. This chart gives us a quick look at the
sample size that will be required for various values of D1.
Example 2 – Finding the Sample Size

Continuing with the scenario given in Example 1, the researchers want to determine the sample
size necessary for each value of D1 to achieve a power of 0.80. To cut down on the runtime, they
decide to look at approximate values whenever N1 is greater than 100.
Setup
Option Value
Data Tab
Find (Solve For) ...................................... N1
Power ...................................................... 0.8
Alpha ....................................................... 0.025
N1 (Sample Size Group 1) ...................... Ignored since this is the Find setting
D1 (Actual Difference) ............................. -0.03 0.00 0.05 0.10
Options Tab
Output
Numeric Results

         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Prop    Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Grp 2   Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.8000   9509    9509    0.6000  0.5500  0.5700  -0.0500  -0.0300  0.0250          0.2000
0.8001   1505    1505    0.6000  0.5500  0.6000  -0.0500  0.0000   0.0250          0.1999
0.8008   368     368     0.6000  0.5500  0.6500  -0.0500  0.0500   0.0250          0.1992
0.8019   159     159     0.6000  0.5500  0.7000  -0.0500  0.1000   0.0250          0.1981
The required sample size will depend a great deal on the value of D1. Any effort spent
determining an accurate value for D1 will be worthwhile.
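A sample size search of this kind can be sketched in Python as a doubling step followed by bisection on N1 (with N2 = N1). The sketch assumes the large-sample power expression for the Farrington and Manning score test with higher proportions better, and it assumes that this approximate power increases with N1 when D1 > D0. The function names are hypothetical, and the search strategy is not necessarily the one PASS uses.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n, p2, d0, d1, alpha):
    """Normal-approximation power with N1 = N2 = n (hypothetical helper)."""
    p1 = p2 + d1
    # Constrained estimate of P2 under P1 - P2 = d0, found by bisection
    # on the binomial score equation (the common factor n cancels).
    lo, hi = max(0.0, -d0), min(1.0, 1.0 - d0)
    while hi - lo > 1e-12:
        x = (lo + hi) / 2
        score = (p1 / (x + d0) - (1 - p1) / (1 - x - d0)
                 + p2 / x - (1 - p2) / (1 - x))
        if score > 0:
            lo = x
        else:
            hi = x
    p2_null = (lo + hi) / 2
    p1_null = p2_null + d0
    sd_null = sqrt((p1_null * (1 - p1_null) + p2_null * (1 - p2_null)) / n)
    sd_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(((d1 - d0) - z_crit * sd_null) / sd_alt)


def smallest_n(p2, d0, d1, alpha, target_power):
    """Smallest per-group n whose approximate power reaches target_power."""
    hi = 2
    while approx_power(hi, p2, d0, d1, alpha) < target_power:
        hi *= 2                      # doubling step: bracket the answer
    lo = hi // 2                     # power at lo is below target (or lo == 1)
    while hi - lo > 1:               # bisection on the integer sample size
        mid = (lo + hi) // 2
        if approx_power(mid, p2, d0, d1, alpha) >= target_power:
            hi = mid
        else:
            lo = mid
    return hi


# Two of the scenarios in this example; the report lists 1505 and 159.
for d1 in (0.00, 0.10):
    print(d1, smallest_n(p2=0.6, d0=-0.05, d1=d1, alpha=0.025, target_power=0.8))
```

The gap D1 - D0 enters the required sample size roughly through its inverse square, which is why halving the gap approximately quadruples N1.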
Example 3 – Comparing the Power of Several Test Statistics
Continuing with Example 1, the researchers want to determine which of the eight possible test
statistics to adopt by using the comparative reports and charts that PASS produces. They decide
to compare the powers and actual alphas for various sample sizes between 50 and 200 when D1 is
0.1.
Setup
Option Value
Data Tab
Alpha ....................................................... 0.025
N1 (Sample Size Group 1) ...................... 50 100 150 200
D1 (Actual Difference) ............................. 0.10
Reports Tab
Show Numeric Report ............................. Not checked
Show Comparative Reports .................... Checked
Show Definitions ..................................... Not checked
Show Plots .............................................. Not checked
Show Comparative Plots ......................... Checked
Show Summary Statements.................... Not checked
Options Tab
Output
Numeric Results and Plots

Power Comparison of Non-Inferiority Tests Based on the Difference: P1 – P2
H0: P1-P2<=D0. H1: P1-P2=D1>D0.
                                 Z(P)     Z(UnP)   Z(P)     Z(UnP)   T        F.M.     M.N.     G.N.
                         Target  Test     Test     CC Test  CC Test  Test     Score    Score    Score
N1/N2    P2      P1      Alpha   Power    Power    Power    Power    Power    Power    Power    Power
50/50    0.6000  0.7000  0.0250  0.3581   0.3670   0.2782   0.2945   0.3464   0.3581   0.3464   0.3581
100/100  0.6000  0.7000  0.0250  0.6030   0.6088   0.5474   0.5475   0.5982   0.6030   0.6030   0.6030
150/150  0.6000  0.7000  0.0250  0.7821   0.7837   0.7453   0.7474   0.7821   0.7837   0.7821   0.7821
200/200  0.6000  0.7000  0.0250  0.8849   0.8857   0.8635   0.8638   0.8849   0.8857   0.8849   0.8849
Actual Alpha Comparison of Non-Inferiority Tests Based on the Difference: P1 – P2

H0: P1-P2<=D0. H1: P1-P2=D1>D0.
                                 Z(P)     Z(UnP)   Z(P)     Z(UnP)   T        F.M.     M.N.     G.N.
                         Target  Test     Test     CC Test  CC Test  Test     Score    Score    Score
N1/N2    P2      P1      Alpha   Alpha    Alpha    Alpha    Alpha    Alpha    Alpha    Alpha    Alpha
50/50    0.6000  0.7000  0.0250  0.0236   0.0253   0.0140   0.0161   0.0225   0.0236   0.0225   0.0236
100/100  0.6000  0.7000  0.0250  0.0267   0.0267   0.0190   0.0190   0.0266   0.0267   0.0267   0.0267
150/150  0.6000  0.7000  0.0250  0.0239   0.0241   0.0181   0.0183   0.0239   0.0241   0.0239   0.0239
200/200  0.6000  0.7000  0.0250  0.0243   0.0244   0.0191   0.0191   0.0243   0.0244   0.0243   0.0243
[Plot: Power vs N1 by Test (D1=0.10, P2=0.60, A=0.03, N2=N1, D0=-0.05, 1-Sided Test). Horizontal axis: N1, from 40 to 200. Vertical axis: Power, from 0.25 to 0.90. One curve for each test: Zp, Zup, Zpcc, Zupcc, T, LS FM, LS MN, LS GN.]
It is interesting to note that the powers of the continuity-corrected test statistics are consistently
lower than the other tests. This occurs because the actual alpha achieved by these tests is lower
than for the other tests. An interesting finding of this example is that the regular t-test performed
about as well as the z-test.
Example 4 – Validation using Machin with Equal Sample Sizes
Machin et al. (1997), page 106, present a sample size study in which P2 = 0.5, D0 = -0.2, D1=0,
one-sided alpha = 0.1, and beta = 0.2. Using the Farrington and Manning test statistic, they found
the sample size to be 55 in each group.
Setup
Option Value
Data Tab
Find (Solve For) ...................................... N1
Power ...................................................... 0.8
Alpha ....................................................... 0.1
Options Tab
Maximum N1 or N2 Exact ....................... 2 (Set low for a rapid search.)
Output
Numeric Results

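         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.8001   55      55      0.5000  0.3000  0.5000  -0.2000  0.0000   0.1000          0.1999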
PASS found the required sample size to be 55 which corresponds to Machin.

Example 5 – Validation of a Superiority Test using Farrington and Manning
Farrington and Manning (1990), page 1451, present a sample size study for a superiority test in
which P2 = 0.05, D0 = 0.2, D1=0.35, one-sided alpha = 0.05, and beta = 0.20. Using the
Farrington and Manning test statistic, they found the sample size to be 80 in each group. They
mention that the true power is 0.813.
Setup
Option Value
Data Tab
Find (Solve For) ...................................... N1
Power ...................................................... 0.80
Alpha ....................................................... 0.05
D0 (Non-Inferiority Difference) ................ 0.2
Options Tab
Maximum N1 or N2 Exact ....................... 2 (Set low for a rapid search.)
Output
Numeric Results

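         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.8007   80      80      0.0500  0.2500  0.4000  0.2000   0.3500   0.0500          0.1993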
PASS also calculated the required sample size to be 80.

Next, to calculate the exact power for this sample size, we make the following changes to the
template.
Option Value
Data Tab
N1 (Sample Size Group 1) ...................... 80
Options Tab
Maximum N1 or N2 Exact ....................... 200 (Set >80 to force exact calculation.)
Numeric Results

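         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.8132   80      80      0.0500  0.2500  0.4000  0.2000   0.3500   0.0500  0.0553  0.1868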
PASS also calculated the exact power to be 0.813.
Example 6 – Validation of Risk Ratio Calculations using Blackwelder
Blackwelder (1993), page 695, presents a table of power values for several scenarios using the
risk ratio. The second line of the table presents the results for the following scenario: P2 = 0.04,
R0 = 0.3, R1=0.1, N1=N2=1044, one-sided alpha = 0.05, and beta = 0.20. Using the Farrington
and Manning likelihood-score test statistic, he found the exact power to be 0.812, the exact alpha
to be 0.044, and, using the asymptotic formula, the approximate power to be 0.794.
Setup
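This section presents the values of each of the parameters needed to run this example. First, from
the PASS Home window, load the Non-Inferiority Tests for Two Proportions [Ratios]
procedure window by expanding Proportions, then Two Independent Proportions, then
clicking on Non-Inferiority, and then clicking on Non-Inferiority Tests for Two Proportions
[Ratios]. You may then make the appropriate entries as listed below, or open Example 6 by
going to the File menu and choosing Open Example Template.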
Option Value
Data Tab
Alpha ....................................................... 0.05
R0 (Non-Inferiority Ratio) ........................ 0.3
R1 (Actual Ratio) ..................................... 0.1
Higher Proportions Are............................ Worse
Options Tab
Maximum N1 or N2 Exact ....................... 2000 (Set high for exact results.)
Output
Numeric Results
Numeric Results of Non-Inferiority Tests Based on the Ratio: P1 / P2
H0: P1/P2>=R0. H1: P1/P2=R1<R0. Test Statistic: Score test (Farrington & Manning)
         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Prop    Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Grp 2   Prop    Prop    Ratio    Ratio    Target  Actual
Power    N1      N2      P2      P1.0    P1.1    R0       R1       Alpha   Alpha   Beta
0.8118   1044    1044    0.0400  0.0120  0.0040  0.300    0.100    0.0500  0.0444  0.1882
PASS also calculated the power to be 0.812 and the actual alpha to be 0.044, within rounding.
Next, to calculate the asymptotic power, we make the following changes to the template.
Option Value
Options Tab
Maximum N1 or N2 Exact ....................... 2 (Set < 1044 to force asymptotic calculation.)
Numeric Results
Numeric Results of Non-Inferiority Tests Based on the Ratio: P1 / P2
H0: P1/P2>=R0. H1: P1/P2=R1<R0. Test Statistic: Score test (Farrington & Manning)
         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Prop    Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Grp 2   Prop    Prop    Ratio    Ratio    Target  Actual
Power    N1      N2      P2      P1.0    P1.1    R0       R1       Alpha   Alpha   Beta
0.7937   1044    1044    0.0400  0.0120  0.0040  0.300    0.100    0.0500          0.2063
PASS also calculated the power to be 0.794.

Example 7 – Finding Power following an Experiment

In an effort to show a new treatment non-inferior to the current standard, researchers randomly
assigned 80 subjects to each treatment. The new treatment was to be considered non-inferior if
the odds ratio (treatment to standard) was at least 0.80. Using the Farrington and Manning
Likelihood Score test, non-inferiority could not be concluded. The researchers now want to see
the power of the test. The control proportion was 0.625.
Setup
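This section presents the values of each of the parameters needed to run this example. First, from
the PASS Home window, load the Non-Inferiority Tests for Two Proportions [Odds Ratios]
procedure window by expanding Proportions, then Two Independent Proportions, then
clicking on Non-Inferiority, and then clicking on Non-Inferiority Tests for Two Proportions
[Odds Ratios]. You may then make the appropriate entries as listed below, or open Example 7
by going to the File menu and choosing Open Example Template.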
Option Value
Data Tab
Alpha ....................................................... 0.05
OR0 (Non-Inferiority Odds Ratio) ............ 0.80
OR1 (Actual Odds Ratio) ........................ 1.0
Output
Numeric Results
Numeric Results for Non-Inferiority Tests Based on the Odds Ratio: O1 / O2
H0: O1/O2<=OR0. H1: O1/O2=OR1>OR0. Test Statistic: Score test (Farrington & Manning)

         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    O.R.     O.R.     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    OR0      OR1      Alpha   Alpha   Beta
0.1845   80      80      0.6250  0.5714  0.6250  0.800    1.000    0.0500  0.0571  0.8155
The power of a test with 80 receiving each treatment is only 0.1845.

Example 8 – Finding True Proportion Difference

Researchers have developed a new treatment with minimal side effects compared to the standard
treatment. The researchers are limited by the number of subjects (140 per group) they can use to
show the new treatment is non-inferior. The new treatment will be deemed non-inferior if its
success rate is no more than 0.10 below that of the standard treatment. The standard treatment has a success
rate of about 0.75. The researchers want to know how much more successful the new treatment
must be (in truth) to yield a test which has 90% power. The test statistic used will be the pooled Z
test.
Setup
Option Value
Data Tab
Find (Solve For) ...................................... P1.1 (Search>P1.0)
Power ...................................................... 0.90
Alpha ....................................................... 0.05
D1 (Actual Difference) ............................. Ignored since this is the Find setting
Test Type ................................................ Z Test (Pooled)
Options Tab
Maximum N1 or N2 Exact ....................... 500 (Set high for exact results.)
Output
Numeric Results
Numeric Results for Non-Inferiority Tests Based on the Difference: P1 - P2
H0: P1-P2<=D0. H1: P1-P2=D1>D0. Test Statistic: Z test (pooled)

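         Sample  Sample          Equiv.  Actual  Equiv.   Actual
         Size    Size    Grp 2   Grp 1   Grp 1   Margin   Margin
         Grp 1   Grp 2   Prop    Prop    Prop    Diff     Diff     Target  Actual
Power    N1      N2      P2      P1.0    P1.1    D0       D1       Alpha   Alpha   Beta
0.9000   140     140     0.7500  0.6500  0.7961  -0.1000  0.0461   0.0500  0.0505  0.1000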
With 140 subjects in each group, the new treatment must have a success rate 0.0461 higher than
the current treatment (or about 0.7961) to have 90% power in the test of non-inferiority.
