Academia.eduAcademia.edu

The Design of Order-of-Addition Experiments

2017, arXiv (Cornell University)

We introduce systematic methods to create optimal designs for order-of-addition (OofA) experiments, those that study the order in which m components are appliedfor example, the order in which chemicals are added to a reaction or layers are added to a film. Full designs require m! runs, so we investigate design fractions. Balance criteria for creating such designs employ an extension of orthogonal arrays (OA's) to OofA-OA's. A connection is made between D-efficient and OofA-OA designs. Necessary conditions are found for the number of runs needed to create OofA-OA's of strengths 2 and 3. We create a number of new, optimal, designs: 12-run OofA-OA's in 4 and 5 components, 24-run OofA-OA's in 5 and 6 components, and near OofA-OA's in 7 components. We extend these designs to include (a) process factors, and (b) the common case in which component orderings are restricted. We also suggest how such designs may be analyzed.

arXiv:1701.02786v2 [stat.ME] 6 Nov 2017 The Design of Order-of-Addition Experiments JOSEPH G. VOELKEL School of Mathematical Sciences Rochester Institute of Technology, Rochester, NY 14623 November 8, 2017 Abstract We introduce systematic methods to create optimal designs for order-of-addition (OofA) experiments, those that study the order in which m components are applied— for example, the order in which chemicals are added to a reaction or layers are added to a film. Full designs require m! runs, so we investigate design fractions. Balance criteria for creating such designs employ an extension of orthogonal arrays (OA’s) to OofA-OA’s. A connection is made between D-efficient and OofA-OA designs. Necessary conditions are found for the number of runs needed to create OofA-OA’s of strengths 2 and 3. We create a number of new, optimal, designs: 12-run OofA-OA’s in 4 and 5 components, 24-run OofA-OA’s in 5 and 6 components, and near OofAOA’s in 7 components. We extend these designs to include (a) process factors, and (b) the common case in which component orderings are restricted. We also suggest how such designs may be analyzed. Keywords: D-efficiency; Experimental design; Minimum chi-square; Minimum moment aberration; Orthogonal array. 1 INTRODUCTION 1 An early reference to an order-of-addition (OofA) experiment (Fisher(1971)) is that of a lady tasting tea, who claimed she could distinguish whether the tea or the milk was first added to her cup. Fisher’s experiment consisted of four replications of each of tea→milk and milk→tea. More complex OofA applications appear in a number of areas, including chemical, food, and film, although in the author’s experience such applications tend to be proprietary. An application in the public arena includes Preuss, et al. (2009, p. 24), who note that “The next step [in an investigation] is usually to determine the order of action of the protein with respect to known treatments required at the same general transport step. This approach is sometime referred to an order of addition experiment...” A PerkinElmer (2016) reference states “Order of addition can influence the signal generated to a large extent. The optimal order in which assay components interact should always be determined empirically... some binding partners may interfere with the association of other binding partners if allowed to interact in the wrong order.”(italics ours.) Other references to OofA experiments include Black, et al. (2001) and Kim, et al. (2001). None of these references mention a strategy of experimenter design—rather, a small subset of select orders was investigated. In this paper, we will denote each material whose order of addition is being considered as a component. If there are m components, say c0 , . . . , cm−1 , then there are m! possible combinations of orders. An experiment with 5 components or more would normally be considered excessively large, so our interest is to find good fractions of such designs. Our objectives in this paper are to: 1. Review past work in the field; 2. Consider goodness criteria for fractionating an OofA design; 3. Investigate the goodness of the designs created by these criteria, including a comparison of these designs to those from past work; 4. Extend these designs to include process factors as well—for example, varying the temperature or the amount of c1 ; 5. Consider designs in which restrictions are placed on component ordering. 2 PAST WORK Given the importance of order of addition in certain experimental areas, it is surprising that little research has been done to create good OofA designs. The only direct statistical 2 reference for creating such designs appears to be Van Nostrand (1995), whose primary approach was to select a fraction of such combinations with good properties, as follows: 1. Create “pseudo factors”  (which in this paper we will call pair-wise ordering (PWO) factors). There are m2 PWO factors, corresponding to all pairs of component orders {F ck<cl , 0 ≤ k < l ≤ m − 1}, where F is included in these names to emphasize these are factors, and each F ck <cl is at two levels, 1 and 0, to indicate whether or not component k is added before component l, respectively. So, if m = 4, the six PWO factors are F c0 <c1 , F c0 <c2 , F c0 <c3 F c1 <c2 , F c1 <c3 , F c2 <c3 —see Table 1 for an example of a design in all m! runs. (The DF and PF notation will be explained in the next section.)  2. Create a tentative two-level design in these m2 PWO factors, fractionated to a suggested number of starting runs. For m = 5, for example, with 10 PWO factors, this might correspond to a 210−3 design, the hope being that, because the full 210 would include all valid 5! = 120 combinations, then a particular 210−3 design might include about 120(2−3 ) = 15 valid combinations. 3. Reduce the number of runs in the fraction to include only valid combinations. For example, if the fraction includes a suggested run such as c0 < c1 , c0 > c2 , c1 < c2 , such a run would be excluded. For five factors, Van Nostrand examined the eight fractions from a particular 210−3 family—that is, all eight combinations of ±G1 , ±G2 , ±G3 , where the Gi ’s are the words in the generator. For each combination, he considered the main-effects model, and then used the mean VIF’s (variance-inflation factors) from each model to measure the goodness of each design. In his example, the number of valid runs from his 128-run fractions ranged from 13 to 17; the 13-run designs were singular, and the other designs had mean VIF’s in the range from 3.3 to 3.6. (The mean VIF in the full, 120-run, design is 2.) He noted that different sets of words in the generator lead to different designs, and that “[p]rinciples for assigning letters to generators to produce designs with low variance inflation factors are not yet apparent.” Also note that even if a full-factorial design is considered for m component, only the fraction m!/2m(m−1)/2 of these runs are valid. In Van Nostrand’s 5-component example, only about 12% of the runs from the full-factorial design would be valid, which is consistent with the fractions valid in his fractional-factorial designs. This fraction drops quickly with m—even for m = 6, the fraction is only 2%. An author whose designs appear to be more widely used for OofA experiments is Williams (1949, 1950). However, Williams’ intent was to create designs not for OofA experiments, but for crossover experiments—experiments in which each subject receives m 3 Table 1: Four-factor OofA design DF in 24 runs, with PWO design matrix PF . Design DF 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 1 1 2 2 3 3 0 0 2 2 3 3 0 0 1 1 3 3 0 0 1 1 2 2 2 3 1 3 1 2 2 3 0 3 0 2 1 3 0 3 0 1 1 2 0 2 0 1 3 2 3 1 2 1 3 2 3 0 2 0 3 1 3 0 1 0 2 1 2 0 1 0 PWO Design Matrix PF F c0<c1 F c0<c2 F c0<c3 F c1<c2 F c1<c3 F c2<c3 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 0 0 1 1 1 1 1 0 0 1 1 1 0 0 0 0 1 1 1 1 1 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 0 1 0 1 1 0 0 0 0 1 1 0 1 0 1 0 1 1 1 0 1 0 0 1 0 0 1 0 1 1 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 1 0 1 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4 treatments in succession. His designs address the possibility that one or more preceding treatments as well as the current one may affect the response—his designs are balanced for these so-called residual effects. Williams showed that designs that are balanced for the effect of only one preceding treatment exist for m treatments in one m × m Latin Square (m runs) if m is even—this is known as a cyclic row-complete Latin Square—and two such Latin Squares (2m runs) if m is odd. In these Latin Squares, rows indicate the run (or subject) number, columns the treatment order, and Latin letters the treatment. In addition, if m is a prime or a power of a prime, then Williams showed that a design in m − 1 Latin Squares (m(m − 1) runs) may be created that is balanced for any number of preceding treatments, where their interactions are not taken into account. Williams also created designs for m treatments, again in m − 1 Latin Squares, that are balanced for each pair of preceding treatments in both of their orders, so that residual-effect interactions may be taken into account. 3 APPROACH AND CRITERIA Williams’ designs lead to certain balance properties. However, these properties are not directly connected to OofA considerations. For example, the natural factors in Williams’ designs, as shown in his 1949 example, are replications, order, treatments (direct), and treatments (residual), each with m − 1 d.f. But none of these are natural factors of interest in an OofA experiment. Our approach for creating OofA designs will employ Van Nostrand’s use of PWO factors, which emphasize the order of addition, and so are a priori a natural set of factors to use. However, we will not restrict ourselves to certain classes of designs as Van Nostrand and Williams did. Let N be the number of runs in a design. Define the design matrix D to be an N × m matrix, each of whose rows contain a permutation of c0 , c1 , . . . , cm−1 , where the k th column of D shows which component is entered at stage k. (This definition will be extended when process variables are also considered.) The full design matrix DF is the m!×m matrix of all such permutations. For any D, an N × m′ PWO design matrix P can be generated, where m′ = m(m − 1)/2. In particular, DF generates the full PWO design matrix PF . Now, it is clear that, given m components, any m − 1 columns of DF (the remaining column being so defined) are necessary to generate PF . The converse is also true, as we now show. Proposition 1. For m components, all m′ columns from the PF matrix are necessary to generate the DF matrix. Proof. The results are obvious for m = 2. For m > 2, suppose that at least one of the 5 PWO-factor columns is not needed. Consider the specific case where F c0<c1 is not needed. (By the symmetry of the problem, we need to show a result only for a specific case.) But then, for example, the remaining PWO-factor columns could not distinguish between the two permutations c0 < c1 < c2 < . . . < cm−1 and c1 < c0 < c2 < . . . < cm−1 .  To create and judge these designs, it will be useful to consider the following, which we do in the next three subsecions: 1. model-based and balanced-based criteria; 2. measures of goodness; 3. algorithms to create designs. 3.1 General Criteria and Goodness Measures A recent review of several criteria for constructing designs may be found in Lekivetz, et al. (2015), some of which we will consider. A common method of constructing designs using model-based criteria employs one or more of the “alphabetic-optimality” criteria, for example D-optimality, which tries to find a candidate design X that maximizes |X′ X|, where X is the model matrix. The main-effects model matrix X corresponding to an OofA design with PWO matrix P is X = [1|P], where 1 is a conformable vector of 1’s. Alternatives to model-based criteria are balanced-based criteria, and for those it is natural to consider orthogonal arrays (OA’s). Here, using notation based on Hedayat, et al. (1999), an N × k array (matrix) A each of whose columns contain s levels is said to be an OA with strength t if every N × t sub-array of A contains all possible t-tuples the same number of times—this array may be denoted as OA(N, sk , t). For mixed-level designs, where ki factors are at si levels, the notation OA(N, sk11 sk22 . . . , t) may be used, and the definition can be extended in a natural way. In the usual case where each column corresponds to a factor, an array of strength t corresponds to a design of at least Resolution t + 1. There are several ways to measure balance relative to an OA for t = 2. One, used by Yamada and Lin (1999), is a χ2 measure. Let nkl (a, b) be the number of rows of A in which level a appears in column k and level b appears in column l. Then the χ2 measure for the {k, l} column pair is sk X sl X [nkl (a, b) − N/(sk sl )]2 2 χkl (A) = , (1) N/(sk sl ) a=1 b=1 6 where sk and sl are the number of levels in their respective columns and N/(sk sl ) = N (sk /N )(sl /N ) is the frequency that would occur under an orthogonal array. An overall measure of non-orthogonality is then X χ2 (A) = χ2kl (A)/[p(p − 1)/2], (2) 1≤k<l≤p the average of all such measures, where χ2 (A) = 0 for an orthogonal array and χ2 (A) > 0 otherwise. The maximum of all such χ2kl (A) measures may also be considered. 3.2 Criteria and Goodness Measures for OofA Experiments We will examine the model-based approach using the D-criterion, but we first note that it can be limiting for very-small-N designs. For example, Williams’ 10-run balanced design in 5 components corresponds to a model matrix X that is 10 × 11, so |X′ X| = 0. One way to handle this problem would be use the approach suggested by Wu (1993) for supersaturated designs; however, we will not examine very-small-N designs in this paper using the Dcriterion. Next, consider a χ2 , balanced-based, measure of non-orthogonality. It turns out that the χ2 of Yamada and Lin cannot be used directly for OofA experiments. To see this, consider the m = 4 case. There are 4! = 24 permutations in the full design, with 42 = 6  PWO factors, as shown in Table 1, so there are 62 = 15 tables of pairs of PWO factors. Three distinct tables exist and are shown in Figure 1, along with examples of PWO factors that generated them. In the first table (8 such tables exist), “c0 <” appears in both PWO factors—this is a synergistic pair of PWO factors and is denoted in the table as Psyn . In the second table (4 such exist), “c1 <” appears in one factor and “< c1 ” in the other factor—this creates an antagonistic pair of PWO factors, denoted as Pant . In the third table (3 such exist), one pair of components appears in one factor and the other pair in the other factor—this creates an independent pair of PWO factors, denoted as Pind . Because there is a natural lack of balance among a subset of these pairs of PWO factors even when all combinations are run, the criterion of Yamada and Lin cannot be used directly as a measure of goodness. Also, note that the Psyn and Pant tables, while different, have the same frequency patterns. We call two tables weakly-table-isomorphic (or wt-isomorphic) if they have the same frequency patterns; here, in both tables, “4” occurs with a frequency of 2 and “8” occurs with a frequency of 2. We can use this idea to weakly summarize a design by (a) the number of different wt-isomorphic tables it contains and (b) the frequency of such tables. If we number the tables in our example from least to most balanced and call each one a type 7 Figure 1: The three table types of pairs of PWO factors for the m = 4, t = 2 case, with factor examples and frequency counts. of table, then in our example we have two table types, whose corresponding frequencies are 12 and 3. We call two tables table-isomorphic (or t-isomorphic) if one can be obtained from the other by possibly switching one or more labels and/or the directions of inequalities in one of the tables. If two tables are t-isomorphic then they are also wt-isomorphic. The idea of wt-isomorphism is especially useful when we later consider fractions of these designs. If two such designs are to be compared, we can say that the designs are wt-isomorphic if they have the same frequencies of table types. Most strongly, we can say that two designs are design-isomorphic (or d-isomorphic) if one can be obtained from the other by one or more of these operations: 1. Permutation of the runs; 2. Permutation of the components; 3. Reversing the order of all orderings in the design, which corresponds to swapping all 0’s and 1’s in the PWO design matrix. If two designs are not wt-isomorphic, then they are clearly not d-isomorphic. We will use this idea as a check to see if two designs are not d-isomorphic. For a more complex example, consider m = 6, with 720 permutations and 15 PWO factors, and strength t = 3. (The m = 6 case corresponds to the smallest m that illustrates all the features we show below for t = 3.) To do this, we need consider the 455 sets of 3 of these 15 factors; these sets of 3 may be represented as 2 × 2 × 2 tables. There are 5 non-wt-isomorphic tables in this case. See Figure 2, where the tables are arranged from least to most balanced, the PWO-factor labels have been removed, the frequencies 8 Figure 2: Five non-wt-isomorphic table types for the m = 6, t = 3 case, with relative (of N =24) frequencies for each type. Tables are arranged from least to most balanced. in each table have been divided by 30 for ease of understanding, and the frequencies of such tables (out of 455) are shown. Type 1 tables correspond to 3 components, such as F c0 <c1 , F c0 <c2 , F c1 <c2 ; Type 2 tables to four components, with two components in two PWO’s, such as F c0 <c1 , F c0 <c2 , F c1 <c3 ; Type 3 tables to four components, with one in all three PWO’s, such as F c0<c1 , F c0<c2 , F c0<c3 ; Type 4 tables to five components, with one in two PWO’s, such as F c0<c1 , F c0<c2 , F c3<c4 ; and Type 5 tables to six components, such as F c0 <c1 , F c2 <c3 , F c4 <c5 . As m increases, the relative frequencies of table Types differ—for example, the fraction of tables of Type 5 increases to a limit of 1. The natural lack of balance seen in Figure 1 and Figure 2 leads to the following definition, which includes the PWO-factor case as well as the possibility of additional, process, factors: Definition 1. An N × m′ array whose columns consist of all of the PWO factor levels for m components is said to be an Order-of-Addition Orthogonal Array (OofA-OA) of strength t for m components if every N × t sub-array of the array contains all possible t-tuples in the relative fractions corresponding to those found when all m! runs are considered. Such an array will be denoted as OofA-OA(N, m, t). If the array is augmented with p additional columns associated with factors whose levels in each column may be varied independently of the first m′ columns and each other, and if Np∗ is the number of combinations of these p additional columns, then the resulting m!Np∗ × (m′ + p) array replaces the original m! × m array. In particular, the rules for standard OA designs of strength t apply to any t-tuple sub-arrays that are created from the process variables alone. Using this definition and examining Figure 1 and Figure 2, we can state the following. Proposition 2. An OofA-OA of strength 2 exists for m > 3 only for N = 0 mod 12. An OofA-OA of strength 3 exists for m > 3 only for N = 0 mod 24. 9 In particular, for m = 4 only the full design with N = 24 is strength 3. We will use these restrictions on N in our search for OA’s. These restrictions seem to us to be stronger than for the OA case, and also show that most of Williams designs cannot be OofA-OA’s. (For m = 4, Williams(1949) did provide one design with N = m(m − 1) = 12 runs, but it was not an OofA-OA, and Williams(1950) noted that no solutions exist for those designs for the m = 4 case, for which N would also have been 12. The next larger m for which N = 0 mod 12 is m = 9.) With this definition, we can now generalize the Yamada and Lin measure. For simplicity, we show the result only for the pure OofA case. For t = 2, the χ2 measure for the {k, l} column pair for an array A becomes χ2kl (A) = sl sk X X [nkl (a, b) − N Ekl (a, b)/m!)]2 N Ekl (a, b)/m! a=1 b=1 , (3) where Ekl (a, b) is the frequency of the occurrence of levels a and b in columns k and l respectively in the full design PF —see Figure 1 and, after multiplying by 30, Figure 2 for examples of Ekl (a, b) values. When Ekl (a, b) = 0, we define that contribution to the sum to be 0. When p, say single-d.f., terms for process variables with a possible Np∗ treatment combinations are included, we simply modify Ekl (a, b) to be based on the full m!Np∗ ×(m+p) design. The overall measure of non-orthogonality is again the average of all such measures, which we denote by χ2ave,2 (A), where the subscript indicates strength 2; we denote the maximum of all such measures as χ2max,2 (A). A third measure we consider is the fraction of all χ2kl (A) that are 0, which we denote by F O2 (A). We will find that in many cases we obtain more than one OofA-OA array for t = 2, in which case it is natural to consider secondary measures of goodness, in particular the t = 3 properties of these designs. To this end, we consider all 3-tuples of columns, and so modify χ2kl (A) to, say, χ2klu (A) and call the resulting overall average measure as χ2ave,3 (A). Similarly, we modify F O2 to F O3 . We also consider t = 2 or t = 3 properties when we leave out one component at a time, a projective property of the design. For designs that are not OofA-OA’s, we examine m leave-one-out designs and consider the average-over-m measures χ2ave,2 and F O2 , which we denote by χ2ave,2,−1 and F O2,−1 , respectively. For OofA-OA’s, we instead consider χ2ave,3,−1 and F O3,−1 . The above measures all have clear statistical bases. Another measure P we consider is the minimum-moment-aberration criterion of Xu (2003). Let δij (P) = m k=1 δ(Pik , Pjk ) be the number of coincidences in the ith and jth rows of P, where δ is the Kronecker delta function. (The Hamming distance between the rows, m − δij (P), was used by Clark and Dean (2001) to show the equivalence (d-isomorphism idea) of fractional factorial designs.) 10 P Xu defines the sth power moment of P to be Ks (P) = [N (N − 1)]−1 1≤i<j≤N [δij (P)]s . We standardize this measure of similarity among rows as Sims (P) = (Ks (P))1/s , where Sim1 ≤ Sim2 ≤ Sim3 . . . by Jensen’s inequality. Designs with the lowest Sim1 values are best—ties are broken by successively considering Sim2 , Sim3 , . . .. Our emphasis will be in constructing orthogonal OofA designs of strength 2. Efficient algorithms exist for finding OA designs in the case where all columns of the design matrix may be considered independently, e.g. Xu (2002) and Lekivetz, et al. (2015), but this is not the case for OofA designs. Efficient algorithms do exist, however, for finding D-optimal designs for the OofA case, in which we have a set of candidate points from XF , from which a subset is selected for the design X. For this reason we consider only the D-criterion for most of the design construction below. Our use of this criteria is also partly justified by the following theorem. Theorem 1. Assume that the matrix X = [1|P] is of full column rank. Then χ22 = 0 for P only if X has a D-efficiency of 1 relative to XF . Proof. If χ22 (P) = 0, then P corresponds to an OofA-OA. In particular, the relative (i.e., N -adjusted) frequency of each of the 2 × 2 tables that are formed from all pairs of columns of P correspond to those of PF . This also applies to the corresponding matrices X and XF because the constant column is orthogonal to all other columns. But these relative frequencies also determine the relative values of the X′ X matrix, which, by the same reasoning, are equal to those of the X′F XF matrix. Because of this, the D-efficiency of X with respect to XF is equal to 1.  To our surprise, the converse is not true, as we illustrate with a counter-example in Section 5.4; however, for the vast majority of the designs we have created, when X has a D-efficiency of 1 then the corresponding χ22 = 0. We have also not been able to prove that whenever X has this D-efficiency of 1, then X is D-optimal. However, all of our work suggests this, as does the fact that XF , the OofA analog of a full factorial, intuitively appears to be the best arrangement in m! runs. Because XF is symmetric in all m components, and because of the similarity of X to XF as noted in the proof above, these OofA designs have properties corresponding to the full design. In particular:  1. The VIF’s for all m2 effects in the main-effects model are equal. The VIF value appears to be 3(m − 1)/(m + 1), based on an examination of m = 3, . . . , 8; 11 2. The variances   of all effects are equal, with the standardized variance of an effect, say N var β̂ /σ 2 , equal to four times the VIF (a result of the {0, 1} coding), where sigma2 is the error variance associated with the linear model corresponding to X;   3. The variances of all predicted values are equal, with value σ 2 m2 + 1 /N . We use PROC OPTEX from SAS/QC software with various numbers of iterations and the Federov option to obtain efficient designs. It is possible to use this procedure to extract all of the designs found or only all of the most efficient designs. We also use the function optFederov from the R package AlgDesign (Wheeler, 2014), but we are only able to extract one design from each function call regardless of the number of iterations. For this reason, our results are often based on a number of designs from SAS software and one from R software. In the next three sections, we present our results. First, we briefly compare our designs to some of the designs created by Van Nostrand and Williams. We then consider designs where OofA-OA’s may (and if fact, often do) exist. Our approach usually generates a number of OofA-OA’s, and this leads us to consider ways to distinguish among them. Finally, we consider several examples of OofA-OA’s in which process-variable factors are also included. Our objective is not to create a catalog of such designs, but rather to illustrate several of them and their properties. 4 RESULTS: COMPARISONS TO PAST WORK So that designs in this section may be shown concisely, we will define them through the row-index values of DF , where the rows of this design are arranged in lexicographic order. (The function permutations from the R package gtools (Warnes, et al., 2015) provides designs in this order.) Van Nostrand considered a 5-component problem, and using his methods found a 15run design that corresponds to rows (2, 18, 27, 35, 42, 44, 52, 53, 55, 72, 81, 89, 101, 103, 110) of DF . The mean VIF, used by Van Nostrand, was 3.28. Using our measures for this design, χ2ave,2 = 1.41, χ2max,2 = 5.4, and Deff = 0.79, where Deff is the D-efficiency with respect to DF . Our OofA-OA search using the D-criterion yielded 13 15-run designs that had χ2ave,2 = 0.29, χ2max,2 = 0.4, and Deff = 0.96, with a meanVIF of 2.17, which compares well to the corresponding value of 2 for the full, 120-run, design. One such design had row indices (1, 6, 15, 19, 22, 46, 55, 68, 70, 76, 81, 83, 94, 95, 104). Somewhat surprisingly, none of our 13 designs were wt-isomorphic—this is a consideration we will examine in more detail when 12 we compare OofA-OA’s below. Other, secondary measures also showed our design to be better. For example, Van Nostrand’s design had χ2ave,2,−1 = 1.44 and Sim1 = 5.16, while the corresponding values for our design were 0.31 and 5.02 (where Sim1 = 5 for the full design). We next consider two 5-component Williams’ designs. The first, in 10 runs, is balanced for the effects of the preceding treatment, with row indices (1, 6, 15, 19, 22, 46, 55, 68, 70, 76, 81, 83, 94, 95, 104). Because an X associated with m = 5, N = 10 cannot be of fullcolumn rank, the D-efficiency measure is useless. So we wrote a simple search algorithm using the χ2ave,2 criterion and found a good design with row indices (3, 10, 32, 38, 46, 64, 86, 94, 99, 101). Williams’ design yields χ2ave,2 = 0.80 and χ2max,2 = 3.2, while ours yields χ2ave,2 = 0.50 and χ2max,2 = 1.7. We also decided to compare the projective properties of the designs when each component in turn is left out of the design. First consider the average D-efficiency of the 5 leave-one-out designs. For the Williams design, X has column rank of only 6, so no leave-one-out designs have full column rank—the average Deff is 0. For our design, X has column rank of 10, and all leave-one-out designs have full column rank, with an average Deff of 0.84. For Williams design, χ2ave,2,−1 = 0.88; for ours, χ2ave,2,−1 = 0.51. However, because of the balance features in the Williams design, we find Sim1 = 5, which is optimal, while our design has Sim1 = 5.02. The second 5-component William’s design we consider, in 20 runs, is balanced for any number of preceding treatments, with row indices (4, 7, 18, 21, 27, 35, 40, 44, 50, 60, 61, 71, 77, 81, 86, 94, 100, 103, 114, 117). We searched for designs using both χ2ave,2 and Defficiency criteria, and obtained the following results. Williams’ design yields χ2ave,2 = 0.71, χ2max,2 = 1.6, and Deff = 0.78, while ours, using the minimum χ2ave,2 criterion, yields χ2ave,2 = 0.15, χ2max,2 = 0.8, and Deff = 0.90, with row indices (2, 9, 20, 28, 36, 37, 42, 51, 52, 56, 72, 78, 81, 83, 89, 101, 103, 109, 112, 116); using the maximum Deff criterion yields χ2ave,2 = 0.27, χ2max,2 = 1.2, and Deff = 0.97, with row indices (4, 12, 14, 16, 29, 34, 37, 47, 50, 59, 62, 63, 82, 92, 96, 99, 105, 108, 115, 119). As a secondary measure, the Sim1 values for these three designs are 5, 5, and 5.02, respectively. So, based on these criteria, designs that are specifically created for OofA considerations lead to better designs than those designed for balance with respect to certain residualtreatment effects. From this example,we can also see that the connection between OofA-OA’s (χ2ave,2 = 0) and Deff = 1 does not imply that, in comparing two designs, a design with a lower χ2ave,2 must have a higher D-efficiency. 13 5 RESULTS: SOME OofA OA OR NEAR-OA DESIGNS In this section, we consider five OofA designs. Because it turns out that OofA OA’s may be constructed for all but the last of them, the notation OofA-OA(N, m, t) will be used to describe the first four. 5.1 OofA-OA(12, 4, 2) Designs For OofA-OA(12, 4, 2), 100 starting points from PROC OPTEX produced 88 OA designs, and AlgDesign another 1. However, these yielded only two non-wt-isomorphic designs, shown in Table 2. Both designs had Sim1 = 3 and Sim2 = 3.34, the values for the full 24-run design; however, Sim3 (P1 ) = 3.55 < Sim3 (P2 ) = 3.57, so Design 1 is superior based on this measure. Design 1 also had superior secondary measures: χ2ave,3 = 0.82 vs. 1.49 and F O3 = 0.40 vs. 0.30. We were also surprised to see that while the best design had each component balanced with respect to its order of addition, as one might hope for, the second design was quite imbalanced: for example, component 1(2) appeared second in order 0(6) times, and third in order 6(0) times—we will see imbalance in later designs as well. Based on all of these criteria, Design 1 is superior. Table 2: The two non-wt-isomorphic OofA-OA(12, 4, 2) designs, with reference-design rows. Design 2 0213 3 0231 4 0312 5 1032 8 1203 9 1 2 3 0 10 1 3 0 2 11 2 0 1 3 13 2 3 1 0 18 3 0 1 2 19 3 2 0 1 23 3 2 1 0 24 Design 1 0132 2 0213 3 0312 5 1023 7 1 2 3 0 10 1 3 2 0 12 2 0 3 1 14 2 1 0 3 15 2 3 0 1 17 3 0 2 1 20 3 1 0 2 21 3 2 1 0 24 14 5.2 OofA-OA(12, 5, 2) Designs For OofA-OA(12, 5, 2), 100 starting points from PROC OPTEX produced 40 OA designs, and AlgDesign another 1. However, all designs were wt-isomorphic. One of these is shown in Table 3, with secondary measures χ2ave,3 = 1.24 and F O3 = 0.42. Table 3: The OofA-OA(12, 5, 2) design, with reference-design rows. 04213 04312 10324 12304 14023 14320 20314 24013 24310 30214 34012 34210 5.3 21 23 27 33 43 48 51 67 72 75 91 96 OofA-OA(24, 5, 2) Designs For OofA-OA(24, 5, 2), 300 starting points from PROC OPTEX produced 37 OA designs, and AlgDesign another 1. An additional 28 OA designs were produced as follows—from the 6 designs discussed below for OofA-OA(24, 6, 2), each design was used to create 6 OofA-OA(24, 5, 2) designs by leaving out one component at a time. This created 36 designs, of which 28 were OofA-OA’s. From the resulting 66 designs, we were surprised to find that 65 were non-wt-isomorphic—we assign these ID numbers (1–36 OPTEX; 37 AlgDesign; 38-65 leave-one-out). Consider the χ2ave,3 measures, which are graphed versus the F O3 and Sim3 measures in Figure 3. (Here, all 65 designs had Sim1 = 5 and Sim2 = 5.40, the values for the full design.) Both graphs indicate that five designs, denoted by overlaid ×’s, are the best in this group. This is further corroborated by the χ2ave,3,−1 and the F O3,−1 measures—see Table 4. These five criteria were then used to rank the 65 designs (tied ranks were assigned to the lowest rank)—these ranks are shown parenthetically in the Table. From these, average ranks (Rank) were obtained, leading to the ranks (Rank) among these five. The Design 15 Figure 3: F O3 and Sim3 vs. χ2ave,3 for 65 OofA-OA(24, 5, 2) Designs. The five best designs are shown with overlaid ×’s. ID’s (ID) are also provided, as well as the worst measured value among all 65 designs for each of the criteria. Table 4: Rankings for The OofA-OA(24, 5, 2) designs. ID Rank 33 1 43 2 5 3 65 4 57 5 Rank 1.0 1.6 3.4 4.2 4.8 worst: F O3 0.85(1) 0.83(2) 0.82(3) 0.80(5) 0.82(3) 0.58 χ2ave,3 0.51(1) 0.56(2) 0.63(3) 0.68(5) 0.64(4) 1.71 F O3,−1 0.88(1) 0.88(1) 0.84(4) 0.86(3) 0.82(7) 0.63 χ2ave,3,−1 0.43(1) 0.43(1) 0.58(4) 0.51(3) 0.65(7) 1.45 Sim3 5.739(1) 5.740(2) 5.742(3) 5.744(5) 5.742(3) 5.776 RM Vord 2.52(59) 2.52(59) 1.99(36) 1.25(1) 2.14(42) 2.86 However, as for OofA-OA(24, 4, 2), we next consider the balance of each design with respect to its order of addition, which we now summarize as follows. Define fkl as the frequency with which component kPappears in the lth order. Then create the unbalance P m 2 1/2 measure RM Vord = [(1/m2 ) m−1 . (Note that full balance (RM Vord = k=0 l=1 (fkl − fk. ) ] 0) is not possible for this design because 24 is not divisible by 5.) The RM Vord values in Table 4 show large differences among the five designs, with the highest-ranked design based on the earlier measures being quite imbalanced in its component-order-of-addition. 16 We believe this large imbalance would be a concern for most experimenters, and the other measures are not too dissimilar, so we would likely recommend design ID 65 in practice. This design, as well as design ID’s 5 and 33, appear in Table 5 along with their fkl values. 5.4 OofA-OA(24, 6, 2) Designs The OofA-OA(24, 6, 2) designs were more difficult to obtain, so the number of starting points was increased in PROC OPTEX to 5000. That produced 5 designs with a Deff = 1, and AlgDesign was used to obtain one more. All designs were non-wt-isomorphic, and were assigned ID numbers (1–5 OPTEX; 6 AlgDesign). However, designs 4 and 5 had χ2 (A) = 0.095 > 0, providing the counterexamples noted in Section 3.2. The four OofA-OA(24, 6, 2) designs had Sim1 = 7.5 and Sim2 = 7.96, the values for the full design, while the other two designs had Sim1 = 7.51 and Sim2 = 7.97. More details on all six are provided in Table 6. The unbalance measure RM Vord breaks the tie between the top two designs. Although full balance may be possible in this case, none of these designs achieved it. The three highest-ranking designs and the worst design appear in Table 7, along with their fkl values. 5.5 Seven-Component Designs No OofA-OA(24, 7, 2) designs were found after 5000 starting points in each of PROC OPTEX and AlgDesign. The best design had Deff = 0.990, with χ2ave,2 = 0.07, with row indices (823, 839, 909, 1167, 1466, 1525, 1653, 1791, 2226, 2258, 2517, 2721, 2927, 2935, 3071, 3515, 3602, 3642, 4001, 4259, 4332, 4415, 4865, 5009). Using the same process, no OofA-OA(36, 7, 2) designs were found, the best having Deff = 0.970 and χ2ave,2 = 0.29, with row indices (454, 486, 551, 629, 637, 881, 1296, 1377, 1470, 1529, 1711, 1947, 2068, 2154, 2353, 2382, 2408, 2726, 2794, 2935, 3039, 3117, 3215, 3263, 3340, 3367, 3505, 3649, 3742, 3874, 4060, 4268, 4330, 4559, 4627, 4896); no OofA-OA(48, 7, 2) designs were found, the best having Deff = 0.985 and χ2ave,2 = 0.22, with row indices (69, 171, 253, 307, 445, 606, 706, 777, 823, 912, 1009, 1050, 1223, 1547, 1604, 1716, 1756, 1810, 1905, 2021, 2143, 2232, 2284, 2448, 2824, 3030, 3216, 3290, 3357, 3368, 3602, 3806, 3828, 3920, 4013, 4036, 4044, 4182, 4287, 4419, 4463, 4533, 4609, 4754, 4781, 4810, 4842, 4853). 17 Table 5: Three OofA-OA(24, 5, 2) designs (ID’s 5, 33, 65) with reference-design rows, followed by fkl values. (For OA columns, see Section 6.) OA(24 ) 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Design 5 01432 6 02143 8 0 2 3 4 1 10 0 3 2 1 4 15 0 3 4 2 1 18 1 0 4 2 3 29 1 2 0 3 4 31 1 2 4 0 3 35 1 3 0 2 4 37 1 3 4 2 0 42 2 0 4 1 3 53 2 1 3 4 0 58 2 3 0 1 4 61 2 4 3 1 0 72 3 0 4 1 2 77 3 1 2 0 4 81 3 1 4 0 2 83 3 2 4 0 1 89 4 0 1 2 3 97 4 1 0 3 2 104 4 2 0 3 1 110 4 2 1 3 0 112 4 3 0 1 2 115 4 3 2 1 0 120 Design 33 01243 2 01342 4 02314 9 0 3 2 4 1 16 0 4 2 1 3 21 0 4 3 1 2 23 1 0 2 3 4 25 1 3 2 4 0 40 1 4 0 3 2 44 1 4 2 3 0 46 2 1 0 4 3 56 2 1 3 0 4 57 2 3 4 0 1 65 2 4 0 1 3 67 2 4 3 1 0 72 3 0 4 1 2 77 3 1 2 0 4 81 3 1 4 0 2 83 3 2 0 1 4 85 3 4 2 1 0 96 4 1 2 0 3 105 4 1 3 0 2 107 4 2 0 3 1 110 4 3 0 2 1 116 Design 73 01243 2 0 2 4 3 1 12 0 3 1 4 2 14 0 4 1 3 2 20 1 0 3 2 4 27 1 0 4 2 3 29 1 2 3 4 0 34 1 3 0 2 4 37 1 4 3 2 0 48 2 0 1 3 4 49 2 0 3 4 1 52 2 1 4 0 3 59 2 3 1 0 4 63 2 4 3 0 1 71 3 0 4 2 1 78 3 1 4 0 2 83 3 2 0 1 4 85 3 2 4 1 0 90 3 4 0 1 2 91 4 0 2 1 3 99 4 0 3 2 1 102 4 1 2 0 3 105 4 2 1 3 0 112 4 3 1 2 0 118 0 1 2 3 4 54645 55374 47355 47355 61935 62664 48084 53916 54645 47355 47355 54645 55374 54645 54645 18 123 111 111 111 111 011 011 001 010 001 100 100 010 001 100 101 010 100 000 101 110 110 010 000 001 4 1 0 1 1 0 0 1 1 0 1 1 0 1 0 1 1 0 0 0 0 0 1 1 0 OA(22 31 ) 12 11 11 11 11 01 01 00 01 00 10 10 01 00 10 10 01 10 00 10 11 11 01 00 00 3 2 1 1 2 2 2 2 1 1 1 1 0 2 0 2 1 0 0 2 0 0 0 0 1 Table 6: Rankings for four OofA-OA(24, 6, 2) designs and two other D-efficient designs. ID Rank 6 1.5 3 1.5 1 3 2 4 4 5 5 6 6 Rank 1.6 1.6 3.4 3.6 4.8 6 worst: F O3 0.69(2) 0.70(1) 0.66(3) 0.65(4) 0.56(5) 0.52(6) 0.52 χ2ave,3 1.10(2) 1.06(1) 1.25(5) 1.22(3) 1.22(4) 1.38(6) 1.38 F O3,−1 0.72(1) 0.70(3) 0.70(2) 0.67(4) 0.57(5) 0.54(6) 0.54 χ2ave,3,−1 1.00(1) 1.10(2) 1.12(3) 1.17(4) 1.20(5) 1.36(6) 1.36 Sim3 8.406(2) 8.403(1) 8.415(4) 8.412(3) 8.417(5) 8.425(6) 8.425 RM Vord 1.12(1) 1.79(5) 2.01(6) 1.74(3) 1.53(2) 1.74(3) 2.01 RESULTS: COMPONENTS AND PROCESS VARIABLES We next investigate the addition of process variables to OofA designs. Formally, we want to consider adding p additional factors to our design, and we let Np∗ be the number of possible combinations of their levels. The levels of these factors may be free to vary independently, as we have suggested to this point for simplicity, but they can, in fact, be constrained—we illustrate both cases below. Our objective in this section is to illustrate how this may be done, and to this end we consider augmenting a OofA-OA(24, 5, 2) design with each of the following terms: 1. main effects for a 22 design 2. main effects for a 23 design 3. main effects for a 24 design 4. main effects for a 24−1 IV (constrained) design 5. main effects for a 22 × 3 design Note that the number of treatment combinations for most of these designs divides into 24—such designs were selected with the hope that an OofA-OA(24, 5, 2) design could be naturally extended. The remaining, 24 , design was selected to see whether it, too, might be able to extend an OofA-OA(24, 5, 2) design. 19 Table 7: Three OofA-OA(24, 6, 2) designs (ID’s 6, 3, 1) and another D-efficient design (ID 5) with reference-design rows, followed by fkl values. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 Design 015243 024351 031542 045132 103254 123405 143250 153024 205134 214503 231045 250341 254301 304215 345012 352014 352410 402135 405321 415203 431205 510423 531402 542130 0 1 2 3 4 5 452463 444444 524634 444444 444444 356235 6 20 40 54 92 128 153 208 229 259 281 295 340 359 375 451 469 474 487 504 525 561 629 683 712 Design 012543 023451 031452 032514 045132 105234 132450 135042 142503 143052 204135 243150 251304 312054 340215 354120 421053 423051 453102 504321 521403 523014 532401 540123 Design 032145 034152 042531 052143 054132 124053 132045 142350 153042 154032 210354 230154 241530 250431 325401 340251 351024 410235 430152 435210 452013 501234 523410 531420 3 6 34 52 59 92 139 178 188 203 206 253 328 345 392 435 478 536 542 597 624 659 661 689 697 532653 524634 356235 356235 363525 533382 516615 524634 436425 372273 444444 372273 20 1 55 62 84 104 116 158 175 202 230 236 266 290 324 342 431 436 463 505 554 576 589 601 666 684 Design 024351 031452 042153 051324 120453 123450 130452 154032 210354 214530 240135 302514 312504 340125 342510 420531 431052 435201 451032 501423 502341 523140 532041 541230 436425 436425 356235 452463 453534 531582 5 40 52 80 99 148 154 172 236 266 282 313 371 395 433 450 534 560 575 584 605 610 664 686 706 We could create each design from basic principles—for example, there are 120 × 16 = 1920 candidate points for the third design. Our strategy for each design, however, was to first find an OofA-OA(24, 5, 2) design and use this design instead of the 120-run design. In our example, this reduces the number of candidate points to 24 × 16 = 384. In addition, it may also be possible to fractionate the process-variable portion as well—this case leads to the fourth design, with 24 × 8 = 192 candidate points. For our results, we choose to use Design ID 73 as our OofA-OA(24, 5, 2) design, and we show results only for the 24 and 22 × 3 designs. In obvious notation, we call these designs as OofA-OA(24, 5, 2; 24 ) and OofA-OA(24, 5, 2; 22 31 ). These results appear in Table 5, where for brevity we refer to the two process-variable portions as OA(24 ) and OA(22 31 ). The 23 result may be constructed from factors 2–4 from the 24 , and the 22 from any two of the factors 2–4. We were not able to find an OofA-OA with the 24−1 IV design for Design ID 73, but were able to do so for Design ID 33 (not shown). 7 DESIGNS WITH RESTRICTIONS, AND NOTES ON ANALYSIS In our work above, we have assumed that all m! permutations are in the set of candidate points. However, this may often not be the case. For example, the experimenter may be convinced that constraints are necessary, such as c0 < c1 , or c0 < c1 and c2 < c3 , or c0 < c1 < c2 . As L. Hare, who has used the Williams’ Latin Squares in OofA experiments, notes (personal communication, August 1, 2016), “I have done some work with order of addition, mostly related to food ingredients. It can make a very big difference for emulsions, for example. I’ve used a few full permutation designs, but usually the experimenters have some prior knowledge, especially about what orders will not work.” In such cases, we can simply first reduce the number of candidate points to agree with the restrictions, create a new set of PWO frequency tables of the kind shown in Figure 1, and use this to define “orthogonal” properties in searching for an OofA-OA. For analyses, we suggest either of two approaches, both of which are based on the assumption that interactions among the PWO factors are likely to occur. The first follows that of Hamada and Wu (1992)—use a stepwise approach to examine active main effects, and then repeat this approach but now including two f.i.’s that are associated with the active main effects. The second is to use tree-based regression modeling, in which interactions of a certain type are examined naturally. 21 8 FUTURE WORK Proving that a D-efficiency of 1 is equivalent to D-optimality may lead to some insights into the structure of D-optimal, and perhaps OofA, designs. We have not been able to create a closed-form method of constructing any OofA-OA’s, but have not ruled out such a possibility. For a design with N runs, can the largest m for which an OofA-OA exists be determined for general N and m, for a given t? A catalog of these designs, including a wide class of component-order-restricted designs and the addition of a wide class of process factors, would be useful, as would be the incorporation of our design-construction methods into software packages that are easily accessible to experimenters. The apparent small number of non-wt-isomorphic designs in some cases and large number in others surprised us. Does a structure exist among these? Can the number of non-wt-isomorphic designs be obtained and cataloged? Graphical summaries of key features of the designs, such as groupings among the orthogonal pairs or triples of the PWO factors, may help experimenters in assigning their components to the component labels 0, . . . , m − 1. We have relied on the D-criterion because of its efficient implementation, but efficient routines for the χ2 -criterion may produce better near-orthogonal OofA designs and designs with better strength-3 properties. 9 CONCLUSIONS In this article we have introduced a class of OofA experimental designs, for which designs with good properties may be created in a relatively small numbers of runs. By generalizing the idea of OA’s to OofA-OA’s, we use a natural set of balance criteria to measure the quality of these designs. A connection made between D-efficient and OofA-OA designs, along with empirical evidence, allows us to use existing D-optimal algorithms to create OofA-OA designs. We have also found necessary conditions for the number of runs needed to create OofA-OA’s of strengths 2 and 3. We have created designs for what we believe will be the most common cases, including 12-run OofA-OA’s in 4 and 5 components, 24-run OofA-OA’s in 5 and 6 components, and 24-run near-OofA-OA’s in 7 components. It is possible to extend some of these designs to include process factors, making these designs even more useful. Our methods can be modified to create designs for the common case in which a subset of the component orderings are restricted a priori by the experimenter. We also raise a number of questions for future research. 22 References [1] Black, B. E., Holaska J. M., Lvesque L., Ossareh-Nazari B., Gwizdek C., Dargemont C., and Paschal B. M. (2001). “NXT1 is Necessary for the Terminal Step of Crm1mediated Nuclear Export”. Journal of Cell Biology, 152, 141–55. [2] Clark, J. B. and Dean, A. M. (2001). “Equivalence of Fractional Factorial Designs”. Statistica Sinica, 11, 537–547. [3] Fisher, Ronald A. (1971) [1935 original edition]. The Design of Experiments, 9th ed., Macmillan. [4] Hamada, M. and Wu, C. F. J. (1992). “Analysis of Designed Experiments with Complex Aliasing”. Journal of Quality Technology, 24, 130–127. [5] Hedayat, A. S., Sloane, N. J. A., Stufken, J. (1999). Orthogonal Arrays: Theory and Applications, Springer, New York. [6] Kim, H. A., Ratner, N., Roberts, T. M., and Stiles, C. D. (2001). “Schwann Cell Proliferative Responses to cAMP and Nf1 Are Mediated by Cyclin D1”. The Journal of Neuroscience, 21, 1110-1116. [7] Lekivetz, R., Sitter, R., Bingham, D., Hamada, M. S., Moore, L. M., and Wendelberger, J. R. (2015).“On Algorithms for Obtaining Orthogonal and Near-Orthogonal Arrays for Main-Efects Screening”. Journal of Quality Technology 1, 2–13. [8] PerkinElmer, Inc. (2016). User’s Guide To Alpha Assays Protein:Protein Interactions, available at https://www.perkinelmer.com/labsolutions/resources/docs/009625A 01 GDE.pdf, accessed 1/27/17. [9] Preuss, M., Weidman, P., and Nielsen, E. (2009). “How We Study Protein Transport”. in Trafficking Inside Cells: Pathways, Mechanisms and Regulation, Segev, N., ed., Springer-Verlag, New York. [10] Van Nostrand, R. C. (1995). “Design of Experiments where the Order of Addition is Important”. ASA Proceedings of the Section on Physical and Engineering Sciences, 155–160. American Statistical Association, Alexandria, Virginia. [11] Warnes, G. R., Bolker, B., Lumley, T. (2015). “gtools: Various R Programming Tools”. R package version 3.5.0, available at https://CRAN.R-project.org/package=gtools 23 [12] Wheeler, B. (2014). “AlgDesign: Algorithmic Experimental Design”. R package version 1.1-7.3, available at https://CRAN.R-project.org/package=AlgDesign. [13] Williams, E. J. (1949). “Experimental Designs Balanced for the Estimation of Residual Effects of Treatments”. Australian Journal of Scientific Research, Series A, Physical Sciences 2, 149–168. [14] Williams, E. J. (1950). “Experimental designs balanced for pairs of residual effects”. Australian Journal of Scientific Research, Series A, Physical Sciences 3, 351–363. [15] Wu, C. F. J. (1993). “Construction of Supersaturated Designs Through Partially Aliased Interactions”. Biometrika 80, 661–669. [16] Xu, H. (2002). “An Algorithm for Constructing Orthogonal and Nearly-Orthogonal Arrays with Mixed Levels and Small Runs”. Technometrics 44, 356-368. [17] Xu, H. (2003). “Minimum Moment Aberration for Nonregular Designs and Supersaturated designs”. Statistica Sinica, 13, 691–708. [18] Yamada, S. and Lin, D. K. J. (1999). “Three-Level Supersaturated Designs”. Statistics and Probability Letters 45, 31-39. 24