Permutation Group Algorithms
Permutation Group Algorithms
Permutation Group Algorithms
Permutation Group
Algorithms
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo
1 Introduction page 1
1.1 A List of Algorithms 4
1.2 Notation and Terminology 6
1.2.1 Groups 7
1.2.2 Permutation Groups 9
1.2.3 Algorithmic Concepts 10
1.2.4 Graphs 11
1.3 Classification of Randomized Algorithms 12
2 Black-Box Groups 16
2.1 Closure Algorithms 18
2.1.1 Orbit Computations 18
2.1.2 Closure of Algebraic Structures 23
2.2 Random Elements of Black-Box Groups 24
2.3 Random Subproducts 30
2.3.1 Definition and Basic Properties 30
2.3.2 Reducing the Number of Generators 33
2.3.3 Closure Algorithms without Membership Testing 37
2.3.4 Derived and Lower Central Series 38
2.4 Random Prefixes 40
2.4.1 Definition and Basic Properties 40
2.4.2 Applications 44
v
vi Contents
Bibliography 254
Index 262
1
Introduction
1
2 Introduction
idea was to introduce the notions of base and strong generating set. This data
structure enables us to decide membership in G constructively, by writing any
given element of G as a short product of the strong generators. The technique
for constructing a strong generating set can also be applied to other tasks such
as computing normal closures of subgroups and handling homomorphisms of
groups. Therefore, a significant part of this book is devoted to the description
of variants and applications of Sims’s method.
A second generation of algorithms uses divide-and-conquer techniques by
utilizing the orbit structure and imprimitivity block structure of the input group,
thereby reducing the problems to primitive groups. Although every abstract
group has a faithful transitive permutation representation, the structure of prim-
itive groups is quite restricted. This extra information, partly obtained as a
consequence of the classification of finite simple groups, can be exploited in
the design of algorithms.
We shall also describe some of the latest algorithms, which use an even finer
divide-and-conquer technique. A tower of normal subgroups is constructed
such that the factor groups between two consecutive normal subgroups are the
products of isomorphic simple groups. Abelian factors are handled by linear al-
gebra, whereas the simple groups occurring in nonabelian factors are identified
with standard copies of these groups, and the problems are solved in the stan-
dard copies. This identification process works in the more general black-box
group setting, when we do not use the fact that the input group is represented
by permutations: The algorithms only exploit the facts that we can multiply
and invert group elements and decide whether two group elements are equal.
This generality enables us to use the same algorithms for matrix group inputs.
Computations with matrix groups is currently the most active area of CGT.
Dealing with permutation groups is the area of CGT where the complexity
analysis of algorithms is the most developed. The initial reason for interest
in complexity analysis was the connection of permutation group algorithms
with the celebrated graph isomorphism problem. The decisive result in estab-
lishing the connection is the polynomial-time algorithm in [Luks, 1982] for
testing isomorphism of graphs with bounded valence, where the isomorphism
problem is reduced to finding setwise stabilizers of subsets in the permutation
domain of groups with composition factors of bounded size. This paper not
only established a link between complexity theory and CGT but provided new
methodology for permutation group algorithms.
Up until the end of the 1980s, permutation group algorithms were devel-
oped in two different contexts. In one of these, the primary goal was efficient
implementation, to handle the groups occurring in applications. In the other
context, the main goal was the rigorous asymptotic analysis of algorithms.
Introduction 3
Algorithms for numerous tasks were developed separately in the two con-
texts, and the two previous books on permutation group algorithms reflect
this division: [Butler, 1991] deals mostly with the practical approach, whereas
[Hoffmann, 1982] concentrates on the asymptotic analysis. In the past decade,
a remarkable convergence of the approaches occurred, and algorithms with fast
asymptotic running times that are suitable for implementation were developed.
The main purpose of this book is to describe this new development. We con-
sider the interaction of theory and implementation to be of great importance
to each side: Symbolic algebra can benefit considerably by the influx of ideas
of algorithmic complexity theory and rigorous asymptotic analysis; conversely,
the implementations help demonstrate the power of the asymptotic paradigm,
which is at the foundation of the theory of computing.
The major theme of this book is the description of nearly linear-time algo-
rithms. These are the algorithms representing the convergence of theoretical
and practical considerations. Their running time is O(n|S| logc |G|) for input
groups G = S ≤ Sn ; in particular, in the important subcase of small-base
groups, when log |G| is bounded from above by a polylogarithmic function of
n, the running time is a nearly linear, O(N logc N ), function of the input length
N = n|S|. The category of small-base groups includes all permutation repre-
sentations of finite simple groups except the alternating ones and all primitive
groups that do not have alternating composition factors in their socle. Most
practical computations are performed with small-base input groups.
Quite different methods give the asymptotically fastest solutions for com-
putational problems in large-base groups, where log |G| is bounded only by
log n!. Most of these algorithms have not yet been implemented. We shall also
describe backtrack methods, which are the practical algorithms for problems
with no known polynomial-time solutions. For small-base input groups, back-
track methods may be practical in groups of degree in the tens of thousands.
Our main goal is to present the mathematics behind permutation group algo-
rithms, and implementation details will be mostly omitted. We shall give details
only in the cases where the implemented version differs significantly from the
one described by the theoretical result or when the reason for the fast asymptotic
running time is a nontrivial data structure. Most of the algorithms described in
this book have been implemented in the GAP system [GAP, 2000], which, along
with its source code, is freely available. GAP code is written in a high-level,
Pascal-like language, and it can be read as easily as the customary pseudocode
in other books and articles on computational group theory. The addresses of ftp
servers for GAP can be obtained from the World Wide Web page
http://www-gap.dcs.st-and.ac.uk/~gap.
4 Introduction
The other large computer algebra system particularly suitable for computations
with groups is Magma (see [Bosma et al., 1997]). The World Wide Web page
http://www.maths.usyd.edu.au:8000/u/magma
Acknowledgments
The writing of this book began in 1993, on the suggestion of Joachim Neubüser,
who envisioned a series of books covering the major areas of computational
group theory. The fact that the writing is finished in less than a decade is in no
small part the consequence of my wife Sherry’s continuous encouragement. I
am thankful to both of them and to the editors of Cambridge University Press
for their patience. During this period, I was partially supported by the National
Science Foundation.
Alexander Hulpke, William Kantor, Joachim Neubüser, Cheryl Praeger,
Charles Sims, and Leonard Soicher read parts of the manuscript, and their
comments improved the presentation significantly. I am especially indebted to
William Kantor and Joachim Neubüser for their help.
1.2.1. Groups
If G is a group and S ⊆ G then we denote by S the subgroup generated by S.
We write H ≤ G to indicate that H is a subgroup of G and H G if H ≤ G and
H = G. If H is isomorphic to a subgroup of G then we write H G. The symbol
|G : H | denotes the number |G|/|H |, and H G denotes that H is normal in
G. A subgroup H ≤ G is subnormal in G, in notation H G, if there exists a
chain of subgroups H = H0 H1 · · · Hk = G. If N G and H ≤ G such
that N ∩ H = 1 and G = N H then we call H a complement of N in G.
The group of automorphisms, outer automorphisms, and inner automor-
phisms of G are denoted by Aut(G), Out(G), and Inn(G), respectively. We say
that G acts on a group H if a homomorphism ϕ : G → Aut(H ) is given. If ϕ is
clear from the context, for g ∈ G and h ∈ H we sometimes denote ϕ(g)(h), the
image of h under the automorphism ϕ(g), by h g . If G acts on H and U ⊆ H
then U G := {U g | g ∈ G} is the orbit of U under the G-action, and U G is
the G-closure of U . In the special case H = G, the group U G is called the
normal closure of U . For U ≤ H, C G (U ) := {g ∈ G | (∀u ∈ U )(u g = u)} is
the centralizer of U in G and NG (U ) := {g ∈ G | U g = U } is the normalizer
of U in G. In particular, Z (G) := C G (G) is the center of G.
8 Introduction
use the Lie-theoretic notation 2B2 (q), 2G 2 (q), and so on. As mentioned earlier,
no detailed knowledge of the groups of Lie type is required in this book.
(i.e., functions that take positive values with finitely many exceptions). For
1.2 Notation and Terminology 11
f ∈ F, we define
and
( f ) := O( f ) ∩ ( f ).
Stated less formally, t ∈ O( f ) means that for large enough n, t(n)/ f (n) is
bounded from above by an absolute constant c; t ∈ ( f ) means that for large
enough n, t(n)/ f (n) is bounded from below by an absolute positive constant c;
and t ∈ o( f ) means that the limit of t(n)/ f (n) is 0.
For t ∈ O( f ), we also say that t is O( f ), and we use similar statements for
and as well. We believe that this notation is more correct than the traditional
t = O( f ) (see [Brassard and Bratley, 1988, Chap. 2] for the different variants
of these notations).
We also introduce a “soft version” of the big-O notation. We write t ∈ O ∼ ( f )
if t(n) ≤ C f (n) logc n for large enough n (where c, C are positive constants).
Logarithms are always of base 2.
1.2.4. Graphs
Let V be a set and E a subset of the two-element subsets of V . The pair (V, E)
is called a graph X (V, E). The elements of V and E are the vertices and the
edges of X , respectively. For v ∈ V , the number of edges containing v is the
degree or valency of v, which we shall denote by deg(v). A graph X is called
regular if all vertices have the same valency. We also say that an edge {u, v} ∈ E
connects u and v. The set N (v) := {u ∈ V | {u, v} ∈ E} is the neighborhood of
v in X . A graph X is called bipartite if V can be partitioned into two sets A, B
so that all edges of X connect some vertex in A with some vertex in B. The
automorphism group Aut(X ) consists of those permutations of V that leave E
invariant.
These notions can be generalized by requiring only that the elements of E are
subsets of V , but of arbitrary size. Then X is called a hypergraph. A hypergraph
X is uniform if all elements of E are of the same size.
Another generalization is when E consists of ordered pairs of V . Then X is
called a directed graph. If we want to emphasize that a graph is directed then
we shall use arrows above E and above the ordered pairs in E. The out-degree
−−→
of a vertex u is the number of edges (u, v) in E. For a directed graph X (V, E),
12 Introduction
the underlying graph of X is the graph U(V, E) with edge set E = {{u, v} |
−−→
(u, v) ∈ E}.
Let X (V, E) be a graph. A walk in X is a sequence of vertices (v0 , v1 , . . . , vk )
such that {vi , vi+1 } ∈ E for all i ∈ [0, k − 1]. A path is a walk with vi = v j for
all i, j with 0 ≤ i < j ≤ k; a cycle is a walk with v0 = vk and vi = v j for all
i, j with 0 ≤ i < j ≤ k − 1. We can define a binary relation R on V by letting
(u, v) ∈ R if and only if there is a walk (u = v0 , v1 , . . . , vk = v). Then R is an
equivalence relation; the equivalence classes are called the components of X .
The graph X is connected if it has only one component.
If X is a directed graph then walks, cycles, and paths are defined similarly,
but the binary relation R is not necessarily an equivalence relation. We may
define another binary relation R̂ on V: (u, v) ∈ R̂ if and only if (u, v) ∈ R
and (v, u) ∈ R. Then R̂ is an equivalence relation, and its equivalence classes
are called the strongly connected components of X . The directed graph X is
strongly connected if it has only one strongly connected component.
Cycle-free graphs are called forests and connected, cycle-free graphs are
called trees. A rooted tree is a tree with a distinguished vertex. If X (V, E) is a
rooted tree with root r then the parent of v ∈ V \{r } is the first vertex after v
on the unique path from v to r . The children of v ∈ V are those vertices whose
parent is v. A vertex without children is called a leaf.
Let G be a group and S ⊆ G. The Cayley graph (G, S) is defined to have
vertex set G; for g, h ∈ G, {g, h} is an edge if and only if gs = h for some
s ∈ S ∪ S −1 . The Cayley graph (G, S) is connected if and only if G = S.
Cayley graphs are regular and vertex-transitive (i.e., Aut((G, S)) is a transitive
subgroup of Sym(G)).
Sequences will be denoted by enclosing their elements in parentheses. For a
sequence L , L[i] denotes the ith element of L.
The most abused notation is (a, b) for integers a and b. Depending on the
context, it may mean a sequence with elements a and b, the set of real numbers
between a and b, the set of integers between a and b (although, in this case,
we shall prefer to use the closed interval [a + 1, b − 1]), or the permutation
exchanging a and b.
where the value for the error term ε < 1/2 is specified in the description of the
Monte Carlo algorithm. In most practical situations, the reliability of the algo-
rithms can be improved by repeated applications, or by running the algorithms
longer. Although we cannot formulate a theorem in the general setting consid-
ered in this section, at least the following holds for all Monte Carlo algorithms
for permutation groups described in this book: The probability of an incorrect
answer can be bounded from above by an arbitrary ε > 0, prescribed by the
user. The associated cost is the running time of the algorithm multiplied by a
factor O(log(1/ε)).
A situation we can analyze here is the case of decision problems. Suppose that
we have a Monte Carlo algorithm for a decision problem, with error probability
ε < 1/2. Running the algorithm t times, and taking a majority vote of the results,
14 Introduction
√
we can increase the reliability to at least 1 − δ t , with δ := 2 ε(1 − ε) < 1:
The probability of error is at most
t t t/2
t k t ε
ε (1 − ε) < (1 − ε)
t−k t
< δt .
k=t/2
k k=t/2
k 1 − ε
Groups can be given in different ways: The group elements can be, for exam-
ple, permutations, invertible matrices, or words in generators satisfying some
relations. Algorithms usually try to exploit the specific features of the given
representation. As an example, suppose that we have to compute the mth power
of some g ∈ G. If G is a permutation group then for large values of m the fastest
method to compute g m is the following. We compute the mth power of each
cycle of g separately, first reducing m modulo the cycle lengths. If C is a cycle
of length k and j is the remainder of m divided by k, then for all α ∈ C we have
m j
α g = α g . The arithmetic operations required in the computation of j are much
cheaper than group operations. Unfortunately, no analogue of this method is
available in other representations of groups.
i
Another way to compute g m is by repeated squaring. First, we compute g 2
for 1 ≤ i ≤ log m. Then, if the binary expansion of m is m = i∈I 2i then
i
g m = i∈I g 2 . This method requires O(log m) group operations and it is inde-
pendent of the actual representation of G: All we use is that – somehow – we
can compute the product of two given group elements.
In this chapter we deal with algorithms of the latter type. A black-box group
G is a group whose elements are encoded as strings of length at most N over
an alphabet Q, for some positive integer N and finite set Q. We do not require
that group elements have a unique representation as a string, and not all strings
need to correspond to group elements.
Group operations are performed by an oracle (the black box). Given strings
representing g, h ∈ G, we can
16
Black-Box Groups 17
It is possible that the oracle accepts strings representing elements of an over-
group Ḡ of G, and it performs the group operations in Ḡ. If this is the case then
we assume that we can decide whether a string represents an element of Ḡ, but
we do not assume that we can decide whether a string represents an element of
G or Ḡ\G.
Combining (i)–(iii), we can compare group elements. In practical situations,
usually it is possible to decide whether g = h directly, without computing gh −1
and then comparing it with 1. A black-box group algorithm is an algorithm that
does not use specific features of the group representation or particulars of how
the group operations are performed; it can use only the operations described in
the list above. Note that we have an upper bound for the order of G, namely,
N
|G| ≤ i=1 |Q|i ≤ N |Q| N .
The definition we gave here is a slight generalization of the original one in
[Babai and Szemerédi, 1984]. There, only the alphabet Q = {0, 1} was allowed,
and the group elements had to be represented by strings of uniform length. It
is easy to reduce our more general definition to the original one: The elements
of Q can be encoded by 1 + log |Q| -long 0–1 sequences corresponding to
the binary representations of the numbers 1, 2, . . . , |Q|, and short strings can
be padded by zeros to achieve uniform length. Although we did not require
in the definition that group elements have a unique representation as a string
over the alphabet Q, in a number of practical situations this additional property
is satisfied. Unique representation allows ordering of group elements (for ex-
ample, by the lexicographic ordering of strings) that may speed up algorithms
dealing with lists of group elements (see, for example, Section 2.1.1).
Two examples of black-box groups are matrix groups (with a finite field as
Q) and permutation groups (with Q = {1, 2, . . . , n}). In these examples, the set
of all invertible matrices and the set of all permutations, respectively, are natural
overgroups. The oracles for group operations are the usual matrix and permuta-
tion multiplications and inverses. The strings representing group elements are of
uniform length, and each group element has a unique representation as a string.
Another example is the polycyclic representation of solvable groups (cf.
Section 7.2). In this case, the alphabet Q is a set of generators, and multiplication
is performed by a collection method. Every string up to length N represents a
group element, and group elements may be represented by more than one string.
However, each group element has a string representation in a normal form, the
so-called collected word. In fact, the oracle performs a group multiplication by
concatenating the input strings and computing the collected word corresponding
to this concatenation. By always using the collected word to represent group
elements, we can consider the black-box group as having unique representation.
For details, we again refer to Section 7.2.
18 Black-Box Groups
In Section 5.3, we shall introduce another way to consider permutation groups
as black-box groups. Similarly to the case of polycyclic representations, the al-
phabet is a set of generators, and every string represents a group element. Again,
we shall have a normal form for representing group elements, the so-called stan-
dard word, and we can restrict ourselves to working with standard words. For
small-base permutation groups, a class of groups important for practical com-
putations, this new representation allows much faster group operations than
are possible with permutation multiplication. The ability to perform group op-
erations quickly offsets the disadvantage that we lose the information stored
implicitly in the cycle structure of permutations; all we can do is to perform the
black-box operations listed above.
An important example of black-box groups with no unique representation
of elements is when G is a factor group H/M. Here H can be any black-box
group, and the oracle for group operations in G is the oracle of H . However, to
satisfy condition (iii), we must be able to decide whether a string representing
an element of H is in M. For example, if M = Z (H ) then we can test g ∈ M
by checking whether g commutes with the generators of H . Another example
where we can test membership in M is when M = O∞ (H ) is the solvable
radical of H or M is the Fitting subgroup of H . In Sections 2.3 and 2.4,
we shall describe Monte Carlo algorithms for computing normal closures and
commutator subgroups. Using these methods, we can test whether g ∈ M by
computing g H and the derived or lower central series of g H , respectively.
These last examples are oracles with the possibility of incorrect output, and
the membership computation in M is quite time consuming and so cannot be
applied frequently in practice. However, these examples play an important role
in the theoretical complexity investigations of matrix group algorithms.
Finitely presented groups, however, are not black-box groups. Although we
can perform group multiplications by simply concatenating the words repre-
senting g and h, there is no uniform bound for the word length. Even more
importantly, the innocent looking condition (iii) corresponds to the word prob-
lem, which is, by the celebrated results of Novikov and Boone [Rotman, 1995,
Chap. 12], in general undecidable.
{2, 4}\(L 0 ∪ L 1 ∪ L 2 ) = {2} are computed. Finally, L 3S = {2s1 , 2s2 } = {1, 2} and
L 4 := {1, 2}\(L 0 ∪ L 1 ∪ L 2 ∪ L 3 ) = ∅ are computed. Since L 4 = ∅, the algo-
rithm terminates and outputs the orbit 4G = L 0 ∪ L 1 ∪ L 2 ∪ L 3 = {1, 2, 4, 5}.
In implementations, we collect the elements of the sets L i in a list U . We
define U [1] := α and, successively for each element β ∈ U , compute β g for all
g ∈ S. If β g is not already in U then we add β g to U . The algorithm terminates
when all elements of U are processed.
The timing of the algorithm depends on |α G |. We have to compute the image
of each β ∈ α G under each g ∈ S, and so the total number of image computations
is |α G ||S|. The cost of an image computation depends heavily on the actual
representation of G. If the black-box group G is a permutation group then
computing β g for some β ∈ and g ∈ G is the basic operation whose time
requirement is the unit in running time estimates; in the other two examples
mentioned earlier, an image computation amounts to performing some group
operations to compute a conjugate of a group element or the conjugates of a list
of generators.
Also, we have to decide whether an image β g just computed already occurs
in the list U . As with image computations, the cost of this operation varies ac-
cording to the actual representation of the black-box group and . We consider
three possible situations. In the first one, a superset of manageable size is
known for α G and the position of elements of α G in can be determined easily.
This occurs, for example, when G is a permutation group acting on : We may
simply choose := . In the second situation, no such superset is available.
The third situation is a special case of the second: There is no superset at
our disposal, but a linear ordering of is defined and we can compare any two
elements of in this ordering. This latter situation occurs, for example, when
= G and we have to compute the conjugacy class of some α ∈ G, and the
elements of the black-box group G have unique representations as strings. In
this case, we may simply consider the lexicographic ordering of strings as a
linear ordering of .
Theorem 2.1.1. Suppose that a group G = S acts on a set . Then the orbit
of some α ∈ can be computed using O(|α G ||S|) image computations. More-
over,
Remark 2.1.3. Using more sophisticated data structures (e.g., 2–3 trees; cf.
[Sedgewick, 1988, Chap. 15]), the number of comparisons in Theorem 2.1.1(iii)
can be decreased to O(|α G | log |α G ||S|).
Proof. Let z := xk−1 yk , and write z as a product z = xi1 · · · xil of elements from
X \{xk }. Starting with X and replacing the element in position k by the product
of elements in position k and i j for j = 1, . . . , l takes X to Y .
Theorem 2.2.3. Let m 1 be the minimal size of a generating set of G and let
m 2 be the maximal size of a nonredundant generating set. If m ≥ m 1 + m 2
28 Black-Box Groups
then the Markov chain M associated with the product replacement algorithm
is irreducible and aperiodic.
Lemma 2.3.1. Suppose that G = S acts on a set and let ⊆ . Let g be
a random subproduct of the elements of S (in any fixed ordering of S). If is
not closed for the action of G then Prob(g = ) ≥ 1/2.
Proof. Let S = {g1 , . . . , gk }, g = g1ε1 g2ε2 · · · gkεk , and pi = g1ε1 g2ε2 · · · giεi for
0 ≤ i ≤ k. If is not closed for the G-action then gi = for some gi ∈ S.
Let us consider the largest index i with this property. If pi−1 fixes then pi does
not fix with probability 1/2, since if εi = 1 then pi = pi−1 gi = gi = .
The other case is that pi−1 does not fix . Then with probability at least 1/2
neither does pi , since if εi = 0 then pi = pi−1 = . Furthermore, if pi =
then g = , since all g j , j > i, fix . Formally,
ε1 ε2 εi
Prob(g = ) = Prob g1 g2 ···gi =
ε1 εi−1 ε1 εi−1
≥ Prob εi = 1|g1 ···gi−1 = Prob g1 ···gi−1 =
ε1 εi−1 ε1 εi−1
+ Prob εi = 0|g1 ···gi−1 = Prob g1 ···gi−1 = = 1/2.
t
≤ e−ε
2
Prob X i ≤ (1 − ε) pt pt/2
.
i=1
Proof. The proof is based on Chernoff’s bound (cf. [Chernoff, 1952]). This
states that if Y1 , Y2 , . . . are independent Bernoulli trials (coin flippings) with
Prob(Yi = 1) = p then
t
≤ e−ε
2
Prob Yi ≤ (1 − ε) pt pt/2
. (2.2)
i=1
Hence, to prove the lemma, it is enough to show that for all integers k, t,
t
t
Prob Xi ≥ k ≥ Prob Yi ≥ k . (2.3)
i=1 i=1
32 Black-Box Groups
We prove (2.3) by induction on t. The initial case is obvious. Supposing (2.3)
for t, we have
t+1
t
Prob Xi ≥ k = Prob Xi ≥ k
i=1 i=1
t t
+ Prob X t+1 = 1 X i = k − 1 Prob Xi = k − 1
i=1 i=1
t t
≥ Prob X i ≥ k + p Prob Xi = k − 1
i=1 i=1
t
t
= p Prob X i ≥ k − 1 + (1 − p) Prob Xi ≥ k
i=1 i=1
t
t
≥ p Prob Yi ≥ k − 1 + (1 − p) Prob Yi ≥ k . (2.4)
i=1 i=1
We used the inductive hypothesis in the last inequality. Doing the steps of (2.4)
backward using the Yi , and noting that the independence of the Yi means that
we have equality at each step, we obtain
t
t
t+1
p Prob Yi ≥ k − 1 + (1− p) Prob Yi ≥ k = Prob Yi ≥ k ,
i=1 i=1 i=1
Lemma 2.3.4. Suppose that a generating set S and an upper bound l G for
the length of subgroup chains for a group G are given, and let δ > 0 be ar-
bitrary. Then there is a Monte Carlo algorithm that, with probability at least
1 − δ, constructs a generating set T of size O(l G ), using O(|S|l G ) group op-
erations.
Proof. Let the set T consist of cl G random subproducts made from S, where
the constant c depends on the desired error probability. We claim that
2 2
Prob(T = G) ≥ 1 − e−(1− c ) cl G /4
. (2.5)
Remark 2.3.5. As is clear from (2.5), to achieve error probability less than δ we
may choose the constant c := max{4, 16 ln(δ −1 )/l G }. Another interpretation is
that we construct a new generating set of size 4l G , with the reliability of the
algorithm improving exponentially as a function of l G . In a similar way, we
can compute explicit values for the constants in all algorithms presented in
Sections 2.3.3 and 2.3.4 as well. In Remark 2.3.13, we shall comment further
on the error probability in our algorithms.
34 Black-Box Groups
The following theorem from [Babai et al., 1995] also constructs a generating
set of size O(l G ), while reducing the number of necessary group multiplications
significantly. The idea is that at the beginning of the construction of a new gen-
erating set, already short random subproducts have a good chance to augment
the already constructed subgroup.
Theorem 2.3.6. Suppose that a generating set S and an upper bound l G for
the length of subgroup chains for a group G are given, and let δ > 0 be ar-
bitrary. Then there is a Monte Carlo algorithm that, with probability at least
1 − δ, constructs a generating set T of size O(l G ), using O(|S| log l G ) group
operations.
Proof. We can suppose that l G is a power of 2 and also that l G divides |S|
because we may put the first (|S| mod l G ) elements of S into T and work with
the subgroup of G generated by the remaining elements of S. Also, we can
suppose |S| ≥ 10l G ; otherwise we can define T := S.
The elements of T are constructed in two phases. In the first phase, we place
cl G random subproducts of length |S|/l G made from S into T , where c ≥ 10 is
a constant depending on the desired error probability. We claim that, after this
phase,
To see (2.6), we use an extension of the idea behind the basic-type applications
of Lemma 2.3.3. Let r1 , r2 , . . . be the sequence of random subproducts of length
|S|/l G placed in T and associate a 0–1 valued random variable X i to each ri . Set
X i = 1 if and only if either ri ∈ r1 , . . . , ri−1 or |{g ∈ S | g ∈ r1 , . . . , ri−1 }| <
l G . Clearly, if i≤clG X i > l G then fewer than l G elements of S are not in
T , since otherwise the initial segments of the ri define a subgroup chain of
length greater than l G in G. Hence, it is enough to give an upper estimate for
Prob( i≤clG X i ≤ l G ).
To this end, note that Prob(X i = 1) > 3/10. This is true because if |{g ∈ S |
g ∈ r1 , . . . , ri−1 }| < l G then Prob(X i = 1) = 1. Otherwise, with probability
at least
|S| − l G |S| |S| − |S|/l G lG
1− ≥1 − > 1 − 1/e > 3/5,
|S|/l G |S|/l G |S|
−c /100
e . The number of group operations is O(l G (|S|/l G )) = O(|S|) in the first
phase and
log l
G l G 2i−1 |S|
O = O(|S| log l G )
i=1
2i lG
in the second.
Lemma 2.3.10.
n
1
[a1 a2 · · · an , b1 b2 · · · bm ] = [ai , b j ]b j+1 ···bm ai+1 ···an .
i=1 j=m
2.3 Random Subproducts 39
Proof. By induction, it is easy to check that
1
[a, b1 b2 · · · bm ] = [a, b j ]b j+1 ···bm
j=m
and
n
[a1 a2 · · · an , b] = [ai , b]ai+1 ···an .
i=1
Proof. We construct a generating set for [H, K ] in two phases. In the first
phase, we place cl G commutators [u, v] into a set T , where u and v are random
subproducts from the sets U and V , respectively. Based on Lemma 2.3.11, an
application of Lemma 2.3.3 of basic type shows that, with error probability at
most e−c lG , the normal closure of T generates [H, K ]. In the second phase,
using the algorithm in the proof of Theorem 2.3.9, we compute generators for
T H,K .
40 Black-Box Groups
Remark 2.3.13. During the computation of the derived or lower central series
of G, we apply the algorithm in the proof of Theorem 2.3.12 l G times, and
the error probabilities of all rounds have to be added. This causes no problem,
because all error estimates are of the form e−c lG for some constant c and
l G e−c lG is still exponentially small as a function of l G .
Lemma 2.3.11 also can be applied to test whether a given group G = S is
abelian. Note that the straightforward deterministic method, checking whether
each pair of generators commutes, requires (|S|2 ) group operations.
δ H (x) = min |x − y|
y ∈ H̄ \{x}
Prob(spread(H ) ≥ ε) ≥ q
m − (h + 1)d m m − h + 1 − (h + 1)d h
= >
h h m−h+1
h
(h + 1)d (h + 1)d (h + 1)ε(m + 1)
= 1− >1−h >1−
m−h+1 m−h+1 m−h+1
(h + 1)ε(m + 1)
>1 − > 1 − (h + 1)ε(15/14) > 1 − (15ε)h/4 .
(1 − ε)(m + 1)
h+2 l (h + 2 − l)!
l k − 2l (h + 2 − k)!
ways, since the elements a1,1 , a2,1 , . . . , al,1 are from H0 ; from this set, we have
to choose a k − 2l-element subset to determine those ai,1 which have ki = 3;
finally, fixing the two subsets of the ai,1 , there are (h + 2 − l)!/(h + 2 − k)!
choices for the remaining ai, j .
Fixing the set H0∗ , we can define f by first choosing the (random) values
f (ai,1 ) for 1 ≤ i ≤ l and then continuing with the choice of f (ai, j ) for j ≥ 2.
The probability that f (ai, j+1 ) falls between f (ai, j ) and f (ai, j ) + c(m + 1) is at
most c(m + 1)/(m + 1 − (k − 1)). (Note that this estimate is valid even in the
case ai, j+1 = h + 1, although the value of f (h + 1) is not chosen randomly:
In this case, consider the probability that f (ai, j ) falls between (1 − c)(m + 1)
and m + 1.) Therefore,
h+2 l (h + 2 − l)! c(m + 1) k−l
Prob(A(c)) < .
l k − 2l (h + 2 − k)! m + 1 − k
h+2 l (h + 2 − l)! 29h k−l
< , (2.10)
l k − 2l (h + 2 − k)! 4
in the case k/3 < l < k/2 the term (3l − k)3l−k (k − 2l)k−2l can be estimated
from below by k l 4l (29e/8)l−k . Also, (h + 2 − k)h+2−k > e2 (h/2)h+2−k . Using
these estimates, it is straightforward to check that (2.10) holds. If l ∈ {k/3, k/2}
then ( k−2ll
) = 1, and Stirling’s formula easily yields (2.10).
What does it mean that A(c) fails? Let A = (a0 , a1 , . . . , ah+1 ) be the sequence
of elements of H ∪ {0, m + 1}, listed in increasing order. Then let A1 , . . . , An
be the maximal subsequences of consecutive elements with difference less than
c(m + 1); that is, ai+1 − ai < c(m + 1) if ai and ai+1 are in the same A j , but
min(A j+1 ) − max(A j ) ≥ c(m + 1). Each A j of size greater than 1 can be par-
titioned into subsequences of length two and three, which can be considered as
the f -images of the set H0∗ . Therefore, if A(c) fails then j ∈ J |A j | < k, where
the summation runs over the A j with |A j | > 1. Hence more than h −k elements
x ∈ H are in one-element sets A j , which means that δ H (x) ≥ c(m+1). In particu-
lar, if h is even then we choose c := 2ε/ h and obtain that spread(H ) ≥ (h−k)c =
ε with probability greater than 1 − (15ε)k−l ≥ 1 − (15ε)h/4 . If h is odd then we
choose c := 2ε/(h +1) and obtain that spread(H ) ≥ (h −k)c = ε with probabil-
ity greater than 1−(15εh/(h+1))k−l ≥ 1−(15εh/(h+1))(h−1)/4 > 1−(15ε)h/4 .
(The last inequality is the only point in the proof of case h ≥ 6 where we used
the lower bound ε ≥ 1/30.)
Theorem 2.4.2. Let 1/30 ≤ ε < 1/15, C > 0, and r > (4 + 4(C + 1) log m)/
log(1/(15ε)). Furthermore, let P = { p1 , . . . , pr } be a set of r random permu-
tations of {1, 2, . . . , m}. Then P is an (ε, m)-spreader with probability at least
1 − m −C .
Here, the first inequality is true because the condition imposed on r implies that
(15ε)r/4 < 1/(me), and (1 + x)m < 1 + 2xm for 0 < x < 1/(me). The second
inequality is a straightforward consequence of the condition imposed on r .
Proof. Let H := {s ∈ S | s ∈ K } and let a1 < a2 < · · · < ah denote the posi-
tions corresponding to the elements of H in the randomly chosen permuta-
tion p ∈ P. We also define a0 := 0 and ah+1 := |S| + 1 and let g j denote the
random prefix of S corresponding to the first j elements of p. Then, for a
fixed i ∈ [1, h], either g j ∈ K for all j with ai−1 ≤ j < ai or g j ∈ K for all
j with ai ≤ j < ai+1 (it is also possible that both of these events occur).
So Prob(g ∈ K ) ≥ spread({a1 , . . . , ah })/2 and, since P is an (ε, |S|)-spreader,
spread({a1 , . . . , ah }) ≥ ε with probability at least 1/r .
2.4.2. Applications
We shall apply random prefixes to G-closure computations. We consider the
same situation as in Section 2.3.3, namely, G = S acts on H , we can compute
h g ∈ H for any h ∈ H, g ∈ G, and an upper bound l H is known for the length of
subgroup chains in H .
O(l H log |S|(log l H + log log |S|)3 + |A| log l H + |S| log |S|)
group operations.
2.4 Random Prefixes 45
Proof. We can suppose that |A| ∈ O(l H ) because if |A| is too big then we
construct O(l H ) generators for A. By Theorem 2.3.6, this can be done by
O(|A| log l H ) group operations.
We would like to apply Lemma 2.4.4. We can fix a value ε in the interval
[1/30, 1/15), and there is no problem concerning the set S: We can construct an
(ε, |S|)-spreader P = { p1 , . . . , pr1 } of size O(log |S|) and collect in a set TG all
products of elements of S corresponding to initial segments of the permutations
in P. For the construction of random permutations in P, see Exercise 2.1. The
construction of TG requires O(|S| log |S|) group operations.
However, the set U containing the already constructed generators for A G
changes during the algorithm, so we cannot precompute the product of initial
segments in random permutations of U . Computing spreader permutations of
increasing degree as |U | changes would take too much time. Therefore, we
proceed in the following way.
Let m = cl H log |S|(log l H + log log |S|) for a sufficiently large constant c
and let Q = {q1 , . . . , qr2 } be an (ε, m)-spreader of size O(log m) = O(log l H +
log log |S|). We can suppose that m is a power of 2. We use an m-long array
U to store generators for A G . Initially, U contains the elements of A and the
rest of U is padded with the identity element. As new generators x for A G
are constructed, we replace one of the identity elements in U by x. Now comes
the idea that speeds up the computation. Instead of products of initial segments
of the U qi , we store the product of elements in positions l2 j + 1, l2 j + 2, . . . ,
(l + 1)2 j in U qi for all i ∈ [1, r2 ], j ∈ [0, log m], and l ∈ [0, m/2 j − 1]. This
requires the storage of less than 2mr2 group elements.
From this data structure TH , the product of the first k elements of U qi
can be computed with O(log m) = O(log l H + log log |S|) group operations,
by taking segments of lengths corresponding to the terms in the binary ex-
pansion of k. Also, after replacing an element of U , the data structure can
be updated by at most r2 log m ∈ O((log l H + log log |S|)2 ) group operations,
since after updating segments of length 2 j , the r2 segments of length 2 j+1
containing the new element can be updated by one multiplication per seg-
ment.
This gives us the following algorithm: Replace the identity elements in U by
group elements of the form u g as described in Lemma 2.4.4, always updating the
data structure TH . If the construction of the spreaders P, Q was successful then
Lemma 2.4.4 and an application of the basic type of Lemma 2.3.3 imply that
by the time all m − |A| ∈ O(l H r1r2 ) identity elements are replaced, U generates
A G with probability greater than 1 − e−c l H for some constant c > 0. The
number of group operations required is O(mr2 log m) = O(l H log |S|(log l H +
log log |S|)3 ).
46 Black-Box Groups
Corollary 2.4.6. Suppose that an upper bound l G is known for the length of
subgroup chains in a group G and that O(l G ) generators are given for G and
for some H ≤ G. Let δ > 0. Then there is a Monte Carlo algorithm that, with
probability at least 1 − δ, constructs O(l G ) generators for the normal closure
H G , using O(l G log4 l G ) group operations.
Proof. By Theorem 2.4.5, O(l G log2 l G ) generators can be obtained by the indi-
cated number of group operations. By Theorem 2.3.6, this generating set can be
reduced to one of size O(l G ) with O(l G log3 l G ) further group operations.
Exercises
2.1. Design an algorithm that constructs a uniformly distributed random ele-
ment of Sn in O(n) time. (You can suppose that a random element of a
list can be chosen in O(1) time.) Hint: We have to construct an injective
function f : [1, n] → [1, n]. When f (1), . . . , f (k) are already defined,
store the remaining n − k possible function values in an array of length
n − k. How should this array be modified when f (k + 1) is defined?
2.2. Design an algorithm that constructs a uniformly distributed random ele-
ment of An .
2.3. Let M be a finite state Markov chain, with transition probability matrix
P. Prove that pi(k) k
j is the (i, j)-entry in P .
2.4. Let M be a Markov chain and suppose that states u, v can be reached
from each other. Prove that u and v are both aperiodic or they have the
same period.
2.5. Prove that the stationary distribution of a finite, irreducible, aperiodic
Markov chain is the uniform one if and only if the column sums of the
transition probability matrix are all 1.
2.6. Prove the inequality (2.11).
2.7. [Beals and Babai, 1993] Suppose that G is a nonabelian black-box
group and N G, N = G. Suppose also that a subset A := {g1 , . . . ,
gk } ⊆ G\{1} is given, and we know that A ∩ N = ∅. Design a Monte
Carlo algorithm that, with high probability, computes a nontrivial ele-
ment of a proper normal subgroup of G. Hint: If {a, b} ∩ N = ∅ then
[a, b] ∈ N . What can we do if a and b commute?
3
Permutation Groups
A Complexity Overview
In this chapter, we start the main topic of this book with an overview of permu-
tation group algorithms.
48
3.1 Polynomial-Time Algorithms 49
polynomial-time toolkit. The following list is a part of that toolkit; we overview
those tasks that can be done by a polynomial-time algorithm for all inputs
G = S ≤ Sn . For special classes of groups (mostly for solvable groups or,
slightly more generally, for groups with nonabelian composition factors of
bounded order), there is a significant extension of this list. We mention some
of the additional problems, which can be handled in polynomial time if the
input group has bounded nonabelian composition factors, in Section 3.3. Also,
as a general rule, it seems to be easier to compute a targeted subgroup if it is
normal in G: For example, C G (H ) is computable in polynomial time if H G
(since H G implies C G (H ) G) but no polynomial-time algorithm is known
for arbitrary inputs H ≤ G.
Let G = S ≤ Sym(). Then the following tasks can be computed by deter-
ministic polynomial-time algorithms:
(m) Given H ≤ G, find CoreG (H ). (For the definitions used in (m) – (p), see
Section 1.2.1.)
(n) Find the socle of G.
(o) Find the intersection of all maximal normal subgroups of G.
(p) For any collection
of simple groups, find O
(G) and O
(G).
(q) Find all of the above in quotient groups of G.
We give historical remarks and indicate some of the group theory involved
in the different algorithms.
Item (a) is essentially all we can do using only the given generators of G
(although there are some exceptions, e.g., testing whether a permutation group
is regular). The first algorithm for computing blocks of imprimitivity appeared
in [Atkinson, 1975]. For all other tasks, we need to compute bases and strong
generating sets (cf. Section 4.1), the basic data structures introduced in the
seminal works [Sims, 1970, 1971a]. It was first analyzed in [Furst et al., 1980]
that Sims’s algorithm for computing strong generating sets runs in polynomial
time. Items (b)–(g) are straightforward consequences of the strong generator
construction. Concerning (d), we note that the presentation obtained from such
a construction usually contains redundant defining relations; the first method
to delete some of the redundant relators is in [Cannon, 1973]. Sims’s strong
generating set construction uses only basic group theory: Lagrange’s theorem
and Schreier’s lemma (cf. Lemma 4.2.1).
Computing the center is also elementary (cf. [Luks, 1987]); in fact, it can be
reduced to a point stabilizer construction (cf. [Cooperman et al., 1989]). The
preliminary version of [Luks, 1987] for finding a composition series, circulated
since 1981, was the first algorithm based on consequences of the classification
of finite simple groups. [Neumann, 1986] also contains an algorithm for com-
puting a composition series, which runs in polynomial time for groups with
some restrictions of the primitive groups involved in them. Problem ( j), the
identification of simple groups, is solved in [Kantor, 1985b].
The nontrivial case of (k)(i) is when H is abelian. This was resolved in
[Rónyai, 1990], as an application of methods handling associative algebras.
(k)(ii) is a direct consequence of the first part.
Item (l), the work on the algorithmic Sylow machinery in [Kantor, 1985b,
1990] is the most complicated algorithm in our list: In addition to consequences
of the classification of finite simple groups, it requires a case-by-case study of the
classical finite simple groups. Considering the elementary nature of the Sylow
theorems and their important role in group theory, it would be worthwhile to
find simpler algorithms to deal with Sylow subgroups.
Problem (m) is solved in [Kantor and Luks, 1990]. The algorithm uses (l)
and so the classification of finite simple groups. The nonabelian part of the
3.2 Nearly Linear-Time Algorithms 51
socle is first computed in [Babai et al., 1983] whereas the abelian part, based
on [Rónyai, 1990], is obtained in [Kantor and Luks, 1990].
The solution of (o) and computing O
(G) is implicit in [Babai et al., 1987].
The subgroups O p (G) are obtained in [Kantor, 1985b] and [Neumann, 1986]
and the general case of O
(G) is handled in [Kantor and Luks, 1990].
The paper [Kantor and Luks, 1990] generalized the entire polynomial-time
toolkit to work with quotient groups G/K of G ≤ Sn . The trivial approach,
namely finding a permutation representation of the factor group, does not work:
In Exercise 6.6, we shall give examples K G ≤ Sn from [Neumann, 1986] for
an infinite sequence of n values so that no faithful permutation representation
of G/K acts on less than 2n/4 points. The quotient group machinery required
a radically new approach and it is based on Sylow subgroup computations.
Therefore, it is much more difficult to compute, for example, the upper central
series of G than to compute Z (G).
We shall present algorithms for most of the tasks listed in our toolkit. How-
ever, in most cases, these will not be the deterministic polynomial-time versions.
Rather, we shall describe faster, randomized algorithms, which we define in the
next section.
consists of one cyclic group or it is the family of all cyclic groups. The price we
pay for the speedup is that, with a few exceptions, the algorithms are random-
ized Monte Carlo (cf. Section 1.3 for the definition), and so they may return
incorrect answers. Also, these algorithms may not have the lowest time com-
plexity on general inputs. For example, the version of basic permutation group
manipulation from [Babai et al., 1991], which we present in Section 4.5, runs
in O(n log4 |G| log n + |S|n log |G|) time, which is O ∼ (n 5 + |S|n 2 ) in the worst
case. However, there are O ∼ (n 4 +|S|n 2 ) deterministic (cf. [Babai et al., 1997b])
and O ∼ (n 3 + |S|n) Monte Carlo (cf. [Babai et al., 1995]) algorithms for the
same task. We mention, however, that usually some of the log |G| factors in the
running time estimates can be replaced by a bound on the length of subgroup
chains, which may be significantly smaller than log |G|.
A recent direction of research is to upgrade the Monte Carlo nearly linear-
time algorithms to Las Vegas–type algorithms. The papers [Kantor and Seress,
1999, 2001] establish a general framework for the upgrade, which we shall
discuss in Section 8.3.
Most of the nearly linear-time algorithms listed above are implemented in
GAP and are available in the library of the standard GAP distribution. Practical
computations almost exclusively deal with small-base input groups. The nearly
linear-time algorithms usually run faster than quadratic ones even in the indi-
vidual cases when the factor logc |G| occurring in their running time estimate
is larger than n. Because of their practical importance, we devote a significant
portion of this book to nearly linear-time algorithms.
large library of algorithms has been developed; we mention [Butler, 1982] for
a general description of backtrack searches, [Butler, 1983] and [Holt, 1991] for
computing normalizers, and [Butler and Cannon, 1989] for computing Sylow
subgroups. A second generation of backtrack algorithms is developed in [Leon,
1991], where ideas from the graph isomorphism program nauty from [McKay,
1981] are utilized. The thesis [Theißen, 1997] extends Leon’s method to obtain
a very efficient normalizer algorithm.
4
Bases and Strong Generating Sets
where G [i] := G (β1 ,...,βi−1 ) is the pointwise stabilizer of {β1 , . . . , βi−1 }. The base
is called nonredundant if G [i+1] is a proper subgroup of G [i] for all i ∈ [1, m].
Different nonredundant bases can have different size (cf. Exercise 4.2).
By repeated applications of Lagrange’s theorem, we obtain
m
|G| = G [i] : G [i+1] . (4.2)
i=1
Because the cosets of G [i] mod G [i+1] correspond to the elements of the orbit
[i] [i]
βiG , we obtain |G [i] : G [i+1] | = |βiG | ≤ n for all i ∈ [1, m]. Moreover, if B is
nonredundant then |G [i] : G [i+1] | ≥ 2. These inequalities, combined with (4.2),
immediately yield 2|B| ≤ |G| ≤ n |B| , or log |G|/ log n ≤ |B| ≤ log |G|. The
last inequality justifies the name “small-base group” introduced in Section 3.2:
When |G| is small, so is |B|.
A strong generating set (SGS) for G relative to B is a generating set S for
G with the property that
S ∩ G [i] = G [i] , for 1 ≤ i ≤ m + 1. (4.3)
55
56 Bases and Strong Generating Sets
the set [1, 4] = {1, 2, 3, 4}. The sequence B = (1, 2, 3) is a nonredundant base
for G, since G [1] = Sym([1, 4]) G [2] = Sym([2, 4]) G [3] = Sym([3, 4])
G [4] = 1. The sets S = {(1, 2, 3, 4), (3, 4)} and T = {(1, 2, 3, 4), (2, 3, 4), (3, 4)}
generate G, but S is not a strong generating set relative to B since S ∩ G [2] =
Sym([3, 4]) = G [2] = Sym([2, 4]). In contrast, T is an SGS relative to B.
[i]
Given an SGS, the orbits βiG can be easily computed (cf. Theorem 2.1.1(i)).
These orbits are called the fundamental orbits of G. By (4.2), from the orbit
sizes we immediately get |G|. Also, keeping track of elements of G [i] in the orbit
[i]
algorithm that carry βi to points in βiG , we obtain transversals Ri for G [i] mod
G [i+1]
. We always require that the representative of the coset G [i+1] · 1 = G [i+1]
is the identity.
g3 := g2r2−1 ∈ G [3] ; etc. This factorization procedure is called sifting and may
be considered as a permutation group version of Gaussian elimination.
Sifting can also be used for testing membership in G. Given h ∈ Sym(),
we attempt to factor h as a product of coset representatives. If the factor-
ization is successful then h ∈ G. Two things may go awry: It is possible
that, for some i ≤ m, the ratio h i := hr1−1r2−1 · · · ri−1
−1
computed by the sift-
[i]
ing procedure carries βi out of the orbit βi ; the other possibility is that
G
Lemma 4.2.1. Let H ≤ G = S and let R be a right transversal for G mod
H , with 1 ∈ R. Then the set
T = {r s(r s)−1 | r ∈ R, s ∈ S}
generates H .
58 Bases and Strong Generating Sets
The elements of T are called Schreier generators for H .
Remark 4.2.2. In this book we deal only with finite groups, and so every
element h of a given group G = S can be written as a product h = s1 s2 · · · sk
of generators and we do not have to deal with the possibility that some si is the
inverse of a generator. In the proof of Lemma 4.2.1, we included the possibility
si ∈ S −1 since this lemma is valid for infinite groups as well, and in an infinite
group we may need the inverses of generators to write every group element as
a finite product.
In the analysis of the Schreier–Sims algorithm, we shall also use the following
observation from [Sims, 1971a].
S j β j = S j+1 (4.5)
holds for all 1 ≤ j ≤ k then B = (β1 , . . . , βk ) is a base for G and S = 1≤ j≤k Sj
is an SGS for G relative to B.
Some remarks are in order. First, we note that in both versions a factor n can
be shaved off from the term including |T | in the time analysis (Exercise 4.4).
The time–space tradeoff in complexity demonstrated in Theorem 4.2.4 is
quite common in computational group theory. Because of hardware restrictions,
we usually choose versions with smaller memory usage.
In the analysis of the second version, we estimated the depth of Schreier trees
by n. As a cyclic group with one generator shows, this bound can be achieved;
however, practical experience indicates that, for most groups and generating
sets, a breadth-first-search Schreier tree will have significantly smaller depth.
Nevertheless, even in such nicely behaving groups, Schreier trees with large
depth may occur! During the construction, when we define a new base point,
the last group in our stabilizer chain is temporarily a cyclic group with one
generator. If the group contains elements of large order, this cyclic group may
define a Schreier tree of large depth. When a second generator is added to this
level, the breadth-first Schreier tree defined by the two generators may have
small depth; however, if we always augment previous Schreier trees, a long
branch remains in the tree forever. (This situation occurs for example in the
group GLd (q), acting as a permutation group on the q d points of the underlying
62 Bases and Strong Generating Sets
vector space.) Hence, it may be useful to recompute the ith Schreier tree from
scratch each time a new generator is added to Si . Depending on the types of
groups with which we work, this recomputation may mean savings in the sifting
part of the algorithm or it may be just a waste of time because the augmented tree
works just as well. We brought up this issue to indicate the nature of problems
we may face at implementation, in addition to the complexity considerations.
For arbitrary inputs, the worst-case complexity of the first version is O(n 5 )
(as in the memory estimates, the extra log n factors can be avoided by using
stronger bounds than the trivial log(n!) on the length of subgroup chains).
[Knuth, 1991] gives a family of generators for Sn for which the running time
of the algorithm is indeed (n 5 ). Based on experiments, we believe that, using
random generators for Sn , the expected running time is (n 4 ). In Section 10.2,
we shall describe faster algorithms for the recognition and effective handling
of symmetric and alternating groups.
SGS constructions are essential parts of almost all permutation group al-
gorithms since strong generating sets provide the only known means to test
membership in groups. Therefore, it is of utmost importance to have efficient
ways to construct strong generating sets. This importance is the reason why we
deal at length with the various attempts to speed up the Schreier–Sims algorithm.
Of course, we want to apply the solution in the case when G = Si and H =
Si+1 for two consecutive levels during the SGS construction.
Practical experience shows that if H = G β then usually at least half of the
Schreier generators constructed from R and S are in G β \H . Therefore, testing
only a random sample of Schreier generators, we obtain a heuristic solution for
(4.6), and thus a heuristic randomized algorithm for SGS construction.
We caution that, although testing a small number of randomly chosen Schreier
generators usually suffices, it is easy to construct examples when there is only
one generator g ∈ S such that Schreier generators obtained from g and R may
show H = G β , because for all k ∈ S\{g} and r ∈ R, r k(r k)−1 ∈ H (cf. Exercise
4.5). Nevertheless, it is possible to extend the heuristics above to an algorithm
4.3 The Power of Randomization 63
with rigorous probability analysis, leading to a fast Monte Carlo SGS construc-
tion. We shall describe this extension in Section 4.5.
Another possibility for a randomized SGS construction is based on the fact
that if an alleged SGS is not correct then, with probability at least 1/2, a uni-
formly distributed random element of G does not sift through the correspond-
ing transversal system. More precisely, let G = S, B = (β1 , . . . , βm ) ⊆
such that no element of S fixes B pointwise, and let Si := S ∩ G (β1 ,...,βi−i ) for
S
1 ≤ i ≤ m + 1. For 1 ≤ i ≤ m, we can compute the orbits βi i and sets of
Si
group elements Ri ⊆ Si such that for all γ ∈ βi there is a unique rγ ∈ Ri
r
with βi γ = γ . (Note that Ri may be only a proper subset of a transversal for
Si mod Si+1 if S is not an SGS for G. The Ri may be computed explicitly
or stored in a Schreier tree data structure.) Given h ∈ Sym(), we can apply the
sifting procedure to attempt to factor h in the form h = rm · · · r1 with ri ∈ Ri .
We say that h sifts through if this factorization procedure succeeds.
Proof. Let i be the largest integer such that Si βi Si+1 . By Lemma 4.2.3,
such an i exists and we have a correct SGS for Si+1 . In particular, we can test
membership in the group Si+1 .
Let p be the probability that a given uniformly distributed random g ∈ G
sifts through the first i levels of our data structure and, in the case when
g sifts through, let ḡ denote the unique element of Ri · · · R2 R1 , which is
the product of transversal elements computed by the sifting procedure. Then
g(ḡ)−1 ∈ G (β1 ,...,βi ) , and every element of G (β1 ,...,βi ) has the same chance to
occur as g(ḡ)−1 for some g ∈ G. Hence Prob(g(ḡ)−1 ∈ Si+1 ) = 1/|G (β1 ,...,βi ) :
Si+1 |. We know that |G (β1 ,...,βi ) : Si+1 | ≥ 2, since G (β1 ,...,βi ) ≥ Si βi Si+1 ;
we also know that g(ḡ)−1 ∈ Si+1 if and only if it sifts through the transver-
sal system of Si+1 . In summary, g does not sift through with probability
(1 − p) + (1 − 1/|G (β1 ,...,βi ) : Si+1 |) p ≥ 1 − p + p/2 ≥ 1/2.
The randomized method of [Cooperman et al., 1990], from which the follow-
ing two lemmas are taken, constructs a Schreier tree of depth O(log |α G |) for G
mod G α . It assumes the availability of uniformly distributed random elements
of G. A variant of Lemma 4.4.4 seems to have first appeared in [Erdős and
Rényi, 1965].
For a random variable A, the expected value of A is denoted by E(A).
Proof. The basic idea of the proof is the following. Let D j denote the cube of
the first j random elements, and let j := δ D j . We also fix some 0 < p < 1/2.
g
Then j+1 = j ∪ j j+1 , and Lemma 4.4.5 implies that, with probability at
least p, the size of the j increase by at least a constant factor when | j | ≤
m/2, and the size of the \ j decrease by at least a constant factor when
| j | > m/2. Hence, after O(log m) rounds, δ D j = . Also, the vertices in
δ D j are on the first j levels of the Schreier tree (note that the Schreier tree uses
the inverses of the gi as labels, since edges are directed toward the root δ).
We can obtain the estimate for the depth of the Schreier tree described in the
statement of the theorem by a refinement of the argument in the previous para-
graph. Let us fix p := 0.46. We define a subsequence H = (h 1 , h 2 , . . . , h s )
of (g1 , g2 , . . . , gt ), consisting of those gi for which the already constructed
part of increases by the amount indicated in Lemma 4.4.5. Formally, sup-
pose that (h 1 , h 2 , . . . , h j ) is already constructed, where h i = gki for some k1 <
k2 < · · · < k j . Let C j denote the cube of (h 1 , h 2 , . . . , h j ), and let j := δ C j .
If | j | ≤ m/2 then we define h j+1 as the gi with the smallest index i > k j
satisfying
gi
∪ j ≥ | j | 2 − | j | ,
j
0.54m
whereas if | j | > m/2 then we define h j+1 as the gi with the smallest index
68 Bases and Strong Generating Sets
i > k j satisfying
g (m − | j |)2
m − j i ∪ j ≤ .
0.54m
We have to show that
(a) s ≥ &2 log m' + 4, with probability at least 1 − m −0.29 if t ≥ 8 log m + 16,
and with probability at least 1 − m −0.29c if t ≥ c(8 log m + 16); and
(b) δ C&2 log m'+4 = .
(a) and (b) imply that the breadth-first-search tree using the first &2 log m' + 4
of the h i as generators already has depth at most &2 log m' + 4 and, of course,
the depth of the breadth-first-search tree using all gi cannot be greater.
(a) is an easy consequence of Lemma 2.3.3, applied with the parameters
ε = 0.4, t = 8 log m + 16 , p = 0.46 and ε = 1 − 0.6/c, t = c(8 log m +
16) , p = 0.46 respectively. To prove (b), consider the function
For m < 28 /0.54, (4.8) can be checked by a small computer calculation. For
m ≥ 28 /0.54, we can argue as follows. If k > 1.36 log m − 3.35 > log5/3 0.18m
{k}
then f m (1) ≥ 0.18m, since f m (x) ≥ 5x/3 for x ≤ 0.18m. If x ≥ 0.18m then
{5}
f m (x) > 0.625m. Finally, if x > 0.625m and k > 0.93 + log log 0.54m then
{k}
f m (x) = m. To see this, observe that for such an x and k,
k
(m − x)2
m− f m{k} (x) ≤ < 1.
(0.54m)(2k −1)
{k}
Since f m (x) is an integer, it must be equal to m. To finish the proof, we check
that 1.36 log m − 3.35 + 5 + 0.93 + log log 0.54m < 2 log m + 4 if m ≥
28 /0.54.
The cutoff point 0.18m and the probability p = 0.46 in the previous analysis
are quite arbitrary. Using a cutoff point closer to 0 and probability closer to 0.5,
the same argument as in the proof of Theorem 4.4.6 shows that, for each fixed
ε > 0, the breadth-first-search tree of (2 + ε) log m + c random group elements
has depth at most (1 + ε) log m + c with probability at least 1 − m −C . Here c and
C are positive constants, depending on ε. In practice, the breadth-first-search
tree of log m random elements usually has depth less than log m.
Given an arbitrary strong generating set for a group G, a slight modifica-
tion of the idea of Lemma 4.4.2 produces an SGS of size at most log |G|,
which also defines shallow Schreier trees. The following lemma is from
[Cooperman and Finkelstein, 1992].
Lemma 4.4.8. Let G = S and suppose that S is a strong generating set relative
to the base B = (β1 , . . . , βm ) with the corresponding point stabilizer chain G =
G [1] ≥ G [2] ≥ · · · ≥ G [m+1] = 1. Then an SGS R relative to B can be computed
in nearly linear time by a deterministic algorithm, such that |R| ≤ log |G| and
the breadth-first-search Schreier tree defined by (R ∪ R −1 ) ∩ G [i] for G [i] mod
G [i+1] has depth at most 2 log |G [i] |, for 1 ≤ i ≤ m.
[i]
βiG then we decrease i by 1. The procedure stops when i reaches 0.
Lemma 4.5.1. The Schreier tree computations of the SGS construction require
O(n log n log4 |G|) total time and O(n log |G|) memory. This part of the SGS
construction algorithm is of Las Vegas type.
Proof. We use the notation introduced at the beginning of this section. There
are at most log |G| base points, and each base point βi is processed at most
log |G| times, since the subgroup Si increases at each processing. Hence the
algorithms in Lemma 4.4.2 and Remark 4.4.7 are invoked at most log2 |G|
times. Lemma 4.4.2 is deterministic. Therefore, to obtain a fixed, but arbitrarily
small, constant error probability for the entire algorithm, we have to ensure
that each invocation of the algorithm in Remark 4.4.7 fails with probability
at most c / log2 |G|, for some constant c prescribed by the overall error re-
quirement. This can be achieved by choosing the value of c in the statement of
Theorem 4.4.6 as c = c log n for an appropriate constant c . In fact, using a
large enough but constant c , we can reduce the error probability to less than
1/n d , since log2 |G| < (n log n)2 , and so the algorithm in Remark 4.4.7 fails
with probability less than 2−0.29c log n < 1/(log2 |G|n d ). Note that, when invok-
ing this algorithm on level j, during the execution of the algorithm we actually
S
check whether the vertex set of the Schreier tree we construct is β j j ; therefore
the Schreier tree computation part of the SGS algorithm is of Las Vegas type.
Concerning the time requirement, note that |Si | is O(log |G|) during the
S
entire algorithm, since Si is obtained by adding O(log |βi i |) elements to Si+1 .
Hence one call of the algorithm in Lemma 4.4.2 runs in O(n log2 |G|) time.
On level i, one call of the algorithm in Remark 4.4.7 runs in O(n log |G| ·
log n log |G [i] : G [i+1] |) time, since our requirement on the depths of Schreier
trees ensures that a random element can be generated by O(log |G|) multipli-
cations and, as discussed at the error estimate part above, we have to generate
O(log n log |G [i] : G [i+1] |) random elements.
Since both kinds of Schreier tree computations are called at most log2 |G|
times, the total time requirement is as stated.
What is left is to describe the solution for (4.6). This solution is based on the
following local expansion lemma from [Babai, 1992].
Lemma 4.5.3. Let S denote a set of generators of the group G and set T =
S ∪ S −1 ∪ {1}. Let D be any finite subset of T t , the set of t-term products of
members of T (in any order). Suppose that there exists q with 0 < q ≤ 1/(2t +1)
such that |D| ≤ (1 − 2qt)|G|. Then, for at least one generator g ∈ S,
|D\Dg| ≥ q|D|.
Proof. Suppose, on the contrary, that |D\Dg| < q|D| for all g ∈ S. Using this
assumption, we first prove that G = T 2t .
Observe that, for each g ∈ S, |D\Dg −1 | = |Dg\D| = |D\Dg| < q|D| and
for any x, y ∈ G,
Hence, if 4ct random elements d ∈ D are chosen and dg ∈ D for all such d and
all g ∈ S then Prob(H = G β ) > 1 − e−c .
How can we test that dg ∈ D? Observe that each d ∈ D is factored uniquely
in the form d = hr for some h ∈ H and r ∈ R. Moreover, dg = hrg ∈ D if
and only if hrg(hrg)−1 ∈ H . (As usual, k̄ denotes the element of R such that
k ∈ G β k̄.) Now h fixes β, so hrg = rg. Also, hrg(hrg)−1 ∈ H if and only if
rg(hrg)−1 ∈ H . Summarizing, we obtain that dg ∈ D if and only if the Schreier
generator rg(rg)−1 ∈ H . Consequently, instead of generating random elements
of D, it is enough to check whether Schreier generators constructed by com-
bining randomly chosen elements of R with all elements of S are in H . Hence
the solution of (4.6) we obtained can be considered as a theoretical justification
(with error estimate) of a version of the first heuristic described in Section 4.3.
The algorithm, as just described, requires that, after choosing some random
r ∈ R, we check the Schreier generators rg(rg)−1 for all g ∈ S. Our next goal is to
present an improvement that avoids checking all elements of S. Picking random
elements of S may not work well; this is the essence of Exercise 4.5. Instead, we
use an enhancement of the random subproduct method (cf. Section 2.3). Let k
be an integer chosen randomly from the uniform distribution on [0, |S| − 1] and
let k = &k /2'. Moreover, let w1 be a random subproduct of a random ordering
of S and let w2 be a random subproduct of a random ordering of a random subset
S of size k of S (cf. Exercise 2.1 for the construction of random orderings).
Finally, for g ∈ G, D ⊆ G, and integer a, let P(g, a) denote the proposition
that |{d ∈ D | dg ∈ D}| ≥ a. Note that
and
Since max{(1 − p)/2, p/2} ≥ 1/4, P(wi , a/4) holds with probability at least
1/4 for i = 1 or 2.
The term |T |n log |G| in the running time stems from the handling of level 0.
Recall that we stored the original generators in a set S0 . When the data structure
is up to date below level 0, we have to sift the elements of S0 in S1 to ensure
that G = S1 .
By Remark 4.5.2, the Schreier tree computations run in O(n log n log3 |G|)
time. In Section 5.2 we shall give a version of the algorithm that, with high
probability, runs in O(n log n log3 |G|) time for small-base group inputs.
4.5.1. Implementation
A version of the nearly linear-time Monte Carlo SGS construction was first im-
plemented in the C language (see [Seress and Weisz, 1993]). The main purpose
was to experiment with the different Schreier tree constructions and to get a
feeling for the true probabilities in (4.12), which turn out to be much higher
(usually, at least 1/3) than the estimate 1/64t obtained in (4.12).
It turns out that almost always the Schreier tree construction described in
Remark 4.4.3 already produces at most log |α G | generators, such that these
generators and their inverses define a Schreier tree of depth at most log |α G | for
76 Bases and Strong Generating Sets
G mod G α , and applying the algorithm in Remark 4.4.7 does not improve the
depth significantly. Hence, in the GAP implementation, we use the algorithm
in Remark 4.4.3, and no additional work for Schreier tree computations is
necessary.
As indicated already, the practical performance of the algorithm is far better
than our probability estimate in (4.12). One reason may be that the Schreier tree
construction of Lemma 4.4.2 “randomizes” the coset representatives in the sense
that by writing out a coset representative as a product of original generators, the
length of the word may be exponential compared to the distance from the root of
the Schreier tree. Also, the usage of the random subproducts w1 and w2 provides
further random mixing of the generators. So, although we could guarantee only
with probability (1/ log |G|) that a Schreier generator obtained from a random
coset and the wi s witnesses that H = G β , the practical performance is probably
closer to the case when the Schreier generator is constructed using a random
element as a generator. Such a Schreier generator has a chance of at least
1 − 1/|G β : H | ≥ 1/2 to witness that H = G β .
Based on that experience, the following very fast heuristic version is imple-
mented in GAP. While working on level i of the data structure, we test only
one Scheier generator obtained from a random coset and (the long) random
subproduct w1 . If this Schreier generator sifts through Si+1 then we declare
that Si βi = Si+1 and that the data structure is up to date below level i − 1.
Although such a declaration may be wrong with a significant probability, indi-
vidual errors seem to be corrected as the Schreier–Sims procedure moves along
the base points. We mention that the implementation uses a further heuristic, re-
placing the sifting of Schreier generators; we shall describe this further heuristic
at the end of Section 5.2.
To ensure the correctness of the output, the user has two options in GAP. The
default value is a deterministic test, which decides with certainty whether the
SGS construction is correct. The user also has the option to choose a randomized
test, which declares that, with probability at least 1 − δ, the SGS construction
is correct; the error bound δ is specified by the user.
In this section we describe only the randomized SGS construction test, which
is based on the following lemma. The deterministic strong generating set used
in GAP is the so-called Verify routine by Sims, which we shall discuss in
Section 8.2. In Chapter 8, we shall present other strong generating tests as well.
Proof. By Lemma 4.2.3, if S is not a strong generating set then Si βi Si+1
for at least one i ∈ [1, m]. This implies that |D| ≤ |G|/2 and Lemma 4.5.3 can
be applied with q = 1/4t. We obtain that Prob(dh ∈ D) ≥ 1/4t for some h ∈ S
and so, by Lemma 4.5.4, Prob(gw j ∈ D) ≥ 1/64t.
It is clear that if 64t ln(1/δ) pairs gw1 , gw2 sift through D then the probability
that S is indeed an SGS is at least 1 − δ.
The advantages of separating the construction of an SGS and the checking of
the correctness of the construction are twofold. On one hand, in the oft occurring
case when |G| is known in advance, it is very easy to check correctness by
comparing |D| and |G| and, in the case |D| = |G|, we can terminate already
after the fast heuristic construction. On the other hand, even if |G| is not known
in advance, we have to apply the expensive sifting of Schreier generators about
a factor log2 |G| less times than in the original algorithm, which checks whether
Si βi = Si+1 each time a level i is processed.
If the input SGS is not correct then the probability that gw j ∈ D seems to
be much higher in practice than the estimate 1/64t in Lemma 4.5.6, and errors
are detected very quickly. In the case when gw j ∈ D is detected, the siftee of
gw j can be added to S and we can return to the construction phase of the
algorithm. Hence, the vast majority of time in an SGS construction is spent
checking the already correct final result. However, since the algorithm is Monte
Carlo, we cannot terminate and guarantee the correctness of the answer with
the prescribed probability earlier than the number of checks implied by the
estimate 1/64t in Lemma 4.5.6.
Exercises
4.1. Given G = S, construct less than n 2 generators for G in O(|S|n 2 ) time.
Hint: Sift the elements of S.
4.2. Give an example of a permutation group G ≤ Sn with two nonredundant
bases: One of size &log |G|' and one of size log |G|/ log n . Hint: G can
be chosen cyclic.
4.3. Prove Lemma 4.2.1 without the assumption that 1 ∈ R.
4.4. Design a version of the Schreier–Sims algorithm that handles all base
points uniformly (i.e., do not place all input generators immediately into
78 Bases and Strong Generating Sets
the generating set for G [1] ) and that achieves the time savings indicated
following Theorem 4.2.4.
4.5. (Luks) Let H = {h 1 , . . . , h n−2 } be a regular group acting on {1, . . . , n −
2}, and let G = h 1 , . . . , h n−2 , (n − 1, n). Show that B = (1, n − 1) is
a nonredundant base for G but the probability that a randomly chosen
Schreier generator detects that G 1 = 1 is only 1/(n − 1).
4.6. Determine the time requirement of the algorithm presented in the proof of
Lemma 4.4.8.
4.7. [Sims, 1971a] Design and analyze an algorithm that computes an SGS of
size at most log |G| as a subset of a given SGS S. Hint: Let S be an SGS
relative to the base (β1 , . . . , βm ), let G = G [1] ≥ · · · ≥ G [m+1] = 1 be
the corresponding point stabilizer chain, and let Si = S ∩ G [i] . List the
elements of S such that Si+1 comes before Si \Si+1 , for all i ∈ [1, m]. Do
an orbit computation to decide whether the jth element s j of the list is in
the subgroup generated by the first j − 1 elements; if it is then discard s j .
4.8. Prove the implication (4.11) in Section 4.5. Hint: Use that
(Dg\D)h ⊆ (Dgh\D) ∪ (D\Dh)
and
In this chapter, we collect some basic algorithms that serve as building blocks
to higher level problems. Frequently, the efficient implementation of these low-
level algorithms is the key to the practicality of the more glamorous, high-level
problems that use them as subroutines.
79
80 Further Low-Level Algorithms
segment of the base, say the first k elements, that contains the base points in
and so the (k + 1)st group in the stabilizer chain is G () .
The only modification in the Schreier–Sims algorithm is that we temporarily
lift the restriction that the constructed base must be nonredundant and we declare
every element of to be a base point (in the chosen ordering). After the strong
generating construction is finished, we can simply delete the redundant base
points. Note that on levels with |G [i] : G [i+1] | > 1, the created strong generating
set and Schreier trees do not have to be changed. Also, the time complexity of
the algorithm remains the same as it was originally, since sifting through a level
with fundamental orbit of size 1 can be done in O(1) time.
5.1.2. Homomorphisms
The same idea as in pointwise stabilizer computations can be used to compute
kernels of actions. Given G = S ≤ Sym() and a homomorphism ϕ : G →
Sym() (defined by the images of the elements of S), we can consider G as a
permutation group acting on ∪ . Namely, for g ∈ S, denote by (g, ϕ(g))
the element of Sym() × Sym() whose restrictions to and are g and
ϕ(g), respectively. Then Ḡ := {(g, ϕ(g)) | g ∈ G} is isomorphic to G, and
ker(ϕ) is the restriction of the pointwise stabilizer Ḡ () to . Similarly, we can
determine whether a map ϕ : S → Sym() defines a homomorphism. The map
ϕ defines a homomorphism if and only if the pointwise stabilizer of in the
group (g, ϕ(g)) | g ∈ S is the identity.
Specifying the images of the generators at a homomorphism ϕ : G = S →
Sym() is not sufficient for the effective handling of ϕ. We also need to be able
to compute ϕ(g) for any g ∈ G and a representative from the coset ϕ −1 (g) for
any g ∈ ϕ(G). To this end, we store two strong generating sets S1 , S2 with the
corresponding Schreier tree data structures for the group Ḡ = (g, ϕ(g)) | g ∈
S ≤ Sym( ∪ ). The first SGS S1 is relative to a base B1 = (β1 , . . . , βm ) such
that an initial segment (β1 , . . . , βk ) of B1 consists of points of and Ḡ (β1 ,...,βk )
fixes pointwise; S2 is relative to a base B2 such that an initial segment of
B2 consists of points in and the pointwise stabilizer of this initial segment is
Ḡ () . Given g ∈ G, ϕ(g) can be computed by sifting (g, 1) in the Schreier tree
data structure corresponding to S1 and then restricting the inverse of the siftee of
(g, 1) to . Similarly, given g ∈ ϕ(G), a representative of the preimage ϕ −1 (g)
can be obtained by sifting (1, g) in the Schreier tree data structure corresponding
to S2 and restricting the inverse of the siftee to .
In implementations, permutation groups of degree n are represented acting on
the set [1, n]. In particular, ϕ : G → Sym() is specified by pairs of permutations
5.1 Consequences of the Schreier–Sims Method 81
(g, ϕ(g)), where g acts on [1, n] for n := || and ϕ(g) acts on [1, k] for k := ||.
Therefore, when computing in Sym() × Sym(), we have to fix a permuta-
tion c acting on [1, n + k] such that [1, k]c = [n + 1, n + k] and use c and
c−1 to conjugate the elements of Sym() to Sym([n + 1, n + k]) and vice
versa.
In practice, we often know an SGS S for G ≤ Sym() before the homomor-
phism ϕ is defined, and ϕ is specified by the images of the elements of the SGS.
In this case, we have at hand the SGS S1 defined in this section without any
additional work. Note also that restricting the elements of S2 to , we obtain
an SGS for the image group ϕ(G).
base B̄ and SGS for H , where B̄ starts with some specified ω ∈ . This can be
achieved by a base change algorithm (cf. Section 5.4). This application of base
change is much faster than the computation of an SGS from scratch.
This means that X j+1 = G . Note that X j+1 is an SGS for G , relative
[ j+1] [ j+1]
to the base B j .
The kernel computation just described has an additional advantage: In the
process, we get strong generating sets for the subgroups in the chain G = G [1] ≥
G [2] ≥ · · · ≥ G [k+1] = ker(ϕ), which is the point stabilizer chain of the image
group ϕ(G) on the permutation domain . More precisely, for each j ∈ [1, k],
we get two SGS for G [ j] : The first SGS comes from the recursion, and it is
discarded during the base change, when the second SGS relative to the base
B j is constructed. After the SGS X j+1 for G [ j+1] is constructed, we discard
the second SGS for G [ j] as well, but we store the Schreier tree (S j , T j ), which
codes the transversal R j for G [ j] mod (G [ j] )δ j . Hence, the output of the kernel
computation is an SGS X k+1 for ker(ϕ) and a sequence of Schreier trees (S j , T j )
5.1 Consequences of the Schreier–Sims Method 83
for 1 ≤ j ≤ k. The set
k
ϕ(S j )
j=1
is an SGS for ϕ(G). Note that we do not compute and store ϕ(S j ) explicitly;
we can work with the elements of S j , which are permutations of .
The last basic task we have to perform is to find a preimage for some given
g ∈ ϕ(G). This can be accomplished by a slight modification of the usual sifting
procedure, using the transversals R j . The only concern is that g ∈ Sym() ∼ =
Sk and R j ⊆ Sym(), so we cannot take the product of g with inverses of
transversal elements, as required in sifting. Instead, we construct recursively
r j ∈ R j for j = 1, 2, . . . so that gϕ(r1 )−1 · · · ϕ(r j−1 )−1 ∈ ϕ(G [ j] ). Suppose
that r1 , . . . , r j−1 and the product p j := r1−1 · · · r −1
j−1 of their inverses are already
pj
constructed. Let l := j , and suppose that δl is in the block l . Then we pick
g
the inverse of a preimage of g. Note that, as customary with sifting, the same
procedure can be used to test membership in ϕ(G). Given g ∈ Sk , either the
procedure constructs a preimage of g, or we notice for some j that no element
of R j carries δ j to the block l , as required in the sifting procedure above.
Theorem 5.2.3. Given a base B for some G = S ≤ Sym(), with || = n,
an SGS for G can be computed in O(n|B|2 |S| log3 |G|) time, by a deterministic
algorithm. In particular, if a nonredundant base is known then an SGS can be
obtained by a nearly linear-time deterministic algorithm.
Lemma 5.2.4. Let G ≤ Sym(), with || = n, be transitive. Suppose that the
minimal base size of G is b. Then, for all g ∈ G\{1}, we have |supp(g)| ≥ n/b.
The speedup of the nearly linear-time SGS construction for small-base groups
may proceed as follows. We warn the reader in advance that this is a theoretical
exercise: The GAP implementation also uses base image computations, but the
implemented version is much simpler than the algorithm presented here. We
shall describe how base images are used in the implementation after the proof
of Theorem 5.2.6.
We first handle the case of transitive groups. Please note the distinction in the
statement of the next theorem between the maximum running time, which we
never exceed, and the typical running time, which happens with high probability.
Such distinction occurs frequently in Las Vegas algorithms, where we can check
the correctness of the output with certainty, but it is quite unusual in the case
of Monte Carlo algorithms.
The Schreier tree computations run in O(n M 3 log n) time, since we do not
encounter more than M base points (otherwise the algorithm aborts), each
S j increases at most M times, and by Remark 4.5.2, the invocations of the
algorithm in the proof of Lemma 4.4.2 on each fixed level run in O(n M 2 ) total
time. Hence all log |G| factors in the original timing estimate in Remark 4.5.2
can be replaced by M.
The other part of the algorithm, the sifting of Schreier generators, can be
done in O(n M 3 log n + M 5 log n) time: We have at most M base points, each
S j increases at most M times, and after each increase of S j , we sift O(M log n)
Schreier generators as words. Note that we have to sift twice as many Schreier
generators as in the version in Section 4.5, since we have to compensate for the
fact that we detect a nontrivial siftee only with probability between 1/2 and 1.
The handling of each Schreier generator takes O(M 2 ) time: This is the time
requirement of sifting as a word by Lemma 5.2.2, as well as the required time
to compute the images of M randomly chosen points in the siftee. We detect a
nontrivial siftee at most M 2 times (at most M times at each of at most M base
points) and multiplying out a nontrivial siftee takes O(Mn) time. Finally, the
term O(|T |M 2 ) in (5.1) comes from the sifting of the input generators as words.
If our guess for M is correct then the error probability analysis in Section 4.5
is valid (note again that we have compensated for the fact that not all non-
trivial siftees are detected by sifting twice as many Schreier generators as in
Section 4.5) and we obtain a correct SGS with probability greater than 1−1/n d .
However, it is possible that, although the algorithm did not abort, the output
is incorrect because our estimate for M was too low and we missed too many
nontrivial siftees. Hence, after termination, we use the strong generating test
described in Lemma 4.5.6 to check whether the result is correct. If R1 , . . . , Rk
denote the transversals constructed by the algorithm then, by Lemma 4.5.6,
sifting O(M log n) elements of T and of D := Rk · · · R1 in D detects if the
output is incorrect with probability greater than 1 − 1/n d . We sift the elements
of D and T as permutations, not only as words, and the time requirement for
each sift is O(Mn).
90 Further Low-Level Algorithms
If something bad happens (the algorithm aborts or the final strong generating
test reveals that the output SGS is incorrect) then we double our guess M and
repeat the entire algorithm.
When our guess exceeds the actual value of log |G| for the first time, the SGS
construction succeeds with probability greater than 1 − 1/n d and the algorithm
terminates. Since this last guess satisfies log |G| ≤ M < 2 log |G|, the running
time estimate of the last run is O(log5 |G| log n + n log n log3 |G| + |T | log2
|G|). Also, because we double our guesses for M, the running time estimate of
the last run dominates the sum of running time estimates for all previous runs.
What happens if, owing to a sequence of unlucky random bits, the algorithm
does not terminate when M exceeds log |G|? When the algorithm is run with a
guess M > log |G|, the running time is
O(n log3 |G| log n + M log4 |G| log n + |T |M log |G|), (5.2)
since we do not encounter more than log |G| base points, each S j increases
at most log |G| times, the sum of depths of Schreier trees is O(log |G|), and
on each level we construct O(log |G| log n) Schreier generators (note that this
number depended on the sum of depths of trees). Hence all factors M in the
estimate (5.1) can be replaced by log |G|, except one: The number of points
whose images are traced in the siftees of Schreier generators. Note also that if
the guess M exceeds n then the factors M can be replaced by n in (5.2), because
we never trace the images of more than n points.
We continue the doubling of our guess M at most until M exceeds log(n!).
Hence the run with the last estimate for M terminates in O(n log4 |G| log n +
|T |n log |G|) time, and this time estimate dominates the sum of running time
estimates for all previous runs.
(a) Decide whether G (1 ∪···∪ j−1 ) = 1; if not, then find an orbit j where
G (1 ∪···∪ j−1 ) acts nontrivially.
(b) Compute a nonredundant base for the G-action on j .
(c) Compute a base and SGS for the G-action on 1 ∪ · · · ∪ j .
The recursion terminates when Step (a) reports that G (1 ∪···∪ j−1 ) = 1, or
1 ∪ · · · ∪ j = .
Step (a) is simply calling the strong generating test of Lemma 4.5.6, with the
set D := Rk · · · R1 .
If an orbit j is output by Step (a) then we call the algorithm of Theorem 5.2.5
to compute a nonredundant base := (γ1 , . . . , γl ) (and an SGS, which we shall
discard) for G| j . We carry out this computation using only the restrictions of
permutations to j . Since G|1 ∪···∪ j ≤ G|1 ∪···∪ j−1 ×G| j , the concatenation
of B and is a (possibly redundant) base for G (1 ∪···∪ j ) .
In Step (c), we augment the already known SGS for G|(1 ∪···∪ j−1 ) , modifying
the nearly linear-time algorithm of Section 4.5 by executing all sifts only as
words and checking whether a siftee is trivial by computing the images of all
points in B and . Initially, we declare that the already known data structure
is up to date below level k. Note that, although the Schreier trees at the base
points β1 , . . . , βk do not change, we may have to add new generators to some
Si for some i ≤ k.
Now we address the time requirement of the algorithm. The base case,
the computation of an SGS for G|1 , runs in O(|1 | log4 |G| log n +
n log3 |G| log n + |T ||1 | log |G|) maximum time and, with probability greater
than 1 − 1/n d+2 , the algorithm terminates in
time. Recursion is called at most log |G| times, since the order of G|1 ∪···∪ j is
at least twice the order of G|1 ∪···∪ j−1 .
92 Further Low-Level Algorithms
One call of Step (a) takes O(nt 2 log n) time, where t is the sum of depths of
Schreier trees at the time of the call. This sum is always at most 6 log |G|, so the
total cost of calls of Step (a) during the entire recursion is O(n log3 |G| log n).
Calling Step (b) on the orbit j runs in time O(| j | log4 |G| log n +
|T || j | log |G|) and, with probability greater than 1 − 1/n d+2 , this call ter-
minates in
time.
In Step (c), the cost of Schreier tree computations and multiplying out non-
trivial siftees is O(n log3 |G| log n) during the entire algorithm, since the usual
estimates (at most log |G| base points, each S j increases at most log |G| times,
etc.) are valid and we do not duplicate work at the different calls of Step (c).
Sifting one Schreier generator and deciding whether the siftee is trivial
can be done in O(log2 |G|) time, since we evaluate siftees on |B| + || ≤
2 log |G| points. During one call of Step (c), Schreier generators are sifted
O(log3 |G| log n) times because there are at most log |G| base points, each S j
increases at most log |G| times, and O(log |G| log n) generators are sifted at
the processing of a level. Consequently, during all calls of Step (c), we spend
O(log6 |G| log n) time handling Schreier generators.
If an element of the generating set T sifts through as a word in G|1 ∪···∪ j−1
with trivial siftee w (i.e., w fixes all base points in 1 ∪ · · · ∪ j−1 ), then we
store w and sift it along the base points in j at the next call of Step (c). In this
way, we can achieve that the total time requirement of sifting T during calls of
Step (c) is O(|T | log2 |G|).
Summarizing, we obtain that the maximum time requirement of the entire
algorithm is O(log6 |G| log n + n log4 |G| log n + |T |n log |G|). Note that we
ensured that each call of the base case and of Steps (a), (b), and (c) succeeds
with probability greater than 1 − 1/n d+2 , so the probability of a correct output
is greater than 1 − 3 log |G|/n d+2 > 1 − 1/n d and the algorithm terminates in
O(log6 |G| log n + n log3 |G| log n+|T |n log |G|) time with probability greater
than 1 − log |G|/n d+2 > 1 − 1/n d .
(a) The quantities ξ, µ, t for H are bounded from above by logc |G|.
(b) For any g ∈ G, ψ(g) can be computed in O(logc |G|) time. Conversely,
given h ∈ H, ψ −1 (h) can be computed in O(|| logc |G|) time.
Proof. (a) By Lemma 4.4.2, given any strong generating set of G relative to
some nonredundant base B, we can compute Schreier trees of depth at most
2 log |G| for a Schreier tree data structure of G in nearly linear time by a deter-
ministic algorithm. Hence, we can suppose that t ≤ 2|B| log |G| ≤ 2 log2 |G|.
(In fact, if the SGS of G was computed by the nearly linear-time Monte Carlo
algorithm described in Section 4.5 then t ∈ O(log |G|).)
Given h 1 , h 2 ∈ H , we compute their products by concatenating them, then
follow the images of base points in this concatenated word to obtain a function
f : B → , and use Lemma 5.2.2 to compute the standard word represent-
ing h 1 h 2 . This requires O(|B|t) time. Similarly, the inverse of any h ∈ H can
be computed by taking the inverse of the word h formally, as a word, follow-
ing the images of base points to obtain a function f : B → , and again using
Lemma 5.2.2. Hence, products and inverses in H can be computed in O(|B|t)
time and so µ ∈ O(log3 |G|). Since the elements of H are represented by stan-
dard words in a unique way, comparision of group elements can be done in
O(t) time.
Random elements of H can be constructed in O(t) time, by concatenating
words representing randomly chosen coset representatives along the point sta-
bilizer subgroup chain of G. In subgroups of H , nearly uniformly distributed
random elements can be constructed by the algorithm of Theorem 2.2.4 in
O(log8 |H |) time.
(b) Given g ∈ G, ψ(g) can be computed in O(|B|t) time by Lemma 5.2.2.
Conversely, given h ∈ H, ψ −1 (h) is computed by at most t permutation multi-
plications.
The algorithm runs much faster in the case when the orbits of G α and the
coset representatives rβ are already known. For example, this situation occurs
when a strong generating set for G was already computed.
Theorem 5.5.2. Suppose that the orbits of G α are known and R is coded in a
Schreier tree of depth l. Then a minimal block can be computed in O(nl log n)
time by a deterministic algorithm.
102 Further Low-Level Algorithms
Lemma 5.5.3. (a) At any moment during the execution of the algorithm, each
i is a subset of a ∼-equivalence class and 1 - · · · - k .
(b) Suppose that Step 4 of the algorithm is entered and, at that moment,
is the last element of L and λ is the representative of . Then α H,rλ ⊆
{α} ∪ ⊆ α G α ,rλ ; in particular, if H and G α have the same orbits then
{α} ∪ is a minimal block of imprimitivity.
Next we describe the data structures used in the algorithm. We also give some
further details of Steps 4 and 5 and prove the claimed bounds on the running
time.
The coset representatives rβ are stored in the usual Schreier tree data structure.
The sets 1 , . . . , k are stored as linked lists. We maintain four lists of
length k, named firstL, lastL, lengthL, and testL, and two lists of length n,
named nextL and reprL. firstL[i] and lastL[i] contain the first and last ele-
ments of i , respectively; lengthL[i] is the cardinality of i . reprL[ν], if de-
fined, contains the index i such that ν ∈ i . The list nextL contains the links:
nextL[ν], if defined, is the successor of ν in the set i , for ν ∈ L i (note that the
set i is stored as a list in the computer, so it makes sense to talk about the suc-
cessor of ν). The representative λi of i will always be the first element of i .
testL[i] gives the latest element ν of i for which ν rλi was already computed.
The orbits of H are also stored in linked lists. We define firstH, lastH, nextH,
and reprH analogously to the appropriate items for L.
The implementation of Step 3 is quite straightforward; therefore we re-
mark only on one subcase. Suppose µ = γ rλ was computed for some γ ∈
and, using the list reprL, we found that µ ∈ i for some i < k. The smallest
index j such that | j | ≥ | j | for all j ∈ [i, k] can be determined from the list
lengthL. Then we link the sets i , . . . , k such that the elements of j come
first; in this way, we can maintain the property that the coset representative rλ
belonging to the first element λ of the new block was already applied to an initial
segment of this new block, and the last element of this initial segment is given as
testL[ j].
We use the same “biggest comes first” linking strategy in Step 5. Suppose
that Step 4 returned some g ∈ G α \H such that g collapses some orbits of H ;
5.5 Blocks of Imprimitivity 105
our goal is to make the necessary merges of the sets i and orbits of H to
maintain the properties described in Lemma 5.5.3(a).
Step 5:
(i) Order the entries of lengthL in decreasing order:
| j1 | ≥ | j2 | ≥ · · · ≥ | jk |.
(ii) for i = 1 to k do
iffirstL[ ji ] is bound (i.e., defined)
then for all ν ∈ ji do
µ := ν g ;
if µ ∈ {α} ∪ 1 ∪ · · · ∪ k
then link the H -orbit of µ to ji
elseif µ ∈ l , l = ji
then link all s with firstL[s] bound and s between l and ji to ji ;
unbind firstL[s].
(iii) Left-justify bound entries of firstL (i.e., write the entries of firstL that are
still defined in a sequence); update lastL, reprL, testL, lengthL, nextL.
(iv) Compute orbits of (new) H ; update firstH, lastH, reprH, nextH.
At the end of Step 5, the sets in L are closed for the action of g.
The following lemma plays a crucial role in estimating the running time.
Proof. The representative may change only when i is merged into a larger set
in Step 3 or Step 5. At these occasions, since the representative of the new set
is defined as the representative of the largest original set j that was merged,
|| ≥ 2|i | for all sets i whose representative was changed. Clearly, such
doubling of size can occur only at most log n times.
Step 4 is handled by the following lemma. Recall that the input is a set
⊆ \{α} such that all elements of are known to be equivalent and a
representative λ ∈ . The set {α} ∪ is closed for the action of rλ .
Lemma 5.5.5. (a) In O(nl + n|S|) time, it can be decided whether {α} ∪ is
a block of imprimitivity.
(b) If {α} ∪ is not a block of imprimitivity then some g ∈ G α \H can be
constructed in O(n log2 |G| + n|S1 |) time, where S1 is the known generating
set of H .
106 Further Low-Level Algorithms
Proof. (a) Let := {α} ∪ . We try to partition into disjoint G-images of .
Suppose that the disjoint sets = 1 , 2 , . . . , j−1 are already constructed;
we pick β j ∈ \(1 ∪ · · · ∪ j−1 ) and compute rβ j . If rβ j ⊆ \(1 ∪ · · · ∪
j−1 ) then we define j := rβ j ; otherwise, we have ∅ rβi ∩ rβ j rβ j
for some i < j, which proves that is not a block. Since the rβ j are stored as
words of length at most l, this part of the algorithm runs in O(nl) time.
If the partition of into G-images of succeeds then we check that each
generator of G respects this partition. This can be done in O(n|S|) time.
(b) If is not a block then we claim that := α H,rλ is not a block either.
By Lemma 5.5.3, ⊆ . If = then obviously is not a block. If
then λ ∈ \{α} \{α} ⊆ λ̄, so is not a block.
We repeat the algorithm described in part (a) with playing the role of . This
means that we try to partition into G-images of and, if the partition succeeds,
compute the action of the generators of G on the sets in the partition. Since
is not a block, we shall encounter h 1 , h 2 ∈ G such that ∅ = h 1 ∩ h 2 h 1 .
(If h 1 , h 2 are found during the partition phase then h 1 = rβi , h 2 = rβ j for
some coset representatives; if they are found at the checking of the action of
generators then h 1 = rβi s, h 2 = rβ j for some s ∈ S.) Defining g1 := h 1 h −1 2 , we
have ∅ = ∩ . We multiply out g1 as a permutation. As in part (a), the
g1
Lemma 5.5.6.
(a) In O(n log2 |G| + n|S|) time, coset representatives rβ as words of length l
for some l ≤ 2 log |G| can be computed.
(b) The total amount of time spent in Step 3 is O(nl log n).
(c) The time spent in Step 4 is O(n log3 |G| + n|S| log |G|).
(d) The time spent in Step 5 is O(n log n log |G|).
Proof. (a) This is proven by Lemma 4.4.2. Recall that, although a Schreier tree
codes inverses of coset representatives, the set S ∗ of labels in the Schreier tree
(S ∗ , T ) constructed in Lemma 4.4.2 is closed for inverses. Hence, after (S ∗ , T )
is constructed, the coset representatives themselves can also be written as words
in S ∗ .
5.5 Blocks of Imprimitivity 107
(b) By (a), computing an image γ rλ takes O(l) time. By Lemma 5.5.4, for
each fixed γ ∈ , γ rλ for some λ is computed at most log n times; hence the
total cost of these computations is O(nl log n). Updating reprL and the linking
of lists cost O(n log n).
(c) Step 4 is entered at most log |G α | times, since H increases between
consecutive calls.
(d) Step 5 is entered always after a call of Step 4, that is, at most log |G α |
times. At one call, ordering lengthL costs O(n log n) whereas the linking and
updating of lists runs in O(n) time.
Let t(m, n) denote the worst-case time requirement of at most m Find opera-
tions and at most n − 1 Union operations. If we do not use the collapse rule then
it is fairly easy to estimate t(m, n), regardless of whether the weighted union
rule is used or not (cf. Exercise 5.9). However, if both the collapse and weighted
union rules are used then the analysis becomes much harder. The asymptotic
behavior of t(m, n) was determined in [Tarjan, 1975]. To state the result, we
have to introduce the Ackerman function.
Let A : N × N → N be defined recursively as follows: For all i, x ∈ N, let
A(0, x) := 2x, A(i, 0) := 0, and A(i, 1) := 2. Moreover, for i ≥ 1 and x ≥ 2,
110 Further Low-Level Algorithms
let
Using (5.3), it is easy to see by induction that A(1, x) = 2x and A(i, 2) = 4 for
all x ≥ 1 and i ≥ 0. Also, A(2, x + 1) = A(1, A(2, x)) = 2 A(2,x) , and so
·2
··
A(2, x) = 22 (a tower of x twos).
The function A(2, x) already grows very fast, but A(3, x) is mind-boggling. We
22
have A(3, 0) = 0, A(3, 1) = 2, A(3, 2) = 4, A(3, 3) = A(2, A(3, 2)) = 22 =
65536, and
·2
··
A(3, 4) = A(2, A(3, 3)) = 22 (a tower of 65536 twos).
For i ≥ 4, the functions A(i, x) are growing even faster. We just compute one
more value: A(4, 3) = A(3, A(4, 2)) = A(3, 4).
For fixed m ≥ 1, we define the inverse of A(m, n) as A−1 (m, n) := min{x |
A(m, x) > log n}. If m ≥ 3 then, for all practical purposes and beyond,
A−1 (m, n) ≤ 4. The main result of [Tarjan, 1975] is the following theorem.
Theorem 5.5.8. Let t(m, n) denote the worst-case time requirement of at most
m Find operations and at most n − 1 Union operations, and suppose that
m ≥ n. If both the collapse rule and the weighted union rule are used then
t(m, n) ∈ (m A−1 (m, n)).
The method described in this section can be used to test the primitivity of
G, by computing the smallest block containing {α, β} for some fixed α ∈
and all β ∈ \{α}. However, this multiplies the running time indicated in
Corollary 5.5.9 by a factor n. There is a possible shortcut. If we already know
that the smallest block containing {α, β1 } is and at a subsequent computation
we deduce that the smallest block containing {α, β2 } must contain β1 , then we
can terminate that computation. This observation usually helps in practice, but
the worst-case running time is quadratic in n.
Exercises 111
Exercises
5.1. Suppose that a nonredundant base B is given for some G ≤ Sym().
Prove that for H ≤ G and T ⊆ G, the subgroup H ∪ T and normal
closure H G can be computed in nearly linear time by a deterministic
algorithm.
5.2. Let S be an SGS for G relative to the base (β1 , . . . , βm ). Let Si = S ∩
G (β1 ,...,βi−1 ) and let Ri be the transversal computed from Si for G (β1 ,...,βi−1 )
112 Further Low-Level Algorithms
mod G (β1 ,...,βi ) . Consider the elements of Ri as words in Si . For all j ∈
[1, m], r ∈ R j , and g ∈ S j , let
rg = rm rm−1 · · · r1 , ri ∈ Ri (5.4)
5.9. Let t(m, n) denote the worst-case time requirement of at most m Find
operations and at most n − 1 Union operations, and suppose that m ≥ n.
Prove that
(i) if neither the collapse rule nor the weighted union rule is used then
t(m, n) ∈ (mn); and
(ii) if only the weighted union rule is used then t(m, n) ∈ (m log n).
5.10. Let G ≤ Sym(). Prove that any structure forest of G has less than 2||
vertices.
5.11. Give examples of groups G ≤ Sym() with exponentially many (as a
function of ||) structure forests.
The following four exercises lead to the regularity test from [Acciaro
and Atkinson, 1992].
5.12. Let G = S ≤ Sym() be transitive, and let α ∈ . Prove that G acts
regularly on if and only if G α = G α g for all g ∈ S.
5.13. Let G = S ≤ Sym() be transitive, and let α, ω ∈ . Construct a
list (T [β] | β ∈ ) in the following way: Initialize T [α] := ω. Build a
breadth-first-search tree L that computes the orbit α G by the algorithm
−−−→
described in Section 2.1.1. Each time an edge (β, γ ) is added to L, because
γ = β s for some already constructed vertex β of L and some s ∈ S, we
also define T [γ ] := T [β]s .
Prove that G α = G ω if and only if T [β g ] = T [β]g for all β ∈ and
for all g ∈ S.
5.14. Based on Exercises 5.12 and 5.13, design a deterministic algorithm that
tests whether a group G = S ≤ Sym(), with || = n, acts regularly
on , in O(|S|2 n) time.
5.15. Let G = S ≤ Sym(), with || = n, be transitive. Suppose that for
some α, β ∈ we have G α = G β , and let be the smallest block of
imprimitivity for G that contains α and β. Prove that G α = G δ for each
δ ∈ .
Based on this observation, modify the algorithm in Exercise 5.14 to
run in O(|S|n log n) time.
5.16. Let G = S ≤ Sym(), with || = n, be transitive. Based on Lemma
4.4.2, design a deterministic algorithm that in O((|S| + log n)n log n)
time either decides that G is not regular, or outputs at most log n generators
for G.
6
A Library of Nearly Linear-Time Algorithms
114
6.1 A Special Case of Group Intersection and Applications 115
deterministic nearly linear-time algorithms. In implementations, usually the
faster Las Vegas versions are used.
In the entire chapter, there is only one algorithm where the worst-case running
time is not nearly linear: The computation of CSym() (G) in Section 6.1.2. We
included that algorithm here because its method is similar to the accompanying
nearly linear-time algorithms.
so (1, x2 ) ∈ K (1 ) .
Corollary 6.1.3. Suppose that strong generating sets defining shallow Schreier
tree data structures are given for G, H ≤ Sym(). If G normalizes H then the
intersection of G and H can be computed by a deterministic nearly linear-time
algorithm.
Proof. We use the notation just introduced. Observe that the action of K on 1
is permutation isomorphic to the action of G on , so Lemma 6.1.1 implies that
|K | = |G| · |H G |. In particular, if G normalizes H then |K | = |G||H | and
it follows from Lemma 6.1.1 that B1 ∪ B2 is a base for K , where B1 is a copy
of the base of G in 1 and B2 is a copy of the base of H in 2 . Therefore, by
Lemma 5.2.3, an SGS for K is computable by a deterministic algorithm with
running time estimate of the form O(n(log |G| + log |H |)c ). If we order B1 ∪ B2
such that the elements of B2 precede the elements of B1 then (G ∩ H )×1 occurs
as a member of the point stabilizer subgroup chain of K .
We remark that Lemma 6.1.1 also implies that the computation of normal
closures can also be reduced to point stabilizer computations. However, imple-
mentations of the method presented in Section 5.1.4 are faster in practice.
It is noted in [Cooperman et al., 1989] that if G normalizes H and strong
generating sets for H and G are already known then an SGS for K (relative
to a base that first fixes the points of 1 ) can be constructed directly, without
invoking the Schreier–Sims algorithm. After that, generators for G ∩ H may
be obtained by a base change.
6.1 A Special Case of Group Intersection and Applications 117
6.1.2. Centralizer in the Symmetric Group
First we handle the transitive case. Let G ≤ Sym() be transitive and, for some
fixed α ∈ , let A be the set of fixed points of G α :
δ c = β gc = β cg = β g = δ.
β h = α ch = α hc = α c = β,
γ c := β g . (6.2)
β g = α cg = α gc = γ c = δ c = α hc = α ch = β h ,
so gh −1 ∈ G β . However, G β = G α and so γ = α g = α h = δ.
118 A Library of Nearly Linear-Time Algorithms
Corollary 6.1.5. If G ≤ Sym() is transitive and abelian then G is regular
and CSym() (G) = G.
Proof. Let β, γ ∈ be in the same H -orbit. Then there exists h ∈ H such that
−1 ∗
β h = γ and so γ y = β yy hy = β yh for h ∗ = y −1 hy ∈ H . This means that
if two points β, γ ∈ are in the same H -orbit then their images β y , γ y are in
the same H -orbit again.
6.1 A Special Case of Group Intersection and Applications 119
Since C = CSym() (G) normalizes G, Lemma 6.1.7 implies that C permutes
the orbits of G. We say that two orbits 1 , 2 of G are equivalent, denoted
1 ≡ 2 , if there is a bijection ϕ : 1 → 2 such that for all g ∈ G and
δ ∈ 1 ,
Lemma 6.1.9. Let 1 , 2 be orbits of G of the same size, let α ∈ 1 , and let
A be the set of fixed points of G α . Then 1 ≡ 2 if and only if A ∩ 2 = ∅.
If α is the first base point of G then, using the set of fixed points of G α
and a shallow Schreier tree coding the transversal for G mod G α , we can
compute the orbits equivalent to the orbit of α and bijections between them by
a nearly linear-time deterministic algorithm. Equivalence of the orbit containing
some γ ∈ with the other orbits can be determined by first applying a base
120 A Library of Nearly Linear-Time Algorithms
change algorithm to obtain a base starting with γ . However, if G has numerous
inequivalent orbits (but of the same size, so equivalence
√
must be tested) then
this computation of CSym() (G) may require (n /22 log n
) time even for small-
base inputs (cf. Exercise 6.3). It is an open problem whether there is a nearly
linear-time algorithm that computes CSym() (G) for all G ≤ Sym().
k
K = CSym(i ) (G ∗ |i )
i=1
k
K = C G ∗ |i (N ∗ |i )
i=1
Note that despite the fact that G is transitive in (6.4), N may have numer-
ous orbits and so the difficulties with the nearly linear-time construction of
CSym() (N ) seem to persist. Although in this special case it is possible to con-
struct generators for CSym() (N ) in nearly linear time, CSym() (N ) may be too
large. Therefore we still cannot apply the intersection method of Section 6.1.1.
Instead, we proceed by constructing a block homomorphism whose kernel G̃
contains C G (N ) and a second homomorphism from G̃ whose kernel is exactly
C G (N ).
(i) For some α ∈ , let A = fix(Nα ) be the set of fixed points of Nα . Then A is
a block of imprimitivity for G.
(ii) Let B be the block system consisting of the G-images of A and let G̃ be the
kernel of the G-action on B. Then C G (N ) ≤ G̃.
Proof. (i) By Lemmas 6.1.4, 6.1.8, and 6.1.9, A is an orbit of CSym() (N ). So,
by Exercise 6.4 and Lemma 6.1.7, A is a block of imprimitivity for G.
(ii) The blocks in B are the orbits of CSym() (N ); therefore, in particular, the
elements of C G (N ) cannot move these blocks. Thus C G (N ) ≤ G̃.
122 A Library of Nearly Linear-Time Algorithms
The tricky part of the computation of C G (N ) is the definition of a ho-
momorphism ϕ : G̃ → Sym() with ker(ϕ) = C G (N ). We shall define ϕ in
Lemma 6.1.12, but first we need some preparatory steps.
m
cg := (c1 j , . . . , ck j ; π j ) ∈ CSym() (N ),
j=1
gcg−1 fixes pointwise and it is clear from the discussion above that there is
only one element in CSym() (N ) with this property.
(ii) Observe that for g, h ∈ G̃, we have gcg−1 hch−1 = gh(cg−1 )h ch−1 . Here
(cg ) ch ∈ CSym() (N ) and gh(cg−1 )h ch−1 fixes pointwise, so the uniqueness
−1 h −1
−1
of cgh implies that cgh = (cg−1 )h ch−1 and ϕ(g)ϕ(h) = ϕ(gh).
(iii) C G (N ) is the kernel of the homomorphism ϕ since ϕ(g) = 1 if and only
if g = cg , that is, g ∈ G̃ ∩ CSym() (N ) = C G (N ).
The process stops when Cm = CoreG (H ). Since the Ci define a strictly de-
creasing subgroup chain and, by definition, Ci ≥ CoreG (H ) for all i, we reach
CoreG (H ) in m ≤ log |G| steps.
Suppose that Ci and g0 , . . . , gi are already defined for some i ≥ 0. Then
g
we compute the conjugates of Ci by the generators of G. If Ci = Ci for all
6.2 Composition Series 125
g ∈ S then Cih = Ci for all h ∈ G. This follows from induction on the length
of the word as h is written as a product of generators. Also, (6.6) implies that
Ci ≤ H h . Hence Ci = CoreG (H ) and we are done.
g g
Otherwise, there exists g ∈ S such that Ci = Ci . Since |Ci | = |Ci |, this means
g
that Ci is not a subgroup of H g j for some j ∈ [0, i]. Such j can be found by
testing for each generator c of Ci whether c g is in the group H g j (or, for
−1
easier implementation, testing whether c gg j is in H ). After the appropriate
index j is found, we define gi+1 := g j g −1 , use the −1 algorithm in the proof of
g j g −1 gg j −1
Lemma 6.1.14 to compute Ci+1 := Ci ∩ H = (Ci ∩ H )g j g , and proceed
with the recursion.
We remark that [Kantor and Luks, 1990] contains a polynomial-time algo-
rithm for computing the core of arbitrary subgroups of G. However, that algo-
rithm uses Sylow subgroup computations, which, at the moment, have nearly
linear-time versions only for groups with no exceptional Lie-type composition
factors (cf. [Morje, 1995]), and the algorithm is much more complicated than
the elementary procedure described here.
The proof of Theorem 6.2.1 is spread out in Sections 6.2.1–6.2.4 and 6.2.6.
In Sections 6.2.1–6.2.4 we describe a nearly linear-time Monte Carlo compo-
sition series construction. This construction is Monte Carlo, even if we have
a nonredundant base and a strong generating set for the input group. The up-
grade to a Las Vegas algorithm (supposing that a base and SGS are known) is
given in Section 6.2.6. Moreover, in Section 8.3, we describe how to compute a
nonredundant base, SGS, and composition series by a Las Vegas nearly linear-
time algorithm, in groups satisfying some restrictions on their composition
factors.
Corollary 6.2.3. Let G ≤ Sym(), with || = n, and let N be a maximal nor-
mal subgroup of G. Then G/N has a faithful permutation representation of
degree at most n.
The fact that the construction of the action of G on the cosets of G [i+1] N
in Lemma 6.2.2 can be done in nearly linear time was observed in [Beals and
Seress, 1992]. That paper also contains a strengthening of the method, when
we do not have generators for N . We shall need this extension in Section 6.2.3.
Proof. The first part of the algorithm is similar to the one described in
Lemma 6.2.2. Namely, we construct the permutation action of G [i] on the cosets
of K . The only difference from Lemma 6.2.2 is that we compute H := G [i] ∩ K
(instead of the unknown G [i] ∩ G [i+1] N ). Then, as in the proof of Lemma 6.2.2,
we compute := βiH and the block system
consisting of the G [i] -images of
[i]
in the fundamental orbit βiG . The same proof as in Lemma 6.2.2, just replacing
each reference to G [i+1] N by K , shows that the G [i] -action on the cosets of K
is permutation isomorphic to the G [i] -action on
. Let 1 := , 2 , . . . , m
denote the blocks in
(with m = |G : K | ≤ n).
The new part of the algorithm is that we also have to be able to construct the
action of arbitrary elements of G on the cosets of K . To this end, we compute
a shallow Schreier tree data structure for K relative to the base B, and for each
j ∈ [1, m] we choose a point σ j from j . Using the ith tree in the shallow
Schreier tree data structure of G, we write the coset representatives h j carrying
βi to σ j as words in the strong generators. The set of all h j is a transversal
system for G mod K . Note that because of the nearly linear time constraint, we
cannot compute the permutations h j explicitly.
Given some g ∈ G, for each j ∈ [1, m] we want to determine to which coset
of K the permutation h j g belongs. First, we compute g −1 . We claim that after
that, for a fixed j, the coset of h j g can be found in O(log3 |G|) time. Applying
this procedure for the generators g of G, the action of G on the cosets of K is
found in nearly linear time.
Recall our convention that strong generating sets are closed for taking in-
verses. Hence we can write a word w of length at most 2 log |G| + 1 whose
product is g −1 h −1
j . We sift w as a word (cf. Lemma 5.2.2) through the first
6.2 Composition Series 129
i − 1 levels of the Schreier tree data structure of K . Since any element of
G = G [i] K can be written in the form gi k for some k ∈ K and gi ∈ G [i] , this
sifting is feasible. By Lemma 5.2.2, the time requirement is O(log3 |G|) and the
siftee s := g −1 h −1
j k ∈ G , for some k ∈ K , is represented as a word of length
[i]
O(log |G|). So we can write a word (in terms of g and the SGS for G, K )
2
Theorem 6.2.5. Let G ≤ Sym() be primitive and let || = n. Then one of
the following holds:
Theorem 6.2.7. Let G ≤ Sym() be primitive and let || = n. Then G satisfies
at least one of the following properties:
Proof. We have to show that the groups in case II of Theorem 6.2.5 are covered
by cases B, C, and D of this theorem. Using the notation of Theorem 6.2.5,
suppose that G is in case II, with minimal normal subgroup N = T1 × · · · × Tr
and all Ti isomorphic to the same simple nonabelian group T . Since |T | ≥ 60
and |T : H | ≥ 5 for any proper subgroup H of T , we have r ≤ log5 n in each
subcase of case II.
If r = 1 then either G is simple (and it is in case D) or N ∼ = Inn(T ) G ≤
Aut(T ). In the latter case, G/N is solvable by Schreier’s Hypothesis (which is
now a theorem, as a consequence of the classification of finite simple groups);
hence G has a cyclic factor group of order p for some prime p. Clearly p ≤ n,
and so G is in case B.
Suppose now that r ≥ 2. Let
be a minimal block system in the conjugation
action of G on {T1 , , . . . , Tr } and let m := |
|. If G is in case II(iii) and m <
r then we also suppose that the groups Ti belonging to the same diagonal
collection are in the same block of
. Then G acts primitively on
; let P ≤ Sm
denote the image of this action. By [Praeger and Saxl, 1980], if P ≤ Sm is
primitive and P = Am , Sm then |P| < 4m ; since m ≤ r ≤ log5 n, the kernel of
action on
has index less than n and G is in case B. If P = Sm then it has
a normal subgroup of index 2 and G is in case B. This leaves us with the
case that P = Am . If m!/2 < n then we are in case B, so we may suppose that
m!/2 ≥ n.
Now, if G is in case II(i) then G is in case C(i). If G is in case II(ii) then it is
in case C(ii). If G is in case II(iii) with l > 1 then it is in case C(ii). Finally, if
G is in case II(iii) with l = 1 then G is in case C(iii).
The composition series algorithm that we shall describe in the next two
sections uses different methods for solving the problem (6.7) from Section 6.2.1
for inputs belonging to the different cases of Theorem 6.2.7. If it is not known
to which case of Theorem 6.2.7 the input primitive group G belongs, then the
algorithm tries all methods. Note that if the output of (6.7) is a proper normal
subgroup or a faithful permutation action on at most n/2 points then the cor-
rectness of the output can be checked in nearly linear time. Hence if all methods
report failure or that G is simple then we can conclude that G is indeed simple.
The only regular primitive groups are cyclic of prime order, and these can
be easily recognized. Hence, in the next two sections, we may assume that the
input primitive group is not regular.
6.2 Composition Series 133
6.2.3. Normal Subgroups with Nontrivial Centralizer
In practice, solving the problem (6.7) is most time consuming in the case when
the input primitive group G has a regular normal subgroup with a nontrivial
centralizer (i.e., G has a regular abelian normal subgroup or G has two regular
nonabelian normal subgroups). We shall present two approaches. The first
method originated in [Luks, 1987] and subsequently was refined in [Babai
et al., 1993] and [Beals and Seress, 1992] to a nearly linear-time algorithm. The
second method was introduced in [Neumann, 1986], and it handles the case
when G has a regular abelian normal subgroup. Here we give a nearly linear-
time version. As we shall discuss in Remarks 6.2.11 and 6.2.14, both approaches
can be extended to handle all primitive groups with regular normal subgroups.
However, we cannot guarantee nearly linear running time of these extensions
in the cases when G has a unique regular nonabelian normal subgroup or G has
a nonabelian regular normal subgroup, respectively.
Let G ≤ Sym() be primitive, and suppose that G has a regular normal
subgroup N . For any α, β ∈ , there is a unique element xαβ ∈ N such that
α xαβ = β. The first method finds generators for larger and larger subgroups
of G that contain xαβ , eventually constructing some maximal subgroup K ≤ G
such that the action of G on the cosets of K is not faithful and |G : K | < n.
The action on the cosets of K is constructed, and its kernel is a solution for
(6.7). The second method constructs smaller and smaller subsets containing
xαβ , eventually arriving to some subset K ⊆ G that is small enough so that we
can afford to take the normal closure of each element.
Luks’s Algorithm
The algorithm for solving (6.7) is to compute NG (G αβ ), embed it into a maximal
subgroup K , construct the action of G on the cosets of K , and compute the
kernel of the action. We have to show that these computations can be done in
nearly linear time.
Proof. Applying a base change (cf. Section 5.4), we may assume that α and β
are the first two base points. Let R1 and R2 be the transversals for G mod G α
and G α mod G αβ , respectively, encoded in shallow Schreier trees (S1 , T1 ) and
(S2 , T2 ). Let N denote a regular normal subgroup of G.
Let be the fixed point set of G αβ . For any g ∈ G, it is clear that g ∈ NG (G αβ )
if and only if g fixes setwise and g fixes setwise if and only if α g ∈ and
6.2 Composition Series 135
β g ∈ . Moreover, for any γ , δ ∈ , the unique element xγ δ ∈ N moving γ to
δ centralizes G αβ and so xγ δ ∈ NG (G αβ ). Therefore NG (G αβ ) acts transitively
on .
First, we compute . This can be done in nearly linear time, using the genera-
tors for G αβ in the SGS for G. After that, generators for NG (G αβ ) are computed
in two steps. We initialize H := G αβ and the orbits α H := {α} and β H := {β}.
In the first step, we compute NG (G αβ )α . This is done by running through the
elements of . If for some γ ∈ we have γ ∈ β H but γ is in the fundamental
orbit β G α then we multiply out the product rγ of labels along the path from γ to
β in the Schreier tree T2 , we replace H by H, rγ , and we recompute the orbit
β H . Since R2 is encoded in a shallow Schreier tree, the computation of rγ can
be done in nearly linear time. Moreover, H is increased at most log(n − 1)
times, since |NG (G αβ )α : G αβ | ≤ n − 1.
In the second step, we embed NG (G αβ )α into a subgroup that acts transitively
on . By the discussion in the second paragraph of the proof, this subgroup
will be NG (G αβ ). Again, we run through the elements of . If we encounter
some δ ∈ that is not in α H then we multiply out the product of labels rδ in
the Schreier tree T1 along the path from δ to α and compute β G α ∩ rδ . This
intersection is nonempty, since there exist elements in G that map δ to α and fix
setwise (and so, in particular, map β to an element of ). Any such element
x can be written in the form x̄rδ−1 for some x̄ ∈ G α and then β x̄ ∈ β G α ∩ rδ .
Therefore we can pick γ ∈ β G α ∩ rδ , multiply out the product of labels rγ in
the Schreier tree T2 along the path from γ to β, replace H by H, rγ−1rδ−1 , and
recompute the orbit α H . Since H increases in the second step at most log ||
times, the second step runs in nearly linear time as well.
Remark 6.2.11. A similar algorithm can be used to solve (6.7) in the case
when G has a unique nonabelian regular normal subgroup. The construction
of NG (G αβ ) and its embedding into a maximal subgroup K can be done in
nearly linear time, as described in Lemmas 6.2.9 and 6.2.10. However, we
cannot guarantee that K contains N , and so the action of G on the cosets of K
may be faithful and Lemma 6.2.4 may not be applied. We obtain a solution for
6.2 Composition Series 137
(6.7) even in the case when K does not contain N , since then G = K N and
K ∩ N = 1, so |G : K | ≤ n/2. However, it is not clear how we can construct the
action on the cosets of K in nearly linear time. We shall see in Section 6.2.6 that
this is possible if β is chosen from the smallest orbit of G α in \{α}, but the
proof of that uses consequences of the classification of simple groups. There
is also an elementary nearly linear-time algorithm for the handling of groups
with regular non-abelian socle, which we shall present in Section 6.2.4.
Remark 6.2.14. This method can also be extended to solve (6.7) for primitive
groups with one or two nonabelian regular normal subgroups. Let N denote
a regular normal subgroup of G. Using the notation of Neumann’s algorithm,
our observation that R consists of the restrictions of the elements of N to
is still valid, and R can be constructed in nearly linear time. For an element
6.2 Composition Series 139
x ∈ D of prime order r satisfying x| = 1 and for the subgroup P of Z (G αβ )
consisting of elements of order dividing r , it is still true that P x contains a
nontrivial element of N . Corollary 6.2.13 is also true, since G α has no non-
trivial solvable normal subgroups (cf. Exercise 6.15 for the case of two regular
normal subgroups and [Dixon and Mortimer, 1996, Theorem 4.7B(i)] for the
case of a unique nonabelian regular normal subgroup). Hence we can com-
pute the normal closures in H of all standard words representing elements
of P x in nearly linear total time. However, we cannot decide which normal
closure corresponds to a regular subgroup of G within the nearly linear time
constraint.
Lemma 6.2.15. Let H = S be a black-box group, and suppose that there
exists a normal subgroup of H of index less than n. Then generators for a proper
normal subgroup of H can be computed by a Monte Carlo algorithm, using
O(log(ε−1 )n(|S| + log |H | + log n + log(ε −1 ))2 ) group operations, where ε is
the desired error probability. (As in Lemma 6.2.10, we consider constructing a
nearly uniformly distributed random element in a subgroup of H as one group
operation.)
140 A Library of Nearly Linear-Time Algorithms
h i , up to log(3l/ε) times.)
We claim that with probability greater than 1 − ε, the sequence (h 1 , . . . , h l )
is constructed and 1 h lH H . By the definition of the sequence, for each
i we have 1 h iH . For a fixed i, if h i−1 is defined then the probability that
a random element of h i−1 H
is not 1 is at least 1/2, so the probability that the
algorithm fails to construct h i with log(3l/ε) random selections in h i−1 H
is at
most ε/3l. Therefore, the sequence (h 1 , . . . , h l ) is constructed with probability
at least 1 − ε/3.
Suppose now that (h 1 , . . . , h l ) is constructed. If h i−1
H
= H for some i then
the probability that h i is in a proper normal subgroup is greater than 1/n, and
if h iH is a proper subgroup of H then h Hj is a proper subgroup of H for all
j ≥ i. Hence the probability that h lH = H is less than (1 − 1/n)l < ε/3.
The analysis in the previous paragraph assumed that the normal closure
computations give the correct result. If we ensure that each of the l normal
closure computations are correct with probability at least 1 − ε/3l then the
entire algorithm succeeds with probability greater than 1 − ε.
The normal closures are computed by the algorithm of Theorem 2.3.9. Since
for each i we want the computation of h iH to succeed with probability at
least 1 − ε/3l, we need to construct O(log |H | + log n + log 1/ε) generators
for h iH , at a cost of O((log |H | + log n + log 1/ε))(|S| + log |H | + log n +
log 1/ε) group operations (see also Remark 2.3.5 for the justification of these
quantities and Exercise 6.16).
Lemma 6.2.16. Let G ≤ Sym() be primitive, with || = n, and suppose that
G belongs to case C(i) or C(ii) of Theorem 6.2.7. Then for any α ∈ , the
smallest orbits of G α in \{α} have size less than log2 n. Moreover, if G be-
longs to case C(ii) then points in these smallest orbits have only one nontrivial
coordinate and even the sum of the sizes of the smallest orbits is less than log2 n.
Proof. Since m!/2 ≥ n and (log n/ log log n)log n/ log log n < n, we have m >
log n/ log log n. Therefore, since n = |N1 : (N1 )α |m , we must have |N1 :
(N1 )α | < log n. Hence the number of points with one nontrivial coordinate
is at most m|N1 : (N1 )α | < log2 n. So there are orbits of G α in \{α}, namely
the orbits containing all points with exactly one nontrivial coordinate, whose
union has size less than log2 n.
Hence, to finish the proof, it is enough to show that, in case C(ii), the smallest
orbits in \{α} consist of points with one nontrivial coordinate. Let K G α be
the kernel of the conjugation action of G α on the set {N1 , . . . , Nm }. Note that
Nα ≤ K , G α /K ∼ = Am , and K permutes the cosets of (N1 )α in N1 . Let q denote
the size of the smallest K -orbit L among the cosets of (N1 )α , excluding (N1 )α
itself. Then, taking β ∈ with the first coordinate in L and the other m − 1
coordinates trivial, we have |β G α | = mq.
We claim that for any γ ∈ with k nontrivial coordinates, |γ G α | ≥ ( mk )q2k−1 ,
which is greater than mq if k ≥ 2. Indeed, if (h 1 , . . . , h m ) is a coordinate
sequence for γ and the jth coordinate is nontrivial, then (N j )α h j is not fixed
by the conjugation action of (N j )α ≤ K (otherwise (1, . . . , 1, h j , 1, . . . , 1) is
a fixed point of Nα , contradicting the primitivity of G by Lemma 6.1.10(i)).
Thus |γ K | ≥ q2k−1 , since γ has at least q images differing pairwise in the first
nontrivial coordinate of γ , and for each of these, the action of (N j )α in the other
142 A Library of Nearly Linear-Time Algorithms
Lemma 6.2.17. Let G = S ≤ Sym() be transitive, with = [1, n], and
suppose that for some α, β ∈ , |G α : G αβ | = k. Then the action of any g ∈ G
on the cosets of G αβ can be computed in O(|S|nk 2 ) time by a deterministic
algorithm, using O(nk) memory.
Lemma 6.2.19. Let G ≤ Sym() be primitive, with || = n, and suppose that
G belongs to case C(iii) of Theorem 6.2.7. Then, for any α ∈ , the cardinality
of the union of orbits of G α of size less than log2 n is O(log6 n).
Lemma 6.2.20. Let G ≤ Sym(), with || = n, belong to case C(iii), and let
α ∈ . Then there exists β ∈ such that
6.2.5. Implementation
Composition series computations are implemented in both GAP and Magma.
Reduction to the primitive case proceeds almost as described in Section 6.2.1;
however, primitive groups are handled somewhat differently, based on ideas
from [Kantor, 1991]. We give some details about the GAP implementation; the
version in Magma, as described in [Cannon and Holt, 1997], is similar. The
current GAP implementation works for inputs of degree up to 106 , whereas the
Magma implementation works up to degree 107 .
The GAP implementation starts with checking whether the input group is
solvable. This can be done very efficiently by an SGS construction for solvable
groups from [Sims, 1990]. We shall describe this algorithm in Section 7.1. If the
input group G is solvable then the output is not only an SGS but a composition
series as well; if G is not solvable then the SGS construction reports failure. For
nonsolvable inputs, we apply one more reduction step, the computation of de-
rived subgroups, besides the transitive constituent and block homomorphisms
indicated in Section 6.2.1. This eventually reduces the composition series prob-
lem to one of finding a composition series in primitive perfect groups.
Kantor’s idea is that there are categories of the O’Nan–Scott theorem such
that no primitive perfect group G ≤ Sym(), with || = n, can belong to these
categories if n is in the practical range n ≤ 107 . Moreover, just by looking at
n and |G|, for most inputs G we can determine which category of the O’Nan–
Scott theorem G belongs to. Therefore, we do not have to try all algorithms
designed for the different categories. In particular, simple groups can be often
recognized without any further work.
n1 s
5 |A5 |
6 |A5 |; |A6 |
7 |PSL2 (7)|; |A7 |
8 |PSL2 (7)|; |A8 |
9 |PSL2 (8)|; |A9 |
10 |A5 |; |A6 |; |A10 |
11 |PSL2 (11)|; |M11 |; |A11 | r u
12 |PSL2 (11)|; |M11 |; |M12 |; |A12 |
5 |A5 |
13 |PSL3 (3)|; |A13 |
6 |A5 |; |A6 |
14 |PSL2 (13)|; |A14 |
7 |PSL2 (7)|; |A7 |
15 |A6 |; |A7 |; |A8 |; |A15 |
8 |PSL2 (7)|; |AGL3 (2)|; |A8 |
16 |A16 |
9 |PSL2 (8)|; |A9 |
17 |PSL2 (16)|; |A17 |
10 |A5 |; 24 · |A5 |; |A6 |; |A10 |
18 |PSL2 (17)|; |A18 |
19 |A19 |
20 |PSL2 (19)|; |A20 |
21 |A7 |; |PSL2 (7)|; |PSL3 (4)|; |A21 |
22 |M22 |; |A22 |
23 |M23 |; |A23 |
24 |PSL2 (23)|; |M24 |; |A24 |
25 |A25 |
(6.8)
Proof. We go through the cases of Theorem 6.2.5 and use the notation of that
theorem. If G belongs to case I(i) then it belongs to case (1) of this lemma.
If G belongs to case I(ii) then the minimal normal subgroups N1 , N2 of G are
isomorphic to T r for some simple group T (cf. Exercise 6.7) and n = |T |r . The
conjugation action of G permutes the r simple groups in N1 and N2 , and so G
has a factor group isomorphic to a subgroup F ≤ Sr × Sr . We must have r ≤ 3
since |T | ≥ 60 and 604 > 107 . Since G is perfect, F = 1, and so G/Soc(G)
is isomorphic to a subgroup of Out(T )2r . By Schreier’s hypothesis, Out(T )2r is
solvable, so we must have G = Soc(G) = N1 × N2 . Our last observation is that
r = 1 because otherwise, identifying with N1 = T1 × · · · × Tr , we have that
148 A Library of Nearly Linear-Time Algorithms
the set 1T1 is a nontrivial block of imprimitivity for G (cf. Exercise 6.7 again).
Hence we are in case (2).
Case II(i) cannot occur since the perfectness of G implies r ≥ 5 and n ≥ 605
(in fact, the smallest degree of a primitive group belonging to case II(i) is 606 ,
cf. [Dixon and Mortimer, 1996, Theorem 4.7B(iv)]).
If G belongs to case II(ii) and r = 1 then, by Schreier’s hypothesis, G belongs
to case (4) of this lemma. If r > 1 then the transitive permutation action of G
on {T1 , . . . , Tr } is isomorphic to a perfect subgroup of Sr of order u, and n = n r1
for some integer n 1 . Moreover, by Exercise 6.8, there is a primitive permutation
representation of degree n 1 of a subgroup of Aut(T ) containing Inn(T ). Note
the subtle point that the permutation representation of degree n 1 of T ∼ = Inn(T )
may not be primitive; the smallest example, in fact the only one in our range
n ≤ 107 , occurs when n 1 = 21 and r = 5, and T = PSL2 (7). We are in case
(3), and the possibilities for n 1 , r, u, and s := |T | are listed in Table (6.8). The
group orders s, u are given in a form that indicates the structure of the groups.
The Mathieu groups are denoted by Mi (i ∈ {11, 12, 22, 23, 24}). The factor t
in the order of G in this case is the contribution of the outer automorphisms
of T .
Finally, we show that G cannot belong to case II(iii). Suppose the contrary.
Then we must have l = 1 or l ≥ 5 since G is perfect. Here l ≥ 5 cannot occur
because n ≤ 107 . If l = 1 then k ≥ 5, using again that G is perfect; hence
n ≥ 604 > 107 , a contradiction.
Both algorithms described in Section 6.2.3 for the handling of case (1) are
implemented in GAP. Neumann’s method runs somewhat faster in practice, so
it is the current default technique. The algorithm for Frobenius groups differs
from the one given in Section 6.2.3. By Exercise 6.12(v), if G ≤ Sym(),
with || = p d , is a perfect Frobenius group then G/Soc(G) ∼ = SL2 (5). Hence,
taking random elements of G, we have a 1/120 chance of finding an element of
the socle. Note that it is very easy to test whether some g ∈ G is in Soc(G): This
happens if and only if g p = 1. Alternatively, g ∈ Soc(G) if and only if g = 1 or
g fixes no points of .
Groups belonging to case (2) are handled by the extension of Neumann’s
method, described in Remark 6.2.14.
For groups belonging to case (3), the algorithm is different from the one given
in Section 6.2.4. Let α ∈ and let K denote the kernel of the permutation action
of G on {T1 , . . . , Tr }. We compute the smallest orbit of G α in \{α}. It can be
checked that in each group in case (3), using the terminology of Lemma 6.2.16,
consists of points with exactly one nontrivial coordinate. Next, we compute
6.2 Composition Series 149
the (unique) minimal block system
for G α | and the kernel M of the action
of G α on
. With one exception, points in belong to the same block of
if
and only if their nontrivial coordinate is in the same position, and so M = K α .
The exception is when n 1 = 5, r = 10, and u = 24 · |A5 | in Table (6.8), in
which case the nontrivial coordinates can be in two positions for points in the
same block, and M is the extension of K α by an elementary abelian group of
order 16. Therefore, we can obtain Soc(G) as M G or M G (we may need
the second derived subgroup in the case n 1 = 21, s = |PSL3 (4)| as well).
(i) N = M Soc(G).
(ii) If T is solvable then N /Soc(G) is solvable.
(iii) If T is nonabelian then either N = Soc(G) or C Nα (M) = Soc(G)α .
150 A Library of Nearly Linear-Time Algorithms
Proof. (i) Since Soc(G) is transitive, any g ∈ G can be written in the form
g = g1 h for some g1 ∈ G α and h ∈ Soc(G). Therefore, since M G α , we have
M g = M g1 h = M h ≤ MSoc(G) and so N ≤ MSoc(G). Conversely, M ≤ N
by definition and we claim that Soc(G) ≤ N . This is clear if Soc(G) is the
unique minimal normal subgroup of G. If G has two minimal regular normal
subgroups then Soc(G)α ≤ M ≤ N by Exercise 6.15, and N also contains a
minimal normal subgroup N1 of G. Hence Soc(G) = Soc(G)α , N1 ≤ N .
(ii) Suppose that T is a cyclic group of order p for some prime p. Then M is
a p-group by Exercise 6.17(ii) and so N /Soc(G) ∼ = M/(Soc(G) ∩ M) is also a
p-group.
(iii) If T is nonabelian then M is a minimal normal subgroup of G α by
Exercise 6.17(i). Since Soc(G)α G α , we must have M ≤ Soc(G)α or M ∩
Soc(G)α = 1. If M ≤ Soc(G)α then N = Soc(G). If M ∩ Soc(G)α = 1 then
Soc(G)α centralizes M and so Soc(G)α ≤ C Nα (M). Conversely, if c ∈ C Nα (M)
then Soc(G)c centralizes MSoc(G)/Soc(G). However, MSoc(G)/Soc(G) ∼ =
M, so it has trivial center, implying c ∈ Soc(G), and so c ∈ Soc(G) ∩ Nα =
Soc(G)α .
&log |G|'
f (n, G) ≤ Cn(log |G| − i)c ≤ Cn logc+1 |G|.
i=0
Proof. We have seen that if a primitive group G satisfies the conditions of the
lemma then either Soc(G) is regular abelian or G is simple.
If Soc(G) is regular abelian then G δ ∩ Soc(G) = 1, so G = N = MSoc(G)
implies that M = G δ . This also implies that T = M, since if T = G δ then T G δ
is a proper normal subgroup of G δ , contradicting M = G δ . The other property
in the conclusion of the lemma, that |G δ | divides |GLr ( p)|, obviously holds in
this case.
If G is simple then by [Guralnick, 1983] or [Kantor, 1985a], the pair (G, G δ )
is one of the following:
(i) G ∼= A pr and G δ ∼
= A pr −1 ;
∼
(ii) G = PSLd (q) with (q d − 1)/(q − 1) = pr , and G δ is a maximal parabolic
subgroup of G;
(iii) G ∼= PSL2 (11), pr = 11, and G δ ∼
= A5 ;
∼ ∼
(iv) G = M23 , p = 23, and G δ = M22 ;
r
Here M23 , M22 , and M11 denote Mathieu groups, and M10 is an index two
extension of A6 . In cases (ii), (v), and (vi) G δ is not simple, whereas in cases
(i), (iii), and (iv) |G δ | does not divide |GLr ( p)|.
Based on Lemma 6.2.23, we can easily finish the solution of (6.10) in the
case when || is a prime power. The last remaining case is to decide whether
Soc(G) is regular nonabelian. We shall do this by showing that, for primitive
groups with regular nonabelian socle, (6.7) from Section 6.2.1 can be solved in
nearly linear time by a deterministic algorithm.
154 A Library of Nearly Linear-Time Algorithms
Lemma 6.2.24. Let G ≤ Sym() be primitive with regular nonabelian socle
and let α ∈ and || = n. Then the smallest orbit of G α in \{α} has size
less than log6 n.
By Lemmas 6.2.24 and 6.2.17, if G has regular nonabelian socle then the
action of G on the cosets of G αβ can be constructed by a nearly linear-time
deterministic algorithm, provided that β is chosen from the smallest orbit of
G α in \{α}. After that, the action of G on the cosets of a maximal subgroup
containing NG (G αβ ) can be constructed as described in Section 6.2.4. This latter
action either is not faithful or falls into a different case of Theorem 6.2.5 (in
fact, into case II(ii)). If the first of these alternatives holds then G is not simple,
and in the second case we have seen already how (6.10) can be solved.
The last observation we have to make is that the algorithm solving (6.10) can
be organized such that it calls itself recursively only once, for a group of order
at most half of the order of the input group. If this property holds then the same
argument that we used earlier in this section in the case of (6.9) shows that
the running time is nearly linear. As we described the algorithm, one recursive
call is made for the simple group T G δ . The only way a second recursive
call may occur is that the subroutine for the handling of groups with regular
nonabelian socle returns a faithful primitive permutation representation on a
smaller domain, and we check the simplicity of a subnormal subgroup of a
point stabilizer in this new representation. To avoid two recursive calls, we
postpone the verification of the simplicity of T as the last step of the algorithm,
6.2 Composition Series 155
and if a new permutation representation is constructed then we do not perform
the verification of T at all.
Theorem 6.3.1. Let G ≤ Sym(), with || = n, and let N G be abelian. Then
there exists a homomorphism ϕ : G → Sn such that N ≤ ker(ϕ) and the prime
divisors of N and | ker(ϕ)| are the same. In particular, if N is a p-group then
so is ker(ϕ); moreover, if N is elementary abelian then ker(ϕ) is elementary
abelian. Given generators for G and N , ϕ can be constructed in linear time by
a deterministic algorithm.
k
Proof. Let 1 , . . . , k be the orbits of N and so = i=1 i . By Corol-
lary 6.1.5, N acts regularly on each set i , so N |i has |i | elements. Let
i
k
denote the set of permutations in N |i , and let
:= i=1
i . Clearly, |
| = n.
We shall define an action of G on
, satisfying the conclusion of the theorem.
g
Given g ∈ G and σi ∈
i , we define σi ∈
in the following way: By
g
Lemma 6.1.7, G permutes the orbits of N ; suppose that i = j . Take any
6.3 Quotients with Small Permutation Degree 157
g
x ∈ N with x|i = σi , and let σi := x g | j . This map is well defined since if
x|i = y|i for some x, y ∈ N then x g | j = y g | j . It is also clear that this
construction defines a homomorphism ϕ : G → Sym(
).
Now N ≤ ker(ϕ), since N acts trivially on itself by conjugation. Moreover, if
g ∈ ker(ϕ) then g fixes the orbits of N and g|i centralizes N |i . By the second
part of Corollary 6.1.5, N |i is self-centralizing in Sym(i ), so g|i ∈ N |i .
Hence the transitive constituents of N and ker(ϕ) are the same, implying the
required relations between the prime factors and exponents.
To see the algorithmic feasibility of the construction, let us fix a representa-
tive δi ∈ i in each orbit of N and build a (breadth-first-search) Schreier tree
Ti with root δi for N mod Nδi . In this tree, contrary to the usual Schreier tree
constructions, we direct the edges away from the root. Identifying the trivial
permutation in
i with δi , the elements of
i naturally correspond to the ele-
g
ments of i . If σi ∈
i corresponds to γ ∈ i then, to compute σi , it is enough
g
to determine the image of δ j under the permutation xγ , where xγ is the prod-
uct of edge labels in Ti along the path Pγ from δi to γ . (Of course, we do
not compute the permutation xγ ; instead, we work with the word consisting of
the sequence of edge labels g
along Pγ .) This path Pγ may be (|i |) long, so
computing one image δ jγ may grequire (|i |) time; however, noticing that
x
g
if
x
δ is the parent of γ in Ti and δ jδ is already computed then we can obtain δ jγ in
x
g
constant time, we can construct the entire image
i in O(|i |) time.
Proof. Since the radical and p-core of G intersect N trivially, they must central-
ize N . In particular, if the radical [[ p-core]] of G is nontrivial then C G (N ) has
nontrivial radical [[ p-core]]. If C G (N ) ≤ N then its radical [[ p-core]] is triv-
ial. If C G (N ) ≤ N then G = N C G (N ) and C G (N )/Z (N ) ∼ = G/N , so C G (N )
has a chance to have nontrivial radical [[ p-core]] only if G/N is cyclic [[is a
p-cycle]].
Lemma 6.3.3.
The Algorithms
Now we are ready to describe the algorithms. First, we compute a composition
series 1 = N1 N2 · · · Nm = G. By the results of Section 6.2, this can
be done in nearly linear time. Then we find the smallest index i such that
Ni has a nontrivial radical [[ p-core]] by the following method: Suppose that
O∞ (Ni ) = 1[[O p (Ni ) = 1]]. If Ni+1 /Ni is not cyclic [[not a p-cycle]] then, by
Lemma 6.3.2, we conclude that O∞ (Ni+1 ) = 1[[O p (Ni+1 ) = 1]]. Otherwise,
we compute C Ni+1 (Ni ), using the algorithm of Section 6.1.4. If C Ni+1 (Ni ) ≤ Ni
Exercises 159
then, by Lemma 6.3.3, we get a nontrivial radical [[ p-core]]. We take its normal
closure H in G; by Exercise 6.17(ii), it is a solvable normal subgroup [[normal
p-group]] in G.
Next, we compute the derived series of H . The last nontrivial term in the
derived series is an abelian normal subgroup [[abelian normal p-group]] N of
G. We compute the homomorphism ϕ : G → Sn described in Theorem 6.3.1
and repeat the procedure in the image of ϕ. Note that the composition series
computation need not be repeated, since 1 = ϕ(N1 ) ϕ(N2 ) · · · ϕ(Nm ) =
ϕ(G), with ϕ(Ni+1 )/ϕ(Ni ) ∼ = Ni+1 /Ni or ϕ(Ni+1 )/ϕ(Ni ) = 1.
Theorem 6.3.1 and computing O∞ (G) are becoming more and more
important. In any group G, we can define a series of characteristic subgroups
1 ≤ N1 ≤ N2 ≤ N3 ≤ G, where N1 = O∞ (G), N2 /N1 = Soc(G/N1 ) ∼ = T1 ×
· · · × Tm is the direct product of nonabelian simple groups, N3 /N2
Out(T1 ) × · · · × Out(Tm ), and G/N3 is a permutation group of degree m,
corresponding to the conjugation action of G on {T1 , . . . , Tm }. Constructing
these subgroups is one of the main approaches for matrix group computations
(cf. [Babai and Beals, 1999; Kantor and Seress, 2002]), but a number of
recent permutation group algorithms utilize them as well, for example, for
computing conjugacy classes (cf. [Cannon and Souvignier, 1997; Hulpke,
2000]), maximal subgroups (cf. [Eick and Hulpke, 2001; Cannon and Holt,
2002]), or automorphism groups (cf. [Holt, 2001]).
In the permutation group setting, during the construction of O∞ (G) we obtain
faithful permutation representations for the factor groups G/K i for a sequence
of normal subgroups 1 = K r ≤ K r −1 ≤ · · · ≤ K 1 = N1 = O∞ (G), with elemen-
tary abelian factors K i /K i+1 . First we solve the problem at hand (for example,
the computation of conjugacy classes) for G/O∞ (G). The bulk of the work in
this step is the handling of the simple nonabelian groups T j in Soc(G/O∞ (G)),
which can be done by identifying these groups with a standard copy of the T j
(cf. Section 8.3 for details and a precise formulation of the identification). In the
standard copy, the problems indicated here can be solved more easily than in
arbitrary permutation representations. After that, we lift the result recursively
from G/K i to G/K i+1 , for i ∈ [1, r − 1]. For that task, techniques for solvable
permutation groups are used. We shall give examples of such lifting procedures
in Section 7.3.
Exercises
6.1. Let G ≤ Sym() be primitive and 1 = N G. Prove that N is transitive.
6.2. Finish the proof of Lemma 6.1.9.
6.3. Construct
√
a permutation group G ≤ Sym(), with || = n, and with
(n/2 log n ) orbits that are of the same size but pairwise not equivalent
160 A Library of Nearly Linear-Time Algorithms
in the equivalence relation defined in Section 6.1.2. Hint: We may choose
G to be an elementary abelian 2-group.
6.4. Let H, G ≤ Sym() such that G normalizes H . Prove that G normalizes
CSym() (H ).
6.5. Prove that if H ≤ G then H G if and only if the subgroup chain defined
recursively as H0 := G, Hi := H Hi−1 for i > 0 reaches H .
6.6. [Neumann, 1986] Let = {α1 , . . . , α4m }. For i ∈ [1, m], let G i be a
dihedral group of order 8, acting on {α4i−3 , α4i−2 , α4i−1 , α4i } and fixing
the other 4m − 4 points of . Let z i be the generator of Z (G i ). Prove
that the subgroup N := z 1 z 2 , z 1 z 3 , . . . , z 1 z m is normal in G := G 1 ×
· · · × G m and G/N has no faithful permutation representation of degree
less than 2m+1 .
6.7. Let G ≤ Sym() be primitive, with two regular normal subgroups N1
and N2 . Prove that N1 ∼ = N2 ∼= N for some characteristically simple non-
abelian group N and that can be identified with N such that N1 and N2
are the right-regular and left-regular representations of N on itself.
6.8. Let 1 = K ≤ Sym() and 1 = H ≤ Sm . We define an action of K H
on the set m by the rule that if g := (k1 , . . . , km ; h) ∈ K H and
k h̄ k h̄
δ := (δ1 , . . . , δm ) ∈ m then δ g := (δ11h̄ , . . . , δmmh̄ ) for h̄ := h −1 . (This
is called the product action of the wreath product.)
Prove that this product action is primitive if and only if K acts primi-
tively but not regularly on and H is a transitive subgroup of Sk .
6.9. Let T1 , . . . , Tr be nonabelian simple groups and let H be a subdirect prod-
uct of the Ti , i = 1, 2, . . . , r . Prove that there is a partition (
1 , . . . ,
l )
of {1, 2, . . . , r } such that, for each fixed j ∈ [1, l], all groups Tk with
k ∈
j are isomorphic, and there is a diagonal subgroup D j ≤ k∈
j Tk
such that H = lj=1 D j .
6.10. Let T1 , . . . , Tr be nonabelian simple groups. Prove that if M is a minimal
normal subgroup of T1 × · · · × Tr then M = Ti for some i ∈ [1, r ].
6.11. For any group G, prove that the following are equivalent:
(i) G has a faithful transitive permutation representation that is not regu-
lar, but the stabilizer of any two points is trivial (i.e., G is a Frobenius
group).
(ii) G has a nontrivial subgroup H such that NG (H ) = H and any two
conjugates of H are identical or their intersection is 1 (such a sub-
group H is called a Frobenius complement).
6.12. This is not an exercise but rather a compilation of results about Frobe-
nius groups. Proofs can be found, for example, in Sections 17 and 18 of
[Passman, 1968]. Let G ≤ Sym() be a Frobenius group and let α ∈ .
Then H := G α is a Frobenius complement for G.
Exercises 161
(i) (Frobenius) The set {1} ∪ (G\ g∈G H g ) is a regular normal sub-
group of G, which is called the Frobenius kernel.
(ii) (Thompson) The Frobenius kernel is nilpotent.
(iii) If |H | is even then H has a unique element of order 2, which is
therefore central.
(iv) (Zassenhaus) If |H | is odd then for every prime divisor p of |H : H |,
elements of order p in H are central.
(v) (Zassenhaus) If H is nonsolvable then H has a normal subgroup H0
of index 1 or 2 such that H0 ∼ = SL2 (5) × S for some solvable group
S of order relative prime to 30.
6.13. Prove Lemma 6.2.12. Hint: Use induction on m.
6.14. Let p be a prime number.
(i) Let P be the set of d × d lower triangular matrices over GF( p), with
all diagonal entries equal to 1. Prove that P is a Sylow p-subgroup
of GLd ( p).
(ii) For H ≤ GLd ( p), let fix(H ) denote that the set of vectors in GF( p)d
fixed by each element of H . Prove that if H is a p-group then fix(H )
is a nonzero subspace of GF( p)d .
(iii) Prove that if H ≤ GLd ( p) then fix(O p (H )) is an H -invariant sub-
space.
6.15. Let G ≤ Sym() be primitive with two regular normal subgroups and let
α ∈ . Prove that the only minimal normal subgroup of G α is Soc(G)α .
6.16. Design a faster version of the algorithm described in Lemma 6.2.15,
where the intermediate groups h iH for 1 ≤ i < l are not computed; in-
stead, guarantee that a subgroup of h iH of order greater than n is con-
structed with sufficiently high probability and that h i+1 is chosen from
this subgroup.
6.17. Suppose that T G and T is simple. Prove that
(i) if T is nonabelian then T G is a direct product of simple groups
isomorphic to T and it is a minimal normal subgroup of G;
(ii) if T is cyclic of order p then T G is a p-group;
(iii) it is possible that T G is not elementary abelian in part (ii).
6.18. Combining ideas from Sections 6.2.5 and 6.2.6, design a fast Las Vegas
algorithm to decide whether a primitive group of degree at most 107 and
of known order is simple.
7
Solvable Permutation Groups
162
7.1 Strong Generators in Solvable Groups 163
Lemma 7.1.2. Suppose that G = H, y ≤ Sym(), with || = n, and y
normalizes H . Moreover, suppose that a nonredundant base B and an SGS S
relative to B is given for H , and let t denote the sum of depths of the Schreier
trees defined by S. Then there is a deterministic algorithm that computes an SGS
S ∗ ⊇ S for G in O(n(t + log |G : H |)) time, satisfying |S ∗ | ≤ |S| + 2 log |G :
H | and that the sum of depths of the Schreier trees defined by S ∗ is at most
t + 2 log |G : H |.
Following Sims, we call the algorithm described in the proof of Lemma 7.1.2
Normalizing Generator.
These are the relators for the so-called power-conjugate presentation (pcp)
for G. By repeated applications of the relators in (7.1), any word in the yi
representing an element g ∈ G can be rewritten to the collected word for g.
This process is called a collection. In particular, given collected words for g, h ∈
G, we can compute the collected word of their product by concatenating the
collected words for g and h and performing a collection. We note that no matter
in which order the rules (7.1) are applied, the collection process terminates;
however, the number of applications can vary significantly. A discussion of
the different collection methods can be found in [Leedham-Green and Soicher,
1990] and in [Vaughan-Lee, 1990].
166 Solvable Permutation Groups
The pcps are very compact, but still relatively easily manageable represen-
tations of polycyclic groups. In some instances, groups G with log |G| in the
thousands can be handled, which is currently not possible in a representation
as a permutation or matrix group. There is a large library of algorithms to deal
with polycyclic groups, but this is the topic for another book. Here we just give
the references [Laue et al., 1984], [Celler et al., 1990], [Sims, 1994], and [Eick,
1997] as a starting point for the interested reader. Most algorithms work well
in practice, but the complexity analysis is not as well developed as in the case
of permutation groups.
It is often beneficial to work with a pcp of a special form. By inserting
appropriate powers of the generators into the polycyclic generating sequence,
we can assume that the indices pi := |G i : G i+1 | are prime numbers. Moreover,
it is useful if the tower of subgroups G = G 1 G 2 · · · G r G r +1 = 1 is a
refinement of a chain of normal subgroups G = N1 N2 · · · Nm = 1, with
Ni /Ni+1 elementary abelian. Then we can use induction to extend information
from G/Ni to G/Ni+1 . Since G/Ni acts as a group of linear transformations
on Ni /Ni+1 , often linear algebra methods can be utilized.
Conversion to a pcp
Because of the usefulness of pcps, we may want to compute one for a
given solvable permutation group. Fortunately, the solvable SGS construc-
tion in Section 7.1 accomplishes this task, with very little extra work. By
Exercises 7.1 and 7.2, we can construct a polycyclic generating sequence
(y1 , . . . , yr ) such that the indices in the corresponding subgroup chain G =
G 1 G 2 · · · G r G r +1 = 1 are prime numbers and the SGS S constructed
by the algorithm of Section 7.1 consists of powers of the yi . Moreover, by
Lemma 7.1.2, S ∩ G i is an SGS for G i relative to the base of G (but this base
|G :G |
may be redundant for indices i > 1). Therefore, sifting yi i i+1 as a word in
y
G i+1 for 1 ≤ i ≤ r − 1, and sifting y j i as a word in G i+1 for 1 ≤ i < j ≤ r , we
can express the left-hand-sides of the relators in (7.1) as words in the yk . Induc-
|G :G | y
tively for i = r − 1, r − 2, . . . , 1, we can rewrite these words for yi i i+1 , y j i
in the form required in (7.1), utilizing the already constructed part of (7.1).
Although we cannot give a complexity estimate of this procedure because the
last step involves a collection process, the practical performance is very good.
All five terms on the second line of (7.3) are elements of Q. We can compute the
coordinate vectors of these elements (as functions of the αkl ) since [xi , x j ] is a
known element of Q, and conjugation by xi and x j are linear transformations
of Q with known matrices Ti and T j , respectively. Hence, for example, the
x
coordinate vector of qi j is the product of the coordinate vector of qi and the
matrix T j . The sum of the five coordinate vectors corresponding to the terms of
the second line of (7.3) is 0 and each coordinate provides a linear equation for
the αkl . Altogether, we get ( r2 )m ∈ O(log3 |G|) equations in mr ∈ O(log2 |G|)
unknowns. This system of equations has a solution (since Sylow subgroups
exist), and we can find a solution by Gaussian elimination, using O(log7 |G|)
field operations in GF(q).
need yg ∈ Q such that (x j q j ) = x j for 1 ≤ j ≤ r . This last equation is
yg
equivalent to
x j
yg−1 · yg · q j = 1, (7.4)
Writing (7.5) for all g ∈ T and noticing that g −1 ψ2−1 (ψ1 (g)) ∈ Q, we obtain
|T |m equations for the coefficients of y.
Theorem 7.3.3. Given a solvable group H ≤ Sym(), with || = n, and two
Sylow p-subgroups P1 , P2 of H , an element of H conjugating P1 to P2 can be
constructed by a deterministic nearly linear-time algorithm.
since [c1 c2 , h −1 ] = [c1 , h −1 ]c2 [c2 , h −1 ]. This last identity is a special case of
Lemma 2.3.10, and it can also be checked directly.
The first shortcut is to compute an SGS {z 1 , . . . , z m } of Q supporting co-
ordinatization and matrices in this basis for the affine transformations A T :=
{Ac | c ∈ T } for a generating set T of C. Then, instead of the orbits of the
conjugation of C on Qh, we compute the orbits of AC = A T on Q and take
the preimages of representatives of these orbits under ϕ −1 . The affine images
of elements of the vector space Q can be computed much more quickly than
the conjugates of the corresponding elements of Qh.
The second shortcut goes even further and computes the orbits of AC =
A T on a potentially smaller factor space of Q instead of Q. Since Q C,
Lemma 6.1.7 implies that the orbits of the subgroup A Q := {Ac | c ∈ Q} of AC
174 Solvable Permutation Groups
are permuted by AC . For c ∈ Q, we have q c = q for all q ∈ Q and so Ac
is just a translation of the vector space Q by the vector [c, h −1 ]. Hence, for
the 0 vector of Q, we have 0 A Q = [Q, h −1 ]. This subspace [Q, h −1 ] can be
computed easily, and the orbits of AC on Q are the complete preimages of the
orbits of AC on the factor space Q/[Q, h −1 ]. Let ĀC denote the permutation
action of AC on Q/[Q, h −1 ]. Note that ĀC acts on Q/[Q, h −1 ] as a group of
affine transformations as well, since for any q, r ∈ Q and c ∈ C we have
(i) there is an element r (c) ∈ Q such that Acr (c) fixes q j ; and
(ii) if Acr1 , Acr2 fix q j for some r1 , r2 ∈ Q then r1r2−1 ∈ C Q (h).
7.4 Two Algorithms for Nilpotent Groups 175
To show (i), observe that
for some q ∈ Q, since Ac fixes [Q, h −1 ]q j . For that q, using (7.7) and using
repeatedly that Q is abelian and that [c, h −1 ] ∈ Q, we have
cq −1 −1
Acq −1 (q j ) = q j [cq −1 , h −1 ] = q cj [c, h −1 ]q [q −1 , h −1 ]
= q cj [c, h −1 ][q −1 , h −1 ] = [q, h −1 ]q j [q −1 , h −1 ] = q j .
to
1 . By Lemma 4.2.1, these restrictions generate G 1 |
1 .
After that, the same procedure as we just described can be applied recursively
to determine whether G 1 |
1 is a p-group. Note that G 1 |
1 acts transitively on
The nilpotency test for permutation groups is based on the p-group test
described in the proof of Theorem 7.4.1. A permutation group G = S ≤
Sym(), with || = n, is nilpotent if and only if all of its transitive con-
stituents are nilpotent, so again we start with the construction of the transitive
constituents, in O(|S|n) time. From now on, we suppose that G is transitive.
Let p be a prime divisor of n. If G is nilpotent then G = P × Q, where P is
a nontrivial p-group and Q is a nilpotent group of order not divisible by p. By
Lemma 6.1.7, the orbits of P constitute a block system
∗ for G. Since P is a
p-group, the size of blocks in
∗ is p k for some k ≥ 1. Moreover, as P is in the
kernel of the action of G on
∗ , the action of G on
∗ has order relative prime
to p. So, in particular, the divisor |
∗ | of this group order is relative prime to p
and p k is the largest power of p that divides n. The orbits of Q constitute another
block system
∗∗ for G. By the same argument we used for
∗ , we obtain that
|
∗∗ | is relative prime to each prime divisor of |Q|. Hence |
∗∗ | = p k .
Finally, we describe how can we get generators for P and Q. For a generator
l
s ∈ S, let |s| = pl r with ( p, r ) = 1. Then s p := s r ∈ P and sq := s p ∈ Q, and
s ∈ s p , sq . Hence P = s p | s ∈ S and Q = sq | s ∈ S. Note that |s| divides
n since obviously pl ≤ p k , and by induction on the number of different prime
divisors of |Q| we can prove that r divides n/ p k .
Proof. We have seen that a transitive nilpotent group satisfies (i)–(vi). Con-
versely, suppose that a transitive group G satisfies (i)–(vi), and let N1 and N2 be
the kernels of actions of G on
∗ and
∗∗ , respectively. We have s ∈ s p , sq
for each s ∈ S, so G = P, Q. Since P ≤ N1 and Q ≤ N2 , we also have
G = N1 , N2 . Moreover, N1 ∩ N2 = 1, since the orbits of N1 ∩ N2 constitute
a block system that is a common refinement of
∗ and
∗∗ , and so by (iv) the
orbits must have size 1. Hence G = N1 × N2 . Using again that G = P, Q and
P ≤ N1 , Q ≤ N2 , we obtain G = P × Q. By (v) and (vi), G is nilpotent.
Proof. We have to show that properties (i)–(vi) in Lemma 7.4.2 can be checked
in the given time bound. The prime factorization of n can be obtained in
√
O( n log2 n) time by brute force. For each s ∈ S, we can check whether
s n = 1 and, if the answer is yes, we can compute s p and sq in O(n log n)
time. The orbits of P and Q can be determined, and properties (ii), (iii), and
(iv) can be checked, in O(|S|n) time. By Theorem 7.4.1, (v) can be checked
in O(|S|n log n A−1 (2|S|n, n)) time. Finally, (vi) is checked by a recursive call.
Since the orbits of Q have size n/ p k ≤ n/2, and further recursive calls are for
groups of degree decreasing at least by a factor 2, the recursive calls contribute
only a constant multiplier to the overall running time.
on less than 2n/4 points (cf. Exercise 6.6). Hence a completely new approach is
necessary. In this section we present the nearly linear-time upper central series
algorithm for nilpotent groups from [Seress, 1998]. The method is based on
ideas from [Kantor and Luks, 1990].
As in Section 6.1.1, given G, H ≤ Sym(), let 1 , 2 be disjoint copies
of the set and introduce the subgroups D = Diag(G × G) := {(g, g) | g ∈
G}, 1 × H := {(1, h) | h ∈ H }, and K := D, 1 × H . The following lemma
is a special case of the discussion in [Kantor and Luks, 1990, Section 6].
The last condition means that g and g y are in the same coset of H for all y ∈ G;
in other words, H g ∈ Z (G/H ).
K i = D, 1 × H L i . (7.8)
Exercises 181
Suppose that (7.8) holds for some i ≥ 0. Since K d = K for all d ∈ D, the
group K i+1 = K K i is generated by elements of the form (g, gh 1 )(1,h 2 l) for g ∈
G, h 1 , h 2 ∈ H , and l ∈ L i . By (7.9), all generators are in D, 1 × H L i+1 since
[g, l] ∈ L i+1 and [g, h 2 ]l , h 1h 2 l ∈ H . Conversely, D, 1 × H L i+1 is generated
by D and elements of the form (1, [g, l]h) for g ∈ G, h ∈ H , and l ∈ L i .
−1
Applying (7.9) backward with the notation h 1 := h l and h 2 := 1, we obtain
that
−1 l −1 (1,l)
(1, [g, l]h) = 1, [g, l][g, 1]l h l = (g −1 , g −1 ) g, gh l ∈ K i+1 .
Exercises
7.1. Modify the solvable SGS construction of Section 7.1 such that the out-
put polycyclic generating sequence defines a subgroup chain with prime
indices.
182 Solvable Permutation Groups
7.2. Prove that if G = H, y ≤ Sym(), y normalizes H , and S is an SGS
for H then the algorithm of Lemma 7.1.2 constructs an SGS S ∗ for G
relative to some base (β1 , . . . , β M ) such that for all i ∈ [1, M] the ratio
G (β ,...,β ) H(β ,...,β )
|βi 1 i−1 |/|βi 1 i−1 | of the fundamental orbit sizes divides |G : H |. In
particular, if |G : H | is a prime number then
(i) the fundamental orbits of H and G differ for exactly one base point;
(ii) all elements in S ∗ \S are powers of some z ∈ G\H belonging to the
coset H y.
7.3. Let H ≤ G and suppose that |H | = p m for a prime p. Prove that if H is
not a Sylow p-subgroup of G then there exists h ∈ Z (H )\{1} such that
|C G (h)| is divisible by p m+1 .
7.4. Let N G and let P be a Sylow p-subgroup of N . Prove that G =
NG (P)N .
7.5. [Kantor and Taylor, 1988] Modify the methods of Section 7.3.1 to construct
and conjugate Hall subgroups of solvable groups.
7.6. Use the techniques of Section 7.3 to design a nearly linear-time deter-
ministic algorithm to construct a Sylow p-subgroup of a solvable group
G ≤ Sym() that contains a given p-subgroup of G.
7.7. Use the techniques of Section 7.3 to design a nearly linear-time construc-
tive version of one half of the Schur–Zassenhaus theorem. Namely, given
N G ≤ Sym() with (|N |, |G/N |) = 1 and N solvable, construct a com-
plement H for N in G. Furthermore, given two complements H1 , H2 of
N , construct some g ∈ G that conjugates H1 to H2 .
7.8. How can random subproducts speed up the algorithms for computing
strong generating sets in solvable groups (Section 7.1), cores of subnor-
mal subgroups (Section 6.1.5), and Sylow subgroups of solvable groups
(Section 7.3.1)?
7.9. Show that the following are equivalent for a finite group G.
(i) The lower central series of G terminates at 1.
(ii) Every subgroup of G is subnormal.
(iii) Every proper subgroup of G is properly contained in its normalizer.
(iv) Every maximal subgroup of G is normal.
(v) The Sylow subgroups of G are normal in G.
(vi) G is the direct product of p-groups.
(vii) The upper central series of G terminates at G.
8
Strong Generating Tests
The fast constructions of strong generating sets are randomized and in prac-
tice we often use heuristics; therefore, it is necessary to test the correctness of
the output. Given an input group G = S ≤ Sym(), suppose that we have
constructed an alleged base B = (β1 , . . . , βm ), generating sets Si for each
G [i] := G (β1 ,...,βi−1 ) , and transversals Ri for Si mod Si βi . The correctness of
the computation can be checked by forming all Schreier generators from Si , Ri
for i = 1, 2, . . . , m and sifting them through our data structure. However, this
method is too slow, and trying to avoid the formation of all Schreier generators
was the primary reason why we resorted to randomized constructions.
So far, we have seen two methods to check the result of an SGS computation.
The most simple-minded way, which is very useful in practice, is to compare
i |Ri | and |G| if the order of G is known in advance. Errors in SGS computa-
tions are one-sided in the sense that we never put permutations into Si that are
not in G [i] , and the possible error is that Si generates only a proper subgroup
of G [i] . Therefore, if i |Ri | = |G| then the construction is correct.
The second method we have encountered is the Monte Carlo strong gen-
erating test described in Lemma 4.5.6. Although this procedure is reasonably
fast in practice if the SGS construction used one of the shallow Schreier tree
construction methods of Section 4.4, the drawback is that we cannot guarantee
correctness with absolute certainty.
In this chapter, we describe three further methods to check the correctness of
an SGS computation. By Lemma 4.2.3, it is enough to check that Si βi = Si+1
holds for each i. To this end, the first two methods we present solve the problem
(4.6) in Section 4.3. Recall that (4.6) asks for deciding whether H = G β , given
the input G = S, a transversal for G mod G β , and H ≤ G β . The third method
departs radically from the traditional point stabilizer approach and is based on
structural properties of G. By the time it confirms the correctness of the SGS,
it also computes a composition series and a short presentation for G.
183
184 Strong Generating Tests
CT g1 g2 RT g1 g2 g1 g2 g1 g2
1 2 3 1 2 4 3 1
2 1 4 2 1 3 4 2
3 1 3 4 2 1 3
(8.1)
4 2 4 3 1 2 4
ST g1 g2 g1 g2
1 2 4 3 1
shows the coset table CT, relation table RT, and subgroup table ST after the
definition of the cosets 1 := H, 2 := 1g1 , 3 := 1g2 , 4 := 2g2 . At that moment,
the last entry (in the second column) of ST is filled and we get the deduction
4g1 = 3, which also implies 3g1 = 4. Then all entries in CT are known, and
we can complete RT; this leads to the coincidences 1 = 4 and 2 = 3.
There are different strategies for coset enumeration. The version we pre-
sented here, where we fill all consequences of new definitions, deductions, and
186 Strong Generating Tests
coincidences in all relation tables as soon as possible, is called the Felsch strat-
egy. However, it is possible to postpone the filling of some relation tables: If
we ensure that if a coset k is defined then sooner or later we define kg for all
g ∈ Ē, and sooner or later we deduce all consequences of these definitions,
then it is guaranteed that the algorithm terminates if |G : H | < ∞. This remark
plays an important role in the theoretical justification of the algorithm described
in Section 8.1.2, where we do not have a full presentation at the start of coset
enumeration and, as the coset enumeration progresses, we add relators to R.
From a theoretical point of view, we can suppose that all relators are already
present at the beginning; we just postpone the filling of entries in some relation
tables.
There is no recursive function of |G : H | and the input length that would
bound the number of cosets defined during the Todd–Coxeter procedure. It
is easy to give presentations of the trivial group such that no commonly used
variant of the Todd–Coxeter algorithm can handle them. This, and different coset
enumeration strategies, are discussed, for example, in [Sims, 1994, Chap. 5].
Success mostly depends on the number of entries defined in the coset table rather
than |G : H |. There are instances of successful enumeration with |G :H | > 106 .
Case 1: The coset table closes with n cosets. In this case, |G : H | = n or,
equivalently, H = G β .
Case 2: The coset table closes with more than n cosets. Then H = G β and
an element of G β \H can be obtained by choosing two words w1 , w2 from
the coset table that represent different cosets of H but β ϕ(w1 ) = β ϕ(w2 ) .
Then ϕ(w1 w2−1 ) ∈ G β \H .
Case 3: The coset enumeration was interrupted after defining m cosets. Then,
as in Case 2, there are words w1 , w2 in the coset table such that β ϕ(w1 ) =
β ϕ(w2 ) . This means that g := ϕ(w1 w2−1 ) ∈ G β ; we test whether g ∈ H .
If the answer is “no” then we have found a witness for G β = H . If the
answer is “yes” then we sift g as a word in H and obtain a word t in
the strong generators of H such that the product of permutations in t is
g. Let w denote the word corresponding to t in E ∗ . We add the relator
w1 w2−1 w −1 = 1 to R and perform the collapse w1 = w2 in the coset table.
Then we resume the coset enumeration.
Of course, we need much more than finiteness of the procedure. Leon experi-
mented extensively with the value of the parameter m and the coset enumeration
technique to be used. He chose the value m = 1.2n and developed a general-
ization of one of the basic strategies for coset enumeration, the Haselgrove–
Leech–Trotter method. Note that if m < 2n then Case 2 cannot occur, since
|G : H | > n means |G β : H | ≥ 2 and so |G : H | = |G : G β | · |G β : H | ≥ 2n.
When the entire procedure terminates in Case 1, we have a presentation
E | R for a group Ḡ that has a subgroup H̄ of index n, and G satisfies this
presentation. Hence there is a homomorphism ψ : Ḡ → G such that ψ( H̄ ) =
H . However, since relators for H are a subset of R, we must have H̄ ∼ =
∼
H, ker(ψ) = 1, and Ḡ = G. Hence we obtain a presentation for G, which is
usually much shorter than the one described in Exercise 5.2.
The STCS algorithm is used in Magma.
Lemma 8.2.1. Suppose that for all δ ∈ β G there exist sets U (δ) ⊆ G satisfying
the following properties:
Then H = G β .
Proof. We use the idea of the proof of Lemma 4.2.1. Let g ∈ G β be arbi-
trary. Since G β ≤ G, g can be written in the form g = x1 x2 · · · xk for some
nonnegative integer k and xi ∈ S for i ≤ k. Let δi := β x1 ···xi , for 0 ≤ i ≤ k.
By (iii), for each i ∈ [1, k] there exist u i−1 ∈ U (δi−1 ) and vi ∈ U (δi ) such
that u i−1 xi vi−1 ∈ H . Also, vi u i−1 ∈ H by (ii) and u −1
0 , vk ∈ H by (i). Hence
g = x1 x2 · · · xk
−1
= u −1
0 u 0 x 1 v1
−1
v1 u 1 u 1 x2 v2−1 · · · vk−1 u −1 −1
k−1 u k−1 x k vk vk ∈ H.
Lemma 8.2.2. If the following conditions (a), (b), and (c) hold then the sets
U (δ) defined in (8.2) satisfy (ii) and (iii) of Lemma 8.2.1.
−1
(a) Hδu(δ
i
i)
≤ H for all i ∈ [1, l];
(b) u(γi j ) z v(γizj )−1 ∈ H for all i ∈ [1, l] and for all j ∈ [1, li ]; and
(c) Hγz ≤ H .
and by (a) it is in H .
For (iii), we have to consider two cases: x ∈ S1 or x = z. If x ∈ S1 then
let u = u(δi )h ∈ U (δ) be arbitrary, and v := ux. Note that δ x is in the same
H -orbit i as δ, so v = u(δi ) (hx) ∈ U (δ x ). Moreover, uxv −1 = 1 ∈ H .
Finally, let x = z. Let γi j be the representative of the Hγ -orbit of δ, and fix
w ∈ Hγ such that γiwj = δ. Define u := u(γi j ) w. Then u = u(δi ) (y(γi j )w) ∈
U (δ). We also define v := v(γizj ) w z . By (c), w z ∈ H , so, if the representa-
tive of the Hγ -orbit of γizj is γk jk , then v = u(δk ) h for some h ∈ H . We have
−1
β v = β v(γi j ) w = (γizj )z wz = δ z , and so v ∈ U (δ z ). By (b),
z z
−1
uzv −1 = u(γi j ) wz(z −1 w −1 z) v γizj ∈ H,
so (iii) holds.
190 Strong Generating Tests
witness for H = G β . On the other hand, if (a), (b), and (c) are satisfied then,
by Lemmas 8.2.2 and 8.2.1, we have H = G β .
The required orbit computations of this algorithm are deterministic and can
be done in nearly linear time. If the transversals for the point stabilizer chain of
H are stored in shallow Schreier trees then each sifting required in (b), and each
base change required in (a) and (c), can also be executed by a nearly linear-
time deterministic or by a nearly linear-time Las Vegas algorithm. Hence the
time requirement of the algorithm is O(n M logc |G|), where n is the size of the
permutation domain and M is the number of Hγ -orbits in β G .
Now we handle the general case. Let S = {s1 , . . . , sk } and G(i) := H,
s1 , . . . , si . Recursively for i = 1, 2, . . . , k, we shall check whether G(i)β = H .
The case i = 1 is discussed in the previous paragraph. Suppose that we have
verified that G(i − 1)β = H for some i ≥ 2 and have constructed a Schreier
tree data structure for G(i − 1). Our next aim is to decide whether G(i)β = H .
We sift si in G(i −1). If si ∈ G(i −1) then G(i) = G(i −1) and we are done. If
si has a nontrivial siftee g then there are two cases. The first one is that β g = β,
which means that g ∈ G(i)β \G(i − 1) and so it is a witness for G(i)β = H .
The second case is that β g = β. This can happen only if the sifting procedure
failed on the first level because β g ∈ β G(i) \β G(i−1) . In this case, let := β G(i)
and := β G(i−1) . If indeed G(i)β = H then G(i) G(i − 1) ≥ G(i)β and
is a block of imprimitivity for the G-action on , so our first task is to check
whether is a block. This is exactly the situation we have encountered in
Section 5.5.1, in Step 4 of the construction of a block system. By Lemma 5.5.5,
in nearly linear time it is possible to check whether is a block and, if it is not,
to construct some h ∈ G(i)β \H . (We leave to the reader to check the details that
the algorithm described in Lemma 5.5.5(b) works in the situation considered
here, with G(i − 1) playing the role of H, rλ in Lemma 5.5.5.)
If is a block of imprimitivity then let
denote the block system consisting
of the G(i)-images of . In the G(i)-action on
, the “point” is stabilized
by G(i − 1); therefore, since G(i) is generated by G(i − 1) and one additional
element, we can use the special case of Sims’s Verify routine discussed earlier
in this section to check whether G(i) = G(i − 1). If indeed G(i) = G(i − 1)
then |G(i)| = (||/||)|G(i − 1)| = (||/||)|||H | = |||H | and |G(i) :
G(i)β | = ||, so G(i)β = H . However, if G(i) = G(i − 1) then the algorithm
constructs some h 1 ∈ G(i) \G(i − 1) to witness this fact. Taking h 2 ∈ G(i − 1)
such that β h 1 = β h 2 , we obtain h 1 h −1
2 ∈ G β \H .
8.3 Toward Strong Generators by a Las Vegas Algorithm 191
Versions of Sims’s Verify routine are implemented both in Magma and GAP.
The bottleneck to nearly linear running time and to good practical performance
is that in the special case G = H, z, the running time is proportional to the
−1
number of orbits of Hγ , for γ = β (z ) . To alleviate this problem, the Verify
routine is combined with the Schreier–Todd–Coxeter–Sims method (cf. Sec-
tion 8.1) in Magma. This combination is referred to as the Brownie–Cannon–
Sims algorithm. Since this algorithm is unpublished, we do not know what
criterion on the number of orbits of Hγ is used to choose between Verify and
STCS.
The GAP implementation constructs a maximal block system
for the G-
action on β G , and z ∈ G such that z fixes the block B containing β. Then it
verifies that H, zβ = H and, recursively by the same method for computing
block systems, that G B = H, z for the point stabilizer G B in the action on
. Block computations are so cheap that their cost seems to be more than
offset by the decrease of the number of orbits of Hγ , which we achieve with
that choice of z. In essence, this modification inserts a chain of subgroups
between G and G β , which is in agreement with the general philosophy behind
permutation group algorithms, namely, that we try to define a subgroup chain
G = G 1 ≥ G 2 ≥ · · · ≥ G m = 1 with small indices |G i : G i+1 |.
As mentioned in Section 4.5.1 and discussed further at the end of Section 5.2,
GAP employs a heuristic version of the nearly linear-time Monte Carlo SGS
construction discussed in the above mentioned sections. The user can choose
whether the output should be checked with a Monte Carlo algorithm by apply-
ing Lemma 4.5.6 or its correctness should be decided with certainty, applying
Sims’s Verify routine. Not only do both strong generating tests give a yes/no
answer but, in the case of incorrect SGS, they construct an element of the input
group that can be the starting point of the continuation of the SGS construction.
However, the Verify routine is a quite expensive way to discover that an SGS
is not correct, and GAP tries to call Verify only for correct inputs. Therefore,
another heuristics is built in: If the user asks for Verify then, as a first step, ten
random elements are tested, as described in Lemma 4.5.6; Verify is called only
after these elements pass the test.
follow the description of the algorithm from [Kantor and Seress, 1999], which
makes clear how the class of groups covered by the algorithm extends when the
nearly linear-time constructive recognition of further simple groups becomes
available. In Theorem 8.3.2, we shall give the current list of simple groups that
can be recognized constructively in nearly linear time.
The algorithm can be used as a strong generating test by comparing the order
of the input group it computed with the order computed from the SGS to be
tested. As we emphasized earlier, if the initial SGS computation is correct then
the nearly linear-time algorithms described in Chapters 5 and 6 are deterministic
or of Las Vegas type; therefore, for groups with restricted composition factors,
the entire nearly linear-time library of algorithms is upgraded to Las Vegas
type.
As promised, we give the definition of constructive recognition of simple
groups. We formulate constructive recognition in the black-box group setting,
since its applications extend beyond permutation group algorithms: Construc-
tive recognition of simple groups is a central concept in matrix group algorithms
as well. Let F be a family of simple groups and let f : F → R be a function
taking positive values. We say that F is black-box f -recognizable if, whenever
a group H = S isomorphic to a member of F is given as a black-box group
encoded by strings of length at most N and, in the case of Lie-type H , the
characteristic of H is given, there are Las Vegas algorithms for (i) and (ii) and
a deterministic algorithm for (iii) of the following:
(i) Find the isomorphism type of H .
(ii) Find a new set S ∗ of size O(log |H |) generating H and a presentation
of length O(log2 |H |) such that S ∗ satisfies the presentation. (Note that
log |H | ≤ N , and this presentation proves that H has the isomorphism
type determined in (i).)
(iii) Given h ∈ H , find a straight-line program of length O(N ) from S ∗ to h.
(Straight-line programs were defined in Section 1.2.3.)
Moreover,
(iv) The algorithms for (i)–(iii) run in time O((ξ + µ) f (H )N c ), where ξ is an
upper bound on the time requirement per element for the construction of
independent, (nearly) uniformly distributed random elements in subgroups
of H, µ is an upper bound on the time required for each group operation
in H , and c is an absolute constant.
As we have seen in Section 5.3, a permutation group G ≤ Sym() is iso-
morphic to a black-box group H over an alphabet consisting of the labels in
a Schreier tree data structure of G. Let ψ : G → H denote this isomorphism.
8.3 Toward Strong Generators by a Las Vegas Algorithm 193
By Lemma 5.3.1, the quantities ξ, µ, N for H in part (iv) of the definition of
constructive recognition are bounded from above by logc |G| for some absolute
constant c . Moreover, for any g ∈ G, ψ(g) can be computed in O(logc |G|)
time, and for any h ∈ H, ψ −1 (h) can be computed in O(|| logc |G|) time.
Therefore, a constructive recognition algorithm runs in nearly linear time
for permutation group inputs if f is bounded from above by a nearly linear
function of the degree of the input group. This fact motivates the consideration
of the function m : G → R, where G is the family of all finite simple groups and
m(G) is the degree of the smallest faithful permutation representation of G.
Proof. We compute an alleged base and strong generating set, and a composi-
tion series G = N1 N2 · · · Nl = 1, by the nearly linear-time Monte Carlo
algorithms in Sections 4.5 and 6.2. The composition series algorithm also pro-
vides homomorphisms ϕi : Ni → Sn with ker(ϕi ) = Ni+1 , for 1 ≤ i ≤ l − 1.
We also compute strong generating sets for all Ni relative to the base of G.
We will verify the correctness of the base and strong generating sets for the
subgroups Ni by induction on i = l, l − 1, . . . ,1.
Suppose that we already have verified an SGS for Ni+1 . We compute a base,
SGS, shallow Schreier-tree data structure, and an isomorphism ψi : ϕi (Ni ) →
Hi with a black-box group Hi for the image ϕi (Ni ) ≤ Sn , which is allegedly
isomorphic to a simple group. Our first goal is to recognize Hi , and so ϕi (Ni ),
constructively.
As a consequence of the classification of finite simple groups, we know that
there are no three pairwise nonisomorphic simple groups of the same order.
Therefore, since we know |ϕi (Ni )|, we have at most two candidate simple
groups C for the isomorphism type of ϕi (Ni ), and in the ambiguous cases we
try both possibilities. Also, if |ϕi (Ni )| > 8!/2 then |ϕi (Ni )| determines whether
ϕi (Ni ) is of Lie type, and if it is, its characteristic. Hence, using that ϕi (Ni )
is from an m-recognizable family, we can obtain by a nearly linear-time Las
Vegas algorithm a generating set Si∗ of Hi of size O(log |Hi |) and a presentation
E i | Ri of length O(log2 |Hi |) satisfied by Si∗ . Moreover, for any given h ∈
Hi , we can write a straight-line program of length O(log |Hi |) from Si∗ to
h in nearly linear time by a deterministic algorithm. By Lemma 5.3.1, the
preimage Si∗∗ = ψi−1 (Si∗ ) ⊆ ϕi (Ni ) can also be obtained in nearly linear time
by a deterministic algorithm.
194 Strong Generating Tests
Now the correctness of the SGS for Ni can be proved in the following way: Let
Ti be the set of generators of Ni computed by the composition series algorithm.
We check that
(a) Ni+1 Ni and Ni = Ni+1 ;
(b) ϕi−1 (Si∗∗ )Ni+1 /Ni+1 satisfies the presentation E i | Ri computed for
ϕi (Ni ) ∼
= Hi ; and
(c) Ti ⊂ ϕi−1 (Si∗∗ )Ni+1 , where ϕi−1 (g) denotes a lift (i.e., an arbitrary preim-
age) of g ∈ Si∗∗ ⊆ ϕi (Ni ).
If (a), (b), and (c) hold then |Ni | = |Ni+1 ||ϕi (Ni )|. If |Ni | is equal to the
value for |Ni | computed from the alleged SGS of Ni then the SGS construction
is correct, since the alleged order of a group computed by a Monte Carlo SGS
construction is not greater than the true order, with equality if and only if the
SGS construction is correct.
We indicate how (a), (b), and (c) can be checked algorithmically. For (a),
we conjugate the generators of Ni+1 by the elements of Ti and check that the
resulting permutations are in Ni+1 . Because the correctness of the SGS for
Ni+1 is already known, membership testing giving guaranteed correct results
is available for that group. Also, we check that not all elements of Ti are in
Ni+1 . For (b), we multiply out the relators that were written in terms of Si∗ ,
using the permutations in ϕi−1 (Si∗∗ ) = ϕi−1 ψ −1 (Si∗ ); then we check that the
resulting permutations are in Ni+1 . Finally, for (c), for each ti ∈ Ti we write a
straight-line program P from Si∗ to ψi ϕi (ti ) ∈ Hi and evaluate P starting from
ϕi−1 ψ −1 (Si∗ ) (recall that straight-line programs can be evaluated in any group,
giving values to the symbols representing generators in the program). The result
of the evaluation is some t ∗ ∈ ϕi−1 ψi−1 (Si∗ ) and we check that t ∗ t −1 ∈ Ni+1 .
By (b) and (c), we have checked that the factor group Ni /Ni+1 ∼ = C.
At the end of the induction, we have obtained a correct SGS for the group
N1 = T1 that was output by the composition series algorithm. After that, we
verify that G = N1 by sifting the elements of the original generating set T
in N1 .
To justify the nearly linear running time of the entire algorithm, note that
l ∈ O(log |G|), so it is enough to show that the ith step of the induction for
a fixed value i ≤ l runs in nearly linear time. We have already seen that the
constructions of both Si∗ and of the presentation for ϕi (Ni ) ∼ = Hi are within
∗
this time bound. Since both |Ti | and |Si | are O(log |G|), whereas the length
of the presentation E i | Ri is O(log2 |G|) and the Schreier-tree data structure
of Ni+1 is shallow, the number of permutation multiplications that occur while
checking (a), (b), and (c) is bounded from above by a polylogarithmic function
of |G|.
8.3 Toward Strong Generators by a Las Vegas Algorithm 195
We note that, to achieve an algorithm failure with probability less than ε, we
have to require that calls to the SGS construction algorithm and to constructive
recognition fail with probability less than ε/(c log |G|) for some constant c,
because, during the induction, O(log |G|) such calls may be made; however,
this multiplies the running time only by a O(log log |G|) factor.
Theorem 8.3.2. The set of all finite simple groups, with the possible exceptions
of the groups 2F4 (q) and 2G 2 (q), comprises a black-box m-recognizable family.
We do not prove Theorem 8.3.2; instead we just make some comments and
give the appropriate references. Cyclic simple groups are m-recognizable by
brute force, and the twenty-six sporadic simple groups can be handled with a
constant number of black-box group operations.
The alternating groups comprise a 1-recognizable family (i.e., one can take
f (G) = 1 in the definition of constructive recognition for all alternating groups
G). This result was first proven in [Beals and Babai, 1993], and more effi-
cient versions are described in [Bratus and Pak, 2000] and [Beals et al., 2002].
We shall return to the recognition of alternating and symmetric groups in Sec-
tion 10.2.
Classical simple groups are treated in [Kantor and Seress, 2001]. We note
that in that paper there is no algorithm for the constructive recognition of three-
dimensional unitary groups, since at the time of writing it was not known
whether the groups G ∼ = PSU3 (q) have presentations of length O(log2 |G|), as
required in part (ii) of the definition. Such presentations are described in [Hulpke
and Seress, 2001]. A predecessor of [Kantor and Seress, 2001] is [Cooperman
et al., 1997], where a constructive recognition algorithm for the groups GLn (2)
is given. In [Bratus, 1999], that method is extended to all special linear groups
PSLn (q).
Constructive recognition algorithms for the exceptional groups of Lie type
are in [Kantor and Magaard, 2002]. Both [Kantor and Seress, 2001] and
[Kantor and Magaard, 2002] far exceed the scope of this book.
Actually, in all these cited references, much more is done than parts (i)–
(iii) of constructive recognition: Given a black-box group H isomorphic to
some simple group S, an isomorphism λ : H → C, defined by the images of
the elements of S ∗ , is constructed between H and a standard copy C of S,
together with procedures to compute the image of any element of H under
196 Strong Generating Tests
λ or of any element of C under λ−1 . These procedures are very useful for
further computations with groups containing S as a composition factor, such
as the construction of Sylow subgroups (cf. [Kantor, 1985b; Morje, 1995]) or
the computation of maximal subgroups or conjugacy classes (cf. [Cannon and
Souvignier, 1997; Hulpke, 2000; Eick and Hulpke, 2001]). Another possible
application of these procedures is in computation with matrix groups.
The “standard copy” of a simple group S referred to in the previous paragraph
is the natural permutation representation of degree n if S is the alternating group
An , matrices modulo scalars in the correct dimension, over the field of definition
if S is a classical group, and a presentation utilizing the Bruhat decomposition
if S is an exceptional group of Lie type.
In [Kantor and Seress, 2001] and [Kantor and Magaard, 2002], there are also
more precise timings of algorithms than required in Theorem 8.3.1. It is shown
that the family of Lie-type simple groups, with the possible exception of the
groups 2 F4 (q) and 2 G 2 (q), is black-box f -recognizable for the function
q 3/2 ∼ PSUd (q 1/2 ) for some d
if G =
q for all other classical G defined on a vector space over GF(q)
f (G) = q 2 if G ∼
= 2B2 (q)
q3 for other exceptional groups defined over GF(q) except
G∼ = 2F4 (q) and 2G 2 (q).
It seems very likely that the set of all groups of Lie type is a black-box f -
recognizable family with some f (G) ≤ m(G).
The algorithm described in Theorem 8.3.1 has not been implemented. Al-
though the running time is nearly linear, computing the order of the input
group by this algorithm involves a lot of superfluous work. However, it may
be worthwhile to verify the correctness of an SGS computation in this way
if the constructive recognition of the composition factors is needed in further
computations. Besides the examples of computing Sylow subgroups, maximal
subgroups, or conjugacy classes that we have mentioned earlier, this is the case
at the construction of a short presentation, as described in the next section.
Constructive recognition of simple groups is even more important in the
matrix group setting, where every attempt so far for finding the structure of
a matrix group has led to some version of the constructive recognition prob-
lem (cf. [Babai and Beals, 1999; Leedham-Green, 2001; Kantor and Seress,
2002]).
Currently, constructive recognition of black-box groups isomorphic to alter-
nating, special linear, symplectic, and unitary groups is implemented in GAP,
but these algorithms are not in the standard distribution.
8.4 A Short Presentation 197
We finish this section with a lemma that will be needed in Section 8.4.
201
202 Backtrack Methods
9.1. Traditional Backtrack
The content of this section essentially was formulated in [Sims, 1971a, b]. Let
B = (β1 , . . . , βm ) be a base of G ≤ Sym(), and let G = G [1] ≥ G [2] ≥ · · · ≥
G [m] ≥ G [m+1] = 1 be the corresponding point stabilizer subgroup chain. The
images of base points uniquely determine the elements of G, and we may
identify group elements with their sequence of base images. The image of an
initial segment of B of length l defines a coset of G [l+1] . The images of all initial
segments define a partial order by inclusion, which is called the search tree T
for G. The lth level Tl of T corresponds to the cosets of G [l+1] ; in particular,
the root of T corresponds to G and the leaves of the search tree correspond to
the elements of G. We shall denote the subtree of T rooted at t ∈ T by T (t).
Formally, we can define a function ϕ from T to the power set of G by the rule
g
ϕ((γ1 , . . . , γl )) := g ∈ G | βi = γi for 1 ≤ i ≤ l .
Lemma 9.1.2 implies that when processing the successors of the vertex t :=
(γ1 , γ2 , . . . , γl−1 ), it is enough to consider extensions of t by the ≺-minimal
elements of the orbits of L (γ1 ,γ2 ,...,γl−1 ) . The application of this criterion involves
a base change in L, in which a base for L starting with (γ1 , γ2 , . . . , γl−1 )
is computed. Because of the order in which the search tree is traversed, we
already have L (γ1 ,γ2 ,...,γl−2 ) at hand when L (γ1 ,γ2 ,...,γl−1 ) is to be computed, so only
γl−1 has to be included in the new base. We have to find a balance between the
cost of the base change and the gain in pruning the search tree. Hence, in some
implementations, this criterion is applied only for small values of l.
[Leon, 1991] contains the following observation which, when an additional
condition is satisfied, leads to a strengthening of Lemma 9.1.2.
K (β ,...,β
Lemma 9.1.3. Suppose that βl ∈ βk 1 k−1 for some k ≤ l, and let t =
)
K (β ,...,β
Proof. Since βl ∈ βk 1 k−1 , there exists h 1 ∈ K (β1 ,...,βk−1 ) such that βl = βkh 1 .
)
L (γ ,γ ,...,γ )
Suppose, on the contrary, that there exists γ ∈ γl 1 2 k−1 with γ ≺ γk . Then
γ = γlh 2 for some h 2 ∈ L (γ1 ,γ2 ,...,γk−1 ) , and h 1 gh 2 ∈ K gL. Moreover, h 1 gh 2 ≺
h gh g h gh g
g, since βi 1 2 = βi for i < k, and βk 1 2 = γ ≺ γk = βk . This contradicts
the minimality of g in K gL.
K (β ,...,β
In particular, if βl ∈ βk 1 k−1 for some k < l then we have to consider only
)
K (β ,...,β
Lemma 9.1.4. Let s be the length of the lth fundamental orbit βl 1 l−1 of K ,
)
hg g
Proof. The set := {βl | h ∈ K (β1 ,...,βl−1 ) } has s elements and γl = βl ∈ .
hg
All elements of are in the same orbit of G (γ1 ,...,γl−1 ) , since for γ = βl ∈ ,
−1 −1
we have γ g h g = γl with g −1 h −1 g ∈ G (γ1 ,...,γl−1 ) . Also, hg ∈ K gL, so the
G (γ ,...,γ )
minimality of g implies γl = min . Hence γl 1 l−1 contains at least s − 1
elements after γl .
9.1 Traditional Backtrack 205
Lemma 9.1.4 implies that when processing the successors of the vertex t :=
(γ1 , γ2 , . . . , γl−1 ), extensions by the last s −1 elements of each G (γ1 ,...,γl−1 ) -orbit
can be skipped.
for all g ∈ ϕ((γ1 , . . . , γl−1 )) and all h ∈ ϕ ((γ1 , . . . , γl−1 )), where ϕ is the
map from T to the power set of H , defined analogously to ϕ, namely,
ϕ ((γ1 , . . . , γl−1 )) := {h ∈ H | βih = γi , 1 ≤ i ≤ l − 1}. When the traversing of
T reaches Tm , it means that we have constructed a sequence t = (γ1 , . . . , γm )
with |ϕ(t)| = 1 and ϕ (t) = ∅. We have to check that the unique element of
ϕ(t) occurs in H as well.
α1 and
β are not identical then there is no x ∈ Aut(X ) with α1x = β. We build
the next level of the search tree by picking some α2 ∈ \{α1 } and computing a
refinement
(α1 ,α2 ) of
α1 , which must be fixed by Aut(X )(α1 ,α2 ) . Further vertices
on this level are the partitions
(β,γ ) for sequences (β, γ ) such that the list of cell
sizes does not exclude the existence of x ∈ Aut(X ) with α1x = β and α2x = γ .
208 Backtrack Methods
Continuing, we arrive at a sequence (α1 , α2 , . . . , αm ) of points in , such that the
base partition fixing the αi is discrete (i.e., contains only one-element cells),
and at other discrete partitions that at the appropriate positions contain the
possible images of the αi by elements of Aut(X ). Obviously, Sym() contains a
unique permutation that maps a given discrete ordered partition to another given
discrete ordered partition. Therefore, we constructed a base (α1 , α2 , . . . , αm )
for Aut(X ) and the permutations in Aut(X ).
These ideas were adopted for searches in permutation groups G in [Leon,
1991]. A formal description and some examples can be found in [Leon, 1997]
as well. For simplicity, we describe only the case when G(P) is a subgroup
and indicate the necessary modification for coset-type problems at the end of
the section. Let B = (β1 , . . . , βm ) be a base for G. We consider the search
tree T defined in Section 9.1 and use the sequences of base images as names
for the vertices of T as in that section. However, the vertices themselves are
ordered partitions of . The vertex with name t = (γ1 , . . . , γl ) is a partition
t , satisfying the following properties:
ki
[i] = [ j],
j=ki−1 +1
To prove (p5), suppose that = (γ1 ,...,γl−1 ) ∧ ({γl }, \{γl }) and g ∈ G(P).
9.3 Normalizers 211
g
R1 ()g = R1 (γ1 ,...,γl−1 ) ∧ ({γl }, \{γl })
g g
= (γ1 ,...,γl−1 ) ∧ ({γl }, \{γl }) ∧ G [l+1] 1
[l+1] g1 g
= (γ1 ,...,γl−1
) ∧ ({γl }, \{γl }) ∧ G = R1 (g ).
We finish this section by briefly indicating the necessary change in the par-
tition backtrack method when G(P) is a coset of a subgroup of G. We have
to define two refinement processes R and R , connected by a modification of
property (p5):
p(5) R()g = R (g ) for all vertices of the search tree and for all
g ∈ G(P).
9.3. Normalizers
Given G, H ≤ Sym(), computing the normalizer NG (H ) is considered to be
a harder problem than the ones discussed in the examples in Section 9.1.2. All
of those problems can be reduced to each other in polynomial time (cf. [Luks,
1993]), but no polynomial-time reduction of the normalizer computation to
centralizer computations is known. In fact, there is no known “easy” way to
compute even NSym() (H ); in contrast, we saw in Section 6.1.2 that CSym() (H )
is computable in polynomial time. Note that since NG (H ) = NSym() (H ) ∩ G,
a polynomial-time algorithm for computing NSym() (H ) would imply the
polynomial-time equivalence of normalizer computations with centralizer com-
putations.
If some g ∈ Sym() normalizes H then it must leave invariant certain struc-
tures associated with H . For example, Lemma 6.1.7 says that g must permute
the orbits of H . The first algorithm for computing normalizers, which appeared
in [Butler, 1983], was based on this observation. In this section we describe the
method of [Theißen, 1997], which uses that g must permute the orbital graphs
of H . Considering orbital graphs gives more efficient criteria for pruning the
search tree, yet these criteria are still easy enough to compute.
212 Backtrack Methods
Let H ≤ Sym(). Then H also acts on the ordered pairs (α, β) ∈ ×
by the natural componentwise action: (α, β)h := (α h , β h ), for all h ∈ H . The
orbital graphs of H are directed graphs with vertex set . The edge set of an
orbital graph of H is one of the orbits in the action ϕ : H → Sym( × ).
We denote the orbital graph with edge set (α, β) H by X (α, β). Note that H
is a subgroup of the automorphism group of each orbital graph X (α, β), since
obviously H leaves invariant the set (α, β) H of edges. Applying Lemma 6.1.7 to
ϕ(H ), we obtain that any g ∈ NSym( (H ) must permute the orbital graphs of H .
Let also G ≤ Sym() be given; our goal is to compute G(P) := NG (H ).
It is not necessary that H ≤ G, although the condition H ≤ G makes the
application of the criteria in Section 9.1.1 more efficient since we know a priori
a nontrivial subgroup of NG (H ).
The main tool of [Theißen, 1997] is a refinement process that can be applied to
ordered partitions satisfying the condition that there are α, β ∈ belonging
to the same H -orbit on and that α and β occur in one-element cells in .
The orbital graph X (α, β) defines an ordered partition := ([0], [1], . . . ,
[k], [∞]), where [i] consists of vertices of X (α, β) of (directed) distance
i from α for 0 ≤ i ≤ k, and [∞] contains the elements of that cannot be
reached by directed paths from α. If g ∈ G(P) maps to with α := α g
and β := β g then g must also map the refinement ∧ to ∧ , where
is the distance partition from α in the orbital graph X (α , β ). If ∧ and
∧ have different sequences of cell sizes then we can conclude that there
is no g ∈ G(P) mapping to .
Instead of , we could have used a refinement of as well, which can be
obtained by partitioning the cells [i] according to the number of neighbors
in different cells (we described such refinements in detail at the beginning of
Section 9.2, while sketching McKay’s graph isomorphism program nauty). Of
course, we have to consider whether the cost of computing this refinement is
justified by the additional pruning of the search tree. As a compromise, in the
GAP implementation the sets [i] are partitioned according to the valencies of
vertices, but further refinements are not computed.
We need an efficient method for computing . For 0 ≤ i ≤ k, let [i] ¯ be
the set of vertices of X (α, β) that can be reached by a walk of length i from
α. If [0], . . . , [i − 1] are already known then it is enough to compute [i], ¯
since [i] = [i]\¯ j<i [ j]. Also, since α and β are in the same H -orbit,
we can fix r ∈ H with αr = β.
(r Hα )i+1 ).
Lemma 9.3.1 implies that the cells [i] are unions of Hα -orbits. This pro-
vides the opportunity to compute working in a quotient graph of X (α, β),
which may have significantly fewer vertices than X (α, β). The vertices of
are the Hα -orbits on ; for γ ∈ , we denote by [γ ] the vertex γ Hα of . For
[γ ], [δ] ∈ V (), the ordered pair ([γ ], [δ]) is an edge of if and only if there
exists γ ∈ [γ ] and δ ∈ [δ] such that (γ , δ ) ∈ (α, β) H .
Lemma 9.3.2. Let γ ∈ be arbitrary. Then there exists a walk of length i from
α to γ in X (α, β) if and only if there is a walk of length i in from [α] to [γ ].
Lemma 9.4.1. The Markov chain M defined in the previous paragraph is irre-
ducible and aperiodic.
Proof. Given any two states g, h ∈ G, we can reach h from g in at most two
steps since pg,1 = 1/|C G (g)| > 0 and p1,h = 1/|G| > 0. Hence M is irreducible.
Also, p1,1 = 1/|G| > 0 and so M is aperiodic.
Exercises
9.1. Draw the search tree T for S4 , according to the base B = (1, 2, 3). What
vertices of T are visited during the computation of C S4 ((1, 2)), if we
use the restrictions in Section 9.1.1? What happens if we use instead the
restrictions in Example 1 of Section 9.1.2? And if we combine all restric-
tions?
9.2. Repeat the computation of C S4 ((1, 2)), using the partition method. Define
a refinement process using the restrictions in Example 1 of Section 9.1.2.
9.3. Prove the following properties of ordered partitions:
(i) ( ∧
) ∧ T = ∧ (
∧ T ).
(ii) ( ∧
)g = g ∧
g .
(iii) ≤
⇐⇒ g ≤
g .
9.4. [Jerrum, 1995] Let G ≤ Sym(). We define a Markov chain M with
states , and with the following transition rule: Given a state α ∈ ,
first compute a uniformly distributed random element g of G α , and then
compute a uniformly distributed random element β of the fixed point set
{γ ∈ | γ g = γ } of g. Then we define β as the next state in M. Let k
denote the number of orbits of G on .
Prove that M is irreducible and aperiodic and that the stationary distri-
bution vector u = (u[α] | α ∈ ) satisfies u[α] = (k|α G |)−1 for all α ∈ .
10
Large-Base Groups
The central theme of this book is the design and analysis of nearly linear-time
algorithms. Given a permutation group G = S ≤ Sym(), with || = n, such
algorithms run in O(|S|n logc |G|) time for some constant c. If, as happens
most of the time in practice, log |G| is bounded from above by a polyloga-
rithmic function of n then these algorithms run in time that is nearly linear as
a function of the input length |S|n. In most cases, we did not make an effort
to minimize the exponent c of log |G| in the running time since achieving the
smallest possible exponent c was not the most pressing problem from either
the theoretical or practical point of view. However, in families of groups where
log |G| or, equivalently, the minimal base size of G is large, the nearly linear-
time algorithms do not run as fast as their name indicates; in fact, for certain
tasks, they may not be the asymptotically fastest algorithms at all. Another
issue is the memory requirement of the algorithms, which again may be unnec-
essarily high. The purpose of this chapter is to describe algorithms that may be
used in the basic handling of large-base groups. The practical limitation is their
memory requirement, which is in most cases a quadratic function of n. The use
of (n 2 ) memory in storing an SGS for an arbitrary G ≤ Sym() seems to be
unavoidable.
218
10.1 Labeled Branchings 219
Let G ≤ Sym(), with || = n, be an arbitrary permutation group, and let
us fix an ordering ω1 ≺ · · · ≺ ωn of . Let G = G [1] ≥ G [2] ≥ · · · ≥ G [n−1] ≥
G [n] = 1 be the point stabilizer chain corresponding to this ordering; that is,
G [i] is the subgroup comprised of the elements of G that fix {ω1 , . . . , ωi−1 }
pointwise. A labeled branching B(, E) for G is a directed graph with vertex
−−−−→
set and edge set E, and with a function f : E → G. For (ωi , ω j ) ∈ E,
we
−−−−→
call the group element f ((ωi , ω j )) the label of this edge. A labeled branching
must satisfy the following properties:
We denote by f (B) the set of elements of G that are labels of some edge.
We shall use repeatedly the following simple characterization of branchings.
Lemma 10.1.1. Let B(, E) be a directed graph such that for each −−−−→
(ωi , ω j ) ∈
we have i < j. Then the underlying graph of B is a forest if and only if for
E,
each ωk ∈ there exists at most one edge with endpoint ωk .
Proof. If the underlying graph of B is a forest then it is clear that each vertex
is the endpoint of at most one edge. To prove the converse, suppose, on the
contrary, that the underlying graph contains a cycle and let ωk be the vertex
with the largest index k in this cycle. Then both edges of the cycle incident with
ωk are directed toward ωk , contradicting the assumption of the lemma.
Now we describe a data structure for storing a labeled branching B that en-
ables us to recover the labels of edges of the transitive closure T of B efficiently.
First, we store an array B of length n := ||, containing integers from [1, n].
We define B[k] := k if ωk is the root of a tree in B; otherwise, B[k] := j for
−−−−→
the integer j such that (ω j , ωk ) is the unique edge in B with endpoint ωk .
The most natural solution for storing the labels of edges of B is to define an
array P of permutations of length n such that the kth entry P[k] is the identity
permutation if ωk is the root of a tree in B, and otherwise P[k] is the label of
−−−−→
the edge (ω j , ωk ) for the parent ω j of ωk . However, if we need the label of an
−−−−→
edge (ωi , ωk ) of T then we have to construct the path from ωi to ωk in B, and
then we have to take the product of edge labels along this path. Although the
construction of the path can be done in O(n) time, starting at ωk and using the
array B to construct larger and larger finishing segments of the path, taking
the product of edge labels may require (n 2 ) time.
We can do better if we also define an array Q of n permutations as follows:
Let Q[k] := () if ωk is the root of a tree in B, and otherwise let Q[k] be the
product of edge labels along the path from ωt to ωk , where ωt is the root of the
−−−−→
tree containing ωk . Then the label of an edge (ωi , ωk ) of T can be computed in
O(n) time, as the product Q[i]−1 Q[k].
−−−−→
Given i, j ∈ [1, n], we have two methods to decide whether (ωi , ω j ) is an
edge of T . The first one is to use the array B and to check whether the unique
path from ω j to the root of the tree containing ω j goes through ωi . The second
−−−−→
method computes Q[i]−1 Q[ j]. It is clear that (ωi , ω j ) is an edge of T if and
only if Q[i]−1 Q[ j] fixes {ω1 , . . . , ωi−1 } pointwise. The second method also
−−−−→
computes the label of the edge (ωi , ω j ), in the case when this ordered pair is
indeed an edge of T .
If the labeled branching B represents G then the array Q can be used to
test membership in G, in O(n 2 ) time. Given g ∈ Sym(), we try to factor g
as a product g = rn−1 · · · r1 of coset representatives, as in the original sifting
procedure of Sims (cf. Section 4.1). If r1 , . . . , ri−1 and gi := gr1−1 · · · ri−1
−1
gi
are already computed for some i ∈ [1, n − 1] then we compute ω j := ωi and
Q[i]−1 Q[ j]. If Q[i]−1 Q[ j] fixes {ω1 , . . . , ωi−1 } pointwise then we define ri :=
Q[i]−1 Q[ j] and gi+1 := gi ri−1 ; otherwise we conclude that g ∈ G.
222 Large-Base Groups
10.1.1. Construction
In Theorem 10.1.2, we have shown that each permutation group can be rep-
resented by a labeled branching. However, the proof of that theorem did not
provide an efficient algorithm for the construction of the labeled branching.
Step 1: Compute a generating set Si for G [i] and a transversal Ti for G [i]
mod G [i+1] .
Step 2: Modify Bi into a labeled branching Bi such that Bi represents Ti ,
maintaining the property that Bi represents transversals G [ j] mod G [ j+1] ,
for all j ∈ [1, i − 1].
Step 3: For each Schreier generator g constructed from Si and Ti
(cf. Lemma 4.2.1), modify Bi into a labeled branching Bi such that
f (Bi ) ∩ G [i+1] = f (Bi ) ∩ G [i+1] , g, maintaining the property (c1)
for all j ∈ [1, i]. Rename Bi as Bi .
At the end of Step 3, Bi+1 can be defined as the current version of Bi .
10.1 Labeled Branchings 223
To start the recursion, we define the trivial branching B0 by the arrays B, P,
and Q with B[ j] := j, P[ j] := (), and Q[ j] := () for all j ∈ [1, n], and we
perform Step 3 using the elements g of the input generating set S.
Now we describe the three steps of the construction of Bi+1 in more detail.
In Steps 1 and 2, we work only with the arrays B and P.
In Step 1, we obtain Si as Si := {P[k] | i ≤ B[k] < k}. Then we compute
S
the orbit ωi i . The elements of the transversal Ti can be constructed during
the orbit computation, as described at the end of Section 2.1.1. The time and
memory requirements of Step 1 are O(n 2 ).
In Step 2, we modify the arrays P and B. We process each g ∈ Ti . Let
g
ωk := ωi . If k = i then we do not do anything; otherwise, we define P[k] := g
and B[k] := i.
We have to show that this definition of P[k] and B[k] does not destroy
the property that the labeled branching Bi , defined by B and P, represents
transversals G [ j] mod G [ j+1] , for all j ∈ [1, i − 1]. Let j ≤ i − 1 be fixed,
[ j]
and let ωl be an arbitrary element of the fundamental orbit ω Gj . If the unique
path from ω j to ωl in Bi (before the modification of Bi ) does not go through
ωk then this path remains intact after the change of P[k] and B[k]; therefore, it
is enough to consider the case that the path from ω j to ωl contains ωk . In this
g
case, in particular, ωk ∈ ωGj and so there exists g j ∈ G [ j] with ω j j = ωk . Then
[ j]
−1
g g
ωj j = ωi and g j g −1 ∈ G [ j] , so there is a path ω j = ωi1 , . . . , ωim = ωi from
ω j to ωi in Bi (this path exists before and after the change of P[k] and B[k]).
Hence ω j = ωi1 , . . . , ωim , ωk is a path in Bi after the change of P[k] and B[k],
and so there is a path from ω j to ωl .
The time requirement of Step 2 is O(n 2 ), and it is clear that after Step 2 the
labeled branching represents the transversal Ti . After Step 2, we reconstruct the
array Q. For k = 1, 2, . . . , n, we define Q[k] := () if B[k] = k and Q[k] :=
Q[B[k]] · P[k] if B[k] < k. The construction of Q requires O(n 2 ) time. In
Step 3, we shall use all three arrays B, P, and Q.
Before we describe Step 3, we introduce a quantity associated with labeled
branchings. Let B be a labeled branching, stored in the arrays B, P, and Q as
in the foregoing. The length l(B) of B is defined as
l(B) = (k − B[k]) + k.
{k|B[k]<k} {k|B[k]=k}
(Recall that we notice that there is no such path by the fact that Q[l]−1 Q[k] does
not fix {ω1 , . . . , ωl−1 } pointwise.) In some gl satisfying (10.1) is constructed
then sifting enters a second phase that tries to decrease the value of k. Namely,
−−−−−→
if there is an edge (ωm , ωk ) in Bi with m > l then gl is replaced by gl · P[k]−1 ,
and we redefine k := m. This new gl also satisfies (10.1). The second phase
terminates when B[k] = k or B[k] < l.
−−−−→
If B[k] = k then we add the edge (ωl , ωk ) to Bi with label gl , and terminate
sifting with an output in case (ii). Adding the edge to Bi is done by defining
B[k] := l and P[k] := gl .
−−−−−→ −−−−→
If m := B[k] < l then we delete the edge (ωm , ωk ) and add the edge (ωl , ωk )
with label gl . Again, this modification is done by defining B[k] := l and
10.2 Alternating and Symmetric Groups 225
P[k] := gl . The same argument as in Step 2 shows that the new branching
Bi represents transversals G [ j] mod G [ j+1] for all j ∈ [1, i]. Moreover, if m <
i + 1 then f (Bi ) ∩ G [i+1] = f (Bi ) ∩ G [i+1] , g and we are in case (ii). If
−−−−−→
i + 1 < m < l then we also output the label of the deleted edge (ωm , ωk ) as h,
and we are in case (iii).
If sifting terminates with output in case (ii) or (iii) then after termination we
recompute the array Q in the same way as was done after Step 2.
Finally, we estimate the time requirement of Step 3. One call of sifting and
the possible recomputation of Q takes O(n 2 ) time. For a fixed i ≥ 1, the number
of calls is less than 2n 2 , since we sift less than n 2 Schreier generators and less
than n 2 permutations h obtained as an output in case (iii) of some previous call
of sifting. The number of outputs in case (iii) is less than n 2 since at each output
in case (ii) or (iii) the length of the labeled branching decreases, and the original
length after Step 2 is at most n(n − 1)/2. Similarly, for i = 0, the number of
calls of sifting is less than |S| + n 2 . Hence the total time requirement of Step 3,
added up for all i, is O(n 5 + |S|n 2 ). The memory requirement for the storing
of Si , Ti , B, P, and Q is O(n 2 ).
Lemma 10.2.3. (a) Let q be an integer in the range n/2 < q ≤ n. The
proportion of elements of Sn containing a cycle of length q is 1/q. In An , this
proportion is 1/q if q ≤ n − 2, and it is 0 or 2/q if q ∈ {n − 1, n}.
(b) The proportion of elements of the giants that contain a cycle of length p
for some prime p in the range n/2 < p < n − 2 is asymptotically ln 2/ ln n.
Proof. (a) There are ( qn )(q − 1)!(n −q)! elements of Sym([1, n]) with a cycle of
length q, since there are ( qn ) possibilities for the support of this cycle, and on
a fixed support, there are (q − 1)! cycles. The term (n − q)! counts the number
of possibilities for the restrictions of the permutations on [1, n]\. We counted
every permutation containing a cycle of length q exactly once, since q > n/2
implies that a permutation cannot contain two cycles of length q. Dividing by n!,
we obtain the required proportion 1/q. If q ≤ n − 2 then a similar computation
shows that there are ( qn )(q − 1)!(n − q)!/2 elements of Alt([1, n]) with a cycle
of length q, and dividing by n!/2 gives the proportion 1/q. If q ∈ {n − 1, n}
then either none or all of the ( qn )(q − 1)!(n − q)! permutations of Sym([1, n])
containing a cycle of length q are in Alt([1, n]).
(b) No element of Sym([1, n]) can have cycles of lengths p1 and p2 for
two different primes greater than n/2. Therefore, the proportion of elements
of the giants that contain a cycle of length p for some prime p in the range
n/2 < p < n − 2 is the sum of the proportions for each p. The sum of the
reciprocals of primes less than x is asymptotically ln ln x (see [Hardy and
Wright, 1979, Theorem 427]); hence
1 n ln(n − 2) ln 2 ln 2
∼ ln ln(n−2)−ln ln = ln ∼ ln 1 + ∼ .
n/2< p<n−2
p 2 ln n − ln 2 ln n ln n
in the case G ∼
= Sn ,
in the case G ∼
= An with n odd, and
in the case G ∼
= An with n even. Moreover, there are deterministic algorithms
for the following:
(i) Given g ∈ G, find λ(g) and a straight-line program of length O(n log n)
from S ∗ to g.
(ii) Given h ∈ Sym([1, n]), decide whether or not h ∈ λ(G); and, if it is, find
λ−1 (h) and a straight-line program of length O(n log n) from λ(S ∗ ) to h.
We note that the presentation (10.2) is from the book [Coxeter and Moser,
1957], and the presentations (10.3) and (10.4) are from [Carmichael, 1923].
The permutations s = (1, 2, . . . , n) and t = (1, 2) satisfy (10.2), whereas s =
(3, 4, . . . , n) and t = (1, 2, 3) satisfy (10.3), and s = (1, 2)(3, 4, . . . , n) and
t = (1, 2, 3) satisfy (10.4).
228 Large-Base Groups
The rest of this section is devoted to the description of the constructive recog-
nition of alternating and symmetric groups from [Beals et al., 2002]. The mate-
rial in Sections 10.2.1–10.2.3 is reproduced from this paper, c 2002 American
Mathematical Society. Reprinted by Permission. Although the algorithm works
in the black-box group setting, so it can be used to construct an isomorphism
between G and a standard copy of An or Sn for any permutation group or matrix
group representation G of these groups, there is a much simpler version if G is
a giant (i.e., G is a permutation group acting on a domain of size n). We shall
comment on this simpler version in Section 10.2.4.
Let ξ be an upper bound on the time required per element to construct inde-
pendent, nearly uniformly distributed random elements of G and let µ be an
upper bound on the time required for each group operation in G.
We shall prove Theorem 10.2.4 in Sections 10.2.2 and 10.2.3 (cf. the summary
at the end of Section 10.2.3).
Lemma 10.2.5. For all n, the number of divisors of n is at most 48 n 1/3 /25201/3 ,
and the sum of the divisors of n is at most n(3 + ln n)/2.
Proof. For each δ > 0, the number of divisors of n is at most Cδ n δ for a constant
Cδ . An algorithm for computing the value of Cδ is described in [Niven et al.,
1991, pp. 395–396]; C1/3 = 48/2520 1/3
≈ 3.527.
√n −1
We use the trivial estimate i=1 i < n/2 for the sum of the divisors less
√ √
than n. The sum
of divisors greater than or equal to n satisfies
√n
n n n 1 ln n
≤ n + + + ··· + √ < n 1 + dt = n 1 + .
2 3 & n' 1 t 2
Adding these two estimates, we obtain the second assertion of the lemma.
Lemma 10.2.6. Let x = np, where p is prime, with n > p 2 and all prime
divisors of n greater than p. Let D denote the set of divisors of x that are not
greater than n. Then |D| ≤ 96 n 1/3 /25201/3 < 8n 1/3 and
p + 3 − 2( p + 1) ln p + ( p + 1) ln n
d≤n .
d∈D
2
Proof. Because p n, the number of divisors of x is twice the number of divisors
of n. Thus Lemma 10.2.5 yields |D| ≤ 96 n 1/3 /25201/3 .
Also, because of the restriction on the prime factorization of n, all divisors
of x except np are at most n, and d∈D d = ( p + 1) d|n d − pn. Hence it
is enough to estimate the sum of the divisors of n. We use a refinement of the
√
argument in Lemma 10.2.5. For the sum of divisors less than n, we still use
the trivial estimate n/2. However, for the larger divisors, we note that since the
largest proper divisor of n is at most n/( p + 1), the sum
of divisors greater
√
or equal than n satisfies
n n n
≤n+ + + ··· + √
p+1 p+2 & n'
√n
1 ln n
< n 1+ dt = n 1 + − ln p .
p t 2
Lemma 10.2.7. Let x = ny, and let D denote the set of divisors of x that are
not greater than n. Let k > 1. Then
y
d ≤n 1+
k k
.
d∈D
k−1
Proof. All divisors d ∈ D are of the form d = x/s for some s ≥ y. Therefore,
x k k ∞
x x 1
d ≤
k
< +x k
dt
d∈D s=y s y y tk
k k
x 1 x y
= + x · k−1
k
= 1+ .
y y (k − 1) y k−1
Lemma 10.2.8. Let x = m! (n − m). If n > m + m! m then the largest divisor
of x not exceeding n is n − m.
The proof of the next lemma is quite simple and, since it is similar to the
proof of Lemma 10.2.3(a), it is omitted.
Lemma 10.2.9. For n ≥ 10, each column of the following table lists a cycle
type in the first row and the proportion of elements of that cycle type in Sn in
the second row.
cycle type n1 11 (n − 1)1 31 (n − 3)1
proportion n −1 (n − 1)−1 1
3
(n − 3)−1
cycle type 11 31 (n − 4)1 12 31 (n − 5)1 13 31 (n − 6)1
proportion 1
3
(n − 4)−1 1
6
(n − 5)−1 1
18
(n − 6)−1
In An , the proportions are 0 or twice the proportions in Sn , depending on
whether the permutations with the desired cycle type are odd or even.
Theorem 10.2.10. Let m be a fixed nonnegative integer. For any ε > 0 there
exists a bound n(ε) such that if h is a uniformly distributed random permutation
from Sn for some n > n(ε) then the probability that h m! (n−m) = 1 is less than
(1 + ε)/n.
ways, and then the cycle Ci can be chosen in (di − 1)! ways. Finally, after
C1 , . . . , Cl are defined, the rest of the permutation can be chosen in at most
(n − li=1 di )! ways. Multiplying these numbers, we obtain
l
(n − k)! − 1)!
l
i=1 (di |Pi |−1
N (k, , d1 , . . . , dl ) ≤ l ≤ (n − k)! di . (10.5)
i=1 (di − |Pi |)! i=1
232 Large-Base Groups
For a fixed partition , let N (k, ) denote the sum of all N (k, , d1 , . . . , dl )
obtained above for all choices d1 , . . . , dl of divisors of m! (n − m). If l = 1
then we obtain N (k, ) ≤ (n − k)! d∈D d k−1 , which, by Lemmas 10.2.8 and
10.2.7, is at most
m!
(n − k)!(n − m)k−1 1 + . (10.6)
k−2
For l ≥ 2, we use the estimates that, by Lemma 10.2.5, a sequence d1 , . . . , dl
from D can be chosen in at most (8(m! (n −m))1/3 )l ways and for each sequence
|P |−1
(|Pi |−1)
trivially li=1 di i ≤n i = n k−l . Hence, since n ≥ 8 + 8m! m,
l
N (k, ) ≤ (n − k)! 8(m! (n − m))1/3 n k−l
2
≤ (n − k)! 8(m!(n − m))1/3 n k−2 < 64m! (n − k)! n k−4/3 .
We estimate (n − k)! by
k
n! n n n n! k n! k
(n − k)! = · · · < 1 + < k e n−k
nk n − 1 n − 2 n−k+1 nk n−k n
and the number of partitions of {1, 2, . . . , k} by k k (note that each partition can
be obtained as the sets where some function f : {1, 2, . . . , k} → {1, 2, . . . , k}
takes constant values). Combining these estimates with the observation that
each element of Tn (m! (n − m)) is counted exactly once in N (k, ), we
obtain that
m! k 1 k 1
Nn (m! (n − m)) ≤ n! 1 + e n−k + 64m!k k e n−k 4/3 . (10.7)
k−2 n n
Given ε > 0, we choose k such that
m! k ε
1+ e k 2 −k < 1 + .
k−2 2
Proof. (a) This is immediate from Lemma 10.2.3(a) and Theorem 10.2.10.
(b) Tn ((n − m)s) ⊆ Tn (m! (n − m)), so it has at most (1 + ε)n!/n elements.
Out of these, there are n!/(s(n − m)(m − s)!) > n!/(m! n) with cycle structure
1m−s s 1 (n −m)1 , so the proportion of elements in Tn ((n −m)s) with the required
cycle structure is greater than (1 − ε)/m!.
Proof. Using the notation of the proof of Theorem 10.2.10, we derive a tighter
upper bound for Nn (x) by evaluating (10.5) more carefully in the case k = 3.
First, we suppose that n > 50.
There is one partition 3 of {1, 2, 3} with three parts, and (10.5) gives N (3,
3 , d1 , d2 , d3 ) ≤ (n − 3)! d10 d20 d30 = (n − 3)!. By Lemmas 10.2.5 and 10.2.6,
963 963
N (3, 3 ) ≤ (n − m) (n − 3)! ≤ n (n − 3)!.
2520 2520
There are three partitions 2 of {1, 2, 3} with two parts, and (10.5) gives
N (3, 2 , d1 , d2 ) ≤ (n − 3)! d1 d20 . Hence, using both statements of Lemmas
10.2.5 and 10.2.6,
96 6 − 8 ln 3 + 4 ln n
N (3, 2 ) ≤ (n − 3)! n 4/3 .
25201/3 2
In this estimate, we used Lemma 10.2.6 with p = 3, since for n > 50 this value
gives the largest upper bound.
234 Large-Base Groups
k − 1
r x (k) = (d − 1)! rn (k − d),
d|x
d −1
Although Theorem 10.2.12 gives a positive constant lower estimate for the
conditional probability that can be used in the design of an algorithm with
asymptotic running time described in Theorem 10.2.4, the constant 1/180 is
too small for an efficient implementation. In fact, we cannot expect a good
constant from an argument covering all cases together, since Nn (n) ≥ (n − 1)!
and the number of permutations with cycle type 13 31 (n − 6)1 is only about
(n − 1)!/18. Therefore, in the range n ≤ 300, we have computed the value
of Nn (x) explicitly, and this can be used for obtaining better estimates for the
number of iterations in an implementation. We summarize these computations
in the next lemma. Without doubt, the constant 1/7 bounds the conditional
probabilities for larger values of n as well, for all required cycle types. We also
note that [Warlimont, 1978] gives an asymptotic estimate with a very precise
error term for Nn (n).
10.2 Alternating and Symmetric Groups 235
Lemma 10.2.13. Let 8 ≤ n ≤ 300.
(a) If n ∈ {8, 12, 24} then the conditional probability is greater than 1/2 for
the event that an element of Sn of order dividing n is an n-cycle. For all n,
the conditional probability is greater than 1/4.
(b) If n is even then the conditional probability is greater than 1/2 for the event
that an element of An of order dividing n − 1 has cycle type 11 (n − 1)1 .
(c) If n ∈ {31, 61, 91, 121, 151, 181, 211, 241, 271} then the conditional prob-
ability is greater than 1/3 for the event that an element of An of order di-
viding 3(n − m) for the m values described in Lemma 10.2.9 has cycle type
1m−3 31 (n − m)1 . For all n, the conditional probability is greater than 1/7.
n mod 6 0 1 2 3 4 5
(10.8)
m 5 6 3 4 3 4
Lemma 10.2.14. Let n be odd and n ≥ 7. Suppose that a ∈ G has cycle type
n 1 and b ∈ G has cycle type 1n−3 31 . Then in O(ξ n + µn) time, it is possible to
236 Large-Base Groups
construct s, t ∈ G such that s, t satisfies the presentation (10.3) for An . This
algorithm is of Las Vegas type and succeeds with probability at least 3/4.
The rest of the algorithm is deterministic and runs in O(µ) time. Without
loss of generality, we can suppose that supp(λ(c)) = {1, 2, k} for some k with
3 ≤ k ≤ n − 1. The next goal is to construct t ∈ G such that λ(t) = (1, 2, 3).
If k = 3 then cca is an involution whereas if 4 ≤ k ≤ n − 1 then cca has
order 5. Hence these two cases can be distinguished in O(µ) time.
Suppose first that k = 3. We can distinguish the cases λ(c) = (1, 2, 3) and
2
λ(c) = (1, 3, 2) by computing x := ca and y := c x . In the first case λ(y) =
(1, 2, 4) and in the second case λ(y) = (1, 5, 2), which can be distinguished by
2
checking whether [y, y a ] = 1. After that, t is defined as t := c or t := c2 ,
respectively.
Suppose next that 4 ≤ k ≤ n−1. The case k ∈ {4, n−1} can be distinguished
2
from the case 5 ≤ k ≤ n −2 by checking whether [c, ca ] = 1. If 5 ≤ k ≤ n −2
then λ(c) = (1, 2, k) can be distinguished from λ(c) = (1, k, 2) by computing
x := ca and y := c x . In the first case λ(y) = (1, 3, k) and in the second
case λ(y) = (1, k, k + 1), which can be distinguished by checking whether
[y, y a ] = 1. If it turns out that λ(c) = (1, 2, k) then define t := [c2 , x]. If
λ(c) = (1, k, 2) then define t := [c, x 2 ].
10.2 Alternating and Symmetric Groups 237
If k ∈ {4, n − 1} then the cases λ(c) = (1, 2, 4), λ(c) = (1, 4, 2), λ(c) =
(1, 2, n − 1), and λ(c) = (2, 1, n − 1) are also distinguished by computing x :=
ca and y := c x . In the four cases, λ(y) = (1, 3, 4), λ(y) = (1, 4, 5), λ(y) = (n−
1, 1, 3), and λ(y) = (n −1, n, 1), respectively. The third of these is distinguished
from the others as the only one with [y, y a ] = 1. The second one is distinguished
2
among the remaining three as the only one with [y, y a ] = 1. Finally, the first
and fourth are distinguished by the order of yy a . If λ(c) = (1, 2, 4) or λ(c) =
(1, 2, n − 1) then define t := [c2 , x]. If λ(c) = (1, 4, 2) or λ(c) = (2, 1, n − 1)
then define t := [c, x 2 ].
Finally, output s := at 2 , which satisfies λ(s) = (3, 4, . . . , n), and t.
Lemma 10.2.15. Let n be even and n ≥ 10. Suppose that a ∈ G has cycle type
11 (n − 1)1 and b ∈ G has cycle type 1n−3 31 . Then, in O(ξ n + µn) time, it is
possible to construct s, t ∈ G such that s, t satisfies the presentation (10.4)
for An . This algorithm is of Las Vegas type and succeeds with probability at
least 3/4.
Proof. The case when n is even, as well as the evaluation of the relators s n−2 , t 3 ,
and (st)n in the odd case, is clear. In the case when n is odd, we evaluate the
relators (ts −k ts k )2 for k = 1, . . . , &(n − 3)/2' in &(n − 3)/2' rounds: In the
kth round, we use the input s k−1 and we output s k and (ts −k ts k )2 . One round
requires only a constant number of group operations.
Remark 10.2.18. We note that, in view of Lemma 10.2.13, for “practical” val-
ues of n the number (4/3)3 1802 ln(ε −1 ) can be replaced by (4/3)3 72 ln(ε −1 )
in the expression for the number of iterations in the proof of Theorem 10.2.17.
This reduction does not really matter if the input group is indeed isomorphic
to An or Sn , since the expected number of iterations depends on the true con-
ditional probabilities that an element of order dividing (n − m)s has cycle type
1n−m−s s 1 (n − m)1 for the appropriate values of s and m, and it does not depend
on how badly or well we estimate these conditional probabilities. However, if
the algorithm is used as a Monte Carlo algorithm to test whether an unknown
input group G is isomorphic to An or Sn then the better constants ensure earlier
termination in the case of a negative answer.
10.2 Alternating and Symmetric Groups 239
10.2.3. Constructive Recognition: The Homomorphism λ
Given a black-box group G = S isomorphic to An or Sn , the algorithm
described in the proof of Theorem 10.2.17 constructs s, t ∈ G such that s, t ∼ =
An and it satisfies the presentation in (10.3) or (10.4). In this section we construct
a homomorphism λ : G → Sym([1, n]) by specifying the images of s and t and
by giving a procedure that constructs the image of any z ∈ G. The algorithm
will detect if G ∼ = Sn , and in this case it replaces s and t by two new elements
s1 , t1 such that s1 , t1 satisfies (10.2). We shall also describe a procedure that,
given an arbitrary element z ∈ G, computes a straight-line program reaching z
from s and t (or from s1 and t1 in the case G ∼ = Sn ).
Recall that for n > 6 we have Aut(An ) ∼ = Sn , and so, for any g ∈ G, the cy-
cle structure of λ(g) is the same for any faithful homomorphism λ : G →
Sym([1, n]). Therefore, without loss of generality, we can assume that λ(s) =
(3, 4, . . . , n) or λ(s) = (1, 2)(3, 4, . . . , n), depending on the parity of n, and
λ(t) = (1, 2, 3).
We start with the easier inverse problem: Finding straight-line programs
reaching any p ∈ An from g := λ(s) and h := λ(t).
gi h i+1 if n − i is even
gi+1 = −1 (10.9)
gi h i2 h i+1 if n − i is odd.
form (i, j)(n −1, n) or (n −1, n)(i, j) and it is enough to show that any such per-
mutation can be obtained by a straight-line program of length O(log n) from T .
For any k ∈ [i + 2, n], we have gi−(k−i−2) h i gik−i−2 = (i, i + 1, k) or
gi−(k−i−2) h i gik−i−2 = (i +1, i, k), and these 3-cycles can be reached by straight-
line programs of length O(log n) from T . Hence it is enough to observe that
The main idea of the construction of λ(z) for an arbitrary z ∈ G is the fol-
lowing. We need to define an n-element set on which G acts, and then we
need to identify with {1, 2, . . . , n}. We define as a set of unordered pairs
{a, b} ⊆ G such that both a and b have cycle type 1n−3 31 , satisfying the require-
ment that the supports of λ(a) and λ(b) intersect in exactly one point i ∈ [1, n].
In this way, {a, b} can be identified with i, and can be identified with [1, n].
Representing in this way creates two problems. First, we do not want to
store 2n elements of G, so the set is not computed explicitly. Elements of ,
when needed, will be reached by straight-line programs from s and t. Second,
for an arbitrary z ∈ G the image {a z , b z } of a point {a, b} ∈ is not necessarily
10.2 Alternating and Symmetric Groups 241
an element of . Thus we need to be able to identify the intersection of the
supports of λ(a z ) and λ(b z ) with supp(λ(c))∩supp(λ(d)) for some {c, d} ∈ in a
way different from simply checking whether {a z , b z } = {c, d}. We shall narrow
down the possibilities for supp(λ(a z )) ∩ supp(λ(b z )) to a constant number by
taking commutators of a z and b z with certain elements x1 , . . . , xm of G of
order five, utilizing the fact that an element of order five and a three-cycle in
Sn commute if and only if their supports are disjoint.
Now we describe the construction of the elements x1 , . . . , xm . Let n = 5k +r
where 0 ≤ r ≤ 4, let m := log2 (k + 1) , and put m := 2m . For 1 ≤ j ≤ m ,
we define partitions p j = {P j,0 , P j,1 } of the set {1, . . . , k} into two parts such
that the common refinement of these partitions is the trivial one. Namely, for
each i ∈ [1, k], we compute the binary expansion b1(i) b2(i) . . . bm(i) of i. For i ∈
[1, k] and j ∈ [1, m], let
b(i)
j if j ≤ m
ε( j, i) = (10.10)
1 − b(i)
j−m if j ≥ m + 1.
If this intersection is nonempty, then it is the support of λ(ai ) for the unique i
whose binary expansion contains 1s exactly in the positions given in J .
Proof. (a) By Lemma 10.2.22, the images λ(z) of all input generators z ∈ S
can be computed in O(µ|S|n log n) time. All λ(z) are even permutations if and
only if G ∼= An .
(b) Suppose that we have found z 0 ∈ S such that q := λ(z 0 ) is an odd
permutation. By Corollary 10.2.20, z 1 := λ−1 (q · (1, 2)) can be computed in
O(µn log n) time, and then t1 := z 0−1 z 1 satisfies λ(t1 ) = (1, 2). Depending on
the parity of n, we compute z 2 := s or z 2 := t1 s such that λ(z 2 ) = (3, 4, . . . , n).
Finally, we compute s1 := z 2 t, which satisfies λ(s1 ) = (1, 2, . . . , n).
(c) Depending on the parity of p, we compute a straight-line program reaching
p or p · (1, 2) from λ(s) and λ(t), as described in Lemma 10.2.19. Then, it is
enough to observe that λ(s) and λ(t) can be reached from λ(s1 ) and λ(t1 ) by a
straight-line program of constant length, since t = [s1 , t1 ] and s = s1 t 2 if n is
odd, and s = t1 s1 t 2 if n is even.
The evaluation of this straight-line program in G is done as described in
Corollary 10.2.20.
Proof. Using the algorithm described in the proof of Lemma 10.2.22, we com-
pute λ(z). By Lemmas 10.2.19 and 10.2.23(c), we can write a straight-line
program reaching λ(z) from λ(s), λ(t) or λ(s1 ), λ(t1 ), respectively. The same
straight-line program reaches z from s, t or s1 , t1 .
244 Large-Base Groups
acts transitively on the 6-tuples then S1 | = Alt(). The time requirement of
this initialization step is O(d|S|n log k).
Suppose that Si , gi , and h i are already constructed for some i ∈ [1, k − 6].
We compute and store a transversal Ti for Si mod Si δi , and we choose
g
transversal elements a, b such that gi+1 := agi and h i+1 := bh i satisfy δi i+1 = δis
h i+1
and δi = δi . We construct the elements of the set Si+1 as random subproducts,
t
made from the Schreier generators defined by Si and Ti (cf. Lemma 4.2.1).
Again, we construct c log k random subproducts. If Si |{δi ,...,δk } = Alt({δi , . . . ,
δk }) then the same 6-transitivity argument that we used in the case of S1 shows
that, with probability at least 1 − k d+1 , Si+1 |{δi+1 ,...,δk } = Alt({δi+1 , . . . , δk }).
If i ∈ [k − 5, k − 1] then we compute Ti , gi+1 , and h i+1 as in the previous
paragraph and let Si+1 consist of all Schreier generators defined by Si and Ti .
For a fixed i, the construction of Ti can be done by an orbit algorithm (cf.
Section 2.1.1) in O(dnk log k) time. To stay within the memory usage asserted
in the statement of the theorem, we do not compute and store the Schreier gener-
ators explicitly. Whenever a particular Schreier generator is needed in a random
subproduct, we compute it from Si and Ti . Even with this restriction, the con-
struction of one random subproduct can be performed in O(dnk log k) time, and
the construction of Si+1 is accomplished in O(dnk log2 k) time. Therefore, per-
forming all k steps of the recursion can be done in O(d(nk 2 log2 k + |S|n log k))
time and the probability of failure is at most k/k d+1 = 1/k d .
get mark 1; if the action on this orbit is not primitive, a system of blocks of
imprimitivity is added with mark 2; etc. Higher mark means higher priority.
The main routine is a recursive procedure SGS[H, ], where H is a group
acting on the set . In the previous notation, H is G i−1 , and is the corre-
sponding domain i−1 with the fixed points deleted. The sets Ui are collected
in an array U . The purpose of the following pseudocode is to describe how to
choose the next point to be fixed and how to detect that the group acts as a giant
on an orbit. The subroutines that compute generators for the next group in the
subgroup chain will be described later.
Initially, we set the global variable n to ||, U to the empty list, and
transitivity to 0. We set the mark m(α) = 0 for all α ∈ . Then
SGS[G, ] is called. All internal calculations on permutations are carried out
on all n points and possibly additional points (such as points corresponding to
the action on blocks).
SGS[H,]
Input: generators for H ≤ Sym(); global variables n, transitivity,
U
Output: U
:= \{α ∈ | α H = {α}}
if = ∅ then stop
mark := max{m(α) | α ∈ }
O := {α ∈ | m(α) = mark}
Case:
H O is intransitive:
transitivity := 0
let O1 := smallest orbit of H O
for α ∈ O1 , m(α) := mark + 1
call SGS[H, ]
H O is imprimitive:
compute a minimal block system B on O
:= B ∪
for α ∈ B, m(α) := mark + 1
transitivity := 0
call SGS[H, ]
H O is primitive:
transitivity := transitivity + 1
if transitivity ≥ 6 then
(* H O is Sym(O) or Alt(O) *)
compute s, t ∈ H with s|O , t|O satisfying one of
10.3 A Randomized Strong Generator Construction 249
(10.2)–(10.4)
add {s, t} to U
H := pointwise stabilizer of O in H
call SGS[H, ]
else
α := first element of O
add a transversal H mod Hα to U
compute generators for Hα
call SGS[Hα , ]
Next, we describe the auxiliary routines that compute generators for the
groups G i and estimate the time requirement of these routines. First, we estab-
lish that the permutation domains i cannot grow too large.
Lemma 10.3.2. Suppose that G i = Si is already computed, and suppose that
G i+1 = (G i )αi for some point αi ∈ i+1 . Let k := |αiG i |. Then there is a Monte
Carlo algorithm that computes a transversal G i mod G i+1 and a generating set
Si+1 of size O(n) for G i+1 in O(nk|Si | log n) time. The memory requirement is
O(n min{n, k|Si |}).
and Ui . Since the length of any subgroup chain of G i is less than 3n/2, the
number of Schreier generators is k|Si |, and one permutation multiplication can
be performed in O(n) time, Theorem 2.3.6 implies that the construction of Si+1
takes O(kn|Si | log n) time.
Lemma 10.3.3. Suppose that G i = Si is already computed, and suppose that
G i+1 is the pointwise stabilizer of an orbit i ⊆ i on which G i acts as a giant.
Let k := |i |. Then there is a Las Vegas algorithm that computes s, t ∈ G i such
that s|i and t|i satisfy the appropriate one of the presentations (10.2)–(10.4).
The time requirement is O(n 2 log4 n + nk 2 log2 n + |Si |n log n) and the memory
requirement is O(n 2 log2 n + |Si |n).
Proof. If |Si | > 15n then, using the algorithm described in the proof of Theo-
rem 2.3.6, we construct a generating set of size O(n) for G i in O(|Si |n log n)
time. Hence, from now on, we suppose that |Si | ∈ O(n).
First, we compute s, t ∈ G i such that s|i and t|i satisfy the appropriate
one of the presentations (10.2)–(10.4). By Lemma 10.3.3, this can be done in
O(n 2 log4 n + k 2 n log2 n) time.
Our next goal is to construct S̃i and E(s, t). For each g ∈ Si , we com-
pute slp(g) and ḡ. By Corollary 10.2.20 and Lemma 10.2.23(c), the time re-
quirement for that is O(kn log n). Hence, S̃i can be obtained in O(kn 2 log n)
time. The set E(s, t) can be constructed in O(kn) time (cf. the proof of
Lemma 10.2.16).
Finally, we compute the normal closure of S̃ i ∪ E(s, t) in G i . By
Corollary 2.4.6, the time requirement of this step is O(n 2 log4 n).
For the overall running time estimate of the algorithm SGS[G, ], we need
a lemma that may be interesting on its own.
Theorem 10.3.7. Given G = S ≤ Sym(), with || = n, the running time
of the algorithm SGS[G, ] is O(n 3 log4 n + |S|n log n).
Proof. By Theorem 2.3.6, in O(|S|n log n) time we can construct O(n) gen-
erators for G. Hence, because all subroutines construct O(n) generators for
subgroups, we may suppose that, at each recursive call, the input of SGS[H, ]
contains O(n) generators.
During the run of SGS[G, ], the number of recursive calls of SGS[H, ]
is O(n). At each call, we have to find the orbits of the input group and and test
primitivity on a domain of size O(n). By Theorem 2.1.1 and Corollary 5.5.9,
one computation of the orbit structure can be done in O(n 2 ) time and one test
of primitivity in O(n 2 log n) time. Therefore, the total time spent in orbit and
primitivity computations is O(n 3 log n).
Next, we estimate the time spent in computing G i+1 for the values of i when
G i acts as a giant on some orbit i ⊆ i , and G i+1 is the pointwise stabilizer
of . For different i, these orbits i are disjoint, so Lemma 10.3.1 implies
i
i |i | ≤ 2n and Lemma 10.3.5 implies that the total time spent for these i
values is O(n 3 log4 n).
Finally, we estimate the time requirement of computing G i+1 for the values
of i when G i+1 = (G i )αi for some αi ∈ i+1 . Let I ⊆ [1, m] denote the set of
indices i in this category, and let ki be the length of the orbit αiG i in i+1 . We
claim that
ki ≤ 10n log n. (10.12)
i∈I
Our last task is to give the time requirement of the second phase of the
algorithm, the construction of a strong generating set relative to some ordering
of the original permutation domain . Recall that the output U of SGS[G, ]
is used to generate uniformly distributed random elements of G.
If G i+1 = (G i )αi for some αi ∈ i+1 then we have a transversal G i mod
G i+1 stored in U , so a uniformly distributed random coset representative can
be written down in O(n) time. If G i+1 is the pointwise stabilizer of an orbit
i ⊆ i , with |i | = ki , on which G i acts as a giant then Corollary 10.2.20
and Lemma 10.2.23(c) imply that a coset representative G i mod G i+1 can be
constructed in O(ki n log n) time, as described at the beginning of this sec-
tion. Therefore, a uniformly distributed random element of G is constructed
in O(n 2 log n) time, and Lemma 5.4.1 implies that an SGS for G can be con-
structed in O(n 3 log4 n) time.
Bibliography
[Acciaro and Atkinson, 1992] Acciaro, V., and Atkinson, M. D. (1992). A new
algorithm for testing the regularity of a permutation group. Congr. Numer.,
90:151–160.
[Atkinson, 1975] Atkinson, M. (1975). An algorithm for finding the blocks of a
permutation group. Math. Comp., 29:911–913.
[Atkinson and Neumann, 1990] Atkinson, M. D., and Neumann, P. M. (1990).
Computing Sylow subgroups of permutation groups. Congr. Numer., 79:55–60.
[Atkinson et al., 1984] Atkinson, M., Hassan, R., and Thorne, M. (1984). Group
theory on a micro-computer. In Atkinson, M., editor, Computational Group
Theory, pages 275–280, Academic Press, London.
[Babai, 1979] Babai, L. (1979). Monte Carlo algorithms for graph isomorphism
testing. Technical report DMS 79–10, Univ. de Montréal.
[Babai, 1991] Babai, L. (1991). Local expansion of vertex-transitive graphs and
random generation in finite groups. In Proc. 23rd ACM Symposium on Theory of
Computing, pages 164–174, ACM Press, New York.
[Babai, 1992] Babai, L. (1992). Bounded round interactive proofs in finite groups.
SIAM J. Discrete Math., 5:88–111.
[Babai, 1997] Babai, L. (1997). Randomization in group algorithms: Conceptual
questions. In Groups and Computation II, volume 28 of Amer. Math. Soc.
DIMACS Series, pages 1–17.
[Babai and Beals, 1999] Babai, L., and Beals, R. (1999). A polynomial-time theory
of black box groups I. In Campbell, C. M., Robertson, E. F., Ruskuc, N., and
Smith, G. C., editors, Groups St. Andrews 1997 in Bath, I, volume 260 of London
Math. Soc. Lecture Note Ser., pages 30–64, Cambridge University Press,
Cambridge.
[Babai and Moran, 1988] Babai, L., and Moran, S. (1988). Arthur–Merlin games: A
randomized proof system and a hierarchy of complexity classes. J. Comp. Syst.
Sci., 36:254–276.
[Babai and Szemerédi, 1984] Babai, L., and Szemerédi, E. (1984). On the complexity
of matrix group problems, I. In Proc. 25th IEEE Symposium on Foundations of
Computer Science, pages 229–240, IEEE Press, Washington.
[Babai et al., 1983] Babai, L., Kantor, W. M., and Luks, E. M. (1983). Computational
complexity and the classification of finite simple groups. In Proc. 24th IEEE
Symposium on Foundations of Computer Science, pages 162–171, IEEE Press,
Washington.
[Babai et al., 1987] Babai, L., Luks, E. M., and Seress, Á. (1987). Permutation groups
254
Bibliography 255
in NC. In Proc. 19th ACM Symposium on Theory of Computing, pages 409–420,
ACM Press, New York.
[Babai et al., 1988] Babai, L., Luks, E. M., and Seress, Á. (1988). Fast management of
permutation groups. In Proc. 29th IEEE Symposium on Foundations of Computer
Science, pages 272–282, IEEE Press, Washington.
[Babai et al., 1991] Babai, L., Cooperman, G., Finkelstein, L., and Seress, Á. (1991).
Nearly linear time algorithms for permutation groups with a small base. In Proc.
of International Symposium on Symbolic and Algebraic Computation ISSAC ’91,
pages 200–209, ACM Press, New York.
[Babai et al., 1993] Babai, L., Luks, E. M., and Seress, Á. (1993). Computing
composition series in primitive groups. In Groups and Computation, volume 11
of Amer. Math. Soc. DIMACS Series, pages 1–16.
[Babai et al., 1995] Babai, L., Cooperman, G., Finkelstein, L., Luks, E. M., and Seress,
Á. (1995). Fast Monte Carlo algorithms for permutation groups. J. Comp. Syst.
Sci., 50:296–308.
[Babai et al., 1997a] Babai, L., Goodman, A. J., Kantor, W. M., Luks, E. M., and Pálfy,
P. P. (1997a). Short presentations for finite groups. J. Algebra, 194:97–112.
[Babai et al., 1997b] Babai, L., Luks, E. M., and Seress, Á. (1997b). Fast management
of permutation groups I. SIAM J. Computing, 26:1310–1342.
[Beals, 1993a] Beals, R. (1993a). Computing blocks of imprimitivity for small-base
groups in nearly linear time. In Groups and Computation, volume 11 of Amer.
Math. Soc. DIMACS Series, pages 17–26.
[Beals, 1993b] Beals, R. M. (1993b). An elementary algorithm for computing the
composition factors of a permutation group. In Proc. of International Symposium
on Symbolic and Algebraic Computation ISSAC ’93, pages 127–134, ACM Press,
New York.
[Beals and Babai, 1993] Beals, R., and Babai, L. (1993). Las Vegas algorithms for
matrix groups. In Proc. 34rd IEEE Symposium on Foundations of Computer
Science, pages 427–436, IEEE Press, Washington.
[Beals and Seress, 1992] Beals, R. M., and Seress, Á. (1992). Structure forest and
composition factors for small base groups in nearly linear time. In Proc. 24th ACM
Symposium on Theory of Computing, pages 116–125, ACM Press, New York.
[Beals et al., 2002] Beals, R., Leedham-Green, C., Niemeyer, A., Praeger, C., and
Seress, Á. (2002). A black-box group algorithm for recognizing finite symmetric
and alternating groups, I. Transactions Amer. Math. Soc. To appear.
[Bosma et al., 1997] Bosma, W., Cannon, J., and Playoust, C. (1997). The Magma
algebra system I: The user language. J. Symbolic Comput., 24:235–265.
[Brassard and Bratley, 1988] Brassard, G., and Bratley, P. (1988). Algorithmics Theory
and Practice, Prentice–Hall, Englewood Cliffs, NJ.
[Bratus, 1999] Bratus, S. (1999). Recognition of Finite Black Box Groups. PhD thesis,
Northeastern University.
[Bratus and Pak, 2000] Bratus, S., and Pak, I. (2000). Fast constructive recognition of
a black box group isomorphic to Sn or An using Goldbach’s Conjecture.
J. Symbolic Comput., 29:33–57.
[Brown et al., 1989] Brown, C. A., Finkelstein, L., and Purdom, P. W. (1989). A new
base change algorithm for permutation groups. SIAM J. Computing,
18:1037–1047.
[Butler, 1979] Butler, G. (1979). Computational Approaches to Certain Problems in
the Theory of Finite Groups. PhD thesis, University of Sydney.
[Butler, 1982] Butler, G. (1982). Computing in permutation and matrix groups II:
Backtrack algorithm. Math. Comp., 39:671–680.
256 Bibliography
[Butler, 1983] Butler, G. (1983). Computing normalizers in permutation groups.
J. Algorithms, 4:163–175.
[Butler, 1985] Butler, G. (1985). Effective computation with group homomorphisms.
J. Symbolic Comput., 1:143–157.
[Butler, 1991] Butler, G. (1991). Fundamental Algorithms for Permutation Groups,
volume 559 of Lecture Notes in Computer Science, Springer-Verlag, Berlin.
[Butler, 1994] Butler, G. (1994). An inductive schema for computing conjugacy
classes in permutation groups. Math. Comp., 62:363–383.
[Butler and Cannon, 1989] Butler, G., and Cannon, J. J. (1989). Computing in
permutation and matrix groups III: Sylow subgroups. J. Symbolic Comput.,
8:241–252.
[Butler and Cannon, 1991] Butler, G., and Cannon, J. J. (1991). Computing Sylow
subgroups using homomorphic images of centralizers. J. Symbolic Comput.,
12:443–458.
[Cameron, 1999] Cameron, P. J. (1999). Permutation Groups, Cambridge University
Press, Cambridge.
[Cameron et al., 1984] Cameron, P. J., Neumann, P. M., and Saxl, J. (1984). On groups
with no regular orbits on the set of subsets. Archive Math., 43:295–296.
[Cameron et al., 1989] Cameron, P. J., Solomon, R., and Turull, A. (1989). Chains of
subgroups in symmetric groups. J. Algebra, 127:340–352.
[Cannon, 1984] Cannon, J. (1984). A computational toolkit for finite permutation
groups. In Proc. Rutgers Group Theory Year, 1983–1984, pages 1–18, Cambridge
University Press, Cambridge.
[Cannon and Souvignier, 1997] Cannon, J., and Souvignier, B. (1997). On the
computation of conjugacy classes in permutation groups. In Proc. of International
Symposium on Symbolic and Algebraic Computation ISSAC ’97, pages 392–399,
ACM Press, New York.
[Cannon, 1971] Cannon, J. J. (1971). Computing local structure of large finite groups.
In Birkhoff, G., and Marshall Hall, J., editors, Computers in Algebra and Number
Theory, volume 4 of Proc. Amer. Math. Soc., pages 161–176, Amer. Math. Soc.,
Providence, RI.
[Cannon, 1973] Cannon, J. J. (1973). Construction of defining relators for finite
groups. Discrete Math., 5:105–129.
[Cannon and Holt, 1997] Cannon, J. J., and Holt, D. F. (1997). Computing chief series,
composition series and socles in large permutation groups. J. Symbolic Comput.,
24:285–301.
[Cannon and Holt, 2002] Cannon, J. J., and Holt, D. F. (2002). Computing maximal
subgroups of finite groups. Submitted.
[Cannon et al., 1997] Cannon, J. J., Cox, B. C., and Holt, D. F. (1997). Computing
Sylow subgroups in permutation groups. J. Symbolic Comput., 24:303–316.
[Carmichael, 1923] Carmichael, R. (1923). Abstract definitions of the symmetric and
alternating groups and certain other permutation groups. Quart. J. Math.,
49:226–270.
[Celler et al., 1995] Celler, F., Leedham-Green, C. R., Murray, S. H., Niemeyer, A. C.,
and O’Brien, E. (1995). Generating random elements of a finite group. Comm.
Algebra, 23:4931–4948.
[Celler et al., 1990] Celler, F., Neubüser, J., and Wright, C. R. B. (1990). Some
remarks on the computation of complements and normalizers in soluble groups.
Acta Appl. Math., 21:57–76.
[Chernoff, 1952] Chernoff, H. (1952). A measure of asymptotic efficiency for tests of a
hypothesis based on the sum of observations. Ann. Math. Statistics, 23:493–507.
[Conway et al., 1985] Conway, J., Curtis, R., Norton, S., Parker, R., and Wilson, R.
(1985). Atlas of Finite Groups, Clarendon Press, Oxford.
[Cooperman and Finkelstein, 1992] Cooperman, G., and Finkelstein, L. (1992). A fast
Bibliography 257
cyclic base change for permutation groups. In Proc. of International Symposium
on Symbolic and Algebraic Computation ISSAC ’92, pages 224–232, ACM Press,
New York.
[Cooperman and Finkelstein, 1993] Cooperman, G., and Finkelstein, L. (1993).
Combinatorial tools for computational group theory. In Groups and Computation,
volume 11 of Amer. Math. Soc. DIMACS Series, pages 53–86.
[Cooperman et al., 1989] Cooperman, G., Finkelstein, L., and Luks, E. M. (1989).
Reduction of group constructions to point stabilizers. In Proc. of International
Symposium on Symbolic and Algebraic Computation ISSAC ’89, pages 351–356,
ACM Press, New York.
[Cooperman et al., 1990] Cooperman, G., Finkelstein, L., and Sarawagi, N. (1990). A
random base change algorithm for permutation groups. In Proc. of International
Symposium on Symbolic and Algebraic Computation ISSAC ’90, pages 161–168,
ACM Press, New York.
[Cooperman et al., 1997] Cooperman, G., Finkelstein, L., and Linton, S. A. (1997).
Recognizing GLn (2) in non-standard representation. In Groups and Computation
II, volume 28 of Amer. Math. Soc. DIMACS Series, pages 85–100.
[Cooperstein, 1978] Cooperstein, B. N. (1978). Minimal degree for a permutation
representation of a classical group. Israel J. Math., 30:213–235.
[Coxeter and Moser, 1957] Coxeter, H., and Moser, W. (1957). Generators and
Relations for Discrete Groups, 4th edition, volume 14 of Ergeb. Math. Grenzgeb.
Springer-Verlag, Berlin.
[Dixon, 1968] Dixon, J. D. (1968). The solvable length of a solvable linear group.
Math. Z., 107:151–158.
[Dixon and Mortimer, 1996] Dixon, J. D., and Mortimer, B. (1996). Permutation
Groups. Graduate Texts in Math., Springer-Verlag, Berlin.
[Easdown and Praeger, 1988] Easdown, D., and Praeger, C. E. (1988). On minimal
faithful permutation representations of finite groups. Bull. Aust. Math. Soc.,
38:207–220.
[Eick, 1997] Eick, B. (1997). Special presentations for finite soluble groups and
computing (pre-)Frattini subgroups. In Groups and Computation II, volume 28 of
Amer. Math. Soc. DIMACS Series, pages 101–112.
[Eick and Hulpke, 2001] Eick, B., and Hulpke, A. (2001). Computing the maximal
subgroups of a permutation group I. In Kantor, W. M., and Seress, Á., editors,
Groups and Computation III, volume 8 of OSU Mathematical Research Institute
Publications, pages 155–168, de Gruyter, Berlin.
[Erdős and Rényi, 1965] Erdős, P., and Rényi, A. (1965). Probabilistic methods in
group theory. J. d’Analyse Math., 14:127–138.
[Feller, 1968] Feller, W. (1968). An Introduction to Probability Theory and Its
Applications, volume 1, 3rd edition, Wiley, New York.
[Furst et al., 1980] Furst, M., Hopcroft, J., and Luks, E. M. (1980). Polynomial-time
algorithms for permutation groups. In Proc. 21st IEEE Symposium on
Foundations of Computer Science, pages 36–41, IEEE Press, Washington.
[GAP, 2000] GAP (2000). GAP – Groups, Algorithms, and Programming, Version 4.2.
The GAP Group, Aachen, St Andrews
(http://www-gap.dcs.st-and.ac.uk/~gap).
[Garey and Johnson, 1979] Garey, M. R., and Johnson, D. S. (1979). Computers and
Intractability, A Guide to the Theory of NP-completeness, Freeman, New York.
[Gebhardt, 2000] Gebhardt, V. (2000). Constructing a short defining set of relations for
a finite group. J. Algebra, 233:526–542.
[Guralnick, 1983] Guralnick, R. M. (1983). Subgroups of prime power index in a
simple group. J. Algebra, 81:304–311.
[Guralnick and Kantor, 2000] Guralnick, R. M., and Kantor, W. M. (2000). The
probability of generating a simple group. J. Algebra, 234:743–792.
258 Bibliography
[Hardy and Wright, 1979] Hardy, G. H., and Wright, E. M. (1979). An Introduction the
Theory of Numbers, 5th edition, Clarendon Press, Oxford.
[Hoffmann, 1982] Hoffmann, C. M. (1982). Group-theoretic Algorithms and Graph
Isomorphism, volume 136 of Lecture Notes in Computer Science,
Springer-Verlag, Berlin.
[Holt, 1991] Holt, D. F. (1991). The computation of normalizers in permutation
groups. J. Symbolic Comput., 12:499–516.
[Holt, 1997] Holt, D. F. (1997). Representing quotients of permutation groups. Quart.
J. Math. Oxford Ser. (2), 48:347–350.
[Holt, 2001] Holt, D. F. (2001). Computing automorphism groups of finite groups. In
Kantor, W. M., and Seress, Á., editors, Groups and Computation III, volume 8 of
OSU Mathematical Research Institute Publications, pages 201–208, de Gruyter,
Berlin.
[Holt and Rees, 1994] Holt, D. F., and Rees, S. (1994). Testing modules for
irreducibility. J. Austral. Math. Soc. Ser. A, 57:1–16.
[Hulpke, 1993] Hulpke, A. (1993). Zur Berechnung von Charaktertafeln.
Diplomarbeit, RWTH Aachen.
[Hulpke, 1996] Hulpke, A. (1996). Konstruktion transitiver Permutationsgruppen.
PhD thesis, RWTH Aachen.
[Hulpke, 2000] Hulpke, A. (2000). Conjugacy classes in finite permutation groups via
homomorphic images. Math. Comp., 69:1633–1651.
[Hulpke and Seress, 2001] Hulpke, A., and Seress, Á. (2001). Short presentations for
three-dimensional unitary groups. J. Algebra, 245:719–729.
[Huppert, 1967] Huppert, B. (1967). Endliche Gruppen I, volume 134 of Grundlehren
Math. Wiss., Springer-Verlag, Berlin.
[Ivanyos and Lux, 2000] Ivanyos, G., and Lux, K. (2000). Treating the exceptional
cases of the Meataxe. Exp. Math., 9:373–381.
[Jerrum, 1986] Jerrum, M. (1986). A compact representation for permutation groups.
J. Algorithms, 7:60–78.
[Jerrum, 1995] Jerrum, M. (1995). Computational Pólya theory. In Surveys in
Combinatorics, 1995, pages 103–118, Cambridge University Press,
Cambridge.
[Jordan, 1873] Jordan, C. (1873). Sur la limite de transitivité des groupes non alternés.
Bull. Soc. Math. France, 1:40–71.
[Kantor, 1985a] Kantor, W. M. (1985a). Some consequences of the classification of
finite simple groups. In McKay, J., editor, Finite Groups – Coming of Age,
volume 45 of Contemporary Mathematics, pages 159–173, Amer. Math. Soc.,
Providence, RI.
[Kantor, 1985b] Kantor, W. M. (1985b). Sylow’s theorem in polynomial time.
J. Comp. Syst. Sci., 30:359–394.
[Kantor, 1990] Kantor, W. M. (1990). Finding Sylow normalizers in polynomial time.
J. Algorithms, 11:523–563.
[Kantor, 1991] Kantor, W. M. (1991). Finding composition factors of permutation
groups of degree n ≤ 106 . J. Symbolic Comput., 12:517–526.
[Kantor and Luks, 1990] Kantor, W. M., and Luks, E. M. (1990). Computing in
quotient groups. In Proc. 22nd ACM Symposium on Theory of Computing, pages
524–534, ACM Press, New York.
[Kantor and Magaard, 2002] Kantor, W. M., and Magaard, K. (2002). Black-box
exceptional groups of Lie type. In preparation.
[Kantor and Penttila, 1999] Kantor, W. M., and Penttila, T. (1999). Reconstructing
simple group actions. In Cossey, J., Miller, C. F., Neumann, W. D., and Shapiro,
M., editors, Geometric Group Theory Down Under, pages 147–180, de Gruyter,
Berlin.
Bibliography 259
[Kantor and Seress, 1999] Kantor, W. M., and Seress, Á. (1999). Permutation group
algorithms via black box recognition algorithms. In Campbell, C. M., Robertson,
E. F., Ruskuc, N., and Smith, G. C., editors, Groups St. Andrews 1997 in Bath, II,
volume 261 of London Math. Soc. Lecture Note Ser., pages 436–446, Cambridge
University Press, Cambridge.
[Kantor and Seress, 2002] Kantor, W. M., and Seress, Á. (2002). Computing with
matrix groups. Submitted.
[Kantor and Seress, 2001] Kantor, W. M., and Seress, Á. (2001). Black box classical
groups. Memoirs Amer. Math. Soc., 149(708).
[Kantor and Taylor, 1988] Kantor, W. M., and Taylor, D. E. (1988). Polynomial-time
versions of Sylow’s theorem. J. Algorithms, 9:1–17.
[Kantor et al., 1999] Kantor, W. M., Luks, E. M., and Mark, P. D. (1999). Sylow
subgroups in parallel. J. Algorithms, 31:132–195.
[Kimmerle et al., 1990] Kimmerle, W., Lyons, R., Sandling, R., and Teague, D. N.
(1990). Composition factors from the group ring and Artin’s theorem on orders of
simple groups. Proc. London Math. Soc. (3), 60:89–122.
[Knuth, 1969] Knuth, D. E. (1969). The Art of Computer Programming. Volume 2:
Seminumerical Algorithms, Addison-Wesley, Reading, MA.
[Knuth, 1973] Knuth, D. E. (1973). The Art of Computer Programming. Volume 3:
Sorting and Searching, Addison-Wesley, Reading, MA.
[Knuth, 1991] Knuth, D. E. (1991). Notes on efficient representation of perm groups.
Combinatorica, 11:57–68.
[Laue et al., 1984] Laue, R., Neubüser, J., and Schoenwaelder, U. (1984). Algorithms
for finite soluble groups and the SOGOS system. In Atkinson, M., editor,
Computational Group Theory, pages 105–135, Academic Press, London.
[Leedham-Green and Soicher, 1990] Leedham-Green, C., and Soicher, L. (1990).
Collection from the left and other strategies. J. Symbolic Comput., 9:665–675.
[Leedham-Green, 2001] Leedham-Green, C. R. (2001). The computational matrix
group project. In Kantor, W. M., and Seress, Á., editors, Groups and Computation
III, volume 8 of OSU Mathematical Research Institute Publications, pages
229–247, de Gruyter, Berlin.
[Leon, 1980a] Leon, J. S. (1980a). Finding the order of a permutation group. In
Cooperstein, B., and Mason, G., editors, Finite Groups, volume 37 of Proc.
Sympos. Pure Math., pages 511–517, Amer. Math. Soc., Providence, RI.
[Leon, 1980b] Leon, J. S. (1980b). On an algorithm for finding a base and strong
generating set for a group given by generating permutations. Math. Comp.,
35:941–974.
[Leon, 1991] Leon, J. S. (1991). Permutation group algorithms based on partitions, I:
Theory and algorithms. J. Symbolic Comput., 12:533–583.
[Leon, 1997] Leon, J. S. (1997). Partitions, refinements, and permutation group
computation. In Groups and Computation II, volume 28 of Amer. Math. Soc.
DIMACS Series, pages 123–158.
[Luks, 1982] Luks, E. M. (1982). Isomorphism of graphs of bounded valence can be
tested in polynomial time. J. Comp. Syst. Sci., 25:42–65.
[Luks, 1987] Luks, E. M. (1987). Computing the composition factors of a permutation
group in polynomial time. Combinatorica, 7:87–99.
[Luks, 1990] Luks, E. M. (1990). Lectures on Polynomial-Time Computation in
Groups. Lecture notes, Northeastern University.
[Luks, 1993] Luks, E. M. (1993). Permutation groups and polynomial-time
computation. In Groups and Computation, volume 11 of Amer. Math. Soc.
DIMACS Series, pages 139–175.
[Luks and Seress, 1997] Luks, E. M., and Seress, Á. (1997). Computing the Fitting
subgroup and solvable radical of small-base permutation groups in nearly linear
260 Bibliography
time. In Groups and Computation II, volume 28 of Amer. Math. Soc. DIMACS
Series, pages 169–181.
[Mark, 1993] Mark, P. D. (1993). Parallel computation of Sylow subgroups in solvable
groups. In Groups and Computation, volume 11 of Amer. Math. Soc. DIMACS
Series, pages 177–187.
[McIver and Neumann, 1987] McIver, A., and Neumann, P. M. (1987). Enumerating
finite groups. Quart. J. Math. Oxford, 38:473–488.
[McKay, 1981] McKay, B. D. (1981). Practical graph isomorphism. Congr. Numer.,
30:45–87.
[Mecky and Neubüser, 1989] Mecky, M., and Neubüser, J. (1989). Some remarks on
the computation of conjugacy classes of soluble groups. Bull. Aust. Math. Soc.,
40:281–292.
[Morje, 1995] Morje, P. (1995). A Nearly Linear Algorithm for Sylow Subgroups of
Small-Base Permutation Groups. PhD thesis, The Ohio State University.
[Morje, 1997] Morje, P. (1997). On nearly linear time algorithms for Sylow subgroups
of small-base permutation groups. In Groups and Computation II, volume 28 of
Amer. Math. Soc. DIMACS Series, pages 257–272.
[Neubüser, 1982] Neubüser, J. (1982). An elementary introduction to coset table
methods in computational group theory. In Campbell, C. M., and Robertson, E. F.,
editors, Groups – St Andrews 1981, volume 71 of London Math. Soc. Lecture
Note Ser., pages 1–45, Cambridge University Press, Cambridge.
[Neumann, 1979] Neumann, P. M. (1979). A lemma that is not Burnside’s. Math.
Scientist, 4:133–141.
[Neumann, 1986] Neumann, P. M. (1986). Some algorithms for computing with finite
permutation groups. In Robertson, E., and Campbell, C., editors, Proc. of
Groups – St Andrews 1985, volume 121 of London Math. Soc. Lecture Note Ser.,
pages 59–92, Cambridge University Press, Cambridge.
[Niven et al., 1991] Niven, I., Zuckerman, H. S., and Montgomery, H. L. (1991). An
Introduction to the Theory of Numbers, 5th edition, Wiley, New York.
[Pak, 2001] Pak, I. (2001). What do we know about the product replacement
algorithm? In Kantor, W. M., and Seress, Á., editors, Groups and Computation
III, volume 8 of OSU Mathematical Research Institute Publications, pages
301–347, de Gruyter, Berlin.
[Parker and Nikolai, 1958] Parker, E. T., and Nikolai, P. J. (1958). A search for
analogues of the Mathieu groups. Math. Tables Aids Comput., 12:38–43.
[Parker, 1984] Parker, R. (1984). The computer calculation of modular characters (the
Meat-Axe). In Atkinson, M., editor, Computational Group Theory, pages
267–274, Academic Press, London.
[Passman, 1968] Passman, D. (1968). Permutation Groups. Benjamin, New York.
[Praeger and Saxl, 1980] Praeger, C. E., and Saxl, J. (1980). On the orders of primitive
permutation groups. Bull. London Math. Soc., 12:303–307.
[Pyber, 1993] Pyber, L. (1993). Asymptotic results for permutation groups. In Groups
and Computation, volume 11 of Amer. Math. Soc. DIMACS Series, pages
197–219.
[Rákóczi, 1995] Rákóczi, F. (1995). Fast recognition of the nilpotency of permutation
groups. In Proc. of International Symposium on Symbolic and Algebraic
Computation ISSAC ’95, pages 265–269, ACM Press, New York.
[Rákóczi, 1997] Rákóczi, F. (1997). Data Structures and Algorithms for Computing in
Nilpotent and Solvable Permutation Groups. PhD thesis, University of Oregon.
[Rónyai, 1990] Rónyai, L. (1990). Computing the structure of finite algebras.
J. Symbolic Comput., 9:355–373.
[Rose, 1965] Rose, J. (1965). Abnormal depth and hypereccentric length in finite
soluble groups. Math. Z., 90:29–40.
Bibliography 261
[Rotman, 1995] Rotman, J. J. (1995). An Introduction to the Theory of Groups, 4th
edition, Springer-Verlag, Berlin.
[Schönert and Seress, 1994] Schönert, M., and Seress, Á. (1994). Finding blocks of
imprimitivity in small-base groups in nearly linear time. In Proc. of International
Symposium on Symbolic and Algebraic Computation ISSAC ’94, pages 154–157,
ACM Press, New York.
[Scott, 1980] Scott, L. L. (1980). Representations in characteristic p. In The Santa
Cruz Conference on Finite Groups, volume 37 of Proc. Symposium Pure Math.,
pages 319–331, Amer. Math. Soc., Providence, RI.
[Sedgewick, 1988] Sedgewick, R. (1988). Algorithms, 2nd edition, Addison–Wesley,
Reading, MA.
[Seress, 1997] Seress, Á. (1997). An introduction to Computational Group Theory.
Notices Amer. Math. Soc., 44:671–679.
[Seress, 1998] Seress, Á. (1998). Nearly linear time algorithms for permutation
groups: an interplay between theory and practice. Acta Appl. Math., 52:183–207.
[Seress and Weisz, 1993] Seress, Á., and Weisz, I. (1993). PERM: A program
computing strong generating sets. In Groups and Computation I, volume 11 of
Amer. Math. Soc. DIMACS Series, pages 269–276.
[Sims, 1970] Sims, C. C. (1970). Computational methods in the study of permutation
groups. In Computational Problems in Abstract Algebra, pages 169–183,
Pergamon Press, Oxford.
[Sims, 1971a] Sims, C. C. (1971a). Computation with permutation groups. In Proc.
Second Symposium on Symbolic and Algebraic Manipulation, pages 23–28, ACM
Press, New York.
[Sims, 1971b] Sims, C. C. (1971b). Determining the conjugacy classes of permutation
groups. In Birkhoff, G., and Marshall Hall, J., editors, Computers in Algebra and
Number Theory, volume 4 of Proc. Amer. Math. Soc., pages 191–195, Amer.
Math. Soc., Providence, RI.
[Sims, 1978] Sims, C. C. (1978). Some group theoretic algorithms. In Dold, A., and
Eckmann, B., editors, Topics in Algebra, volume 697 of Lecture Notes in Math.,
pages 108–124, Springer-Verlag, Berlin.
[Sims, 1990] Sims, C. C. (1990). Computing the order of a solvable permutation
group. J. Symbolic Comput., 9:699–705.
[Sims, 1994] Sims, C. C. (1994). Computation with Finitely Presented Groups,
Cambridge University Press, Cambridge.
[Suzuki, 1962] Suzuki, M. (1962). On a class of doubly transitive groups. Ann. Math.,
75:105–145.
[Tarjan, 1975] Tarjan, R. E. (1975). Efficiency of a good but not linear set union
algorithm. J. Assoc. Comput. Mach., 22:215–225.
[Taylor, 1992] Taylor, D. E. (1992). The Geometry of the Classical Groups, volume 9
of Sigma Series in Pure Mathematics, Heldermann-Verlag, Berlin.
[Theißen, 1997] Theißen, H. (1997). Eine Methode zur Normalisatorberechnung in
Permutationsgruppen mit Anwendungen in der Konstruktion primitiver Gruppen.
PhD thesis, RWTH Aachen.
[Vaughan-Lee, 1990] Vaughan-Lee, M. (1990). Collection from the left. J. Symbolic
Comput., 9:725–733.
[Warlimont, 1978] Warlimont, R. (1978). Über die Anzahl der Lösungen von x n = 1
in der symmetrischen Gruppe Sn . Archive Math., 30:591–594.
[Wielandt, 1934] Wielandt, H. (1934). Abschätzungen für den Grad einer
Permutationsgruppe von vorgeschriebenem Transitivitätsgrad. Schrift. Math.
Sem. Inst. Angew. Math. Univ. Berlin, 2:151–174.
[Wielandt, 1964] Wielandt, H. (1964). Finite Permutation Groups, Academic Press,
New York.
Index
( ), 9 ( f ), 11
1m 1 · · · n m n , 228 P (γ1 , . . . , γl−1 ), 205
A × B, 8 Out(G), 7
A(m, x), 109 ∧
, 208
An , 9 R(), 209
C G (U ), 7 Soc(G), 8
Cn , 8 Sym(), 9
E(A), 36 T (t), 202
G| , 9 (H ), 210
G H , 10 ( f ), 11
G, 8 X (V, E), 11
G(P), 201 X (α, β), 212
G, 9 αg , 9
G [i] , 55 N, 10
G () , 9 R, 10
G , 10 Z, 10
H G, 7 fix(G), 117
H << G, 7 S, 7
L[i], 12 U G , 7
N (k, , d1 , . . . , dl ), 231 µ, 94, 228
N G (U ), 7 ωG , 9
Nn (x), 231 O ∼ ( f ), 11
O( f ), 11 Ai , 8
O ∞ (G), 8 supp(g), 9
O p (G), 8 ϕ((γ1 , . . . , γl )), 202
O∞ (G), 8 ξ , 94, 228
Sn , 9 o( f ), 11
Tn (x), 231 OP(), 207
[H, K ], 8 Prob(A|B), 30
[a, b], 8
Alt(), 9 Ackerman function, 109, 176
Aut(G), 7 automorphism group, 7, 129, 131, 144
Aut(X ), 11 of a graph, 11, 207
g , 9
Diag(H ), 129 backtrack, 53, 169, 201–217
GF(q), 8 base, 50, 55
GLd (q), 8 R-base, 210
Inn(G), 7 nonredundant, 55
262
Index 263
base change, 82, 97, 112, 116, 134, 143, 190, forest, 12, 219, 249
204, 205 Frattini argument, 170, 182
black-box f -recognizable, 192, 193, 195, 196, Frobenius group, 10, 133, 137, 148, 160
198
black-box group, 16, 16–47, 135, 139, 192, graph, 11
193, 195, 228, 235–244 component, 12, 112
block, 9, 50, 100, 107–110, 112, 113, 121, connected, 12
142, 190, 214
maximal, 9, 144, 146 Hall subgroup, 182
minimal, 9, 101–107, 112 hash function, 22, 142
block homomorphism, 81 hypergraph, 11, 87
block system, 9, 121, 126, 128, 142, 176, 178 uniform, 11, 87
maximal, 9, 191
minimal, 9, 132, 149, 247, 253 labeled branching, 219, 218–225
represents a group, 220
Cayley graph, 12, 26, 64 represents a transversal, 220
center, 7, 50, 120, 133 Las Vegas algorithm, 14
centralizer, 7, 53, 117–124, 130, 134, 149, local expansion, 72
150, 152, 158, 169, 172, 205, 216 lower central series, 8, 24, 38, 49, 84, 180
Chernoff’s bound, 31, 33–35, 37–39
basic-type application, 32 Markov chain, 25, 27, 215, 217
chief series, 49, 155 aperiodic, 25, 28, 47, 215
closure, 83, 111 irreducible, 25, 28, 215
G-closure, 7, 23, 38, 44, 83 period of, 25, 47
normal closure, 7, 23, 83, 111, 116, 138, stationary distribution of, 25, 28, 215
140, 155, 250, 251 transition probability, 25, 28, 47, 215
collection, 17, 165 Monte Carlo algorithm, 13
commutator, 8
complement, 7, 182 nearly linear-time algorithm, 51
composition series, 50, 125–155, 158, 165, nearly uniform distribution, 24, 29
193, 197, 199 nilpotent group, 175–182
conjugacy class, 172, 214–216 nonconstructive recognition, 227
constructive recognition, 168, 192, 193, 195, normalizer, 7, 134, 137, 142, 154, 170, 211
196, 227, 235–246
coordinatization, 167, 169–171, 173 orbit, 9, 18, 36, 49, 60, 65, 102, 112, 120–122,
core, 8, 50, 124, 180 141, 143, 144, 154, 173, 179, 189, 190,
p-core, 8, 51, 138, 157–159 252
coset enumeration, 184–186 fundamental, 56, 83, 97, 99, 182, 204, 223
cube, 64, 67, 69 orbital graph, 212, 251
nondegenerate, 65 ordered partition, 207
cycle type, 228 cell of, 207
refinement, 208
degree out-degree, 11
of a graph or hypergraph, 11
of a permutation, 9 path, 12, 219, 252
of a permutation group, 9 perfect group, 8, 146
derived series, 8, 24, 38, 49, 84, 159 permutation group as black-box group, 93,
diagonal subgroup, 129, 131, 144, 160 135, 138, 139, 192, 193, 197
direct product, 8, 119–122, 129, 141, 147, permutation isomorphism, 10, 95, 119, 127,
216 128, 140, 142, 146, 173
projection, 8, 130 polycyclic generating sequence, 162, 163, 165,
directed graph, 11 166, 181
out-degree, 11 power-conjugate presentation, 165–166
strongly connected, 12, 112, 251 presentation, 49, 112, 165, 184, 192, 194,
underlying graph, 12, 219 197–200, 227, 236, 237, 244, 247,
double coset, 53, 203 250
264 Index
primitive group, 9, 95, 100, 126, 129–149, stabilizer
160, 225, 251 pointwise, 9, 49, 79, 115, 127, 247
product replacement algorithm, 27 setwise, 10, 53, 145, 176, 206
standard word, 93
random prefix, 40 straight-line program, 10, 192–194, 197, 199,
random subproduct, 30, 30–40, 73, 77, 84, 88, 200, 227, 239, 240, 243, 244, 250
92, 182, 245, 246, 249 strong generating set, see SGS
recognition, see constructive recognition subdirect product, 8, 130, 160
regular graph, 11 subnormal, 7, 49, 124, 149, 152, 160, 161,
regular group, 10, 78, 113, 129, 133, 137, 138, 180
146, 150, 154, 160, 161 support, 9
Sylow subgroup, 50, 95, 125, 157, 161,
Schreier generator, 58, 59, 62, 73, 76–78, 86, 167–172, 182, 216
88, 92, 97, 177, 198, 222, 246, 249
Schreier tree, 56, 65, 67, 70, 72, 75, 77, 82, 85, transitive closure, 219, 220
88, 91, 97, 99, 101, 106, 118, 128, 135, transitive constituent homomorphism, 81
136, 155, 163, 190 transitive group, 9, 36, 87, 117
shallow, 114 transversal, 8, 56, 57, 59, 65, 82, 101, 118,
Schreier vector, see Schreier tree 218, 220, 246, 247, 249, 253
Schreier–Sims algorithm, 59 tree, 12
Schur–Zassenhaus theorem, 182 breadth-first-search, 19, 65, 68, 113, 157,
search tree, 202 187
semiregular group, 10, 117, 123, 130 rooted, 12, 108, 111, 202, 219
SGS, 50, 55 children of a vertex, 12
construction, 59, 63, 70, 72, 75, 86, 87, 90, leaf, 12
99, 162, 193, 222, 246 parent of a vertex, 12
testing, 64, 77, 186, 190, 193
siftee, 56 Union-Find data structure, 108, 113
sifting, 56 up-to-date SGS data structure, 59, 70, 83, 88,
as a word, 86, 88, 91, 128, 156, 166, 167, 91
187 upper central series, 8, 179–181
in a labeled branching, 221, 223, 224
small-base group, 51 valency, 11
socle, 8, 51, 129, 147–149, 152–154, 161
solvable radical, 8, 157–159, 216 walk, 12, 25, 212, 213, 215
solvable residual, 8, 150, 152 lazy random, 26
spreader, 41 wreath product, 10, 119, 122, 129, 160, 226