Color Refinement
Color Refinement
Color Refinement
Sandra Kiefer
RWTH Aachen University, Aachen, Germany
kiefer@cs.rwth-aachen.de
Brendan D. McKay
Australian National University, Canberra, Australia
brendan.mckay@anu.edu.au
Abstract
The Colour Refinement procedure and its generalisation to higher dimensions, the Weisfeiler-Leman
algorithm, are central subroutines in approaches to the graph isomorphism problem. In an iterative
fashion, Colour Refinement computes a colouring of the vertices of its input graph.
A trivial upper bound on the iteration number of Colour Refinement on graphs of order n is
arXiv:2005.10182v1 [cs.DM] 20 May 2020
n − 1. We show that this bound is tight. More precisely, we prove via explicit constructions that
there are infinitely many graphs G on which Colour Refinement takes |G| − 1 iterations to stabilise.
Modifying the infinite families that we present, we show that for every natural number n ≥ 10, there
are graphs on n vertices on which Colour Refinement requires at least n − 2 iterations to reach
stabilisation.
1 Introduction
Colour Refinement, which is also known as Naïve Vertex Classification or the 1-dimensional
Weisfeiler-Leman algorithm (1-WL), is an important combinatorial algorithm in theoretical
and practical approaches to the graph isomorphism problem. In an iterative fashion, it
refines an isomorphism-invariant partition of the vertex set of the input graph. This process
stabilises at some point and the final partition can often be used to distinguish non-isomorphic
graphs [3]. Colour Refinement can be implemented to run in time O((m + n) log n), where
n is the order of the input graph and m is its number of edges [6, 28]. Most notably,
its efficient implementations are used in all competitive graph isomorphism solvers (such
as Nauty and Traces [29], Bliss [19] and saucy [7]).
Colour Refinement has been rediscovered many times, one of its first occurences being
in a paper on chemical information systems from the 1960s [30]. The procedure is applied
in plenty of other fields, for example, it can be modified to reduce the dimension of linear
programs significantly [14]. Other applications are in the context of graph kernels [33] or
static program analysis [26]. A recently discovered connection to deep learning shows that
the expressive power of Colour Refinement is captured by graph neural networks [31].
As described above, Colour Refinement computes a stable colouring of its input graph.
It is known that two given graphs result in equal colourings, i.e. are not distinguished by
Colour Refinement, if and only if there is a fractional isomorphism between them [12, 32, 34].
Moreover, the graphs which Colour Refinement identifies up to isomorphism (i.e. distinguishes
from all non-isomorphic ones) have been completely characterised [2, 23].
To obtain its final colouring, the algorithm proceeds in iterations. In this paper, we
investigate how many iterations it takes for the algorithm to terminate. More specifically,
for n ∈ N, we are interested in WL1 (n), the maximum number of iterations required to reach
stabilisation of Colour Refinement among all graphs of order n.
While not directly linked to the running time on a sequential machine, the iteration
number corresponds to the parallel running time of Colour Refinement (on a standard PRAM
model) [17, 24]. Furthermore, via a connection to counting logics, a bound on the iteration
number for graphs of a fixed size directly translates into a bound on the descriptive complexity
of the difference between the two graphs, namely into a bound on the quantifier depth of
2 The Iteration Number of Colour Refinement
Thus, there are infinitely many n ∈ N with WL1 (n) = n − 1. We can even determine the
iteration number up to an additive constant of 1 for all n ∈ N (where the precise numbers
for n ≤ 9 can easily be determined computationally), as stated in our second main result.
I Theorem 1.2. For every n ∈ N≥10 , it holds that WL1 (n) ∈ {n − 2, n − 1}.
We obtain our bounds via an empirical approach. More precisely, we have designed a
procedure that enables us to systematically generate for all n ≤ 64 graphs of order n that
obey certain constraints (to render the procedure tractable) and on which Colour Refinement
takes n − 1 iterations to stabilise. Analysing the graphs, we determined the connections
between colour classes during the execution of the algorithm in detail. If the vertex degrees
that are present in the graph are low, then the connections between colour classes of size 2
are restricted. This allows us to develop an elegant graphical visualisation and a compact
string representation of the graphs with low vertex degrees that take n − 1 iterations to
stabilise. Using these encodings, we are able to provide infinite families with n − 1 Colour
Refinement iterations until stabilisation.
Our analysis enables a deep understanding of the families that we present. Via slight
modifications of the graph families, we can then cover a large portion of graph sizes and,
allowing to go from connected graphs to general graphs, we can construct the graphs that
yield Theorem 1.2.
Related work
Colour Refinement is the 1-dimensional version of the so-called Weisfeiler-Leman algorithm.
For every k ∈ N, there exists a generalisation of it (k-WL), which colours vertex k-tuples in
the input graph instead of single vertices only. See [20] for an in-depth study of the main
parameters of Colour Refinement and k-WL.
Similarly as for Colour Refinement, one can consider the number WLk (n) of iterations of
k-WL on graphs of order n. Notably, contrasting our results for Colour Refinement, in [21], it
was first proved that the trivial upper bound of WL2 (n) ≤ n2 − 1 is not even asymptotically
tight (see also the journal version [22]). This foundation fostered further work, leading to an
astonishingly good new upper bound of O(n log n) for the iteration number of 2-WL [27].
For fixed k > 1, it is already non-trivial to show linear lower bounds on WLk (n).
Modifying a construction of Cai, Fürer, and Immerman [5], this was achieved by Fürer [9],
who showed that WLk (n) ∈ Ω(n), remaining to date the best known lower bound when the
S. Kiefer and B. McKay 3
input is a graph. Only when considering structures with relations of higher arity than 2 as
input, better lower bounds on the iteration number of k-WL have been proved [4].
For k > 2, regarding upper bounds on the iteration number of k-WL, without further
knowledge about the input graph, no significant improvements over the trivial upper bound
nk − 1 are known.1 Still, when the input graph has bounded treewidth or is a 3-connected
planar graph, polylogarithmic upper bounds on the iteration number of k-WL needed to
identify the graph are known [17, 35].
Although for every natural number k, there are non-isomorphic graphs that are not
distinguished by k-WL [5], it is known that for every graph class with a forbidden minor, a
sufficiently high-dimensional Weisfeiler-Leman algorithm correctly decides isomorphism [13].
Recent results give new upper bounds on the dimension needed for certain interesting graph
classes [15, 16]. A closely-related direction of research investigates what properties the
Weisfeiler-Leman algorithm can detect in graphs [1, 8, 10].
2 Preliminaries
By N, we denote the set of natural numbers, i.e. {1, 2, . . . }. We set N0 := N ∪ {0} and, for
k, ` ∈ N0 , we define [k, `] := {n ∈ N0 | k ≤ n ≤ `} and [k] := [1, k]. For a set S, a partition of
S is a set Π of non-empty sets such that M ∈Π M = S and for all M, M 0 ∈ Π with M 6= M 0 ,
S
it holds that M ∩ M 0 = ∅. For two partitions Π and Π0 of the same set S, we say that Π0
is finer than Π (or Π0 refines Π) if every element of Π0 is a (not necessarily proper) subset
of an element of Π. We write Π Π0 (and equivalently Π0 Π) to express that Π0 is finer
than Π. Concurrently, we say that Π is coarser than Π0 . If both Π Π0 and Π0 Π hold,
we denote this by Π ≡ Π0 .
For S 6= ∅, the partition {S} is the unit partition of S. The partition {s} | s ∈ S is
called the discrete partition of S. A set of cardinality 1 is a singleton.
All graphs that we consider in this paper are finite and simple, i.e. undirected without
self-loops at vertices. For a graph with vertex set V (G) and edge set E(G), its order is
|G| := |V (G)|. For a vertex v ∈ V (G), we denote by N (v) the neighbourhood ofv in G,
i.e. the set {w | {v, w} ∈ E(G)}. Similarly, for a vertex set W , we set N (W ) := v | v ∈ /
W, ∃w ∈ W : v ∈ N (w) . The degree of a vertex v is deg(v) := |N (v)| (since the graph G
will be clear from the context, we do not need to include it in our notation). We also set
deg(G) := {deg(v) | v ∈ V (G)}. If there is a d ∈ N0 such that deg(G) = {d}, the graph G is
d-regular. A regular graph is a graph that is d-regular for some d ∈ N0 . By a matching, we
mean a 1-regular graph.
Let G be a graph with at least two vertices. If there are sets ∅ 6= P, Q ⊆ V (G) such that
V (G) = P ∪Q and P ∩Q = ∅ and E(G)∩{{v, w} | v, w ∈ P } = ∅ = E(G)∩{{v, w} | v, w ∈ Q},
then G is bipartite (on bipartition (P, Q)). If, additionally, {v, w} | v ∈ P, w ∈ Q = E(G),
the graph G is complete bipartite.
For k, ` ∈ N0 , a (k, `)-biregular graph (on bipartition (P, Q)) is a bipartite graph on
bipartition (P, Q) such that for every v ∈ P , it holds that |N (v)| = k, and for every w ∈ Q,
it holds that |N (w)| = `. A biregular graph is a graph G for which there are P, Q ⊆ V (G)
and k, ` ∈ N0 such that G is (k, `)-biregular on bipartition (P, Q).
For a graph G and a set V 0 ⊆ V (G), we let G[V 0 ] be the induced subgraph of G on V 0 , i.e.
the subgraph of G with vertex set V 0 and edge set E(G) ∩ {{v, w} | v, w ∈ V 0 }. We define
1
Note that the bound nk − 1 is not tight, since the initial partition of the k-tuples already has multiple
classes, for example, one consisting of all tuples of the form (v, v, . . . , v).
4 The Iteration Number of Colour Refinement
3 Colour Refinement
Colour Refinement proceeds by iteratively refining a partition of the vertices of its input
graph until the partition is stable with respect to the refinement criterion.
I Definition 3.1 (Colour Refinement). Let λ : V (G) → C be a colouring of the vertices of
a graph G, where C is some set of colours. The colouring computed by Colour Refinement
on input (G, λ) is defined recursively: we set χ0G := λ, i.e. the initial colouring is λ. For
i ∈ N, the colouring χiG computed by Colour
Refinement after i iterations on G is defined as
i i−1 i−1
χG (v) := χG (v), {{χG (w) | w ∈ N (v)}} .
That is, χiG (v) consists of the colour of v from the previous iteration as well as the
multiset of colours of neighbors of v from the previous iteration. It is not difficult to see
that π(χi−1 i
G ) π(χG ) holds for every graph G and every i ∈ N. Therefore, there is a unique
minimal integer j such that π(χjG ) ≡ π(χj+1G ). For this value j, we define the output of
Colour Refinement on input G to be χG := χjG and call χG and π(χG ) the stable colouring
and the stable partition, respectively, of G. Accordingly, executing i Colour Refinement
iterations on G means computing the colouring χiG . We call a graph G with colouring λ
and the induced partition π(λ) stable if π(λ) ≡ π(χG ). Note that for all P, Q ∈ π(χG ) with
P 6= Q, the graph G[P ] is regular and G[P, Q] is biregular.
Colour Refinement can be used to check whether two given graphs G and G0 are non-
isomorphic by computing the stable colouring on the disjoint union of the two. If there is a
colour C such that, in the stable colouring, the numbers of vertices of colour C differ in G
and G0 , they are non-isomorphic. However, even if they agree in every colour class size in
the stable colouring, the graphs might not be isomorphic. It is not trivial to describe for
which graphs this isomorphism test is always successful (see [2, 23]).
I Notation 3.2. We write WL1 (G) for the number of iterations of Colour Refinement on
input G, that is, WL1 (G) = j, where j is the minimal integer for which π(χjG ) = π(χj+1
G ).
Similarly, for n ∈ N, we write WL1 (n) to denote the maximum number of iterations that
Colour Refinement needs to reach stabilisation on an n-vertex graph.
We call every graph G with WL1 (G) = |G| − 1 a long-refinement graph.
I Fact 3.3. Let G be an uncoloured path with n vertices. Then WL1 (G) = b n−1
2 c.
Proof sketch. In the first iteration, the two end vertices are distinguished from all others
because they are the only ones with degree 1. Then in each iteration, the information of
being adjacent to a “special” vertex, i.e. the information about the distance to a vertex
of degree 1, is propagated one step closer to the vertices in the centre of the path. This
procedure takes b n−1
2 c iterations. J
S. Kiefer and B. McKay 5
In 2015, Krebs and Verbitsky improved on the explicit linear lower bound for graphs of
order n given by Fact 3.3 by constructing a family of pairs of graphs whose members of order
√
n can only be distinguished after n − 8 n Colour Refinement iterations (see [25, Theorem
4.6]). Hence, since for a set {v1 , . . . , vn } =: S and partitions π1 , . . . , π` of S that satisfy
π1 π2 · · · {v1 }, . . . {vn } = π` ,
it holds that ` ≤ n − 1, we obtain the following corollary.
√
I Corollary 3.4. For every n ∈ N, it holds that n − 8 n ≤ WL1 (n) ≤ n − 1.
It has remained open whether any of the two bounds is tight. In preliminary research
conducted together with Gödicke and Schweitzer, towards improving the lower bound, the
first author took up an approach to reverse-engineer the splitting of colour classes. Gödicke’s
implementation of those split procedures led to the following result.
I Theorem 3.5 ([11]). For every n ∈ {1, 10, 11, 12}, it holds that WL1 (n) = n − 1. For
n ∈ [2, 9], it holds that WL1 (n) < n − 1.
Unfortunately, due to computational exhaustion, it was not possible to test for larger
graph sizes. Also, the obtained graphs do not exhibit any structural properties that would
lend themselves for a generalisation in order to obtain larger graphs.
Using a fast implementation of Colour Refinement, we could verify that there are exactly
16 long-refinement graphs of order 10, 24 long-refinement graphs of order 11, 32 of order 12,
and 36 of order 13. However, again, with simple brute-force approaches, we could not go
beyond those numbers exhaustively.
The proposition implies that in order to find long-refinement graphs, we have to look
for graphs in which, in every Colour Refinement iteration, only one additional colour class
appears. That is, in each iteration, only one colour class is split and the splitting creates
exactly two new colour classes.
6 The Iteration Number of Colour Refinement
I Corollary 4.2. Let G be a long-refinement graph with at least two vertices. Then there
exist d1 , d2 ∈ N0 with d1 =
6 d2 and such that deg(G) = {d1 , d2 }.
Proof. This is a direct consequence of Proposition 4.1: every (monochromatic) regular graph
G satisfies WL1 (G) = 0 and if there were more than two vertex degrees present in G, we
would have |{χ1G (v) | v ∈ V (G)}| − |{χ0G (v) | v ∈ V (G)}| ≥ 3 − 1 ≥ 2. J
We can thus restrict ourselves to graphs with exactly two vertex degrees.
i
I Notation 4.3. For a graph G and i ∈ N0 , we let πG denote the partition induced by χiG
on V (G), i.e. after i Colour Refinement iterations on G. If G is clear from the context, we
omit it in the expression.
As a result of the regularity conditions that must hold for the graph G[V1 , V2 ], we make
the following observation. It implies that, in a long-refinement graph, to determine the class
C that is split in iteration i, it suffices to consider the neighbourhood of an arbitrary class
obtained in the preceding iteration.
I Lemma 4.4. Let G be a graph. Suppose there are i ∈ N and C1 , C2 , C 0 with π i \ π i−1 =
{C1 , C2 } and C 0 ∈ π i \ π i+1 . Then there are vertices v10 , v20 ∈ C 0 such that |N (v10 ) ∩ C1 | 6=
|N (v20 ) ∩ C1 |.
Proof. Note that there must be a C ∈ π i−1 \ π i with C1 ∪ C2 = C. Since C 0 ∈ π i and
C ∈ π i−1 , there is a d ∈ N0 such that for every v ∈ C 0 , it holds that d = |N (v) ∩ C|. Since
{C1 , C2 } = π i \ π i−1 and C 0 ∈/ π i+1 , there are vertices v10 , v20 ∈ C 0 such that |N (v10 ) ∩ C1 | =
6
|N (v2 ) ∩ C1 | or |N (v1 ) ∩ C2 | 6= |N (v20 ) ∩ C2 |. In the first case, we are done. In the second
0 0
Note that the validity of the lemma depends on the assumption {C1 , C2 } = π i \ π i−1 ,
which by Proposition 4.1 is always fulfilled in long-refinement graphs as long as π i−1 6≡ π i .
I Corollary 4.5. No graph with more than one connected component is a long-refinement
graph.
Proof. Since the refinement process takes place in parallel in each connected component,
WL1 (G) is the maximum of all W L1 (H) for the connected components H of G. J
We can therefore restrict ourselves to connected graphs. The only connected graphs G
with deg(G) = {1, 2} are paths and, by Fact 3.3, they are not long-refinement graphs. Thus,
the smallest degree pairs for a search for candidates are {1, 3} and {2, 3}.
I Lemma 4.6. Let G be a long-refinement graph. Then |{v ∈ V (G) | deg(v) = 1}| ≤ 2.
Proof. Suppose the lemma does not hold. Let G be a long-refinement graph with at least
three vertices of degree 1. Consider the execution of Colour Refinement on input G and let
n := |G|. In π 1 , there are two vertex colour classes, namely a class V1 containing the vertices
of degree 1 and a class Vd containing the vertices of the second vertex degree d 6= 1.
Suppose that |V1 | ≥ 2. The class V1 is not split before N (V1 ) has been split. Thus,
consider the iteration j after which N (V1 ) has been subdivided into two classes W1 and W2 .
This induces the splitting of V1 into N (W1 ) ∩ V1 and N (W2 ) ∩ V1 , which by Proposition 4.1
implies in particular that for all pairs of partition classes C, C 0 ∈ π j with C ∩V1 = ∅ = C 0 ∩V1 ,
the graph induced between the two classes is biregular. Therefore, however, now for every pair
of classes C, C 0 ∈ π j+1 , the graph G[C, C 0 ] is biregular and thus, the partition is equitable.
Hence, j = n − 2, i.e. the splitting of V1 must happen in the (n − 1)-st iteration. In particular,
N (W1 ) ∩ V1 and N (W2 ) ∩ V1 must be singletons, i.e. |V1 | = 2. J
S. Kiefer and B. McKay 7
Table 1 Adjacency lists of long-refinement graphs G with deg(G) = {1, 5} (left) and deg(G) =
{1, 3} (right), respectively.
Table 1 displays the adjacency lists of two long-refinement graphs on 12 and 14 vertices,
respectively, which each have exactly one vertex of degree 1.
The lemma allows us to reduce the decision problem whether there are infinitely many
long-refinement graph with degrees in {1, 2, 3} to the question whether there are such families
with degrees in {2, 3}.
I Corollary 4.7. If there is a long-refinement graph G with deg(G) = {1, 3}, then there is
also a long-refinement graph Ĝ with deg(Ĝ) = {2, 3} and |Ĝ| ∈ {|G| − 1, |G|}.
Proof. Let G be a long-refinement graph with deg(G) = {1, 3}. Then π 1 = {V1 , V3 }, where
V1 = {v ∈ V (G) | deg(v) = 1} and V3 = {v ∈ V (G) | deg(v) = 3}. By Lemma 4.6, it holds
that |V1 | ∈ {1, 2}.
First suppose |V1 | = 2. Consider the graph Ĝ with V (Ĝ) = V (G) and E(Ĝ) = E(G)∪{V1 },
i.e. obtained from G by inserting an edge between the two vertices in V1 . In the following,
we identify the vertices of Ĝ with their counterparts in G. For i ∈ N0 , let π̂ i be the partition
of V (Ĝ) induced by χiĜ . Let n := |G|. Then, for i ∈ [0, n − 1], it holds that
π̂ i = π i .
This follows from Ĝ − V1 = G − V1 and Ĝ[V1 , N (V1 )] = G[V1 , N (V1 )], the regularity of Ĝ[V1 ]
and that there is only one way to split V1 , which results in two singletons. In particular, it
holds that WL1 (Ĝ) = WL1 (G) = |Ĝ| − 1.
Now suppose |V1 | = 1. In π 1 , there are only the two partition classes V1 and V3 . In π 2 , the
set V3 is subdivided into the singleton N (V1 ) and V3 \ N (V1 ). Define Ĝ := G − V1 and again,
for i ∈ N0 , let π̂ i be the partition of V (Ĝ) induced by χiĜ . Then π̂ 1 = {N (V1 ), V3 \ N (V1 )} =
π 2 \ {V1 } and, more generally, for i ∈ N, we obtain π̂ i = π i+1 \ {V1 }. This can be deduced
from the equality Ĝ − V1 = G − V1 . Thus, WL1 (Ĝ) = (n − 1) − 1 = |Ĝ| − 1. J
With the help of the tool Nauty [28], our quest for long-refinement graphs was successful.
We tested exhaustively up to order 13. To render the search for larger long-refinement
graphs tractable, we imposed further conditions. Restricting the degrees to {2, 3}, it was
possible to test for graphs up to order 64. Altogether, we found graphs G with n − 1 Colour
Refinement iterations, where n = |G|, for all even n ∈ [10, 64] \ {24, 30, 42, 48, 60} and for all
odd n ∈ [11, 63] \ {21, 27, 39, 45, 57, 63}.2
2
We exclude the case n = 10 in the following analysis since, as our computational results have shown,
although long-refinement graphs of order 10 do exist, none of them has vertex degrees 2 and 3.
8 The Iteration Number of Colour Refinement
In the following, in order to generalise the results to bigger graph sizes, we analyse the
obtained graphs. Among our computational results, the even-size graphs G with vertex
degrees 2 and 3 have the following property in common: there is an iteration j such that
j
for every C ∈ πG , it holds that |C| = 2. That is, with respect to their assigned colours, the
vertices remain in pairs until there are no larger colour classes left. Then the first such pair
is split into singletons, which must induce a splitting of another pair, and so on, until the
discrete partition is obtained. (Similar statements hold for the odd-size graphs, but are more
technical.) In the following, a pair is a set of two vertices which occurs as a colour class
during the execution of Colour Refinement. That is, vertices v, v 0 form a pair if and only if
{v, v 0 } is an element of π i for some i ∈ N0 .
As just argued, there is a splitting order on the pairs, i.e. a linear order ≺ induced by
the order in which pairs are split into singletons. We now examine the possible connections
between pairs.
From now on, we make the following assumption.
I Assumption 4.8. G is a long-refinement graph with deg(G) = {2, 3} and such that there
is an i ∈ N0 for which π i contains only pairs. Let ≺ be the splitting order of these pairs.
We call pairs P1 , P2 ⊆ V (G) successive if P2 is the successor of P1 with respect to ≺. Note
that for successive pairs P1 , P2 , in the graph G[P1 , P2 ], every v2 ∈ P2 must have the same
number of neighbours in P1 , otherwise it would hold that P2 ≺ P1 . By a simple case analysis,
together with an application of Lemma 4.4, this rules out all connections but matchings for
successive pairs.
I Corollary 4.9. Let P1 and P2 be successive pairs. Then G[P1 , P2 ] is a matching.
Towards a compact representation of the graphs, we further examine the connections
between pairs P1 and P2 with S(P1 ) ≺ P2 , where S(P1 ) is the successor of P1 with respect
to ≺.
I Lemma 4.10. Let P1 be a pair. Then exactly one of the following holds.
P1 6= min(≺) and for every pair P2 with S(P1 ) ≺ P2 , it holds that E(G[P1 , P2 ]) = ∅.
P1 = min(≺) and there are exactly two choices P2 , P20 for a pair P 0 with S(P1 ) ≺ P 0 such
that E(G[P1 , P 0 ]) 6= ∅. Furthermore, there is a vertex v1 ∈ P1 such that G[{v1 }, P2 ] and
G[P1 \ {v1 }, P20 ] are complete bipartite and E(G[{v1 }, P20 ]) = E(G[P1 \ {v1 }, P2 ]) = ∅.
Proof. Suppose P1 = 6 min(≺). If P1 = max(≺), the statement trivially holds. Otherwise,
by Corollary 4.9, every vertex v1 ∈ P1 has exactly one neighbour in S(P1 ) and exactly one
neighbour in the predecessor of P1 , i.e. in the unique pair A(P1 ) such that P1 = S(A(P1 )).
Thus, due to the degree restrictions, v1 can have at most one additional neighbour in a pair
P 0 with P1 ≺ P 0 and P 0 6= S(P1 ). However, if v1 had a neighbour in such a P 0 , the graph
G[{v1 }, P 0 ] would not be biregular, implying that P 0 = S(P1 ), a contradiction. Therefore,
N (v1 ) ⊆ A(P1 ) ∪ P1 ∪ S(P1 ) and thus, N (P1 ) ⊆ A(P1 ) ∪ S(P1 ). In particular, for every pair
P2 with P1 ≺ P2 and P2 6= S(P1 ), it holds that E(G[P1 , P2 ]) = ∅.
Now suppose that P1 = min(≺). Since the splitting of P1 must be induced by a splitting
of a union of two pairs and G[P1 , S(P1 )] is biregular and G[P1 ] is regular, we cannot have
N (P1 ) ⊆ S(P1 ). Thus, there is a pair P2 with S(P1 ) ≺ P2 and such that E(G[P1 , P2 ]) 6= ∅.
Let v1 ∈ P1 be a vertex with N (v1 ) ∩ P2 6= ∅. Then P2 ⊆ N (v1 ), otherwise P2 = S(P1 ).
Thus, G[{v1 }, P2 ] is complete bipartite. Therefore and due to the degree restrictions, v1 has
exactly three neighbours: one in S(P1 ) and two in P2 . In particular, for every pair P20 with
P2 6= P20 6= S(P1 ), it holds that E(G[{v1 }, P20 ]) = ∅.
S. Kiefer and B. McKay 9
Let v10 6= v1 be the second vertex in P1 . Since the splitting of P1 induces the splitting of
S(P1 ), by Proposition 4.1, for every pair P 0 with P1 6= P 0 = 6 S(P1 ), the graph G[{v10 }, P 0 ]
must be biregular, i.e. either empty or complete bipartite.
Moreover, since deg(v1 ) = 3, also deg(v10 ) = 3. By Corollary 4.9, it holds that |N (v10 ) ∩
S(P1 )| = 1. Therefore, there is exactly one pair P20 such that G[{v10 }, P20 ] is complete bipartite
and for all other pairs P 0 with P1 6= P 0 6= S(P1 ), the graph G[{v10 }, P 0 ] is empty.
Suppose P20 = P2 . Choose i such that π i \ π i+1 = {P1 }. Then the unique element
in π i−1 \ π i is a union of two pairs, whose splitting induces the splitting of P1 . However,
N (P1 ) = S(P1 ) ∪ P2 and both graphs G[P1 , S(P1 )] and G[P1 , P2 ] are biregular.
Thus, P20 6= P2 , which concludes the proof. J
Corollary 4.9 and Lemma 4.10 characterise G[P1 , P2 ] for all pairs P1 6= P2 . Thus, all
additional edges must be between vertices from the same pair. Hence, we can use the
following compact graphical representation to fully describe the graphs of order at least 12
that we found. As the set of nodes, we take the pairs. We order them according to ≺ and
connect successive pairs with an edge representing the matching. If the two vertices of a pair
are adjacent, we indicate this with a loop at the corresponding node. The only other type of
connection between pairs is constituted by the edges from min(≺) to two other pairs which
form the last colour class of size 4, i.e. a colour class of size 4 in the partition π i for which
π i+1 \ π i+2 = {min(≺)}. We indicate this type of edge with a dotted curve.
An example graph as well as the evolution of the colour classes computed by Colour
Refinement on the graph is depicted in Figure 1.
Figure 1 Top left: A long-refinement graph G on 32 vertices. The subsequent pictures show the
partitions of V (G) after the first 15 Colour Refinement iterations. There are 16 further iterations
not depicted here, which consist in the splitting of the pairs into singletons.
I Notation 4.11. Since ≺ is a linear order, we can also use a string representation to fully
describe the graphs. For this, we introduce the following notation, letting A(P ) and S(P ) be
the predecessor and successor of P , respectively, with respect to ≺.
0 represents a pair of vertices of degree 2.
1 represents a pair P of vertices of degree 3 that is not the minimum of ≺ and for which
N (P ) ⊆ A(P ) ∪ S(P ). (This implies that P ∈ E(G).)
10 The Iteration Number of Colour Refinement
X represents a pair P of vertices of degree 3 that is not the minimum of ≺ and for which
N (P ) 6⊆ A(P ) ∪ S(P ).
S represents the minimum of ≺.
Thus, by Lemma 4.10, there are exactly two pairs of type X, namely P2 and P20 from the
lemma. Now we can use the alphabet Σ = {0, 1, S, X} and the order ≺ to encode the graphs
as strings. The i-th letter of a string is the i-th element of ≺. Note that S is always a pair
of non-adjacent vertices of degree 3 due to the degree restrictions. For example, the string
representation for the graph in Figure 1 is S11100111X1X1110.
Formally, for every ` ≥ 3 and every string Ξ : [`] → {0, 1, S, X} with Ξ(1) = S and
Ξ−1 (X) = {r, r0 } for some 0 0
r, r ∈ [`] with r < r , we define the corresponding graph
G := G(Ξ) with V (G) = vi,j i ∈ [`], j ∈ [2] and
E(G) = {vi,1 , vi,2 } i ∈ [`], Ξ(i) = 1 ∪
{vi,j , vi+1,j } i ∈ [` − 1], j ∈ [2] ∪
{vr,j , v1,1 } j ∈ [2] ∪
{vr0,j , v1,2 } j ∈ [2] .
We use this encoding in the next section, which contains our main results.
Figure 2 A visualisation of the graph with string representation S011XX and the evolution of
the colour classes in the first 5 Colour Refinement iterations on the graph.
I Theorem 5.1. For every string Ξ contained in the following sets, the graph G(Ξ) is a
long-refinement graph.
{S011XX}
{S1k 001k X1X1k 0 | k ∈ N0 }
{S1k 11001k XX1k 0 | k ∈ N0 }
{S1k 0011k XX1k 10 | k ∈ N0 }
{S011(011)k 00(110)k XX(011)k 0 | k ∈ N0 }
{S(011)k 00(110)k 1X0X1(011)k 0 | k ∈ N0 }
S. Kiefer and B. McKay 11
Proof. Let G := G(S011XX) (cf. Figure 2). The vertices v2,1 and v2,2 are the only ones of
degree 2. Thus,
π 2 = {v2,1 , v2,2 }, {vi,j | i ∈ {1, 3}, j ∈ [2]}, {vi,j | i ∈ [4, 6], j ∈ [2]} .
Then
π 3 = {v1,1 , v1,2 }, {v2,1 , v2,2 }, {v3,1 , v3,2 }, {vi,j | i ∈ [4, 6], j ∈ [2]} ,
since the vertices in the S-pair have no neighbours in {vi,j | i ∈ {1, 3}, j ∈ [2]}. Similarly,
π 4 = {v1,1 , v1,2 }, {v2,1 , v2,2 }, {v3,1 , v3,2 }, {v4,1 , v4,2 }, {vi,j | i ∈ [5, 6], j ∈ [2]} ,
Now the splitting of the last colour class of size 4 into two X-pairs induces the splitting of
the S-pair into singletons, which is propagated linearly according to ≺, adding 6 further
iterations, thus summing up to 11 iterations.
We now consider the various infinite families of graphs. The proofs for them work
similarly by induction over k. Therefore, we only present the full detailed proof for the family
{S1k 001k X1X1k 0 | k ∈ N0 }, which includes the graph from Figure 1.
For k = 0, the graph G0 := G(S00X1X0) has 14 vertices. It is easy to verify that it
indeed takes 13 Colour Refinement iterations to stabilise. We sketch how Colour Refinement
processes the graph: for this, for i ∈ N0 , we let π0i denote the partition of V (G0 ) induced by
χiG0 , i.e. after i iterations of Colour Refinement on G0 . First, vertices are assigned colours
indicating their degrees. That is,
Now
π02 = {vi,j | i ∈ {2, 3, 7}, j ∈ [2]}, {vi,j | i ∈ {1, 4, 6}, j ∈ [2]}, {v5,i | i ∈ [2]} ,
since the vertices contained in the 1-pair are not adjacent to vertices from 0-pairs. Since no
vertex contained in the S-pair is adjacent to any vertex from the 1-pair, we obtain
π03 = {vi,j | i ∈ {2, 3, 7}, j ∈ [2]}, {vi,j | i ∈ {4, 6}, j ∈ [2]}, {v5,i | j ∈ [2]},
{v1,j | j ∈ [2]} .
Furthermore,
π04 = {vi,j | i ∈ {3, 7}, j ∈ [2]}, {vi,j | i ∈ {4, 6}, j ∈ [2]}, {v5,i | j ∈ [2]},
{v1,j | j ∈ [2]}, {v2,j | j ∈ [2]} ,
π05 = {v7,j | j ∈ [2]}, {vi,j | i ∈ {4, 6}, j ∈ [2]}, {v5,i | j ∈ [2]},
{v1,j | j ∈ [2]}, {v2,j | j ∈ [2]}, {v3,j | j ∈ [2]} ,
π06 = {vi,j | j ∈ [2]} | i ∈ [7] ,
i.e. with respect to the order ≺ induced by the string representation, the first 0-pair, the
second 0-pair and the first X-pair are separated from the others. Once the two X-pairs
form separate colour classes, this induces the splitting of S into two singletons, which is
12 The Iteration Number of Colour Refinement
propagated linearly through the entire string, adding 7 further iterations, thus summing up
to 13 iterations.
For general k ≥ 1, let Gk := G(S1k 001k X1X1k 0). To count the iterations of Colour
Refinement, we introduce some vocabulary for the pairs in Gk (see also Figure 1). We let
V := {vi,j | i ∈ [2, k+1]∪[k+4, 2k+3]∪[2k+7, 3k+6], j ∈ [2]}. Note that V is the set of vertices
contained in the subgraphs corresponding to the substrings 1k in the string representation.
Furthermore, for all i ∈ [k + 2], we call the set {vi0 ,j | i0 ∈ {i, 2k + 5 − i, 2k + 5 + i}, j ∈ [2]}
the i-th column and denote it by Vi . The 0-th column is the set {v2k+5,j | j ∈ [2]}. Thus,
[
V = Vi .
2≤i≤k+1
For every j ∈ [2], the sets {vi,j | i ∈ [1, k + 2]}, {vi,j | i ∈ [k + 3, 2k + 4]}, and {vi,j | i ∈
[2k + 6, 3k + 7]} are called rows. In accordance with Figure 1, we fix an ordering on the rows:
the first row is {vi,1 | i ∈ [2k + 6, 3k + 7]}, the second row is {vi,2 | i ∈ [2k + 6, 3k + 7]}, . . . ,
the sixth row is {vi,2 | i ∈ [1, k + 2]}. To be able to refer to the vertices in V and the adjacent
columns more easily, we relabel them: for i ∈ [k + 2], j ∈ [6], the vertex wi,j is defined to be
the unique vertex in the i-th column and the j-th row.
The following observation is the crucial insight for counting the iterations of Colour
Refinement on Gk . We will use it to show that, informally stated, the subgraph Gk [V ] delays
the propagation of the splitting of the colour classes in the remainder of the graph by k
iterations whenever the splitting of a colour class contained in V1 or Vk+2 initiates a splitting
of a colour class contained in V .
Claim 1. Consider a colouring λ of Gk and its induced partition πk of V (Gk ). For t ∈ N0 , let
πkt be the partition induced by χtGk on input (Gk , λ). Suppose Gk , λ, πk satisfy the following
conditions.
S
1. There exist ` ∈ [6] and I1 , . . . , I` ⊆ [6] such that i∈[`] Ii = [6] and Ii ∩ Ij = ∅ for
1 ≤ i < j ≤ ` and for every i ∈ [`], it holds that {wk+2,j 0 | j 0 ∈ Ii } ∈ πk0 . That is, Vk+2 is
a union of colour classes with respect to λ.
2. π 1 = {wk+1,j 0 | j 0 ∈ Ii } i ∈ [`] ∪ C \ Vk+1 C ∈ πk , C \ Vk+1 6= ∅ .
k
3. For all C, C 0 ⊆ Vk+1 with C, C 0 ∈ πk1 , the graph Gk [C] is regular and Gk [C, C 0 ] is
biregular.
Then for every t ∈ [k], it holds that
[
πkt = {wk+2−i0 ,j 0 | j 0 ∈ Ii } i ∈ [`] ∪
i0 ∈[t]
[ [
Vk+2−i0 C ∈ πk0 , C \
C\ Vk+2−i0 6= ∅ .
i0 ∈[t] i0 ∈[t]
Proof. We show the claim via induction. For t = 1, the statement is exactly the second item
from the assumptions. For the inductive step, suppose the statement holds for all t0 ≤ t for
some t ∈ [k − 1]. We show that it also holds for t + 1.
Note that the right-hand side of the equation is a partition of V (Gk ). Thus, it suffices to
show “⊇”, i.e. that the right-hand side is contained in the left-hand side of the equation.
Since Vk+2 is a union of elements of πk0 , it is also a union of elements of πkt . Thus, by the
S. Kiefer and B. McKay 13
induction hypothesis,
⊆ πkt . (1)
Similarly, using the induction hypothesis for t − 1, all other Vi with i ≥ k + 2 − (t − 1) are
unions of elements of πkt−1 (if t = 1, this holds trivially). Thus,
[ [
t−1 t−1
C C ∈ πk , C ∩
Vk+2−i0 6= ∅ = C C ∈ πk , C ⊆
Vk+2−i0
i0 ∈[t−1] i0 ∈[t−1]
[
{wk+2−i0 ,j 0 | j 0 ∈ Ii } i ∈ [`]
=
i0 ∈[t−1]
⊆ πkt .
This means that all elements from πkt−1 that have a non-empty intersection with
the set i0 ∈[0,t−1] Vk+2−i0 are also present in πkt . Therefore, for all C, C 0 ∈ πkt with
S
and Gk [C, C 0 ] must be biregular. (Otherwise, at least one of these classes would have been
split in the t-th iteration.)
Actually, this also holds when relaxing the restriction for C 0 to have a non-empty
S
intersection with i0 ∈[0,t] Vk+2−i0 . Indeed, Gk [Vk+2−t , Vk+2−(t−1) ] is a matching between
vertices contained in equal rows and, by the induction hypothesis for t, it holds that
{C | C ∈ πkt , C ⊆ Vk+2−(t−i0 ) } = {wk+2−(t−i0 ),j 0 | j 0 ∈ Ii } | i ∈ [`] for i0 ∈ {0, 1}. Thus,
for all C, C 0 ∈ πkt with C ⊆ Vk+2−(t−1) and C 0 ⊆ Vk+2−t , the graph Gk [C, C 0 ] is either a
perfect matching or empty. In particular, it is biregular.
Note that there are no edges between vertices from columns Vi , Vi0 with |i−i0 | ≥ 2. Hence,
in fact, for every C ∈ πkt with C ∩ i0 ∈[0,t−1] Vk+2−i0 6= ∅ and for all C 0 ∈ πkt , the graph Gk [C]
S
is regular and Gk [C, C 0 ] is biregular. Thus, every C ∈ πkt with C ∩ i0 ∈[0,t−1] Vk+2−i0 6= ∅ is
S
We have seen that for C, C 0 ∈ πkt with C ⊆ i0 ∈[0,t−1] Vk+2−i0 and C 0 ⊆ Vk+2−t , the
S
graph G[C] is regular and G[C, C 0 ] is biregular. We now show that we can actually relax
S
the location restriction for C to C ⊆ i0 ∈[0,t] Vk+2−i0 . For this, note that all subgraphs
Gk [Vi ] with i ∈ [2, k + 1] have the same structure. That is, for all r, s ∈ [6] and all
i, j ∈ [2, k + 1], the vertices in Vi in rows r and s are adjacent if and only if the corresponding
ones are adjacent in Vj . Furthermore, by the induction assumption for t, it holds that
{C | C ∈ πk , C ⊆ Vi } = {wi ,j | j ∈ Ii } | i ∈ [`] for i0 ∈ {k + 2 − t, k + 1}. Thus,
0
t
0 0 0
by Conditions (2) and (3) from the prerequisites of the claim, for all C, C 0 ⊆ Vk+2−t with
C, C 0 ∈ πkt , the graph Gk [C] is regular and Gk [C, C 0 ] is biregular.
In order to determine the colour classes of χt+1 Gk contained in Vk+2−t , we still need to
analyse the structure of the graph Gk [Vk+1−t , Vk+2−t ] with respect to πkt . To this end, for
14 The Iteration Number of Colour Refinement
t−1
{wk+1−t,j 0 | j 0 ∈ Ii } | i ∈ [`] .
Mk+1−t
However, the induction assumption yields Mji−1 = Mji for all i ∈ [t], j ∈ [0, k + 1 − t]. In
particular, the partition of Vk+1−t induced by πkt is not strictly finer than the one induced
by πkt−1 . Thus,
t
{wk+1−t,j 0 | j 0 ∈ Ii } | i ∈ [`] .
Mk+1−t (4)
Moreover, since Vk+1−t ⊆ N (Vk+2−t ) and Vk+2−t is a union of elements of πkt , the column
Vk+2−(t+1) = Vk+1−t must be a union of elements of πkt+1 . Hence,
which there is a C ∈ πk0 with u, v ∈ C, it holds that |N (u) ∩ C 0 | = |N (v) ∩ C 0 |. (To see this,
Sk+1−t
recall that N (u), N (v) ⊆ i=0 Vi .) This implies χt+1 t+1
Gk (u) = χGk (v). Together with (3),
this yields that
[ [
C\ 0
Vk+2−i0 C ∈ πk , C \
Vk+2−t 6= ∅ ⊆ πkt+1 ,
i0 ∈[t+1] i0 ∈[t+1]
We call the property described in the claim path propagation from right to left. The
proof for the following statement, path propagation from left to right, is completely analogous.
Therefore, we skip it.
Claim 2. Consider a colouring λ of Gk and its induced partition πk of V (Gk ). For
t ∈ N0 , let πk be the partition induced by χtGk on input (Gk , λ). Suppose Gk , λ, πk satisfy
t
2. πk1 = {w2,j 0 | j 0 ∈ Ii } i ∈ [`] ∪ C \ V2 C ∈ πk , C \ V2 6= ∅ .
3. For all C, C 0 ⊆ V2 with C, C 0 ∈ πk1 , the graph Gk [C] is regular and Gk [C, C 0 ] is biregular.
Then for every t ∈ [k], it holds that
[
πkt = {wi0 +1,j 0 | j 0 ∈ Ii } i ∈ [`]
i0 ∈[t]
[ [
∪ C\ Vi0 +1 C ∈ πk0 , C \ Vi0 +1 6= ∅ . y
i0 ∈[t] i0 ∈[t]
We are now ready to analyse the run of Colour Refinement on input Gk . Recall that π0t
denotes the partition induced by χtG0 on V (G0 ) ⊆ V (Gk ). For the following arguments, see
also Figure 1.
In πk1 , the vertices are distinguished according to their degrees. We can then use path
propagation from right to left to deduce that
and πk2k+5 = π05 ∪ (πk2k+4 \ π04 ). Again using path propagation from right to left, we get that
In the fifth and sixth family, there are more 0-pairs. We sketch the splitting. In those
families, in π 1 , there is one partition class formed by all vertices contained in 0-pairs. The
second partition class contains all other vertices. In π 2 , the vertices contained in the two
adjacent 0-pairs as well as the vertices contained in max(≺) (i.e. the rightmost 0) form a
separate partition class, while the class consisting of all vertices not contained in 0-pairs is
not split. Now, like in the other families, those three 0-pairs take up the role of Vk+2 , initiate
the first path propagation and the proof proceeds similarly as above. J
I Corollary 5.3. For every even n ∈ N≥12 such that n = 12 or n mod 18 ∈/ {6, 12}, there is a
long-refinement graph G with |G| = n. The graph G can be chosen to satisfy deg(G) = {2, 3}.
Proof. The string representation S011XX covers n = 12. The first infinite family from
Theorem 5.1 covers all even n ∈ N≥14 with n = 2 mod 6, i.e. with n mod 18 ∈ {2, 8, 14}.
The second and the third infinite family both cover all even n ∈ N≥16 with n = 4 mod 6, i.e.
with n mod 18 ∈ {4, 10, 16}. The fourth and the fifth infinite family cover all even n ∈ N≥18
with n = 0 mod 18. Thus, among the even graph orders larger than 10, only the ones with
n mod 18 ∈ {6, 12} remain not covered. J
We now turn to the long-refinement graphs G of odd order with vertex degrees in {2, 3}. If
the graph has odd size, we cannot represent it just with pairs. For this, we relax Assumption
4.8 as follows.
I Assumption 5.4. G is a long-refinement graph with deg(G) = {2, 3} and such that there
is an i ∈ N0 for which π i contains only pairs and at most one singleton.
We maintain the vocabulary and notation from the long-refinement graphs of even order,
i.e. 0, 1, S, X will be used in the same way as before. However, in order to fully describe
the odd-size graphs via strings, we have to extend the string alphabet by fresh letters 1̂
and X̂, which represent particular pairs with attached vertices as follows. For a string
Ξ̂ : [`] → {0, 1, S, X, 1̂, X̂}, we define the base string Ξ as the string obtained by removing
hats, i.e. by replacing every 1̂ with a 1 and every X̂ with an X. Let I(Ξ̂) ⊆ [`] be the set of
positions i with Ξ̂(i) ∈ {1̂, X̂}. If in the base graph G(Ξ), every vertex pair corresponding to
a position in I(Ξ̂) (a hat vertex pair) is adjacent, we call Ξ̂ a hat string.
Similarly as for the even-size long-refinement graphs, to every hat string Ξ̂, we assign
a graph G(Ξ̂). We obtain the graph G(Ξ̂) by subdividing in G(Ξ) each edge connecting
a hat vertex pair with a new fresh vertex, which we call a hat. For a hat v̂, we call the
neighbourhood N (v̂) ⊆ V G(Ξ̂) the hat base of v̂. Note that every vertex in the hat base
has degree 3 since it already has degree 3 in G(Ξ) (cf. Notation 4.11). Also, a hat always has
degree 2 and thus, with respect to χ1G(Ξ̂) , it has a different colour than its hat base.
Graphically, we represent a hat with a loop attached to the corresponding hat vertex pair,
which we subdivide by inserting a small vertex that represents the hat (see Figure 3). It is
not difficult to see that every graph G(Ξ̂) corresponding to a hat string Ξ̂ has exactly one
hat and that the hat is the first vertex forming a singleton colour class during the execution
of Colour Refinement on G(Ξ̂). Thus, |G(Ξ̂)| = |G(Ξ)| + 1.
I Theorem 5.5. For every string Ξ̂ contained in the following sets, the graph G(Ξ̂) is a
long-refinement graph.
{S1̂11XX}
{S0X1X̂} ∪ {S1k 1011k X1X1k 1̂ | k ∈ N0 }
S. Kiefer and B. McKay 17
Figure 3 A visualisation of the graph with string representation S1̂11XX and the evolution of
the colour classes in the first 6 Colour Refinement iterations on the graph.
Proof sketch. The proof techniques for the infinite families are very similar to the ones
presented for the families from Theorem 5.1. Therefore, we only sketch the proof on two
concrete examples containing 1̂ and X̂, respectively. To be able to refer to vertices explicitly,
recall the formal definition of G := G(Ξ) from Section 4. Since for a hat string Ξ̂, it holds
that |G(Ξ̂)| = |G(Ξ)| + 1, we can use the same indexing of vertices, additionally letting v̂ be
the unique hat in G(Ξ̂).
In the graph G := G(S1̂11XX) (cf. Figure 3), the only vertex of degree 2 is v̂. Thus, in
π 1 , it forms a singleton colour class. In π 2 , the hat base {v2,1 , v2,2 } forms a new colour class.
i i−1 0
In general, for i ∈ N, we have that πG = πG 0 ∪ {v̂}, where G = G(S011XX) (cf. Theorem
π 3 = {v̂}, {v2,1 , v2,2 }, {vi,j | i ∈ {1, 3}, j ∈ [2]}, {vi,j | i ∈ [4, 6], j ∈ [2]} ,
π 4 = {v̂}, {v1,1 , v1,2 }, {v2,1 , v2,2 }, {v3,1 , v3,2 }, {vi,j | i ∈ [4, 6], j ∈ [2]} ,
π 5 = {v̂}, {v1,1 , v1,2 }, {v2,1 , v2,2 }, {v3,1 , v3,2 }, {v4,1 , v4,2 }, {vi,j | i ∈ [5, 6], j ∈ [2]} ,
Now in 6 further iterations, the splitting of the pairs is propagated linearly according to
the order ≺.
Next, let G be the graph G(S0X1X̂), i.e. a member of the first infinite family. It has
three vertices of degree 2, namely v̂, v2,1 , and v2,2 , which therefore form a colour class in
π 1 . Also, the vertices contained in the 1-pair are the only vertices of degree 3 that are not
18 The Iteration Number of Colour Refinement
π 2 = {v̂, v2,1 , v2,2 }, {v4,1 , v4,2 }, V (G) \ {v̂, v2,1 , v2,2 , v4,1 , v4,2 } ,
and similarly,
π 3 = {v̂, v2,1 , v2,2 }, {v4,1 , v4,2 }, {v1,1 , v1,2 }, {vi,j | i ∈ {3, 5}, j ∈ [2]} .
Now the hat forms a singleton since, in contrast to the vertices of the 0-pair, it is not adjacent
to any vertex in the S-pair. We obtain:
π 4 = {v̂}, {v2,1 , v2,2 }, {v4,1 , v4,2 }, {v1,1 , v1,2 }, {vi,j | i ∈ {3, 5}, j ∈ [2]} ,
Then in 5 further iterations, the splitting of the pairs is propagated linearly according to the
order ≺. J
I Corollary 5.6. For every odd n ∈ N≥11 with n mod 18 ∈/ {3, 9}, there is a long-refinement
graph G with |G| = n. The graph G can be chosen to satisfy deg(G) = {2, 3}.
Proof. The string representation S1̂11XX covers n = 13. The first infinite family covers
all odd n ∈ N≥11 with n = 5 mod 6, i.e. with n mod 18 ∈ {5, 11, 17}. The second and
the third infinite family both cover all all odd n ∈ N≥13 with n = 1 mod 6, i.e. with
n mod 18 ∈ {1, 7, 13}. The fourth infinite family covers all odd n ∈ N≥15 with n = 15
mod 18. Thus, among the odd orders larger than 10, only the ones with n mod 18 ∈ {3, 9}
family are skipped. J
I Corollary 5.7. For every n ∈ N≥11 such that n = 12 or n mod 18 ∈/ {3, 6, 9, 12}, there is a
long-refinement graph G with |G| = n. The graph G can be chosen to satisfy deg(G) = {2, 3}.
I Lemma 5.8. Let n ∈ N be arbitrary. Suppose there is a long-refinement graph G such that
|G| = n. If there is a d ∈ N with deg(G) = {d, d + 1} such that |{v ∈ V (G) | deg(v) = d}| =
6
d + 1, then there is also a long-refinement graph G0 with |G0 | = n + 1.
Proof. We can insert an isolated vertex w into G and insert edges from w to every vertex
v ∈ V (G) with deg(v) = d. In the new graph G0 , the vertex w has a degree other than d + 1,
1
whereas all other vertices have degree d + 1. Thus, the colour classes in πG 0 are {w} and
0
V (G ) \ {w}. After the second iteration, the neighbours of w are distinguished from all other
1
vertices, just
i i−1
are in πG . Inductively, it is easy to see that for i ∈ N, it holds that
like they
πG0 = πG ∪ {w} . Thus, Colour Refinement takes n − 1 + 1 = n iterations to stabilise on
G0 . J
I Corollary 5.9. For every odd n ∈ N≥11 , there is a long-refinement graph G with |G| = n.
Proof. By Corollary 5.6, it suffices to provide long-refinement graphs of order n for every
odd n ∈ N≥11 with n mod 18 ∈ {3, 9}. We will accomplish this by applying Lemma 5.8
to suitable graphs of orders n0 with n0 mod 18 ∈ {2, 8}. Every graph G with a string
representation contained in one of the infinite families from Theorem 5.1 has an even number
of vertices of degree 2. In particular, it satisfies |{v ∈ V (G) | deg(v) = 2}| =
6 3. Furthermore,
every even graph order larger than 10 not covered by Corollary 5.3 is a multiple of 6.
S. Kiefer and B. McKay 19
Hence, since N := {n ∈ N≥18 | n mod 18 ∈ {2, 8}} contains only even numbers and no
multiples of 6, for every n0 ∈ N , there is a graph of order n0 that satisfies the prerequisites of
Lemma 5.8 with d = 2. (Actually, we can cover all of these graph orders with the family
{S1k 001k X1X1k 0 | k ∈ N0 }.)
Thus, applying the lemma, we can construct for every n ∈ N≥18 with n mod 18 ∈ {3, 9}
a long-refinement graph G0 . J
Note that, since we apply Lemma 5.8 to close the gaps, we cannot guarantee anymore
that the vertex degrees are 2 and 3, as we could in Corollary 5.7.
We are ready to prove Theorem 1.1.
Proof of Theorem 1.1. The theorem follows from combining Corollaries 5.3 and 5.9 with
Theorem 3.5. J
Although the corollary leaves some gaps, we can deduce a new lower bound on the
number of Colour Refinement iterations until stabilisation, which is optimal up to an additive
constant of 1.
Proof of Theorem 1.2. By Theorem 1.1, we only need to consider the numbers n ≥ 24 for
which n mod 18 ∈ {6, 12}. Since these numbers are all even, for every such n, we can use
Corollary 5.9 to obtain a long-refinement graph G0 with |G0 | = n − 1. Let v be a fresh vertex
not contained in V (G0 ). Now define G := (V, E) with V := V (G0 ) ∪ {v} and E := E(G0 ).
i i 0
Then πG = πG 0 ∪ {{v}} for every i ∈ N. In particular, WL1 (G) = WL1 (G ) = n − 2. J
6 Conclusion
With Theorem 1.2, it holds for all n ∈ N≥10 that WL1 (n) ≥ n − 2. In particular, this proves
that the trivial upper bound WL1 (n) = n − 1 is tight, up to an additive constant of 1.
For infinitely many graph sizes, the graph G can even be chosen to have vertex degrees 2
and 3, as Theorems 5.1 and 5.5 show. We applied Lemma 5.8 to cover some of the remaining
sizes. However, no order n ∈ N≥18 with n mod 18 ∈ {6, 12} is covered by Theorem 5.1. Also,
for |G| ∈ {n ∈ N≥18 | n mod 18 ∈ {5, 11}}, all the long-refinement graphs G we have found
satisfy |{v ∈ V (G) | deg(v) = 2}| = 3 (see the first infinite family in Theorem 5.5). Thus,
these graphs do not satisfy the prerequisites of Lemma 5.8. Note that we cannot apply
the construction from the proof if |{v ∈ V (G) | deg(v) = d}| = d + 1, since the new graph
G0 would be (d + 1)-regular and would thus satisfy WL1 (G0 ) = 0. Hence, it is not clear
how to apply our techniques to construct a long-refinement graph of order 24. In fact, our
computational results yield that there are no long-refinement graphs with 24 vertices and
maximum degree 3. Altogether, the values n ∈ N≥24 with n mod 18 ∈ {6, 12} are precisely
the graph orders for which it remains open whether there is a graph G with WL1 (G) = |G|−1.
A related question is for which values d1 6= d2 there are long-refinement graphs G with
deg(G) = {d1 , d2 }. It would be nice to know whether we have actually found all long-
refinement graphs G with deg(G) = {2, 3}. Similarly, we can ask for long-refinement graphs
when fixing other parameters. For example, since all long-refinement graphs that we found
have girth at most 4, it would be interesting to know whether there exists any long-refinement
graph with larger girth, or infinite families with unbounded girth.
Also, in the light of the graph isomorphism problem, it is a natural follow-up task to
find for each order n pairs of non-isomorphic graphs G, H for which it takes n − 1 Colour
Refinement iterations to distinguish the graphs from each other. A first step towards this
goal is the search for pairs of long-refinement graphs of equal order. It is easy to see that
20 The Iteration Number of Colour Refinement
for infinitely many n, Theorems 5.1 and 5.5 yield such pairs of graphs. Still, for example,
when evaluating the colourings computed by Colour Refinement on the graphs with string
representations S1100XX0 and S001XX10, they differ after less than n − 1 iterations. To
see this, observe that in G(S1100XX0), all vertices of degree 2 have paths of length 3 to
a vertex of degree 2 whose inner vertices only have degrees other than 2. This is not the
case for G(S001XX10) and this property is detected by Colour Refinement after at most 4
iterations.) Thus, finding for n ∈ N≥10 two graphs of order n which Colour Refinement only
distinguishes after n − 1 iterations remains a challenge.
References
1 Vikraman Arvind, Frank Fuhlbrück, Johannes Köbler, and Oleg Verbitsky. On Weisfeiler-
Leman invariance: Subgraph counts and related graph properties. In Fundamentals of
Computation Theory – 22nd International Symposium, FCT 2019, Copenhagen, Denmark,
August 12–14, 2019, Proceedings, pages 111–125, 2019. URL: https://doi.org/10.1007/
978-3-030-25027-0_8.
2 Vikraman Arvind, Johannes Köbler, Gaurav Rattan, and Oleg Verbitsky. Graph isomorphism,
color refinement, and compactness. Computational Complexity, 26(3):627–685, 2017. URL:
https://doi.org/10.1007/s00037-016-0147-6.
3 László Babai, Paul Erdős, and Stanley M. Selkow. Random graph isomorphism. SIAM Journal
on Computing, 9(3):628–635, 1980. URL: https://doi.org/10.1137/0209047.
4 Christoph Berkholz and Jakob Nordström. Near-optimal lower bounds on quantifier depth and
Weisfeiler-Leman refinement steps. In Proceedings of the 31st Annual ACM/IEEE Symposium
on Logic in Computer Science, LICS ’16, New York, NY, USA, July 5-8, 2016, pages 267–276,
2016. URL: https://doi.org/10.1145/2933575.2934560.
5 Jin-yi Cai, Martin Fürer, and Neil Immerman. An optimal lower bound on the number
of variables for graph identifications. Combinatorica, 12(4):389–410, 1992. URL: https:
//doi.org/10.1007/BF01305232.
6 Alain Cardon and Maxime Crochemore. Partitioning a graph in O(|A| log |A|). Theoretical
Computer Science, 19:85–98, 1982. URL: https://doi.org/10.1016/0304-3975(82)90016-0.
7 Paul T. Darga, Mark H. Liffiton, Karem A. Sakallah, and Igor L. Markov. Exploiting
structure in symmetry detection for CNF. In Proceedings of the 41th Design Automation
Conference, DAC 2004, San Diego, CA, USA, June 7-11, 2004, pages 530–534. ACM, 2004.
URL: https://doi.org/10.1145/996566.996712.
8 Frank Fuhlbrück, Johannes Köbler, and Oleg Verbitsky. Local WL invariance and hidden shades
of regularity. CoRR, abs/2002.04590, 2020. URL: https://arxiv.org/abs/2002.04590.
9 Martin Fürer. Weisfeiler-Lehman refinement requires at least a linear number of iterations. In
Automata, Languages and Programming, 28th International Colloquium, ICALP 2001, Crete,
Greece, July 8-12, 2001, Proceedings, volume 2076 of Lecture Notes in Computer Science, pages
322–333. Springer, 2001. URL: https://doi.org/10.1007/3-540-48224-5_27.
10 Martin Fürer. On the combinatorial power of the Weisfeiler-Lehman algorithm. In Dimitris
Fotakis, Aris Pagourtzis, and Vangelis Th. Paschos, editors, Proceedings of the Tenth In-
ternational Conference on Algorithms and Complexity, volume 10236 of Lecture Notes in
Computer Science, pages 260–271. Springer Verlag, 2017. URL: https://doi.org/10.1007/
978-3-319-57586-5%5C_22.
11 Maximilian Gödicke. The iteration number of the Weisfeiler-Lehman-algorithm. Master’s
thesis, RWTH Aachen University, 2015.
12 Christopher D. Godsil. Compact graphs and equitable partitions. Linear Algebra Appl.,
255:259–266, 1997. URL: https://doi.org/10.1016/S0024-3795(97)83595-1.
13 Martin Grohe. Fixed-point definability and polynomial time on graphs with excluded minors.
Journal of the ACM, 59(5):27, 2012. URL: https://doi.org/10.1145/2371656.2371662.
S. Kiefer and B. McKay 21
14 Martin Grohe, Kristian Kersting, Martin Mladenov, and Erkal Selman. Dimension reduction
via colour refinement. In Algorithms - ESA 2014 - 22th Annual European Symposium, Wroclaw,
Poland, September 8-10, 2014. Proceedings, volume 8737 of Lecture Notes in Computer Science,
pages 505–516. Springer, 2014. URL: https://doi.org/10.1007/978-3-662-44777-2_42.
15 Martin Grohe and Sandra Kiefer. A linear upper bound on the Weisfeiler-Leman dimension
of graphs of bounded genus. In Proceedings of the Forty-Sixth International Colloquium
on Automata, Languages, and Programming, pages 117:1–117:15, July 2019. URL: https:
//doi.org/10.4230/LIPIcs.ICALP.2019.117.
16 Martin Grohe and Daniel Neuen. Canonisation and definability for graphs of bounded
rank width. In Proceedings of the Thirty-Fourth Annual ACM/IEEE Symposium on Logic
in Computer Science, pages 1–13, June 2019. URL: https://doi.org/10.1109/LICS.2019.
8785682.
17 Martin Grohe and Oleg Verbitsky. Testing graph isomorphism in parallel by playing a
game. In Automata, Languages and Programming, 33rd International Colloquium, ICALP
2006, Venice, Italy, July 10–14, 2006, Proceedings, Part I, pages 3–14, 2006. URL: https:
//doi.org/10.1007/11786986_2.
18 Neil Immerman and Eric Lander. Describing graphs: A first-order approach to graph canoniz-
ation. In Alan L. Selman, editor, Complexity theory retrospective, pages 59–81. Springer, 1990.
URL: https://doi.org/10.1007/978-1-4612-4478-3_5.
19 Tommi A. Junttila and Petteri Kaski. Engineering an efficient canonical labeling tool for
large and sparse graphs. In Proceedings of the Nine Workshop on Algorithm Engineering and
Experiments, ALENEX 2007, New Orleans, Louisiana, USA, January 6, 2007. SIAM, 2007.
URL: https://doi.org/10.1137/1.9781611972870.13.
20 Sandra Kiefer. Power and Limits of the Weisfeiler-Leman Algorithm. PhD thesis, RWTH
Aachen University, Aachen, 2020. URL: https://doi.org/10.18154/RWTH-2020-03508.
21 Sandra Kiefer and Pascal Schweitzer. Upper bounds on the quantifier depth for graph
differentiation in first order logic. In Proceedings of the Thirty-First Annual ACM/IEEE
Symposium on Logic in Computer Science, pages 287–296, July 2016. URL: https://doi.
org/10.1145/2933575.2933595.
22 Sandra Kiefer and Pascal Schweitzer. Upper bounds on the quantifier depth for graph
differentiation in first-order logic. Logical Methods in Computer Science, 15(2), 2019. URL:
https://doi.org/10.23638/LMCS-15(2:19)2019.
23 Sandra Kiefer, Pascal Schweitzer, and Erkal Selman. Graphs identified by logics with counting.
In Proceedings of the Fortieth International Symposium on Mathematical Foundations of
Computer Science, volume 9234 of Lecture Notes in Computer Science, pages 319–330. Springer,
August 2015. URL: https://doi.org/10.1007/978-3-662-48057-1_25.
24 Johannes Köbler and Oleg Verbitsky. From invariants to canonization in parallel. In Computer
Science - Theory and Applications, Third International Computer Science Symposium in
Russia, CSR 2008, Moscow, Russia, June 7-12, 2008, Proceedings, volume 5010 of Lecture
Notes in Computer Science, pages 216–227. Springer, 2008. URL: https://doi.org/10.1007/
978-3-540-79709-8_23.
25 Andreas Krebs and Oleg Verbitsky. Universal covers, color refinement, and two-variable
counting logic: Lower bounds for the depth. In Proceedings of the Thirtieth Annual ACM/IEEE
Symposium on Logic in Computer Science, pages 689–700, 2015. URL: https://doi.org/10.
1109/LICS.2015.69.
26 Wenchao Li, Hossein Saidi, Huascar Sanchez, Martin Schäf, and Pascal Schweitzer. De-
tecting similar programs via the Weisfeiler-Leman graph kernel. In Software Reuse:
Bridging with Social-Awareness - 15th International Conference, ICSR 2016, Limassol,
Cyprus, June 5-7, 2016, Proceedings, pages 315–330, 2016. URL: https://doi.org/10.
1007/978-3-319-35122-3_21.
27 Moritz Lichter, Ilia N. Ponomarenko, and Pascal Schweitzer. Walk refinement, walk logic, and
the iteration number of the Weisfeiler-Leman algorithm. In Proceedings of the Thirty-Fourth
22 The Iteration Number of Colour Refinement
Annual ACM/IEEE Symposium on Logic in Computer Science, pages 1–13, June 2019. URL:
https://doi.org/10.1109/LICS.2019.8785694.
28 Brendan D. McKay. Practical graph isomorphism. Congressus Numerantium, 30:45–87, 1981.
29 Brendan D. McKay and Adolfo Piperno. Practical graph isomorphism, II. Journal of Symbolic
Computation, 60:94–112, 2014. URL: https://doi.org/10.1016/j.jsc.2013.09.003.
30 Harry L. Morgan. The generation of a unique machine description for chemical structures
– a technique developed at chemical abstracts service. Journal of Chemical Documentation,
5(2):107–113, 1965. URL: https://doi.org/10.1021/c160017a018.
31 Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan E. Lenssen,
Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph
neural networks. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence,
January 2019. URL: https://doi.org/10.1609/aaai.v33i01.33014602.
32 Motakuri V. Ramana, Edward R. Scheinerman, and Daniel Ullman. Fractional isomorphism
of graphs. Discrete Mathematics, 132(1–3):247–265, 1994. URL: https://doi.org/10.1016/
0012-365X(94)90241-0.
33 Nino Shervashidze, Pascal Schweitzer, Erik J. van Leeuwen, Kurt Mehlhorn, and Karsten M.
Borgwardt. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research, 12:2539–
2561, 2011. URL: http://dl.acm.org/citation.cfm?id=2078187.
34 Gottfried Tinhofer. A note on compact graphs. Discrete Applied Mathematics, 30(2–3):253–264,
1991. URL: https://doi.org/10.1016/0166-218X(91)90049-3.
35 Oleg Verbitsky. Planar graphs: Logical complexity and parallel isomorphism tests. In STACS
2007, 24th Annual Symposium on Theoretical Aspects of Computer Science, Aachen, Germany,
February 22–24, 2007, Proceedings, pages 682–693, 2007. URL: https://doi.org/10.1007/
978-3-540-70918-3_58.