Automata
Theory
Prof. Dr. Wolfgang Thomas
RWTH Aachen
November 2, 2005
Note
These notes are based on the courses “Applied Automata Theory” and
“Angewandte Automatentheorie” given by Prof. W. Thomas at RWTH
Aachen University in 2002 and 2003.
In 2005 they were revised and corrected, but they may still contain
mistakes of any kind. Please report any bugs you find, as well as comments
and proposals on what could be improved, to
[email protected]
Contents
0 Introduction
  0.1 Notation
  0.2 Nondeterministic Finite Automata
  0.3 Deterministic Finite Automata
1 Automata and Logical Specifications
2 Congruences and Minimization
3 Tree Automata
  3.1 Trees and Tree Languages
  3.2 Tree Automata
    3.2.1 Deterministic Tree Automata
    3.2.2 Nondeterministic Tree Automata
    3.2.3 Emptiness, Congruences and Minimization
  3.3 Logic-Oriented Formalisms over Trees
    3.3.1 Regular Expressions
    3.3.2 MSO-Logic
  3.4 XML-Documents and Tree Automata
  3.5 Automata over Two-Dimensional Words (Pictures)
Index
Chapter 0
Introduction
[Figure: an example automaton with states q0, . . . , q4 and transitions labeled 0 and 1]
• a (bi-)simulation class
[Figure 4: SDL-diagram]
• Which questions about automata are decidable and, for those that are, what complexity do the corresponding algorithms have (e.g. concerning equivalence)?
• logic
• circuit theory
The course and therefore these course notes are structured as follows:
Chapter 1 deals with the connection of automata and logic and with the
paradigm of model checking. In Chapter 2 equivalence issues including mini-
mization and bisimulation are discussed. Chapter 3 focuses on tree automata
and their applications. In Chapter 4 pushdown systems as a first model with
an infinite set of configurations are investigated. The next two chapters deal
with automata models which allow communication and concurrency between
different processes, namely communicating finite state machines in Chapter
5, including the graphical formalism of message sequence charts, and Petri
nets in Chapter 6.
0.1 Notation
An alphabet is a finite set. Its elements are called symbols. We usually denote
an alphabet by Σ, and its elements by a, b, c, . . . . A word over an alphabet
Σ is a finite sequence of letters a1 a2 . . . an with ai ∈ Σ for 1 ≤ i ≤ n. n is
called the length of a1 a2 . . . an . Words are usually named u, v, w, u1 , u2 , . . . .
By ε we denote the word of length 0, i.e. the empty word. Σ∗ (Σ+) is the set
of all (nonempty) words over Σ. The concatenation of words u = a1 . . . an
and v = b1 . . . bm is the word uv := a1 . . . an b1 . . . bm . Finally, we use capital
calligraphic letters A, B, . . . to name automata.
A set L ⊆ Σ∗ is called a language over Σ. Let L, L1 , L2 ⊆ Σ∗ be lan-
guages. We define the concatenation of L1 and L2 to be the set L1 · L2 :=
{uv | u ∈ L1 and v ∈ L2}, also denoted by L1 L2. We define L^0 to be the
language {ε}, and L^i := L · L^(i−1). The Kleene closure of L is the language
L∗ := ⋃_{i≥0} L^i, and the positive Kleene closure L+ is the Kleene closure of
L without adding the empty word, i.e. L+ := ⋃_{i≥1} L^i.
Regular expressions over an alphabet Σ are defined recursively: ∅, ε, and
a for a ∈ Σ are regular expressions denoting the languages ∅, {ε}, and {a},
respectively. Let r, s be regular expressions denoting the languages R and
S, respectively. Then (r + s), (r · s), and (r∗) are also regular expressions,
denoting the languages R ∪ S, RS, and R∗, respectively. We usually omit
the parentheses, assuming that ∗ has higher precedence than · and +, and
that · has higher precedence than +. We also denote the regular expression
(r + s) by r ∪ s, and r · s by rs.
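To make the recursive definition concrete, here is a small sketch in Python (not part of the original notes; all names are invented) that represents regular expressions as nested tuples and computes the denoted language restricted to words of length at most n, since the full language R∗ is in general infinite:

# Regular expressions as nested tuples: "empty", "eps", ("sym", a),
# ("union", r, s), ("concat", r, s), ("star", r).
def lang(e, n):
    """The language denoted by e, restricted to words of length <= n."""
    if e == "empty":
        return set()
    if e == "eps":
        return {""}
    tag = e[0]
    if tag == "sym":
        return {e[1]} if n >= 1 else set()
    if tag == "union":
        return lang(e[1], n) | lang(e[2], n)
    if tag == "concat":
        return {u + v for u in lang(e[1], n) for v in lang(e[2], n)
                if len(u + v) <= n}
    if tag == "star":   # iterate concatenation with the base until stable
        result, base = {""}, lang(e[1], n)
        while True:
            new = {u + v for u in result for v in base if len(u + v) <= n}
            if new <= result:
                return result
            result |= new

# (a + b)* . a . (a + b), i.e. Sigma* a Sigma over {a, b}:
r = ("concat",
     ("concat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a")),
     ("union", ("sym", "a"), ("sym", "b")))
print(sorted(lang(r, 3)))  # ['aa', 'aaa', 'aab', 'ab', 'baa', 'bab']

The example expression denotes Σ∗aΣ, the words whose second-to-last letter is an a, a language that reappears below.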
• Σ is a finite alphabet,
• q0 ∈ Q is an initial state,
[Diagram: the NFA A0 over {a, b}]
The word abbaaa is accepted by A0 , because there exists a run, e.g. 1122234,
yielded by this word that starts in the initial state and leads the automaton
to a final one. We notice the nondeterminism in state q0 , where on the
occurrence of b the automaton can choose between staying in the same state
or proceeding to state q1 , as well as in state q1 , where an analogous choice
has to be made when reading an a. £
• Σ is a finite alphabet,
• q0 ∈ Q is an initial state,
δ∗ : Q × Σ∗ → Q with δ∗(p, w) = q if A : p –w→ q, i.e. reading w leads A from state p to state q.
This can also be defined inductively by δ∗(p, ε) = p and δ∗(p, wa) = δ(δ∗(p, w), a).
The language accepted by a DFA A = (Q, Σ, q0 , δ, F ) is L(A) := {w ∈ Σ∗ |
δ ∗ (q0 , w) ∈ F }. To simplify notation we often write δ instead of δ ∗ . Note
that by definition every DFA is complete. Since every automaton with a
partially defined transition function can easily be transformed into a DFA by
adding an additional state, we often also consider automata with a partial
transition function as DFAs.
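As an illustrative sketch (not from the notes; the toy DFA below is invented), the inductive definition of δ∗ translates directly into a loop over the input word:

# A DFA as a transition dictionary; delta_star follows the inductive
# definition delta*(p, eps) = p and delta*(p, wa) = delta(delta*(p, w), a).
delta = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q1", ("q1", "b"): "q0"}
q0, final = "q0", {"q1"}        # accepts words over {a, b} ending in a

def delta_star(p, w):
    for a in w:
        p = delta[(p, a)]
    return p

def accepts(w):
    return delta_star(q0, w) in final

print(accepts("abba"))  # True
print(accepts("ab"))    # False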
[Diagrams: a DFA over {a, b} with states q0–q3, and a DFA over {a, b, c} with states 1, 2, 3]
Example 1.2 Consider the well-known circuit (half-adder) of Figure 1.1, with inputs x and z.
Its behavior can be described by the following Boolean formulas:
s = x̄ · z + x · z̄,   u = x · z
£
There is a translation from circuits to Boolean formulas and vice versa.
The main difference between circuits and automata is that the former are
acyclic graphs. In the above example we can notice how data flows “from
left to right”. In the case of automata we may have loops, consisting of one
or more states, and this is what makes them more complex to handle.
We also know other, roughly speaking "logical", formulas that describe
the behavior of automata, namely regular expressions over Σ. These are
built up from the symbols of Σ, ∅, and ε, using + (for union), · (for concatenation)
and ∗ (for the iteration of concatenation).
[Diagram: an automaton with states 1, 2, 3 and two b-transitions 1 → 2 → 3]
(Example 1.1) “There are two occurrences of b, between which only a’s
occur”
Example 1.5 “There are no two occurrences of b, between which only a’s
occur” £
Example 1.7 Consider the language of words whose second-to-last letter is an a:
Regular expression: Σ∗ · a · Σ
Logical formula: ∃x (Pa(x) ∧ x + 1 = max)
Similarly, for the language of words of even length:
Regular expression: (ΣΣ) · (ΣΣ)∗
Logical formula: ∃x (x + x = max)
Where:
Remark 1.10 We decide to use only non-empty words for models for two
reasons. First, in mathematical logic it is generally convenient to use models
that have at least one element. Otherwise, we have to bother with some
additional considerations and special cases. Second, if we admitted the
empty word ε as a model, we would no longer be able to make use of the
constants min and max.
MSO stands for “monadic second-order”:
Second-order because it allows quantification not only over (first-order) po-
sition variables but also over (second-order) set variables.
Monadic because quantification is allowed at most over unary (monadic) re-
lations, namely sets. For an alphabet Σ we get the MSOΣ [S, <]-formulas. In
the special case that we use quantifiers only over first-order variables (rang-
ing over positions) we get the FOΣ [S, <]-formulas. Usually, the lowercase
Greek letters φ, χ, ψ, . . . are used for formulas. A variable xi or Xi is free in
a formula if it does not occur within the scope of a quantifier. The notation
φ(x1 , . . . , xm , X1 , . . . , Xn ) for a formula indicates that at most the variables
x1 , . . . , xm , X1 , . . . , Xn occur free in φ.
Assume φ(x1 , . . . , xm , X1 , . . . , Xn ) is given. To interpret the truth value
of φ we need:
• positions k1, . . . , km as interpretations of x1, . . . , xm
• sets K1, . . . , Kn of positions as interpretations of X1, . . . , Xn
These are collected in the extended word model
(w, k1, . . . , km, K1, . . . , Kn)
Now,
(w, k1, . . . , km, K1, . . . , Kn) |= φ(x1, . . . , xm, X1, . . . , Xn)
expresses that φ holds under this interpretation; we also write
w |= φ[k1, . . . , km, K1, . . . , Kn]
Then:
2. (w, 2, 4) |= φ2 (x1 , x2 )
3. w |= φ3
The last example-formula does not have any free variables, and can there-
fore be interpreted given just a word as a model. Such formulas are called
sentences. If we collect all words that satisfy such a sentence, and put them
into a set, then this set is the language defined by the sentence. Formally,
for an MSOΣ [S, <]-sentence φ, let L(φ) := {w ∈ Σ+ | w |= φ} be the lan-
guage defined by φ. Such a language is called MSOΣ [S, <]-definable (short:
MSO-definable).
If no set quantifiers are needed in φ, we call the language FOΣ[S, <]-definable,
or just FO-definable. FO stands for first-order, because now
quantifiers range only over position variables.
• Σ∗ aΣ is FO-definable, namely by
∃x (Pa (x) ∧ S(x, max))
Furthermore, we shall see in a later section that the even-length property
from Example 1.7 cannot be expressed in FO-logic at all. □
Remark 1.13 In FO- and MSO-formulas we can eliminate min and max.
we write (for example, for ∃x (Pa(x) ∧ S(x, max))):
∃x (Pa(x) ∧ ∃y (S(x, y) ∧ ¬∃z S(y, z)))
£
In general, substitute ψ(max) by ∃y (¬∃z S(y, z) ∧ ψ(y)), and analogously for min.
2. a+ is definable by ∀x Pa (x).
3. a(ba)∗ is definable by
Pa(min) ∧ ∀x∀y (S(x, y) → (Pa(x) → Pb(y)) ∧ (Pb(x) → Pa(y))) ∧ Pa(max)
b1     b2     · · ·   bm
(c1)1  (c2)1  · · ·   (cm)1
 ⋮      ⋮             ⋮
(c1)n  (c2)n  · · ·   (cm)n
and we set (ck)j = 1 iff k ∈ Kj.
[Diagram: an NFA with states 1, 2, 3 and b-transitions 1 → 2 → 3]
w |= φA iff w ∈ L(A) .
Example 1.20 Consider w = abbab. An accepting run for this word is 112333. We encode
X1, X2, X3 in the following way (omitting the last state):
w : a b b a b
X1 : 1 1 0 0 0
X2 : 0 0 1 0 0
X3 : 0 0 0 1 1
£
Naturally the automaton can only be in one state at each point of time.
Therefore there is just one 1 in every column of the run. How can we describe
a successful run? That is, how do we set constraints to X1 , . . . , Xn ? The
formula we are looking for is a conjunction over 4 basic properties:
φA := ∃X1∃X2∃X3 [ "X1, X2, X3 form a partition"
∧ X1(min)
∧ ∀x∀y (S(x, y) → "at x, y one of the transitions is applied")
∧ "at max the last transition leads to a final state" ]
Now let us explain each one of these properties (except the second, which is
obvious) and give the respective MSO-formulas.
£
Proof of Theorem 1.18: Let A = ({1, . . . , m}, Σ, 1, ∆, F) be an NFA with
state set Q = {1, . . . , m}. The formula expressing that there exists a successful
run of A on a word w ∈ Σ+ is then given by
φA = ∃X1 . . . ∃Xm [ ∀x (X1(x) ∨ . . . ∨ Xm(x)) ∧ ⋀_{i≠j} ¬∃x (Xi(x) ∧ Xj(x))
∧ X1(min)
∧ ∀x∀y (S(x, y) → ⋁_{(i,a,j)∈∆} (Xi(x) ∧ Pa(x) ∧ Xj(y)))
∧ ⋁_{(i,a,j)∈∆, j∈F} (Xi(max) ∧ Pa(max)) ]
Remark 1.21 φA does not contain any universal quantifiers over set variables,
only existential ones. Such an MSO-formula, namely of the form
∃X1 . . . ∃Xn ψ, where ψ does not contain any set quantifiers at all, is called
an existential MSO-formula (EMSO-formula).
• X ⊆ Y and X ⊆ Pa
The word (b,0)(a,0)(a,1)(b,0) does not satisfy φ.
L(φ) = the set of words over Σ × {0, 1}, such that after every a in the first
component there is a 1 in the second component. £
In the general case, for each φ(X1 , . . . , Xn ) we have to inductively con-
struct a finite automaton over Σ × {0, 1}n . For the base case we consider
the atomic MSO0 -formulas.
• X1 ⊆ X2 :
Given a word w ∈ (Σ × {0, 1}n)+, an automaton has to check that
whenever the first bit component is 1, the second bit component is 1 as well.
This is performed by an NFA with a single state (initial and final) that loops on the letters
(#, 0, 0, ∗), (#, 0, 1, ∗), (#, 1, 1, ∗);
that is, it accepts everything except (#, 1, 0, ∗), where # stands for arbitrary
letters in Σ and ∗ for arbitrary bit vectors in {0, 1}^(n−2).
• X1 ⊆ Pa:
A single state (initial and final) with a loop on the letters (a, 1, ∗), (a, 0, ∗), (≠a, 0, ∗); only (≠a, 1, ∗) is excluded.
• Sing(X1):
States 1 (initial) and 2 (final), with loops on (#, 0, ∗) at both states and a transition from 1 to 2 on (#, 1, ∗); thus exactly one position carries a 1 in the X1-component.
• S(X1, X2):
States 1 (initial), 2, and 3 (final); loops on (#, 0, 0, ∗) at states 1 and 3, a transition from 1 to 2 on (#, 1, 0, ∗), and a transition from 2 to 3 on (#, 0, 1, ∗).
• X1 < X2:
As for S(X1, X2), but with an additional loop on (#, 0, 0, ∗) at state 2, so that the 1 in the X2-component may occur any number of positions after the 1 in the X1-component.
For the induction step, we suppose that for the formulas φ1, φ2 the
corresponding NFAs A1, A2 are given, and we only consider the connectives
∨, ¬ and the quantifier ∃. The other operators (∧, →, ↔ and ∀) can be
expressed as combinations of these. Regarding negation, ¬φ1 is handled by
the complement automaton of A1. As we know, we can construct it by
applying the subset construction to A1 and then, in the resulting DFA,
declaring all states in Q \ F as final. For the disjunction φ1 ∨ φ2 we
construct an automaton that recognizes L(A1) ∪ L(A2); as usual, we do
this by building the union automaton of A1 and A2. To handle the
existential quantifier we need some preparation.
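The complementation step can be sketched in code as follows; this is a hedged illustration (invented names and example), determinizing via the subset construction and then declaring exactly the subsets without final states as final:

# Subset construction + complementation for an NFA given by a set of
# transitions (p, a, q), an initial state q0 and a set of final states.
def determinize_complement(alphabet, transitions, q0, finals):
    start = frozenset({q0})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(q for (p, b, q) in transitions if p in S and b == a)
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    # complement: a subset is final iff it contains no final NFA state
    dfa_finals = {S for S in seen if not (S & finals)}
    return dfa_delta, start, dfa_finals

# NFA for "contains an a" over {a, b}; its complement accepts b*:
dd, s0, ff = determinize_complement(
    "ab", {(1, "a", 2), (1, "b", 1), (2, "a", 2), (2, "b", 2)}, 1, {2})
S = s0
for c in "bbb":
    S = dd[(S, c)]
print(S in ff)  # True: bbb contains no a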
Remark 1.29 The NFA for the projection results by projecting the transi-
tion labels.
Example 1.30 Consider the (incomplete) DFA over {a, b} × {0, 1} that
recognizes
L = ((a,0) + (b,0))∗ · (b,1) · ((a,1)(a,1))∗ :
[Diagram: state 1 loops on (a,0) and (b,0); a transition 1 → 2 on (b,1); transitions 2 → 3 and 3 → 2 on (a,1); final state 2]
The projection f : (a,c) ↦ a, (b,c) ↦ b yields the NFA:
[Diagram: state 1 loops on a and b; a transition 1 → 2 on b; transitions 2 → 3 and 3 → 2 on a; final state 2]
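In code, the projection of Remark 1.29 is just a relabeling of the transition set (an illustrative sketch with invented names):

# Projecting NFA transition labels; a transition is (p, letter, q),
# where letters are pairs like ("a", 0).
def project(transitions, f):
    return {(p, f(a), q) for (p, a, q) in transitions}

dfa_trans = {(1, ("a", 0), 1), (1, ("b", 0), 1), (1, ("b", 1), 2),
             (2, ("a", 1), 3), (3, ("a", 1), 2)}
nfa_trans = project(dfa_trans, lambda letter: letter[0])
print(sorted(nfa_trans))
# [(1, 'a', 1), (1, 'b', 1), (1, 'b', 2), (2, 'a', 3), (3, 'a', 2)]

Note that state 1 now has two b-transitions: the projection made the automaton nondeterministic, exactly as in the example above.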
Remark 1.33 The NFA for ∃Xn ϕ(X1 , . . . , Xn ) is constructed from the au-
tomaton for ϕ(X1 , . . . , Xn ) by “omitting the last component” in the transi-
tion labels.
[Diagram: A′1, requiring a 1 in the second component at the first position, with a transition 1 → 2 on (a,1) and a loop (a,∗) at 2; A′2, an automaton with states 1, 2, 3 and transitions labeled (a,0) and (a,1)]
Since A′1 only requires that in the first position of the word model a 1
occurs in the second component, we form the intersection of A′1 and A′2 by
removing the transition from 1 → 2. So for the formula ψ1(X1) ∧ ψ2(X1) ∧
ψ3(X1) we get the following equivalent automaton:
[Diagram: A′1 ∩ A′2 with states 1, 2, 3 and transitions labeled (a,1) and (a,0)]
Does this mean that the transformation requires only exponential time with
respect to the number of states of the automata? Unfortunately not; it can
be even worse. The fact that the projection may turn a DFA into an NFA,
and the fact that constructing the complement automaton requires making
the automaton deterministic first (subset construction), mean that whenever
an alternation between negation and projection occurs (e.g. when a universal
quantifier occurs), an additional exponentiation is added to our computational
effort. So, in the general case, if k is the number of logical operators in
the formula, then the number of states in the resulting automaton grows
like a tower of exponentials 2^{2^{···^{2^n}}} of height k. The following
theorem shows that we cannot hope for a better bound.
2^{2^{···^{2^n}}} (a tower of exponentials of height k), with a constant k.
MONA input:
var2 X,Y;
X\Y = {0,4} union {1,2};
pred even(var1 x) = ex2 Z: 0 in Z & x in Z &
(all1 y: (0 < y & y <= x) =>
(y in Z => y-1 notin Z) &
(y notin Z => y-1 in Z));
var1 x;
even(x) & x notin X;
[MONA output: the corresponding minimal DFA, printed as a transition table over the tracks of X, Y and x, with don't-care entries marked X] □
Now that we have connected automata to logic, we can make a compar-
ison between them, as well as consider the applications of this connection.
How does all this relate to the “logic” of regular expressions? These can
be enriched by further operators that make them more convenient to use
without increasing their expressive power. We add ∼ (for the complement)
and ∩ (for the intersection).
“there are no two occurrences of b, between which only a’s occur, and at
most three c’s occur”
is defined by
∼ (Σ∗ ba∗ bΣ∗ ) ∩ ∼ (Σ∗ cΣ∗ cΣ∗ cΣ∗ cΣ∗ )
£
Remark 1.40 The BET-Theorem has been extended to other types of mod-
els instead of finite words:
1. Automata on ω-words and their equivalence to MSO-logic (Büchi 1960)
• The theory S1S refers to the structure (N, +1) with arbitrary set quan-
tifiers.
• The theory WS2S concerns the structure of the infinite binary tree
(with two successor functions), in which the set quantifiers range only
over finite sets.
• The theory S2S refers to the infinite binary tree with arbitrary set
quantifiers.
All these theories are decidable using the corresponding automata and sim-
ilar techniques as in the proof of Theorem 1.18.
where ψ is an FO-formula.
Proc0: loop
(00) a0: Non_Critical_Section_0;
(01) b0: wait unless Turn = 0;
(10) c0: Critical_Section_0;
(11) d0: Turn := 1;
Proc1: loop
(00) a1: Non_Critical_Section_1;
(01) b1: wait unless Turn = 1;
(10) c1: Critical_Section_1;
(11) d1: Turn := 0;
If we additionally label each transition with its target state, and declare all
reachable states as final ones, the words being accepted by this automaton
are exactly the system runs. Finally, for matters of system specification we
can represent the 5 bits by predicates X1, . . . , X5 and introduce abbreviations
for compact state properties that can be described by MSO-formulas:
ata0(t) := ¬X2(t) ∧ ¬X3(t)   (states of the form X00XX)
Turn(t) = 0 := ¬X1(t)   (states of the form 0XXXX)
The first formula specifies all states where some part of the system is “at
a0”, which means that process 0 is in line 00. The second one specifies all
those states where the value of the synchronization variable happens to be
0. For the general case where state properties p1 , . . . , pk are considered, a
framework is used that is called Kripke structure.
1. Σ∗ is star-free, defined by ∼∅
Remark 1.49
a. The use of complementation ∼ is essential.
Indeed, complementation is the operation that enables us to represent
infinite languages because by complementing a finite language we get
an infinite one. With union, concatenation, and intersection we can
only produce finite languages starting from the finitely many symbols
of the alphabet.
ϕr(x, y) := ∃z∃z′ (x ≤ z ∧ z ≤ y ∧ x ≤ z′ ∧ z′ ≤ y ∧ ϕs(x, z) ∧ S(z, z′) ∧ ϕt(z′, y))
The value bi (k) ∈ {0, 1} codes the truth value of proposition pi at position
k in w.
We define the semantics of temporal formulas as follows:
“Whenever p1 holds, then from the next moment on the following holds:
p1 holds until eventually p2 holds.”
Example 1.54
G(p1 → X(¬p2 U p3))
expresses in (w, j) with w ∈ ({0, 1}3 )+ the following property:
£
Example 1.55 Let n = 2 and w = (1,0)(0,0)(1,1)(0,1)(1,0)(0,1), where the
upper track gives the truth values of p1 and the lower one those of p2.
• w ⊭ p1 U (p1 ∧ p2)
Example 1.58
The converses also hold, but they will not be proven in these course notes,
since this requires considerably more technical effort:
Theorems 1.65 and 1.59 imply that FO-definable languages are non-
counting.
£
Induction step "·": Take the induction hypothesis as before and set n0 =
2 · max(n1, n2) + 1. For n ≥ n0 consider u v^n w ∈ L1 · L2. Consider
the decomposition of u v^n w into two segments, u v^n w = z1 z2 with z1 ∈
L1, z2 ∈ L2, as illustrated in Figure 1.2. By the choice of n0 one of the
following cases applies.
Case 1: z1 contains at least n1 v-segments. Then by induction hypothesis z′1 (with one v-segment more than z1) is in L1.
Case 2: z2 contains at least n2 v-segments. Then by induction hypothesis z′2 (with one v-segment more than z2) is in L2.
In both cases we get u v^(n+1) w ∈ L1 L2. The converse (from u v^(n+1) w ∈
L1 L2 to u v^n w ∈ L1 L2) can be proven analogously. ∎
[Figure 1.2: the decomposition of u v^n w into z1 ∈ L1 and z2 ∈ L2, with n1 and n2 marking the v-segments]
• sh(r∗ ) = sh(r) + 1
Theorem 1.68 (Eggan 1963) The star-height hierarchy is strict (over al-
phabets with at least two letters): For each n ≥ 0 there exists a language Ln
with sh(Ln ) = n.
For Ln we can take the family of languages, recognized by the following NFA
(see exercises):
[Diagram: an NFA with states 0, 1, 2, . . . , 2n and transitions labeled a and b]
The property of NFAs that corresponds to the star-height of expressions is
the “loop complexity” (also “feedback number”).
Example 1.70 Verify that the following automaton has loop complexity 2: [diagram omitted]
A1: [diagram: a two-state automaton] 1st step: remove 1 and 2.
A3: [diagram: an automaton with states 0–8] 1st step: remove 4; 2nd step: remove 2 and 6; 3rd step: remove 0 and 7. □
Analogously, we can define the loop complexity of a regular language.
Theorem 1.73 (Eggan 1963) For every regular language L, sh(L) = lc(L).
The proof is a refinement of the one for Kleene’s theorem (Theorem 1.4). For
further details, please refer to J.R. Büchi: Finite Automata, Their Algebras
and Grammars, Sect. 4.6.
Remark 1.75
• gsh(L) ≤ sh(L), i.e. we may be able to reduce the star-height by
substituting some stars by the extended operators of the generalized
regular expressions; but in general we cannot reduce it to 0, because:
• gsh(L) = 0 iff L is star-free (by definition).
• There exist regular languages L with gsh(L) ≥ 1 (compare Example 1.63 on page 35).
Open problem (since 1971): Are there regular languages L with gsh(L) ≥
2?
We suspect that if there is a solution, it goes in the direction of nested
modulo-counting, since this is the identifying property of non-star-free
expressions. We define the languages:
E = (00)∗1 ("even block")
O = (00)∗01 ("odd block")
The suggestion of using the language specifying an even number of even
blocks, (((O∗E)(O∗E))∗O∗), failed after a few years.
1.6 Exercises
Exercise 1.1 Give FO- or MSO-formulas for the languages that are defined
by the following regular expressions or descriptions in colloquial language:
(a) a+ b∗ ,
(b) aab∗ aa,
(c) there are at least three occurrences of b, and before the first b there are
at most two occurrences of a.
[Diagram: an NFA with states 1, 2, 3, 4 over {a, b}]
(b) Let Σ = {a, b}. The following MSO0 -formula defines a language over
Σ × {0, 1}3 . Give (by direct construction) an equivalent NFA.
(c) Using (b) construct an NFA over Σ × {0, 1} recognizing L(ψ(X1 )).
(b) Given an MSO-sentence ϕ, does ϕ hold for all nonempty word models
of even length?
Describe algorithms for solving these problems.
2. ψ = G(p1 → Xp2 )
(a) For each j = 1, · · · , 6 answer whether the formulas hold in (w, j) or not.
(b) Show that for L(A2n) there exists a conventional regular expression (i.e.
using only +, ·, ∗) of star-height n.
(The star-height of an expression is the maximal number of nested stars.
Example: (a(aa)∗ + (bb)∗)∗ has star-height 2.)
Exercise 1.10 Decide whether the following languages are "counting" and
prove your claim.
(a) b(abb)∗
(b) a(bbb)∗ aa
Chapter 2
Congruences and
Minimization
Example 2.1 The NFA A2 on the right-hand side of Figure 2.1 is a rough
view of the NFA A1 on the left-hand side, where the subsystem of the states
1, 2, and 3 (counting a's modulo 3) is represented by state B. This merging
of states corresponds to the following homomorphism:
0 ↦ A, 1 ↦ B, 2 ↦ B, 3 ↦ B
[Figure 2.1: left, the NFA A1 with states 0, 1, 2, 3, where 1, 2, 3 form an a-cycle; right, the NFA A2 with states A and B]
• universes A, B,
• binary relations RA ⊆ A × A, RB ⊆ B × B,
• h(cA ) = cB
Example 2.4 Consider again the NFAs A1 and A2 from Figure 2.1:
A1 = ({0, 1, 2, 3}, 0, Ra1, Rb1, {1}) with Ra1 = {(1, 2), (2, 3), (3, 1)} and Rb1 = {(0, 0), (0, 1)};
A2 = ({A, B}, A, Ra2, Rb2, {B}) with Ra2 = {(B, B)} and Rb2 = {(A, A), (A, B)}.
The domain of this homomorphism h is the state set of A1, and h takes the
values h(1) = h(2) = h(3) = B, h(0) = A. It is indeed a homomorphism,
because every transition from Ra1 is mapped to a transition in Ra2, and
analogously for Rb1 and Rb2. □
L(A1 ) = b+ (aaa)∗
L(A2 ) = b+ a∗
[Diagram: a DFA with states 0–5 and a quotient automaton with states A, B, C]
Example 2.12 On the same automaton as before we can merge the pairs
(1, 5) and (0, 4):
[Diagram: the merged automaton with states {1, 5}, 2, 3, {0, 4}]
£
In both previous examples we observed that if B results from A by
merging states, then L(A) ⊆ L(B). If we could find a way to merge the
correct states, so that the resulting automaton recognizes exactly the same
language as the initial one (L(A) = L(B)), then we would have improved
the initial automaton by reducing its number of states, without affecting its
Example 2.13 Let A be an NFA and call Aq the same NFA as A but with
a different initial state, namely q. To minimize A we suggest to merge two
states p, q, if L(Ap ) = L(Aq ). That is, if A accepts the same language both
starting from p and from q, then we shall merge p and q.
[Diagram: an NFA with states A, B, C over {a, b}]
In the automaton above there is no pair of states that can be merged in the
way described before. But still, it is obvious that the automaton can be
minimized, since states A and B suffice to accept the same language,
namely Σ∗b. □
Unfortunately, things are even worse since there are cases where even
if we try all possible mergings (without following a specific rule as in the
previous example), we will not find the minimal NFA.
Example 2.14 Consider the language L = Σ∗ aΣ, over Σ = {a, b}. Obvi-
ously the minimal automaton A that recognizes L is the following:
[Diagrams: a three-state NFA recognizing L, and a further NFA with states 1, 2, 3, 4]
Example 2.16
[Diagram: a DFA with states 1–4 and its quotient with states A, B]
or in standard syntax:
• h(q0^1) = q0^2
• h(δ^1(p, a)) = δ^2(h(p), a)
• h(F^1) ⊆ F^2
Example 2.18 From the left automaton we proceed to the right one by
merging 1 with 3 and 2 with 4.
[Diagram: left, a DFA with states 0–4; right, the quotient DFA with states A, B, C]
This corresponds to a homomorphism h with: h(0) = A, h(1) = h(3) = B,
h(2) = h(4) = C. This homomorphism is also a DFA-homomorphism,
because both the initial and the resulting automata are DFAs. £
[Diagram: left, a DFA with states 0–4; right, an NFA with states A, B, C]
The mapping h(0) = A, h(1) = h(2) = B, h(3) = h(4) = C yields an
NFA-homomorphism, but not a DFA-homomorphism. We observe that if
we want to preserve determinism in the resulting automaton, we cannot
merge any pairs of states, even if this adheres to the conditions of the NFA-
homomorphism. There must be some additional relation between the two
states before they are merged. £
b) The DFA A/∼A is called the reduced DFA of A, Ared for short, and
L(A) = L(Ared).
This means that, with respect to L, u and v need not be distinguished for
any continuation w. We often use the negation of this equivalence to prove
that two words belong to different equivalence classes, namely u ≁L v iff
there exists some w with uw ∈ L and vw ∉ L (or uw ∉ L and vw ∈ L).
Example 2.26 Consider the language L = a(bb)∗. The words a and b are
not equivalent: take w = ε, then aw = a ∈ L but bw = b ∉ L. On the other
hand, a, abb, abbbb, . . . are equivalent, and ε, a, b, ab are pairwise
non-equivalent. □
It follows that if L is regular, then ∼L has a finite index, i.e. there are
finitely many equivalence classes.
• q0L := [ε],
• FL := {[u] | u ∈ L}.
Example 2.30 Let L = {a, b}∗ abb{a, b}∗ . There are four different ∼L -
classes: [ε]L , [a]L , [ab]L , [abb]L . Since
[Diagram: the DFA on the ∼L-classes, with transitions [ε] → [a] on a, [a] → [ab] on b, [ab] → [abb] on b, and [abb] looping on a, b; below, an equivalent DFA with states 1–4] □
Our aim is to compute from any DFA A a DFA B as in Theorem 2.32,
and thus obtain a minimal deterministic automaton. Given a DFA A =
(Q, Σ, q0, δ, F) we have to determine the equivalence classes of ∼A. But to
establish that two states are equivalent, we would have to examine all possible
words that the automaton may read from those states, and check whether
they lead the automaton to a final state or not. That is why we prefer the
converse approach: we identify all pairs of states that are not ∼A-equivalent.
In other words, we identify those pairs p, q for which the following holds:
"there exists a w, such that one of the two states δ(p, w) and δ(q, w) is in
F and the other one is not"
To find an appropriate w we proceed stepwise on its length. In the first step
we take w = ε and separate the final states from the non-final ones: (F, Q \ F).
In the next step (because p ∼A q ⇒ δ(p, a) ∼A δ(q, a)) we declare two
(not yet separated) states p and q as non-equivalent if, for some a ∈ Σ,
δ(p, a) and δ(q, a) are non-equivalent as well. Following this idea we
refine the partition (F, Q \ F) as long as possible. Formally, the
block Q \ F has to be split in two with respect to F and a letter a if:
"there exist p, q ∈ Q \ F and a letter a, with δ(p, a) ∈ F and δ(q, a) ∉ F;
then the block Q \ F is split into {p ∈ Q \ F | δ(p, a) ∈ F} and
{p ∈ Q \ F | δ(p, a) ∉ F}".
For each subsequent step, given the current partition (B1, . . . , Bk), the
task is described as follows:
Bi is split in two with respect to Bj and a, if there exist p, q ∈ Bi
such that δ(p, a) ∈ Bj but δ(q, a) ∉ Bj.
This splits Bi into
(∗) {p ∈ Bi | δ(p, a) ∈ Bj} and {p ∈ Bi | δ(p, a) ∉ Bj}
[Figure 2.2: Partition refinement. Left: a DFA with states A–G; right: the refined DFA with merged blocks AD and BG]
1. Initialization: B1 = F, B2 = Q \ F
For the DFA of Figure 2.2 the refinement proceeds as follows:
initialization: ABDEFG | C
1st step: BG | ADEF
2nd step: AD | EF
3rd step: E | F
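This refinement loop is easy to implement; the following is an illustrative sketch (the small DFA in the demo is invented, not the one from Figure 2.2):

# Partition refinement for DFA minimization, following the rule (*) above.
def refine(Q, sigma, delta, F):
    partition = [b for b in (set(F), set(Q) - set(F)) if b]
    changed = True
    while changed:
        changed = False
        for Bi in partition:
            for a in sigma:
                for Bj in partition:
                    inside = {p for p in Bi if delta[(p, a)] in Bj}
                    if inside and inside != Bi:     # Bi is split by (Bj, a)
                        partition.remove(Bi)
                        partition += [inside, Bi - inside]
                        changed = True
                        break
                if changed:
                    break
            if changed:
                break
    return partition

# toy DFA in which states 2 and 3 behave identically:
delta = {(1, "a"): 2, (1, "b"): 3, (2, "a"): 4, (2, "b"): 3,
         (3, "a"): 4, (3, "b"): 3, (4, "a"): 4, (4, "b"): 4}
print(refine({1, 2, 3, 4}, "ab", delta, {4}))  # [{4}, {2, 3}, {1}]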
Corollary 2.35 (of the Main Minimization Theorem) Two DFAs over
Σ are equivalent, if and only if the corresponding reduced automata are iso-
morphic.
This means that we can use the following algorithm to check, whether
two DFAs are equivalent:
3. Check whether the transition tables and the final states are identical.
[Diagram: the reduced DFA with states [ε], [a], [b], [ab]] Its transition table:
       a     b
[ε]   [a]   [b]
[a]   [b]   [ab]
[b]   [b]   [b]
[ab]  [b]   [a]
We start with the initial state [ε] and write it into the transition table. In
the next loop we visit [a] and from there [b]; we enter both under [ε]. In
the next loop we reach [b] again, but take no action, because we have already
seen this state. In the last loop we visit [ab] and write it down into the
transition table. We complete the transition table for every state that we reached
[Diagram: two ordered decision diagrams over the variables x, y, z with leaves 0 and 1, before and after the reduction]
Finally, a node that leads to its (only) successor with both a 0 or a 1
can be eliminated, because it does not affect the final result. This leads us
to the final minimized diagram above (right). £
Note that the second rule makes use of the fact that the nodes are labeled
by the names of the variables. This way, striking out a node does not lead
to processing the wrong bits. If we apply these reduction rules on a decision
tree, until no rules can be applied anymore, then the resulting diagram is
called a “Reduced Ordered Binary Decision Diagram” (ROBDD).
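A hedged sketch of the two reduction rules in code (illustrative, not the notes' algorithm); a diagram node is a tuple (variable, 0-successor, 1-successor), and the leaves are 0 and 1:

# Bottom-up reduction of an ordered decision diagram.
def reduce_bdd(node):
    if node in (0, 1):                # a leaf
        return node
    var, low, high = node
    low, high = reduce_bdd(low), reduce_bdd(high)
    if low == high:                   # the 0- and 1-successor coincide,
        return low                    # so the test on var can be eliminated
    return (var, low, high)           # equal subdiagrams become identical
                                      # tuples, i.e. they are shared

# decision tree for "x and y", then its reduction:
print(reduce_bdd(("x", ("y", 0, 0), ("y", 0, 1))))  # ('x', 0, ('y', 0, 1))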
Example 2.41 Consider the language Ln = the set of words over {a, b}
having an a at the n-th position before the end. An NFA can
recognize this language with n + 1 states by guessing when the n-th last
position is being read and then initiating a run of n transitions to the final
state. On the contrary, a minimal DFA always has to memorize the last
n letters that it has processed, with a built-in path for each possible
combination. This obviously needs 2^n states. □
1. L0 belongs to PSPACE
Example 2.44 QBF, the evaluation problem for quantified Boolean
formulas, is PSPACE-complete. □
Our aim is now to prove that all three decision problems mentioned
above are PSPACE-hard, i.e., unlike in the case of DFAs, they are presumably
not efficiently solvable. (It can also be shown that they belong to PSPACE, but we
are not going to deal with that part of the proof.) For the case of the Non-Universality
Problem of NFAs we have to prove that for every L in PSPACE,
"L ≤p NFA Non-Universality Problem" holds.
working cell, the current state and the word that is written on the right of
the working cell. So, a configuration of M has the form uqv.
Let w = a1 . . . an. An accepting computation of M on w is a word over Γ ∪
Q ∪ {#} of the following form: u0 q0 v0 # u1 q1 v1 # . . . # us qs vs #, where
3. us qs vs is a configuration with qs ∈ F
(∗) w ∈ L iff the NFA F(w) does not accept all words in (Σ′)∗
We choose Σ′ = Γ ∪ Q ∪ {#} and define the NFA F(w) in such a way that
it accepts exactly the words that are not an accepting (M, w)-computation.
Then (∗) holds! To achieve this, the NFA F(w) has to check whether on the
input u one of the constraints 1. to 3. is indeed violated. In other words,
the NFA has to verify that:
The first condition can be tested with p(n)+2 states (because |u0 v0 | = p(n)).
The second condition can be tested by guessing 3 consecutive cells, whose
update (after p(n) + 2 positions) is not correct. O(p(n) + 2) states suffice
for this test.
. . . [aab][apb] . . . # . . . [aab][qaa]
(the two bracketed triples are p(n) + 2 positions apart)
The automaton may compare the three consecutive cells in the first square
brackets with the ones that are p(n) + 2 positions away. In the case that
none of these cells is currently (or in the previous step) occupied by the
head of the Turing Machine, one cell would suffice for the comparison. But
in case one of the cells contains the state of the Turing Machine (like in the
second square brackets) three cells are required. In the example above the
NFA verifies that the Turing Machine changed from state p to q, moving
the head to the left (that is why the q is now written on the left of the a
that was previously on the left of the p) and printing an a on the b. In
this example the cells adhere to the transition function. In the general case
the NFA has to guess such a pair of triples where this does not
hold. It is now obvious that O(p(n) + 2) states suffice for this test, where
the constant represents the space that is required to save a triple of cells like
it was mentioned above.
The last condition can be trivially tested by using a memory state to
remember the information “no final state has been seen between the last
two #”. Consequently, from M and w we can construct the appropriate
NFA F (w) as a union-NFA from the three automata mentioned above with
p(n)+2, const·(p(n)+2) and 5 states respectively. Altogether, this NFA has
polynomially many states with respect to n = |w|, which also means that
the transformation w 7→ F (w) is computable in polynomial time. Thus, we
have proven the following theorem:
The first question is about the existence of an appropriate w and the ex-
istence of some corresponding accepting run on it, which, as we know, is
solvable in linear time (by performing breadth-first or depth-first search).
On the contrary, the second problem is about the existence of an accepting
run for each possible w, or for the existence of such a w that all possi-
ble runs on it are rejecting. Now we can see that there is a big difference
in terms of logic between the two problems. The first one contains two
consecutive existential quantification whereas the second is expressed by a
universal quantification followed by an existential one (or vice versa in the
case of the negation). The PSPACE hardness of the second problem relies
exactly on this alternation of the existential and the universal quantifier. It
is easy to see that the universality problem of an NFA A over Σ is a special
case of the equivalence problem between A and the standard NFA A0 with
just one state that is also a final one (obviously accepting Σ∗ ). Likewise,
[Diagram: two NFAs over {a, b, c}]
They accept the same language, but they are not bisimilar (both started on
their leftmost state). £
2. If player I can pick (p, a, p0 ) ∈ ∆A such that for all (q, a, q 0 ) ∈ ∆B that
player II can choose, player I wins from (p0 , q 0 ), then from (p, q) player
I wins. Similarly, if player I can pick (q, a, q 0 ) ∈ ∆B such that for all
(p, a, p0 ) ∈ ∆A that player II can choose, player I wins from (p0 , q 0 ),
then from (p, q) player I wins.
The Remark 2.53 motivates the marking algorithm for marking pairs
(p, q) with (A, p) 6∼ (B, q) shown in Figure 2.3.
Example 2.54 We apply the marking algorithm to reduce the NFA shown
below. According to our convention, the final states are 4 and 6. That is
[Diagram: an NFA with states 1–6, a- and b-transitions, and final states 4 and 6]
why before entering the loop, we mark all pairs, where exactly one of those
two states occurs. In the first loop we mark the pair (2, 1) because we can
choose to move from 2 with b, but we cannot move with the same letter
from state 1. We also mark the pair (5, 2) because there exists a transition
with b from 5 that leads to state 6 that is already marked with every state
that is reachable from 2 with the same transition (namely 3, since it is the
only state reachable with a b and (6, 3) is marked). Continuing like this, the
algorithm terminates leaving the pairs (5, 3) and (6, 4) unmarked.
This means that we can merge these states in the reduced automaton.
The resulting NFA is shown below and it happens to be also a minimal one.
[Diagram: the reduced NFA with states 1, 2, {3, 5} and final state {4, 6}] □
The algorithm terminates after ≤ |QA| · |QB| loops (all possible pairs of
states). Every loop requires the consideration of each previously unmarked
pair (p, q) by checking the corresponding outgoing transitions, i.e. a computational
effort proportional to |QA| · |∆A| + |QB| · |∆B|. Altogether,
if A, B have |QA| + |QB| = n states and |∆A| + |∆B| = m transitions, then
the marking algorithm terminates in time O(n³m). Better algorithms are
known, working in time O(n · log₂(n)) (Paige and Tarjan, 1987).
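The marking algorithm can be sketched as follows (an illustrative reconstruction, not the notes' Figure 2.3); transitions are triples (p, a, q), and the game characterization above is used to propagate markings:

# Marking the pairs (p, q) of non-bisimilar states of an NFA.
def mark_non_bisimilar(states, transitions, finals):
    marked = {(p, q) for p in states for q in states
              if (p in finals) != (q in finals)}
    succ = {}
    for (p, a, q) in transitions:
        succ.setdefault((p, a), set()).add(q)
    changed = True
    while changed:
        changed = False
        for p in states:
            for q in states:
                if (p, q) in marked:
                    continue
                labels = {a for (r, a) in succ if r in (p, q)}
                for a in labels:
                    ps = succ.get((p, a), set())
                    qs = succ.get((q, a), set())
                    # mark if player I has a move that player II cannot answer
                    if any(all((x, y) in marked for y in qs) for x in ps) or \
                       any(all((y, x) in marked for y in ps) for x in qs):
                        marked.add((p, q))
                        marked.add((q, p))
                        changed = True
                        break
    return marked  # pairs left unmarked are bisimilar and may be merged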
[Diagram: from an NFA A, the bisimulation quotient is computable in PTIME; finding the smallest equivalent NFAs A0, A1, . . . , Ak is NP-hard, and there can be a gap in size between the quotient and the minimal NFAs]
Example 2.55 For the following NFAs the alternating simulation holds,
but not the bisimulation.
[Diagram: two NFAs with a- and b-transitions]
Finally, there is an alternative technique that leads not only to the reduced,
but even to the minimal NFA. The idea relies on the fact that, given an NFA
with n states, there are only finitely (of course exponentially) many NFAs
with fewer than n states. The following procedure illustrates how we can
construct a (in most cases huge) NFA that contains all equivalent smaller
NFAs, and in which we can restrict our attention to those NFAs that
really can be minimal. Given a regular language L,
there is indeed a canonical NFA in which we can search for the minimal
one; it is constructed as follows:
2. In AL invert all transitions and switch all final states to initial ones
and vice versa and thus obtain the NFA AL .
This confirms our initial claim, namely that the minimal NFA can be found
within the Universal-NFA.
Example 2.60 Consider the language L = bba∗. Then a ∼L ba, because for all
w ∈ Σ∗: aw ∉ L and baw ∉ L. But a ≉L ba, because b · a · ε ∉ L, while
b · ba · ε ∈ L. □
Notation:
The structure M(L) := (Σ∗/≈L, ·, ⟨ε⟩) is also a monoid, namely the syntactic
monoid of L. But how can we construct the correct monoid for a
language? We do this by applying specific transformations on the (canonical)
automaton that recognizes this language.
Let A = (Q, Σ, q0, δ, F) be a DFA. Every word w ∈ Σ∗ induces a state
transformation w^A : Q → Q defined by w^A(q) := δ(q, w). Note that
ε^A = idQ (the identity function). Composition goes as follows: u^A ∘ v^A is
the function (uv)^A with (u^A ∘ v^A)(q) = v^A(u^A(q)).
Note that over Q there are at most |Q|^|Q| many functions. This means
that the transition monoid of a DFA is finite. If Q = {0, . . . , n}, then we
represent the state transformation u^A by the sequence u^A(0), . . . , u^A(n) of
its values on the states of the DFA.
[Diagram: a DFA with states 0, 1, 2; a-transitions 0 → 1 → 2 → 0 and b-loops at every state]
ε^A : (0 1 2)
a^A : (1 2 0)
aa^A : (2 0 1)
One can easily see that there are no other (distinct) transformations on this
DFA, e.g. b^A = ε^A or aabaa^A = a^A. But of course this is no proof that we
have indeed found all possible transformations. There is a formal method
to determine all functions u^A, and it goes as follows:
[Diagram: left, the DFA; right, its transition monoid with elements ε = (012), a = (120), aa = (201) and edges indicating composition with the letters a and b]
Having determined the functions from the previous example, we draw the
transition monoid (on the right). □
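This method is easy to mechanize; the following sketch (illustrative, not from the notes) explores the transformations breadth-first, using the three-state example above, where a cycles the states and b is the identity:

# Computing the transition monoid of a DFA by breadth-first search.
def transition_monoid(states, sigma, delta):
    order = sorted(states)
    identity = tuple(order)                    # the transformation eps^A
    monoid = {"": identity}
    frontier = [("", identity)]
    while frontier:
        word, f = frontier.pop(0)
        for a in sigma:
            g = tuple(delta[(q, a)] for q in f)   # apply f first, then a
            if g not in monoid.values():
                monoid[word + a] = g
                frontier.append((word + a, g))
    return monoid

delta = {(q, "a"): (q + 1) % 3 for q in range(3)}
delta.update({(q, "b"): q for q in range(3)})
print(transition_monoid({0, 1, 2}, "ab", delta))
# {'': (0, 1, 2), 'a': (1, 2, 0), 'aa': (2, 0, 1)}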
[Diagram: a DFA with states 0, 1, 2, 3 over {a, b} and its transition monoid with the transformations ε = (0123), a = (1213), aa = (2123), b = (3213), ba = (3123)]
Hence, by constructing the transition monoid (by using the previously il-
lustrated method) and by applying the last theorem, we obtain the syntactic
monoid.
Proof: Define f : M(L) → T(AL) by f(⟨u⟩) := u^{AL}. f is the appropriate
isomorphism: it is well-defined and injective because ⟨u⟩ = ⟨v⟩ iff u^{AL} = v^{AL},
and it is surjective because every transformation in T(AL) is of the
form u^{AL} for some word u. ∎
Now we can use these results to identify the class to which a language
belongs. For example, in the previous chapter we showed that the star-free
languages are non-counting. In the following we show that whether a language
is non-counting can be recognized from the property "group-free"
of its syntactic monoid. Furthermore, we show that for regular languages
the properties non-counting and star-free coincide.
[Diagram and tables: the transition monoid {ε, a, aa} of the DFA, with composition table
  ·    ε    a    aa
  ε    ε    a    aa
  a    a    aa   ε
  aa   aa   ε    a
which is isomorphic to addition modulo 3 on {0, 1, 2}]
a
0 1 b
AL1 :
a a
2 b
a
0 1
b a, b a, b
AL2 :
a, b 3 2
Theorem 2.75 For a regular language L the following conditions are equiv-
alent:
(a) L is non-counting.
Proof (from modulo-counting to group): Note that in every finite
monoid each element m has a power m^k that is idempotent. Now set K =
the least common multiple of these k-values. Then for every element m of the
monoid, m^K = m^{2K} (∗). By hypothesis there are arbitrarily large n (also
> K), such that for appropriate u, v, w:
u v^n w ∈ L but u v^{n+1} w ∉ L
2.5 Exercises
Exercise 2.1 Find a family (Ln )n>0 of languages over {a, b}, such that
• Ln is recognized by a DFA with O(n) states, and
Hint: The first automaton of Example 2.14 points in the right direction.
(L^R = {bn · · · b1 | b1 · · · bn ∈ L} is the language of the reversed words.)
Exercise 2.3 Consider the language L = b(a + b)∗ aa over the alphabet
{a, b}.
(a) Give representatives for the five ∼L -classes.
(b) Verify that the representatives are pairwise not language equivalent, i.e.
for any pair u, v, u 6∼L v holds (construct a table, where for each pair a
separating word w is entered).
Exercise 2.4 (a) Apply the minimization algorithm (page 54) in order to
minimize the following DFA (describe the steps as in Example 2.34):
[Diagram: a DFA with states 1–8 over {a, b}]
Exercise 2.5 Consider the DFA A without final states as given by the
following transition graph:
[Diagram: a transition graph with states 1, 2, 3 and a- and b-transitions]
Choose final states such that A is minimal, or show that this is impossible.
Exercise 2.6∗ Show that there exists a family of regular languages
(Ln)n>0 such that Ln is recognized by n different (not isomorphic)
minimal NFAs.
Hint: The editors do not know the solution yet.
Exercise 2.7 (a) Determine the transition monoid of the following automaton
according to the method presented in the lecture:
[Diagram: a DFA with states 0, 1, 2, 3 over {a, b}]
Give a non-trivial group.
Chapter 3

Tree Automata
In the last two chapters we dealt with two fundamental issues in the world of
finite automata, namely establishing a connection to logical (or “logic-like”)
expressions and discovering algebraic ideas and methods to solve typical
problems on finite automata. In this chapter we are going to generalize the
model of a finite automaton itself, and use it for processing generalized inputs
(i.e. not only finite words). This can be done in two different dimensions.
One is to lift the constraint of finiteness and consider infinite inputs, such
as the runs of a non-terminating procedure. In this case we are no longer
interested in reaching some final state, but rather in ensuring stability, e.g.
by passing again and again through a certain state. The other is to consider
(finite) inputs that are branching rather than linearly ordered (i.e. where
letters follow one another).
The first dimension goes beyond the scope of this course and is the core
of the next semester’s course about automata over infinite objects. The
second dimension leads to the tree automata and tree languages, which are
our main subject in this chapter.
Our primary motivation relies on the fact that trees model structured
objects, which happen to be very frequent in computer science. Even the
finite words we were dealing with in the previous chapters are a special case
in this category; they represent linear trees (only one successor). The fun-
damental questions we are going to answer are accordingly a generalization
of the ones we considered in the last two chapters, i.e.:
[Diagrams: the term trees of x · (π + z) and of ¬(x ∨ y) ∧ 1]
{Input in x1 , x2 }
x3 := x1; x4 := x2;
WHILE x3 <> x4 DO
way to represent the structure of a derivation tree in normal text form is the
use of special bracketing. For every non-terminal symbol X the brackets (X
and )X are introduced. Then, the derivation tree t for w is coded by the
word wt that is produced by inserting bracketing symbols into w.
Example 3.3 The derivation tree on the left can be coded by the word on
the right.
[Diagram: a derivation tree with root S and two A-children; the first A derives A B with A → c and B → b, the second A derives a]
(S (A (A c )A (B b )B )A (A a )A )S □
XML-documents are also coding trees in normal text form. Here, the
bracketing notation is slightly modified: hXi is used for (X and h/Xi is
used for )X .
<ProfileOfStudies>
<faculty> Computer Science </faculty>
<semester> 3 </semester>
<courses>
<title> PROGRAMMING </title>
<title> COMPUTER STRUCTURES </title>
<title> DATA STRUCTURES </title>
</courses>
</ProfileOfStudies>
Example 3.7 The ranked alphabet for the term ¬(x ∨ y) ∧ 1 from Example
3.1 is Σ = {0, 1, ¬, ∧, ∨} with Σ0 = {0, 1}, Σ1 = {¬}, Σ2 = {∧, ∨}. £
Example 3.8 The ranked alphabet for the context-free grammar from Ex-
ample 3.1 is Σ = {a, b, c, A, B, S} with Σ0 = {a, b, c}, Σ1 = {A, B}, Σ2 =
{S, A}. £
We first define trees in a term-like fashion.
[Diagram: a tree with root a and subtrees t1, t2, . . . , tk−1, tk]
Example 3.12 An inductive proof for the arithmetical terms over {0, 1, +, ·}:
Inductive proof:
1. Induction start:
Verify the hypothesis for the terms 0 and 1.
2. Induction step:
Assume the hypothesis is fulfilled for t1 , t2 . Show that the hypothesis
holds for (t1 + t2 ) and (t1 · t2 ) (by simply distinguishing the possible
cases).
£
An alternative, but equivalent, definition of trees distinguishes the tree-
domain, which is the set of underlying nodes, from their values under an
evaluation function.
Example 3.13 Below, the tree associated to the term ¬(x ∨ y) ∧ 1 and its
underlying domain are shown. The evaluation function val satisfies val(ε) =
∧, val(1) = ¬, . . .
[Diagram: left, the tree with root ∧ and children ¬ and 1; right, its domain {ε, 1, 2, 11, 111, 112}]
• Σ is a ranked alphabet,
Example 3.16 We define a tree automaton accepting all true Boolean ex-
pressions over Σ = {∧, ∨, ¬, 0, 1}. Let Q = {q0 , q1 } be the state set, F = {q1 }
be the set of final states, and δ be defined by
δ(0) = q0
δ(1) = q1
δ(qi , ¬) = q|i−1|
δ(qi , qj , ∧) = qmin(i,j)
δ(qi , qj , ∨) = qmax(i,j)
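As a hedged illustration (not from the notes), this transition function can be run bottom-up on Boolean terms given as nested tuples:

# Evaluating delta* on terms like ("and", ("not", ("or", "0", "1")), "1");
# the returned value i stands for the state q_i.
def eval_dta(t):
    if t == "0":
        return 0                        # delta(0) = q0
    if t == "1":
        return 1                        # delta(1) = q1
    op = t[0]
    if op == "not":
        return 1 - eval_dta(t[1])       # delta(q_i, not) = q_|i-1|
    qs = [eval_dta(s) for s in t[1:]]
    return min(qs) if op == "and" else max(qs)

print(eval_dta(("and", ("not", ("or", "0", "1")), "1")))  # 0, i.e. rejected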
Example 3.18 The set of the true logical expressions (the ones that can
be evaluated to 1) is regular. £
For T2 let Q = {q0 , q1 , q2 } (signaling respectively no, one and at least two
occurrences of an f in the subtree below the current node) and F = {q2 }.
This induces the following transition function:
δ(0) = δ(1) = q0
δ(qi , g) = qi
δ(qi , qj , f ) = qmin(i+j+1,2)
For T3 let Q = {q0 , q1 , q01 , q10 } and F = {q0 , q1 , q01 }. A node of the tree
is labeled with state q01 if all 0’s are before the first 1 in the frontier word
of the subtree below this node. q10 signals that there is a 1 somewhere left
of a 0 on the frontier, and q0 , q1 signal that only 0’s, respectively 1’s, appear
on the frontier of the corresponding subtrees. The meaning of the states
is implemented in the following transition function: δ(0) = q0 , δ(1) = q1 ,
δ(q, g) = q for all q ∈ Q,
δ(q0 , q0 , f ) = q0 δ(q01 , q0 , f ) = q10
δ(q1 , q1 , f ) = q1 δ(q01 , q1 , f ) = q01
δ(q0 , q1 , f ) = q01 δ(q0 , q01 , f ) = q01
δ(q1 , q0 , f ) = q10 δ(q1 , q01 , f ) = q10 ,
Example 3.22 The derivation tree on the right belongs to the language
defined by the grammar on the left.
Grammar: S → AA, A → c | a | AB, B → b
[Diagram: a tree with root S and children A, A; the left A derives c; the right A derives A B with A → a and B → b] □
The following theorem shows that this kind of tree languages forms a
proper subset of the regular ones.
δ(a) = a for a ∈ Σ0
δ(σ1, . . . , σi, X) = X if X → σ1 . . . σi ∈ P (for i > 0, X ∈ Σi)
δ(σ1, . . . , σi, X) = ∗ otherwise ("error" state)
δ(σ1, . . . , σi, ∗) = ∗
2
Proof: [part b] It suffices to show that there exists a tree language that
cannot be represented as a language of derivation trees. For this reason we
consider the set T containing all trees with the following property:
(∗) On every path there are at most two A’s and there exists at least one
path with two A’s.
[Diagram: the tree a(t, t)]
When reading the root of a tree, the only information a DTA can get from
the two subtrees below, is the states it reaches at the top of each subtree.
Hence, the idea to prove this assumption (as well as other similar ones) is
to show that the finitely many states of the DTA do not suffice to compare
the infinitely many different subtrees.
Proof: Towards a contradiction, we assume that the DTA A = (Q, Σ, δ, F)
recognizes T. Since we do not have a restriction on the size of the trees,
there are obviously infinitely many trees over Σ. But because Q is
finite, there exist two different trees t, t′ with δ∗(t) = δ∗(t′), and hence also
δ∗(a(t, t)) = δ∗(a(t, t′)) (at this point the automaton cannot distinguish
between t and t′ anymore, since both lead it to the same state). Now, since
δ∗(a(t, t)) ∈ F, also δ∗(a(t, t′)) ∈ F, which means that a(t, t′) ∈ T.
Contradiction. □
Example 3.27 Let Σ = Σ0 ∪Σ2 with Σ0 = {e}, Σ2 = {a, b}. The language
T = {t ∈ TΣ | t is complete } is not regular. (A tree t is complete iff all its
paths have the same length).
Proof: Towards a contradiction, assume that T is regular (i.e. recognized
by some DTA). Then, consider a split according to the Pumping Lemma.
By repeating the middle part twice, we get a tree that is accepted, although
it is not complete. Contradiction. £
By the previous two examples we realized that DTAs are not powerful
enough to solve some fundamental questions over trees. There are still some
other interesting problems reducible to recognizing tree properties that can
be characterized by finitely many states. The problem of type checking
that is embedded in every modern programming language represents such
an issue.
To answer this purpose, a DTA is used that has a state for each one of
the types (finitely many). The DTA evaluates the term using the type rules
as transitions and assumes an error state in case of a type mismatch. The
run tree of the automaton is shown below:
[Diagram: the run tree, with the state nat at the leaves and inner nodes, and an implicit cast int ← nat where the abs-function is applied]
Because the abs-function is not defined on nat, the last rule (namely that nat
is a subtype of int) has to be applied first. This corresponds to the implicit
type casting in programming languages and is implemented by ²-transitions
in the tree automaton. £
• Σ = Σ0 ∪ . . . ∪ Σm is a ranked alphabet,
£
In the case of automata over finite words, nondeterminism does not offer
any additional expressive power. The following theorem shows that the same
happens in the case of tree automata.
Theorem 3.31 For each NTA one can construct an equivalent DTA.
• Q′ = 2^Q,
• F′ = {P ⊆ Q | F ∩ P ≠ ∅},
It remains to show that for every t ∈ TΣ , δ ∗ (t) is the set of those states
q ∈ Q, for which a run of A on t exists reaching q at the root of t. By
induction on the construction of t, one can easily see this. Then:
A′ accepts t
iff δ∗(t) ∈ F′
iff there is some q ∈ δ∗(t) ∩ F
iff A accepts t.
2
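A hedged code sketch of this construction (illustrative names; transitions are tuples (q1, . . . , qi, a, q), with i = 0 for leaf symbols):

# The powerset tree automaton: delta'(P1, ..., Pi, a) collects all states
# reachable by some transition whose child states lie in the given sets.
def dta_step(transitions, child_sets, a):
    return frozenset(t[-1] for t in transitions
                     if len(t) == len(child_sets) + 2 and t[-2] == a
                     and all(q in P for q, P in zip(t[:-2], child_sets)))

def delta_star(transitions, t):     # t is a nested tuple; leaves are symbols
    if isinstance(t, str):
        return dta_step(transitions, (), t)
    return dta_step(transitions,
                    tuple(delta_star(transitions, s) for s in t[1:]), t[0])

trans = {("a", "q0"), ("q0", "q0", "f", "q1")}
print(delta_star(trans, ("f", "a", "a")))  # frozenset({'q1'})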
The closure properties also hold, as they do for NFAs.
Theorem 3.32 The class of regular tree languages is closed under the Boolean
operations (∪, ∩, complement) and under projection.
This can be easily shown by copying the following well-known proofs:
• intersection and union by product automaton,
• complement as with normal NFAs,
• for the projection f : Σ → Γ (assumed as rank preserving) change
from a transition (q1 , q2 , a, q) to (q1 , q2 , f (a), q) .
A natural thing to ask is whether there are any results that cannot be
directly transferred from word automata to tree automata. Unfortunately,
the answer to this question is positive. Until now, we were using tree au-
tomata that process trees from the leaves to the root (bottom-up). We can
also introduce automata that evaluate their input in the opposite direction
(top-down) and show that in this case a reduction from nondeterminism to
determinism is not possible.
Remark 3.34 Normal NTAs can be directly transformed to ↓NTAs and vice
versa.
A ↓DTA is a special ↓NTA that has a singleton set {q0} as initial
state set Q0, transition functions δ1, . . . , δm with δi : Q × Σi → Q^i instead of
∆, and final combinations exactly as for a ↓NTA. A tree t is accepted via the
unambiguous run on t. This also means that the state assumed at a position
of the run tree can only depend on the nodes of the path back to the root.
The example language considered below shows that ↓DTA can recognize
less than ↓NTA (in contrast to what happens in the case of bottom-up tree
automata).
Q := {q0 , qa , qb }
Q0 := {q0 }
∆ := {(q0 , f, qa , qb ), (q0 , f, qb , qa ), (qa , a), (qb , b)}
2
Proof (↓DTA non-recognizability): Towards a contradiction we assume
that the ↓DTA A recognizes T. Then A accepts f(a, b), with the unambiguous
run that assigns q0 to the root, q1 to the left leaf a and q2 to the right leaf b.
This means that δ(q0, f) = (q1, q2) and that (q1, a) and (q2, b) are final
combinations. A likewise accepts f(b, a), with the run assigning q1 to the
left leaf b and q2 to the right leaf a. Consequently, (q1, b) and (q2, a) are
also final combinations. Hence A also accepts f(a, a). Contradiction. ∎
Proof: Given an NTA A, the main idea for the proof is to consider the
set EA of all states that are reachable on the roots of input trees and note
that T (A) 6= ∅ iff EA contains a final state. Then, EA is computed in the
following way:
Emptiness algorithm:
Input: A = (Q, Σ, ∆, F )
E0 := {q ∈ Q | (a, q) ∈ ∆ for some a ∈ Σ0 }
n := 0
Repeat
n := n + 1
En := En−1 ∪ {q ∈ Q | there exists q1 , . . . , qi ∈ En−1 ,
a ∈ Σi with (q1 , . . . , qi , a, q) ∈ ∆}
Until En = En−1
If En ∩ F = ∅ output(“empty”) else output(“non-empty”)
Note that since Q is finite, the algorithm terminates. The effort in terms of
time complexity is O(|Q| · (|Q| · |∆|)).
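A direct transcription of this algorithm (an illustrative sketch; the toy NTA in the demo is invented):

# Emptiness test for an NTA; transitions are tuples (q1, ..., qi, a, q),
# and leaf transitions (a, q) have length 2.
def is_empty(transitions, finals):
    E = {t[-1] for t in transitions if len(t) == 2}          # E_0
    while True:
        E_next = E | {t[-1] for t in transitions
                      if all(q in E for q in t[:-2])}
        if E_next == E:
            break
        E = E_next
    return not (E & set(finals))

print(is_empty({("a", "q0"), ("q0", "q0", "f", "q1")}, {"q1"}))  # False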
T = { t ∈ TΣ | on every branch of t
there are two consecutive occurrences of an a}.
Consider the trees t = a(b, a), t′ = b(a, a) and t″ = a(b, b). Then t ≁T t′
and t ∼T t″. The first can be shown by choosing the special tree s = a(c, a).
Then t ∘ s ∈ T but t′ ∘ s ∉ T, as can be seen below (the letter in brackets
is the one that was substituted for the sake of the concatenation).
[Diagram: the trees t ∘ s and t′ ∘ s, with the substituted leaf marked in brackets]
δ ∗ (t1 ◦ s) ∈ F ⇔ δ ∗ (t2 ◦ s) ∈ F
Remark 3.40 In the special case that Σ = {c}∪Σ1 the relations ∼T and ∼A
are identical to the corresponding congruences for the case of finite words.
c) index of ∼T ≤ index of ∼A .
Definition 3.42 For a given ∼T with finite index we define the DTA AT =
(QT , Σ, δT , FT ) with
• FT = { [t] | t ∈ T }
Only the way we identify two equivalent states (given two other states that
have been already found to be equivalent) is a little more complicated than
in the case of finite words. This is because, as stated before, in trees we
have to explicitly state at which leaf we concatenate a tree with another.
Formally: q ∼A q′, a ∈ Σi ⇒
δ(q1, . . . , qj−1, q, qj+1, . . . , qi, a) ∼A δ(q1, . . . , qj−1, q′, qj+1, . . . , qi, a)
for all j ∈ {1, . . . , i} and all q1, . . . , qj−1, qj+1, . . . , qi ∈ Q.
Translated into an algorithm, the procedure takes the following steps:
1. Mark all pairs (q, q′) such that exactly one of the two states belongs to F.
2. While a new pair was marked in the last step: mark all yet unmarked
pairs (q, q′) for which there exist an a ∈ Σi and states
q1, . . . , qj−1, qj+1, . . . , qi, such that the pair
(δ(q1, . . . , qj−1, q, qj+1, . . . , qi, a), δ(q1, . . . , qj−1, q′, qj+1, . . . , qi, a))
is already marked.
Definition 3.43 T1 ·c T2 is the set of all trees that are formed from some
t ∈ T1 , by replacing each c-labeled leaf by some tree from T2 .
Note that from now on we are concatenating trees in a top-down manner, in
contrast to what we did in Section 3.2. This is a natural thing to do at this
point, since we now append more than one tree to an existing one. Also note
Example 3.45 Consider the singleton tree language T := {f(c, c, a)}. Some
of the trees belonging to T∗c are schematically shown below:
[Diagram: schematic trees obtained from the tree by repeatedly substituting at the c-labeled leaves] □
[Diagram: a tree over the labels a1, a3, b1, b2, with root a1]
This tree can be generated according to the above-mentioned regular expression
by applying the star operation (∗c) twice and the concatenation (·c) once:
[Diagram: the three-step generation of the tree by substitution at the c-labeled leaves] □
∆ = (∆1 \ {(c, q) | q ∈ Q1}) ∪ ∆2
  ∪ {(q1^(2), . . . , qi^(2), a, q^(1)) | ∃q′ ∈ F2 : (q1^(2), . . . , qi^(2), a, q′) ∈ ∆2 and (c, q^(1)) ∈ ∆1}
2
Proof ((⇒)): We suppose that T is recognized by the NTA A = (Q, Σ, ∆, F ).
First of all we have to choose some distinguished 0-ary symbols to enable
concatenation (“c-symbols”). For this purpose we misuse the state set of
the automaton and extend Σ to Σ ∪ Q, where the symbols from Q have arity
0.
Example 3.52 Let Σ = Σ0 ∪Σ2 with Σ0 = {b}, Σ2 = {a} and Q = {∗, 0, 1}.
An example tree over this extended alphabet would be
[Diagram: a tree with a-labeled inner nodes and leaves labeled b, 0, ∗, b, b] □
After having extended the alphabet we construct A0 from A by adding
the transitions (p, p) for every p ∈ Q, to allow the automaton to read the new
0-ary symbols on the leaves and synchronously assume the corresponding
state (having the same name).
Next we define a set T (R, S, q) for R, S ⊆ Q that contains all trees over
Σ ∪ Q, on which there exists a run of A0 ,
• whose leaves that are labeled by Q-symbols in the input tree are
labeled with states from R,

• whose root is labeled by the state q, and

• whose remaining nodes are occupied by states from S.
We call the elements of R the “source states”, q the “target state” and
the elements of S the “intermediate states”.
Example 3.53
a^0 (a^1 (a^∗ (b^∗ , b^∗ ), 0^0 ), a^∗ (∗^∗ , b^∗ )) ∈ T ({0, ∗}, {1, ∗}, 0)

(the superscript at each node indicates the state of the run at that node)
£
Hence, it suffices to show that all sets T (R, S, q) are regular, which we prove
by induction over the size of S.
then by induction hypothesis we can build a regular expression for the de-
composed parts, since all their intermediate nodes are in S0 . Using the
following equation we can then construct a regular expression for T (R, S, q).
T (R, S0 ∪ {s}, q) =
T (R, S0 , q) ∪ T (R ∪ {s}, S0 , q) ·s (T (R ∪ {s}, S0 , s))∗s ·s T (R, S0 , s). 2
3.3.2 MSO-Logic
In this subsection we introduce a way to describe the properties of trees
using Monadic Second-Order Logic. Furthermore, as we did in the case
of finite words, we compare the expressive power of this logic to that of
tree automata, this time without giving a proof.
• Pat := {u ∈ domt | valt (u) = a} (i.e. the set of all positions of t labeled
with an a)
The syntax of MSO-Logic over trees follows the one over words, though
with the atomic formulas Si (x, y) (“y is the i-th successor of x”) in place
of the successor formula S(x, y) used over words.
1. We can express that a node x is the root of a tree (i.e. it does not
have any predecessors) by ¬∃y (S1 (y, x) ∨ . . . ∨ Sk (y, x)), where k is
the maximal arity in Σ.
£
The following theorem asserts the equivalence between tree automata
and MSO-logic.
Example 3.59 We know that an XML-Document, like the one shown be-
low, is a word that codes a tree using the brackets <X> and </X> to signal
the beginning and the end of a branch.
<ProfileOfStudies>
<faculty> Computer Science </faculty>
<semester> 3 </semester>
<courses>
<title> PROGRAMMING </title>
<title> COMPUTER STRUCTURES </title>
<title> DATA STRUCTURES </title>
</courses>
</ProfileOfStudies>
The corresponding tree has root PS with successors F, S and C; F and S
each have a TXT-leaf, and C has three T-nodes, each with a TXT-leaf
(PS = ProfileOfStudies, F = faculty, S = semester, C = courses, T = title,
TXT = text content).
Note that the branching under “Courses” is arbitrary, i.e. we could as well
list more than three courses. £
This unlimited branching is already known from other areas. In arithmetic,
for example, although we formally define “+” as a binary operation,
we are used to writing a sum of arbitrarily many numbers (n1 +. . .+nm ) be-
cause we know that the order is not important. In this sense we could as well
say that the “+”-operation has arbitrary arity (at least 2). The following
example illustrates a similar case in the area of boolean expressions.
((1 ∨ 0) ∨ (1 ∧ 0 ∧ 0) ∨ (0 ∨ 1 ∨ 0))
∨( ∨(1, 0), ∧(1, 0, 0), ∨(0, 1, 0) )
The model of a Σ-valued tree is defined similarly to the one over a ranked al-
phabet. The main difference is that in this case it is rather inappropriate to
use an indexed successor relation, because the index could be arbitrary. In-
stead, we only use a simple (not indexed) relation of the immediate successor
and additionally we introduce an ordering relation between the immediate
successors of a node.
• a labeling alphabet Σ,
Note that because of the arbitrary branching of the trees, there may be
infinitely many different combinations of son-states that allow to proceed
to a certain parent state. Nevertheless, the transition relation can still be
presented as a finite set: the first component of each transition tuple is
specified by a regular expression or an NFA over Q.
• for every node u ∈ domt of rank n there exists (L, valt (u), ρ(u)) ∈ ∆
with ρ(u1) . . . ρ(un) ∈ L
• ρ(²) ∈ F (i.e. the state assumed on the root has to be a final one)
Note that, according to the first condition of the run function, for an a-
valued leaf node the transition relation of an XML-automaton must contain
a transition (L, a, q) with ² ∈ L.
• (², 0, q0 ), (², 1, q1 )
• (Q∗ q0 Q∗ , ∧, q0 ), (q1+ , ∧, q1 )
• (Q∗ q1 Q∗ , ∨, q1 ), (q0+ , ∨, q0 )
£
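
For this particular automaton a run can be computed deterministically
bottom-up, since the transitions are mutually exclusive. A small Python
sketch (with unranked trees encoded as pairs (label, children), our own
convention) evaluates a boolean-expression tree accordingly.

def run_state(node):
    # returns the state ('q0' or 'q1') the automaton assumes at the node
    label, children = node
    if not children:             # transitions ({eps}, 0, q0), ({eps}, 1, q1)
        return 'q0' if label == '0' else 'q1'
    states = [run_state(c) for c in children]
    if label == 'and':           # (Q* q0 Q*, and, q0) and (q1+, and, q1)
        return 'q0' if 'q0' in states else 'q1'
    else:                        # (Q* q1 Q*, or, q1) and (q0+, or, q0)
        return 'q1' if 'q1' in states else 'q0'

# the tree of ((1 or 0) or (1 and 0 and 0) or (0 or 1 or 0)):
t = ('or', [('or', [('1', []), ('0', [])]),
            ('and', [('1', []), ('0', []), ('0', [])]),
            ('or', [('0', []), ('1', []), ('0', [])])])
print(run_state(t))   # 'q1': the expression evaluates to 1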
Proof: Simply use the non-terminal symbols of the rule system S as states
in the XML-automaton (cf. Theorem 3.23). 2
One should note that for the other direction (the translation from XML-
automata to DTDs) the situation is as in Theorem 3.23. This can be avoided
by allowing several “instances” A1 , A2 , . . . , Ak of each non-terminal symbol
A and considering the derivation trees with the indices of these instances
removed (the details are omitted).
Note that the automaton defined in Example 3.68 happens to have at
most one run on a given tree. This holds because the transitions are defined
in a “mutually exclusive” way, i.e. all languages defined by the ∧-transitions
do not have common elements (and the same goes for the ∨-transitions).
This kind of determinism results from the specific transitions of the automa-
ton, but is still not integrated into its structure. In the following we define a
variation of the conventional XML-automaton with a built-in determinism.
• for a node u of rank n > 0 and valt (u) = a: ρ(u) is the (unambiguous)
output of Sa on the input ρ(u1) . . . ρ(un) (i.e. upon reading the ordered
sequence of the states of the immediate successors of an a-valued node,
Sa returns the state that has to be assigned to the node)
Now that we have defined both the deterministic and the nondeterministic
version of XML-automata, it is natural to ask about their expressive power.
The following theorem asserts that, as with automata on ranked trees,
nondeterminism does not offer the ability to describe more languages.
Furthermore, all main issues that were considered for the case of au-
tomata on ranked trees (and also on finite words) hold in exactly the same
way for XML-automata. These issues mainly include the following points:
1. The regular sets of valued trees are closed under the Boolean opera-
tions and under projection.
All these statements can be proved by simply copying and adapting the
results that we know from the previous cases. Alternatively, we can code the
Σ-valued trees by binary trees and use the results of the previous sections
directly.
The mentioned properties of XML-automata can be used for various
applications concerning XML-documents. Among these are:
(This is similar to the way NFAs read the input of finite words letter by
letter, although in this case we cannot really speak of a navigation since the
movement takes place only in one direction). In the following we construct
and briefly study such a “path-oriented” automaton that is always at one
node of a tree at a time, and from there it can move to the parent node
(↑), to the first immediate successor (↓), to the next immediate successor
of the immediate predecessor node (i.e. to the next brother node) (→)
and analogously to the previous one (←), or remain at the same position
(stationary move: s). To avoid the case that this automaton gets out of
bounds we extend the input tree t to t+ by adding border nodes labeled by
⊲ (left), ▽ (top), ⊳ (right) and △ (bottom).
Example 3.73 The simple tree a(b, c, d) must be extended before it is given
as an input to the new automaton: the root obtains a ▽-labeled parent,
every sequence of brother nodes is enclosed by ⊲ and ⊳, and every leaf
obtains △-labeled successors.
generalized model does not really belong to the area of trees. The essen-
tial distinction between them is the fact that trees are the kind of directed
graphs where the information flows from the root to the leaves in
a “spreading” manner, i.e. without being able to converge again into some
node. In other words, in trees a node always has at most one predecessor
whereas in two-dimensional words it has at most two. Still, we are going
to study pictures as a generalized model in this chapter because this very
step of generalization is the one that leads us out of the convenient world of
decidability and efficient computability as far as it concerns the most funda-
mental problems on describing and processing input models. In particular,
after introducing automata and MSO-formulas over labeled grids, we are
going to verify that
• valp : domp → Σ.
a b b
b a b
£
Definition 3.79 (Picture Language) Let Σ++ be the set of all pictures
over Σ. Any B ⊆ Σ++ is called a picture language over Σ. If Σ is a singleton
set, then we speak of grids [m, n] and grid languages.
Example 3.80 Below we list three basic picture languages that we are
going to consider in further detail later.
2. for Σ = {·}:
B2 = {[n, n] | n > 0} (square grids)
£
As usual we are going to introduce a model over these pictures that
allows us afterwards to define MSO-formulas, describing languages like the
ones mentioned above. Models of pictures belonging to a language should
then serve as an interpretation for the corresponding MSO-formula.
i.e. by expressing the fact that in the picture there must be a top left
position from which, after “moving” twice right and twice down, we reach
an a-labeled position whereas all other positions are labeled by letter b. £
alternating fashion. In this sense we can say that a square grid is actually
a picture in which “every set X that contains the top left position and is
closed under diagonal successors also contains the bottom right position”.
This is exactly what we express by the following MSO-formula:
∀X [ (∀x(topleft(x) → X(x)) ∧
      ∀x∀y∀z(X(x) ∧ S1 (x, y) ∧ S2 (y, z) → X(z)))
    → ∀x(bottomright(x) → X(x)) ]
(figure comparing a tree transition, which leads from (q1 , a) and (q2 , b) to
(q0 , a), with the transitions of a tiling system, whose run assigns a state to
every position of a picture, e.g. a^p b^p b^q in the first and b^p a^q b^r in
the second row of the 2 × 3 picture above)
# # # # #
# a b b #
# b a b #
# # # # #
#0 # # # #
# b1 b b #
# b b2 b #
# b b a0 #
# b b b #
# # # # #
#1 # # # #
# ·1 · · #
# · ·1 · #
# · · ·1 #
# # # # #1
The t-system accepts the grid only if it can match its special verification
transition

·1 #0
#0 #1

with some part of the input, i.e. if the 1 that was emitted from the top-left
corner is received by the bottom-right corner. £
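
The notion of a run of a t-system can be made concrete with a short Python
sketch: a state assignment is a run iff every 2 × 2 block of the bordered
picture, together with its states, is an allowed transition. The diagonal run
above serves as a test case; reading the tile set off that very run is of
course only an illustration of the data model, not a construction.

def blocks(picture, states):
    # all 2x2 blocks of a #-bordered picture together with its states
    for i in range(len(picture) - 1):
        for j in range(len(picture[0]) - 1):
            yield ((picture[i][j], states[i][j],
                    picture[i][j + 1], states[i][j + 1]),
                   (picture[i + 1][j], states[i + 1][j],
                    picture[i + 1][j + 1], states[i + 1][j + 1]))

def is_run(picture, states, tiles):
    return all(b in tiles for b in blocks(picture, states))

# the diagonal run on the bordered 3x3 square grid ('.' stands for the dot)
pic = [list(r) for r in ("#####", "#...#", "#...#", "#...#", "#####")]
st = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0],
      [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]]
tiles = set(blocks(pic, st))       # here: extract the tiles from the run
print(is_run(pic, st, tiles))      # True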
Example 3.89 Let Σ = {a, b} and consider the language B3 , namely the
set of all pictures over Σ that do not have the form pp for some square p. A
picture p′ belongs to this language if either

1. p′ is not of size (n, 2n) for any n, or

2. p′ is indeed of size (n, 2n), but there exist some 1 ≤ i, j ≤ n such that
valp′ (i, j) ≠ valp′ (i, j + n).
Case 1: Let us first find out how we can accept pictures of size (n, 2n).
This can be done by extending the idea of Example 3.87, namely as soon as
the signal state 1 is received by some bottom position it has to be reflected
to the up-right diagonal direction (renamed as signal state 2 to avoid any
confusion). Now we shall expect to receive the signal state 2 in the top-right
corner of the grid. Below we can see an accepting run that also shows the
other transitions that are required in order to fill out the grid correctly:
#1 #0 #0 #0 #0 #0 #0 #0 #0 #0 #0 #2
#0 ·1 ·0 ·0 ·0 ·0 ·0 ·0 ·0 ·0 ·2 #0
#0 ·0 ·1 ·0 ·0 ·0 ·0 ·0 ·0 ·2 ·0 #0
#0 ·0 ·0 ·1 ·0 ·0 ·0 ·0 ·2 ·0 ·0 #0
#0 ·0 ·0 ·0 ·1 ·0 ·0 ·2 ·0 ·0 ·0 #0
#0 ·0 ·0 ·0 ·0 ·1 ·2 ·0 ·0 ·0 ·0 #0
#0 #0 #0 #0 #0 #0 #0 #0 #0 #0 #0 #0
Case 2: The second t-system has to accept pictures p of size (n, 2n) (we
already know from the previous case how this can be done) that have at
least one pair of positions (i, j), (i, j + n) such that valp (i, j) ≠ valp (i, j + n).
First of all the t-system has to guess such a position (i, j), which can be
done easily by using nondeterminism. Then, it has to count n positions
to the right to verify that the symbol at (i, j + n) is different. Of course
counting cannot be done by using n different states because n is not fixed.
The idea is depicted below (we assume that (i, j) is a-labeled and (i, j + n)
is b-labeled):
(schematic picture: from position (i, j) signal 1 runs down and signal 3a
runs right; at the bottom border 1 becomes 2 and runs diagonally up-right;
where 2 crosses 3a they become 5 and 4a; at the top border 5 becomes 6
and runs down; where 6 meets 4a the letters are compared and signal 7
runs to the bottom-right corner)
The position (i, j) is guessed by using the transition

a1 ∗3a
∗1 ∗0

(each 2 × 2 transition is written with its upper row above its lower row).
From there the signal state 1 is emitted vertically towards the bottom border
and the signal state 3a (the a stands for the letter that has been read; it
is perfectly in order to do this because the alphabet is finite and given)
is emitted horizontally towards position (i, j + n). How do we know that
state 3a just reached this position to perform the comparison with (i, j)?
As soon as state 1 hits the bottom border, it is renamed to 2 and further
transmitted to the upper right diagonal direction, mainly with the transitions

∗1 ∗2        ∗0 ∗2
#0 #0   and  ∗2 ∗0 .

When state 2 crosses the horizontal transmission of state 3a, they are renamed
to 5 and 4a, respectively. The corresponding transitions are

∗3a ∗4a        ∗0  ∗5
∗2  ∗0    and  ∗4a ∗4a .

When state 5 hits the top border, it is renamed to 6 and further transmitted
vertically towards the bottom border, using transitions

#0 #0     ∗5 ∗6        ∗0 ∗6
∗5 ∗6 ,   ∗0 ∗6   and  ∗0 ∗6 .

Note that this happens at column j + n because the picture has n rows and
state 2 (later state 5 and 6) took as many steps to the right as to the top.
When state 6 crosses the horizontal transmission of state 4a, the letter
comparison is done by transitions

∗0  ∗6        ∗4a b7
∗4a b7   and  ∗0  ∗7 .

In case the letters are different an “OK” signal is transmitted (by using
state 7) to the bottom and finally to the bottom-right corner with transitions

∗0 ∗7     ∗0 ∗7     ∗7 ∗7     ∗7 #7
∗0 ∗7 ,   #0 #7 ,   #7 #7 ,   #7 #7 .     £
But what about the language B3 of Example 3.80? Is it also recognizable
by a t-system, as is the case with its complement? If not, then we would
have proven that the closure property under complement does not hold for
picture languages recognizable by t-systems.
ρp ρ′p         ρq ρ′q
Combining the left part of the left picture with the right part of the right
one, we can compose a run that accepts pq:
ρp ρ′q
Contradiction. 2
The following theorem summarizes the closure properties that we studied
above.
This should suffice to prove that it is not possible to have a method to decide
the emptiness problem for t-systems because, if we had one, then we would
be able to decide the halting problem for TMs as well (through the above
mentioned transformation), which has been proven to be impossible.
We construct AM in such a way that B(AM ) codes (in its elements)
exactly the terminating computations of M when started on the empty
tape. Let M = (Q, Σ, q0 , ∆, qf ) with à ∈ Σ. ∆ contains transitions of
the form (q, a, a′ , l/r, q′ ). We suppose that the tape is left bounded. The
configuration words are of the form b1 . . . bi−1 (qbi ) bi+1 . . . bn (denoting
that M is currently in state q, reading letter bi , and the rest of the symbols
are written on the tape to the left and the right of the head). For transitions
(q, bi , b′ , l, q′ ) ∈ ∆ and (q, bi , b′ , r, q′ ) ∈ ∆ the successor configurations are
b1 . . . (q′ bi−1 ) b′ bi+1 . . . bn and b1 . . . bi−1 b′ (q′ bi+1 ) . . . bn , respectively.
If we write the sequence of configurations one after the other into the rows
of a table, we obtain a picture of size (m, n), for sufficiently large m, n. This
way, a terminating computation of M started on the empty tape can be
represented by a picture over Σ ∪ (Q × Σ):
with
• a1,1 . . . a1,n = (q0 Ã)Ã. . . Ã,
for a, a′ ∈ Σ. 2
3.6 Exercises
Exercise 3.1 Let Σ = Σ0 ∪Σ2 with Σ0 = Σ2 = {f, g} be a ranked alphabet.
Give tree automata for the following languages:
(a) { t ∈ TΣ | on some path in t there are two consecutive occurrences of f }
(b) { t ∈ TΣ | on every path in t there are two consecutive occurrences of f }
Remark: a path always starts at the root and ends at a leaf.
Exercise 3.5 (a) Which language is generated by the following regular ex-
pression over Σ0 = Σ2 = {0, 1}?

( 0(c0 , c0 ) )∗c0 ·c0 [ 1(c1 , c1 ) ·c1 ( 0(c0 , c0 ) )∗c0 ] ·c0 {0}
Exercise 3.8 (a) Give a deterministic (!) TWA that accepts those input
trees over Σ = {a, b, c} where an a occurs.
Hint: Program a run through the whole input tree.
(b) Give a deterministic TWA for the language T presented in the lecture,
namely the set of all Σ-valued trees having at least one a-labeled node
with both a b-labeled and a c-labeled successor.
Provide an explanation for each part.
Exercise 3.9 Give a t-system that accepts exactly those pictures p over
the singleton alphabet {·} whose domain domp contains an even number of
elements. Provide an explanation.
Exercise 3.10 ∗ Show that the set of square grids [m, m] where m is a prime
number is recognizable by a t-system.
Chapter 4

Pushdown and Counter Systems
In this chapter we are entering the area of systems whose sets of configu-
rations are infinite. This makes it harder to find algorithmic solutions to
certain problems, because we cannot anymore hope to gather all required
information by taking finitely many steps throughout the transition graph
of such a system. Nevertheless, we present some methods that successfully
solve the most fundamental problems. We then apply our results to systems
that are based on recursive structures and gain some significant results.
Finally, we identify and prove decidability limits towards more complex sys-
tems.
procedures. In the special case that the stack alphabet consists of a single
symbol, apart from the initial stack symbol, a stack actually implements a
counter. If the stack contains only the initial stack symbol, the number 0 is
represented. Any other number is represented by the number of occurrences
of the (only) stack symbol on top of the initial one. This again shows that
the set of all possible stack contents is infinite, as natural numbers are too.
Example 4.3 The depth-first search in a graph G = (V, E), starting from
a node v0 , can conveniently be implemented by the following recursive pro-
cedure DFS(v):
Mark v
For all non-marked successors w of v: DFS(w)
We illustrate this process on the graph below, where v0 = 1, and denote
the nested calls of the algorithm on the left:

DFS(1)
 DFS(2)
  DFS(4)
   DFS(6)
   DFS(5)

(the graph itself, with nodes 1, 2, 3 and 6, 5, 4, is not reproduced in detail).
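
In runnable form, and with an edge set guessed only to be consistent with
this call sequence (the original figure is not reproduced), the procedure
reads as follows.

def dfs(graph, v, marked=None):
    # recursive depth-first search; graph maps a node to its successors
    if marked is None:
        marked = set()
    marked.add(v)                      # Mark v
    for w in graph[v]:                 # For all non-marked successors w of v
        if w not in marked:
            dfs(graph, w, marked)
    return marked

graph = {1: [2], 2: [4], 3: [], 4: [6, 5], 5: [], 6: []}
print(dfs(graph, 1))   # {1, 2, 4, 5, 6}; node 3 is not reached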
S → AA,   A → AB | b | c,   B → b.

A derivation tree for the word bbc and the corresponding leftmost derivation:

S ⇒ AA ⇒ ABA ⇒ bBA ⇒ bbA ⇒ bbc

(the derivation tree has root S with sons A, A; the first A has sons A, B
with A → b and B → b, and the second A is rewritten to c).
By defining the configurations of a PDA as states of a transition system,
we can extend the PDA to an infinite automaton, since there are in general
infinitely many configurations. This shows again that PDAs can accept a
wider class of languages than finite automata.
Example 4.6 We define a PDA A for the language L = {an bn | n > 0}.
Let A = ({q0 , q1 }, {a, b}, {Z0 , Z}, q0 , Z0 , ∆) with
∆ = { (q0 , a, Z0 , ZZ0 , q0 ), (q0 , a, Z, ZZ, q0 ),
      (q0 , b, Z, ², q1 ), (q1 , b, Z, ², q1 ),
      (q1 , ², Z0 , ², q1 ) }.
The configuration graph of A is shown below.
a a a ...
q0 Z0 q0 ZZ0 q0 ZZZ0
b b b
² b b b ...
q1 q1 Z0 q1 ZZ0 q1 ZZZ0
£
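
A brute-force Python sketch can explore this configuration graph; acceptance
by empty stack matches the picture above. The tuple encoding of ∆ (with
the pushed word as a tuple, top of stack first) is our own convention, and
the search terminates here because every stack-expanding transition consumes
an input letter.

def pda_accepts(word, delta, q0, Z0):
    # configurations are (state, position in the input, stack tuple)
    todo, seen = [(q0, 0, (Z0,))], set()
    while todo:
        conf = todo.pop()
        if conf in seen:
            continue
        seen.add(conf)
        p, i, stack = conf
        if i == len(word) and not stack:
            return True                      # input consumed, stack empty
        if not stack:
            continue
        Z, rest = stack[0], stack[1:]
        for (p1, a, Z1, beta, q) in delta:
            if p1 == p and Z1 == Z:
                if a == '':                  # epsilon-transition
                    todo.append((q, i, beta + rest))
                elif i < len(word) and word[i] == a:
                    todo.append((q, i + 1, beta + rest))
    return False

delta = {('q0', 'a', 'Z0', ('Z', 'Z0'), 'q0'),
         ('q0', 'a', 'Z', ('Z', 'Z'), 'q0'),
         ('q0', 'b', 'Z', (), 'q1'), ('q1', 'b', 'Z', (), 'q1'),
         ('q1', '', 'Z0', (), 'q1')}
print(pda_accepts('aaabbb', delta, 'q0', 'Z0'))   # True
print(pda_accepts('aabbb', delta, 'q0', 'Z0'))    # False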
Lemma 4.7 For every PDA there exists an equivalent PDA in normal
form.
Proof: For each transition with |β| > 2 add |β|−2 intermediate transitions.
2
Every PDA induces a pushdown system, where we suppress the special
roles of q0 and Z0 , and ignore the input letters.
• pw `∗ p′ w′ if pw `i p′ w′ for some i ≥ 0.
£
In the following we consider an extension of pushdown systems to systems
that have more than one stack.
We recall the special case of counter systems and extend them to more than
one counter by giving analogous formal definitions.
Z : (p, i1 , . . . , in ) `∗ (q, j1 , . . . , jn )
iff R : (p, i1 , . . . , in ) `∗ (q, j1 , . . . , jn )
and
pre ∗ (C) := {c | c `∗ c′ for some c′ ∈ C}.
With this notation the reachability problem for two configurations can be
restated in two versions:
Corollary 4.16 The stack contents that are reachable from the initial con-
figuration q0 Z0 of a PDA form a regular set.
This means that the context-free language is non-empty iff q0 Z0 ∈ pre ∗ (Q²),
which can be decided by Theorem 4.15. 2
Proof (of Theorem 4.15): First we have to clarify how we will give
definitions for sets of configurations. We use special NFAs to represent such
regular sets. Let P = (P, Γ, ∆) be a PDS and C ⊆ P Γ∗ a regular set
of configurations. A P -automaton for C is an NFA A = (Q, Γ, P, ∆A , F )
that accepts from an initial state p ∈ P exactly the words w ∈ Γ∗ such that
pw ∈ C. The control states of P are used as initial states of A. Additionally,
it is not allowed in A to have a transition to an initial state.
Example 4.20 Consider the pushdown system from Example 4.9 and the
set of configurations C = {p0 aa}. The P -automaton for C has the initial
states p0 , p1 , p2 and the transitions

p0 −a→ s1 −a→ s2

with s2 as final state (p1 and p2 have no outgoing transitions).
We can apply the induction hypothesis for i to (1) (i. e. to pu1 and p1 ). This
means that there is p′1 u′1 such that A : p′1 −u′1→ p1 and pu1 `∗ p′1 u′1 . As a
P -automaton does not have any transitions leading to an initial state, the
only possibility for A : p′1 −u′1→ p1 is u′1 = ² and p′1 = p1 . We get pu1 `∗ p1 .
Considering (2), the transition (p1 , a, q1 ) was added by the algorithm
because there is a pushdown rule p1 a → p2 v with Ai−1 : p2 −v→ q1 . Concate-
nating this run with (3) we obtain the run Ai : p2 −v→ q1 −u2→ q in which the
transition (p1 , a, q1 ) is used only j − 1 times. The induction hypothesis on j
yields p′ w′ such that A : p′ −w′→ q and p2 vu2 `∗ p′ w′ .
Combining this with pu1 `∗ p1 (derived from (1)) and the rule p1 a → p2 v
we get pw `∗ p1 au2 ` p2 vu2 `∗ p′ w′ , proving (*).
For the part of Theorem 4.15 concerning post ∗ (C) we use the same idea.
We suppose we are given a PDS P = (P, Γ, ∆), this time in normal form,
and a P -automaton A for C. For every rule pa → qv of P and every
transition p −a→ r already present in the automaton, transitions are added
according to the following three cases:

1. v = ²: add a transition q −²→ r (now qvw = qw is accepted by the run
q −²→ r −w→ F );

2. v = b: add a transition q −b→ r (now qvw = qbw is accepted by the run
q −b→ r −w→ F );

3. v = bc: we have to add some transition “q −bc→ r”. But since the
NFA can process only one letter at a time, we need an intermediate
state rρ for each such rule ρ with |v| = 2. In this case, before starting
the procedure we should first have extended the NFA by a transition
q −b→ rρ . Now we can add the transition rρ −c→ r, so that the NFA
finally accepts qvw = qbcw by the run q −b→ rρ −c→ r −w→ F .
Example 4.21 Let P be the pushdown system from Example 4.9. Apply-
ing the first saturation algorithm for the P -automaton from 4.20, we obtain
the automaton Ā accepting pre ∗ (p0 aa).
(the diagram of Ā is not reproduced in detail: it extends A by a new state
r2 and by the a-, b-, c- and ²-transitions added during saturation)
stack by one symbol. This can be done as many times as there are rules,
namely |∆| times. In each loop and for each rule in ∆ we have to find out
to which of the three cases it belongs (at most 3 comparisons) and then find
a matching transition (not a run!). In the worst case this can be done in
3 · |∆| · |Q| steps. For a similar reason as before the loop can be executed at
most (|P | + |∆|) · |Q| · |Γ| times. All in all we get a (polynomial) complexity
of |∆| + 3 · |∆| · (|P | + |∆|) · |Q|^2 · |Γ|.
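
For comparison, the pre ∗ saturation itself fits into a few lines of Python.
Rules pZ → qβ are encoded as tuples (p, Z, q, β), an assumption of ours;
the one-rule PDS at the end is a hypothetical toy illustrating the encoding.

def pre_star(rules, nfa_trans):
    # saturate the P-automaton: for every rule p Z -> q beta and every
    # state r reachable from q reading beta, add the transition (p, Z, r)
    trans = set(nfa_trans)
    changed = True
    while changed:
        changed = False
        for (p, Z, q, beta) in rules:
            frontier = {q}
            for X in beta:
                frontier = {t for (s, Y, t) in trans
                            if s in frontier and Y == X}
            for r in frontier:
                if (p, Z, r) not in trans:
                    trans.add((p, Z, r))
                    changed = True
    return trans

rules = {('p', 'a', 'p', ())}   # the single rule p a -> p (pop an a)
# P-automaton for C = {p eps}: no transitions, p initial and final
print(pre_star(rules, set()))   # {('p', 'a', 'p')}: pre*(C) = p a*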
We conclude this section by recalling what holds about the equivalence
tests between two PDAs: while language equivalence is undecidable for
nondeterministic PDAs, by a celebrated result of Sénizergues the equivalence
of two deterministic PDAs is decidable.
(figure: a hierarchical component K0 containing a box that calls the
component K1 via an a-labeled transition; not reproduced in detail)
several inputs and outputs. For the sake of simplicity, we only consider the
case of a single input and output in the following.
The concept which recursive hierarchical systems rely on is of course
well-known to us from the context-free grammars. The idea of replacing a
component with several ones (including itself) is represented by the rules
that imply a substitution of some letter by several ones (including itself).
(figure: the stepwise expansion of component S, where each expansion
replaces the S-box between the boxes A and B by a new copy of S; not
reproduced in detail)
(figure: component A with an a-branch and a c-branch; not reproduced in
detail)
• a transition relation ∆j with transitions of the form (p, a, q), (p, ², B),
(B, ², q), (B, ², B 0 ) for p, q ∈ Qj , a ∈ Σ, and B, B 0 ∈ BXj .
• labels bi ∈ Σ ∪ {²}.
For each segment pi , bi+1 , pi+1 of the sequence there is a j ∈ {0, . . . , n} such
that the following holds:
Example 4.27 Consider the word ccab and the recursive hierarchical sys-
tems of Figures 4.4 and 4.5. By following the upper branch of system S and
twice the lower and then once the upper branch of component A, we get
the following accepting run of ccab (of course we also follow once the upper
branch of component B, which is analogous to A):
iS −²→ iA −c→ q −²→ iA −c→ q −²→ iA −a→ oA −²→ oA −²→ oA −²→ iB −b→ oB −²→ oS .
is stored on the stack, then no box call is currently active, and furthermore
the current state is in Q0 .
A PDA-transition (p, b, Z, v, q) is added in the following cases:
b) As discussed before, the theorem leads to the solution of the (forward and
backward) reachability problem for recursive hierarchical systems.
Furthermore we have to ensure that in the end both stacks of the PDS are
empty, so that it accepts exactly when the TM reaches a final state. Hence,
we have to add transitions that force the PDS to repeatedly erase the top
symbols in both stacks, until it reaches the initial ones (“]” and “[”), namely:
Now we have ensured that M can reach a certain configuration from another
one if and only if PM can do the same for the corresponding configurations.
Formally:
This holds also if one configuration is the initial one (with an empty tape)
and the other is a final one:
In other words, M terminates when started on the empty tape if and only
if PM can reach the configuration c1 := (qf , ], [) from the configuration
c0 := (q0 , Ã ], [). 2
£
These arithmetic operations can easily be carried out by using a second
counter. We assume that the PDS P = (P, Γ, ∆) is given in normal form.
To simplify the construction of the 2-counter system Z, we continue using
|Γ| = 9, so that a stack content can be read as a decimal number over the
digits 1, . . . , 9. Every P-transition is carried out by a block of Z-transitions
(which we will represent by a pseudo program p):
• Replacing the top symbol of the stack by another symbol (with tran-
sition pa → qb) is simulated by:
p: if rem(z1 ) = a then z1 := z1 div 10; z1 := z1 ∗ 10; z1 := z1 + b; goto q

• Replacing the top symbol of the stack by two symbols (with transition
pa → qbc) is simulated by:
p: if rem(z1 ) = a then z1 := z1 div 10; z1 := z1 ∗ 10; z1 := z1 + c;
   z1 := z1 ∗ 10; z1 := z1 + b; goto q
We give the counter operations only for the second case. The other two
are analogous. We describe (almost) all of the four operations of this case
below:
• Multiplication by 10 and the div/rem operations are realized by moving
the content back and forth between the two counters; see the sketch below.

• Addition of b is easy.
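
The counter programs left implicit above can be sketched in Python; this is
an illustration with ordinary integer variables in which only zero-tests,
increments and decrements are used, as a counter machine would (the bounded
test rem = 10 corresponds to a fixed cascade of ten control states).

def times10(z1):
    z2 = 0
    while z1 > 0:          # move z1 to z2, adding ten for each unit
        z1 -= 1
        for _ in range(10):
            z2 += 1
    while z2 > 0:          # move the result back into z1
        z2 -= 1
        z1 += 1
    return z1

def divmod10(z1):
    z2, rem = 0, 0
    while z1 > 0:          # count off complete groups of ten
        z1 -= 1
        rem += 1
        if rem == 10:
            rem = 0
            z2 += 1        # one complete group counted
    while z2 > 0:          # z2 holds z1 div 10; move it back to z1
        z2 -= 1
        z1 += 1
    return z1, rem

print(times10(7))       # 70
print(divmod10(73))     # (7, 3), i.e. 73 div 10 and 73 mod 10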
Finally, for the initial configuration c = (p, i0 · · · ir ) of the PDS, the cor-
responding configuration of the 2-counter system is c0 = (p, (ir · · · i0 )10 , 9),
where 9 is used as the initial stack (counter) symbol. Then, since we have
coded all possible transitions of the PDS by corresponding ones for the 2-
counter system, the following holds:
2
It follows directly from Lemma 4.34 that any 2-PDS P can be simu-
lated by a 4-counter system ZP (actually 3 counters would suffice, since
we only need one auxiliary counter). For the initial configuration c =
(p, i0 · · · ir , j0 · · · js ) of P, the corresponding configuration of ZP is then
c0 = (p, (ir · · · i0 )10 , 9, (js · · · j0 )10 , 9).
The following lemma shows that we can in fact further reduce the num-
ber of counters we need.
that this is definitely going to be the case sometime, because the overall
number of available states in the system is finite anyway.
In the algorithm that is presented below we actually manipulate a for-
mula or an automaton until it describes exactly the set that we are looking
for. We denote
The symbolic algorithm for backwards reachability takes the following steps:
1. β := description of C;
We already know that in the case of finite systems on the state space {0, 1}n ,
it is very convenient to describe sets by boolean formulas β(x1 , . . . , xn ).
Example 4.37 Consider the state set Q = {0, 1}5 and the designated set
of configurations C = {q ∈ Q | 3rd or 5th component = 0}. The set C can
be described by β = ¬x3 ∨ ¬x5 . In Section 2.2 we studied a way to construct
a minimal acyclic automaton for such a boolean formula, namely OBDDs.
The OBDD for (the description of) C is shown below.
(OBDD: an x3 -node whose 0-edge leads to the sink 1 and whose 1-edge
leads to an x5 -node, which in turn has edges to the sinks 1 and 0)
£
It is obvious that it is much easier to apply the reachability algorithm on
the level of OBDDs than to apply it on enumerated sets. To achieve this,
two steps of the algorithm presented above still remain to be implemented:
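namely the computation of a description of pre(β) from β, and the equiva-
lence test between two descriptions. A minimal Python sketch, with explicit
sets of states standing in for OBDDs (our simplification) and a hypothetical
toy system, shows the overall loop:

def backward_reach(pre, C):
    beta = frozenset(C)
    while True:
        new = beta | pre(beta)      # one step of the predecessor operation
        if new == beta:             # the equivalence test
            return beta
        beta = new

# toy transition system on {0,1}^2, given by explicit edges:
edges = {(0, 0): {(0, 1)}, (0, 1): {(1, 1)}, (1, 0): {(0, 0)}, (1, 1): set()}
pre = lambda S: {p for p, succ in edges.items() if succ & S}
print(backward_reach(pre, {(1, 1)}))   # all four states reach (1, 1)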
At this point let us recall that this is exactly what we were striving to-
wards when we were trying to solve the reachability problem for pushdown
systems. To describe the given set C we used a P -automaton. To imple-
ment the extension to the transitive closure (under predecessor) of C, we
gradually exhausted the states belonging strictly to pre ∗ (C) by adding new
transitions to the P -automaton. Furthermore, because we used such a struc-
ture, we were not concerned with an equivalence test. The termination of
the algorithm depended on whether it was possible to add new transitions
to the automaton. What we finally observe is that the reachability analysis
of pushdown systems is performed according to the symbolic method, by
using NFAs for the representation of state sets.
We conclude this chapter about pushdown systems by noting that these
are a first essential step in the verification of infinite systems. In such systems
we look at two words u, v, e. g. representing two configurations, and we want
to examine all ways that may lead from the first to the second. To change
the first letter of one word and replace it by another (or two) is only a special
case.
4.6 Exercises
Exercise 4.1 Show that every PDA A is (language) equivalent to a PDA
B that uses only purely expanding and purely reducing transitions.
Exercise 4.4 ∗ Show that the following problem is decidable: Given a push-
down automaton A, does there exist a sequence q0 Z0 ` q1 w1 ` q2 w2 ` . . . of
configurations such that the length of the configurations is unbounded, i. e.
for every n ∈ N there is an i such that |qi wi | > n?
Hint: Consider the repetitions and the following parameters: The num-
ber of states n, the number of symbols in the stack alphabet m, and the
maximal length of a word v in a transition (p, a/², Z, v, q).
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
The length of the configuration is the sum of the lengths of both stack
contents.
Hint: For a reduction the halting problem for offline Turing machines is
recommended, which can be assumed to be undecidable. (An offline Turing
machine has a read-only input tape and a read-write working tape.)
Chapter 5
Communicating Systems
with

((p1 , . . . , pn ), a, (q1 , . . . , qn )) ∈ ∆ :⇔ (pi , a, qi ) ∈ ∆i for a ∈ Σi and
                                                pi = qi for a ∉ Σi .
1st process: This process models a shipbuilder with two sites for storing
ships. Possible actions are: produce ship, check the quality, sell a ship,
and retake a ship. This process can be implemented by the following
NFA, where the upper three states indicate whether there are 0 (left),
1 (middle), or 2 ships (right) ready to be sold.
(NFA P1 : the three upper states are connected by p-edges for producing
and q-edges for quality checks, s-edges sell a ship and r-edges retake one;
the diagram is not reproduced in detail)
2nd process: This process models a very cautious ship owner who uses a
ship only if she possesses two. Possible actions are: buy (sell for 1st
process), return a ship (retake for 1st process), use ship, and lose a
ship. This process can be implemented by the following NFA.
(NFA P2 : three states counting 0, 1 or 2 possessed ships, with s-edges for
buying, r- and l-edges for returning and losing, and a u-loop for using;
the diagram is not reproduced in detail)
(figure: the philosophers P1 , . . . , P5 and the forks F1 , . . . , F5 arranged
alternately in a circle)
£
The synchronized product of the transition systems presented in the
previous example is only a model for all possible actions (transitions) that
can occur and for all possible situations (states), in which the group of
philosophers can find itself at any moment in time. To really describe the
problem of the dining philosophers (there are several versions of it) we have
to construct a protocol that specifies a code of behavior between them by
allowing and forbidding certain actions. Such a protocol would for example
force the following constraints:
• Every philosopher will again and again obtain the forks that are ac-
cessible by him (liveness condition).
Conditions like the ones above are not modeled in the transition rules of the
system. They are posed on the behavior of the system, i.e. on the possible
paths through the graph. A path can be coded by a word, and thus the
condition will specify a subset of all possible words, i.e. a language. The
first condition stated above is for example a typical reachability condition
and hence can be expressed by the techniques on finite words that we have
learned so far. For the last condition above one has to examine infinite
executions of the system and thus this cannot be expressed by the same
techniques.
In the general case the solution and verification of such problems is re-
alized in three different levels. The synchronized product is constructed
to describe a language L over the action alphabet representing the possi-
ble behavior. The required behavior of the communicating components is
described by an automaton for some language K. A solution (protocol)
is another automaton recognizing a sub-language L0 ⊆ K. Usually L and
K are given, and we have to follow an efficient method to determine L0 .
Although the synchronized product offers a very compact description of a
system with say k components of n states each, the “state space explo-
sion” to the number n^k is generally inevitable. Techniques like the symbolic
method are unfortunately not always helpful, and this is one of the main
problems of system verification nowadays.
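
The synchronized product itself is easy to construct explicitly. The following
Python sketch (with components encoded as tuples (states, alphabet,
transitions, initial states), an assumed convention) also makes the state
space explosion visible, since it enumerates all n^k global states.

from itertools import product

def sync_product(components):
    sigma = set().union(*(alpha for (_, alpha, _, _) in components))
    states = list(product(*(st for (st, _, _, _) in components)))
    delta = set()
    for ps in states:
        for a in sigma:
            options = []
            for (st, alpha, d, _), p in zip(components, ps):
                if a in alpha:       # the component takes part in a
                    options.append([q for (x, b, q) in d
                                    if x == p and b == a])
                else:                # the component is not involved
                    options.append([p])
            for qs in product(*options):
                delta.add((ps, a, tuple(qs)))
    inits = list(product(*(i for (_, _, _, i) in components)))
    return states, sigma, delta, inits

A1 = ({1, 2}, {'a', 'b'}, {(1, 'a', 2), (2, 'b', 1)}, {1})
A2 = ({4, 5}, {'b', 'c'}, {(4, 'b', 5), (5, 'c', 4)}, {4})
_, _, delta, _ = sync_product([A1, A2])
print(((2, 4), 'b', (1, 5)) in delta)   # True: a joint b-step
print(((1, 4), 'b', (2, 5)) in delta)   # False: A1 has no b from state 1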
{(1, 2), (3, 1), (3, 2), (2, 3)} we obtain the following communication struc-
ture: the automata A1 , A2 , A3 are connected by the channels c12 , c31 , c32
and c23 (one channel cij for each pair (i, j) in the set).
• (p, a, q) ∈ Qi × Σi × Qi , (local)
Proof: We reduce the halting problem for Turing Machines to the reacha-
bility problem for CFSMs. Remember that instructions of a Turing machine
M are of the form (s, a, a0 , ∗, s0 ) where s, s0 are states of M , a, a0 are symbols
from the tape alphabet, and ∗ ∈ {L, R, N } is the direction in which the head
of M moves. A configuration of M is a word [usv$ where [ and $ are the
left and right end markers of the tape inscription, u is the word to the left of
the head of M , v is the word to the right of it (including the symbol beneath
the head), and s is the current state of M . We assume that M only stops if
a special state sf in is reached.
We define an effective transformation of a Turing machine M into a
CFSM AM where AM = (A1^M , A2^M , {(1, 2), (2, 1)}) such that M stops when
started on the empty tape iff AM reaches a configuration in a set C to be
defined later on.
The idea is that A1^M puts the current configuration of M into channel
c12 . A2^M reads this configuration, applying changes to it according to the
instructions of M , and sends the new configuration back to A1^M using channel
c21 . More precisely:
To start the simulation of M , A1^M puts the initial configuration [s0 Ã$
into channel c12 , and proceeds to state p0 . From thereon A1^M will write
every letter read from c21 to channel c12 . A2^M behaves as follows
1. Starting from the initial state read the first three letters a1 , a2 , a3 from
c12 and store them in a state [a1 , a2 , a3 ].
msc C
P1 P2 P3
m1
s1 r1
m2
s2 r2
m3
s3 r3
m4
r4 s4
m5
r5 s5
m6
r6 s6
£
Definition 5.13 Let (A, <) be a partial order. The Hasse diagram (A, <·)
of (A, <) is the acyclic graph defined by
a <· b :⇔ a < b and there exists no c ∈ A : a < c < b
A linearization of a partial order (A, <) is a total order (A, ≺) such that
a < b ⇒ a ≺ b.
A linearization of a partial order can be obtained algorithmically by a topo-
logical sort based on depth-first search.
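
Such a linearization is computed in a few lines of Python; the cover relation
used at the end is only a guessed reading of Figure 5.2, so the edge set is
illustrative, not authoritative.

def linearize(elements, edges):
    # topological sort by depth-first search; edges are pairs (a, b), a < b
    succ = {a: {b for (x, b) in edges if x == a} for a in elements}
    seen, postorder = set(), []
    def visit(a):
        seen.add(a)
        for b in succ[a]:
            if b not in seen:
                visit(b)
        postorder.append(a)      # appended only after everything above a
    for a in elements:
        if a not in seen:
            visit(a)
    return postorder[::-1]       # reverse postorder is a linearization

covers = {('pants', 'trousers'), ('trousers', 'shoes'), ('socks', 'shoes'),
          ('trousers', 'belt'), ('shirt', 'belt'), ('shirt', 'tie'),
          ('belt', 'jacket'), ('tie', 'jacket')}
items = {a for e in covers for a in e} | {'watch'}
print(linearize(items, covers))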
Example 5.14 Let A := {pants, trousers, socks, shoes, watch, belt, shirt,
tie, jacket}. Figure 5.2 shows a Hasse diagram for A such that the associated
partial order allows to dress correctly. The associated partial order is defined
by a < b iff there exists a nonempty path from a to b in the Hasse diagram
for A.
A possible linearization of this partial order is the following:
watch < shirt < tie < pants < trousers < belt < jacket < socks < shoes.
£
Example 5.16 Figure 5.3 shows the Hasse diagram of the partial order
defined by the MSC C from Figure 5.1. Note that r5 and r6 are not ordered
in the partial order induced by (E, →), but r5 <1 r6 , i.e. r5 and r6 are a
race of C. £
An important task when dealing with MSCs is to find all races. The
standard procedure is to construct the acyclic graph (E, →) from Definition
5.15, then compute its transitive closure (E, <), and compare this with the
visual orders <i .
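
This standard procedure is easily turned into code; a Python sketch, with
the relations given as sets of event pairs (our encoding) and a toy fragment
of the MSC above, reads:

def races(arrows, visual):
    # transitive closure of the acyclic relation -> (a simple fixed point)
    closure, changed = set(arrows), True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    # races: pairs ordered visually but not by the causal order
    return {p for p in visual if p not in closure}

# toy fragment: r5 before r6 visually, but causally unordered
arrows = {('s5', 'r5'), ('s6', 'r6'), ('s5', 's6')}
visual = {('s5', 's6'), ('r5', 'r6')}
print(races(arrows, visual))   # {('r5', 'r6')}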
MSCs can be generalized to higher level message sequence charts (HM-
SCs), where a HMSC is a directed graph composed of MSCs. A HMSC
defines a family of MSCs, obtained from paths in the HMSC graph by con-
catenating the component MSCs.
Example 5.17 Let a HMSC be given by the following graph with compo-
nent MSCs as shown in Figure 5.4:
C1
C2 C3
An expansion of this HMSC is shown in Figure 5.5, where the dashed lines
indicate that the part between label 1 and 2 can be repeated arbitrarily
often afterwards. We obtain a family of MSCs with an unbounded set of
races (for process P3 ).
£
The previous example leads us to partial orders with a larger and larger
number of nodes. There is a special theory that deals with such partial
(Figure 5.4, showing the component MSCs C1 , C2 , C3 , and Figure 5.5,
showing their expansion along a path of the HMSC, are not reproduced in
detail)
5.4 Exercises
Exercise 5.1 Consider the following automata A1 over {a, b} and A2 over
{b, c}.
b
b b
A1 : 1 2 A2 : 4 5
b a b c
3 6
a c
Exercise 5.2 Consider the synchronized product modeling the five philoso-
phers, with automata P1 , . . . , P5 and F1 , . . . , F5 as explained in this chapter.
Determine the number of states of this synchronized product. Give an upper
bound (as small as you can justify) on the number of reachable states from
the initial states “think” for the philosophers and “down” for the forks.
Exercise 5.3 (a) Consider the language L defined by the regular expres-
sion (b+ ac+ a)∗ . Find an NFA A1 over {a, b} and an NFA A2 over {a, c}
such that L is accepted by the synchronized product A1 ◦a A2 .
(b) Show that no NFAs A1 , A2 as in (a) exist such that the language
(b+ ac+ a)∗ + (c+ ab+ a)∗ is accepted by their synchronized product A1 ◦a
A2 .
Exercise 5.5 (a) Develop a CFSM A that models the “Five Philosophers
Problem”. A shall consist of component automata P1 , . . . , P5 for the
philosophers and F1 , . . . , F5 for the forks, which are lifted separately
here. Pi shall be connected via forward and backward channels to both
Fi and Fi+1 . In particular your model has to ensure that
(ii) a philosopher can eat iff she possesses both of her forks,
(iii) a philosopher can think iff she possesses no fork, and
(iv) once a fork is requested, a philosopher is not in state “think” any-
more.
(b) Modify your CFSM such that a philosopher is allowed to drop her re-
quest for a fork (because the fork is in use) and go back to state “think”.
Exercise 5.6 In the course it was shown that a CFSM consisting of a sin-
gle component automaton A with a channel to itself induces an undecidable
reachability problem. (“Given such an A, can it reach from an initial config-
uration (q, ²) a configuration (p, w) with control state p and channel content
w?”)
Analyze this problem for the restricted case of a CFSM with a one-letter
message alphabet Γ = {m}. Justify your answer.
Exercise 5.7 Consider the message sequence chart from Figure 5.6.
msc C
P1 P2 P3 P4
Id1
Id2
Req
Check
Ack1
Inf
Interr
Disconn1
Disconn2
We denote the send events by s(Id1 ), s(Id2 ), . . . and the receive events
by r(Id1 ), r(Id2 ), . . ..
(a) Draw the Hasse diagram for the partial order associated with C.
Chapter 6

Petri Nets
Petri nets are models for concurrent systems and their behavior.
Example 6.2 The following Petri net models a mutual exclusion protocol
m: ••• •        m′ : •• ••
£
Example 6.5 The following Petri net illustrates the modeling power of
Petri nets.
1• 2•
a b c
3 4 5
Transitions a and b are in conflict, only one of them can be fired at a time.
We call such transitions nondeterministic. Transition c can be fired inde-
pendently of a and b. We call c concurrent to a and b. If transitions a, c
fire, the Petri net ends up in a deadlock. If b, c and d fire repeatedly, more
and more tokens are collected at place 1, i.e., the Petri net is unbounded. £
Example 6.6 The following Petri net models two production lines A and
B with two intermediate steps each; the main manufacturing phase and the
reset back to the idle state. Production line A requires resources p and q,
Figure 6.1: A Petri net modeling the problem of five dining philosophers.
(figure: the two production lines A and B, competing for the resource
places p and q; not reproduced in detail)
£
Example 6.8 Consider a small computer network where three PCs have to
share two printers. These two printers can be represented by an auxiliary
place carrying two tokens (• •).
• 0 otherwise.
In the special case of condition/event nets the markings are also 0-1-vectors.
That is, we are only interested whether there is a token at some place or
not, and even if there is more than one token at some place, the number is
normalized to 1, after each update of the marking. The mutual exclusion
protocol in Example 6.2 and the production lines in Example 6.6 are such
special cases.
(figure: an NFA A with states 1, 2, 3 and transitions labeled a, b, together
with the corresponding Petri net N , where place 1 carries the initial token;
not reproduced in detail)
£
Lemma 6.11 For every regular language L there exists a Petri net N such
that L = L(N ). 2
The power of Petri nets as language acceptors goes beyond the regular lan-
guages.
Example 6.12 Let L = {an cbn | n ≥ 0}. L is a Petri net language accepted
by the following Petri net with initial marking m− = (1, 0, 0) and final
marking m+ = (0, 0, 1).
1• 2 b 3
a
c
Reading an yields the marking (1, n, 0), from (1, n, 0) we proceed to (0, n, 1)
via c, and via bn from (0, n, 1) to (0, 0, 1). This is the only way to reach
(0, 0, 1) from (1, 0, 0). Hence this Petri net accepts L. £
In Figure 6.2 the power of Petri nets as language acceptors is shown. Ex-
ample languages are shown that separate the different classes.
Figure 6.2: On the power of Petri nets as language acceptors (the Petri net
languages lie within the context-sensitive languages CSL; the languages
a^n b^n c^n , ww^R and ww separate the classes)
(Petri net with transitions labeled a, b, c; diagram not reproduced in detail)
This Petri net accepts L with initial marking m− = (1, 0, 0, 0, 0, 0) and final
marking m+ = (0, 0, 0, 0, 0, 1). £
The previous examples demonstrate the power of Petri nets in checking the
correct number of occurrences of a letter. However, Petri nets are weak
when it comes to checking also the order of occurrences. In the following we
realize this situation on the language consisting of the palindromes of even
length.
Lemma 6.15 For every Petri net that recognizes L the following holds: for
sufficiently large s, fewer than 2^s markings are reachable.
Let D := D+ − D− . D contains all the information about the “net effect”
of transition ti in its i-th row. Suppose ti can be fired from a marking
m̄ = (m1 , . . . , mn ). Then the successor marking is m̄ + Di , where Di
denotes the i-th row of D.
2 4
t2
1 3
t1 t3
Then

        1 1 1 0                  1 0 0 0
D− =    0 0 0 1     and   D+ =   0 1 1 0 .
        0 0 1 0                  0 0 0 1

So we obtain

       0 −1 −1  0
D =    0  1  1 −1 .
       0  0 −1  1
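
In Python, D and the firing of a transition can be computed directly from
D− and D+ ; the following sketch uses the matrices of this example.

D_minus = [[1, 1, 1, 0],   # row i: what transition t_i consumes
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
D_plus = [[1, 0, 0, 0],    # row i: what transition t_i produces
          [0, 1, 1, 0],
          [0, 0, 0, 1]]
D = [[p - m for p, m in zip(rp, rm)] for rp, rm in zip(D_plus, D_minus)]

def fire(m, i):
    # t_i is enabled iff every place holds the tokens that t_i consumes
    if all(x >= need for x, need in zip(m, D_minus[i])):
        return [x + d for x, d in zip(m, D[i])]
    return None   # t_i is not enabled at m

print(D)                      # [[0,-1,-1,0], [0,1,1,-1], [0,0,-1,1]]
print(fire([1, 1, 1, 0], 0))  # firing t_1 yields [1, 0, 0, 0]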
1 2 4
t1 t2
If m̄ = (1, 0, 0, 0) and m̄′ = (0, 0, 0, 1), then m̄′ is not reachable from m̄, but

(0, 0, 0, 1) = (1, 0, 0, 0) + (x1 , x2 ) · ( −1  1 −1 0
                                              0 −1  1 1 )

is solvable (with x1 = x2 = 1).
(Petri net with places 1, . . . , 5, where places 2 and 5 carry initial tokens;
diagram not reproduced in detail)
Definition 6.23 The boundedness problem for Petri nets is the following:
given a Petri net N with an initial marking, decide whether there is a
K ∈ N such that the reachability set of N has only markings (m1 , . . . , mn )
with Σni=1 mi ≤ K. This means that also mi ≤ K for every i and hence
there are at most (K + 1)^n different markings reachable in N . These can
be represented by the states of a huge, but still finite automaton.
In the following we first solve the boundedness problem and then the cov-
ering problem. Unfortunately, we have to skip the proof for the decidability
of the reachability problem because it is far too complex to be presented
here.
Theorem 6.24 (Karp, Miller 1968) The boundedness problem for Petri
nets is decidable.
The proof of this theorem needs some preparation. We first define reacha-
bility trees over a Petri net and make some important observations about
them.
Definition 6.25 (Reachability Tree of a Petri Net) Given a Petri net N and
an initial marking m− , the reachability tree t[N, m− ] is defined as follows:
• the root of t[N, m− ] is m− , and
• the immediate successors of a node m are all markings that are reach-
able from m by firing one transition.
Remark 6.26 The reachability tree t[N, m− ] contains exactly the markings
that are reachable from m− .
However, since a place of a Petri net can have arbitrarily many markers,
the reachability tree can be infinite. Is there some way to break the (pos-
sibly never ending) construction of a reachability tree in order to be able
to perform the boundedness test? The construction of a reachability tree
for a Petri net relies on the following monotonicity property. If m is reach-
able from m− and there is a nonempty firing sequence τ such that m ¤τ m′
and m ≤ m′ , then τ can be fired infinitely often. Two cases have to be
distinguished:
1. If m = m′ , then m can be reproduced infinitely often by executing τ
again and again.

2. If m < m′ , then mi < m′i for some i, and mi can be increased ar-
bitrarily by executing τ again and again. In this case we obtain a
monotonically increasing marking sequence m < m′ < m″ < . . . and
consequently the unboundedness of the Petri net.
The algorithm from Figure 6.3 uses the previous observations to compute a
truncated reachability tree t(N, m− ) for a Petri net N , the Karp-Miller tree.
For simplicity we assume that the initial marking m− is not a deadlock. Note
that the marking m0 is possibly changed in line 7 and thus, the conditions
in lines 8 and 9 are checked for this possibly new marking.
As we shall prove in the following the Karp-Miller tree contains all the
information we need. Note that the structure of the output tree of the
algorithm may depend on the order of chosen leaves in line 3. Further, there
may be m1 < m′ and m2 < m′ with m1 ≰ m2 and m2 ≰ m1 . The structure
of the output tree may also depend on the choice of such a marking in line 6.
But nevertheless, the set of all leaf labels for different output trees is always
the same, and this suffices for our purpose.
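
Since Figure 6.3 is not reproduced here, the following Python sketch of the
construction may serve as a substitute; float('inf') plays the role of the
∞ sign, the acceleration step follows the monotonicity argument above, and
the one-transition net at the end is a hypothetical test case.

def karp_miller(fire_all, m0):
    INF = float('inf')
    root = (tuple(m0), [])               # a node: (marking, children)
    stack = [(root, [tuple(m0)])]        # node together with its ancestors
    while stack:
        (m, children), ancestors = stack.pop()
        for m1 in fire_all(m):
            m1 = list(m1)
            for anc in ancestors:        # acceleration: anc < m1 gives inf
                if all(a <= b for a, b in zip(anc, m1)) and tuple(m1) != anc:
                    m1 = [INF if b > a else b for a, b in zip(anc, m1)]
            m1 = tuple(m1)
            child = (m1, [])
            children.append(child)
            if m1 not in ancestors:      # a repeated marking becomes a leaf
                stack.append((child, ancestors + [m1]))
    return root

def make_fire_all(d_minus, d_plus):
    def fire_all(m):
        for dm, dp in zip(d_minus, d_plus):
            if all(x >= need for x, need in zip(m, dm)):
                yield tuple(x - a + b for x, a, b in zip(m, dm, dp))
    return fire_all

# a net whose single transition pumps tokens into the second place
fa = make_fire_all([[1, 0]], [[1, 1]])
print(karp_miller(fa, (1, 0)))   # the second component becomes inf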
Example 6.28 The Karp-Miller tree for the Petri net from Example 6.5
has the following form:
(Karp-Miller tree with root (1, 1, 0, 0, 0): firing b reproduces the root
marking with an increased fourth component, which is therefore replaced
by ∞; the tree continues with markings such as (1, 1, 0, ∞, 0), (0, 1, 1, ∞, 0)
and (1, 0, 0, ∞, 1), and every branch ends in a repeated marking (√) or in
the deadlock (0, 0, 1, 0, 1))
£
Now we have all the knowledge we need to decide whether a Petri net is
bounded when started from some given marking. The procedure goes as
follows: Given a Petri net N and an initial marking m− we construct the
Karp-Miller tree t(N, m− ) by using the algorithm of Figure 6.3. If no ∞ sign
occurs in the tree then N is bounded when started from m− . Otherwise, N
is unbounded when started from m− .
To prove the correctness of this method we proceed in two stages. First,
we show that the algorithm of Figure 6.3 terminates, i.e., that the Karp-
Miller tree is finite, and second that N is unbounded when started from m−
if and only if a ∞ occurs in t(N, m− ).
Theorem 6.29 The algorithm from Figure 6.3 computing t(N, m− ) termi-
nates, i.e., t(N, m− ) is finite.
For the proof we need two intermediate results. The first one is the well
known König’s Lemma:
Lemma 6.30 (König) Every finitely branching infinite tree has an infinite
path.
Proof: Let t be a finitely branching infinite tree. Starting from the root
r of t we find an infinite path through t by repeatedly picking a son s of
the current node such that the subtree of t rooted at s is infinite. Since t
is finitely branching but infinite, such a son always exists. 2
a) For some x0 we have x ≤ x0 for all (m̄, x) ∈ V with m̄ ∈ V0′ . Then there
exists x1 ≤ x0 such that (m̄, x1 ) ∈ V for infinitely many m̄ ∈ V0′ . These
vectors form the desired infinite totally ordered set.

b) For every (m̄′ , x′ ) ∈ V with m̄′ ∈ V0′ there exist m̄ ∈ V0′ and x ∈ N with
m̄ > m̄′ and x > x′ such that (m̄, x) ∈ V . In this case the existence of
an infinite totally ordered set is trivial.
6.4. DECISION PROBLEMS FOR PETRI NETS 179
2
Proof of Theorem 6.29: Assume that t(N, m− ) is infinite. By construc-
tion t(N, m− ) is finitely branching. Hence, by König’s Lemma, t(N, m− )
contains an infinite path π. For markings (m1 , . . . , mn ) appearing on π let
i1 , . . . , is be those components where eventually ∞ appears. We know that
{i1 , . . . , is } 6= {1, . . . , n} because otherwise (∞, . . . , ∞) would be reached,
then repeated, and therefore π would be finite. So w.l.o.g. assume that af-
ter the last ∞ sign has been introduced (this happens after a finite number
of steps!) only markings (m1 , . . . , mr , ∞, . . . , ∞) (r ≥ 1) appear on π.
By Dickson’s Lemma there exists an infinite totally ordered set V0 of mark-
ings of the form (m1 , . . . , mr , ∞, . . . , ∞) on π. Let m ∈ V0 be the first such
marking appearing on π. Consider now the markings in V0 appearing after
m on π. Only finitely many of these can be smaller than m. Since V0 is
infinite there must be an m′ appearing after m on π with m′ > m. Hence
another ∞ sign would be introduced by the algorithm, contradicting the
assumption above. 2
Now that we proved that the algorithm terminates we have to show that
N is unbounded when started from m− if and only if a ∞ occurs in the
Karp-Miller tree t(N, m− ).
6.5 Exercises
Exercise 6.1 Consider the following Petri net N where the numbers denote
the names of the places.
(net diagram with places 1, . . . , 4 not reproduced)
(a) Construct the reachability tree of N with initial marking (0, 1, 0, 0).
(b) Determine the deadlock markings reachable from (0, 1, 0, 0) and give a
description of the reachable states.
Index
T (A)
transition monoid, 68
tree language, 83
TΣ , 81
X(y), 13
⟨u⟩L , 67
[m, n], 110
[u]L , 51
Σ++ , 110
Σi , 80
·c , 94
∗c , 95
¤t , 166
∼L , 51
≈L , 67
∼T , 92
∼A , 50, 93
`, 124, 153
`∗ , 124
dom(w), 14
domp , 110
domt , 82, 103
t• , 166
t[N, m− ], 176
valp , 110
valt , 82, 103
w, 13
wA , 68

algebra, 48
alphabet, 6
  ranked, 80
automaton
  P -, 131
  canonical, 51
  complete, 7
  deterministic
    over trees, 82, 90
    over words, 7
    XML —, 106
  nondeterministic
    over trees, 88, 90
    over words, 6
    XML —, 105
race, 157
reachability problem
for PDS, 130
for CFSM, 154
for Petri nets, 175