R* = ⋃_{n∈ℕ} Rⁿ (the reflexive and transitive closure of a relation R).
Functions are special relations. A relation f ⊆ X × Y is called a partial function (or a partial mapping) from X to Y, if f is right-unique. It is denoted as follows: f : X ⇀ Y. Instead of (x, y) ∈ f we write f(x) = y. f is a (total) function from X to Y, if f is additionally left-total. This is denoted by f : X → Y.
A function f : X → Y is
injective, if f is left-unique,
surjective, if f is right-total,
bijective, if f is injective and surjective.
A bijective function is also called a bijection.
3 Alphabets, strings and languages
In this lecture we will pay special attention to analysing formal languages. For this purpose the following notation will be used.
Alphabet Σ = finite set of characters (symbols) = character set.
We use A, B, Σ, Γ as typical names for alphabets and a, b as typical names for characters, i.e. elements of alphabets.
String over an alphabet Σ = finite sequence of characters from Σ. In particular, there is the empty string ε. We use u, v, w as typical names for strings.
Example: Let Σ = {1, 2, +}. Then 1 + 2 and 2 + 1 are strings over Σ.
Σ* = set of all strings over Σ; in particular ε ∈ Σ* holds.
Σ⁺ = set of all non-empty strings over Σ, i.e. Σ* = Σ⁺ ∪ {ε}.
|u| = length of the string u = number of characters that occur in u.
In particular: |ε| = 0.
The symbols of a string u of length n are denoted by u₁, …, uₙ.
The concatenation of two strings u and v consists of the characters of u, followed by the characters of v, and is denoted by u · v or simply by uv.
Example: The concatenation of 1+ and 2 + 0 is 1 + 2 + 0.
A string v is called a substring of a string w, if ∃u₁, u₂ ∈ Σ* : w = u₁ v u₂.
A string v is called a prefix of a string w, if ∃u ∈ Σ* : w = vu.
A string v is called a suffix of a string w, if ∃u ∈ Σ* : w = uv.
There may be several occurrences of the same substring in a string w.
Example: The string w = 1 + 1 + 1 has two occurrences of the substring 1+ and three occurrences of the substring 1.
A (formal) language over an alphabet Σ is a subset of Σ*. Since languages are sets, the usual set operations apply, e.g. the complement L̄ = Σ* \ L.
Furthermore, there are special operations on languages.
The concatenation of strings is extended to languages L₁ and L₂ as follows:
L₁ · L₂ = {u · v | u ∈ L₁ and v ∈ L₂}.
The n-th power of a language L is defined inductively:
L⁰ = {ε} and Lⁿ⁺¹ = L · Lⁿ.
The (Kleene) star operator (also Kleene closure or iteration) of a language L is
L* = ⋃_{n∈ℕ} Lⁿ = {w₁ … wₙ | n ∈ ℕ and w₁, …, wₙ ∈ L}.
For all L it holds that ε ∈ L*.
Example: For L₁ = {1+, 2+} and L₂ = {1+0, 1+1} we have
L₁ · L₂ = {1+1+0, 1+1+1, 2+1+0, 2+1+1},
L₁² = {1+1+, 1+2+, 2+1+, 2+2+},
L₁* = {ε, 1+, 2+, 1+1+, 1+2+, 2+1+, 2+2+, 1+1+1+, 1+1+2+, …}.
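These operations are directly executable for finite languages. The following Python sketch is our own illustration (not part of the original notes); the function names and the truncation bound max_n are ours, and the star is necessarily approximated by a finite union:

def concat(L1, L2):
    """L1 . L2 = { u.v | u in L1 and v in L2 }"""
    return {u + v for u in L1 for v in L2}

def power(L, n):
    """L^0 = {''} and L^(n+1) = L . L^n"""
    result = {""}
    for _ in range(n):
        result = concat(L, result)
    return result

def star(L, max_n):
    """Finite approximation of L* = union of L^n for n = 0, ..., max_n."""
    result = set()
    for n in range(max_n + 1):
        result |= power(L, n)
    return result

L1 = {"1+", "2+"}
L2 = {"1+0", "1+1"}
print(sorted(concat(L1, L2)))   # 1+1+0, 1+1+1, 2+1+0, 2+1+1
print(sorted(power(L1, 2)))     # 1+1+, 1+2+, 2+1+, 2+2+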
4 Bibliography
The following books are recommended for further reading:
J.E. Hopcroft, R. Motwani & J.D. Ullman: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 2nd edition, 2001.
U. Schöning: Theoretische Informatik kurzgefasst. Spektrum Akademischer Verlag, 5. Auflage, 2008.
D. Harel: Algorithmics: The Spirit of Computing. Addison-Wesley, 1987.
The following sources were used during the preparation of the lecture:
J. Albert & T. Ottmann: Automaten, Sprachen und Maschinen für Anwender. BI, 1983 (introductory book only).
E. Börger: Berechenbarkeit, Komplexität, Logik. Vieweg, Braunschweig, 1986 (2. Auflage).
W. Brauer: Automatentheorie. Teubner Verlag, Stuttgart, 1984.
E. Engeler & P. Läuchli: Berechnungstheorie für Informatiker. Teubner, Stuttgart, 1988.
H. Hermes: Aufzählbarkeit, Entscheidbarkeit, Berechenbarkeit. 2. Auflage, Springer-Verlag, Berlin, 1971.
J.E. Hopcroft & J.D. Ullman: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
H.R. Lewis & C.H. Papadimitriou: Elements of the Theory of Computation. Prentice Hall, Englewood Cliffs, 1981.
K.R. Reischuk: Einführung in die Komplexitätstheorie. Teubner Verlag, Stuttgart, 1990.
A. Salomaa: Computation and Automata. Cambridge University Press, Cambridge, 1985.
A. Salomaa: Formal Languages. Academic Press, New York, 1973.
F. Stetter: Grundbegriffe der Theoretischen Informatik. Springer-Verlag, Heidelberg, 1988 (introductory book).
W. Thomas: Grundzüge der Theoretischen Informatik. Lecture in the winter semester 1989/90 at RWTH Aachen.
Chapter II
Finite automata and regular languages
In the following chapter we deal with a simple model of a machine: the finite automaton. We will see that
finite automata are widely used in computer science,
the languages recognized by finite automata have many structural properties, e.g. representability by regular expressions,
questions about computations performed by finite automata are decidable.
Moreover, finite automata can be easily implemented as tables in programs and as circuits.
1 Finite automata
We want to use finite automata for recognizing languages. We can imagine a finite automaton as a machine with final states, which reads characters from a tape, can move its head only to the right and cannot print new symbols on the tape.
Therefore, the transition function of finite automata is often defined as a mapping δ : Q × Σ → Q, where Q is the set of states and Σ is the input alphabet of the automaton. This automaton can be applied to an input string w ∈ Σ*.
1.1 Definition: A deterministic finite automaton (or acceptor), in short DFA, is a 5-tuple A = (Σ, Q, δ, q₀, F), where
1. Σ is the input alphabet,
2. Q is a finite set of states,
3. δ ⊆ Q × Σ × Q is the transition relation, which is right-unique and left-total, i.e. ∀q ∈ Q, a ∈ Σ there is exactly one q′ ∈ Q : (q, a, q′) ∈ δ,
4. q₀ ∈ Q is the initial state,
5. F ⊆ Q is the set of final states.
Both representations of automata are related as follows:
δ(q, a) = q′ if and only if (q, a, q′) ∈ δ.
The elements (q, a, q′) ∈ δ are called transitions. We write q –a→ q′ instead of (q, a, q′) ∈ δ. For given a ∈ Σ we use –a→ also as a 2-place relation –a→ ⊆ Q × Q:
∀q, q′ ∈ Q : q –a→ q′ if and only if (q, a, q′) ∈ δ.
A DFA can be graphically represented by a finite state diagram. This is a directed graph which contains a vertex labelled q for every state q of the automaton and a directed labelled edge for every transition q –a→ q′.
Example: [State diagram of a DFA A₁ with two states q₀ and q₁: each state has a loop labelled 0, and two edges labelled 1 lead from q₀ to q₁ and from q₁ back to q₀.]
The single transitions q –a→ q′ are extended to strings w ∈ Σ*. For all q, q′ ∈ Q, a ∈ Σ and w ∈ Σ* we define inductively:
q =ε⇒ q′ if and only if q = q′,
or in terms of relations: =ε⇒ = id_Q;
q =aw⇒ q′ if and only if ∃q″ ∈ Q : q –a→ q″ and q″ =w⇒ q′,
or in terms of relations: =aw⇒ = –a→ ∘ =w⇒.
Alternatively, the transition function δ can be extended to a function δ* : Q × Σ* → Q with
δ*(q, ε) = q and δ*(q, aw) = δ*(δ(q, a), w)
for all q ∈ Q, a ∈ Σ and w ∈ Σ*. It holds that:
δ*(q, w) = q′ if and only if q =w⇒ q′.
1.2 Definition (acceptance and equivalence):
1. The language accepted (or recognized) by a DFA A = (Σ, Q, δ, q₀, F) is
L(A) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q} resp. L(A) = {w ∈ Σ* | δ*(q₀, w) ∈ F}.
A language L is (finitely) accepted, if there is a DFA A with L = L(A).
2. Two DFAs A₁ and A₂ are called equivalent, if L(A₁) = L(A₂) holds.
3. A state q ∈ Q is reachable in A, if ∃w ∈ Σ* : q₀ =w⇒ q.
Example: For the automaton A₁ from the last example it holds that:
L(A₁) = {w ∈ {0, 1}* | w contains an even number of 1s}.
Remark: For all a₁, …, aₙ ∈ Σ and q, q′ ∈ Q:
q =a₁…aₙ⇒ q′ if and only if ∃q₁, …, qₙ ∈ Q : q –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ = q′,
or in terms of relations:
=a₁…aₙ⇒ = –a₁→ ∘ … ∘ –aₙ→.
The sequence q –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ is also called a transition sequence.
Example (from the field of compiler construction): A compiler for a programming language works in several phases, which can sometimes run in parallel:
Lexical analysis: In this phase the so-called scanner breaks the input text into a sequence of tokens. These are identifiers, keywords and delimiters.
Syntax analysis: The generated token sequence is the input to the so-called parser, which decides whether this sequence is a syntactically correct program.
Code generation: The syntactic structure recognized by the parser is used to generate machine code.
Optimisation: The execution time of the program is improved, mostly by local changes of the machine code.
The lexical analysis is a simple task, which can be accomplished by finite automata. As an example let us consider the typical structure of identifiers in a programming language. We will consider a syntax diagram of MODULA:
[Syntax diagram 'Ident': a letter, followed by an arbitrary sequence of letters and digits; 'letter' stands for a, …, z, A, …, Z and 'digit' for 0, …, 9.]
The identifiers constructed this way can be recognized by the following finite automaton:
[State diagram: from the initial state an edge labelled a, …, Z leads to a final state, which has loops labelled a, …, Z and 0, …, 9.]
For the sake of clarity we have labelled the edges of the state diagram with several symbols.
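As an illustration, this identifier automaton can be implemented as a small table-driven program. The following Python sketch is our own addition; it assumes that 'letter' and 'digit' may be tested with Python's isalpha() and isdigit(), which accept slightly more characters than a, …, z, A, …, Z and 0, …, 9:

def is_identifier(w):
    state = 0                     # state 0: initial, state 1: final
    for ch in w:
        if state == 0 and ch.isalpha():
            state = 1             # first character must be a letter
        elif state == 1 and (ch.isalpha() or ch.isdigit()):
            state = 1             # afterwards letters and digits may follow
        else:
            return False          # no transition: reject
    return state == 1             # accept iff we end in the final state

assert is_identifier("x1") and not is_identifier("1x")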
Example (from the field of operating systems): If several programs try to access shared resources in a multitasking environment, then these attempts need to be synchronized. For example, consider two programs P₁ and P₂, which share one printer. P₁ and P₂ should be synchronized in such a way that they do not send data to the printer simultaneously. We construct a finite automaton that monitors the printer use by P₁ and P₂.
P₁ reports the beginning and the end of printing to the automaton by the symbols b₁ and e₁. P₂ behaves similarly to P₁ and uses the symbols b₂ and e₂ respectively.
At each point in time the behaviour of P₁ and P₂ regarding the printer is given by a finite string w ∈ {b₁, e₁, b₂, e₂}*.
[State diagram of the monitor automaton: an initial and final state 'printer free'; an edge b₁ leads to a state 'P₁ prints', left again by e₁; an edge b₂ leads to a state 'P₂ prints', left again by e₂; all remaining symbols (e.g. e₁, e₂ in the free state, or b₁, b₂, e₂ while P₁ prints) lead to a rejecting trap state that loops on b₁, b₂, e₁, e₂.]
Non-determinism
In many applications finite automata can be simplified, if non-determinism is allowed, i.e., we consider an arbitrary relation
δ ⊆ Q × Σ × Q
as transition relation. It may happen that for some q ∈ Q, a ∈ Σ there are several successor states q₁, …, qₙ, so that (in graphical representation) it holds that:
[Diagram: from state q several edges labelled a lead to the successor states q₁, …, qₙ.]
After accepting a the automaton moves non-deterministically from q to one of the successor states q₁, …, qₙ. A special case is n = 0; then for q ∈ Q and a ∈ Σ there is no successor state q′ with q –a→ q′. In this case the automaton stops and rejects the symbol a. These remarks lead to the following definition.
1.3 Definition: A non-deterministic finite automaton (or acceptor), in short NFA, is a 5-tuple
B = (Σ, Q, δ, q₀, F),
where Σ, Q, q₀ and F are defined as in DFAs and for δ it holds that:
δ ⊆ Q × Σ × Q.
Transitions are written (q –a→ q′) as in DFAs.
A graphical representation by a state diagram remains unchanged.
1.4 Definition (acceptance and equivalence):
(i) The language accepted (or recognized) by an NFA B = (Σ, Q, δ, q₀, F) is
L(B) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q}.
By NFA we denote the class of languages accepted by NFAs.
(ii) Two NFAs B₁ and B₂ are called equivalent, if L(B₁) = L(B₂) holds.
An NFA recognizes the string w, if while reading w we can reach one of the final states. It may be the case that, while reading w, there also exist other transition sequences, which end up in non-final states.
Obviously, a DFA is a special case of an NFA. Thus the equivalence between arbitrary finite automata is defined.
Example (suffix recognition): Consider an alphabet Σ and a string v = a₁ … aₙ with aᵢ ∈ Σ for i = 1, …, n.
We want to recognize the language
L_v = {wv | w ∈ Σ*},
which consists of all strings over Σ with the suffix v.
For this purpose we consider the NFA
B_v = (Σ, {q₀, …, qₙ}, δ, q₀, {qₙ}),
where δ is defined by the following state diagram of B_v:
[State diagram: q₀ has a loop labelled a for every a ∈ Σ; a chain of edges q₀ –a₁→ q₁ –a₂→ q₂ … –aₙ→ qₙ leads to the final state qₙ.]
B_v behaves non-deterministically in the initial state q₀: while reading a string, B_v can decide at each occurrence of a₁ whether to try to recognize the suffix v. In order to do so B_v moves to q₁ and now expects a₂ … aₙ as the rest of the suffix. Should this not be the case, then B_v stops at some i ∈ {1, …, n} and a ≠ aᵢ.
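The non-determinism of B_v can be simulated by tracking the set of currently possible states. The following Python sketch is our own illustration; the state numbers 0, …, n play the role of q₀, …, qₙ:

def accepts_suffix(v, w):
    """Simulate the suffix NFA B_v on input w."""
    states = {0}                      # start in q_0
    n = len(v)
    for a in w:
        nxt = set()
        for q in states:
            if q == 0:
                nxt.add(0)            # q_0 loops on every a in Sigma
                if a == v[0]:
                    nxt.add(1)        # guess that the suffix starts here
            elif q < n and a == v[q]:
                nxt.add(q + 1)        # continue matching a_1 ... a_n
        states = nxt
    return n in states                # accept iff q_n is reachable

assert accepts_suffix("01", "2001") and not accepts_suffix("01", "010")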
The question arises: Do NFAs accept more languages than DFAs?
The answer is no.
1.5 Theorem (Rabin and Scott, 1959): For every NFA there exists an equivalent DFA.
Proof: Consider an NFA B = (Σ, Q, δ, q₀, F). We introduce a DFA A with L(A) = L(B) using the following powerset construction:
A = (Σ, P(Q), δ_A, {q₀}, F_A),
where for S, S′ ⊆ Q and a ∈ Σ:
S –a→_A S′ if and only if S′ = {q′ ∈ Q | ∃q ∈ S : q –a→ q′}.
Let the set of final states be F_A = {S ⊆ Q | S ∩ F ≠ ∅}.
There exists a state in A for each subset S of the set of states Q of B (we denote such a state by S). Collecting all a-successors of all states q ∈ S results in S –a→_A S′. We see that δ_A is deterministic, thus:
∀S ⊆ Q ∀a ∈ Σ there is exactly one S′ ⊆ Q : S –a→_A S′.
Furthermore, for all q, q′ ∈ Q, S, S′ ⊆ Q and a ∈ Σ it holds that:
(i) If q –a→ q′ and q ∈ S,
then ∃S′ ⊆ Q : S –a→_A S′ and q′ ∈ S′.
(ii) If S –a→_A S′ and q′ ∈ S′,
then ∃q ∈ Q : q –a→ q′ and q ∈ S.
This way we can easily show that L(A) = L(B). Let w = a₁ … aₙ ∈ Σ* with aᵢ ∈ Σ for i = 1, …, n. Then it holds that:
w ∈ L(B) ⟺ ∃q ∈ F : q₀ =w⇒ q
⟺ ∃q₁, …, qₙ ∈ Q : q₀ –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ with qₙ ∈ F
(⟸ follows from (i) and ⟹ follows from (ii))
⟺ ∃S₁, …, Sₙ ⊆ Q : {q₀} –a₁→_A S₁ … Sₙ₋₁ –aₙ→_A Sₙ with Sₙ ∩ F ≠ ∅
⟺ ∃S ∈ F_A : {q₀} =w⇒_A S
⟺ w ∈ L(A).
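The powerset construction is directly implementable. The following Python sketch is our own addition (the dictionary encoding of δ is an assumption); unlike the full P(Q) used in the proof, it builds only the subsets reachable from {q₀}, which suffices for acceptance:

def powerset_dfa(sigma, delta, q0, finals):
    """Rabin-Scott construction: delta maps (state, symbol) to a set of states."""
    start = frozenset({q0})
    states, worklist, trans = {start}, [start], {}
    while worklist:
        S = worklist.pop()
        for a in sigma:
            # S' = { q' | exists q in S with q -a-> q' }
            S2 = frozenset(q2 for q in S for q2 in delta.get((q, a), ()))
            trans[(S, a)] = S2
            if S2 not in states:
                states.add(S2)
                worklist.append(S2)
    dfa_finals = {S for S in states if S & finals}   # S intersects F
    return states, trans, start, dfa_finals

# the suffix-recognition NFA for 01 over {0,1,2} (see the example below):
delta = {("q0", a): {"q0"} for a in "012"}
delta[("q0", "0")] = {"q0", "q1"}
delta[("q1", "1")] = {"q2"}
states, trans, start, finals = powerset_dfa("012", delta, "q0", {"q2"})
print(len(states))   # only the reachable subsets are constructed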
Example: We consider the suffix recognition of 01 in strings over the alphabet Σ = {0, 1, 2}. According to the previous example the language L₀₁ = {w01 | w ∈ Σ*} is recognized by the NFA B₀₁ with the following diagram:
[State diagram: q₀ has a loop labelled 0, 1, 2; an edge labelled 0 leads from q₀ to q₁ and an edge labelled 1 from q₁ to the final state q₂.]
Now we apply the powerset construction from the Rabin-Scott theorem to B₀₁. As a result we get the following DFA A₀₁:
[State diagram of A₀₁: the reachable part consists of the states {q₀}, {q₀, q₁} and {q₀, q₂} (final); reading 0 always leads to {q₀, q₁}, reading 1 leads from {q₀, q₁} to {q₀, q₂} and otherwise to {q₀}, and reading 2 always leads back to {q₀}. The remaining states ∅, {q₁}, {q₂}, {q₁, q₂} and {q₀, q₁, q₂} appear in boxes.]
The fragments in boxes are unreachable from the initial state {q₀} of A₀₁. Therefore, A₀₁ can be simplified to the following equivalent DFA:
[State diagram of the simplified DFA, consisting only of the three reachable states from above.]
The number of states of this DFA cannot be reduced further. There are examples where the DFA constructed in the proof of the Rabin-Scott theorem cannot be simplified at all, i.e., if the given NFA has n states, then in the worst case the DFA actually needs 2ⁿ states.
Spontaneous transitions
Automata can be defined even more conveniently, if ε-transitions are allowed in addition to non-determinism. These are transitions which an automaton makes spontaneously, without reading a symbol of the input string.
1.6 Definition: A nondeterministic finite automaton (acceptor) with ε-transitions, in short ε-NFA, is a 5-tuple
B = (Σ, Q, δ, q₀, F),
where Σ, Q, q₀ and F are defined as in NFAs or DFAs and for the transition relation it holds that:
δ ⊆ Q × (Σ ∪ {ε}) × Q.
Transitions q –ε→ q′ are called ε-transitions or spontaneous transitions.
In order to define the acceptance behaviour of ε-NFAs, we need an extended transition relation =w⇒. First, for q, q′ ∈ Q it holds that:
q –ε→ q′ if and only if (q, ε, q′) ∈ δ.
Here we use the standard infix notation for 2-place relations, i.e., q –ε→ q′ instead of (q, q′) ∈ –ε→. The relation –ε→ is called the ε-transition relation.
Therefore, we can use the composition ∘ for such transition relations. For all ρ, σ ⊆ Q × Q the composition ρ ∘ σ is defined as follows: for q, q′ ∈ Q it holds that
q (ρ ∘ σ) q′ if and only if ∃q″ ∈ Q : q ρ q″ and q″ σ q′.
1.7 Definition: Let B = (Σ, Q, δ, q₀, F) be an ε-NFA. Then a 2-place relation =w⇒ ⊆ Q × Q is defined inductively for every string w ∈ Σ*:
q =ε⇒ q′, if ∃n ≥ 0 : q –ε→ … –ε→ (n times) q′,
q =aw⇒ q′, if q (=ε⇒ ∘ –a→ ∘ =w⇒) q′,
where q, q′ ∈ Q, a ∈ Σ and w ∈ Σ*.
Remark: For all q, q′ ∈ Q and a₁, …, aₙ ∈ Σ it holds that:
(i) q =ε⇒ q always holds, but q –ε→ q holds only if there is an explicit ε-transition from q to itself.
(ii) =a₁…aₙ⇒ = =ε⇒ ∘ –a₁→ ∘ =ε⇒ ∘ … ∘ =ε⇒ ∘ –aₙ→ ∘ =ε⇒.
(iii) The relation =ε⇒ is decidable.
Proof: (i) and (ii) follow directly from the definition.
(iii): Let k be the number of states in Q. Then:
q =ε⇒ q′ ⟺ ∃n ≤ k − 1 : q –ε→ … –ε→ (n times) q′.
If at least k transitions –ε→ take place one after the other, then some state repeats on the path from q to q′, and the path can be shortened. Thus (iii) is proved.
1.8 Definition (acceptance): Let B = (Σ, Q, δ, q₀, F) be an ε-NFA.
(i) The language accepted (or recognized) by B is
L(B) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q}.
(ii) Two ε-NFAs B₁ and B₂ are called equivalent, if L(B₁) = L(B₂).
Obviously an NFA is a special case of an ε-NFA. Therefore, the equivalence of NFAs and ε-NFAs is defined as well.
Example: For the input alphabet Σ = {0, 1, 2} we consider the ε-NFA B defined by the following state diagram:
[State diagram: state q₀ with a loop labelled 0, an ε-edge from q₀ to q₁, which has a loop labelled 1, and an ε-edge from q₁ to the final state q₂, which has a loop labelled 2.]
Then:
L(B) = {w ∈ {0, 1, 2}* | ∃k, l, m ≥ 0 : w = 0…0 (k times) 1…1 (l times) 2…2 (m times)}.
From a given ε-NFA the ε-transitions can always be removed:
1.9 Theorem: For every ε-NFA there exists an equivalent NFA.
Proof: Let B = (Σ, Q, δ, q₀, F) be an ε-NFA. We construct the NFA A = (Σ, Q, δ_A, q₀, F_A), where for all q, q′ ∈ Q and a ∈ Σ it holds that:
q –a→_A q′ if and only if q (=ε⇒ ∘ –a→) q′ in B, therefore if and only if ∃q″ ∈ Q : q =ε⇒ q″ –a→ q′ in B,
q ∈ F_A if and only if ∃q′ ∈ F : q =ε⇒ q′ in B.
Now let w = a₁ … aₙ with n ≥ 1 and aᵢ ∈ Σ for i = 1, …, n. Then:
w ∈ L(A) ⟺ ∃q ∈ F_A : q₀ =w⇒_A q
⟺ ∃q ∈ F_A : q₀ (–a₁→_A ∘ … ∘ –aₙ→_A) q
⟺ ∃q ∈ F_A : q₀ (=ε⇒ ∘ –a₁→ ∘ … ∘ =ε⇒ ∘ –aₙ→) q
(⟹: choose q′ ∈ F with q =ε⇒ q′; ⟸: choose q = q′)
⟺ ∃q′ ∈ F : q₀ (=ε⇒ ∘ –a₁→ ∘ … ∘ –aₙ→ ∘ =ε⇒) q′
⟺ ∃q′ ∈ F : q₀ =w⇒ q′
⟺ w ∈ L(B).
In the special case w = ε the above argument reduces as follows:
ε ∈ L(A) ⟺ q₀ ∈ F_A ⟺ ∃q ∈ F : q₀ =ε⇒ q ⟺ ε ∈ L(B).
Remarks:
(i) In the proof F_A could also be defined as follows:
F_A = F ∪ {q₀}, if ∃q ∈ F : q₀ =ε⇒ q,
F_A = F otherwise.
Compare with the proof in Hopcroft & Ullman.
(ii) The NFA A can be computed from the given ε-NFA B, because the relation q =ε⇒ q′ is decidable: compare with the previous remark.
(iii) If there are ε-cycles in the ε-NFA B, then the set of states of the NFA A can be reduced by a preceding contraction of the ε-cycles; every ε-cycle can be replaced by one state.
Example: We apply the construction introduced in the proof to the ε-NFA B from the previous example. As the result we have the following NFA A:
[State diagram: all three states q₀, q₁, q₂ are final; q₀ has a loop labelled 0 and edges labelled 1 to q₁ and 2 to q₂; q₁ has a loop labelled 1 and an edge labelled 2 to q₂; q₂ has a loop labelled 2.]
It is easy to check that L(A) = L(B) actually holds.
Therefore, we have the following result:
DFA = NFA = ε-NFA,
i.e., the classes of languages accepted by DFAs, NFAs and ε-NFAs coincide; we call it the class of finitely accepted languages. If we want to show properties of this class, we can choose the type of automaton best suited for the purpose.
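The ε-elimination from the proof above can be implemented directly. The following Python sketch is our own illustration; eps maps each state to its immediate ε-successors, delta maps (state, symbol) to successor sets, and the construction follows the definitions of δ_A and F_A given in the proof:

def eps_closure(q, eps):
    """All q' with q =eps=> q' (paths of length < |Q| suffice, see the remark)."""
    closure, stack = {q}, [q]
    while stack:
        for q2 in eps.get(stack.pop(), ()):
            if q2 not in closure:
                closure.add(q2)
                stack.append(q2)
    return closure

def eliminate_eps(states, sigma, delta, eps, finals):
    new_delta, new_finals = {}, set()
    for q in states:
        cl = eps_closure(q, eps)
        if cl & finals:                  # q in F_A iff q =eps=> some final state
            new_finals.add(q)
        for a in sigma:
            new_delta[(q, a)] = {q3 for q2 in cl
                                 for q3 in delta.get((q2, a), ())}
    return new_delta, new_finals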
2 Closure properties
Now we explore under which operations the class of finitely accepted languages is closed. For this purpose we consider the set operations union, intersection, complement and difference, as well as operations on languages such as concatenation and iteration (Kleene star operator), which were introduced in Chapter I.
2.1 Theorem: The class of finitely acceptable languages is closed under the following operations:
1. union,
2. complement,
3. intersection,
4. difference,
5. concatenation,
6. iteration.
Proof: Let L₁, L₂ ⊆ Σ* be finitely acceptable, say Lᵢ = L(Aᵢ) for DFAs Aᵢ = (Σ, Qᵢ, δᵢ, q₀ᵢ, Fᵢ), i = 1, 2, where w.l.o.g. Q₁ ∩ Q₂ = ∅. For L₁ ∪ L₂, L₁ · L₂ and L₁* we shall provide accepting ε-NFAs. This makes the task less complicated.
1. L₁ ∪ L₂: Let us construct the ε-NFA B = (Σ, {q₀} ∪ Q₁ ∪ Q₂, δ, q₀, F₁ ∪ F₂), where
q₀ ∉ Q₁ ∪ Q₂ and δ = {(q₀, ε, q₀₁), (q₀, ε, q₀₂)} ∪ δ₁ ∪ δ₂
hold. B is graphically represented as follows:
[Diagram: the new initial state q₀ has ε-edges to q₀₁ and q₀₂, the initial states of A₁ and A₂.]
It is easy to show that L(B) = L₁ ∪ L₂ holds.
2. L̄₁: Since A₁ is deterministic and total, we can accept the complement L̄₁ = Σ* \ L₁ by exchanging final and non-final states: consider the DFA A = (Σ, Q₁, δ₁, q₀₁, Q₁ \ F₁). For all w ∈ Σ* it holds that:
w ∈ L(A) ⟺ ∃q ∈ Q₁ \ F₁ : q₀₁ =w⇒₁ q
(determinism) ⟺ ¬∃q ∈ F₁ : q₀₁ =w⇒₁ q
⟺ w ∉ L(A₁)
⟺ w ∉ L₁.
Therefore, L(A) = L̄₁ holds.
3. L₁ ∩ L₂: is obvious, because L₁ ∩ L₂ = Σ* \ (L̄₁ ∪ L̄₂) holds.
4. L₁ \ L₂: is obvious, because L₁ \ L₂ = L₁ ∩ L̄₂ holds.
5. L₁ · L₂: Construct the ε-NFA B = (Σ, Q₁ ∪ Q₂, δ, q₀₁, F₂) with
δ = δ₁ ∪ {(q, ε, q₀₂) | q ∈ F₁} ∪ δ₂.
B is graphically represented as follows:
[Diagram: the automaton A₁ with final states q₁₁, …, q₁ₘ ∈ F₁, each connected by an ε-edge to the initial state q₀₂ of A₂, whose final states q₂₁, …, q₂ₙ ∈ F₂ are the final states of B.]
It is easy to show that L(B) = L₁ · L₂ holds.
6. L₁*: For the iteration we construct the ε-NFA
B = (Σ, {q₀} ∪ Q₁, δ, q₀, {q₀}), where q₀ ∉ Q₁ and
δ = {(q₀, ε, q₀₁)} ∪ δ₁ ∪ {(q, ε, q₀) | q ∈ F₁}.
[Diagram: the new initial and final state q₀ has an ε-edge to q₀₁; every final state q₁, …, qₙ ∈ F₁ of A₁ has an ε-edge back to q₀.]
It is easy to show that L(B) = L₁* holds.
Thus the proof is complete.
Remark: There is also an interesting direct construction of accepting DFAs for the union and intersection. Let A₁ and A₂ be as in the above proof. Then we consider the following transition relation δ ⊆ Q × Σ × Q on the cartesian product Q = Q₁ × Q₂ of the sets of states:
for all q₁, q₁′ ∈ Q₁ and q₂, q₂′ ∈ Q₂ and a ∈ Σ it holds that
(q₁, q₂) –a→ (q₁′, q₂′) if and only if q₁ –a→₁ q₁′ and q₂ –a→₂ q₂′.
The relation –a→ models the simultaneous parallel progress of the automata A₁ and A₂ while reading the symbol a.
Consider the DFAs
A∪ = (Σ, Q, δ, (q₀₁, q₀₂), F∪),
A∩ = (Σ, Q, δ, (q₀₁, q₀₂), F∩)
with
F∪ = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ or q₂ ∈ F₂},
F∩ = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ and q₂ ∈ F₂}.
Then it holds that L(A∪) = L₁ ∪ L₂ and L(A∩) = L₁ ∩ L₂.
Proof: We show the statement for A∩; the argument for A∪ is analogous. For all w ∈ Σ* it holds that:
w ∈ L(A∩) ⟺ ∃(q₁, q₂) ∈ F∩ : (q₀₁, q₀₂) =w⇒ (q₁, q₂)
⟺ ∃q₁ ∈ F₁, q₂ ∈ F₂ : q₀₁ =w⇒₁ q₁ and q₀₂ =w⇒₂ q₂
⟺ w ∈ L(A₁) and w ∈ L(A₂)
⟺ w ∈ L₁ ∩ L₂.
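The product construction translates directly into code. The following Python sketch is our own addition; delta1 and delta2 are assumed to be total transition functions given as dictionaries:

def product_dfa(delta1, q01, F1, delta2, q02, F2, mode="and"):
    """Build the product automaton; mode 'and' yields A-cap, 'or' yields A-cup."""
    def delta(pair, a):
        q1, q2 = pair
        return (delta1[(q1, a)], delta2[(q2, a)])   # run both DFAs in parallel
    def is_final(pair):
        in1, in2 = pair[0] in F1, pair[1] in F2
        return (in1 and in2) if mode == "and" else (in1 or in2)
    return delta, (q01, q02), is_final

def accepts(delta, start, is_final, w):
    q = start
    for a in w:
        q = delta(q, a)
    return is_final(q)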
3 Regular expressions
Finitely accepted languages can also be described by terms, so-called regular expressions.
3.1 Definition:
1. The set of regular expressions re over an alphabet Σ is defined inductively: ∅, ε and a (for every a ∈ Σ) are regular expressions, and if re₁, re₂ and re are regular expressions, then so are (re₁ + re₂), (re₁ · re₂) and re*.
2. Every regular expression re over Σ describes a language L(re) ⊆ Σ*, which is inductively defined as follows:
L(∅) = ∅
L(ε) = {ε}
L(a) = {a} for a ∈ Σ
L((re₁ + re₂)) = L(re₁) ∪ L(re₂)
L((re₁ · re₂)) = L(re₁) · L(re₂)
L(re*) = (L(re))*
3. A language L ⊆ Σ* is called regular, if there is a regular expression re over Σ with L = L(re).
Examples:
2. The language over {b₁, e₁, b₂, e₂} used for the synchronization of two programs, which share a printer, is described by the following regular expression:
re₂ = (b₁ e₁ + b₂ e₂)* (ε + b₁ + b₂).
3. Let Σ = {a₁, …, aₙ, b₁, …, bₘ} and v = a₁ … aₙ. Then the language L_v = {wv | w ∈ Σ*} from the suffix-recognition example is described by the regular expression
re₃ = (a₁ + … + aₙ + b₁ + … + bₘ)* · a₁ · … · aₙ.
It holds that: L(re₃) = L_v.
Now we will show that a more general statement holds:
3.2 Theorem (Kleene): A language is regular if and only if it is finitely acceptable.
Proof: We consider a language L ⊆ Σ*.
⟹: For a regular expression re over Σ it holds that L = L(re). We show using induction over the structure of re that L(re) is finitely acceptable.
Basis: L(∅), L(ε) and L(a) are obviously finitely acceptable for a ∈ Σ.
Induction step: Let L(re), L(re₁) and L(re₂) be finitely acceptable. Then L((re₁ + re₂)), L((re₁ · re₂)) and L(re*) are also finitely acceptable, because the class of finitely acceptable languages is closed under union, concatenation and iteration.
⟸: It holds that L = L(A) for a DFA A with n states. W.l.o.g. A = (Σ, Q, δ, 1, F), where Q = {1, …, n}. For i, j ∈ {1, …, n} and k ∈ {0, 1, …, n} we define
L^k_{i,j} = {w ∈ Σ* | i =w⇒ j, and for all u ∈ Σ* and l ∈ Q: from (∃v : v ≠ ε, v ≠ w and uv = w) and i =u⇒ l it follows that l ≤ k}.
L^k_{i,j} consists of all strings w such that the automaton A moves from state i to state j while reading the string w, and all intermediate states passed while reading w have a number ≤ k.
Now we show by induction on k that the languages L^k_{i,j} are all regular.
k = 0: For strings from L⁰_{i,j} no intermediate state can be used. Thus it holds that:
L⁰_{i,j} = {a ∈ Σ | i –a→ j}, if i ≠ j,
L⁰_{i,j} = {a ∈ Σ | i –a→ j} ∪ {ε}, if i = j.
Therefore, L⁰_{i,j} is regular as a finite subset of Σ*.
k → k + 1: Let the languages L^k_{i,j} be regular for all i, j ∈ {1, …, n}. Then for all i, j ∈ {1, …, n} it holds that:
L^{k+1}_{i,j} = L^k_{i,j} ∪ L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j}.
In order to go from the state i to the state j, the intermediate state k + 1 is either unnecessary, in which case L^k_{i,j} suffices for the description; or the state k + 1 is used as an intermediate state one or several times, in which case L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j} is used for the description. Therefore, by induction on k we get:
L^{k+1}_{i,j} is regular.
From the regularity of the languages L^k_{i,j} we conclude that L itself is regular, because it holds that
L = L(A) = ⋃_{j∈F} L^n_{1,j}.
Thus we have proved the statement of Kleene.
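The inductive construction of the languages L^k_{i,j} can be turned into an algorithm that computes a regular expression from a DFA. The following Python sketch is our own illustration; it produces unsimplified expressions, writing 'eps' for ε and '+' for union (an empty result denotes the empty language):

def dfa_to_regex(n, sigma, delta, finals):
    """States 1..n, delta maps (i, a) to j; returns a regex for L(A)."""
    # R[i][j] holds a regular expression for L^0_{i,j} (None = empty language)
    R = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            atoms = [a for a in sigma if delta.get((i, a)) == j]
            if i == j:
                atoms.append("eps")        # the empty path stays in state i
            R[i][j] = "+".join(atoms) if atoms else None
    for k in range(1, n + 1):              # now allow k as intermediate state
        R2 = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                parts = [R[i][j]] if R[i][j] else []
                if R[i][k] and R[k][j]:
                    loop = f"({R[k][k]})*" if R[k][k] else ""
                    parts.append(f"({R[i][k]}){loop}({R[k][j]})")
                R2[i][j] = "+".join(parts) if parts else None
        R = R2
    return "+".join(R[1][j] for j in finals if R[1][j])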
Remark: As an alternative to the above construction of a regular expression from a given finite automaton we can solve guarded regular systems of equations.
4 Structural properties of regular languages
As the regular languages are exactly the finitely acceptable languages, we can derive important characteristics of regular languages from the finiteness of the sets of states of the accepting automata. First we consider the so-called pumping lemma, which gives us a necessary condition for a language to be regular. Let Σ be an arbitrary alphabet.
4.1 Theorem (the pumping lemma for regular languages): For every regular language L ⊆ Σ* there exists a number n ∈ ℕ, so that for all strings z ∈ L with |z| ≥ n there is a decomposition z = uvw with v ≠ ε and |uv| ≤ n, and for all i ∈ ℕ it holds that uvⁱw ∈ L, i.e., we can pump the substring v and the resulting string will remain in the regular language L.
In quantifier notation:
L ⊆ Σ* regular ⟹ ∃n ∈ ℕ ∀z ∈ L with |z| ≥ n ∃u, v, w ∈ Σ* : z = uvw ∧ v ≠ ε ∧ |uv| ≤ n ∧ ∀i ∈ ℕ : uvⁱw ∈ L.
Proof: Let L ⊆ Σ* be regular.
According to the Kleene theorem there is a DFA A = (Σ, Q, δ, q₀, F) with L = L(A). We choose n = |Q| and consider a string z ∈ L with |z| ≥ n. Then A must go through at least one state twice while reading z. More precisely it holds that:
Let z = a₁ … a_r with r = |z| ≥ n and aᵢ ∈ Σ for i = 1, …, r, and let q₁, …, q_r ∈ Q be defined as follows: q_{i−1} –aᵢ→ qᵢ for i = 1, …, r. Then there are j, k with 0 ≤ j < k ≤ n ≤ r, so that q_j = q_k holds. In graphical representation:
[Diagram: a path from q₀ via a₁ … a_j to q_j, a loop at q_j labelled a_{j+1} … a_k (since q_k = q_j), and a path labelled a_{k+1} … a_r from q_j to q_r ∈ F.]
Let u = a₁ … a_j, v = a_{j+1} … a_k, w = a_{k+1} … a_r. Then according to the properties of j and k it holds that v ≠ ε and |uv| ≤ n. Besides, it is clear that the automaton A, while recognizing, can go through the q_j-loop any number of times, i.e., for all i ∈ ℕ it holds that uvⁱw ∈ L.
We consider a typical application of the pumping lemma, where we prove that certain languages are not regular.
Example: The language L = {aⁿbⁿ | n ∈ ℕ} is not regular. We provide a proof by contradiction.
Hypothesis: L is regular. According to the pumping lemma, there is an n ∈ ℕ with the given properties. Now consider z = aⁿbⁿ. Since |z| ≥ n holds, we can break z into z = uvw with v ≠ ε and |uv| ≤ n, so that for all i ∈ ℕ it holds that uvⁱw ∈ L. However, v consists only of symbols a, so that in particular (with i = 0) uw = a^{n−|v|}bⁿ ∈ L should hold. Contradiction!
The above example shows that finite automata cannot count unboundedly. We shall get acquainted with other applications of the pumping lemma in the section devoted to decidability properties.
Nerode relation
If we consider the so-called Nerode relation, we get a characteristic, i.e., necessary and sufficient, condition for the regularity of languages L ⊆ Σ*.
4.2 Definition: Let L ⊆ Σ*. The Nerode relation ≡_L is the following relation on Σ*, i.e. ≡_L ⊆ Σ* × Σ*:
u ≡_L v if and only if for all w ∈ Σ* it holds that: uw ∈ L ⟺ vw ∈ L.
Therefore, it holds that u ≡_L v, if u and v can be extended in the same way to strings from L. In particular (with w = ε) it holds that u ∈ L ⟺ v ∈ L.
Remark: The Nerode relation is a right congruence, i.e., it has the following properties:
1. ≡_L is an equivalence relation on Σ*,
2. u ≡_L v implies ua ≡_L va for all a ∈ Σ.
Given that ≡_L is an equivalence relation, we can consider the index of ≡_L, i.e., the number of equivalence classes of ≡_L.
4.3 Theorem (Myhill and Nerode):
A language L ⊆ Σ* is regular if and only if the Nerode relation ≡_L has a finite index.
Proof: ⟹: Let L = L(A) for a DFA A = (Σ, Q, δ, q₀, F). For u, v ∈ Σ* it holds that:
u ≡_A v if and only if there is a q ∈ Q with q₀ =u⇒ q and q₀ =v⇒ q,
i.e., if the automaton A moves from q₀ to the same state q after having read the inputs u and v. Note that q is uniquely defined, because A is deterministic. The relation ≡_A is an equivalence relation on Σ*.
We show: ≡_A is a refinement of ≡_L, i.e., for all u, v ∈ Σ* it holds that:
u ≡_A v implies u ≡_L v.
Let u ≡_A v and w ∈ Σ*. Then:
uw ∈ L ⟺ ∃q ∈ Q, q′ ∈ F : q₀ =u⇒ q =w⇒ q′
(since u ≡_A v) ⟺ ∃q ∈ Q, q′ ∈ F : q₀ =v⇒ q =w⇒ q′
⟺ vw ∈ L.
Thus there are at least as many equivalence classes of ≡_A as of ≡_L. Therefore, it holds that
Index(≡_L) ≤ Index(≡_A) = number of states reachable from q₀ ≤ |Q|,
thus ≡_L has a finite index.
⟸: Let L ⊆ Σ* be a language whose Nerode relation has the finite index k. Choose strings u₁, …, u_k ∈ Σ* with u₁ = ε as representatives of the equivalence classes of ≡_L. Then Σ* can be represented as a disjoint union of these equivalence classes:
Σ* = [u₁] ∪ … ∪ [u_k].
In particular, for every string u ∈ Σ* there is a j with [u] = [u_j]. We construct the equivalence-class automaton
A_L = (Σ, {[u₁], …, [u_k]}, δ_L, [ε], F_L)
with F_L = {[u_j] | u_j ∈ L} and the transition relation
[u_i] –a→_L [u_j] if and only if [u_j] = [u_i a].
Note that δ_L is well defined, because ≡_L is a right congruence.
Then A_L is a DFA and for all strings w ∈ Σ* it holds that:
[ε] =w⇒_L [u_j] if and only if [u_j] = [w],
more precisely, for w = a₁ … aₙ:
[ε] –a₁→_L [a₁] –a₂→_L … –aₙ→_L [a₁ … aₙ] = [u_j],
and thus
w ∈ L(A_L) ⟺ ∃[u_j] ∈ F_L : [ε] =w⇒_L [u_j]
⟺ ∃u_j ∈ L : [u_j] = [w]
⟺ w ∈ L.
So A_L accepts the language L. Therefore, L is regular.
We use the method of proof of the Myhill-Nerode theorem in order to minimize the number of states of an automaton. In this case we refer to the deterministic equivalence-class automaton A_L from the proof.
4.4 Corollary: Let L ⊆ Σ* be regular with Index(≡_L) = k. Then the equivalence-class automaton A_L has k states, and no DFA accepting L has fewer states.
We will show that A_L is even unique up to isomorphism.
4.5 Definition (isomorphism): Two DFAs A₁ = (Σ, Q₁, δ₁, q₀₁, F₁) and A₂ = (Σ, Q₂, δ₂, q₀₂, F₂) are called isomorphic, if there is a bijection β : Q₁ → Q₂ with β(q₀₁) = q₀₂, β(F₁) = F₂ and for all q, q′ ∈ Q₁ and a ∈ Σ:
q –a→₁ q′ if and only if β(q) –a→₂ β(q′).
The bijection β is called the isomorphism from A₁ to A₂.
Note that isomorphism is an equivalence relation on finite automata. Now we will show the announced statement.
4.6 Theorem: Let L ⊆ Σ* be regular with Index(≡_L) = k, and let A = (Σ, Q, δ, q₀, F) be a DFA accepting L in which every state is reachable and |Q| = k. Then A and A_L are isomorphic.
Proof: For j = 1, …, k let q_j ∈ Q be the state with q₀ =u_j⇒ q_j, where u₁, …, u_k are the representatives chosen in the construction of A_L. Define β([u_j]) = q_j. We check:
1. β is injective: if q_i = q_j, then for all w ∈ Σ* : u_i w ∈ L ⟺ u_j w ∈ L. Therefore u_i ≡_L u_j and [u_i] = [u_j].
2. β is surjective: this property follows from (1) and the fact that k = |Q|. Thus it holds in particular that Q = {q₁, …, q_k}.
3. β([ε]) = β([u₁]) = q₁ (= q₀).
4. β(F_L) = F: [u_j] ∈ F_L ⟺ u_j ∈ L ⟺ q_j ∈ F.
5. For all i, j ∈ {1, …, k} and a ∈ Σ it holds that:
[u_i] –a→_L [u_j] ⟺ q_i –a→ q_j.
Proof of ⟹: Let [u_i] –a→_L [u_j]. Then by definition of δ_L: [u_i a] = [u_j]. There is an l ∈ {1, …, k} with q_i –a→ q_l. By definition of q_i and q_l we have the following picture:
[Diagram: from q₁ the string u_i leads to q_i, from q_i the symbol a leads to q_l, and from q₁ the string u_l also leads to q_l.]
Given that the strings u_i a and u_l lead to the same state q_l, it follows that u_i a ≡_L u_l. Therefore, it holds that [u_j] = [u_i a] = [u_l]. According to the choice of u₁, …, u_k in A_L it follows that u_j = u_l, even j = l. Thus q_i –a→ q_j follows as desired.
Proof of ⟸: Let q_i –a→ q_j. Thus we have the following picture, similar to the one above:
[Diagram: from q₁ the string u_i leads to q_i, from q_i the symbol a leads to q_j, and from q₁ the string u_j leads to q_j.]
Hence we conclude as above: u_i a ≡_L u_j. Therefore, it holds that [u_i a] = [u_j]; by definition of δ_L we can derive that [u_i] –a→_L [u_j].
Thus A_L and A are isomorphic.
For every regular language L ⊆ Σ* with k as the index of ≡_L there is hence a DFA, unique up to isomorphism, with the minimum number of states k accepting L.
4.7 Definition: The minimal automaton for a regular language L ⊆ Σ* is the equivalence-class automaton A_L.
To minimize a given DFA A = (Σ, Q, δ, q₀, F) we proceed in two steps:
1. Remove unreachable states.
Recall that a state q is reachable, if there is a w ∈ Σ* with q₀ =w⇒ q. The subset of reachable states of Q is computable, because we can consider only those strings w with q₀ =w⇒ q which have length ≤ |Q| (compare with the proof of the pumping lemma).
2. Combine equivalent states.
For q ∈ Q, w ∈ Σ* and S ⊆ Q we write q =w⇒ S, if there is a q′ ∈ S with q =w⇒ q′. Two states q₁, q₂ ∈ Q are called equivalent, in short q₁ ≈ q₂, if for all w ∈ Σ* it holds that:
q₁ =w⇒ F ⟺ q₂ =w⇒ F,
i.e., the same strings lead from q₁ and from q₂ to final states.
There is a close relation between the equivalence ≈ and the Nerode relation ≡_L. Let q₀ =u⇒ q₁ and q₀ =v⇒ q₂. Then it holds that:
q₁ ≈ q₂ ⟺ u ≡_L v ⟺ [u] = [v].
Making a comparison with the equivalence-class automaton A_L we see that equivalent states must coincide in the minimal automaton: q₀, q₁ and q₂ are represented in A_L by the equivalence classes [ε], [u] and [v] with [ε] =u⇒ [u] and [ε] =v⇒ [v], and it follows that:
q₁ ≈ q₂ ⟺ u ≡_L v ⟺ [u] = [v].
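Step 2 can be implemented by iterated refinement of a partition of the states, starting from the partition into F and Q \ F. The following Python sketch is our own addition; delta is assumed to be a total transition function given as a dictionary:

def equivalent_states(states, sigma, delta, finals):
    """Compute the equivalence classes of ~ by partition refinement."""
    # block[q] = identifier of the current class of q; start with F vs Q \ F
    block = {q: (q in finals) for q in states}
    while True:
        # two states stay together iff all their a-successors lie in equal blocks
        signature = {q: (block[q],
                         tuple(block[delta[(q, a)]] for a in sigma))
                     for q in states}
        new_ids = {sig: i for i, sig in enumerate(set(signature.values()))}
        new_block = {q: new_ids[signature[q]] for q in states}
        if new_block == block:
            return block          # q1 ~ q2 iff block[q1] == block[q2]
        block = new_block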
5 Decidability questions
First of all, we note that the following constructions are algorithmically computable:
ε-NFA → NFA → DFA,
DFA → minimal automaton,
ε-NFAs for the following operations on finitely acceptable languages: union, complement, intersection, difference, concatenation and iteration,
regular expression → NFA and DFA → regular expression.
After that we can move on to decidability questions for the languages represented by finite automata or regular expressions. Due to the constructions mentioned above we consider only the languages represented by DFAs. We consider the following problems for regular languages.
Acceptance problem. Given: DFA A and a string w ∈ Σ*. Question: Does w ∈ L(A) hold?
Emptiness problem. Given: DFA A. Question: Does L(A) = ∅ hold?
Finiteness problem. Given: DFA A. Question: Is L(A) finite?
Equivalence problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) = L(A₂) hold?
Inclusion problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) ⊆ L(A₂) hold?
Intersection problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) ∩ L(A₂) = ∅ hold?
5.1 Theorem (decidability): For regular languages
the acceptance problem,
the emptiness problem,
the finiteness problem,
the equivalence problem,
the inclusion problem,
the intersection problem
are all decidable.
Proof: Acceptance problem: We apply A to the given string w and decide whether a final state of A is reached. Therefore, A itself gives us the decision procedure.
Emptiness problem: Let n be the number we get by applying the pumping lemma to the regular language L(A). Then it holds that:
L(A) = ∅ ⟺ ¬∃w ∈ L(A) : |w| < n. (∗)
The proof of (∗): ⟹ is obvious. ⟸ by contraposition: Let L(A) ≠ ∅. If there is a string w ∈ L(A) with |w| < n, then we have nothing to show. Otherwise there is a string w ∈ L(A) with |w| ≥ n. By repeated application of the pumping lemma with i = 0 we get a string w₀ ∈ L(A) with |w₀| < n. This completes the proof.
The decision procedure for L(A) = ∅ now solves the acceptance problem w ∈ L(A)? for every string over the input alphabet of A with |w| < n. If the answer is always no, then we get L(A) = ∅. Otherwise we get L(A) ≠ ∅.
Finiteness problem: Let n be defined as above. Then it holds that:
L(A) is finite ⟺ ¬∃w ∈ L(A) : n ≤ |w| < 2n. (∗∗)
The proof of (∗∗): ⟹: If there were a string w ∈ L(A) with |w| ≥ n, then L(A) would be infinite by the pumping lemma. ⟸ by contraposition: Let L(A) be infinite. Then there are strings of arbitrary length in L(A), in particular a string w with |w| ≥ 2n. By repeatedly applying the pumping lemma with i = 0 we get a string w₀ with n ≤ |w₀| < 2n, because with i = 0 the given string is shortened by at most n letters at a time.
The finiteness problem can thus be decided via (∗∗) by solving the acceptance problem a finite number of times.
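Both decision procedures can be implemented literally as stated in (∗) and (∗∗), by testing all strings below the respective length bound. The following Python sketch is our own illustration; it is exponential in n but effective:

from itertools import product

def accepts(delta, q0, finals, w):
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals

def is_empty(sigma, delta, q0, finals, n):
    """(*): L(A) is empty iff no string with |w| < n is accepted."""
    return not any(accepts(delta, q0, finals, w)
                   for k in range(n)                 # |w| < n
                   for w in product(sigma, repeat=k))

def is_finite(sigma, delta, q0, finals, n):
    """(**): L(A) is finite iff no string with n <= |w| < 2n is accepted."""
    return not any(accepts(delta, q0, finals, w)
                   for k in range(n, 2 * n)          # n <= |w| < 2n
                   for w in product(sigma, repeat=k))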
Equivalence problem: First we construct a DFA A with the following property:
L(A) = (L(A₁) \ L(A₂)) ∪ (L(A₂) \ L(A₁)).
Obviously it holds that:
L(A₁) = L(A₂) ⟺ L(A) = ∅. (∗∗∗)
Hence the equivalence problem for automata is reduced to the emptiness problem for A. The construction of A described above is however hard to implement. Alternatively we can use the following product construction, which is similar to the last remark in section 2 of this chapter. Let Aᵢ = (Σ, Qᵢ, δᵢ, q₀ᵢ, Fᵢ), i = 1, 2. Then consider Q = Q₁ × Q₂ with the following transition relation δ ⊆ Q × Σ × Q: for all q₁, q₁′ ∈ Q₁ and q₂, q₂′ ∈ Q₂ and a ∈ Σ it holds that
(q₁, q₂) –a→ (q₁′, q₂′) if and only if q₁ –a→₁ q₁′ and q₂ –a→₂ q₂′.
Then define A = (Σ, Q, δ, (q₀₁, q₀₂), F) with
F = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ ⟺ q₂ ∉ F₂}.
Then (∗∗∗) holds for this DFA A:
L(A₁) = L(A₂) ⟺ ∀w ∈ Σ* : (w ∈ L(A₁) ⟺ w ∈ L(A₂))
⟺ ∀w ∈ Σ* : ((∃q₁ ∈ F₁ : q₀₁ =w⇒ q₁) ⟺ (∃q₂ ∈ F₂ : q₀₂ =w⇒ q₂))
⟺ ∀w ∈ Σ* ∀(q₁, q₂) ∈ Q : ((q₀₁, q₀₂) =w⇒ (q₁, q₂) ⟹ (q₁, q₂) ∉ F)
⟺ L(A) = ∅.
Inclusion problem: We construct a DFA A, so that
L(A) = L(A₁) \ L(A₂)
holds. Due to
L(A₁) ⊆ L(A₂) ⟺ L(A) = ∅
the inclusion problem for A₁ and A₂ can be reduced to the emptiness problem of A.
Intersection problem: We construct a DFA A with
L(A) = L(A₁) ∩ L(A₂),
so that the intersection problem of A₁ and A₂ can be reduced to the emptiness problem of A. For A we can use the product construction A∩ from the remark in section 2.
As an application, reconsider the printer example: for specifications it is convenient to use extended regular expressions, which additionally allow the operators ∩ (intersection) and ¬ (complement), e.g. to forbid patterns such as b₁ b₂ and b₂ b₁ directly. The application of the operators ¬ and ∩ often makes the regular expression re simpler. Due to the closure properties of regular languages, we certainly do not leave the class of regular languages. Hence we can also decide for every given finite automaton A for the synchronization of printer usage by both programs, whether L(A) ⊆ L(re) holds.
One of the possibilities to define satisfaction is as follows:
P satisfies S if and only if L(A) = L(re),
where A is the finite automaton representing the program P and re is the extended regular expression representing the specification S. This test can be conducted automatically, because the equivalence problem for regular languages is decidable.
Even the following synthesis problem can be solved automatically for a program P represented as a finite automaton and a specification S represented as an extended regular expression:
Given: a specification S.
Sought: a program P that satisfies S.
Chapter III
Context-free languages and push-down automata
In the previous chapter we have seen that regular languages have multiple applications in computer science (e.g. lexical analysis and substring recognition) and are particularly easy to handle (representability by finite automata and regular expressions, good closure and decidability properties). However they are not sufficient for an important task of computer science, namely the syntax description of programming languages.
The point is that programming languages admit bracket structures of arbitrary nesting depth, such as for example
arithmetic expressions of the form 3 · (4 · (x + 1)),
lists of the form (CAR (CONS x y)) or
statements of the form
while (b1) {
x = e1;
while (b1) {
y = e2;
}
z = e3;
}
In section 4 we have shown with the help of the pumping lemma that the simplest example of such a bracket structure, namely the language
L = {aⁿbⁿ | n ∈ ℕ},
is no longer regular. Context-free languages are used for the syntax description of programming languages.
1 Context-free grammars
First we consider the corresponding grammars.
1.1 Definition: A context-free grammar is a 4-tuple G = (N, T, P, S), where the following holds:
(i) N is an alphabet of non-terminal symbols,
(ii) T is an alphabet of terminal symbols with N ∩ T = ∅,
(iii) S ∈ N is the start symbol,
(iv) P ⊆ N × (N ∪ T)* is a finite set of productions (rules). A production (A, u) ∈ P is written as A → u.
G determines a derivation relation ⇒_G on strings over N ∪ T:
x ⇒_G y if and only if ∃A → u ∈ P ∃w₁, w₂ ∈ (N ∪ T)* :
x = w₁ A w₂ and y = w₁ u w₂.
By ⇒*_G we denote the reflexive and transitive closure of ⇒_G. We read x ⇒*_G y as "y can be derived from x". The language generated by G is
L(G) = {w ∈ T* | S ⇒*_G w}.
Two context-free grammars G₁ and G₂ are called equivalent, if L(G₁) = L(G₂).
1.2 Definition: A language L ⊆ T* is called context-free, if there is a context-free grammar G with L = L(G).
Example: The language L = {aⁿbⁿ | n ∈ ℕ} is generated by the grammar G₁ = ({S}, {a, b}, {S → aSb, S → ε}, S). Arithmetic expressions over a, b, c are generated by the grammar G₂ with the productions
S → S + S | S · S | (S) | a | b | c.
A possible derivation in G₂ is
S ⇒_G₂ S · S ⇒_G₂ (S) · S ⇒*_G₂ (a + S) · S ⇒_G₂ (a + b) · S ⇒_G₂ (a + b) · c.
Now we will consider the derivation of strings in a context-free grammar in more detail.
1.3 Definition: A derivation from A to w in G of length n ≥ 0 is a sequence of derivation steps
A = z₀ ⇒_G z₁ ⇒_G … ⇒_G zₙ = w. (∗)
This derivation is called a leftmost derivation, if we replace the leftmost non-terminal symbol in every derivation step zᵢ ⇒_G zᵢ₊₁, i.e. if zᵢ and zᵢ₊₁ always have the form
zᵢ = w₁ A w₂ and zᵢ₊₁ = w₁ u w₂, where w₁ ∈ T*.
The rightmost derivations are defined correspondingly (then it holds that: w₂ ∈ T*).
Every derivation can be graphically represented as a tree.
1.4 Definition: A derivation tree from A to w in G is a tree with the following properties:
(i) Every node is labelled with a symbol from N ∪ T ∪ {ε}. The root is labelled with A and every internal node is labelled with a symbol from N.
(ii) If an internal node labelled with B has k children, which are labelled with the symbols α₁, …, α_k from left to right, then it holds that
a) k = 1 and α₁ = ε and B → ε ∈ P
or
b) k ≥ 1 and α₁, …, α_k ∈ N ∪ T and B → α₁ … α_k ∈ P.
(iii) The string w is obtained by concatenating the symbols on the leaves from left to right.
[Illustration: a derivation tree with root A and an internal node B with children α₁, …, α_k; the leaves read w from left to right. If u = α₁ … α_k with α₁, …, α_k ∈ N ∪ T, then applying the production B → u extends the tree at the node B accordingly.]
Example: Derivation trees for the derivations mentioned in the previous example are
[a tree for G₁: root S with children a, S, b, where the inner S again has children a, S, b]
and
[a tree for (a + b) · c in G₂: root S with children S, ·, S; the left S has children (, S, ); that inner S has children S, +, S with leaves a and b; the right S has the single leaf c].
Remark: There exists the following relationship between derivations and derivation trees:
(i) A ⇒*_G w ⟺ there is a derivation tree from A to w in G.
(ii) In general, several derivations from A to w correspond to a given derivation tree from A to w, however only one leftmost derivation and one rightmost derivation.
Proof:
(i) ⟸ is obvious due to the above construction; ⟹ is shown by induction on the depth of the derivation tree.
(ii) Derivation trees abstract from the unimportant order of rule application, if several non-terminal symbols occur simultaneously. For example, both derivations
S ⇒_G₂ S + S ⇒_G₂ a + S ⇒_G₂ a + b
and
S ⇒_G₂ S + S ⇒_G₂ S + b ⇒_G₂ a + b
result in the same derivation tree:
[Tree: root S with children S, +, S and leaves a and b.]
If we have decided to use leftmost or rightmost derivations respectively, then such alternatives are not possible.
However, a string may also possess several essentially different derivation trees; a context-free grammar G is called ambiguous, if some string w ∈ L(G) has more than one derivation tree, and unambiguous otherwise. For example, in G₂ the string a + b · c has the two derivation trees
[Tree 1: root S with children S, +, S, where the left subtree is the leaf a and the right subtree derives b · c]
and
[Tree 2: root S with children S, ·, S, where the left subtree derives a + b and the right subtree is the leaf c].
These correspond to the semantically different bracketings a + (b · c) and (a + b) · c.
Therefore, in programming languages we choose a grammar G₃ for arithmetic expressions, where the following evaluation strategy is realized:
The evaluation takes place from left to right. Thus, for example, a + b + c is evaluated as (a + b) + c.
Multiplication · has a higher priority than +. Thus, for example, a + b · c is evaluated as a + (b · c).
If we want to have another evaluation order, we must explicitly put brackets ( and ).
Consider G₃ = ({E, T, F}, {a, b, c, +, ·, (, )}, P₃, E) with the following P₃:
E → T | E + T
T → F | T · F
F → (E) | a | b | c
G₃ is unambiguous, and it holds that L(G₃) = L(G₂). For example, a + b · c has the following derivation tree in G₃:
[Tree: root E with children E, +, T; the left E derives a via T and F; the right T has children T, ·, F, which derive b and c via F.]
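Because G₃ is unambiguous and its structure mirrors the priorities of + and ·, it lends itself to recursive-descent parsing. The following Python sketch is our own addition; it rewrites the left-recursive rules E → E + T and T → T · F as iteration and uses '*' for the multiplication sign:

def parse(w):
    pos = 0
    def peek():
        return w[pos] if pos < len(w) else None
    def eat(c):
        nonlocal pos
        if peek() != c:
            raise SyntaxError(f"expected {c} at position {pos}")
        pos += 1
    def E():                 # E -> T (+ T)*
        T()
        while peek() == "+":
            eat("+"); T()
    def T():                 # T -> F (* F)*
        F()
        while peek() == "*":
            eat("*"); F()
    def F():                 # F -> (E) | a | b | c
        if peek() == "(":
            eat("("); E(); eat(")")
        elif peek() in ("a", "b", "c"):
            eat(peek())
        else:
            raise SyntaxError(f"unexpected symbol at position {pos}")
    E()
    return pos == len(w)     # the whole input must be consumed

def accepts_expr(w):
    try:
        return parse(w)
    except SyntaxError:
        return False

assert accepts_expr("a+b*c") and not accepts_expr("a+*b")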
Example (Chomsky, 1964): An inherently ambiguous context-free language is
L = {aⁱbʲcᵏ | i, j, k ≥ 0 and (i = j or j = k)}.
For the proof see the literature.
2 Pumping Lemma
There also exists a pumping lemma for context-free languages. It provides a necessary condition for a given language to be context-free.
2.1 Theorem (pumping lemma for context-free languages or uvwxy-lemma): For every context-free language L ⊆ T* there exists a number n ∈ ℕ, so that for all strings z ∈ L with |z| ≥ n there is a decomposition z = uvwxy with
(i) vx ≠ ε,
(ii) |vwx| ≤ n,
(iii) for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L.
Proof: Let L = L(G) for a context-free grammar G = (N, T, P, S). Let k ≥ 2 bound the length of the right-hand sides of the productions of G and let m = |N|; choose n = k^{m+1}. We use the following lemma about trees: a tree with branching factor ≤ k, in which every path has length ≤ m, has at most k^m leaves.
Consider a string z ∈ L with |z| ≥ n and choose a derivation tree t from S to z in G of minimal size, i.e. t contains no subtree with a root labelled B which again contains a proper subtree with root labelled B deriving the same substring of z: every part like this could actually be removed from t without changing the derived string z.
Having chosen k and |z| ≥ k^{m+1} as above, we can see that t has branching factor ≤ k and ≥ k^{m+1} leaves. Therefore, according to the previously considered lemma there is a path of length ≥ m + 1 in t. There are ≥ m + 1 internal nodes on this path, so that a non-terminal symbol is repeated while labelling these nodes. We need this repetition in a special form.
By a repetition tree in t we denote a subtree of t, where the labelling of the root is repeated in some other node. Now we choose a minimal repetition tree t₀ in t, i.e. such a tree that contains no other repetition tree, except as a proper subtree. In t₀ every path has a length ≤ m + 1.
Let A be the root labelling of t₀. Then t has the following structure:
[Diagram: the derivation tree t with root S; inside it the minimal repetition tree t₀ with root A, which contains a further node labelled A. Reading the leaves from left to right, the inner A derives w, the outer A derives vwx, and the whole tree derives z = u v w x y.]
From this structure we get a decomposition z = uvwxy with
S ⇒*_G uAy ⇒*_G uvAxy ⇒*_G uvwxy. (∗)
We show that this decomposition of z satisfies the conditions of the pumping lemma:
(i) By the choice of t it holds that vx ≠ ε.
(ii) By the choice of t₀ and the preceding lemma about trees it holds that |vwx| ≤ k^{m+1} = n.
(iii) From (∗) it follows immediately that for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L(G).
The derivation tree from S to uvⁱwxⁱy in G for i = 3 looks as follows:
[Diagram: the subtree between the outer A and the inner A is inserted three times into itself before the inner A finally derives w; the leaves read u v v v w x x x y.]
Just as for regular languages, the above pumping lemma can be used to prove that a certain language is not context-free.
Example: The language L = {aⁿbⁿcⁿ | n ∈ ℕ} is not context-free. We provide a proof by contradiction.
Hypothesis: L is context-free. Then according to the pumping lemma there is an n ∈ ℕ with the given properties. Now consider z = aⁿbⁿcⁿ. Because |z| ≥ n holds, we can decompose z into z = uvwxy with vx ≠ ε and |vwx| ≤ n, so that for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L.
Because |vwx| ≤ n, it is impossible that both a's and c's occur in the substring vwx. Therefore, while pumping up to uvⁱwxⁱy, at most two of the characters a, b, c are affected. However such strings are not in L. Contradiction.
The pumping lemma allows us to show that context-free grammars are not enough to provide a full description of the syntax of more advanced programming languages such as Java. Though we describe the basic structure of syntactically correct programs with context-free grammars (in BNF or EBNF notation), there exist side conditions which are context-sensitive.
Example: The programming language Java, i.e. the set of all syntactically correct Java programs, is not context-free. We give a proof by contradiction.
Hypothesis: Let Java be context-free. Then there is an n ∈ ℕ with the properties mentioned in the pumping lemma. Now let us consider the following syntactically correct Java class, where the variable name X1…1 contains n ones:
class C {
int X1…1;
void m() {
X1…1 = X1…1;
}
}
In every uvwxy-decomposition of this program, the vwx-part influences at most two out of the three occurrences of the variable X1…1, because |vwx| ≤ n. Therefore, while pumping to uvⁱwxⁱy, we obtain either character strings which do not fulfill the requirements of the Java syntax diagrams, or character strings of the form
class C {
int X1…1 (k ones);
void m() {
X1…1 (l ones) = X1…1 (m ones);
}
}
where k, l, m are not all equal. These character strings violate the following condition for syntactically correct Java programs:
Every variable must be declared before use. (∗)
Here either X1…1 (l ones) or X1…1 (m ones) is not declared. Contradiction.
Such conditions as (∗) are context-sensitive and thus are stated, when defining programming languages, together with the context-free basic structure of programs. A compiler checks these context-sensitive conditions by using appropriate tables for storing the declared variables or general identifiers.
3 Push-down automata
So far we defined context-free languages by the fact that they can be generated by grammars. Now we want to consider the recognizability of such languages by automata. Our goal is to extend the model of the finite automaton so that it recognizes the context-free languages.
The weak point of finite automata is that they lack memory to store an unlimited amount of data. Obviously a finite automaton cannot recognize a language such as
L = {aⁿbⁿ | n ∈ ℕ},
because at the moment when the first b is read, it does not know any more how many a's have been read. The only information saved is the current state, i.e. an element of a finite set of states.
Now we consider the so-called push-down automata. These are nondeterministic finite automata with ε-transitions (ε-NFAs), extended with a memory which can store arbitrarily long strings, but which can be accessed only in a very limited way. The memory is organized as a stack or pushdown list with access only to the topmost symbol. The transitions of a push-down automaton may depend on the current state, on the symbol of the input string which has been read, and on the topmost symbol of the stack; they change the state and the contents of the stack.
[Sketch: an input string, e.g. a a a b b b, on a tape read from left to right by a finite control with state q ∈ Q; the finite control has access to the topmost symbol A of a stack whose bottom is marked #.]
3.1 Definition (push-down automaton): A (nondeterministic) push-down automaton, shortly PDA, is a 7-tuple
A = (Σ, Q, Γ, δ, q₀, Z₀, F)
with the following properties:
(i) Σ is the input alphabet,
(ii) Q is a finite set of states,
(iii) Γ is the stack alphabet,
(iv) δ ⊆ Q × (Σ ∪ {ε}) × Γ × Q × Γ* is a finite transition relation,
(v) q₀ ∈ Q is the initial state,
(vi) Z₀ ∈ Γ is the initial stack symbol,
(vii) F ⊆ Q is the set of final states.
For α ∈ Σ ∪ {ε}, q, q′ ∈ Q, Z ∈ Γ and γ ∈ Γ* with (q, α, Z, q′, γ) ∈ δ we often write (q, Z) –α→ (q′, γ).
We can illustrate this as follows. A transition (q, Z) –a→ (q′, γ) with a ∈ Σ means: in the state q, with a as the next input symbol and Z as the topmost stack symbol, the PDA can read a, make a transition into the state q′ and replace Z by γ on the top of the stack. A transition (q, Z) –ε→ (q′, γ) means: in the state q with Z as the topmost stack symbol the PDA can spontaneously make a transition into the state q′ and replace Z by γ; in this case no input symbol is read.
3.2 Definition (configurations): Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a PDA.
(i) A configuration of A is a pair (q, γ) ∈ Q × Γ*.
(ii) For every α ∈ Σ ∪ {ε} the transition relation –α→ on configurations is defined as follows:
(q, γ) –α→ (q′, γ′) if and only if ∃Z ∈ Γ, γ₀, γ₁ ∈ Γ* :
γ = Z γ₀ and (q, α, Z, q′, γ₁) ∈ δ and γ′ = γ₁ γ₀.
By –ε→ we denote the ε-transition relation on configurations.
(iii) For every string w ∈ Σ*, =w⇒ is a 2-place relation on configurations of A, which is defined inductively:
(q, γ) =ε⇒ (q′, γ′), if ∃n ≥ 0 : (q, γ) –ε→ … –ε→ (n times) (q′, γ′),
(q, γ) =aw⇒ (q′, γ′), if (q, γ) (=ε⇒ ∘ –a→ ∘ =w⇒) (q′, γ′), for all a ∈ Σ.
By =w⇒ we denote the extended w-transition relation on configurations.
Remark: For all γ, γ′ ∈ Γ*, a₁, …, aₙ ∈ Σ and w₁, w₂ ∈ Σ* it holds that:
(i) (q, γ) –ε→ (q′, γ′) implies (q, γ) =ε⇒ (q′, γ′).
(ii) (q, γ) =a₁…aₙ⇒ (q′, γ′) if and only if (q, γ) (=a₁⇒ ∘ … ∘ =aₙ⇒) (q′, γ′).
(iii) (q, γ) =w₁w₂⇒ (q′, γ′) if and only if (q, γ) (=w₁⇒ ∘ =w₂⇒) (q′, γ′).
There exist two variants of language acceptance.
3.3 Definition (acceptance):
Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a PDA and w ∈ Σ*.
(i) A accepts w, if ∃q ∈ F, γ ∈ Γ* : (q₀, Z₀) =w⇒ (q, γ).
The language accepted (or recognized) by A is
L(A) = {w ∈ Σ* | A accepts w}.
(ii) A accepts w with the empty stack, if
∃q ∈ Q : (q₀, Z₀) =w⇒ (q, ε).
The language accepted (or recognized) by A with the empty stack is
L_ε(A) = {w ∈ Σ* | A accepts w with the empty stack}.
Example: We consider the language L = {aⁿbⁿ | n ∈ ℕ} and the PDA
A = ({a, b}, {q₀, q₁, q₂}, {a, Z}, δ, q₀, Z, {q₀}),
whose transition relation δ consists of the following transitions, numbered 1 to 5:
1: (q₀, Z) –a→ (q₁, aZ),
2: (q₁, a) –a→ (q₁, aa),
3: (q₁, a) –b→ (q₂, ε),
4: (q₂, a) –b→ (q₂, ε),
5: (q₂, Z) –ε→ (q₀, ε).
While reading aⁿbⁿ with n ≥ 1 the automaton A passes through the following configurations:
(q₀, Z) –a→₁ (q₁, aZ) –a→₂ (q₁, aaZ) … –a→₂ (q₁, aⁿZ) –b→₃ (q₂, aⁿ⁻¹Z) –b→₄ (q₂, aⁿ⁻²Z) … –b→₄ (q₂, Z) –ε→₅ (q₀, ε)
Thus the relation (q₀, Z) =aⁿbⁿ⇒ (q₀, ε) holds for n ≥ 1. For n = 0 it trivially holds that (q₀, Z) =ε⇒ (q₀, Z). Because q₀ is the final state of A, it follows that L ⊆ L(A).
For the inclusion L(A) ⊆ L we must analyse the transition behaviour of A. For this purpose we index the transitions as above. We can see that A is deterministic, i.e. while sequentially reading the characters of a given input string, only one transition can be applied in each step. Namely, the following transition sequences are possible, where n ≥ m ≥ 0 holds:
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ) (q₁, aⁿ⁺¹Z),
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ ∘ –b→₃ ∘ –b→₄ᵐ) (q₂, aⁿ⁻ᵐZ),
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ ∘ –b→₃ ∘ –b→₄ⁿ ∘ –ε→₅) (q₀, ε).
Therefore, A accepts only those strings which have the form aⁿbⁿ. In total it holds that L(A) = L.
Furthermore, note that the automaton A accepts all strings aⁿbⁿ with the empty stack, except for the case n = 0: L_ε(A) = {aⁿbⁿ | n ≥ 1}.
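Since this particular PDA is deterministic, its behaviour can be simulated directly. The following Python sketch is our own illustration of the automaton A above, with acceptance by final state:

def accepts_anbn(w):
    state, stack = "q0", ["Z"]
    for c in w:
        if state == "q0" and c == "a":                # transition 1
            state, stack = "q1", ["Z", "a"]
        elif state == "q1" and c == "a":              # transition 2
            stack.append("a")
        elif state in ("q1", "q2") and c == "b" and stack[-1] == "a":
            state = "q2"; stack.pop()                 # transitions 3 and 4
        else:
            return False                              # no transition applies
    if state == "q2" and stack == ["Z"]:
        state, stack = "q0", []                       # eps-transition 5
    return state == "q0"                              # q0 is the final state

assert accepts_anbn("") and accepts_anbn("aabb") and not accepts_anbn("abb")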
Now we want to derive general properties of push-down automata. For this purpose we often need the following Top Lemma. It states that the changes on the top of the stack do not depend on the rest of the stack.
3.4 Lemma (top of the stack): Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a push-down automaton. Then for all w ∈ Σ*, q, q′ ∈ Q, Z ∈ Γ and γ, γ′ ∈ Γ* it holds that: if
(q, Z) =w⇒ (q′, γ),
then also
(q, Zγ′) =w⇒ (q′, γγ′).
Proof: Exercise.
Now we show that both variants of language acceptance are equivalent.
3.5 Theorem (acceptance):
(1) For every PDA A we can construct a PDA B with L(A) = L_ε(B).
(2) For every PDA A we can construct a PDA B with L_ε(A) = L(B).
Proof: Let A = (Σ, Q, Γ, δ_A, q₀, Z₀, F).
(1): The idea of the proof is simple: B works like A and empties the stack in final states. However, we should take care of the fact that B does not end up with an empty stack after reading input strings which A does not accept. Therefore, B uses an additional symbol # to mark the bottom of the stack. To be more precise, we construct
B = (Σ, Q ∪ {q_B, q_ε}, Γ ∪ {#}, δ_B, q_B, #, ∅)
with new states q_B, q_ε ∉ Q and
δ_B = {(q_B, #) –ε→ (q₀, Z₀#)}  (starting A)
∪ δ_A  (working as A)
∪ {(q, Z) –ε→ (q_ε, ε) | q ∈ F, Z ∈ Γ ∪ {#}}
∪ {(q_ε, Z) –ε→ (q_ε, ε) | Z ∈ Γ ∪ {#}}  (emptying the stack).
Then it holds for all w ∈ Σ*, q ∈ F and γ ∈ Γ*:
(q₀, Z₀) =w⇒_A (q, γ)
if and only if
(q_B, #) –ε→_B (q₀, Z₀#) =w⇒_A (q, γ#) =ε⇒_B (q_ε, ε).
(For the if-then direction we use the Top Lemma.) Analysing the applicability of the new transitions in B we get L(A) = L_ε(B).
(2): Idea of the proof: B works like A, but uses an additional symbol # to mark the bottom of the stack. As soon as A has emptied its stack, B reads the symbol # and moves to a final state. The exact construction of B is left as an exercise.
Now we want to show that the nondeterministic push-down automata accept exactly the context-free languages (with the empty stack). First, for a given context-free grammar G we construct a push-down automaton which represents a nondeterministic top-down parser for the language L(G).
3.6 Theorem: For every context-free grammar G we can construct a nondeterministic push-down automaton A with L_ε(A) = L(G).
Proof: Let G = (N, T, P, S). We construct A in such a way that it simulates the leftmost derivations in G:
A = (T, {q}, N ∪ T, δ, q, S, ∅),
where the transition relation δ consists of the following types of transitions:
(1) (q, A) –ε→ (q, u), if A → u ∈ P,
(2) (q, a) –a→ (q, ε), if a ∈ T.
A works as follows: first S is on the stack. A rule application A → u of the grammar is executed on the stack by replacing the topmost stack symbol A by u. If terminal symbols are on the top of the stack, we compare them with the symbols of the input string and, if they are equal, remove them from the stack. This way A produces stepwise a leftmost derivation of the input string. If this derivation is successful, then A accepts the input string with the empty stack. The application of transitions of type (1) is nondeterministic, if there are several rules with the same non-terminal symbol A in P. The only state q is not important for the transition behaviour of A, but must be indicated in order for the definition of A to be complete.
To show that L(G) = L_ε(A), we use the following notation: for a string w ∈ (N ∪ T)* let
w_T = the longest prefix of w with w_T ∈ T*,
w_R = the remainder of w, defined by w = w_T w_R.
Hypothesis 1: For all A ∈ N, w ∈ (N ∪ T)* and all transition sequences
(q, A) = (q, α₀) –β₁→ (q, α₁) … –βₘ→ (q, αₘ)
of length m, where β₁, …, βₘ ∈ T ∪ {ε}, it holds that
A ⇒*_G β₁ … βₘ αₘ.
Proof by induction on m:
m = 0: Then α₀ = A. Trivially A ⇒*_G A holds.
m → m + 1: We analyse the last transition
(q, αₘ) –β_{m+1}→ (q, α_{m+1}).
By the induction hypothesis A ⇒*_G β₁ … βₘ αₘ.
Case β_{m+1} = ε: Then we used transition type (1) and the transition has the form
(q, αₘ) = (q, Bv) –ε→ (q, uv) = (q, α_{m+1})
for some B ∈ N and u, v ∈ (N ∪ T)* with B → u ∈ P. Then
A ⇒*_G β₁ … βₘ Bv ⇒_G β₁ … βₘ uv = β₁ … βₘ β_{m+1} α_{m+1}.
Case β_{m+1} = a ∈ T: Then we used transition type (2) and the transition has the form
(q, αₘ) = (q, av) –a→ (q, v) = (q, α_{m+1})
for some v ∈ (N ∪ T)*. Then
A ⇒*_G β₁ … βₘ av = β₁ … βₘ β_{m+1} α_{m+1}.
The converse Hypothesis 2, stating that every leftmost derivation of G can be simulated by a transition sequence of A, is shown analogously; thus we have also proved statement 2.
In particular, from the statements 1 and 2 it follows that for all strings w ∈ T* it holds that:
S ⇒*_G w if and only if (q, S) =w⇒ (q, ε)
if and only if A accepts w with the empty stack.
Thus L(G) = L_ε(A).
Example: For the grammar G₁ with the productions S → aSb | ε the construction yields L_ε(A₁) = L(G₁). To illustrate this, let us consider the transition sequence of A₁ while accepting a²b²:
(q, S) –ε→ (q, aSb) –a→ (q, Sb) –ε→ (q, aSbb) –a→ (q, Sbb) –ε→ (q, bb) –b→ (q, b) –b→ (q, ε).
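The construction from Theorem 3.6 can be executed by searching through the configurations (stack contents, input position) of A. The following Python sketch is our own addition; the pruning rule and the step limit are ours, chosen so that the search terminates on the examples considered here:

from collections import deque

def cfg_accepts(productions, start, w, limit=100_000):
    """Breadth-first search over the configurations of the top-down PDA."""
    queue, seen, steps = deque([(start, 0)]), set(), 0
    while queue and steps < limit:
        steps += 1
        stack, pos = queue.popleft()
        if (stack, pos) in seen:
            continue
        seen.add((stack, pos))
        # prune: every terminal on the stack still needs one input symbol
        if sum(1 for s in stack if s not in productions) > len(w) - pos:
            continue
        if not stack:
            if pos == len(w):
                return True                 # accepted with the empty stack
            continue
        top, rest = stack[0], stack[1:]
        if top in productions:              # type (1): replace A by u, A -> u
            for u in productions[top]:
                queue.append((u + rest, pos))
        elif pos < len(w) and w[pos] == top:
            queue.append((rest, pos + 1))   # type (2): match terminal, pop it
    return False

G1 = {"S": ["aSb", ""]}                     # grammar G1: S -> aSb | eps
assert cfg_accepts(G1, "S", "aabb") and not cfg_accepts(G1, "S", "abb")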
Now for every given push-down automaton we construct a corresponding context-free grammar.
3.7 Theorem: For every push-down automaton A we can construct a context-free grammar G with L(G) = L_ε(A).
Proof: Let A = (Σ, Q, Γ, δ, q₀, Z₀, F). We construct G = (N, T, P, S) with T = Σ and
N = {S} ∪ {[q, Z, q′] | q, q′ ∈ Q and Z ∈ Γ}.
The idea of the non-terminal symbols [q, Z, q′] is as follows:
(1) All strings w ∈ Σ*, which A can accept from the configuration (q, Z) with the empty stack and the state q′, can be derived from [q, Z, q′]: (q, Z) =w⇒ (q′, ε).
(2) Thus a transition (q, Z) –β→ (r₀, Z₁ … Z_k) of A is simulated in G by the following productions:
[q, Z, r_k] → β [r₀, Z₁, r₁][r₁, Z₂, r₂] … [r_{k−1}, Z_k, r_k]
for all r₁, …, r_k ∈ Q. The strings, which are accepted by A up to the reduction of the symbol Z₁, are generated from [r₀, Z₁, r₁]; the strings accepted by A up to the reduction of the symbol Z₂ are generated from [r₁, Z₂, r₂], and so on. The intermediate states r₁, …, r_{k−1} are those states which A reaches directly after the reduction of the symbols Z₁, …, Z_{k−1}.
To make it precise, P consists of the following productions:
Type (1): S → [q₀, Z₀, r] ∈ P for all r ∈ Q.
Type (2): For every transition (q, Z) –β→ (r₀, Z₁ … Z_k) with β ∈ Σ ∪ {ε} and k ≥ 1 in A:
[q, Z, r_k] → β [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k] ∈ P for all r₁, …, r_k ∈ Q.
Type (3): (Special case of (2) for k = 0.) For every transition (q, Z) –β→ (r₀, ε) in A:
[q, Z, r₀] → β ∈ P.
To show that L(G) = L_ε(A), we prove two hypotheses.
Hypothesis 1: For all q, q′ ∈ Q, Z ∈ Γ, w ∈ Σ*, n ≥ 1 and derivations in G
[q, Z, q′] ⇒_G … ⇒_G (n times) w
of length n it holds for A that
(q, Z) =w⇒ (q′, ε).
Proof by induction on n:
n = 1: From [q, Z, q′] ⇒_G w it follows (because w ∈ Σ*) that a production of type (3) with w ∈ Σ ∪ {ε} was used. Thus (q, Z) –w→ (q′, ε) and hence (q, Z) =w⇒ (q′, ε).
n → n + 1: We analyse the first step of a derivation of length n + 1, which must use a production of type (2):
[q, Z, r_k] ⇒_G β [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k] ⇒_G … ⇒_G (n times) β w₁ … w_k = w,
where (q, Z) –β→ (r₀, Z₁ … Z_k) in A, r_k = q′, β ∈ Σ ∪ {ε}, w₁, …, w_k ∈ Σ* and
[r_{i−1}, Z_i, r_i] ⇒_G … ⇒_G (at most n times) w_i
for i = 1, …, k holds. By induction it holds in A that
(r_{i−1}, Z_i) =w_i⇒ (r_i, ε)
for i = 1, …, k and thus, according to the Top Lemma,
(q, Z) –β→ (r₀, Z₁ … Z_k) =w₁⇒ (r₁, Z₂ … Z_k) … (r_{k−1}, Z_k) =w_k⇒ (r_k, ε).
So in conclusion (q, Z) =w⇒ (q′, ε) holds for A, as required.
Hypothesis 2: For all q, q′ ∈ Q, Z ∈ Γ, n ≥ 1, β₁, …, βₙ ∈ Σ ∪ {ε} and all transition sequences
(q, Z) –β₁→ … –βₙ→ (q′, ε)
in A of length n it holds in G that
[q, Z, q′] ⇒*_G β₁ … βₙ.
Proof by induction on n:
n = 1: Then (q, Z) –β₁→ (q′, ε) and by a production of type (3): [q, Z, q′] ⇒_G β₁.
n → n + 1: We analyse the first step of a transition sequence in A of length n + 1:
(q, Z) –β₁→ (r₀, Z₁ … Z_k) –β₂→ … –βₙ₊₁→ (q′, ε),
where k ≥ 1 holds. By definition of P there is a production of type (2) in G:
[q, Z, q′] → β₁ [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k],
where r_k = q′ and the intermediate states r₁, …, r_{k−1} are the states which A reaches directly after the reduction of Z₁, …, Z_{k−1}, i.e. the remaining transition sequence splits as
(r₀, Z₁ … Z_k) =β₁₁…β₁ₘ₁⇒ (r₁, Z₂ … Z_k) … (r_{k−1}, Z_k) =β_{k1}…β_{km_k}⇒ (r_k, ε) = (q′, ε)
with β₂ … βₙ₊₁ = β₁₁ … β₁ₘ₁ … β_{k1} … β_{km_k} and m₁, …, m_k ≤ n. By the induction hypothesis it holds in G that
[r_{i−1}, Z_i, r_i] ⇒*_G β_{i1} … β_{im_i}
for i = 1, …, k. In conclusion it follows in G that
[q, Z, q′] ⇒*_G β₁ β₂ … βₙ₊₁,
as required.
The hypotheses 1 and 2 imply: for all q ∈ Q and w ∈ Σ* it holds in G that
S ⇒_G [q₀, Z₀, q] ⇒*_G w
if and only if in A
(q₀, Z₀) =w⇒ (q, ε)
holds. Thus we have shown that L(G) = L_ε(A).
4 Closure properties
Now we consider under which operations the class of context-free languages is closed. In contrast to the regular (i.e. finitely acceptable) languages we have the following results.
4.1 Theorem: The class of context-free languages is closed under the following operations:
(i) union,
(ii) concatenation,
(iii) iteration,
(iv) intersection with regular languages.
However, the class of context-free languages is not closed under the operations
(v) intersection,
(vi) complement.
Proof: Let L₁, L₂ ⊆ T* be context-free, say Lᵢ = L(Gᵢ) for context-free grammars Gᵢ = (Nᵢ, T, Pᵢ, Sᵢ), i = 1, 2, where w.l.o.g. N₁ ∩ N₂ = ∅. First we show that L₁ ∪ L₂, L₁ · L₂ and L₁* are context-free. After that we consider the operations intersection and complement.
Let S ∉ N₁ ∪ N₂ be a new start symbol.
(i) L₁ ∪ L₂: Consider the context-free grammar G = ({S} ∪ N₁ ∪ N₂, T, P, S) with
P = {S → S₁, S → S₂} ∪ P₁ ∪ P₂. Obviously L(G) = L₁ ∪ L₂ holds.
(ii) L₁ · L₂: Consider the context-free grammar G = ({S} ∪ N₁ ∪ N₂, T, P, S) with
P = {S → S₁ S₂} ∪ P₁ ∪ P₂. Obviously L(G) = L₁ · L₂ holds.
(iii) L₁*: Consider the context-free grammar G = ({S} ∪ N₁, T, P, S) with
P = {S → ε, S → S₁ S} ∪ P₁. Then S ⇒*_G S₁ⁿ holds for all n ≥ 0 and thus L(G) = L₁*.
(iv) L₁ ∩ regular language: For the intersection with regular languages we use the representation of context-free and regular languages by push-down automata and finite automata respectively. Let L₁ = L(A₁) for the (nondeterministic) PDA A₁ = (T, Q₁, Γ, δ₁, q₀₁, Z₀, F₁) and L₂ = L(A₂) for the DFA A₂ = (T, Q₂, δ₂, q₀₂, F₂). We construct from A₁ and A₂ the (nondeterministic) PDA
A = (T, Q₁ × Q₂, Γ, δ, (q₀₁, q₀₂), Z₀, F₁ × F₂),
where the transition relation δ for q₁, q₁′ ∈ Q₁, q₂, q₂′ ∈ Q₂, Z ∈ Γ, α ∈ T ∪ {ε} and γ ∈ Γ* is defined as follows:
((q₁, q₂), Z) –α→ ((q₁′, q₂′), γ) in A
if and only if
(q
1
, Z)
1
(q
1
,
) in /
1
and q
2
2
q
2
in /
2
.
Note that in the special case α = ε the notation q_2 −ε→_2 q′_2 for the DFA 𝒜_2 simply means q_2 = q′_2. (Compare with the definition of the extended transition relation q −w→_2 q′ for DFAs in section 1.) Thus the relation δ of 𝒜 models the synchronous parallel progress of the automata 𝒜_1 and 𝒜_2. However, in the special case α = ε only 𝒜_1 makes a spontaneous ε-transition, while the DFA 𝒜_2 remains in the current state.
We show that for the acceptance by final states it holds that: L(𝒜) = L(𝒜_1) ∩ L(𝒜_2) = L_1 ∩ L_2. Let w = a_1 . . . a_n ∈ T*, where n ≥ 0 and a_i ∈ T for i = 1, . . . , n. Then it holds that:

    w ∈ L(𝒜)
    ⟺ ∃ (q_1, q_2) ∈ F_1 × F_2, γ ∈ Γ*: ((q_01, q_02), Z_0) −a_1 . . . a_n→ ((q_1, q_2), γ)
    ⟺ ∃ q_1 ∈ F_1, q_2 ∈ F_2, γ ∈ Γ*: (q_01, Z_0) −a_1→_1 . . . −a_n→_1 (q_1, γ) and q_02 −a_1→_2 . . . −a_n→_2 q_2
    ⟺ w ∈ L(𝒜_1) ∩ L(𝒜_2).
(v) not L_1 ∩ L_2: However, the context-free languages are not closed under intersection with other context-free languages. Consider

    L_1 = { a^m b^n c^n | m, n ≥ 0 }    and    L_2 = { a^m b^m c^n | m, n ≥ 0 }.

It is easy to see that L_1 and L_2 are context-free. For example, L_1 can be generated by the context-free grammar G = ({S, A, B}, {a, b, c}, P, S) with the following P:

    S → AB,
    A → ε | aA,
    B → ε | bBc.

The intersection of L_1 and L_2 is the following language

    L_1 ∩ L_2 = { a^n b^n c^n | n ≥ 0 },

which is not context-free (as we have shown with the help of the pumping lemma).
(vi) not complement: The context-free languages are also not closed under the complement. This statement follows directly from (i) and (v) due to De Morgan's laws. If the context-free languages were closed under the complement, then they would also be closed under intersection, because

    L_1 ∩ L_2 = ∁(∁L_1 ∪ ∁L_2)

holds.
(A non-terminal symbol A ∈ N is called erasable if A ⇒*_G ε holds.)

Step 1: First we compute all erasable A ∈ N. For this purpose we inductively compute the sets N_1 ⊆ N_2 ⊆ N_3 ⊆ . . . of erasable non-terminal symbols:

    N_1     = { A ∈ N | A → ε ∈ P }
    N_{k+1} = N_k ∪ { A ∈ N | A → B_1 . . . B_n ∈ P with B_1, . . . , B_n ∈ N_k }
Step 2: We construct an ε-free grammar G′ = (N′, T, P′, S′) equivalent to G. Let S′ be a new start symbol and N′ = N ∪ {S′}. Let P_0 arise from P by adding, for every production A → u ∈ P, all variants of A → u in which one or more erasable non-terminal symbols of u are deleted, and by removing all ε-productions. We define P′ from P_0 as follows:

    P′ = { S′ → ε | S is erasable } ∪ { S′ → u | S → u ∈ P_0 } ∪ P_0

Thus G′ is ε-free. It remains to show L(G′) = L(G).

⊆: This inclusion is obvious, because S′ → u ∈ P′ implies S ⇒*_G u, and A → u ∈ P_0 with A ∈ N implies A ⇒*_G u.

⊇: If ε ∈ L(G) holds, then S is erasable and thus S′ → ε ∈ P′. Therefore, ε ∈ L(G′) holds as well. Now let w ∈ L(G) − {ε}. Consider a derivation tree from S to w in G:
(Derivation tree in G: root S; below an inner node B there is a non-terminal A whose subtree derives ε; the yield of the whole tree is w.)
We get a derivation tree from S′ to w in G′ by replacing S by S′ and cutting off all maximal subtrees that derive ε in G:

(The same tree with root S′, where the subtree below A has been removed; the yield is still w.)

Thus w ∈ L(G′) holds.
5.3 Definition: A context-free grammar G = (N, T, P, S) is in Chomsky normal form, if the following holds:

G is ε-free (so that at most S → ε ∈ P is allowed),

every production in P different from S → ε has either the form

    A → a   or   A → BC,

where A, B, C ∈ N and a ∈ T.
5.4 Theorem : Every context-free grammar can be transformed into an equivalent grammar in
Chomsky normal form.
Proof : see literature.
5.5 Definition: A context-free grammar G = (N, T, P, S) is in Greibach normal form, if the following holds:

G is ε-free (so that at most S → ε ∈ P is allowed),

every production in P different from S → ε has the form

    A → a B_1 . . . B_k,

where k ≥ 0, A, B_1, . . . , B_k ∈ N and a ∈ T.
5.6 Theorem : Every context-free grammar can be transformed into an equivalent grammar in
Greibach normal form.
Proof : see literature.
6 Deterministic context-free languages
In section 3 we have shown that in general every context-free language can be recognized by
a nondeterministic push-down automaton. The question is, whether we can eliminate the non-
determinism as in the case of nite automata and whether we are always able to construct
equivalent deterministic push-down automata. This question is of practical importance for the
construction of a parser for a given context-free language. First we dene the notion of deter-
minism for push-down automata and context-free languages.
6.1 Definition:
(i) A push-down automaton 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F) is called deterministic, if the transition relation fulfils the following conditions:

    ∀ q ∈ Q, Z ∈ Γ, a ∈ Σ:
    ( number of transitions of the form (q, Z) −a→ . . .
    + number of transitions of the form (q, Z) −ε→ . . . ) ≤ 1

(ii) A context-free language L is called deterministic, if there is a deterministic push-down automaton 𝒜 with L = L(𝒜) (acceptance by final states).
Example: The language L = { a^n b^n | n ∈ ℕ } is deterministic, because in section 3 we have introduced a deterministic push-down automaton 𝒜 with L(𝒜) = L.

Example: The language PAL_c = { w c w^R | w ∈ {a, b}* } of palindromes with middle marker c is deterministic.

For the complement one would like to proceed as for DFAs and simply swap final and non-final states: consider 𝒜′ = (Σ, Q, Γ, δ, q_0, Z_0, Q − F) for a deterministic PDA 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F). Unfortunately, it generally holds that L(𝒜′) ≠ ∁L(𝒜): if 𝒜 does not read its input w completely, then w ∉ L(𝒜). However, the same then holds for 𝒜′, i.e. w ∉ L(𝒜′).
Thus we must first transform 𝒜 into an equivalent deterministic PDA which terminates after finitely many steps for every input string. Such a construction is actually possible for push-down automata, because the set

    { (q, A) ∈ Q × Γ | there is an infinite ε-transition sequence starting from (q, A) }

can be effectively constructed for 𝒜, and the respective transitions (q, A) −ε→ (q′, γ) can be deleted or replaced by appropriate transitions.

Details: think yourself or refer to literature.
6.3 Corollary : There exist context-free languages that are not deterministic.
Proof : If all context-free languages were deterministic, then the context-free languages would
be closed under complementation. Contradiction to the theorem from section 4.1
6.4 Lemma: Deterministic context-free languages are
(i) closed under intersection with regular languages,
(ii) not closed under union, intersection, concatenation and iteration.

Proof: We prove (i) using the same construction as for the nondeterministic case (see section 4.1); the PDA 𝒜 generated from 𝒜_1 and 𝒜_2 is deterministic, if 𝒜_1 is deterministic.
(ii) The context-free languages L_1 = { a^m b^n c^n | m, n ≥ 0 } and L_2 = { a^m b^m c^n | m, n ≥ 0 } are both deterministic, however their intersection is not even context-free. Because

    L_1 ∩ L_2 = ∁(∁L_1 ∪ ∁L_2)

holds, the deterministic context-free languages are not closed under union. For concatenation and iteration refer to literature.
As an example for a nondeterministic context-free language, we consider

    PAL = { w w^R | w ∈ {a, b}*, w ≠ ε },

the language of all non-empty palindromes of even length. The notation w^R means that the string w should be read backwards. PAL is obviously context-free: to generate it we need the following rules

    S → aa | bb | aSa | bSb.

In order to show that PAL is nondeterministic we use an auxiliary operator Min.
6.5 Definition: For a language L ⊆ Σ* let

    Min(L) = { w ∈ L | there is no strict prefix v of w with v ∈ L },

where v is a strict prefix of w, if v ≠ w and ∃ u ∈ Σ*: w = v · u.

6.6 Lemma: If L with ε ∉ L is a deterministic context-free language, then so is Min(L) as well.
Proof: Consider a deterministic PDA 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F) with L(𝒜) = L. Because ε ∉ L, it holds as well that q_0 ∉ F. We transform 𝒜 into a PDA 𝒜_1 that functions as 𝒜, but in every transition sequence it reaches one of the final states from F at most one time and then stops immediately. Let us consider a new state q_1 ∉ Q and define 𝒜_1 = (Σ, Q ∪ {q_1}, Γ, δ_1, q_0, Z_0, {q_1}) with

    δ_1 = { (q, Z) −α→ (q′, γ) | q ∈ Q − F and (q, Z) −α→ (q′, γ) in δ }
        ∪ { (q, Z) −ε→ (q_1, Z) | q ∈ F and Z ∈ Γ }.

𝒜_1 is deterministic and L(𝒜_1) = Min(L(𝒜)) holds, because 𝒜 is deterministic.
Now we show:

6.7 Theorem: The context-free language PAL is nondeterministic.

Proof: Hypothesis: PAL is deterministic. Then according to both lemmas considered above the language

    L_0 = Min(PAL) ∩ L( (ab)^+ (ba)^+ (ab)^+ (ba)^+ )

is context-free and deterministic. Here (ab)^+ stands for the regular expression ab(ab)* and (ba)^+ similarly for the regular expression ba(ba)*.
For the top-down method of parsing we can use the so-called LL(k)-grammars. These are context-free grammars in which, for every intermediate string w_1 A v in the top-down reconstruction of a left derivation S ⇒*_G w_1 A v ⇒*_G w_1 w_2 = w of a string w ∈ T*, the production to be applied to A is uniquely determined by w_1, A and the next k symbols of w_2 (the lookahead).

(Figure: derivation tree with root S, already derived prefix w_1, current non-terminal A with remainder v; the next k symbols of the remaining input w_2 form the lookahead.)
For the bottom-up method we can use the so-called LR(k)-grammars. These are context-free grammars G = (N, T, P, S) with the following property: for every intermediate string v w_2 in the bottom-up reconstruction of a right derivation

    S ⇒*_G u A w_2 ⇒_G u v w_2 ⇒*_G w_1 w_2 = w

of a string w ∈ T*, the production A → v to be applied (in reverse) is uniquely determined by v and the next k symbols of w_2.

(Figure: derivation tree with root S; the handle v sits below the non-terminal A after the already reduced prefix u; the next k symbols of w_2 form the lookahead.)
LL(k)-grammars were introduced in 1968 by P.M. Lewis and R.E. Stearns and LR(k)-grammars in 1965 by D.E. Knuth. The following relations hold for the languages generated by these grammars:

    ⋃_{k≥0} LL(k) languages  ⊊  ⋃_{k≥0} LR(k) languages  =  deterministic context-free languages

It even holds that: LR(1) languages = deterministic context-free languages.
7 Questions of decidability
The following constructions are algorithmically computable:

    PDA accepting by final states  ⟷  PDA accepting by empty stack
    PDA  ⟷  context-free grammar
    context-free grammar  →  ε-free grammar
    context-free grammar  →  Chomsky or Greibach normal form

The questions of decidability concerning context-free languages can therefore be answered using either representation: by context-free grammars (in normal form) or by push-down automata.
We consider the same problems as for regular languages (compare with Chapter II, section 5).
7.1 Theorem (decidability): For context-free languages
the membership problem,
the emptiness problem,
the finiteness problem
are decidable.
Proof :
Membership problem: Consider a context-free grammar G = (N, T, P, S) in Greibach normal form and a string w ∈ T*. The question is: Does w ∈ L(G) hold? The case w = ε can be decided immediately, because G is ε-free: ε ∈ L(G) ⟺ S → ε ∈ P. Now let w ≠ ε. Then the following holds:

    w ∈ L(G) ⟺ ∃ n ≥ 1: S ⇒_G . . . ⇒_G w        (n times)

In Greibach normal form every derivation step produces exactly one character of w. Therefore

    w ∈ L(G) ⟺ S ⇒_G . . . ⇒_G w        (|w| times)

In order to decide w ∈ L(G) it is enough to check all derivation sequences of the length |w| in G. This implies the decidability of the membership problem.
Emptiness problem: Consider a context-free grammar G = (N, T, P, S). The question is: Does L(G) = ∅ hold? Let n be the number which belongs to the context-free language L(G) according to the pumping lemma. The same way as in the case of regular languages we show that:

    L(G) ≠ ∅ ⟺ ∃ w ∈ L(G): |w| < n.

Thus we have reduced the emptiness problem to the membership problem for the finitely many strings w ∈ T* with |w| < n.
The transition function δ of a Turing machine can be written down as a Turing table with rows

    δ:  q_1  a_1  q′_1  a′_1  P_1
        . . .
        q_n  a_n  q′_n  a′_n  P_n

where q_i, q′_i ∈ Q, a_i, a′_i ∈ Γ and P_i ∈ {R, L, S}; a row q a q′ a′ P means δ(q, a) = (q′, a′, P).
Operating principles of a TM: informal

A configuration of the TM describes the current state, the contents of the tape and the currently scanned cell. We write ␣ for the blank symbol.

Initial configuration for an input string w = w_0 w_1 . . . w_n:

    . . . ␣ w_0 w_1 . . . w_n ␣ . . .
            ↑
            q_0

Making a step: e.g. δ(q_0, w_0) = (q_0, a, R) leads to

    . . . ␣ a w_1 . . . w_n ␣ . . .
              ↑
              q_0

Repetition of this step gives us the following result:

    . . . ␣ a a . . . a ␣ . . .
                        ↑
                        q_0

where δ(q_0, ␣) is undefined. Then as the result of the computation we get the string aa . . . a, i.e. the contents of the tape without blanks.
Example: Let us compute the function

    linear: {|}* → {0, 1}*    with    linear(|^n) = 1 if n can be divided by 2, and 0 otherwise

for n ≥ 0. Choose

    M_3 = ({q_0, q_1, q_e}, {|}, {|, 0, 1, ␣}, δ, q_0, ␣)

using the following Turing table:

    δ:  q_0  |  q_1  ␣  R
        q_0  ␣  q_e  1  S
        q_1  |  q_0  ␣  R
        q_1  ␣  q_e  0  S

We can easily check that M_3 computes the function linear.
Notion of the configuration

We consider the current state q, the current contents of the tape and the current position of the writing/reading head. Usually two kinds of notation are used:

(1) We number the cells of the infinite tape sequentially: . . ., −1, 0, 1, 2, . . .
    Contents of the tape: f: ℤ → Γ (ℤ = set of integers)
    Configuration: triple (q, f, i) with i ∈ ℤ
    Disadvantage: complicated manipulation

(2) Only a finite part of the tape differs from ␣: abstraction from blanks and cell numbers. The configuration

    . . . ␣ u_1 . . . u_m v_0 v_1 . . . v_n ␣ . . .
                         ↑
                         q

can be unambiguously represented as a string

    u q v    with u = u_1 . . . u_m and v = v_0 v_1 . . . v_n,

where u_i, v_j ∈ Γ and m, n ≥ 0.
Note: Q ∩ Γ = ∅ is assumed, so this representation is unambiguous.

1.2 Definition: The set 𝒦_M of configurations of M is

    𝒦_M = Γ* · Q · Γ^+.

A configuration uqv means that the TM is in state q, the contents of the tape is . . . ␣ u v ␣ . . ., and the first (leftmost) symbol of the string v is read.
1.3 Definition (Operating principles of a TM M = (Q, Σ, Γ, δ, q_0, ␣)):

(1) The initial configuration α(v) for a string v ∈ Σ* is

    α(v) = q_0 v  if v ≠ ε,   and   α(v) = q_0 ␣  otherwise.

(2) The transition relation ⊢ ⊆ 𝒦_M × 𝒦_M is defined as follows:

    K ⊢ K′   (K′ is a successor configuration of K)

if ∃ u, v ∈ Γ*, a, b, a′ ∈ Γ, q, q′ ∈ Q:

      (K = u q a v   ∧ δ(q, a) = (q′, a′, S) ∧ K′ = u q′ a′ v)
    ∨ (K = u q a b v ∧ δ(q, a) = (q′, a′, R) ∧ K′ = u a′ q′ b v)
    ∨ (K = u q a     ∧ δ(q, a) = (q′, a′, R) ∧ K′ = u a′ q′ ␣)
    ∨ (K = u b q a v ∧ δ(q, a) = (q′, a′, L) ∧ K′ = u q′ b a′ v)
    ∨ (K = q a v     ∧ δ(q, a) = (q′, a′, L) ∧ K′ = q′ ␣ a′ v)
By ⊢* we denote the reflexive and transitive closure of ⊢, i.e. it holds

    K ⊢* K′  if  ∃ K_0, . . . , K_n ∈ 𝒦_M, n ≥ 0: K = K_0 ⊢ . . . ⊢ K_n = K′.

Further on, by ⊢+ we denote the transitive closure of ⊢:

    K ⊢+ K′  if  ∃ K_0, . . . , K_n ∈ 𝒦_M, n ≥ 1: K = K_0 ⊢ . . . ⊢ K_n = K′.

Because δ is a function, ⊢ is right-unique: from K ⊢ K_1 and K ⊢ K_2 it follows that K_1 = K_2.
1.4 Definition: The function computed by the TM M = (Q, Σ, Γ, δ, q_0, ␣) is

    h_M: Σ* ⇀ Γ*

with

    h_M(v) = w,  if there is a final configuration K ∈ 𝒦_M with α(v) ⊢* K and w = ω(K),
    h_M(v) undefined otherwise,

where ω(K) denotes the result of the final configuration K, i.e. the contents of the tape without blanks. We also write Res_M for h_M (result function of M).

Remark: Because ⊢ is right-unique, h_M is a well-defined partial function.

Illustration of the result function:

    v ↦ α(v) ⊢* K ∈ 𝒦_M ↦ ω(K) = h_M(v),  whenever h_M(v) is defined.

The set of results of M is { w | ∃ v ∈ Σ*: h_M(v) = w }.
1.6 Definition (Turing computability):
Let A, B be alphabets.
(i) A partially defined function h: A* ⇀ B* is called Turing computable, if there is a Turing machine M with h = h_M, i.e. h(v) = h_M(v) for all v ∈ A*.
(ii) T_{A,B} =_def { h: A* ⇀ B* | h is Turing computable }
(iii) Let T be the class of all Turing computable functions (for arbitrary alphabets A, B).
1.7 Definition (Turing decidability):
Let A be an alphabet.
(i) A set L ⊆ A* is called Turing decidable, if the characteristic function χ_L: A* → {0, 1}* is Turing computable. Thus χ_L is the following total function:

    χ_L(v) = 1 if v ∈ L,  and 0 otherwise.

(ii) A property E: A* → {true, false} is called Turing decidable, if the set { v ∈ A* | E(v) = true } of the strings with the property E is Turing decidable.
Remarks on computability

(1) Computability of functions with several arguments: a k-tuple as input can be described by a function h: (A*)^k ⇀ B*, where the k argument strings are written on the tape separated by a special symbol.

(2) Computability of number-theoretic functions: a partial function f: ℕ ⇀ ℕ is called Turing computable, if the function

    h_f: {|}* ⇀ {|}*    with    h_f(|^n) = |^{f(n)}

is Turing computable (unary encoding of numbers).

Constructing Turing machines: the flowchart notation

First we define useful elementary TMs. Let Γ = {a_0, . . . , a_n} be the tape alphabet with a_0 = ␣.
Small right-machine r: makes one step to the right and halts. Turing table:

    r:  q_0  a_0  q_e  a_0  R
        . . .
        q_0  a_n  q_e  a_n  R

Small left-machine l: makes one step to the left and then halts. Turing table:

    l:  q_0  a_0  q_e  a_0  L
        . . .
        q_0  a_n  q_e  a_n  L

Printer-machine a for a ∈ Γ: prints the symbol a and then halts. Turing table:

    a:  q_0  a_0  q_e  a  S
        . . .
        q_0  a_n  q_e  a  S
Further we assume that all constructed TMs have exactly one final state, i.e. there exists one state q_e so that for all final configurations uqv it holds that q = q_e. It is obvious that the TMs r, l, a fulfil this condition. Such TMs can be combined as follows:

M_1 −a→ M_2 means that first M_1 works. If M_1 terminates on a cell with the symbol a, then M_2 starts.

M_1 → M_2 means that first M_1 works. As soon as M_1 terminates, M_2 starts. M_1 M_2 is an abbreviation for M_1 → M_2.

M_1 −≠a→ M_2 means that first M_1 works. If M_1 terminates on a cell with a symbol ≠ a, then M_2 starts.

On the basis of the given TMs we can construct flowcharts. The nodes of a flowchart are labelled with the names of TMs. The edges are labelled with arrows of the form →, −a→ or −≠a→. Loops are allowed. One TM in the flowchart is marked with an ingoing arrow as the start-TM.
Illustration:

(Small flowchart: start-TM M_1 with an edge −a→ to M_2 and an edge −b→ looping back to M_1.)
A flowchart describes a large TM. We can get its Turing table as follows:
Step 1: For every occurrence of a TM M_i in the flowchart construct the respective Turing table.
Step 2: Make the states in the different tables disjoint.
Step 3: Generate the table of the TM by writing all tables (in any order) one below each other.
Step 4: Combining: for each M_1 −a→ M_2 in the flowchart add to the table of the TM the line

    q_{e1}  a  q_{02}  a  S,

where q_{e1} is the final state (renamed according to Step 2) of M_1 and q_{02} the start state (renamed according to Step 2) of M_2. Similarly for M_1 → M_2 and M_1 −≠a→ M_2.
Example:

Large right-machine R: first makes a step to the right; then R moves further to the right until a blank a_0 = ␣ is observed. As a flowchart, R is the machine r with a loop −≠␣→ back to itself.

Construction of the Turing table of R:

    R:  q_0  a_0  q_e  a_0  R
        . . .
        q_0  a_n  q_e  a_n  R
        q_e  a_1  q_0  a_1  S
        . . .
        q_e  a_n  q_0  a_n  S

Large left-machine L: similarly to the large right-machine; L is the machine l with a loop −≠␣→ back to itself.
Application: Consider a TM for the computation of the minus function

    f: ℕ × ℕ → ℕ    with    f(x, y) = x ∸ y = x − y if x ≥ y, and 0 otherwise.

The initial configuration of the TM is:

    ␣ q_0 | . . . |  #  | . . . | ␣
         (x times)     (y times)

Idea: erase x-dashes as long as y-dashes are there. Then erase the remaining y-dashes and #.

Construction of the TM using a flowchart:

(Flowchart built from the machines r, l, R, L and the ␣-printer: as long as the y-block is non-empty and the x-block is non-empty, one dash is erased on each side; the exit cases y = 0 and x = 0 erase # and whatever dashes remain on the right.)

As an exercise we recommend constructing the whole Turing table.
Variations of Turing machines

There exist many variations of the TM definition. These variations are often more convenient if we want to prove that a certain function is computable. We can show that these variations do not increase the power of the TM as defined in Definition 1.1. First we consider TMs with several tapes.

1.8 Definition (k-tape Turing machine): A k-tape TM M = (Q, Σ, Γ, δ, k, q_0, ␣) is a 7-tuple with the following properties:

(i)-(v) Q, Σ, Γ, q_0, ␣ as in Definition 1.1.
(vi) k ∈ ℕ is the number of tapes.
(vii) The transition function is now given as follows:

    δ: Q × Γ^k ⇀ Q × Γ^k × {R, L, S}^k
(Illustration for k = 3: three tapes with one head each; in state q the machine reads, e.g., the symbols (a, c, c) and performs a transition δ(q, (a, c, c)) = (q′, . . .) that rewrites all three scanned cells and moves all three heads.)

A configuration of a k-tape TM is a k-tuple

    (u_1 q v_1, . . . , u_k q v_k)

with u_1, . . . , u_k ∈ Γ*, q ∈ Q and v_1, . . . , v_k ∈ Γ^+.
Operating principles of a k-tape TM:

(1) The initial configuration for a string v ∈ Σ* is

    α_k(v) = (α(v), q_0 ␣, . . . , q_0 ␣)        ((k−1) times q_0 ␣),

i.e. v is written on the first tape.

(2) Transition relation ⊢: similarly to the 1-tape case.

(3) Final configuration: no successor configuration (as before).

(4) The result ω_k of a k-tape configuration is

    ω_k(u_1 q v_1, . . . , u_k q v_k) = ω(u_1 q v_1),

i.e. we take the result of the first tape; we consider the remaining tapes only as auxiliary tapes for computation.

The function computed by a k-tape TM M is h_M: Σ* ⇀ Γ* with

    h_M(v) = w,  if there is a final configuration K ∈ 𝒦_M with α_k(v) ⊢* K and w = ω_k(K),
    h_M(v) undefined otherwise.

Remark: Several tapes can be used for computing functions with several arguments; in this case we distribute the input strings on the tapes appropriately.
Goal: We want to prove that every function computed by a k-tape TM can be computed by a normal 1-tape TM.

Idea of the proof: Every k-tape TM can be simulated by a suitably constructed 1-tape TM. Therefore, first we want to specify the notion of simulation.
Notion of simulation

The notion of simulation is essential for proving that different machines have the same power.

1.9 Definition: We consider (1- or k-tape) TMs M and M′ with configuration sets 𝒦_M and 𝒦_M′, initial configurations α, α′ and results ω, ω′; we call M the upper and M′ the lower TM. A simulation of M by M′ is a total mapping sim: 𝒦_M → 𝒦_M′ with the following properties:

(1) ∀ v ∈ Σ*: α′(v) ⊢′* sim(α(v)).

(2) ∀ K_1, K_2 ∈ 𝒦_M: K_1 ⊢ K_2 implies sim(K_1) ⊢′+ sim(K_2),
    i.e. every step of M can be simulated by several steps of M′.

(3) ∀ final config. K ∈ 𝒦_M ∃ final config. K′ ∈ 𝒦_M′: sim(K) ⊢′* K′ and ω(K) = ω′(K′).
Illustration:

    M:   α(v)        ⊢  K_1  ⊢  K_2  ⊢ . . . ⊢  K
          |  sim           |  sim                |  sim
    M′:  α′(v) ⊢′* sim(α(v)) ⊢′+ sim(K_1) ⊢′+ . . . ⊢′+ sim(K)

Lemma: If there is a simulation sim of M by M′, then the computed functions h_M and h_M′ coincide, i.e.

    ∀ v ∈ Σ*: h_M(v) = h_M′(v)

(the equality here means that either both sides are undefined or they have the same value).
Proof: Let v ∈ Σ*.

Case 1: h_M(v) is undefined. Then there is an infinite sequence of configurations

    α(v) = K_1 ⊢ K_2 ⊢ K_3 ⊢ . . .

of M. Therefore, due to properties (1) and (2) of sim, we have an infinite sequence of configurations

    α′(v) ⊢′* sim(K_1) ⊢′+ sim(K_2) ⊢′+ sim(K_3) ⊢′+ . . .

of M′. Thus h_M′(v) is undefined as well.

Case 2: h_M(v) = w is defined. Then there is a finite sequence of configurations

    α(v) = K_1 ⊢ . . . ⊢ K_n,  n ≥ 1,

of M, where K_n is a final configuration and w = ω(K_n) holds. Due to the properties (1)-(3) of sim, we have a finite sequence of configurations

    α′(v) ⊢′* sim(K_1) ⊢′+ . . . ⊢′+ sim(K_n) ⊢′* K′_n

of M′, where K′_n is a final configuration with ω(K_n) = ω′(K′_n). Thus it holds that

    h_M′(v) = ω′(K′_n) = ω(K_n) = h_M(v).

Theorem: For every k-tape TM M there is a 1-tape TM M′ with h_M = h_M′.
Proof: Let M = (Q, Σ, Γ, δ, k, q_0, ␣). We construct M′ = (Q′, Σ, Γ′, δ′, q′_0, ␣) so that there is a simulation sim of M by M′. Definition of sim: a k-tape configuration

    K = (u_1 q a_1 v_1, . . . , u_k q a_k v_k)

is simulated by the 1-tape configuration

    sim(K) = u_1 q â_1 v_1 # . . . # u_k â_k v_k,

i.e. the k tape contents are written one after another on a single tape, separated by the new symbol #; the scanned symbol of each tape is replaced by a marked copy â_i, and the state q is stored once, in front of the marked symbol of the first tape. M′ works according to the following idea of simulation. A step of M of the form
    K_1 = (u_1 q_1 a_1 v_1, . . . , u_k q_1 a_k v_k)  ⊢  K_2 = (u_1 q_2 b_1 v_1, . . . , u_k q_2 b_k v_k)

generated by

    δ(q_1, (a_1, . . . , a_k)) = (q_2, (b_1, . . . , b_k), (S, . . . , S))

is simulated by M′ in 2 phases of steps.
Reading phase: M′ runs over the whole tape from left to right and collects the marked symbols in its state:

    sim(K_1) = u_1 q_1 â_1 v_1 # . . . # u_k â_k v_k
             ⊢′* u_1 [read, q_1] â_1 v_1 # . . . # u_k â_k v_k
             ⊢′* u_1 â_1 v_1 # . . . # u_k [read, q_1, a_1, . . . , a_k] â_k v_k

Changing phase: Now M′ runs back and changes the configuration according to δ(q_1, (a_1, . . . , a_k)):

    u_1 â_1 v_1 # . . . # u_k [read, q_1, a_1, . . . , a_k] â_k v_k
    ⊢′  u_1 â_1 v_1 # . . . # u_k [change, q_2, b_1, . . . , b_k, S, . . . , S] â_k v_k
    ⊢′* u_1 [change, q_2] b̂_1 v_1 # . . . # u_k b̂_k v_k
    ⊢′* sim(K_2) = u_1 q_2 b̂_1 v_1 # . . . # u_k b̂_k v_k

Thus a step of M is simulated by finitely many steps of M′, at most proportionally to the length of the used part of the tape.
Now we consider the initial and final configurations.

Initial configuration: Let v ∈ Σ* and v ≠ ε. Then for M

    α_k(v) = (q_0 v, q_0 ␣, . . . , q_0 ␣)    and    sim(α_k(v)) = q_0 v̂ # ␣̂ # . . . # ␣̂.

For M′ we have α′(v) = q′_0 v. Of course we can program M′ in such a way that

    α′(v) ⊢′* sim(α_k(v))

holds.

Final configuration: If K ∈ 𝒦_M is a final configuration, we can program M′ in such a way that

    sim(K) ⊢′* u_1 q_e a_1 v_1

holds and K′ = u_1 q_e a_1 v_1 is the final configuration of M′ with ω(K) = ω′(K′).
Thus we have:

    Q′ = Q ∪ {q′_0, q_e}
       ∪ { [read, q, a_1, . . . , a_j] | q ∈ Q, a_1, . . . , a_j ∈ Γ, 0 ≤ j ≤ k }
       ∪ { [change, q, b_1, . . . , b_j, P_1, . . . , P_j] | q ∈ Q, 0 ≤ j ≤ k, b_1, . . . , b_j ∈ Γ,
           P_1, . . . , P_j ∈ {R, L, S} }
       ∪ . . . further states . . .

In general we have now shown that sim is a simulation of M by M′. With the lemma above it follows that h_M = h_M′.
Other variations of Turing machines

Below we will consider other variations of Turing machines:

Turing machines with k heads on one tape:

(Illustration: one tape containing F R E I B U R G, with several heads in state q.)

Definition of the transition function δ: as in Definition 1.8 of the k-tape TM, but the definition of configuration is different, because k cells on the tape must be marked.

Turing machines with a two-dimensional computational space divided into cells (computing on checkered paper), possibly with several heads:

(Illustration: a two-dimensional grid with filled and blank cells and a head in state q.)

Turing machine with one-sided infinite tape:

(Illustration: a tape with a left end, infinite only to the right.)

A normal TM (with two-sided infinite tape) can be simulated as follows by a TM with one-sided infinite tape: the 2-sided configuration K = a_m . . . a_1 q b_1 . . . b_n with m ≥ 0, n ≥ 1 is (for m ≤ n) simulated by the 1-sided configuration

    sim(K) = q′ (b_1, a_1) . . . (b_m, a_m) (b_{m+1}, ␣) . . . (b_n, ␣),

i.e. pairs of symbols of the 2-sided tape are the symbols of the 1-sided tape: the tape is folded over at the head position, the left half written underneath the right half.
Turing machines with final states:

    M = (Q, Σ, Γ, δ, q_0, ␣, F),

where F ⊆ Q is the set of final states and all remaining components are defined as before. We will use TMs with final states in order to accept languages L ⊆ Σ*.

1.12 Definition:
(1) A configuration K = uqv of M is called accepting, if q ∈ F.
(2) Let v ∈ Σ*. M accepts v, if there is an accepting final configuration K with α(v) ⊢* K.
(3) The language accepted by M is

    L(M) = { v ∈ Σ* | M accepts v }.

A language L ⊆ Σ* is called Turing-acceptable, if there is a TM M with final states and L = L(M).
Theorem: L and ∁L are Turing-acceptable ⟺ L is Turing-decidable.

Proof: ⟸: Let L be decidable by M. By adding final state transitions we get an accepting TM for L, and likewise for ∁L.

⟹: L is accepted by M_1 and ∁L by M_2. Now let us construct M with 2 tapes in such a way that a step of M_1 on tape 1 and simultaneously a step of M_2 on tape 2 is made. If M_1 accepts the string, then M produces the value 1. Otherwise M_2 accepts the string, and M produces the value 0. Thus M decides the language L.

Remark: If only L is Turing-acceptable, then it does not imply that L is also Turing-decidable. The accepting TM might not terminate for strings from ∁L.
Non-deterministic Turing machines:

We consider 1-tape TMs with final states. The transition function is extended as follows:

    δ: Q × Γ → 𝒫(Q × Γ × {R, L, S}).

A tuple (q′, b, R) ∈ δ(q, a) means that the TM, in case the symbol a is read in state q, can do the following: enter state q′, write b, move to the right. If additionally, e.g., (q″, c, S) ∈ δ(q, a) holds, then the TM can do the following instead: enter state q″, write c, stay on the cell.

The choice between these two possible steps of the TM is non-deterministic, i.e. the TM can arbitrarily choose one of the two possible steps. The transition relation ⊢ of a non-deterministic TM is not right-unique any more. Apart from that, the language L(M) accepted by M is defined as for a deterministic TM.
1.14 Theorem: Every language accepted by a non-deterministic TM is also acceptable by a deterministic TM.

Proof: Consider a non-deterministic 1-tape TM M. Then there exists a maximal number r of non-deterministic choices which M has according to δ for a state q and a symbol a, i.e.

    ∀ q ∈ Q ∀ a ∈ Γ: |δ(q, a)| ≤ r,
    ∃ q ∈ Q ∃ a ∈ Γ: |δ(q, a)| = r.

By r we denote the degree of non-determinism of M. For each pair (q, a) we number the possible choices from 1 to (at most) r consecutively according to δ(q, a). Then we can represent every finite sequence of non-deterministic choices as a string over the alphabet {1, . . . , r}.
Example: For r = 5, for example, 1 3 5 means:
- in the 1st step make the 1st choice,
- in the 2nd step make the 3rd choice,
- in the 3rd step make the 5th choice.
Now let v ∈ Σ*. M accepts v if and only if there is an accepting configuration K with α(v) ⊢* K. The set of all computations of M starting in α(v) can be represented as a tree with nodes labelled with configurations:

(Tree: root α(v) = K with successors K_1, . . . , K_r; each K_i has successors K_{i1}, . . . , K_{ir}, and so on; somewhere in the tree an accepting configuration may occur.)

The deterministic TM M′ searches this tree by breadth-first search for an accepting configuration. For this purpose M′ needs 3 tapes.

Tape 1 stores the input string v.

Tape 2 is used for systematic generation of all strings over {1, . . . , r}. In order to model the breadth-first search the strings are generated as follows:
(i) according to the length, the shorter first,
(ii) when the length is equal, we use the lexicographic order. Thus we get the following sequence:

    ε, 1, 2, . . . , r, 1 1, 1 2, . . . , 1 r, . . . , r 1, . . . , r r, . . .

Tape 3 is used for the simulation of M by M′: for every choice string generated on tape 2, M′ copies v from tape 1 to tape 3 and runs M on tape 3, resolving the i-th non-deterministic choice as prescribed by the i-th symbol of the choice string. If this simulated computation reaches an accepting configuration, M′ accepts. Then L(M) = L(M′) holds.
Remark: The above simulation of M by M′ requires in general exponentially many steps of M′.

For a grammar G = (N, T, P, S) the derivation relation ⇒_G ⊆ (N ∪ T)* × (N ∪ T)* is defined as follows:

    x ⇒_G y  if and only if  ∃ u → v ∈ P ∃ w_1, w_2 ∈ (N ∪ T)*: x = w_1 u w_2 and y = w_1 v w_2.
By ⇒*_G we again denote the reflexive and transitive closure of ⇒_G. We read x ⇒*_G y as "y can be derived from x". It holds

    x ⇒*_G y,  if ∃ z_0, . . . , z_n, n ≥ 0: x = z_0 ⇒_G . . . ⇒_G z_n = y.

The sequence x = z_0 ⇒_G . . . ⇒_G z_n = y is also called a derivation of y from x (or from x to y) of the length n. Particularly, it holds x ⇒*_G x with a derivation of the length 0.
2.2 Definition:
(i) The language generated by G is

    L(G) = { w ∈ T* | S ⇒*_G w },

i.e. we are interested only in strings over the terminal symbols; the non-terminal symbols are used as auxiliary symbols within derivations.
(ii) L ⊆ T* is a Chomsky-0-language, if there is a Chomsky-0-grammar G with L = L(G).

2.3 Theorem: Every Turing-acceptable language is a Chomsky-0-language.
Proof: Let L be accepted by a TM M = (Q, Σ, Γ, δ, q_0, ␣, F) with F = {q_e} so that for all configurations uqv ∈ 𝒦_M it holds:

    uqv is a final configuration ⟺ u = ε and q = q_e.

We can easily see that for every TM we can provide a TM with this property which accepts the same language. Now we construct a Chomsky-0-grammar G = (N, T, P, S) with L(G) = L in 4 steps.
Step 1 (Generate doubled start configurations): Let us define a partial grammar G_1 = (N_1, T, P_1, S) with S, ¢, $, q_0, ␣ ∈ N_1 so that for all w ∈ T*

    S ⇒*_{G_1} w ¢ α(w) $

holds and even for all v ∈ (N ∪ T)*:

    if S ⇒*_{G_1} v and v contains q_0, then ∃ w ∈ T*: v = w ¢ α(w) $.

The start configuration α(w) is defined as usual:

    α(w) = q_0 w  if w ≠ ε,   and   α(w) = q_0 ␣  otherwise.
Choose N_1 = {S, ¢, $, q_0, ␣, A, B} ∪ { C_a | a ∈ T } and P_1 as follows:

    P_1 = { S → ¢ q_0 ␣ $,
            S → a A ¢ C_a $        for all a ∈ T,
            C_a b → b C_a          for all a, b ∈ T,
            C_a $ → B a $          for all a ∈ T,
            b B → B b              for all b ∈ T,
            A ¢ B → ¢ q_0,
            A ¢ B → a A ¢ C_a      for all a ∈ T }
Operating principle of P_1:

    S ⇒_{G_1} ¢ q_0 ␣ $ = ¢ α(ε) $

or

    S  ⇒_{G_1}  a A ¢ C_a $
       ⇒*_{G_1} w A ¢ B w $
       ⇒_{G_1}  w a A ¢ C_a w $
       ⇒*_{G_1} w a A ¢ w C_a $
       ⇒_{G_1}  w a A ¢ w B a $
       ⇒*_{G_1} w a A ¢ B w a $
       ⇒_{G_1}  w a ¢ q_0 w a $
Step 2 (Simulate the transition relation ⊢): Let us define G_2 = (N_2, T, P_2, S) with N_2 = {S, ¢, $} ∪ Q ∪ (Γ − T) so that for all w ∈ T*, v ∈ Γ* it holds that:

    α(w) ⊢* q_e v   ⟺   w ¢ α(w) $ ⇒*_{G_2} w ¢ q_e v $.
We cannot simply choose P_2 = ⊢, because e.g. δ(q, a) = (q′, a′, S) implies that for all u, v ∈ Γ*: u q a v ⊢ u q′ a′ v, an infinite set of pairs. More precisely we define:

    P_2 = { q a → q′ a′        | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, S) }
        ∪ { q a b → a′ q′ b    | q, q′ ∈ Q and a, a′, b ∈ Γ and δ(q, a) = (q′, a′, R) }
        ∪ { q a $ → a′ q′ ␣ $  | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, R) }
              (TM is at the right end of the tape)
        ∪ { b q a → q′ b a′    | q, q′ ∈ Q and a, a′, b ∈ Γ and δ(q, a) = (q′, a′, L) }
        ∪ { ¢ q a → ¢ q′ ␣ a′  | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, L) }
              (TM is at the left end of the tape)
Step 3 (Erase the final configuration): Define G_3 = (N_3, T, P_3, S) with N_3 = {S, ¢, $, q_e, ␣, D} so that for all w ∈ T*, v ∈ Γ* it holds that:

    w ¢ q_e v $ ⇒*_{G_3} w.

Choose P_3 as follows:

    P_3 = { ¢ q_e → D,
            D a → D    for all a ∈ Γ,
            D $ → ε }

Operating principle:

    w ¢ q_e v $ ⇒_{G_3} w D v $ ⇒*_{G_3} w D $ ⇒_{G_3} w.
Step 4 (Compose G from G_1, G_2, G_3): Now let us define G = (N, T, P, S) as follows:

    N = N_1 ∪ N_2 ∪ N_3,    P = P_1 ∪ P_2 ∪ P_3    (disjoint union).

Then for all w ∈ T*, v ∈ Γ* it holds that:

    α(w) ⊢* q_e v   ⟺   S ⇒*_G w ¢ α(w) $      (rules P_1)
                          ⇒*_G w ¢ q_e v $       (rules P_2)
                          ⇒*_G w                  (rules P_3)

For "⟸" note that the rules from at most one of the sets P_1, P_2 or P_3 can be applied to a given string over N ∪ T. Therefore we get L(G) = L.
2.4 Corollary: Let the function h: T* ⇀ T* be Turing computable, say h = h_M for a TM M. Then the language { w # v | w, v ∈ T* and h(w) = v } is Turing-acceptable.

Proof: A 2-tape TM M′ operates as follows:
(1) M′ leaves a given input string of the form w # v unchanged on the 1st tape.
(2) M′ copies the part w onto the initially empty 2nd tape and then simulates M on this tape.
(3) If M terminates, then the result h(w) of the final configuration is compared to the part v of the 1st tape. If h(w) = v holds, then M′ accepts. Thus the language is Turing-acceptable.
2.5 Theorem: Every Chomsky-0-language is Turing-acceptable.

Proof: L is generated by a Chomsky-0-grammar G = (N, T, P, S): L(G) = L. We construct a non-deterministic 2-tape TM M which accepts L: M leaves a given input string w ∈ T* unchanged on the 1st tape, simulates non-deterministically a derivation of G on the 2nd tape, and accepts if the derived terminal string equals w.

2.6 Definition: A grammar G = (N, T, P, S) is called
(i) context-sensitive (Chomsky-1- or shortly CH1-grammar) if and only if
    for all p → q ∈ P there are u, v, w ∈ (N ∪ T)*, A ∈ N with w ≠ ε: p = u A v and q = u w v
    (i.e. a non-terminal symbol A ∈ N is replaced by the non-empty string w in context u . . . v).
(ii) context-free (Chomsky-2- or shortly CH2-grammar) if and only if
    for all p → q ∈ P: p ∈ N (q = ε is allowed).
(iii) right-linear (Chomsky-3- or shortly CH3-grammar) if and only if
    for all p → q ∈ P: p ∈ N and q ∈ T* ∪ T* · N.
For context-sensitive grammars G = (N, T, P, S) there exists the following special rule, so that ε ∈ L(G) is possible: an ε-production may only be of the form S → ε. Then all other derivations begin with a new non-terminal symbol S′:

    S → ε | S′ . . .

2.7 Definition: A language L is called context-sensitive, context-free, right-linear, if there exists a grammar of the corresponding type which generates L. Let the classes of languages be defined for some alphabet T:

    CH0  = class of languages generated by Chomsky-0-grammars
    CS   = class of context-sensitive languages
    CF   = class of context-free languages
    RLIN = class of right-linear languages
By definition it already holds (however, the inclusion CF ⊆ CS can be proved only on the basis of the later theorems):

2.8 Corollary: RLIN ⊆ CF ⊆ CS ⊆ CH0
Now we want to show that the Chomsky-3- (i.e. right-linear) languages are exactly the finitely acceptable (i.e. regular) languages from Chapter II.

2.9 Lemma: Every finitely acceptable language is a Chomsky-3-language.
Proof: Consider a DFA 𝒜 = (Σ, Q, δ, q_0, F). Let us construct the following Chomsky-3-grammar G:

    G = (Q, Σ, P, q_0),

where for q, q′ ∈ Q and a ∈ Σ:

    P = { q → a q′ | q −a→ q′ } ∪ { q → ε | q ∈ F }.

It is easy to show that L(𝒜) = L(G) holds.
For the reverse direction we use the following lemma.
Remark: Every Chomsky-3-language can be generated by a grammar G = (N, T, P, S) with P ⊆ N × (T · N ∪ {ε}).

Proof: Exercise.
2.10 Lemma: Every Chomsky-3-language is finitely acceptable.

Proof: Consider a Chomsky-3-language L ⊆ T*, generated by a grammar of the restricted form from the remark above; reading its productions backwards as transitions yields a finite automaton accepting L.

For counting the Turing machines, let T_{k,l} denote the set of all Turing machines of the normal form

    M = ({q_0, . . . , q_k}, {|}, {|, a_0, . . . , a_l}, δ, q_0, ␣),

where δ is one of the finitely many functions δ: Q × Γ ⇀ Q × Γ × {R, L, S}. Thus it holds that: T_{k,l} is finite. Because T = ⋃_{k≥0} ⋃_{l≥1} T_{k,l} holds, T is countable due to a bijection ν: ℕ → T, which is defined according to the following diagonal scheme arranging the sets T_{k,l}:

(Diagonal scheme: the sets T_{k,l} are placed in a two-dimensional array with axes k and l and enumerated diagonal by diagonal.)

The Lemmas 1.6 and 1.7 imply the theorem about the existence of non-computable functions. This raises the following question:

Do there exist functions important for computer science which are non-computable?
2 Example of an undecidable problem: halting for Turing machines

Below we consider the binary alphabet B = {0, 1}. Now we consider sets (languages) L ⊆ B* which are undecidable, i.e. for which the problem

    Given: w ∈ B*.  Question: Does w ∈ L hold?

is not algorithmically solvable. Our main example is the halting problem: given a Turing machine M and an input w, is h_M(w) defined? The relevance of this problem for computer science: is it decidable whether a given program terminates for given input values?

First we translate the halting problem into an appropriate language L ⊆ B*. In order to do it we need a binary encoding of Turing machines. Let us assume that we have countable sets {q_0, q_1, q_2, . . .} of states and {a_0, a_1, a_2, . . .} of tape symbols, now with a_0 = 0, a_1 = 1, a_2 = ␣. We consider only Turing machines M = (Q, B, Γ, δ, q_0, ␣) with Q = {q_0, . . . , q_k} for k ≥ 0 and Γ = {a_0, . . . , a_l} for l ≥ 2.
2.1 Definition: The standard encoding of such a Turing machine M is the following string w_M over ℕ:

    w_M = k l
          . . .
          i j s t nr(P)        for every δ(q_i, a_j) = (q_s, a_t, P)
          . . .

where P ∈ {R, L, S} and nr(R) = 0, nr(L) = 1, nr(S) = 2 holds, and the substrings of the form i j s t nr(P) of w_M are listed row by row in some fixed order. The binary encoding bw_M ∈ B* of M is generated from w_M by replacing every number n by the block 0 1^{n+1}.

Example: For the Turing machine M with k = 1, l = 2 and the transitions δ(q_0, a_j) = (q_1, a_j, R) for j = 0, 1, 2 we get

    w_M = 1 2
          0 0 1 0 0
          0 1 1 1 0
          0 2 1 2 0

and

    bw_M = 011 0111
           01 01 011 01 01
           01 011 011 011 01
           01 0111 011 0111 01
Remark:
1. Not every string w ∈ B* is the binary encoding of a Turing machine.
2. The set { w ∈ B* | ∃ Turing machine M: w = bw_M } is decidable.
3. The binary encoding is injective, i.e. bw_{M_1} = bw_{M_2} implies M_1 = M_2.

2.2 Definition (Special halting problem or self-application problem for Turing machines): The special halting problem or self-application problem for Turing machines is the language

    K = { bw_M | M applied to bw_M halts }.

2.3 Theorem: K ⊆ B* is undecidable.
Proof: Again we use Cantor's diagonal method in a proof by contradiction.

Hypothesis: K ⊆ B* is decidable. Then the characteristic function χ_K: B* → {0, 1}* is computable by a Turing machine M_K:

    χ_K(w) = 1 if w ∈ K,  and 0 otherwise.

We change M_K into a Turing machine M* which enters an infinite loop whenever M_K would halt with output 1, and which halts whenever M_K would halt with output 0. Thus it holds:

    M* applied to bw_{M*} halts
    ⟺ M_K applied to bw_{M*} halts with 0
    ⟺ χ_K(bw_{M*}) = 0
    ⟺ bw_{M*} ∉ K
    ⟺ M* applied to bw_{M*} does not halt.

Contradiction! Thus K is undecidable.
Now we will show a general method which lets us prove that some problems are undecidable by using other problems (e.g. K) which are already known to be undecidable. This method we will call reduction.

2.4 Definition (Reduction): Let L_1 ⊆ Σ*_1 and L_2 ⊆ Σ*_2 be languages. Then L_1 is reducible to L_2, shortly L_1 ≤ L_2, if there is a total computable function f: Σ*_1 → Σ*_2 so that for all w ∈ Σ*_1 it holds that:

    w ∈ L_1 ⟺ f(w) ∈ L_2.

We also write: L_1 ≤ L_2 using f.

Idea: L_1 is a special case of L_2.
2.5 Lemma (Reduction): Let L_1 ≤ L_2. Then it holds that:
(i) If L_2 is decidable, then L_1 is also decidable.
(ii) If L_1 is undecidable, then L_2 is also undecidable.
Proof: Because (ii) is the contraposition of (i), it is enough to show (i). Thus let L_1 ≤ L_2 using the total computable function f and let L_2 be decidable, i.e. χ_{L_2} is computable. Then the composition χ_{L_2} ∘ f (first apply f, then χ_{L_2}) is computable. It holds

    χ_{L_1} = χ_{L_2} ∘ f,

because for all w ∈ Σ*_1:

    χ_{L_1}(w) = (1 if w ∈ L_1, else 0) = (1 if f(w) ∈ L_2, else 0) = χ_{L_2}(f(w)).

Thus χ_{L_1} is computable, i.e. L_1 is decidable.
We apply reduction to the general halting problem for Turing machines.

2.6 Definition (General halting problem for TMs): The (general) halting problem for Turing machines is the language

    H = { bw_M 00 u ∈ B* | M applied to u halts }.

2.7 Theorem: H ⊆ B* is undecidable.

Proof: According to the reduction lemma, it is enough to show K ≤ H, i.e. K is a special case of H. The reduction is here trivial: choose f: B* → B* with f(bw_M) = bw_M 00 bw_M. Then bw_M ∈ K ⟺ f(bw_M) ∈ H.
2.8 Definition: The blank-tape halting problem for Turing machines is the language

    H_0 = { bw_M | M applied to the blank tape halts }.

2.9 Theorem: H_0 ⊆ B* is undecidable.

Proof: We show H ≤ H_0. For a Turing machine M and a string u ∈ B* we can construct a Turing machine M_u which operates as follows: applied to the blank tape, M_u first writes u on the tape; then M_u operates as M (applied to u). It is not important how M_u operates if the tape is not blank at the beginning. We can construct M_u effectively from M and u. Thus there exists a computable function f: B* → B* with

    f(bw_M 00 u) = bw_{M_u}.

Then it holds

    bw_M 00 u ∈ H ⟺ M applied to u halts ⟺ M_u applied to the blank tape halts ⟺ f(bw_M 00 u) ∈ H_0.

Thus H ≤ H_0 using f holds.
We got acquainted with three variants of halting problems, namely K, H and H_0. All three variants are undecidable. For K we have shown it directly using a diagonal method, and for H and H_0 we have shown it by reduction: K ≤ H ≤ H_0.

In all three variants we dealt with the question whether there exists a decision method which decides halting for every given Turing machine. In all cases the answer is no. Perhaps we can at least construct a decision procedure for a given Turing machine M, which decides whether M halts.
2.10 Definition: Consider a Turing machine M. The halting problem for M is the language

    H_M = { w ∈ B* | M applied to w halts }.

For certain Turing machines M the problem H_M is decidable, e.g. when H_M = ∅. However, there also exist Turing machines M for which H_M is undecidable. We show this with the help of universal Turing machines.

2.11 Definition: A Turing machine M_uni is called universal, if for the function h_{M_uni} computed by M_uni the following holds:

    h_{M_uni}(bw_M 00 u) = h_M(u),

i.e. M_uni can simulate every other Turing machine M applied to an input string u ∈ B*.
Relevance for computer science: The Turing machine M_uni is an interpreter of Turing machines which itself is written as a Turing machine: M_uni applied to (bw_M, u) behaves as M applied to u. Thus the construction of M_uni corresponds to writing an interpreter of a simple programming language in this language itself.

2.12 Lemma: Universal Turing machines M_uni can be constructed effectively.

Proof: Obviously M_uni applied to a string w ∈ B* must work as follows:

M_uni determines whether w has the form w = bw_M 00 u.
If no, then M_uni goes into an infinite loop.
If yes, then M_uni simulates the Turing machine M applied to u.
The last point can be described in more detail as follows:

1. M_uni changes the tape to

    bw_M ¢ q_0 u $,

where q_0 is the initial state of M.

2. In general M_uni sees on the tape

    bw_M ¢ v_l q a v_r $,

where v_l q a v_r represents the current configuration of the Turing machine M. M_uni remembers the pair (q, a) and determines the value δ(q, a) of the transition function (which is binary encoded in bw_M) of M.

3. If δ(q, a) is defined, then M_uni returns to the configuration v_l q a v_r of M and then makes the necessary transition. Usually M_uni must run back and forth several times, because it can store only a finite amount of information in its memory Q. As a result we have a new labelling of the tape of the form

    bw_M ¢ v′_l q′ a′ v′_r $,

and M_uni operates further as described in (2).

4. If δ(q, a) is undefined, i.e. v_l q a v_r is a final configuration of M, then M_uni erases the substrings bw_M, ¢ and $ and halts. Thus

    h_{M_uni}(bw_M 00 u) = h_M(u)

as desired.
2.13 Theorem: H_{M_uni} ⊆ B* is undecidable.

Proof: According to the definition of M_uni it holds that H ≤ H_{M_uni} using f = id_{B*}. For the Turing machine M_uni constructed in the proof above it even holds that H = H_{M_uni}, because M_uni gets into an infinite loop if a given string w ∈ B* does not have the form bw_M 00 u. Altogether we have the chain K ≤ H = H_{M_uni} ≤ H_0.
3 Recursive enumerability
In this section we deal with a weakening of the notion of decidability (recursiveness) for languages L over an alphabet A.

3.1 Definition: A language L ⊆ A* is called recursively enumerable, if L = ∅ or there is a total computable function ν: ℕ → A* with

    L = ν(ℕ) = {ν(0), ν(1), ν(2), . . .},

i.e. L is the range of values of ν.

Let us revise the notion "countable" for comparison. L ⊆ A* is countable, if L ≼ ℕ, or equivalently: if there is a total function ν: ℕ → A* with

    L = ∅  or  L = ν(ℕ) = {ν(0), ν(1), ν(2), . . .}.

Thus the difference is that by "countable" we mean that the function ν must not necessarily be computable. We want to consider the new notion of recursive enumerability in more detail.
3.2 Definition (Semi-decidability): A language L ⊆ A* is called semi-decidable, if the partial characteristic function ψ_L: A* ⇀ {1} is computable. The partial function ψ_L is defined as follows:

    ψ_L(v) = 1 if v ∈ L,  and undefined otherwise.

Thus a semi-decision procedure for a set L ⊆ A* halts exactly on the inputs from L. For languages L ⊆ A* it holds that:

1. L is semi-decidable ⟺ L is Turing-acceptable.
2. L is decidable ⟺ L and ∁L = A* − L are semi-decidable.

3.3 Theorem: L is recursively enumerable ⟺ L is semi-decidable.

Proof: ⟹: For L = ∅ the claim is clear. Otherwise let L = ν(ℕ) for a total computable ν. Let f: A* ⇀ {1} be computable by the following algorithm:

    Input: w ∈ A*.
    Apply ν to n = 0, 1, 2, . . . successively.
    If at some point ν(n) = w holds, halt with output 1. (Otherwise the algorithm does not halt.)

Then f = ψ_L holds. Thus L is semi-decidable.
⟸ ("dovetailing"): Let L be semi-decidable by the Turing machine M. If L ≠ ∅, then we must provide a total computable function ν: ℕ → A* with ν(ℕ) = L. Let w_0 be some fixed string from L. ν is computed by the following algorithm:

    Input: n ∈ ℕ.
    Determine whether n is the prime-number encoding of a pair (w, k) ∈ A* × ℕ (compare with the pairing used in the countability proofs).
    If yes and M applied to w halts within k steps, output ν(n) = w; otherwise output ν(n) = w_0.

3.4 Theorem: For all languages L ⊆ A* the following statements are equivalent:
1. L is recursively enumerable.
2. L is the range of results of a Turing machine M, i.e. L = { v ∈ A* | ∃ u: h_M(u) = v }.
3. L is semi-decidable.
4. L is the halting range of a Turing machine M, i.e. L = { v ∈ A* | h_M(v) exists }.
5. L is Turing-acceptable.
6. L is Chomsky-0.

3.5 Corollary: For all languages L ⊆ A* it holds that:

    L is decidable (recursive) ⟺ L and ∁L = A* − L are recursively enumerable.
H_0 is recursively enumerable: H_0 is semi-decidable by a Turing machine M_0 which, applied to w ∈ B*, operates as follows: M_0 determines whether w is of the form w = bw_M. If no, M_0 does not halt. If yes, M_0 simulates M applied to the blank tape and halts as soon as this simulation halts.

Thus we get the following main result about the halting of Turing machines.
3.8 Main theorem (Halting of Turing machines):
1. The halting problems K, H, H_0 and H_{M_uni} are recursively enumerable, but not decidable.
2. The complementary problems ∁K, ∁H, ∁H_0 and ∁H_{M_uni} are countable but not recursively enumerable.

Proof:
(1): It holds that K ≤ H = H_{M_uni} ≤ H_0, and as above all four problems are semi-decidable.
(2): If the complementary problems were recursively enumerable, then by (1) and the above corollary the halting problems would be decidable. Contradiction!
4 Automatic program verification

Can the computer verify programs, i.e. determine whether a program P satisfies a given specification S? We consider the following verification problem in more detail:

    Given: program P and specification S.
    Question: Does P satisfy the specification S?

We formalize this problem as follows:

    program P       = Turing machine M with input alphabet B = {0, 1}
    specification S = subset S of T_{B,B}, the set of all Turing-computable functions h: B* ⇀ B*
    P satisfies S   = h_M ∈ S

The answer is given in the theorem of Rice (1953, 1956): the verification problem is undecidable, except for two trivial exceptions:

    S = ∅:       answer always "no",
    S = T_{B,B}: answer always "yes".
4.1 Theorem (Theorem of Rice): Let S be an arbitrary non-trivial subset of T_{B,B}, i.e. ∅ ≠ S ≠ T_{B,B} holds. Then the language

    BW(S) = { bw_M ∈ B* | h_M ∈ S },

i.e. the set of encodings of Turing machines whose computed function lies in the set of functions S, is undecidable.

Proof: In order to show the undecidability of BW(S), we reduce the undecidable problem H_0 or its complement ∁H_0 = B* − H_0 respectively to BW(S).
Therefore, first we consider an arbitrary function g ∈ T_{B,B} and an arbitrary Turing machine M. Let M_g be a TM computing g. Furthermore, let M′ = M′(M, g) be a TM which, applied to an input v ∈ B*, operates as follows:

1. First v is ignored and M′ operates as M applied to the blank tape.
2. If M halts, then M′ operates as M_g applied to v.

Let Ω ∈ T_{B,B} be the totally undefined function. Then the function h_{M′(M,g)} computed by M′(M, g) is given by the following case distinction:

    h_{M′(M,g)} = g  if M halts on the blank tape,  and Ω otherwise.
For a given g there exists a total computable function f_g: B* → B* with

    f_g(bw_M) = bw_{M′(M,g)},

i.e. f_g computes the binary encoding of the Turing machine M′(M, g).

Case 1: Ω ∉ S. Choose any function g ∈ S; it is possible, because S ≠ ∅.
We show: H_0 ≤ BW(S) using f_g:

    bw_M ∈ H_0 ⟺ M halts applied to the blank tape
               ⟺ h_{M′(M,g)} = g
               ⟺ h_{M′(M,g)} ∈ S        (it holds h_{M′(M,g)} ∈ {g, Ω}, Ω ∉ S and g ∈ S)
               ⟺ bw_{M′(M,g)} ∈ BW(S)
               ⟺ f_g(bw_M) ∈ BW(S)
Case 2: Ω ∈ S. Choose any function g ∉ S; it is possible, because S ≠ T_{B,B}.
We show: ∁H_0 ≤ BW(S) using f_g:

    bw_M ∉ H_0 ⟺ M does not halt applied to the blank tape
               ⟺ h_{M′(M,g)} = Ω
               ⟺ h_{M′(M,g)} ∈ S        (it holds h_{M′(M,g)} ∈ {g, Ω}, Ω ∈ S and g ∉ S)
               ⟺ bw_{M′(M,g)} ∈ BW(S)
               ⟺ f_g(bw_M) ∈ BW(S)
In both cases the reduction lemma yields the undecidability of BW(S). Thus we have proved the theorem of Rice.

The theorem of Rice shows that it is hopeless to try to algorithmically verify the semantics of Turing machines or programs, i.e. their input/output behaviour, on the basis of their syntactic form.

Nevertheless, automatic program verification (also called model checking, where "model" is equivalent to "program") is today a very active field of research. For example, it is possible to analyse programs whose behaviour can be described by finite automata.
5 Grammar problems and the Post correspondence problem

First we consider two problems for Chomsky-0-grammars.

5.1 Definition (decision problem for Chomsky-0): The decision problem for Chomsky-0-grammars is:

    Given: Chomsky-0-grammar G = (N, T, P, S) and a string w ∈ T*.
    Question: Does w ∈ L(G) hold?

5.2 Definition (derivation problem for Chomsky-0): The derivation problem for Chomsky-0-grammars is:

    Given: Chomsky-0-grammar G = (N, T, P, S) and strings u, v ∈ (N ∪ T)*.
    Question: Does u ⇒*_G v hold?

5.4 Theorem: The decision problem and the derivation problem for Chomsky-0-grammars are undecidable.

Proof: Obviously the reduction "decision problem ≤ derivation problem" holds, because for all grammars G = (N, T, P, S) and strings w ∈ T* it holds that:

    w ∈ L(G) ⟺ S ⇒*_G w.
Next we consider some sort of a puzzle game, the Post correspondence problem. It was introduced by E. Post (1946) and is abbreviated as PCP (from "Post Correspondence Problem"). The point is to construct a string in two different ways.

For subsequent encoding purposes let us assume that we have a countably infinite set of symbols

    SYM = {a_0, a_1, a_2, . . .}.

We always consider strings over SYM. Let X ⊆ SYM be an alphabet.
5.5 Definition: An input/instance of the PCP is a finite sequence Y of pairs of strings, i.e.

    Y = ((u_1, v_1), . . . , (u_n, v_n))

with n ≥ 1 and u_i, v_i ∈ SYM* for i = 1, . . . , n. If u_i, v_i ∈ X* holds for all i, then Y is called an input of the PCP over X. A correspondence (solution) of Y is a sequence of indices (i_1, . . . , i_m) with m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} so that

    u_{i_1} . . . u_{i_m} = v_{i_1} . . . v_{i_m}

holds. Y is called solvable, if Y has a correspondence.

Remark: The PCP over a one-element alphabet X = {a} is decidable. Over {a} two concatenations are equal if and only if they have the same length, i.e. a sequence of indices (i_1, . . . , i_m) is a correspondence if and only if

    Σ_{j=1}^{m} |u_{i_j}| = Σ_{j=1}^{m} |v_{i_j}|

holds. Therefore, we deal with the solvability of a problem for natural numbers. We show:

    Y is solvable ⟺ (a) ∃ j ∈ {1, . . . , n}: |u_j| = |v_j|  or
                     (b) ∃ k, j ∈ {1, . . . , n}: |u_k| < |v_k| and |u_j| > |v_j|.
⟹: Proof by contradiction: if neither (a) nor (b) holds, then all pairs (u, v) ∈ Y have the property |u| < |v|, or all pairs (u, v) ∈ Y have the property |u| > |v|. Thus every string composed of u's would be strictly shorter, respectively strictly longer, than the corresponding string composed of v's. Thus Y is not solvable.

⟸: If |u_j| = |v_j| holds, the trivial sequence of indices (j) is a solution of Y. If |u_k| < |v_k| and |u_l| > |v_l| hold, then the sequence

    ( k, . . . , k,  l, . . . , l )
      ((|u_l| − |v_l|) times k, then (|v_k| − |u_k|) times l)

is a solution of Y, because it holds that

    |u_k| · (|u_l| − |v_l|) + |u_l| · (|v_k| − |u_k|) = |v_k| · (|u_l| − |v_l|) + |v_l| · (|v_k| − |u_k|).

Obviously the properties (a) and (b) are decidable. Thus the PCP is decidable over a one-element alphabet.
Now we consider the general case of the PCP. Our goal is the following theorem:

5.7 Theorem: The PCP over X is undecidable for every alphabet X with |X| ≥ 2.

We show the theorem by reduction of the derivation problem for Chomsky-0-grammars to the PCP over {0, 1}. We consider the following three reductions:

    derivation problem ≤ MPCP ≤ PCP ≤ PCP over {0, 1}.

We start with the first reduction.
5.8 Lemma: derivation problem ≤ MPCP.

Proof: We introduce an algorithm which constructs an input Y_{G,u,v} for the MPCP (the modified PCP, in which a correspondence must start with the first pair of strings) for a given Chomsky-0-grammar G = (N, T, P, S) with N ∪ T ⊆ SYM and given strings u, v ∈ (N ∪ T)*, so that the following holds:

    (1) u ⇒*_G v ⟺ Y_{G,u,v} has a correspondence starting with the first pair of strings.

We use a symbol # which does not occur in N ∪ T. Then Y_{G,u,v} consists of the following pairs of strings:

    the first pair of strings: (#, # u #)
    productions:               (p, q) for all p → q ∈ P
    copy:                      (a, a) for all a ∈ N ∪ T ∪ {#}
    last pair:                 (v # #, #)

The exact sequence of the pairs of strings in Y_{G,u,v} is irrelevant, except for the first pair of strings. Now we show that (1) holds.
⟹: It holds that u ⇒*_G v. Then there is a derivation from u to v of the form

    (2)  u = u_0 p_0 v_0 ⇒_G u_0 q_0 v_0 = u_1 p_1 v_1 ⇒_G u_1 q_1 v_1 = u_2 p_2 v_2 . . . ⇒_G u_k q_k v_k = v,

where p_0 → q_0, . . . , p_k → q_k ∈ P holds. There exists the following correspondence of Y_{G,u,v}, where we write down both components of the pairs of strings one after another in two rows:
    upper row:  #      u_0 p_0 v_0 #   u_1 p_1 v_1 #   . . .   u_k p_k v_k #   v # #
    lower row:  # u #  u_0 q_0 v_0 #   u_1 q_1 v_1 #   . . .   u_k q_k v_k #   #

The pieces with u and v come from the first pair of strings and the last pair respectively. The substrings u_i, v_i and the separators # are constructed character-by-character by copy-pairs (a, a) for a ∈ N ∪ T ∪ {#}. The substrings p_i, q_i are placed by the respective pair of productions (p_i, q_i). Vertically aligned segments belong to the same pair of strings.
⟸: Now consider a correspondence of Y_{G,u,v} which begins with the first pair of strings. The form of the pairs of strings in Y_{G,u,v} implies that the correspondence must be constructed as shown under "⟹", with the exception that the substrings u and u_i q_i v_i respectively can be repeated several times by applying only copy-pairs (a, a):

    upper row:  . . .   u_i q_i v_i #   u_i q_i v_i #   . . .
    lower row:  . . .   u_i q_i v_i #   u_i q_i v_i #   . . .

In order to make the correspondence complete, we must at the end be able to place the final pair. For this purpose we must insert pairs of productions (p_i, q_i) so that the correspondence describes a derivation (2) from u to v.

Note that "⟸" holds only because we begin with the first pair of strings in the sense of the MPCP. Without this restriction, for example, the copy-pair (#, #) gives us a trivial correspondence, which does not imply u ⇒*_G v.
5.9 Lemma: MPCP ≤ PCP.

Proof: We introduce an algorithm which constructs an input Y_PCP for the PCP from a given input Y = ((u_1, v_1), . . . , (u_n, v_n)) for the MPCP (i.e. with u_i ≠ ε for i = 1, . . . , n), so that the following holds:

    (3) Y has a correspondence which starts with the first pair of strings ⟺ Y_PCP has a correspondence.

Idea: Let us construct Y_PCP in such a way that every correspondence necessarily starts with the first pair of strings. For this purpose we use new symbols ∗ and $, which may not appear in any of the strings from Y.

We define two mappings r, ℓ: SYM* → SYM*, where for w ∈ SYM*:

    r(w) is generated from w by inserting the symbol ∗ to the right of every character of w,
    ℓ(w) is generated from w by inserting the symbol ∗ to the left of every character of w.
For example it holds that

    r(GTI) = G∗T∗I∗    and    ℓ(GTI) = ∗G∗T∗I.

For the mappings r and ℓ the following holds for all u, v ∈ SYM*:

    1. r(ε) = ε and ℓ(ε) = ε
    2. r(uv) = r(u) r(v) and ℓ(uv) = ℓ(u) ℓ(v)
    3. u = v ⟺ ∗ r(u) = ℓ(v) ∗

The statements 1 and 2 mean that r and ℓ are string homomorphisms, i.e. ε and the concatenation are preserved.
From Y we construct the following Y_PCP with pairs of strings, which are numbered from 0 to n + 1:

    Y_PCP = ( (∗ r(u_1), ℓ(v_1)),    0th pair of strings
              (r(u_1), ℓ(v_1)),      1st pair of strings
              . . . ,
              (r(u_n), ℓ(v_n)),      n-th pair of strings
              ($, ∗ $) )             (n+1)-th pair of strings
Now we show that (3) holds:

⟹: Let (1, i_2, . . . , i_m) be a correspondence of Y, i.e. u_1 u_{i_2} . . . u_{i_m} = v_1 v_{i_2} . . . v_{i_m}. According to statement 3 it holds that:

    ∗ r(u_1 u_{i_2} . . . u_{i_m}) = ℓ(v_1 v_{i_2} . . . v_{i_m}) ∗.

By applying statement 2 several times, we get:

    ∗ r(u_1) · r(u_{i_2}) · · · r(u_{i_m}) · $  =  ℓ(v_1) · ℓ(v_{i_2}) · · · ℓ(v_{i_m}) · ∗ $,

where corresponding factors on the two sides come from the same pairs of strings of Y_PCP. We see that

    (0, i_2, . . . , i_m, n + 1)

is a correspondence of Y_PCP.
⟸: We show this direction only for the following case: v_i ≠ ε for i = 1, . . . , n.

Let (i_1, . . . , i_m) be some correspondence of Y_PCP. Then i_1 = 0 and i_m = n + 1 hold, because only the two strings in the 0th pair (∗ r(u_1), ℓ(v_1)) start with the same symbol ∗, and only the strings in the (n+1)-th pair ($, ∗ $) end with the same symbol $. Let k ∈ {2, . . . , m} be the smallest index with i_k = n + 1. Then (i_1, . . . , i_k) is also a correspondence of Y_PCP, because $ appears only as the last symbol in the resulting correspondence string. The form of the pairs of strings implies that:

    i_j ≠ 0 for j = 2, . . . , k − 1.

Otherwise there would be two consecutive ∗s in the substring r(u_{i_{j−1}}) ∗ r(u_1), which could not be reproduced by placing the ℓ(v_i)s, because v_i ≠ ε. Thus the correspondence string has the following structure:

    ∗ r(u_1) · r(u_{i_2}) · · · r(u_{i_{k−1}}) · $  =  ℓ(v_1) · ℓ(v_{i_2}) · · · ℓ(v_{i_{k−1}}) · ∗ $.

With statements 2 and 3 it follows that u_1 u_{i_2} . . . u_{i_{k−1}} = v_1 v_{i_2} . . . v_{i_{k−1}}, i.e. (1, i_2, . . . , i_{k−1}) is a correspondence of Y starting with the first pair of strings.

It remains the reduction PCP ≤ PCP over {0, 1}. For it we use the binary encoding bw of symbols over the alphabet {0, 1}, with which we have got acquainted while considering the binary encoding of Turing machines: a symbol a_i ∈ SYM is encoded as 0 1^{i+1}, and the encoding is extended to strings character by character.
Now consider an input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP. Then we define

    bw(Y) = ((bw(u_1), bw(v_1)), . . . , (bw(u_n), bw(v_n)))

as an input of the PCP over {0, 1}. Obviously it holds that:

    Y is solvable ⟺ bw(Y) is solvable.
From the three lemmas above we get the undecidability of the following problems: MPCP, PCP, PCP over {0, 1} and PCP over X with |X| ≥ 2. In particular we have proved the above theorem.
6 Results on undecidability of context-free languages
Now we can prove the results on undecidability of context-free languages mentioned in Chapter
III. In contrast to the regular languages the following holds:
6.1 Theorem : For context-free (i.e. Chomsky-2-) languages
the intersection problem,
the equivalence problem,
the inclusion problem
are undecidable.
Proof:

Intersection problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) ∩ L(G_2) = ∅ hold? We show that the Post correspondence problem can be reduced to the intersection problem:

    PCP ≤ intersection problem.

This implies the undecidability of the intersection problem. Now consider an arbitrary input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP with u_i, v_i ∈ X* for an alphabet X. We provide an algorithm which for every input Y constructs two context-free grammars G_1 and G_2, so that the following holds:

    Y has a correspondence ⟺ L(G_1) ∩ L(G_2) ≠ ∅.        (∗)
The idea is that G_1 generates all strings which can be produced by putting u_i's one after another, and G_2 generates all strings which can be produced by putting v_i's one after another. In order for the necessary relation (∗) to hold, G_1 and G_2 must also record the indices i of the placed u_i and v_i. For this purpose we use n new symbols a_1, . . . , a_n ∉ X and choose as the set of terminal symbols of G_1 and G_2:

    T = {a_1, . . . , a_n} ∪ X.

Then put G_i = ({S}, T, P_i, S), i = 1, 2, where P_1 is given by the following productions:

    S → a_1 u_1 | a_1 S u_1 | . . . | a_n u_n | a_n S u_n

and P_2 is given by the following productions:

    S → a_1 v_1 | a_1 S v_1 | . . . | a_n v_n | a_n S v_n
Obviously it holds that

    L(G_1) = { a_{i_m} . . . a_{i_1} u_{i_1} . . . u_{i_m} | m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} }

and

    L(G_2) = { a_{i_m} . . . a_{i_1} v_{i_1} . . . v_{i_m} | m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} }.

This implies that:

    Y has the correspondence (i_1, . . . , i_m)
    ⟺ a_{i_m} . . . a_{i_1} u_{i_1} . . . u_{i_m} = a_{i_m} . . . a_{i_1} v_{i_1} . . . v_{i_m} ∈ L(G_1) ∩ L(G_2).

Therefore, (∗) holds and so does the undecidability of the intersection problem. We have even proved a stronger result: the undecidability of the intersection problem for deterministic context-free languages, because, as it is easy to check, the languages L(G_1) and L(G_2) are deterministic.
Equivalence problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) = L(G_2) hold? We show the undecidability of this problem using the following reduction:

    intersection problem for deterministic context-free languages ≤ equivalence problem.

Consider two deterministic push-down automata 𝒜_1 and 𝒜_2. We show that we can construct from them two (not necessarily deterministic) context-free grammars G_1 and G_2 so that the following holds:

    L(𝒜_1) ∩ L(𝒜_2) = ∅ ⟺ L(G_1) = L(G_2).

We use the fact that for 𝒜_2 we can construct a complement push-down automaton ∁𝒜_2 with L(∁𝒜_2) = ∁L(𝒜_2) (compare with Chapter III, section 6). Then from 𝒜_1 and ∁𝒜_2 we can algorithmically construct context-free grammars G_1 and G_2 with

    L(G_1) = L(𝒜_1) ∪ L(∁𝒜_2)    and    L(G_2) = L(∁𝒜_2).

Thus it holds that

    L(𝒜_1) ∩ L(𝒜_2) = ∅ ⟺ L(𝒜_1) ⊆ ∁L(𝒜_2)
                         ⟺ L(𝒜_1) ∪ L(∁𝒜_2) = L(∁𝒜_2)
                         ⟺ L(G_1) = L(G_2)

as required.
Inclusion problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) ⊆ L(G_2) hold? Obviously the reduction

    equivalence problem ≤ inclusion problem

holds, because L(G_1) = L(G_2) if and only if L(G_1) ⊆ L(G_2) and L(G_2) ⊆ L(G_1). Therefore, the inclusion problem is also undecidable.

Another result of undecidability concerns the ambiguity of context-free grammars. For the practical use of context-free grammars for syntax description of programming languages it would be beneficial to have an algorithmic ambiguity test. However, below we show that such a test does not exist.
6.2 Theorem: It is undecidable whether a given context-free grammar is ambiguous.

Proof: We show the following reduction:

    PCP ≤ ambiguity problem.        (∗∗)

Consider an input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP. First we construct the context-free grammars G_i = ({S_i}, T, P_i, S_i), i = 1, 2, as in the case of the reduction of the PCP to the intersection problem. Afterwards we construct a context-free grammar G = ({S, S_1, S_2}, T, P, S) from them with

    P = {S → S_1, S → S_2} ∪ P_1 ∪ P_2.

Because G_1 and G_2 are unambiguous, the only possible ambiguity of G while constructing a derivation tree of G for a string w ∈ T* lies in the first step S → S_1 or S → S_2. Thus G is ambiguous if and only if some string is generated by both G_1 and G_2, i.e. if and only if Y has a correspondence. This proves (∗∗).
[ applied to u halts .
Now we put together problems, i.e. languages, with the same complexity into so-called complexity classes.
1.2 Definition: Let f: ℕ → ℝ.

    DTIME(f(n))  = { L | there exists a deterministic TM with several tapes which has time complexity f(n) and accepts L }
    NTIME(f(n))  = { L | there is a non-deterministic TM with several tapes which has time complexity f(n) and accepts L }
    DSPACE(f(n)) = { L | there exists a deterministic TM with several tapes which has space complexity f(n) and accepts L }
    NSPACE(f(n)) = { L | there is a non-deterministic TM with several tapes which has space complexity f(n) and accepts L }
Example: Consider L = { uu | u ∈ {a, b}* }. A deterministic TM with two tapes can accept L in five phases; the original figure sketches the tape contents after each phase, with the following captions:

(1) Move the upper head to the right end of the input w, letting the lower head move half as quickly as the upper head (n/2 vertical slashes on tape 2). Test whether n is even.

(2) Move the heads on both tapes n/2 steps to the left.

(3) Copy the 2nd half v of the input from the 1st tape to the 2nd tape and erase it on the 1st tape.

(4) Move the upper head to the right end of the 1st half u.

(5) Compare u and v backwards. Accept, if u = v.
Let us compute the time complexity of the 5 phases:

(n + 1) + (n/2 + 1) + (n/2 + 1) + (n/2 + 1) + (n/2 + 1) = 3n + 5.

Space complexity: n + 2.

Thus it holds that: L ∈ DTIME(3n + 5) and L ∈ DSPACE(n + 2).
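The five phases boil down to a simple deterministic procedure. A minimal Python sketch of the same check (the function name and the string operations are our stand-ins for the tape manipulations):

def accept_uu(w: str) -> bool:
    n = len(w)
    if n % 2 != 0:       # phase (1): test whether n is even
        return False
    u = w[: n // 2]      # the 1st half u stays on tape 1
    v = w[n // 2 :]      # phases (2)-(4): isolate the 2nd half v
    return u == v        # phase (5): compare u and v

assert accept_uu("abab") and not accept_uu("abba")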
We can proceed non-deterministically as follows:

Part 1: Copy the input character by character from tape 1 to tape 2. Stop this process non-deterministically at any time.

Part 2: Return the head of tape 2 to the beginning.

Part 3: Now compare character by character whether, starting from the current position, the first tape has the same contents as the second one. If this is the case and both heads point to the blank field, then accept.
In the best case this process needs

n/2 + (n/2 + 1) + (n/2 + 1) = 3n/2 + 2 steps.

In the worst case, when the non-deterministic decision is made only at the last symbol of the input string, we need

n + (n + 1) + 1 = 2n + 2 steps.

Thus it holds that: L ∈ NTIME(2n + 2).
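The non-deterministic procedure can be simulated deterministically by trying every possible stopping point of part 1. A sketch (illustrative; a real non-deterministic TM guesses the stopping point k in a single branch instead of looping over all of them):

def nd_accept_uu(w: str) -> bool:
    n = len(w)
    for k in range(n + 1):     # all non-deterministic stopping points
        tape2 = w[:k]          # part 1: copy the first k characters
        if w[k:] == tape2:     # parts 2 and 3: compare the rest
            return True        # accept if some branch accepts
    return False

assert nd_accept_uu("abab") and not nd_accept_uu("aba")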
In complexity theory we compare the asymptotic behavior of time and space complexity, i.e. the behavior for sufficiently large n. Thus we can omit constant factors. For this purpose we use the O-notation from number theory.
1.3 Definition: Let g : ℕ → ℝ. Then

O(g(n)) = { f : ℕ → ℝ | ∃ n_0, k ∈ ℕ ∀ n ≥ n_0 : f(n) ≤ k · g(n) },

i.e. O(g(n)) is the class of all functions f which, for sufficiently large values of n, are bounded by a constant multiple of g.
Example: (n ↦ 3n + 4), (n ↦ n + 2) ∈ O(n). Therefore we also say: the language L is of time and space complexity O(n), or L is of linear time and space complexity, respectively.
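The witnesses n_0 and k from Definition 1.3 can be checked numerically; for instance, for (n ↦ 3n + 4) ∈ O(n) one may take k = 4 and n_0 = 4, since 3n + 4 ≤ 4n holds exactly for n ≥ 4. A tiny sketch of such a check:

f, g = (lambda n: 3 * n + 4), (lambda n: n)
k, n0 = 4, 4          # witnesses for f ∈ O(g)
assert all(f(n) <= k * g(n) for n in range(n0, 10_000))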
Because a TM can visit at most one new field on its tapes in each computational step, it follows that:

DTIME(f(n)) ⊆ DSPACE(f(n)) ⊆ NSPACE(f(n)) and DTIME(f(n)) ⊆ NTIME(f(n)) ⊆ NSPACE(f(n)).
2 The classes P and NP
The complexity of algorithms which can be applied in practice should be a polynomial p(n) of the k-th degree, i.e. of the form

p(n) = a_k·n^k + … + a_1·n + a_0 with a_i ∈ ℕ for i = 0, 1, …, k, k ∈ ℕ and a_k ≠ 0.
2.1 Definition (Cobham, 1964):

P = ⋃_{p polynomial in n} DTIME(p(n))

NP = ⋃_{p polynomial in n} NTIME(p(n))

PSPACE = ⋃_{p polynomial in n} DSPACE(p(n))

NPSPACE = ⋃_{p polynomial in n} NSPACE(p(n))
2.2 Theorem (Savitch, 1970): For all polynomials p in n it holds that:

NSPACE(p(n)) ⊆ DSPACE(p²(n))

and thus

NPSPACE = PSPACE.
Furthermore, it holds that

P ⊆ NP ⊆ PSPACE, (∗)

because, as we have already mentioned, a TM can visit at most one new field in every computational step.

Open problem in computer science: Are the inclusions in (∗) strict or does equality hold?
The classes P and NP are very important because they mark the transition from computability and decidability questions of practical interest to those of purely theoretical interest. This transition can be illustrated by the following diagram of nested regions (outermost first):

all problems resp. languages
⊇ computable resp. decidable
⊇ exponential time
⊇ PSPACE
⊇ non-det. polyn. time: NP (containing the subclass NPC)
⊇ det. polyn. time: P
Practically solvable, i.e. practically computable or decidable, are those problems in the class P, which can be characterized as follows:

P: Construct the right solution deterministically and in polynomial time.

At the same time the polynomials bounding the computation time should have the smallest possible degree, such as n, n² or n³.
Practically unsolvable, however, are all problems for which it can be proved that the computation time grows exponentially with the input size n. Between these two extremes lies a large class of practically important problems for which at the moment only exponential deterministic algorithms are known, but which can be solved by non-deterministic algorithms in polynomial time. This is the class NP, which, in comparison to P, can be characterized as follows:

NP: Guess a solution proposal non-deterministically and then verify / check deterministically and in polynomial time whether this proposal is right.
Strictly speaking, these problems thus remain practically unsolvable. In practice we make use of so-called heuristics, which strongly restrict the non-deterministic search space of possible solution proposals. Using such heuristics we try to approximate an ideal solution.
Examples of problems from the class NP

(1) Problem of the Hamiltonian path

Given: A finite graph with n vertices.

Question: Does the graph contain a Hamiltonian path, i.e. a path which visits each vertex exactly once?

Consider, for example, the following graph:
(Figure: a graph on the four vertices K1, K2, K3, K4.)
Then the sequence of vertices K1-K2-K3-K4 is a Hamiltonian path. It is easy to see that the problem of the Hamiltonian path lies in NP: first guess a path and then check whether each vertex is visited exactly once. Because there are up to n! paths in the graph, this method can be carried out deterministically only with exponential complexity.
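The guess-and-verify scheme can be written out by determinizing the guess into an enumeration of all vertex orders, which makes the exponential blow-up visible. A Python sketch (graph representation and names are ours; the edge list in the call only reproduces the path K1-K2-K3-K4, since the full edge set of the figure is not recoverable):

from itertools import permutations

def hamiltonian_path(vertices, edges):
    edge_set = {frozenset(e) for e in edges}
    for order in permutations(vertices):       # exponential guessing
        if all(frozenset(p) in edge_set        # polynomial verification
               for p in zip(order, order[1:])):
            return order
    return None

print(hamiltonian_path(["K1", "K2", "K3", "K4"],
                       [("K1", "K2"), ("K2", "K3"), ("K3", "K4")]))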
(2) Traveling salesman problem

Given: A finite graph with n vertices and a length ∈ ℕ for every edge, as well as a number k ∈ ℕ.

Question: Is there a round-trip for the traveling salesman of length ≤ k, or, to put it more mathematically: is there a cycle in the graph of length ≤ k which visits each vertex at least once?

This problem also lies in NP: first guess a cycle and then compute its length. The traveling salesman problem is of practical importance, for example for designing telephone networks or integrated circuits.
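Here, too, the verification part must run in polynomial time. A sketch of such a verifier (the graph representation is our own choice):

def verify_round_trip(cycle, length, k):
    # cycle: vertex sequence with cycle[0] == cycle[-1];
    # length: maps frozenset({a, b}) to the length of edge {a, b}.
    total = sum(length[frozenset(e)] for e in zip(cycle, cycle[1:]))
    vertices = {v for e in length for v in e}
    return cycle[0] == cycle[-1] and set(cycle) == vertices and total <= k

Guessing a cycle and running this check once is exactly the non-deterministic polynomial-time procedure described above.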
(3) Satisfiability problem for Boolean expressions (SAT for short)

Given: A Boolean expression B, i.e. an expression which consists only of variables x_1, x_2, …, x_n connected by the operators ¬ (not), ∧ (and) and ∨ (or), as well as brackets.

Question: Is B satisfiable, i.e. is there an assignment of 0 and 1 to the Boolean variables x_1, x_2, …, x_n in B so that B evaluates to 1?
For example, B = (x_1 ∧ x_2) ∨ ¬x_3 is satisfiable with the values x_1 = x_2 = 1 or with x_3 = 0. In Section 3 we will consider the SAT problem in more detail.
While considering the question, which still remains open, whether P = NP holds, researchers came across a subclass of NP: the class NPC of the so-called NP-complete problems. It holds that:

If one problem of NPC lies in P, then all problems of NP are already in P, i.e. P = NP holds.

The class NPC was introduced in 1971 by S.A. Cook, who was the first to prove that the SAT problem introduced above is NP-complete. Beginning in 1972 with R. Karp, many other problems have been proved to be NP-complete. Today we know more than 1000 examples of problems from the class NPC.
Below we define the notion of NP-completeness. For this purpose we need the notion of polynomial-time reduction, which was introduced by Karp in 1972 as a technique for proving NP-completeness. It tightens the notion of reduction L_1 ≤ L_2 introduced in Section 2.
2.3 Definition: Let L_1 ⊆ Σ_1* and L_2 ⊆ Σ_2* be languages. Then L_1 is called polynomial-time reducible to L_2, shortly

L_1 ≤_p L_2,

if there is a function f : Σ_1* → Σ_2*, which is total and computable with polynomial time complexity, so that for all w ∈ Σ_1* it holds that:

w ∈ L_1 ⟺ f(w) ∈ L_2.

We also say: L_1 ≤_p L_2 using f.
L_1 ≤_p L_2 expresses that L_1 is not more complex than L_2. It is easy to see that ≤_p is a reflexive and transitive relation on languages, because with two polynomials p_1(n) and p_2(n), the composition p_1(p_2(n)) is also a polynomial.
2.4 Definition (Cook, 1971): A language L_0 is called NP-complete, if L_0 ∈ NP holds and

∀ L ∈ NP : L ≤_p L_0.
2.5 Lemma (Polynomial-time reduction): Let L_1 ≤_p L_2. Then it holds that:

(i) If L_2 ∈ P holds, then L_1 ∈ P holds as well.

(ii) If L_2 ∈ NP holds, then L_1 ∈ NP holds as well.

(iii) If L_1 is NP-complete and L_2 ∈ NP holds, then L_2 is also NP-complete.
Proof: for (i) and (ii): Let L_1 ≤_p L_2 using a function f, which is computed by a Turing machine 𝒯_1. Let the polynomial p_1 bound the computing time of 𝒯_1. Because L_2 ∈ P (or L_2 ∈ NP, respectively), there is a (non-deterministic) Turing machine 𝒯_2, which is bounded by a polynomial p_2 and computes the characteristic function χ_{L_2}.

As with ordinary reduction, the characteristic function χ_{L_1} can be computed for all w ∈ Σ_1* as follows:

χ_{L_1}(w) = χ_{L_2}(f(w)).

We do this by sequentially connecting the Turing machines 𝒯_1 and 𝒯_2. Now let |w| = n. Then 𝒯_1 computes the string f(w) in ≤ p_1(n) steps. This time bound also limits the length of f(w), i.e. |f(w)| ≤ p_1(n). Therefore, the computation of χ_{L_2}(f(w)) is carried out in

≤ p_1(n) + p_2(p_1(n))

steps. Thus also L_1 ∈ P (or L_1 ∈ NP, respectively).
For (iii): Let L ∈ NP. Because L_1 is NP-complete, L ≤_p L_1 holds. Moreover L_1 ≤_p L_2 holds. The transitivity of ≤_p implies L ≤_p L_2, which is what we wanted to show.
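In programming terms, the sequential connection of 𝒯_1 and 𝒯_2 is just function composition with added running times. A schematic Python sketch (all names and the bookkeeping are illustrative):

def decide_L1(w, f, decide_L2, p1):
    # chi_{L_1}(w) = chi_{L_2}(f(w)): first run the reduction machine
    # T_1, then the decision machine T_2 on its output.
    fw = f(w)                      # T_1: at most p1(|w|) steps
    assert len(fw) <= p1(len(w))   # the time bound also limits |f(w)|
    return decide_L2(fw)           # T_2: at most p2(|f(w)|) steps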
2.6 Corollary (Karp): Let L_0 be an NP-complete language. Then it holds that: L_0 ∈ P ⟺ P = NP.

Proof: "⟹": statement (i) of the Lemma. "⟸": obvious.
Example: We show the following polynomial-time reduction:

Hamiltonian path ≤_p traveling salesman.

Consider a graph G with n vertices. As the reduction function we take the following construction f : G ↦ G′ of a new graph G′ together with the bound k = n + 1:

– G′ has the same n vertices as G.
– Every edge of G is an edge of G′ of length 1.
– Every non-edge of G is an edge of G′ of length 2.

Thus G′ is a complete graph, i.e. every two vertices are connected by an edge.
E.g. for n = 5:

(Figure: a graph G on five vertices and its image G′ = f(G), the complete graph in which the six edges of G have length 1 and the remaining four edges have length 2; the corresponding bound is k = n + 1 = 6.)
The construction f takes place in polynomial time (in order to find the non-edges, consider the adjacency matrix, i.e. O(n²) time). Now we show the reduction property of f, i.e.:
G has a Hamiltonian path ⟺ G′ has a round-trip of length ≤ k = n + 1.

(Figure: a Hamiltonian path in G, consisting of n − 1 edges of length 1, is closed by one further edge into a round-trip of length ≤ n + 1 in G′.)
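The reduction function f itself is easy to program. A sketch (the representation of graphs by vertex lists and edge pairs is our own choice):

from itertools import combinations

def reduce_hamiltonian_to_tsp(vertices, edges):
    # f: G -> (G', k). Edges of G get length 1, non-edges length 2;
    # the bound is k = n + 1. Runs in O(n^2), as noted above.
    edge_set = {frozenset(e) for e in edges}
    length = {frozenset(p): 1 if frozenset(p) in edge_set else 2
              for p in combinations(vertices, 2)}
    return length, len(vertices) + 1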
3 The satisfiability problem for Boolean expressions
3.1 Definition SAT:

(i) A Boolean expression is a term over {0, 1} and variables with the operations not, and, or. In particular: let V = {x_1, x_2, …} be an infinite set of variables and let B = {0, 1} be the set of logical values (false and true).

(1) Every x_i ∈ V is a Boolean expression.
(2) Every a ∈ B is a Boolean expression.
(3) If E is a Boolean expression, then ¬E (not E) is a Boolean expression as well.
(4) If E_1, E_2 are Boolean expressions, then (E_1 ∧ E_2) and (E_1 ∨ E_2) are also Boolean expressions.

If x ∈ V, then we mostly write x̄ instead of ¬x.
If x ∈ V, then we call x and x̄ literals.

(Priorities: ¬ over ∧ over ∨)
(ii) Conjunctive normal form:

(1) If y_1, y_2, …, y_k are literals, then (y_1 ∨ y_2 ∨ … ∨ y_k) is called a clause (of order k, i.e. with k literals/alternatives).

(2) If c_1, c_2, …, c_r are clauses, then c_1 ∧ c_2 ∧ … ∧ c_r is called a Boolean expression in conjunctive normal form (CNF). If every clause has order ≤ k and at least one clause contains exactly k literals, then this expression is called a CNF of order k. [Similarly one can define disjunctive normal form.]
(iii) An assignment is a function α : V → {0, 1}, which can be extended to Boolean expressions (see (i), (2)-(4)):

α(0) = 0, α(1) = 1,
α(¬E) = 1 − α(E),
α(E_1 ∧ E_2) = Min(α(E_1), α(E_2)),
α(E_1 ∨ E_2) = Max(α(E_1), α(E_2)).
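The extension of an assignment α to expressions is a straightforward recursion over the structure of E. A minimal Python sketch (the nested-tuple representation of expressions is our own choice, not from the notes):

def evaluate(E, alpha):
    if E in (0, 1):                # alpha(0) = 0, alpha(1) = 1
        return E
    if isinstance(E, str):         # a variable x_i
        return alpha[E]
    op = E[0]
    if op == "not":                # alpha(¬E) = 1 - alpha(E)
        return 1 - evaluate(E[1], alpha)
    if op == "and":                # Min(alpha(E_1), alpha(E_2))
        return min(evaluate(E[1], alpha), evaluate(E[2], alpha))
    if op == "or":                 # Max(alpha(E_1), alpha(E_2))
        return max(evaluate(E[1], alpha), evaluate(E[2], alpha))
    raise ValueError("unknown operator")

B = ("or", ("and", "x1", "x2"), ("not", "x3"))         # (x_1 ∧ x_2) ∨ ¬x_3
assert evaluate(B, {"x1": 1, "x2": 1, "x3": 1}) == 1   # x_1 = x_2 = 1
assert evaluate(B, {"x1": 0, "x2": 0, "x3": 0}) == 1   # x_3 = 0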
(iv) A Boolean expression E is called satisfiable, if there is an assignment α with α(E) = 1.

(v) The satisfiability problem is described by the language

ERF = { E | E is a Boolean expression and E is satisfiable }.
In this case the alphabet is V ∪ {0, 1, ¬, ∧, ∨, ), (}. It is infinite. Thus we code V using

x_i ↦ xj, where j is the binary representation of the number i,

i.e. x_1 ↦ x1, x_2 ↦ x10, x_3 ↦ x11, x_4 ↦ x100 etc. The elements of B can easily be removed from E using the calculation rules, and we can bring E into CNF. Thus we get:
SAT := { E | E is a Boolean expression (without elements from B) in CNF;
             if E contains m different variables, then these are
             the variables x_1, …, x_m (in encoded form);
             E is satisfiable }
       ⊆ {x, 0, 1, ¬, ∧, ∨, (, )}*.

SAT(k) := { E | E ∈ SAT and E is of order k }.
3.2 Theorem (Cook, 1971): SAT is NP-complete.
Proof:

First of all it is easy to see that SAT lies in NP: we guess an assignment α : {x_1, …, x_m} → {0, 1} for a Boolean expression E in CNF which contains exactly m variables. Then we assign to every variable x_i the value α(x_i) and calculate α(E) according to the standard calculation rules (Def. 3.1 (iii)). If |E| = n, then E contains fewer than n variables, i.e. m ≤ n; guessing an assignment is carried out in linear time (create a table, m steps), and the substitution into E takes |E| · const steps, as does the evaluation. Hence SAT ∈ NTIME(c_1 · n + c_2) for appropriate constants c_1 and c_2. In particular: SAT ∈ NP. [Exercise: describe how α(E) can be computed in linear time!]
The difficult part is to show that for every L ∈ NP it holds that: L ≤_p SAT. Thus consider an arbitrary L ∈ NP. Then there is a non-deterministic Turing machine 𝒯 = (Q, X, Γ, δ, q_0, ␣, F) with: Q = set of states; X = input alphabet; Γ = tape alphabet, which contains X; q_0 ∈ Q the initial state; ␣ ∈ Γ − X the blank symbol; F ⊆ Q the set of final states. (Compare with the definition in the first chapter, supplemented here with the set of final states, because 𝒯 must halt for all inputs; only strings w ∈ X* are considered as inputs.) We construct a function g which maps every string w ∈ X* to a Boolean expression g(w) over Σ = {x, 0, 1, ¬, ∧, ∨, (, )} in CNF, so that it holds that: w ∈ L ⟺ g(w) ∈ SAT. If we show that this function g : X* → Σ* is computable in polynomial time, then L ≤_p SAT follows. Let the tape alphabet be Γ = {c_1, …, c_ℓ} with c_1 = ␣ = the blank symbol.
Without loss of generality let

δ : Q × Γ → 2^(Q × Γ × {−1, +1, 0}),

where −1 stands for "move left" (L), +1 for "move right" (R) and 0 for "stay" (S), with δ(q, c) = ∅ for q ∈ F.

We make δ complete by changing every δ(q, c) = ∅ into δ(q, c) = {(q, c, 0)}. Thus the empty set never appears as an image of δ. Because the end of a computation was previously indicated by δ(q, c) = ∅, by this change we obtain a non-terminating Turing machine: if 𝒯 accepts w, then 𝒯 after making p(|w|) steps reaches a configuration u_1 q u_2 with exactly this final state q and stays in this configuration infinitely long; the opposite direction also holds. Thus:

w ∈ L ⟺ 𝒯 applied to w is after p(|w|) steps in a configuration u_1 q u_2 with q ∈ F,

where p is a polynomial bounding the computing time of 𝒯.
Let the states be numbered so that Q = {q_1, …, q_s}, where q_1 is the initial state and q_r, …, q_s are the final states, and let the tuples of δ be numbered from 1 to m.

Let w ∈ X* be given, |w| = n, w = c_{j_1} c_{j_2} … c_{j_n}. The formula g(w) is constructed using the following variables:
Meaning of the variables:

z_{t,k} (1 ≤ t ≤ p(n), 1 ≤ k ≤ s): z_{t,k} = 1 ⟺ 𝒯 is in state q_k at the time point t.

a_{t,i,j} (1 ≤ t ≤ p(n), −p(n) ≤ i ≤ p(n), 1 ≤ j ≤ ℓ): a_{t,i,j} = 1 ⟺ c_j is the content of the square i at the time point t.

s_{t,i} (1 ≤ t ≤ p(n), −p(n) ≤ i ≤ p(n)): s_{t,i} = 1 ⟺ the head is on the square i at the time point t.

b_{t,l} (1 ≤ t ≤ p(n), 1 ≤ l ≤ m): b_{t,l} = 1 ⟺ at the time point t the l-th tuple of δ is chosen for the transition.

The formula g(w) expresses the following conditions:

(1) Initial configuration: 𝒯 is in state q_1, ␣^(p(n)+1) w ␣^(p(n)−n) is on the tape, and the head is at the square 1. Time point: t = 1.
(2) Final configuration (provided that w is accepted): 𝒯 is in a state q_j with r ≤ j ≤ s at the time point t = p(n).
(3) Transition condition:

A_3 = ⋀_{1≤t≤p(n)} ( A_3^state(t) ∧ A_3^place(t) ∧ A_3^field(t) ∧ A_3^U(t) )

with

A_3^state(t) = exactlyone(z_{t,1}, …, z_{t,s})
A_3^place(t) = exactlyone(s_{t,−p(n)}, s_{t,−p(n)+1}, …, s_{t,p(n)})
A_3^field(t) = ⋀_{−p(n)≤i≤p(n)} exactlyone(a_{t,i,1}, …, a_{t,i,ℓ})
A_3^U(t) = exactlyone(b_{t,1}, …, b_{t,m})
Again this formula is in CNF. A_3 has

p(n) · (s² + (2·p(n) + 1)² + (2·p(n) + 1) · ℓ² + m²)

variables. A_3 describes exactly the transition condition (3).
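The building block exactlyone(y_1, …, y_r) can be expressed in CNF in the standard way by one clause demanding "at least one" and pairwise clauses demanding "at most one"; assuming this is the construction intended here, it also explains the quadratic terms s², ℓ² and m² in the size estimate above. A Python sketch:

from itertools import combinations

def exactlyone(ys):
    # One clause (y_1 ∨ ... ∨ y_r) plus all clauses (¬y_i ∨ ¬y_j)
    # for i < j: altogether 1 + r(r-1)/2 clauses, i.e. size O(r^2).
    # A literal is a string y or a pair ("¬", y); representation ours.
    return [tuple(ys)] + [(("¬", a), ("¬", b))
                          for a, b in combinations(ys, 2)]

print(exactlyone(["z_t,1", "z_t,2", "z_t,3"]))   # A_3^state(t) for s = 3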
For (4): A_4 = ⋀_{1≤t<p(n)} A_4(t).
Now we must consider the tuples of δ in more detail. For l = 1, 2, …, m let the l-th tuple of δ be given by

tuple_l = (q_{k_l}, c_{j_l}, q_{k′_l}, c_{j′_l}, d_l) ∈ Q × Γ × Q × Γ × {+1, −1, 0},

i.e. in state q_{k_l} reading c_{j_l}, the machine 𝒯 may enter state q_{k′_l}, write c_{j′_l} and move the head by d_l.
Put:

A_4(t) = ⋀_{−p(n)≤i≤p(n)} [ ⋀_{1≤j≤ℓ} (s_{t,i} ∨ ¬a_{t,i,j} ∨ a_{t+1,i,j})
    ∧ ⋀_{1≤l≤m} ( (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t,k_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t,i,j_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t+1,k′_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t+1,i,j′_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ s_{t+1,i+d_l}) ) ]
Comment: (s_{t,i} ∨ ¬a_{t,i,j} ∨ a_{t+1,i,j}) is 1 ⟺ if the head is not on the square i at the time point t (i.e. s_{t,i} = 0), then the square i keeps its content from the time point t to t + 1.

In a similar way we read the remaining clauses:

(¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t,k_l}) is 1 ⟺ if 𝒯 was at the time point t at the square i and the l-th tuple was chosen for the transition, then 𝒯 is at the time point t in the state q_{k_l}; similarly for the other four clauses.
Statement 1: g(w) contains only polynomially many variables, say p′(n) of them (for a suitable polynomial p′(n)), and g(w) can be written down in O(p′(n)²) steps, i.e. in polynomial time.

Statement 2: g is a reduction, i.e. for all w ∈ X*: w ∈ L ⟺ g(w) ∈ SAT.