R* = ⋃_{n∈ℕ} Rⁿ (the reflexive and transitive closure of a relation R).
Functions are special relations. A relation f ⊆ X × Y is called a partial function (or a partial mapping) from X to Y, if f is right-unique. It is denoted as follows: f : X ⇀ Y. Instead of (x, y) ∈ f we write f(x) = y. f is a (total) function from X to Y, if f is additionally left-total. This is denoted by f : X → Y.
A function f : X → Y is
injective, if f is left-unique,
surjective, if f is right-total,
bijective, if f is injective and surjective.
A bijective function is also called a bijection.
3 Alphabets, strings and languages
In this lecture we will pay special attention to analysing formal languages. For this purpose the following notation will be used.
Alphabet Σ = finite set of characters (symbols) = character set.
We use A, B, Σ, Γ as typical names for alphabets and a, b as typical names for characters, i.e. elements of alphabets.
String over an alphabet Σ = finite sequence of characters from Σ. In particular, there is the empty string ε. We use u, v, w as typical names for strings.
Example: Let Σ = {1, 2, +}. Then 1 + 2 and 2 + 1 are strings over Σ.
Σ* = set of all strings over Σ; in particular ε ∈ Σ* holds.
Σ⁺ = set of all non-empty strings over Σ, i.e. Σ* = Σ⁺ ∪ {ε}.
|u| = length of the string u = number of characters that occur in u.
In particular: |ε| = 0.
The symbols of a string u of length n are denoted by u₁, …, uₙ.
The concatenation of two strings u and v consists of the characters of u, followed by the characters of v, and is denoted by u · v or simply by uv.
Example: The concatenation of 1+ and 2 + 0 is 1 + 2 + 0.
A string v is called a substring of a string w, if ∃u₁, u₂ ∈ Σ* : w = u₁ v u₂.
A string v is called a prefix of a string w, if ∃u ∈ Σ* : w = vu.
A string v is called a suffix of a string w, if ∃u ∈ Σ* : w = uv.
There may be several occurrences of the same substring in a string w.
Example: The string w = 1 + 1 + 1 has two occurrences of the substring 1+ and three occurrences of the substring 1.
A (formal) language over an alphabet Σ is a subset of Σ*. Since languages are sets, the usual set operations apply, e.g. the complement L̄ = Σ* \ L.
Furthermore, there are special operations on languages.
The concatenation of strings is extended to languages L₁ and L₂ as follows:
L₁ · L₂ = {u · v | u ∈ L₁ and v ∈ L₂}.
The n-th power of a language L is defined inductively:
L⁰ = {ε} and Lⁿ⁺¹ = L · Lⁿ.
The (Kleene) star operator (also Kleene closure or iteration) of a language L is
L* = ⋃_{n∈ℕ} Lⁿ = {w₁ … wₙ | n ∈ ℕ and w₁, …, wₙ ∈ L}.
For all L it holds that ε ∈ L*.
Example: For L₁ = {1+, 2+} and L₂ = {1+0, 1+1} we have
L₁ · L₂ = {1+1+0, 1+1+1, 2+1+0, 2+1+1},
L₁² = {1+1+, 1+2+, 2+1+, 2+2+},
L₁* = {ε, 1+, 2+, 1+1+, 1+2+, 2+1+, 2+2+, 1+1+1+, 1+1+2+, …}.
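These operations are directly executable for finite languages. The following Python sketch is our own illustration (not part of the original notes); the function names and the truncation bound max_n are ours, and the star is necessarily approximated by a finite union:

def concat(L1, L2):
    """L1 . L2 = { u.v | u in L1 and v in L2 }"""
    return {u + v for u in L1 for v in L2}

def power(L, n):
    """L^0 = {''} and L^(n+1) = L . L^n"""
    result = {""}
    for _ in range(n):
        result = concat(L, result)
    return result

def star(L, max_n):
    """Finite approximation of L* = union of L^n for n = 0, ..., max_n."""
    result = set()
    for n in range(max_n + 1):
        result |= power(L, n)
    return result

L1 = {"1+", "2+"}
L2 = {"1+0", "1+1"}
print(sorted(concat(L1, L2)))   # 1+1+0, 1+1+1, 2+1+0, 2+1+1
print(sorted(power(L1, 2)))     # 1+1+, 1+2+, 2+1+, 2+2+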
4 Bibliography
The following books are recommended for further reading:
J.E. Hopcroft, R. Motwani & J.D. Ullman: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 2nd edition, 2001.
U. Schöning: Theoretische Informatik kurzgefasst. Spektrum Akademischer Verlag, 5. Auflage, 2008.
D. Harel: Algorithmics: The Spirit of Computing. Addison-Wesley, 1987.
The following sources were used during the preparation of the lecture:
J. Albert & T. Ottmann: Automaten, Sprachen und Maschinen für Anwender. BI, 1983 (introductory book only).
E. Börger: Berechenbarkeit, Komplexität, Logik. Vieweg, Braunschweig, 1986 (2. Auflage).
W. Brauer: Automatentheorie. Teubner Verlag, Stuttgart, 1984.
E. Engeler & P. Läuchli: Berechnungstheorie für Informatiker. Teubner, Stuttgart, 1988.
H. Hermes: Aufzählbarkeit, Entscheidbarkeit, Berechenbarkeit. 2. Auflage, Springer-Verlag, Berlin, 1971.
J.E. Hopcroft & J.D. Ullman: Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
H.R. Lewis & C.H. Papadimitriou: Elements of the Theory of Computation. Prentice Hall, Englewood Cliffs, 1981.
K.R. Reischuk: Einführung in die Komplexitätstheorie. Teubner Verlag, Stuttgart, 1990.
A. Salomaa: Computation and Automata. Cambridge University Press, Cambridge, 1985.
A. Salomaa: Formal Languages. Academic Press, New York, 1973.
F. Stetter: Grundbegriffe der Theoretischen Informatik. Springer-Verlag, Heidelberg, 1988 (introductory book).
W. Thomas: Grundzüge der Theoretischen Informatik. Lecture in the winter semester 1989/90 at RWTH Aachen.
Chapter II
Finite automata and regular languages
In the following chapter we deal with a simple model of a machine: the finite automaton. We will see that
finite automata are widely used in computer science,
the languages recognized by finite automata have many structural properties, e.g. representability by regular expressions,
questions about computations performed by finite automata are decidable.
Moreover, finite automata can be easily implemented as tables in programs and as circuits.
1 Finite automata
We want to use finite automata for recognizing languages. We can imagine a finite automaton as a machine with final states, which reads characters from a tape, can move its head only to the right and cannot print new symbols on the tape.
Therefore, the transition function of finite automata is often defined as a mapping δ : Q × Σ → Q, where Q is the set of states and Σ is the input alphabet of the automaton. This automaton can be applied to an input string w ∈ Σ*.
1.1 Definition: A deterministic finite automaton (or acceptor), in short DFA, is a 5-tuple A = (Σ, Q, δ, q₀, F), where
1. Σ is the input alphabet,
2. Q is a finite set of states,
3. δ ⊆ Q × Σ × Q is the transition relation, which is right-unique and left-total, i.e. ∀q ∈ Q, a ∈ Σ there is exactly one q′ ∈ Q : (q, a, q′) ∈ δ,
4. q₀ ∈ Q is the initial state,
5. F ⊆ Q is the set of final states.
Both representations of automata are related as follows:
δ(q, a) = q′ if and only if (q, a, q′) ∈ δ.
The elements (q, a, q′) ∈ δ are called transitions. We write q –a→ q′ instead of (q, a, q′) ∈ δ. For given a ∈ Σ we use –a→ also as a 2-place relation –a→ ⊆ Q × Q:
∀q, q′ ∈ Q : q –a→ q′ if and only if (q, a, q′) ∈ δ.
A DFA can be graphically represented by a finite state diagram. This is a directed graph which contains a vertex labelled q for every state q of the automaton and a directed labelled edge for every transition q –a→ q′.
Example: [State diagram of a DFA A₁ with two states q₀ and q₁: each state has a loop labelled 0, and two edges labelled 1 lead from q₀ to q₁ and from q₁ back to q₀.]
The single transitions q –a→ q′ are extended to strings w ∈ Σ*. For all q, q′ ∈ Q, a ∈ Σ and w ∈ Σ* we define inductively:
q =ε⇒ q′ if and only if q = q′,
or in terms of relations: =ε⇒ = id_Q;
q =aw⇒ q′ if and only if ∃q″ ∈ Q : q –a→ q″ and q″ =w⇒ q′,
or in terms of relations: =aw⇒ = –a→ ∘ =w⇒.
Alternatively, the transition function δ can be extended to a function δ* : Q × Σ* → Q with
δ*(q, ε) = q and δ*(q, aw) = δ*(δ(q, a), w)
for all q ∈ Q, a ∈ Σ and w ∈ Σ*. It holds that:
δ*(q, w) = q′ if and only if q =w⇒ q′.
1.2 Definition (acceptance and equivalence):
1. The language accepted (or recognized) by a DFA A = (Σ, Q, δ, q₀, F) is
L(A) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q} resp. L(A) = {w ∈ Σ* | δ*(q₀, w) ∈ F}.
A language L is (finitely) accepted, if there is a DFA A with L = L(A).
2. Two DFAs A₁ and A₂ are called equivalent, if L(A₁) = L(A₂) holds.
3. A state q ∈ Q is reachable in A, if ∃w ∈ Σ* : q₀ =w⇒ q.
Example: For the automaton A₁ from the last example it holds that:
L(A₁) = {w ∈ {0, 1}* | w contains an even number of 1s}.
Remark: For all a₁, …, aₙ ∈ Σ and q, q′ ∈ Q:
q =a₁…aₙ⇒ q′ if and only if ∃q₁, …, qₙ ∈ Q : q –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ = q′,
or in terms of relations:
=a₁…aₙ⇒ = –a₁→ ∘ … ∘ –aₙ→.
The sequence q –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ is also called a transition sequence.
Example (from the field of compiler construction): A compiler for a programming language works in several phases, which can sometimes run in parallel:
Lexical analysis: In this phase the so-called scanner breaks the input text into a sequence of tokens. These are identifiers, keywords and delimiters.
Syntax analysis: The generated token sequence is the input to the so-called parser, which decides whether this sequence is a syntactically correct program.
Code generation: The syntactic structure recognized by the parser is used to generate machine code.
Optimisation: The execution time of the program is improved, mostly by local changes of the machine code.
The lexical analysis is a simple task, which can be accomplished by finite automata. As an example let us consider the typical structure of identifiers in a programming language. We will consider a syntax diagram of MODULA:
[Syntax diagram 'Ident': a letter, followed by an arbitrary sequence of letters and digits; 'letter' stands for a, …, z, A, …, Z and 'digit' for 0, …, 9.]
The identifiers constructed this way can be recognized by the following finite automaton:
[State diagram: from the initial state an edge labelled a, …, Z leads to a final state, which has loops labelled a, …, Z and 0, …, 9.]
For the sake of clarity we have labelled the edges of the state diagram with several symbols.
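As an illustration, this identifier automaton can be implemented as a small table-driven program. The following Python sketch is our own addition; it assumes that 'letter' and 'digit' may be tested with Python's isalpha() and isdigit(), which accept slightly more characters than a, …, z, A, …, Z and 0, …, 9:

def is_identifier(w):
    state = 0                     # state 0: initial, state 1: final
    for ch in w:
        if state == 0 and ch.isalpha():
            state = 1             # first character must be a letter
        elif state == 1 and (ch.isalpha() or ch.isdigit()):
            state = 1             # afterwards letters and digits may follow
        else:
            return False          # no transition: reject
    return state == 1             # accept iff we end in the final state

assert is_identifier("x1") and not is_identifier("1x")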
Example (from the field of operating systems): If several programs try to access shared resources in a multitasking environment, then these attempts need to be synchronized. For example, consider two programs P₁ and P₂, which share one printer. P₁ and P₂ should be synchronized in such a way that they do not send data to the printer simultaneously. We construct a finite automaton that monitors the printer use by P₁ and P₂.
P₁ reports the beginning and the end of printing to the automaton by the symbols b₁ and e₁. P₂ behaves similarly to P₁ and uses the symbols b₂ and e₂ respectively.
At each point in time the behaviour of P₁ and P₂ regarding the printer is given by a finite string w ∈ {b₁, e₁, b₂, e₂}*.
[State diagram of the monitor automaton: an initial and final state 'printer free'; an edge b₁ leads to a state 'P₁ prints', left again by e₁; an edge b₂ leads to a state 'P₂ prints', left again by e₂; all remaining symbols (e.g. e₁, e₂ in the free state, or b₁, b₂, e₂ while P₁ prints) lead to a rejecting trap state that loops on b₁, b₂, e₁, e₂.]
Non-determinism
In many applications finite automata can be simplified, if non-determinism is allowed, i.e., we consider an arbitrary relation
δ ⊆ Q × Σ × Q
as transition relation. It may happen that for some q ∈ Q, a ∈ Σ there are several successor states q₁, …, qₙ, so that (in graphical representation) it holds that:
[Diagram: from state q several edges labelled a lead to the successor states q₁, …, qₙ.]
After accepting a the automaton moves non-deterministically from q to one of the successor states q₁, …, qₙ. A special case is n = 0; then for q ∈ Q and a ∈ Σ there is no successor state q′ with q –a→ q′. In this case the automaton stops and rejects the symbol a. These remarks lead to the following definition.
1.3 Definition: A non-deterministic finite automaton (or acceptor), in short NFA, is a 5-tuple
B = (Σ, Q, δ, q₀, F),
where Σ, Q, q₀ and F are defined as in DFAs and for δ it holds that:
δ ⊆ Q × Σ × Q.
Transitions are written (q –a→ q′) as in DFAs.
A graphical representation by a state diagram remains unchanged.
1.4 Definition (acceptance and equivalence):
(i) The language accepted (or recognized) by an NFA B = (Σ, Q, δ, q₀, F) is
L(B) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q}.
By NFA we denote the class of languages accepted by NFAs.
(ii) Two NFAs B₁ and B₂ are called equivalent, if L(B₁) = L(B₂) holds.
An NFA recognizes the string w, if while reading w we can reach one of the final states. It may be the case that, while reading w, there also exist other transition sequences, which end up in non-final states.
Obviously, a DFA is a special case of an NFA. Thus the equivalence between arbitrary finite automata is defined.
Example (suffix recognition): Consider an alphabet Σ and a string v = a₁ … aₙ with aᵢ ∈ Σ for i = 1, …, n.
We want to recognize the language
L_v = {wv | w ∈ Σ*},
which consists of all strings over Σ with the suffix v.
For this purpose we consider the NFA
B_v = (Σ, {q₀, …, qₙ}, δ, q₀, {qₙ}),
where δ is defined by the following state diagram of B_v:
[State diagram: q₀ has a loop labelled a for every a ∈ Σ; a chain of edges q₀ –a₁→ q₁ –a₂→ q₂ … –aₙ→ qₙ leads to the final state qₙ.]
B_v behaves non-deterministically in the initial state q₀: while reading a string, B_v can decide at each occurrence of a₁ whether to try to recognize the suffix v. In order to do so B_v moves to q₁ and now expects a₂ … aₙ as the rest of the suffix. Should this not be the case, then B_v stops at some i ∈ {1, …, n} and a ≠ aᵢ.
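The non-determinism of B_v can be simulated by tracking the set of currently possible states. The following Python sketch is our own illustration; the state numbers 0, …, n play the role of q₀, …, qₙ:

def accepts_suffix(v, w):
    """Simulate the suffix NFA B_v on input w."""
    states = {0}                      # start in q_0
    n = len(v)
    for a in w:
        nxt = set()
        for q in states:
            if q == 0:
                nxt.add(0)            # q_0 loops on every a in Sigma
                if a == v[0]:
                    nxt.add(1)        # guess that the suffix starts here
            elif q < n and a == v[q]:
                nxt.add(q + 1)        # continue matching a_1 ... a_n
        states = nxt
    return n in states                # accept iff q_n is reachable

assert accepts_suffix("01", "2001") and not accepts_suffix("01", "010")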
The question arises: Do NFAs accept more languages than DFAs?
The answer is no.
1.5 Theorem (Rabin and Scott, 1959): For every NFA there exists an equivalent DFA.
Proof: Consider an NFA B = (Σ, Q, δ, q₀, F). We introduce a DFA A with L(A) = L(B) using the following powerset construction:
A = (Σ, P(Q), δ_A, {q₀}, F_A),
where for S, S′ ⊆ Q and a ∈ Σ:
S –a→_A S′ if and only if S′ = {q′ ∈ Q | ∃q ∈ S : q –a→ q′}.
Let the set of final states be F_A = {S ⊆ Q | S ∩ F ≠ ∅}.
There exists a state in A for each subset S of the set of states Q of B (we denote such a state by S). Collecting all a-successors of all states q ∈ S results in S –a→_A S′. We see that δ_A is deterministic, thus:
∀S ⊆ Q ∀a ∈ Σ there is exactly one S′ ⊆ Q : S –a→_A S′.
Furthermore, for all q, q′ ∈ Q, S, S′ ⊆ Q and a ∈ Σ it holds that:
(i) If q –a→ q′ and q ∈ S,
then ∃S′ ⊆ Q : S –a→_A S′ and q′ ∈ S′.
(ii) If S –a→_A S′ and q′ ∈ S′,
then ∃q ∈ Q : q –a→ q′ and q ∈ S.
This way we can easily show that L(A) = L(B). Let w = a₁ … aₙ ∈ Σ* with aᵢ ∈ Σ for i = 1, …, n. Then it holds that:
w ∈ L(B) ⟺ ∃q ∈ F : q₀ =w⇒ q
⟺ ∃q₁, …, qₙ ∈ Q : q₀ –a₁→ q₁ … qₙ₋₁ –aₙ→ qₙ with qₙ ∈ F
(⟸ follows from (i) and ⟹ follows from (ii))
⟺ ∃S₁, …, Sₙ ⊆ Q : {q₀} –a₁→_A S₁ … Sₙ₋₁ –aₙ→_A Sₙ with Sₙ ∩ F ≠ ∅
⟺ ∃S ∈ F_A : {q₀} =w⇒_A S
⟺ w ∈ L(A).
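The powerset construction is directly implementable. The following Python sketch is our own addition (the dictionary encoding of δ is an assumption); unlike the full P(Q) used in the proof, it builds only the subsets reachable from {q₀}, which suffices for acceptance:

def powerset_dfa(sigma, delta, q0, finals):
    """Rabin-Scott construction: delta maps (state, symbol) to a set of states."""
    start = frozenset({q0})
    states, worklist, trans = {start}, [start], {}
    while worklist:
        S = worklist.pop()
        for a in sigma:
            # S' = { q' | exists q in S with q -a-> q' }
            S2 = frozenset(q2 for q in S for q2 in delta.get((q, a), ()))
            trans[(S, a)] = S2
            if S2 not in states:
                states.add(S2)
                worklist.append(S2)
    dfa_finals = {S for S in states if S & finals}   # S intersects F
    return states, trans, start, dfa_finals

# the suffix-recognition NFA for 01 over {0,1,2} (see the example below):
delta = {("q0", a): {"q0"} for a in "012"}
delta[("q0", "0")] = {"q0", "q1"}
delta[("q1", "1")] = {"q2"}
states, trans, start, finals = powerset_dfa("012", delta, "q0", {"q2"})
print(len(states))   # only the reachable subsets are constructed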
Example: We consider the suffix recognition of 01 in strings over the alphabet Σ = {0, 1, 2}. According to the previous example the language L₀₁ = {w01 | w ∈ Σ*} is recognized by the NFA B₀₁ with the following diagram:
[State diagram: q₀ has a loop labelled 0, 1, 2; an edge labelled 0 leads from q₀ to q₁ and an edge labelled 1 from q₁ to the final state q₂.]
Now we apply the powerset construction from the Rabin-Scott theorem to B₀₁. As a result we get the following DFA A₀₁:
[State diagram of A₀₁: the reachable part consists of the states {q₀}, {q₀, q₁} and {q₀, q₂} (final); reading 0 always leads to {q₀, q₁}, reading 1 leads from {q₀, q₁} to {q₀, q₂} and otherwise to {q₀}, and reading 2 always leads back to {q₀}. The remaining states ∅, {q₁}, {q₂}, {q₁, q₂} and {q₀, q₁, q₂} appear in boxes.]
The fragments in boxes are unreachable from the initial state {q₀} of A₀₁. Therefore, A₀₁ can be simplified to the following equivalent DFA:
[State diagram of the simplified DFA, consisting only of the three reachable states from above.]
The number of states of this DFA cannot be reduced further. There are examples where the DFA constructed in the proof of the Rabin-Scott theorem cannot be simplified at all, i.e., if the given NFA has n states, then in the worst case the DFA actually needs 2ⁿ states.
Spontaneous transitions
Automata can be defined even more conveniently, if ε-transitions are allowed in addition to non-determinism. These are transitions which an automaton makes spontaneously, without reading a symbol of the input string.
1.6 Definition: A nondeterministic finite automaton (acceptor) with ε-transitions, in short ε-NFA, is a 5-tuple
B = (Σ, Q, δ, q₀, F),
where Σ, Q, q₀ and F are defined as in NFAs or DFAs and for the transition relation it holds that:
δ ⊆ Q × (Σ ∪ {ε}) × Q.
Transitions q –ε→ q′ are called ε-transitions or spontaneous transitions.
In order to define the acceptance behaviour of ε-NFAs, we need an extended transition relation =w⇒. First, for q, q′ ∈ Q it holds that:
q –ε→ q′ if and only if (q, ε, q′) ∈ δ.
Here we use the standard infix notation for 2-place relations, i.e., q –ε→ q′ instead of (q, q′) ∈ –ε→. The relation –ε→ is called the ε-transition relation.
Therefore, we can use the composition ∘ for such transition relations. For all ρ, σ ⊆ Q × Q the composition ρ ∘ σ is defined as follows: for q, q′ ∈ Q it holds that
q (ρ ∘ σ) q′ if and only if ∃q″ ∈ Q : q ρ q″ and q″ σ q′.
1.7 Definition: Let B = (Σ, Q, δ, q₀, F) be an ε-NFA. Then a 2-place relation =w⇒ ⊆ Q × Q is defined inductively for every string w ∈ Σ*:
q =ε⇒ q′, if ∃n ≥ 0 : q –ε→ … –ε→ (n times) q′,
q =aw⇒ q′, if q (=ε⇒ ∘ –a→ ∘ =w⇒) q′,
where q, q′ ∈ Q, a ∈ Σ and w ∈ Σ*.
Remark: For all q, q′ ∈ Q and a₁, …, aₙ ∈ Σ it holds that:
(i) q =ε⇒ q always holds, but q –ε→ q holds only if there is an explicit ε-transition from q to itself.
(ii) =a₁…aₙ⇒ = =ε⇒ ∘ –a₁→ ∘ =ε⇒ ∘ … ∘ =ε⇒ ∘ –aₙ→ ∘ =ε⇒.
(iii) The relation =ε⇒ is decidable.
Proof: (i) and (ii) follow directly from the definition.
(iii): Let k be the number of states in Q. Then:
q =ε⇒ q′ ⟺ ∃n ≤ k − 1 : q –ε→ … –ε→ (n times) q′.
If at least k transitions –ε→ take place one after the other, then some state repeats on the path from q to q′, and the path can be shortened. Thus (iii) is proved.
1.8 Definition (acceptance): Let B = (Σ, Q, δ, q₀, F) be an ε-NFA.
(i) The language accepted (or recognized) by B is
L(B) = {w ∈ Σ* | ∃q ∈ F : q₀ =w⇒ q}.
(ii) Two ε-NFAs B₁ and B₂ are called equivalent, if L(B₁) = L(B₂).
Obviously an NFA is a special case of an ε-NFA. Therefore, the equivalence of NFAs and ε-NFAs is defined as well.
Example: For the input alphabet Σ = {0, 1, 2} we consider the ε-NFA B defined by the following state diagram:
[State diagram: state q₀ with a loop labelled 0, an ε-edge from q₀ to q₁, which has a loop labelled 1, and an ε-edge from q₁ to the final state q₂, which has a loop labelled 2.]
Then:
L(B) = {w ∈ {0, 1, 2}* | ∃k, l, m ≥ 0 : w = 0…0 (k times) 1…1 (l times) 2…2 (m times)}.
From a given ε-NFA the ε-transitions can always be removed:
1.9 Theorem: For every ε-NFA there exists an equivalent NFA.
Proof: Let B = (Σ, Q, δ, q₀, F) be an ε-NFA. We construct the NFA A = (Σ, Q, δ_A, q₀, F_A), where for all q, q′ ∈ Q and a ∈ Σ it holds that:
q –a→_A q′ if and only if q (=ε⇒ ∘ –a→) q′ in B, therefore if and only if ∃q″ ∈ Q : q =ε⇒ q″ –a→ q′ in B,
q ∈ F_A if and only if ∃q′ ∈ F : q =ε⇒ q′ in B.
Now let w = a₁ … aₙ with n ≥ 1 and aᵢ ∈ Σ for i = 1, …, n. Then:
w ∈ L(A) ⟺ ∃q ∈ F_A : q₀ =w⇒_A q
⟺ ∃q ∈ F_A : q₀ (–a₁→_A ∘ … ∘ –aₙ→_A) q
⟺ ∃q ∈ F_A : q₀ (=ε⇒ ∘ –a₁→ ∘ … ∘ =ε⇒ ∘ –aₙ→) q
(⟹: choose q′ ∈ F with q =ε⇒ q′; ⟸: choose q = q′)
⟺ ∃q′ ∈ F : q₀ (=ε⇒ ∘ –a₁→ ∘ … ∘ –aₙ→ ∘ =ε⇒) q′
⟺ ∃q′ ∈ F : q₀ =w⇒ q′
⟺ w ∈ L(B).
In the special case w = ε the above argument reduces as follows:
ε ∈ L(A) ⟺ q₀ ∈ F_A ⟺ ∃q ∈ F : q₀ =ε⇒ q ⟺ ε ∈ L(B).
Remarks:
(i) In the proof F_A could also be defined as follows:
F_A = F ∪ {q₀}, if ∃q ∈ F : q₀ =ε⇒ q,
F_A = F otherwise.
Compare with the proof in Hopcroft & Ullman.
(ii) The NFA A can be computed from the given ε-NFA B, because the relation q =ε⇒ q′ is decidable: compare with the previous remark.
(iii) If there are ε-cycles in the ε-NFA B, then the set of states of the NFA A can be reduced by a preceding contraction of the ε-cycles; every ε-cycle can be replaced by one state.
Example: We apply the construction introduced in the proof to the ε-NFA B from the previous example. As the result we have the following NFA A:
[State diagram: all three states q₀, q₁, q₂ are final; q₀ has a loop labelled 0 and edges labelled 1 to q₁ and 2 to q₂; q₁ has a loop labelled 1 and an edge labelled 2 to q₂; q₂ has a loop labelled 2.]
It is easy to check that L(A) = L(B) actually holds.
Therefore, we have the following result:
DFA = NFA = ε-NFA,
i.e., the classes of languages accepted by DFAs, NFAs and ε-NFAs coincide; we call it the class of finitely accepted languages. If we want to show properties of this class, we can choose the type of automaton best suited for the purpose.
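The ε-elimination from the proof above can be implemented directly. The following Python sketch is our own illustration; eps maps each state to its immediate ε-successors, delta maps (state, symbol) to successor sets, and the construction follows the definitions of δ_A and F_A given in the proof:

def eps_closure(q, eps):
    """All q' with q =eps=> q' (paths of length < |Q| suffice, see the remark)."""
    closure, stack = {q}, [q]
    while stack:
        for q2 in eps.get(stack.pop(), ()):
            if q2 not in closure:
                closure.add(q2)
                stack.append(q2)
    return closure

def eliminate_eps(states, sigma, delta, eps, finals):
    new_delta, new_finals = {}, set()
    for q in states:
        cl = eps_closure(q, eps)
        if cl & finals:                  # q in F_A iff q =eps=> some final state
            new_finals.add(q)
        for a in sigma:
            new_delta[(q, a)] = {q3 for q2 in cl
                                 for q3 in delta.get((q2, a), ())}
    return new_delta, new_finals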
2 Closure properties
Now we explore under which operations the class of finitely accepted languages is closed. For this purpose we consider the set operations union, intersection, complement and difference, as well as operations on languages such as concatenation and iteration (Kleene star operator), which were introduced in Chapter I.
2.1 Theorem: The class of finitely acceptable languages is closed under the following operations:
1. union,
2. complement,
3. intersection,
4. difference,
5. concatenation,
6. iteration.
Proof: Let L₁, L₂ ⊆ Σ* be finitely acceptable, say Lᵢ = L(Aᵢ) for DFAs Aᵢ = (Σ, Qᵢ, δᵢ, q₀ᵢ, Fᵢ), i = 1, 2, where w.l.o.g. Q₁ ∩ Q₂ = ∅. For L₁ ∪ L₂, L₁ · L₂ and L₁* we shall provide accepting ε-NFAs. This makes the task less complicated.
1. L₁ ∪ L₂: Let us construct the ε-NFA B = (Σ, {q₀} ∪ Q₁ ∪ Q₂, δ, q₀, F₁ ∪ F₂), where
q₀ ∉ Q₁ ∪ Q₂ and δ = {(q₀, ε, q₀₁), (q₀, ε, q₀₂)} ∪ δ₁ ∪ δ₂
hold. B is graphically represented as follows:
[Diagram: the new initial state q₀ has ε-edges to q₀₁ and q₀₂, the initial states of A₁ and A₂.]
It is easy to show that L(B) = L₁ ∪ L₂ holds.
2. L̄₁: Since A₁ is deterministic and total, we can accept the complement L̄₁ = Σ* \ L₁ by exchanging final and non-final states: consider the DFA A = (Σ, Q₁, δ₁, q₀₁, Q₁ \ F₁). For all w ∈ Σ* it holds that:
w ∈ L(A) ⟺ ∃q ∈ Q₁ \ F₁ : q₀₁ =w⇒₁ q
(determinism) ⟺ ¬∃q ∈ F₁ : q₀₁ =w⇒₁ q
⟺ w ∉ L(A₁)
⟺ w ∉ L₁.
Therefore, L(A) = L̄₁ holds.
3. L₁ ∩ L₂: is obvious, because L₁ ∩ L₂ = Σ* \ (L̄₁ ∪ L̄₂) holds.
4. L₁ \ L₂: is obvious, because L₁ \ L₂ = L₁ ∩ L̄₂ holds.
5. L₁ · L₂: Construct the ε-NFA B = (Σ, Q₁ ∪ Q₂, δ, q₀₁, F₂) with
δ = δ₁ ∪ {(q, ε, q₀₂) | q ∈ F₁} ∪ δ₂.
B is graphically represented as follows:
[Diagram: the automaton A₁ with final states q₁₁, …, q₁ₘ ∈ F₁, each connected by an ε-edge to the initial state q₀₂ of A₂, whose final states q₂₁, …, q₂ₙ ∈ F₂ are the final states of B.]
It is easy to show that L(B) = L₁ · L₂ holds.
6. L₁*: For the iteration we construct the ε-NFA
B = (Σ, {q₀} ∪ Q₁, δ, q₀, {q₀}), where q₀ ∉ Q₁ and
δ = {(q₀, ε, q₀₁)} ∪ δ₁ ∪ {(q, ε, q₀) | q ∈ F₁}.
[Diagram: the new initial and final state q₀ has an ε-edge to q₀₁; every final state q₁, …, qₙ ∈ F₁ of A₁ has an ε-edge back to q₀.]
It is easy to show that L(B) = L₁* holds.
Thus the proof is complete.
Remark: There is also an interesting direct construction of accepting DFAs for the union and intersection. Let A₁ and A₂ be as in the above proof. Then we consider the following transition relation δ ⊆ Q × Σ × Q on the cartesian product Q = Q₁ × Q₂ of the sets of states:
for all q₁, q₁′ ∈ Q₁ and q₂, q₂′ ∈ Q₂ and a ∈ Σ it holds that
(q₁, q₂) –a→ (q₁′, q₂′) if and only if q₁ –a→₁ q₁′ and q₂ –a→₂ q₂′.
The relation –a→ models the simultaneous parallel progress of the automata A₁ and A₂ while reading the symbol a.
Consider the DFAs
A∪ = (Σ, Q, δ, (q₀₁, q₀₂), F∪),
A∩ = (Σ, Q, δ, (q₀₁, q₀₂), F∩)
with
F∪ = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ or q₂ ∈ F₂},
F∩ = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ and q₂ ∈ F₂}.
Then it holds that L(A∪) = L₁ ∪ L₂ and L(A∩) = L₁ ∩ L₂.
Proof: We show the statement for A∩; the argument for A∪ is analogous. For all w ∈ Σ* it holds that:
w ∈ L(A∩) ⟺ ∃(q₁, q₂) ∈ F∩ : (q₀₁, q₀₂) =w⇒ (q₁, q₂)
⟺ ∃q₁ ∈ F₁, q₂ ∈ F₂ : q₀₁ =w⇒₁ q₁ and q₀₂ =w⇒₂ q₂
⟺ w ∈ L(A₁) and w ∈ L(A₂)
⟺ w ∈ L₁ ∩ L₂.
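The product construction translates directly into code. The following Python sketch is our own addition; delta1 and delta2 are assumed to be total transition functions given as dictionaries:

def product_dfa(delta1, q01, F1, delta2, q02, F2, mode="and"):
    """Build the product automaton; mode 'and' yields A-cap, 'or' yields A-cup."""
    def delta(pair, a):
        q1, q2 = pair
        return (delta1[(q1, a)], delta2[(q2, a)])   # run both DFAs in parallel
    def is_final(pair):
        in1, in2 = pair[0] in F1, pair[1] in F2
        return (in1 and in2) if mode == "and" else (in1 or in2)
    return delta, (q01, q02), is_final

def accepts(delta, start, is_final, w):
    q = start
    for a in w:
        q = delta(q, a)
    return is_final(q)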
3 Regular expressions
Finitely accepted languages can also be described by terms, so-called regular expressions.
3.1 Definition:
1. The set of regular expressions re over an alphabet Σ is defined inductively: ∅, ε and a (for every a ∈ Σ) are regular expressions, and if re₁, re₂ and re are regular expressions, then so are (re₁ + re₂), (re₁ · re₂) and re*.
2. Every regular expression re over Σ describes a language L(re) ⊆ Σ*, which is inductively defined as follows:
L(∅) = ∅
L(ε) = {ε}
L(a) = {a} for a ∈ Σ
L((re₁ + re₂)) = L(re₁) ∪ L(re₂)
L((re₁ · re₂)) = L(re₁) · L(re₂)
L(re*) = (L(re))*
3. A language L ⊆ Σ* is called regular, if there is a regular expression re over Σ with L = L(re).
Examples:
2. The language over {b₁, e₁, b₂, e₂} used for the synchronization of two programs, which share a printer, is described by the following regular expression:
re₂ = (b₁ e₁ + b₂ e₂)* (ε + b₁ + b₂).
3. Let Σ = {a₁, …, aₙ, b₁, …, bₘ} and v = a₁ … aₙ. Then the language L_v = {wv | w ∈ Σ*} from the suffix-recognition example is described by the regular expression
re₃ = (a₁ + … + aₙ + b₁ + … + bₘ)* · a₁ · … · aₙ.
It holds that: L(re₃) = L_v.
Now we will show that a more general statement holds:
3.2 Theorem (Kleene): A language is regular if and only if it is finitely acceptable.
Proof: We consider a language L ⊆ Σ*.
⟹: For a regular expression re over Σ it holds that L = L(re). We show using induction over the structure of re that L(re) is finitely acceptable.
Basis: L(∅), L(ε) and L(a) are obviously finitely acceptable for a ∈ Σ.
Induction step: Let L(re), L(re₁) and L(re₂) be finitely acceptable. Then L((re₁ + re₂)), L((re₁ · re₂)) and L(re*) are also finitely acceptable, because the class of finitely acceptable languages is closed under union, concatenation and iteration.
⟸: It holds that L = L(A) for a DFA A with n states. W.l.o.g. A = (Σ, Q, δ, 1, F), where Q = {1, …, n}. For i, j ∈ {1, …, n} and k ∈ {0, 1, …, n} we define
L^k_{i,j} = {w ∈ Σ* | i =w⇒ j, and for all u ∈ Σ* and l ∈ Q: from (∃v : v ≠ ε, v ≠ w and uv = w) and i =u⇒ l it follows that l ≤ k}.
L^k_{i,j} consists of all strings w such that the automaton A moves from state i to state j while reading the string w, and all intermediate states passed while reading w have a number ≤ k.
Now we show by induction on k that the languages L^k_{i,j} are all regular.
k = 0: For strings from L⁰_{i,j} no intermediate state can be used. Thus it holds that:
L⁰_{i,j} = {a ∈ Σ | i –a→ j}, if i ≠ j,
L⁰_{i,j} = {a ∈ Σ | i –a→ j} ∪ {ε}, if i = j.
Therefore, L⁰_{i,j} is regular as a finite subset of Σ*.
k → k + 1: Let the languages L^k_{i,j} be regular for all i, j ∈ {1, …, n}. Then for all i, j ∈ {1, …, n} it holds that:
L^{k+1}_{i,j} = L^k_{i,j} ∪ L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j}.
In order to go from the state i to the state j, the intermediate state k + 1 is either unnecessary, in which case L^k_{i,j} suffices for the description; or the state k + 1 is used as an intermediate state one or several times, in which case L^k_{i,k+1} · (L^k_{k+1,k+1})* · L^k_{k+1,j} is used for the description. Therefore, by induction on k we get:
L^{k+1}_{i,j} is regular.
From the regularity of the languages L^k_{i,j} we conclude that L itself is regular, because it holds that
L = L(A) = ⋃_{j∈F} L^n_{1,j}.
Thus we have proved the statement of Kleene.
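The inductive construction of the languages L^k_{i,j} can be turned into an algorithm that computes a regular expression from a DFA. The following Python sketch is our own illustration; it produces unsimplified expressions, writing 'eps' for ε and '+' for union (an empty result denotes the empty language):

def dfa_to_regex(n, sigma, delta, finals):
    """States 1..n, delta maps (i, a) to j; returns a regex for L(A)."""
    # R[i][j] holds a regular expression for L^0_{i,j} (None = empty language)
    R = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            atoms = [a for a in sigma if delta.get((i, a)) == j]
            if i == j:
                atoms.append("eps")        # the empty path stays in state i
            R[i][j] = "+".join(atoms) if atoms else None
    for k in range(1, n + 1):              # now allow k as intermediate state
        R2 = [[None] * (n + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                parts = [R[i][j]] if R[i][j] else []
                if R[i][k] and R[k][j]:
                    loop = f"({R[k][k]})*" if R[k][k] else ""
                    parts.append(f"({R[i][k]}){loop}({R[k][j]})")
                R2[i][j] = "+".join(parts) if parts else None
        R = R2
    return "+".join(R[1][j] for j in finals if R[1][j])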
Remark: As an alternative to the above construction of a regular expression from a given finite automaton we can solve guarded regular systems of equations.
4 Structural properties of regular languages
As the regular languages are exactly the finitely acceptable languages, we can derive important characteristics of regular languages from the finiteness of the sets of states of the accepting automata. First we consider the so-called pumping lemma, which gives us a necessary condition for a language to be regular. Let Σ be an arbitrary alphabet.
4.1 Theorem (the pumping lemma for regular languages): For every regular language L ⊆ Σ* there exists a number n ∈ ℕ, so that for all strings z ∈ L with |z| ≥ n there is a decomposition z = uvw with v ≠ ε and |uv| ≤ n, and for all i ∈ ℕ it holds that uvⁱw ∈ L, i.e., we can pump the substring v and the resulting string will remain in the regular language L.
In quantifier notation:
L ⊆ Σ* regular ⟹ ∃n ∈ ℕ ∀z ∈ L with |z| ≥ n ∃u, v, w ∈ Σ* : z = uvw ∧ v ≠ ε ∧ |uv| ≤ n ∧ ∀i ∈ ℕ : uvⁱw ∈ L.
Proof: Let L ⊆ Σ* be regular.
According to the Kleene theorem there is a DFA A = (Σ, Q, δ, q₀, F) with L = L(A). We choose n = |Q| and consider a string z ∈ L with |z| ≥ n. Then A must go through at least one state twice while reading z. More precisely it holds that:
Let z = a₁ … a_r with r = |z| ≥ n and aᵢ ∈ Σ for i = 1, …, r, and let q₁, …, q_r ∈ Q be defined as follows: q_{i−1} –aᵢ→ qᵢ for i = 1, …, r. Then there are j, k with 0 ≤ j < k ≤ n ≤ r, so that q_j = q_k holds. In graphical representation:
[Diagram: a path from q₀ via a₁ … a_j to q_j, a loop at q_j labelled a_{j+1} … a_k (since q_k = q_j), and a path labelled a_{k+1} … a_r from q_j to q_r ∈ F.]
Let u = a₁ … a_j, v = a_{j+1} … a_k, w = a_{k+1} … a_r. Then according to the properties of j and k it holds that v ≠ ε and |uv| ≤ n. Besides, it is clear that the automaton A, while recognizing, can go through the q_j-loop any number of times, i.e., for all i ∈ ℕ it holds that uvⁱw ∈ L.
We consider a typical application of the pumping lemma, where we prove that certain languages are not regular.
Example: The language L = {aⁿbⁿ | n ∈ ℕ} is not regular. We provide a proof by contradiction.
Hypothesis: L is regular. According to the pumping lemma, there is an n ∈ ℕ with the given properties. Now consider z = aⁿbⁿ. Since |z| ≥ n holds, we can break z into z = uvw with v ≠ ε and |uv| ≤ n, so that for all i ∈ ℕ it holds that uvⁱw ∈ L. However, v consists only of symbols a, so that in particular (with i = 0) uw = a^{n−|v|}bⁿ ∈ L should hold. Contradiction!
The above example shows that finite automata cannot count unboundedly. We shall get acquainted with other applications of the pumping lemma in the section devoted to decidability properties.
Nerode relation
If we consider the so-called Nerode relation, we get a characteristic, i.e., necessary and sufficient, condition for the regularity of languages L ⊆ Σ*.
4.2 Definition: Let L ⊆ Σ*. The Nerode relation ≡_L is the following relation on Σ*, i.e. ≡_L ⊆ Σ* × Σ*:
u ≡_L v if and only if for all w ∈ Σ* it holds that: uw ∈ L ⟺ vw ∈ L.
Therefore, it holds that u ≡_L v, if u and v can be extended in the same way to strings from L. In particular (with w = ε) it holds that u ∈ L ⟺ v ∈ L.
Remark: The Nerode relation is a right congruence, i.e., it has the following properties:
1. ≡_L is an equivalence relation on Σ*,
2. u ≡_L v implies ua ≡_L va for all a ∈ Σ.
Given that ≡_L is an equivalence relation, we can consider the index of ≡_L, i.e., the number of equivalence classes of ≡_L.
4.3 Theorem (Myhill and Nerode):
A language L ⊆ Σ* is regular if and only if the Nerode relation ≡_L has a finite index.
Proof: ⟹: Let L = L(A) for a DFA A = (Σ, Q, δ, q₀, F). For u, v ∈ Σ* it holds that:
u ≡_A v if and only if there is a q ∈ Q with q₀ =u⇒ q and q₀ =v⇒ q,
i.e., if the automaton A moves from q₀ to the same state q after having read the inputs u and v. Note that q is uniquely defined, because A is deterministic. The relation ≡_A is an equivalence relation on Σ*.
We show: ≡_A is a refinement of ≡_L, i.e., for all u, v ∈ Σ* it holds that:
u ≡_A v implies u ≡_L v.
Let u ≡_A v and w ∈ Σ*. Then:
uw ∈ L ⟺ ∃q ∈ Q, q′ ∈ F : q₀ =u⇒ q =w⇒ q′
(since u ≡_A v) ⟺ ∃q ∈ Q, q′ ∈ F : q₀ =v⇒ q =w⇒ q′
⟺ vw ∈ L.
Thus there are at least as many equivalence classes of ≡_A as of ≡_L. Therefore, it holds that
Index(≡_L) ≤ Index(≡_A) = number of states reachable from q₀ ≤ |Q|,
thus ≡_L has a finite index.
⟸: Let L ⊆ Σ* be a language whose Nerode relation has the finite index k. Choose strings u₁, …, u_k ∈ Σ* with u₁ = ε as representatives of the equivalence classes of ≡_L. Then Σ* can be represented as a disjoint union of these equivalence classes:
Σ* = [u₁] ∪ … ∪ [u_k].
In particular, for every string u ∈ Σ* there is a j with [u] = [u_j]. We construct the equivalence-class automaton
A_L = (Σ, {[u₁], …, [u_k]}, δ_L, [ε], F_L)
with F_L = {[u_j] | u_j ∈ L} and the transition relation
[u_i] –a→_L [u_j] if and only if [u_j] = [u_i a].
Note that δ_L is well defined, because ≡_L is a right congruence.
Then A_L is a DFA and for all strings w ∈ Σ* it holds that:
[ε] =w⇒_L [u_j] if and only if [u_j] = [w],
more precisely, for w = a₁ … aₙ:
[ε] –a₁→_L [a₁] –a₂→_L … –aₙ→_L [a₁ … aₙ] = [u_j],
and thus
w ∈ L(A_L) ⟺ ∃[u_j] ∈ F_L : [ε] =w⇒_L [u_j]
⟺ ∃u_j ∈ L : [u_j] = [w]
⟺ w ∈ L.
So A_L accepts the language L. Therefore, L is regular.
We use the method of proof of the Myhill-Nerode theorem in order to minimize the number of states of an automaton. In this case we refer to the deterministic equivalence-class automaton A_L from the proof.
4.4 Corollary: Let L ⊆ Σ* be regular with Index(≡_L) = k. Then the equivalence-class automaton A_L has k states, and no DFA accepting L has fewer states.
We will show that A_L is even unique up to isomorphism.
4.5 Definition (isomorphism): Two DFAs A₁ = (Σ, Q₁, δ₁, q₀₁, F₁) and A₂ = (Σ, Q₂, δ₂, q₀₂, F₂) are called isomorphic, if there is a bijection β : Q₁ → Q₂ with β(q₀₁) = q₀₂, β(F₁) = F₂ and for all q, q′ ∈ Q₁ and a ∈ Σ:
q –a→₁ q′ if and only if β(q) –a→₂ β(q′).
The bijection β is called the isomorphism from A₁ to A₂.
Note that isomorphism is an equivalence relation on finite automata. Now we will show the announced statement.
4.6 Theorem: Let L ⊆ Σ* be regular with Index(≡_L) = k, and let A = (Σ, Q, δ, q₀, F) be a DFA accepting L in which every state is reachable and |Q| = k. Then A and A_L are isomorphic.
Proof: For j = 1, …, k let q_j ∈ Q be the state with q₀ =u_j⇒ q_j, where u₁, …, u_k are the representatives chosen in the construction of A_L. Define β([u_j]) = q_j. We check:
1. β is injective: if q_i = q_j, then for all w ∈ Σ* : u_i w ∈ L ⟺ u_j w ∈ L. Therefore u_i ≡_L u_j and [u_i] = [u_j].
2. β is surjective: this property follows from (1) and the fact that k = |Q|. Thus it holds in particular that Q = {q₁, …, q_k}.
3. β([ε]) = β([u₁]) = q₁ (= q₀).
4. β(F_L) = F: [u_j] ∈ F_L ⟺ u_j ∈ L ⟺ q_j ∈ F.
5. For all i, j ∈ {1, …, k} and a ∈ Σ it holds that:
[u_i] –a→_L [u_j] ⟺ q_i –a→ q_j.
Proof of ⟹: Let [u_i] –a→_L [u_j]. Then by definition of δ_L: [u_i a] = [u_j]. There is an l ∈ {1, …, k} with q_i –a→ q_l. By definition of q_i and q_l we have the following picture:
[Diagram: from q₁ the string u_i leads to q_i, from q_i the symbol a leads to q_l, and from q₁ the string u_l also leads to q_l.]
Given that the strings u_i a and u_l lead to the same state q_l, it follows that u_i a ≡_L u_l. Therefore, it holds that [u_j] = [u_i a] = [u_l]. According to the choice of u₁, …, u_k in A_L it follows that u_j = u_l, even j = l. Thus q_i –a→ q_j follows as desired.
Proof of ⟸: Let q_i –a→ q_j. Thus we have the following picture, similar to the one above:
[Diagram: from q₁ the string u_i leads to q_i, from q_i the symbol a leads to q_j, and from q₁ the string u_j leads to q_j.]
Hence we conclude as above: u_i a ≡_L u_j. Therefore, it holds that [u_i a] = [u_j]; by definition of δ_L we can derive that [u_i] –a→_L [u_j].
Thus A_L and A are isomorphic.
For every regular language L ⊆ Σ* with k as the index of ≡_L there is hence a DFA, unique up to isomorphism, with the minimum number of states k accepting L.
4.7 Definition: The minimal automaton for a regular language L ⊆ Σ* is the equivalence-class automaton A_L.
To minimize a given DFA A = (Σ, Q, δ, q₀, F) we proceed in two steps:
1. Remove unreachable states.
Recall that a state q is reachable, if there is a w ∈ Σ* with q₀ =w⇒ q. The subset of reachable states of Q is computable, because we can consider only those strings w with q₀ =w⇒ q which have length ≤ |Q| (compare with the proof of the pumping lemma).
2. Combine equivalent states.
For q ∈ Q, w ∈ Σ* and S ⊆ Q we write q =w⇒ S, if there is a q′ ∈ S with q =w⇒ q′. Two states q₁, q₂ ∈ Q are called equivalent, in short q₁ ≈ q₂, if for all w ∈ Σ* it holds that:
q₁ =w⇒ F ⟺ q₂ =w⇒ F,
i.e., the same strings lead from q₁ and from q₂ to final states.
There is a close relation between the equivalence ≈ and the Nerode relation ≡_L. Let q₀ =u⇒ q₁ and q₀ =v⇒ q₂. Then it holds that:
q₁ ≈ q₂ ⟺ u ≡_L v ⟺ [u] = [v].
Making a comparison with the equivalence-class automaton A_L we see that equivalent states must coincide in the minimal automaton: q₀, q₁ and q₂ are represented in A_L by the equivalence classes [ε], [u] and [v] with [ε] =u⇒ [u] and [ε] =v⇒ [v], and it follows that:
q₁ ≈ q₂ ⟺ u ≡_L v ⟺ [u] = [v].
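Step 2 can be implemented by iterated refinement of a partition of the states, starting from the partition into F and Q \ F. The following Python sketch is our own addition; delta is assumed to be a total transition function given as a dictionary:

def equivalent_states(states, sigma, delta, finals):
    """Compute the equivalence classes of ~ by partition refinement."""
    # block[q] = identifier of the current class of q; start with F vs Q \ F
    block = {q: (q in finals) for q in states}
    while True:
        # two states stay together iff all their a-successors lie in equal blocks
        signature = {q: (block[q],
                         tuple(block[delta[(q, a)]] for a in sigma))
                     for q in states}
        new_ids = {sig: i for i, sig in enumerate(set(signature.values()))}
        new_block = {q: new_ids[signature[q]] for q in states}
        if new_block == block:
            return block          # q1 ~ q2 iff block[q1] == block[q2]
        block = new_block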
5 Decidability questions
First of all, we note that the following constructions are algorithmically computable:
ε-NFA → NFA → DFA,
DFA → minimal automaton,
ε-NFAs for the following operations on finitely acceptable languages: union, complement, intersection, difference, concatenation and iteration,
regular expression → NFA and DFA → regular expression.
After that we can move on to decidability questions for the languages represented by finite automata or regular expressions. Due to the constructions mentioned above we consider only the languages represented by DFAs. We consider the following problems for regular languages.
Acceptance problem. Given: DFA A and a string w ∈ Σ*. Question: Does w ∈ L(A) hold?
Emptiness problem. Given: DFA A. Question: Does L(A) = ∅ hold?
Finiteness problem. Given: DFA A. Question: Is L(A) finite?
Equivalence problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) = L(A₂) hold?
Inclusion problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) ⊆ L(A₂) hold?
Intersection problem. Given: DFAs A₁ and A₂. Question: Does L(A₁) ∩ L(A₂) = ∅ hold?
5.1 Theorem (decidability): For regular languages
the acceptance problem,
the emptiness problem,
the finiteness problem,
the equivalence problem,
the inclusion problem,
the intersection problem
are all decidable.
Proof: Acceptance problem: We apply A to the given string w and decide whether a final state of A is reached. Therefore, A itself gives us the decision procedure.
Emptiness problem: Let n be the number we get by applying the pumping lemma to the regular language L(A). Then it holds that:
L(A) = ∅ ⟺ ¬∃w ∈ L(A) : |w| < n. (∗)
The proof of (∗): ⟹ is obvious. ⟸ by contraposition: Let L(A) ≠ ∅. If there is a string w ∈ L(A) with |w| < n, then we have nothing to show. Otherwise there is a string w ∈ L(A) with |w| ≥ n. By repeated application of the pumping lemma with i = 0 we get a string w₀ ∈ L(A) with |w₀| < n. This completes the proof.
The decision procedure for L(A) = ∅ now solves the acceptance problem w ∈ L(A)? for every string over the input alphabet of A with |w| < n. If the answer is always no, then we get L(A) = ∅. Otherwise we get L(A) ≠ ∅.
Finiteness problem: Let n be defined as above. Then it holds that:
L(A) is finite ⟺ ¬∃w ∈ L(A) : n ≤ |w| < 2n. (∗∗)
The proof of (∗∗): ⟹: If there were a string w ∈ L(A) with |w| ≥ n, then L(A) would be infinite by the pumping lemma. ⟸ by contraposition: Let L(A) be infinite. Then there are strings of arbitrary length in L(A), in particular a string w with |w| ≥ 2n. By repeatedly applying the pumping lemma with i = 0 we get a string w₀ with n ≤ |w₀| < 2n, because with i = 0 the given string is shortened by at most n letters at a time.
The finiteness problem can thus be decided via (∗∗) by solving the acceptance problem a finite number of times.
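Both decision procedures can be implemented literally as stated in (∗) and (∗∗), by testing all strings below the respective length bound. The following Python sketch is our own illustration; it is exponential in n but effective:

from itertools import product

def accepts(delta, q0, finals, w):
    q = q0
    for a in w:
        q = delta[(q, a)]
    return q in finals

def is_empty(sigma, delta, q0, finals, n):
    """(*): L(A) is empty iff no string with |w| < n is accepted."""
    return not any(accepts(delta, q0, finals, w)
                   for k in range(n)                 # |w| < n
                   for w in product(sigma, repeat=k))

def is_finite(sigma, delta, q0, finals, n):
    """(**): L(A) is finite iff no string with n <= |w| < 2n is accepted."""
    return not any(accepts(delta, q0, finals, w)
                   for k in range(n, 2 * n)          # n <= |w| < 2n
                   for w in product(sigma, repeat=k))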
Equivalence problem: First we construct a DFA A with the following property:
L(A) = (L(A₁) \ L(A₂)) ∪ (L(A₂) \ L(A₁)).
Obviously it holds that:
L(A₁) = L(A₂) ⟺ L(A) = ∅. (∗∗∗)
Hence the equivalence problem for automata is reduced to the emptiness problem for A. The construction of A described above is however hard to implement. Alternatively we can use the following product construction, which is similar to the last remark in section 2 of this chapter. Let Aᵢ = (Σ, Qᵢ, δᵢ, q₀ᵢ, Fᵢ), i = 1, 2. Then consider Q = Q₁ × Q₂ with the following transition relation δ ⊆ Q × Σ × Q: for all q₁, q₁′ ∈ Q₁ and q₂, q₂′ ∈ Q₂ and a ∈ Σ it holds that
(q₁, q₂) –a→ (q₁′, q₂′) if and only if q₁ –a→₁ q₁′ and q₂ –a→₂ q₂′.
Then define A = (Σ, Q, δ, (q₀₁, q₀₂), F) with
F = {(q₁, q₂) ∈ Q | q₁ ∈ F₁ ⟺ q₂ ∉ F₂}.
Then (∗∗∗) holds for this DFA A:
L(A₁) = L(A₂) ⟺ ∀w ∈ Σ* : (w ∈ L(A₁) ⟺ w ∈ L(A₂))
⟺ ∀w ∈ Σ* : ((∃q₁ ∈ F₁ : q₀₁ =w⇒ q₁) ⟺ (∃q₂ ∈ F₂ : q₀₂ =w⇒ q₂))
⟺ ∀w ∈ Σ* ∀(q₁, q₂) ∈ Q : ((q₀₁, q₀₂) =w⇒ (q₁, q₂) ⟹ (q₁, q₂) ∉ F)
⟺ L(A) = ∅.
Inclusion problem: We construct a DFA A, so that
L(A) = L(A₁) \ L(A₂)
holds. Due to
L(A₁) ⊆ L(A₂) ⟺ L(A) = ∅
the inclusion problem for A₁ and A₂ can be reduced to the emptiness problem of A.
Intersection problem: We construct a DFA A with
L(A) = L(A₁) ∩ L(A₂),
so that the intersection problem of A₁ and A₂ can be reduced to the emptiness problem of A. For A we can use the product construction A∩ from the remark in section 2.
As an application, reconsider the printer example: for specifications it is convenient to use extended regular expressions, which additionally allow the operators ∩ (intersection) and ¬ (complement), e.g. to forbid patterns such as b₁ b₂ and b₂ b₁ directly. The application of the operators ¬ and ∩ often makes the regular expression re simpler. Due to the closure properties of regular languages, we certainly do not leave the class of regular languages. Hence we can also decide for every given finite automaton A for the synchronization of printer usage by both programs, whether L(A) ⊆ L(re) holds.
One of the possibilities to define satisfaction is as follows:
P satisfies S if and only if L(A) = L(re),
where A is the finite automaton representing the program P and re is the extended regular expression representing the specification S. This test can be conducted automatically, because the equivalence problem for regular languages is decidable.
Even the following synthesis problem can be solved automatically for a program P represented as a finite automaton and a specification S represented as an extended regular expression:
Given: a specification S.
Sought: a program P that satisfies S.
Chapter III
Context-free languages and push-down automata
In the previous chapter we have seen that regular languages have multiple applications in computer science (e.g. lexical analysis and substring recognition) and are particularly easy to handle (representability by finite automata and regular expressions, good closure and decidability properties). However they are not sufficient for an important task of computer science, namely the syntax description of programming languages.
The point is that programming languages admit bracket structures of arbitrary nesting depth, such as for example
arithmetic expressions of the form 3 · (4 · (x + 1)),
lists of the form (CAR (CONS x y)) or
statements of the form
while (b1) {
x = e1;
while (b1) {
y = e2;
}
z = e3;
}
In section 4 we have shown with the help of the pumping lemma that the simplest example of such a bracket structure, namely the language
L = {aⁿbⁿ | n ∈ ℕ},
is no longer regular. Context-free languages are used for the syntax description of programming languages.
1 Context-free grammars
First we consider the corresponding grammars.
1.1 Definition: A context-free grammar is a 4-tuple G = (N, T, P, S), where the following holds:
(i) N is an alphabet of non-terminal symbols,
(ii) T is an alphabet of terminal symbols with N ∩ T = ∅,
(iii) S ∈ N is the start symbol,
(iv) P ⊆ N × (N ∪ T)* is a finite set of productions (rules). A production (A, u) ∈ P is written as A → u.
G determines a derivation relation ⇒_G on strings over N ∪ T:
x ⇒_G y if and only if ∃A → u ∈ P ∃w₁, w₂ ∈ (N ∪ T)* :
x = w₁ A w₂ and y = w₁ u w₂.
By ⇒*_G we denote the reflexive and transitive closure of ⇒_G. We read x ⇒*_G y as "y can be derived from x". The language generated by G is
L(G) = {w ∈ T* | S ⇒*_G w}.
Two context-free grammars G₁ and G₂ are called equivalent, if L(G₁) = L(G₂).
1.2 Definition: A language L ⊆ T* is called context-free, if there is a context-free grammar G with L = L(G).
Example: The language L = {aⁿbⁿ | n ∈ ℕ} is generated by the grammar G₁ = ({S}, {a, b}, {S → aSb, S → ε}, S). Arithmetic expressions over a, b, c are generated by the grammar G₂ with the productions
S → S + S | S · S | (S) | a | b | c.
A possible derivation in G₂ is
S ⇒_G₂ S · S ⇒_G₂ (S) · S ⇒*_G₂ (a + S) · S ⇒_G₂ (a + b) · S ⇒_G₂ (a + b) · c.
Now we will consider the derivation of strings in a context-free grammar in more detail.
1.3 Definition: A derivation from A to w in G of length n ≥ 0 is a sequence of derivation steps
A = z₀ ⇒_G z₁ ⇒_G … ⇒_G zₙ = w. (∗)
This derivation is called a leftmost derivation, if we replace the leftmost non-terminal symbol in every derivation step zᵢ ⇒_G zᵢ₊₁, i.e. if zᵢ and zᵢ₊₁ always have the form
zᵢ = w₁ A w₂ and zᵢ₊₁ = w₁ u w₂, where w₁ ∈ T*.
The rightmost derivations are defined correspondingly (then it holds that: w₂ ∈ T*).
Every derivation can be graphically represented as a tree.
1.4 Definition: A derivation tree from A to w in G is a tree with the following properties:
(i) Every node is labelled with a symbol from N ∪ T ∪ {ε}. The root is labelled with A and every internal node is labelled with a symbol from N.
(ii) If an internal node labelled with B has k children, which are labelled with the symbols α₁, …, α_k from left to right, then it holds that
a) k = 1 and α₁ = ε and B → ε ∈ P
or
b) k ≥ 1 and α₁, …, α_k ∈ N ∪ T and B → α₁ … α_k ∈ P.
(iii) The string w is obtained by concatenating the symbols on the leaves from left to right.
[Illustration: a derivation tree with root A and an internal node B with children α₁, …, α_k; the leaves read w from left to right. If u = α₁ … α_k with α₁, …, α_k ∈ N ∪ T, then applying the production B → u extends the tree at the node B accordingly.]
Example: Derivation trees for the derivations mentioned in the previous example are
[a tree for G₁: root S with children a, S, b, where the inner S again has children a, S, b]
and
[a tree for (a + b) · c in G₂: root S with children S, ·, S; the left S has children (, S, ); that inner S has children S, +, S with leaves a and b; the right S has the single leaf c].
Remark: There exists the following relationship between derivations and derivation trees:
(i) A ⇒*_G w ⟺ there is a derivation tree from A to w in G.
(ii) In general, several derivations from A to w correspond to a given derivation tree from A to w, however only one leftmost derivation and one rightmost derivation.
Proof:
(i) ⟸ is obvious due to the above construction; ⟹ is shown by induction on the depth of the derivation tree.
(ii) Derivation trees abstract from the unimportant order of rule application, if several non-terminal symbols occur simultaneously. For example, both derivations
S ⇒_G₂ S + S ⇒_G₂ a + S ⇒_G₂ a + b
and
S ⇒_G₂ S + S ⇒_G₂ S + b ⇒_G₂ a + b
result in the same derivation tree:
[Tree: root S with children S, +, S and leaves a and b.]
If we have decided to use leftmost or rightmost derivations respectively, then such alternatives are not possible.
However, a string may also possess several essentially different derivation trees; a context-free grammar G is called ambiguous, if some string w ∈ L(G) has more than one derivation tree, and unambiguous otherwise. For example, in G₂ the string a + b · c has the two derivation trees
[Tree 1: root S with children S, +, S, where the left subtree is the leaf a and the right subtree derives b · c]
and
[Tree 2: root S with children S, ·, S, where the left subtree derives a + b and the right subtree is the leaf c].
These correspond to the semantically different bracketings a + (b · c) and (a + b) · c.
Therefore, in programming languages we choose a grammar G₃ for arithmetic expressions, where the following evaluation strategy is realized:
The evaluation takes place from left to right. Thus, for example, a + b + c is evaluated as (a + b) + c.
Multiplication · has a higher priority than +. Thus, for example, a + b · c is evaluated as a + (b · c).
If we want to have another evaluation order, we must explicitly put brackets ( and ).
Consider G₃ = ({E, T, F}, {a, b, c, +, ·, (, )}, P₃, E) with the following P₃:
E → T | E + T
T → F | T · F
F → (E) | a | b | c
G₃ is unambiguous, and it holds that L(G₃) = L(G₂). For example, a + b · c has the following derivation tree in G₃:
[Tree: root E with children E, +, T; the left E derives a via T and F; the right T has children T, ·, F, which derive b and c via F.]
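Because G₃ is unambiguous and its structure mirrors the priorities of + and ·, it lends itself to recursive-descent parsing. The following Python sketch is our own addition; it rewrites the left-recursive rules E → E + T and T → T · F as iteration and uses '*' for the multiplication sign:

def parse(w):
    pos = 0
    def peek():
        return w[pos] if pos < len(w) else None
    def eat(c):
        nonlocal pos
        if peek() != c:
            raise SyntaxError(f"expected {c} at position {pos}")
        pos += 1
    def E():                 # E -> T (+ T)*
        T()
        while peek() == "+":
            eat("+"); T()
    def T():                 # T -> F (* F)*
        F()
        while peek() == "*":
            eat("*"); F()
    def F():                 # F -> (E) | a | b | c
        if peek() == "(":
            eat("("); E(); eat(")")
        elif peek() in ("a", "b", "c"):
            eat(peek())
        else:
            raise SyntaxError(f"unexpected symbol at position {pos}")
    E()
    return pos == len(w)     # the whole input must be consumed

def accepts_expr(w):
    try:
        return parse(w)
    except SyntaxError:
        return False

assert accepts_expr("a+b*c") and not accepts_expr("a+*b")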
Example (Chomsky, 1964): An inherently ambiguous context-free language is
L = {aⁱbʲcᵏ | i, j, k ≥ 0 and (i = j or j = k)}.
For the proof see the literature.
2 Pumping Lemma
There also exists a pumping lemma for context-free languages. It provides a necessary condition for a given language to be context-free.
2.1 Theorem (pumping lemma for context-free languages or uvwxy-lemma): For every context-free language L ⊆ T* there exists a number n ∈ ℕ, so that for all strings z ∈ L with |z| ≥ n there is a decomposition z = uvwxy with
(i) vx ≠ ε,
(ii) |vwx| ≤ n,
(iii) for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L.
Proof: Let L = L(G) for a context-free grammar G = (N, T, P, S). Let k ≥ 2 bound the length of the right-hand sides of the productions of G and let m = |N|; choose n = k^{m+1}. We use the following lemma about trees: a tree with branching factor ≤ k, in which every path has length ≤ m, has at most k^m leaves.
Consider a string z ∈ L with |z| ≥ n and choose a derivation tree t from S to z in G of minimal size, i.e. t contains no subtree with a root labelled B which again contains a proper subtree with root labelled B deriving the same substring of z: every part like this could actually be removed from t without changing the derived string z.
Having chosen k and |z| ≥ k^{m+1} as above, we can see that t has branching factor ≤ k and ≥ k^{m+1} leaves. Therefore, according to the previously considered lemma there is a path of length ≥ m + 1 in t. There are ≥ m + 1 internal nodes on this path, so that a non-terminal symbol is repeated while labelling these nodes. We need this repetition in a special form.
By a repetition tree in t we denote a subtree of t, where the labelling of the root is repeated in some other node. Now we choose a minimal repetition tree t₀ in t, i.e. such a tree that contains no other repetition tree, except as a proper subtree. In t₀ every path has a length ≤ m + 1.
Let A be the root labelling of t₀. Then t has the following structure:
[Diagram: the derivation tree t with root S; inside it the minimal repetition tree t₀ with root A, which contains a further node labelled A. Reading the leaves from left to right, the inner A derives w, the outer A derives vwx, and the whole tree derives z = u v w x y.]
From this structure we get a decomposition z = uvwxy with
S ⇒*_G uAy ⇒*_G uvAxy ⇒*_G uvwxy. (∗)
We show that this decomposition of z satisfies the conditions of the pumping lemma:
(i) By the choice of t it holds that vx ≠ ε.
(ii) By the choice of t₀ and the preceding lemma about trees it holds that |vwx| ≤ k^{m+1} = n.
(iii) From (∗) it follows immediately that for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L(G).
The derivation tree from S to uvⁱwxⁱy in G for i = 3 looks as follows:
[Diagram: the subtree between the outer A and the inner A is inserted three times into itself before the inner A finally derives w; the leaves read u v v v w x x x y.]
Just as for regular languages, the above pumping lemma can be used to prove that a certain language is not context-free.
Example: The language L = {aⁿbⁿcⁿ | n ∈ ℕ} is not context-free. We provide a proof by contradiction.
Hypothesis: L is context-free. Then according to the pumping lemma there is an n ∈ ℕ with the given properties. Now consider z = aⁿbⁿcⁿ. Because |z| ≥ n holds, we can decompose z into z = uvwxy with vx ≠ ε and |vwx| ≤ n, so that for all i ∈ ℕ it holds that uvⁱwxⁱy ∈ L.
Because |vwx| ≤ n, it is impossible that both a's and c's occur in the substring vwx. Therefore, while pumping up to uvⁱwxⁱy, at most two of the characters a, b, c are affected. However such strings are not in L. Contradiction.
The pumping lemma allows us to show that context-free grammars are not enough to provide a full description of the syntax of more advanced programming languages such as Java. Though we describe the basic structure of syntactically correct programs with context-free grammars (in BNF or EBNF notation), there exist side conditions which are context-sensitive.
Example: The programming language Java, i.e. the set of all syntactically correct Java programs, is not context-free. We give a proof by contradiction.
Hypothesis: Let Java be context-free. Then there is an n ∈ ℕ with the properties mentioned in the pumping lemma. Now let us consider the following syntactically correct Java class, where the variable name X1…1 contains n ones:
class C {
int X1…1;
void m() {
X1…1 = X1…1;
}
}
In every uvwxy-decomposition of this program, the vwx-part influences at most two out of the three occurrences of the variable X1…1, because |vwx| ≤ n. Therefore, while pumping to uvⁱwxⁱy, we obtain either character strings which do not fulfill the requirements of the Java syntax diagrams, or character strings of the form
class C {
int X1…1 (k ones);
void m() {
X1…1 (l ones) = X1…1 (m ones);
}
}
where k, l, m are not all equal. These character strings violate the following condition for syntactically correct Java programs:
Every variable must be declared before use. (∗)
Here either X1…1 (l ones) or X1…1 (m ones) is not declared. Contradiction.
Such conditions as (∗) are context-sensitive and thus are stated, when defining programming languages, together with the context-free basic structure of programs. A compiler checks these context-sensitive conditions by using appropriate tables for storing the declared variables or general identifiers.
3 Push-down automata
So far we defined context-free languages by the fact that they can be generated by grammars. Now we want to consider the recognizability of such languages by automata. Our goal is to extend the model of the finite automaton so that it recognizes the context-free languages.
The weak point of finite automata is that they lack memory to store an unlimited amount of data. Obviously a finite automaton cannot recognize a language such as
L = {aⁿbⁿ | n ∈ ℕ},
because at the moment when the first b is read, it does not know any more how many a's have been read. The only information saved is the current state, i.e. an element of a finite set of states.
Now we consider the so-called push-down automata. These are nondeterministic finite automata with ε-transitions (ε-NFAs), extended with a memory which can store arbitrarily long strings, but which can be accessed only in a very limited way. The memory is organized as a stack or pushdown list with access only to the topmost symbol. The transitions of a push-down automaton may depend on the current state, on the symbol of the input string which has been read, and on the topmost symbol of the stack; they change the state and the contents of the stack.
[Sketch: an input string, e.g. a a a b b b, on a tape read from left to right by a finite control with state q ∈ Q; the finite control has access to the topmost symbol A of a stack whose bottom is marked #.]
3.1 Definition (push-down automaton): A (nondeterministic) push-down automaton, shortly PDA, is a 7-tuple
A = (Σ, Q, Γ, δ, q₀, Z₀, F)
with the following properties:
(i) Σ is the input alphabet,
(ii) Q is a finite set of states,
(iii) Γ is the stack alphabet,
(iv) δ ⊆ Q × (Σ ∪ {ε}) × Γ × Q × Γ* is a finite transition relation,
(v) q₀ ∈ Q is the initial state,
(vi) Z₀ ∈ Γ is the initial stack symbol,
(vii) F ⊆ Q is the set of final states.
For α ∈ Σ ∪ {ε}, q, q′ ∈ Q, Z ∈ Γ and γ ∈ Γ* with (q, α, Z, q′, γ) ∈ δ we often write (q, Z) –α→ (q′, γ).
We can illustrate this as follows. A transition (q, Z) –a→ (q′, γ) with a ∈ Σ means: in the state q, with a as the next input symbol and Z as the topmost stack symbol, the PDA can read a, make a transition into the state q′ and replace Z by γ on the top of the stack. A transition (q, Z) –ε→ (q′, γ) means: in the state q with Z as the topmost stack symbol the PDA can spontaneously make a transition into the state q′ and replace Z by γ; in this case no input symbol is read.
3.2 Definition (configurations): Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a PDA.
(i) A configuration of A is a pair (q, γ) ∈ Q × Γ*.
(ii) For every α ∈ Σ ∪ {ε} the transition relation –α→ on configurations is defined as follows:
(q, γ) –α→ (q′, γ′) if and only if ∃Z ∈ Γ, γ₀, γ₁ ∈ Γ* :
γ = Z γ₀ and (q, α, Z, q′, γ₁) ∈ δ and γ′ = γ₁ γ₀.
By –ε→ we denote the ε-transition relation on configurations.
(iii) For every string w ∈ Σ*, =w⇒ is a 2-place relation on configurations of A, which is defined inductively:
(q, γ) =ε⇒ (q′, γ′), if ∃n ≥ 0 : (q, γ) –ε→ … –ε→ (n times) (q′, γ′),
(q, γ) =aw⇒ (q′, γ′), if (q, γ) (=ε⇒ ∘ –a→ ∘ =w⇒) (q′, γ′), for all a ∈ Σ.
By =w⇒ we denote the extended w-transition relation on configurations.
Remark: For all γ, γ′ ∈ Γ*, a₁, …, aₙ ∈ Σ and w₁, w₂ ∈ Σ* it holds that:
(i) (q, γ) –ε→ (q′, γ′) implies (q, γ) =ε⇒ (q′, γ′).
(ii) (q, γ) =a₁…aₙ⇒ (q′, γ′) if and only if (q, γ) (=a₁⇒ ∘ … ∘ =aₙ⇒) (q′, γ′).
(iii) (q, γ) =w₁w₂⇒ (q′, γ′) if and only if (q, γ) (=w₁⇒ ∘ =w₂⇒) (q′, γ′).
There exist two variants of language acceptance.
3.3 Definition (acceptance):
Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a PDA and w ∈ Σ*.
(i) A accepts w, if ∃q ∈ F, γ ∈ Γ* : (q₀, Z₀) =w⇒ (q, γ).
The language accepted (or recognized) by A is
L(A) = {w ∈ Σ* | A accepts w}.
(ii) A accepts w with the empty stack, if
∃q ∈ Q : (q₀, Z₀) =w⇒ (q, ε).
The language accepted (or recognized) by A with the empty stack is
L_ε(A) = {w ∈ Σ* | A accepts w with the empty stack}.
Example: We consider the language L = {aⁿbⁿ | n ∈ ℕ} and the PDA
A = ({a, b}, {q₀, q₁, q₂}, {a, Z}, δ, q₀, Z, {q₀}),
whose transition relation δ consists of the following transitions, numbered 1 to 5:
1: (q₀, Z) –a→ (q₁, aZ),
2: (q₁, a) –a→ (q₁, aa),
3: (q₁, a) –b→ (q₂, ε),
4: (q₂, a) –b→ (q₂, ε),
5: (q₂, Z) –ε→ (q₀, ε).
While reading aⁿbⁿ with n ≥ 1 the automaton A passes through the following configurations:
(q₀, Z) –a→₁ (q₁, aZ) –a→₂ (q₁, aaZ) … –a→₂ (q₁, aⁿZ) –b→₃ (q₂, aⁿ⁻¹Z) –b→₄ (q₂, aⁿ⁻²Z) … –b→₄ (q₂, Z) –ε→₅ (q₀, ε)
Thus the relation (q₀, Z) =aⁿbⁿ⇒ (q₀, ε) holds for n ≥ 1. For n = 0 it trivially holds that (q₀, Z) =ε⇒ (q₀, Z). Because q₀ is the final state of A, it follows that L ⊆ L(A).
For the inclusion L(A) ⊆ L we must analyse the transition behaviour of A. For this purpose we index the transitions as above. We can see that A is deterministic, i.e. while sequentially reading the characters of a given input string, only one transition can be applied in each step. Namely, the following transition sequences are possible, where n ≥ m ≥ 0 holds:
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ) (q₁, aⁿ⁺¹Z),
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ ∘ –b→₃ ∘ –b→₄ᵐ) (q₂, aⁿ⁻ᵐZ),
(q₀, Z) (–a→₁ ∘ –a→₂ⁿ ∘ –b→₃ ∘ –b→₄ⁿ ∘ –ε→₅) (q₀, ε).
Therefore, A accepts only those strings which have the form aⁿbⁿ. In total it holds that L(A) = L.
Furthermore, note that the automaton A accepts all strings aⁿbⁿ with the empty stack, except for the case n = 0: L_ε(A) = {aⁿbⁿ | n ≥ 1}.
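Since this particular PDA is deterministic, its behaviour can be simulated directly. The following Python sketch is our own illustration of the automaton A above, with acceptance by final state:

def accepts_anbn(w):
    state, stack = "q0", ["Z"]
    for c in w:
        if state == "q0" and c == "a":                # transition 1
            state, stack = "q1", ["Z", "a"]
        elif state == "q1" and c == "a":              # transition 2
            stack.append("a")
        elif state in ("q1", "q2") and c == "b" and stack[-1] == "a":
            state = "q2"; stack.pop()                 # transitions 3 and 4
        else:
            return False                              # no transition applies
    if state == "q2" and stack == ["Z"]:
        state, stack = "q0", []                       # eps-transition 5
    return state == "q0"                              # q0 is the final state

assert accepts_anbn("") and accepts_anbn("aabb") and not accepts_anbn("abb")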
Now we want to derive general properties of push-down automata. For this purpose we often need the following Top Lemma. It states that the changes on the top of the stack do not depend on the rest of the stack.
3.4 Lemma (top of the stack): Let A = (Σ, Q, Γ, δ, q₀, Z₀, F) be a push-down automaton. Then for all w ∈ Σ*, q, q′ ∈ Q, Z ∈ Γ and γ, γ′ ∈ Γ* it holds that: if
(q, Z) =w⇒ (q′, γ),
then also
(q, Zγ′) =w⇒ (q′, γγ′).
Proof: Exercise.
Now we show that both variants of language acceptance are equivalent.
3.5 Theorem (acceptance):
(1) For every PDA A we can construct a PDA B with L(A) = L_ε(B).
(2) For every PDA A we can construct a PDA B with L_ε(A) = L(B).
Proof: Let A = (Σ, Q, Γ, δ_A, q₀, Z₀, F).
(1): The idea of the proof is simple: B works like A and empties the stack in final states. However, we should take care of the fact that B does not end up with an empty stack after reading input strings which A does not accept. Therefore, B uses an additional symbol # to mark the bottom of the stack. To be more precise, we construct
B = (Σ, Q ∪ {q_B, q_ε}, Γ ∪ {#}, δ_B, q_B, #, ∅)
with new states q_B, q_ε ∉ Q and
δ_B = {(q_B, #) –ε→ (q₀, Z₀#)}  (starting A)
∪ δ_A  (working as A)
∪ {(q, Z) –ε→ (q_ε, ε) | q ∈ F, Z ∈ Γ ∪ {#}}
∪ {(q_ε, Z) –ε→ (q_ε, ε) | Z ∈ Γ ∪ {#}}  (emptying the stack).
Then it holds for all w ∈ Σ*, q ∈ F and γ ∈ Γ*:
(q₀, Z₀) =w⇒_A (q, γ)
if and only if
(q_B, #) –ε→_B (q₀, Z₀#) =w⇒_A (q, γ#) =ε⇒_B (q_ε, ε).
(For the if-then direction we use the Top Lemma.) Analysing the applicability of the new transitions in B we get L(A) = L_ε(B).
(2): Idea of the proof: B works like A, but uses an additional symbol # to mark the bottom of the stack. As soon as A has emptied its stack, B reads the symbol # and moves to a final state. The exact construction of B is left as an exercise.
Now we want to show that the nondeterministic push-down automata accept exactly the context-free languages (with the empty stack). First, for a given context-free grammar G we construct a push-down automaton which represents a nondeterministic top-down parser for the language L(G).
3.6 Theorem: For every context-free grammar G we can construct a nondeterministic push-down automaton A with L_ε(A) = L(G).
Proof: Let G = (N, T, P, S). We construct A in such a way that it simulates the leftmost derivations in G:
A = (T, {q}, N ∪ T, δ, q, S, ∅),
where the transition relation δ consists of the following types of transitions:
(1) (q, A) –ε→ (q, u), if A → u ∈ P,
(2) (q, a) –a→ (q, ε), if a ∈ T.
A works as follows: first S is on the stack. A rule application A → u of the grammar is executed on the stack by replacing the topmost stack symbol A by u. If terminal symbols are on the top of the stack, we compare them with the symbols of the input string and, if they are equal, remove them from the stack. This way A produces stepwise a leftmost derivation of the input string. If this derivation is successful, then A accepts the input string with the empty stack. The application of transitions of type (1) is nondeterministic, if there are several rules with the same non-terminal symbol A in P. The only state q is not important for the transition behaviour of A, but must be indicated in order for the definition of A to be complete.
To show that L(G) = L_ε(A), we use the following notation: for a string w ∈ (N ∪ T)* let
w_T = the longest prefix of w with w_T ∈ T*,
w_R = the remainder of w, defined by w = w_T w_R.
Hypothesis 1: For all A ∈ N, w ∈ (N ∪ T)* and all transition sequences
(q, A) = (q, α₀) –β₁→ (q, α₁) … –βₘ→ (q, αₘ)
of length m, where β₁, …, βₘ ∈ T ∪ {ε}, it holds that
A ⇒*_G β₁ … βₘ αₘ.
Proof by induction on m:
m = 0: Then α₀ = A. Trivially A ⇒*_G A holds.
m → m + 1: We analyse the last transition
(q, αₘ) –β_{m+1}→ (q, α_{m+1}).
By the induction hypothesis A ⇒*_G β₁ … βₘ αₘ.
Case β_{m+1} = ε: Then we used transition type (1) and the transition has the form
(q, αₘ) = (q, Bv) –ε→ (q, uv) = (q, α_{m+1})
for some B ∈ N and u, v ∈ (N ∪ T)* with B → u ∈ P. Then
A ⇒*_G β₁ … βₘ Bv ⇒_G β₁ … βₘ uv = β₁ … βₘ β_{m+1} α_{m+1}.
Case β_{m+1} = a ∈ T: Then we used transition type (2) and the transition has the form
(q, αₘ) = (q, av) –a→ (q, v) = (q, α_{m+1})
for some v ∈ (N ∪ T)*. Then
A ⇒*_G β₁ … βₘ av = β₁ … βₘ β_{m+1} α_{m+1}.
The converse Hypothesis 2, stating that every leftmost derivation of G can be simulated by a transition sequence of A, is shown analogously; thus we have also proved statement 2.
In particular, from the statements 1 and 2 it follows that for all strings w ∈ T* it holds that:
S ⇒*_G w if and only if (q, S) =w⇒ (q, ε)
if and only if A accepts w with the empty stack.
Thus L(G) = L_ε(A).
Example: For the grammar G₁ with the productions S → aSb | ε the construction yields L_ε(A₁) = L(G₁). To illustrate this, let us consider the transition sequence of A₁ while accepting a²b²:
(q, S) –ε→ (q, aSb) –a→ (q, Sb) –ε→ (q, aSbb) –a→ (q, Sbb) –ε→ (q, bb) –b→ (q, b) –b→ (q, ε).
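The construction from Theorem 3.6 can be executed by searching through the configurations (stack contents, input position) of A. The following Python sketch is our own addition; the pruning rule and the step limit are ours, chosen so that the search terminates on the examples considered here:

from collections import deque

def cfg_accepts(productions, start, w, limit=100_000):
    """Breadth-first search over the configurations of the top-down PDA."""
    queue, seen, steps = deque([(start, 0)]), set(), 0
    while queue and steps < limit:
        steps += 1
        stack, pos = queue.popleft()
        if (stack, pos) in seen:
            continue
        seen.add((stack, pos))
        # prune: every terminal on the stack still needs one input symbol
        if sum(1 for s in stack if s not in productions) > len(w) - pos:
            continue
        if not stack:
            if pos == len(w):
                return True                 # accepted with the empty stack
            continue
        top, rest = stack[0], stack[1:]
        if top in productions:              # type (1): replace A by u, A -> u
            for u in productions[top]:
                queue.append((u + rest, pos))
        elif pos < len(w) and w[pos] == top:
            queue.append((rest, pos + 1))   # type (2): match terminal, pop it
    return False

G1 = {"S": ["aSb", ""]}                     # grammar G1: S -> aSb | eps
assert cfg_accepts(G1, "S", "aabb") and not cfg_accepts(G1, "S", "abb")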
Now for every given push-down automaton we construct a corresponding context-free grammar.
3.7 Theorem: For every push-down automaton A we can construct a context-free grammar G with L(G) = L_ε(A).
Proof: Let A = (Σ, Q, Γ, δ, q₀, Z₀, F). We construct G = (N, T, P, S) with T = Σ and
N = {S} ∪ {[q, Z, q′] | q, q′ ∈ Q and Z ∈ Γ}.
The idea of the non-terminal symbols [q, Z, q′] is as follows:
(1) All strings w ∈ Σ*, which A can accept from the configuration (q, Z) with the empty stack and the state q′, can be derived from [q, Z, q′]: (q, Z) =w⇒ (q′, ε).
(2) Thus a transition (q, Z) –β→ (r₀, Z₁ … Z_k) of A is simulated in G by the following productions:
[q, Z, r_k] → β [r₀, Z₁, r₁][r₁, Z₂, r₂] … [r_{k−1}, Z_k, r_k]
for all r₁, …, r_k ∈ Q. The strings, which are accepted by A up to the reduction of the symbol Z₁, are generated from [r₀, Z₁, r₁]; the strings accepted by A up to the reduction of the symbol Z₂ are generated from [r₁, Z₂, r₂], and so on. The intermediate states r₁, …, r_{k−1} are those states which A reaches directly after the reduction of the symbols Z₁, …, Z_{k−1}.
To make it precise, P consists of the following productions:
Type (1): S → [q₀, Z₀, r] ∈ P for all r ∈ Q.
Type (2): For every transition (q, Z) –β→ (r₀, Z₁ … Z_k) with β ∈ Σ ∪ {ε} and k ≥ 1 in A:
[q, Z, r_k] → β [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k] ∈ P for all r₁, …, r_k ∈ Q.
Type (3): (Special case of (2) for k = 0.) For every transition (q, Z) –β→ (r₀, ε) in A:
[q, Z, r₀] → β ∈ P.
To show that L(G) = L_ε(A), we prove two hypotheses.
Hypothesis 1: For all q, q′ ∈ Q, Z ∈ Γ, w ∈ Σ*, n ≥ 1 and derivations in G
[q, Z, q′] ⇒_G … ⇒_G (n times) w
of length n it holds for A that
(q, Z) =w⇒ (q′, ε).
Proof by induction on n:
n = 1: From [q, Z, q′] ⇒_G w it follows (because w ∈ Σ*) that a production of type (3) with w ∈ Σ ∪ {ε} was used. Thus (q, Z) –w→ (q′, ε) and hence (q, Z) =w⇒ (q′, ε).
n → n + 1: We analyse the first step of a derivation of length n + 1, which must use a production of type (2):
[q, Z, r_k] ⇒_G β [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k] ⇒_G … ⇒_G (n times) β w₁ … w_k = w,
where (q, Z) –β→ (r₀, Z₁ … Z_k) in A, r_k = q′, β ∈ Σ ∪ {ε}, w₁, …, w_k ∈ Σ* and
[r_{i−1}, Z_i, r_i] ⇒_G … ⇒_G (at most n times) w_i
for i = 1, …, k holds. By induction it holds in A that
(r_{i−1}, Z_i) =w_i⇒ (r_i, ε)
for i = 1, …, k and thus, according to the Top Lemma,
(q, Z) –β→ (r₀, Z₁ … Z_k) =w₁⇒ (r₁, Z₂ … Z_k) … (r_{k−1}, Z_k) =w_k⇒ (r_k, ε).
So in conclusion (q, Z) =w⇒ (q′, ε) holds for A, as required.
Hypothesis 2: For all q, q′ ∈ Q, Z ∈ Γ, n ≥ 1, β₁, …, βₙ ∈ Σ ∪ {ε} and all transition sequences
(q, Z) –β₁→ … –βₙ→ (q′, ε)
in A of length n it holds in G that
[q, Z, q′] ⇒*_G β₁ … βₙ.
Proof by induction on n:
n = 1: Then (q, Z) –β₁→ (q′, ε) and by a production of type (3): [q, Z, q′] ⇒_G β₁.
n → n + 1: We analyse the first step of a transition sequence in A of length n + 1:
(q, Z) –β₁→ (r₀, Z₁ … Z_k) –β₂→ … –βₙ₊₁→ (q′, ε),
where k ≥ 1 holds. By definition of P there is a production of type (2) in G:
[q, Z, q′] → β₁ [r₀, Z₁, r₁] … [r_{k−1}, Z_k, r_k],
where r_k = q′ and the intermediate states r₁, …, r_{k−1} are the states which A reaches directly after the reduction of Z₁, …, Z_{k−1}, i.e. the remaining transition sequence splits as
(r₀, Z₁ … Z_k) =β₁₁…β₁ₘ₁⇒ (r₁, Z₂ … Z_k) … (r_{k−1}, Z_k) =β_{k1}…β_{km_k}⇒ (r_k, ε) = (q′, ε)
with β₂ … βₙ₊₁ = β₁₁ … β₁ₘ₁ … β_{k1} … β_{km_k} and m₁, …, m_k ≤ n. By the induction hypothesis it holds in G that
[r_{i−1}, Z_i, r_i] ⇒*_G β_{i1} … β_{im_i}
for i = 1, …, k. In conclusion it follows in G that
[q, Z, q′] ⇒*_G β₁ β₂ … βₙ₊₁,
as required.
The hypotheses 1 and 2 imply: for all q ∈ Q and w ∈ Σ* it holds in G that
S ⇒_G [q₀, Z₀, q] ⇒*_G w
if and only if in A
(q₀, Z₀) =w⇒ (q, ε)
holds. Thus we have shown that L(G) = L_ε(A).
4 Closure properties
Now we consider under which operations the class of context-free languages is closed. In contrast to the regular (i.e. finitely acceptable) languages we have the following results.
4.1 Theorem: The class of context-free languages is closed under the following operations:
(i) union,
(ii) concatenation,
(iii) iteration,
(iv) intersection with regular languages.
However, the class of context-free languages is not closed under the operations
(v) intersection,
(vi) complement.
Proof: Let L₁, L₂ ⊆ T* be context-free, say Lᵢ = L(Gᵢ) for context-free grammars Gᵢ = (Nᵢ, T, Pᵢ, Sᵢ), i = 1, 2, where w.l.o.g. N₁ ∩ N₂ = ∅. First we show that L₁ ∪ L₂, L₁ · L₂ and L₁* are context-free. After that we consider the operations intersection and complement.
Let S ∉ N₁ ∪ N₂ be a new start symbol.
(i) L₁ ∪ L₂: Consider the context-free grammar G = ({S} ∪ N₁ ∪ N₂, T, P, S) with
P = {S → S₁, S → S₂} ∪ P₁ ∪ P₂. Obviously L(G) = L₁ ∪ L₂ holds.
(ii) L₁ · L₂: Consider the context-free grammar G = ({S} ∪ N₁ ∪ N₂, T, P, S) with
P = {S → S₁ S₂} ∪ P₁ ∪ P₂. Obviously L(G) = L₁ · L₂ holds.
(iii) L₁*: Consider the context-free grammar G = ({S} ∪ N₁, T, P, S) with
P = {S → ε, S → S₁ S} ∪ P₁. Then S ⇒*_G S₁ⁿ holds for all n ≥ 0 and thus L(G) = L₁*.
(iv) L₁ ∩ regular language: For the intersection with regular languages we use the representation of context-free and regular languages by push-down automata and finite automata respectively. Let L₁ = L(A₁) for the (nondeterministic) PDA A₁ = (T, Q₁, Γ, δ₁, q₀₁, Z₀, F₁) and L₂ = L(A₂) for the DFA A₂ = (T, Q₂, δ₂, q₀₂, F₂). We construct from A₁ and A₂ the (nondeterministic) PDA
A = (T, Q₁ × Q₂, Γ, δ, (q₀₁, q₀₂), Z₀, F₁ × F₂),
where the transition relation δ for q₁, q₁′ ∈ Q₁, q₂, q₂′ ∈ Q₂, Z ∈ Γ, α ∈ T ∪ {ε} and γ ∈ Γ* is defined as follows:
((q₁, q₂), Z) –α→ ((q₁′, q₂′), γ) in A
if and only if
(q
1
, Z)
1
(q
1
,
) in /
1
and q
2
2
q
2
in /
2
.
Note that in the special case α = ε the notation q_2 −ε→_2 q′_2 for the DFA 𝒜_2 simply means q_2 = q′_2. (Compare with the definition of the extended transition relation q −w→_2 q′ for DFAs in section 1.) Thus the relation δ of 𝒜 models the synchronous parallel progress of the automata 𝒜_1 and 𝒜_2. However, in the special case α = ε only 𝒜_1 makes a spontaneous ε-transition, while the DFA 𝒜_2 remains in the current state.
We show that for the acceptance by final states it holds that: L(𝒜) = L(𝒜_1) ∩ L(𝒜_2) = L_1 ∩ L_2. Let w = a_1 . . . a_n ∈ T*, where n ≥ 0 and a_i ∈ T for i = 1, . . . , n. Then it holds that:

    w ∈ L(𝒜)
    ⟺ ∃ (q_1, q_2) ∈ F_1 × F_2, γ ∈ Γ*: ((q_01, q_02), Z_0) −a_1 . . . a_n→ ((q_1, q_2), γ)
    ⟺ ∃ q_1 ∈ F_1, q_2 ∈ F_2, γ ∈ Γ*: (q_01, Z_0) −a_1→_1 . . . −a_n→_1 (q_1, γ) and q_02 −a_1→_2 . . . −a_n→_2 q_2
    ⟺ w ∈ L(𝒜_1) ∩ L(𝒜_2).
(v) not L_1 ∩ L_2: However, the context-free languages are not closed under intersection with other context-free languages. Consider

    L_1 = { a^m b^n c^n | m, n ≥ 0 }    and    L_2 = { a^m b^m c^n | m, n ≥ 0 }.

It is easy to see that L_1 and L_2 are context-free. For example, L_1 can be generated by the context-free grammar G = ({S, A, B}, {a, b, c}, P, S) with the following P:

    S → AB,
    A → ε | aA,
    B → ε | bBc.

The intersection of L_1 and L_2 is the following language

    L_1 ∩ L_2 = { a^n b^n c^n | n ≥ 0 },

which is not context-free (as we have shown with the help of the pumping lemma).
(vi) not complement: The context-free languages are also not closed under the complement. This statement follows directly from (i) and (v) due to De Morgan's laws. If the context-free languages were closed under the complement, then they would also be closed under intersection, because

    L_1 ∩ L_2 = ∁(∁L_1 ∪ ∁L_2)

holds.
(A non-terminal symbol A ∈ N is called erasable if A ⇒*_G ε holds.)

Step 1: First we compute all erasable A ∈ N. For this purpose we inductively compute the sets N_1 ⊆ N_2 ⊆ N_3 ⊆ . . . of erasable non-terminal symbols:

    N_1     = { A ∈ N | A → ε ∈ P }
    N_{k+1} = N_k ∪ { A ∈ N | A → B_1 . . . B_n ∈ P with B_1, . . . , B_n ∈ N_k }
Step 2: We construct an ε-free grammar G′ = (N′, T, P′, S′) equivalent to G. Let S′ be a new start symbol and N′ = N ∪ {S′}. Let P_0 arise from P by adding, for every production A → u ∈ P, all variants of A → u in which one or more erasable non-terminal symbols of u are deleted, and by removing all ε-productions. We define P′ from P_0 as follows:

    P′ = { S′ → ε | S is erasable } ∪ { S′ → u | S → u ∈ P_0 } ∪ P_0

Thus G′ is ε-free. It remains to show L(G′) = L(G).

⊆: This inclusion is obvious, because S′ → u ∈ P′ implies S ⇒*_G u, and A → u ∈ P_0 with A ∈ N implies A ⇒*_G u.

⊇: If ε ∈ L(G) holds, then S is erasable and thus S′ → ε ∈ P′. Therefore, ε ∈ L(G′) holds as well. Now let w ∈ L(G) − {ε}. Consider a derivation tree from S to w in G:
(Derivation tree in G: root S; below an inner node B there is a non-terminal A whose subtree derives ε; the yield of the whole tree is w.)
We get a derivation tree from S′ to w in G′ by replacing S by S′ and cutting off all maximal subtrees that derive ε in G:

(The same tree with root S′, where the subtree below A has been removed; the yield is still w.)

Thus w ∈ L(G′) holds.
5.3 Definition: A context-free grammar G = (N, T, P, S) is in Chomsky normal form, if the following holds:

G is ε-free (so that at most S → ε ∈ P is allowed),

every production in P different from S → ε has either the form

    A → a   or   A → BC,

where A, B, C ∈ N and a ∈ T.
5.4 Theorem : Every context-free grammar can be transformed into an equivalent grammar in
Chomsky normal form.
Proof : see literature.
5.5 Definition: A context-free grammar G = (N, T, P, S) is in Greibach normal form, if the following holds:

G is ε-free (so that at most S → ε ∈ P is allowed),

every production in P different from S → ε has the form

    A → a B_1 . . . B_k,

where k ≥ 0, A, B_1, . . . , B_k ∈ N and a ∈ T.
5.6 Theorem : Every context-free grammar can be transformed into an equivalent grammar in
Greibach normal form.
Proof : see literature.
6 Deterministic context-free languages
In section 3 we have shown that in general every context-free language can be recognized by
a nondeterministic push-down automaton. The question is, whether we can eliminate the non-
determinism as in the case of nite automata and whether we are always able to construct
equivalent deterministic push-down automata. This question is of practical importance for the
construction of a parser for a given context-free language. First we dene the notion of deter-
minism for push-down automata and context-free languages.
6.1 Definition:
(i) A push-down automaton 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F) is called deterministic, if the transition relation fulfils the following conditions:

    ∀ q ∈ Q, Z ∈ Γ, a ∈ Σ:
    ( number of transitions of the form (q, Z) −a→ . . .
    + number of transitions of the form (q, Z) −ε→ . . . ) ≤ 1

(ii) A context-free language L is called deterministic, if there is a deterministic push-down automaton 𝒜 with L = L(𝒜) (acceptance by final states).
Example: The language L = { a^n b^n | n ∈ ℕ } is deterministic, because in section 3 we have introduced a deterministic push-down automaton 𝒜 with L(𝒜) = L.

Example: The language PAL_c = { w c w^R | w ∈ {a, b}* } of palindromes with middle marker c is deterministic.

For the complement one would like to proceed as for DFAs and simply swap final and non-final states: consider 𝒜′ = (Σ, Q, Γ, δ, q_0, Z_0, Q − F) for a deterministic PDA 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F). Unfortunately, it generally holds that L(𝒜′) ≠ ∁L(𝒜): if 𝒜 does not read its input w completely, then w ∉ L(𝒜). However, the same then holds for 𝒜′, i.e. w ∉ L(𝒜′).
Thus we must first transform 𝒜 into an equivalent deterministic PDA which terminates after finitely many steps for every input string. Such a construction is actually possible for push-down automata, because the set

    { (q, A) ∈ Q × Γ | there is an infinite ε-transition sequence starting from (q, A) }

can be effectively constructed for 𝒜, and the respective transitions (q, A) −ε→ (q′, γ) can be deleted or replaced by appropriate transitions.

Details: think yourself or refer to literature.
6.3 Corollary : There exist context-free languages that are not deterministic.
Proof : If all context-free languages were deterministic, then the context-free languages would
be closed under complementation. Contradiction to the theorem from section 4.1
6.4 Lemma: Deterministic context-free languages are
(i) closed under intersection with regular languages,
(ii) not closed under union, intersection, concatenation and iteration.

Proof: We prove (i) using the same construction as for the nondeterministic case (see section 4.1); the PDA 𝒜 generated from 𝒜_1 and 𝒜_2 is deterministic, if 𝒜_1 is deterministic.
(ii) The context-free languages L_1 = { a^m b^n c^n | m, n ≥ 0 } and L_2 = { a^m b^m c^n | m, n ≥ 0 } are both deterministic, however their intersection is not even context-free. Because

    L_1 ∩ L_2 = ∁(∁L_1 ∪ ∁L_2)

holds, the deterministic context-free languages are not closed under union. For concatenation and iteration refer to literature.
As an example for a nondeterministic context-free language, we consider

    PAL = { w w^R | w ∈ {a, b}*, w ≠ ε },

the language of all non-empty palindromes of even length. The notation w^R means that the string w should be read backwards. PAL is obviously context-free: to generate it we need the following rules

    S → aa | bb | aSa | bSb.

In order to show that PAL is nondeterministic we use an auxiliary operator Min.
6.5 Definition: For a language L ⊆ Σ* let

    Min(L) = { w ∈ L | there is no strict prefix v of w with v ∈ L },

where v is a strict prefix of w, if v ≠ w and ∃ u ∈ Σ*: w = v · u.

6.6 Lemma: If L with ε ∉ L is a deterministic context-free language, then so is Min(L) as well.
Proof: Consider a deterministic PDA 𝒜 = (Σ, Q, Γ, δ, q_0, Z_0, F) with L(𝒜) = L. Because ε ∉ L, it holds as well that q_0 ∉ F. We transform 𝒜 into a PDA 𝒜_1 that functions as 𝒜, but in every transition sequence it reaches one of the final states from F at most one time and then stops immediately. Let us consider a new state q_1 ∉ Q and define 𝒜_1 = (Σ, Q ∪ {q_1}, Γ, δ_1, q_0, Z_0, {q_1}) with

    δ_1 = { (q, Z) −α→ (q′, γ) | q ∈ Q − F and (q, Z) −α→ (q′, γ) in δ }
        ∪ { (q, Z) −ε→ (q_1, Z) | q ∈ F and Z ∈ Γ }.

𝒜_1 is deterministic and L(𝒜_1) = Min(L(𝒜)) holds, because 𝒜 is deterministic.
Now we show:

6.7 Theorem: The context-free language PAL is nondeterministic.

Proof: Hypothesis: PAL is deterministic. Then according to both lemmas considered above the language

    L_0 = Min(PAL) ∩ L( (ab)^+ (ba)^+ (ab)^+ (ba)^+ )

is context-free and deterministic. Here (ab)^+ stands for the regular expression ab(ab)* and (ba)^+ similarly for the regular expression ba(ba)*.
For the top-down method of parsing we can use the so-called LL(k)-grammars. These are context-free grammars in which, for every intermediate string w_1 A v in the top-down reconstruction of a left derivation S ⇒*_G w_1 A v ⇒*_G w_1 w_2 = w of a string w ∈ T*, the production to be applied to A is uniquely determined by w_1, A and the next k symbols of w_2 (the lookahead).

(Figure: derivation tree with root S, already derived prefix w_1, current non-terminal A with remainder v; the next k symbols of the remaining input w_2 form the lookahead.)
For the bottom-up method we can use the so-called LR(k)-grammars. These are context-free grammars G = (N, T, P, S) with the following property: for every intermediate string v w_2 in the bottom-up reconstruction of a right derivation

    S ⇒*_G u A w_2 ⇒_G u v w_2 ⇒*_G w_1 w_2 = w

of a string w ∈ T*, the production A → v to be applied (in reverse) is uniquely determined by v and the next k symbols of w_2.

(Figure: derivation tree with root S; the handle v sits below the non-terminal A after the already reduced prefix u; the next k symbols of w_2 form the lookahead.)
LL(k)-grammars were introduced in 1968 by P.M. Lewis and R.E. Stearns and LR(k)-grammars in 1965 by D.E. Knuth. The following relations hold for the languages generated by these grammars:

    ⋃_{k≥0} LL(k) languages  ⊊  ⋃_{k≥0} LR(k) languages  =  deterministic context-free languages

It even holds that: LR(1) languages = deterministic context-free languages.
7 Questions of decidability
The following constructions are algorithmically computable:

    PDA accepting by final states  ⟷  PDA accepting by empty stack
    PDA  ⟷  context-free grammar
    context-free grammar  →  ε-free grammar
    context-free grammar  →  Chomsky or Greibach normal form

The questions of decidability concerning context-free languages can therefore be answered using either representation: by context-free grammars (in normal form) or by push-down automata.
We consider the same problems as for regular languages (compare with Chapter II, section 5).
7.1 Theorem (decidability): For context-free languages
the membership problem,
the emptiness problem,
the finiteness problem
are decidable.
Proof :
Membership problem: Consider a context-free grammar G = (N, T, P, S) in Greibach normal form and a string w ∈ T*. The question is: Does w ∈ L(G) hold? The case w = ε can be decided immediately, because G is ε-free: ε ∈ L(G) ⟺ S → ε ∈ P. Now let w ≠ ε. Then the following holds:

    w ∈ L(G) ⟺ ∃ n ≥ 1: S ⇒_G . . . ⇒_G w        (n times)

In Greibach normal form every derivation step produces exactly one character of w. Therefore

    w ∈ L(G) ⟺ S ⇒_G . . . ⇒_G w        (|w| times)

In order to decide w ∈ L(G) it is enough to check all derivation sequences of the length |w| in G. This implies the decidability of the membership problem.
Emptiness problem: Consider a context-free grammar G = (N, T, P, S). The question is: Does L(G) = ∅ hold? Let n be the number which belongs to the context-free language L(G) according to the pumping lemma. The same way as in the case of regular languages we show that:

    L(G) ≠ ∅ ⟺ ∃ w ∈ L(G): |w| < n.

Thus we have reduced the emptiness problem to the membership problem for the finitely many strings w ∈ T* with |w| < n.
The transition function δ of a Turing machine can be written down as a Turing table with rows

    δ:  q_1  a_1  q′_1  a′_1  P_1
        . . .
        q_n  a_n  q′_n  a′_n  P_n

where q_i, q′_i ∈ Q, a_i, a′_i ∈ Γ and P_i ∈ {R, L, S}; a row q a q′ a′ P means δ(q, a) = (q′, a′, P).
Operating principles of a TM: informal

A configuration of the TM describes the current state, the contents of the tape and the currently scanned cell. We write ␣ for the blank symbol.

Initial configuration for an input string w = w_0 w_1 . . . w_n:

    . . . ␣ w_0 w_1 . . . w_n ␣ . . .
            ↑
            q_0

Making a step: e.g. δ(q_0, w_0) = (q_0, a, R) leads to

    . . . ␣ a w_1 . . . w_n ␣ . . .
              ↑
              q_0

Repetition of this step gives us the following result:

    . . . ␣ a a . . . a ␣ . . .
                        ↑
                        q_0

where δ(q_0, ␣) is undefined. Then as the result of the computation we get the string aa . . . a, i.e. the contents of the tape without blanks.
Example: Let us compute the function

    linear: {|}* → {0, 1}*    with    linear(|^n) = 1 if n can be divided by 2, and 0 otherwise

for n ≥ 0. Choose

    M_3 = ({q_0, q_1, q_e}, {|}, {|, 0, 1, ␣}, δ, q_0, ␣)

using the following Turing table:

    δ:  q_0  |  q_1  ␣  R
        q_0  ␣  q_e  1  S
        q_1  |  q_0  ␣  R
        q_1  ␣  q_e  0  S

We can easily check that M_3 computes the function linear.
Notion of the configuration

We consider the current state q, the current contents of the tape and the current position of the writing/reading head. Usually two kinds of notation are used:

(1) We number the cells of the infinite tape sequentially: . . ., −1, 0, 1, 2, . . .
    Contents of the tape: f: ℤ → Γ (ℤ = set of integers)
    Configuration: triple (q, f, i) with i ∈ ℤ
    Disadvantage: complicated manipulation

(2) Only a finite part of the tape differs from ␣: abstraction from blanks and cell numbers. The configuration

    . . . ␣ u_1 . . . u_m v_0 v_1 . . . v_n ␣ . . .
                         ↑
                         q

can be unambiguously represented as a string

    u q v    with u = u_1 . . . u_m and v = v_0 v_1 . . . v_n,

where u_i, v_j ∈ Γ and m, n ≥ 0.
Note: Q ∩ Γ = ∅ is assumed, so this representation is unambiguous.

1.2 Definition: The set 𝒦_M of configurations of M is

    𝒦_M = Γ* · Q · Γ^+.

A configuration uqv means that the TM is in state q, the contents of the tape is . . . ␣ u v ␣ . . ., and the first (leftmost) symbol of the string v is read.
1.3 Definition (Operating principles of a TM M = (Q, Σ, Γ, δ, q_0, ␣)):

(1) The initial configuration α(v) for a string v ∈ Σ* is

    α(v) = q_0 v  if v ≠ ε,   and   α(v) = q_0 ␣  otherwise.

(2) The transition relation ⊢ ⊆ 𝒦_M × 𝒦_M is defined as follows:

    K ⊢ K′   (K′ is a successor configuration of K)

if ∃ u, v ∈ Γ*, a, b, a′ ∈ Γ, q, q′ ∈ Q:

      (K = u q a v   ∧ δ(q, a) = (q′, a′, S) ∧ K′ = u q′ a′ v)
    ∨ (K = u q a b v ∧ δ(q, a) = (q′, a′, R) ∧ K′ = u a′ q′ b v)
    ∨ (K = u q a     ∧ δ(q, a) = (q′, a′, R) ∧ K′ = u a′ q′ ␣)
    ∨ (K = u b q a v ∧ δ(q, a) = (q′, a′, L) ∧ K′ = u q′ b a′ v)
    ∨ (K = q a v     ∧ δ(q, a) = (q′, a′, L) ∧ K′ = q′ ␣ a′ v)
By ⊢* we denote the reflexive and transitive closure of ⊢, i.e. it holds

    K ⊢* K′  if  ∃ K_0, . . . , K_n ∈ 𝒦_M, n ≥ 0: K = K_0 ⊢ . . . ⊢ K_n = K′.

Further on, by ⊢+ we denote the transitive closure of ⊢:

    K ⊢+ K′  if  ∃ K_0, . . . , K_n ∈ 𝒦_M, n ≥ 1: K = K_0 ⊢ . . . ⊢ K_n = K′.

Because δ is a function, ⊢ is right-unique: from K ⊢ K_1 and K ⊢ K_2 it follows that K_1 = K_2.
1.4 Definition: The function computed by the TM M = (Q, Σ, Γ, δ, q_0, ␣) is

    h_M: Σ* ⇀ Γ*

with

    h_M(v) = w,  if there is a final configuration K ∈ 𝒦_M with α(v) ⊢* K and w = ω(K),
    h_M(v) undefined otherwise,

where ω(K) denotes the result of the final configuration K, i.e. the contents of the tape without blanks. We also write Res_M for h_M (result function of M).

Remark: Because ⊢ is right-unique, h_M is a well-defined partial function.

Illustration of the result function:

    v ↦ α(v) ⊢* K ∈ 𝒦_M ↦ ω(K) = h_M(v),  whenever h_M(v) is defined.

The set of results of M is { w | ∃ v ∈ Σ*: h_M(v) = w }.
1.6 Definition (Turing computability):
Let A, B be alphabets.
(i) A partially defined function h: A* ⇀ B* is called Turing computable, if there is a Turing machine M with h = h_M, i.e. h(v) = h_M(v) for all v ∈ A*.
(ii) T_{A,B} =_def { h: A* ⇀ B* | h is Turing computable }
(iii) Let T be the class of all Turing computable functions (for arbitrary alphabets A, B).
1.7 Definition (Turing decidability):
Let A be an alphabet.
(i) A set L ⊆ A* is called Turing decidable, if the characteristic function χ_L: A* → {0, 1}* is Turing computable. Thus χ_L is the following total function:

    χ_L(v) = 1 if v ∈ L,  and 0 otherwise.

(ii) A property E: A* → {true, false} is called Turing decidable, if the set { v ∈ A* | E(v) = true } of the strings with the property E is Turing decidable.
Remarks on computability

(1) Computability of functions with several arguments: a k-tuple as input can be described by a function h: (A*)^k ⇀ B*, where the k argument strings are written on the tape separated by a special symbol.

(2) Computability of number-theoretic functions: a partial function f: ℕ ⇀ ℕ is called Turing computable, if the function

    h_f: {|}* ⇀ {|}*    with    h_f(|^n) = |^{f(n)}

is Turing computable (unary encoding of numbers).

Constructing Turing machines: the flowchart notation

First we define useful elementary TMs. Let Γ = {a_0, . . . , a_n} be the tape alphabet with a_0 = ␣.
Small right-machine r: makes one step to the right and halts. Turing table:

    r:  q_0  a_0  q_e  a_0  R
        . . .
        q_0  a_n  q_e  a_n  R

Small left-machine l: makes one step to the left and then halts. Turing table:

    l:  q_0  a_0  q_e  a_0  L
        . . .
        q_0  a_n  q_e  a_n  L

Printer-machine a for a ∈ Γ: prints the symbol a and then halts. Turing table:

    a:  q_0  a_0  q_e  a  S
        . . .
        q_0  a_n  q_e  a  S
Further we assume that all constructed TMs have exactly one final state, i.e. there exists one state q_e so that for all final configurations uqv it holds that q = q_e. It is obvious that the TMs r, l, a fulfil this condition. Such TMs can be combined as follows:

M_1 −a→ M_2 means that first M_1 works. If M_1 terminates on a cell with the symbol a, then M_2 starts.

M_1 → M_2 means that first M_1 works. As soon as M_1 terminates, M_2 starts. M_1 M_2 is an abbreviation for M_1 → M_2.

M_1 −≠a→ M_2 means that first M_1 works. If M_1 terminates on a cell with a symbol ≠ a, then M_2 starts.

On the basis of the given TMs we can construct flowcharts. The nodes of a flowchart are labelled with the names of TMs. The edges are labelled with arrows of the form →, −a→ or −≠a→. Loops are allowed. One TM in the flowchart is marked with an ingoing arrow as the start-TM.
Illustration:

(Small flowchart: start-TM M_1 with an edge −a→ to M_2 and an edge −b→ looping back to M_1.)
A flowchart describes a large TM. We can get its Turing table as follows:
Step 1: For every occurrence of a TM M_i in the flowchart construct the respective Turing table.
Step 2: Make the states in the different tables disjoint.
Step 3: Generate the table of the TM by writing all tables (in any order) one below each other.
Step 4: Combining: for each M_1 −a→ M_2 in the flowchart add to the table of the TM the line

    q_{e1}  a  q_{02}  a  S,

where q_{e1} is the final state (renamed according to Step 2) of M_1 and q_{02} the start state (renamed according to Step 2) of M_2. Similarly for M_1 → M_2 and M_1 −≠a→ M_2.
Example:

Large right-machine R: first makes a step to the right; then R moves further to the right until a blank a_0 = ␣ is observed. As a flowchart, R is the machine r with a loop −≠␣→ back to itself.

Construction of the Turing table of R:

    R:  q_0  a_0  q_e  a_0  R
        . . .
        q_0  a_n  q_e  a_n  R
        q_e  a_1  q_0  a_1  S
        . . .
        q_e  a_n  q_0  a_n  S

Large left-machine L: similarly to the large right-machine; L is the machine l with a loop −≠␣→ back to itself.
Application: Consider a TM for the computation of the minus function

    f: ℕ × ℕ → ℕ    with    f(x, y) = x ∸ y = x − y if x ≥ y, and 0 otherwise.

The initial configuration of the TM is:

    ␣ q_0 | . . . |  #  | . . . | ␣
         (x times)     (y times)

Idea: erase x-dashes as long as y-dashes are there. Then erase the remaining y-dashes and #.

Construction of the TM using a flowchart:

(Flowchart built from the machines r, l, R, L and the ␣-printer: as long as the y-block is non-empty and the x-block is non-empty, one dash is erased on each side; the exit cases y = 0 and x = 0 erase # and whatever dashes remain on the right.)

As an exercise we recommend constructing the whole Turing table.
Variations of Turing machines

There exist many variations of the TM definition. These variations are often more convenient if we want to prove that a certain function is computable. We can show that these variations do not increase the power of the TM as defined in Definition 1.1. First we consider TMs with several tapes.

1.8 Definition (k-tape Turing machine): A k-tape TM M = (Q, Σ, Γ, δ, k, q_0, ␣) is a 7-tuple with the following properties:

(i)-(v) Q, Σ, Γ, q_0, ␣ as in Definition 1.1.
(vi) k ∈ ℕ is the number of tapes.
(vii) The transition function is now given as follows:

    δ: Q × Γ^k ⇀ Q × Γ^k × {R, L, S}^k
(Illustration for k = 3: three tapes with one head each; in state q the machine reads, e.g., the symbols (a, c, c) and performs a transition δ(q, (a, c, c)) = (q′, . . .) that rewrites all three scanned cells and moves all three heads.)

A configuration of a k-tape TM is a k-tuple

    (u_1 q v_1, . . . , u_k q v_k)

with u_1, . . . , u_k ∈ Γ*, q ∈ Q and v_1, . . . , v_k ∈ Γ^+.
Operating principles of a k-tape TM:

(1) The initial configuration for a string v ∈ Σ* is

    α_k(v) = (α(v), q_0 ␣, . . . , q_0 ␣)        ((k−1) times q_0 ␣),

i.e. v is written on the first tape.

(2) Transition relation ⊢: similarly to the 1-tape case.

(3) Final configuration: no successor configuration (as before).

(4) The result ω_k of a k-tape configuration is

    ω_k(u_1 q v_1, . . . , u_k q v_k) = ω(u_1 q v_1),

i.e. we take the result of the first tape; we consider the remaining tapes only as auxiliary tapes for computation.

The function computed by a k-tape TM M is h_M: Σ* ⇀ Γ* with

    h_M(v) = w,  if there is a final configuration K ∈ 𝒦_M with α_k(v) ⊢* K and w = ω_k(K),
    h_M(v) undefined otherwise.

Remark: Several tapes can be used for computing functions with several arguments; in this case we distribute the input strings on the tapes appropriately.
Goal: We want to prove that every function computed by a k-tape TM can be computed by a normal 1-tape TM.

Idea of the proof: Every k-tape TM can be simulated by a suitably constructed 1-tape TM. Therefore, first we want to specify the notion of simulation.
Notion of simulation

The notion of simulation is essential for proving that different machines have the same power.

1.9 Definition: We consider (1- or k-tape) TMs M and M′ with configuration sets 𝒦_M and 𝒦_M′, initial configurations α, α′ and results ω, ω′; we call M the upper and M′ the lower TM. A simulation of M by M′ is a total mapping sim: 𝒦_M → 𝒦_M′ with the following properties:

(1) ∀ v ∈ Σ*: α′(v) ⊢′* sim(α(v)).

(2) ∀ K_1, K_2 ∈ 𝒦_M: K_1 ⊢ K_2 implies sim(K_1) ⊢′+ sim(K_2),
    i.e. every step of M can be simulated by several steps of M′.

(3) ∀ final config. K ∈ 𝒦_M ∃ final config. K′ ∈ 𝒦_M′: sim(K) ⊢′* K′ and ω(K) = ω′(K′).
Illustration:

    M:   α(v)        ⊢  K_1  ⊢  K_2  ⊢ . . . ⊢  K
          |  sim           |  sim                |  sim
    M′:  α′(v) ⊢′* sim(α(v)) ⊢′+ sim(K_1) ⊢′+ . . . ⊢′+ sim(K)

Lemma: If there is a simulation sim of M by M′, then the computed functions h_M and h_M′ coincide, i.e.

    ∀ v ∈ Σ*: h_M(v) = h_M′(v)

(the equality here means that either both sides are undefined or they have the same value).
Proof: Let v ∈ Σ*.

Case 1: h_M(v) is undefined. Then there is an infinite sequence of configurations

    α(v) = K_1 ⊢ K_2 ⊢ K_3 ⊢ . . .

of M. Therefore, due to properties (1) and (2) of sim, we have an infinite sequence of configurations

    α′(v) ⊢′* sim(K_1) ⊢′+ sim(K_2) ⊢′+ sim(K_3) ⊢′+ . . .

of M′. Thus h_M′(v) is undefined as well.

Case 2: h_M(v) = w is defined. Then there is a finite sequence of configurations

    α(v) = K_1 ⊢ . . . ⊢ K_n,  n ≥ 1,

of M, where K_n is a final configuration and w = ω(K_n) holds. Due to the properties (1)-(3) of sim, we have a finite sequence of configurations

    α′(v) ⊢′* sim(K_1) ⊢′+ . . . ⊢′+ sim(K_n) ⊢′* K′_n

of M′, where K′_n is a final configuration with ω(K_n) = ω′(K′_n). Thus it holds that

    h_M′(v) = ω′(K′_n) = ω(K_n) = h_M(v).

Theorem: For every k-tape TM M there is a 1-tape TM M′ with h_M = h_M′.
Proof: Let M = (Q, Σ, Γ, δ, k, q_0, ␣). We construct M′ = (Q′, Σ, Γ′, δ′, q′_0, ␣) so that there is a simulation sim of M by M′. Definition of sim: a k-tape configuration

    K = (u_1 q a_1 v_1, . . . , u_k q a_k v_k)

is simulated by the 1-tape configuration

    sim(K) = u_1 q â_1 v_1 # . . . # u_k â_k v_k,

i.e. the k tape contents are written one after another on a single tape, separated by the new symbol #; the scanned symbol of each tape is replaced by a marked copy â_i, and the state q is stored once, in front of the marked symbol of the first tape. M′ works according to the following idea of simulation. A step of M of the form
    K_1 = (u_1 q_1 a_1 v_1, . . . , u_k q_1 a_k v_k)  ⊢  K_2 = (u_1 q_2 b_1 v_1, . . . , u_k q_2 b_k v_k)

generated by

    δ(q_1, (a_1, . . . , a_k)) = (q_2, (b_1, . . . , b_k), (S, . . . , S))

is simulated by M′ in 2 phases of steps.
Reading phase: M′ runs over the whole tape from left to right and collects the marked symbols in its state:

    sim(K_1) = u_1 q_1 â_1 v_1 # . . . # u_k â_k v_k
             ⊢′* u_1 [read, q_1] â_1 v_1 # . . . # u_k â_k v_k
             ⊢′* u_1 â_1 v_1 # . . . # u_k [read, q_1, a_1, . . . , a_k] â_k v_k

Changing phase: Now M′ runs back and changes the configuration according to δ(q_1, (a_1, . . . , a_k)):

    u_1 â_1 v_1 # . . . # u_k [read, q_1, a_1, . . . , a_k] â_k v_k
    ⊢′  u_1 â_1 v_1 # . . . # u_k [change, q_2, b_1, . . . , b_k, S, . . . , S] â_k v_k
    ⊢′* u_1 [change, q_2] b̂_1 v_1 # . . . # u_k b̂_k v_k
    ⊢′* sim(K_2) = u_1 q_2 b̂_1 v_1 # . . . # u_k b̂_k v_k

Thus a step of M is simulated by finitely many steps of M′, at most proportionally to the length of the used part of the tape.
Now we consider the initial and final configurations.

Initial configuration: Let v ∈ Σ* and v ≠ ε. Then for M

    α_k(v) = (q_0 v, q_0 ␣, . . . , q_0 ␣)    and    sim(α_k(v)) = q_0 v̂ # ␣̂ # . . . # ␣̂.

For M′ we have α′(v) = q′_0 v. Of course we can program M′ in such a way that

    α′(v) ⊢′* sim(α_k(v))

holds.

Final configuration: If K ∈ 𝒦_M is a final configuration, we can program M′ in such a way that

    sim(K) ⊢′* u_1 q_e a_1 v_1

holds and K′ = u_1 q_e a_1 v_1 is the final configuration of M′ with ω(K) = ω′(K′).
Thus we have:

    Q′ = Q ∪ {q′_0, q_e}
       ∪ { [read, q, a_1, . . . , a_j] | q ∈ Q, a_1, . . . , a_j ∈ Γ, 0 ≤ j ≤ k }
       ∪ { [change, q, b_1, . . . , b_j, P_1, . . . , P_j] | q ∈ Q, 0 ≤ j ≤ k, b_1, . . . , b_j ∈ Γ,
           P_1, . . . , P_j ∈ {R, L, S} }
       ∪ . . . further states . . .

In general we have now shown that sim is a simulation of M by M′. With the lemma above it follows that h_M = h_M′.
Other variations of Turing machines

Below we will consider other variations of Turing machines:

Turing machines with k heads on one tape:

(Illustration: one tape containing F R E I B U R G, with several heads in state q.)

Definition of the transition function δ: as in Definition 1.8 of the k-tape TM, but the definition of configuration is different, because k cells on the tape must be marked.

Turing machines with a two-dimensional computational space divided into cells (computing on checkered paper), possibly with several heads:

(Illustration: a two-dimensional grid with filled and blank cells and a head in state q.)

Turing machine with one-sided infinite tape:

(Illustration: a tape with a left end, infinite only to the right.)

A normal TM (with two-sided infinite tape) can be simulated as follows by a TM with one-sided infinite tape: the 2-sided configuration K = a_m . . . a_1 q b_1 . . . b_n with m ≥ 0, n ≥ 1 is (for m ≤ n) simulated by the 1-sided configuration

    sim(K) = q′ (b_1, a_1) . . . (b_m, a_m) (b_{m+1}, ␣) . . . (b_n, ␣),

i.e. pairs of symbols of the 2-sided tape are the symbols of the 1-sided tape: the tape is folded over at the head position, the left half written underneath the right half.
Turing machines with final states:

    M = (Q, Σ, Γ, δ, q_0, ␣, F),

where F ⊆ Q is the set of final states and all remaining components are defined as before. We will use TMs with final states in order to accept languages L ⊆ Σ*.

1.12 Definition:
(1) A configuration K = uqv of M is called accepting, if q ∈ F.
(2) Let v ∈ Σ*. M accepts v, if there is an accepting final configuration K with α(v) ⊢* K.
(3) The language accepted by M is

    L(M) = { v ∈ Σ* | M accepts v }.

A language L ⊆ Σ* is called Turing-acceptable, if there is a TM M with final states and L = L(M).
Theorem: L and ∁L are Turing-acceptable ⟺ L is Turing-decidable.

Proof: ⟸: Let L be decidable by M. By adding final state transitions we get an accepting TM for L, and likewise for ∁L.

⟹: L is accepted by M_1 and ∁L by M_2. Now let us construct M with 2 tapes in such a way that a step of M_1 on tape 1 and simultaneously a step of M_2 on tape 2 is made. If M_1 accepts the string, then M produces the value 1. Otherwise M_2 accepts the string, and M produces the value 0. Thus M decides the language L.

Remark: If only L is Turing-acceptable, then it does not imply that L is also Turing-decidable. The accepting TM might not terminate for strings from ∁L.
Non-deterministic Turing machines:

We consider 1-tape TMs with final states. The transition function is extended as follows:

    δ: Q × Γ → 𝒫(Q × Γ × {R, L, S}).

A tuple (q′, b, R) ∈ δ(q, a) means that the TM, in case the symbol a is read in state q, can do the following: enter state q′, write b, move to the right. If additionally, e.g., (q″, c, S) ∈ δ(q, a) holds, then the TM can do the following instead: enter state q″, write c, stay on the cell.

The choice between these two possible steps of the TM is non-deterministic, i.e. the TM can arbitrarily choose one of the two possible steps. The transition relation ⊢ of a non-deterministic TM is not right-unique any more. Apart from that, the language L(M) accepted by M is defined as for a deterministic TM.
1.14 Theorem: Every language accepted by a non-deterministic TM is also acceptable by a deterministic TM.

Proof: Consider a non-deterministic 1-tape TM M. Then there exists a maximal number r of non-deterministic choices which M has according to δ for a state q and a symbol a, i.e.

    ∀ q ∈ Q ∀ a ∈ Γ: |δ(q, a)| ≤ r,
    ∃ q ∈ Q ∃ a ∈ Γ: |δ(q, a)| = r.

By r we denote the degree of non-determinism of M. For each pair (q, a) we number the possible choices from 1 to (at most) r consecutively according to δ(q, a). Then we can represent every finite sequence of non-deterministic choices as a string over the alphabet {1, . . . , r}.
Example: For r = 5, for example, 1 3 5 means:
- in the 1st step make the 1st choice,
- in the 2nd step make the 3rd choice,
- in the 3rd step make the 5th choice.
Now let v ∈ Σ*. M accepts v if and only if there is an accepting configuration K with α(v) ⊢* K. The set of all computations of M starting in α(v) can be represented as a tree with nodes labelled with configurations:

(Tree: root α(v) = K with successors K_1, . . . , K_r; each K_i has successors K_{i1}, . . . , K_{ir}, and so on; somewhere in the tree an accepting configuration may occur.)

The deterministic TM M′ searches this tree by breadth-first search for an accepting configuration. For this purpose M′ needs 3 tapes.

Tape 1 stores the input string v.

Tape 2 is used for systematic generation of all strings over {1, . . . , r}. In order to model the breadth-first search the strings are generated as follows:
(i) according to the length, the shorter first,
(ii) when the length is equal, we use the lexicographic order. Thus we get the following sequence:

    ε, 1, 2, . . . , r, 1 1, 1 2, . . . , 1 r, . . . , r 1, . . . , r r, . . .

Tape 3 is used for the simulation of M by M′: for every choice string generated on tape 2, M′ copies v from tape 1 to tape 3 and runs M on tape 3, resolving the i-th non-deterministic choice as prescribed by the i-th symbol of the choice string. If this simulated computation reaches an accepting configuration, M′ accepts. Then L(M) = L(M′) holds.
Remark: The above simulation of M by M′ requires in general exponentially many steps of M′.

For a grammar G = (N, T, P, S) the derivation relation ⇒_G ⊆ (N ∪ T)* × (N ∪ T)* is defined as follows:

    x ⇒_G y  if and only if  ∃ u → v ∈ P ∃ w_1, w_2 ∈ (N ∪ T)*: x = w_1 u w_2 and y = w_1 v w_2.
By ⇒*_G we again denote the reflexive and transitive closure of ⇒_G. We read x ⇒*_G y as "y can be derived from x". It holds

    x ⇒*_G y,  if ∃ z_0, . . . , z_n, n ≥ 0: x = z_0 ⇒_G . . . ⇒_G z_n = y.

The sequence x = z_0 ⇒_G . . . ⇒_G z_n = y is also called a derivation of y from x (or from x to y) of the length n. Particularly, it holds x ⇒*_G x with a derivation of the length 0.
2.2 Definition:
(i) The language generated by G is

    L(G) = { w ∈ T* | S ⇒*_G w },

i.e. we are interested only in strings over the terminal symbols; the non-terminal symbols are used as auxiliary symbols within derivations.
(ii) L ⊆ T* is a Chomsky-0-language, if there is a Chomsky-0-grammar G with L = L(G).

2.3 Theorem: Every Turing-acceptable language is a Chomsky-0-language.
Proof: Let L be accepted by a TM M = (Q, Σ, Γ, δ, q_0, ␣, F) with F = {q_e} so that for all configurations uqv ∈ 𝒦_M it holds:

    uqv is a final configuration ⟺ u = ε and q = q_e.

We can easily see that for every TM we can provide a TM with this property which accepts the same language. Now we construct a Chomsky-0-grammar G = (N, T, P, S) with L(G) = L in 4 steps.
Step 1 (Generate doubled start configurations): Let us define a partial grammar G_1 = (N_1, T, P_1, S) with S, ¢, $, q_0, ␣ ∈ N_1 so that for all w ∈ T*

    S ⇒*_{G_1} w ¢ α(w) $

holds and even for all v ∈ (N ∪ T)*:

    if S ⇒*_{G_1} v and v contains q_0, then ∃ w ∈ T*: v = w ¢ α(w) $.

The start configuration α(w) is defined as usual:

    α(w) = q_0 w  if w ≠ ε,   and   α(w) = q_0 ␣  otherwise.
Choose N_1 = {S, ¢, $, q_0, ␣, A, B} ∪ { C_a | a ∈ T } and P_1 as follows:

    P_1 = { S → ¢ q_0 ␣ $,
            S → a A ¢ C_a $        for all a ∈ T,
            C_a b → b C_a          for all a, b ∈ T,
            C_a $ → B a $          for all a ∈ T,
            b B → B b              for all b ∈ T,
            A ¢ B → ¢ q_0,
            A ¢ B → a A ¢ C_a      for all a ∈ T }
Operating principle of P_1:

    S ⇒_{G_1} ¢ q_0 ␣ $ = ¢ α(ε) $

or

    S  ⇒_{G_1}  a A ¢ C_a $
       ⇒*_{G_1} w A ¢ B w $
       ⇒_{G_1}  w a A ¢ C_a w $
       ⇒*_{G_1} w a A ¢ w C_a $
       ⇒_{G_1}  w a A ¢ w B a $
       ⇒*_{G_1} w a A ¢ B w a $
       ⇒_{G_1}  w a ¢ q_0 w a $
Step 2 (Simulate the transition relation ⊢): Let us define G_2 = (N_2, T, P_2, S) with N_2 = {S, ¢, $} ∪ Q ∪ (Γ − T) so that for all w ∈ T*, v ∈ Γ* it holds that:

    α(w) ⊢* q_e v   ⟺   w ¢ α(w) $ ⇒*_{G_2} w ¢ q_e v $.
We cannot simply choose P_2 = ⊢, because e.g. δ(q, a) = (q′, a′, S) implies that for all u, v ∈ Γ*: u q a v ⊢ u q′ a′ v, an infinite set of pairs. More precisely we define:

    P_2 = { q a → q′ a′        | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, S) }
        ∪ { q a b → a′ q′ b    | q, q′ ∈ Q and a, a′, b ∈ Γ and δ(q, a) = (q′, a′, R) }
        ∪ { q a $ → a′ q′ ␣ $  | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, R) }
              (TM is at the right end of the tape)
        ∪ { b q a → q′ b a′    | q, q′ ∈ Q and a, a′, b ∈ Γ and δ(q, a) = (q′, a′, L) }
        ∪ { ¢ q a → ¢ q′ ␣ a′  | q, q′ ∈ Q and a, a′ ∈ Γ and δ(q, a) = (q′, a′, L) }
              (TM is at the left end of the tape)
Step 3 (Erase the final configuration): Define G_3 = (N_3, T, P_3, S) with N_3 = {S, ¢, $, q_e, ␣, D} so that for all w ∈ T*, v ∈ Γ* it holds that:

    w ¢ q_e v $ ⇒*_{G_3} w.

Choose P_3 as follows:

    P_3 = { ¢ q_e → D,
            D a → D    for all a ∈ Γ,
            D $ → ε }

Operating principle:

    w ¢ q_e v $ ⇒_{G_3} w D v $ ⇒*_{G_3} w D $ ⇒_{G_3} w.
Step 4 (Compose G from G_1, G_2, G_3): Now let us define G = (N, T, P, S) as follows:

    N = N_1 ∪ N_2 ∪ N_3,    P = P_1 ∪ P_2 ∪ P_3    (disjoint union).

Then for all w ∈ T*, v ∈ Γ* it holds that:

    α(w) ⊢* q_e v   ⟺   S ⇒*_G w ¢ α(w) $      (rules P_1)
                          ⇒*_G w ¢ q_e v $       (rules P_2)
                          ⇒*_G w                  (rules P_3)

For "⟸" note that the rules from at most one of the sets P_1, P_2 or P_3 can be applied to a given string over N ∪ T. Therefore we get L(G) = L.
2.4 Corollary: Let the function h: T* ⇀ T* be Turing computable, say h = h_M for a TM M. Then the language { w # v | w, v ∈ T* and h(w) = v } is Turing-acceptable.

Proof: A 2-tape TM M′ operates as follows:
(1) M′ leaves a given input string of the form w # v unchanged on the 1st tape.
(2) M′ copies the part w onto the initially empty 2nd tape and then simulates M on this tape.
(3) If M terminates, then the result h(w) of the final configuration is compared to the part v of the 1st tape. If h(w) = v holds, then M′ accepts. Thus the language is Turing-acceptable.
2.5 Theorem: Every Chomsky-0-language is Turing-acceptable.

Proof: L is generated by a Chomsky-0-grammar G = (N, T, P, S): L(G) = L. We construct a non-deterministic 2-tape TM M which accepts L: M leaves a given input string w ∈ T* unchanged on the 1st tape, simulates non-deterministically a derivation of G on the 2nd tape, and accepts if the derived terminal string equals w.

2.6 Definition: A grammar G = (N, T, P, S) is called
(i) context-sensitive (Chomsky-1- or shortly CH1-grammar) if and only if
    for all p → q ∈ P there are u, v, w ∈ (N ∪ T)*, A ∈ N with w ≠ ε: p = u A v and q = u w v
    (i.e. a non-terminal symbol A ∈ N is replaced by the non-empty string w in context u . . . v).
(ii) context-free (Chomsky-2- or shortly CH2-grammar) if and only if
    for all p → q ∈ P: p ∈ N (q = ε is allowed).
(iii) right-linear (Chomsky-3- or shortly CH3-grammar) if and only if
    for all p → q ∈ P: p ∈ N and q ∈ T* ∪ T* · N.
For context-sensitive grammars G = (N, T, P, S) there exists the following special rule, so that ε ∈ L(G) is possible: an ε-production may only be of the form S → ε. Then all other derivations begin with a new non-terminal symbol S′:

    S → ε | S′ . . .

2.7 Definition: A language L is called context-sensitive, context-free, right-linear, if there exists a grammar of the corresponding type which generates L. Let the classes of languages be defined for some alphabet T:

    CH0  = class of languages generated by Chomsky-0-grammars
    CS   = class of context-sensitive languages
    CF   = class of context-free languages
    RLIN = class of right-linear languages
By definition it already holds (however, the inclusion CF ⊆ CS can be proved only on the basis of the later theorems):

2.8 Corollary: RLIN ⊆ CF ⊆ CS ⊆ CH0
Now we want to show that the Chomsky-3- (i.e. right-linear) languages are exactly the finitely acceptable (i.e. regular) languages from Chapter II.

2.9 Lemma: Every finitely acceptable language is a Chomsky-3-language.
Proof: Consider a DFA 𝒜 = (Σ, Q, δ, q_0, F). Let us construct the following Chomsky-3-grammar G:

    G = (Q, Σ, P, q_0),

where for q, q′ ∈ Q and a ∈ Σ:

    P = { q → a q′ | q −a→ q′ } ∪ { q → ε | q ∈ F }.

It is easy to show that L(𝒜) = L(G) holds.
For the reverse direction we use the following lemma.
Remark: Every Chomsky-3-language can be generated by a grammar G = (N, T, P, S) with P ⊆ N × (T · N ∪ {ε}).

Proof: Exercise.
2.10 Lemma: Every Chomsky-3-language is finitely acceptable.

Proof: Consider a Chomsky-3-language L ⊆ T*, generated by a grammar of the restricted form from the remark above; reading its productions backwards as transitions yields a finite automaton accepting L.

For counting the Turing machines, let T_{k,l} denote the set of all Turing machines of the normal form

    M = ({q_0, . . . , q_k}, {|}, {|, a_0, . . . , a_l}, δ, q_0, ␣),

where δ is one of the finitely many functions δ: Q × Γ ⇀ Q × Γ × {R, L, S}. Thus it holds that: T_{k,l} is finite. Because T = ⋃_{k≥0} ⋃_{l≥1} T_{k,l} holds, T is countable due to a bijection ν: ℕ → T, which is defined according to the following diagonal scheme arranging the sets T_{k,l}:

(Diagonal scheme: the sets T_{k,l} are placed in a two-dimensional array with axes k and l and enumerated diagonal by diagonal.)

The Lemmas 1.6 and 1.7 imply the theorem about the existence of non-computable functions. This raises the following question:

Do there exist functions important for computer science which are non-computable?
2 Example of an undecidable problem: halting for Turing machines

Below we consider the binary alphabet B = {0, 1}. Now we consider sets (languages) L ⊆ B* which are undecidable, i.e. for which the problem

    Given: w ∈ B*.  Question: Does w ∈ L hold?

is not algorithmically solvable. Our main example is the halting problem: given a Turing machine M and an input w, is h_M(w) defined? The relevance of this problem for computer science: is it decidable whether a given program terminates for given input values?

First we translate the halting problem into an appropriate language L ⊆ B*. In order to do it we need a binary encoding of Turing machines. Let us assume that we have countable sets {q_0, q_1, q_2, . . .} of states and {a_0, a_1, a_2, . . .} of tape symbols, now with a_0 = 0, a_1 = 1, a_2 = ␣. We consider only Turing machines M = (Q, B, Γ, δ, q_0, ␣) with Q = {q_0, . . . , q_k} for k ≥ 0 and Γ = {a_0, . . . , a_l} for l ≥ 2.
2.1 Definition: The standard encoding of such a Turing machine M is the following string w_M over ℕ:

    w_M = k l
          . . .
          i j s t nr(P)        for every δ(q_i, a_j) = (q_s, a_t, P)
          . . .

where P ∈ {R, L, S} and nr(R) = 0, nr(L) = 1, nr(S) = 2 holds, and the substrings of the form i j s t nr(P) of w_M are listed row by row in some fixed order. The binary encoding bw_M ∈ B* of M is generated from w_M by replacing every number n by the block 0 1^{n+1}.

Example: For the Turing machine M with k = 1, l = 2 and the transitions δ(q_0, a_j) = (q_1, a_j, R) for j = 0, 1, 2 we get

    w_M = 1 2
          0 0 1 0 0
          0 1 1 1 0
          0 2 1 2 0

and

    bw_M = 011 0111
           01 01 011 01 01
           01 011 011 011 01
           01 0111 011 0111 01
Remark:
1. Not every string w ∈ B* is the binary encoding of a Turing machine.
2. The set { w ∈ B* | ∃ Turing machine M: w = bw_M } is decidable.
3. The binary encoding is injective, i.e. bw_{M_1} = bw_{M_2} implies M_1 = M_2.

2.2 Definition (Special halting problem or self-application problem for Turing machines): The special halting problem or self-application problem for Turing machines is the language

    K = { bw_M | M applied to bw_M halts }.

2.3 Theorem: K ⊆ B* is undecidable.
Proof: Again we use Cantor's diagonal method in a proof by contradiction.

Hypothesis: K ⊆ B* is decidable. Then the characteristic function χ_K: B* → {0, 1}* is computable by a Turing machine M_K:

    χ_K(w) = 1 if w ∈ K,  and 0 otherwise.

We change M_K into a Turing machine M* which enters an infinite loop whenever M_K would halt with output 1, and which halts whenever M_K would halt with output 0. Thus it holds:

    M* applied to bw_{M*} halts
    ⟺ M_K applied to bw_{M*} halts with 0
    ⟺ χ_K(bw_{M*}) = 0
    ⟺ bw_{M*} ∉ K
    ⟺ M* applied to bw_{M*} does not halt.

Contradiction! Thus K is undecidable.
Now we will show a general method which lets us prove that some problems are undecidable by using other problems (e.g. K) which are already known to be undecidable. This method we will call reduction.

2.4 Definition (Reduction): Let L_1 ⊆ Σ*_1 and L_2 ⊆ Σ*_2 be languages. Then L_1 is reducible to L_2, shortly L_1 ≤ L_2, if there is a total computable function f: Σ*_1 → Σ*_2 so that for all w ∈ Σ*_1 it holds that:

    w ∈ L_1 ⟺ f(w) ∈ L_2.

We also write: L_1 ≤ L_2 using f.

Idea: L_1 is a special case of L_2.
2.5 Lemma (Reduction): Let L_1 ≤ L_2. Then it holds that:
(i) If L_2 is decidable, then L_1 is also decidable.
(ii) If L_1 is undecidable, then L_2 is also undecidable.
Proof: Because (ii) is the contraposition of (i), it is enough to show (i). Thus let L_1 ≤ L_2 using the total computable function f and let L_2 be decidable, i.e. χ_{L_2} is computable. Then the composition χ_{L_2} ∘ f (first apply f, then χ_{L_2}) is computable. It holds

    χ_{L_1} = χ_{L_2} ∘ f,

because for all w ∈ Σ*_1:

    χ_{L_1}(w) = (1 if w ∈ L_1, else 0) = (1 if f(w) ∈ L_2, else 0) = χ_{L_2}(f(w)).

Thus χ_{L_1} is computable, i.e. L_1 is decidable.
We apply reduction to the general halting problem for Turing machines.

2.6 Definition (General halting problem for TMs): The (general) halting problem for Turing machines is the language

    H = { bw_M 00 u ∈ B* | M applied to u halts }.

2.7 Theorem: H ⊆ B* is undecidable.

Proof: According to the reduction lemma, it is enough to show K ≤ H, i.e. K is a special case of H. The reduction is here trivial: choose f: B* → B* with f(bw_M) = bw_M 00 bw_M. Then bw_M ∈ K ⟺ f(bw_M) ∈ H.
2.8 Definition: The blank-tape halting problem for Turing machines is the language

    H_0 = { bw_M | M applied to the blank tape halts }.

2.9 Theorem: H_0 ⊆ B* is undecidable.

Proof: We show H ≤ H_0. For a Turing machine M and a string u ∈ B* we can construct a Turing machine M_u which operates as follows: applied to the blank tape, M_u first writes u on the tape; then M_u operates as M (applied to u). It is not important how M_u operates if the tape is not blank at the beginning. We can construct M_u effectively from M and u. Thus there exists a computable function f: B* → B* with

    f(bw_M 00 u) = bw_{M_u}.

Then it holds

    bw_M 00 u ∈ H ⟺ M applied to u halts ⟺ M_u applied to the blank tape halts ⟺ f(bw_M 00 u) ∈ H_0.

Thus H ≤ H_0 using f holds.
We got acquainted with three variants of halting problems, namely K, H and H_0. All three variants are undecidable. For K we have shown it directly using a diagonal method, and for H and H_0 we have shown it by reduction: K ≤ H ≤ H_0.

In all three variants we dealt with the question whether there exists a decision method which decides halting for every given Turing machine. In all cases the answer is no. Perhaps we can at least construct a decision procedure for a given Turing machine M, which decides whether M halts.
2.10 Definition: Consider a Turing machine M. The halting problem for M is the language

    H_M = { w ∈ B* | M applied to w halts }.

For certain Turing machines M the problem H_M is decidable, e.g. when H_M = ∅. However, there also exist Turing machines M for which H_M is undecidable. We show this with the help of universal Turing machines.

2.11 Definition: A Turing machine M_uni is called universal, if for the function h_{M_uni} computed by M_uni the following holds:

    h_{M_uni}(bw_M 00 u) = h_M(u),

i.e. M_uni can simulate every other Turing machine M applied to an input string u ∈ B*.
Relevance for computer science: The Turing machine M_uni is an interpreter of Turing machines which itself is written as a Turing machine: M_uni applied to (bw_M, u) behaves as M applied to u. Thus the construction of M_uni corresponds to writing an interpreter of a simple programming language in this language itself.

2.12 Lemma: Universal Turing machines M_uni can be constructed effectively.

Proof: Obviously M_uni applied to a string w ∈ B* must work as follows:

M_uni determines whether w has the form w = bw_M 00 u.
If no, then M_uni goes into an infinite loop.
If yes, then M_uni simulates the Turing machine M applied to u.
The last point can be described in more detail as follows:

1. M_uni changes the tape to

    bw_M ¢ q_0 u $,

where q_0 is the initial state of M.

2. In general M_uni sees on the tape

    bw_M ¢ v_l q a v_r $,

where v_l q a v_r represents the current configuration of the Turing machine M. M_uni remembers the pair (q, a) and determines the value δ(q, a) of the transition function (which is binary encoded in bw_M) of M.

3. If δ(q, a) is defined, then M_uni returns to the configuration v_l q a v_r of M and then makes the necessary transition. Usually M_uni must run back and forth several times, because it can store only a finite amount of information in its memory Q. As a result we have a new labelling of the tape of the form

    bw_M ¢ v′_l q′ a′ v′_r $,

and M_uni operates further as described in (2).

4. If δ(q, a) is undefined, i.e. v_l q a v_r is a final configuration of M, then M_uni erases the substrings bw_M, ¢ and $ and halts. Thus

    h_{M_uni}(bw_M 00 u) = h_M(u)

as desired.
2.13 Theorem: H_{M_uni} ⊆ B* is undecidable.

Proof: According to the definition of M_uni it holds that H ≤ H_{M_uni} using f = id_{B*}. For the Turing machine M_uni constructed in the proof above it even holds that H = H_{M_uni}, because M_uni gets into an infinite loop if a given string w ∈ B* does not have the form bw_M 00 u. Altogether we have the chain K ≤ H = H_{M_uni} ≤ H_0.
3 Recursive enumerability
In this section we deal with a weakening of the notion of decidability (recursiveness) for languages L over an alphabet A.

3.1 Definition: A language L ⊆ A* is called recursively enumerable, if L = ∅ or there is a total computable function ν: ℕ → A* with

    L = ν(ℕ) = {ν(0), ν(1), ν(2), . . .},

i.e. L is the range of values of ν.

Let us revise the notion "countable" for comparison. L ⊆ A* is countable, if L ≼ ℕ, or equivalently: if there is a total function ν: ℕ → A* with

    L = ∅  or  L = ν(ℕ) = {ν(0), ν(1), ν(2), . . .}.

Thus the difference is that by "countable" we mean that the function ν must not necessarily be computable. We want to consider the new notion of recursive enumerability in more detail.
3.2 Definition (Semi-decidability): A language L ⊆ A* is called semi-decidable, if the partial characteristic function ψ_L: A* ⇀ {1} is computable. The partial function ψ_L is defined as follows:

    ψ_L(v) = 1 if v ∈ L,  and undefined otherwise.

Thus a semi-decision procedure for a set L ⊆ A* halts exactly on the inputs from L. For languages L ⊆ A* it holds that:

1. L is semi-decidable ⟺ L is Turing-acceptable.
2. L is decidable ⟺ L and ∁L = A* − L are semi-decidable.

3.3 Theorem: L is recursively enumerable ⟺ L is semi-decidable.

Proof: ⟹: For L = ∅ the claim is clear. Otherwise let L = ν(ℕ) for a total computable ν. Let f: A* ⇀ {1} be computable by the following algorithm:

    Input: w ∈ A*.
    Apply ν to n = 0, 1, 2, . . . successively.
    If at some point ν(n) = w holds, halt with output 1. (Otherwise the algorithm does not halt.)

Then f = ψ_L holds. Thus L is semi-decidable.
⟸ ("dovetailing"): Let L be semi-decidable by the Turing machine M. If L ≠ ∅, then we must provide a total computable function ν: ℕ → A* with ν(ℕ) = L. Let w_0 be some fixed string from L. ν is computed by the following algorithm:

    Input: n ∈ ℕ.
    Determine whether n is the prime-number encoding of a pair (w, k) ∈ A* × ℕ (compare with the pairing used in the countability proofs).
    If yes and M applied to w halts within k steps, output ν(n) = w; otherwise output ν(n) = w_0.

3.4 Theorem: For all languages L ⊆ A* the following statements are equivalent:
1. L is recursively enumerable.
2. L is the range of results of a Turing machine M, i.e. L = { v ∈ A* | ∃ u: h_M(u) = v }.
3. L is semi-decidable.
4. L is the halting range of a Turing machine M, i.e. L = { v ∈ A* | h_M(v) exists }.
5. L is Turing-acceptable.
6. L is Chomsky-0.

3.5 Corollary: For all languages L ⊆ A* it holds that:

    L is decidable (recursive) ⟺ L and ∁L = A* − L are recursively enumerable.
H_0 is recursively enumerable: H_0 is semi-decidable by a Turing machine M_0 which, applied to w ∈ B*, operates as follows: M_0 determines whether w is of the form w = bw_M. If no, M_0 does not halt. If yes, M_0 simulates M applied to the blank tape and halts as soon as this simulation halts.

Thus we get the following main result about the halting of Turing machines.
3.8 Main theorem (Halting of Turing machines):
1. The halting problems K, H, H_0 and H_{M_uni} are recursively enumerable, but not decidable.
2. The complementary problems ∁K, ∁H, ∁H_0 and ∁H_{M_uni} are countable but not recursively enumerable.

Proof:
(1): It holds that K ≤ H = H_{M_uni} ≤ H_0, and as above all four problems are semi-decidable.
(2): If the complementary problems were recursively enumerable, then by (1) and the above corollary the halting problems would be decidable. Contradiction!
4 Automatic program verification

Can the computer verify programs, i.e. determine whether a program P satisfies a given specification S? We consider the following verification problem in more detail:

    Given: program P and specification S.
    Question: Does P satisfy the specification S?

We formalize this problem as follows:

    program P       = Turing machine M with input alphabet B = {0, 1}
    specification S = subset S of T_{B,B}, the set of all Turing-computable functions h: B* ⇀ B*
    P satisfies S   = h_M ∈ S

The answer is given in the theorem of Rice (1953, 1956): the verification problem is undecidable, except for two trivial exceptions:

    S = ∅:       answer always "no",
    S = T_{B,B}: answer always "yes".
4.1 Theorem (Theorem of Rice): Let S be an arbitrary non-trivial subset of T_{B,B}, i.e. ∅ ≠ S ≠ T_{B,B} holds. Then the language

    BW(S) = { bw_M ∈ B* | h_M ∈ S },

i.e. the set of encodings of Turing machines whose computed function lies in the set of functions S, is undecidable.

Proof: In order to show the undecidability of BW(S), we reduce the undecidable problem H_0 or its complement ∁H_0 = B* − H_0 respectively to BW(S).
Therefore, first we consider an arbitrary function g ∈ T_{B,B} and an arbitrary Turing machine M. Let M_g be a TM computing g. Furthermore, let M′ = M′(M, g) be a TM which, applied to an input v ∈ B*, operates as follows:

1. First v is ignored and M′ operates as M applied to the blank tape.
2. If M halts, then M′ operates as M_g applied to v.

Let Ω ∈ T_{B,B} be the totally undefined function. Then the function h_{M′(M,g)} computed by M′(M, g) is given by the following case distinction:

    h_{M′(M,g)} = g  if M halts on the blank tape,  and Ω otherwise.
For a given g there exists a total computable function f_g: B* → B* with

    f_g(bw_M) = bw_{M′(M,g)},

i.e. f_g computes the binary encoding of the Turing machine M′(M, g).

Case 1: Ω ∉ S. Choose any function g ∈ S; it is possible, because S ≠ ∅.
We show: H_0 ≤ BW(S) using f_g:

    bw_M ∈ H_0 ⟺ M halts applied to the blank tape
               ⟺ h_{M′(M,g)} = g
               ⟺ h_{M′(M,g)} ∈ S        (it holds h_{M′(M,g)} ∈ {g, Ω}, Ω ∉ S and g ∈ S)
               ⟺ bw_{M′(M,g)} ∈ BW(S)
               ⟺ f_g(bw_M) ∈ BW(S)
Case 2: Ω ∈ S. Choose any function g ∉ S; it is possible, because S ≠ T_{B,B}.
We show: ∁H_0 ≤ BW(S) using f_g:

    bw_M ∉ H_0 ⟺ M does not halt applied to the blank tape
               ⟺ h_{M′(M,g)} = Ω
               ⟺ h_{M′(M,g)} ∈ S        (it holds h_{M′(M,g)} ∈ {g, Ω}, Ω ∈ S and g ∉ S)
               ⟺ bw_{M′(M,g)} ∈ BW(S)
               ⟺ f_g(bw_M) ∈ BW(S)
In both cases the reduction lemma yields the undecidability of BW(S). Thus we have proved the theorem of Rice.

The theorem of Rice shows that it is hopeless to try to algorithmically verify the semantics of Turing machines or programs, i.e. their input/output behaviour, on the basis of their syntactic form.

Nevertheless, automatic program verification (also called model checking, where "model" is equivalent to "program") is today a very active field of research. For example, it is possible to analyse programs whose behaviour can be described by finite automata.
5 Grammar problems and the Post correspondence problem

First we consider two problems for Chomsky-0-grammars.

5.1 Definition (decision problem for Chomsky-0): The decision problem for Chomsky-0-grammars is:

    Given: Chomsky-0-grammar G = (N, T, P, S) and a string w ∈ T*.
    Question: Does w ∈ L(G) hold?

5.2 Definition (derivation problem for Chomsky-0): The derivation problem for Chomsky-0-grammars is:

    Given: Chomsky-0-grammar G = (N, T, P, S) and strings u, v ∈ (N ∪ T)*.
    Question: Does u ⇒*_G v hold?

5.4 Theorem: The decision problem and the derivation problem for Chomsky-0-grammars are undecidable.

Proof: Obviously the reduction "decision problem ≤ derivation problem" holds, because for all grammars G = (N, T, P, S) and strings w ∈ T* it holds that:

    w ∈ L(G) ⟺ S ⇒*_G w.
Next we consider some sort of a puzzle game, the Post correspondence problem. It was introduced by E. Post (1946) and is abbreviated as PCP (from "Post Correspondence Problem"). The point is to construct a string in two different ways.

For subsequent encoding purposes let us assume that we have a countably infinite set of symbols

    SYM = {a_0, a_1, a_2, . . .}.

We always consider strings over SYM. Let X ⊆ SYM be an alphabet.
5.5 Definition: An input/instance of the PCP is a finite sequence Y of pairs of strings, i.e.

    Y = ((u_1, v_1), . . . , (u_n, v_n))

with n ≥ 1 and u_i, v_i ∈ SYM* for i = 1, . . . , n. If u_i, v_i ∈ X* holds for all i, then Y is called an input of the PCP over X. A correspondence (solution) of Y is a sequence of indices (i_1, . . . , i_m) with m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} so that

    u_{i_1} . . . u_{i_m} = v_{i_1} . . . v_{i_m}

holds. Y is called solvable, if Y has a correspondence.

Remark: The PCP over a one-element alphabet X = {a} is decidable. Over {a} two concatenations are equal if and only if they have the same length, i.e. a sequence of indices (i_1, . . . , i_m) is a correspondence if and only if

    Σ_{j=1}^{m} |u_{i_j}| = Σ_{j=1}^{m} |v_{i_j}|

holds. Therefore, we deal with the solvability of a problem for natural numbers. We show:

    Y is solvable ⟺ (a) ∃ j ∈ {1, . . . , n}: |u_j| = |v_j|  or
                     (b) ∃ k, j ∈ {1, . . . , n}: |u_k| < |v_k| and |u_j| > |v_j|.
⟹: Proof by contradiction: if neither (a) nor (b) holds, then all pairs (u, v) ∈ Y have the property |u| < |v|, or all pairs (u, v) ∈ Y have the property |u| > |v|. Thus every string composed of u's would be strictly shorter, respectively strictly longer, than the corresponding string composed of v's. Thus Y is not solvable.

⟸: If |u_j| = |v_j| holds, the trivial sequence of indices (j) is a solution of Y. If |u_k| < |v_k| and |u_l| > |v_l| hold, then the sequence

    ( k, . . . , k,  l, . . . , l )
      ((|u_l| − |v_l|) times k, then (|v_k| − |u_k|) times l)

is a solution of Y, because it holds that

    |u_k| · (|u_l| − |v_l|) + |u_l| · (|v_k| − |u_k|) = |v_k| · (|u_l| − |v_l|) + |v_l| · (|v_k| − |u_k|).

Obviously the properties (a) and (b) are decidable. Thus the PCP is decidable over a one-element alphabet.
Now we consider the general case of the PCP. Our goal is the following theorem:

5.7 Theorem: The PCP over X is undecidable for every alphabet X with |X| ≥ 2.

We show the theorem by reduction of the derivation problem for Chomsky-0-grammars to the PCP over {0, 1}. We consider the following three reductions:

    derivation problem ≤ MPCP ≤ PCP ≤ PCP over {0, 1}.

We start with the first reduction.
5.8 Lemma: derivation problem ≤ MPCP.

Proof: We introduce an algorithm which constructs an input Y_{G,u,v} for the MPCP (the modified PCP, in which a correspondence must start with the first pair of strings) for a given Chomsky-0-grammar G = (N, T, P, S) with N ∪ T ⊆ SYM and given strings u, v ∈ (N ∪ T)*, so that the following holds:

    (1) u ⇒*_G v ⟺ Y_{G,u,v} has a correspondence starting with the first pair of strings.

We use a symbol # which does not occur in N ∪ T. Then Y_{G,u,v} consists of the following pairs of strings:

    the first pair of strings: (#, # u #)
    productions:               (p, q) for all p → q ∈ P
    copy:                      (a, a) for all a ∈ N ∪ T ∪ {#}
    last pair:                 (v # #, #)

The exact sequence of the pairs of strings in Y_{G,u,v} is irrelevant, except for the first pair of strings. Now we show that (1) holds.
⟹: It holds that u ⇒*_G v. Then there is a derivation from u to v of the form

    (2)  u = u_0 p_0 v_0 ⇒_G u_0 q_0 v_0 = u_1 p_1 v_1 ⇒_G u_1 q_1 v_1 = u_2 p_2 v_2 . . . ⇒_G u_k q_k v_k = v,

where p_0 → q_0, . . . , p_k → q_k ∈ P holds. There exists the following correspondence of Y_{G,u,v}, where we write down both components of the pairs of strings one after another in two rows:
    upper row:  #      u_0 p_0 v_0 #   u_1 p_1 v_1 #   . . .   u_k p_k v_k #   v # #
    lower row:  # u #  u_0 q_0 v_0 #   u_1 q_1 v_1 #   . . .   u_k q_k v_k #   #

The pieces with u and v come from the first pair of strings and the last pair respectively. The substrings u_i, v_i and the separators # are constructed character-by-character by copy-pairs (a, a) for a ∈ N ∪ T ∪ {#}. The substrings p_i, q_i are placed by the respective pair of productions (p_i, q_i). Vertically aligned segments belong to the same pair of strings.
⟸: Now consider a correspondence of Y_{G,u,v} which begins with the first pair of strings. The form of the pairs of strings in Y_{G,u,v} implies that the correspondence must be constructed as shown under "⟹", with the exception that the substrings u and u_i q_i v_i respectively can be repeated several times by applying only copy-pairs (a, a):

    upper row:  . . .   u_i q_i v_i #   u_i q_i v_i #   . . .
    lower row:  . . .   u_i q_i v_i #   u_i q_i v_i #   . . .

In order to make the correspondence complete, we must at the end be able to place the final pair. For this purpose we must insert pairs of productions (p_i, q_i) so that the correspondence describes a derivation (2) from u to v.

Note that "⟸" holds only because we begin with the first pair of strings in the sense of the MPCP. Without this restriction, for example, the copy-pair (#, #) gives us a trivial correspondence, which does not imply u ⇒*_G v.
5.9 Lemma: MPCP ≤ PCP.

Proof: We introduce an algorithm which constructs an input Y_PCP for the PCP from a given input Y = ((u_1, v_1), . . . , (u_n, v_n)) for the MPCP (i.e. with u_i ≠ ε for i = 1, . . . , n), so that the following holds:

    (3) Y has a correspondence which starts with the first pair of strings ⟺ Y_PCP has a correspondence.

Idea: Let us construct Y_PCP in such a way that every correspondence necessarily starts with the first pair of strings. For this purpose we use new symbols ∗ and $, which may not appear in any of the strings from Y.

We define two mappings r, ℓ: SYM* → SYM*, where for w ∈ SYM*:

    r(w) is generated from w by inserting the symbol ∗ to the right of every character of w,
    ℓ(w) is generated from w by inserting the symbol ∗ to the left of every character of w.
For example it holds that

    r(GTI) = G∗T∗I∗    and    ℓ(GTI) = ∗G∗T∗I.

For the mappings r and ℓ the following holds for all u, v ∈ SYM*:

    1. r(ε) = ε and ℓ(ε) = ε
    2. r(uv) = r(u) r(v) and ℓ(uv) = ℓ(u) ℓ(v)
    3. u = v ⟺ ∗ r(u) = ℓ(v) ∗

The statements 1 and 2 mean that r and ℓ are string homomorphisms, i.e. ε and the concatenation are preserved.
From Y we construct the following Y_PCP with pairs of strings, which are numbered from 0 to n + 1:

    Y_PCP = ( (∗ r(u_1), ℓ(v_1)),    0th pair of strings
              (r(u_1), ℓ(v_1)),      1st pair of strings
              . . . ,
              (r(u_n), ℓ(v_n)),      n-th pair of strings
              ($, ∗ $) )             (n+1)-th pair of strings
Now we show that (3) holds:

⟹: Let (1, i_2, . . . , i_m) be a correspondence of Y, i.e. u_1 u_{i_2} . . . u_{i_m} = v_1 v_{i_2} . . . v_{i_m}. According to statement 3 it holds that:

    ∗ r(u_1 u_{i_2} . . . u_{i_m}) = ℓ(v_1 v_{i_2} . . . v_{i_m}) ∗.

By applying statement 2 several times, we get:

    ∗ r(u_1) · r(u_{i_2}) · · · r(u_{i_m}) · $  =  ℓ(v_1) · ℓ(v_{i_2}) · · · ℓ(v_{i_m}) · ∗ $,

where corresponding factors on the two sides come from the same pairs of strings of Y_PCP. We see that

    (0, i_2, . . . , i_m, n + 1)

is a correspondence of Y_PCP.
⟸: We show this direction only for the following case: v_i ≠ ε for i = 1, . . . , n.

Let (i_1, . . . , i_m) be some correspondence of Y_PCP. Then i_1 = 0 and i_m = n + 1 hold, because only the two strings in the 0th pair (∗ r(u_1), ℓ(v_1)) start with the same symbol ∗, and only the strings in the (n+1)-th pair ($, ∗ $) end with the same symbol $. Let k ∈ {2, . . . , m} be the smallest index with i_k = n + 1. Then (i_1, . . . , i_k) is also a correspondence of Y_PCP, because $ appears only as the last symbol in the resulting correspondence string. The form of the pairs of strings implies that:

    i_j ≠ 0 for j = 2, . . . , k − 1.

Otherwise there would be two consecutive ∗s in the substring r(u_{i_{j−1}}) ∗ r(u_1), which could not be reproduced by placing the ℓ(v_i)s, because v_i ≠ ε. Thus the correspondence string has the following structure:

    ∗ r(u_1) · r(u_{i_2}) · · · r(u_{i_{k−1}}) · $  =  ℓ(v_1) · ℓ(v_{i_2}) · · · ℓ(v_{i_{k−1}}) · ∗ $.

With statements 2 and 3 it follows that u_1 u_{i_2} . . . u_{i_{k−1}} = v_1 v_{i_2} . . . v_{i_{k−1}}, i.e. (1, i_2, . . . , i_{k−1}) is a correspondence of Y starting with the first pair of strings.

It remains the reduction PCP ≤ PCP over {0, 1}. For it we use the binary encoding bw of symbols over the alphabet {0, 1}, with which we have got acquainted while considering the binary encoding of Turing machines: a symbol a_i ∈ SYM is encoded as 0 1^{i+1}, and the encoding is extended to strings character by character.
Now consider an input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP. Then we define

    bw(Y) = ((bw(u_1), bw(v_1)), . . . , (bw(u_n), bw(v_n)))

as an input of the PCP over {0, 1}. Obviously it holds that:

    Y is solvable ⟺ bw(Y) is solvable.
From the three lemmas above we get the undecidability of the following problems: MPCP, PCP, PCP over {0, 1} and PCP over X with |X| ≥ 2. In particular we have proved the above theorem.
6 Results on undecidability of context-free languages
Now we can prove the results on undecidability of context-free languages mentioned in Chapter
III. In contrast to the regular languages the following holds:
6.1 Theorem : For context-free (i.e. Chomsky-2-) languages
the intersection problem,
the equivalence problem,
the inclusion problem
are undecidable.
Proof:

Intersection problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) ∩ L(G_2) = ∅ hold? We show that the Post correspondence problem can be reduced to the intersection problem:

    PCP ≤ intersection problem.

This implies the undecidability of the intersection problem. Now consider an arbitrary input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP with u_i, v_i ∈ X* for an alphabet X. We provide an algorithm which for every input Y constructs two context-free grammars G_1 and G_2, so that the following holds:

    Y has a correspondence ⟺ L(G_1) ∩ L(G_2) ≠ ∅.        (∗)
The idea is that G_1 generates all strings which can be produced by putting u_i's one after another, and G_2 generates all strings which can be produced by putting v_i's one after another. In order for the necessary relation (∗) to hold, G_1 and G_2 must also record the indices i of the placed u_i and v_i. For this purpose we use n new symbols a_1, . . . , a_n ∉ X and choose as the set of terminal symbols of G_1 and G_2:

    T = {a_1, . . . , a_n} ∪ X.

Then put G_i = ({S}, T, P_i, S), i = 1, 2, where P_1 is given by the following productions:

    S → a_1 u_1 | a_1 S u_1 | . . . | a_n u_n | a_n S u_n

and P_2 is given by the following productions:

    S → a_1 v_1 | a_1 S v_1 | . . . | a_n v_n | a_n S v_n
Obviously it holds that

    L(G_1) = { a_{i_m} . . . a_{i_1} u_{i_1} . . . u_{i_m} | m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} }

and

    L(G_2) = { a_{i_m} . . . a_{i_1} v_{i_1} . . . v_{i_m} | m ≥ 1 and i_1, . . . , i_m ∈ {1, . . . , n} }.

This implies that:

    Y has the correspondence (i_1, . . . , i_m)
    ⟺ a_{i_m} . . . a_{i_1} u_{i_1} . . . u_{i_m} = a_{i_m} . . . a_{i_1} v_{i_1} . . . v_{i_m} ∈ L(G_1) ∩ L(G_2).

Therefore, (∗) holds and so does the undecidability of the intersection problem. We have even proved a stronger result: the undecidability of the intersection problem for deterministic context-free languages, because, as it is easy to check, the languages L(G_1) and L(G_2) are deterministic.
Equivalence problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) = L(G_2) hold? We show the undecidability of this problem using the following reduction:

    intersection problem for deterministic context-free languages ≤ equivalence problem.

Consider two deterministic push-down automata 𝒜_1 and 𝒜_2. We show that we can construct from them two (not necessarily deterministic) context-free grammars G_1 and G_2 so that the following holds:

    L(𝒜_1) ∩ L(𝒜_2) = ∅ ⟺ L(G_1) = L(G_2).

We use the fact that for 𝒜_2 we can construct a complement push-down automaton ∁𝒜_2 with L(∁𝒜_2) = ∁L(𝒜_2) (compare with Chapter III, section 6). Then from 𝒜_1 and ∁𝒜_2 we can algorithmically construct context-free grammars G_1 and G_2 with

    L(G_1) = L(𝒜_1) ∪ L(∁𝒜_2)    and    L(G_2) = L(∁𝒜_2).

Thus it holds that

    L(𝒜_1) ∩ L(𝒜_2) = ∅ ⟺ L(𝒜_1) ⊆ ∁L(𝒜_2)
                         ⟺ L(𝒜_1) ∪ L(∁𝒜_2) = L(∁𝒜_2)
                         ⟺ L(G_1) = L(G_2)

as required.
Inclusion problem: Consider two context-free grammars G_1 and G_2. The question is: Does L(G_1) ⊆ L(G_2) hold? Obviously the reduction

    equivalence problem ≤ inclusion problem

holds, because L(G_1) = L(G_2) if and only if L(G_1) ⊆ L(G_2) and L(G_2) ⊆ L(G_1). Therefore, the inclusion problem is also undecidable.

Another result of undecidability concerns the ambiguity of context-free grammars. For the practical use of context-free grammars for syntax description of programming languages it would be beneficial to have an algorithmic ambiguity test. However, below we show that such a test does not exist.
6.2 Theorem: It is undecidable whether a given context-free grammar is ambiguous.

Proof: We show the following reduction:

    PCP ≤ ambiguity problem.        (∗∗)

Consider an input Y = ((u_1, v_1), . . . , (u_n, v_n)) of the PCP. First we construct the context-free grammars G_i = ({S_i}, T, P_i, S_i), i = 1, 2, as in the case of the reduction of the PCP to the intersection problem. Afterwards we construct a context-free grammar G = ({S, S_1, S_2}, T, P, S) from them with

    P = {S → S_1, S → S_2} ∪ P_1 ∪ P_2.

Because G_1 and G_2 are unambiguous, the only possible ambiguity of G while constructing a derivation tree of G for a string w ∈ T* lies in the first step S → S_1 or S → S_2. Thus G is ambiguous if and only if some string is generated by both G_1 and G_2, i.e. if and only if Y has a correspondence. This proves (∗∗).
[ applied to u halts .
Now we put together problems, i.e. languages, with the same complexity into so-called complexity classes.
1.2 Definition: Let f: ℕ → ℝ.

    DTIME(f(n))  = { L | there exists a deterministic TM with several tapes which has time complexity f(n) and accepts L }
    NTIME(f(n))  = { L | there is a non-deterministic TM with several tapes which has time complexity f(n) and accepts L }
    DSPACE(f(n)) = { L | there exists a deterministic TM with several tapes which has space complexity f(n) and accepts L }
    NSPACE(f(n)) = { L | there is a non-deterministic TM with several tapes which has space complexity f(n) and accepts L }
Example: Consider L = { uu | u ∈ {a, b}* }. A deterministic TM with two tapes can accept L in five phases; the original figure sketches the tape contents after each phase, with the following captions:

(1) Move the upper head to the right end of the input w, letting the lower head move half as quickly as the upper head (n/2 vertical slashes on tape 2). Test whether n is even.

(2) Move the heads on both tapes n/2 steps to the left.

(3) Copy the 2nd half v of the input from the 1st tape to the 2nd tape and erase it on the 1st tape.

(4) Move the upper head to the right end of the 1st half u.

(5) Compare u and v backwards. Accept, if u = v.
Let us compute the time complexity of the 5 phases:

(n + 1) + (n/2 + 1) + (n/2 + 1) + (n/2 + 1) + (n/2 + 1) = 3n + 5.

Space complexity: n + 2.

Thus it holds that: L ∈ DTIME(3n + 5) and L ∈ DSPACE(n + 2).
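The five phases boil down to a simple deterministic procedure. A minimal Python sketch of the same check (the function name and the string operations are our stand-ins for the tape manipulations):

def accept_uu(w: str) -> bool:
    n = len(w)
    if n % 2 != 0:       # phase (1): test whether n is even
        return False
    u = w[: n // 2]      # the 1st half u stays on tape 1
    v = w[n // 2 :]      # phases (2)-(4): isolate the 2nd half v
    return u == v        # phase (5): compare u and v

assert accept_uu("abab") and not accept_uu("abba")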
We can proceed non-deterministically as follows:

Part 1: Copy the input character by character from tape 1 to tape 2. Stop this process non-deterministically at any time.

Part 2: Return the head of tape 2 to the beginning.

Part 3: Now compare character by character whether, starting from the current position, the first tape has the same contents as the second one. If this is the case and both heads point to the blank field, then accept.
In the best case this process needs

n/2 + (n/2 + 1) + (n/2 + 1) = 3n/2 + 2 steps.

In the worst case, when the non-deterministic decision is made only at the last symbol of the input string, we need

n + (n + 1) + 1 = 2n + 2 steps.

Thus it holds that: L ∈ NTIME(2n + 2).
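The non-deterministic procedure can be simulated deterministically by trying every possible stopping point of part 1. A sketch (illustrative; a real non-deterministic TM guesses the stopping point k in a single branch instead of looping over all of them):

def nd_accept_uu(w: str) -> bool:
    n = len(w)
    for k in range(n + 1):     # all non-deterministic stopping points
        tape2 = w[:k]          # part 1: copy the first k characters
        if w[k:] == tape2:     # parts 2 and 3: compare the rest
            return True        # accept if some branch accepts
    return False

assert nd_accept_uu("abab") and not nd_accept_uu("aba")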
In complexity theory we compare the asymptotic behavior of time and space complexity, i.e. the behavior for sufficiently large n. Thus we can omit constant factors. For this purpose we use the O-notation from number theory.
1.3 Definition: Let g : ℕ → ℝ. Then

O(g(n)) = { f : ℕ → ℝ | ∃ n_0, k ∈ ℕ ∀ n ≥ n_0 : f(n) ≤ k · g(n) },

i.e. O(g(n)) is the class of all functions f which, for sufficiently large values of n, are bounded by a constant multiple of g.
Example: (n ↦ 3n + 4), (n ↦ n + 2) ∈ O(n). Therefore we also say: the language L is of time and space complexity O(n), or L is of linear time and space complexity, respectively.
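The witnesses n_0 and k from Definition 1.3 can be checked numerically; for instance, for (n ↦ 3n + 4) ∈ O(n) one may take k = 4 and n_0 = 4, since 3n + 4 ≤ 4n holds exactly for n ≥ 4. A tiny sketch of such a check:

f, g = (lambda n: 3 * n + 4), (lambda n: n)
k, n0 = 4, 4          # witnesses for f ∈ O(g)
assert all(f(n) <= k * g(n) for n in range(n0, 10_000))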
Because a TM can visit at most one new field on its tapes in each computational step, it follows that:

DTIME(f(n)) ⊆ DSPACE(f(n)) ⊆ NSPACE(f(n)) and DTIME(f(n)) ⊆ NTIME(f(n)) ⊆ NSPACE(f(n)).
2 The classes P and NP
The complexity of algorithms which can be applied in practice should be a polynomial p(n) of the k-th degree, i.e. of the form

p(n) = a_k·n^k + … + a_1·n + a_0 with a_i ∈ ℕ for i = 0, 1, …, k, k ∈ ℕ and a_k ≠ 0.
2.1 Definition (Cobham, 1964):

P = ⋃_{p polynomial in n} DTIME(p(n))

NP = ⋃_{p polynomial in n} NTIME(p(n))

PSPACE = ⋃_{p polynomial in n} DSPACE(p(n))

NPSPACE = ⋃_{p polynomial in n} NSPACE(p(n))
2.2 Theorem (Savitch, 1970): For all polynomials p in n it holds that:

NSPACE(p(n)) ⊆ DSPACE(p²(n))

and thus

NPSPACE = PSPACE.
Furthermore, it holds that

P ⊆ NP ⊆ PSPACE, (∗)

because, as we have already mentioned, a TM can visit at most one new field in every computational step.

Open problem in computer science: Are the inclusions in (∗) strict or does equality hold?
The classes P and NP are very important because they mark the transition from computability and decidability questions of practical interest to those of purely theoretical interest. This transition can be illustrated by the following diagram of nested regions (outermost first):

all problems resp. languages
⊇ computable resp. decidable
⊇ exponential time
⊇ PSPACE
⊇ non-det. polyn. time: NP (containing the subclass NPC)
⊇ det. polyn. time: P
Practically solvable, i.e. practically computable or decidable, are those problems in the class P, which can be characterized as follows:

P: Construct the right solution deterministically and in polynomial time.

At the same time the polynomials bounding the computation time should have the smallest possible degree, such as n, n² or n³.
Practically unsolvable, however, are all problems for which it can be proved that the computation time grows exponentially with the input size n. Between these two extremes lies a large class of practically important problems for which at the moment only exponential deterministic algorithms are known, but which can be solved by non-deterministic algorithms in polynomial time. This is the class NP, which, in comparison to P, can be characterized as follows:

NP: Guess a solution proposal non-deterministically and then verify / check deterministically and in polynomial time whether this proposal is right.
Strictly speaking, these problems thus remain practically unsolvable. In practice we make use of so-called heuristics, which strongly restrict the non-deterministic search space of possible solution proposals. Using such heuristics we try to approximate an ideal solution.
Examples of problems from the class NP

(1) Problem of the Hamiltonian path

Given: A finite graph with n vertices.

Question: Does the graph contain a Hamiltonian path, i.e. a path which visits each vertex exactly once?

Consider, for example, the following graph:
(Figure: a graph on the four vertices K1, K2, K3, K4.)
Then the sequence of vertices K1-K2-K3-K4 is a Hamiltonian path. It is easy to see that the problem of the Hamiltonian path lies in NP: first guess a path and then check whether each vertex is visited exactly once. Because there are up to n! paths in the graph, this method can be carried out deterministically only with exponential complexity.
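The guess-and-verify scheme can be written out by determinizing the guess into an enumeration of all vertex orders, which makes the exponential blow-up visible. A Python sketch (graph representation and names are ours; the edge list in the call only reproduces the path K1-K2-K3-K4, since the full edge set of the figure is not recoverable):

from itertools import permutations

def hamiltonian_path(vertices, edges):
    edge_set = {frozenset(e) for e in edges}
    for order in permutations(vertices):       # exponential guessing
        if all(frozenset(p) in edge_set        # polynomial verification
               for p in zip(order, order[1:])):
            return order
    return None

print(hamiltonian_path(["K1", "K2", "K3", "K4"],
                       [("K1", "K2"), ("K2", "K3"), ("K3", "K4")]))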
(2) Traveling salesman problem

Given: A finite graph with n vertices and a length ∈ ℕ for every edge, as well as a number k ∈ ℕ.

Question: Is there a round-trip for the traveling salesman of length ≤ k, or, to put it more mathematically: is there a cycle in the graph of length ≤ k which visits each vertex at least once?

This problem also lies in NP: first guess a cycle and then compute its length. The traveling salesman problem is of practical importance, for example for designing telephone networks or integrated circuits.
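Here, too, the verification part must run in polynomial time. A sketch of such a verifier (the graph representation is our own choice):

def verify_round_trip(cycle, length, k):
    # cycle: vertex sequence with cycle[0] == cycle[-1];
    # length: maps frozenset({a, b}) to the length of edge {a, b}.
    total = sum(length[frozenset(e)] for e in zip(cycle, cycle[1:]))
    vertices = {v for e in length for v in e}
    return cycle[0] == cycle[-1] and set(cycle) == vertices and total <= k

Guessing a cycle and running this check once is exactly the non-deterministic polynomial-time procedure described above.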
(3) Satisfiability problem for Boolean expressions (SAT for short)

Given: A Boolean expression B, i.e. an expression which consists only of variables x_1, x_2, …, x_n connected by the operators ¬ (not), ∧ (and) and ∨ (or), as well as brackets.

Question: Is B satisfiable, i.e. is there an assignment of 0 and 1 to the Boolean variables x_1, x_2, …, x_n in B so that B evaluates to 1?
For example, B = (x_1 ∧ x_2) ∨ ¬x_3 is satisfiable with the values x_1 = x_2 = 1 or with x_3 = 0. In Section 3 we will consider the SAT problem in more detail.
While considering the question, which still remains open, whether P = NP holds, researchers came across a subclass of NP: the class NPC of the so-called NP-complete problems. It holds that:

If one problem of NPC lies in P, then all problems of NP are already in P, i.e. P = NP holds.

The class NPC was introduced in 1971 by S.A. Cook, who was the first to prove that the SAT problem introduced above is NP-complete. Beginning in 1972 with R. Karp, many other problems have been proved to be NP-complete. Today we know more than 1000 examples of problems from the class NPC.
Below we define the notion of NP-completeness. For this purpose we need the notion of polynomial-time reduction, which was introduced by Karp in 1972 as a technique for proving NP-completeness. It tightens the notion of reduction L_1 ≤ L_2 introduced in Section 2.
2.3 Definition: Let L_1 ⊆ Σ_1* and L_2 ⊆ Σ_2* be languages. Then L_1 is called polynomial-time reducible to L_2, shortly

L_1 ≤_p L_2,

if there is a function f : Σ_1* → Σ_2*, which is total and computable with polynomial time complexity, so that for all w ∈ Σ_1* it holds that:

w ∈ L_1 ⟺ f(w) ∈ L_2.

We also say: L_1 ≤_p L_2 using f.
L_1 ≤_p L_2 expresses that L_1 is not more complex than L_2. It is easy to see that ≤_p is a reflexive and transitive relation on languages, because with two polynomials p_1(n) and p_2(n), the composition p_1(p_2(n)) is also a polynomial.
2.4 Definition (Cook, 1971): A language L_0 is called NP-complete, if L_0 ∈ NP holds and

∀ L ∈ NP : L ≤_p L_0.
2.5 Lemma (Polynomial-time reduction): Let L_1 ≤_p L_2. Then it holds that:

(i) If L_2 ∈ P holds, then L_1 ∈ P holds as well.

(ii) If L_2 ∈ NP holds, then L_1 ∈ NP holds as well.

(iii) If L_1 is NP-complete and L_2 ∈ NP holds, then L_2 is also NP-complete.
Proof: for (i) and (ii): Let L_1 ≤_p L_2 using a function f, which is computed by a Turing machine 𝒯_1. Let the polynomial p_1 bound the computing time of 𝒯_1. Because L_2 ∈ P (or L_2 ∈ NP, respectively), there is a (non-deterministic) Turing machine 𝒯_2, which is bounded by a polynomial p_2 and computes the characteristic function χ_{L_2}.

As with ordinary reduction, the characteristic function χ_{L_1} can be computed for all w ∈ Σ_1* as follows:

χ_{L_1}(w) = χ_{L_2}(f(w)).

We do this by sequentially connecting the Turing machines 𝒯_1 and 𝒯_2. Now let |w| = n. Then 𝒯_1 computes the string f(w) in ≤ p_1(n) steps. This time bound also limits the length of f(w), i.e. |f(w)| ≤ p_1(n). Therefore, the computation of χ_{L_2}(f(w)) is carried out in

≤ p_1(n) + p_2(p_1(n))

steps. Thus also L_1 ∈ P (or L_1 ∈ NP, respectively).
For (iii): Let L ∈ NP. Because L_1 is NP-complete, L ≤_p L_1 holds. Moreover L_1 ≤_p L_2 holds. The transitivity of ≤_p implies L ≤_p L_2, which is what we wanted to show.
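In programming terms, the sequential connection of 𝒯_1 and 𝒯_2 is just function composition with added running times. A schematic Python sketch (all names and the bookkeeping are illustrative):

def decide_L1(w, f, decide_L2, p1):
    # chi_{L_1}(w) = chi_{L_2}(f(w)): first run the reduction machine
    # T_1, then the decision machine T_2 on its output.
    fw = f(w)                      # T_1: at most p1(|w|) steps
    assert len(fw) <= p1(len(w))   # the time bound also limits |f(w)|
    return decide_L2(fw)           # T_2: at most p2(|f(w)|) steps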
2.6 Corollary (Karp): Let L_0 be an NP-complete language. Then it holds that: L_0 ∈ P ⟺ P = NP.

Proof: "⟹": statement (i) of the Lemma. "⟸": obvious.
Example: We show the following polynomial-time reduction:

Hamiltonian path ≤_p traveling salesman.

Consider a graph G with n vertices. As the reduction function we take the following construction f : G ↦ G′ of a new graph G′ together with the bound k = n + 1:

– G′ has the same n vertices as G.
– Every edge of G is an edge of G′ of length 1.
– Every non-edge of G is an edge of G′ of length 2.

Thus G′ is a complete graph, i.e. every two vertices are connected by an edge.
E.g. for n = 5:

(Figure: a graph G on five vertices and its image G′ = f(G), the complete graph in which the six edges of G have length 1 and the remaining four edges have length 2; the corresponding bound is k = n + 1 = 6.)
The construction f takes place in polynomial time (in order to find the non-edges, consider the adjacency matrix, i.e. O(n²) time). Now we show the reduction property of f, i.e.:
G has a Hamiltonian path ⟺ G′ has a round-trip of length ≤ k = n + 1.

(Figure: a Hamiltonian path in G, consisting of n − 1 edges of length 1, is closed by one further edge into a round-trip of length ≤ n + 1 in G′.)
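The reduction function f itself is easy to program. A sketch (the representation of graphs by vertex lists and edge pairs is our own choice):

from itertools import combinations

def reduce_hamiltonian_to_tsp(vertices, edges):
    # f: G -> (G', k). Edges of G get length 1, non-edges length 2;
    # the bound is k = n + 1. Runs in O(n^2), as noted above.
    edge_set = {frozenset(e) for e in edges}
    length = {frozenset(p): 1 if frozenset(p) in edge_set else 2
              for p in combinations(vertices, 2)}
    return length, len(vertices) + 1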
3 The satisfiability problem for Boolean expressions
3.1 Definition SAT:

(i) A Boolean expression is a term over {0, 1} and variables with the operations not, and, or. In particular: let V = {x_1, x_2, …} be an infinite set of variables and let B = {0, 1} be the set of logical values (false and true).

(1) Every x_i ∈ V is a Boolean expression.
(2) Every a ∈ B is a Boolean expression.
(3) If E is a Boolean expression, then ¬E (not E) is a Boolean expression as well.
(4) If E_1, E_2 are Boolean expressions, then (E_1 ∧ E_2) and (E_1 ∨ E_2) are also Boolean expressions.

If x ∈ V, then we mostly write x̄ instead of ¬x.
If x ∈ V, then we call x and x̄ literals.

(Priorities: ¬ over ∧ over ∨)
(ii) Conjunctive normal form:

(1) If y_1, y_2, …, y_k are literals, then (y_1 ∨ y_2 ∨ … ∨ y_k) is called a clause (of order k, i.e. with k literals/alternatives).

(2) If c_1, c_2, …, c_r are clauses, then c_1 ∧ c_2 ∧ … ∧ c_r is called a Boolean expression in conjunctive normal form (CNF). If every clause has order ≤ k and at least one clause contains exactly k literals, then this expression is called a CNF of order k. [Similarly one can define disjunctive normal form.]
(iii) An assignment is a function α : V → {0, 1}, which can be extended to Boolean expressions (see (i), (2)-(4)):

α(0) = 0, α(1) = 1,
α(¬E) = 1 − α(E),
α(E_1 ∧ E_2) = Min(α(E_1), α(E_2)),
α(E_1 ∨ E_2) = Max(α(E_1), α(E_2)).
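The extension of an assignment α to expressions is a straightforward recursion over the structure of E. A minimal Python sketch (the nested-tuple representation of expressions is our own choice, not from the notes):

def evaluate(E, alpha):
    if E in (0, 1):                # alpha(0) = 0, alpha(1) = 1
        return E
    if isinstance(E, str):         # a variable x_i
        return alpha[E]
    op = E[0]
    if op == "not":                # alpha(¬E) = 1 - alpha(E)
        return 1 - evaluate(E[1], alpha)
    if op == "and":                # Min(alpha(E_1), alpha(E_2))
        return min(evaluate(E[1], alpha), evaluate(E[2], alpha))
    if op == "or":                 # Max(alpha(E_1), alpha(E_2))
        return max(evaluate(E[1], alpha), evaluate(E[2], alpha))
    raise ValueError("unknown operator")

B = ("or", ("and", "x1", "x2"), ("not", "x3"))         # (x_1 ∧ x_2) ∨ ¬x_3
assert evaluate(B, {"x1": 1, "x2": 1, "x3": 1}) == 1   # x_1 = x_2 = 1
assert evaluate(B, {"x1": 0, "x2": 0, "x3": 0}) == 1   # x_3 = 0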
(iv) A Boolean expression E is called satisfiable, if there is an assignment α with α(E) = 1.

(v) The satisfiability problem is described by the language

ERF = { E | E is a Boolean expression and E is satisfiable }.
In this case the alphabet is V ∪ {0, 1, ¬, ∧, ∨, ), (}. It is infinite. Thus we code V using

x_i ↦ xj, where j is the binary representation of the number i,

i.e. x_1 ↦ x1, x_2 ↦ x10, x_3 ↦ x11, x_4 ↦ x100 etc. The elements of B can easily be removed from E using the calculation rules, and we can bring E into CNF. Thus we get:
SAT := { E | E is a Boolean expression (without elements from B) in CNF;
             if E contains m different variables, then these are
             the variables x_1, …, x_m (in encoded form);
             E is satisfiable }
       ⊆ {x, 0, 1, ¬, ∧, ∨, (, )}*.

SAT(k) := { E | E ∈ SAT and E is of order k }.
3.2 Theorem (Cook, 1971): SAT is NP-complete.
Proof:

First of all it is easy to see that SAT lies in NP: we guess an assignment α : {x_1, …, x_m} → {0, 1} for a Boolean expression E in CNF which contains exactly m variables. Then we assign to every variable x_i the value α(x_i) and calculate α(E) according to the standard calculation rules (Def. 3.1 (iii)). If |E| = n, then E contains fewer than n variables, i.e. m ≤ n; guessing an assignment is carried out in linear time (create a table, m steps), and the substitution into E takes |E| · const steps, as does the evaluation. Hence SAT ∈ NTIME(c_1 · n + c_2) for appropriate constants c_1 and c_2. In particular: SAT ∈ NP. [Exercise: describe how α(E) can be computed in linear time!]
The difficult part is to show that for every L ∈ NP it holds that: L ≤_p SAT. Thus consider an arbitrary L ∈ NP. Then there is a non-deterministic Turing machine 𝒯 = (Q, X, Γ, δ, q_0, ␣, F) with: Q = set of states; X = input alphabet; Γ = tape alphabet, which contains X; q_0 ∈ Q the initial state; ␣ ∈ Γ − X the blank symbol; F ⊆ Q the set of final states. (Compare with the definition in the first chapter, supplemented here with the set of final states, because 𝒯 must halt for all inputs; only strings w ∈ X* are considered as inputs.) We construct a function g which maps every string w ∈ X* to a Boolean expression g(w) over Σ = {x, 0, 1, ¬, ∧, ∨, (, )} in CNF, so that it holds that: w ∈ L ⟺ g(w) ∈ SAT. If we show that this function g : X* → Σ* is computable in polynomial time, then L ≤_p SAT follows. Let the tape alphabet be Γ = {c_1, …, c_ℓ} with c_1 = ␣ = the blank symbol.
Without loss of generality let

δ : Q × Γ → 2^(Q × Γ × {−1, +1, 0}),

where −1 stands for "move left" (L), +1 for "move right" (R) and 0 for "stay" (S), with δ(q, c) = ∅ for q ∈ F.

We make δ complete by changing every δ(q, c) = ∅ into δ(q, c) = {(q, c, 0)}. Thus the empty set never appears as an image of δ. Because the end of a computation was previously indicated by δ(q, c) = ∅, by this change we obtain a non-terminating Turing machine: if 𝒯 accepts w, then 𝒯 after making p(|w|) steps reaches a configuration u_1 q u_2 with exactly this final state q and stays in this configuration infinitely long; the opposite direction also holds. Thus:

w ∈ L ⟺ 𝒯 applied to w is after p(|w|) steps in a configuration u_1 q u_2 with q ∈ F,

where p is a polynomial bounding the computing time of 𝒯.
Let the states be numbered so that Q = {q_1, …, q_s}, where q_1 is the initial state and q_r, …, q_s are the final states, and let the tuples of δ be numbered from 1 to m.

Let w ∈ X* be given, |w| = n, w = c_{j_1} c_{j_2} … c_{j_n}. The formula g(w) is constructed using the following variables:
Meaning of the variables:

z_{t,k} (1 ≤ t ≤ p(n), 1 ≤ k ≤ s): z_{t,k} = 1 ⟺ 𝒯 is in state q_k at the time point t.

a_{t,i,j} (1 ≤ t ≤ p(n), −p(n) ≤ i ≤ p(n), 1 ≤ j ≤ ℓ): a_{t,i,j} = 1 ⟺ c_j is the content of the square i at the time point t.

s_{t,i} (1 ≤ t ≤ p(n), −p(n) ≤ i ≤ p(n)): s_{t,i} = 1 ⟺ the head is on the square i at the time point t.

b_{t,l} (1 ≤ t ≤ p(n), 1 ≤ l ≤ m): b_{t,l} = 1 ⟺ at the time point t the l-th tuple of δ is chosen for the transition.

The formula g(w) expresses the following conditions:

(1) Initial configuration: 𝒯 is in state q_1, ␣^(p(n)+1) w ␣^(p(n)−n) is on the tape, and the head is at the square 1. Time point: t = 1.
(2) Final configuration (provided that w is accepted): 𝒯 is in a state q_j with r ≤ j ≤ s at the time point t = p(n).
(3) Transition condition:

A_3 = ⋀_{1≤t≤p(n)} ( A_3^state(t) ∧ A_3^place(t) ∧ A_3^field(t) ∧ A_3^U(t) )

with

A_3^state(t) = exactlyone(z_{t,1}, …, z_{t,s})
A_3^place(t) = exactlyone(s_{t,−p(n)}, s_{t,−p(n)+1}, …, s_{t,p(n)})
A_3^field(t) = ⋀_{−p(n)≤i≤p(n)} exactlyone(a_{t,i,1}, …, a_{t,i,ℓ})
A_3^U(t) = exactlyone(b_{t,1}, …, b_{t,m})
Again this formula is in CNF. A_3 has

p(n) · (s² + (2·p(n) + 1)² + (2·p(n) + 1) · ℓ² + m²)

variables. A_3 describes exactly the transition condition (3).
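The building block exactlyone(y_1, …, y_r) can be expressed in CNF in the standard way by one clause demanding "at least one" and pairwise clauses demanding "at most one"; assuming this is the construction intended here, it also explains the quadratic terms s², ℓ² and m² in the size estimate above. A Python sketch:

from itertools import combinations

def exactlyone(ys):
    # One clause (y_1 ∨ ... ∨ y_r) plus all clauses (¬y_i ∨ ¬y_j)
    # for i < j: altogether 1 + r(r-1)/2 clauses, i.e. size O(r^2).
    # A literal is a string y or a pair ("¬", y); representation ours.
    return [tuple(ys)] + [(("¬", a), ("¬", b))
                          for a, b in combinations(ys, 2)]

print(exactlyone(["z_t,1", "z_t,2", "z_t,3"]))   # A_3^state(t) for s = 3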
For (4): A_4 = ⋀_{1≤t<p(n)} A_4(t).
Now we must consider the tuples of δ in more detail. For l = 1, 2, …, m let the l-th tuple of δ be given by

tuple_l = (q_{k_l}, c_{j_l}, q_{k′_l}, c_{j′_l}, d_l) ∈ Q × Γ × Q × Γ × {+1, −1, 0},

i.e. in state q_{k_l} reading c_{j_l}, the machine 𝒯 may enter state q_{k′_l}, write c_{j′_l} and move the head by d_l.
Put:

A_4(t) = ⋀_{−p(n)≤i≤p(n)} [ ⋀_{1≤j≤ℓ} (s_{t,i} ∨ ¬a_{t,i,j} ∨ a_{t+1,i,j})
    ∧ ⋀_{1≤l≤m} ( (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t,k_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t,i,j_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t+1,k′_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ a_{t+1,i,j′_l})
                ∧ (¬s_{t,i} ∨ ¬b_{t,l} ∨ s_{t+1,i+d_l}) ) ]
Comment: (s_{t,i} ∨ ¬a_{t,i,j} ∨ a_{t+1,i,j}) is 1 ⟺ if the head is not on the square i at the time point t (i.e. s_{t,i} = 0), then the square i keeps its content from the time point t to t + 1.

In a similar way we read the remaining clauses:

(¬s_{t,i} ∨ ¬b_{t,l} ∨ z_{t,k_l}) is 1 ⟺ if 𝒯 was at the time point t at the square i and the l-th tuple was chosen for the transition, then 𝒯 is at the time point t in the state q_{k_l}; similarly for the other four clauses.
Statement 1: g(w) contains only polynomially many variables, say p′(n) of them (for a suitable polynomial p′(n)), and g(w) can be written down in O(p′(n)²) steps, i.e. in polynomial time.

Statement 2: g is a reduction, i.e. for all w ∈ X*: w ∈ L ⟺ g(w) ∈ SAT.