Mathematical Logic
(Math 570)
Lecture Notes
Lou van den Dries
Fall Semester 2007
Contents
1 Preliminaries
1.1 Mathematical Logic: a brief overview
1.2 Sets and Maps
2 Basic Concepts of Logic
2.1 Propositional Logic
2.2 Completeness for Propositional Logic
2.3 Languages and Structures
2.4 Variables and Terms
2.5 Formulas and Sentences
2.6 Models
2.7 Logical Axioms and Rules; Formal Proofs
3 The Completeness Theorem
3.1 Another Form of Completeness
3.2 Proof of the Completeness Theorem
3.3 Some Elementary Results of Predicate Logic
3.4 Equational Classes and Universal Algebra
4 Some Model Theory
4.1 Löwenheim-Skolem; Vaught's Test
4.2 Elementary Equivalence and Back-and-Forth
4.3 Quantifier Elimination
4.4 Presburger Arithmetic
4.5 Skolemization and Extension by Definition
5 Computability, Decidability, and Incompleteness
5.1 Computable Functions
5.2 The Church-Turing Thesis
5.3 Primitive Recursive Functions
5.4 Representability
5.5 Decidability and Gödel Numbering
5.6 Theorems of Gödel and Church
5.7 A more explicit incompleteness theorem
5.8 Undecidable Theories
Chapter 1
Preliminaries
We start with a brief overview of mathematical logic as covered in this course.
Next we review some basic notions from elementary set theory, which provides
a medium for communicating mathematics in a precise and clear way. In this
course we develop mathematical logic using elementary set theory as given,
just as one would do with other branches of mathematics, like group theory or
probability theory.
For more on the course material, see
Shoenfield, J. R., Mathematical Logic, Reading, Addison-Wesley, 1967.
For additional material in Model Theory we refer the reader to
Chang, C. C. and Keisler, H. J., Model Theory, New York, North-Holland, 1990,
Poizat, B., A Course in Model Theory, Springer, 2000,
and for additional material on Computability, to
Rogers, H., Theory of Recursive Functions and Effective Computability, McGraw-Hill, 1967.
1.1 Mathematical Logic: a brief overview
Aristotle identified some simple patterns in human reasoning, and Leibniz dreamt
of reducing reasoning to calculation. As a mathematical subject, however, logic
is relatively recent: the 19th century pioneers were Bolzano, Boole, Cantor,
Dedekind, Frege, Peano, C.S. Peirce, and E. Schröder. From our perspective
we see their work as leading to boolean algebra, set theory, propositional logic,
predicate logic, as clarifying the foundations of the natural and real number
systems, and as introducing suggestive symbolic notation for logical operations.
Also, their activity led to the view that logic + set theory can serve as a basis for
all of mathematics. This era did not produce theorems in mathematical logic
of any real depth,[1] but it did bring crucial progress of a conceptual nature,
and the recognition that logic as used in mathematics obeys mathematical rules
that can be made fully explicit.
In the period 1900-1950 important new ideas came from Russell, Zermelo,
Hausdorff, Hilbert, Löwenheim, Ramsey, Skolem, Lusin, Post, Herbrand, Gödel,
Tarski, Church, Kleene, Turing, and Gentzen. They discovered the first real
theorems in mathematical logic, with those of Gödel having a dramatic impact.
Hilbert (in Göttingen), Lusin (in Moscow), Tarski (in Warsaw and Berkeley),
and Church (in Princeton) had many students and collaborators, who made up
a large part of that generation and the next in mathematical logic. Most of
these names will be encountered again during the course.
The early part of the 20th century was also marked by the so-called
foundational crisis in mathematics.
A strong impulse for developing mathematical logic came from the attempts
during these times to provide solid foundations for mathematics. Mathematical
logic has now taken on a life of its own, and also thrives on many interactions
with other areas of mathematics and computer science.
In the second half of the last century, logic as pursued by mathematicians
gradually branched into four main areas: model theory, computability theory (or
recursion theory), set theory, and proof theory. The topics in this course are
part of the common background of mathematicians active in any of these areas.
What distinguishes mathematical logic within mathematics is that
statements about mathematical objects
are taken seriously as mathematical objects in their own right. More generally,
in mathematical logic we formalize (formulate in a precise mathematical way)
notions used informally by mathematicians such as:
property
statement (in a given language)
structure
truth (what it means for a given statement to be true in a given structure)
proof (from a given set of axioms)
algorithm
[1] In the case of set theory one could dispute this. Even so, the main influence of set theory
on the rest of mathematics was to enable simple constructions of great generality, like cartesian
products, quotient sets and power sets, and this involves only very elementary set theory.
Once we have mathematical definitions of these notions, we can try to prove
theorems about these formalized notions. If done with imagination, this process
can lead to unexpected rewards. Of course, formalization tends to caricature
the informal concepts it aims to capture, but no harm is done if this is kept
firmly in mind.
Example. The notorious Goldbach Conjecture asserts that every even integer
greater than 2 is a sum of two prime numbers. With the understanding that
the variables range over N = {0, 1, 2, ...}, and that 0, 1, +, ·, < denote the
usual arithmetic operations and relations on N, this assertion can be expressed
formally as

(GC)  ∀x[(1 + 1 < x ∧ even(x)) → ∃p∃q(prime(p) ∧ prime(q) ∧ x = p + q)]

where even(x) abbreviates ∃y(x = y + y) and prime(p) abbreviates

1 < p ∧ ∀r∀s(p = r·s → (r = 1 ∨ s = 1)).
The expression GC is an example of a formal statement (also called a sentence)
in the language of arithmetic, which has symbols 0, 1, +, ·, < to denote arithmetic
operations and relations, in addition to logical symbols like =, ∧, ∨, ¬, →, ∀, ∃,
and variables x, y, z, p, q, r, s.
The Goldbach Conjecture asserts that this particular sentence GC is true in
the structure (N; 0, 1, +, ·, <). (No proof of the Goldbach Conjecture is known.)
It also makes sense to ask whether the sentence GC is true in the structure

(R; 0, 1, +, ·, <).

(It's not, as is easily verified. That the question makes sense does not mean
that it is of any interest.)
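As an aside, the matrix of GC can be checked mechanically for small even numbers. The following Python sketch is our own illustration, not part of the notes; the function names `is_prime` and `goldbach_holds` are ours.

```python
def is_prime(p):
    # prime(p): 1 < p and p has no factorization p = r*s with r, s > 1
    if p < 2:
        return False
    return all(p % d != 0 for d in range(2, int(p ** 0.5) + 1))

def goldbach_holds(x):
    # the matrix of GC for a given x: there exist primes p, q with x = p + q
    return any(is_prime(p) and is_prime(x - p) for p in range(2, x))

# check the Goldbach property for all even x with 2 < x <= 100
assert all(goldbach_holds(x) for x in range(4, 101, 2))
```

Of course, no finite amount of such checking proves GC; it only tests instances of the universally quantified sentence.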
A century of experience gives us confidence that all classical number-theoretic
results (old or new, proved by elementary methods or by sophisticated algebra
and analysis) can be proved from the Peano axioms for arithmetic.[2] However,
in our present state of knowledge, GC might be true in (N; 0, 1, +, ·, <), but not
provable from those axioms. (On the other hand, once you know what exactly
we mean by

provable from the Peano axioms,

you will see that if GC is provable from those axioms, then GC is true in
(N; 0, 1, +, ·, <), and that if GC is false in (N; 0, 1, +, ·, <), then its negation
¬GC is provable from those axioms.)
The point of this example is simply to make the reader aware of the notions
"true in a given structure" and "provable from a given set of axioms", and their
difference. One objective of this course is to figure out the connections (and
disconnections) between these notions.
[2] Here we do not count as part of classical number theory some results like Ramsey's
Theorem that can be stated in the language of arithmetic, but are arguably more in the spirit
of logic and combinatorics.
Some highlights (19001950)
The results below are among the most frequently used facts of mathematical
logic. The terminology used in stating these results might be unfamiliar, but
that should change during the course. What matters is to get some preliminary
idea of what we are aiming for. As will become clear during the course, each of
these results has stronger versions, on which applications often depend, but in
this overview we prefer simple statements over strength and applicability.
We begin with two results that are fundamental in model theory. They
concern the notion of "model of Σ" where Σ is a set of sentences in a language
L. At this stage we only say by way of explanation that a model of Σ is a
mathematical structure in which all sentences of Σ are true. For example, if Σ
is the (infinite) set of axioms for fields of characteristic zero in the language of
rings, then a model of Σ is just a field of characteristic zero.
Theorem of Löwenheim and Skolem. If Σ is a countable set of sentences
in some language and Σ has a model, then Σ has a countable model.

Compactness Theorem (Gödel, Malcev). Let Σ be a set of sentences in some
language. Then Σ has a model if and only if each finite subset of Σ has a model.
The next result goes a little beyond model theory by relating the notion of
model of Σ to that of provability from Σ:

Completeness Theorem (Gödel, 1930). Let Σ be a set of sentences in some
language L, and let σ be a sentence in L. Then σ is provable from Σ if and only
if σ is true in all models of Σ.
In our treatment we shall obtain the first two theorems as byproducts of the
Completeness Theorem and its proof. In the case of the Compactness Theorem
this reflects history, but the theorem of Löwenheim and Skolem predates the
Completeness Theorem. The Löwenheim-Skolem and Compactness theorems
do not mention the notion of provability, and thus model theorists often prefer
to bypass Completeness in establishing these results; see for example Poizat's
book.
Here is an important early result on a specic arithmetic structure:
Theorem of Presburger and Skolem. Each sentence in the language of the
structure (Z; 0, 1, +, −, <) that is true in this structure is provable from the
axioms for ordered abelian groups with least positive element 1, augmented, for
each n = 2, 3, 4, ..., by an axiom that says that for every a there is a b such that
a = nb or a = nb + 1 or ... or a = nb + 1 + ··· + 1 (with n disjuncts in total).
Moreover, there is an algorithm that, given any sentence in this language as
input, decides whether this sentence is true in (Z; 0, 1, +, −, <).
Note that in (Z; 0, 1, +, −, <) we have not included multiplication among the
primitives; accordingly, nb stands for b + ··· + b (with n summands).
When we do include multiplication, the situation changes dramatically:
Incompleteness and undecidability of arithmetic (Gödel-Church, 1930s).
One can construct a sentence in the language of arithmetic that is true in the
structure (N; 0, 1, +, ·, <), but not provable from the Peano axioms.
There is no algorithm that, given any sentence in this language as input,
decides whether this sentence is true in (N; 0, 1, +, ·, <).
Here "there is no algorithm" is used in the mathematical sense of

there cannot exist an algorithm,

not in the weaker colloquial sense of "no algorithm is known". This theorem
is intimately connected with the clarification of notions like computability and
algorithm in which Turing played a key role.
In contrast to these incompleteness and undecidability results on (sufficiently
rich) arithmetic, we have
Tarski's theorem on the field of real numbers (1930-1950). Every sentence
in the language of arithmetic that is true in the structure

(R; 0, 1, +, ·, <)

is provable from the axioms for ordered fields augmented by the axioms

- every positive element is a square,
- every odd degree polynomial has a zero.

There is also an algorithm that decides for any given sentence in this language
as input, whether this sentence is true in (R; 0, 1, +, ·, <).
1.2 Sets and Maps
We shall use this section as an opportunity to fix notations and terminologies
that are used throughout these notes, and throughout mathematics. In a few
places we shall need more set theory than we introduce here, for example, ordinals
and cardinals. The following little book is a good place to read about
these matters. (It also contains an axiomatic treatment of set theory starting
from scratch.)
Halmos, P. R., Naive set theory, New York, Springer, 1974
In an axiomatic treatment of set theory as in the book by Halmos all assertions
about sets below are proved from a few simple axioms. In such a treatment the
notion of set itself is left undefined, but the axioms about sets are suggested
by thinking of a set as a collection of mathematical objects, called its elements
or members. To indicate that an object x is an element of the set A we write
x ∈ A, in words: x is in A (or: x belongs to A). To indicate that x is not in A we
write x ∉ A. We consider the sets A and B as the same set (notation: A = B)
if and only if they have exactly the same elements. We often introduce a set
via the bracket notation, listing or indicating inside the brackets its elements.
For example, {1, 2, 7} is the set with 1, 2, and 7 as its only elements. Note that
{1, 2, 7} = {2, 7, 1}, and {3, 3} = {3}: the same set can be described in many
different ways. Don't confuse an object x with the set {x} that has x as its
only element: for example, the object x = {0, 1} is a set that has exactly two
elements, namely 0 and 1, but the set {x} = {{0, 1}} has only one element,
namely x.
Here are some important sets that the reader has probably encountered
previously.
Examples.
(1) The empty set: ∅ (it has no elements).
(2) The set of natural numbers: N = {0, 1, 2, 3, ...}.
(3) The set of integers: Z = {..., −2, −1, 0, 1, 2, ...}.
(4) The set of rational numbers: Q.
(5) The set of real numbers: R.
(6) The set of complex numbers: C.
Remark. Throughout these notes m and n always denote natural numbers.
For example, "for all m ..." will mean "for all m ∈ N ...".
If all elements of the set A are in the set B, then we say that A is a subset of
B (and write A ⊆ B). Thus the empty set ∅ is a subset of every set, and each
set is a subset of itself. We often introduce a set A in our discussions by defining
A to be the set of all elements of a given set B that satisfy some property P.
Notation:

A := {x ∈ B : x satisfies P} (hence A ⊆ B).
Let A and B be sets. Then we can form the following sets:
(a) A ∪ B := {x : x ∈ A or x ∈ B} (union of A and B);
(b) A ∩ B := {x : x ∈ A and x ∈ B} (intersection of A and B);
(c) A \ B := {x : x ∈ A and x ∉ B} (difference of A and B);
(d) A × B := {(a, b) : a ∈ A and b ∈ B} (cartesian product of A and B).
Thus the elements of A × B are the so-called ordered pairs (a, b) with a ∈ A
and b ∈ B. The key property of ordered pairs is that we have (a, b) = (c, d) if
and only if a = c and b = d. For example, you may think of R × R as the set
of points (a, b) in the xy-plane of coordinate geometry.
We say that A and B are disjoint if A ∩ B = ∅, that is, they have no element
in common.
Remark. In a definition such as we just gave: "We say that ... if ...", the
meaning of "if" is actually "if and only if". We committed a similar abuse
of language earlier in defining set inclusion by the phrase "If ..., then we say
that ...". We shall continue such abuse, in accordance with tradition, but
only in similarly worded definitions. Also, we shall often write "iff" or "⇔" to
abbreviate "if and only if".
Maps
Definition. A map is a triple f = (A, B, Γ) of sets A, B, Γ such that Γ ⊆ A × B
and for each a ∈ A there is exactly one b ∈ B with (a, b) ∈ Γ; we write f(a) for
this unique b, and call it the value of f at a (or the image of a under f).[3] We
call A the domain of f, and B the codomain of f, and Γ the graph of f.[4] We
write f : A → B to indicate that f is a map with domain A and codomain B,
and in this situation we also say that f is a map from A to B.
Among the many synonyms of "map" are

mapping, assignment, function, operator, transformation.

Typically, "function" is used when the codomain is a set of numbers of some
kind, "operator" when the elements of domain and codomain are themselves
functions, and "transformation" is used in geometric situations where domain
and codomain are equal. (We use "equal" as synonym for "the same" or "identical";
also "coincide" is a synonym for "being the same".)
Examples.
(1) Given any set A we have the identity map 1_A : A → A defined by 1_A(a) = a
for all a ∈ A.
(2) Any polynomial f(X) = a_0 + a_1 X + ··· + a_n X^n with real coefficients
a_0, ..., a_n gives rise to a function x ↦ f(x) : R → R. We often use the
"maps to" symbol ↦ in this way to indicate the rule by which to each x
in the domain we associate its value f(x).
Definition. Given f : A → B and g : B → C we have a map g ∘ f : A → C
defined by (g ∘ f)(a) = g(f(a)) for all a ∈ A. It is called the composition of g
and f.
Definition. Let f : A → B be a map. It is said to be injective if for all a_1 ≠ a_2
in A we have f(a_1) ≠ f(a_2). It is said to be surjective if for each b ∈ B there
exists a ∈ A such that f(a) = b. It is said to be bijective (or a bijection) if it is
both injective and surjective. For X ⊆ A we put

f(X) := {f(x) : x ∈ X} ⊆ B (direct image of X under f).

(There is a notational conflict here when X is both a subset of A and an element
of A, but it will always be clear from the context when f(X) is meant to be the
direct image of X under f; some authors resolve the conflict by denoting this
direct image by f[X] or in some other way.) We also call f(A) = {f(a) : a ∈ A}
the image of f. For Y ⊆ B we put

f^{-1}(Y) := {x ∈ A : f(x) ∈ Y} ⊆ A (inverse image of Y under f).

Thus surjectivity of our map f is equivalent to f(A) = B.
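In concrete finite cases these definitions are easy to compute with. The sketch below is our own illustration (not from the notes): it represents a map by a Python dict and checks the criterion f(A) = B for surjectivity.

```python
# a map f : A -> B with A = {1, 2, 3} and B = {'x', 'y'}
A = {1, 2, 3}
B = {'x', 'y'}
f = {1: 'x', 2: 'y', 3: 'x'}   # the graph of f, as a dict

def direct_image(f, X):
    # f(X) := {f(x) : x in X}
    return {f[x] for x in X}

def inverse_image(f, Y):
    # f^{-1}(Y) := {x in A : f(x) in Y}
    return {x for x in f if f[x] in Y}

assert direct_image(f, {1, 2}) == {'x', 'y'}
assert inverse_image(f, {'x'}) == {1, 3}
# f is surjective since f(A) = B; it is not injective since f(1) = f(3)
assert direct_image(f, A) == B
```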
[3] Sometimes we shall write fa instead of f(a) in order to cut down on parentheses.
[4] Other words for domain and codomain are source and target, respectively.
If f : A → B is a bijection then we have an inverse map f^{-1} : B → A given by

f^{-1}(b) := the unique a ∈ A such that f(a) = b.

Note that then f^{-1} ∘ f = 1_A and f ∘ f^{-1} = 1_B. Conversely, if f : A → B and
g : B → A satisfy g ∘ f = 1_A and f ∘ g = 1_B, then f is a bijection with f^{-1} = g.
(The attentive reader will notice that we just introduced a potential conflict of
notation: for bijective f : A → B and Y ⊆ B, both the inverse image of Y
under f and the direct image of Y under f^{-1} are denoted by f^{-1}(Y); no harm
is done, since these two subsets of A coincide.)
It follows from the definition of "map" that f : A → B and g : C → D are
equal (f = g) if and only if A = C, B = D, and f(x) = g(x) for all x ∈ A. We
say that g : C → D extends f : A → B if A ⊆ C, B ⊆ D, and f(x) = g(x) for
all x ∈ A.[5]
Definition. A set A is said to be finite if there exists n and a bijection
f : {1, ..., n} → A.
Here we use {1, ..., n} as a suggestive notation for the set {m : 1 ≤ m ≤ n}.
For n = 0 this is just ∅. If A is finite there is exactly one such n (although if
n > 1 there will be more than one bijection f : {1, ..., n} → A); we call this
unique n the number of elements of A or the cardinality of A, and denote it by
|A|. A set which is not finite is said to be infinite.
Definition. A set A is said to be countably infinite if there is a bijection N → A.
It is said to be countable if it is either finite or countably infinite.

Example. The sets N, Z and Q are countably infinite, but the infinite set R
is not countably infinite. Every infinite set has a countably infinite subset.
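That Z is countably infinite can be witnessed by an explicit bijection N → Z. The sketch below is our own illustration (not from the notes); it enumerates Z as 0, 1, −1, 2, −2, ...

```python
def nat_to_int(n):
    # a bijection N -> Z: odd n maps to (n+1)//2, even n maps to -(n//2)
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)

# the first few values: 0, 1, -1, 2, -2, 3, -3
assert [nat_to_int(n) for n in range(7)] == [0, 1, -1, 2, -2, 3, -3]

# injectivity on an initial segment: 1000 inputs give 1000 distinct values
vals = [nat_to_int(n) for n in range(1000)]
assert len(set(vals)) == 1000
```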
One of the standard axioms of set theory, the Power Set Axiom, says:

For any set A, there is a set whose elements are exactly the subsets of A.

Such a set of subsets of A is clearly uniquely determined by A, is denoted
by P(A), and is called the power set of A. If A is finite, so is P(A), and
|P(A)| = 2^{|A|}. Note that a ↦ {a} : A → P(A) is an injective map. However,
there is no surjective map A → P(A):
Cantor's Theorem. Let S : A → P(A) be a map. Then the set

{a ∈ A : a ∉ S(a)} (a subset of A)

is not an element of S(A).

Proof. Suppose otherwise. Then {a ∈ A : a ∉ S(a)} = S(b) where b ∈ A.
Assuming b ∈ S(b) yields b ∉ S(b), a contradiction. Thus b ∉ S(b); but then
b ∈ S(b), again a contradiction. This concludes the proof.
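The diagonal set in Cantor's Theorem can be computed explicitly in a finite example. The sketch below is ours (the concrete choice of A and S is an arbitrary illustration, not from the notes).

```python
# a concrete map S : A -> P(A) on A = {0, 1, 2}
A = {0, 1, 2}
S = {0: set(), 1: {0, 1}, 2: {1, 2}}

# the diagonal set {a in A : a not in S(a)}
D = {a for a in A if a not in S[a]}
assert D == {0}   # 0 not in S(0) = {}; 1 in S(1); 2 in S(2)

# D is not a value of S, exactly as the theorem predicts
assert all(S[a] != D for a in A)
```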
[5] We also say g : C → D is an extension of f : A → B, or f : A → B is a restriction of
g : C → D.
Let I and A be sets. Then there is a set whose elements are exactly the maps
f : I → A, and this set is denoted by A^I. For I = {1, ..., n} we also write A^n
instead of A^I. Thus an element of A^n is a map a : {1, ..., n} → A; we usually
think of such an a as the n-tuple (a(1), ..., a(n)), and we often write a_i instead
of a(i). So A^n can be thought of as the set of n-tuples (a_1, ..., a_n) with each
a_i ∈ A. For n = 0 the set A^n has just one element, the empty tuple.
An n-ary relation on A is just a subset of A^n, and an n-ary operation on
A is a map from A^n into A. Instead of "1-ary" we usually say "unary", and
instead of "2-ary" we can say "binary". For example, {(a, b) ∈ Z^2 : a < b} is a
binary relation on Z, and integer addition is the binary operation (a, b) ↦ a + b
on Z.
Definition. (a_i)_{i∈I} denotes a family of objects a_i indexed by the set
I, and is just a suggestive notation for the set {(i, a_i) : i ∈ I}, not to be confused
with the set {a_i : i ∈ I}. (There may be repetitions in the family, that is, it
may happen that a_i = a_j for distinct indices i, j ∈ I, but such repetition is not
reflected in the set {a_i : i ∈ I}. For example, if I = N and a_n = a for all n, then
{(i, a_i) : i ∈ I} = {(i, a) : i ∈ N} is countably infinite, but {a_i : i ∈ I} = {a}
has just one element.) For I = N we usually say "sequence" instead of "family".
Given any family (A_i)_{i∈I} of sets (that is, each A_i is a set) we have a set

⋃_{i∈I} A_i := {x : x ∈ A_i for some i ∈ I},

the union of the family. If I is finite and each A_i is finite, then so is the union
above and

|⋃_{i∈I} A_i| ≤ Σ_{i∈I} |A_i|.

If I is countable and each A_i is countable then ⋃_{i∈I} A_i is countable.
Given any family (A_i)_{i∈I} of sets we have a set

∏_{i∈I} A_i := {(a_i)_{i∈I} : a_i ∈ A_i for all i ∈ I},

the product of the family. One axiom of set theory, the Axiom of Choice, is a
bit special, but we shall use it a few times. It says that for any family (A_i)_{i∈I}
of nonempty sets there is a family (a_i)_{i∈I} such that a_i ∈ A_i for all i ∈ I, that
is,

∏_{i∈I} A_i ≠ ∅.
Words

Definition. Let A be a set. Think of A as an "alphabet" of letters. A word of
length n on A is an n-tuple (a_1, ..., a_n) of letters a_i ∈ A; because we think
of it as a word (string of letters) we shall write this tuple instead as a_1...a_n
(without parentheses or commas). There is a unique word of length 0 on A, the
empty word, written ε. Given a word a = a_1...a_n of length n ≥ 1 on A,
the first letter (or first symbol) of a is by definition a_1, and the last letter (or
last symbol) of a is a_n. The set of all words on A is denoted A*:

A* = ⋃_n A^n (disjoint union).
Logical expressions like formulas and terms will be introduced later as words of
a special form on suitable alphabets. When A ⊆ B we can identify A* with a
subset of B*. The concatenation of words a = a_1...a_m and b = b_1...b_n on A
is the word

ab = a_1...a_m b_1...b_n.

Thus ab is a word on A of length m + n. Concatenation is a binary operation
on A*, with ε as two-sided identity: εa = aε = a for all a ∈ A*.

Equivalence Relations

An equivalence relation on a set A is a binary relation ∼ on A such that for all
a, b, c ∈ A:
(i) a ∼ a (reflexivity);
(ii) if a ∼ b, then b ∼ a (symmetry);
(iii) if a ∼ b and b ∼ c, then a ∼ c (transitivity).

Example. Given n, the relation of congruence modulo n on Z, defined by
a ≡ b mod n :⟺ a − b = nc for some c ∈ Z, is an equivalence relation on Z.

Given an equivalence relation ∼ on A, the equivalence class a_∼
of an element a ∈ A is defined by a_∼ := {b ∈ A : a ∼ b}, and the quotient set
A/∼ is defined by A/∼ := {a_∼ : a ∈ A}.
This quotient set is a partition of A, that is, it is a collection of pairwise disjoint
nonempty subsets of A whose union is A. (Collection is a synonym for set; we
use it here because we don't like to say "set of ... subsets ....") Every partition
of A is the quotient set A/∼ for a unique equivalence relation ∼ on A. Thus
equivalence relations on A and partitions of A are just different ways to describe
the same situation.
In the previous example (congruence modulo n) the equivalence classes are
called congruence classes modulo n (or residue classes modulo n) and the
corresponding quotient set is often denoted Z/nZ.
Remark. Readers familiar with some abstract algebra will note that the
construction in the example above is a special case of a more general construction:
that of a quotient of a group with respect to a normal subgroup.
Posets

A partially ordered set (short: poset) is a pair (P, ≤) consisting of a set P and
a partial ordering ≤ on P, that is, ≤ is a binary relation on P such that for all
p, q, r ∈ P:
(i) p ≤ p (reflexivity);
(ii) if p ≤ q and q ≤ p, then p = q (antisymmetry);
(iii) if p ≤ q and q ≤ r, then p ≤ r (transitivity).
If in addition we have for all p, q ∈ P,
(iv) p ≤ q or q ≤ p,
then we say that ≤ is a linear order on P, or that (P, ≤) is a linearly ordered
set.[6]
Each of the sets N, Z, Q, R comes with its familiar linear order on it.
As an example, take any set A and its collection P(A) of subsets. Then

X ≤ Y :⟺ X ⊆ Y (for subsets X, Y of A)

defines a poset (P(A), ≤), also referred to as the power set of A ordered by
inclusion. This is not a linearly ordered set if A has more than one element.
Finite linearly ordered sets are determined "up to unique isomorphism" by their
size: if (P, ≤) is a linearly ordered set and |P| = n, then there is a unique map
σ : P → {1, ..., n} such that for all p, q ∈ P we have: p ≤ q ⟺ σ(p) ≤ σ(q).
This map σ is a bijection.
Let (P, ≤) be a poset. Here is some useful notation. For x, y ∈ P we set

x ≥ y :⟺ y ≤ x,
x < y :⟺ y > x :⟺ x ≤ y and x ≠ y.

Note that (P, ≥) is also a poset. A least element of P is a p ∈ P such that p ≤ x
for all x ∈ P; a largest element of P is defined likewise, with ≥ instead of ≤.
Of course, P can have at most one least element; therefore we can refer to the
[6] One also uses the term "total order" instead of "linear order".
least element of P, if P has a least element; likewise, we can refer to the largest
element of P, if P has a largest element.
A minimal element of P is a p ∈ P such that there is no x ∈ P with x < p;
a maximal element of P is defined likewise, with > instead of <. If P has a
least element, then this element is also the unique minimal element of P; some
posets, however, have more than one minimal element. The reader might want
to prove the following result to get a feeling for these notions:

If P is finite and nonempty, then P has a maximal element, and there is a linear
order ≤′ on P such that p ≤ q implies p ≤′ q, for all p, q ∈ P.

(Hint: use induction on |P|.)
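In computational terms, extending a finite partial order to a linear order is topological sorting: repeatedly remove a minimal element of what remains (one exists, by the result above). The sketch below is our own illustration, not from the notes; the function name and the divisibility example are ours.

```python
def linear_extension(P, leq):
    # repeatedly remove a minimal element of the remaining poset;
    # the removal order is a linear order extending leq
    remaining = set(P)
    order = []
    while remaining:
        # a minimal element: no strictly smaller element remains
        m = next(p for p in remaining
                 if not any(leq(q, p) and q != p for q in remaining))
        order.append(m)
        remaining.remove(m)
    return order

def divides(a, b):
    return b % a == 0

# divisibility on {1,...,8} is a partial, not linear, order
ext = linear_extension(range(1, 9), divides)

# the resulting order is linear and respects divisibility
for i, a in enumerate(ext):
    for b in ext[i + 1:]:
        assert not (divides(b, a) and b != a)
```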
Let X ⊆ P. A lowerbound (respectively, upperbound) of X in P is an element
l ∈ P (respectively, an element u ∈ P) such that l ≤ x for all x ∈ X (respectively,
x ≤ u for all x ∈ X).
We often tacitly consider X as a poset in its own right, by restricting the
given partial ordering of P to X. More precisely this means that we consider
the poset (X, ≤_X) where the partial ordering ≤_X on X is defined by

x ≤_X y :⟺ x ≤ y (x, y ∈ X).

Thus we can speak of least, largest, minimal, and maximal elements of a set
X ⊆ P, when the ambient poset (P, ≤) is clear from the context. For example,
when X is the collection of nonempty subsets of a set A and X is ordered by
inclusion, then the minimal elements of X are the singletons {a} with a ∈ A.
We call X a chain in P if (X, ≤_X) is linearly ordered.
Occasionally we shall use the following fact about posets (P, ≤).

Zorn's Lemma. Suppose P is nonempty and every nonempty chain in P has
an upperbound in P. Then P has a maximal element.

For a further discussion of Zorn's Lemma and its proof using the Axiom of
Choice we refer the reader to Halmos's book on set theory.
Chapter 2
Basic Concepts of Logic
2.1 Propositional Logic
Propositional logic is the fragment of logic where we construct new statements
from given statements using so-called "connectives" like "not", "or" and "and".
The truth value of such a new statement is then completely determined by the
truth values of the given statements. Thus, given any statements p and q, we
can form the three statements

¬p (the negation of p, pronounced as "not p"),
p ∨ q (the disjunction of p and q, pronounced as "p or q"),
p ∧ q (the conjunction of p and q, pronounced as "p and q").

This leads to more complicated combinations like ¬(p ∨ (¬q)). We shall regard
¬p as true if and only if p is not true; also, p ∨ q is defined to be true if and
only if p is true or q is true (including the possibility that both are true), and
p ∧ q is deemed to be true if and only if p is true and q is true. Instead of "not
true" we also say "false". We now introduce a formalism that makes this into
mathematics.
We start with the five distinct symbols

⊤  ⊥  ¬  ∨  ∧

to be thought of as true, false, not, or, and and, respectively. These symbols are
fixed throughout the course, and are called propositional connectives. In this
section we also fix a set A whose elements will be called propositional atoms (or
just atoms), such that no propositional connective is an atom. It may help the
reader to think of an atom a as a variable for which we can substitute arbitrary
statements, assumed to be either true or false.
A proposition on A is a word on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} that can
be obtained by applying the following rules:
(i) each atom a ∈ A (viewed as a word of length 1) is a proposition on A;
(ii) ⊤ and ⊥ (viewed as words of length 1) are propositions on A;
(iii) if p and q are propositions on A, then the concatenations ¬p, ∨pq and ∧pq
are propositions on A.
For the rest of this section "proposition" means "proposition on A", and p, q, r
(sometimes with subscripts) will denote propositions.

Example. Suppose a, b, c are atoms. Then ∧∨¬ab¬c is a proposition. This
follows from the rules above: a is a proposition, so ¬a is a proposition, hence
∨¬ab as well; also ¬c is a proposition, and thus ∧∨¬ab¬c is a proposition.
We defined "proposition" using the suggestive but vague phrase "can be obtained
by applying the following rules". The reader should take such an informal
description as shorthand for a completely explicit definition, which in the
case at hand is as follows:
A proposition is a word w on the alphabet A ∪ {⊤, ⊥, ¬, ∨, ∧} for which there
is a sequence w_1, ..., w_n of words on that same alphabet, with n ≥ 1, such that
w = w_n and for each k ∈ {1, ..., n}, either w_k ∈ A ∪ {⊤, ⊥} (where each element
in the last set is viewed as a word of length 1), or there are i, j ∈ {1, ..., k − 1}
such that w_k is one of the concatenations ¬w_i, ∨w_i w_j, ∧w_i w_j.
We let Prop(A) denote the set of propositions.
Remark. Having the connectives ∨ and ∧ in front of the propositions they
"connect" rather than in between, is called prefix notation or Polish notation.
This is theoretically elegant, but for the sake of readability we usually write p ∨ q
and p ∧ q to denote ∨pq and ∧pq respectively, and we also use parentheses and
brackets if this helps to clarify the structure of a proposition. So the proposition
in the example above could be denoted by [(¬a) ∨ b] ∧ (¬c), or even by (¬a ∨ b) ∧ ¬c
since we shall agree that ¬ binds stronger than ∨ and ∧ in this informal way
of indicating propositions. Because of the informal nature of these conventions,
we don't have to give precise rules for their use; it's enough that each actual
use is clear to the reader.
The intended structure of a proposition (how we think of it as built up
from atoms via connectives) is best exhibited in the form of a tree, a two-dimensional
array, rather than as a one-dimensional string. Such trees, however,
occupy valuable space on the printed page, and are typographically demanding.
Fortunately, our official prefix notation does uniquely determine the intended
structure of a proposition: that is what the next lemma amounts to.
Lemma 2.1.1 (Unique Readability). If p has length 1, then either p = ⊤, or
p = ⊥, or p is an atom. If p has length > 1, then its first symbol is either ¬,
or ∨, or ∧. If the first symbol of p is ¬, then p = ¬q for a unique q. If the first
symbol of p is ∨, then p = ∨qr for a unique pair (q, r). If the first symbol of p
is ∧, then p = ∧qr for a unique pair (q, r).
(Note that we used here our convention that p, q, r denote propositions.) Only the last two claims are worth proving in print; the others should require only a moment's thought. For now we shall assume this lemma without proof. At the end of this section we establish more general results which are needed also later in the course.
Remark. Rather than thinking of a proposition as a statement, it is better viewed as a function whose arguments and values are statements: replacing the atoms in a proposition by specific mathematical statements like "2 × 2 = 4", "2² < 7", and "every even integer > 2 is the sum of two prime numbers", we obtain again a mathematical statement.
We shall use the following notational conventions: p → q denotes ¬p ∨ q, and p ↔ q denotes (p → q) ∧ (q → p). By recursion on n we define

    p_1 ∨ ⋯ ∨ p_n =  ⊥                          if n = 0,
                     p_1                         if n = 1,
                     p_1 ∨ p_2                   if n = 2,
                     (p_1 ∨ ⋯ ∨ p_{n−1}) ∨ p_n   if n > 2.

Thus p ∨ q ∨ r stands for (p ∨ q) ∨ r. We call p_1 ∨ ⋯ ∨ p_n the disjunction of p_1, …, p_n. The reason that for n = 0 we take this disjunction to be ⊥ is that we want a disjunction to be true if and only if (at least) one of the disjuncts is true.

Similarly, the conjunction p_1 ∧ ⋯ ∧ p_n of p_1, …, p_n is defined by replacing ∨ everywhere by ∧ and ⊥ by ⊤ in the definition of p_1 ∨ ⋯ ∨ p_n.
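As a small sketch (the tuple encoding of propositions is ours, not the notes'), the recursive definition of n-ary disjunction and conjunction is exactly a left fold, with ⊥ and ⊤ as the values for the empty case:

```python
from functools import reduce

# Propositions as nested tuples: atoms are strings, ("or", p, q), ("and", p, q),
# and the constants "top" and "bot".

def big_or(ps):
    # n = 0 gives bot: an empty disjunction is false.
    return reduce(lambda acc, p: ("or", acc, p), ps) if ps else "bot"

def big_and(ps):
    # n = 0 gives top: an empty conjunction is true.
    return reduce(lambda acc, p: ("and", acc, p), ps) if ps else "top"
```

Note the left association: `big_or(["p", "q", "r"])` builds (p ∨ q) ∨ r, matching the convention above.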
Definition. A truth assignment is a map t : A → {0, 1}. We extend such a t to t̂ : Prop(A) → {0, 1} by requiring

(i) t̂(⊤) = 1
(ii) t̂(⊥) = 0
(iii) t̂(¬p) = 1 − t̂(p)
(iv) t̂(p ∨ q) = max(t̂(p), t̂(q))
(v) t̂(p ∧ q) = min(t̂(p), t̂(q))

Note that there is exactly one such extension t̂ by unique readability. To simplify notation we often write t instead of t̂. The array below is called a truth table. It shows on each row below the top row how the two leftmost entries t(p) and t(q) determine t(¬p), t(p ∨ q), t(p ∧ q), t(p → q) and t(p ↔ q).

    p  q  ¬p  p ∨ q  p ∧ q  p → q  p ↔ q
    0  0  1   0      0      1      1
    0  1  1   1      0      1      0
    1  0  0   1      0      0      0
    1  1  0   1      1      1      1

Let t : A → {0, 1}. Note that t(p → q) = 1 if and only if t(p) ≤ t(q), and that t(p ↔ q) = 1 if and only if t(p) = t(q).
Suppose a_1, …, a_n are the distinct atoms that occur in p, and we know how p is built up from those atoms. Then we can compute in a finite number of steps t(p) from t(a_1), …, t(a_n). In particular, t(p) = t′(p) for any t′ : A → {0, 1} such that t(a_i) = t′(a_i) for i = 1, …, n.
Definition. We say that p is a tautology (notation: ⊨ p) if t(p) = 1 for all t : A → {0, 1}. We say that p is satisfiable if t(p) = 1 for some t : A → {0, 1}. Thus ⊤ is a tautology, and p ∨ ¬p, p → (p ∨ q) are tautologies for all p and q. By the remark preceding the definition one can verify whether any given p with exactly n distinct atoms in it is a tautology by computing 2^n numbers and checking that these numbers all come out 1. (To do this accurately by hand is already cumbersome for n = 5, but computers can handle somewhat larger n. Fortunately, other methods are often efficient for special cases.)
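The 2^n check just described is easy to mechanize. Here is a minimal sketch (again in our own tuple encoding, which is not from the notes): the function `value` is the unique extension t̂, and `is_tautology` ranges over all truth assignments to the atoms of p.

```python
from itertools import product

# Propositions as nested tuples: atoms are strings, plus "top", "bot",
# ("not", p), ("or", p, q), ("and", p, q).

def atoms(p):
    if isinstance(p, str):
        return set() if p in ("top", "bot") else {p}
    return set().union(*(atoms(q) for q in p[1:]))

def value(p, t):
    # The unique extension of t : A -> {0, 1} to propositions.
    if p == "top": return 1
    if p == "bot": return 0
    if isinstance(p, str): return t[p]
    if p[0] == "not": return 1 - value(p[1], t)
    if p[0] == "or":  return max(value(p[1], t), value(p[2], t))
    return min(value(p[1], t), value(p[2], t))  # "and"

def is_tautology(p):
    # Check all 2^n truth assignments to the n distinct atoms of p.
    avs = sorted(atoms(p))
    return all(value(p, dict(zip(avs, bits))) == 1
               for bits in product((0, 1), repeat=len(avs)))
```

For example, `is_tautology(("or", "a", ("not", "a")))` confirms that p ∨ ¬p is a tautology; satisfiability is the same loop with `any` in place of `all`.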
Remark. Note that ⊨ p ↔ q iff t(p) = t(q) for all t : A → {0, 1}. We call p equivalent to q if ⊨ p ↔ q. Note that "equivalent to" defines an equivalence relation on Prop(A). The lemma below gives a useful list of equivalences. We leave it to the reader to verify them.
Lemma 2.1.2. For all p, q, r we have the following equivalences:
(1) ⊨ (p ∨ p) ↔ p, ⊨ (p ∧ p) ↔ p
(2) ⊨ (p ∨ q) ↔ (q ∨ p), ⊨ (p ∧ q) ↔ (q ∧ p)
(3) ⊨ (p ∨ (q ∨ r)) ↔ ((p ∨ q) ∨ r), ⊨ (p ∧ (q ∧ r)) ↔ ((p ∧ q) ∧ r)
(4) ⊨ (p ∨ (q ∧ r)) ↔ (p ∨ q) ∧ (p ∨ r), ⊨ (p ∧ (q ∨ r)) ↔ (p ∧ q) ∨ (p ∧ r)
(5) ⊨ (p ∨ (p ∧ q)) ↔ p, ⊨ (p ∧ (p ∨ q)) ↔ p
(6) ⊨ (¬(p ∨ q)) ↔ (¬p ∧ ¬q), ⊨ (¬(p ∧ q)) ↔ (¬p ∨ ¬q)
(7) ⊨ (p ∨ ¬p) ↔ ⊤, ⊨ (p ∧ ¬p) ↔ ⊥
(8) ⊨ (¬¬p) ↔ p
Items (1), (2), (3), (4), (5), and (6) are often referred to as the idempotent law, commutativity, associativity, distributivity, the absorption law, and the De Morgan law, respectively. Note the left-right symmetry in (1)–(7): the so-called duality of propositional logic. We shall return to this issue in the more algebraic setting of boolean algebras.
Some notation: let (p_i)_{i∈I} be a family of propositions with finite index set I, choose a bijection k ↦ i(k) : {1, …, n} → I and set

    ⋁_{i∈I} p_i := p_{i(1)} ∨ ⋯ ∨ p_{i(n)},    ⋀_{i∈I} p_i := p_{i(1)} ∧ ⋯ ∧ p_{i(n)}.

If I is clear from context we just write ⋁_i p_i and ⋀_i p_i instead. Of course, the notations ⋁_{i∈I} p_i and ⋀_{i∈I} p_i can only be used when the particular choice of bijection of {1, …, n} with I does not matter; this is usually the case, because the equivalence class of p_{i(1)} ∨ ⋯ ∨ p_{i(n)} does not depend on this choice, and the same is true for the equivalence class of p_{i(1)} ∧ ⋯ ∧ p_{i(n)}.
Next we define "model of Σ" and "tautological consequence of Σ".

Definition. Let Σ ⊆ Prop(A). By a model of Σ we mean a truth assignment t : A → {0, 1} such that t(p) = 1 for all p ∈ Σ. We say that a proposition p is a tautological consequence of Σ (written Σ ⊨ p) if t(p) = 1 for every model t of Σ. Note that ∅ ⊨ p is the same as ⊨ p.
Lemma 2.1.3. Let Σ ⊆ Prop(A) and p, q ∈ Prop(A). Then
(1) Σ ⊨ p ∧ q ⟺ Σ ⊨ p and Σ ⊨ q.
(2) Σ ⊨ p ⟹ Σ ⊨ p ∨ q.
(3) Σ ∪ {p} ⊨ q ⟺ Σ ⊨ p → q.
(4) If Σ ⊨ p and Σ ⊨ p → q, then Σ ⊨ q. (Modus Ponens.)

Proof. We will prove (3) here and leave the rest as an exercise.
(⇒) Assume Σ ∪ {p} ⊨ q. To derive Σ ⊨ p → q we consider any model t : A → {0, 1} of Σ, and need only show that then t(p → q) = 1. If t(p) = 1, then t is a model of Σ ∪ {p}, hence t(q) = 1 and thus t(p → q) = 1. If t(p) = 0, then t(p → q) = 1 by definition.
(⇐) Assume Σ ⊨ p → q. To derive Σ ∪ {p} ⊨ q we consider any model t : A → {0, 1} of Σ ∪ {p}, and need only derive that t(q) = 1. By assumption t(p → q) = 1, and in view of t(p) = 1, this gives t(q) = 1 as required.
We finish this section with the promised general result on unique readability. We also establish facts of similar nature that are needed later.
Definition. Let F be a set of symbols with a function a : F → ℕ (called the arity function). A symbol f ∈ F is said to have arity n if a(f) = n. A word on F is said to be admissible if it can be obtained by applying the following rules:
(i) If f ∈ F has arity 0, then f viewed as a word of length 1 is admissible.
(ii) If f ∈ F has arity m > 0 and t_1, …, t_m are admissible words on F, then the concatenation f t_1 ⋯ t_m is admissible.
Below we just write "admissible word" instead of "admissible word on F". Note that the empty word is not admissible, and that the last symbol of an admissible word cannot be of arity > 0.
Example. Take F = A ∪ {⊤, ⊥, ¬, ∨, ∧}, and define arity : F → ℕ by arity(x) = 0 for x ∈ A ∪ {⊤, ⊥}, arity(¬) = 1, arity(∨) = arity(∧) = 2. Then the set of admissible words is just Prop(A).
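The arity function is all a machine needs to recover the intended tree from an admissible word. Here is a sketch (our own encoding): each admissible word parses into a tuple (f, subtree_1, …, subtree_m), and unique readability (Lemma 2.1.5 below) is what guarantees this tree is unique.

```python
# Parse an admissible word on F into its unique tree, given the arity
# function as a dict from symbols to natural numbers.

def parse(word, arity):
    def read(i):
        # Read one admissible word starting at position i;
        # return (tree, position just past it).
        f = word[i]
        i += 1
        args = []
        for _ in range(arity[f]):
            sub, i = read(i)
            args.append(sub)
        return (f, *args), i

    tree, end = read(0)
    if end != len(word):
        raise ValueError("not an admissible word")
    return tree
```

With arity {"*": 2, "-": 1, "x": 0, "y": 0}, the word `"*-xy"` parses as `("*", ("-", ("x",)), ("y",))`, i.e. (−x) · y in infix notation.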
Lemma 2.1.4. Let t_1, …, t_m and u_1, …, u_n be admissible words and w any word on F such that t_1 ⋯ t_m w = u_1 ⋯ u_n. Then m ≤ n, t_i = u_i for i = 1, …, m, and w = u_{m+1} ⋯ u_n.
Proof. By induction on the length of u_1 ⋯ u_n. If this length is 0, then m = n = 0 and w is the empty word. Suppose the length is > 0, and assume the lemma holds for smaller lengths. Note that n > 0. If m = 0, then the conclusion of the lemma holds, so suppose m > 0. The first symbol of t_1 equals the first symbol of u_1. Say this first symbol is h ∈ F with arity k. Then t_1 = h a_1 ⋯ a_k and u_1 = h b_1 ⋯ b_k where a_1, …, a_k and b_1, …, b_k are admissible words. Cancelling the first symbol h gives

    a_1 ⋯ a_k t_2 ⋯ t_m w = b_1 ⋯ b_k u_2 ⋯ u_n.
(Caution: any of k, m − 1, n − 1 could be 0.) We have length(b_1 ⋯ b_k u_2 ⋯ u_n) = length(u_1 ⋯ u_n) − 1, so the induction hypothesis applies. It yields k + m − 1 ≤ k + n − 1 (so m ≤ n), a_1 = b_1, …, a_k = b_k (so t_1 = u_1), t_2 = u_2, …, t_m = u_m, and w = u_{m+1} ⋯ u_n.
Here are two immediate consequences that we shall use:

1. Let t_1, …, t_m and u_1, …, u_n be admissible words such that t_1 ⋯ t_m = u_1 ⋯ u_n. Then m = n and t_i = u_i for i = 1, …, m.

2. Let t and u be admissible words and w a word on F such that tw = u. Then t = u and w is the empty word.
Lemma 2.1.5 (Unique Readability). Each admissible word equals f t_1 ⋯ t_m for a unique tuple (f, t_1, …, t_m) where f ∈ F has arity m and t_1, …, t_m are admissible words.
Proof. Suppose f t_1 ⋯ t_m = g u_1 ⋯ u_n where f, g ∈ F have arity m and n respectively, and t_1, …, t_m, u_1, …, u_n are admissible words on F. We have to show that then f = g, m = n and t_i = u_i for i = 1, …, m. Observe first that f = g since f and g are the first symbols of two equal words. After cancelling the first symbol of both words, the first consequence of the previous lemma leads to the desired conclusion.
Given words v, w ∈ F*, we say that v occurs in w at starting position i if w = w_1 v w_2 where w_1, w_2 ∈ F* and w_1 has length i − 1. (For example, if f, g ∈ F are distinct, then the word fgf has exactly two occurrences in the word fgfgf, one at starting position 1, and the other at starting position 3; these two occurrences overlap, but such overlapping is impossible with admissible words, see exercise 5 at the end of this section.) Given w = w_1 v w_2 as above, and given v′ ∈ F*, the result of replacing v in w at starting position i by v′ is the word w_1 v′ w_2.
Lemma 2.1.6. Let w be an admissible word and 1 ≤ i ≤ length(w). Then there is a unique admissible word that occurs in w at starting position i.
Proof. We prove existence by induction on length(w). Uniqueness then follows from the fact stated just before Lemma 2.1.5. Clearly w is an admissible word occurring in w at starting position 1. Suppose i > 1. Then we write w = f t_1 ⋯ t_n where f ∈ F has arity n > 0 and t_1, …, t_n are admissible words, and we take j ∈ {1, …, n} such that

    1 + length(t_1) + ⋯ + length(t_{j−1}) < i ≤ 1 + length(t_1) + ⋯ + length(t_j).

Now apply the inductive assumption to t_j.
Remark. Let w = f t_1 ⋯ t_n where f ∈ F has arity n > 0, and t_1, …, t_n are admissible words. Put l_j := 1 + length(t_1) + ⋯ + length(t_j) for j = 0, …, n (so l_0 = 1). Suppose l_{j−1} < i ≤ l_j, 1 ≤ j ≤ n, and let v be the admissible word that occurs in w at starting position i. Then the proof of the last lemma shows that this occurrence is entirely inside t_j, that is, i − 1 + length(v) ≤ l_j.
Corollary 2.1.7. Let w be an admissible word and 1 ≤ i ≤ length(w). Then the result of replacing the admissible word v in w at starting position i by an admissible word v′ is again an admissible word.

Exercises. In the exercises below, A = {a_1, …, a_n}, where a_1, …, a_n are distinct atoms.

(1) (Disjunctive Normal Form) Each p is equivalent to a disjunction of conjunctions a_1^{ε_1} ∧ ⋯ ∧ a_n^{ε_n} with all ε_j ∈ {−1, 1}, where for an atom a we put a^1 := a and a^{−1} := ¬a.

(2) (Conjunctive Normal Form) Same as last problem, except that the signs ∨ and ∧ are interchanged, as well as the words "disjunction" and "conjunction", and also the words "disjunct" and "conjunct".

(3) To each p associate the function f_p : {0, 1}^A → {0, 1} defined by f_p(t) = t(p). (Think of a truth table for p where the 2^n rows correspond to the 2^n truth assignments t : A → {0, 1}, and the column under p records the values t(p).) Then for every function f : {0, 1}^A → {0, 1} there is a p such that f = f_p.

(4) Let ∼ be the equivalence relation on Prop(A) given by

    p ∼ q :⟺ ⊨ p ↔ q.

Then the quotient set Prop(A)/∼ is finite; determine its cardinality as a function of n = |A|.

(5) Let w be an admissible word and 1 ≤ i < i′ ≤ length(w), and let v and v′ be the admissible words that occur at starting positions i and i′ respectively in w. Then these occurrences are either nonoverlapping, that is, i − 1 + length(v) < i′, or the occurrence of v′ is entirely inside that of v, that is, i′ − 1 + length(v′) ≤ i − 1 + length(v).
2.2 Completeness for Propositional Logic
In this section we introduce a proof system for propositional logic, state the completeness of this proof system, and then prove this completeness. As in the previous section we fix a set A of atoms, and the conventions of that section remain in force.
A propositional axiom is by definition a proposition that occurs in the list below, for some choice of p, q, r:

1. ⊤
2. p → (p ∨ q); p → (q ∨ p)
3. ¬p → (¬q → ¬(p ∨ q))
4. (p ∧ q) → p; (p ∧ q) → q
5. p → (q → (p ∧ q))
6. (p → (q → r)) → ((p → q) → (p → r))
7. p → (¬p → ⊥)
8. (¬p → ⊥) → p

Each of items 2–8 describes infinitely many propositional axioms. That is why we do not call these items axioms, but axiom schemes. For example, if a, b ∈ A, then a → (a ∨ ⊥) and b → (b ∨ (a ∧ b)) are distinct propositional axioms, and both are instances of axiom scheme 2. It is easy to check that all propositional axioms are tautologies.
Here is our single rule of inference for propositional logic:

Modus Ponens (MP): from p and p → q, infer q.
In the rest of this section Σ denotes a set of propositions, that is, Σ ⊆ Prop(A).
Definition. A formal proof, or just proof, of p from Σ is a sequence p_1, …, p_n with n ≥ 1 and p_n = p, such that for k = 1, …, n:
(i) either p_k ∈ Σ,
(ii) or p_k is a propositional axiom,
(iii) or there are i, j ∈ {1, …, k − 1} such that p_k can be inferred from p_i and p_j by MP.
If there exists a proof of p from Σ, then we write Σ ⊢ p, and say Σ proves p. For Σ = ∅ we also write ⊢ p instead of ∅ ⊢ p.
Lemma 2.2.1. ⊢ p → p.

Proof. The proposition p → ((p → p) → p) is a propositional axiom by axiom scheme 2. By axiom scheme 6,

    (p → ((p → p) → p)) → ((p → (p → p)) → (p → p))

is a propositional axiom. Applying MP to these two axioms yields

    (p → (p → p)) → (p → p).

Since p → (p → p) is also a propositional axiom by scheme 2, we can apply MP again to obtain ⊢ p → p.
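This five-line proof is concrete enough to verify by machine. Below is a sketch (our own encoding, with → desugared to ¬·∨· as in the notes' convention): `check_mp` accepts a sequence of propositions in which every line is either a listed axiom instance or follows from two earlier lines by modus ponens, and we run it on the proof of ⊢ p → p above.

```python
# Formulas as tuples; p -> q abbreviates ("or", ("not", p), q).

def imp(p, q):
    return ("or", ("not", p), q)

def check_mp(proof, axioms):
    # Each proof line must be a listed axiom, or follow from earlier
    # lines r and r -> q by modus ponens.
    for k, q in enumerate(proof):
        ok = q in axioms or any(
            proof[j] == imp(proof[i], q)
            for i in range(k) for j in range(k))
        if not ok:
            return False
    return True

p = "p"
ax1 = imp(p, imp(imp(p, p), p))                    # instance of scheme 2
ax2 = imp(ax1, imp(imp(p, imp(p, p)), imp(p, p)))  # instance of scheme 6
ax3 = imp(p, imp(p, p))                            # instance of scheme 2
proof = [ax1, ax2,
         imp(imp(p, imp(p, p)), imp(p, p)),        # MP from lines 1, 2
         ax3,
         imp(p, p)]                                # MP from lines 4, 3
```

Note the checker trusts the axiom list it is given; recognizing arbitrary scheme instances would need a pattern matcher on top of this.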
Proposition 2.2.2. If Σ ⊢ p, then Σ ⊨ p.

This should be clear from earlier facts that we stated and which the reader was asked to verify. The converse is true but less obvious:

Theorem 2.2.3 (Completeness – first form). Σ ⊢ p ⟺ Σ ⊨ p
Remark. There is some arbitrariness in our choice of axioms and rule, and thus in our notion of formal proof. This is in contrast to the definition of ⊨, which merely formalizes the basic underlying idea of propositional logic as stated in the introduction to the previous section. However, the equivalence of ⊢ and ⊨ (Completeness Theorem) shows that our choice of axioms and rule yields a complete proof system. Moreover, this equivalence has consequences which can be stated in terms of ⊨ alone. An example is the Compactness Theorem.
Theorem 2.2.4 (Compactness of Propositional Logic). If Σ ⊨ p then there is a finite subset Σ_0 of Σ such that Σ_0 ⊨ p.
It is convenient to prove first a variant of the Completeness Theorem.

Definition. We say that Σ is inconsistent if Σ ⊢ ⊥, and otherwise (that is, if Σ ⊬ ⊥) we call Σ consistent.

Theorem 2.2.5 (Completeness – second form). Σ is consistent if and only if Σ has a model.

From this second form of the Completeness Theorem we obtain easily an alternative form of the Compactness of Propositional Logic:

Corollary 2.2.6. Σ has a model ⟺ every finite subset of Σ has a model.
We first show that the second form of the Completeness Theorem implies the first form. For this we need a technical lemma that will also be useful later in the course.
Lemma 2.2.7 (Deduction Lemma). Suppose Σ ∪ {p} ⊢ q. Then Σ ⊢ p → q.

Proof. By induction on proofs. If q is a propositional axiom, then ⊢ q, and since q → (p → q) is a propositional axiom, MP yields Σ ⊢ p → q. If q ∈ Σ ∪ {p}, then either q ∈ Σ, in which case the same argument as before gives Σ ⊢ p → q, or q = p, and then Σ ⊢ p → q since ⊢ p → p by the lemma above.
Now assume that q is obtained by MP from r and r → q, where Σ ∪ {p} ⊢ r and Σ ∪ {p} ⊢ r → q, and where we assume inductively that Σ ⊢ p → r and Σ ⊢ p → (r → q). Then we obtain Σ ⊢ p → q from the propositional axiom

    (p → (r → q)) → ((p → r) → (p → q))

by applying MP twice.
Corollary 2.2.8. Σ ⊢ p if and only if Σ ∪ {¬p} is inconsistent.

Proof. (⇒) Assume Σ ⊢ p. Since p → (¬p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ∪ {¬p} ⊢ ⊥. Hence Σ ∪ {¬p} is inconsistent.
(⇐) Assume Σ ∪ {¬p} is inconsistent. Then Σ ∪ {¬p} ⊢ ⊥, and so by the Deduction Lemma we have Σ ⊢ ¬p → ⊥. Since (¬p → ⊥) → p is a propositional axiom, MP yields Σ ⊢ p.
Corollary 2.2.9. The second form of Completeness (Theorem 2.2.5) implies the first form (Theorem 2.2.3).

Proof. Assume the second form of Completeness holds, and that Σ ⊨ p. We want to show that then Σ ⊢ p. From Σ ⊨ p it follows that Σ ∪ {¬p} has no model. Hence by the second form of Completeness, the set Σ ∪ {¬p} is inconsistent. Then by Corollary 2.2.8 we have Σ ⊢ p.
Definition. We say that Σ is complete if Σ is consistent, and for each p either Σ ⊢ p or Σ ⊢ ¬p.

"Completeness" as a property of a set of propositions should not be confused with the completeness of our proof system as expressed by the Completeness Theorem. (It is just a historical accident that we use the same word.) Below we use Zorn's Lemma to show that any consistent set of propositions can be extended to a complete set of propositions.
Lemma 2.2.10 (Lindenbaum). Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete Σ′ ⊆ Prop(A).

Proof. Let P be the collection of all consistent subsets of Prop(A) that contain Σ. In particular Σ ∈ P. We consider P as partially ordered by inclusion. Any totally ordered subcollection {Σ_i : i ∈ I} of P with I ≠ ∅ has an upper bound in P, namely ⋃{Σ_i : i ∈ I}. (To see this it suffices to check that ⋃{Σ_i : i ∈ I} is consistent. Suppose otherwise, that is, suppose ⋃{Σ_i : i ∈ I} ⊢ ⊥. Since a proof can use only finitely many of the axioms in ⋃{Σ_i : i ∈ I}, there exists i ∈ I such that Σ_i ⊢ ⊥, contradicting the consistency of Σ_i.) Thus by Zorn's lemma P has a maximal element Σ′. It remains to show that Σ′ is complete. Suppose Σ′ ⊬ p; we must show Σ′ ⊢ ¬p. By Corollary 2.2.8 the set Σ′ ∪ {¬p} is consistent, hence ¬p ∈ Σ′ by maximality of Σ′, and thus Σ′ ⊢ ¬p.
Suppose A is countable. For this case we can give a proof of Lindenbaum's Lemma without using Zorn's Lemma, as follows.

Proof. Because A is countable, Prop(A) is countable. Take an enumeration (p_n)_{n∈ℕ} of Prop(A). We construct an increasing sequence Σ = Σ_0 ⊆ Σ_1 ⊆ ⋯ of consistent subsets of Prop(A) as follows. Given a consistent Σ_n ⊆ Prop(A) we define

    Σ_{n+1} = Σ_n ∪ {p_n}    if Σ_n ⊢ p_n,
    Σ_{n+1} = Σ_n ∪ {¬p_n}   if Σ_n ⊬ p_n,

so Σ_{n+1} remains consistent by earlier results. Thus Σ′ := ⋃{Σ_n : n ∈ ℕ} is consistent and also complete: for any n either p_n ∈ Σ_{n+1} or ¬p_n ∈ Σ_{n+1}.
Define the truth assignment t_Σ : A → {0, 1} by

    t_Σ(a) = 1 if Σ ⊢ a, and t_Σ(a) = 0 otherwise.
Lemma 2.2.11. Suppose Σ is complete. Then for each p we have

    Σ ⊢ p ⟺ t_Σ(p) = 1.

In particular, t_Σ is a model of Σ.

Proof. We proceed by induction on the number of connectives in p. If p is an atom or p = ⊤ or p = ⊥, then the equivalence follows immediately from the definitions. It remains to consider the three cases below.

Case 1: p = ¬q, and (inductive assumption) Σ ⊢ q ⟺ t_Σ(q) = 1.
(⇒) Suppose Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(q) = 1, so Σ ⊢ q by the inductive assumption; since q → (p → ⊥) is a propositional axiom, we can apply MP twice to get Σ ⊢ ⊥, which contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 0, so Σ ⊬ q by the inductive assumption, and thus Σ ⊢ p by completeness of Σ.

Case 2: p = q ∨ r, Σ ⊢ q ⟺ t_Σ(q) = 1, and Σ ⊢ r ⟺ t_Σ(r) = 1.
(⇒) Suppose that Σ ⊢ p. Then t_Σ(p) = 1: Otherwise, t_Σ(p) = 0, so t_Σ(q) = 0 and t_Σ(r) = 0, hence Σ ⊬ q and Σ ⊬ r, so Σ ⊢ ¬q and Σ ⊢ ¬r by completeness; since ¬q → (¬r → ¬(q ∨ r)) is a propositional axiom, MP applied twice gives Σ ⊢ ¬p, which together with Σ ⊢ p contradicts the consistency of Σ.
(⇐) Suppose t_Σ(p) = 1. Then t_Σ(q) = 1 or t_Σ(r) = 1. Hence Σ ⊢ q or Σ ⊢ r. Using MP and the propositional axioms q → p and r → p we obtain Σ ⊢ p.

Case 3: p = q ∧ r, Σ ⊢ q ⟺ t_Σ(q) = 1, and Σ ⊢ r ⟺ t_Σ(r) = 1.
We leave this case as an exercise.
We can now finish the proof of Completeness (second form):
Suppose Σ is consistent. Then by Lindenbaum's Lemma Σ is a subset of a complete set Σ′ ⊆ Prop(A). By Lemma 2.2.11, Σ′ has a model, and such a model is also a model of Σ.
The converse (if Σ has a model, then Σ is consistent) is left to the reader.
Application to coloring infinite graphs. What follows is a standard use of compactness of propositional logic, one of many. Let (V, E) be a graph, by which we mean here that V is a set (of vertices) and E (the set of edges) is a binary relation on V that is irreflexive and symmetric, that is, for all v, w ∈ V we have (v, v) ∉ E, and if (v, w) ∈ E, then (w, v) ∈ E. Let some n ≥ 1 be given. Then an n-coloring of (V, E) is a function c : V → {1, …, n} such that c(v) ≠ c(w) for all (v, w) ∈ E: neighboring vertices should have different colors.
Suppose for every finite V_0 ⊆ V there is an n-coloring of (V_0, E_0), where E_0 := E ∩ (V_0 × V_0). We claim that there exists an n-coloring of (V, E).
Proof. Take A := V × {1, …, n} as the set of atoms, and think of an atom (v, i) as representing the statement that v has color i. Thus for (V, E) to have an n-coloring means that the following set Σ ⊆ Prop(A) has a model:

    Σ := {(v, 1) ∨ ⋯ ∨ (v, n) : v ∈ V}
         ∪ {¬((v, i) ∧ (v, j)) : v ∈ V, 1 ≤ i < j ≤ n}
         ∪ {¬((v, i) ∧ (w, i)) : (v, w) ∈ E, 1 ≤ i ≤ n}.

The assumption that all finite subgraphs of (V, E) are n-colorable yields that every finite subset of Σ has a model. Hence by compactness Σ has a model.
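For a finite graph the three clause families in Σ can be checked directly. The sketch below (our own formulation; the brute-force search is for illustration only) looks for a truth assignment to the atoms (v, i) satisfying exactly those three families, and reads off a coloring from it:

```python
from itertools import product

def n_coloring(vertices, edges, n):
    # Try every assignment t of 0/1 to the atoms (v, i), v a vertex, 1 <= i <= n.
    for bits in product(range(2), repeat=len(vertices) * n):
        t = {(v, i + 1): bits[k * n + i]
             for k, v in enumerate(vertices) for i in range(n)}
        if (all(any(t[(v, i)] for i in range(1, n + 1)) for v in vertices)
            # (v,1) v ... v (v,n): every vertex gets a color
            and all(not (t[(v, i)] and t[(v, j)]) for v in vertices
                    for i in range(1, n) for j in range(i + 1, n + 1))
            # not((v,i) and (v,j)): at most one color per vertex
            and all(not (t[(v, i)] and t[(w, i)])
                    for (v, w) in edges for i in range(1, n + 1))):
            # not((v,i) and (w,i)): neighbors get different colors
            return {v: next(i for i in range(1, n + 1) if t[(v, i)])
                    for v in vertices}
    return None
```

A triangle, for instance, admits no model of Σ for n = 2 but does for n = 3; the compactness argument above is what lifts such finite checks to infinite graphs.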
Exercises.
(1) Let (P, ≤) be a poset. Then there is a linear order ≤′ on P such that a ≤ b ⟹ a ≤′ b for all a, b ∈ P.

(ii) the interpretation of an m-ary R ∈ L^r in 𝒜/∼ is the m-ary relation

    {(a_1∼, …, a_m∼) : (a_1, …, a_m) ∈ R^𝒜}

on A/∼;
(iii) the interpretation of an n-ary F ∈ L^f in 𝒜/∼ is the n-ary operation

    (a_1∼, …, a_n∼) ↦ F^𝒜(a_1, …, a_n)∼

on A/∼.
Note that then we have a strong homomorphism a ↦ a∼ : 𝒜 → 𝒜/∼.
Products. Let (ℬ_i)_{i∈I} be a family of L-structures, ℬ_i = (B_i; …) for i ∈ I. The product

    ∏_{i∈I} ℬ_i

is defined to be the L-structure ℬ whose underlying set is the product set ∏_{i∈I} B_i, and where the basic relations and functions are defined coordinatewise: for m-ary R ∈ L^r and elements b_1 = (b_{1i}), …, b_m = (b_{mi}) ∈ ∏_{i∈I} B_i,

    (b_1, …, b_m) ∈ R^ℬ ⟺ (b_{1i}, …, b_{mi}) ∈ R^{ℬ_i} for all i ∈ I,

and for n-ary F ∈ L^f and b_1 = (b_{1i}), …, b_n = (b_{ni}) ∈ ∏_{i∈I} B_i,

    F^ℬ(b_1, …, b_n) := (F^{ℬ_i}(b_{1i}, …, b_{ni}))_{i∈I}.

For j ∈ I the projection map to the jth factor is the homomorphism

    ∏_{i∈I} ℬ_i → ℬ_j,  (b_i) ↦ b_j.

This product construction makes it possible to combine several homomorphisms with a common domain into a single one: if for each i ∈ I we have a homomorphism h_i : 𝒜 → ℬ_i, we obtain a homomorphism

    h = (h_i) : 𝒜 → ∏_{i∈I} ℬ_i,  h(a) := (h_i(a))_{i∈I}.
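The coordinatewise definition above can be sketched in code for finitely many factors with a purely functional language (the encoding of structures as pairs (universe, operations) is our own choice for illustration):

```python
from itertools import product as cartesian

def product_structure(factors, symbols):
    # factors: list of (universe, {symbol: python function}) pairs.
    # The product's universe is the cartesian product of the universes;
    # each operation acts coordinatewise, as in the definition above.
    universe = list(cartesian(*(u for u, _ in factors)))
    ops = {}
    for sym in symbols:
        def op(*args, sym=sym):
            return tuple(ops_i[sym](*(a[i] for a in args))
                         for i, (_, ops_i) in enumerate(factors))
        ops[sym] = op
    return universe, ops

# Example (an assumption for illustration): Z_2 x Z_3 as additive groups.
z2 = (range(2), {"zero": lambda: 0, "neg": lambda a: (-a) % 2,
                 "add": lambda a, b: (a + b) % 2})
z3 = (range(3), {"zero": lambda: 0, "neg": lambda a: (-a) % 3,
                 "add": lambda a, b: (a + b) % 3})
U, ops = product_structure([z2, z3], ["zero", "neg", "add"])
```

Here `ops["add"]((1, 2), (1, 2))` computes in each coordinate separately, giving (0, 1); nullary symbols like `"zero"` yield the tuple of the factors' constants.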
Exercises.
(1) Let G be a group viewed as a structure for the language of groups. Each normal subgroup N of G yields a congruence ∼_N on G by

    a ∼_N b :⟺ aN = bN,

and each congruence on G equals ∼_N for a unique normal subgroup N of G.
(2) Consider a strong homomorphism h : 𝒜 → ℬ of L-structures. Then we have an isomorphism from 𝒜/∼_h onto h(𝒜) given by a∼_h ↦ h(a).
2.4 Variables and Terms
Throughout this course

    Var = {v_0, v_1, v_2, …}

is a countably infinite set of symbols whose elements will be called variables; we assume that v_m ≠ v_n for m ≠ n, and that no variable is a function or relation symbol in any language. We let x, y, z (sometimes with subscripts or superscripts) denote variables, unless indicated otherwise.
Remark. Chapters 2–4 go through if we take as our set Var of variables any infinite (possibly uncountable) set; in model theory this can even be convenient. For this more general Var we still insist that no variable is a function or relation symbol in any language. In the few cases in chapters 2–4 that this more general set-up requires changes in proofs, this will be pointed out.
The results in Chapter 5 on undecidability presuppose a numbering of the variables; our Var = {v_0, v_1, v_2, …} comes equipped with such a numbering.
Definition. An L-term is a word on the alphabet L^f ∪ Var obtained as follows:
(i) each variable (viewed as a word of length 1) is an L-term;
(ii) whenever F ∈ L^f is n-ary and t_1, …, t_n are L-terms, then the concatenation F t_1 ⋯ t_n is an L-term.
Note: constant symbols of L are L-terms of length 1, by clause (ii) for n = 0. The L-terms are the admissible words on the alphabet L^f ∪ Var where each variable has arity 0. Thus unique readability is available.
We often write t(x_1, …, x_n) to indicate an L-term t in which no variables other than x_1, …, x_n occur. Whenever we use this notation we assume tacitly that x_1, …, x_n are distinct. Note that we do not require that each of x_1, …, x_n actually occurs in t(x_1, …, x_n). (This is like indicating a polynomial in the indeterminates x_1, …, x_n by p(x_1, …, x_n), where one allows that some of these indeterminates do not actually occur in the polynomial p.)
If a term is written as an admissible word, then it may be hard to see how it is built up from subterms. In practice we shall therefore use parentheses and brackets in denoting terms, and avoid prefix notation if tradition dictates otherwise.

Example. The word ·+x−yz is an L_Ri-term. For easier reading we indicate this term instead by (x + (−y)) · z or even (x − y)z.
Definition. Let 𝒜 be an L-structure and t = t(x) an L-term, where x = (x_1, …, x_m). Then we associate to the ordered pair (t, x) a function t^𝒜 : A^m → A as follows:
(i) If t is the variable x_i, then t^𝒜(a) = a_i for a = (a_1, …, a_m) ∈ A^m.
(ii) If t = F t_1 ⋯ t_n where F ∈ L^f is n-ary and t_1, …, t_n are L-terms, then t^𝒜(a) = F^𝒜(t_1^𝒜(a), …, t_n^𝒜(a)) for a ∈ A^m.
This inductive definition is justified by unique readability. Note that if ℬ is a second L-structure and 𝒜 ⊆ ℬ, then t^𝒜(a) = t^ℬ(a) for t as above and a ∈ A^m.
Example. Consider ℝ as a ring in the usual way, and let t(x, y, z) be the L_Ri-term (x − y)z. Then the function t^ℝ : ℝ³ → ℝ is given by t^ℝ(a, b, c) = (a − b)c.
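Clauses (i) and (ii) above are a direct recursion, and can be sketched as follows (our own encoding: variables are strings, compound terms are tuples, and the structure's function symbols are interpreted by a dict of callables):

```python
def term_value(t, interp, env):
    # Clause (i): a variable is looked up in the environment env.
    if isinstance(t, str):
        return env[t]
    # Clause (ii): a compound term (F, t_1, ..., t_n) is evaluated by first
    # evaluating the subterms, then applying the interpretation of F.
    f, *args = t
    return interp[f](*(term_value(s, interp, env) for s in args))

# The example term (x - y)z over the ring of integers (illustrative choices):
ring = {"+": lambda a, b: a + b, "*": lambda a, b: a * b,
        "-": lambda a: -a, "0": lambda: 0, "1": lambda: 1}
t = ("*", ("+", "x", ("-", "y")), "z")
```

With env = {"x": 5, "y": 2, "z": 3} this returns (5 − 2) · 3 = 9; a constant symbol like `("0",)` is the n = 0 case of clause (ii).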
A term is said to be variable-free if no variables occur in it. Let t be a variable-free L-term and 𝒜 an L-structure. Then the above gives a nullary function t^𝒜 : A⁰ → A, identified as usual with its value at the unique element of A⁰, so t^𝒜 ∈ A. In other words, if t is a constant symbol c, then t^𝒜 = c^𝒜 ∈ A, where c^𝒜 is as in the previous section, and if t = F t_1 ⋯ t_n with n-ary F ∈ L^f and variable-free L-terms t_1, …, t_n, then t^𝒜 = F^𝒜(t_1^𝒜, …, t_n^𝒜).
Definition. Let t be an L-term, let x_1, …, x_n be distinct variables, and let τ_1, …, τ_n be L-terms. Then t(τ_1/x_1, …, τ_n/x_n) is the word obtained by replacing all occurrences of x_i in t by τ_i, simultaneously for i = 1, …, n. If t is given in the form t(x_1, …, x_n), then we write t(τ_1, …, τ_n) as a shorthand for t(τ_1/x_1, …, τ_n/x_n).
The easy proof of the next lemma is left to the reader.

Lemma 2.4.1. Suppose t is an L-term, x_1, …, x_n are distinct variables, and τ_1, …, τ_n are L-terms. Then t(τ_1/x_1, …, τ_n/x_n) is an L-term. If τ_1, …, τ_n are variable-free and t = t(x_1, …, x_n), then t(τ_1, …, τ_n) is variable-free.
We urge the reader to do exercise (1) below and thus acquire the confidence that these formal term substitutions do correspond to actual function substitutions.
In the definition of t(τ_1/x_1, …, τ_n/x_n) the replacing should be simultaneous, because it can happen that for t′ := t(τ_1/x_1) we have t′(τ_2/x_2) ≠ t(τ_1/x_1, τ_2/x_2). (Here t, τ_1, τ_2 are L-terms and x_1, x_2 are distinct variables.)
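The point about simultaneity is easy to see concretely. In the sketch below (our own term encoding), substituting x_1 by x_2 and x_2 by y one at a time captures the x_2 that was just substituted in, while the simultaneous substitution does not:

```python
def subst(t, repl):
    # repl maps variable names to replacement terms, applied simultaneously:
    # each variable in t is replaced exactly once, with no re-scanning.
    if isinstance(t, str):
        return repl.get(t, t)
    f, *args = t
    return (f, *(subst(s, repl) for s in args))

t = ("+", "x1", "x2")
simultaneous = subst(t, {"x1": "x2", "x2": "y"})          # x2 + y
sequential = subst(subst(t, {"x1": "x2"}), {"x2": "y"})   # y + y
```

Here `simultaneous` is ("+", "x2", "y") while `sequential` is ("+", "y", "y"): the second pass rewrote the freshly inserted x2 as well.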
Generators. Let ℬ be an L-structure, let G ⊆ B, and assume also that L has a constant symbol or that G ≠ ∅. Then the set

    {t^ℬ(g_1, …, g_m) : t(x_1, …, x_m) is an L-term and g_1, …, g_m ∈ G} ⊆ B

is the underlying set of some 𝒜 ⊆ ℬ, and this 𝒜 is clearly a substructure of any 𝒜′ ⊆ ℬ with G ⊆ A′. We call this 𝒜 the substructure of ℬ generated by G.

Exercises.
(1) Let t(x_1, …, x_m) be an L-term, and let τ_1(y_1, …, y_n), …, τ_m(y_1, …, y_n) be L-terms. Then the L-term

    t′(y_1, …, y_n) := t(τ_1(y_1, …, y_n), …, τ_m(y_1, …, y_n))

has the property that if 𝒜 is an L-structure and a = (a_1, …, a_n) ∈ A^n, then

    (t′)^𝒜(a) = t^𝒜(τ_1^𝒜(a), …, τ_m^𝒜(a)).
(2) For every L_Ab-term t(x_1, …, x_n) there are integers k_1, …, k_n such that for every abelian group 𝒜 = (A; 0, −, +),

    t^𝒜(a_1, …, a_n) = k_1a_1 + ⋯ + k_na_n,  for all (a_1, …, a_n) ∈ A^n.

Conversely, for any integers k_1, …, k_n there is an L_Ab-term t(x_1, …, x_n) such that in every abelian group 𝒜 = (A; 0, −, +) the above displayed identity holds.
(3) For every L_Ri-term t(x_1, …, x_n) there is a polynomial

    P(x_1, …, x_n) ∈ ℤ[x_1, …, x_n]

such that for every commutative ring ℛ = (R; 0, 1, −, +, ·),

    t^ℛ(r_1, …, r_n) = P(r_1, …, r_n),  for all (r_1, …, r_n) ∈ R^n.

Conversely, for any polynomial P(x_1, …, x_n) ∈ ℤ[x_1, …, x_n] there is an L_Ri-term t(x_1, …, x_n) such that in every commutative ring ℛ = (R; 0, 1, −, +, ·) the above displayed identity holds.
(4) Let 𝒜 and ℬ be L-structures, h : 𝒜 → ℬ a homomorphism, and t = t(x_1, …, x_n) an L-term. Then

    h(t^𝒜(a_1, …, a_n)) = t^ℬ(ha_1, …, ha_n),  for all (a_1, …, a_n) ∈ A^n.

(If 𝒜 ⊆ ℬ and h : 𝒜 → ℬ is the inclusion, this gives t^𝒜(a_1, …, a_n) = t^ℬ(a_1, …, a_n) for all (a_1, …, a_n) ∈ A^n.)
(5) Consider the L-structure 𝒜 = (ℕ; 0, 1, +, ·) where L = L_Rig.
(a) Is there an L-term t(x) such that t^𝒜(0) = 1 and t^𝒜(1) = 0?
(b) Is there an L-term t(x) such that t^𝒜(n) = 2^n for all n ∈ ℕ?
(c) Find all the substructures of 𝒜.
2.5 Formulas and Sentences
Besides variables we also introduce the eight distinct logical symbols

    ⊤  ⊥  ¬  ∨  ∧  =  ∃  ∀

The first five of these we already met when discussing propositional logic. None of these eight symbols is a variable, or a function or relation symbol of any language. Below L denotes a language. To distinguish the logical symbols from those in L, the latter are often referred to as the non-logical symbols.
Definition. The atomic L-formulas are the following words on the alphabet L ∪ Var ∪ {⊤, ⊥, =}:
(i) ⊤ and ⊥,
(ii) R t_1 ⋯ t_m, where R ∈ L^r is m-ary and t_1, …, t_m are L-terms,
(iii) = t_1 t_2, where t_1 and t_2 are L-terms.
The L-formulas are the words on the larger alphabet

    L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀}

obtained as follows:
(i) every atomic L-formula is an L-formula;
(ii) if φ, ψ are L-formulas, then so are ¬φ, ∨φψ and ∧φψ;
(iii) if φ is an L-formula and x is a variable, then ∃xφ and ∀xφ are L-formulas.
Note that all L-formulas are admissible words on the alphabet

    L ∪ Var ∪ {⊤, ⊥, ¬, ∨, ∧, =, ∃, ∀}

where =, ∃ and ∀ are given arity 2 and the other symbols have the arities assigned to them earlier. This fact makes the results on unique readability applicable to L-formulas. (However, not all admissible words on this alphabet are L-formulas: the word ∃xx is admissible but not an L-formula.)
The notational conventions introduced in the section on propositional logic go through, with the role of propositions there taken over by formulas here. (For example, given L-formulas φ and ψ we shall write φ → ψ to indicate ¬φ ∨ ψ, and φ ↔ ψ to indicate (φ → ψ) ∧ (ψ → φ).)
The reader should distinguish between different ways of using the symbol =. Sometimes it denotes one of the eight formal logical symbols, but we also use it to indicate equality of mathematical objects in the way we have done already many times. The context should always make it clear what our intention is in this respect without having to spell it out. To increase readability we usually write an atomic formula = t_1 t_2 as t_1 = t_2, and its negation ¬ = t_1 t_2 as t_1 ≠ t_2, where t_1, t_2 are L-terms. The logical symbol = is treated just as a binary relation symbol, but its interpretation in a structure will always be the equality relation on its underlying set. This will become clear later.
Definition. Let φ be a formula of L. Written as a word on the alphabet above we have φ = s_1 ⋯ s_m. A subformula of φ is a subword of the form s_i ⋯ s_k, where 1 ≤ i ≤ k ≤ m, which also happens to be a formula of L.
An occurrence of a variable x in φ at the j-th place (that is, s_j = x) is said to be a bound occurrence if φ has a subformula s_i s_{i+1} ⋯ s_k with i ≤ j ≤ k that is of the form ∃xψ or ∀xψ. If an occurrence is not bound then it is said to be a free occurrence.
At this point the reader is invited to do the first exercise at the end of this section, which gives another useful characterization of subformulas.

Example. In the formula (∃x(x = y)) ∧ x = 0, where x and y are distinct, the first two occurrences of x are bound, the third is free, and the only occurrence of y is free. (Note: the formula is actually the string ∧∃x=xy=x0, and the occurrences of x and y are really the occurrences in this string.)
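The bound/free distinction can be sketched as a recursion on the structure of the formula (our own encoding, which records for each atomic formula the set of variables occurring in it): a quantifier binds its variable throughout its subformula, so a variable has a free occurrence in ∃xψ or ∀xψ exactly if it does in ψ and is not x.

```python
def free_vars(phi):
    # Return the set of variables with at least one free occurrence in phi.
    op = phi[0]
    if op == "atom":                  # ("atom", {variables occurring in it})
        return set(phi[1])
    if op == "not":
        return free_vars(phi[1])
    if op in ("or", "and"):
        return free_vars(phi[1]) | free_vars(phi[2])
    # ("exists", x, psi) or ("forall", x, psi): x is bound in psi.
    return free_vars(phi[2]) - {phi[1]}

# The example above: (exists x (x = y)) and (x = 0).
phi = ("and", ("exists", "x", ("atom", {"x", "y"})), ("atom", {"x"}))
```

Here `free_vars(phi)` is {"x", "y"}: x is bound inside the quantifier but has a free occurrence in the second conjunct, exactly as in the worked example.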
Definition. A sentence is a formula in which all occurrences of variables are bound occurrences.

We write φ(x_1, …, x_n) to indicate a formula φ such that all variables that occur free in φ are among x_1, …, x_n. In using this notation it is understood that x_1, …, x_n are distinct variables, but it is not required that each of x_1, …, x_n occurs free in φ. (This is like indicating a polynomial in the indeterminates x_1, …, x_n by p(x_1, …, x_n), where one allows that some of these indeterminates do not actually occur in p.)
Definition. Let φ be an L-formula, let x_1, …, x_n be distinct variables, and let t_1, …, t_n be L-terms. Then φ(t_1/x_1, …, t_n/x_n) is the word obtained by replacing all the free occurrences of x_i in φ by t_i, simultaneously for i = 1, …, n. If φ is given in the form φ(x_1, …, x_n), then we write φ(t_1, …, t_n) as a shorthand for φ(t_1/x_1, …, t_n/x_n).
We have the following lemma, whose routine proof is left to the reader.

Lemma 2.5.1. Suppose φ is an L-formula, x_1, …, x_n are distinct variables, and t_1, …, t_n are L-terms. Then φ(t_1/x_1, …, t_n/x_n) is an L-formula. If t_1, …, t_n are variable-free and φ = φ(x_1, …, x_n), then φ(t_1, …, t_n) is an L-sentence.
In the definition of φ(t₁/x₁, …, tₙ/xₙ) the replacing should be simultaneous, because it can happen that φ(t₁/x₁)(t₂/x₂) ≠ φ(t₁/x₁, t₂/x₂). (For instance, if φ is the formula x₁ = 0, then φ(x₂/x₁)(y/x₂) is the formula y = 0, while φ(x₂/x₁, y/x₂) is the formula x₂ = 0.)
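This contrast can be checked mechanically. Below is a minimal Python sketch, using our own toy encoding of quantifier-free formulas as nested tuples (not the notes' official word-based machinery); since no quantifiers occur, free and bound occurrences need not be distinguished.

```python
# phi below is the formula x1 = 0, encoded as ("=", "x1", "0").

def subst_simultaneous(phi, repl):
    """Replace every variable of phi that is a key of repl, all at once."""
    if isinstance(phi, str):
        return repl.get(phi, phi)
    return tuple(subst_simultaneous(part, repl) for part in phi)

def subst_sequential(phi, pairs):
    """Apply the single substitutions one after another."""
    for var, term in pairs:
        phi = subst_simultaneous(phi, {var: term})
    return phi

phi = ("=", "x1", "0")
# phi(x2/x1, y/x2): simultaneous, so the freshly introduced x2 is NOT hit by y/x2.
simul = subst_simultaneous(phi, {"x1": "x2", "x2": "y"})
# phi(x2/x1)(y/x2): sequential, so the x2 introduced first is then replaced by y.
seq = subst_sequential(phi, [("x1", "x2"), ("x2", "y")])
print(simul)  # ('=', 'x2', '0')
print(seq)    # ('=', 'y', '0')
```

The two results differ exactly as in the remark above.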
Let 𝒜 be an L-structure with underlying set A, and let C ⊆ A. We extend L to a language L_C by adding a constant symbol c for each c ∈ C, called the name of c. These names are symbols not in L. We make 𝒜 into an L_C-structure by keeping the same underlying set and interpretations of symbols of L, and by interpreting each name c as the element c ∈ C. The L_C-structure thus obtained is indicated by 𝒜_C. Hence for each variable-free L_C-term t we have a corresponding element t^{𝒜_C} of A, which for simplicity of notation we denote instead by t^𝒜. All this applies in particular to the case C = A, where in L_A we have a name a for each a ∈ A.
Definition. We can now define what it means for an L_A-sentence σ to be true in the L-structure 𝒜 (notation: 𝒜 ⊨ σ, also read as "𝒜 satisfies σ" or "σ holds in 𝒜", or "σ is valid in 𝒜"). First we consider atomic L_A-sentences:

(i) 𝒜 ⊨ ⊤, and 𝒜 ⊭ ⊥;
(ii) 𝒜 ⊨ Rt₁ … tₘ if and only if (t₁^𝒜, …, tₘ^𝒜) ∈ R^𝒜, for m-ary R ∈ L^r and variable-free L_A-terms t₁, …, tₘ;
(iii) 𝒜 ⊨ t₁ = t₂ if and only if t₁^𝒜 = t₂^𝒜, for variable-free L_A-terms t₁, t₂.

We extend the definition inductively to arbitrary L_A-sentences as follows:

(i) Suppose σ = ¬σ₁. Then 𝒜 ⊨ σ if and only if 𝒜 ⊭ σ₁.
(ii) Suppose σ = σ₁ ∨ σ₂. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ₁ or 𝒜 ⊨ σ₂.
(iii) Suppose σ = σ₁ ∧ σ₂. Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ σ₁ and 𝒜 ⊨ σ₂.
(iv) Suppose σ = ∃xφ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ φ(a) for some a ∈ A.
(v) Suppose σ = ∀xφ(x). Then 𝒜 ⊨ σ if and only if 𝒜 ⊨ φ(a) for all a ∈ A.
Even if we just want to define 𝒜 ⊨ σ for L-sentences σ, one can see that if σ has the form ∃xφ(x) or ∀xφ(x), the inductive definition above forces us to consider L_A-sentences φ(a). This is why we introduced names. We didn't say so explicitly, but "inductive" refers here to induction with respect to the number of logical symbols in σ. For example, the fact that φ(a) has fewer logical symbols than ∃xφ(x) is crucial for the above to count as a definition.
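When the structure is finite, the inductive clauses translate directly into a recursive procedure. The Python sketch below is our own illustration (the encoding of sentences as nested tuples, with elements of A serving as their own names in the role of L_A, is an assumption of the sketch, not part of the notes):

```python
def sat(A, R, sigma):
    """A: the underlying (finite) set; R: dict interpreting relation symbols."""
    op = sigma[0]
    if op == "R":                      # atomic: ("R", symbol, a1, ..., am)
        return tuple(sigma[2:]) in R[sigma[1]]
    if op == "=":                      # atomic: ("=", a, b)
        return sigma[1] == sigma[2]
    if op == "not":
        return not sat(A, R, sigma[1])
    if op == "or":
        return sat(A, R, sigma[1]) or sat(A, R, sigma[2])
    if op == "and":
        return sat(A, R, sigma[1]) and sat(A, R, sigma[2])
    if op == "exists":                 # ("exists", phi), phi a function a -> sentence
        return any(sat(A, R, sigma[1](a)) for a in A)
    if op == "forall":
        return all(sat(A, R, sigma[1](a)) for a in A)
    raise ValueError(op)

# The ordered set ({0,1,2}; <) satisfies "for every x there is y with x < y or x = y".
A = {0, 1, 2}
R = {"<": {(a, b) for a in A for b in A if a < b}}
sigma = ("forall", lambda x: ("exists", lambda y: ("or", ("R", "<", x, y), ("=", x, y))))
print(sat(A, R, sigma))  # True
```

The recursion terminates because each call descends to a sentence with fewer logical symbols, exactly the point made above.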
It is easy to check that for an L_A-sentence σ = ∃x₁ … ∃xₙ φ(x₁, …, xₙ),

  𝒜 ⊨ σ ⟺ 𝒜 ⊨ φ(a₁, …, aₙ) for some (a₁, …, aₙ) ∈ A^n,

and that for an L_A-sentence σ = ∀x₁ … ∀xₙ φ(x₁, …, xₙ),

  𝒜 ⊨ σ ⟺ 𝒜 ⊨ φ(a₁, …, aₙ) for all (a₁, …, aₙ) ∈ A^n.
Definition. Given an L_A-formula φ(x₁, …, xₙ) we let φ^𝒜 be the following subset of A^n:

  φ^𝒜 = {(a₁, …, aₙ) ∈ A^n : 𝒜 ⊨ φ(a₁, …, aₙ)}.

The formula φ(x₁, …, xₙ) is said to define the set φ^𝒜 in 𝒜. A set S ⊆ A^n is said to be definable in 𝒜 if S = φ^𝒜 for some L_A-formula φ(x₁, …, xₙ). If moreover φ can be chosen to be an L-formula, then S is said to be 0-definable in 𝒜.
Examples.

(1) The set {r ∈ R : r < √2} is 0-definable in (R; <, 0, 1, +, ·): since r < √2 if and only if r < 0 or r·r < 1 + 1, it is defined there by the formula x < 0 ∨ x·x < 1 + 1.

(3) Suppose the sets S₁, S₂ ⊆ A^n are defined in 𝒜 by φ₁(x₁, …, xₙ) and φ₂(x₁, …, xₙ) respectively. Then:

(a) S₁ ∩ S₂ is defined in 𝒜 by (φ₁ ∧ φ₂)(x₁, …, xₙ).
(b) S₁ ∪ S₂ is defined in 𝒜 by (φ₁ ∨ φ₂)(x₁, …, xₙ).
(c) A^n ∖ S₁ is defined in 𝒜 by ¬φ₁(x₁, …, xₙ).
(d) S₁ ⊆ S₂ ⟺ 𝒜 ⊨ ∀x₁ … ∀xₙ(φ₁ → φ₂).
(4) Let π : A^{m+n} → A^m be the projection map given by

  π(a₁, …, a_{m+n}) = (a₁, …, aₘ),

and for S ⊆ A^{m+n} and a ∈ A^m, put

  S(a) := {b ∈ A^n : (a, b) ∈ S}  (a section of S).

Suppose that S ⊆ A^{m+n} is defined in 𝒜 by the L_A-formula φ(x, y) where x = (x₁, …, xₘ) and y = (y₁, …, yₙ). Then ∃y₁ … ∃yₙ φ(x, y) defines in 𝒜 the subset π(S) of A^m, and ∀y₁ … ∀yₙ φ(x, y) defines in 𝒜 the set

  {a ∈ A^m : S(a) = A^n}.
(5) The following sets are 0-definable in the corresponding structures:
(a) The ordering relation {(m, n) ∈ N^2 : m < n} in (N; 0, +).
(b) The set {2, 3, 5, 7, …} of prime numbers in the semiring A = (N; 0, 1, +, ·).
(c) The set {2^n : n ∈ N} in the semiring A.
(d) The set {a ∈ R : f is continuous at a} in (R; <, f), where f : R → R is any function.
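For instance, for (5)(a) one candidate defining formula is ∃z(¬ z = 0 ∧ x + z = y), which uses only 0 and +. The notes do not exhibit a formula here, so this particular choice is ours; it can at least be sanity-checked by brute force on an initial segment of N:

```python
# Check that  exists z (z != 0 and m + z = n)  agrees with m < n on 0..49.
# The witness z = n - m stays inside the segment whenever m, n do.

N = range(50)

def less_defined(m, n):
    return any(z != 0 and m + z == n for z in N)

assert all(less_defined(m, n) == (m < n) for m in N for n in N)
print("formula agrees with < on 0..49")
```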
(6) Let the symbols of L be a binary relation symbol < and a unary relation symbol U. Then there is an L-sentence σ such that for all X ⊆ R we have

  (R; <, X) ⊨ σ ⟺ X is finite.
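One way to obtain such a σ (this particular sentence is our choice; item (6) only asserts existence) is to say that X is bounded and has no accumulation point in R; by the Bolzano–Weierstrass theorem these two conditions together characterize the finite subsets among all X ⊆ R, and both are expressible using only < and U:

```latex
\sigma \;:=\; \exists a\,\exists b\,\forall u\,\bigl(Uu \to (a < u \wedge u < b)\bigr)
\;\wedge\;
\neg\,\exists y\,\forall a\,\forall b\,\bigl((a < y \wedge y < b) \to
    \exists u\,(Uu \wedge a < u \wedge u < b \wedge \neg\, u = y)\bigr)
```

The second conjunct says no real y has points of X other than y in every interval around it, i.e. X has no accumulation point.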
(7) Let 𝒜 ⊆ ℬ. Then we consider L_A to be a sublanguage of L_B in such a way that each a ∈ A has the same name in L_A as in L_B. This convention is in force throughout these notes.
(a) For each variable-free L_A-term t we have t^𝒜 = t^ℬ.
(b) If the L_A-sentence σ is quantifier-free, then 𝒜 ⊨ σ ⟺ ℬ ⊨ σ.
(c) If σ is an existential L_A-sentence, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ.
(d) If σ is a universal L_A-sentence, then ℬ ⊨ σ ⟹ 𝒜 ⊨ σ.
(8) Suppose h : 𝒜 → ℬ is a homomorphism of L-structures. For each L_A-term t, let t^h be the L_B-term obtained from t by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Similarly, for each L_A-formula φ, let φ^h be the L_B-formula obtained from φ by replacing each occurrence of a name a of an element a ∈ A by the name ha of the corresponding element ha ∈ B. Note that if σ is a sentence, so is σ^h. Then:
(a) if t is a variable-free L_A-term, then h(t^𝒜) = (t^h)^ℬ;
(b) if σ is a positive L_A-sentence without ∀-symbol, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ^h;
(c) if σ is a positive L_A-sentence and h is surjective, then 𝒜 ⊨ σ ⟹ ℬ ⊨ σ^h;
(d) if σ is an L_A-sentence and h is an isomorphism, then 𝒜 ⊨ σ ⟺ ℬ ⊨ σ^h.
In particular, isomorphic L-structures satisfy exactly the same L-sentences.
2.6 Models

In the rest of this chapter L is a language, 𝒜 is an L-structure (with underlying set A), and, unless indicated otherwise, t is an L-term, φ, ψ, and θ are L-formulas, σ is an L-sentence, and Σ is a set of L-sentences. We drop the prefix "L-" in "L-term" and "L-formula" and so on, unless this would cause confusion.

Definition. We say that 𝒜 is a model of Σ or Σ holds in 𝒜 (denoted 𝒜 ⊨ Σ) if 𝒜 ⊨ σ for each σ ∈ Σ.

To discuss examples it is convenient to introduce some notation. Suppose L contains (at least) the constant symbol 0 and the binary function symbol +. Given any terms t₁, …, tₙ we define the term t₁ + ⋯ + tₙ inductively as follows: it is the term 0 if n = 0, the term t₁ if n = 1, and the term (t₁ + ⋯ + tₙ₋₁) + tₙ for n > 1. We write nt for the term t + ⋯ + t with n summands; in particular, 0t and 1t denote the terms 0 and t respectively. Suppose L contains the constant symbol 1 and the binary function symbol · (the multiplication sign). Then we have similar notational conventions for t₁ · ⋯ · tₙ and t^n; in particular, for n = 0 both stand for the term 1, and t^1 is just t.
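The inductive definition of t₁ + ⋯ + tₙ can be mirrored on display strings. The Python sketch below is ours (terms are plain strings here, not the official words on the alphabet of L):

```python
def sum_term(ts):
    """The term t1 + ... + tn, built exactly as in the inductive definition."""
    if not ts:
        return "0"                 # n = 0: the term 0
    if len(ts) == 1:
        return ts[0]               # n = 1: the term t1
    return "(" + sum_term(ts[:-1]) + " + " + ts[-1] + ")"   # n > 1

def n_t(n, t):                     # the term nt, with n summands
    return sum_term([t] * n)

print(n_t(0, "x"))  # 0
print(n_t(1, "x"))  # x
print(n_t(3, "x"))  # ((x + x) + x)
```

Note how the parentheses associate to the left, matching (t₁ + ⋯ + tₙ₋₁) + tₙ.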
Examples. Fix three distinct variables x, y, z.

(1) Groups are the L_Gr-structures that are models of

  Gr := {∀x(x · 1 = x ∧ 1 · x = x), ∀x(x · x⁻¹ = 1 ∧ x⁻¹ · x = 1), ∀x∀y∀z((x · y) · z = x · (y · z))}.

(2) Abelian groups are the L_Ab-structures that are models of

  Ab := {∀x(x + 0 = x), ∀x(x + (−x) = 0), ∀x∀y(x + y = y + x), ∀x∀y∀z((x + y) + z = x + (y + z))}.

(3) Torsion-free abelian groups are the L_Ab-structures that are models of

  Ab ∪ {∀x(nx = 0 → x = 0) : n = 1, 2, 3, …}.

(4) Rings are the L_Ri-structures that are models of

  Ri := Ab ∪ {∀x∀y∀z((x · y) · z = x · (y · z)), ∀x(x · 1 = x ∧ 1 · x = x), ∀x∀y∀z(x · (y + z) = x·y + x·z ∧ (x + y) · z = x·z + y·z)}.
Each of items 2–8 is a scheme describing infinitely many axioms. Note that this list is the same as the list in Section 2.2, except that instead of propositions p, q, r we have formulas φ, ψ, θ.
The logical axioms of L are the propositional axioms of L and the equality and quantifier axioms of L as defined below.
Definition. The equality axioms of L are the following formulas:
(i) x = x,
(ii) x = y → y = x,
(iii) (x = y ∧ y = z) → x = z,
(iv) (x₁ = y₁ ∧ ⋯ ∧ xₘ = yₘ ∧ Rx₁ … xₘ) → Ry₁ … yₘ,
(v) (x₁ = y₁ ∧ ⋯ ∧ xₙ = yₙ) → Fx₁ … xₙ = Fy₁ … yₙ,
with the following restrictions on the variables and symbols of L: x, y, z are distinct in (ii) and (iii); in (iv), x₁, …, xₘ, y₁, …, yₘ are distinct and R ∈ L^r is m-ary; in (v), x₁, …, xₙ, y₁, …, yₙ are distinct, and F ∈ L^f is n-ary. Note that (i) represents an axiom scheme rather than a single axiom, since different variables x give different formulas x = x. Likewise with (ii)–(v).
Let x and y be distinct variables, and let φ(y) be the formula ∃x(x ≠ y). Then φ(y) is valid in all 𝒜 with |A| > 1, but φ(x/y) is invalid in all 𝒜. Thus substituting x for the free occurrences of y does not always preserve validity. To get rid of this anomaly, we introduce the following restriction on substitutions of a term t for free occurrences of y.

Definition. We say that t is free for y in φ if no variable in t can become bound upon replacing the free occurrences of y in φ by t; more precisely: whenever x is a variable in t, then there are no occurrences of subformulas in φ of the form ∃xψ or ∀xψ that contain an occurrence of y that is free in φ.

Note that if t is variable-free, then t is free for y in φ. We remark that "free for" abbreviates "free to be substituted for". In exercise 2 the reader is asked to show that, with this restriction, substitution of a term for the free occurrences of a variable does preserve validity.
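The definition can be turned into a small recursive check. This Python sketch uses a toy tuple encoding of formulas (our own, not the notes'), with a term given by its set of variables:

```python
def free_in(y, phi):
    """Does y occur free in phi?"""
    op = phi[0]
    if op == "atom":                       # ("atom", v1, v2, ...)
        return y in phi[1:]
    if op == "not":
        return free_in(y, phi[1])
    if op in ("or", "and"):
        return free_in(y, phi[1]) or free_in(y, phi[2])
    return phi[1] != y and free_in(y, phi[2])   # ("exists"/"forall", x, body)

def free_for(t_vars, y, phi):
    """No free occurrence of y in phi lies under a quantifier on a variable of t."""
    op = phi[0]
    if op == "atom":
        return True
    if op == "not":
        return free_for(t_vars, y, phi[1])
    if op in ("or", "and"):
        return free_for(t_vars, y, phi[1]) and free_for(t_vars, y, phi[2])
    x, body = phi[1], phi[2]
    if x == y:                  # y is not free below this quantifier
        return True
    if x in t_vars and free_in(y, body):
        return False            # a variable of t would be captured
    return free_for(t_vars, y, body)

# The anomaly above: phi(y) = exists x (not x = y).
phi = ("exists", "x", ("not", ("atom", "x", "y")))
print(free_for({"x"}, "y", phi))  # False: x would be captured
print(free_for({"z"}, "y", phi))  # True: z stays free
```

As the definition predicts, a variable-free term (t_vars empty) is always free for y.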
Definition. The quantifier axioms of L are the formulas φ(t/y) → ∃yφ and ∀yφ → φ(t/y), where t is free for y in φ.
These axioms have been chosen to have the following property.
Proposition 2.7.1. The logical axioms of L are valid in every L-structure.
We first prove this for the propositional axioms of L. Let A₁, …, Aₙ be distinct propositional atoms not in L. Let p = p(A₁, …, Aₙ) ∈ Prop(A₁, …, Aₙ). Let φ₁, …, φₙ be formulas and let p(φ₁, …, φₙ) be the word obtained by replacing each occurrence of Aᵢ in p by φᵢ, for i = 1, …, n. One checks easily that p(φ₁, …, φₙ) is a formula.
Lemma 2.7.2. Suppose φᵢ = φᵢ(x₁, …, xₘ) for 1 ≤ i ≤ n, and let a₁, …, aₘ ∈ A. Define a truth assignment t : {A₁, …, Aₙ} → {0, 1} by t(Aᵢ) = 1 iff 𝒜 ⊨ φᵢ(a₁, …, aₘ). Then p(φ₁, …, φₙ) is an L-formula and

  p(φ₁, …, φₙ)(a₁/x₁, …, aₘ/xₘ) = p(φ₁(a₁, …, aₘ), …, φₙ(a₁, …, aₘ)),

  t(p) = 1 ⟺ 𝒜 ⊨ p(φ₁(a₁, …, aₘ), …, φₙ(a₁, …, aₘ)).

In particular, if p is a tautology, then 𝒜 ⊨ p(φ₁, …, φₙ).
Proof. Easy induction on p. We leave the details to the reader.
Definition. An L-tautology is a formula of the form p(φ₁, …, φₙ) for some tautology p(A₁, …, Aₙ) ∈ Prop(A₁, …, Aₙ) and some formulas φ₁, …, φₙ.
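Whether a given p is a tautology can be decided by running through all truth assignments, so L-tautologies are recognizable as substitution instances of truth-table-valid propositions. A Python sketch, with propositions modeled as Boolean functions (this encoding is ours):

```python
from itertools import product

def is_tautology(p, n):
    """Check p on all 2^n truth assignments to its n atoms."""
    return all(p(*vals) for vals in product([False, True], repeat=n))

imp = lambda a, b: (not a) or b     # material implication

# p -> (q -> p) is a standard tautology; substituting any L-formulas
# for p and q then yields an L-tautology.
print(is_tautology(lambda p, q: imp(p, imp(q, p)), 2))  # True
print(is_tautology(lambda p, q: imp(p, q), 2))          # False
```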
By Lemma 2.7.2 all L-tautologies are valid in all L-structures. The propositional axioms of L are L-tautologies, so all propositional axioms of L are valid in all L-structures. It is easy to check that all equality axioms of L are valid in all L-structures. In exercise 3 below the reader is asked to show that all quantifier axioms of L are valid in all L-structures. This finishes the proof of Proposition 2.7.1.
Next we introduce rules for deriving new formulas from given formulas.
Definition. The logical rules of L are the following:
(i) Modus Ponens (MP): From φ and φ → ψ, infer ψ.
(ii) Generalization Rule (G): If the variable x does not occur free in φ, then
  (a) from φ → ψ, infer φ → ∀xψ;
  (b) from ψ → φ, infer ∃xψ → φ.

A key property of the logical rules is that their application preserves validity. Here is a more precise statement of this fact, to be verified by the reader.
(i) If 𝒜 ⊨ φ and 𝒜 ⊨ φ → ψ, then 𝒜 ⊨ ψ.
(ii) Suppose x does not occur free in φ. Then
  (a) if 𝒜 ⊨ φ → ψ, then 𝒜 ⊨ φ → ∀xψ;
  (b) if 𝒜 ⊨ ψ → φ, then 𝒜 ⊨ ∃xψ → φ.
Definition. A formal proof, or just proof, of φ from Σ is a sequence φ₁, …, φₙ of formulas with n ≥ 1 and φₙ = φ, such that for k = 1, …, n:
(i) either φₖ ∈ Σ,
(ii) or φₖ is a logical axiom,
(iii) or there are i, j ∈ {1, …, k − 1} such that φₖ can be inferred from φᵢ and φⱼ by MP, or from φᵢ by G.

If there exists a proof of φ from Σ, then we write Σ ⊢ φ and say Σ proves φ.
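Clauses (i)–(iii) make checking a purported proof a purely mechanical matter. The sketch below is our own toy illustration: it covers membership in Σ, axiomhood, and MP, and omits rule G for brevity; formulas are nested tuples with implication written ("imp", φ, ψ).

```python
def is_proof(steps, sigma, axioms):
    """Check each step is in sigma, an axiom, or follows by MP from earlier steps."""
    for k, phi in enumerate(steps):
        if phi in sigma or phi in axioms:
            continue
        earlier = steps[:k]
        # MP: look for an earlier psi together with an earlier psi -> phi
        if any(("imp", psi, phi) in earlier for psi in earlier):
            continue
        return False
    return True

p, q = "p", "q"
sigma = {p, ("imp", p, q)}
print(is_proof([p, ("imp", p, q), q], sigma, set()))  # True
print(is_proof([q], sigma, set()))                    # False
```

Decidability of this checking relation is what later makes Gödel numbering of proofs possible.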
Proposition 2.7.3. If Σ ⊢ φ, then Σ ⊨ φ.

This follows easily from earlier facts that we stated and which the reader was asked to verify. The converse is more interesting, and due to Gödel (1930):

Theorem 2.7.4 (Completeness Theorem of Predicate Logic). Σ ⊢ φ ⟺ Σ ⊨ φ.

Remark. Our choice of proof system, and thus our notion of formal proof, is somewhat arbitrary. However, the equivalence of ⊢ and ⊨ (Completeness Theorem) justifies our choice of logical axioms and rules, and shows in particular that no further logical axioms and rules are needed. Moreover, this equivalence has consequences that can be stated in terms of ⊨ alone. An example is the important Compactness Theorem.

Theorem 2.7.5 (Compactness Theorem). If Σ ⊨ σ then there is a finite subset Σ₀ of Σ such that Σ₀ ⊨ σ.
The Compactness Theorem has many consequences. Here is one.
Corollary 2.7.6. Suppose σ is an L_Ri-sentence that holds in all fields of characteristic 0. Then there exists a natural number N such that σ is true in all fields of characteristic p > N.

Proof. By assumption,

  Fl(0) = Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, …} ⊨ σ.

Then by Compactness there is N ∈ N such that

  Fl ∪ {n1 ≠ 0 : n = 2, 3, 5, …, n ≤ N} ⊨ σ.

It follows that σ is true in all fields of characteristic p > N.

The converse of this corollary fails; see exercise 8 below. Note that Fl(0) is infinite. Could there be an alternative finite set of axioms whose models are exactly the fields of characteristic 0?
Corollary 2.7.7. There is no finite set of L_Ri-sentences whose models are exactly the fields of characteristic 0.

Proof. Suppose there is such a finite set of sentences {σ₁, …, σ_N}. Let σ := σ₁ ∧ ⋯ ∧ σ_N. Then the models of σ are just the fields of characteristic 0. By the previous result σ holds in some field of characteristic p > 0. Contradiction!
Exercises. All but the last one to be done without using Theorem 2.7.4 or
2.7.5. Actually, Exercise (4) will be used in proving Theorem 2.7.4.
(1) Let L = {R} where R is a binary relation symbol, and let 𝒜 = (A; R) be a finite L-structure (i.e. the set A is finite). Then there exists an L-sentence σ such that the models of σ are exactly the L-structures isomorphic to 𝒜. (In fact, for an arbitrary language L, two finite L-structures are isomorphic iff they satisfy the same L-sentences.)
(2) If t is free for y in φ and φ is valid in 𝒜, then φ(t/y) is valid in 𝒜.
(3) Suppose t is free for y in φ = φ(x₁, …, xₙ, y). Then:
(i) Each 𝒜-instance of the quantifier axiom φ(t/y) → ∃yφ has the form

  φ(a₁, …, aₙ, τ) → ∃yφ(a₁, …, aₙ, y)

with a₁, …, aₙ ∈ A and τ a variable-free L_A-term.
(ii) The quantifier axiom φ(t/y) → ∃yφ is valid in 𝒜. (Hint: use Lemma 2.6.2.)
(iii) The quantifier axiom ∀yφ → φ(t/y) is valid in 𝒜.
(4) If φ is an L-tautology, then ⊢ φ.
(5) If Σ ⊢ φᵢ for i = 1, …, n, then Σ ⊢ φ₁ ∧ ⋯ ∧ φₙ.
(6) If Σ ⊢ φ → ψ and Σ ⊢ ψ → φ, then Σ ⊢ φ ↔ ψ.
(7) ⊢ ¬∀xφ ↔ ∃x¬φ and ⊢ ¬∃xφ ↔ ∀x¬φ.
(8) Indicate an L_Ri-sentence that is true in the field of real numbers, but false in all fields of positive characteristic.
(9) Let σ be an L_Ab-sentence which holds in all non-trivial torsion-free abelian groups. Then there exists N ∈ N such that σ is true in all groups Z/pZ where p is a prime number and p > N.
Chapter 3

The Completeness Theorem

The main aim of this chapter is to prove the Completeness Theorem. As a byproduct we also derive some more elementary facts about predicate logic. The last section contains some of the basics of universal algebra, which we can treat here rather efficiently using our construction of a so-called term model in the proof of the Completeness Theorem.

Conventions on the use of L, 𝒜, t, φ, ψ, θ, σ, and Σ are as in the beginning of Section 2.6.
3.1 Another Form of Completeness

It is convenient to prove first a variant of the Completeness Theorem.

Definition. We say that Σ is consistent if Σ ⊬ ⊥, and otherwise (that is, if Σ ⊢ ⊥) we call Σ inconsistent.

Theorem 3.1.1 (Completeness Theorem - second form). Σ is consistent if and only if Σ has a model.

We first show that this second form of the Completeness Theorem implies the first form. This will be done through a series of technical lemmas, which are also useful later in this chapter.

Lemma 3.1.2. Suppose Σ ⊢ φ. Then Σ ⊢ ∀xφ.

Proof. From Σ ⊢ φ and the L-tautology φ → (⊤ → φ) we obtain Σ ⊢ ⊤ → φ by MP. Then by G we have Σ ⊢ ⊤ → ∀xφ. Using the L-tautology (⊤ → ∀xφ) → ∀xφ and MP we get Σ ⊢ ∀xφ.
Lemma 3.1.3 (Deduction Lemma). Suppose Σ ∪ {σ} ⊢ φ. Then Σ ⊢ σ → φ.

Proof. By induction on the length of a proof of φ from Σ ∪ {σ}. The cases where φ is a logical axiom, or φ ∈ Σ ∪ {σ}, or φ is obtained by MP are treated just as in the proof of the Deduction Lemma of Propositional Logic.
Suppose that φ is obtained by part (a) of G, so φ is φ₁ → ∀xψ where x does not occur free in φ₁ and Σ ∪ {σ} ⊢ φ₁ → ψ, and where we assume inductively that Σ ⊢ σ → (φ₁ → ψ). We have to argue that then Σ ⊢ σ → (φ₁ → ∀xψ). From the L-tautology

  (σ → (φ₁ → ψ)) → ((σ ∧ φ₁) → ψ)

and MP we get Σ ⊢ (σ ∧ φ₁) → ψ. Since x does not occur free in σ ∧ φ₁ this gives Σ ⊢ (σ ∧ φ₁) → ∀xψ, by G. Using the L-tautology

  ((σ ∧ φ₁) → ∀xψ) → (σ → (φ₁ → ∀xψ))

and MP this gives Σ ⊢ σ → (φ₁ → ∀xψ).
The case that φ is obtained by part (b) of G is left to the reader.
Corollary 3.1.4. Suppose Σ ∪ {σ₁, …, σₙ} ⊢ φ. Then Σ ⊢ (σ₁ ∧ ⋯ ∧ σₙ) → φ.

We leave the proof as an exercise.

Corollary 3.1.5. Σ ⊢ σ if and only if Σ ∪ {¬σ} is inconsistent.

The proof is just like that of the corresponding fact of Propositional Logic.

Lemma 3.1.6. Σ ⊢ ∀yφ if and only if Σ ⊢ φ.

Proof. (⇐) This is Lemma 3.1.2. For (⇒), assume Σ ⊢ ∀yφ. We have the quantifier axiom ∀yφ → φ (the case t = y), so by MP we get Σ ⊢ φ.

Corollary 3.1.7. Σ ⊢ ∀y₁ … ∀yₙφ if and only if Σ ⊢ φ.
Corollary 3.1.8. The second form of the Completeness Theorem implies the first form, Theorem 2.7.4.

Proof. Assume the second form of the Completeness Theorem holds, and that Σ ⊨ φ. It suffices to show that then Σ ⊢ φ. From Σ ⊨ φ we obtain Σ ⊨ ∀y₁ … ∀yₙφ where φ = φ(y₁, …, yₙ), and so Σ ∪ {¬σ} has no model, where σ is the sentence ∀y₁ … ∀yₙφ. But then by the 2nd form of the Completeness Theorem Σ ∪ {¬σ} is inconsistent. Then by Corollary 3.1.5 we have Σ ⊢ σ and thus by Corollary 3.1.7 we get Σ ⊢ φ.
We finish this section with another form of the Compactness Theorem:

Theorem 3.1.9 (Compactness Theorem - second form). If each finite subset of Σ has a model, then Σ has a model.
This follows from the second form of the Completeness Theorem.
3.2 Proof of the Completeness Theorem

We are now going to prove Theorem 3.1.1. Since (⇐) is clear, we focus our attention on (⇒), that is, given a consistent set Σ of sentences we must show that Σ has a model. This job will be done in a series of lemmas. Unless we say so, we do not assume in those lemmas that Σ is consistent.

Lemma 3.2.1. Suppose Σ ⊢ φ and t is free for x in φ. Then Σ ⊢ φ(t/x).

Proof. From Σ ⊢ φ we get Σ ⊢ ∀xφ by Lemma 3.1.2. Then MP together with the quantifier axiom ∀xφ → φ(t/x) gives Σ ⊢ φ(t/x) as required.
Lemma 3.2.2. Suppose Σ ⊢ φ, and t₁, …, tₙ are terms whose variables do not occur bound in φ. Then Σ ⊢ φ(t₁/x₁, …, tₙ/xₙ).

Proof. Take distinct variables y₁, …, yₙ that do not occur in φ or t₁, …, tₙ and that are distinct from x₁, …, xₙ. Use Lemma 3.2.1 n times in succession to obtain Σ ⊢ ψ where ψ = φ(y₁/x₁, …, yₙ/xₙ). Apply Lemma 3.2.1 again n times to get Σ ⊢ ψ(t₁/y₁, …, tₙ/yₙ). To finish, observe that ψ(t₁/y₁, …, tₙ/yₙ) = φ(t₁/x₁, …, tₙ/xₙ).
Lemma 3.2.3.
(1) For each L-term t we have Σ ⊢ t = t.
(2) Let t, t′ be L-terms and Σ ⊢ t = t′. Then Σ ⊢ t′ = t.
(3) Let t₁, t₂, t₃ be L-terms and Σ ⊢ t₁ = t₂ and Σ ⊢ t₂ = t₃. Then Σ ⊢ t₁ = t₃.
(4) Let R ∈ L^r be m-ary and let t₁, t₁′, …, tₘ, tₘ′ be L-terms such that Σ ⊢ tᵢ = tᵢ′ for i = 1, …, m and Σ ⊢ Rt₁ … tₘ. Then Σ ⊢ Rt₁′ … tₘ′.
(5) Let F ∈ L^f be n-ary, and let t₁, t₁′, …, tₙ, tₙ′ be L-terms such that Σ ⊢ tᵢ = tᵢ′ for i = 1, …, n. Then Σ ⊢ Ft₁ … tₙ = Ft₁′ … tₙ′.

Proof. For (1), use the equality axiom x = x and apply Lemma 3.2.2. For (2), take an equality axiom x = y → y = x and apply Lemma 3.2.2 to get Σ ⊢ t = t′ → t′ = t. Then MP gives Σ ⊢ t′ = t.
For (3), take an equality axiom (x = y ∧ y = z) → x = z and apply Lemma 3.2.2 to get Σ ⊢ (t₁ = t₂ ∧ t₂ = t₃) → t₁ = t₃. By the assumptions and Exercise 5 in Section 2.7 we have Σ ⊢ t₁ = t₂ ∧ t₂ = t₃. Then MP yields Σ ⊢ t₁ = t₃.
To prove (4) we first use the same Exercise 5 to get

  Σ ⊢ t₁ = t₁′ ∧ ⋯ ∧ tₘ = tₘ′ ∧ Rt₁ … tₘ.

Take an equality axiom (x₁ = y₁ ∧ ⋯ ∧ xₘ = yₘ ∧ Rx₁ … xₘ) → Ry₁ … yₘ and apply Lemma 3.2.2 to obtain

  Σ ⊢ (t₁ = t₁′ ∧ ⋯ ∧ tₘ = tₘ′ ∧ Rt₁ … tₘ) → Rt₁′ … tₘ′.

Then MP gives Σ ⊢ Rt₁′ … tₘ′. Part (5) is obtained in a similar way by using an equality axiom (x₁ = y₁ ∧ ⋯ ∧ xₙ = yₙ) → Fx₁ … xₙ = Fy₁ … yₙ.
= t. Then MP gives t
= t.
For (3), take an equality axiom (x = y y = z) (x = z) and apply Lemma
3.2.2 to get (t
1
= t
2
t
2
= t
3
) t
1
= t
3
. By the assumptions and Exercise 5
in Section 2.7 we have t
1
= t
2
t
2
= t
3
. Then MP yields t
1
= t
3
.
To prove (4) we rst use the same Exercise 5 to get
t
1
= t
1
. . . t
m
= t
m
Rt
1
. . . t
m
.
Take an equality axiom x
1
= y
1
. . . x
m
= y
m
Rx
1
. . . x
m
Ry
1
. . . y
m
and
apply Lemma 3.2.2 to obtain
t
1
= t
1
. . . t
m
= t
m
Rt
1
. . . t
m
Rt
1
. . . t
m
.
Then MP gives Rt
1
. . . t
m
. Part (5) is obtained in a similar way by using
an equality axiom x
1
= y
1
. . . x
n
= y
n
Fx
1
. . . x
n
= Fy
1
. . . y
n
.
Denition. Let T
L
be the set of variable-free L-terms. We dene a binary
relation
on T
L
by
t
1
t
2
t
1
= t
2
.
Parts (1), (2) and (3) of the last lemma yield the following.
Lemma 3.2.4. The relation
is an equivalence relation on T
L
.
Definition. Suppose L has at least one constant symbol. Then T_L is non-empty. We define the L-structure 𝒜_Σ as follows:
(i) Its underlying set is A_Σ := T_L/∼_Σ.
(ii) If R ∈ L^r is m-ary, then R^{𝒜_Σ} ⊆ (A_Σ)^m is given by

  ([t₁], …, [tₘ]) ∈ R^{𝒜_Σ} :⟺ Σ ⊢ Rt₁ … tₘ  (t₁, …, tₘ ∈ T_L).

(iii) If F ∈ L^f is n-ary, then F^{𝒜_Σ} : (A_Σ)^n → A_Σ is given by

  F^{𝒜_Σ}([t₁], …, [tₙ]) = [Ft₁ … tₙ]  (t₁, …, tₙ ∈ T_L).

Remark. The reader should verify that this counts as a definition, i.e., does not introduce ambiguity. (Use parts (4) and (5) of Lemma 3.2.3.)
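The passage from T_L to T_L/∼_Σ can be made concrete. In the toy Python sketch below, the signature (constants a, b, c and a unary f), the height cutoff that keeps T_L finite, and the decision procedure for Σ ⊢ s = t (congruence closure of the single equation a = b, implemented by rewriting) are all our own simplifying assumptions:

```python
# Ground terms of height <= 2 over constants a, b, c and unary f,
# written as strings: "fa" means f(a), "ffa" means f(f(a)).
terms = ["a", "b", "c", "fa", "fb", "fc", "ffa", "ffb", "ffc"]

def provably_equal(s, t):
    # For Sigma containing just a = b, two ground terms are provably equal
    # iff they agree after rewriting every b to a (Lemma 3.2.3 closure).
    return s.replace("b", "a") == t.replace("b", "a")

# Group the terms into equivalence classes of the relation above.
classes = []
for t in terms:
    for cl in classes:
        if provably_equal(t, cl[0]):
            cl.append(t)
            break
    else:
        classes.append([t])

print(classes)
# [['a', 'b'], ['c'], ['fa', 'fb'], ['fc'], ['ffa', 'ffb'], ['ffc']]
```

Note that [fa] = [fb] because [a] = [b]: this is the well-definedness of F^{𝒜_Σ} that the remark asks the reader to verify, visible in miniature.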
Corollary 3.2.5. Suppose L has a constant symbol, and Σ is consistent. Then
(1) for each t ∈ T_L we have t^{𝒜_Σ} = [t];
(2) for each atomic σ we have: 𝒜_Σ ⊨ σ ⟺ Σ ⊢ σ.

Proof. Part (1) follows by an easy induction. Let σ be Rt₁ … tₘ where R ∈ L^r is m-ary and t₁, …, tₘ ∈ T_L. Then

  Σ ⊢ Rt₁ … tₘ ⟺ ([t₁], …, [tₘ]) ∈ R^{𝒜_Σ} ⟺ 𝒜_Σ ⊨ Rt₁ … tₘ,

where the last ⟺ follows from the definition of ⊨ together with part (1). Now suppose that σ is t₁ = t₂ where t₁, t₂ ∈ T_L. Then

  Σ ⊢ t₁ = t₂ ⟺ [t₁] = [t₂] ⟺ t₁^{𝒜_Σ} = t₂^{𝒜_Σ} ⟺ 𝒜_Σ ⊨ t₁ = t₂.

We also have 𝒜_Σ ⊨ ⊤ and Σ ⊢ ⊤, while 𝒜_Σ ⊭ ⊥ and Σ ⊬ ⊥ since Σ is consistent.
If the equivalence in part (2) of this corollary holds for all σ (not only for atomic σ), then in particular 𝒜_Σ ⊨ Σ, that is, 𝒜_Σ is a model of Σ.

Lemma 3.2.6. Suppose Σ is consistent. Then Σ ⊆ Σ′ for some complete set of L-sentences Σ′.

The proof uses Zorn's Lemma (see Section 1.2), and is just like that of the corresponding fact of Propositional Logic.

Completeness of Σ does not guarantee that the equivalence of part (2) of Corollary 3.2.5 holds for all σ. Completeness is only a necessary condition for this equivalence to hold for all σ; another necessary condition is to have witnesses:
Definition. A Σ-witness for the sentence ∃xφ(x) is a term t ∈ T_L such that Σ ⊢ φ(t). We say that Σ has witnesses if there is a Σ-witness for every sentence ∃xφ(x) proved by Σ.

Theorem 3.2.7. Let L have a constant symbol, and suppose Σ is consistent. Then the following two conditions are equivalent:
(i) For each σ we have: 𝒜_Σ ⊨ σ ⟺ Σ ⊢ σ.
(ii) Σ is complete and has witnesses.
In particular, if Σ is complete and has witnesses, then 𝒜_Σ is a model of Σ.
Proof. It should be clear that (i) implies (ii). For the converse, assume (ii). We use induction on the number of logical symbols in σ to obtain (i). We already know that (i) holds for atomic sentences σ. The cases σ = ¬σ₁, σ = σ₁ ∨ σ₂, and σ = σ₁ ∧ σ₂ are treated just as in the proof of the corresponding Lemma 2.2.11 for Propositional Logic. It remains to consider two cases:

Case σ = ∃xφ(x):
(⇒) Suppose that Σ ⊢ σ. Because we are assuming that Σ has witnesses we have a t ∈ T_L such that Σ ⊢ φ(t). Then by the inductive hypothesis 𝒜_Σ ⊨ φ(t). So by Lemma 2.6.2 we have an a ∈ A_Σ such that 𝒜_Σ ⊨ φ(a). Therefore 𝒜_Σ ⊨ ∃xφ(x), hence 𝒜_Σ ⊨ σ.
(⇐) Assume 𝒜_Σ ⊨ σ. Then there is an a ∈ A_Σ such that 𝒜_Σ ⊨ φ(a). Choose t ∈ T_L such that [t] = a. Then t^{𝒜_Σ} = a, hence 𝒜_Σ ⊨ φ(t). By the inductive hypothesis Σ ⊢ φ(t), and then the quantifier axiom φ(t/x) → ∃xφ(x) together with MP gives Σ ⊢ σ. The case σ = ∀xφ(x) is treated similarly, using the completeness of Σ.

Lemma 3.2.10. Suppose Σ is consistent, and let σ₁ = ∃x₁φ₁(x₁), …, σₙ = ∃xₙφₙ(xₙ) be L-sentences such that Σ ⊢ σᵢ for every i = 1, …, n. Let c₁, …, cₙ be distinct constant symbols not in L. Put L′ := L ∪ {c₁, …, cₙ} and Σ′ := Σ ∪ {φ₁(c₁), …, φₙ(cₙ)}. Then Σ′ is a consistent set of L′-sentences.

Proof. The previous lemma covers the case n = 1. The general case follows by induction on n.
In the next lemma we use a superscript w for "witness".

Lemma 3.2.11. Suppose Σ is consistent. For each L-sentence σ = ∃xφ(x) such that Σ ⊢ σ, let c_σ be a constant symbol not in L, chosen so that c_σ ≠ c_τ whenever τ is a different L-sentence of the form ∃x′φ′(x′) with Σ ⊢ τ. Put

  L^w := L ∪ {c_σ : σ as above},  Σ^w := Σ ∪ {φ(c_σ) : σ = ∃xφ(x) as above}.

Then Σ^w is a consistent set of L^w-sentences.

Proof. Suppose not. Then there is a proof of ⊥ from Σ^w, and such a proof involves only finitely many of the new constants. So let c₁, …, cₙ be constant symbols in L^w ∖ L such that this proof is a proof of ⊥ in the language L ∪ {c₁, …, cₙ} from Σ ∪ {φ₁(c₁), …, φₙ(cₙ)}, where σᵢ = ∃xᵢφᵢ(xᵢ) for 1 ≤ i ≤ n. So Σ ∪ {φ₁(c₁), …, φₙ(cₙ)} is an inconsistent set of L ∪ {c₁, …, cₙ}-sentences. This contradicts Lemma 3.2.10.
Lemma 3.2.12. Let (Lₙ) be an increasing sequence of languages: L₀ ⊆ L₁ ⊆ L₂ ⊆ ⋯, and denote their union by L_∞. Let Σₙ be a consistent set of Lₙ-sentences, for each n, such that Σ₀ ⊆ Σ₁ ⊆ Σ₂ ⊆ ⋯. Then the union Σ_∞ := ⋃ₙ Σₙ is a consistent set of L_∞-sentences.
Proof. Suppose Σ_∞ is inconsistent, that is, Σ_∞ ⊢ ⊥. Since a formal proof uses only finitely many sentences, already Σₙ ⊢ ⊥ for some n, contradicting the consistency of Σₙ.

Definition. Let L ⊆ L′, let 𝒜 be an L-structure and 𝒜′ an L′-structure. We say that 𝒜 is a reduct of 𝒜′ (and 𝒜′ an expansion of 𝒜) if 𝒜 and 𝒜′ have the same underlying set and the same interpretations of the symbols of L. For example, (N; 0, +) is a reduct of (N; <, 0, 1, +, ·). Note that any L′-structure 𝒜′ has a unique reduct to an L-structure, which we denote by 𝒜′|_L. A key fact (to be verified by the reader) is that if 𝒜 is a reduct of 𝒜′, then t^𝒜 = t^{𝒜′} for all variable-free L_A-terms t, and 𝒜 ⊨ σ ⟺ 𝒜′ ⊨ σ for all L_A-sentences σ.
We can now prove Theorem 3.1.1.

Proof. Let Σ be a consistent set of L-sentences. We construct a sequence (Lₙ) of languages and a sequence (Σₙ) where each Σₙ is a consistent set of Lₙ-sentences. We begin by setting L₀ = L and Σ₀ = Σ. Given the language Lₙ and the consistent set of Lₙ-sentences Σₙ, put

  Lₙ₊₁ := Lₙ if n is even, and Lₙ₊₁ := Lₙ^w if n is odd;

choose a complete set of Lₙ-sentences Σₙ′ ⊇ Σₙ, and put

  Σₙ₊₁ := Σₙ′ if n is even, and Σₙ₊₁ := Σₙ^w if n is odd.

Here Lₙ^w and Σₙ^w are obtained from Lₙ and Σₙ in the same way that L^w and Σ^w are obtained from L and Σ in Lemma 3.2.11. Note that Lₙ ⊆ Lₙ₊₁ and Σₙ ⊆ Σₙ₊₁.

By the previous lemma the set Σ_∞ := ⋃ₙ Σₙ of L_∞-sentences is consistent, where L_∞ := ⋃ₙ Lₙ.

We claim that Σ_∞ is complete and has witnesses. Any L_∞-sentence σ is an Lₙ-sentence for some even n; since Σₙ₊₁ is a complete set of Lₙ-sentences, Σ_∞ ⊢ σ or Σ_∞ ⊢ ¬σ. Next, let σ = ∃xφ(x) be an L_∞-sentence such that Σ_∞ ⊢ σ, and take an odd n such that σ is an Lₙ-sentence and Σₙ ⊢ σ. Then φ(c_σ) ∈ Σₙ₊₁, so Σ_∞ ⊢ φ(c_σ).

It follows from Theorem 3.2.7 that Σ_∞ has a model, namely 𝒜_{Σ_∞}. Put 𝒜 := 𝒜_{Σ_∞}|_L. Then 𝒜 ⊨ Σ. This concludes the proof of the Completeness Theorem (second form).
Exercises.
(1) Suppose Σ is consistent. Then Σ is complete if and only if every two models of Σ satisfy the same sentences.
(2) Let L have just a constant symbol c, a unary relation symbol U and a unary function symbol f, and suppose that Σ ⊢ Ufc, and that f does not occur in the sentences of Σ. Then Σ ⊢ ∃xUx.
3.3 Some Elementary Results of Predicate Logic
Here we obtain some generalities of predicate logic: Equivalence and Equality
Theorems, Variants, and Prenex Form. In some proofs we shall take advantage
of the fact that the Completeness Theorem is now available.
Lemma 3.3.1 (Distribution Rule).
(i) Suppose Σ ⊢ φ → ψ. Then Σ ⊢ ∃xφ → ∃xψ and Σ ⊢ ∀xφ → ∀xψ.
(ii) Suppose Σ ⊢ φ ↔ ψ. Then Σ ⊢ ∃xφ ↔ ∃xψ and Σ ⊢ ∀xφ ↔ ∀xψ.

Proof. We only do (i), since (ii) then follows easily. Let 𝒜 be a model of Σ. By the Completeness Theorem it suffices to show that then 𝒜 ⊨ ∃xφ → ∃xψ and 𝒜 ⊨ ∀xφ → ∀xψ. We shall prove 𝒜 ⊨ ∃xφ → ∃xψ and leave the other part to the reader. We have 𝒜 ⊨ φ → ψ. Choose variables y₁, …, yₙ such that φ = φ(x, y₁, …, yₙ) and ψ = ψ(x, y₁, …, yₙ). We need only show that then for all a₁, …, aₙ ∈ A

  𝒜 ⊨ ∃xφ(x, a₁, …, aₙ) → ∃xψ(x, a₁, …, aₙ).

Suppose 𝒜 ⊨ ∃xφ(x, a₁, …, aₙ). Then 𝒜 ⊨ φ(a₀, a₁, …, aₙ) for some a₀ ∈ A. From 𝒜 ⊨ φ → ψ we obtain 𝒜 ⊨ φ(a₀, …, aₙ) → ψ(a₀, …, aₙ), hence 𝒜 ⊨ ψ(a₀, …, aₙ), and thus 𝒜 ⊨ ∃xψ(x, a₁, …, aₙ).
Theorem 3.3.2 (Equivalence Theorem). Suppose Σ ⊢ φ ↔ φ′, and let ψ′ be obtained from ψ by replacing some occurrence of φ as a subformula of ψ by φ′. Then Σ ⊢ ψ ↔ ψ′.

Proof. By induction on the number of logical symbols in ψ. If ψ is atomic, then necessarily φ = ψ and ψ′ = φ′, where ψ′ is obtained by replacing that occurrence (of ψ) by φ′. Then Σ ⊢ ψ ↔ ψ′ is trivial. Suppose ψ ≠ φ. Then the occurrence of φ we are replacing is an occurrence inside a proper subformula of ψ, so the inductive hypothesis applies to that subformula; the desired Σ ⊢ ψ ↔ ψ′ then follows by considering the cases ψ = ¬ψ₁, ψ = ψ₁ ∨ ψ₂, ψ = ψ₁ ∧ ψ₂, ψ = ∃xψ₁ and ψ = ∀xψ₁, using suitable L-tautologies for the first three and the Distribution Rule for the last two.

Definition. Let (φᵢ)_{i∈I} be a family of formulas with finite index set I, and choose a bijection k ↦ i(k) : {1, …, n} → I. Put

  ⋁_{i∈I} φᵢ := φ_{i(1)} ∨ ⋯ ∨ φ_{i(n)},  ⋀_{i∈I} φᵢ := φ_{i(1)} ∧ ⋯ ∧ φ_{i(n)}.

If I is clear from context we just write ⋁ᵢ φᵢ and ⋀ᵢ φᵢ instead. Of course, these notations ⋁_{i∈I} φᵢ and ⋀_{i∈I} φᵢ can only be used when the particular choice of bijection of {1, …, n} with I does not matter; this is usually the case, because the equivalence class of φ_{i(1)} ∨ ⋯ ∨ φ_{i(n)} does not depend on this choice, and the same is true with ∧ instead of ∨.
Definition. A variant of a formula φ is obtained by successive replacements of the following type:
(i) replace an occurrence of a subformula ∃xψ by ∃y ψ(y/x);
(ii) replace an occurrence of a subformula ∀xψ by ∀y ψ(y/x);
where y is free for x in ψ and y does not occur free in ψ.

Lemma 3.3.3. A formula is equivalent to any of its variants.

Proof. By the Equivalence Theorem (3.3.2) it suffices to show ⊢ ∃xψ ↔ ∃y ψ(y/x) and ⊢ ∀xψ ↔ ∀y ψ(y/x), where y is free for x in ψ and does not occur free in ψ. We prove the first equivalence, leaving the second as an exercise. Applying G to the quantifier axiom ψ(y/x) → ∃xψ gives ⊢ ∃y ψ(y/x) → ∃xψ. Similarly we get ⊢ ∃xψ → ∃y ψ(y/x) (use that ψ = ψ(y/x)(x/y) by the assumption on y). An application of Exercise 6 of Section 2.7 finishes the proof.
Definition. A formula in prenex form is a formula Q₁x₁ … Qₙxₙ ψ where x₁, …, xₙ are distinct variables, each Qᵢ ∈ {∃, ∀}, and ψ is quantifier-free. We call Q₁x₁ … Qₙxₙ the prefix, and ψ the matrix of the formula. Note that a quantifier-free formula is in prenex form; this is the case n = 0.

We leave the proof of the next lemma as an exercise. Instead of "occurrence of ... as a subformula" we say "part ...". In this lemma Q denotes a quantifier, that is, Q ∈ {∃, ∀}, and Q* denotes the other quantifier: ∃* = ∀ and ∀* = ∃.
Lemma 3.3.4. The following prenex transformations always change a formula into an equivalent formula:
(1) replace the formula by one of its variants;
(2) replace a part ¬Qxφ by Q*x¬φ;
(3) replace a part (Qxφ) ∨ ψ by Qx(φ ∨ ψ), where x is not free in ψ;
(4) replace a part φ ∨ Qxψ by Qx(φ ∨ ψ), where x is not free in φ;
(5) replace a part (Qxφ) ∧ ψ by Qx(φ ∧ ψ), where x is not free in ψ;
(6) replace a part φ ∧ Qxψ by Qx(φ ∧ ψ), where x is not free in φ.

Remark. Note that the free variables of a formula (those that occur free in the formula) do not change under prenex transformations.

Theorem 3.3.5 (Prenex Form). Every formula can be changed into one in prenex form by a finite sequence of prenex transformations. In particular, each formula is equivalent to one in prenex form.
Proof. By induction on the number of logical symbols. Atomic formulas are already in prenex form. To simplify notation, write φ =_pr ψ to indicate that ψ can be obtained from φ by a finite sequence of prenex transformations. Assume inductively that

  φ₁ =_pr Q₁x₁ … Qₘxₘ ψ₁,  φ₂ =_pr Q_{m+1}y₁ … Q_{m+n}yₙ ψ₂,

where Q₁, …, Q_{m+n} ∈ {∃, ∀}, x₁, …, xₘ are distinct, y₁, …, yₙ are distinct, and ψ₁ and ψ₂ are quantifier-free.
Then for φ := ¬φ₁ we have

  φ =_pr ¬Q₁x₁ … Qₘxₘ ψ₁.

Applying m prenex transformations of type (2) we get

  ¬Q₁x₁ … Qₘxₘ ψ₁ =_pr Q₁*x₁ … Qₘ*xₘ ¬ψ₁,

hence φ =_pr Q₁*x₁ … Qₘ*xₘ ¬ψ₁.
Next, let φ := φ₁ ∨ φ₂. The assumptions above yield

  φ =_pr (Q₁x₁ … Qₘxₘ ψ₁) ∨ (Q_{m+1}y₁ … Q_{m+n}yₙ ψ₂).

Replacing here the RHS (righthand side) by a variant we may assume that {x₁, …, xₘ} ∩ {y₁, …, yₙ} = ∅, and also that no xᵢ occurs free in ψ₂ and no yⱼ occurs free in ψ₁. Applying m + n prenex transformations of types (3) and (4) we obtain

  (Q₁x₁ … Qₘxₘ ψ₁) ∨ (Q_{m+1}y₁ … Q_{m+n}yₙ ψ₂) =_pr Q₁x₁ … Qₘxₘ Q_{m+1}y₁ … Q_{m+n}yₙ (ψ₁ ∨ ψ₂).

Hence φ =_pr Q₁x₁ … Qₘxₘ Q_{m+1}y₁ … Q_{m+n}yₙ (ψ₁ ∨ ψ₂). Likewise, to deal with φ₁ ∧ φ₂, we apply prenex transformations of types (5) and (6).
Next, let φ := ∃xφ₁. Applying prenex transformations of type (1) we can assume x₁, …, xₘ are different from x. Then φ =_pr ∃xQ₁x₁ … Qₘxₘ ψ₁, and the RHS is in prenex form. The case φ := ∀xφ₁ is similar.
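The induction in this proof is effectively an algorithm. The Python sketch below is our own tuple-encoded version (it assumes, for simplicity, that no variable of the input occurs both free and bound, so that wholesale renaming implements transformation (1) safely); it applies (2) for negations and variants followed by (3)–(6) for disjunctions and conjunctions:

```python
from itertools import count

counter = count()

def rename(f, old, new):
    """Replace variable old by new everywhere in f (new is always fresh here)."""
    op = f[0]
    if op == "atom":                        # ("atom", name, (variables...))
        return ("atom", f[1], tuple(new if v == old else v for v in f[2]))
    if op == "not":
        return ("not", rename(f[1], old, new))
    if op in ("or", "and"):
        return (op, rename(f[1], old, new), rename(f[2], old, new))
    q, x, g = f                             # ("E"/"A", variable, body)
    return (q, new if x == old else x, rename(g, old, new))

def prenex(f):
    """Return (prefix, matrix); prefix is a list of (quantifier, variable)."""
    op = f[0]
    if op == "atom":
        return [], f
    if op == "not":                         # transformation (2), once per quantifier
        pre, m = prenex(f[1])
        return [("A" if q == "E" else "E", x) for q, x in pre], ("not", m)
    if op in ("or", "and"):                 # variants (1), then (3)-(6)
        pre1, m1 = prenex(f[1])
        pre2, m2 = prenex(f[2])
        pre = []
        for q, x in pre1:
            z = "z%d" % next(counter)       # fresh variable for the variant
            m1 = rename(m1, x, z)
            pre.append((q, z))
        for q, x in pre2:
            z = "z%d" % next(counter)
            m2 = rename(m2, x, z)
            pre.append((q, z))
        return pre, (op, m1, m2)
    q, x, g = f                             # quantifier already at the front
    pre, m = prenex(g)
    return [(q, x)] + pre, m

# not(exists x P(x))  or  forall x Q(x)
f = ("or", ("not", ("E", "x", ("atom", "P", ("x",)))),
           ("A", "x", ("atom", "Q", ("x",))))
pre, m = prenex(f)
print(pre)  # [('A', 'z0'), ('A', 'z1')]
print(m)    # ('or', ('not', ('atom', 'P', ('z0',))), ('atom', 'Q', ('z1',)))
```

The result ∀z₀∀z₁(¬P(z₀) ∨ Q(z₁)) is exactly what hand application of transformations (1), (2) and (3)–(4) produces for ¬∃xP(x) ∨ ∀xQ(x).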
We nish this section with results on equalities. Note that by Corollary 2.1.7,
the result of replacing an occurrence of an L-term in t by an L-term
is an
L-term t
.
Proposition 3.3.6. Let and
, let t
be
obtained from t by replacing an occurrence of in t by
. Then t = t
.
Proof. We proceed by induction on terms. First note that if t = τ, then t′ = τ′.
This fact takes care of the case that t is a variable. Suppose t = Ft_1 . . . t_n
where F ∈ L^f is n-ary and t_1, . . . , t_n are L-terms, and assume t ≠ τ. Using the
facts on admissible words at the end of Section 2.1, including exercise 5, we see
that t′ = Ft_1′ . . . t_n′ where for some i ∈ {1, . . . , n} we have t_j′ = t_j for all j ≠ i,
j ∈ {1, . . . , n}, and t_i′ is obtained from t_i by replacing an occurrence of τ in t_i
by τ′. Then ⊢ t_1 = t_1′, . . . , ⊢ t_n = t_n′ (using the inductive hypothesis for t_i), so by part
(5) of Lemma 3.2.3 we have ⊢ t = t′.
An occurrence of an L-term in φ is said to be inside quantifier scope if the
occurrence is inside an occurrence of a subformula ∃xψ or ∀xψ in φ. An
occurrence of an L-term in φ is said to be outside quantifier scope if the
occurrence is not inside quantifier scope.
Proposition 3.3.7. Let τ and τ′ be L-terms such that ⊢ τ = τ′, and let φ′ be
obtained from φ by replacing an occurrence of τ in φ outside quantifier scope by
τ′. Then ⊢ φ ↔ φ′.
Proof. First take care of the case that φ is atomic by a similar argument as for
terms, and then proceed by the usual kind of induction on formulas.
Exercises.
(1) Let P be a unary relation symbol, Q a binary relation symbol, and x, y distinct
variables. Use prenex transformations to put

∀x∃y(P(x) → Q(x, y)) → ∀x∃y(Q(x, y) → P(y))

into prenex form.
(2) Let (φ_i)_{i∈I} be a family of formulas with finite index set I. Then

⊢ ∃x ⋁_{i∈I} φ_i ↔ ⋁_{i∈I} ∃x φ_i,    ⊢ ∀x ⋀_{i∈I} φ_i ↔ ⋀_{i∈I} ∀x φ_i.
(3) If φ_1(x_1, . . . , x_m) and φ_2(x_1, . . . , x_m) are existential formulas, then

(φ_1 ∨ φ_2)(x_1, . . . , x_m),    (φ_1 ∧ φ_2)(x_1, . . . , x_m)

are equivalent to existential formulas φ_{1∨2}(x_1, . . . , x_m) and φ_{1∧2}(x_1, . . . , x_m). The
same holds with existential replaced by universal.
(4) A formula is said to be unnested if each atomic subformula has the form Rx_1 . . . x_m
with m-ary R ∈ L^r ∪ {⊤, ⊥, =} and distinct variables x_1, . . . , x_m, or the form
Fx_1 . . . x_n = x_{n+1} with n-ary F ∈ L^f and distinct variables x_1, . . . , x_{n+1}. (This
allows ⊤ and ⊥ as atomic subformulas of unnested formulas.) Then:

(i) for each term t(x_1, . . . , x_m) and variable y ∉ {x_1, . . . , x_m} the formula

t(x_1, . . . , x_m) = y

is equivalent to an unnested existential formula φ_1(x_1, . . . , x_m, y), and also to an
unnested universal formula φ_2(x_1, . . . , x_m, y).

(ii) each atomic formula φ(y_1, . . . , y_n) is equivalent to an unnested existential
formula φ_1(y_1, . . . , y_n), and also to an unnested universal formula φ_2(y_1, . . . , y_n).

(iii) each formula φ(y_1, . . . , y_n) is equivalent to an unnested formula φ_u(y_1, . . . , y_n).
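Part (i) of exercise (4) is constructive: a nested term t(x) = y flattens into a
chain of unnested equations through fresh variables. The Python sketch below is
ours (the tuple encoding and the fresh-variable scheme are not from the notes):

```python
# Flatten a nested term into "unnested" atomic equations: each compound
# subterm F(s1, ..., sn) contributes one flat equation F z1 ... zn = z for a
# fresh variable z, and the whole term is named by the last fresh variable.

from itertools import count

def unnest(t, eqs, fresh):
    """Return a variable naming t, appending flat equations to eqs.

    A term is a string (variable) or (F, t1, ..., tn); an equation
    (F, args, z) reads: F(args) = z.
    """
    if isinstance(t, str):          # a variable already names itself
        return t
    args = [unnest(s, eqs, fresh) for s in t[1:]]
    z = 'z%d' % next(fresh)
    eqs.append((t[0], tuple(args), z))
    return z
```

For t = f(g(x)), the output equations read as g(x) = z_1 and f(z_1) = z_2, so
t(x) = y is equivalent to ∃z_1 (g(x) = z_1 ∧ f(z_1) = y), with z_2 in the role of y.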
3.4 Equational Classes and Universal Algebra
The term-structure A_Σ is a Σ-algebra.
Proof. Consider an identity

∀x (s_1(x) = t_1(x) ∧ · · · ∧ s_n(x) = t_n(x)),    x = (x_1, . . . , x_m)

in Σ, let j ∈ {1, . . . , n} and put s = s_j and t = t_j. Let a_1, . . . , a_m ∈ A_Σ and put . . .

A_Σ is an initial Σ-algebra.
Proof. Let B be any Σ-algebra. Note that if s, t ∈ T_L and [s] = [t], then
Σ ⊢ s = t, so s^B = t^B. Thus we have a map

A_Σ → B,    [t] ↦ t^B.

It is easy to check that this map is a homomorphism A_Σ → B. By Exercise 4
in Section 2.4 it is the only such homomorphism.
Free algebras. Let I be an index set in what follows. Let A be a Σ-algebra
and (a_i)_{i∈I} an I-indexed family of elements of A. Then A is said to be a free
Σ-algebra on (a_i) if for every Σ-algebra B and I-indexed family (b_i) of elements
of B there is exactly one homomorphism h : A → B such that h(a_i) = b_i for all
i ∈ I. We also express this by saying that (A, (a_i)) is a free Σ-algebra. Finally, A itself is
sometimes referred to as a free Σ-algebra if there is a family (a_j)_{j∈J} in A such
that (A, (a_j)) is a free Σ-algebra.
As an example, take L = L_Ri and cRi := Ri ∪ {∀x∀y (xy = yx)}, where x, y
are distinct variables. So the cRi-algebras are just the commutative rings. Let
Z[X_1, . . . , X_n] be the ring of polynomials in distinct indeterminates X_1, . . . , X_n
over Z. For any commutative ring R and elements b_1, . . . , b_n ∈ R we have a
unique ring homomorphism Z[X_1, . . . , X_n] → R that sends X_i to b_i for i =
1, . . . , n, namely the evaluation map (or substitution map)

Z[X_1, . . . , X_n] → R,    f(X_1, . . . , X_n) ↦ f(b_1, . . . , b_n).

Thus Z[X_1, . . . , X_n] is a free commutative ring on (X_i)_{1≤i≤n}.
For a simpler example, let L = L_Mo := {1, ·} ⊆ L_Gr be the language of monoids,
and consider

Mo := {∀x(1 · x = x ∧ x · 1 = x), ∀x∀y∀z ((xy)z = x(yz))},

where x, y, z are distinct variables. A monoid, or semigroup with identity, is a
model A = (A; 1, ·) of Mo, and we call 1^A ∈ A the identity of the monoid A, and
·^A its product operation.
Let E* be the set of words on a set E, viewed as a monoid by
taking the empty word as its identity and the concatenation operation (v, w) ↦
vw as its product operation. Then E* . . .

. . . is a free Σ_I-algebra on ([c_i]). Thus, up to unique isomorphism of Σ_I-algebras, there
is a unique free Σ-algebra on an I-indexed family of its elements.
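The freeness of the word monoid can be seen concretely: a map j from letters
into any monoid extends in exactly one way to a homomorphism on words. A
Python sketch (ours; Python strings stand in for words on an alphabet):

```python
# Extend a letter-assignment j into the unique monoid homomorphism on words:
# h(empty word) = unit, h(vw) = h(v) op h(w), h(letter) = j[letter].

def extend(j, op, unit):
    def h(word):
        acc = unit
        for letter in word:
            acc = op(acc, j[letter])
        return acc
    return h
```

With the target monoid (N; 0, +) and j('a') = 1, j('b') = 2, the extension h
simply adds up the letter values, and h(vw) = h(v) + h(w) for all words v, w.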
Let (A, (a_i)_{i∈I}) be a free Σ-algebra. Then A is generated by (a_i). To see
why, let B be the subalgebra of A generated by (a_i). Then we have a unique
homomorphism h : A → B such that h(a_i) = a_i for all i ∈ I, and then the
composition

A → B → A

is necessarily id_A, so B = A.
Let B be any Σ-algebra, and take any family (b_j)_{j∈J} in B that generates B.
Take a free Σ-algebra (A, (a_j)_{j∈J}), and take the unique homomorphism

h : (A, (a_j)) → (B, (b_j)).

Then h(t^A(a_{j_1}, . . . , a_{j_n})) = t^B(b_{j_1}, . . . , b_{j_n}) for all L-terms
t(x_1, . . . , x_n) and j_1, . . . , j_n ∈ J, so h(A) = B, and thus h induces an
isomorphism A/∼_h ≅ B. We have shown:

Every Σ-algebra is isomorphic to a quotient of a free Σ-algebra.

This fact can sometimes be used to reduce problems on Σ-algebras to the case
of free Σ-algebras; see the next subsection for an example.
Proof of Birkhoff's theorem. Let us say that a class C of L-algebras is closed
if it has properties (1)–(5) listed in Theorem 3.4.1. Assume C is closed; we have
to show that then C is equational. Indeed, let Σ(C) be the set of L-identities

∀x (s(x) = t(x))

that are true in all algebras of C. It is clear that each algebra in C is a Σ(C)-algebra,
and it remains to show that every Σ(C)-algebra belongs to C. Here is
the key fact from which this will follow:

Claim. If A is an initial Σ(C)-algebra, then A ∈ C.

To prove this claim we take A := A_{Σ(C)}. For s, t ∈ T_L such that s = t does not
belong to Σ(C) we pick B_{s,t} ∈ C such that B_{s,t} ⊨ s ≠ t, and we let h_{s,t} : A → B_{s,t}
be the unique homomorphism, so h_{s,t}([s]) ≠ h_{s,t}([t]). Let B := ∏ B_{s,t} where the
product is over all pairs (s, t) as above, and let h : A → B be the homomorphism
given by h(a) = (h_{s,t}(a)). Note that B ∈ C. Then h is injective. To see why,
let s, t ∈ T_L be such that [s] ≠ [t] in A = A_{Σ(C)}. Then s = t does not belong
to Σ(C), so h_{s,t}([s]) ≠ h_{s,t}([t]), and thus h([s]) ≠ h([t]). This injectivity gives
A ≅ h(A) ⊆ B, so A ∈ C. This finishes the proof of the claim.

Now, every Σ(C)-algebra is isomorphic to a quotient of a free Σ(C)-algebra, so
it remains to show that free Σ(C)-algebras belong to C. Let A be a free Σ(C)-algebra
on (a_i)_{i∈I}. Let C_I be the class of all L_I-algebras (B, (b_i)) with B ∈ C.
It is clear that C_I is closed as a class of L_I-algebras. Now, (A, (a_i)) is easily
seen to be an initial Σ(C_I)-algebra. By the claim above, applied to C_I in place
of C, we obtain (A, (a_i)) ∈ C_I, and thus A ∈ C.
Generators and relations. Let G be any set. Then we have a Σ-algebra A
with a map ι : G → A such that for any Σ-algebra B and any map j : G → B
there is a unique homomorphism h : A → B such that h ∘ ι = j; in other words, A
is free as a Σ-algebra on (ι(g))_{g∈G}. Note that if (A′, ι′) (with ι′ : G → A′) has the
same universal property as (A, ι), then the unique homomorphism h : A → A′
such that h ∘ ι = ι′ . . .

. . . constructed
in the proof of the Completeness Theorem is countable. It follows that there
are only countably many variable-free L′-terms, hence A′ is countable, and
thus its reduct A′|_L is a countable model of Σ.
Remark. The above proof is the first time that we used the countability of
Var = {v_0, v_1, v_2, . . . }. As promised in Section 2.4, we shall now indicate why the
Downward Löwenheim-Skolem Theorem goes through without assuming that
Var is countable.

Suppose that Var is uncountable. Take a countably infinite subset Var′ ⊆
Var. Then each sentence is equivalent to one whose variables are all from Var′.
By replacing each sentence in Σ by an equivalent one all whose variables are
from Var′, we obtain a set Σ′ such that Σ and Σ′ have
the same models. As in the proof above, we obtain a countable model of Σ′ in
which only variables from Var′ are used in
terms and formulas. This model is a countable model of Σ.
The following test can be useful in showing that a set of axioms is complete.

Proposition 4.1.2 (Vaught's Test). Let L be countable, and suppose Σ has a
model, and that all countable models of Σ are isomorphic. Then Σ is complete.

Proof. Suppose Σ is not complete. Then there is σ such that Σ ⊬ σ and Σ ⊬ ¬σ.
Hence by the Löwenheim-Skolem Theorem there is a countable model A of Σ
such that A ⊨ ¬σ, and there is a countable model B of Σ such that B ⊨ σ. We
have A ≅ B, A ⊨ ¬σ and B ⊨ σ, contradiction.
Example. Let L = ∅, so the L-structures are just the non-empty sets. Let
Σ = {σ_1, σ_2, . . . } where

σ_n = ∃x_1 . . . ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j.

The models of Σ are exactly the infinite sets. All countable models of Σ are
countably infinite and hence isomorphic to N. Thus by Vaught's Test Σ is
complete.
In this example the hypothesis of Vaught's Test is trivially satisfied. In other
cases it may require work to check this hypothesis. One general method in
model theory, Back-and-Forth, is often used to verify the hypothesis of Vaught's
Test. The proof of the next theorem shows Back-and-Forth in action. To
formulate that theorem we define a totally ordered set to be a structure (A; <)
for the language L_O that satisfies the following axioms (where x, y, z are distinct
variables):

∀x(x ≮ x),    ∀x∀y∀z ((x < y ∧ y < z) → x < z),    ∀x∀y(x < y ∨ x = y ∨ y < x).

Let To be the set consisting of these three axioms. A totally ordered set is said
to be dense if it satisfies in addition the axiom

∀x∀y(x < y → ∃z(x < z ∧ z < y)),

and it is said to have no endpoints if it satisfies the axiom

∀x∃y∃z(y < x ∧ x < z).

So (Q; <) and (R; <) are dense totally ordered sets without endpoints. Let
To(dense without endpoints) be To augmented by these two extra axioms.

Theorem 4.1.3 (Cantor). Any two countable dense totally ordered sets without
endpoints are isomorphic.
Proof. Take enumerations A = {a_0, a_1, a_2, . . . } and B = {b_0, b_1, b_2, . . . }
of the two underlying sets, and put α_0 := a_0 and β_0 := b_0. Let n > 0, and suppose we have distinct
α_0, . . . , α_{n−1} in A and distinct β_0, . . . , β_{n−1} in B such that for all i, j < n we have

α_i < α_j ⟺ β_i < β_j.

Then we define α_n ∈ A and β_n ∈ B as follows:
Case 1: n is even. (Here we go forth.) First take k ∈ N minimal such that
a_k ∉ {α_0, . . . , α_{n−1}}; then take l ∈ N minimal such that b_l is situated with
respect to β_0, . . . , β_{n−1} as a_k is situated with respect to α_0, . . . , α_{n−1}, that is, l
is minimal such that for i = 0, . . . , n − 1 we have: α_i < a_k ⟺ β_i < b_l, and
α_i > a_k ⟺ β_i > b_l. (The reader should check that such an l exists: that is
where density and no endpoints come in); put α_n := a_k and β_n := b_l.
Case 2: n is odd. (Here we go back.) First take l ∈ N minimal such that
b_l ∉ {β_0, . . . , β_{n−1}}; next take k ∈ N minimal such that a_k is situated with
respect to α_0, . . . , α_{n−1} as b_l is situated with respect to β_0, . . . , β_{n−1}, that is, k
is minimal such that for i = 0, . . . , n − 1 we have: α_i < a_k ⟺ β_i < b_l, and
α_i > a_k ⟺ β_i > b_l. Put β_n := b_l and α_n := a_k.
One proves easily by induction on n that then a_n ∈ {α_0, . . . , α_{2n}} and
b_n ∈ {β_0, . . . , β_{2n}}. Thus we have a bijection α_n ↦ β_n : A → B, and this bijection
is an isomorphism (A; <) → (B; <).
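The construction in this proof can be run literally on two concrete countable
dense orders without endpoints, say the rationals in (0, 1) and the dyadic
rationals in (0, 1), each with a fixed enumeration. The harness below is ours and
carries out only finitely many steps; "situated with respect to" is exactly the
condition from the proof.

```python
from fractions import Fraction

def enum_rationals(n):
    """First n rationals in (0,1), ordered by increasing denominator."""
    out, q = [], 2
    while len(out) < n:
        for p in range(1, q):
            x = Fraction(p, q)
            if x not in out:
                out.append(x)
                if len(out) == n:
                    break
        q += 1
    return out

def enum_dyadics(n):
    """First n dyadic rationals in (0,1), denominators 2, 4, 8, ..."""
    out, q = [], 2
    while len(out) < n:
        for p in range(1, q):
            x = Fraction(p, q)
            if x not in out:
                out.append(x)
                if len(out) == n:
                    break
        q *= 2
    return out

def situated_like(cand, fixed, pairs, side):
    # side 0: fixed is new in A, cand is a candidate in B; side 1: the reverse
    for a, b in pairs:
        u, v = (a, b) if side == 0 else (b, a)
        if (fixed < u) != (cand < v) or (fixed > u) != (cand > v):
            return False
    return True

def back_and_forth(A, B, steps):
    pairs = []                       # the partial isomorphism built so far
    for n in range(steps):
        used_a = [p[0] for p in pairs]
        used_b = [p[1] for p in pairs]
        if n % 2 == 0:               # forth: place the least-index unused a_k
            a = next(x for x in A if x not in used_a)
            b = next(y for y in B if y not in used_b
                     and situated_like(y, a, pairs, 0))
        else:                        # back: place the least-index unused b_l
            b = next(y for y in B if y not in used_b)
            a = next(x for x in A if x not in used_a
                     and situated_like(x, b, pairs, 1))
        pairs.append((a, b))
    return pairs
```

Density and the absence of endpoints guarantee that the `next(...)` searches
succeed in the full orders; here they succeed because the finite enumerations
are taken long enough.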
In combination with Vaught's Test this gives

Corollary 4.1.4. To(dense without endpoints) is complete.
In the results below we let κ denote an infinite cardinal. Recall that the set of
ordinals λ < κ has cardinality κ. We have the following generalization of the
Löwenheim-Skolem theorem.

Theorem 4.1.5 (Generalized Löwenheim-Skolem Theorem). Suppose L has
size at most κ and Σ has an infinite model. Then Σ has a model of cardinality
κ.
Proof. Let (c_λ)_{λ<κ} be a family of new constant symbols that are not in L and
are pairwise distinct (that is, c_λ ≠ c_μ for λ ≠ μ). Put L* := L ∪ {c_λ : λ < κ}
and let

Σ* := Σ ∪ {c_λ ≠ c_μ : λ < μ < κ}.

We claim that Σ* is consistent. To see
this it suffices to show that, given any finite set Λ of ordinals < κ, the set of L*-sentences

Σ_Λ := Σ ∪ {c_λ ≠ c_μ : λ, μ ∈ Λ, λ ≠ μ}

has a model. Take an infinite model A of Σ. We make an L*-expansion A*
of A by interpreting the c_λ with λ ∈ Λ by distinct elements of A and the
remaining c_λ arbitrarily; then A* is a model of Σ_Λ, which
establishes the claim.

Note that L* has size at most κ, so Σ* has a model B* = (B, (b_λ)_{λ<κ})
of cardinality at most κ. We have b_λ ≠ b_μ for
λ < μ < κ, hence B is a model of Σ of cardinality κ.
The next proposition is Vaught's Test for arbitrary languages and cardinalities.

Proposition 4.1.6. Suppose L has size at most κ, Σ has a model and all models
of Σ are infinite. Suppose also that any two models of Σ of cardinality κ are
isomorphic. Then Σ is complete.

Proof. Let σ be an L-sentence and suppose that Σ ⊬ σ and Σ ⊬ ¬σ. We
will derive a contradiction. First Σ ⊬ σ means that Σ ∪ {¬σ} has a model.
Similarly Σ ⊬ ¬σ means that Σ ∪ {σ} has a model. These models must be
infinite since they are models of Σ, so by the Generalized Löwenheim-Skolem
Theorem Σ ∪ {¬σ} has a model A of cardinality κ, and Σ ∪ {σ} has a model
B of cardinality κ. By assumption A ≅ B, contradicting that A ⊨ ¬σ and
B ⊨ σ.
We now discuss in detail one particular application of this generalized Vaught
Test. Fix a field F. A vector space over F is an abelian (additively written)
group V equipped with a scalar multiplication operation

F × V → V :    (λ, v) ↦ λv

such that for all λ, μ ∈ F and all v, w ∈ V,

(i) (λ + μ)v = λv + μv,
(ii) λ(v + w) = λv + λw,
(iii) 1v = v,
(iv) (λμ)v = λ(μv).
Let L_F be the language of vector spaces over F: it extends the language
L_Ab = {0, −, +} of abelian groups with unary function symbols f_λ for each
λ ∈ F. Let Σ_F consist of the L_F-sentences expressing the vector space axioms
over F together with

{∃x_1 . . . ∃x_n ⋀_{1≤i<j≤n} x_i ≠ x_j : n = 2, 3, . . . }.

So the models of Σ_F are exactly the infinite vector spaces over F. Note that if
F itself is infinite then each non-trivial vector space over F is infinite.
We will need the following facts about vector spaces V and W over F.
(Proofs can be found in various texts.)

Fact.

(a) V has a basis B, that is, B ⊆ V, and for each vector v ∈ V there is a
unique family (λ_b)_{b∈B} of scalars (elements of F) such that {b ∈ B : λ_b ≠ 0}
is finite and v = Σ_{b∈B} λ_b b.
. . .

Theorem 4.1.7. Σ_F is complete.

Proof. Take a cardinal κ > |F|. In particular L_F has size at most κ. Let V be a vector
space over F of cardinality κ. Then a basis of V must also have size κ by
property (d) above. Thus any two vector spaces over F of cardinality κ have
bases of cardinality κ and thus are isomorphic. It follows by the Generalized
Vaught Test that Σ_F is complete.
Remark. Theorem 4.1.7 and Exercise 3 imply for instance that if F = R then
all non-trivial vector spaces over F satisfy exactly the same sentences in L_F.
With the generalized Vaught Test we can also prove that ACF(0) (whose models
are the algebraically closed fields of characteristic 0) is complete. The proof is
similar, with transcendence bases taking over the role of bases. The relevant
definitions and facts are as follows.

Let K be a field with subfield k. A subset B of K is said to be algebraically
independent over k if for all distinct b_1, . . . , b_n ∈ B we have p(b_1, . . . , b_n) ≠ 0
for all nonzero polynomials p(x_1, . . . , x_n) ∈ k[x_1, . . . , x_n], where x_1, . . . , x_n are
distinct variables. A transcendence basis of K over k is a set B ⊆ K such that
B is algebraically independent over k and K is algebraic over its subfield k(B).
Fact.

(a) K has a transcendence basis over k;

(b) any two transcendence bases of K over k have the same size;

(c) if K is algebraically closed with transcendence basis B over k, and K′ is
also an algebraically closed field extension of k with transcendence basis B′,
then any bijection B → B′ extends to an isomorphism K → K′;

(d) if K is uncountable and |K| > |k|, then |K| = |B| for each transcendence
basis B of K over k.
Applying this with k = Q and k = F_p for prime numbers p, we obtain that
any two algebraically closed fields of the same characteristic and the same
uncountable size are isomorphic. Using Vaught's Test for models of size ℵ_1 this
yields:

Theorem 4.1.8. The set ACF(0) of axioms for algebraically closed fields of
characteristic zero is complete. Likewise, ACF(p) is complete for each prime
number p.

If the hypothesis of Vaught's Test (or its generalization) is satisfied, then many
things follow of which completeness is only one; it goes beyond the scope of
these notes to develop this large chapter of model theory, which goes under the
name of categoricity in power.
Exercises.

(1) Let L = {U} where U is a unary relation symbol. Consider the L-structure
(Z; N). Give an informative description of a complete set of L-sentences true in
(Z; N). (A description like {σ : (Z; N) ⊨ σ} is not informative. An explicit,
possibly infinite, list of axioms is required. Hint: Make an educated guess and
try to verify it using Vaught's Test or one of its variants.)

(2) Let Σ_1 and Σ_2 be sets of L-sentences such that no symbol of L occurs in both
Σ_1 and Σ_2. Suppose Σ_1 and Σ_2 have infinite models. Then Σ_1 ∪ Σ_2 has a model.

(3) Let L = {S} where S is a unary function symbol. Consider the L-structure (Z; S)
where S(a) = a + 1 for a ∈ Z. Give an informative description of a complete set
of L-sentences true in (Z; S).
4.2 Elementary Equivalence and Back-and-Forth
In the rest of this chapter we relax notation, and just write φ(a_1, . . . , a_n) for an
L_A-formula φ(a_1, . . . , a_n), where A = (A; . . . ) is an L-structure, φ(x_1, . . . , x_n)
an L_A-formula, and (a_1, . . . , a_n) ∈ A^n.

In this section A and B denote L-structures. We say that A and B are
elementarily equivalent (notation: A ≡ B) if they satisfy the same L-sentences. Thus
by the previous section (Q; <) ≡ (R; <), and any two infinite vector spaces
over a given field F are elementarily equivalent.
A partial isomorphism from A to B is a bijection γ : X → Y with X ⊆ A and
Y ⊆ B such that

(i) for each m-ary R ∈ L^r and a_1, . . . , a_m ∈ X:

(a_1, . . . , a_m) ∈ R^A ⟺ (γa_1, . . . , γa_m) ∈ R^B.

(ii) for each n-ary f ∈ L^f and a_1, . . . , a_n, a_{n+1} ∈ X:

f^A(a_1, . . . , a_n) = a_{n+1} ⟺ f^B(γa_1, . . . , γa_n) = γ(a_{n+1}).
Example: suppose A = (A; <) and B = (B; <) are totally ordered sets, and
a_1, . . . , a_N ∈ A and b_1, . . . , b_N ∈ B are such that a_1 < a_2 < · · · < a_N and
b_1 < b_2 < · · · < b_N; then the map a_i ↦ b_i : {a_1, . . . , a_N} → {b_1, . . . , b_N} is a
partial isomorphism from A to B.
A back-and-forth system from A to B is a collection Γ of partial isomorphisms
from A to B such that

(i) (Forth) for each γ ∈ Γ and a ∈ A there is a γ′ ∈ Γ such that γ′ extends γ
and a ∈ domain(γ′);

(ii) (Back) for each γ ∈ Γ and b ∈ B there is a γ′ ∈ Γ such that γ′ extends γ
and b ∈ codomain(γ′).

If Γ is a back-and-forth system from A to B, then Γ^{−1} := {γ^{−1} : γ ∈ Γ} is a
back-and-forth system from B to A. We call A and B back-and-forth equivalent
(notation: A ≡_bf B) if there exists a nonempty back-and-forth system from A
to B. Hence A ≡_bf A, and if A ≡_bf B, then B ≡_bf A.
Cantor's proof in the previous section generalizes as follows:

Proposition 4.2.1. Suppose A and B are countable and A ≡_bf B. Then A ≅ B.

Proof. Let Γ be a nonempty back-and-forth system from A to B. We proceed
as in the proof of Cantor's theorem, and construct a sequence (γ_n) of partial
isomorphisms from A to B such that each γ_{n+1} extends γ_n, A = ⋃_n domain(γ_n)
and B = ⋃_n codomain(γ_n). Then the map A → B that extends each γ_n is an
isomorphism A → B.
In applying this proposition and the next one in a concrete situation, the
key is to guess a back-and-forth system. That is where insight and imagination
(and experience) come in. In the following result we do not need to assume
countability.

Proposition 4.2.2. If A ≡_bf B, then A ≡ B.

Proof. Suppose Γ is a nonempty back-and-forth system from A to B. By
induction on the number of logical symbols in unnested formulas φ(y_1, . . . , y_n)
one shows that for each γ ∈ Γ and a_1, . . . , a_n ∈ domain(γ) we have

A ⊨ φ(a_1, . . . , a_n) ⟺ B ⊨ φ(γa_1, . . . , γa_n).

For n = 0 and using Exercise 4 of Section 3.3 this yields A ≡ B.
Exercises.

(1) Define a finite restriction of a bijection γ : X → Y to be a map γ|E : E → γ(E)
with finite E ⊆ X. If Γ is a back-and-forth system from A to B, so is the set of
finite restrictions of members of Γ.

(2) If A ≡_bf B and B ≡_bf C, then A ≡_bf C.
4.3 Quantifier Elimination
In this section x = (x_1, . . . , x_n) is a tuple of distinct variables and y is a single
variable distinct from x_1, . . . , x_n.

Definition. Σ has quantifier elimination (QE) if every L-formula φ(x) is
Σ-equivalent to a quantifier-free (short: q-free) L-formula φ_qf(x).

By taking n = 0 in this definition we see that if Σ has QE, then every L-sentence
is Σ-equivalent to a q-free L-sentence.
Remark. Suppose Σ has QE. Let L′ ⊇ L be such that all
symbols of L′ ∖ L are constant symbols. Then
every set Σ′ of L′-sentences with Σ′ ⊇ Σ has QE as well. (Each L′-formula
φ′(x) has the form φ(c, x) for some L-formula φ(u, x), where
u = (u_1, . . . , u_m) is a tuple of distinct variables distinct from x_1, . . . , x_n, and
c = (c_1, . . . , c_m) is a tuple of constant symbols in L′ ∖ L.)
A basic conjunction in L is by definition a conjunction of finitely many atomic
and negated atomic L-formulas. Each q-free L-formula φ(x) is equivalent to a
disjunction φ_1(x) ∨ · · · ∨ φ_k(x) of basic conjunctions φ_i(x) in L (disjunctive
normal form).
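For propositional (q-free) formulas the disjunctive normal form can be read off
a truth table: one basic conjunction per satisfying row. A Python sketch (ours;
formulas are encoded as Boolean functions, which is not how the notes treat
them):

```python
# DNF by truth table: the disjuncts are the satisfying assignments,
# each read as a conjunction of literals.

from itertools import product

def dnf(f, names):
    """Rows where f is true; each row (a dict) is one basic conjunction."""
    return [dict(zip(names, row))
            for row in product([False, True], repeat=len(names))
            if f(**dict(zip(names, row)))]

def eval_dnf(disjuncts, assignment):
    return any(all(assignment[v] == val for v, val in d.items())
               for d in disjuncts)
```

By construction the disjunction of these basic conjunctions agrees with the
original formula on every assignment.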
Lemma 4.3.1. Suppose that for every basic conjunction φ(x, y) in L there is a
q-free L-formula φ_qf(x) such that

Σ ⊢ ∃yφ(x, y) ↔ φ_qf(x).

Then Σ has QE.
Proof. Let us say that an L-formula φ(x) has Σ-QE if it is Σ-equivalent to a
q-free L-formula φ_qf(x). Note that if the L-formulas φ_1(x) and φ_2(x) have Σ-QE,
then ¬φ_1(x), (φ_1 ∨ φ_2)(x), and (φ_1 ∧ φ_2)(x) have Σ-QE.

Next, let φ(x) = ∃yψ(x, y), and suppose inductively that the L-formula
ψ(x, y) has Σ-QE. Hence ψ(x, y) is Σ-equivalent to a disjunction ⋁_i ψ_i(x, y) of
basic conjunctions ψ_i(x, y) in L, with i ranging over some finite index set. In
view of the equivalence of ∃y ⋁_i ψ_i(x, y) with ⋁_i ∃yψ_i(x, y) we obtain

Σ ⊢ φ(x) ↔ ⋁_i ∃yψ_i(x, y).

Each ∃yψ_i(x, y) has Σ-QE, by hypothesis, so φ(x) has Σ-QE.

Finally, let φ(x) = ∀yψ(x, y), and suppose inductively that the L-formula
ψ(x, y) has Σ-QE. This case reduces to the previous case since φ(x) is equivalent
to ¬∃y¬ψ(x, y).
Remark. In the structure (R; <, 0, 1, +, −, ·) the formula

φ(a, b, c) := ∃y(ay^2 + by + c = 0)

is equivalent to the q-free formula

(b^2 − 4ac ≥ 0 ∧ a ≠ 0) ∨ (a = 0 ∧ b ≠ 0) ∨ (a = 0 ∧ b = 0 ∧ c = 0).

Note that this equivalence gives an effective test for the existence of a y with a
certain property, which avoids in particular having to check an infinite number
of values of y (even uncountably many in the case above). This illustrates the
kind of property QE is.
Lemma 4.3.2. Suppose Σ has QE and B and C are models of Σ with a common
substructure A (we do not assume A ⊨ Σ). Then B and C satisfy the same
L_A-sentences.

Proof. Let σ be an L_A-sentence. We have to show B ⊨ σ ⟺ C ⊨ σ. Write σ
as φ(a) with φ(x) an L-formula and a ∈ A^n. Take a q-free L-formula φ_qf(x)
that is Σ-equivalent to φ(x). Then B ⊨ σ iff B ⊨ φ_qf(a) iff A ⊨ φ_qf(a) (by
Exercise 7 of Section 2.5) iff C ⊨ φ_qf(a) (by the same exercise) iff C ⊨ σ.
Corollary 4.3.3. Suppose Σ has a model, has QE, and there exists an L-structure
that can be embedded into every model of Σ. Then Σ is complete.

Proof. Take an L-structure A that can be embedded into every model of Σ. Let
B and C be any two models of Σ. So A is isomorphic to a substructure of B and
of C. Then by a slight rewording of the proof of Lemma 4.3.2 (considering only
L-sentences), we see that B and C satisfy the same L-sentences. It follows that
Σ is complete.

Remark. We have seen that Vaught's test can be used to prove completeness.
The above corollary gives another way of establishing completeness, and is often
applicable when the hypothesis of Vaught's Test is not satisfied. Completeness
is only one of the nice consequences of QE, and the easiest one to explain at this
stage. The main impact of QE is rather that it gives access to the structural
properties of definable sets. This will be reflected in exercises at the end of
this section. Applications of model theory to other areas of mathematics often
involve QE as a key step.
Theorem 4.3.4. To(dense without endpoints) has QE.

Proof. Let (x, y) = (x_1, . . . , x_n, y) be a tuple of n + 1 distinct variables, and
consider a basic conjunction φ(x, y) in L_O. By Lemma 4.3.1 it suffices to show
that ∃yφ(x, y) is equivalent in To(dense without endpoints) to a q-free formula
ψ(x). We may assume that each conjunct of φ is of one of the following types:

y = x_i,    x_i < y,    y < x_i    (1 ≤ i ≤ n).
To justify this, observe that if we had instead a conjunct y ≠ x_i then we could
replace it by (y < x_i) ∨ (x_i < y) and use the fact that ∃y(φ_1(x, y) ∨ φ_2(x, y))
is equivalent to ∃yφ_1(x, y) ∨ ∃yφ_2(x, y). Similarly, a negation ¬(y < x_i) can
be replaced by the disjunction y = x_i ∨ x_i < y, and likewise with negations
¬(x_i < y). Also conjuncts in which y does not appear can be eliminated because

∃y(φ(x) ∧ ψ(x, y)) is equivalent to φ(x) ∧ ∃yψ(x, y).
Suppose that we have a conjunct y = x_i in φ(x, y); then ∃yφ(x, y) is equivalent to
φ(x, x_i), and we are done. So we can assume also that φ(x, y)
has no conjuncts of the form y = x_i.
After all these reductions, and after rearranging conjuncts, we can assume
that φ(x, y) is a conjunction

⋀_{i∈I} x_i < y ∧ ⋀_{j∈J} y < x_j

where I, J ⊆ {1, . . . , n} and where we allow I or J to be empty. Up till this
point we did not need the density and no endpoints axioms, but these come
in now: ∃yφ(x, y) is equivalent in To(dense without endpoints) to the formula

⋀_{i∈I, j∈J} x_i < x_j.
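The elimination step just proved can be checked by brute force over (Q; <): a
witness y, if one exists, can always be taken among midpoints of the parameters
or just beyond their extremes. The test harness below is ours, not part of the
notes:

```python
# In (Q; <), decide  exists y (AND_{i in I} x_i < y  AND  AND_{j in J} y < x_j)
# by searching a sufficient finite set of candidate witnesses, and compare with
# the quantifier-free equivalent  AND_{i in I, j in J} x_i < x_j.

from fractions import Fraction
from itertools import product

def exists_y(xs, I, J):
    cs = set(xs)
    cands = {(a + b) / 2 for a in cs for b in cs}   # midpoints (density)
    cands |= {min(cs) - 1, max(cs) + 1}             # no endpoints
    return any(all(xs[i] < y for i in I) and all(y < xs[j] for j in J)
               for y in cands)

def qfree(xs, I, J):
    return all(xs[i] < xs[j] for i in I for j in J)
```

The candidate set suffices: if any witness exists, the midpoint of the relevant
max and min (or a point below/beyond everything, when I or J is empty) is
also a witness.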
We mention without proof two examples of QE, and give a complete proof for a
third example in the next section. The following theorem is due to Tarski and
(independently) to Chevalley. It dates from around 1950.

Theorem 4.3.5. ACF has QE.

Clearly, ACF is not complete, since it says nothing about the characteristic: it
doesn't prove 1 + 1 = 0, nor does it prove 1 + 1 ≠ 0. However, ACF(0), which
contains additional axioms forcing the characteristic to be 0, is complete by 4.3.3
and the fact that the ring of integers embeds in every algebraically closed field
of characteristic 0. Tarski also established the following more difficult theorem,
which is one of the key results in real algebraic geometry. (His original proof is
rather long; there is a shorter one due to A. Seidenberg, and an elegant short
proof by A. Robinson using a combination of basic algebra and elementary
model theory.)
Definition. RCF is a set of axioms true in the ordered field (R; <, 0, 1, −, +, ·)
of real numbers. In addition to the ordered field axioms, it has the axiom
∀x (x > 0 → ∃y (x = y^2)) (x, y distinct variables) and for each odd n > 1 the
axiom

∀x_1 . . . ∀x_n ∃y (y^n + x_1y^{n−1} + · · · + x_n = 0),

where x_1, . . . , x_n, y are distinct variables.
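The odd-degree axiom holds in R for a concrete reason: an odd-degree monic
polynomial is negative far to the left and positive far to the right, so bisection
locates a root. A numerical sketch (the Cauchy-style bound 1 + Σ|c_i| and the
helper names are ours, not the notes'):

```python
# Find a real root of  y^n + c[0]*y^(n-1) + ... + c[-1]  for odd n,
# by bisection on an interval guaranteed to contain a sign change.

def eval_poly(coeffs, y):
    v = 1.0                                 # leading coefficient 1
    for c in coeffs:
        v = v * y + c                       # Horner's scheme
    return v

def odd_degree_root(coeffs, tol=1e-12):
    m = 1.0 + sum(abs(c) for c in coeffs)   # beyond this, y^n dominates
    lo, hi = -m, m                          # p(lo) < 0 < p(hi) for odd n
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eval_poly(coeffs, mid) <= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For even degree this argument fails, and indeed y^2 + 1 = 0 has no real root;
that is exactly why the axiom is restricted to odd n.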
Theorem 4.3.6. RCF has QE and is complete.
Exercises. In (5) and (6), an L-theory is a set T of L-sentences such that for all
L-sentences σ, if T ⊢ σ, then σ ∈ T. An axiomatization of an L-theory T is a set Σ of
L-sentences such that T = {σ : σ is an L-sentence and Σ ⊢ σ}.

(1) The subsets of C definable in (C; 0, 1, −, +, ·) are exactly the finite subsets of C
and their complements in C. (Hint: use the fact that ACF has QE.)

(2) The subsets of R definable in the model (R; <, 0, 1, −, +, ·) of RCF are exactly
the finite unions of intervals of all kinds (including degenerate intervals with just
one point). (Hint: use the fact that RCF has QE.)
(3) Let L′ extend L by new symbols of arity 0, and let Σ′ be a set of
L′-sentences. Then . . .

4.4 Presburger Arithmetic

. . . we may assume that the basic conjunction φ(x, y) has the form

⋀_{h∈H} my = t_h(x) ∧ ⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(my + t_k(x))
where m > 0 and H, I, J, K are disjoint finite index sets. We allow some of
these index sets to be empty, in which case the corresponding conjunction can
be left out.

Suppose that H ≠ ∅, say h_0 ∈ H. Then ∃yφ(x, y) is PrA-equivalent to the q-free formula

P_m(t_{h_0}(x)) ∧ ⋀_{h∈H} t_h(x) = t_{h_0}(x) ∧ ⋀_{i∈I} t_i(x) < t_{h_0}(x) ∧ ⋀_{j∈J} t_{h_0}(x) < t_j(x) ∧ ⋀_{k∈K} P_{n(k)}(t_{h_0}(x) + t_k(x)).

For the rest of the proof we assume that H = ∅.
4.4. PRESBURGER ARITHMETIC 71
To understand what follows, it may help to focus on the model Z̃, although
the arguments go through for arbitrary models of PrA. Fix any value a ∈ Z^n
of x. Consider the system of linear congruences (with unknown y)

P_{n(k)}(my + t_k(a)),    (k ∈ K),

which in more familiar notation would be written as

my + t_k(a) ≡ 0 mod n(k),    (k ∈ K).

The solutions in Z of this system form a union of congruence classes modulo
N := ∏_{k∈K} n(k), where as usual we put N = 1 for K = ∅. This suggests
replacing y successively by Nz, 1 + Nz, . . . , (N − 1)1 + Nz. Our precise claim is
that ∃yφ(x, y) is PrA-equivalent to the formula ψ(x) given by

⋁_{r=0}^{N−1} (⋀_{k∈K} P_{n(k)}((mr)1 + t_k(x)) ∧ ∃z(⋀_{i∈I} t_i(x) < m(r1 + Nz) ∧ ⋀_{j∈J} m(r1 + Nz) < t_j(x))).
We prove this equivalence with ψ(x) as follows. Suppose

A = (A; . . . ) ⊨ PrA,    a = (a_1, . . . , a_n) ∈ A^n.

We have to show that A ⊨ ∃yφ(a, y) if and only if A ⊨ ψ(a). So let b ∈ A be
such that A ⊨ φ(a, b). Division with remainder yields a c ∈ A and an r such
that b = r1 + Nc and 0 ≤ r ≤ N − 1. Note that then for k ∈ K,

mb + t_k(a) = m(r1 + Nc) + t_k(a) = (mr)1 + (mN)c + t_k(a) ∈ n(k)A,

and so A ⊨ P_{n(k)}((mr)1 + t_k(a)). Also,

t_i(a) < m(r1 + Nc) for every i ∈ I,    m(r1 + Nc) < t_j(a) for every j ∈ J.

Therefore A ⊨ ψ(a) with z witnessed by c. For the converse, suppose that the
disjunct of ψ(a) indexed by a certain r ∈ {0, . . . , N − 1} is true in A, with z
witnessed by c ∈ A. Then put b = r1 + Nc and we get A ⊨ φ(a, b). This proves
the claimed equivalence.
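The congruence-class fact this claim rests on — that solvability of the system
depends only on y mod N — can be checked mechanically. In the sketch below
(ours), fixed integers stand in for the parameter values t_k(a):

```python
# Solutions of  m*y + t_k = 0 mod n(k)  (k in K) form a union of congruence
# classes modulo N = product of the n(k): membership depends only on y mod N.

def solves(y, m, ts, ns):
    return all((m * y + t) % n == 0 for t, n in zip(ts, ns))

def solution_classes(m, ts, ns):
    N = 1
    for n in ns:
        N *= n
    return N, {r for r in range(N) if solves(r, m, ts, ns)}
```

With m = 3 and the congruences 3y + 1 ≡ 0 mod 4, 3y + 4 ≡ 0 mod 5, the
solutions are exactly the class 17 mod 20, so scanning the residues 0, . . . , N − 1
decides the existential question — precisely the replacement of y by r1 + Nz.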
Now that we have proved the claim we have reduced to the situation (after
changing notation) where H = K = ∅ (i.e. no equations and no congruences).
So φ(x, y) now has the form

⋀_{i∈I} t_i(x) < my ∧ ⋀_{j∈J} my < t_j(x).
If J = ∅ or I = ∅ then PrA ⊢ ∃yφ(x, y) ↔ ⊤. This leaves the case where
both I and J are nonempty. So suppose A ⊨ PrA and that A is the underlying
set of A. For each value a ∈ A^n of x there is i_0 ∈ I such that t_{i_0}(a) is maximal
among the t_i(a) with i ∈ I, and a j_0 ∈ J such that t_{j_0}(a) is minimal among the
t_j(a) with j ∈ J. Moreover each interval of m successive elements of A contains
an element of mA. Therefore ∃yφ(x, y) is equivalent in A to the disjunction
over all pairs (i_0, j_0) ∈ I × J of the q-free formula

⋀_{i∈I} t_i(x) ≤ t_{i_0}(x) ∧ ⋀_{j∈J} t_{j_0}(x) ≤ t_j(x) ∧ ⋁_{r=1}^{m} (P_m(t_{i_0}(x) + r1) ∧ (t_{i_0}(x) + r1 < t_{j_0}(x))).
This completes the proof. Note that L_PrA does not contain the relation symbol
≤; we just write t ≤ t′ to abbreviate (t < t′) ∨ (t = t′).
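This last elimination step can be tested over Z for sample parameter values:
the existential formula holds iff some multiple of m lies strictly between
max_i t_i and min_j t_j, which is what the q-free disjunction expresses. The
harness below is ours, with the integer values t_i(a), t_j(a) passed in directly:

```python
# With I, J nonempty, compare  exists y (AND t_i < m*y AND AND m*y < t_j)
# against the q-free disjunction over pairs (i0, j0) from the proof.

def exists_y(m, lows, highs):
    """Brute force over a range that must contain any witness y."""
    lo, hi = max(lows), min(highs)
    return any(lo < m * y < hi for y in range(lo // m - 1, hi // m + 2))

def qfree(m, lows, highs):
    return any(
        all(t <= t_i0 for t in lows) and all(t_j0 <= t for t in highs)
        and any((t_i0 + r) % m == 0 and t_i0 + r < t_j0
                for r in range(1, m + 1))
        for t_i0 in lows for t_j0 in highs)
```

Only the pair with t_{i_0} maximal and t_{j_0} minimal can pass the first two
conjuncts, and among t_{i_0} + 1, . . . , t_{i_0} + m exactly one value is divisible
by m — the "interval of m successive elements" observation from the proof.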
Remark. It now follows from Corollary 4.3.3 that PrA is complete: it has QE
and Z̃ can be embedded in every model.
Discussion. The careful reader will have noticed that the elimination procedure
in the proof above is constructive: it describes an algorithm that, given any basic
conjunction φ(x, y) in L_PrA as input, constructs a q-free formula ψ(x) of L_PrA
such that PrA ⊢ ∃yφ(x, y) ↔ ψ(x). In view of the equally constructive proof
of Lemma 4.3.1 this yields an algorithm that, given any L_PrA-formula φ(x) as
input, constructs a q-free L_PrA-formula φ_qf(x) such that PrA ⊢ φ(x) ↔ φ_qf(x).
(Thus PrA has effective QE.)

In particular, this last algorithm constructs for any L_PrA-sentence σ a q-free
L_PrA-sentence σ_qf such that PrA ⊢ σ ↔ σ_qf. Since we also have an obvious
algorithm that, given any q-free L_PrA-sentence σ_qf, checks whether σ_qf is true
in Z̃, this yields an algorithm that, given any L_PrA-sentence σ, checks whether
σ is true in Z̃. Thus the structure Z̃ is decidable. (A precise definition of
decidability will be given in the next chapter.) The algorithms above can
easily be implemented by computer programs.
Let some L-structure A be given, and suppose we have an algorithm for
deciding whether any given L-sentence is true in A. Even if this algorithm can
be implemented by a computer program, it does not guarantee that the program
is of practical use, or feasible: on some moderately small inputs it might have
to run for 10^100 years before producing an output. This bad behaviour is not
at all unusual: no (classical, sequential) algorithm for deciding the truth of
L_PrA-sentences in Z̃ is feasible in a precise technical sense. Results of this kind
belong to complexity theory; this is an area where mathematics (logic, number
theory, . . . ) and computer science interact.

There do exist feasible integer linear programming algorithms that decide
the truth in Z̃ of sentences of a special form, and this shows another (very
practical) side of complexity theory.
A positive impact of QE is that it yields structural properties of definable
sets, as we discuss next for Z̃.
Definition. Let d be a positive integer. An arithmetic progression of modulus
d is a set of the form

{k ∈ Z : k ≡ r mod d, α < k < β},

where r ∈ {0, . . . , d − 1} and α, β ∈ Z ∪ {−∞, +∞}.
We leave the proof of the next lemma to the reader.

Lemma 4.4.3. Arithmetic progressions have the following properties.

(1) If P, Q ⊆ Z are arithmetic progressions of moduli d and e respectively, then
P ∩ Q is an arithmetic progression of modulus lcm(d, e).

(2) If P ⊆ Z is an arithmetic progression, then Z ∖ P is a finite union of
arithmetic progressions.

(3) Let T be the collection of all finite unions of arithmetic progressions. Then
T contains with any two sets X, Y also X ∪ Y, X ∩ Y, X ∖ Y.
Corollary 4.4.4. Let S Z. Then
S is denable in
Z S is a nite union of arithmetic progressions.
Proof. (⇐) It suffices to show that each arithmetic progression is definable in ℤ; this is straightforward and left to the reader. (⇒) By QE and Lemma 4.4.3 it suffices to show that each atomic L_PrA-formula φ(x) defines in ℤ a finite union of arithmetic progressions. Every atomic formula φ(x) different from ⊤ and ⊥ has the form t_1(x) < t_2(x), or the form t_1(x) = t_2(x), or the form P_d(t(x)), where t_1(x), t_2(x) and t(x) are L_PrA-terms. The first two kinds reduce to t(x) > 0 and t(x) = 0 respectively (by subtraction). It follows that we may assume that φ(x) has the form kx + l·1 > 0, or the form kx + l·1 = 0, or the form P_d(kx + l·1), where k, l ∈ ℤ. Considering cases (k = 0; k ≠ 0 and k ≡ 0 mod d; and so on), we see that such a φ(x) defines an arithmetic progression.
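The closure properties in Lemma 4.4.3 are easy to check numerically. The Python sketch below (our own illustration, with our own names; it only samples a finite window of ℤ) verifies property (1): the intersection of progressions of moduli d and e steps by lcm(d, e).

```python
from math import lcm

def progression(r, d, lo, hi, window):
    """The set {k : k ≡ r mod d, lo < k < hi}, restricted to [-window, window]."""
    return {k for k in range(-window, window + 1)
            if k % d == r % d and lo < k < hi}

# Lemma 4.4.3(1): the intersection of progressions of moduli d and e
# is an arithmetic progression of modulus lcm(d, e) (possibly empty).
P = progression(1, 4, -100, 100, 100)   # k ≡ 1 mod 4
Q = progression(3, 6, -100, 100, 100)   # k ≡ 3 mod 6
elems = sorted(P & Q)
step = lcm(4, 6)                        # 12
# consecutive elements of the intersection differ by lcm(d, e)
assert all(b - a == step for a, b in zip(elems, elems[1:]))
```

Here the intersection is nonempty (k ≡ 9 mod 12); for incompatible residues the intersection would be empty, which the definition also allows as a progression.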
Exercises.
(1) Prove that 2ℤ cannot be defined in the structure (ℤ; 0, 1, +, −, <) by a q-free formula of the language {0, 1, +, −, <}.
4.5 Skolemization and Extension by Definition
In this section L is a sublanguage of L′, Σ is a set of L-sentences, and Σ′ is a set of L′-sentences with Σ ⊆ Σ′.

Definition. We say that Σ′ is conservative over Σ if for every L-sentence σ:

Σ′ ⊢ σ ⟺ Σ ⊢ σ.

Here (⇒) is the significant direction; (⇐) is automatic, since Σ ⊆ Σ′.
Remarks
(1) Suppose Σ′ is conservative over Σ. Then Σ′ is consistent if and only if Σ is consistent.
(2) If each model of Σ has an L′-expansion to a model of Σ′, then Σ′ is conservative over Σ. (This follows easily from the Completeness Theorem.)
Proposition 4.5.1. Let φ(x_1, . . . , x_n, y) be an L-formula. Let f be an n-ary function symbol not in L, and put L′ := L ∪ {f} and

Σ′ := Σ ∪ {∀x_1 . . . ∀x_n(∃y φ(x_1, . . . , x_n, y) → φ(x_1, . . . , x_n, f(x_1, . . . , x_n)))}.

Then Σ′ is conservative over Σ.
Proof. Let 𝒜 be any model of Σ. By remark (2) it suffices to obtain an L′-expansion 𝒜′ of 𝒜 in which the new axiom of Σ′ is true. We choose a function f^𝒜 : A^n → A as follows. For any (a_1, . . . , a_n) ∈ A^n, if there is a b ∈ A such that 𝒜 ⊨ φ(a_1, . . . , a_n, b) then we let f^𝒜(a_1, . . . , a_n) be such an element b, and if no such b exists, we let f^𝒜(a_1, . . . , a_n) be an arbitrary element of A. Let 𝒜′ be the L′-expansion of 𝒜 that interprets f by f^𝒜. Thus

𝒜′ ⊨ ∀x_1 . . . ∀x_n(∃y φ(x_1, . . . , x_n, y) → φ(x_1, . . . , x_n, f(x_1, . . . , x_n))),

as desired.
Remark. A function f^𝒜 as constructed in this proof is called a Skolem function for φ in 𝒜.

Definition. Let φ(x_1, . . . , x_m) be an L-formula, let R be an m-ary relation symbol not in L, and put L′ := L ∪ {R} and Σ′ := Σ ∪ {σ_R}, where σ_R is the sentence

∀x_1 . . . ∀x_m(φ(x_1, . . . , x_m) ↔ R(x_1, . . . , x_m)).

The sentence σ_R is called the defining axiom for R. We call Σ′ an extension of Σ by a definition for the relation symbol R.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ′; we denote this expansion by 𝒜′. Every model of Σ′ is of the form 𝒜′ for a unique model 𝒜 of Σ. Hence:
(1) Σ′ is conservative over Σ.
(2) For each L′-formula φ′(y) there is an L-formula φ(y) such that Σ′ ⊢ φ′(y) ↔ φ(y).
(3) Suppose 𝒜 ⊨ Σ and S ⊆ A^m. Then S is 0-definable in 𝒜 if and only if S is 0-definable in 𝒜′.
In proving (2), note that φ′(y) may contain subformulas R t_1(y) . . . t_m(y) where the t_i are L-terms. In this case we can take

∃u_1 . . . ∃u_m(u_1 = t_1(y) ∧ · · · ∧ u_m = t_m(y) ∧ φ(u_1/x_1, . . . , u_m/x_m))

as the replacement.
Definition. Suppose Σ ⊢ ∀x ∃!y φ(x, y), where x = (x_1, . . . , x_m) and ∃!y φ(x, y) abbreviates

∃y(φ(x, y) ∧ ∀z(φ(x, z) → y = z)),

with z a variable not occurring in φ and not among x_1, . . . , x_m, y. Let f be an m-ary function symbol not in L and put L′ := L ∪ {f} and Σ′ := Σ ∪ {σ_f}, where σ_f is the sentence

∀x_1 . . . ∀x_m φ(x, f(x)).

The sentence σ_f is called the defining axiom for f. We call Σ′ an extension of Σ by a definition of the function symbol f.
Remark. Each model 𝒜 of Σ expands uniquely to a model of Σ′; we denote this expansion by 𝒜′. Every model of Σ′ is of the form 𝒜′, and Σ′ is conservative over Σ.
Now let B be an L′-structure. Let x_1, . . . , x_n be distinct variables (viewed as ranging over A), and let x_11, . . . , x_1k, . . . , x_n1, . . . , x_nk be nk distinct variables (viewed as ranging over B). Then we have a map that assigns to each L-formula φ(x_1, . . . , x_n) an L′-formula φ∆(x_11, . . . , x_1k, . . . , x_n1, . . . , x_nk) such that

φ(𝒜) = φ∆(B) ∩ A^n (with A^n viewed as a subset of B^nk).

In particular, for n = 0 the map above assigns to each L-sentence σ an L′-sentence σ∆.
(R1) The functions +, ·, χ_≤ : N^2 → N, and the coordinate functions I^n_i : N^n → N (for each n and i = 1, . . . , n) are computable.

(R2) If G : N^m → N is computable and H_1, . . . , H_m : N^n → N are computable, then so is the function F = G(H_1, . . . , H_m) : N^n → N defined by

F(a) = G(H_1(a), . . . , H_m(a)).

(R3) If G : N^{n+1} → N is computable, and for all a ∈ N^n there exists x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by

F(a) = μx(G(a, x) = 0)

(the least x ∈ N such that G(a, x) = 0) is computable.
A relation R ⊆ N^n is said to be computable (or recursive) if its characteristic function χ_R : N^n → N is computable.
Example. If F : N^3 → N and G : N^2 → N are computable, then so is the function H : N^4 → N defined by H(x_1, x_2, x_3, x_4) = F(G(x_1, x_4), x_2, x_4). This follows from (R2) by noting that

H(x) = F(G(I^4_1(x), I^4_4(x)), I^4_2(x), I^4_4(x)), where x = (x_1, x_2, x_3, x_4).

We shall use this device from now on in many proofs, but only tacitly. (The reader should of course notice when we do so.)
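The generating scheme (R1)-(R3) can be mirrored directly in a program. The sketch below is our own illustration (the names are invented): it builds functions only from χ_≤, the coordinate functions, composition, and minimalization, and then replays the Example above.

```python
def chi_le(m, n):                 # χ_≤ from (R1)
    return 1 if m <= n else 0

def I(n, i):                      # coordinate function I^n_i (1 ≤ i ≤ n)
    return lambda *a: a[i - 1]

def compose(G, *Hs):              # (R2): F(a) = G(H1(a), ..., Hm(a))
    return lambda *a: G(*(H(*a) for H in Hs))

def mu(G):                        # (R3): F(a) = μx(G(a, x) = 0)
    def F(*a):
        x = 0
        while G(*a, x) != 0:      # a zero exists by the hypothesis of (R3)
            x += 1
        return x
    return F

# the Example in the text: H(x1, x2, x3, x4) = F(G(x1, x4), x2, x4)
F = lambda a, b, c: a + b + c
G = lambda a, b: a * b
H = compose(F, compose(G, I(4, 1), I(4, 4)), I(4, 2), I(4, 4))
assert H(2, 3, 99, 5) == 2 * 5 + 3 + 5
```

The `mu` combinator is exactly the non-constructive point discussed in Section 5.3: it loops forever if no zero exists.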
From (R1), (R2) and (R3) we derive further rules for obtaining computable
functions. This is mostly an exercise in programming.
Lemma 5.1.1. Let H_1, . . . , H_m : N^n → N and R ⊆ N^m be computable. Then R(H_1, . . . , H_m) ⊆ N^n is computable, where for a ∈ N^n we put

R(H_1, . . . , H_m)(a) :⟺ R(H_1(a), . . . , H_m(a)).

Proof. Observe that χ_{R(H_1,...,H_m)} = χ_R(H_1, . . . , H_m). Now apply (R2).
Lemma 5.1.2. The functions χ_≥ and χ_= on N^2 are computable, as is the constant function c^n_0 : N^n → N where c^n_0(a) = 0 for all n and a ∈ N^n.

Proof. The function χ_≥ is computable because

χ_≥(m, n) = χ_≤(n, m) = χ_≤(I^2_2(m, n), I^2_1(m, n)),

which enables us to apply (R1) and (R2). Similarly, χ_= is computable:

χ_=(m, n) = χ_≤(m, n) · χ_≥(m, n).

To handle c^n_0 we observe that

c^n_0(a) = μx(I^{n+1}_{n+1}(a, x) = 0).
For k ∈ N we define the constant function c^n_k : N^n → N by c^n_k(a) = k.

Lemma 5.1.3. Every constant function c^n_k is computable.
Proof. This is true for k = 0 (and all n) by Lemma 5.1.2. Next, observe that

c^n_{k+1}(a) = μx(c^n_k(a) < x) = μx(χ_≥(c^{n+1}_k(a, x), I^{n+1}_{n+1}(a, x)) = 0)

for a ∈ N^n, so the general result follows by induction on k.
Let P, Q be n-ary relations on N. Then we can form the n-ary relations

¬P := N^n ∖ P, P ∧ Q := P ∩ Q, P ∨ Q := P ∪ Q,
P → Q := (¬P) ∨ Q, P ↔ Q := (P → Q) ∧ (Q → P)

on N.

Lemma 5.1.4. Suppose P, Q are computable. Then ¬P, P ∧ Q, P ∨ Q, P → Q and P ↔ Q are also computable.
Proof. Let a ∈ N^n. Then ¬P(a) iff χ_P(a) = 0 iff χ_P(a) = c^n_0(a), so

χ_{¬P}(a) = χ_=(χ_P(a), c^n_0(a)).

Hence ¬P is computable by (R2) and Lemma 5.1.2. Next, the relation P ∧ Q is computable since χ_{P∧Q} = χ_P · χ_Q. By De Morgan's Law, P ∨ Q = ¬(¬P ∧ ¬Q). Thus P ∨ Q is computable. The rest is clear.
Lemma 5.1.5. The binary relations <, ≤, =, >, ≥, ≠ on N are computable.

Proof. The relations ≤, ≥ and = have already been taken care of by Lemma 5.1.2 and (R1). The remaining relations are complements of these three, so by Lemma 5.1.4 they are also computable.
Lemma 5.1.6. (Definition by Cases) Let R_1, . . . , R_k ⊆ N^n be computable such that for each a ∈ N^n exactly one of R_1(a), . . . , R_k(a) holds, and suppose that G_1, . . . , G_k : N^n → N are computable. Then G : N^n → N given by

G(a) = G_i(a) for the unique i ∈ {1, . . . , k} with R_i(a)

is computable.

Proof. This follows from G = G_1 · χ_{R_1} + · · · + G_k · χ_{R_k}.
Lemma 5.1.7. (Definition by Cases) Let R_1, . . . , R_k ⊆ N^n be computable such that for each a ∈ N^n exactly one of R_1(a), . . . , R_k(a) holds, and let P_1, . . . , P_k ⊆ N^n be computable. Then the relation P ⊆ N^n defined by

P(a) ⟺ P_i(a) for the unique i ∈ {1, . . . , k} with R_i(a)

is computable.

Proof. Use that P = (P_1 ∧ R_1) ∨ · · · ∨ (P_k ∧ R_k).
Lemma 5.1.8. Let R ⊆ N^{n+1} be computable such that for all a ∈ N^n there exists x ∈ N with (a, x) ∈ R. Then the function F : N^n → N given by

F(a) = μx R(a, x)

is computable.

Proof. Note that F(a) = μx(χ_{¬R}(a, x) = 0) and apply (R3).
Here is a nice consequence of 5.1.5 and 5.1.8.
Lemma 5.1.9. Let F : N^n → N. Then F is computable if and only if its graph (a subset of N^{n+1}) is computable.

Proof. Let R ⊆ N^{n+1} be the graph of F. Then for all a ∈ N^n and b ∈ N,

R(a, b) ⟺ F(a) = b, and F(a) = μx R(a, x),

from which the lemma follows immediately.
Lemma 5.1.10. If R ⊆ N^{n+1} is computable, then the function F_R : N^{n+1} → N defined by F_R(a, y) = μx<y R(a, x) (the least x < y such that R(a, x), and y if no such x exists) is computable.

Proof. Use that F_R(a, y) = μx(R(a, x) or x = y).
Some notation: below we use the bold symbol ∃ as shorthand for "there exists a natural number"; likewise, we use the bold symbol ∀ to abbreviate "for all natural numbers". These abbreviation symbols should not be confused with the logical symbols ∃ and ∀.
Lemma 5.1.11. Suppose R ⊆ N^{n+1} is computable. Let P, Q ⊆ N^{n+1} be the relations defined by

P(a, y) ⟺ ∃x<y R(a, x),
Q(a, y) ⟺ ∀x<y R(a, x),

for (a, y) = (a_1, . . . , a_n, y) ∈ N^{n+1}. Then P and Q are computable.

Proof. Using the notation and results from Lemma 5.1.10 we note that P(a, y) iff F_R(a, y) < y. Hence χ_P(a, y) = χ_<(F_R(a, y), y). For Q, note that Q(a, y) iff ¬∃x<y ¬R(a, x).
Lemma 5.1.12. The function ∸ : N^2 → N defined by

a ∸ b = a − b if a ≥ b, and a ∸ b = 0 if a < b,

is computable.

Proof. Use that a ∸ b = μx(b + x = a or a < b).
The results above imply easily that many familiar functions are computable. But is the exponential function n ↦ 2^n computable? It certainly is in the intuitive sense: we know how to compute (in principle) its value at any given argument. It is not that obvious from what we have proved so far that it is computable in our precise sense. We now develop some coding tricks due to Gödel that enable us to prove routinely that functions like 2^x are computable according to our definition of computable function.
Definition. Define the function Pair : N^2 → N by

Pair(x, y) := (x + y)(x + y + 1)/2 + x.

We call Pair the pairing function.

Lemma 5.1.13. The function Pair is bijective and computable.

Proof. Exercise.
Definition. Since Pair is a bijection we can define functions Left, Right : N → N by

Pair(x, y) = a ⟺ Left(a) = x and Right(a) = y.

The reader should check that Left(a), Right(a) ≤ a for a ∈ N, and Left(a) < a if 0 < a ∈ N.

Lemma 5.1.14. The functions Left and Right are computable.

Proof. Use 5.1.9 in combination with

Left(a) = μx(∃y<a+1 Pair(x, y) = a) and Right(a) = μy(∃x<a+1 Pair(x, y) = a).
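For small numbers the bounded searches of Lemma 5.1.14 can be run literally. The sketch below (our own names) inverts Pair exactly as in that proof, and checks bijectivity on an initial segment of N^2.

```python
def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + x

def left(a):
    # μx(∃y < a+1 with Pair(x, y) = a), as in the proof of Lemma 5.1.14
    for x in range(a + 1):
        if any(pair(x, y) == a for y in range(a + 1)):
            return x

def right(a):
    # μy(∃x < a+1 with Pair(x, y) = a)
    for y in range(a + 1):
        if any(pair(x, y) == a for x in range(a + 1)):
            return y

# Pair maps the triangle {(x, y) : x + y ≤ 7} bijectively onto {0, ..., 35}
assert sorted(pair(x, y) for x in range(8) for y in range(8)
              if x + y < 8) == list(range(36))
assert (left(pair(3, 5)), right(pair(3, 5))) == (3, 5)
```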
For a, b, c ∈ ℤ we have (by definition): a ≡ b mod c ⟺ a − b ∈ cℤ.

Lemma 5.1.15. The ternary relation {(a, b, c) ∈ N^3 : a ≡ b mod c} is computable.

Proof. Use that for a, b, c ∈ N we have

a ≡ b mod c ⟺ (∃x<a+1 a = x·c + b or ∃x<b+1 b = x·c + a).
We can now introduce Gödel's function β : N^2 → N.

Definition. For a, i ∈ N we let β(a, i) be the remainder of Left(a) upon division by 1 + (i + 1)·Right(a), that is,

β(a, i) := μx(x ≡ Left(a) mod 1 + (i + 1)·Right(a)).

Proposition 5.1.16. The function β is computable, and β(a, i) ≤ a ∸ 1 for all a, i ∈ N. For any sequence (a_0, . . . , a_{n−1}) of natural numbers there exists a ∈ N such that β(a, i) = a_i for i < n.

Proof. The computability of β is clear from earlier results. We have β(a, i) ≤ Left(a) ≤ a ∸ 1.
Let a_0, . . . , a_{n−1} be natural numbers. Then we take an N ∈ N such that a_i ≤ N for all i < n and N is a multiple of every prime number less than n. We claim that then 1 + N, 1 + 2N, . . . , 1 + nN are relatively prime. To see this, suppose p is a prime number such that p | 1 + iN and p | 1 + jN (1 ≤ i < j ≤ n); then p divides their difference (j − i)N. Now p does not divide N (otherwise p | 1 + iN would give p | 1), so p | j − i; hence p is a prime less than n, and thus p divides N after all, a contradiction.
By the Chinese Remainder Theorem there exists an M such that

M ≡ a_0 mod 1 + N,
M ≡ a_1 mod 1 + 2N,
. . .
M ≡ a_{n−1} mod 1 + nN.

Put a := Pair(M, N); then Left(a) = M and Right(a) = N, and thus β(a, i) = a_i for i < n, as required.
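The proof of Proposition 5.1.16 is effective and can be replayed as a program. In the sketch below (our own names), a closed-form inverse of Pair replaces the μ-search of Lemma 5.1.14, and `encode` follows the proof: N is taken as a factorial (hence a multiple of every prime < n and at least every a_i), and M is found by the Chinese Remainder Theorem.

```python
from math import factorial, isqrt

def pair(x, y):
    return (x + y) * (x + y + 1) // 2 + x

def unpair(a):
    # closed-form inverse of Pair (instead of the μ-search of Lemma 5.1.14)
    w = (isqrt(8 * a + 1) - 1) // 2
    x = a - w * (w + 1) // 2
    return x, w - x                      # (Left(a), Right(a))

def beta(a, i):
    L, R = unpair(a)
    return L % (1 + (i + 1) * R)

def encode(seq):
    # as in the proof: N ≥ every a_i and N a multiple of every prime < n
    n = len(seq)
    N = factorial(max([n] + list(seq)))
    M, mod = 0, 1
    for i, r in enumerate(seq):          # CRT for the coprime moduli 1+(i+1)N
        m = 1 + (i + 1) * N
        t = (r - M) * pow(mod, -1, m) % m
        M, mod = M + mod * t, mod * m
    return pair(M, N)

a = encode([5, 0, 3])
assert [beta(a, i) for i in range(3)] == [5, 0, 3]
```

The modular inverse `pow(mod, -1, m)` exists precisely because the moduli 1 + (i + 1)N are relatively prime, which is the claim proved above.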
Remark. Proposition 5.1.16 shows that we can use β to encode a sequence of numbers a_0, . . . , a_{n−1} in terms of a single number a. We use this as follows to show that the function n ↦ 2^n is computable.

If a_0, . . . , a_n are natural numbers such that a_0 = 1 and a_{i+1} = 2a_i for all i < n, then necessarily a_n = 2^n. Hence by Proposition 5.1.16 we have β(a, n) = 2^n where

a := μx(β(x, 0) = 1 and ∀i<n β(x, i + 1) = 2β(x, i)),

that is,

2^n = β(μx(β(x, 0) = 1 and ∀i<n β(x, i + 1) = 2β(x, i)), n).

It follows that n ↦ 2^n is computable.
The above suggests a general method, which we develop next. To each sequence (a_1, . . . , a_n) of natural numbers we assign a sequence number, denoted ⟨a_1, . . . , a_n⟩, defined to be the least natural number a such that β(a, 0) = n (the length of the sequence) and β(a, i) = a_i for i = 1, . . . , n. For n = 0 this gives ⟨⟩ = 0, where ⟨⟩ is the sequence number of the empty sequence. We define the length function lh : N → N by lh(a) = β(a, 0), so lh is computable. Observe that lh(⟨a_1, . . . , a_n⟩) = n.

Put (a)_i := β(a, i + 1). The function (a, i) ↦ (a)_i : N^2 → N is computable, and (⟨a_1, . . . , a_n⟩)_i = a_{i+1} for i < n. Finally, let Seq ⊆ N denote the set of sequence numbers. The set Seq is computable since

a ∈ Seq ⟺ ∀x<a (lh(x) ≠ lh(a) or ∃i<lh(a) (x)_i ≠ (a)_i).
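For very small inputs the definition of sequence numbers can be evaluated literally as a μ-search. The sketch below (ours) reuses β, with the pairing function inverted in closed form; `seqnum` is feasible only for tiny sequences, which is enough to illustrate the definitions.

```python
from math import isqrt

def beta(a, i):
    w = (isqrt(8 * a + 1) - 1) // 2      # invert the pairing function
    L = a - w * (w + 1) // 2
    R = w - L
    return L % (1 + (i + 1) * R)

def lh(a):                                # lh(a) = β(a, 0)
    return beta(a, 0)

def idx(a, i):                            # (a)_i = β(a, i+1)
    return beta(a, i + 1)

def seqnum(seq):
    # ⟨a_1,...,a_n⟩: the least a with β(a, 0) = n and β(a, i) = a_i,
    # found by brute-force μ-search
    n = len(seq)
    a = 0
    while not (beta(a, 0) == n and
               all(beta(a, i + 1) == seq[i] for i in range(n))):
        a += 1
    return a

assert seqnum([]) == 0                    # ⟨⟩ = 0
s = seqnum([1, 2])
assert lh(s) == 2 and (idx(s, 0), idx(s, 1)) == (1, 2)
```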
Lemma 5.1.17. For any n, the function (a_1, . . . , a_n) ↦ ⟨a_1, . . . , a_n⟩ : N^n → N is computable, and a_i < ⟨a_1, . . . , a_n⟩ for (a_1, . . . , a_n) ∈ N^n and i = 1, . . . , n.

Proof. Use ⟨a_1, . . . , a_n⟩ = μa(β(a, 0) = n and β(a, 1) = a_1 and . . . and β(a, n) = a_n), and apply 5.1.8, 5.1.4 and 5.1.16.
Lemma 5.1.18.
(1) The function In : N^2 → N defined by

In(a, i) = μx(lh(x) = i and ∀j<i (x)_j = (a)_j)

is computable, and In(⟨a_1, . . . , a_n⟩, i) = ⟨a_1, . . . , a_i⟩ for all a_1, . . . , a_n ∈ N and i ≤ n.
(2) The function ∗ : N^2 → N defined by

a ∗ b = μx(lh(x) = lh(a) + lh(b) and ∀i<lh(a) (x)_i = (a)_i and ∀j<lh(b) (x)_{lh(a)+j} = (b)_j)

is computable, and ⟨a_1, . . . , a_m⟩ ∗ ⟨b_1, . . . , b_n⟩ = ⟨a_1, . . . , a_m, b_1, . . . , b_n⟩ for all a_1, . . . , a_m, b_1, . . . , b_n ∈ N.
Definition. For F : N^n → N, let ⟨F⟩ : N → N be given by

⟨F⟩(a) := F((a)_0, . . . , (a)_{n−1}),

so F(a_1, . . . , a_n) = ⟨F⟩(⟨a_1, . . . , a_n⟩) for all a_1, . . . , a_n ∈ N. Then F is computable iff ⟨F⟩ is computable. (Hence n-variable computability reduces to 1-variable computability.)
Let 𝒯 be a collection of functions F : N^m → N for various m. We say that 𝒯 is closed under composition if for all G : N^m → N in 𝒯 and all H_1, . . . , H_m : N^n → N in 𝒯, the function F = G(H_1, . . . , H_m) : N^n → N is in 𝒯. We say that 𝒯 is closed under minimalization if for every G : N^{n+1} → N in 𝒯 such that for all a ∈ N^n there exists x ∈ N with G(a, x) = 0, the function F : N^n → N given by F(a) = μx(G(a, x) = 0) is in 𝒯. We say that a relation R ⊆ N^n is in 𝒯 if its characteristic function χ_R is in 𝒯.
(8) Suppose 𝒯 contains the functions mentioned in (R1) and is closed under composition and minimalization. Then all lemmas and propositions of this section go through with "computable" replaced by "in 𝒯".
5.2 The Church-Turing Thesis
The computable functions as defined in the last section are also computable in the informal sense that for each such function F : N^n → N there is an algorithm that on any input a ∈ N^n stops after a finite number of steps and produces an output F(a). An algorithm is given by a finite list of instructions, a computer program, say. These instructions should be deterministic (leave nothing to chance or choice). We deliberately neglect physical constraints of space and time: imagine that the program that implements the algorithm has unlimited access to time and space to do its work on any given input.
Let us write "calculable" for this intuitive, informal, idealized notion of computable. The Church-Turing Thesis asserts:

each calculable function F : N → N is computable.

The corresponding assertion for functions N^n → N follows, because the result of Exercise 7 in Section 5.1 is clearly also valid for "calculable" instead of "computable". Call a set P ⊆ N calculable if its characteristic function is calculable.
While the Church-Turing Thesis is not a precise mathematical statement, it
is an important guiding principle, and has never failed in practice: any function
that any competent person has ever recognized as being calculable, has turned
out to be computable, and the informal grounds for calculability have always
translated routinely into an actual proof of computability. Here is a heuristic
(informal) argument that might make the Thesis plausible.
Let an algorithm be given for computing F : N → N. We can assume that on any input a ∈ N this algorithm consists of a finite sequence of steps, numbered from 0 to n, say, where at each step i it produces a natural number a_i, with a_0 = a as starting number. It stops after step n with a_n = F(a). We assume that for each i < n the number a_{i+1} is calculated by some fixed procedure from the earlier numbers a_0, . . . , a_i, that is, we have a calculable function G : N → N such that a_{i+1} = G(⟨a_0, . . . , a_i⟩) for all i < n. The algorithm should also tell us when to stop, that is, we should have a calculable P ⊆ N such that ¬P(⟨a_0, . . . , a_i⟩) for i < n and P(⟨a_0, . . . , a_n⟩). Since G and P describe only single steps in the algorithm for F it is reasonable to assume that they at least are computable. Once this is agreed to, one can show easily that F is computable as well; see the exercise below.
A skeptical reader may find this argument dubious, but Turing gave in 1936 a compelling informal analysis of what functions F : N → N are calculable in principle, and this has led to general acceptance of the Thesis. In addition, various alternative formalizations of the informal notion of calculable function have been proposed, using various kinds of machines, formal systems, and so on. They all have turned out to be equivalent in the sense of defining the same class of functions on N, namely the computable functions.
The above is only a rather narrow version of the Church-Turing Thesis, but it suffices for our purpose. There are various refinements and more ambitious versions. Also, our Church-Turing Thesis does not characterize mathematically the intuitive notion of algorithm, only the notion "function computable by an algorithm that produces for each input from N an output in N".
Exercises.
(1) Let G : N → N and P ⊆ N be given. Then there is for each a ∈ N at most one finite sequence a_0, . . . , a_n of natural numbers such that a_0 = a, for all i < n we have a_{i+1} = G(⟨a_0, . . . , a_i⟩) and ¬P(⟨a_0, . . . , a_i⟩), and P(⟨a_0, . . . , a_n⟩). Suppose that for each a ∈ N there is such a finite sequence a_0, . . . , a_n, and put F(a) := a_n, thus defining a function F : N → N. If G and P are computable, so is F.
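The F of this exercise can be simulated directly. In the sketch below (our own construction), Python lists stand in for the sequence numbers ⟨a_0, . . . , a_i⟩, and G and P are supplied as ordinary functions; the toy instance computes 2^a.

```python
def run(G, P, a):
    # compute F(a): iterate a_{i+1} = G(<a_0,...,a_i>) until P(<a_0,...,a_n>)
    trace = [a]
    while not P(trace):
        trace.append(G(trace))
    return trace[-1]

# toy instance computing 2^a: after the input a, append 1, then keep doubling
G = lambda t: 1 if len(t) == 1 else 2 * t[-1]
P = lambda t: len(t) == t[0] + 2          # stop once 1, 2, 4, ..., 2^a appear
assert run(G, P, 5) == 32
```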
5.3 Primitive Recursive Functions
This section is not really needed in the rest of this chapter, but it may throw light on some issues relating to computability. One such issue is the condition, in Rule (R3) for generating computable functions, that for all a ∈ N^n there exists y ∈ N such that G(a, y) = 0. This condition is not constructive: it could be satisfied for a certain G without us ever knowing it. We shall now argue informally that it is impossible to generate in a fully constructive way exactly the computable functions. Such a constructive generation process would presumably enable us to enumerate effectively a sequence of algorithms Π_0, Π_1, Π_2, . . . such that each Π_n computes a (computable) function f_n : N → N, and such that every computable function f : N → N occurs in the sequence f_0, f_1, f_2, . . . , possibly more than once. Now consider the function f_diag : N → N defined by

f_diag(n) = f_n(n) + 1.
Then f_diag is clearly computable in the intuitive sense, but f_diag ≠ f_n for all n, in violation of the Church-Turing Thesis.

This way of producing a new function f_diag from a sequence (f_n) is called diagonalization.¹ The same basic idea applies in other cases, and is used in a more sophisticated form in the proof of Gödel's incompleteness theorem.
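Diagonalization is easy to demonstrate on any concrete enumeration of functions. The family below is only a stand-in for the hypothetical effective enumeration (f_n); the particular formula is invented for illustration.

```python
def f(n):
    # stand-in enumeration: f(n) plays the role of "the function Π_n computes"
    return lambda x: (n * x + n * n) % 101

def f_diag(n):
    return f(n)(n) + 1

# f_diag disagrees with every f(n) at the argument n,
# so it occurs nowhere in the enumeration
assert all(f_diag(n) != f(n)(n) for n in range(1000))
```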
Here is a class of computable functions that can be generated constructively. The primitive recursive functions are the functions f : N^n → N obtained inductively as follows:

(PR1) The nullary function N^0 → N with value 0, the unary successor function S, and all coordinate functions I^n_i are primitive recursive.
(PR2) If G : N^m → N is primitive recursive and H_1, . . . , H_m : N^n → N are primitive recursive, then G(H_1, . . . , H_m) is primitive recursive.
(PR3) If F : N^{n+1} → N is obtained by primitive recursion from primitive recursive functions G : N^n → N and H : N^{n+2} → N, that is,

F(x, 0) = G(x) and F(x, y + 1) = H(x, y, F(x, y)) for all x ∈ N^n and y ∈ N,

then F is primitive recursive.
¹Perhaps it would be better to call it "antidiagonalization".
A relation R ⊆ N^n is said to be primitive recursive if its characteristic function χ_R is primitive recursive. As the next two lemmas show, the computable functions that one ordinarily meets with are primitive recursive. In the rest of this section x ranges over N^m with m depending on the context, and y over N.
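The scheme (PR3) translates into a small combinator. In the sketch below (Python names are ours), the definitions of + and · are exactly "the usual primitive recursions" invoked in the proof of Lemma 5.3.1, and Pd is the predecessor function from that lemma.

```python
def prim_rec(G, H):
    # (PR3): F(x, 0) = G(x) and F(x, y+1) = H(x, y, F(x, y))
    def F(*args):
        *x, y = args
        acc = G(*x)
        for i in range(y):
            acc = H(*x, i, acc)
        return acc
    return F

S = lambda n: n + 1                            # successor, from (PR1)

add = prim_rec(lambda x: x, lambda x, i, acc: S(acc))
mul = prim_rec(lambda x: 0, lambda x, i, acc: add(x, acc))
pd  = prim_rec(lambda: 0, lambda i, acc: i)    # Pd(y) = y ∸ 1

assert add(19, 23) == 42 and mul(6, 7) == 42 and pd(9) == 8
```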
Lemma 5.3.1. The following functions and relations are primitive recursive:
(i) each constant function c^n_m;
(ii) the binary operations +, ·, and (x, y) ↦ x^y on N;
(iii) the predecessor function Pd : N → N given by Pd(x) = x ∸ 1, the unary relation {x ∈ N : x > 0}, and the function ∸ : N^2 → N;
(iv) the binary relations ≤, ≥ and = on N.
Proof. The function c^0_m is obtained from c^0_0 by applying (PR2) m times with G = S. Next, c^n_m is obtained by applying (PR2) with G = c^0_m (taking no functions H_i, and n as the arity of the resulting function). The functions in (ii) are obtained by the usual primitive recursions. It is also easy to write down primitive recursions for the functions in (iii), in the order they are listed. For (iv), note that

χ_≥(x, 0) = 1 and χ_≥(x, y + 1) = χ_{>0}(x) · χ_≥(Pd(x), y).
Lemma 5.3.2. With the possible exceptions of Lemmas 5.1.8 and 5.1.9, all lemmas and propositions in Section 5.1 go through with "computable" replaced by "primitive recursive".

Proof. To obtain the primitive recursive version of Lemma 5.1.10, note that

F_R(a, 0) = 0,
F_R(a, y + 1) = F_R(a, y) · χ_R(a, F_R(a, y)) + (y + 1) · χ_{¬R}(a, F_R(a, y)).
A consequence of the primitive recursive version of Lemma 5.1.10 is the following restricted minimalization scheme for primitive recursive functions:

if R ⊆ N^{n+1} and H : N^n → N are primitive recursive, and for all a ∈ N^n there exists x < H(a) such that R(a, x), then the function F : N^n → N given by F(a) = μx R(a, x) is primitive recursive.
The primitive recursive versions of Lemmas 5.1.11-5.1.15 and Proposition 5.1.16 now follow easily. In particular, the function β is primitive recursive. Also, the proof of Proposition 5.1.16 yields:

There is a primitive recursive function B : N → N such that, whenever n < N, a_0 < N, . . . , a_{n−1} < N (n, a_0, . . . , a_{n−1}, N ∈ N), then for some a < B(N) we have β(a, i) = a_i for i = 0, . . . , n − 1.

Using this fact and restricted minimalization, it follows that the unary relation Seq, the unary function lh, and the binary functions (a, i) ↦ (a)_i, In and ∗ are primitive recursive.
Let a function F : N^{n+1} → N be given, and put F̄(a, b) := ⟨F(a, 0), . . . , F(a, b − 1)⟩. Then F̄ : N^{n+1} → N satisfies the primitive recursion

F̄(a, 0) = 0 and F̄(a, b + 1) = F̄(a, b) ∗ ⟨F(a, b)⟩.

It follows that if F is primitive recursive, so is F̄. The converse is obvious. Suppose also that G : N^{n+2} → N is primitive recursive, and F(a, b) = G(a, b, F̄(a, b)) for all (a, b) ∈ N^{n+1}; then F̄ satisfies the primitive recursion
Σ_i H_i(x) ≤ A_{N+1}(|x|) for all x. Then

F(x) = G(H_1(x), . . . , H_k(x)) ≤ A_N(Σ_i H_i(x)) ≤ A_N(A_{N+1}(|x|)) ≤ A_{N+2}(|x|).
Finally, assume that F : N^{m+1} → N is obtained by primitive recursion from the primitive recursive functions G : N^m → N and H : N^{m+2} → N, and assume inductively that n(G) and n(H) are bounds for G and H. Take N ∈ N such that n(G) ≤ N + 3 and n(H) ≤ N. We claim that N + 3 is a bound for F: F(x, 0) = G(x) ≤ A_{N+3}(|x|), and by part (v) of the lemma above,

F(x, y + 1) = H(x, y, F(x, y)) ≤ A_N(|x| + y + A_{N+3}(|x| + y)) ≤ A_{N+2}(A_{N+3}(|x| + y)) = A_{N+3}(|x| + y + 1).
Consider the function A : N → N defined by A(y) := A_y(y). Then for every primitive recursive F : N → N we have A(y) > F(y) for all y > n(F), where n(F) is a bound for F; in particular, A is not primitive recursive.

(R1)′ The functions +, · : N^2 → N, χ_≤ : N^2 → N, and the coordinate functions I^n_i (for each n and i = 1, . . . , n) are N-representable.
(R2)′ If G : N^m → N and H_1, . . . , H_m : N^n → N are N-representable, then so is F = G(H_1, . . . , H_m) : N^n → N defined by

F(a) = G(H_1(a), . . . , H_m(a)).
(R3)′ If G : N^{n+1} → N is N-representable, and for all a ∈ N^n there exists x ∈ N such that G(a, x) = 0, then the function F : N^n → N given by

F(a) = μx(G(a, x) = 0)

is N-representable.
(R1)′: χ_≤ : N^2 → N is N-representable: By (i) and (iv), the formula x_1 < x_2 ∨ x_1 = x_2 represents the set {(a, b) ∈ N^2 : a ≤ b} in N. So by Proposition 5.4.3, χ_≤ : N^2 → N is N-representable.
(vi) For n ≥ 1 and 1 ≤ i ≤ n, the term t^n_i(x_1, . . . , x_n) := x_i represents the function I^n_i : N^n → N in N. This is obvious.
(R2)′: Let H_1, . . . , H_m : N^n → N be N-represented by φ_i(x_1, . . . , x_n, y_i), and let G : N^m → N be N-represented by ψ(y_1, . . . , y_m, z), where z is distinct from x_1, . . . , x_n and y_1, . . . , y_m.

Claim: F = G(H_1, . . . , H_m) is N-represented by

φ(x_1, . . . , x_n, z) := ∃y_1 . . . ∃y_m((⋀_{i=1}^m φ_i(x_1, . . . , x_n, y_i)) ∧ ψ(y_1, . . . , y_m, z)).
Put a = (a_1, . . . , a_n) and let c = F(a). We have to show that

N ⊢ φ(S^a 0, z) ↔ z = S^c 0,

where S^a 0 := (S^{a_1}0, . . . , S^{a_n}0). Let b_i = H_i(a) and put b = (b_1, . . . , b_m). Then F(a) = G(b) = c. Therefore, N ⊢ ψ(S^b 0, z) ↔ z = S^c 0 and

N ⊢ φ_i(S^a 0, y_i) ↔ y_i = S^{b_i}0 (i = 1, . . . , m).

Argue in models to conclude: N ⊢ φ(S^a 0, z) ↔ z = S^c 0.
(R3)′: Let G : N^{n+1} → N be such that for all a ∈ N^n there exists b ∈ N with G(a, b) = 0. Define F : N^n → N by F(a) = μb(G(a, b) = 0). Suppose that G is N-represented by φ(x_1, . . . , x_n, y, z). We claim that the formula

ψ(x_1, . . . , x_n, y) := φ(x_1, . . . , x_n, y, 0) ∧ ∀w(w < y → ¬φ(x_1, . . . , x_n, w, 0))

N-represents F. Let a ∈ N^n and let b = F(a). Then G(a, i) ≠ 0 for i < b and G(a, b) = 0. Therefore, N ⊢ φ(S^a 0, S^b 0, z) ↔ z = 0, and for i < b we have G(a, i) ≠ 0 and N ⊢ φ(S^a 0, S^i 0, z) ↔ z = S^{G(a,i)}0. By arguing in models using Lemma 5.4.2 we obtain N ⊢ ψ(S^a 0, y) ↔ y = S^b 0, as claimed.
Remark. The converse of this theorem is also true, and is plausible from the
Church-Turing Thesis. We shall prove the converse in the next section.
Exercises. In the exercises below, L is a numerical language and Σ is a set of L-sentences.
(1) Suppose Σ ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n. If a function F : N^m → N is Σ-represented by the L-formula φ(x_1, . . . , x_m, y), then the graph of F, as a relation of arity m + 1 on N, is Σ-represented by φ(x_1, . . . , x_m, y). (This result applies to Σ = N, since N ⊢ S^m 0 ≠ S^n 0 whenever m ≠ n.)
(2) Suppose N ⊆ Σ. Then the set of all Σ-representable functions F : N^m → N (m = 0, 1, 2, . . . ) is closed under composition and minimalization.
5.5 Decidability and Gödel Numbering
Definition. An L-theory T is a set of L-sentences closed under provability, that is, whenever T ⊢ σ, then σ ∈ T.
Examples.
(1) Given a set Σ of L-sentences, the set Th(Σ) := {σ : Σ ⊢ σ} of theorems of Σ is an L-theory. If we need to indicate the dependence on L we write Th_L(Σ) for Th(Σ). We say that Σ axiomatizes an L-theory T (or is an axiomatization of T) if T = Th(Σ). For Σ = ∅ we also refer to Th_L(∅) as predicate logic in L.
(2) Given an L-structure 𝒜, the set Th(𝒜) := {σ : 𝒜 ⊨ σ} is also an L-theory, called the theory of 𝒜. Note that the theory of 𝒜 is automatically complete.
(3) Given any class 𝒦 of L-structures, the set

Th(𝒦) := {σ : 𝒜 ⊨ σ for all 𝒜 ∈ 𝒦}

is an L-theory, called the theory of 𝒦. For example, for L = L_Ri and 𝒦 the class of finite fields, Th(𝒦) is the set of L-sentences that are true in all finite fields.
The decision problem for a given L-theory T is to find an algorithm to decide for any L-sentence σ whether or not σ belongs to T. Since we have not (yet) defined the concept of algorithm, this is just an informal description at this stage. One of our goals in this section is to define a formal counterpart, called decidability. In the next section we show that the L(N)-theory Th(N) is undecidable; by the Church-Turing Thesis, this means that the decision problem for Th(N) has no solution. (This result is a version of Church's Theorem, and is closely related to the Incompleteness Theorem.)

Unless we specify otherwise we assume in the rest of this chapter that the language L is finite. This is done for simplicity, and at the end of the remaining sections we indicate how to avoid this assumption. We shall number the terms and formulas of L in such a way that various statements about these formulas and about formal proofs in this language can be translated effectively into equivalent statements about natural numbers expressible by sentences in L(N).
Recall that v_0, v_1, v_2, . . . are our variables. We assign to each symbol

s ∈ L ∪ {logical symbols} ∪ {v_0, v_1, v_2, . . .}

a symbol number SN(s) ∈ N as follows: SN(v_i) := 2i, and to each remaining symbol (in the finite set L ∪ {logical symbols}) we assign an odd natural number as symbol number, subject to the condition that different symbols have different symbol numbers.
Definition. The Gödel number ⌜t⌝ of an L-term t is defined recursively:

⌜t⌝ = ⟨SN(v_i)⟩ if t = v_i,
⌜t⌝ = ⟨SN(F), ⌜t_1⌝, . . . , ⌜t_n⌝⟩ if t = Ft_1 . . . t_n.
The Gödel number ⌜φ⌝ of an L-formula φ is given recursively by:

⌜φ⌝ = ⟨SN(⊤)⟩ if φ = ⊤,
⌜φ⌝ = ⟨SN(⊥)⟩ if φ = ⊥,
⌜φ⌝ = ⟨SN(=), ⌜t_1⌝, ⌜t_2⌝⟩ if φ = (t_1 = t_2),
⌜φ⌝ = ⟨SN(R), ⌜t_1⌝, . . . , ⌜t_m⌝⟩ if φ = Rt_1 . . . t_m,
⌜φ⌝ = ⟨SN(¬), ⌜ψ⌝⟩ if φ = ¬ψ,
⌜φ⌝ = ⟨SN(∨), ⌜φ_1⌝, ⌜φ_2⌝⟩ if φ = φ_1 ∨ φ_2,
⌜φ⌝ = ⟨SN(∧), ⌜φ_1⌝, ⌜φ_2⌝⟩ if φ = φ_1 ∧ φ_2,
⌜φ⌝ = ⟨SN(∃), ⌜x⌝, ⌜ψ⌝⟩ if φ = ∃x ψ,
⌜φ⌝ = ⟨SN(∀), ⌜x⌝, ⌜ψ⌝⟩ if φ = ∀x ψ.
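With Python tuples standing in for the sequence numbers ⟨. . .⟩, the recursive definition of ⌜t⌝ can be transcribed directly. The symbol numbers below are invented for a small sample language; the only requirements are oddness and distinctness.

```python
# hypothetical odd symbol numbers for a sample language
SN = {'0': 1, 'S': 3, '+': 5, '=': 7, 'not': 9, 'exists': 11}

def gn_var(i):
    return (2 * i,)                       # ⌜v_i⌝ = ⟨SN(v_i)⟩ with SN(v_i) = 2i

def gn_term(t):
    # terms in prefix form: ('var', i) or (F, t_1, ..., t_n)
    if t[0] == 'var':
        return gn_var(t[1])
    return (SN[t[0]],) + tuple(gn_term(s) for s in t[1:])

# ⌜v_0 + S0⌝ (here + is written in prefix form)
g = gn_term(('+', ('var', 0), ('S', ('0',))))
assert g == (5, (0,), (3, (1,)))
```

Replacing the tuples by actual sequence numbers (via β) would turn each of these nested tuples into a single natural number, exactly as in the text.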
Lemma 5.5.1. The following subsets of N are computable:
(1) Vble := {⌜x⌝ : x is a variable}
(2) Term := {⌜t⌝ : t is an L-term}
(3) AFor := {⌜φ⌝ : φ is an atomic L-formula}
(4) For := {⌜φ⌝ : φ is an L-formula}

Proof. (1) a ∈ Vble iff a = ⟨2b⟩ for some b ≤ a.
(2) a ∈ Term iff a ∈ Vble, or a = ⟨SN(F), ⌜t_1⌝, . . . , ⌜t_n⌝⟩ for some function symbol F of L of arity n and L-terms t_1, . . . , t_n with Gödel numbers < a.
We leave (3) to the reader.
(4) We have

For(a) ⟺ For((a)_1) if a = ⟨SN(¬), (a)_1⟩;
For((a)_1) and For((a)_2) if a = ⟨SN(∨), (a)_1, (a)_2⟩ or a = ⟨SN(∧), (a)_1, (a)_2⟩;
Vble((a)_1) and For((a)_2) if a = ⟨SN(∃), (a)_1, (a)_2⟩ or a = ⟨SN(∀), (a)_1, (a)_2⟩;
AFor(a) otherwise.

So For is computable.
In the next two lemmas, x ranges over variables, φ and ψ over L-formulas, and t and τ over L-terms.

Lemma 5.5.2. The function Sub : N^3 → N defined by

Sub(a, b, c) = c if Vble(a) and a = b;
Sub(a, b, c) = ⟨(a)_0, Sub((a)_1, b, c), . . . , Sub((a)_n, b, c)⟩ if a = ⟨(a)_0, . . . , (a)_n⟩ with n > 0, (a)_0 ≠ SN(∃) and (a)_0 ≠ SN(∀);
Sub(a, b, c) = ⟨SN(∃), (a)_1, Sub((a)_2, b, c)⟩ if a = ⟨SN(∃), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
Sub(a, b, c) = ⟨SN(∀), (a)_1, Sub((a)_2, b, c)⟩ if a = ⟨SN(∀), (a)_1, (a)_2⟩ and (a)_1 ≠ b;
Sub(a, b, c) = a otherwise

is computable, and satisfies

Sub(⌜t⌝, ⌜x⌝, ⌜τ⌝) = ⌜t(τ/x)⌝ and Sub(⌜φ⌝, ⌜x⌝, ⌜τ⌝) = ⌜φ(τ/x)⌝.
Proof. Exercise.
Lemma 5.5.3. The following relations on N are computable:
(1) Fr := {(⌜φ⌝, ⌜x⌝) : x occurs free in φ} ⊆ N^2
(2) FrSub := {(⌜φ⌝, ⌜x⌝, ⌜t⌝) : t is free for x in φ} ⊆ N^3
(3) PrAx := {⌜φ⌝ : φ is a propositional axiom} ⊆ N
(4) Eq := {⌜φ⌝ : φ is an equality axiom} ⊆ N
(5) Quant := {⌜φ⌝ : φ is a quantifier axiom} ⊆ N
(6) MP := {(⌜φ_1⌝, ⌜φ_1 → φ_2⌝, ⌜φ_2⌝) : φ_1, φ_2 are L-formulas} ⊆ N^3
(7) Gen := {(⌜φ⌝, ⌜ψ⌝) : ψ follows from φ by the generalization rule} ⊆ N^2
(8) Sent := {⌜σ⌝ : σ is an L-sentence} ⊆ N
Proof. The usual inductive or explicit description of each of these notions translates easily into a description of its Gödel image that establishes computability of this image. As an example, note that

Sent(a) ⟺ For(a) and ∀i<a ¬Fr(a, i),

so (8) follows from (1) and earlier results.
In the rest of this section Σ is a set of L-sentences. Put

⌜Σ⌝ := {⌜σ⌝ : σ ∈ Σ},

and call Σ computable if ⌜Σ⌝ is computable.
Definition. We define

Prf_Σ := {⟨⌜φ_1⌝, . . . , ⌜φ_n⌝⟩ : φ_1, . . . , φ_n is a proof from Σ}.

So every element of Prf_Σ is of the form ⟨⌜φ_1⌝, . . . , ⌜φ_n⌝⟩ where n ≥ 1 and every φ_k is either in Σ, or a logical axiom, or obtained from some φ_i, φ_j with 1 ≤ i, j < k by Modus Ponens, or obtained from some φ_i with 1 ≤ i < k by Generalization.
Lemma 5.5.4. If Σ is computable, then Prf_Σ is computable.

Proof. This is because a ∈ Prf_Σ if and only if a ∈ Seq, lh(a) ≥ 1, and for each k < lh(a): either (a)_k ∈ ⌜Σ⌝, or PrAx((a)_k), or Eq((a)_k), or Quant((a)_k), or MP((a)_i, (a)_j, (a)_k) for some i, j < k, or Gen((a)_i, (a)_k) for some i < k; by Lemma 5.5.3 and earlier results this condition is computable.
Negation Theorem. If A ⊆ N^n and its complement ¬A are both computably generated, then A is computable.

Proof. Let P, Q ⊆ N^{n+1} be computable such that for all a ∈ N^n we have

A(a) ⟺ ∃x P(a, x), ¬A(a) ⟺ ∃x Q(a, x).

Then there is for each a ∈ N^n an x ∈ N such that (P ∨ Q)(a, x). The computability of A follows by noting that for all a ∈ N^n we have

A(a) ⟺ P(a, μx(P ∨ Q)(a, x)).
Proposition 5.5.7. Every complete and computably axiomatizable L-theory is decidable.

Proof. Let T be a complete L-theory with computable axiomatization Σ. Then T = Th(Σ) is computably generated. Now observe:

a ∉ ⌜T⌝ ⟺ a ∉ Sent or ⟨SN(¬), a⟩ ∈ ⌜T⌝
⟺ a ∉ Sent or ∃b(Prf_Σ(b) and (b)_{lh(b) ∸ 1} = ⟨SN(¬), a⟩).

Thus the complement of ⌜T⌝ is computably generated as well, and the Negation Theorem yields that T is decidable.
to be the
set of all Godel numbers of proofs from , and for an L-theory T we dene the
notions of T being computably axiomatizable, T being decidable, and T being
undecidable, all just as we did earlier in this section. (Note, however, that the
denitions of these notions are all relative to our given computable numbering
of L; for nite L dierent choices of numbering of L yield equivalent notions of
being computable, T being computably axiomatizable, and T being decidable.)
It is now routine to check that Lemmas 5.5.4, 5.5.5, and Proposition 5.5.7 go
through.
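The decision procedure in the proof of Proposition 5.5.7 can be sketched as follows, with the theorem enumerator and the negation map supplied as hypothetical parameters; termination is exactly the completeness of the theory.

```python
def decide_in_complete_theory(sigma, enumerate_theorems, negate):
    """enumerate_theorems() yields all theorems of a complete,
    computably axiomatized theory; negate(s) is the code of the
    negation of s.  One of sigma, negate(sigma) eventually appears."""
    neg = negate(sigma)
    for theorem in enumerate_theorems():
        if theorem == sigma:
            return True
        if theorem == neg:
            return False
```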
Exercises.
(1) Let a and b denote positive real numbers. Call a computable if there are computable functions f, g : N → N such that for all n > 0,

g(n) ≠ 0 and |a − f(n)/g(n)| < 1/n.

Then:
(i) every positive rational number is computable, and e is computable;
(ii) if a and b are computable, so are a + b, ab, and 1/a, and if in addition a > b, then a − b is also computable;
(iii) a is computable if and only if the binary relation R_a on N defined by

R_a(m, n) ⟺ n > 0 and m/n < a

is computable. (Hint: use the Negation Theorem.)
(2) A nonempty S ⊆ N is computably generated iff there is a computable function f : N → N such that S = f(N). Moreover, if S is infinite and computably generated, then f can be chosen injective.
(3) If f : N → N is computable and f(x) > x for all x ∈ N, then f(N) is computable.
(4) Every infinite computably generated subset of N has an infinite computable subset.
(5) A function F : N^n → N is computable iff its graph is computably generated.
(6) Suppose Σ is a computable and consistent set of sentences in the numerical language L. Then every Σ-representable relation R ⊆ N^n is computable.
(7) A function F : N^m → N is computable if and only if it is Σ-representable for some finite, consistent set Σ ⊇ N of sentences in some numerical language L ⊇ L(N).
The last exercise gives an alternative characterization of the notion of computable function.
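For exercise (1), here is a sketch of witnessing functions f, g for the computability of e, via the partial sums of Σ 1/k!; the tail bound Σ_{k>m} 1/k! < 2/(m+1)! used in the comment is an assumption of this sketch, not something taken from the notes.

```python
from fractions import Fraction
from math import factorial

def e_approx(n):
    """Return (f(n), g(n)) with g(n) != 0 and |e - f(n)/g(n)| < 1/n."""
    # the tail of sum 1/k! past k = m is below 2/(m+1)!, so it
    # suffices to take m with (m+1)! >= 2n
    m = 1
    while factorial(m + 1) < 2 * n:
        m += 1
    s = sum(Fraction(1, factorial(k)) for k in range(m + 1))
    return s.numerator, s.denominator
```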
5.6 Theorems of Gödel and Church
In this section we assume that the finite language L extends L(N).
Theorem 5.6.1 (Church). No consistent L-theory extending N is decidable.
Define the relation P_Σ ⊆ N² by

P_Σ(a, b) :⟺ a = ⌜φ⌝ for some L-formula φ = φ(x) such that Σ ⊢ φ(S^b 0),

and put P_Σ(a) := {b ∈ N : P_Σ(a, b)} for a ∈ N.

Lemma 5.6.5. Suppose Σ ⊇ N is consistent. Then each computable set X ⊆ N is of the form X = P_Σ(a).
Proof of Church's Theorem. Let Σ ⊇ N be consistent. We have to show that Th(Σ) is undecidable, that is, ⌜Th(Σ)⌝ is not computable. Suppose that ⌜Th(Σ)⌝ is computable. Then the antidiagonal Q_Σ ⊆ N of P_Σ is computable:

b ∈ Q_Σ ⟺ (b, b) ∉ P_Σ.

By Lemma 5.6.5 we have Q_Σ = P_Σ(a) for some a ∈ N; then a ∈ Q_Σ ⟺ P_Σ(a, a) ⟺ a ∉ Q_Σ, a contradiction.
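The antidiagonal is Cantor's diagonal argument applied to the family of sets P_Σ(a); in miniature, for an arbitrary binary predicate:

```python
def antidiagonal(P):
    """Q(b) := not P(b, b) differs from every row P(a, .) at b = a."""
    return lambda b: not P(b, b)
```

So Q cannot equal P(a, ·) for any a, which is exactly the contradiction used above.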
The idea is to construct an L-sentence σ such that
(1) N ⊨ σ ⟹ Σ ⊬ σ; and (2) N ⊨ ¬σ ⟹ Σ ⊢ σ.
From (1) and (2) we get N ⊨ σ ⟺ Σ ⊬ σ. Assuming that N ⊨ ¬σ produces a contradiction. Hence σ is true in N, and thus(!) not provable from Σ.
How to implement this strange idea? To take care of (2), one might guess that

σ = ∀x ¬pr_Σ(x, S^n 0), where n = ⌜σ⌝.

Since n = ⌜σ⌝ = ⌜∀x ¬pr_Σ(x, S^n 0)⌝ depends on σ, the solution is to apply the fixed-point lemma below to θ(y) := ∀x ¬pr_Σ(x, y).
This finishes our sketch. What follows is a rigorous implementation.
Lemma 5.7.1 (Fixpoint Lemma). Suppose Σ ⊢ N. Then there is for every L-formula θ(y) an L-sentence σ such that Σ ⊢ σ ↔ θ(S^n 0), where n = ⌜σ⌝.

Proof. The function (a, b) ↦ Sub(a, ⌜x⌝, Num(b)) : N² → N is computable by Lemma 5.5.2. Hence by the representability theorem it is N-representable. Let sub(x₁, x₂, y) be an L(N)-formula representing it in N. We can assume that the variable x does not occur in sub(x₁, x₂, y). Then for all a, b in N,

N ⊢ sub(S^a 0, S^b 0, y) ↔ y = S^c 0, where c = Sub(a, ⌜x⌝, Num(b)). (1)
Now let θ(y) be an L-formula. Define ρ(x) := ∀y(sub(x, x, y) → θ(y)) and let m = ⌜ρ(x)⌝. Let σ := ρ(S^m 0), and put n = ⌜σ⌝. We claim that Σ ⊢ σ ↔ θ(S^n 0).
Indeed,

n = ⌜σ⌝ = ⌜ρ(S^m 0)⌝ = Sub(⌜ρ(x)⌝, ⌜x⌝, ⌜S^m 0⌝) = Sub(m, ⌜x⌝, Num(m)).

So by (1),

N ⊢ sub(S^m 0, S^m 0, y) ↔ y = S^n 0. (2)

We have

σ = ρ(S^m 0) = ∀y(sub(S^m 0, S^m 0, y) → θ(y)),

so by (2) we get Σ ⊢ σ ↔ ∀y(y = S^n 0 → θ(y)). Hence Σ ⊢ σ ↔ θ(S^n 0).
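The diagonal substitution Sub(m, ⌜x⌝, Num(m)) in this proof is the same device that lets a program refer to its own source. The following Python rendering is not from the notes; theta is an arbitrary transformation of texts, and RHO plays the role of ρ(x).

```python
def theta(source):
    """An arbitrary computable transformation of program texts."""
    return "length " + str(len(source))

# rho applied to its own quotation, like rho(S^m 0) with m = code(rho):
RHO = "RHO = {0!r}\nSIGMA = RHO.format(RHO)\nOUT = theta(SIGMA)"
SIGMA = RHO.format(RHO)

# running SIGMA computes theta of SIGMA's *own* text
env = {"theta": theta}
exec(SIGMA, env)
assert env["OUT"] == theta(SIGMA)
```

Just as σ is provably equivalent to θ(S^n 0) with n its own Gödel number, the text SIGMA evaluates theta on its own quotation.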
Theorem 5.7.2. Suppose Σ is consistent, computable, and proves all axioms of N. Then there exists an L(N)-formula π(x) such that N ⊢ π(S^m 0) for each m, but Σ ⊬ ∀x π(x).

Proof. Consider the relation Pr_Σ ⊆ N² defined by

Pr_Σ(m, n) :⟺ m is the Gödel number of a proof from Σ of a sentence with Gödel number n.

Since Σ is computable, Pr_Σ is computable. Hence Pr_Σ is representable in N. Let pr_Σ(x, y) be an L(N)-formula representing Pr_Σ in N, so that for all m, n ∈ N:

N ⊢ pr_Σ(S^m 0, S^n 0) whenever Pr_Σ(m, n), (1)
N ⊢ ¬pr_Σ(S^m 0, S^n 0) whenever ¬Pr_Σ(m, n). (2)

Let θ(y) be the L(N)-formula ∀x ¬pr_Σ(x, y). By the Fixpoint Lemma there is an L(N)-sentence σ with n := ⌜σ⌝ such that Σ ⊢ σ ↔ θ(S^n 0), that is,

Σ ⊢ σ ↔ ∀x ¬pr_Σ(x, S^n 0). (3)

Claim: Σ ⊬ σ. Assume towards a contradiction that Σ ⊢ σ; let m be the Gödel number of a proof of σ from Σ, so Pr_Σ(m, n). From Σ ⊢ σ and (3) we get Σ ⊢ ∀x ¬pr_Σ(x, S^n 0), so Σ ⊢ ¬pr_Σ(S^m 0, S^n 0). On the other hand, Pr_Σ(m, n) with (1) yields N ⊢ pr_Σ(S^m 0, S^n 0), so Σ ⊢ pr_Σ(S^m 0, S^n 0), contradicting the consistency of Σ. This proves the claim.

Now put π(x) := ¬pr_Σ(x, S^n 0). By the claim we have ¬Pr_Σ(m, n) for each m, so N ⊢ π(S^m 0) for each m by (2). But Σ ⊬ ∀x π(x): otherwise Σ ⊢ σ by (3), contradicting the claim.
⌜Th(N)⌝ is not definable in N.
This result (of Tarski) is known as the undefinability of truth. It strengthens the special case of Church's theorem which says that the set of Gödel numbers of L-sentences true in N is not computable.
In the rest of this section L′ is a finite language extending L, Σ is a set of L-sentences, and Σ′ is a set of L′-sentences.
Lemma 5.8.1. Let L ⊆ L′, let Σ be a set of L-sentences, and let Σ′ be a set of L′-sentences with Σ ⊆ Σ′.
(1) Suppose that every L-sentence provable from Σ′ is provable from Σ, and that Th_L(Σ) is undecidable. Then Th_{L′}(Σ′) is undecidable.
(2) Suppose L = L′, Σ′ ∖ Σ is finite, and Th(Σ′) is undecidable. Then Th(Σ) is undecidable.
(3) Suppose L′ ∖ L consists entirely of constant symbols and Σ′ = Σ. Then Th_L(Σ) is undecidable if and only if Th_{L′}(Σ′) is undecidable.
(4) Suppose Σ′ is an extension by definitions of Σ. Then Th_L(Σ) is undecidable if and only if Th_{L′}(Σ′) is undecidable.
Proof. (1) In this case we have for all a ∈ N,

a ∈ ⌜Th_L(Σ)⌝ ⟺ a ∈ Sent_L and a ∈ ⌜Th_{L′}(Σ′)⌝.

It follows that if Th_{L′}(Σ′) is decidable, so is Th_L(Σ).

(2) Write Σ′ ∖ Σ = {σ₁, …, σ_N}, and put σ := σ₁ ∧ ⋯ ∧ σ_N. Then for each L-sentence τ we have Σ′ ⊢ τ ⟺ Σ ⊢ σ → τ, so for all a ∈ N,

a ∈ ⌜Th(Σ′)⌝ ⟺ ⟨SN(∨), ⟨SN(¬), ⌜σ⌝⟩, a⟩ ∈ ⌜Th(Σ)⌝.

It follows that if Th(Σ) is decidable then so is Th(Σ′).

(3) Let c₀, …, c_n be the distinct constant symbols of L′ ∖ L. Given any L′-sentence σ we define the L-sentence σ* := ∀v_k … ∀v_{k+n} σ(v_k, …, v_{k+n}), where σ(v_k, …, v_{k+n}) is obtained from σ by replacing each c_i by v_{k+i}, with k chosen so that the variables v_k, …, v_{k+n} do not occur in σ. An easy argument using the completeness theorem shows that

Σ′ ⊢_{L′} σ ⟺ Σ ⊢_L σ*.

By the Church-Turing Thesis there is a computable function a ↦ a* : N → N such that ⌜σ⌝* = ⌜σ*⌝ for all L′-sentences σ. Then for all a ∈ N,

a ∈ ⌜Th_{L′}(Σ′)⌝ ⟺ a ∈ Sent_{L′} and a* ∈ ⌜Th_L(Σ)⌝.

This yields the ⟸ direction of (3); the converse holds by (1).

(4) The ⟹ direction holds by (1). For the ⟸ we use an algorithm (see ...) that computes for each L′-sentence σ an L-sentence σ* such that Σ′ ⊢ σ ↔ σ*. By the Church-Turing Thesis there is a computable function a ↦ a* : N → N such that ⌜σ*⌝ = ⌜σ⌝* for all L′-sentences σ, and then for all a ∈ N,

a ∈ ⌜Th_{L′}(Σ′)⌝ ⟺ a ∈ Sent_{L′} and a* ∈ ⌜Th_L(Σ)⌝.

This yields the ⟸ direction of (4).
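The translation σ ↦ σ* used for the constant symbols can be sketched in the toy tuple coding of the earlier examples (the symbol number and coding conventions are assumptions of the sketch):

```python
SN_FORALL = 101   # hypothetical symbol number for the universal quantifier

def star(a, consts, fresh_vars):
    """Replace each constant code in consts by the corresponding fresh
    variable code, then universally quantify over those variables."""
    def repl(t):
        if t in consts:
            return fresh_vars[consts.index(t)]
        if isinstance(t, tuple):
            return (t[0],) + tuple(repl(s) for s in t[1:])
        return t
    body = repl(a)
    for v in reversed(fresh_vars):   # bind v_k ... v_{k+n}, innermost last
        body = (SN_FORALL, v, body)
    return body
```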
Remark. We cannot drop the assumption L = L′ in (2): take L = ∅, Σ = ∅, L′ = L(N) and Σ′ = N. Then Σ′ ∖ Σ is finite and Th_{L′}(Σ′) is undecidable by Church's theorem, but Th_L(Σ) is decidable.
Definition. An L-structure A is strongly undecidable if every L-theory T with A ⊨ T is undecidable.

Theorem (Tarski). Suppose the L-structure A is definable in the L′-structure B and A is strongly undecidable. Then B is strongly undecidable.

Proof. (Sketch) By the previous lemma (with L′ …) … Let Σ′ be a set of L′-sentences with B ⊨ Σ′; we have to show that Th_{L′}(Σ′) is undecidable. Suppose towards a contradiction that Th_{L′}(Σ′) is decidable. Then Th_{L′}(Σ′ …) … Take … as in (ii) above. Then A ⊨ … by (i), and by (ii) we obtain an algorithm for deciding whether any given L-sentence is provable from Σ. But Th_L(Σ) is undecidable by assumption, and we have a contradiction.
Corollary 5.8.4. Th(Ri) is undecidable, in other words, the theory of rings is undecidable.

Proof. It suffices to show that the ring (Z; 0, 1, +, −, ·) of integers is strongly undecidable. Using Lagrange's theorem that

N = {a² + b² + c² + d² : a, b, c, d ∈ Z},

we see that the inclusion map N → Z defines N in the ring of integers, so by Tarski's Theorem the ring of integers is strongly undecidable.
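Lagrange's four-square theorem is what makes N existentially definable in the ring of integers; a brute-force witness search (over naturals, which suffices since integer squares are squares of their absolute values):

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a*a + b*b + c*c + d*d == n, for n >= 0."""
    r = isqrt(n)
    for a in range(r + 1):
        for b in range(r + 1):
            for c in range(r + 1):
                d2 = n - a * a - b * b - c * c
                if d2 < 0:
                    break          # c only grows, so move to the next b
                d = isqrt(d2)
                if d * d == d2:
                    return (a, b, c, d)
    return None                    # never reached, by Lagrange's theorem
```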
For the same reason the theory of commutative rings and the theory of integral domains are undecidable, and more generally, the theory of any class of rings that has the ring of integers among its members is undecidable.
Fact. The set Z ⊆ Q is 0-definable in the field (Q; 0, 1, +, −, ·) of rational numbers, and thus the ring of integers is definable in the field of rational numbers.

We shall take this here on faith. The only known proof uses non-trivial results about quadratic forms, and is due to Julia Robinson.

Corollary 5.8.5. The theory Th(Fl) of fields is undecidable. The theory of any class of fields that has the field of rationals among its members is undecidable.
Exercises.
(1) Argue informally, using the Church-Turing Thesis, that Th(ACF) is decidable. You can use the fact that ACF has QE.

Below we let a, b, c denote integers. We say that a divides b (notation: a | b) if ax = b for some integer x, and we say that c is a least common multiple of a and b if a | c, b | c, and c | x for every integer x such that a | x and b | x. Recall that if a and b are not both zero, then they have a unique positive least common multiple, and that if a and b are coprime (that is, there is no integer x > 1 with x | a and x | b), then they have ab as a least common multiple.

(2) The structure (Z; 0, 1, +, |) is strongly undecidable, where | is the binary relation of divisibility on Z. Hint: Show that if b + a is a least common multiple of a and a + 1, and b − a is a least common multiple of a and a − 1, then b = a². Use this to define the squaring function in (Z; 0, 1, +, |), and then show that multiplication is 0-definable in (Z; 0, 1, +, |).

(3) Consider the group G of bijective maps Z → Z, with composition as the group multiplication. Then G (as a model of Gr) is strongly undecidable. Hint: let s be the element of G given by s(x) = x + 1. Check that if g ∈ G commutes with s, then g = s^a for some a. Next show that

a | b ⟺ s^b commutes with each g ∈ G that commutes with s^a.

Use these facts to specify a definition of the structure (Z; 0, 1, +, |) in the group G.

Thus by Tarski's theorem, the theory of groups is undecidable. In fact, the theory of any class of groups that has the group G of the exercise above among its members is undecidable. On the other hand, Th(Ab), the theory of abelian groups, is known to be decidable (Szmielew).

(4) Let L = {F} have just a binary function symbol. Then predicate logic in L (that is, Th_L(∅)) is undecidable.

It can also be shown that predicate logic in the language whose only symbol is a binary relation symbol is undecidable. On the other hand, predicate logic in the language whose only symbol is a unary function symbol is decidable.
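The identity behind the hint in exercise (2) can be checked numerically: a and a + 1 (likewise a and a − 1) are coprime, so their positive least common multiple is the product, and b = a² satisfies both conditions.

```python
from math import lcm

def check_square_identity(a):
    """For a >= 2: b = a*a satisfies b + a = lcm(a, a+1) and
    b - a = lcm(a, a-1), since consecutive integers are coprime."""
    b = a * a
    return b + a == lcm(a, a + 1) and b - a == lcm(a, a - 1)
```

This is only the easy direction; the exercise also needs that these two conditions pin b down inside (Z; 0, 1, +, |).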
To do?
improve titlepage
improve or delete index
more exercises (from homework, exams)
footnotes pointing to alternative terminology, etc.
brief discussion of classes at end of Sets and Maps?
at the end of results without proof?
brief discussion on P=NP in connection with propositional logic
section(s) on boolean algebra, including Stone representation, Lindenbaum-Tarski algebras, etc.
section on equational logic? (boolean algebras, groups, as examples)
solution to a problem by Erdős via compactness theorem, and other simple applications of compactness
include equality theorem,
translation of one language in another (needed in connection with Tarski theorem in last section)
more details on back-and-forth in connection with unnested formulas
extra elementary model theory (universal classes, model-theoretic criteria for qe, etc., application to ACF, maybe extra section on RCF, Ax's theorem)
On computability: a few extra results on c.e. sets, and exponential diophantine result.
basic framework for many-sorted logic.