Winter 2017
Sets, Logic, Computation
The Open Logic Project
Instigator
Richard Zach, University of Calgary
Editorial Board
Aldo Antonelli,† University of California, Davis
Andrew Arana, Université Paris I Panthénon–Sorbonne
Jeremy Avigad, Carnegie Mellon University
Walter Dean, University of Warwick
Gillian Russell, University of North Carolina
Nicole Wyatt, University of Calgary
Audrey Yap, University of Victoria
Contributors
Samara Burns, University of Calgary
Dana Hägg, University of Calgary
Sets, Logic, Computation
An Open Logic Text
Winter 2017
The Open Logic Project would like to acknowledge the generous
support of the Faculty of Arts and the Taylor Institute of Teaching
and Learning of the University of Calgary.
1 Sets
1.1 Basics
1.2 Some Important Sets
1.3 Subsets
1.4 Unions and Intersections
1.5 Proofs about Sets
1.6 Pairs, Tuples, Cartesian Products
1.7 Russell's Paradox
Summary
Problems

2 Relations
2.1 Relations as Sets
2.2 Special Properties of Relations
2.3 Orders
2.4 Graphs
2.5 Operations on Relations
Summary
Problems

3 Functions
3.1 Basics
3.2 Kinds of Functions
3.3 Inverses of Functions
3.4 Composition of Functions
3.5 Isomorphism
3.6 Partial Functions
3.7 Functions and Relations
Summary
Problems

II First-order Logic
5.12 Extensionality
5.13 Semantic Notions
Summary
Problems

11 Undecidability
11.1 Introduction
11.2 Enumerating Turing Machines
11.3 The Halting Problem
11.4 The Decision Problem
11.5 Representing Turing Machines

A Induction
A.1 Introduction
A.2 Induction on N
A.3 Strong Induction
A.4 Inductive Definitions
A.5 Structural Induction

B Biographies
B.1 Georg Cantor
B.2 Alonzo Church
B.3 Gerhard Gentzen
B.4 Kurt Gödel
B.5 Emmy Noether
B.6 Bertrand Russell
B.7 Alfred Tarski
B.8 Alan Turing
B.9 Ernst Zermelo

Glossary
Bibliography
² The difference between the latter four is not terribly important, but roughly: A theorem is an important result. A proposition is a result worth recording, but perhaps not as important as a theorem. A lemma is a result we record mainly because we want to break up a proof into smaller, easier-to-manage chunks. A corollary is a result that follows easily from a theorem or proposition, such as an interesting special case.
PART I
Sets, Relations, Functions

CHAPTER 1
Sets
1.1 Basics
Sets are the most fundamental building blocks of mathematical
objects. In fact, almost every mathematical object can be seen as
a set of some kind. In logic, as in other parts of mathematics,
sets and set-theoretic talk are ubiquitous. So it will be important to discuss what sets are, and to introduce the notations necessary to talk about sets and operations on sets in a standard way.
When we say that sets are independent of the way they are
specified, we mean that the elements of a set are all that matters.
For instance, it so happens that
{Nicole, Jacob},
{x : x is a niece or nephew of Richard}, and
{x : x is a child of Ruth}
are three ways of specifying the same set. In other words, all that matters is which elements a set has. The elements of a set are not ordered and each element occurs only once. When we specify or describe a set, elements may occur multiple times and in different orders, but any descriptions that differ only in the order of elements or in how many times elements are listed describe the same set.
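This behavior is directly observable in any programming language with a built-in set type. The following Python sketch is only an illustration, not part of the formal development:

```python
# Order and repetition in a specification don't matter: only membership does.
a = {2, 1, 3}
b = {1, 2, 2, 3, 3, 3}   # listed with repeats and in a different order
print(a == b)            # the two specifications describe the same set: True
print(len(b))            # the set still has exactly 3 elements
```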
1.3 Subsets
Sets are made up of their elements, and every element of a set is a
part of that set. But there is also a sense that some of the elements
of a set taken together are a “part of” that set. For instance, the
number 2 is part of the set of integers, but the set of even numbers
is also a part of the set of integers. It’s important to keep those
two senses of being part of a set separate.
Note that a set may contain other sets, not just as subsets but
as elements! In particular, a set may happen to both be an element
and a subset of another, e.g., {0} ∈ {0, {0}} and also {0} ⊆
{0, {0}}.
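The two senses can also be checked mechanically. Here is a small Python sketch of the example above; we use frozenset for the inner set, since Python's mutable sets cannot themselves be elements of sets:

```python
# {0} can be both an element and a subset of {0, {0}}.
zero_set = frozenset({0})
outer = {0, zero_set}          # plays the role of {0, {0}}

print(zero_set in outer)       # {0} is an element of {0, {0}}: True
print(zero_set <= outer)       # {0} is also a subset, since 0 is in outer: True
```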
Extensionality gives a criterion of identity for sets: X = Y iff
every element of X is also an element of Y and vice versa. The
definition of “subset” defines X ⊆ Y precisely as the first half of
this criterion: every element of X is also an element of Y . Of
course the definition also applies if we switch X and Y : Y ⊆ X
iff every element of Y is also an element of X . And that, in turn,
is exactly the “vice versa” part of extensionality. In other words,
extensionality amounts to: X = Y iff X ⊆ Y and Y ⊆ X .
℘(X ) = {Y : Y ⊆ X }
X ∪ Y = {x : x ∈ X ∨ x ∈ Y }
X ∩ Y = {x : x ∈ X ∧ x ∈ Y }
X \ Y = {x : x ∈ X and x ∉ Y}.
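For finite sets, all four of these operations can be computed directly. A Python sketch (the powerset helper name is our own, chosen for illustration; it is not a library function):

```python
from itertools import combinations

X, Y = {1, 2, 3}, {3, 4}

union        = X | Y          # {x : x ∈ X or x ∈ Y}
intersection = X & Y          # {x : x ∈ X and x ∈ Y}
difference   = X - Y          # {x : x ∈ X and x ∉ Y}

def powerset(s):
    """All subsets of s, as a set of frozensets."""
    return {frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)}

print(union, intersection, difference)
print(len(powerset(X)))       # 2^3 = 8 subsets
```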
This completes the first half of the proof. Note that in the
last step we used the fact that if a conjunction (z ∈ X and z ∈
X ∪ Y ) follows from an assumption, each conjunct follows from
that same assumption. You may know this rule as “conjunction
elimination,” or ∧Elim. Now let’s prove (b):
X ∩ (X ∪ Y ) = X
X × Y = {⟨0, 1⟩, ⟨0, a⟩, ⟨0, b⟩, ⟨1, 1⟩, ⟨1, a⟩, ⟨1, b⟩}.
X* = {∅} ∪ X ∪ X² ∪ X³ ∪ …
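For finite sets, the Cartesian product can be computed directly. The Python sketch below uses the example X = {0, 1} and Y = {1, a, b} from above, with "a" and "b" represented as strings purely for illustration:

```python
from itertools import product

X = [0, 1]
Y = [1, "a", "b"]

# X × Y: the set of all pairs (x, y) with x in X and y in Y.
cartesian = {(x, y) for x, y in product(X, Y)}
print(len(cartesian))   # |X × Y| = |X| · |Y| = 6
```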
S = {x : x is a sibling of Richard}.
R = {x : x ∉ x}
exist?
If R exists, it makes sense to ask whether R ∈ R or not: it must be either R ∈ R or R ∉ R. Suppose the former is true, i.e., R ∈ R. R was defined as the set of all sets that are not elements of themselves, and so if R ∈ R, then R does not have this defining property of R. But only sets that have this property are in R; hence, R cannot be an element of R after all, i.e., R ∉ R.
Summary
A set is a collection of objects, the elements of the set. We write
x ∈ X if x is an element of X . Sets are extensional—they are
completely determined by their elements. Sets are specified by
listing the elements explicitly or by giving a property the ele-
ments share (abstraction). Extensionality means that the order
or way of listing or specifying the elements of a set doesn’t mat-
ter. To prove that X and Y are the same set (X = Y ) one has to
prove that every element of X is an element of Y and vice versa.
Important sets include the natural (N), integer (Z), rational (Q), and real (R) numbers, but also strings (X*) and infinite sequences (Xω) of objects. X is a subset of Y, X ⊆ Y, if every element of X is also one of Y. The collection of all subsets of a set Y is itself a set, the power set ℘(Y) of Y. We can form the union X ∪ Y and intersection X ∩ Y of sets. An ordered pair ⟨x, y⟩ consists of two objects x and y, but in that specific order. The pairs ⟨x, y⟩ and ⟨y, x⟩ are different pairs (unless x = y). The set of all pairs ⟨x, y⟩ where x ∈ X and y ∈ Y is called the Cartesian product X × Y of X and Y. We write X² for X × X; so for instance N² is the set of pairs of natural numbers.
Problems
Problem 1.1. Show that there is only one empty set, i.e., show
that if X and Y are sets without members, then X = Y .
CHAPTER 2
Relations
2.1 Relations as Sets
You will no doubt remember some interesting relations between
objects of some of the sets we’ve mentioned. For instance, num-
bers come with an order relation < and from the theory of whole
numbers the relation of divisibility without remainder (usually written n | m) may be familiar. There is also the relation is identical
with that every object bears to itself and to no other thing. But
there are many more interesting relations that we’ll encounter,
and even more possible relations. Before we review them, we’ll
just point out that we can look at relations as a special sort of
set. For this, first recall what a pair is: if a and b are two objects, we can combine them into the ordered pair ⟨a, b⟩. Note that for ordered pairs the order does matter, e.g., ⟨a, b⟩ ≠ ⟨b, a⟩, in contrast to unordered pairs, i.e., 2-element sets, where {a, b} = {b, a}.
If X and Y are sets, then the Cartesian product X × Y of X and Y is the set of all pairs ⟨a, b⟩ with a ∈ X and b ∈ Y. In particular, X² = X × X is the set of all pairs from X.
Now consider a relation on a set, e.g., the <-relation on the
set N of natural numbers, and consider the set of all pairs of
numbers ⟨n, m⟩ where n < m, i.e.,
R = {⟨n, m⟩ : n, m ∈ N and n < m}.
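For a finite portion of N, this relation can be written down explicitly as a set of pairs. A Python sketch, illustrative only:

```python
# The <-relation on {0, ..., 5}, coded as the set of pairs (n, m) with n < m.
N = range(6)
R = {(n, m) for n in N for m in N if n < m}

# Rnm holds exactly when the pair (n, m) is an element of R:
print((2, 4) in R)   # 2 < 4, so True
print((4, 2) in R)   # 4 < 2 fails, so False
```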
Then there is a close connection between the number n being
L = {⟨0, 1⟩, ⟨0, 2⟩, …, ⟨1, 2⟩, ⟨1, 3⟩, …, ⟨2, 3⟩, ⟨2, 4⟩, …},
is the less than relation, i.e., Lnm iff n < m. The subset of pairs below the diagonal, i.e.,
G = {⟨1, 0⟩, ⟨2, 0⟩, ⟨2, 1⟩, ⟨3, 0⟩, ⟨3, 1⟩, ⟨3, 2⟩, …},
2.3 Orders
Very often we are interested in comparisons between objects,
where one object may be less or equal or greater than another
in a certain respect. Size is the most obvious example of such a
comparative relation, or order. But not all such relations are alike
in all their properties. For instance, some comparative relations
require any two objects to be comparable, others don’t. (If they
do, we call them linear or total.) Some include identity (like ≤)
and some exclude it (like <). Let’s get some order into all this.
Example 2.14. Every linear order is also a partial order, and ev-
ery partial order is also a preorder, but the converses don’t hold.
For instance, the identity relation and the full relation on X are
preorders, but they are not partial orders, because they are not
anti-symmetric (if X has more than one element). For a somewhat less silly example, consider the no longer than relation ≼ on B*: x ≼ y iff len(x) ≤ len(y). This is a preorder, even a connected preorder, but not a partial order.
The relation of divisibility without remainder gives us an ex-
ample of a partial order which isn’t a linear order: for integers
n, m, we say n (evenly) divides m, in symbols: n | m, if there is
some k so that m = kn. On N, this is a partial order, but not a
linear order: for instance, 2 ∤ 3 and also 3 ∤ 2. Considered as a relation on Z, divisibility is only a preorder since anti-symmetry fails: 1 | −1 and −1 | 1 but 1 ≠ −1. Another important partial
order is the relation ⊆ on a set of sets.
Notice that the examples L and G from Example 2.2, although we said there that they are called "strict orders," are not linear orders even though they are connected (they are not reflexive).
But there is a close connection, as we will see momentarily.
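On a finite set, the defining properties of these orders can be checked by brute force. Here is an illustrative Python sketch for divisibility on {1, …, 10}, confirming the claims above for that finite case:

```python
X = range(1, 11)
divides = {(n, m) for n in X for m in X if m % n == 0}

reflexive     = all((x, x) in divides for x in X)
transitive    = all((x, z) in divides
                    for (x, y) in divides for (y2, z) in divides if y == y2)
antisymmetric = all(x == y for (x, y) in divides if (y, x) in divides)
connected     = all((x, y) in divides or (y, x) in divides
                    for x in X for y in X if x != y)

print(reflexive, transitive, antisymmetric)  # a partial order: True True True
print(connected)                             # but not a linear order: False
```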
2.4 Graphs
A graph is a diagram in which points—called “nodes” or “ver-
tices” (plural of “vertex”)—are connected by edges. Graphs are
a ubiquitous tool in discrete mathematics and in computer science. They are incredibly useful for representing, and visualizing,
relationships and structures, from concrete things like networks
of various kinds to abstract structures such as the possible out-
comes of decisions. There are many different kinds of graphs in
the literature which differ, e.g., according to whether the edges
are directed or not, have labels or not, whether there can be edges
from a node to the same node, multiple edges between the same
nodes, etc. Directed graphs have a special connection to relations.
3. The restriction R↾Y of R to Y is R ∩ Y².
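For finite relations, these operations are simple set computations. In the Python sketch below, the helper names compose and restrict are our own, chosen for illustration:

```python
R = {(1, 2), (2, 3), (1, 3)}

# The inverse of R: swap each pair.
inverse = {(y, x) for (x, y) in R}

def compose(R, S):
    """All pairs (x, z) such that Rxy and Syz for some y."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

def restrict(R, Y):
    """R restricted to Y: R ∩ Y²."""
    return {(x, y) for (x, y) in R if x in Y and y in Y}

print(inverse)
print(compose(R, R))        # {(1, 3)}: via 1 R 2 and 2 R 3
print(restrict(R, {1, 2}))  # {(1, 2)}
```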
Summary
A relation R on a set X is a way of relating elements of X . We
write Rxy if the relation holds between x and y. Formally, we can consider R simply as the set of pairs ⟨x, y⟩ for which Rxy holds.
Problems
Problem 2.1. List the elements of the relation ⊆ on the set
℘({a, b, c }).
CHAPTER 3
Functions
3.1 Basics
A function is a mapping which pairs each object of a given set with a single partner in another set. For instance, the operation of adding 1 defines a function: each number n is paired with a unique number n + 1. More generally, functions may take pairs, triples, etc., of inputs and return some kind of output. Many
functions are familiar to us from basic arithmetic. For instance,
addition and multiplication are functions. They take in two num-
bers and return a third. In this mathematical, abstract sense, a
function is a black box: what matters is only what output is paired
with what input, not the method for calculating the output.
natural number with the same output. The definitions for f and
g specify the same mapping by means of different equations, and
so count as the same function.
h(x) = x/2        if x is even
       (x + 1)/2  if x is odd

f(x) = x/2        if x is even
       (x + 1)/2  if x is odd
The scare quotes around “defined by” suggest that this is not
a definition. At least, it is not in general. For in order for this
definition to specify a function, there has to be one and only one x
such that f (x) = y—the output of g has to be uniquely specified.
Moreover, it has to be specified for every y ∈ Y. If there are x₁ and x₂ ∈ X with x₁ ≠ x₂ but f(x₁) = f(x₂), then g(y) would not be uniquely specified for y = f(x₁) = f(x₂). And if there is no x at all such that f(x) = y, then g(y) is not specified at all. In other words, for g to be defined, f has to be injective and surjective.
Proof. Exercise.
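The point that g exists exactly when f is injective and surjective can be illustrated for finite functions represented as Python dicts. The helper name inverse below is our own, for illustration only:

```python
# A finite function as a dict; its inverse exists iff f is a bijection.
f = {1: "a", 2: "b", 3: "c"}   # injective and surjective onto {"a", "b", "c"}

def inverse(f, codomain):
    values = list(f.values())
    injective  = len(values) == len(set(values))   # no two arguments share a value
    surjective = set(values) == set(codomain)      # every y is some f(x)
    if not (injective and surjective):
        raise ValueError("f has no inverse")
    return {y: x for x, y in f.items()}

g = inverse(f, {"a", "b", "c"})
print(all(g[f[x]] == x for x in f))   # g undoes f: True
```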
give this function, you first add one, and then you multiply the
result by two. So their composition is (g ◦ f )(x) = 2(x + 1).
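As a quick illustrative sketch in Python:

```python
f = lambda x: x + 1       # first add one
g = lambda x: 2 * x       # then multiply by two

def compose(g, f):
    """(g ∘ f)(x) = g(f(x)): apply f first, then g."""
    return lambda x: g(f(x))

h = compose(g, f)
print(h(3))               # 2 * (3 + 1) = 8
```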
3.5 Isomorphism
An isomorphism is a bijection that preserves the structure of the
sets it relates, where structure is a matter of the relationships that
obtain between the elements of the sets. Consider the following
two sets X = {1, 2, 3} and Y = {4, 5, 6}. These sets are both struc-
tured by the relations successor, less than, and greater than. An
isomorphism between the two sets is a bijection that preserves
those structures. So a bijective function f : X → Y is an isomorphism if i < j iff f(i) < f(j), i > j iff f(i) > f(j), and j is the successor of i iff f(j) is the successor of f(i).
Summary
A function f : X → Y maps every element of the domain X
to a unique element of the codomain Y . If x ∈ X , we call the y
that f maps x to the value f (x) of f for argument x. If X is a set
of pairs, we can think of the function f as taking two arguments.
The range ran(f ) of f is the subset of Y that consists of all the
values of f .
If ran(f ) = Y then f is called surjective. The value f (x) is
unique in that f maps x to only one f (x), never more than one.
If f (x) is also unique in the sense that no two different arguments
are mapped to the same value, f is called injective. Functions
which are both injective and surjective are called bijective.
Bijective functions have a unique inverse function f −1 . Func-
tions can also be chained together: the function (g ◦ f ) is the
composition of f with g . Compositions of injective functions are
injective, and of surjective functions are surjective, and (f −1 ◦ f )
is the identity function.
If we relax the requirement that f must have a value for every x ∈ X, we get the notion of a partial function. If f : X ⇸ Y is partial, we say f(x) is defined, f(x) ↓, if f has a value for argument x. Any (partial) function f is associated with the graph R_f of f, the relation that holds between x and y iff f(x) = y.
Problems
Problem 3.1. Show that if f is bijective, an inverse g of f exists,
i.e., define such a g , show that it is a function, and show that it
is an inverse of f , i.e., f (g (y)) = y and g (f (x)) = x for all x ∈ X
and y ∈ Y .
CHAPTER 4
The Size of Sets
4.1 Introduction
When Georg Cantor developed set theory in the 1870s, his inter-
est was in part to make palatable the idea of an infinite collection—
an actual infinity, as the medievals would say. Key to this reha-
bilitation of the notion of the infinite was a way to assign sizes—
“cardinalities”—to sets. The cardinality of a finite set is just a
natural number, e.g., ∅ has cardinality 0, and a set containing
five things has cardinality 5. But what about infinite sets? Do
they all have the same cardinality, ∞? It turns out, they do not.
The first important idea here is that of an enumeration. We
can list every finite set by listing all its elements. For some infinite
sets, we can also list all their elements if we allow the list itself
to be infinite. Such sets are called countable. Cantor’s surprising
result was that some infinite sets are not countable.
−⌈0/2⌉  ⌈1/2⌉  −⌈2/2⌉  ⌈3/2⌉  −⌈4/2⌉  ⌈5/2⌉  −⌈6/2⌉  …
0  1  −1  2  −2  3  −3  …

f(n) = 0            if n = 1
       n/2          if n is even
       −(n − 1)/2   if n is odd and > 1
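The function f can be transcribed directly. A Python sketch, checking the first few values of the enumeration of Z:

```python
def f(n):
    """Enumerate Z from the positive integers: 0, 1, -1, 2, -2, 3, ..."""
    if n == 1:
        return 0
    if n % 2 == 0:
        return n // 2
    return -(n - 1) // 2

print([f(n) for n in range(1, 8)])   # [0, 1, -1, 2, -2, 3, -3]
```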
That is fine for “easy” sets. What about the set of, say, pairs
of natural numbers?
Z+ × Z+ = {⟨n, m⟩ : n, m ∈ Z+}
      1      2      3      4    …
1   ⟨1,1⟩  ⟨1,2⟩  ⟨1,3⟩  ⟨1,4⟩  …
2   ⟨2,1⟩  ⟨2,2⟩  ⟨2,3⟩  ⟨2,4⟩  …
3   ⟨3,1⟩  ⟨3,2⟩  ⟨3,3⟩  ⟨3,4⟩  …
4   ⟨4,1⟩  ⟨4,2⟩  ⟨4,3⟩  ⟨4,4⟩  …
⋮      ⋮      ⋮      ⋮      ⋮
such an array into a one-way list? The pattern in the array below
demonstrates one way to do this:
1    2    4    7    …
3    5    8    …
6    9    …
10   …
⋮
This pattern is called Cantor’s zig-zag method. Other patterns are
perfectly permissible, as long as they “zig-zag” through every cell
of the array. By Cantor’s zig-zag method, the enumeration for
Z+ × Z+ according to this scheme would be:
⟨1, 1⟩, ⟨1, 2⟩, ⟨2, 1⟩, ⟨1, 3⟩, ⟨2, 2⟩, ⟨3, 1⟩, ⟨1, 4⟩, ⟨2, 3⟩, ⟨3, 2⟩, ⟨4, 1⟩, …
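Cantor's zig-zag method can be transcribed as a generator that walks the array diagonal by diagonal; along each diagonal the sum n + m is constant. A Python sketch:

```python
def zigzag():
    """Cantor's zig-zag: enumerate the pairs of positive integers."""
    s = 2                      # n + m is constant along each diagonal
    while True:
        for n in range(1, s):
            yield (n, s - n)
        s += 1

gen = zigzag()
first = [next(gen) for _ in range(6)]
print(first)   # [(1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1)]
```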
How should we go about enumerating, say, the set of ordered triples of positive integers?
Z+ × Z+ × Z+ = {⟨n, m, k⟩ : n, m, k ∈ Z+}
We can think of Z+ × Z+ × Z+ as the Cartesian product of Z+ × Z+ and Z+, that is,
(Z+)³ = (Z+ × Z+) × Z+ = {⟨⟨n, m⟩, k⟩ : ⟨n, m⟩ ∈ Z+ × Z+, k ∈ Z+}
and thus we can enumerate (Z+)³ with an array by labelling one axis with the enumeration of Z+, and the other axis with the enumeration of (Z+)²:
          1        2        3        4      …
⟨1,1⟩  ⟨1,1,1⟩  ⟨1,1,2⟩  ⟨1,1,3⟩  ⟨1,1,4⟩  …
⟨1,2⟩  ⟨1,2,1⟩  ⟨1,2,2⟩  ⟨1,2,3⟩  ⟨1,2,4⟩  …
⟨2,1⟩  ⟨2,1,1⟩  ⟨2,1,2⟩  ⟨2,1,3⟩  ⟨2,1,4⟩  …
⟨1,3⟩  ⟨1,3,1⟩  ⟨1,3,2⟩  ⟨1,3,3⟩  ⟨1,3,4⟩  …
⋮         ⋮        ⋮        ⋮        ⋮
Thus, by using a method like Cantor's zig-zag method, we may similarly obtain an enumeration of (Z+)³.
      1      2      3      4    …
1   s₁(1)  s₁(2)  s₁(3)  s₁(4)  …
2   s₂(1)  s₂(2)  s₂(3)  s₂(4)  …
3   s₃(1)  s₃(2)  s₃(3)  s₃(4)  …
4   s₄(1)  s₄(2)  s₄(3)  s₄(4)  …
⋮      ⋮      ⋮      ⋮      ⋮
The labels down the side give the number of the sequence in the list s₁, s₂, …; the numbers across the top label the elements of the individual sequences. For instance, s₁(1) is a name for whatever number, a 0 or a 1, is the first element in the sequence s₁, and so on.
Now we construct an infinite sequence, s, of 0's and 1's which cannot possibly be on this list. The definition of s will depend on the list s₁, s₂, …. Any infinite list of infinite sequences of 0's and 1's gives rise to an infinite sequence s which is guaranteed to not appear on the list.
To define s, we specify what all its elements are, i.e., we specify s(n) for all n ∈ Z+. We do this by reading down the diagonal of the array above (hence the name "diagonal method") and then changing every 1 to a 0 and every 0 to a 1. More abstractly, we define s(n) to be 0 or 1 according to whether the n-th element of the diagonal, sₙ(n), is 1 or 0, i.e., s(n) = 1 − sₙ(n).
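The diagonal construction is completely mechanical. In the Python sketch below the list of sequences is a made-up example (any list would do); the point is only that the diagonal sequence s is guaranteed to differ from the k-th listed sequence at position k:

```python
# A hypothetical list of 0/1-sequences, each given as a function of n.
# These particular sequences are just for illustration.
seqs = [lambda n, i=i: (n // (i + 1)) % 2 for i in range(10)]

def s(n):
    """The diagonal sequence: flip the n-th element of the n-th sequence."""
    return 1 - seqs[n - 1](n)

# s differs from the k-th listed sequence at position k, for every k:
print(all(s(k) != seqs[k - 1](k) for k in range(1, 11)))   # True
```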
Proof. We proceed in the same way, by showing that for every list
of subsets of Z+ there is a subset of Z+ which cannot be on the
list. Suppose the following is a given list of subsets of Z+ :
Z₁, Z₂, Z₃, …
We now define a set Z such that for any n ∈ Z+, n ∈ Z iff n ∉ Zₙ:
Z = {n ∈ Z+ : n ∉ Zₙ}
Z is clearly a set of positive integers, since by assumption each Zₙ is, and thus Z ∈ ℘(Z+). But Z cannot be on the list. To show this, we'll establish that for each k ∈ Z+, Z ≠ Zₖ.
So let k ∈ Z+ be arbitrary. We've defined Z so that for any n ∈ Z+, n ∈ Z iff n ∉ Zₙ. In particular, taking n = k, k ∈ Z iff k ∉ Zₖ. But this shows that Z ≠ Zₖ, since k is an element of one but not the other, and so Z and Zₖ have different elements. Since k was arbitrary, Z is not on the list Z₁, Z₂, …
The preceding proof did not mention a diagonal, but you
can think of it as involving a diagonal if you picture it this way:
Imagine the sets Z₁, Z₂, …, written in an array, where each element j ∈ Zᵢ is listed in the j-th column. Say the first four sets on that list are {1, 2, 3, …}, {2, 4, 6, …}, {1, 2, 5}, and {3, 4, 5, …}.
Then the array would begin with
Z₁ = {1, 2, 3, 4, 5, 6, …}
Z₂ = {   2,    4,    6, …}
Z₃ = {1, 2,       5      }
Z₄ = {      3, 4, 5, 6, …}
⋮
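The same construction can be carried out concretely on the four example sets above. In this illustrative Python sketch, each set is represented by its membership test:

```python
# The first four sets from the example, as membership tests on Z+:
Z1 = lambda n: True              # {1, 2, 3, ...}
Z2 = lambda n: n % 2 == 0        # {2, 4, 6, ...}
Z3 = lambda n: n in (1, 2, 5)    # {1, 2, 5}
Z4 = lambda n: n >= 3            # {3, 4, 5, ...}
Zs = [Z1, Z2, Z3, Z4]

# Z = {n : n not in Z_n}, decided by flipping the "diagonal":
Z = lambda n: not Zs[n - 1](n)

print([Z(n) for n in range(1, 5)])   # [False, False, True, False]
```

So 3 ∈ Z (since 3 ∉ Z₃), but 1, 2, 4 ∉ Z, and Z disagrees with each Zₙ about n.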
4.4 Reduction
We showed ℘(Z+ ) to be uncountable by a diagonalization argu-
ment. We already had a proof that Bω , the set of all infinite
sequences of 0s and 1s, is uncountable. Here’s another way we
can prove that ℘(Z+ ) is uncountable: Show that if ℘(Z+ ) is count-
able then Bω is also countable. Since we know Bω is not count-
able, ℘(Z+ ) can’t be either. This is called reducing one problem
to another—in this case, we reduce the problem of enumerat-
ing Bω to the problem of enumerating ℘(Z+ ). A solution to the
latter—an enumeration of ℘(Z+ )—would yield a solution to the
former—an enumeration of Bω .
How do we reduce the problem of enumerating a set Y to
that of enumerating a set X ? We provide a way of turning an
enumeration of X into an enumeration of Y . The easiest way to
do that is to define a surjective function f : X → Y . If x 1 , x 2 , . . .
enumerates X , then f (x 1 ), f (x 2 ), . . . would enumerate Y . In our
case, we are looking for a surjective function f : ℘(Z+ ) → Bω .
f(Z₁), f(Z₂), f(Z₃), …
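One natural such surjection sends each set to its characteristic sequence: the n-th element is 1 if n is in the set and 0 otherwise. A Python sketch on finite prefixes (the name char_seq is our own, for illustration):

```python
def char_seq(Z, length):
    """Prefix of the characteristic sequence of Z ⊆ Z+:
    the n-th entry is 1 if n ∈ Z and 0 otherwise."""
    return [1 if n in Z else 0 for n in range(1, length + 1)]

# Every 0/1-sequence arises from some set this way, so the map is surjective:
print(char_seq({2, 4, 6}, 6))   # [0, 1, 0, 1, 0, 1]
print(char_seq(set(), 4))       # [0, 0, 0, 0]
```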
Y = {x ∈ X : x ∉ g(x)}.
Summary
The size of a set X can be measured by a natural number if
the set is finite, and sizes can be compared by comparing these
numbers. If sets are infinite, things are more complicated. The
first level of infinity is that of countably infinite sets. A set X is
countable if its elements can be arranged in an enumeration, a
one-way infinite, possibly gappy list, i.e., when there is a surjective function f : Z+ ⇸ X. It is countably infinite if it is countable but not finite. Cantor's zig-zag method shows that the set of pairs of elements of a countably infinite set is also countable; and this
can be used to show that even the set of rational numbers Q is
countable.
There are, however, infinite sets that are not countable: these
sets are called uncountable. There are two ways of showing that
a set is uncountable: directly, using a diagonal argument, or
by reduction. To give a diagonal argument, we assume that the
set X in question is countable, and use a hypothetical enumera-
tion to define an element of X which, by the very way we define
it, is guaranteed to be different from every element in the enu-
meration. So the enumeration can’t be an enumeration of all
of X after all, and we’ve shown that no enumeration of X can
exist. A reduction shows that X is uncountable by associating
every element of X with an element of some known uncountable
set Y in a surjective way. If this is possible, then a hypothetical enumeration of X would yield an enumeration of Y. Since Y is uncountable, no enumeration of X can exist.
In general, infinite sets can be compared sizewise: X and
Y are the same size, or equinumerous, if there is a bijection
between them. We can also define that X is no larger than Y
(|X | ≤ |Y |) if there is an injective function from X to Y . By
the Schröder-Bernstein Theorem, this in fact provides a sizewise
order of infinite sets. Finally, Cantor’s theorem says that for
any X , |X | < |℘(X )|. This is a generalization of our result that
℘(Z+ ) is uncountable, and shows that there are not just two, but
infinitely many levels of infinity.
Problems
Problem 4.1. According to Definition 4.4, a set X is enumerable iff X = ∅ or there is a surjective f : Z+ → X. It is also possible to define "enumerable set" precisely by: a set is enumerable iff there is an injective function g : X → Z+. Show that the definitions are equivalent, i.e., show that there is an injective function g : X → Z+ iff either X = ∅ or there is a surjective f : Z+ → X.
Problem 4.9. Show that the set of all finite subsets of an arbitrary
infinite enumerable set is enumerable.
Problem 4.15. Show that the set of all sets of pairs of positive
integers is uncountable by a reduction argument.
Problem 4.17. Let P be the set of functions from the set of posi-
tive integers to the set {0}, and let Q be the set of partial functions
from the set of positive integers to the set {0}. Show that P is
countable and Q is not. (Hint: reduce the problem of enumerat-
ing Bω to enumerating Q ).
Problem 4.19. Show that the set R of all real numbers is un-
countable.
PART II
First-order Logic

CHAPTER 5
Syntax and Semantics
5.1 Introduction
In order to develop the theory and metatheory of first-order logic,
we must first define the syntax and semantics of its expressions.
The expressions of first-order logic are terms and formulas. Terms
are formed from variables, constant symbols, and function sym-
bols. Formulas, in turn, are formed from predicate symbols to-
gether with terms (these form the smallest, “atomic” formulas),
and then from atomic formulas we can form more complex ones
using logical connectives and quantifiers. There are many dif-
ferent ways to set down the formation rules; we give just one
possible one. Other systems will choose different symbols, will select different sets of connectives as primitive, will use parentheses
differently (or even not at all, as in the case of so-called Polish
notation). What all approaches have in common, though, is that
the formation rules define the set of terms and formulas induc-
tively. If done properly, every expression can result essentially
in only one way according to the formation rules. The induc-
tive definition resulting in expressions that are uniquely readable
means we can give meanings to these expressions using the same
method—inductive definition.
1. Logical symbols
1. ⊥ is an atomic formula.
2. A ↔ B abbreviates (A → B) ∧ (B → A).
these are just conventional abbreviations for A²₀(t₁, t₂), f²₀(t₁, t₂), A²₀(t₁, t₂) and f¹₀(t), respectively.
1. We take D to be A and D → D to be B.
2. We take A to be D → D and B to be D.
Proof. Exercise.
1. A ≡ ⊥.
Proof. Exercise.
1. A is atomic.
3. A is of the form (B ∧ C ).
4. A is of the form (B ∨ C ).
5. A is of the form (B → C ).
6. A is of the form ∀x B.
7. A is of the form ∃x B.
5.6 Subformulas
It is often useful to talk about the formulas that “make up” a
given formula. We call these its subformulas. Any formula counts
as a subformula of itself; a subformula of A other than A itself is
a proper subformula.
B is the scope of the first ∀v₀, C is the scope of ∃v₁, and D is the scope of the second ∀v₀. The first ∀v₀ binds the occurrences of v₀ in B, ∃v₁ the occurrence of v₁ in C, and the second ∀v₀ binds the occurrence of v₀ in D. The first occurrence of v₁ and the fourth occurrence of v₀ are free in A. The last occurrence of v₀ is free in D, but bound in C and A.
5.8 Substitution
1. s ≡ c : s [t/x] is just s .
3. s ≡ x: s [t/x] is t .
1. |N| = N
2. 0ᴺ = 0
Valᴹ(t) = fᴹ(Valᴹ(t₁), …, Valᴹ(tₙ)).
1. t ≡ c: Valᴹₛ(t) = cᴹ.
2. t ≡ x: Valᴹₛ(t) = s(x).
3. t ≡ f(t₁, …, tₙ): Valᴹₛ(t) = fᴹ(Valᴹₛ(t₁), …, Valᴹₛ(tₙ)).
1. A ≡ ⊥: not M, s |= A.
5. A ≡ (B ∧ C ): M, s |= A iff M, s |= B and M, s |= C .
Valᴹₛ₁(f(t₁, …, tₖ)) = fᴹ(Valᴹₛ₁(t₁), …, Valᴹₛ₁(tₖ))
= fᴹ(Valᴹₛ₂(t₁), …, Valᴹₛ₂(tₖ))
= Valᴹₛ₂(f(t₁, …, tₖ)) = Valᴹₛ₂(t).
2. A ≡ B ∧ C : exercise.
3. A ≡ B ∨ C: if M, s₁ |= A, then M, s₁ |= B or M, s₁ |= C. By induction hypothesis, M, s₂ |= B or M, s₂ |= C, so M, s₂ |= A.
4. A ≡ B → C : exercise.
6. A ≡ ∀x B: exercise.
Proof. Exercise.
5.12 Extensionality
Extensionality, sometimes called relevance, can be expressed in-
formally as follows: the only thing that bears upon the satisfaction
of formula A in a structure M relative to a variable assignment s ,
are the assignments made by M and s to the elements of the
language that actually appear in A.
One immediate consequence of extensionality is that if two structures M and M′ agree on all the elements of the language appearing in a sentence A and have the same domain, then M and M′ must also agree on A itself.
Proof. By induction on t .
Valᴹₛ(t[t′/x]) = Valᴹₛ(f(t₁[t′/x], …, tₙ[t′/x]))    by definition of t[t′/x]
= fᴹ(Valᴹₛ(t₁[t′/x]), …, Valᴹₛ(tₙ[t′/x]))    by definition of Valᴹₛ(f(…))
= fᴹ(Valᴹₛ′(t₁), …, Valᴹₛ′(tₙ))    by induction hypothesis
= Valᴹₛ′(t)    by definition of Valᴹₛ′(f(…))
Proof. Exercise.
Summary
A first-order language consists of constant, function, and pred-
icate symbols. Function and constant symbols take a specified
number of arguments. In the language of arithmetic, e.g., we
Problems
Problem 5.1. Prove Lemma 5.10.
1. |M| = {1, 2, 3}
2. cᴹ = 3
CHAPTER 6
Theories and Their Models
6.1 Introduction
The development of the axiomatic method is a significant achieve-
ment in the history of science, and is of special importance in the
history of mathematics. An axiomatic development of a field in-
volves the clarification of many questions: What is the field about?
What are the most fundamental concepts? How are they related?
Can all the concepts of the field be defined in terms of these
fundamental concepts? What laws do, and must, these concepts
obey?
The axiomatic method and logic were made for each other.
Formal logic provides the tools for formulating axiomatic theo-
ries, for proving theorems from the axioms of the theory in a
precisely specified way, for studying the properties of all systems
satisfying the axioms in a systematic way.
{ ∀x x ≤ x,
∀x ∀y ((x ≤ y ∧ y ≤ x) → x = y),
∀x ∀y ∀z ((x ≤ y ∧ y ≤ z ) → x ≤ z ) }
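For a finite structure, whether these three axioms hold can be checked by brute force over the domain. The sketch below is an illustration, not from the text; the domain and the two relations are made up for the example.

```python
from itertools import product

def is_partial_order(dom, leq):
    """Check reflexivity, anti-symmetry, and transitivity of leq on dom."""
    refl = all(leq(x, x) for x in dom)
    antisym = all(x == y for x, y in product(dom, repeat=2)
                  if leq(x, y) and leq(y, x))
    trans = all(leq(x, z) for x, y, z in product(dom, repeat=3)
                if leq(x, y) and leq(y, z))
    return refl and antisym and trans

dom = {1, 2, 3, 6}
divides = lambda x, y: y % x == 0
assert is_partial_order(dom, divides)                  # divisibility qualifies
assert not is_partial_order(dom, lambda x, y: x < y)   # < fails reflexivity
```

The failing case shows why the strict order < needs the different axioms listed next.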
∀x ¬x < x,
∀x ∀y ((x < y ∨ y < x) ∨ x = y),
∀x ∀y ∀z ((x < y ∧ y < z ) → x < z )
∀x (x · 1) = x
∀x ∀y ∀z (x · (y · z )) = ((x · y) · z )
∀x ∃y (x · y) = 1
¬∃x x′ = 0
∀x ∀y (x′ = y′ → x = y)
∀x ∀y (x < y ↔ ∃z (x + z′) = y)
∀x (x + 0) = x
∀x ∀y (x + y′) = (x + y)′
∀x (x × 0) = 0
∀x ∀y (x × y′) = ((x × y) + x)
Since there are infinitely many sentences of the latter form, this
axiom system is infinite. The latter form is called the induction
schema. (Actually, the induction schema is a bit more complicated
than we let on here.)
The third axiom is an explicit definition of <.
∃x ¬∃y y ∈ x
∀x ∀y (∀z (z ∈ x ↔ z ∈ y) → x = y)
∀x ∀y ∃z ∀u (u ∈ z ↔ (u = x ∨ u = y))
∀x ∃y ∀z (z ∈ y ↔ ∃u (z ∈ u ∧ u ∈ x))
∃x ∀y (y ∈ x ↔ A(y))
The first axiom says that there is a set with no elements (i.e., ∅
exists); the second says that sets are extensional; the third that
for any sets X and Y , the set {X,Y } exists; the fourth that for
any sets X and Y , the set X ∪ Y exists.
The sentences mentioned last are collectively called the naive
comprehension scheme. It essentially says that for every A(x), the set
{x : A(x)} exists—so at first glance a true, useful, and perhaps
even necessary axiom. It is called “naive” because, as it turns out,
it makes this theory unsatisfiable: if you take A(y) to be ¬y ∈ y,
you get the sentence
∃x ∀y (y ∈ x ↔ ¬y ∈ y)
∀x P (x, x),
∀x ∀y ((P (x, y) ∧ P (y, x)) → x = y),
∀x ∀y ∀z ((P (x, y) ∧ P (y, z )) → P (x, z )),
∀z (z ∈ x → z ∈ y)
∃x (¬∃y y ∈ x ∧ ∀z x ⊆ z )
∀u ((u ∈ x ∨ u ∈ y) ↔ u ∈ z )
∀u (u ⊆ x ↔ u ∈ y)
since the elements of X ∪ Y are exactly the sets that are either
elements of X or elements of Y , and the elements of ℘(X ) are
exactly the subsets of X . However, this doesn’t allow us to use
x ∪ y or ℘(x) as if they were terms: we can only use the entire
formulas that define the relations X ∪ Y = Z and ℘(X ) = Y . In
fact, we do not know that these relations are ever satisfied, i.e.,
we do not know that unions and power sets always exist. For
instance, the sentence ∀x ∃y ℘(x) = y is another axiom of ZFC
(the power set axiom).
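The point that X ∪ Y and ℘(X) are definable relations rather than terms can be made concrete with hereditarily finite sets. The following sketch (an illustration, not from the text) models sets as Python frozensets and checks the defining conditions directly.

```python
from itertools import combinations

def is_union(x, y, z):
    # ∀u ((u ∈ x ∨ u ∈ y) ↔ u ∈ z); with frozensets this is just z = x ∪ y
    return z == x | y

def powerset(x):
    return frozenset(frozenset(c) for r in range(len(x) + 1)
                     for c in combinations(x, r))

def is_powerset(x, y):
    # ∀u (u ⊆ x ↔ u ∈ y): members of y are exactly the subsets of x
    return y == powerset(x)

a, b = frozenset({1, 2}), frozenset({2, 3})
assert is_union(a, b, frozenset({1, 2, 3}))
assert is_powerset(a, frozenset({frozenset(), frozenset({1}),
                                 frozenset({2}), frozenset({1, 2})}))
```

Note that `is_union` and `is_powerset` test a relation among given sets; nothing here produces the union or power set from the axioms, mirroring the situation in the formal theory.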
Now what about talk of ordered pairs or functions? Here we
have to explain how we can think of ordered pairs and functions
as special kinds of sets. One way to define the ordered pair ⟨x, y⟩
is as the set {{x}, {x, y}}. But as before, we cannot introduce
a function symbol that names this set; we can only define the
relation ⟨x, y⟩ = z, i.e., {{x}, {x, y}} = z:
∀u (u ∈ z ↔ (∀v (v ∈ u ↔ v = x) ∨ ∀v (v ∈ u ↔ (v = x ∨ v = y))))
This says that the elements u of z are exactly those sets which
either have x as their only element or have x and y as their only
elements (in other words, those sets that are either identical to
{x} or identical to {x, y}). Once we have this, we can say further
things, e.g., that X × Y = Z:
∀z (z ∈ Z ↔ ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = z))
∀u (u ∈ f → ∃x ∃y (x ∈ X ∧ y ∈ Y ∧ ⟨x, y⟩ = u)) ∧
∀x (x ∈ X → (∃y (y ∈ Y ∧ maps(f, x, y)) ∧
(∀y ∀y′ ((maps(f, x, y) ∧ maps(f, x, y′)) → y = y′)))

f: X → Y ∧ ∀x ∀x′ ((x ∈ X ∧ x′ ∈ X ∧
∃y (maps(f, x, y) ∧ maps(f, x′, y))) → x = x′)
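The Kuratowski coding of pairs can be checked mechanically: what matters is that ⟨a, b⟩ = ⟨c, d⟩ holds iff a = c and b = d. Below is a sketch (not from the text; sets again modeled as frozensets) verifying this property over a small universe, including the edge case where the two components coincide.

```python
from itertools import product

def kpair(x, y):
    """The Kuratowski pair <x, y> = {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

items = [0, 1, 2]
for a, b, c, d in product(items, repeat=4):
    # <a, b> = <c, d> iff a = c and b = d
    assert (kpair(a, b) == kpair(c, d)) == (a == c and b == d)
```

Note that when x = y the pair collapses to {{x}}, which is why the defining formula above uses a disjunction of the two membership conditions.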
One might think that set theory requires another axiom that
guarantees the existence of a set for every defining property. If
A(x) is a formula of set theory with the variable x free, we can
consider the sentence
∃y ∀x (x ∈ y ↔ A(x)).
This sentence states that there is a set y whose elements are all
and only those x that satisfy A(x). This schema is called the
“comprehension principle.” It looks very useful; unfortunately
it is inconsistent. Take A(x) ≡ ¬x ∈ x, then the comprehension
principle states
∃y ∀x (x ∈ y ↔ x ∉ x),
i.e., it states the existence of a set of all sets that are not elements
of themselves. No such set can exist—this is Russell’s Paradox.
ZFC, in fact, contains a restricted—and consistent—version of
this principle, the separation principle:
∀z ∃y ∀x (x ∈ y ↔ (x ∈ z ∧ A(x))).
A≥n ≡ ∃x₁ ∃x₂ … ∃xₙ (x₁ ≠ x₂ ∧ x₁ ≠ x₃ ∧ x₁ ≠ x₄ ∧ ⋯ ∧ x₁ ≠ xₙ ∧
                     x₂ ≠ x₃ ∧ x₂ ≠ x₄ ∧ ⋯ ∧ x₂ ≠ xₙ ∧
                     ⋮
                     xₙ₋₁ ≠ xₙ)

A=n ≡ ∃x₁ ∃x₂ … ∃xₙ (x₁ ≠ x₂ ∧ x₁ ≠ x₃ ∧ x₁ ≠ x₄ ∧ ⋯ ∧ x₁ ≠ xₙ ∧
                     x₂ ≠ x₃ ∧ x₂ ≠ x₄ ∧ ⋯ ∧ x₂ ≠ xₙ ∧
                     ⋮
                     xₙ₋₁ ≠ xₙ ∧
                     ∀y (y = x₁ ∨ ⋯ ∨ y = xₙ))
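Since A≥n follows a fixed pattern, it can be generated mechanically for any n. The sketch below (an illustration, not from the text) builds the sentence as a string; it also makes visible that the number of inequations is n(n−1)/2.

```python
def a_geq(n):
    """The sentence A>=n: there exist n pairwise distinct objects."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    quants = " ".join(f"∃{v}" for v in xs)
    diffs = " ∧ ".join(f"{xs[i]} ≠ {xs[j]}"
                       for i in range(n) for j in range(i + 1, n))
    return f"{quants} ({diffs})"

print(a_geq(3))  # ∃x1 ∃x2 ∃x3 (x1 ≠ x2 ∧ x1 ≠ x3 ∧ x2 ≠ x3)
assert a_geq(4).count("≠") == 4 * 3 // 2
```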
Summary
Sets of sentences in a sense describe the structures in which they
are jointly true; these structures are their models. Conversely, if
we start with a structure or set of structures, we might be inter-
ested in the set of sentences they are models of, this is the theory
of the structure or set of structures. Any such set of sentences has
the property that every sentence entailed by them is already in
the set; they are closed. More generally, we call a set Γ a theory
if it is closed under entailment, and say Γ is axiomatized by ∆
if Γ consists of all sentences entailed by ∆.
Mathematics yields many examples of theories, e.g., the the-
ories of linear orders, of groups, or theories of arithmetic, e.g.,
the theory axiomatized by Peano’s axioms. But there are many
examples of important theories in other disciplines as well, e.g.,
relational databases may be thought of as theories, and meta-
physics concerns itself with theories of parthood which can be
axiomatized.
One significant question when setting up a theory for study is
whether its language is expressive enough to allow us to formu-
late everything we want the theory to talk about, and another is
whether it is strong enough to prove what we want it to prove. To
express a relation we need a formula with the requisite number
of free variables. In set theory, we only have ∈ as a relation sym-
bol, but it allows us to express x ⊆ y using ∀u (u ∈ x → u ∈ y).
Zermelo-Fraenkel set theory ZFC, in fact, is strong enough to
both express (almost) every mathematical claim and to (almost)
prove every mathematical theorem using a handful of axioms and
a chain of increasingly complicated definitions such as that of ⊆.
Problems
Problem 6.1. Find formulas in LA which define the following
relations:
1. n is between i and j ;
1. the inverse R −1 of R;
1. {0} is definable in N;
2. {1} is definable in N;
3. {2} is definable in N;
∃y ∀x (x ∈ y ↔ x ∉ x) ⊢ ⊥.
Natural Deduction
7.1 Introduction
Logical systems commonly have not just a semantics, but also
proof systems. The purpose of proof systems is to provide a
purely syntactic method of establishing entailment and validity.
They are purely syntactic in the sense that a derivation in such
a system is a finite syntactic object, usually a sequence (or other
finite arrangement) of formulas. Moreover, good proof systems
have the property that any given sequence or arrangement of for-
mulas can be verified mechanically to be a “correct” proof. The
simplest (and historically first) proof systems for first-order logic
were axiomatic. A sequence of formulas counts as a derivation
in such a system if each individual formula in it is either among
a fixed set of “axioms” or follows from formulas coming before it
in the sequence by one of a fixed number of “inference rules”—
and it can be mechanically verified if a formula is an axiom and
whether it follows correctly from other formulas by one of the in-
ference rules. Axiomatic proof systems are easy to describe—and
also easy to handle meta-theoretically—but derivations in them
are hard to read and understand, and are also hard to produce.
The rules for natural deduction are divided into two main
types: propositional rules (quantifier-free) and quantifier rules. The
rules come in pairs, an introduction and an elimination rule for
each connective and quantifier.
Propositional Rules

Rules for ⊥

    A    ¬A               ⊥
    ───────  ⊥Intro      ───  ⊥Elim
       ⊥                  A

Rules for ∧

    A    B             A ∧ B            A ∧ B
    ─────  ∧Intro      ─────  ∧Elim     ─────  ∧Elim
    A ∧ B                A                 B
Rules for ∨

      A                  B
    ─────  ∨Intro      ─────  ∨Intro
    A ∨ B              A ∨ B

              [A]ⁿ    [B]ⁿ
               ⋮        ⋮
    A ∨ B      C        C
    ──────────────────────  n ∨Elim
               C
Rules for ¬

    [A]ⁿ
     ⋮
     ⊥                 ¬¬A
    ────  n ¬Intro     ───  ¬Elim
    ¬A                  A
Rules for →

    [A]ⁿ
     ⋮                   A    A → B
     B                   ──────────  → Elim
    ─────  n → Intro         B
    A → B
Quantifier Rules

Rules for ∀

     A(a)                ∀x A(x)
    ───────  ∀Intro      ───────  ∀Elim
    ∀x A(x)               A(t)
where t is a ground term (a term that does not contain any vari-
ables), and a is a constant symbol which does not occur in A, or
in any assumption which is undischarged in the derivation end-
ing with the premise A. We call a the eigenvariable of the ∀Intro
inference.
Rules for ∃

                        [A(a)]ⁿ
                           ⋮
     A(t)               ∃x A(x)    C
    ───────  ∃Intro     ────────────  n ∃Elim
    ∃x A(x)                  C
where t is a ground term, and a is a constant which does not
occur in the premise ∃x A(x), in C , or any assumption which is
undischarged in the derivations ending with the two premises C
(other than the assumptions A(a)). We call a the eigenvariable of
the ∃Elim inference.
The condition that an eigenvariable not occur in the upper
sequent of the ∀ intro or ∃ elim inference is called the eigenvariable
condition.
We use the term "eigenvariable" even though a in the above
rules is a constant symbol. This is for historical reasons.
    [A ∧ B]¹
       ⋮
       A
    ───────────  1 → Intro
    (A ∧ B) → A
    [A ∧ B]¹
    ────────  ∧Elim
        A
    ───────────  1 → Intro
    (A ∧ B) → A
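Because derivations are finite syntactic objects, checking them is mechanical, as claimed in the introduction. Here is a sketch (not from the text) of a checker for a tiny fragment with just ∧Elim and → Intro; the tuple representation of formulas and derivation nodes is an assumption for the example.

```python
def check(node, open_assumptions):
    """A node is (formula, rule, subderivations); atoms are strings."""
    formula, rule, subs = node
    if rule == "assume":
        return formula in open_assumptions
    if rule == "andE":                     # from A ∧ B infer A (or B)
        (prem,) = subs
        return (check(prem, open_assumptions)
                and prem[0][0] == "and" and formula in prem[0][1:])
    if rule == "impI":                     # discharge A, conclude A → B
        (prem,) = subs
        a, b = formula[1], formula[2]
        return (formula[0] == "imp" and prem[0] == b
                and check(prem, open_assumptions | {a}))
    return False

A, B = "A", "B"
conj = ("and", A, B)
# The derivation of (A ∧ B) → A shown above:
deriv = (("imp", conj, A), "impI",
         [(A, "andE", [(conj, "assume", [])])])
assert check(deriv, frozenset())
```

The discharge of assumption 1 corresponds to adding the conjunction to `open_assumptions` while checking the subderivation.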
    [¬A ∨ B]¹
        ⋮
      A → B
    ──────────────────  1 → Intro
    (¬A ∨ B) → (A → B)
B B
3 → Intro 4 → Intro
[¬A ∨ B]1 A→B A→B
2 ∨Elim
A→B
1 → Intro
(¬A ∨ B) → (A → B)
For the two missing parts of the derivation, we need deriva-
tions of B from ¬A and A in the middle, and from A and B on the
left. Let’s take the former first. ¬A and A are the two premises of
⊥Intro:
[¬A]2 [A]3
⊥ ⊥Intro
B
By using ⊥Elim, we can obtain B as a conclusion and complete
the branch.
[B]2, [A]4
[¬A]2 [A]3
⊥ ⊥Intro
⊥Elim
B B
3 → Intro 4 → Intro
[¬A ∨ B]1 A→B A→B
2 ∨Elim
A→B
1 → Intro
(¬A ∨ B) → (A → B)
[¬A]2 [A]3
⊥ ⊥Intro
B
⊥Elim [B]2
3 → Intro → Intro
[¬A ∨ B]1 A→B A→B
2 ∨Elim
A→B
1 → Intro
(¬A ∨ B) → (A → B)
[∃x ¬A(x)]1
¬∀x A(x)
→ Intro
∃x ¬A(x) → ¬∀x A(x)
[¬A(a)]2
3
⊥
¬Intro
[∃x ¬A(x)]1 ¬∀x A(x)
2 ∃Elim
¬∀x A(x)
→ Intro
∃x ¬A(x) → ¬∀x A(x)
[∀x A(x)]3
∀Elim
[¬A(a)]2 A(a)
⊥ ⊥Intro
1 3 ¬Intro
[∃x ¬A(x)] ¬∀x A(x)
2 ∃Elim
¬∀x A(x)
→ Intro
∃x ¬A(x) → ¬∀x A(x)
∃x C (x, b)
We have two premises to work with. To use the first, i.e., try
to find a derivation of ∃x C (x, b) from ∃x (A(x) ∧ B(x)) we would
use the ∃Elim rule. Since it has an eigenvariable condition, we
will apply that rule first. We get the following:
[A(a) ∧ B(a)]1
[A(a) ∧ B(a)]1
∧Elim
B(a)
¬∀x A(x)
1
⊥
¬Intro
¬∀x A(x)
So far so good. We can use ∀Elim but it’s not obvious if that will
help us get to our goal. Instead, let’s use one of our assumptions.
∀x A(x) → ∃y B(y) together with ∀x A(x) will allow us to use the
→ Elim rule.
1
⊥
¬Intro
¬∀x A(x)
We now have one final assumption to work with, and it looks like
this will help us reach a contradiction by using ⊥Intro.
A(x) ∨ ¬A(x)
[¬(A(x) ∨ ¬A(x))]1
1
⊥
¬Intro
¬¬(A(x) ∨ ¬A(x))
¬Elim
A(x) ∨ ¬A(x)
[A(x)]2
∨Intro
[¬(A(x) ∨ ¬A(x))]1 A(x) ∨ ¬A(x)
⊥ ⊥Intro
2 ¬Intro
¬A(x)
1
⊥
¬Intro
¬¬(A(x) ∨ ¬A(x))
¬Elim
A(x) ∨ ¬A(x)
[A(x)]2
∨Intro
[¬(A(x) ∨ ¬A(x))]1 A(x) ∨ ¬A(x)
⊥ ⊥Intro
2 ¬Intro
¬A(x)
∨Intro
A(x) ∨ ¬A(x) [¬(A(x) ∨ ¬A(x))]1
1 ¬Intro
¬¬(A(x) ∨ ¬A(x))
¬Elim
A(x) ∨ ¬A(x)
Proof. Exercise.
Proof. Exercise.
1
⊥
¬Intro
¬A
[A]1 [¬A]2
δ1 δ2
1
⊥ 2
⊥
¬Intro ¬Intro
¬A ¬¬A
⊥ ¬Elim
Proof. Exercise.
δ
A
∨Intro
A∨B
Therefore Γ ` A ∨ B. The proof for when Γ ` B is similar.
Proof. Exercise
Proof. Exercise.
Proof. Exercise.
Proof. Exercise.
δ
A(t )
∃Intro
∃x A(x)
δ
∀x A(x)
∀Elim
A(t )
7.6 Soundness
A derivation system, such as natural deduction, is sound if it
cannot derive things that do not actually follow. Soundness is
thus a kind of guaranteed safety property for derivation systems.
Depending on which proof theoretic property is in question, we
would like to know for instance, that
Rules for =:

    ─────  = Intro
    t = t

    t₁ = t₂    A(t₁)                  t₁ = t₂    A(t₂)
    ────────────────  = Elim   and    ────────────────  = Elim
          A(t₂)                             A(t₁)
where t1 and t2 are closed terms. The = Intro rule allows us to
derive any identity statement of the form t = t outright, from no
assumptions.
    A(s)    s = t
    ─────────────  = Elim
         A(t)
This may be familiar as the “principle of substitutability of iden-
ticals,” or Leibniz’ Law.
∀x ∀y ((A(x) ∧ A(y)) → x = y)
∃x ∀y (A(y) → y = x)
a =b
1 → Intro
((A(a) ∧ A(b)) → a = b)
∀Intro
∀y ((A(a) ∧ A(y)) → a = y)
∀Intro
∀x ∀y ((A(x) ∧ A(y)) → x = y)
∃x ∀y (A(y) → y = x) a =b
2 ∃Elim
a =b
1 → Intro
((A(a) ∧ A(b)) → a = b)
∀Intro
∀y ((A(a) ∧ A(y)) → a = y)
∀Intro
∀x ∀y ((A(x) ∧ A(y)) → x = y)
Summary
Proof systems provide purely syntactic methods for characteriz-
ing consequence and compatibility between sentences. Natural
deduction is one such proof system. A derivation in it con-
sists of a tree of formulas. The topmost formulas of a derivation are
assumptions. All other formulas, for the derivation to be cor-
rect, must be correctly justified by one of a number of inference
rules. These come in pairs; an introduction and an elimination
rule for each connective and quantifier. For instance, if a for-
mula A is justified by a → Elim rule, the preceding formulas (the
premises) must be B → A and B, for some formula B.
Problems
Problem 7.1. Give derivations of the following formulas:
1. ¬(A → B) → (A ∧ ¬B)
The Completeness Theorem
8.1 Introduction
The completeness theorem is one of the most fundamental re-
sults about logic. It comes in two formulations, the equivalence
of which we’ll prove. In its first formulation it says something fun-
damental about the relationship between semantic consequence
and our proof system: if a sentence A follows from some sen-
tences Γ, then there is also a derivation that establishes Γ ⊢ A.
Thus, the proof system is as strong as it can possibly be without
proving things that don’t actually follow. In its second formula-
tion, it can be stated as a model existence result: every consistent
set of sentences is satisfiable.
These aren’t the only reasons the completeness theorem—or
rather, its proof—is important. It has a number of important con-
sequences, some of which we’ll discuss separately. For instance,
since any derivation that shows Γ ⊢ A is finite and so can only
use finitely many of the sentences in Γ.
1. Γ is consistent, and
2. if Γ ⊊ Γ′, then Γ′ is inconsistent.
1. Γ is consistent, and
1. If Γ ` A, then A ∈ Γ.
4. (A ∨ B) ∈ Γ iff either A ∈ Γ or B ∈ Γ.
1. If Γ ` A, then A ∈ Γ.
Suppose that Γ ⊢ A. Suppose to the contrary that A ∉ Γ:
then since Γ is maximally consistent, Γ ∪ {A} is inconsis-
tent, i.e., Γ ∪ {A} ⊢ ⊥. Since Γ ⊢ A, it follows that Γ ⊢ ⊥
as well, contradicting the consistency of Γ. Hence A ∈ Γ.
3. Exercise.
4. (A ∨ B) ∈ Γ iff either A ∈ Γ or B ∈ Γ.
For the contrapositive of the forward direction, suppose
that A ∉ Γ and B ∉ Γ. We want to show that (A ∨ B) ∉ Γ.
Since Γ is maximally consistent, Γ ∪ {A} ⊢ ⊥ and Γ ∪ {B} ⊢
⊥. By Proposition 7.20, Γ ∪ {(A ∨ B)} is inconsistent. Hence,
(A ∨ B) ∉ Γ, as required.
For the reverse direction, suppose that A ∈ Γ or B ∈ Γ.
Then Γ ⊢ A or Γ ⊢ B. By Proposition 7.21, Γ ⊢ A ∨ B. By
(1), (A ∨ B) ∈ Γ, as required.
5. Exercise.
and adding, for each formula A(x) with one free variable x, a for-
mula of the form ∃x A(x) → A(c), where c is one of the new constant
symbols. When we construct the structure satisfying Γ, this will
guarantee that each true existential sentence has a witness among
the new constants.
Γₙ₊₁ = Γₙ ∪ {Aₙ} if Γₙ ∪ {Aₙ} is consistent, and
Γₙ₊₁ = Γₙ ∪ {¬Aₙ} otherwise.
Let Γ* = ⋃_{n≥0} Γₙ. Since Γ₀ ⊆ Γ*, for each formula A, Γ*
contains a formula of the form ∃x A(x) → A(c) and thus is saturated.
Each Γₙ is consistent: Γ₀ is consistent by definition. If Γₙ₊₁ =
Γₙ ∪ {Aₙ}, this is because the latter is consistent. If it isn't, Γₙ₊₁ =
Γₙ ∪ {¬Aₙ}, which must be consistent. If it weren't, i.e., both
Γₙ ∪ {Aₙ} and Γₙ ∪ {¬Aₙ} are inconsistent, then Γₙ ⊢ ¬Aₙ and Γₙ ⊢ Aₙ,
so Γₙ would be inconsistent, contrary to the induction hypothesis.
Every formula of Frm(L′) appears on the list used to de-
fine Γ*. If Aₙ ∉ Γ*, then that is because Γₙ ∪ {Aₙ} was inconsis-
tent. But that means that Γ* is maximally consistent.
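The stage-by-stage construction of Γ* can be played out in miniature for propositional logic over finitely many atoms, where "consistent" can be replaced by "satisfiable" and decided by truth tables. The formula encoding below is an assumption for this sketch, not from the text.

```python
from itertools import product

ATOMS = ["p", "q"]

def holds(f, v):
    """Evaluate f under valuation v; f is an atom or ("not"/"and"/"or", ...)."""
    if isinstance(f, str):
        return v[f]
    op, *args = f
    if op == "not":
        return not holds(args[0], v)
    if op == "and":
        return holds(args[0], v) and holds(args[1], v)
    return holds(args[0], v) or holds(args[1], v)

def consistent(gamma):
    """Satisfiability by brute force over all valuations of ATOMS."""
    return any(all(holds(f, dict(zip(ATOMS, vals))) for f in gamma)
               for vals in product([True, False], repeat=len(ATOMS)))

def lindenbaum(gamma0, formulas):
    """At stage n, add A_n if the result is consistent, else add ¬A_n."""
    g = list(gamma0)
    for a in formulas:
        g.append(a if consistent(g + [a]) else ("not", a))
    return g

gstar = lindenbaum(["p"], ["q", ("or", "p", "q"), ("and", "p", "q")])
assert consistent(gstar)
```

In the first-order case the enumeration of formulas is infinite and consistency is not decidable, so this is only an analogy; but the invariant is the same: every stage stays consistent, and every formula or its negation eventually gets in.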
2. The interpretation of a constant symbol c is c itself:
c^{M(Γ*)} = c.
4. If R is an n-place predicate symbol, then ⟨t₁, …, tₙ⟩ ∈ R^{M(Γ*)}
iff R(t₁, …, tₙ) ∈ Γ*.
4. A ≡ B ∧ C : exercise.
6. A ≡ B → C : exercise.
7. A ≡ ∀x B(x): exercise.
8.7 Identity
The construction of the term model given in the preceding sec-
tion is enough to establish completeness for first-order logic for
sets Γ that do not contain =. The term model satisfies every
A ∈ Γ* which does not contain = (and hence all A ∈ Γ). It does
not work, however, if = is present. The reason is that Γ* then
may contain a sentence t = t′, but in the term model the value of
any term is that term itself. Hence, if t and t′ are different terms,
their values in the term model—i.e., t and t′, respectively—are
different, and so t = t′ is false. We can fix this, however, using a
construction known as “factoring.”
1. ≈ is reflexive.
2. ≈ is symmetric.
3. ≈ is transitive.
[t]≈ = {t′ : t′ ∈ Trm(L), t ≈ t′}
1. |M/≈| = Trm(L)/≈.
2. c^{M/≈} = [c]≈
3. f^{M/≈}([t₁]≈, …, [tₙ]≈) = [f(t₁, …, tₙ)]≈
and
Proof. Note that the Γ’s in Corollary 8.17 and Theorem 8.16 are
universally quantified. To make sure we do not confuse ourselves,
let us restate Theorem 8.16 using a different variable: for any set
of sentences ∆, if ∆ is consistent, it is satisfiable. By contraposi-
tion, if ∆ is not satisfiable, then ∆ is inconsistent. We will use this
to prove the corollary.
Suppose that Γ ⊨ A. Then Γ ∪ {¬A} is unsatisfiable by Propo-
sition 5.46. Taking Γ ∪ {¬A} as our ∆, the previous version of
Theorem 8.16 gives us that Γ ∪ {¬A} is inconsistent. By Propo-
sition 7.13, Γ ⊢ A.
are covered? The compactness theorem shows that this is not the
case if Γ has infinite models. Here’s how to see this: Let M be
an infinite model of Γ, and let c be a constant symbol not in the
language of Γ. Let ∆ be the set of all sentences c ≠ t for t a term
in the language L of Γ, i.e.,
∆ = {c ≠ t : t ∈ Trm(L)}.
∆ = {A≥n : n ≥ 1}
Summary
Problems
Problem 8.1. Complete the proof of Proposition 8.2.
Beyond First-order Logic
9.1 Overview
First-order logic is not the only system of logic of interest: there
are many extensions and variations of first-order logic. A logic
typically consists of the formal specification of a language, usu-
ally, but not always, a deductive system, and usually, but not
always, an intended semantics. But the technical use of the term
raises an obvious question: what do logics that are not first-order
logic have to do with the word “logic,” used in the intuitive or
philosophical sense? All of the systems described below are de-
signed to model reasoning of some form or another; can we say
what makes them logical?
No easy answers are forthcoming. The word “logic” is used
in different ways and in different contexts, and the notion, like
that of “truth,” has been analyzed from numerous philosophical
stances. For example, one might take the goal of logical reason-
jects they can take as arguments. Otherwise, one keeps the usual
rules of first-order logic, with versions of the quantifier-rules re-
peated for each sort.
For example, to study international relations we might choose
a language with two sorts of objects, French citizens and German
citizens. We might have a unary relation, “drinks wine,” for ob-
jects of the first sort; another unary relation, “eats wurst,” for
objects of the second sort; and a binary relation, “forms a multi-
national married couple,” which takes two arguments, where the
first argument is of the first sort and the second argument is of
the second sort. If we use variables a, b, c to range over French
citizens and x, y, z to range over German citizens, then
∀a ∀x (MarriedTo(a, x) → (DrinksWine(a) ∨ ¬EatsWurst(x)))
asserts that if any French person is married to a German, either
the French person drinks wine or the German doesn’t eat wurst.
Many-sorted logic can be embedded in first-order logic in a
natural way, by lumping all the objects of the many-sorted do-
mains together into one first-order domain, using unary predicate
symbols to keep track of the sorts, and relativizing quantifiers.
For example, the first-order language corresponding to the exam-
ple above would have unary predicate symbols "German" and
"French," in addition to the other relations described, with the
sort requirements erased. A sorted quantifier ∀x A, where x is a
variable of the German sort, translates to
∀x (German(x) → A).
We need to add axioms that ensure that the sorts are separate—
e.g., ∀x ¬(German(x) ∧ French(x))—as well as axioms that guar-
antee that "drinks wine" only holds of objects satisfying the pred-
icate French(x), etc. With these conventions and axioms, it is
not difficult to show that many-sorted sentences translate to first-
order sentences, and many-sorted derivations translate to first-
order derivations. Also, many-sorted structures “translate” to cor-
responding first-order structures and vice-versa, so we also have
a completeness theorem for many-sorted logic.
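The quantifier-relativizing step of this embedding can be written down directly. The sketch below (the tuple encoding of formulas and the sort table are assumptions for the illustration) translates a sorted formula into its one-sorted counterpart.

```python
SORT_OF = {"a": "French", "x": "German"}   # variable -> sort predicate

def translate(f):
    """Relativize each sorted quantifier by its sort's unary predicate."""
    op = f[0]
    if op == "forall":                     # ("forall", var, body)
        _, v, body = f
        return ("forall", v, ("imp", (SORT_OF[v], v), translate(body)))
    if op == "exists":                     # dually, ∃ gets a conjunction
        _, v, body = f
        return ("exists", v, ("and", (SORT_OF[v], v), translate(body)))
    if op in ("imp", "and", "or"):
        return (op, *map(translate, f[1:]))
    if op == "not":
        return ("not", translate(f[1]))
    return f                               # atomic formulas are unchanged

sorted_sentence = ("forall", "x", ("EatsWurst", "x"))
assert translate(sorted_sentence) == \
    ("forall", "x", ("imp", ("German", "x"), ("EatsWurst", "x")))
```

The asymmetry between ∀ (implication) and ∃ (conjunction) is the standard relativization pattern; getting it backwards would change the truth conditions.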
particular you can quantify over these sets; for example, one can
express induction for the natural numbers with a single axiom
1. ∀x ¬x′ = 0
2. ∀x ∀y (x′ = y′ → x = y)
3. ∀x (x + 0) = x
4. ∀x ∀y (x + y′) = (x + y)′
5. ∀x (x × 0) = 0
6. ∀x ∀y (x × y′) = ((x × y) + x)
7. ∀x ∀y (x < y ↔ ∃z y = (x + z′))
The negation of this sentence then defines the class of finite struc-
tures.
In addition, one can define the class of well-orderings, by
adding the following to the definition of a linear ordering:
This asserts that every non-empty set has a least element, modulo
the identification of “set” with “one-place relation”. For another
example, one can express the notion of connectedness for graphs,
by saying that there is no nontrivial separation of the vertices into
disconnected parts:
As you may have guessed, one can iterate this idea arbitrarily.
In practice, higher-order logic is often formulated in terms
of functions instead of relations. (Modulo the natural identifica-
tions, this difference is inessential.) Given some basic “sorts” A,
B, C , . . . (which we will now call “types”), we can create new ones
by stipulating
1. N is a finite type.
2. 0 is a term of type N
R_{s,t}(0) = s
R_{s,t}(x + 1) = t(x, R_{s,t}(x)),
⟨s, t⟩ denotes the pair whose first component is s and whose sec-
ond component is t, and p₁(s) and p₂(s) denote the first and
second elements ("projections") of s. Finally, λx. s denotes the
function f defined by
f(x) = s
for any x of type σ; so item (6) gives us a form of comprehension,
enabling us to define functions using terms. Formulas are built
up from identity predicate statements s = t between terms of the
same type, the usual propositional connectives, and higher-type
quantification. One can then take the axioms of the system to be
the basic equations governing the terms defined above, together
with the usual rules of logic with quantifiers and identity predi-
cate.
since 3^(log₃ x) = x.
Intuitionistic logic is designed to model a kind of reasoning
where moves like the one in the first proof are disallowed. Proving
the existence of an x satisfying A(x) means that you have to give a
specific x, and a proof that it satisfies A, like in the second proof.
Proving that A or B holds requires that you can prove one or the
other.
Formally speaking, intuitionistic first-order logic is what you
get if you restrict a proof system for first-order logic in a
certain way. Similarly, there are intuitionistic versions of second-
order or higher-order logic. From the mathematical point of view,
these are just formal deductive systems, but, as already noted,
they are intended to model a kind of mathematical reasoning.
One can take this to be the kind of reasoning that is justified on
a certain philosophical view of mathematics (such as Brouwer’s
intuitionism); one can take it to be a kind of mathematical rea-
soning which is more “concrete” and satisfying (along the lines
of Bishop’s constructivism); and one can argue about whether or
not the formal description captures the informal motivation. But
whatever philosophical positions we may hold, we can study in-
tuitionistic logic as a formally presented logic; and for whatever
reasons, many mathematical logicians find it interesting to do so.
There is an informal constructive interpretation of the intu-
itionist connectives, usually known as the Brouwer-Heyting-Kolmogorov
interpretation. It runs as follows: a proof of A ∧ B consists of a
proof of A paired with a proof of B; a proof of A ∨ B consists
of either a proof of A, or a proof of B, where we have explicit
information as to which is the case; a proof of A → B consists
1. (A → ⊥) → ¬A.
2. A ∨ ¬A
3. ¬¬A → A
(A ∨ B)^N ≡ ¬¬(A^N ∨ B^N)
(A → B)^N ≡ (A^N → B^N)
(∀x A)^N ≡ ∀x A^N
(∃x A)^N ≡ ¬¬∃x A^N
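The translation A ↦ A^N is a straightforward recursion on formula trees. Below is a sketch (the tuple encoding is an assumption for the illustration) following the clauses above, with atomic formulas and the remaining connectives handled in the standard Gödel–Gentzen way: atoms get a double negation, ∧ and ¬ are translated componentwise.

```python
def nn(f):
    return ("not", ("not", f))

def N(f):
    """Double-negation translation on formula trees."""
    op = f[0]
    if op == "or":
        return nn(("or", N(f[1]), N(f[2])))
    if op == "imp":
        return ("imp", N(f[1]), N(f[2]))
    if op == "forall":
        return ("forall", f[1], N(f[2]))
    if op == "exists":
        return nn(("exists", f[1], N(f[2])))
    if op == "and":
        return ("and", N(f[1]), N(f[2]))
    if op == "not":
        return ("not", N(f[1]))
    return nn(f)                 # atomic formulas get ¬¬ in front

assert N(("or", ("P",), ("Q",))) == \
    ("not", ("not", ("or", ("not", ("not", ("P",))),
                     ("not", ("not", ("Q",))))))
```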
2. P, p ⊮ ⊥.
3. P, p ⊩ (A ∧ B) iff P, p ⊩ A and P, p ⊩ B.
4. P, p ⊩ (A ∨ B) iff P, p ⊩ A or P, p ⊩ B.
One would like to augment logic with rules and axioms deal-
ing with modality. For example, the system S4 consists of the
ordinary axioms and rules of propositional logic, together with
the following axioms:
♦♦A → ♦A
Turing Machines
Turing Machine Computations
10.1 Introduction
. I I I t I I I I t t t
q1
with a t, move right to the fourth square, and change the state
of the machine to q 5 .
We say that the machine halts when it encounters some state,
qₙ, and symbol, σ, such that there is no instruction for ⟨qₙ, σ⟩,
i.e., the transition function for input ⟨qₙ, σ⟩ is undefined. In other
words, the machine has no instruction to carry out, and at that
point, it ceases operation. Halting is sometimes represented by
a specific halt state h. This will be demonstrated in more detail
later on.
The beauty of Turing’s paper, “On computable numbers,”
is that he presents not only a formal definition, but also an ar-
gument that the definition captures the intuitive notion of com-
putability. From the definition, it should be clear that any func-
tion computable by a Turing machine is computable in the intu-
itive sense. Turing offers three types of argument that the con-
verse is true, i.e., that any function that we would naturally regard
as computable is computable by such a machine. They are (in
Turing’s words):
t, I , R
start q0 q1
Recall that the Turing machine has a read/write head and a tape
with the input written on it. The instruction can be read as: if
reading a blank in state q₀, write a stroke, move right, and move to
state q₁. This is equivalent to the transition function mapping
⟨q₀, t⟩ to ⟨q₁, I, R⟩.
start q0 q1
I,I,R
.I I 1 I I t . . .
.I I I I 1 t . . .
.I I I I t0 . . .
The machine is now in state q 0 scanning a blank. Based on the
transition diagram, we can easily see that there is no instruction
to be carried out, and thus the machine has halted. This means
that the input has been accepted.
Suppose next we start the machine with an input of three
strokes. The first few configurations are similar, as the same in-
structions are carried out, with only a small difference of the tape
input:
.I 0 I I t . . .
.I I 1 I t . . .
.I I I 0 t . . .
.I I I t1 . . .
The machine has now traversed past all the strokes, and is read-
ing a blank in state q 1 . As shown in the diagram, there is an
instruction of the form δ(q 1, t) = hq 1, t, Ri. Since the tape is in-
finitely blank to the right, the machine will continue to execute
this instruction forever, staying in state q 1 and moving ever further
to the right. The machine will never halt, and does not accept
the input.
It is important to note that not all machines will halt. If halt-
ing means that the machine runs out of instructions to execute,
then we can create a machine that never halts simply by ensuring
that there is an outgoing arrow for each symbol at each state.
The even machine can be modified to run infinitely by adding an
instruction for scanning a blank at q 0 .
Example 10.2.
t, t, R t, t, R
I,I,R
start q0 q1
I,I,R
I,I,R I,I,R
I , t, R t, t, R
start q0 q1 q2
t, t, R t, I , R
q5 q4 q3
t, t, L I,I,L
I,I,L I,I,L t, I , L
3. an initial state q 0 ∈ Q ,
Q = {q₀, q₁}
Σ = {., t, I},
δ(q₀, I) = ⟨q₁, I, R⟩,
δ(q₁, I) = ⟨q₀, I, R⟩,
δ(q₁, t) = ⟨q₁, t, R⟩.
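The even machine just specified can be simulated directly. The sketch below is an illustration, not from the text: the tape is a dictionary from positions to symbols, "t" stands for the blank, "I" for the stroke, and ">" for the left end marker; a step bound stands in for "runs forever."

```python
def run(delta, tape_input, max_steps=1000):
    """Simulate a Turing machine; return the halting (state, symbol), or
    None if no undefined transition is reached within max_steps."""
    tape = {0: ">"}
    for i, sym in enumerate(tape_input, start=1):
        tape[i] = sym
    state, pos = "q0", 1
    for _ in range(max_steps):
        sym = tape.get(pos, "t")           # unvisited squares are blank
        if (state, sym) not in delta:
            return state, sym              # no instruction: the machine halts
        state, write, move = delta[(state, sym)]
        tape[pos] = write
        pos += {"R": 1, "L": -1, "N": 0}[move]
    return None

# The even machine: halts iff the block of strokes has even length.
even = {("q0", "I"): ("q1", "I", "R"),
        ("q1", "I"): ("q0", "I", "R"),
        ("q1", "t"): ("q1", "t", "R")}

assert run(even, "IIII") is not None       # even input: halts (accepts)
assert run(even, "III") is None            # odd input: loops in q1 forever
```

On an odd input the machine ends up in q₁ reading blanks, and the instruction δ(q₁, t) keeps it moving right forever, exactly as described in the text.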
3. q ∈ Q
the right of the left end marker), and the mechanism is in the
designated start state q 0 .
h. _ I , 1, q 0 i
4. a) D = L and n′ = n − 1, or
b) D = R and n′ = n + 1, or
c) D = N and n′ = n,
I^{k₁} t I^{k₂} t … t I^{kₙ}
I,I,R I,I,R I , t, N
t, I , R t, t, L
start q0 q1 q2
start q0 q1
I,I,R
t, t, N
h
I,I,R
start q0 q1
I,I,R
t, t, N t, t, N
h r
t, I , R t, t, L
start q0 q1 q2
t, I , R t, t, L
start q0 q1 q2
I , t, L
I,I,L q3
., ., R
q4
I,I,R I,I,R
t, I , R t, t, L
start q0 q1 q2
I , t, L
I,I,L q3
I,I,L ., ., R
t, t, L t, t, R
q8 q9 q4
I,I,L I,I,L I , t, R
t, I , L q7 q6 q5
t, I , R t, t, R
I,I,R I,I,R
Summary
A Turing machine is a kind of idealized computation mecha-
nism. It consists of a one-way infinite tape, divided into squares,
each of which can contain a symbol from a pre-determined al-
phabet. The machine operates by moving a read-write head
along the tape. It may also be in one of a pre-determined num-
ber of states. The actions of the read-write head are determined
by a set of instructions; each instruction is conditional on the ma-
chine being in a certain state and reading a certain symbol, and
specifies which symbol the machine will write onto the current
square, whether it will move the read-write head one square left,
right, or stay put, and which state it will switch to. If the tape
contains a certain input, represented as a sequence of symbols
on the tape, and the machine is put into the designated start state
with the read-write head reading the leftmost square of the input,
the instruction set will step-wise determine a sequence of config-
urations of the machine: content of tape, position of read-write
head, and state of the machine. Should the machine encounter
a configuration in which the instruction set does not contain an
instruction for the current symbol read/state combination, the
machine halts, and the content of the tape is the output.
Numbers can very easily be represented as sequences of strokes
on the tape of a Turing machine. We say a function f: N → N is Tur-
ing computable if there is a Turing machine which, whenever it
is started on the unary representation of n as input, eventually
halts with its tape containing the unary representation of f (n) as
output. Many familiar arithmetical functions are easily (or not-
so-easily) shown to be Turing computable. Many other models
of computation other than Turing machines have been proposed;
and it has always turned out that the arithmetical functions com-
putable there are also Turing computable. This is seen as support
Problems
Problem 10.1. Choose an arbitrary input and trace through the
configurations of the doubler machine in Example 10.4.
rearranges them so that all the As are to the left of all the Bs. (e.g.,
the sequence BABAA should become the sequence AAABB, and
the sequence ABBABB should become the sequence AABBBB).
equality(x, y) = 1 if x = y, and
equality(x, y) = 0 if x ≠ y
Undecidability
11.1 Introduction
It might seem obvious that not every function, even every
arithmetical function, can be computable. There are just too
many whose behavior is too complicated: functions defined in
terms of the decay of radioactive particles, for instance, or of
other chaotic or random behavior. Suppose we start counting
1-second intervals
from a given time, and define the function f (n) as the number
of particles in the universe that decay in the n-th 1-second inter-
val after that initial moment. This seems like a candidate for a
function we cannot ever hope to compute.
But it is one thing to not be able to imagine how one would
compute such functions, and quite another to actually prove that
they are uncomputable. In fact, even functions that seem hope-
lessly complicated may, in an abstract sense, be computable. For
instance, suppose the universe is finite in time—some day, in the
very distant future the universe will contract into a single point,
as some cosmological theories predict. Then there is only a fi-
nite (but incredibly large) number of seconds from that initial
moment for which f (n) is defined. And any function which is
defined for only finitely many inputs is computable: we could list
the outputs in one big table, or code it in one very big Turing
machine state transition diagram.
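The "one big table" idea is just finite lookup: a function defined on only finitely many inputs can be computed by listing its values. A sketch, with made-up placeholder values:

```python
# A function defined on finitely many inputs, "computed" by table
# lookup. The table entries here are arbitrary illustrative values.
table = {0: 17, 1: 4, 2: 23}      # f(n) for the finitely many defined n

def f(n):
    return table[n]               # computation is a single table lookup

f(1)  # → 4
```

No matter how complicated the values were to determine in the first place, the resulting function is trivially computable once tabulated.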
\[
\text{isprime}(n) = \begin{cases} 1 & \text{if } n \text{ is prime}\\ 0 & \text{otherwise.} \end{cases}
\]
assume, for instance, that the states and vocabulary symbols are
natural numbers, or that the states and vocabulary are all strings
of letters and digits.
Suppose we fix a countably infinite vocabulary for specifying
Turing machines: σ0 = ▷, σ1 = ⊔, σ2 = |, σ3, . . . , together with
R, L, N, and q0, q1, . . . . Then any Turing machine can be specified by some
finite string of symbols from this alphabet (though not every fi-
nite string of symbols specifies a Turing machine). For instance,
suppose we have a Turing machine M = ⟨Q, Σ, q, δ⟩ where
\[
Q = \{q'_0, \dots, q'_n\} \subseteq \{q_0, q_1, \dots\} \quad\text{and}\quad
\Sigma = \{\triangleright, \sigma'_1, \dots, \sigma'_m\} \subseteq \{\sigma_0, \sigma_1, \dots\}.
\]
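Since every machine is specified by a finite string over a fixed alphabet, every machine can be assigned a number. One simple scheme (the small alphabet below is an illustrative stand-in for the text's vocabulary) reads a string as a numeral with digits 1..k in base k + 1:

```python
# Assign a number to every finite string over a fixed finite alphabet,
# injectively. The alphabet here is an illustrative stand-in.
ALPHABET = [">", "_", "|", "R", "L", "N", "q"]
BASE = len(ALPHABET) + 1

def encode(s):
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch) + 1   # digit in 1..len(ALPHABET)
    return n

def decode(n):
    chars = []
    while n > 0:
        n, d = divmod(n, BASE)
        chars.append(ALPHABET[d - 1])
    return "".join(reversed(chars))
```

Since `decode(encode(s)) == s`, distinct descriptions get distinct numbers; not every number decodes to a well-formed machine description, just as the text notes for strings.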
Theorem 11.1. There are functions from N to N which are not Turing
computable.
\[
s(e) = \begin{cases} 0 & \text{if machine } M_e \text{ does not halt for input } e\\ 1 & \text{if machine } M_e \text{ halts for input } e \end{cases}
\]
3. A constant symbol 0
11.5 Representing Turing Machines
0=
n + 1 = n0
\[
\forall x\, \forall y\, (x < y \to x \neq y)
\]
\[
Q_{q_0}(\overline{1}, \overline{0})
\]
\[
\forall x\, \forall y\, ((Q_{q_i}(x', y) \land S_{\sigma}(x', y)) \to
(Q_{q_j}(x, y') \land S_{\sigma'}(x', y') \land A(x, y)))
\]
11.6 Verifying the Representation
\[
\exists x\, \exists y\, Q_h(x, y)
\]
Lemma 11.9. For each n, if M has not halted after n steps, then
T (M, w) ⊨ C (M, w, n).
1. δ(q, σ) = ⟨q′, σ′, R⟩

2. δ(q, σ) = ⟨q′, σ′, L⟩

3. δ(q, σ) = ⟨q′, σ′, N⟩
\[
\begin{aligned}
&Q_{q'}(\overline{m}', \overline{n}') \land S_{\sigma'}(\overline{m}, \overline{n}') \land {}\\
&S_{\sigma_0}(\overline{0}, \overline{n}') \land \dots \land S_{\sigma_k}(\overline{k}, \overline{n}') \land {}\\
&\forall x\, (\overline{k} < x \to S_{\sqcup}(x, \overline{n}'))
\end{aligned}
\]
\[
\forall x\, \forall y\, ((Q_{q}(x', y) \land S_{\sigma}(x', y)) \to
(Q_{q'}(x, y') \land S_{\sigma'}(x', y') \land A(x, y)))
\]
\[
\begin{aligned}
&Q_{q'}(\overline{m}, \overline{n}') \land S_{\sigma'}(\overline{m}', \overline{n}') \land {}\\
&S_{\sigma_0}(\overline{0}, \overline{n}') \land \dots \land S_{\sigma_k}(\overline{k}, \overline{n}') \land {}\\
&\forall x\, (\overline{k} < x \to S_{\sqcup}(x, \overline{n}'))
\end{aligned}
\]
\[
\begin{aligned}
&Q_{q'}(\overline{m}', \overline{n}') \land {}\\
&S_{\sigma_0}(\overline{0}, \overline{n}') \land \dots \land S_{\sigma_k}(\overline{k}, \overline{n}') \land {}\\
&S_{\sqcup}(\overline{k+1}, \overline{n}') \land \dots \land S_{\sqcup}(\overline{m-1}, \overline{n}') \land {}\\
&S_{\sigma'}(\overline{m}, \overline{n}') \land {}\\
&\forall x\, (\overline{m} < x \to S_{\sqcup}(x, \overline{n}'))
\end{aligned}
\]
Proof. By Lemma 11.9, we know that, for any time n, the de-
scription C (M, w, n) of the configuration of M at time n is a con-
sequence of T (M, w). Suppose M halts after k steps. It will
be scanning square m, say. Then C (M, w, k ) contains as conjuncts
both $Q_q(\overline{m}, \overline{k})$ and $S_\sigma(\overline{m}, \overline{k})$ with δ(q, σ) undefined. Thus,
C (M, w, k ) ⊨ E(M, w). But then T (M, w) ⊨ E(M, w), and therefore
T (M, w) → E(M, w) is valid.
Summary
Turing machines are determined by their instruction sets, which
are finite sets of quintuples (for every state and symbol read, spec-
ify new state, symbol written, and movement of the head). The
finite sets of quintuples are enumerable, so there is a way of as-
sociating a number with each Turing machine instruction set.
The index of a Turing machine is the number associated with
its instruction set under a fixed such schema. In this way we can
“talk about” Turing machines indirectly—by talking about their
indices.
One important problem about the behavior of Turing ma-
chines is whether they eventually halt. Let h(e, n) be the function
whose value is 1 if the Turing machine with index e halts when
started on input n, and 0 otherwise. It is called the halting
function. The question of whether the halting function is itself
Turing computable is called the halting problem. The answer is
no: the halting problem is unsolvable. This is established using
a diagonal argument.
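The diagonal argument can be illustrated on a finite toy enumeration: given a list of functions, the function d(e) = 1 − f_e(e) disagrees with the e-th listed function at input e, so it cannot appear anywhere in the list. (The listed functions below are arbitrary illustrations.)

```python
# Toy diagonalization: a finite "enumeration" of functions and the
# diagonal function d, which differs from the e-th function at input e.
functions = [
    lambda n: 0,          # f_0
    lambda n: n % 2,      # f_1
    lambda n: 1,          # f_2
]

def d(e):
    return 1 - functions[e](e)    # flip the value on the diagonal

all(d(e) != functions[e](e) for e in range(len(functions)))  # → True
```

The unsolvability proof runs the same construction against an enumeration of all Turing-computable functions: if the halting function were computable, the diagonal function would be computable too, yet it differs from every machine in the enumeration.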
The halting problem is only one example of a larger class
of problems of the form “can X be accomplished using Turing
machines.” Another central problem of logic is the decision
problem for first-order logic: is there a Turing machine that
can decide whether a given sentence is valid or not? This famous
problem is also unsolvable: the halting problem can be reduced
to it, so no such Turing machine exists.
Problems
Problem 11.1. The Three Halting (3-Halt) problem is the prob-
lem of giving a decision procedure to determine whether or not
an arbitrarily chosen Turing machine halts for an input of three
strokes on an otherwise blank tape. Prove that the 3-Halt problem
is unsolvable.
Induction
A.1 Introduction
Induction is an important proof technique which is used, in dif-
ferent forms, in almost all areas of logic, theoretical computer
science, and mathematics. It is needed to prove many of the re-
sults in logic.
Induction is often contrasted with deduction, and character-
ized as the inference from the particular to the general. For in-
stance, if we observe many green emeralds, and nothing that we
would call an emerald that’s not green, we might conclude that
all emeralds are green. This is an inductive inference, in that it
proceeds from many particular cases (this emerald is green, that
emerald is green, etc.) to a general claim (all emeralds are green).
Mathematical induction is also an inference that concludes a
general claim, but it is of a very different kind than this “simple
induction.”
Very roughly, an inductive proof in mathematics concludes
that all mathematical objects of a certain sort have a certain prop-
erty. In the simplest case, the mathematical objects an induc-
tive proof is concerned with are natural numbers. In that case
an inductive proof is used to establish that all natural numbers
have some property, and it does this by showing that (1) 0 has
the property, and (2) whenever a number n has the property, so
does n + 1.
A.2 Induction on N
In its simplest form, induction is a technique used to prove results
for all natural numbers. It uses the fact that by starting from 0 and
repeatedly adding 1 we eventually reach every natural number.
So to prove that something is true for every number, we can (1)
establish that it is true for 0 and (2) show that whenever a number
has it, the next number has it too. If we abbreviate “number n
has property P ” by P (n), then a proof by induction that P (n) for
all n ∈ N consists of:

1. a proof that P (0) holds (the induction basis), and

2. a proof that, for any n, if P (n) holds then so does P (n + 1)
(the inductive step).
To make this crystal clear, suppose we have both (1) and (2).
Then (1) tells us that P (0) is true. If we also have (2), we know
in particular that if P (0) then P (0 + 1), i.e., P (1). (This follows
from the general statement “for any n, if P (n) then P (n + 1)” by
putting 0 for n.) So by modus ponens, we have that P (1). From
(2) again, now taking 1 for n, we have: if P (1) then P (2). Since
we’ve just established P (1), by modus ponens, we have P (2). And
so on. For any number k , after repeating this k times, we eventually
arrive at P (k ). So (1) and (2) together establish P (k ) for any k ∈ N.
Theorem A.1. With n dice one can throw all 5n + 1 possible values
between n and 6n.
Proof. Let P (n) be the claim: “It is possible to throw any number
between n and 6n using n dice.” To use induction, we prove:
1. The induction basis P (1), i.e., with just one die, you can
throw any number between 1 and 6.
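For small n, the claim of Theorem A.1 can also be checked by brute force, enumerating all possible throws of n dice:

```python
from itertools import product

# Brute-force check of Theorem A.1 for small n: with n dice one can
# throw exactly the 5n + 1 values n, n + 1, ..., 6n.
def throwable(n):
    return {sum(dice) for dice in product(range(1, 7), repeat=n)}

for n in range(1, 5):
    assert throwable(n) == set(range(n, 6 * n + 1))
```

Of course, such a check only covers finitely many cases; the induction proves the claim for every n at once.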
\[
s_0 = 0 \qquad s_{n+1} = s_n + (n + 1)
\]
\[
\begin{aligned}
s_0 &= 0,\\
s_1 &= s_0 + 1 = 1,\\
s_2 &= s_1 + 2 = 1 + 2 = 3,\\
s_3 &= s_2 + 3 = 1 + 2 + 3 = 6, \text{ etc.}
\end{aligned}
\]
in question for any number under the assumption it holds for its
predecessor.
There is a variant of the principle of induction in which we
don’t just assume that the claim holds for the predecessor n − 1
of n, but for all numbers smaller than n, and use this assumption
to establish the claim for n. This also gives us the claim P (k ) for
all k ∈ N. For once we have established P (0), we have thereby
established that P holds for all numbers less than 1. And if we
know that if P (l ) for all l < n then P (n), we know this in particular
for n = 1. So we can conclude P (1). With this we have proved
P (0) and P (1), i.e., P (l ) for all l < 2, and since we have also the
conditional, if P (l ) for all l < 2, then P (2), we can conclude P (2),
and so on.
In fact, if we can establish the general conditional “for all n,
if P (l ) for all l < n, then P (n),” we do not have to establish P (0)
anymore, since it follows from it. For remember that a general
claim like “for all l < n, P (l )” is true if there are no l < n. This
is a case of vacuous quantification: “all As are Bs” is true if there
are no As, ∀x (A(x) → B(x)) is true if no x satisfies A(x). In this
case, the formalized version would be “∀l (l < n → P (l ))”—and
that is true if there are no l < n. And if n = 0 that’s exactly the
case: no l < 0, hence “for all l < 0, P (l )” is true, whatever P is.
A proof of “if P (l ) for all l < n, then P (n)” thus automatically
establishes P (0).
This variant is useful if establishing the claim for n can’t be
made to just rely on the claim for n − 1 but may require the
assumption that it is true for one or more l < n.
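Strong induction corresponds to recursion in which the call for n may use results for any l < n, not just n − 1. Prime factorization is a standard example: the claim for n relies on the claims for the two smaller factors. A sketch:

```python
def factorization(n):
    """Return a list of primes whose product is n (for n >= 2).

    Mirrors a proof by strong induction: if n has no proper divisor it
    is prime and we are done; otherwise n = a * b with a, b < n, and we
    may assume the claim for both smaller numbers.
    """
    for a in range(2, n):
        if n % a == 0:
            return factorization(a) + factorization(n // a)
    return [n]    # no proper divisor: n is prime

factorization(12)  # → [2, 2, 3]
```

Trying to prove the same claim with ordinary induction fails, since the factors of n are in general much smaller than n − 1.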
1. ∅ is a parexpression.

2. If p is a parexpression, then (p) is a parexpression.

3. If p and p′ ≠ ∅ are parexpressions, then pp′ is a parexpression.

4. Nothing else is a parexpression.
(Note that we have not yet proved that every balanced paren-
thesis expression is a parexpression, although it is quite clear that
every parexpression is a balanced parenthesis expression.)
The key feature of inductive definitions is that if you want to
prove something about all parexpressions, the definition tells you
which cases you must consider. For instance, if you are told that
q is a parexpression, the inductive definition tells you what q can
look like: q can be ∅, it can be (p) for some other parexpression p,
or it can be pp′ for two parexpressions p and p′ ≠ ∅. Because of
clause (4), those are all the possibilities.
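The inductive definition also suggests a recursive recognizer: to decide whether a string is a parexpression, check the clauses one by one. The sketch below is our illustration; the empty string plays the role of ∅.

```python
def is_parexpression(s):
    """Decide membership in the inductively defined set by mirroring
    its clauses; parexpressions are written as strings of ( and )."""
    if s == "":                                   # clause: s is the empty expression
        return True
    if s[0] == "(" and s[-1] == ")" and is_parexpression(s[1:-1]):
        return True                               # clause: s is (p)
    # clause: s is pp' for two shorter nonempty parexpressions
    return any(is_parexpression(s[:i]) and is_parexpression(s[i:])
               for i in range(1, len(s)))
```

Every recursive call is on a strictly shorter string, so the recursion terminates; this is exactly the pattern that proofs by induction on the definition exploit.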
When proving claims about all of an inductively defined set,
the strong form of induction becomes particularly important. For
\[
o_1(p) = (p) \qquad o_2(q, q') = qq'
\]
Biographies
B.1 Georg Cantor
An early biography of Georg Can-
tor (gay-org kahn-tor) claimed that
he was born and found on a ship
that was sailing for Saint Peters-
burg, Russia, and that his parents
were unknown. This, however, is
not true: he was born in
Saint Petersburg in 1845.
Cantor received his doctorate
in mathematics at the University of
Berlin in 1867. He is known for his
work in set theory, and is credited
with founding set theory as a distinctive
research discipline. He was the first to
prove that there are infinite sets of different sizes.

[Fig. B.1: Georg Cantor]

His theories, and especially his theory
of infinities, caused much debate among mathematicians at the
time, and his work was controversial.
Cantor’s religious beliefs and his mathematical work were in-
extricably tied; he even claimed that the theory of transfinite num-
bers had been communicated to him directly by God. In later
life, Cantor suffered from recurring bouts of depression and was
repeatedly hospitalized; he died in 1918.
Glossary
compactness theorem States that every finitely satisfiable set of
sentences is satisfiable (see section 8.9).
completeness Property of a proof system; it is complete if, whenever
Γ entails A, there is also a derivation that establishes
Γ ⊢ A; equivalently, iff every consistent set of
sentences is satisfiable (see section 8.1).
completeness theorem States that first-order logic is complete:
every consistent set of sentences is satisfiable.
composition (g ◦ f ) The function resulting from “chaining to-
gether” f and g ; (g ◦ f )(x) = g (f (x)) (see section 3.4).
connected R is connected if for all x, y ∈ X with x ≠ y, either
Rxy or Ryx (see section 2.2).
consistent A set of sentences Γ is consistent iff Γ ⊬ ⊥; otherwise
it is inconsistent (see section 7.4).
covered A structure in which every element of the domain is the
value of some closed term (see section 5.9).