An Introduction To Set Theory
An Introduction To Set Theory
An Introduction To Set Theory
for each
subformula of as follows:
1. If is atomic, then
is ;
2. If is () for some formula , then
is (
);
3. If is ( ) for some formula , then
is (
);
4. If is ( ) for some formula , then
is (
);
5. If is ( ) for some formula , then
is (
);
6. If is ( ) for some formula , then
is (
);
7. If is (v
k
) for some formula then
is just (v
k
)
if k ,= i, but
if k = i then
is (v
j
) where is the result of substituting v
j
for
each occurrence of v
i
in ; and,
14 CHAPTER 1. LOST
8. If is (v
k
) for some formula then
is just (v
k
)
if k ,= i, but
if k = i then
is (v
j
) where is the result of substituting v
j
for
each occurrence of v
i
in .
That a variable v
i
occurs free in a formula means that at least one of
the following is true:
1. is an atomic formula and v
i
occurs in ;
2. is (), is a formula and v
i
occurs free in ;
3. ( ), and are formulas and v
i
occurs free in or occurs free
in ;
4. is ( ), and are formulas and v
i
occurs free in or occurs
free in ;
5. is ( ), and are formulas and v
i
occurs free in or occurs
free in ;
6. is ( ), and are formulas and v
i
occurs free in or occurs
free in ;
7. is (v
j
) and is a formula and v
i
occurs free in and i ,= j; or,
8. is (v
j
) and is a formula and v
i
occurs free in and i ,= j.
As in the example below, a variable can occur both free and bound in a
formula. However, notice that if a variable occurs in a formula at all it must
occur either free, or bound, or both (but not at the same occurrence).
We dene the important notion of the substitution of a variable v
j
for
each free occurrence of the variable v
i
in the formula . This procedure is as
follows.
1. Substitute a new variable v
l
for all bound occurrences of v
i
in .
2. Substitute another new variable v
k
for all bound occurrences of v
j
in
the result of (1).
15
3. Directly substitute v
j
for each occurrence of v
i
in the result of (2).
Example. Let us substitute v
2
for all free occurrences of v
1
in the formula
((v
1
)((v
1
= v
2
) (v
1
v
0
)) (v
2
)(v
2
v
1
))
The steps are as follows.
1. ((v
1
)((v
1
= v
2
) (v
1
v
0
)) (v
2
)(v
2
v
1
))
2. ((v
3
)((v
3
= v
2
) (v
3
v
0
)) (v
2
)(v
2
v
1
))
3. ((v
3
)((v
3
= v
2
) (v
3
v
0
)) (v
4
)(v
4
v
1
))
4. ((v
3
)((v
3
= v
2
) (v
3
v
0
)) (v
4
)(v
4
v
2
))
For the reader who is new to this abstract game of formal logic, step (2)
in the substitution proceedure may appear to be unnecessary. It is indeed
necessary, but the reason is not obvious until we look again at the example
to see what would happen if step (2) were omitted. This step essentially
changes (v
2
)(v
2
v
1
) to (v
4
)(v
4
v
1
). We can agree that each of these
means the same thing, namely, v
1
is non-empty. However, when v
2
is
directly substituted into each we get something dierent: (v
2
)(v
2
v
2
) and
(v
4
)(v
4
v
2
). The latter says that v
2
is non-empty and this is, of course
what we would hope would be the result of substituting v
2
for v
1
in v
1
is
non-empty. But the former statement, (v
2
)(v
2
v
2
), seems quite dierent,
making the strange assertion that v
2
is an element of itself, and this is not
what we have in mind. What caused this problem? An occurrence of the
variable v
2
became bound as a result of being substituted for v
1
. We will not
allow this to happen. When we substitute v
2
for the free v
1
we must ensure
that this freedom is preserved for v
2
.
For a formula and variables v
i
and v
j
, let (v
i
[v
j
) denote the formula
which results from substituting v
j
for each free occurance of v
i
. In order
to make (v
i
[v
j
) well dened, we insist that in steps (1) and (2) of the
substitution process, the rst new variable available is used. Of course, the
use of any other new variable gives an equivalent formula. In the example, if
is the formula on the rst line, then (v
1
[v
2
) is the formula on the fourth
line.
16 CHAPTER 1. LOST
As a simple application we can show how to express there exists a unique
element. For any formula of the language of set theory we denote by
(!v
j
) the formula
((v
j
) (v
j
)(v
l
)(( (v
j
[v
l
)) (v
j
= v
l
)))
where v
l
is the rst available variable which does not occur in . The ex-
pression (!v
j
) can be considered as an abbreviation in the language of set
theory, that is, an expression which is not actually part of the language.
However, whenever we have a formula containing this expression, we can
quickly convert it to a proper formula of the language of set theory.
A class is just a string of symbols of the form v
i
: where v
i
is a
variable and is a formula. Two important and well-known examples are:
v
0
: ((v
0
= v
0
))
which is called the empty set and is usually denoted by , and
v
0
: (v
0
= v
0
)
which is called the universe and is usually denoted by V.
A term is dened to be either a class or a variable. Terms are the names
for what the language of set theory talks about. A grammatical analogy is
that terms correspond to nouns and pronounsclasses to nouns and variables
to pronouns. Continuing the analogy, the predicates, or verbs, are = and .
The atomic formulas are the basic relationships among the predicates and
the variables.
We can incorporate classes into the language of set theory by showing
how the predicates relate to them. Let and be formulas of the language
of set theory and let v
j
, v
k
and v
l
be variables. We write:
v
k
v
j
: instead of (v
j
[v
k
)
v
k
= v
j
: instead of (v
l
)((v
l
v
k
) (v
j
[v
l
))
v
j
: = v
k
instead of (v
l
)((v
j
[v
l
) (v
l
v
k
))
v
j
: = v
k
: instead of (v
l
)((v
j
[v
l
) (v
k
[v
l
))
v
j
: v
k
instead of (v
l
)((v
l
v
k
) (v
j
)((v
j
v
l
) ))
v
j
: v
k
: instead of (v
l
)((v
k
[v
l
) (v
j
)((v
j
v
l
) ))
17
whenever v
l
is neither v
j
nor v
k
and occurs in neither nor .
We can now show how to express, as a proper formula of set theory,
the substitution of a term t for each free occurrence of the variable v
i
in the
formula . We denote the resulting formula of set theory by (v
i
[t). The
case when t is a variable v
j
has already been discussed. Now we turn our
attention to the case when t is a class v
j
: and carry out a proceedure
similar to the variable case.
1. Substitute the rst available new variable for all bound occurrences of
v
i
in .
2. In the result of (1), substitute, in turn, the rst available new variable
for all bound occurrences of each variable which occurs free in .
3. In the result of (2), directly substitute v
j
: for v
i
into each atomic
subformula in turn, using the table above.
For example, the atomic subformula (v
i
v
k
) is replaced by the new subfor-
mula
(v
l
)((v
l
v
k
) (v
j
)((v
j
v
l
) ))
where v
l
is the rst available new variable. Likewise, the atomic subformula
(v
i
= v
i
) is replaced by the new subformula
(v
l
)((v
j
[v
l
) (v
j
[v
l
))
where v
l
is the rst available new variable (although it is not important to
change from v
j
to v
l
in this particular instance).
18 CHAPTER 1. LOST
Chapter 2
FOUND
The language of set theory is very precise, but it is extremely dicult for us
to read mathematical formulas in that language. We need to nd a way to
make these formulas more intelligible.
In order to avoid the inconsistencies associated with Richards paradox,
we must ensure that the formula in the class v
j
: is indeed a proper
formula of the language of set theoryor, at least, can be converted to a
proper formula once the abbreviations are eliminated. It is not so important
that we actually write classes using proper formulas, but what is important is
that whatever formula we write down can be converted into a proper formula
by eliminating abbreviations and slang.
We can now relax our formalism if we keep the previous paragraph in
mind. Lets adopt these conventions.
1. We can use any letters that we like for variables, not just v
0
, v
1
, v
2
, . . . .
2. We can freely omit parentheses and sometimes use brackets ] and [
instead.
3. We can write out and for , or for , implies for and
use the if...then... format as well as other common English expres-
sions for the logical connectives and quantiers.
19
20 CHAPTER 2. FOUND
4. We will use the notation (x, y, w
1
, . . . , w
k
) to indicate that all free
variables of lie among x, y, w
1
, . . . , w
k
. When the context is clear we
use the notation (x, t, w
1
, . . . , w
k
) for the result of substituting the
term t for each free occurrence of the variable y in , i.e., (y[t).
5. We can write out formulas, including statements of theorems, in any
way easily seen to be convertible to a proper formula in the language
of set theory.
For any terms r, s, and t, we make the following abbreviations of formulas.
(x t) for (x)(x t )
(x t) for (x)(x t )
s / t for (s t)
s ,= t for (s = t)
s t for (x)(x s x t)
Whenever we have a nite number of terms t
1
, t
2
, . . . , t
n
the notation
t
1
, t
2
, . . . , t
n
is used as an abbreviation for the class:
x : x = t
1
x = t
2
x = t
n
.
Furthermore, t : will stand for x : x = t , while x t : will
represent x : x t .
We also abbreviate the following important classes.
Union s t for x : x s x t
Intersection s t for x : x s x t
Dierence s t for x : x s x / t
Symmetric Dierence st for (s t) (t s)
Ordered Pair s, t) for s, s, t
Cartesian Product s t for p : x y (x s y t p = x, y))
Domain dom(f) for x : y x, y) f
Range rng(f) for y : x x, y) f
21
Image f
A for y : x A x, y) f
Inverse Image f
B for x : y B x, y) f
Restriction f[A for p : p f x A y p = x, y)
Inverse f
1
for p : x y x, y) f y, x) = p
These latter abbreviations are most often used when f is a function. We
write
f is a function
for
p f x y p = x, y) (x)(y x, y) f !y x, y) f)
and we write
f : X Y for f is a function dom(f) = X rng(f) Y
f is one to one for y rng(f) !x x, y) f
f is onto Y for Y = rng(f)
We also use the terms injection (for a one-to-one function), surjection (for
an onto function), and bijection (for both properties together).
Russells Paradox
The following is a theorem.
z z = x : x / x.
The proof of this is simple. Just ask whether or not z z.
The paradox is only for the naive, not for us. x : x / x is a classjust
a description in the language of set theory. There is no reason why what it
describes should exist. In everyday life we describe many things which dont
exist, ctional characters for example. Bertrand Russell did exist and Peter
Pan did not, although they each have descriptions in English. Although
Peter Pan does not exist, we still nd it worthwhile to speak about him. The
same is true in mathematics.
Upon reection, you might say that in fact, nothing is an element of itself
so that
x : x / x = x : x = x = V
22 CHAPTER 2. FOUND
and so Russells paradox leads to:
z z = V.
It seems we have proved that the universe does not exists. A pity!
The mathematical universe fails to have a mathematical existence in the
same way that the physical universe fails to have a physical existence. The
things that have a physical existence are exactly the things in the universe,
but the universe itself is not an object in the universe. This does bring up
an important pointdo any of the usual mathematical objects exist? What
about the other things we described as classes? What about ? Can we
prove that exists?
Actually, we cant; at least not yet. You cant prove very much, if you
dont assume something to start. We could prove Russells Paradox because,
amazingly, it only required the basic rules of logic and required nothing
mathematicalthat is, nothing about the real meaning of . Continuing
from Russells Paradox to z z = V required us to assume that x x /
xnot an unreasonable assumption by any means, but a mathematical
assumption none the less. The existence of the empty set z z = may
well be another necessary assumption.
Generally set theorists, and indeed all mathematicians, are quite willing
to assume anything which is obviously true. It is, after all, only the things
which are not obviously true which need some form of proof. The problem,
of course, is that we must somehow know what is obviously true. Naively,
z z = V would seem to be true, but it is not and if it or any other
false statement is assumed, all our proofs become infected with the virus of
inconsistency and all of our theorems become suspect.
Historically, considerable thought has been given to the construction of
the basic assumptions for set theory. All of mathematics is based on these
assumptions; they are the foundation upon which everything else is built.
These assumptions are called axioms and this system is called the ZFC Axiom
System. We will begin to study it in the next chapter.
Chapter 3
The Axioms of Set Theory
We will explore the ZFC Axiom System. Each axiom should be obviously
true in the context of those things that we desire to call sets. Because we
cannot give a mathematical proof of a basic assumption, we must rely on
intuition to determine truth, even if this feels uncomfortable. Beyond the
issue of truth is the question of consistency. Since we are unable to prove
that our assumptions are true, can we at least show that together they will
not lead to a contradiction? Unfortunately, we cannot even do thisit is
ruled out by the famous incompleteness theorems of K. Godel. Intuition is
our only guide. We begin.
We have the following axioms:
The Axiom of Equality x y [x = y z (x z y z)]
The Axiom of Extensionality x y [x = y u (u x u y)]
The Axiom of Existence z z =
The Axiom of Pairing x y z z = x, y
Dierent authors give slightly dierent formulations of the ZFC axioms.
All formulations are equivalent. Some authors omit the Axiom of Equality
and Axiom of Existence because they are consequences of the usual logical
background to all mathematics. We include them for emphasis. Redundancy
is not a bad thing and there is considerable redundancy in this system.
23
24 CHAPTER 3. THE AXIOMS OF SET THEORY
The following theorem gives some results that we would be quite willing
to assume outright, were they not to follow from the axioms. The rst three
parts are immediate consequences of the Axiom of Extensionality.
Theorem 1.
1. x x = x.
2. x y x = y y = x.
3. x y z [(x = y y = z) x = z].
4. x y z z = x, y).
5. u v x y [u, v) = x, y) (u = x v = y)].
Exercise 1. Prove parts (4) and (5) of Theorem 1
We now assert the existence of unions and intersections. No doubt the
reader has experienced a symmetry between these two concepts. Here how-
ever, while the Union Axiom is used extensively, the Intersection Axiom is
redundant and is omitted in most developments of the subject. We include
it here because it has some educational value (see Exercise 4).
The Union Axiom x [x ,= z z = w : (y x)(w y)]
The class w : (y x)(w y) is abbreviated as
y : x, y) f.
Exercise 3. Suppose f is a function and x dom(f). Prove that
x, y) f i y = f(x).
Suppose that x is a set and that there is some way of removing each
element u x and replacing u with some element v. Would the result be
a set? Well, of courseprovided there are no tricks here. That is, there
should be a well dened replacement procedure which ensures that each u is
replaced by only one v. This well dened procedure should be described by
a formula, , in the language of set theory. We can guarantee that each u is
replaced by exactly one v by insisting that u x !v (x, u, v).
We would like to obtain an axiom, written in the language of set theory
stating that for each set x and each such formula we get a set z. However,
this is impossible. We cannot express for each formula in the language of
set theoryin fact this formal language was designed for the express purpose
of avoiding such expressions which bring us perilously close to Richards
Paradox.
The answer to this conundrum is to utilise not just one axiom, but in-
nitely manyone axiom for each formula of the language of set theory.
Such a system is called an axiom scheme.
26 CHAPTER 3. THE AXIOMS OF SET THEORY
The Replacement Axiom Scheme
For each formula (x, u, v, w
1
, . . . , w
n
) of the language of set theory, we have
the axiom:
w
1
. . . w
n
x [u x !v z z = v : u x ]
Note that we have allowed to have w
1
, . . . , w
n
as parameters, that is,
free variables which may be used to specify various procedures in various
contexts within a mathematical proof. This is illustrated by the following
theorem.
Theorem 3. x y z z = x y.
Proof. From Theorem 1 parts (4) and (5), for all t y we get
u x !v v = u, t).
We now use Replacement with the formula (x, u, v, t) as v = u, t); t
is a parameter. We obtain, for each t y:
q q = v : u x v = u, t).
By Extensionality, in fact t y !q q = v : u x v = u, t).
We again use Replacement, this time with the formula (y, t, q, x) as
q = v : u x v = u, t); here x is a parameter. We obtain:
r r = q : t y q = v : u x v = u, t)
By the Union Axiom z z =
r and so we have:
z = p : q [q r p q]
= p : q [(t y) q = v : u x v = u, t) p q]
= p : (t y)(q)[q = v : u x v = u, t) p q]
= p : (t y)p v : u x v = u, t)
= p : (t y)(u x) p = u, t)
= x y
27
Exercise 4. Show that the Intersection Axiom is indeed redundant.
It is natural to believe that for any set x, the collection of those elements
y x which satisfy some particular property should also be a set. Again, no
tricksthe property should be specied by a formula of the language of set
theory. Since this should hold for any formula, we are again led to a scheme.
The Comprehension Scheme
For each formula (x, y, w
1
, . . . , w
n
) of the language of set theory, we have
the statement:
w
1
. . . w
n
x z z = y : y x (x, y, w
1
, . . . , w
n
)
This scheme could be another axiom scheme (and often is treated as such).
However, this would be unnecessary, since the Comprehension Scheme follows
from what we have already assumed. It is, in fact, a theorem schemethat is,
innitely many theorems, one for each formula of the language of set theory.
Of course we cannot write down innitely many proofs, so how can we prove
this theorem scheme?
We give a uniform method for proving each instance of the scheme. So to
be certain that any given instance of the theorem scheme is true, we consider
the uniform method applied to that particular instance. We give this general
method below.
For each formula (x, u, w
1
, . . . , w
n
) of the language of set theory we have:
Theorem 4.
w
1
. . . w
n
x z z = u : u x .
Proof. Apply Replacement with the formula (x, u, v, w
1
, . . . , w
n
) given by:
((x, u, w
1
, . . . , w
n
) v = u) ((x, u, w
1
, . . . , w
n
) v = )
to obtain:
y y = v : (u x)[( v = u) ( v = )].
28 CHAPTER 3. THE AXIOMS OF SET THEORY
Note that u : (x, u, w
1
, . . . , w
n
) y and the only other possible element
of y is . Now let z =
Y ). Then f : X
f
j
: j x. By the claim, the f
j
s agree on their
common domains, so that g is a function with domain x and
(m x) (m, g[m, g(m), w).
By the hypothesis of the theorem applied to g there is a unique y such that
(x, g, y, w). Dene f to be the function f = g x, y). It is straightfor-
ward to verify that f witnesses that x, y) F.
To prove (2), note that, by (1), for each x N there is n N and
f : n V such that F(x) = f(x) and, in fact, F[n = f. Hence,
(m n) (m, f[m, f(m), w).
38 CHAPTER 4. THE NATURAL NUMBERS
We prove (3) by induction. Assume that
(m n) H(m) = F(m)
with intent to show that H(n) = F(n). We assume (n, H[n, H(n), w) and
by (2) we have (n, F[n, F(n), w). By the hypothesis of the theorem applied
to H[n = F[n we get H(n) = F(n).
By applying this theorem to our specic example we see that REC(, N, w)
does indeed give us a function F. Since F is dened by recursion on N, we
use induction on N to verify the properties of F. For example, it is easy to
use induction to check that F(n) N for all n N.
We do not often explicitly state the formula in a denition by recursion.
The denition of F would be more often given by:
F(0) = 3
F(succ(n)) = succ(F(n))
This is just how the example started; nevertheless, this allows us to construct
the formula immediately, should we wish. Of course, in this particular
example we can use the plus symbol and give the denition by recursion by
the following formulas.
3 + 0 = 3
3 +succ(n) = succ(3 +n)
Now, lets use denition by recursion in other examples. We can dene
general addition on N by the formulas
a + 0 = a
a +succ(b) = succ(a +b)
for each a N. Here a is a parameter which is allowed by the inclusion of w
in our analysis. The same trick can be used for multiplicaton:
a 0 = 0
a (succ(b)) = a b +a
39
for each a N, using the previously dened notion of addition. In each
example there are two cases to specifythe zero case and the successor case.
Exponentiation is dened similarly:
a
0
= 1
a
succ(b)
= a
b
a
The reader is invited to construct, in each case, the appropriate formula ,
with a as a parameter, and to check that the hypothesis of the previous
theorem is satised.
G. Peano developed the properties of the natural numbers from zero, the
successor operation and induction on N. You may like to see for yourself
some of what this entails by proving that multiplication is commutative.
A set X is said to be nite provided that there is a natural number n and
a bijection f : n X. In this case n is said to be the size of X. Otherwise,
X is said to be innite.
Exercise 6. Use induction to prove the pigeon-hole principle: for n N
there is no injection f : (n+1) n. Conclude that a set X cannot have two
dierent sizes.
Do not believe this next result:
Proposition. All natural numbers are equal.
Proof. It is sucient to show by induction on n N that if a N and b N
and max (a, b) = n, then a = b. If n = 0 then a = 0 = b. Assume the
inductive hypothesis for n and let a N and b N be such that
max (a, b) = n + 1.
Then max (a 1, b 1) = n and so a 1 = b 1 and consequently a = b.
40 CHAPTER 4. THE NATURAL NUMBERS
Chapter 5
The Ordinal Numbers
The natural number system can be extended to the system of ordinal num-
bers.
An ordinal is a transitive set of transitive sets. More formally: for any
term t, t is an ordinal is an abbreviation for
(t is transitive) (x t)(x is transitive).
We often use lower case Greek letters to denote ordinals. We denote
: is an ordinal by ON.
From Theorem 6 we see immediately that N ON.
Theorem 10.
1. ON is transitive.
2. (z)(z = ON).
Proof.
1. Let ON; we must prove that ON. Let x ; we must prove
that
41
42 CHAPTER 5. THE ORDINAL NUMBERS
(a) x is transitive; and,
(b) (y x)(y is transitive).
Clearly (a) follows from the denition of ordinal. To prove (b), let
y x; by transitivity of we have y ; hence y is transitive.
2. Assume (z)(z = ON). From (1) we have that ON is a transitive
set of transitive sets, i.e., an ordinal. This leads to the contradiction
ON ON.
Theorem 11. (Trichotomy of Ordinals)
( ON)( ON)( = ).
Proof. The reader may check that a proof of this theorem can be obtained
by replacing N with ON in the proof of Theorem 7.
Because of this theorem, when and are ordinals, we often write <
for .
Since N ON, it is natural to wonder whether N = ON. In fact, we
know that N = ON can be neither proved nor disproved from the axioms
that we have stated (provided, of course, that those axioms are actually
consistent). We nd ourselves at a crossroads in Set Theory. We can either
add N = ON to our axiom system, or we can add N ,= ON.
As we shall see, the axiom N = ON essentially says that there are no
innite sets and the axiom N ,= ON essentially says that there are indeed
innite sets. Of course, we go for the innite!
The Axiom of Innity N ,= ON
43
As a consequence, there is a set of all natural numbers; in fact, N ON.
Theorem 12. (z)(z ON z = N).
Proof. Since N ON and N ,= ON, pick ONN. We claim that for each
n N we have n ; in fact, this follows immediately from the trichotomy
of ordinals and the transitivity of N. Thus N = x : x N and by
Comprehension z z = x : x N. The fact that N ON now follows
immediately from Theorem 6.
The lower case Greek letter is reserved for the set N considered as an
ordinal; i.e., = N. Theorems 6 and 12 now show that the natural numbers
are the smallest ordinals, which are immediately succeeded by , after which
the rest follow. The other ordinals are generated by two processes illustrated
by the next lemma.
Lemma.
1. ON ON = succ().
2. S [S ON ON =
S].
Exercise 7. Prove this lemma.
For S ON we write sup S for the least element of
ON : ( S)( )
if such an element exists.
Lemma. S [S ON
S = sup S]
Exercise 8. Prove this lemma.
An ordinal is called a successor ordinal whenever ON = succ().
If = sup , then is called a limit ordinal.
44 CHAPTER 5. THE ORDINAL NUMBERS
Lemma. Each ordinal is either a successor ordinal or a limit ordinal, but
not both.
Exercise 9. Prove this lemma.
We can perform induction on the ordinals via a process called
transnite induction. In order to justify transnite induction we need a the-
orem scheme.
For each formula (v, w) of the language of set theory we have:
Theorem 13.
For all w, if
n ON [(m n (m, w)) (n, w)]
then
n ON (n, w).
Proof. The reader may check that a proof of this theorem scheme can be
obtained by replacing N with ON in the proof of Theorem Scheme 8.
We can also carry out recursive denitions on ON. This process is called
transnite recursion. For any formula (x, f, y, w) of the language of set
theory, we denote by REC(, ON, w) the class
x, y) : (n ON)(f)[f : n Vf(x) = ym n (m, f[m, f(m), w)].
Transnite recursion is justied by the following theorem scheme.
For each formula (x, f, y, w) of the language of set theory we have:
Theorem 14.
For all w, suppose that we have
(x ON)(f)[(f : x V) !y (x, f, y, w)].
Then, letting F denote REC(, ON, w), we have:
45
1. F : ON V;
2. x ON (x, F[x, F(x), w);
and, furthermore, for any n ON and any function H with n
dom(H) we have:
3. If (x, H[x, H(x), w) for all x n n then H(n) = F(n).
Proof. The reader may check that a proof of this theorem scheme can be
obtained by replacing N with ON in the proof of Theorem Scheme 9.
When applying transnite recursion on ON we often have three sepa-
rate cases to specify, rather than just two as with recursion on N. This is
illustrated by the recursive denitions of the arithmetic operations on ON.
Addition:
+ 0 = ;
+succ() = succ( +);
+ = sup + : , for a limit ordinal .
Multiplication:
0 = 0;
succ() = ( ) +;
= sup : , for a limit ordinal .
Exponentiation:
0
= 1;
succ()
= (
) ;
= sup
= 1;
14. If 1 < and < then
<
;
15.
sup S
= sup
: S;
16.
(+)
=
;
17. (
; and,
18. If < then
.
Exercise 10. Build your transnite induction skills by proving two parts of
this theorem.
47
However, ordinal addition and multiplication are not commutative. This
is illustrated by the following examples, which are easy to verify from the
basic denitions.
Examples.
1. 1 + = 2 +
2. 1 + ,= + 1
3. 1 = 2
4. 2 ,= 2
5. 2
= 4
6. (2 2)
,= 2
is a limit ordinal.
Exercise 11. Prove this lemma.
Lemma. If is a non-zero ordinal, then there is a largest ordinal such
that
.
Exercise 12. Prove this lemma. Show that the and that there are
cases in which = . Such ordinals are called epsilon numbers (The
smallest such ordinal =
is called
0
.)
Lemma. ON ! ON = +.
Exercise 13. Prove this lemma.
Commonly, any function f with dom(f) is called a sequence. If
dom(f) n+1 for some n , we say that f is a nite sequence; otherwise
f is an innite sequence. As usual, we denote the sequence f by f
n
, where
each f
n
= f(n).
Theorem 16. There is no innite descending sequence of ordinals.
48 CHAPTER 5. THE ORDINAL NUMBERS
Proof. Lets use an indirect proof. Suppose x is innite and f : x ON
such that if n < m then f(n) > f(m). Let X = f(n) : n x. By
Foundation there is y X such that y X = ; i.e., there is n x such that
f(n) X = . However, if m x and m > n then f(m) f(n), which is a
contradiction.
If n and s: (n + 1) ON is a nite sequence of ordinals, then the
sum
n
i=0
s(i) is dened by recursion as follows.
0
i=0
s(i) = s(0); and,
m+1
i=0
s(i) =
m
i=0
s(i) +s(m + 1), for m < n.
This shows that statements like the following theorem can be written
precisely in the language of set theory.
Theorem 17. (Cantor Normal Form)
For each non-zero ordinal there is a unique n and nite sequences
m
0
, . . . , m
n
of positive natural numbers and
0
, . . . ,
n
of ordinals which sat-
isfy
0
>
1
> >
n
such that
=
0
m
0
+
1
m
1
+ +
n
m
n
.
Proof. Using the penultimate lemma, let
0
= max :
and then let
m
0
= max m :
0
m
which must exist since
0
m for all m would imply that
0
+1
.
49
By the previous lemma, there is some
0
ON such that
=
0
m
0
+
0
where the maximality of m
0
ensures that
0
<
0
. Now let
1
= max :
so that
1
<
0
. Proceed to get
m
1
= max m :
1
m
0
and
1
<
1
such that
0
=
1
m
1
+
1
. We continue in this manner as
long as possible. We must have to stop after a nite number of steps or else
0
>
1
>
2
> . . . would be an innite decreasing sequence of ordinals. The
only way we could stop would be if some
n
= 0. This proves the existence
of the sum. Uniqueness follows by induction on ON.
Exercise 14. Verify the last statement of this proof.
Lemma.
1. If 0 < m < and is a non-zero ordinal, then m
.
2. If k , and m
0
, . . . , m
k
< , and
0
, . . . ,
k
< , then
m
0
0
+ +m
k
k
<
.
Exercise 15. Prove this lemma and note that it implies that m = for
each positive integer m and each limit ordinal .
There is an interesting application of ordinal arithmetic to Number The-
ory. Pick a numbersay x = 54. We have 54 = 2
5
+ 2
4
+ 2
2
+ 2 when it
is written as the simplest sum of powers of 2. In fact, we can write out 54
using only the the arithmetic operations and the numbers 1 and 2. This will
be the rst step in a recursively dened sequence of natural numbers, x
n
.
It begins with n = 2 and is constructed as follows.
x
2
= 54 = 2
(2
2
+1)
+ 2
2
2
+ 2
2
+ 2.
50 CHAPTER 5. THE ORDINAL NUMBERS
Subtract 1.
x
2
1 = 2
(2
2
+1)
+ 2
2
2
+ 2
2
+ 1.
Change all 2s to 3s, leaving the 1s alone.
x
3
= 3
(3
3
+1)
+ 3
3
3
+ 3
3
+ 1.
Subtract 1.
x
3
1 = 3
(3
3
+1)
+ 3
3
3
+ 3
3
.
Change all 3s to 4s, leaving any 1s or 2s alone.
x
4
= 4
(4
4
+1)
+ 4
4
4
+ 4
4
.
Subtract 1.
x
4
1 = 4
(4
4
+1)
+ 4
4
4
+ 3 4
3
+ 3 4
2
+ 3 4 + 3.
Change all 4s to 5s, leaving any 1s, 2s or 3s alone.
x
5
= 5
(5
5
+1)
+ 5
5
5
+ 3 5
3
+ 3 5
2
+ 3 5 + 3.
Subtract 1 and continue, changing 5s to 6s, subtracting 1, changing 6s to
7s and so on. One may ask the value of the limit
lim
n
x
n
.
What is your guess? The answer is surprising.
Theorem 18. (Goodstein)
For any initial choice of x there is some n such that x
n
= 0.
Proof. We use an indirect proof; suppose x N and for all n 2 we have
x
n
,= 0. From this sequence, we construct another sequence. For each n 2
we let g
n
be the result of replacing each occurrence of n in x
n
by . So, in
the example above we would get:
g
2
=
(
+1)
+
(
)
+
+,
g
3
=
(
+1)
+
(
)
+
+ 1,
g
4
=
(
+1)
+
(
)
+
,
g
5
=
(
+1)
+
(
)
+ 3
3
+ 3
2
+ 3 + 3,
g
6
=
(
+1)
+
(
)
+ 3
3
+ 3
2
+ 3 + 2,
51
etc. The previous lemma can now be used to show that g
n
would be an
innite decreasing sequence of ordinals.
It is interesting that, although the statement of the theorem does not
mention innity in any way, we used the Axiom of Innity in its proof. We
do not need the Axiom of Innity in order to verify the theorem for any one
particular value of xwe just need to carry out the arithmetic. The reader
can do this for x = 4; however, nishing our example x = 54 would be tedious.
Moreover, the calculations are somewhat dierent for dierent values of x.
Mathematical logicians have proved that, in fact, there is no uniform method
of nitary calculations which will give a proof of the theorem for all x. The
Axiom of Innity is necessary for the proof.
52 CHAPTER 5. THE ORDINAL NUMBERS
Chapter 6
Relations and Orderings
In the following denitions, R and C are terms.
1. We say R is a relation on C whenever R C C.
2. We say a relation R is irreexive on C whenever x C x, x) / R.
3. We say a relation R is transitive on C whenever
x y z [(x, y) R y, z) R) x, z) R.]
4. We say a relation R is well founded on C whenever
X [(X C X ,= ) (x X y X y, x) / R)].
Such an x is called minimal for X.
5. We say a relation R is total on C whenever
x C y C [x, y) R y, x) R x = y].
6. We say R is extensional on C whenever
x C y C [x = y z C (z, x) R z, y) R)].
53
54 CHAPTER 6. RELATIONS AND ORDERINGS
An example of a relation R on an ordinal is given by the membership
relation:
x, y) R i x y.
R satises all the above properties. On the other hand, if / , the reverse
relation R
given by
x, y) R
i y x
is not well founded but nevertheless has all the other properties.
Any well founded relation is irreexive. Any total relation is extensional.
Any relation which is both well founded and total is also transitive.
Exercise 16. Prove that a transitive set is an ordinal i the membership
relation is total on .
Suppose is an ordinal and f : X . A relation R on X with the
property that f(x) < f(y) whenever x, y) R must be a well founded
relation. In fact, this turns out to be a characterisation.
Theorem 19. Let R be a relation on a set X. R is well founded i there
is an ordinal and a surjection f : X such that f(x) < f(y) whenever
x, y) R.
Proof. We treat only the forward implication. Using recursion on ON we
dene g : ON T(X) by
g() =
_
x : x is a minimal element of X
_
g() : <
_
.
From g we obtain f : X ON by
f(x) =
_
the unique ON with x g(), if possible;
0, otherwise.
By Theorem 10 and the Axiom of Replacement there must be some least
ON such that / rng(f). This means that g() = , and since R is well
founded we must have
X =
_
g() : < .
55
To nish the proof suppose x, y) R and f(y) = . We have y g()
so that y is a minimal element of
X
_
g() : < ,
and hence we have that
x / X
_
g( : < .
In other words x g() for some < and so f(x) = < = f(y).
A relation R on a set A is said to be isomorphic to a relation S on a set
B provided that there is a bijection f : A B, called an isomorphism, such
that for all x and y in A we have
x, y) R i f(x), f(y)) S.
Exercise 17. Prove that any isomorphism between transitive sets is the
identity. Of course, the relation on the transitive set is the membership
relation, which is extensional and well founded.
This unexpected result leads to the important Mostowski Collapsing The-
orem.
Theorem 20. Let R be a well founded extensional relation on a set X. There
is a unique transitive set M and a unique isomorphism h : X M.
Proof. Obtain f : X directly from the previous theorem. By recursion
on the ordinals we dene for each < a function h
: f
V such
that
h
(y) = h
and x, y) R then x f
(y) = h
(x).
61
For any ordinal , we denote by
+
the least cardinal greater than .
This is well dened by Theorem 24.
Exercise 21. Prove the following:
1. The supremum of a set of cardinals is a cardinal.
2. z z = : is a cardinal.
Theorem 25. For any innite cardinal , [ [ = .
Proof. Let be an innite cardinal. The formulas [[ = [ 0[ and
[ 0[ [ [ imply that [ [. We now show that [ [ .
We use induction and assume that [ [ = for each innite cardinal
< .
We dene an ordering on by:
0
,
0
) <
1
,
1
) i
_
_
max
0
,
0
< max
1
,
1
;
max
0
,
0
= max
1
,
1
0
<
1
; or,
max
0
,
0
= max
1
,
1
0
=
1
0
<
1
.
It is easy to check that < well orders .
Let = type , <). It suces to show that . And for this it
suces to show that for all , ) ,
type <
, ), <) < .
To this end, pick , ) . Let be such that <
, ) .
This is possible since is a limit ordinal. It now suces to prove that
type <
, ), <) < .
By Theorem 23 it suces to prove that [<
X by
f(a, )) = f
a
().
Using Exercise 20 and the previous corollary, the result follows.
Dene
B
A as f : f : B A and [A]
as x : x A [x[ = .
Lemma. If is an innite cardinal, then [
[ = [[]
[ = [T()[.
Proof. We have,
2
[ ]
T( ).
Using characteristic functions it is easily proved that [
2[ = [T()[. Since
[ [ = we have the result.
Lemma. If and is an innite cardinal, then [
[ = [[]
[.
Proof. For each x []
there is a bijection f
x
: x. Since f
x
, we
have an injection from []
into
, so [[]
[ [
[.
Now
[ ]
. Thus, [
[ [[]
[.
A subset S of a limit ordinal is said to be conal whenever sup S = .
A function f : is said to be conal whenever rng(f) is conal in .
The conality, cf(), of a limit ordinal is the least ON such that there
is a conal f : .
63
A cardinal is said to be regular whenever cf() = . Otherwise,
is said to be singular. These notions are of interest when is an innite
cardinal. In particular, is regular.
Lemma.
1. For each limit ordinal , cf() is a regular cardinal.
2. For each limit ordinal ,
+
is a regular cardinal.
3. Each innite singular cardinal contains a conal subset of regular car-
dinals.
Exercise 22. Prove this lemma.
Theorem 27. (Konigs Theorem)
For each innite cardinal , [
cf()
[ > .
Proof. We show that there is no surjection g :
, where = cf().
Let f : witness that cf() = . Dene h: such that each
h() / g()() : < f(). Then h / g
2[ = [
()
2[ = [
2)[ = [
[ [
cf()
[ > .
Cantors Theorem guarantees that for each ordinal there is a set, T(),
which has cardinality greater than . However, it does not imply, for exam-
ple, that
+
= [T()[. This statement is called the Continuum Hypothesis,
and is equivalent to the third question in the introduction.
64 CHAPTER 7. CARDINALITY
The aleph function : ON ON is dened as follows:
(0) =
() = sup ()
+
: .
We write
for ().
The beth function : ON ON is dened as follows:
(0) =
() = sup [T(())[ : .
We write
for ().
It is apparent that
1
1
. The continuum hypothesis is the statement
1
=
1
; this is abbreviated as CH. The generalised continuum hypothesis,
GCH, is the statement ON
.
Exercise 23. Prove that
[ is an innite cardinal ( ON( =
)].
A cardinal is said to be inaccessible whenever both is regular and
< [T()[ < .
An inaccessible cardinal is sometimes said to be strongly inaccessible, and
the term weakly inaccesible is given to a regular cardinal such that
<
+
< .
Under the GCH these two notions are equivalent.
Axiom of Inaccessibles > and is an inaccessible cardinal
This axiom is a stronger version of the Axiom of Innity, but the mathemat-
ical community is not quite ready to replace the Axiom of Innity with it
just yet. In fact, the Axiom of Inacessibles is not included in the basic ZFC
axiom system and is therefore always explicitly stated whenever it is used.
Exercise 24. Are the following two statements true? What if is assumed
to be a regular cardinal?
1. is weakly inaccessible i =
.
2. is strongly inaccesssible i =
.
Chapter 8
There Is Nothing Real About
The Real Numbers
We now formulate three familiar number systems in the language of set the-
ory. From the natural numbers we shall construct the integers; from the
integers we shall construct the decimal numbers and the real numbers.
For each n N let n = m : m n. The integers, denoted by Z,
are dened as
Z = N n : n N.
For each n N we denote n by n.
We can extend the ordering < on N to Z by letting x < y i one of the
following holds:
1. x N y N x < y;
2. x / N y N; or,
3. x / N y / N
y <
x.
To form the reals, rst let
F = f : f
Z.
65
66CHAPTER 8. THERE IS NOTHINGREAL ABOUT THE REAL NUMBERS
We pose a few restrictions on such functions as follows. Let us write:
A(f) for (n > 0)(9 f(n) 9);
B(f) for (n )(f(n) 0) (n )(f(n) 0);
C(f) for (m )(n m)(f(n) / 9, 9); and,
D(f) for (m )(n m)(f(n) = 0).
Finally, let
R = f : f F and A(f) and B(f) C(f)
to obtain the real numbers and let
D = f : f R and D(f)
to obtain the decimal numbers.
We now order R as follows: let f < g i
(n ) [f(n) < g(n) (m n)(f(m) = g(m))].
This ordering clearly extends our ordering on N and Z, and restricts to D.
In light of these denitions, the operations of addition, multiplication and
exponentiation dened in Chapter 4 can be formally extended from N to Z,
D and R in a naturalif cumbersomefashion.
Lemma.
1. [D[ =
0
.
2. R is uncountable.
3. < is a linear order on D.
4. (p R)(q R) [p < q d D p < d < q].
I.e., D is a countable dense subset of R.
5. < is complete; i.e., bounded subsets have suprema and inma.
Exercise 25. Prove this lemma.
Theorem 28. Any two countable dense linear orders without endpoints are
isomorphic.
67
Proof. This method of proof, the back-and-forth argument, is due to G.Cantor.
The idea is to dene an isomorphism recursively in steps, such that at each
step we have an order-preserving nite function; at even steps f(x
i
) is dened
and at odd steps f
1
(y
j
) is dened.
Precisely, if X = x
i
: i and Y = y
j
: j are two countable
dense linear orders we dene f : X Y by the formulas
f
0
= x
0
, y
0
)
f
n+1
= f
n
x
i
, y
j
)
where
1. if n is even, i = min k : x
k
/ dom(f
n
) and j is chosen so that
f
n
x
i
, y
j
) is order-preserving; and,
2. if n is odd, j = min k : y
k
/ rng(f
n
) and i is chosen so that
f
n
x
i
, y
j
) is order-preserving.
We then check that for each n , there is indeed a choice of j in (1) and i
in (2) and that f =
f
n
: n is an isomorphism.
Any complete dense linear order without endpoints and with a countable
dense subset is isomorphic to R, <). Can with a countable dense subset
be replaced by in which every collection of disjoint intervals is countable?.
The armation of this is called the Suslin Hypothesis. It is the second most
important problem in Set Theory. It too requires new axioms for its solution.
We have begun with N, extended to Z, then extended again to R. We
now extend once more to
R. This is the set of hyperreals. In order to do
this, we introduce the important notion of an ultralter.
A collection of subsets | T(S) is a lter provided that it satises the
rst three of the following conditions:
1. S | and / |.
68CHAPTER 8. THERE IS NOTHINGREAL ABOUT THE REAL NUMBERS
2. If A | and B |, then A B |.
3. If A | and A B, then B |.
4. A | or B | whenever A B = S.
5. x S x / |.
If a lter | obeys condition (4) it is called an ultralter, and if | satises
conditions (1) through (5) it is said to be a free or non-principal ultralter.
An ultralter is a maximal lter under inclusion. Every lter can be
extended to an ultralter. (We can recursively dene it.) There is a free
ultralter over .
Theorem 29. (Ramsey)
If P : []
2
0, 1, then there is H []
such that [P
[H]
2
[ = 1.
Proof. Let | be a free ultralter over . Either:
1. : : P(, ) = 0 | |; or,
2. : : P(, ) = 1 | |.
As such, the proof breaks into two similar cases. We address case (1).
Let S = : : P(, ) = 0 |. Pick
0
S and let
S
0
= : P(
0
, ) = 0.
Pick
1
S S
0
and let
S
1
= : P(
1
, ) = 0.
In general, recursively choose
n
: n < such that for each n
n+1
S S
0
S
n
,
69
where
S
n
= : P(
n
, ) = 0.
Then H =
n
: n < exhibits the desired property.
Theorem 30. (Sierpinski)
There is a function P : [
1
]
2
0, 1 such that there is no H [
1
]
1
with [P
[H]
2
[ = 1.
Proof. Let f :
1
R be an injection. Dene P as follows: for < , let
P(, ) =
_
0, if f() < f();
1, if f() > f().
(8.1)
The following exercise nishes the proof.
Exercise 26. There is no subset of R with order type
1
.
Let | be a free ultralter over . Form an equivalence relation on
R
by the rule:
f g whenever n : f(n) = g(n) |.
The equivalence class of f is denoted by
[f] = g
R : g f.
The set of equivalence classes of is called the ultrapower of R with respect
to |. The elements of
R are often called the hyperreal numbers and denoted
R.
There is a natural embedding of R into
R given by
x [f
x
]
where f
x
: R is the constant function; i.e., f
x
(n) = x for all n ; we
identify R with its image under the natural embedding.
70CHAPTER 8. THERE IS NOTHINGREAL ABOUT THE REAL NUMBERS
We can dene an ordering
< on
R by the rule:
a <
F : (
R)
n
R
given by
F(a) = [F s]
where a = a
0
, . . . , a
n1
) (
R)
n
and s: R
n
such that for each j
s(j) = s
0
(j), . . . , s
n1
(j))
and s
i
a
i
for each i.
Theorem 31. (The Leibniz Transfer Principle)
Suppose F : R
n
R and G: R
n
R.
1. a R
n
F(a) = G(a) i a (
R)
n
F(a) =
G(a).
2. a R
n
F(a) < G(a) i a (
R)
n
F(a) <
G(a).
Exercise 29. Prove the above theorem. This includes verifying that
F is
a function.
As a consequence of this theorem, we can extend + and to
R. For
example, a +b = c means that
f a g b h c n : f(n) +g(n) = h(n) |.
71
Indeed, the natural embedding embeds R as an ordered subeld of
R.
In order to do elementary calculus we consider the innitesimal elements
of
R. Note that R ,=
R; consider
= [1, 1/2, 1/3, . . . , 1/n, . . . )].
Then > 0 but < r for each positive r R; a member of
R with this
property is called a positive innitesimal. There are negative innitesimals;
0 is an innitesimal. Since
R is a eld 1/ exists and 1/ > r for any real
number r; it is an example of a positive innite number.
A hyperreal number a is said to be nite whenever [a[ < r for some real
r. Two hyperreal numbers a and b are said to be innitely close whenever
a b is innitesimal. We write a b.
Lemma. Each nite hyperreal number is innitely close to a unique real
number.
Proof. Let a be nite. Let s = sup r R : r < a. Then a s. If we also
have another real t such that a t, then we have s t and so s = t.
If a is nite, the standard part of a, st(a), is dened to be the unique real
number which is innitely close to a.
It is easy to check that for nite a and b,
st(a +b) = st(a) +st(b); and,
st(a b) = st(a) st(b).
For a function F : R R we dene the derivative, F
(x), of F(x) to be
st
_
F(x +x) F(x)
x
_
provided this exists and is the same for each non-zero innitesimal x.
72CHAPTER 8. THERE IS NOTHINGREAL ABOUT THE REAL NUMBERS
The fact that F
(x)x +x
where y = F(x +x) F(x).
Theorem 32. (The Chain Rule)
Suppose y = F(x) and x = G(t) are dierentiable functions.
Then y = F(G(t)) is dierentiable and has derivative F
(G(t)) G
(t).
Proof. Let t be any non-zero innitesimal. Let x = G(t + t) G(t).
Since G
(t) = st(
x
t
) = 0.
So st(
y
t
) = F
(G(t)) G
(t).
Case 1: x ,= 0
y
t
=
y
x
x
t
. So st(
y
t
) = st(
y
x
) st(
x
t
), and again,
st(
y
t
) = F
(x) G
(t) = F
(G(t)) G
(t).
Chapter 9
The Universe
In this chapter we shall discuss two methods of measuring the complexity
of a set, as well as their corresponding gradations of the universe. For this
discussion it will be helpful to develop both a new induction and a new
recursion procedure, this time on the whole universe. Each of these will
depend upon the fact that every set is contained in a transitive set, which
we now prove.
By recursion on N, we dene:
_
0
X = X; and,
_
n+1
X =
_
(
_
n
X).
We dene the transitive closure of X as,
trcl(X) =
_
_
n
X : n .
Theorem 33.
1. X trcl(X) is the smallest transitive set containing X.
2. x trcl(x) = x
trcl(y) : y x.
Proof. Note that Y is transitive i
Y Y , so trcl(X) is transitive.
73
74 CHAPTER 9. THE UNIVERSE
1. If Y is transitive and X Y then for each n N,
n
X
n
Y Y .
So trcl(X) Y .
2. To prove that trcl(x) x
n
y
n+1
x. Hence
trcl(y) trcl(x). For the opposite containment, we prove by induction
that n
n
x x
trcl(y) : y x.
As with N and ON, we can perform induction on the universe, called
induction, as illustrated by the following theorem scheme.
For each formula (v, w) of the language of set theory we have:
Theorem 34.
For all w, if
n [(m n (m, w)) (n, w)]
then
n (n, w).
Proof. We shall assume that the theorem is false and derive a contradiction.
We have w and a xed l such that (l, w).
Let t be any transitive set containing l. Thanks to the previous theorem,
we can let t = trcl(l l). The proof now proceeds verbatim as the proofs
of Theorems 8 and 13.
We can also carry out recursive denitions on V; this is called recursion.
For any formula (x, f, y, w) of the language of set theory, we denote by
REC(, V) the class
x, y) : (n)(f) [f : n V f(x) = y m n (m, f[m, f(m), w)].
recursion is justied by the next theorem scheme.
75
For each formula (x, f, y, w) of the language of set theory we have:
Theorem 35.
For all w, suppose that we have
(x)(f) [(f : x V) !y (x, f, y, w)].
Then, letting F denote REC(, V), we have:
1. F : V V;
2. x (x, F[x, F(x), w); and furthermore for any n and any function H
with n dom(H) we have,
3. (x, H[x, H(x), w) for all x n n then H(n) = F(n).
Proof. The proof is similar to that of Theorems 9 and 14.
Our rst new measure of the size of a set is given by the rank function.
This associates, to each set x, an ordinal rank(x) by the following rule:
rank(x) = sup rank(y) + 1 : y x
Observe that ON rank() = .
By recursion on ON we dene the cumulative hierarchy, an ordinal-gradation
on V, as follows.
R(0) = ;
R( + 1) = T(R()); and,
R() =
_
R() : < if is a limit ordinal.
Sometimes we write R
or V
R() : ON.
Proof.
1. This is an easy induction on ON.
2. Apply induction on , using (1).
3. () Note that if is least ordinal such that x R(), then is a
successor ordinal, so choose such that = + 1;
() This uses (2).
4. First show by induction on that rank(x) < implies x R().
5. This follows from x ON rank(x) = .
We have discussed cardinality as a way of measuring the size of a set.
However, if our set is not transitive, the cardinality function does not tell the
whole story since it cannot distinguish the elements of the set. For example,
although N N, [N[ =
0
while [N[ = 1. Intuitively, we think of N as
no smaller than N. As such, we dene the hereditary cardinality, hcard(x),
of a set x, as the cardinality of its transitive closure:
hcard(x) = [trcl(x)[
The corresponding cardinal-gradation is dened as follows.
For each cardinal ,
H() = x : [trcl(x) < .
77
The members of H() are called the hereditarily nite sets and the members
of H(
1
) are called the hereditarily countable sets. The next theorem lists
some important properties of H.
Theorem 37.
1. For any innite cardinal , H() is transitive.
2. For any innite cardinal , x hcard(x) < rank(x) < .
3. For any innite cardinal , H() R().
4. For any innite cardinal , z z = H().
5. x ( is a cardinal and x H()); i.e,
V =
H() : is a cardinal.
Proof.
1. Apply part (2) of Theorem 33.
2. Case 1: is regular.
The proof is by -induction on V, noting that if rank(y) < for
each y x and [x[ < , then rank(x) < .
Case 2: is singular.
There is a regular cardinal such that hcard(x) < < ; now
use Case 1.
3. This follows from (2) and Theorem 36.
4. Apply (3) and Comprehension.
5. This follows since x hcard(x) = .
Theorem 38. If is an inaccessible cardinal, then H() = R().
78 CHAPTER 9. THE UNIVERSE
Proof. From Theorem 37 we have H() R().
For the reverse containment, let x R(). By Theorem 36, <
such that x R( + 1) R(). So x R(). Since R() is transitive,
trcl(x) R().
It suces to prove by induction that < [R()[ < . For successor
= + 1, we note that [T()[ < , where = [R()[; for limit we apply
Theorem 36, observing that is regular.
Chapter 10
Reection
There is a generalisation of the Equality Axiom, called the Equality Principle,
which states that for any formula we have x = y implying that holds at
x i holds at y. The proof requires a new technique, called induction on
complexity of the formula.
We make this precise. For each formula of set theory all of whose free
variables lie among v
0
, . . . , v
n
we write (v
0
, . . . , v
n
) and for each i and j with
0 i n, we denote by (v
0
, . . . , v
i
/v
j
, . . . , v
n
) the result of substituting v
j
for each free occurance of v
i
.
For each formula (v
0
, . . . , v
n
) and each i and j with 0 i n we have:
Theorem 39. , i, j
v
0
. . . v
i
. . . v
n
v
j
[v
i
= v
j
((v
0
, . . . , v
n
) ((v
0
, . . . , v
i
/v
j
, . . . , v
n
))]
This is a scheme of theorems, one for each appropriate , i, j.
Proof. The proof will come in two steps.
1. We rst prove this for atomic formulas .
79
80 CHAPTER 10. REFLECTION
Case 1 is v
i
v
k
where i ,= k
This is true by Axiom of Equality.
Case 2 is v
k
v
i
where i ,= k
This is true by Axiom of Extensionality.
Case 3 is v
i
v
i
This is true since by Theorem 2 both v
i
v
j
and v
j
v
i
are false.
Case 4 is v
k
v
k
where i ,= k
This is true since both (v
0
, . . . , v
n
) and (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
are the same.
Case 5 is v
i
= v
k
where i ,= k
This is true by Theorem 1.
Case 6 is v
k
= v
i
where i ,= k
This is similar to case 5.
Case 7 is v
i
= v
i
This is true since by Theorem 1 both v
i
= v
i
and v
j
= v
j
are true.
Case 8 is v
k
= v
k
where i ,= k
This is similar to case 4.
2. We now show that for any subformula of , if the theorem is true
for all proper subformulas of then it is true for . Here and
are subformulas of and is expressed in parentheses for each case
according to how is built.
Case 1 ()
This is true since if
(v
0
, . . . , v
n
) (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
then
(v
0
, . . . , v
n
) (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
Case 2 ( )
From the hypothesis that
(v
0
, . . . , v
n
) (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
and
(v
0
, . . . , v
n
) (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
81
we obain
(v
0
, . . . , v
n
) (v
0
, . . . , v
n
)
i
(v
0
, . . . , v
i
/v
j
, . . . , v
n
) (v
0
, . . . , v
i
/v
j
, . . . , v
n
).
Cases 3 through 5 result from Cases 1 and 2.
Case 3 ( )
Case 4 ( )
Case 5 ( )
Case 6 (v
k
) and i ,= k
We have
v
0
. . . v
n
v
j
[v
i
= v
j
(v
0
, . . . , v
n
)
(v
0
, . . . , v
i
/v
j
, . . . , v
n
)].
If v
k
is not free in , then v
k
and we are done. If v
k
is
free in , then since 0 k n we have
v
0
. . . v
n
v
j
v
k
[v
i
= v
j
(v
0
, . . . , v
n
)
(v
0
, . . . , v
i
/v
j
, . . . , v
n
)]
and so
v
0
. . . v
n
v
j
[v
i
= v
j
(v
k
)((v
0
, . . . , v
n
)
(v
0
, . . . , v
i
/v
j
, . . . , v
n
))]
and so
v
0
. . . v
n
v
j
[v
i
= v
j
(v
k
)(v
0
, . . . , v
n
)
(v
k
)(v
0
, . . . , v
i
/v
j
, . . . , v
n
))].
Case 7 (v
k
) ) and i ,= k
This follows from Cases 1 and 6.
Case 8 (v
i
)
This is true since v
i
is not free in , hence (v
0
, . . . , v
i
/v
j
, . . . , v
n
)
is (v
0
, . . . , v
n
).
82 CHAPTER 10. REFLECTION
Case 9 (v
i
) )
This is similar to case 8.
We conclude that the theorem scheme holds for all appropriate , i, j.
Let M be a term and any formula of the language of set theory. We
dene the relativisation of to M, denoted by
M
, as follows:
1. If is atomic then
M
is ;
2. If is then
M
is
M
;
3. If is (
1
2
) then
M
is (
M
1
M
2
);
4. If is (
1
2
) then
M
is (
M
1
M
2
);
5. If is (
1
2
) then
M
is (
M
1
M
2
);
6. If is (
1
2
); then
M
is (
M
1
M
2
);
7. If is (v
i
) then
M
is (v
i
M)
M
; and,
8. If is (v
i
) then
M
is (v
i
M)
M
.
We write M [= for
M
and moreover whenever has no free variables we
say that M is a model of .
We denote by ZFC the collection of axioms which include: Equality,
Extensionality, Existence, Pairing, Foundation, Union, Intersection, the Re-
placement Scheme, Power Set, Choice and Innity.
For each axiom of ZFC, except for the Axiom of Innity, we have:
Lemma.
R() [= .
83
Lemma. If M is transitive, then M models Equality, Extensionality, Exis-
tence and Foundation.
For each axiom of ZFC, except for those in the Replacement Scheme,
we have:
Theorem 40. For each uncountable , R() [= .
Exercise 30. Prove the above theorem scheme. Use the fact that V [= .
For each axiom of ZFC, except for Power Set, we have:
Theorem 41. For each uncountable regular cardinal , H() [= .
Exercise 31. Prove the above theorem scheme.
If is an inaccessible cardinal, then R() = H() [= , for each axiom
of ZFC.
Lemma. If is the least inaccessible cardinal, then
H() [= an inaccessible cardinal.
From this lemma we can infer that there is no proof, from ZFC, that
there is an inaccessible cardinal. Suppose is the conjuction of all the
(nitely many) axioms used in such a proof. Then from we can derive
()( is an inaccessible). Let be the least inaccessible cardinal. Then
H() [= so
H() [= ()( is an inaccessible).
This assumes that our proof system is sound; i.e., if from
1
we can derive
2
and M [=
1
, then M [=
2
. We conclude that ZFC plus an inaccessible
is consistent.
We can also infer that ZFC minus Innity plus (z)(z = N) is
consistent. Suppose not. Suppose is the conjuction of the nitely many
axioms of ZFC Inf needed to prove (z)(z = N). Then H() [= , so
H() [= (z)(z = N), which is a contradiction.
84 CHAPTER 10. REFLECTION
In fact, given any collection of formulas without free variables and a class
M such that for each in the collection M [= , we can then conclude that
the collection is consistent.
For example, ZFC minus Power Set plus (x)(x is countable) is con-
sistent:
Lemma. H(
1
) [= (x)(x is countable).
If (v
0
, . . . , v
k
) is a formula of the language of set theory and
M = x :
M
(x, v) and C = x :
C
(x, v) are classes, then we say
is absolute between M and C whenever
(v
0
M C) . . . (v
k
M C) [
M
C
].
This concept is most often used when M C. When C = V, we say that
is absolute for M.
For classes M = x :
M
(x, v) and C = x :
C
(x, v) with M C and
a list
0
, . . . ,
m
of formulas of set theory such that for each i m every
subformula of
i
is contained in the list, we have:
Lemma.
M
,
C
,
1
, . . . ,
m
The following are equivalent:
1. Each of
1
, . . . ,
m
are absolute between M and C.
2. Whenever
i
is x
j
(x, v) for i, j m we have
(v
1
M . . . v
k
M)(x C
C
j
(x, v) x M
C
j
(x, v)).
The latter statement is called the Tarski-Vaught Condition.
Proof. ((1) (2))
85
Let v
0
M, . . . , v
k
M and suppose x C
C
j
(x, v
0
, . . . , v
k
). Then
C
i
(v
0
, . . . , v
k
) holds. By absoluteness
M
i
(v
0
, . . . , v
k
) holds; i.e.,
x M
M
j
(x, v
0
, . . . , v
k
), so by absoluteness of
j
(x, v
0
, . . . , v
k
) for x M
we have x M
C
j
(x, v
0
, . . . , v
k
).
(2) (1)
This is proved by induction on complexity of
i
, noting that each sub-
formula appears in the list. There is no problem with the atomic formula
step since atomic formulas are always absolute. Similarly, the negation and
connective steps are easy. Now if
i
is x
j
and each of v
1
, . . . , v
k
is in M
we have
M
i
(v
1
, . . . , v
k
) x M
M
j
(x, v
1
, . . . , v
k
)
x M
C
j
(x, v
1
, . . . , v
k
)
x C
C
j
(x, v
1
, . . . , v
k
)
C
i
(v
1
, . . . , v
k
)
where the second implication is due to the inductive hypothesis and the third
implication is by part (2).
The next theorem scheme is called the Levy Reection Principle.
For each formula of the language of set theory, we have:
Theorem 42.
ON ON [ > and is absolute for R()].
If, in addition, has no free variables then implies R() [= . This is
interpreted as the truth of being reected to R().
Proof. Form a collection
1
, . . . ,
m
of all the subformulas of . We will
use the Tarski-Vaught Condition for
1
, . . . ,
m
to get absoluteness between
R() and V; but rst we must nd .
86 CHAPTER 10. REFLECTION
For each i = 1, . . . , m such that
i
is x for some formula we dene
f
i
such that f
i
: V ON by setting
f
i
(y
1
, . . . , y
l
i
)) = min : ON x R() (x, y
1
, . . . , y
l
i
)
if such an ordinal exists, and f
i
(y
1
, . . . , y
l
i
)) = otherwise.
Now dene h: ON by recursion by the formulas
h(0) =
h(n + 1) = sup f
i
(y
1
, . . . , y
l
i
)) : 1 i < m and each y
j
R(h(n))
and then let = sup h(n) : n . This works.
The analogous theorem scheme can be proven for the H() hierarchy as
well.
We can now argue that ZFC cannot be nitely axiomatised. That is,
there is no one formula without free variables which implies all axioms of
ZFC and is, in turn, implied by ZFC. Suppose such a exists. By the
Levy Reection Principle, choose the least ON such that
R()
. We
have R() [= . Hence R() [= ON
R()
, since this instance of the
theorem follows from ZFC. Thus,
( ON
R()
)
R()
.
That is,
(ON R())
R()R()
.
But ON R() = , the ordinals of rank < . So, < and we have
<
R()
, contradicting the minimality of .
For any formula of the language of set theory with no free variables
and classes M = v :
M
(v), C = v :
C
(v), and F = v :
F
(v) we
have:
Lemma. ,
M
,
C
,
F
If F : M C is an isomorphism then is absolute between M and C.
That is, M [= i C [= .
87
Proof. It is easy to show by induction on the complexity of that
(x
1
, . . . , x
k
) (F(x
1
), . . . , F(x
k
))
for each subformula of .
A bounded formula (also called a
0
formula) is one which is built up as
usual with respect to atomic formulas and connectives, but where each x
clause is replaced by x y , and the x clause is replaced by x y .
0
formulas are absolute for transitive models. That is, for each
0
formula (v
0
, . . . , v
k
, w) and for each class M = x :
M
(x, w) we have:
Theorem 43. ,
M
w if M is transitive, then
(v
0
M) . . . (v
k
M) [
M
(v
0
, . . . , v
k
, w) (v
0
, . . . , v
k
, w)]
Proof. We use induction on the complexity of . We only show the step.
Suppose [(v
0
M) . . . (v
k
M)
M
(v
0
, . . . , v
k
, w)] (v
0
, . . . , v
k
, w).
We wish to consider (v
i
v
j
) (v
0
, . . . , v
k
, w).
Fix any v
0
, . . . , v
k
M, but not v
i
; however, v
j
is xed in M. Now
(v
i
v
j
(v
0
, . . . , v
k
, w))
M
(v
i
v
j
(v
0
, . . . , v
k
, w))
since
M
(v, w) (v, w). Also,
(v
i
v
j
) (v, w)
(v
i
v
j
)
M
(v, w) since (v, w)
M
(v, w)
v
i
(v
j
M)
M
(v, w) since v
j
M implies v
j
M
((v
i
v
j
) (v, w))
M
88 CHAPTER 10. REFLECTION
A formula ( w) is said to be
T
1
, where T is a subcollection of ZFC,
whenever there are
0
formulas
1
(v, w) and
2
(v, w) such that using only
the axioms from T we can prove that both:
( w)(
1
( w) v
0
. . . v
k
1
(v, w))
( w)(
2
( w) v
0
. . . v
k
2
(v, w)).
We can now use the above theorem to show that if ( w) is a
T
1
formula
and M [= for each in T, then ( w) is absolute for M, whenever M is
transitive. We do so as follows, letting play the role of
1
above:
( w) v
0
. . . v
k
(v, w)
(v
0
M) . . . (v
k
M) [(v, w)]
v
0
. . . v
k
(v, w)
M
( w)
M
The rst and third implications accrue from ( w) being
T
1
; the second from
the fact that is
0
and M is transitive and models the axioms of T.
This is often used with T as ZFC without Power Set and M = H().
Chapter 11
Elementary Submodels
In this chapter we shall rst introduce a collection of set operations proposed
by Kurt Godel which are used to build sets. We shall then discuss the new
concept of elementary submodel.
We now dene the ordered ntuple with the following innitely many
formulas, thereby extending the notion of ordered pair.
x) = x
x, y) = x, x, y
x, y, z) = x, y), z)
x
1
, . . . , x
n
) = x
1
, . . . , x
n1
), x
n
)
The following operations shue the components of such tuples in a set
S.
F
0
(S) = u, v), w) : u, v, w)) S
F
1
(S) = u, v, w)) : u, v), w) S
F
2
(S) = v, u) : u, v) S
F
3
(S) = v, u, w) : u, v, w) S
F
4
(S) = t, v, u, w) : t, u, v, w) S
Lemma. (Shue Lemma Scheme)
89
90 CHAPTER 11. ELEMENTARY SUBMODELS
For any m N and any permutation : m m, there is a composition of
the operations F
0
, F
1
, F
2
, F
3
, F
4
such that for any S,
F
(S) = x
(0)
, . . . , x
(m1)
) : x
0
, . . . , x
m1
) S.
Proof. Since binary exchanges generate the symmetric group, noting that
the identity permutation is given by F
2
F
2
, it suces to consider only
such that for some l m,
(i) =
_
_
i + 1, if i = l;
i 1, if i = l + 1;
i, otherwise.
For convenience, let F
n
i
denote the nfold composition of F
i
. There are
several cases.
Case 1: m = 2
F
= F
2
Case 2: m = 3, l = 1
F
= F
3
Case 3: m = 3, l = 2
F
= F
0
F
2
F
3
F
2
F
1
Case 4: m 4, l = 1
F
= F
m3
0
F
3
F
m3
1
Case 5: m 4, 1 l m1
F
= F
ml2
0
F
4
F
ml2
1
Case 6: m 4, l = m1
F = F
0
F
2
F
3
F
2
F
1
The Godel Operations are as follows:
91
G
1
(X, Y ) = X, Y
G
2
(X, Y ) = X Y
G
3
(X, Y ) = u, v) : u X v Y ; i.e., X Y
G
4
(X, Y ) = u, v) : u X v Y u = v
G
5
(X, Y ) = u, v) : u X v Y u v
G
6
(X, Y ) = u, v) : u X v Y v u
G
7
(X, R) = u : x X u, x) R
G
8
(X, R) = u : x X u, x) R
G
9
(X, R) = F
0
(R)
G
10
(X, R) = F
1
(R)
G
11
(X, R) = F
2
(R)
G
12
(X, R) = F
3
(R)
G
13
(X, R) = F
4
(R)
G
9
through G
13
are dened as binary operations for conformity.
We now construct a function which, when coupled with recursion on ON,
will enumerate all possible compositions of Godel operations. G: NV V
is given by the following rule. For each k and each s: k V, we dene
G[
Ns
by recursion on N as follows:
G(n, s) =
_
_
s(i), if n = 17
i
, 0 i k;
G
m
(G(i, s), G(j, s)), if n = m 19
i+1
23
j+1
, 1 m 13;
, otherwise.
2. x
1
, . . . , x
m
) : x
1
X x
m
X w
i
w
j
3. x
1
, . . . , x
m
) : x
1
X x
m
X w
i
= x
j
4. x
1
, . . . , x
m
) : x
1
X x
m
X w
i
x
j
5. x
1
, . . . , x
m
) : x
1
X x
m
X x
i
w
j
6. x
1
, . . . , x
m
) : x
1
X x
m
X x
i
= x
j
7. x
1
, . . . , x
m
) : x
1
X x
m
X x
i
x
j
( w)(n )[x
1
, . . . , x
m
) : x
1
X x
m
X
and
X
(x
1
, . . . , x
m
, w) = G(n, s)]
The proof of the theorem for any given will assume the corresponding
result for a nite number of simpler formulas, the proper subformulas of .
We begin by looking at atomic formulas . This step is covered by the
previous lemma.
Now we proceed by induction on complexity. Suppose that
x
1
, . . . , x
m
) : x
1
X x
m
X and
X
1
(x
1
, . . . , x
m
, w) = G(n
1
, s)
and
x
1
, . . . , x
m
) : x
1
X x
m
X and
X
2
(x
1
, . . . , x
m
, w) = G(n
2
, s).
94 CHAPTER 11. ELEMENTARY SUBMODELS
Then
x
1
, . . . , x
m
) : x
1
X . . . x
m
X and
X
1
(x
1
, . . . , x
m
, w)
= P
m
(X) G(n
1
, s) = G
2
(P
m
(X), G(n
1
, s))
and
x
1
, . . . , x
m
) : x
1
X . . . x
m
X and
X
1
X
2
(x
1
, . . . , x
m
, w)
= G(n
1
, s) G(n
2
, s).
The other connectives can be formed from and ; as such, x
l
is x
l
,
so it only remains to do the = x
l
1
step. Thanks again to the last
lemma, we may assume that l = m. Then is x
m
1
and
x
1
. . . x
m1
) : x
1
X x
m1
X and
X
(x
1
, . . . , x
m1
, w)
= G
7
(G(n
1
, s))
Lets write G(n, X, y) for G(n, s), where s(0) = X and s(k + 1) = y(k)
for all k dom(y) = dom(s) 1.
M is said to be an elementary submodel of N whenever
1. M N; and,
2. k y
k
M n G(n, N, y) N ,= G(n, N, y) M ,= .
We write M N.
Justication of the terminology comes from the following theorem scheme.
For each formula of the language of set theory we have:
Theorem 45.
Suppose M N. Then is absolute between M and N.
95
Proof. We will use the Tarski-Vaught criterion. Let
0
, . . . ,
m
enumer-
ate and each of its subformulas. Suppose
i
is x
j
(x, y
0
, . . . , y
k
) with
y
0
, . . . , y
k
M in a sequence y.
Let n so that by Theorem 44 we can nd n such that
G(n, N, y) = x N :
N
j
(x, y
0
, . . . , y
k
).
Then
x N
N
j
(x, y
0
, . . . , y
k
) G(n, N, y) N ,=
G(n, N, y) M ,=
x M
N
j
(x, y
0
, . . . , y
k
).
The following is sometimes called the Lowenheim-Skolem-Tarski theorem.
Theorem 46. Suppose X N. Then there is an M such that
1. M N;
2. X M; and,
3. [M[ max, [X[.
Proof. Dene F :
k
N : k N by choice:
F(n, s) =
_
some element of G(n, N, s) if G(n, N, s) ,=
any element of N otherwise
Now dene X
n
n
by recursion on N as follows:
X
0
= X
X
m+1
= X
m
F
(
_
k
(X
m
) : k )
96 CHAPTER 11. ELEMENTARY SUBMODELS
Let M =
m
X
m
. As such, (2) and (3) are clearly satised. To check
(1), let y
k
(X
m
). If G(n, N, y) N ,= , then F(n, y) X
m+1
M and
G(n, N, y) M ,= .
The use of elementary submodels of the H() can be illustrated.
Theorem 47. (Pressing Down Lemma)
Let f :
1
0
1
be regressive; i.e., f() < for all .
Then
1
such that f
is uncountable.
Theorem 48. (Delta System Lemma)
Let / be an uncountable collection of nite sets.
Then T / R such that
1. T is uncountable, and
2. D
1
, D
2
T D
1
D
2
= R.
We need some lemmas. Assume M H() where is an uncountable
regular cardinal. For each
0
formula (v
0
, . . . , v
k
) we have:
Lemma.
(y
0
M) . . . (y
k
M) [M [= (y
0
, . . . , y
k
) (y
0
, . . . , y
k
)].
Proof.
M [= (y
0
, . . . , y
k
) H() [= (y
0
, . . . , y
k
) by elementarity,
(y
0
, . . . , y
k
) since H() is transitive.
97
Remark. The same is true for
T
1
formulas where T is ZFC without Power
Set.
For any formula (v
0
, . . . , v
k
) of LOST, we have:
Lemma.
y
0
M y
2
M . . . y
k
M x H()
[H() [= z = x : (x, y
0
, . . . , y
k
) z M].
Proof. Let y
0
, . . . , y
k
M and z H() be given such that
H() [= z = x : (x, y
0
, . . . , y
k
).
Then,
H() [= u u = x : (x, y
0
, . . . , y
k
)
M [= u u = x : (x, y
0
, . . . , y
k
)
p M [M [= p = x : (x, y
0
, . . . , y
k
)]
H() [= p = x : (x, y
0
, . . . , y
k
)
H() [= p = z.
H() is transitive; therefore, p = z and hence z M.
Corollaries. 1. If M H(), then
(a) M;
(b) M; and,
(c) M.
2. If also >
1
, then
1
M.
Proof. and are direct. For M show that y M y y M.
98 CHAPTER 11. ELEMENTARY SUBMODELS
Lemma. Suppose M H() where is regular and uncountable. Suppose
p is countable and p M. Then p M.
Proof. Let q p; we must show that q M. Let f
0
: p be a surjection.
Since , p H() we must have f
0
H(). Since the formula f :
p and p is surjective is a
0
formula and f
0
, , p H(), we have H() [=
(f
0
: p and p is surjective). So
H() [= (f)(f : and p is surjective).
Since , p M we have,
M [= (f)(f : p and p is surjective).
That is, (f
p
M)(f
p
: p and p is surjective).
Pick n such that f
p
(n) = q, and again use the rst lemma as follows.
Since p, f
p
, n M and (!x)(x p and f
p
(n) = x) is a
0
formula
M [= (!x)(x p and f
p
(n) = x).
That is, (!x)(x p M and f
p
(n) = x). Since x is unique, x = q and thus
q M.
Corollary.
1
M
1
.
Proof. It is enough to show that
1
M is a countable initial segment of
1
.
If
1
M, then by the above lemma, M.
Proof of Pressing Down Lemma
Let M H(
2
) such that M is countable and f M. Let =
1
M
and let = f() < . Then,
( < )(x
1
) [x > f(x) = ].
99
So < H(
2
) [= (x
1
)(x > f(x) = ), since everything relevant
is in H(
2
). Hence,
< M [= (x
1
)(x > f() = )
since , ,
1
, f M. Now, since =
1
M we have,
M [= (
1
)(x
1
) [x > f() = ].
So H(
2
) [= (
1
)(x
1
) [x > f() = ]. Thus we have
H(
2
) [= f
is uncountable.
Again, since everything relevant is in H(
2
) we conclude that f
is
uncountable.
2
Proof of the Delta System Lemma
Let / be as given. We may, without loss of generosity, let
/ = a() : <
1
where a:
1
V. We may also assume that a:
1
T(
1
).
Let M be countable with /, / and M H(
2
). Let =
1
M.
Let R = a() . Since R M, we know R M by the second lemma. So,
< > a() = R
H(
2
) [= ( < )( > ) [a() = R]
( < ) [H(
2
) [= ( > )(a() = R)]
( < ) [M [= ( > )(a() = R)]
M [= ( <
1
)( > ) [a() = R]
( <
1
)( > ) [a() = R].
Now recursively dene D:
1
/ as follows:
D() = a(0);
D() = a()
100 CHAPTER 11. ELEMENTARY SUBMODELS
where is the least ordinal such that
> sup D() : < and a() = R.
Now if
1
<
2
<
1
, then D(
1
)
2
. So,
R D(
1
) D(
2
)
2
D(
2
) = R.
Thus we let T = D() : <
1
.
2
Theorem 49. (Elementary Chain Theorem)
Suppose that is a limit ordinal and M
( <
< M
).
Let
M
=
_
M
: < .
Then M
H().
Proof. Let k , let y
k
M
,= .
But this is easy since y
k
M
0
and hence absolute for transitive sets.
We have:
= L() ON = x L() : x is an ordinal
= x L() : (x)
= x L() :
L(+1)
()
cl(L()) using Theorem 44
= L( + 1).
104 CHAPTER 12. CONSTRUCTIBILITY
Lemma.
1. For each ordinal , = L() ON.
2. ON L.
Proof. This is easy from the previous lemmas.
Lemma.
1. If W is a nite subset of X then W cl(X).
2. If W is a nite subset of L() then W L( + 1).
Proof.
1. We apply Theorem 44 to the formula x = w
0
x = w
n
, where
W = w
1
, w
2
, . . . , w
n
.
2. This follows immediately from (1).
Lemma.
1. If X is innite then [cl(X)[ = [X[.
2. then [L()[ = [[.
Proof.
1. By Theorem 44 we can construct an injection cl(X)
k
X : k
. Hence, [X[ [cl(X)[ max (
0
, [
k
X : k [) = [X[.
105
2. We proceed by induction, beginning with the case = . We rst note
that from the previous lemma, we have L(n) = R(n) for each n .
Therefore,
[L()[ = [
_
L(n) : n [
= max (
0
, sup [L(n)[ : n )
= max (
0
, sup [R(n)[ : n )
=
0
.
For the successor case,
[L( + 1)[ = [L()[ by (1)
= [[ by inductive hypothesis
= [ + 1[ since is innite .
And if is a limit ordinal then
[L()[ = [
_
L() : [
= max ([[, sup [L()[ : )
= max ([[, sup [()[ : ) by inductive hypothesis
= [[.
Lemma. (x) [x L (y L)(x y)].
Proof. x L means that u x ON x L(). By the Axiom of
Replacement,
z z = : (u x)( is the least ordinal such that u L()).
Let = sup z; then ON and for each u x, there is such that
u L() L(). Since L() L( + 1) L, we can take y = L().
Remark. The above lemma is usually quoted as L is almost universal.
106 CHAPTER 12. CONSTRUCTIBILITY
Lemma. L [= V = L.
Proof. This is not the trivial statement
x L x L
but rather
x L (x L)
L
which is equivalent to (x L)( ON x L())
L
; which is, in turn, since
ON L, equivalent to (x L)( ON)(x L())
L
.
This latter statement is true since x L() is a
0
formula when
written out in full in LOST, and since L is transitive.
For each Axiom of ZFC we have:
Theorem 50.
L [= .
Proof. Transitivity of L automatically gives Equality, Extensionality, Exis-
tence and Foundation. We get Innity since L and z = N is a
0
formula.
For Comprehension, let be any formula of LOST; we wish to prove
y L w
0
L. . . w
n
L z L z = x y :
L
(x, y, w
0
, . . . , w
n
)
since L is transitive.
Fix y, w
0
, . . . , w
n
and ON such that y, w L(). By the Levy
Reection Principle, there is some > such that is absolute between L
and L().
By Theorem 44, there is an n such that
G(n, L(), y, w) = x L() :
L()
(x, y, w).
107
and so by denition, x L() :
L()
(x, y, w) L( + 1). Now by
absoluteness, x L() :
L()
(x) = x L()
L
(x). So we have
x L() :
L
(x, y, w) L( + 1).
Moreover, since y L( + 1),
x y :
L
(x, y, w) = y x L( + 1) :
L
(x, y, w)
L( + 2)
and since L( + 2) L we are done.
For the Power Set Axiom, we must prove that (x z z = y : y x)
L
.
That is, x L z L z = y : y L and y x. Fix x L; by the Power
Set Axiom and the Axiom of Comprehension we get
z
= y T(x) : y L y x = y : y L y x.
By the previous lemma L is almost universal and z
L so
z
L z
.
So z
= z
= y z
: y x)
L
;
i.e.,
z L z = y z
: y L y x
= y : y L y x
The Union Axiom and the Replacement Scheme are treated similarly. To
prove (the Axiom of Choice)
L
, we will show that the Axiom of Choice follows
from the other axioms of ZFC with the additional assumption that V = L.
It suces to prove that for each ON there is a ON and a
surjection f
: b
L().
108 CHAPTER 12. CONSTRUCTIBILITY
To do this we dene f
recursively. Of course f
0
=
0
= = L(0). If
is a limit ordinal, then we let
: <
and f
() = f
() where =
.
If = + 1 is a successor ordinal, use f
k
(L()) : k and use this to obtain an ordinal
and a surjection
k
(L()) : k .
Now let
=
+1
=
and let
f
() = G(n, L(),
f
()) where =
n +, <
.
This completes the proof of Theorem 50 and motivates calling V = L
the Axiom of Constructibility.
Remark. V = L is consistent with ZFC in the sense that no nite subcollec-
tion of ZFC can possibly prove V ,= L; To see this, suppose
0
, . . . ,
n
V ,= L.
Then
L
0
, . . . ,
L
n
(V ,= L)
L
.
by Theorem 50. This contradicts the preceding lemma.
Remark. Assuming V = L we actually can nd a formula (x, y) which gives
a well ordering of the universe.
We denote by
L
the conjunction of a nite number of axioms of ZFC
conjoined with V = L such that
L
implies all our lemmas and theorems
about ordinals and ensures that x L() is equivalent to some
0
formula
109
(but I think we have already dened it to be
0
). In particular, x ON will
be equivalent to a
0
formula.
Furthermore, we explicitly want
L
to imply that ON z z = L()
and that there is no largest ordinal.
We shall use the abbreviation o(M) = ON M.
Lemma. M(M is transitive and
M
L
M = L(o(M))).
Proof. Let M be transitive such that M [=
L
. Note that o(M) ON. We
have M [= ON z z = L(). So,
o(M) M [= z z = L()
o(M) z M M [= z = L()
o(M) z M z = L()
o(M) L() M
o(M) L() M.
Since M [=
L
, o(M) is a limit ordinal and hence
L(o(M)) =
_
L() : o(M) M.
Now let a M. Since M [= V = L we have
M [= x y ON x L(y)
M [= y ON a L(y)
o(M) M [= a L()
o(M) a L()
a L(o(M)).
Lemma.
C
If ON C, C is transitive, and
C
L
, then C = L.
110 CHAPTER 12. CONSTRUCTIBILITY
Proof. The proof is similar to that of the previous lemma.
Theorem 51. (K. Godel)
If V = L then GCH holds.
Proof. We rst prove the following:
Claim. ON T(L()) L(
+
).
Proof of Claim. This is easy for nite , since L(n) = R(n) for each n .
Lets prove the claim for innite ON. Let X T(L()); we will
show that X L(
+
).
Let A = L() X. A is transitive and [A[ = [[.
By the Levy Reection Principle, there is a ON such that both
A L() and L() [=
L
, where
L
is the formula introduced earlier.
Now use the Lowenheim-Skolem-Tarski Theorem to obtain an elementary
submodel K L() such that A K and [K[ = [A[ = [[ so by elementarily
we have K [=
L
.
Now use the Mostowski Collapsing Theorem to get a transitive M such
that K
= M. Since A is transitive, the isomorphism is the indentity on A
and hence A M. We also get M [=
L
and [M[ = [[.
Now we use the penultimate lemma to infer that M = L(o(M)). Since
[M[ = [[ we have [o(M)[ = [[ so that o(M) < [
+
[.
Hence A M = L(o(M)) L(
+
), so that X L(
+
).
We now see that the GCH follows from the claim. For each cardinal
we have L() so that [T()[ [T(L())[ [L(
+
).
Since [L(
+
)[ =
+
we have [P()[ =
+
.
111
We now turn our attention to whether V = L is true.
Let be a cardinal and let | be an ultralter over . Recalling that
V = f : f : V, let
U
be a binary relation on
V dened by
f
U
g i : f() = g() |.
It is easy to check that
U
is an equivalence relation.
For each f
V let (f) be the least element of
ON : rank(g) = f
U
g.
Let [f] = g R((f) + 1) : g
U
f and let ULT
U
V = [f] : f
V.
Dene a relation
U
on ULT
U
V by
[f]
U
[g] i : f() g() |.
It is easy to check that
U
is well dened.
For each cardinal , we use the abbreviation
[X]
<
= Y X : [Y [ < .
Given an uncountable cardinal , an ultralter | is said to be complete
if X [|]
<
X |.
An uncountable cardinal is said to be measurable whenever there exists
a complete free ultralter over .
Lemma. If | is a countably complete ultralter (in particular if | is a
complete ultralter) then
U
is set-like, extensional and well founded.
Proof. To see that
U
is set-like, just note that
[g] : [g]
U
[f] R((f) + 2).
For extentionality, suppose [f] ,= [g]; i.e., : f() = g() / |.
Then either : f() g() | or : g() f() |.
This leads to two similar cases; we address the rst.
112 CHAPTER 12. CONSTRUCTIBILITY
Pick any h
V such that h() f() g() whenever f() g().
Then [h]
U
[f] and [h]
U
[g].
To see that
U
is well founded, suppose f
n
n
such that
n [f
n+1
]
U
[f
n
].
Let
A =
: f
n+1
() f
n
() : n |.
A | by the countable completeness of |, so that A ,= . Pick any A.
Then F
n+1
() f
n
() for each n , which is a contradiction.
We now create a Mostowski collapse of ULT
U
V
h
U
: ULT
U
V M
U
given by the recursion
h
U
([f]) = h
U
([g]) : [g]
U
[f]
As per the Mostowski Theorem, h is an isomorphism and M
U
is transitive.
The natural embedding i
U
: V ULT
U
V is given by i
U
(x) = [f
x
] where
f
x
: V such that f
x
() = x for all .
This natural embedding i
U
combines with the unique isomorphism h
U
to
give
j
U
: V M
U
given by j
U
(x) = h
U
(i
U
(x)).
j
U
is called the elementary embedding generated by |, since for all for-
mulas (v
0
, . . . , v
n
) of LOST we have:
Lemma. v
0
. . . v
n
(v
0
, . . . , v
n
)
M
U
(j
U
(v
0
), . . . , j
U
(v
n
)).
Proof. This follows from two claims, each proved by induction on the com-
plexity of .
113
Claim 1. v
0
. . . v
n
(v
0
, . . . , v
n
)
(i
U
(v
0
), . . . , i
U
(v
n
)).
Claim 2. v
0
. . . v
n
(i
U
(v
0
), . . . , i
U
(v
n
))
M
U
(j
U
(v
0
), . . . , j
U
(v
n
)), where
is with replaced by
U
and all quantiers restricted to ULT
U
V.
We leave the proofs to the reader.
Theorem 52. Every measurable cardinal is inaccessible.
Proof. We rst prove that is regular. If cf() = < , then is the union
of sets each smaller than . This contradicts the existence of a complete
free ultralter over .
We now prove that if < , then [T()[ < . Suppose not; then there
is X [T()]
= x X : x and B
= x X : / x. Let
I = : A
| and J = : B
|. Since | is an ultralter,
I J = . Since | is complete and < we have
: I
: J |.
But this intersection is equal to XI, which is either empty or a singleton,
contradicting that | is a free lter.
Lemma. Let | be a complete ultralter over an measurable cardinal .
Let M = M
U
, h = h
U
, i = i
U
and j = j
U
as above. Then for each ON
we have j() ON and j() . Furthermore, if < then j() = and
j() > .
Proof. For each ON we get, by the elementary embedding property of j,
that M [= j() ON; since M is transitive, j() ON.
Let be the least ordinal such that j() . Then M [= j(j()) j()
by elementarity, and j(j()) j() by transitivity of M. This contradicts
the minimality of .
114 CHAPTER 12. CONSTRUCTIBILITY
Now lets prove that j() = for all < by induction on . Suppose
that j() = for all < < . We have
j() = h(i())
= h([g]) : [g]
U
i()
= h([g]) : [g]
U
[f
] where f
() = for all
= h([g]) : : g() f
() |
= h([g]) : : g() |
= h([g]) : : g() = | by completeness of |
= h([g]) : [g] = [f
] where f
() = for all
= h([f
]) :
= h(i()) :
= j() :
= : by inductive hypothesis
Hence j() = .
We now show that j() > . Let g : ON such that g() = for each
. We will show that h([g]) for each and that h([g]) j().
Let .
: f
() g() = :
= ( + 1)
|
Hence [f
]
U
[g] and so h([f
() = : = |. Hence
[g]
U
[f
]) = h(i()) = j().
115
Theorem 53. (D. Scott)
If V = L then there are no measurable cardinals.
Proof. Assume that V = L and that is the least measurable cardinal; we
derive a contradiction. Let | be a complete ultralter over and consider
j = j
U
and M = M
U
as above.
Since V = L we have
L
and by elementarity of j we have
M
L
. Note that
L
is a sentence; i.e., it has no free variables.
Since M is transitive, ON M by the previous lemma. So, by an earlier
lemma M = L. So we have
L = V [= ( is the least measurable cardinal)
and
L = M [= (j() is the least measurable cardinal).
Thus L [= j() = ; i.e., j() = , contradicting the previous theorem.
Remark. We have demonstrated the existence of an elementary embedding
j : V M. K. Kunen has shown that there is no elementary j : V V.
Large cardinal axioms are often formulated as embedding axioms. For
example, is said to be supercompact whenever
j [j : V M and j() > and j[
R()
= id[
R()
and
M M].
116 CHAPTER 12. CONSTRUCTIBILITY
Chapter 13
Appendices
.1 The Axioms of ZFC
Zermelo-Frankel (with Choice) Set Theory, abbreviated to ZFC, is consti-
tuted by the following axioms.
1. Axiom of Equality
x y [x = y z (x z y z)]
2. Axiom of Extensionality
x y [x = y u (u x u y)]
3. Axiom of Existence
z z =
4. Axiom of Pairing
x y z z = x, y
5. Union Axiom
x [x ,= z z = w : (y x)(w y)]
117
118 CHAPTER 13. APPENDICES
6. Intersection Axiom
x [x ,= z z = w : (y x)(w y)]
7. Axiom of Foundation
x [x ,= (y x)(x y = )]
8. Replacement Axiom Scheme
For each formula (x, u, v, w
1
, . . . , w
n
) of the language of set theory,
w
1
. . . w
n
x [u x !v z z = v : u x ]
9. Axiom of Choice
X [(x X y X (x = y xy ,= )) z (x X !y y xz)]
10. Power Set Axiom
x z z = y : y x
11. Axiom of Innity
N ,= ON
.2 Tentative Axioms
Here is a summary of potential axioms which we have discussed but which
lie outside of ZFC.
1. Axiom of Inaccessibles
> and is an inaccessible cardinal
2. Continuum Hypothesis
[T()[ =
1
.2. TENTATIVE AXIOMS 119
3. Generalised Continuum Hypothesis
[ is a cardinal [T()[ =
+
4. Suslin Hypothesis
Suppose that R is a complete dense linear order without endpoints in
which every collection of disjoint intervals is countable.
Then R
= R.
5. Axiom of Constructibility
V = L