Set Theory Notes Itay Kaplan 2016
Set Theory Notes Itay Kaplan 2016
Set Theory Notes Itay Kaplan 2016
ITAY KAPLAN
1. Week of 07/11/2016
These notes are largely based on notes written by Azriel Levy which can be found in the Moodle site of
2015.
1.1. History and motivation. Consider the following question. Let R, , be the field of real numbers.
Call a real number r (real) algebraic if there is some nontrivial polynomial f X > Z X such that f r
0. Say that r is transcendental otherwise. Is there a non-algebraic real number? what about π? π is
indeed transcendental, but this is a non-trivial result. Is there a pair of numbers r1 , r2 such that both are
transcendental and even as a pair, i.e., for no non-trivial polynomial f X, Y > Z X, Y is it the case that
f r1 , r2 0? (What about π, e? this is still unknown.)
However, Georg Cantor, solved both these questions in 1874 in a monumental paper, which started a
whole new mathematical field: set theory. In short, he showed, using a beautiful and simple argument,
that there are more real numbers then algebraic real numbers, and thus there must be a transcendental real
number (the same argument shows the existence of transcendental pairs as above).
Mathematicians were always fascinated with the notion of infinity. Set theory allows to give precise
meaning to the term, to compare one infinity (e.g., the real algebraic numbers) and another (the real
numbers).
The basic idea of set theory is to treat collections of mathematical objects as mathematical objects in its
own right. Exactly as in geometry one studies triangles, lines, and angles, in set theory one would study sets,
which might be collections of triangles, or collection of lines, or collections of real numbers, or collections of
any other mathematical object. In fact, one can found all of mathematics on set theory, in the sense that
every mathematical field can be regarded as a sub-field of set theory, and every object is a set. (However, it
is still much more useful to think of each mathematical field as its own field and not to interpret everything
to sets.)
What is a set? For us a set is an object which we think of as a “collection of objects”. This means that if
x is a set, then we are allowed to ask for each object y whether or not y > x (in words: y is in x), and that
x is determined by the answer to all of these questions. To illuminate this think of two different councils
representing all students in the university, one deals with tuition and another with academics, as sets, the
collection of students they represent is the same, but each has another function.
In some sense, sets existed before numbers: if I wanted to know if I have as many apples as my neighbor,
I’d put my apples and his side by side and see which line is longer, and this means that my set of apples is
bigger then his. For this thinking we do not need the further abstraction of a natural number.
1.2. Fundamental objects. The first thing one would like from sets is that there is at least one set. Let
us fix some fundamental objects we will not treat as sets, at least not at the beginning. These will include
the natural numbers 0, 1, 2, 3, . . ., the rational numbers 0, 1, 2, 1~2, 8~7, . . ., the real numbers 0, 1, π, e, 3~4 and
1
perhaps more, like the order relation on the three of them, addition and multiplication. You may have seen
that the real numbers can be constructed from the rationals which in turn can be constructed from the
natural numbers. These objects are not sets, so one cannot ask whether 2 > 1 (for now), i.e., we can put a
fundamental object only on the left side of a > expression.
The Axiom of Infinity states that the collection of all natural numbers is a set, denoted by N.
We will later see that the collection of rational numbers Q and real numbers R are also sets.
Our universe is composed of sets and of fundamental objects. We will use the term object to refer to
either a set or a fundamental object.
Later we will see that there is no real need for fundamental objects: we will code all the fundamental
objects as sets, so eventually, every object will be set, and the term “object” will be redundant.
1.3. Extensionality. As we said above, a set is determined by its members. This is expressed via the
following axiom, the Axiom of Extensionality:
1.4. Subsets.
Definition 1.1. If x, y are sets then we say that x contains y, denoted by y b x, if ¦ z z > y z > x, in
words: whenever z is in y, z is in x. If y b x, we say that y is a subset of x.
1.5. Russell’s paradox and the axiom of separation. We want to construct sets. We want to
One of the properties we would like sets to have is what we will call the Basic Existential Axiom (also
called the Axiom Schema of Comprehension). It says Suppose P v is some property (In Subsection 1.6 we
will say what we mean by property more precisely). Then the collection of objects satisfying the property
P is a set. We will denote that set by x S P x . However, this axiom leads to a paradox, discovered by
Russell in 1904. Consider the property P x x ¶ x. So P x holds iff x is not in x. Suppose that x S x ¶ x
is a set, call it y. If y > y then y ¶ y which then implies that y > y — contradiction.
To avoid this paradox we will abandon the basic existential axiom in favor of a more modest Axiom
of Separation which states that if x is a set, P v some property of sets, then there is a set y such that
y t > x S P t ( t S t > x , P t ), in other words, y is precisely the set of members of x which satisfy P .
Example 1.2. The sets n S n > N , §m m 2 n and n S n > N , §m m m n are indeed sets by
separation and infinity, and are equal (even though they have different descriptions) by extensionality.
1.6. Language. What is a property? We cannot just allow any use of natural language to describe mathe-
matical objects, because natural language is not precise enough. For example, consider the following descrip-
tion of a number n: n is the smallest number which cannot be described with at most 2 million characters in
English (including digits). Of course, such a number exists, because only finitely many descriptions with at
most 2 million characters exist, so there is a minimal one. But we just described n with less than 2 millions
characters — contradiction.
The basic language that we will use contains variables A, B, . . . x, y, . . . , t, . . . , v, . . . as well as the relations
meaning equality of objects and >. As well, the language contains logical connectives such as ,(and), -
2
(or), (implies), (iff), (not), and the quantifiers ¦ (for all) and § (there exists). Call the basic language
the language of set theory.
We will, for now, enrich the language by adding predicates for natural numbers, rational numbers, and
real numbers (and possibly more), and the symbols for the natural operations on these structures (such as
the order relation, addition, multiplication, etc.). We will also add a predicate for “being a set” (as opposed
to being a fundamental object), in the same way we wrote the axiom of extensionality.
For instance, when we write ¦x set§y . . . we read it as “for all sets x, there is some object y such that...”.
To conclude, when we say “property” in the axiom of separation, we mean that P is a property written
in the language of set theory.
Theorem 1.3. The property x x, which defines the class of all sets, does not define a set.
Proof. Suppose there is a set a such that a x S x x , then by separation, the class x S x ¶ x is a set, but
this gives a contradiction exactly as in Russell’s paradox.
1.7. Classes. Note that we used the term “class” in Theorem 1.3. A class is a collection of objects satisfying
some property in the language of set theory. For a class A and a set x we write x > A or A x to say that
x belongs to the class A. One can define when a class A is contained in a class B similarly to what we did
with sets, write A b B, and say that A is a subclass of B.
We say that a class A is a set if there is a set a which has the same members: for all x, x > A iff x > a.
For example, if a is a set, then the class x S x > a is a class (a collection of objects) and this class is the set
a. In this sense every set is a class, but not all classes are sets. For example, Theorem 1.3 implies that the
class of all sets is a proper class: it is a class which is not a set.
In terms of classes, the axiom of separation says that a subclass of a set is a set.
So classes are not objects, but rather collections of objects. We are not allowed to write A > a when A is
a class.
During the course, we will try to emphasis when something is a class and when it is a set.
Theorem 1.4. There is a unique empty set, which we denote by g . In other words there exists a unique set
such that for no object x is it the case that x > g.
Proof. On the face of it, g is defined as a class by the property x x x. However we will use the axiom of
infinity and axiom of separation to prove that g is a set.
By the axiom of infinity the set N exists. By the axiom of separation there is a set g defined by g
x > N S x x x .
Clearly g satisfies the condition that ¦x x ¶ g (if x > N this is clear, and otherwise of course x ¶ g). The
uniqueness of g follows from the axiom of extensionality.
Proof. These all follow from the axioms we saw so far. For instance (1) follows by pairing x with x (x, x
x). For (2), first by pairing, there is a set x, y , then x 8 y x, y. Note that in (6), if x has no
sets as members ( this is different than just saying that x x g, because some objects are not sets), then the
condition that a belongs to all sets in x is an empty condition: every a realizes this condition, and so the
property described in (6) gives the class of all sets, which we already saw is not a set.
2. Week of 14/11/2016
Definition 2.1. A map f which takes two objects x, y and maps them to a set f x, y and satisfies the
property “for all x, y, u, v, f x, y f u, v iff x u and y v” will be called an ordered pair map.
The existence of such a map is desirable: we would like for instance to talk not only on the set of natural
numbers or real numbers, but also at sets of coordinates in the plane, or even triple of coordinates (which
we can think of as triangle).
Proof. Define `x, y e x , x, y . We already know that this is a set by Theorem 1.5. Given x, y, u, v, we
have to show that if `x, y e `u, v e then x u and y v (the other direction is easy, do it).
First assume that x x y. Then it follows that `x, y e has two distinct elements, so necessarily u x v. Also,
since x > `u, v e then x u as this is the unique singleton in `u, v e, so that x u. It follows that
x, y u, v so x, y x, v and as y is a member of the left hand side, it must follow that y v.
Now assume that x y. It follows that u v as above. Then `x, y e x u `u, v e. Hence
x u and so x u and we are done.
2.2. Relations.
Definition 2.3. A (binary) relation is a class whose members are ordered pairs. In other words, R is a
relation if for all x, if x > R then x `u, v e for some (unique) u and v. We sometimes write R u, v or u R v
for R `u, v e.
The domain of a relation R is the class of all u such that for some v the relation R u, v holds. Similarly,
the range of R is the class of all v such that for some u, R u, v holds.
4
Given a property P x, y in two variables, we can define the relation R as z S z `x, y e , P x, y . In
this sense, a relation is just a property of objects in two variables. We sometimes identify the relation R and
the property it defines on pairs. For example, we may consider the property x > y. This property defines
a relation R which consists of all ordered pairs `x, y e such that x > y. However, abusing notation a bit, we
identify R with the property x > y and even call this relation >.
Definition 2.4. Given two classes A, B the Cartesian product A B is the relation `x, y e S x > A, y > B . If
A B, we will write also A2 .
We will now prove that if A, B are sets, then their Cartesian product is also a set. For this we will need
one more axiom.
Axiom 2.5. The Power-Set Axiom states that if x is a set, there is a set, denoted by P x, consisting of
all subsets of x. In other words, ¦y y b x y > P x.
Theorem 2.6. If A, B are sets, then so is their Cartesian product.
Proof. Note that for every x > A, y > B, `x, y e x , x, y > P P A 8 B . Hence, by separation,
AB z > P P A 8 B S §x > A§y > B z `x, y e is a set.
Corollary 2.7. A relation R is a set iff its domain and range are sets.
Proof. Let A Dom R and B Range R and suppose these are sets. Then R b A B, so by separation
and Theorem 2.6 it is a set.
On the other hand if R is a set, then A u > R S `u, v e > R is a set by union and separation. Indeed
u > A iff for some v, `u, v e u , u, v > R iff u > R and for some v, `u, v e > R iff u > R and for
some v, `u, v e > R. Similarly, B is also a set.
Definition 2.8. A relation R is on a class A if R b A A. In other words iff Dom R b A and Range R b A.
Definition 2.9. (1) A relation R on a class A is symmetric if for all objects x, y from A, x R y y R x.
(2) A relation R on a class A is reflexive if for all objects x from A, x R x.
(3) A relation R on a class A is transitive if for all objects x, y, z from A, x R y , y R z x R z.
Example 2.10. The relation , which holds between two objects when they are equal, is symmetric, tran-
sitive and reflexive; The relation @ on N is transitive but not symmetric nor reflexive; The relation b on sets
is reflexive and transitive but not symmetric. The relation > is neither reflexive, symmetric nor transitive.
Example 2.12. The relation is an equivalence relation, the relation `x, y e S x, y > N , 3S x y is also
an equivalence relation and also a set (why?) (here 3Sx means that 3 divides x).
Definition 2.13. Suppose that E is an equivalence relation on A. For a > A, The E-equivalence class of a
is aE x > A S x E a . This may also be denoted by a~E.
Definition 2.16. Suppose that E is an equivalence relation on a set A. Let A~E be the class z E S z > A .
Example 2.17. Suppose A is a set. Let E A A. Then E is an equivalence relation, and A~E A. On
the other hand, let E
, i.e., E
`a, ae S a > A . Then A~E a S a > A .
2.4. Functions. A function is a notion that maps an object x to an object y in such a way that for every
x there is at most one y that corresponds to x. The following is a precise definition.
Remark 2.19. Sometime we say “image” instead of “range” when we discuss functions.
Remark 2.20. Given two functions which are sets, F and G, they are equal iff Dom F Dom G and for
every x > Dom F , F x G x (by extensionality).
Example 2.21. Consider the relation `x, xe S x x . This relation is a function. It is called the identity
function, denoted by id. Given a class A, let idA I
id A.
Axiom 2.24. The Axiom of Replacement states that if F is a function and A Dom F is a set, then
F A is a set.
Equivalently, by Corollary 2.7, a function F is a set iff its domain is a set.
Exercise 2.25. Show that given F A B and G B C, the relation defined by G X F is indeed a
function. Show that Dom G X F A and that Range G X F G F A.
Corollary 2.26. The collections of integer numbers Z, rational numbers Q and real numbers R are sets.
Proof. Z is the image of the function f N 0, 1 Z defined by f n, 0 n, f n, 1 n. Q is the image
of the function f
Z Z 0 Q defined by f n, m
n~m. R is the image of the function f
P Q R
given by f
s sup s if s is bounded from above, and otherwise 0. By the axiom of replacement all of these
are sets.
3. Week of 22/11/2016
Theorem 3.1. Suppose that E is an equivalence relation on a set A. Then the class of equivalence classes
A~E of A, is a set.
Proof. Let F A A~E be the function defined by F x xE . Then A~E F A so is a set by
replacement.
Proof. (1) implies (2). First we show uniqueness. Suppose that G1 , G2 both satisfy (2). Then for every
y > B, F G1 y F G2 y y so G1 y G2 y as F is injective, in fact G1 y is the unique x such
1
that F x y. Thus we must define F
y as the unique x such that F x y. This defines a function
1 1
from B to A as F is injective and surjective. As well, F
F x x by definition, and F F
y F x
1
where F x y, so F F
y y.
1
(2) (even without uniqueness) implies (1). F is injective because if F x1 F x2 then applying F
to
both sides with get x1 x2 . F is surjective because given y > B, F F 1
y y.
Exercise 3.6. The composition of injective functions is injective: if F A B, G B C are injective then
also G X F . Similarly, the composition of surjective functions is surjective.
7
3.1. Finite sets and cardinality.
Definition 3.7. Say that two sets A, B have the same cardinality, denoted by SAS SB S if there is a bijection
F A B.
This is a major notion in set theory: it allows us to compare two sets even if they have infinitely many
members.
Proposition 3.9. Suppose that SAS SB S and a > A, b > B. Then SA aS SB bS.
Proof. Suppose that F A B is a bijection. Then F witnesses that SA aS SF A F aS
SB F aS.
Hence to finish it is enough to see that SB F aS SB bS . For this consider the function
GB B which is just idB b,F a 8 `b, F ae , `F a , be. This is a bijection of B which switches F a
and b. Then G I B F a is a bijection which shows that SB F aS SB bS .
Definition 3.13. A set A is called finite if for some n > N, SAS S nS. In that case we will say that A has
cardinality (or size) n. A set A is infinite if it is not finite. By Corollary 3.12, the size of A is a well defined
notion for a finite set A.
Example 3.14. The empty set g is finite, since g 0 and hence SgS S 0S (the empty function g g g
Note that A is finite iff there is an injective function F A n for some n > N (because if there is such
an injective function, then F A F A is a bijection and F A b n, and by induction on n, one can
prove that for any s b n, s is finite of size B n: if s n it is clear, and otherwise s b n k for some k @ n,
and since S n k S S n 1S, s has the same cardinality as a subset of n 1). Hence a subset of a finite
set is finite. In addition, if this subset is proper (i.e., not equal to the set), then its size is smaller.
Notation 3.15. When A is finite, we write SAS n for SAS S nS. Later we will give the notation SAS a precise
meaning.
8
Corollary 3.16. ( The pigeon hole principle for finite sets) Suppose that m @ n are from N. Suppose that
SAS n, SB S m, and C is infinite. Then there is no injective function from A to B, nor is there an injective
function from C to B. In particular, if A is finite then there is no injective function from A to any proper
subset of A.
Proof. Follows from (3) in Lemma 3.11, and the discussion above.
We will later see that the pigeon hole principle identifies finite sets.
Proof. The proof is by induction on the size of A: we will show by induction on n that for any set A of size
n, and any finite set B, A 8 B is finite.
If n 0 then A g, so A 8 B B and there is nothing to prove.
Suppose by induction that the theorem is true for n and we are given A of size n 1. Let x > A. Then
A8B A x 8 B 8 x. Now, A x has size n (by Lemma 3.11(1)), so by induction C A x 8 B is
finite. But C 8 x is also finite (if C has size n then C 8 x has size B n 1).
4. Week of 28/11/2016
Proposition 4.1. If A is a finite set, and F A B is any function, then Range f f A is finite.
Proof. By definition, for some n > N, SAS n, so there is a bijection g n A, hence composing with f , it
is enough to show that for every number n > N, and any function f n B, Range f is finite.
This is proved by induction on n. For n 0, Range f g so there is nothing to do. Suppose that
the proposition is true for n. Let f n 1 B. Then Range f f n 8 f n. But by induction,
f n Range f I n is finite and so by Theorem 3.17.
Example 4.3. N is countable and any countable set is infinite (this follows from Lemma 3.11). In addition,
if n > N, then the set NCn m > N S m C n is countable (why?). The sets Even and Odd of even numbers
and odd numbers are both countable.
In some sense, countable sets are the smallest infinite sets. We will give this a precise meaning later.
Theorem 4.4. (Baby case of definition by recursion on N) Suppose that f A A is some function where
A is a set. Let a > A. Then there is a unique function F N A such that F 0 a and for each n > N,
n
F n 1 f F n. In other words, F n f a f f . . . f a where the composition is done n
times.
Exercise 4.6. Suppose that X is a set of functions. Suppose that if f, g > X then there is some h with
f, g b h (in this case we say that f, g are consistent). Then X is a function from Dom f S f > X to
Range f S f > X . In fact it is the unique function F with domain Dom f S f > X such that for all
f > X, F I Dom f f.
Proof of Theorem. For the proof we will first prove the following claim.
Claim. Given f, A, a as in the theorem, for every n > N there is a unique function Fn n A such that
Fn 0 a (if 0 @ n) and for every m @ n 1, Fn m 1 f Fn m.
Proof of claim. The proof is by induction on n. For n 0 there is nothing to do. For n 1, first prove
uniqueness. Suppose we have two such functions Fn11 and Fn2 1 . Then by induction, Fn1 1 I n Fn
I
Fn21 n. But then Fn11 n f Fn n 1 Fn21 n and so Fn11 Fn21 (see Remark 2.20). For
existence, we let Fn 1 Fn 8 `n, f Fn n 1e.
Let us continue with the proof of the theorem. Note that if F is a function as in the theorem, then for any
n > N, F I n satisfies the conditions of the claim. Hence F I n is unique — it must be Fn . This implies
that F , if exists, is unique. In fact, by Exercise 4.6, F must be Fn S n > N . This shows existence as well:
just let F be this union (note that Fn b Fm for n B m, so the condition of that exercise are satisfied). Then
F I n Fn for all n > N and so satisfies the conditions of the theorem.
Definition 4.7. Call a subset S b N bounded if it is contained in some n for n > N. Call S unbounded if it
is not bounded.
Note that by if S b N is bounded then it is finite. The following lemma describes unbounded sets.
Proof. Let a min S, and let f S S be such that f k min m > S S m A k . Then f is well defined
because S is unbounded. Let F be the sequence af n
a S n > N f which exists by Theorem 4.4.
Then F N S is a bijection:
It is injective. Indeed, it is enough to show that F is increasing: F k A F m whenever k A m are
from N. For this it is enough to show that F n 1 A F n for all n > N (why?). This is true since
F n 1 f F n by definition, and f k A k for all k > S by choice of f .
Next we prove it is surjective. Suppose not, and let k > S be minimal such that k ¶ Range F . Then
k A min S a, since F 0 a by definition. Hence there is some m @ k maximal with m > S. So m F n
for some n > N. But then F n 1 f F n k and we are done.
Corollary 4.9. Suppose S b N. Then S is either finite or countable. Moreover, the same is true for any set
S such that there is an injective function f S N.
Proof. If S b N then it is either bounded (so finite) or unbounded (so countable). If f S N is injective,
then Sf S S SS S, so the second sentence follows from the first.
10
Definition 4.10. Two sets A and B are disjoint when A 9 B g .
Proposition 4.11. Suppose that A and B are sets. If A is countable and B is finite or countable, then
A 8 B is countable.
Proof. First assume that A and B are disjoint. Suppose that f A Even and g B Odd are injective
functions (exist since both Even and Odd are countable). Let F A 8 B N be F f 8 g then F is indeed
an injective function (why?). This means that there is an injective function from A 8 B is to N. Hence by
Corollary 4.9, it is either finite or countable. But it cannot be finite, by the pigeon hole principle, as A is
infinite.
For the general case, let B
B A. Then A 8 B
A 8 B and A 9 B
. Also, B is finite or countable
g
We know that Pnk 1m k is the sum of an arithmetic series, so we know that the sum is n
m n m 1 ~2
(by induction on n m). However, here is another way to see it.
In general, PNk 1 k is the number of (unordered) pairs x, y b N 1 (why? because the number of pairs
equals the sum over all k B N of the number of pairs x, y with max x, y k, and there are k such pairs.).
N 1
The number of pairs contained in N 1 is 2 N 1 N ~2 (there are N 1 choices for the first
element, and N for the second, and then we divide by 2 since there is no meaning to the order of choice).
From this we get that F n, m defined above gives the n’th place on the n m 1’th diagonal.
This description of F shows that it is injective, but it is not a formal proof.
To formally prove that F is injective, one shows that if `n, me x `n , m e then F n, m x F n , m .
n m n m 1 ~2 because
n m n m 1 2n @ n m 1 n m 2 B n
m n
m
1 2n ,
since
2
n m 1 n m 2 n m 3 n m 2
so we are done.
If n m n
m then n x n so necessarily F n, m x F n , m .
11
Finally, since F is injective, it follows that N N is finite or countable by Corollary 4.9. But it must be
countable as there is an injective function from N to N N taking n to n, 0 (so N N is infinite).
Remark. In fact, F itself is onto (so it is a bijection). One can prove by induction on n > N that n > Range F .
Indeed, F 0, 0 0, and if F k, m n and m x 0 then F k 1, m 1 n 1, and if m 0, then
F 0, k 1 n 1.
Remark. Another, much simpler construction is to write F n, m 2n 3m . This is also injective, but not
onto. This would also give us a proof using Corollary 4.9. However, the function described in the proof is
very important for historic reasons and also it is much simpler in a precise mathematical sense that you will
learn about if you continue to logic 2.
Proof. Without loss of generality we may assume that X N (composing with a bijection from X to N). Now
define the following function G Range F N: G y min n > N S F n y . Note that G is injective
because for every y > Range F , F G y y. This shows that Range F is finite or countable by Corollary
4.9.
Proof. First note that SZ NC1 S SN NS as Z is countable (in general, if SAS SB S and SC S SD S then
SA C S SB DS, why?). Hence Z NC1 is countable. Now the corollary follows from Proposition 4.14
applied to the function F n, m n~m (note that Q is not finite).
5. Week of 5/12/2016
5.1. Finite sequences. In Notation 4.5, we said that we call functions with domain N sequences.
Definition 5.1. A finite sequence is a function with domain n for some n > N. A finite sequence with
domain n is called also an n-tuple. If a is an n-tuple, then we can write ai for a i for all i @ n.
Remark 5.2. For all n > N, if a, b are both n-tuples then a b iff for all k @ n, ak bk . (In general, two
functions F, G are equal iff they have the same domain, and for all x in said domain F x G x, this is
just the axiom of extensionality.)
In particular, with n 2, the map taking x, y to the 2-tuple a with a0 x, a1 y is an ordered pair map.
Of course, a x `x, y e (since a `0, xe , `1, y e x x , x, y ). However, we will not distinguish these two
and write a `x, y e where it is clear from the context if a is an ordered pair or a 2-tuple, and moreover it
will never really matter which one it is.
Generalizing from the notation `x, y e for ordered pair, if a is n-tuple, we write a `ai S i @ n e or a
`a0 , . . . , an1 e.
When n 1, if x is some object then `xe is then the function `0, xe.
Definition 5.3. Given a set A, let An be the set of all n-tuples of A (in other words, all functions F n
A). In this notation A0 g.
Proposition 5.4. For any set A and for any n > N, TAn 1
T SAn AS .
12
Proof. Let G An 1
An A be defined as G a `a I n , an e (the first coordinate is called the projection
of a to the first n coordinates).
Proof. This can either be proved by induction on n using Proposition 5.4 and the fact that if SAS SB S,
SC S SD S then SA B S SC DS or directly as below.
Suppose that f A B is a bijection. Define a function G An B n as follows. Given s n A, let
G s f X s. We can also define H B n n
A by H t f 1
X t for any t > B n . Note that H X G s
1
H f X s f
X f X s s (by associativity of composition, see Remark 3.4). So H X G idAn . Similarly,
GXH idB n . By Proposition 3.3, G is a bijection.
Definition 5.6. Given a set A, let A be the set of all finite tuples of elements of A. Equivalently,
A
A n
Sn > N .
Exercise 5.7. Why is A a set when A is?
Proof. Similar to the proof of Lemma 5.5 (the same definition of the function G works).
Definition 5.9. For a > A , let len a (the length of a) be the unique n such that Dom a
n .
(2) The set A is countable.
Proof. Note that if A is finite, (1) follows from Exercise 3.18 (and Lemma 5.5) by induction. As for (2),
note that A b A 8 N so if we knew (2) for countable sets, it would follow from Corollary 4.9 that A is
either finite or countable. However, since A is nonempty, for some a > A, A contains an S n > N where an
is the constant sequence an n A, i.e., an k a for all k @ n. This shows that A must be infinite
For n 0, H0 g 1. For n 1, Hn 1 a Hn a I n pann 1 . Then Hn is injective by the fact the unique
factorization of natural numbers into product of primes. (Note that we do not really need to take ai 1, but
it will be convenient for (2), see below.)
To prove (2), note that both proofs of (1) actually gave us more. Let B be the set of all bijections f of
the form f An N for some 1 B n. Then there is a function F B B such that if Dom f An then
Dom F f An 1 . (Let’s see this this formally, using the first proof of (1). Given a bijection f An
N,
n1 n
we want to define F f . There is a bijection C A A A, and there is a bijection D A N, as
13
well as a bijection E N N N. Let J be the function f E, by which we mean J An A N N,
J a, b `f a , D be. Then let F f E X J X C.)
By Theorem 4.4, there is a function G N B such that G n 1 F G n for all n > N and G 0 is
n1
some bijection from A to N. In particular, G n is a bijection from A to N. Then let H A
NN
be such that H a `len a , G len a 1 ae for all a with len a A 0, and H g `0, 0e. Then H is
injective, because if H a H b then len a len b, so this follows from the fact that G k is injective
for all k > N. Since obviously A is infinite, we are done by Corollary 4.9.
The alternative proof for (1) makes things slightly easier (again, we may assume that A N by Lemma
5.8). Why? because here we can just take H `g, 1e 8 G n S n > N and it is already injective (it is
well defined because the domains of G n for different n’s are disjoint, using Exercise 4.6). Indeed, we get
that H a Li@lena paii 1
. Then for the length of a is just the largest i for which pi divides a, plus 1.
Remark 5.11. The proof of Theorem 5.10 (2) may seem complicated, but it really just a formal way of writing
a very easy argument that says how to find a sequence of injective functions An N. Once we have that
sequence, we can define H. (Note: using the axiom of Choice, to be discussed much below, there is no need
for any recursion argument!)
The alternative proof of (2) is just to look at the function H a Li@lena pai i 1
(the product of powers
of primes), and this is obviously injective. But to formally define this function we need to use recursion (to
even define the sequence of primes we need recursion, and then also the product of finite sequences).
Notation 5.13. Suppose A is some set. Let PFin A be the set of all finite subsets of A.
Corollary 5.14. Suppose that A is a nonempty set. Then A is countable iff PFin A is countable.
Proof. Suppose that A is countable. Let F A
PFin A be defined by F a Range a. Then F
is surjective (because given some finite subset s b A, by definition there is a function a n s, so
F a s). Hence PFin A is countable or finite by Proposition 4.14. But it cannot be finite, since it
contains a S a > A which is infinite.
On the other hand, if PFin A is countable, then the injective function taking a to a shows that A is
either countable or finite (using Corollary 4.9). However, it cannot be finite by Exercise 3.18 (PFin A b
P A and so finite).
Exercise 5.15. Show that a set A is finite iff PFin A is finite.
Recall that a number a > R is real algebraic if there is some nontrivial polynomial f X > Q X such
that f a 0. But what do we mean by polynomials? they are not one of the objects that we have in our
universe (i.e., sets and numbers). We identify a polynomial with the finite sequence of its coefficients, i.e.,
we identify the polynomial Pni 0 ai X i by `ai S i @ n e.
Notation 5.16. Denote the set of real algebraic numbers by Ralg .
6. Week of 12/12/2016
Definition 6.1. Suppose that A, B are sets. We write SAS B SB S and say that the cardinality (or size) of A
is at most that of B if there is an injective function F A B.
We write SAS @ SB S and say that the cardinality of A is strictly less than that of B if SAS B SB S but SAS x SB S.
In other words, we defined SAS @ SB S so that [SAS B SB S iff SAS @ SB S or SAS SB S].
Remark 6.2. The relation SAS B SB S is transitive: if SAS B SB S and SB S B SC S then SAS B SC S: take the composition
of the injective functions, and use Exercise 3.6.
Example 6.3. For n, m > N, S nS @ S mS iff n @ m (see Corollary 3.12 above). The same is true after
replacing @ by B. In addition, S nS @ SNS (by the pigeon hole principle for finite sets, Corollary 3.16).
Proof. First we have to show that SAS B SP AS. This is done by the injective function taking a to a.
The difficult part is to show that there is no bijection between A and P A. This is very similar to Russell’s
paradox. Suppose that F A P A is any function. Then consider the set S x > A S x ¶ F x b A. We
will show that S ¶ Range F .
Consider any a > A. Is it true that a > F a? if yes, then a ¶ S so F a x S. If a ¶ F a, then a > S, so
F a x S. So in any case F a x S, so S ¶ Range F .
This shows that F is not onto. Since this is the case for any function, we showed in particular that there
is no bijection F A P A.
Definition 6.5. For two sets A, B let AB be the set of functions f B A. For a number n > N, we may
B
write nB instead of n .
The following theorem implies that the relation SAS @ SB S is also transitive.
which is impossible.
Let us now check that G is onto. Given y > B, we must find some x > A with G x y. If y > C then
G y y. Otherwise, y > Range F and F 1
y (which is well defined) is not in C, which means that
1 1
G F
y F F
y y.
This completes the proof.
Here is another, perhaps more visual, explanation of the choice of C0 , which I will sketch. For every a > A,
consider the sequence
n n1 1 n
a. . . , F a , F a , . . . , F a , a, F a , . . . , F a , . . .f .
Proof. By Exercise 6.9, Proposition 6.6 and Cantor-Bernstein it is enough to prove that SRS B S P QS and
that T2 N
T B SRS.
For the first inequality, define the function taking r > R to q > Q S q @ r (why is it injective? because if
r1 @ r2 then there is a rational number q between r1 and r2 ).
For the second inequality, consider the function mapping each f N 2 to af Pi 0 f i 3 i . Notice that
ª
03
i
converges to 3~2, so Pni k0k
f i 3 i B
Pni k0k 3
B 3 k 2; this is a Cauchy sequence). This is also an injective function because if f x g, then let
i
n min j > N S f j x j , and suppose that f n 1, g n 0. Then, as Pi n 1 3 i 3 n ~2, we have that ª
ª n
af Q f i 3 i
C Q f i 3 i
i 0 i 0
n1 ª
A Q f i 3 i
Q 3 i
i 0 i n1
n1 ª
C Q g i 3 i
Q g i 3 i
ag .
i 0 i n1
16
By Cantor’s theorem we have finally shown that:
Corollary 6.12. There is a non-algebraic real number (See before Notation 5.16).
Proof. By Corollary 5.17, TRalg T @ SRS, so in particular, Ralg x R and we are done.
Exercise 6.13. * Show that for all n > N there is a finite sequence of length n of real numbers `a0 , . . . , an 1e
such that for no polynomial 0 x f > Q X0 , . . . , Xn 1 (this denotes the set of all polynomials in n variables)
is it the case that f a0 , . . . , an 1
0. These kind of n-tuples of real numbers are called algebraically
independent.
Hint: look at how we did this for n 1 and use induction.
7. Week of 19/12/2016
Definition 7.1. A binary relation (see Definition 2.3) B on a class X is called a partial order relation on X
if it is transitive (for all x, y, z > X, if x B y B z then x B z), reflexive (for all x > X, x B x) and anti-symmetric
(for all x, y > X if x B y and y B x then x y).
A binary relation @ on a class X is called a sharp partial order relation on X if it is transitive and
irreflexive (for all x > X, it is not true that x @ x).
Notation 7.2. Usually when we call a relation a partial order relation without specifying, we mean one of
the two definitions (i.e. sharp or not), and it will be clear from the notation to which one we refer (with a
line on the bottom it is not sharp, and without it is sharp).
Note that if B is a partial order relation on X then the relation x @ y iff x B y and x x y is a sharp partial
order relation (why?). Similarly, if @ is a sharp partial order relation on X then the relation x B y iff x @ y
or x y is a partial order relation.
Example 7.3. The usual order relation on R; The relation “n divides m” ; b on the class of sets.
Definition 7.4. A binary relation V on a a class X is called a preorder relation on X if it is transitive and
reflexive.
Definition 7.5. The cardinality of a set A, denoted by SAS, is the equivalence class of A under the equivalence
relation of having the same cardinality (see Lemma 3.8).
17
Note that the cardinality of A is a class and not a set, so we cannot talk about the set of cardinalities
or even the class of cardinalities. (Later, however, we will see that it is possible to talk about the class of
cardinal numbers).
By Definition 7.5, the cardinality of a finite set A of size n is just the class of all finite sets of size n.
Recall: we introduced the notation SAS n to say that SAS S nS, but for now this is just notation. Later we
will attach to each cardinality a specific set which we will identify with it. In this way the expression SAS n
will make sense.
The relation SAS B SB S is really a partial order relation on cardinalities in the following sense: if SA S
S AS
and SB
S SB S then SAS B SB S iff SA S B SB S (it is not a relation on cardinalities in the sense of Definition 7.1
because the cardinalities do not form a class). Similarly, the relations SAS @ SB S and SAS SB S are relations on
cardinalities.
7.3. The arithmetic of cardinalities. Now we would like to define a sum of two cardinalities. Formally,
a sum of two cardinalities is a 2-place function on cardinalities. What do we mean by that?
Definition 7.6. A well-defined 2-place function on cardinalities is a relation F on triples (i.e., sequences of
length 3 — think of this as the graph of the function), such that:
Y If F A, B, C and SA S
SAS, SB S SB S and SC
S SC S then F A , B , C
(meaning that F depends
only on cardinalities).
Y For all A, B there is some C such that F A, B, C (meaning that the domain of F is the set of all
pairs of cardinalities).
Y If SA S
SAS, SB S SB S and F A, B, C , F A , B , C
then SC
S SC S (meaning that it is a function:
the cardinality of the result only depends on the cardinality of the input).
If F is a well-defined 2-place function on cardinalities, we can write F SAS , SB S SC S instead of F SAS , SB S , SC S
and this notation makes sense: F SAS , SB S is the unique cardinality SC S such that F SAS , SB S , SC S. In the
future we will just say that F is well-defined (instead of a well-defined 2-place function). We will freely
compose such functions. The following exercise says that this is fine.
Exercise 7.7. (1) Extend Definition 7.6 to define what is a well-defined n-place function on cardinali-
ties.
(2) Suppose that F, G and H are three well-defined 2-place functions on cardinalities. Let R x, y, u, w, z
be a relation on 5-tuples such that R x, y, u, w, z iff for some e, f , G SxS , Sy S SeS and H SuS , SwS
Sf S, and F SeS , Sf S Sz S. Show that R defines a well-defined 4-place function on cardinalities. Explain
why R is the composition F X G, H .
Remark 7.8. Note that this is not a function in the usual sense, since we cannot talk about the class of
cardinalities.
7.4. Addition. In the finite case, the sum of the cardinalities of two finite sets, one of size n and the other
of size m, would be n m. How did we get there? one approach is to take a disjoint union of two sets, one
of size n and the other of size m, and take its cardinality. This is exactly how we define the sum of two
cardinalities in general.
Definition 7.9. Given two cardinalities SAS and SB S, we define SAS SB S SA 0 8 B 1S.
The next lemma implies that the sum of two cardinalities is well-defined (i.e., that the relation F A, B, C
which says that SC S SA 0 8 B 1S is a well-defined 2-place function on cardinalities).
18
Lemma 7.10. Suppose that A, B are sets. Then SA 0 8 B 1S is the cardinality of any union A
8 B
where SA S
S AS and SB
S SB S,
and A , B are disjoint.
Lemma 7.11. We have that S nS S mS S n mS and hence for any two finite sets A, B such that SAS n
and SB S m, SAS SB S n m.
Proof. Let a n, . . . n m 1. Note that SaS m, a is disjoint from n and that n 8 a n m. By
Lemma 7.10 we are done.
Using the notation already established, we write SAS n for SAS S nS.
Let us list some properties of the sum.
Proof. (1) By Lemma 7.10 and the definition of the sum, SAS SB S SC S is the cardinality
SAS SB S SC S .
(2), (3), (4), (5) are clear (for (5), if f A C is injective, and B 9 A 8 C is empty, then f 8 idB
As we already established (by Theorem 6.10) that T2N T SRS, we will also write T2N T ¯
2 0 . We also call
this cardinality the continuum, and this is denoted by ¯ in some texts. Under this notation, Corollary 6.11
says that ¯0 @2 . ¯0
However this easily also follows from power considerations, which we will do next time.
One has to be a bit careful with cardinal arithmetic: although some of the rules are the same as for finite
arithmetic, we see that for instance ¯0 ¯0 ¯0 , and this is impossible for any finite number A 0. In addition,
it is impossible to define subtraction as we do not even have x y x z y z as 0 0
0 1. This
¯ ¯
example also shows that it is not true that x y B x z y B z (what if we replace B by @? we will return
to that).
So far we have not ruled out the existence of an infinite cardinality SB S such that ¯0 B~ SB S. However, note
that if such a SB S exists then ¯0 does not swallow SB S, otherwise SB S B ¯0 and hence must be either finite or
countable, so SB S ¯0 .
Proof. (4) implies (3): by assumption SAS SB S ¯ 0 (where B AF N with F N A injective). Hence
SAS ¯0 SB S ¯ 0 ¯0 SB S ¯0 SAS. (3) implies (2): if SAS swallows ¯0 , then SAS B SAS n B SAS ¯0 SAS.
Obviously (2) implies (1).
(1) implies (4): suppose that SAS 1 SAS. Then there is an injective function from A to some proper subset
B ø A. Why? let b ¶ A be some element (you can choose b > P A if you like). Then by assumption there is
a bijection f A 8 b A so let B f A. Let a > AB. Then f a x a. By induction on n > N, one can
prove that moreover f n
a ¶ f k
a S k @ n (note that f n a x a and use the induction hypothesis).
Hence by Theorem 4.4, the function F n f n a is an injective function from N to A.
7.5. Multiplication. Next we would like to define multiplication of cardinalities. This will also be a 2-place
function of cardinalities as in Definition 7.6.
Lemma 7.20. For n, m > N, S nS S mS n m, so multiplication of finite cardinalities identifies with the
usual multiplication on finite cardinalities.
Using the notation already established, we write SAS n for SAS S nS.
Here are some properties of the product.
Similar to the situation with subtraction, one cannot define dividing (as ¯0 1 ¯0 2).
Proposition 7.23. For any two sets A, B such that SAS , SB S C 2, SAS SB S B SAS SB S.
Proof. Suppose that A, B are disjoint. Let a , a > A and b , b > B such that a x a and b x b .
7.6. Power. Next we would like to define the power of one cardinality by another.
SB S
Definition 7.25. For any two sets A, B, SAS TAB T.
8. Week of 26/12/2016
Definition 8.1. Given two sets X, Y , let π0X,Y X Y X and π1X,Y X Y Y be the left and right
projections respectively (e.g., π1X Y `x, y e y).
(2) We have to show that A1 A 1 A0 has the same cardinality as A (recall Definition 5.3). Let
f A1 A be f s s 0. It is a bijection.
(3) Since any function whose range is g is empty, its domain is also g , so the set of functions f A g
is g if A x g.
(4) We may assume that B 9 C g (otherwise replace them with B 0, C 1). We have to show
that TAB 8C
T TAB AC T. Define F AB 8 C
AB AC by F g `g I B, g I C e. Then G AB AC AB 8 C
By Corollary 8.3, for n > N, we can write SAS and nSAS without any confusion (because when A is finite
n
Proof. 2 ¯0
2 ¯0 ¯0 ¯0
2 ¯
2 0 , and 2 ¯0
B ¯0 0 B 2
¯ ¯0 ¯0
2 ¯0 ¯0 ¯
2 0.
Proof. Let F C I
RQ be given by F f f Q. Then F is injective by continuity and density of Q in
R. (If f1 , f2 > C and f1 x x f2 x for some x > R, then there is some open interval I around x such that
, so that for any q > I, f1 q x f2 q .) So we get that SC S B 2
¯0
f1 I 9 f2 I g
¯0
¯
2 0 . On the other
hand, since every constant function is in C, 2 ¯0
B SC S.
8.1. The Axiom of Choice. Suppose that A is a set. We already know that if there is an injective function
f A B such that B ø A, then SAS B SAS 1 B SB S 1 B SAS so A swallows 1 and hence A is infinite and
moreover ¯0 B SAS (Proposition 7.17). In particular, if A is finite then there is no such f , but we do not know
that if there is no such f then A is finite. In other words, we do not know whether being infinite is the same
as swallowing 1, i.e., the same as having such a function, i.e., the same as being of cardinality at least ¯0 .
Let us try to prove the last point directly.
Since A is infinite, it contains at least one point, x0 . Since it is not true that A b x0 , there is at
least one more point in A, x1 . Since x0 , x1 do not cover A, there is at least one more point, x2 .
By recursion, this gives us an injective function F N A, and hence SAS C ¯0 .
What is the problem with the argument in ? Well, Theorem 4.4 gives us one way to define a function by
recursion, so in order to make the argument in precise, we should find some set B, some point b > B, and
some function f B B (so that F n f n b or some variant of this). Let B A , the set of all finite
sequences from A. We want a function f A A such that, given s > A , if len s n (see Definition
5.9), then len f s n 1, and f s n 1 > A Range s. If we manage to do that, then let G N B
be defined by recursion, starting with b g , and let F G n S n > N . Then it is an exercise to see that
F is well defined and injective. However, how do we know that such an f exist?
Here is an example of the trouble in finding functions such as the one described above. Suppose we are
given an infinite set S of pairs of real numbers (i.e., each set in S has size 2). Suppose we want to find a
function f S R such that for each s > S, f s > s. Then there is no problem, just take f s min s.
23
However, suppose that we are now interested in doing the same but this time with P R? Then there is no
clear way to choose a “minimal” set.
([Bertrand Russell] A more illustrative example, while less formal, is to think of a problem of finding a
function f from an infinite set of pairs of shoes such that for each pair, f chooses one of the shoes in that
pair. Then just choose the right shoe at each time. However, what if we have to do the same for pairs of
socks?)
Definition 8.6. A choice function from a set X is a function f X X such that if g x a > X is a set
then f a > a.
The Axiom of Choice is a very important axiom. It has many equivalent formulations, some of which
we will discuss. For example, it is equivalent to the statement that every vector space (including infinite
dimensional ones) has a basis. The problem with this axiom is that it is non-constructive: it says that there
is a choice function, but doesn’t say how to get it. Proofs that use Choice are thus non-constructive, and
mathematicians often prefer constructive proofs. However, due to the many important results that Choice
imply, mathematicians have largely adopted this axiom.
From now on, we assume that the Axiom of Choice holds, unless we explicitly say otherwise.
Theorem 8.11. If f A B is surjective, then there is an injective function g B A such that f X g idB ,
and in particular SB S B SAS.
Note that we already know this theorem in the case when A is finite (Proposition 4.1) or countable
(Proposition 4.14). The proof in the countable case really used Exercise 8.7.
Exercise 8.12. Show that the statement of Theorem 8.11 implies the Axiom of Choice: if every surjective
function f A B has an inverse as in Theorem 8.11, then the Axiom of Choice holds.
24
9. Week of 2/1/2017
9.1. Infinite sums and products of cardinalities. Another use of the axiom of choice is to define the
cardinality of an infinite sum and infinite product of cardinalities.
Let us start with the sum. Suppose that we are given a set of sets X, and we want to define Px>X SxS where
we take the sum in some “order”. We want to define it in such a way that if X is finite, then Px>X SxS is exactly
the sum of the cardinalities of all sets appearing in X, ordered in some order (note that by associativity and
commutativity of the sum, we know that this does not depend on the order in which we put X).
To make this formal, we fix a function f I X, and we take the sum Pi>I Sf iS.
Definition 9.1. Suppose that X is a set of sets and f I X is some function. We write Pi>I Sf iS for
S i>I f i iS. This is the sum of the sequence of cardinalities defined by f .
We would like to say that this sum is well defined as a function on cardinalities in the following sense: if
gI X is such that Sg iS Sf iS for all i > I, then Pi>I Sf iS Pi>I Sg iS.
The next lemma says this and more (compare to Lemma 7.10).
Lemma 9.2. Given a function f I X where X is a set of sets, the operation Pi>I Sf iS is well-defined
as a function of cardinalities, and moreover, it equals S g i S i > I S whenever g I Y is such that
Sg iS Sf iS for all i > I and g i 9 g i
g for all i x i > I.
Pi> n SXi S SXn S (where the is the usual sum on two cardinalities).
(3) Suppose that X Xi Xj for all i, j > I (i.e., it is a constant sequence). Then Pi>I SXi S SI S SX S.
(4) Suppose that SXi S B SYi S for all i > I. Then show that Pi>I SXi S B P SYi S.
Now we want to define the infinite product Li>I Sf iS. Recall that in the finite case we just took the
Cartesian product of the sets involved. So first let us define the infinite product of sets.
Definition 9.5. Given a sequence of sets `Xi S i > I e where Xi is a set for all i > I, let Li>I Xi be the set of
functions f I Xi S i > I such that f i > Xi for all i > I.
Note that if some Xi g , then Li>I Xi g . Also note that if Xi Xj X for all i, j > I, then Li>I Xi XI .
25
Lemma 9.6. The Axiom of Choice is equivalent to saying that whenever `Xi S i > I e is a sequence of nonempty
sets, then Li>I Xi x g.(In fact, Li>I Xi is the set of all choice functions.)
Proof. From left to right we leave to the reader. Suppose that X is some set of sets, and we want to find
a choice function from X. We may assume that g ¶ X (otherwise replace X with X g). As Lx>X x x g
(we use the identity function idX for the sequence), we are done.
Definition 9.7. Suppose that `Xi S i > I e is a sequence of sets. Then Li>I SXi S SLi>I Xi S.
Again we have to check that this is well defined as a function on cardinalities.
Lemma 9.8. Suppose that `Yi S i > I e is such that SYi S SXi S for all i > I. Then Li>I SXi S Li>I SYi S.
Proof. Again, using Choice, there is a function G I YiX i
Si > I such that G i is a bijection from
Xi to Yi . Then we can define F Li>I Xi Li>I Yi by F f i G i f i for all i > I. This is a
1
bijection whose inverse is given by H g i G i g i.
Definition 9.10. Suppose that `X, @e is a partial order (this notation means: X is a set and @ is a partial
(sharp) order on X). Recall that we denoted by B the partial order x B y iff x @ y or x y.
(1) The order @ is called a linear order (sometimes also a total order) if for all x, y, either x B y or y B x.
(2) An element x > X is called minimal (or @-minimal if we want to specify the order) if there is no
y > X with y @ x.
(3) An element x > X is a minimum (or a @-minimum when we want to specify the order, or first) if for
all y > X, x B y.
(4) An element x > X is maximal (or @-maximal) if there is no y > Y such that x @ y.
(5) An element x > X is a maximum (or a @-maximum, or last) if for all y > X, y B x.
I I
(6) When Y b X, we may write `Y, @e for `Y, @ Y e where @ Y is the reduction of @ to Y , i.e., @ 9 Y 2 .
Example 9.13. The partial order `N, Se where x S y iff x divides y is a partial order with maximum 0.
However, `N 0 , Se has no maximum.
Exercise 9.14. Every finite partial order `X, @e has a minimal element and a maximal element (prove by
induction on SX S).
9.3. Well-orders. Proofs by induction are very useful for proving theorems on the natural numbers. Why
do they work?
Suppose we want to prove a property P x on all natural numbers. Well, we first check that P 0 holds.
Now we want to check that P 1 holds, and then that P 2 holds and so on. But this process will last
forever, so we cannot hope to prove that P x holds for all x > N this way. Induction says that it is enough
to check that P 0 holds, and that
Y For all n > N, If P n holds then P n 1 holds.
Then it allows us to conclude that P n holds for all n > N.
Why does this work? Roughly speaking, this works because the usual order @ on N orders the elements
one-by-one, so that given n, it will appear at some point in this order, and then we already know that
P n 1 holds, so P n holds as well.
We want to formalize this reason.
Definition 9.15. A linear order `X, @e is called a well-order (or well-ordering) if every S b X such that
g x S contains a minimal element for `S, @e. We denote this element by min S.
Proof. Suppose that S b N is nonempty. We want to find a minimal element for S. Suppose no such element
exist, and prove that S g . We prove by induction on n > N that n 9 S g. Well, 0 9 S g , because
0 g . Suppose that n 9 S g but n 1 9 S x g. But then n > S and for all i @ n, i ¶ S. This means
that n is minimal for S — contradiction.
Recall that the induction principle on N can be viewed in two equivalent ways. One is the way stated
above. The other is: if P x is a property of natural numbers such that if P x holds for all x @ n then
P n, then P n holds for all n > N. (To see the equivalence, the usual induction implies this one by looking
at the property Q x saying that P x holds for all x @ n. Then Q 0 holds trivially, and if Q n then
P n, so Q n 1. The other direction is an exercise.)
This second view of the induction principle on N extends to any well-ordered set.
27
Theorem 9.19. (Generalized induction principle) Suppose that `X, @e is a well-order. Suppose that P is a
class of objects (which, recall, is the same as all objects satisfying a property). Suppose that for all x > X, if
y > P for all y @ x then x > P , then X b P .
Proof. Suppose not, i.e., that X b~ P . Then S X P x g. Then min S ¶ P but for all y @ min S, y > P , which
contradicts the condition on P .
The generalized induction principle, when applied to sets which are not finite nor N, is called also transfinite
induction.
Exercise 9.20. The generalized induction principle is equivalent to being well-ordered. More precisely, if
`X, @e is a linear order such that the generalized induction principle holds on X, then `X, @e is a well-order.
Definition 9.21. Suppose that `X, @X e and `Y, @Y e are two partially ordered sets. Then an injective function
f X Y is called an embedding (of orders) function if for all x, y > X, x BX y iff f x BY f y . A function
f X Y is called an isomorphism (of orders) if it is a surjective embedding. We write `X, @X e `Y, @Y e
to mean that there is an isomorphism f X Y . We sometime remove @X and @Y from the notation when
these are clear.
Exercise 9.22. (1) In the definition of an embedding, there is no need to require that f is injective (it
follows).
(2) If `X, @X e is linearly ordered, and `Y, @Y e is a partial order, and X Y , then Y is linearly ordered
(similarly for well-ordered).
(3) If `X, @X e is linearly ordered and `Y, @Y e is a partial order then any function f X Y is an
embedding iff it is order preserving: if x @X y then f x @Y f y .
Lemma 9.23. Suppose that `X, @X e, `Y, @Y e and `Z, @Z e are partial orders. If f X Y and g Y Z are
1
both order-preserving, then so is g X f X Z. In addition, if f X Y is an isomorphism then f Y X
is also an isomorphism. It follows that is an equivalence relation between partial orders.
Lemma 9.24. If `X, @e is a well-order, and f X X is order preserving then x B f x for all x > X.
Proof. Let S x > X S f x C x . We want to prove that S X. We use Theorem 9.19 (i.e., generalized
induction): we have to show that for all x > X, if every y @ x is in S then also x > S. So suppose x has this
property (that for all y @ x, y > S), and suppose that x ¶ S, so f x @ x.
If y @ x, then y B f y @ f x so y @ f x for all such y. If we put y f x, we get that f x @ f x —
a contradiction.
Definition 9.25. If `X, @e is a partial order, then a subset Y b X is called an initial segment if for all y > Y ,
if z @ y, then z > Y .
that X X.
28
(2) Suppose that f X X is an isomorphism. Then on the one hand x B f x for all x > X, and on
the other hand x B f 1
x for all x > X. Applying f on both sides of the last inequality we get f x B x.
Together we have that x f x.
Exercise 10.1. Suppose that `X, @e is a linear order. Let IS X C b X S S is an initial segment . Then
C is linearly ordered by b, and for any S b IS X , S > IS X (this is an upper bound of all initial
segments C > S).
that X X.
(2) Suppose that f X X is an isomorphism. Then on the one hand x B f x for all x > X, and on
the other hand x B f 1
x for all x > X. Applying f on both sides of the last inequality we get f x B x.
Together we have that x f x.
1
(3) Suppose that f, g X Y are isomorphisms. Then g X f X X is an isomorphism from X to itself
1
so g X f idX , so f g.
(4) Suppose that X Z and X
Z . Then Z Z . By Exercise 10.1 either Z b Z or Z b Z . In either
case, by (1), Z Z.
Lemma 10.3. Suppose that `X, @e is a well-order. Then every proper initial segment has the form X@x
y > X S y @ x for some unique x.
Proposition 10.4. If `X, @e is a well-order, then every element x > X which is not maximal has a successor:
a (unique) element y > X such that x @ y and for all z > X, if x B z B y then z x or z y.
Proof. Consider the set S y > X S x @ y . Since x is not maximal S is not empty. Hence S has a minimal
element. This element is the successor of x.
Exercise 10.5. Find a well-order `X, @e and some element x > X which is not minimal and not a successor.
Such an element is called a limit.
We would like to extend the definition by recursion (Theorem 4.4) to any well-ordered set.
Theorem 10.6. (Generalized definition by recursion) Suppose that `X, @e is a well-order, and Y some set.
Suppose that D f X@x Y S x > X and that H D Y is some function. Then there is a function
F X Y such that for all x > X, F x H F I X@x .
Note that Theorem 10.6 implies Theorem 4.4: there, we were given f Y Y and a > Y and wanted to
find a function F N Y such that F 0 a and F n 1 f F n. We apply the generalized theorem
29
with X being N so that D is the set Y
of all finite sequences. Then we let H s a if s g , and otherwise,
H s f s len s 1 (i.e., the last element in s). Applying the theorem, we get F N Y such that
F 0 H g a and F n 1 H F I n 1 f F n as we wanted.
The proof of Theorem 10.6 is not much different than the proof of Theorem 4.4, but there is one added
ingredient: the limit case. We include it here, but this was not done in class.
Proof of Theorem. For x > X, let Sx be the set of functions F XBx Y such that for all z > XBx ,
F z H F I X@z . We prove by induction on x > X (using Theorem 9.19) that Sx x , in fact SSx S
g 1.
During the proof we will denote the unique element of Sx by Fx .
Suppose that the statement is true for all x @ x (i.e., that Sx is a singleton) and we want to show the
same for x.
Note that if x @ x @ x then Fx
I XBx > Sx so by the induction hypothesis, Fx Fx I XBx . This
means that the union Gx Fx S x
@ x is a well defined function with domain XBx S x @ x X@x .
Moreover if Fx exists then Fx I XBx Fx for every x @ x so Fx I X@x Gx . Hence, it must be that
Fx x H Gx . Hence if Fx exits, it is unique and we must define Fx Gx 8 `x, H Gx e. Now it is easy
to check that Fx indeed belongs to Sx and we are done.
Finally, we let F
Fx S x > X . Then we must check that F is indeed a well-defined function (which
is true as Fx b Fx whenever x @ x ), and that F satisfies the conditions of the theorem (which is true by
definition of Fx ).
We apply Theorem 10.6 to prove the following result. We will also give a proof that does not use Theorem
10.6.
Theorem 10.7. (The comparison theorem for well-orders) Suppose that `X, @X e and `Y, @Y e are two well-
orders. Then one of the following happen.
(1) There is an initial segment D b Y such that X D.
(2) There is an initial segment C b X such that Y C.
Proof. Suppose that (1) does not occur and we prove (2).
As (1) is false, X x g (as g is always an initial segment). Let x0 > X.
We want to define an embedding F Y X by recursion (we also say that we build F by induction). For
this, we have to find a function H from the set of functions with domain an initial segment of Y to X, such
that F will be “the next element”.
We think of H as taking the (partial) function we already constructed and adding one more element.
So, suppose that s C X where C b Y is a proper initial segment, and we want to define H s. Since
in the end our function will be an embedding of Y onto an initial segment of X, we are only interested in
such s’s where s is itself an isomorphism from C to an initial segment D of X. If s is not such, just define
H s x0 .
If s is such an isomorphism s C D where D is an initial segment of X, then D x X, because otherwise
1
s
X C will give us (1). Let x min X D, and let H s x. This defines H.
By Theorem 10.6, there is a function F Y X such that F y H F I Y@y .
Let us check that F is an isomorphism from Y to an initial segment D of X.
We prove by induction on y > Y that F I YBy is an isomorphism onto XBF y . This is enough because
then F F I YBy S y > Y Y XBF y S y >Y is an isomorphism.
30
So suppose that this is true for all y @ y. Then in particular, F
I Y@y F I YBy S y
@ y is an
isomorphism from an initial segment of Y to D XBF y S y
@ y which is an initial segment of X. So
Here is an alternative proof of this theorem (this proof was not given in class).
Proof. Let
Note that for C > S, both D and f are uniquely determined by Corollary 9.26 (3) and (4), so we call these
DC and fC .
Claim. S is an initial segment in F S X (see Exercise 10.1). Moreover, if C b C > S is an initial segment
then fC fC IC .
Proof of Claim. We have to show that if C b C > S and C is an initial segment of X, then C > S and
fC fC IC.
Indeed, this follows as fC IC
is an isomorphism from C to some initial segment of DC ,
which is itself an initial segment of Y . Since fC is uniquely determined, we are done.
fC x @ fC x
f x .
isomorphism from E 8 x to D 8 y , both initial segments, and hence E 8 x > S. But then E 8 x b E,
and x > E which cannot happen. Hence D Y and we are in case (2).
Corollary 10.8. If X, Y are two sets which are well-orderable: there are some @X and @Y such that `X, @X e
and `Y, @Y e are well-orders, then either SX S B SY S or SY S B SX S.
10.1. Ordinals. We just saw that any two well-orders can be compared in the sense that one has to be an
initial segment of the other. We can say that two well-orders are equivalent if they are isomorphic. We will
define a class, the class of ordinals, which are canonical well-order representatives for the classes. In other
words, for each well-order `X, @e there is a unique ordinal αX which is isomorphic to X (so any two ordinals
are equivalent iff they are equal).
Definition 10.9. A set A of sets is called transitive if for every a > b > A, a > A. In other words, if b > A,
then b b A.
Example 10.10. ,
g g are transitive, while g is not.
Exercise 10.11. (1) If A is a set and every a > A is transitive, then so is A. If A x , then
g A is
transitive.
31
(2) If a is transitive, then so is a 8 a.
Definition 10.12. A transitive set a is an ordinal (or an ordinal number) if `a, >e is a well-order. Let Ord
be the class of all ordinals.
Example 10.13. ,
g g, g, g are all ordinals.
Proposition 10.14. If α is an ordinal and β > α then β is an ordinal. Moreover, β is an initial segment of
α. It follows that α β > Ord S β > α .
Proof. We have to check that β is transitive: suppose that x > y > β > α. Then y > α, so x > α, so as > is a
partial order on α, x > β by transitivity of > on α. This also shows that β is an initial segment of α. Now we
have to check that `β, >e is a well-order. But β b α so `β, >e is also a well-order by Exercise 9.18.
We would like now to show that >is a well-order on the class of all ordinals. We only defined partial orders
on sets, but we can easily extend this definition to classes: when P is a class and @ is a relation on P , then
we define when `P, @e is a partial order or linear order in the same way as we did when P was a set.
Definition 10.15. A class `P, @e is a well-order if it is a linear order such that every nonempty set s b P
has a minimum.
Note that by the second condition, if `P, @e is a well-order class, and Q b P is a nonempty subclass,
then Q has a minimum: take y > Q (exists since Q x ). If y
g min Q we are done. Otherwise, min Q
min x > Q S x @ y .
Claim. If α is an ordinal and b ø α is a proper initial segment of α (which is the same as a transitive proper
subset), then b > α (and in particular, b is an ordinal).
Proof. Let γ min αb (it is an ordinal by Proposition 10.14). Then b b γ: if x > b, then x > α and x x γ, so
x > γ as `α, >e is a linear order and b is an initial segment (so it is impossible that γ > x). On the other hand,
γ b b: if y > γ, then y > α so y > b by minimality of γ.
Proof. Suppose that α > β. Then α b β by transitivity of β, and α x β because of (2) above. On the other
hand, if α ø β then α is an initial segment of β by transitivity of α, so by the first claim, α > β.
(3) > is linear: suppose β x α > Ord. We have to show that either α > β or β > α. Note that by Exercise
10.11, α 9 β b α, β is an initial segment of both α, β. If α 9 β α then α ø β so α > β (by the second
claim). Similarly, if α 9 β β then β > α. If neither happen, then α 9 β > α, β, so α 9 β > α 9 β which
is impossible since α 9 β is an ordinal by Exercise 10.11.
32
(4) `Ord, >e is a well-order:
If g x s b Ord is a set then s has a minimum, which is just s (see Exercise 10.11). By the second
claim, the (sharp) order > on Ord coincides with ø, and the non-sharp order coincides with b . In
this language, the ordinal s b β for all β > s, so to finish we have to show that s > s. Otherwise,
s > β for all β > s (by linearity of > on Ord), which means that s > s, contradicting (2).
Exercise 10.17. If g x s b Ord is a set and α > s, then either min s α or min s min α 9 s (this is an
alternative proof to the existence of minimums).
Proof. Suppose that Ord is a set. Ord is itself a transitive set such that `Ord, >e is a well-order. Hence
Ord > Ord, contradiction.
Definition 10.19. Suppose that `X, @e is a linear order, and s b x. An element a > X is an upper bound of
s if x B a for all x > s. The element a > X is a least upper bound of s if for any upper bound a of s, a B a .
Notation 10.20. From now on we may write @ instead of > for ordinals. So for α, β > Ord, α @ β means that
α > β and α B β means that α b β.
Proposition 10.21. If s b Ord is a set of ordinals, then s is the least upper bound (i.e., supremum) of s
and is denoted by sup s.
Proof. By Exercise 10.11, s is an ordinal. Now this follows since the non sharp order on Ord coincides
with b.
Proof. By Exercise 10.11, α 1 is a transitive set of ordinals, hence an ordinal by Theorem 10.16. By
definition α @ α 1. If α @ β then α b β and α > β so α 1 b β which shows that α 1 is the successor of
α.
11.1. The natural numbers. This part was done to some extent in the exercises and exercise session.
Recall that we assumed that our universe contains objects which are either sets or number: natural,
rationals, reals, etc. However, we also promised to see how to interpret all these mathematical objects as
sets.
You already saw in the exercises how to interpret the integers Z once you have N (as equivalence classes
of the equivalence relation on N2 given by `x, y e `e, f e iff x f e y), the rationals Q once we are given
Z (as equivalence classes of the equivalence relation on Z NA0 given by `x, y e `e, f e iff x f e y), and
the reals R once we have the Q (as Dedekind cuts, i.e., as proper initial segments without a last element of
Q with the usual order). So to get rid of all these objects it is enough to interpret the natural numbers as
sets. (Formally, we also have to interpret the structure of addition, multiplication and the order on them).
John von Neumann suggested to define the natural number n as the set of all numbers less than n (which
we denoted as n so far). (He also suggested the definition of ordinals we gave here.) In this way, 0 g ,
1 g, 2 0, 1 g, g, 3 g, g , g, g, ...
All these are ordinals!
So we define the natural numbers formally as follows.
Definition 11.6. A natural number* is an ordinal α such that either α 0 or α is a successor and every
β @ α is also either 0 or a successor.
Note that the natural numbers* forms a class of ordinals which is transitive, and if α is there, then so is
α 1, but we do not know it is a set just yet. It satisfies the induction principle: if P is a class of natural
numbers* such that 0 > P and if α > P then α 1 > P then P contains all natural numbers* (take the smallest
natural number* not in P ; it cannot be 0 nor a successor because its predecessor is also a natural number*
so cannot exist).
Now, Theorem 4.4 applied to the function f Ord Ord given by f α α 1 (in Theorem 4.4 the
function was a set, but the proof there works just as well when f is a class) gives us a function F N Ord
such that F n 1 f F n and F 0 0. By induction on natural numbers*, every natural number*
has the form F n for some n > N. Thus the class of natural numbers* is in fact a set, which we can identify
with N.
So from now on, we remove the * from the definition and define the natural numbers as natural numbers*.
The axiom of infinity thus says that the class of natural numbers* is a set. We will continue to denote natural
numbers as n, m, k, etc.
Definition 11.7. The set of natural numbers, N is an ordinal which we also denote as ω. It is the first limit
ordinal.
34
Proof. Since N is a transitive set of ordinals, it itself is an ordinal. It is limit, since if ω α 1 for some
α @ ω, then α is a natural number, so so is α 1, i.e., ω > ω — contradiction. If δ is any limit ordinal, then
0 B δ and for any α @ δ, α 1 @ δ, so by induction every n > ω is @ δ, so ω B δ.
Proof. Suppose that α @ β. Then β is isomorphic to an initial segment of itself, contradicting Corollary 10.2
(1).
Theorem 11.9. Suppose that `X, @X e is a well-order. Then there is a unique ordinal α such that `α, @e
`X, @X e.
We call α the order type of X, and denote this by otp X, @X α.
Theorem 11.10. (Hartogs’ Theorem) For any set X there is some ordinal α such that SαS B~ SX S.
Let F S Ord be the following function. Given `A, Re > S, F `A, Re is the unique ordinal α
such that `A, Re `α, @e, which exists by Theorem 11.9 (in other words, F `A, Re is the order type of
`A, Re). Then F is onto Ord: given α > Ord, as SαS B SX S, there is an injective function f α X, which
induces a well-order isomorphism of `α, @e and `f α , f @e (where f @ `f x , f y e S x @ y @ α ). Hence
α F `f α , f @e.
But then by the axiom of replacement (Axiom 2.24), Ord is a set, contradicting Corollary 10.18.
11.3. Equivalences of the axiom of Choice and Zorn’s lemma. We come back to the axiom of Choice.
This axiom has many equivalent formulations. The most important among them are Zorn’s lemma and the
well-ordering principle.
We start with Zorn’s Lemma.
Definition 11.11. Suppose that `X, @e is a partial order. A set Y b X is called a chain if `Y, @e is a linear
order.
35
Definition 11.12. The statement of Zorn’s Lemma is as follows.
Suppose that `X, @e is a partial order and every chain Y b X has an upper bound. Then X has a maximal
element (see Definition 9.10).
Definition 11.13. The well-ordering principle states that for any set X, there is some relation R on X
such that `X, Re is a well-order.
chain of X. Hence it has an upper bound yβ , and let xβ be such that yβ @ xβ (which exists by assumption).
Note: this formulation is completely accepted as a proof in a more advance course or a paper, but for
us this is not enough, as in particular it doesn’t explain where exactly we use Choice. We give a complete
formal proof below.
Let D f β X S β @ αX (note that D is exactly the same D as in Theorem 10.6: for β > αX , the
initial segments αX @β is just β by Proposition 10.14).
Let H D X be defined as follows. Given β @ αX and f β X, if f is not an order preserving function
from `β, @e to `X, @X e (see Exercise 9.22), then H f x0 (we do not care where f is sent to in this case).
Otherwise, f is order preserving, and in particular, f β b X is a chain, so has an upper bound xf > X. Let
yf > X be such that xf @X yf (which exists by our assumption towards contradiction). Let H f yf .
The construction of H uses the axiom of Choice in two places: first we chose the upper bound xf , and
secondly we chose yf . To make it formal, let G P X X be a choice function. We can define two
functions G1 X X and G2 C X where C Y b X S Y is a chain is the set of chains of X, such
that G1 x G y > X S x @ y and G2 Y G y > X S y is an upper bound of Y . In this notation,
H f G1 G2 f β .
36
Finally, by Theorem 10.6 there is a function F αX X such that for every β @ αX , F β H F I β .
Now we may prove by induction on β @ αX that F I β 1 β 1 X is order preserving. (We did
something similar in the proof of Corollary 10.7). Suppose that this is true for all β @ β. Then F
Iβ
F I β
1 S β @ β (see Proposition 11.5), so F
I β is order preserving, so F β H F I β is @X -greater
than F β
for all β @ β, so F
I β 1 is order preserving and we are done with the induction.
In particular, F is order preserving and we are done.
(2) implies (3).
We are given a set X and we want to find a well-order on X. It is enough to find a bijection f s X
I
for some set s b Ord, since then f @ s well-orders X.
By Theorem 11.10, there is some αX > Ord such that SαX S B~ SX S.
Let S be the set of injective functions f s X for some s b αX , and write f1 j f2 whenever f1 b f2 .
Given a chain T b S, its union fT T is also an injective function from sT Dom f S f > T b αX to
X (we have to check that it is a function and that it is injective. We will show that it is injective, and leave
the fact that it’s a function to the reader. If γ1 x γ2 > sT then fT γ1 x fT γ2 , but this is true because if
γ1 > Dom f1 and γ2 > Dom f2 for f1 , f2 > T then either f1 b f2 in which case γ1 , γ2 > Dom f2 and hence
fT γ1 f2 γ1 x f2 γ2 fT γ2 or f2 b f1 and similarly we get that fT γ1 x fT γ2 ). So we see that
T is an upper bound of T .
By (2) there is a maximal element f > S. Note that Dom f x αX by choice of αX , so there is some
β > αX Dom f . If f is not onto, then there is x > X f s, and hence f 8 `β, xe is still injective thus
contradicting the maximality of f . So f is the bijection we wanted.
(3) implies (1).
Suppose X is some set. We want to find a choice function f P X X (see the formulation of Axiom
8.8). By (3) there is a well-order @X on X. Now given s b X nonempty, let f s min s.
Exercise 11.16. Prove that the axiom of Choice follows from Corollary 11.15.
Proof. We are going to use Zorn’s Lemma, which we already know is equivalent to the axiom of Choice. Let
V be a vector space over a field K. Let S be the set of all B b V which are linearly independent. Then `S, `e
satisfies the condition of Zorn’s Lemma, since given a chain T b S, its union T is also linearly independent.
Hence S has a maximal element B0 . Then B0 is a basis of V : if not, then there is some x > V which is
not the span of B0 , but then x 8 B0 is linearly independent, contradicting the maximality of B0 .
In many places in mathematics one has to deal with equivalence relations and equivalence classes. Some-
times it is very useful to be able to instead represent a class by choosing an element from it in a canonical way
(to be more precise, what we want is to find, for every element x, an equivalent one which only depends on the
class of x, but not on x itself). For instance, we may consider the equivalence relation E on R2 0 given by
37
v R u iff v, u are on the same line from 0 (i.e., if they are linearly dependent). Then a canonical representative
of v E might be the unique member of v E on the upper unit circle `x, y e > R2 T x2 y 2 1, 0 @ y, 1 @ x .
We are interested to do the same thing for SAS, where A is a set.
So far, for a set A, we defined SAS (the cardinality of A) as the equivalence class of A under the relation
SX S SY S, i.e., SAS B S SB S SAS . There is a problem with this definition: the cardinality of SAS is not set
(but a class), and the collection of cardinalities is not a class nor a set (it is a collection of classes). By the
well-ordering principle (Definition 11.13), there is a well-order on A, and thus by Theorem 11.9, there is
some ordinal α which is isomorphic to it. In particular, SAS SαS.
This means that SAS 9 Ord x g, and thus it has a smallest element. This is a canonical representative of
SAS, and from now on, we will just identify SAS with this ordinal.
Definition 12.1. (assuming the axiom of Choice) For a set A, SAS is the smallest α > Ord such that SαS SAS.
Suppose that SAS α. What does that imply on α? It means that α has the property that for all β @ α,
Sβ S @ SαS.
Definition 12.2. A cardinal (some times called a cardinal number) is an ordinal κ such that for all α @ κ,
SαS @ SκS. Infinite cardinals will usually be denoted by κ, λ, µ. Denote the class of cardinals by Card.
Remark 12.3. Suppose that A is a set. Then SAS is a cardinal (because SAS is the minimal ordinal of the same
cardinality by definition).
Conversely, if κ is a cardinal, then SκS κ. Indeed, SκS is some ordinal which is the smallest ordinal of
the same cardinality as κ. Hence SκS B κ, but it cannot be @ κ since κ is a cardinal. (This also follows from
Proposition 12.4 below).
We have now two notions of order: one is between ordinals, and the other between cardinalities. Let us
make sure that they coincide.
Proof. (1) Suppose that κ B λ. Then κ b λ (see Notation 10.20), so SκS B SλS. On the other hand, if SκS B SλS
then κ B λ: if not, then λ @ κ, hence λ b κ so SλS SκS by Cantor-Bernstein (Theorem 6.7), so κ is not a
cardinal.
(2) Suppose that SκS SλS. Then SκS B SλS and SλS B SκS so by (1) κ λ.
(3) Follows from (1) and (2).
Example 12.5. Every natural number (in the sense of Definition 11.6, which we already agree is the
definition of natural numbers) n is a cardinal. In fact these are precisely the finite cardinals (as well as finite
ordinals). So far, for a finite set of size n, we said that the notation SAS n is short for SAS S nS, which is the
same as SAS SnS (because n n), but now with the new definition, SAS n just means this automatically.
Example 12.6. The first infinite ordinal, ω, is also a cardinal (recall that this is just the set of natural
numbers, which is the same as N). In fact it is the least infinite cardinal. In its capacity as a cardinal, we
denote it as ¯0 . Thus the notation from Definition 7.13 gets a new meaning: instead of writing SAS ¯0 as
a short for “SAS SNS”, we now understand it as an equality of two cardinals, that of SAS and ¯0 .
38
How about addition, multiplication and exponentiation of cardinalities? We defined what are SAS SB S,
SAS SB S and SAS
SB S
, as well as Pi>I SAi S and Li>I SAi S. These definitions go through to cardinal numbers: we
write κ λ for SκS SλS, etc., and the output of this operation is a cardinal. Thus, 2 ¯0
gets a precise meaning:
it is the output of the exponentiation of the cardinal 2 by the cardinal ¯0 .
In the exercise, you defined the sum and products of ordinals. This does not give the same definition!
For instance, as ordinals, ω 1 x ω (ordinal addition), but ¯0 1 ¯0 (cardinal addition). This is not a
problem because it will always be clear from the context when the operation is on ordinals and when is it
on cardinals.
Note that Card is well-ordered as it is a subclass of Ord (with the same order, @).
Proposition 12.7. Given a set s b Card, s is a cardinal and it is precisely sup s (i.e., the least upper
bound of s).
Proof. We already know that γ s is an ordinal (by Exercise 10.11 it is transitive, and by Theorem 10.16
it is well-ordered). We also know that γ sup s in the sense of ordinals (by Proposition 10.21). So, by
Proposition 12.4, we will finish by showing that it is a cardinal. Suppose α @ γ. Then for some β > s, α @ β,
so SαS @ Sβ S as β > Card, so SαS @ Sβ S B Sγ S and we are done.
By Hartogs’ Theorem (Theorem 11.10), if κ is a cardinal, there is some ordinal α such that SαS B~ κ, and
so κ @ SαS (as @ is linear on ordinals). Hence for every cardinal κ there is a greater cardinal. (An easier way
to see this is just by noting that 2κ A κ by Cantor’s Theorem 6.4.)
Definition 12.8. Given a cardinal κ, denote by κ the successor of κ in Card (i.e., the least cardinal greater
than κ).
Note that if κ is an infinite cardinal, then κ cannot be a successor ordinal (as α 1 infinite α infinite
SαS Sα 1S) so it must be a limit ordinal.
Proof. Since for all ordinals α, we have that α @ SαS , if Card was a set, we would have had Ord
Card,
but we already know that Ord is not a set (see Corollary 10.18).
Alternatively, if Card was a set, then Card sup Card is a cardinal by Proposition 12.7, which is
greater than all cardinals, so sup Card B sup Card so sup Card @ sup Card contradicting the fact that
@ is an order on Ord.
Theorem 12.10. Suppose that κ is an infinite cardinal (in other words, ¯0 B κ). Then κ κ κ.
Proof. The proof is by induction on cardinals, i.e., by induction on κ > Card. This means that we assume
that the theorem is true for all infinite cardinals λ @ κ and we want to prove it for κ. In particular, the
induction hypothesis implies that for all infinite ordinals α @ κ, SαS SαS @ κ, so Sα αS @ κ. However, this is
trivially true for all finite ordinals (as Sα αS is finite in that case), so the induction hypothesis implies that
for all ordinals α @ κ, Sα αS @ κ.
We will define a well-order @ on κ κ, such that every initial segment has cardinality @ κ. If we succeed,
then, letting α be the order type of `κ κ, @ e, we get that on the one hand, κ B α (as κ B Sκ κS
B α) SαS
and on the other hand, if κ @ α, then κ is an initial segment of α, so as `α, @e is isomorphic to `κ κ, @ e,
To finish we have to show that every initial segment of @ has cardinality @ κ. Indeed, let `α, β e > κ κ.
This means that the initial segment κ κB `α,β e has cardinality at most Sβ 1 β 1S @ κ as we wanted
(we used here that κ is a limit ordinal, so that β 1 @ κ).
Corollary 12.12. (1) Suppose that κ and λ are infinite cardinals. Then κ λ κλ max κ, λ.
(2) The statement of (1) holds also when κ is finite.
Proof. (1) Suppose that max κ, λ λ. We have λ B κ λ B κ λ and κ λ B λ λ λ, so together we are
done.
(2) follows since if λ is infinite and κ n is finite then λ n λ (by Proposition 7.17) and λ n λ...λ
where the sum is done n times so λ by (1) (by induction on n).
Theorem 12.15. Suppose that V is a vector space, and B1 , B2 are bases of V such that SB1 S C ¯0 , then
SB1 S SB2 S.
Proof. The proof uses only this fact from linear algebra:
Fact. If B is linearly independent and finite then span B cannot contain more than SB S independent
elements.
By the fact we already know that SB2 S is infinite. We also get that for every s b B2 finite, the set
sB1 b > B1 S b > span s is finite.
We find an injective function F B1 PFin B2 ω. Since the latter has the same cardinality as B2 , we
are done.
Let @ be a linear order on V (which exists by the well-order principle).
Given b > B1 , let F b `s, ne where s b B2 is finite so that b > span s and n is such that b is the nth
member of sB1 , when ordered by @. (Note that we use Choice to define F .)
40
12.1. The Alephs.
Definition 12.16. For α > Ord we define the cardinal ¯α by induction on α. For α 0, ¯0 is already defined.
sup ¯β S β @ α
For α β 1 a successor, ¯α ¯β (the successor of ¯β in Card). For α limit, ¯α
The definition of ¯α is done by recursion on Ord. We haven’t discussed recursion on a well-ordered class,
but it is precisely the same as in the case of sets. Here, ¯α is the function we get by applying recursion
on the function H f α Card S α > Ord Card where H g ¯0 and otherwise, if α Dom f is a
successor β 1, then H f f β
and if α is a limit ordinal, then H f Range f .
Theorem 12.17. For every infinite cardinal κ, there is some α > Ord such that κ ¯α .
12.2. The continuum hypothesis. Thus, addition and multiplication on infinite cardinals is very simple
— much simpler than the situation on finite ones. What about exponentiation? there, the situation is not
so clear. What do we know? we know that for every infinite cardinal κ, κ @ 2κ by Cantor’s theorem, and
that 2κ B κκ B 2κ
κ
2κ κ 2κ .
Proposition 12.19. Suppose that `λi S i > I e is a sequence of infinite cardinals. Suppose that SI S B sup λi S i > I .
Then Pi>I λi sup λi S i > I .
Proof. Let λ sup λi S i > I . We have that λi B Pi>I λi for every i > I, so λ B Pi>I λi by definition of λ. On
the other hand, Pi>I λi B Pi>I λ SI S λ λ.
Can we get a similar formula for infinite multiplication? no, since we do not even know how to compute
exponentiation.
¯0
The Continuum Hypothesis (CH) says that 2 ¯1 , and the Generalized Continuum Hypothesis (GCH)
¯α
asserts that 2 ¯α1 . Cantor has tried to prove CH but couldn’t. Since then, Gödel and Cohen proved
that CH is independent from the axioms of set theory (there are models of the axioms of set theory in which
CH holds, and others where the negation of CH holds!). For more on this, see the advanced courses in set
theory: axiomatic set theory and forcing.
¯
What can be said about 2 0 ?
Exercise 12.20. (Zermelo-König) Suppose that ``κi , λi e S i > I e is a sequence of pairs of infinite cardinals
and κi @ λi for each i > I. Then Pi>I κi @ Li>I λi .
¯0
Corollary 12.21. It cannot be that 2 ¯ω .
41
Proof. By definition, ¯ω sup ¯n S n @ ω and so ¯ω Pn@ω ¯n
¯0
by Proposition 12.19. Suppose that 2 ¯ω .
Then for all n @ ω, ¯n @ ¯n 1
B 2 . Hence by Exercise 12.20, we have that
¯0
¯ω Q ¯n @ M2 ¯0
2
¯0
¯0
2 ¯0 ¯ 0 ¯
2 0.
n@ω n@ω
42