Set Theory for Verification: II
Induction and Recursion
Lawrence C. Paulson
Computer Laboratory, University of Cambridge
April 1995
Minor revisions, September 2000
Abstract. A theory of recursive definitions has been mechanized in Isabelle’s Zermelo-Fraenkel
(ZF) set theory. The objective is to support the formalization of particular recursive definitions
for use in verification, semantics proofs and other computational reasoning.
Inductively defined sets are expressed as least fixedpoints, applying the Knaster-Tarski Theorem over a suitable set. Recursive functions are defined by well-founded recursion and its
derivatives, such as transfinite recursion. Recursive data structures are expressed by applying
the Knaster-Tarski Theorem to a set, such as Vω , that is closed under Cartesian product and
disjoint sum.
Worked examples include the transitive closure of a relation, lists, variable-branching trees
and mutually recursive trees and forests. The Schröder-Bernstein Theorem and the soundness of
propositional logic are proved in Isabelle sessions.
Key words: Isabelle, set theory, recursive definitions, the Schröder-Bernstein Theorem
Contents
1  Introduction                                                1
   1.1  Outline of the Paper                                   1
   1.2  Preliminary Definitions                                1
2  Least Fixedpoints                                           2
   2.1  The Knaster-Tarski Theorem                             2
   2.2  The Bounding Set                                       3
   2.3  A General Induction Rule                               4
   2.4  Monotonicity                                           5
   2.5  Application: Transitive Closure of a Relation          6
   2.6  Application: The Schröder-Bernstein Theorem            7
   2.7  Proving the Schröder-Bernstein Theorem in Isabelle     8
3  Recursive Functions                                        13
   3.1  Well-Founded Recursion                                15
   3.2  Ordinals                                              17
   3.3  The Natural Numbers                                   19
   3.4  The Rank Function                                     20
   3.5  The Cumulative Hierarchy                              22
   3.6  Recursion on a Set's Rank                             24
4  Recursive Data Structures                                  26
   4.1  Disjoint Sums                                         26
   4.2  A Universe                                            26
   4.3  Lists                                                 27
   4.4  Using list(· · ·) in Recursion Equations              29
   4.5  Mutual Recursion                                      32
5  Soundness and Completeness of Propositional Logic          37
   5.1  Defining the Set of Propositions                      38
   5.2  Defining an Inference System in ZF                    38
   5.3  Rule Induction                                        40
   5.4  Proving the Soundness Theorem in Isabelle             41
   5.5  Completeness                                          43
6  Related Work and Conclusions                               45
1. Introduction
Recursive definitions pervade theoretical Computer Science. Part I of this work [22]
has described the mechanization of a theory of functions within Zermelo-Fraenkel
(ZF) set theory using the theorem prover Isabelle. Part II develops a mechanized
theory of recursion for ZF: least fixedpoints, recursive functions and recursive
data structures. Particular instances of these can be generated rapidly, to support
verifications and other computational proofs in ZF set theory.
The importance of this theory lies in its relevance to automation. I describe the
Isabelle proofs in detail, so that they can be reproduced in other set theory provers.
It also serves as an extended demonstration of how mathematics is developed using
Isabelle. Two Isabelle proofs are presented: the Schröder-Bernstein Theorem and
a soundness theorem for propositional logic.
1.1. Outline of the Paper
Part I [22] contains introductions to axiomatic set theory and Isabelle. Part II,
which is the present document, proceeds as follows.
− Section 2 presents a treatment of least fixedpoints based upon the Knaster-Tarski Theorem. Examples include transitive closure and the Schröder-Bernstein Theorem.
− Section 3 treats recursive functions. It includes a detailed derivation of well-founded recursion. The ordinals, ∈-recursion and the cumulative hierarchy
are defined in order to derive a general recursion operator for recursive data
structures.
− Section 4 treats recursive data structures, including mutual recursion. It
presents examples of various types of lists and trees. Little new theory is
required.
− Section 5 is a case study to demonstrate all of the techniques. It describes an
Isabelle proof of the soundness and completeness of propositional logic.
− Section 6 outlines related work and draws brief conclusions.
1.2. Preliminary Definitions
For later reference, I summarize below some concepts defined in Part I [22], mainly
in §7.5. Ideally, you should read the whole of Part I before continuing.
A binary relation is a set of ordered pairs. Isabelle’s set theory defines the usual
operations: converse, domain, range, etc. The infix operator “ denotes image.
⟨y, x⟩ ∈ converse(r) ↔ ⟨x, y⟩ ∈ r
x ∈ domain(r) ↔ ∃y . ⟨x, y⟩ ∈ r
y ∈ range(r) ↔ ∃x . ⟨x, y⟩ ∈ r
field(r) ≡ domain(r) ∪ range(r)
y ∈ (r “ A) ↔ ∃x∈A . ⟨x, y⟩ ∈ r
The definite description operator ιx . ψ(x) denotes the unique a satisfying ψ(a),
if such exists. See §7.2 of Part I for its definition and discussion.
Functions are single-valued binary relations. Application and λ-abstraction are
defined as follows:
f ‘a ≡ ιy . ⟨a, y⟩ ∈ f
λx∈A . b(x) ≡ {⟨x, b(x)⟩ . x ∈ A}
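These operations can be modelled directly on finite sets of pairs. The following Python sketch is purely illustrative (the names converse, domain, field, image and apply_ merely mirror the ZF operators; it is not part of the Isabelle development):

```python
def converse(r):
    """<y, x> is in converse(r) iff <x, y> is in r."""
    return {(y, x) for (x, y) in r}

def domain(r):
    return {x for (x, _) in r}

def range_(r):                      # 'range' is a Python builtin
    return {y for (_, y) in r}

def field(r):
    return domain(r) | range_(r)

def image(r, A):                    # the image r " A
    return {y for (x, y) in r if x in A}

def apply_(f, a):
    """f ` a: the unique y with (a, y) in f, mirroring the definite
    description; fails if f is not single-valued at a."""
    ys = {y for (x, y) in f if x == a}
    if len(ys) != 1:
        raise ValueError("application undefined at %r" % (a,))
    return ys.pop()
```

For the relation {(1, 2), (2, 3), (1, 3)}, field yields {1, 2, 3} and the image of {1} is {2, 3}.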
2. Least Fixedpoints
One aspect of the Isabelle ZF theory of recursion concerns sets defined by least
fixedpoints. I use an old result, the Knaster-Tarski Theorem. A typical application
is to formalize the set of theorems inductively defined by a system of inference
rules. The set being defined must be a subset of another set already available.
Later (§4.2) we shall construct sets large enough to contain various recursive data
structures, which can be ‘carved out’ using the Knaster-Tarski Theorem.
This section gives the Isabelle formulation of the Theorem. The least fixedpoint
satisfies a general induction principle that can be specialized to obtain structural
induction rules for the natural numbers, lists and trees. The transitive closure of a
relation is defined as a least fixedpoint and its properties are proved by induction.
A least fixedpoint argument also yields a simple proof of the Schröder-Bernstein
Theorem. Part of this proof is given in an interactive session, to demonstrate
Isabelle’s ability to synthesize terms.
2.1. The Knaster-Tarski Theorem
The Knaster-Tarski Theorem states that every monotone function over a complete
lattice has a fixedpoint. (Davey and Priestley discuss and prove the Theorem [7].)
Usually a greatest fixedpoint is exhibited, but a dual argument yields the least
fixedpoint.
A partially ordered set P is a complete lattice if, for every subset S of P , the
least upper bound and greatest lower bound of S are elements of P . In Isabelle’s
implementation of ZF set theory, the theorem is proved for a special case: powerset
lattices of the form ℘(D), for a set D. The partial ordering is ⊆; upper bounds are
unions; lower bounds are intersections.
Other complete lattices could be useful. Mutual recursion can be expressed
as a fixedpoint in the lattice ℘(D1 ) × · · · × ℘(Dn ), whose elements are n-tuples,
with a component-wise ordering. But proving the Knaster-Tarski Theorem in its
full generality would require a cumbersome formalization of complete lattices. The
Isabelle ZF treatment of mutual recursion uses instead the lattice ℘(D1 +· · ·+Dn ),
which is order-isomorphic1 to ℘(D1 ) × · · · × ℘(Dn ).
The predicate bnd mono(D, h) expresses that h is monotonic and bounded by D,
while lfp(D, h) denotes h’s least fixedpoint, a subset of D:
bnd mono(D, h) ≡ h(D) ⊆ D ∧ (∀x y . x ⊆ y ∧ y ⊆ D → h(x) ⊆ h(y))
lfp(D, h) ≡ ⋂ {X ∈ ℘(D) . h(X) ⊆ X}
These are binding operators; in Isabelle terminology, h is a meta-level function.
I originally defined lfp for object-level functions, but this needlessly complicated
proofs. A function in set theory is a set of pairs. There is an obvious correspondence
between meta- and object-level functions with domain ℘(D), associating h with
λX∈℘(D) . h(X). The latter is an element of the set ℘(D) → ℘(D), but this is
irrelevant to the theorem at hand. What matters is the mapping from X to h(X).
Virtually all the functions in this paper are meta-level functions, not sets of
pairs. One exception is in the well-founded recursion theorem below (§3.1), where
the construction of the recursive function simply must be regarded as the construction of a set of pairs. Object-level functions stand out because they require
an application operator: we must write f ‘x instead of f (x).
The Isabelle theory derives rules asserting that lfp(D, h) is the least pre-fixedpoint of h, and (if h is monotonic) a fixedpoint:

h(A) ⊆ A    A ⊆ D
─────────────────
  lfp(D, h) ⊆ A

     bnd mono(D, h)
────────────────────────
lfp(D, h) = h(lfp(D, h))
The second rule above is one form of the Knaster-Tarski Theorem. Another form
of the Theorem constructs a greatest fixedpoint; this justifies coinductive definitions [23], but will not concern us here.
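For a finite bounding set, the least fixedpoint can also be reached by iterating h from the empty set. The Python sketch below illustrates this behaviour; it is not Isabelle's definition (which is the intersection of all pre-fixedpoints), and the even-number example is an invented toy instance:

```python
def lfp(D, h):
    """Least fixedpoint of a monotone h with h(D) <= D, for finite D.
    Iterating h from the empty set climbs to the least fixedpoint;
    intersecting with D is a no-op when bnd_mono(D, h) holds."""
    D = frozenset(D)
    X = frozenset()
    while True:
        Y = frozenset(h(X)) & D     # stay inside the bounding set
        if Y == X:
            return X
        X = Y

# The even numbers below 20, 'carved out' of a finite bounding set.
evens = lfp(range(20), lambda X: {0} | {i + 2 for i in X})
```

Here the iteration runs ∅, {0}, {0, 2}, ... and stops at {0, 2, ..., 18}, the least set containing 0 and closed (within the bounding set) under adding 2.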
2.2. The Bounding Set
When justifying some instance of lfp(D, h), showing that h is monotone is generally easy, if it is true at all. It is harder to exhibit a bounding set, namely some
D satisfying h(D) ⊆ D. Much of the work reported below involves constructing
bounding sets for use in fixedpoint definitions. Let us consider some examples.
− The natural numbers. The Axiom of Infinity (see §3.3) asserts that there is a
bounding set Inf for the mapping λX . {0} ∪ {succ(i) . i ∈ X}. This justifies
defining the set of natural numbers by
nat ≡ lfp(Inf, λX . {0} ∪ {succ(i) . i ∈ X}).
− Lists and trees. Let A+B denote the disjoint sum of the sets A and B (defined
below in §4.1). Consider defining the set of lists over A, satisfying the recursion
equation
list(A) = {∅} + A × list(A).
This requires a set closed under the mapping λX . {∅} + A × X. Section 4.2
defines a set univ(A) with useful closure properties:
A ⊆ univ(A)
nat ⊆ univ(A)
univ(A) × univ(A) ⊆ univ(A)
univ(A) + univ(A) ⊆ univ(A)
This set contains all finitely branching trees over A, and will allow us to define
a wide variety of recursive data structures.
− The Isabelle ZF theory also constructs bounding sets for infinitely branching
trees.
− The powerset operator is monotone, but has no bounding set. Cantor’s Theorem implies that there is no set D such that ℘(D) ⊆ D.
2.3. A General Induction Rule
Because lfp(D, h) is a least fixedpoint, it enjoys an induction rule. Consider the set
of natural numbers, nat. Suppose ψ(0) holds and that ψ(x) implies ψ(succ(x))
for all x ∈ nat. Then the set {x ∈ nat . ψ(x)} contains 0 and is closed under
successors. Because nat is the least such set, we obtain nat ⊆ {x ∈ nat . ψ(x)}.
Thus, x ∈ nat implies ψ(x).
To derive an induction rule for an arbitrary least fixedpoint, the chief problem
is to express the rule’s premises. Suppose we have defined A ≡ lfp(D, h) and have
proved bnd mono(D, h). Define the set
Aψ ≡ {x ∈ A . ψ(x)}.
Now suppose x ∈ h(Aψ ) implies ψ(x) for all x. Then h(Aψ ) ⊆ Aψ and we conclude
A ⊆ Aψ . This derives the general induction rule
                                            [x ∈ h(Aψ )]x
                                                  ⋮
A ≡ lfp(D, h)    a ∈ A    bnd mono(D, h)        ψ(x)
─────────────────────────────────────────────────────
                        ψ(a)
The last premise states the closure properties of ψ, normally expressed as separate
‘base cases’ and ‘induction steps.’ (As in Part I of this paper, the subscripted
variable in the assumption stands for a proviso on the rule: x must not be free in
the conclusion or other assumptions.)
To demonstrate this rule, consider again the natural numbers. The appropriate
h satisfies
h(natψ ) = {0} ∪ {succ(i) . i ∈ natψ }.
Now x ∈ h(natψ ) if and only if x = 0 or x = succ(i) for some i ∈ nat such that
ψ(i). We may instantiate the rule above to
           [x ∈ h(natψ )]x
                 ⋮
n ∈ nat        ψ(x)
───────────────────
        ψ(n)
and quickly derive the usual induction rule
                   [x ∈ nat   ψ(x)]x
                          ⋮
n ∈ nat    ψ(0)      ψ(succ(x))
───────────────────────────────
             ψ(n)
2.4. Monotonicity
The set lfp(D, h) is a fixedpoint if h is monotonic. The Isabelle ZF theory derives
many rules for proving monotonicity; Isabelle’s classical reasoner proves most of
them automatically. Here are the rules for union and product:
A ⊆ C    B ⊆ D
──────────────
A ∪ B ⊆ C ∪ D

A ⊆ C    B ⊆ D
──────────────
A × B ⊆ C × D
Here are the rules for set difference and image:
A ⊆ C    D ⊆ B
──────────────
A − B ⊆ C − D

r ⊆ s    A ⊆ B
──────────────
 r“A ⊆ s“B
And here is the rule for general union:
            [x ∈ A]x
                ⋮
A ⊆ C      B(x) ⊆ D(x)
──────────────────────────────
(⋃x∈A . B(x)) ⊆ (⋃x∈C . D(x))
There is even a rule that lfp is itself monotonic.2 This justifies nested applications
of lfp:
                                     [X ⊆ D]X
                                         ⋮
bnd mono(D, h)    bnd mono(E, i)    h(X) ⊆ i(X)
───────────────────────────────────────────────
            lfp(D, h) ⊆ lfp(E, i)
2.5. Application: Transitive Closure of a Relation
Let id(A) denote the identity relation on A, namely {⟨x, x⟩ . x ∈ A}. Then the
reflexive/transitive closure r∗ of a relation r may be defined as a least fixedpoint:
r∗ ≡ lfp(field(r) × field(r), λs . id(field(r)) ∪ (r ◦ s))
The mapping λs . id(field(r)) ∪ (r ◦ s) is monotonic and bounded by field(r) ×
field(r), by virtue of similar properties for union and composition. The Knaster-Tarski Theorem yields
r∗ = id(field(r)) ∪ (r ◦ r∗ ).
This recursion equation affords easy proofs of the introduction rules for r∗ :
a ∈ field(r)
────────────
⟨a, a⟩ ∈ r∗

⟨a, b⟩ ∈ r∗    ⟨b, c⟩ ∈ r
─────────────────────────
       ⟨a, c⟩ ∈ r∗
Because r∗ is recursively defined, it admits reasoning by induction. Using the
general induction rule for lfp, the following rule can be derived simply:
              [x ∈ field(r)]x    [ψ(x, y)   ⟨x, y⟩ ∈ r∗   ⟨y, z⟩ ∈ r]x,y,z
                     ⋮                            ⋮
⟨a, b⟩ ∈ r∗       ψ(x, x)                    ψ(x, z)
───────────────────────────────────────────────────────────────────────  (1)
                              ψ(a, b)
This is the natural elimination rule for r∗ because its minor premises reflect the
form of its introduction rules [25]; it is however cumbersome. A simpler rule starts
from the idea that if ⟨a, b⟩ ∈ r∗ then there exist a0 , a1 , . . . , an such that (writing
r as an infix relation)
a = a0 r a1 r · · · r an = b.
If ψ holds at a and is preserved by r, then ψ must hold at b:
                       [ψ(y)   ⟨a, y⟩ ∈ r∗   ⟨y, z⟩ ∈ r]y,z
                                      ⋮
⟨a, b⟩ ∈ r∗    ψ(a)                ψ(z)
───────────────────────────────────────  (2)
                 ψ(b)
Formally, the rule follows by assuming its premises and instantiating the original
induction rule (1) with the formula ψ′(z), where

ψ′(z) ≡ ∀w . z = ⟨a, w⟩ → ψ(w).

Reasoning about injectivity of ordered pairing, we eventually derive

∀w . ⟨a, b⟩ = ⟨a, w⟩ → ψ(w)
[Figure 1. Banach's Decomposition Theorem (diagram omitted: f and g acting on the partitions of X and Y)]
and reach the conclusion, ψ(b).
To demonstrate the simpler induction rule (2), let us show that r∗ is transitive.
Here is a concise proof of ⟨c, b⟩ ∈ r∗ from the assumptions ⟨c, a⟩ ∈ r∗ and ⟨a, b⟩ ∈ r∗ :

                             ⟨c, y⟩ ∈ r∗    ⟨y, z⟩ ∈ r
                             ─────────────────────────
⟨a, b⟩ ∈ r∗    ⟨c, a⟩ ∈ r∗          ⟨c, z⟩ ∈ r∗
───────────────────────────────────────────────
                  ⟨c, b⟩ ∈ r∗
The transitive closure r+ of a relation r is defined by r+ ≡ r ◦ r∗ and its usual
properties follow immediately.
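For a finite relation, both closures can be computed by the same fixedpoint iteration. This Python sketch is an illustration only (the function names are invented; composition follows the convention that r ◦ s applies s first):

```python
def compose(r, s):
    """r O s: all <x, z> such that <x, y> in s and <y, z> in r."""
    return {(x, z) for (x, y1) in s for (y2, z) in r if y1 == y2}

def rtrancl(r):
    """r* as the least fixedpoint of s |-> id(field(r)) | (r O s),
    iterated from the empty set (finite r only)."""
    fld = {x for pair in r for x in pair}
    identity = {(x, x) for x in fld}
    s = set()
    while True:
        t = identity | compose(r, s)
        if t == s:
            return s
        s = t

def trancl(r):
    """r+ = r O r*."""
    return compose(r, rtrancl(r))
```

For r = {(1, 2), (2, 3)}, rtrancl(r) contains (1, 3) together with the identity pairs on the field, and trancl(r) is {(1, 2), (2, 3), (1, 3)}.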
2.6. Application: The Schröder-Bernstein Theorem
The Schröder-Bernstein Theorem plays a vital role in the theory of cardinal numbers. If there are two injections f : X → Y and g : Y → X, then the Theorem
states that there is a bijection h : X → Y . Halmos [11] gives a direct but complicated proof. It is simpler to use the Knaster-Tarski Theorem to prove a key lemma,
Banach's Decomposition Theorem [7].
Recall from §1.2 the image and converse operators. These apply to functions also, because functions are relations in set theory. If f is an injection then
converse(f ) is a function, conventionally written f −1 . Write f ↾ A for the restriction of function f to the set A, defined by
f ↾ A ≡ λx∈A . f ‘x
2.6.1. The Informal Proof
Suppose f : X → Y and g : Y → X are functions. Banach’s Decomposition
Theorem states that both X and Y can be partitioned (see Figure 1) into regions A
and B, satisfying six equations:
XA ∩ XB = ∅
YA ∩ YB = ∅
XA ∪ XB = X
YA ∪ YB = Y
f “XA = YA
g“YB = XB
To prove Banach’s Theorem, define

XA ≡ lfp(X, λW . X − g“(Y − f “W ))
XB ≡ X − XA
YA ≡ f “XA
YB ≡ Y − YA
Five of the six equations follow at once. The mapping in lfp is monotonic and
yields a subset of X. Thus Tarski’s Theorem yields XA = X − g“(Y − f “XA ),
which justifies the last equation:
g“YB = g“(Y − f “XA )
     = X − (X − g“(Y − f “XA ))
     = X − XA
     = XB .
To prove the Schröder-Bernstein Theorem, let f and g be injections (for the Banach
Theorem, they only have to be functions). Partition X and Y as above. The desired
bijection between X and Y is (f ↾ XA ) ∪ (g ↾ YB )−1 .
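For finite sets, the entire construction can be played out concretely. The Python sketch below is an illustration only (injections are represented as dicts; none of the names come from the Isabelle theory): it computes XA by fixedpoint iteration and assembles the bijection (f ↾ XA ) ∪ (g ↾ YB )−1 :

```python
def schroeder_bernstein(f, g, X, Y):
    """Given injections f: X -> Y and g: Y -> X as dicts, build a
    bijection X -> Y via Banach's decomposition (finite sets only)."""
    X, Y = set(X), set(Y)

    def h(W):                               # W |-> X - g"(Y - f"W)
        fW = {f[x] for x in W}
        return X - {g[y] for y in Y - fW}

    XA = set()                              # least fixedpoint by iteration
    while h(XA) != XA:
        XA = h(XA)

    YB = Y - {f[x] for x in XA}
    bij = {x: f[x] for x in XA}             # f restricted to XA
    bij.update({g[y]: y for y in YB})       # converse of g restricted to YB
    return bij
```

The returned dict has domain X and hits every element of Y exactly once, as Banach's six equations guarantee.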
2.7. Proving the Schröder-Bernstein Theorem in Isabelle
This section sketches the Isabelle proof of the Schröder-Bernstein Theorem;
Isabelle synthesizes the bijection automatically. See Part I for an overview of
Isabelle [22, §2]. As usual, the proofs are done in small steps in order to demonstrate Isabelle’s workings.
2.7.1. Preliminaries for Banach’s Decomposition Theorem
Most of the work involves proving Banach’s Theorem. First, we establish monotonicity of the map supplied to lfp:
bnd_mono(X, %W. X - g‘‘(Y - f‘‘W))
The proof is trivial, and omitted; the theorem is stored as decomp bnd mono.
Next, we prove the last equation in Banach’s Theorem:
val [gfun] = goal Cardinal.thy
    "g: Y->X ==>                                     \
\       g‘‘(Y - f‘‘ lfp(X, %W. X - g‘‘(Y - f‘‘W))) = \
\       X - lfp(X, %W. X - g‘‘(Y - f‘‘W))";
Isabelle responds by printing an initial proof state consisting of one subgoal, the
equation to be proved.
Level 0
g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))) =
X - lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))
1. g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))) =
X - lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))
The first step is to use monotonicity and Tarski’s Theorem to substitute
for lfp(· · ·). Unfortunately, there are two occurrences of lfp(· · ·) and the substitution must unfold only the second one. The relevant theorems are combined
and then instantiated with a template specifying where the substitution may occur.
by (res_inst_tac [("P", "%u. ?v = X-u")]
(decomp_bnd_mono RS lfp_Tarski RS ssubst) 1);
Level 1
g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))) =
X - lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))
1. g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))) =
X - (X - g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))))
Observe the substitution’s effect upon subgoal 1. We now invoke Isabelle’s simplifier, supplying basic facts about subsets, complements, functions and images. This
simplifies X − (X − g“(Y − f “lfp(· · ·))) to g“(Y − f “lfp(· · ·)), which proves the
subgoal.
by (simp_tac
(ZF_ss addsimps [subset_refl, double_complement, Diff_subset,
gfun RS fun_is_rel RS image_subset]) 1);
Level 2
g ‘‘ (Y - f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))) =
X - lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W))
No subgoals!
The proof is finished. We name this theorem for later reference during the proof
session.
val Banach_last_equation = result();
2.7.2. The Proof of Banach’s Decomposition Theorem
We are now ready to prove Banach’s Theorem proper:
val prems = goal Cardinal.thy
    "[| f: X->Y; g: Y->X |] ==>                              \
\       EX XA XB YA YB. (XA Int XB = 0) & (XA Un XB = X) &   \
\                       (YA Int YB = 0) & (YA Un YB = Y) &   \
\                       f‘‘XA=YA & g‘‘YB=XB";
Level 0
EX XA XB YA YB.
   XA Int XB = 0 &
   XA Un XB = X &
   YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
 1. EX XA XB YA YB.
       XA Int XB = 0 &
       XA Un XB = X &
       YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
Starting in the initial proof state, we apply a command to strip the existential
quantifiers and conjunctions repeatedly. The result is a proof state consisting of
six subgoals:
by (REPEAT (FIRSTGOAL (resolve_tac [exI, conjI])));
Level 1
EX XA XB YA YB.
   XA Int XB = 0 &
   XA Un XB = X &
   YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
 1. ?XA Int ?XB1 = 0
 2. ?XA Un ?XB1 = X
 3. ?YA2 Int ?YB3 = 0
 4. ?YA2 Un ?YB3 = Y
 5. f ‘‘ ?XA = ?YA2
 6. g ‘‘ ?YB3 = ?XB1
The next command solves five of these subgoals by repeatedly applying facts such
as A ∩ (B − A) = ∅. Observe how the unknowns are instantiated; only ?XA is left.
by (REPEAT
(FIRSTGOAL (resolve_tac [Diff_disjoint, Diff_partition, refl])));
Level 2
EX XA XB YA YB.
   XA Int XB = 0 &
   XA Un XB = X &
   YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
 1. ?XA <= X
 2. f ‘‘ ?XA <= Y
 3. g ‘‘ (Y - f ‘‘ ?XA) = X - ?XA
We apply the result proved in the previous section to subgoal 3. This instantiates
the last unknown to lfp(· · ·):
by (resolve_tac [Banach_last_equation] 3);
Level 3
EX XA XB YA YB.
XA Int XB = 0 &
XA Un XB = X &
YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
1. lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W)) <= X
2. f ‘‘ lfp(X,%W. X - g ‘‘ (Y - f ‘‘ W)) <= Y
3. g : Y -> X
The remaining subgoals are verified by appealing to lemmas and the premises.
by (REPEAT (resolve_tac (prems@[fun_is_rel, image_subset,
lfp_subset, decomp_bnd_mono]) 1));
Level 4
EX XA XB YA YB.
XA Int XB = 0 &
XA Un XB = X &
YA Int YB = 0 & YA Un YB = Y & f ‘‘ XA = YA & g ‘‘ YB = XB
No subgoals!
2.7.3. The Schröder-Bernstein Theorem
The Schröder-Bernstein Theorem is stated as
f ∈ inj(X, Y )    g ∈ inj(Y, X)
───────────────────────────────
      ∃h . h ∈ bij(X, Y )
The standard Isabelle proof consists of an appeal to Banach’s Theorem and a call
to the classical reasoner (fast tac). Banach’s Theorem introduces an existentially
quantified assumption. The classical reasoner strips those quantifiers, adding new
bound variables XA , XB , YA and YB , to the context; then it strips the existential
quantifier from the goal, yielding an unknown; finally it instantiates that unknown
with a suitable bijection.
The form of the bijection is forced by the following three lemmas, which come
from a previously developed library of permutations:
f ∈ bij(A, B)    g ∈ bij(C, D)    A ∩ C = ∅    B ∩ D = ∅
────────────────────────────────────────────────────────  (bij disjoint Un)
             f ∪ g ∈ bij(A ∪ C, B ∪ D)

 f ∈ bij(A, B)
────────────────  (bij converse bij)
f −1 ∈ bij(B, A)

f ∈ bij(A, B)    C ⊆ A
──────────────────────  (restrict bij)
 f ↾ C ∈ bij(C, f “C)
To demonstrate how the bijection is instantiated, let us state the theorem using
an unknown rather than an existential quantifier. This proof requires supplying as
premises the conclusions of Banach’s Theorem without their existential quantifiers:
val prems = goal Cardinal.thy
    "[| f : inj(X,Y) ;   g : inj(Y,X) ;  \
\       XA Int XB = 0 ;  XA Un XB = X ;  \
\       YA Int YB = 0 ;  YA Un YB = Y ;  \
\       f‘‘XA = YA ;     g‘‘YB = XB      \
\    |] ==>  ?h: bij(X,Y)";
Level 0
?h : bij(X,Y)
1. ?h : bij(X,Y)
The first step inserts the premises into subgoal 1 and performs all possible substitutions, such as Y to YA ∪ YB and YA to f “XA .
by (cut_facts_tac prems 1 THEN
    REPEAT (hyp_subst_tac 1) THEN
    flexflex_tac);
Level 1
?h69 : bij(X,Y)
 1. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB);
       g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    ?h69 : bij(XA Un g ‘‘ YB, f ‘‘ XA Un YB)
The second step applies bij disjoint Un, instantiating the bijection to consist of
some union.
by (resolve_tac [bij_disjoint_Un] 1 THEN REPEAT (assume_tac 3));
Level 2
?f70 Un ?g70 : bij(X,Y)
 1. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB);
       g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    ?f70 : bij(XA, f ‘‘ XA)
 2. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB);
       g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    ?g70 : bij(g ‘‘ YB, YB)
The third step applies bij converse bij to subgoal 2, instantiating the bijection
with a converse term. This rule should only be used as a last resort, since it
can be repeated indefinitely.
by (resolve_tac [bij_converse_bij] 2);
Level 3
?f70 Un converse(?f71) : bij(X,Y)
 1. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB);
       g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    ?f70 : bij(XA, f ‘‘ XA)
 2. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB);
       g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    ?f71 : bij(YB, g ‘‘ YB)
The fourth step applies restrict bij, instantiating the bijection with restrictions.
We obtain (f ↾ XA ) ∪ (g ↾ YB )−1 .
by (REPEAT (FIRSTGOAL (eresolve_tac [restrict_bij])));
Level 4
restrict(f,XA) Un converse(restrict(g,YB)) : bij(X,Y)
 1. [| g : inj(f ‘‘ XA Un YB, XA Un g ‘‘ YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    XA <= XA Un g ‘‘ YB
 2. [| f : inj(XA Un g ‘‘ YB, f ‘‘ XA Un YB); XA Int g ‘‘ YB = 0;
       f ‘‘ XA Int YB = 0 |] ==>
    YB <= f ‘‘ XA Un YB
Finally we appeal to some obvious facts.
by (REPEAT (resolve_tac [Un_upper1,Un_upper2] 1));
Level 5
restrict(f,XA) Un converse(restrict(g,YB)) : bij(X,Y)
No subgoals!
The total execution time to prove the Banach and Schröder-Bernstein Theorems
is about three seconds.3
The Schröder-Bernstein Theorem is a long-standing challenge problem; both
Bledsoe [3, page 31] and McDonald and Suppes [14, page 338] mention it. The
Isabelle proof cannot claim to be automatic — it draws upon a body of lemmas
— but it is short and comprehensible. It demonstrates the power of instantiating
unknowns incrementally.
This mechanized theory of least fixedpoints allows formal reasoning about any
inductively-defined subset of an existing set. Before we can use the theory to
specify recursive data structures, we need some means of constructing large sets.
Since large sets could be defined by transfinite recursion, we now consider the
general question of recursive functions in set theory.
3. Recursive Functions
A relation ≺ is well-founded if it admits no infinite decreasing chains
· · · ≺ xn ≺ · · · ≺ x2 ≺ x1 .
Well-founded relations are a general means of justifying recursive definitions and
proving termination. They have played a key role in the Boyer/Moore Theorem
Prover since its early days [4]. Manna and Waldinger’s work on deductive program
synthesis [12] illustrates the power of well-founded relations; they justify the termination of a unification algorithm using a relation that takes into account the
size of a term and the number of free variables it contains.
The rise of type theory [6, 9, 13] has brought a new treatment of recursion.
Instead of a single recursion operator justified by well-founded relations, each
recursive type comes equipped with a structural recursion operator. For the natural
numbers, structural recursion admits calls such as double(n + 1) = double(n) + 2;
for lists, it admits calls such as rev(Cons(x, l)) = rev(l)@[x].
These recursion operators are powerful — unlike computation theory’s primitive recursion, they can express Ackermann’s function — but they are sometimes
inconvenient. They can only express recursive calls involving an immediate component of the argument. This excludes functions that divide by repeated subtraction
or that sort by recursively sorting shorter lists. Coding such functions using structural recursion requires ingenuity; consider Smith’s treatment of Quicksort [26].
Nordström [19] and I [21] have attempted to re-introduce well-founded relations to type theory, with limited success. In ZF set theory, well-founded relations
reclaim their role as the foundation of induction and recursion. They can express
difficult termination arguments, such as for unification and Quicksort; they include
structural recursion as a special case.
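Quicksort, mentioned above as awkward for structural recursion, is immediate under a well-founded ordering. In this Python sketch (an illustration, not part of the paper's formal development) every recursive call receives a strictly shorter list, so the well-founded 'shorter than' relation on lists certifies termination:

```python
def quicksort(xs):
    """Sort by recursively sorting shorter lists.  Both recursive
    arguments are strictly shorter than xs, but neither is the tail
    of xs, so structural (tail-of-list) recursion does not apply
    directly; a length-based well-founded relation does."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]     # shorter than xs
    larger = [x for x in rest if x >= pivot]     # shorter than xs
    return quicksort(smaller) + [pivot] + quicksort(larger)
```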
Suppose we have defined the operator list such that list(A) is the set of all
lists of the form
Cons(x1 , Cons(x2 , . . . , Cons(xn , Nil) . . .))
x1 , x2 , . . . , xn ∈ A
We could then define the substructure relation is tail(A) to consist of all pairs
⟨l, Cons(x, l)⟩ for x ∈ A and l ∈ list(A), since l is the tail of Cons(x, l). Proving
that is tail(A) is well-founded justifies structural recursion on lists.
But this approach can be streamlined. The well-foundedness of lists, trees and
many similar data structures follows from the well-foundedness of ordered pairing, which follows from the Foundation Axiom of ZF set theory.4 This spares us
the effort of defining relations such as is tail(A). Moreover, recursive functions
defined using is tail(A) have a needless dependence upon A; exploiting the Foundation Axiom eliminates this extra argument.
Achieving these aims requires considerable effort. Several highly technical set-theoretic constructions are defined in succession:
− A well-founded recursion operator, called wfrec, is defined and proved to
satisfy a general recursion equation.
− The ordinals are constructed. Transfinite recursion is an instance of well-founded recursion.
− The natural numbers are constructed. Natural numbers are ordinals and inherit many of their properties from the ordinals. Primitive recursion on the natural
numbers is an instance of transfinite recursion.
− The rank operation associates a unique ordinal with every set; it serves as
an absolute measure of a set’s depth. To define this operation, transfinite
recursion is generalized to a form known as ∈-recursion (transrec in Isabelle
ZF). The construction involves the natural numbers.
− The cumulative hierarchy of ZF set theory is finally introduced, by transfinite
recursion. As a special case, it includes a small ‘universe’ for use with lfp in
defining recursive data structures.
− The general recursion operator Vrec justifies functions that make recursive
calls on arguments of lesser rank.
3.1. Well-Founded Recursion
The ZF derivation of well-founded recursion is based on one by Tobias Nipkow
in higher-order logic. It is much shorter than any other derivation that I have
seen, including several of my own. It is still complex, more so than a glance at
Suppes [27, pages 197–8] might suggest. Space permits only a discussion of the
definitions and key theorems.
3.1.1. Definitions
First, we must define ‘well-founded relation.’ Infinite descending chains are difficult
to formalize; a simpler criterion is that each non-empty set contains a minimal
element. The definition echoes the Axiom of Foundation [22, §4].
wf(r) ≡ ∀Z . Z = ∅ ∨ (∃x∈Z . ∀y . ⟨y, x⟩ ∈ r → y ∈ Z)
From this, it is fairly easy to derive well-founded induction:
         [∀y . ⟨y, x⟩ ∈ r → ψ(y)]x
                    ⋮
wf(r)              ψ(x)
───────────────────────
         ψ(a)
Proof: put {z ∈ domain(r) ∪ {a} . ¬ψ(z)} for Z in wf(r). If Z = ∅ then ψ(a)
follows immediately. If Z is nonempty then we obtain an x such that ¬ψ(x) and
(by the definition of domain) ∀y . ⟨y, x⟩ ∈ r → ψ(y), but the latter implies ψ(x).
The Isabelle proof is only seven lines.
Well-founded recursion, on the other hand, is difficult even to formalize. If f
is recursive over the well-founded relation r then f ‘x may depend upon x and,
for ⟨y, x⟩ ∈ r, upon f ‘y. Since f need not be computable, f ‘x may depend upon
infinitely many values of f ‘y. The inverse image r−1 “{x} is the set of all y such
that ⟨y, x⟩ ∈ r: the set of all r-predecessors of x. Formally, f is recursive over r if
it satisfies the equation

f ‘x = H(x, f ↾ (r−1 “{x}))    (3)
for all x. The binary operation H is the body of f . Restricting f to r−1 “{x} ensures
that the argument in each recursive call is r-smaller than x.
Justifying well-founded recursion requires proving, for all r and H, that the
corresponding recursive function exists. It is constructed in stages by well-founded
induction. Call f a restricted recursive function for x if it satisfies equation (3) for
all y such that ⟨y, x⟩ ∈ r. For a fixed x, we assume there exist restricted recursive
functions for all the r-predecessors of x, and construct from them a restricted
recursive function for x. We must also show that the restricted recursive functions
agree where their domains overlap; this ensures that the functions are unique.
Nipkow’s formalization of the construction makes several key simplifications.
Since the transitive closure r+ of a well-founded relation r is well-founded, he
restricts the construction to transitive relations; otherwise it would have to use r
in some places and r+ in others, leading to complications. Second, he formalizes
‘f is a restricted recursive function for a’ by a neat equation:
is recfun(r, a, H, f ) ≡ (f = λx ∈ r−1 “{a} . H(x, f ↾ (r−1 “{x})))
Traditional proofs define the full recursive function as the union of all restricted
recursive functions. This involves tiresome reasoning about sets of ordered pairs.
Nipkow instead uses descriptions:
the recfun(r, a, H) ≡ ιf . is recfun(r, a, H, f )
wftrec(r, a, H) ≡ H(a, the recfun(r, a, H))
Here the recfun(r, a, H) denotes the (unique) restricted recursive function for a.
Finally, wftrec gives access to the full recursive function; wftrec(r, a, H) yields
the result for the argument a.
3.1.2. Lemmas
Here are the key lemmas. Assume wf(r) and trans(r) below, where trans(r)
expresses that r is transitive.
Two restricted recursive functions f and g agree over the intersection of their
domains — by well-founded induction on x:
is recfun(r, a, H, f )    is recfun(r, b, H, g)
───────────────────────────────────────────────
⟨x, a⟩ ∈ r ∧ ⟨x, b⟩ ∈ r → f ‘x = g‘x
In consequence, the restricted recursive function at a is unique:
is recfun(r, a, H, f )    is recfun(r, a, H, g)
───────────────────────────────────────────────
                   f = g
Another consequence justifies our calling such functions ‘restricted,’ since they are
literally restrictions of larger functions:
    is_recfun(r, a, H, f)    is_recfun(r, b, H, g)    ⟨b, a⟩ ∈ r
    ──────────────────────────────────────────────────────────
    f ↾ (r⁻¹“{b}) = g
Using well-founded induction again, we prove the key theorem. Restricted recursive
functions exist for all a:

    is_recfun(r, a, H, the_recfun(r, a, H))

It is now straightforward to prove that wftrec unfolds as desired for well-founded
recursion:

    wftrec(r, a, H) = H(a, λx ∈ r⁻¹“{a} . wftrec(r, x, H))

The abstraction over r⁻¹“{a} is essentially the same as restriction.
3.1.3. The Recursion Equation
It only remains to remove the assumption trans(r). Because the transitive closure
of a well-founded relation is well-founded, we can immediately replace r by r+ in
the recursion equation for wftrec. But this leads to strange complications later,
involving transfinite recursion. I find it better to remove transitive closure from the
recursion equation, even at the cost of weakening it.5 The operator wfrec applies
wftrec with the transitive closure of r, but restricts recursive calls to immediate
r-predecessors:
    wfrec(r, a, H) ≡ wftrec(r⁺, a, λx f . H(x, f ↾ (r⁻¹“{x})))
Assuming wf(r) but not trans(r), we can show the equation for wfrec:
    wfrec(r, a, H) = H(a, λx ∈ r⁻¹“{a} . wfrec(r, x, H))
All recursive functions in Isabelle’s ZF set theory are ultimately defined in terms
of wfrec.
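As a concrete illustration, the recursion equation for wfrec can be animated over a finite well-founded relation. The Python sketch below is an informal model, not Isabelle code: the relation r is a set of pairs (x, y) meaning x is an r-predecessor of y, and H receives, in place of the restricted function f ↾ (r⁻¹“{a}), a dictionary mapping each immediate predecessor to its recursive result.

```python
# Sketch of well-founded recursion: r is a finite well-founded relation,
# given as a set of pairs (x, y) with x an r-predecessor of y.  H takes the
# argument a and a dict standing for the recursive function restricted to
# the r-predecessors of a, mirroring the recursion equation for wfrec.
def wfrec(r, a, H):
    preds = [x for (x, y) in r if y == a]
    return H(a, {x: wfrec(r, x, H) for x in preds})

# Example: the successor relation on {0,...,4}; H computes the length of
# the longest descending chain ending at the argument.
r = {(n, n + 1) for n in range(4)}
depth = lambda a, f: 1 + max(f.values(), default=0)
print(wfrec(r, 4, depth))  # -> 5
```

Well-foundedness of r is what makes the Python recursion terminate, just as it justifies the definition in ZF.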
3.2. Ordinals
My treatment of recursion requires a few properties of the set-theoretic ordinals.
The development follows standard texts [27] and requires little further discussion.
By convention, the Greek letters α, β and γ range over ordinals.
A set A is transitive if it is downwards closed under the membership relation:
y ∈ x ∈ A implies y ∈ A. An ordinal is a transitive set whose elements are
also transitive. The elements of an ordinal are therefore ordinals also. The finite
ordinals are the natural numbers; the set of natural numbers is itself an ordinal,
called ω. Transfinite ordinals are those greater than ω; they serve many purposes
in set theory and are the key to the recursion principles discussed below.
The Isabelle definitions are routine. The predicates Transset and Ord define
transitive sets and ordinals, while < is the less-than relation on ordinals:

    Memrel(A) ≡ {z ∈ A × A . ∃x y . z = ⟨x, y⟩ ∧ x ∈ y}
    Transset(i) ≡ ∀x∈i . x ⊆ i
    Ord(i) ≡ Transset(i) ∧ (∀x∈i . Transset(x))
    i < j ≡ i ∈ j ∧ Ord(j)
The set Memrel(A) internalizes the membership relation on A as a subset of A × A.
If A is transitive then Memrel(A) internalizes the membership relation everywhere
below A. For then
    x₁ ∈ x₂ ∈ · · · ∈ xₙ ∈ A

implies that x₁, x₂, . . . , xₙ are all elements of A; we have ⟨xₖ, xₖ₊₁⟩ ∈ Memrel(A)
for 0 < k < n.
A common use of wfrec has the form wfrec(Memrel(A), x, H), where A is a
transitive set and x ∈ A. The recursion equation for wfrec(Memrel(A), x, H) supplies Memrel(A) as the well-founded relation in the recursive calls. We must use
Memrel(A) because well-founded induction and recursion take their well-founded
relation as a set, not as a binary predicate such as ∈.
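To make the internalized relation concrete, here is a small Python model, with frozensets standing for hereditarily finite sets (the names are illustrative, not Isabelle's):

```python
# Model Memrel(A) = {<x, y> in A x A : x in y} with frozensets as sets.
def memrel(A):
    return {(x, y) for x in A for y in A if x in y}

zero = frozenset()               # 0 = {}
one = frozenset({zero})          # 1 = {0}
two = frozenset({zero, one})     # 2 = {0, 1}

# On the transitive set {0, 1, 2} this captures the whole membership
# relation below it: 0 in 1, 0 in 2 and 1 in 2.
print(memrel({zero, one, two}) == {(zero, one), (zero, two), (one, two)})  # -> True
```

Such a set of pairs is exactly the kind of object that can be supplied to wfrec as its well-founded relation.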
Using the Foundation Axiom, it is straightforward to show that Memrel(A) is
well-founded. This fact, together with the transitivity of ordinals, yields transfinite
induction:
              [Ord(β)    ∀γ∈β . ψ(γ)]β
                         ⋮
    Ord(α)              ψ(β)
    ────────────────────────
              ψ(α)
Many properties of the ordinals are established by transfinite induction. For
example, the ordinals are linearly ordered:

    Ord(α)    Ord(β)
    ──────────────────────
    α < β ∨ α = β ∨ β < α
The successor of x, written succ(x), is traditionally defined by succ(x) ≡ {x} ∪ x.
The Isabelle theory makes an equivalent definition using cons:
succ(x) ≡ cons(x, x)
Successors have two key properties:

    succ(x) = succ(y)
    ─────────────────            succ(x) ≠ 0
         x = y
Proving that succ is injective seems to require the Axiom of Foundation. Proving
succ(x) ≠ 0 is trivial because zero is the empty set; let us write the empty set as 0
instead of ∅ when it serves as zero.
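The definition can be run directly over frozensets; this is an illustrative encoding, not part of the theory:

```python
# succ(x) = cons(x, x) = {x} union x, over frozensets (von Neumann numerals).
def succ(x):
    return frozenset({x}) | x

zero = frozenset()               # 0 is the empty set
one = succ(zero)                 # 1 = {0}
two = succ(one)                  # 2 = {0, 1}

# The two key properties on these values: distinct arguments give distinct
# successors, and a successor is never the empty set.
print(two == frozenset({zero, one}))  # -> True
print(succ(zero) != zero)             # -> True
```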
The smallest ordinal is zero. The ordinals are closed under the successor
operation. The union of any family of ordinals is itself an ordinal, which happens
to be their least upper bound:

                                   [x ∈ A]x
                                      ⋮
                  Ord(α)           Ord(β(x))
    Ord(0)     ────────────     ───────────────
               Ord(succ(α))     Ord(⋃x∈A β(x))
By the first two rules above, every natural number is an ordinal. By the third, so
is the set of natural numbers. This ordinal is traditionally called ω; the following
section defines it as the set nat.
Transfinite recursion can be expressed using wfrec and Memrel; see nat_rec
below. Later (§3.4) we shall define a more general form of transfinite recursion,
called ∈-recursion.
3.3. The Natural Numbers
The natural numbers are a recursive data type, but they must be defined now (a
bit prematurely) in order to complete the development of the recursion principles.
The operator nat_case provides case analysis on whether a natural number has
the form 0 or succ(k), while nat_rec is a structural recursion operator similar to
those in Martin-Löf’s Type Theory [13].

    nat ≡ lfp(Inf, λX . {0} ∪ {succ(i) . i ∈ X})
    nat_case(a, b, k) ≡ ιy . (k = 0 ∧ y = a) ∨ (∃i . k = succ(i) ∧ y = b(i))
    nat_rec(a, b, k) ≡ wfrec(Memrel(nat), k, λn f . nat_case(a, λm . b(m, f‘m), n))
Each definition is discussed below. They demonstrate the Knaster-Tarski Theorem,
descriptions, and well-founded recursion.
3.3.1. Properties of nat
The mapping supplied to lfp, which takes X to {0}∪{succ(i).i ∈ X}, is obviously
monotonic. The Axiom of Infinity supplies the constant Inf for the bounding set:6
(0 ∈ Inf) ∧ (∀y∈Inf . succ(y) ∈ Inf)
The Axiom gives us a set containing zero and closed under the successor operation;
the least such set contains nothing but the natural numbers.
The Knaster-Tarski Theorem yields
nat = {0} ∪ {succ(i) . i ∈ nat}
and we immediately obtain the introduction rules
                     n ∈ nat
    0 ∈ nat      ──────────────
                 succ(n) ∈ nat
By instantiating the general induction rule of lfp, we obtain mathematical
induction (recall our discussion in §2.3 above):

                       [x ∈ nat    ψ(x)]x
                              ⋮
    n ∈ nat    ψ(0)    ψ(succ(x))
    ─────────────────────────────
              ψ(n)
3.3.2. Properties of nat_case
The definition of nat_case contains a typical definite description. Given theorems
stating succ(m) ≠ 0 and succ(m) = succ(n) → m = n, Isabelle’s fast_tac
automatically proves the key equations:

    nat_case(a, b, 0) = a
    nat_case(a, b, succ(m)) = b(m)
3.3.3. Properties of nat_rec
Because nat is an ordinal, it is a transitive set. Well-founded recursion on
Memrel(nat), which denotes the less-than relation on the natural numbers, can
express primitive recursion. Unfolding the recursion equation for wfrec yields

    nat_rec(a, b, n) = nat_case(a, λm . b(m, f‘m), n)

where f ≡ λx ∈ Memrel(nat)⁻¹“{n} . nat_rec(a, b, x). We may derive the equations

    nat_rec(a, b, 0) = a

    m ∈ nat
    ───────────────────────────────────────────────
    nat_rec(a, b, succ(m)) = b(m, nat_rec(a, b, m))
The first equation is trivial, by the similar one for nat_case. Assuming m ∈ nat,
the second equation follows by β-conversion. This requires showing

    m ∈ Memrel(nat)⁻¹“{succ(m)},

which reduces to

    ⟨m, succ(m)⟩ ∈ Memrel(nat),

and finally to the trivial facts m ∈ succ(m), m ∈ nat and succ(m) ∈ nat.
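The two derived equations describe an ordinary fold over the numerals. A Python sketch, using machine integers in place of von Neumann numerals (illustrative only):

```python
# nat_rec(a, b, 0) = a and nat_rec(a, b, succ(m)) = b(m, nat_rec(a, b, m)),
# with Python ints standing for the natural numbers.
def nat_rec(a, b, n):
    result = a
    for m in range(n):          # build up from the 0 case, the order in
        result = b(m, result)   # which the well-founded recursion unwinds
    return result

# Primitive-recursive addition in this style:
add = lambda m, n: nat_rec(n, lambda _, r: r + 1, m)
print(add(3, 4))  # -> 7
```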
The Isabelle proofs of these rules are straightforward. Recursive definitions of lists
and trees will follow the pattern established above. But first, we must define transfinite recursion in order to construct large sets.
3.4. The Rank Function
Many of the ZF axioms assert the existence of sets, but all sets can be generated
in a uniform manner. Each stage of the construction is labelled by an ordinal α;
the set of all sets generated by stage α is called Vα . Each stage simply gathers up
the powersets of all the previous stages. Define

    Vα = ⋃β∈α ℘(Vβ)

by transfinite recursion on the ordinals. In particular we have V₀ = ∅ and
V_succ(α) = ℘(Vα). See Devlin [8, pages 42–48] for philosophy and discussion.
We can define the ordinal rank(a), for all sets a, such that a ⊆ V_rank(a). This
attaches an ordinal to each and every set, indicating the stage of its creation.
When seeking a large ‘bounding set’ for use with the lfp operator, we can restrict
our attention to sets of the form Vα , since every set is contained in some Vα .
Taken together, the Vα are called the cumulative hierarchy. They are fundamental to the intuition of set theory, since they impart a structure to the universe of
sets. Their role here is more mundane. We need rank(a) and Vα to apply lfp and
to justify structural recursion. The following section will formalize the definition
of Vα .
3.4.1. Informal Definition of rank
The usual definition of rank requires ∈-recursion:

    rank(a) = ⋃x∈a succ(rank(x))

The recursion resembles that of Vα, except that it is not restricted to the
ordinals. Recursion over the ordinals is straightforward because each ordinal is
transitive (recall the discussion in §3.2). To justify ∈-recursion, we define an
operation eclose, such that eclose(a) extends a to be transitive. Let ⋃ⁿ(X)
denote the n-fold union of X, with ⋃⁰(X) = X and ⋃^succ(m)(X) = ⋃(⋃ᵐ(X)).
Then put

    eclose(a) = ⋃n∈nat ⋃ⁿ(a)

and supply Memrel(eclose({a})) as the well-founded relation for recursion on a.
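Both operations can be animated over hereditarily finite sets. The Python sketch below (illustrative names, frozensets for sets) computes eclose by iterating the union operator until nothing new appears, and rank directly by ∈-recursion:

```python
# eclose(a): union of the n-fold unions of a.
# rank(a): least ordinal above rank(x) for every x in a.
def big_union(X):
    out = frozenset()
    for x in X:
        out |= x
    return out

def eclose(a):
    seen, layer = frozenset(), frozenset(a)
    while not layer <= seen:     # stop once the next n-fold union adds nothing
        seen |= layer
        layer = big_union(layer)
    return seen

def rank(a):
    return max((rank(x) + 1 for x in a), default=0)

zero = frozenset()
one = frozenset({zero})
print(eclose(frozenset({one})) == frozenset({zero, one}))  # -> True: now transitive
print(rank(frozenset({one})))                              # -> 2
```

The set {1} is not transitive (0 ∈ 1 but 0 ∉ {1}); its eclose adds exactly the missing member.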
3.4.2. The Formal Definitions
Here are the Isabelle definitions of eclose, transrec (which performs ∈-recursion)
and rank:

    eclose(a) ≡ ⋃n∈nat nat_rec(a, λm r . ⋃(r), n)
    transrec(a, H) ≡ wfrec(Memrel(eclose({a})), a, H)
    rank(a) ≡ transrec(a, λx f . ⋃y∈x succ(f‘y))
3.4.3. The Main Theorems
Many results are proved about eclose; the most important perhaps is that
eclose(a) is the smallest transitive set containing a. Now Memrel(eclose({a}))
contains enough of the membership relation to include every chain x1 ∈ · · · ∈ xn ∈
a descending from a. As an instance of well-founded induction, we obtain
∈-induction:

    [∀y∈x . ψ(y)]x
          ⋮
        ψ(x)
    ────────────
        ψ(a)
Now ∈-recursion follows similarly, but there is another technical hurdle. In
transrec(a, H), the well-founded relation supplied to wfrec depends upon a; we
must show that the result of wfrec does not depend upon the field of the relation
Memrel(eclose(· · ·)), if it is big enough. Specifically, we must show

    k ∈ i
    ─────────────────────────────────────────────────────────────────
    wfrec(Memrel(eclose({i})), k, H) = wfrec(Memrel(eclose({k})), k, H)

in order to derive the recursion equation

    transrec(a, H) = H(a, λx∈a . transrec(x, H)).
Combining this with the definition of rank yields

    rank(a) = ⋃y∈a succ(rank(y)).

Trivial transfinite inductions prove Ord(rank(a)) and rank(α) = α for ordinals α.
We may use rank to measure the depth of a set. The following facts will justify
recursive function definitions over lists and trees by proving that the recursion is
well-founded:
        a ∈ b
    ─────────────────        rank(a) < rank(⟨a, b⟩)        rank(b) < rank(⟨a, b⟩)
    rank(a) < rank(b)
Let us prove the last of these from the first. Recall from Part I [22, §7.3] the
definition of ordered pairs, ⟨a, b⟩ ≡ {{a}, {a, b}}. From b ∈ {a, b} we obtain
rank(b) < rank({a, b}). From {a, b} ∈ ⟨a, b⟩ we obtain rank({a, b}) < rank(⟨a, b⟩).
Now < is transitive, yielding rank(b) < rank(⟨a, b⟩).
We need ∈-recursion only to define rank, since this operator can reduce every
other instance of ∈-recursion to transfinite recursion on the ordinals. We shall use
transrec immediately below and rank in the subsequent section.
3.5. The Cumulative Hierarchy
We can now formalize the definition Vα = ⋃β∈α ℘(Vβ), which was discussed above.
A useful generalization is to construct the cumulative hierarchy starting from a
given set A:

    V[A]α = A ∪ ⋃β∈α ℘(V[A]β)                                     (4)
Later, V [A]ω will serve as a ‘universe’ for defining recursive data structures; it
contains all finite lists and trees built over A. The Isabelle definitions include
    V[A]α ≡ transrec(α, λα f . A ∪ ⋃β∈α ℘(f‘β))
    Vα ≡ V[∅]α
3.5.1. Closure Properties of V [A]α
The Isabelle ZF theory proves several dozen facts involving V [A]α . Because its
definition uses ∈-recursion, V [A]x is meaningful for every set x. But the most
important properties concern V [A]α where α is an ordinal. Many are proved by
transfinite induction on α.
To justify the term ‘cumulative hierarchy,’ note that V [A]x is monotonic in
both A and x:
    A ⊆ B    x ⊆ y
    ───────────────
    V[A]x ⊆ V[B]y
For ordinals we obtain V [A]α ⊆ V [A]succ(α) as a corollary.
The cumulative hierarchy satisfies several closure properties. Here are three
elementary ones:
                                     a ⊆ V[A]α
    x ⊆ V[A]x      A ⊆ V[A]x     ───────────────
                                 a ∈ V[A]succ(α)
By the third property, increasing the ordinal generates finite sets:
    a₁ ∈ V[A]α  · · ·  aₙ ∈ V[A]α
    ─────────────────────────────
    {a₁, . . . , aₙ} ∈ V[A]succ(α)
Since ⟨a, b⟩ ≡ {{a}, {a, b}}, increasing the ordinal twice generates ordered pairs:

    a ∈ V[A]α    b ∈ V[A]α
    ──────────────────────────
    ⟨a, b⟩ ∈ V[A]succ(succ(α))
Now put α = ω, recalling that ω is just the set nat of all natural numbers. Let us
prove that V [A]ω is closed under products:
V [A]ω × V [A]ω ⊆ V [A]ω
Suppose we have a, b ∈ V[A]ω. By equation (4), there exist i, j ∈ nat such that
a ∈ V[A]ᵢ and b ∈ V[A]ⱼ. Let k be the greater of i and j; then a, b ∈ V[A]ₖ. Since
⟨a, b⟩ ∈ V[A]succ(succ(k)) and succ(succ(k)) ∈ nat, we conclude ⟨a, b⟩ ∈ V[A]ω.
By a similar argument, every finite subset of V [A]ω is an element of V [A]ω .
These ordered pairs and finite subsets are ultimately constructed from natural
numbers and elements of A, since V [A]ω contains nat and A as subsets.
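For finite stages the construction can be run directly. The Python sketch below (names are illustrative) uses the fact that, since the hierarchy is cumulative, the finite stage V[A]ₙ of equation (4) equals A ∪ ℘(V[A]ₙ₋₁), and checks the pairing property:

```python
from itertools import chain, combinations

def powerset(s):
    xs = list(s)
    return {frozenset(c) for c in
            chain.from_iterable(combinations(xs, k) for k in range(len(xs) + 1))}

def V(A, n):
    # Finite stage V[A]_n = A | powerset(V[A]_{n-1}), by cumulativity.
    stage = frozenset(A)
    for _ in range(n):
        stage = frozenset(A) | powerset(stage)
    return stage

def pair(a, b):                       # <a, b> = {{a}, {a, b}}
    return frozenset({frozenset({a}), frozenset({a, b})})

A = {frozenset()}                     # start from a one-element set
a, b = frozenset(), frozenset({frozenset()})
# a, b are in V[A]_1, so the pair appears two stages later, in V[A]_3.
print(pair(a, b) in V(A, 3))  # -> True
```

The pair is absent from V[A]₂, which is why increasing the ordinal twice is needed.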
A limit ordinal is one that is non-zero and closed under the successor operation:
Limit(α) ≡ Ord(α) ∧ 0 < α ∧ (∀y . y < α → succ(y) < α)
The smallest limit ordinal is ω. The closure properties of V[A]ω just discussed
hold when ω is replaced by any limit ordinal. We shall use these closure properties
in §4.2.
3.6. Recursion on a Set’s Rank
Consider using recursion over lists formed by repeated pairing. The tail of the list
⟨x, l⟩ is l. Since l is not a member of the set ⟨x, l⟩, we cannot use ∈-recursion to
justify a recursive call on l. But l has smaller rank than ⟨x, l⟩; since ordinals are
well-founded, this ensures that the recursion terminates.
The following recursion operator allows recursive calls involving any sets of lesser
rank. It handles the list example above, as well as recursive calls for components
of deep nests of pairs:

    Vrec(a, H) ≡ transrec(rank(a),
                          λi g . λz ∈ V_succ(i) . H(z, λy ∈ V_i . g‘rank(y)‘y)) ‘ a
This definition looks complex, but its formal properties are easy to derive. The
rest of this section attempts to convey the underlying intuition.
3.6.1. The Idea Behind Vrec
To understand the definition of Vrec, consider a technique for defining general
recursive functions over the natural numbers. The definition is reduced to one
involving a primitive recursive functional. Suppose we wish to define a function f
satisfying the recursion
f ‘x = H(x, f ).
Suppose that, for all x in the desired domain of H, the number k(x) exceeds the
depth of recursive calls required to compute f ‘x. Define the family of functions fˆn
by primitive recursion over n:
fˆ0 ≡ λx∈nat . x
fˆn+1 ≡ λx∈nat . H(x, fˆn )
Clearly, fˆn behaves like f if the depth of recursive calls is smaller than n; the
definition of fˆ0 is wholly immaterial, since it is never used. We can therefore define
f ≡ λx∈nat . fˆk(x) ‘x.
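This staging technique is easily demonstrated in Python (an informal sketch; H and the depth bound are supplied by the user):

```python
# Staged approximations: f_hat(0) is arbitrary; f_hat(n+1) = lambda x: H(x, f_hat(n)).
def f_hat(n, H):
    if n == 0:
        return lambda x: x          # wholly immaterial: never actually called
    return lambda x: H(x, f_hat(n - 1, H))

# Example: H for the factorial; x + 1 bounds the depth of recursive calls at x.
H = lambda x, f: 1 if x == 0 else x * f(x - 1)
fact = lambda x: f_hat(x + 1, H)(x)
print(fact(5))  # -> 120
```

Because the bound x + 1 exceeds the true call depth, the base case of f_hat is never reached with a meaningful argument.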
3.6.2. The Workings of Vrec
The definition of Vrec follows a similar idea. Using transfinite recursion, define a
family of functions fˆα such that

    fˆα‘x = H(x, λy ∈ V_rank(x) . fˆ_rank(y)‘y)                   (5)
[Figure 2. Domain for recursive calls in Vrec(x, H): the calls for argument x
range over V_rank(x).]
for all x in a sufficiently large set (which will depend upon α), and define

    Vrec(x, H) ≡ fˆ_rank(x)‘x.                                    (6)
Here, rank(x) serves as an upper bound on the number of recursive calls required
to compute Vrec(x, H). Combining equations (5) and (6) immediately yields the
desired recursion:
    Vrec(x, H) = H(x, λy ∈ V_rank(x) . fˆ_rank(y)‘y)
               = H(x, λy ∈ V_rank(x) . Vrec(y, H))
The key fact y ∈ Vα ↔ rank(y) ∈ α states that the set Vα consists of all sets
whose rank is smaller than α. For a given x, Vrec(x, H) may perform recursive
calls for all y of smaller rank than x (see Figure 2). This general principle can
express recursive functions for lists, trees and similar data structures based on
ordered pairing.
We may formalize fˆα using transrec:

    fˆα ≡ transrec(α, λi g . λz ∈ V_succ(i) . H(z, λy ∈ V_i . g‘rank(y)‘y))

Unfolding transrec and simplifying yields equation (5), with V_succ(α) as the
‘sufficiently large set’ mentioned above. Joining this definition with equation (6)
yields the full definition of Vrec.
The recursion equation for Vrec can be recast into a form that takes a definition
in the premise:
    ∀x . h(x) = Vrec(x, H)
    ──────────────────────────────────
    h(a) = H(a, λy ∈ V_rank(a) . h(y))
This expresses the recursion equation more neatly. The conclusion contains only
one occurrence of H instead of three, and H is typically complex.
The following sections include worked examples using Vrec to express recursive
functions.
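The essential behaviour of Vrec can be sketched in Python with nested tuples standing for sets built by ordered pairing, and tuple-nesting depth standing for rank. The recursive-call function handed to H refuses arguments of equal or greater depth, mirroring the domain V_rank(a) (an informal model, not the set-theoretic construction):

```python
# depth(x) plays the role of rank(x) for values built by ordered pairing.
def depth(x):
    if isinstance(x, tuple):
        return 1 + max((depth(y) for y in x), default=0)
    return 0

def vrec(a, H):
    def rec(y):                      # recursive calls only below rank(a)
        assert depth(y) < depth(a)
        return vrec(y, H)
    return H(a, rec)

# Lists by repeated pairing: Nil = (), Cons(x, l) = (x, l).  The tail l is
# not a member of (x, l), but it does have smaller depth, so vrec accepts it.
length = lambda l: vrec(l, lambda l, rec: 0 if l == () else 1 + rec(l[1]))
print(length((1, (2, (3, ())))))  # -> 3
```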
4. Recursive Data Structures
This section presents ZF formalizations of lists and two different treatments of
mutually recursive trees/forests. Before we can begin, two further tools are needed:
disjoint sums and a ‘universe’ for solving recursion equations over sets.
4.1. Disjoint Sums
Let 1 ≡ succ(0). Disjoint sums have a completely straightforward definition:
A + B ≡ ({0} × A) ∪ ({1} × B)
    Inl(a) ≡ ⟨0, a⟩
    Inr(b) ≡ ⟨1, b⟩
We obtain the obvious introduction rules
        a ∈ A                 b ∈ B
    ──────────────        ──────────────
    Inl(a) ∈ A + B        Inr(b) ∈ A + B
and other rules to state that Inl and Inr are injective and distinct. A case operation, defined by a description, satisfies two equations:
case(c, d, Inl(a)) = c(a)
case(c, d, Inr(b)) = d(b)
This resembles the when operator of Martin-Löf’s Type Theory [20].
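A throwaway Python model of these definitions, with tags in place of the set-theoretic pairs:

```python
# A + B via tagging: Inl(a) = <0, a> and Inr(b) = <1, b>.
Inl = lambda a: (0, a)
Inr = lambda b: (1, b)

def case(c, d, z):
    tag, x = z
    return c(x) if tag == 0 else d(x)

print(case(lambda a: a + 1, lambda b: -b, Inl(3)))  # -> 4
print(case(lambda a: a + 1, lambda b: -b, Inr(3)))  # -> -3
print(Inl(3) != Inr(3))  # -> True: the injections are distinct
```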
4.2. A Universe
The term universe generally means the class of all sets, but here it refers to the
set univ(A), which contains all finitely branching trees over A. The set is defined
by
univ(A) ≡ V [A]ω .
By the discussion of V [A]ω in §3.5, we have
univ(A) × univ(A) ⊆ univ(A).
From the simpler facts A ⊆ univ(A) and nat ⊆ univ(A), we obtain
univ(A) + univ(A) ⊆ univ(A).
So univ(A) contains A and the natural numbers, and is closed under disjoint sums
and Cartesian products. We may use it with lfp to define lists and trees as least
fixedpoints over univ(A), for a suitable set A.
Infinitely branching trees require larger universes. To construct them requires
cardinality reasoning. Let κ be an infinite cardinal. Writing the next larger cardinal
as κ⁺, a suitable universe for infinite branching up to κ is V[A]κ⁺. I have recently
formalized this approach in Isabelle’s ZF set theory, proving the function-space
theorem (κ → V[A]κ⁺) ⊆ V[A]κ⁺ and constructing an example with countable branching. The
cardinality arguments appear to require the Axiom of Choice, and involve a large
body of proofs. I plan to report on this work in a future paper.
4.3. Lists
Let list(A) denote the set of all finite lists taking elements from A. Formally,
list(A) should satisfy the recursion list(A) = {∅} + A × list(A). Since univ(A)
contains ∅ and is closed under + and ×, it contains solutions of this equation. We
simultaneously define the constructors Nil and Cons:7
list(A) ≡ lfp(univ(A), λX . {∅} + A × X)
Nil ≡ Inl(∅)
    Cons(a, l) ≡ Inr(⟨a, l⟩)
The mapping from X to {∅} + A × X is trivially monotonic by the rules shown
in §2.4, and univ(A) is closed under it. Therefore, the Knaster-Tarski Theorem
yields list(A) = {∅} + A × list(A) and we obtain the introduction rules:
                     a ∈ A    l ∈ list(A)
    Nil ∈ list(A)    ────────────────────
                     Cons(a, l) ∈ list(A)
With equal ease, we derive structural induction for lists:

                             [x ∈ A    y ∈ list(A)    ψ(y)]x,y
                                          ⋮
    l ∈ list(A)    ψ(Nil)    ψ(Cons(x, y))
    ──────────────────────────────────────
                  ψ(l)
4.3.1. Operating Upon Lists
Again following Martin-Löf’s Type Theory [13], we operate upon lists using case
analysis and structural recursion. Here are their definitions in set theory:

    list_case(c, h, l) ≡ case(λu . c, split(h), l)
    list_rec(c, h, l) ≡ Vrec(l, λl g . list_case(c, λx y . h(x, y, g‘y), l))

Recall from Part I [22] that split satisfies split(h, ⟨a, b⟩) = h(a, b). The equations
for list_case follow easily by rewriting with those for case and split.
    list_case(c, h, Nil) = case(λu . c, split(h), Inl(∅))
                         = (λu . c)(∅)
                         = c.

    list_case(c, h, Cons(x, y)) = case(λu . c, split(h), Inr(⟨x, y⟩))
                                = split(h, ⟨x, y⟩)
                                = h(x, y).

To summarize, we obtain the equations

    list_case(c, h, Nil) = c
    list_case(c, h, Cons(x, y)) = h(x, y).
Proving the equations for list_rec is almost as easy. Unfolding the recursion
equation for Vrec yields

    list_rec(c, h, l) = list_case(c, λx y . h(x, y, g‘y), l)      (7)

where g ≡ λz ∈ V_rank(l) . list_rec(c, h, z). We instantly obtain the Nil case, and
with slightly more effort, the recursive case:

    list_rec(c, h, Nil) = c
    list_rec(c, h, Cons(x, y)) = h(x, y, list_rec(c, h, y))
In deriving the latter equation, the first step is to put l ≡ Cons(x, y) in (7) and
apply an equation for list_case:

    list_rec(c, h, Cons(x, y)) = list_case(c, λx y . h(x, y, g‘y), Cons(x, y))
                               = h(x, y, g‘y)

All that remains is the β-reduction of g‘y to list_rec(c, h, y), where g‘y is

    (λz ∈ V_rank(Cons(x,y)) . list_rec(c, h, z)) ‘ y.

This step requires proving y ∈ V_rank(Cons(x,y)). Note that Cons(x, y) = ⟨1, x, y⟩;
by properties of rank (§3.4), we must show

    rank(y) < rank(⟨1, x, y⟩).

This is obvious because rank(b) < rank(⟨a, b⟩) for all a and b, and because the
relation < is transitive.
Recursion operators for other data structures are derived in the same manner.
4.3.2. Defining Functions on Lists
The Isabelle theory defines some common list operations, such as append and map,
using list rec:
    map(h, l) ≡ list_rec(Nil, λx y r . Cons(h(x), r), l)
    xs@ys ≡ list_rec(ys, λx y r . Cons(x, r), xs)
The usual recursion equations follow directly. Note the absence of typing conditions
such as l ∈ list(A):
    map(h, Nil) = Nil
    map(h, Cons(a, l)) = Cons(h(a), map(h, l))

    Nil@ys = ys
    Cons(a, l)@ys = Cons(a, l@ys)
The familiar theorems about these functions have elementary proofs by list induction and simplification. Theorems proved by induction have typing conditions; here
is one example out of the many proved in Isabelle:
    xs ∈ list(A)
    ─────────────────────────────────────
    map(h, xs@ys) = map(h, xs)@map(h, ys)
We can also prove some unusual type-checking rules:
    l ∈ list(A)
    ────────────────────────────────
    map(h, l) ∈ list({h(x) . x ∈ A})
Here, list({h(x) . x ∈ A}) is the set of all lists whose elements have the form
h(x) for some x ∈ A. Using list(· · ·) in recursive definitions raises interesting
possibilities, as the next section will illustrate.
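The whole development of this section can be animated in Python, with () standing for ∅ and nested tuples for pairs; list_rec is then an ordinary structural recursion and map and append fall out exactly as above (a sketch, not Isabelle's definitions):

```python
# Nil = Inl(0) and Cons(a, l) = Inr(<a, l>), with tags 0/1 as in A + B.
Nil = (0, ())
Cons = lambda a, l: (1, (a, l))

def list_rec(c, h, l):
    tag, body = l
    if tag == 0:
        return c
    a, tail = body
    return h(a, tail, list_rec(c, h, tail))

lmap = lambda h, l: list_rec(Nil, lambda x, y, r: Cons(h(x), r), l)
append = lambda xs, ys: list_rec(ys, lambda x, y, r: Cons(x, r), xs)

l = Cons(1, Cons(2, Nil))
print(lmap(lambda x: 10 * x, l) == Cons(10, Cons(20, Nil)))       # -> True
print(append(l, Cons(3, Nil)) == Cons(1, Cons(2, Cons(3, Nil))))  # -> True
```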
4.4. Using list(· · ·) in Recursion Equations
Recursive data structure definitions typically involve × and +, but sometimes it
is convenient to involve other set constructors. This section demonstrates using
list(· · ·) to define another data structure.
Consider the syntax of terms over the alphabet A. Each term is a function
application f (t1 , . . . , tn ), where f ∈ A and t1 , . . . , tn are themselves terms. We
shall formalize this syntax as term(A), the set of all trees whose nodes are labelled
with an element of A and which have zero or more subtrees. It is natural to regard
the subtrees as a list; we solve the recursion equation
    term(A) = A × list(term(A)).                                  (8)
Before using list(· · ·) with the Knaster-Tarski Theorem, we must show that it is
monotonic and bounded:
          A ⊆ B
    ─────────────────        list(univ(A)) ⊆ univ(A)
    list(A) ⊆ list(B)
The proofs are simple using lemmas such as the monotonicity of lfp (§2.4). If we
now define
    term(A) ≡ lfp(univ(A), λX . A × list(X))
    Apply(a, ts) ≡ ⟨a, ts⟩

then we quickly derive (8) and obtain the single introduction rule

    a ∈ A    ts ∈ list(term(A))
    ───────────────────────────
    Apply(a, ts) ∈ term(A)
The structural induction rule takes a curious form:

                   [x ∈ A    zs ∈ list({z ∈ term(A) . ψ(z)})]x,zs
                                     ⋮
    t ∈ term(A)    ψ(Apply(x, zs))
    ──────────────────────────────
                ψ(t)
Because of the presence of list in the recursion equation (8), we cannot express
induction hypotheses in the familiar manner. Clearly, zs ∈ list({z ∈ term(A) .
ψ(z)}) if and only if every element z of zs satisfies ψ(z) and belongs to term(A).
Proofs by this induction rule generally require a further induction over the term
list zs.
4.4.1. Recursion on Terms
Let us define analogues of list_case and list_rec. The former is trivial: because
every term is an ordered pair, we may use split.
A recursive function on terms will naturally apply itself to the list of subterms,
using the list functional map. Define

    term_rec(d, t) ≡ Vrec(t, λt g . split(λx zs . d(x, zs, map(λz . g‘z, zs)), t))
Note that map was defined above to be a binding operator; it applies to a meta-level
function, not a ZF function (a set of pairs). Since g denotes a ZF function, we
must write map(λz . g‘z, zs) instead of map(g, zs). Although the form of map causes
complications now, it leads to simpler equations later.
Put t ≡ Apply(a, ts) in the definition of term_rec. Unfolding the recursion
equation for Vrec and applying the equation for split yields

    term_rec(d, Apply(a, ts)) = split(λx zs . d(x, zs, map(λz . g‘z, zs)), ⟨a, ts⟩)
                              = d(a, ts, map(λz . g‘z, ts))

where g ≡ λx ∈ V_rank(⟨a,ts⟩) . term_rec(d, x). The map above applies term_rec(d, x),
restricted to x such that rank(x) < rank(⟨a, ts⟩), to each member of ts. Clearly,
each member of ts has lesser rank than ts, and therefore lesser rank than ⟨a, ts⟩; the
restriction on x has no effect, and the result must equal map(λz . term_rec(d, z), ts).
We may abbreviate this (by η-contraction) to map(term_rec(d), ts).
To formalize this argument, the ZF theory proves the more general lemma

    l ∈ list(A)    Ord(α)    rank(l) ∈ α
    ───────────────────────────────────────────
    map(λz . (λx ∈ Vα . h(x))‘z, l) = map(h, l)

by structural induction on the list l. The lemma simplifies the term_rec equation
to

    ts ∈ list(A)
    ──────────────────────────────────────────────────────────
    term_rec(d, Apply(a, ts)) = d(a, ts, map(term_rec(d), ts))
The curious premise ts ∈ list(A) arises from the map lemma just proved; A need
not be a set of terms and does not appear in the conclusion. Possibly, this premise
could be eliminated by reasoning about the result of map when applied to non-lists.
4.4.2. Defining Functions on Terms
To illustrate the use of term rec, let us define the operation to reflect a term about
its vertical axis, reversing the list of subtrees at each node. First we define rev,
the traditional list reverse operation.8
    rev(l) ≡ list_rec(Nil, λx y r . r@Cons(x, Nil), l)
    reflect(t) ≡ term_rec(λx zs rs . Apply(x, rev(rs)), t)
Unfolding the recursion equation for term_rec instantly yields, for ts ∈ list(A),

    reflect(Apply(a, ts)) = Apply(a, rev(map(reflect, ts))).      (9)
Note the simple form of the map application above, since reflect is a meta-level
function. Defining functions at the meta-level allows them to operate over the class
of all sets. On the other hand, an object-level function is a set of pairs; its domain
and range must be sets.
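A Python sketch of term_rec and reflect, with a term Apply(a, ts) modelled as a pair of a label and a Python list of subterms (the Python list and list comprehension play the roles of the ZF list and map; an informal model only):

```python
Apply = lambda a, ts: (a, ts)

def term_rec(d, t):
    a, ts = t
    # d receives the label, the subterm list and the recursive results,
    # as in term_rec(d, Apply(a, ts)) = d(a, ts, map(term_rec(d), ts)).
    return d(a, ts, [term_rec(d, z) for z in ts])

reflect = lambda t: term_rec(lambda x, zs, rs: Apply(x, list(reversed(rs))), t)

t = Apply('f', [Apply('a', []), Apply('b', [])])
print(reflect(t) == Apply('f', [Apply('b', []), Apply('a', [])]))  # -> True
print(reflect(reflect(t)) == t)                                    # -> True
```

The second check is the involution reflect(reflect(t)) = t, proved formally in §4.4.4.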
4.4.3. An Induction Rule for Equations Between Terms
The Isabelle ZF theory defines and proves theorems about several term operations.
Many term operations involve a corresponding list operation, as reflect involves
rev. Proofs by term induction involve reasoning about map.
Since many theorems are equations, let us derive an induction rule for proving
equations easily. First, we derive two rules:
    l ∈ list({x ∈ A . ψ(x)})        l ∈ list({x ∈ A . h1(x) = h2(x)})
    ────────────────────────        ─────────────────────────────────
    l ∈ list(A)                     map(h1, l) = map(h2, l)
The first rule follows by monotonicity of list. To understand the second rule,
suppose l ∈ list({x ∈ A . h1 (x) = h2 (x)}). Then h1 (x) = h2 (x) holds for every
member x of the list l, so map(h1 , l) = map(h2 , l). This argument may be formalized
using list induction.
Combining the two rules with term induction yields the derived induction rule:

                   [x ∈ A    zs ∈ list(term(A))    map(h1, zs) = map(h2, zs)]x,zs
                                      ⋮
    t ∈ term(A)    h1(Apply(x, zs)) = h2(Apply(x, zs))
    ──────────────────────────────────────────────────
                h1(t) = h2(t)
The induction hypothesis, map(h1 , zs) = map(h2 , zs), neatly expresses that h1 (z) =
h2 (z) holds for every member z of the list zs.
4.4.4. Example of Equational Induction
To demonstrate the induction rule, let us prove reflect(reflect(t)) = t. The
proof requires four lemmas about rev and map. Ignoring the premise l ∈ list(A),
the lemmas are
    rev(map(h, l)) = map(h, rev(l))                               (10)
    map(h1, map(h2, l)) = map(λu . h1(h2(u)), l)                  (11)
    map(λu . u, l) = l                                            (12)
    rev(rev(l)) = l                                               (13)
To apply the derived induction rule, we may assume the induction hypothesis
    map(λu . reflect(reflect(u)), zs) = map(λu . u, zs)           (14)
and must show
reflect(reflect(Apply(x, zs))) = Apply(x, zs).
Simplifying the left hand side, we have
    reflect(reflect(Apply(x, zs)))
      = reflect(Apply(x, rev(map(reflect, zs))))                  by (9)
      = reflect(Apply(x, map(reflect, rev(zs))))                  by (10)
      = Apply(x, rev(map(reflect, map(reflect, rev(zs)))))        by (9)
      = Apply(x, rev(map(λu . reflect(reflect(u)), rev(zs))))     by (11)
      = Apply(x, map(λu . reflect(reflect(u)), rev(rev(zs))))     by (10)
      = Apply(x, map(λu . reflect(reflect(u)), zs))               by (13)
      = Apply(x, map(λu . u, zs))                                 by (14)
      = Apply(x, zs)                                              by (12)
The use of map may be elegant, but the proof is rather obscure. The next section
describes an alternative formulation of the term data structure.
This section has illustrated how list can be added to our repertoire of set
constructors permitted in recursive data structure definitions. It seems clear that
other set constructors, including term itself, can be added similarly.
4.5. Mutual Recursion
Consider the sets tree(A) and forest(A) defined by the mutual recursion equations
tree(A) = A × forest(A)
forest(A) = {∅} + tree(A) × forest(A)
Observe that tree(A) is essentially the same data structure as term(A), since
forest(A) is essentially the same as list(term(A)). Mutual recursion avoids the
complications of recursion over the operator list, but introduces its own complications.
4.5.1. The General Approach
Mutual recursion equations are typically solved by applying the Knaster-Tarski
Theorem over the lattice ℘(A)×℘(B), the Cartesian product of two powersets. But
we have proved the Theorem only for a simple powerset lattice. Because the lattice
℘(A + B) is order-isomorphic to ℘(A) × ℘(B), we shall instead apply the Theorem
to a lattice of the form ℘(A + B). We solve the equations by constructing a disjoint
sum comprising all of the sets in the definition — here, a set called TF(A), which
will contain tree(A) and forest(A) as disjoint subsets. This approach appears to
work well, and TF(A) turns out to be useful in itself. A minor drawback: it does
not solve the recursion equations up to equality, only up to isomorphism.
To support this approach to mutual recursion, define
Part(A, h) ≡ {x ∈ A . ∃z . x = h(z)}.
Here Part(A, h) selects the subset of A whose elements have the form h(z). Typically h is Inl or Inr, the injections for the disjoint sum. Note that Part(A+B, Inl)
equals not A but {Inl(x) . x ∈ A}. The disjoint sum of three or more sets involves
nested injections. We may use Part with the composition of injections, such as
λx . Inr(Inl(x)), and obtain equations such as
Part(A + (B + C), λx . Inr(Inl(x))) = {Inr(Inl(x)) . x ∈ B}.
4.5.2. The Formal Definitions
Now TF(A), tree(A) and forest(A) are defined by
TF(A) ≡ lfp(univ(A), λX . A × Part(X, Inr) +
({∅} + Part(X, Inl) × Part(X, Inr)))
tree(A) ≡ Part(TF(A), Inl)
forest(A) ≡ Part(TF(A), Inr)
The presence of Part does not complicate reasoning about lfp. In particular,
Part(A, h) is monotonic in A. We obtain
TF(A) = A × Part(TF(A), Inr) +
({∅} + Part(TF(A), Inl) × Part(TF(A), Inr))
= A × forest(A) + ({∅} + tree(A) × forest(A))
This solves our recursion equations up to isomorphism:
tree(A) = {Inl(x) . x ∈ A × forest(A)}
forest(A) = {Inr(x) . x ∈ {∅} + tree(A) × forest(A)}
These equations determine the tree and forest constructors, Tcons, Fnil and
Fcons. Due to the similarity to list(A), we can use the list constructors to abbreviate the definitions:
    Tcons(a, f) ≡ Inl(⟨a, f⟩)
    Fnil ≡ Inr(Nil)
    Fcons(t, f) ≡ Inr(Cons(t, f))
A little effort yields the introduction rules:
    a ∈ A    f ∈ forest(A)                          t ∈ tree(A)    f ∈ forest(A)
    ──────────────────────    Fnil ∈ forest(A)      ────────────────────────────
    Tcons(a, f) ∈ tree(A)                           Fcons(t, f) ∈ forest(A)
The usual methods yield a structural induction rule for TF(A):

    z ∈ TF(A)

    [x ∈ A    f ∈ forest(A)    ψ(f)]x,f
                 ⋮
    ψ(Tcons(x, f))

    ψ(Fnil)

    [t ∈ tree(A)    ψ(t)    f ∈ forest(A)    ψ(f)]t,f
                 ⋮
    ψ(Fcons(t, f))
    ─────────────────────────────────────────────────
    ψ(z)                                                          (15)
(The assumptions are stacked vertically to save space.) Although this may not
look like the best rule for mutual recursion, it is surprisingly simple and useful.
It affords easy proofs of several theorems in the Isabelle theory. For the general
case, there is a rule that allows different induction formulae, ψ for trees and φ for
forests:
Its conclusion is (∀t∈tree(A) . ψ(t)) ∧ (∀f ∈forest(A) . φ(f )), and its premises are:

1. ψ(Tcons(x, f )) with assumptions [x ∈ A   f ∈ forest(A)   φ(f )]x,f
2. φ(Fnil)
3. φ(Fcons(t, f )) with assumptions [t ∈ tree(A)   f ∈ forest(A)   ψ(t)   φ(f )]t,f     (16)
This rule follows by applying the previous one to the formula
(z ∈ tree(A) → ψ(z)) ∧ (z ∈ forest(A) → φ(z)).
Its derivation relies on the disjointness of tree(A) and forest(A). Both rules are
demonstrated below.
4.5.3. Operating on Trees and Forests
The case analysis operator is called TF case and the recursion operator is called
TF rec:
TF case(b, c, d, z) ≡ case(split(b), list case(c, d), z)
TF rec(b, c, d, z) ≡ Vrec(z, λz r . TF case(λx f . b(x, f, r‘f ),
c, λt f . d(t, f, r‘t, r‘f ), z))
Note the use of the case analysis operators for disjoint sums (case), Cartesian products (split), and lists (list case). Unfolding Vrec, we now derive the recursion
rules, starting with the one for trees:
TF rec(b, c, d, Tcons(a, f ))
= TF rec(b, c, d, Inl(⟨a, f ⟩))
= TF case(λx f . b(x, f, r‘f ), c, λt f . d(t, f, r‘t, r‘f ), Inl(⟨a, f ⟩))
= case(split(λx f . b(x, f, r‘f )),
       list case(c, λt f . d(t, f, r‘t, r‘f )), Inl(⟨a, f ⟩))
= split(λx f . b(x, f, r‘f ), ⟨a, f ⟩)
= b(a, f, r‘f )
where r ≡ λx ∈ Vrank(Inl(⟨a,f ⟩)) . TF rec(b, c, d, x). The usual lemmas prove
rank(f ) < rank(Inl(a, f )),
allowing r‘f to be β-reduced to TF rec(b, c, d, f ), so that the whole expression
becomes b(a, f, TF rec(b, c, d, f )). The other recursion
rules for TF rec are derived similarly. To summarize, we have
TF rec(b, c, d, Tcons(a, f )) = b(a, f, TF rec(b, c, d, f ))
TF rec(b, c, d, Fnil) = c
TF rec(b, c, d, Fcons(t, f )) = d(t, f, TF rec(b, c, d, t), TF rec(b, c, d, f ))
4.5.4. Defining Functions on Trees and Forests
Some examples may be helpful. Here are three applications of TF rec:
− TF map applies an operation to every label of a tree.
− TF size returns the number of labels in a tree.
− TF preorder returns the labels as a list, in preorder.
Each operation is defined simultaneously for trees and forests:
TF map(h, z) ≡ TF rec(λx f r . Tcons(h(x), r),
                      Fnil,
                      λt f r1 r2 . Fcons(r1 , r2 ), z)
TF size(z) ≡ TF rec(λx f r . succ(r),
                    0,
                    λt f r1 r2 . r1 ⊕ r2 , z)
TF preorder(z) ≡ TF rec(λx f r . Cons(x, r),
                        Nil,
                        λt f r1 r2 . r1 @ r2 , z)
Here ⊕ is the addition operator for natural numbers. Recall that @ is the append
operator for lists (§4.3).
Applying the TF rec recursion equations to TF map immediately yields
TF map(h, Tcons(a, f )) = Tcons(h(a), TF map(h, f ))
TF map(h, Fnil) = Fnil
TF map(h, Fcons(t, f )) = Fcons(TF map(h, t), TF map(h, f ))
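These operators can be prototyped directly. The following Python sketch is a modelling exercise, not the ZF construction: trees and forests become tagged tuples, well-founded recursion on rank becomes the host language's structural recursion, and succ, ⊕ and @ become Python's + on numbers and lists. Under those substitutions it satisfies the recursion equations for TF rec, TF map, TF size and TF preorder.

```python
# Trees and forests as tagged values: a tree is Inl of a pair <label, forest>;
# a forest is Inr of either Fnil or Fcons(tree, forest).
def Tcons(a, f): return ('Inl', (a, f))
Fnil = ('Inr', ('Fnil',))
def Fcons(t, f): return ('Inr', ('Fcons', t, f))

def TF_rec(b, c, d, z):
    tag, body = z
    if tag == 'Inl':                       # z = Tcons(a, f)
        a, f = body
        return b(a, f, TF_rec(b, c, d, f))
    if body[0] == 'Fnil':                  # z = Fnil
        return c
    _, t, f = body                         # z = Fcons(t, f)
    return d(t, f, TF_rec(b, c, d, t), TF_rec(b, c, d, f))

def TF_map(h, z):
    return TF_rec(lambda x, f, r: Tcons(h(x), r),
                  Fnil,
                  lambda t, f, r1, r2: Fcons(r1, r2), z)

def TF_size(z):
    return TF_rec(lambda x, f, r: r + 1, 0,
                  lambda t, f, r1, r2: r1 + r2, z)

def TF_preorder(z):
    return TF_rec(lambda x, f, r: [x] + r, [],
                  lambda t, f, r1, r2: r1 + r2, z)

# Example: a tree labelled 0 whose forest holds two leaf subtrees.
tree = Tcons(0, Fcons(Tcons(1, Fnil), Fcons(Tcons(2, Fnil), Fnil)))
assert TF_size(tree) == 3
assert TF_preorder(tree) == [0, 1, 2]
assert TF_map(lambda u: u, tree) == tree   # the law proved in section 4.5.5
```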
Many theorems can be proved by the simple induction rule (15) for TF(A), taking
advantage of ZF’s lack of a formal type system. Separate proofs for tree(A) and
forest(A) would require the cumbersome rule for mutual induction.
4.5.5. Example of Simple Induction
Let us prove TF map(λu . u, z) = z for all z ∈ TF(A). By the simple induction
rule (15), it suffices to prove three subgoals:
− TF map(λu . u, Tcons(x, f )) = Tcons(x, f ) assuming the induction hypothesis
TF map(λu . u, f ) = f
− TF map(λu . u, Fnil) = Fnil
− TF map(λu . u, Fcons(t, f )) = Fcons(t, f ) assuming the induction hypotheses
TF map(λu . u, t) = t and TF map(λu . u, f ) = f
These are all trivial, by the recursion equations. For example, the first subgoal is
proved in two steps:
TF map(λu . u, Tcons(x, f )) = Tcons(x, TF map(λu . u, f )) = Tcons(x, f )
The simple induction rule proves various laws relating TF map, TF size and
TF preorder with equal ease.
4.5.6. Example of Mutual Induction
The mutual induction rule (16) proves separate properties for tree(A) and
forest(A). The simple rule (15) can show that TF map takes elements of TF(A)
to TF(B), for some B; let us sharpen this result to show that TF map takes trees
to trees and forests to forests. Assume h(x) ∈ B for all x ∈ A and apply mutual
induction to the formula
(∀t∈tree(A) . TF map(h, t) ∈ tree(B)) ∧ (∀f ∈forest(A) . TF map(h, f ) ∈ forest(B))
The first subgoal of the induction is to show
TF map(h, Tcons(x, f )) ∈ tree(B)
assuming x ∈ A, f ∈ forest(A) and TF map(h, f ) ∈ forest(B). The recursion
equation for TF map reduces it to
Tcons(h(x), TF map(h, f )) ∈ tree(B);
the type-checking rules for Tcons and h reduce it to the assumptions x ∈ A and
TF map(h, f ) ∈ forest(B).
The second subgoal of the induction is
TF map(h, Fnil) ∈ forest(B),
which reduces to the trivial Fnil ∈ forest(B). The third subgoal,
TF map(h, Fcons(t, f )) ∈ forest(B),
is treated like the first.
We have considered two approaches to defining variable-branching trees. The
previous section defines term(A) by recursion over the operator list, so that
list(term(A)) denotes the set of forests over A. I prefer this to the present
approach of mutual recursion. But this one example does not demonstrate that
mutual recursion should always be avoided. An example to study is a programming language that allows embedded commands in expressions; its expressions and
commands would be mutually recursive.
5. Soundness and Completeness of Propositional Logic
We have discussed the ZF formalization of least fixedpoints, recursive functions
and recursive data structures. Formalizing propositional logic — its syntax, semantics and proof theory — exercises each of these principles. The proofs of soundness
and completeness amount to an equivalence proof between denotational and operational semantic definitions. Similar examples abound in theoretical Computer
Science.
5.1. Defining the Set of Propositions
The propositions come in three forms:
1. Fls is the absurd proposition.
2. #v is a propositional variable, for v ∈ nat.
3. p ⊃ q is an implication if p and q are propositions.
The set prop consists of all propositions. It is the least solution to the recursion
equation
prop = {∅} + nat + prop × prop.
The definition is similar to the others described above. We obtain the introduction
rules
Fls ∈ prop

v ∈ nat
#v ∈ prop

p ∈ prop    q ∈ prop
p ⊃ q ∈ prop
with the usual induction rule for proving a property for every element of prop.
Recursive functions on prop are defined in the standard way.
Next, we define the denotational semantics of a proposition by translation to
first-order logic. A truth valuation t is a subset of nat representing a set of atoms
regarded as true (all others to be regarded as false). If p ∈ prop and t ⊆ nat then
is true(p, t) states that p evaluates to true under t. Writing ⊥ for the absurd
formula in first-order logic, the recursion equations are
is true(Fls, t) ↔ ⊥
is true(#v, t) ↔ v ∈ t
is true(p ⊃ q, t) ↔ (is true(p, t) → is true(q, t))
Our recursion principles cannot express is true(p, t) directly since it is a formula.
Instead, is true(p, t) is defined in terms of a recursive function that yields the
truth value of p as an element of {0, 1}. The details are omitted.
5.2. Defining an Inference System in ZF
Let H be a set of propositions and p a proposition. Write H |= p to mean that
the truth of all elements of H implies the truth of p, for every truth valuation t.
Logical consequence is formalized in ZF by
H |= p ≡ ∀t . (∀q∈H . is true(q, t)) → is true(p, t)
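These definitions are executable on a small model. In the Python sketch below (my own illustration, not part of the formalization), propositions are tagged tuples, is true is computed by structural recursion, and logical consequence is checked by enumerating valuations over only the atoms that occur in p and H; this finite check stands in for the quantification over all t, since atoms occurring in neither p nor H cannot affect their truth.

```python
from itertools import combinations

Fls = ('Fls',)
def Var(v): return ('#', v)
def Imp(p, q): return ('imp', p, q)

def is_true(p, t):
    # Structural recursion matching the three is_true equations.
    if p[0] == 'Fls':  return False
    if p[0] == '#':    return p[1] in t
    return (not is_true(p[1], t)) or is_true(p[2], t)

def atoms(p):
    if p[0] == 'Fls': return set()
    if p[0] == '#':   return {p[1]}
    return atoms(p[1]) | atoms(p[2])

def logcon(H, p):
    # H |= p, checked over valuations of the relevant atoms only.
    vs = sorted(set().union(atoms(p), *map(atoms, H)))
    for n in range(len(vs) + 1):
        for t in combinations(vs, n):
            if all(is_true(q, set(t)) for q in H) and not is_true(p, set(t)):
                return False
    return True

assert logcon(set(), Imp(Var(0), Imp(Var(1), Var(0))))   # axiom (K) is valid
assert logcon({Var(0)}, Var(0))
assert not logcon(set(), Var(0))
```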
The objective is to prove that H |= p holds if and only if p is provable from H
using the axioms (K), (S), (DN ) with the Modus Ponens rule (M P ). Note that
⊃ associates to the right:
p ⊃ q ⊃ p                                        (K)

(p ⊃ q ⊃ r) ⊃ (p ⊃ q) ⊃ (p ⊃ r)                  (S)

((p ⊃ Fls) ⊃ Fls) ⊃ p                            (DN)

p ⊃ q    p
q                                                (MP)
Such inference systems are becoming popular for defining the operational semantics
of programming languages. They can be extremely large — consider the Definition
of Standard ML [17]. The Knaster-Tarski Theorem can express the least set of
propositions closed under the axioms and rules, but we must adopt a formalization
that scales up to large inference systems.
Defining a separate Isabelle constant for each axiom and rule affords some
control over formula expansion during proof. An axiom is expressed as a union
over its schematic variables:
axK ≡ ⋃p∈prop ⋃q∈prop {p ⊃ q ⊃ p}
axS ≡ ⋃p∈prop ⋃q∈prop ⋃r∈prop {(p ⊃ q ⊃ r) ⊃ (p ⊃ q) ⊃ (p ⊃ r)}
axDN ≡ ⋃p∈prop {((p ⊃ Fls) ⊃ Fls) ⊃ p}
A rule takes a set X of theorems and generates the set of all immediate consequences of X:
ruleMP(X) ≡ ⋃p∈prop {q ∈ prop . {p ⊃ q, p} ⊆ X}
The axioms and rules could have been defined in many equivalent ways. Unions and
singletons give a uniform format for the axioms. But ruleMP makes an ad-hoc use of
the Axiom of Separation, since its conclusion is just a schematic variable; this need
not be the case for other rules. The use of the subset relation in {p ⊃ q, p} ⊆ X
simplifies the proof that ruleMP(X) is monotonic in X.
We now define the set thms(H) of theorems provable from H, and the consequence relation H ⊢ p. The first part of the union, H ∩ prop, considers only the
propositions in H as theorems; putting just H here would make most of our results
conditional on H ⊆ prop.
thms(H) ≡ lfp(prop, λX . (H ∩ prop) ∪ axK ∪ axS ∪ axDN ∪ ruleMP(X))
H ⊢ p ≡ p ∈ thms(H)
We immediately obtain introduction rules corresponding to the axioms; the premises perform type-checking:
p ∈ H    p ∈ prop
H ⊢ p                                            (H)

p ∈ prop    q ∈ prop
H ⊢ p ⊃ q ⊃ p                                    (K)
p ∈ prop    q ∈ prop    r ∈ prop
H ⊢ (p ⊃ q ⊃ r) ⊃ (p ⊃ q) ⊃ (p ⊃ r)             (S)

p ∈ prop
H ⊢ ((p ⊃ Fls) ⊃ Fls) ⊃ p                        (DN)

Proving that every theorem is a proposition
H ⊢ p
p ∈ prop
helps to derive a rule for Modus Ponens
that is free of type-checking:
H ⊢ p ⊃ q    H ⊢ p
H ⊢ q                                            (MP)
We may use these rules, cumbersome though they are, as an Isabelle object-logic.
They can be supplied to tools such as the classical reasoner in order to prove
Isabelle goals involving assertions of the form H ⊢ p. This rule is derived using
(M P ), (S) and (K):
p ∈ prop
H ⊢ p ⊃ p                                        (I)
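The derivation of (I) can be replayed mechanically: from suitable instances of (K) and (S), closing under Modus Ponens yields p ⊃ p. The Python sketch below is my own illustration; the chosen instantiations of the schematic variables are spelled out in comments.

```python
def Imp(p, q): return ('imp', p, q)
p = ('#', 0)                       # the propositional variable #0

# (K) with q := p > p :  p > (p > p) > p
k1 = Imp(p, Imp(Imp(p, p), p))
# (K) with q := p :      p > p > p
k2 = Imp(p, Imp(p, p))
# (S) with q := p > p, r := p :
#   (p > (p > p) > p) > (p > p > p) > (p > p)
s1 = Imp(k1, Imp(k2, Imp(p, p)))

thms = {k1, k2, s1}
changed = True
while changed:                     # close the set under Modus Ponens
    new = {x[2] for x in thms if x[0] == 'imp' and x[1] in thms}
    changed = not new <= thms
    thms |= new

assert Imp(p, p) in thms           # H |- p > p, rule (I)
```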
By the monotonicity result from §2.4, thms(H) is monotonic in H, which justifies
a rule for weakening on the left. Axiom (K) justifies weakening on the right:
G⊆H G⊢p
H⊢p
H ⊢ q p ∈ prop
H⊢p⊃q
5.3. Rule Induction
Because it is defined using a least fixedpoint in ZF, our propositional logic admits
induction over its proofs. This principle, sometimes called rule induction, does not
require an explicit data structure for proofs; just apply the usual induction rule
for lfp. Below we shall discuss this rule with two examples of its use, the Deduction
Theorem and the Soundness Theorem (proving the latter in an Isabelle session).
The rule is too large to display in the usual notation. Its conclusion is ψ(p) and
it has six premises:
1. H ⊢ p, which is the major premise
2. ψ(x) with assumptions [x ∈ prop   x ∈ H]x
3. ψ(x ⊃ y ⊃ x) with assumptions [x ∈ prop   y ∈ prop]x,y
4. ψ((x ⊃ y ⊃ z) ⊃ (x ⊃ y) ⊃ x ⊃ z) with assumptions [x ∈ prop   y ∈ prop   z ∈ prop]x,y,z
5. ψ(((x ⊃ Fls) ⊃ Fls) ⊃ x) with assumption [x ∈ prop]x
6. ψ(y) with assumptions [H ⊢ x ⊃ y   H ⊢ x   ψ(x ⊃ y)   ψ(x)]x,y
The rationale for this form of induction is simple: if ψ holds for all the axioms
and is preserved by all the rules, then it must hold for all the theorems. The
premise ψ(x ⊃ y ⊃ x) ensures that ψ holds for all instances of axiom (K), and
similar premises handle the other axioms. The last premise ensures that rule (M P )
preserves ψ; thus it takes ψ(x ⊃ y) and ψ(x) as induction hypotheses.9
The Deduction Theorem states that {p}∪H ⊢ q implies H ⊢ p ⊃ q. In Isabelle’s
set theory, it is formalized as follows (since cons(p, H) = {p} ∪ H):
cons(p, H) ⊢ q p ∈ prop
H⊢p⊃q
The proof is by rule induction on cons(p, H) ⊢ q. Of the five remaining subgoals,
the first is to show H ⊢ p ⊃ x assuming x ∈ prop and x ∈ cons(p, H). From
x ∈ cons(p, H) there are two subcases:
− If x = p then H ⊢ x ⊃ x follows using (I).
− If x ∈ H then H ⊢ p ⊃ x follows using (H) and weakening.
The next three subgoals correspond to one of the axioms (K), (S) or (DN ), and
hold by that axiom plus weakening. For the last subgoal, H ⊢ p ⊃ y follows from
H ⊢ p ⊃ x ⊃ y and H ⊢ p ⊃ x using (S) and (M P ).
Isabelle executes this proof of the Deduction Theorem in under six seconds. The
classical reasoner, given the relevant lemmas, proves each subgoal automatically.
5.4. Proving the Soundness Theorem in Isabelle
Another application of rule induction is the Soundness Theorem:
H⊢p
H |= p
The proof is straightforward. The most difficult case is showing that H |= x ⊃ y
and H |= x imply H |= y. The Isabelle proof consists of three tactics. The
goalw command states the goal and expands the definition of logical consequence,
logcon def.
goalw PropThms.thy [logcon_def] "!!H. H |- p ==> H |= p";
Level 0
!!H. H |- p ==> H |= p
1. !!H. H |- p ==> ALL t. (ALL q:H. is_true(q, t)) --> is_true(p, t)
Applying rule induction to the premise H ⊢ p returns five subgoals:
by (eresolve_tac [PropThms.induct] 1);
Level 1
!!H. H |- p ==> H |= p
1. !!H p.
[| p : H; p : prop |] ==>
ALL t. (ALL q:H. is_true(q, t)) --> is_true(p, t)
2. !!H p q.
[| p : prop; q : prop |] ==>
ALL t. (ALL q:H. is_true(q, t)) --> is_true(p => q => p, t)
3. !!H p q r.
[| p : prop; q : prop; r : prop |] ==>
ALL t.
(ALL q:H. is_true(q, t)) -->
is_true((p => q => r) => (p => q) => p => r, t)
4. !!H p.
p : prop ==>
ALL t.
(ALL q:H. is_true(q, t)) -->
is_true(((p => Fls) => Fls) => p, t)
5. !!H p q.
[| H |- p => q;
ALL t. (ALL q:H. is_true(q, t)) --> is_true(p => q, t);
H |- p; ALL t. (ALL q:H. is_true(q, t)) --> is_true(p, t);
p : prop; q : prop |] ==>
ALL t. (ALL q:H. is_true(q, t)) --> is_true(q, t)
The equations for is true, shown in §5.1 above, are called is true Fls,
is true Var and is true Imp in Isabelle. Each is an ‘if and only if’ assertion.
The next command converts is true Imp into the rule
is true(p ⊃ q, t) is true(p, t)
is true(q, t)
and gives it to fast tac. The rule breaks down an induction hypothesis to solve
subgoal 5.
by (fast_tac (ZF_cs addSDs [is_true_Imp RS iffD1 RS mp]) 5);
Level 2
!!H. H |- p ==> H |= p
As above but without subgoal 5. . .
Rewriting by the recursion equations for is true, Isabelle’s simplifier solves the
other four subgoals. For example, the conclusion of subgoal 2 rewrites to
is true(x, t) → is true(y, t) → is true(x, t),
which is obviously true.
by (ALLGOALS
(simp_tac
(ZF_ss addsimps [is_true_Fls, is_true_Var, is_true_Imp])));
Level 3
!!H. H |- p ==> H |= p
No subgoals!
This proof executes in about six seconds.
5.5. Completeness
Completeness means every valid proposition is provable: if H |= p then H ⊢ p. We
consider first the special case where H = ∅ and later generalize H to be any finite
set.
A key lemma is the Law of the Excluded Middle, ‘q or not q.’ Since our propositions lack a disjunction symbol, the Law is expressed as a rule that reduces p to
two subgoals — one assuming q and one assuming ¬q:
cons(q, H) ⊢ p    cons(q ⊃ Fls, H) ⊢ p    q ∈ prop
H ⊢ p
5.5.1. The Informal Proof
Let t be a truth valuation and define hyps(p, t) by recursion on p:
hyps(Fls, t) = ∅
hyps(#v, t) = {#v}           if v ∈ t
              {#v ⊃ Fls}     if v ∉ t
hyps(p ⊃ q, t) = hyps(p, t) ∪ hyps(q, t)
Informally, hyps(p, t) returns a set containing each atom in p, or the negation of
that atom, depending on its value in t. The set hyps(p, t) is necessarily finite.
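A Python transcription of hyps (an illustration only, with propositions as tagged tuples as in the earlier sketches) makes this concrete: the result is a finite set holding, for each atom of p, either that atom or its negation.

```python
Fls = ('Fls',)
def Var(v): return ('#', v)
def Imp(p, q): return ('imp', p, q)

def hyps(p, t):
    # Structural recursion matching the three hyps equations.
    if p[0] == 'Fls':
        return set()
    if p[0] == '#':
        v = p[1]
        return {Var(v)} if v in t else {Imp(Var(v), Fls)}
    return hyps(p[1], t) | hyps(p[2], t)

# Atom 0 is true under t = {0}, atom 1 is false:
assert hyps(Imp(Var(0), Var(1)), {0}) == {Var(0), Imp(Var(1), Fls)}
assert hyps(Imp(Var(0), Var(1)), set()) == {Imp(Var(0), Fls),
                                            Imp(Var(1), Fls)}
```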
For this section, call H a basis of p if H ⊢ p. Assume that p is valid, ∅ |= p.
After proving a lemma by induction, we find that hyps(p, t) is a basis of p for every
truth valuation t:
p ∈ prop ∅ |= p
hyps(p, t) ⊢ p
The next step towards establishing ∅ ⊢ p is to reduce the size of the basis. If
hyps(p, t) = cons(#v, H), then the basis contains #v; removing v from t creates
an almost identical basis that contains ¬#v:
hyps(p, t − {v}) = cons(#v ⊃ Fls, H) − {#v}.
Applying the Law of the Excluded Middle with #v for q yields H ⊢ p, which is a
basis of p not mentioning #v at all. Repeating this operation yields smaller and
smaller bases of p. Since hyps(p, t) is finite, the empty set is also a basis. Thus we
obtain ∅ ⊢ p, as desired.
5.5.2. An Inductive Definition of Finite Sets
The formalization of this argument is complex and will be omitted here. But one
detail is relevant to recursive definitions: what is a finite set? Finite sets could be
defined by reference to the natural numbers, but they are more easily defined as a
least fixedpoint. The empty set is finite; if y is finite then cons(x, y) is also:
Fin(A) ≡ lfp(℘(A), λZ . {∅} ∪ (⋃y∈Z ⋃x∈A {cons(x, y)}))
Monotonicity is shown by the usual lemmas; the Knaster-Tarski Theorem immediately yields the introduction rules:
∅ ∈ Fin(A)

a ∈ A    b ∈ Fin(A)
cons(a, b) ∈ Fin(A)
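When A itself is finite, the least fixedpoint can be computed by iterating the operator from ∅ until it stabilizes. The Python sketch below is an illustration of that iteration: frozensets model the finite sets y, and cons(x, y) is {x} ∪ y.

```python
def F(A, Z):
    # The monotone operator: {empty} union U_{y in Z} U_{x in A} {cons(x, y)}
    return {frozenset()} | {y | {x} for y in Z for x in A}

def fin(A):
    Z = set()
    while F(A, Z) != Z:            # iterate up to the least fixedpoint
        Z = F(A, Z)
    return Z

# For a finite A, Fin(A) is the whole powerset of A:
assert fin({0, 1}) == {frozenset(), frozenset({0}),
                       frozenset({1}), frozenset({0, 1})}
```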
We have defined a finite powerset operator; Fin(A) consists of all the finite subsets
of A. The induction rule for Fin(A) resembles the rule for lists; its conclusion
is ψ(b), and its premises are b ∈ Fin(A), ψ(∅), and
ψ(cons(x, y)) with assumptions [x ∈ A   y ∈ Fin(A)   x ∉ y   ψ(y)]x,y
This rule strengthens the usual induction rule for lfp by discharging the extra
assumption x ∉ y. Its proof notes that x ∈ y implies cons(x, y) = y, rendering the
induction step trivial in that case.
Reasoning about finiteness is notoriously tricky, but finite set induction proves
many results about Fin(A) easily. The union of two finite sets is finite; the union
of a finite set of finite sets is finite; a subset of a finite set is finite:
b ∈ Fin(A) c ∈ Fin(A)
b ∪ c ∈ Fin(A)
C ∈ Fin(Fin(A))
C ∈ Fin(A)
c ⊆ b b ∈ Fin(A)
c ∈ Fin(A)
5.5.3. The Variable-Elimination Argument
Returning to the completeness theorem, we can now prove that hyps(p, t) is finite
by structural induction on p:
p ∈ prop
hyps(p, t) ∈ Fin(⋃v∈nat {#v, #v ⊃ Fls})
For the variable-elimination argument, we assume p ∈ prop and ∅ |= p, and prove
∀t . hyps(p, t) − hyps(p, t0 ) ⊢ p
by induction on the finite set hyps(p, t0 ). (Here t0 is simply a free variable.) Finally,
instantiating t to t0 and using A − A = ∅, we obtain ∅ ⊢ p.
This establishes an instance of the Completeness Theorem:
∅ |= p p ∈ prop
∅⊢p
To show H |= p implies H ⊢ p where H may be any finite set requires a further
application of finite set induction. I have not considered the case where H is infinite,
since it seems irrelevant to computational reasoning.
6. Related Work and Conclusions
This theory is intended to support machine proofs about recursive definitions.
Every set theorist knows that ZF can handle recursion in principle, but machine
proofs require assertions to be formalized correctly and conveniently. The derivations of the recursion operators wfrec, transrec and Vrec are particularly sensitive to formal details. Let us recall the chief problems, and their solutions:
− Inductively defined sets are expressed as least fixedpoints, applying the
Knaster-Tarski Theorem over a suitable set.
− Recursive functions are defined by well-founded recursion and its derivatives,
such as transfinite recursion.
− Recursive data structures are expressed by applying the Knaster-Tarski Theorem to a set with strong closure properties.
I have not attempted to characterize the class of recursive definitions admitted by
these methods, but they are extremely general.
The overall approach is not restricted to ZF set theory. I have applied it, with
a few changes, to Isabelle’s implementation of higher-order logic. It may be applicable to weaker systems such as intuitionistic second-order logic and intuitionistic
ZF set theory. Thus, we have a generic treatment of recursion for generic theorem
proving.
In related work, Noël [18] has proved many theorems about recursion using
Isabelle’s set theory, including well-founded recursion and a definition of lists. But
Noël does not develop a general theory of recursion. Ontic [10] provides strong
support for recursively defined functions and sets. Ontic’s theory of recursion differs
from mine; it treats recursive functions as least fixedpoints, with no use of well-founded relations.
The Knaster-Tarski Theorem can be dropped. If h is continuous then
⋃n∈ω h^n(∅) is its least fixedpoint. Induction upon n yields computation induction, which permits reasoning about the least fixedpoint. Ontic and Noël both use
the construction, which generalizes to larger ordinals, but I have used it only to
define univ and eclose.
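As a concrete instance of this construction, the operator h(X) = r ∪ (r ; X), for a finite relation r, is continuous, and iterating it from ∅ computes the transitive closure of r (cf. §2.5). A Python sketch, for illustration only:

```python
def compose(r, s):
    # Relational composition r ; s
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}

def lfp_iter(h):
    X = set()
    while h(X) != X:               # U_{n in omega} h^n(empty)
        X = h(X)
    return X

r = {(0, 1), (1, 2), (2, 3)}
tc = lfp_iter(lambda X: r | compose(r, X))
assert tc == {(0, 1), (1, 2), (2, 3), (0, 2), (1, 3), (0, 3)}
```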
The Knaster-Tarski Theorem has further applications in its dual form, which
yields greatest fixedpoints. These crop up frequently in Computer Science, mainly
in connection with bisimulation proofs [16].
Recently I have written an ML package to automate recursive definitions in
Isabelle ZF [24]. My package is inspired by T. Melham’s inductive definition packages for the Cambridge HOL system [5, 15]. It is unusually flexible because of its
explicit use of the Knaster-Tarski Theorem. Monotone operators may occur in the
introduction rules, such as the occurrence of list in the definition of term(A)
above.
Given the desired form of the introduction rules, my package makes fixedpoint
definitions. Then it proves the introduction and induction rules. It can define the
constructors for a recursive data structure and prove their freeness. The package
has been applied to most of the inductive definitions presented in this paper. It
supports inductively defined relations and mutual recursion.
The Isabelle ZF theory described in this paper is available by ftp. For more
information, please send electronic mail to the author,
[email protected].
Acknowledgements
Martin Coen, Sara Kalvala and Philippe Noël commented on this paper. Tobias
Nipkow (using Isabelle’s higher-order logic) contributed the propositional logic
example of §5. Thomas Melham suggested defining the finite powerset operator.
Thanks are also due to Deepak Kapur (the editor) and to the four referees.
The research was funded by the EPSRC (grants GR/G53279, GR/H40570)
and by the ESPRIT Basic Research Actions 3245 ‘Logical Frameworks’ and 6453
‘Types.’
Notes
1. This means the two sets are in one-to-one correspondence and have equivalent orderings.
2. The bnd mono premises could be weakened, but to little purpose, because they hold in typical uses of lfp.
3. All Isabelle timings are on a Sun SPARCstation ELC.
4. The approach could be generalized to non-well-founded set theory [2] by verifying that the set univ(A), defined in §4.2, is well-founded.
5. There is no loss of generality: you can always apply transitive closure again.
6. The traditional Axiom of Infinity has an existentially quantified variable in place of Inf. Introducing the constant is conservative, and allows nat to be defined explicitly.
7. Earlier versions of Isabelle ZF defined list(A) to satisfy the recursion list(A) = {∅} ∪ (A × list(A)). Then ∅ stood for the empty list and ⟨a, l⟩ for the list with head a and tail l; note that ∅ does not equal any pair. The present approach follows a uniform treatment of data structures.
8. This version takes quadratic time but it is easier to reason about than a linear-time reverse.
9. The other hypotheses, H ⊢ x ⊃ y and H ⊢ x, are typical of strong rule induction [5]; they come for free from the induction rule for lfp.
References
1. Abramsky, S., The lazy lambda calculus, In Research Topics in Functional Programming, D. A. Turner, Ed. Addison-Wesley, 1977, pp. 65–116
2. Aczel, P., Non-Well-Founded Sets, CSLI, 1988
3. Bledsoe, W. W., Non-resolution theorem proving, Art. Intel. 9 (1977), 1–35
4. Boyer, R. S., Moore, J. S., A Computational Logic, Academic Press, 1979
5. Camilleri, J., Melham, T. F., Reasoning with inductively defined relations in the HOL theorem prover, Tech. Rep. 265, Comp. Lab., Univ. Cambridge, Aug. 1992
6. Coquand, T., Paulin, C., Inductively defined types, In COLOG-88: International Conference on Computer Logic (1990), Springer, pp. 50–66, LNCS 417
7. Davey, B. A., Priestley, H. A., Introduction to Lattices and Order, Cambridge Univ. Press, 1990
8. Devlin, K. J., Fundamentals of Contemporary Set Theory, Springer, 1979
9. Girard, J.-Y., Proofs and Types, Cambridge Univ. Press, 1989, Translated by Yves LaFont and Paul Taylor
10. Givan, R., McAllester, D., Witty, C., Zalondek, K., Ontic: Language specification and user's manual, Tech. rep., MIT, 1992, Draft 4
11. Halmos, P. R., Naive Set Theory, Van Nostrand, 1960
12. Manna, Z., Waldinger, R., Deductive synthesis of the unification algorithm, Sci. Comput. Programming 1, 1 (1981), 5–48
13. Martin-Löf, P., Intuitionistic type theory, Bibliopolis, 1984
14. McDonald, J., Suppes, P., Student use of an interactive theorem prover, In Automated Theorem Proving: After 25 Years, W. W. Bledsoe, D. W. Loveland, Eds. American Mathematical Society, 1984, pp. 315–360
15. Melham, T. F., Automating recursive type definitions in higher order logic, In Current Trends in Hardware Verification and Automated Theorem Proving, G. Birtwistle, P. A. Subrahmanyam, Eds. Springer, 1989, pp. 341–386
16. Milner, R., Communication and Concurrency, Prentice-Hall, 1989
17. Milner, R., Tofte, M., Harper, R., The Definition of Standard ML, MIT Press, 1990
18. Noël, P., Experimenting with Isabelle in ZF set theory, J. Auto. Reas. 10, 1 (1993), 15–58
19. Nordström, B., Terminating general recursion, BIT 28 (1988), 605–619
20. Nordström, B., Petersson, K., Smith, J., Programming in Martin-Löf's Type Theory. An Introduction, Oxford University Press, 1990
21. Paulson, L. C., Constructing recursion operators in intuitionistic type theory, J. Symb. Comput. 2 (1986), 325–355
22. Paulson, L. C., Set theory for verification: I. From foundations to functions, J. Auto. Reas. 11, 3 (1993), 353–389
23. Paulson, L. C., A concrete final coalgebra theorem for ZF set theory, Tech. rep., Comp. Lab., Univ. Cambridge, 1994
24. Paulson, L. C., A fixedpoint approach to implementing (co)inductive definitions, In 12th Conf. Auto. Deduct. (1994), A. Bundy, Ed., Springer, pp. 148–161, LNAI 814
25. Schroeder-Heister, P., Generalized rules for quantifiers and the completeness of the intuitionistic operators &, ∨, ⊃, ⊥, ∀, ∃, In Computation and Proof Theory: Logic Colloquium '83 (1984), Springer, pp. 399–426, Lecture Notes in Mathematics 1104
26. Smith, J., The identification of propositions and types in Martin-Löf's type theory: A programming example, In Foundations of Computation Theory (1983), M. Karpinski, Ed., Springer, pp. 445–456, LNCS 158
27. Suppes, P., Axiomatic Set Theory, Dover, 1972