Proceedings of the 5th
International Workshop on
Higher-Order Rewriting
– HOR 2010 –
A FLoC affiliated workshop held on 14 July 2010
Edinburgh, UK
Preface
HOR 2010 is a forum to present work concerning all aspects of higher-order
rewriting. The aim is to provide an informal and friendly setting to discuss recent work and work in progress. Previous editions of HOR were held in Copenhagen – Denmark (HOR 2002), Aachen – Germany (HOR 2004), Seattle – USA
(HOR 2006) and Paris – France (HOR 2007).
This year we had a total of 8 papers presented, all addressing interesting ideas. We also had two invited speakers, to whom I would like to give thanks:
• Maribel Fernández (King’s College London) who talked about Closed nominal rewriting: properties and applications and
• Silvia Ghilezan (University of Novi Sad) who talked about Computational
interpretations of logic.
My appreciation also goes to the members of the PC (Zena Ariola, Frédéric Blanqui, Mariangiola Dezani-Ciancaglini and Roel de Vrijer) for lending their time and expertise, to the referees, and to Delia Kesner and Femke van Raamsdonk for providing valuable support. Thanks also to GDR-IM, which awarded funds to HOR 2010 that were used to support the presentation of papers by students.
Finally, I would like to thank the organizers of FLoC 2010 and affiliated
events for contributing towards such an exciting event.
Eduardo Bonelli (Universidad Nacional de Quilmes, Argentina)
Contents

1. Maribel Fernández (Invited Speaker)
   Closed nominal rewriting: properties and applications (pp. 3–5)
2. Alejandro Díaz-Caro, Simon Perdrix, Christine Tasson and Benoît Valiron
   Equivalence of algebraic lambda-calculi (pp. 6–11)
3. Harald Zankl, Nao Hirokawa and Aart Middeldorp
   Uncurrying for Innermost Termination and Derivational Complexity (pp. 12–16)
4. Giulio Manzonetto and Paolo Tranquilli
   A Calculus of Coercions Proving the Strong Normalization of MLF (pp. 17–21)
5. Cynthia Kop (Student Talk)
   A new formalism for higher-order rewriting (pp. 22–26)
6. Silvia Ghilezan (Invited Speaker)
   Computational interpretations of logic (pp. 27–28)
7. Thibaut Balabonski
   On the Implementation of Dynamic Patterns (pp. 29–33)
8. Kristoffer Rose
   Higher-order Rewriting for Executable Compiler Specifications (pp. 34–40)
9. Ariel Mendelzon, Alejandro Ríos and Beta Ziliani
   Swapping: a natural bridge between named and indexed explicit substitution calculi (pp. 41–46)
10. Delia Kesner, Carlos Lombardi and Alejandro Ríos
    Standardisation for constructor based pattern calculi (pp. 47–52)
Closed nominal rewriting:
properties and applications
Maribel Fernández
King’s College London, Strand, London WC2R 2LS, UK
[email protected]
Rewriting systems (see, for instance, [6, 1, 20]) have been used to model the dynamics (deduction and
evaluation for example) of formal systems described by abstract syntax trees, also called terms. In the
presence of binding, α-equivalence, that is, the equivalence relation that equates terms modulo renaming
of bound variables, must be taken into account. One alternative is to define binders through functional
abstraction, taking α-equivalence as a primitive, implicit notion, and working with equivalence classes
of terms. For instance, Combinatory Reduction Systems (CRS) [14], Higher-order Rewrite Systems
(HRS) [17] and Expression Reduction Systems (ERS) [13] use the λ -calculus as meta-language and
terms are defined “modulo alpha”. The price to pay is that we can no longer rely on simple notions such
as structural induction on terms and syntactic unification.
Alternatively, the nominal approach [12, 19] distinguishes between object-level variables (written
a, b, c and called atoms), which can be abstracted but behave similarly to name constants, and meta-level variables or just variables (X, Y, Z, ...), which are first-order in that there are no binders for them
and substitution does not avoid capture of free atoms. In nominal terms variables have arity zero, as in
ERSs (but unlike ERSs, substitution of atoms for terms is not a primitive notion). The α-equivalence
relation is axiomatised in a syntax-directed manner (thus we can reason by structural induction) using
a freshness relation between atoms and terms, written a#t (i.e., “a is fresh for t”). Nominal rewriting
systems (NRSs) [7] are rewriting systems on nominal terms. For example, β -reduction and η-expansion
rules for the λ -calculus are written as:
        app(λ([a]M), N) → subst([a]M, N)
a#X ⊢   X → λ([a]app(X, a))

where the substitution in the β-rule is represented by a function symbol, also defined by rewrite rules. For instance, we can add the following rules, where we sugar subst([a]M, N) to M{a↦N}, to propagate substitutions avoiding capture:

(σvar)           a{a↦X} → X
(σε)    a#Y ⊢    Y{a↦X} → Y
(σapp)           app(X, X′){a↦Y} → app(X{a↦Y}, X′{a↦Y})
(σλ)    b#Y ⊢    (λ[b]X){a↦Y} → λ[b](X{a↦Y})
We refer to [7] for more examples of nominal rewriting rules.
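For intuition, on terms without meta-level variables the freshness relation can be decided by a straightforward structural recursion. The Haskell sketch below is our own illustration, not code from the cited papers; for a meta-level variable X, freshness genuinely depends on a freshness context such as a#X, so the sketch conservatively answers False there.

```haskell
-- Our own sketch of nominal terms and the freshness check a # t.
data Term
  = Atom Char          -- object-level names a, b, c
  | Var String         -- meta-level variables X, Y, Z
  | Abs Char Term      -- abstraction [a]t
  | Fun String [Term]  -- function symbols such as app, subst

fresh :: Char -> Term -> Bool
fresh a (Atom b)   = a /= b
fresh _ (Var _)    = False  -- unknown without a freshness context
fresh a (Abs b t)  = a == b || fresh a t
fresh a (Fun _ ts) = all (fresh a) ts
```

For example, fresh 'a' (Abs 'a' (Atom 'a')) is True, since the occurrence of a is abstracted, while fresh 'a' (Fun "app" [Atom 'a']) is False.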
A step of nominal rewriting involves matching modulo α-equivalence, which is decidable [21]. For
arbitrary NRSs, checking whether there is a rule that can be applied to a given term is an NP-complete
problem in general [3]. However, if we only use closed rules, nominal matching is sufficient, and can
be implemented in linear time and space [2]. Closed rules are, roughly speaking, rules that preserve
abstracted atoms during reductions (as in the examples above); all atoms occur under abstractions in
closed rules. CRSs, ERSs, and HRSs impose similar conditions on rules, by definition (ERSs impose a
condition on matching substitutions, which corresponds to our notion of closed rules). We refer to [9]
for an encoding of CRSs using closed nominal rules.
In addition to efficient matching, closed NRSs inherit other good properties of first-order rewriting:
for instance, we have a critical pair lemma (see [7]) which can be used to derive confluence of terminating systems. Confluent and terminating NRSs with closed rules have a decidable equational theory [8] (see [11, 4] for definitions and examples of nominal equational theories). In other words, if a nominal equational theory can be represented by a confluent and terminating closed NRS, then equality in the theory can be decided by rewriting. However, confluence and termination are both undecidable properties. Sufficient conditions for confluence are given in [7]. Recently, we have also shown that standard
orderings used to check termination of first-order rewriting systems, such as the recursive path ordering
(rpo) [5], can be generalised to deal with nominal terms and α-equivalence [10]. The nominal recursive
path ordering inherits the properties of the rpo, and can be used to check termination of NRSs. Using
this result, we have designed a completion procedure à la Knuth and Bendix [15] for closed NRSs.
The principle behind completion is that if a given equational theory is presented by a terminating
but not confluent rewrite system, then we can try to transform it into a confluent one by computing its
critical pairs and adding rules to join them, preserving termination (but completion may fail, or may not
terminate). Completion has been generalised to systems that use higher-order functions but no binders, i.e. with a first-order syntax [16]. In the case of higher-order rewriting systems, not only do we need an ordering that can deal with terms including binders, but also, after computing a critical pair, we need to be able to add the corresponding rules if the pair is not joinable. Adding these rules may not always be possible, as mentioned in [18], due to the syntactic or type restrictions used in higher-order rewriting formalisms. So far, no completion procedures are available for CRSs, ERSs or HRSs. However, NRSs do not rely on a typed language as HRSs do, and do not impose the syntactic restrictions that ERSs and CRSs impose. A completion procedure can indeed be defined for NRSs when the rules are closed. This
result opens the way for the development of tools for automated reasoning in equational theories that
include binders.
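Schematically, the completion procedure alluded to above is the classical loop: compute critical pairs, normalise both sides, and orient and add any pair that does not join. The Haskell sketch below is our own summary; cps (critical pairs), nf (normalisation) and orient (orientation, e.g. by the nominal recursive path ordering) are assumed parameters, and, as stated in the text, the loop may fail (when orientation is impossible) or run forever.

```haskell
type Rule t = (t, t)

-- A schematic Knuth-Bendix-style completion loop.
complete :: Eq t
         => ([Rule t] -> [(t, t)])      -- critical pairs of the system
         -> ([Rule t] -> t -> t)        -- normalisation with the rules
         -> ((t, t) -> Maybe (Rule t))  -- orient a pair, if possible
         -> [Rule t] -> Maybe [Rule t]
complete cps nf orient = go
  where
    go rules =
      case [ (u, v) | (s, t) <- cps rules
                    , let u = nf rules s
                    , let v = nf rules t
                    , u /= v ] of
        []      -> Just rules            -- all critical pairs join
        (p : _) -> case orient p of
                     Nothing -> Nothing  -- completion fails
                     Just r  -> go (r : rules)
```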
References
[1] F. Baader and T. Nipkow. Term rewriting and all that. Cambridge University Press, Great Britain, 1998.
[2] C. Calvès and M. Fernández. Matching and Alpha-Equivalence Check for Nominal Terms. Journal of
Computer and System Sciences, Special issue: Selected papers from WOLLIC 2008. Elsevier, 2009.
[3] J. Cheney. The complexity of equivariant unification. In Automata, Languages and Programming, Proceedings of the 31st Int. Colloquium, ICALP 2004, volume 3142 of Lecture Notes in Computer Science. Springer,
2004.
[4] R. A. Clouston and A. M. Pitts. Nominal Equational Logic. In Computation, Meaning and Logic. Articles
dedicated to Gordon Plotkin. L. Cardelli, M. Fiore and G. Winskel (eds.), volume 1496, Electronic Notes in
Theoretical Computer Science, Elsevier, 2007.
[5] N. Dershowitz. Orderings for Term-Rewriting Systems. Theoretical Computer Science, vol. 17, no. 3, pp.
279-301. Elsevier, 1982.
[6] N. Dershowitz and J.-P. Jouannaud. Rewrite Systems. In J. van Leeuwen, editor, Handbook of Theoretical
Computer Science: Formal Methods and Semantics, volume B. North-Holland, 1989.
[7] M. Fernández and M.J. Gabbay. Nominal rewriting. Information and Computation, Volume 205(6), 2007.
M. Fernández
5
[8] M. Fernández and M.J. Gabbay. Closed nominal rewriting and efficiently computable nominal algebra equality. In Proceedings of LFMTP 2010, EPTCS.
[9] M. Fernández, M.J. Gabbay, and I. Mackie. Nominal rewriting systems. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP’04), Verona, Italy. ACM Press, 2004.
[10] M. Fernández and A. Rubio. Reduction Orderings and Completion for Rewrite Systems with Binding. Available from http://www.dcs.kcl.ac.uk/staff/maribel/
[11] M. J. Gabbay and A. Mathijssen. Nominal Algebra. Proceedings of the 18th Nordic Workshop on Programming Theory (NWPT’06), 2006.
[12] M. J. Gabbay and A. M. Pitts. A new approach to abstract syntax involving binders. In 14th Annual Symposium on Logic in Computer Science, pages 214–224. IEEE Computer Society Press, 1999.
[13] Z. Khasidashvili. Expression reduction systems. In Proceedings of I.Vekua Institute of Applied Mathematics,
volume 36, pages 200–220, Tbilisi, 1990.
[14] J.-W. Klop, V. van Oostrom, and F. van Raamsdonk. Combinatory reduction systems, introduction and survey.
Theoretical Computer Science, 121:279–308, 1993.
[15] D. Knuth and P. Bendix. Simple word problems in universal algebras. In Computational Problems in Abstract
Algebra, ed. J. Leech, 263–297. Oxford: Pergamon Press, 1970.
[16] K. Kusakari and Y. Chiba. A higher-order Knuth-Bendix procedure and its applications. IEICE Transactions
on Information and Systems, Vol. E90-D,4, 707–715, 2007.
[17] R. Mayr and T. Nipkow. Higher-order rewrite systems and their confluence. Theoretical Computer Science,
192:3–29, 1998.
[18] T. Nipkow and C. Prehofer. Higher-Order Rewriting and Equational Reasoning, In W. Bibel and P. Schmitt,
editors, Automated Deduction — A Basis for Applications. Volume I: Foundations, Applied Logic Series,
volume 8, pages 399–430. Kluwer, 1998.
[19] A. M. Pitts. Nominal logic, a first order theory of names and binding. Information and Computation,
186:165–193, 2003.
[20] Terese. Term Rewriting Systems. Volume 55 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2003.
[21] C. Urban, A. M. Pitts, and M. J. Gabbay. Nominal unification. Theoretical Computer Science, 323:473–497,
2004.
Equivalence of Algebraic λ-calculi
– extended abstract∗ –

Alejandro Díaz-Caro (LIG, Université de Grenoble, France) [email protected]
Simon Perdrix (CNRS, LIG, Université de Grenoble, France) [email protected]
Christine Tasson (CEA-LIST, MeASI, France) [email protected]
Benoît Valiron (LIG, Université de Grenoble, France) [email protected]

∗ A full version of this paper with all the proofs is available on arXiv.
We examine the relationship between the algebraic λ-calculus (λalg) [9], a fragment of the differential λ-calculus [4], and the linear-algebraic λ-calculus (λlin) [1], a candidate λ-calculus for quantum computation. Both calculi are algebraic: each one is equipped with an additive and a scalar-multiplicative structure, and the set of terms is closed under linear combinations. We answer the conjectured question of the simulation of λalg by λlin [2], and establish the reverse simulation of λlin by λalg as well. Our proof relies on the observation that λlin is essentially call-by-value, while λalg is call-by-name. The former simulation uses the standard notion of thunks, while the latter is based on an algebraic extension of the continuation passing style. This result is a step towards an extension of the call-by-value / call-by-name duality to algebraic λ-calculi.
1 Introduction
Context. Two algebraic versions of the λ -calculus arise independently in distinct contexts: the algebraic
λ -calculus (λalg ) and the linear algebraic λ -calculus (λlin ). The former has been introduced in the context
of linear logic as a fragment of the differential λ -calculus. The latter has been introduced as a candidate
λ -calculus for quantum computation: in λlin , a linear combination of terms reflects the phenomenon of
superposition, i.e. the capability for a quantum system to be in two or more states at the same time.
Linearity of functions and arguments. In both languages, functions which are linear combinations of
terms are interpreted pointwise: (α. f + β .g) x = α.( f ) x + β .(g) x, where “.” is the external product.
The two languages differ on the treatment of the arguments. In λlin , any function is considered as a
linear map: ( f ) (α.x + β .y) →∗ℓ α.( f ) x + β .( f ) y, reflecting the fact that any quantum evolution is
a linear map; while λalg has a call-by-name evolution: (λ x M) N →a M[x := N], without restriction
on N. As a consequence, the evolutions are different as illustrated by the following example. In λlin ,
(λ x (x) x) (α.y + β .z) →∗ℓ α.(y) y + β .(z) z while in λalg , (λ x (x) x) (α.y + β .z) →a (α.y + β .z) (α.y +
β .z) =a α 2 .(y) y + (αβ ).(y) z + (β α).(z) y + β 2 .(z) z.
Simulations. These two languages behave in different manners. An essential question is whether they would nonetheless be equivalent (and in which manner). Indeed, a positive answer would link two distinct research areas and unify work done in linear logic and work on quantum computation. It has been conjectured [2] that λlin simulates λalg. Our contribution is to prove this formally (Section 3.1) and also to provide the converse proof, that λalg simulates λlin (Section 3.2). The first simulation
uses the encoding, known as “thunk” in the folklore [6], which is based on “freezing” the evaluation
of arguments by systematically encapsulating them into abstractions (that is, making them into values).
It has been extensively studied in the case of the regular, untyped lambda-calculus [5]. The converse simulation is based on an algebraic extension of the continuation passing style encoding [8].
Modifications to the original calculi. In this paper we slightly modify the two languages. The only modification to λalg consists in disallowing reduction under λ, so that for any M, λx M is a value. As a consequence, λ is not linear: λx (α.M + β.N) ≠ α.λx M + β.λx N. In λlin, we restrict the application of several rewriting rules in order to make the rules more coherent with a call-by-value leftmost-redex evaluation. For instance, the rule (M + N) L →ℓ (M) L + (N) L is restricted to the case where both M + N and L are values.
Finally, several distinct techniques can be used to make an algebraic calculus confluent. In λlin ,
restrictions on reduction rules are introduced, e.g. α.M + β .M →ℓ (α + β ).M if M is closed normal.
In λalg a restriction to positive scalars is proposed. Finally, one can use a typing system to guarantee
confluence. In this paper, we assume that one of these techniques – without specifying explicitly which
one – is used to make the calculi confluent.
2 Algebraic λ-calculi

The languages λlin and λalg share the same syntax, defined as follows:

M, N, L ::= V | (M) N | M + N | α.M    (terms)
U, V, W ::= 0 | B | α.B | V + W        (values)
B ::= x | λx M                         (basis terms)
where α represents scalars which may themselves be defined by a term grammar, and endowed with a
term rewrite system compatible with their basic ring operations (+, ×). Formally it is captured in the
definition [1, sec. III – def. 1] of a scalar rewrite system, but for our purpose it is sufficient to think of
them as a ring.
The main differences between the two languages are the β -reduction and the algebraic linearity of
function arguments. If U,V and W are values, and B is a basis term, the rules are defined by:
(λx M) N     →a  M[x := N]         (βλalg)
(λx M) B     →ℓ  M[x := B]         (βλlin)
(U) (V + W)  →ℓ  (U) V + (U) W     (γλlin)
(V) (α.W)    →ℓ  α.(V) W           (γλlin)
(V) 0        →ℓ  0                 (γλlin)
In both languages, + is associative and commutative, i.e. (M + N) + L = M + (N + L) and M + N =
N + M. Notwithstanding their different axiomatizations – one based on equations and the other one on rewriting rules – linear combinations of terms are treated in the same way: the set of terms behaves as a module over the ring of scalars in both languages.
In λalg the following algebraic equality is defined¹:

¹ The reader should not be surprised by noticing that two terms that are equal under =a may reduce to terms that are not equal any more. Indeed, it is already the case with the syntactical equality of the λ-calculus.
(M + N) L    =a  (M) L + (N) L     (λalg)
(α.M) N      =a  α.(M) N           (λalg)
0 + M        =a  M                 (λalg)
α.(M + N)    =a  α.M + α.N         (λalg)
α.M + β.M    =a  (α + β).M         (λalg)
α.(β.M)      =a  (α × β).M         (λalg)
(0) M        =a  0                 (λalg)
1.M          =a  M                 (λalg)
0.M          =a  0                 (λalg)
α.0          =a  0                 (λalg)
In contrast, the ring structure and the linearity of functions in λlin are provided by reduction rules. Let U, V and W stand for values²; the rules are defined as follows.
(U + V) W    →ℓ  (U) W + (V) W     (λlin)
(α.V) W      →ℓ  α.(V) W           (λlin)
α.(M + N)    →ℓ  α.M + α.N         (λlin)
α.M + β.M    →ℓ  (α + β).M         (λlin)
α.M + M      →ℓ  (α + 1).M         (λlin)
M + M        →ℓ  (1 + 1).M         (λlin)
0 + M        →ℓ  M                 (λlin)
α.(β.M)      →ℓ  (α × β).M         (λlin)
(0) V        →ℓ  0                 (λlin)
1.M          →ℓ  M                 (λlin)
0.M          →ℓ  0                 (λlin)
α.0          →ℓ  0                 (λlin)
The context rules for both languages are

M → M′ implies (M) N → (M′) N,
M → M′ implies M + N → M′ + N,
N → N′ implies M + N → M + N′,
M → M′ implies α.M → α.M′,

together with the additional context rule, only for λlin:

M →ℓ M′ implies (V) M →ℓ (V) M′.
The β-reduction of λalg corresponds to a call-by-name evaluation, while the β-reduction of λlin occurs only if the argument is a basis term, i.e. a variable or an abstraction. The γ-rules, only available in λlin, allow linearity in the arguments.
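To make the value restrictions in these rules concrete, the following Haskell sketch (our own encoding, not code from [1]; scalars simplified to rationals) implements the term syntax and the application rules of λlin at the root of a term.

```haskell
-- Our own sketch of lambda_lin terms; scalars are rationals.
data Term
  = Var String | Lam String Term | App Term Term
  | Zero | Add Term Term | Mul Rational Term
  deriving (Eq, Show)

-- Basis terms B and values U, V, W of the grammar above.
isBasis :: Term -> Bool
isBasis (Var _)   = True
isBasis (Lam _ _) = True
isBasis _         = False

isValue :: Term -> Bool
isValue Zero      = True
isValue (Mul _ b) = isBasis b
isValue (Add u v) = isValue u && isValue v
isValue t         = isBasis t

-- One root step of the application rules, restricted to values as in
-- the modified calculus; Nothing if no rule applies at the root.
stepApp :: Term -> Maybe Term
stepApp (App (Add u v) w)
  | all isValue [u, v, w]  = Just (Add (App u w) (App v w))
stepApp (App u (Add v w))
  | all isValue [u, v, w]  = Just (Add (App u v) (App u w))
stepApp (App v (Mul a b))
  | isValue v && isBasis b = Just (Mul a (App v b))
stepApp (App v Zero)
  | isValue v              = Just Zero
stepApp _                  = Nothing
```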
3 Simulations

3.1 λlin simulates λalg
We consider the following encoding ⦅·⦆ : λalg → λlin. The variables f and z are chosen fresh.

⦅0⦆ = 0,
⦅x⦆ = (x) f,
⦅λx M⦆ = λx ⦅M⦆,
⦅(M) N⦆ = (⦅M⦆) λz ⦅N⦆,
⦅M + N⦆ = ⦅M⦆ + ⦅N⦆,
⦅α.M⦆ = α.⦅M⦆.
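Transcribed directly, the encoding is a one-pass structural recursion; a sketch reusing the Term type from the block above (the strings "f" and "z" stand for the fixed fresh variables f and z).

```haskell
-- Sketch of the thunk encoding: arguments are frozen under a dummy
-- abstraction, variables are forced by applying them to f.
thunk :: Term -> Term
thunk Zero      = Zero
thunk (Var x)   = App (Var x) (Var "f")             -- force the thunk
thunk (Lam x m) = Lam x (thunk m)
thunk (App m n) = App (thunk m) (Lam "z" (thunk n)) -- freeze the argument
thunk (Add m n) = Add (thunk m) (thunk n)
thunk (Mul a m) = Mul a (thunk m)
```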
One could be tempted to prove a result along the lines of: M →a N implies ⦅M⦆ →∗ℓ ⦅N⦆. Unfortunately this does not work. Indeed, the encoding introduces "administrative" redexes, as in the following example (where I = λx x). Although (λx λy (y) x) I →a λy (y) I, the terms

⦅(λx λy (y) x) I⦆ = (λx λy ((y) f) (λz (x) f)) (λz λx (x) f) →∗ℓ λy ((y) f) (λz (λz λx (x) f) f),
⦅λy (y) I⦆ = λy ((y) f) (λz λx (x) f)

are not equal: there is an "administrative" redex hidden in the first expression. This redex does not carry any information; it is only introduced by the encoding.
In order to clear these redexes, we define the map Admin as follows.

² Notice that in λlin a value is not necessarily in normal form. For instance the value λx x + λx x reduces to 2.λx x. The reductions of values result solely from the ring structure, and all values are normalizing terms.
Admin 0 = 0,
Admin x = x,
Admin ((λf M) f) = Admin M,
Admin (M) N = (Admin M) Admin N,
Admin λx M = λx Admin M,
Admin (M + N) = Admin M + Admin N,
Admin (α.M) = α.Admin M.
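Continuing the same sketch, Admin is a structural recursion whose only non-trivial case contracts the administrative redexes (λf M) f left behind by the encoding.

```haskell
-- Sketch of Admin: contract administrative redexes, recurse elsewhere.
admin :: Term -> Term
admin (App (Lam "f" m) (Var "f")) = admin m   -- administrative redex
admin Zero      = Zero
admin (Var x)   = Var x
admin (Lam x m) = Lam x (admin m)
admin (App m n) = App (admin m) (admin n)
admin (Add m n) = Add (admin m) (admin n)
admin (Mul a m) = Mul a (admin m)
```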
Theorem 3.1 For any program (i.e. closed term) M, if M →a N and ⦅N⦆ →∗ℓ V for a value V, then there exists M′ such that ⦅M⦆ →∗ℓ M′ and Admin M′ = Admin V.
Proof. By induction on the derivation of M →a N. □
Lemma 3.2 If W is a value and M a term such that AdminW = Admin M, then there exists a value V
such that M →∗ℓ V and AdminW = AdminV .
Lemma 3.3 If V is a closed value, then ⦅V⦆ is a value.
Theorem 3.4 (Simulation) For any program (i.e. closed term) M, if M →∗a V for a value V, then there exists a value W such that ⦅M⦆ →∗ℓ W and Admin W = Admin ⦅V⦆.

Proof. The proof is by induction on the length of the reduction sequence M →∗a V. If M = V, this is trivially true by choosing W = ⦅V⦆, which is a value since V is closed, by Lemma 3.3. Now, suppose the result holds for the reduction N →∗a V and suppose that M →a N. By the induction hypothesis, ⦅N⦆ →∗ℓ W for some value W such that Admin W = Admin ⦅V⦆. From Theorem 3.1, there exists M′ such that ⦅M⦆ →∗ℓ M′ and Admin M′ = Admin W. From Lemma 3.2, without loss of generality we can choose this M′ to be a value W′. This closes the proof of the theorem: we have indeed the equality Admin W′ = Admin ⦅V⦆. □
3.2 λalg simulates λlin

To prove the simulation of λlin by λalg we use the following encoding. This is an algebraic extension of the continuation passing style used to prove that call-by-name simulates call-by-value in the regular λ-calculus [8].
Let ⟦·⟧ : λlin → λalg be the following encoding. The variables f, g and h are chosen fresh.

⟦x⟧ = λf (f) x,
⟦0⟧ = 0,
⟦λx M⟧ = λf (f) λx ⟦M⟧,
⟦(M) N⟧ = λf (⟦M⟧) λg (⟦N⟧) λh ((g) h) f,
⟦α.M⟧ = λf (α.⟦M⟧) f,
⟦M + N⟧ = λf (⟦M⟧ + ⟦N⟧) f.
Let Ψ be the encoding for values defined by:

Ψ(x) = x,
Ψ(0) = 0,
Ψ(λx M) = λx ⟦M⟧,
Ψ(α.V) = α.Ψ(V),
Ψ(V + W) = Ψ(V) + Ψ(W).
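Both ⟦·⟧ and Ψ also transcribe directly; a sketch continuing the Term type above, with "f", "g" and "h" standing for the fixed fresh variables (psi is only meant to be applied to values).

```haskell
-- Sketch of the algebraic CPS encoding and the value translation Psi.
cps :: Term -> Term
cps (Var x)   = Lam "f" (App (Var "f") (Var x))
cps Zero      = Zero
cps (Lam x m) = Lam "f" (App (Var "f") (Lam x (cps m)))
cps (App m n) = Lam "f" (App (cps m) (Lam "g"
                  (App (cps n) (Lam "h"
                    (App (App (Var "g") (Var "h")) (Var "f"))))))
cps (Mul a m) = Lam "f" (App (Mul a (cps m)) (Var "f"))
cps (Add m n) = Lam "f" (App (Add (cps m) (cps n)) (Var "f"))

psi :: Term -> Term
psi (Var x)   = Var x
psi Zero      = Zero
psi (Lam x m) = Lam x (cps m)
psi (Mul a v) = Mul a (psi v)
psi (Add v w) = Add (psi v) (psi w)
```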
Using this encoding, it is possible to prove that λalg simulates λlin for any program reducing to a value:
Theorem 3.5 (Simulation) For any program M, if M →∗ℓ V where V is a value, then ⟦M⟧ (λx x) →∗a Ψ(V).
Thanks to the subtle modifications made to the original algebraic calculi (presented in the introduction), the proof in [8] can easily be extended to the algebraic case. We first define a convenient infix operation (:) that captures the behaviour of the translated terms. For example, if B is a base term, i.e. a variable or an abstraction, then its translation into λalg is ⟦B⟧ = λf (f) Ψ(B). If we apply this translated term to a certain K, we obtain (λf (f) Ψ(B)) K →a (K) Ψ(B). We capture this by defining B : K = (K) Ψ(B). In general, M : K is the reduct of the λalg term ⟦M⟧ K, as Lemma 3.7 states.
Definition 3.6 Let (:) : Λλlin × Λλalg → Λλalg be the infix binary operation defined as follows:

B : K = (K) Ψ(B)                              (with B a base term),
(M) N : K = M : λg (⟦N⟧) λh ((g) h) K         (with M not a value),
(M) N : K = N : λf ((Ψ(M)) f) K               (with M, but not N, being a value),
(M) N : K = ((Ψ(M)) Ψ(N)) K                   (with M a value, and N a base term),
(M) (N1 + N2) : K = ((M) N1 + (M) N2) : K     (with M and N1 + N2 values),
(M) (α.N) : K = α.(M) N : K                   (with M and α.N values),
(M) 0 : K = 0                                 (with M a value),
(M + N) : K = M : K + N : K,
α.M : K = α.(M : K),
0 : K = 0.
Lemma 3.7 If K is a value, then for all M, ⟦M⟧ K →∗a M : K.

Lemma 3.8 If M →ℓ N then, for every value K, M : K →∗a N : K.
The proof of Theorem 3.5 can now be stated as follows.

Proof of Theorem 3.5. From Lemma 3.7, ⟦M⟧ (λx x) reduces to M : (λx x). From Lemma 3.8, it reduces to V : (λx x). We now proceed by structural induction on V.
• Let V be a base term. Then V : (λx x) = (λx x) Ψ(V) → Ψ(V).
• Let V = V1 + V2. Then V : (λx x) = V1 : (λx x) + V2 : (λx x), which by the induction hypothesis reduces to Ψ(V1) + Ψ(V2) = Ψ(V).
• Let V = α.V′. Then V : (λx x) = α.(V′ : (λx x)), which by the induction hypothesis reduces to α.Ψ(V′) = Ψ(V). □
4 Conclusion and perspectives
In this paper we proved the conjectured [2] simulation of λalg by λlin and its inverse, on valid programs
(that is, programs reducing to values), answering an open question about the equivalence of the algebraic
λ -calculus (λalg ) [9] and the linear-algebraic λ -calculus (λlin ) [1].
As already shown by Plotkin [8], while the simulation of call-by-value by call-by-name is sound, it fails to be complete for general (possibly non-terminating) programs. To make it complete, a known solution is to consider the problem from the point of view of Moggi's computational calculus [7]. A direction for study is to consider an algebraic computational λ-calculus instead of a general algebraic λ-calculus. This raises the question of finding a correct notion of monad capturing both algebraicity and non-termination in the context of higher-order structures. Another direction of study is the relation between the simulation of call-by-name by call-by-value using thunks and the CPS encoding. A first step in this direction is [5].
Concerning semantics, the algebraic λ -calculus admits finiteness spaces as a model [3]. What is the
structure of the model of the linear algebraic λ -calculus induced by the continuation-passing style translation in finiteness spaces? The algebraic lambda-calculus can be equipped with a differential operator.
What is the corresponding operator in λlin through the translation?
References
[1] Pablo Arrighi & Gilles Dowek (2008): Linear-algebraic lambda-calculus: higher-order, encodings, and confluence. In: Andrei Voronkov, editor: RTA 2008, Lecture Notes in Computer Science 5117, Springer, Hagenberg, Austria, pp. 17–31.
[2] Pablo Arrighi & Lionel Vaux (2009): Embedding AlgLam into Lineal. Private communication.
[3] Thomas Ehrhard (2005): Finiteness spaces. Mathematical Structures in Computer Science 15(4), pp. 615–646.
[4] Thomas Ehrhard & Laurent Regnier (2003): The differential lambda-calculus. Theoretical Computer Science
309(1), pp. 1–41.
[5] John Hatcliff & Olivier Danvy (1997): Thunks and the lambda-calculus. Journal of Functional Programming
7(03), pp. 303–319.
[6] Peter Zilahy Ingerman (1961): Thunks: a way of compiling procedure statements with some comments on procedure declarations. Communications of the ACM 4(1), pp. 55–58.
[7] Eugenio Moggi (1989): Computational Lambda-Calculus and Monads. In: LICS, IEEE Computer Society,
pp. 14–23.
[8] Gordon D. Plotkin (1975): Call-by-name, call-by-value and the lambda-calculus. Theoretical Computer
Science 1(2), pp. 125–159.
[9] Lionel Vaux (2009): The algebraic lambda calculus. Mathematical Structures in Computer Science 19(5), pp.
1029–1059.
Uncurrying for Innermost Termination and Derivational Complexity∗

Harald Zankl¹, Nao Hirokawa², and Aart Middeldorp¹

¹ Institute of Computer Science, University of Innsbruck, Austria
² School of Information Science, Japan Advanced Institute of Science and Technology, Japan

∗ This research is supported by FWF (Austrian Science Fund) project P18763 and the Grant-in-Aid for Young Scientists Nos. 20800022 and 22700009 of the Japan Society for the Promotion of Science.
In this paper we investigate the uncurrying transformation from (Hirokawa et al., 2008) for innermost
termination and derivational complexity.
1 Introduction
Proving termination of first-order applicative term rewrite systems is challenging since the rules lack sufficient structure. But these systems are important since they provide a natural framework for modeling higher-order aspects found in functional programming languages, some of which (e.g., OCaml) have an eager evaluation strategy. Since proving termination is easier for innermost than for full rewriting, we lift some of the recent results from [2] from full to innermost termination. For the properties that do not transfer to the innermost setting we provide counterexamples. Furthermore we show that the uncurrying transformation is suitable for (innermost) derivational complexity analysis.
The remainder of this paper is organised as follows. After recalling the uncurrying transformation
from [2] in Section 2 we show that it preserves innermost nontermination (but not innermost termination)
in Section 3. In Section 4 we show that it preserves polynomial complexities of programs.
2 Uncurrying
We assume familiarity with term rewriting [1, 7]. This section recalls definitions and results from [2]. An applicative term rewrite system (ATRS for short) is a TRS over a signature that consists of constants and a single binary function symbol called application, denoted by the infix and left-associative symbol ◦. In examples we often use juxtaposition instead of ◦. Every ordinary TRS can be transformed into an ATRS by currying. Let F be a signature. The currying system C(F) consists of the rewrite rules fi+1(x1, ..., xi, y) → fi(x1, ..., xi) ◦ y for every n-ary function symbol f ∈ F and every 0 ≤ i < n. Here fn = f and, for every 0 ≤ i < n, fi is a fresh function symbol of arity i. The currying system C(F) is confluent and terminating. Hence every term t has a unique normal form t↓C(F). For instance, f(a, b) is transformed into f a b. Let R be a TRS over the signature F. The curried system R↓C(F) is the ATRS consisting of the rules l↓C(F) → r↓C(F) for every l → r ∈ R. The signature of R↓C(F) contains the application symbol ◦ and a constant f0 for every function symbol f ∈ F. In the following we write R↓C for R↓C(F) whenever F can be inferred from the context or is irrelevant. Moreover, we write f for f0.
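Currying a term is a left fold over its argument list; a minimal Haskell sketch (our own term encoding, with the constructor :@: playing the role of ◦).

```haskell
-- Ordinary first-order terms and applicative terms.
data T = V String | F String [T]
data A = AV String | C String | A :@: A
infixl 9 :@:

-- curryT computes the normal form w.r.t. the currying system C(F):
-- f(t1,...,tn) becomes f0 ◦ t1 ◦ ... ◦ tn.
curryT :: T -> A
curryT (V x)    = AV x
curryT (F f ts) = foldl (:@:) (C f) (map curryT ts)
```

For instance, curryT applied to f(a, b) yields C "f" :@: C "a" :@: C "b", i.e. f a b.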
Next we recall the uncurrying transformation from [2]. Let R be an ATRS over a signature F. The
applicative arity aa(f ) of a constant f ∈ F is defined as the maximum n such that f ◦ t1 ◦ · · · ◦ tn is
a subterm in the left- or right-hand side of a rule in R. This notion is extended to terms as follows: aa(t) = aa(f) if t is a constant f, and aa(t) = aa(t1) − 1 if t = t1 ◦ t2. Note that aa(t) is undefined if the head symbol of t is a variable. The uncurrying system U(F) consists of the rewrite rules fi(x1, ..., xi) ◦ y → fi+1(x1, ..., xi, y) for every constant f ∈ F and every 0 ≤ i < aa(f). Here f0 = f and, for every i > 0, fi is a fresh function symbol of arity i. We say that R is left head variable free if aa(t) is defined for every non-variable subterm t of a left-hand side of a rule in R. This means that no subterm of a left-hand side in R is of the form t1 ◦ t2 where t1 is a variable. The uncurrying system U(F), or simply U, is confluent and terminating. Hence every term t has a unique normal form t↓U. The uncurried system R↓U is the TRS consisting of the rules l↓U → r↓U for every l → r ∈ R. However the rules of R↓U are not enough to simulate an arbitrary rewrite sequence in R. The natural idea is now to add U(F). In the following we write U+(R, F) for R↓U(F) ∪ U(F). If F can be inferred from the context or is irrelevant, U+(R, F) is abbreviated to U+(R).

Let R be a left head variable free ATRS. The η-saturated ATRS Rη is the smallest extension of R such that l ◦ x → r ◦ x ∈ Rη whenever l → r ∈ Rη and aa(l) > 0. Here x is a variable that does not appear in l → r. For a term t over the signature of the TRS U+(R), we denote by t↓C′ the result of identifying different function symbols in t↓C that originate from the same function symbol in F. The notation ↓C′ is extended to TRSs and substitutions in the obvious way. For a substitution σ, we write σ↓U for the substitution {x ↦ σ(x)↓U | x ∈ V}. Next we recall some results from [2].
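Applicative arity and left head variable freeness are small recursions over this representation; a sketch (same applicative term type as in the previous block), where the applicative arities of the constants are supplied as a map extracted from the rules of R.

```haskell
import qualified Data.Map as M
import Data.Maybe (isJust)

data A = AV String | C String | A :@: A   -- as in the previous sketch
infixl 9 :@:

-- aa(t): Nothing when the head symbol of t is a variable.
aa :: M.Map String Int -> A -> Maybe Int
aa ar (C f)     = M.lookup f ar
aa _  (AV _)    = Nothing
aa ar (t :@: _) = subtract 1 <$> aa ar t

-- A left-hand side is left head variable free if aa is defined on all
-- of its non-variable subterms.
lhvf :: M.Map String Int -> A -> Bool
lhvf _  (AV _)      = True
lhvf ar t@(C _)     = isJust (aa ar t)
lhvf ar t@(s :@: u) = isJust (aa ar t) && lhvf ar s && lhvf ar u
```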
Lemma 1. Let σ be a substitution. If t is head variable free then t↓U σ↓U = (tσ)↓U .
Lemma 2. Let R be a left head variable free ATRS. If s and t are terms over the signature of U + (R)
then (1) s →R↓U t if and only if s↓C ′ →R t↓C ′ and (2) s →U t implies s↓C ′ = t↓C ′ .
Theorem 3. A left head variable free ATRS R is terminating if and only if U + (Rη ) is terminating.
3 Innermost Uncurrying
Before showing that our transformation reflects innermost termination we show that it does not preserve
innermost termination.
Example 4. Consider the ATRS R = {f x → f x, f → g}. In an innermost sequence the first rule is
never applied and hence R is innermost terminating. We have U + (Rη ) = {f1 (x) → f1 (x), f → g, f1 (x) →
g ◦ x, f ◦ x → f1 (x)} which is not innermost terminating due to the rule f1 (x) → f1 (x).
The next example shows that s →i_R t does not imply s↓U →i+_U+(Rη) t↓U. This is not a counterexample to the soundness of uncurrying for innermost termination, but it shows that the proof for the "if-direction" of Theorem 3 (see [2] for details) cannot be adopted for the innermost case without further ado.

Example 5. Consider the ATRS R = {f → g, a → b, g x → h} and the innermost step s = f a →i_R g a = t. We have s↓U = f ◦ a and t↓U = g1(a). In the TRS U+(Rη) = {f → g, a → b, g1(x) → h, g ◦ x → g1(x)} we have s↓U →i_U+(Rη) g ◦ a, but the step from g ◦ a to t↓U is not innermost.
The above problems can be solved if we consider terms that are not completely uncurried. The next two lemmata prepare for our proof. Below we write s ⊲ t if t is a proper subterm of s.

Lemma 6. Let R be a left head variable free ATRS. If s is a term over the signature of R, s ∈ NF(R), and s →∗_U t then t ∈ NF(Rη↓U).

Lemma 7. →∗_U · ⊲ ⊆ ⊲ · →∗_U.

Lemma 8. For every left head variable free ATRS R the inclusion ∗_U← · →iε_R ⊆ →i+_U+(Rη) · ∗_U← holds, where →iε_R denotes an innermost root step.
Proof. We prove that s →i+_U+(Rη) r↓U σ↓U ∗_U← rσ whenever s ∗_U← lσ →iε_R rσ for some rewrite rule l → r in R. By Lemma 1 and the confluence of U, s →∗_U (lσ)↓U = l↓U σ↓U →_U+(Rη) r↓U σ↓U ∗_U← rσ. It remains to show that the sequence s →∗_U (lσ)↓U and the step l↓U σ↓U →_U+(Rη) r↓U σ↓U are innermost with respect to U+(Rη). For the former, let s →∗_U C[u] →_U C[u′] →∗_U (lσ)↓U with u →ε_U u′, and let t be a proper subterm of u. Obviously lσ →∗_U C[u] ⊲ t. According to Lemma 7, lσ ⊲ v →∗_U t for some term v. Since lσ →iε_R rσ, the term v is a normal form of R. Hence t ∈ NF(Rη↓U) by Lemma 6. Since u →ε_U u′, t is also a normal form of U. Hence t ∈ NF(U+(Rη)) as desired. For the latter, let t be a proper subterm of (lσ)↓U. According to Lemma 7, lσ ⊲ u →∗_U t. The term u is a normal form of R. Hence t ∈ NF(Rη↓U) by Lemma 6. Obviously, t ∈ NF(U) and thus also t ∈ NF(U+(Rη)).
The next example shows that for Lemma 8 the R-step must take place at the root position.

Example 9. If R = {f → g, f x → g x, a → b} then f1(a) ∗_U← f ◦ a →i_R g ◦ a, but there is no term s with f1(a) →i+_U+(Rη) s ∗_U← g ◦ a.
In order to extend Lemma 8 to non-root positions, we have to use rightmost innermost steps →ri. This avoids the situation in the above example where parallel redexes become nested by uncurrying.

Lemma 10. For every left head variable free ATRS R the inclusion ∗_U← · →ri_R ⊆ →i+_U+(Rη) · ∗_U← holds.
Proof. Let s ∗_U← t = C[lσ] →ri_R C[rσ] = u with lσ →iε_R rσ. We use induction on C. If C = □ then s ∗_U← t →iε_R u. Lemma 8 yields s →i+_U+(Rη) · ∗_U← u. For the induction step we consider two cases.

• Suppose C = □ ◦ s1 ◦ · · · ◦ sn and n > 0. Since R is left head variable free, aa(l) is defined. If aa(l) = 0 then s = t′ ◦ s′1 ◦ · · · ◦ s′n ∗_U← lσ ◦ s1 ◦ · · · ◦ sn →i_R rσ ◦ s1 ◦ · · · ◦ sn with t′ ∗_U← lσ and s′j ∗_U← sj for 1 ≤ j ≤ n. The claim follows using Lemma 8 and the fact that innermost rewriting is closed under contexts. If aa(l) > 0 then the head symbol of l cannot be a variable. We have to consider two cases. In the case where the leftmost ◦ symbol in C has not been uncurried we proceed as when aa(l) = 0. If the leftmost ◦ symbol of C has been uncurried, we reason as follows. We may write lσ = f ◦ u1 ◦ · · · ◦ uk where k < aa(f). We have t = f ◦ u1 ◦ · · · ◦ uk ◦ s1 ◦ · · · ◦ sn and u = rσ ◦ s1 ◦ · · · ◦ sn. There exists an i with 1 ≤ i ≤ min{aa(f), k + n} such that s = fi(u′1, ..., u′k, s′1, ..., s′i−k) ◦ s′i−k+1 ◦ · · · ◦ s′n with u′j ∗_U← uj for 1 ≤ j ≤ k and s′j ∗_U← sj for 1 ≤ j ≤ n. Because of rightmost innermost rewriting, the terms u1, ..., uk, s1, ..., sn are normal forms of R. According to Lemma 6 the terms u′1, ..., u′k, s′1, ..., s′n are normal forms of Rη↓U. Since i − k ≤ aa(l), Rη contains the rule l ◦ x1 ◦ · · · ◦ xi−k → r ◦ x1 ◦ · · · ◦ xi−k where x1, ..., xi−k are pairwise distinct variables not occurring in l. Hence the substitution τ = σ ∪ {x1 ↦ s1, ..., xi−k ↦ si−k} is well-defined. We obtain

s →i∗_U+(Rη) fi(u′1↓U, ..., u′k↓U, s′1↓U, ..., s′i−k↓U) ◦ s′i−k+1 ◦ · · · ◦ s′n
  →i_U+(Rη) (r ◦ x1 ◦ · · · ◦ xi−k)↓U τ↓U ◦ s′i−k+1 ◦ · · · ◦ s′n
  ∗_U← (r ◦ x1 ◦ · · · ◦ xi−k)τ ◦ si−k+1 ◦ · · · ◦ sn = rσ ◦ s1 ◦ · · · ◦ sn = u

where we use the confluence of U in the first sequence.

• In the second case we have C = s1 ◦ C′. Clearly C′[lσ] →ri_R C′[rσ]. If aa(s1) ≤ 0, or if aa(s1) is undefined, or if aa(s1) > 0 and the outermost ◦ has not been uncurried in the sequence from t to s, then s = s′1 ◦ s′ ∗_U← s1 ◦ C′[lσ] →ri_R s1 ◦ C′[rσ] = u with s′1 ∗_U← s1 and s′ ∗_U← C′[lσ]. If aa(s1) > 0 and the outermost ◦ has been uncurried in the sequence from t to s, then we may write s1 = f ◦ u1 ◦ · · · ◦ uk where k < aa(f). We have s = fk+1(u′1, ..., u′k, s′) for some term s′ with s′ ∗_U← C′[lσ] and u′i ∗_U← ui for 1 ≤ i ≤ k. In both cases the induction hypothesis yields s′ →i+_U+(Rη) · ∗_U← C′[rσ] and, since innermost rewriting is closed under contexts, we obtain s →i+_U+(Rη) · ∗_U← u as desired.
By Lemma 10 and the equivalence of rightmost innermost and innermost termination [6] we obtain
the main result of this section.
Theorem 11. A left head variable free ATRS R is innermost terminating if U + (Rη ) is.
4 Derivational Complexity
Hofbauer and Lautemann [4] introduced the concept of derivational complexity for terminating TRSs.
The idea is to measure the maximal length of rewrite sequences (derivations) depending on the size
of the starting term. Formally, the derivation length of t (with respect to →) is defined as dl(t, →) = max{m ∈ ℕ | t →^m u for some u}. The derivational complexity dcR(n) of a TRS R is then defined as dcR(n) = max{dl(t, →R) | |t| ≤ n} where |t| denotes the size of t. Similarly we define idcR(n) = max{dl(t, →i_R) | |t| ≤ n}. Since we regard only finite TRSs, these functions are well-defined only when R is (innermost) terminating. If dcR(n) is bounded by a linear, quadratic, cubic, ... polynomial, R is said to have linear, quadratic, cubic, ... (or polynomial) derivational complexity. A similar definition applies to idcR(n).
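For a finite, terminating and hence finitely branching system, dl can in principle be computed by exhaustive search; a sketch, where the function succs, enumerating the one-step successors of a term, is an assumed parameter.

```haskell
-- Sketch: dl(t, ->) for a terminating, finitely branching relation,
-- given by its successor function. dcR(n) would then be the maximum
-- of dl over all (finitely many, up to renaming) terms of size <= n.
dl :: (t -> [t]) -> t -> Int
dl succs t = case succs t of
  [] -> 0
  ts -> 1 + maximum (map (dl succs) ts)
```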
4.1 Full Rewriting

It is sound to use uncurrying as a preprocessor for proofs of derivational complexity:

Theorem 12. Let R be a left head variable free and terminating ATRS. Then dcR(n) ≤ dcU+(Rη)(n) for all n ∈ ℕ.
Proof. Consider an arbitrary maximal rewrite sequence t0 →R t1 →R t2 →R · · · →R tm, which we can transform into the sequence t0↓U →+_U+(Rη) t1↓U →+_U+(Rη) t2↓U →+_U+(Rη) · · · →+_U+(Rη) tm↓U just as in the "if-direction" of the proof of Theorem 3 (see [2]). Moreover, t0 →∗_U+(Rη) t0↓U holds. Therefore, dl(t0, →R) ≤ dl(t0, →U+(Rη)). Hence dcR(n) ≤ dcU+(Rη)(n) holds for all n ∈ ℕ.
Next we show that uncurrying preserves polynomial complexities. Hence we disregard duplicating (exponential complexity, cf. [3]) and empty (constant complexity) ATRSs. A TRS R is called length-reducing if R is non-duplicating and |l| > |r| for all rules l → r ∈ R. The following lemma is an easy consequence of [3, Theorem 23]. Here →R/S denotes →∗S · →R · →∗S.

Lemma 13. Let R be a non-empty non-duplicating TRS over a signature containing at least one symbol of arity at least two and let S be a length-reducing TRS. If R ∪ S is terminating then dcR∪S(n) ∈ O(dcR/S(n)).
Note that the above lemma does not hold if the TRS R is empty.
Theorem 14. Let R be a non-empty, non-duplicating, left head variable free, and terminating ATRS. If dcR(n) is in O(n^k) then dcRη↓U/U(n) and dcU+(Rη)(n) are in O(n^k).

Proof. Let dcR(n) be in O(n^k) and consider a maximal rewrite sequence of →Rη↓U/U from a term t0: t0 →Rη↓U/U t1 →Rη↓U/U · · · →Rη↓U/U tm. By Lemma 2 (2) we obtain the sequence t0↓C′ →R t1↓C′ →R · · · →R tm↓C′. Thus, dl(t0, →Rη↓U/U) ≤ dl(t0↓C′, →R). Because |t0↓C′| ≤ 2|t0| holds, we obtain dcRη↓U/U(n) ≤ dcR(2n). By assumption the right-hand side is in O(n^k), hence dcRη↓U/U(n) is in O(n^k). Because U is length-reducing, dcU+(Rη)(n) is also in O(n^k), by Lemma 13.
In practice it is recommendable to investigate dcRη↓U/U(n) instead of dcU+(Rη)(n), see [8]. The next example shows that uncurrying might be useful to enable criteria for polynomial complexity.

Example 15. Consider the ATRS R = {add x 0 → x, add x (s y) → s (add x y)}. It is easy to see that there exists a triangular matrix interpretation of dimension 2 that orients all rules in U+(Rη) strictly, inducing quadratic derivational complexity of U+(Rη) (see [5]) and, by Theorem 12, also of R. In contrast, the rule add x (s y) → s (add x y) does not admit such an interpretation of dimension 2. To see this we encoded the required condition as a satisfaction problem in non-linear arithmetic constraints over the integers. MiniSmt (http://cl-informatik.uibk.ac.at/software/minismt/) can prove this problem unsatisfiable.
4.2 Innermost Rewriting

Next we consider innermost derivational complexity. Let R be an innermost terminating TRS. From a result by Krishna Rao [6, Section 5.1] we infer that dl(t, →i_R) = dl(t, →ri_R) holds for all terms t.
Theorem 16. Let R be a left head variable free and innermost terminating ATRS. We have idcR(n) ≤ idcU+(Rη)(n) for all n ∈ ℕ.

Proof. Consider a maximal rightmost innermost rewrite sequence t0 →ri_R t1 →ri_R t2 →ri_R · · · →ri_R tm. Using Lemma 10 we obtain a sequence t0 →i+_U+(Rη) t′1 →i+_U+(Rη) t′2 →i+_U+(Rη) · · · →i+_U+(Rη) t′m for terms t′1, t′2, ..., t′m such that ti →∗_U t′i for all 1 ≤ i ≤ m. Thus, dl(t0, →i_R) = dl(t0, →ri_R) ≤ dl(t0, →i_U+(Rη)). Hence, we conclude idcR(n) ≤ idcU+(Rη)(n).
As Example 4 shows, uncurrying does not preserve innermost termination. Similarly, it does not preserve polynomial complexities, even if the original ATRS has linear innermost derivational complexity.

Example 17. Consider the non-duplicating ATRS R = {f → s, f (s x) → s (s (f x))}. Since the second rule is never used in innermost rewriting, idcR(n) ≤ n is easily shown by induction on n. We show that the innermost derivational complexity of U+(Rη) is at least exponential. We have U+(Rη) = {f → s, f1(s1(x)) → s1(s1(f1(x))), f ◦ x → f1(x), f1(x) → s1(x), s ◦ x → s1(x)} and one can verify that dl(f1^n(s1(x)), →i_U+(Rη)) ≥ 2^n for all n ≥ 1. Hence, idcU+(Rη)(n + 3) ≥ 2^n for all n ≥ 0.
References
[1] F. Baader & T. Nipkow (1998): Term Rewriting and All That. Cambridge University Press.
[2] N. Hirokawa, A. Middeldorp & H. Zankl (2008): Uncurrying for Termination. In: LPAR, LNCS 5330, pp.
667–681.
[3] N. Hirokawa & G. Moser (2008): Automated Complexity Analysis Based on the Dependency Pair Method. In:
IJCAR, LNCS 5195, pp. 364–379.
[4] D. Hofbauer & C. Lautemann (1989): Termination Proofs and the Length of Derivations (Preliminary Version). In: RTA, LNCS 355, pp. 167–177.
[5] G. Moser, A. Schnabl & J. Waldmann (2008): Complexity Analysis of Term Rewriting Based on Matrix and
Context Dependent Interpretations. In: FSTTCS, LIPIcs 2, pp. 304–315.
[6] M.R.K. Krishna Rao (2000): Some Characteristics of Strong Innermost Normalization. TCS 239, pp. 141–164.
[7] Terese (2003): Term Rewriting Systems, Cambridge Tracts in Theoretical Computer Science 55. Cambridge
University Press.
[8] H. Zankl & M. Korp (2010): Modular Complexity Analysis via Relative Complexity. In: RTA, LIPIcs. To
appear.
A Calculus of Coercions Proving the Strong Normalization of MLF

Giulio Manzonetto∗ (LIPN, CNRS UMR 7030, Université Paris Nord, France) [email protected]
Paolo Tranquilli† (LIP, CNRS UMR 5668, INRIA, ENS de Lyon, Université Claude Bernard Lyon 1, France) [email protected]

∗ Supported by Digiteo project COLLODI (2009-28HD).
† Supported by ANR project COMPLICE (ANR-08-BLANC-0211-01).
We provide a strong normalization result for MLF, a type system generalizing ML with first-class polymorphism as in system F. The proof is achieved by translating MLF into a calculus of coercions, and showing that this calculus is a decorated version of system F. Simulation results then entail strong normalization from the same property of system F.
Introduction. MLF [3] is a type system for (extensions of) λ-calculus which enriches ML with the first-class polymorphism of system F, providing a partial type annotation mechanism with an automatic type reconstructor. In this extension we can write programs that cannot be written in ML, while still being conservative: ML programs still typecheck without needing any annotation. An important feature is principal type schemata, lacking in system F, which are obtained by employing a downward bounded quantification ∀(α ≥ σ)τ, the so-called flexible quantifier. This type says that τ may be instantiated to any τ{σ′/α}, provided that σ′ is an instantiation of σ.

As already pointed out, system F is contained in MLF. It is not yet known, but it is conjectured [3], that the inclusion is strict. This makes the question of strong normalization (SN, i.e. whether λ-terms typed in MLF always terminate) a non-trivial one. In this paper we answer the question positively. The result is proved via a suitable simulation in system F, with additional structure dealing with the complex type instantiations possible in MLF.
Our starting point is xMLF [5], the Church version of MLF: here type inference (and the rigid quantifier ∀(α = σ)τ, which we did not mention) is omitted, with the aim of providing an internal language to which a compiler might map the surface language briefly presented above (denoted eMLF from now on¹). Compared to Church-style system F, the type reduction →ι of xMLF is more complex, and may a priori cause unexpected glitches: it could cause non-termination, or block the reduction of a β-redex. To prove that none of this happens, we use as target language of our translation a decoration of system F, the coercion calculus, which in our opinion has its own interest. Indeed, xMLF has syntactic entities (the instantiations φ) which testify to an instance relation between types, and it is not hard to regard them as coercions. The delicate point is that some of these instantiations (the "abstractions" !α) behave in fact as variables, abstracted when introducing a bounded quantifier. In fact, for all the choices of α, ∀(α ≥ σ)τ expects a coercion from σ to α.

A question that arises naturally is: what does it mean to be a coercion in this context? Our answer, which works for xMLF, is in the form of a type system (Figure 2). In section 2 we will show the good properties enjoyed by the coercion calculus. The generality of the coercion calculus allows
¹ There is also a completely annotation-free version, iMLF, clearly at the cost of losing type inference.
Syntactic definitions

σ, τ ::= α | σ → τ | ⊥ | ∀(α ≥ σ)τ                                (types)
φ, ψ ::= τ | !α | ∀(≥ φ) | ∀(α ≥)φ | ⅋ | & | φ; ψ | 1              (instantiations)
a, b, c ::= x | λ(x : τ)a | ab | Λ(α ≥ τ)a | aφ | let x = a in b   (terms)
Γ, ∆ ::= ∅ | Γ, α ≥ τ | Γ, x : τ                                   (environments)

Reduction rules

(λ(x : τ)a)b →β a{b/x}                       let x = b in a →β a{b/x}
a⅋ →ι Λ(α ≥ ⊥)a  (α ∉ ftv(τ))                a1 →ι a
(Λ(α ≥ τ)a)& →ι a{1/!α}{τ/α}                 a(φ; ψ) →ι (aφ)ψ
(Λ(α ≥ τ)a)(∀(α ≥)φ) →ι Λ(α ≥ τ)(aφ)
(Λ(α ≥ τ)a)(∀(≥ φ)) →ι Λ(α ≥ τφ)a{φ; !α/!α}

Figure 1: Syntactic definitions and reduction rules of xMLF.
then to lift these results to xMLF via a translation (section 3). The main idea of the translation
is the same as the one shown for eMLF in [4], where however no dynamic property was provided.
Here we finally produce a proof of SN for all versions of MLF . Moreover the bisimulation result
for xMLF establishes once and for all that xMLF can be used as an internal language for eMLF ,
as the additional type structure cannot block reductions of programs in eMLF .
1 A short introduction to xMLF
The syntactic entities of xMLF are presented in Figure 1. Intuitively, ⊥ ≅ ∀α.α and ∀(α ≥ σ)τ restricts the variable α to range over instances of σ only.
restricts the variable α to range over instances of σ only. Instantiations2 generalize system F’s
type application, by providing a way to instantiate from one type to another. A let construct
is added mainly to accommodate the type reconstructor of eMLF ; apart from type inference
purposes, one could assume (let x = a in b) = (λ (x : σ )b)a, with σ the correct type of a. Apart
from the usual variable assignments x : τ, environments also contain type variable assignments
α ≥ τ, which are abstracted by the type abstraction Λ(α ≥ τ)a.
Typing judgments are of the usual form Γ ⊢ a : σ for terms, and Γ ⊢ φ : σ ≤ τ for instantiations.
The latter means that φ can take a term a of type σ to aφ of type τ. For the sake of space,
we will not present here the typing rules of instantiations and terms, for which we refer to [5],
along with a more detailed discussion about xMLF . Reduction rules are divided into →β (regular
β -reductions) and →ι , reducing instantiations. The type τφ is given by an inductive definition
(which we will not give here) which computes the unique type such that Γ ⊢ φ : τ ≤ τφ , if φ
typechecks. We recall (from [5]) that both →β and →ι enjoy subject reduction. Moreover, we
denote by ⌈a⌉ the type erasure that ignores all type and instantiation annotations and maps
xMLF terms to ordinary λ -terms (with let).
2 The coercion calculus

The syntax, the type system and the reduction rules of the coercion calculus are introduced in Figure 2. The notion of coercion is captured by the type τ ⊸ σ: the use of linear logic's linear implication for the type of coercions is not casual. Indeed the typing system is a fragment of

² We follow the original notation of [5]; in particular it must be underlined that ⅋ and & have no relation whatsoever with linear logic's par and with connectives.
Syntactic definitions

σ, τ ::= α | σ → τ | κ → τ | ∀α.τ              (types)
κ ::= σ ⊸ τ                                    (coercion types)
ζ ::= τ | κ                                    (type expressions)
a, b ::= x | λx.a | λ̄x.a | ab | a ⊳ b | a ⊲ b   (terms)
u, v ::= λx.a | λ̄x.u | x ⊲ u                   (c-values)
Γ, ∆ ::= ∅ | Γ, x : τ | Γ, x : σ ⊸ α           (environments)
L ::= ∅ | x : τ                                (linear environments)

Γ; ⊢ a : σ        (term judgements)
Γ; ⊢ a : κ        (coercion judgements)
Γ; z : τ ⊢ a : σ  (linear judgements)

Typing rules

Ax:   Γ; ⊢ y : ζ                if Γ(y) = ζ
App:  Γ; ⊢ ab : τ               if Γ; ⊢ a : σ → τ and Γ; ⊢ b : σ
Inst: Γ; L ⊢ a : σ{τ′/α}        if Γ; L ⊢ a : ∀α.σ
Abs:  Γ; ⊢ λx.a : τ → σ         if Γ, x : τ; ⊢ a : σ
Let:  Γ; ⊢ let x = a in b : σ   if Γ; ⊢ a : τ and Γ, x : τ; ⊢ b : σ
Gen:  Γ; L ⊢ a : ∀α.σ           if Γ; L ⊢ a : σ and α ∉ ftv(Γ; L)
LAx:  Γ; z : τ ⊢ z : τ
LAbs: Γ; ⊢ λ̄z.a : τ ⊸ σ         if Γ; z : τ ⊢ a : σ
LApp: Γ; L ⊢ a ⊲ b : σ2         if Γ; ⊢ a : σ1 ⊸ σ2 and Γ; L ⊢ b : σ1
CAbs: Γ; L ⊢ λ̄x.a : κ → σ       if Γ, x : κ; L ⊢ a : σ
CApp: Γ; L ⊢ a ⊳ b : σ          if Γ; L ⊢ a : κ → σ and Γ; ⊢ b : κ

Reduction rules

(λx.a)b →β a{b/x}          let x = b in a →β a{b/x}
(λ̄x.a) ⊳ b →c a{b/x}       (λ̄x.a) ⊲ b →c a{b/x}
(λ̄x.u) ⊳ b →cv u{b/x}      (λ̄x.a) ⊲ u →cv a{u/x}

Figure 2: Syntactic definitions, typing and reduction rules of the coercion calculus; λ̄ denotes the coercion abstraction.
DILL, the dual intuitionistic linear logic [1]. This captures an aspect of coercions: they consume
their argument without erasing it (as they must preserve it) nor duplicate it (as there is no true
computation, just a type recasting). Environments are of shape Γ; L, where Γ is a map from
variables to type expressions3 , and L is the linear part of the environment, containing (contrary
to DILL) at most one assignment.
Reductions are divided into →β (the actual computation) and →c (the coercion reduction),
having a subreduction →cv which intuitively is just enough to unlock β -redexes, and is needed
for Theorem 4. We start from the basic properties of the coercion calculus. As usual, the
following result is achieved with weakening and substitution lemmas.
Theorem 1 (Subject reduction). Γ; L ⊢ a : ζ and a →βc b entail Γ; L ⊢ b : ζ.
The coercion calculus can be seen as a decoration of Curry-style system F. The latter can be recovered by just collapsing the constructs ⊸, λ̄, ⊳ and ⊲ to their regular counterparts, via the decoration erasure defined as follows.

|α| := α,                |ζ → τ| := |ζ| → |τ|,         |σ ⊸ τ| := |σ| → |τ|,
|x| := x,                |λx.a| = |λ̄x.a| := λx.|a|,     |a ⊳ b| = |a ⊲ b| = |ab| := |a||b|,
|let x = a in b| := (λx.|b|)|a|,   |Γ|(y) := |Γ(y)|,    |Γ; z : τ| := |Γ|, z : |τ|.
It is possible to prove that Γ; L ⊢ a : ζ implies |Γ; L| ⊢ |a| : |ζ| in system F. From this and the SN of system F [2, Sec. 14.3] it follows that the coercion calculus is SN. Confluence of the reductions can be proved by the standard Tait–Martin-Löf technique of parallel reductions. Summarizing, the following theorem holds.
³ Notice the restriction to σ ⊸ α for coercion variables. Theorem 4 relies on this restriction (d = λ̄x.(x ⊲ δ)δ : (σ ⊸ (σ → σ)) → σ, with δ = λy.yy : σ, ⌊d⌋ = δδ is a counterexample), but the preceding results do not.
Types and contexts

α• := α,                    (σ → τ)• := σ• → τ•,
⊥• := ∀α.α,                 (∀(α ≥ σ)τ)• := ∀α.(σ• ⊸ α) → τ•,
(x : τ)• := x : τ•,         (α ≥ τ)• := vα : τ• ⊸ α.

Instantiations

τ◦ := λ̄x.x,                 (⅋)◦ := λ̄x.λ̄vα.x,
(!α)◦ := vα,                (&)◦ := λ̄x.x ⊳ λ̄z.z,
(1)◦ := λ̄z.z,               (φ; ψ)◦ := λ̄z.ψ◦ ⊲ (φ◦ ⊲ z),
(∀(≥ φ))◦ := λ̄x.λ̄vα.x ⊳ (λ̄z.vα ⊲ (φ◦ ⊲ z)),
(∀(α ≥)φ)◦ := λ̄x.λ̄vα.φ◦ ⊲ (x ⊳ vα).

Terms

x◦ := x,                    (λ(x : τ)a)◦ := λx.a◦,
(ab)◦ := a◦b◦,              (Λ(α ≥ τ)a)◦ := λ̄vα.a◦,
(aφ)◦ := φ◦ ⊲ a◦,           (let x = a in b)◦ := let x = a◦ in b◦.

Figure 3: Translation of types, instantiations and terms into the coercion calculus. For every type variable α we suppose fixed a fresh term variable vα.
Theorem 2 (Confluence and termination). All of →β, →c, →cv and →βc are confluent. Moreover the coercion calculus is SN.
The use of coercions is annotated at the level of terms: λ̄ is used to distinguish between regular and coercion reduction, while ⊳ and ⊲ locate coercions without the need to carry typing information (the triangle's side points in the direction of the coercion). Thus, the actual semantics of the term can be recovered via its coercion erasure:

⌊x⌋ := x,            ⌊ab⌋ := ⌊a⌋⌊b⌋,     ⌊a ⊳ b⌋ := ⌊a⌋,
⌊λx.a⌋ := λx.⌊a⌋,    ⌊λ̄x.a⌋ := ⌊a⌋,      ⌊a ⊲ b⌋ := ⌊b⌋,
⌊let x = a in b⌋ := let x = ⌊a⌋ in ⌊b⌋.
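The erasure is a direct structural recursion; the following Haskell sketch uses our own encoding of the term syntax of Figure 2, with CLam standing for λ̄, CApp for ⊳ and LApp for ⊲.

```haskell
-- Sketch: coercion-calculus terms and their coercion erasure.
data Tm
  = V String | Lam String Tm | CLam String Tm
  | App Tm Tm | CApp Tm Tm | LApp Tm Tm
  | Let String Tm Tm

erase :: Tm -> Tm
erase (V x)       = V x
erase (Lam x a)   = Lam x (erase a)
erase (CLam _ a)  = erase a               -- drop coercion abstractions
erase (App a b)   = App (erase a) (erase b)
erase (CApp a _)  = erase a               -- keep the coerced term
erase (LApp _ b)  = erase b               -- keep the coercion's argument
erase (Let x a b) = Let x (erase a) (erase b)
```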
Proposition 3 (Preservation of semantics). Take a typable coercion term a. If a →β b (resp. a →c b) then ⌊a⌋ → ⌊b⌋ (resp. ⌊a⌋ = ⌊b⌋). Moreover we have the following confluence diagram: if a →β b1 and a →c b2, then there is a term c with b1 →∗c c and b2 →β c.

The following result shows the connection between the reductions of a term and of its semantics.

Theorem 4 (Bisimulation of ⌊·⌋). If Γ; ⊢ a : σ, then ⌊a⌋ →β b iff a →∗cv →β c with ⌊c⌋ = b.
3 The translation
A translation from xMLF terms and instantiations into the coercion calculus is given in Figure 3.
The idea is that instantiations can be seen as coercions; thus a term starting with a type
abstraction becomes a term waiting for a coercion, and a term aφ becomes a◦ coerced by φ ◦ .
The rest of this section is devoted to showing how this translation and the properties of the
coercion calculus lead to the main result of this work, that is SN of both xMLF and eMLF .
First one needs to show that the translation maps to well-typed terms. As expected, type
instantiations are mapped to coercions.
Proposition 5 (Soundness). For Γ ⊢ a : σ an xMLF term (resp. Γ ⊢ φ : σ ≤ τ an xMLF instantiation) we have Γ• ; ⊢ a◦ : σ • (resp. Γ• ; ⊢ φ ◦ : σ • ⊸ τ • ). Moreover ⌈a⌉ = ⌊a◦ ⌋.
The following result shows that the translation is “faithful”, in the sense that β and ι steps
are mapped to β and c steps respectively: coercions do the job of instantiations, and just that.
Proposition 6 (Coercion calculus simulates xMLF). If a →β b (resp. a →ι b) in xMLF, then a◦ →β b◦ (resp. a◦ →+c b◦) in the coercion calculus.
The above already shows SN of xMLF , however in order to show that eMLF is also normalizing
we need to make sure that ι-redexes cannot block β ones: in other words, a bisimulation result.
The following lemma lifts to xMLF the reduction in coercion calculus that bisimulates β -steps
(Theorem 4).
Lemma 7 (Lifting). For an xMLF term a, if a◦ →∗cv→β b then a →∗ι→β c with b →∗c c◦.
Theorem 8 (Bisimulation of ⌈.⌉ for xMLF). For a typed xMLF term a, we have that ⌈a⌉ →β b
iff a →∗ι→β c with ⌈c⌉ = b.
As a corollary of the two results stated above, we get the main result of this work, proving
conclusively that all versions of MLF enjoy SN.
Theorem 9 (SN of MLF ). Both eMLF and xMLF are strongly normalizing.
Further work. We were able to prove new results for MLF (namely SN and bisimulation
of xMLF with its type erasure) by employing a more general calculus of coercions. It becomes
natural then to ask whether its typing system may be a framework to study coercions in general,
like those arising in Fη or when using subtyping. The typing rules of Figure 2 were tailored to
xMLF , disallowing in coercions polymorphism or coercion abstraction, i.e. coercion types ∀α.κ
and κ1 → κ2. Removing such restrictions, we could still derive the main result, even though the
proofs would be more complex.
Apart from the extensions previously mentioned, one would need a way to build coercions of arrow types, which are unneeded for xMLF . Namely, given coercions c1 : σ2 ⊸ σ1 and
c2 : τ1 ⊸ τ2 , there should be a coercion c1 ⇒ c2 : (σ1 → τ1 ) ⊸ (σ2 → τ2 ), allowing a reduction
(c1 ⇒ c2 ) ⊲ λ x.a →c λ x.c2 ⊲ a {c1 ⊲ x/x}. This could be achieved by introducing it as a primitive,
by translation or by special typing rules. Indeed, if some form of η-expansion were available
while building a coercion, one could write c1 ⇒ c2 := λ f.λx.(c2 ⊲ (f (c1 ⊲ x))). However, how to do
this without losing bisimulation is under investigation.
Acknowledgements.
We thank Didier Rémy for stimulating discussions and remarks.
References
[1] Andrew Barber & Gordon Plotkin (1997): Dual intuitionistic linear logic. Technical Report ECS-LFCS-96-347, University of Edinburgh.
[2] Jean-Yves Girard, Yves Lafont & Paul Taylor (1989): Proofs and Types. Number 7 in Cambridge
tracts in theoretical computer science. Cambridge University Press.
[3] Didier Le Botlan & Didier Rémy (2003): MLF: Raising ML to the power of System F. In: Proc. of
International Conference on Functional Programming (ICFP’03), pp. 27–38.
[4] Daan Leijen (2007): A type directed translation of MLF to System F. In: Proc. of International
Conference on Functional Programming (ICFP’07), ACM Press.
[5] Didier Rémy & Boris Yakobowski (2009): A Church-style intermediate language for MLF. Available
at http://www.yakobowski.org/xmlf.html. Submitted.
Transformations of higher-order term rewrite systems
Cynthia Kop
Department of Theoretical Computer Science
Vrije Universiteit, Amsterdam
[email protected]
We study the common ground and differences of different frameworks for higher-order rewriting
from the viewpoint of termination by encompassing them in a generalised framework.
1 Introduction
In the past decades a lot of research has been done on termination of term rewrite systems. However,
the specialised area of higher-order rewriting is sadly lagging behind. There are many reasons for this.
Primarily, the topic is relatively difficult, mostly due to the presence of the beta rule. Applications are
also not as abundant as in first-order rewriting. A third limiting factor is the lack of a set
standard in higher-order rewriting. There are several important formalisms, each dealing with the higher-order
aspect in a different way, plus numerous variations and restrictions. Because of the differences in
what is and is not allowed, results in one formalism do not trivially, or not at all, carry over to another.
As such it is difficult to reuse results in a slightly different context, which necessitates a lot of duplicated
work.
In this paper we present work in progress investigating the common ground and differences of various
formalisms from the viewpoint of termination. We introduce yet another formalism, but show how
the usual styles of rewriting can be represented in it. We then look into properties within the general
formalism and show which ones can always be obtained by transforming the system and which cannot.
Finally, to demonstrate that the system is not too general to work with, we extend the Computability Path
Ordering [1] to our formalism.
2 The formalism
In this section we will introduce a formalism of higher-order rewriting, called Higher Order Decidable
Rewrite Systems (HODRSs).
types We assume a set of base sorts B and a set of type variables A . Each (base) sort b has a fixed
arity, notation: b : n; there must be at least one sort of arity 0. A polymorphic type is an expression over
B and A built according to the following grammar:
    T ::= α | b(Tⁿ) | T → T        (α ∈ A, b : n ∈ B)
A monomorphic type does not contain type variables. A type is called composed if it is headed by the
arrow symbol. A type b() with b : 0 ∈ B is denoted as just b. The → associates to the right. We say
σ ≥ τ if τ can be obtained from σ by substituting types for type variables. For example, α ≥ α → β ≥ N → N, but not α → α ≥ N → R.
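The instance relation on types is plain first-order matching. The following OCaml sketch (our own representation, not part of the paper) decides σ ≥ τ by computing the witnessing type substitution:

    type ty =
      | TVar of string                 (* type variable α ∈ A *)
      | TSort of string * ty list      (* b(t1,...,tn) with b : n ∈ B *)
      | TArrow of ty * ty              (* σ → τ, associating to the right *)

    (* geq sigma tau holds iff tau is sigma with types substituted for
       its type variables; sorts of equal name have equal arity. *)
    let geq sigma tau =
      let module M = Map.Make (String) in
      let rec go subst s t =
        match (s, t) with
        | TVar a, _ -> (
            match M.find_opt a subst with
            | Some u -> if u = t then Some subst else None
            | None -> Some (M.add a t subst))
        | TSort (b, ss), TSort (c, ts) when b = c ->
            List.fold_left2
              (fun acc s' t' ->
                match acc with None -> None | Some sb -> go sb s' t')
              (Some subst) ss ts
        | TArrow (s1, s2), TArrow (t1, t2) -> (
            match go subst s1 t1 with
            | None -> None
            | Some sb -> go sb s2 t2)
        | _ -> None
      in
      go M.empty sigma tau <> None

For instance, geq succeeds on α against N → N, while α → α against N → R fails because α cannot be bound to both N and R, matching the example above.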
(meta-)terms A metaterm is a typed expression over a set F of typed constants (also known as function
symbols) f : σ and an infinite set V of variables. We define the set of metaterms M together with a set Var
of free variables for each metaterm recursively with the following rules:

    (var)   xτ : τ ∈ M   if x ∈ V;   Var(xτ) = {xτ}
    (fun)   fτ : τ ∈ M   if f : σ ∈ F and σ ≥ τ;   Var(fτ) = ∅
    (abs)   λxσ.s : σ → τ ∈ M   if x ∈ V, s : τ ∈ M and Var(s) ∪ {xσ} UT (**);   Var(λxσ.s) = Var(s) \ {xσ}
    (app)   s · t : τ ∈ M   if s : σ → τ ∈ M, t : σ ∈ M and Var(s) ∪ Var(t) UT;   Var(s · t) = Var(s) ∪ Var(t)
    (meta)  Xσ[s1, . . . , sn] : τ ∈ M   if σ = σ1 → . . . → σn → τ, s1 : σ1, . . . , sn : σn ∈ M
            and {Xσ} ∪ ⋃i Var(si) UT;   Var(Xσ[s1, . . . , sn]) = {Xσ} ∪ ⋃i Var(si)

(**) A set V of typed variables is called UT, uniquely typed, if for any xσ ∈ V there is no xτ ∈ V with
σ ≠ τ.
A metaterm generated without clause (meta) is a term. We work modulo renaming of variables
bound by an abstraction operator (α-conversion). Explicit typing of terms will usually be omitted. The ·
operator associates to the left, so a metaterm s · t · r should be read as (s · t) · r. We will adopt the custom
of writing a (meta-)term s · t1 · · ·tn in the form s(t1 , . . . ,tn ).
type substitution A type substitution is a mapping p : A → T. For any metaterm s let s′p be
s with all type variables α replaced by p(α). As an example, (ifα(Xbool, Yα, 0α) : α)′{α → N} =
ifN(Xbool, YN, 0N).
We say s ≥ t if there is a type substitution p such that t = s′p. Given a typable expression s (that is, a
term with some type indicators omitted), it has a principal term t, which is ≥ any term r obtained from
s by filling in type indicators. When a term is displayed with type indicators missing, we always mean
its principal term. For example, if f : α → α ∈ F, the principal term of f is fα→α, whereas the principal
term of f(0N) is fN→N(0N).
term and metaterm substitution A (term) substitution is the homomorphic extension of a type-preserving mapping [x1,σ1 := s1, . . . , xn,σn := sn] with all si terms. Substitutions for meta-applications
X[t1, . . . , tm] "eat" their arguments. Formally, let γ be the function mapping xi,σi to si and γ(xτ) = xτ for
other typed variables. For any metaterm s, sγ is generated by the following rules:

    xσγ = γ(xσ)   for x ∈ V
    fσγ = fσ      for f ∈ F
    (s · t)γ = (sγ) · (tγ)
    (λx.s)γ = λx.(sγ)   if x ∉ dom(γ) (we can rename x if necessary)
    xσ[s1, . . . , sn]γ = q[y1 := s1γ, . . . , ym := smγ] · (sm+1γ) · · · (snγ)
        if γ(x) = λy1 . . . ym.q with m ≤ n, and either m = n or q is not an abstraction.
A metaterm is standard if variables occurring at the head of a meta-application are not bound, and all free
variables occur at the head of a meta-application.
Examples: x(λy.y)[x := λz.z(a)] = (λz.z(a))(λy.y), whereas x[λy.y][x := λz.z(a)] = (λy.y)(a).
Even an empty substitution has an effect on a proper metaterm: x[λy.y][] = x(λy.y).
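The way meta-applications "eat" their arguments is the only subtle case of substitution. A minimal OCaml sketch (again with a representation of our own, and assuming bound variables are renamed apart so capture cannot occur):

    type mt =
      | V of string                     (* variable *)
      | F of string                     (* function symbol *)
      | Lam of string * mt              (* abstraction λx.s *)
      | App of mt * mt                  (* application s · t *)
      | Meta of string * mt list        (* meta-application x[s1,...,sn] *)

    let rec subst env t =
      match t with
      | V x -> (match List.assoc_opt x env with Some s -> s | None -> V x)
      | F f -> F f
      | App (s, u) -> App (subst env s, subst env u)
      | Lam (x, s) -> Lam (x, subst (List.remove_assoc x env) s)
      | Meta (x, args) ->
          let args = List.map (subst env) args in
          let head =
            match List.assoc_opt x env with Some s -> s | None -> V x
          in
          eat head args

    (* Abstractions consume as many arguments as they can by direct
       substitution; whatever remains becomes ordinary applications. *)
    and eat head args =
      match (head, args) with
      | Lam (y, q), a :: rest -> eat (subst [ (y, a) ] q) rest
      | _ -> List.fold_left (fun s a -> App (s, a)) head args

On the examples above: substituting λz.z(a) for x in x[λy.y] yields (λy.y)(a) directly, while the empty substitution turns x[λy.y] into the application x · (λy.y).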
β and η    =β is the equivalence relation generated by (λx.s) · t =β s[x := t]. Every metaterm s is β-equal
to a unique β-normal term s↓β which has no subterms (λx.s) · t. This is a well-known result, which is
easily extended to HODRSs.
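On the metaterm representation sketched above, β-normalisation is the obvious recursion, reusing the type mt and subst from the previous code (and, as in the paper, relying on typing for termination):

    let rec beta_nf t =
      match t with
      | App (s, u) -> (
          match beta_nf s with
          | Lam (x, s') -> beta_nf (subst [ (x, beta_nf u) ] s')
          | s' -> App (s', beta_nf u))
      | Lam (x, s) -> Lam (x, beta_nf s)
      | Meta (x, args) -> Meta (x, List.map beta_nf args)
      | (V _ | F _) as a -> a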
=η is the equivalence relation generated by s =η λx.s · x if x ∉ Var(s), and X[s1, . . . , sn] =η λx.X[s1, . . . , sn, x].
A metaterm is in η-normal form if any higher-order subterm is either an abstraction or meta-application,
occurs at the head of an application or occurs as a direct argument of a meta-application (for example,
X[s] is η-normal if X : σ → τ with τ not composed, and all direct subterms of s are η-normal). While a
term may have more than one η-normal form ( f : o→o has normal forms λ x. f (x) and λ x.(λ x. f (x)) · x),
we define s ↓η as its minimal η-normal form.
=βη is the union of these relations. Each term has a unique βη-normal form.
rules A term rewrite system consists of an alphabet F , a set of rules R and an equivalence relation δ ,
where δ is one of β, η, βη or normal equality ε (the α rule is implicit in all). Rules are tuples (l, r)
(commonly denoted l ⇒ r), where l, r are standard metaterms satisfying the following properties: 1) l
and r have the same type, 2) all variables and type variables in r also occur in l, 3) if l has a subterm
X[s1, . . . , sn] then the si are all distinct bound variables (the parameter restriction), 4) if the equivalence
relation is either β or βη, no subterms X[s1, . . . , sn] · t0 · · · tm (n, m ≥ 0) occur in l.
R induces a rewrite relation ⇒R over terms in minimal δ-normal form:

    (top)    s ⇒R t           if s =δ l′pγ and t =δ r′pγ for some l ⇒ r ∈ R,
                              type substitution p and substitution γ
    (app-l)  s · t ⇒R s′ · t  if s ⇒R s′
    (app-r)  s · t ⇒R s · t′  if t ⇒R t′
    (abs)    λx.s ⇒R λx.s′    if s ⇒R s′
The reduction relation is decidable due to the parameter restriction.
3 Pleasant properties and transformations
To prove results about HODRSs it is often convenient to have a bit more to go on than just the general
definition. To this end we define a number of standard properties, which define common subclasses
which are relatively easy to work with.
implicit beta the equivalence relation is either β or β η
explicit beta the equivalence relation is either η or ε but R contains the rule beta, that is: (λ xα .Z[x]) ·
Y ⇒ Z[Y ]
parameter-free in all rules, except possibly beta, any meta-applications occurring on either side have
the form X[]
beta-free the system is parameter-free, does not contain beta, and its equivalence relation is either η or
ε
monomorphic no rule contains type variables, except possibly beta
left-beta-normal the left-hand side of each rule (except possibly beta) is β -normal
right-beta-normal the right-hand side of each rule is β -normal
beta-normal both left-beta-normal and right-beta-normal
eta-normal both sides of all rules have η-normal form
C. Kop
25
nth order any variable or function symbol occurring in one of the rules has a type of order at most n. A
sort-headed type b(t1 , . . . ,tn ) has order 0, a type σ1 → . . . σm → b with b not composed has order
max(order(σ1 ),. . . ,order(σm ))+1; we only speak of order in a monomorphic system
finite R is finite
algebraic there are no abstractions in the left-hand side of any rule
abstraction-free there are no abstractions in either side of any rule
left-linear no variable occurs free twice in the left-hand side of a rule
without head variables no left-hand side contains a subterm X[s1 , . . . , sn ] · t
completely without head variables no left-hand side contains a subterm X[s1 , . . . , sn ] · t or X · t (so
bound variables may also not occur at a head).
function-headed the head of the left-hand side of each rule is a function symbol
with base rules the type of a rule may not be composed
Many of these properties can be made to hold by transforming the system. When we are specifically
analysing termination, we can enforce the following properties without affecting either termination or
non-termination of the system:
1. any system can be made monomorphic, although at the price of finiteness
2. any system can be presented in beta-normal form
3. a system with explicit beta can be transformed to have implicit beta
4. any algebraic system can be turned abstraction-free
5. any system has a function-headed equivalent without head variables
Moreover, a system can be turned eta-normal and with base rules without losing non-termination; if
the transformed system is terminating then so is the original. However, turning a system eta-normal may
sometimes lose termination.
4 Embedding existing systems
There are four mainstream forms of higher-order rewriting: Nipkow’s HRSs, Jouannaud and Okada’s
AFSs, Yamada’s STTRSs and Klop’s CRSs. The latter three can be embedded into HODRSs, but since
HRSs in general do not have a decidable reduction relation, they cannot. However, the common restriction to pattern HRSs essentially gives function-headed HODRSs with equivalence relation βη. An
AFS is a parameter-free system with explicit beta, an STTRS is an abstraction-free, beta-free system,
and a CRS can be presented as a second-order HODRS with equivalence relation ε. Several quirks need
to be ironed out (such as AFSs using function symbols with arity; a symbol f of arity n only occurs in
the form f (s1 , . . . , sn ), and CRS terms being untyped), but this is easy to do.
5 HORPO
The recursive path ordering, a common syntactic termination method, has been extended to AFSs in a
long line of research, starting with HORPO [2] and culminating in CPO [1]. We consider the last of
these works, CPO. The definition extends a well-founded ordering on monomorphic function symbols to
a well-founded ordering on terms. There are no requirements on the terms except that function symbols
have to occur with all their arguments.
We extend CPO to polymorphic metaterms by defining X[s1, . . . , sn] ≻ X[t1, . . . , tn] if s1 ≻ t1, . . . , sn ≻
tn; for the type ordering we just use a type ordering on monomorphic types, together with the relation
σ → τ >T ρ if τ ≥T ρ. To compare the possibly infinite number of function symbols, we might for
example choose a well-founded ordering on the different symbol names, and additionally define fσ ⊲ fτ
if τ is a strict subtype of σ.
We can now prove that s′ pγ ↓η ≻ t ′ pγ ↓η if we can derive s ≻ t (for any type substitution p and
substitution γ). As the beta rule is included in ≻, we also have s′ pγ ↓η ≻ t ′ pγ ↓β η . Since a system is
terminating if it is terminating modulo η, we only have to show that l ≻ r for all rules l ⇒ r ∈ R to prove
termination of the system, whatever its modulo relation and properties are.
References
[1] F. Blanqui, J.-P. Jouannaud & A. Rubio (2008): The Computability Path Ordering: The End of a Quest. In:
Lecture Notes in Computer Science (CSL ’08), pp. 1–14.
[2] J.-P. Jouannaud & A. Rubio (1999): The higher-order recursive path ordering. In: Proceedings of the 14th
annual IEEE Symposium on Logic in Computer Science (LICS ’99), Trento, Italy, pp. 402–411.
Computational interpretations of logic
Silvia Ghilezan
Faculty of Technical Sciences
University of Novi Sad, Serbia
[email protected]
The fundamental connection between logic and computation, known as the Curry-Howard correspondence or formulae-as-types and proofs-as-programs paradigm, relates logical and computational
systems. We present an overview and a comparison of computational interpretations of intuitionistic and
classical logic both in natural deduction and sequent-style settings. We will further discuss and develop
a sequent term calculus with explicit control of the erasure and duplication operations in the management of
resources.
Formulae-as-types, proofs-as-term, proofs-as-programs paradigm Gentzen’s natural deduction is a
well established formalism for expressing proofs. Church’s simply typed λ -calculus is a core formalism
for writing programs. Simply typed λ -calculus represents a computational interpretation of intuitionistic
natural deduction: formulae correspond to types, proofs to terms/programs and simplifying a proof
corresponds to executing a program. In its traditional form, terms in the λ -calculus encode proofs in
intuitionistic natural deduction; from another perspective the proofs serve as typing derivations for the
terms. This correspondence was discovered in the late 1950s and early 1960s independently in logic by
Curry, later formulated by Howard; in category theory, Cartesian Closed Categories, by Lambek; and in
mechanization of mathematics, the language Automath, by de Bruijn.
Griffin extended the Curry-Howard correspondence to classical logic in his seminal 1990 paper [7],
by observing that classical tautologies suggest typings for certain control operators. This initiated a
vigorous line of research: on the one hand classical calculi can be seen as pure programming languages
with explicit representations of control, while at the same time terms can be tools for extracting the
constructive content of classical proofs. The λ µ-calculus of Parigot [10] expresses the computational
content of classical natural deduction and has been the basis of a number of investigations into the
relationship between classical logic and theories of control in programming languages.
Computational interpretation of sequent-style logical systems has come into the picture much later,
by the end of 1990s. There were several attempts, over the years, to design a term calculus which would
embody the Curry-Howard correspondence for intuitionistic sequent logic. The first calculus accomplishing this task is Herbelin’s λ̄ -calculus [8]. Recent interest in the Curry-Howard correspondence for
intuitionistic sequent logic [8, 1, 5, 6] made it clear that the computational content of sequent derivations
and cut-elimination can be expressed through an extension of the λ -calculus. In the classical setting,
there are several term calculi based on classical sequent logic, in which terms unambiguously encode
sequent derivations and reduction corresponds to cut elimination [11, 2, 12, 4, 3]. In contrast to natural
deduction proof systems, sequent calculi exhibit inherent symmetries in proof structures which create
technical difficulties in analyzing the reduction properties of these calculi.
Resource operators for sequent λ -calculus The simply typed λ Gtz -calculus, proposed by Espı́rito
Santo [5] as a modification of Herbelin's λ̄-calculus, completely corresponds to the implicational fragment of intuitionistic sequent logic. We extend the Curry-Howard correspondence to intuitionistic sequent logic with explicit structural rules of weakening (thinning) and contraction. We propose a term
calculus derived from λ Gtz by adding explicit resource operators for weakening and contraction, which
we call ℓλ Gtz (linear λ Gtz ). The main motivation for our work is to explore these features in the sequent
calculus setting. Kesner and Lengrand [9] developed a term calculus corresponding to intuitionistic calculus in natural deduction format equipped with explicit substitution, weakening and contraction. For
the proposed calculus we introduce the type assignment system with simple types and prove some operational properties, including type preservation under reduction (subject reduction) and termination
properties. We then relate the proposed linear type calculus to the calculus of Kesner and Lengrand in
the natural deduction framework.
Parts of the presented work have been realised jointly with José Espı́rito Santo, Jelena Ivetić, Pierre
Lescanne, Silvia Likavec and Dragiša Žunić.
References
[1] H. P. Barendregt and S. Ghilezan. Lambda terms for natural deduction, sequent calculus and cut-elimination.
J. of Functional Programming, 10(1):121–134, 2000.
[2] P.-L. Curien and H. Herbelin. The duality of computation. In Proc. of the 5th ACM SIGPLAN Int. Conference
on Functional Programming, ICFP’00, pages 233–243, Montreal, Canada, 2000. ACM Press.
[3] D. Dougherty, S. Ghilezan, and P. Lescanne. Characterizing strong normalization in the Curien-Herbelin
symmetric lambda calculus: extending the Coppo-Dezani heritage. Theor. Comput. Sci., 398(1-3), 2008.
[4] D. Dougherty, S. Ghilezan, P. Lescanne, and S. Likavec. Strong normalization of the dual classical sequent
calculus. In 12th International Conference Logic for Programming, Artificial Intelligence, and Reasoning,
LPAR 2005, volume 3835 of LNCS, pages 169–183, 2005.
[5] J. Espı́rito Santo. Completing Herbelin’s programme. In Proceedings of Types Lambda Calculus and Application, TLCA’07, volume 4583 of LNCS, pages 118–132, 2007.
[6] J. Espı́rito Santo, S. Ghilezan, and J. Ivetić. Characterising strongly normalising intuitionistic sequent terms.
In International Workshop TYPES’07 (Selected Papers), volume 4941 of LNCS, pages 85–99, 2008.
[7] T. Griffin. A formulae-as-types notion of control. In Proc. of the 19th Annual ACM Symp. on Principles Of
Programming Languages (POPL'90), pages 47–58, San Francisco, USA, 1990. ACM Press.
[8] H. Herbelin. A lambda calculus structure isomorphic to Gentzen-style sequent calculus structure. In Computer Science Logic, CSL 1994, volume 933 of LNCS, pages 61–75, 1995.
[9] D. Kesner and S. Lengrand. Resource operators for lambda-calculus. Inf. Comput., 205(4):419–473, 2007.
[10] M. Parigot. An algorithmic interpretation of classical natural deduction. In Proc. of Int. Conf. on Logic
Programming and Automated Reasoning, LPAR’92, volume 624 of LNCS, pages 190–201, 1992.
[11] C. Urban and G. M. Bierman. Strong normalisation of cut-elimination in classical logic. In Typed Lambda
Calculus and Applications, TLCA’99, volume 1581 of LNCS, pages 365–380, 1999.
[12] P. Wadler. Call-by-value is dual to call-by-name, reloaded. In Rewriting Techniques and Applications, RTA'05,
volume 3467 of LNCS, pages 185–203, 2005.
On the Implementation of Dynamic Patterns
Thibaut Balabonski
Laboratoire PPS, CNRS and Université Paris Diderot
[email protected]
Pattern matching against dynamic patterns in functional programming languages is modelled in the
Pure Pattern Calculus by one single meta-rule. The present contribution is a refinement which narrows the gap between the abstract calculus and the implementation, and allows reasoning on matching
algorithms and strategies.
1 Dynamic Patterns
Pattern matching is a base mechanism used to deal with algebraic data structures in functional programming languages. It allows reasoning on the shape of the arguments in the definition of a function. For
instance, define a binary tree to be either a single data or a node with two subtrees (first code block below, in ML-like
syntax). Then a function on binary trees may be defined by reasoning on the shapes generated by these
two possibilities (second code block).
    type 'a tree =
      | Data of 'a
      | Node of 'a tree * 'a tree

    let f t = match t with
      | Data d          -> <code1>
      | Node (Data d) r -> <code2>
      | Node l r        -> <code3>
An argument given to the function f is first compared to (or matched against) the shape Data d (called a
pattern). In case of success, d is instantiated in <code1> by the corresponding part of the argument, and
<code1> is executed. In case of failure of this first matching (the argument is not a data) the argument is
matched against the second pattern, and so on until a matching succeeds or there is no pattern left.
One limitation of this approach to programming is that patterns are fixed expressions explicitly mentioning
the constructors to which they can apply, which restricts polymorphism and reusability of the code. This
can be improved by allowing patterns to be parametrized: one single function can be specialized in
various ways by instantiating the parameters of its patterns by different constructors or even by functions
building patterns. However, introducing parameters and functions into patterns deeply modifies their
nature: they become dynamic objects that have to be evaluated. The Pure Pattern Calculus (PPC)
of B. Jay and D. Kesner [JK09, Jay09] models these phenomena using a meta-level notion of pattern
matching. The present contribution analyzes the content of the meta pattern matching of PPC (reviewed
in Section 2), and proposes an explicit pattern matching calculus (Section 3) which is confluent, which
simulates PPC, and which allows the description of new reduction strategies (Section 4). Additional
material may be found in [Bal08].
2 The Pure Pattern Calculus
This section only reviews some key aspects of PPC. Please refer to [JK09] for a complete story. The
syntax of PPC is very close to the one of λ -calculus: the difference is the abstraction [θ ]p b over
a pattern and not λ x.b over a single variable, and the distinction between variable occurrences x and
matchable occurrences x̂ of a name x. Variable occurrences are usual variables which may be substituted
while matchable occurrences are immutable and used as matching variables or constructors.
    t ::= x | x̂ | t t | [θ]t t        (PPC terms)
where θ is a list of names. Letter a (resp. b, p) is used to indicate a PPC term in position of argument
(resp. function body, pattern). Letters t, u, v are used when there is nothing to emphasize.
As pictured below, in the abstraction [θ ]p b the list of names θ binds matchable occurrences in
the pattern p and variable occurrences in the body b. Substitution of free variables and α-conversion are
deduced (see [JK09]).
[ x ] x x̂ x x̂ =α [y] x ŷ y x̂
One feature of PPC is the use of a single syntactic application for two different meanings: the term tu
may represent either the usual functional application of a function t to an argument u or the construction
of a data structure by structural application of a constructor to one or more arguments. The latter is
invariant: any structural application is forever a data structure, whereas the functional application may
be reduced someday (and then turn into anything else).
The simplest notion of pattern matching is syntactic matching: an argument a matches a pattern p
if and only if there is a substitution σ such that a = pσ . However, with arbitrary patterns this solution
generates non-confluent calculi [Klo80]. In the lambda-calculus with patterns for instance [KvOdV08],
syntactic matching is used together with a restriction on patterns (rigid pattern condition). The alternative
solution of PPC allows a priori any term to be a pattern, and checks the validity of patterns only when
pattern matching is performed. This verification is done by a more subtle notion of matching, called
compound matching, which tests whether patterns and arguments are in a so-called matchable form. A
matchable form denotes a term which is understood as a value, or a term whose current form is stable
and then allows matching. Matchable forms are described in PPC at the meta-level by the following
grammar:
    d ::= x̂ | d t        (Data structures)
    m ::= d | [θ]t t     (Matchable forms)
The compound matching is then defined (still at the meta-level) by the following equations, in order:
    {{a /θ x̂}} := {x ↦ a}                                if x ∈ θ
    {{x̂ /θ x̂}} := {}                                     if x ∉ θ
    {{a1 a2 /θ p1 p2}} := {{a1 /θ p1}} ⊎ {{a2 /θ p2}}    if a1 a2 and p1 p2 are matchable forms
    {{a /θ p}} := ⊥                                      if a and p are matchable forms, otherwise
where ⊥ denotes a matching failure and ⊎ is the disjoint union on substitutions (⊥ ⊎ σ = σ ⊎ ⊥ = ⊥,
and σ1 ⊎ σ2 = ⊥ if domains of σ1 and σ2 overlap). This disjoint union checks that patterns are linear: no
matching variable is used twice in the same pattern (non-linearity would break confluence).
Remark that compound matching may not be defined if the pattern or the argument is not a matchable
form. This represents patterns or arguments that have still to be evaluated or instantiated before being
matched.
PPC has to deal with a problem of dynamic patterns: a matching variable may be erased from a pattern
during its evaluation, which would make reduction ill-defined. This is avoided in PPC by a last (meta-level) test, called check: the result {a /θ p} of the matching of a against p is defined as follows.
• if {{a /θ p}} = ⊥ then {a /θ p} = ⊥.
• if {{a /θ p}} = σ with dom(σ) ≠ θ then {a /θ p} = ⊥.
• if {{a /θ p}} = σ with dom(σ) = θ then {a /θ p} = σ.
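Compound matching and the check are directly executable. The following OCaml sketch uses a representation of our own, encodes ⊥ as None, and signals the "not yet defined" case (pattern or argument still to be evaluated) with an exception:

    type t =
      | Var of string                  (* variable occurrence x  *)
      | Mat of string                  (* matchable occurrence x̂ *)
      | App of t * t
      | Abs of string list * t * t     (* [θ]p b *)

    let rec is_data = function         (* d ::= x̂ | d t *)
      | Mat _ -> true
      | App (d, _) -> is_data d
      | _ -> false

    let is_matchable = function Abs _ -> true | u -> is_data u

    (* ⊎ : fails if the domains overlap (linearity of patterns). *)
    let disjoint_union s1 s2 =
      match (s1, s2) with
      | Some l1, Some l2
        when List.for_all (fun (x, _) -> not (List.mem_assoc x l2)) l1 ->
          Some (l1 @ l2)
      | _ -> None

    exception Undefined   (* a or p must be evaluated further *)

    (* {{a /θ p}}, with the clauses tried in order. *)
    let rec cmatch a theta p =
      match (a, p) with
      | _, Mat x when List.mem x theta -> Some [ (x, a) ]
      | Mat x, Mat y when x = y -> Some []   (* here x ∉ θ *)
      | App (a1, a2), App (p1, p2)
        when is_matchable a && is_matchable p ->
          disjoint_union (cmatch a1 theta p1) (cmatch a2 theta p2)
      | _ when is_matchable a && is_matchable p -> None
      | _ -> raise Undefined

    (* The check: dom(σ) ⊆ θ holds by construction, so it remains to
       verify that every name of θ was actually bound. *)
    let check theta = function
      | Some s when List.for_all (fun x -> List.mem_assoc x s) theta ->
          Some s
      | _ -> None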
T. Balabonski
31
Finally, the reduction −→PPC of PPC is defined by a unique reduction rule (applied in any context):

    ([θ]p b) a −→ b {a /θ p}

where b⊥ is some fixed closed normal term ⊥, for any term b.
Example 1. Let t be a PPC term. The redex ([x]ĉx̂ x) (ĉt) reduces to t: the constructor ĉ matches itself
and the matchable x̂ is associated to t. On the other hand, ([x, y]ĉx̂ xy) (ĉt) reduces to ⊥: whereas
the compound matching is defined and successful, the check fails since there is no match for y and the
result would be ty where y appears as a free variable. The redex ([x]ĉx̂ x) (ĉ) also reduces to ⊥ since
a constructor will never match a structural application. And last, ([x]yx̂ x) (ĉt) is not a redex since the
pattern has to be instantiated.
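The rule itself then composes matching, check and substitution. A small sketch on top of the previous OCaml definitions, with the substitution function passed in as a parameter (we do not re-implement it here) and [x]x̂ x chosen as the fixed closed normal term ⊥:

    let bottom = Abs ([ "x" ], Mat "x", Var "x")   (* our choice of ⊥ *)

    (* One −→PPC step at the root, if the term is a redex. *)
    let reduce_redex apply_subst = function
      | App (Abs (theta, p, b), a) -> (
          match check theta (cmatch a theta p) with
          | Some s -> Some (apply_subst s b)
          | None -> Some bottom)
      | _ -> None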
3 Explicit Matching
This section defines the Pure Pattern Calculus with Explicit Matching (PPCEM ), a calculus which gives
an account of all steps of a pattern matching process of PPC. An explicit version of PPC has to deal
with the four aforementioned points: identification of structural applications, pattern matching, linearity
of patterns, and check.
Firstly, a new syntactic construct is introduced to discriminate between functional and structural applications (as in [FMS06] for the rewriting calculus for instance). Any application is supposed functional
a priori, and two rules (−→•, given below) propagate structural information: the explicit structural
application of t to u is written t • u.
Another new syntactic object has to be introduced to represent an ongoing matching operation. The
basic information contained in such an object is: the list of matching variables, a partial result representing what has already been computed, and a list of matchings that have still to be solved. The new
grammar is:

    t ::= x | x̂ | t t | t • t | [θ]t t | t⟨θ|τ|∆⟩    (PPCEM terms)
    m ::= x̂ | t • t | [θ]t t                         (Matchable forms)
where in the matching ⟨θ|τ|∆⟩, θ and τ are lists of names and ∆ is the list of submatchings that have
still to be solved (a list of pairs of terms). The choice is here to apply partial results as soon as they are
obtained so that they do not need to be remembered. However, a trace of the partial results is remembered
for linearity verification: τ is the list of matching variables that have already been used, while θ is the
list of matching variables that have not yet been used, and hence the check succeeds if and only if θ
is empty. A pure term of PPCEM is a term without structural applications and matchings (that means a
PPC term). Definition of free variables and matchables is as follows. The set of free names of a term t is
fn(t):
    fn(t) = fv(t) ∪ fm(t)

    fv(x) := {x}                            fm(x) := ∅
    fv(x̂) := ∅                             fm(x̂) := {x}
    fv(tu) := fv(t) ∪ fv(u)                 fm(tu) := fm(t) ∪ fm(u)
    fv(t • u) := fv(t) ∪ fv(u)              fm(t • u) := fm(t) ∪ fm(u)
    fv([θ]p b) := fv(p) ∪ (fv(b) \ θ)       fm([θ]p b) := (fm(p) \ θ) ∪ fm(b)
    fv(t⟨θ|τ|∆⟩) := (fv(t) \ (θ ∪ τ)) ∪ fv(∆)
    fm(t⟨θ|τ|∆⟩) := fm(t) ∪ fm(π1(∆)) ∪ (fm(π2(∆)) \ (θ ∪ τ))

where if ∆ = (a1, p1) . . . (an, pn) then fm(π1(∆)) = ⋃i fm(ai) and fm(π2(∆)) = ⋃i fm(pi).
32
On the Implementation of Dynamic Patterns
The (meta-)definition of substitution of free variables can be deduced:

    xσ := σx                          if x ∈ dom(σ)
    xσ := x                           if x ∉ dom(σ)
    x̂σ := x̂
    (tu)σ := tσ uσ
    (t • u)σ := tσ • uσ
    ([θ]p b)σ := [θ]pσ bσ             if θ ∩ (dom(σ) ∪ fn(σ)) = ∅
    (t⟨θ|τ|∆⟩)σ := tσ⟨θ|τ|∆σ⟩         if (θ ∪ τ) ∩ (dom(σ) ∪ fn(σ)) = ∅

where in ∆σ the substitution propagates to all terms of ∆. A notion of α-conversion is associated, and
from now on it is supposed that all bound names in a term are different, and disjoint from free names.
Rules for matching are of three kinds: an initialization rule −→B which triggers a new matching
operation, several matching rules −→m corresponding to all possible elementary matching steps and two
resolution rules −→r that apply the result of a completed matching.
Structural application
    x̂ t −→• x̂ • t
    (t • u) v −→• (t • u) • v

Initialization
    ([θ]p b) a −→B b⟨θ|∅|(a, p)⟩

Matching
    b⟨θ|τ|(a, x̂)∆⟩ −→m b{x↦a}⟨θ \ {x}|τ ∪ {x}|∆⟩     if x ∈ θ, x ∉ τ and fn(a) ∩ (θ ∪ τ) = ∅
    b⟨θ|τ|(x̂, x̂)∆⟩ −→m b⟨θ|τ|∆⟩                      if x ∉ θ and x ∉ τ
    b⟨θ|τ|(a1 • a2, p1 • p2)∆⟩ −→m b⟨θ|τ|(a1, p1)(a2, p2)∆⟩
    b⟨θ|τ|(a, p)∆⟩ −→m ⊥                             if a and p are matchable forms, otherwise

Resolution
    b⟨∅|τ|ε⟩ −→r b
    b⟨θ|τ|ε⟩ −→r ⊥                                   if θ ≠ ∅
Reduction −→EM of PPCEM is defined by application of any rule of −→B , −→• , −→m or −→r in
any context. The subsystem −→ p = −→• ∪ −→m ∪ −→r computes already existing pattern matchings
but does not create new ones.
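The elementary steps of −→m and −→r act on a matching state in an entirely first-order way. A self-contained OCaml sketch of one such step (our own encoding; the substitution is passed in as a parameter, and the freshness condition fn(a) ∩ (θ ∪ τ) = ∅ is assumed to hold via α-conversion):

    type t =
      | Var of string
      | Mat of string                   (* x̂ *)
      | App of t * t                    (* functional application *)
      | SApp of t * t                   (* structural application t • u *)
      | Abs of string list * t * t      (* [θ]p b *)
      | State of t * string list * string list * (t * t) list
                                        (* b⟨θ|τ|Δ⟩ *)

    exception Bottom                    (* the result ⊥ *)

    let is_matchable = function
      | Mat _ | SApp _ | Abs _ -> true
      | _ -> false

    let step subst = function
      | State (b, theta, tau, (a, Mat x) :: delta)
        when List.mem x theta && not (List.mem x tau) ->
          (* apply the partial result at once; move x from θ to τ *)
          State (subst x a b, List.filter (( <> ) x) theta, x :: tau, delta)
      | State (b, theta, tau, (Mat x, Mat y) :: delta)
        when x = y && not (List.mem x theta) && not (List.mem x tau) ->
          State (b, theta, tau, delta)
      | State (b, theta, tau, (SApp (a1, a2), SApp (p1, p2)) :: delta) ->
          State (b, theta, tau, (a1, p1) :: (a2, p2) :: delta)
      | State (_, _, _, (a, p) :: _)
        when is_matchable a && is_matchable p ->
          raise Bottom                  (* mismatch of matchable forms *)
      | State (b, [], _, []) -> b       (* −→r : the check succeeded *)
      | State (_, _ :: _, _, []) -> raise Bottom   (* unused θ-variable *)
      | u -> u                          (* no step applies here *)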
3.1 Confluence and Simulation properties
Proofs are similar to those presented in [Bal08].
Theorem 1. −→ p is confluent and strongly normalizing.
Theorem 2. PPCEM is confluent.
For any PPCEM term t, write ⟦t⟧ the term t where all structural applications (•) are replaced by functional
applications. Let t be a PPCEM term, and t′ the unique normal form of t by −→p. Write t↓ the term ⟦t′⟧.
Theorem 3.
• For any terms t and t′ of PPC, if t −→PPC t′ then t −→∗EM t′.
• For any terms t and t′ of PPCEM, if t −→EM t′ and t↓ and t′↓ are pure, then t↓ −→=PPC t′↓.
4 Reduction Strategies
Pattern matching raises two new issues concerning reduction strategies (that means the evaluation order
of programs). One is related to the order in which pattern matching steps are done, the other concerns
the amount of evaluation of the pattern or the argument done before pattern matching. This subsection
focuses on the latter problem. In PPC, the basic evaluation strategy for a term ([θ]p b)a is: evaluate
the pattern p and the argument a, and then resolve the matching. As with usual call-by-value, this solution
may perform unneeded evaluation of the argument (in parts that are not reused in the body b of the
function). Call-by-name, which means substituting non-evaluated arguments, is the most basic solution
to this problem. But how can such a solution be described in a pattern calculus? In PPC some evaluation
of the argument has to be done before pattern matching, but the exact amount of evaluation needed
depends on the pattern. Hence a description of a reduction strategy performing a minimal evaluation
of the argument is not possible in PPC without defining a reduction parametrized by a pattern. On
the other hand, PPCEM allows shuffling pattern or argument reduction and pattern matching steps. This
finer control allows us, for instance, to define call-by-name reduction to head-normal form by the following
evaluation contexts:
    E ::= []
        | E t
        | t⟨θ|τ|(E, t)∆⟩
        | t⟨θ|τ|(x̂, E)∆⟩      (with x ∉ θ and x ∉ τ)
        | t⟨θ|τ|(t • t, E)∆⟩
Reduction following these evaluation contexts triggers pattern matchings as soon as possible. Then the
pattern and the argument are evaluated until they become matchable, and a pattern matching step is
performed before evaluation goes on.
More generally, this also allows revisiting standardisation for pattern calculi [KLR10].
5 Conclusion
The Pure Pattern Calculus is a compact framework modelling pattern matching with dynamic patterns.
However, the conciseness of PPC is due to its use of several meta-level notions, which deepens the gap
between the calculus and implementation-related problems. This contribution defines the Pure Pattern
Calculus with Explicit Matching, a refinement which is confluent and simulates PPC, and allows reasoning on the pattern matching mechanisms. This enables the definition of new reduction strategies in the
spirit of call-by-name (which is new in this kind of framework, since the reduction of the argument of a
function depends on the pattern of the function, a pattern which is itself a dynamic object).
References
[Bal08] T. Balabonski: Calculs avec Motifs Dynamiques. Rapport technique PPS, Université Paris Diderot, 2008.
Available at http://hal.archives-ouvertes.fr/hal-00476940.
[FMS06] M. Fernández, I. Mackie, F.-R. Sinot: Interaction Nets vs the Rho-Calculus: Introducing Bigraphical
Nets. ENTCS, 154(3):19–32, 2006.
[Jay09] B. Jay. Pattern Calculus: Computing with Functions and Data Structures. Springer, 2009.
[JK09] B. Jay and D. Kesner: First-class patterns. J. Funct. Programming, 19(2):191–225, 2009.
[KLR10] D. Kesner, C. Lombardi and A. Rı́os: Standardisation for Constructor Based Pattern Calculi. 5th
International Workshop on Higher-Order Rewriting: HOR 2010.
[Klo80] J. W. Klop: Combinatory Reduction Systems. Ph.D. Thesis, Mathematisch Centrum, Amsterdam, 1980.
[KvOdV08] J. W. Klop, V. van Oostrom, and R. de Vrijer: Lambda calculus with patterns. TCS, 398:16–31, 2008.
Higher-order Rewriting for Executable Compiler Specifications
Kristoffer Rose, IBM Research∗
Abstract

In this paper we show how a simple compiler can be completely specified using higher-order rewriting in all stages of the compiler: parser, analysis/optimization, and code generation, specifically using the crsx.sourceforge.net system for a small declarative language called "X" inspired by XQuery (for which we are building a production compiler in the same way).

1 Introduction

A canonical minimal compiler consists of a parser translating the source language SL to the intermediate language IL, some rewrites inserting analysis results into and performing simplifications of the IL, and code generation to the target language TL (presumably using the annotations).

    SL —Parse→ IL —CodeGen→ TL    (with Rewrite looping on the IL)

The actual samples we'll present below are mere toys, of course, but do illustrate the ideas in a manner that is consistent with the production compiler.

For terms and rewriting we use the CRSX system [12, 13] variation of Combinatory Reduction Systems [9]. We have chosen to use the straight CRSX notation here to show the actual executable form of our specifications; the basic notations are summarized in Appendix A.

2 Parse

The first component of our compiler is the parsing from X syntax to the IL, which consists of higher-order terms (one can consider the IL as a higher-order abstract syntax [11] representation). In the CRSX system this is achieved with the parser specification shown in Figure 1, which follows the format of the JJCRS component of the CRSX system [13]. The parser specification is merely a grammar defining some tokens (written in ⟨·⟩ brackets) and some nonterminals, such as can be found in any text on compilers, using annotations to provide details of how to build the (higher-order) abstract syntax tree:

• each ⟦. . .⟧ denotes the generated term for that particular production choice;
• nonterminals are named with meta-variables¹ such that they can be referenced in generated terms by use of the corresponding meta-variable (with an optional numeric subscript, if needed);
• if a token is marked with a superscript meta-variable, like ⟨Comp⟩^C0, then the corresponding meta-applications, here C0[. . .], build a construction with the token value as constructor;
• ?v after a token promotes the token value to a scoped identifier definition and makes [v] after a single nonterminal in the same production indicate that the scope of v is that nonterminal; and
• !v after a token indicates that the token must be an occurrence of an identifier that is in scope.

∗ IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA; [email protected], http://kristoffer.rose.name. Printed May 17, 2010.
¹ Meta-variables are here italic upper case letters; see Appendix A for details.
    X ::= E  ⟦X[E]⟧                                                 (program)
    E ::= S1 "," E2  ⟦Con[S1, E2]⟧  |  S1  ⟦S1⟧                     (expression)
    S ::= "for" ⟨Var⟩?v "in" S1 "return" S2[v]  ⟦For[S1, x.S2[x]]⟧
        | "if" "(" E0 ")" "then" S1 "else" S2   ⟦If[E0, S1, S2]⟧
        | C1  ⟦C1⟧                                                  (simple)
    C ::= A1 ⟨Comp⟩^C0 A2  ⟦C0[A1, A2]⟧  |  A1  ⟦A1⟧                (comparison)
    A ::= P1 ⟨Op⟩^O0 A2  ⟦O0[P1, A2]⟧  |  P1  ⟦P1⟧                  (addition)
    P ::= ⟨Var⟩!v  ⟦v⟧  |  ⟨Int⟩^I  ⟦Int[I]⟧
        | "(" E ")"  ⟦E⟧  |  "sum" P  ⟦Sum[P]⟧                      (primary)
    ⟨Var⟩ ::= [a-z, A-Z] [a-z, A-Z, 0-9]+                           (var)
    ⟨Int⟩ ::= [0-9]+                                                (int)
    ⟨Comp⟩ ::= "le"                                                 (comp)
    ⟨Op⟩ ::= "-"                                                    (op)

Figure 1: X syntax and IL generation.
Notice the following specifics for the X grammar:
• The top-level nonterminal is X, which generates an X program term.
• An expression is either produced by S "," E (generating a Con-term) or just by S (simply generating what is generated for the S).
• A simple expression S has three forms, where the for expression is the only one involving a scope: the grammar annotation specifies where the variable and scope occur in the syntax and then specifies how they are transferred to the term, where the scope is explicitly specified by x.S2[x].
• One of the choices for a primary expression P is a ⟨Var⟩ occurrence where the "!" in the annotation stipulates that the symbol must be in scope (thus effectively occur as the iteration variable of a parent for).
The X program

    sum(for n in (1,2) return n-1)

parses to the term

    X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]]
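In an implementation setting one can think of the generated IL as a higher-order abstract syntax datatype. A sketch of what this could look like in OCaml (our rendering, not the CRSX representation), where the For binder becomes a function:

    type il =
      | Int of int
      | Con of il * il                  (* sequence construction ","  *)
      | For of il * (il -> il)          (* for v in e1 return e2      *)
      | If of il * il * il
      | Le of il * il                   (* the 'le' comparison        *)
      | Sub of il * il                  (* the "-" operator           *)
      | Sum of il
      | Prog of il                      (* the top-level X[...]       *)

    (* sum(for n in (1,2) return n-1) as an il value: *)
    let example =
      Prog (Sum (For (Con (Int 1, Int 2), fun n -> Sub (n, Int 1))))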
In summary, the parser specification looks like many
other abstract syntax tree generation notations, such
as MetaPRL [7], except for the additional direct support for higher order abstract syntax by explicitly
specifying the scoping and a pleasantly compact way
to generate terms where tokens are used directly as
constructors, which reduces the size of large parsers
considerably.
3 Rewrite
As a simple analysis example we show the use of inference rules for a simple X type checker in Figure 2.
Again, such systems can be found in standard texts
on programming language semantics and compilation, and the system shown here is directly generated
from the inference system processed by CRSX.2
The rules define a judgment that adds type information to every subterm in the form of an environment annotation, {Ty : T }, which associates Ty (a
constructor like all the upright names in the system)
with the type of the annotated subterm, T , where the
type sort can be described by
    T ::= 1[U] | ∗[U]    (type)
    U ::= B | I | ?      (unboxed)

(where 1 and ∗ are cardinalities, and B, I, and ? stand for boolean, integer, and unknown value types, respectively).
Definition 3.1. If E has the free variables x1 , . . . , xn
then
{Γ ; x1 : T1 ; . . . ; xn : Tn }E ⇒ {Γ ′ ; Ty : T }E ′
means that, with x1 of type T1, etc., the type of E is T.
² The only difference is that CRSX processes terms of the form Infer[N, (P1; . . . Pn;), C], which are displayed as inference rules with premises P1 . . . Pn above the conclusion C and label (N).
    {Γ}E ⇒ E′
    ──────────────────────────────── (X)
    {Γ} X[E] ⇒ {Γ; Ty : T[E′]} X[E′]

    {Γ}E1 ⇒ E1′    {Γ}E2 ⇒ E2′
    ─────────────────────────────────────────────────────── (Con)
    {Γ} Con[E1, E2] ⇒ {Γ; Ty : J[T[E1′], T[E2′]]} Con[E1′, E2′]

    ∀x.( {Γ}E1 ⇒ E1′    {Γ; x : P[T[E1′]]}E2[x] ⇒ E2′[x]    T[E2′[x]] = T2 )
    ───────────────────────────────────────────────────────── (For)
    {Γ} For[E1, v.E2[v]] ⇒ {Γ; Ty : S[T2]} For[E1′, v.E2′[v]]

    {Γ}E ⇒ E′    T[E′] = 1[B]    {Γ}E1 ⇒ E1′    {Γ}E2 ⇒ E2′
    ───────────────────────────────────────────────────────── (If)
    {Γ} If[E, E1, E2] ⇒ {Γ; Ty : U[T[E1′], T[E2′]]} If[E′, E1′, E2′]

    {Γ}E1 ⇒ E1′    T[E1′] = 1[I]    {Γ}E2 ⇒ E2′    T[E2′] = 1[I]
    ──────────────────────────────────────────────── (le)
    {Γ} ’le’[E1, E2] ⇒ {Γ; Ty : 1[B]} ’le’[E1′, E2′]

    {Γ}E1 ⇒ E1′    T[E1′] = 1[I]    {Γ}E2 ⇒ E2′    T[E2′] = 1[I]
    ──────────────────────────────────────────── (-)
    {Γ} -[E1, E2] ⇒ {Γ; Ty : 1[I]} -[E1′, E2′]

    ──────────────────────────────────── (Var)
    {Γ; x : T}x ⇒ {Γ; Ty : T} Typed[x]

    {Γ}E ⇒ E′    S[T[E′]] = ∗[I]
    ─────────────────────────────────── (Sum)
    {Γ} Sum[E] ⇒ {Γ; Ty : 1[I]} Sum[E′]

    ─────────────────────────────────── (Int)
    {Γ} Int[I] ⇒ {Γ; Ty : 1[I]} Int[I]

    J[1[T], 1[T′]] → ∗[Uu[T, T′]];  J[1[T], ∗[T′]] → ∗[Uu[T, T′]];  J[∗[T], 1[T′]] → ∗[Uu[T, T′]];  J[∗[T], ∗[T′]] → ∗[Uu[T, T′]];
    U[1[T], 1[T′]] → 1[Uu[T, T′]];  U[1[T], ∗[T′]] → ∗[Uu[T, T′]];  U[∗[T], 1[T′]] → ∗[Uu[T, T′]];  U[∗[T], ∗[T′]] → ∗[Uu[T, T′]];
    Uu[I, I] → I;  Uu[B, B] → B;  Uu[I, B] → ?;  Uu[B, I] → ?;
    S[1[T]] → ∗[T];  S[∗[T]] → ∗[T];  P[1[T]] → 1[T];  P[∗[T]] → 1[T];
    T[{Γ; Ty : T}E] → T;

Figure 2: X typing inference rules and helpers.
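The helper rules at the bottom of Figure 2 are plain first-order computations; as an aside, they translate directly into OCaml functions (a sketch of ours, with ? written Unknown):

    type u = B | I | Unknown
    type t = One of u | Many of u       (* 1[U] and ∗[U] *)

    let uu a b =                        (* Uu : union of unboxed types *)
      match (a, b) with I, I -> I | B, B -> B | _ -> Unknown

    let unbox = function One a | Many a -> a

    let j t1 t2 = Many (uu (unbox t1) (unbox t2))   (* J : concatenation *)

    let union t1 t2 =                   (* U : single only if both single *)
      match (t1, t2) with
      | One a, One b -> One (uu a b)
      | _ -> Many (uu (unbox t1) (unbox t2))

    let s t = Many (unbox t)            (* S : sequence of          *)
    let p t = One (unbox t)             (* P : prime (member) type  *)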
The rules follow the conventions of compilers written in natural semantics [5] by being “left-to-right
deterministic,” and can be translated directly into
rewrite rules corresponding to a recursive functional
program, except for the (For) rule, which involves a
“subproof under binding.” The higher order nature of
(For) is manifest in the rule by the ∀x.(. . . ) wrapper
around all the premises: this permits the subproofs
of the premises to use x as a free variable. It is then
fairly easy to establish that for a closed expression E,
all subproofs of the proof for {}E ⇒ E ′ will satisfy
the invariant that Γ = {x1 : T1; . . . ; xn : Tn} contains all free variables in the subterm to investigate.
It is also pleasant that the impact of the use of bound
variables is restricted to the rules dealing explicitly
with binding: there are no “lift” or “shift” operators
to manipulate.
Each inference judgment (in our example just “⇒”)
corresponds to a family of recursive functions and
constructors that permits representing the stage of
each inference rule as a data structure with a “current
problem” top judgment to reduce next. Note that the
synthetic function symbols are inserted under the environment, which makes non-terms like {Γ; x : T}x
possible: the generated rewrite rule for the (Var) inference will, in fact, be

    {Γ; x : T} ?-[x] → {Γ; Ty : T} Typed[x]

because each judgment ⇒N corresponds to a function named ?-N.
In addition, the rules employ a number of helper
rewrite rules that define the computation of certain
type combinations, e.g., J computes the type of a
concatenation, U means union, S means sequences
of, and P means the type of members of sequences
(called the “prime” type).
We give the details for how inference rules are
translated into rewrite rules in a separate paper; here
we shall just remark that the following properties of
the rules are important for permitting efficient translation all the way to low level code:
• Every conclusion judgment has a unique constructor in the pattern. (This can be relaxed to
a condition on unique prefix collections similarly
to the way LR grammars are generalized to LALR
grammars.)
• The left side of every premise judgment can be
fully constructed from components from prior
(left) premises and the left side of the conclusion
judgment.
• Judgments only ever observe inherited attributes
in the environment, i.e., values set in the context (specifically for bound variables set at their
binder site, which in fact permits using a single
global environment where no variables are ever
removed).
The example term,
X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]]
type analyses to
X[{Ty:1[I]}Sum[{Ty:*[I]}For[
{Ty:*[I]}Con[{Ty:1[I]}Int[1], {Ty:1[I]}Int[2]],
n . {Ty:1[I];n:1[I]}-[
{Ty:1[I];n:1[I]}Typed[n],
{Ty:1[I];n:1[I]}Int[1]
]
]]]
We can now use the type annotations to perform
more traditional CRS(X) simplification rewrites, such
as
{Γ } For[{Ty : 1[T ]}E1 , v.E2 []] → E2 []
(which matches For expressions that iterate over a
singleton sequence and do not use the iteration variable in the result, and replaces them with just the For
body).
4 Code Generation

Finally, the code generation. We have found that binders can be used rather effectively to represent registers. We'll use the following instruction set, which is a hybrid of (relational) algebras and a register machine and has proven to have a good abstraction level when compiling into current imperative languages such as Java or C.

Code C is a "two-address code" with one of the following forms: PR[o.C] is a complete program where the code C must send all output to the output (handler) o; (C1; C2) first executes C1, then C2; NR[T, r.C] creates a new uninitialized register r of type T, scoped over the execution of C; LV[r1, r2] loads (a copy of) the value of r2 into r1; LC[r, V] loads the constant value V (an integer or the booleans T or F) into r; −[r1, r2] subtracts the value of r2 from r1; +[r1, r2] adds the value of r2 to r1; LE[r1, r2, C1, C2] checks if the value of r1 is less than or equal to r2 and then executes C1, otherwise C2; TF[r, C1, C2] checks if the value of r is T and then executes C1, otherwise C2; OU[r, o] outputs a copy of the value of r to the output o; BU[o.C, r] creates a buffer collecting from o, allows C to send output items of type T to o, and then copies the result to r; and PC[T, o.C1, r.C2] is a pipeline that executes C1 and for every value of type T sent to o executes C2 with the value in r.

Notice that this instruction set does not have the usual typed assembly language property that all registers are constant: we prefer to keep the option of reuse of variables, which is particularly pertinent because it permits representing handlers by registers.

The code generation rules, shown in Figure 3 (reformatted in mathematics style from the original text file), have three compilation schemes: G1 generates code that sends the computed value to an output, G2 generates code that selects between two code fragments depending on whether the computed boolean value is "true" or "false" (so it is only defined for expression fragments that can be boolean), and G3 is a helper scheme that generates code that stores the computed value in a register (using an intermediate buffer; further simplifications can be achieved by also specialising G3 to each case). It uses the P and T helpers from Figure 2.

It should be noted that this code generation scheme is especially suited for reasoning about data flow. Code generation for the small sample program is omitted for space reasons.

At the end what remains is to put all the pieces together. The driver is the top-level X symbol introduced by parsing. We add the following rule to start reduction directly from the parser:

    X[E] → G[?-[E]];

5 Discussion

I have found that this kind of architecture is relatively easy to explain to developers with traditional "compiler block diagrams" like the one in the introduction, where the fact that each analysis and translation is specified independently makes using a structured approach realistic.
    G[E] → PR[o. G1[E, o]];                                                                       (G)
    G1[Con[E1, E2], o] → (G1[E1, o]; G1[E2, o])                                                   (Con1)
    G1[For[E1, v.E2[v]], o] → PC[P[T[E1]], o1. G1[E1, o1], v. G1[E2[v], o]]                       (For1)
    G1[If[E, E1, E2], o] → G2[E, G1[E1, o], G1[E2, o]]                                            (If1)
    G1[{Ty : T} ’le’[E1, E2], o] → NR[1[B], b.(G2[{Ty : T} ’le’[E1, E2], LC[b, T], LC[b, F]]; OU[b, o])]  (le1)
    G1[{Ty : T} -[E1, E2], o] → G3[E1, r1. G3[E2, r2.(−[r1, r2]; OU[r1, o])]]                     (−1)
    G1[{Ty : T} Sum[E1], o] → NR[T, s.(LC[s, 0]; PC[o. G1[E1, o], v.+[s, v]]; OU[s, o])]          (Sum1)
    G1[{Ty : T}x, o] → OU[x, o]                                                                   (Var1)
    G1[{Ty : T} Int[E], o] → NR[T, i.(LV[i, E]; OU[i, o])]                                        (Int1)
    G2[{Ty : T} Let[E1, v.E2[v]], C1, C2] → G3[E1, r1. G2[E2[r1], C1, C2]]                        (Let2)
    G2[{Ty : T} If[E, E1, E2], C1, C2] → G3[{Ty : T} If[E, E1, E2], b. TF[b, C1, C2]]             (If2)
    G2[{Ty : T} ’le’[E1, E2], C1, C2] → G3[E1, r1. G3[E2, r2. LE[r1, r2, C1, C2]]]                (le2)
    G2[{Ty : T}x, C1, C2] → TF[x, C1, C2]                                                         (Var2)
    G3[{Ty : T}E, r.C[r]] → NR[T, r.(BU[o. G1[{Ty : T}E, o], r]; C[r])]                           (Store)

Figure 3: X code generation rules.
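To fix intuitions about the target of these rules, here is a sketch of the two-address code as an OCaml datatype (our own notation, not CRSX's; the binders o.C and r.C of the paper are rendered as functions):

    type reg = string
    type out = string

    type code =
      | PR of (out -> code)                 (* complete program            *)
      | Seq of code * code                  (* (C1; C2)                    *)
      | NR of ty * (reg -> code)            (* new register scoped over C  *)
      | LV of reg * reg                     (* load value of r2 into r1    *)
      | LC of reg * const                   (* load constant into r        *)
      | Sub of reg * reg                    (* −[r1, r2]                   *)
      | Add of reg * reg                    (* +[r1, r2]                   *)
      | LE of reg * reg * code * code       (* branch on r1 ≤ r2           *)
      | TF of reg * code * code             (* branch on boolean register  *)
      | OU of reg * out                     (* output value of r to o      *)
      | BU of (out -> code) * reg           (* buffer output of C into r   *)
      | PC of ty * (out -> code) * (reg -> code)   (* pipeline C1 into C2  *)

    and ty = One of unboxed | Many of unboxed      (* 1[U] and ∗[U]        *)
    and unboxed = B | I | Unknown
    and const = CInt of int | CTrue | CFalse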
The chaotic nature of the resulting execution of the specification comes out as an advantage, and our implementation using a standard functional innermost-needed strategy often ends up interleaving the stages of the compilation in interesting ways, for example eliminating dead code before type checking, usually making mistakes in dependencies blatantly obvious.
The current production compiler prototype specified in this way is working so well that it seems feasible to implement the actual production compiler this way. Important factors in this have been the disciplined use of systems that can be transformed into orthogonal constructor systems, for which a table-driven normalizing strategy can be used in almost all cases (there is a performance penalty for some substitution cases).
The CRSX system implements higher-order rewriting fully in the form of CRS, and can thus handle full substitution and express transformations such as inlining. However, it turns out that many specific systems share with the small ones presented here the property that they use only "explicit substitution" style rewrites, which only permit observing variables [1]. Indeed it seems that the fact that the approach is not functional or logical is an advantage: the expressive power of explicit substitution is strictly smaller (in a complexity sense) than general functions.
Related Work. The area of verifying a compiler specification is well established using both hand-written and mechanical proofs [4]. Work has also been done on linking correct compiler specifications and implementations using generic proof theoretic tools [10]. Tools supporting mechanical generation of compilers from specifications, such as ASF+SDF [2] and Stratego [3], have focused on
compilers restricted to first-order representations of
intermediate languages used by the compiler and on
using explicit rewriting strategies to guide compilation. Our goal is the opposite: to only specify dependencies between components of the compiler and
leave the actual rewriting strategy to the system (in
practice using analysis-guided rule transformations
coupled with a generic normalizing strategy).
We are only aware of one published work that
uses higher order features with compiler construction, namely the work by Hickey and Nogin on specifying compilers using logical frameworks [7]. The
resulting specification looks very similar to ours, and
indeed one can see the code synthesis that could be
done for their logic system as similar to the code generation we are employing. Also, both systems employ embedded source language syntax and higherorder abstract syntax. However, there are differences
as well. First, CRSX is explicitly designed to implement just the kind of rewrite systems that we have
described, and is tuned to generate code that drives
transformation through lookup tables. Second, variables are first class in CRSX and not linked to meta-level abstraction, thus closer to the approach used
by explicit substitution for CRS [1] and “nominal”
rewriting [6]. This permits us, for example, to use
an assembly language with mutable registers. Third,
we find that the focus on local rewriting rules is easier to explain to compiler writers, and the inclusion
of environments and inference rules in the basic notation further helps. Finally, the CRSX engine has no
assumed strategy so we find the notion of local correctness easier to grasp.
What’s Next? With CRSX we continue to experiment with pushing the envelope for supporting more
higher-order features without sacrificing efficiency.
An important direction is to connect with nominal
rewriting and understand the relationship between
what the two formalisms can express.
Another interesting direction for both performance
and analysis is to introduce explicit weakening operators that “unbind” a given bound variable in a part of
its scope. While used in this way with explicit substitution [15, 8], the interaction with higher-order
rewriting is not yet clear.
In companion papers we explain the details of the
translation from the supported three forms of rules,
“recursive compilation scheme,” “chaotic annotation
rules,” and “deterministic inference rules,” into effective native executables, and we explain annotations
that make it feasible to avoid rewriting-specific static
mistakes.
Acknowledgements. The author is grateful for insightful comments from the anonymous referees, including being made aware of the work on logical frameworks.
References
[1] R. Bloo and K. H. Rose. Combinatory reduction systems with explicit substitution that preserve strong
normalisation. In H. Ganzinger, editor, RTA ’96—
Rewriting Techniques and Applications, number 1103
in Lecture Notes in Computer Science, pages 169–
183, New Brunswick, New Jersey, July 1996. Rutgers
University, Springer-Verlag.
[2] M. Brand, J. Heering, P. Klint, and P. A. Olivier.
Compiling Rewrite Systems: The ASF+SDF Compiler. ACM Transactions on Programming Languages
and Systems, 24(4):334–368, 2002.
[3] M. Bravenboer, A. van Dam, K. Olmos, and E. Visser.
Program transformation with scoped dynamic rewrite
rules. Fundamenta Informaticae, 69(1–2):123–178,
2006.
[4] M. A. Dave. Compiler verification: a bibliography.
SIGSOFT Softw. Eng. Notes, 28(6):2–2, 2003.
[5] J. Despeyroux. Proof of translation in natural semantics. In First Symposium on Logic in Computer
Science, pages 193–205, Cambridge, Massachusetts,
USA, June 1986. IEEE Computer Society.
[6] M. Fernández and M. J. Gabbay. Nominal rewriting.
Inf. Comput., 205(6):917–965, 2007.
[7] J. Hickey and A. Nogin. Formal compiler construction in a logical framework. Higher-Order and Symb.
Comp., 19(2-3):197–230, Sept. 2006.
[8] D. Kesner and F. Renaud. The prismoid of resources.
In 34th International Symposium on Mathematical
Foundations of Computer Science (MFCS), number
5734 in LNCS, pages 464–476, Novy Smokovec, High
Tatras, Slovakia, Aug. 2009. Springer-Verlag.
[9] J. W. Klop, V. van Oostrom, and F. van Raamsdonk.
Combinatory reduction systems: Introduction and
survey. Theoretical Computer Science, 121:279–308,
1993.
[10] K. Okuma and Y. Minamide. Executing verified compiler specification. In A. Ohori, editor, APLAS 2003—
First Asian Symposium on Programming Languages
and Systems, volume 2895 of Lecture Notes in Computer Science, pages 178–194, Beijing, China, Nov.
2003. Springer.
[11] F. Pfenning and C. Elliott. Higher-order abstract syntax. SIGPLAN Notices, 23(7):199–208, 1988.
[12] K. Rose. CRSX – an open source platform for experimenting with higher order rewriting. Presented in absentia at HOR 2007. http://kristoffer.rose.name/papers, June 2007.
[13] K. Rose. Combinatory reduction systems with extensions. http://crsx.sourceforge.net, Apr. 2010.
[14] K. H. Rose. Operational Reduction Models for Functional Programming Languages. PhD thesis, DIKU, University of Copenhagen, Universitetsparken 1, DK-2100 København Ø, Feb. 1996. http://krisrose.net/thesis.pdf.
[15] K. H. Rose, R. Bloo, and F. Lang. On explicit substitution with names. IBM Research Report RC24909, IBM
Thomas J. Watson Research Center, P.O. Box 704,
Yorktown Heights, NY 10598, USA, Dec. 2009.
A CRSX Summary
Our setting is Combinatory Reduction Systems (CRS) [9] as realized by the “CRSX” system [13]. Here we briefly summarize the notation used and where it differs from reference CRS.
A.1 Lexical conventions
At the character level, usual white space is permitted and each of the reserved characters [ ] ( ) { } : ; . , is a separate token (except in strings and entities as noted below). Everything between // and the end of a line (outside of a string) is white space (a comment).
Three categories of composite tokens are used:
• Words starting with a lower case letter are variables, denoted v in the grammar below.
• Capitalized italic words (as well as words that contain the # character) are meta-variables, generally denoted by M in the grammar below.
• All other words and composite symbols (except those containing < > &), as well as strings written inside double or single quotes (permitting white space and reserved characters as well as both duplicated and \-escaped quotes of the same kind inside), are constructors, generally denoted with the letter C in the grammar below.
Notice that strings are just ordinary constructors C and thus can be used wherever such are allowed. Constructors containing $ in their name are reserved.
A.2 Terms
The lexical tokens combine into terms according to the following basic grammar (the characters []{}, should be seen as literals):

    t ::= v | {e}C[s, . . . , s] | M[t, . . . , t]    (Terms)
    s ::= ~v.t | t                                    (Scope)
    e ::= M | e; v : t | e; C : T                     (Env)

where ~v denotes v1 v2 . . . vn for n > 0. A scope binds the variables in the vector such that lexical occurrences of these inside the t denote those specific variable instances (with the usual caveat that the innermost possible scope is used for each particular variable name).
The parser furthermore permits the following abbreviations borrowed from λ calculus and programming languages (the sketch after this list illustrates how the application and list abbreviations desugar):
• Parentheses are allowed around every term, so (t) is the same as t;
• c ~v.t abbreviates c[v1 .c[v2 . · · · c[vn .t] · · · ]];
• t1 t2 abbreviates @[t1 ,t2 ] and is left recursive, so t1 t2 t3 is the same as (t1 t2 )t3 ;
• t1 ; t2 abbreviates $Cons[t1 ,t2 ] and is right recursive, with the special rule that omitted segments correspond to the special subterm $Nil, so (t1 ;t2 ; ) corresponds to the term $Cons[t1 ,$Cons[t2 ,$Nil]]; and
• empty brackets [] can be omitted.
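The following is a minimal executable sketch, in Haskell, of the term grammar and of how the two application and list abbreviations desugar. The datatype and function names are ours and environments are omitted, so this is an illustration of the notation rather than part of CRSX itself.

    -- Hypothetical Haskell rendering of the CRSX term grammar above;
    -- environments are omitted for brevity.
    data Term
      = Var String          -- v
      | Con String [Scope]  -- C[s, ..., s]
      | Meta String [Term]  -- M[t, ..., t]
      deriving Show

    data Scope = Scope [String] Term  -- ~v.t; an empty vector is a plain term
      deriving Show

    -- t1 t2 abbreviates @[t1, t2] and is left recursive,
    -- so t1 t2 t3 desugars as (t1 t2) t3.
    app :: Term -> Term -> Term
    app t1 t2 = Con "@" [Scope [] t1, Scope [] t2]

    -- t1; t2; ... abbreviates $Cons[t1, $Cons[t2, $Nil]].
    list :: [Term] -> Term
    list = foldr (\t r -> Con "$Cons" [Scope [] t, Scope [] r]) (Con "$Nil" [])

For instance, app (app t1 t2) t3 is the desugaring of t1 t2 t3, and list [t1, t2] is that of (t1; t2; ).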
A.3 Rules
Rules are a special interpretation of terms used by the rewrite engine to define rewrite systems, written as
name[options] : pattern → contraction
with special conventions for the name, pattern, and
contraction parts:
• the name becomes the name of the rule,
• the options part is a comma-separated list of instructions to relax the requirements that every used meta-variable occurs exactly once on each side of the rule, that all used variables are explicitly scoped, and that all pattern meta-applications permit all in-scope variables,
• the pattern is just a term that must be a construction where meta-applications are applied exclusively
to distinct bound variables, and
• the → is really written as “&rarr;”.
Otherwise, rewriting is like CRS, where matching builds a “valuation” that maps each meta-variable to an abstraction over the bound pattern variables in the redex, and contraction then uses the valuation to substitute matched bound variables appropriately [9]. Rules generalize to match environments in the natural way, with one important restriction to avoid the axiom of choice: constant keys must be given explicitly, and variable keys must be constrained to a specific variable (by being matched elsewhere in the pattern). Additionally, environment patterns permit a “catch-all” meta-variable that denotes the collection of all key-value pairs in the matched environment (including any matched explicitly), which can then be inserted by contraction (where explicit key-value pairs are seen as modifiers).
Finally, we permit free variables in patterns and contractions. A free variable in a pattern will only match actual locally free variable occurrences in redices. This interferes with the usual CRS constraint that distinct variables in a pattern match distinct variables in the redex, which is explicit there to avoid capture of locally bound variables. A free variable in a contraction is either already matched in the pattern or corresponds to generating a globally “fresh” variable [14].
A.4 Differences from Standard CRS
The CRSX notation differs from standard CRS in the following ways.
• Meta-variables use distinct tokens rather than reserving the Z symbol.
• Binding is indicated with a dot as in λ calculus instead
of using leading brackets. In addition, scopes are not
terms but are restricted to occur on the subterms of
constructions. So in CRSX the pattern C[#] does not
match the term C[x.x] and the pattern C[x.#[x]] does
not match the term C[x y.A].
• Several abbreviated forms are permitted.
• There is special support for environments and the use
of free variables.
Swapping: a natural bridge between named and indexed
explicit substitution calculi
Ariel Mendelzon
Depto. de Computación, FCEyN, Universidad de Buenos Aires
[email protected]

Alejandro Ríos
Depto. de Computación, FCEyN, Universidad de Buenos Aires
[email protected]

Beta Ziliani
Depto. de Computación, FCEyN, Universidad de Buenos Aires
[email protected]
This article is devoted to the presentation of λ rex, an explicit substitution calculus with de Bruijn
indexes and an elegant notation. By being isomorphic to λ ex – a state-of-the-art formalism with
variable names –, λ rex accomplishes simulation of β -reduction (Sim), preservation of β -strong normalization (PSN) and metaconfluence (MC), among other desirable properties. Our calculus is based
on a novel presentation of λdB , using a peculiar swap notion that was originally devised by de Bruijn.
Besides λ rex, two other indexed calculi isomorphic to λ x and λ xgc are presented, demonstrating the
potential of our technique when applied to the design of indexed versions of known named calculi.
1 Introduction
This article is devoted to explicit substitutions (ES, for short), a formalism that has attracted attention
since the appearance of λ σ [1] and, later, of Melliès’ famous counterexample [13]. The main motivation
behind the field of ES is studying how substitution behaves when internalized in the language it serves
(in the classic λ -calculus, substitution is a meta-level operation). Several calculi have been proposed
since [13] and few have been shown to have a whole set of desirable properties: Sim, PSN, MC, Full
Composition, etc. For a detailed introduction to the ES field, we refer the reader to e.g. [12, 11, 16].
In 2009, D. Kesner proposed λ ex [11], a formalism with variable names that has the entire set of
properties expected from an ES calculus. As far as we know, no ES calculus with de Bruijn indexes [5],
a simple enough notation and the whole set of properties exists to date. We present here such a calculus:
λ rex, based on a more adequate swapping-based version of λdB [5] – λ r –, that we also introduce here.
Moreover, the calculus is isomorphic to λ ex and, therefore, properties are preserved exactly. Together
with λ r and λ rex we present λ re and λ regc , two formalisms that, in turn, are isomorphic to λ x [4, 3] and
λ xgc [4]. As far as we know, no indexed isomorphic versions of λ x and λ xgc were known either.
2 A new presentation for λdB : the λ r-calculus
The λ -calculus with de Bruijn indexes (λdB , for short) [5] accomplishes the elimination of α-equivalence,
since α-equivalent λ-terms are syntactically identical under λdB . This greatly simplifies implementations,
given that working modulo α-equivalence is generally tedious and expensive. One usually refers to a
de Bruijn indexed calculus as a nameless calculus, for names are replaced by indexes. We observe here
that, even though this nameless notion makes sense in the classical λdB -calculus (because the substitution
operator is located in a meta-level), it seems not to be the case in certain ES calculi derived from λdB ,
such as: λ s [8], λ se [9] or λ t [10]. These calculi have constructions of the form a[i := b] to denote ES
(notations vary). Here, even though i is not a name per se, it plays a similar role: i indicates which index
should be replaced; then, we believe, these calculi are not purely nameless.
Given this not-completely-nameless notion, we start by eliminating the index i from the substitution
operator. Then, we are left with terms of the form a[b], and with a (Beta) reduction rule that changes
from (λ a) b → a[1 := b] to (λ a) b → a[b]. The semantics of a[b] should be clear from the new (Beta)
rule. The problem is, of course, how to define it. Two difficulties arise when a substitution crosses (goes
into) an abstraction: first, the indexes of b should be incremented in order to reflect the new variable
bindings; second – and the key to our technology –, some mechanism should be implemented in order to
replace the need for indexes inside closures (since these should be incremented, too).
The first problem is solved easily: we just use an operator to progressively increment indexes with every abstraction crossing, in the style of λ t [10]. The second issue is a bit harder. Figure 1 will help us clarify what we do when a substitution crosses an abstraction, momentarily using σ b a to denote a[b] in order to emphasize the binding character of the substitution. In this example we use the term σ b (λ 1 2) (which stands for (λ 1 2)[b]). Figure 1(a) shows the bindings in the original term; Figure 1(b) shows that bindings are inverted if we cross the abstraction and do not make any changes. Then, in order to get bindings “back on the road”, we just swap indexes 1 and 2! (Figure 1(c)). With this operation we recover, intuitively, the original semantics of the term. Summarizing, all that is needed when abstractions are crossed is: swap indexes 1 and 2 and, also, increment the indexes of the term carried in the substitution. That is exactly what λ r does, with substitutions in the meta-level. In particular, we show that λ r and λdB are identical.

[Figure 1: Bindings. (a) σ b (λ 1 2); (b) λ (σ b 1 2); (c) λ (σ b 2 1).]
Terms for λ r are the same as those for λdB . That is:
Definition 1 (Terms for λdB and λ r). The set of terms for λdB and λ r, denoted ΛdB , is given in BNF by:
a ::= n | a a | λ a      (n ∈ N>0 )
We now define the new meta-operators used to implement index increments and swaps.
Definition 2 (Increment operator – ↑i ). For every i ∈ N, ↑i : ΛdB → ΛdB is given inductively by:
↑i (n) = n if n ≤ i, and n + 1 if n > i
↑i (a b) = ↑i (a) ↑i (b)
↑i (λ a) = λ ↑i+1 (a)
Definition 3 (Swap operator – li ). For every i ∈ N>0 , li : ΛdB → ΛdB is given inductively by:
li (n) = n if n < i ∨ n > i + 1; i + 1 if n = i; i if n = i + 1
li (a b) = li (a) li (b)
li (λ a) = λ li+1 (a)
Finally, we present the meta-level substitution definition for λ r, and then the λ r-calculus itself.
Definition 4 (Meta-substitution for λ r). For every a, b, c ∈ ΛdB , n ∈ N>0 , •{•} : ΛdB × ΛdB → ΛdB is
given inductively by:
n{c} = c if n = 1, and n − 1 if n > 1
(a b){c} = a{c} b{c}
(λ a){c} = λ l1 (a){↑0 (c)}
Definition 5 (λ r-calculus). The λ r-calculus is the reduction system (ΛdB , βr ), where βr ⊆ ΛdB × ΛdB is:
(∀a, b ∈ ΛdB ) a →βr b ⇐⇒ (∃ C context; c, d ∈ ΛdB ) (a = C [(λ c)d] ∧ b = C [c{d}])
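To make the definitions above concrete, here is a small executable sketch of the λ r meta-operators in Haskell; the datatype and function names are ours, and the code is a direct transcription of Definitions 2–4, not an optimized implementation.

    -- Hypothetical transcription of Definitions 2-4.
    data Tm = Ix Int | App Tm Tm | Lam Tm
      deriving (Eq, Show)

    -- Increment operator (Definition 2): lift every index above i.
    up :: Int -> Tm -> Tm
    up i (Ix n)    = Ix (if n <= i then n else n + 1)
    up i (App a b) = App (up i a) (up i b)
    up i (Lam a)   = Lam (up (i + 1) a)

    -- Swap operator (Definition 3): exchange indexes i and i+1.
    swap :: Int -> Tm -> Tm
    swap i (Ix n)
      | n == i     = Ix (i + 1)
      | n == i + 1 = Ix i
      | otherwise  = Ix n
    swap i (App a b) = App (swap i a) (swap i b)
    swap i (Lam a)   = Lam (swap (i + 1) a)

    -- Meta-substitution a{c} (Definition 4): index 1 receives c, the other
    -- indexes are decremented, and crossing a binder swaps 1 and 2 in the
    -- body while incrementing the carried term.
    subst :: Tm -> Tm -> Tm
    subst (Ix n)    c = if n == 1 then c else Ix (n - 1)
    subst (App a b) c = App (subst a c) (subst b c)
    subst (Lam a)   c = Lam (subst (swap 1 a) (up 0 c))

    -- One (Beta) step at the root: (λ a) b → a{b}.
    beta :: Tm -> Maybe Tm
    beta (App (Lam a) b) = Just (subst a b)
    beta _               = Nothing

For the term of Figure 1, subst (Lam (App (Ix 1) (Ix 2))) b evaluates to Lam (App (Ix 1) (up 0 b)): the occurrence bound by the substitution receives the incremented b, while index 1 remains bound by the abstraction, as intended.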
The next theorem states the relationship between the λ r and λdB meta-substitution operators (see [8], for example, for the definition of the λdB meta-substitution a{{i ← b}}), having as an immediate corollary that λ r and λdB are the same calculus.
Theorem 6. For every a, b ∈ ΛdB : a{{1 ← b}} = a{b}. Therefore, λdB and λ r are the same calculus.
Proof. The proof is not trivial: to cope with stacked swaps and increments, an extended notion of these operations is needed, as well as many natural intermediate lemmas. A full technical proof can be found in [14], chapter 3. Moreover, this result was checked using the Coq theorem prover; the proof script can be downloaded from http://www.mpi-sws.org/~beta/lambdar.v.
3 The λ re, λ regc and λ rex calculi
In order to derive an ES calculus from λ r, we first need to internalize substitutions in the language.
Thus, we add the construction a[b] to ΛdB , and call the resulting set of terms Λre. As a design decision,
operators ↑i and li are left in the meta-level. Naturally, we must extend their definitions to the ES case, a task that needs several lemmas to ensure correctness, and that can be found in [14], chapter 4. Extensions
are rather intuitive:
↑i (a[b]) = ↑i+1 (a)[↑i (b)]      and      li (a[b]) = li+1 (a)[li (b)]
Then, we just orient the equalities from the meta-substitution definition as expected and get a calculus we
call λ re (that turns out to be isomorphic to λ x [4, 3], as we will later explain). It is important to mention
that, even though independently discovered, the swapping mechanism introduced in λ r – and then used
for the conception of λ re – was first depicted by de Bruijn for his Cλ ξ φ [6] (later updated w.r.t. notation
– λ ξ φ – and compared to λ υ in [2]). Although Cλ ξ φ is completely explicit (↑i and li are implemented
by means of a special sort of substitution, c.f. [2]), whereas λ re keeps ↑i and li as meta-operators, the
similarity between both calculi is remarkable. In spite of this, and as far as we know, no direct successor
of Cλ ξ φ was found to satisfy PSN and MC (in particular, Cλ ξ φ lacks composition).
As a next step in our work, we add Garbage Collection to λ re. To get this right, we introduce a new
meta-level operator – ↓i – that simply decrements indexes by one, so as to mimic the meta-substitution
(and the corresponding λ re rule) for the n > 1 index case. The operator is inspired by a similar one from
[15], needing a few lemmas to ensure a correct definition for ↓i (a[b]) (cf. [14], chapter 5). Then, we get:
Definition 7 (Decrement operator – ↓i ). For every i ∈ N>0 , ↓i : Λre → Λre is given inductively by:
↓i (n) = n if n < i; undefined if n = i; n − 1 if n > i
↓i (a b) = ↓i (a) ↓i (b)
↓i (λ a) = λ ↓i+1 (a)
↓i (a[b]) = ↓i+1 (a)[↓i (b)]
Note. Notice that ↓i (a) is well-defined iff i ∉ FV(a).
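A sketch of the extended operators, in the same hypothetical Haskell notation as before: Sub a b stands for the explicit substitution a[b], upE and swapE add the closure cases displayed above to the operators of the previous sketch, and down renders Definition 7, with Nothing modelling the undefined case n = i.

    -- Terms with explicit substitutions: Sub a b stands for a[b].
    data TmE = IxE Int | AppE TmE TmE | LamE TmE | Sub TmE TmE
      deriving (Eq, Show)

    upE :: Int -> TmE -> TmE
    upE i (IxE n)    = IxE (if n <= i then n else n + 1)
    upE i (AppE a b) = AppE (upE i a) (upE i b)
    upE i (LamE a)   = LamE (upE (i + 1) a)
    upE i (Sub a b)  = Sub (upE (i + 1) a) (upE i b)

    swapE :: Int -> TmE -> TmE
    swapE i (IxE n)
      | n == i     = IxE (i + 1)
      | n == i + 1 = IxE i
      | otherwise  = IxE n
    swapE i (AppE a b) = AppE (swapE i a) (swapE i b)
    swapE i (LamE a)   = LamE (swapE (i + 1) a)
    swapE i (Sub a b)  = Sub (swapE (i + 1) a) (swapE i b)

    -- Decrement operator (Definition 7): defined only when index i
    -- does not occur free, hence the Maybe result.
    down :: Int -> TmE -> Maybe TmE
    down i (IxE n)
      | n < i     = Just (IxE n)
      | n == i    = Nothing
      | otherwise = Just (IxE (n - 1))
    down i (AppE a b) = AppE <$> down i a <*> down i b
    down i (LamE a)   = LamE <$> down (i + 1) a
    down i (Sub a b)  = Sub  <$> down (i + 1) a <*> down i b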
The added Garbage Collection rule (GC) can be seen in Figure 2, and the resulting calculus is called
λ regc (which, as we will see, is isomorphic to λ xgc [4]).
Finally, the idea is to approach λ ex [11]. To accomplish this, we need to be able to compose substitutions the λ ex way. There, composition is handled by one rule and one equation:
t[x := u][y := v] →(Comp) t[y := v][x := u[y := v]]      if y ∈ FV(u)
t[x := u][y := v] =C t[y := v][x := u]                   if y ∉ FV(u) ∧ x ∉ FV(v)
The rule (Comp) is used when substitutions are dependent, and reasoning modulo the C-equation is needed for independent substitutions. Since in λ r-derived calculi there is no simple way of implementing an ordering of substitutions (remember: no indexes inside closures!), and thus no trivial path for the elimination of equation C exists, we need an analogous equation.
Let us start with the composition rule: in a term of the form a[b][c], substitutions [b] and [c] are dependent iff 1 ∈ FV(b). In such a term, indexes 1 and 2 in a are being affected by [b] and [c], respectively.
Consequently, if we were to reduce to a term of the form a′ [c′ ][b′ ], a swap should be performed over
a. Moreover, as substitution [c] crosses the binder [b], an index increment should be done, too. Finally,
since substitutions are dependent – that is, [c] affects b –, b′ should be b[c]. Then, we are left with the
term l1 (a)[↑0 (c)][b[c]].
For the equation, let us suppose we negate the composition condition (i.e., 1 ∉ FV(b)). Using Garbage Collection in the last term, we have l1 (a)[↑0 (c)][b[c]] →(GC) l1 (a)[↑0 (c)][↓1 (b)]. It is important
to notice that the condition in rule (Comp) is essential; that is: we cannot leave (Comp) unconditional
and let (GC) do its magic: we would immediately generate infinite reductions, losing PSN. Thus, our
composition rule and equation are:
a[b][c] →(Comp) l1 (a)[↑0 (c)][b[c]]      if 1 ∈ FV(b)
a[b][c] =D l1 (a)[↑0 (c)][↓1 (b)]         if 1 ∉ FV(b)
Rules for the λ rex-calculus can be seen in Figure 2. The relation rexp is generated by the set of rules
(App), (Lamb), (Var), (GC) and (Comp); λ rexp by (Beta) + rexp . D-equivalence is the least compatible equivalence relation generated by (EqD). Relations λ rex (resp. rex) are obtained from λ rexp (resp.
rexp ) modulo D-equivalence (thus specifying rewriting on D-equivalence classes). That is,
∀ a, a′ ∈ Λre : a →(λ )rex a′ ⇐⇒ ∃ b, b′ ∈ Λre : a =D b →(λ )rexp b′ =D a′
We define λ rex as the reduction system (Λre, λ rex). We shall define λ re and λ regc next. Since the rule
(VarR) does not belong to λ rex, but only to λ re and λ regc , we present it here:
(VarR)      (n + 1)[c] → n
The relation re is generated by (App), (Lamb), (Var) and (VarR); λ re by (Beta) + re; the relation regc
by re + (GC); and λ regc by (Beta) + regc . Finally, the λ re and λ regc calculi are the reduction systems
(Λre, λ re) and (Λre, λ regc ), respectively.
    (EqD)    a[b][c]     =   l1 (a)[↑0 (c)][↓1 (b)]      (1 ∉ FV(b))

    (Beta)   (λ a) b     →   a[b]
    (App)    (a b)[c]    →   a[c] b[c]
    (Lamb)   (λ a)[c]    →   λ l1 (a)[↑0 (c)]
    (Var)    1[c]        →   c
    (GC)     a[c]        →   ↓1 (a)                      (1 ∉ FV(a))
    (Comp)   a[b][c]     →   l1 (a)[↑0 (c)][b[c]]        (1 ∈ FV(b))

Figure 2: Equations and rules for the λ rex-calculus
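As an illustration, the following hypothetical Haskell function performs one root-level step of the rewrite rules of Figure 2, reusing TmE, upE, swapE and down from the sketch above. Clauses are tried in order; the equation (EqD) is not a rule and is therefore not implemented, and closing the step under arbitrary contexts and D-equivalence classes is left out.

    -- Free occurrence test for a given index (1 ∈ FV(a)?).
    freeIn :: Int -> TmE -> Bool
    freeIn i (IxE n)    = n == i
    freeIn i (AppE a b) = freeIn i a || freeIn i b
    freeIn i (LamE a)   = freeIn (i + 1) a
    freeIn i (Sub a b)  = freeIn (i + 1) a || freeIn i b

    -- One root step of λ rex; earlier clauses take precedence when
    -- several rules are applicable.
    rexStep :: TmE -> Maybe TmE
    rexStep (AppE (LamE a) b)  = Just (Sub a b)                        -- (Beta)
    rexStep (Sub (AppE a b) c) = Just (AppE (Sub a c) (Sub b c))       -- (App)
    rexStep (Sub (LamE a) c)   =
      Just (LamE (Sub (swapE 1 a) (upE 0 c)))                          -- (Lamb)
    rexStep (Sub (IxE 1) c)    = Just c                                -- (Var)
    rexStep (Sub (Sub a b) c)
      | freeIn 1 b = Just (Sub (Sub (swapE 1 a) (upE 0 c)) (Sub b c))  -- (Comp)
    rexStep (Sub a c)
      | not (freeIn 1 a) = down 1 a                                    -- (GC)
    rexStep _ = Nothing

Note that in this sketch (GC) fires only when none of (App), (Lamb), (Var) or (Comp) applies, which is one normalizing choice among the overlapping possibilities that the calculus itself leaves open.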
For the isomorphism between λ ex and λ rex (and also between λ x and λ re; and between λ xgc and
λ regc ), we must first give a translation from the set Λx (i.e., the set of terms for λ x, λ xgc and λ ex; see
e.g. [11] for the expected definition) to Λre, and vice versa. It is important to notice that our translations
depend on a list of variables, which will determine the indexes of the free variables. All this work is
inspired in a similar proof that shows the isomorphism between the λ and λdB calculi, found in [10].
Definition 8 (Translation from Λx to Λre). For every t ∈ Λx, n ∈ N, such that FV(t) ⊆ {x1 , . . . , xn }, w[x1 ,...,xn ] : Λx → Λre is given inductively by:

    w[x1 ,...,xn ] (x) = min{ j : x j = x}
    w[x1 ,...,xn ] (t u) = w[x1 ,...,xn ] (t) w[x1 ,...,xn ] (u)
    w[x1 ,...,xn ] (λ x.t) = λ w[x,x1 ,...,xn ] (t)
    w[x1 ,...,xn ] (t[x := u]) = w[x,x1 ,...,xn ] (t)[w[x1 ,...,xn ] (u)]
Definition 9 (Translation from Λre to Λx). For every a ∈ Λre, n ∈ N, such that FV(a) ⊆ {1, . . . , n}, u[x1 ,...,xn ] : Λre → Λx, with {x1 , . . . , xn } different variables, is given inductively by:

    u[x1 ,...,xn ] ( j) = x j
    u[x1 ,...,xn ] (a b) = u[x1 ,...,xn ] (a) u[x1 ,...,xn ] (b)
    u[x1 ,...,xn ] (λ a) = λ x.u[x,x1 ,...,xn ] (a)
    u[x1 ,...,xn ] (a[b]) = u[x,x1 ,...,xn ] (a)[x := u[x1 ,...,xn ] (b)]

with x ∉ {x1 , . . . , xn } in the cases of abstraction and closure.
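A hypothetical Haskell sketch of the translation w of Definition 8, reusing the TmE type from the previous sketches; the list xs of variable names determines the indexes of the free variables, and the function is partial on variables that do not occur in xs.

    data Named
      = V String              -- variable
      | A Named Named         -- application
      | L String Named        -- abstraction λ x.t
      | S Named String Named  -- closure: S t x u stands for t[x := u]

    toIdx :: [String] -> Named -> TmE
    toIdx xs (V x)     = IxE (head [ j | (j, y) <- zip [1 ..] xs, y == x ])
    toIdx xs (A t u)   = AppE (toIdx xs t) (toIdx xs u)
    toIdx xs (L x t)   = LamE (toIdx (x : xs) t)
    toIdx xs (S t x u) = Sub (toIdx (x : xs) t) (toIdx xs u)

Consing x onto the list in the abstraction and closure cases is exactly what makes the first matching position realize the min of Definition 8; the inverse translation u of Definition 9 can be sketched symmetrically by generating a fresh name at each binder.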
Translations are correct w.r.t. α-equivalence (i.e., α-equivalent Λx terms have the same image under
w[x1 ,...,xn ] , and identical Λre terms have α-equivalent images under different choices of x for u[x1 ,...,xn ] ). Besides, adding variables at the end of translation lists does not affect the result; thus, uniform translations w and u can be defined straightforwardly, depending only on a preset ordering of variables. Full
proofs for all this can be found in [14], chapter 4. We now state the isomorphisms:
Theorem 10 (λ ex ≅ λ rex, λ x ≅ λ re and λ xgc ≅ λ regc ). The λ ex (resp. λ x, λ xgc) and λ rex (resp. λ re, λ regc ) calculi are isomorphic. That is,

1. w ◦ u = IdΛre ∧ u ◦ w = IdΛx
2. ∀t, u ∈ Λx : t →λ ex(λ x,λ xgc) u =⇒ w(t) →λ rex(λ re,λ regc ) w(u)
3. ∀a, b ∈ Λre : a →λ rex(λ re,λ regc ) b =⇒ u(a) →λ ex(λ x,λ xgc) u(b)

Proof. This is actually a three-in-one theorem. Proofs require many intermediate lemmas that assert the interaction between translations and meta-operators. Full technical details for each of the isomorphisms can be found in [14], chapters 4 (λ x ≅ λ re), 5 (λ xgc ≅ λ regc ) and 6 (λ ex ≅ λ rex).
Finally, in order to show MC for λ rex, an extension to metaterms (with decorated metavariables)
must be provided. The extension is given as expected, and a proof of its isomorphism w.r.t. the corresponding extension of λ ex is also shown. We refer the reader to [14], chapter 6, section 3 for details.
As a direct consequence of theorem 10, pairwise isomorphic calculi enjoy the same set of properties:
Corollary 11 (Preservation of properties). The λ ex (resp. λ x, λ xgc) and λ rex (resp. λ re, λ regc ) calculi have
the same properties. In particular, this implies λ rex has, among other properties, Sim, PSN and MC.
Proof sketch for e.g. PSN in λ rex. Assume PSN does not hold in λ rex. Then, there exists a ∈ SNλdB s.t. a ∉ SNλ rex . Besides, a ∈ SNλdB implies u(a) ∈ SNλ . Therefore, by PSN of λ ex [11], u(a) ∈ SNλ ex . Now, as a ∉ SNλ rex , there exists an infinite reduction a →λ rex a1 →λ rex a2 →λ rex · · · . Thus, by theorem 10, we have u(a) →λ ex u(a1 ) →λ ex u(a2 ) →λ ex · · · , contradicting the fact that u(a) ∈ SNλ ex .
4 Conclusions and further work
We have presented λ rex, an ES calculus with de Bruijn indexes that is isomorphic to λ ex, a formalism
with variable names that fulfills a whole set of interesting properties. As a consequence of the isomorphism, λ rex inherits all of λ ex’s properties. This, together with a very simple notation, makes it, as far as
we know, the first calculus of its kind. Besides, the λ re and λ regc calculi (isomorphic to λ x and λ xgc,
respectively) were also introduced. The development was based on a novel presentation of the classical
λdB . Given the homogeneity of definitions and proofs, not only for λ r and λ rex, but also for λ re and
λ regc , we think we found a truly natural bridge between named and indexed formalisms. We believe this
opens a new set of possibilities in the area: either by translating and studying existing calculi with good
properties; or by rethinking old calculi from a different perspective (i.e., with λ r’s concept in mind).
Work is yet to be done in order to get a more suitable theoretical tool for implementation purposes,
for unary closures and equations still make such a task hard. In this direction, we think a mixed approach
using ideas from λ rex and λ σ -styled calculi may lead to the solution of both issues. The explicitation of
meta-operators may also come to mind: we think this is not a priority, because the main merit of λ rex is putting into evidence the accessory nature of index updates. Furthermore, an attempt to use λ rex in proof assistants or higher-order unification [7] implementations may be taken into account. Finally, adding an
η rule to λ rex should be fairly simple using the decrement meta-operator.
Acknowledgements: Special thanks to Delia Kesner for valuable discussions and insight on the subject; as well
as to the anonymous referees for their very useful comments.
References
[1] M. Abadi, L. Cardelli, P.-L. Curien & J.-J. Lévy (1991): Explicit Substitutions. J. Funct. Prog. 1, pp. 31–46.
[2] Z. Benaissa, D. Briaud, P. Lescanne & J. Rouyer-Degli (1996): λ υ, a Calculus of Explicit Substitutions
which Preserves Strong Normalisation. J. Funct. Prog. 6(5), pp. 699–722.
[3] R. Bloo & H. Geuvers (1999): Explicit substitution on the edge of strong normalization. Theor. Comput. Sci.
211(1-2), pp. 375–395.
[4] R. Bloo & K. H. Rose (1995): Preservation of strong normalisation in named lambda calculi with explicit
substitution and garbage collection. In: CSN-95: Computing Science in the Netherlands, pp. 62–72.
[5] N. G. de Bruijn (1972): Lambda Calculus Notation with Nameless Dummies, a Tool for Automatic Formula
Manipulation, with Application to the Church-Rosser Theorem. Indagationes Mathematicae 34, pp. 381–392.
[6] N. G. de Bruijn (1978): A namefree λ calculus with facilities for internal definition of expressions and
segments. Tech. Rep. TH-Report 78-WSK-03, Dept. of Mathematics, Technical University of Eindhoven .
[7] G. Dowek, Th. Hardin & C. Kirchner (2000): Higher order unification via explicit substitutions. Inf. Comput.
157(1-2), pp. 183–235.
[8] F. Kamareddine & A. Ríos (1995): A Lambda-Calculus à la de Bruijn with Explicit Substitutions. In: PLILP
’95: Proceedings of the 7th International Symposium on Programming Languages: Implementations, Logics
and Programs, Lecture Notes in Computer Science 982, pp. 45–62.
[9] F. Kamareddine & A. Ríos (1997): Extending a λ -calculus with explicit substitution which preserves strong
normalisation into a confluent calculus on open terms. J. Funct. Prog. 7(4), pp. 395–420.
[10] F. Kamareddine & A. Ríos (1998): Bridging de Bruijn Indices and Variable Names in Explicit Substitutions
Calculi. Logic Journal of the IGPL 6(6), pp. 843–874.
[11] D. Kesner (2009): A Theory of Explicit Substitutions with Safe and Full Composition. Logical Methods in
Computer Science 5(3:1), pp. 1–29.
[12] P. Lescanne (1994): From λ σ to λ υ: a journey through calculi of explicit substitutions. In: POPL ’94:
Proceedings of the 21st ACM SIGPLAN-SIGACT symposium on principles of programming languages,
ACM, New York, NY, USA, pp. 60–69.
[13] P.-A. Melliès (1995): Typed lambda-calculi with explicit substitutions may not terminate. In: TLCA ’95:
Proceedings of the Second International Conference on Typed Lambda Calculi and Applications, Lecture
Notes in Computer Science 902, pp. 328–334.
[14] A. Mendelzon (2010): Una curiosa versión de λdB basada en “swappings”: aplicación a traducciones entre cálculos de sustituciones explícitas con nombres e índices. Master’s thesis, FCEyN, Univ. de Buenos Aires.
Available at http://publi.dc.uba.ar/publication/pdffile/128/tesis_amendelzon.pdf.
[15] A. Ríos (1993): Contributions à l’étude des Lambda-calculus avec Substitutions Explicites. Ph.D. thesis,
Université Paris 7.
[16] K. H. Rose, R. Bloo & F. Lang (2009): On explicit substitution with names.
Technical Report, IBM.
Available at http://domino.research.ibm.com/library/cyberdig.nsf/papers/
39D13836281BDD328525767F0056CE65.
Standardisation for constructor based pattern calculi
Delia Kesner
PPS, CNRS and Université Paris Diderot, France
[email protected]

Carlos Lombardi
Depto. de Ciencia y Tecnología, Univ. Nacional de Quilmes, Argentina
[email protected]

Alejandro Ríos
Depto. de Computación, Facultad de Cs. Exactas y Naturales, Univ. de Buenos Aires, Argentina
[email protected]
This work gives some insights and results on standardisation for call-by-name pattern calculi. More
precisely, we define standard reductions for a pattern calculus with constructor-based data terms and
patterns. This notion is based on reduction steps that are needed to match an argument with respect
to a given pattern. We prove the Standardisation Theorem by using the technique developed by Takahashi [14] and Crary [3] for lambda-calculus. The proof is based on the fact that any development
can be specified as a sequence of head steps followed by internal reductions, i.e. reductions in which
no head steps are involved.
We expect to extend these results to more complex calculi with open and dynamic patterns.
1 Introduction
Several calculi have been proposed in order to give a formal description of pattern matching; i.e. the
ability to analyse the form of the argument of a function in order to decide among alternative function
definition clauses, adequate to different argument forms. We will call them pattern calculi.
Central to several pattern calculi is the concept of matching; an application of an abstraction to an
argument can only be performed if the argument matches the pattern of the abstraction. An analysis of
various pattern calculi based on different notions of matching operations and different sets of allowed
patterns can be found in [9].
A fundamental result in the lambda-calculus is the Standardisation Theorem, which states that if a
term M β -reduces to a term N, then there is a “standard” β -reduction sequence from M to N. This result
has several applications, e.g. it is used to prove the non-existence of reduction between given terms.
One of its main corollaries is the quasi-leftmost-reduction theorem, which in turn is used to prove the
non-existence of normal forms for a given term.
A first study on standardisation for call-by-name lambda-calculus appears in [4]. Subsequently, several standardisation methods have been devised, for example [2] section 11.4, [14], [10] and [13].
While leftmost-outermost reduction gives a standard strategy for call-by-name lambda-calculus,
more refined notions of reduction are necessary to define standard strategies for call-by-value lambda-calculus [13], first-order term rewriting systems [7, 15], Proof-Nets [5], etc.
All standard reduction strategies involve the definition of some selected redex/step by means of a
partial function from terms to redexes/steps; they all give priority to the selected step, if possible. We
will refer to this selected redex/step concept as the head step/redex of a term.
For the standard call-by-name lambda calculus, any term of the form (λ x.M)N is a redex, and the
head redex for such a term is the whole term. In pattern calculi any term of the form (λ p.M)N is a redex
candidate, but not necessarily a redex. The parameter p in such terms can be more complex than a single
variable, and the whole term is not a redex if the argument N does not match p, i.e., if N does not verify
the structural conditions imposed by p. In this case we will choose as head a reduction step lying inside
N (or even inside p) which makes p and N be closer to a possible match. While this situation bears some
resemblance with what happens with standard call-by-value lambda calculus [13], there is an important
difference: both the fact of (λ p.M)N being a redex, and whether a redex inside N could be useful to get
p and N closer to a possible match, depend in pattern calculi on both N and p.
The aim of this contribution is to analyse the existence of standard reduction strategies for pattern
calculi in a direct way, without using any encoding of such calculi into some general computational
framework [11]. This direct approach puts in evidence the fine interaction between reduction and pattern matching, and gives a normalization algorithm which is specified in terms of the combination of
computations of independent terms with partial computations of terms depending on some pattern. We
hope to be able to extend this algorithmic approach to more sophisticated pattern calculi handling open
and dynamic patterns [8]. The expected standardisation algorithm should be expressed using the explicit
version of the Pure Pattern Calculus [1] recently proposed.
The paper is organized as follows. Section 2 introduces the calculus, Sections 3 and 4 give, respectively, the main concepts and ideas needed for the standardisation proof and the main results, and
Section 5 concludes and gives future research directions.
2 The calculus
We will study a very simple form of pattern calculus, consisting of the extension of standard lambda
calculus with a set of constructors and allowing constructed patterns. This calculus appears for example
in Section 4.1 in [9].
Definition 2.1 (Syntax) The calculus is built upon two different enumerable sets of symbols, the variables x, y, z, w and the constants c, a, b; its syntactical categories are:

    Terms           M, N, Q, R ::= x | c | λ p.M | M M
    DataTerms       D ::= c | D M
    Patterns        p, q ::= x | d
    DataPatterns    d ::= c | d p
    Substitutions   θ , ν, τ ::= {x1 /M1 , . . . , xn /Mn }
Definition 2.2 (Matching) The match is the partial function from patterns and terms to substitutions defined by the following rules (⊎ on substitutions denotes disjoint union with respect to their domains, being undefined if the domains have a non-empty intersection):

                                      d ≪θ1 D    p ≪θ2 M    θ1 ⊎ θ2 defined
    x ≪{x/M} M        c ≪∅ c         -----------------------------------------
                                              d p ≪θ1 ⊎θ2 D M

We write p ≪ M iff ∃θ p ≪θ M. Remark that p ≪ M implies that p is linear.
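The matching function can be rendered directly in Haskell; the following sketch (with datatype and function names of our choosing) uses Maybe for partiality and makes the disjointness of ⊎ explicit, which is why only linear patterns can produce a substitution.

    import qualified Data.Map as Map

    data T = Va String | Co String | Lm P T | Ap T T  deriving Show
    data P = PV String | PD D                         deriving Show  -- p ::= x | d
    data D = DC String | DA D P                       deriving Show  -- d ::= c | d p

    type Subst = Map.Map String T

    match :: P -> T -> Maybe Subst
    match (PV x) m = Just (Map.singleton x m)            -- x matches anything
    match (PD d) m = matchD d m

    matchD :: D -> T -> Maybe Subst
    matchD (DC c) (Co c') | c == c' = Just Map.empty     -- c against c
    matchD (DA d p) (Ap a m) =                           -- d p against D M
      do th1 <- matchD d a
         th2 <- match p m
         if Map.null (Map.intersection th1 th2)          -- θ1 ⊎ θ2 defined
           then Just (Map.union th1 th2)
           else Nothing
    matchD _ _ = Nothing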
Definition 2.3 (Reduction step) The reduction steps of the calculus are defined by the following rules:

       M → M′              N → N′             p ≪θ N                M → M′
    --------------      --------------     -----------------     ------------------
    M N → M′ N          M N → M N′         (λ p.M) N → θ M       λ p.M → λ p.M′
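Combining the matching sketch above with a naive (capture-unsafe) substitution application, one can enumerate the one-step reducts of Definition 2.3; as before, this hypothetical Haskell code assumes a variable convention that keeps bound and substituted names apart.

    -- Naive substitution application: adequate only when bound variables
    -- do not clash with the variables being substituted.
    applySubst :: Subst -> T -> T
    applySubst th (Va x)   = Map.findWithDefault (Va x) x th
    applySubst _  (Co c)   = Co c
    applySubst th (Ap a b) = Ap (applySubst th a) (applySubst th b)
    applySubst th (Lm p m) = Lm p (applySubst th m)

    -- All one-step reducts of a term (the four rules of Definition 2.3).
    step :: T -> [T]
    step (Ap m n) =
      [ applySubst th b | Lm p b <- [m], Just th <- [match p n] ]  -- beta
      ++ [ Ap m' n | m' <- step m ]                                -- left
      ++ [ Ap m n' | n' <- step n ]                                -- right
    step (Lm p m) = [ Lm p m' | m' <- step m ]                     -- under λ
    step _        = []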
Crucial to the standardisation proof is the concept of development; we formalize it through the relation ⊲, meaning M ⊲ N iff there is a development (not necessarily complete) with source M and target N.

Definition 2.4 (Term and substitution development) We define the relation ⊲ on terms and a corresponding relation ◮ on substitutions. The relation ⊲ is defined by the following rules:

                 M ⊲ M′                 M ⊲ M′    N ⊲ N′       M ⊲ M′    θ ◮ θ′    p ≪θ N
    --------    ------------------     ------------------    -----------------------------
    M ⊲ M       λ p.M ⊲ λ p.M′         M N ⊲ M′ N′           (λ p.M) N ⊲ θ′ M′

and ◮ is defined as follows: θ ◮ θ′ iff dom(θ ) = dom(θ′ ) and ∀x ∈ dom(θ ) . θ x ⊲ θ′ x
The definition of head step will take into account the terms (λ p.M)N even if p ≪̸ N. In such cases, the head redex will be inside N as the patterns in this calculus are always normal forms (this will not be the case for more complex pattern calculi).
The selection of the head redex inside N depends on both N and p. This differs from standard
call-by-value lambda-calculus, where the selection depends only on N.
We show this phenomenon with a simple example. Let a, b, c be constants, N = (aR1 )R2 a term,
where R1 and R2 are redexes. The redexes in N needed to achieve a match with a certain pattern q, and
thus the selection of the head redex, depend on q.
Take for example different patterns p1 = (ax)(by), p2 = (abx)y, p3 = (abx)(cy), p4 = (ax)y, and consider the term Q = (λ q.M)N. If q = p1 , then it is not necessary to reduce R1 (because it already matches
x) but it is necessary to reduce R2 , because no redex can match the pattern by; hence R2 will be the head
redex in this case. Analogously, for p2 it is necessary to reduce R1 but not R2 , for p3 both are needed (in
this case we will choose the leftmost one) and p4 does match N, hence the whole Q is the head redex.
This observation motivates the following definition.
Definition 2.5 (Head step) The relations →h (head step) and ⇝p (preferred needed step to match pattern p) are defined as follows:

        M →h M′                   p ≪θ N                      N ⇝p N′
    ---------------- HApp1     ------------------ HBeta    --------------------------- HPat
    M N →h M′ N                (λ p.M) N →h θ M            (λ p.M) N →h (λ p.M) N′

        M →h M′                   D ⇝d D′                    d ≪ D    M ⇝p M′
    ------------ PatHead       ----------------- Pat1      ---------------------- Pat2
    M ⇝d M′                    D M ⇝dp D′ M                D M ⇝dp D M′
The rule PatHead is intended for data patterns only, not being valid for variable patterns; we indicate this by writing a d (data pattern) instead of a p (any pattern) in the arrow subscript of the conclusion.
We observe that the rule analogous to HPat in the presentation of standard reduction sequences for call-by-value lambda-calculus in both [13] and [3] reads

        N →h N′
    ------------------------------
    (λ p.M)N →h (λ p.M)N′

reflecting the aforementioned N-only-dependency feature.
We can see also that a head step in a term like (λ p.M)N determined by rule HPat will lie inside N, but the same step will not necessarily be considered head if we analyse N alone.
It is easy to check that if M ⇝p M′ then p ≪̸ M, avoiding any overlap between HBeta and HPat and also between Pat1 and Pat2. This in turn implies that all terms have at most one head redex. We remark also that the head step depends not only on the pattern structure but also on the match or lack of match between pattern and argument.
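The head-step relation can also be rendered executably. The sketch below (again with names of our choosing) reuses T, P, D, Subst, match, matchD and applySubst from the previous sketches; recall that applySubst is naive, with no capture avoidance, so the code is adequate only under a suitable variable convention.

    isData :: T -> Bool
    isData (Co _)   = True
    isData (Ap a _) = isData a
    isData _        = False

    -- Head step (rules HApp1, HBeta, HPat).
    headStep :: T -> Maybe T
    headStep (Ap (Lm p m) n) =
      case match p n of
        Just th -> Just (applySubst th m)        -- HBeta: p matches N
        Nothing -> Ap (Lm p m) <$> needed p n    -- HPat: needed step inside N
    headStep (Ap m n) = (`Ap` n) <$> headStep m  -- HApp1
    headStep _        = Nothing

    -- Preferred needed step to match a pattern (PatHead, Pat1, Pat2).
    needed :: P -> T -> Maybe T
    needed (PV _) _ = Nothing                    -- a variable always matches
    needed (PD d) m = neededD d m

    neededD :: D -> T -> Maybe T
    neededD (DA d p) (Ap a m)
      | isData a =
          case matchD d a of
            Just _  -> Ap a <$> needed p m       -- Pat2: left part matches
            Nothing -> (`Ap` m) <$> neededD d a  -- Pat1: leftmost first
    neededD _ n = headStep n                     -- PatHead

On the example above, needed applied to p1 selects the head step of R2, while for p2 it selects the head step of R1, matching the informal analysis.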
3 Main concepts and ideas needed for the standardisation proof
In order to build a standardisation proof for constructor based pattern calculi we chose to adapt the one
in [14] for the call-by-name lambda-calculus, later adapted to call-by-value lambda-calculus in [3], over
the classical presentation of [13].
The proof method relies on an h-development property stating that any development can be split into
a leading sequence of head steps followed by a development in which no head steps are performed; this
is our Lemma 4.1 which corresponds to the so-called “main lemma” in the presentations by Takahashi
and Crary.
Even for a simple form of pattern calculus such as the one presented in this contribution, both the
definitions (as we already mentioned when defining head steps) and the proofs are non-trivial extensions
of the corresponding ones for standard lambda-calculus, even in the framework of call-by-value. As
mentioned before, the reason is the need to take into account, for terms involving the application of a
function to an argument, the pattern of the function parameter when deciding whether a redex inside the
argument should be considered as a head redex.
In order to formalize the notion of “development without occurrences of head steps”, an internal
development relation will be defined. The dependency on both N and p when analysing the reduction
steps from a term like (λ p.M)N is shown in the rule IApp2.
Definition 3.1 (Internal development) The relations ⊲int (internal development) and ⊲int_p (internal development with respect to the pattern p) are defined as follows:

                        M ⊲ M′                     M ⊲int M′    N ⊲ N′    M ≠ λ p.M1
    ---------- IRefl    ------------------ IAbs    --------------------------------- IApp1
    M ⊲int M            λ p.M ⊲int λ p.M′          M N ⊲int M′ N′

    M ⊲ M′    N ⊲int_p N′                          p ≪ N    N ⊲ N′
    ----------------------------- IApp2            ----------------- PMatch
    (λ p.M) N ⊲int (λ p.M′) N′                     N ⊲int_p N′

    N ⊲int N′                     N ∉ DataTerms    N ⊲int N′
    -------------- PConst         ---------------------------- PNoCData
    N ⊲int_c N′                   N ⊲int_dp N′

    D ⊲int_d D′    M ⊲ M′    d ≪̸ D                 D ⊲ D′    M ⊲int_p M′    d ≪ D    p ≪̸ M
    -------------------------------- PCDataNo1     ------------------------------------------ PCDataNo2
    D M ⊲int_dp D′ M′                              D M ⊲int_dp D′ M′

We observe that if either N ⊲int N′ or N ⊲int_p N′ then N ⊲ N′.
The formal description of the h-development condition takes the form of an additional binary relation.
This relation corresponds to the one called strong parallel reduction in [3].
Definition 3.2 (H-development) We define the relations ⊲h and ◮h . Let M, N be terms and ν, θ substitutions.

a. M ⊲h N iff (i) M ⊲ N, (ii) ∃ Q . M →h∗ Q ⊲int N, and (iii) ∀ p . ∃ Q p . M ⇝p∗ Q p ⊲int_p N.

b. ν ◮h θ iff (i) Dom(ν) = Dom(θ ), and (ii) ∀ x ∈ Dom(ν) . νx ⊲h θ x.
The clause (iii) in the definition of ⊲h shows that the dependency on the patterns already noted when defining head step and internal development carries on to the definition of h-development.
This clause is needed when proving that all developments are h-developments; let us grasp the reason through a brief argument. Suppose we want to prove that a development inside N in a term like (λ p.M)N is an h-development. The rules to apply in Definitions 2.5 and 3.1 are HPat and IApp2 respectively; therefore we need to perform an analysis relative to the pattern p when taking N alone. This analysis is what clause (iii) expresses. Consequently, the proof of clause (ii) for a term needs to consider clause (iii) (instantiated to a certain pattern) valid for a subterm; this is achieved by including clause (iii) in the definition and performing an inductive argument on the terms being analysed.
4 Main results
We summarize the sequence of the main lemmas needed to prove the Standardisation Theorem, and then
the theorem itself.
Once Lemmas 4.1 and 4.2 have been obtained, both Lemma 4.3 and the Standardisation Theorem
4.5 admit very simple proofs. The proofs of the former lemmas involve some work, mostly related to the
need to check carefully the different cases when analysing a term like (λ p.M)N, especially when p ≪̸ N.
The proof details, along with the statements and proofs of more technical lemmas, are included in
the extended version of this contribution, available at
www.pps.jussieu.fr/~kesner/papers/std-patterns-long-hor10.pdf.
Lemma 4.1 (H-development property)
(i) Let M, N be terms and ν, θ substitutions such that M ⊲ N and ν ◮h θ . Then νM ⊲h θ N.
(ii) Let M, N be terms such that M ⊲ N. Then M ⊲h N.
Lemma 4.2 (Postponement)
(i) If M ⊲int N →h R then there exists a term N′ such that M →h N′ ⊲ R.
(ii) For every pattern p, if M ⊲int_p N ⇝p R then there exists a term N′p such that M ⇝p N′p ⊲ R.
Lemma 4.3 (Bifurcation)
Assume M, N are terms such that M ⊲∗ N. Then M →h∗ R ⊲int∗ N for some term R.
Definition 4.4 (Standard reduction sequence) The standard reduction sequences (in the following, s.r.s) are sequences of terms M1 ; . . . ; Mn which can be generated by:

    M1 →h M2    M2 ; . . . ; Mk    k ≥ 2
    ------------------------------------- StdHead         ------- StdVar
    M1 ; . . . ; Mk                                           x

    M1 ; . . . ; Mk                        M1 ; . . . ; M j    N1 ; . . . ; Nk
    ---------------------------- StdAbs    --------------------------------------------------- StdApp
    (λ p.M1 ); . . . ; (λ p.Mk )           (M1 N1 ); . . . ; (M j N1 ); (M j N2 ); . . . ; (M j Nk )

Remark that by induction every term is a unitary s.r.s, the rule StdVar being the base case.
Theorem 4.5 (Standardisation)
Assume M, N are terms such that M ⊲∗ N. Then there exists an s.r.s M; . . . ; N.
As in [13], standard reduction sequences are not unique, unless we work modulo permutation equivalence [4, 12]. Indeed, let us suppose M →h M′ and N →h N′. We then get two different (but permutation equivalent) s.r.s from (λ d.M)N to (λ d.M′)N′:

    (λ d.M)N; (λ d.M′)N; (λ d.M′)N′

obtained by StdApp applied to the s.r.s (λ d.M); (λ d.M′) (from M →h M′ by StdHead and StdAbs) and the s.r.s N; N′ (from N →h N′ by StdHead); and

    (λ d.M)N; (λ d.M)N′; (λ d.M′)N′

obtained by StdHead, since N →h N′ gives N ⇝d N′ (PatHead) and hence (λ d.M)N →h (λ d.M)N′ (HPat), followed by StdApp applied to (λ d.M); (λ d.M′) and the unitary s.r.s N′.
5 Conclusion and further work
We have presented an elegant proof of the Standardisation Theorem for constructor-based pattern calculi.
We aim to generalize both the concept of standard reduction and the elegant structure of the Standardisation Theorem proof presented here to a large class of pattern calculi, including both open and
closed variants such as the Pure Pattern Calculus [8]. It would be interesting to have sufficient conditions for
a pattern calculus to enjoy the standardisation property. This would be close in spirit to [9], where an
abstract confluence proof for pattern calculi is developed.
The kind of calculi we want to deal with imposes challenges that are currently not handled in the
present contribution, such as open patterns, reducible (dynamic) patterns, and the possibility of having fail
as a decided result of matching. Furthermore, the possibility of decided fail combined with compound
patterns leads to the convenience of studying forms of inherently parallel standard reduction strategies.
The abstract Standardisation Theorem developed in [6] in a homogeneous axiomatic framework
could be useful for our purpose. While the axioms of the abstract formulation of standardisation are
assumed to hold in the proof of the standardisation result, they need to be defined and verified for each
language to be standardised. This could be non-trivial, as in the case of TRS [7, 15], where a meta-level
matching operation is involved in the definition of the rewriting framework.
References
[1] Th. Balabonski (2008): Calculs avec motifs dynamiques. De la conception à l’implémentation. Stage de
Master, Université Paris-Diderot Paris 7.
[2] H.P. Barendregt (1984): The Lambda Calculus: Its Syntax and Semantics. Elsevier, Amsterdam.
[3] K. Crary (2009): A Simple Proof of Call-by-Value Standardization. Technical Report CMU-CS-09-137,
Carnegie-Mellon University.
[4] H.B. Curry & R. Feys (1958): Combinatory Logic. North-Holland Publishing Company, Amsterdam.
[5] J.-Y. Girard (1987): Linear Logic. Theoretical Computer Science 50(1), pp. 1–101.
[6] G. Gonthier, J.-J. Lévy & P.-A. Melliès (1992): An abstract standardisation theorem. In: Proceedings,
Seventh Annual IEEE Symposium on Logic in Computer Science, 22-25 June 1992, Santa Cruz, California,
USA, IEEE Computer Society, pp. 72–81.
[7] G. Huet & J.-J. Lévy (1991): Computations in orthogonal rewriting systems. In: Jean-Louis Lassez & Gordon
Plotkin, editors: Computational Logic, Essays in Honor of Alan Robinson, MIT Press, pp. 394–443.
[8] C.B. Jay & D. Kesner (2006): Pure Pattern Calculus. In: Peter Sestoft, editor: European Symposium on
Programming, number 3924 in LNCS, Springer-Verlag, pp. 100–114.
[9] C.B. Jay & D. Kesner (2009): First-class patterns. J. Funct. Program. 19(2), pp. 191–225. Available at
http://dx.doi.org/10.1017/S0956796808007144.
[10] Ryo Kashima (2000): A Proof of the Standardization Theorem in λ -Calculus. Research Reports on Mathematical and Computing Sciences C-145, Tokyo Institute of Technology.
[11] J.W. Klop, V. van Oostrom & R.C. de Vrijer (2008): Lambda calculus with patterns. Theor. Comput. Sci.
398(1-3), pp. 16–31. Available at http://dx.doi.org/10.1016/j.tcs.2008.01.019.
[12] J.-J. Lévy (1980): Optimal Reductions in the lambda-calculus. In: R. Hindley & J.P. Seldin, editors: To
Haskell Curry: Essays in Combinatory Logic, Lambda Calculus and formalism, Acad. Press, pp. 159–191.
[13] G. Plotkin (1975): Call-by-name, call-by-value and the Lambda-calculus. Theor. Comput. Sci. 1(2), pp.
125–159.
[14] M. Takahashi (1995): Parallel reductions in lambda-calculus. Inf. and Comput. 118(1), pp. 120–127.
[15] Terese (2003): Term Rewriting Systems, Cambridge Tracts in Theoretical Computer Science 55. Cambridge
University Press.