Proceedings of the 5th International Workshop on Higher-Order Rewriting – HOR 2010 – A FLoC affiliated workshop held on 14 July 2010, Edinburgh, UK

Preface

HOR 2010 is a forum to present work concerning all aspects of higher-order rewriting. The aim is to provide an informal and friendly setting to discuss recent work and work in progress. Previous editions of HOR were held in Copenhagen – Denmark (HOR 2002), Aachen – Germany (HOR 2004), Seattle – USA (HOR 2006) and Paris – France (HOR 2007). This year a total of 8 papers were presented, all addressing interesting ideas. We also had two invited speakers, to whom I would like to give thanks:
• Maribel Fernández (King's College London), who talked about Closed nominal rewriting: properties and applications, and
• Silvia Ghilezan (University of Novi Sad), who talked about Computational interpretations of logic.
My appreciation also goes to the members of the PC (Zena Ariola, Frédéric Blanqui, Mariangiola Dezani-Ciancaglini and Roel de Vrijer) for lending their time and expertise, to the referees, and to Delia Kesner and Femke van Raamsdonk for providing valuable support. Thanks also to GDR-IM, which awarded funds to HOR 2010 that were used to support the presentation of papers by students. Finally, I would like to thank the organizers of FLoC 2010 and affiliated events for contributing towards such an exciting event.
Eduardo Bonelli (Universidad Nacional de Quilmes, Argentina)

Contents

1. Maribel Fernández (Invited Speaker): Closed nominal rewriting: properties and applications
2. Alejandro Díaz-Caro, Simon Perdrix, Christine Tasson and Benoît Valiron: Equivalence of algebraic lambda-calculi
3. Harald Zankl, Nao Hirokawa and Aart Middeldorp: Uncurrying for Innermost Termination and Derivational Complexity
4. Giulio Manzonetto and Paolo Tranquilli: A Calculus of Coercions Proving the Strong Normalization of MLF
5. Cynthia Kop (Student Talk): A new formalism for higher-order rewriting
6. Silvia Ghilezan (Invited Speaker): Computational interpretations of logic
7. Thibaut Balabonski: On the Implementation of Dynamic Patterns
8. Kristoffer Rose: Higher-order Rewriting for Executable Compiler Specifications
9. Ariel Mendelzon, Alejandro Ríos and Beta Ziliani: Swapping: a natural bridge between named and indexed explicit substitution calculi
10. Delia Kesner, Carlos Lombardi and Alejandro Ríos: Standardisation for constructor based pattern calculi

Closed nominal rewriting: properties and applications

Maribel Fernández
King's College London, Strand, London WC2R 2LS, UK
[email protected]

Rewriting systems (see, for instance, [6, 1, 20]) have been used to model the dynamics (deduction and evaluation, for example) of formal systems described by abstract syntax trees, also called terms. In the presence of binding, α-equivalence, that is, the equivalence relation that equates terms modulo renaming of bound variables, must be taken into account. One alternative is to define binders through functional abstraction, taking α-equivalence as a primitive, implicit notion, and working with equivalence classes of terms. For instance, Combinatory Reduction Systems (CRS) [14], Higher-order Rewrite Systems (HRS) [17] and Expression Reduction Systems (ERS) [13] use the λ-calculus as meta-language, and terms are defined "modulo alpha".
The price to pay is that we can no longer rely on simple notions such as structural induction on terms and syntactic unification. Alternatively, the nominal approach [12, 19] distinguishes between object-level variables (written a, b, c and called atoms), which can be abstracted but behave similarly to name constants, and meta-level variables, or just variables (X, Y, Z, . . .), which are first-order in that there are no binders for them and substitution does not avoid capture of free atoms. In nominal terms variables have arity zero, as in ERSs (but unlike ERSs, substitution of atoms for terms is not a primitive notion). The α-equivalence relation is axiomatised in a syntax-directed manner (thus we can reason by structural induction) using a freshness relation between atoms and terms, written a#t (i.e., "a is fresh for t").

Nominal rewriting systems (NRSs) [7] are rewriting systems on nominal terms. For example, the β-reduction and η-expansion rules for the λ-calculus are written as:

  app(λ([a]M), N) → subst([a]M, N)
  a#X ⊢ X → λ([a]app(X, a))

where the substitution in the β-rule is represented by a function symbol, also defined by rewrite rules. For instance, we can add the following rules, where we sugar subst([a]M, N) to M{a↦N}, to propagate substitutions avoiding capture:

  (σvar)          a{a↦X} → X
  (σε)    a#Y ⊢   Y{a↦X} → Y
  (σapp)          app(X, X′){a↦Y} → app(X{a↦Y}, X′{a↦Y})
  (σλ)    b#Y ⊢   (λ[b]X){a↦Y} → λ[b](X{a↦Y})

We refer to [7] for more examples of nominal rewriting rules. A step of nominal rewriting involves matching modulo α-equivalence, which is decidable [21]. For arbitrary NRSs, checking whether there is a rule that can be applied to a given term is an NP-complete problem in general [3]. However, if we only use closed rules, nominal matching is sufficient, and can be implemented in linear time and space [2]. Closed rules are, roughly speaking, rules that preserve abstracted atoms during reductions (as in the examples above); all atoms occur under abstractions in closed rules. CRSs, ERSs and HRSs impose similar conditions on rules by definition (ERSs impose a condition on matching substitutions, which corresponds to our notion of closed rules). We refer to [9] for an encoding of CRSs using closed nominal rules.

In addition to efficient matching, closed NRSs inherit other good properties of first-order rewriting: for instance, we have a critical pair lemma (see [7]) which can be used to derive confluence of terminating systems. Confluent and terminating NRSs with closed rules have a decidable equational theory [8] (see [11, 4] for definitions and examples of nominal equational theories). In other words, if a nominal equational theory can be represented by a confluent and terminating closed NRS, then equality in the theory can be decided by rewriting. However, confluence and termination are both undecidable properties. Sufficient conditions for confluence are given in [7]. Recently, we have also shown that standard orderings used to check termination of first-order rewriting systems, such as the recursive path ordering (rpo) [5], can be generalised to deal with nominal terms and α-equivalence [10]. The nominal recursive path ordering inherits the properties of the rpo, and can be used to check termination of NRSs. Using this result, we have designed a completion procedure à la Knuth and Bendix [15] for closed NRSs. The principle behind completion is that if a given equational theory is presented by a terminating but not confluent rewrite system, then we can try to transform it into a confluent one by computing its critical pairs and adding rules to join them, preserving termination (but completion may fail, or may not terminate).
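The substitution rules (σvar)–(σλ) shown earlier can be replayed concretely. The sketch below is our illustration, not the author's implementation: it works on ground nominal terms only, so each freshness side condition becomes the decidable check that an atom does not occur free in the substituted term, and the tuple-based AST and the names `free_atoms` and `push` are assumptions of this sketch.

```python
# Ground nominal terms:
#   ('atom', a) | ('app', t, u) | ('lam', a, t) | ('sub', t, a, u)
# where ('sub', t, a, u) stands for the explicit substitution t{a -> u}.

def free_atoms(t):
    """Atoms occurring free in a ground term (abstraction binds its atom)."""
    tag = t[0]
    if tag == 'atom':
        return {t[1]}
    if tag == 'app':
        return free_atoms(t[1]) | free_atoms(t[2])
    if tag == 'lam':
        return free_atoms(t[2]) - {t[1]}
    # ('sub', body, a, arg): a is consumed, arg's atoms appear
    return (free_atoms(t[1]) - {t[2]}) | free_atoms(t[3])

def push(t):
    """Normalise by pushing explicit substitutions inward (the sigma-rules)."""
    tag = t[0]
    if tag == 'atom':
        return t
    if tag == 'app':
        return ('app', push(t[1]), push(t[2]))
    if tag == 'lam':
        return ('lam', t[1], push(t[2]))
    body, a, arg = push(t[1]), t[2], t[3]
    if body[0] == 'atom':
        return arg if body[1] == a else body          # (sigma_var) / (sigma_eps)
    if body[0] == 'app':                              # (sigma_app)
        return push(('app', ('sub', body[1], a, arg), ('sub', body[2], a, arg)))
    # body[0] == 'lam': (sigma_lam), side condition b # arg
    b = body[1]
    assert b not in free_atoms(arg), "capture: rename the bound atom first"
    return ('lam', b, push(('sub', body[2], a, arg)))

# app(a, lam[b] a){a -> c}  pushes through to  app(c, lam[b] c)
t = ('sub', ('app', ('atom', 'a'), ('lam', 'b', ('atom', 'a'))), 'a', ('atom', 'c'))
print(push(t))  # -> ('app', ('atom', 'c'), ('lam', 'b', ('atom', 'c')))
```

On open nominal terms (with metavariables X, Y) the same rules require freshness assumptions a#Y in the context rather than a syntactic check, which is exactly what the ⊢ notation above records.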
Completion has been generalised to systems that use higher-order functions but no binders, i.e. with a first-order syntax [16]. In the case of higher-order rewriting systems, not only do we need an ordering that can deal with terms including binders, but also, after computing a critical pair, we need to be able to add the corresponding rules if the pair is not joinable. Adding these rules may not always be possible, as mentioned in [18], due to the syntactic or type restrictions used in higher-order rewriting formalisms. So far, no completion procedures are available for CRSs, ERSs or HRSs. However, NRSs do not rely on a typed language as HRSs do, and do not impose the syntactic restrictions that ERSs and CRSs impose. A completion procedure can indeed be defined for NRSs when the rules are closed. This result opens the way for the development of tools for automated reasoning in equational theories that include binders.

References

[1] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, Great Britain, 1998.
[2] C. Calvès and M. Fernández. Matching and alpha-equivalence check for nominal terms. Journal of Computer and System Sciences, special issue: selected papers from WOLLIC 2008. Elsevier, 2009.
[3] J. Cheney. The complexity of equivariant unification. In Automata, Languages and Programming, Proceedings of the 31st Int. Colloquium, ICALP 2004, volume 3142 of Lecture Notes in Computer Science. Springer, 2004.
[4] R. A. Clouston and A. M. Pitts. Nominal equational logic. In Computation, Meaning and Logic: Articles dedicated to Gordon Plotkin, L. Cardelli, M. Fiore and G. Winskel (eds.), volume 1496, Electronic Notes in Theoretical Computer Science. Elsevier, 2007.
[5] N. Dershowitz. Orderings for term-rewriting systems. Theoretical Computer Science, 17(3):279–301. Elsevier, 1982.
[6] N. Dershowitz and J.-P. Jouannaud. Rewrite systems. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, volume B: Formal Methods and Semantics. North-Holland, 1989.
[7] M. Fernández and M. J. Gabbay. Nominal rewriting. Information and Computation, 205(6), 2007.
[8] M. Fernández and M. J. Gabbay. Closed nominal rewriting and efficiently computable nominal algebra equality. In Proceedings of LFMTP 2010, EPTCS.
[9] M. Fernández, M. J. Gabbay, and I. Mackie. Nominal rewriting systems. In Proceedings of the 6th ACM SIGPLAN Symposium on Principles and Practice of Declarative Programming (PPDP'04), Verona, Italy. ACM Press, 2004.
[10] M. Fernández and A. Rubio. Reduction orderings and completion for rewrite systems with binding. Available from http://www.dcs.kcl.ac.uk/staff/maribel/
[11] M. J. Gabbay and A. Mathijssen. Nominal algebra. In Proceedings of the 18th Nordic Workshop on Programming Theory (NWPT'06), 2006.
[12] M. J. Gabbay and A. M. Pitts. A new approach to abstract syntax involving binders. In 14th Annual Symposium on Logic in Computer Science, pages 214–224. IEEE Computer Society Press, 1999.
[13] Z. Khasidashvili. Expression reduction systems. In Proceedings of I. Vekua Institute of Applied Mathematics, volume 36, pages 200–220, Tbilisi, 1990.
[14] J.-W. Klop, V. van Oostrom, and F. van Raamsdonk. Combinatory reduction systems: introduction and survey. Theoretical Computer Science, 121:279–308, 1993.
[15] D. Knuth and P. Bendix. Simple word problems in universal algebras. In Computational Problems in Abstract Algebra, ed. J. Leech, pages 263–297. Pergamon Press, Oxford, 1970.
[16] K. Kusakari and Y. Chiba. A higher-order Knuth-Bendix procedure and its applications. IEICE Transactions on Information and Systems, E90-D(4):707–715, 2007.
[17] R. Mayr and T. Nipkow. Higher-order rewrite systems and their confluence. Theoretical Computer Science, 192:3–29, 1998.
[18] T. Nipkow and C. Prehofer. Higher-order rewriting and equational reasoning. In W. Bibel and P.
Schmitt, editors, Automated Deduction — A Basis for Applications, Volume I: Foundations, volume 8 of Applied Logic Series, pages 399–430. Kluwer, 1998.
[19] A. M. Pitts. Nominal logic, a first order theory of names and binding. Information and Computation, 186:165–193, 2003.
[20] Terese. Term Rewriting Systems, volume 55 of Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 2003.
[21] C. Urban, A. M. Pitts, and M. J. Gabbay. Nominal unification. Theoretical Computer Science, 323:473–497, 2004.

Equivalence of Algebraic λ-calculi – extended abstract∗ –

Alejandro Díaz-Caro (LIG, Université de Grenoble, France, [email protected])
Simon Perdrix (CNRS, LIG, Université de Grenoble, France, [email protected])
Christine Tasson (CEA-LIST, MeASI, France, [email protected])
Benoît Valiron (LIG, Université de Grenoble, France, [email protected])

We examine the relationship between the algebraic λ-calculus (λalg) [9], a fragment of the differential λ-calculus [4], and the linear-algebraic λ-calculus (λlin) [1], a candidate λ-calculus for quantum computation. Both calculi are algebraic: each one is equipped with an additive and a scalar-multiplicative structure, and the set of terms is closed under linear combinations. We answer the conjectured question of the simulation of λalg by λlin [2] and the reverse simulation of λlin by λalg. Our proof relies on the observation that λlin is essentially call-by-value, while λalg is call-by-name. The former simulation uses the standard notion of thunks, while the latter is based on an algebraic extension of the continuation passing style. This result is a step towards an extension of the call-by-value / call-by-name duality to algebraic λ-calculi.

1 Introduction

Context. Two algebraic versions of the λ-calculus arose independently in distinct contexts: the algebraic λ-calculus (λalg) and the linear-algebraic λ-calculus (λlin).
The former has been introduced in the context of linear logic as a fragment of the differential λ-calculus. The latter has been introduced as a candidate λ-calculus for quantum computation: in λlin, a linear combination of terms reflects the phenomenon of superposition, i.e. the capability of a quantum system to be in two or more states at the same time.

Linearity of functions and arguments. In both languages, functions which are linear combinations of terms are interpreted pointwise: (α.f + β.g) x = α.(f) x + β.(g) x, where "." is the external product. The two languages differ in the treatment of arguments. In λlin, any function is considered as a linear map: (f) (α.x + β.y) →∗ℓ α.(f) x + β.(f) y, reflecting the fact that any quantum evolution is a linear map; λalg, by contrast, has a call-by-name evolution: (λx M) N →a M[x := N], without restriction on N. As a consequence, the evolutions are different, as illustrated by the following example. In λlin,

  (λx (x) x) (α.y + β.z) →∗ℓ α.(y) y + β.(z) z,

while in λalg,

  (λx (x) x) (α.y + β.z) →a (α.y + β.z) (α.y + β.z) =a α².(y) y + (αβ).(y) z + (βα).(z) y + β².(z) z.

∗ A full version of this paper with all the proofs is available on the arXiv.

Simulations. These two languages behave in different manners. An essential question is whether they are nonetheless equivalent (and in which manner). Indeed, a positive answer would link two distinct research areas and unify work done in linear logic with work on quantum computation. It has been conjectured [2] that λlin simulates λalg. Our contribution is to prove this formally (Section 3.1) and to provide the converse proof, that λalg simulates λlin (Section 3.2). The first simulation
uses the encoding known as "thunks" in the folklore [6], which is based on "freezing" the evaluation of arguments by systematically encapsulating them into abstractions (that is, making them into values). It has been extensively studied in the case of the regular, untyped λ-calculus [5]. The converse simulation is based on an algebraic extension of the continuation passing style encoding [8].

Modifications to the original calculi. In this paper we slightly modify the two languages. The unique modification to λalg consists in disallowing reduction under λ, so that for any M, λx M is a value. As a consequence, λ is not linear: λx (α.M + β.N) ≠ α.λx M + β.λx N. In λlin, we restrict the application of several rewriting rules in order to make the rules more coherent with a call-by-value leftmost-redex evaluation. For instance, the rule (M + N) L →ℓ (M) L + (N) L is restricted to the case where both M + N and L are values. Finally, several distinct techniques can be used to make an algebraic calculus confluent. In λlin, restrictions on reduction rules are introduced, e.g. α.M + β.M →ℓ (α + β).M only if M is closed normal. In λalg a restriction to positive scalars is proposed. Finally, one can use a typing system to guarantee confluence. In this paper, we assume that one of these techniques – without specifying explicitly which one – is used to make the calculi confluent.

2 Algebraic λ-calculi

The languages λlin and λalg share the same syntax, defined as follows:

  M, N, L ::= V | (M) N | M + N | α.M    (terms),
  U, V, W ::= 0 | B | α.B | V + W        (values),
  B ::= x | λx M                          (basis terms),

where α represents scalars, which may themselves be defined by a term grammar and endowed with a term rewrite system compatible with their basic ring operations (+, ×). Formally this is captured in the definition of a scalar rewrite system [1, sec. III – def. 1], but for our purpose it is sufficient to think of them as a ring.
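The example from the introduction, (λx (x) x) applied to α.y + β.z, can be made concrete in code. The sketch below is our illustration, not part of the paper: a linear combination of base terms is represented as a dict from terms to scalars, numeric scalars stand in for α and β, and the two functions replay how each calculus applies the self-application term δ = λx (x) x to such a sum.

```python
def apply_lin(arg):
    """lambda_lin: the argument is distributed first (linearity of arguments),
    so delta = \\x.(x) x is applied to each base term separately:
    (delta)(a.y + b.z) ->* a.(y) y + b.(z) z."""
    return {('app', b, b): a for b, a in arg.items()}

def apply_alg(arg):
    """lambda_alg: call-by-name substitutes the whole sum for x in (x) x,
    and the resulting application of a sum to a sum expands bilinearly."""
    out = {}
    for b1, a1 in arg.items():
        for b2, a2 in arg.items():
            key = ('app', b1, b2)
            out[key] = out.get(key, 0) + a1 * a2
    return out

arg = {'y': 2, 'z': 3}        # stands for the combination 2.y + 3.z
print(apply_lin(arg))         # only the diagonal terms (y) y and (z) z appear
print(apply_alg(arg))         # cross terms (y) z and (z) y appear, squared coefficients
```

The diagonal-only result of `apply_lin` versus the four-term result of `apply_alg` is exactly the divergence between →∗ℓ and =a shown in the example above.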
The main differences between the two languages are the β-reduction and the algebraic linearity of function arguments. If U, V and W are values, and B is a basis term, the rules are defined by:

  (λx M) N →a M[x := N]          (βλalg),
  (λx M) B →ℓ M[x := B]          (βλlin),
  (U) (V + W) →ℓ (U) V + (U) W   (γλlin),
  (V) (α.W) →ℓ α.(V) W           (γλlin),
  (V) 0 →ℓ 0                     (γλlin).

In both languages, + is associative and commutative, i.e. (M + N) + L = M + (N + L) and M + N = N + M. Notwithstanding their different axiomatisations – one based on equations and the other on rewriting rules – linear combinations of terms are treated in the same way: the set of terms behaves as a module over the ring of scalars in both languages. In λalg the following algebraic equality is defined¹:

  (M + N) L =a (M) L + (N) L      α.(β.M) =a (α × β).M
  (α.M) N =a α.(M) N              (0) M =a 0
  0 + M =a M                      1.M =a M
  α.(M + N) =a α.M + α.N          0.M =a 0
  α.M + β.M =a (α + β).M          α.0 =a 0

By contrast, the ring structure and the linearity of functions in λlin are provided by reduction rules. Letting U, V and W stand for values², the rules are defined as follows:

  (U + V) W →ℓ (U) W + (V) W      0 + M →ℓ M
  (α.V) W →ℓ α.(V) W              α.(β.M) →ℓ (α × β).M
  α.(M + N) →ℓ α.M + α.N          (0) V →ℓ 0
  α.M + β.M →ℓ (α + β).M          1.M →ℓ M
  α.M + M →ℓ (α + 1).M            0.M →ℓ 0
  M + M →ℓ (1 + 1).M              α.0 →ℓ 0

The context rules for both languages are: if M → M′ then (M) N → (M′) N, M + N → M′ + N and α.M → α.M′; if N → N′ then M + N → M + N′. λlin has one additional context rule: if M →ℓ M′ then (V) M →ℓ (V) M′.

¹ The reader should not be surprised by noticing that two terms that are =a-equal may reduce to terms that are no longer equal. Indeed, this is already the case with the syntactical equality of the λ-calculus.
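The ring-structure rules of λlin essentially collect like terms in a sum. Ignoring the closed-normal side conditions needed for confluence, their joint effect on a flat sum of scaled terms can be sketched as follows (our illustration, with numeric scalars standing in for the scalar ring):

```python
def normalise(summands):
    """Collect a list of (scalar, term) pairs into canonical form, applying
    alpha.M + beta.M -> (alpha + beta).M, 0 + M -> M and 0.M -> 0."""
    combo = {}
    for a, m in summands:
        combo[m] = combo.get(m, 0) + a          # merge coefficients of equal terms
    return {m: a for m, a in combo.items() if a != 0}   # drop 0.M summands

# 2.M + 3.M + 1.N + (-1).N  normalises to  5.M
print(normalise([(2, 'M'), (3, 'M'), (1, 'N'), (-1, 'N')]))  # {'M': 5}
```

The special rules α.M + M →ℓ (α + 1).M and M + M →ℓ (1 + 1).M are instances of the same collection step with implicit coefficient 1.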
The β-reduction of λalg corresponds to a call-by-name evaluation, while the β-reduction of λlin occurs only if the argument is a basis term, i.e. a variable or an abstraction. The γ-rules, only available in λlin, allow linearity in the arguments.

3 Simulations

3.1 λlin simulates λalg

We consider the following encoding L·M : λalg → λlin. The variables f and z are chosen fresh.

  L0M = 0,
  LxM = (x) f,
  Lλx MM = λx LMM,
  L(M) NM = (LMM) λz LNM,
  LM + NM = LMM + LNM,
  Lα.MM = α.LMM.

One could be tempted to prove a result in the line of: M →a N implies LMM →∗ℓ LNM. Unfortunately this does not work. Indeed, the encoding introduces "administrative" redexes, as in the following example (where I = λx x). Although (λx λy (y) x) I →a λy (y) I, the terms

  L(λx λy (y) x) IM = (λx λy ((y) f) (λz (x) f)) (λz λx (x) f) →∗ℓ λy ((y) f) (λz (λz λx (x) f) f),
  Lλy (y) IM = λy ((y) f) (λz λx (x) f)

are not equal: there is an "administrative" redex hidden in the first expression. This redex does not carry any information; it is only introduced by the encoding. In order to clear these redexes, we define the map Admin as follows:

  Admin 0 = 0,
  Admin x = x,
  Admin ((λf M) f) = Admin M,
  Admin (M) N = (Admin M) Admin N,
  Admin λx M = λx Admin M,
  Admin (M + N) = Admin M + Admin N,
  Admin α.M = α.Admin M.

Theorem 3.1 For any program (i.e. closed term) M, if M →a N and LNM →∗ℓ V for a value V, then there exists M′ such that LMM →∗ℓ M′ and Admin M′ = Admin V.

Proof. By induction on the derivation of M →a N. ⊓⊔

Lemma 3.2 If W is a value and M a term such that Admin W = Admin M, then there exists a value V such that M →∗ℓ V and Admin W = Admin V.

Lemma 3.3 If V is a closed value, then LV M is a value.

² Notice that in λlin a value is not necessarily in normal form. For instance, the value λx x + λx x reduces to 2.λx x. The reductions of values result solely from the ring structure, and all values are normalizing terms.
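The thunk encoding L·M can be transcribed on a small term AST. The sketch below is our illustration, not the paper's artefact: tuple-based syntax nodes are an assumption, and the fresh names f and z are hard-coded, so the input term is assumed not to use them.

```python
def thunk(t):
    """L.M from Section 3.1: freeze arguments under an abstraction,
    force variables by applying them to the fresh variable f."""
    if t == 'zero':                     # L0M = 0
        return 'zero'
    tag = t[0]
    if tag == 'var':                    # LxM = (x) f
        return ('app', t, ('var', 'f'))
    if tag == 'lam':                    # L\x MM = \x LMM
        return ('lam', t[1], thunk(t[2]))
    if tag == 'app':                    # L(M) NM = (LMM) \z LNM
        return ('app', thunk(t[1]), ('lam', 'z', thunk(t[2])))
    if tag == 'add':                    # LM + NM = LMM + LNM
        return ('add', thunk(t[1]), thunk(t[2]))
    if tag == 'scal':                   # La.MM = a.LMM
        return ('scal', t[1], thunk(t[2]))

# (\x x) y  is encoded as  (\x (x) f) \z (y) f : the argument y is frozen
# under \z (a thunk), and each variable occurrence is forced with f.
ex = ('app', ('lam', 'x', ('var', 'x')), ('var', 'y'))
print(thunk(ex))
```

Note how every argument position becomes an abstraction, hence a basis term, so λlin's restricted β-rule (βλlin) always fires on encoded applications.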
Theorem 3.4 (Simulation) For any program (i.e. closed term) M, if M →∗a V for a value V, then there exists a value W such that LMM →∗ℓ W and Admin W = Admin LV M.

Proof. By induction on the length of the reduction sequence M →∗a V. If M = V, the statement is trivially true by choosing W = LV M, which is a value since V is closed, by Lemma 3.3. Now, suppose the result holds for the reduction N →∗a V and suppose that M →a N. By the induction hypothesis, LNM →∗ℓ W for some value W such that Admin W = Admin LV M. By Theorem 3.1, there exists M′ such that LMM →∗ℓ M′ and Admin M′ = Admin W. By Lemma 3.2, without loss of generality we can choose this M′ to be a value W′. This closes the proof of the theorem: we have indeed the equality Admin W′ = Admin LV M. ⊓⊔

3.2 λalg simulates λlin

To prove the simulation of λlin by λalg we use the following encoding. It is an algebraic extension of the continuation passing style used to prove that call-by-name simulates call-by-value in the regular λ-calculus [8]. Let J·K : λlin → λalg be the following encoding. The variables f, g and h are chosen fresh.

  JxK = λf (f) x,
  J0K = 0,
  Jλx MK = λf (f) λx JMK,
  J(M) NK = λf (JMK) λg (JNK) λh ((g) h) f,
  Jα.MK = λf (α.JMK) f,
  JM + NK = λf (JMK + JNK) f.

Let Ψ be the encoding for values defined by:

  Ψ(x) = x,
  Ψ(0) = 0,
  Ψ(λx M) = λx JMK,
  Ψ(α.V) = α.Ψ(V),
  Ψ(V + W) = Ψ(V) + Ψ(W).

Using this encoding, it is possible to prove that λalg simulates λlin for any program reducing to a value:

Theorem 3.5 (Simulation) For any program M, if M →∗ℓ V where V is a value, then JMK (λx x) →∗a Ψ(V).

Thanks to the subtle modifications made to the original algebraic calculi (presented in the introduction), the proof in [8] can easily be extended to the algebraic case. We first define a convenient infix operation (:) that captures the behaviour of the translated terms. For example, if B is a base term, i.e.
a variable or an abstraction, then its translation into λalg is JBK = λf (f) Ψ(B). If we apply this translated term to some K, we obtain (λf (f) Ψ(B)) K →a (K) Ψ(B). We capture this by defining B : K = (K) Ψ(B). In general, M : K is the reduct of the λalg term JMK K, as Lemma 3.7 states.

Definition 3.6 Let (:) : Λλlin × Λλalg → Λλalg be the infix binary operation defined as follows:

  B : K = (K) Ψ(B)                            (with B a base term),
  (M) N : K = M : λg (JNK) λh ((g) h) K       (with M not a value),
  (M) N : K = N : λf ((Ψ(M)) f) K             (with M, but not N, a value),
  (M) N : K = ((Ψ(M)) Ψ(N)) K                 (with M a value and N a base term),
  (M) (N1 + N2) : K = ((M) N1 + (M) N2) : K   (with M and N1 + N2 values),
  (M) (α.N) : K = α.((M) N : K)               (with M and α.N values),
  (M) 0 : K = 0                               (with M a value),
  (M + N) : K = M : K + N : K,
  α.M : K = α.(M : K),
  0 : K = 0.

Lemma 3.7 If K is a value, then for all M, JMK K →∗a M : K.

Lemma 3.8 If M →ℓ N, then for every value K, M : K →∗a N : K.

The proof of Theorem 3.5 is now carried out as follows.

Proof of Theorem 3.5. By Lemma 3.7, JMK (λx x) reduces to M : (λx x). By Lemma 3.8, this term reduces to V : (λx x). We now proceed by structural induction on V.
• Let V be a base term. Then V : (λx x) = (λx x) Ψ(V) →a Ψ(V).
• Let V = V1 + V2. Then V : (λx x) = V1 : (λx x) + V2 : (λx x), which, by the induction hypothesis, reduces to Ψ(V1) + Ψ(V2) = Ψ(V).
• Let V = α.V′. Then V : (λx x) = α.(V′ : (λx x)), which, by the induction hypothesis, reduces to α.Ψ(V′) = Ψ(V). ⊓⊔

4 Conclusion and perspectives

In this paper we proved the conjectured [2] simulation of λalg by λlin and its converse, on valid programs (that is, programs reducing to values), answering an open question about the equivalence of the algebraic λ-calculus (λalg) [9] and the linear-algebraic λ-calculus (λlin) [1].
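Looking back at Section 3.2, the CPS encoding J·K can be transcribed in the same tuple-AST style as before. This is again our illustrative sketch, not the paper's artefact; the fresh names f, g and h are hard-coded, so input terms are assumed to avoid them.

```python
def cps(t):
    """J.K from Section 3.2: every term becomes a function of a continuation f."""
    if t == 'zero':                     # J0K = 0
        return 'zero'
    tag = t[0]
    if tag == 'var':                    # JxK = \f (f) x
        return ('lam', 'f', ('app', ('var', 'f'), t))
    if tag == 'lam':                    # J\x MK = \f (f) \x JMK
        return ('lam', 'f', ('app', ('var', 'f'), ('lam', t[1], cps(t[2]))))
    if tag == 'app':                    # J(M) NK = \f (JMK) \g (JNK) \h ((g) h) f
        inner = ('lam', 'h',
                 ('app', ('app', ('var', 'g'), ('var', 'h')), ('var', 'f')))
        return ('lam', 'f',
                ('app', cps(t[1]), ('lam', 'g', ('app', cps(t[2]), inner))))
    if tag == 'scal':                   # Ja.MK = \f (a.JMK) f
        return ('lam', 'f', ('app', ('scal', t[1], cps(t[2])), ('var', 'f')))
    if tag == 'add':                    # JM + NK = \f (JMK + JNK) f
        return ('lam', 'f', ('app', ('add', cps(t[1]), cps(t[2])), ('var', 'f')))

print(cps(('var', 'x')))  # -> ('lam', 'f', ('app', ('var', 'f'), ('var', 'x')))
```

Applying an encoded program to the identity continuation, as in Theorem 3.5, then amounts to building the λalg term ('app', cps(M), ('lam', 'x', ('var', 'x'))).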
As already shown by Plotkin [8], while the simulation of call-by-value by call-by-name is sound, it fails to be complete for general (possibly non-terminating) programs. To make it complete, a known solution is to consider the problem from the point of view of Moggi's computational calculus [7]. A direction for study is therefore to consider an algebraic computational λ-calculus instead of a general algebraic λ-calculus. This raises the question of finding a correct notion of monad capturing both algebraicity and non-termination in the context of higher-order structures. Another direction of study is the relation between the simulation of call-by-name by call-by-value using thunks and the CPS encoding; a starting point for this study is [5]. Concerning semantics, the algebraic λ-calculus admits finiteness spaces as a model [3]. What is the structure of the model of the linear-algebraic λ-calculus induced by the continuation-passing style translation in finiteness spaces? The algebraic λ-calculus can be equipped with a differential operator. What is the corresponding operator in λlin through the translation?

References

[1] Pablo Arrighi & Gilles Dowek (2008): Linear-algebraic lambda-calculus: higher-order, encodings, and confluence. In: Andrei Voronkov, editor: RTA 2008, Lecture Notes in Computer Science 5117, Springer, Hagenberg, Austria, pp. 17–31.
[2] Pablo Arrighi & Lionel Vaux (2009): Embeding AlgLam into Lineal. Private communication.
[3] Thomas Ehrhard (2005): Finiteness spaces. Mathematical Structures in Computer Science 15(4), pp. 615–646.
[4] Thomas Ehrhard & Laurent Regnier (2003): The differential lambda-calculus. Theoretical Computer Science 309(1), pp. 1–41.
[5] John Hatcliff & Olivier Danvy (1997): Thunks and the lambda-calculus. Journal of Functional Programming 7(3), pp. 303–319.
[6] Peter Zilahy Ingerman (1961): Thunks: a way of compiling procedure statements with some comments on procedure declarations. Communications of the ACM 4(1), pp. 55–58.
[7] Eugenio Moggi (1989): Computational lambda-calculus and monads. In: LICS, IEEE Computer Society, pp. 14–23.
[8] Gordon D. Plotkin (1975): Call-by-name, call-by-value and the lambda-calculus. Theoretical Computer Science 1(2), pp. 125–159.
[9] Lionel Vaux (2009): The algebraic lambda calculus. Mathematical Structures in Computer Science 19(5), pp. 1029–1059.

Uncurrying for Innermost Termination and Derivational Complexity∗

Harald Zankl (Institute of Computer Science, University of Innsbruck, Austria)
Nao Hirokawa (School of Information Science, Japan Advanced Institute of Science and Technology, Japan)
Aart Middeldorp (Institute of Computer Science, University of Innsbruck, Austria)

In this paper we investigate the uncurrying transformation from (Hirokawa et al., 2008) for innermost termination and derivational complexity.

1 Introduction

Proving termination of first-order applicative term rewrite systems is challenging since the rules lack sufficient structure. But these systems are important since they provide a natural framework for modeling higher-order aspects found in functional programming languages, some of which (e.g., OCaml) have an eager evaluation strategy. Since proving termination is easier for innermost rewriting than for full rewriting, we lift some of the recent results from [2] from full to innermost termination. For the properties that do not transfer to the innermost setting we provide counterexamples. Furthermore, we show that the uncurrying transformation is suitable for (innermost) derivational complexity analysis.

The remainder of this paper is organised as follows. After recalling the uncurrying transformation from [2] in Section 2, we show in Section 3 that it preserves innermost nontermination (but not innermost termination). In Section 4 we show that it preserves polynomial complexities of programs.
2 Uncurrying

We assume familiarity with term rewriting [1, 7]. This section recalls definitions and results from [2]. An applicative term rewrite system (ATRS for short) is a TRS over a signature that consists of constants and a single binary function symbol called application, denoted by the infix and left-associative symbol ◦. In examples we often use juxtaposition instead of ◦. Every ordinary TRS can be transformed into an ATRS by currying. Let F be a signature. The currying system C(F) consists of the rewrite rules

  fi+1(x1, . . . , xi, y) → fi(x1, . . . , xi) ◦ y

for every n-ary function symbol f ∈ F and every 0 ≤ i < n. Here fn = f and, for every 0 ≤ i < n, fi is a fresh function symbol of arity i. The currying system C(F) is confluent and terminating. Hence every term t has a unique normal form t↓C(F). For instance, f(a, b) is transformed into f a b. Let R be a TRS over the signature F. The curried system R↓C(F) is the ATRS consisting of the rules l↓C(F) → r↓C(F) for every l → r ∈ R. The signature of R↓C(F) contains the application symbol ◦ and a constant f0 for every function symbol f ∈ F. In the following we write R↓C for R↓C(F) whenever F can be inferred from the context or is irrelevant. Moreover, we write f for f0.

∗ This research is supported by FWF (Austrian Science Fund) project P18763 and the Grant-in-Aid for Young Scientists Nos. 20800022 and 22700009 of the Japan Society for the Promotion of Science.

Next we recall the uncurrying transformation from [2]. Let R be an ATRS over a signature F. The applicative arity aa(f) of a constant f ∈ F is defined as the maximum n such that f ◦ t1 ◦ · · · ◦ tn is
This notion is extended to terms as follows: aa(t) = aa(f ) if t is a constant f and aa(t1 ) − 1 if t = t1 ◦ t2 . Note that aa(t) is undefined if the head symbol of t is a variable. The uncurrying system U(F) consists of the rewrite rules fi (x1 , . . . , xi ) ◦ y → fi+1 (x1 , . . . , xi , y) for every constant f ∈ F and every 0 6 i < aa(f ). Here f0 = f and, for every i > 0, fi is a fresh function symbol of arity i. We say that R is left head variable free if aa(t) is defined for every non-variable subterm t of a left-hand side of a rule in R. This means that no subterm of a left-hand side in R is of the form t1 ◦ t2 where t1 is a variable. The uncurrying system U(F), or simply U, is confluent and terminating. Hence every term t has a unique normal form t↓U . The uncurried system R↓U is the TRS consisting of the rules l↓U → r↓U for every l → r ∈ R. However the rules of R↓U are not enough to simulate an arbitrary rewrite sequence in R. The natural idea is now to add U(F). In the following we write U + (R, F) for R↓U (F ) ∪ U(F). If F can be inferred from the context or is irrelevant, U + (R, F) is abbreviated to U + (R). Let R be a left head variable free ATRS. The η-saturated ATRS Rη is the smallest extension of R such that l ◦ x → r ◦ x ∈ Rη whenever l → r ∈ Rη and aa(l) > 0. Here x is a variable that does not appear in l → r. For a term t over the signature of the TRS U + (R), we denote by t↓C ′ the result of identifying different function symbols in t↓C that originate from the same function symbol in F. The notation ↓C ′ is extended to TRSs and substitutions in the obvious way. For a substitution σ, we write σ↓U for the substitution {x 7→ σ(x)↓U | x ∈ V}. Next we recall some results from [2]. Lemma 1. Let σ be a substitution. If t is head variable free then t↓U σ↓U = (tσ)↓U . Lemma 2. Let R be a left head variable free ATRS. If s and t are terms over the signature of U + (R) then (1) s →R↓U t if and only if s↓C ′ →R t↓C ′ and (2) s →U t implies s↓C ′ = t↓C ′ . 
Theorem 3. A left head variable free ATRS R is terminating if and only if U+(Rη) is terminating.

3 Innermost Uncurrying

Before showing that our transformation reflects innermost termination, we show that it does not preserve innermost termination. Below we write s →i_R t for an innermost R-step, s →ri_R t for a rightmost innermost step, and →i,ε_R for an innermost root step.

Example 4. Consider the ATRS R = {f x → f x, f → g}. In an innermost sequence the first rule is never applied and hence R is innermost terminating. We have U+(Rη) = {f1(x) → f1(x), f → g, f1(x) → g ◦ x, f ◦ x → f1(x)}, which is not innermost terminating due to the rule f1(x) → f1(x).

The next example shows that s →i_R t does not imply s↓U →i+_{U+(Rη)} t↓U. This is not a counterexample to the soundness of uncurrying for innermost termination, but it shows that the proof of the "if-direction" of Theorem 3 (see [2] for details) cannot be adopted for the innermost case without further ado.

Example 5. Consider the ATRS R = {f → g, a → b, g x → h} and the innermost step s = f a →i_R g a = t. We have s↓U = f ◦ a and t↓U = g1(a). In the TRS U+(Rη) = {f → g, a → b, g1(x) → h, g ◦ x → g1(x)} we have s↓U →i_{U+(Rη)} g ◦ a, but the step from g ◦ a to t↓U is not innermost.

The above problems can be solved if we consider terms that are not completely uncurried. The next two lemmata prepare for our proof. Below we write s ⊲ t if t is a proper subterm of s.

Lemma 6. Let R be a left head variable free ATRS. If s is a term over the signature of R, s ∈ NF(R), and s →*U t, then t ∈ NF(Rη↓U).

Lemma 7. →*U · ⊲ ⊆ ⊲ · →*U

Lemma 8. For every left head variable free ATRS R the inclusion ←*U · →i,ε_R ⊆ →i+_{U+(Rη)} · ←*U holds.

Proof. We prove that s →i+_{U+(Rη)} r↓U σ↓U ←*U rσ whenever s ←*U lσ →i,ε_R rσ for some rewrite rule l → r in R. By Lemma 1 and the confluence of U, s →*U (lσ)↓U = l↓U σ↓U →_{U+(Rη)} r↓U σ↓U ←*U rσ. It remains to show that the sequence s →*U (lσ)↓U and the step l↓U σ↓U →_{U+(Rη)} r↓U σ↓U are innermost with respect to U+(Rη).
For the former, let s →*U C[u] →U C[u′] →*U (lσ)↓U with u →ε_U u′, and let t be a proper subterm of u. Obviously lσ →*U C[u] ⊲ t. According to Lemma 7, lσ ⊲ v →*U t for some term v. Since lσ →i_R rσ, the term v is a normal form of R. Hence t ∈ NF(Rη↓U) by Lemma 6. Since u →ε_U u′, t is also a normal form of U. Hence t ∈ NF(U+(Rη)) as desired. For the latter, let t be a proper subterm of (lσ)↓U. According to Lemma 7, lσ ⊲ u →*U t. The term u is a normal form of R. Hence t ∈ NF(Rη↓U) by Lemma 6. Obviously t ∈ NF(U), and thus also t ∈ NF(U+(Rη)).

The next example shows that for Lemma 8 the R-step must take place at the root position.

Example 9. If R = {f → g, f x → g x, a → b} then f1(a) ←*U f ◦ a →i_R g ◦ a, but there is no term v with f1(a) →i+_{U+(Rη)} v ←*U g ◦ a.

In order to extend Lemma 8 to non-root positions, we have to use rightmost innermost steps →ri. This avoids the situation in the above example, where parallel redexes become nested by uncurrying.

Lemma 10. For every left head variable free ATRS R the inclusion ←*U · →ri_R ⊆ →i+_{U+(Rη)} · ←*U holds.

Proof. Let s ←*U t = C[lσ] →ri_R C[rσ] = u with lσ →i,ε_R rσ. We use induction on C. If C = □ then s ←*U t →i,ε_R u, and Lemma 8 yields s →i+_{U+(Rη)} · ←*U u. For the induction step we consider two cases.

• Suppose C = □ ◦ s1 ◦ ··· ◦ sn with n > 0. Since R is left head variable free, aa(l) is defined. If aa(l) = 0 then s = t′ ◦ s′1 ◦ ··· ◦ s′n ←*U lσ ◦ s1 ◦ ··· ◦ sn →i_R rσ ◦ s1 ◦ ··· ◦ sn with t′ ←*U lσ and s′j ←*U sj for 1 ≤ j ≤ n. The claim follows using Lemma 8 and the fact that innermost rewriting is closed under contexts. If aa(l) > 0 then the head symbol of l cannot be a variable. We have to consider two cases. In the case where the leftmost ◦ symbol in C has not been uncurried, we proceed as when aa(l) = 0. If the leftmost ◦ symbol of C has been uncurried, we reason as follows. We may write lσ = f ◦ u1 ◦ ··· ◦ uk where k < aa(f).
We have t = f ◦ u1 ◦ ··· ◦ uk ◦ s1 ◦ ··· ◦ sn and u = rσ ◦ s1 ◦ ··· ◦ sn. There exists an i with 1 ≤ i ≤ min{aa(f), k + n} such that s = fi(u′1, ..., u′k, s′1, ..., s′i−k) ◦ s′i−k+1 ◦ ··· ◦ s′n with u′j ←*U uj for 1 ≤ j ≤ k and s′j ←*U sj for 1 ≤ j ≤ n. Because of rightmost innermost rewriting, the terms u1, ..., uk, s1, ..., sn are normal forms of R. According to Lemma 6 the terms u′1, ..., u′k, s′1, ..., s′n are normal forms of Rη↓U. Since i − k ≤ aa(l), Rη contains the rule l ◦ x1 ◦ ··· ◦ xi−k → r ◦ x1 ◦ ··· ◦ xi−k, where x1, ..., xi−k are pairwise distinct variables not occurring in l. Hence the substitution τ = σ ∪ {x1 ↦ s1, ..., xi−k ↦ si−k} is well-defined. We obtain

s →i*_{U+(Rη)} fi(u′1↓U, ..., u′k↓U, s′1↓U, ..., s′i−k↓U) ◦ s′i−k+1 ◦ ··· ◦ s′n →i_{U+(Rη)} (r ◦ x1 ◦ ··· ◦ xi−k)↓U τ↓U ◦ s′i−k+1 ◦ ··· ◦ s′n ←*U (r ◦ x1 ◦ ··· ◦ xi−k)τ ◦ si−k+1 ◦ ··· ◦ sn = rσ ◦ s1 ◦ ··· ◦ sn = u,

where we use the confluence of U in the first sequence.

• In the second case we have C = s1 ◦ C′. Clearly C′[lσ] →ri_R C′[rσ]. If aa(s1) ≤ 0, or aa(s1) is undefined, or aa(s1) > 0 and the outermost ◦ has not been uncurried in the sequence from t to s, then s = s′1 ◦ s′ ←*U s1 ◦ C′[lσ] →ri_R s1 ◦ C′[rσ] = u with s′1 ←*U s1 and s′ ←*U C′[lσ]. If aa(s1) > 0 and the outermost ◦ has been uncurried in the sequence from t to s, then we may write s1 = f ◦ u1 ◦ ··· ◦ uk where k < aa(f). We have s = fk+1(u′1, ..., u′k, s′) for some term s′ with s′ ←*U C′[lσ] and u′i ←*U ui for 1 ≤ i ≤ k. In both cases the induction hypothesis yields s′ →i+_{U+(Rη)} · ←*U C′[rσ] and, since innermost rewriting is closed under contexts, we obtain s →i+_{U+(Rη)} · ←*U u as desired.

By Lemma 10 and the equivalence of rightmost innermost and innermost termination [6] we obtain the main result of this section.

Theorem 11.
A left head variable free ATRS R is innermost terminating if U+(Rη) is.

4 Derivational Complexity

Hofbauer and Lautemann [4] introduced the concept of derivational complexity for terminating TRSs. The idea is to measure the maximal length of rewrite sequences (derivations) depending on the size of the starting term. Formally, the derivation length of t (with respect to →) is defined as

dl(t, →) = max{m ∈ N | t →m u for some u}.

The derivational complexity dcR(n) of a TRS R is then defined as dcR(n) = max{dl(t, →R) | |t| ≤ n}, where |t| denotes the size of t. Similarly we define idcR(n) = max{dl(t, →i_R) | |t| ≤ n}. Since we regard only finite TRSs, these functions are well-defined only when R is (innermost) terminating. If dcR(n) is bounded by a linear, quadratic, cubic, ... polynomial, R is said to have linear, quadratic, cubic, ... (or polynomial) derivational complexity. A similar definition applies to idcR(n).

4.1 Full Rewriting

It is sound to use uncurrying as a preprocessor for proofs of derivational complexity:

Theorem 12. Let R be a left head variable free and terminating ATRS. Then dcR(n) ≤ dcU+(Rη)(n) for all n ∈ N.

Proof. Consider an arbitrary maximal rewrite sequence t0 →R t1 →R t2 →R ··· →R tm, which can be transformed into the sequence t0↓U →+_{U+(Rη)} t1↓U →+_{U+(Rη)} t2↓U →+_{U+(Rη)} ··· →+_{U+(Rη)} tm↓U just as in the "if-direction" of the proof of Theorem 3 (see [2]). Moreover, t0 →*_{U+(Rη)} t0↓U holds. Therefore dl(t0, →R) ≤ dl(t0, →U+(Rη)). Hence dcR(n) ≤ dcU+(Rη)(n) holds for all n ∈ N.

Next we show that uncurrying preserves polynomial complexities. Hence we disregard duplicating (exponential complexity, cf. [3]) and empty (constant complexity) ATRSs. A TRS R is called length-reducing if R is non-duplicating and |l| > |r| for all rules l → r ∈ R. The following lemma is an easy consequence of [3, Theorem 23]. Here →R/S denotes →*S · →R · →*S.

Lemma 13.
Let R be a non-empty non-duplicating TRS over a signature containing at least one symbol of arity at least two, and let S be a length-reducing TRS. If R ∪ S is terminating then dcR∪S(n) ∈ O(dcR/S(n)).

Note that the above lemma does not hold if the TRS R is empty.

Theorem 14. Let R be a non-empty, non-duplicating, left head variable free, and terminating ATRS. If dcR(n) is in O(n^k) then dcRη↓U/U(n) and dcU+(Rη)(n) are in O(n^k).

Proof. Let dcR(n) be in O(n^k) and consider a maximal rewrite sequence of →Rη↓U/U from a term t0: t0 →Rη↓U/U t1 →Rη↓U/U ··· →Rη↓U/U tm. By Lemma 2(2) we obtain the sequence t0↓C′ →R t1↓C′ →R ··· →R tm↓C′. Thus dl(t0, →Rη↓U/U) ≤ dl(t0↓C′, →R). Because |t0↓C′| ≤ 2|t0| holds, we obtain dcRη↓U/U(n) ≤ dcR(2n). By assumption the right-hand side is in O(n^k), hence dcRη↓U/U(n) is in O(n^k). Because U is length-reducing, dcU+(Rη)(n) is also in O(n^k) by Lemma 13.

In practice it is recommended to investigate dcRη↓U/U(n) instead of dcU+(Rη)(n), see [8]. The next example shows that uncurrying can be useful to enable criteria for polynomial complexity.

Example 15. Consider the ATRS R = {add x 0 → x, add x (s y) → s (add x y)}. It is easy to see that there exists a triangular matrix interpretation of dimension 2 that orients all rules in U+(Rη) strictly, inducing quadratic derivational complexity of U+(Rη) (see [5]) and, by Theorem 12, also of R. In contrast, the rule add x (s y) → s (add x y) does not admit such an interpretation of dimension 2. To see this we encoded the required condition as a satisfaction problem in non-linear arithmetic over the integers; MiniSmt¹ can prove this problem unsatisfiable.

4.2 Innermost Rewriting

Next we consider innermost derivational complexity. Let R be an innermost terminating TRS. From a result by Krishna Rao [6, Section 5.1] we infer that dl(t, →i_R) = dl(t, →ri_R) holds for all terms t.

Theorem 16.
Let R be a left head variable free and innermost terminating ATRS. We have idcR(n) ≤ idcU+(Rη)(n) for all n ∈ N.

Proof. Consider a maximal rightmost innermost rewrite sequence t0 →ri_R t1 →ri_R t2 →ri_R ··· →ri_R tm. Using Lemma 10 we obtain a sequence t0 →i+_{U+(Rη)} t′1 →i+_{U+(Rη)} t′2 →i+_{U+(Rη)} ··· →i+_{U+(Rη)} t′m for terms t′1, t′2, ..., t′m such that ti →*U t′i for all 1 ≤ i ≤ m. Thus dl(t0, →i_R) = dl(t0, →ri_R) ≤ dl(t0, →i_{U+(Rη)}). Hence we conclude idcR(n) ≤ idcU+(Rη)(n).

As Example 4 shows, uncurrying does not preserve innermost termination. Similarly, it does not preserve polynomial complexities, even if the original ATRS has linear innermost derivational complexity.

Example 17. Consider the non-duplicating ATRS R = {f → s, f (s x) → s (s (f x))}. Since the second rule is never used in innermost rewriting, idcR(n) ≤ n is easily shown by induction on n. We show that the innermost derivational complexity of U+(Rη) is at least exponential. We have U+(Rη) = {f → s, f1(s1(x)) → s1(s1(f1(x))), f ◦ x → f1(x), f1(x) → s1(x), s ◦ x → s1(x)}, and one can verify that dl(f1^n(s1(x)), →i_{U+(Rη)}) ≥ 2^n for all n ≥ 1. Hence idcU+(Rη)(n + 3) ≥ 2^n for all n ≥ 0.

References

[1] F. Baader & T. Nipkow (1998): Term Rewriting and All That. Cambridge University Press.
[2] N. Hirokawa, A. Middeldorp & H. Zankl (2008): Uncurrying for Termination. In: LPAR, LNCS 5330, pp. 667–681.
[3] N. Hirokawa & G. Moser (2008): Automated Complexity Analysis Based on the Dependency Pair Method. In: IJCAR, LNCS 5195, pp. 364–379.
[4] D. Hofbauer & C. Lautemann (1989): Termination Proofs and the Length of Derivations (Preliminary Version). In: RTA, LNCS 355, pp. 167–177.
[5] G. Moser, A. Schnabl & J. Waldmann (2008): Complexity Analysis of Term Rewriting Based on Matrix and Context Dependent Interpretations. In: FSTTCS, LIPIcs 2, pp. 304–315.
[6] M.R.K.
Krishna Rao (2000): Some Characteristics of Strong Innermost Normalization. TCS 239, pp. 141–164.
[7] Terese (2003): Term Rewriting Systems. Cambridge Tracts in Theoretical Computer Science 55, Cambridge University Press.
[8] H. Zankl & M. Korp (2010): Modular Complexity Analysis via Relative Complexity. In: RTA, LIPIcs. To appear.

¹http://cl-informatik.uibk.ac.at/software/minismt/

A Calculus of Coercions Proving the Strong Normalization of MLF

Giulio Manzonetto* (LIPN, CNRS UMR 7030, Université Paris Nord, France) [email protected]
Paolo Tranquilli† (LIP, CNRS UMR 5668, INRIA, ENS de Lyon, Université Claude Bernard Lyon 1, France) [email protected]

We provide a strong normalization result for MLF, a type system generalizing ML with first-class polymorphism as in system F. The proof is achieved by translating MLF into a calculus of coercions, and showing that this calculus is a decorated version of system F. Simulation results then entail strong normalization from the same property of system F.

Introduction. MLF [3] is a type system for (extensions of) λ-calculus which enriches ML with the first-class polymorphism of system F, providing a partial type annotation mechanism together with an automatic type reconstructor. In this extension we can write programs that cannot be written in ML, while remaining conservative: ML programs still typecheck without needing any annotation. An important feature is principal type schemata, lacking in system F, which are obtained by employing a downward bounded quantification ∀(α ≥ σ)τ, the so-called flexible quantifier. This type says that τ may be instantiated to any τ{σ′/α}, provided that σ′ is an instantiation of σ. As already pointed out, system F is contained in MLF. It is not yet known, but it is conjectured [3], that the inclusion is strict. This makes the question of strong normalization (SN, i.e. whether λ-terms typed in MLF always terminate) a non-trivial one. In this paper we answer the question positively.
The result is proved via a suitable simulation in system F, with additional structure dealing with the complex type instantiations possible in MLF. Our starting point is xMLF [5], the Church version of MLF: here type inference (and the rigid quantifier ∀(α = σ)τ, which we did not mention) is omitted, with the aim of providing an internal language to which a compiler might map the surface language briefly presented above (denoted eMLF from now on¹). Compared to Church-style system F, the type reduction →ι of xMLF is more complex, and may a priori cause unexpected glitches: it could cause non-termination, or block the reduction of a β-redex. To prove that none of this happens, we use as target language of our translation a decoration of system F, the coercion calculus, which in our opinion has its own interest. Indeed, xMLF has syntactic entities (the instantiations φ) which testify to an instance relation between types, and it is not hard to regard them as coercions. The delicate point is that some of these instantiations (the "abstractions" !α) behave in fact as variables, abstracted when introducing a bounded quantifier. In fact, for all choices of α, ∀(α ≥ σ)τ expects a coercion from σ to α. A question that arises naturally is: what does it mean to be a coercion in this context? Our answer, which works for xMLF, is in the form of a type system (Figure 2). In section 2 we will show the good properties enjoyed by the coercion calculus. The generality of the coercion calculus allows

*Supported by Digiteo project COLLODI (2009-28HD). †Supported by ANR project COMPLICE (ANR-08-BLANC-0211-01).
¹There is also a completely annotation-free version, iMLF, clearly at the cost of losing type inference.
Syntactic definitions

σ, τ ::= α | σ → τ | ⊥ | ∀(α ≥ σ)τ (types)
φ, ψ ::= τ | !α | ∀(≥ φ) | ∀(α ≥)φ | ⅋ | & | φ; ψ | 1 (instantiations)
a, b, c ::= x | λ(x : τ)a | ab | Λ(α ≥ τ)a | aφ | let x = a in b (terms)
Γ, ∆ ::= ∅ | Γ, α ≥ τ | Γ, x : τ (environments)

Reduction rules

(λ(x : τ)a)b →β a{b/x}
let x = b in a →β a{b/x}
a⅋ →ι Λ(α ≥ ⊥)a, α ∉ ftv(τ)
a1 →ι a
a(φ; ψ) →ι (aφ)ψ
(Λ(α ≥ τ)a)& →ι a{1/!α}{τ/α}
(Λ(α ≥ τ)a)(∀(α ≥)φ) →ι Λ(α ≥ τ)(aφ)
(Λ(α ≥ τ)a)(∀(≥ φ)) →ι Λ(α ≥ τφ)a{φ; !α/!α}

Figure 1: Syntactic definitions and reduction rules of xMLF.

us then to lift these results to xMLF via a translation (section 3). The main idea of the translation is the same as the one shown for eMLF in [4], where however no dynamic property was provided. Here we finally produce a proof of SN for all versions of MLF. Moreover, the bisimulation result for xMLF establishes once and for all that xMLF can be used as an internal language for eMLF, as the additional type structure cannot block reductions of programs in eMLF.

1 A short introduction to xMLF

The syntactic entities of xMLF are presented in Figure 1. Intuitively, ⊥ ≅ ∀α.α, and ∀(α ≥ σ)τ restricts the variable α to range over instances of σ only. Instantiations² generalize system F's type application, by providing a way to instantiate from one type to another. A let construct is added mainly to accommodate the type reconstructor of eMLF; apart from type inference purposes, one could assume (let x = a in b) = (λ(x : σ)b)a, with σ the correct type of a. Apart from the usual variable assignments x : τ, environments also contain type variable assignments α ≥ τ, which are abstracted by the type abstraction Λ(α ≥ τ)a. Typing judgments are of the usual form Γ ⊢ a : σ for terms, and Γ ⊢ φ : σ ≤ τ for instantiations. The latter means that φ can take a term a of type σ to aφ of type τ.
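Two of the →ι rules, a1 →ι a and a(φ; ψ) →ι (aφ)ψ, act purely on the structure of instantiations. As a rough illustration, here is a one-step reducer for just these two rules on a hypothetical tuple encoding of ours (not the authors' notation):

```python
# Sketch: ("inst", a, phi) is the term a·phi; ("one",) is the trivial
# instantiation 1 and ("seq", phi, psi) is the composition phi; psi.

def iota_step(t):
    """Apply a 1 ->iota a or a (phi; psi) ->iota (a phi) psi at the root,
    returning None when neither rule matches."""
    if t[0] == "inst":
        a, phi = t[1], t[2]
        if phi == ("one",):          # a 1 ->iota a
            return a
        if phi[0] == "seq":          # a (phi; psi) ->iota (a phi) psi
            return ("inst", ("inst", a, phi[1]), phi[2])
    return None

a = ("var", "a")
assert iota_step(("inst", a, ("one",))) == a
assert iota_step(("inst", a, ("seq", ("one",), ("bang", "alpha")))) == \
       ("inst", ("inst", a, ("one",)), ("bang", "alpha"))
```

The remaining →ι rules inspect the term under the instantiation (a type abstraction Λ(α ≥ τ)a) and perform substitutions, so they are omitted from this sketch.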
For the sake of space, we will not present here the typing rules of instantiations and terms, for which we refer to [5], along with a more detailed discussion of xMLF. Reduction rules are divided into →β (regular β-reductions) and →ι, which reduces instantiations. The type τφ is given by an inductive definition (which we do not spell out here) computing the unique type such that Γ ⊢ φ : τ ≤ τφ, provided φ typechecks. We recall (from [5]) that both →β and →ι enjoy subject reduction. Moreover, we denote by ⌈a⌉ the type erasure that forgets all type and instantiation annotations and maps xMLF terms to ordinary λ-terms (with let).

2 The coercion calculus

The syntax, the type system and the reduction rules of the coercion calculus are introduced in Figure 2. The notion of coercion is captured by the type τ ⊸ σ: the use of linear logic's linear implication for the type of coercions is no coincidence. Indeed, the typing system is a fragment of

²We follow the original notation of [5]; in particular it must be underlined that ⅋ and & have no relation whatsoever with linear logic's par and with connectives.
Syntactic definitions

σ, τ ::= α | σ → τ | κ → τ | ∀α.τ (types)
κ ::= σ ⊸ τ (coercion types)
ζ ::= τ | κ (type expressions)
a, b ::= x | λx.a | λ̄x.a | ab | a ⊳ b | a ⊲ b (terms)
u, v ::= λx.a | λ̄x.u | x ⊲ u (c-values)
Γ, ∆ ::= ∅ | Γ, x : τ | Γ, x : σ ⊸ α (environments)
L ::= ∅ | x : τ (linear environments)

Judgements: Γ; ⊢ a : σ (term judgements), Γ; ⊢ a : κ (coercion judgements), Γ; z : τ ⊢ a : σ (linear judgements).

Typing rules

(Ax) Γ; ⊢ y : ζ if Γ(y) = ζ
(App) Γ; ⊢ ab : τ if Γ; ⊢ a : σ → τ and Γ; ⊢ b : σ
(Abs) Γ; ⊢ λx.a : τ → σ if Γ, x : τ; ⊢ a : σ
(Let) Γ; ⊢ let x = a in b : σ if Γ; ⊢ a : τ and Γ, x : τ; ⊢ b : σ
(Inst) Γ; L ⊢ a : σ{τ′/α} if Γ; L ⊢ a : ∀α.σ
(Gen) Γ; L ⊢ a : ∀α.σ if Γ; L ⊢ a : σ and α ∉ ftv(Γ; L)
(LAx) Γ; z : τ ⊢ z : τ
(LAbs) Γ; ⊢ λz.a : τ ⊸ σ if Γ; z : τ ⊢ a : σ
(LApp) Γ; L ⊢ a ⊲ b : σ2 if Γ; ⊢ a : σ1 ⊸ σ2 and Γ; L ⊢ b : σ1
(CAbs) Γ; L ⊢ λ̄x.a : κ → σ if Γ, x : κ; L ⊢ a : σ
(CApp) Γ; L ⊢ a ⊳ b : σ if Γ; L ⊢ a : κ → σ and Γ; ⊢ b : κ

Reduction rules

(λx.a)b →β a{b/x}
let x = b in a →β a{b/x}
(λ̄x.a) ⊳ b →c a{b/x}
(λx.a) ⊲ b →c a{b/x}
(λ̄x.u) ⊳ b →cv u{b/x}
(λx.a) ⊲ u →cv a{u/x}

Figure 2: Syntactic definitions, typing and reduction rules of the coercion calculus.

DILL, the dual intuitionistic linear logic [1]. This captures an aspect of coercions: they consume their argument without erasing it (as they must preserve it) nor duplicating it (as there is no true computation, just a type recasting). Environments are of shape Γ; L, where Γ is a map from variables to type expressions³, and L is the linear part of the environment, containing (contrary to DILL) at most one assignment. Reductions are divided into →β (the actual computation) and →c (the coercion reduction), which has a subreduction →cv that intuitively does just enough to unlock β-redexes and is needed for Theorem 4. We start from the basic properties of the coercion calculus. As usual, the following result is achieved with weakening and substitution lemmas.

Theorem 1 (Subject reduction).
Γ; L ⊢ a : ζ and a →βc b entail Γ; L ⊢ b : ζ.

The coercion calculus can be seen as a decoration of Curry-style system F. The latter can be recovered by just collapsing the constructs ⊸, λ̄, ⊳ and ⊲ to their regular counterparts, via the decoration erasure defined as follows:

|α| := α, |ζ → τ| := |ζ| → |τ|, |σ ⊸ τ| := |σ| → |τ|,
|x| := x, |λx.a| = |λ̄x.a| := λx.|a|, |let x = a in b| := (λx.|b|)|a|,
|a ⊳ b| = |a ⊲ b| = |ab| := |a||b|,
|Γ|(y) := |Γ(y)|, |Γ; z : τ| := |Γ|, z : |τ|.

It is possible to prove that Γ; L ⊢ a : ζ implies |Γ; L| ⊢ |a| : |ζ| in system F. From this, and the SN of system F [2, Sec. 14.3], it follows that the coercion calculus is SN. Confluence of the reductions can be proved by the standard Tait and Martin-Löf technique of parallel reductions. Summarizing, the following theorem holds.

³Notice the restriction to σ ⊸ α for coercion variables. Theorem 4 relies on this restriction (d = λ̄x.(x ⊲ δ)δ : (σ ⊸ (σ → σ)) → σ, with δ = λy.yy : σ, ⌊d⌋ = δδ is a counterexample), but the preceding results do not.

Types and contexts:
α• := α, ⊥• := ∀α.α, (σ → τ)• := σ• → τ•, (∀(α ≥ σ)τ)• := ∀α.(σ• ⊸ α) → τ•,
(x : τ)• := x : τ•, (α ≥ τ)• := vα : τ• ⊸ α.

Instantiations:
τ◦ := λx.x, (!α)◦ := vα, (1)◦ := λz.z, (φ; ψ)◦ := λz.ψ◦ ⊲ (φ◦ ⊲ z),
(⅋)◦ := λx.λ̄vα.x, (&)◦ := λx.x ⊳ λz.z,
(∀(≥ φ))◦ := λx.λ̄vα.x ⊳ (λz.vα ⊲ (φ◦ ⊲ z)), (∀(α ≥)φ)◦ := λx.λ̄vα.φ◦ ⊲ (x ⊳ vα).

Terms:
x◦ := x, (λ(x : τ)a)◦ := λx.a◦, (Λ(α ≥ τ)a)◦ := λ̄vα.a◦, (ab)◦ := a◦b◦,
(aφ)◦ := φ◦ ⊲ a◦, (let x = a in b)◦ := let x = a◦ in b◦.

Figure 3: Translation of types, instantiations and terms into the coercion calculus. For every type variable α we suppose fixed a fresh term variable vα.

Theorem 2 (Confluence and termination). All of →β, →c, →cv and →βc are confluent. Moreover the coercion calculus is SN.
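The decoration erasure |·| above is a plain recursive traversal that collapses the coercion constructs onto their system F counterparts. A sketch on a hypothetical AST of ours (tags "clam" for the coercion lambda λ̄, "capp" for a ⊳ b, "lapp" for a ⊲ b):

```python
def decorate_erase(t):
    """|.|: collapse the coercion constructs to their regular counterparts."""
    tag = t[0]
    if tag == "var":
        return t
    if tag in ("lam", "clam"):            # |lam x.a| = |lam-bar x.a| = lam x.|a|
        return ("lam", t[1], decorate_erase(t[2]))
    if tag in ("app", "capp", "lapp"):    # |a ◁ b| = |a ▷ b| = |ab| = |a||b|
        return ("app", decorate_erase(t[1]), decorate_erase(t[2]))
    if tag == "let":                      # |let x = a in b| = (lam x.|b|)|a|
        return ("app", ("lam", t[1], decorate_erase(t[3])), decorate_erase(t[2]))
    raise ValueError("unknown tag: " + tag)

x = ("var", "x")
t = ("lapp", ("clam", "z", ("var", "z")), ("lam", "x", x))
assert decorate_erase(t) == ("app", ("lam", "z", ("var", "z")), ("lam", "x", x))
```

Since the collapse preserves typability, SN of the coercion calculus reduces to SN of system F, which is exactly how Theorem 2 is obtained.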
The use of coercions is annotated at the level of terms: λ̄ is used to distinguish between regular and coercion reduction, while ⊳ and ⊲ locate coercions without the need to carry typing information (the triangle's side points in the direction of the coercion). Thus, the actual semantics of a term can be recovered via its coercion erasure:

⌊x⌋ := x, ⌊λx.a⌋ := λx.⌊a⌋, ⌊λ̄x.a⌋ := ⌊a⌋, ⌊let x = a in b⌋ := let x = ⌊a⌋ in ⌊b⌋,
⌊ab⌋ := ⌊a⌋⌊b⌋, ⌊a ⊳ b⌋ := ⌊a⌋, ⌊a ⊲ b⌋ := ⌊b⌋.

Proposition 3 (Preservation of semantics). Take a typable coercion term a. If a →β b (resp. a →c b) then ⌊a⌋ → ⌊b⌋ (resp. ⌊a⌋ = ⌊b⌋). Moreover, diverging →β and →c steps from the same term can be joined (the confluence diagram from the original figure is omitted here).

The following result shows the connection between the reductions of a term and of its semantics.

Theorem 4 (Bisimulation of ⌊.⌋). If Γ; ⊢ a : σ, then ⌊a⌋ →β b iff a →*cv · →β c with ⌊c⌋ = b.

3 The translation

A translation from xMLF terms and instantiations into the coercion calculus is given in Figure 3. The idea is that instantiations can be seen as coercions; thus a term starting with a type abstraction becomes a term waiting for a coercion, and a term aφ becomes a◦ coerced by φ◦. The rest of this section is devoted to showing how this translation and the properties of the coercion calculus lead to the main result of this work, that is, SN of both xMLF and eMLF. First one needs to show that the translation produces well-typed terms. As expected, type instantiations are mapped to coercions.

Proposition 5 (Soundness). For Γ ⊢ a : σ an xMLF term (resp. Γ ⊢ φ : σ ≤ τ an xMLF instantiation) we have Γ•; ⊢ a◦ : σ• (resp. Γ•; ⊢ φ◦ : σ• ⊸ τ•). Moreover ⌈a⌉ = ⌊a◦⌋.

The following result shows that the translation is "faithful", in the sense that β- and ι-steps are mapped to β- and c-steps respectively: coercions do the job of instantiations, and just that.

Proposition 6 (Coercion calculus simulates xMLF). If a →β b (resp. a →ι b) in xMLF, then a◦ →+β b◦ (resp.
a◦ →c b◦) in the coercion calculus.

The above already shows SN of xMLF; however, in order to show that eMLF is also normalizing, we need to make sure that ι-redexes cannot block β ones: in other words, we need a bisimulation result. The following lemma lifts to xMLF the reduction in the coercion calculus that bisimulates β-steps (Theorem 4).

Lemma 7 (Lifting). For an xMLF term a, if a◦ →*cv · →β b then a →*ι · →β c with b →*c c◦.

Theorem 8 (Bisimulation of ⌈.⌉ for xMLF). For a typed xMLF term a, we have that ⌈a⌉ →β b iff a →*ι · →β c with ⌈c⌉ = b.

As a corollary of the two results stated above, we get the main result of this work, proving conclusively that all versions of MLF enjoy SN.

Theorem 9 (SN of MLF). Both eMLF and xMLF are strongly normalizing.

Further work. We were able to prove new results for MLF (namely SN and bisimulation of xMLF with its type erasure) by employing a more general calculus of coercions. It becomes natural then to ask whether its typing system may be a framework for studying coercions in general, like those arising in Fη or when using subtyping. The typing rules of Figure 2 were tailored to xMLF, disallowing polymorphism and coercion abstraction inside coercions, i.e. the coercion types ∀α.κ and κ1 → κ2. Removing such restrictions we could still derive the main result, even though the proofs would be more complex. Apart from the extensions previously mentioned, one would need a way to build coercions of arrow types, which are unneeded for xMLF. Namely, given coercions c1 : σ2 ⊸ σ1 and c2 : τ1 ⊸ τ2, there should be a coercion c1 ⇒ c2 : (σ1 → τ1) ⊸ (σ2 → τ2), allowing a reduction (c1 ⇒ c2) ⊲ λx.a →c λx.c2 ⊲ a{c1 ⊲ x/x}. This could be achieved by introducing it as a primitive, by translation, or by special typing rules. Indeed, if some sort of η-expansion were available while building a coercion, one could write c1 ⇒ c2 := λf.λx.(c2 ⊲ (f(c1 ⊲ x))).
However, how to do this without losing bisimulation is under investigation.

Acknowledgements. We thank Didier Rémy for stimulating discussions and remarks.

References

[1] Andrew Barber & Gordon Plotkin (1997): Dual intuitionistic linear logic. Technical Report LFCS96-347, University of Edinburgh.
[2] Jean-Yves Girard, Yves Lafont & Paul Taylor (1989): Proofs and Types. Number 7 in Cambridge Tracts in Theoretical Computer Science, Cambridge University Press.
[3] Didier Le Botlan & Didier Rémy (2003): MLF: Raising ML to the power of System F. In: Proc. of International Conference on Functional Programming (ICFP'03), pp. 27–38.
[4] Daan Leijen (2007): A type directed translation of MLF to System F. In: Proc. of International Conference on Functional Programming (ICFP'07), ACM Press.
[5] Didier Rémy & Boris Yakobowski (2009): A Church-style intermediate language for MLF. Available at http://www.yakobowski.org/xmlf.html. Submitted.

Transformations of higher-order term rewrite systems

Cynthia Kop
Department of Theoretical Computer Science, Vrije Universiteit, Amsterdam
[email protected]

We study the common ground and differences of different frameworks for higher-order rewriting from the viewpoint of termination by encompassing them in a generalised framework.

1 Introduction

In the past decades a lot of research has been done on termination of term rewrite systems. However, the specialised area of higher-order rewriting is sadly lagging behind. There are many reasons for this. Primarily, the topic is relatively difficult, mostly due to the presence of the beta rule. Applications are also not in as much abundance as in first-order rewriting. A third limiting factor is the lack of a set standard in higher-order rewriting. There are several important formalisms, each dealing with the higher-order aspect in a different way, plus various variations and restrictions.
Because of the differences in what is and is not allowed, results in one formalism do not trivially, or not at all, carry over to another. As such it is difficult to reuse results in a slightly different context, which necessitates a lot of duplicated work. In this paper we present work in progress investigating the common ground and differences of various formalisms from the viewpoint of termination. We introduce yet another formalism, but show how the usual styles of rewriting can be represented in it. We then look into properties within the general formalism and show which ones can always be obtained by transforming the system and which cannot. Finally, to demonstrate that the system is not too general to work with, we extend the Computability Path Ordering [2] to our formalism.

Submitted to: HOR 2010 © C. Kop. This work is licensed under the Creative Commons Attribution License.

2 The formalism

In this section we introduce a formalism of higher-order rewriting, called Higher-Order Decidable Rewrite Systems.

Types. We assume a set of base sorts B and a set of type variables A. Each (base) sort b has a fixed arity, notation b : n; there must be at least one sort of arity 0. A polymorphic type is an expression over B and A built according to the following grammar:

T = α | b(T^n) | T → T    (α ∈ A, b : n ∈ B)

A monomorphic type does not contain type variables. A type is called composed if it is headed by the arrow symbol. A type b() with b : 0 ∈ B is denoted as just b. The → associates to the right. We say σ ≥ τ if τ can be obtained from σ by substituting types for its type variables. For example, α ≥ α → β ≥ N → N, but not α → α ≥ N → R.

(Meta-)terms. A metaterm is a typed expression over a set F of typed constants (also known as function symbols) f : σ and an infinite set V of variables.
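The instance relation σ ≥ τ on polymorphic types described above amounts to first-order matching over the type grammar. A minimal sketch, with a tuple encoding of our own choosing (not part of the paper):

```python
# Types: ("tvar", name), ("sort", b, [args]), ("arrow", s, t).

def instance_of(sigma, tau, sub=None):
    """Return a substitution witnessing sigma >= tau, or None."""
    sub = {} if sub is None else sub
    if sigma[0] == "tvar":
        if sigma[1] in sub:                      # variable already bound
            return sub if sub[sigma[1]] == tau else None
        sub = dict(sub)
        sub[sigma[1]] = tau
        return sub
    if sigma[0] != tau[0]:
        return None
    if sigma[0] == "arrow":
        sub = instance_of(sigma[1], tau[1], sub)
        return None if sub is None else instance_of(sigma[2], tau[2], sub)
    if sigma[1] != tau[1] or len(sigma[2]) != len(tau[2]):
        return None                              # different sorts
    for s, t in zip(sigma[2], tau[2]):
        sub = instance_of(s, t, sub)
        if sub is None:
            return None
    return sub

alpha, beta = ("tvar", "a"), ("tvar", "b")
nat, real = ("sort", "N", []), ("sort", "R", [])
# alpha >= alpha -> beta >= N -> N, but not alpha -> alpha >= N -> R:
assert instance_of(alpha, ("arrow", alpha, beta)) is not None
assert instance_of(("arrow", alpha, beta), ("arrow", nat, nat)) is not None
assert instance_of(("arrow", alpha, alpha), ("arrow", nat, real)) is None
```

The same matching, extended from types to type-annotated terms, underlies the principal-term computation discussed next.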
We define the set of metaterms M, together with a set Var(s) of free variables for each metaterm s, recursively by the following rules:

(var) xτ : τ ∈ M if x ∈ V; Var(xτ) = {xτ}
(fun) fτ : τ ∈ M if f : σ ∈ F and σ ≥ τ; Var(fτ) = ∅
(abs) λxσ.s : σ → τ ∈ M if x ∈ V, s : τ ∈ M and Var(s) ∪ {xσ} is UT (**); Var(λxσ.s) = Var(s) \ {xσ}
(app) s · t : τ ∈ M if s : σ → τ ∈ M, t : σ ∈ M and Var(s) ∪ Var(t) is UT; Var(s · t) = Var(s) ∪ Var(t)
(meta) Xσ[s1, ..., sn] : τ ∈ M if σ = σ1 → ... → σn → τ, s1 : σ1, ..., sn : σn ∈ M and {Xσ} ∪ ∪i Var(si) is UT; Var(Xσ[s1, ..., sn]) = {Xσ} ∪ ∪i Var(si)

(**) A set V of typed variables is called UT, uniquely typed, if for any xσ ∈ V there is no xτ ∈ V with σ ≠ τ.

A metaterm generated without clause (meta) is a term. We work modulo renaming of variables bound by an abstraction operator (α-conversion). Explicit typing of terms will usually be omitted. The · operator associates to the left, so a metaterm s · t · r should be read as (s · t) · r. We adopt the custom of writing a (meta-)term s · t1 ··· tn in the form s(t1, ..., tn).

Type substitution. A type substitution is a mapping p : A → T. For any metaterm s, let s′p be s with all type variables α replaced by p(α). As an example, (ifα(Xbool, Yα, 0α) : α)′{α → N} = ifN(Xbool, YN, 0N). We say s ≥ t if there is a type substitution p such that t = s′p. Given a typable expression s (that is, a term with some type indicators omitted), it has a principal term t, which is ≥ any term r obtained from s by filling in type indicators. When a term is displayed with type indicators missing, we always mean its principal term. For example, if f : α → α ∈ F, the principal term of f is fα→α, whereas the principal term of f(0N) is fN→N(0N).

Term and metaterm substitution. A (term) substitution is the homomorphic extension of a type-preserving mapping [x1,σ1 := s1, ..., xn,σn := sn] with all si terms. Substitutions for meta-applications X[t1, ...
, tm] “eat” their arguments. Formally, let γ be the function mapping each xi,σi to si, with γ(x_τ) = x_τ for all other typed variables. For any metaterm s, sγ is generated by the following rules:

  x_σ γ = γ(x_σ)   for x ∈ V
  f_σ γ = f_σ   for f ∈ F
  (s · t)γ = (sγ) · (tγ)
  (λx.s)γ = λx.(sγ)   if x ∉ dom(γ) (we can rename x if necessary)
  X_σ[s1, …, sn]γ = q[y1 := s1γ, …, ym := smγ] · (s_{m+1}γ) ⋯ (s_nγ)   if γ(X_σ) = λy1…ym.q, m ≤ n, and m = n or q is not an abstraction.

A metaterm is standard if variables occurring at the head of a meta-application are not bound, and all free variables occur at the head of a meta-application. Examples: x(λy.y)[x := λz.z(a)] = (λz.z(a))(λy.y), whereas x[λy.y][x := λz.z(a)] = (λy.y)(a). Even an empty substitution has an effect on a proper metaterm: x[λy.y][] = x(λy.y).

β and η  =β is the equivalence relation generated by (λx.s) · t =β s[x := t]. Every metaterm s is β-equal to a unique β-normal term s↓β which has no subterms (λx.s) · t. This is a well-known result, which is easily extended to HODRSs. =η is the equivalence relation generated by s = λx.s · x if x ∉ Var(s), and X[s1, …, sn] = λx.X[s1, …, sn, x]. A metaterm is in η-normal form if any higher-order subterm either is an abstraction or meta-application, occurs at the head of an application, or occurs as a direct argument of a meta-application (for example, X[s] is η-normal if X : σ → τ with τ not composed, and all direct subterms of s are η-normal). While a term may have more than one η-normal form (f : o → o has normal forms λx.f(x) and λx.(λx.f(x)) · x), we define s↓η as its minimal η-normal form. =βη is the union of these relations. Each term has a unique βη-normal form.

rules  A term rewrite system consists of an alphabet F, a set of rules R and an equivalence relation δ, where δ is one of β, η, βη or normal equality ε (the α rule is implicit in all).
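The way meta-applications “eat” their arguments can be sketched concretely. This is a minimal prototype under my own untyped encoding (tags "var", "lam", "app" are mine, not the paper's), with a naive substitution that assumes bound names avoid the mapping:

```python
# Terms: ("var", x), ("lam", x, body), ("app", s, t).
def subst(term, mapping):
    """Naive substitution; assumes bound names avoid the mapping."""
    if term[0] == "var":
        return mapping.get(term[1], term)
    if term[0] == "lam":
        inner = {k: v for k, v in mapping.items() if k != term[1]}
        return ("lam", term[1], subst(term[2], inner))
    return ("app", subst(term[1], mapping), subst(term[2], mapping))

def eat(fun, args):
    """Apply gamma(X) = \\y1...ym.q to arguments s1..sn (m <= n):
    the first m arguments are substituted, the rest stay applied."""
    binders, body = [], fun
    # Strip as many lambdas as there are arguments available.
    while body[0] == "lam" and len(binders) < len(args):
        binders.append(body[1])
        body = body[2]
    result = subst(body, dict(zip(binders, args)))
    for extra in args[len(binders):]:     # leftover arguments stay applied
        result = ("app", result, extra)
    return result

identity = ("lam", "y", ("var", "y"))
assert eat(identity, [("var", "a"), ("var", "b")]) == \
       ("app", ("var", "a"), ("var", "b"))
```

With one binder and two arguments, the first argument is substituted and the second remains as an ordinary application, exactly as in the (meta) substitution clause.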
Rules are tuples (l, r), commonly denoted l ⇒ r, where l and r are standard metaterms satisfying the following properties: 1) l and r have the same type; 2) all variables and type variables in r also occur in l; 3) if l has a subterm X[s1, …, sn], then the si are all distinct bound variables (the parameter restriction); 4) if the equivalence relation is either β or βη, no subterms X[s1, …, sn] · t0 ⋯ tm (n, m ≥ 0) occur in l.

R induces a rewrite relation ⇒R over terms in minimal δ-normal form:

  (top)    s ⇒R t   if l′pγ =δ s and t =δ r′pγ, for l ⇒ r ∈ R, p a type substitution and γ a substitution
  (app-l)  s · t ⇒R s′ · t   if s ⇒R s′
  (app-r)  s · t ⇒R s · t′   if t ⇒R t′
  (abs)    λx.s ⇒R λx.s′   if s ⇒R s′

The reduction relation is decidable due to the parameter restriction.

3 Pleasant properties and transformations

To prove results about HODRSs it is often convenient to have a bit more to go on than just the general definition. To this end we define a number of standard properties, which carve out common subclasses that are relatively easy to work with.

implicit beta  the equivalence relation is either β or βη
explicit beta  the equivalence relation is either η or ε, but R contains the rule beta, that is: (λx_α.Z[x]) · Y ⇒ Z[Y]
parameter-free  in all rules, except possibly beta, any meta-application occurring on either side has the form X[]
beta-free  the system is parameter-free, does not contain beta, and its equivalence relation is either η or ε
monomorphic  no rule contains type variables, except possibly beta
left-beta-normal  the left-hand side of each rule (except possibly beta) is β-normal
right-beta-normal  the right-hand side of each rule is β-normal
beta-normal  both left-beta-normal and right-beta-normal
eta-normal  both sides of all rules are in η-normal form
nth order  any variable or function symbol occurring in one of the rules has a type of order at most n. A sort-headed type b(t1, …, tn) has order 0; a type σ1 → … →
σm → b with b not composed has order max(order(σ1), …, order(σm)) + 1; we only speak of order in a monomorphic system
finite  R is finite
algebraic  there are no abstractions in the left-hand side of any rule
abstraction-free  there are no abstractions in either side of any rule
left-linear  no variable occurs twice free in a left-hand side
without head variables  no left-hand side contains a subterm X[s1, …, sn] · t
completely without head variables  no left-hand side contains a subterm X[s1, …, sn] · t or x · t (so bound variables may also not occur at a head)
function-headed  the head of the left-hand side of each rule is a function symbol
with base rules  the type of a rule may not be composed

Many of these properties can be made to hold by transforming the system. When analysing termination specifically, we can enforce the following properties without affecting either termination or non-termination of the system:

1. any system can be made monomorphic, although at the price of finiteness
2. any system can be presented in beta-normal form
3. a system with explicit beta can be transformed to have implicit beta
4. any algebraic system can be turned abstraction-free
5. any system has a function-headed equivalent without head variables

Moreover, a system can be made eta-normal and with base rules without losing non-termination: if the transformed system is terminating, then so is the original. However, turning a system eta-normal may sometimes lose termination.

4 Embedding existing systems

There are four mainstream forms of higher-order rewriting: Nipkow's HRSs, Jouannaud and Okada's AFSs, Yamada's STTRSs and Klop's CRSs. The latter three can be embedded into HODRSs; HRSs cannot, since in general they do not have a decidable reduction relation. However, the common restriction to pattern HRSs essentially gives function-headed HODRSs with equivalence relation βη.
An AFS is a parameter-free system with explicit beta, an STTRS is an abstraction-free, beta-free system, and a CRS can be presented as a second-order HODRS with equivalence relation ε. Several quirks need to be ironed out (such as AFSs using function symbols with arity: a symbol f of arity n only occurs in the form f(s1, …, sn); and CRS terms being untyped), but this is easy to do.

5 HORPO

The recursive path ordering, a common syntactic termination method, has been extended to AFSs in a long line of research, starting with HORPO [2] and culminating in CPO [1]. We consider the last of these works, CPO. The definition extends a well-founded ordering on monomorphic function symbols to a well-founded ordering on terms. There are no requirements on the terms, except that function symbols have to occur with all their arguments. We extend CPO to polymorphic metaterms by defining X[s1, …, sn] ≻ X[t1, …, tn] if s1 ≻ t1, …, sn ≻ tn; for the type ordering we just use a type ordering on monomorphic types, together with the relation σ → τ >T ρ if τ ≥T ρ. To compare the possibly infinite number of function symbols, we might for example choose a well-founded ordering on the different symbol names, and additionally define f_σ ⊲ f_τ if τ is a strict subtype of σ. We can now prove that s′pγ↓η ≻ t′pγ↓η whenever we can derive s ≻ t (for any type substitution p and substitution γ). As the beta rule is included in ≻, we also have s′pγ↓η ≻ t′pγ↓βη. Since a system is terminating if it is terminating modulo η, we only have to show that l ≻ r for all rules l ⇒ r ∈ R to prove termination of the system, whatever its modulo relation and properties are.

References

[1] F. Blanqui, J.-P. Jouannaud & A. Rubio (2008): The Computability Path Ordering: The End of a Quest. In: Lecture Notes in Computer Science (CSL '08), pp. 1–14.
[2] J.-P. Jouannaud & A. Rubio (1999): The higher-order recursive path ordering.
In: Proceedings of the 14th Annual IEEE Symposium on Logic in Computer Science (LICS '99), Trento, Italy, pp. 402–411.

Computational interpretations of logic

Silvia Ghilezan
Faculty of Technical Sciences, University of Novi Sad, Serbia
[email protected]

The fundamental connection between logic and computation, known as the Curry-Howard correspondence or the formulae-as-types and proofs-as-programs paradigm, relates logical and computational systems. We present an overview and a comparison of computational interpretations of intuitionistic and classical logic, both in the natural deduction and in the sequent-style setting. We will further discuss and develop sequent term calculi with explicit control of the erasure and duplication operations in the management of resources.

Formulae-as-types, proofs-as-terms, proofs-as-programs paradigm  Gentzen's natural deduction is a well-established formalism for expressing proofs. Church's simply typed λ-calculus is a core formalism for writing programs. The simply typed λ-calculus represents a computational interpretation of intuitionistic natural deduction: formulae correspond to types, proofs to terms/programs, and simplifying a proof corresponds to executing a program. In its traditional form, terms in the λ-calculus encode proofs in intuitionistic natural deduction; from another perspective, the proofs serve as typing derivations for the terms. This correspondence was discovered in the late 1950s and early 1960s independently in logic by Curry, and later formulated by Howard; in category theory (Cartesian closed categories) by Lambek; and in the mechanization of mathematics (the language Automath) by de Bruijn. Griffin extended the Curry-Howard correspondence to classical logic in his seminal 1990 paper [7], by observing that classical tautologies suggest typings for certain control operators.
This initiated a vigorous line of research: on the one hand, classical calculi can be seen as pure programming languages with explicit representations of control, while at the same time terms can be tools for extracting the constructive content of classical proofs. The λµ-calculus of Parigot [10] expresses the computational content of classical natural deduction and has been the basis of a number of investigations into the relationship between classical logic and theories of control in programming languages.

Computational interpretation of sequent-style logical systems came into the picture much later, at the end of the 1990s. There were several attempts, over the years, to design a term calculus which would embody the Curry-Howard correspondence for intuitionistic sequent logic. The first calculus accomplishing this task is Herbelin's λ̄-calculus [8]. Recent interest in the Curry-Howard correspondence for intuitionistic sequent logic [8, 1, 5, 6] made it clear that the computational content of sequent derivations and cut-elimination can be expressed through an extension of the λ-calculus. In the classical setting, there are several term calculi based on classical sequent logic, in which terms unambiguously encode sequent derivations and reduction corresponds to cut elimination [11, 2, 12, 4, 3]. In contrast to natural deduction proof systems, sequent calculi exhibit inherent symmetries in proof structures, which create technical difficulties in analyzing the reduction properties of these calculi.

Resource operators for sequent λ-calculus  The simply typed λGtz-calculus, proposed by Espírito Santo [5] as a modification of Herbelin's λ̄-calculus, completely corresponds to the implicational fragment of intuitionistic sequent logic. We extend the Curry-Howard correspondence to intuitionistic sequent logic with explicit structural rules of weakening (thinning) and contraction. We propose a term calculus derived from λGtz by adding explicit resource operators for weakening and contraction, which we call ℓλGtz (linear λGtz). The main motivation for our work is to explore these features in the sequent calculus setting. Kesner and Lengrand [9] developed a term calculus corresponding to intuitionistic calculus in natural deduction format, equipped with explicit substitution, weakening and contraction. For the proposed calculus we introduce a type assignment system with simple types and prove some operational properties, including type preservation under reduction (subject reduction) and termination properties. We then relate the proposed linear type calculus to the calculus of Kesner and Lengrand in the natural deduction framework. Parts of the presented work have been realised jointly with José Espírito Santo, Jelena Ivetić, Pierre Lescanne, Silvia Likavec and Dragiša Žunić.

Submitted to: HOR 2010. © S. Ghilezan. This work is licensed under the Creative Commons Attribution License.

References

[1] H. P. Barendregt and S. Ghilezan. Lambda terms for natural deduction, sequent calculus and cut-elimination. J. of Functional Programming, 10(1):121–134, 2000.
[2] P.-L. Curien and H. Herbelin. The duality of computation. In Proc. of the 5th ACM SIGPLAN Int. Conference on Functional Programming, ICFP'00, pages 233–243, Montreal, Canada, 2000. ACM Press.
[3] D. Dougherty, S. Ghilezan, and P. Lescanne. Characterizing strong normalization in the Curien-Herbelin symmetric lambda calculus: extending the Coppo-Dezani heritage. Theor. Comput. Sci., 398(1-3), 2008.
[4] D. Dougherty, S. Ghilezan, P. Lescanne, and S. Likavec. Strong normalization of the dual classical sequent calculus. In 12th International Conference Logic for Programming, Artificial Intelligence, and Reasoning, LPAR 2005, volume 3835 of LNCS, pages 169–183, 2005.
[5] J. Espírito Santo. Completing Herbelin's programme.
In Proceedings of Types Lambda Calculus and Application, TLCA’07, volume 4583 of LNCS, pages 118–132, 2007. [6] J. Espı́rito Santo, S. Ghilezan, and J. Ivetić. Characterising strongly normalising intuitionistic sequent terms. In International Workshop TYPES’07 (Selected Papers), volume 4941 of LNCS, pages 85–99, 2008. [7] T. Griffin. A formulae-as-types notion of control. In Proc. of the 19th Annual ACM Symp. on Principles Of Programming Languages, (POPL’90), pages 47–58, San Fransisco, USA, 1990. ACM Press. [8] H. Herbelin. A lambda calculus structure isomorphic to Gentzen-style sequent calculus structure. In Computer Science Logic, CSL 1994, volume 933 of LNCS, pages 61–75, 1995. [9] D. Kesner and S. Lengrand. Resource operators for lambda-calculus. Inf. Comput., 205(4):419–473, 2007. [10] M. Parigot. An algorithmic interpretation of classical natural deduction. In Proc. of Int. Conf. on Logic Programming and Automated Reasoning, LPAR’92, volume 624 of LNCS, pages 190–201, 1992. [11] C. Urban and G. M. Bierman. Strong normalisation of cut-elimination in classical logic. In Typed Lambda Calculus and Applications, TLCA’99, volume 1581 of LNCS, pages 365–380, 1999. [12] P. Wadler. Call-by-value is dual to call-by-name, reloaded. In Rewriting Technics and Application, RTA’05, volume 3467 of LNCS, pages 185–203, 2005. On the Implementation of Dynamic Patterns Thibaut Balabonski Laboratoire PPS, CNRS and Université Paris Diderot [email protected] Pattern matching against dynamic patterns in functional programming languages is modelled in the Pure Pattern Calculus by one single meta-rule. The present contribution is a refinement which narrows the gap between the abstract calculus and the implementation, and allows reasoning on matching algorithms and strategies. 1 Dynamic Patterns Pattern matching is a base mechanism used to deal with algebraic data structures in functional programming languages. 
It allows reasoning on the shape of the arguments in the definition of a function. For instance, define a binary tree to be either a single data or a node with two subtrees (left code, in ML-like syntax). Then a function on binary trees may be defined by reasoning on the shapes generated by these two possibilities (right code).

  type 'a tree =
    | Data of 'a
    | Node of 'a tree * 'a tree

  let f t = match t with
    | Data d          -> <code1>
    | Node (Data d) r -> <code2>
    | Node l r        -> <code3>

An argument given to the function f is first compared to (or matched against) the shape Data d (called a pattern). In case of success, d is instantiated in <code1> by the corresponding part of the argument, and <code1> is executed. In case of failure of this first matching (the argument is not a data), the argument is matched against the second pattern, and so on until a matching succeeds or there is no pattern left.

One limit of this approach to programming is that patterns are fixed expressions mentioning explicitly the constructors to which they can apply, which restricts polymorphism and reusability of the code. This can be improved by allowing patterns to be parametrized: one single function can be specialized in various ways by instantiating the parameters of its patterns with different constructors, or even with functions building patterns. However, introducing parameters and functions into patterns deeply modifies their nature: they become dynamic objects that have to be evaluated. The Pure Pattern Calculus (PPC) of B. Jay and D. Kesner [JK09, Jay09] models these phenomena using a meta-level notion of pattern matching. The present contribution analyzes the content of the meta pattern matching of PPC (reviewed in Section 2), and proposes an explicit pattern matching calculus (Section 3) which is confluent, simulates PPC, and allows the description of new reduction strategies (Section 4).
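The first-match semantics described above is easy to prototype. A sketch with fixed (non-dynamic) patterns, under my own encoding (trees as tuples, match variables as "?"-prefixed strings; none of these names are from the paper):

```python
def match(pattern, value, binding):
    """Syntactic match of one pattern against a value."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        binding[pattern[1:]] = value        # a match variable binds
        return True
    if isinstance(pattern, tuple) and isinstance(value, tuple):
        return (len(pattern) == len(value)
                and all(match(p, v, binding)
                        for p, v in zip(pattern, value)))
    return pattern == value                 # constructors, literals

def first_match(clauses, value):
    """Try each (pattern, action) in order, as ML's `match` does."""
    for pattern, action in clauses:
        binding = {}                        # fresh bindings per clause
        if match(pattern, value, binding):
            return action(binding)
    raise ValueError("no pattern matched")

tree = ("Node", ("Data", 1), ("Data", 2))
result = first_match([
    (("Data", "?d"), lambda b: ("leaf", b["d"])),
    (("Node", "?l", "?r"), lambda b: ("node", b["l"], b["r"])),
], tree)
assert result == ("node", ("Data", 1), ("Data", 2))
```

Here the first clause fails on a Node, so the second clause fires, which is exactly the try-in-order behaviour of the ML code above. What PPC adds, and what this sketch cannot express, is patterns that are themselves computed.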
Additional material may be found in [Bal08] 2 The Pure Pattern Calculus This section only reviews some key aspects of PPC. Please refer to [JK09] for a complete story. The syntax of PPC is very close to the one of λ -calculus: the difference is the abstraction [θ ]p  b over a pattern and not λ x.b over a single variable, and the distinction between variable occurrences x and Submitted to: HOR 2010 c T. Balabonski This work is licensed under the Creative Commons Attribution-Share Alike License. 30 On the Implementation of Dynamic Patterns matchable occurrences x̂ of a name x. Variable occurrences are usual variables which may be substituted while matchable occurrences are immutable and used as matching variables or constructors. t ::= x | x̂ | tt | [θ ]t  t PPC terms where θ is a list of names. Letter a (resp. b, p) is used to indicate a PPC term in position of argument (resp. function body, pattern). Letters t, u v are used when there is nothing to emphasize. As pictured below, in the abstraction [θ ]p  b the list of names θ binds matchable occurrences in the pattern p and variable occurrences in the body b. Substitution of free variables and α-conversion are deduced (see [JK09]). [ x ] x x̂  x x̂ =α [y] x ŷ  y x̂ One feature of PPC is the use of a single syntactic application for two different meanings: the term tu may represent either the usual functional application of a function t to an argument u or the construction of a data structure by structural application of a constructor to one or more arguments. The latter is invariant: any structural application is forever a data structure, whereas the functional application may be reduced someday (and then turn into anything else). The simplest notion of pattern matching is syntactic matching: an argument a matches a pattern p if and only if there is a substitution σ such that a = pσ . However, with arbitrary patterns this solution generates non-confluent calculi [Klo80]. 
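Syntactic matching itself (find σ with a = pσ) is easy to prototype. A sketch under my own encoding (terms as (constructor, argument-list) tuples, pattern variables as strings; not the paper's notation):

```python
def syntactic_match(p, a, sigma=None):
    """Return sigma with a = p sigma, or None if there is none."""
    sigma = {} if sigma is None else sigma
    if isinstance(p, str):                      # a pattern variable
        if sigma.setdefault(p, a) != a:
            return None                         # non-linear: must agree
        return sigma
    if not isinstance(a, tuple):
        return None
    f, pargs = p
    g, aargs = a
    if f != g or len(pargs) != len(aargs):
        return None
    for pi, ai in zip(pargs, aargs):            # match argument-wise
        if syntactic_match(pi, ai, sigma) is None:
            return None
    return sigma

c, d = ("c", []), ("d", [])
assert syntactic_match(("pair", ["x", "y"]), ("pair", [c, d])) \
       == {"x": c, "y": d}
assert syntactic_match(("pair", ["x", "x"]), ("pair", [c, d])) is None
```

The last line shows why linearity matters: a repeated pattern variable forces equality of the matched subterms, which is exactly the kind of non-linearity that breaks confluence in the calculi discussed here.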
In the lambda-calculus with patterns [KvOdV08], for instance, syntactic matching is used together with a restriction on patterns (the rigid pattern condition). The alternative solution of PPC allows a priori any term to be a pattern, and checks the validity of patterns only when pattern matching is performed. This verification is done by a more subtle notion of matching, called compound matching, which tests whether patterns and arguments are in a so-called matchable form. A matchable form denotes a term which is understood as a value, or a term whose current form is stable and thus allows matching. Matchable forms are described in PPC at the meta-level by the following grammar:

  d ::= x̂ | d t          (data structures)
  m ::= d | [θ]t → t     (matchable forms)

The compound matching is then defined (still at the meta-level) by the following equations, in order:

  {{a /θ x̂}}          :=  {x ↦ a}                          if x ∈ θ
  {{x̂ /θ x̂}}          :=  {}                               if x ∉ θ
  {{a1 a2 /θ p1 p2}}   :=  {{a1 /θ p1}} ⊎ {{a2 /θ p2}}      if a1 a2 and p1 p2 are matchable forms
  {{a /θ p}}           :=  ⊥                                if a and p are matchable forms, otherwise

where ⊥ denotes a matching failure and ⊎ is the disjoint union on substitutions (⊥ ⊎ σ = σ ⊎ ⊥ = ⊥, and σ1 ⊎ σ2 = ⊥ if the domains of σ1 and σ2 overlap). This disjoint union checks that patterns are linear: no matching variable is used twice in the same pattern (non-linearity would break confluence). Remark that the compound matching may not be defined if the pattern or the argument is not a matchable form. This represents patterns or arguments that still have to be evaluated or instantiated before being matched.

PPC has to deal with a problem of dynamic patterns: a matching variable may be erased from a pattern during its evaluation, which would make reduction ill-defined. This is avoided in PPC by a last (meta-level) test, called the check: the result {a /θ p} of the matching of a against p is defined as follows.

• if {{a /θ p}} = ⊥ then {a /θ p} = ⊥.
• if {{a /θ p}} = σ with dom(σ) ≠ θ then {a /θ p} = ⊥.
• if {{a /θ p}} = σ with dom(σ) = θ then {a /θ p} = σ.

Finally, the reduction −→PPC of PPC is defined by a unique reduction rule (applied in any context):

  ([θ]p → b) a  −→  b{a /θ p}

where b⊥ is some fixed closed normal term ⊥, for any term b.

Example 1. Let t be a PPC term. The redex ([x]ĉx̂ → x)(ĉt) reduces to t: the constructor ĉ matches itself and the matchable x̂ is associated to t. On the other hand, ([x,y]ĉx̂ → xy)(ĉt) reduces to ⊥: whereas the compound matching is defined and successful, the check fails, since there is no match for y and the result would be ty, where y appears as a free variable. The redex ([x]ĉx̂ → x)(ĉ) also reduces to ⊥, since a constructor will never match a structural application. And last, ([x]yx̂ → x)(ĉt) is not a redex, since the pattern still has to be instantiated.

3 Explicit Matching

This section defines the Pure Pattern Calculus with Explicit Matching (PPCEM), a calculus which gives an account of all the steps of a pattern matching process of PPC. An explicit version of PPC has to deal with the four aforementioned points: identification of structural applications, pattern matching, linearity of patterns, and the check.

Firstly, a new syntactic construct is introduced to discriminate between functional and structural applications (as in [FMS06] for the rewriting calculus, for instance). Any application is supposed functional a priori, and two rules (−→•, given on page 4) propagate structural information: the explicit structural application of t to u is written t • u.

Another new syntactic object has to be introduced to represent an ongoing matching operation. The basic information contained in such an object is: the list of matching variables, a partial result representing what has already been computed, and a list of matchings that have still to be solved.
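The compound matching and the check of Section 2 can be prototyped directly from their defining equations. This is a sketch under my own term encoding (("var", x) for variables, ("match", x) for matchables x̂, ("app", t, u), ("abs", …) for abstractions; none of it is the paper's syntax), returning None when the matching is not (yet) defined:

```python
BOT = "bottom"                     # stands for the failure result

def is_data(t):
    return t[0] == "match" or (t[0] == "app" and is_data(t[1]))

def is_matchable(t):
    return is_data(t) or t[0] == "abs"

def union(s1, s2):
    """Disjoint union: propagates failure, rejects non-linearity."""
    if s1 is None or s2 is None:
        return None
    if s1 is BOT or s2 is BOT or set(s1) & set(s2):
        return BOT
    return {**s1, **s2}

def compound(a, theta, p):
    """{{a /theta p}}, following the four equations in order."""
    if p[0] == "match" and p[1] in theta:
        return {p[1]: a}
    if p[0] == "match" and a == p:            # x not in theta here
        return {}
    if a[0] == "app" and p[0] == "app" and is_data(a) and is_data(p):
        return union(compound(a[1], theta, p[1]),
                     compound(a[2], theta, p[2]))
    if is_matchable(a) and is_matchable(p):
        return BOT
    return None                               # still to be evaluated

def check(a, theta, p):
    """{a /theta p}: a defined match must bind exactly theta."""
    sigma = compound(a, theta, p)
    if sigma is None or sigma is BOT:
        return sigma
    return sigma if set(sigma) == set(theta) else BOT

c, t = ("match", "c"), ("var", "t")
pat = ("app", c, ("match", "x"))
assert check(("app", c, t), {"x"}, pat) == {"x": t}   # first redex
assert check(("app", c, t), {"x", "y"}, pat) is BOT   # y unmatched
```

The two assertions replay Example 1: the match of ĉt against ĉx̂ succeeds with x ↦ t, but with θ = {x, y} the check fails because y receives no binding.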
The new grammar is: t ::= x | x̂ | tt | t • t | [θ ]t  t | t hθ |τ|∆i PPCEM Terms Matchable forms m ::= x̂ | t • t | [θ ]t  t where in the matching hθ |τ|∆i, θ and τ are lists of names and ∆ is the list of submatchings that have still to be solved (a list of pairs of terms). The choice is here to apply partial results as soon as they are obtained so that they do not need to be remembered. However, a trace of the partial results is remembered for linearity verification: τ is the list of matching variables that have already been used, while θ is the list of matching variables that have not yet been used, and hence the check succeeds if and only if θ is empty. A pure term of PPCEM is a term without structural applications and matchings (that means a PPC term). Definition of free variables and matchables is as follows. The set of free names of a term t is fn(t): fn(t) = fv(t) ∪ fm(t) fv(x) fv(x̂) fv(tu) fv(t • u) fv([θ ]p  b) := {x} := 0/ := fv(t) ∪ fv(u) := fv(t) ∪ fv(u) := fv(p) ∪ ( fv(b) \ θ ) fm(x) fm(x̂) fm(tu) fm(t • u) fm([θ ]p  b) := 0/ := {x} := fm(t) ∪ fm(u) := fm(t) ∪ fm(u) := ( fm(p) \ θ ) ∪ fm(b) fv(t hθ |τ|∆i) := ( fv(t) \ (θ ∪ τ)) ∪ fv(∆) fm(t hθ |τ|∆i) := fm(t) ∪ fm(π1 (∆)) ∪ ( fm(π2 (∆)) \ (θ ∪ τ)) where if ∆ = (a1 , p1 )...(an , pn ) then fm(π1 (∆)) = S i fm(ai ) and fm(π2 (∆)) = S i fm(pi ). 32 On the Implementation of Dynamic Patterns The (meta-) definition of substitution of free variables can be deduced: xσ := σx xσ := x x̂σ := x̂ ([θ ]p  b)σ := (t hθ |τ|∆i)σ := x ∈ dom(σ ) x 6∈ dom(σ ) (tu)σ (t • u)σ := t σ uσ := t σ • uσ ([θ ]pσ  bσ ) θ ∩ (dom(σ ) ∪ fn(σ )) = 0/ (θ ∪ τ) ∩ (dom(σ ) ∪ fn(σ )) = 0/ t σ hθ |τ|∆σ i where in ∆σ the substitution propagates in all terms of ∆. A notion of α-conversion is associated, and for now on it is supposed that all bound names in a term are different, and disjoint from free names. 
Rules for matching are of three kinds: an initialization rule −→B which triggers a new matching operation, several matching rules −→m corresponding to all possible elementary matching steps, and two resolution rules −→r that apply the result of a completed matching.

Structural application:
  x̂ t        −→•  x̂ • t
  (t • u) v  −→•  (t • u) • v

Initialization:
  ([θ]p → b) a  −→B  b ⟨θ|∅|(a, p)⟩

Matching:
  b ⟨θ|τ|(a, x̂)∆⟩              −→m  b{x↦a} ⟨θ \ {x} | τ ∪ {x} | ∆⟩      if x ∈ θ, x ∉ τ and fn(a) ∩ (θ ∪ τ) = ∅
  b ⟨θ|τ|(x̂, x̂)∆⟩              −→m  b ⟨θ|τ|∆⟩                           if x ∉ θ and x ∉ τ
  b ⟨θ|τ|(a1 • a2, p1 • p2)∆⟩  −→m  b ⟨θ|τ|(a1, p1)(a2, p2)∆⟩
  b ⟨θ|τ|(a, p)∆⟩              −→m  ⊥                                   if a and p are matchable forms, otherwise

Resolution:
  b ⟨∅|τ|ε⟩  −→r  b
  b ⟨θ|τ|ε⟩  −→r  ⊥     if θ ≠ ∅

Reduction −→EM of PPCEM is defined by the application of any rule of −→B, −→•, −→m or −→r in any context. The subsystem −→p = −→• ∪ −→m ∪ −→r computes already existing pattern matchings but does not create new ones.

3.1 Confluence and Simulation properties

Proofs are similar to those presented in [Bal08].

Theorem 1. −→p is confluent and strongly normalizing.

Theorem 2. PPCEM is confluent.

For any PPCEM term t, write ⟦t⟧ for the term t where all structural applications (•) are replaced by functional applications. Let t be a PPCEM term, and t′ the unique normal form of t by −→p. Write t↓ for the term ⟦t′⟧.

Theorem 3.
• For any terms t and t′ of PPC, if t −→PPC t′ then t −→*EM t′.
• For any terms t and t′ of PPCEM, if t −→EM t′ and t↓ and t′↓ are pure, then t↓ −→=PPC t′↓.

4 Reduction Strategies

Pattern matching raises two new issues concerning reduction strategies (that is, the evaluation order of programs). One is related to the order in which pattern matching steps are done; the other concerns the amount of evaluation of the pattern or the argument done before pattern matching. This subsection focuses on the latter problem.
In PPC, the basic evaluation strategy for a term ([θ]p → b)a is: evaluate the pattern p and the argument a, and then resolve the matching. Like the usual call-by-value, this solution may perform unneeded evaluation of the argument (in parts that are not reused in the body b of the function). Call-by-name, which means substituting non-evaluated arguments, is the most basic solution to this problem. But how can such a solution be described in a pattern calculus? In PPC some evaluation of the argument has to be done before pattern matching, but the exact amount of evaluation needed depends on the pattern. Hence a description of a reduction strategy performing a minimal evaluation of the argument is not possible in PPC without defining a reduction parametrized by a pattern. On the other hand, PPCEM allows pattern or argument reduction and pattern matching steps to be interleaved. This finer control allows one, for instance, to define call-by-name reduction to head-normal form by the following evaluation contexts:

  E ::= []
      | E t
      | t ⟨θ|τ|(E, t)∆⟩
      | t ⟨θ|τ|(x̂, E)∆⟩       if x ∉ θ and x ∉ τ
      | t ⟨θ|τ|(t • t, E)∆⟩

Reduction following these evaluation contexts triggers pattern matchings as soon as possible. Then the pattern and the argument are evaluated until they become matchable, and a pattern matching step is performed before the story goes on. More generally, this also allows one to revisit standardisation for pattern calculi [KLR10].

5 Conclusion

The Pure Pattern Calculus is a compact framework modelling pattern matching with dynamic patterns. However, the conciseness of PPC is due to its use of several meta-level notions, which deepens the gap between the calculus and implementation-related problems. This contribution defines the Pure Pattern Calculus with Explicit Matching, a refinement which is confluent, simulates PPC, and allows reasoning on the pattern matching mechanisms.
This enables the definition of new reduction strategies in the spirit of call-by-name (which is new in this kind of framework since the reduction of the argument of a function depends on the pattern of the function, pattern which is itself a dynamic object). References [Bal08] T. Balabonski: Calculs avec Motifs Dynamiques. Rapport technique PPS, Université Paris Diderot, 2008. Available at http://hal.archives-ouvertes.fr/hal-00476940. [FMS06] M. Fernández, I. Mackie, F.-R. Sinot: Interaction Nets vs the Rho-Calculus: Introducing Bigraphical Nets. ENTCS, 154(3):19–32, 2006. [Jay09] B. Jay. Pattern Calculus: Computing with Functions and Data Structures. Springer, 2009. [JK09] B. Jay and D. Kesner: First-class patterns. J. Funct. Programming, 19(2):191–225, 2009. [KLR10] D. Kesner, C. Lombardi and A. Rı́os: Standardisation for Constructor Based Pattern Calculi. 5th International Workshop on Higher-Order Rewriting: HOR 2010. [Klo80] J. W. Klop: Combinatory Reduction Systems. Ph.D. Thesis, Mathematisch Centrum, Amstermdam, 1980 [KvOdV08] J. W. Klop, V. van Oostrom, and R. de Vrijer: Lambda calculus with patterns. TCS, 398:16–31, 2008. Higher-order Rewriting for Executable Compiler Specifications Kristoffer Rose, IBM Research∗ Abstract 2 Parse In this paper we show how a simple compiler can be completely specified using higher order rewriting in all stages of the compiler: parser, analysis/optimization, and code generation, specifically using the crsx.sourceforge.net system for a small declarative language called “X” inspired by XQuery (for which we are building a production compiler in the same way). The first component of our compiler is the parsing from X syntax to the IL, which consists of higher order terms (one can consider the IL a as higher order abstract syntax [11] representation). In the CRSX system this is achieved with the parser specification shown in Figure 1, which follows the format of the JJCRS component of the CRSX system [13]. 
The parser specification is merely a grammar defining some tokens (written in ⟨…⟩) and some nonterminals, such as can be found in any text on compilers, using annotations to provide details of how to build the (higher-order) abstract syntax tree:

1 Introduction

A canonical minimal compiler consists of a parser translating the source language SL to the intermediate language IL, some rewrites inserting analysis results into the IL and performing simplifications of it, and code generation to the target language TL (presumably using the annotations):

  SL --Parse--> IL --CodeGen--> TL     (with Rewrite looping on IL)

The actual samples we'll present below are mere toys, of course, but do illustrate the ideas in a manner that is consistent with the production compiler. For terms and rewriting we use the CRSX system [12, 13] variation of Combinatory Reduction Systems [9]. We have chosen to use the straight CRSX notation here to show the actual executable form of our specifications; the basic notations are summarized in Appendix A.

∗ IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA; [email protected], http://kristoffer.rose.name. Printed May 17, 2010.

• each ⟦…⟧ denotes the generated term for that particular production choice;
• nonterminals are named with meta-variables¹ such that they can be referenced in generated terms by use of the corresponding meta-variable (with an optional numeric subscript, if needed);
• if a token is marked with a superscript meta-variable, like ⟨Comp⟩C0, then the corresponding meta-applications, here C0[…], build a construction with the token value as constructor;
• ?v after a token promotes the token value to a scoped identifier definition, and [v] after a single nonterminal in the same production indicates that the scope of v is that nonterminal; and
• !v after a token indicates that the token must be an occurrence of an identifier that is in scope.
Notice the following specifics for the X grammar: • The top-level nonterminal is X, which generates an X program term. ∗ IBM 1 Meta-variables are here italic upper case letters; see Appendix A for details. Page 34 q y X ::= E X[E] E ::= S1 (program) q y “✱” E2 Con[S1 , E2 ] q y S1 (expression) q y S ::= “❢♦r” hVari?v “✐♥” S1 “r❡t✉r♥” S2 [v] For[S1 , x.S2 [x]] q y “✐❢” “✭” E0 “✮” “t❤❡♥” S1 “❡❧s❡” S2 If[E0 , S1 , S2 ] q y C1 C1 q y q y A1 C ::= A1 hCompiC0 A2 C0 [A1 , A2 ] q y q y O0 A ::= P1 hOpi A2 O0 [P1 , A2 ] P1 q y q y q y q y P ::= hVari!v v hIntiI Int[I] “✭” E “✮” E “s✉♠” P Sum[P ] hVari ::= [❛✲③, ❆✲❩] [❛✲③, ❆✲❩, ✵✲✾]+ (simple) (comparison) (addition) (primary) (var) hInti ::= [✵✲✾] (int) hCompi ::= “❧❡” (comp) + hOpi ::= “✲” (op) Figure 1: X syntax and IL generation. • An expression is either produced by E ✱ S, and then it generates a Con-term, or just by S, and then it simply generates what is generated for the S. • A simple expression S has three forms, where the ❢♦r expression is the only one involving a scope: the grammar annotation specifies where the variable and scope occur in the syntax and then specifies how they are transferred to the term, where the scope is explicitly specified by x . S2 [x]. • One of the choices for a primary expression P is a hVari occurrence where the “!” in the annotation stipulates that the symbol must be in scope (thus effectively occur as the iteration variable of a parent ❢♦r). The X program s✉♠✭❢♦r ♥ ✐♥ ✭✶✱✷✮ r❡t✉r♥ ♥✲✶✮ parses to the term X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]] In summary, the parser specification looks like many other abstract syntax tree generation notations, such as MetaPRL [7], except for the additional direct support for higher order abstract syntax by explicitly specifying the scoping and a pleasantly compact way to generate terms where tokens are used directly as constructors, which reduces the size of large parsers considerably. 
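The annotation-driven term generation of Figure 1 can be imitated by a tiny hand-written parser. The following Python sketch is my own illustration, not the CRSX/JJCRS machinery: function names such as `simple` and `prim` are invented here, the `if`/`le` productions are omitted, and a scope is represented as a `("bind", variable, body)` node.

```python
import re

def tokenize(src):
    # identifiers, integers, and the single-character tokens of the X subset
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|[0-9]+|[(),\-]", src)

class Parser:
    def __init__(self, toks):
        self.toks, self.i = toks, 0
    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else None
    def eat(self, tok=None):
        t = self.toks[self.i]
        assert tok is None or t == tok, f"expected {tok}, got {t}"
        self.i += 1
        return t
    def expr(self):                      # E ::= S ("," E)?  =>  Con[S, E]
        s = self.simple()
        if self.peek() == ",":
            self.eat(",")
            return ("Con", [s, self.expr()])
        return s
    def simple(self):                    # S ::= "for" v "in" S "return" S | A
        if self.peek() == "for":
            self.eat("for"); v = self.eat(); self.eat("in")
            s1 = self.simple(); self.eat("return"); s2 = self.simple()
            return ("For", [s1, ("bind", v, s2)])
        return self.add()
    def add(self):                       # A ::= P ("-" A)?  =>  -[P, A]
        p = self.prim()
        if self.peek() == "-":
            self.eat("-")
            return ("-", [p, self.add()])
        return p
    def prim(self):                      # P ::= v | int | "(" E ")" | "sum" P
        t = self.peek()
        if t == "sum":
            self.eat("sum"); return ("Sum", [self.prim()])
        if t == "(":
            self.eat("("); e = self.expr(); self.eat(")")
            return e
        self.eat()
        return ("Int", [int(t)]) if t.isdigit() else ("var", t)

def show(t):
    if t[0] == "var":  return t[1]
    if t[0] == "Int":  return f"Int[{t[1][0]}]"
    if t[0] == "bind": return f"{t[1]} . {show(t[2])}"
    return f"{t[0]}[{', '.join(show(a) for a in t[1])}]"

def parse(src):
    return ("X", [Parser(tokenize(src)).expr()])
```

On the running example, `show(parse("sum(for n in (1,2) return n-1)"))` reproduces the term X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]] shown above.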
3 Rewrite

As a simple analysis example we show the use of inference rules for a simple X type checker in Figure 2. Again, such systems can be found in standard texts on programming language semantics and compilation, and the system shown here is directly generated from the inference system processed by CRSX.² The rules define a judgment that adds type information to every subterm in the form of an environment annotation, {Ty : T}, which associates Ty (a constructor, like all the upright names in the system) with the type of the annotated subterm, T, where the type sort can be described by

  T ::= 1[U] | ∗[U]    (type)
  U ::= B | I | ?      (unboxed)

(where 1 and ∗ are cardinalities, and B, I, and ? stand for boolean, integer, and unknown value types, respectively).

Definition 3.1. If E has the free variables x1, ..., xn then

  {Γ; x1 : T1; ...; xn : Tn}E ⇒ {Γ′; Ty : T}E′

means that with x1 of type T1, etc., the type of E is T.

² The only difference is that CRSX processes terms of the form Infer[N, (P1; ...; Pn;), C], which are displayed as inference rules with premises P1 ... Pn above the conclusion C.

  (X)    {Γ}E ⇒ E′
         ⟹ {Γ} X[E] ⇒ {Γ; Ty : T[E′]} X[E′]
  (Con)  {Γ}E1 ⇒ E1′   {Γ}E2 ⇒ E2′
         ⟹ {Γ} Con[E1, E2] ⇒ {Γ; Ty : J[T[E1′], T[E2′]]} Con[E1′, E2′]
  (For)  ∀x.( {Γ}E1 ⇒ E1′   {Γ; x : P[T[E1′]]}E2[x] ⇒ E2′[x]   T[E2′[x]] = T2 )
         ⟹ {Γ} For[E1, v.E2[v]] ⇒ {Γ; Ty : S[T2]} For[E1′, v.E2′[v]]
  (If)   {Γ}E ⇒ E′   T[E′] = 1[B]   {Γ}E1 ⇒ E1′   {Γ}E2 ⇒ E2′
         ⟹ {Γ} If[E, E1, E2] ⇒ {Γ; Ty : U[T[E1′], T[E2′]]} If[E′, E1′, E2′]
  (le)   {Γ}E1 ⇒ E1′   T[E1′] = 1[I]   {Γ}E2 ⇒ E2′   T[E2′] = 1[I]
         ⟹ {Γ} ’le’[E1, E2] ⇒ {Γ; Ty : 1[B]} ’le’[E1′, E2′]
  (-)    {Γ}E1 ⇒ E1′   T[E1′] = 1[I]   {Γ}E2 ⇒ E2′   T[E2′] = 1[I]
         ⟹ {Γ} -[E1, E2] ⇒ {Γ; Ty : 1[I]} -[E1′, E2′]
  (Var)  {Γ; x : T}x ⇒ {Γ; Ty : T} Typed[x]
  (Sum)  {Γ}E ⇒ E′   S[T[E′]] = ∗[I]
         ⟹ {Γ} Sum[E] ⇒ {Γ; Ty : 1[I]} Sum[E′]
  (Int)  {Γ} Int[I] ⇒ {Γ; Ty : 1[I]} Int[I]

  J[1[T], 1[T′]] → ∗[Uu[T, T′]];   J[1[T], ∗[T′]] → ∗[Uu[T, T′]];
  J[∗[T], 1[T′]] → ∗[Uu[T, T′]];   J[∗[T], ∗[T′]] → ∗[Uu[T, T′]];
  U[1[T], 1[T′]] → 1[Uu[T, T′]];   U[1[T], ∗[T′]] → ∗[Uu[T, T′]];
  U[∗[T], 1[T′]] → ∗[Uu[T, T′]];   U[∗[T], ∗[T′]] → ∗[Uu[T, T′]];
  Uu[I, I] → I;   Uu[B, B] → B;   Uu[I, B] → ?;   Uu[B, I] → ?;
  S[1[T]] → ∗[T];   S[∗[T]] → ∗[T];   P[1[T]] → 1[T];   P[∗[T]] → 1[T];
  T[{Γ; Ty : T}E] → T;

Figure 2: X typing inference rules and helpers.

The rules follow the conventions of compilers written in natural semantics [5] by being "left-to-right deterministic," and can be translated directly into rewrite rules corresponding to a recursive functional program, except for the (For) rule, which involves a "subproof under binding." The higher-order nature of (For) is manifest in the rule by the ∀x.(...) wrapper around all the premises: this permits the subproofs of the premises to use x as a free variable. It is then fairly easy to establish that for a closed expression E, all subproofs of the proof for {}E ⇒ E′ will satisfy the invariant that Γ = {x1 : T1; ...; xn : Tn} contains all free variables in the subterm to investigate. It is also pleasant that the impact of the use of bound variables is restricted to the rules dealing explicitly with binding: there are no "lift" or "shift" operators to manipulate.

Each inference judgment (in our example just "⇒") corresponds to a family of recursive functions and constructors that permits representing the stage of each inference rule as a data structure with a "current problem" top judgment to reduce next. Note that the synthetic function symbols are inserted under the environment, which makes non-terms like {Γ; x : T}x possible: the generated rewrite rule for the (Var) inference will, in fact, be

  {Γ; x : T} ?-[x] → {Γ; Ty : T} Typed[x]

because each judgment ⇒N corresponds to a function named ?-N. In addition, the rules employ a number of helper rewrite rules that define the computation of certain type combinations: J computes the type of a concatenation, U means union, S means "sequence of," and P means the type of members of sequences (called the "prime" type).

We give the details of how inference rules are translated into rewrite rules in a separate paper; here we shall just remark that the following properties of the rules are important for permitting efficient translation all the way to low-level code:

• Every conclusion judgment has a unique constructor in the pattern. (This can be relaxed to a condition on unique prefix collections, similarly to the way LR grammars are generalized to LALR grammars.)
• The left side of every premise judgment can be fully constructed from components of prior (left) premises and the left side of the conclusion judgment.
• Judgments only ever observe inherited attributes in the environment, i.e., values set in the context (specifically for bound variables set at their binder site, which in fact permits using a single global environment where no variables are ever removed).
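To make the helper rules concrete, here is a small Python sketch of the helpers J, U, Uu, S, P and a recursive checker following the inference rules. This is an illustration only, not the CRSX-generated code: types are rendered as pairs (cardinality, unboxed), the environment is a plain dict, and the For node carries an explicit (variable, body) pair instead of a real binder.

```python
def Uu(u, v): return u if u == v else "?"        # union of unboxed types
def J(a, b):  return ("*", Uu(a[1], b[1]))       # type of a concatenation
def U(a, b):  return ("1" if a[0] == b[0] == "1" else "*", Uu(a[1], b[1]))
def S(a):     return ("*", a[1])                 # "sequence of" type
def P(a):     return ("1", a[1])                 # member ("prime") type

def ty(t, env=None):
    env = env or {}
    head = t[0]
    if head == "Int": return ("1", "I")
    if head == "var": return env[t[1]]
    if head == "Con": return J(ty(t[1], env), ty(t[2], env))
    if head == "For":                            # For[e1, (v, body)]
        v, body = t[2]
        return S(ty(body, {**env, v: P(ty(t[1], env))}))
    if head == "If":
        assert ty(t[1], env) == ("1", "B"), "condition must be boolean"
        return U(ty(t[2], env), ty(t[3], env))
    if head in ("le", "-"):
        assert ty(t[1], env) == ty(t[2], env) == ("1", "I")
        return ("1", "B") if head == "le" else ("1", "I")
    if head == "Sum":
        assert S(ty(t[1], env)) == ("*", "I"), "sum needs an integer sequence"
        return ("1", "I")
    if head == "X": return ty(t[1], env)
    raise ValueError(head)

# the running example: X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]]
example = ("X", ("Sum", ("For", ("Con", ("Int", 1), ("Int", 2)),
                         ("n", ("-", ("var", "n"), ("Int", 1))))))
```

On `example` this checker assigns the iteration variable n the prime type 1[I] of the sequence and yields the overall type 1[I], matching the annotated term shown next.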
The example term

  X[Sum[For[Con[Int[1], Int[2]], n . -[n, Int[1]]]]]

type-analyses to

  X[{Ty:1[I]}Sum[{Ty:*[I]}For[
      {Ty:*[I]}Con[{Ty:1[I]}Int[1], {Ty:1[I]}Int[2]],
      n . {Ty:1[I];n:1[I]}-[
            {Ty:1[I];n:1[I]}Typed[n],
            {Ty:1[I];n:1[I]}Int[1] ] ]]]

We can now use the type annotations to perform more traditional CRS(X) simplification rewrites, such as

  {Γ} For[{Ty : 1[T]}E1, v.E2[]] → E2[]

(which matches For expressions that iterate over a singleton sequence and do not use the iteration variable in the result, and replaces them with just the For body).

4 Code Generation

Finally, the code generation. We have found that binders can be used rather effectively to represent registers. We'll use the following instruction set, which is a hybrid of (relational) algebras and a register machine and has proven to have a good abstraction level when compiling into current imperative languages such as Java or C. Code C is a "two-address code" with one of the following forms:

• PR[o.C] is a complete program where the code C must send all output to the output (handler) o;
• (C1; C2) first executes C1, then C2;
• NR[T, r.C] creates a new uninitialized register r of type T, scoped over the execution of C;
• LV[r1, r2] loads (a copy of) the value of r2 into r1;
• LC[r, V] loads the constant value V (an integer or the booleans T or F) into r;
• −[r1, r2] subtracts the value of r2 from r1;
• +[r1, r2] adds the value of r2 to r1;
• LE[r1, r2, C1, C2] checks if the value of r1 is less than or equal to r2 and then executes C1, otherwise C2;
• TF[r, C1, C2] checks if the value of r is T and then executes C1, otherwise C2;
• OU[r, o] outputs a copy of the value of r to the output o;
• BU[o.C, r] creates a buffer collecting from o, allows C to send output items of type T to o, and then copies the result to r; and
• PC[T, o.C1, r.C2] is a pipeline that executes C1 and for every value of type T sent to o executes C2 with the value in r.

Notice that this instruction set does not have the usual typed assembly language property that all registers are constant: we prefer to keep the option of reuse of variables, which is particularly pertinent because it permits representing handlers by registers.

The code generation rules, shown in Figure 3,³ have three compilation schemes: G1 generates code that sends the computed value to an output; G2 generates code that selects between two code fragments depending on whether the computed boolean value is "true" or "false" (so it is only defined for expression fragments that can be boolean); and G3 is a helper scheme that generates code that stores the computed value in a register (using an intermediate buffer; further simplifications can be achieved by also specialising G3 to each case). It uses the P and T helpers from Figure 2. It should be noted that this code generation scheme is especially suited for reasoning about data flow. (Code generation for the small sample program is omitted for space reasons.)

³ The shown rules are reformatted in mathematics style from the original text file.

  (G)     G[E] → PR[o. G1[E, o]]
  (Con1)  G1[Con[E1, E2], o] → (G1[E1, o]; G1[E2, o])
  (For1)  G1[For[E1, v.E2[v]], o] → PC[P[T[E1]], o1. G1[E1, o1], v. G1[E2[v], o]]
  (If1)   G1[If[E, E1, E2], o] → G2[E, G1[E1, o], G1[E2, o]]
  (le1)   G1[{Ty : T} ’le’[E1, E2], o] →
            NR[1[B], b.(G2[{Ty : T} ’le’[E1, E2], LC[b, T], LC[b, F]]; OU[b, o])]
  (−1)    G1[{Ty : T} -[E1, E2], o] → G3[E1, r1. G3[E2, r2.(−[r1, r2]; OU[r1, o])]]
  (Sum1)  G1[{Ty : T} Sum[E1], o] →
            NR[T, s.(LC[s, 0]; PC[o. G1[E1, o], v.+[s, v]]; OU[s, o])]
  (Var1)  G1[{Ty : T}x, o] → OU[x, o]
  (Int1)  G1[{Ty : T} Int[E], o] → NR[T, i.(LV[i, E]; OU[i, o])]
  (Let2)  G2[{Ty : T} Let[E1, v.E2[v]], C1, C2] → G3[E1, r1. G2[E2[r1], C1, C2]]
  (If2)   G2[{Ty : T} If[E, E1, E2], C1, C2] → G3[{Ty : T} If[E, E1, E2], b. TF[b, C1, C2]]
  (le2)   G2[{Ty : T} ’le’[E1, E2], C1, C2] → G3[E1, r1. G3[E2, r2. LE[r1, r2, C1, C2]]]
  (Var2)  G2[{Ty : T}x, C1, C2] → TF[x, C1, C2]
  (Store) G3[{Ty : T}E, r.C[r]] → NR[T, r.(BU[o. G1[{Ty : T}E, o], r]; C[r])]

Figure 3: X code generation rules.

At the end what remains is to put all the pieces together. The driver is the top-level X symbol introduced by parsing. We add the following rule to start reduction directly from the parser:

  X[E] → G[?-[E]];

5 Discussion

I have found that this kind of architecture is relatively easy to explain to developers with traditional "compiler block diagrams" like the one in the introduction, where the fact that each analysis and translation is specified independently makes using a structured approach realistic. The chaotic nature of the resulting execution of the specification comes out as an advantage: our implementation, using a standard functional innermost-needed strategy, often ends up interleaving the stages of the compilation in interesting ways, for example eliminating dead code before type checking, usually making mistakes in dependencies blatantly obvious. The current production compiler prototype specified in this way is working so well that it seems feasible to implement the actual production compiler this way. Important factors in this have been the disciplined use of systems that can be transformed into orthogonal constructor systems, for which a table-driven normalizing strategy can be used in almost all cases (there is a performance penalty for some substitution cases).
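The instruction set of Section 4 can be given a toy operational reading. The following Python interpreter is my own sketch, not the CRSX-generated target code: it drops the type arguments and the BU instruction, and represents the binder-introduced registers and outputs by explicit names. The sample program hand-writes the shape that (Sum1) compiles to, summing the sequence (1, 2).

```python
def run(prog):
    """Execute a complete program PR[o.C]; returns the values sent to o."""
    assert prog[0] == "PR"
    _, o, body = prog
    regs, outs = {}, {o: []}
    step(body, regs, outs)
    return outs[o]

def step(c, regs, outs):
    op = c[0]
    if op == "seq":                                  # (C1 ; C2)
        step(c[1], regs, outs); step(c[2], regs, outs)
    elif op == "NR":                                 # NR[r.C]: register scoped over C
        regs[c[1]] = None
        step(c[2], regs, outs)
        del regs[c[1]]
    elif op == "LC":  regs[c[1]] = c[2]              # load constant
    elif op == "LV":  regs[c[1]] = regs[c[2]]        # load (copy) value
    elif op == "sub": regs[c[1]] -= regs[c[2]]       # −[r1, r2]
    elif op == "add": regs[c[1]] += regs[c[2]]       # +[r1, r2]
    elif op == "LE":                                 # LE[r1, r2, C1, C2]
        step(c[3] if regs[c[1]] <= regs[c[2]] else c[4], regs, outs)
    elif op == "TF":                                 # TF[r, C1, C2]
        step(c[2] if regs[c[1]] else c[3], regs, outs)
    elif op == "OU":  outs[c[2]].append(regs[c[1]])  # send value of r to output o
    elif op == "PC":                                 # PC[o.C1, r.C2]: pipeline
        _, o, c1, r, c2 = c
        outs[o] = []
        step(c1, regs, outs)
        for v in outs.pop(o):
            regs[r] = v
            step(c2, regs, outs)
            del regs[r]
    else:
        raise ValueError(op)

# the shape produced by (Sum1) for summing the sequence (1, 2)
prog = ("PR", "out",
    ("NR", "s", ("seq", ("LC", "s", 0), ("seq",
        ("PC", "o",
            ("seq",
                ("NR", "i", ("seq", ("LC", "i", 1), ("OU", "i", "o"))),
                ("NR", "i", ("seq", ("LC", "i", 2), ("OU", "i", "o")))),
            "v", ("add", "s", "v")),
        ("OU", "s", "out")))))
```

Running `run(prog)` accumulates the pipeline values into the register s and sends the total, 3, to the program output.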
The CRSX system implements higher-order rewriting fully in the form of CRS, and can thus handle full substitution and express transformations such as inlining. However, it turns out that many specific systems share with the small ones presented here the property that they use only "explicit substitution" style rewrites, which only permit observing variables [1]. Indeed it seems that the fact that the approach is not functional or logical is an advantage: the expressive power of explicit substitution is strictly smaller (in a complexity sense) than that of general functions.

Related Work. The area of verifying a compiler specification is well established, using both handwritten and mechanical proofs [4]. Work has also been done on linking correct compiler specifications and implementations using generic proof-theoretic tools [10]. Tools supporting mechanical generation of compilers from specifications, such as ASF+SDF [2] and Stratego [3], have focused on compilers restricted to first-order representations of the intermediate languages used by the compiler and on using explicit rewriting strategies to guide compilation. Our goal is the opposite: to only specify dependencies between components of the compiler and leave the actual rewriting strategy to the system (in practice using analysis-guided rule transformations coupled with a generic normalizing strategy).

We are only aware of one published work that uses higher-order features for compiler construction, namely the work by Hickey and Nogin on specifying compilers using logical frameworks [7]. The resulting specification looks very similar to ours, and indeed one can see the code synthesis that could be done for their logic system as similar to the code generation we are employing. Also, both systems employ embedded source language syntax and higher-order abstract syntax. However, there are differences as well.
First, CRSX is explicitly designed to implement just the kind of rewrite systems that we have described, and is tuned to generate code that drives transformation through lookup tables. Second, variables are first class in CRSX and not linked to meta-level abstraction, thus closer to the approach used by explicit substitution for CRS [1] and "nominal" rewriting [6]. This permits us, for example, to use an assembly language with mutable registers. Third, we find that the focus on local rewriting rules is easier to explain to compiler writers, and the inclusion of environments and inference rules in the basic notation further helps. Finally, the CRSX engine has no assumed strategy, so we find the notion of local correctness easier to grasp.

What's Next? With CRSX we continue to experiment with pushing the envelope for supporting more higher-order features without sacrificing efficiency. An important direction is to connect with nominal rewriting and understand the relationship between what the two formalisms can express. Another interesting direction, for both performance and analysis, is to introduce explicit weakening operators that "unbind" a given bound variable in a part of its scope. While weakening has been used in this way with explicit substitution [15, 8], its interaction with higher-order rewriting is not yet clear. In companion papers we explain the details of the translation of the three supported forms of rules, "recursive compilation schemes," "chaotic annotation rules," and "deterministic inference rules," into effective native executables, and we explain annotations that make it feasible to avoid rewriting-specific static mistakes.

Acknowledgements. The author is grateful for insightful comments from the anonymous referees, including being made aware of the work in logical frameworks.

References

[1] R. Bloo and K. H. Rose. Combinatory reduction systems with explicit substitution that preserve strong normalisation. In H.
Ganzinger, editor, RTA '96: Rewriting Techniques and Applications, number 1103 in Lecture Notes in Computer Science, pages 169-183, New Brunswick, New Jersey, July 1996. Rutgers University, Springer-Verlag.

[2] M. van den Brand, J. Heering, P. Klint, and P. A. Olivier. Compiling rewrite systems: The ASF+SDF compiler. ACM Transactions on Programming Languages and Systems, 24(4):334-368, 2002.

[3] M. Bravenboer, A. van Dam, K. Olmos, and E. Visser. Program transformation with scoped dynamic rewrite rules. Fundamenta Informaticae, 69(1-2):123-178, 2006.

[4] M. A. Dave. Compiler verification: a bibliography. SIGSOFT Software Engineering Notes, 28(6):2, 2003.

[5] J. Despeyroux. Proof of translation in natural semantics. In First Symposium on Logic in Computer Science, pages 193-205, Cambridge, Massachusetts, USA, June 1986. IEEE Computer Society.

[6] M. Fernández and M. J. Gabbay. Nominal rewriting. Information and Computation, 205(6):917-965, 2007.

[7] J. Hickey and A. Nogin. Formal compiler construction in a logical framework. Higher-Order and Symbolic Computation, 19(2-3):197-230, Sept. 2006.

[8] D. Kesner and F. Renaud. The prismoid of resources. In 34th International Symposium on Mathematical Foundations of Computer Science (MFCS), number 5734 in LNCS, pages 464-476, Novy Smokovec, High Tatras, Slovakia, Aug. 2009. Springer-Verlag.

[9] J. W. Klop, V. van Oostrom, and F. van Raamsdonk. Combinatory reduction systems: Introduction and survey. Theoretical Computer Science, 121:279-308, 1993.

[10] K. Okuma and Y. Minamide. Executing verified compiler specification. In A. Ohori, editor, APLAS 2003: First Asian Symposium on Programming Languages and Systems, volume 2895 of Lecture Notes in Computer Science, pages 178-194, Beijing, China, Nov. 2003. Springer.

[11] F. Pfenning and C. Elliott. Higher-order abstract syntax. SIGPLAN Notices, 23(7):199-208, 1988.

[12] K. Rose. CRSX: an open source platform for experimenting with higher order rewriting.
Presented in absentia at HOR 2007; http://kristoffer.rose.name/papers, June 2007.

[13] K. Rose. Combinatory reduction systems with extensions. http://crsx.sourceforge.net, Apr. 2010.

[14] K. H. Rose. Operational Reduction Models for Functional Programming Languages. PhD thesis, DIKU, University of Copenhagen, Universitetsparken 1, DK-2100 København Ø, Feb. 1996. http://krisrose.net/thesis.pdf.

[15] K. H. Rose, R. Bloo, and F. Lang. On explicit substitution with names. IBM Research Report RC24909, IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA, Dec. 2009.

A CRSX Summary

Our setting is Combinatory Reduction Systems (CRS) [9] as realized by the "CRSX" system [13]. Here we briefly summarize the notation used and where it differs from reference CRS.

A.1 Lexical conventions

At the character level, usual white space is permitted and each of the reserved characters [ ] ( ) { } : ; . , is a separate token (except in strings and entities as noted below). Everything between // and the end of a line (outside of a string) is white space (a comment). Three categories of composite tokens are used:

• Words starting with a lower case letter are variables, denoted v in the grammar below.
• Capitalized italic words (as well as words that contain the # character) are meta-variables, generally denoted by M in the grammar below.
• All other words and composite symbols (except those with < > &), as well as strings written inside double or single quotes (permitting white space and reserved characters as well as both duplicated and \-escaped quotes of the same kind inside), are constructors, generally denoted by the letter C in the grammar below.

Notice that strings are just ordinary constructors C and thus can be used wherever such are allowed. Constructors containing $ in their name are reserved.
A.2 Terms

The lexical tokens combine into terms according to the following basic grammar (the characters []{}, should be seen as literals):

  t ::= v | {e}C[s, ..., s] | M[t, ..., t]   (Terms)
  s ::= ~v.t | t                             (Scopes)
  e ::= M | e; v : t | e; C : t              (Environments)

where ~v denotes v1 v2 ... vn for n > 0. A scope binds the variables in the vector such that lexical occurrences of these inside the t denote those specific variable instances (with the usual caveat that the innermost possible scope is used for each particular variable name). The parser furthermore permits the following abbreviations borrowed from λ-calculus and programming languages:

• Parentheses are allowed around every term, so (t) is the same as t;
• c ~v.t abbreviates c[v1.c[v2. ... c[vn.t] ... ]];
• t1 t2 abbreviates @[t1, t2] and is left recursive, so t1 t2 t3 is the same as (t1 t2) t3;
• t1; t2 abbreviates $Cons[t1, t2] and is right recursive, with the special rule that omitted segments correspond to the special subterm $Nil, so (t1; t2;) corresponds to the term $Cons[t1, $Cons[t2, $Nil]]; and
• empty brackets [] can be omitted.

A.3 Rules

Rules are a special interpretation of terms used by the rewrite engine to define rewrite systems, written as

  name[options] : pattern → contraction

with special conventions for the name, pattern, and contraction parts:

• the name becomes the name of the rule;
• the options are a comma-separated list of instructions to relax the requirements that all used meta-variables occur exactly once on each side of the rule, that all used variables are explicitly scoped, and that all pattern meta-applications permit all in-scope variables;
• the pattern is just a term that must be a construction where meta-applications are applied exclusively to distinct bound variables; and
• the → is really written as "&rarr;".
Otherwise, rewriting is like CRS, where matching builds a "valuation" that maps each meta-variable to an abstraction over the bound pattern variables in the redex, and contraction then uses the valuation to substitute matched bound variables appropriately [9]. Rules generalize to match environments in the natural way, with one important restriction to avoid the axiom of choice: constant keys must be given explicitly, and variable keys must be constrained to a specific variable (by being matched elsewhere in the pattern). Additionally, environment patterns permit a "catch-all" meta-variable that denotes the collection of all key-value pairs in the matched environment (including any matched explicitly), which can then be inserted by contraction (where explicit key-value pairs are seen as modifiers).

Finally, we permit free variables in patterns and contractions. A free variable in a pattern will only match actual locally free variable occurrences in redexes. This makes explicit the usual CRS constraint that distinct variables in a pattern match distinct variables in the redex, avoiding capture of locally bound variables. A free variable in a contraction is either already matched in the pattern or corresponds to generating a globally "fresh" variable [14].

A.4 Differences from Standard CRS

The CRSX notation differs from standard CRS in the following ways.

• Meta-variables use distinct tokens rather than reserving the Z symbol.
• Binding is indicated with a dot, as in λ-calculus, instead of using leading brackets. In addition, scopes are not terms but are restricted to occur on the subterms of constructions. So in CRSX the pattern C[#] does not match the term C[x.x], and the pattern C[x.#[x]] does not match the term C[x y.A].
• Several abbreviated forms are permitted.
• There is special support for environments and the use of free variables.
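The valuation discipline sketched in A.3, where matching maps each meta-variable to an abstraction over the bound pattern variables and contraction applies it, can be illustrated with a minimal higher-order-abstract-syntax encoding in Python. This is an illustration of the idea only, unrelated to the actual CRSX implementation; the term encoding is my own.

```python
# Terms in higher-order abstract syntax (HOAS): a scope is a Python function.
#   ("app", f, a)   application  @[f, a]
#   ("lam", body)   abstraction  λ[x.body(x)]
#   ("var", name)   a free variable occurrence

def beta(term):
    """One contraction of the CRS beta rule  @[λ[x.B[x]], A] → B[A].

    Matching builds the valuation: the meta-variable B is mapped to the
    abstraction of the redex body over its bound variable (here simply the
    Python function stored in the λ node); contraction applies it to A."""
    if term[0] == "app" and term[1][0] == "lam":
        B, A = term[1][1], term[2]
        return B(A)          # substitute A for the bound variable, no capture
    return term              # not a redex: leave unchanged

identity = ("lam", lambda x: x)
twice    = ("lam", lambda x: ("app", x, x))

print(beta(("app", identity, ("var", "z"))))   # ('var', 'z')
print(beta(("app", twice, ("var", "z"))))      # ('app', ('var', 'z'), ('var', 'z'))
```

Because the scope is a real function, the valuation for B needs no index shifting or renaming; that bookkeeping is exactly what the CRS substitution machinery performs on first-order representations.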
Swapping: a natural bridge between named and indexed explicit substitution calculi

Ariel Mendelzon, Alejandro Ríos, Beta Ziliani
Depto. de Computación, FCEyN, Universidad de Buenos Aires.
[email protected]

This article is devoted to the presentation of λrex, an explicit substitution calculus with de Bruijn indexes and an elegant notation. By being isomorphic to λex, a state-of-the-art formalism with variable names, λrex accomplishes simulation of β-reduction (Sim), preservation of β-strong normalization (PSN) and metaconfluence (MC), among other desirable properties. Our calculus is based on a novel presentation of λdB, using a peculiar swap notion that was originally devised by de Bruijn. Besides λrex, two other indexed calculi isomorphic to λx and λxgc are presented, demonstrating the potential of our technique when applied to the design of indexed versions of known named calculi.

1 Introduction

This article is devoted to explicit substitutions (ES, for short), a formalism that has attracted attention since the appearance of λσ [1] and, later, of Melliès' famous counterexample [13]. The main motivation behind the field of ES is studying how substitution behaves when internalized in the language it serves (in the classic λ-calculus, substitution is a meta-level operation). Several calculi have been proposed since [13], and few have been shown to have a whole set of desirable properties: Sim, PSN, MC, Full Composition, etc. For a detailed introduction to the ES field, we refer the reader to e.g. [12, 11, 16].

In 2009, D. Kesner proposed λex [11], a formalism with variable names that has the entire set of properties expected from an ES calculus. As far as we know, no ES calculus with de Bruijn indexes [5], a simple enough notation, and the whole set of properties exists to date.
We present here such a calculus: λrex, based on a more adequate swapping-based version of λdB [5], called λr, that we also introduce here. Moreover, the calculus is isomorphic to λex and, therefore, properties are preserved exactly. Together with λr and λrex we present λre and λregc, two formalisms that, in turn, are isomorphic to λx [4, 3] and λxgc [4]. As far as we know, no indexed isomorphic versions of λx and λxgc were known either.

Submitted to HOR 2010. © Ariel Mendelzon, Alejandro Ríos & Beta Ziliani. This work is licensed under the Creative Commons Attribution License.

2 A new presentation for λdB: the λr-calculus

The λ-calculus with de Bruijn indexes (λdB, for short) [5] accomplishes the elimination of α-equivalence, since α-equivalent λ-terms are syntactically identical in λdB. This greatly simplifies implementations, given that working modulo α-equivalence is generally tedious and expensive. One usually refers to a de Bruijn indexed calculus as a nameless calculus, for names are replaced by indexes. We observe here that, even though this nameless notion makes sense in the classical λdB-calculus (because the substitution operator is located at the meta-level), it seems not to be the case in certain ES calculi derived from λdB, such as λs [8], λse [9] or λt [10]. These calculi have constructions of the form a[i := b] to denote ES (notations vary). Here, even though i is not a name per se, it plays a similar role: i indicates which index should be replaced; hence, we believe, these calculi are not purely nameless.

Given this not-completely-nameless notion, we start by eliminating the index i from the substitution operator. Then, we are left with terms of the form a[b], and with a (Beta) reduction rule that changes from (λa) b → a[1 := b] to (λa) b → a[b]. The semantics of a[b] should be clear from the new (Beta) rule. The problem is, of course, how to define it. Two difficulties arise when a substitution crosses (goes into) an abstraction: first, the indexes of b should be incremented in order to reflect the new variable bindings; second, and the key to our technique, some mechanism should be implemented to replace the need for indexes inside closures (since these should be incremented, too).

The first problem is solved easily: we just use an operator to progressively increment indexes with every abstraction crossing, in the style of λt [10]. The second issue is a bit harder. Figure 1 helps clarify what we do when a substitution crosses an abstraction, momentarily writing σb a for a[b] in order to emphasize the binding character of the substitution. In this example we use the term σb (λ 1 2) (which stands for (λ 1 2)[b]). Figure 1(a) shows the bindings in the original term; Figure 1(b) shows that the bindings are inverted if we cross the abstraction and do not make any changes. Then, in order to get the bindings back in order, we just swap indexes 1 and 2 (Figure 1(c)). With this operation we recover, intuitively, the original semantics of the term.

Figure 1: Bindings. (a) σb (λ 1 2); (b) λ (σb 1 2); (c) λ (σb 2 1).

Summarizing, all that is needed when abstractions are crossed is: swap indexes 1 and 2 and, also, increment the indexes of the term carried in the substitution. That is exactly what λr does, with substitutions at the meta-level. In particular, we show that λr and λdB are identical. Terms for λr are the same as those for λdB. That is:

Definition 1 (Terms for λdB and λr). The set of terms for λdB and λr, denoted ΛdB, is given in BNF by:

  a ::= n | a a | λa    (n ∈ N>0)

We now define the new meta-operators used to implement index increments and swaps.

Definition 2 (Increment operator ↑i). For every i ∈ N, ↑i : ΛdB → ΛdB is given inductively by:

  ↑i(n)   = n if n ≤ i;  n + 1 if n > i
  ↑i(a b) = ↑i(a) ↑i(b)
  ↑i(λa)  = λ ↑i+1(a)

Definition 3 (Swap operator li). For every i ∈ N>0, li : ΛdB → ΛdB is given inductively by:

  li(n)   = n if n < i ∨ n > i + 1;  i + 1 if n = i;  i if n = i + 1
  li(a b) = li(a) li(b)
  li(λa)  = λ li+1(a)

Finally, we present the meta-level substitution definition for λr, and then the λr-calculus itself.

Definition 4 (Meta-substitution for λr). For every a, b, c ∈ ΛdB, n ∈ N>0, •{•} : ΛdB × ΛdB → ΛdB is given inductively by:

  n{c}     = c if n = 1;  n − 1 if n > 1
  (a b){c} = a{c} b{c}
  (λa){c}  = λ l1(a){↑0(c)}

Definition 5 (λr-calculus). The λr-calculus is the reduction system (ΛdB, βr), where βr ⊆ ΛdB × ΛdB is:

  (∀a, b ∈ ΛdB) a →βr b ⟺ (∃ C context; c, d ∈ ΛdB) (a = C[(λc) d] ∧ b = C[c{d}])

The next theorem states the relationship between the λr and λdB meta-substitution operators¹, having as an immediate corollary that λr and λdB are the same calculus.

¹ See [8], for example, for the definition of the λdB meta-substitution a{{i ← b}}.

Theorem 6. For every a, b ∈ ΛdB: a{{1 ← b}} = a{b}. Therefore, λdB and λr are the same calculus.

Proof. The proof is not trivial: to cope with stacked swaps and increments, an extended notion of these operations is needed, as well as many natural intermediate lemmas. A full technical proof can be found in [14], chapter 3. Moreover, this result was checked using the Coq theorem prover.²

3 The λre, λregc and λrex calculi

In order to derive an ES calculus from λr, we first need to internalize substitutions in the language. Thus, we add the construction a[b] to ΛdB, and call the resulting set of terms Λre. As a design decision, the operators ↑i and li are left at the meta-level. Naturally, we must extend their definitions to the ES case, a task that needs several lemmas to ensure correctness and that can be found in [14], chapter 4.
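Theorem 6 is easy to spot-check mechanically. The following Python sketch is an independent illustration, not the Coq development: it implements ↑i, li and the λr meta-substitution of Definitions 2-4 next to a textbook λdB meta-substitution with explicit lifting, and compares the two on a sample term.

```python
# de Bruijn terms: a positive int (an index), ("app", a, b), or ("lam", a)

def up(i, a):                      # ↑i : bump indexes strictly above i
    if isinstance(a, int):
        return a if a <= i else a + 1
    if a[0] == "app":
        return ("app", up(i, a[1]), up(i, a[2]))
    return ("lam", up(i + 1, a[1]))

def swap(i, a):                    # li : exchange indexes i and i+1
    if isinstance(a, int):
        return i + 1 if a == i else i if a == i + 1 else a
    if a[0] == "app":
        return ("app", swap(i, a[1]), swap(i, a[2]))
    return ("lam", swap(i + 1, a[1]))

def sub_r(a, c):                   # a{c} per Definition 4
    if isinstance(a, int):
        return c if a == 1 else a - 1
    if a[0] == "app":
        return ("app", sub_r(a[1], c), sub_r(a[2], c))
    return ("lam", sub_r(swap(1, a[1]), up(0, c)))

# classical λdB meta-substitution a{{i ← b}} with lifting, for comparison
def lift(k, i, a):                 # add i−1 to indexes strictly above k
    if isinstance(a, int):
        return a + i - 1 if a > k else a
    if a[0] == "app":
        return ("app", lift(k, i, a[1]), lift(k, i, a[2]))
    return ("lam", lift(k + 1, i, a[1]))

def sub_db(a, i, b):
    if isinstance(a, int):
        return a - 1 if a > i else lift(0, i, b) if a == i else a
    if a[0] == "app":
        return ("app", sub_db(a[1], i, b), sub_db(a[2], i, b))
    return ("lam", sub_db(a[1], i + 1, b))

# Theorem 6 on a sample: (λ(2 1)){3}  =  (λ(2 1)){{1 ← 3}}  =  λ(4 1)
a = ("lam", ("app", 2, 1))
assert sub_r(a, 3) == sub_db(a, 1, 3) == ("lam", ("app", 4, 1))
```

On the λr side the body (2 1) is first swapped to (1 2) and the argument incremented to 4 before the inner substitution fires; no index is ever carried inside the closure.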
Extensions are rather intuitive:

  ↑i(a[b]) = ↑i+1(a)[↑i(b)]   and   li(a[b]) = li+1(a)[li(b)]

Then, we just orient the equalities from the meta-substitution definition as expected and get a calculus we call λre (which turns out to be isomorphic to λx [4, 3], as we will later explain). It is important to mention that, even though independently discovered, the swapping mechanism introduced in λr, and then used for the conception of λre, was first depicted by de Bruijn for his Cλξφ [6] (later updated w.r.t. notation, as λξφ, and compared to λυ in [2]). Although Cλξφ is completely explicit (↑i and li are implemented by means of a special sort of substitution, cf. [2]), whereas λre keeps ↑i and li as meta-operators, the similarity between the two calculi is remarkable. In spite of this, and as far as we know, no direct successor of Cλξφ was found to satisfy PSN and MC (in particular, Cλξφ lacks composition).

As a next step in our work, we add Garbage Collection to λre. To get this right, we introduce a new meta-level operator, ↓i, which simply decrements indexes by one, so as to mimic the meta-substitution (and the corresponding λre rule) for the n > 1 index case. The operator is inspired by a similar one from [15], and needs a few lemmas to ensure a correct definition of ↓i(a[b]) (cf. [14], chapter 5). Then, we get:

Definition 7 (Decrement operator ↓i). For every i ∈ N>0, ↓i : Λre → Λre is given inductively by:

  ↓i(n)    = n if n < i;  undefined if n = i;  n − 1 if n > i
  ↓i(a b)  = ↓i(a) ↓i(b)
  ↓i(λa)   = λ ↓i+1(a)
  ↓i(a[b]) = ↓i+1(a)[↓i(b)]

Note. Notice that ↓i(a) is well-defined iff i ∉ FV(a).

The added Garbage Collection rule, (GC), can be seen in Figure 2, and the resulting calculus is called λregc (which, as we will see, is isomorphic to λxgc [4]). Finally, the idea is to approach λex [11]. To accomplish this, we need to be able to compose substitutions the λex way.
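Definition 7 admits a direct rendering in Python. This is again only an illustration; the free-variable computation `fv` and the term encoding (with `("sub", a, b)` for a[b]) are my own conventions, and "undefined" is rendered as a raised exception.

```python
# ES terms: a positive int, ("app", a, b), ("lam", a), or ("sub", a, b) for a[b]

def fv(a, depth=0):
    """Free indexes of a, counted relative to its top (1-based)."""
    if isinstance(a, int):
        return {a - depth} if a > depth else set()
    if a[0] == "app":
        return fv(a[1], depth) | fv(a[2], depth)
    if a[0] == "lam":
        return fv(a[1], depth + 1)
    return fv(a[1], depth + 1) | fv(a[2], depth)   # [b] binds index 1 in a

def down(i, a):                    # ↓i per Definition 7
    if isinstance(a, int):
        if a == i:
            raise ValueError("↓ undefined: the index occurs")
        return a - 1 if a > i else a
    if a[0] == "app":
        return ("app", down(i, a[1]), down(i, a[2]))
    if a[0] == "lam":
        return ("lam", down(i + 1, a[1]))
    return ("sub", down(i + 1, a[1]), down(i, a[2]))

# (GC): a[c] → ↓1(a) when 1 ∉ FV(a); here a = (2 3), so the redex drops c
a = ("app", 2, 3)
assert 1 not in fv(a)
assert down(1, a) == ("app", 1, 2)
```

Under a binder ↓1 becomes ↓2, so for instance down(1, ("lam", 3)) is ("lam", 2), matching the n > i case of the definition; applying `down` to an index it would erase raises instead, mirroring the side condition i ∉ FV(a).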
There, composition is handled by one rule and one equation:

  t[x := u][y := v] →(Comp) t[y := v][x := u[y := v]]   if y ∈ FV(u)
  t[x := u][y := v] =C      t[y := v][x := u]           if y ∉ FV(u) ∧ x ∉ FV(v)

The rule (Comp) is used when the substitutions are dependent, and reasoning modulo the C-equation is needed for independent substitutions. Since in λr-derived calculi there is no simple way of implementing an ordering of substitutions (remember: no indexes inside closures!), and thus no trivial path for the elimination of equation C exists, we need an analogous equation.

Let us start with the composition rule: in a term of the form a[b][c], the substitutions [b] and [c] are dependent iff 1 ∈ FV(b). In such a term, indexes 1 and 2 in a are affected by [b] and [c], respectively.

² The proof can be downloaded from http://www.mpi-sws.org/~beta/lambdar.v

Swapping: a natural bridge between named and indexed explicit substitution calculi

Consequently, if we were to reduce to a term of the form a′[c′][b′], a swap should be performed on a. Moreover, as substitution [c] crosses the binder [b], an index increment should be done too. Finally, since the substitutions are dependent – that is, [c] affects b – b′ should be b[c]. We are thus left with the term l1 (a)[↑0 (c)][b[c]].

For the equation, let us suppose the composition condition fails (i.e., 1 ∉ FV(b)). Using garbage collection on the last term, we have l1 (a)[↑0 (c)][b[c]] →(GC) l1 (a)[↑0 (c)][↓1 (b)]. It is important to notice that the condition in rule (Comp) is essential; that is, we cannot leave (Comp) unconditional and let (GC) do its magic: we would immediately generate infinite reductions, losing PSN. Thus, our composition rule and equation are:

  a[b][c] →(Comp) l1 (a)[↑0 (c)][b[c]]    if 1 ∈ FV(b)
  a[b][c] =D      l1 (a)[↑0 (c)][↓1 (b)]  if 1 ∉ FV(b)

The rules for the λrex-calculus can be seen in Figure 2. The relation rexp is generated by the set of rules (App), (Lamb), (Var), (GC) and (Comp); λrexp by (Beta) + rexp .
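The side condition 1 ∈ FV(b) of (Comp) is easy to compute. A hypothetical helper (ours, not the authors'), on the same tuple encoding of Λre terms used above (`("var", n)`, `("app", a, b)`, `("lam", a)`, `("sub", a, b)` for a[b]):

```python
def fv(t, depth=0):
    """Free de Bruijn indices of a Λre term, as seen from outside `depth` binders."""
    tag = t[0]
    if tag == "var":
        return {t[1] - depth} if t[1] > depth else set()
    if tag == "app":
        return fv(t[1], depth) | fv(t[2], depth)
    if tag == "lam":
        return fv(t[1], depth + 1)
    return fv(t[1], depth + 1) | fv(t[2], depth)    # a[b]: [b] binds index 1 in a

def comp_applies(b):
    """(Comp) side condition on a[b][c]: [b] and [c] are dependent iff 1 ∈ FV(b)."""
    return 1 in fv(b)
```

When `comp_applies(b)` is False, it is the equation (=D) that relates a[b][c] and l1(a)[↑0(c)][↓1(b)] instead.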
D-equivalence is the least compatible equivalence relation generated by (EqD). The relations λrex (resp. rex) are obtained from λrexp (resp. rexp ) modulo D-equivalence (thus specifying rewriting on D-equivalence classes). That is,

  ∀ a, a′ ∈ Λre : a →(λ)rex a′ ⇐⇒ ∃ b, b′ ∈ Λre : a =D b →(λ)rexp b′ =D a′

We define λrex as the reduction system (Λre, λrex). We shall define λre and λregc next. Since the rule (VarR) does not belong to λrex, but only to λre and λregc , we present it here:

  (VarR)  (n + 1)[c] → n

The relation re is generated by (App), (Lamb), (Var) and (VarR); λre by (Beta) + re; the relation regc by re + (GC); and λregc by (Beta) + regc . Finally, the λre and λregc calculi are the reduction systems (Λre, λre) and (Λre, λregc ), respectively.

  (EqD)  a[b][c]  =  l1 (a)[↑0 (c)][↓1 (b)]    (1 ∉ FV(b))

  (Beta) (λ a) b  →  a[b]
  (App)  (a b)[c] →  a[c] b[c]
  (Lamb) (λ a)[c] →  λ l1 (a)[↑0 (c)]
  (Var)  1[c]     →  c
  (GC)   a[c]     →  ↓1 (a)                    (1 ∉ FV(a))
  (Comp) a[b][c]  →  l1 (a)[↑0 (c)][b[c]]      (1 ∈ FV(b))

  Figure 2: Equations and rules for the λrex-calculus

For the isomorphism between λex and λrex (and also between λx and λre, and between λxgc and λregc ), we must first give a translation from the set Λx (i.e., the set of terms of λx, λxgc and λex; see e.g. [11] for the expected definition) to Λre, and vice versa. It is important to notice that our translations depend on a list of variables, which determines the indexes of the free variables. All this work is inspired by a similar proof showing the isomorphism between the λ and λdB calculi, found in [10].

Definition 8 (Translation from Λx to Λre). For every t ∈ Λx, n ∈ N, such that FV(t) ⊆ {x1 , . . .
, xn }, w[x1 ,...,xn ] : Λx → Λre is given inductively by:

  w[x1 ,...,xn ] (x)         = min { j : x j = x }
  w[x1 ,...,xn ] (t u)       = w[x1 ,...,xn ] (t) w[x1 ,...,xn ] (u)
  w[x1 ,...,xn ] (λ x.t)     = λ w[x,x1 ,...,xn ] (t)
  w[x1 ,...,xn ] (t[x := u]) = w[x,x1 ,...,xn ] (t) [w[x1 ,...,xn ] (u)]

Definition 9 (Translation from Λre to Λx). For every a ∈ Λre, n ∈ N, such that FV(a) ⊆ {1, . . . , n}, u[x1 ,...,xn ] : Λre → Λx, with {x1 , . . . , xn } pairwise distinct variables, is given inductively by:

  u[x1 ,...,xn ] ( j)    = x j
  u[x1 ,...,xn ] (a b)   = u[x1 ,...,xn ] (a) u[x1 ,...,xn ] (b)
  u[x1 ,...,xn ] (λ a)   = λ x.u[x,x1 ,...,xn ] (a)
  u[x1 ,...,xn ] (a[b])  = u[x,x1 ,...,xn ] (a) [x := u[x1 ,...,xn ] (b)]

with x ∉ {x1 , . . . , xn } in the abstraction and closure cases.

The translations are correct w.r.t. α-equivalence (i.e., α-equivalent Λx terms have the same image under w[x1 ,...,xn ] , and identical Λre terms have α-equivalent images under different choices of x for u[x1 ,...,xn ] ). Besides, adding variables at the end of the translation lists does not affect the result; thus, uniform translations w and u can be defined straightforwardly, depending only on a preset ordering of variables. Full proofs of all this can be found in [14], chapter 4. We now state the isomorphisms:

Theorem 10 (λex ≅ λrex, λx ≅ λre and λxgc ≅ λregc ). The λex (resp. λx, λxgc) and λrex (resp. λre, λregc ) calculi are isomorphic. That is,

  1. w ◦ u = IdΛre ∧ u ◦ w = IdΛx
  2. ∀t, u ∈ Λx : t →λex(λx,λxgc) u =⇒ w(t) →λrex(λre,λregc) w(u)
  3. ∀a, b ∈ Λre : a →λrex(λre,λregc) b =⇒ u(a) →λex(λx,λxgc) u(b)

Proof. This is actually a three-in-one theorem. The proofs require many intermediate lemmas asserting the interaction between translations and meta-operators. Full technical details for each of the isomorphisms can be found in [14], chapters 4 (λx ≅ λre), 5 (λxgc ≅ λregc ) and 6 (λex ≅ λrex).
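Definition 8 translates readily into code. Below is an illustrative sketch (ours, not from the paper), assuming named terms are encoded as `("var", x)`, `("app", t, u)`, `("lam", x, t)` for λx.t and `("sub", x, t, u)` for t[x := u], and de Bruijn terms as in the earlier encoding:

```python
def to_db(t, env):
    """w: translate a named term to de Bruijn indices.
    env is the list [x1, ..., xn]; an index is the position (1-based)
    of the first occurrence of the variable in env."""
    tag = t[0]
    if tag == "var":
        return ("var", env.index(t[1]) + 1)      # min j such that env[j] = x
    if tag == "app":
        return ("app", to_db(t[1], env), to_db(t[2], env))
    if tag == "lam":                             # λx.t: bind x at position 1
        return ("lam", to_db(t[2], [t[1]] + env))
    # t[x := u]: the closure binds x in t, exactly like an abstraction
    return ("sub", to_db(t[2], [t[1]] + env), to_db(t[3], env))
```

For example, `to_db(("lam", "x", ("lam", "y", ("var", "x"))), [])` yields λλ2, i.e. `("lam", ("lam", ("var", 2)))`.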
Finally, in order to show MC for λrex, an extension to metaterms (with decorated metavariables) must be provided. The extension is given as expected, and a proof of its isomorphism w.r.t. the corresponding extension of λex is also shown. We refer the reader to [14], chapter 6, section 3 for details.

As a direct consequence of Theorem 10, pairwise isomorphic calculi enjoy the same set of properties:

Corollary 11 (Preservation of properties). The λex (resp. λx, λxgc) and λrex (resp. λre, λregc ) calculi have the same properties. In particular, this implies that λrex has, among other properties, Sim, PSN and MC.

Proof sketch, for e.g. PSN of λrex. Assume PSN does not hold for λrex. Then there exists a ∈ SNλdB s.t. a ∉ SNλrex . Besides, a ∈ SNλdB implies u(a) ∈ SNλ . Therefore, by PSN of λex [11], u(a) ∈ SNλex . Now, as a ∉ SNλrex , there exists an infinite reduction a →λrex a1 →λrex a2 →λrex · · · . Thus, by Theorem 10, we have u(a) →λex u(a1 ) →λex u(a2 ) →λex · · · , contradicting the fact that u(a) ∈ SNλex .

4 Conclusions and further work

We have presented λrex, an ES calculus with de Bruijn indexes that is isomorphic to λex, a formalism with variable names that fulfills a whole set of interesting properties. As a consequence of the isomorphism, λrex inherits all of λex's properties. This, together with its very simple notation, makes it, as far as we know, the first calculus of its kind. Besides, the λre and λregc calculi (isomorphic to λx and λxgc, respectively) were also introduced. The development was based on a novel presentation of the classical λdB . Given the homogeneity of the definitions and proofs, not only for λr and λrex but also for λre and λregc , we think we have found a truly natural bridge between named and indexed formalisms.
We believe this opens a new set of possibilities in the area: either by translating and studying existing calculi with good properties, or by rethinking old calculi from a different perspective (i.e., with λr's concept in mind).

Work is yet to be done in order to get a more suitable theoretical tool for implementation purposes, for unary closures and equations still make such a task hard. In this direction, we think a mixed approach using ideas from λrex and λσ-styled calculi may lead to the solution of both issues. The explicitation of the meta-operators may also come to mind: we think this is not a priority, because the main merit of λrex is putting into evidence the accessory nature of index updates. Furthermore, an attempt to use λrex in proof assistants or higher-order unification [7] implementations may be taken into account. Finally, adding an η rule to λrex should be fairly simple using the decrement meta-operator.

Acknowledgements: Special thanks to Delia Kesner for valuable discussions and insight on the subject, as well as to the anonymous referees for their very useful comments.

References

[1] M. Abadi, L. Cardelli, P.-L. Curien & J.-J. Lévy (1991): Explicit Substitutions. J. Funct. Prog. 1, pp. 31–46.
[2] Z. Benaissa, D. Briaud, P. Lescanne & J. Rouyer-Degli (1996): λυ, a Calculus of Explicit Substitutions which Preserves Strong Normalisation. J. Funct. Prog. 6(5), pp. 699–722.
[3] R. Bloo & H. Geuvers (1999): Explicit substitution on the edge of strong normalization. Theor. Comput. Sci. 211(1-2), pp. 375–395.
[4] R. Bloo & K. H. Rose (1995): Preservation of strong normalisation in named lambda calculi with explicit substitution and garbage collection. In: CSN-95: Computing Science in the Netherlands, pp. 62–72.
[5] N. G.
de Bruijn (1972): Lambda Calculus Notation with Nameless Dummies, a Tool for Automatic Formula Manipulation, with Application to the Church-Rosser Theorem. Indagationes Mathematicae 34, pp. 381–392.
[6] N. G. de Bruijn (1978): A namefree λ calculus with facilities for internal definition of expressions and segments. Tech. Rep. TH-Report 78-WSK-03, Dept. of Mathematics, Technical University of Eindhoven.
[7] G. Dowek, Th. Hardin & C. Kirchner (2000): Higher order unification via explicit substitutions. Inf. Comput. 157(1-2), pp. 183–235.
[8] F. Kamareddine & A. Ríos (1995): A Lambda-Calculus à la de Bruijn with Explicit Substitutions. In: PLILP '95: Proceedings of the 7th International Symposium on Programming Languages: Implementations, Logics and Programs, Lecture Notes in Computer Science 982, pp. 45–62.
[9] F. Kamareddine & A. Ríos (1997): Extending a λ-calculus with explicit substitution which preserves strong normalisation into a confluent calculus on open terms. J. Funct. Prog. 7(4), pp. 395–420.
[10] F. Kamareddine & A. Ríos (1998): Bridging de Bruijn Indices and Variable Names in Explicit Substitutions Calculi. Logic Journal of the IGPL 6(6), pp. 843–874.
[11] D. Kesner (2009): A Theory of Explicit Substitutions with Safe and Full Composition. Logical Methods in Computer Science 5(3:1), pp. 1–29.
[12] P. Lescanne (1994): From λσ to λυ: a journey through calculi of explicit substitutions. In: POPL '94: Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, ACM, New York, NY, USA, pp. 60–69.
[13] P.-A. Melliès (1995): Typed lambda-calculi with explicit substitutions may not terminate. In: TLCA '95: Proceedings of the Second International Conference on Typed Lambda Calculi and Applications, Lecture Notes in Computer Science 902, pp. 328–334.
[14] A. Mendelzon (2010): Una curiosa versión de λdB basada en "swappings": aplicación a traducciones entre cálculos de sustituciones explícitas con nombres e índices.
Master's thesis, FCEyN, Univ. de Buenos Aires. Available at http://publi.dc.uba.ar/publication/pdffile/128/tesis_amendelzon.pdf.
[15] A. Ríos (1993): Contributions à l'étude des Lambda-calculs avec Substitutions Explicites. Ph.D. thesis, Université Paris 7.
[16] K. H. Rose, R. Bloo & F. Lang (2009): On explicit substitution with names. Technical Report, IBM. Available at http://domino.research.ibm.com/library/cyberdig.nsf/papers/39D13836281BDD328525767F0056CE65.

Standardisation for constructor based pattern calculi

Delia Kesner
PPS, CNRS and Université Paris Diderot
France
[email protected]

Carlos Lombardi
Depto. de Ciencia y Tecnología
Univ. Nacional de Quilmes
Argentina
[email protected]

Alejandro Ríos
Depto. de Computación
Facultad de Cs. Exactas y Naturales
Univ. de Buenos Aires – Argentina
[email protected]

This work gives some insights and results on standardisation for call-by-name pattern calculi. More precisely, we define standard reductions for a pattern calculus with constructor-based data terms and patterns. This notion is based on the reduction steps that are needed to match an argument with respect to a given pattern. We prove the Standardisation Theorem by using the technique developed by Takahashi [14] and Crary [3] for the lambda-calculus. The proof is based on the fact that any development can be specified as a sequence of head steps followed by internal reductions, i.e. reductions in which no head steps are involved. We expect to extend these results to more complex calculi with open and dynamic patterns.

1 Introduction

Several calculi have been proposed in order to give a formal description of pattern matching, i.e. the ability to analyse the form of the argument of a function in order to decide among alternative function definition clauses adequate to different argument forms. We will call them pattern calculi.
Central to several pattern calculi is the concept of matching: an application of an abstraction to an argument can only be performed if the argument matches the pattern of the abstraction. An analysis of various pattern calculi, based on different notions of matching operations and different sets of allowed patterns, can be found in [9].

A fundamental result in the lambda-calculus is the Standardisation Theorem, which states that if a term M β-reduces to a term N, then there is a "standard" β-reduction sequence from M to N. This result has several applications; e.g., it is used to prove the non-existence of reductions between given terms. One of its main corollaries is the quasi-leftmost-reduction theorem, which in turn is used to prove the non-existence of normal forms for a given term.

A first study on standardisation for the call-by-name lambda-calculus appears in [4]. Subsequently, several standardisation methods have been devised, for example [2] section 11.4, [14], [10] and [13]. While leftmost-outermost reduction gives a standard strategy for the call-by-name lambda-calculus, more refined notions of reduction are necessary to define standard strategies for the call-by-value lambda-calculus [13], first-order term rewriting systems [7, 15], Proof-Nets [5], etc.

All standard reduction strategies involve the definition of some selected redex/step by means of a partial function from terms to redexes/steps; they all give priority to the selected step, if possible. We will refer to this selected redex/step as the head step/redex of a term. For the standard call-by-name lambda calculus, any term of the form (λx.M)N is a redex, and the head redex of such a term is the whole term. In pattern calculi any term of the form (λp.M)N is a redex candidate, but not necessarily a redex.
The parameter p in such terms can be more complex than a single variable, and the whole term is not a redex if the argument N does not match p, i.e., if N does not satisfy the structural conditions imposed by p. In this case we will choose as head a reduction step lying inside N (or even inside p) which brings p and N closer to a possible match. While this situation bears some resemblance to what happens in the standard call-by-value lambda calculus [13], there is an important difference: both the fact of (λp.M)N being a redex, and whether a redex inside N could be useful to bring p and N closer to a possible match, depend in pattern calculi on both N and p.

Submitted to: HOR 2010. © D. Kesner, C. Lombardi & A. Ríos. This work is licensed under the Creative Commons Attribution-Share Alike License.

The aim of this contribution is to analyse the existence of standard reduction strategies for pattern calculi in a direct way, without using any encoding of such calculi into some general computational framework [11]. This direct approach highlights the fine interaction between reduction and pattern matching, and gives a normalization algorithm which is specified in terms of the combination of computations of independent terms with partial computations of terms depending on some pattern. We hope to be able to extend this algorithmic approach to more sophisticated pattern calculi handling open and dynamic patterns [8]. The expected standardisation algorithm should be expressed using the recently proposed explicit version of the Pure Pattern Calculus [1].

The paper is organized as follows. Section 2 introduces the calculus, Sections 3 and 4 give, respectively, the main concepts and ideas needed for the standardisation proof and the main results, and Section 5 concludes and gives future research directions.
2 The calculus

We will study a very simple form of pattern calculus, consisting of the extension of the standard lambda calculus with a set of constructors, allowing constructed patterns. This calculus appears, for example, in Section 4.1 of [9].

Definition 2.1 (Syntax) The calculus is built upon two different enumerable sets of symbols, the variables x, y, z, w and the constants c, a, b; its syntactical categories are:

  Terms          M, N, Q, R ::= x | c | λp.M | M M
  Data terms     D          ::= c | D M
  Patterns       p, q       ::= x | d
  Data patterns  d          ::= c | d p
  Substitutions  θ, ν, τ    ::= {x1 /M1 , . . . , xn /Mn }

Definition 2.2 (Matching) The match is the partial function from patterns and terms to substitutions defined by the following rules (⊎ on substitutions denotes disjoint union with respect to their domains, being undefined if the domains have a non-empty intersection):

  x ≪{x/M} M
  c ≪∅ c
  d ≪θ1 D,  p ≪θ2 M,  θ1 ⊎ θ2 defined  ⟹  d p ≪θ1⊎θ2 D M

We write p ≪ M iff ∃θ. p ≪θ M. Remark that p ≪ M implies that p is linear.

Definition 2.3 (Reduction step) The reduction steps of the calculus are defined by the following rules:

  p ≪θ N  ⟹  (λp.M) N → θ M
  M → M′  ⟹  M N → M′ N
  N → N′  ⟹  M N → M N′
  M → M′  ⟹  λp.M → λp.M′

Crucial to the standardisation proof is the concept of development; we formalize it through the relation ⊲, meaning M ⊲ N iff there is a (not necessarily complete) development with source M and target N.

Definition 2.4 (Term and substitution development) We define the relation ⊲ on terms and a corresponding relation ◮ on substitutions. The relation ⊲ is defined by the following rules:

  M ⊲ M
  M ⊲ M′  ⟹  λp.M ⊲ λp.M′
  M ⊲ M′,  N ⊲ N′  ⟹  M N ⊲ M′ N′
  M ⊲ M′,  θ ◮ θ′,  p ≪θ N  ⟹  (λp.M) N ⊲ θ′ M′

and ◮ is defined as follows: θ ◮ θ′ iff dom(θ) = dom(θ′) and ∀x ∈ dom(θ). θx ⊲ θ′x.

The definition of head step will take into account the terms (λp.M)N even if p ≪̸ N.
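The matching function of Definition 2.2 can be sketched directly in code. Below is an illustrative Python version (ours, not the authors'), where patterns and terms are tuples `("var", x)`, `("con", c)`, `("app", a, b)`, and matching failure is represented by `None` instead of partiality:

```python
def match(p, m):
    """p ≪θ M: return the substitution θ as a dict, or None when matching fails."""
    if p[0] == "var":
        return {p[1]: m}                        # x ≪{x/M} M
    if p[0] == "con":
        return {} if m == p else None           # c ≪∅ c
    if m[0] != "app":                           # a data pattern d p needs a term D M
        return None
    th1 = match(p[1], m[1])
    th2 = match(p[2], m[2])
    if th1 is None or th2 is None:
        return None
    if set(th1) & set(th2):                     # θ1 ⊎ θ2 undefined: shared variables
        return None
    return {**th1, **th2}
```

For instance, matching the pattern c x against the term c a yields {x/a}, while matching the constant c against d fails.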
In such cases, the head redex will be inside N, as the patterns in this calculus are always normal forms (this will not be the case for more complex pattern calculi). The selection of the head redex inside N depends on both N and p. This differs from the standard call-by-value lambda-calculus, where the selection depends only on N. We show this phenomenon with a simple example. Let a, b, c be constants and N = (a R1) R2 a term, where R1 and R2 are redexes. The redexes in N needed to achieve a match with a certain pattern q, and thus the selection of the head redex, depend on q. Take for example the different patterns p1 = (a x)(b y), p2 = (a(b x)) y, p3 = (a(b x))(c y), p4 = (a x) y, and consider the term Q = (λq.M)N. If q = p1 , then it is not necessary to reduce R1 (because it already matches x), but it is necessary to reduce R2 , because no redex can match the pattern b y; hence R2 will be the head redex in this case. Analogously, for p2 it is necessary to reduce R1 but not R2 , for p3 both are needed (in this case we will choose the leftmost one), and p4 does match N, hence the whole Q is the head redex. This observation motivates the following definition.

Definition 2.5 (Head step) The relations →h (head step) and ⇝p (preferred needed step to match pattern p) are defined as follows:

  (HBeta)    p ≪θ N  ⟹  (λp.M) N →h θ M
  (HApp1)    M →h M′  ⟹  M N →h M′ N
  (HPat)     N ⇝p N′  ⟹  (λp.M) N →h (λp.M) N′
  (PatHead)  M →h M′  ⟹  M ⇝d M′
  (Pat1)     D ⇝d D′  ⟹  D M ⇝dp D′ M
  (Pat2)     d ≪ D,  M ⇝p M′  ⟹  D M ⇝dp D M′

The rule PatHead is intended for data patterns only, not being valid for variable patterns; we indicate this by writing a d (data pattern) instead of a p (arbitrary pattern) in the arrow subscript of the conclusion. We observe that the rule analogous to HPat in the presentation of standard reduction sequences for the call-by-value lambda-calculus in both [13] and [3] reads

  N →h N′  ⟹  (λp.M) N →h (λp.M) N′

reflecting the aforementioned N-only dependency.
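The selection of the needed redex illustrated by the example can be mirrored in code. The sketch below is our illustration (with `("red", i)` a hypothetical placeholder for a redex); it follows rules Pat1/Pat2/PatHead to locate the subterm holding the preferred needed step, assuming the pattern does not match the term:

```python
def matches(p, t):
    """Boolean version of p ≪ t: variables match anything, constants match
    themselves, data patterns match componentwise applications."""
    if p[0] == "var":
        return True
    if p[0] == "con":
        return t == p
    return t[0] == "app" and matches(p[1], t[1]) and matches(p[2], t[2])

def needed(p, t):
    """Locate the subterm of t where the preferred needed step lies
    (precondition: p does not match t)."""
    if p[0] == "app" and t[0] == "app":
        if not matches(p[1], t[1]):
            return needed(p[1], t[1])   # Pat1: fix the data part first
        return needed(p[2], t[2])       # Pat2: the head moves into the argument
    return t                            # PatHead: t itself must be reduced
```

On the example above, with N = (a R1) R2, the sketch selects R2 for p1 = (a x)(b y) and R1 for p2 = (a(b x)) y, matching the discussion.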
We can also see that a head step in a term like (λp.M)N determined by rule HPat will lie inside N, but the same step will not necessarily be considered head if we analyse N alone. It is easy to check that if M ⇝p M′ then p ≪̸ M, avoiding any overlap between HBeta and HPat, and also between Pat1 and Pat2. This in turn implies that all terms have at most one head redex. We remark also that the head step depends not only on the pattern structure but also on the match, or lack of match, between pattern and argument.

3 Main concepts and ideas needed for the standardisation proof

In order to build a standardisation proof for constructor-based pattern calculi, we chose to adapt the proof in [14] for the call-by-name lambda-calculus, later adapted to the call-by-value lambda-calculus in [3], over the classical presentation of [13].

The proof method relies on an h-development property stating that any development can be split into a leading sequence of head steps followed by a development in which no head steps are performed; this is our Lemma 4.1, which corresponds to the so-called "main lemma" in the presentations by Takahashi and Crary. Even for a simple form of pattern calculus such as the one presented in this contribution, both the definitions (as we already mentioned when defining head steps) and the proofs are non-trivial extensions of the corresponding ones for the standard lambda-calculus, even in the framework of call-by-value. As mentioned before, the reason is the need to take into account, for terms involving the application of a function to an argument, the pattern of the function parameter when deciding whether a redex inside the argument should be considered a head redex. In order to formalize the notion of "development without occurrences of head steps", an internal development relation will be defined.
The dependency on both N and p when analysing the reduction steps from a term like (λp.M)N appears in the rule IApp2.

Definition 3.1 (Internal development) The relations ⊲int (internal development) and ⊲int_p (internal development with respect to the pattern p) are defined as follows:

  (IRefl)      M ⊲int M
  (IAbs)       M ⊲int M′  ⟹  λp.M ⊲int λp.M′
  (IApp1)      M ⊲int M′,  N ⊲ N′,  M ≠ λp.M1  ⟹  M N ⊲int M′ N′
  (IApp2)      M ⊲ M′,  N ⊲int_p N′  ⟹  (λp.M) N ⊲int (λp.M′) N′

  (PMatch)     p ≪ N,  N ⊲ N′  ⟹  N ⊲int_p N′
  (PConst)     N ⊲int_c N
  (PNoCData)   N ∉ DataTerms,  N ⊲int N′  ⟹  N ⊲int_d N′
  (PCDataNo1)  d ≪̸ D,  D ⊲int_d D′,  M ⊲ M′  ⟹  D M ⊲int_dp D′ M′
  (PCDataNo2)  d ≪ D,  p ≪̸ M,  D ⊲ D′,  M ⊲int_p M′  ⟹  D M ⊲int_dp D′ M′

We observe that if either N ⊲int N′ or N ⊲int_p N′, then N ⊲ N′. The formal description of the h-development condition takes the form of an additional binary relation. This relation corresponds to the one called strong parallel reduction in [3].

Definition 3.2 (H-development) We define the relations ⊲h and ◮h. Let M, N be terms and ν, θ substitutions.

  a. M ⊲h N iff (i) M ⊲ N, (ii) ∃Q. M →h* Q ⊲int N, and (iii) ∀p. ∃Qp. M ⇝p* Qp ⊲int_p N.
  b. ν ◮h θ iff (i) Dom(ν) = Dom(θ) and (ii) ∀x ∈ Dom(ν). νx ⊲h θx.

Clause (iii) in the definition of ⊲h shows that the dependency on the patterns, already noted when defining head steps and internal developments, carries over to the definition of h-development. This clause is needed when proving that all developments are h-developments; let us grasp the reason through a brief argument. Suppose we want to prove that a development inside N in a term like (λp.M)N is an h-development. The rules to apply in Definitions 2.5 and 3.1 are HPat and IApp2, respectively; therefore we need to perform an analysis relative to the pattern p when taking N alone. This analysis
is what clause (iii) expresses. Consequently, the proof of clause (ii) for a term needs to consider clause (iii) (instantiated to a certain pattern) valid for a subterm; this is achieved by including clause (iii) in the definition and performing an inductive argument on the terms being analysed.

4 Main results

We summarize the sequence of the main lemmas needed to prove the Standardisation Theorem, and then the theorem itself. Once Lemmas 4.1 and 4.2 have been obtained, both Lemma 4.3 and the Standardisation Theorem 4.5 admit very simple proofs. The proofs of the former lemmas involve some work, mostly related to the need to carefully check the different cases when analysing a term like (λp.M)N, especially when p ≪̸ N. The proof details, along with the statements and proofs of the more technical lemmas, are included in the extended version of this contribution, available at www.pps.jussieu.fr/~kesner/papers/std-patterns-long-hor10.pdf.

Lemma 4.1 (H-development property)
  (i) Let M, N be terms and ν, θ substitutions such that M ⊲ N and ν ◮h θ. Then νM ⊲h θN.
  (ii) Let M, N be terms such that M ⊲ N. Then M ⊲h N.

Lemma 4.2 (Postponement)
  (i) If M ⊲int N →h R, then there exists a term N′ such that M →h N′ ⊲ R.
  (ii) For every pattern p, if M ⊲int_p N ⇝p R, then there exists a term Np′ such that M ⇝p Np′ ⊲ R.

Lemma 4.3 (Bifurcation) Assume M, N are terms such that M ⊲* N. Then M →h* R ⊲int* N for some term R.

Definition 4.4 (Standard reduction sequence) The standard reduction sequences (s.r.s in the following) are the sequences of terms M1 ; . . . ; Mn which can be generated by:

  (StdVar)   x
  (StdHead)  M1 →h M2 ,  M2 ; . . . ; Mk  (k ≥ 2)  ⟹  M1 ; . . . ; Mk
  (StdAbs)   M1 ; . . . ; Mk  ⟹  (λp.M1 ); . . . ; (λp.Mk )
  (StdApp)   M1 ; . . . ; Mj ,  N1 ; . . . ; Nk  ⟹  (M1 N1 ); . . . ; (Mj N1 ); (Mj N2 ); . . . ; (Mj Nk )

Remark that, by induction, every term is a unitary s.r.s, the rule StdVar being the base case.

Theorem 4.5 (Standardisation) Assume M, N are terms such that M ⊲* N. Then there exists an s.r.s M; . . . ; N.
As in [13], standard reduction sequences are not unique unless we work modulo permutation equivalence [4, 12]. Indeed, let us suppose M →h M′ and N →h N′. We then get two different (but permutation-equivalent) s.r.s from (λd.M)N to (λd.M′)N′. The first combines, via StdApp, the sequences (λd.M); (λd.M′) (obtained by StdAbs from M; M′) and N; N′:

  (λd.M)N ; (λd.M′)N ; (λd.M′)N′

The second starts with the head step (λd.M)N →h (λd.M)N′ (derived by HPat from N ⇝d N′, which in turn follows by PatHead) and then applies StdHead:

  (λd.M)N ; (λd.M)N′ ; (λd.M′)N′

5 Conclusion and further work

We have presented an elegant proof of the Standardisation Theorem for constructor-based pattern calculi. We aim to generalize both the concept of standard reduction and the elegant structure of the Standardisation Theorem proof presented here to a large class of pattern calculi, including both open and closed variants such as the Pure Pattern Calculus [8]. It would be interesting to have sufficient conditions for a pattern calculus to enjoy the standardisation property. This would be close in spirit to [9], where an abstract confluence proof for pattern calculi is developed. The kind of calculi we want to deal with poses challenges that are not handled in the present contribution, such as open patterns, reducible (dynamic) patterns, and the possibility of having fail as a decided result of matching. Furthermore, the possibility of a decided fail combined with compound patterns suggests studying forms of inherently parallel standard reduction strategies. The abstract Standardisation Theorem developed by [6] in a homogeneous axiomatic framework could be useful for our purpose. While the axioms of the abstract formulation of standardisation are assumed to hold in the proof of the standardisation result, they need to be defined and verified for each language to be standardised.
This may not be trivial, as in the case of TRS [7, 15], where a meta-level matching operation is involved in the definition of the rewriting framework.

References

[1] Th. Balabonski (2008): Calculs avec motifs dynamiques. De la conception à l'implémentation. Stage de Master, Université Paris-Diderot Paris 7.
[2] H.P. Barendregt (1984): The Lambda Calculus: Its Syntax and Semantics. Elsevier, Amsterdam.
[3] K. Crary (2009): A Simple Proof of Call-by-Value Standardization. Technical Report CMU-CS-09-137, Carnegie Mellon University.
[4] H.B. Curry & R. Feys (1958): Combinatory Logic. North-Holland Publishing Company, Amsterdam.
[5] J.-Y. Girard (1987): Linear Logic. Theoretical Computer Science 50(1), pp. 1–101.
[6] G. Gonthier, J.-J. Lévy & P.-A. Melliès (1992): An abstract standardisation theorem. In: Proceedings, Seventh Annual IEEE Symposium on Logic in Computer Science, 22–25 June 1992, Santa Cruz, California, USA, IEEE Computer Society, pp. 72–81.
[7] G. Huet & J.-J. Lévy (1991): Computations in orthogonal rewriting systems. In: Jean-Louis Lassez & Gordon Plotkin, editors: Computational Logic, Essays in Honor of Alan Robinson, MIT Press, pp. 394–443.
[8] C.B. Jay & D. Kesner (2006): Pure Pattern Calculus. In: Peter Sestoft, editor: European Symposium on Programming, number 3924 in LNCS, Springer-Verlag, pp. 100–114.
[9] C.B. Jay & D. Kesner (2009): First-class patterns. J. Funct. Program. 19(2), pp. 191–225. Available at http://dx.doi.org/10.1017/S0956796808007144.
[10] Ryo Kashima (2000): A Proof of the Standardization Theorem in λ-Calculus. Research Reports on Mathematical and Computing Sciences C-145, Tokyo Institute of Technology.
[11] J.W. Klop, V. van Oostrom & R.C. de Vrijer (2008): Lambda calculus with patterns. Theor. Comput. Sci. 398(1-3), pp. 16–31. Available at http://dx.doi.org/10.1016/j.tcs.2008.01.019.
[12] J.-J. Lévy (1980): Optimal Reductions in the lambda-calculus. In: R. Hindley & J.P.
Seldin, editors: To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, Academic Press, pp. 159–191.
[13] G. Plotkin (1975): Call-by-name, call-by-value and the λ-calculus. Theor. Comput. Sci. 1(2), pp. 125–159.
[14] M. Takahashi (1995): Parallel reductions in λ-calculus. Inf. and Comput. 118(1), pp. 120–127.
[15] Terese (2003): Term Rewriting Systems. Cambridge Tracts in Theoretical Computer Science 55, Cambridge University Press.