Centrum voor Wiskunde en Informatica

Compilation and Memory Management for ASF+SDF

M.G.J. van den Brand, P. Klint, P.A. Olivier

Software Engineering (SEN)

Report SEN-R9906, ISSN 1386-369X, February 28, 1999
Compilation and Memory Management for ASF+SDF
Mark van den Brand
CWI
P.O. Box 94079, NL-1090 GB Amsterdam, The Netherlands
[email protected]
Paul Klint
CWI
P.O. Box 94079, NL-1090 GB Amsterdam, The Netherlands,
and
Programming Research Group, University of Amsterdam
Kruislaan 403, NL-1098 SJ Amsterdam, The Netherlands
[email protected]
Pieter Olivier
Programming Research Group, University of Amsterdam
Kruislaan 403, NL-1098 SJ Amsterdam, The Netherlands
[email protected]
ABSTRACT
Can formal specification techniques be scaled up to industrial problems such as the development of domain-specific languages and the renovation of large COBOL systems?

We have developed a compiler for the specification formalism ASF+SDF that has been used successfully to meet such industrial challenges. This result is achieved in two ways: the compiler performs a variety of optimizations and generates efficient C code, and the compiled code uses a run-time memory management system based on maximal subterm sharing and mark-and-sweep garbage collection.

We present an overview of these techniques and evaluate their effectiveness in several benchmarks. It turns out that the execution speed of compiled ASF+SDF specifications is at least as good as that of comparable systems, while memory usage is in many cases an order of magnitude smaller.
1991 Computing Reviews Classification System: D.2.1, D.3.4, E.2, F.4.2.
Keywords and Phrases: Code Generation, Hashing, Languages, Performance, Subterm Sharing, Term Rewriting.
Note: To appear in Proceedings of Compiler Construction (CC’99), 1999
Note: Work carried out under project SEN-1.4, ASF+SDF
1 Introduction
Efficient implementation based on mainstream technology is a prerequisite for the application and acceptance of declarative languages and specification formalisms in real industrial settings. The main characteristic of industrial applications is their size; the predominant implementation consideration should therefore be the ability to handle huge problems.

In this paper we take the specification formalism ASF+SDF [5, 19, 15] as our point of departure. Its main focus is on language prototyping and on the development of language-specific tools. ASF+SDF is based on general context-free grammars for describing syntax and on conditional equations for describing semantics. In this way, one can easily describe the syntax of a (new or existing) language and specify operations on programs in that language, such as static type checking, interpretation, compilation, or transformation. ASF+SDF has been applied successfully in a
number of industrial projects [9, 11], such as the development of a domain-specific language for describing interest products (in the financial domain) [4] and a renovation factory for restructuring COBOL code [12]. In such industrial applications execution speed is very important, but when processing huge COBOL programs memory usage becomes a critical issue as well. Other applications of ASF+SDF include the development of a GLR parser generator [26], an unparser generator [13], program transformation tools [14], and the compiler discussed in this paper. Other components, such as parsers, structure editors, and interpreters, are developed in ASF+SDF as well but are not (yet) compiled to C.
What are the performance standards one should strive for when writing a compiler for, in our case, an algebraic specification formalism? Experimental comparative studies are scarce; one notable exception is [18], where measurements are collected for various declarative programs solving a single real-world problem. In other studies it is no exception that the units of measurement (rewrite steps/second, or logical inferences/second) are ill-defined and that memory requirements are not considered due to the small size of the input problems.
In this paper, we present a compiler for ASF+SDF that performs a variety of optimizations and generates efficient C code. The compiled code uses a run-time memory management system based on maximal subterm sharing and mark-and-sweep garbage collection. The contribution of this paper is to bring the performance of executable specifications based on term rewriting into the realm of industrial applications.

In the following two subsections we first give a quick introduction to ASF+SDF (the input language of the compiler to be described) and to ASF (the abstract intermediate representation used internally by the compiler). Next, we describe the generation of C code (Section 2) as well as memory management (Section 3). Section 4 is devoted to benchmarking. A discussion in Section 5 concludes the paper.
1.1 Specification Language: ASF+SDF

The specification formalism ASF+SDF [5, 19] is a combination of the algebraic specification formalism ASF and the syntax definition formalism SDF. An overview can be found in [15]. As an illustration, Figure 1 presents the definition of the Boolean datatype in ASF+SDF. ASF+SDF specifications consist of modules; each module has an SDF part (defining lexical and context-free syntax) and an ASF part (defining equations). The SDF part corresponds to signatures in ordinary algebraic specification formalisms. However, syntax is not restricted to plain prefix notation, since arbitrary context-free grammars can be defined. The syntax defined in the SDF part of a module can be used immediately when defining equations; the syntax in equations is thus user-defined.
The emphasis in this paper will be on the compilation of the equations appearing in a specification. They have the following distinctive features:

- Conditional equations with positive and negative conditions.
- Non-left-linear equations.
- List matching.
- Default equations.
It is possible to execute specifications by interpreting the equations as conditional rewrite rules. The semantics of ASF+SDF is based on innermost rewriting. Default equations are tried when all other applicable equations have failed, because either the arguments did not match or one of the conditions failed.

One of the powerful features of the ASF+SDF specification language is list matching. Figure 2 shows a single equation which removes multiple occurrences of identifiers from a set. In this example, variables with a superscript are list variables that may match zero or more identifiers. The implementation of list matching may involve backtracking to find a match that satisfies the left-hand side of the rewrite rule as well as all its conditions. Backtracking is limited to the scope of a single rewrite rule: if the right-hand side of the rewrite rule is normalized and this normalization fails, no backtracking is performed to find a new match.
The development of ASF+SDF specifications is supported by an interactive programming environment, the ASF+SDF Meta-Environment [23]. In this environment specifications can be developed and tested. It provides syntax-directed editors, a parser generator, and a rewrite engine. Given this rewrite engine, terms can be reduced by interpreting the equations as rewrite rules. For instance, the term

  true & ( false | true )

reduces to true when applying the equations of Figure 1.
imports Layout
exports
  sorts BOOL
  context-free syntax
    true              -> BOOL {constructor}
    false             -> BOOL {constructor}
    BOOL "|" BOOL     -> BOOL {left}
    BOOL "&" BOOL     -> BOOL {left}
    BOOL "xor" BOOL   -> BOOL {left}
    not BOOL          -> BOOL
    "(" BOOL ")"      -> BOOL {bracket}
  variables
    Bool [0-9']*      -> BOOL
  priorities
    BOOL "|" BOOL -> BOOL < BOOL "xor" BOOL -> BOOL <
    BOOL "&" BOOL -> BOOL < not BOOL -> BOOL

equations

  [B1] true | Bool    = true
  [B2] false | Bool   = Bool
  [B3] true & Bool    = Bool
  [B4] false & Bool   = false
  [B5] not false      = true
  [B6] not true       = false
  [B7] true xor Bool  = not Bool
  [B8] false xor Bool = Bool

Figure 1: ASF+SDF specification of the Booleans.
imports Layout
exports
  sorts ID
  lexical syntax
    [a-z][a-z0-9]*     -> ID
  sorts Set
  context-free syntax
    "{" {ID ";"}* "}"  -> Set
hiddens
  variables
    Id "*"[0-9]*       -> {ID ";"}*
    Id [0-9']*         -> ID

equations

  [1] {Id*0; Id; Id*1; Id; Id*2} = {Id*0; Id; Id*1; Id*2}

Figure 2: ASF+SDF specification of the Set equation.
1.2 Intermediate Representation Language: ASF

The user-defined syntax that may be used in equations poses two major implementation challenges.

First, how do we represent ASF+SDF specifications as parse trees? Recall that there is no fixed grammar, since the basic ASF+SDF grammar can be extended by the user. The solution we have adopted is to introduce the intermediate format AsFix (ASF+SDF fixed format), which is used to represent the parse trees of the ASF+SDF modules in a format that is easily processable by a machine. The user-defined syntax is replaced by prefix functions. The parse trees in the AsFix format are self-contained.

Second, how do we represent ASF+SDF specifications in a more abstract form that is suitable as compiler input? We use a simplified language ASF as an intermediate representation to ease the compilation process and to perform various transformations before generating C code. ASF is in fact a single-sorted (algebraic) specification formalism that uses only prefix notation. ASF can be considered the abstract syntax representation of ASF+SDF. AsFix and ASF live on different levels: ASF is only visible within the compiler, whereas AsFix serves as exchange format between the various components, such as the structure editor, parser, and compiler.
A module in ASF consists of a module name, a list of functions, and a set of equations. The main differences between ASF and ASF+SDF are:

- Only prefix functions are used.
- The syntax is fixed (eliminating lexical and context-free definitions, priorities, and the like).
- Lists are represented by binary list constructors instead of the built-in list construct of ASF+SDF; associative matching is used to implement list matching.
- Functions are untyped; only their arity is declared.
- Identifiers starting with capitals are variables; variable declarations are not needed.
Figure 3 shows the ASF specification corresponding to the ASF+SDF specification of the Booleans given earlier in Figure 1.¹ Figure 4 shows the ASF specification of sets given earlier in Figure 2. Note that this specification is not left-linear, since the variable Id appears twice on the left-hand side of the equation. The {list} function is used to mark that a term is a list. This extra function is needed to distinguish between a single-element list and an ordinary term, e.g., {list}(a) versus a, or {list}(V) versus V. An example of a transformation on ASF specifications is shown in Figure 5, where the non-left-linearity has been removed from the specification in Figure 4 by introducing new variables and an auxiliary condition.
2 C Code Generation
The A SF compiler uses A SF as intermediate representation format and generates C code as output. The compiler
consists of several independent phases that gradually simplify and transform the A SF specification and finally
generate C code.
A number of transformations is performed to eliminate “complex” features such as removal of non left-linear
rewrite rules, simplification of matching patterns, and the introduction of “assignment” conditions (conditions that
introduce new variable bindings). Some of these transformations are performed to improve the efficiency of the
resulting code whereas others are performed to simplify code generation.
In the last phase of the compilation process, C code is generated which implements the rewrite rules in the specification using adaptations of known techniques [22, 17]. Care is taken in constructing an efficient matching automaton, identifying common and reusable (sub)expressions, and efficiently implementing list matching. For each ASF function (even the constructors) a separate C function is generated. The right-hand side of an equation is directly translated to a function call, if necessary. A detailed description of the construction of the matching automaton is beyond the scope of this paper; a full description can be found in [10]. Each generated C function contains a small part of the matching automaton, so instead of building one big automaton, the automaton is split over the functions. The matching automaton respects the syntactic specificity of the arguments from left to right in the left-hand sides of the equations. Non-variable arguments are tried before the variable ones.
¹ To increase the readability of the generated code in this paper, we have consistently renamed generated names by more readable ones, like true, false, etc.
module Booleans
signature
  true;
  false;
  and(_,_);
  or(_,_);
  xor(_,_);
  not(_);
rules
  and(true,B) = B;
  and(false,B) = false;
  or(true,B) = true;
  or(false,B) = B;
  not(true) = false;
  not(false) = true;
  xor(true,B) = not(B);
  xor(false,B) = B;

Figure 3: ASF specification of the Booleans.
module Set
signature
  {list}(_);
  set(_);
  conc(_,_);
rules
  set({list}(conc(*Id0,conc(Id,conc(*Id1,conc(Id,*Id2)))))) =
    set({list}(conc(*Id0,conc(Id,conc(*Id1,*Id2)))));

Figure 4: ASF specification of Set.
module Set
signature
  {list}(_);
  set(_);
  conc(_,_);
  t;
  term-equal(_,_);
rules
  term-equal(Id1,Id2) == t
  ==>
  set({list}(conc(*Id0,conc(Id1,conc(*Id1,conc(Id2,*Id2)))))) =
    set({list}(conc(*Id0,conc(Id1,conc(*Id1,*Id2)))));

Figure 5: Left-linear ASF specification of Set.
ATerm and(ATerm arg0, ATerm arg1) {
  if (check_sym(arg0, truesym))
    return arg1;
  if (check_sym(arg0, falsesym))
    return arg0;
  return make_nf2(andsym, arg0, arg1);
}

Figure 6: Generated C code for the and function of the Booleans.
The datatype ATerm (for Annotated Term) is the most important datatype used in the generated C code. It is provided by a run-time library which takes care of the creation, manipulation, and storage of terms. ATerms consist of a function symbol and zero or more arguments, e.g., and(true,false). The library provides predicates, such as check_sym, to check whether the function symbol of a term corresponds to a given function symbol, and functions, like make_nfi, to construct a term (normal form) given a function symbol and i arguments (i >= 0). There are also access functions to obtain the i-th argument (i >= 0) of a term, e.g., arg_1(and(true,false)) yields false.
The usage of these term manipulation functions can be seen in Figures 6 and 7. Figure 6 shows the C code generated for the and function of the Booleans (see also Figures 1 and 3). This C code also illustrates the detection of reusable subexpressions. In the second if-statement, a check is made whether the first argument of the and function is equal to the term false. If the outcome of this test is positive, the first argument arg0 of the and function is returned rather than building a new normal form for the term false or calling the function false(). The last statement in Figure 6 is necessary to catch the case that the first argument is neither a true nor a false symbol, but some other Boolean normal form.
Figure 7 shows the C code generated for the Set example of Figure 2. List matching is translated into nested while loops; this is possible because of the restricted nature of the backtracking in list matching. The functions not_empty_list, list_head, list_tail, conc, and slice are library functions which give access to the C data structure representing the ASF+SDF lists. In this way the generated C code needs no knowledge of the internal list structure. We can even change the internal representation of lists without adapting the generated C code, by just replacing the library functions. The function term_equal checks the equality of two terms.
When specifications grow larger, separate compilation becomes mandatory. There are two issues related to the separate compilation of ASF+SDF specifications that deserve special attention. The first issue concerns the identification and linking of names appearing in separately compiled modules. Essentially, this amounts to the question how to translate ASF+SDF names into C names. This problem arises since a direct translation would generate names that are too long for C compilers and linkage editors. We have opted for a solution in which each generated C file contains a "register" function which stores at run-time, for each function defined in this C file, a mapping between the address of the generated function and the original ASF+SDF name. In addition, each C file contains a "resolve" function which connects local function calls to the corresponding definitions based on their ASF+SDF names. An example of registering and resolving can be found in Figure 8.
The second issue concerns the choice of a unit for separate compilation. In most programming language environments, the basic compilation unit is a file. For example, a C source file can be compiled into an object file, and several object files can be joined by the linkage editor into a single executable. If we change a statement in one of the source files, that complete source file has to be recompiled and linked with the other object files.

In the case of ASF+SDF, the natural compilation unit would be the module. However, we want to generate a single C function for each function in the specification (for efficiency reasons), but ASF+SDF functions can be defined using multiple equations occurring in several modules. The solution is to use a single function as compilation unit and to re-shuffle the equations before translating the specification. Equations are thus stored depending on the module they occur in as well as on their outermost function symbol. When the user changes an equation, only those functions that are actually affected have to be recompiled into C code. The resulting C code is then compiled and linked together with all other previously compiled functions. A sketch of the run-time registry that supports this scheme is given below.
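The following is a hedged sketch of the run-time registry that could sit behind the "register" and "resolve" functions of Figure 8. The names register_prod, lookup_func, and lookup_sym come from the generated code shown there, but the data structure and signatures below are our own assumptions, not the library's actual implementation.

#include <string.h>
#include <stddef.h>

typedef struct aterm *ATerm;
typedef ATerm (*FuncPtr)();          /* address of a generated function */
typedef const char *Symbol;          /* simplified symbol representation */

#define MAX_PRODS 1024

static struct {
  const char *prod;                  /* the original ASF+SDF production */
  FuncPtr func;                      /* the generated C function */
  Symbol sym;                        /* its function symbol */
} registry[MAX_PRODS];
static int nprods = 0;

/* Called by each "register" function: record the mapping between the
 * ASF+SDF name and the address of the generated function. */
void register_prod(const char *prod, FuncPtr func, Symbol sym) {
  registry[nprods].prod = prod;
  registry[nprods].func = func;
  registry[nprods].sym = sym;
  nprods++;
}

/* Called by each "resolve" function: connect a call to its definition. */
FuncPtr lookup_func(const char *prod) {
  for (int i = 0; i < nprods; i++)
    if (strcmp(registry[i].prod, prod) == 0)
      return registry[i].func;
  return NULL;                       /* not registered (yet) */
}

Symbol lookup_sym(const char *prod) {
  for (int i = 0; i < nprods; i++)
    if (strcmp(registry[i].prod, prod) == 0)
      return registry[i].sym;
  return NULL;
}

Because resolution happens at run-time through this table, recompiling one affected function does not require relinking names in any other compilation unit.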
ATerm set(ATerm arg0) {
  if (check_sym(arg0, listsym)) {
    ATerm tmp0 = arg_0(arg0);
    ATerm tmp1[2];
    tmp1[0] = tmp0;
    tmp1[1] = tmp0;
    while (not_empty_list(tmp0)) {
      ATerm tmp3 = list_head(tmp0);
      tmp0 = list_tail(tmp0);
      ATerm tmp2[2];
      tmp2[0] = tmp0;
      tmp2[1] = tmp0;
      while (not_empty_list(tmp0)) {
        ATerm tmp4 = list_head(tmp0);
        tmp0 = list_tail(tmp0);
        if (term_equal(tmp3, tmp4)) {
          return set(list(conc(slice(tmp1[0], tmp1[1]),
                               conc(tmp3,
                                    conc(slice(tmp2[0], tmp2[1]),
                                         tmp0)))));
        }
        tmp2[1] = list_tail(tmp2[1]);
        tmp0 = tmp2[1];
      }
      tmp1[1] = list_tail(tmp1[1]);
      tmp0 = tmp1[1];
    }
  }
  return make_nf1(setsym, arg0);
}

Figure 7: C code for the Set specification.
3 Memory Management
At run-time, the main activities of compiled ASF+SDF specifications are the creation and matching of large numbers of terms. Some of these terms may even be very big (more than 10^6 nodes). The amount of memory used during rewriting depends entirely on the number of terms being constructed and on the amount of storage each term occupies. In the case of innermost rewriting, a lot of redundant (intermediate) terms are constructed.

At compile time, we can take various measures to avoid redundant term creation (only the last two have been implemented in the ASF+SDF compiler):
- Postponing term construction. Only the (sub)terms of the normal form must be constructed; all other (sub)terms are only needed to direct the rewriting process. By transforming the specification and extending it with rewrite rules that reflect the steering effect of the intermediate terms, the amount of term construction can be reduced. In the context of functional languages this technique is known as deforestation [27]. Its benefits for term rewriting are not yet clear.

- Local sharing of terms: only those terms are shared that result from non-linear right-hand sides, e.g., f(X) = g(X,X). Only those terms can be shared whose sharing can be established at compile time; the amount of sharing will thus be limited. This technique is also applied in ELAN [8].

- Local reuse of terms, i.e., common subterms are only reduced once and their normal form is reused several times. Here again, the common subterm has to be determined at compile time.
At run-time, there are various other mechanisms to reduce the amount of work:
void register_xor() {
  xorsym = "prod(Bool xor Bool -> Bool {left})";
  register_prod("prod(Bool xor Bool -> Bool {left})",
                xor, xorsym);
}

void resolve_xor() {
  true = lookup_func("prod(true -> Bool)");
  truesym = lookup_sym("prod(true -> Bool)");
  false = lookup_func("prod(false -> Bool)");
  falsesym = lookup_sym("prod(false -> Bool)");
  not = lookup_func("prod(not Bool -> Bool)");
  notsym = lookup_sym("prod(not Bool -> Bool)");
}

ATerm xor(ATerm arg0, ATerm arg1) {
  if (check_sym(arg0, truesym))
    return (*not)(arg1);
  if (check_sym(arg0, falsesym))
    return arg1;
  return make_nf2(xorsym, arg0, arg1);
}

Figure 8: Generated C code for the xor function of the Booleans.
- Storage of all original terms to be rewritten and their resulting normal forms, so that if the same term must be rewritten again its normal form is immediately available. The most obvious way of storing this information is by means of pairs consisting of the original term and the calculated normal form. However, even for small specifications and terms an explosion of pairs may occur; the amount of data to be manipulated makes this technique useless. A more feasible solution is to store only the results of functions that have been explicitly annotated by the user as "memo-functions" (see Section 5).

- Dynamic sharing of (sub)terms. This is the primary technique we use; it is discussed in the next subsection.
3.1 Maximal Sharing of Subterms
Our strategy to minimize memory usage during rewriting is simple but effective: we only create terms that are new, i.e., that do not already exist. If a term to be constructed already exists, that term is reused, thus ensuring maximal sharing. This strategy fully exploits the redundancy that is typically present in the terms built during rewriting. The library functions that construct normal forms take care of building shared terms whenever possible. The sharing of terms is invisible, so no extra precautions are necessary in the code generated by the compiler.
Maximal sharing of terms can only be maintained when we check at every term creation whether that particular term already exists. This check implies a search through all existing terms, but must nonetheless be executed extremely fast in order not to impose an unacceptable penalty on term creation. Using a hash function that depends on the internal code of the function symbol and the addresses of its arguments, we can quickly search for a function application before creating it. The (modest but not negligible) cost at term creation time is hence one hash table lookup; a sketch is shown below.
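As an illustration, here is a minimal sketch of such hashed term construction for binary function applications. All names (Term, make_appl2, hash2) are hypothetical simplifications of the ATerm library internals, which differ in detail.

#include <stdlib.h>
#include <stdint.h>

#define TABLE_SIZE 65536

typedef struct term {
  int sym;                      /* internal code of the function symbol */
  struct term *arg0, *arg1;     /* argument terms (already shared) */
  struct term *next;            /* hash-bucket chain */
} Term;

static Term *table[TABLE_SIZE];

/* Hash on the symbol code and the addresses of the arguments. */
static unsigned hash2(int sym, Term *a0, Term *a1) {
  uintptr_t h = (uintptr_t)sym;
  h = h * 31 + ((uintptr_t)a0 >> 3);
  h = h * 31 + ((uintptr_t)a1 >> 3);
  return (unsigned)(h % TABLE_SIZE);
}

/* Return the unique term sym(a0,a1): reuse an existing copy when there
 * is one, otherwise allocate it and enter it into the table. */
Term *make_appl2(int sym, Term *a0, Term *a1) {
  unsigned h = hash2(sym, a0, a1);
  for (Term *t = table[h]; t != NULL; t = t->next)
    if (t->sym == sym && t->arg0 == a0 && t->arg1 == a1)
      return t;                 /* maximal sharing: the term already exists */
  Term *t = malloc(sizeof(Term));
  t->sym = sym;
  t->arg0 = a0;
  t->arg1 = a1;
  t->next = table[h];
  table[h] = t;
  return t;
}

/* With maximal sharing, equality of terms is pointer equality. */
int shared_term_equal(Term *t1, Term *t2) {
  return t1 == t2;
}

Because the arguments a0 and a1 are themselves already maximally shared, comparing their addresses suffices to decide whether the candidate term coincides with an existing one.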
Fortunately, we get two returns on this investment. First, the considerably reduced memory usage also leads to reduced (real-time) execution time. Second, we gain substantially because the equality check on terms (term_equal) becomes very cheap: it is reduced from an operation that is linear in the number of subterms to be compared to a constant-time operation (pointer equality). Note that the compiler generates calls to term_equal in the translation of patterns and conditions.

The idea of subterm sharing is known in the LISP community as hash consing and will be discussed below.
3.2 Shared Terms versus Destructive Updates
Terms can be shared in a number of places at the same time; therefore they cannot be modified without causing unpredictable side effects. This means that all operations on terms should be functional and that terms should effectively be immutable after creation.

During rewriting of terms by the generated code this restriction causes no problems, since terms are created in a fully functional way. Normal forms are constructed bottom-up, and there is no need to perform destructive updates on a term once it has been constructed. When normalizing an input term, this term is not modified; the normal form is constructed independently of the input term. If we modified the input term, we would get graph rewriting instead of (innermost) term rewriting. The term library is very general and is not only used for rewriting; destructive updates would therefore also cause unwanted side effects in other components based on this term library.
However, destructive operations on lists, like list concatenation and list slicing, become expensive. For instance, the most efficient way to concatenate two lists is to physically replace one of the lists by the concatenation result. In our case, this effect can only be achieved by taking the second list, prepending the elements of the first list to it, and returning the new list as result, as sketched below.
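The following minimal sketch illustrates such non-destructive concatenation. The Cell type and make_cons are hypothetical simplifications; in the real library, construction would go through the hash table of Section 3.1 so that equal lists remain shared.

#include <stdlib.h>

typedef struct cell {
  void *head;
  struct cell *tail;
} Cell;

/* Simplified constructor; the real library would first look the new
 * cell up in the term table to preserve maximal sharing. */
static Cell *make_cons(void *head, Cell *tail) {
  Cell *c = malloc(sizeof(Cell));
  c->head = head;
  c->tail = tail;
  return c;
}

/* Non-destructive concatenation: the spine of l1 is copied, while the
 * second list l2 is reused (shared) as the tail of the result. */
Cell *list_concat(Cell *l1, Cell *l2) {
  if (l1 == NULL)
    return l2;
  return make_cons(l1->head, list_concat(l1->tail, l2));
}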
In LISP, the success of hash consing [1] has been limited by the existence of the functions rplaca and rplacd that can destructively modify a list structure. To support destructive updates, one has to support two kinds of list structures: "mono copy" lists with maximal sharing and "multi copy" lists without maximal sharing. Before destructively changing a mono copy list, it has to be converted to a multi copy list. In the 1970s, E. Goto experimented with a Lisp dialect (HLisp) supporting hash consing and list types as just sketched. See [25] for a recent overview of this work and its applications.

In the case of the ASF+SDF compiler, we generate the code that creates and manipulates terms, and we can selectively generate code that copies subterms in cases where the effect of a destructive update is needed (as sketched above). This explains why we can apply the technique of subterm sharing with more success.
3.3 Reclaiming Unused Terms
During rewriting, a large number of terms is created, most of which will not appear in the end result. These terms are used as intermediate results to guide the rewriting process. This means that terms that are no longer used have to be reclaimed in some way.

After experimentation with various alternatives (reference counting, mark-and-compact garbage collection) we have finally opted for a mark-and-sweep garbage collection algorithm to reclaim unused terms. Mark-and-sweep collection is more efficient, both in time and space, than reference counting [20]. The typical space overhead for a mark-and-sweep garbage collection algorithm is only 1 bit per object.
Mark-and-sweep garbage collection works in three (sometimes two) phases. In the first phase, all the objects on the heap are marked as 'dead'. In the second phase, all objects reachable from the known set of root objects are marked as 'live'. In the third phase, all 'dead' objects are swept into a list of free objects. These phases are sketched below.
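As an illustration only, here is a minimal sketch of the three phases on a simplified node pool with one mark bit per object. All names (Node, collect) are hypothetical; the actual collector, a specialized version of Boehm's, is considerably more elaborate.

#include <stddef.h>

typedef struct node {
  unsigned mark : 1;       /* the 1-bit space overhead per object */
  int nargs;
  struct node **args;      /* subterm pointers */
  struct node *next;       /* chains every node in the heap */
  struct node *free_next;  /* chains reclaimed nodes */
} Node;

static Node *heap;          /* list of all allocated nodes */
static Node *free_list;     /* reclaimed nodes, ready for reuse */

/* Phase 2 helper: mark everything reachable from n as 'live'. */
static void mark(Node *n) {
  if (n == NULL || n->mark)
    return;
  n->mark = 1;
  for (int i = 0; i < n->nargs; i++)
    mark(n->args[i]);
}

void collect(Node **roots, int nroots) {
  /* Phase 1: mark all objects on the heap as 'dead'. */
  for (Node *n = heap; n != NULL; n = n->next)
    n->mark = 0;
  /* Phase 2: mark all objects reachable from the roots as 'live'. */
  for (int i = 0; i < nroots; i++)
    mark(roots[i]);
  /* Phase 3: sweep the 'dead' objects into the free list. A collector
   * for shared terms would also remove swept nodes from the hash table
   * of Section 3.1; that step is omitted here for brevity. */
  for (Node *n = heap; n != NULL; n = n->next)
    if (!n->mark) {
      n->free_next = free_list;
      free_list = n;
    }
}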
Mark-and-sweep garbage collection is also attractive because it can be implemented efficiently in C and can work without support from the programmer or compiler [7]. We have implemented a specialized version of Boehm's conservative garbage collector [6] that exploits the fact that we are managing ATerms.
4 Benchmarks
Does maximal sharing of subterms lead to reductions in memory usage? How does it affect execution speed? Does the combination of techniques presented in this paper indeed lead to an implementation of term rewriting that scales up to industrial applications?

To answer these questions, we present in Section 4.1 three relatively simple benchmarks to compare our work with that of other efficient functional and algebraic language implementations. In Section 4.2 we give measurements for some larger ASF+SDF specifications.
4.1 Three Small Benchmarks
All three benchmarks are based on the symbolic evaluation of the expressions 2^n mod 17, with 17 <= n <= 23. A nice aspect of these expressions is that there are many ways to calculate their value, giving ample opportunity to validate the programs in the benchmark. The actual source of the benchmarks can be obtained at http://adam.wins.uva.nl/olivierp/benchmark/index.html.
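As a side note, the expected answers are easy to cross-check with ordinary modular arithmetic; the small stand-alone program below (our own illustration, not part of the benchmark suite) computes 2^n mod 17 for the benchmark range by repeated modular multiplication.

#include <stdio.h>

/* Compute (base^n) mod m by repeated modular multiplication. */
static unsigned pow_mod(unsigned base, unsigned n, unsigned m) {
  unsigned r = 1 % m;
  while (n-- > 0)
    r = (r * base) % m;
  return r;
}

int main(void) {
  for (unsigned n = 17; n <= 23; n++)
    printf("2^%u mod 17 = %u\n", n, pow_mod(2, n, 17));
  return 0;
}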
Compiler                      Time (sec)
Clean (strict)                     32.3
SML                                32.9
Clean (lazy)                       36.9
ASF+SDF (with sharing)             37.7
Haskell                            42.4
Opal                               75.7
ASF+SDF (without sharing)         190.4
Elan                              287.0

Table 1: The execution times for the evaluation of 2^23.
[Figure 9: Memory usage for the evalexp benchmark. Average memory usage (kilobytes) plotted against n = 17, ..., 23 for Asf+Sdf, Asf+Sdf (no sharing), Clean (lazy), Clean (strict), Opal, Haskell, SML, and Elan.]
Note that these benchmarks were primarily designed to evaluate specific implementation aspects, such as the effect of sharing, lazy evaluation, and the like. They cannot (yet) be used to give an overall comparison between the various systems. Also note that some systems failed to compute results for the complete range 17 <= n <= 23 in some benchmarks. In those cases, the corresponding graph ends prematurely. Measurements were performed on an UltraSPARC-5 (270 MHz) with 512 MB of memory. So far we have used the following implementations in our benchmarks:
- The ASF+SDF compiler as discussed in this paper: we give results with and without maximal sharing.
- The Clean compiler developed at the University of Nijmegen [24]: we give results for standard (lazy) versions and for versions optimized with strictness annotations (strict).
- The ELAN compiler developed at INRIA, Nancy [8].
- The Opal compiler developed at the Technische Universität Berlin [16].
- The Glasgow Haskell compiler [21].
- The Standard ML compiler [3].
[Figure 10: Execution times for the evalexp benchmark. Time (tenths of a second) plotted against n = 17, ..., 23 for the same eight implementations.]
4.1.1 The evalsym Benchmark
The first benchmark is called evalsym and uses an algorithm that is CPU intensive but does not use a lot of memory. This benchmark is a worst case for our implementation, because little can be gained by maximal sharing. The results are shown in Table 1. The differences between the various systems are indeed small. Although ASF+SDF (with sharing) cannot benefit from maximal sharing, it does not lose much either.
4.1.2 The evalexp Benchmark
The second benchmark is called evalexp and is based on an algorithm that uses a lot of memory when a typical eager (strict) implementation is used. Using a lazy implementation, the amount of memory needed is relatively small.

Memory usage is shown in Figure 9. Clearly, normal strict implementations cannot cope with the excessive memory requirements of this benchmark. Interestingly, ASF+SDF (with sharing) has no problems whatsoever due to its use of maximal sharing, although it is also based on strict evaluation.

Execution times are plotted in Figure 10. Only Clean (lazy) is faster than ASF+SDF (with sharing), but the differences are small.
4.1.3 The evaltree Benchmark
The third benchmark is called evaltree and is based on an algorithm that uses a lot of memory both with lazy and with eager implementations. Figure 11 shows that neither the lazy nor the strict implementations can cope with the memory requirements of this benchmark. Only ASF+SDF (with sharing) can keep the memory requirements at an acceptable level, due to its maximal sharing. The execution times plotted in Figure 12 show that only ASF+SDF scales up for n > 20.
4.2 Compilation Times of Larger ASF+SDF Specifications

Table 2 gives an overview of the compilation times of four non-trivial ASF+SDF specifications and their sizes in number of equations, lines of ASF+SDF specification, and lines of generated C code. The ASF+SDF compiler is the specification of the ASF+SDF to C compiler discussed in this paper. The parser generator is an ASF+SDF specification which generates a parse table for a GLR parser [26]. The COBOL formatter is a pretty-printer for COBOL; this formatter is used within a renovation factory for COBOL [12]. The Risla expander is an ASF+SDF specification of a domain-specific language for interest products; it expands modular Risla specifications into "flat" Risla specifications [4]. These flat Risla specifications are later compiled into COBOL code by an auxiliary tool.
[Figure 11: Memory usage for the evaltree benchmark. Average memory usage (kilobytes) plotted against n = 17, ..., 23 for the same eight implementations.]
The compilation times in the column "ASF+SDF compiler" give the time needed to compile each ASF+SDF specification to C code. Note that the ASF+SDF compiler has been fully bootstrapped and is itself a compiled ASF+SDF specification. Therefore the times in this column give a general idea of the execution times that can be achieved with compiled ASF+SDF specifications. The compilation times in the last column are produced by a native C compiler (SUN's cc) with maximal optimizations.
Specification       ASF+SDF       ASF+SDF   Generated        ASF+SDF         C compiler
                    (equations)   (lines)   C code (lines)   compiler (sec)  (sec)
ASF+SDF compiler    1876          8699      85185            216             323
Parser generator    1388          4722      47662            106             192
COBOL formatter     2037          9205      85976            208             374
Risla expander      1082          7169      46787            168             531

Table 2: Measurements of the ASF+SDF compiler.
Table 3 gives an impression of the effect of maximal sharing on the execution time and memory usage of compiled ASF+SDF specifications. We show the results (with and without sharing) for the compilation of the ASF+SDF to C compiler itself and for the expansion of a non-trivial Risla specification.
5 Concluding Remarks
We have presented techniques for the compilation of ASF+SDF to C, with emphasis on memory management issues. We conclude that compiled ASF+SDF specifications run at speeds comparable to those of other systems, while memory usage is in some cases an order of magnitude smaller. We have mostly used and adjusted existing techniques, but their combination in the ASF+SDF compiler turns out to be very effective.

It is striking that our benchmarks show results that seem to contradict previous observations in the context of SML [2], where sharing resulted in slightly increased execution speed and only marginal space savings. On closer inspection, we come to the conclusion that the two methods for term sharing are different and cannot be compared easily.
[Figure 12: Execution times for the evaltree benchmark. Time (tenths of a second) plotted against n = 17, ..., 23 for the same eight implementations.]
Application                           Time (sec)   Memory (Mb)
ASF+SDF compiler (with sharing)          216            16
ASF+SDF compiler (without sharing)       661           117
Risla expansion (with sharing)             9             8
Risla expansion (without sharing)         18            13

Table 3: Performance with and without maximal sharing.
We share terms immediately when they are created: the costs are a table lookup and the storage needed for the table, while the benefits are space savings due to sharing and a fast equality test (one pointer comparison). In [2], sharing of subterms is only determined during garbage collection, in order to minimize the overhead of a table lookup at term creation. This implies that local terms that have not yet survived one garbage collection are not shared, thus losing most of the benefits (space savings and fast equality test) as well. The different usage patterns of terms in SML and ASF+SDF may also contribute to these seemingly contradictory observations.
There are several topics that need further exploration. First, we want to study the potential of compile-time analysis for reducing the amount of garbage that is generated at run-time. Second, we have just started exploring the implementation of memo-functions. Although the idea of memo-functions is rather old, they have not been used very much in practice due to their considerable memory requirements. We believe that our setting of maximally shared subterms will provide a new perspective on the implementation of memo-functions, as the sketch below suggests.
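As a hedged sketch of this perspective (all names below are our own, not an existing API): because maximally shared terms have unique addresses, a memo table can hash and compare arguments by pointer alone, avoiding the deep term comparisons and copies that traditionally make memo-functions expensive.

#include <stdlib.h>
#include <stdint.h>

typedef struct term Term;    /* opaque, maximally shared term type */

#define MEMO_SIZE 4096

typedef struct memo_entry {
  Term *arg;                 /* argument term (unique address) */
  Term *result;              /* memoized normal form */
  struct memo_entry *next;   /* hash-bucket chain */
} MemoEntry;

static MemoEntry *memo[MEMO_SIZE];

/* Apply f to arg, reusing a previously computed result whenever the
 * same (shared) argument term has been seen before. */
Term *memo_apply(Term *(*f)(Term *), Term *arg) {
  unsigned h = (unsigned)(((uintptr_t)arg >> 3) % MEMO_SIZE);
  for (MemoEntry *e = memo[h]; e != NULL; e = e->next)
    if (e->arg == arg)       /* pointer equality suffices */
      return e->result;
  Term *result = f(arg);
  MemoEntry *e = malloc(sizeof(MemoEntry));
  e->arg = arg;
  e->result = result;
  e->next = memo[h];
  memo[h] = e;
  return result;
}

The memory cost is one table entry per distinct argument, which is exactly the concern mentioned above; maximal sharing at least ensures that the keys themselves occupy no extra space.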
Finally, our ongoing concern is to achieve an even further scale-up of prototyping based on term rewriting.
Acknowledgments
The discussions with Jan Heering on ASF+SDF compilation are much appreciated. The idea for the benchmarks in Section 4.1 originates from Jan Bergstra. Reference [2] was pointed out to us by one of the referees.
References
[1] J.R. Allen. Anatomy of LISP. McGraw-Hill, 1978.
[2] A.W. Appel and M.J.R. Goncalves. Hash-consing garbage collection. Technical Report CS-TR-412-93,
Princeton University, 1993.
[3] A.W. Appel and D. MacQueen. A standard ML compiler. In G. Kahn, editor, Functional Programming Languages and Computer Architecture, LNCS, pages 301–324, 1987.
[4] B.R.T. Arnold, A. van Deursen, and M. Res. An algebraic specification of a language for describing financial
products. In M. Wirsing, editor, ICSE-17 Workshop on Formal Methods Application in Software Engineering,
pages 6–13. IEEE, April 1995.
[5] J.A. Bergstra, J. Heering, and P. Klint, editors. Algebraic Specification. ACM Press/Addison-Wesley, 1989.
[6] H. Boehm. Space efficient conservative garbage collection. In Proceedings of the ACM SIGPLAN '93 Conference on Programming Language Design and Implementation, SIGPLAN Notices 28(6), pages 197–206, June 1993.
[7] H. Boehm and M. Weiser. Garbage collection in an uncooperative environment. Software - Practice and
Experience (SPE), 18(9):807–820, 1988.
[8] P. Borovanský, C. Kirchner, H. Kirchner, P.-E. Moreau, and M. Vittek. ELAN: A logical framework based on
computational systems. In José Meseguer, editor, Proceedings of the First International Workshop on Rewriting Logic, volume 4 of Electronic Notes in Theoretical Computer Science. Elsevier Science, 1996.
[9] M.G.J. van den Brand, A. van Deursen, P. Klint, S. Klusener, and A.E. van der Meulen. Industrial applications
of ASF+SDF. In M. Wirsing and M. Nivat, editors, Algebraic Methodology and Software Technology (AMAST
’96), volume 1101 of Lecture Notes in Computer Science. Springer-Verlag, 1996.
[10] M.G.J. van den Brand, J. Heering, P. Klint, and P.A. Olivier. Compiling rewrite systems: The ASF+SDF compiler. Technical report, Centrum voor Wiskunde en Informatica (CWI), 1999. In preparation.
[11] M.G.J. van den Brand, P. Klint, and C. Verhoef. Term rewriting for sale. In C. Kirchner and H. Kirchner,
editors, Proceedings of the First International Workshop on Rewriting Logic and its Applications, volume 15
of Electronic Notes in Theoretical Computer Science, pages 139–161. Elsevier Science, 1998.
[12] M.G.J. van den Brand, M.P.A. Sellink, and C. Verhoef. Generation of components for software renovation
factories from context-free grammars. In I.D. Baxter, A. Quilici, and C. Verhoef, editors, Proceedings of the
Fourth Working Conference on Reverse Engineering, pages 144–153, 1997.
[13] M.G.J. van den Brand and E. Visser. Generation of formatters for context-free languages. ACM Transactions
on Software Engineering and Methodology, 5:1–41, 1996.
[14] J.J. Brunekreef. A transformation tool for pure Prolog programs. In J.P. Gallagher, editor, Logic Program
Synthesis and Transformation. Proceedings of the 6th International Workshop, LOPSTR’96, volume 1207 of
LNCS, pages 130–145. Springer-Verlag, 1996.
[15] A. van Deursen, J. Heering, and P. Klint, editors. Language Prototyping: An Algebraic Specification Approach, volume 5 of AMAST Series in Computing. World Scientific, 1996.
[16] K. Didrich, A. Fett, C. Gerke, W. Grieskamp, and P. Pepper. OPAL: Design and implementation of an algebraic
programming language. In J. Gutknecht, editor, International Conference on Programming Languages and
System Architectures, volume 782 of Lecture Notes in Computer Science, pages 228–244. Springer-Verlag,
1994.
[17] C.H.S. Dik. A fast implementation of the Algebraic Specification Formalism. Master’s thesis, University of
Amsterdam, Programming Research Group, 1989.
[18] P.H. Hartel et al. Benchmarking implementations of functional languages with ‘pseudoknot’, a float-intensive
benchmark. Journal of Functional Programming, 6:621–655, 1996.
[19] J. Heering, P.R.H. Hendriks, P. Klint, and J. Rekers. The syntax definition formalism SDF — Reference manual. SIGPLAN Notices, 24(11):43–75, 1989. Most recent version available at URL: http://www.cwi.nl/gipe/.
[20] R. Jones and R. Lins. Garbage Collection: Algorithms for Automatic Dynamic Memory Management. Wiley,
1996.
[21] S.L. Peyton Jones, C.V. Hall, K. Hammond, W.D. Partain, and P.L. Wadler. The Glasgow Haskell compiler: a technical overview. In Proc. Joint Framework for Information Technology (JFIT) Technical Conference, pages 249–257, 1993.
[22] S. Kaplan. A compiler for conditional term rewriting systems. In P. Lescanne, editor, Proceedings of the First
International Conference on Rewriting Techniques, volume 256 of Lecture Notes in Computer Science, pages
25–41. Springer-Verlag, 1987.
[23] P. Klint. A meta-environment for generating programming environments. ACM Transactions on Software
Engineering and Methodology, 2:176–201, 1993.
[24] M.J. Plasmeijer and M.C.J.D. van Eekelen. Concurrent Clean - version 1.0 - Language Reference Manual,
draft version. Department of Computer Science, University of Nijmegen, Nijmegen, The Netherlands, 1994.
[25] M. Terashima and Y. Kanada. HLisp—its concept, implementation and applications. Journal of Information
Processing, 13(3):265–275, 1990.
[26] E. Visser. Syntax Definition for Language Prototyping. PhD thesis, University of Amsterdam, 1997.
[27] P. Wadler. Deforestation: Transforming programs to eliminate trees. Theoretical Computer Science, 73(2):231–248, 1990.