Implementation Strategies for Single Assignment Variables

2004

Flow Java integrates single assignment variables (logic variables) into Java. This paper presents and compares three implementation strategies for single assignment variables in Flow Java. One strategy uses forwarding and dereferencing while the two others are variants of Taylor's scheme. The paper introduces how to adapt Taylor's scheme for a concurrent language based on operating system threads, token equality, and update of data structures. Evaluation of the strategies clarifies that the key issue for efficiency is reducing memory usage.

Frej Drejhammar (1,2) and Christian Schulte (1)
(1) IMIT, KTH - Royal Institute of Technology, Sweden
(2) SICS - Swedish Institute of Computer Science, Sweden
{frej,schulte}@imit.kth.se

1 Introduction

Flow Java attempts to simplify concurrent programming in Java by conservatively extending Java with single assignment variables (logic variables). Clearly, the most important aspect of implementing Flow Java concerns maintaining single assignment variables. Motivation, related work, and an implementation based on forwarding and dereferencing are presented in [2].

This paper discusses and compares three different implementation strategies for single assignment variables in Flow Java. In addition to the forwarding scheme, we discuss two schemes based on maintaining aliased (equal but not yet bound) single assignment variables in a circular data structure originally due to Taylor [10].

Other approaches based on Taylor’s scheme are [4, 9, 8]. They have in common that they carefully investigate the interaction with search by optimizing trailing. This paper naturally takes a different perspective. Firstly, Flow Java implements concurrency with operating system threads. This means that all operations on variables must take this concurrency model into account and use locking to guarantee atomicity. Locking is made deadlock free by exploiting ordering of objects in memory.
This is different from Prolog where order is used for trailing. Secondly, in contrast to Prolog and HAL, equality in Flow Java is based on object identity (token equality).

The paper makes the following specific contributions: it develops and evaluates two Taylor-based schemes for maintaining single assignment variables in a truly concurrent setting. The schemes differ in the asymptotic complexity of locking data structures. It presents how to use a Taylor-based scheme in a language with token equality and update. It evaluates the performance of the different schemes and identifies memory usage as the key criterion for good performance.

While the implementation is based on the GNU GCJ Java compiler and the libjava runtime environment, the techniques presented in this paper apply to any Java runtime environment using a memory layout similar to C++. The techniques are not limited to Java; they can equally well be applied to other object-oriented languages such as C#.

Plan of the Paper. The next section gives a brief overview of Flow Java. Section 3 describes an architecture for implementing Flow Java which is parametric with respect to different implementations of single assignment variables. Section 4 presents the different implementation strategies for variables. Section 5 evaluates the different strategies and Section 6 concludes.

2 Flow Java

Flow Java is a conservative extension to Java which adds single assignment variables (a variant of logic variables) to Java. This section provides a brief overview of how single assignment variables are supported in Flow Java; more details, including the discussion of related work, types, and futures for security and seamless integration, can be found in [2].

Single Assignment Variables. Single assignment variables in Flow Java are typed and serve as place holders for objects. They are introduced with the type modifier single. For example, single Object s; introduces s as a single assignment variable of type Object.
Initially, a single assignment variable is unbound, containing no object. A single assignment variable of type t can be bound to any object of type t. Binding a single assignment variable to an object o makes it indistinguishable from o. After binding, the variable is bound or determined.

Binding. Flow Java uses @= to bind a single assignment variable to an object. For example, Object o = new Object(); s @= o; binds s to the newly created object o. This makes s equivalent to o in any subsequent computation. The attempt to bind an already determined single assignment variable x to an object o raises an exception if x is bound to an object different from o. Otherwise, the binding operation does nothing. Binding two single assignment variables is called aliasing and is discussed below. Note that equality is concerned with the identity of objects only (token equality).

Synchronization. Statements that access the content of a yet undetermined single assignment variable automatically suspend the executing thread. These statements are: field access and update, method invocation, and type conversion. Suspension for synchronization variables has the same properties as explicit synchronization in Java through wait() and notify(). For example, assume a class C with method m and that c refers to a single assignment variable of type C. The method invocation c.m() suspends its executing thread if c is not determined. As soon as some other thread binds c, execution continues and the method m is executed for c.

Aliasing. Single assignment variables in Flow Java can be aliased (made equal) while still being unbound. Aliasing two single assignment variables x and y is done by x @= y. Binding either x or y to an object o binds both x and y to o.

3 Implementation Architecture

The Flow Java implementation is based on the GNU GCJ Java compiler and the libjava runtime environment. They provide a virtual machine and the ability to compile Java source code and byte code to native code.
Garbage collection is provided by a conservative collector. Extensions to the runtime system and to the compiler implement binding, aliasing, and synchronization on synchronization objects, which are the implementations of single assignment variables.

3.1 The GCJ/libjava Runtime Environment

Object Representation. The GCJ/libjava implementation uses a memory layout similar to C++. An object reference points to a memory area containing the object fields and a pointer, called vptr, to a virtual method table, called vtab. The vtab contains pointers to object methods and a pointer to the object class. The memory layout is the same for byte code and native code.

Suspension. The GCJ/libjava runtime uses operating system threads. For example, on x86 Linux, pthreads [3] are used. Explicit suspension and resumption in Java is implemented by the wait(), notifyAll(), and notify() methods. The methods are present in all Java objects. A thread suspends if it calls wait() on an object. The thread resumes execution when another thread calls either notifyAll() or notify() on the same object. The wait/notify functionality is made available in the libjava runtime as two functions, prim_wait and prim_notifyAll, each taking the waiting/notified object as an argument. The functions interface with the underlying system level thread implementation.

Monitors. Orthogonal to the wait/notify mechanism is the monitor, which is present in each Java object and is required for synchronized methods. Internally to libjava, the lock associated with the monitor can be acquired and released with the two functions lock and unlock.

3.2 Implementing Synchronization Objects

Synchronization objects are allocated on the heap and contain the minimal information needed to support aliasing. We refer by equivalence class to all synchronization objects aliased to each other. The implementation strategies discussed below select one element from the equivalence class as leader. Equivalence classes are maintained on two layers.
An upper layer handles the language level operations and makes them safe and atomic. The lower layer (described in Section 4) handles the representation and maintenance of equivalence classes.

Binding. When a synchronization object is bound to an object o, its internal information is updated to point to o. Binding is implemented by the primitive bind(a,b). It is infeasible to allocate synchronization objects which are large enough to contain the largest possible object in the system. Therefore, a synchronization object has at least one forwarding step to its value. This is in contrast to tagging, where logic variables are simply overwritten during binding.

Aliasing. Aliasing creates or extends an equivalence class by merging two, possibly singleton, equivalence classes with the primitive alias(a,b). The aliasing operation modifies the internal information of the synchronization objects to maintain the equivalence relation (equality).

Synchronization. The runtime system suspends execution until a synchronization object becomes determined. The primitive waitdet(r) suspends until its argument becomes determined and then returns the determined value. Synchronization objects do not use the same virtual method table as ordinary objects. Entries in the vtab of a synchronization object point to stub functions which are created by the runtime system during class loading. The stub suspends the executing thread until the object becomes determined, using waitdet(r), and then restarts method invocation. This provides automatic synchronization of method invocation without a runtime penalty for method invocation on ordinary objects.

3.3 Concurrency and Aliasing

Atomic aliasing and binding are required by Flow Java. In contrast to other systems supporting logic variables (for example, PARMA [10], WAM [11, 1], or even Mozart [6, 5]), the runtime system of Flow Java provides concurrency by using operating system threads.
The primitives implementing synchronization and atomic bind/alias are more complex as the operations must be made safe and atomic without resorting to a “stop the world” approach.

Operations. This section describes how binding, aliasing, and synchronization operations can be implemented using lock and unlock (see Section 3.1). The operations manipulate equivalence classes through a set of primitives (low-level primitives, whose names start with ll_):

ll_is_so(r) tests whether r is a synchronization object.
ll_bind(a, b) updates the internal representation of the equivalence class a to bind it to b. Binding an equivalence class binds all synchronization objects in the equivalence class.
ll_alias(a, b) updates the representation of a and b by merging their equivalence classes.
ll_leader(r) returns the leader of the equivalence class r.
ll_compress(orig, new) shortens the reference chain of orig to point directly to new if the representation needs or supports it.

Invariants. The following invariants apply to the use of the low level primitives:

1. The leader of a determined object is the object itself.
2. An equivalence class is only modified if the lock for its leader is held by the modifying thread.
3. Leader locks are acquired in order of increasing address of the leader.
4. Binding an equivalence class notifies all threads suspending on its leader by prim_notifyAll while the lock of the leader is still held by the binding thread.
5. If two equivalence classes are merged, the leader at the highest address is notified by a call to prim_notifyAll while its lock is still held by the modifying thread.
6. All low level primitives except ll_leader(r) and ll_is_so(r) take leaders as arguments.

Bind. The bind(a,b) primitive (defined in Figure 1) binds the synchronization object a to b. It first acquires the determined value of b by using waitdet() (which will suspend if b is not already determined). Then it uses ll_leader(a) to find the leader of a and acquire its lock.
If another thread is modifying the equivalence class this may require multiple iterations. When the lock has been acquired the binding is checked for validity. The equivalence class is updated by ll_bind(). prim_notifyAll is then called on the leader to wake up all threads suspended on the leader. Finally the lock for the leader is released.

Aliasing. Aliasing of synchronization objects is implemented by ll_alias. In order to be thread safe, alias iteratively acquires the locks of the two leaders. The lock of the leader at the lowest address is acquired first to prevent deadlock. The definition of alias can be found in Figure 1.

    // Alias a and b: merge their equivalence classes, or bind if one is determined.
    jobject alias(jobject a, jobject b) {
      bool as, bs;
      jobject low, high;
      while (true) {
        a = ll_leader(a);
        b = ll_leader(b);
        as = is_so(a);
        bs = is_so(b);
        if (!as && !bs) {
          // Both determined: they must be the same object (token equality).
          if (a == b) return a;
          throw TellFailureException;
        } else if (as && bs) {
          // Both undetermined: lock the leaders in address order (invariant 3).
          if (a < b) { low = a; high = b; }
          else       { low = b; high = a; }
          lock(low);
          lock(high);
          // Retry if a leader changed before both locks were held.
          if (low == ll_leader(low) && high == ll_leader(high)) break;
          unlock(high);
          unlock(low);
          continue;
        } else {
          // Exactly one is determined: bind the synchronization object
          // (first argument of bind) to the determined object.
          if (as) return bind(a, b);
          else    return bind(b, a);
        }
      }
      if (!valid_alias(low, high)) {
        unlock(high);
        unlock(low);
        throw TellFailureException;
      }
      ll_alias(low, high);
      prim_notifyAll(high);  // invariant 5
      unlock(high);
      unlock(low);
      return low;
    }

    // Bind the synchronization object a to the determined value of b.
    jobject bind(jobject a, jobject b) {
      b = waitdet(b);
      // Find the leader of a and lock it; retry if the leader changed meanwhile.
      while (true) {
        a = ll_leader(a);
        lock(a);
        if (ll_leader(a) == a) break;
        unlock(a);
      }
      if (!bind_is_valid(a, b)) {
        unlock(a);
        throw error;
      } else if (a == b) {
        // Nothing to do
      } else {
        ll_bind(a, b);
        prim_notifyAll(a);  // invariant 4
      }
      unlock(a);
      return b;
    }

    // Suspend until o is determined and return the determined object.
    jobject waitdet(jobject o) {
      if (!is_so(o)) return o;
      jobject t = o;
      while (is_so(o)) {
        o = ll_leader(o);
        lock(o);
        if (is_so(o)) prim_wait(o);
        unlock(o);
      }
      ll_compress(t, o);
      return o;
    }

Figure 1: Primitive operations alias, bind, and waitdet

Synchronization. The waitdet primitive suspends the currently executing thread until its argument becomes determined.
Only the bind(a,b) primitive changes the status of a synchronization object from unbound to bound. The invariants maintained by alias(a,b) and bind(a,b) (invariants 4 and 5) guarantee the following property: if the leader of an equivalence class changes or all members become bound, then prim_notifyAll is called on the leader while its lock is held by the thread doing the modification.

Therefore waitdet(r) can be implemented as shown in Figure 1. It is based on a loop which uses ll_leader(r) and terminates when a determined object is found. If an undetermined leader is found, the lock associated with the leader is acquired. If the object is still undetermined, prim_wait is called to wait for the leader to be updated. When prim_wait returns, the lock is released and the loop continues. Requiring the thread to acquire the lock before calling prim_wait guarantees that no binding or aliasing notifications are lost. For representations which can make use of path compression, ll_compress is executed as the final step.

4 Maintaining Equivalence Classes

The description of the operations in Section 3.3 defined the low level operations (named ll_<name>). This section describes three different schemes for implementing the underlying representation. By construction of the high level operations, the operations modifying equivalence classes (ll_bind and ll_alias) can assume exclusive access. The only exception is ll_compress, which is allowed to shorten a hypothetical reference chain without holding the lock, as it does not change the interpretation of a determined equivalence class.

The three representations are presented in turn. First, a scheme based on a forwarding pointer is described in Section 4.1. Then a variant of Taylor’s scheme [10] adapted to a language with update and token equality (non-structural equality) is described in Section 4.2. Finally, Section 4.3 shows an optimization of Taylor’s scheme in the concurrent setting.
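
As a bridge between the protocol of Section 3.3 and the representations that follow, the sketch below models the locking discipline in plain Java. It is only an illustration under stated assumptions, not the libjava implementation: the class FlowCell, its id field (a stand-in for the memory address used to order locks), and the runDemo helper are hypothetical names, and Java's built-in monitors play the role of lock/unlock and prim_wait/prim_notifyAll. A simple forwarding field serves as the representation, and path compression is left out.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of a synchronization object; not the libjava data structure. */
final class FlowCell {
    private static final Object UNB = new Object();            // sentinel: unbound leader
    private static final AtomicLong NEXT_ID = new AtomicLong();

    final long id = NEXT_ID.getAndIncrement(); // assumption: stands in for the memory address
    private volatile Object fwd = UNB;         // UNB, another FlowCell, or the bound value

    /** Follows forwarding pointers to an unbound leader or to a determined object. */
    static Object leader(Object r) {
        while (r instanceof FlowCell) {
            Object next = ((FlowCell) r).fwd;
            if (next == UNB) return r;
            r = next;
        }
        return r;
    }

    /** Binds the class of a to value (assumed to be an ordinary, non-FlowCell object). */
    static Object bind(FlowCell a, Object value) {
        while (true) {
            Object l = leader(a);
            if (!(l instanceof FlowCell)) {              // already determined
                if (l == value) return value;            // same object: nothing to do
                throw new IllegalStateException("already bound to a different object");
            }
            FlowCell c = (FlowCell) l;
            synchronized (c) {
                if (c.fwd != UNB) continue;              // leader changed: retry
                c.fwd = value;
                c.notifyAll();                           // wake waiters (invariant 4)
                return value;
            }
        }
    }

    /** Merges the classes of a and b, locking both leaders in id order (invariant 3). */
    static void alias(FlowCell a, FlowCell b) {
        while (true) {
            Object la = leader(a), lb = leader(b);
            if (!(la instanceof FlowCell)) { bind(b, la); return; }  // a is determined
            if (!(lb instanceof FlowCell)) { bind(a, lb); return; }  // b is determined
            FlowCell x = (FlowCell) la, y = (FlowCell) lb;
            if (x == y) return;                          // already in the same class
            FlowCell low = x.id < y.id ? x : y;
            FlowCell high = (low == x) ? y : x;
            synchronized (low) {
                synchronized (high) {
                    if (low.fwd != UNB || high.fwd != UNB) continue; // a leader changed: retry
                    high.fwd = low;                      // low leads the merged class
                    high.notifyAll();                    // wake waiters on the old leader (invariant 5)
                    return;
                }
            }
        }
    }

    /** Suspends the calling thread until o is determined, then returns the value. */
    static Object waitdet(Object o) {
        while (true) {
            Object l = leader(o);
            if (!(l instanceof FlowCell)) return l;
            synchronized (l) {
                if (((FlowCell) l).fwd != UNB) continue; // changed before the lock was taken
                try {
                    l.wait();                            // released by bind or alias on this leader
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
    }

    /** One thread waits on b while the calling thread aliases a and b and binds a. */
    static boolean runDemo() {
        FlowCell a = new FlowCell(), b = new FlowCell();
        Object v = new Object();
        Object[] seen = new Object[1];
        Thread waiter = new Thread(() -> seen[0] = waitdet(b));
        waiter.start();
        alias(a, b);
        bind(a, v);
        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return seen[0] == v && waitdet(a) == v;
    }

    public static void main(String[] args) {
        System.out.println(runDemo() ? "waiter observed the bound value" : "demo failed");
    }
}
```

The waiter re-checks the forwarding field after taking the monitor and before calling wait(), which mirrors the argument above that acquiring the lock first prevents lost notifications. The loop around wait() also absorbs spurious wake-ups.
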
4.1 Forwarding

This scheme is similar to the forwarding pointer scheme used in the WAM [1]. An equivalence class is represented as a tree of synchronization objects rooted in the leader. A bound equivalence class has a determined object at its root.

Synchronization objects are in this scheme allocated as two-field objects containing a redirection-pointer field rptr and the vptr. Normal objects also have an rptr; the rptr is used to indicate binding status and is also used as a forwarding pointer. Standard Java objects have their rptr pointing to the object itself. The rptr of a synchronization object can be: a sentinel UNB (for unbound), a pointer to a determined object, or a pointer to a synchronization object. A sentinel is used as otherwise an undetermined synchronization object would be indistinguishable from an object bound to null.

The rptr for all objects increases the memory requirements, but requires only one pointer dereference and a comparison to determine whether an object is a synchronization object (that is, o->rptr != o). To save memory the rptr could be present only in synchronization objects. But as libjava does not have tagged pointers, the test whether an object is a synchronization object would increase runtime. There are at least two ways to implement such a test. The first emulates tagged pointers by allocating vtables in a special area. The vtable pointer is then tested to see if it is inside this area. This approach is troublesome as the area cannot be of fixed size, and testing would have to be aware of the current area size and location. The second approach makes use of the reference to an object’s class object which is present in each vtable (that is, both the synchronization and the normal vtable). The vtable is dereferenced to reach the class object, which is in turn dereferenced to acquire the reference to the synchronization vtable, that is o->vtab->class->svtab == o->vtab.
The test requires at least three pointer dereferences and a comparison.

The primitives are implemented as follows:

ll_is_so(r) An object is a synchronization object if it is not null and its rptr is not pointing to the object itself. This operation is constant time.
ll_bind(a, b) Binding is implemented by changing the leader’s rptr to point to the object b. Again, this operation is constant time.
ll_alias(a, b) Aliasing is implemented by allowing a synchronization object’s rptr field to point to another synchronization object. The operation updates the rptr of the synchronization object at the higher address to point to the object at the lower address. This makes the “high” object aliased and the “low” object the leader of the joined equivalence class; Section 3.3 (Synchronization) motivates the order. The operation is constant time.
ll_leader(r) follows the rptr of its argument until it finds an object which is either determined or which has its rptr set to UNB. The found object is returned. This operation takes linear time in the number of objects forming the equivalence class (worst-case).
ll_compress(orig, new) The conservative garbage collector used in the libjava runtime does not shorten or remove chains of aliased objects. Therefore path compression [7] is implemented by waitdet (see Section 3.3), which dereferences synchronization objects. The ll_compress(orig, new) primitive simply updates the rptr of orig to contain new.

4.2 Taylor

In this adaptation of Taylor’s scheme [10] an equivalence class is represented as a cycle containing all elements of the class. The element at the lowest address is defined as the leader.

Taylor’s scheme is a conceptually simple scheme to represent free variables in Prolog. It avoids arbitrarily long reference chains as in the WAM by representing a free variable by a special reference type with a single pointer field. A single free variable contains a reference to itself, thus making it a member of a one-element cycle.
When two free variables are aliased, their cycles are merged by exchanging the pointer values of the objects being aliased. Binding is implemented by traversing the cycle and overwriting the variables with the value to which they are bound. Figure 2(a) graphically shows how variables are represented in Taylor’s scheme.

Taylor’s scheme cannot be used for Flow Java without some modifications. Overwriting single assignment variables as part of the binding operation is troublesome. Single assignment variables would have to be allocated as large as the size of the largest object which could be stored in the variable. The largest size of a compatible object is not necessarily available to the runtime system when the variable is created, as classes can be loaded at runtime. Another problem is that token equality is implemented by pointer comparison. Consider:

    1  single Object a, b;
    2  Object v = new Object();
    3  a @= b;
    4  a @= v;
    5  boolean result = a == b; // result = false

As a and b are at different addresses, the equality test on line five will return false although a and b should be equivalent after the aliasing on line three. Even if equality in Java was defined on the contents of the objects, Taylor’s scheme would still be incompatible with Flow Java. An update of a would not modify b even though a and b are aliased.

Figure 2: Variable representation in Taylor’s scheme: (a) plain; (b) for Flow Java. Panels show free variables, aliased variables, and bound variables (bound to value V).

Taylor’s scheme can in Flow Java be used to reduce the number of dereferencing steps needed to get the value of a determined single assignment variable to one. Instead of overwriting the single assignment variable during the binding, the forwarding pointer is overwritten to point to the determined object, as in Figure 2(b). Limiting the length of the reference chains is attractive but has drawbacks.
When synchronization objects are bound, Taylor’s scheme will modify all objects in the cycle even if only one thread is interested in the value. The forwarding scheme will only update objects which are accessed (see waitdet, Section 3.3). As the libjava garbage collector is conservative, the system is unable to collect a cycle of synchronization objects unless every member of the cycle is unreachable.

Taylor’s scheme leads to the following implementation of the low level primitives:

ll_is_so(r) A sentinel in place of the forwarding pointer is used to indicate a bound object. A special case is null, which cannot be dereferenced but is not a synchronization object. This operation is constant time.
ll_bind(a, b) traverses the cycle, overwriting the forwarding fields of the variables with b. This operation is linear in the number of elements in the cycle.
ll_alias(a, b) Aliasing merges the cycles by exchanging the forwarding pointer values. This operation is constant time.
ll_leader(r) traverses the cycle. If a determined object is found (this only occurs if another thread is modifying the cycle concurrently) it is returned. Otherwise the object at the lowest address is returned. This operation is linear in the number of elements in the cycle.
ll_compress(orig, new) This operation is a no-op.

4.3 Hybrid

The hybrid scheme removes the linear time complexity of the ll_leader(r) primitive by maintaining a field in all synchronization objects pointing to the leader of the equivalence class. Compared to the Taylor scheme, only the following operations change:

ll_alias(a, b) merges the cycles as in plain Taylor and then chooses a new leader for the merged cycle. The leader at the lowest address is selected as the new leader. The half-cycle which is assigned a new leader is traversed and the leader pointer is updated. This operation is linear time in the size of the cycle.
ll_leader(r) The value of the leader field is simply returned, making the operation constant time.
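
The cycle-based primitives of Sections 4.2 and 4.3 can be sketched in a few lines of Java. This is a single-threaded illustration under stated assumptions rather than the runtime's code: CycleCell and its fields are hypothetical names, the id field stands in for the memory address, and locking is omitted because the caller is assumed to hold the leader locks as required in Section 3.3. The hybrid merge relabels the losing cycle before exchanging the pointers, which is a simplification of the traversal order described above and yields the same result.

```java
/** Hypothetical single-threaded model of the Taylor and hybrid representations. */
final class CycleCell {
    private static int nextId = 0;

    final int id = nextId++;   // assumption: stands in for the memory address
    Object fwd = this;         // next cell in the cycle while unbound; the value once bound
    CycleCell leader = this;   // used by the hybrid scheme only

    boolean isUnbound() {
        return fwd instanceof CycleCell;
    }

    /** ll_leader, plain Taylor: scan the whole cycle for the lowest id. Linear time. */
    static CycleCell taylorLeader(CycleCell r) {
        CycleCell min = r;
        for (CycleCell c = (CycleCell) r.fwd; c != r; c = (CycleCell) c.fwd) {
            if (c.id < min.id) min = c;
        }
        return min;
    }

    /**
     * ll_alias, plain Taylor: exchanging the two pointers merges the cycles.
     * Precondition: a and b are unbound and belong to different cycles.
     */
    static void taylorAlias(CycleCell a, CycleCell b) {
        Object tmp = a.fwd;
        a.fwd = b.fwd;
        b.fwd = tmp;
    }

    /** ll_alias, hybrid: give the losing cycle its new leader, then merge. Linear time. */
    static void hybridAlias(CycleCell a, CycleCell b) {
        if (a.leader == b.leader) return;              // same class: nothing to merge
        CycleCell low = a.leader.id < b.leader.id ? a.leader : b.leader;
        CycleCell start = (low == a.leader) ? b : a;   // member of the cycle that is relabelled
        CycleCell c = start;
        do {
            c.leader = low;
            c = (CycleCell) c.fwd;
        } while (c != start);
        taylorAlias(a, b);
    }

    /** ll_bind: overwrite the forwarding field of every member with value. Linear time. */
    static void bind(CycleCell a, Object value) {
        CycleCell c = a;
        do {
            CycleCell next = (CycleCell) c.fwd;
            c.fwd = value;                             // value is assumed not to be a CycleCell
            c = next;
        } while (c != a);
    }

    public static void main(String[] args) {
        CycleCell x = new CycleCell(), y = new CycleCell(), z = new CycleCell();
        hybridAlias(x, y);
        hybridAlias(z, y);
        System.out.println("leader found in constant time: " + (z.leader == x));
        System.out.println("leader found by scanning:      " + (taylorLeader(z) == x));
        Object v = new Object();
        bind(y, v);
        System.out.println("all members bound: " + (x.fwd == v && y.fwd == v && z.fwd == v));
    }
}
```

After bind, every member reaches the value with a single dereference, which is the property the adapted Taylor scheme is meant to provide. The price is visible in the fields: the hybrid cell carries one more pointer than the plain Taylor cell, which is the memory overhead weighed in the evaluation below.
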
5 Evaluation

To measure the performance of the three different implementation schemes, we use four benchmark sets: constructing an equivalence class of size n (the benchmarks are named cr.f, cr.t, and cr.h, where .f is for forwarding, .t for Taylor, and .h for hybrid); aliasing two equivalence classes of size n each (al.f, al.t, and al.h); binding an equivalence class of size 2n (bi.f, bi.t, and bi.h); and accessing a bound value of an equivalence class through all its members repeatedly (ac1.f, ac1.t, and ac1.h for first time access, ac2.f, ac2.t, and ac2.h for second time access).

Methodology. All benchmarks have been run on a 3GHz Intel Pentium 4 with 1GB RAM. Each benchmark has been run a hundred times and the mean time for each set has been calculated. The standard deviation of the individual runtimes is for all cases less than 6.5 percent, which is small enough to not change the relative performance of the three implementation schemes.

Figure 3: Time vs. size for constructing equivalence classes (series cr.f, cr.t, cr.h; sizes 0 to 300).

Random Allocation. The benchmarks have been performed with synchronization objects allocated at random addresses. This captures the situation where synchronization objects are allocated by different program parts. It is also a typical memory layout after garbage collection.

Figure 4: Time vs. size for aliasing equivalence classes (series al.f, al.t, al.h, bi.f, bi.t, bi.h; sizes 0 to 300).

Figure 3 shows the results of the cr.* benchmarks involving n objects. The equivalence class is constructed by adding one element at a time. The Taylor scheme (cr.t) is slowest due to scanning the whole cycle to find the leader (quadratic complexity). The forwarding (cr.f) and hybrid (cr.h) schemes also have quadratic complexity. On average they follow an indirection path of length n/2.
As the entire chain fits into the cache, the actual scanning time is dwarfed by the time taken to handle cache misses (linear in the number of unique memory locations accessed). Therefore, in practice, building an equivalence class is done in O(n). As cr.h accesses more memory than cr.f, cr.h is marginally slower due to more cache misses.

For aliasing and binding the caching effects dominate here as well, see Figure 4. For aliasing, two chains of length n are aliased to each other. al.f and al.h are linear time, but they are more or less constant time for the cycle lengths considered. The hybrid scheme has a much larger constant overhead for aliasing as it updates the leader pointer in half of its resulting cycle (n elements). Pure Taylor is slowest as it accesses all objects in both cycles (2n). The difference in performance for bind is less pronounced as both bi.f and bi.h access all elements.

Also for accessing the value of a bound equivalence class through its members caching effects dominate. Figure 5 shows the time required for accessing all elements the first (ac1.*) and second (ac2.*) time. As is to be expected, the forwarding scheme is slowest as it accesses the largest amount of memory. The hybrid scheme is slower than the pure Taylor scheme as it accesses more memory. Looking at the time required for the second access, it is clear that path compression has little impact compared to the effect of a hot cache in the Taylor-based schemes.

Figure 5: Time vs. size for accessing equivalence classes (series ac1.f, ac1.t, ac1.h, ac2.f, ac2.t, ac2.h; sizes 0 to 300).

Ordered Allocation. The same set of benchmarks has also been conducted with synchronization objects allocated in order. The objects have been ordered in memory such that the forwarding-based scheme constructs the longest possible forwarding chains.

For creating equivalence classes cr.f shows the same relative performance as for random allocation. This is due to the low overhead for traversing the elements already loaded in the cache by the previous aliasing operation. For aliasing, al.t and bi.t outperform al.h and bi.h. This is because synchronization objects are smaller for the Taylor scheme. Hence more objects fit into the cache, and accessing one object might already prefetch part of another object into the cache.

Even if the experiment is set up to maximize the length of the forwarding chains and to neutralize the effect of path compression, the measured time for accessing a bound class is linear in the number of elements. This has been verified with an instrumented waitdet primitive which counts the number of forwarding hops taken (O(n^2)).

Summary. The benchmarks show that the time required to handle cache misses dominates to such a large extent as to make the quadratic components insignificant. To maximize performance one should minimize the amount of memory accessed, as multiple accesses to memory already in the cache are essentially free. With these selection criteria, the forwarding scheme is best.

6 Conclusion

The paper presents three different implementation strategies for single assignment variables which take locking, token equality, and updates into account. The implementation factors out the operations concerned with manipulating the different implementations of single assignment variables.

The paper clarifies how Taylor-based schemes need to be adapted to be compatible with thread-based concurrency, token equality, and update. Evaluation establishes that the most crucial aspect for efficiency is to access as little memory as possible.

Acknowledgements. This work has been partially funded by the Swedish Vinnova PPC (Peer to Peer Computing, project 2001-06045) project.

References

[1] H. Aït-Kaci. Warren’s Abstract Machine: A Tutorial Reconstruction. Logic Programming Series. The MIT Press, Cambridge, MA, USA, 1991.
[2] F. Drejhammar, C. Schulte, S. Haridi, and P. Brand. Flow Java: Declarative concurrency for Java.
In Proceedings of the Nineteenth International Conference on Logic Programming, volume 2916 of LNCS, pages 346–360, Mumbai, India, Dec. 2003. Springer-Verlag.
[3] IEEE Computer Society. Portable Operating System Interface (POSIX) - Amendment 2: Threads Extension (C Language). 345 E. 47th St, New York, NY 10017, USA, 1995.
[4] T. Lindgren, P. Mildner, and J. Bevemyr. On Taylor’s scheme for unbound variables. Technical Report 116, Computer Science Department, Uppsala University, Oct. 1995.
[5] M. Mehl. The Oz Virtual Machine: Records, Transients, and Deep Guards. Doctoral dissertation, Universität des Saarlandes, Im Stadtwald, 66041 Saarbrücken, Germany, 1999.
[6] Mozart Consortium. The Mozart programming system, 1999. Available from www.mozart-oz.org.
[7] D. Sahlin and M. Carlsson. Variable shunting for the WAM. Research Report R91-07, Swedish Institute of Computer Science, Kista, Sweden, 1991.
[8] T. Schrijvers, M. G. de la Banda, and B. Demoen. Trailing analysis for HAL. In International Conference on Logic Programming, volume 2401 of LNCS, pages 38–53. Springer-Verlag, 2002.
[9] T. Schrijvers and B. Demoen. Combining an improvement to PARMA trailing with trailing analysis. In Proceedings of the 4th International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming. ACM Press, 2002.
[10] A. Taylor. High Performance Prolog Implementation. PhD thesis, University of Sydney, Sydney, Australia, 1991.
[11] D. H. D. Warren. An abstract Prolog instruction set. Technical Note 309, SRI International, Artificial Intelligence Center, Menlo Park, CA, USA, Oct. 1983.