A GRAINLESS SEMANTICS FOR THE HARPO/L LANGUAGE

Theodore S. Norvell
Memorial University of Newfoundland
Computer Engineering Research Labs
St. John's, NL
[email protected]

(The author would like to thank NSERC for funding this work.)

ABSTRACT

This paper presents a dynamic semantics for the parallel language HARPO/L, based on Reynolds's grainless approach [1]. It shows that the approach scales to somewhat more sophisticated languages with few changes, while providing a solid semantics for the language, which will be used as a basis for compilation and optimization.

Index Terms— parallel languages, concurrency, language semantics, grainless semantics

1. INTRODUCTION

HARPO/L is a new behavioural language intended for programming configurable hardware and microprocessors [2]. The language supports interprocess communication via shared variables and rendezvous.

The meaning of shared-variable programs is conventionally dependent on the granularity of operations. Consider

    (co x := x + 1 || x := x + 1 co)    (1)

where x is an integer variable. If we consider that each assignment command is atomic, then (1) means x := x + 2; if we consider that each integer access is atomic, then the command nondeterministically increments x by either 1 or 2; if we consider that each memory byte access is atomic, the meaning might depend on the detailed representation of integers.

Any granularity assumption puts severe constraints on optimization. Arrays and dynamically assigned pointers allow locations to have numerous aliases, and this can inhibit almost all optimizations that involve reordering memory accesses. Worse, any granularity assumption above that provided by the native hardware requires locks to be inserted that can largely kill the benefits of parallelism. If each assignment is atomic, then a command a[i] := a[i] + 1 may have to be compiled as lock(s); a[i] := a[i] + 1; unlock(s), where s is some compiler-generated lock.

For these reasons, we adopt the grainless approach of [1]. This approach defines all programs that contain data races to be "wrong". For example,

    (co a[i] := a[i] + 1 || a[j] := a[j] + 1 co)

is "wrong" for initial states where i = j. This puts the onus on the programmer to ensure the absence of data races in their programs. One way the programmer can do so is by using explicit locks, replacing races for data with races for locks.

The rest of this paper outlines an approach to formalizing the dynamic semantics of HARPO/L. A complete semantics of the language is being developed as part of a complete description of the language. In adapting the semantic approach of [1], the following differences arise: First, HARPO/L is a more complex language than the toy language considered in [1], including, for example, objects, complex lvalue expressions, expressions that can go wrong, and rendezvous. Second, HARPO/L allows read/read data races. Third, I take a different approach to infinite computations and to computations that go wrong; these differences will be discussed further in Section 5.

We proceed as follows: In Section 2 we discuss the context in which programs are executed. This includes the object structure of the program and the data states of the program. Section 3 deals with lvalue expressions and value expressions. The semantics here is almost a traditional denotational semantics, but includes a twist to account for the fact that we need to track which locations are needed to evaluate each expression. In Section 4, we map commands to trace sets, which are sets of sequences of actions, each action being atomic. Following [1], I represent operations that access memory as two atomic actions, one indicating the start of the operation and one representing the end of the operation. I also give a semantics to the traces. Finally, Section 5 compares our work with related work and points the way to future developments.
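To make the granularity issue behind (1) concrete, here is a small illustration (my own Python sketch, not part of HARPO/L or its toolchain) that enumerates the interleavings of the two increments when each read and each write of x is taken as atomic; under assignment-level atomicity every interleaving yields 2, while access-level atomicity also admits 1.

```python
# Hypothetical illustration (not part of HARPO/L): possible final values of x
# for (co x := x + 1 || x := x + 1 co) under access-level atomicity.
from itertools import permutations

def interleavings(n_threads, steps_per_thread):
    """All orderings of thread steps that preserve each thread's own order."""
    labels = [t for t in range(n_threads) for _ in range(steps_per_thread)]
    return set(permutations(labels))

def run_read_write_atomic(order):
    """Each thread does an atomic read of x, then an atomic write of its read + 1."""
    x = 0
    local = {}            # per-thread value read from x
    step = {0: 0, 1: 0}   # next step for each thread
    for t in order:
        if step[t] == 0:
            local[t] = x      # read
        else:
            x = local[t] + 1  # write
        step[t] += 1
    return x

# Assignment-level atomicity: every interleaving gives x = 2.
# Access-level atomicity: reads and writes interleave, so x may be 1 or 2.
finals = {run_read_write_atomic(o) for o in interleavings(2, 2)}
print(finals)  # {1, 2}
```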
2. CONTEXT

2.1. Object graph

HARPO/L is a static, object-based language, meaning that, while programs are structured in terms of classes, all object instantiation is done statically (at compile time). The full semantics of the language involves three levels: the first level explains the specialization of generic classes to specific classes; the second level explains the instantiation of classes as objects; the third level explains the dynamic semantics of objects. This paper deals only with the dynamic semantics of the language and so, as a starting point, we consider a data structure that might be the output of the front end of a compiler. This data structure consists of an object graph, a single command, and an initial state. The purpose of the dynamic semantics is to give the meaning of this command in the context of the object graph.

An object graph consists of a number of sets and functions. N is a set of identifiers. L is a set of locations. O is a set of objects. A is a set of arrays. We will also use a set V, which is the set of primitive values for a particular implementation. The special value F indicates an expression computation "gone wrong"; we'll assume that it is outside all of the above sets.

Each object o has a fixed finite set of field names fields(o) ⊆ N and, for n ∈ fields(o), field(o, n) ∈ O ∪ L ∪ A. We'll extend the field function to wrong values by defining field(F, n) = F, where n ∈ N. We assume that all local variables have been replaced by object fields, so local variables are also accessed as fields. Similarly, for each a ∈ A, there is a length length(a) ∈ ℕ and, for v ∈ {0, .. length(a)}, index(a, v) ∈ O ∪ L ∪ A. (The notation {i, .. k} means {j ∈ ℤ | i ≤ j < k}; we assume that {0, .. length(a)} ⊆ V for all arrays a.) For all other v ∈ V, define index(a, v) = F. We'll also define index(F, v) = index(a, F) = F.

The set of global objects is represented by a finite partial function global ∈ N ⇀ O ∪ L ∪ A. Because of the static nature of HARPO/L, the global, fields, field, methods, length, and index functions do not depend on the state.
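As a purely illustrative reading of these definitions, the following Python sketch models an object graph; the names ObjectNode, ArrayNode, and WRONG are my own and are not part of HARPO/L or its compiler.

```python
# Hypothetical sketch of an object graph, assuming a Python encoding
# (not taken from the HARPO/L implementation).

WRONG = object()   # plays the role of the special value F

class Location:
    """A mutable storage location (an element of L)."""

class ObjectNode:
    """An element of O: a fixed, finite map from field names to targets."""
    def __init__(self, fields):
        self._fields = dict(fields)   # name -> ObjectNode | Location | ArrayNode

    def field(self, name):
        # unknown field names are treated as "gone wrong" in this sketch
        return self._fields.get(name, WRONG)

class ArrayNode:
    """An element of A: indexed by 0 .. length - 1."""
    def __init__(self, elements):
        self._elements = list(elements)

    @property
    def length(self):
        return len(self._elements)

    def index(self, v):
        # index(a, v) = F unless v is in {0, .. length(a)}
        if isinstance(v, int) and 0 <= v < self.length:
            return self._elements[v]
        return WRONG

def field(o, n):
    # field(F, n) = F; otherwise delegate to the object
    return WRONG if o is WRONG else o.field(n)

def index(a, v):
    # index(F, v) = index(a, F) = F
    if a is WRONG or v is WRONG:
        return WRONG
    return a.index(v)

# global: a finite partial function from identifiers to graph nodes
global_map = {}   # e.g. {"main": ObjectNode({"x": Location()})}
```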
2.2. States

A state is a partial function σ ∈ Σ = (L ⇀ V). I'll consider each state to be equal to its graph, i.e. to the set of pairs that defines it. The domain of a partial function f is written δ(f). Two partial states are compatible exactly if they agree on the intersection of their domains:

    σ0 ⌣ σ1  =  ∀l ∈ δ(σ0) ∩ δ(σ1) · σ0(l) = σ1(l) .

Using the pun between states and their graphs, σ0 ⌣ σ1 exactly if σ0 ∪ σ1 is a state. For a location l and value v, let l ↦ v be the state such that δ(l ↦ v) = {l} and (l ↦ v)(l) = v. Using the pun between states and their graphs, we have (l ↦ v) = {(l, v)}.

If σ0 and σ1 are states, then σ0 σ1 is a state such that

    δ(σ0 σ1) = δ(σ0) ∪ δ(σ1)
    (σ0 σ1)(l) = σ0(l), if l ∈ δ(σ0)
    (σ0 σ1)(l) = σ1(l), if l ∉ δ(σ0) and l ∈ δ(σ1) .

3. EXPRESSIONS

We use two semantic functions for expressions: For lvalue expressions E, we have [E]@ ⊆ Σ × (O ∪ L ∪ A ∪ {F}). For all expressions E of primitive type (int, real, etc.), we have [E]$ ⊆ Σ × (V ∪ {F}). If (σ, x) is in [E]@ or [E]$, then evaluating E in any state σ0 ⊇ σ may yield result x, while reading exactly the locations in δ(σ). For example, [a[i]]@ contains (l0 ↦ 0, l1), where l0 is the location of i and l1 is the location of the first element of array a; it also contains (l0 ↦ −1, F); it does not contain (l0 ↦ 0 ∪ l1 ↦ 13, l1), as location l1 is not accessed in computing the location. Now [a[i]]$ contains (l0 ↦ 0 ∪ l1 ↦ 13, 13) as well as (l0 ↦ −1, F).

The defining equations for []@ and []$ are quite straightforward; I provide only a few examples here. (The notation {v ∈ S | P · E} means the set that contains exactly those things that equal E for some value v ∈ S such that P. For example, {n ∈ ℕ | n is prime · n²} is the set of squares of prime natural numbers: {4, 9, 25, 49, ...}. {v ∈ S · E} means {v ∈ S | true · E}, and {v ∈ S | P} means {v ∈ S | P · v}.)

    [n]@ = {(∅, global(n))}, where n ∈ δ(global)

    [E.n]@ = {(σ, o) ∈ [E]@ · (σ, field(o, n))}

    [E[F]]@ = {(σ0, a) ∈ [E]@, (σ1, i) ∈ [F]$ | σ0 ⌣ σ1 · (σ0 ∪ σ1, index(a, i))}

    [E]$ = {(σ, l) ∈ [E]@, v ∈ V_E | l ∈ L ∧ σ ⌣ (l ↦ v) · (σ ∪ (l ↦ v), v)}
           ∪ {(σ, F) ∈ [E]@ · (σ, F)}

The last equation applies only when E is an lvalue expression of some primitive type; V_E is the subset of V appropriate to E's type. Nondeterministic expressions can be handled by this approach, but not expressions that change the state.
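The following Python sketch (my own encoding, not the paper's formalism) computes finite approximations of [a[i]]@ and [a[i]]$ over a tiny value universe, mirroring the a[i] example above; partial states are represented as frozensets of (location, value) pairs.

```python
# Hypothetical sketch (assumed encoding): the semantic functions for the
# lvalue expression a[i] and its rvalue reading, as sets of
# (partial state, result) pairs over a small universe of values.

WRONG = "F"                      # the "gone wrong" value
VALUES = range(-1, 3)            # a small stand-in for the primitive values V

L_I = "l0"                       # location holding i
A_ELEMS = ["l1", "l2"]           # locations of the elements of array a

def index(a_elems, v):
    """index(a, v): an element location for in-range v, otherwise F."""
    return a_elems[v] if isinstance(v, int) and 0 <= v < len(a_elems) else WRONG

def lvalue_a_i():
    """[a[i]]@ : only l0 is read; the result is a location of a (or F)."""
    return {(frozenset({(L_I, v)}), index(A_ELEMS, v)) for v in VALUES}

def rvalue_a_i():
    """[a[i]]$ : extend each non-wrong lvalue result with a read of that location."""
    result = set()
    for (sigma, loc) in lvalue_a_i():
        if loc == WRONG:
            result.add((sigma, WRONG))
        else:
            # the element locations differ from l0 here, so the compatibility
            # check σ ⌣ (l ↦ v) holds trivially
            for v in VALUES:
                result.add((sigma | {(loc, v)}, v))
    return result

# (frozenset({("l0", 0)}), "l1") and (frozenset({("l0", -1)}), "F") are in
# lvalue_a_i(); (frozenset({("l0", 0), ("l1", 13)}), 13) would be in
# rvalue_a_i() if 13 were in VALUES.
```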
4. COMMANDS

The meaning of each command C is a set of traces (or sequences) of actions, [C]!. We start with a discussion of traces.

4.1. Traces

Given an alphabet of symbols P, P* is the set of all finite traces over P, while Pω is the set of infinite traces. Let P∞ = P* ∪ Pω. We view traces as functions from initial segments of ℕ. The length #s of a trace s is |δ(s)|. The catenation s; t of traces s and t in P∞ is standard; in particular, s; t = s if s ∈ Pω.

A trace set S is any subset of P∞. For trace sets S and T: S; T is the elementwise catenation of trace sets, S* is the Kleene closure of S, Sω is the infinite catenation of S with itself, and S∞ = S* ∪ Sω.

For two traces s and t, we define the fair merge s || t of the traces as the trace set generated by breaking s and t into any number of finite pieces that are then interleaved. The fair merge of two trace sets S and T is S || T = ⋃_{s∈S, t∈T} s || t.

As usual in formal language theory, I will pun between a symbol p, the trace ⟨p⟩, and the trace set {⟨p⟩}, when there is no ambiguity; for example, p; q = ⟨p⟩; ⟨q⟩ and p* = {⟨p⟩}*.

4.2. Actions

Our symbol set P is a set of actions, borrowed in the main from [1]. We describe actions informally here and formally in Section 4.6. In each case σ is a state.

• start(σ, W), where W is a set of locations. This action is enabled in states compatible with σ. It marks the domain of σ as "being read" and marks all locations in W as "being written".

• fin(σ, R), where R is a set of locations. This action is always enabled. It updates the state according to σ. The domain of σ is unmarked as "being written" and the set R is unmarked as "being read".

• chaos(σ). This action is enabled in states compatible with σ. It is completely nondeterministic.

4.3. Command semantics

Now we are ready to give the trace sets associated with sequential commands. These are given in Figure 1. A crucial command is assignment; see (2). If either expression evaluates to F, the trace is simply a chaos action. For other states, the trace consists of a start action and a fin action with appropriate marking and unmarking of locations. Sequential composition (3) is reflected by catenation of trace sets. 'If' commands (4) and 'while' commands (5) are defined with the help of the filter function. Parallel composition is simply a fair merge of the trace sets (6).

    [E0 := E1]! = {(σ0, l) ∈ [E0]@, (σ1, v) ∈ [E1]$ | σ0 ⌣ σ1 ∧ l ≠ F ∧ v ≠ F ·
                       start(σ0 ∪ σ1, {l}); fin(l ↦ v, δ(σ0) ∪ δ(σ1))}
                  ∪ {(σ0, l) ∈ [E0]@, (σ1, v) ∈ [E1]$ | σ0 ⌣ σ1 ∧ (l = F ∨ v = F) · chaos(σ0 ∪ σ1)}   (2)

    [C0 C1]! = [C0]!; [C1]!   (3)

    filter(E) = {σ | (σ, true) ∈ [E]$ · start(σ, ∅); fin(∅, δ(σ))} ∪ {σ | (σ, F) ∈ [E]$ · chaos(σ)}

    [(if E C0 else C1 if)]! = filter(E); [C0]! ∪ filter(¬E); [C1]!   (4)

    [(wh E C wh)]! = (filter(E); [C]!)*; filter(¬E) ∪ (filter(E); [C]!)ω   (5)

    [(co C0 || C1 co)]! = [C0]! || [C1]!   (6)

    enter(k, E) = (try({k})∞; acq(k); filter(¬E); rel(k))∞; try({k})∞; acq(k); filter(E)
    enter(k) = try({k})∞; acq(k)

    [(with E0 when E1 C with)]! = ⋃_{(σ,k) ∈ [E0]@, k ≠ F} start(σ, ∅); fin(∅, δ(σ)); enter(k, E1); [C]!; rel(k)
                                  ∪ {(σ, k) ∈ [E0]@ | k = F · chaos(σ)}   (7)

    [E.n()]! = ⋃_{(σ,o) ∈ [E]@, o ≠ F} start(σ, ∅); fin(∅, δ(σ)); rel(a(o, n)); enter(d(o, n))
               ∪ {(σ, o) ∈ [E]@ | o = F · chaos(σ)}   (8)

    [(accept n() C accept)_o]! = enter(a(o, n)); [C]!; rel(d(o, n))   (9)

    Fig. 1. Semantics for commands
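The trace-set operators of Section 4.1 that Figure 1 relies on can be given a small executable reading for finite traces; the sketch below is my own illustration (restricted to finite traces), with traces represented as Python tuples of action labels.

```python
# Hypothetical sketch (finite traces only): catenation and fair merge of
# trace sets, with traces represented as tuples of action labels.

def cat(S, T):
    """Elementwise catenation S; T of two sets of finite traces."""
    return {s + t for s in S for t in T}

def fair_merge(s, t):
    """All interleavings of the finite traces s and t that preserve
    the order of each; this is the finite case of s || t."""
    if not s:
        return {t}
    if not t:
        return {s}
    return ({(s[0],) + m for m in fair_merge(s[1:], t)} |
            {(t[0],) + m for m in fair_merge(s, t[1:])})

def merge_sets(S, T):
    """Fair merge of two trace sets: the union of s || t over all s and t."""
    return {m for s in S for t in T for m in fair_merge(s, t)}

# Example: two one-assignment traces, each a start action followed by a fin action.
s = ("start1", "fin1")
t = ("start2", "fin2")
print(sorted(fair_merge(s, t)))
# Includes ("start1", "start2", "fin2", "fin1"): the overlapping execution
# that the start/fin marking discipline of Section 4.6 checks for races.
```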
4.4. Lock actions

In HARPO/L, objects implementing the Lock interface are locks for conditional critical sections. At run time each lock has two states: locked and unlocked. The following actions affect locks:

• try(K) is a failed attempt to acquire a lock in the set K. This action is enabled when all locks in K are locked. It has no effect.

• acq(k) is a successful acquisition of lock k. This action is enabled when lock k is unlocked; it locks k.

• rel(k) is the release of lock k. This action is always enabled; it unlocks k.

4.5. Interprocess synchronization

HARPO/L has two mechanisms for interprocess synchronization: conditional critical sections and rendezvous.

Conditional critical sections, also known as 'with' commands, are shown in (7). Except for the need to compute the location of the lock object, this is essentially the same as in [1], [3], and [4].

A rendezvous is an asymmetric communication mechanism [5]. Simultaneously, the client executes a method call E.n() while the server executes an 'accept' command, which in its simplest form is (accept n() C)_o. (Recall that classes have been instantiated to objects at an earlier stage. We assume that, as part of instantiation, each 'accept' command has been tagged with the object o that it appears in.) The server must wait at an accept until there is a client ready to rendezvous. The client must wait until the server is ready to execute the 'accept' command. The semantics for this simplest form of rendezvous is shown in (8–9). There are two initially locked locks associated with the method named n in object o: a(o, n) is used for the server to wait for the client, while d(o, n) is used for the client to wait for the server. (A lock-level sketch of this handshake appears at the end of this subsection.)

In their full generality, 'accept' commands are fairly complex. Space does not permit a full exposition in the present paper, although the details are worked out in the full semantic description for the language. The full 'accept' command generalizes the simplest form in three ways.

• 'Accept' commands can have both input and output parameters. To handle these, a protocol using up to five locks is used: the server waits for the client, the client waits before copying input parameters, the server waits before executing the method body, the client waits before copying output parameters, and the server waits until the parameters are copied.

• 'Accept' commands can have multiple branches, so that multiple methods can be implemented by the same 'accept' command. In that case the server first waits on all a locks associated with the various methods. This is why the 'try' action takes a set of locks.

• 'Accept' commands can have boolean guard expressions. These are evaluated first and govern which a locks are waited on.
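As an operational analogy for the two-lock handshake of (8–9), the following sketch (my own, using Python threads, with a blocking acquire standing in for enter(k) = try({k})∞; acq(k)) mirrors the client and server traces; the locks a and d correspond to a(o, n) and d(o, n).

```python
# Hypothetical sketch of the simple rendezvous of (8)-(9), assuming Python's
# threading.Lock as a stand-in for HARPO/L lock objects.
import threading

a = threading.Lock()   # a(o, n): the server waits here for a client
d = threading.Lock()   # d(o, n): the client waits here for the server
a.acquire()            # both locks start out locked, as in the semantics
d.acquire()

def client():
    # [E.n()]! = ...; rel(a(o, n)); enter(d(o, n))
    print("client: calling o.n()")
    a.release()        # signal the server that a client is ready
    d.acquire()        # wait until the server has finished the accept body

def server():
    # [(accept n() C accept)_o]! = enter(a(o, n)); [C]!; rel(d(o, n))
    a.acquire()        # wait for a client
    print("server: running the accept body C")
    d.release()        # let the client continue

threads = [threading.Thread(target=server), threading.Thread(target=client)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Afterwards both locks are locked again, ready for the next rendezvous.
```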
4.6. The semantics of traces

We supply a meaning to each trace by means of an operational semantics. In effect we define a nondeterministic state machine for traces. Each trace then defines a binary relation on states of the machine. We call the states of this machine configurations, as the word 'state' is already in use for the data state. Each configuration is a 5-tuple (σ, R, W, K, ok), where σ is a total state, R is a multiset of locations currently being read, W is a set of locations currently being written, K is a set of locks currently locked, and ok is a boolean indicating that the computation has not gone wrong. The R, W, and ok fields are purely fictions; there is no need to actually represent them at run time.

Each action p defines a binary relation ⇝p on configurations. The meaning of each action is given in Figure 2. For a given configuration x and action p there are three possibilities.

• There exists no y such that x ⇝p y. Then we say that the action is not enabled.

• For all y, x ⇝p y. In this case the computation has gone wrong and anything can happen.

• Somewhere in between.

From (10), you can see that start actions are only enabled in configurations where the state is compatible with the partial state of the action. Then there are two possibilities. If the action requires a read of a location currently being written, or a write of a location currently being read or written, then the computation has gone wrong and any configuration could be next. Otherwise, the locations are marked and the computation proceeds. The fin actions (11) are a bit simpler; they simply modify part of the state and release the read and write claims made in the corresponding start operation. The other actions are straightforward.

    start(σ, W):
        (σ0, R0, W0, K0, true) ⇝ (σ0, δ(σ) ⊎ R0, W0 ∪ W, K0, true),
            if σ ⌣ σ0, δ(σ) ∩ W0 = ∅, and W ∩ (W0 ∪ R0) = ∅
        (σ0, R0, W0, K0, true) ⇝ (σ1, R1, W1, K1, ok1),
            if σ ⌣ σ0 and (δ(σ) ∩ W0 ≠ ∅ or W ∩ (W0 ∪ R0) ≠ ∅)   (10)

    fin(σ, R):
        (σ0, R0, W0, K0, true) ⇝ (σ σ0, R0 − R, W0 − δ(σ), K0, true)   (11)

    chaos(σ):
        (σ0, R0, W0, K0, true) ⇝ (σ1, R1, W1, K1, ok1), if σ ⌣ σ0

    try(K):
        (σ0, R0, W0, K0, true) ⇝ (σ0, R0, W0, K0, true), if K ⊆ K0

    acq(k):
        (σ0, R0, W0, K0, true) ⇝ (σ0, R0, W0, K0 ∪ {k}, true), if k ∉ K0

    rel(k):
        (σ0, R0, W0, K0, true) ⇝ (σ0, R0, W0, K0 − {k}, true)

    any action p:
        (σ0, R0, W0, K0, false) ⇝ (σ1, R1, W1, K1, ok1)

    Fig. 2. Semantics for actions

We can extend the relation to finite sequences as the least relation such that x ⇝ε x and x ⇝p;s z if there exists y such that x ⇝p y and y ⇝s z. We extend it to infinite sequences by saying that x ⇝s z exactly if there is an infinite computation x ⇝s(0) y1 ⇝s(1) y2 ⇝s(2) y3 ⋯. Now, for a trace set S, x ⇝S y exactly if ∃s ∈ S · x ⇝s y. Each command C gives a relation ⇝[C]!.
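To make the rules of Figure 2 concrete, here is a small executable reading (my own sketch, with an assumed encoding of configurations as Python tuples; the sentinel ANY stands for "any configuration may follow", i.e. the computation has gone wrong).

```python
# Hypothetical sketch of Figure 2: a configuration is (sigma, R, W, K, ok)
# with sigma a dict, R a multiset (Counter), W and K sets. step() returns a
# list of successors, [] when the action is not enabled, or the sentinel ANY
# when every configuration is a possible successor.
from collections import Counter

ANY = "any configuration"

def compatible(partial, sigma):
    """σ ⌣ σ0: the partial state agrees with σ0 on their common domain."""
    return all(sigma.get(l) == v for l, v in partial.items())

def step(config, action):
    sigma0, R0, W0, K0, ok = config
    if not ok:
        return ANY                                   # gone wrong: anything can happen
    kind, *args = action
    if kind == "start":
        partial, W = args
        if not compatible(partial, sigma0):
            return []                                # not enabled
        if set(partial) & W0 or W & (W0 | set(R0)):
            return ANY                               # data race: gone wrong
        return [(sigma0, R0 + Counter(partial.keys()), W0 | W, K0, True)]
    if kind == "fin":
        partial, R = args
        new_sigma = {**sigma0, **partial}            # update according to σ
        return [(new_sigma, R0 - Counter(R), W0 - set(partial), K0, True)]
    if kind == "chaos":
        (partial,) = args
        return ANY if compatible(partial, sigma0) else []
    if kind == "try":
        (K,) = args
        return [config] if K <= K0 else []
    if kind == "acq":
        (k,) = args
        return [(sigma0, R0, W0, K0 | {k}, True)] if k not in K0 else []
    if kind == "rel":
        (k,) = args
        return [(sigma0, R0, W0, K0 - {k}, True)]
    return []

# Example: start a write of l1 while reading l0 = 0 in a race-free configuration:
# step(({"l0": 0, "l1": 5}, Counter(), set(), set(), True),
#      ("start", {"l0": 0}, {"l1"}))
```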
5. RELATED AND FUTURE WORK

The work presented here is strongly based on that in [1]. I note the following differences:

• HARPO/L is a more complex language than that in [1]. It has objects, rendezvous, and expressions that can go wrong.

• HARPO/L allows concurrent read operations. In [1], partial states are used in the configuration, with the missing part of the domain indicating which locations are marked as unusable. I could not adopt this approach, as there are two ways in which locations can be unusable. This leads to the use of the R and W (multi-)sets.

• I use a different approach to computations gone wrong. Whereas [1] uses a special wrong value to indicate computations gone wrong, I use chaotic computations, which leads to a simpler formulation.

• I use a different approach to infinite computations. Whereas [1] uses a special trace ⊥ to indicate infinite computations, I use infinite traces. This leads to a simpler formalization.

• In [1] the trace is part of the configuration. This seemed unnecessary to me, as only the first action is actually used.

• I use a try action that tries multiple locks. This change is not necessary; the same effect can be had by interleaving actions that each try only one lock. The motivation is a cleaner approach to accept statements with multiple branches.

The work here shows that with only moderate extensions the approach of [1] can be extended to a more complex language.

Two closely related approaches have been developed by Brookes. In [3], actions indicate individual reads and writes. In [4], state changes from a number of statements are combined; thus purely sequential parts of the computation are given essentially the semantics that they would be given in a two-state sequential model. This latter approach is intriguing and may form the basis of future work on the HARPO/L semantics.

Future work will include an investigation of the equivalence of traces and how it applies to the safety of optimizations. Two traces s and t are equivalent, s ≡ t, iff for all x, y, and u, (x ⇝s||u y) = (x ⇝t||u y). This notion of equivalence can be applied to program optimization as follows. Let C and D be two commands; then C is refined by D, written C ⊑ D, iff ∀s ∈ [D]! · ∃t ∈ [C]! · s ≡ t. Then a source-level program optimization f is safe on C if C ⊑ f(C). A similar idea can be applied to low-level optimizations, as long as the effect on the trace sets is clear. This notion of refinement is entirely local and does not consider the context of the command.

The intermediate representation used in our compiler is a form of dataflow graph. By giving a grainless semantics to these graphs, we may be able to at least informally verify the stage of the compiler that produces dataflow graphs.

6. REFERENCES

[1] John C. Reynolds, "Towards a grainless semantics for shared-variable concurrency," in Proc. 24th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2004), 2005, vol. 3328 of Lecture Notes in Computer Science, pp. 35–48.

[2] Theodore S. Norvell, "HARPO/L: A language for hardware/software codesign," in Newfoundland Electrical and Computer Engineering Conference (NECEC), 2008.

[3] Stephen Brookes, "A semantics for concurrent separation logic," Theoretical Computer Science, vol. 375, no. 1–3, pp. 227–270, May 2007.

[4] Stephen Brookes, "A grainless semantics for parallel programs with shared mutable data," Electronic Notes in Theoretical Computer Science, vol. 155, pp. 277–307, 2006, Proceedings of the 21st Annual Conference on Mathematical Foundations of Programming Semantics (MFPS XXI).

[5] Jean D. Ichbiah, Bernd Krieg-Brueckner, Brian A. Wichmann, John G. P. Barnes, Olivier Roubine, and Jean-Claude Heliard, "Rationale for the design of the Ada programming language," SIGPLAN Not., vol. 14, no. 6b, pp. 1–261, 1979.