Titanium Language Reference Manual
Titanium Language Reference Manual
Titanium Language Reference Manual
Version 2.20
August, 2006
Abstract
The Titanium language is a Java dialect for high-performance parallel scientific computing.
Titanium’s differences from Java include multi-dimensional arrays, an explicitly parallel
SPMD model of computation with a global address space, a form of value class, and
zone-based memory management. This reference manual describes the differences between
Titanium and Java.
Contents
1 Lexical Structure 2
2 Program Structure 3
5 Immutable Classes 18
iii
6.3 Region-Based Memory Allocation . . . . . . . . . . . . . . . . . . . . . . . 27
6.3.1 Shared and Private Regions . . . . . . . . . . . . . . . . . . . . . . 27
6.3.2 Details of Region-Based Allocation Constructs . . . . . . . . . . . . 29
7 Templates 32
7.1 Instantiation Denotations . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
7.2 Template Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
7.3 Names in Templates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
7.4 Template Instantiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
7.5 Name Equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
8 Operator Overloading 35
9 Processes 37
9.1 Process teams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
9.2 Interprocess Communication . . . . . . . . . . . . . . . . . . . . . . . . . . 37
9.2.1 Exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
9.2.2 Broadcast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
9.3 Barriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
9.4 Checking Global Synchronization . . . . . . . . . . . . . . . . . . . . . . . 39
9.4.1 Single-valued Expressions . . . . . . . . . . . . . . . . . . . . . . . 41
9.4.2 Restrictions on Statements with Global Effects . . . . . . . . . . . . 43
9.4.3 Restrictions on Control Flow . . . . . . . . . . . . . . . . . . . . . . 43
9.4.4 Restrictions on Methods and Constructors . . . . . . . . . . . . . . 45
9.5 Consistency of Shared Data . . . . . . . . . . . . . . . . . . . . . . . . . . 45
9.6 Static Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
iv
12.7 Timer class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
12.8 PAPI Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
12.9 ti.lang.Ti . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
12.10Additional properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
12.11java.lang.Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
12.12java.lang.Math . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
12.13java.lang.Boolean, java.lang.Number and Subclasses . . . . . . . . . . . . . 69
12.14Polling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
12.15Sparse Titanium Array Copying . . . . . . . . . . . . . . . . . . . . . . . . 72
12.15.1 Copy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
12.15.2 Gather . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
12.15.3 Scatter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
12.16Non-blocking Array Copying . . . . . . . . . . . . . . . . . . . . . . . . . . 75
12.17Bulk I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
12.17.1 Bulk I/O for Titanium Arrays . . . . . . . . . . . . . . . . . . . . . 78
12.17.2 Bulk I/O for Java Arrays in Titanium . . . . . . . . . . . . . . . . 79
12.18Controlling Garbage Collection . . . . . . . . . . . . . . . . . . . . . . . . 81
14 Handling of Errors 83
14.1 Erroneous Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
A Notes 84
A.1 On Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
A.2 Advantages of Regions [David Gay] . . . . . . . . . . . . . . . . . . . . . . 86
A.3 Disadvantages of Regions [David Gay] . . . . . . . . . . . . . . . . . . . . 86
Index 88
v
Preface
This document informally describes our current design for the Titanium language. It is in
the form of a set of changes to Java. Unless otherwise indicated, the reader may assume
the syntax and semantics of Java 2, version 1.4. Currently, however, the standard Java
packages provided in Titanium officially comprise a subset of the standard version 1.0 Java
library.
Acknowledgments. Many people have contributed to the Titanium project over the
years, both as implementors and commentators. We especially wish to thank Peter Mc-
Corquodale, Phillip Colella, and Tong Wen (of the Lawrence Berkeley National Laboratory)
for their many comments. Johnathon Jamison has also contributed numerous comments
in the process of implementing an alternative front end for Titanium as part of the Harmo-
nia project. Former members of the Titanium group who have contributed to this report
include Alexander Aiken, Andrew Begel, Joe Darcy, and Luigi Semenzato.
This work was supported in part by the Advanced Research Projects Agency of the
Department of Defense under contract F30602-95-C-0136, the National Science Foundation
under grant ACI-9619020, and the Department of Energy under contracts W-7405-ENG-48
and DE-AC03-765F00098. We would also like to thank Sun Microsystems and Microsoft
for their support of related parallel-compiler work. The information presented here does
not necessarily reflect the views of Sun Microsystems, Inc., Microsoft, Inc., or the position
or policy of the Government, and no official endorsement should be inferred.
1
Chapter 1
Lexical Structure
2
Chapter 2
Program Structure
Types introduced by Titanium are contained in the package ti.lang (‘Ti’ being the stan-
dard chemical symbol for Titanium). There is an implicit declaration
import ti.lang.*;
3
Chapter 3
3.1 Points
The immutable class Point<N>, for N a compile-time positive integer constant, is a tuple
of N int’s. Point<N>s are used as indices into N-dimensional arrays.
4
7. The assignment operators +=, -=, *=, /=, applied to a Point<N>s on the left and
another Point<N> or a scalar on the right act like the assignment operators on
scalars in standard Java. That is,
p ⊕= E
8. If R is any of <, >, <=, >=, or ==, then p0 R p1 if p0 [i] R p1 [i] for all 1 ≤ i ≤ N.
The expression p0 !=p1 is equivalent to !(p0 ==p1 ). The member function equals on
Point<N>s is the same as ==
9. The expressions p0 .lowerBound (p1 ) and p0 .upperBound (p1 ) yield p such that
p[i] is the minimum (respectively, maximum) of p0 [i] and p1 [i].
10. The expression p0 .permute (p1 ), is the Point<N>, p, for which p[p1 [i]] = p0 [i],
where p1 is a permutation of 1, . . . , N.
11. The expression p.replace (k, v) returns a Point<N> that is identical to p except
that p[k] = v.
5
it was created1 . Library methods that yield Domain<N>s are not guaranteed to produce
new values (i.e., values that do not compare == to previously created Domain<N>s). That
is, the implementation is free to take advantage of the fact that there are no methods that
mutate Domain<N>s, and to re-use any Domain<N> with the properties otherwise required
of the specified result.
2. RD.isRectangular() is true for all RectDomains and for any Domain that is rect-
angular.
and to get the subset of this set in which all coordinates are even we may write
[ i1 : k1 : s1 , . . . , iN : kN : sN ]
is the same as
1
As a result, applying methods to a Domain<N > can carry unexpected performance penalties if the
programmer does not pay attention to locality. Hence, recommended practice is to make a local copy of a
Domain<N > using its copy constructor before invoking a sequence of operations upon it.
6
[ [ i1 , . . . , iN ] : [ k1 , . . . , kN ] : [ s1 , . . . , sN ] ].
and
[ i1 : k1 , . . . , iN : kN ]
is the same as
[ [ i1 , . . . , iN ] : [ k1 , . . . , kN ] ].
7. The expression RD.min() yields a Point<N> such that RD.min()[k] is the min-
imum over all p in RD of p[k]. Likewise for RD.max(). For empty domains,
RD.min() and RD.max() are not defined.
9. RD.boundingBox() yields the smallest RectDomain<N> that contains all the points
of RD and is either empty or unit-strided. For RectDomains R with unit stride,
R.boundingBox()=R.
11. RD.size() is the cardinality of RD (the number of Points it contains). The predicate
RD.isEmpty() is true iff RD.size() = 0.
12. RD.permute(p1 ) is the domain consisting of all points p.permute(p1 ) for p in RD.
It is a RectDomain<N> if RD is a RectDomain<N>. Otherwise it is a Domain<N>.
13. The operations +, -, and * are defined between any combination of RectDomain<N>s
and Domain<N>s. They stand for union, difference, and intersection of the sets of
elements in the operand domains. The intersection of two RectDomain<N>s yields a
RectDomain<N>. All other operations and operands yield Domain<N>s.
7
14. For a Point<N>, p, the expressions RD+p, RD-p, RD*p, and RD/p compute the
domains
{d|d = d0 ⊕ p, for some d0 ∈ RD}
where ⊕ = +, -, *, or integer division rounded toward −∞ (unlike ordinary integer
division, which rounds toward 0). Divisions require that each p[i] be non-zero, and,
if RD is a RectDomain, that each RD.stride()[i] either be divisible by |p[i]| or less
than |p[i]|; it is an error otherwise. If RD is a RectDomain, so is the result.
RD += p; RD -= p; RD *= p; RD /= p;
RD ⊕= V
assigns the value of RD ⊕ V to RD, evaluating the location of RD exactly once.
Only the particular combinations shown are legal, since others would involve type
errors.
16. The operations <, >, <=, and >= are defined between RectDomains and between
Domains and represent set comparisons (< is strict subset, etc.).
The function equals is defined as set equality on all combinations of Domain<N> and
RectDomain<N>. When the arities do not match, the result will always be false, even
when the operands denote empty sets.
Between RectDomains, R1 ==R0 and R1 !=R2 are synonyms for R1 .equals(R2 ) and
!R1 .equals(R2 ). The expression RD.equals(x) is true iff x is a Domain<N> that
contains the same set of points as RD; it is false when x is any other reference value.
As a result, its value is false if x is null, or does not denote a Domain<N> (so that
again, even if RD and x both denote empty domains, they are not equal if their
arities do not match).
17. The expression RD.contains(p), for Point<N> p, is true iff p ∈ RD.
18. The expression R.accrete(k, dir , s) gives the result of adding elements to R on
the side indicated by direction dir , expanding it by k ≥ 0 strides of size s > 0 (all
arguments but R are integers). That is, it computes
8
and returns the result (always rectangular) as a RectDomain. It is an error if R could
not have been constructed with a stride of s in direction dir, so that the resulting set
of elements would not be a RectDomain. It follows that accrete is the identity on
empty RectDomain<N>s. The argument s may be omitted; it defaults to 1. Requires
that 1 ≤ |dir| ≤ N.
19. The expression R.accrete(k, S) gives the result of expanding R on all sides by
k ≥ 0, as for the value V computed by the sequence
The argument S is a Point with the same arity as R. It may be omitted, in which
case it defaults to all(1).
20. The expression R.shrink(k, dir ) gives the result of shrinking R on the side in-
dicated by direction dir by k ≥ 0 strides of size R.stride()[|dir |] That is, it
computes
21. The expression R.shrink(k) gives the result of shrinking R on all sides by k ≥ 0
strides, as for the value V computed by the sequence
where all arguments are of type int with k ≥ R.stride()[|dir |] and 0 < |dir| ≤ N.
When R has unit stride in direction dir , this consists of the k-thick layer of index
positions on the side of R indicated by dir , shifted by shift positions in direction dir .
Thus,
R.border(1, dir , 0)
is the layer of cells on the face of and interior to R in direction dir , while
R.border(1, dir , 1)
is the layer of cells just over that face (outside the interior of R). As shorthand,
9
R.border(k, dir ) = R.border(k, dir, 1)
and
23. The expression R.slice(k), where 0 < k ≤ N, and N > 1 is a RectDomain<N − 1>
consisting of the set
30. The type Domain<N> has a copy constructor: new Domain<N>(D) creates a new
local copy of D. Likewise, new (W ) Domain<N>(D) will create a local copy of D
in region W .
foreach (p in RD) S
or equivalently
10
foreach (Point<n> p in RD) S
for RD any kind of n-dimensional domain, executes S repeatedly, binding p to the points
in RD in some unspecified order. The control variable p is a local variable whose scope
is confined to S, and it is constant (final) within its scope. As for other local variables
in Java, you may not use a control variable name that shadows another local variable
(including formal parameters and other control variables). The control constructs break
and continue function in foreach loops analogously to other loops.
11
Chapter 4
4.1 Syntax
Titanium modifies Java syntax to allow for grid types (§4.3), and the qualifiers local (§6.1),
nonshared (§6.2), polyshared (§6.2) and single (§9.4.1).
Type:
QualifiedBaseType ArraySpecifiersopt
QualifiedBaseType:
BaseType Qualifiersopt
ArraySpecifiers:
ArraySpecifiers ArraySpecifier
ArraySpecifier
ArraySpecifier:
[] Qualifiersopt
[ IntegerConstantExpression d ] Qualifiersopt
Qualifiers:
Qualifier Qualifiers
Qualifier
12
where PrimitiveType and ClassOrInterfaceType are from the standard Java syntax. The
grouping of qualifiers with the types they modify is as suggested by the syntax: In Quali-
fiedBaseType, the qualifiers modify the base type. Array specifiers apply to their Qualified-
BaseType from right to left: that is, ‘T [1d][2d]’ is “1-D array of (2-D arrays of T).” The
qualifiers in the array specifiers apply to the type resulting from the immediately preceding
array specification. That is, for qualifiers Q1 , Q2 , Q3 ,
Object Q3 [] Q1 [] Q2 V;
4.3 Arrays
A Java array (object) of length N—hereafter called a standard array—is an injective
mapping from the interval of non-negative integers [0, N) to a set of variables, called
the elements of the array. Titanium extends this notion to grid arrays or grids; an N-
dimensional grid is an injective mapping from a RectDomain<N> to a set of variables. As
in Java, the only way to refer to any kind of array object—standard or grid—is through
a pointer to that object. Each element of a standard array resides in (is mapped to from)
precisely one array1 . The elements of a grid, by contrast, may be shared among any number
of distinct grids. That is, even if the array pointers A and B are distinct, A[p] and B[p]
may still be the same object. We say then that A and B share that element.
It is illegal to cast grid values to the type Object. However, as for other normal objects,
the value null is a legal value for a grid variable, and it is legal to compare grid quantities
of the same arity using == or !=. If A and B are two grids of the same arity, then A==B
1
This does not mean that there is only one name for each element. If X and Y are arrays (of any kind),
then after ‘X=Y;’, X[p] and Y[p] denote the same variable. That is an immediate consequence of the fact
that arrays are always referred to through pointers.
13
iff A.domain()==B.domain() and for every p in their common domain, A[p] and B[p]
are the same variable. Similarly for !=.
For a non-grid type T , the type “N-dimensional grid with element type T ” is denoted
T [Nd]
For the type “N0 -dimensional grid whose elements are N1 -dimensional grids. . . whose ele-
ments are of type T ,” we write
There is a reason for this irregularity in the syntax (where one might have expected a
more compositional form, in which the dimensionalities are listed in the opposite order).
The idea is to make the order of indices consistent between type designations, allocation
expressions, and array indexing. Thus, given
A[p1][p2]
14
4.3.1 Array Operations
In the following, take p, p0 , . . . to be of type Point<N>, A to be a grid of some type T
[Nd], and D to be a RectDomain<N>.
3. The expression A.copy(B) copies the contents of the elements of B with indices
in A.domain()*B.domain() into the elements of A that have the same index. It is
illegal if A and B do not have the same grid type. It is also illegal if A is a global
array whose element type has local qualification (it is easy to construct instances in
which such a copy would be unsound). Finally, it is illegal if B resides in a private
region, A resides in a shared region (see §6.3), and the elements of B have a non-
atomic type (see §4.2). (This last provision is a conservative restriction to prevent
situations where objects allocated in shared regions contain references to objects in
private regions.) It is legal for A and B to overlap, in which case the semantics are
as if B were first copied to a (new) temporary before being copied to A. See also
the operations for sparse-array copying described in §12.15, and the operations for
non-blocking array copying described in §12.16.
15
(e) A.slice(k, j) produces the grid, B, with domain A.domain().slice(k) such
that
B[[p1 , . . . , pk−1, pk+1 , . . . , pn ]] = A[[p1 , . . . , pk−1 , j, pk+1 , . . . , pn ]].
It is an error if any of the index points used to index A are not in its domain,
or if A.arity≤ 1.
(f) A.permute(p), where p is a permutation of 1, . . . , N, produces a grid, B, where
A[i1 , . . . , iN ] aliases B[ip[1] , . . . , ip[N ] ].
9. A.isContiguous() is true iff all elements of A are adjacent to each other in vir-
tual memory. This property has no semantic significance, but can be important for
tuning performance. Arrays are created fully contiguous and in row-major order4 .
Operations that produce views of subsets of the data can result in non-contiguous
results.
10. Because of the close relationship between grids and their domains, a number of com-
mon idioms involve operations on both. As a convenience, several methods capture
these idioms5 :
16
11. The I/O operations described in §12.17.1.
1. Formal grid parameters to a method or constructor (including the implicit this in the
case of methods defined on grids) may not overlap unless otherwise specified—that
is, given two formals F1 and F2 , none of the variables F1 [p1 ] may be the same as
F2 [p2 ]. It is an error otherwise.
is illegal if A is a global array whose element type has local qualification. It is also illegal
if A resides in a private region, B resides in a shared region (see §6.3), and the elements
of A have a non-atomic type (see §4.2).
17
Chapter 5
Immutable Classes
Immutable classes (or immutable types) extend the notion of Java primitive type to classes.
An immutable class is a final class that is not a subclass of Object and whose non-static
fields are final. It is declared with the usual class syntax and an extra modifier
1. Non-static fields are implicitly final, but need not be explicitly declared as final.
Because all immutable classes are final and are not subtypes of Object, they may
never have an extends or implements clause.
2. Because immutable classes are not subclasses of any other class, their constructors
may not call super() explicitly, and, contrary to the usual rule for Java classes other
than Object, do not do so implicitly either.
3. There are no type coercions to or from any immutable class, aside from those defined
on standard types in the Titanium library (see §3.2). The standard classes Point<N>
and RectDomain<N> are immutable.
4. The value null may not be assigned to a variable of an immutable class. Each class
variable, instance variable, or array component with immutable type T is initialized
to the default value for that type. For each process, this default value initially consists
of the value in which all the instance fields are set to their respective default values.
The language inserts a static initializer at the end of the definition of T whose effect
is to evaluate new T () and then set the default value of T for the process to the
result. Initialization of the fields in objects of these types otherwise follows the rules
of Java 1.4.
18
5. Circular chains of immutable types are disallowed. That is, no immutable type may
contain a non-static subcomponent of its own type. Here, a non-static subcomponent
is a non-static field, or a non-static subcomponent of a field, where the field has an
immutable type.
6. By default, the operators == and != for a type T are defined by comparison of the
non-static fields of the operands using == and !=, as if defined
where the Fk are the non-static fields of T in declaration order. (This implies that if
== or != has been overridden on the type of a field, it is the new, overriding definition
that is used for the implicit comparison.) These default definitions of == and != are
not accessible via calls to op== or op!=. Any void finalize() method supplied
with an immutable object is ignored.
7. As a consequence of these rules, it is impossible, except in certain pathological pro-
grams, to distinguish two objects of an immutable class that contain equal fields. The
compiler is free to ignore the possibility of pathology and unbox immutable objects.
8. Nested immutable classes must be static, and may not be local. That is, whereas it
is legal to have an immutable class that is nested immediately inside a class, it is not
legal to nest an immutable class in a block.
10. An immutable class must have a zero-argument constructor with at least as much
access as the class itself. If no such constructor is provided by the user, the compiler
automatically generates a publically accessible zero-argument constructor with an
empty method body.
19
Chapter 6
20
int local i3; /* illegal: int not a reference type */
We refer to a reference type apart from its local/global attribute as its object type.
Several rules prevent programs from treating references to other demesnes as if they
were local:
1. If a field selection r.a or r[a] yields a reference value and r has static global type,
then so does the result of the field selection.
2. A locally qualified variable may only be assigned a value whose type is (statically)
local.
3. If r has a static global type and a is an immutable, non-array field defined for that
type that contains embedded local pointers, then the expression r.a is illegal.
4. If r has a global array type whose element type is a non-array immutable with
embedded local pointers, then the expression r[E] is illegal.
21
5. Coercions between reference types are only legal if their object types obey the usual
Java restrictions on conversion (plus Titanium’s more stringent rules on arrays).
6. A reference value may only be coerced to have a local type (by means of a cast) if
it is null or denotes an object residing in the demesne of the process performing the
coercion. It is an error to execute such a cast otherwise.
7. Coercions from local to corresponding global types are implicit (they extend assign-
ment and method invocation conversion). Global to local coercions must be expressed
by explicit casts.
The term embedded field, as used above, is defined as follows:
1. If class C defines field C.f , then field C.f is embedded within class C.
2. If class C has superclass D, then all fields embedded within class D are also embedded
within class C.
3. If class C has embedded field B.f , and B.f has immutable type I, then fields em-
bedded within immutable class I are also embedded within class C.
All reference classes support the following operations:
r.creator() Returns the identity of the lowest-numbered process whose demesne contains
the object pointed to by r. NullPointerException if r is null. This number may differ
from the identity of the process that actually allocated the object when multiple
processes share a demesne.
r.isLocal() Returns true iff r may be coerced to a local reference. This returns the same
value as r instanceof T local, assuming r to have object type T. On some (but
not all) platforms r.isLocal() is equivalent to r.creator() == Ti.thisProc().
r.regionOf() See §6.3.2.
r.clone() Is as in Java, but with local references inside the object referenced by r set
to null. The result is a global pointer to the newly created clone, which resides in
the same region (see §6.3.2) as the object referenced by r. This operation may be
overridden.
r.localClone() Is the default clone() function of Java (a shallow copy), except that r
must be local. The result is local and in the same region (see §6.3.2) as the object
referenced by r.
To indicate that the special variable this in a method body is to be a local pointer,
label the method local by adding the local keyword to the method qualifiers:
22
public local int length () {...}
Otherwise this is global. If necessary, the value for which a method is invoked is coerced
implicitly to be global. A static method may not be local.
/* a shared object */
Object x;
/* a non-shared object */
Object nonshared y;
23
/* ok: a non-shared object */
Object nonshared b = new Integer nonshared (6);
Reference types can have zero or one sharing qualifier: it is an error to qualify any
type as both nonshared and polyshared. Primitive and immutable types may not be
qualified as either nonshared or polyshared.
6.2.2 Methods
Within a reference class, a sharing qualifier may also appear among the qualifiers for a
non-static method or constructor, in which case it applies to the implicit this parameter
within the method or constructor body. It is an error to add any sharing qualifier to a
static method, immutable method, or immutable constructor. It is an error to add more
than one sharing qualifier to a reference class method.
Two methods that vary in the sharing qualifiers on their formal parameters or on
the implicit this parameter are always considered distinct types for purposes of method
overloading and overriding. A method with return type R1 may override one with return
type R2 only if:
24
1. Within field initialization expressions of reference classes, this is assumed to be
polyshared.
2. If the compiler provides a default constructor (Java spec §8.8.7), then this constructor
is assumed to be polyshared.
4. The null type can be converted to any shared, non-shared, or polyshared reference
type, so the sharing qualification of the null literal is unspecified and irrelevant.
6.2.4 Conversion
Standard Java conversion rules are modified as follows.
All widening reference conversions (Java spec §5.1.4) are restricted to apply only when
at least one of the following conditions hold:
1. the source type and destination type have identical top-level sharing qualifiers (that
is, the sharing qualifiers that apply to the types as a whole—as opposed to the types
of components—are identical).
Widening reference conversion of array types is further constrained to apply only when
the top-level sharing qualifiers of the element types are identical. For example:
25
* (polyshared vs. non-shared) */
Object polyshared [] o4 = new Integer nonshared [];
All narrowing reference conversions (Java spec §5.1.5) are restricted to apply only when
the source type and destination type have identical top-level sharing qualifiers. Narrowing
reference conversion of array types is further constrained to apply only when the top-
level sharing qualifiers of the element types are identical. Informally, one cannot convert
polyshared data back to shared or non-shared data, even using a run time test.
The restriction on narrowing reference conversion does allow casting to a local type. It
merely forbids checked casts that do not simultaneously recover localness.
26
6.2.6 Early Enforcement
Titanium contains two alternative definitions of sharing. Late enforcement is the more re-
laxed of the two, and entails only those restrictions already listed above. Early enforcement
entails the following additional restrictions:
1. Any type that is non-shared or polyshared must also be local; global pointers may
only address shared data.
2. If any use of a named class is (implicitly) qualified as shared, then all fields embedded
within that class must be qualified as shared.
3. If any use of an array type is (implicitly) qualified as shared, then the elements of
the array must be shared as well.
As a result of these additional restrictions, nonshared data is only accessible within the
process that creates it, whereas under late enforcement, nonshared data is available to any
process in its demesne.
The relative merits of early and late enforcement were not originally clear. Late enforce-
ment was originally the implemented policy, but the preferred policy is early enforcement.
At some point this will become the default and the compiler will provide late enforcement
as an option.
27
class A {
void f()
{
PrivateRegion r = new PrivateRegion();
2. A static field;
28
1. If r is externally referenced, throw a ti.lang.RegionInUse exception.
2. Run the finalize methods of all objects in r for which it has not been run.
This returns the region of the object, or null for garbage-collected objects. For shared-
region objects, the local representative of the shared region is returned.
The expression Domain<N>.setRegion(reg), where reg is a Region, dynamically makes
reg the Region used for allocating all internal pointer structures used in representing do-
mains (until the next call of setRegion). A null value for reg causes subsequent allocations
29
package ti.lang;
30
to come from garbage-collected storage. Returns the previous value passed to setRegion
(initially null). The value of N is irrelevant; all general domains are allocated in the same
region.
31
Chapter 7
Templates
TemplateInstantiation:
Name "<" TemplateActual { "," TemplateActual }* ">"
TemplateActual:
Type | AdditiveExpression
CompilationUnit:
PackageDeclarationopt{ ImportDeclaration }* { OuterDeclaration }*
OuterDeclaration:
32
TypeDeclaration
TemplateDeclaration
TemplateDeclaration:
TemplateHeader ClassDeclaration
TemplateHeader InterfaceDeclaration
TemplateHeader:
template "<" TemplateFormal { "," TemplateFormal }* ">"
TemplateFormal:
class Identifier
PrimitiveType Identifier
The first form of TemplateFormal allows any type as argument; the second allows Con-
stantExpressions of the indicated type.
Templates introduce new opportunities for creating illegal circular dependencies among
types. In Java, it is a compile-time error if a class depends on itself. In Titanium, the
definition of directly depends is extended as follows:
1. C directly depends on T .
33
Names other than template parameters in a template are captured at the point of the
template definition. Names in a TemplateActual (and at the point it is substituted for in
a template instantiation) are resolved at the point of instantiation.
or
might be illegal according to the usual rules (e.g., the template expansion could reside in package P and
yet contain fields whose types were non-public members of package Q). The rationale for this alternative
is that otherwise, a package of utility templates would be useless unless one was willing to make public all
classes used as template parameters.
34
Chapter 8
Operator Overloading
Titanium allows overloading of the following Java operators as instance methods of classes
and interfaces.
Unary prefix - ! ~
Binary < > <= >=
+ - * / & | ^ % << >> >>>
Matchfix [] []=
Assignment += -= *= /= &= |= ^= %= <<= >>= >>>=
On immutable types, furthermore, one may also overload the binary operators == and !=.
They must have exactly the following signatures:
35
1. E0 ⊕ E1 =⇒ (E0 ).op ⊕ (E1 )
2. E0 ⊕ E1 =⇒ (E1 ).op ⊕ (E0 , E1 ), if T0 is primitive.
3. ⊕E0 =⇒ E0 .op ⊕ ()
4. E0 [E1 , . . . , En ] = E =⇒ (E0 ).op[]=(E1 , . . . , En , E)
5. E0 [E1 , . . . , En ] =⇒ E0 .op[](E1 , . . . , En ), in other contexts
36
Chapter 9
Processes
37
multiple processes avoid attempting to modify the same variable without an intervening
barrier. Titanium also provides specialized constructs to facilitate communication for some
common cases, as detailed in the following sections.
9.2.1 Exchange
The member function exchange, defined on Titanium array types, allows all processes in
a team to exchange data values with each other. Given E is of some type T and A is a
grid of T with an index set that is a superset of Ti.myTeam().domain(), the call
A.exchange(E)
acts as a barrier (see §9.3) among the processes with indices in Ti.myTeam(). It is an
error if the processes reach different textual instances of exchange. The collective effect of
exchange is as if it collected all the values of E and then copied them into all the arrays
A, so that after proceeding from the barrier, the value of A[i] for the A supplied by each
process will be the value of E supplied by the process indexed by Ti.myTeam()[i], for
each i in Ti.myTeam()’s domain. It is illegal to exchange arrays of local pointers (that is
arrays of a type qualified ‘local’) or non-array immutables with embedded local pointers.
If T is non-atomic and A resides in a shared region on any processor, then it is also illegal
to pass an argument that resides in a private region.
Thus, the code
double [1d] single [2d] x = new double[Ti.myTeam().domain()][2d];
x.exchange(new double [D]);
creates a vector of pointers to arrays, each on a separate processor, and distributes this
vector to all processors.
Access to the arrays A is restricted in order to give implementations some leeway to
implement exchange efficiently. If the notation Ak denotes the array reference calculated
by a process k in an exchange operation, the value of a read or effect of a write to Ak [j]
by process i 6= k is undefined between the last dynamically encountered barrier before
the call to exchange and the next dynamically encountered barrier after returning from
the call. As for other method calls, the actual “call to exchange” is an event that occurs
after evaluation of A and v. As a result of these restrictions, the implementation need not
actually complete the copy of data to all the arrays A before any given process continues
from its call to exchange, nor need it wait for all processes to reach the barrier before it
begins setting elements of A for the processes that have reached the barrier.
9.2.2 Broadcast
The broadcast expression allows one process in a team to broadcast a value to all others.
Specifically, in
38
broadcast E from p
—where E is an arbitrary value and p (a UnaryExpression) evaluates to the index of a
process on the current process team—process p (only) evaluates E to yield a value V , and
all other processes in the team wait at the broadcast (if necessary) until p performs the
evaluation and then all return the value V . Processes may proceed even if some processes
have not yet reached the broadcast and received the broadcast value. It is an error if
the processes reach a different sequence of textual instances of the call. It is an error if
the processes do not agree on the value p—in fact, p must be a single-valued expression
(see §9.4.1), whose value is non-negative and less than Ti.numProcs(). It is an error for
the evaluation of E on processor p to throw an exception (which, informally, would keep
one process from reaching the barrier). It is illegal to broadcast a value of a non-array
immutable type that contains embedded local pointers.
9.3 Barriers
The call
Ti.barrier();
causes the process executing it to wait until all processes in its team have executed the
same textual instance of the barrier call. It is an error for a process to attempt to execute
a different sequence of textual instances of barrier calls than another in its team. Each
textually distinct occurrence of partition and each textually distinct call on exchange
waits on a distinct anonymous barrier.
39
replicated objects that are of the same dynamic type and that, in the case of arrays, have
identical bounds. To ensure the coherence of such expressions, the compiler must ensure
that the sequences of values assigned to certain storage locations by the processes in a
given team are themselves coherent. The programmer must mark such storage locations
with the type qualifier single.
The following properties are important in checking these coherence constraints:
ThrowStatement:
single throw Expression ;
where the Expression must be single-valued. Normal throw statements are statically
taken to throw non-universal exceptions. A catch clause in a try statement and the
throws list of a method or constructor declaration indicate that an exception is
universal by qualifying the exception type with single (we use the terms universal
catch clause and universal throws clause for these cases). A universal catch clause
catches only (statically) universal exceptions; a non-universal catch clause will catch
both universal and non-universal exceptions.
3. A statement has global effects if it or any of its substatements either assigns to any
storage (variable, field or array element) whose type is t single, may call a method or
constructor that has global effects, is a broadcast expression, or is a single throw
statement.
40
(a) A method has global effects if any statement of its body has global effects;
however, assignments to single local variables do not count as global effects.
(b) A constructor has global effects if any statement of its body has global effects.
However, an assignment to a single local variable does not count for this pur-
pose, nor does an assignment to a non-static single field of the object being
constructed if it occurs within the constructor body and the destination field is
specified as a simple identifier, possibly qualified by this.
7. e1 [e2 ] if e1 and e2 are single-valued, e1 is an array (standard or grid), and the type
of the elements of e1 is single;
12. e1 = e2 if e2 is single-valued;
41
13. e1 ⊕ = e2 if e1 and e2 are single-valued and ⊕ = a built-in assignment operator;
(a) e0 is single-valued
(b) ei is single-valued if the i’th argument of v is declared single;
(c) the result of v is declared single;
17. new T (e1 , . . . , en ) if ei is single-valued wherever the i’th argument of the appropriate
constructor for T is declared single;
19. Array initializers in declarations and array creation expressions produce single-valued
expressions. Thus, new T [· · ·]{e1 , . . . , en } produces a result of type T [· · ·] single.
21. [e1 : e2 : e3 ] if e1 , e2 , e3 are single-valued (and similarly for the other domain literal
syntaxes);
The rule for broadcast excludes reference values from single expressions, which may
seem odd since the recipients of the broadcast will manifestly get the same value. This
merely illustrates, however, the subtle point that “single” and “equal” are not synonyms.
Consider a loop such as
2
As of version 3.86 of the compiler, however, all of the ei must be single-valued before the allocator’s
result is single-valued. This is rectified in later versions.
42
x = broadcast E from 0;
while (x.N > 0) {
...
Ti.barrier ();
x.N -= 1;
}
(where ‘x’ and the ‘N’ field of its type are declared single). If we were to allow this, it is
clearly easy to get different processes to hit the barrier different numbers of times, which
is what the rules concerning single are supposed to prevent.
2. If a method call e0 .v(e1 , . . . , en ) may call a method m1 that has global effects then:
4. In an object allocation new T (e1 , . . . , en ), if the constructor has global effects then
ei must be single-valued if the i’th argument of the constructor is declared single.
5. An initializer expression for an instance (non-static) variable may not have global
effects. (For simplicity, this rule is stronger than it needs to be.)
43
1. An if (or ? operator) whose condition is not single-valued cannot have statements
(expressions) with global effects as its ‘then’ or ‘else’ branch.
3. A while, do/while or for loop whose exit condition is not single-valued, and a
foreach loop whose iteration domain is not single-valued cannot contain statements
or expressions with global effects.
4. If the main statement of a try has global effects, then non-universal unchecked excep-
tions may not be specified in its catch clauses. It is an error if one of its non-universal
catch clauses catches an unchecked exception at runtime.
If the main statement of a try has non-universal terminations then its catch clauses
cannot specify any universal exceptions.
The rules for determining whether a termination is universal are essentially identical
to the restrictions on statements with global effects: any termination raised in a statement
that cannot have global effects is not universal. In addition:
44
9.4.4 Restrictions on Methods and Constructors
The following additional restrictions are imposed on methods and constructors:
1. If a method is declared to return a single result, then the expression in all its return
statements must be single-valued. The return terminations must all be universal.
3. A method may not have global effects if it overrides or implements a method that
does not have global effects. The reverse is not true—an overriding method need not
have global effects if the method it overrides or implements does.
Thus, a Titanium program defines a total order on memory events for each process.
This corresponds to the order that a naive compiler and processor would exhibit, i.e.,
45
without reorderings. The union of these process orders forms a partial order called the
program order. We write P (a, b) if event a happens before event b in the program order.
During an actual execution, some of the events visible in the Titanium source may be
reordered, modified, or eliminated by the compiler or hardware. However, the processes
see the events in some total order that is related to the program order. For each execution
there exists a total order, E, of memory events from P such that:3 :
1. P (a, b) ⇒ E(a, b) if a and b access the same variable, and at least one of them is an
assign.
(a) If E(assign(V, A), use(V, B)) and there is no intervening write to V , then A =
B.
(b) If there were n processes in P , then there are n consecutive barrier statements
in E for each instance of a barrier.
(c) A process T may contain an unlock operation l, only if the number of preceding
locks by T (according to E) on the object locked by l is is strictly greater than
the number of unlocks by T .
(d) A process T may contain a lock operation l, only if the number of preceding
locks by other processes (according to E) is equal to the number of preceding
unlocks.
Less formally, rule 1 says that dependences in the program will be observed. Rules 2 and
3 say that reads and writes performed in a critical demesne must appear to execute in-
side that demesne. Rule 4 says that synchronization operations must not be reordered.
(Titanium extends the Java rules for synchronization, which include only lock and unlock
statements to explicitly include barrier.) Rule 5 says that the usual semantics of mem-
ory and synchronization constructs are observed. Rule 6 says that operations on volatile
variables must execute in order.
As in Java, the indivisible unit for assignment and use is a 32-bit value. In other
words, the assign and use operations in the program order are on at most 32-bit quantities;
3
See author’s notes 1 and 2.
46
assignment to double or long variables in Titanium source code corresponds to two separate
operations in this semantics. Thus, a program with unsynchronized reads and writes of
double or long values may observe undefined values from the reads4 . It is a weakness
of current implementation, unfortunately, that global pointers are not indivisible in this
sense.
4
See authors’ note 3.
47
Chapter 10
10.1 Inlining
The inline qualifier on a method declaration acts as in C++. The semantics of calls to
such methods is identical to that for ordinary methods. The qualifier is simply advice to
the compiler. In particular, you should probably expect it to be ignored when the dynamic
type of the target object in a method call cannot be uniquely determined at compilation
time.
The qualifier may also be applied to loops:
# Does nothing.
48
#if...#elif...#else...#endif For conditional compilation.
#ifdef NAME Short for #if defined(NAME ).
#ifndef NAME Short for #if !defined(NAME ).
#include Inserts text files verbatim.
#line Specifies a source file and line number to ascribe to the current source text for
purposes of generating messages.
#pragma Modifies the behavior of the program in various implementation-dependent ways.
#undef Undefines names introduced by previous #define directives.
This syntax includes a small set of macro operators for use inside definitions
#P The textual macro parameter P in string quotes.
P ## Q The tokens P and Q (after textual parameter substitution) concatenated into a
single token.
Since each Titanium source is preprocessed independently, only #defines that appear
in a given file or those files it explicitly #includes will be available. One consequence of
these semantics is that class declarations should not appear in a file that is #included by
more than one loaded file, because that would lead to a multiply-defined class.
The preprocessor predefines certain names. Due to its legacy, these include the standard
names defined in the C language specification, in addition to those that are specific to
Titanium. The names of likely interest to Titanium programmers are as follows:
__FILE__ Name of the current source file (a string literal).
__LINE__ Line number on which the directive appears within the current source file (an
integer).
__DATE__ Current date in “MMM DD YYYY” form (a string literal).
__TIME__ Current local time in “HH:MM:SS” format on a 24-hour clock (a string literal).
__TITANIUMC__ The major version number for the compiler translating this source file (an
integer).
__TITANIUMC_MINOR__ The minor version number for compiler translating this source file
(an integer).
For full details of all these features, see any current reference work on the C programming
language.
49
Chapter 11
This section discusses features of Titanium whose implementation we have deferred indef-
initely until we can evaluate the need for them.
11.1 Partition
The constructs
and
divide a team into one or more teams without changing the total number of processes.
The construct begins and ends with implicit calls to Ti.barrier(). When all processes
in a team reach the initial barrier, the system divides the team into n teams (some possibly
empty). All those for which C0 is true execute S0 . Of the remaining, all for which C1 is
true execute S1 , and so forth. All processes (including those satisfying none of the Ci ) wait
at the barrier at the end of the construct until all have reached the barrier.
Since the construct partitions the team, it also changes the value of Ti.myTeam(), (but
not Ti.thisProc()) for all processes for the duration of the construct. If supplied, the
variable name V is bound to an integer value such that Ti.thisProc() = Ti.myTeam()[V ].
Its scope is the text of all the Si .
50
Chapter 12
12.1 Grids
The syntax below is not, of course, legal (and grid types are not classes); we use it simply
as a convenient notation. For each type T and positive integer n:
final class T [n d] {
public static final int single arity = n;
public static int single arity () { return n; }
public RectDomain<n> single domain();
51
/* only if n > 1 */
public local T[(n - 1)d] local single slice(int single k, int single j);
52
public void readFrom(java.io.RandomAccessFile file)
throws java.io.IOException;
public void readFrom(java.io.DataInputStream str)
throws java.io.IOException;
public void writeTo(java.io.RandomAccessFile file)
throws java.io.IOException;
public void writeTo(java.io.DataOutputStream str)
throws java.io.IOException;
}
12.2 Points
53
public boolean single op>=(Point<n> single p);
public int single op[] (int single x);
12.3 Domains
A Domain<n> is an Object, and therefore differs in subtle ways from a RectDomain<n>.
Specifically, == and != on Domains denote pointer (in)equality, as for other reference types.
54
public RectDomain<n> [1d] single RectDomainList();
55
12.4 RectDomains
56
public Point<n> single min();
public Point<n> single max();
public Point<n> single stride();
public int single size();
public boolean single isEmpty();
public RectDomain<n> single accrete(int single k, int single dir,
int single s);
public RectDomain<n> single accrete(int single k, int single dir);
public RectDomain<n> single accrete(int single k, Point<n> single S);
public RectDomain<n> single accrete(int single k); /* S = all(1) */
public RectDomain<n> single shrink(int single k, int single dir);
public RectDomain<n> single shrink(int single k);
public RectDomain<n> single permute(Point<n> single p);
public RectDomain<n> single border(int single k, int single dir,
int single shift);
public RectDomain<n> single border(int single k, int single dir);
public RectDomain<n> single border(int single dir);
public boolean single contains(Point<n> single p);
public RectDomain<n> single boundingBox();
/* only if n > 1 */
public RectDomain<n - 1> single slice(int single k);
package ti.lang;
public final class Reduce
{
/* The result of Reduce.F (E) is the result of applying the binary
* operator F to all processes’ values of E in some unspecified
* grouping. The result of Reduce.F (E, k) is 0 or false for all
* processes other than K, and the result of applying F to all the
* values of E for processor K.
*
* Here, F can be ’add’ (+), ’mult’ (*), ’max’ (maximum),
57
* ’min’ (minimum), ’and’ (&), ’or’ (|), or ’xor’ (^).
*/
58
public static single long single xor(long n);
public static single boolean single xor(boolean n);
public static single Object gen(ObjectOp oper, Object o, int single to);
public static single Object gen(ObjectOp oper, Object o);
public static single int gen(IntOp oper, int n, int single to);
public static single int single gen(IntOp oper, int n);
public static single long gen(LongOp op, long n, int single to);
public static single long single gen(LongOp oper, long n);
public static single double gen(DoubleOp oper, double n, int single to);
public static single double single gen(DoubleOp oper, double n);
}
59
package ti.lang;
public final class Scan
{
/* Scan.F (E) produces the result of applying the operation F to all
* values of E for this and lower-numbered processes, according to
* some unspecified grouping. */
60
interface IntOp {
int eval(int x, int y);
}
interface LongOp {
long eval(long x, long y);
}
interface DoubleOp {
double eval(double x, double y);
}
interface ObjectOp {
Object eval(Object arg0, Object arg1);
}
package ti.lang;
/** Zero */
public Complex();
public Complex(double re, double im);
/** The Complex value denoted by S. S must have the form <num>,
* <num><i>, <num><op><num><i>, or (<num>,<num>),
* where each <num> is parsable as a double, <i> is i or I,
* and <op> is + or -. The parenthesized form is Fortran format:
61
* (real part, imag part). The <nums> may contain
* no embedded whitespace; otherwise whitespace is ignored. */
public static Complex parseComplex(String s) throws NumberFormatException;
/* Assignment operations */
62
public Complex op/=(Complex c1);
public Complex op^=(Complex c1);
public Complex op+=(double d1);
public Complex op-=(double d1);
public Complex op*=(double d1);
public Complex op/=(double d1);
public Complex op^=(double d1);
package ti.lang;
public class Timer {
63
/** The maximum possible value of secs (). Roughly 1.84e13. */
public final double MAX_SECS;
/** Add the time elapsed from the last call to start() to the value
* of secs (). */
public void stop();
package ti.lang;
/** PAPI counter library. The available counters are platform dependent. */
64
public class PAPICounter {
65
public static final int Load_Store_Instruction;
public static final int Synchronization_Instruction;
public static final int L1_DataCache_Hits;
public static final int L2_DataCache_Hits;
public static final int L1_DataCache_Accesses;
public static final int L2_DataCache_Accesses;
public static final int L3_DataCache_Accesses;
public static final int L1_DataCache_Reads;
public static final int L2_DataCache_Reads;
public static final int L3_DataCache_Reads;
public static final int L1_DataCache_Writes;
public static final int L2_DataCache_Writes;
public static final int L3_DataCache_Writes;
public static final int L1_ICache_Hits;
public static final int L2_ICache_Hits;
public static final int L3_ICache_Hits;
public static final int L1_ICache_Accesses;
public static final int L2_ICache_Accesses;
public static final int L3_ICache_Accesses;
public static final int L1_ICache_Reads;
public static final int L2_ICache_Reads;
public static final int L3_ICache_Reads;
public static final int L1_ICache_Writes;
public static final int L2_ICache_Writes;
public static final int L3_ICache_Writes;
public static final int L1_Cache_Hits;
public static final int L2_Cache_Hits;
public static final int L3_Cache_Hits;
public static final int L1_Cache_Accesses;
public static final int L2_Cache_Accesses;
public static final int L3_Cache_Accesses;
public static final int L1_Cache_Reads;
public static final int L2_Cache_Reads;
public static final int L3_Cache_Reads;
public static final int L1_Cache_Writes;
public static final int L2_Cache_Writes;
public static final int L3_Cache_Writes;
public static final int Float_Multiply_Instruction;
public static final int Float_Add_Instruction;
public static final int Float_Divide_Instruction;
public static final int Float_SquareRoot_Instruction;
public static final int Float_Inverse_Instruction;
66
public static final int Float_Operations;
/** True iff the current counter is running (not idle). Initially
* false. */
public boolean single running();
/** Stops (idles) the counter. Requires that the counter be running. */
public void stop();
/** Returns the current counter value. Requires that the counter be
* idle. */
public long getCounterValue();
12.9 ti.lang.Ti
This class provides miscellaneous static methods supporting SPMD programming and
memory management in Titanium.
package ti.lang;
public class Ti {
public static int thisProc();
67
public static int single numProcs();
public static single void barrier();
public static int single [1d] single local myTeam();
public static void poll();
public static void suspend_autogc();
public static void resume_autogc();
}
runtime.distributed Has the value "true" if and only if the Titanium program reading
it has been compiled for a platform with a distributed memory architecture, and
otherwise "false".
runtime.shared Has the value "true" iff some processes may share a memory space.
(The CLUMP backends make “shared” and “distributed” orthogonal, rather than
mutually exclusive.)
runtime.model Indicates the specific platform that the Titanium program reading it has
been compiled for.
12.11 java.lang.Object
Titanium adds or modifies several methods in class java.lang.Object, as indicated in
§6.1:
68
protected Object clone() throws CloneNotSupportedException;
protected local Object local localClone()
throws CloneNotSupportedException;
public final ti.lang.Region local regionOf();
2. r.isLocal() returns true iff r may be coerced to a local reference. This returns the
same value as r instanceof T local, assuming r to have object type T. On some
(but not all) platforms r.isLocal() is equivalent to r.creator() == thisProc().
3. r.clone() is as in Java, but with local references inside the object referenced by r
set to null. The result is a global pointer to the newly created clone, which resides
in the same region (see §6.3.2) as the object referenced by r. This operation may be
overridden.
12.12 java.lang.Math
The static methods in java.lang.Math, with the exception of java.lang.Math.random,
take single arguments and produce single results.
69
public static int single parseInt(String single s, int single radix)
throws NumberFormatException;
public static int single parseInt(String single s)
throws NumberFormatException;
public static Integer single valueOf(String single s, int single radix)
throws NumberFormatException;
public static Integer single valueOf(String single s)
throws NumberFormatException;
public Integer(int single value);
public Integer(String single s) throws NumberFormatException;
public int single intValue();
public long single longValue();
public float single floatValue();
public double single doubleValue();
}
70
public float single floatValue();
public double single doubleValue();
}
71
public boolean single booleanValue();
public static Boolean single valueOf(String single s);
}
12.14 Polling
The operation Ti.poll() services any outstanding network messages. This is needed in
programs that have long, purely-local computations that can starve the NIC (e.g. a self-
scheduled computation). It has no semantic content, and affects only performance (on
some systems).
12.15.1 Copy
Assuming
T [N d] dest, src;
Domain<N> dom;
Point<N> [1d] pts;
the calls
dest.copy(src, dom)
dest.copy(src, pts);
72
Restrictions
1. pts must not overlap dest or src.
2. Each point element of pts or dom must be contained within the intersection of
.domain()dest and .domain()src (it is a fatal error otherwise).
3. If T is a non-atomic type, then it may not be the case that src resides in a private
region and dest resides in a shared region.
4. If T contains embedded local pointers then src and dest must both be local.
5. The contents of src (at the selected points) and pts may not be modified (i.e., by
other processes) during the operation.
Effects
After the operation completes, the following conditions will hold:
1. For every Point<N> p in dom (or in pts), dest[p] == src’[p] (where src’ denotes
the contents of src prior to the operation).
2. All other values are unchanged.
3. pts is permitted to contain duplicate points, but by definition these will not affect
the result.
4. src and dest are permitted to overlap, and if they do it will be as if the relevant
values were first copied from src to an intermediate temporary array and then to
dest.
5. During the operation, the contents of dest at affected points is undefined.
12.15.2 Gather
Assuming
T [N d] src;
Point<N> [1d] pts;
T [1d] dest;
the call
src.gather(dest, pts);
packs the values from src selected by pts into dest, maintaining ordering of the points
and data, and preserving any duplicated points in the packed array.
73
Restrictions
1. dest.domain().size() >= pts.domain().size() (i.e. there is enough space to
gather into). It is a fatal error otherwise.
4. If T is a non-atomic type, then it may not be the case that src resides in a private
region and dest resides in a shared region.
5. If T contains embedded local pointers then src and dest must both be local.
6. The contents of src (at the selected points) and pts may not be modified (i.e., by
other processes) during the operation.
Effects
After the operation completes, the following conditions will hold:
1. src and pts will be unchanged
12.15.3 Scatter
Assuming
T [1d] src;
Point<N> [1d] pts;
T [N d] dest;
the call
dest.scatter(src, pts);
unpacks the values from src into dest at the positions selected by pts.
74
Restrictions
1. pts.domain().size() <= src.domain().size() (i.e. there are enough values to
be scattered in src). It is a fatal error otherwise.
4. If T is a non-atomic type, then it may not be the case that src resides in a private
region and dest resides in a shared region.
5. If T contains embedded local pointers then src and dest must both be local.
6. The contents of src and pts may not be modified (i.e., by other processes) during
the operation.
Effects
After the operation completes, the following conditions will hold:
4. The contents of destArray at the affected points are undefined while operation is in
progress.
75
defined. Between the initiation of the copy operation and the completion of the correspond-
ing sync, any use of one of the target elements (by any process) produces an undefined
value and any assignment to one of the target or source elements (by any process) has an
undefined effect on the targets’ final values.
There are two basic categories of non-blocking operations, defined by the synchroniza-
tion mechanism used:
• An explicit handle (NB) operation returns a specific handle from the initiation (of
type ti.lang.Handle), which the calling process must later use to synchronize that
particular copy (through its syncNB() instance method).
• An implicit handle (NBI) operation does not return a handle from the initiation.
Instead, the static routine Handle.syncNBI() must later be used to synchronize all
outstanding NBI operations issued by the calling process.
76
public static native Handle arraycopyNB(Object src, int src_position,
Object dst, int dst_position,
int length);
Handle h2 = System.arraycopyNB(ja,0,jb,0,jb.length);
// initiate a non-blocking copy from ja -> jb
// as for System.arraycopy
System.arraycopyNBI(ja,0,jb,0,jb.length);
// initiate a non-blocking copy from ja -> jb
77
12.17 Bulk I/O
This library supports fast I/O operations on both Titanium arrays and Java arrays. These
operations are synchronous (that is, they block the caller until the operation completes).
78
12.17.2 Bulk I/O for Java Arrays in Titanium
The BulkDataInputStream, BulkDataOutputStream, and BulkRandomAccessFile classes
in the ti.io package implement bulk, synchronous I/O. They subclass the three classes
in java.io that can be used for I/O on binary data (so you can still use all the familiar
methods), but they add a few new methods that allow I/O to be performed on entire ar-
rays in a single call, leading to significantly less overhead (in practice speedups of over 60x
have been observed for Titanium code that performs a single large I/O using the readAr-
ray() and writeArray() methods, rather than many calls to a single-value-at-a-time method
like DataInputStream.readDouble()). These classes only handle single-dimensional Java
arrays whose elements have atomic types (see §4.2).
package ti.io;
79
/** Equivalent to writeArray (A, 0, N), where N is the length of
* array A. */
void writeArray(Object primjavaarray)
throws java.io.IOException;
}
80
/** A file providing access to the external file NAME in mode
* MODE, as described in the documentation of the superclass. */
public BulkRandomAccessFile(String name, String mode)
throws java.io.IOException;
/** A file providing access to the external file FILE in mode
* MODE, as described in the documentation of the superclass. */
public BulkRandomAccessFile(java.io.File file, String mode)
throws java.io.IOException;
suspend autogc() Temporarily suspends the use of automatic garbage collection on the
calling process. While suspended, garbage collection will not run on the current pro-
cess during allocation operations, except in response to explicit calls to System.gc().
In some configurations, suspending collection on one process may also prevent auto-
matic collections from running on one or more other processes. Suspending collection
for too long can lead to memory exhaustion.
resume autogc() Resumes the use of automatic garbage collection on a particular process.
Must be matched by a previous call to suspend autogc() on this process. Pairs of
calls to suspend autogc() and resume autogc() nest; automatic collection will only
be enabled outside all such pairs of calls.
81
Chapter 13
1. Blank finals: Currently the compiler does not prevent one from assigning to a blank
final field multiple times. This minor pathology is sufficiently unimportant that is
unlikely to be fixed, but it is best for programmers to adhere to Java’s rules.
3. Thread creation: SPMD processes are the only thread-like entities that may be
created. Java classes that depend on thread creation, such as those in java.awt and
java.net, are consequently not implemented.
5. Covariant return types: As in Java 1.5, but not 1.4, Titanium allows the return
type of an overriding method to be a subtype of that of the overridden method.
6. Static fields: The compiler replicates static storage so that a single textual definition
of a static field actually represents a distinct variable in each process. See §9.6.
82
Chapter 14
Handling of Errors
Technically, in those places that the language specifically says that “it is an error” for the
program to perform some action, the result of further execution of the program is undefined.
However, as a practical matter, compilers should comply with the following constraints,
absent a compelling (and documented) implementation consideration. In general, a situ-
ation that “is an error” should halt the program (preferably with a helpful traceback or
other message that locates the error). It is not required that the program halt immediately,
as long as it does so eventually, and before any change to state that persists after execution
of the program (specifically, to external files). Therefore, it is entirely possible that several
erroneous situations might be simultaneously pending, and such considerations as which
of them to report to the user are entirely implementation dependent.
ArithmeticException ArrayStoreException,
ClassCastException,
IndexOutOfBoundsException, NegativeArraySizeException,
NullPointerException,
ThreadDeath,
VirtualMachineError (and subclasses)
83
Appendix A
Notes
These are collected discussion notes on various topics. This section is not part of the
reference manual.
A.1 On Consistency
Note 1. We debated whether there should be a single total order E for a given execution
or one for every process in the execution. The latter seems to admit implementations
that are not strictly cache-coherent, since processes may see writes happening in different
orders. Our interpretation of the Java semantics is the stronger single serial order, so we
have decided to use that in Titanium. This is subject to change if we find a significant
performance advantage on some platform to the weaker semantics. Even with the single
serial order, the semantics are quite weak, so it is unlikely that any program would rely on
the difference. The following example is an execution that would be correct in the weaker
semantics, but not in the stronger one – we are currently unable to find a motivating
problem in which this execution would arise.
// initially X = Y = 0
P1 P2 P3
X = 2 X = Y Y = X
Y = 1
P1 P2 P3
84
(A) Write X Read Y (C) Read X (E)
(B) Write Y Write X (D) Write Y (F)
We observe that:
1. Access (C) consumes the value produced by (B) since it returns 1. The only other
candidate is (F). Let us assume that (F) indeed wrote the value 1. That would imply:
Note 2. The Titanium semantics as specified are weaker than Split-C’s in that the default
is weak consistency; sequential consistency (the default in Split-C) can be achieved through
the use of volatile variables. However, this semantics is stronger than Split-C’s put and
get semantics, since Split-C does not require that dependences be observed. For example,
a put followed by a read to the same variable is undefined in Split-C, unless there is an
intervening synch. This stronger Titanium semantics is much nicer for the programmer, but
may create a performance problem on some distributed memory platforms. In particular,
if the network reorders messages between a single sender and receiver, which is likely if
there are multiple paths through the networks, then two writes to the same variable can
be reordered. On shared memory machines this will not be an issue. We felt that it was
worth trying to satisfy dependences at some risk of performance degradation.
85
Note 3. The Java specification makes this qualification about divisibility only on non-
volatile double and long values. It gives the (unstated) impression that accesses to 64-bit
volatile values are indivisible. This seems to confuse two orthogonal issues: the size of an
indivisible value and the relative order in which operations occur.
3. I believe that this style of region-based memory management is more efficient than
parallel garbage collection. Obviously this claim requires validation.
2. As formulated above, regions will not mesh well with Java threads (which are cur-
rently not part of Titanium). You must stop everything to delete a shared region.
Currently this is enforced by including a barrier in the shared region deletion oper-
ation, but with threads that is no longer sufficient. There are a number of possible
solutions, but none of them seem very good:
(a) Require r.delete() to be called from all threads. This would be painful for
the programmers.
(b) The implementation of r.delete() can just stop the other threads on the same
process. However, to efficiently handle local variables containing references one
needs to know all points where a process may be stopped (and obviously if these
86
points are “all points in the program” then efficiency is lost). So this solution
doesn’t seem very good either.
87
Index
!= operator on RectDomains, 7, 8, 57
on grids, 14 -= operator
on immutable types, 19 on Complexs, 63
on Complexs, 63 on Domains, 8, 55
on Points, 5, 54 on Points, 5, 54
on RectDomains, 8, 57 on RectDomains, 8, 57
* operator / operator
on Complexs, 63 on Complexs, 63
on Domains, 7, 8, 55 on Domains, 8, 55
on Points, 4, 54 on Points, 4, 54
on RectDomains, 7, 8, 57 on RectDomains, 8, 57
*= operator ^= operator
on Complexs, 63 on Complex, 63
on Domains, 8, 55 ^ operator
on Points, 5, 54 on Complex, 63
on RectDomains, 8, 57 ~ operator
+ operator on Complex, 63
on immutable types, 19 /= operator
on Complexs, 63 on Complexs, 63
on Domains, 7, 8, 55 on Domains, 8, 55
on Points, 4, 54 on Points, 5, 54
on RectDomains, 7, 8, 57 on RectDomains, 8, 57
on Strings, 36 < operator
+= operator on Domains, 8, 55
on Complexs, 63 on Points, 5, 54
on Domains, 8, 55 on RectDomains, 8, 57
on Points, 5, 54 <= operator
on RectDomains, 8, 57 on Domains, 8, 55
- operator on Points, 5, 54
on Complexs, 63 on RectDomains, 8, 57
on Domains, 7, 8, 55 == operator
on Points, 4, 54 on grids, 14
88
on immutable types, 19 Point.arity field, 5, 54
on Complexs, 63 Point.arity method, 5, 54
on Points, 5, 54 RectDomain.arity field, 7, 57
on RectDomains, 8, 57 RectDomain.arity method, 7, 57
> operator arity field (of grid), 15
on Domains, 8, 55 arity method (on grid), 15
on Points, 5, 54 array copying, non-blocking, 75–77
on RectDomains, 8, 57 array, standard, 13
>= operator java.lang.System.arraycopyNB method,
on Domains, 8, 55 77
on Points, 5, 54 java.lang.System.arraycopyNBI method,
on RectDomains, 8, 57 77
#define, 48 arrays, overlapping, 17
#elif, 48 ArraySpecifier, 13
#endif, 48 Complex.asin method, 63
#error, 48 Complex.asinh method, 63
#if, 48 assignment compatibility of grids, 14
#ifdef, 48 Complex.atan method, 63
#ifndef, 48 Complex.atanh method, 63
#include, 48 atomic type, 13
#line, 48
#pragma, 48 Ti.barrier method, 39
#undef, 48 ti.lang.Ti.barrier method, 68
BaseType, 13
Complex.abs method, 63 blank finals, 82
Complex.absSquare method, 63 Boolean class, modifications, 69
RectDomain.accrete method, 9, 57 RectDomain.border method, 10, 57
Complex.acos method, 63 border method (on grid), 16
Complex.acosh method, 63 Domain.boundingBox method, 7, 55
Reduce.add method, 59 RectDomain.boundingBox method, 7, 57
Scan.add method, 61 broadcast, 2
Point.all method, 4, 54 broadcast expression, 39
RectDomain.all method, 57 bulk I/O, 78–81
allocating from regions, 29 BulkDataInput methods
allocating grids, 14 readArray, 79
Reduce.and method, 59 BulkDataInputStream, 78, 79
Scan.and method, 61 BulkDataInputStream methods
Complex.arg method, 63 readArray, 80
Domain.arity field, 7, 55 BulkDataOutput methods
Domain.arity method, 7, 55 writeArray, 80
89
BulkDataOutputStream, 78, 79 Complex.conj method, 63
BulkDataOutputStream methods consistency model, 45–47
writeArray, 80 constructor
BulkRandomAccessFile, 78, 79 for domains, 6
BulkRandomAccessFile methods Domain.contains method, 8, 55
readArray, 81 RectDomain.contains method, 8, 57
writeArray, 81 control flow and global synchronization,
Byte class, modifications, 69 43–44
conversions, reference type, 22
circular dependence, 33 copy method (on grid), 15, 53, 72
Object.clone method, 69 copy constructor for Domain, 10, 55
clone method (on reference), 22 copying sparse grids, 72–75
coercion copyNB method (on grid), 53, 77
Domain to RectDomain, 6 copyNBI method (on grid), 53, 77
RectDomain to Domain, 6 Complex.cos method, 63
coercions, reference type, 22 Complex.cosh method, 63
compiler.flags property, 68 Object.creator method, 69
Complex, 61–63 creator method (on grid), 53
Complex methods creator method (on reference), 22
absSquare, 63
abs, 63 default constructor, sharing qualification,
acosh, 63 25
acos, 63 default value of immutable type, 18
arg, 63 demesne, 20–23, 37
asinh, 63 depends, 33
asin, 63 Point.direction method, 4, 54
atanh, 63 directly depends, 33
atan, 63 Domain, 5–11
conj, 63 Domain fields
cosh, 63 arity, 7, 55
cos, 63 Domain methods
exp, 63 PointList, 10, 55
im, 63 RectDomainList, 10, 55
parseComplex, 63 arity, 7, 55
re, 63 boundingBox, 7, 55
sinh, 63 contains, 8, 55
sin, 63 equals, 8, 55
sqrt, 63 isEmpty, 55
tanh, 63 isRectangular, 6, 55
tan, 63 lwb, 7, 55
90
max, 7, 55 copy, 15, 53, 72
min, 7, 55 creator, 53
permute, 55 domain, 15, 53
setRegion, 10, 31, 55 exchange, 16, 38, 53
size, 7, 55 gather, 53, 73
toDomain, 10, 55 inject, 15, 53
toString, 10, 55 isContiguous, 16, 53
upb, 7, 55 isEmpty, 16
domain method (on grid), 15, 53 isLocal, 53
Domain copy constructor, 10, 55 overlap, 53
Double class, modifications, 69 permute, 16, 53
DoubleOp, 61 project, 15, 53
dynamic class loading, absence of, 82 readFrom, 53, 78
regionOf, 53
embedded fields, 22 restrict, 15, 53
Domain.equals method, 8, 55 scatter, 53, 74
Point.equals method, 5 set, 16, 53
RectDomain.equals method, 8, 57 shrink, 16
errors, 83 size, 16
exchange method (on grid), 16, 38, 53 slice, 16, 53
Complex.exp method, 63 translate, 15, 53
externally referenced regions, 28 writeTo, 53, 78
finalize on immutable types, 19 grid types, and Object, 13, 14
finalize methods in regions, 29 grid, allocation, 14
Float class, modifications, 69 grids, assignment compatibility, 14
foreach, 2, 11 grids, indexing, 15
from, 2 grids, overlapping, 17
91
isContiguous method (on grid), 16, 53 Scan.mult method, 61
Domain.isEmpty method, 55 ti.lang.Ti.myTeam method, 68
RectDomain.isEmpty method, 57
isEmpty method (on grid), 16 new, 29
Object.isLocal method, 69 non-blocking array copying, 75–77
isLocal method (on grid), 53 nonshared, 2, 13, 23–27
isLocal method (on reference), 22 null, sharing qualification, 25
Domain.isRectangular method, 6, 55 Number class, modifications, 69
RectDomain.isRectangular method, 6, 57 ti.lang.Ti.numProcs method, 68
92
direction, 4, 54 equals, 8, 57
equals, 5 isEmpty, 57
lowerBound, 5 isRectangular, 6, 57
permute, 5, 54 lwb, 7, 57
replace, 5 max, 7, 57
toString, 5 min, 7, 57
upperBound, 5 permute, 7, 57
Domain.PointList method, 10, 55 shrink, 9, 57
Ti.poll method, 72 size, 7, 57
ti.lang.Ti.poll method, 68 slice, 10, 57
polyshared, 2, 13, 23–27 stride, 7, 57
pre-processing, 48 toString, 10
private regions, 27–29 upb, 7, 57
PrivateRegion, 29 Domain.RectDomainList method, 10, 55
process, 37 Reduce methods
process team, 37 add, 59
project method (on grid), 15, 53 and, 59
gen, 59
QualifiedBaseType, 13 max, 59
Qualifier, 13 min, 59
qualifier, base, 13 mult, 59
qualifier, top-level, 13 or, 59
Complex.re method, 63 xor, 59
BulkDataInput.readArray method, 79 reference methods
BulkDataInputStream.readArray method, clone, 22
80 creator, 22
BulkRandomAccessFile.readArray method, isLocal, 22
81 localClone, 22
readFrom method (on grid), 53, 78 regionOf, 22
RectDomain, 5–11 region-based allocation, 27–31
RectDomain fields RegionInUse, 29
arity, 7, 57 Object.regionOf method, 29
RectDomain methods regionOf method (on grid), 53
accrete, 9, 57 regionOf method (on reference), 22
all, 57 regions, allocating from, 29
arity, 7, 57 Point.replace method, 5
border, 10, 57 Timer.reset method, 64
boundingBox, 7, 57 restrict method (on grid), 15, 53
contains, 8, 57 ti.lang.Ti.resume autogc method, 68,
81
93
runtime.boundschecking property, 68 Timer.start method, 64
runtime.distributed property, 68 static fields, per-process, 47
runtime.gc property, 68 static initializers, 47
runtime.model property, 68 Timer.stop method, 64
runtime.shared property, 68 stride, 5
RectDomain.stride method, 7, 57
Scan methods String constants, interning, 47
add, 61 String literals, sharing qualification, 25
and, 61 ti.lang.Ti.suspend autogc method, 68,
gen, 61 81
max, 61 ti.lang.Handle.syncNB method, 76
min, 61 ti.lang.Handle.syncNBI method, 76
mult, 61
or, 61 Complex.tan method, 63
xor, 61 Complex.tanh method, 63
scatter method (on grid), 53, 74 template, 2
Timer.secs method, 64 TemplateActual, 32
set method (on grid), 16, 53 TemplateDeclaration, 33
Domain.setRegion method, 10, 31, 55 TemplateFormal, 33
sglobal, 2 TemplateHeader, 33
shared regions, 27–29 TemplateInstantiation, 32, 34
SharedRegion, 29 templates, 32
sharing qualification, 23–27 defining, 32
Shrt class, modifications, 69 denoting instantiations, 32
RectDomain.shrink method, 9, 57 instantiation, 34
shrink method (on grid), 16 name equivalence, 34
Complex.sin method, 63 names in, 33
single, 2, 13, 40 this, sharing qualification, 25
single analysis, 39–45 Ti.thisProc method, 37
single throw statement, 40 ti.lang.Ti.thisProc method, 68
single-valued expression, 40–43 threads, absence of, 82
Complex.sinh method, 63 ThrowStatement, 40
Domain.size method, 7, 55 Ti, 67–68
RectDomain.size method, 7, 57 Ti methods
size method (on grid), 16 barrier, 39
RectDomain.slice method, 10, 57 poll, 72
slice method (on grid), 16, 53 thisProc, 37
sparse grids, copying, 72–75 ti.lang, 3
Complex.sqrt method, 63 ti.lang.Handle methods
standard arrays, restrictions, 17 syncNBI, 76
94
syncNB, 76 Scan.xor method, 61
ti.lang.Ti methods
barrier, 68
myTeam, 68
numProcs, 68
poll, 68
resume autogc, 68, 81
suspend autogc, 68, 81
thisProc, 68
Timer, 63–64
Timer methods
micros, 64
millis, 64
reset, 64
secs, 64
start, 64
stop, 64
Titanium array, see grid
Domain.toDomain method, 10, 55
Domain.toString method, 10, 55
Point.toString method, 5
RectDomain.toString method, 10
translate method (on grid), 15, 53
Type, 13
unit stride, 5
universal catch clause, 40
universal termination, 40
universal throws clause, 40
Domain.upb method, 7, 55
RectDomain.upb method, 7, 57
Point.upperBound method, 5
BulkDataOutput.writeArray method, 80
BulkDataOutputStream.writeArray method,
80
BulkRandomAccessFile.writeArray method,
81
writeTo method (on grid), 53, 78
Reduce.xor method, 59
95