Unit 03 - Part 02

Topic
Types and Declarations

Types
The applications of types can be grouped under checking and translation.
Type checking
 It uses logical rules to reason about the behavior of a program at run time. Specifically, it
ensures that the types of the operands match the type expected by an operator.
 For example, the && operator in Java expects its two operands to be booleans; the result
is also of type boolean.
Translation Applications
 From the type of a name, a compiler can determine the storage that will be needed for
that name at run time.
 Type information is also needed to calculate the address denoted by an array reference,
to insert explicit type conversions, and to choose the right version of an arithmetic
operator, among other things.
 The actual storage for a procedure call or an object is allocated at run time, when the
procedure is called or the object is created.
i) Type Expressions
 Types have structure, which can be represented using type expressions: a type
expression is either a basic type or is formed by applying an operator called a type
constructor to a type expression. The sets of basic types and constructors depend on the
language to be checked.
 Example: The array type int [2] [3] can be read as "array of 2 arrays of 3 integers each"
and written as a type expression array(2, array(3, integer)).
 This type is represented by the tree. The operator array takes two parameters, a
number and a type.
Use the following definition of type expressions:
 A basic type is a type expression. Typical basic types for a language include boolean,
char, integer, float, and void; the latter denotes "the absence of a value."
 A type name is a type expression.
 A type expression can be formed by applying the array type constructor to a number
and a type expression.
 A record is a data structure with named fields. A type expression can be formed by
applying the record type constructor to the field names and their types. Record types
will be implemented by applying the constructor record to a symbol table containing
entries for the fields.
 A type expression can be formed by using the type constructor —>• for func-tion types.
We write s —»• t for "function from type s to type r."
Type Names and Recursive Types
 Once a class is defined, its name can be used as a type name in C++ or Java; for example,
consider Node in the program fragment
public class Node { • • • }
public Node n;
 Names can be used to define recursive types, which are needed for data structures such
as linked lists.
 The pseudocode for a list element defines the recursive type Cell as a class that contains
a field info and a field next of type Cell.
 Similar recursive types can be defined in C using records and pointers.
class Cell { int info; Cell next; ••• }
 Type expressions may contain variables whose values are type expressions.
A convenient way to represent a type expression is to use a graph.
ii) Type Equivalence
 Many type-checking rules have the form, "if two type expressions are equal then return
a certain type else error." Potential ambiguities arise when names are given to type
expressions and the names are then used in subsequent type expressions.
 The key issue is whether a name in a type expression stands for itself or whether it is an
abbreviation for another type expression.
 When type expressions are represented by graphs, two types are structurally equivalent
if and only if one of the following conditions is true:
1. They are the same basic type.
2. They are formed by applying the same constructor to structurally equivalent types.
3. One is a type name that denotes the other.
 If type names are treated as standing for themselves, then the first two conditions in the
above definition lead to name equivalence of type expressions.
Declarations
 Types and declarations using a simplified grammar that declares just one name at a time;
declarations with lists of names can be handled. The grammar is as follows.
 The fragment of the above grammar that deals with basic and array types was used to
illustrate inherited attributes.
 Nonterminal D generates a sequence of declarations.
 Nonterminal T generates basic, array, or record types.
 Nonterminal B generates one of the basic types int and float.
 Nonterminal C, for "component," generates strings of zero or more integers, each integer
surrounded by brackets.
 An array type consists of a basic type specified by B, followed by array components specified
by nonterminal C. A record type (the second production for T) is a sequence of declarations
for the fields of the record, all surrounded by curly braces.
Sequences of Declarations
 Languages such as C and Java allow all the declarations in a single procedure to be
processed as a group.
 The declarations may be distributed within a Java procedure, but they can still be
processed when the procedure is analyzed. Therefore use a variable, say offset, to keep
track of the next available relative address.
P -> D { offset =0;}
D -> T id ; D1 { top.put(id.lexeme, T.type, offset);
offset = offset + T.width; }
D -> ε
 The translation scheme of above production deals with a sequence of declarations of the
form T id, where T generates a type. Before the first declaration is considered, offset is
set to 0.
 As each new name x is seen, x is entered into the symbol table with its relative address
set to the current value of offset, which is then incremented by the width of the type of
x.
 The semantic action within the production D —>T id;D1 creates a
symbol
table entry by executing top.put(id.lexeme, T.type, offset).
 Here top denotes the current symbol table. The method top.put creates a symbol-table
entry for id.lexeme, with type T.type and relative address offset in its data area.
 The initialization of offset is more evident if the first production appears on one line as
P -> { offset = 0 ; } D
Nonterminals generating e, called marker non terminals, can be used to rewrite

productions so that all actions appear at the ends of right sides. Using a marker
nonterminal M can be restated as:
P -> M D
M -> ε { offset = 0; }
Declarations in a procedure
 In the procedure, for each local name, we create a symbol-table entry with information
like type and relative address of the storage for the name.
 The relative address consists of an offset from the base of the static data area or the
field for local data in an activation record.
 Computing the types and relative addresses of declared names.
Topic
Translation of Expressions
Translation Schemes
 The notion of a syntax-directed definitions in order to specify the order of evaluation of the
semantic rules, leading to the concept of a translation scheme.
 A Translation Scheme is a context-free grammar in which semantic rules are embedded
within the right sides of the productions.
i) Assignment Statements
 Expressions can be of type integer, real array and record.
 In the part of the translation of assignments into three-address code, it show how names can
be looked up in the symbol table and how elements of arrays and records can be accessed.
Names in the symbol table
 Names stood for pointers to their symbol-table entries.
 The lexeme for the name represented by id is given by attribute id.name.
 Operation lookup(id.name) checks if there is an entry for this occurrence of the name in the
symbol table. If so, a pointer to the entry is returned; otherwise lookup will return nil to
indicate that no entry was found.
 In the semantic actions of assignments, procedure emit() is used to emit three-address
statements to an output file, rather than building up code attributes for non-terminals.
Example Production rule Semantic actions
Consider the grammar
S → id := E S → id :=E { p = look_up(id.name);
If p ≠ nil then
E → E1 + E2 Emit (p = E.place)
E → E1 * E2 Else
E → (E1) Error; }
E → id
The translation scheme of above E → E1 + E2 { E.place = newtemp( );
Emit (E.place = E1.place '+' E2.place) }
grammar is given in a table.
 The p returns the entry for
E → E1 * E2 { E.place = newtemp();
id.name Emit (E.place = E1.place '*' E2.place) }
in the symbol table.
 The Emit function is used for
E → (E1) { E.place = E1.place }
appending the three address code to
the output file. Otherwise it will E → id { p = look_up(id.name);
report an error. If p ≠ nil then
Emit (p = E.place)
 The newtemp() is a function used to Else
generateholds
E.place newthe temporary
value of E.variables. Error; }
ii) Addressing Array Elements
 Elements of the array can be accessed quickly if the elements are stored in a block of
consecutive locations. If the width of each array element is w, then ith element of array A
begins in location
base + (i – low ) * w ------ ( 1 )
where low - lower bound on the subscript
base – relative address of the storage allocated for the array. i.e. base is the relative
address of A[low].
 The above expression can be partially evaluated at compile time if it is rewritten as:
i * w + ( base – low * w) ------ ( 2 )
 The sub-expression c = base – low * w can be evaluated when the declaration of the
array is seen. Assume that c is saved in symbol table entry for A, so the relative address
of A[i] is obtained by simply adding i * w to c.
 A two-dimensional array is normally stored in one of two forms, either rowmajor (row-
by-row) or column-major(column-by-column).
In the case of two-dimensional array stored in row-major form, the relative address of
A[i1, i2] can be calculated by the formula
base + ((i1 – low1) * n2 + i2 – low2) * w ------ ( 3 )
Where low1 and low2 – lower bounds on the values of i1and i2 n2– number of values that
i2 can take. if high2 is the upper bound on the value of i2, then n2 = high2 – low2 + 1.
Assuming that i1and i2 are the only values that are not known at compile time, we can
rewrite the above expression as:
((i1 * n2) + i2) * w + (base – ((low1 * n2) + low2) * w) ------ (4 )
The expression (4) generalizes to the following expression for the relative address of
A[i1,i2,….ik]
( ( ……((i1n2+i2)n3+i3)…… )nk + ik) X w + base – (( ……((low1 n2 + low2) n3
+low3)…… )nk + lowk) x w ------ (5 )
 Array references can be permitted in assignments if non-terminal L with the following
productions is allowed .
L -> id [Elist ] | id
Elist ->Elist, E | E
 In order that the various dimensional limits nj of the array be available as we group
index expressions into an Elist, it is useful to rewrite the productions as
L -> Elist] | id
Elist -> Elist, E | id [ E
 That is, the array name is attached to the leftmost index expression rather than being
joined to Elist when an L is formed.
Translation Scheme for Addressing Array Elements
S->L:=E Production Rule Semantic Action
E->E
+E E- L := E {
if L.offset = null then emit(L.place ':=' E.place)
>(E) E- else EMIT (L.place'['L.offset ']' ':=' E.place);
>L }
L->Elist ]
L->id E → E+E { E.place := newtemp;
EMIT (E.place ':=' E1.place '+' E2.place);
Elist->Elist , E }
Elist-> id [ E
E → (E) {E.place := E1.place;}
E→L {if L.offset = null then E.place = L.place
else {E.place = newtemp;
EMIT (E.place ':=' L.place '[' L.offset ']');
}
}
Production Rule Semantic Action
L → Elist ] {
L.place = newtemp; L.offset = newtemp;
EMIT (L.place ':=' c(Elist.array));
EMIT (L.offset ':=' Elist.place '*'
width(Elist.array);
}
L → Elist ] {
L.place = newtemp; L.offset = newtemp;
EMIT (L.place ':=' c(Elist.array));
EMIT (L.offset ':=' Elist.place '*'
width(Elist.array);
}
L → id {L.place = lookup(id.name);
L.offset = null;
}
Elist → Elist, E {
t := newtemp;
m := Elist1.ndim + 1;
EMIT (t ':=' Elist1.place '*'
limit(Elist1.array, m));
EMIT (t, ':=' t '+' E.place);
Elist.array =
Elist1.array;
Elist.place := t;
Elist.ndim := m;
}
Elist → id[E {
Elist.array := lookup(id.name);
Elist.place := E.place
Elist.ndim := 1;
}
iii) Boolean expressions
Boolean expressions have two primary purposes. They are used for computing the logical
values. They are also used as conditional expression using if-then-else or while-do.
Consider the grammar
E → E OR E
E → E AND E
E → NOT E
E → (E)
E → id relop id
E → TRUE
E → FALSE
The relop is
denoted by <,
>, <, >
The AND and OR are left associated. NOT has the higher precedence then AND and lastly
OR.
E → E1 OR E2 { E.place = newtemp( );
Emit (E.place ':=' E1.place 'OR' E2.place) }
E → E1 AND E2 { E.place = newtemp( );

Emit (E.place ':=' E1.place 'AND' E2.place) }
E → NOT E1 {
E.place = newtemp( );
Emit (E.place ':=' 'NOT' E1.place) }
E → (E1) { E.place = E1.place }

E → id relop id2 {E.place = newtemp();

Emit ('if' id1.place relop.op id2.place 'goto‘ nextstar + 3);
EMIT (E.place ':=' '0')
EMIT ('goto' nextstat + 2)
EMIT (E.place ':=' '1')
}
E → TRUE {E.place := newtemp();

Emit (E.place ':=' '1')
}
E → FALSE {E.place := newtemp();

Emit (E.place ':=' '0')
}
Example
p>q AND r<s OR u>r
100: if p>q goto 103
101: t1:=0
102: goto 104
103: t1:=1
104: if r>s goto 107
105: t2:=0
106: goto 108
107: t2:=1
108: if u>v goto 111
109: t3:=0
110: goto 112
111: t3:= 1
112: t4:= t1 AND t2
113: t5:= t4 OR t3
iv) Flow-of-Control Statements
 Consider the translation of boolean expression into three-address code in the context of if-then, if-
then-else and while-do statements .
S → if E then S1
| if E then S1 else S2
| while E do S1
 In each of these productions, E is the Boolean expression to be translated.
 Two labels are associated with Boolean expression E, they are :
E.true – the label to which control flows if E is true. E.false
– the label to which control flows if E is false.
 In translating the if-then statement, S → if E then S1, a new label E.true is created and attached to
the first three-address instruction generated for the statement S1 as shown in the figure.
 In translating the if-then-else statement, S → if E then S1 else S2, the code for the Boolean
expression E has jumps out of it to the first instruction of the code for S1 if E is true, and to the first
instruction of the code for S2 if E is false as illustrated in figure.
 As with the if-then statement, an inherited attribute S.next gives the label of the three-address
instruction to be execute next after executing the code for S. The code for S → while E do S1 is
formed as shown in the figure.
A syntax-directed definition for flow-of-control statements

Unit 03 - Part 02

Uploaded by

Copyright:

Available Formats

Unit 03 - Part 02

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Unit 03 - Part 02

Uploaded by

Copyright:

Available Formats

Topic

Types and Declarations

Nonterminals generating e, called marker non terminals, can be used to rewrite

E → E1 AND E2 { E.place = newtemp( );

E → (E1) { E.place = E1.place }

E → id relop id2 {E.place = newtemp();

E → TRUE {E.place := newtemp();

E → FALSE {E.place := newtemp();

You might also like