Unit 3 - Compiler Design - WWW - Rgpvnotes.in
Unit 3 - Compiler Design - WWW - Rgpvnotes.in
Unit 3 - Compiler Design - WWW - Rgpvnotes.in
Tech
Subject Name: Compiler Design
Subject Code: IT-603
Semester: 6th
Downloaded from www.rgpvnotes.in
Unit III
Syllabus: Syntax Directed Translation: Definitions, Inherited Attributes, L-attributed definitions, Sattributed
definitions, Dependency graph, Construction of syntax trees, Top down translation, postfix notation, bottom up
evaluation.
Unit Outcome: Write a scanner, parser, and semantic analyser without the aid of automatic generators.
During the syntax analysis of the language we use syntax-directed definitions. The set of attributes are
associated with each terminal and non-terminal symbols. The attribute can be a string, a number, a type, a
memory location or anything else.
The syntax directed definition is a kind of abstract specification. The conceptual view of syntax directed
translations is:
Evaluation
Depedency
Input String Syntax Tree order for
Graph
Semantic Rules
Definition: Syntax-directed definition is a generalization of context free grammar in which each grammar
production X α is associated with it a set of semantic rules of the form a: f(b1, b2,……, bk), where a is an
attribute obtained from the function f.
Attribute: The attribute can be a string, a number, a type, a memory location or anything else.
Consider X α be a context free grammar and a: f(b1, b2,……, bk) where a is the attribute. Then there are two
types of attributes:
1. Synthesized attribute: The attribute ‘a’ is called synthesized attribute of X and b1, b2, ….bk are
attributes belonging to the production symbols. The value of synthesized attribute at a node is
computed from the values of attributes at the children of that node in the parse tree.
2. Inherited attribute: The attribute ‘a’ is called inherited attribute of one of the grammar symbol on the
right side of the production (i.e. α ) and b1, b2, …bk are belonging to either X or α.
The inherited attribute can be computed from the values of the attributes at the siblings and parent of that
node.
S Attributed Definition: If an SDT uses only synthesized attributes, it is called as S-attributed SDT. These
attributes are evaluated using S-attributed SDTs that have their semantic actions written after the production
(right hand side).
As shown in figure, attributes in S-attributed SDTs are evaluated in bottom-up parsing, as the values of the
parent nodes depend upon the values of the child nodes.
Following steps are followed to compute S-attributed definition
1. Write the syntax-directed definition using the appropriate semantic actions for corresponding
production rule of the given grammar.
2. The annotated parse tree is generated and attribute values are computed. The computation is done in
bottom up manner.
3. The value obtained at the root node is supposed to be the final output.
L Attributed Definition: This form of SDT uses both synthesized and inherited attributes with restriction of not
taking values from right siblings.
In L-attributed SDTs, a non-terminal can get values from its parent, child, and sibling nodes. As in the following
production
S ABC
S can take values from A, B, and C (synthesized). A can take values from S only. B can take values from S and A.
C can get values from S, A, and B. No non-terminal can get values from the sibling to its right.
Attributes in L-attributed SDTs are evaluated by depth-first and left-to-right parsing manner.
L-Attributed SDT
S-Attributed SDT
We may conclude that if a definition is S-attributed, then it is also L-attributed as L-attributed definition
encloses S-attributed definitions.
Dependency Graph: The directed graph that represents the interdependencies between synthesized and
inherited attribute at nodes in the parse tree is called dependency graph.
For the rule X YZ the semantic action is given by X.x = f(Y.y, Z.z) then synthesized attribute X.x depends on
attributes Y.y and Z.z.
The synthesized attributes can be represented by .val. Hence the synthesized attributes are given by E.val and
E2.val. The dependencies among the nodes is given by solid arrows. The arrows from E1 and E2 show the
value of E depends upon E1 and E2. The parse tree is represented by using dotted lines.
Constructing syntax tree for an expression means translation of expression into postfix form. The nodes can be
implemented as a record with multiple fields.
The following functions are used for making syntax tree
1. mknode (op, left, right): Creates a node with the field operator having operator as label, and the two
pointers to left and right.
2. mkleaf(id, entry): Creates an identifier node with label id and a pointer to symbol table is given by
‘entry’.
3. mkleaf(num, val): Creates node for number with label num and val is for value of that number.
Top-Down Translation:
In top-down translation we will look at the implementation of L-attributed definitions during predictive
parsing. Instead of the syntax-directed translations, we will work with translation schemes. The top down
translation will see how to evaluate inherited attributes (in L-attributed definitions) during recursive predictive
parsing. This will also look at what happens to attributes during the left-recursion elimination in the left-
recursive grammars.
D → T id {addtype (id.entry, T.type), L.in = T.type}
LT → int {T.type = integer}
T → real {T.type = real}
L → id {addtype (id.entry, L.in), L1.in = L.in}
L→ε
This is a translation scheme for an L-attributed definitions.
Postfix Notation:
Postfix notation is a notation for writing arithmetic expressions in which the operands appear before their
operators. There are no precedence rules to learn, and parentheses are never needed. Because of this
simplicity, some popular hand-held calculators use postfix notation to avoid the complications of the multiple
parentheses required in nontrivial infix expressions.
Most programming languages uses Postfix Notation, because it clearly shows the order in which operations
are performed, and because it disambiguates operator groupings.
For example, look at this postfix expression:
“2 3 1 * + 9 -“. We scan all elements one by one.
1) Scan ‘2’, it’s a number, so push it to stack. Stack contains ‘2’
2) Scan ‘3’, again a number, push it to stack, stack now contains ‘2 3’ (from bottom to top)
3) Scan ‘1’, again a number, push it to stack, stack now contains ‘2 3 1’
4) Scan ‘*’, it’s an operator, pop two operands from stack, apply the * operator on operands, we get 3*1
which results in 3. We push the result ‘3’ to stack. Stack now becomes ‘2 3’.
5) Scan ‘+’, it’s an operator, pop two operands from stack, apply the + operator on operands, we get 3 + 2
which results in 5. We push the result ‘5’ to stack. Stack now becomes ‘5’.
6) Scan ‘9’, it’s a number, we push it to the stack. Stack now becomes ‘5 9’.
7) Scan ‘-‘, it’s an operator, pop two operands from stack, apply the – operator on operands, we get 5 – 9
which results in -4. We push the result ‘-4’ to stack. Stack now becomes ‘-4’.
8) There are no more elements to scan, we return the top element from stack (which is the only element left
in stack).
The parser is implemented as combination of state and value. The state[i] is for grammar symbol A and
value[i] holds the synthesized attribute A.a.
Consider the translation scheme for the example given below:
S T [List.in := T.type]
List
T int [T.type := int]
T float [T.type := float]
List List1, id [List1 := List.in]
{Enter_type(id.entry, List.in)}
List id {Enter_type(id.entry, List.in)}
Consider the input int a,b,c for bottom-up evaluation of inherited attributes
Input String State Production Used
int a,b,c _
a,b,c int
a,b,c T T int
,b,c Ta
,b,c T List List id
b,c T List ,
,c T List , b
,c T List List List1,b
C T List ,
T List , c
T List List List1 , id
S S T List
Table 3.1: Bottom-Up Evaluation of Inherited Attributes
Basically the function Enter_type performs the task of copying T.type to List.in. In this way the bottom up
implementation of inherited attributes can be done.
Before reduction the state A, B and C can be inserted in the stack along with the values. A.a, B.b and C.c. The
top pointer of value[top] will point the value C.c, similarly B.b is in value[top-1] and A.a is in value[top-2]. After
reduction the left hand side symbol of the production i.e. X will be placed in the stack along with the value X.x
at the top. Hence after reduction value[top] = X.x.
4. After reduction top is decremented by 2 the state covering X is placed at the top of state[top] and
value of synthesized attribute X.x is put in value[top].
5. If the symbol has no attribute then the corresponding entry in the value array will be kept undefined.