Chapter 6 - ICG

Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 14

Compiler Design

Instructor: Mohammed O.
Email: [email protected]
Samara University
Chapter Six
This Chapter Covers:
 Intermediate Code Generation
 Three Address Code
 Quadruples and Triples
Intermediate Code Generator: Intr
The front part of the compiler translates the source code programme into an
intermediate language representation, which after that is converted into a
machine code.

Phases of a compiler
Intermediate representations may be:
Assembly language like or
An abstract syntax tree
Types of Intermediate Representation
There are three types of intermediate representation:
a. Syntax Trees
b. Postfix notation
c. Three Address Code

Graphical Representations
A syntax tree depicts (shows) the natural hierarchical
structure of a source programme.

A DAG (Directed Acyclic Graph) gives the same information


but in a more compact (more dense - compress) way
because common sub expressions are identified.
DAG (Directed Acyclic Graph)

A syntax tree and DAG representation of the assignment


statement a:=b*-c + b*-c
Three-Address Code
Three-address code is a sequence (particular order) of
statements of the general form:
x := y op z

where: x, y, z are names, constants or compiler-generated


temporaries, op stands for an operator, such as a fixed or
floating-point arithmetic operator, or a logical operator on
a Boolean-valued data.

Example: An expression of the kind x + y * z will be


translated into the sequence (three address code).
Cont.
t1 := y * z
t2 := x + t1

where t1 and t2 are compiler-generated temporary names.

This solving of complicated arithmetic expressions makes


three-address code desirable (fine or optimal) for target
code generation and optimisation.

The use of names for the intermediate values computed by


a programme allow - three-address code to be easily
rearranged .
Cont.
Three-address code is a linear (extending along a straight)
representation of a syntax tree or a dag.

Unary operator – perform an action with a single operand


Binary operator - perform an action with two operands.

Three address code for expression a = b*-c + b*-c


t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
Cont.
Three-address code for DAG
t1 := -c
t2 := b * t1
t3 := t2 + t2
a := t3

The reason for the term “three-address code” is that each


statement usually contains three addresses, two for the
operands and one for the result.

*
In the implementations of three-address code, a programmer-
defined name is replaced by a pointer to a symbol-table
entry for that name.
Implementations of three-Address
A three-address statement is an abstract form of
intermediate code.
In a compiler, these code (statements) can be implemented
as records with fields for the operator and the operands.
Three such representations are quadruples, triples, and
indirect triples.

Quadruples
A quadruple is a record structure with four fields, which
we call op, argl, arg2, and result.

The op field contains an operator.


Implementations of three-Address
The three-address statement x:= y op z is represented by
placing y in arg1, z in arg2, and x in result.

The contents of fields arg1, arg2, and result are typically


pointers to symbol table entries.

Temporaries must be entered into the symbol table as they


are created.

Quadruples for the statement a := b*-c + b*-c


Implementations of three-Address Statements
Implementations of three-Address
Triples
To avoid entering temporary names into the symbol table, we
might (may) refer to a temporary value by the position of the
statement that computes it.
Then, three-address statements can be represented by records
with only three fields: op, arg1 and arg2.

The fields argl and arg2, for the arguments of op, are either
pointers to the symbol table or pointers into the triple structure
(for temporary values).

Since three fields are used, this intermediate code format is


known as triples.
Implementations of three-Address Statements
Triples for the statement a := b*-c + b*-c

Parenthesized numbers represent pointers into the triple structure,


while symbol-table pointers are represented by the names themselves.

You might also like