Module 3

Download as pdf or txt
Download as pdf or txt
You are on page 1of 29

CS304 Compiler Design/B.

Tech/S6

MODULE 3

Ms. Anakha Satheesh P , Assistant Professor/CSE 1


CS304 Compiler Design/B.Tech/S6

3.1 BOTTOM UP PARSING


A bottom-up parse starts with the string of terminals itself and builds from the leaves upward,
working backwards to the start symbol by applying the productions in reverse.

Along the way, a bottom-up parser searches for substrings of the working string that match
the right side of some production.

When it finds such a substring, it reduces it, i.e., substitutes the left side non-terminal for the
matching right side. The goal is to reduce all the way up to the start symbol and report a
successful parse.

3.1.1 SHIFT REDUECE PARSING


Shift reduce parsing attempts to construct a parse tree for an input string beginning
at the leaves (bottom) and working up towards the root (top).

This can be considered as the process of “reducing” a string w to the start symbol of
a grammar.

At each reduction step parser searches for substrings of the working string that match
the right side of some production.

When it finds such a substring, it reduces it, i.e., substitutes the left side non-terminal
for the matching right side. The goal is to reduce all the way up to the start symbol
and report a successful parse.

In Shift-reduce parsing a stack holds grammar symbols and an input buffer holds the
rest of the string to be parsed.

As we shall see, the handle always appears at the top of the stack just before it is
identified as the handle.

We use $ to mark the bottom of the stack and also the right end of the input. Initially,
the stack is empty, and the string w is on the input, as follows:

STACK INPUT

$ w$

During a left-to-right scan of the input string, the parser shifts zero or more input
symbols onto the stack, until it is ready to reduce a string β of grammar symbols on
top of the stack.

It then reduces β to the head of the appropriate production. The parser repeats this
cycle until it has detected an error or until the stack contains the start symbol and the
input is empty as follows:

STACK INPUT

$S $

Ms. Anakha Satheesh P , Assistant Professor/CSE 2


CS304 Compiler Design/B.Tech/S6

Upon entering this configuration, the parser halts and announces successful
completion of parsing.

There are actually four possible actions a shift-reduce parser can make:

 Shift
 Reduce
 Accept
 Error

1. Shift: The next input symbol is shifted onto the top of the stack.
2. Reduce: The parser knows the right end of the string to be reduced must be at the
top of the stack. It must then locate the left end of the string within the stack and
decide with what nonterminal to replace the string.
3. Accept: Announce successful completion of parsing.
4. Error: Discover a syntax error has occurred and calls an error recovery routine.

EXAMPLE

Following figure steps through the actions a shift-reduce parser might take in parsing the
input string id1 *id2 according to the expression grammar.

E→ E + T | T

T→ T * F | F

F→ ( E ) | id

Configurations of a shift-reduce parser on input idl*id2

Ms. Anakha Satheesh P , Assistant Professor/CSE 3


CS304 Compiler Design/B.Tech/S6

3.1.1.1 OPERATOR PRECEDENCE PARSING


Bottom-up parsers for a large class of context-free grammars can be easily developed
using operator grammars.

In an operator grammar, no production rule can have:

o at the right side (no production).


o two adjacent non-terminals at the right side.

This property enables the implementation of efficient operator-precedence parsers.

EXAMPLE

Precedence Relations
In operator-precedence parsing, we define three precedence relations between certain
pairs of terminals as follows:

The intention of the precedence relations is to find the handle of a right sentential
form,

<. with marking the left end,

=· appearing in the interior of the handle, and

.> marking the right end.

In our input string $a1a2...an$, we insert the precedence relation between the pairs of
terminals.

Example: Consider the string id + id * id and the grammar is:

E→ E+E | E-E | E*E | E/E | (E) | -E | id

Ms. Anakha Satheesh P , Assistant Professor/CSE 4


CS304 Compiler Design/B.Tech/S6

The corresponding precedence relations is

Then the string with the precedence relations inserted is:

$ <. id .> + <. id .> * <. id .> $

<. is inserted between the leftmost $ and id since <. is the entry in row $ and column
id.

Handle And Handle Pruning


In our input string $a1a2...an$, we insert the precedence relation between the pairs of
terminals.

Example: Consider the string id + id * id and the grammar is:

Handle is a substring that matches with the right side of the production and if the
substring matches with the right side of the production, it is reduced with the non-
terminal on the left side of the production

EXAMPLE

Consider the grammar

E  E+E

E  E*E

E  (E)

E id

And the input string is id1 + id2 * id3

The rightmost derivation is:

EE+E

 E+E*E
 E + E * id3
 E + id2 * id3
 Id1 + id2 * id3

Ms. Anakha Satheesh P , Assistant Professor/CSE 5


CS304 Compiler Design/B.Tech/S6

In the above derivation, the underlined substrings are called handles.

HANDLE PRUNING

The process of obtaining rightmost derivation in reverse order is called “handle pruning”.
(i.e.) if w is a sentence or string of the grammar at hand, then w = n where n is the nth right
sentinel form of some rightmost derivation.

3.1.2 LR PARSING

LR parsing is most efficient method of bottom up parsing which can be used to parse large
class of context free grammar.

The technique is called LR(k) parsing; the “L” is for left to right scanning of input symbol, the
“R” for constructing right most derivation in reverse, and the k for the number of input
symbols of lookahead that are used in making parsing decision.

There are three types of LR parsing,

 SLR (Simple LR)


 CLR (Canonical LR)
 LALR (Lookahead LR)

The schematic form of LR parser is given below.

Requirements of LR parser:
 Input Buffer
 Stack

Ms. Anakha Satheesh P , Assistant Professor/CSE 6


CS304 Compiler Design/B.Tech/S6

 Parsing Table
 LR Parsing Program

Input Buffer
The parsing program reads characters from an input buffer one at a time.

Stack

Parsing program uses a stack to store string of the form s0X1s1X2s2X3 ………Xmsm
where sm is on the top.

Xi is a grammar symbol and Si is the state

Each state symbol summarizes the information contained in the stack below it

Combination of state symbol on stack top and current input symbol are used to index
the parsing table and determine the shift reduce parsing decision

Parsing Table

Consist of 2 parts
1. parsing action function action
2. goto function goto

Action
This function takes as arguments a state i and a terminal a (or $, the input end marker). The
value of ACTION [i, a] can have one of the four forms:
i. Shift j, where j is a state.
ii. Reduce by a grammar production A---> β.
iii. Accept.
iv. Error.
Goto
This function takes a state and grammar symbol as arguments and produces a state. If GOTO
[Ii ,A] = Ij, the GOTO also maps a state i and nonterminal A to state j.

Parsing Program works as follows

It determines sm, the state currently on top of stack and ai the current input symbol

It consults parsing action table entry for action[sm,ai] which can have 4 values:

The function goto takes a state and grammar symbol as arguments and produces a
state.

1. Shift s, where s is a state


2. Reduce by a grammar production A→b
3. Accept
4. Error

Ms. Anakha Satheesh P , Assistant Professor/CSE 7


CS304 Compiler Design/B.Tech/S6

The function goto takes a state and grammar symbol as arguments and produces a
state.

A configuration of LR parser is a pair whose first component is stack contents and


second component is remaining input:

( s0 X1 s1 X2 . . .Xmsm, ai ai+1 . . . an $)

The next move of the parser is determined by reading ai, the current input symbol
and sm, state on stack top and then consulting parsing action table entry action[sm,ai].

The configuration resulting after each of four types of moves are as follows:

1. If action [sm, ai] = shift s, the parser executes a shift move, entering the
configuration

( s0 X1 s1 X2 . . .Xmsm, ai ai+1 . . . an $)

Here the parser has shifted both the current input symbol ai and the next state s,
which is given in action [sm, ai], onto the slack; ai+1 becomes the current input
symbol.

2. If action [sm, ai] =reduce A  , then the parser executes a reduce move, entering
the configuration

( s0 X1 s1 X2 s2 . . . Xm-r sm-r A s, ai ai+1 . . . an $)

where s = goto [sm-r , A] and r is the length of  , the right side of the production.
Here the parser first popped 2r symbols off the stack (r state symbols and r
grammar symbols), exposing state sm-r. The parser then pushed both A, the left
side of the production, and s, the entry for goto[sm-r , A], onto the stack. The
current input symbol is not changed in a reduce move.

3. If action [sm, ai] = accept, parsing is completed.

4. If action [sm, ai] = error, the parser has discovered an error and calls an error
recovery routine.

LR Parsing Algorithm
INPUT : Input string w, LR-Parsing table with functions ACTION and GOTO for a grammar
G

OUTPUT : If w is in L(G), a bottom-up parse for w, otherwise, an error indication.

Ms. Anakha Satheesh P , Assistant Professor/CSE 8


CS304 Compiler Design/B.Tech/S6

METHOD : Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the
input buffer. The parser then executes the following program until an accept or error action
is encountered.

set ip to point to the first symbol of w$

repeat forever begin

Iet s be the state on top of the slack and

a the symbol pointed to by ip;

if action [s, a]= shift s' then begin

push a then s' on lop of the stack;

advance ip to the next input symbol

end

else if action [s,a] = reduce A then begin

pop 2*|β| symbols off the stack ;

let s' be the state now on top of the stack;

push A then goto [s', A ] on top of the stack;

output the production A

end

else if action [s,a] = accept then

return

else error();

end

Different types of LR Parsers


All LR parsers use the same parsing algorithm but the difference is only in terms of the
construction of parsing tables. There are three widely used algorithms available for
constructing an LR parsing Tables. Different types of LR parsers are:

SLR(l) - Simple LR

 Works on smallest class of grammar.


 Few number of states, hence very small table.
 Simple and fast construction.

LR( 1) - LR parser

 Also called as Canonical LR parser.


 Works on complete set of LR(1) Grammar.

Ms. Anakha Satheesh P , Assistant Professor/CSE 9


CS304 Compiler Design/B.Tech/S6

 Generates large table and large number of states.


 Slow construction.

LALR(1) - Look ahead LR parser

 Works on intermediate size of grammar.


 Number of states are same as in SLR(1).

Reasons for attractiveness of LR parser


LR parsers can handle a large class of context-free grammars.

The LR parsing method is a most general non-back tracking shift-reduce parsing


method.

An LR parser can detect the syntax errors as soon as they can occur.

LR grammars can describe more languages than LL grammars.

Drawbacks of LR parsers
It is too much work to construct LR parser by hand. It needs an automated parser
generator.

If the grammar contains ambiguities or other constructs then it is difficult to parse in


a left-to-right scan of the input.

Construction of LR Parsing Table


3 techniques for constructing LR parsing table for a grammar :-

1. SLR
2. CLR
3. LALR

Simple LR (SLR)

Easy to implement

Least powerful

Canonical LR

Most powerful

Most expensive

LookAhead LR(LALR)

Intermediate in power and cost between other 2

Ms. Anakha Satheesh P , Assistant Professor/CSE 10


CS304 Compiler Design/B.Tech/S6

Limitations Of SLR Parser

For an SLR parser, if we are in a state with the item A ->  ., then we reduce by A ->  iff the
next input symbol ‘a’ belongs to Follow(A).

This is actually a weak rule and can lead to erroneous results and/or shift/reduce conflicts.

EXAMPLE

Consider A -> 

State i: A ->  .

State j: A ->  .

String 1:   a

String 2: β  a

Ms. Anakha Satheesh P , Assistant Professor/CSE 11


CS304 Compiler Design/B.Tech/S6

3.1.2.1 CONSTRUCTING SLR PARSING TABLES


LR(0) Item
LR parser using SLR parsing table is called an SLR parser.

A grammar for which an SLR parser can be constructed is an SLR grammar.

An LR(0) item (item) of a grammar G is a production of G with a dot at the some


position of the right side.

Ex: for the productions A → aBb

Possible LR(0) Items are:

A → .aBb

A → a.Bb

A → aB.b

A → aBb.

A production rule of the form A → yields only one item A → .

Intuitively, an item shows how much of a production we have seen till the current
point in the parsing procedure.

A collection of sets of LR(0) items (the canonical LR(0) collection) is the basis for
constructing SLR parsers.

To construct the canonical LR(0) collection for a grammar we define:

 Augmented grammar

 Two functions: CLOSURE and GOTO

Augmented grammar
G’ is the augmented grammar of G with a new production rule S’→S where S’ is the
new starting symbol i.e,

G’= G  {S’ → S} where S is the start state of G.


The start state of G’ = S’.

This is done to signal to the parser when the parsing should stop to announce
acceptance of input.

Kernel and Non-Kernel items


Kernel items include the set of items that do not have the dot at leftmost end.

S’→ .S is an exception and is considered to be a kernel item.

Non-kernel items are the items which have the dot at leftmost end.

Ms. Anakha Satheesh P , Assistant Professor/CSE 12


CS304 Compiler Design/B.Tech/S6

Sets of items are formed by taking the closure of a set of kernel items.

The Closure Operation


If I is a set of LR(0) items for a grammar G, then closure(I) is the set of LR(0) items
constructed from I by the two rules:

1. Initially, every LR(0) item in I is added to closure(I).

2. If A →  .B is in closure(I) and B→ is a production rule of G; then B→.


 will be in the closure(I).

We will apply this rule until no more new LR(0) items can be added to closure(I).

Computation of Closure
function closure ( I )

begin

J := I;

repeat

for each item A →  .B in J and each production

B→ of G such that B→. is not in J do

add B→. to J

until no more items can be added to J

return J

end

Ms. Anakha Satheesh P , Assistant Professor/CSE 13


CS304 Compiler Design/B.Tech/S6

Goto Operation
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then
goto(I,X) is defined as follows:

 If A →  . X in I then every item in closure({A →  . X }) will be in goto(I,X).


 If I is the set of items that are valid for some viable prefix , then goto(I,X) is
the set of items that are valid for the viable prefix X.

EXAMPLE

I ={ E’ → E., E → E.+T}

goto(I,+) = { E → E+.T

T →.T*F

T →.F

F →.(E)

F →.id }

Construction of Canonical LR(0) Collection


To create the SLR parsing tables for a grammar G, we will create the canonical LR(0)
collection of the grammar G’.

ALGORITHM

Procedure items( G’ )

begin

C := { closure({S’→.S}) }

repeat for each set of items I in C and each grammar symbol X

if goto(I,X) is not empty and not in C

add goto(I,X) to C

until no more set of LR(0) items can be added to C.

\end

goto function is a DFA on the sets in C.

Ms. Anakha Satheesh P , Assistant Professor/CSE 14


CS304 Compiler Design/B.Tech/S6

I0 : I6: goto(I1, +) I5: goto(I4, id) I4 : goto(I7, ( )


E’ → .E E → E+.T F → id. F → (.E)

E → .E+T T → .T*F E → .E+T

E → .T T → .F I9:goto(I6, T) E → .T

T → .T*F F → .(E) E → E+T. T → .T*F

T → .F F → .id T → T.*F T → .F

F → .(E) F → .(E)

F → .id I7: goto(I2 , *) F → .id


I3: goto(I6, F)
T → T*.F
T → F.
I1: goto(I0, E) F → .(E) I11 : goto(I8 , ) )
E’ → E. F → .id F → (E).
I4 : goto(I6, ( )
E → E.+T
F → (.E)
I8: goto( I4 , E)
E → .E+T I6: goto(I8, +)
I2 :goto(I0, T) F → (E.)
E → .T E → E+.T
E → T. E → E.+T
T → .T*F T → .T*F
T → T.*F
T → .F T → .F
I2 :goto(I4, T)
F → .(E) F → .(E)
I3: goto(I0, F) E → T.
F → .id F → .id
T → F. T → T.*F

I3: goto(I4, F) I5: goto(I6, id) I7: goto(I9, *)


I4 : goto(I0, ( )
T → F. F → id. T → T*.F
F → (.E)
F → .(E)
E → .E+T
I4 : goto(I0, ( ) I10: goto(I7, F) F → .id
E → .T
F → (.E) T → T*F.
T → .T*F
E → .E+T
T → .F
E → .T I5: goto(I7, id)
F → .(E)
T → .T*F F → id.
F → .id
T → .F

I5: goto(I0, id) F → .(E)

F → id. F → .id

Ms. Anakha Satheesh P , Assistant Professor/CSE 15


CS304 Compiler Design/B.Tech/S6

Constructing SLR Parsing Table


INPUT: An augmented grammar G'.

OUTPUT: The SLR parsing table functions action and goto for G' .

METHOD:

1. Construct C ={I0, I1,…..In}, the collection of sets of LR(0) items for G'.
2. State i is constructed from Ii . The parsing actions for state i are determined as follows:

Ms. Anakha Satheesh P , Assistant Professor/CSE 16


CS304 Compiler Design/B.Tech/S6

If any conflicting actions are generated by the above rules, we say the grammar is not
SLR (1). The algorithm fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all nonterminals A using the rule:
if GOTO(Ii , A)=Ij then GOTO[I, A]=j .
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items containing
[S’  S].

Parsing Tables of Expression Grammar

Blank entries are error entries.

Number the given Grammar


E’ → E
E →E+T --------(1)
E→T --------(2)
T →T*F --------(3)
T →F --------(4)
F →(E) --------(5)
F →id --------(6)

Ms. Anakha Satheesh P , Assistant Professor/CSE 17


CS304 Compiler Design/B.Tech/S6

I1: E’ → E. (1, $) ====> ACCEPT

I2 : E → T. (2, $), (2, )), (2, +) ====> r2

I3: T → F. (3, $), (3, +), (3,)), (3, *) ====> r4

I5: F → id. (5, *), (5, +), (5, )), (5, $) ====> r6

I9: E → E+T. (9, $), (9, )), (9, +) ====> r1

I10: T → T*F. (10, $), (10, +), (10, )), (10, *) ====> r3

(11, *), (11, +), (11, )), (11, $) ====> r5


I11: F → (E).

Shift/Reduce and Reduce/Reduce conflicts


If a state does not know whether it will make a shift operation or reduction for a
terminal, we say that there is a shift/reduce conflict.

If a state does not know whether it will make a reduction operation using the
production rule i or j for a terminal, we say that there is a reduce/reduce conflict.

If the SLR parsing table of a grammar G has a conflict, we say that that grammar is
not SLR grammar.

Stack Implementation
Check whether the given input is valid or not?

Ms. Anakha Satheesh P , Assistant Professor/CSE 18


CS304 Compiler Design/B.Tech/S6

EXAMPLE

SLR ( 1 ) GRAMMAR

S→E

E→E+T|T

T→T*F|F

F → id

AUGMENTED GRAMMAR

S’ → S

S→E

E→E+T|T

T→T*F|F

F → id

CANONICAL COLLECTION OF LR(0) ITEMS

I0 I3= Go to (I0, F) I7= Go to (I5, T)


S’→ S T → F• E → E + T•

S → •E

E → •E + T I4= Go to (I0, id) I3 = Go to (I5, F)


E → •T F → id• T → F•
T → •T * F

T → •F I5= Go to (I1 , +)
I4= Go to (I5, id)
F → •id E → E +•T
F → id•
T → •T * F
I1= Go to (I0, E) T → •F
I8= Go to (I6, F)
S` → E• F → •id
T → T * F•
E → E• + T

I6= Go to (I2, *)
I4= Go to (I6, id)
I2= Go to (I0, T) T → T * •F
F → id•
E → T•T F → •id
T• → * F

Ms. Anakha Satheesh P , Assistant Professor/CSE 19


CS304 Compiler Design/B.Tech/S6

DRAW CANONICAL COLLECTION OF LR(0) ITEMS

NUMBER THE PRODUCTIONS

1. S→E
2. E→E+T|T
3. T→T*F|F
4. F → id

SLR(1) TABLE

FOLLOW (S) = {$}


FOLLOW (E)= {+, $}
FOLLOW (T) = {*, +, $}
FOLLOW (F) = {*, +, $}
 I1 contains the final item which drives S → E• and follow (S) = {$}, so action
{I1, $} = Accept
 I2 contains the final item which drives E → T• and follow (E) = {+, $}, so action
{I2, +} = R2, action {I2, $} = R2
 I3 contains the final item which drives T → F• and follow (T) = {+, *, $}, so
action {I3, +} = R4, action {I3, *} = R4, action {I3, $} = R4

Ms. Anakha Satheesh P , Assistant Professor/CSE 20


CS304 Compiler Design/B.Tech/S6

 I4 contains the final item which drives F → id• and follow (F) = {+, *, $}, so
action {I4, +} = R5, action {I4, *} = R5, action {I4, $} = R5
 I7 contains the final item which drives E → E + T• and follow (E) = {+, $}, so
action {I7, +} = R1, action {I7, $} = R1
 I8 contains the final item which drives T → T * F• and follow (T) = {+, *, $}, so
action {I8, +} = R3, action {I8, *} = R3, action {I8, $} = R3.

EXAMPLE

Every SLR (1) grammar is unambiguous, but there are many unambiguous grammars that
are not SLR(1). Consider the grammar with productions

SL=R

SR

L*R

L id

RL

AUGMENTED GRAMMAR
S’  S

SL=R

SR

L*R

L id

RL

Ms. Anakha Satheesh P , Assistant Professor/CSE 21


CS304 Compiler Design/B.Tech/S6

CANONICAL COLLECTION OF LR(0) ITEMS

I0 I5= Go to (I0, id) I9= Go to (I6 , R)


S’ • S L  id• S  L = R•

S  •L = R

S  •R I6= Go to (I2 , =) I8= Go to (I6 , L)


L  •* R S  L = •R RL•
L  •id R•L
R•L L  •* R I7= Go to (I6 , L)
L  •id L  * R•
I1= Go to (I0, S)
S’  S • I7= Go to (I4 , R) I4= Go to (I6 , *)
L  * R• L  *• R
I2= Go to (I0, L) R•L
S  L •= R I8= Go to (I4 , L) L  •* R
R  L• R  L• L  •id

I3= Go to (I0, R) I4= Go to (I4 , *) I5= Go to (I6 , id)


S  R• L  * •R L  id•

R•L
I4= Go to (I0, *) L  •* R
L  * •R L  •id
R•L

L  •* R I5= Go to (I4 , id)


L  •id L  id•

NUMBER THE PRODUCTIONS

1. SL=R
2. SR
3. L*R
4. L id
5. RL

Ms. Anakha Satheesh P , Assistant Professor/CSE 22


CS304 Compiler Design/B.Tech/S6

PARSING TABLE

I1= S’  S •
I2= R  L• FOLLOW (S)={$}

I3= S  R• FOLLOW (L)= {=,$}

I5= L  id• FOLLOW (R)= {=,$}

I7= L  * R•
I8= R  L•
I9= S  L = R•
(2, =), (2, $) =====> r5

(3, $) =====> r2

(5, =), (5, $) =====> r4

(7, =), (7, $) =====> r3

(8, $), (8, =) =====> r5

(9, $) =====> r1

STATE ACTION GOTO

id * = $ R L S

0 s5 s4 3 2 1

1 ACCEPT

2 s6/ r5 r5

3 r2

4 s5 s4 7 8

5 r4 r4

6 s5 s4 9 8/7

7 r3 r3

8 r5 r5

9 r1

Here there is a shift reduce conflict in (2,=) so grammar is not SLR(1)

Ms. Anakha Satheesh P , Assistant Professor/CSE 23


CS304 Compiler Design/B.Tech/S6

3.1.2.2 CONSTRUCTING CANONICAL LR(CLR)


PARSING TABLES

Ms. Anakha Satheesh P , Assistant Professor/CSE 24


CS304 Compiler Design/B.Tech/S6

Canonical Collection of Sets of LR(1) Items


The construction of the canonical collection of the sets of LR(1) items are similar to the
construction of the canonical collection of the sets of LR(0) items, except that closure
and goto operations work a little bit different.

Construction of CLR (canonical LR) Parsing Tables


Algorithm: Construction of canonical- LR parsing tables.

INPUT: An augmented grammar G’.

OUTPUT: the canonical-LR parsing table functions ACTION AND GOTO for G’.

METHOD:

1. Construct C’ = { I0, I1, …….., In}, the collection of sets of LR(1) items for
G’.
2. State i of the parser is constructed from Ii. The parsing action for state
i is determined as follows.

(a) If [ A  .a,b] is in Ii, and GOTO (Ii, a) = Ij, then set ACTION [i, a]
To “shift j”. Here a must be a terminal.
(b) If [A  ., a] is in Ii, A  S’, then set ACTION[ i, a ] to “reduce A  .”
(c) If [ S’  S., $ ] is in Ii, then set ACTION[ i, $ ] to “accept”.

If any conflicting actions result from the above rules, we say the grammar
is not LR(1). The algorithm falls to produce a parser in this case.

3. The goto transitions for state i are constructed for all non terminals A
using the rule: if GOTO( Ii, A) = Ij, then GOTO[ i, A ]= j.
4. All entries not defined by rules (2) and (3) are made “error”.
5. The initial state of the parser is the one constructed from the set of items
containing [S’  .S, $ ].

Ms. Anakha Satheesh P , Assistant Professor/CSE 25


CS304 Compiler Design/B.Tech/S6

EXAMPLE

Construct CLR parsing table for the flowing grammar

S CC

C cC| d

AUGMENTED GRAMMAR
S’ S

S CC

C cC

C d

CANONICAL COLLECTION OF LR(1) ITEMS

I0 I6 = GOTO (I2, c) I8 = GOTO (I3, C)


S’  •S , $ C  c•C, $ C  c C•, c | d
S •CC, $ C  •c C ,$
C •cC, c | d C  •d , $ I3 = GOTO (I3, c)
C •d, c | d C  c• C , c | d
I7 = GOTO (I2, d) C •cC, c | d
I1 = GOTO (I0, S) C  d• , $ C •d, c | d
S’  S•, $
I4 = GOTO (I0, d) I4 = GOTO (I3, d)
I2 = GOTO (I0, C) C  d• , c | d C d •, c | d
S  C•C, $
C  •cC, $ I5 = GOTO (I2, C) I9 = GOTO (I6, C)
C  •d, $ S  CC•, $ C  c C• , $

I6 = GOTO (I2, c) I6 = GOTO (I6, c)


I3 = GOTO (I0, c) C  c•C, $ C  c• C ,$
C  c•C, c | d C  •c C ,$ C •cC, $
C  •c C , c | d C  •d , $ C •d, $
C  •d, c | d
I7 = GOTO (I6, d)
I4 = GOTO (I0, d) I7 = GOTO (I2, d)
C d•, $
C  d• , c | d C  d• , $

I5 = GOTO (I2, C)
S  CC•, $

Ms. Anakha Satheesh P , Assistant Professor/CSE 26


CS304 Compiler Design/B.Tech/S6

GOTO GRAPH FOR THE GRAMMAR

NUMBER THE PRODUCTIONS

1. S CC
2. C cC
3. C d

PARSING TABLE

I1 = S’  S• , $
I4 = C  d• , c | d
I5 = S  CC•, $
I7 = C  d• , $
I8 = C  c C•, c | d
I9 = C  c C• , $

Ms. Anakha Satheesh P , Assistant Professor/CSE 27


CS304 Compiler Design/B.Tech/S6

3.1.2.3 CONSTRUCTING LALR PARSING TABLE

LALR stands for Lookahead LR.

LALR parsers are often used in practice because LALR parsing tables are smaller than
LR(1) parsing tables.

The number of states in SLR and LALR parsing tables for a grammar G are equal.

But LALR parsers recognize more grammars than SLR parsers.

yacc creates a LALR parser for the given grammar.

A state of LALR parser will be again a set of LR(1) items.

Creating LALR Parsing Tables


Canonical LR(1) Parser ➔ LALR Parser

The Core of A Set of LR(1) Items


The core of a set of LR(1) items is the set of its first component

We need to do following update in the parsing table.

Ms. Anakha Satheesh P , Assistant Professor/CSE 28


CS304 Compiler Design/B.Tech/S6

INPUT: An augmented grammar G’.

OUTPUT: The LALR parsing – table functions ACTION and GOTO for G’

METHOD:

1. Construct C = { I0, I1, ......, In }, the collection of sets of LR(1) items.

2. For each core present among the set of LR(1) items, find all sets
having that core, and replace these sets by their union.

3. Let C’ = {J0, J1, ........, Jm}, the collection of sets of LR(1) items. The
parsing action for state i are constructed from Ji in the same manner as
in Algorithm. If there is a parsing action conflict, the algorithm fails
to produce a parser, and the grammar is said not to be LALR(1)

4. The GOTO table is constructed as follows. if J is the union of one or


more sets of LR(1) items, that is, J = I1  I2  ...  Ik, then the
I1, I2, ....., Ik all have the same core. Let K be the union of all sets of
items having the same core as GOTO( I1,X ). Then GOTO (J, X) = K.

**********

Ms. Anakha Satheesh P , Assistant Professor/CSE 29

You might also like