Tekkom M4,5
Tekkom M4,5
Tekkom M4,5
1
Materi
Peranan parser
Jenis-jenis parser
Error level dan Error recovery
Context Free grammar (CFG)
Parse tree
Konversi RE ke CFG
Ambiguity
Menghilangkan ambiguity
Left factoring dan left recursive
Menghilangkan left recursion
2
Syntax Analyzer
Syntax Analyzer creates the syntactic structure of the given
source program.
This syntactic structure is mostly a parse tree.
Syntax Analyzer is also known as parser.
The syntax of a programming is described by a context-free
grammar (CFG). We will use BNF (Backus-Naur Form) notation
in the description of CFGs.
The syntax analyzer (parser) checks whether a given source
program satisfies the rules implied by a context-free grammar
or not.
If it satisfies, the parser creates the parse tree of that program.
Otherwise the parser gives the error messages.
A context-free grammar
gives a precise syntactic specification of a programming language.
the design of the grammar is an initial phase of the design of a
compiler.
a grammar can be directly converted into a parser by some tools.
3
Parser
• Parser works on a stream of tokens.
4
Parsers (cont.)
We categorize the parsers into two groups:
1. Top-Down Parser
the parse tree is created top to bottom, starting
from the root.
2. Bottom-Up Parser
the parse is created bottom to top; starting from
the leaves
5
Context-Free Grammars
Inherently recursive structures of a programming language
are defined by a context-free grammar.
Example:
E E+E | E–E | E*E | E/E | -E
E (E)
E id
6
Derivations
E E+E
E+E derives from E
we can replace E by E+E
to able to do this, we have to have a production rule EE+E
in our grammar.
E E+E id+E id+id
A sequence of replacements of non-terminal symbols is called a
derivation of id+id from E.
In general a derivation step is
*
: derives in one step
: derives in zero or more steps
+
: derives in one or more steps
7
CFG - Terminology
L(G) is the language of G (the language generated by
G) which is a set of sentences.
A sentence of L(G) is a string of terminal symbols of
G.
If S is the start symbol of G then
is a sentence of L(G)+iff S where is a string of terminals of
G.
If G is a context-free grammar, L(G) is a context-free
language.
Two grammars are equivalent if they produce the
same* language.
S
- If contains non-terminals, it is called as a sentential form of G.
- If does not contain non-terminals, it is called as a sentence of G.
8
Derivation Example
E -E -(E) -(E+E) -(id+E) -(id+id)
OR
E -E -(E) -(E+E) -(E+id) -(id+id)
9
Left-Most and Right-Most
Derivations
Left-Most Derivation
Right-Most Derivation
E rm -E rm
-(E) rm
-(E+E)rm -(E+id)rm -(id+id)
We will see that the top-down parsers try to find the
left-most derivation of the given source program.
10
Parse Tree
• Inner nodes of a parse tree are non-terminal symbols.
• The leaves of a parse tree are terminal symbols.
E -E E
-(E) E
-(E+E)
E
- E - E - E
( E ) ( E )
E E E + E
- E - E
-(id+E) -(id+id)
( E ) ( E )
E + E E + E
id id id
11
Ambiguity
• A grammar produces more than one parse tree for a sentence is
called as an ambiguous grammar.
E
E E+E id+E id+E*E E + E
id+id*E id+id*id
id E * E
id id
E
E E*E E+E*E id+E*E
id+id*E id+id*id E * E
E + E id
id id
12
Ambiguity (cont.)
For the most parsers, the grammar must be
unambiguous.
unambiguous grammar
unique selection of the parse tree for a
sentence
E2 S1 14
1
Ambiguity (cont.)
• We prefer the second parse tree (else matches with closest if).
• So, we have to disambiguate our grammar to reflect this choice.
15
Ambiguity – Operator
Precedence
Ambiguous grammars (because of ambiguous
operators) can be disambiguated according to
the precedence and associativity rules.
16
Left Recursion
A grammar is left recursive if it has a non-
terminal A such that there is a derivation.
+
A A for some string
Top-down parsing techniques cannot handle
left-recursive grammars.
So, we have to convert our left-recursive
grammar into an equivalent grammar which
is not left-recursive.
The left-recursion may appear in a single step
of the derivation (immediate left-recursion),
or may appear in more than one step of
the derivation.
17
Immediate Left-Recursion
AA| where does not start with A
eliminate immediate left recursion
A A’
A’ A’ | an equivalent grammar
In general,
A A 1 | ... | A m | 1 | ... | n where 1 ... n do not start with A
eliminate immediate left recursion
A 1 A’ | ... | n A’
A’ 1 A’ | ... | m A’ | an equivalent grammar
18
Immediate Left-Recursion --
Example
E E+T | T
T T*F | F
F id | (E)
19
Left-Recursion -- Problem
• A grammar cannot be immediately left-recursive, but it still can be
left-recursive.
• By just eliminating the immediate left-recursion, we may not get
a grammar which is not left-recursive.
S Aa | b
A Sc | d
This grammar is not immediately left-recursive,
but it is still left-recursive.
S Aa Sca or
A Sc Aac causes to a left-recursion
25
Left-Factoring -- Algorithm
For each non-terminal A with two or more alternatives (production
rules) with a common non-empty prefix, let say
convert it into
A A’ | 1 | ... | m
A’ 1 | ... | n
26
Left-Factoring – Example1
A abB | aB | cdg | cdeB | cdfB
A aA’ | cdg | cdeB | cdfB
A’ bB | B
A aA’ | cdA’’
A’ bB | B
A’’ g | eB | fB
27
Left-Factoring – Example2
A ad | a | ab | abc | b
A aA’ | b
A’ d | | b | bc
A aA’ | b
A’ d | | bA’’
A’’ | c
28
Non-Context Free Language
Constructs
There are some language constructions in the
programming languages which are not context-free.
This means that, we cannot write a context-free
grammar for these constructions.