Chapter 3 - Context Free Languages
Chapter 3 - Context Free Languages
Chapter 3 - Context Free Languages
1
Context Free Languages
• Context Free languages are languages generated by Context Free
Grammar.
G = {V, ∑, P, S} 2
Cont…
• S is the start symbol that is used to derive the string. We can derive the
string by repeatedly replacing a non-terminal by the right-hand side of the
production until all non-terminals have been replaced by terminal
symbols. 3
Cont…
P = { S → aSb , S → aSb , S → ɛ }
S → aSb
S → a aSb b (substitute S by aSb),
S → aa aSb bb (substitute S by aSb), if we substitute S by ɛ, S = {aaabbb}
S → aaa aSb bbb, (substitute S by aSb), if we substitute S by ɛ, S =
{aaaabbbb}
Generally, the Above production rules generate the strings having an equal
number of a’s and b’s.
• Context free grammar generates the Context free languages. Let discuss
some Closure Properties of CFG.
9
How context-free language named
• Context-free languages are so named because they can be generated by
context-free grammars (CFGs), which are a type of formal grammar that
generate strings of symbols without taking into account any context or
surrounding symbols.
• In other words, the rules of a CFG depend only on the nonterminal symbols
and do not depend on the context in which they appear. This makes CFGs
simpler and more flexible than regular grammars and context-sensetive
languages, which have more complex rules that depend on the context of
the symbols. 10
Sentential form
• In context-free grammar (CFG), a sentential form is a sequence of
symbols that can be derived from the start symbol of the grammar by
applying a series of production rules.
11
Cont…
• For example, in the CFG
S -> NP VP,
NP -> Det Noun,
VP -> Verb,
Det -> "the",
Noun -> "cat",
Verb -> "sits",
• The sentential form can be derived by applying the production rules in the
following order: S -> NP VP -> Det Noun VP -> the cat VP -> the cat Verb
-> the cat sits. The sentential form derived from the above production
rule is: "the cat sits". 12
Cont…
Example 2:
S -> NP VP
NP -> Pronoun
VP -> Verb NP
Pronoun -> "I"
Verb -> "saw"
NP -> Det Noun
Det -> "a"
Noun -> "dog"
• Sentential form: "I saw a dog"
13
Derivation tree or parse tree
• The process of deriving or generating a string from a given grammar is
called derivation.
14
Cont…
• In the Derivation Tree, the deepest sub-tree from leftmost is traversed first by
following the rule of In-order traversal. In this way, the original input string is
obtained but
15
Types derivation trees
• There are two types of derivation trees:
• a tree obtained by applying the production rule to the rightmost variable or no-
terminal in each step (getting a string by expanding the rightmost non-terminal at
16
each step)
Examples
• Consider a string W = xxxyyxyyyx using the grammar
19
Cont…
Now we have finished the move and we got xxxyy. But there are remaining
symbols xyyyx. To represent these symbols we have to go back to the previous
step.
20
Cont…
21
Cont…
22
Cont…
2. Rightmost derivation
27
Cont…
28
Cont…
29
Cont…
30
Cont…
31
Cont…
32
Cont…
33
Cont…
34
Cont…
S → bB | aA
A → b | bS | aAA
B → a |aS | bBB
and draw the derivation tree (both leftmost and rightmost derivation trees)
36
Simplification of CFG
• In CFG(Context Free Grammar), all production rules and symbols are not
necessary for the derivation of a string.
1. Reduction of CFG
Phase 1: Derivation of an equivalent Grammar G' from the Context Free Grammar
G, such that each variables (non-terminal) derives some terminal symbol.
Derivation Procedure:
1. Include all symbols Wi, that derives some terminal and initialize i = 1
Phase 2: Derivation of an equivalent Grammar G'' from the Context Free Grammar
G', such that each symbol appears as sentential.
Derivation Procedure:
1. Include Start symbol in Yi, that derives some terminal and initialize i = 1
2. Include symbols Yi+1, which can be derived from Yi, and include all production
rules that have been applied.
39
Cont…
Phase 1:
T = {a, c, e}, V = {S, A, C, B, E}, S = {S}
Step 1: include all symbols that can derive terminal symbols
W1 = {A, C, E}, S and B don’t drive any terminal symbols
Step 2: include all symbols that can derive the symbols in W1
W2 = {A, C, E, S}, S (included) can derive symbols A and C
Step 3: include all symbols that can derive the symbols in W2
W3 = {A, C, E, S} 40
Cont…
W2 = {A, C, E, S}
W3 = {A, C, E, S}
Now W3 = W2, satisfies Wi+1 = Wi
G' = ({A, C, E, S}, {a, c, e}, P, {S})
{A, C, E, S}: non-terminal symbols found in W3
{a, c, e}: are terminal symbols derived by non-terminal symbols in W3, A derives
‘a’, C derives ‘c’, E derives ‘e’, and S derives no terminal symbol (checked from the
production rule given.
Step 4: Production rule: we have to start from Start symbol
P1: S → AC, A → a, C → c, E → aA|e
In S → AC (why B is not included, because B is not found in G' .
Let’s proceed to phase 2
41
Cont…
Phase 2:
Step 1: Start from Start symbol
Y1 = {S}
Step 2: include all symbols that can be derived from Y1(use a production P1 rule of
G ’)
Y2 = {S, A, C}, A and C can be derived from S in P1
Step 3: include all symbols that can be derived from Y2
Y3 = {S, A, C, a, c}, a derived by A, and c derived by C
Step 4: include all symbols that can be derived from Y3
Y4 = {S, A, C, a, c}
Now Y4 = Y3 , now Yi+1 = Yi is satisfied.
42
Cont…
43
2. Removal of Unit productions
• Any production rule of the form A → B, where A, B ∈ V (non-terminal symbols)
is called Unit production
44
Cont…
Example : Remove a Unit production from the grammar G having the production
rule as:
P: S → XY, X → a, Y → Z | b, Z → M, M → N, N → a
Unit productions (in the from form if A → B) are Y → Z , Z → M, and M → N
Steps to remove these unit productions:
1. Since N → a, add M → a to the production (because M → N)
P: S → XY, X → a, Y → Z | b, Z → M, M → a, N → a
2. Since M → a, add Z → a to the production (because Z → M)
P: S → XY, X → a, Y → Z | b, Z → a, M → a, N → a
3. Since Z → a, add Y → a to the production (because Y → Z)
P: S → XY, X → a, Y → a | b, Z → a, M → a, N → a
45
Cont…
3. Since Z → a, add Y → a to the production (because Y → Z)
P: S → XY, X → a, Y → a | b, Z → a, M → a, N → a
• In the production rule S → XY, X gives a, and Y gives a | b. but Z → a, M → a, N
→ a can’t be reached from the start symbol. So we have to remove them.
• The new production rule after the unit production is removed will be:
P: S → XY, X → a, Y → a | b
• If a symbol does not exist on the right-hand side of the production rule and does
not participate in the derivation of any string, that symbol is considered a useless
symbol. Similarly, if a variable does not participate in the derivation of any
string, it is useless. 46
3. Removal of Null productions
• In CFG, a non-terminal symbol A is a nullable if there is a production in the form
A → ɛ or there is a derivation that starts at A and leads to ɛ. Like (A→ … → ɛ)
47
Cont…
Example: Remove the null production from the grammar G, having the production
rule as follows:
P: S → ABAC, A → aA | ɛ, B → bB | ɛ, C → C
• The null production in the form of (Non-terminal → ɛ), are A → ɛ, B → ɛ. We
have to remove these null productions.
• To remove A → ɛ:
1. Replace A by ɛ in the production rule
S → ABAC, A → aA | ɛ
S → ABAC | ABC | BAC | BC (replacing A by ɛ in different cases)
A→ a
The new production rule is:
S → ABAC | ABC | BAC | BC , A → aA | a, B → bB | ɛ, C → C
48
Cont…
• To remove B → ɛ:
2. Replace A by ɛ in the production rule
The production rule is:
S → ABAC | ABC | BAC | BC , A → aA | a, B → bB | ɛ, C → C
Replace B by ɛ
S → AAC | AC | C, B → b
The new production rule is:
S → AAC | AC | C, A → aA | a B → bB | b, C → C
As you ca see all the null productions are removed
49
Parsing and ambiguity
• Ambiguity refers to a situation where a given sentence or string of
symbols can be generated by the grammar in more than one way.
• In other words, there are multiple possible parse trees or derivations that
can lead to the same sentence. This can lead to confusion or uncertainty
about the intended meaning of the sentence.
50
Cont…
• Ambiguity can arise in CFGs when there are multiple production rules that
can be applied to the same nonterminal symbol, or when there are
overlapping or conflicting rules.
51
Ambiguous grammar
• A grammar or a Context-Free Grammar(CFG) is said to be ambiguous if
there exists more than one leftmost derivation(LMDT) or more than one
rightmost derivation(RMDT), or more than one parse tree for a given input
string.
52
Cont…
• Since Ambiguous Grammar can produce two Parse trees for the same
expression, it's often confusing for a compiler to find out which one among
all available Parse Trees is the correct one according to the context of the
work.
53
Cont…
55
Unambiguous grammar
• A grammar is unambiguous if there is no ambiguity in it. This means it
should contain only one leftmost derivation (LMD), one rightmost
derivation (RMD), and one parse tree for the given input string.
59