Unit-III Context Free Grammar and Languages
Introduction to Grammar:
Types of grammar:
Type 0
Type 1
Type 2
Type 3
Type 0 Grammar:
Type 1 Grammar:
This grammar has productions of the form α → β with the restriction that the length of β is greater than or equal to the length of α, i.e. |β| ≥ |α|.
Example:
S → aAB
B→b
A → aBcb
AB → bB
Type 2 Grammar:
This grammar contains productions of the form α → β, where the left side α of every production is a single non-terminal (variable). Such grammars are called context free grammars.
Example:
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → a
6. I → b
7. I → Ia
8. I → Ib
9. I → I0
10. I → I1
Here the grammar for expressions is stated formally as G = ({E, I}, T, P, E), where T is the set of terminal symbols {+, *, (, ), a, b, 0, 1}, P is the set of productions listed above, and E is the start symbol (a non-terminal).
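As an illustration (not part of the original notes), the grammar G = ({E, I}, T, P, E) above can be written down directly as a data structure; the names grammar, terminals and start_symbol below are illustrative Python identifiers, and the assertion simply checks the Type 2 restriction stated above.

grammar = {
    "E": [("I",), ("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")")],
    "I": [("a",), ("b",), ("I", "a"), ("I", "b"), ("I", "0"), ("I", "1")],
}
terminals = {"+", "*", "(", ")", "a", "b", "0", "1"}
start_symbol = "E"

# The Type 2 (context free) restriction: every production head is a single
# non-terminal (a key of `grammar`), and every symbol in a body is either a
# non-terminal or a terminal.
for head, bodies in grammar.items():
    for body in bodies:
        assert all(s in grammar or s in terminals for s in body)
print(len(grammar["E"]) + len(grammar["I"]), "productions, start symbol", start_symbol)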
1. L = {0^n | n = 1}
S → 0A | 0
A→ε
(or)
S→A
A→0
2. L = {0^n | n ≥ 1}
S → 0A | 0
A → 0A | ε
3. L = {0*}
S → 0A | ε
A → 0A | ε
4. L = {0^n 1^n | n = 1}
S → AB
A → 0
B → 1
5. L = {0^n 1^n | n ≥ 1}
S → 0S1 | 01
6. L = {0^n 1 0^n | n = 1}
S → A1A
A → 0
7. L = {0^n 1 0^n | n ≥ 1}
S → 0S0 | 010
8. L = {a^i b^j c^k | i ≠ j or j ≠ k}
S → XC | AY
X → aXb | aA | bB
Y → bYc | bB | cC
A → aA | ε
B → bB | ε
C → cC | ε
9. L = {a^i b^j | j > i}
S → aSb | C
C → bC | b
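To see how such grammars generate their languages mechanically, here is a small sketch (the function name generate and the length bound are illustrative) that expands the leftmost non-terminal of every sentential form for the grammar of example 2, S → 0A | 0, A → 0A | ε, and collects the terminal strings it reaches:

from collections import deque

PRODUCTIONS = {"S": ["0A", "0"], "A": ["0A", ""]}   # "" stands for epsilon

def generate(start="S", max_len=5):
    """Collect the terminal strings of length <= max_len derivable from `start`."""
    results, seen = set(), {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, c in enumerate(form) if c in PRODUCTIONS), None)
        if pos is None:                      # no non-terminal left: a terminal string
            results.add(form)
            continue
        for body in PRODUCTIONS[form[pos]]:  # expand the leftmost non-terminal
            new = form[:pos] + body + form[pos + 1:]
            if len(new) <= max_len + 1 and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=len)

print(generate())   # ['0', '00', '000', '0000', '00000']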
Derivations using grammar:
o Recursive inference
o Derivations
   - Leftmost derivation
   - Rightmost derivation
Recursive inference:
We apply the productions of a CFG to infer that certain strings are in the language
of a certain variable.
We take strings known to be in the language of each of the variables in the body, concatenate them in the proper order with any terminals appearing in the body, and infer that the resulting string is in the language of the variable at the head. This procedure is called recursive inference.
Example:
Now let us see how the string a*(a+b00) is inferred by the above grammar.
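The worked inference can be sketched as follows (an illustrative Python rendering, not the notes' original table): each call records that some string belongs to the language of I or E together with the production that justifies it, building up to a*(a+b00).

inferred = []                                     # (variable, string, rule used)

def infer(var, string, rule):
    inferred.append((var, string, rule))
    return string

# Language of I: identifiers built with productions 5-10.
a   = infer("I", "a", "I -> a")                  # 'a' is in L(I)
b   = infer("I", "b", "I -> b")                  # 'b' is in L(I)
b0  = infer("I", b + "0", "I -> I0")             # 'b0' is in L(I)
b00 = infer("I", b0 + "0", "I -> I0")            # 'b00' is in L(I)

# Language of E: expressions built with productions 1-4.
e1 = infer("E", a, "E -> I")                     # 'a' is in L(E)
e2 = infer("E", b00, "E -> I")                   # 'b00' is in L(E)
e3 = infer("E", e1 + "+" + e2, "E -> E + E")     # 'a+b00' is in L(E)
e4 = infer("E", "(" + e3 + ")", "E -> (E)")      # '(a+b00)' is in L(E)
e5 = infer("E", e1 + "*" + e4, "E -> E * E")     # 'a*(a+b00)' is in L(E)

for step, (var, s, rule) in enumerate(inferred, 1):
    print(step, var, s, rule)
assert e5 == "a*(a+b00)"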
Derivation:
Definition:
A derivation is a sequence of steps in which, starting from the start symbol, some non-terminal in the current sentential form is replaced at each step by the body of one of its productions, until a string of terminals is obtained.
Example:
Derive the string a*b+a using top down derivation from the grammar G for
algebraic expression.
Solution:
E ⇒ E * E
  ⇒ E * E + E
  ⇒ a * E + E
  ⇒ a * b + E
  ⇒ a * b + a
The two types of derivation are:
o Leftmost derivation
o Rightmost derivation
Leftmost derivation:
A leftmost derivation is one in which, at every step, the leftmost non-terminal in the sentential form is replaced by the right-hand side of one of its productions. A single leftmost step is written ⇒lm, and one or more leftmost steps are written ⇒*lm.
Example:
String a*b+a
E ⇒lm E + E ⇒lm E * E + E ⇒lm a * E + E ⇒lm a * b + E ⇒lm a * b + a
Rightmost derivation:
A rightmost derivation is one in which, at every step, the rightmost non-terminal in the sentential form is replaced by the right-hand side of one of its productions. A single rightmost step is written ⇒rm, and one or more rightmost steps are written ⇒*rm.
Example:
String a*b+a
E ⇒rm E + E ⇒rm E + a ⇒rm E * E + a ⇒rm E * b + a ⇒rm a * b + a
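Both derivations above can be reproduced mechanically; the sketch below (the helper derive and its arguments are illustrative) replaces either the leftmost or the rightmost occurrence of E at each step with a chosen production body.

def derive(start, bodies, leftmost=True):
    """Apply the given production bodies to E, replacing the leftmost or rightmost E."""
    form = start
    steps = [form]
    for body in bodies:
        pos = form.find("E") if leftmost else form.rfind("E")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(form)
    return steps

# Leftmost derivation of a*b+a (replace the leftmost E at every step).
print(" => ".join(derive("E", ["E+E", "E*E", "a", "b", "a"], leftmost=True)))
# E => E+E => E*E+E => a*E+E => a*b+E => a*b+a

# Rightmost derivation of a*b+a (replace the rightmost E at every step).
print(" => ".join(derive("E", ["E+E", "a", "E*E", "b", "a"], leftmost=False)))
# E => E+E => E+a => E*E+a => E*b+a => a*b+a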
1. Derive the string (a+(a*a)) using leftmost and rightmost derivation for the
following production E → E + E | E * E | (E) | a.
Solution:
2. Derive the string b*(b+a11) using leftmost and rightmost derivation for the
following production
E → I | E + E | E * E | (E)
I → a | b | Ia | Ib | I0 | I1
Solution:
Leftmost derivation:
E ⇒lm E*E ⇒lm I*E ⇒lm b*E ⇒lm b*(E) ⇒lm b*(E+E) ⇒lm b*(I+E) ⇒lm b*(b+E) ⇒lm b*(b+I) ⇒lm b*(b+I1) ⇒lm b*(b+I11) ⇒lm b*(b+a11)
Rightmost derivation:
E ⇒rm E*E ⇒rm E*(E) ⇒rm E*(E+E) ⇒rm E*(E+I) ⇒rm E*(E+I1) ⇒rm E*(E+I11) ⇒rm E*(E+a11) ⇒rm E*(I+a11) ⇒rm E*(b+a11) ⇒rm I*(b+a11) ⇒rm b*(b+a11)
3. The following grammar generates the language of the regular expression 0*1(0+1)*.
S → A1B
A → 0A | ε
B → 0B | 1B | ε
i. 00101
ii. 1001
iii. 00011
4. Consider the CFG and derive the string aabbab using leftmost and rightmost
derivation. S → aSbS | bSaS | ε.
1. Derive the string (a*a)+a using recursive inference for the following CFG: E → E + E | E * E | (E) | a.
Solution:
1. E→E+E
2. E→E*E
3. E → (E)
4. E→a
2. Derive the string 00101 using recursive inference for the following CFG.
1. S → A1B
2. A → 0A
3. A→ε
4. B → 0B
5. B → 1B
6. B→ε
CFG Language:
The context free grammar G = (V,T,P,S). The language of the grammar G denotes
as L(G) and it represents the set of terminal string that will be derived from the start
∗
symbol and it is represented as L(G) = {w | w is in T and S ⇒ w}
Sentential forms:
The strings that can be derived from the start symbol of the grammar are called sentential forms.
For example, if G = (V, T, P, S) is a CFG, then any string α in (V∪T)* that can be derived from the start symbol S, i.e. S ⇒* α, is a sentential form.
A sentence is a sentential form with no variables.
The types of sentential form are
o Left sentential form
o Right sentential form
Left sentential form
If the string α can be generated from the start symbol using a leftmost derivation, S ⇒*lm α, then α is a left sentential form.
Right sentential form
If the string α can be generated from the start symbol using a rightmost derivation, S ⇒*rm α, then α is a right sentential form.
Parse tree:
Definition:
The strings derived from a context free grammar can be represented in a tree format known as a parse tree or derivation tree.
Parse trees can be used to represent both derivations and recursive inference.
Parse trees are used to detect ambiguity in grammars and languages.
Parse trees are used in compilers to facilitate the translation of the source program into executable code.
Let us consider the grammar G = (V, T, P, S) and the parse tree for G is
constructed with the following conditions.
o The root of the tree is labelled by the start symbol S.
o If an interior node is labelled A and its children are labelled X1, X2, ..., Xn from the left, then A → X1X2...Xn is a production in P.
o If a node labelled A has a single child labelled ε, then A → ε is a production in P and A derives the empty string.
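A parse tree can be modelled as a nested structure whose leaves, read from left to right, give the derived string (its yield). The sketch below is illustrative; it uses a small sub-grammar E → E + E | a of the expression grammar above, and the names yield_of and tree are assumptions of the sketch.

def yield_of(node):
    """Concatenate the leaf labels of a parse tree from left to right."""
    label, children = node
    if not children:                      # leaf: a terminal (or epsilon, written "")
        return label
    return "".join(yield_of(c) for c in children)

# One parse tree for the string a+a with E -> E + E | a:
tree = ("E", [
    ("E", [("a", [])]),
    ("+", []),
    ("E", [("a", [])]),
])

print(yield_of(tree))    # a+a  -- the yield (frontier) of the parse tree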
1. Construct the parse tree for the string a*(a+b00) for the following CFG using both
leftmost and rightmost derivation.
E→I
E→E+E
E→E*E
E → (E)
I→a
I→b
I → Ia
I → Ib
I → I0
I → I1
Solution:
Example 2:
Construct the parse tree for the string a*(a+a) for the following CFG using both leftmost
and rightmost derivation. The CFG is E → E + E | E * E | (E) | a.
Solution:
Construct the parse tree for the string 00101 for the following CFG using both leftmost
and rightmost derivation. The CFG is
S → A1B
A → 0A | ε
B → 0B | 1B | ε
Solution:
Ambiguity:
Definition:
A context free grammar G is said to be ambiguous if there is at least one string in L(G) that has more than one parse tree (equivalently, more than one leftmost derivation or more than one rightmost derivation).
Example:
The context free grammar A → A+A | A-A | a is ambiguous since there are two leftmost
derivations for the string a+a+a.
Leftmost derivation 1: A ⇒ A + A ⇒ a + A ⇒ a + A + A ⇒ a + a + A ⇒ a + a + a
Leftmost derivation 2: A ⇒ A + A ⇒ A + A + A ⇒ a + A + A ⇒ a + a + A ⇒ a + a + a
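The ambiguity can also be checked mechanically: the sketch below (illustrative, with the helper name count_trees assumed) counts how many parse trees the grammar A → A+A | A-A | a gives a string by trying every top-level split at + or -.

from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees(s):
    total = 1 if s == "a" else 0               # production A -> a
    for i, ch in enumerate(s):
        if ch in "+-":                         # productions A -> A+A and A -> A-A
            total += count_trees(s[:i]) * count_trees(s[i+1:])
    return total

print(count_trees("a+a"))      # 1  -- only one parse tree
print(count_trees("a+a+a"))    # 2  -- two parse trees, so the grammar is ambiguous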
Example 2:
The grammar is ambiguous since there are two parse trees for the string a+a-a.
[Two distinct parse trees for the string a+a-a: one groups the string as (a+a)-a, the other as a+(a-a).]
Example:
The string aab has two distinct rightmost derivations using the productions S → aS | aSbS | ε, so the grammar is ambiguous.
Rightmost derivation-1: S ⇒ aS ⇒ aaSbS ⇒ aaSb ⇒ aab
Rightmost derivation-2: S ⇒ aSbS ⇒ aSb ⇒ aaSb ⇒ aab
[The two parse trees corresponding to these derivations.]
Push Down Automata (PDA)
Introduction:
Examples:
1. L = {a^n b^n | n ≥ 0}. To handle this language, we must not only check that all a's precede the first b, we must also count the number of a's. Since n is unbounded, this counting cannot be done with a finite memory. So we require a PDA (Push Down Automaton), a machine that can count without limit.
2. L = {ww^R | w ∈ {a, b}*}. To handle this language, we need more than unlimited counting ability; we need to store and match a sequence of symbols in reverse order.
3. A push down automaton is a machine similar to a finite automaton that accepts the context free languages.
[Figure: model of a PDA — a finite control reading an input tape (e.g. a a b b $ Δ Δ Δ Δ) and operating on a stack.]
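The role of the stack can be illustrated with a short sketch (not a formal PDA, just the stack idea): membership in {a^n b^n | n ≥ 0} is checked by pushing one marker per a and popping one per b, which is exactly the unbounded counting a finite automaton cannot do. The function name accepts_anbn is illustrative.

def accepts_anbn(w):
    stack = []
    i = 0
    while i < len(w) and w[i] == "a":      # phase 1: push one marker per 'a'
        stack.append("X")
        i += 1
    while i < len(w) and w[i] == "b":      # phase 2: pop one marker per 'b'
        if not stack:
            return False
        stack.pop()
        i += 1
    return i == len(w) and not stack       # all input read, stack empty

for w in ["", "ab", "aabb", "aab", "abab"]:
    print(w, accepts_anbn(w))   # True, True, True, False, False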
Formal definition of a push down automaton:
A push down automaton is a 7-tuple P = (Q, ∑, Γ, δ, q0, Z0, F), where
Q is a finite set of states, ∑ is the input alphabet, Γ is the stack alphabet, δ is the transition function, q0 is the start state, Z0 is the initial stack symbol and F ⊆ Q is the set of final (accepting) states.
NFA:
(p, a, q) ∈ Δ means that if machine M is in state p, then on reading 'a' from the input tape it goes to state q.
(p, ε, q) ∈ Δ means that if machine M is in state p, it goes to state q without consuming input.
PDA:
1. ((p, a, β), (q, γ)) → if machine M is in state p, the symbol read from the input tape is 'a' and β is on top of the stack, it goes to state q and replaces β by γ on top of the stack.
2. ((s, a, ε), (s, a)) → if machine M is in state s and reads 'a', it remains in state s and pushes 'a' onto the stack (ε here means that nothing is popped).
3. ((s, c, ε), (f, ε)) → if M reads 'c' in state s, it goes to the final state f without popping or pushing anything.
4. ((s, ε, ε), (f, ε)) → if M is in state s, it may go to state f without reading input or touching the stack.
5. ((f, a, a), (f, ε)) → if M reads 'a' in state f with 'a' on top of the stack, it remains in state f and pops 'a' from the stack.
6. PDAs are, in general, non-deterministic.
Transitions:
[Transition diagram: an edge from state p to state q labelled a, β/γ — on input a with β on top of the stack, pop β and push γ.]
1. When we push γ, we push the symbols of γ from right to left, so that the leftmost symbol of γ ends up on top of the stack.
2. When we pop β, we pop the symbols of β from left to right, i.e. the leftmost symbol of β must be on top of the stack.
Types of PDA:
Example:
[Transition diagram of a PDA with states q0 (start), q1 and q2 (final). In q0 every input symbol is pushed onto the stack: 0,Z0/0Z0; 1,Z0/1Z0; 0,0/00; 0,1/01; 1,0/10; 1,1/11. On ε,Z0/Z0, ε,0/0 or ε,1/1 the PDA moves to q1, where each input symbol is matched with the top of the stack and popped: 0,0/ε; 1,1/ε. On ε,Z0/Z0 the PDA moves to the final state q2. This PDA accepts the strings of the form ww^R over {0,1}.]
Moves of PDA:
The move of a push down automaton from one instantaneous description (ID) to the next is represented by the symbol ├.
1. Suppose the PDA P = ({q, p}, {0, 1}, {z0, x}, δ, q, z0, {p}) has the following transition functions:
a. δ(q, 0, z0) = {(q, xz0)}
b. δ(q, 0, x) = {(q, xx)}
c. δ(q, 1, x) = {(q, x)}
d. δ(q, ε, x) = {(p, ε)}
e. δ(p, ε, x) = {(p, ε)}
f. δ(p, 1, x) = {(p, xx)}
g. δ(p, 1, z0) = {(p, ε)}
Starting from the initial ID (q, w, z0), show the reachable instantaneous descriptions when the input string w is as follows:
1. 01
2. 0011
3. 010
Solution:
1. String w = 01
ID = (q, w, z0)
(q, 01, z0) ├ (q, 1, xz0)
├ (q, ε, xz0)
├ (p, ε, z0)
The final state p is reached after consuming the input, so the string w = 01 is accepted.
2. String w = 0011
ID = (q, w, z0)
(q, 0011, z0) ├ (q, 011, xz0)
├ (q, 11, xxz0)
├ (q, 1, xxz0)
├ (q, ε, xxz0)
├ (p, ε, xz0)
├ (p, ε, z0)
So the string 0011 is accepted.
3. String w = 010
(q, 010, z0) ├ (q, 10, xz0)
├ (q, 0, xz0)
├ (q, ε, xxz0)
├ (p, ε, xz0)
├ (p, ε, z0)
So the string 010 is accepted.
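The moves above can be reproduced with a small simulator; the sketch below is illustrative (it abbreviates z0 as "z" and represents the stack as a string with its top at the left) and applies the transition functions (a)-(g) to list every reachable instantaneous description.

DELTA = {                     # (state, input or '', top of stack) -> set of (state, push string)
    ("q", "0", "z"): {("q", "xz")},
    ("q", "0", "x"): {("q", "xx")},
    ("q", "1", "x"): {("q", "x")},
    ("q", "",  "x"): {("p", "")},
    ("p", "",  "x"): {("p", "")},
    ("p", "1", "x"): {("p", "xx")},
    ("p", "1", "z"): {("p", "")},
}

def reachable_ids(state, w, stack):
    """Depth-first search over all reachable instantaneous descriptions."""
    ids = {(state, w, stack)}
    moves = []
    if stack:
        if w:                                                         # consume one input symbol
            moves += [(m, w[1:]) for m in DELTA.get((state, w[0], stack[0]), ())]
        moves += [(m, w) for m in DELTA.get((state, "", stack[0]), ())]   # epsilon moves
    for (nstate, push), rest in moves:
        ids |= reachable_ids(nstate, rest, push + stack[1:])
    return ids

for id_ in sorted(reachable_ids("q", "0011", "z")):
    print(id_)        # every ID reachable from (q, 0011, z); (p, '', 'z') shows acceptance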
2. Consider the following PDA and find whether the strings 000111 and 00011 are accepted.
[Transition diagram: states p (start), q and r. In p each 0 is pushed: 0,z0/0z0 and 0,0/00. On 1,0/ε the PDA moves from p to q, and in q each further 1 pops a 0 (1,0/ε). On ε,z0/ε both p and q move to r, emptying the stack.]
Solution:
Transition functions:
δ(p, 0, z0) = {(p, 0z0)}
δ(p, 0, 0) = {(p, 00)}
δ(p, 1, 0) = {(q, ε)}
δ(q, 1, 0) = {(q, ε)}
δ(p, ε, z0) = {(r, ε)}
δ(q, ε, z0) = {(r, ε)}
String w = 000111
(p, 000111, z0) ├ (p, 00111, 0z0)
├ (p, 0111, 00z0)
├ (p, 111, 000z0)
├ (q, 11, 00z0)
├ (q, 1, 0z0)
├ (q, ε, z0)
├ (r, ε, ε)
The input is consumed and the PDA reaches r with an empty stack, so the string 000111 is accepted.
String w = 00011
(p, 00011, z0) ├ (p, 0011, 0z0)
├ (p, 011, 00z0)
├ (p, 11, 000z0)
├ (q, 1, 00z0)
├ (q, ε, 0z0)
The input is consumed, but a 0 is still left on the stack and no further move is possible, so the string 00011 is not accepted.
The language of a PDA is the set of all strings accepted by the PDA.
The two methods of accepting a string in a PDA are as follows:
o Acceptance by final state.
o Acceptance by empty stack.
Acceptance by final state: if, after consuming the entire input, the PDA has entered a final state, the string is accepted. This method accepts the language by reaching a final state.
Let P = (Q, ∑, Γ, δ, q0, Z0, F) be a PDA. Then the language L(P) accepted by final state is
L(P) = {w | (q0, w, Z0) ├* (q, ε, α)}
o Where
q → a final state (q ∈ F)
α → any string of stack symbols (α ∈ Γ*)
Acceptance by empty stack: if, after reading the entire input, the stack is empty, then the string is accepted by the PDA.
Let P = (Q, ∑, Γ, δ, q0, Z0, F) be a PDA. Then the language N(P) accepted by empty stack is
N(P) = {w | (q0, w, Z0) ├* (p, ε, ε)}
o Where
p → any state in Q
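The two acceptance conditions can be written as two tests on the final instantaneous description. The sketch below is self-contained and illustrative: it re-encodes the PDA of example 2 above (z0 abbreviated as "z"), runs its deterministic moves, and then checks L(P)-style acceptance (final state reached) versus N(P)-style acceptance (stack empty).

DELTA = {
    ("p", "0", "z"): ("p", "0z"),   # push a 0
    ("p", "0", "0"): ("p", "00"),
    ("p", "1", "0"): ("q", ""),     # start matching: pop a 0
    ("q", "1", "0"): ("q", ""),
    ("p", "",  "z"): ("r", ""),     # bottom marker reached: empty the stack
    ("q", "",  "z"): ("r", ""),
}

def run(w, state="p", stack="z"):
    """Follow the (deterministic) moves until no transition applies."""
    while stack:
        if w and (state, w[0], stack[0]) in DELTA:
            state, push = DELTA[(state, w[0], stack[0])]
            w, stack = w[1:], push + stack[1:]
        elif (state, "", stack[0]) in DELTA:
            state, push = DELTA[(state, "", stack[0])]
            stack = push + stack[1:]
        else:
            break
    return state, w, stack

def accepted_by_final_state(w, finals=("r",)):
    state, rest, _ = run(w)
    return rest == "" and state in finals       # input consumed, final state reached

def accepted_by_empty_stack(w):
    state, rest, stack = run(w)
    return rest == "" and stack == ""           # input consumed, stack emptied

print(accepted_by_final_state("000111"), accepted_by_empty_stack("000111"))  # True True
print(accepted_by_final_state("00011"),  accepted_by_empty_stack("00011"))   # False False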
Theorem:
If L = N(PN) for some PDA PN = (Q, ∑, Γ, δN, q0, Z0), then there is a PDA PF such that L = L(PF).
That is, if there exists a PDA PN that accepts a language by empty stack, then there exists a PDA PF that accepts the same language by reaching a final state.
Proof:
To prove this theorem, we use a new symbol X0 that is not a symbol of Γ (X0 ∉ Γ).
Here we are going to construct PF from PN, i.e. PF simulates PN.
X0 is used as the start symbol of PF's stack and marks the bottom of the stack.
PF goes on simulating the moves of PN as long as it sees any symbol other than X0 on top of the stack.
If PF sees X0, then PN has emptied its stack, and PF finishes processing the string.
Now we construct PF with a new start state p0 and a new final state pf.
p0, the start state of PF, pushes Z0 on top of X0 (so that X0 stays at the bottom of the stack) and enters q0, the initial state of PN.
PF then simulates PN and accepts if PN empties its stack.
[Diagram of the construction: from the new start state p0, the move ε, X0/Z0X0 enters q0 of PN; from every state of PN, the move ε, X0/ε enters the new final state pf.]
The function of the new start state p0 is to push the symbol Z0 onto the stack when it sees X0 on top of the stack and to enter the state q0.
The state q0 is the initial state of PN, so PF simulates PN until PN empties its stack.
PF detects that PN has emptied its stack when it sees X0 on top of the stack. So when the PDA PF sees X0 on top of the stack, it enters the new final state pf.
So the PDA PF is given by
PF = (Q ∪ {p0, pf}, ∑, Γ ∪ {X0}, δF, p0, X0, {pf})
where δF is defined as follows:
1. δF(p0, ε, X0) = {(q0, Z0X0)} — PF pushes Z0 on top of the bottom marker X0.
2. For every state q in Q, every input 'a' in ∑ or a = ε, and every stack symbol Y in Γ, δF(q, a, Y) contains all the pairs in δN(q, a, Y).
3. δF(q, ε, X0) contains (pf, ε) for every state q in Q — the string is accepted by reaching the final state.
So if 'w' is in N(PN), then PN empties its stack after consuming w. Combining all the moves, the instantaneous descriptions of PF while simulating PN are
(p0, w, X0) ├ (q0, w, Z0X0) ├* (q, ε, X0) ├ (pf, ε, ε)
Thus the PDA PF accepts 'w' by final state.
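The construction of δF can also be written out as a short sketch (all names are illustrative; the example PN is a PDA for {0^n 1^n | n ≥ 1} that accepts by empty stack, similar to the earlier example but without final states). Push strings are written as tuples of stack symbols.

def build_pf(delta_n, states, q0, z0="Z0", x0="X0", p0="p0", pf="pf"):
    delta_f = {}

    # Rule 1: from the new start state, push Z0 on top of the bottom marker X0.
    delta_f[(p0, "", x0)] = {(q0, (z0, x0))}

    # Rule 2: P_F contains every move of P_N unchanged.
    for key, moves in delta_n.items():
        delta_f.setdefault(key, set()).update(moves)

    # Rule 3: whenever X0 is exposed, P_N has emptied its stack,
    # so P_F moves to the new final state pf.
    for q in states:
        delta_f.setdefault((q, "", x0), set()).add((pf, ()))

    return delta_f

# Example P_N accepting {0^n 1^n | n >= 1} by empty stack (start state p).
delta_n = {
    ("p", "0", "Z0"): {("p", ("0", "Z0"))},
    ("p", "0", "0"):  {("p", ("0", "0"))},
    ("p", "1", "0"):  {("q", ())},
    ("q", "1", "0"):  {("q", ())},
    ("q", "",  "Z0"): {("q", ())},
}
for rule, moves in sorted(build_pf(delta_n, states={"p", "q"}, q0="p").items()):
    print(rule, "->", moves)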
Theorem:
If L = L(PF) for some PDA PF = (Q, ∑, Γ, δF, q0, Z0, F), then there is a PDA PN such that L = N(PN).
To prove:
We have to construct a PDA PN that accepts by empty stack exactly the language that PF accepts by final state.
Proof:
To prove this theorem, here also we use a new stack symbol X0 placed at the bottom of the stack.
Here we are going to construct PN from PF.
q0 is the initial state of PF, and PF has a set of final states F.
The PDA PF may enter a final state after accepting the string 'w' while still leaving symbols on the stack.
So our target is to erase all the stack symbols of the PDA PF after it enters a final state, in order to obtain PN, which accepts by empty stack.
Now construct PN with p0 as the new start state and p as a new state whose job is to empty the stack.
The transition diagram of PN is shown below: