For every CFG there is an equivalent PDA

It has been said many times that PDAs are an abstract machine just powerful enough to recognize
the languages generated by CFGs. To prove this, one must show that for any CFG G there
exists a PDA M such that L(M) = L(G). Furthermore, one must also show that for any PDA M there
exists a CFG G such that L(G) = L(M).

It is interesting to point out that PDAs were formally introduced by Oettinger in 1961 and CFGs
by Chomsky in 1956. The equivalence between the two was first perceived by Chomsky in
1962 and Evey in 1963.

In both directions you will see a constructive proof, i.e., a construction followed by a proof that
the construction does what it says it does.

Theorem 1
If G = (V, Σ, S, P) is a CFG, there exists a PDA M = (Q, Σ, Γ, q0, Z0, δ, A) such that Lf(M) =
L(G).
Proof: We define our PDA M as follows:
Q = {q0, q1, q2} - It is interesting to note that ANY CFG can be recognized (via final state) by a
PDA with only three states.
Γ = V ∪ Σ ∪ {Z0}, Z0 ∉ V ∪ Σ - Our stack symbols are the grammar variables, the input
alphabet, and a fresh stack start symbol.
A = {q2} - Only one accepting state is needed for ANY CFG.
We must now define our PDA's transition function δ:

1. δ(q0, ε, Z0) = {(q1, SZ0)}

2. For each A ∈ V, δ(q1, ε, A) = {(q1, α) | (A → α) ∈ P}

3. For each a ∈ Σ, δ(q1, a, a) = {(q1, ε)}

4. δ(q1, ε, Z0) = {(q2, Z0)}

The proof can be completed by showing L(G) ⊆ Lf(M) and Lf(M) ⊆ L(G) (both via induction),
which is left as an exercise for the reader. ⋄
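
To make the construction concrete, here is a minimal sketch in Python (my own illustration, not
part of the original notes). It assumes the grammar is given as a dict mapping each
single-character variable to a list of right-hand-side strings, and it represents pushed stack
contents as tuples of symbols so that the marker Z0 stays a single symbol.

def cfg_to_pda(productions, start='S'):
    """Return delta for the three-state PDA of Theorem 1, as a dict
    (state, input_symbol, stack_top) -> set of (state, push) pairs,
    where '' stands for epsilon and push is a tuple of stack symbols."""
    variables = set(productions)
    terminals = {c for rhss in productions.values()
                 for rhs in rhss for c in rhs if c not in variables}
    delta = {}
    # 1. Required first move: push the start symbol above Z0 and enter q1.
    delta[('q0', '', 'Z0')] = {('q1', (start, 'Z0'))}
    # 2. Expand: replace a variable on top of the stack by a right-hand side.
    for A, rhss in productions.items():
        delta[('q1', '', A)] = {('q1', tuple(rhs)) for rhs in rhss}
    # 3. Match: consume an input terminal equal to the stack top and pop it.
    for a in terminals:
        delta[('q1', a, a)] = {('q1', ())}
    # 4. With only Z0 left on the stack, move to the accepting state q2.
    delta[('q1', '', 'Z0')] = {('q2', ('Z0',))}
    return delta

# Example: S -> aSb | epsilon, so L(G) = { a^n b^n : n >= 0 }.
delta = cfg_to_pda({'S': ['aSb', '']})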

The basic idea of the construction is to use the start state and a required ε-transition first move to
transit to state q1 while placing the grammar's start symbol on the stack. Whenever a grammar
variable is on the top of the stack we can pop it off, replacing it with its right-hand side string.

Whenever the input symbol matches the stack's top (a grammar terminal), we consume the input and pop
the symbol off the stack. Finally, when the input is exhausted and Z0 is all that is left on the stack
(nothing was ever placed under it), the PDA transits to the accepting state.

Hopefully one sees that the constructed PDA will correctly guess the steps of a leftmost
derivation of any x ∈ L(G). Simply put, if at some point in the derivation the current string
looks like xα, where x ∈ Σ∗ (a string of terminals), then at the corresponding point in the PDA's
simulation of the leftmost derivation, the input read so far is x and the stack contains α.
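
For instance, take the grammar S → aSb | ε (so L(G) = {a^n b^n : n ≥ 0}) and the input aabb.
Writing configurations as (state, remaining input, stack), one accepting run of the constructed
PDA is:

(q0, aabb, Z0) ⊢ (q1, aabb, SZ0)     required first move
               ⊢ (q1, aabb, aSbZ0)   expand S → aSb
               ⊢ (q1, abb, SbZ0)     match a
               ⊢ (q1, abb, aSbbZ0)   expand S → aSb
               ⊢ (q1, bb, SbbZ0)     match a
               ⊢ (q1, bb, bbZ0)      expand S → ε
               ⊢ (q1, b, bZ0)        match b
               ⊢ (q1, ε, Z0)         match b
               ⊢ (q2, ε, Z0)         accept

After each expansion, the input read so far followed by the stack contents above Z0 spells out
the current string of the leftmost derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb.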

It should be clear that this PDA is NOT a DPDA. The nondeterminism is introduced by there being
multiple productions for a given variable A; any nondeterminism inherent in the grammar is
replicated in the PDA.

For every PDA there is an equivalent CFG

A word (or two) is in order before looking at how to construct a CFG from an arbitrary PDA M. Assume
w.l.o.g. that the given PDA M accepts its language L via empty stack, L = Le(M). (Given a PDA M′
that accepts L via final state, one can convert M′ into a PDA M such that Le(M) = Lf(M′) = L.)
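
That conversion is standard, and the following Python sketch (same delta representation as the
earlier sketch; the fresh names p0, pe, and the marker X0 are my own, not part of M′) shows one
way to carry it out.

def final_state_to_empty_stack(Gamma, q0, Z0, delta, accepting):
    """Given M' accepting by final state, build M with Le(M) = Lf(M')."""
    X0 = 'X0'                                        # fresh bottom marker
    delta2 = {k: set(v) for k, v in delta.items()}   # keep all of M''s moves
    # New start move: place M''s start symbol above the fresh marker.
    delta2[('p0', '', X0)] = {(q0, (Z0, X0))}
    for X in set(Gamma) | {X0}:
        # From any accepting state, silently enter the draining state pe ...
        for q in accepting:
            delta2.setdefault((q, '', X), set()).add(('pe', (X,)))
        # ... where the entire stack, X0 included, is popped off.
        delta2[('pe', '', X)] = {('pe', ())}
    return delta2, 'p0', X0

The marker X0 guarantees that M′ itself can never empty the stack prematurely, so the new PDA's
stack becomes empty exactly when M′ has accepted.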

We already know what the terminal symbols are, but what should we use for the grammar's variables? It
should be obvious that one cannot use the PDA's states; there are probably not enough of them for
our purposes. Furthermore one cannot use either the set Γ of stack symbols (not enough informa-
tion encoded) or all the possible stack configurations (infinitely many). Instead our variables will
be triples of the form [p, A, q] where p, q ∈ Q and A ∈ Γ. This will certainly provide more variables
than needed. (No need to worry, though: you should know the algorithm for eliminating useless
variables.)

Whenever the PDA can move from state p to state q, reading a ∈ Σ ∪ {ε} from the input stream and
popping A ∈ Γ off the stack (a true pop operation, not a replacement), we introduce into the grammar the
production [p, A, q] → a. Hence our grammar's variables incorporate both state information and
stack-top symbol information.

We would like the PDA and grammar to operate as they did above: simulate leftmost derivations.
Another way of stating this is that the string read so far by the PDA will be the initial portion of the
current string in a leftmost derivation, and what is on the stack will correspond to the remaining
portion of the current string.

Theorem 2
If M = (Q, Σ, Γ, q0, Z0, δ, A) is a PDA, there exists a CFG G = (V, Σ, S, P) such that Le(M) =
L(G).
Proof: The CFG G is defined as follows:

V = {S} ∪ {[p, A, q] | A ∈ Γ; p, q ∈ Q} – more than is needed, but it is harmless to include useless
variables.

The productions of the grammar are:

• For every q ∈ Q, (S → [q0, Z0, q]) ∈ P

• If p, q ∈ Q; a ∈ Σ ∪ {ε}; A ∈ Γ; and δ(p, a, A) contains (q, ε), then ([p, A, q] → a) ∈ P

• If m ≥ 1; p, q1 ∈ Q; a ∈ Σ ∪ {ε}; A ∈ Γ; and δ(p, a, A) contains (q1, B1 B2 . . . Bm) for some
string B1 B2 . . . Bm of symbols of Γ, then for every possible choice (i.e. assignment) of
q2, . . . , qm+1 ∈ Q, the production

[p, A, qm+1] → a[q1, B1, q2][q2, B2, q3] . . . [qm, Bm, qm+1]

is included in P

The proof is completed by showing L(G) ⊆ Le(M) and Le(M) ⊆ L(G) (both via induction),
which are left as exercises for the reader. ⋄
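
Continuing the earlier sketches (again my own illustration, with the same delta representation),
the construction of Theorem 2 can be written out as follows, encoding each variable [p, A, q] as
the Python tuple (p, A, q).

from itertools import product

def pda_to_cfg(Q, delta, q0, Z0):
    """Return the productions of G as (head, body) pairs, where body is a
    tuple mixing terminal strings and variable triples (p, A, q)."""
    P = set()
    # S -> [q0, Z0, q] for every state q.
    for q in Q:
        P.add(('S', ((q0, Z0, q),)))
    for (p, a, A), moves in delta.items():
        prefix = (a,) if a else ()               # a may be '' (epsilon)
        for (q1, pushed) in moves:
            m = len(pushed)
            if m == 0:
                # True pop move: [p, A, q1] -> a
                P.add(((p, A, q1), prefix))
            else:
                # [p, A, q_{m+1}] -> a [q1,B1,q2][q2,B2,q3]...[qm,Bm,q_{m+1}]
                # for every choice of intermediate states q2, ..., q_{m+1}.
                for choice in product(Q, repeat=m):
                    chain = (q1,) + choice
                    body = prefix + tuple(
                        (chain[i], pushed[i], chain[i + 1]) for i in range(m))
                    P.add(((p, A, choice[-1]), body))
    return P

Note how the third rule contributes |Q|^m productions per move; this is where the many useless
variables and productions mentioned below come from.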

It is my opinion that the transformation from a CFG to a PDA makes “sense” (whatever that
means). On the other hand, the transformation from a PDA to a CFG appears like Voodoo magic;
how can this possibly work? One must first resign oneself to the observation that this algorithm
produces many, many useless productions and variables. It is also true that while one can certainly
produce strings from the grammar that are in Le(M), one cannot produce any strings using this
grammar that are not in Le(M).

It might help to know that the variables and productions of G have been defined in such a way that
a leftmost derivation in G of a string x is a simulation of the PDA M when fed x as input. The
variables that appear in any step of a leftmost derivation in G correspond to the symbols on the
stack of M at a time when M has seen as much of the input as the grammar has already generated.
Put another way, the intention is that [p, A, q] derives x if and only if x causes M to erase an A from
its stack by some sequence of moves beginning in state p and ending in state q.
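
A small example (my own) may make this invariant tangible. Let M have states p and q and moves
δ(p, a, Z0) = {(p, A)}, δ(p, a, A) = {(p, AA)}, δ(p, b, A) = {(q, ε)}, δ(q, b, A) = {(q, ε)},
so Le(M) = {a^n b^n : n ≥ 1}. Among the productions the construction yields are

S → [p, Z0, q]
[p, Z0, q] → a[p, A, q]
[p, A, q] → a[p, A, q][q, A, q] | b
[q, A, q] → b

and the leftmost derivation

S ⇒ [p, Z0, q] ⇒ a[p, A, q] ⇒ aa[p, A, q][q, A, q] ⇒ aab[q, A, q] ⇒ aabb

tracks exactly the run of M on aabb: each variable in the current string is a stack symbol that M
has yet to erase, bracketed by the states in which its erasure begins and ends.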
