Automata Chapter 1 2 &3
Automata Chapter 1 2 &3
Automata Chapter 1 2 &3
Definition of Automata.
An Automaton is defined as a system where energy, materials and
information’s are transformed, transmitted and used for performing some
functions without direct help of man. Example automatic packing
machine and automatic photo printing machines.
The term "Automata" is derived from the Greek word "αὐτόματα" which means "self-
acting". An automaton (Automata in plural) is an abstract self-propelled computing
device which follows a predetermined sequence of operations automatically.
An automaton with a finite number of states is called a Finite Automaton (FA) or Finite
State Machine (FSM).
Alphabets and strings
Alphabet
Definition − An alphabet is any finite set of symbols.
Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols.
String
Definition − A string is a finite sequence of symbols taken from ∑.
Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d}
Length of a String
1
Definition − It is the number of symbols present in a string. (Denoted by |S|).
Examples −
o If S = ‘cabcad’, |S|= 6
o If |S|= 0, it is called an empty string (Denoted by λ or ε)
In DFA, for each input symbol, one can determine the state to which the machine will move.
Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is
called Deterministic Finite Machine or Deterministic Finite Automaton.
A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where −
Q is a finite set of states.
2
Example
Let a deterministic finite automaton be →
Q = {a, b, c},
∑ = {0, 1},
q0 = {a},
F = {c}, and
Transition function δ as shown by the following table −
Present State Next State for Input 0 Next State for Input 1
a a b
b c a
c b c
Transition Systems
A transition graph or a transition system is a finite directed labeled graph in
which each vertex or node represents a state and the directed edges
indicate the transition of a state and the edges are labeled with
input/output. An initial state represented by a circle with an arrow pointing
towards it, the final state by two concentric circles, and the other states are
represented by one circle.
The transitions frorn one internal state to another are governed bv the transition function
δ, for example, for
Then the dfa is in state q0 and the current input symbol is a, the dfa will go into state q1.
Properties of Transition Function.
Property 1. δ(q, ) = q in a finite automaton. This means the state of the
3
system can be changed only by an input symbol.
Property 2. For all strings W ∑* and input symbols a.
δ(q, aw) = δ ( δ(q, a), w)
δ(q, wa) = δ ( δ(q, w), a)
This property gives the state after the automaton reads the first symbol
of a string aw and the state after the automaton reads a prefix of the string
wa.
Example : prove that for any transition function δ and for any two input
string x and y.
δ(q, xy) = δ ( δ(q, x), y) 2.1
Proof: By the method of induction on |y|. i.e. length of y. when
|y|=1,y=a.
= δ(q, xa) = δ ( δ(q, x), a) by property 2.
Consider the finite state machine whose ={0,1}, K={q0,q1,q2, q3},q0 in
K is a initial state, final state is q0. Transition function δ is given in the
table. Give the entire sequence of state for the input string 110101.
Let consider the DFA ={a,b}, K={ q0 },q0 in K is a initial state, final state is q0.
4
Languages L(m) = { x/x * }
Let consider the DFA ={a,b}, K={q0,q1,D}, q0 in K is an initial state, final
state is q0.
Example
Let a deterministic finite automaton be →
Q = {a, b, c},
∑ = {0, 1},
q0 = {a},
F = {c}, and
Transition function δ as shown by the following table −
Present State Next State for Input 0 Next State for Input 1
a a b
b c a
c b c
In NFA, for a particular input symbol, the machine can move to any combination of the states in
the machine. In other words, the exact state to which the machine moves cannot be determined.
Hence, it is called Non-deterministic Automaton. As it has finite number of states, the machine is
called Non-deterministic Finite Machine or Non-deterministic Finite Automaton.
5
∑ is a finite set of symbols called the alphabets.
δ is the transition function where δ: Q × ∑ → 2
Q
(Here the power set of Q (2Q) has been taken because in case of NDFA, from
a state, transition can occur to any combination of Q states)
q0 is the initial state from where any input is processed (q 0 ∈ Q).
F is a set of final state/states of Q (F ⊆ Q).
Let see the example below, which describes the nondeterministic automaton since
there are two transitions labeled with ‘a’ out of qo.
Q = {a, b, c}
∑ = {0, 1}
q0 = {a}
F = {c}
The transition function δ as shown below −
Present State Next State for Input 0 Next State for Input 1
a a, b b
b c a, c
c b, c c
6
Consider the NFA whose ={a,b}, Q={ q0,q1,q2 },q0 in Q is initial state,
final state is q2, transition function δ is given in the table. To check
whether the given string aaabb is accepted by the machine or not.
Initial state q0. Two choose are there reading
a from q0 either it is go to q0 or q1. Similarly
two choose are there reading b from q1 it is go
either to q1 or q2. Final state is q2. The string
sequence of a’s followed by sequence of b’s
will be accepted by the machine. The string
starting with b and ending a are not accepted
by the machine. The state sequence is q0
q0q0 q1 q1 q2 for the input aaabb.
An NFA that accepts all binary strings that end with 101.
7
The transition diagram can be drawn by using the mapping function as given in the table.
Design an NFA with ∑ = {0, 1} in which double '1' is followed by double '0'.
Now before double 1, there can be any string of 0 and 1. Similarly, after double 0, there can be
any string of 0 and 1.
The Equivalence of DFA and NFA
Let, M = (Q, ∑, δ, q0, F) is an NFA which accepts the language L(M). There should be
equivalent DFA denoted by M' = (Q', ∑', q0', δ', F') such that L(M) = L(M'). Any string is
accepted by NFA can be accepted by a DFA. Method is called as subset.
Example 1:
Convert the given NFA to DFA.
8
For the given transition diagram we will first construct the transition table.
State 0 1
δ'([q1], 0) = ϕ
δ'([q1], 1) = [q0, q1]
Now we will obtain δ' transition on [q0, q1].
δ'([q0, q1], 0) = δ(q0, 0) ∪ δ(q1, 0)
= {q0, q1} ∪ ϕ
= {q0, q1}
= [q0, q1]
Similarly,
State 0 1
9
Example 2:
Convert the given NFA to DFA.
For the given transition diagram we will first construct the transition table.
State 0 1
→q0 q0 q1
q1 {q1, q2} q1
10
= {q1} ∪ {q1, q2}
= {q1, q2}
= [q1, q2]
The state [q1, q2] is the final state as well because it contains a final state q2. The transition
table for the constructed DFA will be:
State 0 1
11
Chapter 2: Regular Expression and Regular languages
Regular expressions
Regular Expressions are an algebraic way to describe languages
Some RE Examples
(a+b)* Set of strings of a’s and b’s of any length including the null string. So
L = { ε, a, b, aa , ab , bb , ba, aaa…….}
(a+b)*abb Set of strings of a’s and b’s ending with the string abb. So L = {abb,
aabb, babb, aaabb, ababb, …………..}
(11)* Set consisting of even number of 1’s including empty string, So L= {ε,
11, 1111, 111111, ……….}
12
number of b’s , so L = {b, aab, aabbb, aabbbbb, aaaab, aaaabbb,
…………..}
(aa + ab + ba + bb)* String of a’s and b’s of even length can be obtained by concatenating
any combination of the strings aa, ab, ba and bb including null, so L =
{aa, ab, ba, bb, aaab, aaba, …………..}
Example 1:
Write the regular expression for the language accepting all combinations of a's,
over the set ∑ = {a}
Solution:
All combinations of a's means a may be zero, single, double and so on. If a is
appearing zero times, that means a null string. That is we expect the set of {ε, a, aa,
aaa, ....}. So we give a regular expression for this as
R = a*
Example 2. Write the regular expression for the language accepting all the string
containing any number of a's and b's.
Solution:
R = (a + b)*
This will give the set as L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, any combination
of a and b.
The (a + b)* shows any combination with a and b even a null string.
Example 3. Write the regular expression for the language accepting all the string
which are starting with 1 and ending with 0, over ∑ = {0, 1}.
Solution:
In a regular expression, the first symbol should be 1, and the last symbol should be
0. The r.e. is as follows:
13
R = 1 (0+1)* 0
Solution:
The language can be predicted from the regular expression by finding the meaning of it.
We will first split the regular expression as:
L = {The language consists of the string in which a's appear triples, there is no restriction
on the number of b's}
Regular Sets
Any set that represents the value of the Regular Expression is called a Regular Set.
14
L2 = { ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 ∩ L2 = { aa, aaaa, aaaaaa,.......} (Strings of even length excluding Null)
RE (L1 ∩ L2) = aa(aa)* which is a regular expression itself.
Property 3. The complement of a regular set is regular.
Proof −
Let us take a regular expression −
RE = (aa)*
So, L = {ε, aa, aaaa, aaaaaa, .......} (Strings of even length including Null)
Complement of L is all the strings that is not in L.
So, L’ = {a, aaa, aaaaa, .....} (Strings of odd length excluding Null)
RE (L’) = a(aa)* which is a regular expression itself.
Property 4. The difference of two regular set is regular.
Proof −
Let us take two regular expressions −
RE1 = a (a*) and RE2 = (aa)*
So, L1 = {a, aa, aaa, aaaa, ....} (Strings of all possible lengths excluding Null)
L2 = { ε, aa, aaaa, aaaaaa,.......} (Strings of even length including Null)
L1 – L2 = {a, aaa, aaaaa, aaaaaaa, ....}
(Strings of all odd lengths excluding Null)
RE (L1 – L2) = a (aa)* which is a regular expression.
Property 5. The reversal of a regular set is regular.
Proof −
We have to prove LR is also regular if L is a regular set.
Let, L = {01, 10, 11, 10}
RE (L) = 01 + 10 + 11 + 10
LR = {10, 01, 11, 01}
RE (LR) = 01 + 10 + 11 + 10 which is regular
Property 6. The closure of a regular set is regular.
15
Proof −
If L = {a, aaa, aaaaa, .......} (Strings of odd length excluding Null)
i.e., RE (L) = a (aa)*
L* = {a, aa, aaa, aaaa , aaaaa,……………} (Strings of all lengths excluding Null)
RE (L*) = a (a)*
Property 7. The concatenation of two regular sets is regular.
Proof −
Let RE1 = (0+1)*0 and RE2 = 01(0+1)*
Here, L1 = {0, 00, 10, 000, 010, ......} (Set of strings ending in 0)
and L2 = {01, 010,011,.....} (Set of strings beginning with 01)
Then, L1 L2 = {001,0010,0011,0001,00010,00011,1001,10010,.............}
Set of strings containing 001 as a substring which can be represented by an RE −
(0 + 1)*001(0 + 1)*
Hence, proved.
Connection between regular expression and regular languages
17
Example 1
The equations for the three states q1, q2, and q3 are as follows −
q3 = q2a
q1 = q1a + q3a + ε
18
= (a + b(b + ab)*aa)*
Example 2
Construct a regular expression corresponding to the automata given below −
Solution −
Here the initial state is q1 and the final state is q2
Now we write down the equations −
q1 = q10 + ε
q2 = q11 + q20
q3 = q21 + q30 + q31
Now, we will solve these three equations −
q1 = ε0* [As, εR = R]
So, q1 = 0*
q2 = 0*1 + q20
So, q2 = 0*1(0)* [By Arden’s theorem]
Hence, the regular expression is 0*10*.
Regular grammar
In the literary sense of the term, grammars denote syntactical rules for conversation in natural
languages. Linguistics have attempted to define grammars since the inception of natural
languages like English, Sanskrit, Mandarin, etc.
The theory of formal languages finds its applicability extensively in the fields of Computer
Science. Noam Chomsky gave a mathematical model of grammar in 1956, which is effective for writing
computer languages.
19
Grammar
P is Production rules for Terminals and Non-terminals. A production rule has the form α → β,
where α and β are strings on VN ∪ ∑ and least one symbol of α belongs to VN.
Example
Grammar G1 −
Here,
Productions, P : S → AB, A → a, B → b
20
N= {q1, q2, q3} T= {a, b} S = {q1}
P = {q1 aq1, q1bq2, q2aq3, q2bq2, q3aq1, q3bq2}
Example
Let us consider the grammar −
G2 = ({S, A}, {a, b}, S, {S → aAb, aA → aaAb, A → ε } )
Some of the strings that can be derived are −
S ⇒ aAb using production S → aAb
⇒ aaAbb using production aA → aAb
⇒ aaaAbbb using production aA → aaAb
⇒ aaabbb using production A → ε
We will see another example The peacock is a beautiful bird. Starting
with the sentence symbol S, this can be dividing as Noun parse one NP1
and verb parse VP, then Noun parse one is split into article one and
Noun one, article one is The and Noun one is Peacock, then VP is
dividing into verb V, article two, adjective one and Noun two N2. Verb
is is, Article two is a, Adjective is beautiful and Noun two is bird
<S> <NP1> <VP>
<NP1> <Art1> <N1>
<Art1> The
<N1> Peacock
<VP> <V> <Art2> <Adj><N2>
<V> is
<Art2> a
<Adj> Beautiful
21
<N2> Bird.
So this single arrow can be rewritten as and double
arrow directly derives.
S <NP1> <VP>
<NP1><V> <ART2> <ADJ> <N2>
<NP1><V> a <ADJ><N2>
<NP1><V> a beautiful <N2>
<ART1> <N1><V> a beautiful <N2>
The <N1><V> a beautiful <N2>
The <N1> is a beautiful <N2>
The Peacock is a Beautiful <N2>
The Peacock is a Beautiful Bird.
<S> rewritten as <NP1><VP>, then <NP1> rewritten as <PropN>, then <PropN> rewritten as
Venice, then <VP> rewritten as <Verb><NP2>, then <Verb> rewritten as is, then <NP2>
rewritten as <Art><Adj><N2>, then <Art> rewritten as a, then <Adj> rewritten as beautiful
and finally <N2> rewritten as city. So Grammar generates the sentence Venice is a beautiful
city.
Language Generated by a Grammar
Example
If there is a grammar
G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b}
Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted
string is ab, i.e.,
L(G) = {ab}
Example
Suppose we have the following grammar −
G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b}
The language generated by this grammar −
L(G) = {ab, a2b, ab2, a2b2, ………}
22
= {am bn | m ≥ 1 and n ≥ 1}
We’ll consider some languages and convert it into a grammar G which produces those
languages.
Example1
Problem − Suppose, L (G) = {am bn | m ≥ 0 and n > 0}. We have to find out the
grammar G which produces L(G).
Solution
Since L(G) = {am bn | m ≥ 0 and n > 0}
the set of strings accepted can be rewritten as −
L(G) = {b, ab,bb, aab, abb, …….}
Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’
including null.
To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions −
S → aS , S → B, B → b and B → bB
S → B → b (Accepted)
S → B → bB → bb (Accepted)
S → aS → aB → ab (Accepted)
S → aS → aaS → aaB → aab(Accepted)
S → aS → aB → abB → abb (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated
by the production set.
Example2
Problem − Suppose, L (G) = {am bn | m > 0 and n ≥ 0}. We have to find out the grammar
G which produces L(G).
Solution −
Since L(G) = {am bn | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as −
L(G) = {a, aa, ab, aaa, aab ,abb, …….}
Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’
including null.
To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions
−
23
S → aA, A → aA , A → B, B → bB ,B → λ
S → aA → aB → aλ → a (Accepted)
S → aA → aaA → aaB → aaλ → aa (Accepted)
S → aA → aB → abB → abλ → ab (Accepted)
S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted)
S → aA → aaA → aaB → aabB → aabλ → aab (Accepted)
S → aA → aB → abB → abbB → abbλ → abb (Accepted)
Thus, we can prove every single string in L(G) is accepted by the language generated
by the production set.
Hence the grammar −
G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB })
24
Type-0 grammars: generate recursively enumerable languages. The productions have no restrictions.
They are any phase structure grammar including all formal grammars.
The productions can be in the form of α → β where α is a string of terminals and non-terminals with at
least one non-terminal and α cannot be null. β is a string of terminals and non-terminals.
Example
S → ACaB
Bc → acB
CB → DB
aD → Db
Type-1 grammars: generate context-sensitive languages.
α A β → α γ β, | α |<= | β |
where A ∈ N (Non-terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule. The languages generated by
these grammars are recognized by a linear bounded automaton.
Example
AB → AbBc
25
A → bcA
B→b
Example
S→Xa
X→a
X → aX
X → abc
X→ε
Type-3 grammars: generate regular languages. Type-3 grammars must have a single non-terminal on
the left-hand side and a right-hand side consisting of a single terminal or single terminal followed by a
single non-terminal.
and a ∈ T (Terminal)
The rule S → ε is allowed if S does not appear on the right side of any rule.
Example
X→ε
X → a | aY
Y→b
26
Chapter 3: Context free languages
Definition: Let L be a language. Then L is a context-free language if and only if there exists a
context-free grammar G such that L = L(G).
Every regular grammar is context-free, so a regular language is also a context- free one, so we
see that the family of regular languages is a proper subset of the family of context-free
languages. Context- free grammars derive their name from the fact that the substitution of the
variable on the left of a production can be made any time such a variable appears in a sentential
form. It does not depend on the symbols in the rest of the sentential form (the context). This
feature is the consequence of allowing only a single variable on the left side of the production.
Example
27
For generating a language that generates equal number of a’s and b’s in te form
anbn . the context free grammar is defined as G= {(S, A), (a, b), (S aAb, A
aAb/ ε)}
SaAb
aaAbb (AaAb)
aaaAbbb (A aAb)
aaabb (A ε)
aaabbb a3b3 anbn
28
Parsing is the process of analyzing a string of symbols, either in natural or computer language.
Ambiguity
Grammar can be derived in either leftmost or rightmost derivation. We can draw the derivation
tree called as parse tree or syntax tree.
Parse tree is a unique one though the derivation is leftmost or rightmost.
If there exists more than one parse tree for a given grammar that means more than one
rightmost or leftmost is possible: that grammar is said to be ambiguous grammar.
29
For example, consider the grammar
S → SS
S → aSb
S→ε
The sentence aabb has two derivation trees as shown in the figures below:
30
each statement, ambiguity must be removed when possible. Often we can achieve this by
rewriting the grammar in an equivalent, unambiguous form.
Sentential forms
Replacing right most non terminal, then replacing right most non terminal, and so on. So always
replacing right most non terminal, such a derivation is called right most derivation. Left most
and right most derivation are very important because later on when you learn about compiler
realizing top most parsing used for left most derivation and bottom most parsing used for right
most derivation.
We will consider one more example Venice is a beautiful city.
<S> <NP1> <VP> <S> <NP1><VP>
<NP1> <PropN> <PropN><VP>
<PropN Venice Venice <VP>
>
<VP> <V> <NP2> Venice <V><NP2>
<V> Is Venice is <NP2>
<NP2> <Art><Adj><N Venice is <Art><Adj><N2>
2>
<Art> A Venice is a <Adj><N2>
<Adj> beautiful Venice is a beautiful <N2>
<N2> City Venice is a beautiful city.
31
• B b | bS | aBB
• For the string baaabbabba find leftmost, rightmost derivation and Derivation tree
• First write the production rules separately like below
• S aB rule1
• S bA rule2
• Aa rule3
• A aS rule4
• A bAA rule5
• Bb rule6
• B bS rule7
• B aBB rule8
32
Left most and right most derivations
Generation of the string using production rules is called derivation. In derivation of string:
replacing the non -terminal by appropriate rule. If there are more non-terminal available then
which non-terminal has to replace first; which is based on the two methods listed below:
Leftmost Derivation
It’s a derivation in which, leftmost non-terminal is replaced first from the sentential form
Rightmost Derivation
Its derivation in which, rightmost non-terminal is replaced first from the sentential form
Simplification of context free grammar
All Grammars are not always optimized. That means grammars are containing some unnecessary
symbols(non-terminals) and this will increase the length of the grammar.
Simplification of grammar means reduction of grammar by removing useless symbols.
This is done in three ways:
i. Removal of useless symbols
ii. Elimination of ε production
iii. Removal of unit production
P= { S 11A | 11
A0 }
33
34
35
Chomsky’s hierarchy of grammars
36