TOC SY Unit-3

Download as pdf or txt
Download as pdf or txt
You are on page 1of 80

Unit 3: Regular Expression (07)

Regular Expression (RE):

1. Definition and operators, Primitive Regular Expressions, Algebraic Laws of Regular


Expressions, Languages Defined by Regular Expressions, Building Regular Expressions,
Closure Properties of Regular Languages, Regular expression examples.
2. Inter-conversion of RE and FA,
3. Construction of FA equivalent to RE (RE to Є-NFA, Є-NFA to DFA).
4. Construction of RE equivalent to FA using Arden’s Theorem.
5. Pumping Lemma for Regular languages, Limitations of FA.(Will cover after T2)
Regular Language: Set of String accepted by FA
Regular Expression :
It is a symbolic representation of strings in a language(L)
It is a finite expression which can represent infinite number of strings of a language
It is a short notation used to denote complex and infinite regular languages
It is composed of some operands and some operators

OPERANDS: are characters from the alphabet over which the regular expression is defined.

OPERATORS:
+= Matches the character before the + one or more times Eg: a+c -> ac, aac, aaac

*=Matches anything in the place of the *, or a "greedy" match : Eg : ab*c -> abc, abbcc, abcdc
. = Matches any character as a wildcard Eg: a.c -> abc, asc, a123c

Precedence of REs Operators:


1. The star (*) operator is of the highest precedence.
2. Next comes the dot (.) or concatenation operator,After grouping all stars to their operands, we group concatenation
operator of their operands. Concatenation is an associative operator so it doesn’t matter in which order we group consecutive
concatenations.
3. Union (+ operator) are grouped with their operands.

In a regular expression, x* means zero or more occurrence of x. It can generate {e, x, xx, xxx, xxxx, .....}

In a regular expression, x+ means one or more occurrence of x. It can generate {x, xx, xxx, xxxx, .....}
b
(a+b)*
=
(a+b)0 = ɛ
(a+b)1 = {a,b}
b (a+b)2 = {aa,ab,ba,bb}
(a+b)3 = {aaa,aab,abb,bbb}

(a+b)n = {an,bn,............}
Formal Definition
The class of regular expressions over Σ is defined recursively as follows
1. Regular expressions over Σ (include letters ∅ (empty set) and ε (empty
string of length 0) i.e L(r)={ε} and L(r)= { } for ∅
2. Every symbol a ∈ Σ is a RE over Σ.
3. If R1 & R2 are RE over Σ then so are (R1+R2), (R1.R2) & (R1)*, where
‘+’ indicates alternative path (parallel path),
The operation ‘.’ denotes concatenation (series connection) and “*” denotes
iteration (closure)
4. RE are only those that are obtained using rules 1 to 3
● If ‘r’ is a regular expression, then the language represented by ‘r ‘ is denoted by
L(r).
● e.g.
1) if r =a + b i.e. union, a or b
L (r) = { a,b}
2) if r= a.b i.e concatenation
L (r) = { ab}
3) if r = a* i.e. kleene closure
L (r) = {ε,a, aa,aaa, aaaa,......}

● Thus, we say that set of regular language is closed under union, concatenation
and kleene closure.
● R1+R2 means ORing of two regular expression i.e. R1 or R2
● R1.R2- means R1 followed by R2 i.e. R1R2
● R1* = Zero and more occurance of R, (iterative operation)
● R1+ =( Transitive closure)- Atleast one and more occurance of R1
● R+ = R.R*

Regular Set
Any set that represents the value of the Regular Expression is called a Regular Set

Regular sets have various properties:

1. The union of two regular sets is also a regular set


2. The intersection of two regular set is also a regular set.
3. The complement of a regular set is also regular set.
4. The difference of two regular set is also a regular set.
5. The reversal of a regular set is also a regular set.
6. The closure of a regular set or an expression is also a regular set.

And …….
Algebraic Laws of Regular expressions over ∑
● Ø is a RE denotes the empty set { }
● ε is a RE denotes {ε}
● a is a RE, a ∈ ∑ denotes {a}
● let R1 and R2 be two REs on ∑ then
● R1 + R2 is RE

● R1 . R2 is RE

● R1* is RE
Quick test: Match the following

Regular Language Regular Expression

1 L = {a,b,c} a 0(0+1)*1

2 L = {abb,ab} b (ab)*

3 L = {ɛ,a,aa,ab,ba,bb,aab,abb....} c (a+b)*

4 L = {ɛ ,ab,abab,..} d (0+1)*

5 L = {01,011,001,00110,0111….} e (a+b+c)

6 L = {ɛ,0,1,01,10,11,00,001,111....} f (ab+a).b
Quick test: Complete the table by finding RE or RL of the following

Regular Language Regular Expression

L = {ε, 00, 0000, 000000, ......}

R = (1* 0*)

L = {a, aa, aaa, ....}

L = {a, b, c, ab, aab,ac, bc,bcc,abbcc,


aabc,...}

R= (b* (aaa)* b*)*

r.e. = (any combination of b's) (aaa)*


(any combination of b's)
Quick test: Find RE or RL of the following

Regular Language Regular Expression

L = {ε, 00, 0000, 000000, ......} R = (00)*

L = {ε, 0, 1, 00, 11, 10, 100, .....} R = (1* 0*)

L = {a, aa, aaa, ....} R = a+

L = {ε,a, b, c, ab, aab,ac, bc,bcc,abbcc, aabc,...} R = a* b* c*

L = {ε,b, aaa,baaa,bbbaaab,...} R= (b* (aaa)* b*)*

r.e. = (any combination of b's)


(aaa)* (any combination of b's)
Closure Properties of Regular Languages

● Union : If L1 and If L2 are two regular languages, their union L1 ∪ L2 is a regular

Language.

● Intersection : If L1 and If L2 are two regular languages, their intersection L1 ∩

L2 is a regular language.

● Concatenation : If L1 and If L2 are two regular languages, their concatenation

L1.L2 is a regular language.

● Kleene Closure : zero or more occurrence of language L


Examples
1. Write the Regular Expression for language accepting all combinations of
a’s over the set ∑={a}
2. Write the Regular Expression for language accepting all strings except
Null string over the set ∑={a}
3. Write the Regular Expression for language accepting all strings
containing any no. of a’s and b’s
4. Write the Regular Expression for language accepting all the strings
starting with 1 and ending with 0 over ∑={0,1}
5. Write the Regular Expression for language accepting all the strings
starting with a but not having consecutive b’s
Examples
1. Write the Regular Expression for language accepting all combinations of
a’s over the set ∑={a}
ans:
All combinations of a's means a may be zero, single, double and so on. If
a is appearing zero times, that means a null string. That is we expect the
set of {ε, a, aa, aaa, ....}. So we give a regular expression for this as:

R = a*
Examples
1. Write the Regular Expression for language accepting all combinations of a’s over the
set ∑={a} Ans: RE = a*

2. Write the Regular Expression for language accepting all strings except Null string over
the set ∑={a} Ans: RE = a+

3. Write the Regular Expression for language accepting all strings containing any no. of
a’s and b’s Ans: RE = (a+b)*

4. Write the Regular Expression for language accepting all the strings starting with 1 and
ending with 0 over ∑={0,1} Ans: RE = 1(0+1)*0

5. Write the Regular Expression for language accepting all the strings starting with a but
not having consecutive b’s Ans: L = {a, aba, aab, aba, aaa, abab, .....} RE= (a+ab)*
Examples
6. Write the Regular Expression for language consisting of all strings over ∑={0,1}
with at least two consecutive 0’s.

7. Describe in simple english the language represented by RE r=(1+10)*


Examples
6. Write the Regular Expression for language consisting of all strings over ∑={0,1} with
at least two consecutive 0’s.
R = (0+1)*0.0(0+1)*
7. Describe in simple english the language represented by RE r=(1+10)*
language L over ∑={0,1} having strings beginning with ‘1’ and not having two
consecutive 0’s.
11. Show that (a.b) * ≠ a*.b*
● Ans
Let r1 = (a.b)* and r2 = a*.b*

L(r1) = {ε,ab,abab,ababab,....} -1
L (r2) = {ε,a,aa,aaa,aaaa,..} . {ε,b,bb,bbb,bbbb,...}
= {ε,a,b,aa,bb,aaa,bbb,.....} -2
From 1 & 2
● (a.b) * ≠ a*.b*
For Practice:
If L(r) = Set of all strings over Σ = {0,1,2} such that at least one ‘0’ followed by at least one ‘1’
followed by at least one ‘2’ is there, Find RE ‘r’ representing this language

Ans- r= 0.0*.1.1*.2.2*

If r= ab*a, describe L(r) in the form of a set.

Ans -

L(r)= {aa,aba,abba,abbba,abbbba,.....}
If L(r) = {a,c,ab,cb,abb,cbb,abbb,....} What is r ?
Ans : r= (a+c).b*

If r= b*.a.b*.a.b* Describe L (r) in simple english language

Ans: L( r ) = is the language over Σ={a,b} containing exactly two a’s

Represent the language over Σ = {a,b} containing at least one ‘a’ & at least one ‘b’
Ans

RE. = (a+b)*. a.(a+b)* . b(a+b)* + (a+b)*. b.(a+b)* .a. (a+b)*


Represent set of all strings of a’s and b’s containing atleast one combination of
double letter using RE

Ans. R= (a+b)* . (aa + bb) . (a+b)*


If L(r) = {ε, x, xx, xxx, xxxx, xxxxx} what is r?
Ans : R = ( ε + x) 5

10. L(r)= set of all strings over Σ = {0,1} ending with 011

Ans- r= (0+1)*. 011


Determine the regular expression for all strings containing exactly one ‘a’ over ∑ = {a, b, c}.

(b + c)* a (b + c)*
Write the regular expression for the language accepting all the string in which any number of a's is followed by any number of b's
is followed by any number of c's.

As we know, any number of a's means a* any number of b's means b*, any number of c's means c*. Since as given in problem
statement, b's appear after a's and c's appear after b's. So the regular expression could be:

1. R = a* b* c*

Describe the language denoted by following regular expression

r.e. = (b* (aaa)* b*)*

The language can be predicted from the regular expression by finding the meaning of it. We will first split the regular expression
as:

r.e. = (any combination of b's) (aaa)* (any combination of b's)

L = {The language consists of the string in which a's appear triples, there is no restriction on the number of b's}
Write the regular expression for the language containing the string over {0, 1} in which there are at least two occurrences of 1's
between any two occurrences of 1's between any two occurrences of 0's.

At least two 1's between two occurrences of 0's can be denoted by (0111*0)*.

Similarly, if there is no occurrence of 0's, then any number of 1's are also allowed. Hence the r.e. for required language is:

1. R = (1 + (0111*0))*
Let L(r) = set of all strings over ∑={a,b} in which the string either contain all b’s or else there is
an an ‘a’ followed by some b’s , the set also contain ε. Find Regular Language for above
language

R.E.= b* + ab* = ( ε + a) b*

Let L is subset of {0,1}* be the language of all strings of even length. Find RE
Ans
Input= {0,1}
L= {00,01,10,11}
L(r) = (00+01+10+11)*
Or
((0+1).(0+1))*
➢ Find RE for the language consisting of all strings of a’s and b’s without
any combination of double letters.
Ans- (ε+b) (ab)* (ε+a)

➢ If L(r) = {aaa,aab,aba,abb,baa,bab,bba,bbb}. Find RE r which represent L


(r)
Ans- (a+b) 3 0r (a+b).(a+b.(a+b)
RE to FA

•We want to construct a DFA that accepts the language


denoted by given Regular expression.

•We first need to obtain Є-NFA, converts it to NFA which in


turn can be converted to its equivalent DFA
RE to NFA with Є- moves conversion
•Draw an NFA with Є-moves for RE ,
r= a. (a+b) *
which represents language consisting of strings of a’s and
b’s, starting with a.
Draw an NFA with Є-moves for RE , r= (a*+b*)
(ab+ba)*.aa.(ab+ba)*
Convert RE to ∊-NFA
0+01*
(11+0)* (00+1)*
RE to DFA
Direct Method
Ex. Construct DFA that accepts the language represented by
0*.1*.2*
Step1: convert 0*.1*.2* to its equivalent NFA with Є
Step 2: from Є - NFA start finding with initial epsilon state transitions

0 1, 3, 4, 5, 7, 8, 9, 11

As 11 is final state make this state as final state


Step 2: from Є - NFA start finding with initial epsilon state transitions

0 1, 3, 4, 5, 7, 8, 9, 11

As 11 is final state make this state as fina


Step 3: find input(i.e. 0 1 2 ) transition here for State A

A
0
0 1, 3, 4, 5, 7, 8, 9, 11 2 1, 3, 4, 5, 7, 8, 9, 11 B

1
2
Step 3: find input(i.e. 0 1 2 ) transition here for State A

A
0
0 1, 3, 4, 5, 7, 8, 9, 11 2 1, 3, 4, 5, 7, 8, 9, 11 B

1
2
C
D 6 5, 7, 8, 9, 11
10 9, 11
Step 3: find input(i.e. 0 1 2 ) transition here for State A

A
0
0 1, 3, 4, 5, 7, 8, 9, 11 2 1, 3, 4, 5, 7, 8, 9, 11 B

1
2
C
D 6 5, 7, 8, 9, 11
10 9, 11
Step 2:
It can be converted to DFA using direct method
•Construct DFA that accepts language represented
by

• r= (ab/ba)*.aa.(ab/ba)*

• = (ab+ba)*.aa.(ab+ba)*
Convert to DFA a. (a+b)*
Find RE for DFA

R.E. = (0+1).(0+1)*
DFA to RE: Arden’s Theorem

● State Transition Graph is of DFA (no ‘ε’ moves)


● STG has only one initial state ‘q0’
● q0, q1, ....., qk are states in STG

● state ‘qi’ is the RE representing the set of words accepted by DFA if ‘qi’ is
the final state
● equations are written for each state
● start state has incoming ‘ε’ signal
● equations take form R = Q+RP where P,Q are REs and P does not contain
‘ε’ then R has unique solution R=QP*
● back substitute here onwards
Derivation R= QP*

R = Q + RP
= Q + (Q + RP)P
= Q+QP+RP2
= Q+QP+ (Q + RP)P2
= Q+QP+QP2+RP3
= Q+QP+QP2+(Q+RP)P3
= Q+QP+QP2+QP3+RP4
= Q+QP+QP2……+QPn +RPn+1
= Q(ε+P+P2+....+Pn) +RPn+1
.....
= QP*
Arden’s Theorem Examples
Find regular expression for the following DFA using Arden’s Theorem-
Solution-
Step-01:
Form a equation for each state-
A = B.1 => A = ∈ + B.1 ……(1) to write equation in Arden’s Thm format use ∈
B = A.0 ……(2)

Step-02:

Bring final state in the form R = Q + RP.


Using (1) in (2), we get-
B = (∈ + B.1).0
B = ∈.0 + B.1.0
B = 0 + B.(1.0) ……(3) Using Arden’s Theorem in (3), we get- R=B; Q=0; P=(1.0)
B = 0.(1.0)*
Thus, Regular Expression for the given DFA = 0(10)*
Arden’s Theorem Examples 1 0
Find regular expression for the following DFA using Arden’s Theorem-
Q0 Q1
Solution- 0
Step-01:
Form a equation for each state-
1
Q0 = Q0.1 => Q0 = ∈ + Q0.1 ……(1) to write equation in Arden’s Thm format use ∈
Q1 = Q0.0 +Q1.0 + Q1.1……(2)

Step-02:

Bring final state in the form R = Q + RP.


Using (1) => Q0 = ∈ .1* ……(3) (as, R=Q0, P=, Q=∈ => R=QP*)
in (2), we get-
Q1 = Q0.0 +(0 +1) Q1

Q1 =1*.0 +(0 +1) Q1 Using Arden’s Theorem in (3), we get- R=Q1; Q=1*.0; P=(1.0)
Q1 =1*.0(0 +1)*
Thus, Regular Expression for the given DFA = 0(10)*
Home Work
Find regular expression for the following DFA using Arden’s Theorem-
Find regular expression for the following DFA using Arden’s Theorem-
Solution-

Step-01:

Form a equation for each state-


q1 = ∈ + q1.b + q2.a ……(1)
q2 = q1.a + q2.b ……(2)
Step-02:
Bring final state in the form R = Q + RP. Using Arden’s Theorem in (2), we get-
q2 = q1.a.b* …….(3)

Using (3) in (1), we get-


q1 = ∈ + q1.b + q1.a.b*.a
q1 = ∈ + q1.(b + a.b*.a) …….(4)
Using Arden’s Theorem in (4), we get-
q1 = ∈.(b + a.b*.a)*
q1 = (b + a.b*.a)*
Thus, Regular Expression for the given DFA = (b + a.b*.a)*
•A regular set or a regular
language is a language
accepted by some FA that
Regular can be represented by a
sets & their regular expression .
closure
properties •Regular sets are thus closed
under some operations that
are used to construct RE
such as +, ., *
Formal definition for Regular sets
Closure properties of Regular Sets
Pumping lemma for Regular Language
• Pumping lemma is used to check whether a given language is
regular language or not.
Given that i>=1
1. Assume that L is a regular language.
L= {ab,aabb,aaabbb,....}
&n is a constant
Now any take any string that can be from defined
language and write it in uvw form and compare |uv| <=
n & |v|>=1
2. z= aa ab bb
3. Using pumping lemma uviw
Prove that L= { anbn | n>=1} is not regular
•{ ap | P is a prime number and is not
regular}
Application of RE & FA

1. Lexical Analyser
2. Text Editor
3. Grep Command

You might also like