Lecture 2 - Chapter 1 (1.2)

Download as pdf or txt
Download as pdf or txt
You are on page 1of 32

Chapter 1: Introduction

Languages, Grammars and


Automata
 Language
 Symbol
 Alphabet
 String
 Length of a string, concatenation of string, …
 Grammars
 Automata
 Deterministic Automata
 Non-deterministic Automata
 The Chomsky Hierarchy
 Before we define a language lets come clear with some
important terms.
 Symbol - an arbitrary datum which has some meaning to or
effect on the machine.
 Letters and digits are examples of frequently used
symbols.
 Alphabet (Â): is a finite set of symbols
 Example:- Â = {a, b, c, d}, Â = {0, 1}
 String ( or Word) – is a finite sequence of symbols
 Example:-W = abc, V = 123, Z = 01010110
 Length a string W, denoted by |W| , is the number of symbols
composing the string.
 Example: if W = abbab, then
 |W| = 5
 |W|b = 3 (length of a word w with respect to b )

• Empty string is denoted by ε or l, |l| = 0


 If Â= {a, b}, then ab, abb, a, b, are some of the strings over Â
 Σ* =All strings of symbols from Σ (Kleene closure)
 Σ+ = Σ* - {ε} ( Positive closure)
 Σi = Set of Strings over Σ having length i.
 Example If Σ = {a, b, c} then
 Σ* = {l, a, b,c, aa, ab, bc, abc, , …}
 Σ+ = { a, aa, abc, aabc, …}
 Σ 1= Σ
 Σ 3= {aab, abc, ccc, abb, …}
 If w is a string then w0 = l
 Note that
 Σ* =

 Σ+ =
 Given a string w = abc
 Prefix of w are l, a, ab, & abc
 Suffix of w are l, c, bc, & abc
 Proper prefix or suffix is a prefix or suffix other
than the string itself
 Proper Prefix of w are l, a, & ab
 Proper Suffix of w are l, c, & bc
 The concatenation of two strings is obtained by appending the
second string to the right of the first string
 Example: -
 Given w = a1a2..an & v = a1a2..am then concatenation of w &
v , denoted as wv, is given as wv = a1a2..ana1a2..am
 a3.a2.= aaabb
 Two strings are equal if they have the same number of symbols
and these symbols match, character for character.
 Concatenation operation is associative, and it is not
commutative
 The empty string serves as the identity element
for concatenation.
 That is, for all strings x, x.l =l.x =x
 String Reversal – Let w = a1a2…an be a string,
then the string reversal of w, reverse (w) or wR, is
anan-1…a1
 E.g. If w = abc, then reverse(w) = wR =cba
 Definition:
 A language, L, is a collection of strings over a given
alphabet with some rules known as grammars. It may or
may not be finite.
 A language may be thought of as a subset of all possible
strings (L z Σ*)

 Example Let Σ = {a, b}


 The set {a, aa, abab, ba} is a finite language over Σ
 The set L = {anbn:n > 0} is an infinite language over Σ
 Example: Let Σ = {0, 1}, then
 L = {x|x is in Σ* and x contains an even number of
0’s} = {11, 1010, 00, 0000, 1001, etc}
 Language are sets: Hence, the union, intersection, &
differences of two languages are defined.
 Complement of a language L with respect to Σ* is
L = Σ*- L
 The reverse of a language is the set of all string
reversals, LR = {wR:weL}
 The concatenation of two languages L1 & L2 is the set of all
strings obtained by concatenating any element of L1 with any
element of L2:
 i.e. L1L2 = {xy: x e L1, y e L2 }
• Example: L1 = {a, b} & L2 = {c, de}, then L1L2 = {ac, ade,
bc, bde}
• Ln is L concatenated with itself n times
 L0 = {l}, L1 = L, L2 = L1 L1 , L3 = L1 L1 L1…
 L* = L0UL1UL2… (Star Closure)
 L+ = L1UL2UL3… (Positive closure)
 Example: If L = {anbn:n > 0} then
 L0 = {l}
 L1 = L = {anbn:n > 0}
 L2 = {anbnambm :n > 0, m > 0}. note that n and m
are unrelated – the string aaabbbaabb is in L2
 L3 = {anbnambm akbk:n > 0, m > 0, k > 0}, here
again n, m and k are unrelated – (that is they be the
same or different)
 LR = {bnan:n > 0}
 Definition
 A Grammar is defined as a quadruple
G = (N, T, S, P), where
 N (sometimes denoted as V) is the set of finite set of objects called
non-terminals or variables.
 They are place holders and are denoted by capital letters.
 There should exist rules in P that tells us how to replace the non-
terminals.
 T (sometimes denoted as Σ ) is a finite set of objects called
terminals.
 Used to construct strings of a language, How?
 Are usually represented by small letters and digits
 P - stands for production rules, are hearts of
grammar
 Each production rule is in the form of
(N U T)+ f (N U T)*
 The left hand side should contain at least one non-
terminal
 A production rule XfY implies replace X by Y in
some string
 Example: wxy  wyv … using production XfY
 S e N is a start symbol
 There should be at least one production that has S at
left hand side. (S f (NUT)*).
 The derivation of a string of a language should start
with S.
 Definition:
 Let G = (N, T, S, P) be a grammar. Then the set
L(G) = { w e T* : S
*
w} is the language
generated by G.
 Example1:
 Consider the grammar G = ({S}, {a, b}, S, P) with p
S f aSb ……..(1)
S f l ………..(2)
 Find the language generated by G
 Solution
 Let the language be L(G)
 Look how strings of the language are produced and the
general formula for the language produced by the
grammar is given
S l (2) So, l e L(G)
S aSb using (1) ab e L(G)
ab using (2) So,
S aSb using (1) aabb e L(G)
aaSbb Using (1) So, or
aabb using (2) a2b2 e L(G)
S aSb using (1) aaabbb e L(G)
aaSbb Using (1) So, or
aaaSbbb Using (1) a3b3 e L(G)
aaabbb Using (2)
… …
So, L(G) = {, ab, aabb, aaabbb, aaaabbbb, etc}
L(G) = {anbn : n≥0}
 Example 2:
 Find the language generated by the following grammar G =
{N, T, S, P} where
 N = {S, B}
 T = {a, b, c}
 P : S f aBSc….(1)
S f abc….(2)
Ba f aB….(3)
Bb f bb….(4)
 Answer: L(G) = {anbncn : n e N}
S  abc using (2) So abc e L(G)
S  aBSc using (1)
aBabcc using (2) aabbcc e L(G)
aaBbcc using (3) or
aabbcc using (4) So a2b2c2 e L(G)

S  aBSc using (1)


 aBaBScc using (1) aaabbbccc e L(G)
 aBaBabccc using (2) Or
 aaBaBbc3 using (3 & 3) So a3b3c3 e L(G)
 aaaBbbc3 using (3 & 4)
 a3b3c3 using (4)
… …
 Example:
 Find a grammar that generates L(G) = {anbn+1: n ≥ N}
 Answer
 G = (N, T, S, P) where
 N = {S, A}, T = {a, b}, S = S
 P: S f Ab, S f aSb, A f l
Or
 (P: S f Ab, A f aS, A f l)
1. Find a grammar that generates
a) L(G) = {an: n ≥ 0}
b) L(G) = {(ab)n: n ≥ 0}

2. Find the language generated by the following grammar


a) G = (N, T, S, P) where
N = {A, S}
T = {a, b}
P: S  Ab
A  aA | a | b | bA
 An automata is an abstract model of a digital computer
(= it is a self operating machine).
 Every automaton includes some essential features.
 Mechanism for reading input – the input is a string
over a given alphabet written on an input file
 Produce an output in some form
 Have temporary storage device – the storage cells
can be read and changed.
 Has a control unit
 Can be in any one of a finite number of internal states
 The internal state of the control unit is determined by the next-
state or transition function
input

Storage

Control Unit

Output
BATTERY

input: switch
output: light bulb
Actions (Control Unit): flip switch
States (Storage): on, off
 A particular state of the control unit, input file, and
temporary storage is known as configuration.
 A move is the transition of the automaton from one
configuration to the next.
 One automaton differs from the other in the way in
which the output can be produced and the nature of
the temporary storage.
 An automaton can be deterministic or non-
deterministic
 A deterministic automaton is the one in which
each move is uniquely determined by the current
configuration.
 i.e. if the internal state, the input and the content of
the temporary storage, we can predict the future
behavior of the automaton.
 In non-deterministic automaton there may have
several possible moves, so we can only predict a set of
possible actors.
 An automaton whose output response is limited to a
simple “yes” or “no” is called an accepter (because it
accepts or rejects)
 A more general automaton, capable of producing
strings of symbols as output is called a transducer.
 There are four categories of formal grammars in the
Chomsky Hierarchy,
 Type 0 - the most general  difficult to parse but highly expressive
 Type 1
 Type 2
 Type 3 - the most restrictive easy to parse but less expressive

 Every type 3 grammar is a type 2 grammar, and every type 2 is


a type 1 and so on.
 Each category of grammars have corresponding category of
languages and automata
The Chomsky Hierarchy
 ….
 Type 3 grammars - Regular Grammars - is a grammar
which is either left-linear or right-linear
 Left linear – all production have the form A -> Bx, A -> x
where A & B are non-terminals and x is any string of
terminals
 Right linear – all production have the form A -> xB, A ->
x where A & B are non-terminals and x is any string of
terminals

 Type 2 grammars - Context Free Grammars


 Productions are of the form X–> v where v is an arbitrary
string of symbols in V, and X is a single non-terminal.
 Type 1 Grammars - Context-sensitive grammars
 Productions are of the form uXw –> uvw where u, v and w are
arbitrary strings of symbols in V, with v non-null, and X a single
non-terminal.
 In other words, X may be replaced by v but only when it is
surrounded by u and w. (i.e. in a particular context).
 Type 0 Grammars - Free or unrestricted grammars
 These are the most general productions which have the form u –>
v where both u and v are arbitrary strings of symbols in V, with
u non-null.
 There are no restrictions on what appears on the left or right-
hand side other than the left-hand side must be non-empty.
Grammar Language Automata

Regular Grammars - Regular Languages Finite State Automata


Type 3
Context Free Context Free Push down Automata
Grammars - Type 2 Languages
Context-sensitive Context sensitive Bounded Turing
grammars – Type 1 languages Machine
Free or unrestricted Recursively Turing Machine
grammars– Type 0 Enumerable
languages

You might also like