Theoretical Computer Science 112 (1993) 277-289
Elsevier

Descriptional complexity of context-free grammar forms

Erzsébet Csuhaj-Varjú
Alica Kelemenová

Communicated by A. Salomaa
Received August 1989
Revised November 1991
Abstract
Csuhaj-Varjú, E. and A. Kelemenová, Descriptional complexity of context-free grammar forms, Theoretical Computer Science 112 (1993) 277-289.
Descriptional complexity aspects of grammar forms are studied. It is shown that grammatical complexity measures $HEI_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ related to any appropriate infinite class $\mathcal{G}$ of grammars are unbounded on the infinite class of languages determined by strict/general interpretations of any infinite grammar form.
1. Introduction
Descriptional (grammatical) complexity measures were introduced in [1, 4, 5] in order to classify context-free languages according to the size and/or structural properties of their grammars. The size of a grammar is expressed by such complexity measures as the number of nonterminals (VAR) and the number of productions (PROD). The number of grammatical levels (LEV), the maximal number of elements of grammatical levels (DEP) and the height of the digraph of grammatical levels (HEI) are the complexity measures reflecting the structure of grammars.
One of the aspects of grammatical complexity theory is the study of the functional behaviour of the complexity measures on language classes. Complexity measures are functions defined on context-free languages, with values being natural numbers; thus, one can ask for the set of all values of complexity of languages or simply for the boundedness/unboundedness of the complexity measure (on a given class of languages); the latter leads to the finiteness/infinity of the corresponding language hierarchy. Obviously, this behaviour depends strictly on the grammar class that is used to specify languages. For a large variety of complexity measures it was proved that, related to appropriate grammar classes, for an arbitrary natural number $n$, there is a context-free language with the complexity equal to $n$ (see e.g. [4, 5, 7]). Following this line, the study of the problem of boundedness/unboundedness for remarkable subclasses of the context-free language class is of interest. In this paper we concentrate on language families defined by grammar forms which present a natural generalization of the class of all context-free languages.

Correspondence to: E. Csuhaj-Varjú, Computer and Automation Institute, Hungarian Academy of Sciences, Victor Hugo u. 18-22, H-1132 Budapest, Hungary.
0304-3975/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved
Context-free grammar forms define infinite families of structurally related grammars via special finite substitutions (interpretations) of terminals and nonterminals in the production set. (For details the reader is referred to [11].) The main result of this paper establishes unboundedness of complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $DEP_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ on the classes of languages defined by grammar forms. This property holds for a rather large variety of grammar classes $\mathcal{G}$ describing these languages. The statements are presented with full technical details. They complete the earlier results given in [3, 8, 9].
The paper is organized as follows.
Section 2 lists some basic definitions from formal language theory.
In Section 3 we construct, for every fixed natural number $k$, $k \geq 1$, some context-free languages that are of complexity at least $k$ for an arbitrary subclass of context-free grammars which enables one to generate these languages. The results are of auxiliary character and serve in proving the main statement in Section 5.
In Section 4 some special interpretations are presented to obtain the interpretation grammars generating languages of the previous section. These mappings are isolation, linear isolation, copy and renaming a single symbol.
In Section 5 we show that grammatical complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ are not bounded on strict and on general grammatical families of self-embedding (infinite non-self-embedding linear) grammar forms for an arbitrary class of reduced (e.g. non-self-embedding linear) grammars, that is, for every natural number $k$ and for each of the above complexity measures, there is a language of grammatical complexity at least $k$ in the strict/general grammatical family of grammar forms. From these statements some results from previous papers can be derived as corollaries.
2. Basic definitions
We assume the reader to be familiar with the basics of formal language theory. For the details not explained here the reader is referred to [10].
We denote context-free grammars (shortly, grammars) by $G = (N, T, P, S)$, where $N$, $T$, $P$ are the sets of nonterminals, terminals and productions, respectively, and $S$ is the start symbol. The context-free language (the language) generated by $G$ is denoted by $L(G)$. By $SF(G)$ we mean the set of sentential forms derivable in a context-free grammar $G$ from $S$.
A context-free grammar $G$ is said to be reduced iff, for all $A \in N$, there is a derivation $S \Rightarrow^{*} uAv \Rightarrow^{*} w$ in $G$, with $u, v, w \in T^{*}$. For a context-free grammar $G$, we denote by $G_{red}$ a grammar obtained from $G$ by elimination of all nonterminals $A$ for which no derivation $S \Rightarrow^{*} uAv \Rightarrow^{*} w$ can be found in $G$, where $u, v, w \in T^{*}$.
A nonterminal $A$ of $G$ is said to be recursive iff there is a derivation $A \Rightarrow^{+} uAv$ in $G$, where $uv \in T^{+}$.
A reduced context-free grammar $G$ is said to be self-embedding if there is a nonterminal $A$ in $N$ such that a derivation $A \Rightarrow^{+} uAv$, with $u, v \in T^{+}$, exists; otherwise, it is said to be non-self-embedding.
For a language $L$, we denote by $alph(L)$ the smallest alphabet $T$ such that $L \subseteq T^{*}$. For $w \in L$, we denote by $|w|$ the length of $w$ and by $suf_l(w)$ the suffix of length $l$ of $w$.
For a class $\mathcal{G}$ of context-free grammars, we denote by $\mathcal{L}(\mathcal{G})$ the class of languages generated by elements of $\mathcal{G}$.
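The passage from $G$ to $G_{red}$ can be sketched in a few lines. This is only an illustration using the standard two-pass procedure (keep nonterminals that derive a terminal word, then keep those reachable from the start symbol); the function name `reduce_grammar` and the encoding of productions as `(lhs, rhs_tuple)` pairs are assumptions of this sketch, not notation from the paper.

```python
def reduce_grammar(nonterminals, productions, start):
    """Return (useful nonterminals, productions of the reduced grammar).

    productions: list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    Hypothetical helper; symbols not in `nonterminals` count as terminals.
    """
    N = set(nonterminals)

    # Pass 1: nonterminals that derive some terminal word.
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in productive and all(s not in N or s in productive for s in rhs):
                productive.add(lhs)
                changed = True
    kept = [(l, r) for l, r in productions
            if l in productive and all(s not in N or s in productive for s in r)]

    # Pass 2: nonterminals reachable from the start symbol via kept productions.
    reachable, stack = {start}, [start]
    while stack:
        A = stack.pop()
        for l, r in kept:
            if l == A:
                for s in r:
                    if s in N and s not in reachable:
                        reachable.add(s)
                        stack.append(s)

    useful = productive & reachable
    return useful, [(l, r) for l, r in kept if l in useful]


# S -> aA | B, A -> a, B -> bB, C -> c: B derives no terminal word, C is unreachable.
P = [("S", ("a", "A")), ("S", ("B",)), ("A", ("a",)), ("B", ("b", "B")), ("C", ("c",))]
useful, reduced = reduce_grammar({"S", "A", "B", "C"}, P, "S")
```

The order of the two passes matters: removing unreachable symbols first could leave nonterminals that are reachable only through productions discarded later.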
In what follows, we review the notions of descriptional complexity measures (size and structural complexity measures) of context-free grammars (languages) introduced in [1, 4, 5].
The size measures for a context-free grammar $G$ are the number of its nonterminals, denoted by $VAR(G)$, and the number of its productions, denoted by $PROD(G)$.
In order to define structural complexity measures, we have to introduce relation $\rhd$ on $N$ for $G = (N, T, P, S)$. For two nonterminals $A$ and $B$ of $G$, we write $A \rhd B$ if there is a production $A \to uBv$ in $G$, with $u, v \in (N \cup T)^{*}$. $\rhd^{+}$ denotes the transitive closure of $\rhd$ and $\rhd^{*}$ the reflexive and transitive closure of $\rhd$.
An equivalence relation $\equiv$, defined as $A \equiv B$ iff $A \rhd^{*} B$ and $B \rhd^{*} A$, determines on $N$ equivalence classes, called grammatical levels. For two grammatical levels $Q_1$ and $Q_2$ of $G$, where $Q_1 \neq Q_2$, we write $Q_1 > Q_2$ iff there are nonterminals $A \in Q_1$ and $B \in Q_2$, with $A \rhd B$.
Structural complexity measures for a context-free grammar $G$ are defined as follows:
$LEV(G)$ denotes the number of grammatical levels of $G_{red}$,
$DEP(G) = \max\{card(Q) : Q \text{ is a grammatical level of } G_{red}\}$,
$HEI(G) = \max\{HEI(Q) : Q \text{ is a grammatical level of } G_{red}\}$,
where $HEI(Q) = 1$ iff $S \in Q$ and $HEI(Q_i) = 1 + \max\{HEI(Q_j) : Q_j > Q_i\}$ otherwise.
In what follows, we use for complexity measures VAR, LEV, HEI, DEP and PROD the common denotation $K$.
The descriptional complexity measure $K$ of a language $L$ with respect to a class of grammars $\mathcal{G}$ is defined as follows:
$$K_{\mathcal{G}}(L) = \begin{cases} \min\{K(G) : G \in \mathcal{G},\ L(G) = L\} & \text{if } L = L(G) \text{ for some } G \in \mathcal{G}, \\ \text{undefined} & \text{otherwise.} \end{cases}$$
Note that, by definition, for an arbitrary class $\mathcal{G}$ of grammars, with $L \in \mathcal{L}(\mathcal{G})$,
$$HEI_{\mathcal{G}}(L) \leq LEV_{\mathcal{G}}(L) \leq VAR_{\mathcal{G}}(L) \leq PROD_{\mathcal{G}}(L)$$
holds.
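To make the five grammar-level measures concrete, the following sketch computes VAR, PROD, LEV, DEP and HEI for a single grammar. It assumes the grammar passed in is already reduced, and the function name `measures` and the `(lhs, rhs_tuple)` encoding are hypothetical choices made for this illustration. It does not compute the language-level minimum $K_{\mathcal{G}}(L)$, which ranges over all grammars for $L$.

```python
def measures(nonterminals, productions, start):
    """VAR, PROD, LEV, DEP and HEI of one grammar (assumed to be reduced)."""
    N = set(nonterminals)

    # Direct relation: B occurs on the right-hand side of some A-production.
    direct = {A: set() for A in N}
    for lhs, rhs in productions:
        direct[lhs].update(s for s in rhs if s in N)

    # Reflexive and transitive closure of the direct relation.
    reach = {A: {A} | direct[A] for A in N}
    changed = True
    while changed:
        changed = False
        for A in N:
            new = set().union(*(reach[B] for B in reach[A]))
            if not new <= reach[A]:
                reach[A] |= new
                changed = True

    # Grammatical levels: classes of mutually reachable nonterminals.
    levels = {frozenset(B for B in N if B in reach[A] and A in reach[B]) for A in N}
    level_of = {A: Q for Q in levels for A in Q}

    # above[Q] collects the levels Q' with Q' > Q.
    above = {Q: set() for Q in levels}
    for A in N:
        for B in direct[A]:
            if level_of[A] != level_of[B]:
                above[level_of[B]].add(level_of[A])

    memo = {}

    def hei(Q):
        # Height 1 for the level of the start symbol, otherwise one more than
        # the largest height among the levels directly above Q.
        if Q not in memo:
            memo[Q] = 1 if start in Q else 1 + max(hei(R) for R in above[Q])
        return memo[Q]

    return {
        "VAR": len(N),
        "PROD": len(productions),
        "LEV": len(levels),
        "DEP": max(len(Q) for Q in levels),
        "HEI": max(hei(Q) for Q in levels),
    }


# S -> aA, A -> aA | B, B -> bB | b: three levels {S} > {A} > {B}.
P = [("S", ("a", "A")), ("A", ("a", "A")), ("A", ("B",)),
     ("B", ("b", "B")), ("B", ("b",))]
m = measures({"S", "A", "B"}, P, "S")
```

For this grammar the chain of inequalities reads 3 ≤ 3 ≤ 3 ≤ 5; the same chain holds for every reduced grammar, which is why it carries over to the minima.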
In what follows, we review the notions of a grammar form and its strict and general interpretations. For further details, see [11].
Let $G_i = (N_i, T_i, P_i, S_i)$, where $i = 1, 2$, be context-free grammars. We say $G_2$ is obtained from grammar form $G_1$ by a general interpretation (shortly, a g-interpretation) $\mu$, denoted by $G_2 \lhd_g G_1(\mu)$, if $\mu$ is a finite substitution on $(N_1 \cup T_1)^{*}$ and conditions (i)-(iv) hold:
(i) $\mu(A) \subseteq N_2$ for all $A \in N_1$ and $\mu(A) \cap \mu(B) = \emptyset$ for $A, B \in N_1$, with $A \neq B$;
(ii) $\mu(a) \subseteq T_2^{*}$ for all $a \in T_1$;
(iii) $P_2 \subseteq \mu(P_1)$, where $\mu(P_1) = \{B \to \beta : A \to \alpha \in P_1,\ B \in \mu(A),\ \beta \in \mu(\alpha)\}$;
(iv) $S_2 \in \mu(S_1)$.
$G_2$ is said to be obtained from $G_1$ by a strict interpretation (shortly, an s-interpretation) $\mu$, denoted by $G_2 \lhd_s G_1(\mu)$, if condition (ii) is modified as follows: $\mu(a) \subseteq T_2$ for every $a \in T_1$ and $\mu(a) \cap \mu(b) = \emptyset$ for all $a, b \in T_1$, where $a \neq b$.
The collection of grammars obtained by x-interpretations from a grammar $G$, where $x \in \{g, s\}$, is denoted by $\mathcal{G}_x(G)$. The class of languages $\mathcal{L}_x(G) = \{L : L = L(G'),\ G' \in \mathcal{G}_x(G)\}$ is called the x-grammatical family of $G$.
The grammar $G$ itself is often referred to as a grammar form.
A grammar form $G$ is said to be infinite if $L(G)$ is infinite; otherwise, it is said to be finite.
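As an illustration of what a strict interpretation demands, the sketch below checks a candidate substitution against a form grammar. It assumes the customary conditions from grammar form theory: pairwise disjoint images for nonterminals and for terminals, every production of the interpretation grammar being an image of a form production, and the start symbol lying in the image of the form's start symbol. The names `image_of_production` and `is_strict_interpretation`, and the dictionary encoding of $\mu$, are hypothetical.

```python
from itertools import product


def image_of_production(mu, lhs, rhs):
    """All productions B -> v with B in mu(lhs) and v in mu(rhs), letter by letter."""
    return {(B, v) for B in mu[lhs] for v in product(*(mu[s] for s in rhs))}


def is_strict_interpretation(G2, G1, mu):
    """Check G2 against form G1 under mu (dict: symbol -> set of symbols).

    Each grammar is a tuple (N, T, P, S) with P a collection of
    (lhs, rhs_tuple) pairs. Sketch under the assumptions stated above.
    """
    N1, T1, P1, S1 = G1
    N2, T2, P2, S2 = G2

    # Images stay inside the right alphabet and are pairwise disjoint,
    # separately for nonterminals and for terminals.
    for source, target in ((N1, N2), (T1, T2)):
        seen = set()
        for X in source:
            if not mu[X] <= target or mu[X] & seen:
                return False
            seen |= mu[X]

    # Every production of G2 must be the image of a production of G1.
    allowed = set()
    for lhs, rhs in P1:
        allowed |= image_of_production(mu, lhs, rhs)
    return set(P2) <= allowed and S2 in mu[S1]


form = ({"S"}, {"a", "b"}, [("S", ("a", "S", "b")), ("S", ("a", "b"))], "S")
interp = ({"S"}, {"c", "d", "e", "f"}, [("S", ("c", "S", "d")), ("S", ("e", "f"))], "S")
good_mu = {"S": {"S"}, "a": {"c", "e"}, "b": {"d", "f"}}
bad_mu = {"S": {"S"}, "a": {"c", "e"}, "b": {"c", "d", "f"}}  # images of a and b overlap
```

With `good_mu` the check succeeds; `bad_mu` fails only because the images of the two terminals share the letter `c`, which a strict interpretation forbids.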
3. On descriptional complexity of context-free languages
In this section we determine the complexity measures of some special context-free languages. The results obtained here will be used in Section 5 to prove the main results of the paper.
Definition 3.1. Let $G = (N, T, P, S)$ be a context-free grammar with a derivation tree $t$ of $w = \alpha v \beta$ in $G$, with $\alpha, \beta \in T^{*}$, $v \in T^{+}$. We say $t_v$ is a minimal subtree of $t$ completely deriving $v$ if $t_v$ is a derivation tree of $xvy$, where $\alpha = x_0 x$, $\beta = y y_0$ and $t_v$ has no subtree $t'_v$ such that $t'_v$ is a derivation tree of $x'vy'$, where $x = x_1 x'$, $y = y' y_1$, and $x_1 y_1 \in T^{*}$.
We shall use the pumping property of context-free grammars in the form specified by the following lemma.
Lemma 3.2. Let $G = (N, T, P, S)$ be a context-free grammar. Let $w = xvy$ be in $L(G)$, where $|v| > d^{m}$ for $d = \max\{|\alpha| : A \to \alpha \in P\}$ and $m = card(N)$. Let $t$ be a derivation tree of $w$ with no subderivation $A \Rightarrow^{+} A$ for any $A$ in $N$ and let $t_v$ be a minimal subtree of derivation tree $t$ completely deriving $v$. Then there is an $A_v \in N$ which occurs twice on the same branch of $t_v$. Moreover, the subderivation $A_v \Rightarrow^{+} v_1 A_v v_2$ is determined in $t_v$ by two consecutive occurrences of $A_v$ on this branch, where $v_1$ and $v_2$ are subwords of $w$ and $v_1 v_2 \neq \varepsilon$.
Proof. Suppose by contradiction that no nonterminal occurs twice on the same branch of $t_v$. Then the length of any branch of $t_v$ is at most $m$, and $|v| \leq d^{m}$. This contradicts the assumption of the lemma.
Let $A_v$ occur twice on the same branch of $t_v$ and let $A_v \Rightarrow^{+} v_1 A_v v_2$, with $v_1, v_2 \in T^{*}$, be a subderivation determined in $t_v$ by two consecutive occurrences of $A_v$ in this branch. Since $A_v \Rightarrow^{+} A_v$ is not a subderivation in $t$, we have immediately $v_1 v_2 \neq \varepsilon$. □
The following theorem is about the $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$ and $PROD_{\mathcal{G}}$ complexity of context-free languages that are finite unions of languages over pairwise disjoint alphabets.
Theorem 3.3. Let $L = \bigcup_{i=1}^{k} L_i$, where $L_i$, $1 \leq i \leq k$, are infinite context-free languages over pairwise disjoint alphabets. Let $\mathcal{G}$ be a class of grammars such that $L, L_i \in \mathcal{L}(\mathcal{G})$ hold. Then $K_{\mathcal{G}}(L) \geq k$ for $K \in \{LEV, VAR, PROD\}$.
Proof. Let $G = (N, T, P, S)$ be in $\mathcal{G}$ and $L(G) = L$. Let, for a given $i$, $1 \leq i \leq k$, $w_i = x_i v_i y_i \in L_i$ have a derivation tree $t_i$ and $|v_i| > d^{m}$, where $t_i$, $d$ and $m$ are as in Lemma 3.2. Let $t'_i$ be a minimal subtree of $t_i$ completely deriving $v_i$. Then, by Lemma 3.2, there is a nonterminal $A_i$ and a derivation $A_i \Rightarrow^{+} u_i A_i v_i$ determined by $A_i$ in $t'_i$, with $u_i v_i \in (alph(L_i))^{+}$. Since $alph(L_i) \cap alph(L_j) = \emptyset$ for $i \neq j$, $1 \leq i, j \leq k$, neither $A_i \rhd^{+} A_j$ nor $A_j \rhd^{+} A_i$ holds for $i \neq j$, $1 \leq i, j \leq k$. This implies the statement for $K = LEV$ and, thus, also for $K = VAR$ or $K = PROD$. □
Notation 3.4. Let $u_i, v_i$, $1 \leq i \leq k$, be nonempty words and $alph(u_i v_i) \cap alph(u_j v_j) = \emptyset$ for $i \neq j$.
Let $L_{k,0} = u_1^{+} \cdots u_k^{+}$. For $x \in L_{k,0}$, let $mi(x)$ be defined as follows: for $x = u_1^{m_1} \cdots u_k^{m_k} \in L_{k,0}$, let $mi(x) = v_k^{m_k} \cdots v_1^{m_1}$, where $m_1, \ldots, m_k \geq 1$. For $x = x_1 x_2$, where $x_1, x_2 \in L_{k,0}^{+}$, let $mi(x_1 x_2) = mi(x_2)\, mi(x_1)$.
Let $u, v, w$ be arbitrary words with disjoint alphabets, also disjoint with those of $u_i, v_i$, $1 \leq i \leq k$. (In the case where $u$, $v$ or $w$ are nonempty.)
Let
$M_k^{(1)} = \{uxw : x \in L_{k,0}\}$,
$M_k^{(+)} = \{uxw : x \in L_{k,0}^{+}\}$,
$L_k^{(1)} = \{uxw\, mi(x)\, v : x \in L_{k,0}\}$ and
$L_k^{(+)} = \{uxw\, mi(x)\, v : x \in L_{k,0}^{+}\}$.
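A small sketch may help in reading this notation. It builds individual words of the two context-free families from chosen exponents, reading the first family as $\{uxw\,mi(x)\,v : x \in L_{k,0}\}$ and the second as the same set with $x \in L_{k,0}^{+}$. The helper names (`block_word`, `mi`, `word_L1`, `word_Lplus`) and the sample words are assumptions of this illustration.

```python
def block_word(us, exps):
    """u_1^{m_1} ... u_k^{m_k} for the exponent tuple exps = (m_1, ..., m_k)."""
    return "".join(u * m for u, m in zip(us, exps))


def mi(vs, exps):
    """Mirror-style image v_k^{m_k} ... v_1^{m_1} of the block with exponents exps."""
    return "".join(v * m for v, m in reversed(list(zip(vs, exps))))


def word_L1(u, w, v, us, vs, exps):
    """One word u x w mi(x) v with x consisting of a single block."""
    return u + block_word(us, exps) + w + mi(vs, exps) + v


def word_Lplus(u, w, v, us, vs, exps_list):
    """One word u x w mi(x) v with x a concatenation of blocks.

    The images of the blocks appear in reverse order,
    following mi(x1 x2) = mi(x2) mi(x1).
    """
    x = "".join(block_word(us, e) for e in exps_list)
    image = "".join(mi(vs, e) for e in reversed(exps_list))
    return u + x + w + image + v


# k = 2 with u_1 = a, u_2 = b, v_1 = c, v_2 = d, empty outer words and w = "#".
single = word_L1("", "#", "", ["a", "b"], ["c", "d"], (2, 1))
double = word_Lplus("", "#", "", ["a", "b"], ["c", "d"], [(1, 1), (2, 1)])
```

Dropping the part after `w` gives the corresponding words of the two regular families.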
The structure of any context-free grammar generating any of the above languages is determined in the following sense: all words with sufficiently many (say $s$) repetitions of the subword $u_i$ ($v_i$) are generated by pumping a subword of $u_i^{s}$ ($v_i^{s}$), with the length equal to some multiple of the length of $u_i$ ($v_i$).
Lemma 3.5. Let $L_k$ be any of the languages $L_k^{(1)}$ and $L_k^{(+)}$. Let $L_k = L(G_k)$ for a context-free grammar $G_k$. Then, for every $i$, $1 \leq i \leq k$, there exists a nonterminal $A_i$ in $G_k$ and a number $n_i \geq 1$ such that $A_i \Rightarrow^{+} \bar{u}_i^{n_i} A_i \bar{v}_i^{n_i}$ holds, where $\bar{u}_i = yx$ for some $x, y$, with $xy = u_i$, and $\bar{v}_i = tz$ for some $z, t$, with $zt = v_i$. Moreover, for $L_k = L_k^{(1)}$ and for $L_k = L_k^{(+)}$ with $k \geq 3$, $A_i \neq A_j$ for $i \neq j$.
Proof. Let $d$ and $m$ be as in Lemma 3.2. Consider $w_s = u u_1^{s} \cdots u_k^{s} w v_k^{s} \cdots v_1^{s} v$, where $s > d^{m}$. Obviously, $w_s \in L_k^{(1)} \subseteq L_k^{(+)}$. Let $t$ be a derivation tree of $w_s$ in $G_k$ fulfilling the conditions of Lemma 3.2. According to Lemma 3.2, for every $i$, $1 \leq i \leq k$, and for every minimal subtree $t_i$ of $t$ completely deriving $u_i^{s}$, it holds that there is a nonterminal $A_i$ and a subderivation $A_i \Rightarrow^{+} u'_i A_i z_i$, determined in $t_i$ by $A_i$, where $u'_i$ is a nonempty subword of $u_i^{s}$ and $z_i$ is a terminal word. (The case where $z_i$ is a nonempty subword of $u_i^{s}$ leads to a contradiction with the structure of $L_k$.) Let $u'_i = y u_i^{r_i} x$ for some $x, y$, where $0 \leq |x| < |u_i|$ and $0 \leq |y| < |u_i|$. We prove that $xy = u_i$, which leads to $A_i \Rightarrow^{+} \bar{u}_i^{n_i} A_i z_i$, where $\bar{u}_i = yx$.
Consider the derivation
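$$S \Rightarrow^{*} u' A_i v' \Rightarrow^{+} u' (u'_i)^{j} A_i z_i^{j} v' \Rightarrow^{+} u' (u'_i)^{j} w_i z_i^{j} v'$$
in $G_k$, where $u'$, $v'$, $w_i$ are terminal words and $w_i$ is a terminal word derived from $A_i$ in $G_k$ in a minimal number of steps. By the structure of words in $L_k$, we have, for $j = 1$ and $j = 2$,
$$u' y u_i^{r_i} x w_i z_i v' = \tilde{x} u_i^{l_1} \tilde{y}_1 \quad \text{and} \quad u' (y u_i^{r_i} x)^{2} w_i z_i^{2} v' = \tilde{x} u_i^{l_2} \tilde{y}_2$$
for maximal numbers $l_2 > l_1 \geq 1$. This gives $u_i^{r_i} x y u_i^{r_i} = u_i^{2 r_i + 1}$, i.e. $xy = u_i$ and $l_2 - l_1 = r_i + 1 = n_i$. Then $suf_l(w_i) z_i v' = v_i^{l_1} \tilde{y}$ and $suf_l(w_i) z_i^{2} v' = v_i^{l_2} \tilde{y}$ for some $l \leq |w_i|$. This implies $z_i = t v_i^{r_i} z$ for some $t, z$, where $0 \leq |t| \leq |v_i|$, $0 \leq |z| < |v_i|$ and $zt = v_i$.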
Let $A_i = A_j$ for some $i, j$, $1 \leq i, j \leq k$. Then $\bar{u}_i^{n_i} \bar{u}_j^{n_j} \bar{u}_i^{n_i}$, $n_i \geq 1$, $n_j \geq 1$, is a subword of some word in $L_k$. This implies $i = j$ for $L_k = L_k^{(1)}$, where $k \geq 2$, and for $L_k = L_k^{(+)}$, where $k \geq 3$. □
In the case of regular languages $M_k^{(1)}$ and $M_k^{(+)}$ an analogous statement holds for the non-self-embedding linear class of grammars. Using similar methods and arguments as in the proof of Lemma 3.5, we can prove the following lemma.
Lemma 3.6. Let $L_k$ be any of the languages $M_k^{(1)}$ and $M_k^{(+)}$. Let $G_k$ be a non-self-embedding linear grammar generating $L_k$. Then, for every $i$, $1 \leq i \leq k$, there exists a nonterminal $A_i$ in $G_k$ and $n_i \geq 1$ such that $A_i \Rightarrow^{+} \bar{u}_i^{n_i} A_i$ or $A_i \Rightarrow^{+} A_i \bar{u}_i^{n_i}$ holds, where $\bar{u}_i = yx$ for some $x, y$, with $xy = u_i$. Moreover, $A_i \neq A_j$ for $i \neq j$, $1 \leq i, j \leq k$.
Theorem 3.7. (i) Let $\mathcal{G}$ be a class of context-free grammars such that, for every $k \geq 1$, $L_k^{(1)} \in \mathcal{L}(\mathcal{G})$. Then $HEI_{\mathcal{G}}(L_k^{(1)}) \geq k$.
(ii) Let $\mathcal{G}$ be a class of non-self-embedding linear grammars such that, for every $k \geq 1$, $M_k^{(1)} \in \mathcal{L}(\mathcal{G})$. Then $HEI_{\mathcal{G}}(M_k^{(1)}) \geq k$.
Proof. Consider an arbitrary grammar $G_k$ in $\mathcal{G}$ for which $L(G_k) = L_k^{(1)}$ ($L(G_k) = M_k^{(1)}$) holds. Let $A_1, \ldots, A_k$ be nonterminals of $G_k$ determined in Lemma 3.5 (in Lemma 3.6), which are used in the derivation of $w_s = u u_1^{s} \cdots u_k^{s} w v_k^{s} \cdots v_1^{s} v$ ($w_s = u u_1^{s} \cdots u_k^{s} w$), where $s > d^{m}$, $d$ and $m$ being the numbers given in Lemma 3.2. Since no $v_i$ can precede $u_j$ for $1 \leq i, j \leq k$ [since $G_k$ is a linear grammar for (ii)], no sentential form $x A_i y A_j z$ can be derived in $G_k$, where $x, y, z \in (N \cup T)^{*}$. This implies that either $A_i \rhd^{*} A_{i+1}$ or $A_{i+1} \rhd^{*} A_i$ for $1 \leq i \leq k-1$. Since $u_{i+1}$ never precedes $u_i$, in the case of $L_k^{(1)}$, $A_{i+1} \rhd^{+} A_i$ does not hold and then $A_1 \rhd^{+} A_2 \rhd^{+} \cdots \rhd^{+} A_k$. In the case of $M_k^{(1)}$, each $A_i$ is either right-linear or left-linear but not both; so, a permutation $(p_1, \ldots, p_k)$ of $(1, \ldots, k)$ can be determined such that $A_{p_1} \rhd^{+} A_{p_2} \rhd^{+} \cdots \rhd^{+} A_{p_k}$.
Hence, $HEI(G_k) \geq k$. □
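Theorem 3.8. (i) Let $\mathcal{G}$ be a class of context-free grammars such that, for every $k \geq 3$, $L_k^{(+)} \in \mathcal{L}(\mathcal{G})$. Then $DEP_{\mathcal{G}}(L_k^{(+)}) \geq k$.
(ii) Let $\mathcal{G}$ be a class of non-self-embedding linear grammars such that, for every $k \geq 1$, $M_k^{(+)} \in \mathcal{L}(\mathcal{G})$. Then $DEP_{\mathcal{G}}(M_k^{(+)}) \geq k$.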
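Proof. Let $G_k$ be an arbitrary element of $\mathcal{G}$ for which $L(G_k) = L_k^{(+)}$ ($L(G_k) = M_k^{(+)}$) holds. Let $A_1^{(t)}, \ldots, A_k^{(t)}$, for $1 \leq t \leq m+1$, be nonterminals of $G_k$, determined in Lemma 3.5 (in Lemma 3.6), which are used in the derivation of the word $w_s = u W^{m+1} w\, mi(W^{m+1})\, v$ ($w_s = u W^{m+1} w$), where $W = u_1^{s} \cdots u_k^{s}$ for some $s > d^{m}$, where $d$ and $m$ are defined in Lemma 3.2 (i.e. $A_i^{(t)}$ is a nonterminal producing $\bar{u}_i$'s in the $t$th position of $u_i^{s}$ in $G_k$). Note that $A_i^{(s_1)} \neq A_j^{(s_2)}$ for $i \neq j$ and for arbitrary $s_1, s_2$, $1 \leq s_1, s_2 \leq m+1$. As no $v_j$ can precede $u_i$ for $0 < i, j \leq k$ in $L_k^{(+)}$, $A_i^{(s)} \rhd^{+} A_{i+1}^{(s)}$ for $1 \leq i \leq k-1$ and $A_k^{(s)} \rhd^{+} A_1^{(s+1)}$ for $1 \leq s \leq m$.
Since $t = 1, 2, \ldots, m+1$, there exist two different positions $s_1$ and $s_2$ such that $A_i^{(s_1)} = A_i^{(s_2)}$ for a fixed $i$. For $i = 1$ let us choose $s_1, s_2$, where $s_2 > s_1$, and $s_2 - s_1$ is minimal. Then
$$A_1^{(s_1)} \rhd^{+} A_2^{(s_1)} \rhd^{+} \cdots \rhd^{+} A_k^{(s_1)} \rhd^{+} A_1^{(s_1+1)} \rhd^{*} A_1^{(s_2)} = A_1^{(s_1)}.$$
(In the case of $M_k^{(+)}$, permutations $(p_{t1}, \ldots, p_{tk})$ of $(1, \ldots, k)$ for $1 \leq t \leq m+1$ exist such that, for $B_i^{(t)}$ equal to $A_{p_{ti}}^{(t)}$, $B_1^{(s_1)} \rhd^{+} B_2^{(s_1)} \rhd^{+} \cdots \rhd^{+} B_k^{(s_1)} \rhd^{+} B_1^{(s_1+1)} \rhd^{*} B_1^{(s_2)} = B_1^{(s_1)}$ holds.) Thus, $DEP(G_k) \geq k$. □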
4. Basic interpretations - auxiliary results
In this section we specify some strict interpretations used in the sequel. The basic idea behind them is a suitable renaming of nonterminals and erasing rules that are not of interest.
By isolation we mean an interpretation which, roughly speaking, isolates a given derivation of a sentential form.
Lemma 4.1. Let $G = (N, T, P, S)$ be a context-free grammar and let, for $A$ in $N$, $D : A = u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n$, $n \geq 1$, be a derivation in $G$, where $u_j \in (N \cup T)^{*}$, $1 \leq j \leq n$. Then there is a strict interpretation $\mu_D$, called an isolation of $D$, and a grammar $G_D = (N_D, T, P_D, A)$ such that $G_D \lhd_s G(\mu_D)$ and
(i) $SF(G_D) \cap (N \cup T)^{*} = SF(G_{D,n})$, where $G_{D,n} = (N, T, \{A \to u_n\}, A)$,
(ii) $A = v_0 \Rightarrow v_1 \Rightarrow \cdots \Rightarrow v_{n-1} \Rightarrow v_n = u_n$, where $v_i \in \mu_D(u_i)$ for $i = 1, 2, \ldots, n-1$, is the only derivation in $G_D$ of length $n$ starting with $A$, and
(iii) $P_D$ consists exactly of productions used in $v_i \Rightarrow v_{i+1}$, for $i = 0, 1, \ldots, n-1$.
Proof. Let us define $\mu_D$ as follows:
$\mu_D(a) = \{a\}$ for $a \in T$,
$\mu_D(B) = \{B^{[i,j]} : 1 \leq i \leq n-1 \text{ such that } B \text{ occurs as the } j\text{th letter in } u_i\} \cup \mu_0(B)$ for $B \in N$,
where $\mu_0(B) = \{B\}$ for $B = A$ or for $B$ being a letter of $u_n$, and $\mu_0(B) = \emptyset$ otherwise.
Let, for $j$, $1 \leq j \leq n-1$, $u_j = X_{j,1} \cdots X_{j,l_j}$, where $X_{j,k} \in (N \cup T)$, $1 \leq k \leq l_j$. We associate a word $v_j$ with $u_j$, where $v_j = X'_{j,1} \cdots X'_{j,l_j}$, where
$X'_{j,k} = X_{j,k}$ if $X_{j,k} \in T$,
$X'_{j,k} = X_{j,k}^{[j,k]}$ if $X_{j,k} \in N$.
Let us consider the derivation $D' : A \Rightarrow v_1 \Rightarrow v_2 \Rightarrow \cdots \Rightarrow v_{n-1} \Rightarrow u_n$ and let $P_D$ be the set of productions used in this derivation. Then, obviously, $P_D \subseteq \mu_D(P)$ and, for the grammar $G_D$ given implicitly by $P_D$, we have $G_D \lhd_s G(\mu_D)$ and $SF(G_D) \cap (N \cup T)^{*} = SF(G_{D,n})$. □
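The indexing idea behind an isolation can be sketched for the simplest situation, a derivation in which every sentential form contains exactly one nonterminal. This restriction, the function name `isolate_linear`, the convention that nonterminals start with an uppercase letter, and the textual form `B[i,j]` of an indexed nonterminal are all assumptions of the sketch; it is not the general construction of the lemma.

```python
def isolate_linear(start, steps):
    """Productions of an isolated copy of a derivation start => u_1 => ... => u_n.

    steps: the right-hand sides (tuples of symbols) applied, in order, to the
    single nonterminal of the current sentential form. Nonterminals introduced
    before the last step are renamed to X[i,j], where i is the step number and
    j the (1-based) position of X in u_i; the last form u_n is left unchanged.
    """
    def is_nt(s):
        return s[0].isupper()

    n = len(steps)
    u = [start]      # current sentential form of the original derivation
    lhs = start      # (possibly indexed) nonterminal rewritten next
    productions = []
    for i, rhs in enumerate(steps, start=1):
        positions = [k for k, X in enumerate(u) if is_nt(X)]
        if len(positions) != 1:
            raise ValueError("sketch handles one nonterminal per sentential form only")
        pos = positions[0]
        u = u[:pos] + list(rhs) + u[pos + 1:]
        if i < n:
            new_rhs = tuple(
                f"{X}[{i},{pos + k + 1}]" if is_nt(X) else X
                for k, X in enumerate(rhs)
            )
        else:
            new_rhs = tuple(rhs)
        productions.append((lhs, new_rhs))
        indexed = [X for X in new_rhs if "[" in X]
        lhs = indexed[0] if indexed else None
    return productions


# Isolating A => aA => aAa (productions A -> aA, then A -> Aa).
isolated = isolate_linear("A", [("a", "A"), ("A", "a")])
```

The result contains exactly one production per derivation step, so the indexed grammar can only replay the chosen derivation before returning to unindexed symbols.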
Remark. $SF(G_{D,n})$ is infinite if $A$ is a letter of $u_n$, where $u_n \neq A$; otherwise, $SF(G_{D,n}) = \{A, u_n\}$.
Linear isolations (constructed in the next lemma) cause fixed derivations to be isolated, and terminals derived left and right from a fixed branch of the derivation tree to be distinguished.
Lemma 4.2. Let $G = (N, T, P, S)$ be a context-free grammar and let, for $A$ in $N$, $D : A = u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n = uAv$, $n \geq 1$, be a derivation in $G$, where $u, v \in T^{*}$. Let $T'$ be a primed version of $T$. Then there is a strict interpretation $\mu'_D$, called a linear isolation of $D$, and a grammar $G'_D = (N'_D, T \cup T', P'_D, A)$ such that $G'_D \lhd_s G(\mu'_D)$ and
(i) $SF(G'_D) \cap (N \cup T \cup T')^{*} = SF(G'_{D,n})$, where $G'_{D,n} = (N, T \cup T', \{A \to uAv'\}, A)$,
(ii) $A = v'_0 \Rightarrow v'_1 \Rightarrow \cdots \Rightarrow v'_{n-1} \Rightarrow v'_n = uAv'$, where $v'_i \in \mu'_D(u_i)$ for $i = 1, 2, \ldots, n-1$, is the only derivation in $G'_D$ of length $n$ starting with $A$, and
(iii) $P'_D$ consists exactly of productions used in $v'_i \Rightarrow v'_{i+1}$ for $i = 0, 1, \ldots, n-1$.
Proof. Let $u_i = \alpha_i X_i \beta_i$, $1 \leq i \leq n-1$, where $\alpha_i, \beta_i \in (N \cup T)^{*}$ and $X_i$ are nonterminals lying on the branch of the derivation tree of $D$ beginning and ending with $A$. We define $\mu'_D$ as follows:
$\mu'_D(a) = \{a, a'\}$ for $a \in T$ and $\mu'_D(B) = \mu_D(B)$ for $B \in N$,
where $\mu_D$ is the isolation defined in Lemma 4.1. Let, for $j$, $1 \leq j < n$, $u_j = X_{j,1} \cdots X_{j,n_j} \cdots X_{j,l_j}$, where $X_{j,k} \in (N \cup T)$, $1 \leq k \leq l_j$, $n_j \leq l_j$ and $X_{j,n_j} = X_j$. We associate with $u_j$ a word $v'_j$, where $v'_j = X'_{j,1} \cdots X'_{j,l_j}$, where
$X'_{j,k} = X_{j,k}$ for $X_{j,k} \in T$, $k < n_j$,
$X'_{j,k} = X_{j,k}^{[j,k]}$ for $X_{j,k} \in N$,
$X'_{j,k} = (X_{j,k})'$ for $X_{j,k} \in T$, $k > n_j$.
Let us consider the derivation $D' : A \Rightarrow v'_1 \Rightarrow v'_2 \Rightarrow \cdots \Rightarrow v'_{n-1} \Rightarrow uAv'$ and let $P'_D$ be the set of productions used in this derivation. Then, obviously, $P'_D \subseteq \mu'_D(P)$ and, for the grammar $G'_D$ given implicitly by $P'_D$, we have $G'_D \lhd_s G(\mu'_D)$ as well as $SF(G'_D) \cap (N \cup T \cup T')^{*} = SF(G'_{D,n})$. □
Next we fix the notions of $j$th copy and renaming a single symbol by special isomorphic interpretations and define the corresponding grammars isomorphic to the core grammar.
Definition 4.3. Let $G = (N, T, P, S)$ be a grammar and $j$ be a natural number. By $\mu_{c_j}$ we denote an interpretation, called a $j$th copy, defined by $\mu_{c_j}(X) = \{X^{(j)}\}$, for $X \in (N \cup T)$.
The $j$th copy $G^{(j)}$ of $G$ is the grammar $G^{(j)} \lhd_s G(\mu_{c_j})$, with $P^{(j)} = \mu_{c_j}(P)$.
Interpretations can change some fixed occurrences of some symbol in the set of productions.
Definition 4.4. Let $G = (N, T, P, S)$ be a grammar and let $X \in N$, $Y \notin (N \cup T)$. By $\mu_{X \to Y}$, called renaming $X$ (by $Y$), we denote an interpretation with $\mu_{X \to Y}(X) = \{X, Y\}$ and $\mu_{X \to Y}(Z) = \{Z\}$ for $Z \neq X$, $Z \in N \cup T$.
By $G_{X \to Y}$ we denote the grammar $G_{X \to Y} = (N \cup \{Y\}, T, P_{X \to Y}, S)$, where
$$P_{X \to Y} = \{S \to \alpha_Y : S \to \alpha \in P\} \cup \{Y \to \alpha_Y : X \to \alpha \in P\} \cup \{A \to \alpha_Y : A \to \alpha \in P,\ A \neq X\},$$
where $\alpha_Y$ denotes the word obtained from $\alpha$ by replacing all occurrences of $X$ by $Y$.
Informally, $G_{X \to Y}$ is such an interpretation of $G$ in which $X \in N$ is replaced by $Y \notin (N \cup T)$ in any position of $X$ except where $X$ is the start symbol of $G$ and all other letters remain unchanged.
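The two isomorphic interpretations are easy to mirror in code. The sketch below follows the three-part union defining the production set of the renamed grammar; the function names `jth_copy` and `rename`, and the textual rendering `X(j)` of a copied symbol, are assumptions of this illustration.

```python
def jth_copy(productions, j):
    """Image of a production set under the jth copy: every symbol X becomes X(j)."""
    return [(f"{lhs}({j})", tuple(f"{s}({j})" for s in rhs)) for lhs, rhs in productions]


def rename(productions, start, X, Y):
    """Production set obtained by renaming X by Y, as a union of three parts:
    start productions, Y-versions of X productions, and all non-X productions,
    each with X replaced by Y on the right-hand side."""
    def sub(rhs):
        return tuple(Y if s == X else s for s in rhs)

    out = []
    for lhs, rhs in productions:
        if lhs == start:
            out.append((start, sub(rhs)))
        if lhs == X:
            out.append((Y, sub(rhs)))
        if lhs != X:
            out.append((lhs, sub(rhs)))
    # The parts may overlap; drop duplicates while keeping the order.
    return list(dict.fromkeys(out))


P = [("S", ("a", "S", "b")), ("S", ("c",))]
renamed = rename(P, "S", "S", "Y")      # X is the start symbol here
copied = jth_copy([("S", ("a", "S"))], 2)
```

When the renamed symbol is the start symbol, as in this usage example, the start symbol survives only on left-hand sides, which matches the informal description in Definition 4.4.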
5. Complexity of grammar forms
In this section we show that grammatical complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ are unbounded on strict and on general grammatical families of self-embedding (infinite non-self-embedding linear) grammar forms.
Theorem 5.1. Let $G$ be a self-embedding context-free grammar form. Let $\mathcal{G}$ be a class of context-free grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Let $K \in \{VAR, PROD, LEV, HEI, DEP\}$. Then $K_{\mathcal{G}}$ is unbounded on $\mathcal{L}_x(G)$.
Proof. Let $x = g$. If $G$ is a self-embedding grammar, then $\mathcal{L}_g(G)$ contains all linear languages (see [11, p. 43]). By [5] and Theorem 3.7, for each $K$ and for every natural number $k$, there is a linear language $L_k$ such that $K_{\mathcal{G}}(L_k) \geq k$. This results in $K_{\mathcal{G}}$ being unbounded on $\mathcal{L}_g(G)$.
Let $x = s$. First we prove the result for $K = HEI$. This gives the proof for VAR, PROD, LEV, too. Since $G$ is self-embedding, there is a nonterminal $A$ in $G$ with derivations
$D_S : S \Rightarrow^{+} xAy$,
$D_I : A \Rightarrow^{+} uAv$,
$D_F : A \Rightarrow^{+} w$,
with $x, y, w \in T^{*}$ and $u, v \in T^{+}$. Denote by $P_S$, $P_I$, $P_F$ the sets of productions of $P$ used in derivations $D_S$, $D_I$, $D_F$, respectively. According to Theorem 3.7, it is sufficient to give, for any $k \geq 1$, a grammar $G_k$ such that $G_k \lhd_s G$ and
$$L(G_k) = L_k^{(1)} = \{x u_1^{m_1} \cdots u_k^{m_k} w_{k+1} v_k^{m_k} \cdots v_1^{m_1} y : m_i \geq 1,\ 1 \leq i \leq k\}.$$
Let $P'_S$, $P'_I$, $P'_F$ be the sets of productions obtained from $P_S$, $P_I$, $P_F$ by isolations $\mu_{D_S}$, $\mu'_{D_I}$, $\mu_{D_F}$, defined in Lemmas 4.1 and 4.2, respectively. We use abbreviation $\mu_{r_0}$ for $\mu_{A \to A_1}$ and $\mu_{r_j}$ for $\mu_{A_j \to A_{j+1}}$, where $1 \leq j \leq k$.
Let
$$P_k = \mu_{r_0}(P'_S) \cup \bigcup_{i=1}^{k} \mu_{c_i}(P'_I) \cup \bigcup_{j=1}^{k} \mu_{r_j}(\mu_{c_j}(P'_I)) \cup \mu_{c_{k+1}}(P'_F)$$
and let $G_k$ be the grammar given implicitly by the productions of $P_k$. We shall prove that $L(G_k) = L_k^{(1)}$.
Let $w = x u_1^{m_1} \cdots u_k^{m_k} w_{k+1} v_k^{m_k} \cdots v_1^{m_1} y$. Then $w$ can be derived in $G_k$ by using the following partial derivations:
$S \Rightarrow^{+} x A_1 y$, which uses the productions of $\mu_{r_0}(P'_S)$,
$A_i \Rightarrow^{+} u_i A_i v_i$, which uses the productions of $\mu_{c_i}(P'_I)$ for $1 \leq i \leq k$,
$A_j \Rightarrow^{+} u_j A_{j+1} v_j$, which uses the productions of $\mu_{r_j}(\mu_{c_j}(P'_I))$ for $1 \leq j \leq k$,
$A_{k+1} \Rightarrow^{+} w_{k+1}$, which uses productions of $\mu_{c_{k+1}}(P'_F)$.
Thus, $L_k^{(1)} \subseteq L(G_k)$.
We show that the opposite inclusion holds. Let $D : S \Rightarrow w_1 \Rightarrow w_2 \Rightarrow \cdots \Rightarrow w_r = w \in T^{*}$ be a derivation in $G_k$. Following $P_k$ and Lemmas 4.1 and 4.2, any sentential form of $G_k$ contains at most one recursive letter. The recursive nonterminal $A_i$, $2 \leq i \leq k$, does not appear before $A_{i-1}$ is rewritten. Moreover, every terminating derivation contains each $A_i$, $1 \leq i \leq k$, at least once. Without loss of generality, we may assume that all nonrecursive nonterminals in $D$ are rewritten before a recursive symbol is rewritten. Then indices $2 \leq i_1 \leq i_2 \leq \cdots \leq i_{k+1} \leq r-1$ can be found such that $A_j$ occurs in the sentential form $w_{i_j}$ and it does not occur in any $w_l$ for $l < i_j$. This gives
$$w_{i_2} = x u_1^{m_1} A_2 v_1^{m_1} y$$
for some $m_1 \geq 1$, and finally
$$w_{i_{k+1}} = x u_1^{m_1} \cdots u_k^{m_k} A_{k+1} v_k^{m_k} \cdots v_1^{m_1} y$$
for some $m_1, \ldots, m_k \geq 1$ and for the nonrecursive nonterminal $A_{k+1}$. Thus, $w_r = x u_1^{m_1} \cdots u_k^{m_k} w_{k+1} v_k^{m_k} \cdots v_1^{m_1} y$ and $L(G_k) \subseteq L_k^{(1)}$.
Let $K = DEP$. We show that there is a grammar $\bar{G}_k$, where $\bar{G}_k \lhd_s G$, such that $L(\bar{G}_k) = L_k^{(+)}$. Let $P_k$ have the same meaning as above. Let $\bar{P}_k = P_k \cup \mu_{\bar{r}_k}(\mu_{c_k}(P'_I))$, where $\mu_{\bar{r}_k}$ abbreviates $\mu_{A_k \to A_1}$. Let $\bar{G}_k$ be the grammar given implicitly by the elements of $\bar{P}_k$. It can be shown that $L(\bar{G}_k) = L_k^{(+)} = \{x z w_{k+1}\, mi(z)\, y : z \in L_{k,0}^{+}\}$. According to Theorem 3.8, $DEP_{\mathcal{G}}(L(\bar{G}_k)) \geq k$. □
We illustrate the constructions of $G_k$ and $\bar{G}_k$ from the previous proof.
Example. Let $G$ contain the productions
$S \to aA$, $A \to aA \mid Aa \mid a$.
We give $G_k$ and $\bar{G}_k$ corresponding to the derivations
$D_S : S \Rightarrow aA$, $D_I : A \Rightarrow aA \Rightarrow aAa$, $D_F : A \Rightarrow a$.
$G_k$ is given by the productions
$S \to a A_1$,
$A_i \to a_i A_i^{[1,2]}$ for $i = 1, 2, \ldots, k$,
$A_i^{[1,2]} \to A_i a'_i$ for $i = 1, 2, \ldots, k$,
$A_i^{[1,2]} \to A_{i+1} a'_i$ for $i = 1, 2, \ldots, k$,
$A_{k+1} \to a_{k+1}$.
$\bar{G}_k$ contains the same productions as $G_k$ and, moreover, the production $A_k^{[1,2]} \to A_1 a'_k$.
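The example can be checked mechanically on small cases. The sketch below reads the productions of the first grammar of the example as: start production S → a A1; for each i from 1 to k the productions Ai → ai Ai[1,2], Ai[1,2] → Ai ai' and Ai[1,2] → A(i+1) ai'; and finally A(k+1) → a(k+1). This reading, the token spellings and the helper names `build_Gk`, `words_up_to` and `expected` are assumptions of the illustration. It enumerates all generated words up to a length bound and compares them with the expected shape.

```python
from collections import deque


def build_Gk(k):
    """Productions of the example grammar as (lhs, rhs_tuple) pairs (assumed reading)."""
    P = [("S", ("a", "A1"))]
    for i in range(1, k + 1):
        P.append((f"A{i}", (f"a{i}", f"A{i}[1,2]")))
        P.append((f"A{i}[1,2]", (f"A{i}", f"a{i}'")))
        P.append((f"A{i}[1,2]", (f"A{i + 1}", f"a{i}'")))
    P.append((f"A{k + 1}", (f"a{k + 1}",)))
    return P


def words_up_to(P, start, max_len):
    """All terminal words with at most max_len symbols derivable from start.

    Pruning by length is sound here because no production shortens a form.
    """
    nts = {lhs for lhs, _ in P}
    found, seen, queue = set(), {(start,)}, deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in nts), None)
        if idx is None:
            found.add(form)
            continue
        for lhs, rhs in P:
            if lhs == form[idx]:
                new = form[:idx] + rhs + form[idx + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return found


def expected(k, exps):
    """The word a a_1^{m_1}...a_k^{m_k} a_{k+1} a'_k^{m_k}...a'_1^{m_1} as a tuple."""
    left = [f"a{i}" for i in range(1, k + 1) for _ in range(exps[i - 1])]
    right = [f"a{i}'" for i in range(k, 0, -1) for _ in range(exps[i - 1])]
    return tuple(["a"] + left + [f"a{k + 1}"] + right)


found = words_up_to(build_Gk(2), "S", 8)
```

For k = 2 and a bound of eight symbols, exactly the exponent pairs (1, 1), (1, 2) and (2, 1) fit, which agrees with the claimed form of the language on this finite fragment; it is a sanity check, not a proof.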
To continue our study, we discuss the case where $G$ is a non-self-embedding infinite grammar form. In this case $\mathcal{L}_x(G) \subseteq \mathcal{L}(REG)$. Now we have to distinguish between complexity measures in $\{HEI, DEP\}$ and in $\{VAR, PROD, LEV\}$ since, for any regular language $R$, $HEI_{CF}(R) = 1$ and $DEP_{CF}(R) = 1$, while $VAR_{CF}$, $PROD_{CF}$ and $LEV_{CF}$ form infinite hierarchies on the class of regular languages.
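Theorem 5.2. Let $G$ be a non-self-embedding infinite context-free grammar form and let $\mathcal{G}$ be a class of context-free grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Then $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$ and $PROD_{\mathcal{G}}$ are unbounded on $\mathcal{L}_x(G)$.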
Proof. Let us discuss first $K = LEV$. Let $x = g$. Then $\mathcal{L}_g(G) = \mathcal{L}(REG)$; by [5], for every $k \geq 1$, there is a regular language $R_k$ such that $LEV_{CF}(R_k) = k$ holds. Let $x = s$. Without loss of generality, we may assume that $G$ has a recursive nonterminal $A$ with derivations
$D_S : S \Rightarrow^{*} xAy$,
$D_I : A \Rightarrow^{+} uA$ (or $D_I : A \Rightarrow^{+} Au$, but not both),
$D_F : A \Rightarrow^{+} w$,
with $x, y, w \in T^{*}$ and $u \in T^{+}$. Let $P_S$, $P_I$, $P_F$ be the sets of productions used in derivations $D_S$, $D_I$, $D_F$, respectively. To prove the theorem, we construct, for any $k \geq 1$, a grammar $G_k$ such that $L(G_k) = L$ satisfies the conditions of Theorem 3.3. Let $\mu_{D_S}$, $\mu_{D_I}$, $\mu_{D_F}$ be the isolations defined in Lemma 4.1 and denote by $P'_S$, $P'_I$, $P'_F$ the sets of productions obtained by them from $P_S$, $P_I$, $P_F$, respectively. Let
$$P_k = \bigcup_{i=1}^{k} \left( \{S \to \mu_{c_i}(\alpha) : S \to \alpha \in P'_S\} \cup \mu_{c_i}(P'_S \cup P'_I \cup P'_F) \right).$$
Let $G_k$ be the grammar given implicitly by productions of $P_k$. Then $P_k$ determines grammar $G_k$ with the following derivations:
$S \Rightarrow^{*} x_i A_i y_i$, $A_i \Rightarrow^{+} u_i A_i$ (or $A_i \Rightarrow^{+} A_i u_i$), $A_i \Rightarrow^{+} w_i$, $1 \leq i \leq k$,
where $alph(x_i y_i u_i w_i)$ are pairwise disjoint for different $i$. Since $L(G_k) = \bigcup_{i=1}^{k} L_i$, where $L_i \subseteq alph(x_i y_i u_i w_i)^{+}$, $LEV_{\mathcal{G}}(L(G_k)) \geq k$, according to Theorem 3.3. Since $PROD_{\mathcal{G}}(L(G_k)) \geq VAR_{\mathcal{G}}(L(G_k)) \geq LEV_{\mathcal{G}}(L(G_k))$, the proof is completed. □
If we restrict $\mathcal{G}$ to be a class of non-self-embedding linear grammars, then for $G$ a non-self-embedding linear infinite grammar form we obtain an infinite hierarchy for $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ on $\mathcal{L}_x(G)$, too.
Theorem 5.3. Let $G$ be a non-self-embedding infinite linear grammar form. Let $\mathcal{G}$ be a class of non-self-embedding linear grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Let $K \in \{HEI, DEP\}$. Then $K_{\mathcal{G}}$ is unbounded on $\mathcal{L}_x(G)$.
Proof (sketch). The theorem can be proved by constructing languages $M_k^{(1)}$, $M_k^{(+)}$ using similar methods and arguments as in Theorem 5.1.
References
[1] W. Brauer, On grammatical complexity of context-free languages (extended abstract), in: Proc. of MFCS'73, 191-196.
[2] E. Csuhaj-Varjú, A connection between descriptional complexity of context-free grammars and grammar form theory, in: A. Kelemenová and J. Kelemen, eds., Trends, Techniques, and Problems in Theoretical Computer Science, Lecture Notes in Computer Science, Vol. 281 (Springer, Berlin, 1987) 60-74.
[3] E. Csuhaj-Varjú and J. Dassow, On bounded interpretations of grammar forms, Theoret. Comput. Sci. 87 (1991) 287-313.
[4] J. Gruska, On a classification of context-free grammars, Kybernetika 3 (1967) 22-29.
[5] J. Gruska, Some classifications of context-free languages, Inform. and Control 14 (1969) 152-179.
[6] A. Kelemenová, Grammatical levels of position restricted grammars, in: J. Gruska and M. Chytil, eds., Proc. of MFCS '81, Lecture Notes in Computer Science, Vol. 118 (Springer, Berlin, 1981) 347-359.
[7] A. Kelemenová, Grammatical complexity of context-free languages and normal forms of context-free grammars, in: P. Mikulecky, ed., Proc. of IMYCS '82 (Bratislava, 1982) 239-258.
[8] A. Kelemenová, Complexity of normal form grammars, Theoret. Comput. Sci. 28 (1984) 288-314.
[9] A. Kelemenová, Structural complexity measures on grammar forms, in: Proc. Conf. on Automata, Languages and Programming Systems, Salgotarjan, DM 88-4 (Budapest, 1988) 73-76.
[10] A. Salomaa, Formal Languages (Academic Press, New York, 1973).
[11] D. Wood, Grammar and L Forms: An Introduction, Lecture Notes in Computer Science, Vol. 91 (Springer, Berlin, 1980).