Descriptional complexity of context-free grammar forms

1993, Theoretical Computer Science

Csuhaj-Varjú, E. and A. Kelemenová, Descriptional complexity of context-free grammar forms, Theoretical Computer Science 112 (1993) 277-289.

Erzsébet Csuhaj-Varjú, Alica Kelemenová

Communicated by A. Salomaa
Received August 1989
Revised November 1991

Correspondence to: E. Csuhaj-Varjú, Computer and Automation Institute, Hungarian Academy of Sciences, Victor Hugo u. 18-22, H-1132 Budapest, Hungary.
0304-3975/93/\$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved

Abstract

Descriptional complexity aspects of grammar forms are studied. It is shown that the grammatical complexity measures $HEI_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$, related to any appropriate infinite class $\mathcal{G}$ of grammars, are unbounded on the infinite class of languages determined by strict/general interpretations of any infinite grammar form.

1. Introduction

Descriptional (grammatical) complexity measures were introduced in [1, 4, 5] in order to classify context-free languages according to the size and/or structural properties of their grammars. The size of grammars is expressed by such complexity measures as the number of nonterminals ($VAR$) and the number of productions ($PROD$). The number of grammatical levels ($LEV$), the maximal number of elements of grammatical levels ($DEP$) and the height of the digraph of grammatical levels ($HEI$) are the complexity measures reflecting the structure of grammars.

One of the aspects of grammatical complexity theory is the study of the functional behaviour of the complexity measures on language classes. Complexity measures are
functions defined on context-free languages, with values being natural numbers; thus, one can ask for the set of all values of complexity of languages or simply for the boundedness/unboundedness of the complexity measure (on a given class of languages); the latter leads to the finiteness/infinity of the corresponding language hierarchy. Obviously, this behaviour depends strictly on the grammar class that is used to specify languages. For a large variety of complexity measures it was proved that, related to appropriate grammar classes, for an arbitrary natural number $n$, there is a context-free language with the complexity equal to $n$ (see e.g. [4, 5, 7]). Following this line, the study of the problem of boundedness/unboundedness for remarkable subclasses of the context-free language class is of interest.

In this paper we concentrate on language families defined by grammar forms, which present a natural generalization of the class of all context-free languages. Context-free grammar forms define infinite families of structurally related grammars via special finite substitutions (interpretations) of terminals and nonterminals in the production set. (For details the reader is referred to [11].)

The main result of this paper establishes unboundedness of the complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $DEP_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ on the classes of languages defined by grammar forms. This property holds for a rather large variety of grammar classes $\mathcal{G}$ describing these languages. The statements are presented with full technical details. They complete the earlier results given in [3, 8, 9].

The paper is organized as follows. Section 2 lists some basic definitions from formal language theory. In Section 3 we construct, for every fixed natural number $k$, $k \ge 1$, some context-free languages that are of complexity at least $k$ for an arbitrary subclass of context-free grammars which enables one to generate these languages.
The results are of auxiliary character and serve in proving the main statement in Section 5. In Section 4 some special interpretations are presented to obtain the interpretation grammars generating the languages of the previous section. These mappings are isolation, linear isolation, copy and renaming a single symbol. In Section 5 we show that the grammatical complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ are not bounded on strict and on general grammatical families of self-embedding (infinite non-self-embedding linear) grammar forms for an arbitrary class $\mathcal{G}$ of reduced (e.g. non-self-embedding linear) grammars; that is, for every natural number $k$ and for each of the above complexity measures, there is a language of grammatical complexity at least $k$ in the strict/general grammatical family of grammar forms. From these statements some results from previous papers can be derived as corollaries.

2. Basic definitions

We assume the reader to be familiar with the basics of formal language theory. For the details not explained here the reader is referred to [10].

We denote context-free grammars (shortly, grammars) by $G = (N, T, P, S)$, where $N$, $T$, $P$ are the sets of nonterminals, terminals and productions, respectively, and $S$ is the start symbol. The context-free language (the language) generated by $G$ is denoted by $L(G)$. By $SF(G)$ we mean the set of sentential forms derivable in a context-free grammar $G$ from $S$.

A context-free grammar $G$ is said to be reduced iff, for all $A \in N$, there is a derivation $S \Rightarrow^* uAv \Rightarrow^* w$ in $G$, with $u, v, w \in T^*$. For a context-free grammar $G$, we denote by $G_{red}$ a grammar obtained from $G$ by elimination of all nonterminals $A$ for which no derivation $S \Rightarrow^* uAv \Rightarrow^+ w$ can be found in $G$, where $u, v, w \in T^*$.
A nonterminal $A$ of $G$ is said to be recursive iff there is a derivation $A \Rightarrow^+ uAv$ in $G$, where $uv \in T^+$. A reduced context-free grammar $G$ is said to be self-embedding if there is a nonterminal $A$ in $N$ such that a derivation $A \Rightarrow^* uAv$, with $u, v \in T^+$, exists; otherwise, it is said to be non-self-embedding.

For a language $L$, we denote by $alph(L)$ the smallest alphabet $T$ such that $L \subseteq T^*$. For $w \in L$, we denote by $|w|$ the length of $w$ and by $suf_l(w)$ the suffix of length $l$ of $w$. For a class $\mathcal{G}$ of context-free grammars, we denote by $\mathcal{L}(\mathcal{G})$ the class of languages generated by elements of $\mathcal{G}$.

In what follows, we review the notions of descriptional complexity measures (size and structural complexity measures) of context-free grammars (languages) introduced in [1, 4, 5].

The size measures for a context-free grammar $G$ are the number of its nonterminals, denoted by $VAR(G)$, and the number of its productions, denoted by $PROD(G)$.

In order to define structural complexity measures, we have to introduce a relation $\triangleright$ on $N$ for $G = (N, T, P, S)$. For two nonterminals $A$ and $B$ of $G$, we write $A \triangleright B$ if there is a production $A \to uBv$ in $G$, with $u, v \in (N \cup T)^*$. $\triangleright^+$ denotes the transitive closure of $\triangleright$ and $\triangleright^*$ the reflexive and transitive closure of $\triangleright$. An equivalence relation $\equiv$, defined as $A \equiv B$ iff $A \triangleright^* B$ and $B \triangleright^* A$, determines on $N$ equivalence classes, called grammatical levels. For two grammatical levels $Q_1$ and $Q_2$ of $G$, where $Q_1 \neq Q_2$, we write $Q_1 > Q_2$ iff there are nonterminals $A \in Q_1$ and $B \in Q_2$, with $A \triangleright B$.

Structural complexity measures for a context-free grammar $G$ are defined as follows:
$LEV(G)$ denotes the number of grammatical levels of $G_{red}$,
$DEP(G) = \max\{card(Q) : Q \text{ is a grammatical level of } G_{red}\}$,
$HEI(G) = \max\{HEI(Q) : Q \text{ is a grammatical level of } G_{red}\}$,
where $HEI(Q) = 1$ iff $S \in Q$ and $HEI(Q) = 1 + \max\{HEI(Q_i) : Q_i > Q\}$ otherwise.

In what follows, we use for the complexity measures $VAR$, $LEV$, $HEI$, $DEP$ and $PROD$ the common denotation $K$.
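As an illustration that is not part of the original development, the five measures can be computed mechanically for a small grammar. The following Python sketch assumes that the grammar is already reduced (so that $G_{red} = G$) and that it is given as a dictionary mapping each nonterminal to its right-hand sides; the function name `measures` and this representation are hypothetical choices made only for the example.

```python
def measures(grammar, start):
    """Return VAR, PROD, LEV, DEP and HEI of a reduced context-free grammar.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of
    symbols); every symbol that is not a key of `grammar` is a terminal.
    Assumption: the grammar is already reduced, so G_red = G.
    """
    nts = list(grammar)
    # direct[A] = {B : A |> B}, i.e. there is a production A -> uBv.
    direct = {a: {x for rhs in grammar[a] for x in rhs if x in grammar}
              for a in nts}

    # Reflexive and transitive closure |>*, computed as a simple fixpoint.
    reach = {a: {a} | direct[a] for a in nts}
    changed = True
    while changed:
        changed = False
        for a in nts:
            new = set().union(*(reach[b] for b in reach[a]))
            if not new <= reach[a]:
                reach[a] |= new
                changed = True

    # Grammatical levels: classes of A = B iff A |>* B and B |>* A.
    levels = []
    for a in nts:
        level = frozenset(b for b in reach[a] if a in reach[b])
        if level not in levels:
            levels.append(level)
    level_of = {a: q for q in levels for a in q}

    # preds[Q] = {Q' : Q' > Q}, i.e. A |> B for some A in Q', B in Q, Q' != Q.
    preds = {q: set() for q in levels}
    for a in nts:
        for b in direct[a]:
            if level_of[a] != level_of[b]:
                preds[level_of[b]].add(level_of[a])

    memo = {}

    def height(q):
        # HEI(Q) = 1 iff S in Q, otherwise 1 + max{HEI(Q') : Q' > Q}.
        if q not in memo:
            if start in q:
                memo[q] = 1
            else:
                memo[q] = 1 + max(height(p) for p in preds[q])
        return memo[q]

    return {
        "VAR": len(nts),
        "PROD": sum(len(rhss) for rhss in grammar.values()),
        "LEV": len(levels),
        "DEP": max(len(q) for q in levels),
        "HEI": max(height(q) for q in levels),
    }


# S -> A, A -> aA | B, B -> bB | b: three singleton levels forming a chain.
chain = {"S": [("A",)], "A": [("a", "A"), ("B",)], "B": [("b", "B"), ("b",)]}
print(measures(chain, "S"))
# {'VAR': 3, 'PROD': 5, 'LEV': 3, 'DEP': 1, 'HEI': 3}
```

The closure is computed by a naive fixpoint rather than a strongly-connected-components algorithm, which keeps the sketch close to the definitions and is adequate for small grammars. These are measures of one grammar; the complexity of a language additionally requires minimizing over all grammars of the chosen class, which the sketch does not attempt.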
The descriptional complexity measure $K_{\mathcal{G}}$ of a language $L$ with respect to a class $\mathcal{G}$ of grammars is defined as follows:

$$K_{\mathcal{G}}(L) = \begin{cases} \min\{K(G) : G \in \mathcal{G},\ L(G) = L\} & \text{if } L = L(G) \text{ for some } G \in \mathcal{G}, \\ \text{undefined} & \text{otherwise.} \end{cases}$$

Note that, by definition, for an arbitrary class $\mathcal{G}$ of grammars, with $L \in \mathcal{L}(\mathcal{G})$,
$HEI_{\mathcal{G}}(L) \le LEV_{\mathcal{G}}(L) \le VAR_{\mathcal{G}}(L) \le PROD_{\mathcal{G}}(L)$
holds.

In what follows, we review the notions of a grammar form and its strict and general interpretations. For further details, see [11].

Let $G_i = (N_i, T_i, P_i, S_i)$, where $i = 1, 2$, be context-free grammars. We say $G_2$ is obtained from the grammar form $G_1$ by a general interpretation (shortly, a g-interpretation) $\mu$, denoted by $G_2 \triangleleft_g G_1(\mu)$, if $\mu$ is a finite substitution on $(N_1 \cup T_1)^*$ and conditions (i)-(iv) hold:
(i) $\mu(A) \subseteq N_2$ for all $A \in N_1$ and $\mu(A) \cap \mu(B) = \emptyset$ for all $A, B \in N_1$, with $A \neq B$;
(ii) $\mu(a) \subseteq T_2^*$ for all $a \in T_1$;
(iii) $P_2 \subseteq \mu(P_1)$, where $\mu(P_1) = \{B \to \beta : A \to \alpha \in P_1,\ B \in \mu(A),\ \beta \in \mu(\alpha)\}$;
(iv) $S_2 \in \mu(S_1)$.

$G_2$ is said to be obtained from $G_1$ by a strict interpretation (shortly, an s-interpretation) $\mu$, denoted by $G_2 \triangleleft_s G_1(\mu)$, if condition (ii) is modified as follows: $\mu(a) \subseteq T_2$ for every $a \in T_1$ and $\mu(a) \cap \mu(b) = \emptyset$ for all $a, b \in T_1$, where $a \neq b$.

The collection of grammars obtained by x-interpretations from a grammar $G$, where $x \in \{g, s\}$, is denoted by $\mathcal{G}_x(G)$. The class of languages $\mathcal{L}_x(G) = \{L : L = L(G'),\ G' \in \mathcal{G}_x(G)\}$ is called the x-grammatical family of $G$. The grammar $G$ itself is often referred to as a grammar form. A grammar form $G$ is said to be infinite if $L(G)$ is infinite; otherwise, it is said to be finite.

3. On descriptional complexity of context-free languages

In this section we determine the complexity measures of some special context-free languages. The results obtained here will be used in Section 5 to prove the main results of the paper.

Definition 3.1. Let $G = (N, T, P, S)$ be a context-free grammar with a derivation tree $t$ of $w = \alpha v \beta$ in $G$, with $\alpha, \beta \in T^*$, $v \in T^+$. We say $t_v$ is a minimal subtree of $t$ completely deriving $v$ if $t_v$ is a derivation tree of $xvy$, where $\alpha = x_0 x$, $\beta = y y_0$, and $t_v$ has no subtree $t_v'$ such that $t_v'$
is a derivation tree of $x'vy'$, where $x = x_1 x'$, $y = y' y_1$, and $x_1 y_1 \in T^*$.

We shall use the pumping property of context-free grammars in the form specified by the following lemma.

Lemma 3.2. Let $G = (N, T, P, S)$ be a context-free grammar. Let $w = xvy$ be in $L(G)$, where $|v| > d^m$ for $d = \max\{|\alpha| : A \to \alpha \in P\}$ and $m = card(N)$. Let $t$ be a derivation tree of $w$ with no subderivation $A \Rightarrow^+ A$ for any $A$ in $N$ and let $t_v$ be a minimal subtree of the derivation tree $t$ completely deriving $v$. Then there is an $A_v \in N$ which occurs twice on the same branch of $t_v$. Moreover, the subderivation $A_v \Rightarrow^+ v_1 A_v v_2$ is determined in $t_v$ by two consecutive occurrences of $A_v$ on this branch, where $v_1$ and $v_2$ are subwords of $w$ and $v_1 v_2 \neq \varepsilon$.

Proof. Suppose by contradiction that no nonterminal occurs twice on the same branch of $t_v$. Then the length of any branch of $t_v$ is at most $m$, and $|v| \le d^m$. This contradicts the assumption of the lemma.
Let $A_v$ occur twice on the same branch of $t_v$ and let $A_v \Rightarrow^+ v_1 A_v v_2$, with $v_1, v_2 \in T^*$, be a subderivation determined in $t_v$ by two consecutive occurrences of $A_v$ in this branch. Since $A_v \Rightarrow^+ A_v$ is not a subderivation in $t$, we have immediately $v_1 v_2 \neq \varepsilon$. □

The following theorem is about the $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$ and $PROD_{\mathcal{G}}$ complexity of context-free languages that are finite unions of languages over pairwise disjoint alphabets.

Theorem 3.3. Let $L = \bigcup_{i=1}^{k} L_i$, where $L_i$, $1 \le i \le k$, are infinite context-free languages over pairwise disjoint alphabets. Let $\mathcal{G}$ be a class of grammars such that $L, L_i \in \mathcal{L}(\mathcal{G})$ hold. Then $K_{\mathcal{G}}(L) \ge k$ for $K \in \{LEV, VAR, PROD\}$.

Proof. Let $G = (N, T, P, S)$ be in $\mathcal{G}$ and $L(G) = L$. Let, for a given $i$, $1 \le i \le k$, $w_i = x_i v_i y_i \in L_i$ have a derivation tree $t_i$ and $|v_i| > d^m$, where $t_i$, $d$ and $m$ are as in Lemma 3.2. Let $t_{v_i}$ be a minimal subtree of $t_i$ completely deriving $v_i$. Then, by Lemma 3.2, there is a nonterminal $A_i$ and a derivation $A_i \Rightarrow^+ u_i A_i v_i$ determined by $A_i$ in $t_{v_i}$, with $u_i v_i \in (alph(L_i))^+$.
Since $alph(L_i) \cap alph(L_j) = \emptyset$ for $i \neq j$, $1 \le i, j \le k$, neither $A_i \triangleright^+ A_j$ nor $A_j \triangleright^+ A_i$ holds for $i \neq j$, $1 \le i, j \le k$. This implies the statement for $K = LEV$ and, thus, also for $K = VAR$ or $K = PROD$. □

Notation 3.4. Let $u_i$, $v_i$, $1 \le i \le k$, be nonempty words and $alph(u_i v_i) \cap alph(u_j v_j) = \emptyset$ for $i \neq j$. Let $L_{k,0} = u_1^+ \cdots u_k^+$. For $x \in L_{k,0}^+$, let $mi(x)$ be defined as follows: for $x = u_1^{m_1} \cdots u_k^{m_k} \in L_{k,0}$, where $m_1, \ldots, m_k \ge 1$, let $mi(x) = v_k^{m_k} \cdots v_1^{m_1}$. For $x = x_1 x_2$, where $x_1, x_2 \in L_{k,0}^+$, let $mi(x_1 x_2) = mi(x_2)\, mi(x_1)$.
Let $u$, $v$, $w$ be arbitrary words with disjoint alphabets, also disjoint with those of $u_i$, $v_i$, $1 \le i \le k$ (in the case where $u$, $v$ or $w$ are nonempty). Let
$M_k^{(1)} = \{uxw : x \in L_{k,0}\}$,
$M_k^{(+)} = \{uxw : x \in L_{k,0}^+\}$,
$L_k^{(1)} = \{uxw\, mi(x)\, v : x \in L_{k,0}\}$ and
$L_k^{(+)} = \{uxw\, mi(x)\, v : x \in L_{k,0}^+\}$.

The structure of any context-free grammar generating any of the above languages is determined in the following sense: all words with sufficiently many (say $s$) repetitions of the subword $u_i$ ($v_i$) are generated by pumping a subword of $u_i^s$ ($v_i^s$), with the length equal to some multiple of the length of $u_i$ ($v_i$).

Lemma 3.5. Let $L_k$ be any of the languages $L_k^{(1)}$ and $L_k^{(+)}$. Let $L_k = L(G_k)$ for a context-free grammar $G_k$. Then, for every $i$, $1 \le i \le k$, there exists a nonterminal $A_i$ in $G_k$ and a number $n_i \ge 1$ such that $A_i \Rightarrow^+ \bar{u}_i^{n_i} A_i \bar{v}_i^{n_i}$ holds, where $\bar{u}_i = yx$ for some $x$, $y$, with $xy = u_i$, and $\bar{v}_i = tz$ for some $z$, $t$, with $zt = v_i$. Moreover, for $L_k = L_k^{(1)}$ and for $L_k = L_k^{(+)}$ with $k \ge 3$, $A_i \neq A_j$ for $i \neq j$.

Proof. Let $d$ and $m$ be as in Lemma 3.2. Consider $w_s = u u_1^s \cdots u_k^s w v_k^s \cdots v_1^s v$, where $s > d^m$. Obviously, $w_s \in L_k^{(1)} \subseteq L_k^{(+)}$. Let $t$ be a derivation tree of $w_s$ in $G_k$ fulfilling the conditions of Lemma 3.2. According to Lemma 3.2, for every $i$, $1 \le i \le k$, and for every minimal subtree $t_i$ of $t$ completely deriving $u_i^s$, it holds that there is a nonterminal
$A_i$ and a subderivation $A_i \Rightarrow^+ u_i' A_i z_i$, determined in $t_i$ by $A_i$, where $u_i'$ is a nonempty subword of $u_i^s$ and $z_i$ is a terminal word. (The case where $z_i$ is a nonempty subword of $u_i^s$ leads to a contradiction with the structure of $L_k$.) Let $u_i' = y u_i^{r_i} x$ for some $x$, $y$, where $0 \le |x| < |u_i|$ and $0 \le |y| < |u_i|$. We prove that $xy = u_i$, which leads to $A_i \Rightarrow^+ \bar{u}_i^{n_i} A_i z_i$, where $\bar{u}_i = yx$.

Consider the derivation $S \Rightarrow^* \hat{u} A_i \hat{v} \Rightarrow^+ \hat{u} (u_i')^j A_i z_i^j \hat{v} \Rightarrow^* \hat{u} (u_i')^j w_i z_i^j \hat{v}$ in $G_k$, where $\hat{u}$, $\hat{v}$, $w_i$ are terminal words and $w_i$ is derived from $A_i$ with a minimal number of steps in $G_k$. By the structure of words in $L_k$, for $j = 1$ and $j = 2$ the derived words contain the blocks $u_i^{l_1}$ and $u_i^{l_2}$, respectively, for maximal numbers $l_2 > l_1 \ge 1$. This gives $u_i^{r_i} x y u_i^{r_i} = u_i^{2r_i+1}$, i.e. $xy = u_i$ and $l_2 - l_1 = r_i + 1 = n_i$.

Comparing in the same way the suffixes $suf_l(w_i) z_i \hat{v}$ and $suf_l(w_i) z_i^2 \hat{v}$ of these words, for some $l \le |w_i|$, implies $z_i = t v_i^{n_i - 1} z$ for some $t$, $z$, where $0 \le |t| < |v_i|$, $0 \le |z| < |v_i|$ and $zt = v_i$.

Let $A_i = A_j$ for some $i, j$, $1 \le i, j \le k$. Then $\bar{u}_i^{n_i} \bar{u}_j^{n_j} \bar{u}_i^{n_i}$, $n_i \ge 1$, $n_j \ge 1$, is a subword of some word in $L_k$. This implies $i = j$ for $L_k = L_k^{(1)}$, where $k \ge 2$, and for $L_k = L_k^{(+)}$, where $k \ge 3$. □

In the case of the regular languages $M_k^{(1)}$ and $M_k^{(+)}$ an analogous statement holds for the class of non-self-embedding linear grammars. Using similar methods and arguments as in the proof of Lemma 3.5, we can prove the following lemma.

Lemma 3.6. Let $L_k$ be any of the languages $M_k^{(1)}$ and $M_k^{(+)}$. Let $G_k$ be a non-self-embedding linear grammar generating $L_k$. Then, for every $i$, $1 \le i \le k$, there exists a nonterminal $A_i$ in $G_k$ and $n_i \ge 1$ such that $A_i \Rightarrow^+ \bar{u}_i^{n_i} A_i$ or $A_i \Rightarrow^+ A_i \bar{u}_i^{n_i}$ holds, where $\bar{u}_i = yx$ for some $x$, $y$, with $xy = u_i$. Moreover, $A_i \neq A_j$ for $i \neq j$, $1 \le i, j \le k$.

Theorem 3.7. (i) Let $\mathcal{G}$ be a class of context-free grammars such that, for every $k \ge 1$, $L_k^{(1)} \in \mathcal{L}(\mathcal{G})$. Then $HEI_{\mathcal{G}}(L_k^{(1)}) \ge k$.
(ii) Let $\mathcal{G}$ be a class of non-self-embedding
linear grammars such that, for every $k \ge 1$, $M_k^{(1)} \in \mathcal{L}(\mathcal{G})$. Then $HEI_{\mathcal{G}}(M_k^{(1)}) \ge k$.

Proof. Consider an arbitrary grammar $G_k$ in $\mathcal{G}$ for which $L(G_k) = L_k^{(1)}$ ($L(G_k) = M_k^{(1)}$) holds. Let $A_1, \ldots, A_k$ be nonterminals of $G_k$ determined in Lemma 3.5 (in Lemma 3.6), which are used in the derivation of $w_s = u u_1^s \cdots u_k^s w v_k^s \cdots v_1^s v$ ($w_s = u u_1^s \cdots u_k^s w$), where $s > d^m$, $d$ and $m$ being the numbers given in Lemma 3.2. Since no $v_i$ can precede $u_j$ for $1 \le i, j \le k$ [since $G_k$ is a linear grammar for (ii)], no sentential form $x A_i y A_j z$ can be derived in $G_k$, where $x, y, z \in (N \cup T)^*$. This implies that either $A_i \triangleright^* A_{i+1}$ or $A_{i+1} \triangleright^* A_i$ for $1 \le i \le k-1$. Since $u_{i+1}$ never precedes $u_i$, in the case of $L_k^{(1)}$, $A_{i+1} \triangleright^+ A_i$ does not hold and then $A_1 \triangleright^+ A_2 \triangleright^+ \cdots \triangleright^+ A_k$. In the case of $M_k^{(1)}$, each $A_i$ is either right-linear or left-linear but not both; so, a permutation $(p_1, \ldots, p_k)$ of $(1, \ldots, k)$ can be determined such that $A_{p_1} \triangleright^+ A_{p_2} \triangleright^+ \cdots \triangleright^+ A_{p_k}$. Hence, $HEI(G_k) \ge k$. □

Theorem 3.8. (i) Let $\mathcal{G}$ be a class of context-free grammars such that, for every $k \ge 3$, $L_k^{(+)} \in \mathcal{L}(\mathcal{G})$. Then $DEP_{\mathcal{G}}(L_k^{(+)}) \ge k$.
(ii) Let $\mathcal{G}$ be a class of non-self-embedding linear grammars such that, for every $k \ge 1$, $M_k^{(+)} \in \mathcal{L}(\mathcal{G})$. Then $DEP_{\mathcal{G}}(M_k^{(+)}) \ge k$.

Proof. Let $G_k$ be an arbitrary element of $\mathcal{G}$ for which $L(G_k) = L_k^{(+)}$ ($L(G_k) = M_k^{(+)}$) holds. Let $A_1^{(t)}, \ldots, A_k^{(t)}$, for $1 \le t \le m+1$, be nonterminals of $G_k$, determined in Lemma 3.5 (in Lemma 3.6), which are used in the derivation of the word $w_s = u W^{m+1} w\, mi(W^{m+1})\, v$ ($w_s = u W^{m+1} w$), where $W = u_1^s \cdots u_k^s$ for some $s > d^m$, where $d$ and $m$ are defined in Lemma 3.2 (i.e. $A_i^{(t)}$ is a nonterminal producing $\bar{u}_i$'s in the $t$-th occurrence of $u_i^s$ in $w_s$). Note that $A_i^{(s_1)} \neq A_j^{(s_2)}$ for $i \neq j$ and for arbitrary $s_1, s_2$, $1 \le s_1, s_2 \le m+1$.
As no $v_j$ can precede $u_i$ for $1 \le i, j \le k$ in $L_k^{(+)}$, $A_i^{(t)} \triangleright^+ A_{i+1}^{(t)}$ for $1 \le i \le k-1$, $t = 1, 2, \ldots, m+1$, and $A_k^{(s)} \triangleright^+ A_1^{(s+1)}$ for $1 \le s \le m$. Since $t = 1, 2, \ldots, m+1$, there exist two different positions $s_1$ and $s_2$ such that $A_i^{(s_1)} = A_i^{(s_2)}$ for a fixed $i$. For $i = 1$ let us choose $s_1$, $s_2$, where $s_2 > s_1$, and $s_2 - s_1$ is minimal. Then $A_1^{(s_1)} \triangleright^+ A_2^{(s_1)} \triangleright^+ \cdots$
$\triangleright^+ A_k^{(s_1)} \triangleright^+ A_1^{(s_1+1)} \triangleright^* A_1^{(s_2)} = A_1^{(s_1)}$. (In the case of $M_k^{(+)}$, permutations $(p_{t,1}, \ldots, p_{t,k})$ of $(1, \ldots, k)$ for $1 \le t \le m+1$ exist such that, for $B_i^{(t)}$ equal to $A_{p_{t,i}}^{(t)}$, $B_1^{(s_1)} \triangleright^+ B_2^{(s_1)} \triangleright^+ \cdots \triangleright^+ B_k^{(s_1)} \triangleright^+ B_1^{(s_1+1)} \triangleright^* B_1^{(s_2)} = B_1^{(s_1)}$ holds.) Thus, $DEP(G_k) \ge k$. □

4. Basic interpretations: auxiliary results

In this section we specify some strict interpretations used in the sequel. The basic idea behind them is a suitable renaming of nonterminals and erasing rules that are not of interest.

By isolation we mean an interpretation which, roughly speaking, isolates a given derivation of a sentential form.

Lemma 4.1. Let $G = (N, T, P, S)$ be a context-free grammar and let, for $A$ in $N$, $D : A = u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n$, $n \ge 1$, be a derivation in $G$, where $u_j \in (N \cup T)^*$, $1 \le j \le n$. Then there is a strict interpretation $\mu_D$, called an isolation of $D$, and a grammar $G_D = (N_D, T, P_D, A)$ such that $G_D \triangleleft_s G(\mu_D)$ and
(i) $SF(G_D) \cap (N \cup T)^* = SF(G_{A,u_n})$, where $G_{A,u_n} = (N, T, \{A \to u_n\}, A)$,
(ii) $A = v_0 \Rightarrow v_1 \Rightarrow \cdots \Rightarrow v_{n-1} \Rightarrow v_n = u_n$, where $v_i \in \mu_D(u_i)$ for $i = 1, 2, \ldots, n-1$, is the only derivation in $G_D$ of length $n$ starting with $A$, and
(iii) $P_D$ consists exactly of the productions used in $v_i \Rightarrow v_{i+1}$, for $i = 0, 1, \ldots, n-1$.

Proof. Let us define $\mu_D$ as follows:
$\mu_D(a) = \{a\}$ for $a \in T$,
$\mu_D(B) = \{B^{[i,j]} : 1 \le i \le n-1$, for every $[i,j]$ such that $B$ occurs as the $j$th letter in $u_i\} \cup \rho(B)$ for $B \in N$,
where $\rho(B) = \{B\}$ for $B = A$ or for $B$ being a letter of $u_n$, and $\rho(B) = \emptyset$ otherwise.
Let, for $j$, $1 \le j \le n-1$, $u_j = X_{j,1} \cdots X_{j,l_j}$, where $X_{j,k} \in (N \cup T)$, $1 \le k \le l_j$. We associate with $u_j$ a word $v_j$, where $v_j = X'_{j,1} \cdots X'_{j,l_j}$, where
$X'_{j,k} = X_{j,k}$ if $X_{j,k} \in T$,
$X'_{j,k} = X_{j,k}^{[j,k]}$ if $X_{j,k} \in N$.
Let us consider the derivation $D' : A \Rightarrow v_1 \Rightarrow v_2 \Rightarrow \cdots \Rightarrow v_{n-1} \Rightarrow u_n$ and let $P_D$ be the set of productions used in this derivation. Then, obviously, $P_D \subseteq \mu_D(P)$ and, for the grammar $G_D$ given implicitly by $P_D$, we have $G_D \triangleleft_s G(\mu_D)$ and $SF(G_D) \cap (N \cup T)^* = SF(G_{A,u_n})$. □

Remark.
$SF(G_{A,u_n})$ is infinite if $A$ is a letter of $u_n$, where $u_n \neq A$; otherwise, $SF(G_{A,u_n}) = \{A, u_n\}$.

Linear isolations (constructed in the next lemma) cause fixed derivations to be isolated, and terminals derived left and right from a fixed branch of the derivation tree to be distinguished.

Lemma 4.2. Let $G = (N, T, P, S)$ be a context-free grammar and let, for $A$ in $N$, $D : A = u_0 \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \cdots \Rightarrow u_n = uAv$, $n \ge 1$, be a derivation in $G$, where $u, v \in T^+$. Let $T'$ be a primed version of $T$ and let $v'$ in $T'^+$ be the primed version of $v$. Then there is a strict interpretation $\mu'_D$, called a linear isolation of $D$, and a grammar $G'_D = (N'_D, T \cup T', P'_D, A)$ such that $G'_D \triangleleft_s G(\mu'_D)$ and
(i) $SF(G'_D) \cap (N \cup T \cup T')^* = SF(G'_{A,uAv'})$, where $G'_{A,uAv'} = (N, T \cup T', \{A \to uAv'\}, A)$,
(ii) $A = v'_0 \Rightarrow v'_1 \Rightarrow \cdots \Rightarrow v'_{n-1} \Rightarrow v'_n = uAv'$, where $v'_i \in \mu'_D(u_i)$ for $i = 1, 2, \ldots, n-1$, is the only derivation in $G'_D$ of length $n$ starting with $A$, and
(iii) $P'_D$ consists exactly of the productions used in $v'_i \Rightarrow v'_{i+1}$ for $i = 0, 1, \ldots, n-1$.

Proof. Let $u_i = \alpha_i X_i \beta_i$, $1 \le i \le n-1$, where $\alpha_i, \beta_i \in (N \cup T)^*$ and $X_i$ are nonterminals lying on the branch of the derivation tree of $D$ beginning and ending with $A$. We define $\mu'_D$ as follows:
$\mu'_D(a) = \{a, a'\}$ for $a \in T$ and $\mu'_D(B) = \mu_D(B)$ for $B \in N$,
where $\mu_D$ is the isolation defined in Lemma 4.1. Let, for $j$, $1 \le j \le n$, $u_j = X_{j,1} \cdots X_{j,n_j} \cdots X_{j,l_j}$, where $X_{j,k} \in (N \cup T)$, $1 \le k \le l_j$, $n_j \le l_j$ and $X_{j,n_j} = X_j$. We associate with $u_j$ a word $v'_j$, where $v'_j = X'_{j,1} \cdots X'_{j,l_j}$, where
$X'_{j,k} = X_{j,k}^{[j,k]}$ for $X_{j,k} \in N$,
$X'_{j,k} = X_{j,k}$ for $X_{j,k} \in T$, $k < n_j$,
$X'_{j,k} = (X_{j,k})'$ for $X_{j,k} \in T$, $k > n_j$.
Let us consider the derivation $D' : A \Rightarrow v'_1 \Rightarrow v'_2 \Rightarrow \cdots \Rightarrow v'_{n-1} \Rightarrow uAv'$ and let $P'_D$ be the set of productions used in this derivation. Then, obviously, $P'_D \subseteq \mu'_D(P)$ and, for the grammar $G'_D$ given implicitly by $P'_D$, we have $G'_D \triangleleft_s G(\mu'_D)$ as well as $SF(G'_D) \cap (N \cup T \cup T')^* = SF(G'_{A,uAv'})$. □

Next we fix the notions
of $j$th copy and renaming a single symbol by special isomorphic interpretations and define the corresponding grammars isomorphic to the core grammar.

Definition 4.3. Let $G = (N, T, P, S)$ be a grammar and $j$ be a natural number. By $\mu_{c_j}$ we denote an interpretation, called a $j$th copy, defined by $\mu_{c_j}(X) = X^{(j)}$, for $X \in (N \cup T)$. The $j$th copy $G^{(j)}$ of $G$ is the grammar $G^{(j)} \triangleleft_s G(\mu_{c_j})$, with $P^{(j)} = \mu_{c_j}(P)$.

Interpretations can change some fixed occurrences of some symbol in the set of productions.

Definition 4.4. Let $G = (N, T, P, S)$ be a grammar and let $X \in N$, $Y \notin (N \cup T)$. By $\mu_{X \to Y}$, called renaming $X$ (by $Y$), we denote an interpretation with $\mu_{X \to Y}(X) = \{X, Y\}$ and $\mu_{X \to Y}(Z) = \{Z\}$ for $Z \neq X$, $Z \in N \cup T$. By $G_{X \to Y}$ we denote the grammar $G_{X \to Y} = (N \cup \{Y\}, T, P_{X \to Y}, S)$, where
$P_{X \to Y} = \{S \to \alpha_Y : S \to \alpha \in P\} \cup \{Y \to \alpha_Y : X \to \alpha \in P\} \cup \{A \to \alpha_Y : A \to \alpha \in P,\ A \neq X\}$,
where $\alpha_Y$ denotes the word obtained from $\alpha$ by replacing all occurrences of $X$ by $Y$.

Informally, $G_{X \to Y}$ is an interpretation of $G$ in which $X \in N$ is replaced by $Y \notin (N \cup T)$ in every position of $X$ except where $X$ is the start symbol of $G$, and all other letters remain unchanged.

5. Complexity of grammar forms

In this section we show that the grammatical complexity measures $VAR_{\mathcal{G}}$, $PROD_{\mathcal{G}}$, $LEV_{\mathcal{G}}$, $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ are unbounded on strict and on general grammatical families of self-embedding (infinite non-self-embedding linear) grammar forms.

Theorem 5.1. Let $G$ be a self-embedding context-free grammar form. Let $\mathcal{G}$ be a class of context-free grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Let $K \in \{VAR, PROD, LEV, HEI, DEP\}$. Then $K_{\mathcal{G}}$ is unbounded on $\mathcal{L}_x(G)$.

Proof. Let $x = g$. If $G$ is a self-embedding grammar, then $\mathcal{L}_g(G)$ contains all linear languages (see [11, p. 43]). By [5] and Theorem 3.7, for each $K$ and for every natural number $k$, there is a linear language $L_k$ such that $K_{\mathcal{G}}(L_k) \ge k$. This results in $K_{\mathcal{G}}$ being unbounded on $\mathcal{L}_g(G)$.

Let $x = s$. First we prove the result for $K = HEI$. This gives the proof for $VAR$, $PROD$, $LEV$, too.
Since $G$ is self-embedding, there is a nonterminal $A$ in $G$ with derivations
$D_S : S \Rightarrow^+ xAy$, $D_I : A \Rightarrow^+ uAv$, $D_F : A \Rightarrow^+ w$,
with $x, y, w \in T^*$ and $u, v \in T^+$. Denote by $P_S$, $P_I$, $P_F$ the sets of productions of $P$ used in the derivations $D_S$, $D_I$, $D_F$, respectively. According to Theorem 3.7, it is sufficient to give, for any $k \ge 1$, a grammar $G_k$ such that $G_k \triangleleft_s G$ and
$L(G_k) = L_k^{(1)} = \{x u_1^{r_1} \cdots u_k^{r_k} w_{k+1} v_k^{r_k} \cdots v_1^{r_1} y : r_i \ge 1,\ 1 \le i \le k\}$.
Let $P'_S$, $P'_I$, $P'_F$ be the sets of productions obtained from $P_S$, $P_I$, $P_F$ by the isolations $\mu_{D_S}$, $\mu'_{D_I}$, $\mu_{D_F}$, defined in Lemmas 4.1 and 4.2, respectively. We use the abbreviation $\mu_{r_0}$ for $\mu_{A \to A_1}$ and $\mu_{r_j}$ for $\mu_{A_j \to A_{j+1}}$, where $1 \le j \le k$. Let
$P_k = \mu_{r_0}(P'_S) \cup \bigcup_{i=1}^{k} \mu_{c_i}(P'_I) \cup \bigcup_{j=1}^{k} \mu_{r_j}(\mu_{c_j}(P'_I)) \cup \mu_{c_{k+1}}(P'_F)$
and let $G_k$ be the grammar given implicitly by the productions of $P_k$.

We shall prove that $L(G_k) = L_k^{(1)}$. Let $w = x u_1^{r_1} \cdots u_k^{r_k} w_{k+1} v_k^{r_k} \cdots v_1^{r_1} y$. Then $w$ can be derived in $G_k$ by using the following partial derivations:
$S \Rightarrow^+ x A_1 y$, which uses the productions of $\mu_{r_0}(P'_S)$,
$A_i \Rightarrow^+ u_i A_i v_i$, which uses the productions of $\mu_{c_i}(P'_I)$ for $1 \le i \le k$,
$A_j \Rightarrow^+ u_j A_{j+1} v_j$, which uses the productions of $\mu_{r_j}(\mu_{c_j}(P'_I))$ for $1 \le j \le k$,
$A_{k+1} \Rightarrow^+ w_{k+1}$, which uses the productions of $\mu_{c_{k+1}}(P'_F)$.
Thus, $L_k^{(1)} \subseteq L(G_k)$.

We show that the opposite inclusion holds. Let $D : S \Rightarrow w_1 \Rightarrow w_2 \Rightarrow \cdots \Rightarrow w_r = w \in T^*$ be a derivation in $G_k$. By the construction of $P_k$ and by Lemmas 4.1 and 4.2, any sentential form of $G_k$ contains at most one recursive letter. The recursive nonterminal $A_i$, $2 \le i \le k$, does not appear before $A_{i-1}$ is rewritten. Moreover, every terminating derivation contains each $A_i$, $1 \le i \le k$, at least once. Without loss of generality, we may assume that all nonrecursive nonterminals in $D$ are rewritten before a recursive symbol is rewritten. Then indices $2 \le i_1 \le i_2 \le \cdots \le i_{k+1} \le r-1$ can be found such that $A_j$ occurs in the sentential form $w_{i_j}$ and it does not occur in any $w_l$ for $l < i_j$. This gives $w_{i_2} = x u_1^{m_1} A_2 v_1^{m_1} y$ for some $m_1 \ge 1$ and, finally, $w_{i_{k+1}} = x u_1^{m_1} \cdots u_k^{m_k} A_{k+1} v_k^{m_k} \cdots v_1^{m_1} y$ for some $m_1, \ldots, m_k \ge 1$ and for the nonrecursive nonterminal $A_{k+1}$. Thus, $w_r = x u_1^{m_1} \cdots u_k^{m_k} w_{k+1} v_k^{m_k} \cdots v_1^{m_1} y$ and $L(G_k) \subseteq L_k^{(1)}$.
Let $K = DEP$. We show that there is a grammar $\bar{G}_k$, where $\bar{G}_k \triangleleft_s G$, such that $L(\bar{G}_k) = L_k^{(+)}$. Let $P_k$ have the same meaning as above. Let $\bar{P}_k = P_k \cup \bar{\mu}_{r_k}(\mu_{c_k}(P'_I))$, where $\bar{\mu}_{r_k}$ abbreviates $\mu_{A_k \to A_1}$. Let $\bar{G}_k$ be the grammar given implicitly by the elements of $\bar{P}_k$. It can be shown that $L(\bar{G}_k) = L_k^{(+)} = \{x z w_{k+1}\, mi(z)\, y : z \in L_{k,0}^+\}$. According to Theorem 3.8, $DEP_{\mathcal{G}}(L(\bar{G}_k)) \ge k$. □

We illustrate the constructions of $G_k$ and $\bar{G}_k$ from the previous proof.

Example. Let $G$ contain the productions $S \to aA$, $A \to aA \mid Aa \mid a$. We give $G_k$ and $\bar{G}_k$ corresponding to the derivations $D_S : S \Rightarrow aA$, $D_I : A \Rightarrow aA \Rightarrow aAa$, $D_F : A \Rightarrow a$.
$G_k$ is given by the productions
$S \to a A_1$,
$A_i \to a_i A_i^{[1,2]}$, $A_i^{[1,2]} \to A_i a_i'$ for $i = 1, 2, \ldots, k$,
$A_i^{[1,2]} \to A_{i+1} a_i'$ for $i = 1, 2, \ldots, k$,
$A_{k+1} \to a_{k+1}$.
$\bar{G}_k$ contains the same productions as $G_k$ and, moreover, the production $A_k^{[1,2]} \to A_1 a_k'$.

To continue our study, we discuss the case where $G$ is a non-self-embedding infinite grammar form. In this case $\mathcal{L}_x(G) \subseteq \mathcal{L}(REG)$, the class of regular languages. Now we have to distinguish between the complexity measures in $\{HEI, DEP\}$ and in $\{VAR, PROD, LEV\}$ since, for any regular language $R$, $HEI_{CF}(R) = 1$ and $DEP_{CF}(R) = 1$, while $VAR_{CF}$, $PROD_{CF}$ and $LEV_{CF}$ form infinite hierarchies on the class of regular languages.

Theorem 5.2. Let $G$ be a non-self-embedding infinite context-free grammar form and let $\mathcal{G}$ be a class of context-free grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Then $LEV_{\mathcal{G}}$, $VAR_{\mathcal{G}}$ and $PROD_{\mathcal{G}}$ are unbounded on $\mathcal{L}_x(G)$.

Proof. Let us discuss first $K = LEV$. Let $x = g$. Then $\mathcal{L}_g(G) = \mathcal{L}(REG)$; by [5], for every $k \ge 1$, there is a regular language $R_k$ such that $LEV_{CF}(R_k) = k$ holds.

Let $x = s$. Without loss of generality, we may assume that $G$ has a recursive nonterminal $A$ with derivations
$D_S : S \Rightarrow^* xAy$,
$D_I : A \Rightarrow^+ uA$ (or $D_I : A \Rightarrow^+ Au$, but not both),
$D_F : A \Rightarrow^+ w$,
with $x, y, w \in T^*$ and $u \in T^+$. Let $P_S$, $P_I$, $P_F$ be the sets of productions used in the derivations $D_S$, $D_I$, $D_F$, respectively. To prove the theorem, we construct, for any $k \ge 1$, a grammar $G_k$ such that $L(G_k) = L$ satisfies the conditions of Theorem 3.3.
Let $\mu_{D_S}$, $\mu_{D_I}$, $\mu_{D_F}$ be the isolations defined in Lemma 4.1 and denote by $P'_S$, $P'_I$, $P'_F$ the sets of productions obtained by them from $P_S$, $P_I$, $P_F$, respectively. Let
$P_k = \bigcup_{i=1}^{k} \left(\{S \to \mu_{c_i}(\alpha) : S \to \alpha \in P'_S\} \cup \mu_{c_i}(P'_S \cup P'_I \cup P'_F)\right)$.
Let $G_k$ be the grammar given implicitly by the productions of $P_k$. Then $P_k$ determines the grammar $G_k$ with the following derivations:
$S \Rightarrow^* x_i A_i y_i$, $A_i \Rightarrow^+ u_i A_i$ (or $A_i \Rightarrow^+ A_i u_i$), $A_i \Rightarrow^+ w_i$, $1 \le i \le k$,
where $alph(x_i y_i u_i w_i)$ are pairwise disjoint for different $i$. Since $L(G_k) = \bigcup_{i=1}^{k} L_i$, where $L_i \subseteq alph(x_i y_i u_i w_i)^+$, $LEV_{\mathcal{G}}(L(G_k)) \ge k$, according to Theorem 3.3. Since $PROD_{\mathcal{G}}(L(G_k)) \ge VAR_{\mathcal{G}}(L(G_k)) \ge LEV_{\mathcal{G}}(L(G_k))$, the proof is completed. □

If we restrict $\mathcal{G}$ to be a class of non-self-embedding linear grammars, then for $G$ a non-self-embedding linear infinite grammar form we obtain an infinite hierarchy for $HEI_{\mathcal{G}}$ and $DEP_{\mathcal{G}}$ on $\mathcal{L}_x(G)$, too.

Theorem 5.3. Let $G$ be a non-self-embedding infinite linear grammar form. Let $\mathcal{G}$ be a class of non-self-embedding linear grammars such that $\mathcal{L}_x(G) \subseteq \mathcal{L}(\mathcal{G})$, where $x \in \{g, s\}$. Let $K \in \{HEI, DEP\}$. Then $K_{\mathcal{G}}$ is unbounded on $\mathcal{L}_x(G)$.

Proof (sketch). The theorem can be proved by constructing the languages $M_k^{(1)}$, $M_k^{(+)}$ using similar methods and arguments as in Theorem 5.1.

References

[1] W. Brauer, On grammatical complexity of context-free languages (extended abstract), in: Proc. of MFCS'73, 191-196.
[2] E. Csuhaj-Varjú, A connection between descriptional complexity of context-free grammars and grammar form theory, in: A. Kelemenová and J. Kelemen, eds., Trends, Techniques, and Problems in Theoretical Computer Science, Lecture Notes in Computer Science, Vol. 281 (Springer, Berlin, 1987) 60-74.
[3] E. Csuhaj-Varjú and J. Dassow, On bounded interpretations of grammar forms, Theoret. Comput. Sci. 87 (1991) 287-313.
[4] J. Gruska, On a classification of context-free grammars, Kybernetika 3 (1967) 22-29.
[5] J. Gruska, Some classifications of context-free languages, Inform. and Control 14 (1969) 152-179.
[6] A. Kelemenová, Grammatical levels of position restricted grammars, in: J. Gruska and M. Chytil, eds., Proc. of MFCS '81, Lecture Notes in Computer Science, Vol. 118 (Springer, Berlin, 1981) 347-359.
[7] A. Kelemenová, Grammatical complexity of context-free languages and normal forms of context-free grammars, in: P. Mikulecký, ed., Proc. of IMYCS '82 (Bratislava, 1982) 239-258.
[8] A. Kelemenová, Complexity of normal form grammars, Theoret. Comput. Sci. 28 (1984) 288-314.
[9] A. Kelemenová, Structural complexity measures on grammar forms, in: Proc. Conf. on Automata, Languages and Programming Systems, Salgótarján, DM 88-4 (Budapest, 1988) 73-76.
[10] A. Salomaa, Formal Languages (Academic Press, New York, 1973).
[11] D. Wood, Grammar and L Forms: An Introduction, Lecture Notes in Computer Science, Vol. 91 (Springer, Berlin, 1980).