Syntax 2:1, April 1999, 1–27

CYCLIC COMPUTATION, A COMPUTATIONALLY EFFICIENT MINIMALIST SYNTAX

John Frampton and Sam Gutmann

Abstract. We construct a completely cyclic Minimalist theory of syntactic derivations. A derivation consists of a sequence of cycles. Each cycle starts with the introduction of a new head and merger of the head's selected arguments, followed by satisfaction (via checking) of the head's removable features. The theory includes no acyclic devices such as lexical arrays or comparison of derivations. Satisfaction of features is accompanied by full category movement whenever it is not blocked by morphology or by constraints barring multiple specifiers. The Minimal Link Condition is viewed computationally and is naturally incorporated into Satisfy. Our precise notion of checking involves sets of features interacting in the same checking relation, and yields an account of successive cyclic movement, the distribution of expletives, the EPP, and quirky case phenomena. The paper can be read as empirical evidence that the core syntactic algorithm is computationally efficient.

1. Introduction

In the first period of generative grammar, the computation involved in generating grammatical derivations was straightforward. The allowable structural transformations were listed, and the computation was organized into a sequence of cycles. In each cycle, the computation went through the list, applying whatever transformations it could.

The GB period introduced a very different view of how grammatical derivations should be characterized. Structural transformations were no longer licensed by fitting a pattern on a list of possibilities, but by satisfying a list of constraints. If a structural transformation was not forbidden, it was well-formed. The problem with such an approach, as Chomsky recognized in the late 1980s, was that this view predicts that a structural transformation could take place without motive: it was licensed as long as no constraint forbade it.
The facts of language, however, seemed to indicate the opposite: that there was no superfluous movement.

* Portions of this work were presented in a course and at a colloquium at the University of California at Santa Cruz and at a workshop at Potsdam Universität. We thank those audiences for their questions, examples, and comments, which were invaluable in bringing this work to its present form. We thank the following people for helpful comments on an earlier draft of this paper: Cedric Boeckx, Sylvain Bromberger, Sandra Chung, Chris Collins, Morris Halle, Jason Merchant, David Pesetsky, Gil Rappaport, Esther Torrego, and Charles Yang. We particularly thank Jonathan Bobaljik for detailed written comments which led to several improvements and helped us avoid at least one major blunder; Noam Chomsky for creating the problems to solve, for comments on an early outline of these ideas, and for a running commentary in his lectures over the past few years on the issue of computational complexity; and Bill Ladusaw for an insightful historical perspective. We also thank the following coffee houses for graciously providing temporary office space: Maxim (Newton), Panini (Somerville), Bentonwood (Newton), and Seattle's Best Coffee (Brookline and Newton).

© Blackwell Publishers Ltd, 1999. Published by Blackwell Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA

The outcome was Chomsky's idea that a principle of "least effort" was involved. That is, derivations were trying to accomplish something, and they did it in the most economical way. Instead of a list of constraints, the emphasis shifted to understanding what derivations had to accomplish and how the effort of various ways of carrying this out could be compared. The theory was based on Move-α, with Least Effort viewed as a constraint on derivations. Major advances were made in understanding what derivations had to accomplish.
Chief among these advances was Chomsky's theory of feature checking and the idea that derivations are driven by the need to check a certain subclass of features in the course of the derivation.

Least Effort, however, is a highly non-cyclic view of syntactic computation: structural change in one cycle can be blocked because a structural change in a later cycle could accomplish the same objective with less effort. This approach proved increasingly untenable because of the difficulty of comparing the cost of various ways of accomplishing feature checking. In particular, the principle of Greed proved impossible to formulate satisfactorily. This led to a move away from the Move-α/Least Effort view of syntax. Increasingly, the power to regulate structural change was taken away from Least Effort and returned to cyclic principles. The shift to computation organized around Attract was the major step in this direction. Chomsky (1995), however, still retains significant reliance on non-cyclic computation, comparison of derivations in particular. This comparison is manifestly non-cyclic, and computationally complex.

It is far from clear that computational efficiency is either required or even expected of theories which purport to explain certain aspects of cognitive systems.[1] It is interesting, nevertheless, to see whether pursuing computationally efficient theories leads simultaneously to empirically successful theories; we believe that there is evidence that it does.

What we would like to do in this paper is to complete the return to the highly cyclic syntactic computation of the earliest period, while keeping the core insights of the Minimalist framework. In place of an ordered list of transformations, there is an unordered list of feature-checking relations. In each cycle, whatever structural transformations those checking relations license are carried out.
The system of checking relations, and the algorithm that specifies how checking relations are translated into structural transformations, play the central role in such a theory. This sets our investigation in its general context. It is also useful to sketch some of the details here, to help orient the reader to what follows.

Cyclic computation means that we will dispense with comparison of derivations as a computational device. We can therefore dispense with Chomsky's notion of "numeration," the initial array of lexical items assembled in advance of the derivation, which he needs in order to constrain the set of derivations which compete with each other with respect to Least Effort.[2] We will also dispense with Chomsky's weak/strong feature distinction, which he needed to force derivations to go against the tendency of Least Effort to lead to doing nothing at all. In the theory we propose, lexical items are introduced throughout the derivation. A derivation consists of a sequence of cycles of the following form:

(1) The Cycle
    a. (Select) A new lexical item is introduced. It selects its arguments and is merged with them.
    b. (Satisfy) The features of this newly introduced head are satisfied as fully as possible by checking, which induces overt movement whenever possible.

Following from the idea that the syntactic architecture is organized so that the computations are efficient, we assume that syntax constantly consults external systems. In (1a), for example, the θ-criterion is satisfied at insertion.

[1] See Kosslyn (1980:123) for a succinct argument that "some of the constraints on cognitive processing may not be those that foster the most computationally efficient processing for the tasks at hand." It would be a naive view of evolution to assume, a priori, that the brain is organized in such a way that computational efficiency for language processing is ensured.
In (1b), making "whenever possible" precise leads to distributed interaction with morphology to determine the possibility of head incorporation. See Frampton & Gutmann (1998) for a full discussion of verb raising in English in this framework.

Section 2 contains terminology and preliminaries, and in section 3 we briefly review Chomsky's theory of comparison of derivations. In section 4, we detail our proposals and specify the core algorithms, Select and Satisfy. Section 5 elaborates the feature checking involved in structural case and agreement. Section 6 presents a case-based account of the Chain Condition, the EPP, and expletive constructions.

[2] The acyclic character of the numeration device is worth noting: the task of assembling the lexical items which will be employed in the derivation is split off into a pre-cyclic step.

2. Preliminaries

To fix the terminology of the coming sections, we begin by outlining some of the technical apparatus we will use later. Most of the material in this section is not new, although the particular point of view we take and the particular terminology we adopt may be.

We assume that there is a feature set and a predicate removable defined on it. Removable features play a special role because derivations are organized around the elimination of removable features. We assume that there is a lexicon, which is a procedure of some kind yielding purely lexical items. A lexical item is a copy of a purely lexical item. It is taken to be a copy because it is necessary to distinguish two occurrences of the same underlying purely lexical item; the two occurrences are simply two different copies of the same purely lexical item. Lexical items are the raw material from which syntax builds phrases, representations, and derivations, as well as those complex words which are built syntactically.
Each lexical item has, as a property, a set of features.[3] A morphological item, also called a head, is either a lexical item or an object built by morphology. We assume that there is a morphology, which has the (recursive) capacity to build structures (morphological items) out of certain binary combinations of morphological items. We emphasize "certain combinations" because the internal workings of morphology, reflected in its combinatorial possibilities, will play a major role in what follows. Certain combinations will be possible and certain combinations will not be. This places constraints on the possibilities for overt head incorporation. We also assume that each head X, whether a lexical item or a structured morphological item, has an associated set of features, F(X), the features which enter into syntactic computation.[4]

The notion phrase is defined recursively. A phrase α is either a morphological item or a pair of phrases, with one of the phrases distinguished as the primary term of α. In the first case, the head of α is defined to be α itself. In the second case, the head of α is defined, recursively, to be the head of the primary term of α. Notationally, the ordered pair (β, γ) will be used to denote a phrase whose primary term is β. Note that there is no notion of linear order in the sense relevant to phonology. We take it to be the responsibility of the computation which determines phonological form to determine linear order in the phonological sense.

A representation is a list of phrases.[5] Given a representation Σ, we say that α is a phrase in Σ if α is on the list Σ or α is a subphrase of a phrase on the list Σ. Given Σ, we say that a morphological item X is a head in Σ if X is the head of some phrase in Σ. A derivation is a finite sequence of representations. A derivation (Σ0, …, Σn) is a well-formed derivation if all the steps Σk → Σk+1 are well-formed and no head in Σn has a removable feature.
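The recursive definitions above (phrase, head, representation, convergence) transcribe almost directly into code. The following is an illustrative Python sketch, not the authors' formalism; all names (`Head`, `make_phrase`, `converged`) are our own.

```python
# Illustrative transcription of the recursive definitions.
# A Head stands for a morphological item; its feature set is F(X).
# A phrase is either a Head or a pair (primary, other) of phrases.

class Head:
    def __init__(self, name, features):
        self.name = name
        self.features = set(features)   # F(X)

def make_phrase(primary, other):
    """(beta, gamma): a pair of phrases whose primary term is beta."""
    return (primary, other)

def head_of(alpha):
    """The head of alpha: alpha itself if it is a morphological item,
    otherwise (recursively) the head of its primary term."""
    if isinstance(alpha, Head):
        return alpha
    primary, _ = alpha
    return head_of(primary)

def all_phrases(representation):
    """Every phrase in the representation, including subphrases."""
    stack = list(representation)
    while stack:
        p = stack.pop()
        yield p
        if not isinstance(p, Head):
            stack.extend(p)

def converged(representation, removable):
    """A final representation is well-formed only if no head in it
    retains a removable feature."""
    return all(not (head_of(p).features & removable)
               for p in all_phrases(representation))
```

A derivation would then be a sequence of such representations, well-formed only if `converged` holds of the last one.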
The well-formed steps are determined constructively by specifying several transformational algorithms which take a representation as input and (if the input is suitable) yield another representation as output.[6] The well-formed steps are those resulting from the application of one of the transformational algorithms. In Chomsky's development of the theory, there are two algorithms, Merge and Attract. In our reworking of Chomsky's theory there are two algorithms as well, Select and Satisfy.

Chomsky assumes that the initial representation in a derivation already contains all the lexical material which will be used in the derivation. The initial representation is taken to be a list of lexical items. (This is equivalent to his notion of numeration.) We assume that the initial representation is empty and that lexical insertion occurs in the course of the derivation, eliminating the pre-derivational construction of the initial representation in Chomsky's theory.

[3] It could be that lexical items have no other property; that is, they are simply unstructured sets of features. All structured words would then be formed in the syntax. We tend to think that the lexicon produces structured items as well, but the resolution of this question is not relevant to the present work.

[4] We ignore the question of precisely how morphology determines which features of compound words are externally available to the syntax.

[5] In fact, we do not use the list structure of a representation at any point. It would suffice to assume that a representation is a set of phrases, doing away with the ordering inherent in lists. Conceptually and computationally, however, lists are more natural than sets.

[6] This is an oversimplification. The algorithms are non-deterministic, allowing the possibility that different outcomes can result from applying a given algorithm to a given representation.

3. Optimal Derivations

In Chomsky's early work on what has become the Minimalist Program, notions of optimality (called "economy of derivation") played a major role. As the theory took on its present form, in "Bare Phrase Structure" (Chomsky 1994) and subsequent developments, the role of optimality was progressively circumscribed. Derivational economy principles were originally introduced as a way to formalize the intuition that there is no superfluous movement. As the theory developed, however, it became apparent that it was not possible to formulate a notion of a well-formed movement step except on the basis of a requirement that the movement step eliminate at least one feature. In effect, the fact that movement is never superfluous was built right into the definition of movement. Since removable features must be eliminated, eliminating a removable feature is never superfluous. Rather than being a consequence of economy principles, the absence of superfluous movement was now a consequence of the core architecture of the theory.

Chomsky (1995, chapter 4) takes significant steps towards limiting the earlier extensive role of economy of derivation, but retains a crucial optimality principle:

(2) (Merge before Attract) Suppose D = (Σ0, …, Σn, Σn+1, …, Σk) and D′ = (Σ0, …, Σn, Σ′n+1, …, Σ′k′) are two well-formed derivations. Then D is more economical than D′ if Σn → Σn+1 is an instance of Merge and Σn → Σ′n+1 is an instance of Attract.

Note that the two derivations above are assumed to be identical up to and including the representation Σn. Note also that since Chomsky assumes that all the lexical material is already present in the initial representation Σ0, lexical insertion does not compete with Merge or Attract in comparing derivations.

In practice, the elegance of (2) was compromised by the need to adopt the device of "strong features," a way of encoding instructions to override (2). A familiar example is overt verb raising in French.
Application of (2) would lead to postponing verb raising to covert movement, since there will always be merger operations which are preferred to verb raising (an instance of Attract) on economy grounds. Chomsky's solution was to suppose that some movement is governed by what he called strong features, which demand immediate elimination. In effect, the system balances between (2) and the imperatives of strong features, based on the initial array of lexical items which are available to the derivation.

Our main purpose in this paper is to complete the erosion of the role of optimality and to remove all appeal to comparison of derivations from the theory. In the process, we will remove the idea that the available lexical items are initially fixed for the derivation, as well as any appeal to the idea of strong features.

In order to understand where we are heading, it is good to have some idea of the consequences of (2) in Chomsky's development of the theory. Obviously, one of the main purposes of this paper will be to give alternate accounts of the phenomena which are explained by (2). Its explanatory power is surprisingly narrow. This fact itself should be an indication that there are conceptual problems: it is suspicious that an extremely powerful computational device (optimality) has only weak effects. The most striking evidence for (2) is the ingenious account it provides for (3).

(3) a. Someone_i seems t_i to be t_i in the room.
    b. There_i seems t_i to be someone in the room.
    c. *There seems someone_i to be t_i in the room.

The account is straightforward. In (3a), the EPP requirements of the embedded and matrix clauses can only be satisfied by raising someone. With all the available lexical material already present in the initial representation, there is no possibility of using an expletive to satisfy these requirements. In (3b) and (3c), the expletive there is present in the initial representation.
The derivations associated with both (3b) and (3c) are well-formed and have the same initial representation, but the derivation associated with (3c) employs Attract (raising someone) at the point where the derivation associated with (3b) employs Merge (merging in there). The derivation of (3c) is therefore well-formed but not optimal, yielding the contrast between (3b) and (3c). We will return to give an alternate account of (3) in section 6.

4. Select, Satisfy, and the Transformational Cycle

A lexical item has what we might call "needs" of two kinds. First, if it requires arguments, its selectional needs must be met. Selection and argument here are meant in the broadest sense, so that Infl selects VP, be selects complements of a certain type (but does not θ-select its complement), etc. Selection in this sense should not be confused with θ-selection. Second, a lexical item must have its removable features eliminated. These two needs are at the core of the two transformational algorithms.

4.1 Select

A purely lexical item w which selects is chosen from the lexicon and a copy X is generated.[7] If w has one argument, then either a phrase α is removed from the representation or a copy of a lexical item is generated, and (X, α) is appended to the representation. If w has two arguments, then phrases α1 and α2 are either taken from the representation or directly from the lexicon, and ((X, α1), α2) is appended to it. If w has more than two arguments, the process is the obvious extension of the case with two arguments.[8] Until the last cycle, the representation can be viewed as simply a holding area for complex (i.e., non-lexical) arguments that have already been built for heads that have not yet entered the derivation.
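As a rough illustration of Select as just described, here is a hedged Python sketch. The dict-based lexical items, the `arg_sources` parameter, and the function name are all our own scaffolding; the paper itself specifies only that each argument comes either from the representation (the holding area) or directly from the lexicon, and that merger makes X's projection the primary term.

```python
import copy

def select(representation, lexicon, word, arg_sources):
    """A sketch of Select (our rendering, not the authors' code).
    A fresh copy X of the purely lexical item `word` is generated
    and becomes the new pivot.  Each entry of `arg_sources` is
    either "rep" (pop an already-built phrase from the holding
    area) or the name of a lexicon entry to copy."""
    X = copy.deepcopy(word)         # a lexical item is a copy
    merged = X
    for source in arg_sources:      # one entry per selected argument
        if source == "rep":
            arg = representation.pop()
        else:
            arg = copy.deepcopy(lexicon[source])
        merged = (merged, arg)      # (X, a1), then ((X, a1), a2), ...
    representation.append(merged)
    return X                        # the pivot of the new cycle
```

The nesting `merged = (merged, arg)` reproduces the ((X, α1), α2) shape: each merger keeps X's projection as the primary term.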
This conception of how lexical items enter syntax builds selection into the core of the syntax, much as the θ-Criterion and phrase structure rules (which did the work of selection for functional heads) were originally imposed as a condition on so-called Deep Structure. The present framework has done away with Deep Structure, but Select imposes selection at the deepest possible level for each selecting head: the point at which the head enters the derivation. It should also be noted that satisfying selectional requirements at the point of entry into syntax is part of the general approach of distributed interface interaction. It significantly reduces computational complexity in the syntactic component by narrowing the range of possible derivations.

We call the head which Select introduces into the representation the pivot of the representation. Each application of Select changes the pivot.

4.2 Feature Checking

Before we can discuss Satisfy, which is driven by the imperative of eliminating removable features, we need to discuss feature checking itself. We assume that there are a certain number of checking relations. A checking relation is defined by a pair of sets of features, rather than simply by a pair of features, because in many checking relations it is combinations of features which enter into the relation, rather than single features.[9] Nominative case and agreement features, for example, appear to work together in checking finite inflection. We will take this up in detail in the next section. The conclusion we arrive at there is along the following lines:

[7] Two points are relevant. First, the distinction between w and X is equivalent to Chomsky's distinction between a lexical item and an indexed lexical item.
Second, we ignore the question of whether X is a strict copy or whether there is a more articulated mechanism which generates X by first making a copy of a lexical item and then adjoining removable features to the copy.

[8] We avoid the question of whether a lexical item can have more than two arguments.

[9] See section 5.3 below for some speculation on the extent to which checking relations are language independent.

(4) {+Nom, +φ} ⇔ {+Tensed, +Φ}

The notation ⇔ is used to emphasize the two-way character of checking. We will give empirical justification for this particular checking relation in the next section, but it is a useful example here to explain the mechanics of feature checking. We use the symbol +Φ to refer to the inflectional φ-features. The feature +Φ is a removable feature, while +φ is not. The + has no independent meaning; it is just a flag to announce that the symbol it precedes refers to a feature. Note that the relation ⇔, which we will call Checks, is symmetric, so that (4) is equivalent to {+Tensed, +Φ} ⇔ {+Nom, +φ}.

We can now turn to saying exactly what happens when feature checking is carried out between X and Y. Recall that F(Z) denotes the set of features of Z which enter into syntactic computation. Suppose, for example, that (4) holds. The features +φ and +Tensed are not removable features, so there is no question of eliminating them. The checking relation (4) has the following effect. If {+Nom, +φ} ⊆ F(Y) and +Φ ∈ F(X), then +Φ is removed from X. If {+Tensed, +Φ} ⊆ F(X) and +Nom ∈ F(Y), then +Nom is removed from Y. Note carefully that we do not require that all features mentioned in the checking relation be present. We do require that the full feature set on one side of the relation be present in order to eliminate a removable feature on the other side of the relation.
One way to look at this is to view the symmetric relation (5a) as splitting into the pair of unsymmetric relations given in (5b):

(5) a. {+Nom, +φ} ⇔ {+Tensed, +Φ}
    b. {+Nom, +φ} ⇒ +Φ
       +Nom ⇐ {+Tensed, +Φ}

It is not as revealing of the underlying logic, but it is more convenient to write the pair of relations in (5b) as (6):

(6) {+Nom, +φ} ⇒ +Φ
    {+Tensed, +Φ} ⇒ +Nom

This is equivalent to (5b) because the relation in (5a) is symmetric. The relation ⇒, which we will call Removes, is an unsymmetric relation between sets of features and removable features. Although Removes is a simpler relation than Checks, in the sense that it is easier to see what the effect of Removes is on feature elimination, it would miss an important generalization to assume that feature elimination is determined by unsymmetric relations between sets of features and removable features. This would fail to capture the symmetry of the two relations in (6): it is not accidental that the same removable features are involved on both sides of the relations in (6). In what follows, we will often use the relation Removes because feature removal plays such a central role in the theory. But it is important to remember that Removes ⇒ is a derived relation, derived from the fundamental relation Checks ⇔.

In order to simplify the discussion which follows, a definition is useful. We will say that a head X recognizes a head Y if a removable feature +f of X can be removed by a feature set μ, at least some part of which is contained in F(Y). More formally: X recognizes Y if there is a checking relation λ ⇔ μ and features +f ∈ F(X) ∩ λ and +g ∈ F(Y) ∩ μ such that μ ⇒ +f. If X recognizes Y, then feature checking between X and Y might be able to eliminate a removable feature of X, depending on the structural relation of X and Y and on whether μ is not merely partially but wholly contained in F(Y).
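The relations Checks and Removes, and the predicate "recognizes," can be rendered as a small Python sketch. The encoding is ours, not the paper's: ASCII names stand in for the feature symbols, with "+phi" the non-removable φ-set of the nominal and "+PHI" the removable inflectional φ-set.

```python
# Sketch (our encoding) of Checks, the derived Removes relation,
# and "recognizes", using checking relation (4) as the sole entry.

CHECKS = [({"+Nom", "+phi"}, {"+Tensed", "+PHI"})]   # relation (4)
REMOVABLE = {"+Nom", "+PHI"}    # +phi and +Tensed are not removable

def removes():
    """Derive the unsymmetric Removes relation of (6) from Checks:
    the full set on one side removes the removable features on the
    other side."""
    for lam, mu in CHECKS:
        for f in mu & REMOVABLE:
            yield lam, f            # {+Nom, +phi}    => +PHI
        for f in lam & REMOVABLE:
            yield mu, f             # {+Tensed, +PHI} => +Nom

def recognizes(FX, FY):
    """X recognizes Y if some removable +f of X can be removed by a
    set mu that shares at least one feature with F(Y)."""
    return any(f in FX and (mu & FY) for mu, f in removes())

def check(FX, FY):
    """All possible checking between X and Y, evaluated against
    snapshots so that both directions apply simultaneously.  A
    removable feature is deleted only if the FULL set on the other
    side of the relation is present."""
    fx0, fy0 = frozenset(FX), frozenset(FY)
    for mu, f in removes():
        if f in fx0 and mu <= fy0:
            FX.discard(f)
        if f in fy0 and mu <= fx0:
            FY.discard(f)
```

With relation (4), a tensed Infl and a nominative DP recognize each other, and a single application of `check` deletes +Φ from Infl and +Nom from the DP, leaving only non-removable features on each side.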
If X does not recognize Y, then there is no possibility that feature checking between X and Y can eliminate a removable feature of X. As we shall see shortly, the relation "recognizes" is important because it establishes locality in feature checking.

4.3 Satisfy

Satisfy is a complex transformational algorithm, which can be broken down into a number of substeps:

a. location of a head Y which the pivot X recognizes;
b. carrying out feature checking between X and Y; and
c. displaying the feature checking structurally.

Additionally, checking can have morphological side effects. We take this up later, in section 5.

Suppose we wish to apply Satisfy with a pivot X which has just selected the argument α, so that the phrase (X, α) has been built. (We take the one-argument case simply for expository convenience.) The first step is locating a head Y which X recognizes. If X has no removable features, this step will necessarily fail. If X does have at least one removable feature, there are two options. One option, just as with Select, is taking a Y directly from the lexicon. More precisely, a copy Y of a purely lexical item is generated such that X recognizes Y. Because Y does not enter the syntax via selection, this option is limited to expletive Y. The second (much more common) option is finding a Y in (X, α). The search is top-down. If a Y which X recognizes is found, the search does not go more deeply down into α. The major subtlety is specifying exactly what it means to say that Z is deeper in α than Y is, so that if X recognizes Y, then Y blocks access to finding a Z (which X recognizes) in the top-down search of α. This is the question of equidistance (see Chomsky 1995:298). We will sidestep this issue here, since the intervention effects which are relevant in this paper are not delicate.

It can be, of course, that the first step fails and no Y which X recognizes is found. In this case, the algorithm simply terminates.
No feature checking takes place, and the derivation either terminates or the next cycle begins, initiated by an application of Select.

Carrying out the second step, checking, is fairly straightforward. We will assume that all possible checking is carried out. This is important: without this assumption, we would not be able to account for Chain Condition effects. Consider:

(7) *Max_i appears [t_i is [t_i happy]].

It is not enough to require that Max remove some feature of the embedded Infl when it raises to the position of the intermediate trace. Its own removable features must be removed as well. Otherwise it could raise further and carry out checking in the matrix sentence, and an explanation for (7) would be lost.

It can be that no checking is possible: if X recognizes Y, there is no guarantee that feature checking can take place between X and Y. Unlike in the previous step, the cycle is not necessarily terminated. There may be a Z, distinct from Y, that X also recognizes (so that Z and Y are equidistant from X). Satisfy then attempts to carry out feature checking with Z.

The last step in the Satisfy algorithm is the most complex. The guiding principle is that, to the extent possible, feature checking is displayed (i.e., made overtly visible). This is accomplished by carrying out a structural transformation which corresponds with the checking. There are three possibilities:

(8) a. projecting a specifier of X;
    b. incorporation into X; and
    c. covert incorporation into X.

4.3.1 Projecting a Specifier of X

One way to display checking between the pivot X and a head Y is via the transformation:

(9) (X, α) → ((X, α), β)

Here β, the maximal projection of Y, becomes a specifier of X. (Recall that all specifiers and complements are written to the right.)
If (X, α) is not itself originally a maximal projection, then (9) must be understood as substituting the right-hand side for the left-hand side in the phrase on the representation list in which X occurs. This operation will build a multiple-specifier projection. To the extent that a language imposes restrictions on building multiple-specifier phrases, the availability of this option to Satisfy is restricted. Note that inherent in the specification of the Satisfy algorithm, because transformations are specified as transformations of (X, α) at their core, is the conclusion that transformations always build "inner specifiers."[10] If Y is an expletive generated by the lexicon, then (9) is simply merger of Y as a specifier of X. If Y is originally contained in α, it is movement to specifier, with associated trace and chain formation.

[10] See Richards (1997) and Mülders (1997).

4.3.2 Head Incorporation into X

This is the transformation:

(10) (X, α) → (X+Y, α)

Just as in instances of projecting a specifier via movement, the effect of the transformation (10) is to substitute the right-hand side for the left-hand side in the phrase on the representation list which X heads, and to replace Y by a trace of Y. The formation of X+Y is a morphological operation; X+Y is a complex word. The incorporation structure cannot be formed unless morphology can carry out the operation. Morphology therefore imposes a constraint on the ability of syntax to display feature checking overtly. This should be viewed as distributed interaction of the syntax with external systems, morphology in this case.

4.3.3 Covert Movement

In Frampton & Gutmann (1997) we assumed that if overt movement was not possible, then feature satisfaction was put off until after Spell-out. The idea was that after Spell-out, morphological constraints would no longer block head incorporation.
Here we take a different approach, in keeping with the guiding principle that syntactic operations are carried out strictly cyclically, and follow Yang (this issue) in assuming that feature satisfaction is carried out even if overt movement is impossible. This extension is based on a shift in point of view about the relation of feature checking and movement. Chomsky's early view was that movement was undertaken in order to bring heads into a local relationship so that checking could be carried out; the checking itself was assumed to take place under strict locality. Particularly with Chomsky's shift from viewing the basic transformational operation as Move to viewing it as Attract, there is a peculiar twist to this logic. Why should strict locality be necessary for checking? Locality is generally required to establish syntactic relationships. But in the Attract framework, the relationship must first be recognized long-distance by feature matching before any local relationship is established. The local relationship which results from movement comes after the fact of establishing the feature-matching relationship.

Rather than assuming that movement is required for checking to take place, we assume that movement is a consequence of checking. Under some circumstances, overt movement will be impossible. But this does not make checking impossible. It simply prevents overt syntax from faithfully mirroring the underlying feature checking which is driving the derivation.

Is there any syntactic reflection of feature checking in this case? Suppose that Satisfy applies to (X, α) by carrying out feature checking with some Y in α.
Yang (see this issue), following Chomsky (1995), supposes that there is adjunction of the formal features of Y to X, but there are unresolved questions about this approach which make us hesitant to adopt it wholesale.[11] The issues are technical and would divert the discussion, so we leave them for future work. It is not completely obvious that any syntactic effect of feature checking other than eliminating removable features is necessary, but there are indications that there must be some syntactic residue. At issue is the hierarchical position of Y with respect to further feature checking. There are some reasons for believing that this position is the position of X. This involves the exact formulation of the locality conditions on feature checking, a question of intervening features, which we have sidestepped. Movement of formal features is one way to alter the hierarchical relationships, but there are other approaches which might accomplish the same end.[12] If we use the notation X(+Y) to indicate the transformed head X, either X with the formal features of Y adjoined or some other device, the transformation of (X, α) is:

(11) (X, α) → (X(+Y), α)

We assume that feature checking is displayed overtly if possible. We do not know of any examples in which both head incorporation and movement to specifier offer real alternative derivations. There are examples in which there is an option of which checking to carry out first, which has consequences for the order in which overt operations are carried out, but the derivations are equivalent. The assumption that feature checking must be displayed overtly, if possible, raises some difficult questions that we do not have completely satisfactory answers for. Consider the D–N relationship, for example. Translating Longobardi's (1994) work to a feature-driven framework, suppose determiners have an N*-feature which can be removed by an N. (See the next section for the precise checking relation involved.)
If the N cannot incorporate into D, what blocks displaying N*-feature checking by moving the NP complement of D to Spec(D)? Consider, for example, a simple DP (ignoring possible case features):

(12)  the        man
      +D, +N*    +N

[11] The nature of the chains created by this kind of movement is particularly obscure. Such a chain would have its head adjoined to X, but its tail would be an unstructured subset of the feature set of Y.

[12] We take these issues up in a paper (in preparation) on dative-nominative constructions in Icelandic.

The N*-feature of the determiner will look for a +N and feature checking will be carried out. We assume that this will be displayed either by head incorporation or movement to specifier, if either is possible. What blocks movement to specifier?

(13) [the man] → [man [the t]]

We can note that this problem is not unique to the framework we are developing here. Within a framework that codes overt movement by making a distinction between weak and strong attractors, it can be stipulated that the N*-feature is weak. But that is not really a solution to the problem. Within the present framework, we can stipulate that determiners, excepting possessives, do not permit a specifier. It is a similarly unrevealing stipulation.

4.4 The Cycle

We have already detailed the workings of the cycle, but it is worth reviewing. Select applies, then Satisfy applies as many times as it can, then the cycle is repeated. Note that each application of Satisfy eliminates at least one removable feature, so that it cannot apply indefinitely. A metaphor may be appropriate. One can imagine that removable features increase what we can think of as the "tension" of the system, i.e., the current representation. Selection generally raises the tension. Before further selection, the system relieves the tension as much as possible by its own internal devices, applications of Satisfy.
Like a mechanical system, it cannot see into the future and does only what produces an immediate relaxation of tension. It cannot look ahead and transform itself with possible future reductions of tension in mind.

5. Structural Case

Quirky case constructions in Icelandic illustrate the connection between nominative case checking and agreement clearly.[13] In finite clauses in Icelandic in which there is no nominative case assignment, Infl has an invariant 3rd person singular form, independent of the φ-features of the nominal in its specifier. We follow the standard view and take it to be default agreement, the morphological form assigned to tensed verbs lacking morphological φ-features. Consider a standard example (with Dat used in the gloss to indicate dative case and dflt to indicate default agreement):

[13] This section relies heavily on the work of Halldór Sigurðsson. He has carefully and insightfully investigated the key phenomena in a series of papers (see in particular Sigurðsson 1996, and references contained therein).

(14) a. Hann hjálpaði okkur.
        he helped us(Dat)
        'He helped us.'
     b. Okkur_i var/*vorum hjálpað t_i.
        us(Dat) was(dflt)/were(1pl) helped t
        'We were helped.'
     c. Þeim_i var/*voru hjálpað t_i.
        them(Dat) was(dflt)/were(3pl) helped t
        'They were helped.'

The verb selects for a dative case object. The object in the passives (14b,c) moves to Spec(Infl) for EPP reasons. (We will return in section 6 to discuss the EPP in much more detail.) Note that there is no subject-verb agreement in (14b,c). The agreement facts above contrast with parallel cases in which there is no selection for inherent case and there is nominative case checking by Infl. The following are simple passives with a verb which checks structural accusative case in the active form.

(15) a. Við aðstoðuðum þá.
        we(Nom) aided(1pl) them(Acc)
        'We aided them.'
     b. Þeir_i voru aðstoðaðir t_i.
        they(Nom) were(3pl) aided t
        'They were aided.'

Even if nominative case appears on a postverbal argument, with covert nominative case checking, there is overt agreement with Infl. The following are from Sigurðsson (1996). The verb in (16a) and (17a) is of the familiar variety which checks the accusative case of its object in active voices. The verb in (16b) and (17b) assigns inherent dative case to its object.

(16) a. Bækurnar_i voru lesnar t_i.
        the-books(Nom) were(3pl) read t
        'The books were read.'
     b. Bókunum_i var skilað t_i.
        the-books(Dat) was(dflt) returned t
        'The books were returned.'

(17) a. Það voru lesnar fjórar bækur.
        there were(3pl) read four books(Nom,pl)
     b. Það var skilað fjórum bókum.
        there was(dflt) returned four books(Dat,pl)

There is a two-way implication: nominative case checking induces agreement; agreement requires nominative case. It might be possible to argue that in English this simply reflects a morphological fact about Infl, that inflectional agreement features always accompany finiteness. It could then be claimed that the coincidence of agreement and nominative case assignment is just a reflection of a morphological coincidence. But this line of reasoning does not hold up for Icelandic, because quirky subjects appear with finite Infl without any overt agreement morphology. The feature checking mechanism proposed in the last section was formulated with this kind of feature interaction in checking relations in mind. We could express the facts above by assuming the following checking relation:

(18) {+φ, +Nom} checks {+φ*, +Tensed}

The structure of the theory, however, will be much clearer if we refine (18) in several ways. In the first place, we must understand correctly the role of +φ* in the theory. For various reasons (the Minimal Link Condition, in particular) it cannot be that +φ* is specialized to particular φ-features.
This would correspond to a feature system in which a 3pl inflectional head would not recognize a 1sg nominal. But as far as intervention effects are concerned, the particular φ-features of intervening heads appear to be irrelevant. What is relevant is only that the intervening head have φ-features, i.e., that it is +N. Then (18) can be written as:

(19) {+N, +Nom} checks {+N*, +Tensed}

If syntactic features do not directly encode morphological agreement, how then is morphological agreement established? We suppose that syntactic feature checking can have morphological side effects. Restricting our attention to this particular checking relation, we assume that if {+N, +Nom} removes +N*, the φ-features of the N are copied onto the +N* head.[14] We will soon see examples in which other morphological features are copied onto the pivot. The idea of morphological side effects of syntactic checking can be extended to viewing structural case itself as a kind of agreement. The specific morphological form agrees with the syntactic type of the case checker: +Nom is a reflection of +Tensed, and +Acc is a reflection of +V. Parallel with the reduction of the various φ-features of Infl to the single +N*, we suppose that there is a unitary structural case feature +Kφ rather than specific structural case features +Nom and +Acc. (For ease of exposition, we restrict the options to these two structural cases, ignoring structural genitive and other possibilities.) We can then, finally, express the basic structural case checking relation as:

(20) {+N, +Kφ} checks {+N*, +Tensed/+V}

[14] Special thanks to Morris Halle for discussion of the nature of inflectional φ-features.

Agreement is then a process of guaranteeing that the morphological φ-features on the +N* head agree with the φ-features of the +N head, and the morphological case feature of the +N head agrees with the type of the +N* head.
That is, if {+N, +Kφ} removes +N*, the φ-features of the N are copied onto the +N* head. We now need to specify the morphological consequences of, for example, {+N*, +Tensed} removing +Kφ. There is a fundamental asymmetry because, empirically, the derivation does not appear to wait until this checking operation to fix the morphological case feature (+Nom) of the nominal. Consider, for example, past participle agreement in Icelandic. There is agreement with the past participle for case, number, and gender. The following paradigm is from Andrews (1982:445).

(21) a. Hún er vinsæl.
        she(Nom) is popular
     b. Þeir segja hana vera vinsæla.
        they(Nom) say her(Acc) to-be popular
     c. Hún er sögð vera vinsæl.
        she(Nom) is said(Nom) to-be popular
     d. Þeir telja hana vera sagða vera vinsæl.
        they(Nom) believe her(Acc) to-be said(Acc) to-be popular
     e. Hún er talin vera sögð vera vinsæl.
        she(Nom) is believed(Nom) to-be said(Nom) to-be popular

For the sake of clarity, the only inflection which is glossed is case on nominals and case agreement on past participles. If we assume that past participle agreement is established by means of an Infl_PP and the nominal passes cyclically through Spec(Infl_PP), it is most straightforward to assume that the nominal already has a morphological case feature when it is checked by Infl_PP.[*] Otherwise, the morphological case feature of the nominal could not be copied onto Infl_PP. Presumably, Infl_PP has an N*-feature which checks the nominal, with its Kφ-feature. Removal of +N* by {+N, +Kφ} applies and has the morphological side effect of copying morphological features to the past participle, including the case feature of the nominal. This strongly indicates that the morphological case feature must be part of the feature makeup of the nominal at insertion.
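To make the division of labor concrete, here is a toy Python rendering (ours; the dictionaries and function names are invented) of relation (20) and its two morphological side effects: copying under +N* removal, and, as discussed immediately below, mere verification under +Kφ removal:

```python
# A toy rendering (our own, with invented names) of checking relation (20),
#   {+N, +Kphi} checks {+N*, +Tensed/+V},
# and its morphological side effects: removing +N* copies the nominal's
# phi-features (and morphological case, as with Infl_PP participle
# agreement) onto the checker, while elimination of +Kphi merely verifies
# the case feature the nominal already carried at insertion.

def remove_N_star(checker, nominal):
    """{+N, +Kphi} removes the checker's +N*; side effect: copy morphology."""
    assert "+N*" in checker["removable"] and "+Kphi" in nominal["case"]
    checker["removable"].discard("+N*")
    checker["morph"]["phi"] = nominal["morph"]["phi"]
    checker["morph"]["case"] = nominal["morph"]["case"]  # participle agreement

def remove_K_phi(checker, nominal):
    """{+N*, +Tensed/+V} removes the nominal's +Kphi; side effect:
    verification only.  +Tensed verifies Nom, +V verifies Acc."""
    expected = "Nom" if "+Tensed" in checker["type"] else "Acc"
    if nominal["morph"]["case"] != expected:
        raise ValueError("morphologically ill-formed derivation")
    nominal["case"].discard("+Kphi")
```

The asymmetry between the two functions (copying versus verification) is the point; which features are removed in a given application depends on the feature makeup of the two heads.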
The morphological side effect of +Kφ elimination, which is carried out only at the head of the nominal chain, is simply verification that the morphological case feature agrees with the case-type of the head involved in eliminating +Kφ. If the verification fails, the derivation is morphologically ill-formed. Note that precisely the same checking relation (20) applies in the D–N configuration discussed earlier. The determiner has an N*-feature, but since D is never +Tensed or +V, the structural case feature of the N cannot be removed. The N*-feature of the determiner, however, can be removed, possibly with morphological consequences.[15]

[* Note added in proof: Chomsky (Fall '98 lectures) has proposed that phenomena like Icelandic past participle case agreement are the result of case assignment to the past participle by the higher case checker, rather than via agreement with the nominal as we proposed. If Chomsky's proposal is successful, there is no need for inserting nominals with specific morphological case features, and it is likely that our program of eliminating lookahead from the theory can be advanced even further than we have succeeded in doing here. It now seems possible that the notion "crash" loses relevance. Incomplete derivations may be defective, but this can be detected almost immediately.]

5.1 Cyclic Morphology

We assume that morphology is determined cyclically. By this we mean that at the end of each cycle, all morphology has been determined. The asymmetry (verification versus feature copying) in the morphological side effects of structural case checking makes sense under this view. If a pivot X checks a head Y, then (except for the special case of expletives) Y will already have been present in a previous cycle. Assuming "Cyclic Morphology," the morphology of Y must have already been determined in that cycle.
Any morphological side effects of the feature checking between X and Y cannot alter the morphology of Y. Verification of morphology is permitted, but morphological features of Y cannot be changed. The morphological features of X, however, can be altered. The checking is part of the cycle in which the morphology of X is determined. We explicitly depart here from Chomsky's assumption that the feature set of heads cannot be altered derivationally except by removal of features. But the departure is limited to alteration of the set of morphological features. Changes in the feature set are further limited by the assumption of cyclic morphology. We can further limit the possible changes in the feature makeup of heads by assuming that morphological feature changes are only possible as a consequence of an application of Satisfy and that they are restricted to the pivot of the transformation. They are further restricted to copying onto the pivot morphological features of the head which undergoes checking with the pivot. In contrast to removable features, which must be eliminated in the course of the derivation, morphological features are never eliminated. Note that the assumption of Cyclic Morphology is made possible by the assumption that all possible removable features of the pivot are eliminated cyclically. Consider a simple expletive construction:

(22) a. Bill thinks there is someone in the room.
     b. [there Infl(Pres)+be (be) someone in the room]

Infl must acquire φ-features on the cycle in which (22b) is built so that the morphology of Infl+be can be determined. If feature checking did not take place between someone and the embedded Infl on this cycle, the morphology of that Infl could not be fixed cyclically.

[15] It might seem odd that the N*-feature of D's is not distinguished from the N*-feature which inflection can bear. We can see no empirical issue.
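The cycle reviewed in section 4.4 can be caricatured as a terminating loop. The sketch below is our own (all names invented): each successful application of Satisfy removes at least one removable feature, so the loop must halt, and Cyclic Morphology is modeled by freezing the pivot's morphology when the loop exits:

```python
# A caricature (ours, with invented names) of one derivational cycle as
# reviewed in section 4.4, with the Cyclic Morphology assumption bolted on:
# Select introduces a pivot, Satisfy applies as many times as it can, and
# the pivot's morphology is fixed when the cycle ends.

def satisfy(pivot):
    """Toy Satisfy: each successful application eliminates at least one
    removable feature, so the loop in run_cycle must terminate."""
    if pivot["removable"]:
        pivot["removable"].pop()
        return True
    return False

def run_cycle(pivot):
    applications = 0
    while satisfy(pivot):          # relieve all the "tension" we can
        applications += 1
    pivot["morph_frozen"] = True   # Cyclic Morphology: no later cycle may
    return applications            # alter this head's morphology

pivot = {"removable": {"+N*", "+D*"}, "morph_frozen": False}
n = run_cycle(pivot)
```

The guarantee of termination is exactly the observation in section 4.4 that Satisfy "cannot apply indefinitely."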
5.2 Inherent Case

Sometimes, the lexical item selecting a nominal selects the case of that nominal as well. The case theory developed in the previous sections must now be extended to this situation. It is well known that various oblique objects in Icelandic, for example, behave as if they require some kind of structural licensing akin to structural case licensing. See, for example, Freidin & Sprouse (1991) and Sigurðsson (1991). Consider typical examples:

(23) a. Jóni_i var hjálpað t_i.
        Jon(Dat) was helped t
        'Jon was helped.'
     b. Jóni_i var taldir [t_i hafa verið hjálpað t_i].
        Jon(Dat) was believed t to-have been helped t
        'Jon was believed to have been helped.'
     c. Þeir telja [Jóni_i hafa verið hjálpað t_i].
        they believe Jon(Dat) to-have been helped t
        'They believe Jon to have been helped.'
     d. *[Jóni_i hafa verið hjálpað t_i] er mikilvægt.
        Jon(Dat) to-have been helped t is important

The dative-marked object of the verb can appear in subject position, as in (23a). It can raise in normal fashion, as in (23b). The issue is the contrast between (23c) and (23d). In spite of the fact that there is inherent dative case assignment in both cases (which we assume is a question of selection) and no evidence that Jon(Dat) is structurally case-marked, it must appear in a structural case position. On the basis of contrasts like that between (23c) and (23d), Freidin & Sprouse concluded that the nominal, even though it does not require structural case assignment (since it has already been assigned dative case lexically), requires some kind of structural licensing, and further, that such structural licensing is only available in positions of structural case assignment. There has been considerable discussion in the literature on the question of the possibility of quirky subjects bearing covert structural case (nominative in (23a) and (23b), accusative in (23c)).
Various proposals have been advanced trying to bring these constructions in line with the standard ideas about case marking.[16] In spite of the fact that the subject of the embedded sentence in (23b,c,d) has dative case, it must appear in a structural case position, either as the subject of a tensed clause, as in (23b), or in an ECM position, as in (23c).

[16] The idea, we believe, originates with Chomsky (1986). See Schütze (1993) for a recent treatment.

There is a fundamental difference between licensing inherently case-marked arguments and the form of structural case discussed earlier. There is no morphological agreement. The morphological case is invariant (independent of the type of the case checker) and no morphological agreement features are copied to the case checker. Consider, for example, the Icelandic examples which follow:

(24) a. Stúlkan kyssti drengina.
        the-girl(Nom,fem,sg) kissed(3,sg) the-boys(Acc,masc,pl)
        'The girl kissed the boys.'
     b. Drengirnir voru kysstir.
        the-boys(Nom,masc,pl) were(3,pl) kissed(Nom,masc,pl)
        'The boys were kissed.'
     c. Jón hjálpaði okkur.
        Jon(Nom) helped(3,sg) us(Dat)
        'Jon helped us.'
     d. Okkur var hjálpað.
        us(Dat) was(dflt) helped(dflt)
        'We were helped.'

We can easily capture this pattern in the present system by supposing that there is a second structural case feature, +K, which behaves syntactically exactly as +Kφ does, but differs from +Kφ in 1) not triggering morphological agreement and 2) satisfying different selectional restrictions than +Kφ. We assume:

(25) {+N, +Kφ/+K} checks {+N*, +Tensed/+V}

We assume that +K checking has no morphological side effects, either in verifying morphological case or in copying morphological features to the case checker. Selectionally, we assume that arguments whose case is selected must have a K-feature and arguments whose case is not selected must have a Kφ-feature.
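The contrast between the two structural case features can be put in schematic form. The following sketch is ours (invented names, toy feature sets): +Kφ checking copies φ-features onto the checker, while +K checking licenses the nominal with no morphological side effect, leaving default agreement on finite Infl, as in (14b) versus (15b):

```python
# A sketch in our own notation of relation (25),
#   {+N, +Kphi/+K} checks {+N*, +Tensed/+V}.
# +Kphi checking copies phi-features onto the checker (agreement); +K
# checking, borne by inherently case-marked arguments, is syntactically
# parallel but has no morphological side effects, which is why Icelandic
# quirky subjects co-occur with default (3sg) finite morphology.

DEFAULT = ("3", "sg")   # dflt: the form of Infl lacking phi-features

def check_nominal(infl, nominal):
    """Remove Infl's +N* together with the nominal's structural case."""
    infl["removable"].discard("+N*")
    if "+Kphi" in nominal["case"]:
        nominal["case"].discard("+Kphi")
        infl["phi"] = nominal["phi"]     # morphological side effect: agreement
    elif "+K" in nominal["case"]:
        nominal["case"].discard("+K")    # licensing only: no copying
    return infl

# cf. (14b): quirky dative subject, Infl surfaces with default agreement
quirky = {"case": {"+K"}, "phi": ("1", "pl")}
infl1 = check_nominal({"removable": {"+N*"}, "phi": DEFAULT}, quirky)

# cf. (15b): nominative subject, Infl agrees
nom = {"case": {"+Kphi"}, "phi": ("3", "pl")}
infl2 = check_nominal({"removable": {"+N*"}, "phi": DEFAULT}, nom)
```

The two calls mirror the two-way implication of section 5: agreement appears exactly when a +Kφ nominal is checked.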
Conceptually, there is a partial justification for the failure of +K checking to induce agreement. Agreement verifies morphological case. Arguments whose case is selected have already had their morphological case verified by the selector.

5.3 Language-Specific versus Language-Universal Case Features

We can speculate that the separation between the checking of structural case features and the morphological side effects of case checking coincides with the separation between universal case features and checking relations, on the one hand, and the language-particular ways that case is made manifest morphologically, on the other. Under this view, the checking relations (25) are universal. We discussed above various natural constraints on the morphological side effects of checking, the assumption of Cyclic Morphology in particular. It is natural to assume that language has wide latitude to express case checking morphologically within these constraints.

6. Expletives and the EPP

Let us first consider what English would be like if (25) were the extent of structural case checking. The main features of the case and EPP system follow from (25). An N*-feature on Infl will ensure that Spec(Infl) is filled overtly. Chain Condition (Chomsky 1986) effects follow directly as well. Once the structural case feature of an N is removed, +N* can no longer be removed by the N. Only +N in conjunction with a structural case feature removes +N*. Consider, for example:

(26) a. *John was believed [t is guilty]
     b.  was believed [ John  is guilty ]
                        +N    +N*, +Tensed

The structural case feature of John is removed by the {+N*, +Tensed} of Infl in the embedded clause. At the point where (26b) has been built, the matrix Infl cannot check John. The only removable feature which could be eliminated is +N*, and a structural case feature (+Kφ or +K) must cooperate with +N in removing +N*.
The possibility of cyclic NP movement follows as well if we assume that a raising Infl has an N*-feature, so that a raising Infl can be in a checking relation with an N bearing a structural case feature. But +N* on an untensed Infl (i.e., one lacking +Tensed) cannot remove +Kφ. Consider, for example:

(27) a. John is likely [t to be [t guilty]].
     b. Infl be [ John guilty ]
        +N*      +N, +Kφ
     c. John    Infl be [t guilty]
        +N, +Kφ

The raising Infl checks John, but +Kφ is not removed from John because no case-checking feature (i.e., neither +Tensed nor +V) cooperates with the N*-feature of Infl. On the other hand, +N* is removed from Infl, because +Kφ cooperates with +N in removing it. The matrix Infl can then check John, and (27a) results. The main descriptive inadequacy of such a system is that it does not permit expletive constructions. Consider a standard example:

(28) There are likely t to be several women in the room.

If case checking were restricted to (25), the matrix Infl would lose its N*-feature in checking with the expletive. We would need to assume that the expletive is +N and bears a structural case, or it would not be in a checking relation with the higher Infl. But then it would remove the N*-feature of the matrix Infl when it raised, and there would be no way for the postcopular nominal to lose its case feature. The existence of expletive constructions appears to imply that Infl has two distinct removable features. One is +N*, and the other must be in a checking relation with the expletive. Chomsky (1995) proposed that Infl has a D*-feature as well as an N*-feature (+φ* in his system). In the present terms, the checking relation that he proposed is {+D} checks {+D*}, which can be written more simply as +D removes +D*, because +D is not removable. If we suppose that Infl has both +N* and +D*, we are led to a system along the lines of Collins (1997). This system comes close to a descriptively adequate model.
Examples (26) and (27) go through with little change. A simple expletive construction, as in (28), receives a direct account. It is assumed that there has only the feature +D. In (28), the expletive satisfies the D*-feature of the embedded Infl when it is inserted in this position and satisfies the D*-feature of the matrix Infl when it raises (displaying checking with +D*) to the matrix clause. Covertly, the nominal raises first to the embedded Infl, satisfying its N*-feature and preserving its own Kφ-feature. The nominal then raises to the matrix Infl, satisfying the latter's N*-feature and losing its own Kφ-feature. Consider also:

(29) *There is likely that someone was in the room.

The expletive satisfies the D*-feature of the matrix Infl, but the N*-feature of the matrix Infl remains unsatisfied. The nominal someone in the embedded clause cannot raise to satisfy +N* because it has lost its Kφ-feature, which is needed for erasing +N*. The system falls short, however, in more subtle expletive constructions, including the crucial (3) from section 3, repeated here as (30a).

(30) a. *There is likely someone to be t in the room.
     b. *There is likely there to be someone in the room.

Neither of these examples is blocked. Attempts to account for these examples have taken different forms and have played a central role in the development of Minimalist theory. Chomsky originally introduced the notion numeration to account for (30a), and the idea of a numeration continues to play a prominent role in Minimalist grammar. Its persistence is due almost entirely, as far as we can see, to its role in explaining (30a). Collins (1997), correctly in our view, tries to do away with the numeration machinery, but does so only by stipulating a principle (called Extend Chain) to account for (30a) that is otherwise unmotivated and has very little theoretical scope. Chomsky uses the old idea of expletive replacement as an explanation for (30b). We discuss expletive replacement in section 6.2. Collins does not address the problem of (30b).

Why Infl should have both D* and N*-features is a mystery. A grammar with only +N* on Infl is only marginally different from a grammar with both +N* and +D* on Infl. The possibility of expletive-associate constructions, as we have seen, is the salient difference. We can first ask ourselves why Infl has an N*-feature in the first place. This is just the flip side of the question of why argument nouns bear structural case. If N's do have a structural case feature, some mechanism must be provided for removing the structural case feature, and +N* plays a key role in that mechanism. The reason for a D*-feature is not so clear. We will pursue the intuition that +D* arises for the same reason that +N* does, as a featural device for ensuring the erasure of a case feature. This leads directly to the idea that D's as well as N's bear a structural case feature. We will denote the structural case feature on D's by +KD. The obvious checking relation is (31b), entirely parallel to the structural case checking relations (25), which are repeated here as (31a).[17]

(31) a. {+N, +Kφ/+K} checks {+N*, +Tensed/+V}
     b. {+D, +KD} checks {+D*, +Tensed/+V}

Two further conclusions are more or less forced. First, it must be that expletive there bears +KD. Otherwise, there could not remove Infl's +D*. Recall that (31b) requires both +D and +KD to remove +D*. Second, it must be that the associate of the expletive is an NP, not a DP. We can see this by considering:

(32) There is someone in the room.

The expletive removes +D* from Infl. If someone were a DP, it would bear +KD, and there would be no +D* which could cooperate with +Tensed to remove it. If the nominal is a simple NP, the problem vanishes.

[17] It is conceivable that +K and +KD are the same feature. This depends partly on the analysis of pronouns. If pronouns are heads which are both +N and +D, our analysis does not go through if they bear a single +KD feature. It is awkward, and perhaps incoherent, for them to bear two identical features.
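As a bookkeeping caricature of relations (31a,b), the following sketch (ours; it flattens the derivation to a single head and invents all the names) shows why an expletive plus bare-NP associate converges, as in (32), while a DP associate strands features:

```python
# A bookkeeping caricature (ours, with invented names) of relations (31a,b):
#   (31a) {+N, +Kphi/+K} checks {+N*, +Tensed/+V}
#   (31b) {+D, +KD}      checks {+D*, +Tensed/+V}
# Expletive 'there' is {+D, +KD}; its associate must be a bare NP, since a
# DP associate would carry a +KD that no remaining +D* could remove.

def converges(infl, items):
    """True iff all removable features on a finite Infl and all structural
    case features on the items can be removed under (31a,b).  This
    flattens the derivation to a single head, purely for illustration."""
    for x in items:
        if "+D" in x["cat"] and "+D*" in infl["removable"] and "+KD" in x["case"]:
            infl["removable"].discard("+D*")
            x["case"].discard("+KD")
        elif "+N" in x["cat"] and "+N*" in infl["removable"] and "+Kphi" in x["case"]:
            infl["removable"].discard("+N*")
            x["case"].discard("+Kphi")
    return not infl["removable"] and all(not x["case"] for x in items)

# (32): expletive plus bare-NP associate converges
ok = converges({"removable": {"+N*", "+D*"}},
               [{"cat": {"+D"}, "case": {"+KD"}},      # there
                {"cat": {"+N"}, "case": {"+Kphi"}}])   # someone as an NP

# DP associate: its +KD survives, and Infl's +N* is never removed
bad = converges({"removable": {"+N*", "+D*"}},
                [{"cat": {"+D"}, "case": {"+KD"}},         # there
                 {"cat": {"+D", "+N"}, "case": {"+KD"}}])  # someone as a DP
```

The same bookkeeping, run over two expletives, reproduces the stranded KD-feature that rules out (30b).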
The conclusion that the associate of the expletive must be an NP is the core of an explanation of the Indefiniteness Effect.[18] The conclusion that a nominal like someone can be either an NP or a DP is more ordinary than it may seem. The internal structure of [someone]_DP is most reasonably realized as [d [someone]_NP], with a phonologically null determiner d. If this is indeed the structure, then the two varieties of someone are already implicit in someone as a DP.

[18] This argument is from Frampton (1996).

The problematic examples in (30) also receive an immediate and direct explanation. First, consider (30a). There are two possibilities: either someone is a DP or an NP. If it is an NP, then the D*-feature of the embedded Infl will not be removed. If it is a DP, then its KD-feature will not be removed, because the D*-feature of the matrix Infl is consumed when the expletive is introduced. Next, consider (30b). Both expletives will bear KD-features which must be removed. The KD-feature of the expletive in the embedded sentence will not be removed by a raising Infl. That Infl has a D*-feature, but it must cooperate with +Tensed to remove +KD. The matrix Infl will lose its D*-feature in checking the higher expletive, so there is no way for the lower expletive to lose its KD-feature. There is one further type of example which has proved troublesome for Minimalist theories, but which is elementary in the present framework.

(33) a. There was circulated a rumor that someone was in the room.
     b. A rumor that there was someone in the room was circulated.

In the framework developed above, (33) is routine.
In (33a), a rumor is an NP and someone is a DP. In (33b), a rumor is a DP and someone is an NP. In Chomsky's framework, (33) presents a puzzle. It is not at all clear why (33a) is permitted. The inventory of lexical items in the two derivations is identical. At some point in the derivation of (33a), [was someone in the room] has been built and the issue is raising someone. Recall that in that framework, Merge is always more economical than Attract. In this case, Merge has an expletive there available for merger (the there which is "intended" for the matrix sentence). The consequence should be that (33a) is not optimal.[19]

[19] Chomsky has suggested in past lectures that the problem could be overcome by introducing a variation on his notion of numeration, what he calls a restricted numeration. The idea is that at various points in the derivation, there is an option of adding additional lexical items to the numeration. Recent improvements in the theory (Chomsky, Fall 1997 lectures) do appear to allow a coherent explanation to go through along these lines.

The introduction of +KD alongside +Kφ also clarifies a residual puzzle for the proposed association of structural case and agreement. Why is it that raising infinitives do not show agreement? If Infl has an N*-feature, we would expect that cyclic NP movement of an N with +Kφ would induce agreement. The introduction of +KD clears this up. We can assume that a raising Infl does have a D*-feature, but does not have an N*-feature. Syntactically, a raising Infl is nothing else besides a D*-feature. Unlike +Kφ checking, +KD checking has no morphological side effects.

6.1 The Chain Condition in Alternate Minimalist Theories

The discussion above focused on showing that the Chain Condition can be derived from the proposals we have advanced. It is worth emphasizing that this is a non-trivial result. Chomsky (1995), for example, does not have an account of examples like the following:

(34) *[Bill to be believed t is guilty] is surprising.

Since Chomsky assumed that a raising Infl has a D*-feature, and that this is its only removable feature, and made no assumptions about feature interaction in checking, nothing blocks Bill from raising and eliminating Infl's D*-feature. In order to account for successive cyclic movement to Spec(Infl) with a raising Infl, Collins (1997:99) proposes that Infl has a removable feature +Tnull which has special checking properties. It checks structural case of any variety. In checking, however, +Tnull is eliminated, but the (removable) structural case feature that it checks is not eliminated. Under this assumption, (34) is correctly predicted. In general, the proposal is empirically adequate, but it has conceptual problems. In the first place, the Tnull-feature is otherwise unmotivated. It is not the feature responsible for EPP-driven movement, because expletives also undergo successive cyclic raising and they do not bear structural case. In the second place, there is no attempt to account for the exceptional behavior of +Tnull checking. Other than the structural case feature which enters into checking with +Tnull, removable features are assumed to be eliminated by checking. In fact, in the same work, because of these conceptual problems, Collins abandons his +Tnull proposal in favor of Chomsky's idea that a raising Infl has just a D*-feature (see Collins 1997:100). But we have already seen that Chomsky's proposal, while conceptually satisfying, is not sufficiently constrained to account for (34).

6.2 Expletive Replacement

The account of (35) in terms of KD-case is straightforward: the KD-features of the expletives are not eliminated.

(35) a. *[There to be believed that Jack has left] is surprising.
     b. *I regret [there to be believed that Jack has left].
Chomsky (1995) rules these out by the requirement that expletives cannot appear in the LF representation, together with a mechanism for expletive deletion. The idea is that if expletive there, a D in Chomsky's system, is brought into a suitable structural relation with an N, it is deleted. In the examples in (35), no such deletion occurs and the sentences are ruled out.

There are three major problems with the expletive deletion analysis. First, assuming that the D-feature of expletive there is a removable feature introduces a significant complication. Some distinction must be made between D-features which are removable and those which are not. Otherwise, the computation could not recognize one particular occurrence of +D as removable while a different occurrence of +D was considered not removable. The checking relations, at least under current assumptions, cannot distinguish between two occurrences of a particular feature depending on the context they occur in. The distinction would have to be encoded in some way. It certainly can be done, but only at a cost.20

Second, there are empirical problems. In English, it might be argued that it in (36) is either an expletive which is eliminated syntactically, or an argument whose apparent clausal complement is actually an adjunct.

(36) It seems that Bill left early.

In Icelandic, there are examples which make both possibilities dubious.

(37) a. Það glampaði á sverðið.
        it gleamed on the-sword
        'The sword gleamed.'
     b. Það var dansað.
        it was danced
        'There was dancing.'
     c. Það dimmir nú.
        it dims now
        'It is getting dark.'

The examples are taken from Andrews (1982:493). The expletive in the last example, (37c), is perhaps not so different from so-called weather-it in English.
But in (37a), in particular, it is hard to see what argument structure would make an argument-adjunct analysis possible, and it is equally hard to see how the expletive could be replaced.21

Third, in Chomsky's system, a non-pronominal expletive, phonological material aside, consists of nothing more than a single removable D-feature. If this feature is removed, the expletive becomes contentless. Either some coherence must be given to the idea that the interpretive system does not object to a contentless phrase, as opposed to a phrase with content but no interpretation, or some theory of phrasal deletion must be advanced. It is certainly possible to introduce suitably constrained phrasal deletion into the system. But it should be recognized as an added cost of this approach.

20 One scheme might be to suppose that the feature set of heads is not an undifferentiated set of features, but is intrinsically structured into removable features and permanent features. The lexicon would have the responsibility of assigning features to the appropriate category right from the start. This scheme, in effect, is a way to encode the context sensitivity of removability.

21 It is worth noting the similarity between (37a) and the Irish examples that McCloskey (1995) uses to argue against the universality of the EPP.

6.3 Intervention Effects (The Minimal Link Condition)

Now that some concrete examples of complex feature checking relations have been developed, we are in a position to understand the precise formulation of the application of Satisfy which was given in section 4.3.

(38) *Max_i appears [it is believed t_i to be guilty]

The intervening it has had its structural case feature removed, so it cannot move to the matrix Infl. Nevertheless, it must block movement of a more deeply embedded D which has both +Ku and +KD and could otherwise raise to the matrix Infl.
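The blocking configuration can be rendered as a toy search procedure. The sketch below is purely illustrative and uses hypothetical names; the feature sets and the single checking relation are simplified stand-ins for the checking relations and recognition condition of section 4.3.

```python
# Toy sketch of Satisfy's top-down search (hypothetical, simplified).
# Features are strings; a head is a (name, feature-set) pair; a checking
# relation is a (lambda, mu) pair of feature sets.

CHECKING_RELATIONS = [
    # Illustrative stand-in for the relation discussed in the text:
    # lambda = {+N, +Tensed}, mu = {+N, +Ku}.
    ({"+N", "+Tensed"}, {"+N", "+Ku"}),
]

def recognizes(pivot_feats, head_feats):
    """The pivot recognizes head Y if some removable feature of the pivot
    is in lambda of a checking relation lambda -> mu, and mu shares at
    least one feature with Y."""
    for lam, mu in CHECKING_RELATIONS:
        if pivot_feats & lam and mu & head_feats:
            return True
    return False

def search(pivot_feats, heads_top_down):
    """Top-down search: halt at the FIRST recognized head, whether or not
    checking can actually be carried out with it."""
    for name, feats in heads_top_down:
        if recognizes(pivot_feats, feats):
            return name
    return None

# Configuration (39): Infl appears [it is believed Max to be guilty].
# 'it' has lost its +Ku feature but still bears +N, so it is recognized
# and halts the search before the deeper head Max is reached.
infl = {"+N", "+Tensed", "+D"}
heads = [("it", {"+N"}), ("Max", {"+N", "+Ku", "+KD"})]
```

On this sketch, `search(infl, heads)` returns `"it"`: the search halts at the first head matching the target side of some checking relation, even though that head can no longer check anything, which is the intervention effect at issue.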
Our particular formulation of Satisfy accounts for this. Consider the relevant point in the derivation.

(39) Infl appears [it is believed Max to be guilty]

Infl does a top-down search for a head to eliminate one of its removable features, +D or +N. It stops when it finds a head it recognizes. Recall that Infl will recognize a head Y if there is a removable feature +f of Infl and a checking relation λ ⊳ μ such that (1) +f is in λ and (2) μ and Y share at least one common feature. In (39), Infl recognizes it. This follows from the fact that {+N, +Tensed} ⊳ {+N, +Ku} is a checking relation whose target side {+N, +Ku} shares the feature +N with it. Note that Infl recognizes it even though it is not able to eliminate a feature of Infl, because it has lost its +Ku feature. The top-down search therefore stops with it.

We can ask why the top-down search is limited in this particular way, so that it stops when a recognized Y is encountered, even though checking cannot be carried out with that Y and there is a deeper head with the needed feature structure. One could easily imagine alternatives. It could be, for example, that the search stopped only when a head was found which could actually enter into feature checking with the pivot. It is clear that the more an intervention condition restricts the depth of search, the easier the computation is, but the more difficult it is to construct well-formed derivations. This intervention effect can therefore be taken as evidence that the syntax is organized in such a way as to simplify the computation.

7. Language Design

In the introduction, we made it clear that we were not going to argue for eliminating comparison of derivations from the theory on the basis of the claim that only a theory with simple computations can be psychologically real. Instead, we adopted simplification of the computation involved in verifying derivations as a potential simplification of the theory, and therefore a potential improvement, and examined what this point of view led to.
There could have been major problems or complications in attempting to rework the Minimalist Program along these lines. Instead, the attempt led to what we believe is a simpler and more constrained theory with broader explanatory adequacy. Assuming for the moment that this is true, we need to ask what conclusions can be drawn from the fact that recasting the theory with simplification of computation as a guiding principle led to a better theory, by traditional measures of simplicity and explanatory adequacy.

A plausible conclusion is that the computations the theory uses to verify derivations are in some sense real, in that they correspond in some more or less direct way with actual mental computations. Language may also be designed in such a way that these computations, which of course must be consistent with the computational limits of the brain, do not include the complex device of comparing derivations.

References

ANDREWS, A. 1982. Case in Modern Icelandic. In The mental representation of grammatical relations, ed. J. Bresnan. Cambridge, Mass.: MIT Press.
CHOMSKY, N. 1986. Knowledge of language. New York: Praeger.
CHOMSKY, N. 1994. Bare phrase structure. MIT occasional papers in linguistics 5. MIT, Cambridge, Mass.
CHOMSKY, N. 1995. The minimalist program. Cambridge, Mass.: MIT Press.
COLLINS, C. 1997. Local economy. Cambridge, Mass.: MIT Press.
FRAMPTON, J. 1996. Expletive insertion. In The role of economy principles in linguistic theory, ed. C. Wilder, H.-M. Gärtner & M. Bierwisch. Berlin: Akademie Verlag.
FRAMPTON, J. & S. GUTMANN. 1997. Eliminating non-local computation in minimalist syntax. Ms., Northeastern University, Boston, Mass.
FRAMPTON, J. & S. GUTMANN. 1998. Distributed morphological interface and V-to-I raising. Ms., Northeastern University, Boston, Mass.
FREIDIN, R. & R. SPROUSE. 1991. Lexical case phenomena. In Principles and parameters in comparative grammar, ed. R. Freidin. Cambridge, Mass.: MIT Press.
KOSSLYN, S. 1980. Image and mind. Cambridge, Mass.: Harvard University Press.
LONGOBARDI, G. 1994. Reference and proper names. Linguistic Inquiry 25:609–666.
MCCLOSKEY, J. 1995. Subjects and subject positions in Irish. In The syntax of the Celtic languages, ed. R. Borsley & I. Roberts. Cambridge: Cambridge University Press.
MULDERS, I. 1997. Mirrored specifiers. To appear in Linguistics in the Netherlands.
RICHARDS, N. 1997. What moves where when in which language? Ph.D. dissertation, MIT, Cambridge, Mass.
SCHÜTZE, C. 1993. Towards a minimalist account of quirky case and licensing in Icelandic. In MIT working papers in linguistics 19, 321–375. Department of Linguistics and Philosophy, MIT, Cambridge, Mass.
SIGURÐSSON, H. 1991. Icelandic case-marked PRO and the licensing of lexical arguments. Natural Language and Linguistic Theory 9:327–363.
SIGURÐSSON, H. 1992. The case of quirky subjects. Working papers in Scandinavian syntax 49, 1–26.
SIGURÐSSON, H. 1996. Icelandic finite verb agreement. Working papers in Scandinavian syntax 57, 1–46.
YANG, C. 1999. Unordered merge and its linearization. Syntax 2:38–64.

John Frampton
Department of Mathematics
Nightingale Hall
Northeastern University
Boston, MA 02115
[email protected]

Sam Gutmann
Department of Mathematics
Nightingale Hall
Northeastern University
Boston, MA 02115
[email protected]