Syntax 2:1, April 1999, 1–27
CYCLIC COMPUTATION, A
COMPUTATIONALLY EFFICIENT
MINIMALIST SYNTAX
John Frampton and Sam Gutmann
Abstract. We construct a completely cyclic Minimalist theory of syntactic derivations.
A derivation consists of a sequence of cycles. Each cycle starts with the introduction
of a new head and merger of the head’s selected arguments, followed by satisfaction
(via checking) of the head’s removable features. The theory includes no acyclic
devices such as lexical arrays or comparison of derivations. Satisfaction of features is
accompanied by full category movement whenever it is not blocked by morphology or
constraints barring multiple specifiers. The Minimal Link Condition is viewed
computationally and is naturally incorporated into Satisfy. Our precise notion of
checking involves sets of features interacting in the same checking relation, and yields
an account of successive cyclic movement, the distribution of expletives, EPP, and
quirky case phenomena. The paper can be read as empirical evidence that the core
syntactic algorithm is computationally efficient.
1. Introduction
In the first period of generative grammar, the computation involved in
generating grammatical derivations was straightforward. The allowable
structural transformations were listed and the computation was organized into
a sequence of cycles. In each cycle, the computation went through the list,
applying whatever transformations it could. The GB period introduced a very
different view of how grammatical derivations should be characterized.
Structural transformations were no longer licensed by fitting a pattern on a
list of possibilities, but by satisfying a list of constraints. If a structural
transformation was not forbidden, it was well-formed. The problem with such
an approach, as Chomsky recognized in the late 1980s, was that this view
predicts that a structural transformation could take place without motive. It
was licensed if no constraint forbade it. The facts of language, however,
seemed to indicate the opposite, that there was no superfluous movement.
*Portions of this work were presented in a course and at a colloquium at the University of
California at Santa Cruz and at a workshop at Potsdam Universität. We thank those audiences for
their questions, examples, and comments, which were invaluable in bringing this work to its
present form. We thank the following people for helpful comments on an earlier draft of this
paper: Cedric Boeckx, Sylvain Bromberger, Sandra Chung, Chris Collins, Morris Halle, Jason
Merchant, David Pesetsky, Gil Rappaport, Esther Torrego, and Charles Yang. We particularly
thank Jonathan Bobaljik for detailed written comments which led to several improvements and
helped us avoid at least one major blunder; Noam Chomsky for creating the problems to solve,
for comments on an early outline of these ideas, and for a running commentary in his lectures
over the past few years on the issue of computational complexity; and Bill Ladusaw for an
insightful historical perspective. We also thank the following coffee houses for graciously
providing temporary office space: Maxim (Newton), Panini (Somerville), Bentonwood (Newton),
and Seattle’s Best Coffee (Brookline and Newton).
© Blackwell Publishers Ltd, 1999. Published by Blackwell Publishers, 108 Cowley Road, Oxford OX4 1JF, UK and
350 Main Street, Malden, MA 02148, USA
The outcome was Chomsky’s idea that a principle of ‘‘least effort’’ was
involved. That is, derivations were trying to accomplish something and they
did it in the most economical way. Instead of a list of constraints, the
emphasis was shifted to understanding what derivations had to accomplish
and how the effort in various ways of carrying this out could be compared.
The theory was based on Move-α, with Least Effort viewed as a constraint on
derivations. Major advances were made in understanding what derivations
had to accomplish. This was Chomsky’s theory of feature checking and the
idea that derivations were driven by the need to check a certain subclass of
features in the course of the derivation.
This is a highly non-cyclic view of syntactic computation. Structural change
in one cycle can be blocked because a structural change in a later cycle could
accomplish the same objective with less effort. This approach proved
increasingly untenable because of the difficulty in comparing the cost of
various ways of accomplishing feature checking. In particular, the principle of
Greed proved impossible to formulate satisfactorily. This led to a move away
from the Move-α/Least Effort view of syntax. Increasingly, the power to
regulate structural change was taken away from Least Effort and returned to
cyclic principles. The shift to computation organized around Attract was the
major step in this direction. Chomsky (1995), however, still retains significant
reliance on non-cyclic computation, comparison of derivations in particular.
This comparison is manifestly non-cyclic, and computationally complex.
It is far from clear that computational efficiency is either required or even
expected for theories which purport to explain certain aspects of cognitive
systems.1 It is interesting, nevertheless, to see if pursuing computationally
efficient theories leads simultaneously to empirically successful theories; we
believe that there is evidence that it does.
What we would like to do in this paper is to complete the return to the
highly cyclic syntactic computation of the earliest period, but keep the core
insights of the Minimalist framework. In place of an ordered list of
transformations, there is an unordered list of feature-checking relations. In
each cycle, whatever structural transformations those checking relations
license are carried out. The system of checking relations and the algorithm
that specifies how checking relations are translated into structural
transformations play the central role in such a theory.
This sets our investigation in its general context. It is also useful to sketch
some of the details here, to help orient the reader to what follows. Cyclic
Computation means that we will dispense with comparison of derivations as a
computational device. We can therefore dispense with Chomsky’s notion of
‘‘numeration,’’ the initial array of lexical items assembled in advance of the
1. See Kosslyn (1980:123) for a succinct argument that ‘‘some of the constraints on cognitive
processing may not be those that foster the most computationally efficient processing for the
tasks at hand.’’ It would be a naive view of evolution to assume, a priori, that the brain is
organized in such a way that computational efficiency for language processing is ensured.
derivation which he needs in order to constrain the set of derivations which
compete with each other with respect to Least Effort.2 We will also dispense
with Chomsky’s weak/strong feature distinction, which he needed to force
derivations to go against the tendency of Least Effort to lead to doing nothing at
all.
In the theory we propose, lexical items are introduced throughout the
derivation. A derivation consists of a sequence of cycles of the following
form:
(1) The Cycle
a. (Select) A new lexical item is introduced. It selects its arguments and is
merged with them.
b. (Satisfy) The features of this newly introduced head are satisfied as
fully as possible by checking, which induces overt movement whenever
possible.
Following from the idea that the syntactic architecture is organized so that the
computations are efficient, we assume that syntax constantly consults
external systems. In (1a), for example, the θ-criterion is satisfied at insertion.
In (1b), making ‘‘whenever possible’’ precise leads to distributed interaction
with morphology to determine the possibility of head incorporation. See
Frampton & Gutmann (1998) for a full discussion of verb raising in English
in this framework.
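The control flow in (1) can be sketched in a few lines of Python. Everything in this toy is our own illustrative encoding, not part of the theory: a head is a dict carrying the features it can check and its removable features, and checking is caricatured as two-way feature deletion between the pivot and heads already present.

```python
def run_cycle(representation, head):
    """One cycle: (1a) Select introduces the new head (the pivot);
    (1b) Satisfy carries out all possible checking with heads already
    present, modeled crudely as mutual deletion of removable features."""
    representation.append(head)                 # (1a) Select
    pivot = representation[-1]
    for other in representation[:-1]:           # (1b) Satisfy
        pivot["removable"] -= other["checks"]   # others check the pivot's features
        other["removable"] -= pivot["checks"]   # ...and vice versa (checking is two-way)
    return representation

# A two-cycle toy derivation over an initially empty representation,
# with abstract features "f" and "g" checking each other.
rep = []
run_cycle(rep, {"name": "someone", "checks": {"f"}, "removable": {"g"}})
run_cycle(rep, {"name": "Infl", "checks": {"g"}, "removable": {"f"}})
```

A derivation counts as well formed only when no head retains a removable feature, matching the definition given in section 2; after the second cycle above, both removable sets are empty.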
Section 2 contains terminology and preliminaries and in section 3 we
briefly review Chomsky’s theory of comparison of derivations. In section 4,
we detail our proposals and specify the core algorithms, Select and Satisfy.
Section 5 elaborates the feature checking involved in structural case and
agreement. Section 6 presents a case-based account of the Chain Condition,
the EPP, and expletive constructions.
2. Preliminaries
In order to fix some terminology which we will use in the coming sections,
we will begin by outlining some of the technical apparatus which we will use
later. Most of the material in this section is not new, although the particular
point of view which we take and the particular terminology we adopt may be.
We assume that there is a feature set and a predicate removable defined on
it. Removable features play a special role because derivations are organized
around the elimination of removable features.
We assume that there is a lexicon, which is a procedure of some kind
yielding purely lexical items. A lexical item is a copy of a purely lexical item.
It is taken to be a copy because it is necessary to distinguish two occurrences
2. The acyclic character of the numeration device is worth noting. The task of assembling the
lexical items which will be employed in the derivation is split off into a pre-cyclic step.
of the same underlying purely lexical item. The two occurrences are simply
two different copies of the same purely lexical item. Lexical items are the raw
material from which syntax builds phrases, representations, and derivations,
as well as those complex words which are built syntactically. Each lexical
item has, as a property, a set of features.3
A morphological item, also called a head, is either a lexical item or an
object built by morphology. We assume that there is a morphology, which has
the (recursive) capacity to build structures, morphological items, out of
certain binary combinations of morphological items. We emphasize ‘‘certain
combinations’’ because the internal workings of morphology, reflected in its
combinatorial possibilities, will play a major role in what follows. Certain
combinations will be possible and certain combinations will not be. This will
place constraints on the possibilities for overt head incorporation. We also
assume that each head X, whether a lexical item or a structured
morphological item, has an associated set of features, F(X), the features
which enter into syntactic computation.4
The notion phrase is defined recursively. A phrase α is either a
morphological item or a pair of phrases, with one of the phrases distinguished
as the primary term of α. In the first case, the head of α is defined to be α
itself. In the second case, the head of α is defined, recursively, to be the head
of the primary term of α. Notationally, the ordered pair (β, γ) will be used to
denote a phrase whose primary term is β. Note that there is no notion of linear
order in the sense relevant to phonology. We take it to be the responsibility of
the computation which determines phonological form to determine linear
order in the phonological sense.
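The recursive definitions of phrase and head can be stated directly in code. The following is a minimal sketch under our own encoding, which is only one of many possibilities: a morphological item is a bare string, and a complex phrase is a Python pair whose first term is the primary term.

```python
def head(phrase):
    """The head of a morphological item is the item itself; the head of a
    pair is, recursively, the head of its primary (first) term."""
    if isinstance(phrase, tuple):
        primary, _ = phrase
        return head(primary)
    return phrase

# ((X, a1), a2): X merges with two arguments and projects twice; X is the head.
assert head((("X", "a1"), "a2")) == "X"
```

Note that, as in the text, the pair encodes no linear order in the phonological sense; the first position marks only which term is primary.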
A representation is a list of phrases.5 Given a representation Σ, we say
that α is a phrase in Σ if α is on the list Σ or α is a subphrase of a phrase on
the list Σ. Given Σ, we say that a morphological item X is a head in Σ if X is
the head of some phrase in Σ.
A derivation is a finite sequence of representations. A derivation
Σ0, . . . , Σn is a well-formed derivation if all the steps Σk → Σk+1 are
well-formed and no head in Σn has a removable feature. The well-formed
steps are determined constructively by specifying several transformational
algorithms which take a representation as input and (if the input is suitable)
yield another representation as output.6 The well-formed steps are those
3. It could be that lexical items have no other property. That is, they are simply unstructured
sets of features. All structured words would then be formed in the syntax. We tend to think that
the lexicon produces structured items as well, but the resolution of this question is not relevant to
the present work.
4. We ignore the question of precisely how morphology determines which features of
compound words are externally available to the syntax.
5. In fact, we do not use the list structure of a representation at any point. It would suffice to
assume that a representation is a set of phrases, doing away with the ordering inherent in lists.
Conceptually and computationally however, lists are more natural than sets.
6. This is an oversimplification. The algorithms are non-deterministic, allowing the possibility
that different outcomes can result from applying a given algorithm to a given representation.
resulting from the application of one of the transformational algorithms. In
Chomsky’s development of the theory, there are two algorithms, Merge and
Attract. In our reworking of Chomsky’s theory there are two algorithms as
well, Select and Satisfy. Chomsky assumes that the initial representation in a
derivation already contains all the lexical material which will be used in the
derivation. The initial representation is taken to be a list of lexical items.
(This is equivalent to his notion of numeration.) We assume that the initial
representation is empty and that lexical insertion occurs in the course of the
derivation, eliminating the pre-derivational construction of the initial
representation in Chomsky’s theory.
3. Optimal Derivations
In Chomsky’s early work on what has become the Minimalist Program,
notions of optimality (called ‘‘economy of derivation’’) played a major
role. As the theory took on its present form, in ‘‘Bare Phrase Structure’’
(Chomsky 1994) and subsequent developments, the role of optimality was
progressively circumscribed. Derivational economy principles were
originally introduced as a way to formalize the intuition that there is no
superfluous movement. As the theory developed, however, it became
apparent that it was not possible to formulate a notion of a well-formed
movement step except on the basis of a requirement that the movement
step eliminate at least one feature. In effect, the fact that movement is never
superfluous was built right into the definition of movement. Since
removable features must be eliminated, eliminating a removable feature
is never superfluous. Rather than being a consequence of economy
principles, the absence of superfluous movement was now a consequence
of the core architecture of the theory. Chomsky (1995, chapter 4) takes
significant steps towards limiting the earlier extensive role of economy of
derivation, but retains a crucial optimality principle.
(2) (Merge before Attract) Suppose D = Σ0, . . . , Σn, Σn+1, . . . , Σk and
D′ = Σ0, . . . , Σn, Σ′n+1, . . . , Σ′k′ are two well-formed derivations. Then
D is more economical than D′ if Σn → Σn+1 is an instance of Merge and
Σn → Σ′n+1 is an instance of Attract.
Note that the two derivations above are assumed to be identical up to and
including the representation Σn. Note also that since Chomsky assumes that
all the lexical material is already present in the initial representation Σ0,
lexical insertion does not compete with Merge or Attract in comparing
derivations.
In practice, the elegance of (2) was compromised by the need to adopt the
device of ‘‘strong features,’’ a way of encoding instructions to override (2). A
familiar example is overt verb raising in French. Application of (2) would
lead to postponing verb raising to covert movement since there will always be
merger operations which are preferred to verb raising (an instance of Attract)
on economy grounds. Chomsky’s solution was to suppose that some
movement is governed by what he called strong features, which demand
immediate elimination. In effect, the system balances between (2) and the
imperatives of strong features, based on the initial array of lexical items
which are available to the derivation.
Our main purpose in this paper is to complete the erosion of the role of
optimality and to remove all appeal to comparison of derivations from the
theory. In the process, we will remove the idea that the available lexical items
are initially fixed for the derivation as well as any appeal to the idea of strong
features. In order to understand where we are heading, it is good to have some
idea of the consequences of (2) in Chomsky’s development of the theory.
Obviously, one of the main purposes of this paper will be to give alternate
accounts of the phenomena which are explained by (2). Its explanatory power
is surprisingly narrow. This fact itself should be an indication that there are
conceptual problems. It is suspicious that an extremely powerful
computational device (optimality) has only weak effects.
The most striking evidence for (2) is the ingenious account it provides for (3).
(3) a. Someone_i seems t_i to be t_i in the room.
b. There_i seems t_i to be someone in the room.
c. *There seems someone_i to be t_i in the room.
The account is straightforward. In (3a), the EPP requirements of the
embedded and matrix clause can only be satisfied by raising someone.
With all the available lexical material already present in the initial
representation, there is no possibility of using an expletive to satisfy these
requirements. In (3b) and (3c), the expletive there is present in the initial
representation. The derivations associated with both (3b) and (3c) are well-formed and have the same initial representation, but the derivation
associated with (3c) employs Attract (raising someone) at the point where
the derivation associated with (3b) employs Merge (merging in there). The
former derivation is therefore well-formed, but not optimal, yielding the
contrast between (3b) and (3c). We will return to give an alternate account
of (3) in section 6.
4. Select, Satisfy, and the Transformational Cycle
A lexical item has what we might call ‘‘needs’’ of two kinds. First, if it
requires arguments, its selectional needs must be met. Selection and
argument here are meant in the broadest sense, so that Infl selects VP, be
selects complements of a certain type (but does not θ-select its complement),
etc. It should not be confused with θ-selection. Second, a lexical item also
must have its removable features eliminated. These two needs are at the core
of the two transformational algorithms.
4.1 Select
A purely lexical item w which selects is chosen from the lexicon and a copy
X is generated.7 If w has one argument, then either a phrase α is removed
from the representation or a copy of a lexical item α is generated and (X, α) is
appended to the representation. If w has two arguments, then phrases α1 and
α2 are either taken from the representation or directly from the lexicon and
((X, α1), α2) is appended to it. If w has more than two arguments, the process
is the obvious extension of the case with two arguments.8 Until the last cycle,
the representation can be viewed as simply a holding area for complex (i.e.,
non-lexical) arguments that have already been built for heads that have not
yet entered the derivation.
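A sketch of Select, under a toy encoding of our own (phrases as nested tuples whose first term is primary, the representation as a Python list). Whether an argument is popped from the representation or drawn fresh from the lexicon is decided here simply by membership, which is an expository simplification:

```python
def select(representation, head, args):
    """Introduce a copy of a lexical item and merge it with its arguments:
    (X, a) for one argument, ((X, a1), a2) for two, and so on. Complex
    arguments already built are removed from the representation; the rest
    are taken to come directly from the lexicon."""
    phrase = head
    for arg in args:
        if arg in representation:
            representation.remove(arg)   # reuse a phrase built in an earlier cycle
        phrase = (phrase, arg)           # merge; the growing projection stays primary
    representation.append(phrase)        # the new head is now the pivot
    return phrase

rep = [("D", "N")]                       # a DP assembled in an earlier cycle
select(rep, "V", [("D", "N")])           # V selects the DP as its argument
```

Until the last cycle, the representation is exactly the "holding area" described above: after the call, `rep` contains the single phrase `("V", ("D", "N"))`.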
This conception of how lexical items enter syntax builds selection into the
core of the syntax in a way similar to the way that the h-Criterion and phrase
structure rules (which did the work of selection for functional heads) were
originally imposed as a condition on so-called Deep Structure. The present
framework has done away with Deep Structure, but Select imposes selection
at the deepest possible level for each selecting head, the point at which the
head enters the derivation. It should also be noted that satisfying selectional
requirements at the point of entry into syntax is part of the general approach
of distributed interface interaction. It significantly reduces computational
complexity in the syntactic component by narrowing the range of possible
derivations.
We call the head which Select introduces into the representation the pivot
of the representation. Each application of Select changes the pivot.
4.2 Feature Checking
Before we can discuss Satisfy, which is driven by the imperative of
eliminating removable features, we need to discuss feature checking itself.
We assume that there are a certain number of checking relations. A checking
relation is defined by a pair of sets of features.9 We take a checking relation
to be defined by a pair of sets of features, rather than simply by pairs of
features, because in many checking relations it is combinations of features
which enter into the relation, rather than single features. Nominative case and
agreement features, for example, appear to work together in checking finite
inflection. We will take this up in detail in the next section. The conclusion
we arrive at there is along the following lines:
7. Two points are relevant. First, the distinction between w and X is equivalent to Chomsky’s
distinction between a lexical item and an indexed lexical item. Second, we ignore the question of
whether X is a strict copy or there is a more articulated mechanism which generates X by first
making a copy of a lexical item and then adjoining removable features to the copy.
8. We avoid the question of whether a lexical item can have more than two arguments.
9. See section 5.3 below for some speculation on the extent to which checking relations are
language independent.
(4) {Nom, φ} ↔ {Tensed, *φ}
The notation ↔ is used to emphasize the two-way character of checking. We
will give empirical justification for this particular checking relation in the next
section. But it is a useful example here to explain the mechanics of feature
checking. We use the symbol φ to refer to the inflectional φ-features. The
feature *φ is a removable feature, while φ is not. The * has no independent
meaning. It is just a flag to announce that the symbol it precedes refers to a
removable feature. Note that the relation ↔, which we will call Checks, is symmetric, so
that (4) is equivalent to {Tensed, *φ} ↔ {Nom, φ}.
We can now turn to saying exactly what happens when feature checking is
carried out between X and Y. Recall that F(Z) denotes the set of features of Z
which enter into syntactic computation. Suppose for example that (4) holds.
The features φ and Tensed are not removable features, so there is no
question of eliminating them. The checking relation (4) has the following
effect. If {Nom, φ} ⊆ F(Y) and *φ ∈ F(X), then *φ is removed from
X. If {Tensed, *φ} ⊆ F(X) and Nom ∈ F(Y), then Nom is removed
from Y. Note carefully that we do not require that all features mentioned in
the checking relation be present. A full feature set on one side of the relation
is, however, needed to eliminate a removable feature on the other
side of the relation.
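The two removal conditions just stated can be written out as a short sketch, with Python sets standing in for F(X) and F(Y) and the string "*phi" as our rendering of the flagged removable feature. Both conditions are evaluated against the feature sets as they stood before any removal:

```python
def check(FX, FY):
    """Apply the checking relation (4), {Nom, phi} <-> {Tensed, *phi}:
    a full feature set on one side removes a removable feature on the other."""
    remove_phi = {"Nom", "phi"} <= FY and "*phi" in FX
    remove_nom = {"Tensed", "*phi"} <= FX and "Nom" in FY
    if remove_phi:
        FX.discard("*phi")    # Y's full {Nom, phi} removes X's *phi
    if remove_nom:
        FY.discard("Nom")     # X's full {Tensed, *phi} removes Y's Nom

FX = {"Tensed", "*phi"}       # finite Infl
FY = {"Nom", "phi", "D"}      # a nominative nominal
check(FX, FY)                 # mutual checking: *phi and Nom are both removed
```

The subset tests (`<=`) implement the requirement that a full feature set on one side is needed; mere overlap does not license removal.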
One way to look at this is to view the symmetric relation (5a) as splitting
into the pair of unsymmetric relationships given in (5b):

(5) a. {Nom, φ} ↔ {Tensed, *φ}
b. {Nom, φ} → *φ
Nom ⇐ {Tensed, *φ}

It is not as revealing of the underlying logic, but it is more convenient to write
the pair of relations in (5b) as (6).

(6) {Nom, φ} → *φ
{Tensed, *φ} → Nom
This is equivalent to (5b) because the relation in (5a) is symmetric.
The relation →, which we will call Removes, is an unsymmetric relation
between sets of features and removable features. Although Removes (→) is a
simpler relationship than Checks (↔), in the sense that it is easier to see what
the effect of Removes is on feature elimination, it would miss an important
generalization if we assumed that feature elimination is determined by
unsymmetric relations between sets of features and removable features. This
would fail to capture the symmetry of the two relations in (6). It is not
accidental that the same removable features are involved on both sides of the
relations in (6). In what follows, we will often use the relation Removes because
feature removal plays such a central role in the theory. But it is important to
remember that Removes (→) is a derived relation, derived from the
fundamental relation Checks (↔).
In order to simplify the discussion which follows, a definition is useful.
We will say that a head X recognizes a head Y if a removable feature f of X
can be removed by a feature set μ, at least some part of which is contained in
F(Y). More formally, X recognizes Y if there is a checking relation λ ↔ μ and
features f ∈ F(X) ∩ λ and g ∈ F(Y) ∩ μ, such that μ → f. If X recognizes Y, then
feature checking between X and Y might be able to eliminate a removable
feature of X, depending on the structural relation of X and Y and on whether or
not μ and F(Y) not only share a feature, but μ is contained in F(Y). If X does
not recognize Y, then there is no possibility that feature checking between X
and Y can eliminate a removable feature of X. As we shall see shortly, the
relation ‘‘recognize’’ is important because it establishes locality in feature
checking.
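The formal definition of "recognize" is a direct set computation. In the sketch below, which uses an illustrative encoding of our own, a checking relation is a pair of frozensets and the removable predicate is a set of feature names:

```python
# One checking relation, written as a pair of frozensets (lam, mu), with the
# convention that the full mu side removes removable features on the lam side
# and vice versa (Checks is symmetric).
RELATIONS = [(frozenset({"Nom", "phi"}), frozenset({"Tensed", "*phi"}))]
REMOVABLE = {"Nom", "*phi"}   # the `removable` predicate, as a set

def recognizes(FX, FY):
    """X recognizes Y if some removable feature in F(X) lies on one side of
    a checking relation whose other side shares at least one feature with F(Y)."""
    for lam, mu in RELATIONS:
        for side_x, side_y in ((lam, mu), (mu, lam)):   # try both orientations
            if FX & side_x & REMOVABLE and FY & side_y:
                return True
    return False

assert recognizes({"Tensed", "*phi"}, {"Nom", "phi"})   # Infl recognizes the nominal
assert not recognizes({"Tensed", "*phi"}, {"wh"})       # ...but not a bare wh-phrase
```

Note the asymmetry the text insists on: recognition requires only that F(Y) overlap the removing set μ, whereas actual removal additionally requires that μ be fully contained in F(Y).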
4.3 Satisfy
Satisfy is a complex transformational algorithm, which can be broken down
into a number of substeps. The steps are:
a. location of a head Y which the pivot X recognizes;
b. carrying out feature checking between X and Y; and
c. displaying the feature checking structurally.
Additionally, checking can have morphological side effects. We take this up
later in section 5.
Suppose we wish to apply Satisfy with a pivot X which has just selected
the argument α so that the phrase (X, α) has been built. (We take the one-argument case simply for expository convenience.) The first step is
locating a head Y which X recognizes. If X has no removable features, this
step will necessarily fail. If X does have at least one removable feature,
there are two options. One option, just as it is with Select, is taking a Y
directly from the lexicon. More precisely, a copy Y of a purely lexical item
is generated such that X recognizes Y. Because Y does not enter the syntax
via selection, this option will be limited to expletive Y. The second (much
more common) option is finding a Y in (X, α). The search is top-down. If a
Y which X recognizes is found, the search does not go more deeply down
into α. The major subtlety is specifying exactly what it means to say that Z
is deeper in α than Y is, so that if X recognizes Y, then Y blocks access to
finding a Z (which X recognizes) in the top-down search of α. This is the
question of equidistance (see Chomsky 1995:298). We will sidestep this
issue here, since the intervention effects which are relevant in this paper are
not delicate.
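The top-down character of the search can be sketched as a breadth-first walk over a phrase encoded as nested Python tuples (our own toy encoding). Equidistance is, as just noted, sidestepped: heads at the same depth are simply tried in order.

```python
from collections import deque

def find_goal(phrase, recognized):
    """Top-down search for the first head the pivot recognizes. Shallower
    heads are visited before deeper ones, so a recognized Y blocks any
    recognized Z that sits more deeply inside the phrase."""
    queue = deque([phrase])
    while queue:
        node = queue.popleft()
        if isinstance(node, tuple):
            queue.extend(node)         # descend one level into both terms
        elif recognized(node):
            return node
    return None

# DP2 (the higher nominal) is found before DP1, which is buried more deeply.
tree = (("Infl", ("V", "DP1")), "DP2")
assert find_goal(tree, lambda h: h.startswith("DP")) == "DP2"
```

If no recognized head exists, the function returns `None`, corresponding to the case in which Satisfy terminates without checking.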
It can be, of course, that the first step fails and no Y which X recognizes is
found. In this case, the algorithm simply terminates. No feature checking
takes place, and the derivation either terminates or the next cycle begins,
initiated by an application of Select.
Carrying out the second step, checking, is fairly straightforward. We will
assume that all possible checking is carried out. This is important. Without this
assumption, we would not be able to account for Chain Condition effects. Consider:
(7) *Max_i appears [t_i is [t_i happy]].
It is not enough to require that Max remove some feature of the embedded
Infl when it raises to the position of the intermediate trace. Its own removable
features must be removed as well. Otherwise it could raise further and carry
out checking in the matrix sentence and an explanation for (7) would be lost.
It can be that no checking is possible. If X recognizes Y, there is no
guarantee that feature checking can take place between X and Y. Unlike the
previous step, the cycle is not necessarily terminated. There may be a Z,
distinct from Y, that X also recognizes (so that Z and Y are equidistant from
X). Satisfy then attempts to carry out feature checking with Z.
The last step in the Satisfy algorithm is the most complex. The guiding
principle is that, to the extent possible, feature checking is displayed (i.e., made
overtly visible). This is accomplished by carrying out a structural transformation which corresponds with the checking. There are three possibilities:
(8)
a. projecting a specifier of X;
b. incorporation into X; and
c. covert incorporation into X.
4.3.1 Projecting a Specifier of X
One way to display checking between the pivot X and a head Y is via a
transformation:
(9) (X, α) → ((X, α), β)

Here β, the maximal projection of Y, becomes a specifier of X. (Recall that all
specifiers and complements are written to the right.) If (X, α) is not itself
originally a maximal projection, then (9) must be understood as substituting the
right-hand side for the left-hand side in the phrase on the representation list
which X occurs in. This operation will build a multiple specifier projection. To
the extent that a language imposes restrictions on building multiple specifier
phrases, the availability of this option to Satisfy is restricted. Note that inherent
in the specification of the Satisfy algorithm, because transformations are
specified as transformations of (X, α) at their core, is the conclusion that
transformations always build ‘‘inner specifiers.’’10 If Y is an expletive
10. See Richards (1997) and Mülders (1997).
generated by the lexicon, then (9) is simply merger of Y as a specifier of X. If Y
is originally contained in α, it is movement to specifier with associated trace and
chain formation.
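Transformation (9) can be sketched over the same kind of tuple-encoded phrase used informally above (an encoding of our own). The goal β is re-merged as a specifier, and, when it originated inside α, its launch site is replaced by a trace; the string "t" is a crude stand-in, since real chain formation is richer than this:

```python
def project_specifier(x_alpha, beta):
    """(X, alpha) -> ((X, alpha), beta): display checking by making beta a
    specifier of X, leaving a trace if beta was contained in alpha."""
    def strip(node):
        if node == beta:
            return "t"                       # trace at the launch site
        if isinstance(node, tuple):
            return tuple(strip(term) for term in node)
        return node
    return (strip(x_alpha), beta)

# Raising a DP out of the complement: (Infl, (be, DP)) -> ((Infl, (be, t)), DP)
assert project_specifier(("Infl", ("be", "DP")), "DP") == (("Infl", ("be", "t")), "DP")
```

When β is an expletive freshly generated by the lexicon, `strip` finds nothing to replace and the result is pure merger of β as a specifier, as in the text.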
4.3.2 Head Incorporation into X
This is the transformation:
(10) (X, α) → (X+Y, α)

Just as in instances of projecting a specifier via movement, the effect of the
transformation (10) is to substitute the right-hand side for the left-hand side in the
phrase on the representation list which X heads and to replace Y by a trace of Y.
The formation of X+Y is a morphological operation; X+Y is a complex
word. The incorporation structure cannot be formed unless morphology can
carry out the operation. Morphology therefore imposes a constraint on the
ability of syntax to display feature checking overtly. This should be viewed as
distributed interaction of the syntax with the external systems, morphology in
this case.
4.3.3 Covert Movement
In Frampton & Gutmann (1997) we assumed that if overt movement was not
possible, then feature satisfaction was put off until after Spell-out. The idea
was that after Spell-out morphological constraints would no longer block
head incorporation. Here, we take a different approach, in keeping with the
guiding principle that syntactic operations are carried out strictly cyclically,
and follow Yang (see this issue) in assuming that even if overt movement is
impossible, feature satisfaction is carried out.
This extension is based on a shift in point of view about the relation of
feature checking and movement. Chomsky’s early view was that movement
was undertaken in order to bring heads into a local relationship so that
checking could be carried out. The checking itself was assumed to take place
under strict locality. Particularly with Chomsky’s shift from viewing the
basic transformational operation as Move to viewing it as Attract, there is a
peculiar twist to this logic. Why should strict locality be necessary for
checking? Locality is generally required to establish syntactic relationships.
But in the Attract framework, the relationship must first be recognized long-distance by feature matching before any local relationship is established. The
local relationship which results from movement comes only after the
feature-matching relationship has been established.
Rather than assuming that movement is required for checking to take
place, we assume that movement is a consequence of checking. Under some
circumstances, overt movement will be impossible. But this does not make
checking impossible. It simply prevents overt syntax from faithfully
mirroring the underlying feature checking which is driving the derivation.
© Blackwell Publishers Ltd, 1999
Is there any syntactic reflection of feature checking in this case? Suppose that
Satisfy applies to (X, α) by carrying out feature checking with some Y in α.
Yang (see this issue), following Chomsky (1995), supposes that there is
adjunction of the formal features of Y to X, but there are unresolved
questions about this approach which make us hesitant to adopt it wholesale.11
The issues are technical and would divert the discussion, so we leave them for
future work.
It is not completely obvious that any syntactic effect of feature checking
other than eliminating removable features is necessary, but there are
indications that there must be some syntactic residue. At issue is the
hierarchical position of Y with respect to further feature checking. There are
some reasons for believing that this position is the position of X. This
involves the exact formulation of the locality conditions on feature checking,
a question of intervening features, which we have sidestepped. Movement of
formal features is one way to alter the hierarchical relationships, but there are
other approaches which might accomplish the same end.12 If we use the
notation X(Y) to indicate the transformed head X, either X with the formal
features of Y adjoined or some other device, the transformation of (X, α) is:

(11) (X, α) → (X(Y), α)
We assume that feature checking is displayed overtly if possible. We do
not know of any examples in which both head incorporation and movement to
specifier offer real alternative derivations. There are examples in which there
is an option of which checking to carry out first, which has consequences for
the order in which overt operations are carried out, but the derivations are
equivalent.
The assumption that feature checking must be displayed overtly, if
possible, raises some difficult questions that we do not have completely
satisfactory answers for. Consider the D-N relationship, for example.
Translating Longobardi's (1994) work to a feature-driven framework,
suppose determiners have an N*-feature which can be removed by an N. (See
the next section for the precise checking relation involved.) If the N cannot
incorporate into D, what blocks displaying N*-feature checking by moving
the NP complement of D to Spec(D)? Consider, for example, a simple DP
(ignoring possible case features):

(12) [the   man]
      D,N*   N
11 The nature of the chains created by this kind of movement is particularly obscure. Such a
chain would have its head adjoined to X, but its tail would be an unstructured subset of the
feature set of Y.
12 We take these issues up in a paper (in preparation) on dative-nominative constructions in
Icelandic.
The N*-feature of the determiner will look for an N, and feature checking
will be carried out. We assume that this will be displayed either by head
incorporation or by movement to specifier, if either is possible. What blocks
movement to specifier?
(13) [the man] → [man [the t]]
Note that this problem is not unique to the framework we are developing
here. Within a framework that codes overt movement by making a distinction
between weak and strong attractors, it can be stipulated that N* is weak.
But that is not really a solution to the problem. Within the present framework,
we can stipulate that determiners, excepting possessives, do not permit a
specifier. It is a similarly unrevealing stipulation.
4.4 The Cycle
We have already detailed the workings of the cycle, but it is worth reviewing.
Select applies, then Satisfy applies as many times as it can, then the cycle is
repeated. Note that each application of Satisfy eliminates at least one
removable feature, so that it cannot apply indefinitely.
A metaphor may be appropriate. One can imagine that removable features
increase what we can think of as the ‘‘tension’’ of the system, i.e., the current
representation. Selection generally raises the tension. Before further
selection, the system relieves the tension as much as possible by its own
internal devices, applications of Satisfy. Like a mechanical system, it cannot
see into the future and does only what produces an immediate relaxation of
tension. It cannot look ahead and transform itself with possible future
reductions of tension in mind.
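The Select/Satisfy loop and its termination argument can be given a rough computational sketch. The following Python fragment is our own illustration, not the authors' formalism; all names (satisfy, run_cycles) and the list-of-features encoding are assumptions made only for exposition. The point it models is the one made above: each application of Satisfy removes at least one removable feature, so Satisfy necessarily halts before the next Select.

```python
# A toy model of the cycle: Select introduces a head, then Satisfy applies
# as many times as it can. Each application of satisfy() removes at least
# one removable feature, so the inner loop always terminates.

def satisfy(removable):
    """One application of Satisfy: remove one removable feature if any remain.

    Returns True if a feature was removed (the "tension" relaxed), else False.
    """
    if removable:
        removable.pop()
        return True
    return False

def run_cycles(heads):
    """heads: each head is given as a list of its removable features.

    Runs the Select/Satisfy cycle for each head in turn and returns the
    removable features left on each head (always none, by termination).
    """
    representation = []
    for head_features in heads:       # Select: introduce a new head
        removable = list(head_features)
        while satisfy(removable):     # Satisfy applies as many times as it can
            pass
        representation.append(removable)
    return representation

# Two heads with removable features; every cycle ends with none remaining.
assert run_cycles([["N*"], ["N*", "D*"]]) == [[], []]
```

The model has no lookahead: like the mechanical system of the metaphor, each step only relaxes tension immediately available to it.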
5. Structural Case
Quirky case constructions in Icelandic illustrate the connection between
nominative case checking and agreement clearly.13 In finite clauses in
Icelandic in which there is no nominative case assignment, Infl has an
invariant 3rd person singular form, independent of the φ-features of the
nominal in its specifier. We follow the standard view and take it to be default
agreement, the morphological form assigned to tensed verbs lacking
morphological φ-features. Consider a standard example (with Dat used in
the gloss to indicate dative case and dflt to indicate default agreement):
13 This section relies heavily on the work of Halldór Sigurðsson. He has carefully and
insightfully investigated the key phenomena in a series of papers (see 1996 in particular, and
references contained therein).
(14) a. Hann hjálpaði okkur.
        he helped us(Dat)
        'He helped us.'
     b. Okkuri var/*vorum hjálpað ti.
        us(Dat) was(dflt)/were(1pl) helped t
        'We were helped.'
     c. Þeimi var/*voru hjálpað ti.
        them(Dat) was(dflt)/were(3pl) helped t
        'They were helped.'
The verb selects for a dative case object. The object in the passives (14b,c) moves
to Spec(Infl) for EPP reasons. (We will return in section 6 to discuss the EPP in
much more detail.) Note that there is no subject-verb agreement in (14b,c).
The agreement facts above contrast with parallel cases in which there is no
selection for inherent case and there is nominative case checking by Infl. The
following are simple passives with a verb which checks structural accusative
case in the active form.
(15) a. Við aðstoðuðum þá.
        we(Nom) aided(1pl) them(Acc)
        'We aided them.'
     b. Þeiri voru aðstoðir ti.
        they(Nom) were(3pl) aided t
        'They were aided.'
Even if nominative case appears on a postverbal argument, with covert
nominative case checking, there is overt agreement with Infl. The following
are from Sigurðsson (1996). The verb in (16a) and (17a) is of the familiar
variety which checks the accusative case of its object in active voices. The
verb in (16b) and (17b) assigns inherent dative case to its object.
(16) a. Bækurnari voru lesnar ti.
        the-books(Nom) were(3pl) read t
        'The books were read.'
     b. Bókunumi var skilað ti.
        the-books(Dat) was(dflt) returned t
        'The books were returned.'

(17) a. Það voru lesnar fjórar bækur.
        there were(3pl) read four books(Nom,pl)
     b. Það var lesnar fjórum bókum.
        there was(dflt) read four books(Dat,pl)
There is a two-way implication. Nominative case checking induces
agreement; agreement requires nominative case. It might be possible to argue
that in English this simply reflects a morphological fact about Infl, that
inflectional agreement features always accompany finiteness. It could then be
claimed that the coincidence of agreement and nominative case assignment is
just a reflection of a morphological coincidence. But this line of reasoning
does not hold up for Icelandic because quirky subjects appear with finite Infl
without any overt agreement morphology.
The feature checking mechanism proposed in the last section was
formulated with this kind of feature interaction in checking relations in mind.
We could express the facts above by assuming the following checking
relation:
(18) {φ, Nom} checks {φ*, Tensed}
The structure of the theory, however, will be much clearer if we refine (18) in
several ways.
In the first place, we must understand correctly the role of φ* in the
theory. For various reasons (the Minimal Link Condition, in particular) it
cannot be that φ* is specialized to particular φ-features. This would
correspond to a feature system in which a 3pl inflectional head would not
recognize a 1sg nominal. But as far as intervention effects are concerned, the
particular φ-features of intervening heads appear to be irrelevant. What is
relevant is only that the intervening head have φ-features, i.e., that it be N.
Then (18) can be written as:

(19) {N, Nom} checks {N*, Tensed}
If syntactic features do not directly encode morphological agreement, how
then is morphological agreement established? We suppose that syntactic feature
checking can have morphological side effects. Restricting our attention to this
particular checking relation, we assume that if {N, Nom} removes N*, the
φ-features of the N are copied onto the N* head.14 We will soon see examples
in which other morphological features are copied onto the pivot.
The idea of morphological side effects of syntactic checking can be extended
to viewing structural case itself as a kind of agreement. The specific
morphological form agrees with the syntactic type of the case checker: Nom
is a reflection of Tensed, and Acc is a reflection of V. Parallel with the
reduction of the various φ-features of Infl to the single N*, we suppose that
there is a unitary structural case feature Kφ rather than specific structural case
features Nom and Acc. (For ease of exposition, we restrict the options to
these two structural cases, ignoring structural genitive and other possibilities.)
We can then, finally, express the basic structural case checking relation as:
(20) {N, Kφ} checks {N*, Tensed/V}

14 Special thanks to Morris Halle for discussion of the nature of inflectional φ-features.
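The checking relation (20), together with its morphological side effect of φ-feature copying, can be modeled computationally. The sketch below is our own illustration under assumptions of convenience: heads are encoded as dictionaries, Kφ is written "Kphi", and the function name check_structural_case is hypothetical; it is not the authors' implementation.

```python
# A sketch of (20): {N, Kφ} checks {N*, Tensed/V}. When the pivot {N, Kφ}
# removes N*, the φ-features of the nominal are copied onto the head that
# bore N* (agreement); in the same checking, {N*, Tensed/V} removes Kφ.

def check_structural_case(nominal, head):
    """Apply (20) if its featural conditions are met; return True on success."""
    can_check = ("N" in nominal["features"] and "Kphi" in nominal["features"]
                 and "N*" in head["features"]
                 and ("Tensed" in head["features"] or "V" in head["features"]))
    if not can_check:
        return False
    head["features"].remove("N*")        # {N, Kφ} removes N*
    head["phi"] = nominal["phi"]         # side effect: φ-features copied
    nominal["features"].remove("Kphi")   # {N*, Tensed/V} removes Kφ
    return True

# A tensed Infl checking a 3pl nominal, as in (16a): agreement is were(3pl).
infl = {"features": {"N*", "Tensed"}, "phi": None}
books = {"features": {"N", "Kphi"}, "phi": ("3", "pl")}
assert check_structural_case(books, infl)
assert infl["phi"] == ("3", "pl")        # morphological agreement established
assert "Kphi" not in books["features"]   # structural case feature removed
```

A quirky-case nominal lacking Kφ would simply fail the can_check test, leaving Infl with default agreement, as in (14b,c).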
Agreement is then a process of guaranteeing that the morphological
φ-features on the N* head agree with the φ-features of the N head, and the
morphological case feature of the N head agrees with the type of the N*
head. That is, if {N, Kφ} removes N*, the φ-features of the N are copied
onto the N* head. We now need to specify the morphological consequences
of, for example, {N*, Tensed} removing Kφ. There is a fundamental
asymmetry because, empirically, the derivation does not appear to wait until
this checking operation to fix the morphological case feature (Nom) of the
nominal. Consider, for example, past participle agreement in Icelandic. There
is agreement with the past participle for case, number, and gender. The
following paradigm is from Andrews (1982:445).
(21) a. Hún er vinsæl.
        she(Nom) is popular
     b. Þeir segja hana vera vinsæla.
        they(Nom) say her(Acc) to-be popular
     c. Hún er sögð vera vinsæl.
        she(Nom) is said(Nom) to-be popular
     d. Þeir telja hana vera sagða vera vinsæl.
        they(Nom) believe her(Acc) to-be said(Acc) to-be popular
     e. Hún er talin vera sögð vera vinsæl.
        she(Nom) is believed(Nom) to-be said(Nom) to-be popular
For the sake of clarity, the only inflection which is glossed is case on
nominals and case agreement on past participles.
If we assume that past participle agreement is established by means of an
InflPP and the nominal passes cyclically through Spec(InflPP), it is most
straightforward to assume that the nominal already has a morphological case
feature when it is checked by InflPP.* Otherwise, the morphological case
feature of the nominal could not be copied onto InflPP. Presumably, InflPP has
an N*-feature which checks the nominal, with its Kφ-feature. Feature removal
({N, Kφ} removing N*) applies and has the morphological side effect of
copying morphological features to the past participle, including the case
feature of the nominal.
This strongly indicates that the morphological case feature must be part of
the feature makeup of the nominal at insertion. The morphological side effect
of Kφ elimination, which is carried out only at the head of the nominal
chain, is simply verification that the morphological case feature agrees with
the case-type of the head involved in eliminating Kφ. If the verification
fails, the derivation is morphologically ill-formed.

* Note added in proof: Chomsky (Fall '98 lectures) has proposed that phenomena like
Icelandic past participle case agreement are the result of case assignment to the past participle by
the higher case checker, rather than via agreement with the nominal as we proposed. If
Chomsky's proposal is successful, there is no need for inserting nominals with specific
morphological case features, and it is likely that our program of eliminating lookahead from the
theory can be advanced even further than we have succeeded in doing here. It now seems possible
that the notion "crash" loses relevance. Incomplete derivations may be defective, but this can be
detected almost immediately.
Note that precisely the same checking relation (20) applies in the D-N
configuration discussed earlier. The determiner has an N*-feature, but since
D is never Tensed or V, the structural case feature of the N cannot be
removed. The N*-feature of the determiner, however, can be removed,
possibly with morphological consequences.15
5.1 Cyclic Morphology
We assume that morphology is determined cyclically. By this we mean that at
the end of each cycle, all morphology has been determined. The asymmetry
(verification versus feature copying) in the morphological side effects of
structural case checking makes sense under this view. If a pivot X checks a
head Y, then (except for the special case of expletives) Y will already have
been present in a previous cycle. Assuming "Cyclic Morphology," the
morphology of Y must have already been determined in that cycle. Any
morphological side effects of the feature checking between X and Y cannot
alter the morphology of Y. Verification of morphology is permitted, but
morphological features of Y cannot be changed. The morphological features
of X, however, can be altered. The checking is part of the cycle in which the
morphology of X is determined.
We explicitly depart here from Chomsky’s assumption that the feature set
of heads cannot be altered derivationally except by removal of features. But
the departure is limited to alteration of the set of morphological features.
Changes in the feature set are further limited by the assumption of cyclic
morphology. We can further limit the possible changes in the feature makeup
of heads by assuming that morphological feature changes are only possible as
a consequence of an application of Satisfy and that they are restricted to the
pivot of the transformation. They are further restricted to copying onto the
pivot morphological features of the head which undergoes checking with the
pivot. In contrast to removable features, which must be eliminated in the
course of the derivation, morphological features are never eliminated.
Note that the assumption of Cyclic Morphology is made possible by the
assumption that all possible removable features of the pivot are eliminated
cyclically. Consider a simple expletive construction:
(22) a. Bill thinks there is someone in the room.
     b. [there Infl(Pres)_be (be) someone in the room]

Infl must acquire φ-features on the cycle in which (22b) is built so that the
morphology of Infl_be can be determined. If feature checking did not take
15 It might seem odd that the N*-feature of D's is not distinguished from the N*-feature which
inflection can bear. We can see no empirical issue.
place between someone and the embedded Infl on this cycle, the morphology
of that Infl could not be fixed cyclically.
5.2 Inherent Case
Sometimes, the lexical item selecting a nominal selects the case of that
nominal as well. The case theory developed in the previous sections must
now be extended to this situation. It is well-known that various oblique
objects in Icelandic, for example, behave as if they require some kind of
structural licensing akin to structural case licensing. See, for example,
Freidin & Sprouse (1991) and Sigurðsson (1991). Consider typical examples:
(23) a. Jónii var hjálpað ti.
        Jon(Dat) was helped t
        'Jon was helped.'
     b. Jónii var taldir [ti hafa verið hjálpað ti].
        Jon(Dat) was believed t to-have been helped t
        'Jon was believed to have been helped.'
     c. Þeir telja [Jónii hafa verið hjálpað ti].
        They believe Jon(Dat) to-have been helped t
        'They believe Jon to have been helped.'
     d. *[Jónii hafa verið hjálpað ti] er mikilvægt.
        Jon(Dat) to-have been helped t is important
The dative marked object of the verb can appear in subject position, as in
(23a). It can raise in normal fashion, as in (23b). The issue is the contrast
between (23c) and (23d). In spite of the fact that there is inherent dative case
assignment in both cases (which we assume is a question of selection) and no
evidence that Jon(Dat) is structurally case-marked, it must appear in a
structural case position. On the basis of contrasts like that between (23c) and
(23d), Freidin & Sprouse concluded that the nominal, even though it does not
require structural case assignment (since it has already been assigned dative
case lexically), requires some kind of structural licensing, and further that
such structural licensing is available only in positions of structural case
assignment.
There has been considerable discussion in the literature on the question of
the possibility of quirky subjects bearing covert structural case (nominative in
(23a) and (23b), accusative in (23c)). Various proposals have been advanced
trying to bring these constructions in line with the standard ideas about case
marking.16 In spite of the fact that the subject of the embedded sentence in
(23b,c,d) has dative case, it must appear in a structural case position, either as
the subject of a tensed clause, in (23b), or in an ECM position, as in (23c).
16 The idea, we believe, originates with Chomsky (1986). See Schütze (1993) for a recent
treatment.
There is a fundamental difference between licensing inherently case-marked
arguments and the form of structural case discussed earlier. There is
no morphological agreement. The morphological case is invariant (independent
of the type of the case checker) and no morphological agreement
Consider, for example, the Icelandic examples which follow:
(24) a. Stúlkan kyssti drengina.
        the-girl(Nom,fem,sg) kissed(3,sg) the-boys(Acc,masc,pl)
        'The girl kissed the boys.'
     b. Drengirnir voru kysstir.
        the-boys(Nom,masc,pl) were(3,pl) kissed(Nom,masc,pl)
        'The boys were kissed.'
     c. Jón hjálpaði okkur.
        Jon(Nom) helped(3,sg) us(Dat)
        'Jon helped us.'
     d. Okkur var hjálpað.
        us(Dat) was(dflt) helped(dflt)
        'We were helped.'
We can easily capture this pattern in the present system by supposing that
there is a second structural case feature, K, which behaves syntactically
exactly as Kφ does, but differs from Kφ in 1) not triggering morphological
agreement and 2) satisfying different selectional restrictions than Kφ. We
assume:

(25) {N, Kφ/K} checks {N*, Tensed/V}
We assume that K checking has no morphological side effects, either in
verifying morphological case or in copying morphological features to the
case checker. Selectionally, we assume that arguments whose case is selected
must have a K-feature and arguments whose case is not selected must have a
Kφ-feature. Conceptually, there is a partial justification for the failure of K
checking to induce agreement. Agreement verifies morphological case.
Arguments whose case is selected have already had their morphological case
verified by the selector.
5.3 Language Specific versus Language Universal Case Features
We can speculate that the separation between checking structural case
features and the morphological side effects of case checking coincides with
the separation between universal case features and checking relations and the
language particular ways that case is made manifest morphologically. Under
this view, the checking relations (25) are universal. We discussed above
various natural constraints on the morphological side effects of checking, the
assumption of Cyclic Morphology in particular. It is natural to assume that
languages have wide latitude to express case checking morphologically within
these constraints.
6. Expletives and the EPP
Let us first consider what English would be like if (25) were the extent of
structural case checking. The main features of the case and EPP system
follow from (25). An N*-feature on Infl will ensure that Spec(Infl) is filled
overtly. Chain Condition (Chomsky 1986) effects follow directly as well.
Once the structural case feature of an N is removed, N* can no longer
be removed by the N. Only N in conjunction with a structural case feature
removes N*. Consider, for example:
(26) a. *John was believed [t is guilty]
     b. was believed [John is guilty]
        N*                N*,Tensed
The structural case feature of John is removed by the {Tensed, N*} of
Infl in the embedded clause. At the point where (26b) has been built, the
matrix Infl cannot check John. The only removable feature which could be
eliminated is N*, and a structural case feature (Kφ or K) must cooperate
with N in removing N*.
The possibility of cyclic NP movement follows as well if we assume that
raising Infl has an N*-feature, so that a raising Infl can be in a checking
relation with an N with a structural case feature. But N* on an untensed Infl
(i.e., one lacking Tensed) cannot remove Kφ. Consider, for example:
(27) a. John is likely [t to be [t guilty]].
     b. Infl be [John guilty]
        N*       N,Kφ
     c. John Infl be [t guilty]
        N,Kφ
The raising Infl checks John, but Kφ is not removed from John because no
case-checking feature (i.e., neither Tensed nor V) cooperates with the
N*-feature of Infl. On the other hand, N* is removed from Infl, because Kφ
cooperates with N in removing it. The matrix Infl can then check John and
(27a) results.
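The interaction of (25) in the derivations (26) and (27) can be traced mechanically. The sketch below is our own illustration under the same assumed encoding as before (sets of feature labels, with Kφ written "Kphi"; the function satisfy_N is hypothetical): N* is removed only when a structural case feature cooperates, and Kφ is removed only by a Tensed (or V) head.

```python
# Checking a nominal against an Infl under (25): {N, Kφ/K} checks {N*, ...}.
# Returns the list of features removed by this application.

def satisfy_N(nominal, infl):
    removed = []
    if "N*" in infl and "N" in nominal and ("Kphi" in nominal or "K" in nominal):
        infl.discard("N*")            # case feature cooperates in removing N*
        removed.append("N*")
        if "Tensed" in infl or "V" in infl:
            nominal.discard("Kphi")   # Kφ removed only by a Tensed/V head
            removed.append("Kphi")
    return removed

# (27b,c): the untensed raising Infl loses N*, but John keeps Kφ.
john, raising_infl = {"N", "Kphi"}, {"N*"}
assert satisfy_N(john, raising_infl) == ["N*"]
assert "Kphi" in john                 # still available for the matrix Infl

# The matrix (tensed) Infl then removes both its N* and John's Kφ.
assert satisfy_N(john, {"N*", "Tensed"}) == ["N*", "Kphi"]

# (26): once Kφ is gone, no further N* can be removed by John;
# *John was believed [t is guilty] is blocked.
assert satisfy_N(john, {"N*", "Tensed"}) == []
```

The final assertion is the Chain Condition effect: a case-checked nominal can no longer cooperate in eliminating a higher N*.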
The main descriptive inadequacy of such a system is that it does not
permit expletive constructions. Consider a standard example:
(28) There are likely t to be several women in the room.
If case checking were restricted to (25), the matrix Infl would lose its
N*-feature in checking with the expletive. We would need to assume that the
expletive is N and bears a structural case, or it would not be in a checking
relation with the higher Infl. But then it would remove the N*-feature of the
matrix Infl when it raised, and there would be no way for the postcopular
nominal to lose its case feature. The existence of expletive constructions
appears to imply that Infl has two distinct removable features. One is N*,
and the other must be in a checking relation with the expletive.
Chomsky (1995) proposed that Infl has a D*-feature as well as an
N*-feature (φ* in his system). In the present terms, the checking relation that
he proposed is {D} checks {D*}, which can be more simply written as
D checks D*, because D is not removable. If we suppose that Infl has
both N* and D*, we are led to a system along the lines of Collins (1997).
This system comes close to a descriptively adequate model. Examples (26)
and (27) go through with little change. A simple expletive construction, as in
(28), receives a direct account. It is assumed that there has only the feature
D. In (28), the expletive satisfies the D*-feature of the embedded Infl when
it is inserted in this position and satisfies the D*-feature of the matrix Infl
when it raises (displaying checking with D*) to the matrix clause. Covertly,
the nominal raises first to the embedded Infl, satisfying its N*-feature and
preserving its own Kφ-feature. The nominal then raises to the matrix Infl,
satisfying the latter's N*-feature and losing its own Kφ-feature.
Consider also:
(29) *There is likely that someone was in the room.
The expletive satisfies the D*-feature of the matrix Infl, but the N*-feature of
the matrix Infl remains unsatisfied. The nominal someone in the embedded
clause cannot raise to satisfy N* because it has lost its Kφ-feature, which is
needed for erasing N*.
The system falls short, however, in more subtle expletive constructions,
including the crucial (3) from section 3, repeated here as (30a).

(30) a. *There is likely someone to be t in the room.
     b. *There is likely there to be someone in the room.
Neither of these examples is blocked. Attempts to account for these examples
have taken different forms and have played a central role in the development
of Minimalist theory. Chomsky originally introduced the notion numeration to
account for (30a) and the idea of a numeration continues to play a prominent
role in Minimalist grammar. Its persistence is due almost entirely, as far as we
can see, to its role in explaining (30a). Collins (1997), correctly in our view,
tries to do away with the numeration machinery, but does so only by stipulating
a principle (called Extend Chain) to account for (30a) that is otherwise
unmotivated and has very little theoretical scope. Chomsky uses the old idea
of expletive replacement as an explanation for (30b). We discuss expletive
replacement in section 6.2. Collins does not address the problem of (30b).
Why Infl should have both D*- and N*-features is a mystery. A grammar
with only N* on Infl is only marginally different from a grammar with both
N* and D* on Infl. The possibility of expletive-associate constructions,
as we have seen, is the salient difference. We can first ask ourselves why Infl
has an N*-feature in the first place. This is just the flip side of the question of
why argument nouns bear structural case. If N's do have a structural case
feature, some mechanism must be provided for removing the structural case
feature, and N* plays a key role in that mechanism. The reason for a
D*-feature is not so clear. We will pursue the intuition that D* arises for the
same reason that N* does, as a featural device for ensuring the erasure of a
case feature. This leads directly to the idea that D's as well as N's bear a
structural case feature. We will denote the structural case feature on D's by
KD. The obvious checking relation is (31b), entirely parallel to the structural
case checking relations (25), which are repeated here as (31a).17
(31) a. {N, Kφ/K} checks {N*, Tensed/V}
     b. {D, KD} checks {D*, Tensed/V}
Two further conclusions are more or less forced. First, it must be that
expletive there bears KD. Otherwise, there could not remove Infl's D*.
Recall that (31b) requires both D and KD to remove D*. Second, it must
be that the associate of the expletive is an NP, not a DP. We can see this by
considering:

(32) There is someone in the room.

The expletive removes D* from Infl. If someone were a DP, it would bear
KD and there would be no D* which could cooperate with Tensed to
remove it. If the nominal is a simple NP, the problem vanishes. The
conclusion that the associate of the expletive must be an NP is the core of an
explanation of the Indefiniteness Effect.18
The conclusion that a nominal like someone can be either an NP or a DP is
more ordinary than it may seem. The internal structure of [someone]DP is
most reasonably realized as [d [someone]NP], with a phonologically null
determiner d. If this is indeed the structure, then the two varieties of someone
are already implicit in someone as a DP.
The problematic examples in (30) also receive an immediate and direct
explanation. First, consider (30a). There are two possibilities: either someone
17 It is conceivable that K and KD are the same feature. This depends partly on the analysis
of pronouns. If pronouns are heads which are both N and D, our analysis does not go through
if they bear a single KD-feature. It is awkward, and perhaps incoherent, for them to bear two
identical features.
18 This argument is from Frampton (1996).
is a DP or an NP. If it is an NP, then the D*-feature of the embedded Infl will
not be removed. If it is a DP, then its KD-feature will not be removed, because
the D*-feature of the matrix Infl is consumed when the expletive is
introduced. Next, consider (30b). Both expletives will bear KD-features
which must be removed. The KD-feature of the expletive in the embedded
sentence will not be removed by a raising Infl. That Infl has a D*-feature, but
it must cooperate with Tensed to remove KD. The matrix Infl will lose its
D*-feature in checking the higher expletive, so there is no way for the lower
expletive to lose its KD-feature.
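The two-feature system (31) can likewise be traced mechanically. The following sketch is our own illustration, again under an assumed set-of-labels encoding ("Kphi" for Kφ, "KD" for the case feature on D's; the function check is hypothetical). It runs the checking steps for (28) and for (30b), showing that every removable feature is eliminated in the first but the lower expletive's KD survives in the second.

```python
# Apply (31a)/(31b): {N, Kφ/K} checks {N*, Tensed/V}; {D, KD} checks
# {D*, Tensed/V}. Starred features go whenever the case feature cooperates;
# the case feature itself goes only when the head is Tensed or V.

def check(pivot, head):
    tensed = "Tensed" in head or "V" in head
    for cat, case in (("N", "Kphi"), ("N", "K"), ("D", "KD")):
        if cat + "*" in head and cat in pivot and case in pivot:
            head.discard(cat + "*")
            if tensed:
                pivot.discard(case)

def starred(features):
    return {f for f in features if f.endswith("*")}

# (28) There are likely t to be several women in the room: converges.
there, women = {"D", "KD"}, {"N", "Kphi"}
embedded, matrix = {"N*", "D*"}, {"N*", "D*", "Tensed"}
check(there, embedded); check(women, embedded)   # lower cycle
check(there, matrix);  check(women, matrix)      # matrix cycle
assert starred(embedded) == set() and starred(matrix) == set()
assert "KD" not in there and "Kphi" not in women

# (30b) *There is likely there to be someone...: the lower expletive's KD
# is never removed, since only a Tensed D* head can remove KD and the
# matrix D* is consumed by the higher expletive.
low, high = {"D", "KD"}, {"D", "KD"}
embedded2, matrix2 = {"N*", "D*"}, {"N*", "D*", "Tensed"}
check(low, embedded2)   # D* goes, but KD survives (untensed)
check(high, matrix2)    # matrix D* consumed by the higher expletive
check(low, matrix2)     # no D* left to cooperate with Tensed
assert "KD" in low      # the derivation fails to converge
```

The same function, run on (30a) with someone as a DP, leaves its KD unremoved for the reason given in the text.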
There is one further type of example which has proved troublesome for
Minimalist theories, but which is elementary in the present framework.
(33) a. There was circulated a rumor that someone was in the room.
     b. A rumor that there was someone in the room was circulated.
In the framework developed above, (33) is routine. In (33a), a rumor is an NP
and someone is a DP. In (33b), a rumor is a DP and someone is an NP.
In Chomsky’s framework, (33) presents a puzzle. It is not at all clear why
(33a) is permitted. The inventory of lexical items in the two derivations is
identical. At some point in the derivation of (33a), [ was someone in the room ]
has been built and the issue is raising someone. Recall that in that framework,
Merge is always more economical than Attract. In this case, Merge has an
expletive there available for merger (the there which is ‘‘intended’’ for the
matrix sentence). The consequence should be that (33a) is not optimal.19
The introduction of KD alongside Kφ also clarifies a residual puzzle
for the proposed association of structural case and agreement. Why is it that
raising infinitives do not show agreement? If Infl has an N*-feature, we would
expect that cyclic NP movement of an N with Kφ would induce agreement.
The introduction of KD clears this up. We can assume that raising Infl does
have a D*-feature, but does not have an N*-feature. Syntactically, a raising
Infl is nothing but a D*-feature. Unlike Kφ checking, KD checking has no
morphological side effects.
6.1 The Chain Condition in Alternate Minimalist Theories
The discussion above focused on showing that the Chain Condition can be
derived from the proposals we have advanced. It is worth emphasizing that
this is a non-trivial result. Chomsky (1995), for example, does not have an
account of examples like the following:
19 Chomsky has suggested in past lectures that the problem could be overcome by introducing
a variation on his notion of numeration, what he calls a restricted numeration. The idea is that at
various points in the derivation, there is an option of adding additional lexical items to the
numeration. Recent improvements in the theory (Chomsky, Fall 1997 lectures) do appear to allow
a coherent explanation to go through along these lines.
(34) *[Bill to be believed t is guilty] is surprising.
Since Chomsky assumed that a raising Infl has a D*-feature, and that this is
its only removable feature, and made no assumptions about feature
interaction in checking, nothing blocks Bill from raising and eliminating
Infl's D*-feature.
In order to account for successive cyclic movement to Spec(Infl), with a
raising Infl, Collins (1997:99) proposes that Infl has a removable feature
Tnull which has special checking properties. It checks structural case of any
variety. In checking, however, Tnull is eliminated but the (removable)
structural case feature that it checks is not eliminated. Under this assumption,
(34) is correctly predicted. In general, the proposal is empirically adequate,
but it has conceptual problems. In the first place, the Tnull-feature is otherwise
unmotivated. It is not the feature responsible for EPP driven movement,
because expletives also undergo successive cyclic raising and they do not
bear structural case. In the second place, there is no attempt to account for the
exceptional behavior of Tnull checking. Other than the structural case
feature which enters into checking with Tnull, removable features are
assumed to be eliminated by checking. In fact, in the same work, because of
these conceptual problems, Collins abandons his Tnull proposal in favor of
Chomsky’s idea that a raising Infl has just a D-feature (see Collins 1997:100). But we have already seen that Chomsky’s proposal, while conceptually satisfying, is not sufficiently constrained to account for (34).
6.2 Expletive Replacement
The account of (35) in terms of KD-case is straightforward: the KD-features of
the expletives are not eliminated.
(35) a. *[There to be believed that Jack has left] is surprising.
     b. *I regret [there to be believed that Jack has left].
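The logic of this account can be put as a toy convergence check: a derivation converges only if every removable feature has been eliminated by checking, and nothing in (35) eliminates the expletive’s KD-feature. The sketch below is only illustrative; the feature names and the representation of heads as Python sets are our assumptions, not the formalism of the theory.

```python
# Toy model of the KD-case account of (35): a derivation converges only
# if every removable feature of every head has been eliminated.
# Feature names and the set representation are illustrative assumptions.

def converges(remaining_features):
    """remaining_features: one set of still-unchecked removable
    features per head. Convergence requires every set to be empty."""
    return all(not feats for feats in remaining_features)

# Expletive 'there' bears a KD-feature; in (35) nothing checks it.
assert converges([set(), {"KD"}]) is False  # (35): derivation crashes
# Had the KD-feature been eliminated, the derivation would converge.
assert converges([set(), set()]) is True
```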
Chomsky (1995) rules these out by the requirement that expletives cannot
appear in the LF representation and the introduction of a mechanism for
expletive deletion. The idea is that if expletive there, a D in Chomsky’s
system, is brought into a suitable structural relation with an N, it is deleted. In
the examples in (35), no such deletion occurs and the sentences are ruled out.
There are three major problems with the expletive deletion analysis.
First, assuming that the D-feature of expletive there is a removable feature introduces a significant complication. Some distinction must be made between D-features which are removable and those which are not; otherwise, the computation could not treat one occurrence of D as removable while treating a different occurrence as permanent. The checking relations, at least under current assumptions, cannot distinguish between two occurrences of a particular feature depending upon
the context they occur in. Removability would have to be encoded in some way. This can certainly be done, but only at a cost.20
Second, there are empirical problems. In English, it might be argued that it in (36) is either an expletive which is eliminated syntactically or an argument whose apparent clausal complement is actually an adjunct.
(36) It seems that Bill left early.
In Icelandic, there are examples which make both possibilities dubious.
(37) a. Það glampaði á sverðið.
        it gleamed on the-sword
        ‘The sword gleamed.’
     b. Það var dansað.
        it was danced
        ‘There was dancing.’
     c. Það dimmir nú.
        it dims now
        ‘It is getting dark.’
The examples are taken from Andrews (1982:493). The expletive in the last
example, (37c), is perhaps not so different from so-called weather-it in
English. But in (37a), in particular, it is hard to see what argument structure
would make an argument-adjunct analysis possible and it is equally hard to
see how the expletive could be replaced.21
Third, in Chomsky’s system, a non-pronominal expletive, phonological material aside, consists of nothing more than a single removable D-feature. If
this feature is removed, the expletive becomes contentless. Either some
coherence must be given to the idea that the interpretive system does not
object to a contentless phrase, as opposed to a phrase with content but no
interpretation, or some theory of phrasal deletion must be advanced. It is
certainly possible to introduce suitably constrained phrasal deletion into the
system. But it should be recognized as an added cost of this approach.
6.3 Intervention Effects (The Minimal Link Condition)
Now that some concrete examples of complex feature checking relations
have been developed, we are in a position to understand the precise
formulation of the application of Satisfy which was given in section 4.3.
20 One scheme might be to suppose that the feature set of heads is not an undifferentiated set of features, but is intrinsically structured into removable features and permanent features. The lexicon would have the responsibility of assigning features to the appropriate category right from the start. This scheme, in effect, is a way to encode the context sensitivity of removability.
21 It is worth noting the similarity between (37a) and the Irish examples that McCloskey (1995) uses to argue against the universality of the EPP.
(38) *Max_i appears [it is believed t_i to be guilty]
The intervening it has had its structural case feature removed, so it cannot move to the matrix Infl. Nevertheless, it must block movement of a more deeply embedded D which has both Ku- and KD-features and could otherwise raise to the matrix Infl.
Our particular formulation of Satisfy accounts for this. Consider the
relevant point in the derivation.
(39) Infl appears [it is believed Max to be guilty]
Infl does a top-down search for a head to eliminate one of its removable features, D or N. It stops when it finds a head it recognizes. Recall that Infl will recognize a head Y if there is a removable feature f of Infl and a checking relation λ → μ such that (1) f is in λ and (2) μ and Y share at least one common feature. In (39), Infl has the checking relation {N, Tensed} → {N, Ku}, and {N, Ku} shares the feature N with it; Infl therefore recognizes it. Note that Infl recognizes it even though it cannot eliminate a feature of Infl, because it has lost its Ku-feature. The top-down search, therefore, stops with it.
We can ask why the top-down search is limited in this particular way, so
that it stops when a Y which is recognized is encountered, even though
checking cannot be carried out with that Y and there is a deeper head with the
needed feature structure. One could easily imagine alternatives. It could be,
for example, that the search stopped only when a head which could actually
enter into feature checking with the pivot was found. It is clear that the more
an intervention condition restricts the depth of search, the easier the
computation is, but the more difficult it is to construct well-formed
derivations. This intervention effect can therefore be taken as evidence that
the syntax is organized in such a way as to simplify the computation.
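The search procedure just described can be sketched as a small program. This is an illustrative model only: the representation of heads as feature sets, the particular feature names, and the function names are our assumptions, not part of the theory’s formal statement.

```python
# Sketch of Satisfy's top-down search with the intervention effect:
# the search halts at the FIRST recognized head, whether or not
# checking with that head can actually succeed. All concrete names
# below are illustrative assumptions.

def recognizes(relations, removable, head):
    """A pivot recognizes head Y iff some checking relation lhs -> rhs
    has (1) a removable pivot feature in lhs and (2) rhs sharing at
    least one feature with Y."""
    return any(lhs & removable and rhs & head for lhs, rhs in relations)

def topdown_search(relations, removable, heads_top_down):
    """Return the index of the first recognized head, or None."""
    for i, head in enumerate(heads_top_down):
        if recognizes(relations, removable, head):
            return i
    return None

# The configuration in (39): Infl appears [it is believed Max ...].
infl_relations = [({"N", "Tensed"}, {"N", "Ku"})]
infl_removable = {"D", "N"}
heads = [
    {"N"},              # 'it': its Ku-feature is already checked
    {"N", "Ku", "KD"},  # 'Max': deeper, could actually check Infl
]
# The search stops at 'it' (index 0); 'Max' is never reached, so (38)
# cannot be derived.
assert topdown_search(infl_relations, infl_removable, heads) == 0
```

A stricter stopping condition, halting only at a head that can actually enter into checking, would instead return the deeper head and wrongly license (38); the weaker condition modeled here both blocks (38) and bounds the depth of search, which is the computational point made above.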
7. Language Design
In the introduction, we made it clear that we were not going to argue for
eliminating comparison of derivations from the theory on the basis of the
claim that only a theory with simple computations can be psychologically
real. Instead, we adopted simplification of the computation involved in verifying derivations as a potential simplification of the theory, and therefore a potential improvement, and examined where this point of view led.
There could have been major problems or complications in attempting to
rework the Minimalist Program along these lines. Instead, the attempt led to what we
believe is a simpler and more constrained theory with broader explanatory
adequacy. Assuming for the moment that this is true, we need to ask what
conclusions can be drawn from the fact that recasting the theory using the
simplification of computation as a guiding principle led to a better theory,
using traditional measures of simplicity and explanatory adequacy.
A plausible conclusion is that the computations the theory uses to verify derivations are in some sense real, in that they correspond in some more or less direct way to actual mental computations. Language may also be designed in such a way that these computations, which must of course be consistent with the computational limits of the brain, do not include the complex device of comparing derivations.
References
ANDREWS, A. 1982. Case in Modern Icelandic. In The mental representation of
grammatical relations, ed. J. Bresnan. Cambridge, Mass.: MIT Press.
CHOMSKY, N. 1986. Knowledge of language. New York: Praeger.
CHOMSKY, N. 1994. Bare phrase structure. MIT occasional papers in linguistics 5.
MIT, Cambridge, Mass.
CHOMSKY, N. 1995. The minimalist program. Cambridge, Mass.: MIT Press.
COLLINS, C. 1997. Local economy. Cambridge, Mass.: MIT Press.
FRAMPTON, J. 1996. Expletive insertion. In The role of economy principles in
linguistic theory, ed. C. Wilder, H.-M. Gärtner & M. Bierwisch. Berlin: Akademie
Verlag.
FRAMPTON, J. & S. GUTMANN. 1997. Eliminating non-local computation in
minimalist syntax. Ms., Northeastern University, Boston, Mass.
FRAMPTON, J. & S. GUTMANN. 1998. Distributed morphological interface and V-to-I raising. Ms., Northeastern University, Boston, Mass.
FREIDIN, R. & R. SPROUSE. 1991. Lexical case phenomena. In Principles and parameters in comparative grammar, ed. R. Freidin. Cambridge, Mass.: MIT Press.
KOSSLYN, S. 1980. Image and mind. Cambridge, Mass.: Harvard University Press.
LONGOBARDI, G. 1994. Reference and proper names. Linguistic Inquiry 25:609–
666.
MCCLOSKEY, J. 1995. Subjects and subject positions in Irish. In The syntax of Celtic
languages, ed. R. Borsley & I. Roberts. Cambridge: Cambridge University Press.
MULDERS, I. 1997. Mirrored specifiers. To appear in Linguistics in the Netherlands.
RICHARDS, N. 1997. What moves where when in which language? Ph.D.
dissertation, MIT, Cambridge, Mass.
SCHÜTZE, C. 1993. Towards a minimalist account of quirky case and licensing in
Icelandic. MIT working papers in linguistics 19, 321–375. Department of
Linguistics and Philosophy, MIT, Cambridge, Mass.
SIGURÐSSON, H. 1991. Icelandic case-marked pro and the licensing of lexical
arguments. Natural Language and Linguistic Theory 9:327–363.
SIGURÐSSON, H. 1992. The case of quirky subjects. In Working papers in
Scandinavian syntax 49, 1–26.
SIGURÐSSON, H. 1996. Icelandic finite verb agreement. In Working papers in
Scandinavian syntax 57, 1–46.
YANG, C. 1999. Unordered merge and its linearization. Syntax 2:38–64.
John Frampton
Department of Mathematics
Nightingale Hall
Northeastern University
Boston, MA 02115
[email protected]
Sam Gutmann
Department of Mathematics
Nightingale Hall
Northeastern University
Boston, MA 02115
[email protected]