DOCUMENT RESUME
ED 399 772    FL 024 097
AUTHOR: Local, J. K., Ed.; Warner, A. R., Ed.
TITLE: York Papers in Linguistics, Volume 17.
INSTITUTION: York Univ. (England). Dept. of Language and Linguistic Science.
REPORT NO: ISSN-0307-3238
PUB DATE: Mar 96
NOTE: 471p.; For individual articles, see FL 024 098-111.
PUB TYPE: Collected Works - Serials (022)
JOURNAL CIT: York Papers in Linguistics; v17 Mar 1996
EDRS PRICE: MF01/PC19 Plus Postage.
DESCRIPTORS: African Languages; Articulation (Speech); *Autism;
Black Dialects; Chinese; *Classroom Communication;
Cooperation; Diachronic Linguistics; Disabilities;
Echolalia; English; English (Second Language);
Finnish; Foreign Countries; French; Grammar; Greek;
Group Dynamics; Interpersonal Competence; Italian;
*Language Research; Language Rhythm; *Linguistic
Theory; Old English; Phonetics; Pronunciation; Second
Language Instruction; *Second Languages; Sex
Differences; Standard Spoken Usage; Suprasegmentals;
Syntax; *Uncommonly Taught Languages
IDENTIFIERS: Gerunds; Kalenjin Languages; *Repairs (Language)
ABSTRACT
These 14 articles on aspects of linguistics include
the following: "Economy and Optionality: Interpretations of Subjects
in Italian" (David Adger); "Collaborative Repair in EFL Classroom
Talk" (Zara Iles); "A Timing Model for Fast French" (Eric Keller,
Brigitte Zellner); "Another Travesty of Representation: Phonological
Representation and Phonetic Interpretation of ATR Harmony in
Kalenjin" (John Local, Ken Lodge); "On Being Echolalic: An Analysis
of the Interactional and Phonetic Aspects of an Autistic's Language"
(John Local, Tony Wootton); "The Nature of Resonance in English: An
Investigation into Lateral Articulations" (David E. Newton);
"Prosodies in Finnish" (Richard Ogden); "Old English Verb-Complement
Word Order and the Change from OV to VO" (Susan Pintzuk); "Situating
'Que'" (Bernadette Plunkett); "Event Structure and the 'Ba'
Construction" (Catrin Sian Rhys); "Explanation of Sound Change: How
Far Have We Come and Where Are We Now?" (Charles V. J. Russ); "Has It
Ever Been 'Perfect'? Uncovering the Grammar of Early Black English"
(Sali Tagliamonte); "Voice Source Characteristics of Male and Female
Speakers of French" (Rosalind A. M. Temple); and "Notes on Temporal
Interpretation and Control in Modern Greek Gerunds" (Georges
Tsoulas). (MSE)
Reproductions supplied by EDRS are the best that can be made
from the original document.
York Papers in Linguistics 17
Editors
JK Local
and
AR Warner
ISSN 0307-3238
MARCH 1996
SERIES EDITORS SJ HARLOW AND AR WARNER
DEPARTMENT OF LANGUAGE AND LINGUISTIC SCIENCE
UNIVERSITY OF YORK, HESLINGTON, YORK YO1 5DD, ENGLAND
EDITORIAL BOARD
Professor James Hurford (Edinburgh)
Professor John Local
Professor Robert Le Page
Professor Neil Smith (University College London)
David Adger
Connie Cullen
Steve Harlow
John Kelly
Richard Ogden
Susan Pintzuk
Bernadette Plunkett
Charles Russ
Sali Tagliamonte
Ros Temple
Georges Tsoulas
Mahendra Verma
Carol Wallace
Anthony Warner
The Editors are grateful to members of the Editorial Board (and to
former members Patrick Griffiths and Joan Russell), who have
acted as referees and whose advice has contributed to the quality
of the papers published.
Papers submitted to York Papers in Linguistics are each sent to
two referees whose anonymous reports must both recommend
acceptance for the paper to be published. This issue contains 14
papers; a further ten papers were submitted but were not accepted.
John Local
Anthony Warner
CONTENTS
DAVID ADGER Economy and Optionality: Interpretations of subjects in Italian
ZARA ILES Collaborative Repair in EFL Classroom Talk
ERIC KELLER AND BRIGITTE ZELLNER A Timing Model for Fast French
JOHN LOCAL AND KEN LODGE Another Travesty of Representation: Phonological representation and phonetic interpretation of ATR harmony in Kalenjin
JOHN LOCAL AND TONY WOOTTON On Being Echolalic: An analysis of the interactional and phonetic aspects of an autistic's language
DAVID E NEWTON The Nature of Resonance in English: An investigation into lateral articulations
RICHARD OGDEN Prosodies in Finnish
SUSAN PINTZUK Old English Verb-Complement Word Order and the Change from OV to VO
BERNADETTE PLUNKETT Situating Que
CATRIN SIAN RHYS Event Structure and the Ba Construction
CHARLES V. J. RUSS Explanation of Sound Change: How far have we come and where are we now?
SALI TAGLIAMONTE Has It Ever Been 'Perfect'? Uncovering the Grammar of Early Black English
ROSALIND A.M. TEMPLE Voice Source Characteristics of Male and Female Speakers of French
GEORGES TSOULAS Notes on Temporal Interpretation and Control in Modern Greek Gerunds
EDITORIAL STATEMENT AND STYLE SHEET
ECONOMY AND OPTIONALITY:
INTERPRETATIONS OF SUBJECTS IN ITALIAN*
David Adger
Department of Language and Linguistic Science
University of York
1. Goals
Optional movement is inconsistent with the notion of Economy.
Interestingly, optional movement seems to correlate with different
interpretations for the resulting structures; when movement is
obligatory, on the other hand, the single resulting structure seems to
have both of the possible interpretations assigned to the two structures
given by optional movement. Why should these facts hold? I provide an
answer which is based on the observation that the 'interpretational'
differences noticed are actually not semantic at all, but fall within the
purview of a separate field of linguistic competence: the ability that
human beings have to assign sentences values as to their felicity in
discourses. Given this, it follows that there must be an independently
specified set of well-formedness conditions deriving well-formed
discourses (see, for example, work in DRT, especially Kamp and Reyle
1993). I argue that apparent optionality in syntax arises because of a
constraint requiring each well-formed discourse to correspond to a
collection of corresponding well-formed syntactic structures.
Optionality in syntax then becomes essentially a meta-construct,
arising out of the interaction between two independent subsystems of
* Many thanks to the following people for comments on the ideas presented
here: Elena Anagnostopoulou; Hagit Borer; Richard Breheny; Itziar Laka;
Fabio Pianesi; Manuela Pinto; Bernadette Plunkett; Josep Quer; Tanya
Reinhart; Enric Vallduví and Anthony Warner. Many thanks also to Sandra
Paoli for help with the data.
York Papers in Linguistics 17 (1996) 1-21
© David Adger
linguistic competence. The apparent interpretational effects are actually
effects that arise because native speakers attempt to construct different
discourse contexts to satisfy the principles that map between syntax and
discourse. The vitiation of these effects when movement is obligatory
arises through the interaction of this theory of the interface and the
requirement that the syntax be economical. I illustrate this conceptual
framework here by taking two narrow domains: subject placement in
Italian and the infelicity of anaphoric linkage in discourse across the
scope of a quantificational expression.
2. The Problem
Consider the following well-known paradigm from Standard Italian (I
shall ignore throughout this paper cases of so-called free-inversion,
where the post-verbal subject is not in its theta-position; see Belletti
1988):
(1) Tre leoni hanno sternutito.
    three lions have-3p sneeze-pp
    'Three lions have sneezed.'
(2) *Hanno sternutito tre leoni.
    have-3p sneeze-pp three lions
(3) Tre leoni sono scappati.
    three lions be-3p escape-pp-3p
    'Three of the lions have escaped.'
(4) Sono scappati tre leoni.
    be-3p escape-pp-3p three lions
    'Three lions have escaped.'
Assuming some version of the Unaccusative Hypothesis
(Perlmutter 1979; Burzio 1985), this paradigm raises an important
question for theories of grammar which incorporate some notion of
Economy of movement (Chomsky 1989, 1992, 1995): why, if
movement is a 'last resort' operation, is (3) a possible syntactic
structure? Under the Unaccusative Hypothesis, (4) is essentially the
base structure (where the subject is in its theta-position) and there
appears to be no motivation for the subject to move to result in (3).
Now consider (3) and (4) more carefully. Belletti (1988) has argued
that in (4) there is a definiteness effect which can be seen as long as we
make sure that the complement is not free-inverted to a position outside
VP. She gives examples with ditransitives:
(5) Ogni studente era finalmente arrivato a lezione.
    every student be-3s finally arrived to the lecture
    'Every student finally arrived to the lecture.'
(6) *Era finalmente arrivato ogni studente a lezione.
    be-3s finally arrived every student to the lecture
Interestingly, as noticed by Pinto (1994), the surface subject
position of unaccusatives also shows an interpretative effect. Pinto
claims that pre-verbal unaccusative subjects have to be interpreted as
being D-linked (Pesetsky 1987); that is they have already been
introduced in the discourse. This contrasts with the case of the
unergative subject, which has no D-linking constraint imposed upon it.
There are three questions then: why can the subject move? Why
does this result in an interpretative difference for the two resulting
structures whereby the pre-verbal subject of an unaccusative is
D-linked? And why, in the case of unergatives (and transitives), are
pre-verbal subjects not necessarily D-linked? (I will ignore the definiteness
effect in (6) in this paper, since I think it has an independent
explanation.)
3. A Potential Solution
A potential solution to the first problem is suggested by Belletti's
(1988) analysis of post-verbal subjects and developments of her ideas by
de Hoop (1992) among others. Belletti claimed that the definiteness
effect in (5) could be explained by the nature of the type of Case
assigned by the unaccusative verb. She terms this Case 'partitive',
assumes that its assignment is optional, and correlates it with
indefiniteness. De Hoop points out problems with this idea, but
essentially develops this line of thought, arguing for different types of
Case assignment in the syntax, corresponding with different types of
interpretative effect. I shall refer to the hypothesis that the kind of data
in (5) and (6) can be dealt with through Case assignment as the Case
Determination of Interpretation hypothesis (CDI).
How might the CDI account for the data in (5) and (6)? De Hoop
proposes two types of structural Case which she terms 'weak' and
'strong'. For her, these correlate semantically with weak and strong
readings of DPs, where a strong reading is essentially a generalised
quantifier reading, and a weak one we can take for the moment as
existential. Under the CDI we could propose that V-unaccusative
assigns weak case to its complement and the auxiliary essere assigns
strong case to its specifier. This will give us the right interpretative
consequences.
What about (1), where the subject can have both interpretations? In
this case we could say that the auxiliary avere assigns either type of
Case to its specifier, which would mean that the subject of an
unergative could have either type of reading. Note that if Pinto is right
in her semantic characterisation of the readings of subjects in Italian, we
can link the notion of D-linked to that of strong Case, and non-D-linked
to that of weak Case.
One point of clarification: we cannot actually make the type of
Case assigned relate to the auxiliary directly, since the same facts
pertain when there is no auxiliary. We must therefore make I bear the
Case assigning features, or assume an abstract auxiliary. However, for
convenience I will refer to the Case assigning properties of essere and
avere even though actually these properties are instantiated on finite I.
Unfortunately, however, this solution will not generalise
effectively to other languages. French is a language which displays
similar auxiliary selection facts to Italian and also displays a
definiteness effect in impersonal passives:
(7) Il est arrivé trois femmes / *chaque femme.
    it be-3s arrive-pp three women / *each woman
    'There arrived three women/*each woman.'
(8) Trois femmes / chaque femme sont/est arrivée(s).
    three women / each woman be-3p/be-3s arrive-pp-f(p)
    'Three women/Each woman have/has arrived.'
However, French does not appear to display an anti-definiteness
effect in (8), which is felicitous in contexts where the subject is
non-D-linked. To capture the difference between Italian and French under the
CDI one would be forced to jettison the claim that the type of Case was
related to the type of auxiliary (or finite inflection) since in (8) we see
the equivalent of the essere auxiliary in French with either a D-linked
or non-D-linked subject.
Furthermore, the CDI seems to miss an important correlation
which can be stated in the following intuitive terms: if movement to a
position is optional then the two possible structures will have different
interpretations; if movement to a position is obligatory, then both
interpretations are available for the single structure. This correlation
would seem to be essentially functional: you move something to a
position to achieve an interpretative effect. In Section 5 of this paper I
will develop a formal explanation for the correlation.
In the next two sections I want to present the details of an
alternative view to the CDI. I'll argue that the interpretation of preposed
subjects of unaccusatives in Italian is not simply that they are D-linked,
but rather that such subjects behave as though they are required to be
discourse anaphoric (in the sense of Discourse Representation Theory
(Heim 1982; Kamp 1981; Kamp and Reyle 1993)). I'll do this by
showing that preposed subjects of unaccusatives obey the same
constraints as other discourse anaphors such as definites with respect to
the scope of adverbial quantifiers (which are discourse anaphor islands).
To do this I'll present a version of DRT designed to capture these
effects.
I'll then argue that a maximally simple view of Case should be
maintained, whereby Case has no interpretative force. It is required to
license a DP but not sufficient to determine that DP's surface position.
This does away with the notion of optional Case assignment as in
Belletti's system. It also paves the way for an explanation of the
interpretative correlates of subject placement. The idea is that
movement of the subject of an unaccusative to pre-verbal position is an
option not because of Case optionality but rather because of conditions
regulating the pairing of S-Structures and Discourse Representation
Structures. A simple theory of Economy interacts with these conditions
to explain the interpretative consequences of optional as opposed to
obligatory subject raising.
4. Some Semantics
4.1 A Little DRT
Within Discourse Representation Theory (DRT) indefinites and definites
contrast with true quantifiers such as every in that they are treated as
free variables which only become bound during the interpretation
procedure. These free variables are termed discourse referents (DRs) and
a Discourse Representation Structure (DRS) consists of a universe of
DRs and a collection of constraints on those DRs. An example might
make this clearer:
(9) a. A man entered. He sat down.
    b. Every man entered. # He sat down.
In (9a) the subject of the first sentence introduces a DR x which is
constrained so that the formula man(x) must be true of it.
Furthermore, the predicate of the sentence, enter, must also be true of
it. This gives the following representation:
(10)
x
man(x)
enter(x)
The pronoun in the second sentence of (9a), being a definite,
introduces a further DR y, of which the condition that y sat down must
hold:
(10)'
xy
man(x)
enter(x)
sat-down(y)
Given what I have said so far there does not appear to be any
distinction between indefinites and definites. Both introduce DRs and
constrain them with formulae. However, in order to capture the fact that
the use of a definite pronoun is infelicitous unless there is something
for the pronoun to refer back to (I use refer here intuitively), Heim
(1982) proposes a felicity condition on definites, including pronouns:
(11) Suppose something is uttered under the reading represented by φ
(where φ is an LF) and the discourse preceding φ has resulted in a
DRS K. K contains a set of discourse referents U. Then for every
chain C in φ it must be the case that:
Familiarity Condition: if C is a definite (including a definite
pronoun) then there is a discourse referent x associated with C and
x = y, y ∈ U;
otherwise φ is infelicitous with respect to K.
This condition does not hold of indefinites like numerals, some,
many, several etc. predicting that indefinites can begin discourses while
definites cannot. The Familiarity Condition means that the DRS
corresponding to (9a) will actually have to look as follows:
(12)
xy
man(x)
enter(x)
sat-down(y)
y=x
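The construction of (10)-(12) can be sketched procedurally. The fragment below is only an illustrative sketch: the class `DRS` and the function `resolve_definite` are hypothetical names introduced here, and linking a definite to the most recently introduced referent is a simplifying assumption, not part of Heim's condition.

```python
from dataclasses import dataclass, field


@dataclass
class DRS:
    """A DRS as a universe of discourse referents plus conditions on them."""
    universe: list = field(default_factory=list)
    conditions: list = field(default_factory=list)

    def introduce(self, referent, *predicates):
        # Add a new discourse referent and the predicates that must hold of it.
        self.universe.append(referent)
        self.conditions.extend(f"{p}({referent})" for p in predicates)


def resolve_definite(drs, referent, *predicates):
    """Introduce a definite; return False if the Familiarity Condition fails."""
    antecedents = list(drs.universe)   # referents available before the definite
    drs.introduce(referent, *predicates)
    if not antecedents:
        return False                   # nothing to equate the definite with
    # Simplifying assumption: link to the most recently introduced referent.
    drs.conditions.append(f"{referent}={antecedents[-1]}")
    return True


# (9a) "A man entered. He sat down."
k = DRS()
k.introduce("x", "man", "enter")       # indefinite: no familiarity requirement
felicitous = resolve_definite(k, "y", "sat-down")
print(k.universe, k.conditions, felicitous)
```

On this encoding the indefinite simply extends the universe, whereas the pronoun additionally contributes the equation y=x, which is what distinguishes (12) from (10)'.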
How then does this theory explain the infelicity of (9b)? The
answer is in the DRT structures for quantified sentences (including
sentences with adverbial quantifiers - this will become important later
on). Kamp (1981) argues that sentences which contain a quantifier give
rise to a sub-DRS within the main DRS. The extent of the sub-DRS is
defined by the scope of the quantifier. Crucially the DRs in this
sub-DRS are not accessible for anaphoric linkage from the main DRS:
(13)
x
man(x)
-->
enter(x)
If we were to continue the first sentence of (9b) with the second,
then the felicity condition on pronouns (11) will require the DR of the
pronoun to be anaphorically linked with a DR in the main DRS. But
there is no DR in the main DRS, leading to the correct prediction of
infelicity of this sentence with respect to this discourse. I have followed
Kamp's early notation for universal quantification here, using an
implication sign. In actual fact it will turn out that we need to be
specific about the quantificational relation between the two sub-DRSs
in structures like (13) - see Kamp and Reyle (1993) for discussion.
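The accessibility restriction can be illustrated with a small sketch. The encoding is an assumption made for exposition: each DRS records the DRS that encloses it, and the nuclear scope of the quantifier is modelled as nested inside the restrictor, so that it sees the restrictor's referents. The names `DRS`, `parent` and `accessible` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DRS:
    universe: set = field(default_factory=set)
    parent: Optional["DRS"] = None     # the enclosing DRS, if any


def accessible(drs):
    """Referents visible from `drs`: its own universe plus those of enclosing DRSs."""
    found = set()
    while drs is not None:
        found |= drs.universe
        drs = drs.parent
    return found


# (9b) "Every man entered": x is introduced inside the quantifier's sub-DRS.
main = DRS()
restrictor = DRS(universe={"x"}, parent=main)   # man(x)
scope = DRS(parent=restrictor)                   # enter(x)

print(accessible(scope))   # a pronoun inside the scope can see x
print(accessible(main))    # a pronoun in the main DRS cannot
```

A pronoun in a following sentence is interpreted in the main DRS, where the set of accessible referents is empty, so the Familiarity Condition cannot be satisfied.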
Some types of DP always enter their discourse referent in the main
DRS though, even if they are in the scope of a quantifier. Examples are
proper names and usually definites including demonstratives. So the
following is a felicitous discourse:
(14)
Every lion in captivity lived in this zoo. We thought it was
secure, but they've all escaped now.
Here it refers to the zoo, which is possible because demonstratives
enter their discourse referents in the main discourse and therefore the
felicity condition on it can be met. This sentence also illustrates that
the plural pronoun they seems to be able to pick up a group constructed
out of the lions mentioned. The anaphoric properties of plural pronouns
lie outside the scope of this paper (but see Kamp and Reyle 1993), but
note that every lion triggers singular not plural agreement and can be
anaphorically picked up by a singular pronoun in its scope, illustrating
that something extra is going on with plural pronoun anaphora:
(15)
Every lion in captivity wanted its freedom/knew that it needed to
be free.
4.2 The Interpretation of Preposed Subjects
Preposed subjects of unaccusatives in Italian¹ appear to behave just like
other discourse anaphors, even when they contain a cardinal (indefinite)
like tre 'three'. Consider the following dialogues:
(16) Questioner: I hear you have lots of cats and dogs staying with
     you just now. How are they?
     Speaker: Tre gatti sono scappati.
              three cats be-3p escape-pp-3p
              'Three cats have escaped.'
              #Sono scappati tre gatti.
              be-3p escape-pp-3p three cats
1 The judgements here are from Standard Northern Italian.
(17) Questioner: How are you feeling?
     Speaker (works in a zoo):
              Sono preoccupato. Sono scappati tre leoni.
              I'm worried. be-3p escape-pp-3p three lions
              #Sono preoccupato. Tre leoni sono scappati.
              I'm worried. three lions be-3p escape-pp-3p
With the unaccusative verb it appears that when there is a discourse
referent available for tre leoni 'three lions' then pre-verbal position is
the only one allowed. When there is no discourse referent available,
then only post-verbal position is felicitous. So far, this squares with
Pinto's report and one might imagine an account based on previous
mention.
With subjects of unergatives, only pre-verbal position is allowed.
We see this below:
(18)
Questioner: I hear you have lots of cats and dogs staying with
you just now. Have they been up to anything funny?
Speaker: Si, ieri tre gatti hanno sternutito.
yes, yesterday three cats have-3p sneeze-pp
'Yes, yesterday three cats sneezed.'
(19)
Questioner: Have you seen anything funny lately?
Speaker: Si, ieri
tre gatti hanno sternutito lungo la strada.
yes yesterday three cats have-3p sneeze-pp along the street
'Yes, yesterday I saw three cats sneeze on the street.'
Note that in contrast to (17) the pre-verbal position is fine whether
there is an available discourse referent or not. Again this seems to
follow Pinto's claim that D-linking is irrelevant for unergative subjects.
However, there is an argument that DRT style accessibility is
actually what's at stake here, rather than just previous mention in the
discourse. Consider the following two discourses:
(20)
a. Ogni volta che le pop-stars e i divi del cinema che vivono al
   numero 27 ritornano a casa, mi emoziano.
   'Every time the pop-stars and film stars that live at number
   27 come home, I get excited.'
b. Ieri, tre pop-stars sono arrivate.
   yesterday, three pop-stars be-3p arrive-3pf
   'Yesterday, three of the pop-stars came back.'
b'. Ieri, sono arrivate tre pop-stars.
   yesterday, be-3p arrive-3pf three pop-stars
   'Yesterday, three pop-stars arrived.'
   (must be different pop-stars from those living at no. 27)
(21)
a. Ogni volta che delle pop-stars vengono nella mia strada, mi
   emoziano.
   'Every time pop-stars come to my street, I get excited.'
b. #Ieri, tre pop-stars sono arrivate.
   yesterday, three pop-stars be-3p arrive-3pf
   'Yesterday, three of the pop-stars came back.'
b'. Ieri, sono arrivate tre pop-stars.
   yesterday, be-3p arrive-3pf three pop-stars
   'Yesterday, three pop-stars arrived.'
In both of these sentences we have an adverbial quantifier which
will give rise to sub-DRSs in DRT. This predicts that discourse
referents that are inside the scope of the quantifier are not accessible to
those outside. In (20a), however, we have a definite, which is entered in
the topmost discourse and a pre-verbal subject in (20b) is well-formed.
A post-verbal subject (20b') is also well-formed, on the condition that
the pop-stars referred to are not the ones previously introduced (the
familiar definiteness effect). In (21a), the discourse referent of pop-stars
is introduced by an indefinite; it will therefore be interpreted within the
scope of the quantificational adverb predicting that it is not accessible
for anaphoric reference. Given this, to predict the infelicity of (21b), we
simply need to say that whatever is in the specifier of IP falls under the
Familiarity Condition given above in (11) and repeated here.
(22) Suppose something is uttered under the reading represented by φ
and the discourse preceding φ has resulted in a discourse structure
K. K contains a set of discourse referents U. Then for every chain
C in φ it must be the case that:
Familiarity Condition: if C is definite or in Spec, IP² then
there is a discourse referent x associated with C and x = y, y ∈ U;
otherwise φ is infelicitous with respect to K.
The point about (21) is that (21a) creates a sub-discourse X, the
discourse referents of which are not accessible except within X. (21b),
however, is outside X, but contains an element in Spec, IP. There is
no discourse referent in U which the discourse referent of pop-stars can
be equated with. (21b) is therefore infelicitous with respect to (21a).
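The contrast between (20) and (21) can be sketched as a check of the revised Familiarity Condition against the referents available in the main discourse. This is a toy encoding under stated assumptions: the dictionary keys `noun`, `definite` and `in_spec_ip` and the function `satisfies_familiarity` are hypothetical, and a possible antecedent is identified simply by matching the descriptive predicate.

```python
def satisfies_familiarity(chain, main_universe):
    """Revised Familiarity Condition: definites and Spec IP elements need an antecedent."""
    needs_antecedent = chain["definite"] or chain["in_spec_ip"]
    if not needs_antecedent:
        return True
    # Assumption: an antecedent is any accessible referent bearing the same predicate.
    return any(chain["noun"] in preds for preds in main_universe.values())


# Referents in the main discourse after the first sentence of each example.
after_20a = {"y": {"pop-star"}}   # definite in (20a) is entered in the main discourse
after_21a = {}                    # indefinite in (21a) stays inside the sub-discourse

preverbal = {"noun": "pop-star", "definite": False, "in_spec_ip": True}
postverbal = {"noun": "pop-star", "definite": False, "in_spec_ip": False}

print(satisfies_familiarity(preverbal, after_20a))    # (20b)
print(satisfies_familiarity(preverbal, after_21a))    # (21b)
print(satisfies_familiarity(postverbal, after_21a))   # (21b')
```

The sketch covers only the felicity of the Spec IP subject; it does not model the requirement in (20b') that the post-verbal pop-stars be distinct from those already introduced.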
4.3 Mapping between Syntax and DRS
Note that the condition x=y is essentially non-linguistic. Definites
behave in exactly the same way with respect to anaphora and deixis
(Karttunen 1976) so if we wish to capture this fact we need to assume
that such a condition can be entered into the DRS non-linguistically, by
an act of ostension, or something similar. This point is crucial, in that
it means that there must be independent well-formedness conditions on
the construction of DRSs.
2 I have formulated the Familiarity Condition here using the notion Spec
IP. This is only for reasons of exposition, and readers will recognise that
there is an issue as to exactly what kind of syntactic description should go
in here so as to capture the widest variety of data. In Adger 1994 I developed
the notion of Agr-Chain, which is a chain with a link in Spec AgrP and
argued that by using this notion in the Familiarity Condition one could
unify the interpretative effects that arise with subject placement,
scrambling, clitic-doubling, wh-agreement and case.
The picture of the grammar built up here claims then that there is
some set of well-formedness conditions on DRSs, and an independent
set of well-formedness conditions on terminal syntactic structures
(TSS), where by terminal syntactic structures I mean structures which
satisfy all of the constraints of the syntax. TSS then is LF or SS
depending on which you take to be the input to interpretation. Felicity
conditions like the Familiarity Condition are essentially relations
between DRSs and TSSs. Further mapping principles link other aspects
of TS structure to aspects of DRS structure (possibly also stipulated in
terms of chains). A minimal theory would relate head-chains to
predicates in the DRS, and XP chains to DRs.
Are all of these mapping principles of the form F(TSS)=DRS? Are
there any constraints the other way round? That is, are there mapping
principles which are of the form F(DRS)=TSS? I would like to suggest
that there is at least one and that it is this principle rather than Case
which motivates movement of a subject of an unaccusative to Spec IP
position. This principle essentially claims that the non-linguistically
introduced information in a DRS must also be able to be linguistically
introduced.
Assume that the (infinite) set of DRSs given by the DRS well-formedness
conditions is P, and the set of TSSs given by the syntax is L, then:
(23) Effability: For every member p of P there is a corresponding
member l of L,
where l corresponds to p iff for every felicity condition F, F(l) = p.³
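Effability can be illustrated over small finite sets, which sidesteps the fact that P is infinite. The following is a sketch under that assumption; the function `ineffable`, the string labels for structures, and the single toy felicity condition `placement` are all hypothetical.

```python
def ineffable(drss, tsss, felicity_conditions):
    """Return the DRSs that have no corresponding TSS (a single pass over both sets)."""
    def corresponds(l, p):
        # l corresponds to p iff every felicity condition F maps l to p.
        return all(F(l) == p for F in felicity_conditions)

    return [p for p in drss if not any(corresponds(l, p) for l in tsss)]


def placement(tss):
    # Toy felicity condition: a Spec IP subject maps to a discourse-anaphoric DRS.
    return "anaphoric subject" if tss == "subject in Spec IP" else "novel subject"


P = ["novel subject", "anaphoric subject"]

print(ineffable(P, ["subject post-verbal"], [placement]))
print(ineffable(P, ["subject post-verbal", "subject in Spec IP"], [placement]))
```

With only the post-verbal structure available, the discourse-anaphoric DRS has no syntactic counterpart; adding the Spec IP structure removes the violation.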
5. Some Syntax
5.1 Movement and Economy
Chomsky (1991, 1992, 1995) has recently proposed that a number of
grammatical principles might be reduced to principles governing the
3 Fabio Pianesi has pointed out to me that this definition as it stands will
not halt. This problem can of course be solved trivially by requiring a
single pass in whatever algorithm is used to implement it.
complexity of derivations and representations, where complexity is to
be theoretically pinned down. For example, the principle of 'least-effort'
requires that a derivation must be as 'short' as possible, deriving
the effects of the ECP under a relativised minimality view of the latter
(Rizzi 1990). A further principle of Economy prohibits operations
which are not needed to enable the derivation to successfully converge.
For my purposes, it is sufficient to propose a rather general theory of
Economy, of the following sort:
(24) Economy:
Minimise computational operations
Computational operations are copying, insertion and deletion as in
the earliest versions of transformational grammar (Chomsky 1955). I
will assume that movement consists of (one or more) copying
operations, followed by a deletion operation, as argued in Chomsky
(1992). Note that deletion may take place at TSS to satisfy the
requirements of Full Interpretation (as discussed in Chomsky 1992 for
reconstruction effects) or at PF (perhaps for cases of ellipsis, etc.).
Deletion is of course subject to recoverability of content.
This theory of Economy should be construed globally, in the sense
of Reinhart (1994) and Adger (1995). That is, a derivation leading to a
particular TSS will be deemed to be more expensive than another
derivation leading to the same structure if the former consists of more
computational operations. It is in this sense that computational
operations should be minimised.
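The global construal of Economy can be sketched as a comparison of derivations that lead to the same TSS, counting computational operations. This is an illustrative sketch only: representing a derivation as a list of operation names, and the helpers `move` and `cheapest_derivations`, are assumptions made here.

```python
def move():
    """One movement step: a copying operation followed by a deletion."""
    return ["copy", "delete"]


def cheapest_derivations(derivations):
    """Map each TSS to the shortest operation sequence that derives it."""
    best = {}
    for tss, operations in derivations:
        if tss not in best or len(operations) < len(best[tss]):
            best[tss] = operations
    return best


derivations = [
    ("tre leoni sono scappati", move()),            # one movement
    ("tre leoni sono scappati", move() + move()),   # same TSS, superfluous step
    ("sono scappati tre leoni", []),                # subject stays in situ
]

best = cheapest_derivations(derivations)
print(best)
```

Derivations are compared only when they yield the same structure, so the in-situ derivation does not block the raised one merely by being shorter.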
5.2 Capturing the correlations
Let us return to our original paradigm (repeated here):
(25) Tre leoni hanno sternutito.
     three lions have-3p sneeze-pp
     'Three lions have sneezed.'
(26) *Hanno sternutito tre leoni.
     have-3p sneeze-pp three lions
(27) Tre leoni sono scappati.
     three lions be-3p escape-pp-3p
     'Three of the lions have escaped.'
(28) Sono scappati tre leoni.
     be-3p escape-pp-3p three lions
     'Three lions have escaped.'
Ideally we would like to capture this with a minimal theory of
Case, something like the following:
(29)
V assigns Case to its complement, and not to its specifier.
I assigns Case to its specifier.
This theory predicts that an unaccusative subject gets Case in its
theta-position (complement of V position in (28)), and an unergative
subject must move to Spec IP ((25) - because it cannot get Case in
Spec VP, assuming that is its theta-position (Koopman and Sportiche
1991)). Ignoring Economy, it also predicts that a Spec IP subject of an
unaccusative verb is well-formed ((27) - since it can receive Case there
from I), and that a post-verbal subject of an unergative is bad (since it
doesn't get Case - (26)). However, given Economy, why will an
unaccusative subject ever raise to Spec IP if it can get Case in its theta
position?
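The predictions of (29), ignoring Economy, can be laid out as a small table-driven check. The position labels and the function `licensed` are hypothetical, and treating the post-verbal unergative subject in (26) as remaining in Spec VP is an assumption made for the sketch.

```python
# (29): V assigns Case to its complement; I assigns Case to its specifier.
CASE_POSITIONS = {"comp_V", "spec_IP"}

# Theta-positions of the subject for each verb class.
THETA_POSITION = {"unaccusative": "comp_V", "unergative": "spec_VP"}


def licensed(verb_class, surface_position):
    """A subject is licensed if it surfaces in its theta-position or Spec IP with Case."""
    if surface_position not in {THETA_POSITION[verb_class], "spec_IP"}:
        return False
    return surface_position in CASE_POSITIONS


paradigm = {
    "(25)": licensed("unergative", "spec_IP"),
    "(26)": licensed("unergative", "spec_VP"),
    "(27)": licensed("unaccusative", "spec_IP"),
    "(28)": licensed("unaccusative", "comp_V"),
}
print(paradigm)
```

Case alone therefore admits both (27) and (28), which is exactly why Economy raises the question of what motivates (27).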
The answer Belletti (1988) proposes is that the Case assigned by
unaccusatives is always optional. When the option is not taken to
assign Case, then the subject must raise to Spec IP to get Case there.
There is an alternative solution which does not involve
complicating Case theory in this way. An unaccusative subject will
raise if there is some further well-formedness principle that it must
obey. Now, note that if (27) were ill-formed there would be no TSS
corresponding to the DRS where the DR of the subject is a discourse
anaphor. This is in violation of Effability, which requires that for each
DRS there be a corresponding TSS. Effability then requires that (27) be
a possible TSS of Italian (note that to make this story go through, we
have to assume that TSS is S-Structure for Italian'. I suspect that it's
S-Structure for all languages).
To see how this works in more detail consider the schematic
structures of (27) and (28):
(30) a. escape three lions    (nothing in Spec IP)
     b. three lions escape    (three lions in Spec IP)
The question is why (30b) is well-formed. (30a) corresponds to a
DRS with a single plural discourse referent (say x) and three conditions
on that discourse referent: lion(x), three(x) and escape(x). This
DRS is given independently by the DRS well-formedness conditions.
(30b) is a possible TSS because Effability requires there to be a
TSS corresponding to a DRS where the escaping lions are anaphoric to
some previously established lions. This will only be true if there is a
TSS of which the Familiarity Condition holds for the three lions. This
in turn will only be true if the DP three lions is definite or is in Spec
IP. But surely this predicts that we can simply make the DP definite,
rather than move it to Spec IP.
This conclusion certainly follows given what we have said so far.
However, the felicity conditions on definites and those on Spec IP
elements appear to be different. Crucially, it is possible to
accommodate (that is to use a definite which hasn't itself been
introduced in the discourse but is inferable from the discourse) from a
definite in post-verbal position but not from pre-verbal position (see
also Anagnostopoulou 1994 who first pointed out similar facts
concerning clitic doubling in Modern Greek, and see Delfitto 1994 for
scrambling of objects in Dutch):
(31)    Ieri ho visto un film su Fellini,
        'Yesterday I saw a film about Fellini,'
     a. e oggi è arrivato il regista a casa mia.
        and today be-3s arrive-3s the director to my house
        'and today the director (of the film) arrived at my house.'
     b. e oggi il regista è arrivato a casa mia.
        and today the director be-3s arrive-3s to my house
        'and today the director (Fellini) arrived at my house.'
Given this we need to tease apart the Familiarity Condition into
two sections, where one part regulates Spec IP elements and the other
regulates definites.
Then Effability forces the syntax to generate (27), even though (28)
is well-formed.
The next question is why (27) is only felicitous with a discourse
anaphoric reading for its subject, while (25) is felicitous with a
discourse anaphoric reading or not. The answer to this question is the
interaction of Economy with Effability.
Note that there are actually two chains that result from raising an
unaccusative subject into Spec IP (30b) under the copy-and-delete view
of movement outlined above, depending upon which copy is deleted. I
will for the moment stipulate that (30b) itself is not a TSS and that
either the link in Spec IP or the link in Compl VP must be deleted.
This requirement is probably derivable from the different Mapping
Conditions on VP internal and VP external objects, but I shall not go
into that here (see Adger 1994, 1995; Diesing 1992). If we delete the
copy in complement of V position we have an element in Spec IP,
while if we delete the copy that is in Spec IP position we obviously
have nothing in Spec IP:
(32) a. <a lion> escape a lion
     b. a lion escape <a lion>
     (angle brackets mark the deleted copy)
This would appear to predict that a preposed subject of an
unaccusative would have two readings, since there appear to be two
TSSs for this sentence, contrary to fact.
However, note that the derivation of (32a), the variant where three
lions is not discourse anaphoric, involves two computational operations:
Copy α, followed by Delete α. Note also that the result of this two-step derivation is exactly the result of not raising the subject in the first
place. Given the theory of Economy discussed above, we predict that
(32a) is not actually a TSS for (30b). So a raised subject of an
unaccusative verb does not have a non-discourse anaphoric reading,
because the derivation that would give rise to that reading is blocked by
the existence of an alternative structure which involves fewer
computational steps.
In contrast consider the schematic form of an unergative:
(33) a. three lions sneeze
     b. * sneeze three lions
The simple Case theory outlined in (29) rules out (33b). Given the
discussion above, however, we still have two putative TSSs for (33a):
(34) a. <three lions> sneeze three lions      (nothing in Spec IP)
     b. three lions sneeze <three lions>      (three lions in Spec IP)
Note that there is no competing derivation in this case for (34a)
since (33b) is ruled out anyway. This predicts that the subject of an
unergative verb will have both readings, as it does.
5.3 A potential problem
The system outlined so far predicts that when movement to a position
is optional then a structure involving the moved element will have a
different interpretation from the structure involving the in-situ element.
Specifically, with subject placement, it predicts that when a VP internal
position for the subject is available, as well as Spec IP, then Spec IP
subjects will be discourse anaphoric. An empirical problem for this
prediction appears to arise in Catalan. In Catalan the canonical subject
position for all verbs appears to be VP-internal (Vallduví 1993). An
unergative verb like trucar, 'phone', allows a post-verbal subject and is
felicitous in discourses where the subject is discourse anaphoric or not
(again controlling for right dislocation):
(35) a. Deuran trucar alguns convidats, oi?
        must-3p call some guests, right
        'Some (of the) guests will probably call, right?'
Note that there is no definiteness effect here, even though the
subject is VP internal. This contrasts with Italian, suggesting that the
definiteness effect in Italian relates to a null expletive in subject
position, which is not present in Catalan. The subject can also be
preposed:
(35) b. Alguns convidats deuran trucar, oi?
        some guests must-3p call, right
        'Some (of the) guests will probably call, right?'
Unfortunately, there appears to be no interpretational difference
here, contrary to the predictions of the theory.
However, there is an independent explanation for this effect.
Catalan actually seems to have two subject positions: Spec IP, and an
IP adjoined position. Vallduví (1992) has argued that Spec IP in
Catalan is reserved for quantificational elements on a weak reading (that
is, in our terms, non-discourse anaphoric). Vallduví argues that referential
elements are barred from this position. The IP adjoined position, on the
other hand, corresponds to the subject position in Italian and must be
interpreted as discourse anaphoric.
6. Conclusion
This paper has argued that subject placement in Italian is not entirely
determined by Case, but rather that it is also partly determined by
interpretational considerations. The crucial step in the argument is that
there are independent well-formedness conditions on discourse structures
and that the apparent interpretational effects on preposed subjects of
unaccusatives in Italian are actually effects that derive from judgements
of felicity in discourse. The apparent optionality of syntactic movement
is in fact conditioned by an interface constraint that requires each well-
formed DRS to have a set of corresponding terminal syntactic
structures. These considerations interact with a notion of global
Economy to derive the correlation between subject placement,
optionality and interpretation.
This conclusion actually reinforces the autonomy of syntax rather
than threatens it. It removes any features from the syntax which have
purely interpretational motivation and leaves a simple theory of
argument licensing which is purely structural.
REFERENCES
Adger, D. (1994) Functional heads and interpretation, Ph.D., University of
Edinburgh.
Adger, D. (1995) Meaning, Movement and Economy. In R. Aranovich, W.
Byrne, S. Preuss, and M. Senturia (eds.), Proceedings of WCCFL 13.
Stanford: CSLI.
Anagnostopoulou, E. (1994) On the representation of clitic doubling in
Modern Greek, ms., University of Tilburg.
Belletti, A. (1988) The Case of Unaccusatives. LI 19.1-34.
Burzio, L. (1985) Italian Syntax. Dordrecht: Reidel.
Chomsky, N. (1955) The Logical Structure of Linguistic Theory. New York:
Plenum.
Chomsky, N. (1989) Some notes on economy of derivation and
representation MIT Working Papers in Linguistics 10.43-74.
Chomsky, N. (1992) A minimalist program for linguistic theory. In
Occasional Papers in Linguistics, 1. MIT.
Chomsky, N. (1995) Bare Phrase Structure. In G. Webelhuth (ed.)
Government and Binding Theory and the Minimalist Program.
Oxford: Blackwell. 385-439.
Delfitto, D. (1994) Beyond specificity: Proposals on cliticization and
scrambling. Handout from talk given at the Third Plenary ESF
Conference on Language Typology, le Bischenberg.
Diesing, M. (1992) Indefinites. Cambridge, Mass: MIT Press.
Heim, I. (1982) The Semantics of Definite and Indefinite Noun Phrases,
Ph.D., University of Massachusetts, Amherst.
Hoop, H. d. (1992) Case Configuration and Noun Phrase Interpretation,
Ph.D., University of Groningen.
Kamp, H. (1981) A theory of truth and semantic representation. In J. A. S.
Groenendijk, T. M. V. Janssen and M. B. J Stokhof (eds.) Formal
Methods in the Study of Language. Amsterdam: Mathematical Center
Tracts.
Kamp, H. and U. Reyle (1993) From Discourse to Logic. Dordrecht: Kluwer.
Karttunen, L. (1976) Discourse referents. In J. McCawley (ed.) Syntax and
Semantics 7. New York: Academic Press. 363-385.
Pinto, M. (1994) Subjects in Italian: distribution and interpretation, ms.
University of Utrecht.
Perlmutter, D. (1979) Impersonal passives and the unaccusative hypothesis
BLS IV.157-189, University of California.
Pesetsky, D. (1987) Wh-in-situ: movement and unselective binding. In E. J.
Reuland and A. G. B. ter Meulen (eds.) The Representation of
Indefiniteness. Cambridge, Mass: MIT Press.
Reinhart, T. (1994) From syntactic encoding to interface strategies, ms,
OTS, Utrecht.
Rizzi, L. (1990) Relativized Minimality. Cambridge, Mass: MIT Press.
Vallduví, E. (1992) A preverbal landing site for quantificational operators.
In Catalan Working Papers in Linguistics, Barcelona.
Vallduví, E. (1993) The informational component, Ph.D., University of
Pennsylvania.
COLLABORATIVE REPAIR IN EFL CLASSROOM
TALK
Zara Iles
Department of Language and Linguistic Science
University of York
1. Preface
This paper explores some of the benefits to be gained by adopting a
conversation analysis (CA) perspective in an examination of 'English as
a foreign language' (EFL) classroom talk. The EFL classroom is a
context in which there is a heightened potentiality of problematic talk,
e.g. errors, misunderstandings and non-communication. The need for
REPAIR (Schegloff et al 1977) is therefore situationally endemic. In
everyday talk, between participants who hold mutual assumptions of
common ground and shared knowledge, repair has been shown to be an
activity which is executed quickly as repair trajectories can necessitate
certain interactional investments. EFL teachers and learners are
differentially capable of dealing with and resolving trouble-at-talk
situations because of the unequal knowledge distribution that exists
between them. Some of the ways in which talk created by EFL
participants is collaboratively built in order to address this particular
state of affairs are discussed in this paper.
It is seen that differences in the agenda of the lesson at hand, e.g.
involving a focus on language form or creation of conversation, are
reflected in the interactional structure. Forms of correction are shown to
impose different costs on the interaction, lesson agenda and for second
language learners. Teachers are seen to be orienting to the status of
other-correction as the least preferred repair trajectory (Schegloff et al.
1977), by a) pursuing repair initiation, b) withholding correction and c)
adopting various camouflages which serve to downgrade the dispreferred
activity of other-correction.
York Papers in Linguistics 17 (1996)
23-51
© Zara Iles
1.1 Introduction
This paper arises as part of a larger investigation which examines the
ways, and the extent to which, matters pertaining to the development of
language competencies are worked on by EFL teachers and learners in
their talk. One such matter concerns errors and their treatments, one of
the major businesses in which EFL classroom participants routinely
engage. In spite of the fact that correction is an activity which is
customary in the EFL context, "so little is known about the nature of
correction as it occurs in the classroom and its effect on the learning
process" (Pica 1994:70). Error and error correction are important in the
characterisation of the nature of talk generated between EFL teachers and
learners, and as such, a valid and accurate account of this aspect of EFL
talk is of primary concern to second language acquisition (SLA)
research.
In SLA research deciding on a definition of 'error' and identifying
errors has proved problematic. An error is typically, and restrictively,
defined as "the production of a linguistic form which deviates from the
correct form" (Allwright and Bailey 1991:84); the correct form being
that of the native-speaker 'norm'. Lennon (1991) concludes that:
'no universally applicable definition can be formulated,
and what is to be counted as an error will vary according to
situation, reference group, interlocutor, mode, style,
production pressures' (Lennon, 1991:331)
A CA approach avoids such categorisation and analyses which result
from an investigator's own intuitive understanding of what is happening
in an instance of talk. It gives rise to an analysis which is based on
observation of the orientations of the participants themselves in
creating, and making sense of, their talk. The CA concept of repair
allows for a broader perspective of error and correction than what is
currently prevalent in SLA research. Repair is the structural and
organisational mechanism in conversation that allows speakers to deal
with troubles in speaking, hearing or understanding ongoing talk
(Schegloff et al 1977). The term thus refers to a wider range of events
than simply that of correction, which is just one possible realisation of
repair. Repair organisation offers all-inclusive and thus potentially more
useful notions of the terms 'error' and 'correction', referring to all
instances of problematic talk and the trajectories which are involved in
its treatment. Construed in this fashion, errors can thus be seen as being
more than the production of a deviant form by the learner, and hence
specifically the learner's problem; errors and their repair constitute an
interactional problem which EFL participants must jointly overcome,
and which involves them in the regeneration of their talk after trouble or
breakdown.
Repair entails making some aspect of language the focus of the talk
to one degree or other, i.e. correction becomes the explicit activity of
the talk or is a 'by-the-way-occurrence' and is dealt with swiftly
(Jefferson 1987). Repair sequences are environments in which the
identities of the participants as 'teacher' and 'learner' are made
interactionally relevant and so manifested in the details of the talk.
Repair trajectories are also environments within which knowledge
(possibly new knowledge) about the target language is made available
for the learner by the teacher. Language is demonstrated, experienced and
worked on by both teacher and learner in repair trajectories. As will be
shown in this paper, the structure and design of repair trajectories mean
that the extent of this 'working on talk' is negotiated. A detailed
examination of these features of EFL interaction is therefore likely to
yield important insights into the nature of second language (L2)
development and the nature of its relationship to interaction.
This paper concentrates primarily on other-correction, the least
preferred trajectory in repair organisation in everyday talk. Schegloff et
al (1977) demonstrate that mundane conversation is 'structurally
skewed' so that self-repair opportunities, where the originator of the
trouble repairs his/her own talk, dominate over other-repair
opportunities, where a co-participant actions the repair. Other-corrections are the forms of repair which Schegloff et al suggest operate
as:
a device for dealing with those who are still learning or
being taught to operate with a system which requires, for its
routine operation, that they be adequate self-monitors as a
condition of competence. It is, in this sense, only a
transitional usage, whose supersession by self-correction is
continuously awaited. (1977:381)
The paper reveals how the recurrent features of repair observed in
everyday conversation between native speakers, are employed in a
'specialised' way by participants in the context of the EFL classroom. It
further reveals how the forms of repair employed by the EFL teachers,
which orient to the maximisation or minimisation of explicit error
correction, reflect the nature and the agenda (local and global) of the
teaching activity. It also shows that the extent to which error correction
becomes the overt business of the talk, or not, can, potentially, be
controlled by both teacher and learner. For example, the design of
teacher other-correction may serve to downgrade the activity in order to
interrupt the ongoing talk as minimally as possible. Various
camouflaging features drawn from observing teacher other-correction are
highlighted in the extract analyses in section 4. The interaction in
which EFL participants are engaged can be designed to either give
priority to the business of 'creating conversation', or, the correction of
talk and conscious analysis of the target language.
The account given in this paper is developed from observations
made by Jefferson (1987) concerning explicit and embedded other-repair
and subsequent projected accountings in normal everyday conversation.
Examination and discussion of these repair trajectories is presented in
Section 2. Instances of these two forms of other-correction from
naturally-occurring EFL classroom data are described and discussed in
Section 4. It is demonstrated that repair strategies adopted by EFL
interactants can synchronously, a) attend to the nature, or expedite the
achievement, of different goals to be attained in EFL lessons, and b) be
sensitive to the linguistic, cognitive and interactional loads placed on
'less than fully competent' participants.
2. Exposed and Embedded Correction
Jefferson (1987) identifies and describes two forms of other-correction
observable in everyday talk which have different interactional
consequences; exposed and embedded correction. Jefferson demonstrates
that correction by other-speaker is an activity which can either be a)
accomplished explicitly, where the correction becomes the interactional
business, or, b) accomplished without it emerging to the conversational
surface. Exposed correction has an interactional cost as the ongoing talk
is interrupted and correction becomes the concern of the talk. It is
demonstrated that with exposed forms of correction:
'correcting can be a matter of, not merely putting things to right ... but
of specifically addressing lapses in competence and/or conduct'
(Jefferson 1987:88).
After exposed correction, giving an account of error is potentially
relevant. Exposed correction may therefore be a means of specifically
bringing a participant to account for their errors. On the other hand,
embedded other-correction is a way of handling problematic talk without
invoking the apparatus of repair, i.e. initiation attempts, repair markers,
hesitation, lengthy trajectories and so on, which lead to the successful,
or otherwise, treatment of the repairable. Embedded correction does not
project accountings and does not discontinue the ongoing talk.
Correction does not become the interactional business and therefore
demands less interactional investment, less time, and talk stays on
topic. The following examples A-D from Jefferson's 1987 paper
illustrate these two types of other-correction forms:
(Example A): Other-correction in next-turn with no overt markers (in
line 1) and a minimal receipt of correction (in 2). The repairable item is
picked out by Norm and an isolated repair, without surrounding
syntactic context or explicit repair markers, is performed. The repair is
imitated by Larry, marked with stress and acknowledged with an
explicit receipt; 'Right'. The correction does not become topicalised, is
executed quickly and so the talk is minimally interrupted. The redoing
and completion of the repairing is signalled with a minimal 'M-hm'
receipt from Norm who actioned the repair.
   Larry: They're going to drive back Wednesday.
1  Norm:  Tomorrow.
2  Larry: Tomorrow. Right.
3  Norm:  M-hm,
   Larry: They're working half day.
(Example B): Other-correction in next-turn with no overt markers (in 1)
and an embedded receipt of repair (in line 2). No account of the error is
given by Milly and she continues on topic. In next-turn after the
trouble-source turn an other-correction is actioned by Jean. The
repairable is isolated, redone without interval or explicit repair markers.
The initial consonant is stressed and this is imitated by Milly in her
subsequent redoing. Unlike in example A there are no acknowledgement
markers of the repair activity from either speaker. The correction
proceeds as a 'by-the-way-occurrence' and does not become the explicit
focus of the talk.
   Milly: ...and then they said something about
          Kruschev has leukemia so I thought oh it's
          all a big put on.
1  Jean:  Breshnev.
2  Milly: Breshnev has leukemia. So I didn't know
          what to think.
(Example C): An example of other-correction in next-turn with no overt
markers (in 1) and an explicit receipt of correction (from 2 onwards). Jo
actions the repair in line 1 without delay and without explicit repair
markers. The repair is redone by Pat and she then maintains the repair as
the focus of the talk by doing an accounting. Correction becomes the
concern of the talk and there is some delay to the topic. The repair
activity is made the source of a joke, which orients to the status of
other-correction as a dispreferred activity and is a face-saving device.
   Pat:   ...the Black Muslims are
          certainly more provocative
          than the Black Muslims ever
          were.
1  Jo:    The Black Panthers.
2  Pat:   The Black Panthers. What'd I
   Jo:    You said the Black Muslims
          twice.
   Pat:   Did I really?
   Jo:    Yes you di:d but that's
          alright I forgive you.
In examples A, B and C, the repairable is isolated in the correction turn
i.e. there is no surrounding syntactic context. There are no explicit
repair markers and the repair is imitated immediately by the originator
of the trouble source in the following turn. The repair is executed
quickly and there is little interruption to the ongoing talk. The
examples also exhibit various behaviours by which participants
acknowledge that repair is being accomplished, e.g., intonational
highlighting of the repair elements and various minimal receipts. These
same features are found in the repair sequences from EFL lessons
discussed below in section 4. These sequences were taken from lessons
or points in lessons where making correction the focus of talk is not the
primary agenda. Explicitly packaged, exposed correction would interrupt
the topic and potentially take over as the focus of the talk. The repair
structure of examples A and B ensures that a) talk is repaired b) a
redoing by the originator of the trouble-source is projected and
accomplished, hence this can be regarded as an orientation to self-repair
preference in the last resort, and c) the cost of repair activity to the
interaction is limited.
The two forms of other-correction highlighted in the examples
above do not correspond to two symmetrically distinct modes of
correction. Correction may be explicitly actioned by one participant, but
be accepted in an embedded form by the co-participant, thus ignoring the
potentially projected accounting for error. Likewise, a correction may
take an embedded form but be brought to the conversational surface by
an explicit receipt. This phenomenon is illustrated in the following
example in which participants deal with racist language.
(Example D): Other-correction in overlap (in 1) with explicit repair
markers and embedded receipt of correction (in 2).
   Jim:    Like yesterday there was a track meet at
           Central. Reese was there. Isn't that a
           reform school,
           (0.4)
   Jim:    Reese?
   Roger:  Ye:s.
           (.)
   Ken:    [Yeah.
   Jim:    [Buncha niggers and everything?
   Ken:    Yeah.
   Jim:    He went right down on that fie:ld
           (0.3)
           like a nigger and all the guys
           (mean) all these niggers are a:ll
           [up there in   ]
1  Roger:  [You mean Ne]gro: don't you.
           (.)
   Jim:    Well and [they're all-h-u]=
   Ken:             [(hunh
   Jim:    =They['re they're all up in the
           stands you know all
   Ken:         [And .1i:g,
           (.)
2  Jim:    The:se guys (are) completely
           radical. I think I think Negroes are
           cool gu:ys you know,
   Ken:    Some of them yeah.
In the example above, Roger's exposed correction, in line 1, projects a
potential accounting. But the repair is receipted in an embedded form by
Jim later in the talk, in 2, thus avoiding having to give an account for
his repairable. In this way, Jefferson argues, the activity of correction is
shown to be a collaborative enterprise as it is through the participants':
'collaborative, step-by-step construction that correction will be an
interactional business in its own right, with attendant activities
addressing issues of competence and/or conduct or that correction will
occur in such a way as to provide no room for accounting.' (Jefferson
1987:99)
In the EFL classroom context the capacity for this co-operative
enterprise is potentially constrained. Second language learners may not
be aware of the need for repair, let alone be in a position to action repair
for themselves. Consequently, forms of correction may prove to have
further costs for L2 teachers and learners. Exposed correction (initiation
and treatment) and its accompanying activities can require the learner to
focus explicitly and consciously on the form of the language s/he is
trying to learn. The learner may not be in a position to be able to meet
these projected demands. On the other hand embedded forms of
correction empower the EFL teacher to attend to the repair of trouble-sources, but do not oblige an explicit or consciously motivated focus
on language form. The L2 learner may, if in possession of necessary
knowledge, accept the correction in an exposed receipt and even make
the correction the focus of the talk him/herself. The continuum of repair
and control of preference is negotiated as talk unfolds. For example,
where the learner displays no awareness of error or inability to action
self-repair in their talk EFL teachers may action other-correction in
either an exposed or embedded form. (The employment of these
structures is shown in section 4 to be indexical of the pedagogical
agenda of the lesson). What is projected as a relevant next is therefore
controlled, to some extent or other, by teacher and learner.
The extracts that follow reveal how types of correction are indexical
of the agenda of the lesson and learner competence. They also show how
various features in the talk of EFL teachers downgrade the activity of
other-correction, the least preferred trajectory in the organisation of
repair in mundane conversation.
3. Data
The extracts discussed below were selected from a corpus which includes
data from audio-taped lessons from 10 native-speaker EFL teachers and
12 learners (of various nationalities). The lessons which were either
described as 'conversation classes' or 'business English' took place in
language units/schools in York and London. Teachers and learners were
not informed of the express purpose of the study and the researcher was
not present during the recordings. Factors such as age or sex of the
participants were not a pre-consideration of the study reported in this
paper and were therefore not controlled for the purposes of the study.
Schegloff (1992) states that categorising speakers is only relevant when
interactants themselves orient to such distinctions and can be found in
the details of the talk. Such information would therefore only be
brought to light after analysis of the data. However, some information
about the learners and the language schools, where known, is given, and
a brief description of the nature of each lesson.
ZLI:SFM:C1
A 'conversation class' at the University of York involving sixteen
learners of various nationalities. This class which ran throughout a nine
week term was targeted at overseas students and their partners who
sought conversation practice. In this lesson the learners, in pairs, have
been completing a gap-fill grammar exercise from a textbook. The
exercise involves choosing the correct phrasal verb from a range of six
possibilities. Extract 1 is taken from the point in the lesson where the
whole class is collectively going through answers and correcting
mistakes.
ZLI:SFM:GB1
A one-to-one 'conversation class' at the University of York involving a
female Turkish native-speaker. The student was enrolled on a course of
general English lessons prior to taking pre-sessional EAP courses
before the beginning of the academic year. In this lesson the teacher and
learner are involved in a discussion of images of Turkey after
independently watching a television programme during the week prior to
the class and discussing newspaper articles.
ZLI:SFM:P1
A one-to-one 'business English class' at a private language school in the
city of York involving a Portuguese native-speaker. At the beginning of
this lesson the teacher presented and explained various target sentences
for 'comparing and contrasting' and 'giving opinions'. The teacher and
learner discuss various statements given in their textbook, the learner's
task being to give his opinion about what the statements suggest and
to try to employ some of the target language previously given.
Examples of statements are "business failure is due to bad management"
and "high levels of unemployment will continue for decades".
ZLI:DC:G1
A one-to-one lesson at a private language school in London involving a
German native speaker. The teacher and learner are discussing various
topics, e.g., theatre, books, television. Some correction is actioned
during the course of the conversation as errors occur, but 5 minutes is
given over to highlighting errors and working through them at the end
of the lesson.
ZLI: k.L1
A one-to-one 'Business English' lesson at a private language school in
York. The learner is a French native speaker who is on a one-week
course. The lesson was recorded on the last day of the learner's course
and the activity in the lesson involves correcting sentences prepared
previously for homework and reviewing new language.
4. Analysis of Data Extracts¹
Extract 1: ZLI:SFM:C1
1   T:   Horiyo can you read out what you've got
2        for that please. (*) the whole sentence
3   H:   Mm hm the local supermarket has got up
4        the pri:ces again
5        (*)
6   T:   .HHHh now it's [(*)       ] the verb
7   L:                  [(unintell)]
8   T:   is- yes something like yes
9        (*)
10  T:   Now what do we sa- [(*)       ] not the
11  L:                      [(unintell)]
12  T:   correct verb [(*)]
         no Forget get
14  L1:  G-et
15  L2:  -get
16  T:   No Forget get p'
17  L:   ((unintell))
18  T:   What?
19  L:   Put
20  T:   We:ll done good

1 The notation employed in this paper is taken from Atkinson and Heritage
(1984). Square brackets indicate the onset and offset of overlapping talk;
untimed pauses are marked as (*).

This first extract is from a lesson where language form and revealing
linguistic knowledge is the explicit focus of the talk. Repair is therefore
integral to the agenda of the lesson. The teacher nominates a particular
learner, H, to make a public display of his competence. The learner
provides an incorrect answer. The following delay. (line 5). and inbreath, dispreference markers at the start of the teacher's turn in line 6
signals inability to provide affiliative talk and that further work is
needed. Another learner offers a possible answer (unintelligible to the
observer). The teacher's turns from line 6 onwards involve repeated
other-repair initiation and a marked withholding of other-correction. T
highlights where the learners' attempts have been correct, "yes
something j yes", in line 8. This initiation does not lead to successful
learner repair. No possibles are offered by the learners. The teacher still
does not action a correction at this point, but pursues initiation and
provides clues. T proceeds to explicitly state that the learners have
chosen an incorrect verb. Further incorrect attempts are forthcoming
from the class. In line 16, the teacher gives a further clue, "p", to locate
the correct verb - 'put' is the only verb in their list beginning with 'p'.
The teacher's explicit initiation succeeds in enabling the learners to
action the repair for themselves. Although the teacher has avoided
unmodulated other-correction, the various steps in the repair initiation
have demanded investment in the talk and made demands on the learners' level of
linguistic knowledge. The withholding of other-correction and involved
repair trajectories to be found in this lesson echo observations made by
McHoul concerning repair organisation in subject classroom talk. A
regular pattern observed in McHoul's data was for the teacher to
reformulate questions as further repair initiation and to provide clues to
assist learner self-repair. McHoul concludes that "contrary to what may
be a popular image of the classroom, teachers tend to show students
where their talk is in need of correction, not how corrections should be
made" (1990:376). And in showing where, teachers indicate, of course,
candidate 'whats'.
Extracts 2, 3 and 4 are taken from a lesson where creating
conversation is the global pedagogic focus of the talk. The repair in the
next extract involves the treatment of a single lexical item by the
teacher after no display of error awareness by the learner.
Extract 2: ZLI:SFM:GB1
1   L:  N n no not private (0.7) e:hh some beach
2       e:m
3       (1.9) (a)
4   L:  are different (0.9)(b) than another
5   T:  Uh hh.
6       (*)
7   L:  °Than others° .hh and e:m
8       (4.1) (c)
9   L:  U:hh .h
10      (2.8)(d)
11  L:
12      (4.2) (e)
13  L:  A:nd the beach .h e:hh intensive
14      tourists
15      (1.7)
16  T:  °a lot of tourists°=
17  L:  =°a lot of tourists° .h[h    e]:hh they
18  T:                         [hm mm]
19  L:  (0.6) they can do easily
The frequency of hesitation markers in the learner's talk displays
uncertainty about the coming talk. There are pauses and a marked
withholding of help from the teacher, e.g. pauses (a) to (e) are potential
sites where T could have provided affiliative talk or assistance. This lack
of talk signals further work by L is required before alignment (Tarplee
1993). Note that in line 5, T does provide a minimal affiliative receipt,
"Uh hh", but responsibility for speakership remains with L (Schegloff
1982). The learner actions a self-repair in line 7. The learner's turn,
lines 13-14, includes the repairable 'intensive'. A (1.7) pause follows
representing an opportunity point for learner self-repair or repair-initiation. However, there is no display made of awareness of error or
any repair attempts from L. The teacher actions a correction. The
repairable is picked out and is redone as "a lot of tourists". In this
correction, a) there are no explicit repair markers, b) no surrounding
syntactic frame, c) no stress pattern to highlight the repair, d) an even
intonation, e) it is quieter than the surrounding talk, and f) it is imitated
by the learner in receipt; this imitation is pitch-matched. The repair is
attended to by teacher and learner in a minimalistic way and does not
become the focus of the talk. The learner does an imitation/redoing of
the repair in line 17 and makes a claim for continuing speakership, ".hh
e:hh they (0.6)". The teacher does a minimal receipt of the learner's
redoing in overlap with this claim and also signals the learner's
responsibility for continuing the talk, "hm mm" in line 18 (Schegloff
1982). In contrast to extract 1, the 'camouflaged' other-correction in this
extract has economically and swiftly dealt with the need for repair and
avoided potentially lengthy repair-initiation which could provide further
problematic talk. The agenda of this lesson, in contrast to
ZLI:SFM:C1, is creating and getting on with conversation and this is
indexed in the design of the talk. Exposed and explicit forms of repair
would have had a different interactional cost. Consider extract 3 below
which demonstrates further camouflaging characteristics.
Extract 3: ZLI:SFM:GB1
(.) u::h is belong- a hat
L:
A hat
L:
Is belong
L:
Yes (.)
7
T:
So the hat. comes from (.)
8
L:
Yes Greece..
9
T:
°Yes°.
10
L:
Greece and e:hm
1
(1.0)
2
3
(4.0)
4
5
to Gre- Greece.
(1.0)
6
3
Greece..
11      (2.0)
12  L:  °Black°
13      (1.2)
14  L:  °Clothes°
15      (1.0)
16  L:  °Comes from°
17      (1.0)
18  L:  E::i ehh (*) A- Africa.
19  T:  °Right°=
20  L:  =°Africa°.
The hesitancy, cut-offs in the learner's turns and pauses signal concern
with the coming talk. The teacher refrains from assisting in spite of the
various pause opportunities. The learner makes another attempt at
completing her turn in 3. No assistance is requested from the teacher and
none is offered. There is also a lack of affiliative talk from the teacher;
no 'yes' or minimal 'hm' receipts. This lack of affiliation signals that
further work is required (Tarplee 1993). However, after a 4.0 pause the
learner explicitly displays her own assessment of her talk and she then
completes her turn. A 1.0 pause follows and the teacher provides an
upshot, a clarification request, of the learner's prior talk in line 7. The
upshot a) displays, to the learner, the teacher's understanding of her talk,
b) summarises the prior talk, c) projects the opportunity for learner
alignment, or non-alignment which would project potential further work
is necessary before affiliation, and d) is a candidate model. The learner
does not action a redoing of the repair, but orients to the request for
clarification by providing agreement (in line 8). Notice that it is not the
specific repair element in this upshot that is intonationally highlighted
in the teacher's talk; "So the hat comes from (.) Greece". The focus on
the repair activity is therefore downgraded. Evidence to support that L
has treated the teacher's talk as a repair is found later in line 16 where
the repair is embedded into the learner's talk. The teacher's model is
redone, but it is grammatically incorrect in this context.
In the following extract the learner requests help from the teacher
and states the nature of the required assistance.
Extract 4: ZLI:SFM:GB1
1   L:  last year u:hh (1.0) pt .hh there was a
2       Turkish (1.0) Turkish woman (.) on the beach
3       (3.0)
4   L:  Very old and fat
5       (2.0)
6   L:  .h he heh an e::h without ((gestures around
7       chest))
8   T:  °A bikini top°
9   L:  °A bikini top°
10  T:  °Hm mm°
11  L:  I- I'twas horrible
The repair in this fragment comes after a learner request for assistance and
thus an explicit display of lack of knowledge is made. In line 6 the
learner pinpoints the target item with a gesture. The teacher's following
repair is isolated from a surrounding syntactic context and is quieter than
the surrounding talk. The repair is redone by the learner; it is also
quieter than the surrounding talk and is pitch-matched. The teacher
follows this ultimate learner self-repair with a minimal receipt which
displays that the repair activity has terminated successfully, that no
accounting is required and signals the learner's responsibility for ongoing speakership.
Extracts 5 and 6 are also taken from a lesson where conversation is
the global agenda, but target language has been specified for use. At the
beginning of the lesson T has introduced several target phrases. In the
extract below the learner requests assistance and the teacher actions a
camouflaged repair. The learner's redoing is in overlap with the teacher's
repair turn and further working on talk is necessitated in later turns.
Repair is made the explicit focus of the talk.
Extract 5: ZLI:SFM:P1
1
2
3
L: =failure is (0.1) u:m (0.4) failure is
.hh I: think that is somesing (0.4) mm:
u:m somesing like what uh like um::: .huh
(5.3)
4
L:
like I want to:
L:
to win (0.3) uh::
L:
a business and I I I I- and my- and the
conqueries- conquerency?
11
T:
12
L:
competi-tors
-competit- competitance uhh
5
(2.2)
6
7
(1.0)
8
9
10
(cough) uh
13
(2.0)
14
L:
could uh maybe (0.1) better than me
17
18
T:
okay .hh so (*) failure is perhaps the
opposite of success
19
L:
20
T:
yes (0.1) yes
the opposite -of success
21
L:
22
L:
yes
T:
okay yes remember the word competitors
15
(1.0)
16
-yes
(0.4)
23
24
(0.2)
25
26
T:
27
L:
28
T:
29
L:
(competitors
[competitors
y[es
[competitors
This extract demonstrates how both teacher and learner may control the
extent of focus on target language form and thus cost to the interaction.
The learner's turns (lines 1-8) incorporate hesitation and pauses. The
teacher withholds from assisting or affiliating talk and so leaves
responsibility of speakership with the learner. In line 10 the learner
displays awareness of a potential problem with his talk and also that he
is unable to execute a repair by himself. L offers two possibilities, the
second of which (marked by question intonation) is oriented to by the
teacher as a request for help and repair. The request in line 10 is
minimally designed and so in itself
preserves the focus on topic rather than projecting a detailed digression
towards corrective exchanges and explanation of the form of the
language. The teacher's other-correction in line 11 also takes a minimal
form as it attends to a recent correctable part of the learner's utterance
and does it as a single lexical item. The activity of correction is
downgraded by both participants. The teacher's repair has no explicit
markers, is not embedded in a surrounding syntactic frame, is not
highlighted prosodically and is imitated in receipt by the learner.
However, on this occasion the learner does the redoing of the repair in
overlap with the teacher's repair. The learner's redoing is incorrect; it is
not an imitation of the teacher's model. At this point in the talk the
learner is not brought to account by the teacher. The talk continues and
the learner completes his specific, local goal at this juncture of the
lesson: defining the word 'failure'. In lines 17-18 the teacher does an
upshot of the prior talk. The upshot, as in extract 3 above, a) provides an
opportunity for learner alignment, b) displays the state of the teacher's
understanding of the talk, c) projects an opportunity for further work to
be accomplished if affiliation is not accomplished, and d) models a candidate
target for the learner and so assists in the establishment of mutual
comprehension between the participants. The learner provides agreement
to the teacher's upshot. The teacher follows this with a redoing of part
of her upshotting turn. The learner actions further affiliative talk. After
the establishment of understanding, the teacher actions an explicit repair
of the repairable "competit- competitance" as the previous downgraded
repair attempt failed and so correction is made the interactional focus.
The teacher models the repair once again and this is imitated by the
learner. The learner's redoing this time is acknowledged as being
acceptable by the teacher with a 'yes' receipt in line 27.
In extract 6, below, the learner displays his inability to action a
self-repair. After the teacher's camouflaged repair the learner pursues the
correction activity because the repair is not the category he requires.
Extract 6: ZLI:SFM:P1
1   L:  look uh an uh (*) my company hadn't uh
2       hadn't uh:m subside o:r subside I don't
3       know
4   T:  subsidised
5   L:  subsidised subsidised
6   T:  hm mm
7   L:  subsidised but uh .h what a subsidise u:h
8   T:  subsidy
9   L:  a subsidy
10  T:  subsidy
11  L:  uh: subsidy of (*) EC or government
The learner explicitly displays that he is not sure about the word he
wants (lines 2-3) and is not able to come to a decision about it himself.
The teacher's other-correction takes a minimal form; there are no repair
markers, no syntactic frame, and it is not highlighted prosodically and is
imitated by the learner in receipt. The repair sequence is closed, as in
Example A and extract 2, with a minimal "Hm mm" which signals the
end of the repair activity, its successful accomplishment and that the
learner has responsibility for continuing speakership. However, on this
occasion the learner is aware that the teacher's correction is not actually
what he was searching for and the focus on the form of the language is
maintained by the learner. The learner clearly signals the category of the
repair that is being requested (in line 7); a noun is required rather than
the verb form that was offered by T. This is evidence of real
collaboration in repair between T and L. The teacher provides the
required repair that has been explicitly sought for by the learner. The
repair takes a minimal form once again. The repair is imitated by the
learner and his turn proceeds. The teacher keeps the activity of correction
to a minimum, whilst the learner who is in possession of sufficient
knowledge is able to collaborate in this repair trajectory and maintain
focus on the form of the language until the repair is successfully
completed.
Extract 7 below illustrates the potential cost of repair initiation to
the interaction, lesson agenda and language learner. For comparison,
example E below (Jefferson 1987) shows that between participants who
share native-speaker competencies there may be little cost to the
ongoing interaction. After a potential site for self-repair (pause in 4),
Louise initiates repair by identifying the trouble-source by repeating the
repairable (line 5) with rising ('question') intonation. The beginning of
the repairable is emphasised by stress, thus locating and marking the
repairable. This initiation leads to a self-repair from Ken without delay.
Ken overtly marks out the repair with stress. The extent to which the
repair takes over the focus of the interaction is kept to a minimum, but
both parties highlight their parts of the repair activity.
(Example E)
1   Ken:     Hey (.) the first time they
2            stopped me from selling cigarettes
3            was this morning.
4            (1.0)
5   Louise:  From selling cigarettes?
6   Ken:     Or buying cigarettes.
Extract 7, taken from a lesson where teacher and learner are holding a
discussion about topics such as television, books, actresses etc.,
illustrates the potential cost of repair to the interaction, lesson agenda
and language learner. The language work accomplished in the sequence
of talk in the extract below does not remain restricted to the replacement
of one specific lexical item but is widened to include the displaying of
grammatical and syntactic knowledge (concerning the use of 'since', 'for'
and 'ago' when referring to points in the past). Therefore there are a
number of potential acceptable repairs.
Extract 7: ZLI:DC:G1
1   L:  I: u:m (0.4) pt read something about her an
2       interview last time I w-was here (0.2) in
3       London an:d she got oscars already and
4       since (0.2) two or three (0.1) years she
5       is a member of (0.2) parliament
(0.2)
6
T: S[:ince
7
L:
8
T: Since two or three yea:rs,
L: She: (0.1) since two or three years (0.4)
9
[she be)
she has been
10
(0.3
11
12
T: No [stop) that was okay but y- b- sin:ce=
13
L:
(0.2)
14
15
(aha
T: Two or three years
(0.2)
16
17
L:
Since two or three ye:ar (0.4)
been
18
(1.1)
19
20
T:
21
L:
22,
23
she: has
L:
24
25
T:
26
L:
(no re-) remember we wrote it=
=Hm: since two or [thr- (*)
((teacher writes on board))
Oh no for two or three years s:- sh: she
has been or is (.) uh?
>She has been<
Has been .h for two or three years she
has been a member of parliament [h
27
1=
[ °Righ °)
28
T:
29
L:
=and she belongs to the labour party
T:
Or if you use since you could say (0.1) she
30
31
(0.2)
h[as been
32
33
L:
34
35 T: Since=
(0.2)
36
L:
=Si:nce=
37
T
=Two years
(1.1)
38
39
L:
40
T:
She has been=
=s-heh-ince two y-heh-ears
42
L:
°Since° (*) °two°
43
T:
Yeh (0.1) yeah cause then y- [you're
44
L:
45
T:
46
L:
47
T:
(1.0)
41
48
(*) years aga
[hm
fixing it
Hm:[m hm since two years ago she has been
[ye
a member of parliament
The teacher attempts a repair initiation in line 6 which pinpoints the
site of the repair "s:ince". The initiation fails to generate a successful
repair from the learner who does a redoing of his previous talk. The
learner proves unable to locate and action a repair based on T's repair
initiation. The teacher withholds actioning other-correction and pursues
further repair-initiation. T indicates that the talk redone by the learner is
not problematic, hence the repairable is located elsewhere. In line 12 the
teacher tries to initiate learner self-repair with a reiteration of the
repairable 'since' again. The repairable is highlighted by greater stress on
this occasion. The learner fails to action a self-repair. Later the teacher
alludes to his assumption and belief that the learner is in possession of
the knowledge about the target language under focus in this repair
sequence as they have worked on this aspect previously; "remember we
wrote it" (line 20). The learner is able to action a self-repair and overtly
marks his recognition of the repair and realisation of the repair
expectations by emphasising the repair element "for" in line 23. L
continues with the local task of finishing the target sentence
completion. However the attempt terminates with a quick request for
help "uh?" (in line 24). An other-repair is actioned by T. The repair is
isolated, but the speed of delivery is increased. The learner does a
redoing of part of the teacher's model and after an in-breath does a
redoing of the whole target sentence. The focus of the talk on repair and
the form of the target language does not finish at this point. In line 31
T sets up another sentence completion task for the learner but fails
to generate an immediate successful learner repair. The repair is
accomplished by the learner 11 lines later after repeated initiation
attempts. The learner explicitly acknowledges the repair activity as the
repairable is marked by stress ("ago" in line 42). The display of lack of
knowledge in the learner's turns and failure to identify the repairable and
complete a learner self-repair resulted in elongated initiation from T and
several failed repair attempts by L. The pursuit of self-repair and
withholding of other-correction in this extract ensured that repair became
the local agenda and that the learner was forced to display his level of
knowledge about a particular aspect of the target language. What
happens in extract 7 clearly contrasts with repair trajectories where
camouflaged other-correction ensured that the ongoing interaction was
minimally interrupted. The fact that the teacher had a basis for assuming
the level of learner knowledge was alluded to in the talk and may
explain his insistence on repair-initiation. Moreover, the repair required
more than the replacement of a single lexical item.
Extract 8: ZLI:DC:G1
1
T:
2
L:
3
So it's difficult
It was (*) difficult=yes but I understood
it because I saw the musical
4.
(*)
5
T:
6
L:
Because you saw the musical
I
(*) had seen
8
L:
Had seen?
9
T:
Yeah
10
L:
I had seen the musical=
11
T:
12
L:
=Right if you hadn't seen the musical
I wouldn't=more difficult to understand
7
or because
(*)
13
14
(*)
(*)
T:
°Right°
The repairable "saw" occurs in line 3. The learner makes no display of
need for repair etc. After a pause (untimed) the teacher initiates repair.
He repeats part of L's prior talk, as in Example E and extract 7 above.
The repair is followed by another pause. No repair is attempted by L. T
then indicates the site of the repairable in line 5 with a sentence
completion task. The learner actions a self-repair. The learner's talk
displays uncertainty, a pause in line 6 mid-repair. The lack of affiliative
talk from the teacher is oriented to by the learner as a display of a need
for further work (Tarplee 1993). The learner does a redoing of the repair
with question intonation displaying his uncertainty, but offers no other
alternative repairs. The teacher provides affiliative talk in next-turn and
maintains the focus on the form of the talk by constructing a sentence
completion task which is successfully actioned by L.
Extracts 9 and 10 are from a lesson where correction is the concern
of the talk. The teacher and learner are going through sentences written
as a homework task. Focus on the form of the target language is an
explicit pedagogical agenda in the lesson.
Extract 9: ZLI:A:L1
1
L:
Yesterday I kept Kiting do:wn my notes on
my carnet °un carnet u:h (I -don't know°)=
2
[no
3
T:
4
T:
=Note?
T:
Notebook
8
L:
9
T:
Notebook
=Notebook
(0.7)
5
6
(0.4)
7
(6.0)
10
11
n:
T:
Right?
The lesson activity concerns going through and correcting the learner's
homework. The learner's task was to write sentences using specified
new language that he has learned on the course. The learner reads out
one of his answers (lines 1-2) and explicitly displays that he does not
know the word in English that he needs to complete his sentence. The
teacher makes repair attempts, which end in cut-offs, in overlap with L's
turn. In line 4 the teacher constructs a repair-initiation as a word
completion task which fails to engender a learner self-repair. The
completion task in itself promotes the activity as a collaborative
enterprise. A 0.7 pause follows this initiation attempt and the teacher
actions the projected repair; the learner's absence of talk signalling his
inability to perform a repair. The teacher's repair is isolated, i.e. without
any surrounding syntactic context, as were repairs dealing with the
replacement of specific and single lexical items in the learner's talk as in
extracts 2, 4, 5 and 6. The repair in extract 9 also generates an imitation
by the learner. A difference is that the teacher's repair is highlighted
intonationally. Focusing on the form of the language and correction
comprise the activity of the talk displayed in extract 9.
In the last extract 10 below, there is more than one source of
trouble in the learner's talk. This example is again taken from lesson
ZLI:A:L1, where the activity of the talk concerns displaying
competency and linguistic knowledge. Lengthened repair initiation,
explicit focus on language form and the use of metalanguage
characterise the talk as correction is an explicit agenda.
Extract 10: ZLI:A:L1
1
L:
Are you sure we g.g. to the wright die- di-
uh direction
2
(.) (a)
3
4.
5
T: °Okay° .hh not we go:
the situation
(.)(b) h imagine you're in
(0.7)
6
T:
Uh we ri(de) -°no°
-Yeh bu- imagine=it's the tens:e
10
T:
°Lori° =imagine it's now
11
L: Okay
7.
L:
(0.4)
9
12
(0.7)
13
T: Which tense would you] use=
14
L:
15
L:
16
T:
17
[Are
you
sure]
=We are going
Aright .hh okay an we are going=not 1.2
(1.0)
Not the preposition is n2(1. La
18
T:
19
L:
20
T:
Yes so say it again
21
L:
Okay
[i:n the
(0.9)
22
23 T: Say the sentence again
24 L: Alors are you sure we are going in the Light
25
26
T:
de- direction
uh Lori just say this .h are you
Yeh .hh
sure?
27
(0.8)
28
29
L:
30
T:
Yes
Stress the word sure
32
L:
Are you sure?
33
T:
Are you sure
35
L:
36
T:
In the wright direction
In the right direction
(0.5)
31
(*) we're aoinq
(0.4)
34
The learner reads out his sentence attempt containing the repairables,
"go" and "to" in lines 1-2. After a micro-pause, at (a), signalling a
coming dispreferred activity, the teacher receipts the turn and then
actions a repair-initiation. The initiation identifies one of the trouble-sources. A micro-pause follows at (b) and the teacher provides further
initiation, a "cluing" (McHoul 1990). After a 0.7 pause the learner
attempts a repair but rejects his repair himself. The teacher withholds
from other-correction and pursues further initiation. T explicitly states
that the learner has used the wrong tense. The teacher provides two
further initiations in lines 10 and 13 before the learner actions a self-repair. T receipts the learner repair in line 16. The teacher then directly
proceeds to attend to a second repairable. The teacher's first initiation is
minimally packaged and identifies the site of trouble, "not to". There is
a one second interval and T continues with further initiation, avoiding
other-correction. T highlights the repairable again. The learner actions a
self-repair (line 19) and is requested to do a redoing of the repaired
stretch of talk (line 20). The activity of the talk now turns to
pronunciation business with a sequence in which the talk focuses on
intonation and stress.
The nature of the activity of the talk in this extract concerned overt
focus on language form and correctness. The lengthened repair initiation
sequence ensured that correction remained the explicit business.
6. Concluding remarks
The CA analysis of repair in EFL classroom talk reported in this paper
gives testament to the nature of the joint management of issues related
to second language development; issues connected with intelligibility,
repairing troubles and establishing mutual comprehensibility and
intersubjectivity. The description of one of the chief enterprises in EFL
classroom talk generated by this CA analysis is vastly different from
the view of reactionary correction and appraisal, typified by
'initiation-response-feedback' routines, deemed to be paradigmatic of classroom talk
(Sinclair and Coulthard 1975). Rather than segmenting EFL
conversation into such uni-directional categories as initiation, response,
teacher negative feedback, etc, correction, as part of the broader
phenomenon of repair, has been revealed as an activity which is
negotiated by EFL participants on a turn-by-turn basis as they
collaboratively work on the re-construction of their talk.
Repair strategies have been shown to impose different costs on the
lesson agenda and the learners. Teachers have also been seen to orient to
the status of other-correction as a dispreferred activity by a) refraining
from other-correction, b) pursuing repair initiation to increase
opportunities for self-repair, and c) packaging other-correction, when
actioned, in an accommodating, 'camouflaged' environment (e.g. isolation
of the repair, delivery at a volume which is quieter than the surrounding
talk, and lack of intonational marking) which serves to tone
down unmodulated other-correction and take the focus off the activity of
repair. The 'camouflaged' corrections empowered the EFL teacher to
attend to the repair of trouble-sources, but did not oblige a lengthened,
explicit or consciously motivated focus on language form. As an
example, extract 6 demonstrated that where the L2 learner is in
possession of the necessary knowledge he/she may accept the correction
in an exposed receipt and even make the correction the focus of the talk
him/herself. Repair and control of preference organisation is potentially
actionable by both teacher and learner and is negotiated on a 'here and
now' basis as their talk unfolds. For example, where the learner displays
no awareness of error or inability to action self-repair in their turns-at-talk the EFL teacher may action other-correction in either an exposed or
embedded form. What is projected as a relevant next is therefore
controlled, to some extent or other, by the teacher and (subject to
his/her level of competence) the learner.
Forms of correction were shown to orient to the pedagogic goal of
the type of EFL lesson or activity in an EFL class which entails the
conscious analysis of aspects of the target language, e.g. a grammar
lesson, as in extract 1, or 'correcting homework', as in extracts 9 and 10.
These types of teaching agendas contrast with lessons or activities in
which conversational practice is the global pedagogic goal, as in the
discussions of extracts 2, 3, and 4. Explicit forms of correction and their
accompanying accountings would require an investment in the talk and
make demands on the learner which could prove to be beyond their level
of competence. The extended repair activities of extracts 5 and 7 are
examples where local agendas become relevant as the talk proceeds and
so correction becomes the overt activity of the talk. In extract 5 the
teacher actions explicit repair after a 'camouflaged' attempt failed. In
extract 7 the teacher displays that he has good reason to anticipate the
learner's capacity for self-repair.
This paper has examined the organisational devices which provide
for flexibility, local-management and negotiation in the
accomplishment of immediate and global interactional agendas in EFL
classroom talk.
REFERENCES
Allwright, R.L. and K. M. Bailey. (1991). Focus on the Language
Classroom. Cambridge: Cambridge University Press.
Iles, Z. L. (1995). 'Learner control in repair in the EFL classroom'. Paper to
be presented at BAAL Annual Meeting, September 1995.
Jefferson, G. (1987). 'On exposed and embedded correction in conversation'.
In Button, G., and J. R. Lee (eds), Talk and Social Organisation.
Clevedon: Multilingual Matters. 86-100.
McHoul, A. (1990). 'The organisation of repair in classroom talk'. Language
in Society. 19. 349-377.
Lennon, P. (1991). Error and the very advanced learner, IRAL. Vol XXIX/1.
30-44.
Pica, T. (1994). Questions from the language classroom: Research
Perspectives. TESOL Quarterly. Vol. 28/1. 49-79
Schegloff, E.A. (1982). Discourse as an interactional achievement: Some
issues of 'uh huh' and other things that come between sentences. In
D. Tannen (ed.) Analyzing Discourse: Text and Talk. Washington
D.C: Georgetown U.P. 71-93.
Schegloff, E.A. (1992). In another context. In A. Duranti and C. Goodwin
(eds.) Rethinking context: language as an interactive phenomenon.
Cambridge: Cambridge University Press. 191-227.
Schegloff, E.A., Jefferson, G. and H. Sacks. (1977). The preference for
self-correction in the organization of repair in conversation. Language.
53. 361-382.
Sinclair, J. and M. Coulthard. (1975). Towards an analysis of discourse.
Oxford: Oxford University Press.
Tarplee, C. (1993). 'Working on Talk: the collaborative shaping of
linguistic skills within child-adult interaction'. Unpublished DPhil
thesis. University of York.
A TIMING MODEL FOR FAST FRENCH*
Eric Keller and Brigitte Zellner
University of Lausanne
1. Introduction
Previous research on the prediction of speech timing has documented
influences at three major levels: the phoneme or segmental, the syllabic
and the phrase level. In this paper we describe a three-tiered statistical
model which has been created for predicting the temporal structure of
French, as produced by a single, highly fluent speaker at a fast speech
rate. The first tier models segmental influences due to phoneme type and
contextual interactions between phoneme types. The second tier models
syllable-level influences of lexical vs. grammatical status of the
containing word, presence of schwa and the position within the word.
The third tier models utterance-final lengthening. The output of the
complete model correlates with the original corpus of 1204 syllables at
an overall r = 0.846. However, an examination of subsets of the
complete data set revealed considerable variation in the closeness of fit
of the model. Residuals have a normal distribution.
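As a rough illustration of how three tiers of this kind might be chained, the following Python sketch computes syllable durations tier by tier and then correlates predicted with observed values. It is a minimal sketch under stated assumptions and should not be read as the model described in this paper: the function names, the phoneme classes, the multiplicative way in which the tiers are combined, and every numerical coefficient and measurement are hypothetical placeholders chosen only so that the example runs.

```python
from math import sqrt

# Hypothetical base durations in ms per phoneme class (placeholder values).
BASE_MS = {"vowel": 70.0, "stop": 55.0, "fricative": 80.0, "sonorant": 50.0}


def segment_ms(kind, next_kind=None):
    """Tier 1: duration from phoneme type plus one assumed contextual interaction."""
    dur = BASE_MS[kind]
    if kind == "vowel" and next_kind == "fricative":
        dur *= 1.10  # assumed lengthening before a fricative
    return dur


def syllable_ms(segments, lexical=True, has_schwa=False, word_final=False):
    """Tier 2: sum the tier-1 segments, then apply assumed syllable-level factors."""
    total = 0.0
    for i, kind in enumerate(segments):
        next_kind = segments[i + 1] if i + 1 < len(segments) else None
        total += segment_ms(kind, next_kind)
    if not lexical:
        total *= 0.90  # syllable belongs to a grammatical word
    if has_schwa:
        total *= 0.85  # syllable contains a schwa
    if word_final:
        total *= 1.05  # position within the word
    return total


def utterance_ms(syllables, final_factor=1.40):
    """Tier 3: lengthen the utterance-final syllable by an assumed factor."""
    durations = [syllable_ms(**syl) for syl in syllables]
    if durations:
        durations[-1] *= final_factor
    return durations


def pearson_r(xs, ys):
    """Pearson correlation between predicted and observed durations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# A three-syllable toy utterance; the observed values are invented.
utterance = [
    {"segments": ["stop", "vowel"], "lexical": False},
    {"segments": ["sonorant", "vowel"], "has_schwa": True},
    {"segments": ["fricative", "vowel", "sonorant"], "word_final": True},
]
predicted = utterance_ms(utterance)
observed = [118.0, 95.0, 290.0]
print(predicted, pearson_r(predicted, observed))
```

Keeping each tier in its own function makes it possible to inspect the contribution of a single level, which is one way of examining how closeness of fit varies across subsets of the data.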
1.1. Models Based on the Prediction of Segmental Durations
The most influential statistical model for spoken French text has
probably been the model proposed by O'Shaughnessy (1981, 1984). On
the basis of numerous readings of a short text containing all phonemes
of French, a model of durations of acoustic segments suitable for
synthesis by rule was proposed. In this model, 33 rules for the
modification of segment duration according to segment type, segment
position and phoneme context served to specify basic phoneme
durations.
* Authors' address for correspondence: Laboratoire d'analyse informatique
de la parole (LAIP), Informatique Lettres, Université de Lausanne,
CH-1015 LAUSANNE, Switzerland.
York Papers in Linguistics 17 (1996) 53-75
© Eric Keller and Brigitte Zellner
For sound classes that did not involve prepausal lengthening, the
model was able to predict the durations for 281 segments of a text with
a standard deviation of 9 ms. But it was less accurate for the prediction
of prepausal vowel durations, because of the greater variability of
segments in such positions. Moreover, this model was not able to
predict silent inter-lexical pauses.
O'Shaughnessy's statistical model is constructed around the
hypothesis that speech timing phenomena can be captured by the
segment, as if this unit "possesses an inherent target value in terms of
articulation or acoustic manifestation" (Fujimura 1981). However,
recent measures have indicated that syllable-sized durations are generally
less variable than subsyllabic durations, and thus may represent more
reliable anchor points for the calculation of a general timing structure
than segmental durations (Barbosa and Bailly 1993; Keller 1993; Zellner
1994). Taking explicit syllable-level information into account is
further supported by the observation that stress variations and variations
of speech rate tend to modify at least syllable-sized units.
Bartkova's model (1985, 1991) attempts to remedy these deficiencies
by adding calculated coefficients to the formula for predicting segment
durations:
$\mathrm{Dur}_{Seg} = \mathrm{Dur}_{i} + k_{syll} + k_{Ac}$
where $\mathrm{Dur}_{i}$ is the intrinsic duration of the segment, $k_{syll}$ is a syllabic
coefficient, and $k_{Ac}$ an accentuation coefficient. The exact manner in
which these coefficients are obtained is not described; it is only noted
that they can vary within an interval from a minimum to a maximum, according
to the position of the segment in the speech chain, and according to the
acoustic properties of the speech sound.
The syllabic coefficient depends on the nature of the word
(lexical/grammatical), and on the position in the word (initial, medial,
final syllable). The coefficient of accentuation depends on the next
consonant, on the presence/absence of a syntactic boundary in the case
of a final vowel, or on the presence/absence of clusters in the case of a
final consonant, as well as on the syllabic structure near a pause.
According to Bartkova, a comparison of predicted and measured
durations in 10 sentences gives rather good predictions, since the mean
difference on segmental duration is about ±15 ms.
However, it would seem that beyond the opacity of the coefficients,
a divergence between predicted and measured durations of the order of 15
to 30 ms can be a major handicap for short segments. In our corpus, for
example, the mean duration for /d/ was 50 ms. In the case of such a
short phoneme, a 15-30 ms divergence would correspond to an error of
30-60% with respect to its measured duration.
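The arithmetic behind this percentage can be checked with a minimal sketch. The function name is an illustrative assumption; the 50 ms mean and the 15-30 ms divergence are the values quoted in the text.

```python
def relative_error_percent(error_ms: float, duration_ms: float) -> float:
    """Express an absolute timing error as a percentage of the measured duration."""
    return 100.0 * error_ms / duration_ms


mean_d_ms = 50.0  # mean duration of /d/ reported for the present corpus
low = relative_error_percent(15.0, mean_d_ms)
high = relative_error_percent(30.0, mean_d_ms)
print(f"{low:.0f}% to {high:.0f}%")  # 30% to 60%
```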
1.2. Required Macro-timing Information
Since the segmental unit cannot capture the overall temporal structure
of speech, the next level which can be expected to encapsulate temporal
phenomena is the syllable. This appears to be a good candidate.
According to some psycholinguists, it is considered to be the minimal
perception unit, and according to a number of phoneticians and
phonologists, it is the minimal unit of rhythm (see Delais 1994).
It has been shown that quite a number of parameters are involved in
variations of syllabic duration. The most important are: the position in
the prosodic group, the position in the word, degree of stress, the length
of the prosodic group, the position according to the stressed syllable,
the position according to the local speech rate (as measured by cycles of
speeding up and slowing down), semantic focus, proximity of syntactic
boundaries, the status of the word (lexical or grammatical), and
emotional factors (Bartkova 1985, 1992; Campbell 1992; Delais 1994;
Duez 1985, 1987; Fant et al. 1991; Fónagy 1992; Grégoire 1899;
Grosjean et al. 1975, 1983; Guaïtella 1992; Konopczynski 1986;
Martin 1987; Mertens 1987; Monnin et al. 1993; Pasdeloup 1988,
1990, 1992; Wenk et al. 1982; Wunderli 1987). Some of these factors
may be redundant; for instance, in many cases of read text, lexeme-final
position may be redundant with phrase-final position.
In view of existing information, it thus seems best to begin with
segmental predictions, and to consider syllabic information as additional
information which is not captured at the segmental level. One of the
important points to consider in the present study will be the selection of
non-redundant and relevant information.
Beyond the syllabic level, it is likely that a good predictive model
will eventually need to incorporate further information at the word or
the phrase level. For example, the prediction of pauses for slow speech
requires phrasal knowledge, which is not captured at the segmental or at
the syllabic level. In the area of word group boundaries in French
speech, a great deal of work has been accomplished to determine the
nature of these groups (syntactic groups, prosodic groups, rhythmic
groups, intonational groups), the congruence between these labels,
and to calculate the automatic generation of such groups and potential
inter-group pauses (Delais 1994; Grosjean et al. 1975; Keller et al.
1993; Martin 1987; Monnin et al. 1993; Pasdeloup 1988; Saint-Bonnet
et al. 1977). These effects will have to be integrated into a general
timing model for a given language, but were not taken into account in
the present study.
In the current study, the objective was to account for a single
speaker's syllable durations with the smallest number of segmental and
syllabic factors. At each succeeding level, relevant parameters were
chosen so as to explain the greatest proportion of the variance in the
residue of the previous analysis. In this manner, a three-tier model,
based successively on segmental, syllabic and phrasal information, was
constructed.
2. Method
2.1. The corpus
A highly fluent speaker of French (a professor of French literature) was
recorded reading 277 sentences, the first 100 of which were analysed for the
present study. The speaker was instructed to speak quite rapidly, with a
normal, unexaggerated intonation. The resulting readings have generally
been judged by listeners as highly intelligible and well-pronounced. No
dialectal particularities were noted.
Recording took place under studio conditions on DAT tape. The digitized
data were transferred to a Macintosh computer and downsampled to 16
kHz.
2.2. Time labelling
The time occupied by each phoneme was labelled with the Signalyze™
program according to detailed instructions on how to handle
phoneme-to-phoneme transitions (Thévoz and Enkerli 1994). Specifically,
transitions in the acoustic corpus were analyzed according to three
articulatory levels: labial, lingual and laryngeal. For example, the
coarticulatory overlap at the /e/-/s/ transition was marked by symbols
representing the following events: "onset of friction, associated with the
lingual level", followed at a given time interval by an "offset of
fundamental frequency, associated with a cessation of vocal cord
activity". The following possible states were distinguished:
Labial system: aperture, occlusion, friction, burst, error
Lingual system: aperture, occlusion, friction, burst, palatal,
transient movement, error
Laryngeal system: aperture, occlusion, transient movement,
diminution, error
"Error" refers to any state that occurs inadvertently, such as during a
speech error.
To examine the reliability of transcriptions, two judges compared
judgements concerning how and where points of transition between
inferred articulatory states were to be marked. Two measures of
interjudgemental agreement were used:
Robustness (agreement in the application of criteria to state
transition), scored 1 = low agreement, 2 = agreement in general, but
some further discussion required, and 3 = excellent agreement.
Precision, scored 1 = more than two F0 periods difference, 2 = 1-2
F0 periods difference and 3 = less than 1 F0 period difference in
measurement.
Both measures showed good to excellent interjudgemental
agreement. Over the 50 types of state transitions examined, there were
no cases of low robustness or low precision. The average robustness
was 2.53 and the average precision was 2.68.
A total of 4544 phonemes and 1203 syllables were analyzed in this
manner.
3. Analysis and Results
A modified step-wise statistical regression technique was used to
develop a well-fitting model of this speaker's timing behaviour. In
accordance with previous observations on factors that influence speech
timing, it was decided to model three major levels: the segmental, the
syllabic and the phrase level. In step-wise fashion, each succeeding level
was made to model the residue left by the previous level. Three different
models were thus established, the Segmental, the Syllabic and the
Phrase Model (Figure 1).
[Figure 1: diagram with boxes labelled "The Segmental Model", "The Syllabic Model" and "The Phrase Model"]
Figure 1. The Segmental, Syllabic and Phrase Models. Each subsequent
model incorporates the modelling effects of the previous level.
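The step-wise logic, in which each level models the residue left by the previous one, can be illustrated with a minimal sketch. It assumes, purely for illustration, that each tier predicts the mean residual of a categorical grouping; the study itself used general linear models with interaction terms. All function names, keys and data are hypothetical.

```python
from collections import defaultdict


def fit_group_means(keys, targets):
    """Return the mean target value for each category key."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, target in zip(keys, targets):
        sums[key] += target
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def fit_tiers(observed, tier_keys):
    """Fit successive tiers, each on the residue (observed - predicted) so far."""
    prediction = [0.0] * len(observed)
    models = []
    for keys in tier_keys:
        delta = [o - p for o, p in zip(observed, prediction)]
        means = fit_group_means(keys, delta)
        models.append(means)
        # The new tier's prediction is added to that of the previous tiers.
        prediction = [p + means[k] for p, k in zip(prediction, keys)]
    return models, prediction


# Hypothetical toy data: four syllables, two tiers of categorical factors.
observed = [10.0, 12.0, 20.0, 22.0]
tier_keys = [["a", "a", "b", "b"], ["x", "y", "x", "y"]]
models, prediction = fit_tiers(observed, tier_keys)
print(prediction)
```

In this toy case the second tier absorbs exactly what the first tier leaves unexplained; with real data a residue remains after every tier.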
3.1. Model 1: The Segmental Model
Segmental Durations and Overlap Zones. An initial issue concerned the
calculation of segmental duration in a corpus where coarticulatory
transition zones are marked explicitly. Does phoneme duration
correspond to the zone of the signal which is unambiguously marked for
a given phoneme (zone B in figure 2), or does it include one or both
zones of coarticulatory overlap with adjoining phonemes (zones A and
C in figure 2)?
[Figure 2: schematic of a signal segmented into zones A, B and C, with the labels "overlap 1", "unambiguous" zone and "overlap 2", and the phonemes /s/, /c/ and /R/]
Figure 2. What constitutes a phoneme? B is a portion of the signal that
is unambiguously marked for the phoneme /c/, while A and C are
transitory zones with adjoining phonemes.
The issue was resolved with reference to durational variation. The
combination of zones A, B and C (with an average coefficient of
variation of 0.375) turned out to be systematically less variable than the
unambiguous zone B (with an average coefficient of variation of 0.412)
(see Table 1).
Zone(s)    Average coefficient of variation (s.d./mean) for 34 phonemes
A          1.6379
B          0.4123
C          1.7472
A+B        0.3916
B+C        0.3933
A+B+C      0.3751
Table 1. Coefficients of variation for zones A, B and C as well as
various combinations of these zones
Also, combinations of zones A and B, or of B and C, were less variable
than zone B alone. The transition zones can thus be considered to be
"buffer zones" whose function, in part, may well be to "regularise"
phoneme duration. For the purpose of the present research it was thus
decided to consider the combined duration of A, B and C as "phoneme
duration". Syllable durations were constructed from phoneme durations
by taking into account transitional overlaps. As a net effect, the
segmental duration entering the statistical modelling procedure is
slightly more regular than more commonly measured phoneme
durations. Nevertheless, it is not believed that the modelling results of
the present study seriously depend on this manner of proceeding; the
size and resilience of the measured effects suggest that as long as
transitions are handled in systematic fashion, the predictive pattern
should remain largely identical.
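The comparison of variability between the unambiguous zone and the combined zones can be sketched as follows. The durations are hypothetical, and the use of the sample standard deviation is an assumption, since the text specifies only "s.d./mean".

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample s.d. assumed)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical durations (ms) for four tokens of one phoneme.
zone_a = [12, 4, 15, 8]    # overlap with the preceding phoneme
zone_b = [60, 90, 45, 75]  # unambiguous zone
zone_c = [14, 5, 18, 9]    # overlap with the following phoneme
combined = [a + b + c for a, b, c in zip(zone_a, zone_b, zone_c)]

cv_b = coefficient_of_variation(zone_b)
cv_abc = coefficient_of_variation(combined)
print(f"B: {cv_b:.3f}  A+B+C: {cv_abc:.3f}")
```

In this invented example the overlap zones are longer where zone B is shorter, so the combined duration varies less than zone B alone, which is the "buffer zone" pattern described above.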
3.2 Segmental transformation and grouping.
Raw segment durations were non-normal in their distribution. Among
the common transformations, the log10 transformation produced the
closest approximation to a normal distribution (Figure 3a, b). All
calculations of the segmental portion of the model were thus performed
on log10-transformed durations.
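The effect of the transformation on a right-skewed distribution can be illustrated with a small sketch. The durations are hypothetical, and moment-based skewness is used here only as a convenient summary; the study itself relied on histograms and normal probability plots.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over s.d. cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed segment durations in ms.
durations_ms = [30, 40, 45, 50, 60, 70, 90, 120, 200]
log_durations = [math.log10(d) for d in durations_ms]

print(f"raw skewness:   {skewness(durations_ms):.2f}")
print(f"log10 skewness: {skewness(log_durations):.2f}")
```

A skewness closer to zero after transformation indicates a more symmetric distribution, one of the properties expected of a roughly normal distribution.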
[Figure 3a: histograms of segment durations in ms and in log10(ms)]
Figure 3a. The distribution of segment durations before and after the
log10 transformation: histograms.
[Figure 3b: normal probability plots (durations against nscores) in ms and in log10(ms)]
Figure 3b. The distribution of segment durations before and after the
log10 transformation: normal probability plots.
Subsequent to transformation, phonemes were grouped according to
their mean durations and their articulatory definitions. Eight classes
could be identified (Table 2). Groups showed roughly comparable
coefficients of variation, and an inspection of histograms and normal
probability plots showed roughly normal distributions for all classes
whose N was greater than 100.
Phoneme type                         Name           Mean duration (ms)   Coefficient of variation (s.d./mean)   Frequency (N)
œ, ø                                 AntRound       109.45               0.4881                                 71
f, s, ʃ                              Fric           105.17               0.2708                                 357
œ̃, ɛ̃, ɑ̃, ɔ̃                           Nas            97.78                0.3585                                 334
o                                    PostMidRnd     94.92                0.3130                                 60
p, t, k                              UnvPlos        92.94                0.3475                                 504
a, e, ɛ, [?], u, i, y                OthVow         69.62                0.4089                                 1557
b, z, m, [?], g, v, ʒ, n, d, [?]     VcdCons        61.72                0.3669                                 892
R, j, w, l, ɥ                        SemiVLiquids   43.63                0.4908                                 769
Mean                                                90.23                0.3648                                 539
Table 2. Mean durations for phoneme classes (N = 4544)
To test Model 1 in the syllabic context, square root-transformed syllable
durations were calculated on the basis of coefficients produced by the
linear model for segmental durations, and by taking into account mean
durations of phoneme-to-phoneme transitions. These calculated syllable
durations were compared to the square root-transformed measured
syllable durations. The correlation coefficient was r = .647 (N = 1203,
p<.0001) (Figure 5).
[Figure 5: scatter plot of measured syllable durations against the predictions of Model 1]
Figure 5. Prediction of the Segmental Model (Model 1): Syllable
durations predicted exclusively on the basis of segmental durations (r =
.647). Values are in sqrt(ms).
The residue from the model (= observed - predicted) was termed "Delta
1" and served as the basis for further factorial modelling at the syllabic
level.
3.3 A Linear Model for Segmental Durations.
Using the Data Desk® statistical package on the Macintosh, a general
linear model for discontinuous data (based on an ANOVA) was
calculated with partial (non-sequential, Type 3) sums of squares. The
following main and interaction factors (up to two-way) were
postulated:
duration (log10(ms)) = constant + previous type + current type + next
type + previous type * current type + current type * next type +
previous type * next type
1 For reasons of insufficient per-cell observations, calculation
complexity and theoretical difficulty of interpretation, three-way
interactions were not calculated.
Table 3. The Segmental Model: Analysis of Variance for Segmental
Data (N = 4544) Using Partial Sums of Squares
Source               df     Sums of Squares   Mean Square   F-ratio    Prob
Const                1      14903.8           14903.8       642500     ≤ 0.0001
previous             8      0.123239          0.015405      0.66410    0.7236
current              7      3.13402           0.447717      19.301     ≤ 0.0001
next                 8      0.267002          0.033375      1.4388     0.1748
previous * current   50     3.24144           0.064829      2.7948     ≤ 0.0001
current * next       50     5.04499           0.100900      4.3498     ≤ 0.0001
previous * next      60     1.79531           0.029922      1.2899     0.0665
Error                4360   101.137           0.023197
Total                4543   196.070
In the partial sums of squares solution, all factors were significant at
p<.05, with the exception of "previous type" and "next type", taken
alone, and the interaction term "previous type * next type" (Table 3).
The residual error was 101.137/196.070 = 0.516, that is, the model
explained about 48.4% of the variance. Expressed in terms of a Pearson
product-moment correlation, the model's predicted segmental durations
correlated with empirical phoneme durations at r = 0.696.
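The relation between the residual error, the proportion of variance explained and the reported correlation can be verified with a short sketch. The function name is illustrative; the sums of squares are those reported for the Segmental Model. Taking r as the square root of the explained variance assumes a least-squares linear model with a constant term.

```python
import math


def explained_variance(ss_error: float, ss_total: float) -> float:
    """Proportion of variance explained: 1 - SS(error) / SS(total)."""
    return 1.0 - ss_error / ss_total


ss_error, ss_total = 101.137, 196.070  # Error and Total sums of squares
r_squared = explained_variance(ss_error, ss_total)
r = math.sqrt(r_squared)  # correlation between predicted and observed values

print(f"residual error = {ss_error / ss_total:.3f}")  # 0.516
print(f"explained      = {r_squared:.3f}")            # 0.484
print(f"r              = {r:.3f}")                    # 0.696
```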
3.4 Syllable Durations and Delta 1.
Another means of testing the model is a comparison with measured
syllable durations. In contrast to phoneme durations, where a log
transformation served to provide roughly normal distributions, square
roots had to be applied to measured syllable durations in order to
approximate normal distributions (Figure 4).
[Figure 4: histogram and normal probability plot (against nscores) of square-root-transformed measured syllable durations]
Figure 4. Syllable durations in ms were square-root transformed in order
to approximate a normal distribution.
3.4.1. Model 2: The Syllabic Model
Syllabic Factors Predicting Delta 1. After considerable experimentation
with a variety of factors described in the literature, a three-factor model,
including two-way interactions, was retained for analysis:
delta 1 = constant + function + position + schwa + function * position
+ function * schwa + position * schwa,
where "function" distinguishes whether the syllable is found in a lexical
or a function word, "position" identifies three types of position in the
word which are (1) "monosyllabic and polysyllabic-initial", (2)
"polysyllabic pre-schwa" and (3) "other", and "schwa" indicates whether
or not a schwa is present in the syllable. Again, a general linear model
for discontinuous data was calculated with partial (Type 3) sums of
squares. The results of the ANOVA showed that all main and interaction
factors were significant at p<.05 (Table 4). The residual error of
3277.29/5432.93 = .6 indicated that the model explained 40% of the
variance in Delta 1.
Table 4. Analysis of Variance for Delta 1 (N = 1203)
Using Partial Sums of Squares
Source                df     Sums of Squares   Mean Square   F-ratio   Prob
Const                 1      2663.53           2663.53       969.58    ≤ 0.0001
function              1      176.508           176.508       64.252    ≤ 0.0001
position              2      98.5753           49.2877       17.942    ≤ 0.0001
schwa                 1      149.296           149.296       54.347    ≤ 0.0001
function * position   2      97.3872           48.6936       17.725    ≤ 0.0001
function * schwa      1      27.5860           27.5860       10.042    0.0016
position * schwa      2      63.0467           31.5234       11.475    ≤ 0.0001
Error                 1193   3277.29           2.74710
Total                 1202   5432.93
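The internal consistency of Table 4 can be checked by recomputing each F-ratio as the effect mean square divided by the error mean square. The sketch below uses the sums of squares and degrees of freedom from the table; the dictionary and function names are illustrative.

```python
# Sums of squares and degrees of freedom taken from Table 4.
EFFECTS = {
    "function": (176.508, 1),
    "position": (98.5753, 2),
    "schwa": (149.296, 1),
    "function * position": (97.3872, 2),
    "function * schwa": (27.5860, 1),
    "position * schwa": (63.0467, 2),
}
SS_ERROR, DF_ERROR = 3277.29, 1193


def f_ratios(effects, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error) for each effect."""
    ms_error = ss_error / df_error
    return {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}


for name, f_value in f_ratios(EFFECTS, SS_ERROR, DF_ERROR).items():
    print(f"{name:20s} F = {f_value:.3f}")
```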
Model 2 and Delta 2. Syllable durations obtained from the segmental
model were combined with those from the present linear model for Delta
1 to produce the Syllabic Model (Model 2). The predictions correlated
with observed square root-transformed syllable durations at r = .723
(N=1203) (Figure 6). The residual data was termed Delta 2.
[Figure 6: scatter plot of measured syllable durations against the predictions of Model 2]
Figure 6. Prediction of the Syllabic Model (Model 2): Syllable
durations predicted on the basis of segmental durations and syllable-level
factors (r = .723). Values are in sqrt(ms).
3.5. Model 3: The Phrase Model
Inspection of the predictions of Models 1 and 2 (Figures 5 and 6)
showed a noticeable deviation from the regression line in the higher
values. Specifically, these models underestimated most syllable
durations in the > 280 ms range. Furthermore, an examination of Delta
2 revealed that the residual error was most pronounced for utterance-final
syllables ending in a consonant. Consequently, a correction term was
calculated, which was applied to such syllables in Model 3.
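How the correction term was calculated is not specified here. One simple possibility, shown purely as an assumption, is the mean Delta 2 over utterance-final syllables ending in a consonant, added to the Model 2 prediction for those syllables only. All names and values below are hypothetical.

```python
def phrase_final_correction(delta2, is_final_closed):
    """Mean Delta 2 over the flagged syllables (assumed form of the correction)."""
    selected = [d for d, flag in zip(delta2, is_final_closed) if flag]
    return sum(selected) / len(selected) if selected else 0.0


def apply_correction(model2_pred, is_final_closed, correction):
    """Add the correction term to flagged syllables only."""
    return [p + correction if flag else p
            for p, flag in zip(model2_pred, is_final_closed)]


# Hypothetical values in sqrt(ms); flags mark utterance-final closed syllables.
delta2 = [0.5, -0.5, 3.0, 5.0]
flags = [False, False, True, True]
model2_pred = [10.0, 11.0, 12.0, 13.0]

correction = phrase_final_correction(delta2, flags)
model3_pred = apply_correction(model2_pred, flags, correction)
print(correction, model3_pred)
```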
The predictions of Model 3, which incorporates segmental and
syllabic modelling as well as the phrase-final correction term, correlated
with the observed square root-transformed syllable durations at r = .846
(Figure 7). The residual values from Model 3 vary quasi-randomly
around 0. At the present time, it appears that only more sophisticated
rules for the generation of the schwa vowel may still be able to improve
this model's predictive capacity to some degree.
[Figure 7: scatter plot of measured syllable durations against the predictions of Model 3]
Figure 7. Prediction of the Phrase Model (Model 3): Syllable durations
predicted on the basis of segmental durations, syllable-level factors and
phrase-final lengthening (r = .846). Values are in sqrt(ms).
3.5.1. Stability
The Phrase Model was examined for its predictive stability by
performing Pearson product-moment correlations between various
subsamples of the data and the model's prediction. The resulting data is
presented in Table 5.
Table 5. Pearson Product-Moment Correlations between Various
Subsets of the Dataset and the Phrase Model's Prediction
            slices of 50   slices of 100   slices of 200   slices of 300
            syllables      syllables       syllables       syllables
1st slice   0.9            0.884           0.878           0.869
2nd slice   0.87           0.872           0.789           0.805
3rd slice   0.853          0.852           0.838           0.874
4th slice   0.89           0.726           0.885           0.838
5th slice   0.866          0.823           0.841
6th slice   0.852          0.868           0.838
It can be seen that the model's predictive capacity varies considerably
from one subset to the next. For example, the correlation was only .726
for the fourth slice of 100 syllables in the set, while it had been .884
for the first slice. Even when slices of 300 syllables are compared,
considerable variability prevails. The reasons for these instabilities are
presently being investigated.
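A stability check of this kind can be sketched as follows: the data set is cut into consecutive, non-overlapping slices of a fixed size and a Pearson product-moment correlation is computed within each slice. The function names and toy data are hypothetical; only the slicing scheme follows the description above.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def slice_correlations(observed, predicted, size):
    """Correlation within each complete, consecutive slice of `size` items."""
    return [pearson(observed[i:i + size], predicted[i:i + size])
            for i in range(0, len(observed) - size + 1, size)]


# Hypothetical toy data: the model fits the first slice but not the second.
observed = [1, 2, 3, 4, 1, 2, 3, 4]
predicted = [2, 4, 6, 8, 4, 3, 2, 1]
print(slice_correlations(observed, predicted, size=4))
```

With 1203 syllables and a slice size of 300, this scheme yields four complete slices, which matches the number of entries in the last column of Table 5.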
4. Discussion
By a modified step-wise procedure, a general model for the prediction of
the fast-speech performance of a highly fluent speaker of French was
constructed. The initial model incorporates segmental information
concerning type of phoneme and proximal phonemic context. The
subsequent model adds information about whether the syllable occurs in
a function or a lexical word, on whether the syllable contains a schwa
and on where in the word the syllable is located. The final model adds
information on phrase-final lengthening. The effects of these three
levels are demonstrated on a single sentence in Figure 8. In view of
current discussions surrounding segmental and syllabic contributions to
timing models, it is interesting to note that segmental information
accounts for a major portion of the variance explained by the model. As
Figure 8 shows, segmental information alone successfully predicts
several cases of major syllable lengthening.
[Figure 8: three panels, "Model 1: The Segmental Model", "Model 2: The Syllabic Model" and "Model 3: The Phrase Model", each plotting measured, predicted and delta values in sqrt(ms)]
Figure 8. A comparison of predictions of the three models and measured
syllable durations for the sentence "Son étude ethnologique porte sur la
relation entre les acupuncteurs et les centenaires afghans".
The overall correlation of 0.846 between predictions of Model 3 and the
data set from which the model is derived is encouraging. This
correlation level corresponds roughly to the average inter-speaker
correlation of r = 0.833 for phrase-final syllable durations, as measured
between the readings of a short text by 12 speakers in the
Caelen-Haumont corpus (Caelen-Haumont 1991; see Keller 1994). This means
that the model behaves as differently from its target data as one natural
speaker would behave with respect to another speaker. Although this
may be an acceptable initial predictive level for synthesis purposes,
further improvements in the modelling would be welcome. Preliminary
indications suggest that such improvements may come about through
predictions of the presence vs. the absence of schwa, through explicit
predictions of the effects of speech rate manipulation, and in longer
texts, through a better modelling of pauses. Further information on
possible improvements may also be gained through an examination of
cases of high delta 3 values in subsets of the present data set. These
effects are currently being studied.
It is worth noting that in the present fast-speech corpus, no
phrase-level effects were identified, other than phrase-final lengthening. This is
in contrast to our findings on the production of French at a normal
speech rate, where a fairly systematic increase of lexeme-final syllable
durations was observed over the extent of the prosodic phrase (Keller et
al. 1993). It seems likely that in conditions of considerably accelerated
speech rate, our speaker sacrificed some of the "niceties" of
phrase-internal timing modulation, and limited himself to a single, phrase-final
durational marker.
Considerably more work also needs to be done before the
generalisability of the present model can be tested. The examination of
the model's stability has shown that predictions begin to show
comparable strength at about 300 syllables or more. Consequently,
systematic testing of these predictions for another speaker would
involve a completely new research study. Nevertheless, a few quick
examinations of predictions for another speaker's sentences suggest that
the model may indeed be generalisable to more than one speaker of
French (Figure 9)2.
2 The authors are grateful to the following members of the LAIP team for
their invaluable assistance in scoring and creating the present corpus:
Nicolas Thévoz, Alexandre Enkerli, Hervé Mesot, Cédric Bourquart, Nicole
Blanchoud, and Thomas Styger. Particular thanks go to Prof. J. Local (York
University, UK) for his many ideas and his encouragement. Prof. A. Wyss of
the University of Lausanne is cordially thanked for his participation as a
subject for this study. This research is supported by the Fonds National de
Recherches Suisses (Projet Prioritaire en informatique and ESPRIT Speech
Maps) and by the Office Fédéral pour l'Education et la Science (COST-233).
Figure 9. A comparison of predictions of Model 3 and the measured
syllable durations of another speaker of French for the fast reading of the
sentence "Beaucoup de gouvernements voient le CERN comme un
moteur de modernisation technologique".
REFERENCES
Barbosa, P. and Bailly, G. (1993). Generation and evaluation of rhythmic
patterns for text-to-speech synthesis. Proceedings of ESCA
Workshop on Prosody. Lund: Sweden. 66-69.
Bartkova, K. (1985). Nouvelle approche dans le modèle de prédiction de la
durée segmentale. 14ème JEP. Paris. 188-191.
Bartkova, K. (1991). Speaking rate in French application to speech
synthesis. XIIème Congrès International des Sciences Phonétiques,
Aix en Provence. Actes. 482-485.
Caelen-Haumont, G. (1991). Stratégies des locuteurs et consignes de lecture
d'un texte: Analyse des interactions entre modèles syntaxiques,
sémantiques, pragmatique et paramètres prosodiques, Thèse d'Etat,
Aix-en-Provence.
Campbell, W.N. (1992). Syllable-based segmental duration. Talking
Machines. Theories, Models, and Designs. Amsterdam: Elsevier
Science Publishers. 211-224.
Delais, E. (1994). Prédiction de la variabilité dans la distribution des accents
et les découpages prosodiques en français. XXèmes Journées d'Etude
sur la Parole. Trégastel. 379-384.
Delais, E. (1994a). Rythme et structure prosodique en Français. In C. Lyche
(ed). French generative phonology: retrospective and perspectives.
European Studies Research Institute, Salford. 131-150.
Duez, D. and Nishinuma, Y. (1987). Vitesse d'élocution et durée des syllabes
et de leurs constituants en français parlé. Travaux de l'Institut de
Phonétique d'Aix. 11. 157-180.
Duez, D., Nishinuma, Y. (1985). Le rythme en français. Travaux de l'Institut
de Phonétique d'Aix. 10. 151-169.
Fant, G., Kruckenberg, A. and Nord, L. (1991). Durational correlates of
stress in Swedish, French and English. Journal of Phonetics. 19.
351-365.
Fónagy, I. (1992). Fonctions de la durée vocalique. In P. Martin (Ed.),
Mélanges Léon. Editions Mélodie-Toronto. 141-164.
Fujimura, O. (1981). Temporal organisation of articulatory movements as a
multidimensional phrasal structure. Phonetica. 38. 66-83.
Grégoire, A. (1899). Variation de la durée de la syllabe en français. La
Parole. 1. 161-176.
Grosjean, F. (1983). How long is the sentence? Prediction and prosody in
the on-line processing of language. Linguistics. 21. 501-529.
Grosjean, F., and Deschamps, A. (1975). Analyse contrastive des variables
temporelles de l'anglais et du français. Phonetica. 31. 144-184.
Keller, E. (1993). Prosodic Processing for TTS Systems: Durational
Prediction in English Suprasegmentals. Final Report, Fellowship,
British Telecom.
Keller, E., Zellner, B., Werner, S., and Blanchoud, N. (1993). The Prediction
of Prosodic Timing: Rules for Final Syllable Lengthening in French.
Proceedings of ESCA Workshop on Prosody. Lund, Sweden. 212-215.
Keller, E. (1994). Fundamentals of phonetic science. In E. Keller (ed.),
Fundamentals of Speech Synthesis and Speech Recognition: Basic
Concepts, State of the Art and Future Challenges. Chichester: John
Wiley. 5-21.
Konopczynski, G. (1986). Vers un modèle développemental du rythme
français: Problèmes d'isochronie reconsidérés à la lumière des
données de l'acquisition du langage. Bulletin de l'Institut de
Phonétique de Grenoble. 15. 157-190.
Martin, P. (1987). Structure rythmique de la phrase française. Statut
théorique et données expérimentales. Proceedings des 16e JEP.
Hammamet. 255-257.
Mertens, P. (1987). L'intonation du français. De la description linguistique
à la reconnaissance automatique. Thèse doctorale, Katholieke
Universiteit Leuven.
Monnin, P. and Grosjean, F. (1993). Les structures de performance en
français: caractérisation et prédiction. L'Année Psychologique. 93.
9-30.
O'Shaughnessy, D. (1981). A study of French vowel and consonant
durations. Journal of Phonetics. 9. 385-406.
O'Shaughnessy, D. (1984). A multispeaker analysis of durations in read
French paragraphs. Journal of the Acoustical Society of America. 76.
1664-1672.
Pasdeloup, V. (1988). Analyse temporelle et perceptive de la structuration
rythmique d'un énoncé oral. Travaux de l'Institut de Phonétique
d'Aix. 11. 203-240.
Pasdeloup, V. (1990). Organisation de l'énoncé en phases temporelles:
Analyse d'un corpus de phrases réitérées, (pp 254 - 258). 18èmes
Journées d'Etudes sur la Parole. Montréal. 28 - 31 Mai.
Pasdeloup, V. (1992). Durée intersyllabique dans le groupe accentuel en
Français. Actes des 19èmes Journées d'Etudes sur la Parole.
Bruxelles. 531-536.
Saint-Bonnet, M. and Boë, J. (1977). Les pauses et les groupes rythmiques:
leur durée et distribution en fonction de la vitesse d'élocution.
VIIèmes Journées d'Etude sur la Parole. Aix en Provence. 337-343.
Thévoz, N. and Enkerli, A. (1994). Critères de segmentation: Rapport
intermédiaire. LAIP-Lausanne.
Wenk, B. J. and Wiolland, F. (1982). Is French really syllable-timed?
Journal of Phonetics. 10. 177-193.
Wiolland, F. (1984). Organisation temporelle des structures rythmiques du
Français parlé. Etude d'un cas. Rencontres régionales de Linguistique,
BLLL. 293 - 322.
Wunderli, P. (1987). L'intonation des séquences extraposées en français.
Tübingen: Narr.
Zellner, B. (1994). Pauses and the temporal structure of speech. In E. Keller
(Ed.), Fundamentals of Speech Synthesis and Speech Recognition:
Basic Concepts, State-of-the-Art and Future Challenges. Chichester:
John Wiley. 41-62.
ANOTHER TRAVESTY OF REPRESENTATION:
PHONOLOGICAL REPRESENTATION AND PHONETIC
INTERPRETATION OF ATR HARMONY IN KALENJIN*
John Local and Ken Lodge
Department of Language and Linguistic Science
University of York
1. Introduction
The Kalenjin group of languages, part of the Southern Nilotic or
Chari-Nile family (Greenberg 1964), are spoken mainly in western Kenya.
One of their characteristics is that they display a harmony system which
is said to involve the phonological feature Advanced Tongue Root ([ATR])
(Creider and Creider 1989; Hall et al. 1974; Halle and Vergnaud 1981).
In this paper we address issues of the phonological
representation of [ATR] in Kalenjin and its phonetic interpretation.
Specifically we will show:
• that the harmony system encompasses the C-system as well as the
  V-system
• that [ATR] is best characterised as a phonological unit which has a
  syllabic domain
• that there are harmony constraints on the constituents of
  monomorphemic polysyllables
• that the phonetic exponents of [ATR] harmony provide evidence for
  the need to maintain a strict demarcation between an abstract, relational
  phonology and interpretative phonetic exponents (Pierrehumbert 1990;
  Kelly and Local 1989)
We will argue that one straightforward way of handling the [ATR]
harmony system is in terms of underspecification (cf. Lodge 1993b). On
* Authors' correspondence addresses: John Local, Department of Language
and Linguistic Science, University of York. Ken Lodge, School of Modern
Languages and European Studies, UEA, Norwich NR4 7TJ.
York Papers in Linguistics 17 (1996)
77-117
© John Local & Ken Lodge
the assumption that only unpredictable values/features are specified in
the lexical entry forms of morphemes (cf. Archangeli 1984, 1988) we
will show that
• it is necessary to specify lexically [+ATR] for the dominant
  morphemes and [-ATR] for the opaque ones.
• the adaptive morphemes are unspecified for lexical [ATR] value.
• [+ATR] harmony domains are immediately adjacent. (There is no
  evidence that harmony patterns can or do 'skip' over adjacent
  morphemes.)
• [+ATR] harmony domains encompass immediately adjacent
  unspecified adaptive morphemes or the default value, [-ATR], applies.
We will propose that a formal implementation of our analysis can be
constructed in terms of constraints on structured hierarchies of features
which permit partial specification and structure sharing, combined with
a phonetic interpretation function (Coleman 1992a; Local 1992; Ogden
1992; see also Bird 1990; Broe 1993; Scobbie 1991).
2. Phonetic interpretation of [ATR]
We begin with a consideration of some of the phonetic characteristics of
the [ATR] harmony system in Kalenjin.¹ We will, in the manner of
Firthian Prosodic Analysis, refer to these as 'phonetic exponents'
(Carnochan 1957; Firth 1948; Henderson 1949; Sprigg 1957).
Importantly our investigations reveal that the phonetic exponents of the
[ATR] feature in Kalenjin are varied and not simply confined to the
V-system (a detailed discussion is presented in Local and Lodge
(forthcoming)). The transcriptions in (1) give an impression of some of
these characteristics:
1 The data we discuss is drawn from observations and recordings of a female
and male speaker of the Tugen dialect. Both speakers are in their mid 30's.
(1)
-ATR words
+ATR words
{TO SPRINKLE}²
[khg:klitY1
[i5hentil
{TO SCRAPE UP}
[khl:Y. LIM
Rht:Pgil
{TO DIG UP}
[kh.E9f3.1],1
{TO DIG}
[phg-p]
{MEAT}
[Ph'EJI]
{HARDSHIP}
[]9.]
{FAR}
[]Y?]
{SIX}
{TO GROW}
{TO BLOW}
2.1 Phonetic differences between words of the [±ATR] categories
There are a number of phonetic differences between words in the two
categories which can be observed not only in vocalic portions but also
in the consonantal portions of such words. These differences include
phonatory quality, vocalic and consonantal quality and articulation and
durational differences.
2.1.1 Phonatory differences
The two sets of words exhibit different kinds of phonatory activity. This
is audible in terms of voice quality. Words of the [-ATR] set have
audible breathy phonation as compared with words in the [+ATR] set.
This breathy voice quality is especially noticeable in the rime of the
words. Measurements of the open quotient (OQ) of the glottal cycle
made from electrolaryngographic recordings (Davies et al. 1986; Howard
et al. 1990; Lindsey et al. 1988) and inverse filtering (Karlsson 1988;
Wong et al. 1979) show statistically significant differences which can be taken
2 We adopt the following notational conventions in presenting the
Kalenjin material: [phonetic font] for phonetic material; bold for
phonology; lower case for syntactico-morphological categories; {bold in
braces} for morphemes expressed in terms of phonology; {CAPITALS IN
BRACES} for meanings and glosses. These conventions are based on those
employed by Carnochan, 1957. Thanks to Richard Ogden for comments and
suggestions concerning notation.
to confirm breathiness of phonation (typically, larger OQ values are
found for [-ATR] words). Examination of voice source measurements
also suggests different kinds of laryngeal behaviour in moving from
voice to voicelessness in the two sets of [ATR] words. In [+ATR] words
voicing dies away slowly and continues at low level (often noticeably
overlapping with friction if present). In contrast, in [-ATR] words,
voicing drops off rapidly.
Examination of the spectral characteristics of vocalic portions of
the two classes also reveals differences commensurate with breathy
versus non-breathy phonation (Local and Lodge, forthcoming). There is,
for example, a tendency for words of the [-ATR] set to display a greater
amplitude of the fundamental in respect of the first harmonic.
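The open quotient mentioned above is conventionally the proportion of each glottal cycle during which the glottis is open. The following is a minimal sketch of that ratio only; the function name and the millisecond values are illustrative assumptions and are not measurements from the Kalenjin recordings.

```python
def open_quotient(open_phase_ms: float, period_ms: float) -> float:
    """Return the open quotient (OQ): open-phase duration / cycle duration.

    Both arguments are durations of a single glottal cycle in milliseconds.
    The name and interface are hypothetical, chosen for illustration.
    """
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    if not 0 <= open_phase_ms <= period_ms:
        raise ValueError("open phase must lie within the cycle")
    return open_phase_ms / period_ms


# Invented example cycles (NOT data from the paper): a cycle with a longer
# open phase yields a larger OQ, the pattern reported for breathy phonation.
breathier_cycle = open_quotient(open_phase_ms=3.0, period_ms=5.0)
less_breathy_cycle = open_quotient(open_phase_ms=2.0, period_ms=5.0)
print(breathier_cycle > less_breathy_cycle)
```

On this reading, the report that larger OQ values are typical of [-ATR] words corresponds to the first kind of cycle being more frequent in that set.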
2.1.2 Vocalic differences
There are striking auditory differences in vocalic quality between words
in the two sets. Vocalic portions in [-ATR] words are noticeably more
central (and frequently more open) than those in [+ATR] words. (Note
the open [+ATR] vocoid has a back quality in the region of CV5 [ɑ]
while the open [-ATR] vocoid has a noticeably front quality in the
region of CV4 [a]. These harmonize with appropriate tokens from the
[ATR] sets: [sqmj[sj] [sa.mYisY] thqvgus; [ th.aogu ay ].)
Examination of plots of F1/F2 for tokens of each of the [±ATR] vocoids
in the data confirms the results of impressionistic listening (for
example, [+ATR] vocoids show lower F1 values than their [-ATR]
congeners). For purposes of broad transcription we represent the vowels of
Kalenjin thus: [+ATR] [i e ɑ o u]; [-ATR] [ɪ ɛ a ɔ ʊ].
2.1.3 Consonantal differences
Words of the two categories exhibit differences in types of consonantal
stricture and their ranges of variation. In [+ATR] words we find labial,
apical and velar closure with burst release, or with close approximation;
in comparable [-ATR] words closure with burst release is not found. In
such words lax fricative portions occur but so do portions with open
approximation.
There are also noticeable variations in terms of place of
articulation. 'Coronals' in [+ATR] words are exponed with apico-alveolar
strictures whereas they may be exponed with either apico-alveolar or
dental strictures in [-ATR] words. Generally consonantal pieces in
[+ATR] words are tenser than their [-ATR] equivalents. This can give
rise to the percept of stop-like release of laterals and nasals in [+ATR]
words.
2.1.4 Durational differences
Consonantal and vocalic portions are durationally different in [±ATR]
words. Typically consonantal portions are shorter in [+ATR] words than
in [-ATR] words. This is particularly noticeable in the closure and
release phases of initial and final plosive portions. Averages of vocalic
duration reveal a tendency for [-ATR] vocoids to be shorter than [+ATR]
vocoids but there is some overlap in terms of the ranges of duration.
However, [+ATR] words are routinely longer (measured from beginning
to end of voicing) than are [-ATR] words.
3. Phonological preliminaries: some characteristics of [ATR] domains
Having provided a brief characterisation of the phonetic exponents of
[ATR] we now provide an outline of the main aspects of the
organisation of the [ATR] harmony system in Kalenjin. There are three
different types of morpheme: adaptive, dominant and opaque whose
behaviour can be described as in (2) below:
(2)
(i) dominant morphemes are always [+ATR]; any immediately adjacent
adaptive morpheme(s) will share this value: {MORPH}D.
(ii) adaptive morphemes vary their [ATR] value according to the
specification of [ATR] in their neighbouring morpheme(s):
{MORPH}A.
(iii) opaque morphemes are always [-ATR] and do not vary the
value, even next to a dominant morpheme. They delimit the domain of
dominant morphemes: {MORPH}O.
3.1 Examples of ATR patterning
In (3) - (8) below we give examples of each of these possibilities with
accompanying broad phonetic transcriptions.
(3)  {KE:R}D    {-UN}A         [ke:run]
     {SEE}      directional    {SEE IT FROM HERE}
     root       suffix

(4)  {KU:T}A    {-UN}A         [ko:tun]
     {BLOW}     directional    {BLOW IT HERE}
     root       suffix         (imperative)
(5)  {KA-}A         {A-}A          {KU:T}A    {-E}D          [ka:yu:te]
     recent-past    1sg subject    {BLOW}     continuous     {I WAS BLOWING}
     prefix         prefix         root       suffix

(6)  {KA-}A         {A-}A          {KU:T}A    {-UN}A         [ka:yo:tun]
     recent-past    1sg subject    {BLOW}     directional    {I BLEW IT}
     prefix         prefix         root       suffix
(7)  {KI-}A      {A-}A          {UN}D     {-KE:}O      [kiaunge:]
     far-past    1sg subject    {WASH}    reflexive    {I WASHED MYSELF}
     prefix      prefix         root      suffix
(8)  {KA-}A         {KA:-}O       {KO-}A    {KE:R}D    {-A}A         [kaya:yoye:ra]
     recent-past    perfective    aspect    {SEE}      1sg object    {HE HAD SEEN ME}
     prefix         prefix        prefix    root       suffix
Evidence for the three types of morpheme is as follows. Sentences (3)
and (4) show that the directional suffix {-UN}A is an adaptive
morpheme; in (3) it appears in [+ATR] form and in (4) in [-ATR] form.
Similarly, comparison of (4) and (5) shows that the verbal root {KU:T}A
may also vary in terms of [±ATR] characteristics and can therefore be
treated as adaptive. In (4) we see that any such adaptive morphemes not
in the domain of dominant ones exhibit the exponents of [-ATR].
Comparison of the characteristics of the structures in (5) and (6) shows
that the continuous suffix {-E}D is dominant (therefore [+ATR]) and
that all the other morphemes in its left domain share its [+ATR]
characteristics. In (7) the final suffix is opaque and so it does not share
the [ATR] characteristic of the preceding dominant ([+ATR]) root
{UN}D, while the two adaptive prefixes in the left domain of the root
share its [+ATR] properties. In (8) the perfective prefix {KA:-}O is
opaque and thus the adaptive recent-past prefix {KA-}A at the
beginning of the construction is outside the domain of the dominant
root {KE:R}D. As expected from the behaviour of the adaptive suffix
in (4) this initial prefix is [-ATR]. However, the adaptive morphemes
in the immediate left and right domains of the dominant root share its
[+ATR] characteristics. Note that roots (nominal and verbal) and affixes
may be dominant or adaptive. Affixes may be opaque but roots are not.
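The patterning just described can be sketched procedurally. The following is a minimal illustration, assuming a word is a flat left-to-right list of morphemes each labelled D (dominant), A (adaptive) or O (opaque); the class and function names are hypothetical, and this is not the constraint-based formalisation the paper goes on to propose. Dominant morphemes are lexically [+ATR], opaque ones lexically [-ATR], adaptive ones unspecified; [+ATR] is shared across immediately adjacent adaptive morphemes, and anything still unspecified receives the default [-ATR].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Morpheme:
    form: str   # phonological form as cited in the examples, e.g. "KE:R"
    kind: str   # "D" dominant, "A" adaptive, "O" opaque


def lexical_atr(m: Morpheme) -> Optional[str]:
    """Lexical [ATR] value: '+' for dominant, '-' for opaque, None for adaptive."""
    return {"D": "+", "O": "-", "A": None}[m.kind]


def resolve_atr(word: List[Morpheme]) -> List[str]:
    """Return the surface [ATR] value ('+' or '-') of each morpheme in a word."""
    values = [lexical_atr(m) for m in word]
    for i, m in enumerate(word):
        if m.kind != "D":
            continue
        # Share [+ATR] leftwards and rightwards over adjacent adaptive
        # morphemes; an opaque morpheme (or the word edge) ends the domain.
        for step in (-1, 1):
            j = i + step
            while 0 <= j < len(word) and word[j].kind == "A":
                values[j] = "+"
                j += step
    # Adaptive morphemes outside any dominant domain take the default [-ATR].
    return [v if v is not None else "-" for v in values]


# Morpheme classes as given for examples (3), (4) and (8).
ex3 = [Morpheme("KE:R", "D"), Morpheme("-UN", "A")]
ex4 = [Morpheme("KU:T", "A"), Morpheme("-UN", "A")]
ex8 = [Morpheme("KA-", "A"), Morpheme("KA:-", "O"), Morpheme("KO-", "A"),
       Morpheme("KE:R", "D"), Morpheme("-A", "A")]

for word in (ex3, ex4, ex8):
    print([f"{m.form}[{v}ATR]" for m, v in zip(word, resolve_atr(word))])
```

For (8) the sketch leaves the initial recent-past prefix [-ATR], because the opaque perfective prefix closes off the domain of the dominant root, while the morphemes immediately to the left and right of the root come out [+ATR].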
[ATR] functions in a variety of ways in Kalenjin. In addition to the
harmony patternings in (3) - (8) and the lexical pairs given in (1) above,
it participates, for instance, in some singular/plural distinctions:
[sgm3isil {AWFUL} (plural) is [+ATR]; [4a mYze) {AWFUL} (singular)
is [-ATR]; [PA 9 j] {CALVES} is [+ATR] - [II1Y9.:1] {CALF} is
[-ATR] (see also Tucker and Bryan 1964).
4. Abstractness of phonological categories: [ATR] and the
inadequacy of intrinsic phonetic interpretation
[ATR] harmony is canonically the kind of phonological organisation
which has been seen as a candidate for autosegmental status³ (Clements
1976, 1981; Kaye 1982). We will discuss one such treatment of
Kalenjin [ATR] below. However, it is appropriate here to consider
3 Or within the Firthian tradition as 'prosodic'.
briefly one issue which [ATR] harmony in Kalenjin raises for an
autosegmental analysis - that of the phonetic implementation or
interpretation of the phonological feature [ATR]. While conventional
non-linear approaches may be able to characterise graphically the
long-domain implications of [ATR], it is not immediately clear how such
phonological approaches could deal in any coherent way with the
phonetic implementation of an [ATR] autosegment in Kalenjin given
the range of different phonetic exponents we have outlined above. The
problem arises because in contemporary autosegmental approaches
phonological features are deemed to have intrinsic (or intuitive)
interpretation - the IPI hypothesis (see e.g. Clements (on IPI in feature
geometry) 1985⁴; Durand 1990; Goldsmith 1990; Pulleyblank 1989).
The intrinsic approach to phonetic interpretation represents a continuity
of practice from traditional generative phonologies. In the generative
tradition phonetic interpretation is merely the end point of a process
which maps strings to strings. Phonological representations are
constructed from features taking binary values; phonetic representations
employ the same features with the difference that they usually take
scalar values. In the locus classicus of generative phonology, Chomsky
and Halle explicitly embrace this view of a phonetics-phonology
continuum and write 'We take 'distinctive features' to be the minimal
elements of which phonetic, lexical and phonological transcriptions are
composed' (1968: 64). This undefended position is only made possible
in SPE, as in more recent autosegmental approaches, because there is
no attempt at an explicit formulation of phonetic interpretation. In the
present case it would require a certain amount of ingenuity to postulate
an [ATR] autosegment and find what there is in common between
devoicing of coda approximants, breathy voice quality, front or back
secondary articulation, consonantal length, particular ranges of
consonantal variability and any putative advanced position of the tongue
root.
4 Although Clements argues that the geometric organisation of features
'depends upon phonological, rather than physiological criteria' (1985:
240) it would appear that the categories he discusses are deemed to have an
intrinsic phonetic interpretation.
4.1 Getting the exponents of [ATR] to 'fall out'
It has been suggested to us (van der Hulst, personal communication)
that there might be some kind of phonetic/perceptual relationship even
in this case which might serve to rescue a conventional autosegmental
treatment of [ATR] in Kalenjin in respect of the IPI hypothesis. The
suggested solution would be to propose that [±ATR] is exponed by
degrees of vocal tract tension with [-ATR] exponed by a generalised 'lax'
articulatory setting and [+ATR] by a 'tense' setting (cf. also the
description in Hall et al. 1974: 244, without reference, and Schachter
and Fromkin 1968, on Akan). This might then allow the consonantal
and vocalic features we are concerned with to 'fall out' of the categories
set up by the analysis.
However, such an analysis merely sidesteps the issue in replacing
the feature [ATR] with some other intrinsically interpreted feature.
In itself this begs the question as to why precisely it should be
this combination of phonetic features (not universally 'lax') rather than
some other that is implicated in the interpretation of [±ATR] (see also
the discussion of cross-language differences in the phonetic
interpretation of [ATR] harmony in Lindau and Ladefoged 1986).
Moreover, such a proposal would not provide a readily accessible
account of the durational characteristics of vowels and consonants or the
observed variability in the 'coronal' consonants in the two sets. Nor, as
far as we can discern, would it give us any analytic leverage on the
counter-intuitive phonetic implementation of the open [+ATR] vowel as
[ɑ] and the open [-ATR] vowel as [a].
However, the central problem with postulating universal features
like [ATR] is that the phonetic and phonological levels are confounded,
phonological categories amount to little more than 'rounded up'
phonetics and phonetic detail is constantly being made to fit the
phonology (e.g. Lindau on 'r-sounds', 1985). Since the phonetic
exponents of the harmony system in Kalenjin do not seem to have been
investigated thoroughly until our recent paper (Local & Lodge 1994), it
is of particular concern that a number of analyses have chosen [ATR] as
the phonological designation of the relationships involved.
4.2 Definitions of [ATR]
Harmony systems are of central phonological importance in a large
number of languages. They typically involve two sets of phonetic
exponents which alternate in some way, though not always in the same
way across languages. Let us call these sets A and B; thus far there can
be little disagreement. In the case of [ATR] , however, a search has been
made for a common phonetic parameter for the set of exponents of the
phonological category by investigating some, but not all, such
languages. This search has been limited from the outset by the
unwarranted assumption that the commonality resided solely in vowel
phoneme inventories.
Research by Stewart (1967), Lindau (1975, 1978), Ladefoged (1964
(on Igbo), 1971, 1972), Lindau et al. (1973) and Painter (1973) on the
[ATR] harmony systems in languages of the West African Akan family
establishes a connection between the vowel qualities in the two such
sets and the position of the tongue root. Lindau et al. (1973) show that
advancing of the tongue root may also be used as a mechanism to alter
tongue height, as in German and some English speakers, without there
being any justification for giving the mechanism phonological status
(87).⁵ They thus distinguish between those languages which use tongue
root position as the basis of a phonological vowel harmony system and
those that use it as an articulatory mechanism for raising the tongue
body. Lindau (1978) suggests that the important articulatory effect of
advancing or retracting the tongue root in general is to change the shape
of the pharyngeal cavity and labels the phenomenon [expanded]. This
is an elaboration of Ladefoged's (1971, 1972) suggestion that there is a
phonological (sic) feature [wide] covering three states of the pharynx:
wide, as in advanced tongue root articulations, neutral, where the tongue
root is in its 'normal' position (which may or may not be the position
for [-ATR], depending on the language), and narrow, where the tongue
root is retracted. The last state may be the equivalent of [-ATR] , but
Ladefoged exemplifies it with Arabic M. Lindau (1978: 553) also
suggests that neutral versus narrow is employed in Arabic to
5 Kenstowicz (1994: 20,22) provides a clear instance of the unwarranted
elevation of tongue root to phonological status in his discussion of vowel
symbols.
differentiate between non-emphatic and emphatic consonants
respectively. This is the only reference to consonants in relation to the
position of the tongue root.
With the basic groundwork set up in this way it is easy to see how
phonologists (who have not necessarily investigated the so-called [ATR]
languages directly) find the [ATR] feature attractive as a generic binary
label for the two sets A and B. There is apparently a simple intrinsic
phonetic interpretation of the phonological phenomenon, a convenient
isomorphism: an advanced tongue root produces a wide pharynx, which
equates with [+ATR] in the phonology (see, for instance, Hall and Hall
1980 who, in discussing [ATR] harmony in Nez Perce, comment that
[+ATR] [U1 ] 'follow(s) naturally if the tongue root is in advanced
position when /u/ is articulated' (214)). However, if, as might be
expected, a phonological contrast is exponed by a constellation of
phonetic exponents, it has been traditionally deemed necessary to have a
way of determining the choice of which the (single) exponent should be.
For example, in Gimson (1962: 90) we are told that with regard to RP
pairs of long and short vowels 'the opposition between the members of
the pairs is a complex of quality and quantity', but he decides to take
length as the phonologically relevant characteristic (ibid.: 93). In
Gimson (1945-49) he demonstrates that for native RP speakers vowel
quality and the duration of voicing in the rime are the important cues for
vowel 'length'; the criteria used to come to a decision in Gimson (1962)
seem to be 'tradition' and a language-teaching expedient (cf. 90-93 for
the full discussion). These hardly represent substantive criteria for a
motivated phonological analysis.
In the context of the present paper we need to be convinced that a
single cover term is appropriate for the phenomena under discussion.
But even if this position is adopted, it is important that the
phonological analysis must at least make reference to the wider
phonological and grammatical context of the language concerned, rather
than relying on the discovery of some common physical denominator
(cf Firth 1948).
5. The abstractness of phonological categories
We will start with a matter that concerns the phonetic interpretation of
only the vocalic part of the syllable in Kalenjin: namely, the exponents
of the open V's. First of all, it is striking to note that in the
investigations of those languages which have an open V distinction in
[+ATR] and [-ATR] sets e.g. Akan, (see, for instance, Lindau 1975,
1978, Lindau et al. 1973), little is said about their qualities, the
non-open vowels being the focus of attention. The pharyngeal cross-sections
for the latter show clear distinctions in the position of the tongue root,
but there are no such cross-sections for the low vowels, transcribed in
Lindau (1975) as [a] for [+ATR] and [a] for [-ATR], but in Lindau
(1978) as [a] and [A], respectively, without any comment, though on
the formant chart (Fig.7, Lindau 1978: 552) [a] appears in a relatively
back position near to [a], [A] being omitted. In their transcription of
Kalenjin Halle and Vergnaud use [a] and [a], respectively, again without
elaboration (unfortunately misinterpreted by Carr 1993a: 260-262, as
[a] and [a], respectively).⁶ The important point about the Kalenjin
realizations of the two harmonic sets, as far as the low vowels are
concerned, is that we find the counter-intuitive occurrence of [ɑ] for the
[+ATR] open V and [a] for the [-ATR] open V (cf. the relatively
detailed transcriptions given at the beginning of this paper). Careful
impressionistic observation and acoustic analysis indicates that the
backer of the two vocalics co-occurs with vocalic and consonantal
portions which typify [+ATR] . In other words, the expected tongue body
position on the front-back axis in relation to the assumed position of
the tongue root does not occur. Whatever the facts of Akan, in Kalenjin
the tongue body position is clearly not determined by the size of the
pharynx, so, even if we restricted the phonological domain of the
harmony system to the vowels, for the low vowels we would need the
contrary interpretation of [±ATR] to their interpretation for the non-low
6 Whether [ -ATR] is equivalent to a neutral or retracted tongue root is not a
question we concern ourselves with in this paper, but the issue has led to the
introduction of another feature [RTR] in the analysis of some languages; see
Carr, 1993b and references therein.
vowels - not a happy conclusion for universals of phonetic
implementation.
As far as consonantal articulations are concerned, the available
literature does not provide much in the way of indication of what
happens to them when the pharynx is wide (see, for example, Ladefoged
1972, or Lindau 1978). A narrow pharynx, as we have already noted,
has been implicated in the production of Arabic emphatic consonants.
This is of no help in explaining the consonantal articulations we have
observed in Kalenjin, nor in explaining the difference in phonation
types. It is Stewart (1967: 199) who assumes a relationship between
[+ATR] and breathy voice, for which we find no evidence; on the
contrary, in our data breathy voice in the sonorants goes with [-ATR].
(Halle and Stevens (1969) also offer a tentative determinate account of
the relationship between tongue-root retraction, larynx lowering and
phonatory difference, but the work of Lindau and her associates indicates
that such an association is casual rather than causal). Similarly, the
lenition phenomena and the length phenomena referred to in §2 above
and discussed in detail in Local and Lodge (1994) seem to us to have no
obvious connection with pharynx width, any more than the fact that in
Kalenjin 'coronality' in [+ATR] words has exclusively alveolar
exponents whereas in [-ATR] words it varies between alveolar and
dental exponents. The only conclusion we can draw is that [ATR] can
have no 'basic intrinsic' phonetic interpretation that will allow us to
apply it in any meaningful way to the Kalenjin material under
discussion here. Rather the interpretation of the abstract phonological
relationship designated [ ±ATR] must be accounted for in explicit
statements of temporal and parametric phonetic exponency (Carnochan
1957; Ogden and Local 1995; Sprigg 1957); we cannot appeal to some
kind of free-ride intrinsic phonetic interpretation principle.⁷ If we adopt
7 Compare the statement of Gazdar et al. (1985) concerning similar practices
in syntax. 'Unlike much theoretical linguistics, it [the GPSG exposition]
lays considerable stress on detailed specifications of the theory and of the
descriptions of parts of English grammar ... We do not believe that the
working out of such details can be dismissed as 'a matter of execution' ... In
serious work, one cannot 'assume some version of the X-bar theory' or
conjecture that a 'suitable' set of interpretative rules will do something as
desired ...' (ix)
this position, of course, it has considerable ramifications for all aspects
of the relationship between phonological categories and their phonetic
exponents.
Rejection of the IPI hypothesis is, of course, aligned with the
position of Firthian Prosodic Analysis wherein phonological
representations are entirely relational, encoding no information about
temporal or parametric events (Carnochan 1958; Firth 1948; Ogden
1993; Ogden and Local 1993, 1995; Sprigg 1957). Under this view the
phonological representations are abstract relational structures and are
treated as having no intrinsic phonetic denotation. This is different from
the view we highlighted earlier which is propounded in a number of
contemporary 'non-segmental' approaches where features in the
phonology are deemed to embody a transparent phonetic interpretation
typically cued by the featural name (e.g. Browman and Goldstein 1986;
1989; Bird and Klein 1990; Sagey 1986. See also the discussion in
Keating 1988).
The position we take does not mean that we see no interesting or
'explanatory' links between phonetic phenomena and phonological
structures. Rather our claim is that if we wish to develop a sophisticated
understanding of the relationships between the meaning systems of a
language and their exponents in speech, being forced to provide an
explicit statement of the detailed parametric phonetic exponents of
phonological structure is an essential prerequisite. The feature labels for
phonological units we employ may be given mnemonic labels (e.g.
[ATR]), but their relation to the phonic substance need not be simple.
Because they are distributed over different parts of the syllabic structure,
their interpretation is essentially polysystemic (Firth 1948; Henderson
1949; Carnochan 1957). For example, the interpretation of the contrast
given the feature label [+ATR] or the label [+nasal] at a syllable onset
need not necessarily be the same as the interpretation of the contrast
given the feature label [+ATR] or [+nasal] at a rime (see also the
comments by Manuel et al. 1992 on the phonetic interpretation of
'alveolarity and plosion' in codas of English words). Moreover, the
occurrence of the phonologically contrastive feature [+nasal] at some
point in the phonological structure may generalize over many more
phonetic parameters than those having to do simply with lowering of
the soft palate. Similarly the absence of a feature such as [+voice]
does not necessarily mean that the representation generalizes over tokens
where there is no activity involving vocal fold vibration - vocalic, nasal
and liquid portions typically have regular vocal fold activity, though the
phonological representation to which such portions may be referred does
not necessarily involve the feature [+voice] (cf. Ladefoged 1977; Local
1992).
The consequence of this argument is that nothing at all hangs on
the name of a phonological feature (e.g. [ATR]) provided that the
canonical naive view of the relationship between phonological
categories and phonetic ones is eschewed. That is, provided the semantics
of the phonological categories is explicitly and formally stated then it
really doesn't matter what they are called. All that the 'naming of parts'
achieves is some kind of mnemonic shorthand that can, in the worst
cases, lead to analytical infelicities. There are two aspects to specifying
the semantics: (i) it is necessary to know how the phonological
category(ies) in question relate to other phonological categories, that is,
provide a semantic statement of their place within the phonological
systems and structures and (ii) it is necessary to provide an explicit
statement of the phonetic interpretation of the phonological categories -
this is crucial because, in Firthian terms, it 'renews the connection'
(Firth 1957). For instance, Sprigg (1957:107) writes
'... it is clear that the phonological symbols are purely
formulaic, and in themselves without precise articulatory
implications. In order therefore to secure 'renewal of
connection' with utterances, it becomes necessary to cite
abstractions at another level of analysis, the Phonetic
level: abstractions at the Phonetic level are stated as
criteria for setting up the phonological categories
concerned, and as exponents of phonological categories
and terms.'
We return, therefore, to our initial labels A and B. As cover terms for
the categories that enter into the phonological system, they are as good
as anything else in that they are abstractions from the data without any
phonetic content or implication. It seems to us that this is not
dissimilar to a much simpler example that relates to the phonological
status of a feature [alveolar] or a binary equivalent [+cor, +ant], as
a definition of English /t d n/. As is well known, these three putative
phonological units are subject to (at least) place of articulation
assimilation with a following obstruent or nasal (cf. Gimson 1962, and
more recent discussions in Local 1992; Lodge 1984, 1992; Nolan
1992); in other words, their exponents, in this respect vary in terms of
articulatory place: bilabial, labiodental, dental, palato-alveolar, palatal
and velar, as well as alveolar. The only thing these features have in
common is that they are all indeed place specifications. Clearly, in such
cases as this the alveolar articulatory place descriptor cannot be equated
with the phonological category [alveolar]. The proposals made by
Local (1992) and Lodge (1981, 1984, 1992) involve non-specification
of the place feature for such consonants; in addition, in Local (1992) and
Lodge (1992) feature-changing rules are excluded entirely from the
grammar, as proposed in §8 below, so by having no lexical
specification of a place feature for /t d n/ the necessary level of
abstraction is achieved: these particular sounds are not defined as
alveolar at all, but as those that have no specific place. (For a proposal
that this may be a universal feature of coronals, see Paradis and Prunet
1991.) The appropriate place features are supplied by sharing the
following obstruent or nasal in particular structural domains, with
alveolarity as the default.
However, the case of Kalenjin is more complicated than this, since
the phonetic exponents of the terms of the harmony system cannot
easily be subsumed under a general heading such as 'place of
articulation'.
Fudge (1967) is an early attempt within the framework of
generative phonology to introduce phonological primes with no
implicit phonetic content (with a reference to Firthian Prosodic
Analysis). He states: 'It is ... dangerous and misleading to say that
either articulatory or auditory features ARE the phonological elements,
unless they correlate so closely that no facts of language are obscured by
treating them as if they were the same' (4, original emphasis). The two
reasons he gives to support his claim that facts are obscured if one
assumes identity of phonetic and phonological features are the matter of
biuniqueness (discussed also by Chomsky 1964: 75-95) and
morphophonemic patterns, some of which are counter-phonetic. The
first of these Fudge exemplifies with tone-sandhi in Mandarin, in which
Tone 2 followed by Tone 3, and Tone 3 followed by Tone 3 are both
realized as a high rising followed by a low rising pitch (1967: 4-7).
(There is evidence that such claims trade on less than compelling
phonetic observation - and an innocence about interrelationships
between levels of analysis. See, for example, Chuenkongchoo 1956, on
Thai and Henderson 1960, on Bwe Karen.) The second is exemplified by
the Hungarian vowel system, in which phonetic [o] pairs with phonetic
[a:] in a harmony system partly determined by lip-rounding or lack of
it; they are phonemicized as /a/ and /a:/, respectively. As Chomsky
points out (1964: 74; quoted by Fudge 1967: 10), /a/ is 'functionally
unrounded but phonetically rounded.' Fudge sees this as a convenient
shorthand, but argues that 'it is surely the task of phonology to make
classifications on its own terms, to state explicitly what these
phonetic-sounding labels ('Rounded' and 'Unrounded', 'Long' and 'Short', etc.)
are a 'shorthand' for' (1967: 10). The Hungarian system also contains a
situation parallel to the Mandarin tone-sandhi: [i] and [i:] function
phonologically as both front and back, another pair of features involved
in harmony relations. He then goes on to show how abstract labels - he
uses A, B, 1, 2, a, b, (i), (ii) - can be used to define the phonological
relations involved, and then interpreted in four ways, by means of four
different sets of rules: articulatory, acoustic, auditory and recognitional.
We do not want to go into any further details of Fudge's proposals
(which are segmentally based), but would like to note in particular what
Fudge considers one serious disadvantage of distinctive feature notation,
namely that 'systematic phonemic elements and their systematic
phonetic counterparts are treated in terms which are formally
indistinguishable, and this often forces us to imply that one systematic
phonemic element has been changed into another (Tone 3 HAS
BECOME Tone 2 in our [Mandarin] example). This is not only
undesirable, but also unnecessary, since we do not require complete
biuniqueness in our phonology' (1967: 6). We applaud such cautionary
remarks, but we find it extraordinary that after nearly thirty years only a
few phonologists have started to pay any attention to them.
4.2 Maintaining strict demarcation: Compositional Phonetic Interpretation
We have argued that the IPI hypothesis for phonological categories is,
in the general case, untenable and, in the particular case of [ATR]
harmony in Kalenjin, demonstrably inadequate. In the light of this we
have suggested that it is not only desirable but necessary to adopt an
analysis in which a strict demarcation between the abstract phonological
and physical phonetic levels is maintained, as in Firthian prosodic
analysis. In order to do this, as we indicated, it is necessary to solve the
issue of the phonetic interpretation of phonological categories. To
accomplish this we adopt the proposal of Coleman and Local (1992) for
a compositional phonetic interpretation (CPI) function for partial
phonological descriptions. We sketch only the broad outlines of the CPI
here. (Fuller, more technical descriptions of the phonological theory and
of the formal treatment of the CPI function, as implemented in
the YorkTalk speech generation system, can be found in Coleman
1992a; Local 1992; Ogden 1992.)
In the CPI function adopted here, phonological structures and
features are associated with phonetic exponents. The phonological
descriptions being interpreted are here taken to be unordered acyclical
graph structures with complex attribute-value node labels (cf structures
found in GPSG or HPSG). The statement of phonetic exponents in CPI
has two formally distinct parts: temporal interpretation and parametric
phonetic interpretation. Temporal interpretation establishes timing
relationships which hold across constituents of a phonological graph
while parametric interpretation instantiates interpreted 'parameter strips'
for any given piece of structure (any feature or bundle of features at any
particular node in the phonological graph). The resulting 'parameter
strips' are sequences of ordered pairs where any pair denotes the value of
a particular parameter at a particular (linguistically relevant) time. Thus
in the general case:
((node: partial_phonological_description),(Time_start, Time_2,
Time_end), parameter section)
where the node represents any phonologically relevant contrast domain.
(Ladefoged 1980, argues for a similar formulation of the mapping from
phonological categories to phonetic parameters.) The time values may
be absolute or relative, fixed or proportional. The precise physical
domain of the parameter strips (eg articulatory, acoustic, aerodynamic)
is not of immediate relevance here.
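The general form of an exponency statement given above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration and not the YorkTalk implementation: the names `Exponent` and `to_strip`, the parameter name `f2_hz` and the numerical values are all hypothetical. Time points are taken to be proportional (one of the options mentioned above) and are converted into a parameter strip of ordered (time, value) pairs.

```python
from dataclasses import dataclass

@dataclass
class Exponent:
    """One statement of phonetic exponency (illustrative structure only)."""
    node: dict        # partial phonological description, e.g. {"atr": "+"}
    times: tuple      # proportional time points: 0.0 = start, 1.0 = end
    parameter: str    # hypothetical parameter name
    values: tuple     # one parameter value per time point

def to_strip(exponent, start_ms, duration_ms):
    """Temporal interpretation: map proportional times onto absolute times
    and pair each time with its parameter value."""
    if len(exponent.times) != len(exponent.values):
        raise ValueError("each time point needs exactly one value")
    return [(start_ms + t * duration_ms, v)
            for t, v in zip(exponent.times, exponent.values)]

# Invented values, purely for illustration.
example = Exponent({"atr": "+"}, (0.0, 0.5, 1.0), "f2_hz", (1500, 1700, 1600))
strip = to_strip(example, start_ms=100, duration_ms=200)
```

Because the time points are stored as proportions, the same statement can be reused for constituents of different durations; absolute or fixed timings would need a different representation.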
Under CPI, phonetic interpretation of the phonological descriptions
is constrained by the principle of compositionality (Partee 1984) which
requires that the 'meaning' of a complex expression is a function of the
form and meaning of its parts and the rules whereby the parts are
combined. Under the present proposal, the phonological 'meaning' of a
syllable equals the 'meaning' of its constituents (for a similar approach
see Bach and Wheeler 1981; Wheeler 1981; 1988). The compositional
principle is instantiated by requiring any given feature or bundle of
features at a given place in the phonological structure to have only one
possible phonetic interpretation. So, for instance, in the present case the
Kalenjin words (i) [...] 'good planters' and (ii) [...] 'plant!'
can be given the following Firthian-like, partial representations
(similar representations can be found in Albrow 1975; Carnochan
1960):
(i)  [ATR+] (KOX)
(ii) [ATR-] (KOX)
Here the syllable-domain [ATR] unit, as well as being semantically
distinctive, serves to integrate the other syllabic material
(paradigmatically contrastive 'phonematic units' (Firth 1948)), with
consequences for their phonetic exponency (as we illustrated above).
Given this, then the interpretation of (i) is of the form:
CPI([ATR+](KOX)) = (phonetic exponents of [...])
where CPI is a phonetic interpretation function (cf Coleman and Local
1992). A more fully specified representation of (i) might be given as:
tATR+)
(h(c),
In this representation the units within the syllable are treated as
separate entities or sequences of entities; the superscript symbols
placed before the units (K) and (OX) serve to indicate onset/rime domain
phonation prosodies ('voicelessness' and 'voice'). Such a
representation can be reconstructed as a graph with attribute-value node
labels, thus:
[atr:+]
    [voi:-]
        [cnt:-, nas:-, str:-, cns[cmp:+, grv:+]]
    [voi:+]
        [hi:2]
        [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]]
The compositional interpretation of this schematic representation can be
determined in the following quasi-articulatory fashion:8
1. CPI([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]]) = (contact of tongue back
with soft palate, closure of soft palate ...)
2. CPI([hi:2]) = (relatively mid tongue-height ...)
3. CPI([cnt:+, nas:-, str:-, cns[cmp:-, grv:-]]) = (contact of tongue apex
with alveolar ridge ...)
4. CPI([voi:+]([hi:2], [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]])) =
(succession of CPI([cnt:+, nas:-, str:-, cns[cmp:-, grv:-]]) to
CPI([hi:2]), relative length of CPI([hi:2]), relative slow decay of
voicing of CPI([hi:2]) ...)
5. CPI([voi:-]([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]])) = (voicelessness,
aspiration of CPI([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]]) ...)
8 In a more complete representation backness and roundedness of the
nucleus would be accounted for at the syllable level, thus providing, inter
alia, for an appropriate phonetic interpretation of consonant-vowel
coarticulation (see Local, 1992).
6. CPI([atr:+]([voi:-]([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]]),
[voi:+]([hi:2], [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]]))) = (succession
of CPI([voi:-]([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]])) to
CPI([voi:+]([hi:2], [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]])),
non-maximal backness of CPI([voi:-]([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]]))
and CPI([voi:+]([hi:2], [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]])),
relative palatality of CPI([cnt:+, nas:-, str:-, cns[cmp:-, grv:-]]),
relative shortness of closure and release of
CPI([voi:-]([cnt:-, nas:-, str:-, cns[cmp:+, grv:+]])), tense phonatory
quality and slow decay of voicing of
CPI([voi:+]([hi:2], [cnt:+, nas:-, str:-, cns[cmp:-, grv:-]])) ...)
We have formally tested and verified a CPI for Kalenjin within the
YorkTalk declarative speech generation system employing acoustic
parameters. Discussion and illustration of this and quantitative details of
the phonetic exponents of [ATR] in Kalenjin are given in Local and
Lodge (forthcoming).
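The compositional character of the interpretation can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the YorkTalk implementation: the function names `key` and `cpi`, the two lookup tables, and the flattening of the nested `cns[...]` attribute into plain features are all hypothetical simplifications, and the exponent strings merely paraphrase the quasi-articulatory statements above. Each feature bundle has exactly one entry, and the interpretation of a mother node is built only from the interpretations of its daughters plus the exponents contributed at that node.

```python
def key(features):
    """Hashable, order-independent key for a feature bundle."""
    return tuple(sorted(features.items()))

VELAR = {"cnt": "-", "nas": "-", "str": "-", "cmp": "+", "grv": "+"}
MID = {"hi": 2}
APICAL = {"cnt": "+", "nas": "-", "str": "-", "cmp": "-", "grv": "-"}

# One interpretation per bundle: the compositionality requirement.
TERMINAL = {
    key(VELAR): "tongue-back contact with soft palate",
    key(MID): "relatively mid tongue height",
    key(APICAL): "tongue-apex contact with alveolar ridge",
}
PROSODIC = {
    key({"voi": "-"}): "voicelessness, aspiration",
    key({"voi": "+"}): "slow decay of voicing",
    key({"atr": "+"}): "non-maximal backness, tense phonatory quality",
}

def cpi(features, daughters=()):
    """Interpret a node as a function of its parts and their combination."""
    if not daughters:
        return [TERMINAL[key(features)]]
    parts = [e for daughter in daughters for e in cpi(*daughter)]
    return parts + [PROSODIC[key(features)]]

onset = ({"voi": "-"}, [(VELAR, ())])
rime = ({"voi": "+"}, [(MID, ()), (APICAL, ())])
syllable_exponents = cpi({"atr": "+"}, [onset, rime])
```

A fuller treatment would let the higher node modify the parameter strips of its daughters rather than simply append a statement; the sketch shows only that no node is interpreted by consulting anything other than its own label and its constituents.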
6. Phonological analysis
In order to develop our phonological analysis we shall now consider
Halle and Vergnaud's (1981) analysis of Kalenjin [ATR] harmony, the
contribution of underspecification and then return to a consideration of
the phonetic interpretation of [ATR].
6.1. Halle and Vergnaud's analysis
Halle and Vergnaud's (1981) paper was one of the first to argue for an
autosegmental account of the Kalenjin harmony system. In it they make
a number of substantive claims:
[ATR] autosegments can be linked only to vowel slots in the core
(CV anchor tier) (which they claim is 'obvious').
[ATR] can also be part of the core specifications, but autosegmental
specification overrides core specification.
Autosegments are either linked to the core in the lexical
representations or they are floating, i.e. not linked to the core slots.
Linking is subject to the following conditions (= their (10)):
(9)
i. Each (vowel) slot is linked to at most one (harmony) autosegment.
ii. Floating autosegments are linked automatically to all accessible
vowel slots.
iii. Unlinked autosegments are deleted at the end of the derivation.
(Emphasis original.)
In order to make their analysis work Halle and Vergnaud also find it
necessary to invoke the No Crossing Constraint (for a critique of this
constraint, see Coleman and Local 1989). To account for the facts in (2)
above, as exemplified in (3)-(8), they claim that all vowel slots are
(redundantly) specified [-ATR] and that dominant morphemes have a
floating [+ATR] autosegmental specification in their lexical entry form.
Opaque morphemes are specified with a [-ATR] autosegment. On the
basis of this analysis they give the lexical representations in (10a,b,c)
(= their (1g); we use Halle and Vergnaud's conventions for representing
Kalenjin morphophonology but additionally give broad phonetic
transcriptions).
(10a)
    kI-a-ger
    [luayer] (I SHUT IT)
(10b)
              [+ATR]
    kI-a-ger  -e
    [kiayere] (I WAS SHUTTING IT)
(10c)
       [-ATR]   [+ATR]
    ka-ma-a     -ge:r  -ak
    [kamaayerrak] (I DIDN'T SEE YOU (pl))
In the first case (10a), where all the morphemes are adaptive, Halle and
Vergnaud state that the form is 'subject to no modifications and surfaces
in its underlying form as far as [ATR] harmony is concerned' (1981: 4),
giving [-ATR], the redundant specification of all morphemes. In (10b)
all vowels are [+ATR] because (9ii) links the autosegment accordingly.
In the third example (10c), which is parallel to (8) above, the last three
vowels are linked to [+ATR] by (9ii), but the No Crossing Constraint
prevents it from being linked to the first morpheme; given the linking
of {MA}O with [-ATR], {KA}A surfaces as [-ATR] (= 'is subject to no
modifications').
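The three derivations just described can be mimicked procedurally. The following Python sketch is a deliberately simplified rendering of a Halle and Vergnaud style derivation, not their formalism: the function name `hv_surface` and the labels `"+floating"` and `"-linked"` are hypothetical, and each morpheme is treated as a single vowel slot. Note that the floating autosegment overwrites the redundant core value, which is precisely the feature-changing step at issue.

```python
def hv_surface(morphemes):
    """morphemes: list of (form, autosegment) pairs, where autosegment is
    None (adaptive), "+floating" (dominant) or "-linked" (opaque).
    Returns the surface [ATR] value of each morpheme's vowel slot."""
    values = ["-"] * len(morphemes)          # redundant core specification
    for i, (_, autosegment) in enumerate(morphemes):
        if autosegment != "+floating":
            continue
        values[i] = "+"                      # overwrites the core value
        for step in (-1, 1):                 # link leftwards, then rightwards
            j = i + step
            # A slot already linked to [-ATR] is inaccessible and also
            # blocks slots beyond it (No Crossing Constraint).
            while 0 <= j < len(morphemes) and morphemes[j][1] != "-linked":
                values[j] = "+"
                j += step
    return values

ex_10a = hv_surface([("kI", None), ("a", None), ("ger", None)])
ex_10b = hv_surface([("kI", None), ("a", None), ("ger", None),
                     ("e", "+floating")])
ex_10c = hv_surface([("ka", None), ("ma", "-linked"), ("a", None),
                     ("ge:r", "+floating"), ("ak", None)])
```

Whether a linked [-ATR] autosegment blocks spreading is stipulated here in the loop condition; nothing in the representation itself forces that choice.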
Since they operate with fully specified underlying forms, the
association of the floating [+ATR] autosegment necessarily has the
effect of changing the value of the redundant [-ATR] specification of the
lexical entry form. It is also the case that the 'blocking effect' of the
autosegmental [-ATR] specification of the opaque morphemes is
arbitrary, in that in other cases (though not in Halle and Vergnaud's
paper) spreading can delink such associations (cf. Broe 1992: 153-154).
That is to say, whether spreading can delink or not has to be indicated in
a language-specific way, and possibly even a phenomenon-specific way.
Halle and Vergnaud's analysis highlights three problems. The first
two are of some generality within conventional autosegmental
treatments of languages with [ATR] harmony. First, there is an
unwarranted assumption that [ATR] associates with vocalic slots only.
Second, there is a reliance on procedural, feature-changing rules (see, for
example, the extensive appeal to 'delinking' and 'deletion' in Goldsmith
1990 and papers cited therein). The third problem concerns Halle and
Vergnaud's arbitrary account of the blocking effect of the opaque
morphemes. We will deal with the first of these problems in the
following section and with the other two when we give a declarative
analysis of Kalenjin [ATR] harmony.
7. The syllable domain of [ATR]
It is now appropriate to take a closer look at our earlier claim that
[ATR] harmony in Kalenjin is of syllabic domain. Halle and Vergnaud,
in conventional manner, associate [ATR] autosegments with vowels (in
this way they define dominant morphemes as 'those with [-ATR]' (sic)).
Given that [ATR] harmony systems are conventionally dealt with under
the rubric 'vowel harmony' it may seem somewhat bizarre to suggest
that there is anything odd about this analytic claim. However, as we
indicated at the outset of this paper, the phonetic characteristics of
consonantal portions in Kalenjin also show marked differences
depending on their occurrence in [±ATR] domains. For example, initial
voicelessness and plosion have short voice onset times in [+ATR]
domains, but relatively long voice onset times with relatively greater
amplitude of burst in [-ATR] domains. In [+ATR] words such as
[porpor] ((CRUMBLY), plural) the apical portion is typically a
palatalized trill; in contrast in the [-ATR] form [porpor] ((CRUMBLY),
singular), we typically find a velarized tap or a lax apical approximant.
That consonantal portions should be implicated in the exponency of
'vowel harmony' should not be regarded as odd. There is evidence that in
other 'vowel harmony' languages consonantal portions may also be
different. For example, Kelly and Local (1989: 180) show that in Igbo
comparable intervocalic consonant portions vary in a number of ways
(e.g. in degree of stricture) according to the harmonic V-system they
occur with; Waterson (1956) similarly demonstrates that consonantal
portions in Turkish exhibit harmonic properties which go around with
the so-called vowel harmony in that language. (Dick Hayward (personal
communication) confirms noticeable consonantal differences,
particularly in duration, co-incident with the vowel harmony systems in
Dinka.)
It is important to stress here that the phonetic characteristics of
consonants which we have described are not to be attributed to low-level
'co-articulatory effects' (as might, for instance, be argued in the case of
'emphatic consonant harmony' in Arabic (van der Hulst and Smith
1982)).9 We therefore contest Halle and Vergnaud's assumption about
[±ATR] association. It arises simply because the authors have paid
insufficient attention to the phonetic facts of the language.10
9 Given Whalen's (1990) discussion concerning the 'planned' nature of
so-called low-level 'phonetic coarticulation effects' it is probably dangerous to
propose such an account in any case.
10 This may be a problem of some generality - wherein particular analytic
concerns or 'hunches' focus, in an unwarranted and potentially damaging
The situation we have described for Kalenjin is one in which it
would be arbitrary to assign the harmony feature [±ATR] to either
vowels or consonants. We note, for example, that structural
configurations of the kind in (11) are not permitted:
(11)
* (polysyllabic word)
          (morph)
   syllable       syllable
   C      V       C      V
 +ATR   -ATR    +ATR   -ATR
That is, we do not find cross-combinations of these [+ATR] consonantal
portions with [-ATR] vocalic portions or vice versa. We refer to this
cohesiveness of [ATR] within syllables as the Syllable Integrity
Constraint.
Second, we note here that there are syntagmatic dependencies
between onset and rimal constituents and within the rime between
nucleus and coda constituents. That is, while we find V, CV, VC as
autonomously occurring structures we do not find C (without the
implication of a following or preceding V). Taken along with our
observations about the integrity of [ATR] in CV(C) structures this
suggests that we need to formulate a constraint on the syllabic
association of [±ATR].
manner, phonetic observation (cf Kelly and Local, 1989). This problem is
compounded by the willingness of many current phonologists to 're-work'
the analyses of others.
We have just proposed that the simplest analysis for the phenomena we
have described would be to take the syllable as the minimal domain
of association for [ATR]. We now consider some of the implications of
this claim for autosegmental accounts. A conventional non-linear
analysis would, like Halle and Vergnaud's, propose association of the
[ATR] feature with V-slots and then allow spreading (cf. also
Archangeli 1985; Clements and Sezer 1982; Goldsmith 1990, for
example). Notice, though, that we need to deal with two kinds of
spreading. While both [+ATR] and [-ATR] spread to all material within
syllables, only [+ATR] spreads between syllables. Given the inclusion of
consonantal material in the 'harmonic spreading' and the Syllable
Integrity Constraint, if we adopt the conventional V-association
approach, it is clear that we need to invoke a more complex architecture
of association precedence and/or blocking to ensure that spreading works
in the appropriate fashion. For instance we desire 12(a) but not 12(b).
(12a)
   {morph}A      {morph}D      {morph}O
    usATR          +ATR          -ATR
     C V           C V C          C V
   [+ATR] associated with: C V of {morph}A; C V C of {morph}D
(12b)
   {morph}A      {morph}D      {morph}O
    usATR          +ATR          -ATR
     C V           C V C          C V
   [+ATR] associated with: C V of {morph}A; C V C of {morph}D; C of {morph}O
In 12(a) we have appropriate spreading of [+ATR] to the C's in the
dominant morpheme and to the V and C in the adaptive morpheme
(usATR = unspecified [ATR]). This is in line with our observations that
it is necessary to spread [±ATR] to any onset and coda consonants as
well as vowels, and that dominant [+ATR] harmony spreads to all
adaptive morphemes in its domain.
In 12(b), however, although we have spreading of [+ATR] as in (a)
to the C's in the dominant morpheme and to the C and V in the
adaptive morpheme, it also spreads to the C in the [-ATR] opaque
morpheme in violation of the Syllable Integrity Constraint. Clearly we
need a way of blocking the spread of dominant [+ATR] harmony to the
C's of adjacent opaque [-ATR] syllables. It would be possible to
propose a function which would allow morphemic information to
percolate to the C and V material in such syllables. However, there is a
simpler way of prohibiting this association: ordering the spreading of
[±ATR] to C's within syllables before spreading between syllables.
Once the parochial within-syllable spreading had been accomplished,
between-syllable spreading would ensure that [+ATR] only associated
with V slots which were unspecified for [ATR] and in its immediate left
or right domain. This, of course, is tantamount to associating [±ATR]
with complete syllables in the first place. As we will show now, it is
possible to avoid these somewhat baroque extrinsically ordered
association rules if we treat [ATR] as having a syllabic domain and
adopt a constraint-based feature-sharing analysis of the harmony system.
8. A declarative underspecification analysis of [ATR] in Kalenjin
One way of avoiding destructive phonological rules, in which features
or values are changed or deleted from lexical or, in a derivational
framework, intermediate representations, whilst maintaining a single
lexical representation for each morpheme, is to employ underspecified
lexical representations. Radical underspecification has been developed by
Archangeli (1984, 1988) and applied to the [ATR] harmony system in
Yoruba by Pulleyblank (1988) and Archangeli and Pulleyblank (1989).
The Yoruba system that they describe is different in several respects
from that of Kalenjin, but the same principles of analysis apply in each
case. (In Yoruba, for instance, the vowel /i/ is opaque to the harmony
system, whereas in Kalenjin certain morphemes are opaque.)
In general, in those cases where alternant realizations are involved,
the appropriate feature(s) or feature-value(s) must be unspecified
lexically (cf. Lodge 1992 and 1993a). (Whether one refers to features or
values is to some extent a matter of whether one uses unary or binary
features, respectively; see also the discussion in Calder and Bird 1991.)
Under these assumptions, then, in Kalenjin the adaptive morphemes are
appropriately represented without a lexically specified value for the
[ATR] feature. Dominant morphemes are specified as
[+ATR] (let us say, for the time being, associated with their syllable
head (vowel) slot(s), i.e. not floating as in Halle and Vergnaud's
analysis). [+ATR], being the non-default value, will have in its domain
any adjacent syllables whose head features are not specified for [ATR],
i.e. those of the adaptive morphemes. In those words that involve no
dominant morphemes, as in (4) and (6) above, a language-specific
default rule will supply the redundant specification [-ATR]. (Which
value of [ATR] might be the universal default is unclear; in Yoruba, for
instance, [+ATR] is the redundant value, though the rule is described as
a language-specific complement rule by Pulleyblank 1988: 238, and
Archangeli and Pulleyblank 1989: 180, footnote 11.) The opaque
morphemes are lexically specified as [-ATR], as in Halle and Vergnaud's
account, but given that we have ruled out destructive rules a priori as a
means of restricting phonological theory, such lexical specifications
will automatically serve to 'block' the 'spread' of any feature, since
delinking of any kind is not permitted. Thus, in an underspecification
account opaque morphemes are lexically specified for [ATR], whereas
adaptive ones are not. This will yield lexical representations of the kind
given in (13) for example (8).
(13)
            [-ATR]             [+ATR]
   KA-      KA:-      KO-      KE:R      -A
The unspecified {KO-} and {-A} are in the domain of {KE:R}D and
share its [+ATR] specification. The initial {KA-}A has the default value
[-ATR]. As we demonstrated earlier, this is because the presence of
[-ATR] in the lexical representation of the second prefix delimits the
inheritance domain of [+ATR].
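The underspecification account of (13) can be sketched without any feature-changing step. In the following Python illustration (a minimal sketch under stated assumptions: the function name `resolve_atr` and the encoding of lexical values as `"+"`, `"-"` and `None` are hypothetical, and sharing is assumed to extend over any unbroken run of non-opaque morphemes) lexically specified values are never altered; unspecified values are only filled in, either by sharing with a dominant morpheme in the same domain or by the default.

```python
from itertools import groupby

def resolve_atr(lexical):
    """lexical: list of '+', '-' or None, one entry per morpheme.
    Returns fully specified [ATR] values; lexical values are preserved."""
    resolved = []
    # Opaque ('-') morphemes delimit the domains within which [+ATR] is shared.
    for is_opaque, run in groupby(lexical, key=lambda value: value == "-"):
        run = list(run)
        if is_opaque:
            resolved.extend(run)
        else:
            # Share '+' if a dominant morpheme is in the domain;
            # otherwise the default rule supplies '-'.
            shared = "+" if "+" in run else "-"
            resolved.extend(shared for _ in run)
    return resolved

# (13): KA- (adaptive), KA:- (opaque), KO- (adaptive), KE:R (dominant), -A (adaptive)
lexical_13 = [None, "-", None, "+", None]
surface_13 = resolve_atr(lexical_13)
```

The 'blocking' behaviour of the opaque prefix is not stipulated separately: it follows from the fact that a specified value cannot be overwritten.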
Since, in the case of Kalenjin, we are dealing with constellations of
interacting phonetic parameters which also affect consonantal quality,
our analysis above is equivalent to extending the Ladefoged/Lindau
proposal to any appropriate consonants, as they do for Arabic. The
result is that in Kalenjin the whole syllable is [±ATR], covering both
consonants and vowels; our representation in (13) would then be easily
modified as in (14), as a representation of the results of spreading and
default specification.
(14)
   [-ATR]    [-ATR]    [+ATR]---------------------
     CV        CV        CV        CVC        V
   {KA-}A   {KA:-}O   {KO-}A    {KE:R}D    {-A}A
(We do not concern ourselves here with the difference between long and
short vowels, labelling both as V.)
8.1 Structure-sharing and [ATR] harmony
In §4.2 we proposed a Compositional Phonetic Interpretation function
to allow us a formal means of relating abstract phonological categories
to their phonetic exponents. Here we outline a declarative structure-sharing
account for [ATR] harmony which is consonant with this CPI.
The syntagmatic dependencies outlined in §7 above imply
that V is the head of the syllable rime and that the rime is the head of
the whole syllabic structure. This provides us with an obvious solution
to the formulation of syllabic association of [±ATR]. In recognising
V-system units as heads of rimes, rimes as heads of syllables and
C-system units as dependents we are able to employ a version of the
familiar feature sharing constraints of the GPSG framework (Gazdar et
al. 1985). By designating a daughter of a particular category to be the
head we identify the relationship between that daughter and the mother
as a distinguished one. This allows us to encode the apparent
'feature-spreading' of [±ATR] within a CV(C) structure as a declarative
feature-agreement constraint. What we require is to be able to say:
OnsetFeatures [ATR] = RimeFeatures [ATR] (and NucleusFeatures [ATR]
= CodaFeatures [ATR]). This can be accomplished by employing
versions of Gazdar et al's Head Feature Convention (HFC) and Foot
Feature Principle (FFP) (Gazdar et al. 1985: 50ff; 70ff). These two
constraints may be phrased informally thus for a given fragment of
graph representation:
HFC: The head features of the mother must be an extension
     of the head features of the head daughter.
FFP: The foot features of the mother must be identical to
     the foot features of every daughter.
Combining the HFC and FFP with the structure in (15) below
constrains [SyllableFeatures [ATR]] and [OnsetFeatures [ATR]] to be
identical.
(15)
                     Syllable
             [Syllable features [ATR]]
              /                    \
          Onset                    Rime
 [Onset features [ATR]]    [Rime features [ATR]]
            |                        |
            C                        V
There are two things to notice here. First observe that it does not matter
which of the nodes has its [ATR] value determined or when. The effect
is identical (cf Coleman 1992b). Second, notice that the 'spreading' of
dominant [+ATR] harmony to immediately adjacent syllables can, by
extension, be handled by a similar feature-agreement technique in which
the domain of sharing is the word. In Kalenjin a 'word' consists of a
monomorphemic root monosyllable or polysyllable. These roots
include nominal, verbal, temporal-demonstrative and possessive
morphemes (see Lodge 1993b). Roots combine with other morphemes
(prefixes and suffixes of various kinds) to form larger word-pieces and
these provide the domain of application for the harmony.
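The order-independence of the feature-agreement constraint can be shown with a small Python sketch. It is an illustration under stated assumptions rather than a GPSG implementation: the class `SharedValue`, the functions `make_syllable` and `specify`, and the node names are hypothetical, and structure-sharing is modelled simply by letting every node in the syllable point at one and the same [ATR] cell.

```python
NODES = ("syllable", "onset", "rime", "nucleus", "coda")

class SharedValue:
    """A single [ATR] cell shared by all nodes of one syllable."""
    def __init__(self):
        self.value = None

    def unify(self, value):
        # Information may be added but never changed.
        if self.value is not None and self.value != value:
            raise ValueError("conflicting [ATR] specifications")
        self.value = value

def make_syllable():
    cell = SharedValue()
    return {node: cell for node in NODES}   # every node shares the same cell

def specify(syllable, node, value):
    """Determine [ATR] at one node and report the value seen at every node."""
    syllable[node].unify(value)
    return {name: cell.value for name, cell in syllable.items()}

# Whichever node is specified, the resulting description is the same.
results = [specify(make_syllable(), node, "+") for node in NODES]
```

Since there is no procedure of spreading, there is nothing to order: a conflicting specification at any node is simply rejected.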
Evidence for a word-domain harmony can be illustrated by
considering the constraint on the mixing of [+ATR] and [-ATR] vocalic
and consonantal portions in monomorphemic polysyllabic structures.
Although it is possible, as we have seen in (3)-(8) above, to have
polysyllabic utterances in which [+ATR] and [-ATR] properties may be
mixed, this is prohibited just in the case where the polysyllabic
structure is monomorphemic. So, for instance we find [tari:t] (BIRDS)
and [tana] (BIRDS) where the structures as a whole exhibit [+ATR] or
[-ATR] harmonic characteristics. Structures of the following kind are
prohibited:
(16)
* (polysyllabic word)
          (morph)
   syllable       syllable
   C      V       C      V
 +ATR   +ATR    -ATR   -ATR
The ill-formedness of such structures is a natural consequence of the
constraint-based analysis we have proposed. Though the syllables respect
the Syllable Integrity Constraint the HFC cannot be satisfied for the
(morph) node.
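The two ways in which a monomorphemic polysyllable can be ill-formed can be checked mechanically. The Python sketch below is illustrative only: the function name `violations` and the encoding of each syllable as a pair of (consonantal, vocalic) [ATR] values are assumptions made for the example. It separates a cross-combination within a syllable from a structure like (16), in which each syllable is internally coherent but the syllables disagree.

```python
def violations(syllables):
    """syllables: list of (c_atr, v_atr) pairs for one monomorphemic form.
    Returns the names of the constraints that cannot be satisfied."""
    found = []
    # Within a syllable, consonantal and vocalic portions must agree.
    if any(c != v for c, v in syllables):
        found.append("Syllable Integrity Constraint")
    # The head features of the syllables must agree at the (morph) node.
    if len({v for _, v in syllables}) > 1:
        found.append("HFC at (morph) node")
    return found

cross_combination = violations([("+", "-"), ("+", "-")])
structure_16 = violations([("+", "+"), ("-", "-")])
harmonic_form = violations([("+", "+"), ("+", "+")])
```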
Lodge (1993b) provides further evidence of [ATR] harmony
encompassing word-domains. He shows that apparent failures of [+ATR]
harmony in some pieces can be attributed to the presence of a word
boundary within the piece. For instance, in [kwesa:yailad] in (17),
where the syllables are (elsewhere) demonstrably adaptive, dominant,
adaptive, dominant, the first syllable would be expected to exhibit
[+ATR] harmony features; it does not.
(17a)
{KWES}A  ##  {KA}A  {NA:}D  {-NYA:}D
(GOAT) root; temporal demonstrative, recent-past; possessive suffix
[kwesa:yajiad]11 (OUR GOAT (OF YESTERDAY))
11 Most sequences of two consonants are not allowed, hence the
interpretation of {KWES}+{NA:} as [kwesa:].
(17b)
{TUKA}A  ##  {CA:K}D  {-ET}A
(COW) root; possessive; suffix
[tuyatfa:yet] (THOSE COWS OF OURS)
(17c)
{TUKA}A  ##  {-CA:}D  {-KAJ}O  {-KA}O  {-CA:K}D
(COW) root; recent-past temporal demonstrative suffix; possessive suffix
[toyatfa:yaiyotfa:k] (THOSE COWS OF OURS YESTERDAY)
Similarly in 17(b), [tuyatfa:yet], where the syllables are adaptive,
adaptive, dominant, adaptive, we would expect the first two syllables to
harmonize with the dominant syllable, whereas only the last, adaptive
syllable harmonizes with the dominant [tfa:y]. If these pieces are
analysed as consisting of two words (the second coinciding with the
start of the temporal demonstrative in two cases and the possessive in
the other), we see that this is exactly the point where the harmony
ceases to operate. Once this word division is recognized we find that the
harmony operates exactly as it does in (3)-(8).
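The effect of the word boundary can be sketched by resolving [ATR] separately within each word of a piece. This Python illustration rests on the same hypothetical encoding as before (`"+"`, `"-"` or `None` per syllable, with `"##"` marking the boundary); the function names `resolve_word` and `harmonize_piece` are assumptions, and the syllable values for 17(b) are taken from the description in the text (adaptive, adaptive, dominant, adaptive).

```python
from itertools import groupby

def resolve_word(lexical):
    """Fill unspecified [ATR] values within one word; never change
    a lexically specified value."""
    resolved = []
    for is_opaque, run in groupby(lexical, key=lambda value: value == "-"):
        run = list(run)
        if is_opaque:
            resolved.extend(run)
        else:
            shared = "+" if "+" in run else "-"   # share '+', else default '-'
            resolved.extend(shared for _ in run)
    return resolved

def harmonize_piece(piece):
    """Split a piece at '##' and resolve [ATR] independently in each word."""
    words, current = [], []
    for item in piece:
        if item == "##":
            words.append(current)
            current = []
        else:
            current.append(item)
    words.append(current)
    return [resolve_word(word) for word in words]

# 17(b): adaptive, adaptive ## dominant, adaptive
with_boundary = harmonize_piece([None, None, "##", "+", None])
# The same syllables treated as a single word, for comparison.
without_boundary = harmonize_piece([None, None, "+", None])
```

With the boundary the first two syllables receive the default value and only the final adaptive syllable shares [+ATR], as observed; without it, harmony would wrongly be expected throughout.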
9. Conclusion
Current work in phonological theory is moving away from procedural,
rule-ordered analyses to non-procedural, non-derivational analyses in
which phonological representations are incrementally constructed. The
phonological representations so constructed cannot be destructively
modified - there can be no deletion, 'delinking' or feature-changing
rules. The information in the phonological representation must be
preserved.
In part, this work represents a research effort to elaborate grammars
which favour neither production nor recognition and which allow for a
felicitous interaction with contemporary declarative theories of syntax.
To this extent, the declarative research program in phonology is a direct
descendent of Firthian prosodic analysis (Coleman and Local 1992; Broe
1993; Local 1992; Ogden and Local 1995). The underspecification,
feature-agreement analysis we have provided of [ATRI harmony in
Kalenjin is intentionally undertaken as part of this research program.
Taken together with the Compositional Phonetic Interpretation function
which we have described, it provides a more felicitous account of the
phenomenon than the mechanisms discussed earlier in the paper and the
one offered by Halle and Vergnaud. Unlike the Halle and Vergnaud
analysis, underspecification with feature-agreement avoids the need to
invoke destructive, structure changing rules. Moreover, in constrast to a
conventional V-association account with procedural 'spreading', the
feature-sharing constraint offers a computationally tractable mechanism
of some generality (Bird 1990; Broe 1993; Coleman 1992b; Local
1992; Scobbie 1991) being more constrained and more comprehensive
than a standard analysis in not trading on a naive assumption that the
harmony is simply vocalic. In addition to proposing a computationally
tractable declarative approach to phonological representation we have
also described an explicit declarative, compositional approach to
phonetic interpretation which provides the 'renewal of connection'
(Firth 1948) between the abstract categories of the phonology and their
parametric phonetic exponents.
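The non-destructive character of feature agreement can be illustrated with a minimal sketch. It is not the analysis of Kalenjin given in this paper: the function names `unify` and `agree`, the dictionary encoding of features and the example values are all assumptions made for illustration. The sketch shows only the general idea that partial descriptions are combined monotonically, so that information may be added but never deleted or changed.

```python
def unify(a, b):
    """Combine two partial feature descriptions (hypothetical encoding).

    Returns a new dict containing everything in a and b, or None if the
    two descriptions assign different values to the same feature.
    Neither input is modified, so no information is ever destroyed.
    """
    result = dict(a)
    for feature, value in b.items():
        if feature in result and result[feature] != value:
            return None  # conflicting specifications cannot be unified
        result[feature] = value
    return result


def agree(units, feature):
    """Feature sharing: every unit in the domain carries one value for
    `feature`. Underspecified units receive the shared value; a clash
    between specified units makes the whole description ill-formed."""
    shared = {}
    for unit in units:
        if feature in unit:
            shared = unify(shared, {feature: unit[feature]})
            if shared is None:
                return None
    return [unify(unit, shared) for unit in units]


# Illustrative domain: one unit specified for ATR, the others underspecified.
domain = [{"high": True}, {"ATR": "+"}, {}]
print(agree(domain, "ATR"))
# A clash between two specified units is rejected rather than repaired.
print(agree([{"ATR": "+"}, {"ATR": "-"}], "ATR"))
```

Because `agree` builds new descriptions instead of rewriting old ones, the original specifications remain recoverable from its output, which is the property the declarative approach requires.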
REFERENCES
Albrow, K.H. (1975). Prosodic theory, Hungarian and English. Festschrift
für Norman Denison zum 50 Geburtstag (Grazer Linguistische Studien,
2). Graz: University of Graz. Department of General and Applied
Linguistics.
Archangeli, D. (1984). Underspecification in Yawelmani phonology and
morphology. Unpublished Ph.D. dissertation, M.I.T.
Archangeli, D. (1985). Yokuts harmony: evidence for coplanar
representations in non-linear phonology. Linguistic Inquiry 16.
335-372.
Archangeli, D. (1988). Aspects of underspecification theory. Phonology 5.
183-207.
Archangeli, D. and Pulleyblank, D. (1989). Yoruba vowel harmony.
Linguistic Inquiry 20. 173-217.
Bach, E. and Wheeler, D.W. (1981). Montague phonology: a first
approximation. In W. Chao and D.W. Wheeler (eds). University of
Massachusetts Occasional Papers in Linguistics. Volume 7. Graduate
Linguistics Association, University of Massachusetts. 27-45.
Bird, S. (1990). Constraint-based phonology. Unpublished PhD thesis,
University of Edinburgh.
Bird, S. and Klein, E. (1990). Phonological events. Journal of
Linguistics 26. 33-56.
Broe, M. (1992). An introduction to feature geometry. In Docherty, G. and
Ladd, R. (eds.) Papers in laboratory phonology II. Cambridge: CUP.
149-165.
Broe, M. (1993). Specification theory: the treatment of redundancy in
generative phonology. Unpublished PhD thesis, University of
Edinburgh.
Browman, C. P. and Goldstein, L. M. (1986). Towards an articulatory
phonology. Phonology Yearbook 3. 219-252.
Browman, C. P. and Goldstein, L. M. (1989). Articulatory gestures as
phonological units. Phonology 6. 201-251.
Calder, J and Bird, S. (1991). Defaults in underspecification phonology. In
S. Bird (ed). Declarative Perspectives on Phonology. University of
Edinburgh. 107-125.
Carnochan, J. (1957). Gemination in Hausa. In Studies in Linguistic
Analysis. Special Volume of the Philological Society. Oxford: Basil
Blackwell. 149-181.
Carnochan, J. (1960). Vowel Harmony in Igbo. African Language Studies.
155-63. In Palmer 1970. 222-229.
Carr, P. (1993a). Phonology. Basingstoke: Macmillan.
Carr, P. (1993b). Tongue root harmony, lowness harmony and privative
theory. Newcastle and Durham Working Papers in Linguistics 1. 42-73.
Chomsky, A.N. (1964). Current issues in linguistic theory. The Hague:
Mouton.
Chomsky, N and M. Halle. (1968). The Sound Pattern of English. New York:
Harper and Row.
Chuenkongchoo, T. (1956). The prosodic characteristics of certain particles
in spoken Thai. Unpublished MA thesis, London University.
Clements, G.N. (1976). Vowel harmony in nonlinear generative
phonology: an autosegmental model. IULC. Bloomington: Indiana.
Clements, G.N. (1981). Akan vowel harmony: a nonlinear analysis. In G.N.
Clements (ed) Harvard Studies in Phonology Vol. 2. Distributed by
IULC.
Clements, G.N. (1985). The geometry of phonological features. Phonology
Yearbook 2. 225-252.
Clements, G.N. and Sezer, E. (1982). Vowel and consonant disharmony in
Turkish. In H. van der Hulst and N.V.Smith (eds) The structure of
phonological representations (Part I). Dordrecht: Foris Publications.
213-255.
Coleman, J. (1992a). 'Synthesis-by-rule' without segments or rewrite-rules.
In Bailly, G. and Benoit, C. (eds) Talking machines. Amsterdam:
North-Holland: Elsevier. 43-60.
Coleman, J. C. (1992b). The phonetic interpretation of headed
phonological structures containing overlapping constituents.
Phonology 9. 1-44.
Coleman, J. and Local, J.K. (1992). The 'No Crossing Constraint' in
autosegmental phonology. Linguistics and Philosophy 14. 295-338.
Creider, C.A. and Creider, J.T. (1989). A grammar of Nandi. Hamburg:
Helmut Buske Verlag.
Davies, P., Lindsey, G.A., Fuller, H. and Fourcin, A.J. (1986). Variation of
glottal open and closed phases for speakers of English. Proceedings
Institute of Acoustics 8. 539-546.
Durand, J. (1990). Generative and Non-Linear Phonology. London:
Longmans.
Firth, J.R. (1948) Sounds and Prosodies. Transactions of the Philological
Society, 129-152.
Firth, J.R. (1957). A synopsis of linguistic theory. Studies in Linguistic
Analysis. Special Volume of the Philological Society, 2nd edition,
1962, 1-32.
Fudge, E.C. (1967). The nature of phonological primes. Journal of
Linguistics 3. 1-36.
Gazdar, G., Klein, E., Pullum, G., and Sag, I. (1985). Generalized Phrase
Structure Grammar. Oxford: Basil Blackwell.
Gimson, A.C. (1945-49). Implications of the phonemic/chronemic
grouping of English vowels. Acta Linguistica v.
Gimson, A.C. (1962). An introduction to the pronunciation of English.
London: Edward Arnold.
Goldsmith, J. (1990). Autosegmental and metrical phonology. Oxford:
Basil Blackwell.
Greenberg, J.H. (1964). The languages of Africa. Bloomington: Indiana
University.
Hall, B.L., Hall, R.M.R., Pam, M.D., Myers, A., Antell, S.A. and Cherono,
G.K. (1974). African vowel harmony from the vantage point of
Kalenjin. Afrika und Übersee LVII. 241-267.
Hall, B.L. and Hall, R.M.R., (1980). Nez Perce vowel harmony: an
africanist explanation and some theoretical questions. In R.M. Vago
(ed) Issues in vowel harmony. Amsterdam: John Benjamins.
Halle, M, and Stevens, K.N. (1969). On the feature 'Advanced Tongue Root'.
MIT Research Laboratory of Electronics Quarterly Progress Report
94. 209-215.
Halle, M. and Vergnaud, J-R. (1981). Harmony processes. In Klein, W. and
Levelt, W. (eds.) Crossing the boundaries in linguistics. Dordrecht:
Reidel. 1-22.
Henderson, E. J. A. (1949). Prosodies in Siamese. Asia major I, 189-215.
(Reprinted in Palmer, 1970. 27-53)
Henderson, E.J.A. (1960). Tone and intonation in Western Bwe Karen.
Burma Research Society Fiftieth Anniversary Publication 1. 59-69.
Howard, D.M., Lindsey, G.A. and Allen, B. (1990). Toward the
quantification of vocal efficiency. Journal of Voice, Volume 4, No.
3. 205-212.
Karlsson, I. (1988). Glottal wave form parameters for different speaker
types. Proceedings of SPEECH 88, 7th FASE Symposium.
Edinburgh: Institute of Acoustics. 225-231.
van der Hulst, H. and Smith, N. (eds) (1982). The structure of phonological
representations (Part II). Dordrecht: Foris Publications.
Kaye, J.D. (1982). Harmony processes in Vata. In H. van der Hulst and N.
Smith (eds). The structure of phonological representations (Part II).
Dordrecht: Foris Publications. 385-452.
Keating, P. (1988). The phonology-phonetics interface. In F. Newmeyer
(ed). Cambridge Linguistic Survey, vol. 1: Linguistic Theory:
Foundations. Cambridge: Cambridge University Press.
Kelly, J. and Local, J.K. (1989). Doing phonology. Manchester:
Manchester University Press.
Kenstowicz, M. (1994). Phonology in generative grammar. Oxford: Basil
Blackwell.
Krishnamurthy, A.K. (1992). Glottal source estimation using a
sum-of-exponentials model. IEEE Transactions on Signal Processing
Volume 40. No. 3. 682-686.
Ladefoged, P. (1964). A phonetic study of West African languages.
Cambridge: Cambridge University Press.
Ladefoged, P. (1971). Preliminaries to linguistic phonetics. Chicago:
University of Chicago Press.
Ladefoged, P. (1972). Phonological features and their phonetic correlates.
Journal of the International Phonetic Association 2. 2-12.
Ladefoged, Peter (1977). The abyss between phonetics and phonology. In
Proceedings of the 13th meeting of the Chicago Linguistic Society.
225-235.
Ladefoged, P. (1980). What are linguistic sounds made of? Language, Vol.
56, 3. 485-502.
Lindau, M.E. (1975). Features for vowels. UCLA Working Papers in
Phonetics 30.
Lindau, M.E. (1978). Vowel features. Language 54. 541-563.
Lindau, M.E. (1985). The story of /r/. In V.A. Fromkin (ed) Phonetic
Linguistics: Essays in Honor of Peter Ladefoged. New York:
Academic Press. 157-168.
Lindau, M.E., Jacobson, L. and Ladefoged, P. (1973). The feature advanced
tongue root. UCLA Working Papers in Phonetics 22. 76-94.
Lindau, M.E. and Ladefoged, P. (1986). Variability of feature specifications.
In J.S. Perkell and D.H. Klatt (eds). Invariance and variability in
speech processes. Hillsdale, New Jersey: Lawrence Erlbaum
Associates. 464-479.
Lindsey, G.A., Breen, A.P. and Fourcin, A.J. (1988). Glottal closed time as
a function of prosody, style and sex in English. Proceedings of
SPEECH 88, 7th FASE Symposium. Edinburgh: Institute of
Acoustics. 1101-1106.
Local, J.K. (1992). Modelling assimilation in non-segmental rule-free
synthesis. In Docherty, G. and Ladd, R. (eds.) Papers in laboratory
phonology II. Cambridge: CUP. 190-223.
Local, J.K. and Lodge, K.R. (1994). [ATR]: Advanced Tongue Root or
Another Travesty of Representation? An investigation of Kalenjin.
Paper presented at the LAGB Spring Meeting, University of Salford,
April 1994.
Local, J.K. and Lodge, K.R. (forthcoming). On the parametric phonetic
interpretation of [ATR] in Kalenjin. York Research Papers in
Linguistics.
Lodge, K.R. (1981). Dependency phonology and English consonants.
Lingua 54. 19-39.
Lodge, K.R. (1984). Studies in the phonology of colloquial English.
London: Croom Helm.
Lodge, K.R. (1992). Assimilation, deletion paths and underspecification.
Journal of Linguistics 28. 13-52.
Lodge, K.R. (1993a). Underspecification, polysystemicity and
non-segmental representations in phonology: an analysis of Malay.
Linguistics 31. 475-519.
Lodge, K.R. (1993b). Kalenjin phonology and morphology: a further
exemplification of underspecification and non-destructive
phonology. Paper read to the LAGB Autumn meeting, Bangor,
September 1993.
Manuel, S.Y., Shattuck-Hufnagel, S., Huffman, M., Stevens, K.N., Carlson,
R. and Hunnicut, S. (1992). Studies of vowel and consonant reduction.
Proceedings of the International Conference on Speech and Language
Processing. Volume 2. 943-946.
Nolan, F. (1992). The descriptive role of segments: evidence from
assimilation. In Papers in laboratory phonology II. Cambridge:
CUP. 261-279.
Ogden, R. (1992). Parametric interpretation in York Talk. York Papers in
Linguistics 16. 81-99.
Ogden, R. (1993b). Where is timing? A response to Caroline Smith. Paper
presented at LabPhon 4, Oxford. To appear in A. Arvaniti and B.
Connell (eds) Papers in laboratory phonology 4. Cambridge: CUP.
Ogden, R. and Local, J.K. (1995). Disentangling autosegments from
prosodies: a note on the misrepresentation of a tradition in
phonology. Journal of Linguistics 30. 477-498.
Painter, C. (1973). Cineradiographic data on the feature 'Covered' in Twi
vowel harmony. Phonetica 28. 97-120.
Palmer, F.R. (ed.) (1970). Prosodic Analysis. London: Oxford University
Press.
Paradis, C. and Prunet, J-F. (eds.). (1991). The special status of coronals
(Vol.2 of Phonetics and Phonology). San Diego: Academic Press.
Partee, B.H. (1984). Compositionality. In F. Landman and F. Veltman (eds).
Varieties of Formal Semantics. Dordrecht: Foris. 281-312.
Pierrehumbert, J. (1990). Phonological and phonetic representation.
Journal of Phonetics 18. 375-394.
Pulleyblank, D. (1988). Vocalic underspecification in Yoruba. Linguistic
Inquiry 19. 233-270.
Pulleyblank, D. (1989). The role of coronal in articulator based features. In
CLS 25. Papers from the 25th Annual Regional Meeting of the
Chicago Linguistic Society. Part One: The general session. (Eds) C
Wiltshire, R. Graczyk and B. Music. 379-393.
Sagey, E. (1986). The Representation of Features and Relation in Non-
Linear Phonology. PhD. thesis, Massachusetts Institute of
Technology.
Schachter, P. and Fromkin, V. (1968). A phonology of Akan: Akuapem,
Asante and Fante. UCLA Working Papers in Phonetics 9.
Scobbie, J.M. (1991). Attribute-value phonology. Unpublished PhD thesis,
University of Edinburgh.
Sprigg, K. (1957). Junction in Spoken Burmese. In Studies in Linguistic
Analysis. Special Volume of the Philological Society. Oxford: Basil
Blackwell. 104-138.
Stewart, J.M. (1967). Tongue root position in Akan vowel harmony.
Phonetica 16. 185-204.
Tucker, A.N. (1964). Kalenjin Phonetics. In D. Abercrombie, D.B. Fry,
P.A.D.MacCarthy, N.C. Scott and J.L.M. Trim In Honour of Daniel
Jones. London: Longmans. 445-470.
Tucker, A.N. and Bryan, M.A. (1964). Noun classification in Kalenjin:
Päkot. African Language Studies 3. 137-181.
Waterson, N. (1956). Some aspects of the phonology of the nominal forms
of the Turkish word. Bulletin of the School of Oriental and African
Studies 18. 578-591.
Whalen, D.H. (1990). Coarticulation is largely planned. Journal of
Phonetics 18. 3-35.
Wheeler, D. (1981). Aspects of a Categorial Theory of Phonology. PhD.
dissertation. University of Massachusetts, Amherst. Distributed by
the Graduate Linguistic Student Association, University of
Massachusetts, Amherst.
Wheeler, D. (1988). Consequences of some categorially-motivated
phonological assumptions. In R.T. Oehrle et al. (eds). Categorial
Grammars and Natural Language Structures.
Wong, D.J., Markel, J.D. and Gray, A.H. (1979). Least squares glottal
inverse filtering from the acoustic speech wave. IEEE Transactions on
Acoustics, Speech and Signal Processing. Volume ASSP-27. 350-355.
ON BEING ECHOLALIC: AN ANALYSIS OF THE
INTERACTIONAL AND PHONETIC ASPECTS
OF AN AUTISTIC'S LANGUAGE*
John Local and Tony Wootton¹
Department of Language and Linguistic Science
University of York
1. Preface
A case study is presented of an autistic boy aged 11 years. The analysis
is based on audio-visual recordings made in both his home and school.
The focus of the study is on that subset of immediate echolalia that has
been referred to as pure echoing. Using an approach informed by
conversation analysis and descriptive phonetics, distinctions are drawn
between different forms of pure echo. It is argued that one of these
forms, what we call 'unusual echoes', has distinctive interactional and
phonetic properties which do not have a counterpart in the speech of
non-autistic children. These principally consist of a particular segmental
and suprasegmental relationship to the prior adult turn, a particular
rhythmic timing and a functional opaqueness. This behaviour is set
within the context of this child's general communicative behaviour
which, in various ways, places a premium on the use of repetition
skills. These skills also inform the child's use of repetition in unusual
echoes, though here the interactional and phonetic properties of such
repetitions suggest that they display a distinct interactional stance to the
questions that precede them.
* This work was made possible by a grant from the Innovation and Research
Priming Fund of the University of York. We would like to thank Kevin, his
family, and the staff at his school for allowing recordings to be made, and
Fiona Weir for conducting the collection and preliminary investigation of
the data discussed here. We are grateful to John Kelly and Patrick Griffiths
for their comments on earlier versions.
¹ Department of Sociology, University of York
York Papers in Linguistics 17 (1996)
© John Local & Tony Wootton
119-165
1.1 Introduction
Echolalia refers to the repetition of words that have been used by
another speaker. It is a phenomenon that has come to have special
associations with autism, partly because it often makes up a high
proportion of the early speech of those autistic children who learn to
speak. The words that the child echoes need not be produced in the
immediate context in which the echo takes place. For example, while at
home the autistic child can sometimes repeat jingles that s/he has heard
on the television on some prior occasion, or phrases that have been
heard at school. This type of echoing is often referred to as 'delayed
echoing'. It contrasts with those cases in which the source of the words
being repeated is in the immediate context. Usually, in the research
literature, such 'immediate echolalia' is taken to include child repetitions
which are modelled on the prior turn of the child's interactional partner,
or the prior turn but one.
Within the literature on autism echolalia is generally viewed as a
symptom of this condition. Frith, for example, describes it as 'amongst
the most characteristic behavioural abnormalities of young autistic
children.' (1989:123). Yet, as Frith and others have noted, forms of
repetition akin to immediate echolalia also occur in the speech of
normal children. This raises the question of whether there are differences
between these two populations with respect to either the nature or
frequency of echo usage. The work of Prizant and Duchan (1981)
suggests that autistic children may be packaging a wider variety of
actions within immediate echo formats. When taking account of
non-verbal behaviour, segmental and suprasegmental features, they claim to
show that seven different functional action types can be reliably
discriminated within the overall set of immediate echoes. However,
work on normal children between the ages of about 2;0-3;0, the ages at
which repetition is most rife, also suggests that various actions can be
achieved through repetition formats (McTear 1978; Casby 1986;
Greenfield and Savage-Rumbaugh 1993). It may still be possible that
there are differences between the nature of these action types in the
autistic and normal populations, but for several reasons this is less than
clear-cut. The most obvious is that different kinds of speech act
classifications have been used in studies of normal and autistic
populations. In the light of these and other considerations some writers
can still claim that there is little difference in the forms of repetition
used by normal and autistic children (Rydell and Mirenda, 1991).
In the course of research on autistic echoing further dimensions of
variation within echoes have also been identified. Of special importance
is the exactness of the repetition, the degree to which the words in the
utterance that is the target of the repetition are reproduced. This
parameter is of direct relevance to immediate echoes, and in this respect
distinctions have been made between three sub-types. First are 'pure
echoes', exact repeats of all or some portion of the words used in the
prior target turn. Second are 'telegraphic echoes', repeats of words which
are not adjacently positioned in the target utterance. Third are 'mitigated
echoes', repeats that include some or all words in the target with
additional words added. These three subtypes are illustrated below:
a. Speaker A: Where is daddy's hat
   Speaker B: Daddy's hat [pure echo]
b. Speaker A: Where is daddy's hat
   Speaker B: Where hat [telegraphic echo]
c. Speaker A: Where is daddy's hat
   Speaker B: Daddy's hat there [mitigated echo]
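The three-way distinction illustrated in (a)-(c) can be sketched as a simple word-level procedure. This is an illustration under stated assumptions, not a coding scheme taken from the literature cited here: the function names are hypothetical, utterances are compared as lower-cased orthographic words only, and no account is taken of segmental or suprasegmental detail.

```python
def tokenize(utterance):
    """Split an orthographic utterance into lower-cased words."""
    return utterance.lower().split()


def is_contiguous_run(echo, target):
    """True if the echo words occur as one unbroken run in the target."""
    n = len(echo)
    return any(target[i:i + n] == echo for i in range(len(target) - n + 1))


def is_ordered_subset(echo, target):
    """True if the echo words all occur in the target in the same order,
    possibly with other target words intervening."""
    remaining = iter(target)
    return all(word in remaining for word in echo)


def classify_echo(target_turn, response_turn):
    """Assign a response to one of the three echo sub-types, or 'none'."""
    target, echo = tokenize(target_turn), tokenize(response_turn)
    if not echo:
        return "none"
    if is_contiguous_run(echo, target):
        return "pure"          # exact repeat of all or a portion of the target
    if is_ordered_subset(echo, target):
        return "telegraphic"   # repeated words not adjacent in the target
    if any(word in target for word in echo):
        return "mitigated"     # target words plus additional words
    return "none"


target = "Where is daddy's hat"
for response in ["Daddy's hat", "Where hat", "Daddy's hat there"]:
    print(response, "->", classify_echo(target, response))
```

Applied to the three examples above, the sketch returns 'pure', 'telegraphic' and 'mitigated' respectively.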
Within the autistic population it is the prevalence of pure echoes at a
certain stage of development that seems to be the clearest potential case
of abnormality in the use of repetition. These pure echoes can preserve
suprasegmental features of the target utterance as well as segmental
ones, thus giving the impression of a speaker who is simply parroting
the speech of the other party. Developmentally such pure echoing gives
way to more mitigated forms at later ages, and eventually echoing can
be virtually eliminated (Roberts, 1989).
Although pure echoing is the example par excellence of potentially
abnormal echoing behaviour it is not possible to be entirely clear about
several of its parameters. For example, we do not know whether the
autistic child tends to repeat all the words in the target turn or just some
of them. And in the latter case, which undoubtedly occurs some of the
time, we do not know which words tend to be picked out for repetition.
Their functional properties are somewhat clouded by the fact that their
analysis in this respect has usually been combined with the analysis of
other kinds of echo, notably mitigated echoes. And, above all, there is
still the question as to why this repetition behaviour has the special
attraction that it does for the autistic child. To say this, though, is to
presume that pure echoes have a special status within the repertoire of
the autistic as against the normal child. This, however, is by no means
clear. And, if it is the case that the use of pure echoes can serve normal
communicative functions among autistic children then we need also to
detail the distinctive properties of those that appear abnormal in this
regard.
In this study, which is a case study of one autistic child, we will
focus principally on the child's pure echoes. We have investigated the
different ways in which these echoes can participate in the interaction
process, and we attempt to discriminate those that appear to serve a
recognisable conversational function from others that seem more
equivocal in this regard. In particular we identify a sub-set of pure
echoes, ones that we call 'unusual', to which no obvious functional
description can be attached. We compare this latter set with comparable
instances in studies of normal children so as to decide on whether and in
what ways this behaviour is different from potentially analogous
behaviour found in normal children. And, in general, we try and situate
the child's use of pure echoes within the context of his overall
interactional skills and predilections. In this way we arrive at certain
conclusions regarding how the child comes to use unusual echoes.
2. The child, the data base and methodological approach
The child, who will be called Kevin, was aged 11 years 4 months at the
time when the recordings were made. He lives in England and resides at
home with his mother, father and younger sister, attending a school for
children with special needs each day. In order to gain an empirical
estimation of the degree of Kevin's autism The Childhood Autism
Rating Scale (CARS) (Schopler, Reichler and Renner 1986; Schopler,
Reichler, DeVellis and Daly 1980) was applied to over 4 hours of
audio-visual recordings of Kevin made in various settings (see below).
The result of this rating was 50.5. A CARS score of 37-60.0 is allocated
to the diagnostic category 'Autistic' and given the descriptive label
'Severe Autism' (Schopler et al, 1986: 57).
Audio-visual recordings were made of this child in a number of
different settings. One hour 45 minutes of recording took place in the
child's home. Relevant equipment, such as a tripod mounted camera,
was made available, and instruction given as to its use. All the
recordings were made in the absence of any research worker. The 105
minutes of recording are made up of six sections recorded over two days.
They include sections in which Kevin is playing with his younger
sister, looking at books with his mother, watching TV with relatives,
singing songs with his father and just sitting with his mother and father
in the context of no special activity. The other setting in which
recordings took place was his school where the recordings were
orchestrated by our research assistant. Here we have about 2 hours
involving Kevin in an open classroom situation, in various kinds of
group work with other children and teachers. In addition, three types of
one-to-one session were recorded in the school: a) a 10 minute session
between Kevin and a teacher which focussed on word recognition and the
assembling of word cards into simple sentences; b) a 14 minute session
in which Kevin's mother played a board game with him; and c) 43
minutes in which our research assistant engaged in interaction with
Kevin in the context of drawing activity and a large doll's house. For
reasons that will be touched on later, the various one-to-one sessions
both at home and at school were those that yielded most of the speech
on which our analysis focuses.
Table 1 gives an overview of the main forms of speech employed
by Kevin on our recordings. The main type of speech excluded from this
table is delayed echolalia, speech which did not appear to be addressed to
other people with some specific communicative intent and which
usually consisted of recognisable reworkings of forms of talk that he
had heard on some other occasion. This is excluded from the table partly
because it would prove difficult to segment this talk into discrete
utterances for the purposes of quantification, and partly because its true
extent is difficult to capture from our recordings, especially in the open
classroom situation. Very roughly, Kevin's delayed echoing would make
up at least as much of his talk as does the category 'Other forms of
response to vocal initiation' in Table 1. In addition we have excluded
from Table 1 such things as singing and words he says to himself as he
is sorting word cards into sentences.
Types of child vocalisation                          N    (%)
Vocal initiations                                    9    ( 5)
Pure echoes                                         47    (25)
Mitigated echoes                                     8    ( 4)
Telegraphic echoes                                   0    ( 0)
Other forms of response to vocal initiation by
interlocutor                                       124    (66)
Table 1. Distribution of Kevin's communicative talk aggregated
across a variety of settings.
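The percentages in Table 1 can be checked against the raw counts with a short calculation. The sketch below assumes that the five N values are paired with the five categories in the order listed, and that percentages are rounded to the nearest whole number; the shortened category labels are for display only.

```python
# Counts from Table 1, in the order the categories are listed
# (this pairing of counts with categories is an assumption).
counts = {
    "Vocal initiations": 9,
    "Pure echoes": 47,
    "Mitigated echoes": 8,
    "Telegraphic echoes": 0,
    "Other forms of response": 124,
}


def percentages(table):
    """Return each category's share of the total, rounded to a whole percent."""
    total = sum(table.values())
    return {label: round(100 * n / total) for label, n in table.items()}


total = sum(counts.values())
shares = percentages(counts)
for label, n in counts.items():
    print(f"{label:<26}{n:>4}  ({shares[label]:>2})")
print(f"{'Total':<26}{total:>4}")
```

On this reading the counts sum to 188 communicative turns, and the rounded shares match the percentage column of the table.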
Our definition of 'pure echoes' is stricter than that generally employed in
the literature. It is confined to Kevin's turns which consist exclusively
of exact segmental repeats of all or some of the words used in the prior
target utterance. The Table conveys very well Kevin's low level of
dialogic initiation with other people. Apart from his delayed echolalia
most of his talk takes the form of replies to questions. This is true of
the various echoes in Table 1 as well as the category labelled 'other
forms of response to verbal initiation'. In the main he speaks to others
only when spoken to.
Psychometric information about Kevin is not available. It is also
difficult to make an informed judgement as to his level of language
development on the basis of his vocal output, principally because, as is
evident from Table 1, his speech production consists mainly of
responses to various kinds of question, which on average fall between
1-2 words in length (the mode is 1 word). Both mitigated and pure
echoes are always responses to questions, as well as the 'other forms of
response' speech. The most advanced of his few vocal initiations is Can
I have a crisp please, though we have no means of knowing whether he
has control over the syntax involved in the production of such
sentences. However, his delayed echolalic speech is generally more
complex than that contained in Table 1: here, average utterance length
appeared to be between 4-5 words. Furthermore, in his one-to-one
session with his class teacher he is able to construct, with word cards,
sentences like 'daddy and mummy play ball' and 'daddy make tea for me'.
Our approach to the analysis of the data extracts that form the core
of this paper is one that is principally informed by work in conversation
analysis (Levinson, 1983; Wootton, 1989). This approach insists on
the examination of linguistic and other communicative behaviour
within its local sequential context of production, and seeks inductively
to show how the participants, through the details of their behaviour,
adopt particular interactional alignments. Such an approach is, therefore,
especially concerned with the sequential position that an utterance
occupies, the details of that utterance design (and any co-occurring
non-verbal behaviour) and the way in which an utterance is treated by the
next speaker. Through the evidence that arises from these details we
attempt to construct an analysis that is compatible with the implicit
understandings of the participants as they go about their interactional
business.
The data fragments are given in a modified form of conventional
orthography. Where appropriate for analytic purposes, these are
supplemented with impressionistic phonetic information. Segmental
information is presented in square brackets following orthographic
versions (if such are possible), and pitch information is presented
syllable by syllable beneath the relevant turn in inter-linear format
where the ruler lines are indicative of the top and bottom of the speaker's
pitch-range. Certain other conventions are adopted from conversation
analysis transcription procedures (Atkinson and Heritage, 1984). These
comprise the procedures for depicting speech overlap; the use of '=' to
signify no gap between speakers or within the speech of a single
speaker; where no pitch transcription is given we use '?' to indicate a
general rising pitch contour over a turn (all other turns have general
falling pitch); the use of double brackets to enclose transcriber
comment; the use of colons to mark sound sustension; (hh) to signify
audible aspiration within speech and (he) to signal laughter or
chuckling. Timings of pauses are given in seconds; (.) indicates a pause
of under half a second.
3. General interactional profile
By contrast with normal children the most striking feature about
Kevin's verbal behaviour concerns what is absent rather than what is
present. Unlike normal children (Snow 1986) he rarely initiates
interaction with other people, a pattern that seems as true for his
behaviour in his own home as it is for that at school, and a pattern that
is characteristic of autistic children more generally (Fay 1988). During
free moments at school, for example, he seems content to wander
around the classroom, not seeking out contact with other children or
staff members, occasionally stopping to look at things, but for the
most part absorbed by matters which do not involve direct dealings with
other people. His verbal output at such times is made up largely of
delayed echolalia; during the recordings this type of talk mainly focuses
on regulatory themes. For example, a recurrent utterance frame, both at
home as well as at school, is You do not.., articulated with the
exaggerated forms of intonation characteristic of an adult reprimanding a
child. Typically these utterances are produced on a much higher or lower
pitch, and more loudly, than surrounding talk. They exhibit noticeable
whispery-voiced phonation and syllable-timing and are often done with
dynamic pitch rises on all syllables but the last. Their overall
articulatory setting is noticeably tenser than other utterances.
The very infrequent forms of vocal initiation, making up just 5%
of his overall vocal output recorded in Table 1, consist exclusively of
requests for goods or for the adult to perform an action for him.
Sometimes such requests, though still infrequent, can be accomplished
in entirely non-verbal ways, as when he takes his mother's hand and
moves it towards his back in order to get her to scratch it. When enacted
vocally these requests display distinctive articulatory and prosodic
characteristics, especially in contrast to the articulatory and prosodic
forms that are used to package the remainder of his vocal output. They
are produced relatively high in pitch with wide pitch range; any
on-syllable pitch movements are likely to be accompanied by noticeable
vibrato. The articulatory components are produced laxly and obscurely,
the main impressionistic percept being one of overall nasality running
through the utterance. These turns also exhibit considerable variations
in tempo. Typically they begin slow, accelerate noticeably and slow
down. Taken together these phonetic characteristics yield a markedly
'strange' tenor to the speech produced. Kevin's co-interactants orient to
the obscurity of utterance and variability of tempo in their talk which
responds to these vocal initiations. These features are illustrated in the
extract below:
Fragment (1)
Kevin and his mother sit together on the settee at home looking out of the window. His
mother looks towards him, but does not speak. Two seconds later he turns to his mother
and says whilst she is still looking at him:
K:
[
?'111A1 (inbreath)
l=
((touches M's upper arm))
M:
=Talk slowly Kev [in
K:
]
((still touches M's arm))
M:
You can have a rice cake later
(1.0)
M:
When you've had some dinner
One type of initiation that seems to be entirely absent is that concerned
with identifying the names of people or things. Such initiations are
commonly enough reported in the literature on normal children,
particularly in the kinds of context that frequently occur on our
recordings, such as book reading (Ninio and Bruner 1978). In the
literature on autism there is some suggestion that the vocal/gestural
forms associated with such referential activity are more grossly retarded
than, say, those forms associated with the act of requesting (Sigman,
Mundy, Sherman and Ungerer 1986; Baron-Cohen 1989). But with
respect to pointing, one key ingredient of these referential forms, there
is evidence in Kevin's case that he can use this action, together with
appropriate vocal accompaniment, to engage in acts of reference. Where
he displays this proficiency, however, is in response to questions which
seek such a response from him rather than in acts of initiation.
Although the classification of questions that is employed in Table 2 is a
fairly crude one it nevertheless suffices to show that the large majority
of adult questions to which Kevin gives a non-echoing response are
eliciting from him the name of things or persons. Typically these
questions take forms like 'What's that?', 'Who is that?', 'Its not a snail
its a ?', 'What colour is it?'. For the most part (i.e. 57% of them) they
elicit names of things that he can actually see in his surroundings, and
such namings are frequently accompanied by points on his part.
Types of information                             N    (%)
Visible person/object descriptors               70   (57)
Remote/non visible person/object descriptors    16   (13)
Location descriptions                            5   ( 4)
Course of action information                    30   (24)
Other                                            3   ( 2)

Table 2. Types of information sought by Kevin's interlocutor in
questions which received non-echoing forms of response.
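
The percentages in Table 2 can be checked against the counts with a few lines of arithmetic. The sketch below is in Python and is not part of the original analysis; the names counts and percentages are hypothetical, and it assumes that the counts pair with the categories in the order shown in the table. On that assumption the total is 124 and the first row works out at 56% rather than the printed 57%. The table alone does not show whether this reflects an adjustment made so that the column sums to 100 or an error in the reproduced counts.

```python
# Counts (N) as given in Table 2, assumed to pair with rows in printed order.
counts = {
    "Visible person/object descriptors": 70,
    "Remote/non visible person/object descriptors": 16,
    "Location descriptions": 5,
    "Course of action information": 30,
    "Other": 3,
}


def percentages(counts):
    """Return each count as a whole-number percentage of the total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


total = sum(counts.values())
shares = percentages(counts)

for label, n in counts.items():
    print(f"{label:<46}{n:>4}  ({shares[label]:>2})")
print(f"{'Total':<46}{total:>4}")
```
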
There is ample evidence, therefore, that even though Kevin does not
engage in initiating acts of labelling he does, nevertheless, have a wide
experience and secure grasp of the labelling game when in response
position. In most cases, as in those just discussed in the context of
Table 2, when he replies to a question he produces a word that has not
been used in the question, he replies in a non-echolalic way. Among the
instances of pure echoes, however, there is also evidence of an
orientation to and grasp of such a labelling game. Furthermore, the
techniques through which such an orientation is displayed suggest that
the child has developed quite sophisticated discourse skills in his
management of this game.
4. Repetition skills
In this section we will identify various ways in which those who
interact with Kevin employ forms of turn design which encourage the
use of repetition on his part. In a strict definitional sense his resultant
repetitions are often pure echoes, as will be evident from the extracts we
use by way of illustration. However, most of these repetitions, by
contrast with those we deal with in later sections, appear in no way
misfitted for the sequential positions in which they occur, and in most
cases they are treated by the child's interlocutor as appropriate moves in
the current language game. We begin this discussion by exploring these
matters in labelling sequences, ones in which the child is being asked to
name something. In assisting the child in his identification of the name
in question we shall see that the other party can resort to providing
names that the child then goes on to copy.
An important general feature of interaction between Kevin and other
people is that when they ask him questions he usually does not,
initially, give a vocal response. For example, if we take the same
questions that form the basis for Table 2, questions that elicited non-echoing
forms of response from Kevin, we find that 61% of them occur
after at least one prior unsuccessful attempt by his interlocutor to elicit
a response to some version of that same question. Indeed, in many cases
there are several such prior attempts to elicit a response (e.g. see
fragments 3, 5, 8, 9, 11 and 14 below). And this pattern does not seem
to be a simple function of the possible difficulty of the question.
Questions which seek labels concerning visible objects or persons,
perhaps the most straightforward type of question, are preceded by prior
unsuccessful elicitation attempts in 60% of cases. If non-response is
one type of contingency with which the other party has to deal, a further
contingency is that in which the child produces an incorrect response to
the question. Most of the questions addressed to him, especially
labelling questions, are, of course, test questions, ones for which the
other party knows the answer. So the other party can also be placed in
the position of guiding Kevin towards the correct answer.
In the context of labelling questions both the contingencies
mentioned above, non-response and incorrect response, can be resolved
by the other party providing Kevin with a version of the answer that
they have been seeking in their question. In fragment (2) his mother
says Its jam, while in fragment (3) she says No its a watering can.
Fragment (2)
Kevin and his mother sitting side by side on the settee at home looking at a book. Kevin
begins by correctly identifying a picture of a cake, in response to a question from his
mother:
K:
Cake
A cake with
(1.2)
M:
What's this
((pointing to, and prodding, a place
on the page))
(1.2)
--)
M:
Its ja:::m=
K:
=
(1.3)
M:
So there's ja:m in the ca:ke
Fragment (3) In same context as fragment (2) above:
M:
((pointing to book))
What is it
(1.9)
M:
Its a w::
(
IISJONWALV
)
(0.7)
M:
K:
W-
[
[
spexplc-ofi
lvw91,11 /4060N531 )
M:
No its a wa:tering ca:n
K:
Watering can
M:
What do you do with the watering can?
(
)
(
In then producing a repeat of this label in next position, Jam in
fragment (2) and Watering can in fragment (3), Kevin is taking this
sequential opportunity to produce a first [for him] correct version of the
label that the parent has been attempting to elicit from him. In
producing this version, then, he is displaying his recognition that this
is the appropriate answer. In addition, and as a slight variant of this,
Kevin has another way of constructing such repetitions which displays
an even closer monitoring of this type of assisting turn.
Fragment (4) In same context as fragment (2) above:
M:
What are they ((pointing to book))
)((also points briefly to place
on page))
K:
Berries
M:
They're like berries=they're called
M:
What are they called
(
(1.0)
M:
They're s::tra:[:w b e r r) i es:
I
(.) aren't they
)
(Strawb'ries) eq1.1XMlig)
K:
((no point))
(1.6)
M:
S:tra:wb'ries (.) Ye::s
In extracts like fragment (4) he is able to detect from the early part of
the word that is produced by the other party, in this case strawberries,
what that word is going to be. Indeed, in fragment (4) Kevin also
completes the word prior to the completion of the word by his mother.
In extracts like (4) the other party can subsequently display some doubt
as to the child's grasp of the label in question. In fragment (4) Kevin's
mother goes on to say Aren't they (1.6) Strawberries (.) yes, this
re-exposure of the child to the correct label perhaps being sensitive to the
overlapping position of the child's turn. But in the more frequent cases
like fragments (2) and (3) above there is no evidence of these child
repetitions being in any way treated as problematic, as displaying some
unsound grasp of the language game in question.
A further way in which Kevin can adopt a target word being offered
by the adult occurs in circumstances in which the adult offers the child a
clue as to the nature of the word being sought. The clue consists of the
beginning of the word that the adult is seeking, and such a clue is
offered when it has become clear that the child is having difficulty in
coming up with the word on his own. In fragment (5), for example, the
mother's initial question is answered incorrectly by Kevin, and he is not
able to offer an alternative person in response to either of her follow up
turns. In this circumstance the mother offers the clue/prompt Aun,
which Kevin then manages to complete with tie Sherry [i.e.
'Auntie Sherry'].
Fragment (5) Mother and Kevin sitting on the settee at home; mother holds a cup in her
right hand and has her left arm around Kevin's shoulders, in an affectionate gesture:
M:
Who's coming to see you
M:
Who's coming to see you
(1.4)
((stroking back of Kevin's
neck))
(1.7)
M:
Aun t
Z1:11:
(0.8)
INICE1.)/d I
K:
tie Sherry
M:
Auntie Sherry (.) A::nd?
_/
Similarly, in fragment (6) the child is able to recognise the word that
his mother is seeking, 'caterpillar', from her production of the initial
voiceless velar plosive of that word. Notice that like the 'Auntie Sherry'
instance the child's production of the target word is built as a
completion of the prior turn - that is the initial portion is not produced
in the child's version.
Fragment (6) Mother playing a board game with Kevin in a side room off his classroom
at school. Our research assistant is also present. The game involves throwing a dice, which
has pictures on its sides. Here his mother encourages Kevin to tell her what the picture is
on the exposed side of the dice:
M:
Look at the picture what is it
.
"N.
((initially touches his fingers, then points to the
dice face in her other hand))
sal.s'ne.tiov )
((briefly points to the dice))
K:
=
M:
Suh not a snail its ak
1144P )
(1.0)
K:
((obscure quiet))
M:
its a
[
[ !Clic& )
?Imo )?
((K briefly points to dice))
(0.7)
1
K:
Leaf
[
1?1.:113 )=
((no point))
M:
its ak
(
d(Sa'ICh? ]
((no point))
K:
Caterpillar
M:
Caterpillar right what have you got to do
(
le91)11I1)0
]
In these various ways, therefore, the child exhibits some skill in
monitoring the prior turn of the other party for material that directly
cues what is expected of him in his next turn. Routinely, where a label
is being elicited the child can look to the prior turn of the parent for a
sense of what that label is to be, and in many circumstances, as we have
seen, that will be a successful strategy in that it appears to generate a
label that is commensurate with the immediate sequential requirements.
Labelling games of this kind are important by virtue of their frequency
within our corpus of data, but they are not the only ones in which such
repetition strategies are fostered. Two further types are now discussed.
The first is a type of game that is frequently played with Kevin by
both his mother and younger sister on our recordings. The game, always
initiated by the other party, consists of presenting Kevin with two
options and asking him which of these options he would prefer:
Fragment (7) Kevin sitting on the settee at home between his mother and father. Engaged
in a playful game in which he is presented with alternatives that he chooses from. The game
is already underway when the transcript begins:
M:
D'ye wa::n (uh::m) smacked bottom or a kiss?
K:
Kiss
((takes his finger out of his mouth at beginning of this
utterance, smiles during it and then angles his cheek
to be kissed))
((M kisses K's cheek))
(1.6)
M:
D'you wa::nt (.) a smacked bottom or a tickle
K:
Smacked bottom ((smiles during this utterance))
((M playfully smacks his legs, accompanied by laughter
from K and F))
(1.7)
M:
Do you wa:nt a: (1.2) ki:ss:: (.) or a tickle
((K's laughter continues through this utterance))
K:
Kiss
((turns his head towards M, for kissing, at end of this
word))
Presumably, one feature which makes the game attractive from the
point of view of his interactional partner is that it seems to work. It
generates serious signs of recognition that Kevin understands the
options in question, an understanding displayed partly, perhaps, through
his systematic avoidance of certain options, notably being tickled, and
through the laughter and horseplay in the course of the game's
enactment. Our interest is particularly in the way in which the options
are presented. They are both explicitly mentioned by the other party, and
characteristically Kevin chooses between the options by repeating the
name of that which he prefers. The fact that he does not always select
the second of the options with which he is presented is important for
later arguments. For now we emphasise that his grasp of the options in
question is not just suggested by the considerations above, but also in
the minutiae of his non-verbal behaviour: when choosing kiss, for
example, his presentation of his cheek for kissing displays an
expectation that this will now take place. In these ways his choice of an
option is bound up with more than labelling a possibility, it earmarks a
course of action that he now expects to take place.
The second interactional tactic with which we will be concerned is
also typically used in circumstances in which the other party is seeking
guidance from Kevin as to some next course of action. We have already
noted that Kevin's co-interactant is often faced with a situation in which
no response is made to a question. One course of action that the other
party can then use in these circumstances is to transform the question
into a yes or no alternative.
Fragment (8) K sitting on settee between his mother and father.
M:
D'you want to go to bed?
'Si'
K:
M:
( (then inclines his head more
to M))
Kevi::n
(.)
(IS IS IS IS')
[Kevin
(0.7)
M:
Kevin
(1.3)
M:
Kevin listen (.)
K:
(look at me
((puts her hand to K's chin at
[
beginning of this
(
turn, and directs his face
(
towards her))
(
'SY 'SY
VSY ]
(0.7)
M:
Look at me d'you want to go to be [d
((K pushes her hand away from his
chin after word 'me'))
[
K:
(
.
.
ISJ 'SY
)
((then he looks
away from M))
(2.0)
((M takes hold of his chin and redirects his face
towards her))
-4
M:
Yes or no
K:
Yes
M:
Yeas?
(
1.00
( . )
I((as he says this he pulls his chin
from her and looks away))
Are you tired
So, in fragment (8), after eliciting nothing other than intermittent
voiceless alveolar fricative sounds from Kevin regarding her enquiry as
to whether or not he wants to go to bed, his mother eventually
formulates the question as Yes or no?. Such a formulation makes it
possible for Kevin to answer the original question by picking one or
other of the two alternatives, and he responds to this by saying Yes.
Here again, then, we find forms of turn design being used by other
parties which provide a word that the child can use in coming up with
an answer to a question. Indeed, such turn designs might be attractive
precisely because they offer such a ready facility to the child.
In his speech with others, therefore, Kevin is mainly concerned
with responding to questions, and in the course of this, and in a number
of ways, his co-participants offer within their own talk words that
Kevin can draw on in constructing a response. In this sense, the
availability of repetition to Kevin as a discourse strategy is built into,
and fostered, through the turn designs of those he interacts with. And
these turn designs are particularly found in circumstances in which the
child has not responded or has responded inaccurately. Here, therefore,
there is the potential for repetition, as a strategy, to have a particular
significance for the child in resolving communication disorder of one
kind or another. But its use, as we have seen, is not exclusive to such
contexts. In fragment (7), for example, the possibility for repetition to
be a viable response is built into the design of turns that are not
officially designed to handle a communication problem, and there are
other discourse contexts within our data corpus where such is the case.
For example, when his teacher asks him to assemble word cards in order
to make a sentence she gives him the cards and then vocally models the
sentence that he is to make. His job is to reproduce that model, and as
he tries to do this he will often say to himself the words that the teacher
has used. Here again, as in most of the extracts above, there is little
sense of the child's use of repetition being out of kilter with the task in
hand. But there are some pure echoes where this is not the case, and it is
these which will principally occupy us in subsequent sections.
5. Inapposite repetition
In a formal sense many of Kevin's repetitions that we have discussed in
the previous section are pure echoes, consisting exclusively of exact
segmental repetitions of all or part of a prior adult turn. In the main
they appear to be accepted as appropriate conversational moves by the
child's co-participant, and in some cases, such as fragment (7) there is
good supporting evidence that the child's grasp of the functional role of
the repetition is congruent with that of the co-participant. In other
cases, however, there might remain doubt as to the kind of
understanding displayed through the child's repetition even though the
co-participant accepts the child's act as an appropriately fitted
conversational move. For example, in fragment (5) it is possible that
although the parent is successful in prompting the label 'Auntie Sherry'
it may not be the case that Kevin recognises that Auntie Sherry will be
coming around later that day. The parent's prompt may simply serve to
select one of a number of person descriptors available to the child. And
in fragment (8) there is no supporting evidence suggesting that Kevin
himself understands that his Yes amounts to an interest in going to bed:
for example, on saying this he does not make any physical move which
would be consistent with such an understanding.
This kind of semantic/pragmatic insecurity is often tied up with the
possibility that at times the child may be operating with a different kind
of language game than his recipient. This possibility is concealed, and
must remain uncertain, within cases like fragment (5) because the
answer that the parent is seeking, 'Auntie Sherry', may also be an
answer to an alternative language game that the child might be playing -
that of simply guessing which person his mother is referring to. Such a
possibility is, however, more clearly realised in other instances like
fragment (9) below:
Fragment (9) Kevin sitting on the settee at home between his mother and father. The
earlier part of this sequence is transcribed in fragment (13). As the sequence below begins
he is sitting with his finger in his mouth, looking frontwards, not at M or F:
M:
Kevin look at my poor cheek
((at the beginning of this turn she touches K's
shoulder, then uses that hand to point to her
cheek))
(0.9)
M:
((K stills his movements here, but
does not look at M))
Kevin look at my poor che[ek
(((initially M touches K's
(hand, which is still in his
(mouth, then points to her
(cheek))
(Cheek ( tslin )
K:
((turns to look at M, and moves hand from mouth))
((K smiles and points at cheek))
M:
Look
((pointing again at her cheek))
Here Kevin's mother is attempting to establish a connection between a
mark/stain on Kevin's trousers and some offence that Kevin has
committed at an earlier date, an offence which involved his biting her
cheek. After initial difficulties in gaining a response from him, and
remedial action in the form of touching his hand, Kevin eventually
looks at her when she says Look at my poor cheek, words that he can
see are also accompanied by a point by her to her own cheek. Kevin's
response is to point to her cheek and say Cheek; in fact his production
of this word begins prior to his mother's completion of the word Cheek.
The fact that he also points to the cheek, that this action is accompanied
by a smile and that he just repeats the word 'cheek' (rather than, for
example, 'poor cheek') suggests that Kevin's understanding of the
sequential expectation obtaining here is for him simply to label the
parent's cheek. Just after our transcript ends, once he has become aware
of the earlier offence connotations being addressed by his mother and
father, his facial demeanour radically changes; pleasure gives way to
intense seriousness. And his mother's response to his production of
cheek in fragment (9) itself also treats it as misfitted for its sequential
position. Her follow up, look, uttered whilst he is already looking at
the cheek in question, is clearly attempting to obtain a recognition of
the bite related aspect of the cheek.
In this, and other cases, therefore, there is a basis for supposing
that the procedure that generates a pure echo on the child's part, the
language game that he is playing, can be orderly, though discrepant
with that of his co-participant. In fact such discrepancies can appear not
just in situations where he produces echoes, they can also be a feature of
exchanges in which he produces forms of non-echoing response. For
example, in fragment (10) he produces the label Sun in response to his
mother's question Listen what have you got to do?, a response that is
understandably treated as misfitted to this question by his mother, who
reposes it subsequent to his response:
Fragment (10) Mother and Kevin playing the board game at his school: see fragment (6)
above for description of the game. Mother is holding the dice, which has a picture of the
sun on the top:
M:
K:
M:
Kevin what do you (.) have to do
( (looks away, then says))
(
ST1
1
Kevin listen
(0.7)
-+
M:
Listen (.) What have you got to do
then points to
((she taps his hand at word
top of dice: K's gaze goes to dice))
Sun
M:
)
((and he points to top of dice))
You've got to:?
(
Here, as in fragment (9), Kevin's labelling response appears to be cued
by the fact that when he turns to monitor his mother's action she is
pointing to the focal object in question. His labelling, therefore, arises
out of non-verbally influenced understandings of the prior turn of the
adult.
6. Unusual repetition
To this point we have outlined two types of pure echo. In both of these
the child's repetition represents a move in a recognisable language
game, even though in the second type, just dealt with, such a move is
misfitted for the sequential environment in which it takes place. Within
Kevin's corpus of pure echoes there remains a further subset that does
not fall easily into either of these two categories. This consists of
echoes for which a functional description is much more elusive, ones
that do not appear to amount to moves in recognisable language games.
Indeed, for this reason it may seem somewhat questionable to treat
them, as we have done in Table 1, as communicative actions that are
commensurate in this respect with the other forms of pure echo.
Leaving this issue aside for the moment our initial strategy will be to
illustrate this sub-type with two clear examples of it, and then to draw
out from these and other examples some general properties of what
seem to be these more unusual and puzzling forms of repetition.
The two initial fragments with which we will be concerned in this
section are (11) and (12) below:
Fragment (11) Kevin and his mother are in the same board game activity as fragments
(6) and (10) above. As this sequence begins M is holding the dice and its container in her
hand and K is looking away, towards the camera:
M:
Whose turn is it [hy41h.itiled?p']
((then M adjusts cards on the table between them,
and K looks at the table))
(1.5)
M:
[ hi1:416:neecre
whose turn is it
((M manually indicates to table))
((Near end of pause K looks away))
(1.5)
M:
Whose turn
is it
[ 14.4101:1:niajlirtjh
((begins to reach for container M is
K:
holding))
(.)
K:
Turn is it ['thl:MaNth]((looking at M's face))
M:
Whose turn is it
((withdrawing her hand that holds container))
K:
Kevin's turn
((his hand now flat on table, not reaching for
container, now looking at table))
Fragment (12) In the same context as fragment (9) above, in fact in the sequence
preceding that extract. Kevin has been closely inspecting, and pointing to, one knee of his
trousers; as he does this he says quietly, in a tuneful rhythmic way, going that doing that
on (.) purpose doing that:
M:
Do what on purpose
((K then leans back and half looks towardS M))
(0.7)
are
doing that on purpose
M:
Yes you
M:
you're making a hole aren't you
((as M says this she moves K's hands away from his
knee))
K:
Doing
doing a hole (in it?)
a hole
NottinefiV
M:
Look
dwtkrinelh3On?)
((brief point by M to knee of trousers))
(1.4)
M:
Who did that
((sustained point to K's knee))
K:
Who did th-
(
15041%5
I
((moving his head back sharply))
M:
-4
K:
K (evin
(Who did that
[
ihAl,141(la?)
((said as his head comes 'back' to its level
position))
M:
Who did it
K:
Kevins did it
M:
Kevin did it yes
When we speak of these instances as being 'puzzling' we refer in part to
the ways in which they are treated by the adult involved. In both these
and the other cases in this subset the adult responds to Kevin's pure
echoes by reposing the target turn to which the echo was a response.
The child's echo is not officially being credited with meaning by the
child's co-participant, and in this sense is posing a puzzle to them as
well as to the analyst. This way of responding to the child's echoes
contrasts with the responses to pure echoes in fragments (2) and (3),
that have been previously discussed. But this is only one aspect of their
puzzling nature, for we have also seen that some earlier forms of echo
are treated similarly by the adult (in fragments (9) and (10)). What
makes the echoes in (11) and (12) especially puzzling is that, by
contrast with those in (9) and (10), they do not seem to be clear-cut
moves in any recognisable language game. This claim needs spelling
out a little more, particularly in the light of the analysis of echoes,
described earlier, carried out by Prizant and Duchan (1981).
Take fragment (11) above. Here, at the time at which Kevin
produces his echo Turn is it, there is direct evidence of a co-occurring
hand movement, a right hand reach to take the shaker that is being
proffered by his mother, and there is evidence of Kevin orienting to his
mother by looking at her. These features should assist in assigning this
overall echo configuration into one of the various functional categories
outlined by Prizant and Duchan. Yet in various ways this remains a
slippery exercise. For example, it could fulfil their criteria for being a
request, for being a 'yes' answer to his mother's question or even for
being a self-regulatory remark that accompanies his reach. The reaching
for the object, for example, could be taken as an affirmation of the fact
that he wants it, or it could be taken as evidence of his desire to obtain
it. Such matters seem deeply opaque in such instances. Furthermore, it
remains a possibility that the child's reach is not strictly connected with
the utterance that comes to accompany it. His reach movement begins
immediately on the production of his mother's turn, while his Turn is
it, together with his gaze switch towards her, is initiated only after she
has said the remaining words. So Kevin's overall action configuration
could be generated by initially embarking on a course of action, taking
the shaker, and then speaking and orienting to his mother on finding
himself to be the recipient of her question. In some ways the continuing
assuredness of his take attempt and the uncertainty expressed through
his continuing gaze at her also speak to such a possibility. Even greater
uncertainty features in fragment (12). This time there are no
accompanying gestures nor any gaze toward the adult. Kevin's Who did
that simply seems to repeat back the adult question, with no obvious
indicator of any particular kind of communicative intent.
Therefore, the subset of pure echoes with which we are dealing here
has puzzling features both from the point of view of the adult responder
and from the point of view of the analyst attempting to engage in
functional description. We now turn to describing some typical features
of this type of echo.
There are three properties of this sub-group of pure echoes which
will be addressed. First their segmental correspondence to the model that
they are echoing, second their intonational correspondence to this model
and third their timing in relation to this model. By segmental
correspondence we refer to the fact that the child includes in his echo all
the words that occurred in the target/model turn after the initial word
that begins the echo. So, in fragment (11) the child could have echoed
by saying just 'turn', or by producing a telegraphic version such as 'turn
it'. In fact, he produces all the words which occurred in the parental
model after his initial word, 'turn' i.e. Turn is it. This is an important
feature because we have seen that some of Kevin's echoes can consist of
just repeats of non turn final words that are present in the model,
notably in fragments (7) and (8). The only exception to this pattern of
word inclusion within the present subset of pure echoes is one instance
in which Kevin drops an address term that the parent has used in the
original model (i.e. the parent says What is it Kevin? and Kevin replies
What is it?). From a segmental phonetic point of view, too, these
echoes show quite remarkable attentiveness to the articulatory
characteristics of the model. Fragment (11) above and (13) below
exemplify this close segmental matching. For example, in fragment
(11) Kevin's mother's three versions of 'turn is it' are noticeably
different in the 'is it' portions. The first is [ere?p1], the second [deer
V], the third is [iziertill. The vocalic portions of Kevin's production
[izjitih] have the qualities of his mother's third, rather than first or
second version, and the final consonantal portion displays the same
front resonance, apicality and aspiration (not noticeable in mother's first
two versions) as the immediately preceding version. Similarly, Kevin's
echo production of the word boat in fragment (13) shows striking
similarities to the preceding adult model rather than to his own prior
non-echoed production of the same word:
Fragment (13) Kevin and his mother sitting side by side on the settee at home looking at
a book:
M:
oh: what's this (0.1) Kevin (0.1) what is it
K:
it's a boat
M:
boat ( INAI(Yh](.) yes (0.2) what's the boat on
M:
(0.4) where's the boat on (0.2) Kevin (.) Kevin (.)
M:
00 oo
( WOO
what's the boat on
(0.1)
K:
river
M:
river yes
(0.2)'
K:
(coughs)
M:
would you like to go for a ride in the boat (Vaytvh)
K:
boat
(
bYakkrivh
(.)
M:
yes or no
We can notice here that Kevin's first production of boat is segmentally
different from his mother's in a number of respects. The vocalic portion
of Kevin's production has noticeably creaky phonation and begins
relatively closer and more rounded than does his mother's; it also
finishes noticeably fronter and more open. The syllable coda has coordinate glottal closure with the final apical gesture whilst his mother's
version does not. The consonantal release of Kevin's production is also
noticeably fronter in resonance than that of his mother. Compare this
with the phonetics of Kevin's echo which is produced with a vocalic
portion and consonantal release which closely match those of his
mother's immediately preceding production.
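
The word-inclusion criterion for segmental correspondence described earlier in this section can be stated compactly: an echo in this subset reproduces every word of the model from its first echoed word through to the end of the turn. The following is a minimal sketch in Python, offered only as an illustration and not as part of the original analysis. The function names are hypothetical, and the comparison is made on orthographic words alone, so it says nothing about the phonetic matching discussed above.

```python
def tokens(utterance):
    """Split an utterance into lower-case words, dropping edge punctuation."""
    words = (w.strip("?.,!") for w in utterance.split())
    return [w.lower() for w in words if w]


def is_full_tail_echo(model, echo):
    """True if the echo repeats every word of the model from some word to its end."""
    m, e = tokens(model), tokens(echo)
    return 0 < len(e) <= len(m) and m[-len(e):] == e


model = "Whose turn is it"
for candidate in ["Turn is it", "turn", "turn it"]:
    print(candidate, "->", is_full_tail_echo(model, candidate))

# The one reported exception: the address term is dropped from the end.
print(is_full_tail_echo("What is it Kevin?", "What is it?"))
```

Under this definition Turn is it in fragment (11) qualifies, whereas the hypothetical alternatives 'turn' and 'turn it' do not, and the single exception noted above, in which the address term is dropped, is correctly flagged as departing from the pattern.
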
The second property of our subset is the marked tempo, rhythmic
and pitch similarity between the echo of the child and that portion of the
adult target that is being echoed. Figure 1 below shows the F0
contours for the relevant parts of fragment (11) (frequency is represented
in Hz on the vertical logarithmic axis, time in seconds is represented on
the horizontal axis):
[Figure 1 plot not reproduced: F0 in Hz (logarithmic axis) against time in seconds, for the mother's three versions of 'whose turn is it' followed by Kevin's 'turn is it'.]
Figure 1 Extracted F0 contours from fragment (11)
We are particularly interested here in the relationship of Kevin's echo
'turn is it' to his mother's third version. There is a close matching of
pitch and pitch contour shape (in terms of start and end point: mother's
'turn is it' starts at about 350Hz and falls to around 180Hz; Kevin's
begins at about 340Hz and falls to 220Hz). The durational and rhythmic
characteristics of Kevin's turn also model very closely those of his
mother's third version. His mother's third version is noticeably slower
than the preceding two. The first version has a duration of 835ms with
'turn is it' occupying 572ms. The second version has a total duration of
840ms with 'turn is it' occupying 586ms. The third version is 1.22
secs long with the 'turn is it' portion occupying 858ms. Kevin's echoed
version of 'turn is it' closely matches this with a duration of 845ms.
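The measurements just quoted can be put on a common footing. The following minimal sketch is an illustration only and is not the procedure by which the reported values were obtained. It assumes that pitch movements are usefully compared in semitones (in keeping with the logarithmic frequency axis of Figure 1) and that durations are compared as proportional differences. The function names and the choice of these two measures are assumptions made for the example; the input values are the approximate figures given in the text.

```python
import math


def semitones(f_start_hz: float, f_end_hz: float) -> float:
    """Signed interval from f_start_hz to f_end_hz, in semitones."""
    return 12.0 * math.log2(f_end_hz / f_start_hz)


def relative_difference(a: float, b: float) -> float:
    """Absolute difference between a and b as a fraction of b."""
    return abs(a - b) / b


# Approximate start and end frequencies quoted for fragment (11).
mother_fall = semitones(350.0, 180.0)  # mother's third 'turn is it'
kevin_fall = semitones(340.0, 220.0)   # Kevin's echo

# Duration (ms) of the 'turn is it' portion of each maternal version.
mother_versions_ms = {"first": 572.0, "second": 586.0, "third": 858.0}
kevin_ms = 845.0

# Which maternal version is nearest in duration to the echo?
closest = min(mother_versions_ms,
              key=lambda name: abs(mother_versions_ms[name] - kevin_ms))

print(f"mother's fall: {mother_fall:.1f} st; Kevin's fall: {kevin_fall:.1f} st")
print(f"nearest maternal version by duration: {closest}")
print("proportional duration difference: "
      f"{relative_difference(kevin_ms, mother_versions_ms[closest]):.3f}")
```

On these figures the third maternal version is the nearest durational match to the echo. The sketch makes no claim about how close two values must be before they are heard as matched.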
Frequency and durational similarities can also be observed in
Kevin's repeated version of 'boat' in fragment (13). Extracted F0
contours for the relevant part of this fragment are given in Figure 2
below:
[Figure 2 plot not reproduced: vertical axis marked at 100 and 1000, horizontal axis time in seconds; labelled utterances: 'boat', 'boat', 'yes or no'.]
Figure 2 Extracted F0 contours from fragment (13)
Here again there are striking similarities between the pitch
configurations of his mother's production of 'boat' and Kevin's version.
Both are stepped up rises with initial and final level portions. His
mother's production begins at approximately 380Hz and rises to around
420Hz. Kevin's version starts around 336Hz and terminates around
390Hz. They are also extremely closely matched in terms of their
durations: Kevin's lasts 170ms and his mother's lasts 174ms.
In the present data there is at least one instance, in fragment (12),
in which the child, on finding his initial echo not being commensurate
in these terms with that of the target, redoes the echo so as to produce a
version which more closely resembles it. Figure 3 below presents the
F0 details for this instance:
[Figure 3 plot not reproduced: vertical axis marked at 100 and 1000, horizontal axis time in seconds; labelled utterances: 'who did th-', 'who did that', 'Kevin', 'who did that'.]
Figure 3 Extracted F0 contours from fragment (12)
The child's first production of 'who did that' has relatively low
level pitch (some 200Hz lower than the starting frequency of his
mother's production) which falls slightly towards the end of his
utterance (to around 140Hz). It is a quiet, obscurely produced, truncated
form of his mother's version. Compare this with his second version
which is clearly audible and closely matches the contour and frequency
of his mother's version. Mother's version rises from around 330Hz to a
peak of 400Hz and falls to around 220Hz. Kevin's second version rises
from a starting frequency of around 330Hz to a peak of some 350Hz and
falls to about 140Hz. This second version is also more closely matched
in terms of duration than his first. His mother's first production lasts
some 420ms. Kevin's first version is some 160ms shorter than this
while his second version is 440ms.
It is important to recognise that this phonetic matching is not
uniformly found across all instances of repetition produced by Kevin.
There are a number of examples where lexically repeated material can be
produced with quite different pitch characteristics. The extracted
fundamental frequency contours from fragment (2) provide an illustration of this.
[Figure 4 plot not reproduced: vertical axis marked at 100 and 1000, horizontal axis time in seconds; labelled utterance: 'jam'.]
Figure 4 Extracted F0 contours from fragment (2)
Here the mother's and the child's productions are noticeably different.
The child's version of 'jam' exhibits a marked fall in frequency towards
the end while the mother's does not drop below its starting frequency.
The child's version reaches its frequency peak proportionately sooner
than the mother's version and shows proportionately less difference in
frequency between its starting point and peak. (Mother's version starts
around 330Hz, rises to 500Hz in some 160ms and falls to 390Hz in
123ms. Kevin's version begins at about 270Hz, rises to its peak of
around 320Hz in 57ms and then falls to its end at about 140Hz in
171ms. The amplitude contours of these utterances are different too. In
mother's the amplitude peak is skewed towards the middle and end of the
utterance. In Kevin's utterance the peak occurs early, closely aligned
with the pitch peak, and rapidly falls away thereafter.) The overall
duration of the two versions is not matched in the way it is for the
'unusual echoes'. Mother's version lasts 375ms while Kevin's lasts
240ms.
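The two proportional claims made about 'jam' can be checked directly against the figures given in parentheses. In the sketch below, offered as an illustration under stated assumptions rather than as the analysis actually carried out, the position of the peak is expressed as the rise time divided by the sum of the rise and fall times, and the size of the rise as the ratio of peak to starting frequency. The class and method names are hypothetical, and the rise and fall times are taken at face value even though they do not sum to the overall durations quoted (375ms and 240ms).

```python
from dataclasses import dataclass


@dataclass
class Contour:
    """Start, peak and end frequency (Hz) with rise and fall times (ms)."""
    start_hz: float
    peak_hz: float
    end_hz: float
    rise_ms: float
    fall_ms: float

    def peak_position(self) -> float:
        """Proportion of the rise-plus-fall span elapsed when the peak is reached."""
        return self.rise_ms / (self.rise_ms + self.fall_ms)

    def rise_ratio(self) -> float:
        """Peak frequency relative to starting frequency."""
        return self.peak_hz / self.start_hz

    def ends_below_start(self) -> bool:
        """True if the contour finishes lower than it began."""
        return self.end_hz < self.start_hz


# Approximate values quoted in the text for 'jam' in fragment (2).
mother = Contour(start_hz=330, peak_hz=500, end_hz=390, rise_ms=160, fall_ms=123)
kevin = Contour(start_hz=270, peak_hz=320, end_hz=140, rise_ms=57, fall_ms=171)

for name, contour in (("mother", mother), ("Kevin", kevin)):
    print(f"{name}: peak at {contour.peak_position():.2f} of the span, "
          f"peak/start ratio {contour.rise_ratio():.2f}, "
          f"ends below start: {contour.ends_below_start()}")
```

Under these assumptions Kevin's peak falls earlier in the span and his rise is proportionately smaller than his mother's, which is the pattern described above.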
The third feature referred to above concerns those cases where the
echo occurs immediately after the adult's target utterance. In this,
the normal case, the onset of the echo is routinely rhythmically more
rapid than would be expected from the tempo and pattern of rhythm
established in the model; a feature which also differentiates this type of
echo from several of those discussed earlier in the paper. Couper-Kuhlen
(1989, 1990) and Couper-Kuhlen and Auer (1988) provide an innovative
and persuasive discussion of such rhythmic organisation in talk. They
have shown that turns at talk can be 'contextualised' in terms of their
interactional functioning by virtue of their rhythmic constitution and
their relationship to the rhythmic patternings in surrounding talk. They
demonstrate that if rhythmic isochrony is carefully distinguished from
prosodic word stress it is possible to gain an understanding of the kinds
of interactional work which can be accomplished by the rhythmic
alignment and non-alignment of turns at talk in normal adult speech.
This work, based on a substantial amount of natural conversational
material, shows that while syllable stress is important for establishing
the 'beat of interactional speech rhythm' (1988:4) not all stressed
syllables in talk contribute to the perception of rhythmic isochrony. It
demonstrates that it is crucially the organisation of talk into
isochronous/anisochronous chains, rather than the simple stress patterns
of sequences of words, which serves to contextualise interactional
function. In discussing the rhythmic organisation of question-answer
sequences, for instance, Couper-Kuhlen and Auer (1988) observe that:
'fillers and vocalisations are not alone indicative of a
conversational 'hitch' or, as has been sometimes
claimed, of a 'dispreferred' second pair-part. Instead
whether or not they are integrated into a larger rhythmic
structure seems to affect their conversational function
significantly.' (10).
The following two fragments (11') and (13') provide instances of the
rhythmic non-integration of the 'unusual repetitions' produced by Kevin.
Fragment (11')
M: Whose /'turn is it
(1.5)
M: Whose /'turn is it
(1.5)
M: Whose /'turn is it
K: 'Turn //is it
( )
M: Whose /'turn is it
K: /'Kevin's turn
The symbol '/' is used to indicate where the rhythmic beat is located; the
symbol ' indicates prosodic syllable stress.
In Mother's first two turns it so happens that syllable stress and
rhythmic beat coincide. In her third turn the rhythmic beat falls in the
same place and further reinforces the regular rhythmic pattern established
by her first two turns. The stressed syllable 'turn' in Kevin's next
utterance, however, is not aligned with this established rhythmic pattern
but comes in early. The place where the expected beat would fall is
indicated by the symbol '//'. It can be seen that it coincides with the
unstressed syllable 'is'. This creates a noticeable anisochronous
relationship of Kevin's production with that of his mother's preceding
turn. The same phenomenon is evidenced in fragment (13'):
Fragment (13')
M: would you /'like to /'go for a /'ride in the /'boat
-> K: 'boat //
( )
M: /'yes or no
In this fragment the organisation of Mother's turn is such that the
rhythmic beats fall on 'like', 'go', 'ride' and 'boat'. Kevin's turn 'boat',
which redoes the final word of his mother's preceding utterance, is not
fitted to this rhythmic pattern but again comes in early so that the next
beat occurs after the word rather than coincident with its beginning.
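The notion of an onset that 'comes in early' relative to an established beat can be made concrete with a small sketch. This is an illustration under stated assumptions and is not the procedure of Couper-Kuhlen and Auer, nor the one used for the judgements reported here: the next beat is simply projected from the mean interval between earlier beats, and the tolerance is an arbitrary value chosen for the example. The function name and, in particular, all of the timings are hypothetical; no beat times are reported in the text.

```python
from statistics import mean


def classify_onset(beat_times_s, onset_s, tolerance=0.2):
    """Classify an onset relative to the beat projected from earlier beats.

    beat_times_s: times (s) of at least two successive rhythmic beats.
    onset_s: time (s) of the stressed syllable being tested.
    tolerance: allowed deviation as a fraction of the mean interval
               (an arbitrary illustrative value, not an empirical threshold).
    """
    intervals = [later - earlier
                 for earlier, later in zip(beat_times_s, beat_times_s[1:])]
    mean_interval = mean(intervals)
    expected_s = beat_times_s[-1] + mean_interval
    deviation = (onset_s - expected_s) / mean_interval
    if deviation < -tolerance:
        return "early"
    if deviation > tolerance:
        return "late"
    return "on beat"


# HYPOTHETICAL timings, invented for illustration only: four evenly spaced
# beats standing in for 'like', 'go', 'ride' and 'boat' in fragment (13').
mother_beats_s = [0.0, 0.5, 1.0, 1.5]

print(classify_onset(mother_beats_s, 1.75))  # an onset half an interval early
print(classify_onset(mother_beats_s, 2.0))   # an onset on the projected beat
```

With these invented values the first onset is classified as early and the second as on the beat, mirroring the relationship marked by '//' in the fragments above.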
When the three kinds of features we have just described combine they
give these echoes both a parasitic and autonomic feel. They, like most
of the echoes we have been discussing in this paper, are produced in
sequential positions in which the child is being required to produce a
next turn, but they appear to be occupying that turn simply by
repeating a portion of what the adult has said. When these three features
are present in the context of single word echoes then, even though the
word selected for repetition by the child could amount to an answer to
the question, they are routinely treated by the adult as empty and non-
meaningful. Nor can the analyst, in such cases, find any basis for
supposing that the child has any grasp of the question in hand.
Fragments (14) and (15) below illustrate this pattern:
Fragment (14) Kevin sitting on the settee at home, between his mother and father. He
has his one arm round his mother's neck; his other hand is holding M's hand throughout
the sequence below. His mother has asked him Who do you love?, and Kevin first replies
Mummy, then Daddy in response to Who else?. In response to a further Who else? he says
Kevin:
M: Kevin ye:(he):s? we know you love Kevin?
M: (.) Who else
(1.4)
M: What about Lucy
M: Love Lucy=
K: =Lucy
(0.6)
F: Is she asle[ep? (.) Lucy? ((to M))
M:            [What about Lucy ((to K))
M: (.) No she's reading ((to F))
F: Oh
M: What about Lucy ((to K))
M: D'you love Lucy? ((to K))
(0.8)
Fragment (15) Follows on shortly after fragment (5) above. Kevin and his mother sitting
on their settee at home discussing people who might be going to visit them: throughout M
is rubbing the back of K's neck with her hand:
M: And ma:ybe:? (.) Ca::rl a:s we:ll
K: (( ))
M: D'you want to see Ca:rl
K: Carl ( ai911
(.)
M: Mmmm?=d'you want to play with Carl
(0.7)
K: (( ))
M: Mm?
Although this child is capable of saying 'yes' he does so very
infrequently, and some have argued that autistics have special difficulty
in engaging in such affirmation (Fay 1988). So, in fragment (14), for
example, given this it would be possible for the word 'Lucy' to be an
answer to Love Lucy?. But presumably the presence of the three features
mentioned above in Kevin's Lucy leads his mother not to treat his
answer as representing his views on this matter: she reposes the
question to him by saying What about Lucy do you love Lucy?.
There are two further observations that we want to make at this
stage about these unusual echoes. The first is that they often do not
seem to be associated with questions which are difficult to understand,
or ones for which it is difficult to come up with an answer.
Notwithstanding experimental work which has shown that autistics are
more likely to use echoes after questions that are beyond their
understanding (e.g. Paccia and Curcio, 1982), there seems nevertheless, in
our data, extensive evidence that these unusual echoes are not contingent
on the question being ungraspable by the child. This evidence consists
of the fact that when the adult reposes the same question to the child
after the child's echo then the child often comes up with an answer that
is treated as a candidate answer by the adult. In fragment (11) Kevin
replies by saying Kevin's turn, and in fragment (12) Kevin's did it. If
the question were ungraspable by the child then we might expect to find
the child continuing to echo after the adult reposes the question.
Importantly, there is one instance of this occurring in our data, so this
is a tactic available as a communicative option to the child. But
although it is available it only occurs the once. In most cases the child
is able to construct an acceptable reply to the reposed question.
Our second, and final, observation in this section concerns the
sequential position in which these unusual echoes tend to occur. The
observation is that they appear to have a special affinity with the initial
stages of any particular line of questioning by the adult. Where they
occur they tend to occur as the first kind of vocal response that Kevin
makes. Logically it would be possible for them to occur in a variety of
sequential positions, as do various of those pure echoes discussed in
previous sections. For example, after the adult has asked a question and
the child has given an initial incorrect response then if the adult reposes
the question (e.g. 'No it's not an x, what is it?') it would then be
possible for the child to produce what we have called an unusual echo, a
repeat of the question or some part of it. In practice, however, unusual
echoes do not appear in such sequential positions. They are ways of
repeating which appear to have their use as a first way of dealing with a
question. They are, of course, not the only way of initially dealing with
a question. Much more common within these data is non-response on
the part of the child. But where they do occur these unusual echoes are
usually the first vocal form of response that the child makes to the
question.
Before moving on to draw together the various threads of our
discussion, with a view to characterising the work achieved through
unusual echoes, we first of all want to consider whether it is a
distinctive subtype not just in comparison with the earlier types of pure
echo that we have discussed but also in comparison with the uses that
normal children make of repetition.
7. Repetition in normal children
Within the age range of about 1;6 - 3;0 there is a good deal of repetition
within the speech of normal children. Several studies have now shown
that turns formatted as repetitions can perform a variety of interactional
roles (Casby, 1986; McTear, 1978). Some of these clearly parallel forms
of repetition that we have found in Kevin's data. For example, the use
by Kevin of kiss in fragment (7) and Yes in fragment (8) as ways of
answering a question follow patterns that are frequent among normal
children. The latter can also produce repetitions of what adults say in
turns which do not follow overt adult questions. They may choose, for
example, to imitate a word that has just been produced by the adult. For
example, Casby's (1986) analysis of the talk of one child revealed that
'imitations' made up between 38-49% of all the child's repetitive
utterances at MLU stages I-III (using Brown's (1973) criteria for
identifying such stages). From the examples of imitation that he
provides, like the one reproduced below, it is clear that the child may
use the provision of a label by the adult as an occasion for then
reproducing this label, either for a first time or with a view to
constructing an improved version on their own last try:
Fragment (16) From Casby (1986:136). Mother and child engaged in
book reading activity:
M: What's this?
C: [Mai]
M: Butterfly, right.
C: Butter-fly
This kind of imitative repetition is clearly analogous to forms of
repetition that we have found in Kevin's data, notably Jam in fragment
(2) and Watering can in fragment (3), and it also informs the more
inapposite uses like that of Cheek in fragment (9). Further parallel data
among normal children can be found in the more delicate analysis of the
language games involved in such situations which is reported in Tarplee
(1993). Casby notes (op cit:131) that those child utterances he classified
as imitative were often intonationally similar to the adult model. This
is to be expected in that the child's aim is to produce a version of a word
which is similar to that just produced by the adult. Likewise, within our
data on Kevin, we have found a tendency for such imitative repetition to
be intonationally similar to the target of the repetition (as in fragments
(2), (3) and (9)). All in all, therefore, it seems that many of Kevin's
pure echoes that we have discussed have their functional counterparts in
the language use of young normal children.
What we have described as unusual echoes are answers to questions
which do not appear to play a part in any recognisable language game.
So, a matter of interest is whether there are counterparts to these echoes
in studies of normal children. In order to examine this we will briefly
discuss two studies which have examined in some detail particular
normal children who have employed repetition as an answering device.
Steffensen (1978) describes the answering strategies of two children, one
of whom (Jackson), in the age range 1;8 - 2;2 and in the context of
yes/no questions, uses repetition rather than 'yes' as a technique of
affirmation even though he, like Kevin, is capable of using the negative
and affirmative particles. Although such repetitions are often used by
Jackson in what Steffensen refers to as semantically well formed ways,
ways that are appropriately fitted to the question and which display that
the child has some genuine grasp of it, in some cases (such as fragment
(17)) this is not so. Steffensen sees such answers as 'responding by
formula', as just imitations rather than genuine affirmations, especially
when viewed in the light of accompanying nonverbal behaviour:
Fragment (17) From Steffensen (1978:228). Adult and child
[Jackson, aged 2;0.7] talk about cutting meat:
A: Shall I cut your meat?
J: Meat
A: Shall I cut it?
Steffensen's discussion of this child strongly suggests that at a certain
stage of development some normal children may resort to using
repetition in ways that have some similarities to Kevin's use of unusual
echoes. But there are also important actual and possible differences
between Jackson and Kevin in this respect. According to Steffensen, a
feature of Jackson's repetitions is that they are intonationally different
from their models, and in the examples provided by Steffensen there are
no cases of the child repeating longer stretches of the question than just
a potential answer constituent. Furthermore, there is no discussion of
whether, as is the case in Kevin's data (see fragments (11) and (12)
above), such repetition answering strategies are also found in response
to 'Wh' questions.
A study by McTear (1978) of repetition in his own child between
the ages of 2;6-3;1 clearly shows a child who not only produces
repetitions of Wh questions but also ones which appear often to include
the Wh word itself. An example from McTear is given below:
Fragment (18) From McTear (1978:305): F denotes father, S denotes
his daughter who is aged somewhere between 2;6 - 3;1. Presumably,
they are talking about what they can see on a television:
F: What are they doing?
S: What they doing?
F: They're playing snooker
((a few minutes previously S had asked the
question and received this information))
For a variety of reasons, however, these child turns do not seem to us to
operate in ways analogous to Kevin's unusual echoes. McTear's
argument is that these repetitions are not general answering devices but
are specific to particular types of question, what he calls 'display
questions'. These are questions in which 'the speaker already knows the
answer and wants the hearer to show whether he knows it or not' (op
cit:302). For McTear the repetition of such questions is a device used by
the child to display that she is attending, but one which also
intentionally transfers the speaker role back to the questioner. The way
that adults are described as replying to these questions supports this
contention in that the adult can, after the child's repeat, supply the
answer (as in fragment (18)), or the adult can treat the child as
deliberately choosing not to answer by insisting on an answer. For
example, McTear cites the child's grandmother as responding to such a
repeat by saying Come on you tell me (op cit:305). Kevin's unusual
echoes are never treated in these ways by his co-participant, nor is there
ever any clear evidence that for Kevin himself these forms of repetition
are designed as speaker switching devices. Furthermore, Kevin's unusual
echoes are not specific to particular question types, nor are they, in the
main, full repetitions of the prior question. For these various reasons it
seems to us that this kind of repetition found in the speech of McTear's
daughter is serving a different interactional role than that performed by
Kevin's unusual echoes.
8. Discussion
In this article we have been principally concerned with the pure echoes
of one autistic boy. Within this relatively unambiguous set of
vocalisations we have distinguished three subsets: those which are used
in communicatively appropriate ways; those which, though inapposite,
represent systematic moves in some language game; and those we have
described as 'unusual', that do not amount to moves in any recognisable
and conventional language game. We have not quantified these various
subsets because their membership is not always clear-cut. For example,
our discussion of fragments (5) and (8) has suggested various grounds
for uncertainty concerning the kind of understanding that informs
Kevin's production of pure echoes in these sequences. Nevertheless,
working with what seem to us canonical cases we have tried to identify
ways in which these various types are both used by Kevin and responded
to by those who interact with him. In doing this we have been
especially concerned with the possibly distinctive status of what we
have called 'unusual' echoes.
Unusual echoes have a number of features which suggest that they
are simply constructed as repetitions of what the adult has said. These
features are their segmental and suprasegmental relationship to the
model, their unusual rhythmic timing and their functional opaqueness.
We have shown, for example, that these unusual echoes appear to be
more acoustically matched to their models than is the case for those
pure echoes which represent appropriate moves in language games, and
that at a segmental level they systematically, and selectively, preserve
particular portions of the model. By virtue of these features these
unusual echoes impressionistically sound like 'empty' repetition, and are
treated as such by the adult. There are, as we have seen in the case of
Steffensen's Jackson, occasional glimpses of somewhat similar
behaviour among normal children around the developmental age of
about 2;0. But in Kevin's data this type of echo is more intonationally
parasitic on the model, not necessarily confined to repeating particular
segments of the model and probably more widely used in response to
different types of question. As far as we can tell, therefore, unusual
echoes do not have counterparts in the speech of normal children.
In developing a characterisation of the role that unusual echoes play
in the repertoire of this autistic child it seems to us important to
consider them in the context of his more general pattern of interactional
skills and involvements. Crucially, vocalisations that are clearly
intended as communicative are solicited from Kevin: under 5% of these
communicative vocalisations amount to initiations on his part. His
world of spontaneous talk is largely made up of 'delayed echolalia',
utterances which are usually recognisable as being authored
(Goffman, 1979) by other people in other contexts, and ones for which
he displays an ongoing, obsessive attachment. It is this domain of
language use in which Kevin seems most fluent and at home. And
insofar as he rarely displays any continuing and sustained (obsessive)
involvement with other people in any particular line of interaction, as
evidenced by his gaze, manual behaviour and general bodily orientation,
then it seems to be the topics of his delayed echolalia that stand at the
forefront of his immediate vocal, and perhaps mental, life.
In these circumstances attempts to elicit responses, communicative
speech, from Kevin face the twin tasks of both bringing him out of that
separate world and having him understand the import of the adult
initiation in question. That the first of these is a problem for those who
interact with Kevin is suggested by the frequency with which he appears
not to respond to adult initiations, not just in sequences in which
echoes occur, but also in those where he eventually makes what is taken
to be an appropriate communicative response. The continuing relevance
of these considerations routinely occasions various unusual, though for
this kind of interaction routine, forms of behaviour on the part of the
child's interactional partner - things like emphatic voice, a high
frequency in the use of his name as a summons, and physically taking
hold of his body so as to encourage orientation to the partner. In the
literature more prominence has been given to the second task mentioned
above for the adult who attempts to solicit speech from the autistic
child, the problem of having the child grasp the linguistic content. Here
various research has drawn attention especially to pragmatic and
conceptual limitations that make it difficult for the child to understand
the nature of what is said to him (Fay, 1988). While this may be so we
have argued that this is of limited significance for explaining the
occurrence of unusual echoes. The main reason for this is that in many
of these sequences, such as fragments (11) and (13), the child seems
capable of eventually coming up with an appropriate response to the
adult question. Furthermore, it may be important to bear in mind that
when asking the child such questions, those who know the child well,
such as his mother or a teacher, are unlikely to ask him questions that
they know or suspect he is not able to answer, let alone repeat such
questions after he produces an unusual echo in response. The key
question then, as we see it, is why the child produces such an echo
when he has the cognitive equipment to come up with a response?
The answer as to why he chooses to echo seems fairly
straightforward. We have seen that the child possesses quite
sophisticated skills associated with repetition and that constructing a
reply out of material contained in the prior turn is frequently a
successful discourse strategy for him in his dealings with other people.
And in various ways the design of adult turns, especially in repair
sequences after non-response by Kevin, relies on and fosters repetition
skills. These points seem to be true not just for the most frequent
sequences involving the labelling of things but also in other sequence
types such as the games he plays at home. Repetition is thus the
obvious device for the child to pick, his most skilled device, in
situations which are not conducive to him being able to deal
appropriately with an adult question, the situations that seem
characteristic of 'unusual' repetition. Much more difficult to specify are
the properties of this kind of situation. The best clue here is the fact
that 'unusual' repetition is a first vocal response to any particular
question. It occurs in that temporal phase when the child's attention is
being drawn into the world of question and answer. By frequently not
answering at all the child evades entry into this world; through 'unusual'
echoes the child accords significance to what the adult has said simply
by repeating it, by, in effect, saying that this is all he is willing or able
to do.
REFERENCES
Atkinson, J.M. and Heritage, J. (Eds) (1984) Structures of Social Action:
Studies in Conversation Analysis. Cambridge: Cambridge University
Press.
Baron-Cohen, S. (1989) Perceptual role taking and protodeclarative
pointing in autism. British Journal of Developmental Psychology 7.
113-27.
Brown, R. (1973) A First Language: The Early Stages. Cambridge: Harvard
University Press.
Casby, M.W. (1986) A pragmatic perspective of repetition in child
language. Journal of Psycholinguistic Research 15. 127-40.
Couper-Kuhlen, E. (1989) Speech rhythm at turn transitions: its functioning
in everyday conversation. Part I. KontRi, Arbeitspapiere Nr 5,
University of Konstanz.
Couper-Kuhlen, E. (1990) Speech rhythm at turn transitions: its functioning
in everyday conversation. Part II. KontRi, Arbeitspapiere Nr 8,
University of Konstanz.
Couper-Kuhlen, E. and Auer, P. (1988) On the contextualizing function of
speech rhythm in conversation: Question-answer sequences. KontRi,
Arbeitspapiere Nr 1, University of Konstanz.
Fay, W.H. (1988) Infantile autism. In D. Bishop and K. Mogford (Eds)
Language Development in Exceptional Circumstances.
Edinburgh: Churchill Livingstone.
Frith, U. (1989) Autism: Explaining the Enigma. London: Blackwell.
Goffman, E. (1979) Footing. Semiotica 25. 1-20.
Greenfield, P.M. and Savage-Rumbaugh, E.S. (1993) Comparing
communicative competence in child and chimp: the pragmatics of
repetition. Journal of Child Language 20. 1-26.
Levinson, S. (1983) Pragmatics. Cambridge: CUP.
McTear, M.F. (1978) Repetition in child language: imitation or creation? In
R.N. Campbell and P.T. Smith (Eds) Recent Advances in the
Psychology of Language. New York: Plenum.
Ninio, A. and Bruner, J. (1978) The achievement and antecedents of
labelling. Journal of Child Language, 5. 1-15.
Paccia, J.M. and Curcio, F. (1982) Language processing and forms of
immediate echolalia in autistic children. Journal of Speech and
Hearing Research, 25. 42-47.
Prizant, B.M. and Duchan, J.F. (1981) The functions of immediate echolalia
in autistic children. Journal of Speech and Hearing Disorders, 46.
241-9.
Roberts, J.M.A. (1989) Echolalia and comprehension in autistic children.
Journal of Autism and Developmental Disorders, 19. 271-81.
Rydell, P.J. and Mirenda, P. (1991) The effects of two levels of linguistic
constraint on echolalia and generative language production in
children with autism. Journal of Autism and Developmental
Disorders, 21. 131-57.
Schopler, E., Reichler, R.J., Devellis, R.F. and Daly, K. (1980) Toward
objective classification of childhood autism: Childhood Autism
Rating Scale (CARS). Journal of Autism and Developmental
Disorders, 10. 91-103.
Schopler, E., Reichler, R.J. and Renner, B.R. (1986) The childhood autism
rating scale (CARS). New York: Irvington Publishers, Inc.
Sigman, M., Mundy, P., Sherman, T. and Ungerer, J. (1986) Social
interactions of autistic, mentally retarded and normal children and
their caregivers. Journal of Child Psychology and Psychiatry, 27.
647-56.
Snow, C. (1986) Conversations with children. In P. Fletcher and M. Garman
(Eds) Language Acquisition. Cambridge: Cambridge University
Press.
Steffensen, M.S. (1978) Satisfying inquisitive adults: some simple methods
of answering yes/no questions. Journal of Child Language, 5. 221-36.
Tarplee, C. (1993) Working on talk: the collaborative shaping of linguistic
skills within child-adult interaction. Unpublished DPhil Thesis,
University of York.
Wootton, A.J. (1989) Remarks on the methodology of conversation
analysis. In D. Roger and P. Bull (Eds) Conversation: an
interdisciplinary approach. Clevedon: Multilingual Matters.
THE NATURE OF RESONANCE IN ENGLISH: AN
INVESTIGATION INTO LATERAL ARTICULATIONS*
David E Newton
University of Edinburgh
1. Introduction
This paper presents an instrumental study into the nature of clear and
dark sounds in English. 'Resonance' is a term which I shall be using to
cover the range of quality distinctions covered by the terms 'clear' and
'dark' (and intermediate varieties).1 The term 'resonance' has been used
by a number of linguists in the past (see, for example, Abercrombie
1936, Allen 1953, and Jones 1956), as well as more recently (Kelly and
Local 1986). However, its use as a phonetic label is far from universal.
2.1. The Nature of Resonance
The instrumental study detailed here will primarily look at those
resonance features which are associated with the lateral consonant /l/ in
English. However, before the study and its results are described, there
are three main concepts which are to be assumed in my treatment of
resonance features.

* Most of the work detailed here was carried out whilst at the University of
York. The author can currently be contacted at Department of Linguistics,
University of Edinburgh, Adam Ferguson Building, 40 George Square,
Edinburgh EH8 9LL, UK, email
[email protected]. The author wishes to
thank John Kelly, Geoff Lindsey, John Local and my informants for all
their advice and help. I'm still to blame though.
1 Clear and dark are not the only terms used to refer to these particular kinds
of articulatory and acoustic events. Corresponding terms in the literature
include the following: 'front', 'palatalised', 'having front vowel resonance'.
Similarly, terms referring to darkness include: 'back', 'velarised', 'retracted',
'having back vowel resonance', 'pharyngealized'. In the main, these terms
tend to refer to the same kinds of articulatory gestures. Some of these labels
may be seen as more appropriate than others, although this is not an issue
which is to be confronted in this paper.
York Papers in Linguistics 17 (1996) 167-190
© David Newton

Firstly, at least in their phonetic forms, dark and clear sounds are
not simply opposed in a binary way. They are merely convenient labels
for the opposite ends of a continuous range of distinguishable phonetic
qualities. Of course, it may be the case that, when carrying out
subsequent phonological analysis, one might wish to talk about a
dark/clear opposition, but it is also important to recognise the range of
phonetic variability which can be recorded.
Secondly, it often appears to be assumed in much of the literature
that clear and dark are terms which only apply to the lateral consonant
/l/. This has become an especially widespread assumption in that part of
the literature which concentrates on the phonetics and phonology of
English (see Giegerich 1992). However, work on other languages (for
example, Westerman and Ward 1933), and more detailed works in
general phonetics (see Jones 1956) have recognised resonance
characteristics as applicable to any speech sounds.
The third point concerns the notion that the darkness or clearness of a
token applies only to that token in a given utterance. Upon
closer examination, it can be seen that this is not the case. There have
been studies suggesting that different phonetic items may have different
effects on the resonance of their environment, depending on the nature
of what is sometimes called their acme function. For instance, one
study by Kelly (Kelly 1989; see also Kelly and Local 1986) examined
the following two sentences, as spoken in one variety of English (from
north Manchester/Salford):
(1) Ballet came to my mind.
(2) Barry came to my mind.
Electropalatography showed that the velar closure at the beginning of
the word came was fronter following the word ballet than it was after
the word Barry. Kelly proposed that, for this variety of English, /r/, in
the form of an approximant, has acme function, affecting nearby parts
of the utterance.
RESONANCE FEATURES IN ENGLISH LATERALS
This interaction of resonance effects has also been noted by Klatt,
who stated, with regard to speech synthesis, that
'the acoustic properties of /ɪ/ in a word like will cannot be
predicted from diphones obtained from with and hill because
the /w/ and /l/ velarise the /ɪ/ to a greater extent.'
(Klatt, quoted in Kelly and Local 1986: 304)
There is also a fourth aspect of resonance features which will be
discussed later on. This is the suggestion that dark tokens tend to occur
finally, whilst clear tokens tend to occur initially. This is one of the
more important, if problematical, aspects of the theory of resonance
that is being investigated here, and will be discussed towards the end of
the paper.
2.2 The Perception of Resonance
The major finding of Newton (1993) was related to how we perceive
different types of resonance. That study, which was suggested by casual
observations, used synthesised intervocalic laterals in English words and
pseudowords produced using the YorkTalk speech synthesiser (see
Ogden 1992). It was found that phonetically-trained subjects tended to
perceive longer lateral tokens as having a darker resonance simply as a
result of their duration, and regardless of their actual darkness or
clearness. Similarly, shorter laterals were consistently judged as having
relatively clear resonance, even though no differences other than
duration were present.
The results obtained from this experiment raised the question of
whether, for naturally-produced English laterals, darker varieties were,
indeed, longer in duration. The present paper reports an instrumental
study into this question, with special reference to initial and final
positions.
3. Instrumental Study
This study used speech elicited from a number of informants, each
being a speaker of a different variety of English.
The perception experiment described above indicated that duration is an
important cue in the perception of different degrees of resonance in
laterals. It was hypothesised that this is because the actual duration of
laterals in natural speech does indeed correlate with the resonance of the
sound. Specifically, one would expect tokens of /l/ which are marked as
relatively dark in a given variety to be of a longer duration than those
which are treated as clear.
3.1 Informants Used
All of the informants were first-year undergraduate students in the
Department of Language and Linguistic Science at the University of
York, with the exception of Speaker D, who is a member of staff there.
Four male speakers were used, each being a native speaker of a
different variety of English. Their details (summarised below) were
obtained through interview with each of the informants. They were also
given a brief questionnaire about their linguistic background to ensure
that these details were as accurate as possible:
Speaker A: 19 year-old male from Ashby-de-la-Zouch, Leicestershire,
but has lived in a variety of other places. Not an RP speaker, but his
idiolect is a fairly standard variety, somewhat influenced by northern
English.
Speaker B: 19 year-old male from Bolton, Greater Manchester. States
that he has a 'northwest Lancashire' accent, which is 'discernibly
different from the more rhotic Lancashire accents (north and west of
Bolton), and the Mancunian type accent which is east and south of
Bolton', and is said by him to be a typical Bolton accent.
Speaker C: 19 year-old male from North Antrim, Northern Ireland.
Judges that he has a North Antrim accent, but that his variety of it is
not completely typical, in that his speech is 'a little more refined than
where I come from'.
Speaker D: 48 year-old male originally from South-West London.
Has an RP-like accent, and judges his accent to be 'RP-ish. Home
Counties middle middle class'. Has lived in several other areas, but
judges his accent as typical of his original background.
The use of all male speakers in this small-scale study was to make
cross-subject comparisons less difficult during the instrumental study.
Due to the configuration of the hardware and software, computer
analysis of speech wave spectrograms is often said to be difficult for
female speech, and so this was not attempted here. It should be noted,
however, that in the perception experiment reported in Newton (1993)
the subjects were split roughly equally between female and male.
Hypotheses were first formed about the resonance patterns each speaker
would have, given his idiolect background, and these hypotheses were
evaluated as part of the instrumental study. The following resonance
patterns were expected for their respective articulations of /l/:
             Initial /l/    Final /l/
Speaker A    clear          dark
Speaker B    dark           dark
Speaker C    clear          clear
Speaker D    clear          dark
Speakers A and D have the kind of resonance patterns that are generally
reported in the literature on the phonetics and phonology of (RP)
English. Speaker B has what shall be called a dark everywhere variety,
whilst Speaker C has a clear everywhere variety.
If it is to be assumed, following Newton (1993) and Ogden (1992),
that for all speakers word-medial varieties of /l/ are of an intermediate
variety with regard to their resonance, then we might expect the mean
darkness (and mean duration, if the hypothesis that darker tokens of /l/
are durationally longer is true) to be classifiable into the following order
(in ascending order, from clearer and shorter to darker and longer):
Speaker C > Speakers A and D > Speaker B
For the differences between Speakers A and D, it was expected that this
should be in the order of
Speaker D > Speaker A
which is possibly due to the latter's general Northern English
influenced speech. These claims will be investigated below.
Some further recorded materials were also used in this study. These
included some tape recordings of speakers of different varieties of
English producing various utterances involving /l/ and /r/ in different
environments and were recorded by Kelly and Local as part of their
research work on resonance (Kelly and Local 1986). These were not
used here as primary material for the instrumental study, but
impressionistic observations made from them were noted for purposes
of comparing results with this present study.
3.2 Utterances Elicited
The informants were asked to read out a total of 27 utterances, each of
them in the form of a short phrase or sentence. The utterances were as
follows:
1   say silly again
2   say sillow again
3   say solly again
4   say sollow again
5   it's the whale edition
6   the whale and the shark
7   say boy again
8   say boil again
9   say boiling again
10  say Boyling again
11  say the boy Ling again
12  say May again
13  say mail again
14  say mailing again
15  say Mayling again
16  say May Ling again
17  Mr B. Likkóvsky's from Madison
18  Mr Beel Hikkóvsky's from Madison
19  Mr Beau Lukkóvsky's from Madison
20  Mr Bole Hukkóvsky's from Madison
21  Mr Beelik wants actors
22  Beel, equate the actors
23  the beelic men are actors
24  I gave Beel equated actors
25  the beeling men are actors
26  the beel equipment's amazing
27  Beel equates the actors
Utterances 1-4 are for the purpose of obtaining articulations of the same
stimuli that were used in the previously mentioned perception
experiment.
Utterances 5 and 6 are also examined by Halle and Mohanan
(1985). These were elicited here to examine how the darkness or
clearness of the articulations varies with relation to morphological
boundaries.
The two similar groups of Utterances 7-11 and 12-16 were devised
for the purpose of seeing how darkness varies with syntactic and
morphological differences. The words mail and boil should, at least for
Speakers A and D, be relatively dark, as should the words mailing and
boiling, since the /l/ portion is still morpheme-final. However, for the
words Mayling and Boyling, one might expect a clearer articulation,
since the /l/ in each case can be argued to be ambisyllabic, that is to
say, belonging exclusively neither to the first syllable nor to the
second, with no morpheme boundary. (For argumentation on this
subject, see Local 1995.) These words should be in contrast to May
Ling and boy Ling, in which it would be expected that there would be a
still clearer articulation. (Utterances 7 and 12 were used for purposes of
comparison only, since they contain no lateral articulations.)
The remaining, somewhat unusual, utterances were all used in
Sproat and Fujimura (1993). For Utterances 17-20, all the contexts
were trochaic in nature (i.e., a stressed syllable followed by an
unstressed one), the first two being in an /i - ɪ/ environment, whilst the
second two were in an /o - a/ environment. The /l/ in Utterances 17 and
19 was made syllable-initial by the nature of the words involved,
whilst those in Utterances 18 and 20 were necessarily syllable-final
since they were followed by an /h/. This, as Sproat and Fujimura say,
'cannot be part of an initial consonant cluster in English and
there is therefore no chance of resyllabification.'
(ibid)
They also mention that, since /h/ can be considered a voiceless vowel
(see Catford 1977), the choice of this sound means that there is less
likelihood of interference with the lingual articulation, though they note
that the laryngeal gesture for /h/ may have some side-effects.
Since the remaining utterances (21-27) were primarily concerned
with drawing distinctions related to different types of morphosyntactic
boundaries, these were not examined in great detail. The previous
utterances (1-20) were found to provide sufficient data to be able to draw
some satisfactory conclusions. However, Utterances 21-27 were examined
for purposes of overview and comparison, and I shall therefore also
describe them here.
Utterance 21 is similar to Utterances 10 and 15, in that the /l/ is
intervocalic with no boundary, which, using the theory preferred here, is
to be interpreted as ambisyllabic. Utterance 22 places the /l/ before an
intonation boundary, as defined by Beckman and Pierrehumbert (1986).
Utterance 23 places the /l/ before a '+' boundary, which, in Lexical
Phonology (see Mohanan 1986), is a Stratum I boundary, whilst
Utterance 24 places the /l/ before a phrase break within a VP.
The boundary before which the /l/ occurs in Utterance 25 is what
Sproat and Fujimura call a '#' boundary (Lexical Phonology's Stratum
II boundary), whilst the boundary in Utterance 26 is between the two
phonological words in a compound. Finally, Utterance 27 is defined by
Sproat and Fujimura as placing the /l/ before a VP phrase break.
3.3 Method
Each of the four informants was asked to read out the list of utterances
in the same order. This order was chosen in a semi-random manner
before the recording, so that related sentences did not appear next to each
other.
Informants were given ten minutes to look through the utterances,
which were written on individual cards, in order for them to be familiar
with what they were going to have to read out. This was especially
important, since many of the utterances are of an unusual nature, and it
was important to minimise any possible pronunciation errors (though
this was not completely successful; see below). The informants were
each told to read the cards in a natural, but careful, style. That is to say,
they understood that they were to be read as individual sentences, as this
was a reasonably formal scenario, but that they should not change their
accent in doing this. The recordings were later judged by members of
the Department who know the informants, and these instructions were
deemed to have been successful.
The recording was carried out in a sound-damped recording studio
environment. The recordings were sampled into a Macintosh II
computer running MacSpeech Lab II version 1.7 speech analysis
software. Some of the work was later carried out by transferring the
files to Signalyze (version 1.40) format, running on both a Macintosh
Quadra 950 and a Macintosh LCII, though most of the analysis work
was done on the former system.
Much of the analysis carried out took the form of measuring
durations and reading wide-band spectrograms, though non-instrumental
techniques were also used.
4. Results
It was stated earlier that attempts to reduce misarticulations were
successful, though not entirely so. Some of these did not seem to have
any effect on the portion of the utterance under study. Speaker C
sometimes mispronounced the word Madison as /meɪdɪsən/. In
addition, Speakers A and C both pronounced Beel, equate the actors
with less of an intonation boundary than had been intended. Again,
since there was no reliance on the detail of this particular utterance, this
did not cause any major problems in analysis. Of perhaps slightly more
importance, Speaker C also pronounced Beelik as /ballik/, rather than
Marking the start and end point for the acoustic realisation of a
segment /l/ is not a straightforward task, in the sense that there are no
real start and end points for the sound. Hence, two sets of measurements
were made for each of the articulations. They were:
the minimum extent where it can be said the articulation occurs,
the maximum extent where it can be said the articulation occurs.
An example follows. On the opposite page, the display is of a wide-band
spectrogram of the word Boyling, as edited out of the phrase say
Boyling again, spoken by Speaker A. The two parallel sets of vertical
lines show the maximum and minimum points for my measurement of
the /l/ portion. These were chosen using both visual and auditory
methods. This gives two different values for the duration of the /l/,
depending on which criteria one wishes to use to measure it. It is
therefore possible to have two discrete sets of measurements. If these
both lead to the same conclusions, then there is more motivation for
treating these results as accurate. In addition, these results were averaged
out, to create a value for the mean length of /l/.
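The two-measurement procedure can be sketched as follows. This is only a minimal illustration in Python, not the procedure as carried out in the analysis software used for the study; the function name and the boundary times are hypothetical, chosen for the example, and are not measurements from the recordings.

```python
def lateral_durations(min_start, min_end, max_start, max_end):
    """Return (minimum, maximum, mean) duration in ms for one /l/ token.

    All four arguments are boundary times in ms. The inner pair of
    boundaries delimits the minimum extent of the articulation, the
    outer pair delimits the maximum extent.
    """
    if not (max_start <= min_start <= min_end <= max_end):
        raise ValueError("minimum extent must lie inside maximum extent")
    minimum = min_end - min_start
    maximum = max_end - max_start
    mean = (minimum + maximum) / 2
    return minimum, maximum, mean


# Hypothetical boundary times (ms), for illustration only.
print(lateral_durations(210, 250, 200, 270))
```

Keeping the minimum and maximum values alongside their mean makes it possible to check whether a conclusion holds under both measuring criteria, as well as under the averaged value.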
4.1 Tempo
A checking experiment was also carried out, following Kelly et al
(1966), in order to make sure that the results were comparable across
speakers. This was in relation to the tempo of the utterance. If it can be
shown that the speakers' tempi are comparable (or even if they go in an
opposite direction to that example given above), then we can more
safely talk about the significance of any durational results that are
found.
Firstly, the whole utterance was measured for each of the 27
utterances as spoken by each of the four speakers, and the total duration
was noted. Secondly, a selected foot from each utterance was measured.
There were, perhaps not surprisingly, a few utterances where the
tempo differences between speakers were quite large, but it was found
that these differences had little impact on the overall results. The mean
totals of the measurements were as follows:
                                 A      B      C      D
Mean length of utterance (ms)    1533   1389   1531   1440
Mean length of portion (ms)       533    485    515    540
[Figure: wide-band spectrogram of Boyling, spoken by Speaker A, with the minimum and maximum extents of the /l/ marked]
These tempo measurements were not found to be significantly different
across speakers.
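The spread of the tempo measurements can be made explicit with a short calculation. The sketch below, in Python, takes the mean durations from the tempo table above and expresses each speaker's value as a ratio of the four-speaker mean; the function names are hypothetical, and the sketch does not reproduce whatever significance test was applied in the study.

```python
# Mean durations (ms) as reported in the tempo table above.
utterance_ms = {"A": 1533, "B": 1389, "C": 1531, "D": 1440}
portion_ms = {"A": 533, "B": 485, "C": 515, "D": 540}


def relative_to_mean(values):
    """Express each speaker's mean duration as a ratio of the group mean."""
    group_mean = sum(values.values()) / len(values)
    return {spk: ms / group_mean for spk, ms in values.items()}


def fastest(values):
    """The speaker with the shortest mean duration has the fastest tempo."""
    return min(values, key=values.get)


for label, table in (("utterance", utterance_ms), ("portion", portion_ms)):
    ratios = relative_to_mean(table)
    print(label, fastest(table), {s: round(r, 3) for s, r in ratios.items()})
```

On these figures every speaker lies within ten per cent of the group mean on both measures.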
It is interesting to note that, in both cases, the fastest mean tempo
was from those utterances produced by Speaker B. This is the speaker
who, it was hypothesised, would have longer /l/ tokens, because he had
darker /l/ tokens. The fact that his speech rate was the fastest amongst
the informants might suggest that, if resonance and the duration of its
features were not a factor, then he would actually have shorter /l/ tokens
than the other speakers. Hence, it is possible to say that, if the
hypothesis that he had longer /l/ tokens were upheld, then this would be
all the more noteworthy.
4.2 Evaluation of Resonance Patterns
The first task was to find out whether the predicted and actual resonance
patterns matched. This was done partly through examination of
spectrograms, but also from listening and detailed impressionistic
phonetic transcription.
It was found that the resonance distributions were largely as
expected. Speakers A and D had the RP-like distribution of clear initial
/l/ tokens and dark final /l/ tokens. Speaker B had dark tokens in all
positions (in fact, his clearest tokens were still somewhat darker than
the darkest ones produced by Speakers A or D), whilst Speaker C had
very clear tokens in all positions, though some of the tokens were
slightly unusual, in that they did not appear to be typical of any kind of
/l/ that has been discussed in this study. Some of his /l/ tokens,
particularly the intervocalic ones, were very vocalic in nature, and
difficult to measure. In addition, some other intervocalic tokens which
he produced were tap-like in nature.
For those speakers who had a clear everywhere or a dark everywhere
distribution, their final tokens of /l/ were, relatively speaking, still
darker than the initial ones. Therefore, I would suggest that the RP-type
classification of /l/ as 'clear initial, dark final and intermediate medial'
holds at least for all the speakers under examination here, but in a
relative sense.
4.3.1 Durations: Intra-Speaker
It was expected that the degree of resonance present in the /l/ part of the
articulation would be classifiable in the following order, from darkest to
clearest:
8   boil
9   boiling
10  Boyling
11  boy Ling
and
13  mail
14  mailing
15  Mayling
16  May Ling
This was broadly found to be the case. For the Utterances 8-11, this
pattern was found decisively for Speakers A and C, whilst, for Speaker
B, the pattern was the same except that Utterances 8 and 9 were difficult
to distinguish in terms of their resonance. There was an equally good
result for Speaker D, with the exception that his articulations of
Utterances 9 and 10 were not easily distinguishable from each other.
Similar results were found for Utterances 13-16. All speakers had
the expected resonance patterns, with the exception of Speaker A's
production of Mayling, which seemed to contain a darker /l/ than his
production of mailing. There was a possible problem with the 'expected
clear' articulations of May Ling. Some of these, on spectrographic
study, looked as if they were in fact darker than some of the
articulations of Mayling, even though the reverse was expected. Note
that the syllable initial position of the /l/ here is in a position which
encouraged primary stress location, whereas, in all the other
articulations, the second syllable is an unstressed one.
This problem was avoided in Utterances 17-20, in which the
expected clear articulations B. Likkovsky and Beau Lukkovsky are
contrasted with the expected dark articulations Beel Hikkovsky and Bole
Hukkovsky, whilst the pattern of stressed and unstressed syllables is
not disrupted, as may have been the case for the articulations boy Ling
and May Ling. In all cases, the expected clear articulations were found
to be obviously and substantially clearer than the expected dark ones.
This can be seen in the following two spectrograms (over), which are
both from Speaker A.
[Figure: two spectrograms from Speaker A: Beel Hi (from Beel Hikkovsky) and B. Li (from B. Likkovsky)]
The main visual difference between these two spectrograms is the
difference in the second formant. The darker variety, on the right, has
F2 falling to a much greater extent than occurs in the clearer
articulation. Also, the third formant follows a similar pattern to the
second in the clearer articulation, whilst, in the darker variety of lateral
shown here, it moves upwards, away from the second formant.
Differences in amplitude of F1 are also visible.
It was mentioned earlier that some of the informants' articulations
of various /l/s were not easily recognisable. This was especially the
case for Speaker C. Some of his intervocalic varieties were very vocalic
in nature, making them quite difficult to segment satisfactorily. His
initial varieties were also sometimes quite tap-like in nature. Speaker D
produced some intervocalic articulations of /l/ which were quite fricative
in nature. However, this did not cause any particular measuring
difficulties.
Having ascertained that the resonance distribution was as expected,
it was possible to investigate whether or not the durations of these /l/s
were in a predictable distribution with the resonance.
By averaging the durations for all speakers, and including both the
'minimum length' measurement and the 'maximum length'
measurement, it was found that, in the main, the results were as
expected. That is to say, those articulations of /l/ which were darker in
resonance also had a longer duration.
                  Mean duration of /l/ (msec)
boil              70.5
boiling           54.25
Boyling           52.75
boy Ling          62.5

mail              79.5
mailing           50
Mayling           46.625
May Ling          75.125

Beel Hikkóvsky    77.5
B. Likkóvsky      60.875

Bole Hukkóvsky    85.25
Beau Lukkóvsky    73.375
In the case of the italicised articulations (boy Ling and May Ling), the
reverse durational effect has occurred. However, it was found that this
unexpected effect was due to the difference in stress patterning (see
above), and these results were discarded. It was then possible to compare
the top two sets of measurements directly with the bottom two pairs of
utterances, which avoid this problem. Doing this, we find that the results
are as expected, with the darker varieties appreciably longer than the
clear varieties.
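For the two pairs of utterances which avoid the stress problem, the size of the durational difference can be worked out directly. The following Python sketch uses the mean values given in the table above, on the assumption that each value belongs to the utterance label it is listed with; the function name is hypothetical.

```python
# Mean /l/ durations (ms), averaged over all speakers and over both
# measurement methods, as listed in the table above.
# Each entry is (expected dark token, expected clear token).
pairs = {
    "Beel Hikkovsky vs B. Likkovsky": (77.5, 60.875),
    "Bole Hukkovsky vs Beau Lukkovsky": (85.25, 73.375),
}


def dark_minus_clear(dark_ms, clear_ms):
    """Difference in ms between the dark and the clear member of a pair."""
    return dark_ms - clear_ms


for name, (dark, clear) in pairs.items():
    diff = dark_minus_clear(dark, clear)
    print(f"{name}: dark is {diff:.3f} ms longer ({diff / clear:.1%})")
```

In both pairs the expected dark token is the longer one, by more than ten milliseconds.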
If the results are considered for each individual speaker, or for each
of the two measuring methods, the results are not quite so consistently
in favour of the hypothesis. However, no one informant's results went
consistently against the hypothesis.
4.3.2 Durations: Inter-Speaker
The next piece of analysis to be carried out was to find out whether
those speakers with a generally darker variety generally have longer /l/s,
and whether the reverse is the case for those speakers who have a
clearer variety.
The first results which were obtained were derived from averaging
out all of the measured utterances across each speaker, regardless of in
what position the lateral occurred. They were, however, not as
hypothesised:
             Mean duration of /l/ (msec)
Speaker A    60.556
Speaker B    70.860
Speaker C    61.889
Speaker D    66.368
Where we would have expected Speaker C to have the shortest durations
and Speaker B to have the longest, with Speakers A and D somewhere
in the middle, we find that Speaker C has an intermediate value. This
aside, the other speakers' results are as expected, with Speaker B having
an appreciably longer mean duration of /l/.
On finding this unexpected result, referral was made to notes which
were made during the instrumental study. For the B. Likkovsky set of
four utterances, the articulations of /l/ produced by Speaker C were very
difficult to segment (see above). These are the ones which, it had been
noted earlier, seemed very tap-like (or at least, certainly non-lateral)
when under spectrographic and impressionistic study. As a result of
this, it was decided to measure these averages again, but this time
leaving out these problematical four utterances. The results which were
obtained this time were as follows:
             Mean duration of /l/ (msec)
Speaker A    62.429
Speaker B    70.106
Speaker C    52.210
Speaker D    64.259
It can be seen that the means for Speakers A, B and D remained almost
the same, but that for Speaker C decreased by fifteen per cent. This
resulted in the distribution being as originally anticipated, with the
three groups of speakers (those with a generally clear pattern, those
with a generally dark pattern, and those with a mixed pattern which
averages out as central) each being separated by a substantial amount,
around ten milliseconds in each case.
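The change in each speaker's mean can be checked with a short calculation. The Python sketch below takes the two sets of means reported above (all measured utterances, and the same set without the four B. Likkovsky utterances) and computes the percentage change for each speaker; the variable and function names are hypothetical.

```python
# Mean duration of /l/ (ms) per speaker, from the two tables above.
all_utterances = {"A": 60.556, "B": 70.860, "C": 61.889, "D": 66.368}
without_set = {"A": 62.429, "B": 70.106, "C": 52.210, "D": 64.259}


def percent_change(before, after):
    """Signed percentage change from `before` to `after`."""
    return (after - before) / before * 100


changes = {
    spk: percent_change(all_utterances[spk], without_set[spk])
    for spk in all_utterances
}
for spk, change in changes.items():
    print(f"Speaker {spk}: {change:+.1f}%")
```

On these figures the means for Speakers A, B and D each move by less than four per cent, whilst that for Speaker C falls by between fifteen and sixteen per cent.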
Since the laterals which were in the original perception experiment
were all ambisyllabic intervocalic varieties, the means of those
utterances which involved this variety of /l/ were measured for
comparison. These utterances were the ones which contained the
following articulations:
silly
sillow
solly
sollow
Boyling
Mayling
The mean durations for these /l/s were as follows:
             Mean duration of /l/ (msec)
Speaker A    52.30
Speaker B    69.83
Speaker C    46.58
Speaker D    53.00
Once again, these results were in line with what could be predicted from
the results obtained in the perception experiment.
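The predicted ordering of speakers (Speaker C shortest, Speakers A and D intermediate, Speaker B longest) can be checked against the intervocalic means mechanically. The Python sketch below is one way of doing so; the grouping structure and function name are hypothetical and serve only to illustrate the comparison.

```python
# Mean duration (ms) of ambisyllabic intervocalic /l/, per speaker,
# as reported above.
intervocalic_ms = {"A": 52.30, "B": 69.83, "C": 46.58, "D": 53.00}

# Predicted grouping, from shortest (clearest) to longest (darkest).
predicted_groups = [("C",), ("A", "D"), ("B",)]


def matches_prediction(means, groups):
    """True if every speaker in a group is shorter than all in the next."""
    for shorter, longer in zip(groups, groups[1:]):
        if max(means[s] for s in shorter) >= min(means[s] for s in longer):
            return False
    return True


print(matches_prediction(intervocalic_ms, predicted_groups))
```

The same check applied to the first set of overall means reported above, which includes the four problematical utterances, fails, because Speaker C's mean there lies above Speaker A's.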
5. Summary
The results given in the above two sections support the hypothesis that
darker tokens of /l/ have a greater duration than clearer tokens. This
appears to be the case both for individual speakers, and also between
speakers who have different resonance distribution patterns.
Some caution may be required here. I would not like to suggest
that this pattern is always consistent, since the effects on resonance of
morphosyntactic boundaries and their interaction with vocalic
environment (and, for instance, whether one of these two factors is
prime over the other) do not, as yet, appear to be sufficiently
understood. In fact, as I have mentioned, some of the initial results did
go against what was expected, but, to a far greater extent, the
hypothesis was supported.
6.1 Discussion
One question that has been raised is whether, in general, dark tokens (of
anything) are (relatively) long. Of course, since the darker varieties of
/l/ which were looked at were mostly those in a final position, it is also
possible that final tokens are long, regardless of whether they are dark
or not. Similarly, those varieties of /l/ which were clearer were usually
those which were in initial position, and there is the question of
whether this is the nature of the /l/, or the nature of the position within
the word, or a combination of the two. The results which were found
can be schematised thus:
For /l/ only
           Speakers A and D    Speaker B         Speaker C
Initial    Short, Clear        Long -, Dark -    Short +, Clear +
Final      Long, Dark          Long +, Dark +    Short -, Clear -
Here, a '+' sign represents that there is 'more' of the quality indicated,
and a '-' represents 'less' of that quality. The actual labels themselves
(clear, dark, long, short) represent the classifications that we might
wish to give phonologically, whilst the additions of '+'s and '-'s are of
a more phonetic nature.
It can be seen from the above diagram that all speakers,
phonetically, do go in the same direction in terms of the durational
features of their /l/s. For the order Initial>Final, all Speakers would
have the order Short>Long.
This question of the possible lengthening of final items is raised
by Vaissiere (1983). She categorises Final Lengthening as a
'language-independent prosodic feature', giving examples of several languages
which display this phenomenon, including French, English, German,
Spanish, Italian, Russian and Swedish. However, as Vaissiere admits
(1983: 60), it may be too much of a generalisation to state this as a
universal, since there is contrary data for several languages, including
Finnish, Estonian and Japanese.
If Final Lengthening could be shown to be, if not a universal, then
at least a tendency, then one might wonder if there were any
physiological or other reasons why this might happen. Vaissiere reports
several suggestions that have been hypothesised by various studies. She
mentions that there may be a general relaxation of speech gestures
toward the end of utterances and that this decrease in amplitude may be
compensated for by increasing the duration. However, this seems to me
to be a strategy that is more likely to be language-specific (or, to be
more precise, dialect-specific), since we have the above examples where
it does not occur. In fact, Vaissiere notes that there have been studies of
children, who seem not to display the tendency of final lengthening,
thus suggesting that this is a learned process.
6.2 Further Study
Two areas would be relevant for investigation. Firstly, it would be
interesting to find a language variety where laterals were clearer and
shorter in final position. Secondly, and more generally, it would be
helpful to find a language variety where non-lateral tokens were notably
shorter finally than initially (perhaps regardless of resonance).
Some of these possibilities may be true for some Scottish dialects.
Work carried out by the Scots Section of the Linguistic Survey of
Scotland (see Hill 1960; also Hill p.c.) has suggested that some dialects
of Scottish English may have very clear final tokens of some alveolars,
nasals and plosives. These dialects and their resonance patterns are now
being researched. If it transpires that these claims are true, it would be
interesting to examine the durational properties of these sounds. If they
were found to be relatively long, then this would add support to the
Final Lengthening cause, whilst, if they turn out to be short, this
would support the suggestion of a darkness/length correlation. In fact,
preliminary non-experimental observations suggest that the latter may
be the case.
I suggested above that final lengthening of /l/, if not a universal,
may be a tendency. If it is assumed for the moment that this is the case,
it is then necessary to look for possible explanations. If longer
tokens of /l/ usually coincide with articulations of a darker resonance,
there are some reported physiological reasons why this may be so.
Amerman and Daniloff (1977) studied lingual coarticulation, though
they do not explicitly link dorsal gestures with increased length.
However, they do suggest that the gesture of the tongue apex is the
more important of the two, and that the dorsal position generally, in
terms of anticipation of vowels, 'does not need to adopt so specific a
position' (1977: 112). It seems possible that dorsal gestures generally
take longer to activate, particularly since this would seem to involve
more muscular activity, and this would be a possible explanation for
the lengthening of dark tokens. That is to say, dark tokens (in this case,
of /l/) have a more prominent dorsal component and dorsal components
may inherently require a longer articulation period.
If this last suggestion is, indeed, a reasonable one, then it would
seem to remove the need for the use of the concept of Final
Lengthening, since, rather than talking about the lengthening of final
items, what is here being talked about is the lengthening of the dorsal
(or dark) items.
If further study supports this, then this would seem to tie in well
with recent work carried out by Sproat and Fujimura (1993). They
model all articulations of /l/ as having an apical gesture, which is
consonantal in nature, and a dorsal gesture, which is vocalic in nature.
One difference which they draw between clear articulations of /l/ and
dark articulations is that, in clear articulations, the apical gesture occurs
first, whilst, in dark articulations, the dorsal gesture occurs first. In
addition, they note that
'the acoustically measured duration of the rime containing a
preboundary /l/ correlates strongly with darkness.'
(1993: 2)
They propose that the vocalic gesture has an affinity for the nucleus of
the syllable, whilst the consonantal gesture has an affinity for the
margin. These gestures make use of different lingual muscles. Their
claim is that coarticulatory undershoot accounts, to an extent, for the
correlation of darkness with duration. They also define the notion Tip
Delay, which has a positive value in final (here, darker) tokens and a
negative value in initial tokens.
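The sign convention for Tip Delay can be made concrete with a minimal sketch. It assumes, as a simplification not spelled out above, that Tip Delay is computed as the time of the apical gesture's extremum minus the time of the dorsal gesture's extremum. The function names and the timing values are hypothetical and are not measurements from Sproat and Fujimura.

```python
def tip_delay(apical_time_ms: float, dorsal_time_ms: float) -> float:
    """Assumed definition: apical extremum time minus dorsal extremum time."""
    return apical_time_ms - dorsal_time_ms


def resonance_from_tip_delay(delay_ms: float) -> str:
    """Positive delay: dorsal gesture comes first (darker token).
    Negative delay: apical gesture comes first (clearer token)."""
    if delay_ms > 0:
        return "darker"
    if delay_ms < 0:
        return "clearer"
    return "indeterminate"


# Hypothetical timings in ms from an arbitrary reference point, for illustration only.
final_token = tip_delay(apical_time_ms=120.0, dorsal_time_ms=80.0)
initial_token = tip_delay(apical_time_ms=40.0, dorsal_time_ms=70.0)

print(resonance_from_tip_delay(final_token))    # darker
print(resonance_from_tip_delay(initial_token))  # clearer
```

On this reading the sign of the delay simply restates the ordering of the two gestures; it says nothing by itself about duration.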
However, Sproat and Fujimura only correlate duration with
resonance in the case of coda-position /l/s (1993: 18). They do not
explicitly state that this is the case for all positions, nor do they
suggest that this correlation is as important for perception as implied
by Newton (1993). That is to say, it seems to be the case that the
durational aspect of laterals may have primary status in the perception
of resonance, since it has been shown that manipulation of duration
affects resonance judgements when no other differences are present.
They do, however, state that their discoveries may only apply to
those varieties of English which have the clear/dark distinction. They
mention that there are varieties which do not display this distinction,
and that there are also other languages which do not. However, of
course, for the varieties used in my own instrumental study, even those
which were said to be 'clear /l/ everywhere' and 'dark /l/ everywhere',
were shown to have perceptible differences within these categories. As
yet, I am not aware of any varieties of English having a perceptibly and
consistently clear /l/ in final positions and a darker /l/ in initial
positions. We do find varieties of English in which /r/, syllable-finally,
is clear (for rhotic dialects, or other situations where it is pronounced),
and syllable-initially is dark. However, Sproat and Fujimura do not
attempt to extend their findings to any tokens other than /l/.
Nevertheless, their model could hold for other, non-lateral sounds,
since the tongue gestures (as secondary articulations) for clearness and
for darkness would differ in similar ways, regardless of the nature of the
primary articulation.
6.3 Implications
Provided that some of the work suggested in the previous section were
carried out, and that this could provide more concrete evidence for some
of the suggestions presented in this paper, there would appear to be two
possible implications for these findings. Firstly, there may be
implications for the theory of speech production (as well as speech
perception), in light of the possible clash between Sproat and
Fujimura's production model and the Final Lengthening model (and in
light of the perceptual findings of Newton 1993).
These findings may also have some importance in phonetic and
phonological modelling, for example, in speech synthesis and speech
recognition. If length is an inherent and predictable part of the structure
coincident with resonance (whether this is only for laterals, or for other
sounds), then it would appear to be important to ensure correct
modelling of both the resonance features and these durational aspects.
REFERENCES
Abercrombie, David. (1936) Notes on the phonetics of Icelandic. Ms.
University College London.
Allen, W. S. (1953) Phonetics in Ancient India. London: Oxford University
Press.
Amerman, James D. and Raymond J. Daniloff. (1977) Aspects of lingual
coarticulation. Journal of Phonetics 5.107-113.
Beckman, M. and J. Pierrehumbert. (1986) Intonational structure in
Japanese and English. Phonology Yearbook 3.255-310.
Catford, J. C. (1977) Fundamental Problems in Phonetics. Edinburgh:
Edinburgh University Press.
Cutler, Anne and D. Robert Ladd (eds.) (1983) Prosody: Models and
Measurements. Berlin: Springer Verlag.
Giegerich, Heinz J. (1992) English Pronunciation. Cambridge: Cambridge
University Press.
Halle, Morris and K. P. Mohanan. (1985) Segmental phonology of Modern
English. Linguistic Inquiry 16.57-116.
Hill, Trevor. (1960) Phonemic and prosodic analysis in linguistic
geography. Paper presented at First International Congress of
General Dialectology (Louvain/Brussels, 1960), Firthian Phonology
Archive, Department of Language and Linguistic Science, University
of York.
Jones, Daniel. (1956) An Outline of English Phonetics, eighth edition.
Cambridge: W. Heffer and Sons.
Kelly, John. (1989) On the phonological relevance of some non-phonological elements. In Tamás Szende (ed.) 56-59.
Kelly, John, J. K. Anthony and Elizabeth Uldall. (1966) Tempo and
transitions. Paper presented at R. I. T. Stockholm Speech
Communication Seminar.
Kelly, John and John K. Local. (1986) Long-domain resonance patterns in
English. International Conference on Speech Input /Output;
Techniques and Applications. IEE Conference Publication 258.304-309.
Kelly, John and John K. Local. (1989) Doing Phonology: Observing,
Recording, Interpreting. Manchester: Manchester University Press.
Local, John K. (1995) Syllabification and rhythm in a non-segmental
phonology. In J. W. Lewis (ed.) Studies in General and English
Phonetics: Essays in honour of Professor J. D. O'Connor. London:
Routledge. 350-366.
Mohanan, K. P. (1986) The Theory of Lexical Phonology. Dordrecht:
Kluwer.
Newton, David E. (1993) Types of resonance and their perception in
English. Paper presented at the Second Manchester University
Postgraduate Linguistics Conference, 13 March 1993.
Ogden, Richard A. (1992) Parametric interpretation in YorkTalk. York
Papers in Linguistics 16.81-99.
Sproat, Richard and Osamu Fujimura (1993) Allophonic variation in
English /l/ and its implications for phonetic implementation.
Journal of Phonetics 21.291-311.
Szende, Tamás. (1989) Proceedings of the Speech Research '89
International Conference, June 1-3, 1989, Budapest. Budapest:
Linguistics Institute of the Hungarian Academy of Sciences.
Vaissiere, Jacqueline. (1983) Language-independent prosodic features. In
Cutler and Ladd (eds.) 1983, 53-66.
Westermann, Diedrich and Ida C. Ward. (1933, 1990 edition). Practical
Phonetics for Students of African Languages, second edition with
introduction by John Kelly. London: Kegan Paul.
PROSODIES IN FINNISH*
Richard Ogden
Department of Language and Linguistic Science
University of York
1. Introduction
Recently, it has been argued that phonetic detail ought to be accounted
for by phonology: to ignore detail is to produce analyses of linguists'
idealisations of data, rather than of real spoken material. Some studies
of English have shown that there is phonetic detail beyond what had
been expected: Zsiga (1994) has shown that post-lexical processes in
English produce different kinds of [ʃ] from those produced by the
application of either level 1 or level 2 rules; Manuel et al. (1992) have
shown that /ð/ in English may under certain circumstances be realised
by nasal portions with dental articulation and a dark secondary resonance
(low F2); Hawkins & Slater (1994) show that by modelling fine details
of coarticulatory behaviour it is possible to produce significantly more
intelligible synthetic speech which is also more robust in difficult
listening conditions. In a somewhat more theoretical vein, Docherty et
al. (1995) argue that unless phonetic detail and variability are described
within a phonological analysis, the analysis is seriously flawed, since it
remains unaccountable to observed data. Hawkins (1995) argues that
fine phonetic detail contributes to what she calls the coherence
(naturalness) of speech. If coherence is considered important, hitherto
* Parts of this paper appear in Ogden (1995a). My particular thanks to Steve
Harlow, John Kelly, Gerry Knowles, John Local for their help with that
work. Thanks are also due to my informants, and to Tapani Salminen, who
helped me decipher some of the material and produce an orthographic
version of it.
York Papers in Linguistics 17 (1996) 191-239
© Richard Ogden
ignored details of speech become central properties of the linguistic
system.
This paper presents a description of Finnish phonetics and a
Firthian Prosodic Analysis of some of the data. Rather than starting
from citation forms, the analysis is based on some of the observed
phonetic detail of spontaneously produced speech.
This paper has two main sections. The first section gives a general
phonetic description of my informants' speech, while the second section
pays particular attention to the ways in which words in the recorded
material are joined together, and presents a Firthian Prosodic Analysis
of these word joins. Where the informants produce forms that are not
Standard, the non-Standard forms are given in parentheses. Such forms
are generally shorter than Standard forms. My impressionistic records
contain as much detail as deemed necessary for the analysis presented.
The material discussed in this paper was elicited from two
informants (ET and SU). Both were female, and were 17 years of age at
the time of recording. They were good friends and were still at school in
Kuopio, where they received instruction in Standard Finnish.1 Since
there are no substantive differences between ET and SU, utterances from
both speakers are not distinguished in the text.
The material comes from two sources. The first one is a
conversation between the two informants, where one describes to the
other a picture so that the other informant can draw the picture seen
only by the first informant as exactly as possible. The second source is
a set of stories narrated by the informants based on a series of connected
pictures.
My informants, who come from Kuopio, described their speech as
Standard Finnish. The material elicited from them largely matches
descriptions of Standard Finnish (e.g. Wiik 1981, Karlsson 1982),
although occasionally I obtained from my informants material which is
considered typical of the Savo dialect of their home town. A
linguistically trained informant from the Häme region of Finland
1 Standard Finnish is a somewhat artificial language which was formalised
in the 19th century. It contains elements taken from the two main dialect
areas of Finnish, East and West. It is the prestige language of Finland, and
the form most commonly cited by Finns to foreigners. It is also the
language used in broadcasting, publishing and education.
(roughly the central south-west of Finland) identified my informants'
speech as distinctively Savo on the basis of intonation. The only other
striking aspects of my informants' speech in comparison to descriptions
of Standard Finnish were the rhythmical structure of their words, which
matches that described for the Savo dialects (Wiik & Lehiste 1968,
Wiik 1975, Kettunen 1981), and their use of the glottal stop (Itkonen
1965).
2. An outline of Finnish phonetics.
My observations presented in this section are not extensive, but
nonetheless provide some detail beyond commonly accepted general
descriptions of Finnish phonetics2 (e.g. Sovijärvi 1957, Wiik 1981).
Notes on tempo are included, where relevant, between braces (in the
manner of extIPA). Some of the standard assumptions made about
Finnish pronunciation are challenged by the data in this paper. In
particular, general descriptions typically do not discuss the voicing or
aspiration of plosives, the precise variability in the articulation of the
'labiodental approximant' (/v/), the extent of laryngeal features such as
breathiness and creaky voice, and the variability in the qualities of
vowels. Standard descriptions of Finnish also concentrate on citation
forms: the material on which these notes are based is not citation form,
but speech produced in a relatively natural and spontaneous fashion.
2.1 Consonants with complete oral closure
Complete closure in Finnish can combine with partially or entirely
voiced closure, or with voiceless closure. Complete oral closure with
velic opening is only combined with voicing. The release of oral
closure without nasality is generally unaspirated and the voice onset
time is approximately 10-30ms (Suomi 1980, Lahti 1981). The
commonest closure in normal rate speech is voiceless.
2 In this paper, phonetic material is presented using an IPA font.
Phonological material appears in bold. Orthographic material appears in
italics.
1.
kan:u3
fall n maon
tämä on kannu
this is a jug
2.
ja nok:d On tom:Snen suik0
ja nokka on tommoinen suikula
and the spout is a kind of oval
3.
no: piale
no, piirrä vaan!
go ahead and draw it then!
However, [k] may be aspirated, as in (4). It is not clear whether this is
because it is followed by a close front spread vowel, or
whether it is because the word kirkas is in focal position and is
pronounced relatively slowly:
4.
Ionrrpgnj ualda) {len k"irkas len)
lampun valo on kirkas
the light from the lamp is bright
The spectrogram in Figure 1 below provides a visual representation of some of the
phonetic characteristics of this utterance (4). Note that the first velar
plosive (1) is accompanied by about 50ms of aspiration, while the
second one (3) has no aspiration and the VOT is shorter, at 30ms. Note
also that the apical tap (2) is voiced, not voiceless.
3 Phonetic material contained between curly brackets is characterised
throughout by the parameter(s) indicated subscript: {all} = allegro; {len} =
lento; {p(p)} = pian(issim)o; {rall} = rallentando.
Fig. 1: [lomunj ualt5:1) k"ickas]
'the light from the lamp is bright'
In the example in Fig. 1, the first velar plosive whose burst is at (1) is
produced with aspiration and 50ms VOT, while the second one (at 3) is
produced without aspiration and with VOT of 25ms, which fits in better
with descriptions in the literature (Suomi 1980, Lahti 1981).
[d] occurs only in morphophonological alternation with [t]. It is
articulated as a very short voiced plosive, and usually has an alveolar
rather than dental place of articulation (Suomi 1980).4 It is accompanied
by a 'dark' resonance. Its closure duration is very short: usually about
half the length of the voiceless plosives.
4 /d/ occurs only initially in syllables which (i) contain a short vowel
followed by a consonant that closes the syllable or (ii) for lexical or
morphosyntactic reasons pattern in the same way (i.e. as short closed
syllables) (Karlsson 1982).
5.
en tieda
en tiedä
I don't know
In fast speech, plosives can have a voiced closure and release when they
occur in a voiced stretch of speech. Voicing with closure and release is
not common word-initially. It occurs most frequently in words formed
from pronouns, as in tommoisella in example 6, and after periods of
voicing and lateral airflow:
6.
tehty all)
(all tsed Ld: dom:ozela acc ?yfiell
se on vaan tommoisella yhdellä viivalla tehty
it's made with one sort of line
7.
nayt:a: korualdo
näyttää korvalta
looks like an ear
8.
nayt:faii a:y6 ne Oa:ldah all)
näyttääkö ne tältä?
do they look like this?
9.
boh'jdn
pohjan
bottom (gen.)5
5 Said as a repetition of the previous speaker; the previous utterances are
recorded in example (43).
Fig. 2: [luvat ovat heican]
'the licences belong to them'
Note the three different closure durations for the plosives. (1) was
measured at 90ms, (2) 60ms, and (3) at 40ms. In this instance, the
amount of voicing for [d] is very small, and the duration probably gives
the strongest cue to the status of the plosive.
When short and in the initial portion of an unstressed syllable,
plosives can sometimes be articulated with a stricture less close than
complete closure, giving [p 1.10 or even friction and voicing. There are
insufficient instances of this in my data for it to be possible to work
out whether there are any systematicities in the way this is used.
However, it seems true to say that the weaker closure occurs before
unstressed syllables, and only when the stretch as a whole is voiced.
Closure portions are always followed by audible release within the word
(where there is only one plosive-plosive cluster: [tk]). However,
between words, the plosive [t] has a variety of release types. It may be
released medially:
10.
(p namar.at p) ka:p:ejd
nämä ovat kaappeja
these are cupboards
When a lateral follows, it may be released laterally:
11.
(p nampuati p) lamp:*
nämä ovat lamppuja
these are lamps
When a bilabial plosive follows, there may be no audible release:
12.
hatut' pan:a:m pa:fi4n
hatut pannaan päähän
hats are put on the head
It may be that in the case of apical followed by bilabial closure, the
bilabial closing gesture masks the release of the apical closure. In other
words, the bilabial closure is timed so that it happens before the apical
release.
Unreleased closure is a common way for a speaker to keep hold of a
turn in a conversation. When this closure is released, the next stretch of
speech sounds like it begins with a plosive (e.g. (7) above, which
begins with a portion transcribed [ts-] and is preceded by [-?] and a
pause).
2.2 Velic opening and oral closure: [m ɱ n ŋ]
Nasality co-occurs with complete oral closure made at various places in
the oral tract: bilabial, labio-dental, dental, and velar. Nasality and
voicing always co-occur in Finnish. Finally in the syllable, nasal
consonants are articulated homorganic with any subsequent plosive;
otherwise they are articulated as apico-dentals. (See Section 3.1 n.)
[n] is produced with the tongue tip just back of dental and forward of the
alveolar ridge.
Fig. 3: Relnylirfi;O:nj
'I made a mistake'
Note how the nasal portion ends with a very obvious plosive-type
release (1); the low amplitude of voicing for the initial part of the last
syllable (2), and the breathiness throughout this final syllable (3).
13.
(p e m:a tilt milli ne n:ayt:,4: pp)
en minä (mä) tiedä (tiä) miltä ne näyttää
I don't know what they look like
14.
minkalaind se ?alapa:
minkälainen se alapää oli?
what was the bottom bit like?
15.
nob: tam on kan:1g kaula
no, tämä on kannun kaula
well, this is the neck of the jug
In portions with nasal and labiodental articulations, there is a great deal
of variability, from apical contact with nasality to labiodental contact
with nasality. In the latter case, it may be that this labiodental contact
is completely coextensive with nasality, and that length together with
labiodentality are the only exponents of the syllable-initial C. Release
is marked with a superscript ! in (20).
16.
seingn uieres1
seinän vieressä
next to the wall
17.
rdinylirffg:n
tein virheen
I made a mistake
2.3 Lateral airflow
Laterals are articulated dentally in Finnish. When a nasal precedes a
lateral, nasality may extend into the lateral portion, and laterality and
nasality may be produced simultaneously. Finnish laterals are on the
whole darker than their English counterparts, but are never as heavily
velarised as finally in English syllables.
2.4 Tapped and trilled articulations
Taps and trills seem to be in free variation in my informants'
speech; but taps (but not trills) are in free variation with the voiced
plosive [d]. Another informant (from Häme) has trills and taps where
my informants have [d]. In citation forms and careful speech, the trill [r]
has 2-3 vibrations of the tongue when short, and 5-6 when long. In fast
speech, the tap [ɾ] counts as the exponent of 'short' and the trill has 2-3
vibrations of the tongue, and counts as the exponent of the category
'long'. Both taps and trills are pronounced voiced in clusters with
voiceless plosives: [kerto:], not [ker̥to:]: kertoo, 'tell', 3ps. present
tense. Initially however they may sometimes combine with a short
period of voicelessness.
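The way tempo shifts the exponents of 'short' and 'long' for the rhotic can be laid out as a small lookup. This is only a sketch of the description just given: the tempo labels "careful" and "fast", the function names, and the treatment of a tap as a single tongue contact are my own assumptions for illustration.

```python
# (tempo, length category) -> (articulation, (min, max) number of tongue contacts).
# Vibration counts follow the description in the text; a tap is assumed to be one contact.
EXPONENTS = {
    ("careful", "short"): ("trill", (2, 3)),
    ("careful", "long"): ("trill", (5, 6)),
    ("fast", "short"): ("tap", (1, 1)),
    ("fast", "long"): ("trill", (2, 3)),
}


def rhotic_exponent(tempo, length):
    """Return the expected articulation for a length category at a given tempo."""
    return EXPONENTS[(tempo, length)]


def possible_categories(articulation, contacts):
    """List every (tempo, length) pairing that an observed token is compatible with."""
    matches = []
    for key, (kind, (low, high)) in EXPONENTS.items():
        if kind == articulation and low <= contacts <= high:
            matches.append(key)
    return matches


print(rhotic_exponent("fast", "short"))
print(possible_categories("trill", 2))
```

A trill with two vibrations is compatible with both 'short' in careful speech and 'long' in fast speech, so under these assumptions the length category cannot be read off the token without reference to tempo.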
18.
ma oni pi:rtAnyh
minä (mä) olen (oon) piirtänyt (piirtäny)
I have drawn (it)
19.
lafiehA retina:
lähellä reunaa
near the edge
20.
tom:oneg korua
tommoinen korva
a sort of ear
21.
oih're:l'a 2 nsin test var:gt
vihreällä ensin teet varret
you do the stalks first in green
Sometimes lateral and trill articulations are found with initial voiceless
portions utterance-initially:
22.
jasla
laske viiteen
count to five
23.
rakeensin talon
rakensin talon
I built a house
2.5 Open approximation
Two approximants occur in Finnish: palatal and labiodental. The
labiodental approximant is often accompanied by a somewhat ballistic
lower lip gesture, producing something like a labiodental flap.
Sometimes in the initial portion of a stressed syllable, the stricture for
the labiodental approximant is that of rather close approximation,
producing weak friction; it is not uncommon word-initially to hear a
voiced labiodental plosive (see Fig. 3). The palatal approximant does
not exhibit this wide range of variability in its degree of stricture.
Approximants only occur syllable-initially. (Flifilet 1971; Suomi
1985a and the references therein consider whether this distributional
pattern is evidence for treating the final component of diphthongs,
which may be [i] or [u], and initial approximants as allophones of the
same phoneme.)
24.
tut(raii ternaton laji9 rall)6
tuntematon lajike
an unknown species
25.
no te: u:aak:a ruiskuk:in
no, tee vaikka ruiskukkia
well, why don't you do cornflowers
Sometimes in back harmonic words, the palatal approximant is very
back, and is transcribed as an advanced velar glide. There are not enough
instances of it in my data to be able to say anything very conclusive
about it.
26.
hap:out t
happoja
acid, part. pl
6 Note here that the utterance ends voiceless, as is common for utterance-finals. Note also that it is a dorsal articulation, and that it is front. It would
be inappropriate to regard this as some form of deletion, since all the
phonetic properties demonstrated at the end of this word can be shown to be
systematic. See Section 3.6 h.
2.6 Friction with and without voicing
The fricative [s] can be produced in Finnish with the tongue tip down.
This produces a rather flatter, duller sound than in, say, English. The
groove is also wider than in English, enhancing this impression of
dullness (cf. Sovijärvi 1957).
Another variant of [s] is also found. In this articulation, the groove
made by the tongue is considerably narrower than in English, and the
tongue tip is up. The groove made by the tongue forms a narrow V-
shape from the blade to the tip. The result is that this [s] sounds
whistly to English speakers. The data I have suggest (but not
conclusively) that the whistly [s] sound occurs before front, spread non-open vowels. When these two articulations are combined with secondary
articulations affecting mostly the dorsum and harmonising with the
resonances of the neighbouring vowels, a gradual spectrum of qualities
is produced rather than the simple two-way split suggested here.
Nevertheless, the 'whistly' articulations do stand out in the recordings.
The records below show examples. The 'flat [s]' is transcribed [s]
and the 'whistly [s]' as [s]:
27.
asuin si:n6 talos:a
asuin siinä talossa
I lived in that house
28.
lafide ase.mal:e
lähde asemalle!
go to the station
29.
katosi m:etsi:n
katosin metsään
I disappeared into the forest
The different types of [s] sound are not marked elsewhere in this paper.
Between voiced sounds and within words, weak voicing may co-occur
with apical friction which is of short duration:
30.
notme s:ag:grstg
nouse sängystä!
get out of bed
There are in the data some instances where a word begins with initial
voicing and friction. These words are commonly pronouns, as in the
words tältä (demonstrative pronoun, ablative sg.) and tuonne
(demonstrative pronoun + illative sg.) in the examples below; strictures
of relatively open approximation in fast speech are sometimes also
found instead of strictures of complete closure. In these cases, the
friction is rather weak.
31.
nayt:(alli a:y6 ne oa:Idah all)
näyttääkö ne tältä?
do they look like this?
32.
(all jos kita loti8ot s all) ?ylha:lta pain
jos sitä katsottaisiin (katottas) ylhäältä päin
if you looked at it from above
33.
ja kafiu5 tule: (all hone ?oijee all) pwo191:e
ja kahva tulee tuonne (tonne) oikealle puolelle
and the collar comes up to the right-hand side
2.7 Voicelessness, breathy voice: [ ̥ h ɦ]
Phonetically, it is perhaps best to see Finnish [h] as a voiceless version
of an adjacent vowel. This is also Sweet's description of Finnish [h]
(Sweet 1908, in Henderson (ed.) 1971: 174). 'There is also a "strong"
aspirate which occurs in Finnish and other languages, the formation of
which the full vowel position is assumed from the beginning of the
aspiration, which is therefore a voiceless vowel.' On the other hand, the
degree of aspiration at the syllable margins is greater than in the
voiceless vocalic syllabics noted below.
[ɦ] can be treated in a similar way, as a breathy voice version of an
adjacent vowel. [ɦ] occurs between two voiced sounds, and [h]
elsewhere. Both [h] and [ɦ] are found syllable-initially and finally.
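The distributional statement above (the breathy-voiced variant between two voiced sounds, the voiceless variant elsewhere) can be written as a two-way choice. The following is a minimal sketch: the function name, and the use of None to stand for an utterance edge, are assumptions made for illustration, and the contexts shown are schematic rather than taken from the recordings.

```python
from typing import Optional


def glottal_variant(prev_voiced: Optional[bool], next_voiced: Optional[bool]) -> str:
    """Choose between breathy-voiced [ɦ] and voiceless [h].

    Each argument says whether the neighbouring sound is voiced;
    None stands for an utterance edge (no neighbouring sound).
    """
    if prev_voiced is True and next_voiced is True:
        return "ɦ"  # flanked by voicing on both sides
    return "h"      # the 'elsewhere' case


print(glottal_variant(True, True))    # ɦ (e.g. between two vowels)
print(glottal_variant(True, False))   # h (e.g. before a voiceless plosive)
print(glottal_variant(None, True))    # h (utterance-initial)
```

The sketch treats the variant as a discrete segment at the syllable margin; it does not model breathiness spread over a whole syllable.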
In my informants' speech, [ɦ] as a distinct portion of breathy
voicing focused at the syllable margin is frequently not observed, but
breathiness throughout the syllable is. This is especially interesting in
view of some of the metathesis which is supposed to be fossilised in
Finnish (cf. Rapola 1966: 256ff). In the Standard language, there are
pairs of words such as valhe, 'a lie' and valehtella 'to tell a lie'.7 When
my informants were asked to give the word for 'a lie' they consistently
produced [unlv], with breathiness throughout the whole of the second
syllable (or if anything concentrated on the latter portion of it); but
certainly not initially in the syllable as the (generally phonemic)
orthography implies. Note that the lateral portion of this word is
pronounced half-long, where half-long duration serves as the regular
phonetic exponent of the first element of a CC-cluster (cf. Ogden
1995b).
Fig. 4 presents a spectrogram of a token of the word hiihdin, 'I skied',
where the whole of the first syllable is pronounced breathy.
7 Similarly, there is the word paras, 'best', which has the stem parhaa-; /h/
may not occur finally, since only apical sounds occur in this position. This
instance can be seen therefore as an example of metathesis of friction.
Fig. 4: [hi:fidin ladah#]
'I skied on the track'
Note the breathiness evident throughout the first syllable (1); the very
short voiced closure for the [d] sounds (2), and the final
voicelessness (3).
At the end of a syllable, the tongue gesture for the vocalic part of
the syllable may be somewhat raised and accompanied by voicelessness,
producing weak friction, as in [laxti], 'Lahti', a place name.
34.
ukuoistt tehtyjk
viivoista tehtyjä
made of lines
35.
jatkat vafig
jatkat vähän
you go on a bit
Voicelessness is frequently used to mark utterance finality. Stages into
complete voicelessness from voicing are typically: voicing, creak,
voicelessness. Voicelessness may frequently be accompanied by
quietness. Sometimes the voicelessness is rather 'strong' (recall Sweet's
observations), and is then transcribed as [h], with the meaning that a
more forceful articulation is used than that implied by the symbolisation
using a voiceless vowel.
36.
ihmelisa:ke tone ?oike:le rrwo:1:eh
ihmeen lisäke tuonne (tonne) oikealle puolelle
a strange appendage on to the right hand side
37.
/pp e m:a
en minä (mä) tiedä
ne n:ayta: pp)
miltä ne näyttää
I don't know what they look like
38.
kis:ct ?istui matol:v
kissa istui matolla
the cat was sitting on the carpet
See also below, 'Voiceless vowels'.
2.8 Glottal stop and creaky voice: [ʔ ̰]
The glottal stop and creaky voice are frequently used in the speech of
my Savo informants to mark the beginning of words which have a
vowel initially. Lehiste (1965) presents some similar data comparing
vowel-vowel sequences with and without intervening syllable
boundaries; those with syllable boundaries may use creaky voice as in
Fig. 5.
Fig. 5: Uilfide asemal:0]
'go to the station'
Note the initial voicelessness (1), breathiness throughout the first
syllable (2), and the very striking creaky voice between the second and
third syllables (3). Much of the transition from one vowel sound to the
next coincides with the period of creaky voice.
39.
dolcse pyore: sea ?alha:l:a ?olevah
onko se pyöreä, se alhaalla oleva?
is it round, the one underneath?
40.
aiko ?iso'
aika iso
quite big
41.
tus:1
Tdn:sin ?ota'
ensin ota musta tussi
first take the black pen
42.
(fall migkAlaind se ?gala all npq:
minkälainen se alapää oli?
what was the bottom part like?
Another function of glottal stops in conversation seems to be as a
device for keeping hold of the turn in the conversation. While one
speaker has an unreleased closure, the other speaker does not interrupt:
43.
ma pi:rsin siTa.T... tam pY0TYla malt jar... ja poh.jon
minä (mä) piirsin sitä... tämän pyöryläosan ja... ja pohjan
I drew it... this round bit and... and the bottom
More detailed descriptions of creaky voice are given in Section 3.3 under
the exponents of ʔ.
2.9 Resonance features
With the possible exception of [d], consonants in Finnish match their
resonance with that of the vowel of the syllable in which they appear.
However, there are not such extremes of consonantal articulation that
consonants with palatal place of articulation or heavily velarised
consonants are produced.8 These seem not to form part of the Finnish
repertoire.
Consonants in words with back harmony are consequently darker
than in words with front harmony. One way of delimiting words is a
change in the resonance of the consonants at the words' edges:
8 A distantly related language, Nenets, lost 'vowel' harmony early in its
development and now has palatalised and velarised consonants. Finnish
secondary articulations are not as extreme as these.
44.
kaytimi riufieinto
käytin puhelinta
I used the telephone
Note that in this example, the words are kept together by the shared
bilabial place of articulation but are kept separate by the different
resonances. The resonance of consonants is not marked in my
transcriptions unless it is different from what is expected.
Lip-rounding, which is predictable, is similarly not transcribed for
consonants, although it must be noted that the lips hold the same
gesture over the whole syllable, or in the case of diphthongs over the
syllable-initial or syllable-final piece.
As far as [d] is concerned, it could be that it is the low-frequency
voicing during the closure which gives the auditory impression of
darkness. It should be added that some writers (eg. Karlsson 1971)
believe that this voiced alveolar plosive is an import from Swedish and
that it came about when the modern language was standardised in the
capital Helsinki in the last century; Helsinki was at that time
predominantly a Swedish-speaking city. Kettunen's map 65 (Kettunen
1981) shows that [d] only occurs natively in one or two areas on the
West coast, which, significantly, are also areas where Swedish has a
strong foothold. My informants were able (consciously) to produce
dialect forms which used other articulations than the one described here
such as a voiced bilabial approximant or a voiced tap. TS, my
informant from Häme, regularly uses a voiced apical tap or trill in all
contexts where [d] appears in the data presented here.
2.10 Vowels
The symbols used in my records for the vowels are: [a n o o u y i e
This follows the usual IPA practice for Finnish vowels, although the
orthography is more common: <a ä o ö u y i e>. The symbol [n]
(sometimes also transcribed in my records as [a] for a slightly closer
vowel) is used to represent an open, central quality which is frequently
found in unstressed syllables, particularly very short ones9. It is
normally accompanied by a diacritic for advancing or retracting.
Fig. 6: Vowel quadrilateral showing the approximate qualities of
Finnish vowels.
The symbols used in the transcriptions presented in this paper are used
as follows: [a] is not as open and front as CV4, nor is [a] as back; its
quality is rather more central though very open. The mid vowels [c o o]
are all more mid in quality than their IPA symbolisation implies,
though they are hardly less peripheral. [u] is very back and round,
almost cardinal. [i] is front and spread. [y] on the other hand is not so
front and is less rounded than, say, French [y]. It bears some
resemblance to the short German sound [ʏ] as in wünschen. Diacritics
accompanying vowel symbols modify the values described here, and not
cardinal vowel values.
No significant differences in quality have been observed for Finnish
vowels depending on their duration (cf. Sovijärvi 1938, Wiik 1965,
Engstrand & Krull 1984).
9 cf Harms (1964: 62), who uses the symbol [A] for this sound in back
harmonic words. He claims it appears only when preceded by a syllable
boundary or following a consonant cluster, and only in or beyond the third
syllable. My notes do not quite accord with this last observation, and I have
observed both fronter and backer varieties.
2.11 Diphthongs
The so-called rising diphthongs of Finnish all end in a close vowel.
They are: [ɑi æi oi øi ui yi ei], [æy ɑu ou øy eu] (and, marginally, [ey
iu iy]). The diphthongs which end spread do not normally end as close
as the symbol [i] implies: they usually fall somewhat short of this, to
approximately [e] or [g]. The diphthongs that end spread but which are
not in the first syllable of the word are usually 'derived', ie. they are not
part of the stem of the word, but arise from the addition of [i], which
marks past tense and plural in Finnish.
The so-called opening diphthongs are: [uo yo ie]. These vary in
their articulation depending on the speaker's dialect (Kettunen 1981).
My own informants pronounced these sounds as scarcely diphthongal.
They tended to start with a short close portion opening to a mid portion
which nevertheless was quieter than the initial part of the diphthong,
e.g. [kwo:rutet:o] 'icing', part. sg., [tYenton], 'unemployed', nom. sg.
In Standard Finnish these vowels have longer initial portions with a
mid off-glide. These diphthongs are usually treated as the phonetic
exponents of long mid vowels, since in the first syllable (the only place
where they occur), [e: o: ø:], ie. pure, long vowels, are only found
in loan words. In native words, therefore, the long vowels are in
complementary distribution with the opening diphthongs.
2.12 Velic opening and vocalic articulations
The timing of the lowering of the velum is generally such that it lowers
before a complete oral stricture is made, producing vowels which are
nasalised before nasal consonants. Word-finally, there is frequently no
complete oral stricture, but there is audible nasality throughout the final
syllable. Lehiste (1965) shows that the nasalisation of a vowel may
serve as a boundary marker in Finnish. The pair maan isä and maa
nisäkäs are distinguished partly by the fact that the first vowel of maan
i- is nasalised, while in maa ni- it is not.
2.13 Variability of vowel quality
Vowel qualities produced by my informants are somewhat variable. The
main tendencies can be summarised as follows, though some of the
observations in this section remain rather tentative.
Very short vowels tend to be centralised.
Vowels after the palatal glide are frequently fronter in quality than
elsewhere; but it is hard to tell whether there is anything substantial
to be said here, since these vowels also tend to be very short in my
data, occurring as part of the partitive plural suffix.
Vowels after apical consonants tend to sound slightly fronter in
quality than after labial or dorsal consonants.
Some examples from my data will give an impression of the kinds of
variability in vowel quality which can be observed.
Compare the formant values for the centre points of the three open
vocalic portions in the word [am:at:ej0]. The first one has the formant
values 855-1520-3335 Hz, and the second one 765-1570-2965 Hz.
These are roughly comparable; taking into account the fact that the
second one is short and occurs between two consonants, one might
expect a lower F1 value; the F3-F2 difference might be explained by the
proximity of bilabial closure, which tends to lower all the formant
values. The final open vowel however has the formant values 815-1875-2945 Hz,
which is quite a lot fronter (i.e. with a higher F2) than the
other two open vowels. Bearing in mind the fact that this vowel is also
very short, and also next to a palatal approximant (which would have
slower formant transitions), this high F2 value might be explained by
coarticulation. However, one of my informants produced the word
housujaan 'his trousers', part. pl. as [ housujaan]; this makes it more
likely that there may be some kind of local harmony between the palatal
approximant and the subsequent vowel.
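The comparison of the three open vocalic portions can be laid out as a small calculation. The following Python sketch simply copies the three formant triples quoted above and compares their F2 values; the labels and the helper name `f2_difference` are illustrative assumptions, not part of the original analysis.

```python
# Formant values (F1, F2, F3) in Hz at the centre of the three open
# vocalic portions, copied from the text. The labels are illustrative.
formants = {
    "first": (855, 1520, 3335),
    "second": (765, 1570, 2965),
    "third": (815, 1875, 2945),
}


def f2_difference(a, b):
    """Return F2 of vowel a minus F2 of vowel b, in Hz.

    A positive value means that vowel a has the higher F2, which the
    text takes as the acoustic correlate of a fronter quality.
    """
    return formants[a][1] - formants[b][1]


# Compare the final open vowel with each of the other two.
for other in ("first", "second"):
    print("third vs", other, ":", f2_difference("third", other), "Hz")
```

On these figures the third vowel's F2 exceeds that of the other two by several hundred Hz, whereas the first two differ from each other by only 50 Hz.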
A kind of harmony may be observable within feet. The
observations made here are by no means conclusive, though they are
suggestive. In the phrase pidän ammatistani 'I like my job', it was
observed that the third open vowel in the word [am:dtistani] was fronter
than the other two open vowels (with formant values of 695-1885-3015 Hz,
thus roughly comparable with the third open vowel in
ammatteja). Three possible explanations seem likely: (1) the vowel is
in a foot with two syllables with front resonance: perhaps there is
vowel-to-vowel coarticulation; (2) the vowel is surrounded by apical
consonantal articulations, which tend to raise F2 and so give the
impression of fronter vowels; (3) the functional load on the vowel so
late in the word is minimal, and no other vowel could occur in that
place in structure and make a difference in meaning, therefore one might
expect that this vowel would have the potential to be more variable in
quality; example 57 is a similar example of this. It may also be the case
that all three explanations have some validity.
2.14 Voiceless vowels
Vowels between voiceless consonants are sometimes voiceless. This
seems typical of fast stretches of speech, turn ends, or stretches where as
the result of metrical structure the vocalic portion would be very short
even if voiced.
45.
mita kuk:a: ne m:uistt)t:a:
mitä kukkaa ne muistuttaa
what flower do they remind you of?
46.
(all jos sita knoot s am ?ylha:lta pain
jos sitä katsottaisiin (katottas) ylhäältä päin
if you looked at it from above
47.
lamptit mat kirk:aitg
lamput ovat kirkkaita
the lamps are bright
Just as certain consonants are voiced in stretches which are overall
voiced, so it appears that short vowels in stretches which are overall
voiceless can be voiceless.
2.15 Quantity and Duration
There are many different quantities for both consonants and vowels in
Finnish. At the phonological level, it is usually said that there are two
contrastive degrees of length. At the phonetic level however, it is not
true to say that there are only two degrees of duration. In my records
five degrees of duration are marked: [v 9 v v. v:[. Note that it is more
accurate to see duration as gradient rather than as categorial, so that no
matter how refined the transcription, the records remain impressionistic
rather than conclusive.
Half-long vowels are found after short open syllables, giving the
shape [cvcvˑ] (cf. in particular Wiik & Lehiste 1968, Wiik 1975, who
show that the precise duration is a dialectal matter: some dialects have
the shape [cvcv]). Half-long vowels in my informants' speech
frequently occur also in closed syllables, provided the syllable-final
consonant is a sonorant (typically [n]), giving the general shape
[cvcvn]. This pattern is not found when the final consonant is a
voiceless plosive (usually [t]): [cvcvt]. Palomaa (1946) found that
vowels before voiceless consonants are shorter than before voiced ones.
Half-long consonants appear as the exponent of the first element of
CC-clusters, giving the general shape [cvc.cv].
Very short vowels are found after heavy first syllables, giving the
phonetic shapes [cvvcV] and [cvccV]. A short vowel after such a stretch
may also be very short: [ornakeja ka:ptstd] ammatteja, 'profession'
part. pl., kaapista 'cupboard', elat. pl.
Factors which may be significant in determining consonant
duration are: place in the foot; the weight of preceding syllable; and the
phonological length. In one token of the utterance tapaa nainen ulkona
'meet the woman outside', the four nasal portions had the following
durations respectively: 85ms, 35ms, 70ms, 60ms.10 The first one
counts as the phonetic exponent of a 'long' nasal, while the others are
'short'; however, it can be seen that there is a wide range of variability
in the measured durations. Clearly, there can be no simple phonetic
interpretation of the categories 'long' and 'short'; and any interpretation
10 cf. Flifilet (1971) who in discussing Finnish rhythm notes that
consonants after long vowels are very short.
would have to make reference to position in the word, syllable, and
foot. See Local & Ogden (1994) for a description of a computationally
implemented method for generating consonant durations for English in a
declarative metrical framework.
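The point about the four nasal portions can be checked with simple arithmetic. The Python sketch below is illustrative only: it takes the four durations quoted above, labels them 'long' or 'short' as the text does, and compares the spread within the 'short' category with the margin separating 'long' from 'short'. The variable and function names are assumptions made for the example.

```python
# Nasal portions in one token of "tapaa nainen ulkona", in order of
# occurrence: (phonological category, measured duration in ms).
nasals = [("long", 85), ("short", 35), ("short", 70), ("short", 60)]


def durations(category):
    """Collect the measured durations (ms) for one phonological category."""
    return [ms for cat, ms in nasals if cat == category]


short = durations("short")
long_ = durations("long")

# Spread inside the 'short' category: longest short / shortest short.
short_spread = max(short) / min(short)

# Margin between categories: shortest long / longest short.
long_short_margin = min(long_) / max(short)

print("short range:", min(short), "to", max(short), "ms")
print("spread within 'short':", short_spread)
print("margin 'long' over longest 'short':", round(long_short_margin, 2))
```

In this token the 'short' nasals differ from one another by a factor of two, which is more than the factor separating the 'long' nasal from the longest 'short' one.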
Occasionally, my informants demonstrate a feature judged typical of
their dialect: after a short open syllable, and before a long vowel,
phonologically short consonants can be durationally long. This type of
lengthening depends purely on the metrical structure and plays no part
in morphosyntactic processes, unlike the well-known 'consonant
gradation'. This is not a feature of Standard Finnish, and is not reflected
in the orthography.
48.
men:e: uafia ?cas pat nigicus:ah
menee (mennee) vähän alas päin niin kuin sinä
goes down a bit like you...
49.
ei
ei mitään
nothing
3. Inter-word Junctions in Finnish.
This Section presents a Firthian Prosodic Analysis of inter-word
junctions in Finnish. Some of the phonetic facts described in Section 2
are taken account of by the analysis presented here, and more data is
presented to back up the analysis.
In Firthian Prosodic Analysis, syntagmatic relations can be
considered primary: one starts by considering how linguistic items are
put together. This avoids the need for assimilation rules (Sprigg 1957),
and may also avoid the need for deletion rules. The fundamental nature
of syntagmatic relations is expressed by Whitley (ms), below:
'You can't tell from your isolate form what the junctions
will be. You have to start from the junctions; you can't
work from the isolates and say x becomes y in certain
circumstances.'
Thus, for Whitley, citation forms ('isolates') do not provide the starting
point of the analysis; instead, she prefers to begin with items in
connection with one another. This is how the analysis of the Finnish
material in this section is conducted. The resulting statement is very
different from one which starts out with citation forms which have to be
altered to fit in with rules of word juncture. I will also show how at
least some of the observations made in the preceding section can be
taken into account.
In the analysis presented in this Section, I shall assume a structure
ω-π-ω, where ω stands for 'word', and π for a system of word
junctions. I shall then consider whether the terms of this prosodic
system can usefully be reused in the prosodic system of syllable joins
within words.
In all, there are six terms of the prosodic system of inter-word
junction in Finnish: n g h ʔ T C. As long as the stated structural
constraints are not violated, up to two prosodies of word junction may
operate at one place in structure; but every ω-π-ω structure must
contain at least one prosodic term. The term is largely (but not entirely)
determined by 'phonematic' structure, although lexical and
morphological structure also play a part. I shall consider each kind of
junction in turn, considering firstly its distribution (i.e. its
phonological status), and secondly its phonetic exponents. The term N
is used as a word-final phonematic unit whose exponents include
nasality; it is a term more delicate than C (which merely stands for any
term of the relevant C-system) and as delicate as P, which stands for a
subterm of the C-system and whose phonetic exponents at normal
tempo include complete oral closure.
The data in this Section have a different relevance from the data in
the preceding Section, and are consequently presented differently. In this
Section, the focus is more on the relations between the phonetics, the
phonology, and other levels of linguistic statement such as the
grammar. Therefore, impressionistic records annotated with the junction
prosodies in bold superscript are given, along with the generalised
partial phonological structure of which the phonetics is an exponent, a
brief account of the morphological structure of the items, and an
English gloss.
3.1 n
Distribution of n
n is found at the junction of two words where one word ends in -N and
the subsequent word begins with a C- whose exponents include
maintainable oral stricture (Catford 1988: 63) which involves the actual
physical contact of an active and passive articulator; ie. the exponents of
C- include [p t k m n s l r ʋ], but preclude [j h].
Exponents of n
n pieces are characterised by the same place of articulation across the
syllable ending and the syllable beginning. The presence of nasality
determines the presence of voicing, but nasality may terminate before
voicing. In the case of the exponents of the structure -N n P-, voicing
may extend into the closure portion which is one exponent of P-.
Nasality may occasionally extend into the syllable beginning and
combine with labiodentality or laterality.
Nasality is perhaps best regarded as the exponent of -N, but the
temporal extent of nasality may best be regarded as the exponent of n.
Note that what is accounted for by n is accounted for in other
analyses by rules of assimilation (eg. Karlsson 1982: 144). These rules
assume that the base form of the word ends in /n/: when a word with
final /n/ precedes a word with, eg., initial /p/, then the nasal
assimilates. Such assimilation rules are only necessary because the
starting point of the analysis is citation form words; these forms are
dealt with under T below. Furthermore, these analyses do not account
for the range of variability in the exponents of pieces of the structure
-N v-, where the exponents of v are labiodentality and approximation
(cf. Section 2.5).
Examples:
50.
mu:tamou
kort:611m " pa:fianh Th
CN n PN n PN th
(several+gen block+gen top+ill)
down a few blocks
° kaufiistu:
51.
CN n PV
(woman+nom, is terrified+3ps)
the woman is terrified
52.
mentava
tokais113
goti:n
CV CN n PN
(go+pass+pres. part back home+ill)
has to go back home
53.
oven
n3a1i:n
VN " CN ti
(door+gen through+ill)
through the door
54.
an 12 osta;
kit3 " gel:On
CV VV CN n PN
(3ps+nom buy+3ps clitic clock+gen)
and he buys the clock
3.2 T
Distribution of T
T occurs in several structures: (i) Wherever the first part of a junction is
any term of the final -C system except -N. (ii) When any -C term
(including N) is utterance-final. (iii) In the structure -N T C-, where
the exponents of C- include a non-maintainable stricture, or no stricture
(ie. [j h]). (iv) In the structure -N T V-.
In the recorded material, there are stretches identified as words with
final consonantal portions [s t n]; this list may not be exhaustive,
since in theory, [1 r] could also occur word-finally.11 Therefore no
conclusive statement about the overall system of syllable (or word) final
terms is made here.
11 Finnish dictionaries list items such as askel, 'step', manner 'mainland'.
Exponents of T
The exponent of T is the apical articulation of the exponent of the word-final C-term.
Examples
55. joke. C Og
hybIn '12 Ilan& n tatxras:a:n
CV C VN n PN CN VN n PN
(rel. pron.+nom. sg be+3ps+clitic very happy+nom
meet+inf+iness+3pers. poss)
who is also very happy to meet (when she meets)
56. 7 2ulos T tapa:ma:n
2y,st?aua:nsa g
? VC CN tr? V -?-V g
(out meet+inf+ill friend+part+3pers poss)
out to meet her friend
57. fianeg ° kat:el:es:6:n ° talostd
pois T pain '
C-101 PN PV CC CN
(3ps+gen walk+inf+iness+3pers poss house+elat away direction)
as she walks away from the house
3.3 ʔ
Distribution of ʔ
ʔ is found in two main structures: (i) when the second of two words is
V-initial and the two words are not in what might be loosely called
'close grammatical contact' (see under C, Section 3.5 below), ie. in
structures -C ʔ V- and -V ʔ V-; (ii) word-internally, where it
frequently seems to be associated with resonant portions of long
duration, such as long voiced lateral approximant portions, diphthongs
or long vowels.
It should be pointed out that Itkonen (1965) shows that this type of
word join is common only in the Savo dialects; and therefore the
statement presented here, while accounting for my informants' speech,
may not apply more generally in Finnish.
Exponents of ʔ
The exponents of ʔ include creaky voice. Creaky voice is timed in
interesting ways with other phonetic parameters. Usually, the creaky
voice coincides with changes in the vocal tract, so that any vowel
transitions at the join between two words are, so to say, 'covered' by
the creaky voice. This is the most usual pattern in stretches which
expone -V ʔ V- structures. In stretches which are the exponents of
-N ʔ V- structures, where the exponents of -N include nasality, the
creaky voice is generally timed to coincide with the closing of the
velum and the ending of nasal airflow. It may however also be timed so
that a small amount of creaky voice and nasal airflow overlap; but when
the creak comes to an end, nasality is not present.
Another feature of periods of creaky voice is that they often mark
areas where the pitch changes. It is not uncommon to find creaky voice
between a stretch that ends with a low pitch, and followed by one which
begins with a high pitch.
For reasons which remain unclear, diphthongs and long vocalic or
resonant consonantal portions are all susceptible to creaky voice. In the
case of diphthongs, the creak tends to start at the end of the steady state
portion of the initial part of a diphthong. Otherwise, creak is timed to
start coincidental with the onset of the resonant portion. It may be true
to say that creaky voice is a sort of masking technique: a way to cover
up transitions from one state to another. It remains unclear what
function (if any) creaky voice may have word-internally. It could be that
there is just a conventional phonetic association in Finnish between
resonant articulations, the exponents of length, and creaky voice.
The duration of creaky voice is anything between 20 and 160 ms.
These are extremes, however. It is most usual in the material collected
to find creaky voice with a duration of approximately 60 ms (±20 ms).
Sometimes the glottal constriction is so tight as to produce periods of
complete glottal closure; these are generally released into creaky voice.
Therefore, it would be inaccurate to describe these portions as `long
glottal stops' (cf. Itkonen 1965). Portions such as these are generally
associated with creaky voice of greater duration.
Examples:
58.
isu:
nainen
CN t? VV
(woman+nom sit+3ps)
a woman is sitting
59.
jal:e:n n takat?
CN n
h
PN tit VV
h
(again fireplace+gen edge+iness)
back by the fire again
60.
h xaunis "r? u:st C matzo
h CC VV C CV
(beautiful+nom new+nom rug+nom)
a lovely new rug
61.
? ?ulos 1 tapa:ma:n
?ist?aansa g
? VC CN r? V-7-V
(out meet+inf+ill friend+part+3pers poss)
out to meet her friend
62.
purk7autua
C-7-1/
(come undone+inf)
to come undone
3.4 g
Distribution of g
g occurs at the junction of certain morphological items with other
words. Itkonen (1965) lists nine structural places where g occurs, of
which the most important are: negative present tense forms; 2ps
imperatives; first infinitive; most nouns which end in [-e]; the third
person personal suffix (singular and plural), which has the phonetic
exponents [nsa, nsa] and adverbs marked with the suffix whose
exponents are [sti]. In all these cases, g is a property of the end of the
named elements of structure. The vast majority of Finnish words that
end in [e] are joined to the next word with g.
In the data collected, there are relatively few instances of structures
where g applies. There are one or two instances of negatives, and a few
instances of 3rd person personal suffixes with the exponents [nsa, nsa].
It seems reasonable from the available data to conclude that g only
occurs in structures with the general shape -V g C-, where C- stands
for a C-term whose exponents include oral stricture. Most studies of
`gemination' in Finnish include the possibility of the structure
-V g V-, but the cases of this in my data have exponents which are
not distinguishable from the exponents of the structure -V ʔ V-; since
it simplifies the statement of exponents and is within the terms of the
Principle of Reusability, I treat all the examples of potential -V g V-
as the structure -V ʔ V-.
Exponents of g
The exponents of g include the prolonged duration of the closure phase
for the succeeding consonant, where 'closure' means any consonantal
stricture. Articulations which could be described as more tense are also
frequently found as exponents of g pieces. For instance, short [ʋ], a
labiodental approximant, is found as the exponent of a C-term which
only occurs initially in the syllable; but the same C-term in
conjunction with g may have the exponent [v:], with a closer stricture
as well as greater duration. Plosive bursts in g pieces are also frequently
sharper than in non-g pieces.
Examples
63.
(pp all fiatiel. n all pp) kiSltirlSa g k:arso:
(3ps+gen cat+3pers. poss look+3ps)
her cat is watching
64.
naine"
sa:
li:nansa g v:almi:ksi
GNneCAfge-11
(woman+nom get+3ps scarf+3pers. poss ready +transl)
the woman finishes her scarf
65.
mut:a
fiant ei
fiuoma: g k:a:n r et:5 2
C VAC NtV VCC VgC INI1V V?
(but 3ps not+3ps notice+pres emphatic clitic comp)
but she doesn't even notice that
66.
qi:doin t fie g p:a:scuat T koti
h
CNtCVgCCICV;VVh
(finally 3ppl arrive+3ppl home+door+all)
finally they get to the front door
Descriptions of Finnish phonetics (eg. Itkonen 1965) frequently describe
long glottal plosives as the exponent of the join between two words
where one ends in a vowel and the next starts with a vowel, and where
the first word is joined to consonant-initial words with greater duration
of the initial consonant. This would lead us in the terms of the present
analysis to posit the structure -V g? V- to complement the structure
-V g C-. Greater duration would be allotted as the exponent of g, and
the glottal stricture as the exponent of ?. However, in the few cases in
the material where such a structure might apply, it seems not to. The
phonetics of such potential structures is indistinguishable from the
phonetics of the structure -V ? V-, and therefore I have chosen to state
the distribution of g in terms of the structure -V g C- only. For
example, in the stretch
g Lola:luso
67.
?a:res:a
g PV g? VV
(fireplace+3pers poss edge+iness)
by her fire
the stretch of creaky voice lasts approximately 85ms. We may expect to
find the exponents of g in this stretch of phonetics, since we find
greater duration in other places where the third person possessive suffix
precedes another word. However, in the stretch
kaula
68.
lima?'
?alka:
CV CV vV
(neck+scarf+nom start+3ps)
the scarf starts
the period of creaky voice lasts approximately 160ms. This is
almost twice as long as the duration of the stretch of glottal constriction
in the example which potentially has g, but this is counterintuitive.
The long duration could also not justifiably be said to be the exponent
of g, since g is not otherwise used to put together the noun liina with
some other word, nor any other pair of words, except where the first one
ends in [-e]. It may also be fair to say that the material collected here is
so small that no firm conclusions can be drawn from it.
3.5 C
Distribution of C
C occurs in all cases where the structure of the junction is -V C-. This
is the commonest junction in Finnish, since most words end with V
and most words begin with C (Wiik 1977). The most commonly found
inter-word structure is -V C C-.
C is also found in those -V V- structures to which ʔ does not
apply: between words which are in what we might characterise as 'close
contact'. This includes junctions with function words such as mutta,
but, ja, and; the combination of sanoa, to say, + että, the
complementiser; the negative verb; the verb olla, to be; and also
between two items in a compound word where the first of them is V-final,
and the second is V-initial. There is also a case in the data where
C is found between a verb and the reflexive use.
Exponents of C
The exponents of C include the presence of an open vocal tract
accompanied by voicing followed by either a consonantal stricture with
the same resonance as the subsequent part of the word or a vocalic
portion, in which case the junction between the two vowels is marked
by the absence of any glottal constriction, which is one exponent of ʔ.
A change in resonance between front and back or back and front is one
possible exponent of C, but is not criterial of C at word junctions.
Examples
69.
uierestA C jA C lam:it:ele: C takanCih h
CV C CV C CV C CV h
(side+elat and warm+3ps behind+ess)
...from the side and warms itself behind...
70.
2 7ystaual:e:n
u:t:a C fiienoa C kaulali:na:h h
2 V N VV C CV C CV h
(friend+all+3pers. poss new+part fine+part neck+scarf+part)
(to) her friend the fine new scarf
71.
mut:d S ystaua C fiuoma: C kin"'
CV C VV C CV C CN 't
(but friend+nom notice+3ps clitic)
but the friend notices as well
72.
ja C alka: C neuloq h
CV C VV C CV h
(and start+3ps knit+inf)
and starts knitting
koti coel:th h
73.
Cv C vv h
(home+door+all)
to the front door
3.6 h
Distribution of h
h is found finally and sometimes initially in the utterance. It marks
initiality and finality. Not all initials nor finals are marked with h.
Exponents of h
The exponents of h remain somewhat inconclusive. They involve
absence of regular vocal fold vibration (ie. presence of breathy voice,
creaky voice, whispery voice, or simply release of air through the vocal
tract). They may also involve relatively more open, laxer, articulations.
They may also involve the aspiration of plosives, and even slight
affrication.
Examples
74. ja C naineg " keit:a: ?
CV C CN pv
ystaughea kafiuith th
0_1s' n p_c
(and woman+nom cook+3ps friend+all+3pers. poss coffee+acc. pl)
and the woman makes her friend coffee
75. h sc C On:
hytitn " ty:pihistah h
h CV z V-I-N t CN n PV h
(3ps be+3ps very typical+part. sg.)
it's quite typical
76. h xaunis
u:sT
matzo
h CC t/ VV z CV
(lovely+nom new+nom rug+nom)
a lovely new rug
77. sulke:
uerfiots Th
CV z CC th
(shut+3ps curtain +nom.pl)
closes the curtains
3.7 The verb olla, to be
For the structure -C/V π V-, the usual term of π is ʔ. However,
when words are in what I loosely termed 'close grammatical contact',
they are more frequently joined by C. In this section, I shall consider in
more detail the phonetics of the verb olla, to be, which exhibits rather
complex word joins. This shows that the analysis presented in 3.1-6 is
partial, and points to the need for an even more refined statement than
the one given in this paper.
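The distributional statements of Sections 3.1-6 can be sketched as a small decision procedure. The Python function below is a simplified illustration, not the author's formalism: the name `junction_terms`, its parameters, and the coding of word edges as 'V', 'N' or 'C' are assumptions made for the example. Here 'T' stands for the junction of Section 3.2 and '?' for the glottal junction of Section 3.3. The utterance-edge term h is left out because it is optional, and `g_trigger` has to be supplied by hand, since g depends on morphology.

```python
# Initial consonants whose exponents include maintainable oral stricture
# (Section 3.1), and those that do not.
STRICTURE = {"p", "t", "k", "m", "n", "s", "l", "r", "ʋ"}
NO_STRICTURE = {"j", "h"}


def junction_terms(final, initial, close_contact=False, g_trigger=False):
    """Return the junction term(s) for one word boundary.

    final:   'V', 'N' (final nasal term) or 'C' (any other final C term)
    initial: 'V', a consonant symbol, or None if the word is utterance-final
    close_contact: True for 'close grammatical contact' (Section 3.5)
    g_trigger: True if the first word is one of the items that take g
    """
    if final not in {"V", "N", "C"}:
        raise ValueError("final must be 'V', 'N' or 'C'")
    known = STRICTURE | NO_STRICTURE | {"V"}
    if initial is not None and initial not in known:
        raise ValueError("unknown initial: %r" % (initial,))

    if initial is None:
        # Any final C term, including N, takes T utterance-finally.
        return ["T"] if final in {"N", "C"} else []

    if final == "V":
        if initial == "V":
            # -V V-: C in close contact, otherwise the glottal junction.
            return ["C"] if close_contact else ["?"]
        # -V C-: g only for the listed items, before oral stricture.
        if g_trigger and initial in STRICTURE:
            return ["g"]
        return ["C"]

    if final == "N" and initial in STRICTURE:
        return ["n"]

    # Remaining cases: -C before anything, or -N before [j h] or a vowel.
    terms = ["T"]
    if initial == "V" and not close_contact:
        terms.append("?")
    return terms


print(junction_terms("N", "k"))   # as in example 51
print(junction_terms("N", "V"))   # as in example 58
print(junction_terms("V", "V", close_contact=True))  # as in example 72
```

The sketch reproduces the structures given for examples 51, 58 and 72, but it does not capture the behaviour of the verb olla discussed in this section, which is exactly where the statement of 3.1-6 needs refining.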
Examples 78-80 show the verb olla linked with C:
78.
ja ze on
ja se on
and it is
79.
ni: (all ma Om all) pi:rtanyh
niin, minä (mä) olen (oon) piirtäny(t)
yes, I've drawn (it)
80.
ei ne kouT ?isoja o: na: kukal
ei ne kovi(n) isoja ole (oo) nää kukat
they're not very big, these flowers
There are in fact a variety of ways in which the verb olla or its parts
may be joined to the preceding items. One of the common frames in my
data is 'these are'. For this, the Standard form is nämä ovat. My
informants' productions typically resemble those at (81).
81(a) fp namYEt p)
81(b)
narnacrt p)
It can be seen that the initial part is always [nAm-]. Then there is an
open portion which has some labiality in it and is dark, though the
darkness may vary in its domain from the nasal portion to the end, or
not start till later in the second syllabic portion. It is difficult to know
how many syllables there are in these utterances; but it is certainly not
the four implied by the orthography. For the phrase 'they are
unemployed' my informants produced:
82.
fp he"13 pit:Yettomila
he ovat työttömiä
they are unemployed
where it can be seen that there is labiality, but the expected amount of
syllabicity is not present. A more extreme form of this lack of
syllabicity as a distinct exponent of the verb olla can be seen in
examples such as:
83.
afimdtihtin 7iso mafia
ahmatilla on iso maha
the greedy person has a big stomach
84.
keli1.0 koloan pu:talhOs:0
ketun kolo on puutarhassa
the fox's den is in the garden
In these cases, greater duration of the word-final vowels of the items
just before the verb followed by nasality seems to be doing the work of
the third person singular form of olla. In many instances, then, the verb
olla seems to behave almost as if it were a clitic, and forms a special
piece with the preceding item in the sequence of the speech. Much of
the phonetics typical of other items with apparently similar
phonological structure (i.e. -V V- pieces) is not to be found, and much
of the phonetics of this verb is unlike that which is to be found with
other verbs.
Frames such as nämä ovat and pieces where the items before the
verb olla end in anything other than complete oral closure are
commonly marked as 'lax' in my records: they tend to be articulated
quickly, with less close stricture, more breathiness, and with unclearly
differentiated syllables (i.e. it is often hard to say how many syllables
one hears). They are often also quieter. Perhaps surprisingly, when the
item before the verb ends in a consonant with complete oral stricture
(with or without nasality as well), this portion of complete closure can
be long before the verb olla:
85.
han: on tYfftein
hän on työtön
s/he is unemployed
86.
atimatit: oust keit:jos:a
ahmatit ovat keittiössä
the greedy people are in the kitchen
In these cases, the way in which the word before the verb olla and the
verb itself are joined phonetically is different from what is described
above. Rather than having a juncture where material seems to go
missing, here the juncture seems to be marked by 'more' material, i.e.
greater duration. This could be treated as an exponent of g; however, it
is the final consonant of the first item which is long, whereas in other
cases where g joins words, such as the imperative, it is the initial
consonant of the second item which is long.
Itkonen (1965: 248-265) discusses both these kinds of word join
across the Savo dialect area, and notes that in his data most examples of
-C C V- (cf. exx. 78-80) involve the verb olla and the negative verb ei.
Itkonen observes that this junction can only occur with 'close-knit
compounds'. He also notes the junctions with long consonantal
portions, and claims that they contain two distinct intensity peaks,
something which I did not observe with my informants. They are also
rare in his material. While no clear conclusions can be drawn, it does
seem clear that not all items can be handled in the same way in any
complete analysis of Finnish word joins.
3.8 Spectrograms of examples of inter-word junctions
Figures 7-10 below show spectrograms of some of the utterances
described in the previous section. The relevant details are commented on
in conjunction with the appropriate spectrogram. The spectrograms are
provided to show that phonetic exponency can be made accountable to
more than one kind of phonetic description.
Fig. 7: Spectrogram of 'Nainen keittää ystävälleen kahvit'.
Note that the temporal extent of voicing between the nasal and plosive
portions is different at (1) and (3) in Fig. 7 above; this provides good
evidence that temporal information is properly part of the phonetic
exponency. The period of creaky phonation around (2) lasts
approximately 130ms; this is approximately twice as long as other
stretches of creaky voice in Figs. 34-36, yet there is no motivation for
saying that the duration of this portion of creak is an exponent of g?
rather than just ?. Note that the final plosive burst is rather diffuse,
aspirated, and does not have such a well-defined burst as at (1) and (3);
this lax articulation is an exponent of h. The structure of the whole
utterance, then, is CN n ? V N n F_C h.
Fig. 8: Spectrogram of 'Kello yöpöydällään'.
In Fig. 8, note the creaky voice at (1), which extends for about 60ms.
Note also that the formant transitions are timed to coincide with this
stretch of creak, so that the non-creaky portions before and afterwards
contain more or less steady state formants. At (2) are the exponents of
C, a voiced vocalic portion followed by a portion with consonantal
stricture. Note how at (3) the creaky voice is timed to coincide exactly
with the release of lateral airflow, thus masking any formant transitions
out of the lateral. It remains unclear why creaky voice should associate
with stretches such as long vowels. The phonological structure for this
utterance is CV VV C-7-N, since the word yöpöytä is a
compound noun, yö 'night' + pöytä 'table'.
Fig. 9: Spectrogram of 'Kaunis uusi matto'.
Fig. 9 shows the spectrogram for kaunis uusi matto, 'a lovely new
carpet'. In this case, attention is drawn to the lax articulation of the
initial voiceless portion, which has a sudden onset, but lacks a clearly-
defined burst, at (1); this is taken to be an exponent of h. Note that at
(2) the exponents of ? are evident, and that the creaky voice is timed to
coincide with the transitions from the preceding consonantal
constriction into the vocalic portion at the beginning of the second
word. At (3) the exponents of C are again evident from the unmarked
transition from the vocalic portion at the end of one word and the
consonantal portion at the start of the next. The structure of this phrase
is h CC VV CV.
Fig. 10: Spectrogram of 'Hän ostaakin kellon jota...'.
Fig. 10 shows the spectrogram of the phrase hän ostaakin kellon jota...
'and he buys the clock which...'. Note at (1) the exponents of ?; in this
case the creak lasts for about 50ms. Note how again the creaky voice is
timed to coincide with the offset of the consonantal articulation and thus
covers the portion of the acoustic signal which exhibits the greatest
amount of formant transitions. The portions at (2) and (3) can be
usefully compared, since both show velar closure followed by a plosive
release. At (2) the closure is clearly unvoiced, and the structure is V
C P, since the two words are in close grammatical contact (verb +
clitic). At (3) on the other hand, there is obvious voicing in the closure
portion; this can be treated as an exponent of n. The overall structure of
the phrase then is CN VV C PN n PC T C-V.
3.9 Summary
Tables 1 and 2 present (i) the structures found in inter-word position,
and (ii) the statement of exponents in broad terms of the inter-word
prosodies.
Word-Final          Inter-word Prosody               Word-Initial
-N                  n                                C- (C-' = [ptkinnslro))
-C or -V            C when in close grammatical      V-
                    contact; ? otherwise
-C                  T                                C-, V-, or utterance final
-V                  g when morphology demands it;    C-
                    C otherwise
-C or -V            h                                utterance-final
utterance-initial   h                                C- or V-
Table 1: Summary of the inter-word structures.
More than one statement above may apply, and two prosodies of inter-
word junction may be combined; the structures -C T ? V- and
-C1h# are possible, and do not contradict the above statements.
n   sameness of place of articulation of exponents of -N and C-.
?   creaky voice timed to coincide with changes in the vocal tract.
C   vocalic articulation followed either by a consonantal articulation
    (in 71C- structures) or by a vocalic articulation with no intervening
    glottal constriction (in It V- structures).
h   voicelessness, creaky voice, breathy voice or exhalation; laxer and
    more open consonantal articulations.
T   apical articulation of -C.
g   long duration of C-.
Table 2: Summary of the broad exponents of the inter-word prosodies.
4. Conclusion
This paper has shown how a phonological statement can be made which
takes into consideration phonetic characteristics which in most
phonologies are considered irrelevant. Some of its important
characteristics are:
1. A parametric phonetic statement is made in either acoustic or
articulatory phonetic terms.
2. The phonological statement is made in phonological terms,
which are abstract in the sense that they have no implicit phonetics.
3. The two levels of phonetics and phonology are connected by
statements of phonetic exponency. These exponency statements need
not be simple, in the sense that they may refer to more than one
phonetic parameter (cf. Ogden 1995a).
4. The exponency statements account for what might be
characterised as 'fine phonetic detail'. The resulting analysis is therefore
based on, and accountable to, observed phonetic detail, some of which
would be deemed irrelevant if an analysis were used which were based on
a phoneme concept, or which could only produce a broad phonetic level
of description, such as most current work in generative phonology.
5. The phonological statement presented describes in declarative,
non-process terms features of Finnish which are otherwise typically
regarded as processes of assimilation, or the output of a series of rules;
or ignored altogether.
6. The phonological statement makes reference to other levels of
linguistic statement such as the morphosyntactic and interactional
levels. Thus there is integration of different levels of linguistic
statement.
REFERENCES
Catford, J. C. (1988). A Practical Introduction to Phonetics. Oxford:
Clarendon Press.
Docherty, G., Foulkes, P., Milroy, J., Milroy, L. (1995). What is a
phonological fact? The relationship between theory and data.
Newcastle University, ms.
Engstrand, O. & Krull, D. (1994). Durational correlates of quantity in
Swedish, Finnish and Estonian: cross-language evidence for a theory
of adaptive dispersion. Phonetica 51. 80-91.
Fliflet, A. L. (1971). Une évaluation structurale du système syllabique
finnois. In L.L. Hammerich, Roman Jakobson, Eberhard Zwirner
(eds.). Form and Substance. Odense: Akademisk Forlag. 193-203.
Hawkins, S. & Slater, A. (1994). Spread of CV and V-to-V coarticulation in
British English: implications for the intelligibility of synthetic
speech. Proc. ICSLP-94 1. 57-60.
Hawkins, S (1995). Arguments for a nonsegmental view of speech
perception. Proc ICPhS 3.18-25.
Henderson, E.J.A. (1971). The Indispensable Foundation. A Selection from
the writings of Henry Sweet. London: Oxford University Press.
Itkonen, T. (1965). Proto-Finnic final consonants. Their history in the
Finnic Languages with particular reference to Finnish Dialects.
Helsinki: SKS.
Karlsson, F, (1971). Finskans rotmorfemstruktur: en generativ
beskrivning. Turun Yliopiston Fonetiikan Laitoksen Julkaisuja 10.
[The structure of Finnish stems: a generative description.]
Karlsson, F. (1982). Suomen kielen äänne- ja muotorakenne. Juva: WSOY.
[The phonology and morphology of the Finnish language.]
Kettunen, L. (1981). Suomen murteet: Murrekartasto. Helsinki: SKS.
[Finnish dialects: dialect atlas.]
Lahti, L.-L. (1981). On Finnish plosives. Lund working papers in
Linguistics and Phonetics 21. 89-93.
Lehiste, I. (1965). Juncture. Proceedings of the Fifth International Congress
of the Phonetic Sciences, Basel/New York: S. Karger, 172-200.
Local, J & Ogden, R. A. (1994). A model of timing for non-segmental
phonological structure. Proceedings of ESCA /IEEE workshop on
Speech Synthesis. 236-239.
Manuel, S. Y, Shattuck-Hufnagel, S., Huffman, M., Stevens, K.N.,
Carlsson, R. & Hunnicutt, R. (1992). Studies of vowel and consonant
reduction. In Proceedings of the International Conference on Speech
and Language Processing. Vol. 2. 943-946.
Ogden, R. A. (1995a). An exploration of phonetic exponency in Firthian
Prosodic Analysis: form and substance in Finnish phonology.
D.Phil., University of York.
Ogden, R. A. (1995b). Where is timing? A response to Caroline Smith.
Paper presented at LabPhon 4, Oxford. To appear in Amalia Arvaniti
& Bruce Connell (eds) Papers in Laboratory Phonology 4.
Cambridge: CUP.
Palomaa, J. K. (1946). Suomen kielen äännekestoista puhumaan oppineen
kuuromykän ja kuulevan henkilön ääntämisessä. Publicationes
Instituti Phonetici Universitatis Helsingiensis. [On the durations of
Finnish sounds in the pronunciation of a deaf mute who has been
taught to speak and a hearing person.]
Rapola, M. (1966). Suomen kielen äännehistorian luennot. Helsinki: SKS.
[Lectures on the history of the Finnish language.]
Sovijärvi, A. (1938). Die gehaltenen, geflüsterten und gesungenen Vokale
der finnischen Sprache. Helsinki: SKS.
Sovijärvi, A. (1957). The Finno-Ugrian Languages. In L. Kaiser: A Manual
of Phonetics. Amsterdam: North-Holland. 312-324.
Sprigg, R. K. (1957). Junction in spoken Burmese. SLA. 104-138.
Suomi , K. (1980). Voicing in English and Finnish stops. A typological
comparison with an interlanguage study of the two languages in
contact. Turun yliopiston suomen laitoksen julkaisuja 10.
Suomi, K. (1985a). Yleissuomen /i//j/ ja /u//v/-oppositioiden
rakenteellisista perusteluista. Virittäjä 89, 48-56. [On the structural
motivations of the Finnish /i//j/ and /u//v/ oppositions.]
Whitley, E. (ms). Phonetic Analysis and Phonological Statement. Notes
from a lecture course entitled Phonetic Analysis and Phonological
Statement, taken by Eugenie Henderson. York: Firthian Prosodic
Archive.
Wiik, K. & Lehiste, I. (1968). Vowel quantity in Finnish Dissyllabic words.
Congressus Secundus Internationalis Fenno-Ugristarum, I, Helsinki:
Suomalais-ugrilainen Seura. 569-574.
Wiik, K. (1965). Finnish and English Vowels. Turun Yliopiston fonetiikan
laitoksen julkaisuja 94. Turku.
Wiik, K. (1975). On vowel duration in Finnish dialects. Congressus Tertius
Internationalis Fenno-Ugristarum, I. Tallinn: Valgus. 415-424.
Wiik, K. (1977). Suomen tavuista. Virittäjä 81, 265-278. [On Finnish
syllables.]
Wiik, K. (1981). Fonetiikan perusteet. Juva: WSOY. [The foundations of
phonetics.]
Zsiga, E. C. (1994). Acoustic evidence for gestural overlap in stop
sequences. Journal of Phonetics 22. 121-140.
OLD ENGLISH VERB-COMPLEMENT WORD ORDER
AND THE CHANGE FROM OV TO VO*
Susan Pintzuk
Department of Language and Linguistic Science
University of York
1. Introduction
The change from object-verb (OV) word order to verb-object (VO) word
order is one of the most striking changes in the history of the English
language. According to most generative accounts, Old English is an
OV language, with optional rules of postposition and some form of the
verb-second (V2) constraint. Modern English, of course, is a VO
language and exhibits only remnants of V2.1 The change from OV to
VO is usually described as an abrupt grammatical reanalysis occurring
at the end of the Old English period.2
This paper offers an alternative account of Old English
verb-complement word order and the change from OV to VO. Evidence
is provided that the change does not involve abrupt reanalysis but rather
synchronic competition between two grammars, which begins in the
Old English period and continues during the Middle English period.
* The original version of this paper was presented at the Eighth
International Conference on English Historical Linguistics in Edinburgh,
Scotland, 19-23 September 1994. Thanks are due to two anonymous
reviewers for suggestions and comments. Author's e-mail:
sp20@york.ac.uk.
1 For example, Modern English shows residual V2 effects in questions and
in clauses with preposed negative polarity items:
(i) What should I do?
(ii) Never had I seen such a sight.
2 There are three stages in the history of English: Old English (700-1100),
Middle English (1100-1500), and Modern English (1500-present).
York Papers in Linguistics 17 (1996) 241-264
Susan Pintzuk
The paper is organized as follows. Section 2 presents background
assumptions and terminology. Section 3 describes in more detail the
standard analysis of Old English and the change from OV to VO.
Section 4 presents three predictions of the standard analysis and shows
that they are not fulfilled. And Section 5 proposes an analysis of
grammatical competition to account for the variation in
verb-complement word order during the Old and Middle English periods.
The proposed analysis is based upon an investigation of data
collected from sixteen Old English texts; for sampling techniques and
information about the texts included in the database, see Appendix B of
Pintzuk (1993). Old English texts are cited according to the system
specified in Mitchell, Ball, and Cameron (1975, 1979); the
abbreviations used are listed in the Appendix.
2. Background assumptions and terminology
The analyses presented in this paper use a generative approach to
describe syntactic structure and word order, the Principles and
Parameters framework outlined in Chomsky (1981, 1986) and related
work. In particular, it is assumed that the base component of the
grammar generates underlying structure and word order that are modified
by syntactic movement, deriving surface structure and word order; both
structure and movement are constrained by universal principles. The
differences between languages, and between different stages of the same
language, are described in terms of parameters; for example, one
difference between Modern German and Modern English is the setting of
the parameter that determines the order of verbs and their complements.
For ease of exposition, I make the following three assumptions about
the syntax of Old English: (i) there are only two functional categories,
Infl and Comp; (ii) the underlying order of heads and their complements
can vary; and (iii) only finite verbs move from their underlying
position to functional heads. Nothing crucial rests on these
assumptions or on the choice of this particular framework: the
syntactic differences between OV and VO languages and grammars are
robust and can be expressed in any framework.
The term 'auxiliary verb' is used for expository convenience to
refer to those verbs that take infinitival or participial complements in
Old English.3 The terms 'verb raising' and 'verb projection raising' are
used to describe the permutation of auxiliary verbs and their infinitival
or participial complements in otherwise verb-final languages.4 The
term 'heavy constituent' is used for Old English PPs, non-pronominal
NPs, polysyllabic adverbs, and non-finite verbs, to distinguish them
from 'light constituents', i.e. pronouns, particles, and monosyllabic
adverbs.5 The terms 'OV' and 'VO' are used to refer to either
underlying or surface word order and structure; the use will be made
clear by the context. The term 'Infl-medial' is used for structures where
Infl, the head of IP, precedes its complement; the term 'Infl-final' is
used for structures where Infl follows its complement.
It is assumed that Old English is a V2 language, although the
precise formulation of the V2 constraint for Old English is still a
matter of some debate (see, for example, van Kemenade 1994, Pintzuk
1993); and that finite verbs obligatorily move to Infl to receive
inflection. Because leftward verb movement to a functional head can
distort the underlying word order in both main and subordinate clauses,
it is necessary to abstract away from this effect in order to focus upon
the order of verbs and their complements. The structural ambiguity is
illustrated below: clauses like (1a), with the finite main verb in
clause-medial position, can be derived either by leftward movement of
the verb, as in (1b), or by rightward movement of the post-verbal
constituent, as in (1c).
3 Allen 1975 shows that Old English does not have a separate word class of
auxiliary verbs. But see Warner 1993 for features of a subset of my Old
English auxiliaries that distinguish them from lexical verbs.
4 See den Besten and Edmondson 1983, Evers 1975, 1981, Haegeman 1994,
Haegeman and van Riemsdijk 1986, Kroch and Santorini 1991, among
others, for formal analyses of verb (projection) raising in Germanic
languages. No position is taken here on the derived structures of verb
raising and verb projection raising. These processes are grouped with
postposition in Section 3 simply on the basis of derived word order.
5 It is shown in Pintzuk 1994 that Old English pronouns and adverbs
behave differently from heavy constituents: they can be syntactic clitics,
moving leftward to attach to maximal projections and/or heads.
(1) a. þe god worhte þurh hine
       which God wrought through him
       '... which God wrought through him ...'
       (ÆLS 31.7)
    b. Leftward verb movement:
       þe god worhte_i þurh hine
    c. Rightward movement of the PP:
       þe god t_i worhte [PP þurh hine]_i
To avoid this ambiguity, the data that will be considered here
consist mainly of clauses with finite auxiliary verbs and non-finite
main verbs; in these clauses the position of the auxiliary verb may be
affected by V2, but the non-finite main verb remains in its
base-generated position.6
3. The standard analysis of Old English
In this section the standard analysis of Old English, as proposed or
assumed by van Kemenade (1987), Koopman (1990), Lightfoot (1991),
and Stockwell and Minkova (1991), among others, is considered in
more detail. According to this analysis, Old English has underlying
OV structure, some form of V2, and postposition rules moving various
constituents rightward beyond the main verb of the clause. All surface
word orders are derived from a uniform base by optional movement
rules, as illustrated in the examples below.7 In (2), the underlying and
surface order of the main verb and its complement are the same; in (3),
VO surface word order is derived from OV underlying word order by
postposition of the NP.
6 Higgins 1991 suggests that Old English infinitives may move to the Infl
position of the embedded non-finite clause; see Pintzuk 1991 for criticism
of this analysis.
7 Since the focus of this paper is the order of main verbs and their
complements, the traces of topics and verbs affected by V2 are not shown in
the examples.
(2) OV surface word order:
    he ne mæg his agene aberan
    he not may his own support
    'He may not support his own.'
    (CP 52.2)
(3) VO surface word order:
    þu hafast gecoren [NP þone wer]_i
    you have chosen the man
    'You have chosen the man.'
    (ApT 23.1)
There is strong evidence in favor of this analysis, which forms the
basis of most of the current work in Old English syntax within a
Principles and Parameters framework. Evidence for underlying OV
word order is provided by clauses in which main verbs follow their
complements and auxiliary verbs follow the main verbs, as in (4).
Evidence for the postposition of NPs and PPs and for verb (projection)
raising is provided by clauses in which the finite auxiliary is preceded
by two or more heavy constituents and followed by an NP, as in (5), a
PP, as in (6), a non-finite main verb, as in (7), or a projection of the
non-finite main verb, as in (8). Note that none of the clauses in (4)
through (8) can be analyzed as V2 clauses, since the finite auxiliary is
preceded by more than one heavy constituent.
(4) Evidence for underlying OV word order:
    him þær se gionga cyning þæs oferfæreldes forwiernan mehte
    him there the young king the crossing prevent could
    '... the young king could prevent him from crossing there.'
    (Or 44.19-20)
(5) Evidence for NP postposition:
    þæt ænig mon t_i atellan mæge [NP ealne þone demm]_i
    that any man relate can all the misery
    '... that any man can relate all the misery ...'
    (Or 52.6-7)
(6) Evidence for PP postposition:
    her Cenwalh t_i adrifen wæs [PP from Pendan cyninge]_i
    in-this-year Cenwalh driven-out was by Penda king
    'In this year, Cenwalh was driven out by King Penda.'
    (ChronA 26.19 (645))
(7) Evidence for verb raising:
    Wilfrid eac swilce of breotan ealonde t_i wes [V onsend]_i
    Wilfred also from Britain land was sent
    'Wilfred was also sent from Britain.'
    (Chad 162.27-164.28)
(8) Evidence for verb projection raising:
    hwær ænegu þeod æt oþerre t_i mehte [VP frið begietan]_i
    where any people from other might peace obtain
    '... where any people might obtain peace from another ...'
    (Or 31.14-15)
In anticipation of the discussion in Section 4.1, it should be
pointed out that an OV grammar with optional rules of V2 and
postposition is quite powerful and can derive many different surface
word orders, some in more than one way. Because both leftward
movement of the finite verb and rightward movement of NPs, PPs,
verbs, and verb projections are permitted, the main verb can precede or
follow its complement, and the auxiliary can precede or follow the main
verb. This is illustrated in (9), where S = subject, XP = NP/PP
complement, Aux = auxiliary verb, Vf = finite main verb, V =
non-finite main verb.
(9)    Surface word order          Derivation
    a. S XP Vf                     reflects underlying word order
    b. S Vf_i XP t_i               V2
    c. S t_i Vf XP_i               postposition
    d. S XP V Aux                  reflects underlying word order
    e. S XP t_i Aux V_i            verb raising
    f. S Aux_i XP V t_i            V2
    g. S t_i Aux [XP V]_i          verb projection raising
    h. S t_i V Aux XP_i            postposition
    i. S Aux_i t_j V t_i XP_j      V2 + postposition
    j. S t_j t_i Aux V_i XP_j      verb raising + postposition
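The point that some surface orders can be derived in more than one way can be checked mechanically. The following is only an illustrative sketch, not part of the analysis itself: it treats each clause in (9d-j) as a flat sequence of the labels S, XP, V and Aux, ignores traces, and models each movement rule as a simple reordering. The function names and the flat representation are assumptions made for the illustration.

```python
from collections import defaultdict

# Underlying OV order with a clause-final auxiliary, as in (9d).
UNDERLYING = ("S", "XP", "V", "Aux")

def v2(order):
    """Move the finite verb (here Aux) to the position right after S."""
    rest = [w for w in order if w != "Aux"]
    return tuple(rest[:1] + ["Aux"] + rest[1:])

def postposition(order):
    """Move the XP complement to clause-final position."""
    return tuple([w for w in order if w != "XP"] + ["XP"])

def verb_raising(order):
    """Move the non-finite verb V to the right of Aux."""
    rest = [w for w in order if w != "V"]
    i = rest.index("Aux")
    return tuple(rest[:i + 1] + ["V"] + rest[i + 1:])

def verb_projection_raising(order):
    """Move [XP V] as a unit to the right of Aux."""
    rest = [w for w in order if w not in ("XP", "V")]
    i = rest.index("Aux")
    return tuple(rest[:i + 1] + ["XP", "V"] + rest[i + 1:])

# Rule sequences corresponding to the labels (9d) through (9j).
DERIVATIONS = {
    "d": [],
    "e": [verb_raising],
    "f": [v2],
    "g": [verb_projection_raising],
    "h": [postposition],
    "i": [v2, postposition],
    "j": [verb_raising, postposition],
}

def derive(label):
    """Apply the rules for one derivation and return the surface string."""
    order = UNDERLYING
    for rule in DERIVATIONS[label]:
        order = rule(order)
    return " ".join(order)

def by_surface_order():
    """Group derivation labels by the surface order they produce."""
    groups = defaultdict(list)
    for label in DERIVATIONS:
        groups[derive(label)].append(label)
    return dict(groups)

if __name__ == "__main__":
    for surface, labels in by_surface_order().items():
        print(f"{surface:12} <- {', '.join(labels)}")
```

Under these simplifying assumptions, the grouping shows which surface strings are structurally ambiguous: any order listed with more than one label has more than one derivation from the same OV base.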
Given this analysis of Old English syntax, the following scenario
is invoked to describe the change from OV to VO. During the Old
English period, VO surface word order gradually increases in frequency
at the expense of OV. Toward the end of the period, when the surface
word order is overwhelmingly VO, language learners abduce a new
grammar with underlying VO structure and word order on the basis of
the VO primary linguistic data. During the transition period, when two
grammatical systems are in use by the two different generations of
speakers, clauses like (10a) are produced and understood under both the
old and the new grammars, but with different analyses: under the old
system, they are derived from OV structure by postposition, as shown
in (10b); under the new system, they are derived from VO structure with
no movement, as shown in (10c). One point deserves emphasis here.
To the linguist, (10a) is structurally ambiguous and can be derived from
one of two different underlying structures. But according to the abrupt
reanalysis view of syntactic change, children abduce either the old OV
grammar or the new VO grammar but not both, and the clause has a
single underlying word order within each system.
(10) a. þu hafast gecoren þone wer
        you have chosen the man
        'You have chosen the man.'
        (ApT 23.1)
     b. Old OV grammar with postposition:
        þu hafast t_i gecoren [NP þone wer]_i
     c. New VO grammar:
        þu hafast [VP gecoren þone wer]
The account presented above is both plausible and appealing. It
depicts a period of word order variation generated by a uniform
grammar, followed by the abrupt resetting of the parameter that controls
the underlying order of verbs and their complements. And it offers an
explanation for the change: the primary linguistic data used by children
for language acquisition have changed, and therefore the grammar that is
abduced differs in one or more parameter settings from the grammar of
the previous generation. Despite its plausibility and appeal, however,
it will be demonstrated in Section 4 that the predictions made by this
analysis are not correct, and therefore that the analysis cannot be
maintained.
4. Predictions of the standard analysis
The standard analysis of Old English and of the change from OV to VO
presented above makes three predictions that can be tested on historical
data. First, clauses unambiguously derived from the new VO grammar
are not used during the Old English period, before the change. Second,
clauses unambiguously derived from the old OV grammar are not used
during the Middle English period, after the change. And third, the
frequency of VO surface word order increases during the Old English
period, to reach near categorical status in the primary linguistic data
used by language learners. These three predictions are discussed in
Sections 4.1 through 4.3.
4.1. Prediction #1: no VO clauses in Old English
According to the first prediction made by the standard analysis, we will
not find Old English clauses that are unambiguously derived from the
new VO grammar. Contra this prediction, it will be demonstrated
below that clauses with underlying VO structure are used productively
during the Old English period.
Although (9) above illustrates that an OV grammar with optional
rules of V2 and postposition can derive many different surface word
orders, there is one clause type that constitutes evidence for underlying
VO word order. The relevant clauses are those with light constituents --
particles, pronominal objects, and monosyllabic adverbs. In Old
English clauses with auxiliary verbs, these constituents appear both
before and after the non-finite main verb, as shown in (11).
(11) a. Particle before the main verb:
        and hi næfre siððan ut-brecan ne magon
        and they never afterwards out-burst not may
        'And afterwards they may never burst out ...'
        (ÆCHom ii.174.3)
     b. Particle before the main verb:
        woldon hig utdragan
        and (they) would them out-drag
        '... and they would drag them out.'
        (ChronE 215.6 (1083))
     c. Particle after the main verb:
        he wolde adræfan ut anne æþeling
        he would drive out a prince
        '... he would drive out a prince ...'
        (ChronB (T) 82.18-19 (755))
However, the position of these constituents varies only in clauses
like (11b) and (11c), with the auxiliary verb before the main verb. In
clauses like (11a), with the auxiliary verb after the main verb, particles,
pronouns, and monosyllabic adverbs -- unlike heavier constituents --
invariably appear before rather than after the main verb. The
distribution is shown in Table 1.8
Table 1
Distribution of particles, pronouns, and monosyllabic adverbs
in Old English main clauses with auxiliary verbs
Clause Type        Before Main Verb    After Main Verb    Total
                   N      %            N      %
Main verb + aux    90     100.0%       0      0.0%        90
Aux + main verb    260    94.5%        15     5.5%        275
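The percentages in Table 1 follow directly from the raw counts. As a quick check, the short sketch below recomputes them; the dictionary layout and the function name are illustrative assumptions, while the counts are those given in the table.

```python
# Raw counts from Table 1: light constituents before/after the main verb.
COUNTS = {
    "Main verb + aux": {"before": 90, "after": 0},
    "Aux + main verb": {"before": 260, "after": 15},
}

def distribution(row):
    """Return (total, % before main verb, % after main verb), to one decimal."""
    total = row["before"] + row["after"]
    pct_before = round(100 * row["before"] / total, 1)
    pct_after = round(100 * row["after"] / total, 1)
    return total, pct_before, pct_after

if __name__ == "__main__":
    for clause_type, row in COUNTS.items():
        total, before, after = distribution(row)
        print(f"{clause_type}: N={total}, before={before}%, after={after}%")
```

The contrast that matters for the argument is the zero in the first row: post-verbal light constituents are absent when the auxiliary follows the main verb, and present only when it precedes.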
It is obvious from the order of the main verb and the auxiliary that
clauses like (11a) are OV in underlying structure, with the light
constituent base-generated in pre-verbal position. The fact that light
constituents never appear post-verbally in OV clauses indicates that
these constituents cannot be postposed, probably because of a heaviness
constraint on postposition.
But if particles, pronouns, and
monosyllabic adverbs do not postpose, then clauses like (11c) must be
derived from underlying VO structure, as shown in (12); and these
clauses therefore constitute evidence for the use of VO structure during
the Old English period.
8 The data for Table 1 consist of main clauses with particles from the
database of Hiltunen 1983, supplemented by main clauses with pronominal
objects and main clauses with monosyllabic adverbs.
(12) he wolde [VP adræfan ut anne æþeling]
     he would drive out a prince
The position of the other constituents in the 15 clauses with
post-verbal particles, pronouns, and monosyllabic adverbs lends further
support to this analysis. In 14 of the 15 clauses, the auxiliary and
main verb are adjacent, with all complements and adjuncts appearing
after the main verb, as in (11c) above. The remaining clause, given in
(13), has only an adverb between the auxiliary and the main verb.
(13) and man ne mihte swa ðeah macian hi healfe up
     and one not could nevertheless put them half up
     '... and nevertheless, one couldn't put half of them up.'
     (ÆLS 21.434)
It must be concluded that the first prediction of the standard
analysis is incorrect: VO structure is used productively, although
perhaps at a low frequency, during the Old English period, before the
change from OV to VO is supposed to have taken place.
4.2. Prediction #2: no OV clauses in Middle English
According to the second prediction made by the standard analysis, we
will not find clauses in Middle English that are unambiguously derived
from the old OV grammar.
Contra this prediction, it will be
demonstrated below that clauses with underlyingly OV structure are
used productively during the Middle English period.
A number of studies demonstrate that OV surface word order, at
least, is used in Middle English texts. Kroch and Taylor (1994)
examine the position of NP complements in subordinate clauses with
auxiliary verbs, where the order of the main verb and its complements
is not affected by verb movement to Infl, in Early Middle English prose
texts. In two West Midlands texts, they find a total of 23 out of 88
(26%) NPs in pre-verbal position between the auxiliary verb and the
non-finite main verb; in three Southeast texts, they find a total of 31
out of 108 (29%) NPs in pre-verbal position. Stockwell and Minkova
(1991), citing Morohovskiy (1980), state that in 7.6% of the 14th to
16th century London texts, the complement appears before the main
verb in clauses with auxiliary verbs. And Foster and van der Wurff
(1993, 1994) show that OV surface word order is used productively
throughout the Middle English period, although at a low frequency. Of
course, we can't be sure how OV surface word order is derived in Middle
English: it could reflect underlying structure and word order, as shown
in (14a), or else be derived from a VO base by leftward movement, as
shown in (14b) (see note 9).
(14) a. Underlying OV structure:
        S XP Vf
     b. Underlying VO structure with leftward movement:
        S XP_i Vf t_i
Clearly, the simple existence of clauses with OV surface word order
is not sufficient evidence for OV underlying structure. But one clause
type does provide evidence for OV structure in Middle English: clauses
with pre-verbal particles. Since particles do not scramble leftward,
pre-verbal particles directly reflect the underlying word order. As shown
in Figure 1 ( = Hiltunen 1983: 111, his Figure 2), particles appear
before the main verb at a low but significant frequency throughout the
Middle English period, in main clauses as well as in subordinate
clauses, indicating that OV structure is used in Middle English.
9 See Kroch and Taylor 1994 for speculations that the West Midlands dialect
is mainly VO in underlying structure, while the Southeast dialect exhibits
synchronic competition between OV and VO grammars.
Figure 1
Frequency of verb...particle word order in Early Old English (EOE),
Late Old English (LOE), Early Middle English (EME),
and Late Middle English (LME).
[Figure not reproduced: line graph; vertical axis, frequency from 0 to 100; horizontal axis, period (EOE, LOE, EME, LME); separate lines for main and subordinate clauses.]
It is interesting to note that the discourse function of OV surface
word order seems to be the same in Middle English as in Old English:
Foster and van der Wurff (1994) demonstrate that pre-verbal position in
Middle English is associated with inferable and evoked entities;
similarly, Linson (1993) shows that pre-verbal
position in Old English is associated with entities that have been
previously mentioned in the discourse.
It must be concluded that the second prediction of the standard
analysis is incorrect: OV structure is used productively, although
perhaps at a low frequency, throughout the Middle English period, after
the change from OV to VO is supposed to have occurred.
4.3. Prediction #3: increase in VO surface word order
According to the standard analysis, the frequency of VO surface word
order increased at the expense of OV surface word order during the Old
English period, until it became nearly categorical. This section
discusses the change in surface word order, the possible sources of the
VO increase, and the role that the increase may have played in the
change from OV to VO.
As a simple description of Old English word order, it is certainly
true that VO surface word order was more common at the end of the
period than in the earlier stages. Hiltunen (1983) shows that
verb-particle word order was used more frequently in Late Old English
than in Early Old English, both in main clauses and in subordinate
clauses (see Figure 1 above); and Bean (1983) shows that OV word
order decreased in frequency from the early to the late sections of the
Anglo-Saxon Chronicle.
However, given the analyses presented above, there are at least four
different ways to derive VO surface word order in Old English: (i) from
OV structure, by leftward movement of the finite main verb, as in
(15a); (ii) from OV structure, by postposition of the complement, as in
(15b); (iii) from OV structure, by a combination of verb movement and
postposition, as in (15c); and (iv) as a reflex of underlying VO
structure, as in (15d) and (15e).
(15) a. Verb movement:
        S Vf_i [VP XP t_i]
     b. Postposition:
        S [VP t_i Vf] XP_i
     c. Verb movement + postposition:
        S Aux_i [VP t_j V t_i] XP_j
     d. Underlying VO structure:
        S [VP Vf XP]
     e. Underlying VO structure:
        S Aux [VP V XP]
Researchers differ on the source of the increase in VO surface word
order during the Old English period. Most scholars (e.g. Aitchison
1979, Canale 1978, van Kemenade 1987, Stockwell 1977) attribute it
to an increase in the rate of postposition. Although the rate of
postposition over time has not been measured for Old English,
Santorini (1993) looked at the rates of NP and PP postposition in the
history of Yiddish, a language that has undergone syntactic changes
similar to those of English: in particular, Yiddish changed from Infl-final to
Infl-medial and from OV to VO. Santorini found that while the rate of
postposition in structurally unambiguous clauses is highly variable
from text to text, it does not increase over time. The data are shown in
Table 2 below ( = Santorini 1993: 275, Table 5). It is reasonable to
conclude that the rate of postposition was not a factor in the OV to VO
change in Yiddish, and it remains to be demonstrated that an increase in
the rate of postposition played a role in the OV to VO change in the
history of English.
Table 2
Rates of NP and PP postposing in Yiddish

                      PP Postposing                  NP Postposing
                            Not                            Not
Time period   Postposed  Postposed  Rate     Postposed  Postposed  Rate
1400-1489         9         12      43%          1         12       8%
1490-1539        13         16      45%          7         19      27%
1540-1589        52         21      71%          7         24      23%
1590-1639        39         23      63%         10         40      20%
1640-1689        17         30      36%          4         19      17%
1690-1739         6          3      67%          1          5      17%
1740-1789         8          7      53%          1          2      33%
1790-1839         1          1      50%          0          1       0%
In fact Lightfoot (1991) states that there is no evidence for an
increase in the rate of postposition during the Old English period; he
suggests instead that the source of the increase in VO surface word order
in the primary linguistic data is an increase in the use of V2 in main
clauses. Lightfoot shows that indicators of OV structure (see note 10) are robust
in languages like Dutch and German, but weak or non-existent in Old
English. He suggests that an increase in VO surface word order derived
by V2, coupled with the absence of evidence for OV structure, triggers
the change from OV to VO.
In apparent support of Lightfoot's hypothesis, an increase in the
frequency of clauses with the finite verb in second position is well
documented: Pintzuk (1991), for example, demonstrates that for clauses
with auxiliary verbs, the frequency of V2 in both main and subordinate
clauses increases over the course of the Old English period (see note 11). But
while V2 derives VO surface word order in clauses with finite main
verbs and topicalized subjects, as in (16), it has no effect on the order of
verbs and their complements in clauses with topicalized objects, as in
(17), or in clauses with non-finite main verbs, as in (18).
(16) Philippus & Herodes todældun Lysiam
     Philip and Herod divided Lycia
     'Philip and Herod divided Lycia.'
     (ChronA 6.4 (12))
(17) Of Iotum comon Cantware & Wihtware
     From Jutes came people-of-Kent and people-of-Wight
     'From the Jutes came the people of Kent and the people of
     Wight.'
     (ChronA 12.13 (449))
10 Such indicators include (i) the clause-final position of separable
particles, negation, and sentential adverbs in main clauses with finite main
verbs, and (ii) the pre-verbal position of objects, separable particles,
negation, and sentential adverbs in main clauses with modal
verbs/perfective have and non-finite main verbs.
11 In Pintzuk 1991, 1993, IPs in Old English are either head-medial or
head-final, with obligatory movement of the finite verb to Infl; V2 is
analyzed as leftward movement to Infl in Infl-medial clauses. According to
this analysis, an increase in the frequency of V2 does not reflect an increase
in the use of an optional leftward movement rule, but rather an increase in
the use of an Infl-medial grammar.
(18) Swa sceal geong guma gode gewyrcean
     So shall young men good-things perform
     'Young men shall perform good deeds in this way.'
     (Beo 20)
Lightfoot cites Klein (1974) for evidence that Dutch language
learners pay attention to Dutch clauses analogous to (18), and Lightfoot
(1991: 62, 64) suggests that the order of object and verb in clauses like
(18) was accessible to Old English language learners. If the rate of
postposition remained constant during the Old English period, with the
frequency of clauses like (18) also remaining constant, it seems
plausible that these clauses could have been used as evidence for OV
structure by children learning Old English. With such a robust
indicator of OV structure still in existence at the end of the Old English
period, there is no clear support for the hypothesis that the increased
frequency of clauses like (16) could have triggered the change from OV
to VO.
We can see that although the frequency of VO surface word order
does increase during the Old English period, arguments that link this
increased frequency and the OV to VO change to an increase in the rate
of V2 and/or postposition are not convincing.
5. Synchronic competition between OV and VO grammars
Section 4 presented three types of evidence to contradict the standard
account of the change from OV to VO word order at the end of the Old
English period. First, clauses unambiguously derived from a VO
grammar are used productively during the Old English period, before the
change is supposed to have taken place.
Second, clauses
unambiguously derived from an OV grammar are used productively
during the Middle English period, after the change is supposed to have
taken place. And third, the increase in VO surface word order during the
Old English period and the trigger for change at the end of the period
cannot be directly linked to an increase in the rate of either postposition
rules or V2.
The evidence points to a different picture of the change from OV to
VO. Instead of a uniform grammatical system during the Old English
period, with word order variation derived by optional movement rules,
there are two competing grammars, one underlyingly OV, the other
underlyingly VO. The VO grammar emerges early in the Old English
period, and competes with the old OV grammar throughout the Old and
Middle English periods, until the old system dies out. Thus the
variation in surface word order in both Old and Middle English is at
least partially the result of the use of two different grammatical
systems, rather than one system with optional rules. And the increase
in VO surface word order is at least partially the result of an increase in
the use of the new VO grammar, rather than simply an increase in the
frequency of use of movement rules.
This analysis replicates the analysis of grammatical competition in
languages as diverse as Old French (Kroch 1989), Middle Spanish
(Fontana 1993), Old English (Pintzuk 1991, 1993), Middle English
(Kroch 1989), Early Yiddish (Santorini 1989, 1993), and Ancient Greek
(Taylor 1994). Changes of this type that have been analyzed
quantitatively follow an S-shaped curve, as shown in Figure 2: the
change starts slowly, accelerates in the middle of the period, and then
tapers off to completion.
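
The S-shaped trajectory described above can be illustrated with a logistic function, the curve commonly used in quantitative studies of this kind. The following is a minimal sketch only: the choice of the logistic form, the function name `logistic`, and the parameter values (rate 1, midpoint 0, abstract time units) are assumptions made for illustration and are not estimates from any data discussed here.

```python
import math

def logistic(t, rate=1.0, midpoint=0.0):
    """Proportion of the new form at time t under a logistic model.

    rate and midpoint are hypothetical parameters: rate controls how
    steep the curve is, midpoint is the time at which usage reaches 50%.
    """
    return 1.0 / (1.0 + math.exp(-rate * (t - midpoint)))

# Abstract time steps, not calendar years.
for t in (-4, -2, 0, 2, 4):
    print(t, round(logistic(t), 3))
```

The gain per time step is largest around the midpoint and small at either end, which is the slow start, rapid middle, and tapering finish described in the text.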
Figure 2
S-shaped curve of syntactic change

It should be pointed out that in apparent contradiction to this
analysis, many scholars (Gorrell 1895, Kellner 1892, Kohonen 1978,
Lightfoot 1991, Mitchell 1985, Stockwell and Minkova 1991) have
noticed an abrupt decrease in the frequency of verb-final word order in
subordinate clauses at the earliest stages of Middle English, an
observation that seems to refute the claim of competing grammars
during the Middle English period. But if the change in the underlying
order of verbs and their complements is a change of the type shown in
Figure 2 above, and if the accelerating middle section of the curve
coincides with the end of the Old English period, then a low frequency
of OV word order in the Middle English data is only to be expected.
Furthermore, it must be emphasized once again that surface word order
does not always reflect underlying structure, and that it is necessary to
abstract away from verb movement to study verb-complement word
order. If we assume that the change from Infl-final to Infl-medial
structure was complete early in the Middle English period (Pintzuk
1991), then subordinate clauses with finite main verbs will necessarily
exhibit VO surface word order, with the verb in clause-medial Infl
regardless of the underlying verb-complement word order. As discussed
in Section 4.2, in subordinate clauses with auxiliary verbs in Early
Middle English documents, Kroch and Taylor (1994) found 26%
pre-verbal NPs in West Midlands texts and 29% pre-verbal NPs in
Southeastern texts. These frequencies indicate that the order of verbs
and their complements in Early Middle English did not significantly
differ from the order in Old English, and that the grammars used by
speakers during the two stages were much more similar than has
previously been suggested.
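
The two Early Middle English frequencies cited above can be recomputed from the counts given in Section 4.2. This is a minimal sketch; the dictionary layout and the function name `preverbal_rate` are illustrative assumptions.

```python
# Counts from Kroch and Taylor (1994) as cited in Section 4.2:
# (pre-verbal NPs, total NPs) for each group of texts.
counts = {
    "West Midlands": (23, 88),
    "Southeast": (31, 108),
}

def preverbal_rate(preverbal, total):
    """Percentage of NPs in pre-verbal position, rounded to a whole number."""
    return round(100 * preverbal / total)

for dialect, (preverbal, total) in counts.items():
    print(f"{dialect}: {preverbal}/{total} = {preverbal_rate(preverbal, total)}%")
```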
APPENDIX
ABBREVIATIONS
ÆCHom  = Thorpe, Benjamin (ed.) (1844) The Homilies of the
         Anglo-Saxon Church. London: Ælfric Society. Reprinted
         1971, New York: Johnson. [volume.page.line]
ÆLS    = Skeat, Walter W. (ed.) (1881-1900) Ælfric's Lives of Saints.
         The Early English Text Society, Vols. 76, 82, 94, 114.
         London: Trübner. [life.line]
ApT    = Thorpe, Benjamin (ed.) (1834) The Anglo-Saxon Version of
         the Story of Apollonius of Tyre. London: John and Arthur
         Arch. [page.line]
Beo    = Klaeber, Fr. (ed.) (1950) Beowulf and the Fight at Finnsburg.
         Third Edition. Lexington, Mass.: D. C. Heath. [line]
Chad   = Vleeskruyer, Rudolf (ed.) (1953) The Life of St. Chad: An Old
         English Homily. Amsterdam: North-Holland. [page.line]
ChronA = Plummer, Charles (ed.) (1892) Two of the Saxon Chronicles
         Parallel. Oxford: Clarendon Press. [page.line (year)]
ChronB = Thorpe, Benjamin (ed.) (1861) The Anglo-Saxon Chronicle,
         According to the Several Original Authorities. London: Her
         Majesty's Stationery Office. Reprinted 1964, Kraus Reprint
         Ltd. [page.line (year)]
ChronE = Plummer, Charles (ed.) (1892) Two of the Saxon Chronicles
         Parallel. Oxford: Clarendon Press. [page.line (year)]
CP     = Sweet, Henry (ed.) (1871) King Alfred's West-Saxon Version
         of Gregory's Pastoral Care. The Early English Text Society,
         Vols. 45, 50. London: Trübner. [page.line]
Or     = Bately, Janet M. (ed.) (1980) The Old English Orosius. The
         Early English Text Society, SS, Vol. 6. London: Oxford
         University Press. [page.line]
REFERENCES
Aitchison, J. (1979) The order of word order change. Transactions of the
Philological Society 77.43-65.
Allen, Cynthia L. (1975) Old English modals. Papers in the History and
Structure of English, ed. by Jane Grimshaw, 89-100. (University of
Massachusetts Occasional Papers in Linguistics 1.)
Bean, Marian C. (1983) The Development of Word Order Patterns in Old
English. Totowa, NJ: Barnes and Noble.
Besten, Hans den and Jerold A. Edmondson (1983) The verbal complex in
continental West Germanic. On the Formal Syntax of the
Westgermania, 155-216, ed. by Werner Abraham. (Papers from the
3rd Groningen Grammar Talks, Groningen, January 1981.)
Amsterdam: John Benjamins.
Canale, William Michael (1978) Word Order Change in Old English: Base
Reanalysis in Generative Grammar. Montreal: McGill University
dissertation.
Chomsky, Noam (1981) Lectures on Government and Binding. (Studies in
Generative Grammar 9.) Dordrecht: Foris.
Chomsky, Noam (1986) Barriers. (Linguistic Inquiry Monograph 13.)
Cambridge, MA: The MIT Press.
Evers, Arnold (1975) The Transformational Cycle in Dutch and German.
Utrecht: University of Utrecht dissertation. Distributed by the
Indiana University Linguistics Club.
Evers, Arnold (1981) Two functional principles for the rule 'Move V'.
Groninger Arbeiten zur germanistischen Linguistik 19.96-110.
Fontana, Josep M. (1993) Phrase Structure and the Syntax of Clitics in the
History of Spanish. Philadelphia: University of Pennsylvania
dissertation.
Foster, Tony and Wim van der Wurff (1993) The survival of object-verb
order in Middle English: some data. Leiden: University of Leiden,
ms. Neophilologus, to appear.
Foster, Tony and Wim van der Wurff (1994) From syntax to discourse: the
function of object-verb order in Late Middle English. Leiden:
University of Leiden, ms. Paper presented at the International
Conference on Middle English Language, Rydzyna.
Gorrell, J. Hendren (1895) Indirect discourse in Anglo-Saxon. Publications
of the Modern Language Association 10.342-485.
Haegeman, Liliane (1994) Verb raising as verb projection raising: some
empirical problems. Linguistic Inquiry 25.509-21.
Haegeman, Liliane and Henk van Riemsdijk (1986) Verb projection raising,
scope and the typology of verb movement rules. Linguistic Inquiry
17.417-66.
Higgins, F. Roger (1991) The fronting of non-finite verbs in Old English.
Paper presented at the Annual Meeting of the Linguistic Society of
America, Chicago.
Hiltunen, Risto (1983) The Decline of the Prefixes and the Beginnings of
the English Phrasal Verb: The Evidence from Some Old and Early
Middle English Texts. Turku: Turun Yliopisto.
Kellner, Leon (1892) Historical Outlines of English Syntax. London:
Macmillan.
Kemenade, Ans van (1987) Syntactic Case and Morphological Case in the
History of English. Dordrecht: Foris.
Kemenade, Ans van (1994) V2 and embedded topicalization in Old and
Middle English. Free University Amsterdam/Holland Institute of
Generative Linguistics, ms. Inflection and Syntax in Language
Change, ed. by Ans van Kemenade and Nigel Vincent. Cambridge:
Cambridge University Press, to appear.
Klein, R. (1974) Word order: Dutch children and their mothers. Publikaties
van het Instituut voor Algemene Taalwetenschap 9. Amsterdam.
Kohonen, Viljo (1978) On the Development of English Word Order in
Religious Prose Around 1000 and 1200 A.D.: A Quantitative Study of
Word Order in Context. Meddelanden från Stiftelsens för Åbo
Akademi Forskningsinstitut, Nr. 38. Publications of the Research
Institute of the Åbo Akademi Foundation. Åbo: Åbo Akademi.
Koopman, Willem F. (1990) Word Order in Old English, with Special
Reference to the Verb Phrase. (Amsterdam Studies in Generative
Grammar 1.) Amsterdam: The Faculty of Arts.
Kroch, Anthony S. (1989) Reflexes of grammar in patterns of language
change. Language Variation and Change 1.199-244.
Kroch, Anthony S. and Beatrice Santorini (1991) The derived constituent
structure of the West Germanic verb-raising construction. Principles
and Parameters in Comparative Grammar, ed. by Robert Freidin,
269-338. Cambridge, MA: The MIT Press.
Kroch, Anthony S. and Ann Taylor (1994) Remarks on the XV/VX
alternation in Early Middle English. Paper presented at the Third
Diachronic Generative Syntax Workshop, Amsterdam.
Lightfoot, David W. (1991) How to Set Parameters: Arguments from
Language Change. Cambridge, MA: The MIT Press.
Linson, Brian (1993) A pragmatics of word order in Old English prose.
Philadelphia: University of Pennsylvania, ms.
Mitchell, Bruce (1985) Old English Syntax. Oxford: Clarendon.
Mitchell, Bruce, Christopher Ball, and Angus Cameron (1975) Short titles
of Old English texts. Anglo-Saxon England 4.207-21.
Mitchell, Bruce, Christopher Ball, and Angus Cameron (1979) Short titles
of Old English texts: Addenda and corrigenda. Anglo-Saxon England
8.331-3.
Morohovskiy, A.N. (1980) Slovo i predlozenie v istorii angliyskogo
yazika. Kiev: Visca skola.
Pintzuk, Susan (1991) Phrase Structures in Competition: Variation and
Change in Old English Word Order. Philadelphia: University of
Pennsylvania dissertation.
Pintzuk, Susan (1993) Verb seconding in Old English: verb movement to
Infl. The Linguistic Review 10.5-35.
Pintzuk, Susan (1994) Cliticization in Old English. Second Position Clitics
and Related Phenomena, ed. by Aaron L. Halpern and Arnold Zwicky.
Stanford, California: CSLI Press, to appear.
Santorini, Beatrice (1989) The Generalization of the Verb-Second
Constraint in the History of Yiddish. Philadelphia: University of
Pennsylvania dissertation.
Santorini, Beatrice (1993) The rate of phrase structure change in the history
of Yiddish. Language Variation and Change 5.257-83.
Stockwell, Robert P. (1977) Motivations for exbraciation in Old English.
Mechanisms of Syntactic Change, ed. by C. Li, 291-314. Austin:
University of Texas.
Stockwell, Robert P. and Donka Minkova (1991) Subordination and word
order change in the history of English. Historical English Syntax,
ed. by Dieter Kastovsky, 367-408. (Topics in English Linguistics
2.) New York: Mouton de Gruyter.
Taylor, Ann (1994) The change from SOV to SVO in Ancient Greek.
Language Variation and Change 6.1-37.
Warner, Anthony R. (1993) English Auxiliaries: Structure and History.
Cambridge: Cambridge University Press.
SITUATING QUE*
Bernadette Plunkett
Department of Language and Linguistic Science
University of York
The correct analysis of questions in French is of considerable
theoretical interest and much discussion has been devoted to them in the
literature on French syntax. One particularly intractable subset of these
are 'what' questions. There are various restrictions on these types of
questions which, though easy enough to describe, are difficult to explain
from a theoretical perspective. Of the numerous researchers who have
worked on this area (including Obenauer 1976, Goldsmith 1978,
Hirschbühler 1978, Koopman 1982, Friedemann 1991, Plunkett 1994)
two (Friedemann and Koopman) have explicitly argued that part of the
paradigm can be taken to show that certain question phrases are required
to undergo Wh Movement into the C projection in the overt syntax of
French, even though in other cases such movement can be left until
LF. We will see that this is perhaps true, but I will argue that the
obligatory movement in such cases can be attributed to independent
factors and cannot be taken as proof of a general ban on in situ wh-subjects.
In this paper I will redraw the lines around the problematic
paradigm and present a new analysis of it. I will then go on to discuss
the theoretical implications of the proposed approach.
I begin, in Section 1, by reviewing the relevant facts and
summarising the pertinent claims about que and quoi questions. In
Section 2 I lay out my assumptions about the working of Wh
Questions in general and in Section 3 I present the analysis. Section 4,
in which the theoretical implications are discussed, concludes the paper.
* This paper owes much to comments on a previous draft by David Adger
and Anthony Warner and to lengthy discussions with the former as well as
to discussion and judgements from Paul Hirschbühler, Marie-Anne Hintze
and Georges Tsoulas. Thanks also to Marie-Laure Masson and Farid Aft Si
Semi for judgements.
York Papers in Linguistics 17 (1996) 265-298
Bernadette Plunkett
1. French Questions: Some Restrictions on 'What'
French 'what' questions are special in several respects. Though the final
account will link these peculiarities, for the time being I will treat them
as separate issues, reviewing each of the restrictions in turn.
1.1 What is 'what'?
Generally speaking, surface Wh Movement is optional in direct
questions in French. Wh-words may either move to the front of the
sentence or stay in situ. A straightforward example of this can be seen
in (1).
(1) a. Qui aimes tu?
       who love you
       'Who do you love?'
    b. T(u) aimes qui?
The (b) case here can, but need not, be interpreted as an echo question.
The same variability can be seen in the long-distance questions in (2).
(2) a. Qui as tu dit que tu aimes?
       who have you said that you love
       'Who did you say you loved?'
    b. T(u) as dit que t(u) aimes qui?
In fact the two forms may belong to different registers but for most
speakers both are possible (see note 1).
1 Further variability is involved when questions with full noun phrase
subjects occur since different types of inversion are available after
movement, or indeed no inversion at all. As far as I can tell nothing I have
to say about 'what' questions impinges on an adequate account of these
different types and I will abstract away from these issues in what follows.
As can be seen, in the case of 'who' questions, the wh-word takes
the same form in moved and in situ questions. This is not the case in
'what' questions, as (3) shows.
(3) a. Que cherchez vous?
       what seek you
       'What are you looking for?'
    b. Vous cherchez quoi?
Not only are there two forms for the word 'what' but they are in
complementary distribution, as can be seen in (4).
(4) a. * Vous cherchez que?
         you seek what
    b. * Quoi cherchez vous?
         what seek you
This fact leads to the suggestion, adopted by most researchers in the
area, that they are variants of the same morpheme (but see Obenauer
1976, 1977 for a different view). On this view the two forms of the
word for 'what' may be seen as a weak unstressed form que and a tonic
form quoi. This view is supported by the fact that the variants are
similar to those found in other weak-strong pronominal pairs such as
te/toi, me/moi, se/soi. It is further supported by the fact that, just as
with those pairs, only the strong form appears inside PPs:
(5) a. Vous pensez à quoi?
       you think to what
       'What are you thinking about?'
    b. A quoi pensez vous?
    c. * Vous pensez à que?
    d. * A que pensez vous?
In addition, for most speakers que cannot be co-ordinated with
another wh-word. Thus (6a) and (6b) are parallel to (6d) where the coordination of weak subject pronouns is ruled out, while (6c) is perfect.
(6) a. ?? Qui ou que voulez-vous photographier?
who or what want you to photograph
'Who or what do you want to photograph?'
b. * Que ou qui voulez-vous photographier?
c. Qui ou quoi voulez-vous photographier?
d. * Tu et il voulez photographier quelqu'un.
'You and he want to photograph someone.'
The treatment of que as a weak form of quoi then is well-supported, but
as we will see below the precise characterisation of 'weak' pronouns is
somewhat problematic.
The alternative view of the alternation in (3) is the one put forward
by Obenauer in which que in fronted questions is treated as the finite
complementiser que while quoi is treated as a genuine wh-word. This
treatment parallels that of Kayne (1976) and others for the que which
appears in relative clauses. However, while accepting Kayne's analysis
for relative que, both Goldsmith (1978) and Hirschbühler (1978) review
and argue in detail against Obenauer's view of interrogative que. Their
arguments are convincing; for example, as Goldsmith (1978, 1981)
points out, simple inversion of a verb and a pronominal subject is
blocked by the presence of an overt complementiser, not only in
embedded clauses in French but in matrix clauses too in the cases where
a complementiser may appear in them.
(7) a. Peut-être qu'il est parti.
       perhaps that he is left
       'Perhaps he has left.'
    b. * Peut-être qu'est-il parti.
         perhaps that is he left
    c. Peut-être est-il parti.
       perhaps is-he left
Since this type of inversion does take place in interrogatives, as we
have seen in (1-3), the que there cannot be a complementiser unless just
in this case the verb is allowed to raise to C and adjoin to the right of
the overt complementiser. If this were to happen then clearly the que
complementiser in (3) and the que complementiser in (7) would have to
be differentiated from one another. In fact, to the extent that que must
always appear immediately before the inflected verb and any clitics it
may have attached to it, as claimed by Obenauer (1977), all que
questions containing pronominal subjects will involve simple
inversion (see note 2). Since inversion is typically taken to indicate that the verb
is in C, which is borne out by the contrast in (7), it is fairly safe to
assume that when que appears it is always outside IP.
It would seem then that the two views on the status of
interrogative que are incompatible. However, within current syntactic
analyses couched in the Principles and Parameters framework they can
be seen to have something in common. Complementisers and pronouns
are both treated as functional heads which may have syntactic
complements but do not assign theta roles and hence cannot take
arguments. Since this is the case, some aspects of the behaviour of que
may be attributed to its status as a functional head and are thus
compatible with its treatment as a pronoun in the current framework in
a way which was not possible in earlier approaches.
1.2 Subject questions
Further and yet more problematic constraints on 'what' arise in
simple direct questions: when it functions as the subject, it appears
that it can neither be extracted nor (if we take quoi to be the form used
when it has not been moved) stay in situ.
(8) * Que flotte dans l'eau?
      what floats in the water
      'What floats in water?' or 'What is floating in the water?'
(9) * Quoi flotte dans l'eau?
      what floats in the water
2 Apparent exceptions to this generalisation, like (i), where complex
inversion has taken place, are rejected by Obenauer (1977) as marginal but
uniformly accepted by my informants.
(i) Que cela veut-il dire?
what that wants it to say
'What does that mean?'
This is not true for other wh-phrases as (10) shows.
(10) Qui flotte dans l'eau?
     who floats in the water
     'Who is floating/floats in the water?'
The restriction on extraction is not seen in more complex questions like
(11), which I take (pace Obenauer 1976) to be cases of long-distance
extraction given the standard que/qui alternation which shows up after
extraction of an embedded subject (see note 3).
(11) Qu'est ce qui flotte dans l'eau?
     what is this that floats in water
     'What (is it that) floats/is floating in (the) water?'
These cases completely parallel other cases of long-distance subject-que
extraction such as (12).
(12) Que crains-tu qui soit advenu?
     what fear-you that is taken place
     'What do you fear has happened?'
Whether the restriction on quoi in [Spec,IP] extends to embedded
contexts is harder to determine. The impossibility of cases like (13)
suggests that it does.
(13)
* Tu pensais que quoi traînait dans le couloir?
you thought that what lay around in the corridor
'What did you think was lying around in the corridor?'
However, an example given to me by Paul Hirschbühler shows that
where movement is independently blocked, 'what' may perhaps stay in
subject position.
3 In contexts where that-t effects would show up in English a que
complementiser becomes qui; the effect is dubbed 'masquerade' by Kayne
(1976) and is considered by Rizzi (1989) to be a case of agreement in Comp,
with the C showing the presence of a wh-trace in its specifier.
(14)
Qui a dit que quoi traînait où?
who has said that what lay around where
'Who said what was lying around where?'
This suggests that the ban on quoi in subject position is not merely due
to its incompatibility with nominative Case, as Goldsmith (1981)
claims.4 In Plunkett (1994) this explanation for the absence of
que/quoi subject questions was adopted and it was argued that stressed
subject pronouns such as the ones in the echo questions in (15) noted
by Koopman (1982) be taken to be non-nominative forms.5
(15) a. QUOI a été décidé?
what has been decided
'WHAT was decided?'
b. QUOI flotte dans l'eau?
what floats in the water
'WHAT floats in water?'
Another set of examples which might be problematic for
Goldsmith's view are those like (16) where, under most views, the
expletive subject would transmit nominative Case to quoi in postverbal position.
(16)
Il est arrivé quoi?
it is happened what
'What happened?'
These arise both with unaccusative type verbs such as those which
occur in English There-Insertion constructions and, in French in
passives, as in
4 Though that approach has the advantage of being able to explain why
many speakers only marginally accept quoi in subject positions in echo
questions and others reject it altogether.
5 It was felt that the contrast in (6) supported that view.
(17)
Il a été décidé quoi pour demain?
it has been decided what for tomorrow
'What has been decided for tomorrow?'
These types of construction provide additional information about the
constraints on the extraction of 'what' since, when [Spec,IP] is filled
with an expletive, the post-verbal nominative que can be extracted as
(18) and (19) show.
(18)
Qu'est-il arrivé?
what is it happened
'What happened?'
(19)
Qu'a-t-il été décidé pour demain?
what has it been decided for tomorrow
'What has been decided for tomorrow?'
This possibility might lead us to wonder whether the cases of
apparent long-distance subject que movement in (12) were not in fact
instances of extraction from a post-verbal position, since native
speakers often have difficulty in deciding which of the examples in (20)
is the appropriate way of writing the corresponding spoken question.
(20) a. Que dis-tu qui est advenu?
what say you that is happened
'What do you say happened?'
b. Que dis-tu qu'il est advenu?
what say you that it is happened
However, there are clear cases where no expletive subject is possible, as
in (21), and long-distance subject extraction is indeed still licit.
(21)
Que prétendais-tu qui motivait cette analyse?
what claimed you that motivated that analysis
'What did you claim motivated that analysis?'
One might wonder whether any further information could be
gleaned from looking at indirect subject questions. Unfortunately, this
is not possible. 'What' questions in this context are in fact anomalous,
but in this case, as the paradigm in (22) shows, there is no difference
between subject questions and object ones; when the embedded clause is
tensed, neither permits a simple question introduced by que. Instead,
these indirect 'what' questions are always introduced by the pronoun ce
('it'), resulting in a free-relative type structure.
(22) a. * Je me demande que/quoi tu aimes.
          I myself ask what you like
     b. Je me demande ce que tu aimes.
        I myself ask it that you like
        'I wonder what you like'
     c. * Je me demande qui/quoi lui fait peur.
          I myself ask what him makes frightened
     d. Je me demande ce qui lui fait peur.
        I myself ask it that him makes frightened
        'I wonder what makes him frightened.'
This restriction is specific to indirect 'what' questions, since the
instances of (23) are unexceptional.
(23) a. Je me demande qui tu aimes.
        I myself ask who you like
        'I wonder who you like.'
     b. Je me demande qui lui fait peur.
        I myself ask who him makes frightened
        'I wonder who makes him frightened.'
The restriction could be linked to the dependence of que on an adjacent
verb but it can have nothing to do with the status of subject questions.
In fact, in the questions in (22b) and (d) que is clearly the relative
complementiser as Kayne (1976) argued was the case in all relatives,
since where the subject has been extracted we find the qui alternant
though the head of the relative is inanimate.
Where the wh-clause is non-finite the facts are different again but
since in these cases there can never be an overt subject they cannot be
relevant with regard to the restriction on subject questions.6 Since it is
that restriction which I will now concentrate on, in what follows I will
abstract away from indirect questions.
1.3 Review
We have seen that que/quoi questions are special in several ways. First,
'what' has two forms in French, one appearing to be a weak or clitic
pronoun which undergoes movement and the second a strong pronoun
which appears when the in situ strategy for Wh Questions is used.
Second, in matrix direct questions que cannot appear bearing the
grammatical function of subject, suggesting in by now traditional terms
that 'extraction' of 'what' subjects is impossible in French. However,
coincidentally quoi may not appear as an in situ matrix subject either
and it is unclear how closely these facts should be related to the
availability of two forms for the 'what' pronoun.
In the next section I will be discussing one approach to Wh
Movement with a view to seeing whether it can shed any light on these
peculiarities.
2. Wh Movement
Rizzi (1991), reformulating the approach taken in May (1985),
proposed that Wh Movement could be accounted for by the Wh
Criterion as given in (24).
6 In infinitivals (as discussed in Hirschbühler 1978) we find the only
case where que and quoi are not in complete complementary distribution. An
embedded case is illustrated in (i).
(i) a. Je ne sais quoi faire
       I not know what to do
    b. Je ne sais que faire
       I not know what to do
       'I don't know what to do'
Hirschbühler argues that subtle semantic factors distinguish these two.
(24)
Wh Criterion
a. A Wh-operator must be in a Spec-head configuration with
an X° [+WH]
b. An X° [+WH] must be in a Spec-head configuration with a
Wh-operator
(Rizzi 1991: 2)
In Plunkett (1993) a similar, if somewhat less strict, approach is taken
with regard to questions where the principle in (25) is essentially
comparable to clause (b) of the Wh Criterion.7
(25)
Interrogative Movement Principle (IMP)
The specifier of a head which bears question features must bear
matching features.
(Plunkett 1993: 262)
Although the two approaches diverge in detail, they converge in the
proposal that wh-features are marked on C in selected embedded wh-clauses but on the head which is normally immediately below C in root
clauses; we also agree that the principle applies at S-structure in
English. Rizzi assumes that in root clauses wh-features are associated
with the head containing tense features whereas I located them in Agr;
these details seem to be irrelevant to the analysis of the French data and
for the sake of simplicity I will illustrate with a unified Infl assuming
this to contain both Tns and Agr features.
The complementarity between inversion in root and embedded
clauses in English questions has led to the now standard analysis of
[Spec,CP] as the landing site for Wh Movement. Although both
approaches situate wh-features lower than C in root clauses, the claim
that [Spec,CP] is the usual landing site for Wh Movement is not
7 Clause (a) in (24) was originally intended to deal with non-inverting
structures such as relative clauses and will not be of relevance until Section
3. In the meantime I will refer only to the IMP in (25) with the
understanding that in nearly all cases, (24b) and (25) have the same
coverage.
disputed. Both approaches employ the same mechanism to explain why
a wh-phrase usually ends up in [Spec,CP] in English; the subject
occupies [Spec,IP] so that the principle in (25) usually cannot be
satisfied by S-structure unless I moves into C, whose specifier is
empty; the wh-phrase can then move into the specifier position,
permitting spec-head agreement in the C projection with respect to wh-
features. A typical pre-Wh Movement structure would be the one
shown in (26), where arrows show the subsequent movement.
(26)
[CP [C' [IP DP Agr+T [VP [V' V DP[+WH]]]]]]
(tree diagram; movement arrows not reproduced)
The Infl node and the subject NP do not agree in wh-features; if,
however, both the object NP and the head marked +wh move into the
C projection then the IMP will be satisfied. The same type of situation
will arise when an adjunct phrase or an argument in a lower clause is
marked +wh.
There is one type of construction, however, where the approaches
differ more substantially; this is the configuration in which the root
subject is marked +wh, as in (27).
(27)
[CP [C' [IP DP[+WH] [I' Agr+T[+WH] [VP V DP]]]]]
In a configuration such as the one in (27), IMP is immediately
satisfied. I assume the now familiar Lexical Clause Hypothesis with
subjects in French and English raising to [Spec,IP] to get Case; the
subject and Infl agree in wh-features here and there is no obvious
motivation for further movement of either the wh-phrase or the
wh-marked head. Since this is so, considerations of economy would lead
us to expect that no further movement of the wh-phrase will be required
either in the syntax or at LF; indeed, I will argue not only that further
movement is unnecessary but that once IMP has been satisfied, it is
impossible. In so far as this approach requires the minimum number of
steps it is the Minimal Approach to Wh Movement and will be referred
to as such in what follows. Rizzi (1991) acknowledges that to say that
no further movement takes place in such cases is the most
straightforward account of root subject questions in English. The
analysis correctly predicts that we will see no evidence of Subject
Auxiliary Inversion in such questions, this being a movement which is
triggered to allow satisfaction of the IMP. While the absence of
inversion in such questions is an effect which people have previously
struggled to explain, it is a natural consequence of the Minimal
Approach.
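The contrast between the configurations in (26) and (27) can be made concrete with a small sketch. It is only a toy encoding under stated assumptions: the class `Projection` and the function `satisfies_imp` are hypothetical names introduced here for illustration, and the only thing modelled is whether a head bearing question features has a matching specifier, as the IMP in (25) requires.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Projection:
    """A head plus its specifier (hypothetical toy encoding, not part of the analysis)."""
    label: str                # e.g. "IP" or "CP"
    head_wh: bool             # does the head bear +wh features?
    spec_wh: Optional[bool]   # None = empty specifier; otherwise the +wh value of its occupant


def satisfies_imp(p: Projection) -> bool:
    """IMP (25): the specifier of a head bearing question features must bear matching features."""
    if not p.head_wh:
        return True           # the principle only constrains heads bearing question features
    return p.spec_wh is True


# (26) before movement: Infl is +wh but [Spec,IP] holds a non-wh subject.
object_question_before = Projection("IP", head_wh=True, spec_wh=False)

# (26) after I-to-C and Wh Movement: the +wh head sits in C, the wh-phrase in [Spec,CP].
object_question_after = Projection("CP", head_wh=True, spec_wh=True)

# (27): a +wh subject in [Spec,IP] already matches +wh Infl, so nothing needs to move.
subject_question = Projection("IP", head_wh=True, spec_wh=True)
```

On this encoding the object question fails the check until both the head and the wh-phrase have moved, whereas the subject question passes it without any movement, which is the intuition behind the Minimal Approach.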
However, Rizzi (1991) does not adopt the Minimal Approach.
One of his reasons is that part of the data from French questions,
discussed in the previous section, can be taken to indicate that subject
wh-phrases must vacate [Spec,IP]. As mentioned in the introduction,
this was the conclusion reached on different grounds by both Koopman
(1982) and Friedemann (1991). In the following section I will discuss
the analysis of French questions with respect to the type of approach
outlined, first in general and then with respect to the specific
restrictions on 'what' questions. As far as subject questions are
concerned I will focus on how the Minimal Approach can cope with the
French data.
3. The Minimal Approach to French Questions
An adequate approach to Wh Movement must be able to account for
when any wh-phrase must, may or may not move. In addition, it should
correctly predict in which cases of Wh Movement a concomitant
inversion must or may take place. In particular, leaving aside factors
specific to subject questions for the moment, with respect to French it
must explain:
(i) why (overt) Wh Movement is optional in matrix questions and
obligatory in embedded questions;
(ii) why inversion is possible but not obligatory with most matrix
(moved) questions but impossible in embedded questions;8
(iii) why, in obligatory contexts, only one wh-phrase has to move;
(iv) why inversion never happens when a wh-phrase stays in situ;
(v) why partial Wh Movement is not possible (e.g. movement to an
intermediate [Spec,CP]).
In addition, with respect to 'what' questions, our theory must explain:
(vi) why inversion is obligatory in matrix que questions.
Rizzi (1991) deals with the first five of these. I will begin my
analysis by looking in detail at these factors and propose some
modifications to his treatment. Next, I will turn to the treatment of
'what' questions specifically and finally, I will discuss subject questions
8 Stylistic Inversion is sometimes found in embedded contexts and is
thus an exception to this generalisation. A full investigation of the
differences in different types of inversion is beyond the scope of this paper.
in general and argue that we should ensure that the approach is
'minimal'.
3.1 Optional inversion and optional movement
As we saw above, the IMP in (25) has the same effect as clause (b) of
the Wh Criterion (24) which was designed to deal with inversion
constructions. Let us first examine how the inversion data is explained
and then proceed to look briefly at non-inversion in French questions
and whether clause (a) of (24) or an equivalent is also necessary.
If the head of every question clause bears wh-features and, if
(24)/(25) applies at S-structure in French (as Rizzi (1991) claims), then
Wh Movement should be obligatory, as it is in English. This is a
correct prediction for indirect questions in French, where the matrix verb
selects a CP whose head is marked +wh, but since matrix Wh
Movement is optional Rizzi proposes that while matrix I may bear wh-features, such features are not necessarily generated. He points to the
optionality of the question marker ka in Japanese matrix questions in
support of this claim.9 This proposal that wh-features are generated
freely, which largely accounts for factor (i), seems reasonable and I will
assume in what follows that in a direct question where no wh-phrase
moves, the head of the matrix clause is -wh. The question now arises
whether obligatory Wh Movement indicates that all question clauses
must obligatorily have +wh heads in English. It would seem rather ad
hoc to assume that wh-features are freely generated in French but
obligatorily generated in certain contexts in English. However, another
of his proposals allows Rizzi to circumvent this problem. In positing
two clauses of the Wh Criterion Rizzi is in effect postulating that spec-head matching in wh-features is required independently by both wh-heads and wh-phrases. This entails that when the head of an unselected
clause is -wh but the sentence contains a wh-phrase, Wh Movement
will still be required at some level, as has usually been assumed. Rizzi
argues that when this situation arises in French, the wh-phrase may
move overtly to [Spec,CP] then, by a process of 'dynamic agreement'
the empty C position will come to agree with the wh-phrase and (24a)
will be satisfied. In this case, since no wh-feature has been forced to
9 It is obligatory in embedded questions.
move from I to C, no inversion will take place and Rizzi thus explains
factor (ii) which accounts for the possibility of uninverted questions
like (28) in French.
(28)
Comment tu l'as su?
how you it have known
'How did you know that?'
Rizzi (1991) argues that English lacks Dynamic Agreement. Since, on
his view, a question with no wh-head would only be able to satisfy (24)
if Dynamic Agreement were available, the postulation that it exists in
French but not English will account for the fact that all questions
involve both overt movement and inversion in English.
Now, Rizzi (1991) assumes that both clauses of the Wh Criterion
(24) must apply at the same level in a given language, thus incidentally
explaining factor (iv), i.e. why clause (b) cannot be satisfied simply by
the operation of inversion, with subsequent movement of the wh-phrase
left until LF. However, if the presence of a wh-phrase is itself
sufficient to cause movement, as clause (a) of (24) suggests, then the
possibility that no +wh head will be generated in a given matrix
context ought not to be sufficient to predict the possibility of in situ
questions in French. In Rizzi (1991) the explanation for the fact that
some wh-phrases can remain in situ until LF is maintained by the
additional assumption that these do not have the status of 'operators'
until that level and, as a result, clause (a) does not apply to them until
then.10
Overall then, Rizzi's (1991) approach manages to account for all
the factors in (i) to (iv) above but under current economy considerations
the approach faces a problem. If wh-phrases are not deemed to be
10 An alternative explanation of this option which Rizzi considers and
rejects is that clause (a) of (24) not apply until LF in French. Although
indirect questions (and relative clauses) involve no inversion, clause (b) is
sufficient to ensure obligatory movement in them. Late application of
clause (a) would have the desired effect then of correctly predicting not only
the possibility of in situ questions but also giving an account of factor (iii),
why in multiple wh-questions in French as in English only one wh-phrase
may move in the syntax.
operators until LF, an assumption required for English where factor (iii)
also holds, why should they (be able to) move in the syntax in cases
like (28) in French? Economy predicts that even if Dynamic
Agreement were available it should only ever be invoked at LF.
Under Minimalism (Chomsky 1993, 1995), pure optionality of
movement is ruled out. Movement of an element in the syntax is licit
only if a failure of such movement would result in a derivation which
could not converge. With a view to explaining the French data within
the current approach while retaining as much of the explanatory power
of Rizzi's approach as possible I would like now to propose some
revisions.
Let us assume as before that wh-features are generated freely in
unselected environments. If none of the clausal heads have been
generated with wh-features but a sentence contains a wh-phrase, then
that phrase will be required to stay in situ. However, semantic
requirements will mean that unless the scope of the wh-phrase can be
determined in some other way the sentence will be uninterpretable.
Leaving aside details, let us assume that languages which allow in situ
wh-phrases have access to such a mechanism while languages like
English do not. On such a view, visible movement entails the presence
of wh-features on some clausal head while lack of movement entails the
absence of such features. If this is correct then an alternative
explanation for uninverted structures like (28) must be sought. Consider
for a moment what form such structures take in the varieties of French
in which the Doubly Filled Comp Filter (DFCF) (Chomsky and Lasnik
1977) is not in effect.
(29)
Comment que tu l'as su?
how that you it have known
'How did you know that?'
One might claim that the C here is -wh and invoke something like
Dynamic Agreement in such structures, but given that it will be
necessary to assume that in these dialects C can be freely generated in
root contexts, it is much more straightforward to assume that when C
is the head of the clause, that is the head that any wh-features will
appear on. If Dynamic Agreement is not involved in (29), some head
bears wh-features or the Wh Criterion could not be satisfied; it must be
C or an ad hoc mechanism will be required to explain the
grammaticality. Now suppose that with respect to Wh Questions,
dialects such as Metropolitan Standard French (MSF) and Québécois
differ only in their application of the DFCF. If wh-features can be
generated on C in root clauses in French,11 the operation of the DFCF
in some dialects will explain the absence of an overt complementiser in
cases like (28) but the presence of a non-overt +wh complementiser
there will obviate the need for inversion and its absence will thus be
explained. It would be superfluous to assume that the dialects differ
further by invoking Dynamic Agreement for cases such as (28). Since
in MSF movement with inversion is also possible we need only claim
that the projection of C is optional in French root clauses. This claim
is independently supported by the following well known contrast seen
in example (7) in which either inversion or an overt complementiser is
possible after certain sentential adverbs in MSF, but not both.
(30)
Peut-être est-il parti.
perhaps is he left
'Perhaps he left.'
(31)
Peut-être qu'il est parti.
perhaps that he is left
(32)
* Peut-être qu'est-il parti.
perhaps that is he left
If this approach is correct and Dynamic Agreement can be dispensed
with in the explanation of structures like (28) then what accounts for
the absence of uninverted questions in English? The simplest account
must be correct here: complementisers cannot be generated in matrix
contexts in English.
11 We must ensure that the DFCF operates only in wh-contexts in which
the C projection is filled with a complementiser and not when it is filled
with a verb, i.e. when the C position is filled at D-structure.
Having dispensed with the need for Dynamic Agreement in non-inversion structures the question now arises of whether it is needed at
all. Under standard GB assumptions, in multiple questions where one
wh-phrase has moved in the syntax, movement of any remaining wh-phrases involves absorption (Higginbotham and May 1981), clause (a)
of (24) or its equivalent presumably being responsible for the
movement. Suppose however, that LF movement of an in situ wh-phrase is required merely because of the need for scope assignment
rather than because of an independent spec-head requirement on wh-phrases as such. Since the presence of the word 'operator' in (24a) is
crucial to an adequate description of the data it is unclear that this clause
can be in operation for anything other than semantic reasons. If this is
the only motivation for the postulation of clause (a) and its effect can
be guaranteed by independent requirements, then it should be dispensed
with, leaving a single-pronged Wh Criterion. Such a version of the
criterion would be much more in keeping with Chomsky's recent
proposals concerning the operation of Checking Theory (Chomsky
1995). Suppose then that there is no clause (a) to the Wh Criterion and
that in situ wh-phrases may be assigned scope by some means other
than movement and absorption at LF. If this approach is correct then
there will be no need to invoke Dynamic Agreement at LF and it can
thus be dispensed with completely.12
Before proceeding, let us look briefly at whether the proposed
revisions to Rizzi's approach explain factor (v), the lack of partial Wh
Movement in French and English and continue to allow us to explain
factor (iv), why we never find inversion without concomitant Wh
Movement.
Under the monoclausal approach to the Wh Criterion there is a one-to-one correspondence between the presence of a wh clausal head and the
application of overt Wh Movement. Once the head of an IP or CP has
+wh-features, the revision of (24b)/(25) in (33) will kick in.
12 The question of what precisely happens to unmoved wh-phrases at LF
is left open here. In Baker (1970) and indeed in much recent work (Aoun and
Li 1993, Kiss 1993, Stroik 1995, Williams 1986) LF movement is not
invoked to explain the assignment of scope to in situ wh-elements.
(33)
Wh Criterion (revised)
Heads marked +wh bear a strong (alternatively weak) (Categorial) X feature.13
Since strong features must be eliminated by Spell-out (S-structure), it
follows that partial movement should never be licit in a language in
which the categorial feature on a wh-head is strong.14 Under a checking
view of the Wh Criterion it follows too that inversion could never take
place without concomitant Wh Movement. Factors (iv) and (v) then
fall out quite neatly within this framework.
Before proceeding to the next section in which we consider factor
(vi) let us briefly summarise the assumptions entailed in the revised
approach to Wh Movement taken here.
In unselected contexts wh-features are freely generated on a clausal
head. Some languages limit the choice of clausal head in root contexts
(English) while others allow a choice between the projection of an
inflectional head only or a complementiser (French). Where a choice is
available, wh-features may freely appear on the topmost head; where
this is a head such as C which unlike I does not independently require
its spec to be filled, uninverted questions will be possible. These may
be of two types: those like spoken MSF in which the DFCF operates
and those like Québécois in which it does not. (Visible) Wh Movement
is triggered solely by the presence of a strong categorial feature on any
wh-marked head, which in French may be either I or C. There is an
isomorphic relation between the presence of a clausal head marked +wh
and Wh Movement. In some languages assignment of scope to a wh-phrase at LF is limited to contexts in which a wh-phrase has already
moved in the syntax, so that in these languages all derivations of
questions in which no clausal heads are marked +wh will crash; English
is such a language while French is not. Note that it is with respect to
the presence or absence of this mechanism that English and French are
postulated to differ rather than with respect to Dynamic Agreement.
13 Where an X feature is similar to a D-feature as in Chomsky (1995) but
where clearly the particular category of the element is unimportant.
14 How languages such as those described in McDaniel (1989) should be
treated is as yet unclear to me.
The proposed revisions are necessary to a complete explanation for
the behaviour of 'what' questions in French to which we now return.
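The assumptions summarised in this section can be sketched as a small decision procedure. This is a minimal illustration, not a formalisation of the analysis: the table `LANGUAGES`, the function `predict` and the parameter names are hypothetical, and the sketch covers only matrix questions with a non-subject wh-phrase, leaving aside the que-specific and subject-specific restrictions.

```python
from typing import Optional

# Hypothetical parameter settings distilled from the summary:
#   root_heads     which clausal heads may be projected (and bear +wh) in root clauses
#   in_situ_scope  whether an unmoved wh-phrase can be assigned scope at LF
LANGUAGES = {
    "English": {"root_heads": {"I"}, "in_situ_scope": False},
    "French": {"root_heads": {"I", "C"}, "in_situ_scope": True},
}


def predict(language: str, wh_head: Optional[str]) -> Optional[dict]:
    """Predict the surface pattern of a matrix non-subject wh-question.

    wh_head is the clausal head generated with +wh features ("I" or "C"),
    or None if no head bears them. Returns None when the derivation crashes.
    """
    params = LANGUAGES[language]
    if wh_head is None:
        # No +wh head: the wh-phrase must stay in situ, which is licit only
        # if the language can assign it scope without prior movement.
        if params["in_situ_scope"]:
            return {"movement": False, "inversion": False}
        return None
    if wh_head not in params["root_heads"]:
        return None  # e.g. English cannot generate a complementiser in root contexts
    # A +wh head carries a strong feature, so overt Wh Movement is forced.
    # Inversion is needed only when the +wh head is I, whose specifier is
    # already occupied by the subject.
    return {"movement": True, "inversion": wh_head == "I"}
```

Under these settings, French allows three patterns (in situ, movement with inversion, movement without inversion) while English allows only movement with inversion, matching the one-to-one relation between a +wh clausal head and visible Wh Movement described above.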
3.2 Que questions
We begin our re-examination of que questions by looking at the reasons
for the obligatory inversion which it induces; we then move on to look
at the clitic-like nature of que.
3.2.1 Obligatory inversion in que questions
Let us look again at factor (vi), why 'what' questions always induce
inversion in French. Rizzi (1991) did not attempt to deal with this
matter, but within both his framework and our revisions of it inversion
occurs only where an inflectional head bears +wh-features; we may thus
see this restriction as one which rules out derivations in which wh-features are generated on C.15 As can be seen from the examples in the
previous sub-section, when matrix C occurs overtly in French it has the
same form as the complementiser which introduces finite embedded
clauses, que (or qui when subject extraction has taken place). We may
say then that when the complementiser que bears wh-features,
movement of the weak form que causes the derivation to crash.16 One
might posit a fairly superficial reason why que questions are licit only
when I bears wh-features such as a filter blocking que in the spec of a
que Comp. The restriction is in fact more likely to have something to
do with the clitic-like properties of the question-word que, however.
One reason is that such a filter would be likely to have a phonological
basis and yet in this case we would have to say that it operates even in
MSF where the DFCF means that the second of two adjacent ques is
not even pronounced. The second reason is that a similar situation in
which qui occupies both the head and spec of CP results in no
ungrammaticality in the dialects in which DFCF does not operate.17
15 Absence of wh-features is still licit since quoi may remain in situ.
16 Note that even in Québécois where there is a clear preference for
situating wh-features on C rather than I, when que is used inversion must
occur.
17 The complementiser qui is not only possible here but according to
Lefebvre (1982) it is obligatory for reasons having to do with the ECP.
(34)
Qui qui est venu?
who that is come
'Who came?'
Since qui does not have clitic-like properties this contrast is to be
expected if we attribute the restriction to the clitic nature of que.18 Let
us explore further the clitic-like nature of the wh-word que.
3.2.2 Que as a defective clitic
We saw in Section 1 that there are sound morphological and syntactic
reasons for regarding que as a weak form of the pronoun quoi. We may
take pronouns to be determiners which head a projection containing a
zero nominal head as in (35), and if 'what' in French is a pronoun then
we will expect it to sometimes behave as a full phrasal projection (i.e.
DP) and sometimes as a head (D).
(35)
[DP [D' [D que] [NP ∅]]]
18 Further support can be found from the fact that in some dialects of
Canadian French the non-clitic form quoi may appear in a fronted position,
as in (i).
(i) Quoi c'est que Jean fait?
what it is that Jean does
'What is Jean doing?'
Indeed, a few speakers seem to even accept cases like (ii) though Lefebvre
(1982) claims that the majority of her informants rejected such cases.
(ii) (*) Quoi tu fais?
what you do
'What are you doing?'
However, I have no explanation for why it is possible to move the strong
form alone in these dialects but not in the MSF example in (iii).
(iii)*Quoi fait Jean?
what does Jean
'What is Jean doing?'
A most natural corollary of this view would be to treat que as the form
which is used when head movement has taken place and quoi as the full
DP form. This is the view espoused in Plunkett (1994) and it could
clearly account not only for the dependent status of que but also for the
fact that it cliticises only to verbs rather than whatever it happens to be
adjacent to. However, adopting this view is not straightforward; weak
object pronouns in French are standardly treated as syntactic clitics and
since Kayne (1975) clitic placement has been largely regarded as
involving movement of a head.19
Hirschbühler (1978), advocating a pronominal treatment of
interrogative que, already argued that it was a clitic, thus accounting for
its appearance adjacent to a verb.20 However, the rules which he
invoked to account for its status as a 'dependant' were phonological.
While the distribution of que, as described by Hirschbühler, clearly
shows that it is a phonological clitic on the verb, its status as a
syntactic clitic and hence as a head which has undergone head movement
is less certain. In particular, as already noted by Friedemann (1991), the
fact that que can occur in long-distance questions where it has been
extracted out of a tensed clause casts strong doubt on the possibility
that it reaches the head of the matrix clause by Head Movement,
especially since such Long Head Movement is otherwise unknown in
French.
19 In more recent approaches movement of a clitic is claimed to take place
in two steps, the first, movement of a maximal projection to the specifier of
an agreement phrase to get case and the second a further movement of the
head to the clitic position. This is the approach I believe to be correct;
however, some researchers (e.g. Sportiche 1994) base-generate clitics in a
fronted position.
20 Aside from the cases mentioned in an earlier footnote, the only
exceptions to the requirement that que be left-adjacent to a verb involve
instances of que diable ('what the devil') which is not as restricted in its
occurrence as simple cases of que. Like que this cannot occur next to a
subject pronoun.
(i) * Que diable tu cherches?
what devil you look for
'What the hell are you looking for?'
Hirschbühler (1978) points out that all wh-diable phrases induce simple
inversion.
Suppose we treat que as a phonological clitic but not a syntactic
one. In this case we could assume that Wh Movement of 'what' in
French involves movement of the whole DP until the target position
has been reached. At that point the head could pro-cliticise to the
adjacent verb or other clitic, where inversion has taken place. This
would explain why que consistently appears outside all other clitics,
including ne. It would also enable us to account for the fact that unlike
other clitics que need not attach to the verb of its own clause, as in (12)
and (21) repeated in (36).
(36) a. Que crains-tu qui soit advenu?
        what fear you that is taken place
        'What do you fear has happened?'
     b. Que prétendais-tu qui motivait cette analyse?
        what claimed you that motivated that analysis
        'What did you claim motivated that analysis?'
(37)
Que ne faudrait-il jamais faire t?
what NE ought-it never to do
'What ought one never to do?'
This solution does not require that we invoke Long Head
Movement. However, the problem remains of how to account for why
it always cliticises to a verb group and never anything else and in
particular, why it cannot cliticise to a complementiser. In fact, under
the view presented here it is this last case which it is essential to rule
out since uninverted questions are posited to contain a non-overt
complementiser adjacent to the wh-phrase. Clearly, it will be necessary
to assume that phonological clitics like que may cliticise only to heads
which are structurally adjacent and that these must have phonological
content. I would like to propose that what is at stake in the *que que
sequence is that the complementiser does not itself have enough
phonological weight to act as a host for a phonological clitic while a
verb, plus or minus verbal clitics, does.21
Assuming that que questions in which the matrix C bears wh-features can be ruled out in this way, let us turn now to the remaining
problematic cases in which que functions as a subject.
3.2.3 Que and subject questions
Let us return finally to the restriction on matrix clauses with 'what'
subjects in French. As we saw earlier, these appear to be both banned
from staying in situ, in the [Spec,IP], taking the form quoi and from
moving to [Spec,CP] and taking the form que. Let us now see how
this can be explained. To begin, let us review some of the problematic
cases:
(38) a. * Que/quoi a été décidé?
          what has been decided
          'What was decided?'
     b. * Que/quoi flotte dans l'eau?
          what floats in the water
          'What floats in water?'
Simple matrix questions are ungrammatical when the subject is a form
of 'what', both when the subject is left in situ and when it is moved.
However, the echo version of the in situ question is acceptable, as we
saw in (15), repeated here as (39).
(39) a. QUOI a été décidé?
        what has been decided
        'WHAT was decided?'
     b. QUOI flotte dans l'eau?
        what floats in the water
        'WHAT floats in water?'
21 Although complement clitics are themselves phonologically light
they form a phonological phrase with the following verb. However,
phonological weight might also be relevant in accounting for the fact that
many speakers find que questions where the first clitic is ne to be odd.
The impossibility of (38) cannot be attributed to any thematic
restriction on que/quoi as the thematic relations are the same in (38) as
in (39) and presumably they are the same again in the relevant part of
(17) repeated here as (40).
(40) Il a été décidé quoi pour demain?
     it has been decided what for tomorrow
     'What has been decided for tomorrow?'
Note that here the wh-phrase does not occupy the subject position,
which is filled instead by an expletive. In addition, we cannot maintain
that que/quoi simply cannot be a subject because in elliptical questions
with no verb quoi can clearly refer to the subject, as (41) (from Léard
1982) shows.
(41) a. Quelque chose me chagrine.
something me upsets
'Something is upsetting me.'
b. Quoi donc?
what then
'What?'
In addition, we have just seen cases in (36) where que has been extracted
from the subject position in a lower clause. The acceptable periphrastic
forms such as the one in (11) repeated here as (42) were taken to fall
into this category too.
(42)
Qu'est ce qui t flotte dans l'eau?
what is this that floats in the water
'What (is it that) floats/is floating in the water?'
Echo interpretations aside, the contrasts seem generally to show that
que/quoi may occupy [Spec,IP] but not at S-structure and that que may
occupy [Spec,CP] but not if it has been extracted from the subject
position of the same clause. Let us dispense with the latter case first.
Given that 'what' cannot be completely barred from the specifier
position of a tensed CP we need to explain why it is blocked from
moving the short distance shown in (43).
(43) * [CP que_i [C' V_j [IP t_i t_j ...]]]
This configuration could perhaps be ruled out as an ECP violation
which cannot be salvaged by Masquerade, as it can in the embedded
clause in the relevant cases, since only IP has been projected. However,
it is not clear why an inverted verb would not be able to govern the
trace position as Rizzi assumes happens with the extraction of a 'who'
subject in (44).
(44)
Qui vient?
who comes
'Who is coming?'
I would like to maintain, though, that the verb has nothing to salvage
in (44) since qui is in [Spec,IP] and not [Spec,CP]. This is exactly
what the Minimal Approach to Wh Movement (as in Plunkett 1993)
would predict. Put into the framework presented here, economy
considerations will block an I marked +wh from moving to C in this
situation since the wh-phrase in its specifier satisfies the revised Wh
Criterion in (33) and further movement, being completely unmotivated,
is blocked.22 If movement is blocked in (44) then the same applies in
(38); economy thus rules out the representation in (43). It is
interesting to compare (18) and (19) repeated here as (45) and (46) in
this regard.
(45) Qu'est-il arrivé?
     what is it happened
     'What happened?'
22 Under Minimalism, movement is permitted only to satisfy
morphological requirements and never in order to salvage
ungrammaticality.
(46) Qu'a-t-il été décidé pour demain?
     what has it been decided for tomorrow
     'What has been decided for tomorrow?'
In cases such as these que is in fact an underlying object and at S-structure [Spec,IP] is filled by an expletive. In this situation of course
economy will not block further movement because the only way to
satisfy the Wh Criterion (33) will be for I to move to C and for the wh-phrase to move into [Spec,CP].
Let us concentrate then on explaining the remaining problem, the
ban on (non-echo) quoi when in situ. I would like to attribute this to
the status of que/quoi as a non-specific indefinite.23 Not all of the
ungrammatical examples with quoi subjects have grammatical
equivalents with expletive subjects but it is significant that in the
examples usually cited quoi is the surface subject of a predicate with a
single argument, plausibly an unaccusative,24 or of a passive predicate.
In fact, when we look at a different type of predicate speakers will
sometimes, at least marginally, accept que subjects. The following have
been found acceptable by more than one speaker.
(47) ? Que démontrait le redressement de l'économie?25
       what demonstrated the re-establishment of the economy
       'What demonstrated the recovery of the economy?'
23 My thanks go to David Adger for first suggesting to me that the
contrast I discuss below might have something to do with specificity.
24 Though neither sentir 'feel' nor trainer 'lie around' take the auxiliary
etre on the relevant interpretation.
25 For both this and the example which follows an object interpretation
for the question is also available. I have controlled for this in asking
speakers' judgements by putting them into a context which forces the
subject reading as in (i).
(i) A ton avis, que révèle le mieux [le redressement de l'économie],
    in your opinion, what reveals the best the re-establishment of the economy
    les chiffres de chômage ou le taux de l'inflation?
    the figures of unemployment or the rate of the inflation
    'In your view what best reveals the economic recovery, the
    unemployment figures or the rate of inflation?'
(48) ? Que vous demanderait un voeu de célibat?
       what you would ask a vow of celibacy
       'What would require a vow of celibacy from you?'
(49) ? Que réclame toute notre attention?
       what demands all our attention
       'What demands our full attention?'
What seems particularly relevant here is that in all these cases, on a
subject interpretation,26 'what' seems to mean something like 'what
particular thing'. In other words, que is being interpreted here as 'D-linked', to use the terminology of Pesetsky (1987), or, if Kiss (1993) is
right in equating the two, a specific or familiar indefinite. It is well
known that many languages bar indefinites from occurring in the
[Spec,IP] position, or require that they receive a particular type of
interpretation either as a specific or a generic. In some languages
(Modern Standard Arabic is one), the addition of a modifier may be
sufficient to render the indefinite specific enough to be able to occupy
this position. Clearly, some indefinites may appear in subject position
in French but it may be that que/quoi are so resistant to a specific
interpretation that, except where no other interpretation is available, as
in an echo, they are rejected in [Spec,IP]. This idea seems to be borne out
by the contrast mentioned to me by Paul Hirschbühler (p.c.) between
the multiple interrogation in (50) and the more complex one in (14)
repeated here as (51).
(50) ?? Quoi traînait où?
        what lay around where
        'What was lying around where?'
26 (47) and (48) are open to object interpretations too; perhaps the fact
that the object interpretation is more prominent in (i) than in (47) accounts
for the fact that fewer speakers accepted it.
(i) Que démontre que l'économie se redresse?
    what shows that the economy is re-establishing itself
    'What shows that the economy is recovering?'
(51) ? Qui a dit que quoi traînait où?
       who has said that what lay around where
       'Who said that what was lying around where?'27
In (51) the context provides strongly for an interpretation in which the
answer(s) to 'what' must be selected from a previously delimited set,
much as is the case with 'which X' in English, which has been claimed
to be associated with a necessarily D-linked interpretation. Of course,
to determine whether this explanation is really on the right track much
more detailed informant work would be required. However, the fact that
many speakers will accept quoi as a subject on an echo interpretation is
further suggestive of this view, since these are clearly specific. In
addition, the fact that long-distance questions where que can escape
[Spec,IP] are possible lends strong support to this view. Further,
questions with an expletive subject, where que does not need to transit
through [Spec,IP], are correctly predicted to be good under the Minimal
Approach since when [Spec,IP] is filled by a non-wh-element, just as in
object or adjunct questions the Wh Criterion cannot be satisfied without
subsequent movement.28
Finally, whether it is ultimately correct to regard periphrastic
questions like (52) as genuinely long-distance or not, they clearly differ
from simple questions in their propositional force, which in many cases
is a diagnostic of specificity. Thus in both English and French, (52)
but not (53) presupposes that something did indeed happen.
27 The ambiguity which appears in the English gloss if the
complementiser is omitted here is not a factor in the French where embedded
finite complementisers may be omitted only in interrogative clauses. The
alternative interpretation of the English gloss would have to be rendered as
in (i).
(i) Qui a dit ce qui traînait où?
    who has said it that lay around where
    'Who said what was lying around where?'
28 These questions do suggest, however, that seeing strong features as
categorial requirements only cannot be quite right. If it were, one would
wonder why an expletive could not satisfy the requirement. This leads us
back to a more traditional approach in which the element to be checked
against the strong feature must bear compatible wh-features.
(52) Qu'est ce qui s'est passé?
     what is it that is happened
     'What was it that happened?'
(53) Que s'est-il passé?
     what is-it happened
     'What happened?'
There remains work to be done on fleshing out the idea presented
here but I am aware of only one problem with it. Pesetsky (1987)
claims that elements like 'what the hell' are strongly non-D-linked.
However, some speakers have been found to accept the following.
(54) Que diable te faisait imaginer que je serais chez moi à
     what devil you made imagine that I would be house-my at
     cette heure-là?
     that hour-there
     'What on earth made you think I'd be home at that time of day?'
I leave the resolution of this problem to further research.
4. Conclusion
In this paper we have seen that French questions possess a number of
peculiarities which have major implications for our understanding of
Wh Movement and how it is to be motivated within current syntactic
theory. I have proposed a number of revisions to Rizzi's approach to
questions to bring it into line with current thinking, arguing in line
with Chomsky (forthcoming) that checking is a one-way mechanism, at
least with respect to wh-features. I have argued that the revisions
proposed to Rizzi's theory help us to explain in part the restrictions on
que questions which have been so widely discussed in the literature on
French syntax. These revisions alone do not suffice, however; there is a
further constraint on the position of que, which I have proposed is a
strongly non-specific indefinite barred from terminating in [Spec,IP].
The impossibility of quoi subject questions is thus accounted for
without a requirement that subject question-words move and is perfectly
compatible with a Minimal Approach to Wh Movement, contra Rizzi
(1991). The impossibility of que subject questions, on the other hand, is
attributed to economy considerations but their equivalents with
expletive subjects are correctly predicted to be possible. Rather than
invalidating the Minimal Approach then, French 'what' questions
actually lend support to it.
REFERENCES
Aoun, J. and Y-h A. Li (1993) Wh- elements in situ: syntax or LF?,
Linguistic Inquiry 24.199-238.
Baker, C.L. (1970) Notes on the description of English questions: the role
of an abstract question morpheme, Foundations of Language 6.197-219.
Belletti, A. and L. Rizzi (1995) (eds.) Parameters and Functional Heads:
Essays in Comparative Syntax, Oxford: OUP.
Chomsky, N. (1993) A minimalist program for linguistic theory. In Hale, K.
and J. Keyser (eds.) The view from Building 20, 1-52, Cambridge:
MIT Press.
Chomsky, N. (forthcoming) Categories and transformations, Chapter 4 of
The Minimalist Program, Cambridge: MIT Press.
Chomsky, N. and H. Lasnik (1977) Filters and control, Linguistic Inquiry
8.3
Friedemann, M-A. (1991) Propos sur la Montée du Verb en C° dans Certaines
Interrogatives Françaises, Mémoire de Licence, Université de
Genève.
Goldsmith, J. (1978) Que, c'est quoi? que, c'est QUOI, Recherches
linguistiques à Montréal, 1-13.
Goldsmith, J. (1981) Complementizers and root sentences, Linguistic
Inquiry 12.554-573.
Higginbotham, J. and R. May (1981) Questions, quantifiers and crossing,
The Linguistic Review 1.41-80.
Hirschbühler, P. (1978) The Syntax and Semantics of Wh Constructions,
Bloomington, Indiana: IULC.
Kayne, R. (1975) French Syntax: The Transformational Cycle, Cambridge:
MIT Press.
Kayne, R. (1976) French relative que. In Lujan, M. and F. Hensey (eds.)
Current Studies in Romance Linguistics, 255-299, Washington:
Georgetown University Press.
Kiss, K. (1993) Wh-Movement and specificity, Natural Language and
Linguistic Theory 11.85-120.
Koopman, H. (1982) Theoretical implications of the distribution of quoi,
Proceedings of the North Eastern Linguistics Society, 153-62,
Amherst, MA: GLSA.
Léard, J-M. (1982) Essai d'explication de quelques redoublements en syntaxe
du québécois; l'interrogatif-indéfini, Revue québécoise de
linguistique 2, Montréal: UQAM.
Lefebvre, C. (1982a) (ed.) La Syntaxe Comparée du Français Standard et
Populaire, 2 vols., Québec: Office de la Langue Française.
Lefebvre, C. (1982b) Qui qui vient ou Qui vient: voilà la question, in
Lefebvre, C. (1982a).
May, R. (1985) Logical Form: Its Structure and Derivation, Cambridge: MIT
Press.
McDaniel, D. (1989) Partial and multiple Wh-Movement, Natural Language
and Linguistic Theory 7.565-604.
Obenauer, H-G. (1976) Études de syntaxe interrogative du français. Quoi,
combien, et le complémenteur, Tübingen: Niemeyer.
Obenauer, H-G. (1977) Syntaxe et interprétation: que interrogatif, Le
Français Moderne 45.305-341.
Pesetsky, D. (1987) Wh-in-situ: movement and unselective binding. In
Reuland, E. and A. ter Meulen (eds.) The Representation of
(In)definiteness, 89-129, Cambridge: MIT Press.
Plunkett, B. (1993) Subjects and Specifier Positions, University of
Massachusetts doctoral dissertation, Michigan: University
Microfilms (also to be published by GLSA: Amherst).
Plunkett, B. (1994) The minimal approach to Wh Movement, ms.,
University of York.
Sportiche, D. (1994) Clitic constructions, ms. UCLA.
Rizzi, L. (1989) Relativized Minimality, Cambridge: MIT Press.
Rizzi, L. (1991) Residual Verb Second and the Wh-Criterion, Technical
Reports in Formal and Computational Linguistics, no. 2, Faculté
des Lettres, Université de Genève; republished in Belletti and Rizzi
(1995).
Stroik, T. (1995) Some remarks on superiority effects, Lingua 95.239-258.
Williams, E. (1986) A reassignment of the functions of LF, Linguistic
Inquiry 17.265-299.
EVENT STRUCTURE AND THE
BA CONSTRUCTION*
Catrin Sian Rhys
University of Ulster at Jordanstown
1. Introduction
The controversy surrounding the ba construction within Chinese
linguistics concerns the semantic content of ba and its relation to the
matrix verb. On the one hand, it is argued to be a full lexical
preposition, independently assigning a thematic role to its complement
(Li 1985, Cheng 1986). On the other hand, it is claimed to be a dummy
Case marker with no semantic content, inserted to license the direct
object of the verb (Huang 1982, Goodall 1987). Constraints on ba and
the interaction of ba with more general syntactic constraints in Chinese
have the effect that the well formedness of ba fronting ranges from
obligatory through preferred and optional to ill-formed. In its simplest
form, however, the ba construction is an optional mechanism for
fronting the object of a transitive verb:
(1) a. ta sha le fuqin.
       he kill ASP father
       He killed his father.
    b. ta ba fuqin sha le.
       he ba father kill ASP
       He killed his father.
Author's address: School of Behavioural and Communication Sciences,
University of Ulster at Jordanstown, Newtownabbey, Co Antrim BT37 0QB.
York Papers in Linguistics 17 (1996) 299-332
Catrin Sian Rhys
Under early assumptions in GB, the conclusion that the ba object
was moved also forced the conclusion that ba itself was a semantically
empty dummy Case marker inserted at S-structure, because of the Theta
Criterion. Previous analyses have therefore tended to concentrate on the
properties of the movement operation and the contexts in which it was
obligatory.
With the advent of theories of functional heads, ba can be viewed as
a base generated functional head with independent semantic properties
but crucially no thematic grid. The constraints on the licensing of the
ba construction then move to centre stage, as the properties of the
functional head and its complement are determined. This is the approach
taken in this paper. Ba is given a novel analysis in which it interacts
with the thematic structure of the matrix verb via a system of thematic
mediation, but more importantly, it interacts with event structure via
the hierarchy of aspectual roles proposed in Grimshaw (1990). This dual
interaction allows us to capture both the formal aspects of ba, which have
led to its treatment as a dummy Case marker, and the interpretive
effects of ba, which have led to its analysis as a thematic head.
Furthermore, I show that the analysis developed here has some
interesting results for the argument structure of the ba construction, in
addition to the desired effect of accounting for the relation between an
affectedness constraint on the DP following ba, and the aspectual
restrictions on the verb phrase in the ba construction.
Before investigating the constraints on the licensing of ba, the
structure assumed for the ba construction is outlined along with some
motivating data.
2. What is the structure of the ba construction?
The first observation to be made about the ba construction is that the
apparent object of ba canonically gets its thematic role from the verb
and appears in the post verbal complement position, as shown in the
simple ba construction given in (1b) which relates to the canonical
order in (1a) (repeated here):
(1) a. ta sha le fuqin.
       she kill ASP father
       She killed her father.
    b. ta ba fuqin sha le.
       she ba father kill ASP
       She killed her father.
This suggests that ba is not a thematic role assigner and that the
apparent object of ba is not a complement of ba, or at least is not
assigned a thematic role by ba. This suggestion is strengthened by the
observation that ba and its apparent object do not behave as a
constituent with respect to movement. The following examples show
that they cannot appear either postverbally, or sentence initially, or
outside VP.1
(2) a.
*ying lin sha le ba muqin.
Ying Lin kill ASP ba mother
b.
*ba muqin ying lin sha le.
Ba mother Ying Lin kill ASP
c.
*ying lin ba muqin zuotian yong dao shasi le.
Ying Lin ba mother yesterday use knife kill ASP
It should be noticed in this context that the apparent object of ba is
licensed to appear in all the above positions without ba. It can also
even appear in the preverbal ba position without ba, which suggests
that in addition to not being a thematic role assigner, ba is not simply
an inserted Case assigner.2
1 See Y. H. A. Li (1985: 373) for more detailed argumentation that ba
occupies a position within VP.
2 Although of course an alternative interpretation of this fact is that when
the object does appear in the ba position without ba, there is a null Case
assigner, carrying the focus interpretation of the construction. However,
the question of Case assignment in Chinese is not one I wish to address in
this paper (see Rhys 1992). It has also been pointed out to me by a reviewer
that it is not clear that the unmarked preverbal object is in fact in the same
position as ba, since interaction with adverbials points to the unmarked
preverbal object being outside VP.
If ba and its apparent object do not form a constituent, what, then,
is the constituent structure involved? An important observation in this
case is that ba imposes aspectual restrictions on the VP that follows it.
So the following example is ruled out because the VP is stative and not
perfective as required by ba.3
(3) *wo ba ta ai.
    I ba her love
This relationship of ba to the VP, and the fact that it does not
assign a thematic role to its apparent object, point to a structure in
which the actual complement of ba is in fact the VP. Indeed ba does
appear to behave like other functional heads that have a VP
complement, in that the position of ba is fixed, as shown in (2), and
iteration of ba is not licensed. Hence in the following example, either
object of the double object verb jiao 'spray' can be ba fronted, but not
both:
(4) a. ta ba hua jiao le shui.
       he ba flowers spray ASP water
       He sprayed the flowers with water.
    b. ta ba shui jiao le hua.
       he ba water spray ASP flowers
       He sprayed the water on the flowers.
    c. *ta ba hua ba shui jiao le.
       he ba flowers ba water spray ASP
       He sprayed the flowers with water.
In addition, reduplication of ba in the A-not-A structure, as in (5),
shows that it is a verbal head in the verbal projection since only verbs
3 This is a simplification of the aspectual restrictions as will become clear
below.
can be negated by the negative particle bu that appears in the A-not-A
reduplication:4
(5)
ni ba bu ba shu gei ta?
you ba not ba book give her
The evidence thus points to the following structure in which ba is
a functional head with a VP complement. The apparent object then
appears in the specifier of the VP complement governed by ba, but not
theta marked by ba.5 Henceforth this DP will be referred to as the ba
DP, and not the ba object.
(6) [baP [ba' ba [VP DP [V' V XP]]]]
The relation between ba and the ba DP, is taken to be one of
thematic mediation (see Rhys 1992 for motivation for such an
analysis). The idea of thematic mediation comes from Grimshaw's
discussion of the role of the prepositions to and of in licensing the
4 It has been pointed out by a reviewer that prepositions such as gen 'with'
might also arguably be negated by bu. In Rhys 1992, however, I have
argued that precisely this set of putative prepositions are in fact also verbal
functional heads interacting with the thematic structure of the matrix verb.
5 Note that this rules out adoption of any simple view of the VP internal
subject hypothesis of Koopman and Sportiche 1991. For discussion of this
see Rhys 1992.
arguments of nominals (Grimshaw 1990: 71). This idea is developed in
Adger and Rhys (forthcoming), in which lexical heads have both
argument structure and thematic structure and the Generalised Theta
Criterion requires that thematic roles be assigned to arguments. In this
approach, a thematic mediator is a functional head with argument
structure but no thematic structure, which licenses a thematic role from
a lexical head which either has no argument structure (e.g. nominals),
or has an argument saturated by something other than the thematic role
(e.g. nominal gerunds). It is this relationship of thematic mediation
(and the a-role structure of ba to be discussed below) that gives the
appearance of constituenthood to ba plus the ba DP, and yields the
adjacency requirement of ba and the following VP, ruling out certain
kinds of typical VP behaviour, e.g. coordination, VP-initial adverbs,
etc.
3. Aspect and the constraints on ba fronting
With the exception of Cheng (1986), early accounts (e.g. Huang 1982)
have concentrated on the structural properties of ba, and the contexts in
which it is obligatory. The constraints on ba fronting have been
assumed to be peripheral; a matter of semantics or even pragmatics.
These accounts have therefore not attempted to explain the
ungrammaticality of examples such as:
(7)  * wo ba yige qianbao shi le.
       I ba a purse find ASP
(8)  * wo ba ta ai.
       I ba her love
(9)  * wo ba ji kanjian le.
       I ba chicken saw ASP
(10) * wo ba qian you.
       I ba money have
The unacceptability of (7) relates to the definiteness of the ba DP,
which is generally claimed to be necessarily definite, but in this
example is marked as indefinite by the indefinite article yige. The
problem in (8) is one of aspect: ba fronting is not licensed when the
verb constellation is stative. Both (9) and (10) are generally explained in
terms of an affectedness restriction on ba DP, although (10) also does
not meet the aspectual constraints on ba since the verb you 'have' is
clearly stative.
Ba also interacts with the Postverbal Constraint (Huang 1982), the
syntactic constraint on word order that makes object fronting obligatory
when another constituent, whether complement or adjunct, appears in
the postverbal position:
(11)
a.
wo ba ta mian le zhi.
I ba him cancel le job
I fired him.
b.
*wo mian le zhi ta.
I cancel le job him
c.
*wo mian le ta zhi.
I cancel le him job
Thus ba fronting may be obligatory (under the Postverbal
Constraint), optional (in the simple ba construction as in (1)),
ungrammatical (with certain aspectual classes), or preferred (in the
resultative constructions to be discussed below).
Earlier GB accounts have generally acknowledged these descriptive
generalisations about the ba construction but have taken the constraints
on ba to be outwith the scope of a syntactic account. In the case of the
definiteness restriction, it is certainly the case that this restriction is not
specifically a property of the ba construction. Firstly, it is a more
general property of word order in Chinese that preverbal NPs have a
definite or specific interpretation whereas postverbal NPs have an
indefinite interpretation. Thus in the case of ergative verbs where the
subject is licensed either preverbally or postverbally, the difference in
interpretation between the two subject positions is one of definiteness
(examples from Sybesma 1992):
(12)
a.
tankeche lai le.
tanks come le
The tanks have come.
b.
lai tankeche le.
come tanks le
There are some tanks coming.
It might also be argued that this definiteness restriction is the effect
of the communicative function of ba which is to mark the object as
'given' information (Li 1971).6 The aspectual restrictions and the
affectedness restriction, on the other hand, should form an integral part
of the analysis of ba licensing. Furthermore these two types of
restrictions intrinsically interact. Cheng (1986) also acknowledges a
connection between the notion of affectedness and the aspectual
structure of the verb phrase. In her account, however, there is nothing
inherent in either restriction from which this connection is derived. It is
simply stated in terms of feature cooccurrence. Other than Sybesma
(1992) whose analysis is discussed below, the only attempts to capture
the affectedness restriction (Huang 1991, Cheng 1986) assume that
there is a theta role <Affected Theme>.
In this paper, I suggest that the affectedness condition is not the
consequence of a thematic role <Affected Theme>, nor is it a subclass
of the thematic role <Theme>. Instead, based on Grimshaw (1990), I
propose that it derives from an independent hierarchy of semantic roles
distinct from thematic roles. Furthermore this second hierarchy is
derived from the aspectual structure of the verb constellation. The
interaction of the two restrictions on ba therefore derives from this
relationship between the semantic hierarchy and aspectual structure.
6 A reviewer has pointed out that the definiteness effects in the ba
construction appear to be much more robust than for other preverbal DPs,
and that the explanation for this may well lie in event structure of the ba
construction, which would fit well with the general approach developed
here.
3.1. Aspectual classes and an aspectual ontology
Since Vendler (1967), it has been generally acknowledged that the
classification of predicates into aspectual classes accounts for their
different behaviour with respect to temporal adverbials and aspect
markers. Dowty (1979) details a number of diagnostics for determining
aspectual class, and shows that the aspectual class of a clause can be
influenced by the arguments of a verb as well as by the verbal
constellation. Examples of the four aspectual classes given by Vendler
and Dowty are as follows:
state:           know, love, be tall
activity:        run, walk, drive a car
accomplishment:  kill, paint a picture, build a house
achievement:     recognise, reach, die
States relate to the traditional stative/non-stative distinction, a
distinction which is maintained between states and the other classes, so
that the general term for an aspectual class is eventuality, reserving the
term event for the non-stative aspectual classes. Among the events,
accomplishments and achievements differ from activities in that they
have an inherent endpoint, a property often termed telicity. This
telic/atelic distinction leads to a distinction in past tense aspects
between completion and termination (Smith 1991). A telic verb with
its inherent endpoint typically involves completion: the event John ran
to the shops ends when John reaches the shops. An activity, an atelic
verb with no inherent endpoint, simply terminates: John ran. Activities
and accomplishments differ from achievements in that they involve
duration.
Moens and Steedman (1988) develop an ontology of events based
on the event structure template of (13), which gives the internal
structure of an event. Their proposal is that the different aspectual
classes map differently onto this template. The telic property of
accomplishments and achievements, mentioned above, is captured by a
mapping involving both the culmination and consequent state, the
difference between them being that the accomplishment also involves a
preparatory process.
(13)
[Figure: event structure template consisting of a preparatory process, a culmination, and a consequent state]
Hence, the achievement reach the top maps as in (14), where the event
involves the culmination, i.e. reaching the top, and the consequent state
of being at the top.
(14)
[Figure: the template of (13), with the achievement mapped onto the culmination and the consequent state]
The accomplishment build a house, by contrast, involves the preparatory
process of building in addition to the culmination, the completion of
building, and the consequent state, the existence of the house, as in (15).
(15)
[Figure: the template of (13), with the accomplishment mapped onto the preparatory process, the culmination, and the consequent state]
An activity such as run, on the other hand, involves neither
culmination nor consequent state, but just the preparatory process part
of the template:
(16)
[Figure: the template of (13), with the activity mapped onto the preparatory process only]
The difference between termination and completion can now be
reformulated as the difference between an event which culminates
(completion) and an event that ends before culmination (termination).
Moens and Steedman add a further event type to the traditional three:
the punctual event. This is an instantaneous event which involves only a
culmination, with neither a preparatory process nor a consequent state,
for example sneeze.
The relationship between the subevents in this template, Moens
and Steedman argue, is neither directly temporal nor causal (as proposed
in Dowty 1979). Rather they show that it is a relation of contingency.
In the analysis below, Moens and Steedman's system is adopted as it
renders the internal structure of an event transparent, and offers a
straightforward approach to the compositional building up of an event.
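The mapping of event types onto the template described in this section can be sketched in a few lines of Python. This is only an illustrative sketch and not part of Moens and Steedman's formalism: the names Sub, TEMPLATE, is_telic, has_duration and can_complete are hypothetical, and the identification of duration with the presence of a preparatory process is an assumption made for the illustration.

```python
from enum import Enum


class Sub(Enum):
    """The three parts of the event structure template in (13)."""
    PREP = "preparatory process"
    CULM = "culmination"
    CONS = "consequent state"


# Mapping of each event type onto the template, following the prose above.
TEMPLATE = {
    "activity": {Sub.PREP},
    "accomplishment": {Sub.PREP, Sub.CULM, Sub.CONS},
    "achievement": {Sub.CULM, Sub.CONS},
    "punctual": {Sub.CULM},
}


def is_telic(event_type):
    """Telic types map onto both the culmination and the consequent state."""
    parts = TEMPLATE[event_type]
    return Sub.CULM in parts and Sub.CONS in parts


def has_duration(event_type):
    """Assumption: duration corresponds to having a preparatory process."""
    return Sub.PREP in TEMPLATE[event_type]


def can_complete(event_type):
    """Completion requires a culmination; without one the event only terminates."""
    return Sub.CULM in TEMPLATE[event_type]
```

On this sketch, is_telic holds of accomplishments and achievements only, and an activity such as run can terminate but cannot complete.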
3.2. Grimshaw's aspectual roles
Grimshaw (1990), in an account of psychological predicates, suggests
that there is a dimension of semantic analysis independent from
thematic structure which is essentially causal in nature. The two classes
of psychological predicates are represented by frighten and fear which
have the same thematic analysis but are distinguished along this
dimension: frighten is causative whereas fear is stative. The importance
of this for Grimshaw is that it provides insight into the argument
realisation of the two verb classes. In particular, it sheds light on the
question of why, in the frighten class of predicates, the Theme is
realised as the subject despite being lower on the thematic hierarchy.
This fact now falls under the broader generalisation that cause
arguments of causative predicates are always subjects. The causal status
of arguments is thus indicative of an independent dimension of
prominence relations that is distinct and autonomous from the thematic
dimension:
(17)
(Cause(other( )))
It is the alignment (or misalignment) of arguments across the thematic
dimension and this causal dimension that yields differing behaviour in
relation to argument realisation.
The contentful notion of cause, however, is too narrow. Neither
agentive predicates, nor unergative predicates, nor psychological
predicates show any of the effects of the misalignment of the two
semantic dimensions, so their subjects must have some property in
common which qualifies them for maximal prominence on the causal
dimension. They are not however causatives. How then is this second
dimension defined? Grimshaw suggests that the answer lies in the event
structure of the predicates and that the dimension is aspectual in nature.
Adopting a Vendler/Dowty approach to event structure, Grimshaw
suggests that aspectual prominence derives from participation in the
subevents of a complex event. For example, an accomplishment such
as break is a complex event which breaks down into an activity and a
state, which in Moens and Steedman's terms, are the preparatory process
and the consequent state. (The Dowty/Vendler system does not separate
the consequent state from the culmination.)
(18)
             Event
            /     \
     Activity      State
  (prep. proc.)   (conseq. state)
Under such an analysis, the cause argument is always associated
with the first subevent, the preparatory process. Grimshaw generalises
this to the claim that the argument that participates only in the first
subevent of a complex event is aspectually more prominent than an
argument that is associated with both or only the second subevent. I
shall continue to refer to the aspectual role (a-role) assigned to that
argument as <Cse>, although it should be understood that the causal
interpretation stems not from the a-role itself but from the contingency
relation between the two subevents of the complex event, i.e. it is in
some sense epiphenomenal.
3.2.1. Aspectual roles in Chinese
Is there any evidence for this independent aspectual hierarchy in
Chinese? The causal interpretation of (19) suggests that there is:
(19)
wo ba tade chuangkou da-po le.
I ba her window hit-broken ASP
I broke her window.
The verb complex in this example, da-po, is a resultative
compound formed from the two verbs da and po. The verb da means
'hit' and has as its core theta roles Agent and Theme, neither of which
has a causal interpretation:
(20)
wo da le tade chuangkou.
I hit le her window
I hit her window.
The verb po is an intransitive verb roughly translating as 'broken', with
the single theta role Theme:
(21)
tade chuangkou po le.
her window broken le
Her window is broken.
If we assume that the thematic structure of the compound da-po
'break' derives from the thematic structure of its two component verbs,
then the overall thematic structure of the compound will be <Agent,
Theme>, that is identical to the thematic structure of da 'hit', where the
Theme of da 'hit' has identified with the Theme of po 'broken'. The
compound, however, has a causative interpretation that is absent from
either of the component verbs. This suggests that the interpretation of
the subject of the compound as a Cause cannot be thematic. Turning to
the event structure, on the other hand, we find that the compound is an
overt realisation of the preparatory process-consequent state structure, in
which the Agent is a participant of only the preparatory process, hence
is assigned Grimshaw's a-role, <Cse>. Note that the object in (19) has
an affected interpretation that is similarly absent in (20) and (21). This
suggests that affectedness should also not be analysed as a property of
the thematic grid as Huang and Cheng have both assumed, but derives
from the aspectual dimension. This is the hypothesis addressed in the
next section.
3.3. Affectedness, the aspectual dimension and ba
The first step in the hypothesis is to look to event structure for a
participant that will be interpreted as affected. If this is the case then as
well as the a-role <Cse>, we can define a second a-role <Aff>, and the
aspectual hierarchy will be specified as:
(22)
(Cause(Aff))
Consider the predicate kill in the sentence: John killed the cat.
Here John is the <Cse> and the cat receives an interpretation as the
affected object. If we turn now to the event structure of the predicate, we
find that it is an accomplishment comprising a preparatory process,
killing, and a consequent state, being dead. In particular we find that
while John is the participant only of the preparatory process, and hence
is assigned the a-role <Cse>, the cat is the sole participant of the
consequent state. This points to a definition of the a-role <Aff> as the
participant of a consequent state. If we look now at the Chinese
translation of 'kill' the same appears to be true.
(23)
Zhangsan sha le xiaomao.
Zhangsan kill ASP cat.
Zhangsan killed the cat.
Assuming that sha has the same lexical event structure as its
English translation, Zhangsan is the Agent of the preparatory process
and xiaomao is the participant in the consequent state. Thus, we
find again that the notions of cause and affected correlate with these
roles in the event structure. We can, therefore, abstract away from the
contentful notions of Cause and Affected and work in terms of aspectual
subevents and their associated participants. Under this approach, we can
now reformulate the affectedness constraint on ba in terms of event
structure and aspectual roles. More precisely the ba DP can be viewed as
the participant of a consequent state in a complex event. Thus the
object of (23) can appear as a ba DP, whereas this is not possible with
a verb such as ai 'love' that is a state and not a complex event:
(24)
Zhangsan ba xiaomao sha le.
Zhangsan ba cat kill ASP
Zhangsan killed the cat.
(25)
*Zhangsan ba xiaomao ai.
Zhangsan ba cat love
This seems to be a step in the right direction because it does look
as though event structure rather than a contentful role is what is
relevant. So in the following example, the object could not be said to
be affected in any way, and yet ba fronting is licensed:
(26)
ta ba yaoshi diu-le.
he ba key lose ASP
He lost the key.
The claim that ba picks out the participant of the consequent state
in a complex event entails that a verb like diu 'lose' must be argued to
be a complex event, having a consequent state, 'lost', that is predicated
of the ba DP. Evidence for this comes from adverbial modification. If
(26) is modified by an adverb of duration sange xiaoshi 'for three hours',
the only interpretation available is that the consequent state of the key
being lost lasted for three hours:
(27)
ta ba yaoshi diu-le sange xiaoshi.
he ba key lose ASP three hours
He lost the key for three hours.
In fact, a comparison between the verbs that do allow ba fronting
with the ones that do not, indicates that the feature that distinguishes
the verbs that allow ba fronting is that their event structure involves a
consequent state when the verb is combined with the aspect marker le
(le is ambiguous between termination and completion). Examples are
verbs such as chi 'eat', xi 'wash', si 'tear up', wang 'forget', pian 'cheat'.
The verbs that do not allow ba fronting on the other hand all seem to be
either states such as renshi 'know', or atelic processes such as ting
'listen', which either do not perfectivise (in the case of states) or involve
only termination where the perfective le is licensed. The following are
examples of verbs that do not generally license ba fronting: tui 'push',
shang 'go up', dai 'carry', xihuan 'like'.
3.4. V-V compounds, consequent states and ba
The idea that ba picks out the participant of the consequent state of a
complex event is supported by data from V-V compounds. There are
two kinds of V-V compounds, conjunctive and resultative (Li 1990).
The conjunctive ones are like bangzhu, where both halves of the
compound mean help. They are all either punctual or processes, and do
not break down into subevents. The resultative compounds are like
overt realisations of the preparatory process--consequent state structure
of the lexical complex events. So for example, chi-guang 'eat-empty'
involves the process of eating and the consequent state in which the
bowl is empty, and chi-bao 'eat-full' involves the process of eating and
the consequent state of the eater being full:
(28)
wo chi guang le fan.
I ate empty ASP rice
I ate up all the rice.
(29)
wo chi bao le fan.
I ate full ASP rice
I ate rice and ended up full.
If ba picks out the participant of the consequent state, then we
would expect ba fronting of the object to be licensed with chi-guang
'eat-empty', where the consequent state is predicated of the object fan,
and not with chi-bao 'eat-full', where the consequent state is predicated
of the matrix subject. This expectation turns out to be correct:
(30)
wo ba fan chi-guang le.
I ba food eat-empty ASP
I ate up all the rice.
(31)
*wo ba fan chi-bao le.
I ba food eat-full ASP
Thus we can explain why it is that where the interpretation of the
V-V compound is ambiguous, as with qi-lei 'ride tired', ba fronting is
licensed, but yields only the interpretation where lei 'tired' is predicated
of the object:
(32)
a.
wo qi lei le neipi ma.
I ride tired le that horse
either: I rode that horse and it got tired.
or: I rode that horse and got tired (myself).
but
b.
wo ba neipi ma qi-lei le.
I rode that horse and got it tired.
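The pattern in (28)-(32) can be restated as a small sketch. It assumes, purely for illustration, that each resultative compound is listed with the argument or arguments of which its consequent state may be predicated; the names Resultative, COMPOUNDS, readings and licenses_ba are hypothetical and belong to no existing analysis or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resultative:
    form: str
    # Arguments of which the consequent state may be predicated.
    result_of: frozenset


# Entries follow the judgements reported for (28)-(32).
COMPOUNDS = {
    "chi-guang": Resultative("chi-guang", frozenset({"object"})),
    "chi-bao": Resultative("chi-bao", frozenset({"subject"})),
    "qi-lei": Resultative("qi-lei", frozenset({"subject", "object"})),
}


def readings(compound, ba_fronted):
    """Return the arguments the consequent state can be predicated of."""
    candidates = COMPOUNDS[compound].result_of
    if ba_fronted:
        # ba picks out the participant of the consequent state, so only
        # the object-oriented reading survives ba fronting of the object.
        candidates = candidates & {"object"}
    return sorted(candidates)


def licenses_ba(compound):
    """ba fronting of the object needs an object-oriented consequent state."""
    return bool(readings(compound, ba_fronted=True))
```

Under these assumptions chi-guang licenses ba fronting, chi-bao does not, and qi-lei keeps only its object-oriented reading once ba is present.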
3.5. Aspectual role assignment and functional heads
So far it is claimed that the ba DP occupies a particular position in the
event structure of the clause. This is implemented using Grimshaw's
notion of an aspectual hierarchy. In particular, the ba object must
realise the second most prominent role in the aspectual hierarchy, i.e.
<Aff>. Furthermore, this information must be part of the syntactic
representation of the ba construction. So how can ba be specified to
pick up the second role in an aspectual structure? Recall that ba is
claimed to be a thematic mediator, parallel to the analysis of the
coverbs given in Rhys (1992). It is thus a functional head with a VP
complement, licensing the thematic roles from its VP complement via
its own argument structure. Given this structure, I propose that ba
actually assigns both <Cse> and <Aff>; <Aff> to the DP in the
specifier position of its VP complement, and <Cse> to its own
specifier. In other words, by analogy with thematic roles, it has the
a-role structure (Cse(Aff)).
In fact, I will adopt the strong claim that a-roles are not assigned at
all by lexical heads but only by functional heads such as ba. Thus the
ambiguity in example (32) (repeated here) arises because no a-roles are
assigned:
(32)
wo qi lei le neipi ma.
I ride tired le that horse
either: I rode that horse and got tired.
or: I rode that horse and it got tired.
Since no a-roles are assigned here, neither DP is explicitly marked
as the participant of the consequent state. When ba is projected, it
assigns the a-role Aff which explicitly marks the ba DP as the
participant in the consequent state. Assuming the requirement of the
standard Theta Criterion that all arguments must be assigned a thematic
role, a-role assignment is not sufficient to satisfy the Theta Criterion,
so the ba DP has to receive its thematic role from a lexical head. This
explains the conflict between the apparent semantic content of ba, and
the evidence that the ba DP receives its thematic role from the verb. Ba
does have independent semantic content but it is aspectual and not
thematic. Effectively what ba does, then, is assign aspectual
prominence relations, which interact with the event structure of its
complement. In other words, by virtue of the a-roles that it assigns, ba
requires that the event structure of its complement VP be a complex
event.
This is somewhat different from Grimshaw's approach in that a-roles
here are syntactically and not lexically assigned. In Grimshaw's
approach aspectual prominence relations are a lexical feature on an
argument derived from the lexical representation of the event structure
of a lexical head. In the Chinese data that we are considering here, the
event structure of the predicate is not lexical, but rather is built up
compositionally as part of the syntax. A-roles therefore cannot be part
of the lexical representation of the thematic role assigning head. In fact,
even in Grimshaw's system it transpires that the representation of the
aspectual structure cannot simply be projected from the lexical semantic
representation of the individual predicate, but involves the projection of
an abstract event structure template that breaks down into two
subevents: an activity and a state or change of state. Aspectual
prominence is determined on the basis of participation in this abstract
event template. The difference between the two approaches thus reduces
to the level at which the template applies.
Under this analysis we now have an explanation for the following
difference in interpretation between a sentence with the object in
canonical postverbal position and the corresponding ba construction,
observed by Sybesma (1992).
(33)
wo qi lei le neipi ma.
I ride tired ASP that horse
I rode that horse and it got tired.
(34)
wo ba neipi ma qi lei le.
I ba that horse ride tired ASP
I rode that horse and got it tired.
The difference between the two sentences relates to causativity in
that there is a stronger causal interpretation in the sentence involving ba
fronting. Recall that the relationship between subevents in the Moens
and Steedman template is one of contingency. The semantics of the
resultative compound, however, further specifies the relationship as one
of causation. In example (33), we therefore have a relation of causation
between the preparatory process of riding, and the consequent state of
being tired. However, no a-roles are assigned and the causation is
interpreted as a relation between events. In (34), on the other hand, the
a-roles are explicitly assigned and the causation is a relation between the
participants of the subevents, since the subject is marked as the Agent
of the causation, the Cse, as well as the thematic Agent, and the ba DP
is marked as the Aff. In this way, explicit assignment of the a-roles in a
causal complex event will yield a stronger causal interpretation.
4. V-V compounds and argument structure
Whether in the V-V compound the consequent state is predicated of the
subject or the object of the process or is ambiguous is not a linguistic
issue; it is world knowledge not syntax that tells us that in example
(29) rice cannot be full. The fact that the consequent state has to be
predicated of one of the arguments of the first subevent is however a
matter of syntax. Li (1990) suggests that it is Case restrictions that
force argument identification. However, this fails to account for the
restrictions on licensing (see the discussion in Rhys 1992). Assuming,
however, that identification has somehow been forced, the extension of
Grimshaw's system developed here gives us the argument structure of
the V-V compound. So, for the V-V compound qi-lei 'ride-tired', one
interpretation is that the horse being ridden ends up tired, in other
words, the Theme of ride identifies with the experiencer of tired. I will
represent this as follows, where the indexes attached to the thematic
roles refer to the subevents that the arguments participate in, i.e. 1 is
the preparatory process, and 2 is the consequent state:
(35)
qi lei
Ag-1, Th-Exp-1+2
This means that the Agent is higher in the aspectual structure than the
Theme, because it participates only in the preparatory process. In other
words, in terms of the aspectual hierarchy (Cse(Aff)), the Agent is
compatible with the <Cse> role. The Th-Exp then is the participant of
the consequent state and can be assigned the a-role <Aff>. We thus
capture the fact that ba fronting of the object is licensed under this
interpretation.
So what about the alternative interpretation where the Agent
identifies with the Experiencer?
(36)
qi lei
Ag-Exp-1+2, Th-1
Reading the aspectual prominence relations directly from the indices
assigned to the thematic roles, we find that the change in interpretation
also yields the reverse aspectual prominence relations. It is the Theme
that participates only in the preparatory process, whereas the Agent is
identified with the Experiencer and so participates in both subevents.
The <Aff> aspectual role therefore cannot be assigned to the Theme,
which is now highest on the aspectual rating. The fact that ba fronting
of the object is not available for this interpretation is thus captured.
However, Grimshaw's system for assigning aspectual prominence also
predicts that the Theme should be licensed as subject since it is only
associated with the first subevent, and the specification of ba predicts
that the Agent-Exp should be licensed as a ba object. This is because it
is indexed as the participant of the consequent state and therefore should
satisfy the a-role <Aff>. This prediction holds and the following
example is acceptable:
(37)
ma ba wo qi lei le.
horse ba I ride tired ASP
The horse tired me out riding it.
In fact, this arrangement of thematic and aspectual relations yields
precisely the set of examples which Sybesma calls the causative ba
sentences.
(38)
Zhei-jian shi ba Zhang San ku-lei le.
This-CL case ba Zhang San cry-tired ASP
This thing got Zhang San tired from crying.
(39)
ku-lei
Ag-Exp-1+2, Th-1
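The way aspectual prominence is read off the subevent indices in (35), (36) and (39) can be sketched as follows. This is an illustration under simplifying assumptions, not an implementation of the analysis: the function names assign_a_roles and ba_sentence are hypothetical, and the sketch only handles predicates with exactly one candidate for each a-role.

```python
PREP, CONS = 1, 2  # subevent indices: preparatory process, consequent state


def assign_a_roles(indices):
    """Map argument labels to a-roles from their subevent indices.

    indices: dict from an argument label to the set of subevents it
    participates in, e.g. {"Ag": {1}, "Th-Exp": {1, 2}} for (35).
    """
    # <Cse>: the argument that participates only in the first subevent.
    cse = [arg for arg, subs in indices.items() if subs == {PREP}]
    # <Aff>: the argument that participates in the consequent state.
    aff = [arg for arg, subs in indices.items() if CONS in subs]
    if len(cse) != 1 or len(aff) != 1:
        return None  # outside the scope of this simplified sketch
    return {"Cse": cse[0], "Aff": aff[0]}


def ba_sentence(indices, words, verb):
    """Linearise as: <Cse> ba <Aff> verb le."""
    roles = assign_a_roles(indices)
    if roles is None:
        return None
    return f"{words[roles['Cse']]} ba {words[roles['Aff']]} {verb} le"
```

With the indexing of (35) the sketch yields the order of (34), with the Agent before ba; with the indexing of (36) it yields the order of the causative ba sentence in (37), with the Theme as subject and the Agent-Experiencer as ba DP.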
In fact, under this system we also get some explanation for the
ergativity shift phenomenon that Sybesma discusses. Sybesma argues
that the ba construction involves an abstract CAUS predicate which
gets phonological content either by V raising or by insertion of ba
which he claims is a dummy element. An important feature of his
analysis is the claim that the complement of this abstract CAUS
predicate is ergative. Adopting Hoekstra's (1988) account of
resultatives, Sybesma essentially claims that the resultative V-V
compounds involve at D-structure a matrix verb with a resultative
complement and assumes that the resultative complement triggers a
shift to ergativity in the matrix verb, suppressing the external argument
of the matrix verb. The test for ergativity in Chinese is the postverbal
subject. Hence, while ku 'cry' does not license its subject postverbally
in (40), in the resultative compound ku-lei 'cry-tired', he claims it does:
(40)
*ku le yixie hao ren.
cry ASP some good people
(intended: Some good people cried.)
(41)
ku-lei le yixie hao ren.
cry-tired ASP some good people
Some good people cried themselves tired.
Similarly:
(42)
ku shi le shoujuan.
cry wet ASP handkerchief
The handkerchief got wet from crying.
Under my system, it is no surprise that such examples are ergative.
In the mapping from aspectual structure to argument structure,
Grimshaw argues that the ergative/unergative distinction relates to whether
the single argument predicate maps onto the first or second subevent of
the event template. A single argument predicate that maps on to the
first subevent, the preparatory process, will be unergative, whereas the
single argument predicate that maps onto the second subevent, the
consequent state, will be ergative. In fact, exactly what this predicts for
(41) is not clear, since it maps on to both subevents and the single
argument is associated with both subevents. This is reflected in native
speaker judgements, which are divided over whether (41) necessarily
involves an implicit Cause argument, in which case, the predicate is
not ergative but transitive. In (42) on the other hand, the predictions are
clear. Since the only argument expressed is associated with only the
consequent state, it will be licensed as the internal argument and the
overall predicate will be ergative.
5. Resultative complements
This analysis also carries over to the phrasal resultative using the
particle de. In this construction a consequent state is expressed by a
clause in complement position introduced by de, which is cliticised
onto the matrix verb:
(43)
ta qi de ma hen lei.
she ride de horse very tired
She rode so much the horse got tired.
(44)
ta qi de hen lei.
she ride de very tired
She rode so much she got tired.
In the examples above, there is no matrix object competing with
the resultative complement. Where the matrix object is expressed in
this construction, fronting of the object is obligatory, by the Postverbal
Constraint, as the resultative complement saturates the postverbal
complement position. However, the fronted object can be licensed
preverbally either by ba or by verb reduplication, and the different
licensing mechanisms trigger different interpretations. Adopting
Huang's (1991) insight that these resultative constructions are, at some
level of representation, complex predicates, they are assigned a complex
event structure parallel to the lexically formed V-V compounds. Again
licensing by ba forces the reading where the ba DP is the participant of
the consequent state. Compare:
(45)
wo ba ma qi de lei le.
I ba horse ride de tired ASP
I rode the horse and got it tired.
(46)
wo qi ma qi de lei le.
I ride horse ride de tired ASP
I rode the horse and got tired.
The reason that the resultative construction is important to the
study of ba is that ba fronting of the subject of the resultative
complement is licensed even where the DP in question is clearly an
argument only of the embedded clause and not of the matrix clause:
(47)
wo ku de Zhangsan hen shangxin.
I cry de Zhangsan very sad
I cried so much that Zhangsan was very sad.
(48)
wo ba Zhangsan ku de hen shangxin.
I ba Zhangsan cry de very sad
I cried so much that Zhangsan was very sad.
The matrix verb in these sentences is ku 'cry' which on its own does
not license an object, either in canonical object position or as a ba DP:
(49)
*wo ku le Zhangsan.
I cry ASP Zhangsan
(50)
*wo ba Zhangsan ku le.
I ba Zhangsan cry ASP
The ba DP must therefore be theta marked in the embedded clause. This
is a property only of resultative complements; other embedded clauses
do not permit ba fronting of their subjects. While this is problematic to
explain for purely syntactic accounts of ba, these facts simply fall out
from the aspectual account of ba that I have developed here.
In general there is, for every V-V compound, a corresponding
resultative construction. However, there is a difference in interpretation
between the V-V compound and the resultative construction relating to
causality. In the same way that ba fronting in a V-V compound yields a
stronger causative interpretation than the non-ba fronted form, so the
resultative construction has a stronger causative interpretation than its
V-V compound counterpart:
(51)
a.
wo qi lei le neipi ma.
I ride tired le that horse
I rode the horse and it got tired.
b.
wo qi de neipi ma lei le.
I ride de that horse tired le
I rode that horse and got it tired.
The particle de thus clearly does have some semantic content. In
particular, it has a similar semantic effect to ba. In the following
analysis I adopt Huang's basic intuition that the resultative construction
forms a complex predicate with the matrix verb, but I argue that this is
a property of the event structure and not syntactic as Huang assumes. A
detailed analysis of de resultatives is however beyond the scope of this
investigation. What we are interested in here is the interaction of the
resultative complement with ba and with the event structure of the
sentence.
5.1. Resultative de and event structure
The basic claim here is that de is a functional head which combines
with its complement and with the matrix clause to form a complex
event. More precisely, there is, as part of the semantic representation of
de, a rule that essentially means that de combines two independent
events, to yield one complex event. Using bracketing to mark
subevents this can be represented as shown:
(52)
(e1) de (e2) → (E (e1)(e2))
This captures Huang's intuition that these are complex predicates
without forcing unmotivated abstraction in the syntax. Under this
analysis, it is a complex predicate in that it yields a single complex
event. This interaction of de with event structure is reflected
syntactically in that de is also an a-role assigner assigning the two
a-roles (Cse (Aff)). In fact, it may be possible to derive the rule in (52)
from the a-role structure of de. It assigns the a-role <Aff> to the DP
that it governs in the subject position of the resultative clause, and
assigns the most prominent a-role <Cse> to the subject of the matrix
clause.7 If both de and ba are projected, the a-roles are forced to identify
as they map onto the same complex event. The only difference in
interpretation is one of causality; there is a stronger causal
interpretation when both functional heads are projected. This, as we
have seen, can be attributed to the relationship between causality and
the a-roles assigned. Apart from this, the following have the same
interpretation:
(53)
a.
Zhangsan ku de Lisi hen shangxin.
Zhangsan cry de Lisi very sad
Zhangsan got Lisi sad with his crying.
b.
Zhangsan ba Lisi ku de hen shangxin.
Zhangsan ba Lisi cry de very sad
Zhangsan got Lisi sad with his crying.
These two have the same interpretation because the DPs in
question are assigned the same a-roles. This suggests an explanation for
the following, otherwise confusing, observation. Where the matrix verb
has both a transitive and an intransitive reading but there is no matrix
object, the matrix verb is nonetheless interpreted transitively and the
subject of the resultative is necessarily interpreted as the matrix object:
(54)
Zhejian shi jidong de Zhangsan ku le.
This matter excite de Zhangsan cry le
This matter excited Zhangsan so much that he cried.
not: This matter was so exciting that Zhangsan cried.
7 Note that I am only claiming an aspectual parallel between de and ba.
Hence, we would not necessarily expect parallel behaviours in other
respects. For example, a reviewer has pointed out that while the ba DP must
be overt, the DP following de can be empty. There are a couple of potential
sources for this difference. Huang 1984 shows that empty complements are
in fact instances of wh-movement, whereas empty subjects can be pro.
Furthermore, only ba is a thematic mediator. So essentially, the question
seems to boil down to why a thematically mediated argument cannot be
wh-moved. Note that this is true for all the coverbs which I have argued
should be analysed as thematic mediators in Rhys 1992.
As is seen from the translation, although the matrix verb jidong
'excite' appears to be used intransitively, it must be interpreted
transitively with the meaning excited Zhangsan. This can be understood
as the effect of the a-role assigned to Zhangsan, which is canonically
realised as an object. It also explains the marked preference for the
corresponding ba fronted sentence.
This analysis in terms of a-roles explains both the object
interpretation of the subject of the resultative and the availability of ba
fronting. It also captures the parallel causality effects of the resultative
complements and ba fronting in the V-V compounds.
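The claim that de combines two events into one complex event and assigns the a-roles (Cse(Aff)) can be given a rough sketch. The classes Event and ComplexEvent and the function de below are hypothetical names introduced only for illustration, and representing each event by a predicate and a subject is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    predicate: str
    subject: Optional[str]


@dataclass(frozen=True)
class ComplexEvent:
    prep: Event  # first subevent: the matrix clause
    cons: Event  # second subevent: the resultative clause

    def a_roles(self):
        # <Cse> goes to the subject of the matrix clause, <Aff> to the
        # subject of the resultative clause.
        return {"Cse": self.prep.subject, "Aff": self.cons.subject}


def de(e1, e2):
    """Combine two independent events into one complex event, as in (52)."""
    return ComplexEvent(prep=e1, cons=e2)


# Example (53a): Zhangsan ku de Lisi hen shangxin.
e53 = de(Event("ku", "Zhangsan"), Event("hen shangxin", "Lisi"))
```

Here e53.a_roles() pairs <Cse> with Zhangsan and <Aff> with Lisi, which is the same assignment that ba makes in (53b), so the two sentences receive the same a-roles.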
6. Why do we need to refer to the internal structure of the
event?
Until now, we have been referring to the internal structure of an event.
However, the eventuality involved in the ba structures we have
addressed so far is always an accomplishment with a fixed internal
structure. If this is the case, then do we really need to build so much
structure into the analysis? Or could the analysis simply make reference
to the aspectual category of accomplishment, rather than the consequent
state in a complex event? For example, one could imagine an analysis
in terms of the object of an accomplishment formed by a simplex, or
complex predicate.
One response to the criticism that the account is building more
structure than is necessary might be to point to other linguistic
phenomena that require reference to the internal structure of the event.
Grimshaw's work on argument structure in English discussed above, for
example, requires reference to the internal structure of the event via an
event template. Stronger motivation, however, comes from the ba
construction itself. In the following data, examples are given in which
the ba construction is licensed, but the eventuality involved is clearly
not an accomplishment. Such data would obviously cause problems for
an analysis in terms of accomplishment. However, the internal structure
of the event does involve a consequent state as expected under this
analysis.
6.1. Inchoatives
A frequently observed counterexample to the claim that ba is only
licensed in accomplishments is the following:
(55)
wo ba ta ai shang le.
I ba her love PRT ASP
I fell in love with her.
The aspectual classification of such an utterance is inchoative, where
inchoatives are thought to pick out the beginning part of the event. What
then is the internal structure of an inchoative? Going back to the Moens
and Steedman template, inchoatives are also analysed as involving a
culmination and consequent state.
[Diagram: event template with three parts: preparatory activity, culmination, consequent state]
The difference between the accomplishment and the inchoative is that
the culmination in the inchoative marks the initial bound of the event,
whereas in the accomplishment it marks the final bound (Moens p.c.,
Kamp p.c., Dowty 1979). Thus, in an example such as (55), the
culmination is the falling in love and the consequent state is the being
in love. We can show that the consequent state is indeed part of the
linguistic representation of 'fall in love' by the contradiction in (56),
where the entailed consequent state is negated:
(56)
# I fell in love with her but I never loved her.
Thus the inchoative is clearly shown to involve a consequent state,
which would lead us to expect that ba fronting with inchoatives is
licensed.
6.2. Progressive - zhe
Another apparent counterexample to the descriptive restriction of ba to
bounded events is the use of ba with the progressive marker zhe.
(57)
ta ba yifu bao-zhe.
he ba clothes bundle-PROG
He is bundling up the clothes.
At first blush, such an example appears to be an irredeemable
problem for the account of ba given here. However, appearances can be
deceptive, and in this instance it is the translation of zhe as a
progressive that leads to the deception. In fact a much more appropriate
translation would be as a resultative along the lines of 'He has the
clothes bundled up' with the resultative particle 'up'. In fact, Carlota
Smith argues very convincingly that 'in its basic meaning -zhe is a
resultative stative' (Smith 1994: 122).
The common representation of zhe as a progressive stems from its
additional use as a backgrounding particle, in examples such as the
following:
(58)
Xiao Li zuo zhe kan shu.
Xiao Li sit zhe read book
Xiao Li is reading sitting down.
In this use zhe loses the resultative interpretation, and has a simple
activity reading with no internal structure at all. If the analysis of ba
given here is correct, we would predict then that ba fronting with the
backgrounding use of zhe is not licensed. And indeed, the data in (59)
shows that this is the case:
(59)
*Xiao Li ba yifu bao zhe chang ge.
Xiao Li ba clothes bundle zhe sing song.
Thus again we find that it is the specification of consequent state that is
crucial to the distribution of ba.
6.3. Directionals
An additional interesting result arises with examples such as the
following from Wang (1987):8
(60)
ta zhengzai ba chuan wang shui li tui
she now ba boat towards water in push.
She's pushing the boat into the water.
It has generally been assumed since Vendler (1967) that an
activity verb with a goal yields an accomplishment, e.g. run to the
park, whereas an activity verb with a directional adverb or complement
remains an activity, and this can be tested for using Dowty's time
adverbial tests, where in-adverbials are appropriate with
accomplishments but not with activities. Hence:
(61)
a. Michelle drove to the university in five minutes flat.
b. ?Michelle drove towards the university in five minutes flat.
Activity verbs with directionals are not, however, straightforward
activities, hence the oddness of (62a) as compared to (62b):
(62)
a. ?Michelle drove towards the university for five minutes.
b. Michelle drove around the university for five minutes.
(62a) is by no means ill-formed but does seem to require some
contextual explanation, hence the improvement in (63):
8 Note that this example provides counterevidence to the common
assumption that ba fronting is not licensed with monosyllabic verbs, based
on examples such as the following:
(a) *wo ba ni sha.
I ba you kill.
This is judged as unacceptable, but becomes acceptable combined with the
aspectual particle le. This is not, in fact, a question of syllabicity, but rather
of event semantics, since the same expression is licensed in a conditional:
(b) ruguo wo ba ni sha,
If I ba you kill, ...
Thus, the explanation for (a) will be in terms of event semantics and
compatible with the approach to ba developed here.
(63)
Michelle drove towards the university for five minutes before
changing her mind and turning back.
We can begin to get a handle on the difference between the simple
activity in (62b) and the activity plus directional in (62a), by referring
again to Moens and Steedman's event template:
[Diagram: event template with three parts: preparatory activity, culmination, consequent state]
The simple activity in (62b) involves just the first part of the
template, the activity part, and terminates, but has no culmination, as
follows:
[Diagram: the same event template (preparatory activity, culmination, consequent state) with only the preparatory activity part marked]
The activity plus directional also refers to the activity part of the
template, but in addition it provides information about the consequent
state that would be reached if the event culminated rather than simply
terminating. That is, although a presupposition of (62a) is that
Michelle does not end up at the university, it is also true to say that
part of the meaning of (62a) is that if the activity of Michelle driving
towards the university does not terminate, then there is an inherent
culmination point, the arrival at the university, and the consequent state
of being at the university. In other words, the consequent state is not
entailed but can be inferred, and clearly must be part of the
representation of a directional expression.
Accounting for (60) therefore means that we must extend the
analysis of ba to incorporate not just consequent states that are entailed
by the event structure but also ones that can be logically inferred. This
might seem like an undesirable weakening of the initial analysis.
However, closer examination of the aspectual classes in Chinese
suggests that this is necessary to account for simple lexical
accomplishments.
The question of the existence of lexical accomplishments in
Chinese is controversial. Based on the following examples, Tai (1984)
and Heinz (1984) both argue that in Chinese there is no
grammaticalisation of telicity; that is, that the culmination and
consequent state that are the defining features of accomplishments are
not part of the lexical meaning of verbs such as sha 'kill'.9
(64)
wo sha le ta liang ci dou mei si.
I kill ASP her 2 times all not die
I tried to kill her twice but she didn't die.
(65)
Zhangsan xue-le Fawen, keshi mei xue-hui.
Zhangsan learn le French but not learn-able
Zhangsan studied French but never learnt it.
(66)
wo mai le sanben shu, keshi mei mai-dao.
I buy le three books, but not buy-arrive
I tried to buy three books but didn't manage to.
Smith (1990) argues that these verbs are telic but that the
perfective particle le in Chinese does not have the same interpretation as
perfective in a language such as English, but is ambiguous between
termination (no culmination) and completion (culmination). An
alternative approach which avoids the disjunctive analysis of le is to
argue that the aspectual structure of a lexical accomplishment in
Chinese does include a culmination and a consequent state but that the
consequent state is not an entailment of the verb and hence is defeasible.
The relevance of this problem here is that ba fronting is licensed,
showing that the consequent state required by ba need not be an
entailment of the predicate:
9 Native speaker judgements on these examples vary enormously. They are
given here in order of decreasing acceptability with only the first being
universally accepted.
(67)
wo ba ta sha le liang ci dou mei si.
I ba her kill ASP 2 times all not die
I tried to kill her twice but she didn't die.
Returning to the example in (60), there would seem then to be
independent motivation that a consequent state that is inferrable from
the directional expression is sufficient to license ba.
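The licensing condition argued for in this section can be summarised in a small sketch. It is an illustration only, under a deliberately simplified encoding that is not part of the analysis itself: the class EventStructure, its field names and the function licenses_ba are hypothetical, and the classification of each example simply follows the discussion of (55), (57), (59) and (60) above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EventStructure:
    """Hypothetical, simplified record of an event's internal structure."""
    label: str
    # "entailed", "inferred", or None when no consequent state is represented
    consequent_state: Optional[str]


def licenses_ba(event: EventStructure) -> bool:
    """ba fronting is taken to be licensed whenever a consequent state is
    part of the event structure, whether entailed or merely inferable."""
    return event.consequent_state in ("entailed", "inferred")


# Classifications follow the text; the encoding itself is an assumption.
EXAMPLES = [
    EventStructure("inchoative, as in (55)", "entailed"),
    EventStructure("resultative -zhe, as in (57)", "entailed"),
    EventStructure("backgrounding zhe, as in (59)", None),
    EventStructure("activity plus directional, as in (60)", "inferred"),
]

for ev in EXAMPLES:
    print(f"{ev.label}: ba licensed = {licenses_ba(ev)}")
```

The point of the sketch is that the aspectual class label plays no role in the decision: only the presence of a consequent state does.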
7. Conclusion
Much of the earlier controversy around ba stems from dissension over
whether or not ba has any independent semantic content. Either ba was
assumed to be a purely formal particle, the function of which was to
assign Case, or it was argued to have semantic content and this was
assumed to translate into thematic content. Under the hypothesis that
abstract Case does not play a role in Chinese (Rhys 1992), ba cannot be
a Case marker. However, I have also argued against the second option
of attributing thematic content to ba. Instead I have argued for a second
kind of semantic information that plays a role in syntactic description;
namely event structure. I have shown in this paper that the affected
interpretation of the ba DP is the consequence, not of a particular
thematic role, but of the a-role assigned by ba. In this way, the
constraints on ba are captured and shown to be intrinsically linked, and
the supposed control facts of Huang (1991) fall out. Furthermore the
relationship between ba and causality is now understood as a
consequence of the contingency relations between subevents of a
complex event. The extension developed here of Grimshaw's theory of
the interaction between aspectual structure and thematic structure and
the consequences for argument structure was shown to predict both the
ergativity shift in certain V-V compounds, and the well-formedness of
the causative ba sentences.
Thus this paper provides further evidence for a model of syntax in
which there is considerable interaction between the syntactic
representation and the level of event structure, cf. Ramchand (1993),
McClure (1994).
REFERENCES
Adger, D. and Rhys, C. S. (forthcoming) Eliminating disjunction in lexical
specification. In P. Coopmans, M. Everaert and J. Grimshaw (eds.)
Lexical Specification and Lexical Insertion. Hillsdale, NJ: Lawrence
Erlbaum Assoc.
Cheng, L. L. S. (1986) Clause Structures in Mandarin Chinese, MA Thesis,
University of Toronto.
Dowty, D. (1979) Word Meaning and Montague Grammar. Dordrecht:
Reidel.
Grimshaw, J. (1990) Argument Structure. Cambridge, Mass.: MIT Press.
Heinz, M. (1984) Chinese: A language without lexical accomplishment,
Ms, University of Wisconsin - Madison.
Hoekstra, T. (1988) Small clause results. Lingua 74.101-139
Huang, C-T. J. (1982) Logical Relations in Chinese and the Theory of
Grammar, PhD thesis, MIT
Huang, C-T. J. (1984) On the distribution and reference of empty pronouns.
LI 15.531-574
Huang, C-T. J. (1991) Complex predicates in control, Ms University of
California at Irvine.
Koopman, H. and Sportiche, D. (1991) The position of subjects. Lingua
85.211-258
Li, Y. F. (1990) On Chinese V-V compounds. NLLT 8.177-207
Li, Y. H. A. (1985) Abstract Case in Chinese, PhD thesis, University of
Southern California
McClure, W. (1994) Syntactic Projections of the Semantics of Aspect, PhD
thesis, Cornell University.
Moens, M. and Steedman, M. (1988) Temporal ontology and temporal
reference, CL 14.15-28
Ramchand, G. (1993) Aspect and Argument Structure in Modern Scottish
Gaelic, PhD thesis, Stanford University.
Rhys, C. S. (1992) Functional Heads, and Thematic Role Assignment in
Mandarin Chinese, PhD thesis, University of Edinburgh
Smith, C. (1994) Aspectual viewpoint and situation type in Mandarin
Chinese, Journal of East Asian Linguistics 3.107-146
Sybesma, R. (1992) Causatives and Accomplishments: The case of Chinese
ba. Leiden: HIL.
Vendler, Z. (1967) Linguistics in Philosophy. Ithaca: Cornell University
Press.
EXPLANATION OF SOUND CHANGE.
HOW FAR HAVE WE COME AND WHERE ARE WE
NOW?
Charles V. J. Russ
Department of Language and Linguistic Science
University of York
1. Introductory: The development of explanations
1.1 Extralinguistic explanation
Early explanations of sound change were often sought in extralinguistic
factors such as the climate, or the physiology of the speakers. One
example is the second or High German sound shift, in which the initial
Germanic voiceless stops became affricates, e.g. p, t, k became [pf], [ts], [kx]
(the velar only in Upper German). This change was carried through in
initial position before vowels and, in the case of p and k, before /l/ and
/r/, while t was only shifted before /w/. This was viewed by some
linguists as being caused by the Alpine climate. Since it was carried
through most completely in Southern Germany, Austria and
Switzerland, which are mountainous regions, it was assumed that there
was a causal relationship between the sound shift and the climate or
geography of the region. This view was advanced by serious linguists,
but it was to be refuted by Jespersen. He pointed out that the tendency
to affrication of voiceless stops was not confined to mountainous
regions, but that there was a strong tendency to affricate initial prevocalic t in the colloquial speech of Copenhagen (Jespersen 1922:
256f). Similar explanations were given for the First Germanic Sound
Shift (see survey in Russ 1978: 169-73).
Most scholars have been hesitant to explain sound changes in
terms of extralinguistic factors, but the most widely accepted way that
extralinguistic factors are used to explain change is in the substratum
theory. The Latin of the Roman Empire was imposed on countries with
York Papers in Linguistics 17 (1996) 333-349
© Charles V. J. Russ
other native languages, e.g. Celtic in France, and consequently the
natives of these countries imposed the features of their own language on
the Latin they learned. These original, or substrate languages died out in
most cases, but have left their mark in the way Latin has developed in
different countries. For instance some linguists claim that the French
change of Latin u to [y:], e.g. Latin murus, French mur, is due to the
Celtic substrate, or that the shift of f to h, which is then lost in
pronunciation in Spanish, e.g. Latin facere, Spanish hacer 'to do', is due
to the Basque substrate. In general it is accepted that some changes may
be due to substrate languages but the actual extent of this is not agreed
(see Pellegrini 1980 for further references).
Much of the use of extralinguistic factors in explaining sound
changes has been speculative and many changes have been found which
could not be put down to these factors. Bloomfield, and structural
American linguists in general, thought that the search for explanations
or causes of sound change was fruitless. Bloomfield said explicitly 'The
causes of sound change are unknown' (Bloomfield 1935: 385). Hockett
(1958), for example, contains no references to the causes of sound
change.
1.2 Internal linguistic explanations
Other linguists, notably the Prague group, swung away from
extralinguistic causes completely to the other extreme, wanting to see
the causes of linguistic change in the linguistic system itself. They, and
later Martinet, are the prime exponents of this view. They did not regard
sound laws as blind, as the Neogrammarians did, nor fortuitous as de
Saussure (1916: 127) thought, but rather purposeful. Sound change was
seen as teleological, goal directed. This might take various forms. There
might be various 'goals', the removal of peripheral phonemes, e.g. /31/
in English (Vachek 1964), or of phonemes with a low functional yield,
e.g. the merger of /ɛ̃/ and /œ̃/ or /a/ and /ɑ/ in French (Martinet 1961:
210f), or the making of an asymmetrical system symmetrical. A
persuasive example of the last type of change in Swiss German dialects
has been given by Moulton (1961: 155-182). Classical Middle High
German is assumed to have the following short vowel system:
[Vowel chart of the Classical Middle High German short vowel system; not recoverable from the source]
This is an asymmetrical system, since the back vowels have one less
tongue height than the front unrounded vowels. In the North East of
Switzerland this system was made symmetrical by the split of /o/ into
/o/ and /ɔ/: 'The asymmetry of the Middle High German system lay in
the fact that the front vowels contained one more relevant level than the
back vowels. In the West and Centre this asymmetry was removed by
decreasing the number of front vowels. In the North and East the
asymmetry was removed by increasing the number of back vowels: the
/o/ of Middle High German ofen, hose (New High German Ofen
'stove', Hose 'trousers') split into modern /ofə/ ≠ /hɔsə/' (Moulton
ibid., 172f [Translation CR]). The result of this change was a
symmetrical short vowel system. There was a complementary split of
Middle High German /ö/ into /ö/ and /œ/. Jakobson attempted to
illustrate his teleological view of sound change by applying it to
Russian. For example, the akanje, the merging of unstressed a and o, in
Russian and other dialects, is seen as resulting from the change of the
correlation: musical accent - unstressed vowels, to expiratory accent - unstressed vowels (Jakobson 1971: 92ff).
Martinet, building on the work of the Prague school, developed the
notion of the push-chain and the drag-chain. When a phoneme moves
phonetically in one direction and approaches another phoneme, e.g. /A/
> /B/, then /B/ may also move towards another phoneme, /C/, /B/ >
/C/. This chain reaction is a push-chain: /A/ pushes /B/ towards /C/.
Another possibility would of course be that /A/ and /B/ merge, but
Martinet is more interested in the cases where this does not happen. If,
taking the three phonemes /A/, /B/, /C/, /C/ moves first, away from /B/,
then /B/ may well also be dragged into the space vacated by /C/, and
then /A/ may be dragged into the space left vacant by the shifting of /B/
(Martinet 1952: 5ff; 1955: 48ff). For instance, in early Old High
German there were two dental obstruents (excluding the sibilants) /þ/
and /d/. The latter was shifted to /t/ and the space thus left vacant was
then filled by the shift of /þ/ to /d/ (Penzl 1975: 86). This kind of
chain reaction is called a drag-chain. This approach to sound change was
taken up by many linguists, among them Weinrich, who, in his studies
of Romance sound changes, sought to explain them without using
extralinguistic factors (Weinrich 1958: 5ff).
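The difference between the two kinds of chain can be made concrete with a minimal sketch. It is not Martinet's own formalism: the function classify_shifts, the representation of a sound system as a set of occupied slots, and the use of the symbol þ for the dental fricative are all assumptions made for illustration. Each shift is labelled according to whether its target was still occupied when the shift applied.

```python
def classify_shifts(inventory, shifts):
    """Label each shift by the state of its target slot when it applied.

    inventory: iterable of phoneme symbols present before any change.
    shifts: list of (source, target) pairs in chronological order.
    """
    occupied = set(inventory)
    labels = []
    for source, target in shifts:
        if target in occupied:
            # Moving towards an occupied slot: the occupant must move on
            # (push-chain) or the two phonemes merge.
            labels.append((source, target, "target occupied"))
        else:
            # Moving into an empty slot, e.g. one vacated by an earlier shift.
            labels.append((source, target, "target empty"))
        occupied.discard(source)
        occupied.add(target)
    return labels


# Old High German dentals as described above: /d/ > /t/ first, then /þ/ > /d/.
drag = classify_shifts({"þ", "d"}, [("d", "t"), ("þ", "d")])

# The same two shifts in the opposite order would begin as a push.
push = classify_shifts({"þ", "d"}, [("þ", "d"), ("d", "t")])

print(drag)
print(push)
```

On this encoding a drag-chain is a sequence in which every target is empty when its shift applies, whereas a push-chain begins with a move towards an occupied slot.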
This type of approach to sound change has been criticized on
several grounds. The push-chains, drag-chains and development towards
symmetry are said to be only tendencies (King 1969: 191ff). There are
asymmetrical sound systems - for instance many Upper German and
Central German dialects have two front vowel phonemes /e/ and /ɛ/ but
only one back vowel phoneme /o/. Enough evidence seems to have
been produced that in certain cases sound changes can be explained in
terms of other changes, but there are also many changes which cannot
be thus explained. Also any teleological view of sound change is
circular. In the Swiss German example taken from Moulton it could be
seen that the result of the split of Middle High German /o/ into /o/ and
/ɔ/ was a symmetrical short vowel system. The result and the cause are
regarded in fact as being the same thing (Anttila 1989: 193f). In other
instances these explanations are only considered to be descriptions. This
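was the position taken up by a reviewer of Weinrich (1958): 'À mon
avis, et j'espère pouvoir montrer par la suite qu'il est bien fondé, la
phonologie diachronique ne pourra être que descriptive, ne saura jamais
répondre à la question: POURQUOI? Pour répondre à cette question, il
faut toujours recourir à des facteurs externes' [In my view, and I hope
to show in what follows that it is well founded, diachronic phonology
can only ever be descriptive and will never be able to answer the
question: WHY? To answer that question, one must always have recourse
to external factors] (Togeby 1959/60: 402).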
However, although criticisms have been levelled against this approach,
it has produced many results which have been accepted as worthwhile
by many linguists.
1.3 Generative linguistics and explanation
The scepticism which Bloomfield expressed at ever finding
explanations of sound changes was continued by generative
grammarians. The most extreme position is that taken up by Postal:
'There is no more reason for languages to change than there is for
automobiles to add fins one year and remove them the next, for jackets
to have three buttons one year and two the next' (Postal 1968: 283). On
the whole, the generative school has been criticized for not seeking
explanations for sound change. This is not entirely fair, since opinions
among generative linguists seem to vary. King, for instance, is not as
sceptical as Postal: 'If there is little risk in being a cynic about the
origin of phonological change, there is also very little profit. In fact
linguistics has a great deal to lose by the position that the cause of
phonological change is beyond principled research' (King 1969: 190f).
However, he does not give any clear explanation of sound change. One
approach to explanation in sound change can be illustrated from
Kiparsky's historically orientated article entitled 'Explanation in
phonology'. He states: 'I have suggested a way in which the concept of
a 'tendency', which lends functionalist discussions their characteristic
unsatisfactory fuzziness, can be made more precise in terms of
hierarchies of optimality, which predict specific consequences for
linguistic change, language acquisition, and universal grammar'
(Kiparsky 1972: 224). For Kiparsky, explanation in sound change is
determined by constraints such as the conservation of functional
distinctions, e.g. a sound change will tend not to eliminate number or
tense endings. When sound changes cause phonological alternation
within an inflectional paradigm, e.g. lengthening of short vowels in
open syllables, North German [ta:gə], but nom. [tax] or [tak], the
alternation will tend to be removed to make the paradigm regular, cf.
standard German, Tage, Tag. Some sound changes may act together in a
'conspiracy' to produce a certain kind of phonological structure.
However these constraints do not always apply. For instance modern
German still retains the phonological alternation between medial voiced
obstruents and final voiceless obstruents. This has been in existence
since late Old High German and yet has not been levelled out except in
a few dialects.
1.4 Some recent developments
Most textbooks on historical linguistics give surveys of some of the
kinds of explanations and causes that have been outlined in 1.2 and 1.3,
adding remarks on how sociolinguistics can help account for why
particular variants are selected by a language (Anderson 1973: 3-5;
Jeffers and Lehiste 1979: 88-105; Aitchison 1981: 111-69). A landmark
in the discussion on explaining linguistic change is Lass (1980), who
comes to the conclusion that to explain linguistic change must also
entail predicting it. Therefore, since prediction of changes is
impossible, explanation is also impossible. However, Lass's
conclusion challenged many linguists to search for explanations.
Vennemann (1983) says that he will continue explaining linguistic
change, particularly in terms of what is and what is not a possible
change. Bennett (1983) argues that Lass sets too high a standard for
explanations and that linguists should continue to search for them: 'The
best way to be sure of not discovering the causes of linguistic change is
to adopt the working assumption that there are no such causes. But if
we seek, we may find' (1983: 20). Aitchison (1987) in a contribution to
a workshop set up because of the impact of Lass's claim maintains that
linguists should at least be able to sketch possible paths of
development for changes. Lass (1987), himself, seems to offer a less
pessimistic scenario, urging linguists to take a more long-term view of
changes in languages in any attempts at explanation. Kiparsky (1988)
as well as surveying different types of change and causes expresses the
view that the linguist should not be surprised or despair if one language
develops a structure in one way whereas another language develops the
same structure in a different way. This balancing act of using both
internal, functional explanations as well as external, sociolinguistic
ones is continued in recent works (Hock 1986: 627-61, and 1992: 228-31; Crowley 1992: 191-203; Ohala 1994: 4050-55). McMahon (1994:
46) expresses the problem by saying 'We shall consider further,
generally particularistic and non-predictive, explanations of changes in
all components of the grammar, while striving to find general causes
and motivations for change.' The wish to find causes and the conviction
that they may be discovered is thus very much alive.
2. Types of explanatory statement
We have so far used the term 'explanation' without any real definition.
In the following sections four ways in which it is used will be
examined and their usefulness evaluated. Much of this, paradoxically,
derives from a little known review by Bloomfield (1934).
2.1 General Historical Explanation
Bloomfield (1934: 34f) outlines this type of explanation in the
following terms: 'Where the facts are accessible, we can define a feature
of a language in terms of some earlier habit plus a change of habit'.
This is a general form of explanation: something in the present can
always be explained by saying that it represents something in the past
plus a change. The strange shape of a house, for example, may be
explained historically by saying that in the past there were two houses,
which were then joined together. A linguistic example would be the
explanation that umlaut in New High German is due to the fact that in
Old High German the vowels affected were followed by an i, ī or j:
'Umlaut is used to express the change from a, o, u and au to ä, ö, ü and
äu respectively ... . The cause of these vowel-changes can, as a rule, not
be seen in modern German: in order to understand them, one requires to
go back to the earlier stages of the language' (Eggeling 1961: 348).
This type of explanation is not restricted to linguistics but it is
common to all disciplines which have a historical branch. It has also
fallen out of favour since it mixes the synchronic and the diachronic. De
Saussure in his discussion of the necessity of separating the synchronic
from the diachronic uses umlaut of noun plurals as part of his
argument. He takes two stages in the development of German and
English: At stage A the plural of some nouns is formed by adding -i:
Old High German gast, gasti, OE fot, foti. At a later stage B, the plural
is formed by changing the vowel, and in the case of German, adding -e:
Gast, Gäste, foot, feet. For de Saussure, these ways of marking the
plural have no historical connection. The only connection is between
individual forms, e.g. gasti, which becomes Gäste (de Saussure 1916:
120ff). For him, umlaut in New High German would not be explicable
in terms of Old High German. This attitude of de Saussure's seems to
have influenced linguists in turning away from the diachronic study of
language. This represents, in other disciplines as well as linguistics, 'a
general loss of faith in the efficacy of historical explanation. We try to
understand our present position by analysing the component forces in
play, not by tracing post facto the long chain of major forces which
have brought it about but may have ceased to operate' (Trim 1959: 19).
This type of explanation is too unrestricted to account for why sound
changes proceed along one particular path in one language but along a
different path in another.
2.2 Universals of Sound Change
Another approach is to look at the universal nature of some sound
changes. Some similar patterns occur in different languages. For
instance, the raising of long and mid vowels has not only caused
diphthongization in English, but also in Dutch, and probably also in
German (Lass 1976). There is not an infinite number of sound changes
but a restricted number. If these can be characterized, then an
explanation can be attempted for a much smaller number. For the
Neogrammarians, sound laws were fixed to one place and one dialect at
one time. Consequently they did not believe in universals of sound
change. For them, what was universal was that sound laws had no
exceptions. However the whole question of universals has been
discussed not only on a synchronic level but also on a diachronic level.
This has chiefly taken the form of characterizing the possible forms of
linguistic change and to what constraints they are subject (Kiparsky
1972; Vennemann 1982: 149-54; Labov 1994). Universals can help to
explain sound changes in that they reduce the number of possible sound
changes to a finite number. A sound change is deemed to have been
'explained' if it is assigned to a more general process. Sound change is
viewed as consisting of a set of meta-rules: palatalization, nasalization
and so on, from which a language selects one, which, subject to certain
language specific constraints, will proceed in a defined way. For
instance, if a language palatalizes consonants, first the velars will be
affected, then the dentals and finally the labials. It will not affect labials
only, or dentals only. The consonants (only obstruents have so far been
considered) will be palatalized before high front vowels first, then before
mid front vowels and finally before low vowels (Chen 1973). As an
example, Italian has palatalized Latin k only before front high and mid
vowels: Latin civitatem, centum, Italian città, cento, but this has not
occurred before low vowels: Latin cantare, Italian cantare. French, on
the other hand, has palatalized Latin k before a as well: French cité,
cent, chanter. This approach does not completely solve the problem of
causation of linguistic change, but it does attempt to overcome the ad
hoc explanation of individual changes. Thus the change of Latin k to
[tʃ] and further to [ʃ] in French is not seen as an isolated change but as
part of the larger change of palatalization. Chen cites examples from
many different languages which make his thesis seem plausible, but he
has to admit that there are exceptions. In Ancient Greek IE /kw/ and /t/
are palatalized to /t/ and /s/ respectively before /i/ and /e/. According to
Chen's scheme, if a dental stop has been palatalized then a velar stop
will have been palatalized as well. The reason for this exception, he
says, is that IE /kw/ and /t/ are involved in a drag-chain. IE /s/ became
/h/ in Ancient Greek, initially and medially, and the space left by the
shifting of medial IE /s/ was filled by the palatalization of IE /t/ before
/i/ in certain cases (there are exceptions to this).1 The gap created by the
change of /t/ to /s/ before /i/ was then filled by IE /kw/ becoming /t/
before /i/ and /e/.2 Language specific changes like this drag-chain in
Ancient Greek can invalidate the universal trend of palatalization. This
may well turn out to be an isolated case, but on the other hand it belies
the strong predictive power that Chen would like his theory to have.
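The implicational character of Chen's scheme can be made concrete in a short sketch. The Python fragment below is an illustration only, not Chen's own formalization: the names PLACE_SCALE, VOWEL_SCALE and conforms are hypothetical, and the three-step scales simply restate the orderings given above (velars, then dentals, then labials; high front, then mid front, then low vowels). A pattern is taken to conform if every palatalized combination is accompanied by all combinations that rank earlier on either scale.

```python
# Illustrative encoding of the two implicational scales described in the text.
# The scale names and the function are assumptions made for this sketch.
PLACE_SCALE = ["velar", "dental", "labial"]        # affected first -> last
VOWEL_SCALE = ["high front", "mid front", "low"]   # triggering first -> last


def conforms(palatalized):
    """Return True if a set of (place, vowel) pairs respects both scales.

    For every palatalized pair, each earlier place of articulation must be
    palatalized before the same vowel, and the same place must be palatalized
    before each earlier vowel on the vowel scale.
    """
    for place, vowel in palatalized:
        p = PLACE_SCALE.index(place)
        v = VOWEL_SCALE.index(vowel)
        for earlier_place in PLACE_SCALE[:p]:
            if (earlier_place, vowel) not in palatalized:
                return False
        for earlier_vowel in VOWEL_SCALE[:v]:
            if (place, earlier_vowel) not in palatalized:
                return False
    return True


# Italian: Latin k palatalized before high and mid front vowels only.
italian = {("velar", "high front"), ("velar", "mid front")}
# French: Latin k palatalized before a as well.
french = italian | {("velar", "low")}
# A pattern the scheme excludes: dentals affected without velars.
dentals_only = {("dental", "high front")}

print(conforms(italian), conforms(french), conforms(dentals_only))
```

On these inputs the Italian and French patterns conform, whereas the dentals-only pattern does not. The sketch says nothing about whether palatalization will occur at all, only about the shape it may take once it does.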
Another approach to the problem of universals has been to set up
universal strength hierarchies. For example, if obstruents are deleted or
subject to lenition in a language, velars are most likely to be deleted
first, then dentals and finally labials (Foley 1977: 28). Lass and
Anderson (1973: 183-87), in their study of Old English obstruents,
come to a different conclusion. When stops become weakened to
fricatives the order is: dentals first, then labials and finally velars.
Certain kinds of statements as to what are natural classes differ
sometimes according to the language or period of the language
concerned. This search for universal hierarchies is still very speculative
and more detailed studies must be made available before it can be proved
to have a more solid foundation. A phenomenon which is similar to
strength hierarchies is the concept of the Reihenschritt.3 If one
phoneme of a phonetic order changes, then all the other phonemes of
the same order change in the same way. A classic example is provided
by the First Germanic Sound Shift where each member of each order of
1 Buck 1933: para. 141: 'The assibilation of τ before ι is seen in large
classes of words. But τ may also remain unchanged before ι, and the precise
conditions governing this difference of treatment cannot be satisfactorily
formulated.'
2 Chen 1973 takes his interpretation from Allen 1957-8: 122f.
3 Pfalz 1918 used Reihenschritt for vowel changes. A free translation in
English might be 'parallel development'.
consonants changed its manner of articulation: the voiceless stops p, t,
k became the voiceless fricatives f, θ, x, the voiced aspirated stops bh,
dh, gh became either voiced stops or voiced fricatives according to their
position in the word /b ~ β, d ~ ð, g ~ ɣ/, the voiced stops b, d, g became
voiceless stops p, t, k (Fourquet 1954). Similarly all the Middle High
German long high vowels (î, iu, û) diphthongized, not just one or two
of them. The concept of Reihenschritt has been adopted by Martinet
(1952: 17) to show how sound changes proceed by changes in
distinctive features. In generative grammar the fact that parallel groups
of sounds may change has been accounted for in terms of 'natural
classes': 'Phonological changes tend to affect natural classes of sounds
(p, t, k, high vowels, voiced stops), because rules that affect natural
classes are simpler than rules that apply only to single segments' (King
1969: 122). The use of the word tend is significant in this quotation
since these changes do not always take place. On the basis of natural
classes one cannot always predict that of three voiceless stops, if t
becomes an affricate, then p and k will become affricates as well. This
may perhaps happen, as it does in some Upper German dialects, but it
is by no means automatic.
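The difference between a change that affects a whole natural class and one that affects only part of it can be stated mechanically. The following Python sketch is illustrative only: the function name coverage and the toy inventories are assumptions made here, and the last mapping is a hypothetical case in which only t is affricated.

```python
# Toy natural class; the segment symbols are plain strings chosen for this sketch.
VOICELESS_STOPS = {"p", "t", "k"}


def coverage(natural_class, change):
    """Classify how much of a natural class a sound change (old -> new) affects."""
    affected = natural_class & set(change)
    if affected == natural_class:
        return "whole class"
    return "partial" if affected else "none"


# First Germanic Sound Shift: every voiceless stop becomes a fricative.
germanic_shift = {"p": "f", "t": "θ", "k": "x"}
# Affrication of all three stops, as mentioned above for some Upper German dialects.
upper_german = {"p": "pf", "t": "ts", "k": "kx"}
# Hypothetical dialect in which only t becomes an affricate.
t_only = {"t": "ts"}

for name, change in [("Germanic", germanic_shift),
                     ("Upper German", upper_german),
                     ("t only", t_only)]:
    print(name, coverage(VOICELESS_STOPS, change))
```

A check of this kind can describe a change after the fact; it cannot predict which of the outcomes a given language will show.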
Any universals that do exist seem, at the moment, to be only
universal tendencies (even Chen 1973: 183 uses the term 'tendency').
Similar changes can be seen at work in many genetically unrelated and
geographically widely dispersed languages. The important thing that
this search for universals has shown is that sound change is not random
but, all things being equal, sound changes, e.g. palatalization, will
proceed in a predictable way, e.g. affecting velars first, then dentals and
finally labials. But unfortunately in languages all things are not equal.
Many other factors intervene. There may be the influence of the rest of
the sound system, the morphology and syntax, and external influences
from other dialects or languages. The social prestige of certain forms
and their spelling may influence changes. All these factors may and do
interfere in the smooth effectuation of these universal tendencies. There
seems no way of predicting when these other factors will intervene. The
search for universals has still not supplied an answer to the problem of
the explanation of sound change in general.
2.3 The Predictive Power of Linguistic Explanation
This level of explanation can be characterized as the one 'in which we
could account for the occurrence of a certain linguistic change at a
certain place and time: e.g. Why did pre-Germanic change p, t, k to f,
θ, h or why did English analogically extend the -s pl. of nouns? The
answer would be a correlation of linguistic change with some other
recognizable factor enabling us to predict the occurrence of a linguistic
change whenever this factor was known' (Bloomfield 1934: 39).
Bloomfield sets this up as a goal to be reached, but does not offer, here
or elsewhere, any solution. Nor, we must say, has any linguist to date.
Chen, who deals with prediction in phonological change, has to set his
sights lower: 'Even though we cannot predict that palatalization will
take place in language X, we can nevertheless predict that if
palatalization occurs at all it will spread along two dimensions or axes'
(Chen 1973: 177). Once a sound change has taken place, its course can
be predicted within certain limits, but we cannot predict why
palatalization should take place in French but not in Dutch. This has
been called the 'actuation problem' by some scholars: 'Why do changes
in a structural feature take place in a particular language at a given time,
but not in other languages with the same feature, or in the same
language at different times?' (Weinreich, Labov and Herzog 1968: 102).
For instance, why did the Germanic long high vowels diphthongize in
German, English and Dutch but not in the Scandinavian languages?
This type of question is the strongest and most interesting demand that
could be made of a theory of explanation in historical linguistics.
Unfortunately no answer can be given to it with the present state of
linguistics, and it is doubtful whether there will ever be an answer.
2.4 The Explanation of Specific Changes
One of the most widespread interpretations of 'explanation' is the
explaining of one event by another. Bloomfield puts this in the
following way: 'A favoured earlier event, the "cause", pulls a kind of
invisible string which, in some metaphysical sense, forces the
occurrence of a later event, the "effect"' (Bloomfield 1934: 34). This
assumes that one can connect some linguistic effects but not others.
For instance, in the Germanic languages many original final vowels
have been lost or reduced to [ə]. That is one linguistic event. It is also
assumed that the stress accent in Germanic, instead of falling
potentially on any syllable, became fixed on the root syllable. This
represents another linguistic event. Most linguists link these two
events together, the fixing of the stress accent causing the weakening
and loss of unstressed syllables: 'The strong stress accent on the stem
(or first syllable) caused in Germanic a progressive weakening of
unaccented syllables' (Prokosch 1939: 133). Similarly the mutation of
the long and short back vowels a, o, u in the Germanic languages at
various times has occurred before an i, ī or j in the following syllable.
In this case it is usually said, not that one event caused another, but
that one factor, the existence and nature of the following i, ī and j,
caused the change known as i-mutation or umlaut. The following
explanation illustrates this clearly: 'There are two types of mutation in
O.E., one, A., which affects back vowels is caused by a following i or j,
the other, B., which affects front vowels, is caused chiefly by u, or o,
in some dialects also by a' (Wyld 1921: para. 103). This mode of
explanation refers chiefly to individual conditioned changes. Where
changes are not phonetically conditioned, the explanatory power of one
change or factor in terms of another one is not so convincing. Attempts
have been made to explain one unconditioned change in the light of
another. This is the type of event which Martinet has dubbed push- or
drag-chain. The Great Vowel Shift in English has been explained in this
way. The two most important steps in the vowel shift are the
diphthongization of the long high vowels ME ī and ū, and the raising
of the long mid vowels ME ē and ō. Scholars have postulated causal
relationships between these changes. Luick thought that the raising of
the mid vowels happened first and caused the already existing high
vowels to diphthongize, while Jespersen, on the other hand, thought
that the diphthongization of ME long ī, ū created a hole, into which the
mid vowels ME ē, ō were dragged (Lass 1976: 51-102; 1992).
It is very often not possible to establish with any accuracy the
direction of the explanation in unconditioned changes such as this.
Documentary evidence may be lacking or inconclusive. These
explanations of changes in terms of other factors or events have one
great drawback: they are not final explanations. It may be the case that
the raising of the mid vowels caused the diphthongization of the high
vowels, or, that the fixing of the stress accent on the root syllable
caused the weakening or loss of unstressed vowels. Even so there still
remains the question of why the mid vowels were raised in the first
place, or why the stress in Germanic became fixed to the root syllable.
In other words, final causation is not provided for at this level. The type
of explanation discussed here is of a specific sound change or changes.
These will probably only occur in one language or in related languages
and be tied to a particular period in that language. Most linguists would
accept that this level of explanation, linking events to other events, as
cause and effect, is indeed possible but that it is a weak form of the
explanation of sound change.
3. Conclusion
What can be reasonably demanded of a linguistic theory is that it should
explain language specific changes. Other types of explanation are far
more difficult, if not impossible, to formalize. Research into universals
may help, but much more evidence for many more different processes
will have to be forthcoming before it is based on a surer footing.
Most linguists, however, are agreed that languages are subject to
change and that there is variation in the spoken chain. Where they differ
is on the emphasis placed upon this. The fact that language is subject
to variation does not explain sound change (this variation is simply a
characteristic of language), but it does point to the possible origin of
sound change. Variation in the spoken chain produces variants in
pronunciation, grammar and vocabulary. The important thing is what
happens to these variants once they have arisen for whatever reason.
Two things are important here. The variants may be idiosyncratic and
not spread at all, or they may find their way into the linguistic system
(Samuels 1972: 140). It is at this point that the question 'why?' may
begin to be asked. Here we find ourselves at the level of ad hoc
language specific explanations. These entail what has been called the
'transitional problem', i.e. what intermediate forms there are, and the
'embedding problem', i.e. how does a change fit into (a) the linguistic
system as a whole, and (b) into the social structure of the users of the
language concerned? There is also the 'evaluation problem', i.e. how the
speakers themselves reacted to the change (Weinreich, Labov and
Herzog 1968: 184f). The question 'why?' seems only answerable in the
case of why a particular variant was selected by the linguistic system in
a certain case, rather than saying why one was not selected.
Explanations or causes of sound changes can be given as long as it
is realized that they merely entail connecting phenomena to their
effects. The reason for the selection of a particular variant or process
may be due to several factors; in other words, there may be multiple
causation (Malkiel 1967). All such explanations are ad hoc, even
though they represent a selection from a restricted range of sound
changes (Samuels 1972: 155). The ultimate causes of sound change are
unknown but in many cases we can see with varying degrees of
confidence what the immediate causes are.
REFERENCES
Aitchison, J. (1981) Language Change: Progress or Decay. London:
Fontana.
Aitchison, J. (1987) The language lifegame: prediction, explanation and
linguistic change. In W. Koopmann, F. van der Leek, O. Fischer and R.
Eaton (eds.) Explanation and Linguistic Change. (Current Issues in
Linguistic Theory 45). Amsterdam: Benjamins. 11-32.
Allen, W. S. (1957-8) Some problems of palatalization in Greek. Lingua
7.113-33.
Anderson, J. M. (1973) Structural Aspects of Language Change. London:
Longman.
Anttila, R. (1989) Historical and Comparative Linguistics. Amsterdam:
Benjamins. This is basically the same as the 1972 edition, An
Introduction to Historical and Comparative Linguistics.
Bennett, P. (1983) The nature of explanation in historical linguistics. York
Papers in Linguistics 10.5-22.
Bloomfield, L. (1934) Review of W. Havers, Handbuch der erklärenden
Syntax. Language 4.32-40.
Bloomfield, L. (1935) Language. London: Allen & Unwin. American edition
1933.
Buck, C. D. (1933) Comparative Grammar of Greek and Latin. Chicago:
University of Chicago Press.
Bynon, T. (1977) Historical Linguistics. Cambridge University Press.
Chen, M. (1973) Predictive power in phonological description. Lingua
33.171-91.
Crowley, T. (1992) An Introduction to Historical Linguistics. Oxford
University Press.
Eggeling, H. F. (1961) Dictionary of Modern German Prose Usage. Oxford:
Clarendon Press.
Foley, J. (1977) Foundations of Theoretical Phonology. Cambridge
University Press.
Fourquet, J. (1954) Die Nachwirkungen der ersten und zweiten
Lautverschiebung. Zeitschrift für Mundartforschung 22.1-33.
Haudricourt, A. and A. Juilland (1949) Essai pour une histoire structurale du
phonétisme français. 2nd edition. The Hague: Mouton.
Hock, H. H. (1986) Principles of Historical Linguistics. Berlin - New York:
Mouton de Gruyter.
Hock, H. H. (1992) Causation in language change. In W. Bright (ed.).
International Encyclopedia of Linguistics. Oxford University Press.
Vol. 1. 228-31.
Hockett, C. (1958) A Course in Modern Linguistics. New York: Macmillan.
Jakobson, R. (1971) Selected Writings I. Phonological Studies. 2nd
edition. The Hague: Mouton.
Jeffers, R. J. and I. Lehiste (1979) Principles and Methods for Historical
Linguistics. Cambridge, Mass.: MIT Press.
Jespersen, O. (1922) Language. London: Allen & Unwin.
Keller, R. (1963) Zur Phonologie der hochalemannischen Mundart von
Jestetten. Phonetica 10.51-79.
King, R. D. (1969) Historical Linguistics and Generative Grammar.
Englewood Cliffs, New Jersey: Prentice Hall.
Kiparsky, P. (1972) Explanation in phonology. In S. Peters (ed.) Goals of
Linguistic Theory. Englewood Cliffs, New Jersey: Prentice Hall.
Kiparsky, P. (1988) Phonological change. In F.J. Newmeyer (ed.)
Linguistics. The Cambridge Survey. Vol. 1. Linguistic Theory.
Foundations. Cambridge University Press. 363-415.
Labov, W. (1994) Principles of Linguistic Change I. Internal Factors.
Oxford: Blackwell.
Lass, R. (1974) Linguistic orthogenesis: Scots vowel quantity and the
English length conspiracy. In C. Jones and J. M. Anderson (eds.) First
International Conference on Historical Linguistics. Amsterdam: North
Holland. 311-43.
Lass, R. (1976) English Phonology and Phonological Theory. Cambridge
University Press.
Lass, R. (1980) On Explaining Language Change. Cambridge University
Press.
Lass, R. (1987) Language, speakers, history and drift. In W. Koopmann, F.
van der Leek, O. Fischer and R. Eaton (eds.) Explanation and Linguistic
Change. (Current Issues in Linguistic Theory 45). Amsterdam:
Benjamins. 151-76.
Lass, R. (1992) 'What, if anything, was the Great Vowel Shift?', in History
of Englishes. New Methods and Interpretations in Historical
Linguistics. M. Rissanen et al. (eds.), Berlin - New York: Mouton de
Gruyter. 144-55.
Lass, R. and J. M. Anderson (1975) Old English Phonology. Cambridge
University Press.
Malkiel, Y. (1967) Multiple versus simple causation in linguistic change. In
To Honor Roman Jakobson. The Hague: Mouton. Vol 2. 1228-46.
Martinet, A. (1952) Function, structure and sound change. Word 8.1-32.
Martinet, A. (1955) Économie des changements phonétiques. Berne:
Francke.
Martinet, A. (1961) Éléments de linguistique générale. Paris: Armand
Colin.
McMahon, A. M. S. (1994) Understanding Language Change. Cambridge
University Press.
Moulton, W. G. (1961) Lautwandel durch innere Kausalität. Zeitschrift für
Mundartforschung 28.227-51.
Ohala, J. (1994) Sound change. In R. E. Asher (ed.) The Encyclopedia of
Language and Linguistics. Vol 8.4050-55.
Pellegrini, G. B. (1980) Substrata. In R. Posner and J. N. Green (eds.) Trends
in Romance Linguistics and Philology. The Hague: Mouton.
Penzl, H. (1971) Vom Urgermanischen zum Neuhochdeutschen. Berlin:
Schmidt.
Pfalz, A. (1918) Reihenschritte im Vokalismus. In Beiträge zur Kunde der
bayerisch-österreichischen Mundarten I. Sitzungsberichte der
Kaiserlichen Akademie der Wissenschaften in Wien. Phil.-hist. Klasse
190-2. Vienna: Hölder. 22-42.
Postal, P. (1968) Aspects of Phonological Theory. New York: Harper & Row.
Prokosch, E. (1939) A Comparative Germanic Grammar. Philadelphia:
Linguistic Society of America.
Russ, C. V. J. (1978) Kausalität und Lautwandel. Leuvense Bijdragen
67.169-82.
Samuels, M. L. (1972) Linguistic Evolution. Cambridge University Press.
Saussure, F. de (1916) Cours de linguistique générale. Ed. T. de Mauro.
Critical edition. Paris: Payot.
Togeby, K. (1959-60) Les explications phonologiques historiques sont-elles possibles? Romance Philology 13.401-13.
Trim, J. L. M. (1959) Historical, descriptive and dynamic linguistics.
Language and Speech 2.9-25.
Vachek, J. (1964) On peripheral phonemes of modern English. Brno Studies
in English 4.7-100. Reprinted in Selected Writings in English and
General Linguistics. The Hague: Mouton, 1976.
Vachek, J. (1970) Remarks on 'The Sound Pattern of English'. Folia
Linguistica 4.24-31.
Vennemann, T. (1982) Grundzüge der Sprachtheorie. Tübingen: Niemeyer.
Vennemann, T. (1983) Causality in linguistic change. Theories of linguistic
preferences as a basis for linguistic explanations. Folia Linguistica
Historica 4.7-26.
Weinreich, U., Labov, W. and M. Herzog (1968) Empirical foundations for a
theory of language change. In W. P. Lehmann and Y. Malkiel (eds.)
Directions for Historical Linguistics. Austin: University of Texas Press.
Weinrich, H. (1958) Phonologische Studien zur romanischen
Sprachgeschichte. Münster: Aschendorffsche Verlagsbuchhandlung.
Wyld, H. C. (1927) A Short History of English. Oxford: Blackwell.
HAS IT EVER BEEN 'PERFECT'?
UNCOVERING THE GRAMMAR OF EARLY BLACK
ENGLISH*
Sali Tagliamonte
Department of Language and Linguistic Science
University of York
1. Introduction
Genetic relationships between varieties are often assessed by cross-linguistic comparisons of the tense/aspect system. This is especially
true of African American Vernacular English (AAVE), whose verbal
delimitation paradigm has been the subject of intense study for decades.
This is in part due to the ongoing and still contentious debate on
whether its present system developed from a prior creole or directly
from the vernacular British varieties spoken by early white plantation
staff. The sheer complexity and abundance of grammatical apparatus
concentrated in this area of the grammar make it an excellent site for
examining the differences and similarities amongst related varieties.
Over the last few decades the frequently used domains of the verbal
system have been extensively exploited. In the area of copula usage and
past tense expression, the underlying systems of AAVE and other
varieties of English were found to be similar, though AAVE tends to
extend English rules through the application of additional phonological
* I gratefully acknowledge the generous support of the Social Sciences and
Humanities Research Council of Canada, in the form of research grants
#410-90-0336 and #410-95-0778 for the project of which this research
forms part. I thank Salikoko Mufwene for his insightful comments and
especially his help in making sense of semantic vs. morphological tense
and aspect, Marjory Meechan for her meticulous critique of a final version of
this manuscript and Shana Poplack for comments and encouragement all
through.
York Papers in Linguistics 17 (1996) 351-396
Sali Tagliamonte
and grammatical processes.1 In other areas of the verb system, such as
present time reference, the patterning of surface forms, although
atypical of contemporary varieties of standard English, has been shown
to constitute reflexes of linguistic change whose patterns of variability
reflect the state of the English vernaculars to which the slaves were
exposed (Poplack and Tagliamonte 1989, 1991),2 while simultaneously
differing from the behaviour proposed for creoles (e.g. Tagliamonte et
al. 1996). But these findings have not been univocal. Some researchers
such as Winford (1991), DeBose (1994), and DeBose and Faraclas
(1994) claim that contemporary AAVE preserves traces of a creole
grammar. Thus, despite decades of research, the origins of AAVE
remain controversial.
One area of the tense/aspect system which presents a case in point
for this issue is what I will refer to here as the PERFECT. In standard
English the PERFECT is typically equated with the morphosyntactic
construction have + past participle, as in (1).3
(1) AUXILIARY HAVE + PAST PARTICIPLE:
a. Some of them have regretted it already. Yes, many of 'em
have regret it already. (SE/006/171-173)4
b. It been so long I've forgotten. (SE/020/87)
c. I have been told that if they know you handling money,
they raise your wages. (SE/010/1005-7)
d. That was the first they learnt me and I'm old and it have
remained here. (SE/002/115-6)
1 See Baugh 1980; Fasold 1971, 1972; Labov 1969, 1972a; Labov et al.
1968; Pfaff 1971; Poplack and Sankoff 1987; Tagliamonte and Poplack
1988; Wolfram 1969, 1974.
2 See also Poplack & Tagliamonte 1994 for the plural.
3 In these data the main verb of the have + past participle construction can
surface as a weak verb without inflection or as a strong verb with preterit
morphology, in addition to the standard English past participle form, as
illustrated in the second verb phrase in (1a).
4 Codes in parentheses identify the speaker and line number in one of two
corpora, Samaná English (SE) or the Ex-slave Recordings (ESR). For details
of the corpora see below.
In AAVE the infrequency of verbal constructions with have coupled
with the plethora of other forms used for comparable, though not
entirely similar functions, e.g. auxiliary be, as in (2), preverbal done, as
in (3), bare past participles, as in (4) and ain't + verb, as in (5), have
been used as evidence of an underlying creole system.
(2) AUXILIARY BE + VERB:
a. I'm pass a lot of trouble. (SE/002/374)
b. Now they have so many houses. They all is made it one
thing. (SE/003/480-2)
c. I'm forgot all them things. (SE/015/257)
d. Well, with me nothing is happen, nothing strange.
(SE/006/144)
e. Let me see, I'm near forgot what I was to holler.
(ESR/001/43)
(3) PRE-VERBAL DONE:
a. Plenty done gone and they's lose their life. (SE/005/476)
b. I done been to Miami, Hollywood ... (SE/010/1032)
c. So much trouble done pass. (SE/002/113-4)
d. Grandpa was always saying them old oxens done run off in- runned off in the river with us. (ESR/00Y/62)
(4) BARE PAST PARTICIPLE DONE/BEEN/SEEN/GONE:
a. I never seen him. (SE/001/919)
b. They been fixing the road. (SE/015/221)
c. She gone to San Martin. (SE/005/114)
d. Because what I had to do, I done it when I could.
(SE/011/1144)
(5) AIN'T + VERB:
a. He ain't wrote yet ... He ain't write yet. (SE/019/236-7)
b. She ain't married none yet. (SE/005/160)
c. I ain't got nothing to do. (SE/011/1143)
d. I ain't never wore none. (ESR/00X/270)
This study considers the PERFECT in two corpora which represent
an earlier stage of AAVE: Samaná English and the Ex-Slave
Recordings. The Samaná English corpus comprises 21 interviews with
native English-speaking descendants of American ex-slaves, who settled
the remote peninsula of Samaná in the Dominican Republic in 1824
(Poplack and Sankoff 1987). The variety spoken by these informants is
considered to derive from a variety of English spoken by African
Americans in the early nineteenth century.5 The Ex-Slave Recordings
are a series of audio-recorded interviews with 11 former slaves born in
the southern United States between 1844 and 1861 (Bailey et al. 1991).
These corpora bear crucially and uniquely on the controversial origins
and development issues in the current study of AAVE since they
provide the necessary time-depth for assessing linguistic change (ca.
1800's) and the advantages of data drawn from naturally-occurring
speech.
In PERFECT contexts, both Samaná English and the Ex-Slave
Recordings exhibit the same forms attested in contemporary studies of
AAVE, listed in (1)-(5) above. They also contain 'three verb clusters'
with auxiliary have and be, as in (6) and (7), English preterite
morphology, as in (8), and solitary verb stems, as in (9).
(6) THREE VERB CLUSTER WITH AUXILIARY HAVE:
a. He told me that he had done pass through them English
books. (SE/006/315-6)
b. He had done been to Saint Thomas and place. (SE/001/647)
(7) THREE VERB CLUSTER WITH AUXILIARY BE:
a. They ain't paid us yet and I'm done spent plenty money
with the documents. (SE/006/155-6)
b. I'm done been over there plenty but I don't like it.
(SE/005/312-3)
5 For detailed background and justification for this contention, see
Poplack and Sankoff 1987; Poplack and Tagliamonte 1989; Tagliamonte
1991; Tagliamonte and Poplack 1988.
(8) PRETERITE MORPHOLOGY (SUPPLETION AND INFLECTION):
a. They all died out already. (SE/013/80)
b. But I don't know what took her now. (SE/015/245)
(9) UNMARKED VERBS:
a. I'm got eighty - going on eighty five. I never put my foot to
[an] obeah. I don't believe in that. (SE/002/1072-3)
b. I never like the city. (SE/013/113)
In this article I perform a distributional analysis of the forms used
for the PERFECT in these materials. The term PERFECT is employed to
refer to the semantic functions which prescriptive English grammar has
labelled 'present perfect' tense. The morphosyntactic constructions that
occur within these contexts are referred to as surface 'forms'. I approach
these data from two different perspectives. In the first I take the
semantic functions of the English PERFECT as the starting point and
examine the frequency and distribution of forms that occur there. In the
second, I begin with the individual forms and investigate their co-occurrence patterns with a number of independent features of the
linguistic environment.
In order to assess the grammatical function and/or functions of
these forms, I draw comparisons with standard and vernacular varieties
of English and English-based creoles while at the same time casting the
analysis into the larger context of linguistic change. My results suggest
that despite the multitude of different forms, their distribution in
Samaná English and the Ex-Slave Recordings patterns in the same way
as the English perfect. Co-occurrence patterns of the most frequent
forms in past time reference contexts more generally provide additional
support for this contention. Further, parallels not just in form, but also
in function with earlier stages of the English language suggest that the
non-standard variants can be interpreted as synchronic remnants. These
findings corroborate the accumulating evidence from earlier independent
analyses of Samaná English and the Ex-Slave Recordings.6
6 Poplack and Sankoff 1987; Poplack and Tagliamonte 1989; 1991; 1994;
Tagliamonte 1991; Tagliamonte and Poplack 1988; 1993.
2. Previous Analyses of the PERFECT in AAVE, Creoles and English
2.1. AAVE
The standard English PERFECT has generally been considered absent
from the underlying system of contemporary AAVE (Fasold and
Wolfram 1975: 65; Labov et al. 1968: 254; Loflin 1970; but cf.
Rickford 1975 for an alternative perspective). Three types of evidence
have been adduced in favour of this contention. First, the
morphosyntactic construction have + past participle is said to be
extremely rare. Second, verbs other than have appear in auxiliary
position, as in (10).
(10) a. I was been in Detroit.
b. I didn't drink wine in a long time. (Labov et al. 1968: 254)
Third, past participles, e.g. been, done, sometimes occur without a
preceding auxiliary, as in (11), and where they cannot be accounted for
by deletion of an underlying have.
(11) a. He been know your name.
b. He been own one of those.
This means they cannot be interpreted as an English past participle in a
present or past perfect construction (Labov 1972b: 53).
The explanation for these linguistic facts involves not only a
rejection of the PERFECT as a category of AAVE grammar, but also a
denial that the standard English distinction between the preterite and
past participle exists. A single surface form with no auxiliary appears
across the board whether it surfaces with the morphology of an English
past participle, as in example (12a), preterite, as in (12b), or there is
alternation between forms, as in (12c).
(12) a. He taken it.
b. He came vs.
c. He done it. vs. He did it.
Wolfram and Fasold (1974: 66) suggest that instead of a separate
past participle in AAVE, there is a 'general past form' that encompasses
a number of separate categorical distinctions in English, particularly the
simple past and PERFECT.
But what underlying grammar produced these forms? Many
researchers have suggested they derive from a creole system. Dillard
(1972a) divides pre-verbal done into two separate categories, one with
an auxiliary preceding, e.g. He's done come, and one with no auxiliary,
e.g. He Ø done come, attributing this difference to the distinct sources
of the respective forms: AUX + done being an English form and Ø +
done a creole form. Fickett (1972) suggests that been and done
represent specific time periods in the past, i.e. done for recent past, and
been for remote past time. Although this particular function for done is
not widely attested, the remote time interpretation for been is quite
widespread (e.g. Dillard 1972a; 1972b; Stewart 1965; Wolfram and
Fasold 1974).7
2.2. Creoles
In creoles, pre-verbal done and been are widely-cited as typical
tense/aspect features. While done is considered a perfective or
completive marker (e.g. Alleyne 1980), been is considered a past and/or
anterior marker, often with a remote interpretation (e.g. Agheyisi 1971;
Faraclas 1987). The English have + past participle does not appear at
all, pointing to a polar distinction between English and creole
grammars (Bickerton 1975: 128). Unfortunately, the literature on this
subject is entirely qualitative making form/function inferences about
these forms difficult to assess. The only empirical investigation,
Winford's analysis of the PERFECT in Trinidadian Creole, corroborates
Bickerton's claim with its dramatic split between have usage with
middle class speakers and verb stem forms with working class speakers
(Winford 1993).
7 Rickford 1975 specifies that the remote time interpretation is only
applicable to the stressed version of been in AAVE.
YORK PAPERS IN LINGUISTICS 17
2.3. English
But what exactly is the nature of the English PERFECT system? Much
of the research claiming that AAVE has a creole-like grammatical
system has based its conclusions on comparisons of AAVE features
with standard (prescriptive) contemporary English usage rather than
with vernacular varieties of English to which Africans must have had
closer historical connections (Montgomery and Bailey 1986: 13), or
with related present-day white vernaculars (Butters 1989: 194; Rickford
1990; Vaughn-Cooke 1987: 68) to which it might be more
appropriately compared. Research on present-day varieties of vernacular
American (Christian et al. 1988; Feagin 1979) and British English
(Ihalainen 1976), as well as other regional varieties, e.g. Tristan da
Cunha (Scur 1974) and Newfoundland, Canada (Noseworthy 1972),
has confirmed that many morphosyntactic forms used in PERFECT
contexts in AAVE also appear in a wide geographic range of English
dialects, many of which are entirely beyond the realm of creole
influence. Thus, for example, there is no independent validation of
Winford's (1993) claims that the patterns of surface forms used for
PERFECT functions in Trinidadian Creole differ from an English one.
In what follows I describe the inventory of surface forms that have
been attested in the literature on different varieties of English and review
the hypotheses (where they exist) which have been put forward to
explain them. We will see that the surface forms found in contexts of
PERFECT reference are virtually the same across descriptions of AAVE,
creoles and other varieties of English.
2.3.1. Have Deletion
The most frequently cited non-standard form in PERFECT contexts is an
English past participle which surfaces with no preceding auxiliary, as in
(13) and in example (4) above.
(13) a. He been there. (SE/001/189)
b. Don't do that. I never done it. (ESR/008/25)
This form is attested in the United States (e.g. Atwood 1953; Christian
et al. 1988; Fries 1940; Krapp 1925; Marckwardt 1958; Mencken 1971;
Menner 1926; Vanneck 1955), Canada (Orkin 1971), Australia (Turner
1966), England (Wakelin 1977), Ireland (Visser 1970) and Tristan da
Cunha (Scur 1974). The most popular explanation for this form is the
have-deletion hypothesis which assumes that the forms with, and
without, have fulfil the same function and thus can be attributed to the
removal of an underlying have (e.g. Barber 1964; Wright 1905). But
this does not explain its appearance in contexts in which the distinction
between preterite and past participle appears to be neutralized (e.g.
Menner 1926).
2.3.2. Generalized Past Marker
Thus, a second hypothesis for the bare past participles is that they
represent the development of a new semantic category. They were
originally based on the PERFECT but contexts in which the auxiliary
syncopated, i.e. I('ve) seen, I('ve) done, led to complete elision. This
auxiliary-less form was then adopted in vernacular varieties, reanalyzed
as a preterite and extended to all the functions of the past tense (Menner
1926: 238; Vanneck 1955), so that I seen him has come to have exactly
the same meaning as I saw him (Mencken 1971: 520). This
explanation for the bare past participle parallels the 'general past form'
posited for AAVE (Wolfram and Fasold 1974: 66).
2.3.3. Loss of the PERFECT Tense
Some researchers have suggested this is the first stage in a process
which will lead to the eventual loss of the PERFECT category in the
grammar. This conclusion is said not to be surprising in light of the
fact that the position of the PERFECT in the history of many languages
is rather unstable, having been lost and reintroduced at various times
(Scur 1974: 22; Vanneck 1955). For example, in French the gradual
relaxation of the degree of recentness or current relevance required for
use of the PERFECT enabled its form to supplant the simple PAST while
losing its original meaning.8
8 French, High German and Russian have all lost the distinction between
preterit and perfect and the same phenomenon is characteristic of some
other Germanic languages, Swedish and some Slavic languages as well (Scur
& Svavolya 1975).
2.3.4. Lexical Restriction
Given these descriptions of have deletion one would think that bare past
participles are frequent and productive forms. In fact, a bare past
participle is a rare item in English since the only contexts where one
can be unambiguously identified are with strong verbs. Weak verbs,
which have no distinction between preterite and past participle
morphology, would appear as preterites in the event of have deletion,
making them indistinguishable from the (simple) past tense. Even
within this limited range of contexts however, bare past participles
rarely occur. An empirical study of variant forms of the PERFECT in the
English of Tristan da Cunha, a small island in the South Atlantic
(Scur 1974), revealed only five: the verbs see, be, and do and
sometimes come and get, as in (14) below.
(14) a. I been to South Africa.
b. We never seen a tractor around.
c. They done away with it.
d. We got plenty of them.
e. They just come.
The same lexical restriction appears to be true of different varieties
of English in England. Cheshire (1982) reports that working class
teenagers in Reading used done categorically in the preterite, as in (15),
while Hughes and Trudgill (1979) report variable occurrence of seen,
(16a), and done, (16b), as preterites.
(15) She done it, didn't she, Tracy? (Cheshire 1982)
(16) a. You never seen them, you know. (Hughes and Trudgill 1979: 68)
b. I done another couple of years there, then they closed up.
(Hughes and Trudgill 1979: 79)
2.3.5. Been and Done
Two frequently cited bare past participle forms, been and done, require
special mention because they appear in contexts which are not always
directly translatable into standard English via have deletion (see Section
2.1). Despite this fact, these forms are attested in vernacular (white)
English in a wide range of geographic locations in the United States and
Canada (Feagin 1979; Noseworthy 1972; Williams 1975). Done has
been referred to as an adverb (Feagin 1979) or a quasi-modal (Christian
et al. 1988) and is generally considered 'completive/emphatic'. Been is
generally attributed with meanings equivalent to the standard English
PERFECT although in Newfoundland, e.g. (17a-b), Noseworthy suggests
that it has a connotation of remoteness, indicating that the state of affairs took place 'farther back in the past than any action denoted by ...
have + past participle' (1972: 21-2). Note the similarity to attested
creole patterns (see Section 2.2). In Alabama, as in (18a-c), the
meaning corresponds to 'begun in the past long ago and continued up to
the present', or simply 'once, long ago', as in (18a-b).
(17) a. I ain't been done it.
b. I been cut more wood than you. (Noseworthy 1972: 22)
(18) a. I been knowin' your grandaddy for forty years.
b. Well, I chewed tobacco some, and then I started smokin',
started smokin' cigarettes. Course I been quit about 15
years since I smoked. (Feagin 1979: 255-6)
2.3.6. Three Verb Clusters
Although relatively obscure, three verb clusters, of the form AUX +
done + verb, are also attested in vernacular (white) English in the
United States (McDavid and McDavid 1960). Christian et al. (1988: 43)
describe an uninflected form, i.e. done, which occurs before an inflected
verb optionally preceded by an inflected auxiliary in Ozark and
Appalachian English, as in (19a-d). The same structure surfaces in
Alabama English, as in (19e) (Feagin 1979).
Ozark English:
(19) a. I think they done took it.
b. Them old half gentle ones has all done disappeared.
(Christian et al. 1988: 33)
Appalachian English:
c. She asked us if we turned in the assignment; we said we done
turned it in.
d. ... because the one that was in there had done rotted.
(Christian et al. 1988: 33)
Anniston, Alabama English:
e. You buy you a little milk and bread and you've done spent
your five dollars! (Feagin 1979: 122)
2.3.7. Auxiliary Be vs. Have
Use of be as an auxiliary in PERFECT contexts instead of have is
attested in contemporary varieties in England, Scotland, Ireland (Curme
1977; Edwards and Weltens 1985) and in the southern United States
(Feagin 1979), as in (20).
(20) a. Some of the unions is done gone too far.
b. It was so quiet I thought everybody was done gone to bed.
(Feagin 1979: 127)
2.3.8. Present Perfect vs. Simple Past Tense
Clearly, there is robust variability amongst PERFECT forms in
vernacular English. In addition, although the meaning of past and
perfect tenses in English is distinguished in many cases, researchers
widely acknowledge (Quirk et al. 1985: 191; Wright 1905: 298) that,
even in the standard language, there are many contexts in which either
one may be used (Frank 1972: 81; Leech 1971: 43), as in (21).
(21) a. Now, where did I put my glasses?
b. Now, where have I put my glasses? (Leech 1987: 43)
This is also typical of Samaná English and the Ex-Slave
Recordings, as in (22), where the past and perfect forms can occur
within the same context, in the same discourse, by the same speaker, as
in (23).
(22) a. God left me here for some purpose. (SE/002/390)
b. They didn't send it to me yet. (SE/022/390)
c. They all died out already. (SE/014/80)
(23) But the wind and the rain has wash them away. The rain
wash them away. (SE/020/262-4)
In earlier varieties of English, interchangeability between
these two categories was quite common and, in fact, far more variable
than in the contemporary system. So, while many researchers have used
distributional asymmetries with standard English functions to argue for
an alternative grammar for surface forms used in PERFECT contexts,
diachronic evidence may suggest another explanation. I now turn to the
historical record.
3. Historical Development of the Perfect in English
In Old English, there were only two tenses: past and non-past. While
the non-past served for durative and non-durative present and future
reference, the past covered not only what is represented by the simple
past of today, but also durative past tense (e.g. past progressive), as
well as the PERFECT and past perfect tenses of the contemporary system
(Strang 1970: 311). In other words, there were no overt forms to
distinguish between habitual and progressive aspect and between
PERFECT and NON-PERFECT meaning (Traugott 1972: 90-1). This can
be seen in example (24) below, where habitual activity has no representative auxiliary (24a), and (24b) in which the simple past tense
inflection marks a function that today would be overtly marked with the
auxiliary and tense inflection combination of the perfect.
(24) a. 7 se cyning 7 ða ricostan men drincaþ myran meolc
'and that king and those richest men drink mare's milk'.
(Traugott 1972: 89)
b. ðe cyðan hate ðæt me com swiðe oft on gemynd hwelce
wiotan iu wæron giond Angelcynn
to-you tell command that to-me came very often to mind what
wise-men before were throughout England
'Let it be known to you that it has very often come to my
mind what wise-men there were formerly throughout England.'
(Traugott 1972: 91)
Clearly, simple past and the perfect tenses were not differentiated.
Moreover, it was often the case that the preterite forms marked a
function that today would be overtly marked with the auxiliary and
tense inflection combination of the perfect (Brunner 1963: 86; Traugott
1972: 90-91). In fact Visser (1970) claims that the simple past and
present perfect are interchangeable in most contexts, including those
where either one or the other alone would be required in contemporary
usage.
During the change from Old to Middle English this two-tense (past
vs. non-past) inflectional verb system underwent substantial elaboration
(Strang 1970: 98), putting in motion a four-century long changeover
from a highly inflectional (or synthetic) tense system to a periphrastic
(or analytic) one (Traugott 1972: 110).
3.1. Elaboration of the Verb Phrase
One of the most important changes to take place in the English time
reference system was the development of separate elements within the
verb phrase, in addition to the suffixal inflection on the main verb, to
mark tense and/or aspect distinctions in addition to the original, and far
more general, PAST tense.
3.1.1. Have/had
Perhaps the most prominent expansion of the tense system was the
development of the present and past perfect tenses from the stative main
verbs have and be as in (25) below.
(25) I have the letter written (i.e. in a written state).
Because the simple past tense gradually shifted in emphasis to
explicitly PAST time there was a need for a new verbal structure that
could function to represent a close relationship between PAST and
PRESENT time. Since a written state implies a previous action, the
structure have written gradually acquired verbal force, serving as a verbal
form pointing to the past and bringing it into relation with the present
(Curme 1977: 358).
During the initial phase of this development have and be competed
as auxiliaries for the new category, as in (26); however, have/had
gradually generalized to more and more verbs and eventually prevailed
over be (Curme 1977: 359).
(26) a. He took his wyf to kepe whan he is gon vs. and also to han
gon to solitaire exil
b. the yonge sonne hath in the Ram his halfe cours yronne vs.
as rody and bright as dooth the yonge sonne that in the Ram
is foure degrees up ronne
(these examples from Chaucer cited from Brunner 1963: 87)
3.1.2. Three Verb Cluster
During the Middle English period a 'three-verb structure' developed, e.g.
He had done speak (cf. Visser 1969: 338ff). While the origins of this
form are obscure, it clearly represented a completed past time reference
action, as in (27a). Inflection on the past participle was apparently
variable as the form of the main verb originally surfaced as an
infinitive, e.g. speak, but was gradually replaced by the past participle,
e.g. I had done spoke, probably by analogy to forms such as I done it
(Visser 1970: 2210). Similarly, as Traugott (1972: 146, n.18) points
out, the past participle inflection -ed on weak verbs is not required.
Hence forms such as has done invent and has done invented were
synonymous, as in (27b).
(27) a. Also he seyde ... he hadde don sherchyd att Clunye.
'Also he said ... he had done searched at Cluny.'
(He had finished searching) (Traugott 1972: 146)
b. And many other false abusion The Paip (=Pope) hes done
invent.
(Traugott 1972: 146)
Between Middle and Modern English the form with done became
stigmatized as nonstandard. It did not survive past the fifteenth century
in Southern England (Williams 1975: 273); however, in the Northern
dialectal regions it remained common.
3.1.3. Summary
The obvious similarities between the 'creole' forms reported in the
literature on AAVE and these Early Modern English analogues have not
gone without notice (e.g. Christian et al. 1988; D'Eloia 1973;
Herndobler and Sledd 1976; Schneider 1993; Traugott 1972). The same
forms as well as standard English have + past participle are also attested
in written representations of earlier varieties of AAVE (Schneider
1989).
Comparisons based on similarities between surface forms alone,
however, do not provide unambiguous evidence for semantic function or
genetic relationship. It is by now well known that linguistic items
from one language may pattern entirely according to another's rules
(e.g. Bickerton 1975; Mufwene 1983a; Rickford 1977; Singler 1990;
Tagliamonte et al. in press; Winford 1985). Other forms may represent
two systems simultaneously (e.g. while verb stems in creoles have very
similar interpretations to the English simple past tense, the same past
tense can also be used interchangeably with the present perfect in many
PERFECT contexts). Unfortunately, very few conditioning factors, in
particular linguistic ones, which might help to illuminate these facts
have ever been mentioned, nor, in the rare cases that such factors have
been considered, have they ever been identified. Thus, there is no basis
from which to differentiate between verbal patterns that are inherent to
the English language and those which could possibly be due to
hypercorrection, incomplete acquisition or even an alternate system.
The case of the PERFECT in English and creole grammars is a
particularly difficult site for disentangling these issues because it is a
semantic domain in which there is a complete lack of isomorphism
between morphological distinctions (i.e. form) and semantic
distinctions (i.e. function).
4. Circumscribing the Variable Context
The conceptual space of PERFECT comprises both a semantic aspect
(i.e. current relevance) and a semantic tense (i.e. indefinite past). Thus,
the form have + past participle is related to more than one semantic
function. On the other hand, what is often not recognized in the
literature is that these semantic functions can be represented by more
than one form as well.
In addition to the parallels between overt English and creole
PERFECT markers, both grammars can be expected to admit morphosyntactically unmarked verbs for the same semantic functions. Because
English (at least) has widespread phonological deletion in (weak) past
time reference, verb stems are possible variants of the simple past. By
extension, this means that in PERFECT contexts as well, at least three
surface forms might occur: have + past participle, preterite and, to some
extent, verb stems. In creoles, on the other hand, where the PERFECT is
said not to exist, either as the form have + verb or as a category in
the grammar (Bickerton 1975: 129), we might expect many verb
stems in PERFECT contexts, as found by Winford (1993), and/or creole
forms, such as done and been. Thus, as found in previous studies of the
tense/aspect system (Poplack and Tagliamonte 1989; 1991; 1994;
Tagliamonte and Poplack 1993; Tagliamonte et al. in press), the mere
existence of a form is not sufficient to identify the underlying
grammatical mechanism that produced it.
Take, for example, the been + verb construction: If this surface
form was produced by an English grammar it would be explained as one
in which the auxiliary have has been deleted and would be construed as a
variant of the PERFECT. While this form does correspond in some
instances to the English present perfect as in, e.g. John been workin'
here all day today, there are often cases where it corresponds to the past
or past perfect tense as well, suggesting that it cannot be solely equated
with the PERFECT and hence cannot be attributed to an English-like
grammar (Bickerton 1975; 1979; Dillard 1972a; Mufwene 1983b;
Stewart 1970). Instead, it may represent a creole remote past or anterior
marker. Similar arguments can be made for the done + verb
construction. It corresponds sometimes to English present perfect and
sometimes to past perfect tense depending on the context (Mufwene
1988: 258) and for these reasons it may reflect an underlying creole
function, such as completive, unrelated to the standard English system.
However, differentiation between patterns that are inherent to the
English language and those which derive from an alternate grammatical
system can only be observed through analysis of the frequency and
distribution of forms across all the contexts in which they might have
occurred and in relationship to all other forms and functions within the
past time reference system more generally.
5. Results
In order to evaluate these possibilities, the analyses reported here
approach these data from two different perspectives: surface form and
semantic function. First, every verb which referred to (realis) past time
was extracted and coded for its morphosyntactic characteristics. Then,
using prescriptive English grammar as a point of comparison, each
surface form was categorized according to the semantic tense/aspect
function(s) for which it was used. This allows for a calculation of
form/function correspondences in the data. Finally, the co-occurrence
patterns of each surface form were examined according to a number of
independent linguistic features from the literature on this subject, e.g.
time adverbs, conjunctions, and remoteness.
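The form/function tabulation just described can be illustrated with a minimal sketch. It is not the procedure used in the study: the token list, the labels and the function name percent_by_function are hypothetical, and the choice to compute percentages within each semantic context is an assumption made for the illustration.

```python
from collections import Counter, defaultdict

# Hypothetical coded tokens: (surface form, semantic context).
# These are invented for illustration and are not data from the corpora.
tokens = [
    ("preterite", "past"),
    ("preterite", "past"),
    ("verb stem", "past"),
    ("have + past participle", "present perfect"),
    ("preterite", "present perfect"),
    ("done + verb", "present perfect"),
]


def percent_by_function(coded_tokens):
    """Return, for each semantic context, the percentage of its tokens
    realized by each surface form."""
    counts = defaultdict(Counter)
    for form, function in coded_tokens:
        counts[function][form] += 1
    result = {}
    for function, forms in counts.items():
        total = sum(forms.values())
        result[function] = {form: 100 * n / total for form, n in forms.items()}
    return result


table = percent_by_function(tokens)
for function, forms in table.items():
    for form, pct in sorted(forms.items()):
        print(f"{function:16s} {form:24s} {pct:5.1f}%")
```

Computing the percentages within each surface form instead (how the tokens of one form are spread over contexts) would only require swapping the two keys when counting.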
5.1. Distributional Analysis by Semantic Function
Table 1 depicts the overall distribution of surface forms across all past
time reference contexts. Observe that both Samaná English and the Ex-Slave
Recordings have the same range of variants and, with no
substantial exceptions, the same hierarchy.9 As illustrated earlier, in
(1)-(9), both contain surface forms consistent with the literature on the
PERFECT in creoles as well as vernacular and historical varieties of
English. Have + past participle and bare past participles occur in both
9 The small differences in hierarchy amongst the rarer variants are
undoubtedly due to their extremely low frequency overall.
Table 1
Overall distribution of surface forms found in past time reference
contexts in Samaná English and the Ex-Slave Recordings.

                                Samaná English    Ex-Slave Recordings
Surface Form                       N       %          N       %
Preterite                       4861      62       1162      58
Verb stem                       1311      17        331      16
Habitual, progressive etc.10    1152      15        371      18
was/got passive                  150       2         47       2
had + past participle            120     1.5         15       1
have + past participle11          86       1         18       1
Past participle                   58      .7         28       1
Verbal -s                         46      .6         25       1
be + verb                         39      .5          1     .04
ain't + verb                      36      .5          5      .3
done + verb                       10      .1          7      .4
had + done + verb                  6     .07          3      .2
be + done + verb                   4     .05          0
TOTAL N                         7879               2013

corpora with the same frequency. Done + verb, as well as three verb
clusters with auxiliary be or have also occur. But none of these surface
forms exceed 1% of the data, not even the English PERFECT marker
have. Can the striking infrequency of have forms be used as evidence
that PERFECT is not a full-fledged category in these data? And is there
10 This category consists of habitual forms such as used to, would + verb
and variants of the progressive, e.g. was going, which are not the focus of
this investigation.
11 This includes have/has/'s as well as a following verb form which could
include unmarked weak verbs and strong verbs with preterit morphology, in
addition to standard English past participles.
any evidence that any of these fulfill creole-like, rather than English-like, functions?
These questions can only be answered by taking into account the
distribution of forms by semantic function. For example, even though a
surface form may be infrequent, this may be entirely due to the fact that
the meaning which it embodies was also quite rare. Each past time
reference verbal construction was coded according to all tense/aspect
categories which could have been used in the same context: (i) the
context required the present perfect, as in (28), (ii) the context
permitted either the present perfect or the simple past, as in (29) and
(22) above, (iii) the context required the simple past, as in (30), (iv) the
context required the past perfect, as in (31), and (v) the context
permitted either the past perfect or simple past, as in (32). The
remainder, under the heading 'Other', consists of contexts permitting
habitual and progressive forms which are not the focus of this study (cf.
Tagliamonte and Poplack in progress).
(28) PRESENT PERFECT TENSE REQUIRED:
a. But today we calmed off and everything got calm. (SE/002/942)
b. I came in last Friday and I ain't been nowhere. (SE/002/1339-40)
c. Now, those things fell out. (SE/016/173)
(29) PRESENT PERFECT OR PAST TENSE:
They didn't send it to me yet. (SE/001/1149)
(30) PAST TENSE REQUIRED:
This morning we went to the church in Clara. (SE/006/1549)
(31) PAST PERFECT TENSE REQUIRED:
Because they hadn't cut the road yet. (SE/002/708)
(32) PAST PERFECT OR PAST TENSE:
Well then, they killed the boy. After they killed the boy....
(SE/002/948)
Samaná English and the Ex-Slave Recordings represent an earlier
variety of English spoken in the United States. If that variety developed
directly from contact with contemporaneous English vernaculars, then it
would not be unreasonable to expect that verbal constructions which
have since disappeared from contemporary standard English might
appear there. I hypothesize that if a specific set of surface forms was
once possible in the semantic context for PERFECT, i.e. have or be
auxiliary, three verb clusters, done/been + verb etc., then we should
observe some proportion of each of these forms within those contexts.
We should also expect restricted usage of some forms in environments
which have become specialized to only one tense in contemporary
standard English, a context which requires the present perfect for
example. If Samaná English and the Ex-Slave Recordings are creole-like, on the other hand, then it would not be unreasonable to expect
verb stems, been and/or done to appear in PERFECT contexts rather than
have. Moreover, we should also expect the distribution of these forms
to follow attested creole patterns, such as remoteness distinctions. Such
correspondences will enable us to evaluate whether or not the
distribution of morphological marking parallels what would be expected
in an English or creole system.
Tables 2 and 3 (see over) depict the percent distribution of each
surface form by semantic function. Note the infrequency, but highly
partitioned distribution of the rarer PERFECT forms.12 Bare past
participles, preverbal done, auxiliary be and the three verb clusters are
restricted to environments where the English present perfect tense can
occur (or in the case of the three verb cluster with had, past perfect
tense). The specifically creole form been + verb does not occur at all!
12 Passives and verbal -s clearly pattern with the simple past tense. The
latter are undoubtedly Historical Presents in the narrative complicating
action section of narratives of personal experience. Ain't + verb is
vanishingly rare and not specific to any context. See Howe 1994 for the
absence of ain't in past, as opposed to present time reference contexts
contra DeBose 1994.
Table 2
Percent distribution of surface forms by semantic function in
Samaná English.
SURFACE
FORMS
PAST
PAST/
PAST
PAST/
PAST
PER
PRES
PRES
PER
FECF
EN!'
ENT
PER
PER
FECF
FECF
FECr
Preterite
86
2
3
83
4
2
N
10
4861
10
1311
81
1083
7
120
1
88
86
0.2
0.2
Verb stem
OTHER TOTAL
0.4
Habitual and
progressive
had + past
participle
got passive
have + past
participle
was passive
Past participle
3Vb cluster had
Verbal -s
be auxiliary
41
3
ain't
17
done + verb
Vb cluster
with be
20
TOTAL N
5728
18
26
0.002
47
0.8
18
3
1
2
51
92
3
2
19
9
33
95
2
37
2
44
3
62
2
58
54
6
46
11
39
36
3
221
--
60
50
20
50
33
277
96
10
4
1524
7879
Table 3
Percent distribution of surface forms by semantic function in
the Ex-Slave Recordings.
PAST/
PAST/
SURFACE
FORMS
PAST
PAST
PAST
PRES
PRES
ENT
PER
PER
ENT
FECT
FECF
PER
PER
FECF
FECF
Verb stem
Habitual and
1
63
0.17
0.86
0.3
0.9
14
53
47
participle
have + past
participle
t
done + verb
3Vb cluster be
TOTAL N
1162
34
331
86
360
0.28
had + past
Past participle
3Vb cluster had
Verbal -s
be auxiliary
26
0.09
2
progressive
was passive°
TOTAL
%
%
Preterite
OTHER
''''
15
18
4
43
28
16
25
2
98
25
14
76
75
8
3
25
1
40
1176
40
20
5
7
7
14
14
14
14
19
43
37
12
29
29
13 The got passive does not occur in these data.
372
730
2013
Consider these patterns in the context of the history of the English
language. The present perfect tense developed over a long period of time
in which alternation of have and be as auxiliaries and even multiple
auxiliaries such as have + done and be + done are amply attested. The
sporadic, but localized occurrence of exactly the same forms here and in
the very contexts where they would be expected to occur given this
history is striking.
Historical grammars reveal that at least some aspects of the
linguistic environment exerted an influence on the occurrence of some
of these forms. The auxiliary be tended to be used with intransitives
(Brunner 1963: 87) and where the participle clearly expressed the idea of
a state or had an adjectival interpretation (Curme 1977: 359).
Accordingly, we examine the distribution of auxiliary be according to
the lexical aspect of the verb, illustrated in Table 4.
Table 4
Percent distribution of be vs. have auxiliary forms by lexical
aspect in Samaná English.

                   STATIVE       PUNCTUAL      TOTAL
SURFACE FORM        %     N       %     N        N
be + verb          71    27      29    11       38
have + verb        54    46      46    40       86

Observe that verbs with a stative reading have a marked tendency to
occur with auxiliary be. Moreover, 79% of these contexts were
intransitive, as in (33). This patterning is identical to that suggested in
the historical record.
(33) a. 'Cause them now, since the war is got civilized.
(SE/002/746-7)
b. I'm never been in prison half an hour. (SE/021/988)
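The row percentages in Table 4 can be checked against the cell counts with a short sketch. The counts below are the N values as read from Table 4, and the assumption that each percentage is taken over the row total for that auxiliary is mine; the function name row_percentages is hypothetical.

```python
# Cell counts (N) as read from Table 4 for Samaná English.
# Assumption: percentages are computed within each auxiliary (row).
counts = {
    "be + verb": {"stative": 27, "punctual": 11},
    "have + verb": {"stative": 46, "punctual": 40},
}


def row_percentages(row):
    """Return each cell of a row as a percentage of the row total."""
    total = sum(row.values())
    return {aspect: 100 * n / total for aspect, n in row.items()}


percentages = {aux: row_percentages(row) for aux, row in counts.items()}
for aux, row in percentages.items():
    cells = ", ".join(f"{aspect} {pct:.1f}%" for aspect, pct in row.items())
    print(f"{aux:12s} N={sum(counts[aux].values()):3d}  {cells}")
```

Recomputed values can differ from the printed percentages by about a point, presumably because of rounding in the original table.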
Consider the bare past participles. The vast majority occur in
contexts of present perfect tense, providing initial support for an
underlying auxiliary. However, a non-negligible proportion (about
25%) occurs in contexts for the simple past. Is this evidence for loss of
the PERFECT via a past verb form generalizing across the verbal
delimitation paradigm?
Further examination of these forms by lexical type, depicted in
Table 5, reveals that bare past participles are restricted to only four
verbs: done, been, gone and seen.
Table 5
Percent distribution of bare past participles by lexical type across
semantic contexts in Samaná English.
PAST/
PAST/
SURFACE
FORMS
PAST
PAST
PAST
PRES
PRES
PER
PER
ENT
ENT
FECT
FECT
%
%
43
17
%
been
done
gone
seen
TOTAL
CONTEXTS
25
5728
25
221
33
PER
PER
FECF
FECF
%
%
60
35
40
40
.277
60
50
96
OTHER
TOTAL
N
%
25
4
24
1524
4
58
5
But it is actually only done and seen which occur in contexts of
simple past, as in (34).
(34) a. They say they done as I done. (SE/006/256)
b. The daughter came and she seen about her. (SE/003/443)
Moreover, the form and its function parallel present-day varieties of
English (see Section 2.3.4). Thus, systematic encroachment of the bare
past participle into the domain of simple past tense (see Section 2.3.3)
is not borne out by these data.
In fact, present perfect contexts bear close to the entire inventory of
have + verb forms in Samaná English, whereas this form is used only
1% of the time anywhere else. A similar pattern is found in the Ex-Slave Recordings. Preterite morphology, on the other hand, occurs very
frequently, but only in the semantic contexts which require it in
standard English. This leaves the bare stem form. Does its use reflect a
creole grammar?
Clearly, its patterning is parallel to the inflected preterite form.
Taking into consideration the fact that simple past tense is often
rendered by the stem form due to phonological reduction processes in
vernacular varieties of English (e.g. Guy 1980; Neu 1981) as well as in
contemporary AAVE (e.g. Fasold 1972; Labov 1972b; Wolfram 1969),
this parallelism of preterite and verb stem is entirely predictable. There
is no association of the verb stem with PERFECT contexts as has been
found in a creole system (see Section 2.2).
5.2. Summary
There are amazing parallels in the frequency and distribution of surface
forms used for past time reference in Samand English and the Ex-Slave
Recordings. Those typical of contemporary English are the predominant
forms in every one of the semantic contexts considered and their
marking patterns are as would be expected in an English time reference
system. While there are a number of non-standard forms, all of these
have been previously attested either in the history of the English
language or in dialects of contemporary English. Moreover, their
functions, as can be determined here by the semantic contexts in which
they occur, and by the other forms with which they are used, pattern
according to what would be expected in an English grammar.
5.3. Distribution Analysis of Co-occurrence Patterns
I now turn to a distributional analysis of the most frequent forms¹⁴ and
examine their co-occurrence patterns across a number of independent
features of the linguistic environment which are specifically related to
PERFECT. I hypothesize that if a specific surface form is associated with
a given feature in English (or creoles) and the same is found to be true
in Samaná English and/or the Ex-Slave Recordings, then that will
provide a point of comparison. If such parallelisms can be found across
a number of features, I take this as evidence for similarity of the
underlying grammatical mechanism regulating the distribution of forms
in the data, and thus their grammars.
14 The infrequency of the rarer surface forms does not permit comparable
analysis.
5.4. Temporal Distance
In creoles, past time reference forms have been linked to relative distance
from speech time. In English grammar, differential location in time
cannot be said to be relevant to any tense except one, PERFECT,
which occurs under conditions of recency and current relevance (Dahl
1984: 118). In order to determine the pertinence of temporal distance to
the appearance of surface forms in Samaná English and the Ex-Slave
Recordings, each verb was categorized according to the event time. For example,
three distinct time periods are represented by the verbs in example (35):
a remote time represented by the verbal structure did buy, in (35a), a
less remote past time represented by the verbal structure had went, in
(35b), and a comparatively recent past represented by two unmarked
verbs, come and stay, in (35c-d).
(35) a. But in that time we did buy sugar four cent the pound, you
        hear, four cent the pound, time of Trujillo.
     b. And from since that look, the sugar had went up even to
        thirty cents, you hear.
     c. And it come back now to twenty and eighteen.
     d. And stay so, you hear. (002/890)
If the underlying system of these varieties is creole-like, we would
expect to find a correlation between specific time periods and specific
surface forms whereas if the system is English-like, the only area in
which temporal distance will demonstrate an effect will be in immediate
or continuing past contexts.
Figures 1 and 2¹⁵ compare the distribution of surface forms across
reference points at different temporal intervals in the past, i.e. remote,
distant, medial, recent, immediate and continuing. These are given in
terms of their percent occurrence out of the total number of all
tense/aspect forms.
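The percentages plotted in Figures 1 and 2 can be thought of as a cross-tabulation of time period against surface form. The following is a minimal sketch of that calculation, assuming each coded verb is stored as a (time period, surface form) pair; the function name `percent_by_period` and the toy tokens are hypothetical and are not drawn from either corpus.

```python
from collections import Counter, defaultdict


def percent_by_period(tokens):
    """Return {period: {form: percent}} for (time_period, surface_form) pairs.

    Each percentage is the share of a surface form out of all
    tense/aspect forms coded for that time period.
    """
    counts = defaultdict(Counter)
    for period, form in tokens:
        counts[period][form] += 1

    result = {}
    for period, forms in counts.items():
        total = sum(forms.values())
        result[period] = {form: 100.0 * n / total for form, n in forms.items()}
    return result


# Hypothetical toy tokens for illustration only (not corpus data).
toy_tokens = [
    ("remote", "V-ed1"), ("remote", "V-ed1"),
    ("remote", "V-b"), ("remote", "had + V"),
    ("continuing", "have + V"), ("continuing", "have + V"),
    ("continuing", "V-ed1"), ("continuing", "V-b"),
]
table = percent_by_period(toy_tokens)
```

Under an English-like system, one would expect the have + V share to be concentrated in the 'continuing' row of such a table, with the other rows looking alike.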
Figure 1
Distribution of preterite, verb stem, have + past participle, and had +
past participle by time period: Samaná English.
[Bar chart. Vertical axis: % of total. Horizontal axis: Time Period (Remote, Distant, Medial, Recent, Imm, Cont). Series: have + V, had + V, V-base, V-ed1.]
15 Abbreviations used in the tables and figures can be interpreted as
follows: 'V-ed1' refers to inflected or suppletive preterit forms. 'V-b' refers
to a verb stem. 'V-s' refers to unambiguous present morphology, e.g. don't,
-s. 'Hab' refers to habituals such as used to + verb and would + verb, among
others. 'V-ing', 'Ø V-ing' and 'is V-ing' refer to variants of the progressive.
Figure 2
Distribution of preterite, verb stem, have + past participle, and had
+ past participle by time period: Ex-Slave Recordings.
[Bar chart. Vertical axis: % of total. Horizontal axis: Time Period (Remote, Distant, Medial, Recent, Imm, Cont). Series: have + V, had + V, V-base, V-ed1.]
Despite a skewed representation of temporal distance in the Ex-Slave Recordings,¹⁶ all surface forms exhibit parallel occurrences
across past time reference time periods. These distributional facts
suggest that there is no remoteness distinction in the past time reference
system of either of these varieties.
One temporal context is an exception, that of 'continuing past'. In
both corpora it is composed of the same forms, have + past participle,
preterite and verb stems, and in the same proportions. Have + past
participle is almost non-existent in all other past reference times. The
same pattern is evident in the Ex-Slave Recordings.
16 In the Ex-Slave Recordings, 94.2% of all verbs considered come from
the same time period, that of the 'distant past'. This is the time period of
the Ex-Slaves' youth and/or childhood from which most of their
reminiscences take place. All other time periods combined make up only
122 tokens.
Recall that the function of the present perfect tense in English is to
describe 'an alliance between past and present time' (Jespersen 1964). In
these data, a form identical to that used in English for PERFECT
distinguishes itself from other potential past time reference forms of
sufficient frequency by the restriction of its occurrence to functions
which have been identified throughout the prescriptive and historical
literature on English as typical of PERFECT. Such correspondence
between form and function can hardly be coincidental and I interpret this
as another piece of evidence that the English present perfect is a viable
tense/aspect category in Samana English and the Ex-Slave Recordings.
5.5. Temporal Indicators
The interpretation of surface forms in creoles, particularly with regard to
time reference, is said to be dependent on context. In English, on the
other hand, the difference between tense categories, especially between
PERFECT and simple past tense, is marked by co-occurrence restrictions
with specific adverbs (e.g. lately, so far, already, yet, up to now, etc.)
and conjunctions (e.g. before, after, since, etc.) (cf. Huddleston 1984:
158-9; Jespersen 1964: 243; Leech 1971; Quirk and Greenbaum 1972:
44; Quirk et al. 1985).
5.5.1. Adverbs
In English grammar, features which predict where the present perfect is
preferred to the simple past are related to temporal specification (Visser
1970: 2192). In creole grammars, on the other hand, temporal adverbs
provide contextual cues which help to disambiguate morphosyntactically
unmarked verbs, in addition to the information provided by
the stative/non-stative distinction (Mufwene 1983a).
Accordingly, temporal indicators in the immediate (sentential)
environment of each past-reference form were tabulated in order to
determine what effect temporal adverbs have on surface morphological
forms in the two corpora. Figure 3 shows the frequency of adverbial
specification across surface forms.
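The tabulation described above amounts to computing, for each surface form, the share of its tokens that have a temporal adverb in the same clause. A minimal sketch follows, assuming each token is coded as a (surface form, adverb present) pair; the function name `adverb_rate` and the toy tokens are hypothetical illustrations, not corpus figures.

```python
def adverb_rate(tokens):
    """Return {form: percent of that form's tokens with a temporal adverb}.

    tokens is an iterable of (surface_form, has_temporal_adverb) pairs,
    where has_temporal_adverb is True or False.
    """
    totals = {}
    with_adverb = {}
    for form, has_adverb in tokens:
        totals[form] = totals.get(form, 0) + 1
        with_adverb[form] = with_adverb.get(form, 0) + (1 if has_adverb else 0)
    return {form: 100.0 * with_adverb[form] / totals[form] for form in totals}


# Hypothetical toy tokens for illustration only (not corpus data).
toy_tokens = [
    ("V-ed1", True), ("V-ed1", False), ("V-ed1", False), ("V-ed1", False),
    ("V-b", True), ("V-b", False), ("V-b", False), ("V-b", False),
    ("have", True), ("have", False),
]
rates = adverb_rate(toy_tokens)
```

Because each rate is a proportion of the form's own tokens, categories with very few tokens (such as a form attested only a handful of times) will yield unstable percentages, which is worth bearing in mind when comparing forms.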
Figure 3
Percent frequency of adverbs by surface forms in Samaná
English and the Ex-Slave Recordings.
[Chart. Horizontal axis: Morph type (V-ed1, V-b, Hab, V-ing, have, had). Series: Samaná, Ex-Slaves.]
Figure 3 shows that the presence of a temporal adverb in the local
clause structure has little effect on surface morphology in Samaná
English. Marked and bare verbs behave almost identically. The high
frequency of adverbs with have + verb in the Ex-Slave Recordings is
due to the small number of contexts (N=18) in this category.
What happens when the adverbs are subdivided according to type?
Prescriptive English grammar holds that some adverbs are linked to
specific tense/aspect categories. For example, there is a restriction
against the PERFECT with time-position adverbials referring to specific
times, as in (36a). These adverbs, e.g. yesterday, at that time, in 1901,
etc. force the occurrence of the simple past tense, as in (36b). Though
not restricted to explicitly past time, time-frequency adverbials are said
to occur with simple past tense forms which have a habitual semantic
interpretation. Present relevance adverbs, on the other hand, i.e. those
that refer to a period of time that stretches from a point in the past to
speech time, (36b), are reserved for use with the present perfect (Visser
1970).
(36) a. *I have seen him last night.
     b. I have live here twenty-one years. ... I came in the '61.
        (SE/019/82)
(37) *I BIN know you for a long time.
Tables 8 and 9 illustrate the distribution of surface forms by adverb
type.
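A cross-tabulation of adverb type by surface form could be assembled along the following lines. This is only a sketch: the small lexicon below is built from adverbs mentioned as examples in the text, but the assignment of each adverb to a category, the names `ADVERB_TYPES`, `classify_adverb` and `crosstab`, and the toy tokens are assumptions made for illustration and do not reproduce the coding actually used for Tables 8 and 9.

```python
# Hypothetical lexicon: category membership is an assumption for illustration.
ADVERB_TYPES = {
    "time/position": ["yesterday", "at that time", "in 1901", "last night"],
    "present reference": ["lately", "so far", "already", "yet", "up to now"],
    "continuous": ["for a long time", "all our lives"],
}


def classify_adverb(adverb):
    """Return the adverb type for a known adverb, or None if it is unlisted."""
    normalized = adverb.lower().strip()
    for adverb_type, members in ADVERB_TYPES.items():
        if normalized in members:
            return adverb_type
    return None


def crosstab(tokens):
    """Count (surface_form, adverb) tokens as {adverb_type: {form: count}}.

    Tokens whose adverb is not in the lexicon are skipped.
    """
    table = {}
    for form, adverb in tokens:
        adverb_type = classify_adverb(adverb)
        if adverb_type is None:
            continue
        row = table.setdefault(adverb_type, {})
        row[form] = row.get(form, 0) + 1
    return table


# Hypothetical toy tokens for illustration only (not corpus data).
toy_tokens = [
    ("preterite", "yesterday"),
    ("preterite", "last night"),
    ("verb stem", "at that time"),
    ("have", "yet"),
    ("have", "all our lives"),
    ("preterite", "sometimes"),  # not in the lexicon, so it is skipped
]
counts = crosstab(toy_tokens)
```

Row percentages of the kind reported in the tables would then be each cell divided by its row total.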
Table 8
Distribution of adverb types across surface forms in Samaná
English.
Preterite
Verb stem Other have
%
%
%
(N)
(N)
(N)
hzi
Total
%
N
%
(N)
(N)
63
(62)
3
2
(3)
(2)
16
(40)
2
2
(5)
(6)
Time/
17
15
frequency
(17)
(15)
Time/
position
58
(143)
21
'then'
61
(subsequence)
(212)
41
(18)
30
(105)
23
(10)
7
(26)
(0)
(4)
2
(1)
25
(11)
9
27
(11)
41
(17)
41
(17)
(6)
Present
reference
Continuous
(53)
TOTAL
1
99
247
347
44
(4)
41
15
(0)
778
r.) (,-.) 1
0 (3)
Table 9
Distribution of adverb types across surface forms in the Ex-Slave
Recordings.
Preterite
Other
(N)
(N)
have
%
(N)
41
(20)
8
(4)
Verb stem
%
(14)
had
%
49
33
18
frequency
(16)
(9)
Time/
position
69
13
11
11
2
(44)
(8)
(7)
(7)
(1)
32
39
(17)
30
(13)
(0)
(0)
(subsequence)
(14)
(0)
2
50
50
reference
(1)
(0)
(0)
(1)
14
16
3
29
6
(1)
(9)
(2)
(45)
(5)
64
44
Present
Continuous
N
(N)
Time/
'then'
Total
(0)
31
190
TOTAL
'Present reference' adverbs, illustrated in (38a-b), are distributed
across all the surface forms but they are the only type of adverb that
occurs with any degree of frequency in contexts marked by have in
Samaná English. Of all adverbs that occur with have + verb (N=25),
44% are of this type.
(38) a. They knocked that out. Everything now have change.
        (SE/003/827-8)
     b. I'm sorry some of them haven't reach yet that you'd see them.
        (SE/009/346)
Unfortunately the Ex-Slave Recordings contain only two of these
so a similar comparison is impossible. The high percentage of other
adverb types co-occurring with this morphological form in the Ex-Slave
Recordings is due to a large number of continuous adverbials, as in
(39a-c), which are also consistent with the English present perfect.
(39) a. I ain't had no clothes to buy since I been on the project and
        I've been on it, I think, 'bout nine - 'bout eight or nine
        years I believe. (ESR/00Z/98)
     b. Then he died. He been dead forty some odd year. (00Z/75)
     c. We been slaves all our lives. (008/188)
On the other hand been never occurs with time position adverbs.
Recall that in AAVE there is a restriction against the use of stressed
BIN with exactly these adverbs. This means that the 'absolute
restriction' against continuous adverbs in AAVE in contexts such as for
a long time, as in I BIN know you for a long time (Rickford 1977)
does not hold in these data. This can be clearly seen in (39b-c) above
from the Ex-Slave Recordings and in (40a-b) below from Samana
English. In contrast, forms with have rarely occur with adverbs referring
to specific time, e.g. last night.
(40) a. ... been raining a good bit all these days pass. (021/581)
     b. I can't hardly tell you 'cause it been so long. (020/18)
Finally, time-position adverbs in Samaná English are largely restricted to
preterite or verb stem forms: 58% occur with the preterite and 21% with verb
stems. The same is true of the Ex-Slave Recordings, where 69% of all
these adverbs occur with the preterite and 13% occur with verb stems.
5.5.2. Conjunctions
Conjunctions with disambiguating temporal value (Chung and
Timberlake 1985: 209) also have specific collocation restrictions in
English. For example, since actually requires the use of the present
perfect, e.g. He has been finished since last March. Others, such as
when, imply coincidence, while forms such as after can be used with
either simple past tense or past perfect (Quirk et al. 1972: 339).
Accordingly, each context in these data was tabulated for its
occurrence with temporal conjunctions, as illustrated in (41).
(41) a. I've seed covers since I've been big enough.
        (ESR/00W/334)
     b. Oh he was so mean, fractious that-a-way, when he got
        drinking. (ESR/00W/470)
     c. Well then after they had that war, well then all had to go
        home. (SEC/004/401)
Figures 4, 5 and 6 represent how the three main conjunction types,
since, when and after, are distributed across surface forms in Samaná
English, the only data set where there are a sufficient number of
temporal conjunctions to view patterns of co-occurrence. In Figure 4
since occurs with have + past participle and had + past participle,
although more frequently with have + past participle, the form which
most closely approximates the English present perfect. In Figure 5,
when exhibits a propensity to appear with present V-ing forms. After,
illustrated in Figure 6, is said to occur either with the simple past or
the past perfect. Predictably it is found with preterites, verb stems and
had + past participle as well as with habituals (e.g. used to, would etc.).
Figure 4
Percent occurrence of since with each surface form in Samaná
English.
[Chart. Vertical axis: % tot types. Horizontal axis: morph types (V-ed1, V-b, Hab, have, had).]
Figure 5
Percent occurrence of when with each surface form in Samaná
English.
[Chart. Vertical axis: % tot types (0 to 25). Horizontal axis: morph types (V-ed1, V-b, Hab, have, had, V-ing, Ø V-ing, is V-ing).]
Figure 6
Percent occurrence of after with each surface form in Samaná
English.
[Chart. Vertical axis: % tot types (0.0 to 2.5). Horizontal axis: morph types (V-ed1, V-b, Hab, have, had, V-ing, Ø V-ing, is V-ing).]
5.6. Summary
Distributional analyses of co-occurrence patterns have revealed that the
surface forms of past time reference are not differentiated by the relative
'remoteness' of past time except for that of 'continuing' past time. Here
the context is restricted to have + past participle, preterite and verb
stem. This behaviour parallels the English present perfect. Surface
forms exhibit co-occurrence patterns similar to those of English. Time
position adverbs co-occur with preterites and verb stems; time frequency
adverbs co-occur with habitual and progressive forms. There are no
functionally-motivated marked patterns as suggested for creoles in
which morphosyntactically unmarked verbs would occur in contexts of
temporal disambiguation. Present relevance adverbs are the only adverb
type typical of the surface form have which is, once again, consistent
with a PERFECT interpretation for the semantic function of this form.
The distribution of forms with conjunctions is also consistent with
those suggested in English grammars, in which the surface form have +
past participle patterns with since. Note also that the percent occurrence
of preterite and verb stem is the same across all of the conjunctions
considered here, corroborating my earlier observations that these forms
are variants of the same tense.
Although co-occurrence patterns such as these cannot be entirely
conclusive in determining tense/aspect categories (cf. Comrie 1985),
taken in conjunction with the partitioning of forms across semantic
contexts, they provide corroborating evidence for interpreting these
patterns as English-like, not creole-like, while at the same time
confirming the parallelism between the two data sets more generally.
6. Discussion
This article has examined the PERFECT in Samaná English and the Ex-Slave
Recordings through separate analyses of the distribution of forms
by semantic function and co-occurrence patterns. Despite the overall
rarity of this category in the general realm of past time, the most
frequent forms used to mark it, have + past participle and bare past
participles, are not at all marginal in contexts licensed for the present
perfect in English. Co-occurrence patterns with temporal distance,
adverbs and conjunctions also mirror those of the present perfect in
standard English, while differing from those proposed for creoles. These
findings suggest that the form have actually functions as a productive
marker of PERFECT in these data. Bare past participles, with the
exception of seen and done, are probably the result of have deletion
since their occurrence is highly restricted to the same PERFECT
contexts. Other surface forms attested in the literature were also found
to mark PERFECT. Why should this be so?
I briefly outlined the development of the perfect in the history of
English and found that it is perhaps the only tense/aspect category in
English with such variability in forms. At its inception, auxiliary had
and be were productive variants. In Middle English further elaboration
of the verb phrase within the same domain of meaning led to the
development of three-verb structures, have/had done + verb and
'm/is/are done + verb. All these are attested in vernacular (white)
English in the southern United States. As far as the bare past participle
is concerned, forms such as I seen, I done have been traced to at least as
early as the high tide of Irish immigration in the 1840s, the same time
period represented by Samaná English and the Ex-Slave Recordings. In
England they remain common in the West Midlands and the north and
they also resemble Scottish forms. Thus, all the forms discussed here
are found to persist in many contemporary varieties of English around
the world where they are characterized as dialectal, non-standard, subject
to style-shifting and the effects of education (e.g. Francis 1958). It is
not surprising, given the extra-linguistic characteristics of the speakers
in these corpora and the status of these varieties as linguistic enclaves,
that members of an earlier English verb system persist, albeit
marginally.
Is there any support here for the loss of the perfect? If have
deletion is the first stage of this process, there is little evidence of a
general process of change. While earlier studies have not provided actual
figures for the frequency and distribution of have across lexical verbs,
without evidence to the contrary we might assume that all verbs have
an equal propensity to be used for PERFECT reference. But contexts in
which have deletion occurs are restricted to infrequent realizations of
been, done and seen. Infrequent preterite and verb stem forms in
contexts of PERFECT cannot be taken to reflect either creole origins or
ongoing change, since this usage is consistent with the historical record
which documents extensive variation between preterite and present
perfect tenses in earlier stages of English (see Section 3).
What of the forms that could not be subsumed under a have-deletion
hypothesis? First, the creole-like structure been + verb did not
even occur. Thus, of all surface forms found in these data, only those in
(3), namely done + verb, could not be interpreted as the deletion of a
(standard) English past participle. Although these contexts are not
structurally parallel to the contemporary standard English perfect, if we
take the three verb cluster into account then these forms may simply be
deletion of the same perfect auxiliary, but from a three place verb
phrase, rather than the contemporary auxiliary + main verb structure.
Thus, the have deletion hypothesis can be maintained.
The similarities between Samana English, the Ex-Slave Recordings
and other varieties of English and their lack of similarity with creoles
can hardly be coincidental. Although English in the United States and
the Caribbean could arguably have been influenced by creoles (but cf.
Mufwene to appear-a; to appear-b for an alternative analysis) varieties
such as those found in Newfoundland and Tristan da Cunha were not.
Thus, the origins of these perfect forms and their functions must
necessarily be traced to the original source in Britain. The rare PERFECT
variants are remnants from an earlier stage in the development of the
present and past perfect tenses in the history of the English language.
Little, if anything, is known about the linguistic and extra-linguistic
conditioning of variability in this area of the grammar. While the
findings reported here now provide the basis for such analyses
(Tagliamonte and Poplack 1995), it seems clear that the grammar of
early Black English, insofar as it is instantiated by Samaná English and
the Ex-Slave Recordings, was PERFECT just the way it was.
REFERENCES
Agheyisi, R. N. (1971) West African Pidgin English: Simplification and
simplicity. Ph.D. Dissertation, Stanford University.
Alleyne, M. C. (1980) Comparative Afro-American: An historical-comparative study of English-based Afro-American dialects of the
New World. Ann Arbor: Karoma.
Atwood, E. B. (1953) A survey of verb forms in the Eastern United States.
Ann Arbor: University of Michigan Press.
Bailey, G., Maynor, N. and Cukor-Avila, P. (1991) The emergence of Black
English: Texts and commentary. Amsterdam: John Benjamins.
Barber, C. (1964) Linguistic change in present-day English.
Edinburgh/London: Oliver and Boyd.
Baugh, J. (1980) A reexamination of the Black English copula. In W. Labov
(ed.) Locating language in time and space. New York: Academic
Press. 83-106.
Bickerton, D. (1975) Dynamics of a creole system. New York: Cambridge
University Press.
Bickerton, D. (1979) The status of bin in the Atlantic creoles. In I. Hancock
(ed.) Readings in Creole Studies. Ghent: E. Story-Scientia. 309-314.
Brunner, K. (1963) An outline of Middle English grammar. Oxford: Basil
Blackwell.
Butters, R. (1989) The death of Black English: Divergence and convergence
in black and white Vernaculars. 25. Frankfurt: Peter Lang.
Cheshire, J. (1982) Variation in an English dialect: A sociolinguistic study.
Cambridge: Cambridge University Press.
Christian, D., Wolfram, W. and Dube, N. (1988) Variation and change in
geographically isolated communities: Appalachian English and
Ozark English. Vol. 74. Tuscaloosa, Alabama: American Dialect
Society.
Chung, S. and Timberlake, A. (1985) Tense, aspect and mood. In T. Shopen
(ed.) Grammatical Categories and the Lexicon. Cambridge:
Cambridge University Press. 202-258.
Comrie, B. (1985) Tense. Cambridge: Cambridge University Press.
Curme, G. O. (1977) A grammar of the English language. Essex,
Connecticut: Verbatim.
D'Eloia, S. G. (1973) Issues in the analysis of Nonstandard Negro English:
A review of J. L. Dillard's Black English: Its history and usage in the
United States. Journal of English Linguistics. 7.87-106.
Dahl, Ö. (1984) Temporal distance: Remoteness distinctions in tense-aspect
systems. In B. Butterworth, B. Comrie and Ö. Dahl (eds.)
Temporal Distance: Remoteness Distinctions in Tense-aspect
Systems. Berlin: Mouton. 105-22.
DeBose, C. and Faraclas, N. (1994) An Africanist Approach to the
Linguistic Study of Black English: Getting to the Roots of the
Tense-Aspect-Modality and Copula Systems in Afro-American. In S.
Mufwene (ed.) Africanisms in African American Language Varieties.
Athens: University of Georgia Press. 364-87.
DeBose, C. E. (1994) A note on ain't vs. didn't negation in African
American Vernacular. Journal of Pidgin and Creole Languages. 9.127-30.
Dillard, J. L. (1972a) Black English: Its history and usage in the United
States. New York: Random House.
Dillard, J. L. (1972b) On the beginnings of Black English in the new world.
Orbis. 21.523-536.
Edwards, V. and Weltens, B. (1985) Focus on: England and Wales. In W.
Viereck (ed.) Varieties of English around the world.
Amsterdam/Philadelphia: John Benjamins. 97-139.
Faraclas, N. G. (1987) Creolization and the tense-aspect-modality system of
Nigerian Pidgin. Journal of African Languages and Linguistics. 9.45-59.
Fasold, R. (1971) Minding your Z's and D's: Distinguishing syntactic and
phonological variable rules. In Papers from the Seventh Regional
Meeting, Chicago Linguistic Society. Chicago: Department of
Linguistics. University of Chicago. 360-367.
Fasold, R. (1972) Tense marking in Black English: A linguistic and social
analysis. Washington, D.C.: Center for Applied Linguistics.
Fasold, R. and Wolfram, W. (1975) Some linguistic features of Negro
dialect. In P. Stoller (ed.) Black American English: Its Background
and its Usage in the Schools and in the Literature. New York: Dell
Publishing Co., Inc. 49-83.
Feagin, C. (1979) Variation and change in Alabama English: A
sociolinguistic study of the White community. Washington, D.C.:
Georgetown University Press.
Fickett, J. G. (1972) Tense and aspect in Black English. Journal of English
Linguistics. 6.17-19.
Francis, W. N. (1958) The structure of American English. New York: Ronald
Press Co.
Frank, M. (1972) Modern English: A practical reference guide. Englewood
Cliffs, N.J.: Prentice-Hall.
Fries, C. C. (1940) American English grammar. New York: Appleton,
Century, Crofts.
Guy, G. (1980) Variation in the group and the individual: The case of final
stop deletion. In W. Labov (ed.) Locating language in time and
space. New York: Academic Press. 1-36.
Herndobler, R. and Sledd, A. (1976) Black English: Notes on the
auxiliary. American Speech. 51.185-200.
Howe, D. (1994) Patterns in the use of ain't in North Preston Vernacular
English. Manuscript. University of Ottawa.
Huddleston, R. (1984) Introduction to the grammar of English. Cambridge:
Cambridge University Press.
Hughes, A. and Trudgill, P. (1979) English accents and dialects: An
introduction to social and regional varieties of British English.
London: Edward Arnold.
Ihalainen, O. (1976) Periphrastic 'do' in affirmative sentences in the dialect
of East Somerset. Neuphilologische Mitteilungen. 77.608-622.
Jespersen, O. H. (1964) Essentials of English grammar. University of
Alabama: University of Alabama Press.
Krapp, G. P. (1925) The English language in America. New York: The
Century Company.
Labov, W. (1969) Contraction, deletion, and inherent variability of the
English copula. Language. 45.715-762.
Labov, W. (1972a) Language in the inner city. Philadelphia: University of
Pennsylvania Press.
Labov, W. (1972b) Sociolinguistic patterns. Philadelphia: University of
Pennsylvania Press.
Labov, W., Cohen, P., Robins, C. and Lewis, J. (1968) A study of the non-standard English of Negro and Puerto Rican Speakers in New York
City. Comparative Research Report. U.S. Regional Survey,
Philadelphia.
Leech, G. N. (1971) Meaning and the English verb. London: Longman
Group Ltd.
Loflin, M. (1970) On the structure of the verb in a dialect of American Negro
English. Linguistics. 59.14-28.
Marckwardt, A. H. (1958) American English. New York: Oxford University
Press.
McDavid, R. I. and McDavid, V. (1960) Grammatical differences in the
north central states. American Speech. 35.5-19.
Mencken, H. L. (1971) The American language. New York: Alfred A. Knopf.
Menner, R. J. (1926) Verbs of the vulgate in their historical relations.
American Speech. 1.230-240.
Montgomery, M. B. and Bailey, G. (1986) Introduction. In M. B.
Montgomery and G. Bailey (eds.) Language Variety in the South:
Perspectives in Black and White. University: University of Alabama
Press. 1-29.
Mufwene, S. S. M. (1983a) Observations on time reference in Jamaican and
Guyanese creoles. English Worldwide. 4.199-229.
Mufwene, S. S. M. (1983b) Some observations on the verb in Black
English Vernacular. African and Afro-American Studies and Research
Center Papers. 5.1-46.
Mufwene, S. S. M. (1988) English pidgins: Form and function. World
Englishes. 7.255-267.
Mufwene, S. S. M. (to appear-a) African-American English. In J. Algeo (ed.)
The Cambridge History of the English Language.
Mufwene, S. S. M. (to appear-b) The founder principle in creole genesis.
Diachronica.
Neu, H. (1981) Ranking of constraints on /t,d/ deletion in American
English: A statistical analysis. In W. Labov (ed.) Locating language
in time and space. New York: Academic Press. 37-54.
Noseworthy, R. G. (1972) Verb usage in Grand Bank. Regional Language
Studies Newfoundland. 4.19-24.
Orkin, M. M. (1971) Speaking Canadian English. London: Routledge and
Kegan Paul.
Pfaff, C. W. (1971) Historical and structural aspects of sociolinguistic
variation: The copula in Black English. 37. Inglewood, California:
Southwest Regional Laboratory Technical Report.
Poplack, S. and Sankoff, D. (1987) The Philadelphia story in the Spanish
Caribbean. American Speech. 62.291-314.
Poplack, S. and Tagliamonte, S. (1989) There's no tense like the present:
Verbal -s inflection in Early Black English. Language Variation and
Change. 1.47-84.
Poplack, S. and Tagliamonte, S. (1991) African American English in the
diaspora: The case of old-line Nova Scotians. Language Variation
and Change. 3.301-339.
Poplack, S. and Tagliamonte, S. (1994) -S or nothing: Marking the plural
in the African American diaspora. American Speech. 69.227-259.
Quirk, R. and Greenbaum, S. (1972) A university grammar of English.
London: Longman.
Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. (1972) A grammar of
contemporary English. New York: Harcourt Brace Jovanovich, Inc.
Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. (1985) A
comprehensive grammar of the English language. New York:
Longman.
Rickford, J. (1975) Carrying the new wave into syntax: The case of Black
English bin. In R. Fasold and R. Shuy (eds.) Analyzing Variation in
Language. Washington, D.C.: Georgetown University Press.
Rickford, J. (1977) The question of prior creolization of Black English. In
A. Valdman (ed.) Pidgin and Creole Linguistics. Bloomington:
Indiana University Press. 190-221.
Rickford, J. (1990) Grammatical variation and divergence in Vernacular
Black English. Paper presented at The 1989 ICHL workshop on
internal and external factors in syntactic change.
Schneider, E. W. (1989) American Earlier Black English, morphological
and syntactic variables. Tuscaloosa, Alabama: The University of
Alabama Press.
Schneider, E. W. (1993) Africanisms in Afro-American English Grammar. In
S. Mufwene (ed.) Africanisms in Afro-American language varieties.
Athens: University of Georgia Press. 209-221.
Scur, G. S. (1974) On the typology of some peculiarities of the perfect in
the English of Tristan da Cunha. Orbis. 23.392-396.
Singler, J. V. (ed.) (1990) Pidgin and Creole tense-mood-aspect systems.
Amsterdam/Philadelphia: John Benjamins.
Stewart, W. A. (1965) Urban Negro speech: Sociolinguistic factors
affecting English teaching. In R. W. Shuy (ed.) Social Dialects and
Language Learning. Champaign, Illinois: The National Council of
Teachers of English. 10-18.
Stewart, W. A. (1970) Historical and structural bases for the recognition of
Negro dialect. In J. E. Alatis (ed.) Report of the 20th Annual Round
Table Meeting on Linguistics and Language Studies. Washington,
D.C.: Georgetown University Press. 239-247.
Strang, B. M. H. (1970) A history of English. London: Methuen.
Tagliamonte, S. (1991) A matter of time: Past temporal reference verbal
structures in Samaná English and the Ex-slave Recordings. Ph.D.
Dissertation, University of Ottawa.
Tagliamonte, S. and Poplack, S. (1988) How Black English past got to the
present: Evidence from Samaná. Language in Society. 17.513-533.
Tagliamonte, S. and Poplack, S. (1993) The zero-marked verb: Testing the
creole hypothesis. Journal of Pidgin and Creole Languages. 8.171-206.
Tagliamonte, S. and Poplack, S. (1995) The synchrony of obsolescence
tracking the PERFECT in African Nova Scotian English. American
Dialect Society Meeting. Chicago, Illinois. December 1995.
Tagliamonte, S. and Poplack, S. (in progress) Habits in variation: The
intersection of [+past]/[-punctual] in African American English in
the Diaspora and its contact vernaculars.
Tagliamonte, S., Poplack, S. and Eze, E. (1996) Pluralization patterns in
Nigerian Pidgin English. Journal of Pidgin and Creole Languages.
11(2).
Traugott, E. C. (1972) A history of English syntax. A transformational
approach to the history of English sentence structures. New York:
Holt, Rinehart and Winston.
Turner, G. W. (1966) The English language in Australia and New Zealand.
London: Longmans.
Vanneck, G. (1955) The colloquial preterite in Modern American English.
Word. 14.237-42.
Vaughn-Cooke, F. (1987) Are Black and White vernaculars diverging?
American Speech. 62.12-32.
Visser, F. T. (1970) An historical syntax of the English language. Leiden:
E. J. Brill.
Wakelin, M. F. (1977) English dialects: An introduction. London: Athlone.
Williams, J. M. (1975) Origins of the English language: A social and
linguistic history. New York: Macmillan.
Winford, D. (1985) The concept of 'diglossia' in Caribbean creole
situations. Language in Society. 14.345-356.
Winford, D. (1991) Back to the past: The BEV/creole connection revisited.
Paper presented at Georgetown University.
Winford, D. (1993) Variability in the use of perfect have in Trinidadian
English: A problem of categorial and semantic mismatch. Language
Variation and Change. 5.141-188.
Wolfram, W. (1969) A sociolinguistic description of Detroit Negro speech.
Washington, D.C.: Center for Applied Linguistics.
Wolfram, W. (1974) The relationship of White Southern speech to
Vernacular Black English. Language. 50.498-527.
Wolfram, W. and Fasold, R. (1974) The study of social dialects in American
English. Englewood Cliffs, New Jersey: Prentice-Hall.
Wright, J. (1905) The English dialect grammar. Oxford: Clarendon Press.
VOICE SOURCE CHARACTERISTICS OF MALE AND
FEMALE SPEAKERS OF FRENCH.*
Rosalind A. M. Temple
University of York
1. Introduction
'Breathy Voice' is a phonation-type label used in phonology, in
experimental phonetics and in speech pathology. 'Breathiness' is also a
quality sometimes associated with females and with onsets and offsets
of voiceless consonants. It is far from clear, however, exactly what
the acoustic characteristics of breathy voice are, or whether all the uses
of the terms can properly be said to refer to the same phenomenon.
My purpose in the present article is to give a detailed account of
part of an investigation into the realisation of the voicing contrast in
plosive consonants produced by young French adults (Temple 1988a,
b), which raised several questions which it was not possible to answer
within the scope of that study, and to review the questions which arose
at that time, in the light of subsequent literature.
2. Background to 1988 Study
2.1 The nature of 'breathiness'.
One physiological correlate of breathy voice quality is a setting in which
the vocal folds are held in the position for voiceless consonants, but the
airflow rate is higher than normal and the folds vibrate loosely, 'so they
appear to be simply flapping in the airstream' (Ladefoged 1982: 128),
producing the breathy-voiced sound [ɦ]. This occurs during the
pronunciation of English intervocalic /h/, as in ahead. Another, more
deliberate strategy
is used in languages such as Gujarati, where there are phonemically
contrastive breathy vowels, during which the vocal folds are held closely
enough together at the front for voicing to occur, but apart at the back
so that a large volume of air passes out through the glottis producing
turbulence.
York Papers in Linguistics 17 (1996) 397-440
© Rosalind A. M. Temple
Bickley (1982) examined the vowels of Gujarati and !Xhóõ to determine
acoustically and perceptually robust cues to the breathy-voice :
modal-voice contrast. From the physiological description given in the
previous paragraph one would expect an important cue to be the presence of
high-amplitude inter-harmonic noise¹, and this is indeed found in the
spectra of breathy sounds. However, following Ladefoged (1981) and other
studies of Gujarati, Bickley wanted to investigate a cue at the other end
of the spectrum, that of the relative amplitudes of the fundamental and
the first harmonic above it². She reanalysed Ladefoged's recordings of
!Xhóõ and compared them with her own recordings of four native
speakers of Gujarati. The measurements of the amplitude of the first
two harmonics for the !Xhóõ speakers and one Gujarati speaker (op.
cit.: 73-74) are reproduced as Tables 1 and 2 below. The figures show
clearly that the fundamental (henceforth 'F0') is consistently higher in
amplitude than the first harmonic above it (henceforth 'H2') in breathy
vowels and not in clear vowels. To test the perceptual relevance of the
cue, informal judgements were elicited from a native English speaker
and a native Gujarati speaker, both trained in phonetics. The average
amplitude differences for vowels judged to be in four categories of
breathiness were as follows (the Gujarati speaker's judgements are given
first): 'Very breathy' - 12.5dB, 10dB; 'Breathy' - 8.3dB, 11dB; 'Slightly
breathy' - 6.7dB, 5.3dB; 'Not breathy' - 0dB, 0dB. Bickley synthesised
/a/, /i/ and /u/ vowels with independent manipulation of the amplitude
of the fundamental and the amount of aspiration noise, and the vowels
were played to four Gujarati speakers. She found no correlation between
the noise level and the degree of breathy percept, but the vowels with
the highest amplitude F0 were consistently identified as breathy. Given
the greater amount of noise passing through the glottis in breathy, as
opposed to modal, phonation, it is surprising that the noise level did
not have a greater effect on the breathy percept, but this may be because
of problems with synthesis.
1 Noise is the acoustic consequence of the turbulent airflow which would
here be escaping between the parts of the vocal folds which are not fully
adducted.
2 The relative strength of the fundamental is known to increase as open
quotient (the proportion of the vibratory cycle during which the vocal folds
are open) increases. Increased open quotient is a known articulatory
correlate of breathy voice quality.
Difference (in dB)
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Clear
Breathy
13
-4
0
-3
-3
2
4
-4
-9
-8
11
0
9
15
10
-2
-2
2
5
5
Table 1. Difference between amplitudes of first and second harmonics for
breathy and clear vowels in !Xhóõ. (After Bickley 1982: 73)
                       Amplitudes in dB
Word      First harmonic   Second harmonic   Difference
Breathy
bar             44               42               2
maro            46               42               4
wali            47               43               4
Clear
bar             42               44              -2
maro            43               43               0
wali            38               44              -6
Table 2. Relative amplitudes (in dB) of first and second harmonics for
breathy (top) and clear (bottom) vowels in Gujarati. (After Bickley
1982: 74)
"Breathiness" has also been much studied in a clinical context,
sometimes being explicitly compared to the quality which is given the
same label in other contexts. Hammarberg quotes a famous line of
Ladefoged's: '... what is a pathological voice in one language may be
phonologically contrastive in another.' (Ladefoged 1983) and extends it
to: 'What is evaluated as an abnormal voice quality in one language or
dialect community may be a socially acceptable voice quality in
another.' (Hammarberg, op. cit. 27) A particular spectral shape which is
entirely attributable to physiological problems could thus be interpreted
by speakers to convey a sociolinguistic message. Laver (1980) has
exemplified how modes of phonation can be 'signals of emotional
status' (Hammarberg, op. cit. 27) and Hammarberg's example is
particularly pertinent to the present study, as we shall see in 2.2 below:
'For instance, breathiness is said to be a common female
vocal attribute in many social communities, whereas
creakiness often is a male characteristic.' (ibid. 27)
Hammarberg (1986) brings together a series of studies where
pathological voices were judged by pathologists and phoniatrists against
a series of voice quality parameters. The voices judged as breathy were
all from patients with unilateral vocal-fold paralysis³. Acoustic analyses
were made using long-term average spectra, and the typical long-term
spectral characteristics of these voices were the high level of the
fundamental, a low spectral level in the F1 region (400 to 600 Hz) and a
high level of amplitude in the highest frequency band (5 to 10 kHz).
2.2 Female-male voice source differences
2.2.1 Acoustic evidence
The vocal folds of mature males are on average fifty per cent longer than
those of females, and are thicker and greater in mass (Ohala, 1983). One
natural result of this is that male fundamental frequency (F0) is lower
than that of females⁴. As well as causing the perceived pitch of
male and female voices to be different, this difference in F0 means that
the harmonics of female voices are more widely spaced and interact in a
different way with vocal tract resonances⁵. Moreover, the shape of the
female source waveform is more symmetrical than for males, and this is
reflected in the amplitudes of equivalent harmonics, which decline more
steeply in the case of the females. Monsen and Engebretson (1977) asked
subjects to phonate into a long, reflectionless metal tube, which
significantly reduced the resonances of the vocal tract and enabled them
to analyse the glottal waveform. The waveform shape was found to be much
more symmetrical for females than for males, with the opening and closing
phases occupying almost equal proportions of the period. The male
waveform had a characteristic 'hump' in the opening phase with the
closing phase taking only twenty to forty per cent of the total period.
These differences are reflected in the spectra with the slope in dB per
octave between the harmonics being much steeper in the female glottal
wave. The characteristics are not entirely surprising when one considers
the physiology of the vocal folds: because of their greater mass, the
males' vocal folds are drawn together faster than the females' by the
Bernoulli effect, giving a sharper closure onset. Their larger size also
results in the upper and lower parts being somewhat out of phase,
which would create an effectively longer closure period. The waveform
produced would thus be irregular in shape with enhanced harmonics
above the fundamental. The female vocal folds, on the other hand, are
drawn together less sharply, but with a smoother motion, and acting
more as a single mass, which would produce a smoother, more
sinusoidal waveform with the fundamental much stronger than the rest
of the harmonics. Monsen and Engebretson's harmonic-by-harmonic
comparison of glottal spectra in normal phonation (cf. Figure 1a) reveals
this difference in slope, but when the spectra are plotted un-normalised
on the same frequency and amplitude axes, i.e. with the female signal
about an octave higher in F0 than the male signal and with an overall
intensity level 4 to 6 dB lower, the actual spectral envelopes are seen
to be almost identical (cf. Figure 1b). There thus appears to be some sort
of built-in normalisation factor for this particular spectral effect.
3 Unilateral paralysis, and other deformations of the vocal folds, such as
nodules, can impede complete closure during phonation, producing the same
effect as in the normal speakers' production of breathy voice described
above.
4 Average values given by Fant (1956: 11, cited in Laver, 1983: 15) are 120
Hz for males and 220 Hz for females.
5 The vocal tract resonances themselves are also different.
[Figure 1: two panels, (a) with harmonic number and (b) with frequency
(Hz) on the horizontal axis.]
Figure 1. Average glottal spectra for male versus female normal voice
phonation: (a) spectra normalised for both frequency and intensity; (b)
non-normalised spectra. Male subjects, solid squares; female subjects,
solid circles. (From Monsen and Engebretson 1977: 987)
It is interesting to note that when Bickley subjected steady-state vowels
to inverse filtering to remove the effects of sound radiation and vocal
tract filtering from the signal, her observations of the glottal waveforms
produced in breathy and modal vowels corresponded closely to those
observed by Monsen and Engebretson for female and male glottal
waveforms respectively: 'The glottal waveforms of the clear vowels
exhibited slower opening than closing phases, abrupt closure, and a
closed phase that occupied approximately one third of the period of
vibration.... The glottal waveforms for breathy vowels exhibited similar
opening and closing phases, resulting in a more symmetrical shape.
Closure was less abrupt and the closed interval was shorter.' (Bickley
1982: 76-77)
Other studies by those concerned with the synthesis of female-sounding
speech confirm Monsen and Engebretson's findings concerning
source differences. Klatt (1986) analysed the speech of a single female
speaker with a 'pleasant voice quality'. He found considerable random
breathiness noise above 2kHz over parts of many utterances and a
variable degree of general tilt of the spectrum (i.e. over a larger
frequency range than the F0-H2 measure) and of the strength of the
fundamental. He attributes this variation to the presumed degree to
which the larynx is spread or constricted.
2.2.2 Perceptual evidence
Barry (1986) reviewed some of the literature on male-female voice
source differences and also concluded that they had much to do with
physiology. His own study sought to make good synthetic copies of a
male and female voice, to derive from these a set of tables that would
reproduce the voice quality (using a rule-synthesis algorithm on the
parallel-formant synthesiser developed by Holmes), and then to establish
transformations which could be applied to one set of tables to derive the
other. The acoustic features modified were F0, formant frequencies,
spectral tilt and noise. In manipulating spectral tilt, Barry found that the
best match was obtained by reducing the amplitude of the second
formant (A2) by 6dB relative to the male A2, and of the third and fourth
formants by 8dB. The male voice was generated without aperiodicity in
the source signal (although there had been some present in the human
subject) and this did not seem to make it sound unnatural. A 'good
match' female voice included 25% noise. A discrimination test was
carried out where listeners were played pairs of utterances and asked to
select the one which sounded more like an adult female. The utterances
most consistently judged as female were those where the formant
frequencies and amplitudes and the spectral noise level of the original
'male' synthetic voice had been modified. It proved impossible to
adjudicate between the relative importance of formant amplitude (and
hence spectral tilt) adjustments and the degree of spectral noise. Thus,
Barry's perceptual findings confirmed the importance of the production
phenomena discussed in 2.2.1 above in the perception of a voice as
"female".
2.2.3 Sociolinguistic claims
It would seem from the evidence just reviewed that the common claim
that breathiness is a female attribute is predictable on the grounds that
the physiology of female vocal folds gives rise to acoustic structures
which are known to cue both a breathy and a "female" percept.
However, the variability in degree of tilt found by Klatt suggests that
although physiology (a constant for a given speaker) plays a significant
role, voice source characteristics can be varied by manipulating the
larynx constriction.⁶ It is known from investigations into other
acoustic phenomena that physiologically-predictable characteristics can
be endowed with sociolinguistic significance by speakers and
exaggerated or compensated for. For example, Mattingly (1966) tested
the hypothesis that formant frequency differences between speakers of
the same dialect were chiefly due to variations in the vocal tract size of
the speakers, using data from Peterson and Barney's seminal 1952 study
of vowels.⁷ If the hypothesis were correct, Mattingly argued, there
should be high correlation scores between the distributions of values for
F1, F2 and F3 for the three classes of speaker (men, women and
children). What he found in all but a few subsets of the data was that the
correlation scores were in fact very low, and that the separation between
male and female distributions of formants for some vowels was far
sharper than could be explained by vocal tract size variation. He
concluded:
'... the difference between male and female formant values,
though doubtless related to typical male and female vocal
tract size, is probably a linguistic convention.'
Further evidence for the linguistic conventionalisation of cues to
speaker identity which originated as physiological differences comes
from work on children's speech before the development at puberty of
physical vocal tract differences, since at the earlier stages there would be
no physiological reason to account for sex-specific differences. Sachs
(1975), for example, played children's productions of /a/, /i/ and /u/
vowels to a panel of listeners, and asked them to identify the sex of the
speaker. She obtained a statistically significant correct response rate of
66%, which suggests that the children (who were aged between 4 and
12) were beginning to produce sex-specific formant patterns despite the
fact that the boys' and girls' vocal tracts would still be similar in size.
6 If this were not possible, it would not be possible for female speakers of
Gujarati and other languages, where breathy voice is used distinctively, to
make the necessary distinctions. We shall return to this issue below.
7 Peterson, G. E. & H. L. Barney (1952) Control methods used in a study of
the vowels. Journal of the Acoustical Society of America 24: 175-184
Vowel     Females     Males     Difference (F-M)
/a/         8.4        0.98          7.42
/a/         6.4        0.77          5.63
/ʌ/         6.2        0.16          6.04
/ɒ/         3.3        0.39          2.91
Table 3. Average differences in amplitude in dB between the first and
second harmonics in male and female speakers of Received
Pronunciation. (After Henton and Bladon 1985: 224)
Henton and Bladon (1985) did not consider the physiological basis of
source spectrum differences corresponding to breathiness, but they did
examine the male-female differences as a sociolinguistically determined
sex-specific marker. They followed Bickley (1982)⁸ and measured the
amplitude of F0 and H2 in the steady-state portions of open vowels
produced by male and female RP and 'Modified Northern' speakers. Their
results for the RP speakers are reproduced in Table 3. The male-female
differences were significant according to a t-test (p<0.01) and the
difference across all the vowels (mean of means) was 5.5dB. As Henton
and Bladon point out (op. cit. 225), the differences 'would be sufficient
to carry the perceptual contrast between breathy and modal vowels' for
Bickley's Gujarati speakers; however, when their measurements are
compared with the values of the synthetic vowels played in Bickley's
perceptual experiment, it would appear that only /a/ would be considered
as more than 'slightly breathy' by either of Bickley's phoneticians
(compare Table 3 with the values given on p.2 above).
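The arithmetic behind Table 3 can be checked with a short sketch. It is an illustration only: the figures are those reported after Henton and Bladon, the rows are taken in table order, and the variable names are hypothetical.

```python
# Female and male F0-H2 amplitude differences (dB) from Table 3, in row order.
# The list name and layout are assumptions made for this illustration.
table3_rows = [
    (8.4, 0.98),
    (6.4, 0.77),
    (6.2, 0.16),
    (3.3, 0.39),
]

# Female-minus-male difference for each vowel, as in the last column.
differences = [round(female - male, 2) for female, male in table3_rows]

# "Mean of means": the average of the per-vowel differences.
mean_difference = round(sum(differences) / len(differences), 2)

print(differences)
print(mean_difference)
```

The per-vowel differences reproduce the final column of the table, and their average is the 5.5dB quoted in the text.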
Interestingly, when Watson (1987) asked colleagues to listen to his
child subjects' voices, they did not perceive them as breathy until the
possibility was pointed out to them:
'It may be that we accept as normal in children what
would be 'breathy' in adults, until we are specifically
called on as phoneticians to attend to phonation type.'
(ibid. 21)
8 It should perhaps be noted that speaker sex was not specified by Bickley,
but it is assumed, because of the consistency of her results, that her speakers
were all male.
The comment could easily be applied mutatis mutandis to sex-specific
differences in breathiness: might it not be the case that breathiness is a
comparative measure to be assessed against the cultural norm for modal
voice, and therefore cannot be measured in universal terms?
Alternatively, it could be that although we are dealing with measures
along the same acoustic continuum, it is unjustified to speak of what is
being labelled as breathiness as being classifiable as exactly the same
phenomenon in both the case of females (and children) and that of a
linguistic phonation type. If there were no difference, Gujarati women
would have great difficulty in producing phonologically breathy sounds
which were sufficiently different from sounds phonated with their modal
voice.
Henton and Bladon would presumably not consider these questions
to be problematic, as they see the spectral tilt⁹ characteristics as being
produced deliberately by the British female speakers, rather than as being
a result of physiology, and would presumably hypothesise that female
modal voice would not have the same culturally determined properties in
Gujarati. On the premise that breathy voice is used to convey intimacy
in English (Laver 1980: 135) they suggest that the RP speakers are
trying to sound 'sexy' [sic]:
'At an ethological level, breathy voice may be seen as part
of the courtship display ritual, as important as bodily
adornment and gesture. A breathy woman can be regarded
as using her paralinguistic tools to maximise the chances
of her achieving her goals, linguistic or otherwise.' (op.
cit. 226).
9 Hitherto the term 'tilt' has been used in its generally accepted designation
of the rate of decrease in amplitude across the whole source spectrum; I shall
also be using the term in this article to refer to the difference in amplitude
between F0 and H2. I make no claims as to the equivalence of these two
measures, using the term simply to refer to this amplitude difference.
The claim that the female RP voice has the distinctive spectral
characteristics described solely with the paralinguistic aim of aiding the
speaker to attract a mate seems rather exaggerated, especially in the light
of the other papers discussed above which hold that the female source
spectrum would tend towards the 'breathiness' pattern anyway for
physiological reasons. However, this does not rule out the role of other
sociolinguistic forces which could cause female speakers to move nearer
to or further away from the physiologically determined female 'norm',
which is the implication of the findings of Mattingly, cited above. It
should also be pointed out, of course, that males may well be
modifying their voice quality for similar reasons.
2.3 Breathiness and the Voicing Contrast
As is well known, French, like English, has a two-way 'voicing
contrast' between cognate pairs of obstruents, but as far as plosives are
concerned, the labels 'Voiced' and 'Voiceless'¹⁰ correspond to different
phonetic patterns of realisation in the two languages, most obviously in
the timing of vocal-fold vibration relative to the release of the
consonant when in absolute initial position. The Voiced plosives of
French are canonically voiced throughout the closure and release period,
usually with no break (though see Temple 1988a, b); Voiceless
plosives have no prevoicing and little or no aspiration. English Voiced
stops are phonetically voiceless unaspirated, while the Voiceless ones
are voiceless and with longer aspiration following release. In addition to
the timing of voicing relative to the release of the consonant, there are
many other phonetic correlates to the voicing contrast in French and
English plosives which are well-documented elsewhere and which it is
not necessary to review here (see Temple 1988a for references). One
correlate which has been less thoroughly documented, although it is
taken to be a well-known fact about at least English plosives, is that
Voiceless plosives tend to have breathy voice at vowel onset, due to the
vocal folds' beginning to vibrate before being fully adducted for the
vowel. Ní Chasaide and Gobl (1988) reported an analogous process
during the pre-aspiration of plosives in Swedish. Laryngographic traces
showed vibration of the vocal folds as they opened for the Voiceless
plosive, and this was accompanied by an increase in spectral tilt.
However, they also found that the onset of voicing in post-consonantal
vowels was much less 'clean' than the breathy offset of the
pre-consonantal vowel.
10 The labels Voiced and Voiceless, in italics and with initial capital letters
will be used throughout this paper to refer to phonological categories. The
same words in non-italic script, and entirely in lower-case will be used to
refer to the phonetic distinction between stops with prevoicing and those
without. Henceforth no citation marks will be used.
The evidence reviewed thus far shows that F0-H2 differences have
been found to correlate with perceived "breathiness" in languages where
this quality plays a phonological role. The same measure has been
found to differentiate male and female voice sources, and this is to some
extent predictable from male-female physiological differences.
Moreover, it has been suggested that variability in this measure could
have a sociolinguistic value. Temple 1988a and 1988b thus attempted
to draw these strands together by asking whether degree of breathiness,
measured by the F0-H2 difference, was yet another marker of the voicing
contrast in initial position, whether there were differences between male
and female French speakers, and if so, whether there was interaction
between sex-specific and voicing-specific effects.
3. The 1988 Study
3.1 Methodology
Seven speakers were recorded in their study bedrooms at the École
Normale Supérieure in Paris, and two at Oxford University Phonetics
Laboratory (O.U.P.L.), reading lists of monosyllabic words with initial
plosive consonants in isolation and in the frame 'Jean avait dit ___'.
The stimuli were presented individually on cards to minimise listing
effects, and the first element of each list was discounted. The six plosive
phonemes of French occurred several times before each of three vowels,
/i/, /a/ and /u/. Only tokens with the vowel /a/ were measured for this
part of the experiment because it is here that the lower harmonics are
least likely to be affected by the first formant, either in transition or in
steady-state. The data were analysed using the Signal File Manager of
O.U.P.L.'s New England Digital microcomputer (see Clark 1986 for
details). Windows were positioned at the points indicated by the letters
A to E and V in Figure 2, that is, in the relatively steady-state parts of
the pre-voicing and the vowel, over the release itself and over the pitch
periods closest to the release. The two frames which fall into this latter
category were at varying distances in milliseconds from the release: B
covered the last three pitch periods of prevoicing for females and the last
two for males, including cases of Voiced stops which were partially
devoiced (i.e. where voicing ceased before release); and D covered the
first three and first two periods after release in both Voiced and
Voiceless stops, the latter having varying Voice Onset Times. The
frame lengths of 20ms and 16ms for males and females respectively
were chosen after experimenting to find settings which would give the
best resolution of harmonics whilst maintaining comparable lengths in
both time and number of periods. For each frame, frequency in Hz and
amplitude in decibels (dB) of F0 and H2 were noted.¹¹
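The measurement just described can be illustrated with a minimal sketch. It does not reproduce the Signal File Manager analysis, whose algorithm is not described here; instead it assumes a single-bin discrete Fourier transform with a periodic Hann window, and the function names, sample rate and synthetic frame are all hypothetical. The sign convention follows Table 4: the result is negative when the fundamental is stronger than H2.

```python
import math

def harmonic_amplitude_db(samples, sample_rate, freq_hz):
    """Amplitude in dB (re 1.0) of the sinusoidal component at freq_hz.

    Assumed method: single-bin DFT of the Hann-windowed frame.
    """
    n = len(samples)
    real = imag = window_sum = 0.0
    for k, x in enumerate(samples):
        w = 0.5 - 0.5 * math.cos(2 * math.pi * k / n)  # periodic Hann window
        angle = 2 * math.pi * freq_hz * k / sample_rate
        real += w * x * math.cos(angle)
        imag -= w * x * math.sin(angle)
        window_sum += w
    amplitude = 2 * math.hypot(real, imag) / window_sum
    return 20 * math.log10(amplitude)

def h2_minus_f0_db(samples, sample_rate, f0_hz):
    """H2 minus F0 amplitude in dB; negative when the fundamental is stronger."""
    return (harmonic_amplitude_db(samples, sample_rate, 2 * f0_hz)
            - harmonic_amplitude_db(samples, sample_rate, f0_hz))

# Synthetic 20 ms frame (assumed values): F0 at 100 Hz, H2 at half its amplitude.
SAMPLE_RATE = 16000
F0_HZ = 100.0
frame = [
    math.sin(2 * math.pi * F0_HZ * k / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2 * F0_HZ * k / SAMPLE_RATE)
    for k in range(round(0.020 * SAMPLE_RATE))
]
difference_db = h2_minus_f0_db(frame, SAMPLE_RATE, F0_HZ)
print(round(difference_db, 2))
```

Because the synthetic frame holds a whole number of pitch periods, the two harmonics do not leak into one another and the measured difference matches the halving of amplitude, roughly -6 dB. Real frames would rarely be so well behaved.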
[Figure 2: waveform of the utterance, with time in seconds on the
horizontal axis and the window positions marked.]
Figure 2. Positions of start of spectral windows for utterance "bac", by
speaker PIG (male)
11 For more details on the analytical procedure followed, see Temple
1988a: 57-70.
[Figure 3: two schematic spectra, relative amplitude against frequency
(Hz), 100 to 500 Hz.]
Figure 3. Schematic representation of the effects of fundamental
frequency on the relations between harmonics in the spectrum.
A technical problem arises here in the question of how to compute what
we have been referring to as 'spectral tilt'. Both Bickley and Henton and
Bladon calculated the straightforward difference between the amplitude
measurements of the harmonics. Assuming that all Bickley's subjects
were male, it is unimportant whether the measure is computed in this
way or whether a true slope is calculated in amplitude loss per frequency
unit (difference in dB 'over' difference in Hz). However, as soon as
speakers with notably different F0 are to be compared, the choice of
calculation method becomes important, since a higher F0 means a
greater distance in Hz between F0 and H2, which would have a
significant effect on the calculation of the slope. A schematic example
is given in Figure 3 to illustrate this effect. The horizontal axis
represents frequency in Hz, the vertical axis a hypothetical amplitude
range. The solid vertical lines correspond to idealised harmonics for a
male versus a female speaker. The difference in amplitude between F0
and H2 in both pseudo-spectra is 1. However, if the slopes are calculated
in Amplitude/Frequency the results are 1/100 = 0.01 'A'/Hz for
spectrum M, but 1/200 = 0.005 'A'/Hz for spectrum F. As well as
having implications for comparisons across studies, this has
implications for comparisons within a single study wherever speakers
have significantly different fundamental frequencies. Indeed, spectra with
a different amplitude difference could actually have the same slope
gradient: if the difference in 'A' in spectrum M were 10, and in spectrum
F 20, the gradients would be 10/100 = 0.1 'A'/Hz, and 20/200 = 0.1
'A'/Hz respectively. The question of which is the best way of measuring
spectral 'tilt' is evidently potentially important and we shall return to it
below. For the purposes of the experiment being described here it was
decided to compute the measure both in terms of amplitude differences
and in terms of dB/Hz slope.
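The two ways of computing the measure can be set side by side in a short sketch. The function names and the absolute amplitude values are hypothetical; only the differences (1, 10 and 20 'A') and the fundamentals (100 and 200 Hz) come from the schematic example. Note that the sketch reports amplitude loss as a positive number, as in the worked example, whereas Table 4 reports H2 relative to F0 and so uses negative values.

```python
def amplitude_difference(amp_f0, amp_h2):
    """Amplitude of F0 minus amplitude of H2, in the unit of the spectrum."""
    return amp_f0 - amp_h2

def slope_per_hz(amp_f0, amp_h2, f0_hz):
    """Amplitude loss per Hz between F0 and H2.

    H2 lies at twice the fundamental, so the two harmonics are f0_hz apart.
    """
    return amplitude_difference(amp_f0, amp_h2) / (2 * f0_hz - f0_hz)

# Schematic spectra M and F (arbitrary amplitude units 'A').
male_f0_hz, female_f0_hz = 100, 200

# Same amplitude difference (1 'A'), different slopes.
print(slope_per_hz(2, 1, male_f0_hz))     # 0.01
print(slope_per_hz(2, 1, female_f0_hz))   # 0.005

# Different amplitude differences (10 and 20 'A'), same slope.
print(slope_per_hz(10, 0, male_f0_hz))    # 0.1
print(slope_per_hz(20, 0, female_f0_hz))  # 0.1
```

The sketch makes the dependence explicit: for a fixed amplitude difference the slope is inversely proportional to F0, so the two measures can rank a pair of speakers differently.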
Statistical analysis of the measurements was carried out using the
S.A.S.¹² Institute package implemented on the VAX mainframe
computer at Oxford University Computing Service. The data were
subjected to a 'General Linear Models' (G.L.M.) procedure, which
allows Analysis of Variance to be carried out on 'unbalanced' models,
because the numbers of tokens analysable for each speaker were not the
same, principally because of the hazards of making recordings outside
the recording booth.
3.2 Results and discussion.
3.2.1 Waveforms
No procedures were used to derive the source waveform from the vowel
signal, but the waveforms during the closure period of prevoiced stops
did appear consistently different in male and female subjects.
Generally the waveform shapes in the speakers considered here seemed
to be as predicted by Monsen and Engebretson, that is with a
near-sinusoidal appearance for females, but with a 'hump' in the opening
phase and a sharper closing phase for males (compare Figures 2 and 4).
12 Statistical Analysis System.
Figure 4. Waveform in prevoicing of "bac" by speaker ISR
(female).
3.2.2 Relationship between F0 and H2: male versus female speakers
                     Position
Measure  Sex        A          B          D          E          V
dB/Hz    Males     -0.0378    -0.0813    -0.0262    -0.0093    -0.0183
         Females   -0.0491$   -0.104$    -0.0492$   -0.0398$   -0.0396$
dB       Males     -5.026     -6.213     -3.758     -1.346     -0.404
         Females   -15.853$   -18.330$   -10.642$   -8.920$    -9.504$
Table 4. Mean F0-H2 differences for frames positioned at A, B, D, E &
V by male and female speakers expressed in terms of slope (dB/Hz) and
amplitude (dB)
Figure 5a. Mean F0-H2 slope (dB/Hz) across positions of all tokens for
males and females.
Figure 5b. Mean F0-H2 differences of amplitude across positions of all
tokens for males and females.
Mean values for the differences between F0 and H2 at the different
positions in the word are given in Table 4 and Figure 5 in terms both of
the dB/Hz slope and of amplitude comparisons in dB. A negative
number indicates that the value for the fundamental is higher than for
the second harmonic, and a positive number represents a lower value for
F0. Another convention adopted has been to indicate the steeper gradient
slope or greater amplitude difference in a particular two-way comparison
with a superscript dollar sign ($). All the values in the table are larger
in magnitude for females than for males, as predicted from the evidence
discussed hitherto, and the male-female contrast is highly
significant according to a t-test (p<0.001) in all cases except V for the
dB/Hz measure, which fails to reach significance even at the 5% level. It
is clear from Figure 5a that the male and female trends in terms of slope
stay firmly apart but follow much the same pattern with a sharp rise in
steepness at B, that is as the release approaches or the prevoicing is
about to cease. However, this effect is apparently reduced dramatically,
particularly for females, in Figure 5b, where both curves are much
smoother, showing only a slight rise in the dB difference at B. Also
apparent in this Figure is the reflection of how the male-female
difference at V 'becomes' statistically significant when calculated in
terms of amplitude.
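The two ways of expressing the measure can be sketched on a synthetic signal. The example below is an illustration only, not the measurement procedure used in the study: it builds two artificial 'voices' containing just a fundamental and a second harmonic, measures each component with a single-bin discrete Fourier transform, and reports the difference both in dB and as a dB/Hz slope. The function names, the sampling rate and the fundamental frequencies (100 Hz and 200 Hz) are assumptions chosen for convenience. The sign convention follows the text: a negative value means the fundamental is stronger than the second harmonic.

```python
import math


def harmonic_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n


def f0_h2_measures(signal, f0_hz, sample_rate):
    """Return (difference in dB, slope in dB/Hz) between F0 and H2."""
    a1 = harmonic_amplitude(signal, f0_hz, sample_rate)
    a2 = harmonic_amplitude(signal, 2 * f0_hz, sample_rate)
    diff_db = 20 * math.log10(a2 / a1)   # negative when the fundamental is stronger
    slope = diff_db / f0_hz              # the two harmonics lie f0_hz apart
    return diff_db, slope


def synthetic_voice(f0_hz, sample_rate=8000, n=800):
    """Fundamental at amplitude 1.0 plus second harmonic at amplitude 0.5."""
    return [math.sin(2 * math.pi * f0_hz * i / sample_rate)
            + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * i / sample_rate)
            for i in range(n)]


male_db, male_slope = f0_h2_measures(synthetic_voice(100), 100, 8000)
female_db, female_slope = f0_h2_measures(synthetic_voice(200), 200, 8000)
print(f"F0 = 100 Hz: {male_db:.2f} dB, {male_slope:.4f} dB/Hz")
print(f"F0 = 200 Hz: {female_db:.2f} dB, {female_slope:.4f} dB/Hz")
```

Because the two synthetic voices have identical harmonic amplitudes, the dB difference is the same for both, while the dB/Hz slope of the higher-pitched voice is half that of the lower one. This is the sense in which the choice of measure can change the apparent relationship between speakers with different F0.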
These findings are interesting for two particular reasons. Firstly,
the only position where a significant difference was not found is the
only one where measurements were taken in the other experiments
reported, i.e. the relatively steady-state portion of the vowel. Secondly,
they seem to confirm that changing the method of calculating the 'value'
of the harmonic difference does have a significant effect on the apparent
relationships between the sets of production data, which in turn
suggests it could be relevant perceptually. Moreover, the measure which
fails to reach statistical significance in this position is not the one used
in the papers cited above, which begs the question 'how would those
results look when calculated in these terms?'
3.2.3 Possible influence of consonant place of articulation
The steady-state part of the vowel was chosen by the other researchers
referred to in order to avoid the possible effects of the F1 transition from
the preceding or following consonant, which could enhance the
amplitude of F0 or H2 and thereby distort the results. However, because
the focus of this study was on the voicing contrast in consonants, these
transition sections were precisely the parts of the signal in which we
were interested. The only way to counteract the influence of formants
would have been inverse filtering, which it was not possible to carry
out at the time. Instead statistics were used to compare the effects of the
different places of articulation of the consonants on the spectral values.
Of course, the use of statistics cannot be seen as a replacement of
inverse filtering by an equivalent measure, but we can hope that it
would at least make us aware of any significant effect of components
which would have been filtered out by that process. The slope values
obtained for males and females are given in Table 5, and the
amplitude-difference values in Table 6. Values are given for each
position for each phoneme, and accompanying each value, an indication
of those phonemes which are significantly different from the one in
question, at the 5% level (t-test).
Position
Consonant Mean
in
/b/
f
bth
m
/d/
f
bth
in
/g/
f
bth
B
A
-0.04348
-0.09521$
-0.06430
-0.04975
-0.09092$
-0.06606
-0.01971
-0.06282$
-0.03781
Diff
From
Mean
g
-0.05464
-0.12022$
-0.08147
-0.05778
-0.10448$
-0.07637
-0.03248
-0.08860
-0.05479
g
g
g
g
g
bd
bd
bd
Diff
From
g
g
0
b
in
/p/
f
bth
m
Itl
f
bth
m
/k/
f
bth
Table 5(a). Mean slope differences (dB/Hz) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Positions A and B as in Figure 2.
!Position
Consonant
m
/b/
f
bth
/g/
-0.02125
f
f
bth
m
f
bth
m
/k/
-0.01150
gk
-0.04675$
d g k -0.02545
m
f
bth
-0.06695$
-0.03871
-0.02941
-0.04457$
-0.03611
-0.01626
-0.04713$
-0.03056
-0.03092
-0.05593$
-0.04182
Mean
From
d
bth
f
m
/t/
-0.01503
-0.03099$
-0.02134
Diff
bgt
bth
/p/
Mean
-0.04283
-0.05104$
-0.04614
m
Id/
E
D
Diff
From
t
t
bt
-0.01776
-0.03575$
-0.02502
d
b
b
-0.01418
-0.03492$
-0.02210
t
t
b
-0.01638
-0.03897$
-0.02637
d
-0.00897
pbdg
-aocoas
d
-0.01375
b
b
-0.00690
-0.04233
-0.02234
Table 5(b). Mean slope differences (dB/Hz) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Positions D and E as in Figure 2.
Position
V
J
Consonant Mean
/b/
m
+0.01744
bth
-0.04594$
-0.00763
f
m
/d/
f
bth
m
/g/
f
bth
/p/
/t/
f
-0.07412$
-0.03895
bth
-0.05857
in
-0.00814
-0.04275$
-0.01544
-0.01151
bth
m
lid
f
bth
p
-0.00461
-0.03259$
-0.01591
-0.05037$
-0.02796
-0.04181
m
f
I
Diff
From
-0.04672
b
g
-0.02685
Table 5(c). Mean slope differences (dB/Hz) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Position V as in Figure 2.
Position
A
Consonant Mean
B
Diff
Mean
From
m
/b/
f
bth
m
/d/
f
bth
m
/g/
f
bth
-5.546
-17.160$
-10.218
g
-6.328
-16.435$
-10.331
g
-2.762
-11.682$
-6.506
g
g
Diff
From
-7.044
-20.989$
-12.749
g
g
g
g
-7.268
-18.754$
-11.840
bd
bd
bd
-4.040
-14.903$
-8.359
d
g
g
g
bd
bd
m
/p/
f
bth
m
Itl
f
bth
/k/
Table 6(a). Mean amplitude differences (dB) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Positions A and B as in Figure 2.
Position
Consonant
/b/
/d/
bth
m
-5.048
f
f
bth
m
/g/
f
bth
m
/p/
f
bth
m
/t/
f
bth
/k/
Mean
-2.362
-7.236$
-4.290
m
E
D
Diff
From
d
k
ptkd
b
-10.343$
-7.185
b
Mean
From
-1.444
-9.158$
-4.496
-2.466
-7.309$
-4.421
-2.896
-11.085$
-6.025
-1.084
-7.255$
-3.442
-4.586
-10.339$
-7.131
-1.703
-9.467$
-5.137
b
-0.220
-2.846
-6.799
b
-9.674$
-4.601
-4.682
-12.750$
-8.197
b
b
-1.168
-10.077$
-5.050
-11.377$
Diff
t
d
Table 6(b). Mean amplitude differences (dB) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Positions D and E as in Figure 2.
Position
Consonant
j
m
/b/
/d/
/9/
-0.093
f
m
-0.797
-7.573$
bth
-3.532
f
m
-0.680
-6.797$
bth
-3.017
f
bth
f
bth
Duff
From
- 10.317$
-4.137
m
It/
Mean
bth
m
/p/
V
k
k
k
+0.579
-9.561$
-3.906
-0.117
10.426$
-4.769
-1.593
/k/
-11.607$
-5.955
Table 6(c). Mean amplitude differences (dB) across place of articulation
for the different sexes with indications of pair-wise contrasts significant
at the 5% level (t- test). Position V as in Figure 2.
Again, the dB/Hz slopes for females are consistently steeper than the
males' slopes across all positions except at V for /g/ and /p/. The
picture becomes more interesting when these values are compared with
the dB values. For /p/, the male H2 is seen to be higher than F0. For
/g/, both the measures show F0 generally higher than H2, but whereas
the dB difference is greater for females than for males, with the other
measure the result is the opposite. An extension of the hypothetical
example above shows that this is mathematically unsurprising: with
differences in 'A' of 10 in spectrum M, and of 20 in spectrum F, we saw
that the gradients would be the same; however, a reduction of just one
'A' unit would give an apparently steeper slope for spectrum M, even
though the amplitude difference would still be greater in spectrum F:
10/100 = 0.1 'A'/Hz; 19/200 = 0.095 'A'/Hz. Moreover, bringing the
amplitude difference in spectrum F down to, say, 13 would still leave it
greater than the difference for M, but the slope would be 0.065 'A'/Hz,
only just over half as steep as the male counterpart.
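The arithmetic of this hypothetical example can be checked directly. The sketch below assumes, as the divisions quoted in the text imply, that the two harmonics are 100 Hz apart in spectrum M and 200 Hz apart in spectrum F; the function name `slope` and the dictionary labels are illustrative only.

```python
def slope(amplitude_diff, harmonic_spacing_hz):
    """Slope in 'A' units per Hz between two successive harmonics."""
    return amplitude_diff / harmonic_spacing_hz


M_SPACING_HZ = 100   # assumed spacing of harmonics in spectrum M
F_SPACING_HZ = 200   # assumed spacing of harmonics in spectrum F

cases = {
    "M, difference 10": slope(10, M_SPACING_HZ),
    "F, difference 20": slope(20, F_SPACING_HZ),
    "F, difference 19": slope(19, F_SPACING_HZ),
    "F, difference 13": slope(13, F_SPACING_HZ),
}

for label, value in cases.items():
    print(f"{label}: {value:.3f} 'A'/Hz")
```

In every F case except the first, the amplitude difference is larger than the M difference of 10 while the slope is shallower than the M slope, which is the reversal described in the text.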
There are further differences between the two tables in terms of
which pair-wise contrasts between phonemes show a significant
difference. To take the values for the prevoicing first, although the 'Diff
From' columns for measurements at position A are identical, there are
discrepancies in the same column for position B, where, for example,
/d/ enters into no significant contrasts for the dB/Hz measure, but
contrasts with /g/ for all groups of speakers for the dB measure. With or
without these discrepancies, these pair-wise contrasts also indicate that a
caveat needs to be added to our suggestion above that the waveform of
the prevoicing was the closest we were likely to get to the glottal
source waveform. They show (not surprisingly) that the supralaryngeal
characteristics of the consonants do affect the pre-voicing F0-H2 tilt.
There are still large differences between males and females, but it could
be argued that since place of articulation obviously does have an effect
on the slope, the differences in the lower spectral components could be
accounted for by supra-glottal differences, rather than differences
generated by the vocal folds themselves. In view of the findings of the
literature reviewed earlier, it is improbable that the male-female spectral
differences found can be entirely ascribed to supra-glottal effects, but
there was no possibility of testing the extent of those effects within the
framework of this study.
In the post-release positions, the number of pairs of phonemes with
significant differences between them decreases in both tables from D
through E to V, but again different pair-wise contrasts were found to be
significant in the different tables. It is clear too that the formant
transitions do have an effect on the slope, and one is again forced to
question whether the highly significant male-female differences found at
D and E (as opposed to the failure to attain significance at V in the
dB/Hz measure) were not at least enhanced by supraglottal resonance
differences between the males and females. The effect of F1 would be
reduced by the time it had passed through the frequency band where it
would affect H2, hence the reduced inter-phoneme differences through E
to V. If H2 is being enhanced, that would reduce the difference between
it and F0, thus masking the characteristics of the 'breathy' spectrum.
That there still is at least some male-female difference at V is
encouraging for our original hypothesis that there is an effect
independent of formant differences. However, this should be confirmed
by examining the possible influence of the different F1 values of the
vowels themselves. Actual measurements of the formant frequencies
were not carried out, but a statistical analysis of possible vowel effects
was done.
3.2.4 Possible effect of following vowel
Henton and Bladon (op. cit. ) restricted their study to the English
vowels /a /, Ail, /A/ and /o/ in order to try and minimise the interference
of F1 (which is relatively high in these vowels) with F0 or H2. The
results comparing vowel-contexts for the present data in dB/Hz are given
in Table 7 and Figure 6. Unfortunately the full set of statistics for the
dB measure is not available, so in the light of the differences noted in
the previous paragraph, the following comments, which are based on
the dB/Hz values, should be treated with caution.
Position
Vowel
m
/i/
f
bth
m
/e/
f
bth
/u/
Mean
-0.04733
-0.19102$
-0.06517
-0.00096
-0.06823$
-0.02787
Diff Mean
From
e
-0.04956
-0.10871$
-0.07338
-0.03036
bth
m
-0.03751
-0.05564
-0.08800
-0.11250
f
bth
-0.05738
From
e
-0.07630
iu
-0.04887
f
Diff
-0.00485
-0.04061
-0.07567$
-0.05492
m
/a/
B
A
iau
-0.09830
-0.06985
e
-0.07758
e
e
Table 7(a). Mean slope values (dB/Hz) showing effects of different
following vowels at positions A and B across the sexes and indications
of pair-wise contrasts significant at the 5% level (t-test figures for both
groups only).
Position
Vowel
E
D
Mean
Diff
Mean
bth
-0.03399
-0.08181$
-0.05418
m
+0.00739
m
/i/
/e/
/a/
f
f
-0.02424
m
-0.01902$
+0.00132
-0.01028
f
bth
m
/u/
f
bth
-0.00966
-0.06910
ea
-0.03477
eau
ia
+0.01403
-0.00605$
+0.00650
ia
-0.07690
bth
-0.03485
-0.07782$
-0.05290
Diff
From
From
-0.01440
iu
-0.00339
-0.00969
ia
ea
-0.00576
-0.05574$
-0.02675
iea
Table 7(b). Mean slope values (dB/Hz) showing effects of different
following vowels at positions D and E across the sexes and indications
of pair-wise contrasts significant at the 5% level (t-test figures for both
groups only).
Position
Vowel
V
Mean
Diff
From
m
/i/
/e/
f
-0.04997
m
+0.00408$
+0.02486
+0.01187
f
m
f
bth
m
/u/
-0.06490
bth
bth
/a/
-0.03901
f
bth
a
-0.00409
-0.01133$
-0.00766
-0.02060
-0.05256
-0.03402
a
Table 7(c). Mean slope values (dB/Hz) showing effects of different
following vowels at position V across the sexes and indications of
pair-wise contrasts significant at the 5% level (t-test figures for both
groups only).
Figure 6. Slope differences as a function of following vowel. All
speakers.
In Figure 6 the patterns for the four vowels when all speakers are taken
together have a somewhat similar trajectory. Apart from /e/, there is a
striking degree of similarity before the release, suggesting relatively
little coarticulatory effect on this part of the spectrum in prevoicing.
The atypical pattern for /e/ can be explained by the lack of tokens
following either /b/ or /d/. There are large post-release differences and an
inspection of the values for males and females separately (cf Table 7)
shows that there is a complex effect, which is not surprising when one
considers the complex sex-specific differences found in the acoustic
structure of vowels. The female slope is again generally steeper.
However, in /a/, where following previous studies we had expected to
see the hypothesis confirmed most firmly, the male-female position is
reversed after the release through D and E, and the only mean value for
females to be a positive value (indicating H2 higher than F0) is at D for
/a/ (although the male-female difference fails to reach significance at
either D or E). At V there is a return to the more common pattern of
females having the steeper mean slope, although this difference fails to
reach significance by a long way (p>0.05). Clearly more detailed
analysis of the interaction of slope and formant frequency is needed.
3.2.5. The Voicing contrast
It was suggested above that the F0-H2 difference may be found to vary
following voiced versus voiceless consonants as an indicator of
increased breathiness in the voiceless case. Values for the Voiced versus
Voiceless classes as wholes are given in Table 8. None of the
differences in slope between Voiced and Voiceless reaches significance.
The greatest differences tend to occur in the vocalic portion, which is
again where we should least expect to find them. The cross-phoneme
comparisons shown in Tables 5 and 6 above revealed hardly any
significant differences between cognate pairs, so these values are not
surprising and no positive conclusions can be drawn from them
concerning the discrimination of phonological classes.
                      Position
Sex  Voicing      D           E           V
m    Voiced    -0.02731$   -0.01466$   -0.01206
     Vless     -0.02509    -0.00415    -0.04039$
f    Voiced    -0.04945$   -0.03898    -0.03542
     Vless     -0.03579    -0.02040$   -0.03263$
Table 8. Mean values for F0-H2 slope (in dB/Hz) across Voicing
categories for males and females at post-release positions.
If, as suggested above, this is not an effect manipulated by speakers but
one due more to the physical effects of the gradual adduction of the
vocal folds, we should expect the de-voiced tokens to follow the pattern
of the Voiceless ones. Means were therefore computed across phonetic
voicing type and are presented in Figures 7 to 9 and Table 9. Two
graphs are given for the data for the male speakers and for the data for all
speakers considered together because of the drastic effect of the mean V
value for the 0-PREV tokens. The categories represented are fully-voiced
tokens (FVOICED); Voiceless tokens (PHON VLESS); Voiced tokens
where prevoicing ceased at some time at or before release (DEVOICED);
Voiced tokens with no actual prevoicing (0-PREV).
Figure 7. Slope values across positions for voicing type. Male
speakers. Including (b), and not including (a), 0-prevoiced Voiced
tokens.
Figure 8. Slope values across all positions for voicing type.
Female speakers.
Figure 9. Slope values across positions for voicing type. Male speakers.
Including (b), and not including (a), 0-prevoiced Voiced tokens.
Position
Voicing type
Sex
E
D
V
-0.01625
-0.01186
-0.02697
voiced
-0.02821
-0.00615
-0.02829
Voiceless
m
-0.01898
-0.01986
-0.03975
&voiced
-0.14358
-0.02455
-0.01354
Vd -- no prey
-0.03706
-0.04092
-0.05509
voiced
-0.04275
-0.04039
-0.04896
Voiceless
-0.01030
-0.00546
-0.01121
devoiced
-0.00290
Vd -- no prey +0.00628 +0.00687
-0.02461
-0.02353
-0.03826
voiced
-0.03473
-0.02150
-0.03755
Voiceless
both
-0.01592
-0.01478
-0.02965
devoiced
-0.10695
-0.01670
-0.00858
Vd -- no prey
Table 9. Mean values for F0-H2 slope (in dB/Hz) across voicing
categories for males and females.
f
When the effect of the male 0-PREV tokens is disregarded, the patterns
for the different voicing types across the spectral window positions are
very similar. There are no significant differences between types for
males or for the group as a whole, but for females the FVOICED and the
VLESS are significantly different from the DEVOICED and 0-PREV types,
as reflected in Figure 9. With regard to the voicing contrast, therefore,
there seems to be no phonetic or phonological grouping for which this
measure of breathiness is a robust acoustic correlate.
4. Studies published since 1988
A good deal of work has been published since 1988 on the nature of
voice source characteristics. We shall restrict ourselves here to a
description of just a small number of important studies.
The most substantial single study is that of Klatt and Klatt (1990)
on the analysis, synthesis and perception of voice quality variation.
Klatt and Klatt analysed recordings of ten female and six male speakers
uttering two 'real' sentences and reiterant imitations of those sentences
using [ʔa] and [ha] syllables and measured the relative strength of the
first harmonic, the presence of noise in the F3 region and above, and the
presence of extra poles and zeros in the vowel spectrum, mid-way
through the vowel. They found an average male-female difference of
about 5.7dB in F0-H2 difference, but there was considerable
subject-to-subject variability within each group, with average F0-H2
across sentences ranging from 8.4 to 17.1dB in females, and from 4.6 to
9.7dB in males. Periodicity versus noise excitation of F3 was measured
for the
reiterant sentences with [ha], on a subjective five-point scale and noise
was found to be commonly present for both sexes with on average more
noise in female than male subjects, but again considerable within-group
variation. Both reiterant imitations of one of the original sentences
pronounced by all subjects were then played to a panel of eight
listeners, who were asked to judge the vowels on a seven-point scale
from 'not breathy' to 'strongly breathy'. On average, females were
perceived to be slightly more breathy than males, and sentences
consisting of [ha] syllables were generally perceived as considerably
more breathy than those with [ʔa]. Correlations of breathiness ratings
with acoustic measures suggested that both the F0-H2 measure and the
presence of noise were important. Finally, pairs of synthetic 'female'
vowels (the first of each pair being a constant reference vowel) were
played to a panel of five listeners who were asked to judge the relative
breathiness of the second, its naturalness and its nasality. The results
suggested that noise amplitude was more important than F0-H2
difference in giving a breathy percept; the latter cue was insufficient on
its own to induce a breathy percept and often contributed to a perceived
increase in nasality. The tentative conclusion of the authors is that,
'... either breathiness is signalled differently for men and
women, or that the increases in the first harmonic observed in
production data from women must be accompanied by other
cues to be interpreted by the listener as cues to breathiness.'
(851)
Ní Chasaide and Gobl have published several papers developing the
theme of the 1988 presentation mentioned above, among them one in
Speech Communication (Gobl and Ní Chasaide 1992) where they
analysed repetitions of a prose passage read with a range of voice
qualities by a male phonetician who is a native speaker of British
English. The data were subjected to manual interactive inverse filtering
and analysed using the four-parameter LF-model of differentiated glottal
flow developed by Gunnar Fant. Correlates of breathy voice were found
to be high values for the parameters RA (corresponding to attenuation
of higher frequencies), RK (corresponding to a more symmetrical pulse
shape) and OQ (Open Quotient, thus also suggesting a more
symmetrical pulse). Gobl and Ní Chasaide also used data from frequency
domain analysis of the speech waveform to measure the levels of F1 and
F2 relative to the first harmonic (our F0) and their Figure 5 (487)
shows marked attenuation of both in the breathy data. An important
feature to note about both sets of measurements is that they vary over
time, and in their conclusion the authors emphasise the point that, 'a
switch between voice qualities may not necessarily involve a single
transformation which remains uniform throughout an utterance.'
Ní Chasaide and Gobl (1993) investigated voice quality in the
vicinity of Voiced and Voiceless stop consonants spoken by male and
female speakers in different languages. They found considerable
cross-linguistic differences, but the effects were not grouped according to
language-family as they had expected. Thus Swedish and, to a somewhat
lesser extent, Italian /p:/ was preceded by a markedly higher RA than
/b/, whereas, although the values were occasionally slightly higher in
French and German (suggesting a slight tendency to relax the vocal
folds in anticipation of the following Voiceless stop), the effect was not
found to be consistent. The English speakers produced both patterns,
but information is not given as to whether the division corresponds to
the speaker's sex. RK values also rose in Swedish in anticipation of
/p:/. Spectral measurements on the whole confirmed these findings, with
the voicing category of the following consonant having little differential
effect on F1 (their L1) relative to F0 in French and German, but
showing a marked relative decline in F1 before the Swedish /p:/ with a
rather lesser effect in the same direction in Italian. The English subjects
fell into two groups, as for the source parameter measures. It is
noticeable that for both sets of measures, the Figures show some
marked differences between the languages, even within one of the two
groupings (i.e. those with a /p/-/b/ difference and those without).
In postconsonantal vowels, little categorial effect was found in the
source parameters in French and Italian, but German RA was much
higher at vowel onset following /p/ than /b/, and declined less rapidly.
The authors infer that this is the result of incomplete glottal closure
with the vocal folds vibrating in breathy mode following the aspirated
stop. However, the difference between voicing categories is less marked
in Swedish and English, despite the fact that these languages also have a
voiceless unaspirated vs. voiceless aspirated phonetic contrast. The
spectral data show less similarity between Swedish and the two
Romance languages, with a lower F1 in Swedish post /p/ onset than
following /b/, but no consistent effect in French or Italian. German
follows a similar pattern to Swedish, but with an even greater relative
lowering of F1. Data for English are not given. In the light of these
findings, it is perhaps not surprising that no difference was found in the
study reported above for vowels following voiced versus voiceless stops
in French.
A smaller-scale study is currently being carried out by Scobbie
(1995 and personal communication), in which he has found a marked
difference between F0-H2 measures in vowel onset following /t/ vs. /d/,
and to a lesser extent /p/ vs. /b/, in four-year-old speech-disordered
child speakers of Edinburgh English.
5. Discussion
The 1988 study reported above raised several issues, to which we shall
now return in the light of the subsequent work reported above.
5.1 Methodology
There are various methodological questions raised by a comparison of
the studies mentioned, principal among which are how the
oft-referred-to but ill-defined feature 'spectral tilt' or 'spectral slope'
is measured, and how measurements are analysed.
5.1.1 The measurement of spectral tilt.
The studies take one of two approaches to gaining access to an accurate
measure of the voice source. Some invoke some procedure for negating
the effects of the supra-glottal filter. Thus, Fant and Ní Chasaide and
Gobl used inverse filtering techniques, whereas Monsen and
Engebretson had their subjects phonate down a reflectionless tube to
reduce the resonances of the vocal tract. Bickley also used inverse
filtering when she was looking at waveforms. The rest rely for the most
part on analysing vowels with a relatively high F1 to minimise its
effect on the lower harmonics, and/or on averaging large amounts of
data to derive an accurate picture of the shape of the source spectrum.
Henton and Bladon and Temple use statistical tests, while Hammarberg
uses Long-Term Average Spectra (LTAS). Of course, with either
approach it is impossible to be absolutely sure that a true picture of the
glottal wave has been revealed, although inverse filtering techniques
have improved greatly over recent years. The second type of approach
seems the less satisfactory one, particularly for the purposes of
comparing across studies, or even comparing different groups of
speakers within studies: it is well-known that vowel qualities differ
somewhat across languages (thus /a/ could represent something different
in Gujarati from French), and across sex groups (and that the degree of
sex-specific variation varies from language to language; see Bladon et
al 1984)13. The fact that the trajectory for /a/ from position D to V in
Figure 6 (above) is different from those of the other three vowels does
suggest that we might be able to claim that the F1 transition is not
affecting H2 in this case, but the uncomfortable fact remains that it is
only this vowel which shows the unexpectedly steeper male slope in
two positions. Moreover, Table 7 shows that only in a few
measurements were the slope measurements for /a/ seen to be
significantly different from those for the other vowels, where F1 is
likely to have had an effect.
13 It could also be the case that /4/ and /a/ in Gujarati do not have the same
formant values.
The actual measure of spectral tilt also differed from study to study.
Fant and Ní Chasaide and Gobl used the LF model of glottal flow
developed by the former, and measured parameters assumed to
correspond to characteristics of the glottal wave. Because Hammarberg
used LTAS, she was unable to make detailed measurements of spectral
features, and instead identified breathy voice quality with relatively low
energy in the F1 region (400-600Hz) and high levels in the highest
frequency band (5-10kHz). Monsen and Engebretson measured slope in
the first two octaves of their spectra in terms of dB fall-off per octave.
Others measured formants, but in different ways: Barry compared
amplitude levels for the same formant in his female and male subjects,
while Gobl and Ní Chasaide measured F1 and F2 relative to F0. The
rest of the studies measured harmonics, and I shall return to them in the
next paragraph. The point needs to be made, however, that while these
different measures allow generalised comparisons to be made of greater
or less spectral tilt, the kind of detailed comparison made, for example,
between Henton and Bladon's data and that of Bickley is not possible.
The studies using F0-H2 all measured the difference in amplitude
between the two harmonics in dB. As we have seen, comparison using
this measure between speakers with the same F0 is unproblematic
(which is not to say that the interpretation of comparisons is without
problems), but as soon as speakers with different F0 are compared, the
analyst is faced with a choice which has implications for the results and
can affect their statistical significance. Tables 10 and 11 present
recalculations of Bickley's and Henton and Bladon's figures to see how
this might affect the comparison between their sets of data.
Speaker 1
Speaker 2
Speaker 3
Speaker 4
Speaker 5
Speaker 6
Speaker 7
Speaker 8
Speaker 9
Speaker 10
Difference (in dB/Hz)
Clear
0
-0.0273
-0.0273
0.0455
-0.0364
0.0455
-0.0818
0.0364
-0.0727
0
0.1
Breathy
0.1182
-0.0364
0.0182
0.0818
0.1364
0.0909
-0.0182
-0.0182
0.0182
Table 10. Slope between first and second harmonics for breathy and
clear vowels (in dB/Hz) in !Xóõ. Calculated from figures given in
Table 1 above, assuming F0 to be 110 Hz.
Since the frequency data were not available, hypothetical values of 110
Hz for male speakers and 220 Hz for females were assumed. Moreover,
only mean amplitude differences are available for Henton and Bladon's
data. The Tables are intended to give an idea of how a different method of
calculation might affect the comparison between them, rather than a
mathematically precise reformulation of the data.
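The recalculation amounts to dividing each amplitude difference by the assumed fundamental frequency, since the second harmonic lies one F0 above the first. A minimal sketch follows, using the hypothetical F0 values stated above (110 Hz for males, 220 Hz for females); the function names and the 10 dB input are illustrative and are not taken from either data set.

```python
# Hypothetical F0 values assumed in the text, not measured frequencies.
ASSUMED_F0_HZ = {"male": 110.0, "female": 220.0}


def db_to_slope(diff_db, sex):
    """Convert an F0-H2 amplitude difference (dB) to a slope (dB/Hz)."""
    return diff_db / ASSUMED_F0_HZ[sex]


def slope_to_db(slope_db_per_hz, sex):
    """Convert a slope (dB/Hz) back to an amplitude difference (dB)."""
    return slope_db_per_hz * ASSUMED_F0_HZ[sex]


# The same illustrative 10 dB difference gives different slopes
# depending on which F0 is assumed.
male_slope_10db = db_to_slope(10.0, "male")
female_slope_10db = db_to_slope(10.0, "female")
print(f"10 dB at 110 Hz: {male_slope_10db:.4f} dB/Hz")
print(f"10 dB at 220 Hz: {female_slope_10db:.4f} dB/Hz")
```

Under these assumptions every female slope is exactly half the slope a male speaker would show for the same dB difference, so the recalculation necessarily narrows the gap between the female data and the breathy-vowel data.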
Vowel    Females    Males
/a/      0.0382     0.0089
/a/      0.0291     0.0070
/A/      0.0282     0.0015
1/       0.0150     0.0036
Table 11. Average slope (in dB/Hz) between the first and second
harmonics in male and female speakers of Received Pronunciation.
Calculated from figures given in Table 3 above, assuming F0 to be
220 Hz for female speakers and 110 Hz for male
Table 11 shows a clear difference still between the male and female RP
speakers and the female slopes are still steeper than the !XhoO clear
vowels. However, whereas the F0 -H2 amplitude difference for the RP
females' /a/, /a/ and In/ was greater than for six of the !X1166 breathy
vowels, it is only greater than two in the dB/Hz measure (with /a/ alone
being greater than one other in addition). Moreover, if the RP female /a/
measurement is compared with, for example, !XhOO speaker 10, the
ratio is 0.84 on the dB measure, but only 0.42 on the slope measure.
More significantly, the recalculation changes the relationship of the
measurements of the RP speakers with the evaluations of Bickley's
phoneticians. The recalculated average amplitude differences for vowels
judged to be in the four categories of breathiness (see p.4 above for dB
figures) are as follows: 'Very breathy' - 0.1136 dB/Hz, 0.0909 dB/Hz;
'Breathy' - 0.0755 dB/Hz, 0.1 dB/Hz; 'Slightly breathy' - 0.0609 dB/Hz,
0.0482 dB/Hz; 'Not breathy' - 0 dB/Hz, 0 dB/Hz. When these values are compared
with the RP females, the latter are seen not even to reach the 'Slightly
breathy' level. It is the case that many of the Gujarati and !Xóõ vowels
also do not reach that level in either measure, and it must be
remembered that the phoneticians were asked to judge degree of
breathiness rather than whether the vowels were breathy or not, and that
these are average values. Nevertheless, these calculations show that
there are potential problems for comparative statements which remain to
be resolved.
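The two comparisons just made can be checked with a few lines of arithmetic. In this sketch the slopes are those quoted in the text and in Table 11, and the dB figures are recovered by multiplying each slope by its assumed F0. The value used for speaker 10 (0.0909 dB/Hz) is an assumption inferred from the ratios quoted above, not a reading taken directly from Table 10.

```python
F0_FEMALE, F0_MALE = 220.0, 110.0  # Hz, the values assumed in the text

rp_female_a = 0.0382          # dB/Hz, Table 11
speaker10_breathy = 0.0909    # dB/Hz, inferred value (assumption)

# Ratio on the slope measure and on the plain dB measure.
slope_ratio = rp_female_a / speaker10_breathy
db_ratio = (rp_female_a * F0_FEMALE) / (speaker10_breathy * F0_MALE)

# Do any RP female slopes reach the 'Slightly breathy' averages?
slightly_breathy = (0.0609, 0.0482)                 # dB/Hz, from the text
rp_female_slopes = (0.0382, 0.0291, 0.0282, 0.0150)  # dB/Hz, Table 11
all_below = max(rp_female_slopes) < min(slightly_breathy)

print(round(db_ratio, 2), round(slope_ratio, 2), all_below)
```

Because the two groups are assigned F0 values an octave apart, the slope ratio is exactly half the dB ratio.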
It is evident that further experiments are needed to test whether the
straightforward amplitude difference between successive harmonics, or
the 'slope' between them is perceptually salient. The evidence reviewed
in the present article provides little basis for deciding between the
measures, but Monsen and Engebretson's suggestion that there is some
sort of built-in normalisation factor in the differing slopes (see Fig. 1
and comments in section 2.2 above) would imply that maybe it is the
slope which is important. Figure 1(b) shows the near-identity of the
spectral envelopes in un-normalised spectra: it is not the amplitude
difference alone between each pair of harmonics which allows this to
happen, but the combined effect of that and the distance between them
in frequency.
5.1.2 The use of statistics
Many of the studies discussed use statistical analyses of the data. This
not only poses problems of comparability between studies because of
the different numbers of subjects studied, but also those studies which
present only statistical comparisons of groups of speakers risk masking
variability within each group. Dempster (1992) illustrates this
dramatically with an analysis of F0-H2 differences in two contexts in
the large DARPA TIMIT Acoustic-Phonetic Speech Database Training
Set, a database containing material from 420 speakers of U.S. English.
Whilst one might want to take issue with aspects of Dempster's study,
his evidence for the dangers of relying on statistics for drawing
conclusions is salutary: he found a statistically significant difference
(p<0.1) between male and female F0-H2 differences for the vowel /aa/14
(measured in dB), but when the data are presented in histogram form, a
very large degree of overlap is apparent.
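The general point, that a statistically significant difference between group means is compatible with heavy overlap between the groups, can be illustrated with a small sketch. The means, standard deviations and group sizes below are hypothetical and are not Dempster's data; the test used is a simple two-sample z-test, chosen for brevity rather than because it is the test Dempster applied.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical group statistics (NOT Dempster's data): mean and SD in dB.
female = NormalDist(mu=6.0, sigma=4.0)
male = NormalDist(mu=5.0, sigma=4.0)
n_female = n_male = 210  # hypothetical group sizes

# Two-sample z-test on the difference between the group means.
std_error = sqrt(female.variance / n_female + male.variance / n_male)
z = (female.mean - male.mean) / std_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Shared area of the two distributions (0 = disjoint, 1 = identical).
overlap = female.overlap(male)

print(f"z = {z:.2f}, p = {p_value:.4f}, overlap = {overlap:.2f}")
```

With large samples even a 1 dB difference in means reaches conventional significance, although the two hypothetical distributions share most of their area.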
While it is right, as Dempster says, that we should heed Klatt and
Klatt's warning that 'it is unwise to make sweeping generalisations
with regard to sex typing' (op. cit.: 852), this does not invalidate or
preclude further exploration of some of the questions raised in the
present paper concerning the undoubtedly strong sex-specific tendency
found in the work reviewed.
14 TIMIT phonetic label representing the vowel in heart etc.
5.1.3 Perceptual experiments
All the perceptual experiments reported involve trained phoneticians.
The answers thus tell us whether phoneticians judge the voice qualities
according to a linear scale of 'breathiness' which they have learned. This
does not really tease out the different contributing factors or enable us to
make much progress with one of the central questions, that is whether
the findings discussed above are addressing something which can really
be construed as the same phenomenon in the real world. For example,
does the F0-H2 difference contribute to the perception of [a] versus [a̤] for
the ordinary, untrained speaker of Gujarati?
That the judgements elicited tend to be on a scale of breathiness is
also worthy of comment. When breathiness is being examined as a
possible correlate of maleness or femaleness, or of degree of severity of
a pathological condition, the justification for the approach is evident,
but in an investigation of the acoustic correlates of phonological
categories its relevance is less clear (compare, for example, the fact that
English native speakers do not tend to hear absolute initial prevoiced
French stops as 'very voiced'; when students of French are asked to
attend to prevoicing, they often perceive a preconsonantal nasal
element).
5.2 Are we all talking about the same thing?
Perhaps the most important question, and one which needs to be
considered before further detailed investigations of some of the problems
highlighted in this paper are carried out, is whether we are not being
misled by applying a single label to a variety of phenomena which are
different in some respects. There is common ground between all the
studies discussed, but they are looking at spectral tilt as a marker of
breathiness in four different contexts:
1. as indicative of male-female physiological differences (e.g.
Monsen and Engebretson);
2. as indicative of breathy voice quality for sociolinguistic or
paralinguistic effect (e.g. Henton and Bladon);
3. as a characteristic of phonological categories (e.g. Bickley);
4. as indicative of a pathological problem (e.g. Hammarberg).
Is it justifiable to extend Ladefoged's 1983 statement quoted earlier to
apply to the studies reviewed here? That is, is it really reasonable to
claim that the 'breathiness' of pathological subjects or Gujarati speakers'
[a̤] vowels, rather than a tendency for the difference between F0 and H2
to be greater, is characteristic of female speech? Barry's finding that
noise in the high-frequency regions of the spectrum was as important
for generating a 'good match' female voice suggests that it may be, and
indeed the vibratory pattern suggested by Monsen and Engebretson for
female vocal folds would predict that more noise would be generated
than by males, as well as females having an enhanced fundamental. But
this does not guarantee that the relative 'amounts' of noise and tilt are
the same in all the cases. If, as Klatt and Klatt claim, noise is more
important than tilt for giving a breathy percept, then maybe the F0-H2
differences found by Henton and Bladon are not indicative of breathiness
at all.
In addition, the physiological correlates of the acoustic phenomena
are reported or hypothesised to be different in the different cases:
Ladefoged (see page 2 above) describes different correlates for breathiness
in Gujarati vowels and English voiced /h/, the former a deliberate
configuration of the vocal folds, and the latter a passive effect;
Hammarberg posits incomplete abduction of the vocal folds as a result
of unilateral paralysis or nodules on the folds; and Monsen and
Engebretson ascribe the greater spectral tilt and noise to the different
vibratory patterns of the vocal folds in males and females, which are in
turn caused by differences in mass and structure. There is no reason why
the relationship between production settings and acoustic structure has
to be one-to-one, but it cannot be taken for granted that the different
settings will necessarily produce something which can be called the
same.
REFERENCES
Barry, M. 1986. Synthesising female voice quality: parameters and test
methods. Cambridge Papers in Phonetics and Experimental
Linguistics 5.
Bladon, R. A. W., Henton, C. G. and Pickering, J. B. 1984. Outline of an
auditory theory of speaker normalization. In van den Broecke, M. P.
R. and Cohen, M. (eds) Proceedings of the Xth International
Congress of Phonetic Sciences 313-317.
Bickley, C. 1982. Acoustic analysis and perception of breathy vowels. MIT
Working Papers in Speech Communication 1. 71-81.
Ni Chasaide, A. and Gobl, C. 1988. Voicing contrasts and the voice source.
Paper delivered at the British Association of Academic Phoneticians
Colloquium, Dublin.
Ni Chasaide, A. and Gobl, C. 1993. Contextual variation of the vowel voice
source as a function of adjacent consonants. Language and Speech
36. 303-330.
Clark, C. J. 1986. Description of a speech analysis system. Progress
Reports from Oxford Phonetics 1. 13-25.
Dempster, G. J. 1992. Acoustic cues to breathiness: a true marker of gender?
Proceedings of the Institute of Acoustics. Vol. 14, Part 6. 249-256.
Gobl, C. and Ni Chasaide, A. 1992. Acoustic characteristics of voice
quality. Speech Communication 11. 481-490.
Hammarberg, B. 1986. Perceptual and Acoustic Analysis of Dysphonia.
Dissertation from Dept of Logopedics and Phoniatrics, Huddinge
University Hospital, Sweden.
Henton, C. G. and Bladon, R. A. W. 1985. Breathiness in normal female
speech: inefficiency versus desirability. Language and
Communication 5. 221-227.
Klatt, D. H. 1986. Detailed spectral analysis of a female voice. Journal of
the Acoustical Society of America 80, Supplement 1. S69.
Klatt, D. H. and Klatt, L. C. 1990. Analysis, synthesis and perception of
voice quality variations among female and male talkers. Journal of
the Acoustical Society of America 87. 820-857.
Ladefoged, P. 1981. The relative nature of voice quality. Journal of the
Acoustical Society of America 69, Supplement 1. DD3.
Ladefoged, P. 1982. A Course in Phonetics. New York: Harcourt, Brace
Jovanovich.
Ladefoged, P. 1983. Linguistic uses of different phonation types. In Bless,
D. M. and Abbs, J. H. (eds). Vocal Fold Physiology. San Diego:
College Hill Press. 351-360.
Laver, J. 1980. The Phonetic Description of Voice Quality. Cambridge:
C.U.P.
Mattingly, I. G. 1966. Speaker variation and vocal tract size. Journal of the
Acoustical Society of America 39. 1219.
Monsen, R. and Engebretson, A. M. 1977. Study of variations in the male
and female glottal wave. Journal of the Acoustical Society of
America. 62. 981-993.
Ohala, J. J. 1983. Cross-language use of pitch: an ethological view.
Phonetica. 40. 1-18.
Sachs, J. P. 1975. Cues to the identification of sex in children's speech. In
Thorne, B. and Henley, N. (eds) Language and Sex: difference and
dominance. 152-171.
Scobbie, J. M. 1995. Phonological and phonetic perspectives on the
delayed and disordered acquisition of English initial consonant
clusters. Paper presented at the Third Phonology Workshop, North
West Centre for Romance Linguistics, Manchester, May, 1995.
Temple, R. A. M. 1988a. Sex-specific Aspects of the Voicing Contrast in
French Stop Consonants. Oxford M.Phil. thesis.
Temple, R. A. M. 1988b. In search of sex-specific differences in the voicing
of French stop consonants. Progress Reports from Oxford Phonetics
3. 74-99.
Watson, I. M. C. 1987. Problems in quantifying the child voicing contrast.
Ms, University of Oxford.
NOTES ON TEMPORAL INTERPRETATION AND
CONTROL IN MODERN GREEK GERUNDS*
George Tsoulas
Department of Language and Linguistic Science
University of York
1. Introduction
In this paper I would like to examine some aspects of the syntax of
the Modern Greek gerund clauses. This study will mainly focus on the
following aspects of the syntax of these clausal constituents:
(i) Their External and Internal Syntax
(ii) Temporal Interpretation of Gerund clauses
(iii) Their Argument status
(iv) Control in Gerunds
As a starting point in this paper we adopt the commonly held view that
gerund clauses are never arguments but only adjunct modifiers. Our
account of their temporal interpretation relies on recent theories of
adjunction under which the configurational difference between adjuncts
and specifiers vanishes. Furthermore, we provide arguments from ECM
constructions, imperatives and topicalisation in favour of the claim that
gerund clauses can also be arguments. This in turn leads us to a
principled account of the puzzling control patterns found in gerund
clauses.
* Earlier versions of this paper were presented at the first Workshop
on Modern Greek Syntax in Berlin in December 1994 and at the CNRS in
Paris (URA 1720) in February 1995. I want to thank these audiences for
their comments and discussion. In particular, I would like to thank Artemis
Alexiadou, Sabine Iatridou, Lea Nash, Alain Rouveret and Anne Zribi-Hertz.
Thanks also to David Adger for very useful comments and discussion on a
preliminary version of this work. Needless to say, I alone am responsible for
the views defended here as well as for all remaining errors of fact and
interpretation.
York Papers in Linguistics 17 (1996) 441-470
© George Tsoulas
2. An Overview of the Issues
Consider the following Modern Greek sentences:
(1) I Maria_i ide to Gianni_j [CP PRO_*i/j zografizondas ena dendro].
    The Maria saw the Gianni painting a tree
    Maria saw Gianni while he was painting a tree.
(2) I Maria_i ide to Gianni_j [CP PRO_i/*j zografizondas to dendro].
    The Maria saw the Gianni painting the tree
    Maria saw Gianni while she was painting the tree.
Under currently quite standard assumptions concerning the nature and
the sites of adjunction (Chomsky 1989, 1992, 1993; Kayne 1994) one
may suppose that there is no significant structural difference in the
syntax of sentences (1) and (2). As the indexing indicates however there
is a difference in so far as the controller of the PRO is concerned. The
only observable difference in the two sentences is the nature of the
object of the verbal form zografizondas: in (1) the object of this verb1
is an indefinite DP, and in (2) it is a definite DP.
Notice also that in a sentence like (3), in which (2) is embedded
under the verb Akousa 'I heard', the controller cannot be the subject of
the main clause (pro with first person features).
(3) Akousa oti i Maria ide to Gianni zografizondas to dendro.
    Heard/I that the M saw the G painting the tree
1 Although the precise nature of this form remains to be determined we will
use verb for the moment for convenience.
Bearing in mind that the gerund, as the glosses indicate, has a
specific temporal interpretation, one question that we have to address is
why in (3) the gerund clause cannot be associated with the matrix.
A further issue arising is whether the object To Gianni, which
displays accusative Case, genuinely belongs to the matrix sentence or
whether it is in fact the subject of the gerund clause which is
Exceptionally Case Marked by the higher verb. In order to provide a
satisfactory answer to this question one has to settle the issue of the
argument status of the gerund clause.
As will become clear in the remainder of the paper the differences
seen above in syntax and interpretation are due to the ambiguity of
these forms, which can be either participles or gerunds. The paper is
organised as follows. In the following section I present the distribution
of gerund clauses. Then I examine their categorial status and their
internal syntax, focusing principally on their temporal interpretation
and several temporal scope ambiguities. In the last part I examine their
argument status and modify the initial assumption that gerunds in
Modern Greek are only adjunct modifiers. I conclude with a discussion
of the control properties of gerunds.
3. The Modern Greek Gerund
In this section I want to investigate the properties of what has been
frequently called a gerund in Modern Greek. This form is exemplified in
(4).
(4) Pinondas to krasi
drinking the wine
This verbal form has not received much attention in the recent
literature.2 The question of what its precise nature is and its place
within the Modern Greek verbal paradigm has not yet been clearly
addressed. In fact whenever, in the literature, (4) is put under the heading
gerund, it is only because of its apparent lack of agreement and tense
features.3 On the other hand, the fact that this form, historically, clearly
derives from the active participle has led some researchers to classify it
with participles. In this paper I will argue that this form is ambiguous
in that in some cases it behaves as a participle, and in others more as a
gerund. Two caveats are in order here. First, as will become apparent in
the remainder of this paper, it would be misleading to understand by the
term gerund the notoriously syntactically and semantically ambiguous
English counterpart. Only one aspect of the function and distribution of
the English gerund is displayed by the Modern Greek (4). Examples
(8)-(11) are intended to show this.
Second, the participial uses of (4) are not on a par with the uses of
clearly participial forms in Modern Greek: although the gerund can be
considered a participle in so far as it restricts the possibilities of
control, it still preserves other verbal properties whereas real participles
do not.
2 Not only in recent years but also in the literature since the 1930s, to the
best of my knowledge, this form has received only a passing mention in the
morphology section of reference grammars and other works. Its syntax has
never really been seriously investigated; see for example Joseph and
Philippaki-Warburton 1986, Householder, Kazazis and Koutsoudas 1964,
Tzartzanos 1949, Seiler 1952, Mirambel 1939 among others.
Examples (5)-(11) cover essentially the distribution of the Modern
Greek gerund.
(5) Pinondas to krasi o Giannis kapnize.
    drinking the wine the Giannis was smoking
    Giannis was smoking while he was drinking the wine.
(6) O Kostas kimotan kratondas to molyvi tou.
    The Kostas was sleeping holding the pen his
    Kostas was sleeping holding his pen (with his pen in his hand).
(7) Rixnondas to potiri to espase.
    dropping the glass it (S)he broke
    She broke the glass by dropping it.
3 With the notable exception of Householder, Kazazis and Koutsoudas 1964
who provide more evidence for such a claim (see below).
(8) * O Giannis ekseplagi apo to telionondas tou arthrou.
    The G. was surprised by the finishing of the paper
    Finishing the paper was a fact that surprised Giannis.
(9) * O Kostas pige psarevondas.4
    The Kostas went fishing
    Kostas went fishing.
(10) * (To) telionondas to arthro toso grigora mas ekseplikse.
    (The) finishing the paper so quickly us surprised
    Finishing the paper so quickly was a fact that surprised us.
(11) * O Kostas zitise arcizondas mathimata pianou.
    The Kostas asked starting lessons piano
    Kostas asked to start taking up piano lessons.
It is clear from the above examples that gerundival clauses only
appear as adjunct modifiers (5, 6, 7), they can never be subjects or
objects of verbs or prepositions (8, 9, 10, 11); they can never occupy
an A-position. They can however be adjoined to various sites depending
on their meaning and in that respect they are parallel to adverbial
modifiers. Thus, a manner gerund will be adjoined to VP, a temporal
gerund is adjoined to IP and a modal even higher, as in (14).
(12) I Anna anisixise to Niko fonazondas voithia.
    The Anna worried the Niko crying out help
    Anna caused worry to Niko when (because) she cried out for help.
(13) I Anna ftiaxnondas kafe milai sto tilefono.
    The Anna fixing coffee (she)speaks on the phone
    Anna talks on the phone while she is making coffee.
4 I leave aside here the idiomatic pigeno girevondas 'I am looking for
trouble'.
(14) Echondas makria malia i Anna prepi na to xtenizi sinechia.
    Having long hair the A. must C them comb always
    Having long hair Anna must comb it all the time.
This difference in the semantic interpretation as reflected by the
syntax can be explained by a difference in intensionality. In (12) one
may suppose that given that the contents of the VP have all moved
higher to functional projections the gerund remains adjoined to the VP.
In (13) the subject is outside the scope of the adjunct but the remainder
of the VP is not. In (14) the gerund has in its scope something akin to
the Σ Phrase of Laka (1990) which explains its modal interpretation.
3.1 External Distribution5
What I call here gerund has frequently been confused with participles
and, consequently, it has been considered a 'nominal' form of the verb.
However there is clear evidence that the gerund shares distribution with
verbs. Gerunds are opposed to participles in that they can never be
nominalised (see (15)), i.e. they can never be preceded by a determiner;
they can only be modified by adverbs (see (16) and (17)); they do not
compose with auxiliaries to form complex tenses (see (18)); and, in
general, they only function as verbs. Participles, on the other hand have
all the opposite properties, (except for the complex tenses6) as the
following examples show.
GERUNDS
(15) * To ksekinondas ine diskolo.
    The starting is difficult
(16) * To ksekinondas, to opio theloume7 ine diskolo.
    The starting the which (we) want is difficult
(17) Milouse kitondas me astamatita.
    he/she was talking looking at me all the time
(18) * echo/ime kitondas.
    I have/be looking
PARTICIPLES
(19) O Xaroumenos ine efxaristos.
    The happy/MASC is pleasant
(20) O Xaroumenos anthropos ine efxaristos.
    The happy/MASC man is pleasant
(21) O Xamenos, o opios bori na ine opiosdipote, den xerete.
    the loser/M the which can C be anyone neg rejoice
(22) Milouse arnoumeni na me kitaksi.
    she was talking refusing/F C at me look
    She was talking refusing to look at me.
5 I am interested here in the overall behaviour of the gerund and not in its
precise morphological constitution. Due to space limitations I will not
attempt here to analyse the function of the morpheme -ondas that forms the
gerund. Historically, this morpheme comes from the accusative of the active
participle of Ancient Greek (with the rather mysterious addition of the -s
ending). I believe that this resemblance and historical affiliation is
responsible for much of the confusion created among scholars as to the
nature of the gerund. I leave a more detailed analysis of its morphological
peculiarities for further research.
6 Strictly speaking, participles do not compose with auxiliaries to
form complex tenses either. Complex tenses in Modern Greek are formed by
means of a different form, derived from the past tense's root together with a
third person singular ending (with some exceptions). This form is not
homophonous to the third person singular of the past tense because it lacks
the temporal prefix (augment) /e/. However, the investigation of the
morphological properties of this form would take us too far afield from our
initial purposes. I will thus leave it aside for the present paper.
7 Here the modifier is a relative clause. Examples showing the gerund being
modified by an adjective are not particularly illuminating since the gerund,
uninflected for gender, would have to be modified by a third person neuter
adjective, a form which, in Modern Greek, coincides with the adverb. Notice
also that in (16) the presence (or absence) of the determiner To is irrelevant
to the grammaticality of the sentence.
These examples show that the distribution of the gerund can be
considered as a subset of the distribution of the participle. Participles
are in principle categorially ambiguous in the sense that they can
function either as verbs or as nouns or adjectives. The distribution of
the gerund covers only one part, the verbal part, of the participle's
distribution. Differently put, only example (22) is comparable to the
examples (12)-(14) which show the distribution of gerunds.
3.2 The Structure of Gerund Clauses
The main question arising in connection with the internal structure of
gerund clauses is their categorial status. This question will be shown to
be of major importance because it bears directly on the status of their
subject. Gerund clauses seem to be CPs. In the following examples,
cases of wh-extraction from within the gerund clause are shown.8
(23) Ti_i pinondas akouge mousiki?
    what drinking (s)he listening music
    What was she drinking while she listened to the music?
(24) Se pion milondas magireve?
    To whom talking he/she was cooking
    Who was she talking to while she was cooking?
(25) Pou kitondas sou milouse?
    where looking to you was talking
    Where was she looking while she was talking to you?
In (23) and (24) argument extraction is displayed (direct and indirect
object respectively) and (25) shows adjunct extraction.9 These examples
show that a Spec, CP position is available and can be targeted by
wh-movement. On the other hand, similar examples involving clearly
participial forms (i.e. inflected for number, gender, person, and Case)
are sharply ungrammatical:
(26) * Ti ton thimasai arnoumeno.
    what him remember/you refusing/3/S/M/ACC
(27) * Pou ton ides vriskomeno.
    where him saw/you being/M/S/3/Acc
(28) ??Pou ton ides eksaskoumeno?
    where him you saw exercising
    Where did you see him exercising?
There is a difference in acceptability between (26)-(27) and (28),
which is much better. The reason for this asymmetry between
argument/adjunct extraction is obscure. Notice that the locative in (27)
behaves more like an argument of the verb vriskomai 'being in a
location'.10
These examples suggest that, contrary to gerunds, participial
clauses are bare IPs (or even VPs). This observation is particularly
significant for the subpart of the distribution of participles that
coincides with the distribution of gerunds, i.e. when participles function
as verbs.11
8 All the sentences involving extraction are somewhat marginal in
acceptability. Their marginal status is to be imputed to the well known fact
that extraction out of an adjunct is generally marginal. The relevance of
these examples will become more evident when they are compared with
extraction out of participles, which is impossible.
9 There is of course the possibility of leaving the wh in situ, which is also
more natural (but see note 8):
(i) Pinondas ti akouge mousiki
    drinking what was/(s)he listening to the music
(ii) Milondas se pion magireve
    talking to whom was/(s)he cooking
(iii) Kitondas pou sou milouse
    looking where to you was (s)he talking
    (S)he was talking to you looking where?
10 This type of asymmetry suggests that the lexical semantics of each
item have some influence, but I will not pursue this path further.
11 It is rather interesting to note that for some obscure reason the option of
long wh-movement, widely attested in pro-drop languages such as Modern
Greek, is not available here.
3.2.1 Temporal Interpretation of Gerunds
Gerunds are further opposed to participles in that, aspectually, they are
uniformly imperfectives whereas participles are perfectives.
(29) Pinondas arga to krasi milouse gia glossologia.
    drinking slowly the wine he was talking about linguistics
(30) Diavaze kapnizondas astamatita.
    he was reading smoking without stopping
(31) * Arnoumenos arga tin prosfora efige.
    Refusing/3/S/M/Nom slowly the offer left/he
(32) Eksaskoumenos astamatita katafere to skopo tou.
    exercising/MASC all the time he reached the aim his
The perfective/imperfective difference can also be cast in terms of
definiteness/indefiniteness. I have proposed in Tsoulas (1994a, 1994b,
1995) that tense is also subject to the definiteness/indefiniteness
distinction. Furthermore, I have proposed that this distinction should
replace the classical finite/non-finite distinction, since it is now widely
accepted that non-finite verbal forms only lack morphological temporal
specifications, while semantically they still contain information
pertaining to temporal interpretation. This theory has interesting
predictions in that it parallels clausal and nominal (DP) constituents in
yet one more respect. Informally in the case under examination, the
gerund is indefinite in that it does not refer to a precise point or interval
in time whereas participles do. In the grammatical example (32) the
temporal reference of the participle can be characterised as a closed
temporal interval located at some time before the occurrence of the
event denoted by the main verb. By contrast, gerunds denote open
intervals with respect to the main verb. If we consider gerunds as
indefinites, this constitutes an additional explanation for the extraction
data in the preceding paragraph, namely, indefinites permit extraction
while definites disallow it (see Ross 1968, Manzini 1993 among
others).
3.2.2 Temporal Scope Ambiguities with Gerund Clauses
In this subsection I will present some more evidence for the CP status
of gerund clauses. This evidence also bears on the issues of control
mentioned in the introduction. This evidence involves temporal scope
ambiguities and binding with gerunds. Consider the following
sentences:
(33) Tremondas apo to fovo tou o Giannis lei oti o Kostas efige.
    trembling by the fear his the G. says that the K. left
    Giannis says that Kostas left trembling from fear.
(34) Vlepondas to ligosta malia tou o Giannis ipe oti o Kostas epathe egefaliko.
    Seeing the few hair his the G said that the K. had a stroke
    Giannis said that Kostas had a stroke seeing his thinning hair.
(35) Trogondas ti soupa tou o Giannis ipe oti o Kostas kaike.
    Eating the soup his the G said that the K. was burned
    Giannis said that Kostas burned himself while eating his soup.
(36) Ida to Gianni vgainondas apo to spiti (tou) prin na ton skotosi o Kostas.
    Saw/I the G. coming out of the house (his) before C him killed the K
    I saw G. getting out of his/the house before K. killed him.
(37) Ida ton Kosta na skotoni to Gianni vgainondas apo to spiti (tou).
    Saw/I the K C kill the G coming out of the house (his)
    I saw Kostas killing Giannis while getting out of the/his house.
(38) Ematha oti o Kostas skotose to Gianni vgainondas apo to spiti tou prin mathefti o tsakomos tous.
    Learned/I that the K. killed the G coming out of the house his before becomes-known the fight their
    I learned that K killed G getting out of the/his house before
    their fight becomes known.
Examples (33)-(38) show that the gerund can be construed with each of
the clauses in the complex structure. For example, (38) can have the
following interpretations:
(i) I heard, when I was getting out of the/his house that Kostas killed
Gianni, before their fight becomes known.
(ii) I heard that Kostas, as he (Kostas) was getting out of the/his house
he (Kostas) killed Gianni, before their fight becomes known.
(iii) I heard that Kostas killed Gianni when he (Gianni) was getting out
of the/his house before their fight becomes known.
Interestingly enough, the gerund clause cannot be associated with the
before-clause in this structure. We merely note this fact for
the moment and shall return to it shortly.
In general, it is natural to suppose that the adjunction site is what
determines the interpretation. In other words, the gerund clause must be
adjoined to a given T (or I) node in order to be able to modify that node.
However, we see that the same surface string can yield several
interpretations. The question is how these interpretations are to be
derived in a framework like the minimalist program (Chomsky 1993,
1994, 1995), where one of the major predictions of the theory is that
optionality should be banned. One way to deal with this problem is to
suppose that the entire adjunct is covertly moved and readjoined to some
other position. One may, however, legitimately ask what motivates
such a movement, since all movement operations must be driven by the
need to check some morphological feature. It is difficult to imagine
what that feature could be. Another way around this problem that comes
to mind derives from Geis' (Geis 1970) and Larson's treatment of
temporal prepositions as involving silent temporal operators that need
to be moved to the COMP position of the clausal complement of the
preposition.12 Consider for example a sentence containing a before-clause:
12 Cited by Johnson 1988, who applies this analysis to clausal gerunds in
English.
(39) Valerie arrived before you said she had.13
This sentence is ambiguous. It has one meaning corresponding to (i)
and one meaning corresponding to (ii).
(i) Valerie arrived before the time of your saying that she had.
(ii) Valerie arrived before the time you said she had arrived at.
According to Larson, as cited by Johnson (1988), the ambiguity
arises because in these clauses there are empty temporal operators.
These operators, once moved to the appropriate position, bind a variable
located either in the matrix (i) or in the embedded clause (ii). This
analysis, since it is based on movement, makes the major prediction, as
noted by Larson and Johnson, that the interpretation of this type of
sentence is sensitive to island effects (see Johnson 1988 for the
relevant examples and discussion). This prediction, which is indeed a
true one, raises a major problem for the syntax of Modern Greek
gerunds. If we assume that a similar analysis can be proposed for
gerunds in Modern Greek then movement of the operator out of the
adjunct would violate the adjunct condition and yield ungrammatical
results. In the examples (33)-(38) the gerund always has scope over one
of the clauses in the structure excluding all the others. This fact is an
argument in favour of the analysis in terms of movement of a covert
operator in the sense that it makes it necessary to understand scope in
this particular context as the relation between an operator and the
variable it is associated with (i.e. that it binds), rather than in terms of
C-command or any other command-type relation. This fact is of
crucial importance given the theory of adjunction we are adopting in
this work, to which I turn in a moment. Suppose that this analysis is
correct and Modern Greek gerunds truly contain a phonologically null
temporal operator (a silent when or while); how can we account for the
improper movement of the operator out of the gerund? In order to
answer this question let us turn first to the nature of structures formed
by adjunction. Kayne (1994) proposes that there is no principled
difference between a specifier and an adjoined element. Under this
13 Example adapted from Johnson 1988.
assumption and given a phrase marker like (40) where B is adjoined to
A, if B represents the gerund clause of our examples and A is, say, a
VP or IP, then no locality problem arises if we move the operator to
the first superordinate CP position.14
(40)
[Phrase marker in which B is adjoined to A; the tree diagram is not recoverable here.]
This type of movement requires that the B adjunct be a CP projection,
for otherwise the derivation would be ruled out as an ECP violation,
whereas here antecedent government is satisfied. It is also interesting to
observe that even in (41) the gerund can still be associated with the
matrix clause, in the interpretation that the learning event takes place
when the learner steps out of her house.15
(41)
Ematha oti o Kostas ipe oti o Nikos skotose to Gianni
Learned/I that the K. said that the N. killed the G.
vgainondas apo to spiti.
coming out of the house
I learned that Kostas said that Nikos killed Giannis while
getting out of the house.
If my analysis so far is correct we have to assume that only the
operator itself can bind an event variable, and, crucially, not its trace
14 Recall that we analyse gerunds as indefinites, thus allowing material
from within the gerund clause to be extracted.
15 Predictably, this reading is somewhat more difficult to obtain. It is
noteworthy that, in general, speakers require a clear pause before the adjunct
in that reading; this requirement is weakened, though, if the choice of lexical
items is such that the association of the gerund with another clause is
unlikely.
(t_op), since to satisfy the ECP the operator has to move stepwise
through the specifiers of each of the embedded CPs. If t_op were to be a
potential binder for the event variable of each verb, the whole structure
would be uninterpretable and the derivation would crash as a violation
of the bijection principle of Koopman and Sportiche (1984).16
Returning to our example (38), under this analysis this example
should be problematic since under our assumption that there is no
principled, configurational difference between adjuncts and specifiers,
nothing would prevent the operator contained in the gerund clause from
moving to the specifier of the clausal complement of the preposition
prin. Recall however that the analysis proposed here crucially assumes
that these temporal operators are also present in other temporal clauses,
including before-clauses. Therefore it is impossible for the temporal
operator of the gerund clause to move into the position that is already
occupied by the operator originating in the prin-clause. Consequently in
sentence (38) the only interpretation of the prin-clause with respect to
the matrix is a narrow scope interpretation, which means that the time
that prin 'before' compares can only be construed with one of the
embedded clauses but crucially not with the gerund or the matrix clause.
3.2.3 Manner and Modal Gerunds
The analysis presented so far covers mainly temporal (and aspectual)
gerunds. Manner gerunds behave in almost the same way. Consider (41)
in a manner reading of the gerund. Suppose that (41) is uttered in order
to describe a particular scene of a gang fight where Nikos killed Gianni
as he (Nikos) was shooting his way out of the house. I propose that
this interpretation is not merely the result of the gerund being
adjoined to the lowest VP, but arises because the temporal operator
moves to the Spec of the most deeply embedded CP and no further
up. Strictly speaking, these should be considered as two relatively
independent processes. For one thing, the gerund has a specific
dependent temporal interpretation and this must somehow be accounted
16 This is quite natural. The operator and its trace are non-distinct under the
copy theory of movement, since they share the same index.
for.17 Its adjunct status requires a different mechanism from those given
in Tsoulas (1994a, 1994b) for the interpretation of indefinite clausal
constituents. The data examined there involved, crucially, sentential
complements. Thus, although the adjunction site is still crucial to the
interpretation, it is the temporal operator that determines, in a complex
structure, with respect to which adjunction site the gerund clause
will be interpreted.18 Consider now (42), in which the gerund clearly
denotes manner:
(42)
Ematha oti o Kostas ipe oti o Nikos skotose to Gianni
I learned that the K. said that the N. killed the/Acc G.
pirovolondas ton.
shooting him
I learned that Kostas said that Nikos killed Giannis shooting
him.
17 An indefinite one, as we said above. The morphological expression of
temporal indefiniteness in this case is quite a distinct matter. Along the
lines of Tsoulas 1994a, if the generalisation concerning the morphological
realisation of temporal indefiniteness is correct, we infer from the
existence of special bound morphology on the verb that the [-DEFINITE]
feature is realised under I (or T). This generalisation states that temporal
(clausal) indefiniteness can be realised either in I or in C, and either as bound
morphology on the verb or as an independent word; moreover, whenever
temporal indefiniteness is realised as a bound morpheme it is necessarily
realised under I. These facts, in conjunction with the ones about temporal
indefiniteness in French presented in Tsoulas 1994a, b, 1995, raise a serious
problem: they show quite clearly that the morphological realisation
site, differing between C° and I (T), is not really subject to parametric
variation, since the two options exist within the same language, French as
well as Modern Greek. I do not really understand the reasons for this
optionality for the moment. They might have to do with the availability of
control into the indefinite clausal constituent, but even this line of
reasoning is compromised by the Modern Greek data, since in Modern Greek
control is available both in subjunctives (Indefiniteness in C) and Gerunds
(Indefiniteness in I). I will leave the matter here for this paper and postpone
a more detailed examination for further research.
18 Semantically this account is also supported because of its
compositionality.
It could be objected that in this case the previous account somehow
fails to capture the fact that the gerund can only be associated with the
lowest VP. In a way, it is entailed by the lexical meaning of each item
that the gerund says something about the manner in which the killing
took place. This is not strictly true, however: it is also conceivable that
the clitic pronoun ton does not in fact refer to the DP to Gianni (the
killed man) but rather it picks out some other antecedent from the
preceding discourse. In this case, assuming for concreteness that the
temporal operator has moved to the [Spec CP] of the matrix, the
intended meaning is that the speaker learned about the facts reported
when she was shooting someone. This becomes even clearer in (43).
(43)
Akousa oti o Kostas ipe oti o Nikos skotose to Gianni
Heard/I that the K. said that the N. killed the G.
pirovolondas tin.
shooting her
The replacement of the masculine ton by a feminine form prevents its
association with any of the DPs present in the sentence. (43) remains
however grammatical, within, of course, the appropriate context.
The same considerations apply also to modal gerunds though the
facts get somewhat more complicated in this case, for reasons I don't
fully understand. Consider the following examples (partly adapted from
Stump 1985). In this set of examples we show Modal gerundival
clauses adjoined to various positions in the complex structures.
Interestingly, the temporal patterns shown are not homogeneous. They
differ in that the gerund clause in the examples (48)-(52) cannot be
freely associated with any of the other clauses in the complex structure.
(44)
Forondas afta ta rouha trelene olo ton kosmo.
wearing these the clothes he/she was driving mad all the people
Wearing this outfit (s)he was driving everybody crazy.
(45)
Akousa oti o Kostas ipe oti o Nikos itan sigouros oti
Heard/I that the K. said that the N. was sure that
forondas afta ta rouha tha trelenotan olos o kosmos.
wearing these the clothes would be driven mad all the people
I heard that Kostas said that Nikos was sure that wearing
this outfit, he would drive everybody mad.
(46)
Pernondas to farmako se kanoniki dosi,
Taking this drug in normal dose
vlepis grigora apotelesmata.
see/you quick results
You see prompt results if you take this drug in normal dose.
(47)
Vlepis grigora apotelesmata,
See/you quick results
pernondas to farmako se kanoniki dosi.
taking this drug in normal dose
You see prompt results if you take this drug in normal dose.
(48)
Akousa oti o Kostas ipe oti o Nikos itan sigouros oti
Heard/I that the K. said that the N. was sure that
pernondas to farmako se kanoniki dosi,
taking the drug in normal dose
ta apotelesmata ine theamatika.
the results are spectacular
I heard that Kostas said that Nikos was sure that you see
prompt results if you take this drug in normal dose.
(49)
Echondas makria heria o Nikos ftanei efkola to tavani.
Having long arms the N. reaches easily the ceiling
Having long arms Nikos reaches easily the ceiling
(50)
*O Nikos ftanei efkola to tavani, echondas makria heria.
The N. reaches easily the ceiling, having long arms
Having long arms Nikos reaches easily the ceiling
(51)
O Giannis kseri oti i Eleni ipe oti echondas makria heria
The G. knows that the/fem E. said that having long arms
ftanei efkola to tavani.
reaches/she easily the ceiling
Giannis knows that Eleni said that having long arms she
can easily reach the ceiling
(52)
?O Giannis kseri oti i Eleni ipe oti ftanei efkola
The G. knows that the/fem E. said that reaches/she easily
to tavani, echondas makria heria.
the ceiling having long arms
Giannis knows that Eleni said that he/she reaches the ceiling
easily, having long arms.
Stump (1985) points out that a subclass (his "Weak" Adjuncts) of
modal gerunds generally behaves like if-clauses.19 In the above
examples these correspond to the sentences in (44)-(47). We are
interested here in their temporal interpretation and in whether the patterns
observed above also hold of this type of gerund clause. This is indeed
the case: in (44)-(47) the adjunct can be construed with each one of the
clauses in the complex structure. From this point of view, then, we can
consider them as when-clauses, containing an empty temporal operator.
This is not the case however in the examples (48)-(52) (Stump's
"strong" Adjuncts). In these cases the adjunct can only be construed
with the lowest clause. This difference can be traced to the
stage/individual level status of the predicate. From the perspective of
temporal interpretation, this fact does not undermine our proposal that
there is a temporal operator, since, as I pointed out earlier, we have to
19 Stump's discussion is broader. He considers all sorts of free adjuncts,
including gerunds; we restrict our attention here to adjuncts of the latter
type and consequently adapt some of his observations. We must also point
out that Stump does not use our Manner - Temporal - Modal distinction,
which is intended to make more apparent the import of the syntax, provided
that each part of the distinction corresponds to a specific syntactic
configuration. Stump's aim is rather to discuss the interpretation of the
apparently homogeneous class of free adjuncts from the points of view of
Modality, Tense, and Aspect.
account for the dependent temporal status of the adjunct. Stage-level
predicates seem to allow the operator all possible scope options whereas
individual-level predicates only admit narrowest scope. Consider
however the effect of preposing the adjunct in (52) as in (53):
(53)
Echondas makria heria, o Giannis kseri oti i Eleni ipe oti
having long arms the G. knows that the/fem E. said that
ftanei efkola to tavani.
reaches/she easily the ceiling
In the most natural interpretation of (53) the adjunct is construed
with the matrix clause.20 Consequently, in this case the operator must
have wide scope. It seems that individual-level gerundival adjuncts have
to be construed with the closest clause (downwards) rather than with the
most deeply embedded one, as would have been required if the operator had
to take narrow scope. Somehow, then, this adjunct belongs to this clause in a
tighter way. Why is this so? I want to propose here that in these
cases the gerund is topicalised within its clause. It is moved to a Top
position located at the complement of C. Naturally, from this
position the temporal operator, if this type of gerund contains one,
cannot move to the superordinate clause without violating the ECP.
This proposal naturally explains some of the effects of the postposition
of the adjunct as in (50). Assuming that the Top position is normally
to the left of IP as shown also in Tsimpli (1992), (50) is ruled out as
ungrammatical by the fact that the adjunct fails to be topicalised.21 The
20 It should be noted that (51) is judged somewhat strange by some speakers
(including myself). I think this relative deviance can be attributed to the
nature of the predicate of each of the two clauses. The matrix predicate is
stage level whereas the predicate of the embedded clause is individual level.
Due partly to the embedded tense (habitual present) the embedded clause is
interpreted as a generic sentence. Consequently, the modal gerund is more
'naturally' associated with the embedded rather than with the matrix,
contrary to what is required by its position.
21 Whether topicalisation involves movement or not is a question I will
not address here. I will follow Chomsky 1977, Cinque 1991, Tsimpli 1992
in assuming that topicalised phrases are base-generated in their surface
position, contrary to focused elements. My analysis would also be
compatible with a movement approach to topicalisation if one wants to
question that this analysis raises is why only this type of gerund
adjunct (strong adjuncts) must undergo topicalisation. Unfortunately I
don't have a satisfactory answer to this question for the moment.
Tentatively, I would like to suggest, as a first approximation, that the
reason for this might have something to do with the fact that they
derive from individual-level predicates whose interpretation is
independent from any time intervals. They are somehow presupposed as
topics generally are. Further refinements to this proposal are, no doubt,
necessary. Space limitations prevent me from discussing this proposal
further and I leave it for future research.
To sum up, the syntactic behaviour of Modern Greek gerunds does
not exactly parallel their semantic properties. They do not divide
syntactically into manner, temporal, and modal. Manner and temporal
gerunds pattern in the same way as far as temporal interpretation is
concerned and are opposed to modal gerunds.22 The former show a
considerable liberty in their temporal interpretation, which we accounted
for by means of an abstract operator, whereas the latter are much more
restricted in their scope options. The reason for this, I argued, is that
they are topicalised in their clause.
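The division just summarised can be made concrete with a small sketch. The Python fragment below is purely illustrative and forms no part of the analysis itself: the names Clause, spec_cp_filled and possible_construals are invented here, and the model rests on simplifying assumptions, namely that clauses are listed from the matrix downwards, that a temporal or manner gerund may be construed with its host clause or any superordinate clause whose Spec CP is not already occupied by another temporal operator, and that a gerund built on an individual-level predicate is confined to the clause in which it is topicalised.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Clause:
    """One clause of a complex sentence (hypothetical toy representation)."""
    label: str
    # True if Spec CP already hosts another temporal operator,
    # as assumed in the text for the prin-clause.
    spec_cp_filled: bool = False


def possible_construals(clauses: List[Clause], host: int,
                        individual_level: bool) -> List[str]:
    """Return the labels of the clauses the gerund may be construed with.

    clauses are ordered from the matrix (index 0) to the most deeply
    embedded clause; host is the index of the clause containing the gerund.
    """
    if individual_level:
        # Strong adjunct: topicalised in its clause, so the operator
        # (if there is one) cannot leave that clause.
        return [clauses[host].label]
    # Temporal, manner and weak modal gerunds: the operator may land in
    # Spec CP of the host clause or of any superordinate clause,
    # provided that position is not already filled.
    return [c.label for c in clauses[: host + 1] if not c.spec_cp_filled]


# Example (41): a temporal gerund in the lowest of three clauses.
sentence_41 = [Clause("learned"), Clause("said"), Clause("killed")]
print(possible_construals(sentence_41, host=2, individual_level=False))
# ['learned', 'said', 'killed']

# Examples (52) and (53): an individual-level gerund stays where it is.
sentence_52 = [Clause("knows"), Clause("said"), Clause("reaches")]
print(possible_construals(sentence_52, host=2, individual_level=True))
# ['reaches']
print(possible_construals(sentence_52, host=0, individual_level=True))
# ['knows']
```

The sketch deliberately ignores stepwise movement and the bijection principle; it records only which landing sites remain available under the assumptions stated above.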
4. Control in Gerunds
4.1 ECM, Argumenthood and the Subject of Gerunds
In this section I want to examine some issues arising with respect to
the determination of the reference of the subject of gerund clauses in
Modern Greek. Lexical subjects are generally not licensed in Modern
Greek gerunds. As we saw above, gerund clauses can apparently never
function as arguments. It would therefore be natural to suppose that
they are never subject to Exceptional Case Marking. Consequently, even
sentences like (54), which appear, prima facie, to be ECM structures
argue that argument topicalisation is different from adjunct topicalisation,
for reasons such as predication
22 Roughly speaking, this corresponds to Stump's Strong - Weak
distinction.
have in fact to be distinct in some way or other from true ECM
constructions.
(54)
Thimamai ton Kosta odigondas to aftokinito.
Remember/I the K. driving the car
I remember Kostas driving the car.
The DP ton Kosta can be cliticised on the main verb:
(55)
Ton thimamai odigondas to aftokinito.
Him remember/I driving the car
I remember him driving the car.
Furthermore, if the entire gerund, with the object, is topicalised
then the object must obligatorily be linked to a resumptive preverbal
clitic on the main verb ((56) and its schematic representation in (57)).23
We can postulate that the clitic has moved to the preverbal position
from its basic post-verbal position. This must be so since the only
context in Modern Greek in which postverbal clitics are found is
imperatives.
(56)
Ton Kosta odigondas to aftokinito ton thimamai.
The K. driving the car HIM remember/I
(57)
[Ton Kosta]_i [odigondas to aftokinito]_j ton_k thimamai t_i t_j t_k
Ton in (56) and (57) is the resumptive pronoun that the topicalised
element is linked to. These can be considered as clitic doubling
constructions.
23 This is the standard pattern of Topicalisation in Modern Greek. See also
Tsimpli 1992.
There are however some more difficult cases which tend to suggest
that the DP object may in fact also be part of the gerund clause.
Consider first, imperatives:
(58) a
Ton Kosta odigondas to aftokinito thimisou.
The/Acc K. driving the car remember/imp
b
Ton Kosta odigondas to aftokinito thimisou ton.
The/Acc K. driving the car remember/imp him
c
Ton Kosta odigondas to aftokinito thimisou to.
The/Acc K. driving the car remember/imp it
d
Ton Kosta thimisou ton odigondas to aftokinito.
The/Acc K. remember/imp him driving the car
Imperatives, which are the only context where the resumptive clitic
could appear post-verbally, in fact show a different behaviour. In (58a)
it is clear that what has been topicalised is one constituent, namely, the
gerund clause. (58b) is what the sentence would have been had the only
topicalised constituent been the object. Finally (58c) shows that the
only way to express (58a) and still have a resumptive postverbal clitic
would require the latter to be in the neuter form to 'it', corresponding to
the meaning in (58e).
(58) e
Remember the event (situation) in which Kostas was driving
the car.
(58d) shows topicalisation of the object alone, leaving the entire gerund
clause behind. The following examples also raise the same problem:
(59)
Ton Kosta odigondas to aftokinito (ton) ida ke trelathika.
The/Acc K driving the car (him) saw/I and went/I mad
I saw Kostas driving the car and went mad.
(60)
Ton Kosta magirevondas (ton) thimithika ke eskasa sta gelia.
The/Acc K. cooking (him) remembered/I and burst/I in laughs
I remembered Kostas cooking and laughed.
These sentences show that, at least in some sense, our initial
assumption, which is also the widely accepted view, that gerunds are
always adjuncts and not subject to ECM is not accurate and must be
revised in order to account for this restricted argument status of gerund
clauses. It is restricted in the sense that only in some contexts, namely
as complements to verbs selecting indefinite clausal complements, can
they act as arguments.24 The account of ECM that I am adopting here
is the one presented in Tsoulas (forthcoming), and briefly outlined
below: I take ECM to involve raising of the subject of the non-finite,
Indefinite clausal complement to the specifier of the higher AgrO where
it can check accusative Case. In order for this movement to be possible
we must ensure that the Minimal Domain which this DP belongs to is
properly extended. On the other hand I consider the selection of an
Indefinite clausal complement as a marked selectional option,25
therefore this feature (a head selects for a feature in the head of its
complement) must be checked off. Checking the [+Indefinite] feature of
the C head requires it to raise and adjoin to the selecting head, in a way
similar to that in which Verb raises to T. It follows that the relevant
Minimal Domain is extended accordingly, thus permitting the lower
subject to raise to the specifier of AgrO.26
24 It is precisely in those contexts in which they can alternate with
subjunctives - the other type of indefinite clause one can find in Modern
Greek. This is not true however cross-linguistically. It is not, for example,
generally true for English. I have no explanation for this difference for the
moment but I think it has to do with the fact that instead of infinitives
Modern Greek possesses only subjunctives, contrary to English. But I will
not pursue this question any further here.
25 I am considering any functional feature that has to be explicitly stated in
the lexical entry of an item as a marked one.
26 See Tsoulas 1995 for further technical details of this analysis.
Of course, in the vast majority of cases, when no lexical subject
can be licensed in the adjunct the subject of the gerund is PRO.27
4.2 The Influence of the Object
I want now to turn back to the contrast mentioned in the introduction
and consider the shift in the control pattern in the light of the above
discussion. Consider again examples (1) and (2), repeated here:
(1)
I Maria_i ide to Gianni_j [CP PRO_j zografizondas ena dendro].
The Maria saw the Gianni painting a tree
Maria saw Gianni while he was painting a tree.
(2)
I Maria_i ide to Gianni_j [CP PRO_i zografizondas to dendro].
The Maria saw the Gianni painting the tree
Maria saw Gianni while she was painting the tree.
Given the above discussion it is natural to explain the quite
puzzling contrast between (1) and (2) in terms of ECM: that is, in (1)
the verb ide Exceptionally Case marks inside the gerund clause, whereas
this is somehow impossible in (2). I will argue that it is the presence
of a definite object in (2) that is responsible for this situation. Recall
that ECM depends on the indefinite nature of the clausal constituent. If
the constituent is definite it is an absolute barrier to government and
consequently ECM is precluded.28 Thus, my proposal consists in the
claim that the definiteness of the object is transferred to the gerund and
furthermore to the entire CP. Krifka (1992) proposes a similar analysis
of the trade-off of grammatical features between verbal and nominal
predicates affecting the temporal constitution of the sentence. As we
saw at the beginning of this paper, gerunds differ from participles in
several respects. We then considered participles as definites. Notice also
27 Although the presence, or absence, of PRO from the inventory of
Modern Greek's grammatical categories is a rather controversial matter, no
one, to the best of my knowledge, has ever suggested that PRO could be
dispensed with in these constructions.
28 Put in Minimalist terms, raising of the embedded subject to the
superordinate Spec AgrO for accusative Case checking is impossible.
that there are no active participles, morphologically distinguished as
such, in Modern Greek. Transfer of a [+DEFINITE] feature to the gerund
can be said to transform it into a more participle-like form, though
somehow defective. This proposal, although very tentative and in need
of considerable refinement, seems however quite accurate in that it also
reflects the diachronic derivation of the gerund, which has presumably
resulted in a form of ambiguity in the specifications of the -ondas
morpheme.
One possible objection to this analysis could be that apparently
conflicting predictions are made by it and our analysis of the temporal
interpretation of gerunds in terms of movement of an abstract operator.
In fact the predictions are not conflicting because in one of those cases
the gerund clause is an argument whereas in the other it is an adjunct.
Of course, the question that still remains open is what happens with
participles that are themselves adjuncts; also, why is it that only
subject control is available in (2)? The answer to the latter question lies
within the general mechanisms of Control theory. I would like to adopt
here Williams' (1992) suggestion that in several cases of adjunct
control, the controller is identified as the logophoric centre of the
sentence. In the case of (2), the perceiver is more likely to be the
logophoric centre of the sentence in the sense of Sells (1987), and
consequently the controller.
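The chain of reasoning in this section can be restated schematically. The following Python fragment is only a sketch of that chain and not a claim about implementation: the names ControlOutcome and control_outcome are hypothetical, and the function simply assumes, as proposed above, that the definiteness of the object is transferred to the gerund CP, that a definite CP blocks ECM, and that when ECM is unavailable the controller is the matrix subject, taken to be the logophoric centre.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlOutcome:
    """Result of the toy derivation (all field names are illustrative)."""
    gerund_cp_definite: bool
    ecm_possible: bool
    understood_subject: str


def control_outcome(object_is_definite: bool) -> ControlOutcome:
    # Step 1: the [+/-DEFINITE] value of the object is transferred
    # to the gerund and from there to the entire CP.
    gerund_cp_definite = object_is_definite
    # Step 2: a definite CP is a barrier, so ECM is precluded.
    ecm_possible = not gerund_cp_definite
    # Step 3: with ECM, the understood subject of the gerund is the
    # matrix object; without it, the matrix subject controls PRO.
    understood_subject = "matrix object" if ecm_possible else "matrix subject"
    return ControlOutcome(gerund_cp_definite, ecm_possible, understood_subject)


# Example (1): indefinite object 'ena dendro'.
print(control_outcome(object_is_definite=False).understood_subject)
# matrix object

# Example (2): definite object 'to dendro'.
print(control_outcome(object_is_definite=True).understood_subject)
# matrix subject
```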
5. Conclusion
In this paper I have examined, as space limitations permitted, the
structure and functioning of Modern Greek Gerundival constructions. I
first argued that there are clear differences between gerunds and
participles. I considered then issues concerning the temporal
interpretation of gerunds and gave an account of it postulating the
existence of a covert temporal operator akin to the one used by Geis
(1970) for temporal prepositions in English; movement of this operator
determines the clause with which the gerund will be associated. I
assumed Kayne's (1994) theory of adjunction, which does not
distinguish configurationally between adjunct phrases and specifiers in
order to void a potential violation of the adjunct constraint (ECP). This
analysis, independently, constitutes evidence for a disjunctive
formulation of the ECP. I then considered issues of Control with
gerunds and concluded that although apparently restricted to adjoined
positions, gerunds can also be arguments and by virtue of their
indefinite nature, they permit ECM. This partly resolves the problem
raised by the sentences (1) and (2). On the other hand, following Krifka
(1992) I argued that there is some feature transfer from the object to the
gerund, which turns it into a more definite, participle-like constituent (but
see note 25) and which accounts for its control properties. The analysis
presented in this paper represents further evidence for the
Definite/Indefinite distinction at the clausal level. It should be noted
however that the rather intuitive account of the properties of
temporal/clausal indefiniteness given in this paper fails to do full
justice to the linguistic reality it is supposed to account for.29 In fact,
temporal indefiniteness turns out to be much more complex than this
intuitive account suggests. It also raises nontrivial questions, left
untouched in this paper, concerning the representation of indefiniteness,
temporal or otherwise. Crucially, it casts doubt on the widely accepted
DRT idea of Indefinites as variables and it is possible that a detailed
account of temporal indefiniteness will lead us to abandon this idea.30
Additional reasons for such a move, from a Situation Semantics point
of view, can be found in Cooper and Kamp (1991).
There are of course several other questions left open as indicated in
the course of the paper. I leave all these questions for further research.
29 See my 1994a, b, and forthcoming for some further details.
30 However, Manzini 1994 presents ideas very similar to the ones
presented in this paper and in Tsoulas 1994a, b and her analysis is fully cast
in the framework of Heim's 1982 analysis of Indefinites-as-variables.
REFERENCES
Chierchia, G. 1984. Topics in the Syntax and Semantics of Infinitives and
Gerunds. PhD Diss. UMass. Amherst.
Chomsky, N. 1977a. On WH-Movement. In Culicover, Peter et al. (eds.)
Formal Syntax. New York, Academic Press.
Chomsky, N. 1977b. Essays on Form and Interpretation, Elsevier, North
Holland.
Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht, Foris.
Chomsky, N. 1986a. Barriers, Cambridge, Mass. MIT Press.
Chomsky, N. 1986b. Knowledge of Language. New York, Praeger.
Chomsky, N. 1991. Some Notes on the Economy of Derivation and
Representation. In Freidin, R. (ed.) Principles and Parameters in
Comparative Grammar, Cambridge, Mass. MIT Press.
Chomsky, N. 1993. A Minimalist Program for Linguistic Theory. In Hale,
K. and S. J. Keyser (eds.) The view from Building 20, Cambridge,
Mass. MIT Press.
Chomsky, N. 1994. Bare Phrase Structure. MITOPL 4. Cambridge, MIT.
Chomsky, N. 1995. The Minimalist Program. Cambridge, Mass. MIT Press.
Cinque, G. 1991. Types of A'-dependencies. Cambridge, Mass. MIT Press.
Cooper, R. and H. Kamp. 1991. Negation in Situation Semantics and
Discourse Representation Theory. In J. Barwise et al. (eds.) Situation
Theory and its Applications vol. 2. Stanford, CSLI.
Diesing, M. 1992. Indefinites. Cambridge, Mass. MIT Press.
Geis, M. 1970. Adverbial Subordinate Clauses in English, PhD Diss.
Cambridge, MIT.
Johnson, K. 1988. Clausal Gerunds, the ECP, and Government. Linguistic
Inquiry 19.583-609.
Joseph, B. D. and I. Philippaki-Warburton 1986. Modern Greek, Croom
Helm Descriptive Grammars, London, Croom Helm.
Heim, I. 1982. The Semantics of Definite and Indefinite Noun Phrases, PhD
Diss. UMass. Amherst.
Higginbotham, J. 1985. On Semantics. Linguistic Inquiry 16.547-593.
Higginbotham, J. 1992. Reference and Control. In Larson, R. K. et al.
(eds.) Control and Grammar. Dordrecht, Kluwer. 79-108.
Householder, F. W., K. Kazazis and A. Koutsoudas 1964. Reference
Grammar of Literary Dhimotiki. International Journal of American
Linguistics. Publication 31 of the Indiana University Research
Center in Anthropology, Folklore and Linguistics.
Kayne, R. S. 1994. The Antisymmetry of Syntax. Cambridge, Mass. MIT
Press.
Koopman, H. and Sportiche D. 1984. Le Principe de Bijection. In
Communications, 40 Paris, Editions du Seuil.
Krifka, M. 1992. Thematic Relations as Links between Nominal Reference
and Temporal Constitution. In Sag, Ivan A. and Anna Szabolcsi
(eds.) Lexical Matters. Stanford, CSLI.
Laka, I. 1990. Negation in Syntax: On the nature of functional categories
and projections. PhD Diss. Cambridge, MIT.
Manzini, M-R. 1993. Locality: A theory and some of its empirical
consequences. Cambridge, Mass. MIT Press.
Manzini, M-R. 1994. The Subjunctive. Forthcoming in Nash, L. and G.
Tsoulas (eds.) Paris-8 Working Papers in Linguistics no. 1.
Mirambel, A. 1939. Grammaire du Grec Moderne. Paris, Klincksieck.
Ross, J. R. 1967. Constraints on Variables in Syntax. PhD Diss.
Cambridge, MIT.
Portner, P. 1991. Interpreting Gerunds in Complement Positions,
Proceedings of the Xth West Coast Conference on Formal Linguistics.
Stanford, CSLI.
Seiler, H. 1952. L'aspect et le temps dans le verbe néo-grec. Paris, Les
Belles Lettres.
Stowell, T. A. 1982. The Tense of Infinitives. Linguistic Inquiry 13.561-570.
Stowell, T. A. 1993. Syntax of Tense. Ms UCLA.
Stump, G. T. 1985. The Semantic Variability of Absolute Constructions.
Dordrecht, Reidel.
Szabolcsi, A. 1989. Noun Phrases and Clauses: Is DP analogous to CP? Ms
UCLA.
Terzi, A. 1992. PRO in Finite Clauses. A study of the Inflectional Heads of
the Balkan languages. PhD Diss, CUNY.
Tsimpli, I. 1992. Focusing in Modern Greek, Ms, University College
London.
Tsoulas, G. 1994a. Indefinite Clauses. Forthcoming in the Proceedings of
the XIIIth West Coast Conference on Formal Linguistics, Stanford,
CSLI.
Tsoulas, G. 1994b. Subjunctives as Indefinites. To appear in the
proceedings of the XX Incontro di Grammatica Generativa, Padova,
Italy.
Tsoulas, G. 1994c. Minimalism and Control. Ms Université Paris-8.
Tsoulas, G. (forthcoming) The Nature of the Subjunctive and the Formal
Grammar of Obviation. In K. Zagona (ed.) Selected Papers from the
XXVth Linguistic Symposium on Romance Languages. Amsterdam,
John Benjamins.
Tzartzanos, A. 1946. Neoelliniki Sintaksis (Modern Greek Syntax).
Thessaloniki, Kiriakidis.
Williams, E. 1992. Adjunct Control. In Larson, R. K. et al. (eds.) Control
and Grammar. Dordrecht, Kluwer. 297-322.
EDITORIAL STATEMENT
The editors welcome contributions for York Papers in Linguistics on
any linguistic topic. Camera-ready copy is produced using Microsoft
Word on an Apple Macintosh and it would be of considerable assistance
to the editors if contributors could submit copy on a disk in this
format. Failing this, Microsoft Word files on MS DOS format disks or
MacWrite files on a Macintosh disk, together with a hard copy of the
article, or an ASCII version of the text (in any disk format), again with
accompanying hard copy, would be acceptable.
SJ Harlow
AR Warner
Series editors
STYLE SHEET FOR YORK PAPERS IN LINGUISTICS
Heading in capitals centralized and with * footnote for author's
address, acknowledgements, etc. Then two blank lines and author
centralized. Then one blank line and major affiliation. (No punctuation
at ends of these lines). Then 2 blank lines and text.
Paragraphs indented. Use tabs for indent. No blank line between
paragraphs. Italics for citation forms, single quotes for senses and for
quotations, double quotes only for quotes within quotes.
Examples numbered consecutively. Numbers on the margin, in
parentheses, with a., b., etc. for sub examples. Please use tabs before
a., b., etc. and at the beginning (but only at the beginning) of example
text and gloss, as follows:
(1)	a.	J'ai lu [NP beaucoup d'articles] récemment
		I've read many (of) articles recently
	b.	Pierre s'est brouillé avec [NP trop de collègues]
		Pierre has argued with too-many (of) colleagues
Footnotes (after *) numbered consecutively and all placed at end of
text.
References in the text in this form: Dowty (1982: 28). But omit
parentheses in footnotes, and within other parentheses (like this: Dowty
1982: 28).
Major subheadings numbered and bold with word initial caps:
number - stop - heading - no final punctuation, thus:
2. Fraggle Rock
One blank line before a major subheading.
Bibliography following footnotes, with centered heading
REFERENCES, and according to the conventions used in Language
except that the date should be in parentheses, and the main title should
be in italics and have initial capitals. Subordinate titles should be
capitalized like ordinary text. Single quotes should not be used round
titles:
Name, I. Q. (1998) My Wonderful Book. Someplace: Joe Publisher
Name, I. Q. (1998) A wonderful paper by me. York Papers in
Linguistics 43.71-78.
Name, I. Q. (1998) A very wonderful paper by me. In Little, K. and
Scrubb, J. (eds.) Delights of Language. Hopetown: Screwsome
Press Inc. 324-328.
BACK ISSUES OF YORK PAPERS IN LINGUISTICS
The following are still available, though in some cases only two or three
copies are left.
1992 YPL 16 (202pp.)	£8.
1991 YPL 15 (284pp.) (in honour of Jack Carnochan)	£10.
1989 YPL 14 (307pp.)	£10.
1989 YPL 13 (365pp.) (Festschrift for R B Le Page)	£10.
1986 YPL 12 (189pp.)	£5.
1984 YPL 11 (333pp.)	£5.
1983 YPL 10 (213pp.)
1976 YPL 6 (211pp.); 1975 YPL 5 (225pp.); 1974 YPL 4 (260pp.);
1973 YPL 3 (183pp.)
O. each
To order:
Send
- details of the volume(s) you want
- a cheque or draft in pounds sterling made payable to
University of York
- your name and address
To
York Papers in Linguistics
Department of Language & Linguistic Science
University of York, Heslington
York YO1 5DD, U.K.
Prices include surface mail. Please add £3.50 per volume for air mail.
U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement (OERI)
Educational Resources Information Center (ERIC)
NOTICE
REPRODUCTION BASIS
This document is covered by a signed "Reproduction Release
(Blanket)" form (on file within the ERIC system), encompassing all
or classes of documents from its source organization and, therefore,
does not require a "Specific Document" Release form.
This document is Federally-funded, or carries its own permission to
reproduce, or is otherwise in the public domain and, therefore, may
be reproduced by ERIC without a signed Reproduction Release
form (either "Specific Document" or "Blanket").
(9/92)