The Handbook of Second
Language Acquisition
EDITED BY
Catherine J. Doughty and Michael H. Long
Blackwell Handbooks in Linguistics
This outstanding multi-volume series covers all the major subdisciplines within
linguistics today and, when complete, will offer a comprehensive survey of
linguistics as a whole.
Already published:
The Handbook of Child Language
Edited by Paul Fletcher and Brian MacWhinney
The Handbook of Phonological Theory
Edited by John A. Goldsmith
The Handbook of Contemporary Semantic Theory
Edited by Shalom Lappin
The Handbook of Sociolinguistics
Edited by Florian Coulmas
The Handbook of Phonetic Sciences
Edited by William J. Hardcastle and John Laver
The Handbook of Morphology
Edited by Andrew Spencer and Arnold Zwicky
The Handbook of Japanese Linguistics
Edited by Natsuko Tsujimura
The Handbook of Linguistics
Edited by Mark Aronoff and Janie Rees-Miller
The Handbook of Contemporary Syntactic Theory
Edited by Mark Baltin and Chris Collins
The Handbook of Discourse Analysis
Edited by Deborah Schiffrin, Deborah Tannen, and Heidi E. Hamilton
The Handbook of Language Variation and Change
Edited by J. K. Chambers, Peter Trudgill, and Natalie Schilling-Estes
The Handbook of Historical Linguistics
Edited by Brian D. Joseph and Richard D. Janda
The Handbook of Language and Gender
Edited by Janet Holmes and Miriam Meyerhoff
The Handbook of Second Language Acquisition
Edited by Catherine Doughty and Michael H. Long
© 2003 by Blackwell Publishing Ltd
350 Main Street, Malden, MA 02148-5018, USA
108 Cowley Road, Oxford OX4 1JF, UK
550 Swanston Street, Carlton South, Melbourne, Victoria 3053, Australia
Kurfürstendamm 57, 10707 Berlin, Germany
The right of Catherine J. Doughty and Michael H. Long to be identified as
the Authors of the Editorial Material in this Work has been asserted in
accordance with the UK Copyright, Designs, and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any
means, electronic, mechanical, photocopying, recording or otherwise,
except as permitted by the UK Copyright, Designs, and Patents Act
1988, without the prior permission of the publisher.
First published 2003 by Blackwell Publishing Ltd
Library of Congress Cataloging-in-Publication Data
The handbook of second language acquisition / edited by Catherine J.
Doughty and Michael H. Long.
p. cm. – (Blackwell handbooks in linguistics ; 14)
Includes bibliographical references and index.
ISBN 0-631-21754-1 (hardcover : alk. paper)
1. Second language acquisition. I. Doughty, Catherine. II. Long,
Michael H. III. Series.
P118.2 .H363 2003
418–dc21
2002154756
A catalogue record for this title is available from the British Library.
Set in 10/12pt Palatino
by Graphicraft Limited, Hong Kong
Printed and bound in the United Kingdom
by TJ International, Padstow, Cornwall
For further information on
Blackwell Publishing, visit our website:
http://www.blackwellpublishing.com
Contents
List of Contributors
Acknowledgments
I Overview
1 The Scope of Inquiry and Goals of SLA
Catherine J. Doughty and Michael H. Long
II Capacity and Representation
2 On the Nature of Interlanguage Representation:
Universal Grammar in the Second Language
Lydia White
3 The Radical Middle: Nativism without Universal Grammar
William O’Grady
4 Constructions, Chunking, and Connectionism: The
Emergence of Second Language Structure
Nick C. Ellis
5 Cognitive Processes in Second Language Learners and
Bilinguals: The Development of Lexical and Conceptual
Representations
Judith F. Kroll and Gretchen Sunderman
6 Near-Nativeness
Antonella Sorace
III Environments for SLA
7 Language Socialization in SLA
Karen Ann Watson-Gegeo and Sarah Nielsen
8 Social Context
Jeff Siegel
9 Input and Interaction
Susan M. Gass
10 Instructed SLA: Constraints, Compensation,
and Enhancement
Catherine J. Doughty
IV Processes in SLA
11 Implicit and Explicit Learning
Robert DeKeyser
12 Incidental and Intentional Learning
Jan H. Hulstijn
13 Automaticity and Second Languages
Norman Segalowitz
14 Variation
Suzanne Romaine
15 Cross-Linguistic Influence
Terence Odlin
16 Stabilization and Fossilization in Interlanguage
Development
Michael H. Long
V Biological and Psychological Constraints
17 Maturational Constraints in SLA
Kenneth Hyltenstam and Niclas Abrahamsson
18 Individual Differences in Second Language Learning
Zoltán Dörnyei and Peter Skehan
19 Attention and Memory during SLA
Peter Robinson
20 Language Processing Capacity
Manfred Pienemann
VI Research Methods
21 Defining and Measuring SLA
John Norris and Lourdes Ortega
22 Data Collection in SLA Research
Craig Chaudron
VII The State of SLA
23 SLA Theory: Construction and Assessment
Kevin R. Gregg
24 SLA and Cognitive Science
Michael H. Long and Catherine J. Doughty
Index
Contributors
Niclas Abrahamsson
Stockholm University
Craig Chaudron
University of Hawai’i
Robert M. DeKeyser
University of Pittsburgh
Zoltán Dörnyei
University of Nottingham
Catherine J. Doughty
University of Hawai’i
Nick C. Ellis
Bangor University of Wales
Susan M. Gass
Michigan State University
Kevin Gregg
Momoyama Gakuin/St Andrew’s University
Jan H. Hulstijn
University of Amsterdam
Kenneth Hyltenstam
Stockholm University
Judith F. Kroll
Pennsylvania State University
Michael H. Long
University of Hawai’i
Sarah Nielsen
Las Positas College
John Norris
Northern Arizona University
Terence Odlin
Ohio State University
William O’Grady
University of Hawai’i
Lourdes Ortega
Northern Arizona University
Manfred Pienemann
Paderborn University
Peter Robinson
Aoyama Gakuin University
Suzanne Romaine
Merton College, University of Oxford
Norman Segalowitz
Concordia University
Jeff Siegel
University of New England, Armidale, and University of Hawai’i
Peter Skehan
King’s College, London
Antonella Sorace
University of Edinburgh
Gretchen Sunderman
University of Illinois at Urbana-Champaign
Karen Ann Watson-Gegeo
University of California, Davis
Lydia White
McGill University
Acknowledgments
The editors gratefully acknowledge the following, who provided valuable
reviews of one or more of the chapters: Alan Beretta, Craig Chaudron, Richard
Cameron, Robert DeKeyser, Susan Gass, Kevin Gregg, Jan Hulstijn, Georgette
Ioup, Peter Robinson, Dick Schmidt, Bonnie Schwartz, Larry Selinker, Mary Tiles,
Michael Ullman, Jessica Williams, Lydia White, Kate Wolfe-Quintero, and several individuals who prefer to remain anonymous. The support and efficiency
of Steve Smith, Sarah Coleman, and Fiona Sewell at Blackwell Publishing were
greatly appreciated.
I
Overview
1
The Scope of Inquiry and
Goals of SLA
CATHERINE J. DOUGHTY AND
MICHAEL H. LONG
1 The Scope of Inquiry
The scope of second language acquisition (SLA) is broad. It encompasses basic
and applied work on the acquisition and loss of second (third, etc.) languages
and dialects by children and adults, learning naturalistically and/or with the aid
of formal instruction, as individuals or in groups, in foreign, second language,
and lingua franca settings (see, e.g., R. Ellis, 1994; Gass and Selinker, 2001;
Gregg, 1994; Jordens and Lalleman, 1988; W. Klein, 1986; Larsen-Freeman,
1991; Larsen-Freeman and Long, 1991; Ritchie and Bhatia, 1996; Towell and
Hawkins, 1994). Research methods employed run the gamut from naturalistic
observation in field settings, through descriptive and quasi-experimental studies
of language learning in classrooms or via distance education, to experimental
laboratory work and computer simulations.
Researchers enter SLA with graduate training in a variety of fields, including linguistics, applied linguistics, psychology, communication, foreign language
education, educational psychology, and anthropology, as well as, increasingly,
in SLA per se, and bring with them a wide range of theoretical and methodological allegiances. The 1980s and 1990s witnessed a steady increase in sophistication in the choice of data-collection procedures and analyses employed,
some of them original to SLA researchers (see, e.g., Birdsong, 1989; Chaudron,
this volume; Doughty and Long, 2000; Faerch and Kasper, 1987; Sorace, 1996;
Tarone, Gass, and Cohen, 1994), and also in the ways SLA is measured
(Bachman and Cohen, 1998; Norris and Ortega, this volume). However, longitudinal studies of children (e.g., Huebner, 1983a, 1983b; F. Klein, 1981; Sato,
1990; Watson-Gegeo, 1992) and adults (e.g., Iwashita, 2001; Liceras, Maxwell,
Laguardia, Fernandez, Fernandez, and Diaz, 1997; Schmidt, 1983) are distressingly rare; the vast majority of SLA studies are cross-sectional, with serious
resulting limitations on the conclusions that can be drawn on some important
issues. Theory proliferation remains a weakness, too, but the experience of
more mature disciplines in overcoming this and related teething problems is
gradually being brought to bear (see, e.g., Beretta, 1991; Beretta and Crookes,
1993; Crookes, 1992; Gregg, 1993, 1996, 2000, this volume; Gregg, Long, Jordan,
and Beretta, 1997; Jordan, 2002; Long, 1990a, 1993, forthcoming a).1
As reflected in the contributions to this volume (see also Robinson, 2001),
much current SLA research and theorizing shares a strongly cognitive orientation, while varying from nativist, both special (linguistic) and general, to various kinds of functional, emergentist, and connectionist positions. The focus is
firmly on identifying the nature and sources of the underlying L2 knowledge
system, and on explaining developmental success and failure. Performance
data are inevitably the researchers’ mainstay, but understanding underlying
competence, not the external verbal behavior that depends on that competence, is the ultimate goal. Researchers recognize that SLA takes place in a
social context, of course, and accept that it can be influenced by that context,
both micro and macro. However, they also recognize that language learning,
like any other learning, is ultimately a matter of change in an individual’s
internal mental state. As such, research on SLA is increasingly viewed as a
branch of cognitive science.
2 The Goals: Why Study SLA?
Second language acquisition – naturalistic, instructed, or both – has long been
a common activity for a majority of the human species and is becoming ever
more vital as second languages themselves increase in importance. In many
parts of the world, monolingualism, not bilingualism or multilingualism, is
the marked case. The 300–400 million people whose native language is English,
for example, are greatly outnumbered by the 1–2 billion people for whom it is
an official second language. Countless children grow up in societies where
they are exposed to one language in the home, sometimes two, another when
they travel to a nearby town to attend primary or secondary school, and a
third or fourth if they move to a larger city or another province for tertiary
education or for work.
Where literacy training or even education altogether is simply unavailable
in a group’s native language, or where there are just too many languages to
make it economically viable to offer either in all of them, as is the case in
Papua New Guinea and elsewhere in the Pacific (Siegel, 1996, 1997, 1999, this
volume), some federal and state governments and departments of education
mandate use of a regional lingua franca or of an official national language as
the medium of instruction. Such situations are sometimes recognized in state
constitutions, and occasionally even in an official federal language policy, as
in Australia (Lo Bianco, 1987); all mean that SLA is required of students, and
often of their teachers, as well.
Elsewhere, a local variety of a language may be actively suppressed or stigmatized, sometimes even by people who speak it natively themselves, resulting
in a need for widespread second dialect acquisition (SDA) for educational,
employment, and other purposes. Examples include Hawai’i Creole English
(Reynolds, 1999; Sato, 1985, 1989; Wong, 1999), Aboriginal English in Australia
(Eades, 1992; Haig, 2001; Malcolm, 1994), and African-American Vernacular
English in the USA (Long, 1999; Morgan, 1999; Rickford, 2000). In such cases,
a supposedly “standard” variety may be prescribed in educational settings,
despite the difficulty of defining a spoken standard objectively, and despite the
notorious track record of attempts to legislate language change. The prescribed
varieties are second languages or dialects for the students, and as in parts of the
Solomon Islands (Watson-Gegeo, 1992; Watson-Gegeo and Nielsen, this volume),
once again, sometimes for their teachers, too, with a predictably negative effect
on educational achievement. In a more positive development, while language
death throughout the world continues at an alarming pace, increasing numbers
of children in some countries attend various kinds of additive bilingual, additive bidialectal, or immersion programs designed to promote first language
maintenance, SLA, or cultural revitalization (see, e.g., Fishman, 2001; Huebner
and Davis, 1999; Phillipson, 2000; Sato, 1989; Warner, 2001).
SLA and SDA are not just common experiences for the world’s children, of
course. More and more adults are becoming second language or second dialect
learners voluntarily for the purposes of international travel, higher education,
and marriage. For increasing numbers of others, the experience is thrust upon
them. Involuntary SLA may take the fairly harmless form of satisfying a school
or university foreign language requirement, but regrettably often it has more
sinister causes. Each year, tens of millions of people are obliged to learn a second
language or another variety of their own language because they are members
of an oppressed ethnolinguistic minority, because they are forced to migrate across
linguistic borders in a desperate search for work, or worse, due to war, drought,
famine, religious persecution, or ethnic cleansing. Whatever they are seeking or
fleeing, almost all refugees and migrants need to reach at least a basic threshold proficiency level in a second language simply to survive in their new
environment. Most require far more than that, however, if they wish to succeed
in their new environment or to become members of the new culture. States
and citizens, scholars and laypersons alike recognize that learning a society’s
language is a key part of both acculturation and socialization. Finally, less
visibly, economic globalization and progressively more insidious cultural
homogenization affect most people, knowingly or not, and each is transmitted
through national languages within countries and through just a few languages,
especially English at present, at the international level.
Any experience that touches so many people is worthy of serious study,
especially when success or failure can so fundamentally affect life chances.
However, the obvious social importance of second language acquisition (SLA)
is by no means the only reason for researchers’ interest, and for many, not the
primary reason or not a reason at all. As a widespread, highly complex, uniquely
human, cognitive process, language learning of all kinds merits careful study
for what it can reveal about the nature of the human mind and intelligence. Thus, a
good deal of what might be termed “basic research” goes on in SLA without
regard for its potential applications or social utility.
In linguistics and psychology, for example, data on SLA are potentially
useful for testing theories as different from one another as grammatical nativism
(see, e.g., Eubank, 1991; Gregg, 1989; Liceras, 1986; Pankhurst, Sharwood-Smith,
and Van Buren, 1988; Schwartz, 1992; White, 1989; and chapters by Gregg,
Sorace, and White, this volume), general nativism (see, e.g., Eckman, 1996a;
O’Grady, 2001a, 2001b, this volume; Wolfe-Quintero, 1996), various types of
functionalism (see, e.g., Andersen, 1984; Eckman, 1996b; Mitchell and Myles,
1998, pp. 100–20; Rutherford, 1984; Sato, 1988, 1990; Tomlin, 1990), and
emergentism and connectionism (see, e.g., Ellis, this volume; Gasser, 1990;
MacWhinney, 2001). Research on basic processes in SLA draws upon and contributes to work on such core topics in cognitive psychology and linguistics as
implicit and explicit learning (e.g., DeKeyser, this volume; N. Ellis, 1993, 1994;
Robinson, 1997), incidental and intentional learning (e.g., Hulstijn, 2001, this
volume; Robinson, 1996), automaticity (e.g., DeKeyser, 2001; Segalowitz, this
volume), attention and memory (e.g., N. Ellis, 2001; Robinson, this volume;
Schmidt, 1995; Tomlin and Villa, 1994), individual differences (e.g., Segalowitz,
1997; Dörnyei and Skehan, this volume), variation (e.g., Bayley and Preston,
1996; R. Ellis, 1999; Johnston, 1999; Preston, 1989, 1996; Romaine, this volume;
Tarone, 1988; Williams, 1988; Young, 1990; Zobl, 1984), language processing
(e.g., Clahsen, 1987; Doughty, this volume; Harrington, 2001; Pienemann, 1998,
this volume), and the linguistic environment for language learning (e.g.,
Doughty, 2000; Gass, this volume; Hatch, 1978; Long, 1996; Pica, 1992), as well
as at least two putative psychological processes claimed to distinguish first
from second language acquisition, that is, cross-linguistic influence (see, e.g.,
Andersen, 1983a; Gass, 1996; Gass and Selinker, 1983; Jordens, 1994; Kasper,
1992; Kellerman, 1984; Kellerman and Sharwood-Smith, 1986; Odlin, 1989, this
volume; Ringbom, 1987; Selinker, 1969) and fossilization (see, e.g., Kellerman,
1989; Long, this volume; Selinker, 1972; Selinker and Lakshmanan, 1992). SLA
data are also potentially useful for explicating relationships between language
and thought; for example, through exploring claims concerning semantic and
cultural universals (see, e.g., Dietrich, Klein, and Noyau, 1995), or relationships between language development and cognitive development (Curtiss,
1982) – confounded in children, but not in SLA by adults. There is also a rich
tradition of comparisons among SLA, pidginization, and creolization (see, e.g.,
Adamson, 1988; Andersen, 1983b; Andersen and Shirai, 1996; Bickerton, 1984;
Meisel, 1983; Schumann, 1978; Valdman and Phillips, 1975).
In neuroscience, SLA data can help show where and how the brain stores
and retrieves linguistic knowledge (see, e.g., Green, 2002; Obler and Hannigan,
1996; Ullman, 2002); which areas are implicated in acquisition (see, e.g.,
Schumann, 1998); how the brain adapts to additional burdens, such as
bilingualism (see, e.g., Albert and Obler, 1978; Jacobs, 1988; Kroll, Michael,
and Sankaranarayanan, 1998; Kroll and Sunderman, this volume), or trauma
resulting in bilingual or multilingual aphasia (see, e.g., Galloway, 1981; Paradis,
1990); and whether the brain is progressively more limited in handling any
of those tasks. In what has become one of the most active areas of work in
recent years, SLA researchers seek to determine whether observed differences
in the success of children and adults with second languages arise because the
brain is subject to maturational constraints in the form of sensitive periods for
language learning (see, e.g., Birdsong, 1999; Bongaerts, Mennen, and van der
Slik, 2000; DeKeyser, 2000; Flege, Yeni-Komshian, and Liu, 1999; Hyltenstam
and Abrahamsson, this volume; Ioup, Boustagui, El Tigi, and Moselle, 1994;
Long, 1990b, forthcoming b; Schachter, 1996).
Basic research sometimes yields unexpected practical applications, and that
may turn out to be true of basic SLA research, too. Much work in SLA, however, has clear applications or potential applications from the start. The most
obvious of these is second (including foreign) language teaching (see, e.g.,
Doughty, 1991, this volume; Doughty and Williams, 1998; N. Ellis and Laporte,
1997; R. Ellis, 1989; de Graaff, 1997; Lightbown and Spada, 1999; Long, 1988;
Norris and Ortega, 2000; Pica, 1983; Pienemann, 1989; Sharwood-Smith, 1993),
since SLA researchers study the process language teaching is designed to
facilitate.2 For bilingual, immersion, and second dialect education, second
language literacy programs, and whole educational systems delivered through
the medium of a second language, SLA research findings offer guidance on
numerous issues. Examples include the optimal timing of L1 maintenance and
L2 development programs, the linguistic modification of teaching materials,
the role of implicit and explicit negative feedback on language error, and
language and content achievement testing.
SLA research findings are also potentially very relevant for populations
with special language-learning needs. These include certain abnormal
populations, such as Alzheimer’s patients (see, e.g., Hyltenstam and Stroud,
1993) and Down syndrome children, where research questions concerning so-called (first) “language intervention” programs are often quite similar to those
of interest for (second) “language teaching” (see, e.g., Mahoney, 1975;
Rosenberg, 1982). Other examples are groups, such as immigrant children, for
whom it is crucial that educators not confuse second language problems with
learning disabilities (see, e.g., Cummins, 1984); bilinguals undergoing primary
language loss (Seliger, 1996; Seliger and Vago, 1991; Weltens, De Bot, and van
Els, 1986); and deaf and hearing individuals learning a sign language, such as
American Sign Language (ASL), as a first or second language, respectively
(see, e.g., Berent, 1996; Mayberry, 1993; Strong, 1988). In all these cases, as
Bley-Vroman (1990) pointed out, researchers are interested in explaining not
only how success is achieved, but why – in stark contrast with almost uniformly successful child first language acquisition – at least partial failure is so
common in SLA.
NOTES
1 A seminar on theory change in SLA,
with readings from the history,
philosophy, and sociology of science
and the sociology of knowledge, is
now regularly offered as an elective
for M.A. and Ph.D. students in the
University of Hawai’i’s Department
of Second Language Studies. The
importance of such a “big picture”
methodology course in basic training
for SLA researchers – arguably
at least as great as that of the
potentially endless series of
“grassroots” courses in quantitative
and qualitative research methods
and statistics that are now routine –
will likely become more widely
recognized over time.
2 The utility of some work in SLA for
this purpose does not mean that
SLA is the only important source of
information, and certainly not that a
theory of SLA should be passed off
as a theory of language teaching.
Nor, conversely, does it mean, as has
occasionally been suggested, that SLA
theories should be evaluated by their
relevance to the classroom.
REFERENCES
Adamson, H. D. 1988: Variation Theory
and Second Language Acquisition.
Washington, DC: Georgetown
University Press.
Albert, M. L. and Obler, L. 1978: The
Bilingual Brain: Neuropsychological and
Neurolinguistic Aspects of Bilingualism.
San Diego: Academic Press.
Andersen, R. W. 1983a: Transfer to
somewhere. In S. M. Gass and
L. Selinker (eds), Language Transfer
in Language Learning. Rowley, MA:
Newbury House, 177–201.
Andersen, R. W. 1983b: Pidginization and
Creolization as Language Acquisition.
Rowley, MA: Newbury House.
Andersen, R. W. 1984: The one to one
principle of interlanguage construction.
Language Learning, 34 (4), 77–95.
Andersen, R. W. and Shirai, Y. 1996: The
primacy of aspect in first and second
language acquisition: the pidgin–
creole connection. In W. C. Ritchie and
T. K. Bhatia (eds), Handbook of Second
Language Acquisition. San Diego:
Academic Press, 527–70.
Bachman, L. and Cohen, A. D. 1998:
Interfaces between Second Language
Acquisition and Language Testing
Research. Cambridge: Cambridge
University Press.
Bayley, R. and Preston, D. R. (eds) 1996:
Second Language Acquisition and Linguistic
Variation. Philadelphia: John Benjamins.
Berent, G. P. 1996: The acquisition of
English syntax by deaf learners. In
W. C. Ritchie and T. K. Bhatia (eds),
Handbook of Second Language
Acquisition. San Diego: Academic
Press, 469–506.
Beretta, A. 1991: Theory construction in
SLA. Complementarity and
opposition. Studies in Second Language
Acquisition, 13 (4), 493–512.
Beretta, A. and Crookes, G. 1993:
Cognitive and social determinants in
the context of discovery in SLA.
Applied Linguistics, 14 (3), 250–75.
Bickerton, D. 1984: The language
bioprogram hypothesis and
second language acquisition. In
W. E. Rutherford (ed.), Language
Universals and Second Language
Acquisition. Amsterdam and
Philadelphia: John Benjamins,
141–61.
Birdsong, D. 1989: Metalinguistic
Performance and Interlinguistic
Competence. Berlin and New York:
Springer Verlag.
Birdsong, D. (ed.) 1999: Second Language
Acquisition and the Critical Period
Hypothesis. Mahwah, NJ: Lawrence
Erlbaum Associates.
Bley-Vroman, R. 1990: The logical
problem of foreign language learning.
Linguistic Analysis, 20 (1–2), 3–49.
Bongaerts, T., Mennen, S., and van der
Slik, F. 2000: Authenticity of
pronunciation in naturalistic second
language acquisition. The case of very
advanced late learners of Dutch as a
second language. Studia Linguistica, 54,
298–308.
Clahsen, H. 1987: Connecting theories of
language processing and (second)
language acquisition. In C. Pfaff (ed.),
First and Second Language Acquisition
Processes. Cambridge, MA: Newbury
House, 103–16.
Crookes, G. 1992: Theory format and
SLA theory. Studies in Second Language
Acquisition, 14 (4), 425–49.
Cummins, J. 1984: Bilingualism and Special
Education: Issues on Assessment and
Pedagogy. Clevedon: Multilingual
Matters.
Curtiss, S. 1982: Developmental
dissociation of language and cognition.
In L. K. Obler and L. Menn (eds),
Exceptional Language and Linguistics.
New York: Academic Press, 285–312.
DeKeyser, R. 2000: The robustness of
critical period effects in second
language acquisition. Studies in Second
Language Acquisition, 22 (4), 493–533.
DeKeyser, R. 2001: Automaticity and
automatization. In P. Robinson (ed.),
Cognition and Second Language
Instruction. Cambridge: Cambridge
University Press, 125–51.
Dietrich, R., Klein, W., and Noyau, C.
1995: The Acquisition of Temporality in
a Second Language. Amsterdam and
Philadelphia: John Benjamins.
Doughty, C. J. 1991: Second language
instruction does make a difference:
evidence from an empirical study
of SL relativization. Studies in
Second Language Acquisition, 13 (4),
431–69.
Doughty, C. J. 2000: Negotiating the
L2 linguistic environment. University
of Hawai’i Working Papers in ESL,
18 (2), 47–83.
Doughty, C. J. and Long, M. H. 2000:
Eliciting second language speech data.
In L. Menn and N. Bernstein Ratner
(eds), Methods for Studying Language
Production. Mahwah, NJ: Lawrence
Erlbaum Associates, 149–77.
Doughty, C. J. and Williams, J. 1998:
Focus on Form in Classroom Second
Language Acquisition. Cambridge:
Cambridge University Press.
Eades, D. 1992: Aboriginal English
and the Law: Communicating with
Aboriginal English-Speaking Clients:
A Handbook for Legal Practitioners.
Brisbane: Queensland Law Society.
Eckman, F. R. 1996a: On evaluating
arguments for special nativism in
second language acquisition theory.
Second Language Research, 12 (4),
335–73.
Eckman, F. R. 1996b: A functional-typological approach to second
language acquisition theory. In
W. C. Ritchie and T. K. Bhatia
(eds), Handbook of Second Language
Acquisition. San Diego: Academic
Press, 195–211.
Ellis, N. 1993: Rules and instances in
foreign language learning: interactions
of explicit and implicit knowledge.
European Journal of Cognitive
Psychology, 5, 289–318.
Ellis, N. 1994: Implicit and Explicit
Learning of Languages. New York:
Academic Press.
Ellis, N. 2001: Memory for language. In
P. Robinson (ed.), Cognition and Second
Language Instruction. Cambridge:
Cambridge University Press, 33–68.
Ellis, N. and Laporte, N. 1997: Contexts
of acquisition: effects of formal
instruction and naturalistic exposure
on second language acquisition.
In A. M. de Groot and J. F. Kroll
(eds), Tutorials in Bilingualism:
Psycholinguistic Perspectives. Mahwah,
NJ: Lawrence Erlbaum Associates,
53–83.
Ellis, R. 1989: Are classroom and
naturalistic acquisition the same?
A study of classroom acquisition
of German word order rules. Studies
in Second Language Acquisition, 11 (3),
305–28.
Ellis, R. 1994: The Study of Second
Language Acquisition. Oxford: Oxford
University Press.
Ellis, R. 1999: Item versus system
learning: explaining free variation.
Applied Linguistics, 20 (4), 460–80.
Eubank, L. 1991: Introduction: Universal
Grammar in the second language. In
L. Eubank (ed.), Point Counterpoint:
Universal Grammar in the Second
Language. Amsterdam and
Philadelphia: John Benjamins, 1–48.
Faerch, C. and Kasper, G. (eds) 1987:
Introspection in Second Language
Research. Clevedon: Multilingual
Matters.
Fishman, J. A. 2001: Can Threatened
Languages be Saved? Clevedon:
Multilingual Matters.
Flege, J. E., Yeni-Komshian, G. H.,
and Liu, S. 1999: Age constraints
on second-language acquisition.
Journal of Memory and Language, 41,
78–104.
Galloway, L. M. 1981: The convolutions
of second language: a theoretical
article with a critical review and
some new hypotheses towards a
neuropsychological model of
bilingualism and second language
performance. Language Learning, 31 (2),
439–64.
Gass, S. M. 1996: Second language
acquisition and linguistic theory:
the role of language transfer. In
W. C. Ritchie and T. K. Bhatia (eds),
Handbook of Second Language
Acquisition. San Diego: Academic
Press, 317–45.
Gass, S. M. and Selinker, L. (eds) 1983:
Language Transfer in Language Learning.
Rowley, MA: Newbury House.
Gass, S. M. and Selinker, L. 2001: Second
Language Acquisition: An Introductory
Course. Second edition. Mahwah, NJ:
Lawrence Erlbaum Associates.
Gasser, M. 1990: Connectionism and
universals of second language
acquisition. Studies in Second Language
Acquisition, 12 (2), 179–99.
Graaff, R. de 1997: The eXperanto
experiment: effects of explicit
instruction on second language
acquisition. Studies in Second Language
Acquisition, 19 (2), 249–76.
Green, D. W. (ed.) 2002: The cognitive
neuroscience of bilingualism.
Bilingualism: Language and Cognition,
4 (2), 101–201.
Gregg, K. R. 1989: Second language
acquisition theory: the case for a
generative perspective. In S. M. Gass
and J. Schachter (eds), Linguistic
Perspectives on Second Language
Acquisition. Cambridge: Cambridge
University Press, 15–40.
Gregg, K. R. 1993: Taking explanation
seriously; or, Let a couple of flowers
bloom. Applied Linguistics, 14 (3),
276–94.
Gregg, K. R. 1994: Second language
acquisition: history and theory.
Encyclopedia of Language and Linguistics.
Second edition. Oxford: Pergamon,
3720–6.
Gregg, K. R. 1996: The logical and
developmental problems of second
language acquisition. In W. C. Ritchie
and T. K. Bhatia (eds), Handbook of
Second Language Acquisition. San Diego:
Academic Press, 49–81.
Gregg, K. R. 2000: A theory for every
occasion: postmodernism and SLA.
Second Language Research, 16 (4),
343–59.
Gregg, K. R., Long, M. H., Jordan, G.,
and Beretta, A. 1997: Rationality and
its discontents in SLA. Applied
Linguistics, 17 (1), 63–83.
Haig, Y. 2001: Teacher perceptions of
student speech. Ph.D. dissertation.
Edith Cowan University.
Harrington, M. 2001: Sentence
processing. In P. Robinson (ed.),
Cognition and Second Language
Instruction. Cambridge: Cambridge
University Press, 91–124.
Hatch, E. M. 1978: Discourse analysis
and second language acquisition.
In E. M. Hatch (ed.), Second Language
Acquisition: A Book of Readings.
Rowley, MA: Newbury House,
401–35.
Huebner, T. 1983a: A Longitudinal
Analysis of the Acquisition of English.
Ann Arbor, MI: Karoma.
Huebner, T. 1983b: Linguistic systems
and linguistic change in an
interlanguage. Studies in Second
Language Acquisition, 6 (1), 33–53.
Huebner, T. and Davis, K. A. 1999:
Sociopolitical Perspectives on Language
Policy and Planning in the USA.
Amsterdam and Philadelphia: John
Benjamins.
Hulstijn, J. H. 2001: Intentional and
incidental second language learning:
a reappraisal of elaboration, rehearsal
and automaticity. In P. Robinson
(ed.), Cognition and Second Language
Instruction. Cambridge: Cambridge
University Press, 258–86.
Hyltenstam, K. and Stroud, C. 1993:
Second language regression in
Alzheimer’s dementia. In K.
Hyltenstam and A. Viberg (eds),
Progression and Regression in Language:
Sociocultural, Neuropsychological and
Linguistic Perspectives. Cambridge:
Cambridge University Press, 222–42.
Ioup, G., Boustagui, E., El Tigi, M., and
Moselle, M. 1994: Reexamining the
critical period hypothesis: a case
study of successful adult SLA in a
naturalistic environment. Studies in
Second Language Acquisition, 16 (1),
73–98.
Iwashita, N. 2001: The role of task-based
conversation in the acquisition of
Japanese grammar and vocabulary.
Ph.D. thesis. University of Melbourne,
Department of Linguistics and
Applied Linguistics.
Jacobs, B. 1988: Neurobiological
differentiation in primary and
secondary language acquisition.
Studies in Second Language Acquisition,
10 (3), 303–37.
Johnston, M. 1999: System and variation
in interlanguage development.
Unpublished Ph.D. dissertation.
Canberra: Australian National
University.
Jordan, G. 2002: Theory construction in
SLA. Ph.D. dissertation. London
University, Institute of Education.
Jordens, P. 1994: The cognitive function
of case marking in German as a native
and a foreign language. In S. M. Gass
and L. Selinker (eds), Language Transfer
in Language Learning. Second edition.
Amsterdam and Philadelphia: John
Benjamins, 138–75.
Jordens, P. and Lalleman, J. (eds) 1988:
Language Development. Dordrecht:
Foris.
Kasper, G. 1992: Pragmatic transfer.
Second Language Research, 8 (3), 203–31.
Kellerman, E. 1984: The empirical
evidence for the influence of the L1 in
interlanguage. In A. Davies, C. Criper,
and A. Howatt (eds), Interlanguage.
Edinburgh: Edinburgh University
Press, 98–122.
Kellerman, E. 1989: The imperfect
conditional. In K. Hyltenstam and
L. K. Obler (eds), Bilingualism Across
the Lifespan: Aspects of Acquisition,
Maturity, and Loss. Cambridge:
Cambridge University Press, 87–115.
Kellerman, E. and Sharwood-Smith, M.
(eds) 1986: Cross-Linguistic Influence in
Second Language Acquisition. New York:
Pergamon.
Klein, F. 1981: The acquisition of English
in Hawai’i by Korean adolescent
immigrants: a longitudinal study of
verbal auxiliary agreement. Ph.D.
dissertation. University of Hawai’i,
Department of Linguistics.
Klein, W. 1986: Second Language
Acquisition. Cambridge: Cambridge
University Press.
Kroll, J. F., Michael, E., and
Sankaranarayanan, A. 1998: A model
of bilingual representation and its
implications for second language
acquisition. In A. F. Healy and L. E.
Bourne, Jr (eds), Foreign Language
Learning: Psycholinguistic Studies on
Training and Retention. Mahwah, NJ:
Lawrence Erlbaum Associates, 365–95.
Larsen-Freeman, D. 1991: Second
language acquisition research: staking
out the territory. TESOL Quarterly, 25
(2), 315–50.
Larsen-Freeman, D. and Long, M. H.
1991: An Introduction to Second
Language Acquisition Research. London:
Longman.
Liceras, J. 1986: Linguistic Theory and
Second Language Acquisition. Tübingen:
Gunter Narr.
Liceras, J. M., Maxwell, D., Laguardia, B.,
Fernández, Z., Fernández, R., and
Diaz, L. 1997: A longitudinal study of
Spanish non-native grammars: beyond
parameters. In A. T. Pérez-Leroux and
W. Glass (eds), Contemporary
Perspectives on the Acquisition of
Spanish. Vol. 1: Developing Grammars.
Somerville, MA: Cascadilla Press,
99–132.
Lightbown, P. M. and Spada, N. 1999: How
Languages are Learned. Revised edition.
Oxford: Oxford University Press.
Lo Bianco, J. 1987: National Policy on
Languages. Canberra: Australian
Government Publishing Service.
Long, M. H. 1988: Instructed
interlanguage development. In
L. M. Beebe (ed.), Issues in Second
Language Acquisition: Multiple
Perspectives. Cambridge, MA:
Newbury House, 115–41.
Long, M. H. 1990a: The least a second
language acquisition theory needs
to explain. TESOL Quarterly, 24 (4),
649–66.
Long, M. H. 1990b: Maturational
constraints on language development.
Studies in Second Language Acquisition,
12 (3), 251–85.
Long, M. H. 1993: Assessment strategies
for second language acquisition
theories. Applied Linguistics, 14 (3),
225–49.
Long, M. H. 1996: The role of the
linguistic environment in second
language acquisition. In W. C. Ritchie
and T. K. Bhatia (eds), Handbook of
Second Language Acquisition. San Diego:
Academic Press, 413–68.
Long, M. H. 1998: SLA: breaking the
siege. University of Hawai’i Working
Papers in ESL, 17 (1), 79–129. Also to
appear in M. H. Long, Problems in
SLA. Mahwah, NJ: Lawrence Erlbaum
Associates.
Long, M. H. 1999: Ebonics, language and
power. In F. L. Pincus and H. J.
Ehrlich (eds), Race and Ethnic Conflict:
Contending Views on Prejudice,
Discrimination, and Ethnoviolence.
Second edition. Westview/
HarperCollins, 331–45.
Long, M. H. forthcoming a: Theory
change in SLA. In M. H. Long,
Problems in SLA. Mahwah, NJ:
Lawrence Erlbaum Associates.
Long, M. H. forthcoming b: Age
differences and the sensitive periods
controversy in SLA. In M. H. Long,
Problems in SLA. Mahwah, NJ:
Lawrence Erlbaum Associates.
Long, M. H. and Robinson, P. 1998:
Focus on form: theory, research and
practice. In C. J. Doughty and J.
Williams (eds), Focus on Form in
Classroom Second Language Acquisition.
Cambridge: Cambridge University
Press, 15–41.
MacWhinney, B. 2001: The Competition
Model: the input, the context and the
brain. In P. Robinson (ed.), Cognition
and Second Language Instruction.
Cambridge: Cambridge University
Press, 69–90.
Mahoney, G. 1975: Ethnological
approach to delayed language
acquisition. American Journal of Mental
Deficiency, 80, 139–48.
Malcolm, I. 1994: Aboriginal English
inside and outside the classroom.
Australian Review of Applied Linguistics,
17 (1), 147–80.
Mayberry, R. 1993: First-language
acquisition after childhood differs
from second-language acquisition:
the case of American Sign Language.
Journal of Speech and Hearing Research,
36, 1258–70.
Meisel, J. M. 1983: Strategies of second
language acquisition: more than one
kind of simplification. In R. W.
Andersen (ed.), Pidginization and
Creolization as Second Language
Acquisition. Rowley, MA: Newbury
House, 120–57.
Mitchell, R. and Miles, F. 1998:
Functional/pragmatic perspectives
on second language learning. In R.
Mitchell and F. Miles (eds), Second
Language Learning Theories. London:
Arnold, 100–20.
Morgan, M. 1999: US language planning
and policies for social dialect speakers.
In T. Huebner and K. A. Davis (eds),
Sociopolitical Perspectives on Language
Policy and Planning in the USA.
Amsterdam and Philadelphia: John
Benjamins, 173–91.
Norris, J. and Ortega, L. 2000:
Effectiveness of instruction: a research
synthesis and quantitative meta-analysis. Language Learning, 50 (3),
417–528.
Obler, L. and Hannigan, S. 1996:
Neurolinguistics of second language
acquisition and use. In W. C. Ritchie
and T. K. Bhatia (eds), Handbook of
Second Language Acquisition. San Diego:
Academic Press, 509–23.
Odlin, T. 1989: Language Transfer.
Cambridge: Cambridge University
Press.
O’Grady, W. 2001a: Language
acquisition and language loss. Ms.
University of Hawai’i, Department of
Linguistics.
O’Grady, W. 2001b: An emergentist
approach to syntax. Ms. University of
Hawai’i, Department of Linguistics.
Pankhurst, J., Sharwood-Smith, M., and
Van Buren, P. 1988: Learnability and
Second Languages: A Book of Readings.
Dordrecht: Foris.
Paradis, M. 1990: Bilingual and polyglot
aphasia. In F. Boller and J. Grafman
(eds), Handbook of Neuropsychology.
Vol. 2. New York: Elsevier, 117–40.
Philipson, R. (ed.) 2000: Rights to
Language: Equity, Power, and Education.
Mahwah, NJ: Lawrence Erlbaum
Associates.
Pica, T. 1983: Adult acquisition of
English as a second language under
different conditions of exposure.
Language Learning, 33 (4), 465–97.
Pica, T. 1992: The textual outcomes of
native speaker/non-native speaker
negotiation: what do they reveal about
second language learning? In C.
Kramsch and S. McConnell-Ginet
(eds), Text and Context: Cross-Disciplinary Perspectives on Language
Study. Lexington, MA: D. C. Heath,
198–237.
Pienemann, M. 1989: Is language
teachable? Applied Linguistics, 10 (1),
52–79.
Pienemann, M. 1998: Language Processing
and Second Language Development:
Processability Theory. Amsterdam and
Philadelphia: John Benjamins.
Preston, D. R. 1989: Sociolinguistics and
Second Language Acquisition. Oxford:
Blackwell.
Preston, D. R. 1996: Variationist
linguistics and second language
acquisition. In W. C. Ritchie and
T. K. Bhatia (eds), Handbook of Second
Language Acquisition. San Diego:
Academic Press, 229–65.
Reynolds, S. B. 1999: Mutual
intelligibility? Comprehension
problems between American Standard
English and Hawai’i Creole English
in Hawai’i’s public schools. In
J. R. Rickford and S. Romaine (eds),
Creole Genesis, Attitudes and Discourse.
Amsterdam and Philadelphia: John
Benjamins, 303–19.
Rickford, J. R. 2000: African American
Vernacular English: Features, Evolution,
Educational Implications. Oxford:
Blackwell.
Ringbom, H. 1987: The Role of the First
Language in Foreign Language Learning.
Clevedon: Multilingual Matters.
Ritchie, W. C. and Bhatia, T. K. (eds)
1996: Handbook of Second Language
Acquisition. San Diego: Academic
Press.
Robinson, P. 1996: Learning simple and
complex second language rules under
implicit, incidental, rule-search and
instructed conditions. Studies in
Second Language Acquisition, 18 (1),
27–67.
Robinson, P. 1997: Individual differences
and the fundamental similarity of
implicit and explicit adult second
language learning. Language Learning,
47 (1), 45–99.
Robinson, P. (ed.) 2001: Cognition and
Second Language Instruction.
Cambridge: Cambridge University
Press.
Rosenberg, S. 1982: The language of
the mentally retarded: development
processes and intervention. In
S. Rosenberg (ed.), Handbook of
Applied Psycholinguistics: Major
Thrusts of Research and Theory.
Hillsdale, NJ: Lawrence Erlbaum
Associates, 329–92.
Rutherford, W. E. (ed.) 1984: Language
Universals and Second Language
Acquisition. Amsterdam and
Philadelphia: John Benjamins.
Sato, C. J. 1985: Linguistic inequality in
Hawai’i: the post-creole dilemma.
In N. Wolfson and J. Manes (eds),
Language of Inequality. Berlin: Mouton,
255–72.
Sato, C. J. 1988: Origins of complex
syntax in interlanguage development.
Studies in Second Language Acquisition,
10 (3), 371–95.
Sato, C. J. 1989: A non-standard
approach to Standard English. TESOL
Quarterly, 23 (2), 259–82.
Sato, C. J. 1990: The Syntax of
Conversation in Interlanguage
Development. Tübingen: Gunter Narr.
Schachter, J. 1996: Maturation and the
issue of UG in L2 acquisition. In
W. C. Ritchie and T. K. Bhatia (eds),
Handbook of Second Language
Acquisition. San Diego: Academic
Press, 159–93.
Schmidt, R. W. 1983: Interaction,
acculturation and the acquisition
of communicative competence. In
N. Wolfson and E. Judd (eds),
Sociolinguistics and Second Language
Acquisition. Rowley, MA: Newbury
House, 137–74.
Schmidt, R. W. (ed.) 1995: Attention and
Awareness in Foreign Language Learning.
Honolulu: University of Hawai’i Press.
Schumann, J. H. 1978: The Pidginization
Process: A Model for Second Language
Acquisition. Rowley, MA: Newbury
House.
Schumann, J. H. 1998: The neurobiology
of affect in language. Language
Learning, 48: Supplement 1.
Schwartz, B. D. 1992: Testing between
UG-based and problem-solving
models of L2A: developmental
sequence data. Language Acquisition,
2 (1), 1–19.
Segalowitz, N. 1997: Individual
differences in second language
acquisition. In A. M. de Groot and
J. F. Kroll (eds), Tutorials in
Bilingualism: Psycholinguistic
Perspectives. Mahwah, NJ: Lawrence
Erlbaum Associates, 85–112.
Seliger, H. W. 1996: Primary language
attrition in the context of bilingualism.
In W. C. Ritchie and T. K. Bhatia (eds),
Handbook of Second Language
Acquisition. San Diego: Academic
Press, 605–26.
Seliger, H. W. and Vago, R. M. (eds)
1991: First Language Attrition.
Cambridge: Cambridge University
Press.
Selinker, L. 1969: Language transfer.
General Linguistics, 9, 67–92.
Selinker, L. 1972: Interlanguage.
International Review of Applied
Linguistics, 10 (3), 209–31.
Selinker, L. and Lakshmanan, U. 1992:
Language transfer and fossilization:
the multiple effects principle. In
S. M. Gass and L. Selinker (eds),
Language Transfer in Language Learning.
Amsterdam and Philadelphia: John
Benjamins, 97–116.
Sharwood-Smith, M. 1993: Input
enhancement in instructed SLA:
theoretical bases. Studies in Second
Language Acquisition, 15 (2), 165–79.
Siegel, J. 1996: Vernacular Education in
the South Pacific. International
Development Issues No. 45. Canberra:
Australian Agency for International
Development.
Siegel, J. 1997: Using a pidgin language
in formal education: help or
hindrance? Applied Linguistics, 18,
86–100.
Siegel, J. 1999: Creole and minority
dialects in education: an overview.
Journal of Multilingual and Multicultural
Development, 20, 508–31.
Sorace, A. 1996: The use of acceptability
judgments in second language
acquisition research. In W. C. Ritchie
and T. K. Bhatia (eds), Handbook of
Second Language Acquisition. San Diego:
Academic Press, 375–409.
Strong, M. (ed.) 1988: Language Learning
and Deafness. Cambridge: Cambridge
University Press.
Tarone, E. E. 1988: Variation and Second
Language Acquisition. London: Edward
Arnold.
Tarone, E. E., Gass, S. M., and Cohen,
A. D. 1994: Research Methodology in
Second-Language Acquisition. Hillsdale,
NJ: Lawrence Erlbaum Associates.
Tomlin, R. S. 1990: Functionalism in
second language acquisition. Studies
in Second Language Acquisition, 12 (2),
155–77.
Tomlin, R. and Villa, V. 1994: Attention
in cognitive science and second
language acquisition. Studies in
Second Language Acquisition, 16 (2),
183–203.
Towell, R. and Hawkins, R. 1994:
Approaches to Second Language
Acquisition. Clevedon: Multilingual
Matters.
Ullman, M. T. 2002: The neural basis
of lexicon and grammar in first and
second language: the declarative/
procedural model. Bilingualism:
Language and Cognition, 4 (2),
105–22.
Valdman, A. and Phillips, J. 1975:
Pidginization, creolization and the
elaboration of learner systems. Studies
in Second Language Acquisition, 1 (1),
21–40.
Warner, N. 2001: Küi ka mäna‘ai:
children acquire traits of those who
raise them. Plenary: Pacific Second
Language Research Forum. University
of Hawai’i, October 6.
Watson-Gegeo, K. A. 1992: Thick
explanation in the ethnographic study
of child socialization: a longitudinal
study of the problem of schooling for
Kwara’ae (Solomon Islands) children.
In W. A. Corsaro and P. J. Miller
(eds), Interpretative Approaches to
Children’s Socialization. San Francisco:
Jossey-Bass, 51–66.
Weltens, B., De Bot, K., and van Els, T.
(eds) 1986: Language Attrition in
Progress. Dordrecht: Foris.
White, L. 1989: Universal Grammar
and Second Language Acquisition.
Amsterdam and Philadelphia: John
Benjamins.
Williams, J. 1988: Zero anaphora in
second language acquisition: a
comparison among three varieties of
English. Studies in Second Language
Acquisition, 10 (3), 339–70.
Wolfe-Quintero, K. 1996: Nativism does
not equal Universal Grammar. Second
Language Research, 12 (4), 335–73.
Wong, L. 1999: Language varieties and
language policy: the appreciation of
Pidgin. In T. Huebner and K. A.
Davis (eds), Sociopolitical Perspectives
on Language Policy and Planning in the
USA. Amsterdam and Philadelphia:
John Benjamins, 205–22.
Young, R. 1990: Variation in Interlanguage
Morphology. Amsterdam and
Philadelphia: John Benjamins.
Zobl, H. 1984: The Wave Model of
linguistic change and the naturalness
of interlanguage. Studies in Second
Language Acquisition, 6 (2), 160–85.
II Capacity and Representation
2 On the Nature of Interlanguage Representation: Universal Grammar in the Second Language
LYDIA WHITE
1 Introduction
In the late 1960s and early 1970s, several researchers pointed out that the language of second language (L2) learners is systematic and that learner errors are
not random mistakes but evidence of rule-governed behavior (Adjémian, 1976;
Corder, 1967; Nemser, 1971; Selinker, 1972). From this developed the conception
of “interlanguage,” the proposal that L2 learners have internalized a mental
grammar, a natural language system that can be described in terms of linguistic
rules and principles. The current generative linguistic focus on interlanguage
representation can be seen as a direct descendant of the original interlanguage
hypothesis. Explicit claims are made about the nature of interlanguage competence, the issues being the extent to which interlanguage grammars are like
other grammars, as well as the role of Universal Grammar (UG).
The question of whether UG mediates L2 acquisition, and to what extent, has
been much debated since the early 1980s. This question stems from a particular
perspective on linguistic universals and from particular assumptions about
the nature of linguistic competence. In the generative tradition, it is assumed
that grammars are mental representations, and that universal principles constrain these representations. Linguistic universals are as they are because of
properties of the human mind, and grammars (hence, languages) are as they
are because of these universal principles.
The first decade of research on the role of UG in L2 acquisition concentrated
on so-called “access,” exploring whether UG remains available in non-primary
acquisition. The issue of UG access relates to fundamental questions such as:
what are natural language grammars like? What is the nature of linguistic
competence? How is it acquired? UG is proposed as a partial answer, at least
in the case of the first language (L1) grammar, the assumption being that
language acquisition is impossible in the absence of specific innate linguistic
principles which place constraints on grammars, restricting the “hypothesis
space,” or, in other words, severely limiting the range of possibilities that the
language acquirer has to entertain. In L2 acquisition research, then, the issue is
whether interlanguage representations are also constrained by UG.
2 UG and the Logical Problem of Language Acquisition
UG is proposed as part of an innate biologically endowed language faculty
(e.g., Chomsky, 1965, 1981; Pinker, 1994). It places limitations on grammars, constraining their form (the inventory of possible grammatical categories in the
broadest sense, i.e., syntactic, semantic, phonological), as well as how they operate (the computational system, principles that the grammar is subject to). UG
includes invariant principles, as well as parameters which allow for variation.
While theories like Government-Binding (GB) (Chomsky, 1981), Minimalism
(Chomsky, 1995), or Optimality Theory (Archangeli and Langendoen, 1997)
differ as to how universal principles and parameters are formalized, within
these approaches there is a consensus that certain properties of language are
too abstract, subtle, and complex to be acquired in the absence of innate and
specifically linguistic constraints on grammars.
UG is postulated as an explanation of how it is that learners come to know
properties of grammar that go far beyond the input, how they know that
certain things are not possible, why grammars are of one sort rather than
another. The claim is that such properties do not have to be learned. Proposals
for an innate UG are motivated by the observation that, at least in the case of
L1 acquisition, there is a mismatch between the primary linguistic data (PLD),
namely the utterances a child is exposed to, and the abstract, subtle, and
complex knowledge that the child acquires. In other words, the input (the
PLD) underdetermines the output (the grammar). This is known as the problem of the poverty of the stimulus or the logical problem of language acquisition.
As an example of a proposed principle of UG which accounts for knowledge
too subtle to be learned solely from input, we will consider the Overt Pronoun
Constraint (OPC) (Montalbetti, 1983), a constraint which has recently received
attention in L2 acquisition research. The OPC states that in null argument
languages (languages allowing both null and overt pronouns), an overt pronoun
cannot receive a bound variable interpretation, that is, it cannot have a quantified
expression (such as everyone, someone, no one) or a wh-phrase (who, which) as its
antecedent.1 This constraint holds true of null argument languages in general,
including languages unrelated to each other, such as Spanish and Japanese.
Consider the sentences in (1) from English, a language requiring overt subjects. In particular, we are concerned with the coreference possibilities (indicated by subscripts) between the pronominal subject of the lower clause and
its potential antecedent in the main clause:
(1) a. Everyone_i thought [he_i would win]
    b. Who_i thought [he_i would win]?
    c. John_i thought [he_i would be late]
In (1a), the pronoun he can be bound to the quantifier everyone. On this interpretation, every person in the room thinks himself or herself a likely winner: he,
then, does not refer to a particular individual. This is known as a bound variable
interpretation. Similarly, in (1b) the pronoun can be bound to the wh-phrase
who without referring to a particular individual. In (1c), on the other hand, the
pronoun refers to a particular person in the main clause, namely John. (In addition, in all three cases, disjoint reference is possible, with the pronoun in the
lower clause referring to some other person in the discourse – this interpretation
is not of concern here.)
In null argument languages, the situation regarding quantified antecedents
is somewhat different. On the one hand, an embedded null subject can take
either a quantified or a referential antecedent (or it can be disjoint in reference
from other NPs in the sentence), just like overt pronouns in English. This is
illustrated in (2) for Japanese:2
(2) a. Dare_i ga  [∅_i kuruma o   katta  to]   itta no?
       Who    NOM      car    ACC bought that  said Q
       Who_i said that (he_i) bought a car?
    b. Tanaka-san_i wa  [∅_i kaisya  de itiban da to]   itte-iru
       Tanaka-Mr    TOP      company in best   is that  saying-is
       Mr Tanaka_i is saying that (he_i) is the best in the company
On the other hand, overt pronouns are more restricted than either null pronouns
in null argument languages or overt pronouns in languages requiring overt arguments. In particular, an overt pronoun may not have a quantified antecedent, as
in (3a), whereas it can have a sentence-internal referential antecedent, as in (3b):
(3) a. *Dare_i ga  [kare_i ga  kuruma o   katta  to]   itta no?
        Who    NOM  he     NOM car    ACC bought that  said Q
        Who_i said that he_i bought a car?
    b. Tanaka-san_i wa  [kare_i ga  kaisya  de itiban da to]   itte-iru
       Tanaka-Mr    TOP  he     NOM company in best   is that  saying-is
       Mr Tanaka_i is saying that he_i is the best in the company
The differences between null argument languages like Japanese and languages
that do not permit null arguments like English are summarized in table 2.1.
At issue, then, is how the L1 acquirer of a language like Japanese discovers
the restriction on overt pronouns with respect to quantified antecedents. This
case constitutes a clear poverty-of-the-stimulus situation. The phenomenon in
question is very subtle. In many cases, overt and null pronouns will appear in
the same syntactic contexts (although sometimes under different pragmatic
and discourse conditions), so it is unlikely that the absence of overt pronouns
with quantified antecedents would be detected. It is also highly unlikely that
L1 acquirers produce utterances incorrectly using overt pronouns with quantified antecedents and are then provided with negative evidence on this point.
How, then, could an L1 acquirer of a language like Japanese discover this
property? The argument is that the knowledge is built in, in the form of a
principle of UG, the OPC; it does not have to be learned at all.

Table 2.1 Antecedents for pronouns in null and overt argument languages

                          Null argument languages          Overt argument languages
                          Null subjects   Overt subjects   Overt subjects
Referential antecedents   Yes             Yes              Yes
Quantified antecedents    Yes             No               Yes
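
The pattern summarized in table 2.1 can be restated as a small lookup. The following Python sketch is only an illustration: the function name `allows_antecedent` and its string labels are assumptions introduced here, and the code encodes the descriptive generalization, not a claim about how the constraint is represented in UG.

```python
# Illustrative encoding of table 2.1. All names are hypothetical labels
# chosen for this sketch, not part of any linguistic formalism.

def allows_antecedent(null_argument_language: bool, pronoun: str, antecedent: str) -> bool:
    """Return True if a sentence-internal antecedent of this type is possible."""
    if pronoun not in {"null", "overt"}:
        raise ValueError("pronoun must be 'null' or 'overt'")
    if antecedent not in {"referential", "quantified"}:
        raise ValueError("antecedent must be 'referential' or 'quantified'")
    if pronoun == "null" and not null_argument_language:
        raise ValueError("null subjects are not available in overt argument languages")
    # OPC: in null argument languages, an overt pronoun cannot take a
    # quantified expression or wh-phrase as its antecedent.
    if null_argument_language and pronoun == "overt" and antecedent == "quantified":
        return False
    return True


# Japanese-type (null argument) language
print(allows_antecedent(True, "null", "quantified"))    # True, as in (2a)
print(allows_antecedent(True, "overt", "quantified"))   # False, as in (3a)
print(allows_antecedent(True, "overt", "referential"))  # True, as in (3b)
# English-type (overt argument) language
print(allows_antecedent(False, "overt", "quantified"))  # True, as in (1a)
```

Only one cell of the table returns False, which is the point of the poverty-of-the-stimulus argument: the learner must come to know a single gap in an otherwise uniform pattern.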
3 UG and the Logical Problem of L2 Acquisition
Assuming a logical problem of L1 acquisition, hence motivating UG, people
have asked whether the same holds true of L2; that is, whether there is a
mismatch between the input that L2 learners are exposed to and the unconscious knowledge that they attain (Bley-Vroman, 1990; Schwartz and Sprouse,
2000; White, 1985). In the case of L2 acquisition, it is important to distinguish
between (i) the logical problem and (ii) UG availability. The first issue is whether
L2 learners attain unconscious knowledge (a mental representation) that goes
beyond the L2 input. (There would be no logical problem at all, if L2 learners
turned out not to achieve knowledge that goes beyond the input.) The second
issue is whether such knowledge (if found) is achieved by means of UG. These
are not in fact the same question, although they are often collapsed, since the
way to determine whether UG principles and parameters constrain interlanguage representations is similar to the way to assess whether there is a logical
problem of L2 acquisition. However, it is conceivable that there is a logical
problem of L2 acquisition, with L2 learners achieving far more than could
have come from the input alone, and that their achievement is to be explained
by postulating a reliance on the L1 grammar rather than a still-functioning UG
(Bley-Vroman, 1990; Schachter, 1988).
The strongest case for the operation of UG in L2 acquisition, then, is if
learners demonstrate knowledge of subtle and abstract properties which could
not have been learned from L2 input alone or from input plus general learning
principles (not specifically linguistic) or on the basis of explicit instruction or
from the L1 grammar. In such cases, not only is there a logical problem of L2
acquisition but also UG remains the only way to account for the knowledge
in question. To demonstrate an L2 logical problem, hence the likelihood of
involvement of UG, researchers have sought out genuine L2 poverty of the
stimulus cases, in which both of the following hold (White, 1989b, 1990):
i The phenomenon in question is underdetermined by the L2 input. That is,
it must not be something that could have been acquired by simple observation of the L2 input, as an effect of input frequency, or on the basis of
instruction, analogical reasoning, etc.
ii The phenomenon in question works differently in the L1 and the L2. If
L2 learners show evidence of subtle and abstract knowledge, we want to
exclude the possibility that such knowledge is obtained solely via the L1
grammar.
However, the requirement that L1 and L2 differ in the relevant respects
becomes harder and harder to achieve, in that many properties of UG will of
necessity manifest themselves in the L1 in some form (Dekydtspotter, Sprouse,
and Anderson, 1998; Hale, 1996). Nevertheless, if the L1 and L2 differ in terms
of surface properties, then transfer can be ruled out, at least at this level, as an
explanation of successful acquisition.
In the first decade of work on SLA from a UG perspective (starting in the
early 1980s), research focused mainly on whether or not UG is available to L2
learners, and in what form. The UG question seemed relatively straightforward
(and relatively global): is UG available (or accessible) to L2 learners? The assumption was that if you can show that a particular UG principle operates/does not
operate then this generalizes to other principles, hence to UG availability/nonavailability in general. Researchers looked for evidence that L2 learners could
(or could not) apply principles of UG, and set or reset parameters, as well as
investigating the extent to which the L1 was involved, in the form of L1
parameter settings in interlanguage grammars. Hypotheses varied as to whether
learners had no access, partial (indirect) access, or full (direct) access to UG,
and there were differing views on the role of the L1 grammar. But although
the issues were phrased in terms of access to UG, the question was then, and
remains, whether interlanguage representations show evidence of being constrained by principles of UG; that is, whether interlanguage grammars are
restricted in the same way as the grammars of native speakers are restricted.
As a recent example of research which takes into account the logical problem of L2 acquisition and looks for evidence as to whether a principle of UG
constrains the interlanguage representation, consider Kanno’s (1997) investigation of the operation of the OPC in the grammars of L2 learners of Japanese
(see box 2.1). Using a coreference judgment task, Kanno shows that L2 learners
demonstrate subtle knowledge of the restriction on overt pronouns, correctly
24 Lydia White
Box 2.1
The Overt Pronoun Constraint (OPC) (Kanno, 1997)
Research question: Do adult L2 learners observe principles of UG which are not
operative in their L1? In particular, do English-speaking learners of Japanese observe
the OPC?
Overt Pronoun Constraint (OPC) (Montalbetti, 1983): In null argument languages, an
overt pronoun cannot receive a bound variable interpretation.
L2 logical problem:
i
There appears to be nothing in the L2 input to signal the difference between overt
and null pronominals with respect to quantified antecedents. It is unlikely that
the absence of overt pronouns with quantified antecedents would be detected.
This issue is not explicitly taught and not discussed in L2 textbooks.
ii Knowledge of the restriction on overt pronouns in Japanese is not available from
the L1 English. In English, overt pronouns can receive a bound variable interpretation, contrary to Japanese.
Methodology:
Subjects: 28 intermediate-level English-speaking adult learners of Japanese. Control
group of 20 adult native speakers of Japanese.
Task: Coreference judgment task, involving 20 biclausal sentences (4 sentence types, 5
tokens of each). Each sentence had a pronoun subject (overt or null) in the lower clause,
and a potential antecedent (quantified or referential) in the main clause. Participants
had to indicate whether the subject of the embedded clause could refer to the same
person as the subject of the main clause or whether it referred to someone else.
Results: Native speakers and L2 learners differentiated in their treatment of overt pronouns depending on the type of antecedent involved (quantified or referential), as well
as differentiating between overt and null pronominals in these contexts (see table 2.2),
supporting the claim that the OPC is being observed. Native speakers overwhelmingly
rejected quantified antecedents for overt pronouns (2 percent), while accepting them
in the case of null subjects (83 percent). They indicated that null subjects can always
take a sentence-internal referential antecedent (100 percent), whereas for overt pronouns
an internal referential antecedent was accepted at about 50 percent (both an internal
and an external referent are possible). The L2 learners showed a remarkably similar
pattern of results and their responses did not differ significantly from the controls.
Conclusion: Adult L2 acquirers of Japanese observe the OPC, suggesting that
interlanguage grammars are constrained by UG.
Table 2.2 Acceptances of antecedents by subject type (percentages)

                  Native speakers (n = 20)       L2 learners (n = 28)
                  Quantified     Referential     Quantified     Referential
                  antecedent     antecedent      antecedent     antecedent
Null subject         83.0          100.0            78.5           81.5
kare (“he”)           2.0           47.0            13.0           42.0
disallowing quantified antecedents in cases like (3a). Kanno’s test sentences are
carefully constructed to control for use of both types of pronoun (overt and
null) in the context of both kinds of antecedent (referential and quantified).
This allows her to eliminate the possibility that L2 learners simply prohibit
overt pronouns from taking sentence-internal antecedents in general, as well
as the possibility that they reject quantified antecedents altogether. In addition
to considering group results, Kanno shows that subjects largely behave consistently with respect to the OPC when analyzed individually. Such individual
analyses are crucial, since the hypothesis is that UG constrains the grammars
of individuals, and group results may conceal individual variation.
The knowledge demonstrated by these L2 learners of Japanese could not
have come from the L1 English, where overt pronouns do take quantified
antecedents; it is knowledge that is underdetermined by the L2 input, where
null and overt pronouns allow similar antecedents in many cases. The distinction between permissible antecedents for overt and null pronouns is not taught
in L2 Japanese textbooks or classes. It seems unlikely that there are relevant
surface patterns in the L2 input that could be noticed by the learner, leading to
this result. Nevertheless, L2 learners demonstrate knowledge of the restriction,
suggesting that L2 representations must be constrained by UG. Similar results
have been reported for L2 Spanish by Pérez-Leroux and Glass (1997); that is,
adult English-speaking learners of Spanish also observe the OPC.
4 The Comparative Fallacy
So far, we have considered the case of learners who acquire subtle knowledge
of the constraint on antecedents for pronouns (the OPC). Here, then, properties of the L2 assumed to stem from UG are manifested in the interlanguage
grammar. The interlanguage grammar and the L2 grammar converge in this
respect, as suggested by Kanno’s results. But what if interlanguage representations fail to demonstrate certain L2 properties? What if the interlanguage and
the L2 diverge? Does this necessarily imply lack of UG? This was, in fact, the
interpretation taken (implicitly or explicitly) by a number of researchers in the
1980s.
Some researchers were quite explicit in their assumption that one should
compare L2 learners and native speakers with respect to UG properties, the
native speaker of the L2 providing a reference point for assessing UG availability. If L2 learners rendered judgments (or otherwise behaved) like native
speakers with respect to some principle or parameter of UG, then they were
deemed to have access to UG; on the other hand, if they differed in their
judgments from native speakers, then their grammars were assumed not to be
constrained by UG. For example, in Schachter’s (1989, 1990) investigations of
constraints on wh-movement, this was the underlying rationale for claiming
the non-operation of UG. Schachter found that, compared to native speakers,
L2 learners of English of certain L1 backgrounds were very inaccurate in their
judgments on illicit wh-movement out of structures such as embedded questions and relative clauses; hence, Schachter argued, L2 learners do not have
access to UG principles independently of the L1.
The problem with this kind of approach to UG in L2 acquisition is that
it presupposes that the interlanguage representation must converge on the
grammar of native speakers of the L2, that the endstate grammar of a second
language learner must be identical to that of a native speaker. But this is a
misconception (Cook, 1997; Schwartz, 1993, 1998b; White, 1996). An interlanguage grammar which diverges from the L2 grammar can nevertheless fall
within the bounds laid down by UG. If we are going to take the issue of
representation seriously, we need to consider Bley-Vroman’s comparative fallacy.
Bley-Vroman (1983) warned that “work on the linguistic description of learners’
languages can be seriously hindered or sidetracked by a concern with the target
language” (p. 2) and argued that “the learner’s system is worthy of study in its
own right, not just as a degenerate form of the target system” (p. 4).
A number of researchers pointed out quite early on the need to consider
interlanguage grammars in their own right with respect to principles and
parameters of UG, arguing that one should not compare L2 learners to native
speakers of the L2 but instead consider whether interlanguage grammars are
natural language systems (e.g., duPlessis et al., 1987; Finer and Broselow, 1986;
Liceras, 1983; Martohardjono and Gair, 1993; Schwartz and Sprouse, 1994; White,
1992b). These authors have shown that L2 learners may arrive at representations which indeed account for the L2 input, though not in the same way as
the grammar of a native speaker. The issue, then, is whether the interlanguage
representation is a possible grammar, not whether it is identical to the L2
grammar. For example, with respect to the violations of constraints on wh-movement that Schachter (1989, 1990) reports, Martohardjono and Gair (1993),
White (1992b), and, more recently, Hawkins and Chan (1997) argue that L2
learners have a different analysis for the phenomenon in question, whereby
structures involving a fronted wh-phrase are derived without movement (based
on properties of the L1 grammar), explaining the apparent lack of movement
constraints.
A related kind of misleading comparison involves the use of control groups
in experimental tasks. There is often an (implicit) expectation that L2 speakers
should not differ significantly from native speakers with respect to performance on sentences testing for UG properties. Suppose that on a grammaticality
judgment task native speakers accept sentences violating some principle of
UG at less than 5 percent and accept corresponding grammatical sentences at
over 95 percent. In order to demonstrate “access” to this principle, it is not
necessary for L2 speakers to perform at the same level. Rather, the issue is
whether the interlanguage grammar shows evidence of certain distinctions:
does learners’ performance on grammatical sentences differ significantly from
their performance on ungrammatical sentences (cf. Grimshaw and Rosen, 1990,
for related comments on L1 acquisition)? Do L2 learners distinguish between
different kinds of ungrammatical sentences (see Martohardjono, 1993)? If certain
sentence types are treated significantly differently from other sentence types,
this suggests that the interlanguage grammar represents the relevant distinction (whatever it may be), even if the degree to which L2 learners observe it in
performance differs from that of native speakers. To return to Kanno’s study on
the OPC, the importance of her results lies not in the fact that the L2 learners
did not differ significantly from the native speakers, but rather in the fact that
the L2 learners showed a significant difference in their acceptances of quantified
antecedents depending on pronoun type, suggesting that their grammars make
the relevant distinction between licit and illicit antecedents.
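The within-group comparison just described can be made concrete with the percentages in table 2.2. What follows is a minimal sketch in Python and is not part of Kanno's own analysis: the dictionary layout and the function names are assumptions introduced here for illustration. The gaps it computes are descriptive only, since testing them for significance would require the individual response data, which are not reproduced in this chapter.

```python
# Acceptance rates (percent) copied from table 2.2.
# Layout (an assumption of this sketch): group -> antecedent type -> pronoun type.
ACCEPTANCE = {
    "native": {
        "quantified": {"null": 83.0, "overt": 2.0},
        "referential": {"null": 100.0, "overt": 47.0},
    },
    "l2": {
        "quantified": {"null": 78.5, "overt": 13.0},
        "referential": {"null": 81.5, "overt": 42.0},
    },
}


def pronoun_contrast(group, antecedent):
    """Null-minus-overt acceptance gap (percentage points) within one group."""
    cell = ACCEPTANCE[group][antecedent]
    return cell["null"] - cell["overt"]


def antecedent_contrast(group):
    """Referential-minus-quantified acceptance gap for overt pronouns only."""
    rates = ACCEPTANCE[group]
    return rates["referential"]["overt"] - rates["quantified"]["overt"]


if __name__ == "__main__":
    for group in ACCEPTANCE:
        print(
            group,
            pronoun_contrast(group, "quantified"),
            antecedent_contrast(group),
        )
```

On these figures the L2 learners accept quantified antecedents 65.5 percentage points more often for null subjects than for kare (the native-speaker gap is 81.0). The question of interest is whether that gap exists within the learner group, not whether it matches the native-speaker gap.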
It is not the case, however, that one should never compare L2 speakers to
native speakers of the L2 as far as properties of the grammar are concerned.3
There are legitimate reasons for asking whether the L2 learner has in fact
acquired properties of the L2. After all, the learner is exposed to L2 input in
some form, and the L2 is a natural language. What is problematic is when
certain conclusions are drawn based on failure to perform exactly like native
speakers. Failure to acquire L2 properties may nevertheless involve acquiring
properties different from the L1, properties of other natural languages, properties that are underdetermined by the L2 input. Such failure does not necessarily
entail lack of UG.
5 UG “Access” and Terminological Confusions
Earlier approaches to UG in L2 acquisition revealed a somewhat ambivalent
attitude to the L1. Perhaps because the strongest case for UG can be made if
one can eliminate the L1 as a potential source of UG-like knowledge, some
researchers felt that evidence of the influence of the L1 grammar on the
interlanguage representation would somehow weaken the case for UG. Nowhere is this more evident than in the terminological confusions and disagreements that arose over terms like direct access to UG. Direct access for some
researchers was taken to mean that L2 learners arrive at UG properties independently of their L1 (e.g., Cook, 1988). For others (e.g., Thomas, 1991b), it
meant the instantiation of any legitimate parameter setting (L1, L2, Ln). Similar
problems have arisen with the term full access, which at some point replaced
direct access. Epstein, Flynn, and Martohardjono (1996) restrict the term full
access to the position that UG operates independently of the L1 representation,
whereas Schwartz and Sprouse (1996) do not so restrict it.
Part of the problem is that terms like direct/full or indirect/partial access
are too global. In addition, in some cases at least, an overly simplistic and
misleading dichotomy between UG and the L1 is adopted. Since the L1 is a
natural language, there is no a priori justification for assuming that a representation based on the L1 implies lack of UG constraints on the interlanguage
grammar.
What is required is a greater focus on the nature of the representations that
L2 learners achieve. It may not always be appropriate to dwell explicitly on
the UG access question. But by looking in detail at the nature of interlanguage
representation, we in fact remain committed to this issue, since evidence of an
interlanguage grammar that does not fall within the hypothesis space sanctioned
by UG is evidence that UG does not fully constrain interlanguage grammars.
6 Interlanguage Representation: Convergence, Divergence, or Impairment
In the 1990s, the UG debate shifted from a consideration of the broad access
question to a detailed consideration of the nature of interlanguage representation. Specific grammatical properties have been investigated and claims have
been made as to how they are represented. It is largely presupposed that the
interlanguage grammar and the grammars of native speakers of the L2 will
diverge in some respects, at least initially and possibly also finally (see Flynn,
1996, for a contrary view). Of interest, then, is the nature of that divergence: is
it indicative of a representation that is nevertheless constrained by UG (cf.
Sorace, 1993) or is it suggestive of some kind of impairment to the grammar,
such that the interlanguage representation is in some sense defective? If
interlanguage representations were to show properties not found elsewhere in
natural languages, this would suggest that they are not UG-constrained, at
least in some domains (see Thomas, 1991a, and Klein, 1995).
The focus on representation manifests itself particularly clearly in proposals
relating to the L2 initial state. Theories about the initial state are theories about
the representation that L2 learners start out with, the representations that they
initially use to make sense of the L2 input.
6.1 Example: strong features and verb movement
Since proposals regarding initial and subsequent interlanguage grammars often
dwell, in one way or another, on functional categories, we will consider an
example here to illustrate the kinds of properties that researchers have investigated in recent years. Functional categories, such as inflection (I), complementizer (C), and determiner (D), have certain formal features associated with them
(tense, agreement, case, number, person, gender, etc.). These features vary as to
strength (strong vs. weak). Functional categories are seen as the locus of parametric variation (e.g., Borer, 1984; Chomsky, 1995), which can be found at the
level of the categories themselves (not all categories are realized in all languages),
at the level of formal features (the features of a particular functional category
may vary from language to language), and at the level of feature strength (a
particular feature can be strong in one language and weak in another).
Here we will consider properties relating to functional projections above the
verb phrase (VP). Finite verbs have features (tense, agreement) which have to
be checked against corresponding features in I (Chomsky, 1995).4 If features in
I are strong, the finite verb raises overtly to check its features, as in the French
(4a). If features are weak, overt movement does not take place, as in the English (4b):5
(4) [CP Spec [C′ C [IP Spec [I′ I [NegP [VP ]]]]]]
    (a) [IP Jean [I′ sort_i [NegP pas [VP t_i ]]]]
    (b) [IP John [I′ (does) [NegP not [VP leave ]]]]
Feature strength results in a number of syntactic consequences related to word
order. In languages such as French, where features in I are strong, there are alternations between the positions of finite and non-finite verbs, since non-finite verbs
have no features to check, hence do not raise.6 Comparing French to a language
with weak features, like English, there are word order differences between the
two with respect to where the finite verb is found (Emonds, 1978; Pollock,
1989). The difference between finite and non-finite verbs in French is illustrated
in (5); the differences between finite verbs in French and English are illustrated
in (6) and (7). In these examples, we consider only the position of the verb with
respect to negation and adverbs, but there is a variety of other verb placement
facts which are subsumed under this analysis (see Pollock, 1989):
(5) a. ne sortez pas
       (ne) leave-2PP not
    b. pas sortir
       not leave-INF
       ‘don’t go out’
(6) a. Marie n’aime pas Jean
       Mary likes not John
    b. Marie voit rarement Jean
       Mary sees rarely John
(7) a. Mary does not like John
    b. *Mary likes not John
    c. Mary rarely sees John
    d. *Mary sees rarely John
In French, finite lexical verbs appear to the left of the negative pas while non-finite verbs appear to the right (compare (5a) and (5b)). English and French
contrast with respect to the position of the finite verb in relation to negation
and adverbs (compare (6) and (7)). In English, lexical verbs appear to the right
of negation (7a) and adverbs (7c) and cannot precede them (7b, 7d), in contrast
to French (6a, 6b). A range of word order differences between the two languages are thus accounted for by one parametric difference between them,
namely the strength of features in I.
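As a rough illustration of how a single parameter value yields the adverb-placement contrast in (6b) and (7c, d), consider the following toy sketch in Python. It is a deliberate simplification and not an implementation of any published analysis: the function name and the strong_infl argument are assumptions made here, and the sketch ignores negation, do-support, and non-finite clauses entirely.

```python
def place_verb(subject, verb, adverb, obj, strong_infl):
    """Linearize a finite clause containing one adverb (toy model).

    strong_infl=True  -> the finite verb raises past the adverb (French-type).
    strong_infl=False -> the finite verb stays below the adverb (English-type).
    """
    if strong_infl:
        words = [subject, verb, adverb, obj]
    else:
        words = [subject, adverb, verb, obj]
    return " ".join(words)


if __name__ == "__main__":
    # (6b): French, strong features in I
    print(place_verb("Marie", "voit", "rarement", "Jean", strong_infl=True))
    # (7c): English, weak features in I
    print(place_verb("Mary", "sees", "rarely", "John", strong_infl=False))
    # (7d): the order that results if the strong value is applied to English
    print(place_verb("Mary", "sees", "rarely", "John", strong_infl=True))
```

The third call produces the order in (7d), which is ungrammatical in English: it is what the strong setting would generate if carried over to English words.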
In the next section, we will use the example of verb movement to illustrate
some of the representational issues that are currently being pursued. It should
be noted, however, that not all of the theories to be discussed in fact have
made claims specifically about verb placement.
6.2 Initial state
Proposals concerning the initial interlanguage representation can broadly
be classified into two types: (i) the interlanguage representation conforms to
properties of natural language (though not necessarily the L2); or (ii) the
interlanguage representation differs from adult natural languages in fundamental respects (which, however, may not be permanent). Into the first category falls the Full Transfer/Full Access (FTFA) Hypothesis of Schwartz and
Sprouse (1994, 1996). I will also consider Epstein et al.’s (1996) Full Access
Hypothesis in this category. Although the Full Access Hypothesis is not, strictly
speaking, a hypothesis about the initial state (Epstein et al., 1996, p. 750), it
nevertheless has clear implications for the nature of the earliest grammar. The
second category includes the Minimal Trees Hypothesis of Vainikka and Young-Scholten (1994, 1996), as well as Eubank’s (1993/4, 1994) claim that initially
features are neither strong nor weak but rather “inert” or “valueless.”
Schwartz and Sprouse (1994, 1996) propose that the L1 grammar constitutes
the interlanguage initial state. In other words, faced with L2 input that must
be accounted for, learners adopt the representation that they already have.
Schwartz and Sprouse (1994) originally presented this proposal in the context
of an analysis of the acquisition of German word order by a native speaker of
Turkish. Schwartz and Sprouse (1996) and Schwartz (1998a) extend the analysis to French-speaking learners of English, arguing, following White (1991a,
1991c, 1992a), that the initial interlanguage grammar includes strong features,
because this is the case in the L1 French. In consequence, verbs are incorrectly
placed with respect to adverbs, as White found. However, a potential problem
for FTFA is that while White’s (1992a) subjects had considerable problems
with adverb placement, producing and accepting forms like (7d), they did not
have equivalent problems with negation, correctly recognizing the impossibility of (7b).7
According to FTFA, the interlanguage representation is necessarily different
from the grammar of native speakers of the L2, at least initially; it is nevertheless
UG constrained, exemplifying functional categories and features, as well as
syntactic properties that derive from feature strength. The interlanguage representation may or may not converge on the L2 grammar in later stages of
development. When the L1 representation is unable to accommodate the L2
input, the learner has recourse to options made available through UG. Once
the L2 input reveals an analysis to be inappropriate, there is restructuring of
the interlanguage representation. For example, in the case of verb raising,
there are properties of the L2 input that could signal the need to change from
strong to weak feature values: the presence of do-support in negatives (7a)
shows that finite lexical verbs in English do not raise (Schwartz, 1987; White,
1992a). Thus, convergence might be expected in this case.
In contrast to FTFA, Epstein et al. (1996, p. 751) and Flynn (1996) claim the
L1 grammar is not implicated in the initial interlanguage representation. The
implicit logic of their argumentation suggests that UG must be the initial state8
and that the early grammar in principle has available all functional categories,
features, and feature values, from UG, so that an appropriate representation for
the L2 can be constructed without recourse to categories or features from the
L1. As far as representation of functional categories is concerned, there is no
development on such an account: the L2 categories are in place from early on;
because they are appropriate, there is no need for subsequent restructuring of
the grammar.
In terms of our example, this would mean that a French-speaking learner of
English should assume weak features initially, hence would make no word
order errors, contrary to fact, at least as far as adverb placement is concerned
(White, 1991a, 1991c). Similarly, an English-speaking learner of French should
assume strong features, hence exhibiting verb raising. Again, there is research
that suggests that this is not inevitable. White (1989a, 1991b) reports that English-speaking children learning French fail to consistently accept verb raising in a
variety of tasks. Hawkins, Towell, and Bazergui (1993) suggest that intermediate proficiency adult English-speaking learners of French fail to reset from the
weak L1 feature strength to the strong value required by the L2.
Although Schwartz and Sprouse (1996) and Epstein et al. (1996) differ radically in their claims about the involvement of the L1 grammar, they share the
assumption that the interlanguage representation shows a full complement of
functional categories, drawn either from the L1 or from UG. In other words,
the interlanguage representation is a grammar sanctioned by UG, both in the
initial state and subsequently.
Other theories posit a greater degree of divergence between what is found
in the interlanguage grammar and what is found in the grammars of adult
native speakers. Vainikka and Young-Scholten (1994, 1996) propose the Minimal Trees Hypothesis, whereby the initial state lacks functional categories
altogether, only lexical categories (N, V, P, etc.) being found. Lexical categories
are assumed to be drawn from the L1 grammar, hence to exhibit the same
properties as the L1 with respect to headedness, for example. Thus, this theory
shares with FTFA the assumption that L1 properties are found in the initial
representation. However, as far as functional categories are concerned, Vainikka
and Young-Scholten (1994, 1996) assume no transfer at all.
Vainikka and Young-Scholten’s (1994) proposals are based on an examination
of spontaneous production data from adult learners of German whose L1s are
Turkish and Korean. The evidence that they adduce is largely morphological:
in early production data from adult learners of German, inflectional morphology
is lacking. This leads them to conclude that the corresponding abstract categories
are lacking in the interlanguage grammar. (See Sprouse, 1998, and Lardiere, 2000,
for arguments against assuming such a close relationship between surface
morphology and abstract syntactic categories.) In addition, Vainikka and Young-Scholten (1994) claim that the early grammar lacks word orders that would be
the result of movement of the finite verb to a functional projection. In terms of
our example, the prediction of Minimal Trees is that French-speaking learners
of English should not produce errors like (7d), since these are the result of verb
movement from V to I (motivated by strong features) (Schwartz, 1998b; Schwartz
and Sprouse, 1996). If the functional category I is altogether absent and there is
only a VP projection, there is nowhere for the verb to move to. Hence, the only
interlanguage word order should be the order that is in fact correct for English,
namely (7c), contrary to fact. (See Vainikka and Young-Scholten, 1996, 1998,
for discussion.)
Further evidence against Minimal Trees is provided by Grondin and White
(1996), who examine spontaneous production data from two English-speaking
children learning French. Grondin and White show that there is both morphological and syntactic evidence in favor of an IP projection in early stages. For
example, the children show an alternation in verb placement with respect to
negation: finite verbs precede pas whereas non-finite verbs follow it, suggesting
movement of the finite verb to I; this is inconsistent with Minimal Trees, which
postulates no I in the early grammar. However, as Vainikka and Young-Scholten
(1996) point out, these data may not be truly representative of the initial state,
since the children had several months of exposure to the L2 prior to beginning
to speak.
In some sense, the Minimal Trees Hypothesis might be seen as implying a
defective interlanguage grammar (Lardiere, 2000), since it postulates a period
during which the representation lacks functional categories, which are otherwise presumed to be a necessary characteristic of natural language grammars.
However, this impairment is assumed to be temporary, with functional categories developing gradually until, eventually, all functional categories appropriate for the L2 are acquired. Furthermore, Vainikka and Young-Scholten
(1994, 1996) take the position that gradual emergence of functional categories
is also characteristic of L1 acquisition (Clahsen, Eisenbeiss, and Vainikka, 1994);
thus, for them, L2 acquisition in this domain is similar to L1.
The final initial state proposal to be considered here also implies that interlanguage grammars are in some sense defective. Eubank (1993/4, 1994) shares
with Schwartz and Sprouse (1994, 1996) the assumption that the L1 grammar
constitutes a major part of the initial state: L1 lexical categories and functional
categories are assumed to be present. However, Eubank maintains that the initial
representation lacks fully specified feature values, at least some interlanguage
features being unspecified or “inert.” In Eubank (1993/4) and subsequently (e.g.,
Eubank and Grace, 1998) the focus is specifically on feature strength: while
features are strong or weak in natural language grammars, they are argued to
be neither in the interlanguage, suggesting an impairment in this domain.
According to Eubank, a consequence of inertness is that finite verbs will vary
optionally between raised and unraised positions; this will be true regardless
of what language is being acquired as the L2 and regardless of the situation in
the L1. In the case of French-speaking learners of English, then, variable word
orders are expected, that is, both (7c) and (7d). The same would be expected of
English-speaking learners of French. In support, Eubank (1993/4) points to
White’s (1991a, 1991c) results on the position of the verb with respect to the
adverb, where there was some evidence of variability, with francophone subjects allowing not only word orders like (7d) but also those like (7c). However, Yuan
(2000) shows that French-speaking and English-speaking learners of Chinese
(a language with weak features, hence lacking verb movement) are very accurate in positioning verbs in Chinese, even at the beginner level, showing no
evidence of optional verb placement.
In fact, Eubank’s assumption that raising of finite verbs will be optional
appears to be a stipulation which does not follow from any particular theory
of feature strength: if features have no strength, there is nothing to motivate
verb raising, since this requires a strong feature value (Robertson and Sorace,
1999; Schwartz, 1998b). Prévost and White (2000) provide evidence that finite
verbs in adult L2 French and German fail to appear in non-finite positions (i.e.,
unraised); instead, they occur almost exclusively in positions appropriate for
finite verbs, suggesting that inertness cannot be involved.
In its early instantiation, Eubank’s proposal was not unlike (indeed, was
modeled on) similar proposals that features in L1 acquisition are initially
underspecified (e.g., Hyams, 1996; Wexler, 1994). Although a grammar with
underspecified features is in some sense defective, underspecification in L1 is
assumed to be a temporary property. Similarly, Eubank originally assumed
inertness to be a passing phase in the interlanguage representation, with L2
feature strength ultimately attainable.
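The predictions that the four initial-state proposals make for adverb placement in our running example can be gathered into one short sketch. The Python below is a hedged summary of the discussion in this section, not a formalization offered by any of the authors cited: the hypothesis labels, constant names, and function signature are assumptions introduced here, and only the finite verb and adverb orders of (7c) and (7d) are modeled.

```python
RAISED = "S V Adv O"    # order in (7d), e.g. *Mary sees rarely John
UNRAISED = "S Adv V O"  # order in (7c), e.g. Mary rarely sees John


def predicted_orders(hypothesis, l1_strong, l2_strong):
    """Return the set of verb/adverb orders predicted for the initial state.

    l1_strong / l2_strong: whether features in I are strong in the L1 / L2.
    """
    if hypothesis == "full_transfer_full_access":
        # The L1 grammar is the initial state, so the L1 value is used.
        return {RAISED} if l1_strong else {UNRAISED}
    if hypothesis == "full_access":
        # L2 categories and values are available from the start.
        return {RAISED} if l2_strong else {UNRAISED}
    if hypothesis == "minimal_trees":
        # No I projection, so there is nowhere for the verb to raise to.
        return {UNRAISED}
    if hypothesis == "inert_features":
        # Feature strength is unspecified, so raising is optional.
        return {RAISED, UNRAISED}
    raise ValueError(f"unknown hypothesis: {hypothesis}")


if __name__ == "__main__":
    # French-speaking learners of English: strong L1 value, weak L2 value.
    for name in (
        "full_transfer_full_access",
        "full_access",
        "minimal_trees",
        "inert_features",
    ):
        print(name, sorted(predicted_orders(name, l1_strong=True, l2_strong=False)))
```

For French-speaking learners of English, only Full Transfer/Full Access predicts the raised order alone, and only the inertness proposal predicts both orders. This is why the adverb-placement data discussed above bear on the choice among these proposals.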
6.3 Beyond the initial state
Initial state theories necessarily have implications for the nature of representation during the course of development, as well as for endstate representation
(that is, the steady state interlanguage grammar). According to FTFA, while
the L1 grammar forms the interlanguage initial state, restructuring takes place
in response to L2 input; hence, convergence on the relevant L2 properties is
possible, though not guaranteed, since in some cases the L1 grammar may
appear to accommodate the L2 input adequately and thus change will not
be triggered. Divergent outcomes, then, would not be surprising, but the
interlanguage representation is nevertheless assumed to be UG-constrained.
There are researchers who agree with Schwartz and Sprouse that the L1
grammar is the initial state but who maintain that at least some (and possibly all)
L1 features and feature values remain in the interlanguage representation, L2
features or feature values not being acquirable (Hawkins, 1998; Hawkins and
Chan, 1997; Liceras, Maxwell, Laguardia, Fernández, and Fernández, 1997; Smith
and Tsimpli, 1995). This means that development in the form of restructuring
toward a more appropriate functional structure for the L2 is not expected.
On Epstein et al.’s proposal, change or development in the domain of functional categories is likewise not expected, though for a different reason: all
categories (including L2 categories) are present from early stages. Convergence
on the L2 grammar, then, is guaranteed (Flynn, 1996, p. 150). The only kind of
development to be expected is in the surface instantiation of abstract categories
in the language-particular morphology of the L2. The Minimal Trees Hypothesis
also appears to predict eventual convergence on the L2 functional properties,
as L2 functional categories are gradually added, in response to the L2 input.
Whether predicting ultimate divergence from or convergence on the L2 grammar, the above researchers agree that the interlanguage representation does
not suffer from any essential long-term impairment, that it ends up with characteristics of a natural language, be it the L1, the L2, or some other language.
This contrasts with recent proposals that the interlanguage representation suffers from a permanent deficit, rendering it unlike natural languages, hence not
fully UG-constrained.
In recent work, Beck (1998) has suggested that inert feature values are a permanent phenomenon, a proposal also adopted by Eubank in later work (e.g.,
Eubank and Grace, 1998). In other words, the interlanguage representation is
assumed to be defective not just initially and temporarily but permanently. In
terms of our example, this means that variable word orders in the case of
English-speaking learners of French or French-speaking learners of English are
predicted to be found even in the endstate. The results of Yuan (2000), mentioned above, argue against this claim: Yuan demonstrates that L2 learners can
indeed reset feature strength to the value appropriate for the L2, even when the
L1 value is different (as is the case for the French-speaking learners of Chinese),
and that there is no variability in word order at any level of proficiency.
Meisel (1997) proposes more global impairment to functional (and other) properties. He argues that interlanguage grammars are of an essentially different
nature from those found in L1 acquisition. He points to differences between L1
and L2 acquisition: in L1 acquisition, the position of the verb is determined by
finiteness (compare (5a) and (5b)), whereas, according to Meisel, in L2 acquisition it is not. Prévost and White (2000) provide counter-arguments and data
that show that verb placement is not as free as Meisel suggests.
In order to investigate the nature of the interlanguage representation in the
functional domain, some of the researchers discussed above have considered
both morphological properties (namely whether inflection is present or absent,
accurate or faulty) and syntactic ones (whether there are alternations suggestive of verb movement to higher functional projections). Thus, Vainikka and
Young-Scholten (1994) argued that the early interlanguage exhibits both a
lack of verbal morphology and a lack of word orders indicating movement;
Eubank (1993/4) argued that syntactic optionality is associated with absence
of inflection; Meisel (1997) argued that both interlanguage morphology and
interlanguage verb placement are variable.
But what is one to conclude if syntactic reflexes of feature strength are
demonstrably present and morphological ones are lacking or not robustly
present? If the interlanguage contains a full complement of functional categories, it might seem somewhat mysterious that L2 learners reveal problems in
the domain of morphology associated with functional categories, such as verb
inflection. If functional categories are in place, and in place early, why should
L2 learners have problems with morphology? Yet it is well known that they
exhibit variability in their use of inflection, with tense and agreement morphology sometimes present and sometimes absent in L2 production.
This issue is addressed by Lardiere (1998a, 1998b), who provides a case
study of an adult L2 English speaker, Patty, whose L1 is Chinese and whose
interlanguage grammar is clearly at its endstate. Patty reveals a lack of consistency in her use of English inflectional morphology: tense marking on verbs in
spontaneous production is at about 35 percent, while 3rd person singular
agreement is less than 17 percent. At the same time, Patty shows full command of a variety of syntactic phenomena which suggest that tense and agreement are represented in her grammar, with appropriate weak values. For
example, Patty shows 100 percent correct incidence of nominative case assignment (nominative case being checked in I, hence implicating this functional
category) and complete knowledge of the fact that English verbs do not raise.
In other words, she shows no variability in verb placement with respect to
adverbs or negation. Word orders like (7b) and (7d) are never found; rather
she consistently produces orders like (7a) and (7c), suggesting that verbal
features are appropriately weak. According to Eubank and Grace (1998), if
interlanguage grammars have permanently inert features, then learners with
an L1 with weak features, such as Chinese, learning an L2 also with weak
features, like English, should allow optional verb movement. However, Lardiere
shows that Patty’s interlanguage grammar disallows verb movement and that
her problems are not due to any deficit in functional features as such. Even in
the absence of appropriate inflectional morphology, functional categories and
their feature specifications are present in the grammar and function in ways
appropriate for the L2. In this case, then, the underlying grammar does in fact
converge on the native grammar, though the surface morphology is divergent,
in the sense that it is often absent.
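The logic of this argument can be made concrete with a toy sketch. It is an illustration only and not part of Lardiere's or Eubank and Grace's analyses: the function names, the three feature labels, and the string labels for word order are all assumptions introduced here. The sketch encodes the idea that a strong feature forces the verb to raise past an adverb, a weak feature blocks raising, and an inert feature leaves raising optional.

```python
# Toy illustration (all names hypothetical): which verb/adverb orders a
# grammar permits, given the value of the relevant verbal feature.

RAISED = "verb-adverb"      # the verb has moved past the adverb
UNRAISED = "adverb-verb"    # the verb has not raised


def permitted_orders(feature_strength):
    """Return the set of verb/adverb orders permitted under a feature value."""
    if feature_strength == "strong":
        return {RAISED}
    if feature_strength == "weak":
        return {UNRAISED}
    if feature_strength == "inert":
        # Optional verb movement: both orders are expected to occur.
        return {RAISED, UNRAISED}
    raise ValueError(f"unknown feature strength: {feature_strength!r}")


def predicts_variability(feature_strength):
    """True if the feature value predicts more than one word order."""
    return len(permitted_orders(feature_strength)) > 1


for value in ("strong", "weak", "inert"):
    print(value, sorted(permitted_orders(value)), predicts_variability(value))
```

On this simplified picture, a learner who consistently produces only the unraised order patterns with the weak value; the inert value predicts variability, which is what Patty's production fails to show.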
Lardiere argues that this divergence reflects a problem in mapping from
abstract categories to their particular surface morphological manifestations. This
problem in surface mapping is very different from the impairment to the
grammar implied by inert features. In the former case, abstract properties are
present and the grammar shows reflexes of feature strength, such as appropriate
case marking and word order. There is nothing in UG that says that past tense
in English must be realized by a morpheme /-ed/ or that agreement must
manifest itself as /-s/ in the 3rd person singular. Yet it is this realization that
is problematic, rather than the syntactic consequences of tense or agreement.
To conclude this section, while the issues are by no means resolved, it seems
clear that we have left behind the more general, global question (is there
access to UG?) and are now probing quite intricate properties of the interlanguage representation, in order to understand the nature of the grammar
that the learner creates to account for the L2. (Of course, the issue of UG involvement is still central, since a grammar constrained by UG will be different in
nature from one that is not.) Interesting conceptual questions are being raised:
does it make sense to think of an interlanguage representation as being defective
in one domain (morphological mapping) but not another (syntax); does it
make sense to think of some features being impaired but not others? If the
interlanguage representation indeed draws on a variety of knowledge sources
(UG, the L1, etc.), how do these come together?
7 Beyond Representation
UG is a theory relevant to the issue of linguistic competence, a theory as to the
nature of grammatical representation. Although UG provides constraints on
possible grammars in the course of acquisition, it is not, of itself, a theory of
acquisition. This point is often misunderstood, perhaps because of terms like
“Language Acquisition Device” (LAD) (Chomsky, 1965), which many people
in the past equated with UG. It would be more accurate to think of UG as a
component within an LAD or as part of a language faculty. A theory of language acquisition will also have to include learning principles, processing
principles, triggering algorithms, etc.
In other words, in addition to a theory of constraints on interlanguage representation, we need a theory of how that representation is acquired, a theory
of development (whether we are talking about L1 or L2 acquisition). A number
of researchers have pointed out that theories of acquisition must explain both
the representational problem (what L2 learners come to know) and the developmental problem (how they attain this knowledge) (e.g., Carroll, 1996; Felix,
1987; Gregg, 1996; Klein and Martohardjono, 1999). Most research looking at
the operation of UG in second language acquisition has focused on the nature of
the L2 learner’s grammar, looking for evidence for or against the involvement
of principles and parameters of UG, and exploring the nature of the initial state
and subsequent grammars. These are representational issues, as we have seen.
Even if one looks for UG-based properties in learner grammars at various
points in time, this is a question of representation rather than development. A
representational theory is not the same as a developmental one; there is clearly
a need for both and room for both. A representational theory makes claims
about what learner grammars are like (a grammar at time X conforms to
property X and at time Y to property Y) but does not seek to explain how or
why grammars develop in a particular way. We should bear in mind that UG
itself is not a learning theory; it can only interact with other theories that try to
explain development.
To account for grammar change (i.e., development), one needs a theory of
how the L2 input interacts with the existing grammar, what properties of the
input act as triggers for change, what properties force changes to the current
representation, what might drive stages of acquisition. Some L2 learnability
work has looked into these kinds of questions (the role of positive and negative evidence, learning principles, proposals that grammar change is failure
driven, possible triggers in the input, etc.) (e.g., Schwartz and Sprouse, 1994;
Trahey and White, 1993; White, 1991a). However, this is an area where much
remains to be done.
Another issue is relevant in this context. In the field of second language
acquisition, there is often a confusion between competence (in the sense of
underlying linguistic representation) and performance (use of that representation to understand and produce language). People often look at L2 performance,
note that it differs from that of native speakers, and argue that this demonstrates
essential defects in competence, or lack of UG (the comparative fallacy again).
But it is in fact possible that L2 learners’ underlying competence is to some extent
hidden by performance factors, such as the demands of processing or parsing.
Knowledge and use of knowledge do not always coincide. In recent years, there
has been an increase in research which investigates how the interlanguage
mental representation is accessed during processing, seeking to determine how
the representation is used on-line and off-line and the extent to which processing pressures may mask competence (e.g., Juffs and Harrington, 1995; Schachter
and Yip, 1990). Again, this is an area where more research is needed.
8 Conclusion
It is not the aim of UG-based theories of second language acquisition to account for all aspects of L2 development. These theories concentrate largely on
the nature of unconscious interlanguage knowledge. I have argued that it is
not necessary to show that the interlanguage representation is identical to the
grammars of native speakers of the L2 in order to demonstrate that the representation is constrained by UG. The pursuit of interlanguage representation has
led to a number of interesting and competing proposals: that interlanguage
grammars are natural language grammars, constrained by UG (on some accounts, restricted to L1 properties, on other accounts not), versus that interlanguage grammars suffer from impairments (permanent, according to some
researchers). The local impairment position contrasts with earlier views which
assumed a more global deficit, in the form of a total inability to reset parameters (e.g., Clahsen and Muysken, 1989).
In conclusion, it is important to bear in mind that claims for UG operation
in L2 acquisition are simply claims that interlanguage grammars will fall
within a limited range, that the “hypothesis space” is specified by UG. As
Dekydtspotter et al. (1998, p. 341, n. 1) point out: “Given that the sole ‘role’ of
UG is to restrict the hypothesis space available to the language acquirer, Full
Restriction might be a more perspicuous name than the standard Full Access.”
If we have to use such terms at all, this one has many advantages, since it
focuses our attention on properties of the learner’s representation, while at the
same time reminding us that the restrictions come from UG.
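The idea of UG as a restriction on the hypothesis space can be sketched in a few lines. This is a minimal illustration under stated assumptions: the candidate "grammars", the property `obeys_principle_p`, and the function names are all invented here and stand in for no particular principle or analysis. The point it shows is that a learner grammar can differ from the native grammar and still fall inside the permitted range.

```python
# Toy illustration only: every name and every "grammar" below is hypothetical.
# UG is modelled as a filter on candidate grammars, not as a learning procedure.

def restrict(hypothesis_space, constraints):
    """Return the candidate grammars that satisfy every constraint."""
    return [g for g in hypothesis_space if all(c(g) for c in constraints)]


def obeys_principle_p(grammar):
    """Stand-in for some UG principle; reads a made-up boolean property."""
    return grammar["obeys_principle_p"]


# Invented candidates. "native_L2" stands for the target grammar and
# "interlanguage" for a learner grammar that differs from it.
candidates = [
    {"name": "native_L2", "obeys_principle_p": True},
    {"name": "interlanguage", "obeys_principle_p": True},
    {"name": "wild", "obeys_principle_p": False},
]

allowed = restrict(candidates, [obeys_principle_p])
allowed_names = [g["name"] for g in allowed]
print(allowed_names)
```

Nothing in the filter says how a learner moves from one permitted grammar to another; that would be the job of a separate developmental theory.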
NOTES
1 For a more recent treatment of this
phenomenon, see Noguchi (1997).
2 The examples are drawn from Kanno
(1997). The following abbreviations
are used: NOM = nominative; ACC =
accusative; TOP = topic.
3 Of course native speaker control
groups should be included in
experiments in order to make sure
that the test instrument achieves
what it is meant to test. This is a
different matter.
4 For purposes of exposition, I ignore
analyses that have tense (T) and
agreement (Agr) heading their own
projections (e.g., Pollock, 1989).
5 Where features are weak, feature
checking is achieved by the
mechanism of covert movement
(Chomsky, 1995).
6 This is an oversimplification, which
I will adopt for the sake of the
argument. See Pollock (1989).
7 See White (1992a) and Schwartz and
Sprouse (2000) for analyses that
account for these data in a full
transfer framework.
8 In fact, Epstein et al. (1996, p. 751)
reject this possibility as well, so
that it is impossible to determine
their precise position on the initial
state.
REFERENCES
Adjémian, C. 1976: On the nature of
interlanguage systems. Language
Learning, 26, 297–320.
Archangeli, D. and Langendoen, T. (eds)
1997: Optimality Theory: An Overview.
Oxford: Blackwell.
Beck, M. 1998: L2 acquisition and
obligatory head movement: English-speaking learners of German and the
local impairment hypothesis. Studies
in Second Language Acquisition, 20,
311–48.
Bley-Vroman, R. 1983: The comparative
fallacy in interlanguage studies: the
case of systematicity. Language
Learning, 33, 1–17.
Bley-Vroman, R. 1990: The logical
problem of foreign language learning.
Linguistic Analysis, 20, 3–49.
Borer, H. 1984: Parametric Syntax.
Dordrecht: Foris.
Carroll, S. 1996: Parameter-setting
in second language acquisition:
explanans and explanandum.
Behavioral and Brain Sciences, 19,
720–1.
Chomsky, N. 1965: Aspects of the Theory
of Syntax. Cambridge, MA: MIT Press.
Chomsky, N. 1981: Lectures on
Government and Binding. Dordrecht:
Foris.
Chomsky, N. 1995: The Minimalist
Program. Cambridge, MA: MIT Press.
Clahsen, H. and Muysken, P. 1989: The
UG paradox in L2 acquisition. Second
Language Research, 5, 1–29.
Clahsen, H., Eisenbeiss, S., and Vainikka,
A. 1994: The seeds of structure: a
syntactic analysis of the acquisition
of Case marking. In T. Hoekstra and
B. D. Schwartz (eds), Language
Acquisition Studies in Generative
Grammar. Amsterdam: John
Benjamins, 85–118.
Cook, V. 1988: Chomsky’s Universal
Grammar: An Introduction. Oxford:
Blackwell.
Cook, V. 1997: Monolingual bias in second
language acquisition research. Revista
Canaria de Estudios Ingleses, 34, 35–49.
Corder, S. P. 1967: The significance of
learners’ errors. International Review of
Applied Linguistics, 5, 161–70.
Dekydtspotter, L., Sprouse, R., and
Anderson, B. 1998: Interlanguage
A-bar dependencies: binding
construals, null prepositions and
Universal Grammar. Second Language
Research, 14, 341–58.
duPlessis, J., Solin, D., Travis, L., and
White, L. 1987: UG or not UG, that is
the question: a reply to Clahsen and
Muysken. Second Language Research, 3,
56–75.
Emonds, J. 1978: The verbal complex
V′–V in French. Linguistic Inquiry, 9,
151–75.
Epstein, S., Flynn, S., and
Martohardjono, G. 1996: Second
language acquisition: theoretical and
experimental issues in contemporary
research. Behavioral and Brain Sciences,
19, 677–758.
Eubank, L. 1993/4: On the transfer of
parametric values in L2 development.
Language Acquisition, 3, 183–208.
Eubank, L. 1994: Optionality and the
initial state in L2 development. In
T. Hoekstra and B. D. Schwartz
(eds), Language Acquisition Studies
in Generative Grammar. Amsterdam:
John Benjamins, 369–88.
Eubank, L. and Grace, S. 1998: V-to-I
and inflection in non-native grammars.
In M. Beck (ed.), Morphology and its
Interfaces in Second Language Knowledge. Amsterdam:
John Benjamins, 69–88.
Felix, S. 1987: Cognition and Language
Growth. Dordrecht: Foris.
Finer, D. and Broselow, E. 1986: Second
language acquisition of reflexive binding. In S. Berman, J.-W. Choe,
and J. McDonough (eds), Proceedings
of NELS 16. Amherst, MA: Graduate
Linguistics Students Association,
154–68.
Flynn, S. 1996: A parameter-setting
approach to second language
acquisition. In W. Ritchie and
T. Bhatia (eds), Handbook of Second Language
Acquisition. San Diego: Academic
Press, 121–58.
Gregg, K. R. 1996: The logical and
developmental problems of second
language acquisition. In W. Ritchie
and T. Bhatia (eds), Handbook of
Second Language Acquisition. San
Diego: Academic Press, 49–81.
Grimshaw, J. and Rosen, S. T. 1990:
Knowledge and obedience: the
developmental status of the
binding theory. Linguistic Inquiry,
21, 187–222.
Grondin, N. and White, L. 1996: Functional
categories in child L2 acquisition of
French. Language Acquisition, 5, 1–34.
Hale, K. 1996: Can UG and the L1 be
distinguished in L2 acquisition?
Behavioral and Brain Sciences, 19,
728–30.
Hawkins, R. 1998: The inaccessibility of
formal features of functional categories
in second language acquisition. Paper
presented at the Pacific Second
Language Research Forum. Tokyo,
March.
Hawkins, R. and Chan, Y.-C. 1997: The
partial availability of Universal Grammar
in second language acquisition: the
“failed features” hypothesis. Second
Language Research, 13, 187–226.
Hawkins, R., Towell, R., and Bazergui,
N. 1993: Universal Grammar and the
acquisition of French verb movement
by native speakers of English. Second
Language Research, 9, 189–233.
Hyams, N. 1996: The underspecification
of functional categories in early
grammar. In H. Clahsen (ed.),
Generative Perspectives on Language
Acquisition: Empirical Findings,
Theoretical Considerations, Crosslinguistic
Comparisons. Amsterdam: John
Benjamins, 91–127.
Juffs, A. and Harrington, M. 1995:
Parsing effects in second language
sentence processing: subject and object
asymmetries in wh-extraction. Studies
in Second Language Acquisition, 17,
483–516.
Kanno, K. 1997: The acquisition of null
and overt pronominals in Japanese by
English speakers. Second Language
Research, 13, 265–87.
Klein, E. 1995: Evidence for a “wild” L2
grammar: when PPs rear their empty
heads. Applied Linguistics, 16, 87–117.
Klein, E. and Martohardjono, G. 1999:
Investigating second language
grammars: some conceptual and
methodological issues in generative
SLA research. In E. Klein and G.
Martohardjono (eds), The Development
of Second Language Grammars: A
Generative Approach. Amsterdam:
John Benjamins, 3–34.
Lardiere, D. 1998a: Case and tense in
the “fossilized” steady state. Second
Language Research, 14, 1–26.
Lardiere, D. 1998b: Dissociating syntax
from morphology in a divergent end-state
grammar. Second Language
Research, 14, 359–75.
Lardiere, D. 2000: Mapping features to
forms in second language acquisition.
In J. Archibald (ed.), Second Language
Acquisition and Linguistic Theory.
Oxford: Blackwell, 102–29.
Liceras, J. 1983: Markedness, contrastive
analysis and the acquisition of Spanish
as a second language. Ph.D. thesis.
University of Toronto.
Liceras, J., Maxwell, D., Laguardia, B.,
Fernández, Z., and Fernández, R. 1997:
A longitudinal study of Spanish non-native grammars: beyond parameters.
In A. T. Pérez-Leroux and W. Glass
(eds), Contemporary Perspectives on the
Acquisition of Spanish. Vol. 1: Developing
Grammars. Somerville, MA: Cascadilla
Press, 99–132.
Martohardjono, G. 1993: Wh-movement
in the acquisition of a second
language: a crosslinguistic study of
three languages with and without
movement. Ph.D. thesis. Cornell
University.
Martohardjono, G. and Gair, J. 1993:
Apparent UG inaccessibility in
second language acquisition:
misapplied principles or principled
misapplications? In F. Eckman (ed.),
Confluence: Linguistics, L2 Acquisition
and Speech Pathology. Amsterdam: John
Benjamins, 79–103.
Meisel, J. 1997: The acquisition of the
syntax of negation in French and
German: contrasting first and second
language acquisition. Second Language
Research, 13, 227–63.
Montalbetti, M. 1983: After binding: on
the interpretation of pronouns. Ph.D.
dissertation. MIT.
Nemser, W. 1971: Approximative
systems of foreign language learners.
International Review of Applied
Linguistics, 9, 115–23.
Noguchi, T. 1997: Two types of
pronouns and variable binding.
Language, 73, 770–97.
Pérez-Leroux, A. T. and Glass, W. 1997:
OPC effects in the L2 acquisition
of Spanish. In A. T. Pérez-Leroux
and W. Glass (eds), Contemporary
Perspectives on the Acquisition of
Spanish. Vol. 1: Developing Grammars.
Somerville, MA: Cascadilla Press,
149–65.
Pinker, S. 1994: The Language Instinct.
New York: William Morrow.
Pollock, J.-Y. 1989: Verb movement,
Universal Grammar, and the structure
of IP. Linguistic Inquiry, 20, 365–424.
Prévost, P. and White, L. 2000: Missing
surface inflection or impairment in
second language acquisition? Evidence
from tense and agreement. Second
Language Research, 16, 103–33.
Robertson, D. and Sorace, A. 1999:
Losing the V2 constraint. In E. Klein
and G. Martohardjono (eds), The
Development of Second Language
Grammars: A Generative Approach.
Amsterdam: John Benjamins, 317–61.
Schachter, J. 1988: Second language
acquisition and its relationship to
Universal Grammar. Applied
Linguistics, 9, 219–35.
Schachter, J. 1989: Testing a proposed
universal. In S. Gass and J. Schachter
(eds), Linguistic Perspectives on Second
Language Acquisition. Cambridge:
Cambridge University Press, 73–88.
Schachter, J. 1990: On the issue of
completeness in second language
acquisition. Second Language Research,
6, 93–124.
Schachter, J. and Yip, V. 1990:
Grammaticality judgments: why does
anyone object to subject extraction?
Studies in Second Language Acquisition,
12, 379–92.
Schwartz, B. D. 1987: The modular basis
of second language acquisition. Ph.D.
dissertation. University of Southern
California.
Schwartz, B. D. 1993: On explicit and
negative data effecting and affecting
competence and “linguistic behavior.”
Studies in Second Language Acquisition,
15, 147–63.
Schwartz, B. D. 1998a: On two
hypotheses of “Transfer” in
L2A: minimal trees and absolute
L1 influence. In S. Flynn,
G. Martohardjono, and W. O’Neil
(eds), The Generative Study of Second
Language Acquisition. Mahwah, NJ:
Lawrence Erlbaum Associates, 35–59.
Schwartz, B. D. 1998b: The second
language instinct. Lingua, 106,
133–60.
Schwartz, B. D. and Sprouse, R. 1994:
Word order and nominative case in
nonnative language acquisition: a
longitudinal study of (L1 Turkish)
German interlanguage. In T. Hoekstra
and B. D. Schwartz (eds), Language
Acquisition Studies in Generative
Grammar. Amsterdam: John Benjamins,
317–68.
Schwartz, B. D. and Sprouse, R. 1996: L2
cognitive states and the full transfer/
full access model. Second Language
Research, 12, 40–72.
Schwartz, B. D. and Sprouse, R. 2000:
When syntactic theories evolve:
consequences for L2 acquisition
research. In J. Archibald (ed.), Second
Language Acquisition and Linguistic
Theory. Oxford: Blackwell, 156–86.
Selinker, L. 1972: Interlanguage.
International Review of Applied
Linguistics, 10, 209–31.
Smith, N. and Tsimpli, I.-M. 1995: The
Mind of a Savant. Oxford: Blackwell.
Sorace, A. 1993: Incomplete and
divergent representations of
unaccusativity in non-native
grammars of Italian. Second Language
Research, 9, 22–48.
Sprouse, R. 1998: Some notes on the
relationship between inflectional
morphology and parameter setting in
first and second language acquisition.
In M. Beck (ed.), Morphology and its
Interfaces in Second Language Knowledge.
Amsterdam: John Benjamins, 41–67.
Thomas, M. 1991a: Do second language
learners have “rogue” grammars of
anaphora? In L. Eubank (ed.), Point
Counterpoint: Universal Grammar in the
Second Language. Amsterdam: John
Benjamins, 375–88.
Thomas, M. 1991b: Universal Grammar
and the interpretation of reflexives in a
second language. Language, 67, 211–39.
Trahey, M. and White, L. 1993: Positive
evidence and preemption in the second
language classroom. Studies in Second
Language Acquisition, 15, 181–204.
Vainikka, A. and Young-Scholten, M.
1994: Direct access to X′-theory:
evidence from Korean and Turkish
adults learning German. In T.
Hoekstra and B. D. Schwartz (eds),
Language Acquisition Studies in
Generative Grammar. Amsterdam: John
Benjamins, 265–316.
Vainikka, A. and Young-Scholten, M.
1996: Gradual development of L2
phrase structure. Second Language
Research, 12, 7–39.
Vainikka, A. and Young-Scholten, M.
1998: The initial state in the L2
acquisition of phrase structure. In
S. Flynn, G. Martohardjono, and
W. O’Neil (eds), The Generative Study
of Second Language Acquisition.
Mahwah, NJ: Lawrence Erlbaum
Associates, 17–34.
Wexler, K. 1994: Optional infinitives,
head movement and the economy of
derivations. In D. Lightfoot and N.
Hornstein (eds), Verb Movement.
Cambridge: Cambridge University
Press, 305–50.
White, L. 1985: Is there a logical problem
of second language acquisition? TESL
Canada, 2, 29–41.
White, L. 1989a: The principle of
adjacency in second language
acquisition: do L2 learners observe
the subset principle? In S. Gass and
J. Schachter (eds), Linguistic
Perspectives on Second Language
Acquisition. Cambridge: Cambridge
University Press, 134–58.
White, L. 1989b: Universal Grammar and
Second Language Acquisition.
Amsterdam: John Benjamins.
White, L. 1990: Second language
acquisition and universal grammar.
Studies in Second Language Acquisition,
12, 121–33.
White, L. 1991a: Adverb placement in
second language acquisition: some
effects of positive and negative
evidence in the classroom. Second
Language Research, 7, 133–61.
White, L. 1991b: Argument structure
in second language acquisition. Journal
of French Language Studies, 1, 189–207.
White, L. 1991c: The verb-movement
parameter in second language
acquisition. Language Acquisition, 1,
337–60.
White, L. 1992a: Long and short verb
movement in second language
acquisition. Canadian Journal of
Linguistics, 37, 273–86.
White, L. 1992b: Subjacency violations
and empty categories in L2 acquisition.
In H. Goodluck and M. Rochemont
(eds), Island Constraints. Dordrecht:
Kluwer, 445–64.
White, L. 1996: Universal grammar
and second language acquisition:
current trends and new directions.
In W. Ritchie and T. Bhatia (eds),
Handbook of Second Language Acquisition.
San Diego: Academic Press, 85–120.
Yuan, B. 2000: Is thematic verb raising
inevitable in the acquisition of a
nonnative language? In C. Howell,
S. Fish, and T. Keith-Lucas (eds),
Proceedings of the 24th Annual Boston
University Conference on Language
Development. Somerville, MA:
Cascadilla Press, 797–807.
3 The Radical Middle: Nativism without Universal Grammar
WILLIAM O’GRADY
1 Introduction
A phenomenon as puzzling and complex as language acquisition is no doubt
worthy of the controversy that its study has engendered. Indeed, it would be
unreasonable to expect a broad consensus on such a profoundly mysterious
phenomenon after a mere 30 or 40 years of investigation, much of it focused
on the acquisition of a single language.
Under these circumstances, the most that can perhaps be hoped for in the near
term is some agreement on the research questions that need to be addressed
and on the merits and shortcomings of the various explanatory ideas that are
currently being pursued. In the longer term, of course, one hopes for a convergence of views, and even now there is some indication that this has begun in
a limited way, as I will explain below. Nonetheless, for the time being at least,
there is still ample room for disagreement on many important points.
The purpose of this chapter is to outline a view of language acquisition –
both first and second – that is sometimes referred to as “general nativism.” I
will begin in the next section by offering an overview of this approach, including
its principal claims and the major challenges that it faces. Section 3 outlines
a general nativist theory of syntactic representations with respect to a well-established asymmetry in the development of relative clauses in the course of
first and second language acquisition. Section 4 addresses the possible advantages of general nativism compared to other theories of language acquisition.
2 Defining General Nativism
There is a near-consensus within contemporary linguistics (which I will not
question here) that language should be seen as a system of knowledge – a sort
of “mental grammar” consisting of a lexicon that provides information about
the linguistically relevant properties of words and a computational system
that is responsible for the formation and interpretation of sentences.
The details of the computational system and even of the lexicon are the subject
of ongoing dispute, of course, but there is substantial agreement on a number
of points. For instance, it seems clear that the grammar for any human language
must assign words to categories of the appropriate type (noun, verb, etc.), that
it must provide a set of mechanisms for combining words into phrases and
sentences with a particular internal architecture, and that it must impose constraints on phenomena such as “movement” and pronoun interpretation.
What makes matters especially interesting for theories of language acquisition is that grammars that include even these basic and relatively uncontroversial mechanisms are underdetermined by experience in significant ways.
As far as we can tell, for instance, the input to the acquisition process (i.e., the
speech of others) includes no direct information about the criteria for category
membership, the architecture of syntactic representations, or the content of
constraints on movement and pronoun interpretation. (For a general review,
see O’Grady, 1997, pp. 249 ff.) How then can a language be acquired?
Theories of linguistic development typically address this problem by assuming that children are endowed with an “acquisition device” – an innate system
that both guides and supplements the learner’s interaction with experience.
This much is accepted by a broad spectrum of researchers ranging from Slobin
(e.g., 1985, p. 1158) to Chomsky (e.g., 1975, p. 13), but differences arise on one
important point. In one class of acquisition theories, a significant portion of
the grammar is taken to be “given in advance” by the acquisition device. This
grammatical component of the inborn acquisition device is known as Universal
Grammar, or UG – a system of categories and principles that is taken to
determine many of the core properties of human language (see figure 3.1).
Such theories are instances of what might be called “grammatical nativism,”
since they adopt the view that the innate endowment for language includes
actual grammatical categories and principles. Elsewhere, I have referred to this
view as “special nativism” (O’Grady, 1997, p. 307), because of its commitment
to the existence of innate mechanisms with a specifically grammatical character
(see also White, this volume).
Grammatical nativism contrasts with “general nativism,” which posits an
innate acquisition device but denies that it includes grammatical categories or
principles per se. According to this view (which might also be labeled “cognitive
nativism” or “emergentism,” as is more common these days), the entire grammar
is the product of the interaction of the acquisition device with experience; no
grammatical knowledge is inborn (see figure 3.2) (see Ellis, this volume).
Figure 3.1 The UG-based acquisition device (diagram labels: Experience; Acquisition device, containing UG; Grammar)
Figure 3.2 The general nativist acquisition device (diagram labels: Experience; Acquisition device; Grammar)
Later in this chapter, I will suggest that there are some signs of convergence
between general nativism and recent versions of grammatical nativism. For
now, though, I would like to emphasize the profound historical difference
between the two views. UG is not simply the name for whatever mechanisms
happen to be involved in grammatical development. As I interpret the literature
on grammatical nativism, proponents of the view that UG is part of the acquisition device subscribe to a very strong claim about its content and character –
namely, that it is an autonomous system of grammatical categories and principles
– autonomous in the sense that it is not reducible to non-linguistic notions and
grammatical in the sense that it is primarily concerned with matters of well-formedness, not parsing or processing or other types of language-related cognition. (For detailed discussion, see Newmeyer, 1998.) All varieties of general
nativism reject these assumptions, however much they may disagree on what
the acquisition device actually does comprise.
Skepticism concerning UG is widespread in the field of language acquisition
research. Relatively little of the literature on first language acquisition is couched
within a UG framework, and the same seems to be true of the literature on
second language acquisition as well. In addition to the huge amount of work
that simply ignores UG, there is also a substantial and varied literature that
explicitly rejects it in one form or another. This includes work by Martin
Braine (1987), Dan Slobin (1985), Melissa Bowerman (1990), and Michael
Tomasello (1995) (among many others) on first language acquisition and work
by Eric Kellerman (Kellerman and Yoshioka, 1999), Fred Eckman (1996), Kate
Wolfe-Quintero (1992, 1996), and others on second language acquisition. It
should be noted, though, that there is no unified general nativist approach to
language acquisition and certainly no agreement on the particular views that I
outline in the remainder of this chapter.
As I see it, the principal limitation of most work on general nativism lies in
its failure to develop a theory of learnability and development that is tied to an
explicit and comprehensive theory of grammar (see also Gregg, 1996). Most
non-UG work is quite casual in its approach to syntax: the phenomena whose
acquisition is being investigated are typically analyzed informally and on a case-by-case basis, without reference to an overarching syntactic theory. By contrast,
work in the special nativist tradition has not only put forward a theory of learnability (built around an inborn UG) but linked it to a far-reaching and explicit
theory of grammar (transformational grammar in its various incarnations).
For reasons that I will discuss further below, the most promising theories of
language posit explanatory principles that make reference to phonological,
syntactic, and semantic representations of various sorts. Yet the vast majority of
work on general nativism either makes no reference to such representations or
adopts a very casual view as to their properties, typically avoiding any explicit
proposal about their architecture or ontogeny.
A good illustration of this point comes from an important body of research
on the acquisition of relative clauses by second language learners (e.g., Doughty,
1991; Eckman, Bell, and Nelson, 1988; Gass, 1979, 1980). This work has yielded
a robust and interesting finding: subject relative clauses such as (1) are easier
than direct object relatives such as (2) for second language learners. (The same
seems to be true for first language acquisition, all other things being equal; see
O’Grady, 1997, p. 179 for discussion.)
(1) Subject relative:
the truck that [_ pushed the car]
(2) Object relative:
the truck that [the car pushed _]
Further, it has been observed that this finding parallels an important generalization in syntactic typology dating back at least to Keenan and Comrie (1977):
direct object relatives are more marked than subject relatives. (That is, some
languages have only subject relatives, but any language with direct object
relatives must also permit subject relatives.)
The developmental pattern and its relationship to Keenan and Comrie’s
typological generalization raise questions that force us to address the two
principal explanatory challenges confronting contemporary linguistics:
i Why is language the way it is (e.g., why do all languages with direct object
relatives also have subject relatives, but not vice versa)?
ii How is it acquired (e.g., why are subject relatives easier for language learners than direct object relatives)?
It is my position that neither of these questions can be answered without
reference to hierarchically structured symbolic representations. On this view,
then, the first priority for general nativism must be a theory of syntactic representations that includes a proposal about their composition and architecture.
3 A General Nativist Theory of Representations
In a number of recent publications (e.g., O’Grady, 1996, 1997, 1998), I have put
forward the outlines of a general nativist theory of syntactic representations.
As I see it, the key to such a theory lies in two propositions. First, syntactic
categories, which are treated as purely formal elements in special nativism, must
be reducible to a semantic base. I have made one proposal about precisely how
this might be achieved (O’Grady, 1997, 1998), and other ideas can be found in
the literature on grammatical categories (e.g., Croft, 1991; Langacker, 1987).
Second, contra the view adopted within UG-based approaches to language
acquisition, the computational principles that combine and arrange words to
form phrases and sentences cannot be specifically grammatical in character
(that is, there is no X-bar Schema, no Empty Category Principle, and so forth).
How then do we account for the sorts of grammatical phenomena that have
been the focus of so much linguistic research since the early 1960s?
Figure 3.3 First step in the formation of the sentence Mary speaks French
(Step 1: Combination of the subject and verb: [[N Mary] [V speaks]])
Figure 3.4 Second step in the formation of the sentence Mary speaks French
(Step 2: Combination with the second argument, targeting the verb: [[N Mary] [[V speaks] [N French]]])
In recent work on this matter (e.g., O’Grady, 2001b), I have proposed that
the theory of sentence structure can and should be unified with the theory of
sentence processing. As I see it, the processor itself has no specifically grammatical properties. Rather, its design reflects two more general computational
features – a propensity to operate on pairs of elements (a characteristic of the
arithmetical faculty as well)1 and a propensity to combine functors with their
arguments at the first opportunity (a storage-reducing strategy that I refer to
simply as “efficiency”). The system operates in a linear manner (i.e., “from left
to right”), giving the result depicted in figure 3.3 in the case of a simple
transitive sentence such as Mary speaks French.
In the next step, the verb combines directly with its second argument, an
operation that requires splitting the previously formed phrase in the manner
depicted in figure 3.4. (Such an operation has long been assumed, at least
implicitly, in the literature on sentence processing; see, e.g., Frazier, 1987,
p. 561; Levelt, 1989, p. 242; Marcus, 1980, pp. 79–80.)
Syntactic representations in this type of efficiency-driven computational
system have the familiar binary-branching design, with the subject higher than
the direct object – but not as the result of an a priori grammatical blueprint
such as the X-bar schema. Rather, their properties are in a sense epiphenomenal
– the by-product of a sentence formation process that proceeds from left to
right, combining a verb with its arguments one at a time at the first opportunity.
Syntactic representations are thus nothing more than a residual record of how
the computational system goes about combining words to form sentences.
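The two-step derivation of Mary speaks French can be mimicked with a toy sketch. It rests on a strong simplifying assumption that is not part of the proposal itself: each incoming word combines with the word immediately before it, which happens to hold for this SVO example. The names attach, build, and depth are hypothetical, and nothing here is a general parser.

```python
def attach(tree, word):
    """Combine `word` with the most recently processed word.

    If that word already sits inside a phrase, the phrase is split so that
    the new pair can form (the operation described for the second step).
    Assumption: the incoming word always combines with the previous word.
    """
    if isinstance(tree, str):          # a single word: just pair it up
        return (tree, word)
    left, right = tree                 # otherwise descend the right edge
    return (left, attach(right, word))


def build(words):
    """Process `words` from left to right; return the structure after each step."""
    tree = words[0]
    steps = []
    for word in words[1:]:
        tree = attach(tree, word)
        steps.append(tree)
    return steps


def depth(tree, word, level=0):
    """Number of pairings that dominate `word` (None if the word is absent)."""
    if isinstance(tree, str):
        return level if tree == word else None
    for child in tree:
        found = depth(child, word, level + 1)
        if found is not None:
            return found
    return None


steps = build(["Mary", "speaks", "French"])
for number, tree in enumerate(steps, start=1):
    print(f"Step {number}: {tree}")
```

In the final structure the subject is dominated by one pairing and the direct object by two, so the familiar binary-branching shape with the subject higher than the object falls out of the order of combination rather than from a template.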
The architecture of the proposed syntactic representations offers a promising
account of why subject relatives are easier than direct object relatives. The key
idea is that the relative difficulty (and, by extension, the developmental order) of
structures that contain gaps is determined by the distance (calculated in terms
of intervening nodes) between the gap and its filler (e.g., the nominal modified
by the relative clause). As illustrated in (3) and (4), there is one such node in
the case of subject relatives (i.e., S) and two in the case of object relatives (i.e.,
S and VP):2
(3) Subject relative:
the truck that [S _ pushed the car]
(4) Direct Object relative:
the truck that [S the car [VP pushed _]]
A problematic feature of English is that structural distance is confounded
with linear distance: subject gaps are not only less deeply embedded than
object gaps, they are also linearly closer to the head noun. In order to ensure
that structural distance rather than linear distance is responsible for the contrast in the difficulty of relative clauses, it is necessary to consider the acquisition of languages such as Korean, in which the relative clause precedes the
head. (The verbal suffixes in Korean simultaneously indicate both tense and
clause type. RC = relative clause.)
(5) a. Subject relative:
[S _ namca-lul cohaha-nun] yeca
man-Acc like-RC.Prs woman
“the woman who likes the man”
structural distance: one node (S)
linear distance: two words
b. Direct object relative:
[S Namca-ka [VP _ cohaha-nun]] yeca
man-Nom like-RC.Prs woman
“the woman who the man likes”
structural distance: two nodes (VP and S)
linear distance: one word
If structural distance is the key factor, then the subject relative should be
easier; on the other hand, if linear distance is the key factor, the direct object
relative should be easier. O’Grady, Lee, and Choo (forthcoming) investigated
this matter with the help of a comprehension task (see box 3.1), uncovering a
strong and statistically significant preference for subject relative clauses.
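The two distance measures contrasted above can be computed mechanically from the bracketed examples. The following is a minimal sketch under stated assumptions: the bracket notation is read naively (a token such as "[S" opens a node, each trailing "]" closes one), the gap is written "_", and the function names parse and distances are hypothetical.

```python
def parse(bracketed):
    """Return (word, enclosing_nodes) pairs in linear order.

    Tokens such as "[S" open a node; each trailing "]" on a word closes one.
    """
    stack, words = [], []
    for node_id, token in enumerate(bracketed.split()):
        if token.startswith("["):
            stack.append((node_id, token[1:]))   # (unique id, label)
            continue
        word = token.rstrip("]")
        words.append((word, tuple(stack)))
        for _ in range(len(token) - len(word)):  # one pop per closing bracket
            stack.pop()
    return words


def distances(bracketed, filler):
    """Return (nodes separating gap from filler, number of intervening words)."""
    words = parse(bracketed)
    forms = [word for word, _ in words]
    gap, head = forms.index("_"), forms.index(filler)
    outside = set(words[head][1])
    structural = [node[1] for node in words[gap][1] if node not in outside]
    linear = abs(gap - head) - 1
    return structural, linear


examples = {
    "English subject relative": ("the truck that [S _ pushed the car]", "truck"),
    "English object relative": ("the truck that [S the car [VP pushed _]]", "truck"),
    "Korean subject relative": ("[S _ namca-lul cohaha-nun] yeca", "yeca"),
    "Korean object relative": ("[S namca-ka [VP _ cohaha-nun]] yeca", "yeca"),
}
for name, (sentence, filler) in examples.items():
    nodes, linear = distances(sentence, filler)
    print(f"{name}: structural = {len(nodes)} {nodes}, linear = {linear}")
```

For English both measures favor the subject relative, whereas for Korean they pull apart (one node but two words for the subject relative, two nodes but one word for the object relative), which is what makes Korean the informative test case.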
Box 3.1 The acquisition of relative clauses in Korean as
a second language (O’Grady et al., forthcoming)
Research questions: Is there a subject–object asymmetry in the acquisition of Korean
relative clauses? If so, does it reflect a contrast in linear distance or in structural
distance?
Methodology:
Subjects: 53 native English speakers studying Korean as a second language – 25
second-semester students at the University of Texas at Austin, 20 fourth-semester
students at the same institution, and 8 fourth-semester students at the University of
Hawai’i at Manoa.
Task: Picture selection, in accordance with the following instructions:
Each page of this booklet contains a series of three pictures. As you go to each
page, you will hear a tape-recorded voice describing a person or animal in one
of the three pictures. Your job is simply to put a circle around the person or
animal described in the sentence. (Do NOT put the circle around the entire
box.)
Figure 3.5 presents a sample page from the questionnaire.
Figure 3.5 Sample test items
Subjects who correctly understand relative clauses should circle the right-hand
figure in the third panel in response to a subject relative clause such as (ia) and the
left-hand figure in the second panel in response to a direct object relative such as
(ib):
(i) a. Subject relative clause:
[_ namca-lul cohaha-nun] yeca
man-Acc like-RC.Prs woman
‘the woman who likes the man’
b. Direct object relative clause:
[namca-ka _ cohaha-nun] yeca
man-Nom like-RC.Prs woman
‘the woman who the man likes’
Results: The subjects did far better on subject relative clauses than on direct object
relatives, with scores of 73.2 percent correct on the former pattern compared to only
22.7 percent for the latter. This contrast is highly significant (F = 30.59, p = .0001).
Equally revealing is an asymmetry in reversal errors (i.e., the number of times a
pattern of one type was misanalyzed as a pattern of the other type): direct object
relatives were misunderstood as subject relatives 115 times while subject relatives
were misanalyzed as direct object relatives only 26 times – a clear indication that
subject relatives are the easier pattern.
Conclusion: Learners of Korean as a second language find subject relatives far easier
than direct object relatives, which supports the claim that structural distance between a gap and its filler is the key factor in determining the relative difficulty of
these patterns.
If the structural distance account is correct, we expect to find comparable
asymmetries in the development of other gap-containing structures as well.
Wh-questions are a case in point. As illustrated in (6) and (7), subject and
object wh-questions exhibit a contrast that parallels the asymmetry found in
relative clauses:
(6) Subject wh-question:
Who [S _ met Mary]?
(7) Object wh-question:
Who did [S Mary [VP meet _]]?
The relative difficulty of these two patterns has been studied for both first
language acquisition (Yoshinaga, 1996) and second language acquisition (Kim,
1999) with the help of an elicited production task. Both studies revealed significantly better performance on subject wh-questions and a strong tendency
for these patterns to be used in place of their direct object counterparts, but not
vice versa.
By adopting a particular theory of syntactic representations, then, we are
able to uncover a plausible computational explanation for why object relatives
are more difficult than subject relatives for language learners and for why
object wh-questions are harder than subject wh-questions. This is a potential
step forward, not only because it helps explain the developmental facts, but
also because it sheds light on the typological facts as well.
In particular, it makes sense to think that the cut-off points that languages
adopt in defining the limits for relative clause formation are determined by the
same measure of computational complexity that defines developmental difficulty. Thus, subject relatives – the computationally simplest structure – will be
the most widespread typologically.3 Moreover, any language that allows the
computationally more difficult direct object relatives will also permit the simpler subject relatives. And so on.
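The implicational pattern just described amounts to a simple check: the positions a language can relativize must form an unbroken run starting from the computationally simplest one. The sketch below assumes a three-step ordering (subject, direct object, indirect object; the last step follows note 2), and the language inventories are invented for illustration rather than drawn from any survey.

```python
# Positions ordered from computationally simplest to most demanding
# (assumed ordering; the indirect object step follows note 2).
HIERARCHY = ["subject", "direct object", "indirect object"]


def respects_hierarchy(relativizable):
    """True if the allowed positions form an unbroken run from the top of HIERARCHY."""
    allowed = [position in relativizable for position in HIERARCHY]
    # Once a position is disallowed, no more demanding position may be allowed.
    return all(earlier or not later for earlier, later in zip(allowed, allowed[1:]))


# Hypothetical inventories, for illustration only.
inventories = {
    "subject only": {"subject"},
    "subject and direct object": {"subject", "direct object"},
    "direct object only": {"direct object"},
}
for name, positions in inventories.items():
    print(f"{name}: {'predicted' if respects_hierarchy(positions) else 'excluded'}")
```

Only the third inventory is excluded, matching the generalization that any language with direct object relatives also permits subject relatives.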
This cannot be all there is to it, of course. Syntactic representations have
properties other than just binarity, and syntactic principles make reference to
more than just structural distance. The illustration given here omits many details
in order to make the key point – which is that the best prospects for an
explanatory general nativist theory of language lie in an approach that takes
syntactic representations as its starting point. As we have just seen, reference
to such representations allows us to make a proposal not only about how
language is acquired (e.g., why subject relatives are acquired first) but also
about why language is the way it is (e.g., why any language that allows object
relatives must also allow subject relatives).
The parallels between first and second language acquisition that are manifested in the emergence of relative clauses lend credence to the idea that the
two phenomena are fundamentally alike, at least in some respects. I believe
that this is right, at least insofar as computational operations are concerned.
The matter is hardly clear, though. Indeed, the facts are somewhat difficult to
interpret: as Bley-Vroman (1994, p. 4) has observed, experimental work on
computational principles in second language acquisition has yielded indecisive
results – “better than chance, [but] far from perfect.” Although this seems to
suggest diminished access to the computational mechanisms underlying sentence formation, a less pessimistic view is adopted by Uziel (1993), who follows
Grimshaw and Rosen (1990) in arguing that any indication that learners perform
above the level of chance on contrasts involving computational principles should
be interpreted as evidence for access to those principles – a not unreasonable
proposal in light of the many extraneous factors (e.g., inattention, processing
limitations, vocabulary deficits, nervousness, and so forth) that can interfere
with performance in experimental settings. (See also White, this volume.)
If this is right, then performance on computational principles should improve
as the effect of extraneous factors diminishes. There is already some indication
that this is right: Kanno (1996) investigates the status of a computational principle that is responsible for the asymmetry in the admissibility of case drop in
subject and direct object positions in Japanese (see section 4 for details). Because
the contrast is manifested in very simple sentences, Kanno was able to elicit
grammaticality judgments for sentences that were just two and three words
long, thereby dramatically diminishing the potential effect of extraneous factors.
Interestingly, she reports that adult learners of Japanese as a second language
do not perform significantly differently from native speakers in assessing the
relative acceptability of the two patterns.
Why then are adults such poor language learners? There are a number of
possibilities, of course, two of which I find particularly interesting. First, it is
evident that some parts of the language faculty fare less well than the computational system with the passage of time. For instance, the ability to distinguish
among phonemic contrasts apparently begins to diminish by the age of 12
months (Werker, Lloyd, Pegg, and Polka, 1996), with the result that language
acquisition after age six or so typically results in a foreign accent (Long, 1990,
p. 266). There also appears to be a significant decline in learners’ ability to
exploit subtle semantic contrasts, including those underlying such familiar
phenomena as the the/a contrast in English (Larsen-Freeman and Long, 1991,
p. 89) or the wa/ga (topic/nominative) contrast in Japanese (Kuno, 1973, p. 37;
Russel, 1985, p. 197). This suggests that the acquisition device comprises several
autonomous components (at least a computational module, a perceptual module,
and a conceptual module), each with its own maturational prospects and its
own role to play in shaping the outcome of second language learning.
A second possibility, which focuses just on syntactic deficits (see, e.g.,
O’Grady, 2001a), is that the computational system, while intact, is underpowered in the case of adult language learners. The effects of this deficit are
manifested in patterns which, for one reason or another, place extra demands
on the computational system. One such pattern involves object relative clauses,
which require the establishment of a link between a direct object gap and a
structurally distant filler. As we have seen, both children and adults have
trouble with these patterns compared to subject relative clauses. Interestingly,
similar problems have been observed in agrammatic aphasics (e.g., Grodzinsky,
2000).
Another sort of pattern that may place an extra burden on the computational
system involves double object datives such as (8), compared to their prepositional dative counterparts as in (9):
(8) Double object dative:
agent        goal       theme
The boy sent the donkey the horse.
(9) Prepositional dative:
agent         theme     goal
The boys sent the horse to the donkey.
As observed by Dik (1989), Langacker (1995, pp. 18–20), and Talmy (1988),
among others, the word order employed in the prepositional pattern (agent–theme–goal)
is iconic with the structure of the event, which involves the agent
acting on the theme and then transferring it to the goal, giving the “action
chain” (to employ Langacker’s term) depicted in (10):
(10) agent → theme → goal
Interestingly, the double object dative, with its non-iconic agent–goal–theme
order, is harder to comprehend, both for children in the early stages of language
acquisition (Osgood and Zehler, 1981; Roeper, Lapointe, Bing, and Tavakolian,
1981; Waryas and Stremel, 1974) and for adult second language learners
(Hawkins, 1987; Mazurkewich, 1984; White, 1987). And here again, agrammatic
aphasics have been found to have difficulty with this pattern too (Caplan and
Futter, 1986; Kolk and Weijts, 1996, p. 111; O’Grady and Lee, 2001).
All of this suggests that in the early stages of language acquisition (and
perhaps in the case of agrammatism as well) the computational system may be
too underpowered to reliably execute the more demanding tasks involved in
natural language processing, including dealing with long-distance dependencies
and non-iconic word order. Whereas children routinely overcome this deficit,
its effects in the case of adults may be longer lasting, contributing to the pattern
of partial attainment that is typical of second language learning.
4 The Advantages of General Nativism
In evaluating general nativism, it is useful to compare it with two well-known
alternatives – UG-based special nativism, which posits inborn grammatical
categories and principles, and connectionism, certain varieties of which deny
the existence of traditional symbolic representations and principles altogether
(e.g., Elman, Bates, Johnson, Karmiloff-Smith, Parisi, and Plunkett, 1996). Each
approach has its own merits, of course, but it is nonetheless possible to identify
considerations that justify continued pursuit of the general nativist research
program.
The potential advantage of general nativism with respect to special nativism
is obvious. All scientific work, including the special nativist research program,
seeks the most general properties and principles possible. One does not posit
a grammatical rule specifically for passivization if the properties of passive
structures can be derived from a more general grammatical principle. And one
does not posit a grammatical constraint if the phenomena that it accounts for
can be derived from principles that are not specific to the language faculty.
(For an identical view within grammatical nativism, see Lightfoot, 1982, p. 45.)
Interestingly, the pursuit of this very goal within the special nativist research program has led to a partial convergence of views with general nativism
in recent years. As observed in O’Grady (1999), work within the “Minimalist
Program” that has grown out of Government and Binding theory (e.g.,
Chomsky, 1995) suggests that UG as it was conventionally understood is being
abandoned even by those traditionally committed to grammatical nativism in
its strongest form. The latest generation of explanatory principles focuses on
the notion of economy, demanding “short moves” (the “Minimal Link Condition”) that take place only if necessary (“Last Resort”) and are postponed for
as long as possible (“Procrastinate”) – in short, the sort of principles that one
would expect to find in almost any computational system. (In fact, Fukui,
1996, has gone so far as to suggest that the economy principles of the Minimalist
Program follow from the laws of physics!)4
A concrete example of this convergence of views can be seen in the treatment
of gap-containing structures in the two varieties of nativism, where one can
find parallel proposals for calculating relative complexity and markedness. As
explained above, I have suggested that the relative ease of subject gaps compared to object gaps can be explained with reference to their distance from the
“filler” (the head in the case of relative clauses, the wh-word in the case of
questions). Working within the minimalist program, Collins (1994, p. 56) has
put forward a virtually identical proposal: the cost of “movement operations”
is determined by the number of nodes traversed.
In the final analysis, then, general and special varieties of nativism agree
on the existence of an inborn acquisition device, of hierarchically structured
symbolic representations, and of explanatory principles that refer to these representations. The principal difference between the two approaches revolves
around the precise nature of these constructs, with disagreement centered on
the question of whether the language faculty includes inborn categories and
mechanisms that are narrowly grammatical in character. But even here, there
is agreement that we should seek out the most general constructs that are
consistent with a viable account of the properties of language and the facts of
development. What remains to be determined is whether some of these constructs have the status necessary to justify continued adherence to the traditional conception of Universal Grammar.
At first glance at least, the type of general nativism advocated here shares
much less common ground with connectionism. This is somewhat ironic since,
in a sense, connectionism is an extreme form of general nativism. Indeed, some
of its current proponents (e.g., Elizabeth Bates and Brian MacWhinney) were
earlier associated with a more traditional general nativist perspective (e.g.,
Bates and MacWhinney, 1988), and Elman et al. (1996, p. 114) note that connectionism embodies aspects of Piaget’s (general nativist) theory of the mind.
As I see it, the attractiveness of connectionism stems in large part from the
fact that it takes the pursuit of generality so seriously, ultimately arriving at
the strongest possible conclusion concerning the nature of the human language faculty – namely that it has no special properties of its own, grammatical or otherwise. This idea deserves to be taken seriously. Ultimately, though,
the connectionist program must be evaluated in terms of the same criteria as
apply to all theories of language: it must account both for how language is
acquired and for why it is the way it is. To date, connectionist work seems to
have concentrated almost exclusively on the former question. There have been
impressive results in this area, but, for me at least, the challenge of explaining
why language is the way it is has yet to be satisfactorily addressed. A simple
example will help illustrate this point.
As is well known, many languages exhibit so-called “subject–verb” agreement:
affixation on the verb records person and number features of the subject. For
example:
(11)
                               English                  Spanish
3rd person, singular subject:  That man works hard.     Ese hombre trabaja mucho.
3rd person, plural subject:    Those men work∅ hard.    Esos hombres trabajan mucho.
We know from the intriguing work of Elman (1993) and others that it is possible to build a connectionist net that can “learn” subject–verb agreement
without reference to hierarchical syntactic representations per se. Moreover,
on the face of it, it appears that such a proposal could count as an explanation
for how at least this feature of language is acquired.
But there is another challenge here. This is because the same connectionist
net could almost certainly “learn” a language – call it Lisheng – in which
agreement is triggered by the direct object rather than the subject:
(12)
Lisheng
3rd person, singular object:
I visited-a that city.
3rd person, plural object:
I visited-an those cities.
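The point that a structure-blind learner has no built-in preference between the two agreement systems can be illustrated with a deliberately tiny sketch. It is not a reconstruction of the networks discussed in the connectionist literature: it is a single perceptron over two invented binary features (whether the subject is plural, whether the object is plural), and every name in it is hypothetical.

```python
from itertools import product


def predict(weights, bias, features):
    """Output 1 (plural marking on the verb) if the weighted sum is positive."""
    return int(sum(w * x for w, x in zip(weights, features)) + bias > 0)


def train(examples, epochs=10):
    """Classic perceptron rule on (features, label) pairs; returns (weights, bias)."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return weights, bias


def accuracy(model, examples):
    weights, bias = model
    return sum(predict(weights, bias, f) == y for f, y in examples) / len(examples)


# features = (subject is plural, object is plural)
inputs = list(product([0, 1], repeat=2))
subject_agreement = [(x, x[0]) for x in inputs]  # English/Spanish type
object_agreement = [(x, x[1]) for x in inputs]   # hypothetical "Lisheng" type

for name, data in [("subject agreement", subject_agreement),
                   ("object agreement", object_agreement)]:
    print(name, accuracy(train(data), data))
```

Both patterns are learned without error, since nothing in the learner distinguishes the subject feature from the object feature.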
The problem is that there is apparently no such language: there are languages
such as English and Spanish in which the verb agrees only with the subject
and languages such as Swahili in which the verb agrees with both the subject
and the direct object, but no languages in which the verb agrees only with the
direct object (e.g., Croft, 1990, p. 106). Why should this be?
This asymmetry has a straightforward explanation in theories of language
that make use of hierarchically structured syntactic representations: the need for
agreement to mark a head–argument relation increases with the computational
distance between the two elements. Since verbs are structurally closer to their
direct objects than to their subjects in the sort of representation that I posit, it
follows that the need for agreement is greater in the latter case. This is true not
only for SOV languages such as Tamil, in which the subject is linearly more
distant from the verb, but also for SVO languages such as English, in which the
subject and direct object are both adjacent to the verb, and for VSO languages
such as Irish, in which the subject is linearly closer to the verb than is the
direct object (see figure 3.6).5
Figure 3.6 The subject–object asymmetry (tree diagrams for SOV, SVO, and VSO orders, each built from two NPs and a V)
Syntactic representations such as these shed light on other phenomena as
well. For instance, it is surely no accident that in languages such as Japanese,
case can be dropped from the direct object but not from the subject (Fukuda,
1993): the need for case presumably is greater on the more distant of the verb’s
arguments:
(13) a. Case drop on the subject:
*Dare gakusei-o nagutta-no?
who student-Acc hit-Ques
‘Who hit the student?’
b. Case drop on the direct object:
Gakusei-ga dare nagutta-no?
student-Nom who hit-Ques
‘Who did the student hit?’
Explanations such as these are plainly based on processing considerations.
As such, they are perfectly compatible with Elman et al.’s hint (1996, p. 386)
that linguistic universals are perhaps attributable to processing mechanisms –
an idea that they do not develop. Crucially, however, the specific processing
factors that underlie agreement and case drop asymmetries come to light only
when we consider symbolic representations with the defining properties of
traditional syntactic structure – binary branching and a subject–object asymmetry. (Recall, though, that these architectural features are derived from general
computational properties, not UG, in the approach that I adopt.) It remains to
be seen how and whether the connectionist program deals with these issues.
In the course of proposing an account for why language is the way it is with
respect to phenomena such as agreement and case drop, a theory based on
traditional symbolic representations also takes us a good deal of the way
toward understanding how language is acquired. In the case of agreement, for
instance, it seems reasonable to suppose that the computational demands
associated with keeping track of the structurally more distant verb–subject
relation create a place in syntactic representations where agreement would be
especially welcome.
Confounding factors make it difficult to test this prediction against developmental data, since subject agreement morphemes are more frequent than their
object agreement counterparts and may occur in the more salient word-initial
or word-final position (vs. word-medial position). Nonetheless, the developmental facts are at least suggestive.
In languages with both subject and object agreement, there seem to be only
two developmental patterns: either subject agreement is learned before object
agreement (the case in Sesotho, according to Demuth, 1992, p. 600), or the two
types of agreement emerge simultaneously (this is apparently what happens in
West Greenlandic (Fortescue and Olsen, 1992), K’iche’ Maya (Pye, 1992), Walpiri
(Bavin, 1992), and Georgian (Imedadze and Tuite, 1992)). There appear to be no
languages in which object agreement is acquired before subject agreement.
Turning now to case drop, if in fact the computational demands associated
with keeping track of the more distant verb–subject relation make it worthwhile to retain case on the subject while permitting its suppression on the
direct object, we would expect this contrast to be evident in the course of
linguistic development. This seems to be right: Suzuki (1999) reports that
children learning Japanese exhibit an overwhelmingly greater tendency to have
a case marker on the subject than on the direct object, even though they sometimes use the wrong case form (see also Lakshmanan and Ozeki, 1996; Miyata,
1993). Moreover, as noted in the preceding section, Kanno (1996) reports that
the same tendency is strongly manifested in adult second language learners,
even when there is no relevant experience or instruction.
5 Conclusion
Reduced to its essentials, the study of language is centered on the investigation
of two very fundamental questions – why language is the way it is, and how it
is acquired. To date, the most detailed answer to these questions has come from
proponents of grammatical nativism, who have put forward a theory that
simultaneously addresses both questions: Universal Grammar determines the
properties that any human language must have and, by virtue of being inborn,
it helps explain the success and rapidity of the language acquisition process.
A defining feature of UG-based theories is their commitment to hierarchically structured symbolic representations. Not only are the key properties
of language defined in terms of these representations, but the mechanisms
determining a sentence’s pronunciation and interpretation are thought to
make crucial reference to them as well. On this view, then, the end point of
the language acquisition process can be seen, in part at least, as the ability to
associate such representations with the sentences of one’s language.
At the other extreme, recent work in connectionism denies the existence of
conventional syntactic representations, of Universal Grammar, and of an inborn
acquisition device specifically for language. Language acquisition, it is claimed,
is not fundamentally different from any other type of learning and can be
accounted for by the same mechanisms as are required for interaction with the
environment in general.
My own work has been exploring a radical idea of a different sort. As I
have characterized it, general (or cognitive) nativism differs from connectionism in being committed to the existence of hierarchically structured symbolic
representations as part of a theory of why language is the way it is and to the
existence of an inborn acquisition device as part of a theory of how language is
acquired. At the same time, it differs from grammatical nativism in not positing
inborn categories or principles that are exclusively grammatical in character.
Differences as deep as these are unlikely to be resolved immediately, but the
challenge is at least clear – we need a viable account both of the properties that
define human language and of the acquisition of individual languages on the
basis of very limited types of input. There is surely a place for the study of
second language acquisition in all of this. At the very least, research on second
language learning provides opportunities to observe the acquisition device
functioning under conditions of duress – either because of extreme limitations
on the available input (as in the case of classroom learning) or because one or
more of its component modules have been compromised, or both. It is perhaps
not too optimistic to think that the further study of this phenomenon will
provide opportunities to extend and deepen our understanding of the acquisition device for human language.
NOTES
1 When we add three or more numbers
(e.g., 7 + 4 + 8), we always proceed in
a pair-wise fashion; no one is able to
compute all the numbers in a single
step.
2 As predicted, direct object relatives
are known to be easier than indirect
object relatives, in both first language
acquisition (de Villiers, Tager-Flusberg,
Hakuta, and Cohen, 1979;
Hildebrand, 1987) and second
language acquisition (Gass, 1979;
Wolfe-Quintero, 1992). However, depth
of embedding cannot account for the
relative preference for preposition
stranding over “pied-piping” found
in children learning English as a first
language (e.g., McDaniel, McKee, and
Bernstein, 1998) and, possibly, in
second language learners too (White,
1989, pp. 122ff):
i Preposition stranding: three intervening nodes:
the man who [S you [VP talked [PP to _]]]
ii Pied-piping: two intervening nodes:
the man to whom [S you [VP talked _]]
The obvious explanation for this
contrast is simply that the pied-piped
structure is all but non-existent in the
input. But this raises the question of
why English is this way, given the
general tendency in human language
to avoid preposition stranding. J.
Hawkins (1999) makes an interesting
proposal in this regard, but space
does not permit further discussion of
this matter here.
3 The same should be true of wh-questions as well, and there do in fact
appear to be some languages in
which only subjects undergo wh-movement (Cheng, 1991).
4 The Minimalist Program still falls
well short of being general nativist,
however. Chomsky (1995) makes a
number of proposals with a strong
special nativist character, including a
property “P” that permits multiple
nominative patterns in Japanese by
allowing a feature to remain active
even after being checked and deleted
(p. 286) and a parameter that licenses
multiple subject constructions in
Icelandic by permitting an unforced
violation of Procrastinate (p. 375).
5 As illustrated in the syntactic
representation for VSO languages, the
computational system I adopt permits
discontinuous constituents. For
extensive discussion, see O’Grady
(2001b).
REFERENCES
Bates, E. and MacWhinney, B. 1988:
What is functionalism? Papers and
Reports on Child Language Development,
27, 137–52.
Bavin, E. 1992: The acquisition of
Warlpiri. In D. Slobin (ed.), The
Crosslinguistic Study of Language
Acquisition. Vol. 3. Hillsdale, NJ:
Lawrence Erlbaum Associates,
309–71.
Bley-Vroman, R. 1994: Updating the
Fundamental Difference Hypothesis.
Talk presented at the EuroSLA
Convention, Aix-en-Provence.
Bowerman, M. 1990: Mapping thematic
roles onto syntactic functions: are
children helped by innate linking
rules? Linguistics, 28, 1253–89.
Braine, M. 1987: What is learned in
acquiring word classes – a step
toward an acquisition theory. In
B. MacWhinney (ed.), Mechanisms of
Language Acquisition. Hillsdale, NJ:
Lawrence Erlbaum Associates, 65–87.
Caplan, D. and Futter, C. 1986:
Assignment of thematic roles to nouns
in sentence comprehension by an
agrammatic patient. Brain and
Language, 27, 117–34.
Cheng, L. 1991: On the typology of
wh-questions. Unpublished Ph.D.
dissertation. MIT.
Chomsky, N. 1975: Reflections on
Language. New York: Pantheon.
Chomsky, N. 1995: The Minimalist
Program. Cambridge, MA: MIT Press.
Collins, C. 1994: Economy of derivation
and the Generalized Proper Binding
Condition. Linguistic Inquiry, 25, 45–61.
Croft, W. 1990: Typology and Universals.
New York: Cambridge University
Press.
Croft, W. 1991: Syntactic Categories and
Grammatical Relations: The Cognitive
Organization of Information. Chicago:
University of Chicago Press.
Demuth, C. 1992: The acquisition of
Sesotho. In D. Slobin (ed.), The
Crosslinguistic Study of Language
Acquisition. Vol. 3. Hillsdale, NJ:
Lawrence Erlbaum Associates,
557–638.
Dik, S. 1989: The Theory of Functional
Grammar. Part I: The Structure of the
Clause. Dordrecht: Foris.
Doughty, C. 1991: Second language
instruction does make a difference.
Studies in Second Language Acquisition,
13, 431–69.
Eckman, F. 1996: On evaluating
arguments for special nativism in
second language acquisition theory.
Second Language Research, 12, 398–419.
Eckman, F., Bell, L., and Nelson, D. 1988:
On the generalization of relative clause
instruction in the acquisition of
English as a second language. Applied
Linguistics, 9, 1–13.
Ellis, N. 1996: Sequencing in SLA:
phonological memory, chunking and
points of order. Studies in Second
Language Acquisition, 18, 91–126.
Elman, J. 1993: Learning and
development in neural networks:
the importance of starting small.
Cognition, 48, 71–99.
Elman, J., Bates, E., Johnson, M.,
Karmiloff-Smith, A., Parisi, D., and
Plunkett, K. 1996: Rethinking Innateness:
A Connectionist Perspective on
Development. Cambridge, MA:
MIT Press.
Fortescue, M. and Olsen, L. 1992: The
acquisition of West Greenlandic. In
D. Slobin (ed.), The Crosslinguistic
Study of Language Acquisition. Vol. 3.
Hillsdale, NJ: Lawrence Erlbaum
Associates, 111–219.
Frazier, L. 1987: Sentence processing: a
tutorial review. In M. Coltheart (ed.),
Attention and Performance XII: The
Psychology of Reading. Hillsdale, NJ:
Lawrence Erlbaum Associates,
559–86.
Fukuda, M. 1993: Head government
and case marker drop in Japanese.
Linguistic Inquiry, 24, 168–72.
Fukui, N. 1996: On the nature of
economy in language. Ninti Kagaku
[Cognitive Studies], 3, 51–71.
Gass, S. 1979: Language transfer and
universal grammatical relations.
Language Learning, 29, 327–44.
Gass, S. 1980: An investigation of
syntactic transfer in adult second
language learners. In R. Scarcella and
S. Krashen (eds), Research in Second
Language Acquisition. Rowley, MA:
Newbury House, 132–41.
Gregg, K. 1996: The logical and
developmental problems of second
language acquisition. In W. Ritchie
and T. Bhatia (eds), Handbook of
Second Language Acquisition. San
Diego: Academic Press, 49–81.
Grimshaw, J. and Rosen, S. 1990:
Knowledge and obedience: the
developmental status of the binding
theory. Linguistic Inquiry, 21, 187–222.
Grodzinsky, Y. 2000: The neurology of
syntax: language use without Broca’s
area. Behavioral and Brain Sciences, 23,
1–71.
Hawkins, J. 1999: Processing complexity
and filler-gap dependencies across
grammars. Language, 75, 244–85.
Hawkins, R. 1987: Markedness and the
acquisition of the English dative
alternation by L2 speakers. Second
Language Research, 3, 20–55.
Hildebrand, J. 1987: The acquisition
of preposition stranding. Canadian
Journal of Linguistics, 32, 65–85.
Imedadze, N. and Tuite, K. 1992: The
acquisition of Georgian. In D. Slobin
(ed.), The Crosslinguistic Study of
Language Acquisition. Vol. 3. Hillsdale,
NJ: Lawrence Erlbaum Associates,
39–109.
Kanno, K. 1996: The status of a nonparametrized principle in the L2 initial
state. Language Acquisition: A Journal of
Developmental Linguistics, 5, 317–34.
Keenan, E. and Comrie, B. 1977: Noun
phrase accessibility and Universal
Grammar. Linguistic Inquiry, 8, 63–100.
Kellerman, E. and Yoshioka, K. 1999:
Inter- and intra-population
consistency: a comment on Kanno
(1998). Second Language Research, 15,
101–9.
Kim, S. 1999: The subject–object
asymmetry in the acquisition of
wh-questions by Korean learners of
English. Paper presented at the Hawaii
Language Acquisition Workshop.
Honolulu, Hawaii.
Kolk, H. and Weijts, M. 1996: Judgments
of semantic anomaly in agrammatic
patients: argument movement,
syntactic complexity, and the use of
heuristics. Brain and Language, 54,
86–135.
Kuno, S. 1973: The Structure of the
Japanese Language. Cambridge, MA:
MIT Press.
Lakshmanan, U. and Ozeki, M. 1996: The
case of the missing particle: objective
case assignment and scrambling in
the early grammar of Japanese. In
A. Stringfellow, D. Cahana-Amitay,
E. Hughes, and A. Zukowski (eds),
Proceedings of the 20th Annual Boston
University Conference on Language
Development. Somerville, MA:
Cascadilla Press, 431–42.
Langacker, R. 1987: Nouns and verbs.
Language, 63, 53–95.
Langacker, R. 1995: Raising and
transparency. Language, 71, 1–62.
Larsen-Freeman, D. and Long, M. 1991:
An Introduction to Second Language
Acquisition Research. New York:
Longman.
Levelt, W. 1989: Speaking: From Intention
to Articulation. Cambridge, MA: MIT
Press.
Lightfoot, D. 1982: The Language Lottery.
Cambridge, MA: MIT Press.
Long, M. 1990: Maturational constraints
on language development. Studies in
Second Language Acquisition, 12, 251–85.
Marcus, M. 1980: A Theory of Syntactic
Recognition for Natural Language.
Cambridge, MA: MIT Press.
Mazurkewich, I. 1984: The acquisition
of the dative alternation: unlearning
overgeneralizations. Cognition, 16,
261–83.
McDaniel, D., McKee, C., and Bernstein,
J. 1998: How children’s relatives solve
a problem for minimalism. Language,
74, 308–34.
Miyata, H. 1993: The performance of the
Japanese case particles in children’s
speech: with special reference to “ga”
and “o.” MITA Working Papers in
Psycholinguistics, 3, 117–36.
Newmeyer, F. 1998: Language Form and
Language Function. Cambridge, MA:
MIT Press.
O’Grady, W. 1996: Language acquisition
without Universal Grammar: a
proposal for L2 learning. Second
Language Research, 12, 374–97.
O’Grady, W. 1997: Syntactic Development.
Chicago: University of Chicago Press.
O’Grady, W. 1998: The acquisition of
syntactic representations: a general
nativist approach. In W. Ritchie and
T. Bhatia (eds), Handbook of Language
Acquisition. San Diego: Academic
Press, 157–93.
O’Grady, W. 1999: Toward a new
nativism. Studies in Second Language
Acquisition, 21, 621–33.
O’Grady, W. 2001a: Language
acquisition and language deficits.
Invited keynote address presented
to the Japan Society for Language
Sciences.
O’Grady, W. 2001b: Syntactic
computation. Ms. University of
Hawai’i, Department of Linguistics.
O’Grady, W. and Lee, W. 2001: The
Isomorphic Mapping Hypothesis:
evidence from Korean. To appear.
Brain and Cognition, 46, 226–30.
O’Grady, W., Lee, M., and Choo, M.
forthcoming: The acquisition of
relative clauses in Korean as a second
language. Studies in Second Language
Acquisition.
Osgood, C. and Zehler, A. 1981:
Acquisition of bi-transitive sentences:
pre-linguistic determinants of
language acquisition. Journal of Child
Language, 8, 367–84.
Pye, C. 1992: The acquisition of K’iche’
Maya. In D. Slobin (ed.), The
Crosslinguistic Study of Language
Acquisition. Vol. 3. Hillsdale, NJ:
Lawrence Erlbaum Associates,
221–309.
Roeper, T., Lapointe, S., Bing, J., and
Tavakolian, S. 1981: A lexical
approach to language acquisition.
In S. Tavakolian (ed.), Language
Acquisition and Linguistic Theory.
Cambridge, MA: MIT Press, 35–58.
Russel, R. 1985: An analysis of student
errors in the use of Japanese -wa
and -ga. Papers in Linguistics, 18,
197–221.
Slobin, D. 1985: Crosslinguistic evidence
for the language-making capacity.
In D. Slobin (ed.), The Crosslinguistic
Study of Language Acquisition. Vol. 2.
Hillsdale, NJ: Lawrence Erlbaum
Associates, 1157–256.
Suzuki, T. 1999: Two aspects of Japanese
case in acquisition. Unpublished Ph.D.
dissertation. University of Hawai’i at
Manoa.
Talmy, L. 1988: Force dynamics in
language and cognition. Cognitive
Science, 12, 49–100.
Tomasello, M. 1995: Language is not
an instinct. Cognitive Development, 10,
131–56.
Uziel, S. 1993: Resetting Universal
Grammar parameters: evidence from
second language acquisition of
Subjacency and the Empty Category
Principle. Second Language Research,
9, 49–83.
de Villiers, J., Tager-Flusberg, H.,
Hakuta, K., and Cohen, M. 1979:
Children’s comprehension of relative
clauses. Journal of Psycholinguistic
Research, 8, 499–528.
Waryas, C. and Stremel, K. 1974: On the
preferred form of the double object
construction. Journal of Psycholinguistic
Research, 3, 271–79.
Werker, J., Lloyd, V., Pegg, J., and
Polka, L. 1996: Putting the baby in the
bootstraps: toward a more complete
understanding of the role of the
input in infant speech processing. In
J. Morgan and K. Demuth (eds), Signal
to Syntax. Mahwah, NJ: Lawrence
Erlbaum Associates, 427–47.
White, L. 1987: Markedness and second
language acquisition: the question of
transfer. Studies in Second Language
Acquisition, 9, 261–85.
White, L. 1989: Universal Grammar
and Second Language Acquisition.
Philadelphia: John Benjamins.
Wolfe-Quintero, K. 1992: Learnability and
the acquisition of extraction in relative
clauses and wh questions. Studies in
Second Language Acquisition, 14, 39–70.
Wolfe-Quintero, K. 1996: Nativism does
not equal Universal Grammar. Second
Language Research, 12, 335–73.
Yoshinaga, N. 1996: Wh-questions: a
comparative study of their form and
acquisition in English and Japanese.
Ph.D. dissertation. University of
Hawai’i at Manoa.
4 Constructions, Chunking, and Connectionism: The Emergence of Second Language Structure
NICK C. ELLIS
1 Introduction and Overview
Constructivist views of language acquisition hold that simple learning mechanisms, operating in and across human systems for perception, motor action,
and cognition while exposed to language data in a communicatively rich human
social environment navigated by an organism eager to exploit the functionality
of language, are sufficient to drive the emergence of complex language representations. The various tribes of constructivism – that is, connectionists
(Christiansen and Chater, 2001; Christiansen, Chater, and Seidenberg, 1999;
Levy, Bairaktaris, Bullinaria, and Cairns, 1995; McClelland, Rumelhart, and
the PDP Research Group, 1986; Plunkett, 1998), functional linguists (Bates and
MacWhinney, 1981; MacWhinney and Bates, 1989), emergentists (Elman, Bates,
Johnson, Karmiloff-Smith, Parisi, and Plunkett, 1996; MacWhinney, 1999a),
cognitive linguists (Croft and Cruse, 1999; Lakoff, 1987; Langacker, 1987, 1991;
Ungerer and Schmid, 1996), constructivist child language researchers (Slobin,
1997; Tomasello, 1992, 1995, 1998a, 2000), applied linguists influenced by chaos/
complexity theory (Larsen-Freeman, 1997), and computational linguists who
explore statistical approaches to grammar (Bod, 1998; Jurafsky, 1996) – all
share a functional-developmental, usage-based perspective on language. They
emphasize the linguistic sign as a set of mappings between phonological forms
and conceptual meanings or communicative intentions; thus, their theories of
language function, acquisition, and neurobiology attempt to unite speakers,
syntax, and semantics, the signifiers and the signifieds. They hold that structural regularities of language emerge from learners’ lifetime analysis of the
distributional characteristics of the language input and, thus, that the knowledge
of a speaker/hearer cannot be understood as an innate grammar, but rather
as a statistical ensemble of language experiences that changes slightly every
time a new utterance is processed. Consequently, they analyze language
acquisition processes rather than the final state or the language acquisition
device (see Sorace, this volume; White, this volume). They work within the
broad remit of cognitive science, seeking functional and neurobiological descriptions of the learning processes which, through exposure to representative
experience, result in change, development, and the emergence of linguistic
representations.
Section 2 of this review describes cognitive linguistic theories of construction grammar. These focus on constructions as recurrent patterns of linguistic
elements that serve some well-defined linguistic function. These may be at
sentence level (such as the imperative, the ditransitive, the yes-no question) or
below (the noun phrase, the prepositional phrase, etc.). Whereas Government-Binding Theory denied constructions, viewing them as epiphenomena resulting
from the interaction of higher-level principles-and-parameters and lower-level
lexicon, cognitive linguistics – construction grammar in particular (Croft, 2001;
Goldberg, 1995) – has brought them back to the fore, suspecting instead that
it is the higher-level systematicities that emerge from the interactions of constructions large and small. Section 3 concerns the development of constructions
as complex chunks, as high-level schemata for abstract relations such as
transitives, locatives, datives, or passives. An acquisition sequence – from formula, through low-scope pattern, to construction – is proposed as a useful
starting point to investigate the emergence of constructions and the ways in
which type and token frequency affect the productivity of patterns. Section 4
presents the psychological learning mechanisms which underpin this acquisition sequence. It describes generic associative learning mechanisms such as
chunking which, when applied to the stream of language, provide a rich source
of knowledge of sequential dependencies ranging from low-level binary chunks
like bigrams, through phonotactics, lexis, and collocations, up to formulae and
idioms. Although a very basic learning mechanism, chunking results in hierarchical representations and structure dependency.
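As a purely illustrative sketch of how so basic a mechanism might yield hierarchy, the following Python fragment repeatedly fuses the most frequent adjacent pair of units into a single chunk, so that existing chunks can themselves become parts of larger chunks. The function names, the frequency threshold, and the toy utterances are assumptions introduced here for illustration only; this greedy pair-merging procedure is a simple stand-in and is not one of the models reviewed in this chapter.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Return (pair, count) for the most frequent adjacent pair of units."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def merge_pair(seq, pair):
    """Replace each non-overlapping occurrence of `pair` with one chunk (a tuple)."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(pair)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def chunk(sequences, min_count=2, max_merges=10):
    """Greedily chunk frequent adjacent pairs; chunks may be chunked again."""
    seqs = [list(s) for s in sequences]
    for _ in range(max_merges):
        pair, count = most_frequent_pair(seqs)
        if pair is None or count < min_count:  # hypothetical threshold
            break
        seqs = [merge_pair(s, pair) for s in seqs]
    return seqs


def flatten(item):
    """Recover the flat word sequence from a (possibly nested) chunk or list."""
    if isinstance(item, (list, tuple)):
        return [w for part in item for w in flatten(part)]
    return [item]


# Toy input, invented for illustration.
utterances = [
    "in the eye of the beholder".split(),
    "in the eye of the storm".split(),
    "in the eye".split(),
]
chunked = chunk(utterances)
```

In this toy run the recurrent sequence "in the eye" ends up as a single nested unit built from smaller chunks, while the low-frequency final words remain outside any chunk; nothing beyond adjacency counts is used.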
Emergentists believe that many of the rule-like regularities that we see in
language emerge from the mutual interactions of the billions of associations
that are acquired during language usage. But such hypotheses require testing
and formal analysis. Section 5 describes how connectionism provides a means
of evaluating the effectiveness of the implementations of these ideas as
simulations of language acquisition which are run using computer models
consisting of many artificial neurons connected in parallel. Two models of the
emergence of linguistic regularity are presented for detailed illustration. Other
simulations show how analysis of sequential dependencies results in grammatically useful abstract linguistic representations. The broad scope of connectionist and other distributional approaches to language acquisition is briefly
outlined. The review concludes by discussing some limitations of work to date
and provides some suggestions for future progress.
2 Construction Grammar
This section outlines cognitive linguistic analyses of the interactions between
human language, perception, and cognition, and then focuses on construction
grammar (Croft, 2001; Fillmore and Kay, 1993; Goldberg, 1995; Langacker,
1987; Tomasello, 1998a, 1998b) as an approach for analyzing the ways in which
particular language patterns cue particular processes of interpretation. If words
are the atoms of language function, then construction grammar provides the
molecular level of analysis.
2.1 Cognitive linguistics
Cognitive linguistics (Barlow and Kemmer, 2000; Croft and Cruse, 1999;
Goldberg, 1995; Lakoff, 1987; Lakoff and Johnson, 1980; Langacker, 1987, 1991;
Talmy, 1988; Ungerer and Schmid, 1996) provides detailed qualitative analyses of the ways in which language is grounded in human experience and in
human embodiment, which represents the world in a very particular way. The
meaning of the words of a given language, and how they can be used in
combination, depends on the perception and categorization of the real world
around us. Since we constantly observe and play an active role in this world,
we know a great deal about the entities of which it consists, and this experience
and familiarity is reflected in the nature of language. Ultimately, everything
we know is organized and related in some meaningful way or other, and
everything we perceive is affected by our perceptual apparatus and our perceptual history. Language reflects this embodiment and this experience.
The different degrees of salience or prominence of elements involved in
situations that we wish to describe affect the selection of subject, object,
adverbials, and other clause arrangement. Figure/ground segregation and
perspective taking, processes of vision and attention, are mirrored in language
and have systematic relations with syntactic structure. Thus, paradoxically, a
theory of language must properly reflect the ways in which human vision and
spatial representations are explored, manipulated, cropped and zoomed, and
run in time like movies under attentional and scripted control (Kosslyn, 1983;
Talmy, 1996a). In language production, what we express reflects which parts
of an event attract our attention; depending on how we direct our attention,
we can select and highlight different aspects of the frame, thus arriving at
different linguistic expressions. The prominence of particular aspects of the
scene and the perspective of the internal observer (i.e., the attentional focus of
the speaker and the intended attentional focus of the listener) are key elements
in determining regularities of association between elements of visuo-spatial
experience and elements of phonological form. In language comprehension,
abstract linguistic constructions (like simple locatives, datives, and passives)
serve as a “zoom lens” for the listener, guiding their attention to a particular
perspective on a scene while backgrounding other aspects (Goldberg, 1995).
Thus, cognitive linguistics describes the regularities of syntax as emergent
from the cross-modal evidence that is collated during the learner’s lifetime of
using and comprehending language.
Cognitive linguistics was founded on the principle that language cognition
cannot be separated from semantics and the rest of cognition. The next section
shows how it similarly denies clear boundaries between the traditional linguistic separations of syntax, lexicon, phonology, and pragmatics.
2.2 Constructions
Traditional descriptive grammars focus on constructions, that is, recurrent
patterns of linguistic elements that serve some well-defined linguistic function.
As noted earlier, these may be at sentence level (such as the imperative, the
ditransitive, the yes-no question) or below (the noun phrase, the prepositional
phrase, etc.). The following summary of construction grammar, heavily influenced by Langacker (1987) and Croft and Cruse (1999), illustrates the key
tenets.
A construction is a conventional linguistic unit, that is, part of the linguistic
system, accepted as a convention in the speech community, and entrenched
as grammatical knowledge in the speaker’s mind. Constructions may (i) be
complex, as in [Det Noun], or be simple, as in [Noun] (traditionally viewed as
“syntax”); (ii) represent complex structure above the word level, as in [Adj
Noun], or below the word level, as in [NounStem-PL] (traditionally viewed as
“morphology”); or (iii) be schematic, as in [Det Noun], or specific, as in [the
United Kingdom] (traditionally viewed as “lexicon”). Hence, “morphology,”
“syntax,” and “lexicon” are uniformly represented in a construction grammar,
unlike both traditional grammar and generative grammar. Constructions are
symbolic. In addition to specifying the properties of an utterance’s defining
morphological, syntactic, and lexical form, a construction also specifies the
semantic, pragmatic, and/or discourse functions that are associated with it.
Constructions form a structured inventory of speakers’ knowledge of the conventions of their language (Langacker, 1987, pp. 63–6), usually described by
construction grammarians in terms of a semantic network, where schematic
constructions can be abstracted over the less schematic ones which are inferred inductively by the speaker in acquisition. This non-modular semantic
network representation of grammar is shared by other theories such as Word
Grammar (Hudson, 1984, 1990). A construction may provide a partial specification of the structure of an utterance. Hence, an utterance’s structure is specified by a number of distinct constructions. Constructions are independently
represented units in a speaker’s mind. Any construction with unique, idiosyncratic formal or functional properties must be represented independently in
order to capture speakers’ knowledge of their language. However, absence of
any unique property of a construction does not entail that it is not represented
independently and is simply derived from other, more general or schematic constructions. Frequency of occurrence may lead to independent representation of
even “regular” constructional patterns. This usage-based perspective implies that
the acquisition of grammar is the piecemeal learning of many thousands of
constructions and the frequency-biased abstraction of regularities within them.
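One way to make the uniform treatment of "syntax," "morphology," and "lexicon" concrete is to represent every construction as the same kind of form–function pairing. The following Python sketch does so under stated assumptions: the class name, the field names, the glosses, and the convention that upper-case slots are schematic categories while lower-case slots are specific words are all hypothetical devices introduced here, not notation drawn from construction grammar itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construction:
    """A form-function pairing (illustrative representation only)."""
    form: tuple    # slots: UPPER-CASE = schematic category, lower-case = specific word
    function: str  # informal gloss of the associated meaning or discourse function

    @property
    def is_complex(self):
        # Complex constructions have more than one element.
        return len(self.form) > 1

    @property
    def is_schematic(self):
        # Schematic constructions contain at least one open category slot.
        return any(slot.isupper() for slot in self.form)


# A tiny inventory mixing what tradition would call syntax and lexicon.
inventory = [
    Construction(("DET", "NOUN"), "referring expression"),
    Construction(("NOUN",), "entity"),
    Construction(("the", "united", "kingdom"), "name of a particular country"),
]

schematic = [c for c in inventory if c.is_schematic]
specific = [c for c in inventory if not c.is_schematic]
```

Because every entry has the same shape, the distinction between schematic and specific constructions becomes a property of an entry rather than a division between separate components of the grammar.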
Many constructions are based on particular lexical items, ranging from simple (Howzat! in cricket) to complex (Beauty is in the eye of the beholder). The
importance of such lexical units or idiomatic phrases is widely acknowledged
in SLA research when discussing holophrases (Corder, 1973), prefabricated
routines and patterns (Hakuta, 1974), formulaic speech (Wong Fillmore, 1976),
memorized sentences and lexicalized stems (Pawley and Syder, 1983), formulae (R. Ellis, 1994), sequences in SLA (N. Ellis, 1996, 2002), discourse management (Dörnyei and Kormos, 1998; Tannen, 1987), register (Biber and Finegan,
1994), style (Brewster, 1999), and lexical patterns and collocational knowledge
(Carter, 1998; Hoey, 1991; Lewis, 1993; Schmitt, 2000). According to Nattinger
(1980, p. 341), “for a great deal of the time anyway, language production
consists of piecing together the ready-made units appropriate for a particular
situation and . . . comprehension relies on knowing which of these patterns to
predict in these situations.” As Pawley and Syder (1983, p. 192) put it:
In the store of familiar collocations there are expressions for a wide range of
familiar concepts and speech acts, and the speaker is able to retrieve these as
wholes or as automatic chains from the long-term memory; by doing this he
minimizes the amount of clause-internal encoding work to be done and frees
himself to attend to other tasks in talk-exchange, including the planning of larger
units of discourse.
But other constructions are more abstract. Goldberg (1995) focuses on complex argument structure constructions such as the ditransitive (Pat faxed Bill
the letter), the caused motion (Pat pushed the napkin off the table), and the conative
(Sam kicked at Bill). She holds that these abstract and complex constructions
themselves carry meaning, independently of the particular words in the sentence. For example, even though the verb kick does not typically imply transfer
of possession, it works in the ditransitive Pat kicked Bill the football, and even
though one is hard pressed to interpret anything but an intransitive sneeze, the
caused motion Pat sneezed the napkin off the table is equally good. These abstract
argument structure constructions thus create an important top-down component to the process of linguistic communication. Such influences are powerful
mechanisms for the creativity of language, possibly even as manifest in derivational phenomena such as denominal verbs (They tabled the motion) and
deverbal nouns (Drinking killed him) (Tomasello, 1998b).
Constructions show prototype effects. For example, for ditransitive constructions there is the central sense of agent-successfully-causes-recipient-to-receive-patient (Bill gave/handed/passed/threw/took her a book), and various more
peripheral meanings such as future-transfer (Bill bequeathed/allocated/granted/
reserved her a book) and enabling-transfer (Bill allowed/permitted her one book).
Prototype effects are fundamental characteristics of category formation, again
blurring the boundaries between syntax and lexicon and other cognitive domains
(N. Ellis, 2002).
3 Learning Constructions
If linguistic systems comprise a conspiracy of constructions, then language
acquisition, L1 or L2, is the acquisition of constructions. There is nothing revolutionary in these ideas. Descriptive grammars (e.g., Biber, Johansson, Leech,
Conrad, and Finegan, 1999; Quirk, Greenbaum, Leech, and Svartvik, 1985) are
traditionally organized around form–function patterns; so are grammars which
are designed to inform pedagogy (e.g., Celce-Murcia and Larsen-Freeman, 1983).
But what about the processes of acquisition? To date, construction grammar
has primarily concerned descriptions of adult competence, although language
acquisition researchers, particularly those involved in child language, are now
beginning to sketch out theories of the acquisition of constructions which
involve a developmental sequence from formula, through low-scope pattern,
to construction.
3.1 Formulae and idioms
Formulae are lexical chunks which result from memorizing the sequence of
frequent collocations. Large stretches of language are adequately described by
finite-state grammars, as collocational streams where patterns flow into each
other. Sinclair (1991, p. 110), then director of the Cobuild project, the largest
lexicographic analysis of the English language to date, summarized this in the
principle of idiom:
A language user has available to him or her a large number of semi-preconstructed
phrases that constitute single choices, even though they might appear to be
analyzable into segments. To some extent this may reflect the recurrence of similar situations in human affairs; it may illustrate a natural tendency to economy of
effort; or it may be motivated in part by the exigencies of real-time conversation.
Rather than treating idiom as a somewhat minor feature compared with grammar,
Sinclair suggests that, for normal texts, the first mode of analysis to be applied
is the idiom principle, as most text is interpretable by this principle. Whereas
most of the material that Sinclair was analyzing in the Bank of English was
written text, comparisons of written and spoken corpora demonstrate that
collocations are even more frequent in spoken language (Biber et al., 1999;
Brazil, 1995; Leech, 2000). Parole is flat and Markovian because it is constructed
“off the top of one’s head,” and there is no time to work it over. Utterances are
constructed as intonation units which have the grammatical form of single
clauses, although many others are parts of clauses, and they are often highly
predictable in terms of their lexical concordance (Hopper, 1998). Language
reception and production are mediated by learners’ representations of chunks
of language: “Suppose that, instead of shaping discourse according to rules,
one really pulls old language from memory (particularly old language, with
all its words in and everything), and then reshapes it to the current context:
‘Context shaping’, as Bateson puts it, ‘is just another term for grammar’ ”
(Becker, 1983, p. 218).
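The sense in which such collocational streams are "flat and Markovian" and "highly predictable" can be illustrated with a first-order transition table, in which the probability of the next word depends only on the current word. The Python sketch below estimates these probabilities from raw adjacency counts; the function name and the three-utterance toy corpus are invented here for illustration and are not data from any of the corpora cited above.

```python
from collections import Counter, defaultdict


def transition_probabilities(utterances):
    """Estimate P(next word | current word) from adjacent word pairs."""
    counts = defaultdict(Counter)
    for words in utterances:
        for current, nxt in zip(words, words[1:]):
            counts[current][nxt] += 1
    return {
        current: {
            nxt: n / sum(following.values())
            for nxt, n in following.items()
        }
        for current, following in counts.items()
    }


# Toy corpus, invented for illustration.
toy_corpus = [
    "you know what i mean".split(),
    "you know what he said".split(),
    "i know what you mean".split(),
]
probs = transition_probabilities(toy_corpus)
```

In this toy corpus "know" is always followed by "what," so that transition is fully predictable, whereas the continuations of "what" are evenly divided; a finite-state description captures exactly this kind of local predictability and nothing more.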
Even for simple concrete lexis or formulae, acquisition is no unitary phenomenon. It involves the (typically) implicit learning of the sequence of sounds
or letters in the word along with separable processes of explicit learning of
perceptual reference (N. Ellis, 1994c, 2001). Yet however multifaceted and fascinating is the learning of words (Aitchison, 1987; Bloom, 2000; N. Ellis and
Beaton, 1993a, 1993b; Miller, 1991; Ungerer and Schmid, 1996), lexical learning
has generally been viewed as a phenomenon that can readily be understood in
terms of basic processes of human cognition. Learning the form of formulae is
simply the associative learning of sequences. It can readily be understood in
terms of the process of chunking which will be described in section 4.
The mechanism of learning might be simple, but the product is a rich and
diverse population of hundreds of thousands of lexical items and phrases. The
store of familiar collocations of the native language speaker is very large indeed. The sheer number of words and their patterns variously explains why
language learning takes so long, why it requires exposure to authentic sources,
and why there is so much current interest in corpus linguistics in SLA (Biber,
Conrad, and Reppen, 1998; Collins Cobuild, 1996; Hunston and Francis, 1996;
McEnery and Wilson, 1996). Native-like competence and fluency demand such
idiomaticity.
3.2 Limited scope patterns
The learning of abstract constructions is more intriguing. It begins with
chunking and committing formulae to memory. But there is more. Synthesis
precedes analysis. Once a collection of like examples is available in long-term
memory, there is scope for implicit processes of analysis of their shared features and for the development of a more abstract summary schema, in the
same way as prototypes emerge as the central tendency of other cognitive
categories.
Consider first the development of slot-and-frame patterns. Braine (1976)
proposed that the beginnings of L1 grammar acquisition involve the learning
of the position of words in utterances (e.g., More car, More truck, etc. allow
induction of the pattern “more + recurring element”). Maratsos (1982) extended
this argument to show that adult-like knowledge of syntactic constructions
(including both syntactic relations and part-of-speech categories like verb and
noun) can also result from positional analysis without the influence of semantic
categories like agent and action. He proposed that this learning takes place
through the amassing of detailed information about the syntactic handling of
particular lexical items, followed by discovery of how distributional privileges
transfer among them. The productivity of distributional analyses resultant
from connectionist learning of text corpora will be described in section 5.
It is important to acknowledge the emphases of such accounts on piecemeal
learning of concrete exemplars. Longitudinal child-language acquisition data
suggest that, to begin with, each word is treated as a semantic isolate in the
sense that the ability to combine it with other words is not accompanied by a
parallel ability with semantically related words. An early example was that of
Bowerman (1976), who demonstrated that her daughter Eva acquired the more
+ X construction long before other semantically similar relational words like
again and all-gone came to be used in the similar pivot position in two-word
utterances. Pine and Lieven (Lieven, Pine, and Dresner Barnes, 1992; Pine and
Lieven, 1993, 1997; Pine, Lieven, and Rowland, 1998) have since demonstrated
widespread lexical specificity in L1 grammar development. Children’s language
between the ages of 2 and 3 years is much more “low-scope” than theories of
generative grammar have argued. A high proportion of children’s early multiword speech is produced from a developing set of slot-and-frame patterns.
These patterns are often based on chunks of one or two words or phrases
and they have “slots” into which the child can place a variety of words, for
instance subgroups of nouns or verbs (e.g., I can’t + Verb; where’s + Noun +
gone?). Children are very productive with these patterns and both the number
of patterns and their structure develop over time. But they are lexically specific.
Pine and Lieven’s analyses of recordings of 2–3-year-old children and their
mothers measure the overlap between the words used in different slots in
different utterances. For example, if a child has two patterns, I can’t + X and I
don’t + X, Pine and Lieven measure whether the verbs used in the X slots come
from the same group and whether they can use any other CAN- or DOauxiliaries. There is typically very little or no overlap, an observation which
supports the conclusion that (i) the patterns are not related through an underlying grammar (i.e., the child does not “know” that can’t and don’t are both
auxiliaries or that the words that appear in the patterns all belong to a category
of Verb); (ii) there is no evidence for abstract grammatical patterns in the 2–3year-old child’s speech; and (iii) that, in contrast, the children are picking up
frequent patterns from what they hear around them, and only slowly making
more abstract generalizations as the database of related utterances grows.
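The overlap comparison described above can be illustrated with a minimal sketch. The text does not specify the exact statistic Pine and Lieven computed, so the Jaccard index used here is an assumption, and the utterances, the helper `fillers`, and the function `slot_overlap` are hypothetical names and data invented for illustration.

```python
def fillers(frame, utterances):
    """Collect whatever follows a fixed frame such as "I can't"."""
    prefix = frame + " "
    return [u[len(prefix):] for u in utterances if u.startswith(prefix)]


def slot_overlap(fillers_a, fillers_b):
    """Share of distinct fillers common to two slots (Jaccard index, an assumed measure)."""
    a, b = set(fillers_a), set(fillers_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical child utterances, invented for illustration only.
utterances = [
    "I can't see", "I can't do", "I can't reach",
    "I don't know", "I don't want", "I don't like",
]

cant = fillers("I can't", utterances)
dont = fillers("I don't", utterances)
overlap = slot_overlap(cant, dont)
print(overlap)
```

On this toy sample the two frames share no fillers, which is the kind of result that would count as lexically specific. A value near 1 would instead suggest that the two slots draw on a common category.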
Tomasello (1992) proposed the Verb Island hypothesis, in which it is the
early verbs and relational terms that are the individual islands of organization
in young children’s otherwise unorganized grammatical system – in the early
stages the child learns about arguments and syntactic markings on a verb-by-verb basis, and ordering patterns and morphological markers learned for
one verb do not immediately generalize to other verbs. Positional analysis of
each verb island requires long-term representations of that verb’s collocations,
and, thus, this account of grammar acquisition implies vast amounts of long-term knowledge of word sequences. Only later are syntagmatic categories
formed from abstracting regularities from this large dataset in conjunction with
morphological marker cues (at least in case-marking languages). Goldberg (1995)
argues that certain patterns are more likely to be made more salient in the input
because they relate to certain fundamental perceptual primitives, and, thus,
that the child’s construction of grammar involves both the distributional analysis
of the language stream and the analysis of contingent perceptual activity:
Constructions which correspond to basic sentence types encode as their central
senses event types that are basic to human experience . . . that of someone causing
something, something moving, something being in a state, someone possessing
something, something causing a change of state or location, something undergoing a change of state or location, and something having an effect on someone.
(Goldberg, 1995, p. 39)
Goldberg and Sethuraman (1999) show how individual “pathbreaking” semantically prototypic verbs form the seed of verb-centered argument structure
patterns. Generalizations of the verb-centered instances emerge gradually as
the verb-centered categories themselves are analyzed into more abstract argument structure constructions. The verb is a better predictor of sentence meaning than any other word in the sentence. Nevertheless, children ultimately
generalize to the level of constructions, because constructions are much better
predictors of overall meaning. Although verbs thus predominate in seeding
low-scope patterns and eventually more abstract generalizations, Pine et al.
(1998) have shown that such islands are not exclusive to verbs, and that the
theory should be extended to include limited patterns based on other lexical
types such as bound morphemes, auxiliary verbs, and case-marking pronouns.
3.3 Exemplar frequency and construction productivity
The research reviewed thus far has focused on piecemeal learning, the emergence of syntactic generalizations, and the elements of language which seed
such generalizations. There is another important strand in L1 construction-learning research that concerns how the frequency of patterns in the input
affects acquisition. Usage-based linguistics holds that language use shapes
grammar through frequent repetitions of usage, but there are separable effects
of token frequency and type frequency. Token frequency is how often in the
input particular words or specific phrases appear; type frequency, on the other
hand, counts how many different lexical items a certain pattern or construction
is applicable to. Type frequency refers to the number of distinct lexical items
that can be substituted in a given slot in a construction, whether it is a word-level construction for inflection or a syntactic construction specifying the
relation among words. The “regular” English past tense -ed has a very high
type frequency because it applies to thousands of different types of verbs,
whereas the vowel change exemplified in swam and rang has a much lower
type frequency. Bybee (Bybee, 1995; Bybee and Thompson, 2000) shows how
the productivity of a pattern (phonological, morphological, or syntactic) is a
function of its type rather than its token frequency. In contrast, high token
frequency promotes the entrenchment or conservation of irregular forms and
idioms – the irregular forms only survive because they are very frequent.
Type frequency determines productivity because: (i) the more lexical items
that are heard in a certain position in a construction, the less likely it is that the
construction is associated with a particular lexical item, and the more likely it
is that a general category is formed over the items that occur in that position;
(ii) the more items the category must cover, the more general are its criterial
features, and the more likely it is to extend to new items; and (iii) high type
frequency ensures that a construction is used frequently, thus strengthening
its representational schema and making it more accessible for further use with
new items (Bybee and Thompson, 2000).
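The distinction between token and type frequency can be made concrete with a small count. The sketch below uses a hypothetical handful of past-tense tokens, invented for illustration, and the function name `frequencies` is likewise an assumption. Token frequency counts every occurrence of a pattern, whereas type frequency counts the distinct lexical items that occur in it.

```python
from collections import Counter

# Hypothetical sample of past-tense tokens, each tagged with its pattern.
observations = [
    ("walked", "-ed"), ("jumped", "-ed"), ("played", "-ed"),
    ("looked", "-ed"), ("walked", "-ed"),
    ("swam", "vowel change"), ("rang", "vowel change"),
    ("swam", "vowel change"), ("swam", "vowel change"),
    ("rang", "vowel change"), ("swam", "vowel change"),
]


def frequencies(observations):
    """Return token and type frequency for each pattern."""
    tokens = Counter()
    types = {}
    for form, pattern in observations:
        tokens[pattern] += 1                          # every occurrence counts
        types.setdefault(pattern, set()).add(form)    # distinct items only
    return {p: {"token": tokens[p], "type": len(types[p])} for p in tokens}


freq = frequencies(observations)
print(freq)
```

In this toy sample the vowel-change pattern has the higher token frequency but the lower type frequency, the configuration that the account above associates with entrenchment rather than productivity.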
3.4 The same sequence for SLA?
To what degree might this proposed developmental sequence of syntactic
acquisition apply in SLA? SLA is different from L1A in numerous respects,
particularly with regard to:
i mature conceptual development:
a in child language acquisition knowledge of the world and knowledge
of language are developing simultaneously whereas adult SLA builds
upon pre-existing conceptual knowledge;
b adult learners have sophisticated formal operational means of thinking
and can treat language as an object of explicit learning, that is, of
conscious problem-solving and deduction, to a much greater degree
than can children (N. Ellis, 1994a);
ii language input: the typical L1 pattern of acquisition results from naturalistic
exposure in situations where caregivers naturally scaffold development
(Tomasello and Brooks, 1999), whereas classroom environments for second
or foreign language teaching can distort the patterns of exposure, of function, of medium, and of social interaction (N. Ellis and Laporte, 1997);
iii transfer from L1: adult SLA builds on pre-existing L1 knowledge
(MacWhinney, 1992; Odlin, this volume), and, thus, for example, whereas
a young child has lexically specific patterns and only later develops knowledge of abstract syntactic categories which guide more creative combinations and insertions into the slots of frames, adults have already acquired
knowledge of these categories and their lexical membership for L1, and
this knowledge may guide creative combination in their L2 interlanguage
to variously good and bad effects.
Nevertheless, unless there is evidence
to the contrary, it is a reasonable default expectation that naturalistic SLA
develops in broadly the same fashion as does L1 – from formulae, through
low-scope patterns, to constructions – and that this development similarly
reflects the influences of type and token frequencies in the input. (But
see Doughty, this volume, for a discussion of how L1 and L2 processing
procedures differ.)
There are lamentably few longitudinal acquisition data for SLA that are
of sufficient detail to allow the charting of construction growth. Filling this
lacuna and performing analyses of SLA which parallel those for L1A described
in section 3.2 is an important research priority. But the available evidence does
provide support for the assumption that constructions grow from formulae
through low-scope patterns to more abstract schemata. For a general summary,
there are normative descriptions of stages of L2 proficiency that were drawn
up in as atheoretical a way as possible by the American Council on the Teaching of Foreign Languages (ACTFL) (Higgs, 1984). These Oral Proficiency Guidelines include the following descriptions of novice and intermediate levels that
emphasize the contributions of patterns and formulae to the development of
later creativity:
Novice Low: Oral production consists of isolated words and perhaps a few high-frequency phrases . . . Novice High: Able to satisfy partially the requirements of
basic communicative exchanges by relying heavily on learned utterances but
occasionally expanding these through simple recombinations of their elements
. . . Intermediate: The intermediate level is characterized by an ability to create
with the language by combining and recombining learned elements, though
primarily in a reactive mode. (ACTFL, 1986, p. 18)
Thus, the ACTFL repeatedly stresses the constructive potential of collocations
and chunks of language. This is impressive because the ACTFL guidelines
were simply trying to describe SLA as objectively as possible – there was no
initial theoretical focus on formulae – yet nonetheless the role of formulae
became readily apparent in the acquisition process.
There are several relevant case studies of child SLA. Wong Fillmore (1976)
presented the first extensive longitudinal study that focused on formulaic
language in L2 acquisition. Her subject, Nora, acquired and overused a few
formulaic expressions of a new structural type during one period, and then
amassed a variety of similar forms during the next. Previously unanalyzed
chunks became the foundations for creative construction (see also Vihman’s,
1982, analyses of her young son Virve’s SLA). Such observations of the formulaic beginnings of child L2 acquisition closely parallel those of Pine and Lieven
for L1.
There are a few studies which focus on these processes in classroom-based
SLA. R. Ellis (1984) described how three classroom learners acquired formulae
which allowed them to meet their basic communicative needs in an ESL classroom, and how the particular formulae they acquired reflected input frequency
– they were those which more often occurred in the social and organizational
contexts that arose in the classroom environment. Weinert (1994) showed how
English learners’ early production of complex target-like German foreign language negation patterns came through the memorization of complex forms in
confined linguistic contexts, and that some of these forms were used as a basis
for extension of patterns. Myles, Hooper, and Mitchell (1998; Myles, Mitchell,
and Hooper, 1999) describe the first two years of development of interrogatives in a classroom of anglophone French L2 beginners, longitudinally tracking the breakdown of formulaic chunks such as comment t’appelles-tu? (what’s
your name?), comment s’appelle-t-il? (what’s his name?), and où habites-tu? (where
do you live?), in particular the creative construction of new interrogatives by
recombination of their parts, and the ways in which formulae fed the constructive process. Bolander (1989) analyzed the role of chunks in the acquisition of
inversion in Swedish by Polish, Finnish, and Spanish immigrants enrolled in a
4-month intensive course in Swedish. In Swedish, the inversion of subject–
verb after a sentence-initial non-subject is an obligatory rule. Bolander identified the majority of the inversion cases in her data as being of a chunk-like
nature with a stereotyped reading such as det kan man säga (that can one say)
and det tycker jag (so think I). Inversion in this sort of clause is also frequent
when the object is omitted as in kan man säga (can one say) and tycker jag (think
I), and this pattern was also well integrated in the interlanguage of most of
these learners. Bolander showed that the high accuracy on these stereotyped
initial-object clauses generalized to produce a higher rate of correctness on
clauses with non-stereotyped initial objects than was usual for other types of
inversion clause in her data, and took this as evidence that creative language
was developing out of familiar formulae.
Although there are many reviews which discuss the important role of formula use in SLA (e.g., Hakuta, 1974; Nattinger and DeCarrico, 1992; Towell
and Hawkins, 1994; Weinert, 1995; Wray, 1992), there is clearly further need
for larger-sampled SLA corpora which will allow detailed analysis of acquisition sequences. De Cock (1998) presents analyses of corpora of language-learner
productions using automatic recurrent sequence extractions. These show that
second language learners use formulae at least as much as native speakers and
at times at significantly higher rates. There is much promise in such computer-based learner corpus studies (Granger, 1998), provided that sufficient care is
taken to gather the necessarily intensive longitudinal learner data. There is
also a need to test the predictions of usage-based theories regarding the influences of type frequency and token frequency as they apply in SLA.
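A minimal sketch of what automatic recurrent sequence extraction could look like follows. The procedure De Cock (1998) actually used is not described here, so the function `recurrent_sequences`, its parameters `n` and `min_count`, and the sample utterances are all assumptions made for illustration. The idea is simply to count contiguous word sequences and keep those that recur.

```python
from collections import Counter


def recurrent_sequences(utterances, n=2, min_count=2):
    """Return contiguous n-word sequences that occur at least min_count times."""
    counts = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}


# Hypothetical learner productions, invented for illustration only.
sample = [
    "I think that it is good",
    "I think that he is late",
    "you know I think so",
]

recurrent = recurrent_sequences(sample, n=2, min_count=2)
print(recurrent)
```

Comparing such counts between learner and native-speaker corpora of similar size would be one way to approach the comparison of formula use reported above, although any real study would also need to normalize for corpus length.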
4 Psychological Accounts of Associative Learning
This section concerns the psychological learning mechanisms which underpin
the acquisition of constructions. Constructivists believe that language is cut from
the same cloth as other forms of learning. Although it differs importantly from
other knowledge in its specific content and problem space, it is acquired using
generic learning mechanisms. The Law of Contiguity, the most basic principle
of association, pervades all aspects of the mental representation of language:
“Objects once experienced together tend to become associated in the imagination,
so that when any one of them is thought of, the others are likely to be thought
of also, in the same order of sequence or coexistence as before” (James, 1890,
p. 561).
4.1 Chunking
What’s the next letter in a sentence beginning T . . . ? Native English speakers
know it is much more likely to be h or a vowel than it is z or other consonants,
and that it could not be q. But they are never taught this. What is the first word
in that sentence? We are likely to opt for the, or that, rather than thinks or
theosophy. If The . . . begins the sentence, how does it continue? “With an adjective or noun,” might be the reply. And, if the sentence starts with The cat . . . ,
then what? And then again, how should we complete The cat sat on the . . . ?
Fluent native speakers know a tremendous amount about the sequences of
language at all grains. We know how letters tend to co-occur (common bigrams,
trigrams, and other orthographic regularities). Likewise, we know the phonotactics of our tongue and its phrase structure regularities. We know thousands
of concrete collocations, and we know abstract generalizations that derive
from them. We have learned to chunk letters, sounds, morphemes, words,
phrases, clauses, bits of co-occurring language at all levels. Psycholinguistic
experiments show that we are tuned to these regularities in that we process
faster and more easily language which accords with the expectations that have
come from our unconscious analysis of the serial probabilities in our lifelong
history of input (N. Ellis, 2002).
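The kind of serial probability at issue can be approximated by counting which letters follow which within words. The sketch below is only an illustration under stated assumptions: the sample sentence is invented, the function name `next_letter_probs` is hypothetical, and a real estimate would need a large corpus.

```python
from collections import Counter, defaultdict


def next_letter_probs(text):
    """Estimate P(next letter | current letter) within words."""
    letters = [ch for ch in text.lower() if ch.isalpha() or ch == " "]
    counts = defaultdict(Counter)
    for prev, nxt in zip(letters, letters[1:]):
        if prev != " " and nxt != " ":      # ignore pairs that span a word boundary
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for prev, ctr in counts.items()
    }


# Hypothetical sample text, invented for illustration only.
sample = "the cat sat on the mat then the other cat tried to take that thing"
probs = next_letter_probs(sample)
print(probs["t"])
```

Even in this tiny sample, h is by far the most probable continuation of t, and q never occurs after it, in line with the intuitions described above.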
Furthermore, we learn these chunks from the very beginnings of learning a
second language. N. Ellis, Lee, and Reber (1999) observed people reading their
first 64 sentences of a foreign language. While they read, they saw the referent
of each sentence, a simple action sequence involving colored geometrical shapes.
For example, the sentence miu-ra ko-gi pye-ri lon-da was accompanied by a
cartoon showing a square moving onto red circles. A linguistic description of
this language might include the following facts: (i) that it is an SOV language;
(ii) it has adjective–noun word order; (iii) grammatical number (singular/
plural) agreement is obligatory, and in the form of matching suffix endings of
a verb and its subject and of a noun and the adjective that modifies it; (iv) that
the 64 sentences are all of the type: [N]Subject [A N]Object V; and (v) that lexis was
selected from a very small set of eight words. But such explicit metalinguistic
knowledge is not the stuff of early language acquisition. What did the learners
make of it? To assess their intake, immediately after seeing each sentence,
learners had to repeat as much as they could of it. How did their intake
change over time? It gradually improved in all respects. With increasing exposure, performance incremented on diverse measures: the proportion of lexis
correctly recalled, correct expression of the adjective–noun agreement, correct
subject–verb agreement, totally correct sentence production, correct bigrams
and trigrams, and, overall, conformity to the sequential probabilities of the
language at letter, word, and phrase level. With other measures it was similarly
apparent that there was steady acquisition of form–meaning links and of
generalizable grammatical knowledge that allowed success on grammaticality
judgment tests which were administered later (N. Ellis et al., 1999). To greater or
lesser degree, these patterns, large and small, were being acquired simultaneously and collaboratively.
Acquisition of these sequential patterns is amenable to explanation in terms
of psychological theories of chunking. The notion of chunking has been at the
core of short-term memory research since Miller (1956) first proposed the term.
While the chunk capacity of short-term memory (STM) is fairly constant at
7 ± 2 units, its information capacity can be increased by chunking, a useful
representational process in that low-level features that co-occur can be organized together and thence referred to as an individual entity. Chunking underlies
superior short-term memory for patterned phone numbers (e.g., 0800-123777)
or letter strings (e.g., AGREEMENTS, FAMONUBITY) than for more random
sequences (e.g., 4957-632518, CXZDKLWQPM), even though all strings contain
the same number of items. We chunk chunks too, so Ellis is wittering on about
chunking again is better recalled than again wittering on is about Ellis chunking,
and, as shown by Epstein (1967) in a more rigorous but dreary fashion than
Lewis Carroll’s, A vapy koobs desaked the citar molently um glox nerfs is more
readily read and remembered than koobs vapy the desaked um glox citar nerfs a
molently:
A chunk is a unit of memory organization, formed by bringing together a set
of already formed elements (which, themselves, may be chunks) in memory
and welding them together into a larger unit. Chunking implies the ability to
build up such structures recursively, thus leading to a hierarchical organization
of memory. Chunking appears to be a ubiquitous feature of human memory.
(Newell, 1990, p. 7)
It operates at concrete and abstract levels, as we shall now see.
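The way chunking raises the information capacity of a fixed number of memory units can be sketched as a recoding step. This is an illustrative assumption and not a model proposed in the text: the function `count_chunks`, the greedy longest-match strategy, and the set of familiar chunks are all hypothetical.

```python
def count_chunks(sequence, known_chunks):
    """Greedy longest-match recoding; unfamiliar material costs one unit per symbol."""
    longest = max([1] + [len(c) for c in known_chunks])
    units, i = [], 0
    while i < len(sequence):
        for size in range(min(longest, len(sequence) - i), 0, -1):
            piece = sequence[i:i + size]
            if size == 1 or piece in known_chunks:
                units.append(piece)
                i += size
                break
    return units


# Hypothetical familiar chunks for a patterned phone number (hyphen omitted).
familiar = {"0800", "123", "777"}

patterned = count_chunks("0800123777", familiar)
random_like = count_chunks("4957632518", familiar)
print(len(patterned), len(random_like))
```

Both strings contain ten digits, but the patterned one is recoded into three units while the other still occupies ten, which is more than the 7 ± 2 span mentioned above.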
Sequences that are repeated across learning experiences become better remembered. Hebb (1961) demonstrated that, when people were asked to report
back random nine-digit sequences in a short-term memory task, if, unbeknownst
to the participants, every third list of digits was repeated, memory for the
repeated list improved over trials faster than memory for non-repeated lists.
This pattern whereby repetitions of particular items in short-term memory
result in permanent structural traces has since become known as the Hebb
effect. It pervades learning in adulthood and infancy alike. Saffran, Aslin, and
Newport (1996) demonstrated that 8-month-old infants exposed for only 2
minutes to unbroken strings of nonsense syllables (for example, bidakupado) are
able to detect the difference between three-syllable sequences that appeared as
a unit and sequences that also appeared in their learning set but in random
order.
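One statistic that could distinguish sequences heard as a unit from sequences that merely span a boundary is the transitional probability between adjacent syllables. The text does not name the statistic, so its use here is an assumption. The four words below are made up for illustration (only the string bidakupado comes from the text), and the presentation order and the function name `transitional_probabilities` are hypothetical.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """P(next syllable | current syllable) over an unbroken stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])   # the final syllable has no successor
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


# Illustrative three-syllable nonsense words and a fixed presentation order.
words = ["bidaku", "padoti", "golabu", "tupiro"]
order = [0, 1, 2, 3, 0, 2, 1, 3, 1, 0, 3, 2]

stream = []
for index in order:
    word = words[index]
    stream.extend(word[j:j + 2] for j in range(0, len(word), 2))

tp = transitional_probabilities(stream)
print(tp[("bi", "da")], tp[("ku", "pa")])
```

Within a word each syllable fully predicts the next, whereas across a word boundary the next syllable varies, so the probability drops. A learner sensitive to this difference could segment the stream without any pauses.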
Chunks that are repeated across learning experiences also become better
remembered. In early Project Grammarama experiments, Miller (1958) showed
that learners’ free recall of redundant (grammatical) items was superior to that
of random items, and hypothesized that this was because they were “recoding”
individual symbols into larger chunks which decreased the absolute number
of units. Structural patterns that are repeated across learning experiences
also become better remembered. Reber (1967) showed that memory for grammatical “sentences” generated by a finite-state grammar improved across learning sets. More recent work reviewed by Manza and Reber (1997), Mathews
and Roussel (1997), and others in Berry (1997) shows that learners can transfer
knowledge from one instantiation to another, that is, learn an artificial grammar instantiated with one letter set (GFBQT) and transfer to strings instantiated in another (HMVRZ), so that if there are many letter strings which illustrate
patterned sequences (e.g., GFTQ, GGFTQ, GFQ) in the learning set, the participants show faster learning of a second transfer grammar which mirrors these
patterns (HMZR, HHMZR, HMR) than one which does not (HMZR, VMHZZ,
VZH). Learners can also demonstrate cross-modal transfer, where the training
set might be letters, as above, but the testing set comprises sequences of colors
which, unbeknownst to the participant, follow the same underlying grammar.
These effects argue for more abstract representations of tacit knowledge.
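What it means for a transfer set to mirror the training patterns can be checked mechanically using the letter strings given above. The sketch assumes that the strings are paired in the order listed and that mirroring means a single one-to-one letter substitution holds across the whole set. The function name `consistent_mapping` is hypothetical.

```python
def consistent_mapping(training, transfer):
    """Return a one-to-one letter mapping from training to transfer strings, or None."""
    if len(training) != len(transfer):
        return None
    forward, backward = {}, {}
    for source, target in zip(training, transfer):
        if len(source) != len(target):
            return None
        for a, b in zip(source, target):
            # setdefault records a new pairing or returns the existing one
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                return None
    return forward


training = ["GFTQ", "GGFTQ", "GFQ"]
mirrored = ["HMZR", "HHMZR", "HMR"]
unrelated = ["HMZR", "VMHZZ", "VZH"]

print(consistent_mapping(training, mirrored))
print(consistent_mapping(training, unrelated))
```

The mirrored set preserves the sequential structure under a consistent relabeling, and the other set does not. Knowledge that survives such a relabeling cannot be tied to particular letters, which is the sense in which these effects point to more abstract representations.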
Hebb effects, Miller effects, and Reber effects all reflect the reciprocal interactions between short-term memory and long-term memory (LTM) which
allow us to bootstrap our way into language. The “cycle of perception” (Neisser,
1976) is also the “cycle of learning,” such that bottom-up and top-down processes are in constant interaction. Repetition of sequences in phonological STM
results in their consolidation in phonological LTM as chunks. The cognitive
system that stores long-term memories of phonological sequences is the same
system responsible for perception of phonological sequences. Thus, the tuning
of phonological LTM to regular sequences allows more ready perception of
input which contains regular sequences. Regular sequences are thus perceived
as chunks, and, as a result, language- (L1 or L2) experienced individuals’
phonological STM for regular sequences is greater than for irregular ones. This
common learning mechanism underpins language acquisition in phonological,
orthographic, lexical, and syntactic domains.
But this analysis is limited to language form. What about language function?
Learning to understand a language involves parsing the speech stream into
chunks which reliably mark meaning. The learner does not care about theoretical
analyses of language. From a functional perspective, the role of language is to
communicate meanings, and the learner wants to acquire the label–meaning
relations. Learners’ attention to the evidence to which they are exposed soon
demonstrates the recurring chunks of language (to use written examples, in
English e follows th more often than x does, the is a common sequence, the
[space] is frequent, dog follows the [space] more often than it does book, how do
you do? occurs quite often, etc.). At some level of analysis, the patterns refer to
meaning. It does not happen at the lower levels: t does not mean anything, nor
does th, but the does, and the dog does better, and how do you do? does very
well, thank you. In these cases the learner’s goal is satisfied, and the fact that
this chunk activates some meaning representations makes this sequence itself
more salient in the input stream. When the learner comes upon these chunks
again, they tend to stand out as units, and adjacent material is parsed accordingly (see Doughty, this volume, for a detailed discussion of this).
What is “meaning” in such an associative analysis? At its most concrete, it is
the perceptual memories which underpin the conscious experience which a
speaker wishes to describe and which, with luck, will be associated with sufficient strength in the hearer to activate a similar set of perceptual representations.
These are the perceptual groundings from which abstract semantics emerge
(Barsalou, 1999; Lakoff, 1987). Perceptual representations worth talking about
are complex structural descriptions in their own right, with a qualifying hierarchical schematic structure (e.g., a room schema which nests within it a desk
schema which in turn nests within it a drawer schema, and so on). These visuo-structural descriptions are also acquired by associative chunking mechanisms,
operating in a neural system for representing the visual domain. When we
describe the structural properties of objects and their interactions we do so
from particular perspectives, attending to certain aspects and foregrounding
them, sequencing events in particular orders, etc., and so we need procedures
for spotlighting and sequencing perceptual memories with language. The most
frequent and reliable cross-modal chunks, which structure regular associations
between perception and language, are the constructions described in sections
2 and 3. Chunking, the bringing together of a set of already formed chunks in
memory and welding them into a larger unit, is a basic associative learning
process which can occur in and between all representational systems.
4.2
Generic learning mechanisms
Constructivists believe that generic, associative-learning mechanisms underpin all aspects of language acquisition. This is clearly a parsimonious assumption. But additionally, there are good reasons to be skeptical of theories of
learning mechanisms specific to the domain of language, first because innate
linguistic representations are neurologically implausible, and second because
of the logical problem of how any such universals might come into play:
i Current theories of brain function, process and development, with their
acknowledgement of plasticity and input-determined organization, do not
readily allow for the inheritance of structures which might serve, for instance, as principles or parameters of UG (Elman et al., 1996; Quartz and
Sejnowski, 1997).
ii Whether there are innate linguistic universals or not, there is still a logical
problem of syntactic acquisition. Identifying the syntactic category of words
must primarily be a matter of learning because the phonological strings
associated with words of a language are clearly not universal. Once some
identifications have been successfully made, it may be possible to use
prior grammatical knowledge to facilitate further identifications. But the