2014 The Psychology of Language PDF
THE PSYCHOLOGY OF LANGUAGE
Now in full color, this fully revised edition of the best-selling textbook provides an up-to-date and
comprehensive introduction to the psychology of language for undergraduates, postgraduates, and researchers.
It contains everything the student needs to know about how we acquire, understand, produce, and store
language.
Whilst maintaining both the structure of the previous editions and the emphasis on cognitive
processing, this fourth edition has been thoroughly updated to include:
• the latest research, including recent results from the fast-moving field of brain imaging and studies
• updated coverage of key ideas and models
• an expanded glossary
• more real-life examples and illustrations.
The Psychology of Language, Fourth Edition is praised for describing complex ideas in a clear and
approachable style, and assumes no prior knowledge other than a grounding in the basic concepts of
cognitive psychology. It will be essential reading for advanced undergraduate and graduate students
of cognition, psycholinguistics, or the psychology of language. It will also be useful for those on
speech and language therapy courses.
The book is supported by a companion website featuring a range of helpful supplementary resources
for both students and lecturers.
Trevor A. Harley is Dean of Psychology and Chair of Cognitive Psychology at the University of Dundee,
Scotland. He was an undergraduate at the University of Cambridge, where he was also a PhD student,
completing a thesis on slips of the tongue and what they tell us about speech production. He moved to
Dundee from the University of Warwick in 1996. His research interests include speech production, how
we represent meaning, and the effects of aging on language.
THE
PSYCHOLOGY
OF LANGUAGE
FROM DATA TO THEORY
FOURTH EDITION
TREVOR A. HARLEY
Psychology Press
Taylor & Francis Group
LONDON AND NEW YORK
Fourth edition published 2014
by Psychology Press
27 Church Road, Hove, East Sussex BN3 2FA
and by Psychology Press
711 Third Avenue, New York, NY 10017
Psychology Press is an imprint of the Taylor & Francis Group, an informa business
© 2014 Psychology Press
The right of Trevor A. Harley to be identified as author of this work has been asserted by him in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or by
any electronic, mechanical, or other means, now known or hereafter invented, including photocopying
and recording, or in any information storage or retrieval system, without permission in writing from the
publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
First edition published by Psychology Press 1995
Third edition published by Psychology Press 2008
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
Library of Congress Cataloging-in-Publication Data
Harley, Trevor A.
The psychology of language: from data to theory / Trevor A. Harley.—Fourth edition.
pages cm
Includes bibliographical references and index.
1. Psycholinguistics. I. Title.
BF455.H2713 2014
401′.9—dc23
2013022343
Typeset in Times
by Book Now Ltd, London
CONTENTS
psychology; I have tried not to let this skepticism affect this revision. Most researchers believe that
brain imaging has greatly advanced our understanding of psycholinguistics over the last decade.
Technology has changed for the better, too, making writing books much easier. Writing the first
edition involved constant trips to the library and much photocopying. In this edition I could read
every reference I wanted at the luxury of my desk thanks to Google and electronic journals. I wrote
the first draft of this book using the wonderful Scrivener 2.0 on a Mac, and then finished it in Pages.
There is a website associated with this book. It contains links to other pages, details of important
recent work, and a “hot link” to contact me. It is to be found at: http://www.psypress.com/cw/harley.
I still welcome any corrections, suggestions for the next edition, or discussion on any topic. My
email address is now: t.a.harley@dundee.ac.uk. Suggestions on topics I have omitted or
under-represented would be particularly welcome. The hardest bit of writing this book has been
deciding what to leave out. I am sure that people running other courses will cover some material in
much more detail than has been possible to provide here. I would be interested to hear, however, of
any major differences of emphasis. If the new edition is as successful as the third, I will be looking
forward (in a strange sort of way) to producing the fifth edition in five years’ time.
I would like to thank all those who have made suggestions about one or more of the previous
editions, particularly Jeanette Altarriba, Gerry Altmann, Elizabeth Bates, Paul Bloom, Helen Bown,
Peer Broeder, Gordon Brown, Hugh Buckingham, Annette de Groot, Lynne Duncan, the Dundee
Psycholinguistics Discussion Group, Andy Ellis, Gerry Griffin, Zenzi Griffin, Francois Grosjean,
Evan Heit, Laorag Hunter, Lesley Jessiman, Barbara Kaup, Alan Kennedy, Kathryn Kohnert, Annukka
Lindell, Nick Lund, Siobhan MacAndrew, Nadine Martin, Randi Martin, Elizabeth Maylor, Don
Mitchell, Wayne Murray, Lyndsey Nickels, Jane Oakhill, Padraig O’Seaghdha, Shirley-Anne Paul,
Martin Pickering, Julian Pine, Ursula Pool, Eleanor Saffran, Lynn Santelmann, Marcus Taft, Jeremy
Tree, Roger van Gompel, Carel van Wijk, Alan Wilkes, Beth Wilson, Suzanne Zeedyk, and Pienie
Zwitserlood. I would also like to thank several anonymous reviewers for their comments; hopefully
you know who you are. Numerous people pointed out minor errors and asked questions: I thank them
all. George Dunbar created the sound spectrogram for Figure 2.1 using MacSpeechLab. Lila Gleitman
gave me the very first line; thanks! Katie Edwards, Pam Miller, and Denise Jackson helped me to
obtain a great deal of material, often at very short notice. This book would be much worse without
the help of all these people. I am of course responsible for any errors or omissions that remain. If
there is anyone else I have forgotten, please accept my apologies. Many people have suggested things
that I have thought about and decided not to implement, and many people have suggested things (more
connectionism, less connectionism, leave that in, take that out, move that bit there, leave it there)
that are the opposite of what others have suggested.
In particular the writing of this edition was made immeasurably easier by spending time in the
glorious environment of the University of California, San Diego. I wish to thank everyone there from
the bottom of my heart, particularly my hosts Tamar Gollan and Vic Ferreira.
I would also like to thank Psychology Press for all their help and enthusiasm for this project.
Finally, I would like to thank Brian Butterworth, who supervised my PhD. He probably doesn’t realize
how much I appreciated his help; without him, this book might never have existed.
Finally, I hope that any bias there is in this book will appear to be the consequence of the
consideration of evidence rather than of prejudice.
Professor Trevor A. Harley
School of Psychology
University of Dundee
Dundee DD1 4HN
Scotland
February 2013
ILLUSTRATION CREDITS
Chapter 13
Page 397: © Bettmann/Corbis. Page 413 (top): From Indefrey and Levelt (2004). Copyright © 2004.
Chapter 16
Page 476: © Geoff Tompkinson/Science Photo Library. Page 478: © James King-Holmes/Science Photo Library.
HOW TO USE THIS BOOK
This book is intended to be a stand-alone introduction to the psychology of language. It is my hope
that anyone could pick it up and finish reading it with a rich understanding of how humans use
language. Nevertheless, it would probably be advantageous to have some knowledge of basic cognitive
psychology. (Some suggestions for books to read are given in the “Further reading” section at the end
of Chapter 1.) For example, you should be aware that psychologists have distinguished between
short-term memory (which has limited capacity and can store material for only short durations) and
long-term memory (which is virtually unlimited). I have tried to assume that the reader has no
knowledge of linguistics, although I hope that most readers will be familiar with such concepts as
nouns and verbs. The psychology of language is quite a technical area full of rather daunting
terminology. I have defined technical terms and italicized them when they first appear. There is also
a glossary with short definitions of the technical terms.
Connectionist modeling is now central to modern cognitive psychology. Unfortunately, it is also a
topic that most people find extremely difficult to follow. It is impossible to understand the details
of connectionism without some mathematical sophistication. I have provided an appendix that covers
the basics of connectionism in more mathematical detail than is generally necessary to understand the
main text. The general principles of connectionism can, however, probably be appreciated without this
extra depth, although it is probably a good idea to look at the appendix.
In my opinion and experience, the material in some chapters is more difficult than others. I do not
think that there is anything much that can be done about this, but to persevere. Sometimes
comprehension might be assisted by later material, and sometimes a number of readings might be
necessary to comprehend the material fully. Fortunately, the study of the psychology of language
gives us clues about how to facilitate understanding. Chapters 7 and 11 will be particularly useful in
this respect. It should also be remembered that in some areas researchers do not agree on the
conclusions or on what should be the appropriate method to investigate a problem. Therefore it is
sometimes difficult to say what the “right answer,” or the correct explanation of a phenomenon, might
be. In this respect the psychology of language is still a very young subject.
The book is divided into sections, each covering an important aspect of language. Section A is an
introduction. It describes what language is, and provides essential background for describing
language. It should not be skipped. Section B is about the biological basis of language, the
relationship of language to other cognitive processes, and language development. Section C is about
how we recognize words. Section D is about comprehension: how we understand sentences and discourse.
Section E is about language production, and also about how language interacts with memory. It also
examines the grand design or architecture of the language system. This final section concludes with a
brief look at some possible new directions in the psychology of language.
Each chapter begins with an introduction outlining what the chapter is about and the main problems
faced in each area. Each introduction ends
with a summary of what you should know by the end of the chapter. Each chapter concludes with a list
of bullet points that gives a one-sentence summary of each section in that chapter. This is followed
by questions that you can think about either to test your understanding of the material, or to go
beyond what is covered, usually with an emphasis on applying the material. If you want to follow a
topic up in more detail than is covered in the text (which I think is quite richly referenced, and
should be the first place to look), then there are suggestions for further reading at the very end of
each chapter.
One way of reading this book is like a novel: start here and go to the end. Section A should
certainly be read before the others because it introduces many important terms, without which later
going would be very difficult. I certainly recommend starting with Chapter 1. After that, alternative
orders are possible, however. I have tried to make each chapter as self-contained as possible, so
there is no reason why the chapters cannot be read in a different order. Similarly, you might choose
to omit some chapters altogether. In each case you might find you have to refer to the glossary more
often than if you just begin at the beginning. Unless you are interested in just a few topics,
however, I advise reading the whole book through at least once. Each chapter looks at a major chunk
of the study of the psychology of language.
OVERVIEW
Chapter 1 tells you about the subject of the psychology of language. It covers its history and
methods. Chapter 2 provides some important background on language, telling you how we can describe
sounds and the structure of sentences. In essence it is a primer on phonology and syntax.
Chapter 3 is about how language is related to biological and cognitive processes. It looks at the
extent to which language depends on the presence and operation of certain biological, cognitive, and
social precursors in order to be able to develop normally. We will also look at whether animals use
language, or whether they can be taught to do so. This will also help clarify what we mean by
language. We will look at how language is founded in the brain, and how damage to the brain can lead
to distinct types of impairment in language. We will look in detail at the more general role of
language, by examining the relation between language and thought. We will also look at what can be
learned from language acquisition in exceptional circumstances, including the effects of linguistic
deprivation.
Chapter 4 examines how children acquire language, and how language develops throughout childhood.
Chapter 5 examines how bilingual children learn to use two languages.
We will then look in Chapter 6 at what appear to be the simplest or lowest level processes and work
towards more complex ones. Hence we will first look at how we recognize and understand single words.
Although these chapters are largely about recognizing words in isolation in the sense that in most of
the experiments we discuss only one word is present at a time, the influence of the context in which
they are found is an important consideration, and we will look at this also.
Chapter 7 looks at how we recognize words and how we access their meanings. Although the emphasis is
upon visually presented word recognition, many of the findings described in this chapter are
applicable to recognizing spoken words as well. Chapter 8 examines how we read and pronounce words,
and looks at disorders of reading (the dyslexias). It also looks at how we learn to read. Chapter 9
looks at the speech system and how we process speech and identify spoken words.
We then move on to how words are ordered to form sentences. Chapter 10 looks at how we make use of
word order information in understanding sentences. These are issues to do with syntax and parsing.
Chapter 11 examines how we represent the meaning of words. Chapter 12 examines how we comprehend and
represent beyond the sentence level; these are the larger units of discourse or text. In particular,
how do we integrate new information with old to create a coherent representation? How do we store
what we have heard and read?
In Chapter 13 we consider the process in reverse, and examine language production and its
disorders. By this stage we will have an understanding of the processes involved in understanding
language, and these processes must be looked at in a wider context (Chapter 14).
In Chapter 15 we will look at the structure of the language system as a whole, and the relation
between the parts. Finally, Chapter 16 looks at some possible new directions in psycholinguistics.
SECTION A
INTRODUCTION
This section describes what the rest of the book is about, discusses some important themes in the
psychology of language, and examines some important concepts used to describe language. You should
read this section before the others.
Chapter 1, The study of language, looks at the functions of language and how the study of language
plays a major role in helping to understand human behavior. We look at what language is and what it
is used for. After a brief look at the history and methods of psycholinguistics, the chapter covers
some current themes and controversies in modern psycholinguistics, including modularity, innateness,
and the usefulness of brain imaging, and studies involving people with brain damage, for looking at
language.
Chapter 2, Describing language, looks at the building blocks of language: sounds, words, and
sentences. The chapter then examines Chomsky’s approaches to syntax and how these have evolved over
the years.
CHAPTER 1
THE STUDY OF LANGUAGE
It is not surprising then that understanding language is an important part of understanding human
behavior, with different areas of scientific study emphasizing different aspects of language
processing. The study of the anatomy of language emphasizes the components of the articulatory tract,
such as the tongue and voice box. Neuroscience examines the role of different parts of the brain in
behavior. Linguistics examines language itself. Psycholinguistics is the study of the psychological
processes involved in language. Psycholinguists study understanding, producing, and remembering
language, and hence are concerned with listening, reading, speaking, writing, and memory for
language. They are also interested in how we acquire language, and the way in which it interacts with
other psychological systems. Many people think that “psycholinguistics” has a rather dated feel,
emphasizing the role of linguistics too much. Although the area might once have been about the
psychology of linguistic theory, it is now much more. Still, there is currently no better term, so it
will have to do.
One reason why we take language for granted is that we usually use it so effortlessly, and most of
the time, so accurately. Indeed, when you listen to someone speaking, or look at this page, you
normally cannot help but understand what has been said or what is printed on the page in front of
you. It is only in exceptional circumstances that we might become aware of the complexity involved:
if for example we are searching for a word but cannot remember it; if a relative or colleague has had
a stroke that has affected their language; if we observe a child acquiring language; if we try to
learn a second language ourselves as an adult; or if we are visually impaired or hearing impaired, or
if we meet someone else who is. And, of course, if you find this book so difficult to understand that
you have to keep reading and rereading it to make any sense of it. As we shall see, all of these
examples of what might be called “language in exceptional circumstances” reveal much about the
processes involved in speaking, listening, writing, and reading. But given that language processes
are normally so automatic, we also need to carry out careful experiments to understand what is
happening. Modern psycholinguistics is therefore closely related to other areas of cognitive
psychology, and relies to a large extent on the same sort of experimental methods. We construct
models of what we think is going on from our experimental results; we use observational and
experimental data to construct theories. This book will examine some of the experimental findings in
psycholinguistics, and the theories that have been proposed to account for them. Generally the
phenomena and data to be explained will precede discussion of the models, but it is not always
possible to neatly separate data and theories, particularly when experiments are tests of particular
theories. I’ll be talking a bit more about models and theories later.
This book has a cognitive emphasis. It is concerned with understanding the processes involved in
using and acquiring language. This is not just my personal bias; I believe that all our past
experience has shown that the problems of studying human behavior have yielded, and will continue to
yield, to investigation by the methods of cognitive psychology and neuroscience.
WHY STUDY LANGUAGE AND WHY IS IT SO DIFFICULT?
Even before I get on to saying what language is, I want to ask why we should study it. Some people
(mostly psycholinguists) think the answer is obvious, but in practice many students are often
perplexed as to why so much of their psychology course is devoted to the subject. What’s more I’ve
noticed that students often find the psychology of language the most difficult part of psychology.
It’s often the part they like least (and often actively dislike). So why should we study language?
Well, you’re reading this book right now, aren’t you? Reading words and sentences and making sense
of them (or trying to); that’s part of psycholinguistics, for starters. It’s a good bet that you’re
pretty good at reading, but you probably know someone who has had some difficulty in learning to
read, or even now finds reading and spelling difficult (that is, they have dyslexia). Perhaps you
know someone who has had a stroke and now finds reading difficult. More psycholinguistics!
But I bet you’ve listened to the radio or TV today, or listened to music with words (talking,
more psycholinguistics). I’ll be a little surprised if you’ve not talked to anyone at all (speaking,
listening; even yet more psycholinguistics). You’ve probably written something too (you get the
idea).
But even if by some miracle you haven’t, I bet you’ve heard a voice in your head. The voice in your
head probably uses words. In fact it’s hard (I find it impossible) to think about human thought
without thinking about language. So thinking, the essence of being human, is completely intertwined
with language.
What is more we transmit our learning and culture by language. The major reason civilization has
reached its heights, that we live in centrally heated houses with thin computers and cell phones,
using social networking sites, is because we have built up a culture and a technology that would have
been completely impossible without language. For this reason the evolutionary biologist Martin Nowak
(2006) says that language is “the most interesting invention of the last 600 million years” (p. 250).
He says that the impact of language is comparable with only a few other events in biological history,
such as the evolution of life and the evolution of multi-celled animals.
So here is my list of reasons why the study of the psychology of language is so important:
1. We use language nearly all the time; technology and our cultures would be impossible without it.
2. We usually think in language.
3. Some people have difficulty learning spoken or written language (developmental disorders), or
have difficulty with language as a consequence of brain damage (acquired disorders).
We can agree then that studying language is important; but why do so many students find it hard? I
think there are several reasons. First, the importance and applications of language are not always
made as clear as they might be. If I told you that I could teach you to read a textbook in a way that
would guarantee you’d remember it and understand it and get an A in an exam, you’d probably pay
attention. (Sadly I can’t, otherwise I would be very rich, although later I will give you some tips.)
So in this book I’ve tried to emphasize the importance of the applications of the psychology of
language. Second, the subject seems to have a lot of jargon in it, and teachers sometimes forget this
or underestimate their students’ knowledge. How can you be expected to understand what a reduced
relative clause is when you don’t know what a clause is? Or even aren’t that clear what a noun is?
I’ve tried to make life as easy as possible by defining all technical terms, trying to keep jargon to
a minimum, and providing a glossary which contains a simple definition of every technical term I can
think of. Third, psycholinguists are an argumentative bunch, and rarely seem to agree on anything.
Sometimes they can’t even agree whether they agree or not. So there are few situations when we can
say “now THAT’s the answer.” And people like answers. They don’t like to be left with the conclusion
“it could be this or it could be that and it all depends,” and that’s going to be my conclusion most
of the time. But life is full of uncertainties, so get over it and live with it. And the final reason
that people find psycholinguistics difficult is because it’s full of models. A colleague once told me
that she overheard some students talking in front of her (yes, we love to eavesdrop) and one said to
the other “language: it’s just all these models.” Models are the most important thing in science;
they’re the closest we get to an explanation. I’ll talk about models below.
WHAT IS LANGUAGE?
It might seem natural at this point to say exactly what is meant by “language,” but to do so is much
harder than it first appears. We all have some intuitive notion of what language is; a simple
definition might be that it is “a system of symbols and rules that enable us to communicate.” Symbols
are things that stand for other things: Words, either written or spoken, are symbols. The rules
specify how words are ordered to form sentences. However, providing a strict definition of language
is not straightforward. Consider other systems that many think are related to human spoken language.
Are the communication systems of monkeys a language? What about the “language” of dolphins, or the
“dance” of honey bees that communicates the location of sources of nectar to other bees in the hive?
How
FIGURE 1.1 [Diagram: LINGUISTICS and its branches: PHONETICS (the study of raw sounds), PHONOLOGY
(the study of how sounds are used within a language), MORPHOLOGY (the study of words and word
formation), SYNTAX (the study of word order), SEMANTICS (the study of meaning), PRAGMATICS (the
study of language use)]
are many other language units (e.g., sounds and sentences). Crystal (2010, p. 461) defines a word as
“the smallest unit of grammar that can stand on its own as a complete utterance, separated with
spaces in written language.” Hence “pigs” is a word, but the word ending “-ing” by itself is not. A
word can in turn be analyzed at a number of levels. At the lowest level, it is made up of sounds, or
letters if written down. Sounds combine together to form syllables. Hence the word “cat” has three
sounds and one syllable; “houses” has two syllables; “syllable” has three syllables.
Words can also be analyzed in terms of the morphemes they contain. Consider a word like “ghosts.”
This is made up of two units of meaning: the idea of “ghost,” and then the plural ending or
inflection (“-s”), which conveys the idea of number: in this case that there is more than one ghost.
Therefore we say that “ghosts” is made up of two morphemes, the “ghost” morpheme and plural morpheme
“s.” The same can be said of past tense endings or inflections: “Kissed” is also made up of two
morphemes, “kiss” plus the “-ed” past tense inflection which signifies that the event happened in the
past. There are two sorts of inflection, regular forms that follow some rule, and irregular forms
that do not. Irregular forms that do not obey the general rule of forming plurals by adding an “-s”
to the end of a noun, or forming the past tense by adding a “-d” or “-ed” to the end of a verb, also
contain at least two morphemes. Hence “house,” “mouse,” and “do” are made up of one morpheme, but
“houses,” “mice,” and “does” are made up of two. “Rehoused” is made up of three morphemes: “house”
plus “re-” added through mechanisms of derivational morphology, and “-ed” added by inflection. Every
child’s favorite word “antidisestablishmentarianism” is made up of six morphemes.
Psychologists believe that we store representations of words in a mental dictionary. We call this
mental dictionary the lexicon. The lexicon contains all the information (or at least pointers to all
of the information) that we know about a word, including its sounds (phonology), meaning (semantics),
written appearance (orthography), and the syntactic roles the word can adopt. The lexicon must be
huge: estimates vary greatly, but a reasonable estimate is that an adult knows about 70,000 words
(Nagy & Anderson, 1984; but by “greatly” I mean that the estimates range between 15,000 and 150,000;
see Bryson, 1990). Recognizing a word is rather like looking it up in a dictionary; when we know what
the word is, we have access to all the information about it, such as what it means and how to spell
it. So when we see or hear a word, how do we access its representation within the lexicon? How do we
know whether an item is stored there or not? What are the differences between understanding speech
and understanding visually presented words? Psycholinguists are particularly interested in the
processes of lexical access and how things are represented.
HOW HAS LANGUAGE CHANGED OVER TIME?
Language must have changed enormously over time, and one obvious consequence of these changes is
that there are now many different languages in the world. Depending on exactly what counts as a
separate language, there are now thought to be around 5,000–6,000 (but the number is getting smaller
as languages, like species, become extinct), although estimates vary between 2,700 and 10,000. We do
not even know whether all human languages are descended from one common ancestor, or whether they are
derived from a number of ancestors (my bet is on one). However, it is apparent that many languages
are related to each other. This relation is apparent in the similarity of many of the words of some
languages (e.g., “mother” in English is “Mutter” in German, “moeder” in Dutch, “mère” in French,
“maht” in Russian, and “mata” in Sanskrit). More detailed analyses like this have shown that most of
the languages of Europe, and parts of west Asia, derive from a common source called
proto-Indo-European. All the languages that are derived from this common source are called
Indo-European. We can gather ideas about where the speakers of the ancestral language came from, by
looking at the words that are shared in the descendant languages. For example, all Indo-European
languages have similar words for horses and sheep, but not for palm tree or vine. Hence the
original language must have been spoken somewhere where it was easy to find horses and sheep, but
where palms and vines could not be found. Such observations suggest that the speakers of
proto-Indo-European probably spread out from Anatolia (approximately modern-day Turkey) with the
expansion of agriculture about 9,000 years ago (Bouckaert et al., 2012). Indo-European has a number
of main branches: the Romance (such as French, Italian, and Spanish), the Germanic (such as German,
English, and Dutch), and the Indian languages (see Figure 1.2). There are some languages that are
European but that are not part of the Indo-European family. Finnish and Hungarian are from the
Finno-Ugric branch of the Uralic family of languages. There are many other language families in
addition to Indo-European, including Afro-Asiatic (covering north Africa and the Arabian peninsula),
Niger-Congo, Japanese, Sino-Tibetan, and families of languages spoken in and around the Pacific and
in north and south America. Altogether linguists have identified over 100 language families, although
a few languages, such as Basque, do not seem to be part of any family. The extent to which these
large families may be related further back in time is unknown.
Languages also change over relatively short time spans. Chaucerian and Elizabethan English are
obviously different from modern English, and even Victorian speakers would sound decidedly archaic to
us today, my dear old bean. Even listening to 1970s sitcoms can be disconcerting at times. We coin
new words or new uses of old words when necessary (e.g., “computer,” “television,” “internet,”
“rap”). Whole words drop out of usage (“thee” and “thou”), and we lose the meanings of some words,
sometimes over short time spans; rather sadly I can’t remember the last time I had to give a
measurement in rods or chains. We borrow (or perhaps steal is a better word) words from other
languages (“café” from French, “potato” from Haiti, and “shampoo” from India). Sounds change in words
(“sweetard” becomes “sweetheart”). Words are sometimes even created by error: “pea” was back-formed
from “pease” as people started to think (incorrectly) that “pease” was plural (Bryson, 1990).
We most definitely should not gloss over differences between languages. Although they have arisen
over a relatively short time compared with the evolution of humans, we cannot assume that speakers of
different languages process them in the same basic way. Whereas it is likely that most of the
mechanisms involved are the same, there might be some differences, particularly in the processing of
written or printed words. Writing is a recent development compared with speech, and as we shall see
in Chapters 7 and 8, there
[Figure: INDO-EUROPEAN LANGUAGES]
early and mid-1960s when psycholinguists tried to relate language processing to Chomsky’s transformational grammar. Since then psycholinguistics has left its linguistic home and achieved independence, flourishing on all fronts.

As its name implies, psycholinguistics has its roots in the two disciplines of psychology and linguistics, and particularly in Chomsky’s approach to linguistics. Linguistics is the study of language itself, the rules that describe it, and our knowledge about the rules of language. The primary concerns of early linguistics were rather different from what they are now. Comparative linguistics was concerned with comparing and tracing the origins of different languages. In particular, the American tradition of the linguist Leonard Bloomfield (1887–1949) emphasized comparative studies of indigenous North American Indian languages, leading to an emphasis on what is called structuralism: A primary goal of linguistics was taken to be providing an analysis of the appropriate categories of description of the units of language (Harris, 1951).

In modern linguistics the primary data used by linguists are intuitions about what is and is not an acceptable sentence. For example, we know that the string of words in (1) is acceptable, and we know that (2) is ungrammatical. How do we make these decisions? Can we formulate general rules to account for our intuitions? (An asterisk conventionally marks an ungrammatical construction.)

(1) What did the pig give to the donkey?
(2) *What did the pig sleep to donkey?

This emphasis on our knowledge led to greater emphasis on what humans do with language, rather than just on its structure.

Early psychological approaches to language saw the language processor as a simple device that could generate and understand sentences by moving from one state to another. There are two strands in this early work, derived from information theory and behaviorism. Information theory (Shannon & Weaver, 1949) emphasized the role of probability and redundancy in language, and developed out of the demands of the fledgling telecommunications industry. Working out what was the most likely continuation of a sentence from a particular point onwards was central to this approach. Information theory was also important because of its influence in the development of cognitive psychology. In the middle part of the twentieth century, the dominant tradition in psychology was behaviorism, which emphasized the relation between an input (or stimulus) and output (response), and how conditioning and reinforcement formed these associations. Intermediate constructs (such as the mind) were considered unnecessary to provide a full account of behavior. For behaviorists, the only valid subject matter for psychology was behavior, and language was behavior just like any other sort. Its acquisition and use could therefore be explained by standard techniques of reinforcement and conditioning. This approach perhaps reached its acme in 1957 with the publication of B. F. Skinner’s famous (or to linguists, infamous) book Verbal Behavior.

Psycholinguistic tests of Chomsky’s linguistic theory

Attitudes changed very quickly: in part this change was due to a devastating review of Skinner’s book by Chomsky (1959). The American linguist Noam Chomsky (b. 1928) has had more influence on how we understand language than any other person. Unusually, the book review came to be more influential than the book it reviewed. Chomsky showed that behaviorism was incapable of dealing with natural language. He argued that a new type of linguistic theory called transformational grammar provided both an account of the underlying structure of language and also of people’s knowledge of their language (see Chapter 2 for more details). Psycholinguistics blossomed in attempting to test the psychological implications of this linguistic theory, and the influence of linguistics peaked in the late 1960s and early 1970s. The enterprise was not wholly successful, and experimental results suggested that, although linguistics might tell us a great deal about our knowledge of our language and about the constraints on children’s acquisition of language, it is limited in what it can tell us about the processes involved in speaking and understanding.
The rest of this section is rather technical and can be skipped on the first reading. You might like to return to it before or after reading Chapter 10 on parsing.

What can the linguistic approach contribute to our understanding of the processes involved in producing and understanding syntactic structures? When Chomsky’s work first appeared, there was great optimism that it would also provide an account of these processes. Two ideas attracted particular interest and were considered easily testable: these were the derivational theory of complexity (DTC), and the autonomy of syntax. The idea of the derivational theory of complexity is that the more complex the formal syntactic derivation of a sentence (that is, the more transformations that are necessary to form it), the more complex the psychological processing necessary to understand or produce it, meaning that transformationally complex sentences should be harder to process than less complex sentences. This additional processing complexity should be detectable by an appropriate measure such as reaction times. The psychological principle of the autonomy of syntax takes Chomsky’s assertion that syntactic rules should be specified independently of other constraints further, to mean that syntactic processes operate independently of other ones. In practice this means that syntactic processes should be autonomous with respect to semantic processes.

Chomsky (1957) distinguished between optional and obligatory transformations. Obligatory transformations were those without which the sentence would be ungrammatical. Examples include transformations introduced to cope with number agreement between nouns and verbs, and the introduction of “do” into negatives and questions. Other transformations were optional. For example, the passivization transformation takes the active form of a sentence and turns it into a passive form, for instance turning (3) into (4):

(3) Boris applauded Agnes.
(4) Agnes was applauded by Boris.

Chomsky defined a subset of sentences that he called kernel sentences. Kernel sentences are those to which only obligatory transformations have been applied. They are therefore the active, affirmative, declarative forms of English sentences.

Miller and McKean (1964) tested the idea that the more transformations there are in a sentence, the more difficult it is to process. They looked at detransformation reaction times to sentences such as (5) to (9). Participants were told that they would have to make a particular transformation on a sentence, and then press a button when they found this transformed sentence in a list of sentences through which they had to search. Miller and McKean measured these times.

(5) The robot shoots the ghost. (0 transformations: active affirmative form)
(6) The ghost is shot by the robot. (1 transformation: passive)
(7) The robot does not shoot the ghost. (1 transformation: negative)
(8) The ghost is not shot by the robot. (2 transformations: passive + negative)
(9) Is the ghost not shot by the robot? (3 transformations: passive + negative + question)

We can derive increasingly complex sentences from the kernel (5). For example, (9) is derived from (5) by the application of three transformations: passivization, negativization, and question formation. Miller and McKean found that the time it took to detransform sentences with transformations back to the kernel was linearly related to the number of transformations in them. That is, the more transformations a participant has to make, the longer it takes them to do it. This was interpreted as supporting the psychological reality of transformational grammar.

Other experiments around the same time supported this idea. Savin and Perchonock (1965) found that sentences with more transformations in them took up more memory space. The more transformationally complex a sentence was, the fewer items participants could simultaneously remember from a list of unrelated words. Mehler (1963) found that when participants made errors in remembering sentences, they tended to do it in the direction of forgetting transformational tags, rather than adding them. It was as though participants remembered sentences in the form of “kernel plus transformation.”
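The linear prediction of the derivational theory of complexity can be made concrete with a small sketch. It uses sentences (5) to (9) and their transformation counts from the text; the base time and the cost per transformation are hypothetical values chosen only for illustration, and they are not Miller and McKean's data.

```python
# Sentences (5)-(9) from the text, each with the transformations listed for it.
SENTENCES = {
    "The robot shoots the ghost.": [],
    "The ghost is shot by the robot.": ["passive"],
    "The robot does not shoot the ghost.": ["negative"],
    "The ghost is not shot by the robot.": ["passive", "negative"],
    "Is the ghost not shot by the robot?": ["passive", "negative", "question"],
}

# Hypothetical parameters for illustration only (not empirical estimates).
BASE_MS = 1000.0  # assumed time for a kernel sentence
COST_MS = 400.0   # assumed extra time per transformation


def predicted_time_ms(transformations, base_ms=BASE_MS, cost_ms=COST_MS):
    """DTC prediction: time grows linearly with the number of transformations."""
    return base_ms + cost_ms * len(transformations)


def dtc_predictions(sentences=SENTENCES):
    """Map each sentence to its predicted detransformation time in ms."""
    return {s: predicted_time_ms(t) for s, t in sentences.items()}
```

Under these assumptions the passive (6) and the negative (7) receive the same predicted time, because the sketch counts transformations and ignores which ones they are. That simplification is part of what the theory claims, and it is the kind of claim an experiment can test.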
Problems with the psychological interpretation of transformational grammar

The tasks that supported the psychological reality of transformational grammar all used indirect measures of language processing. If we ask participants explicitly to detransform sentences, it is not surprising that the time it takes to do this reflects the number of transformations involved. However, this is not a task that we necessarily routinely do in language comprehension. Memory measures are not an on-line measure of what is happening in sentence processing; at best they are reflecting a side effect. What we remember of a sentence need bear no relation to how we actually processed that sentence. Indeed, other findings that were difficult to fit into this framework soon emerged.

Slobin (1966a) performed an experiment similar to the original detransformation experiment of Miller and McKean. Slobin examined the processing of what are called reversible and irreversible passive sentences. A reversible passive is one where the subject and object of the sentence can be reversed and the sentence still makes pragmatic sense. An irreversible passive is one that does not make sense after this reversal. If you swap the subject and object in (10) you get (12), which makes perfect sense, whereas if you do this to (11) you get (13), which, although not ungrammatical, is rather odd; it is semantically anomalous:

(10) The ghost was chased by the robot.
(11) The flowers were watered by the robot.
(12) The robot was chased by the ghost.
(13) ? The robot was watered by the flowers.

In the case of an irreversible passive, you can work out what is the subject of the sentence and what is the object by semantic clues alone. With a reversible passive, you have to do some syntactic work. Slobin found that Miller and McKean’s results could only be obtained for reversible passives. Hence detransformational parsing only appears to be necessary when there are not sufficient semantic cues to the meaning of the sentence from elsewhere. This result means that the derivational theory of complexity does not always obtain. Slobin’s finding that the depth of syntactic processing is affected by semantic considerations such as reversibility is also counter to the idea of the autonomy of syntax, although this proved more controversial. Using different materials and a different task (judging whether the sentence was grammatical or not), Forster and Olbrei (1973) found no effect of reversibility, and more recently Ferreira (2003) found that there was always some cost to processing a passive sentence, even irreversible ones. Taken together, these results mean that what we observe depends on the details of the tasks used, but both syntactic and semantic factors have an effect on the difficulty of sentences.

Wason (1965) examined the relation between the structure of a sentence and its meaning. He measured how long it took participants to complete sentences describing an array of eight colored circles, seven of which were red and one of which was blue. It is more natural to use a negative in a context of “plausible denial”; that is, it is more appropriate to say “this circle is not red” of the exception than to say “this circle is not blue” of each of the others. In other words, the time it takes to process a syntactic construction such as negative-formation depends on the semantic context.

In summary, early claims supporting the ideas of derivational complexity in linguistic performance that were derived from Chomsky’s formulation of grammar were at best premature, and perhaps just wrong. As we shall see in later chapters, the degree to which syntactic and semantic processes are independent turns out to be one of the most important and controversial topics in psycholinguistics.

Linguistic approaches have given us a useful terminology for talking about syntax. They also illuminate how powerful the grammar that underlies human language must be. Chomsky’s theory of transformational grammar also had a major influence on the way in which psychological syntactic processing was thought to take place. In spite of this initial promise, later experiments provided little support for the psychological reality of transformational grammar. Chomsky had a retreat available: Linguistic theories describe our linguistic competence, our abstract knowledge of language, rather than our linguistic performance,
what we actually do. That is, transformational grammar is a description of our knowledge of our linguistic competence, and the constraints on language acquisition, rather than an account of the processes involved in parsing on a moment-to-moment basis. This has effectively led to a separation of linguistics and psycholinguistics, with each pursuing these different goals. Miller, who first provided apparent empirical support for the psychological reality of transformational grammar, later came to believe that all the time taken up in sentence processing was used in semantic operations.

Psycholinguistics and information processing

Psycholinguistics was largely absorbed into mainstream cognitive psychology in the 1970s. In this approach, the information processing or computational metaphor reigned supreme. Information processing approaches to cognition view the mind as rather like a computer. The mind uses rules to translate an input such as speech or vision into a symbolic representation: cognition is symbolic processing. This approach can perhaps be seen at its clearest in a computational account of vision, such as that of Marr (1982), where the representation of the visual scene becomes more and more abstract from the retinal level through increasingly sophisticated representations. Processing could be represented as flow diagrams, in the same way that complex tasks could be represented as flow diagrams before being turned into a computer program. Flow diagrams illustrate levels of processing, and much work during this time attempted to show how one level of representation of language is transformed into another. The computational metaphor is clearly influential in modern psycholinguistics, as most models are phrased in terms of the description of levels of processing, and the rules or processes that determine what happens in between. We will see this type of approach throughout this book. Many traditional psycholinguistic models are specified as “box-and-arrow” diagrams, with boxes referring to processing levels, and the arrows being the means of getting from one box to another (see Chapters 7 and 13 in particular for examples). This approach is sometimes called, rather derogatorily, “boxology.” It is certainly not unique to psycholinguistics, and such an approach is not as bad as is sometimes hinted. It at least gives rise to an understanding of the architecture of the language system: what the “boxes” of the language system are, and how they are related to others.

As a consequence of the influence of the computational metaphor, and with the development of suitable experimental techniques, psycholinguistics gained an identity independent of linguistics. Modern psycholinguistics is primarily an experimental science, and as in much of cognitive psychology, experiments measuring reaction times have been particularly important (especially in word recognition and comprehension; see Chapters 6 through 12). Psychologists try to break language processing down into its components, and show how those components relate to each other.

The “cognitive science” approach

The term “cognitive science” is used to cover the multidisciplinary approach to the study of the mind, with the disciplines including adult and developmental psychology, philosophy, linguistics, anthropology, neuroscience, and artificial intelligence (AI). We have already seen how linguistics influenced early psycholinguistics, particularly early work on syntax. Philosophy has played an important role in our understanding of meaning. AI involves getting computers to do things that appear to need intelligence, such as understanding a story, or understanding speech. Apart from the obvious technological uses of AI, the hope has been that it will increase our understanding of how humans do such tasks. AI involves computer modeling of human behavior emphasizing rules, plans, and goals. Implementing a model in a computer program has the great advantage that you have to be explicit about everything. There are no hidden assumptions in models that are implemented as programs. On the other hand, this compulsory explicitness has a significant disadvantage: Because modelers have to be explicit about everything, they often have to make unwarranted assumptions about processes and representations, and simplify the context or the scope of the model to a degree where its
usefulness becomes debatable. Nevertheless, computational models have many advantages and have become increasingly important.

If you have seen Kubrick’s film 2001: A Space Odyssey, you will probably remember HAL, the talking, listening, all-seeing, thinking computer. AI researchers hope someday to emulate HAL’s linguistic abilities. Some early successes made researchers optimistic that this goal might soon be realized. Among the earliest of these successes was Weizenbaum’s (1966) program, called ELIZA. The program simulated a non-directive, Rogerian therapist. In the following example, the output of the computer (the ELIZA program) is in capitals.

Men are all alike.
IN WHAT WAY
They’re always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says I’m depressed much of the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED
It’s true that I am unhappy.
DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY
I need some help, that much seems certain.
WHAT WOULD IT MEAN IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY

The program worked in a very simple way, using a few techniques such as matching sentences to pre-stored structures or templates, producing predetermined responses to a few keywords (such as “mother”), and repeating what was input to it back with a few appropriate changes in word order. Nevertheless, people who used the program became very attached to it, even if they knew it was only a machine producing the output. Although some people even wanted to consult regularly with ELIZA in private, this perhaps says more about human concerns than it does about how language is produced. There were a number of variants of ELIZA, such as PARRY (Colby, 1975), which “simulated” the output of a paranoid schizophrenic.

The next influential program was called SHRDLU (Winograd, 1972; the name came from the letters of one row of a typesetting machine and was often used by typesetters to flag a mistake). This program could answer questions about an imaginary world called “blocksworld.” Blocksworlds are occupied by objects such as small red pyramids sitting on top of big blue cubes. SHRDLU’s success in being able to “understand” sentences such as “move the small red pyramid on top of the blue cube” was much hailed at the time.
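As an aside on ELIZA, the keyword and template techniques described for it can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Weizenbaum's program: the rules, the default reply, and the function name `respond` are all hypothetical, written only so that three exchanges from the sample dialogue are reproduced.

```python
import re

# Hypothetical keyword rule: a predetermined response to one keyword.
KEYWORD_RESPONSES = {
    "mother": "TELL ME MORE ABOUT YOUR FAMILY",
}

# Hypothetical templates: a pattern plus a frame that echoes part of the input.
TEMPLATES = [
    (re.compile(r"\bmy (.+?) made me (.+)$", re.IGNORECASE), "YOUR {0} MADE YOU {1}"),
    (re.compile(r"\bi need ([^,]+)", re.IGNORECASE), "WHAT WOULD IT MEAN IF YOU GOT {0}"),
]

DEFAULT_REPLY = "PLEASE GO ON"  # assumed fallback when nothing matches


def respond(utterance):
    """Return a canned or template-based reply; no syntactic analysis is done."""
    text = utterance.strip().rstrip(".!?")
    # 1. Keyword rules take priority.
    for keyword, reply in KEYWORD_RESPONSES.items():
        if re.search(rf"\b{re.escape(keyword)}\b", text, re.IGNORECASE):
            return reply
    # 2. Otherwise try each template and echo the captured words back.
    for pattern, frame in TEMPLATES:
        match = pattern.search(text)
        if match:
            return frame.format(*(group.upper() for group in match.groups()))
    return DEFAULT_REPLY
```

With these rules, `respond("Well, my boyfriend made me come here.")` returns `YOUR BOYFRIEND MADE YOU COME HERE`. Nothing in the sketch represents what a boyfriend or a family is, which is the point of the criticism that such programs do not understand language.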
However, SHRDLU could only “understand” inasmuch as it could give an appropriate response to an instruction, and most people would say that there is much more to understanding than this. Furthermore, these early demonstrations worked only for very simple, limited domains. SHRDLU could not answer questions about elephants, or even say what “block” means. Its knowledge was limited to the role of blocks within blocksworld.

These early attempts did have the virtue of demonstrating the enormity of the task in understanding language. They also revealed the main problems that have to be solved before we can talk of computers truly understanding language. There are an infinite number of sentences, of varying degrees of complexity. We can talk about and understand potentially anything. The roles that context and world knowledge play in understanding are very important: potentially any piece of information we know could be necessary to understand a particular sentence. The conventional AI approach has had some influence on psycholinguistic theorizing, particularly on how we understand syntax and how we make inferences in story comprehension.

ELIZA and SHRDLU had extremely primitive syntactic processing abilities. ELIZA used templates for sentence recognition, and did not compute the underlying syntactic structure of sentences (a process known as parsing). SHRDLU was a little more sophisticated, and did contain a syntactic processor, but the processor was dedicated to the extraction of the limited semantic information necessary to move around “blocksworld.” Early AI parsers lacked the computational power necessary to analyze human language.

The influence of AI on psycholinguistics peaked in the 1970s. More recently an approach called connectionism (but also known as parallel distributed processing, or neural networks) has become influential in all areas of psycholinguistics. Connectionist networks involve many very simple, richly interconnected neuron-like units working together without an explicit governing plan. Instead, rules and behavior emerge from the interactions between these many simple units. The principles of connectionist models are described more fully in the Appendix.

One concept that is central in many types of model, including connectionist models, is the idea of activation. The idea has been around for a long time. Activation is a continuously varying quantity, and can be thought of as a property rather like heat. We talk of how activation can spread from one unit or word or point in a network to another, rather like electricity flowing around a circuit board. Suppose we hear a word such as “ghost.” If we assume there is a unit corresponding to that word, it will have a very high level of activation. But a word related in meaning (e.g., “vampire”) or sound (e.g., “goal”) might also have a small amount of activation, whereas a completely unrelated word (e.g., “pamphlet”) will have a very low level of activation. The idea that the mind uses something like activation, and that the activation level of units (such as those representing words) can influence the activation levels of similar items, is an important one.

The methods of modern psycholinguistics

Psycholinguistics uses many types of evidence. We will use examples of observational studies and linguistic intuitions, and make use of the errors people make. Much has been learned from computer modeling. Recently, neuroscience has contributed greatly to our understanding. But the bulk of our data, as you will see if you just quickly skim through the rest of this book, comes from traditional psychology experiments, particularly those that generate reaction times. For example, how long does it take to read out a word? What can we do to make the process faster or slower? Do words differ in the speed with which we can read them out depending on their properties? The advantage of this type of experiment is that it is now very easy to run on modern computers. In many experiments, the collection of data can be completely automated. There are a number of commercial (and free) experimental packages available for both PC and Macintosh computers that will help run your experiments for you, or you can program the computer yourself.
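The idea of activation introduced in this section lends itself to a small programming exercise. The sketch below uses the words from the “ghost” example; the network, the link weights, and the resting level are hypothetical values invented for illustration, and real connectionist models are considerably richer.

```python
# Hypothetical link weights between word units (invented for illustration).
LINKS = {
    "ghost": {"vampire": 0.3, "goal": 0.2},  # related in meaning / in sound
    "vampire": {"ghost": 0.3},
    "goal": {"ghost": 0.2},
    "pamphlet": {},                          # unrelated to the others
}

RESTING_LEVEL = 0.05  # assumed low baseline activation for every unit


def hear(word, links=LINKS, resting=RESTING_LEVEL):
    """Activate the heard word fully and spread one step to its neighbors."""
    activation = {unit: resting for unit in links}
    activation[word] = 1.0
    for neighbor, weight in links[word].items():
        # Each neighbor receives a share of the source's activation.
        activation[neighbor] += weight * activation[word]
    return activation
```

Calling `hear("ghost")` leaves “ghost” most active, “vampire” and “goal” partly active, and “pamphlet” at its resting level, which mirrors the ordering described in the text. The sketch spreads activation for one step only and includes no decay or inhibition.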
One of the most popular experimental techniques is called priming. Priming has been used in almost all areas of psycholinguistics. The general idea is that if two things are similar to each other and involved together in processing, they will either assist with or interfere with each other, but if they are unrelated, they will have no effect. For example, it is easier to recognize a word (e.g., BREAD) if you have just seen a word that is related in meaning (e.g., BUTTER). This effect is called semantic priming. If priming causes processing to be speeded up, we talk about facilitation; if priming causes it to be slowed down, we talk of inhibition.

Most psycholinguistic research has been carried out on healthy monolingual English-speaking college students, in the visual modality (i.e., with printed words). Psycholinguistic research does not differ from other types of psychology in this bias, but it does have consequences: for example, it has meant that there has been a great deal of research on reading when, for most people, speaking and listening are the main language activities in their lives. Fortunately, in recent years this situation has changed dramatically, and we are now seeing the fruits of research on speech recognition, on language production, on speakers of different languages, on bilingual speakers, on people with brain damage, and on people across the full range of the lifespan. A lot of this work has been spurred by recent developments in brain imaging, which over the last few years has revolutionized how we understand language.

MODELS IN PSYCHOLINGUISTICS

What do we do when we have a lot of data? We have to explain it. We do this by constructing a model of the data. A good model is an account of the data that provides an explanation of why the data are as they are and that makes novel, testable predictions. Psycholinguistics is full of models, and they’re very important.

At this point it is useful to explain what is meant by the words “data,” “theory,” “model,” and “hypothesis.” Data are the pieces of evidence that have to be explained. Types of data include experimental results, case studies of people with brain damage, brain scans, and observations of people using language correctly or incorrectly. A theory is a general explanation of how something works. A model is rather more specific: For example, computer simulations are models of processes that are particular instances or parts of more abstract theories. The distinction between a model and a theory is a bit fuzzy though, so don’t worry about it too much. A hypothesis is a very specific idea that can be tested. An experimental test that confirms the hypothesis is support for the particular theory from which the hypothesis was derived. If the hypothesis is not confirmed, then some change to the theory is necessary. It may not be necessary to reject the theory completely, but as long as the hypothesis is derived fairly from the theory, then some modification will be necessary. Testing theories by making predictions and trying to falsify them is a fundamental part of science. And that’s why psycholinguistics is a part of science.

What’s an explanation then? An explanation simplifies. If you carry out an experiment and make one hundred observations, an explanation of those observations is something simpler than those hundred data points. Suppose you could summarize why you got those observations in a sentence or one mathematical equation; that would be a good explanation (and the equation would also serve as a model). Explanations should also avoid being circular. A circular explanation is one that explains itself in terms of itself; for example, we could say children learn language because they have a language acquisition module, and define the language acquisition module as what enables children to learn languages. Good explanations transcend levels; complex phenomena are explained in terms of simpler descriptions, and may involve different areas.

Good models also make use of converging evidence: evidence from different sources that comes together. A model of some behavior that is expressed as a computer model and makes novel, falsifiable predictions about real human behavior is a good one, particularly if it is supported by evidence from other areas such as the study of the brain.
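The priming logic described earlier in this section gives a simple example of a hypothesis that can be checked against data. The sketch below computes a priming effect as a difference in mean reaction times and names it facilitation or inhibition; the function names and the sample reaction times are hypothetical, and a real analysis would also need inferential statistics.

```python
from statistics import mean


def priming_effect_ms(unrelated_rts_ms, related_rts_ms):
    """Mean RT after an unrelated prime minus mean RT after a related prime."""
    return mean(unrelated_rts_ms) - mean(related_rts_ms)


def classify_effect(effect_ms):
    """Positive effect = faster after a related prime = facilitation."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "inhibition"
    return "no effect"


# Invented reaction times (ms) for recognizing BREAD after each kind of prime.
AFTER_UNRELATED = [620, 640, 600]  # e.g., after an unrelated word
AFTER_RELATED = [580, 560, 570]    # e.g., after BUTTER
```

With these invented numbers the effect is 50 ms and `classify_effect` reports facilitation. The hypothesis "related primes speed recognition" would be disconfirmed if the effect came out zero or negative in real data.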
[Figure: brain diagram with labels for the cerebral cortex, cingulate gyrus, parietal lobe, hippocampus, frontal lobe, corpus callosum, thalamus, and suprachiasmatic nucleus]
Although the traditional interpretation of a double dissociation is that two separate routes are involved in a process, connectionist modeling has shown that this might not always be the case. Apparent double dissociations can emerge in complex, distributed, single-route systems (e.g., Plaut & Shallice, 1993a; Seidenberg & McClelland, 1989; both are described in Chapter 7). At the very least, we should be cautious about inferring that the routes involved are truly distinct and do not interact (Ellis & Humphreys, 1999).
Some more general care is necessary when making inferences from neuropsychological data. Some researchers have questioned the whole enterprise of trying to understand normal processing by studying brain-damaged behavior. Gregory (1961) made an analogy of attempting to discover how a radio set works by removing its components. If we did this, we might conclude that the function of a capacitor (an electrical component) was to inhibit loud wailing sounds! Furthermore, the categories of disorder that we will discuss are not always clearly recognizable in the clinical setting. There is often much overlap between patients, with the more pure cases usually associated with smaller amounts of brain damage. Finally, things are not usually in a fixed state as a result of brain damage; intact processes reorganize, and some recovery of function often occurs, even in adults.

Neuroimaging
Reaction times enable us to infer how the mind works; lesion studies enable us to infer which part of the brain does what; suppose we could look directly at how the brain works? New techniques of brain imaging are gradually becoming more accurate and more accessible. As a consequence brain imaging has been one of the most widely used and important techniques in psycholinguistics in the last few years.
Traditional X-rays are of limited use to us because the skull blocks the view of the brain and, in any case, there is little variation in the density of the brain. Hence neuroscientists have had to use even more ingenious techniques. These are based on measuring the brain’s electrical activity, or creating images of brain activity. Ideally, we would like both good temporal (being able to separate and time events very accurately) and spatial (being able to localize very accurately in space in the brain) resolution.
EEGs (electroencephalograms) and ERPs (event-related potentials) both measure the electrical activity of the brain by putting electrodes on the scalp. ERPs measure voltage changes on the scalp associated with the presentation of a stimulus (see Figure 1.6). The peaks of an ERP are labeled according to their polarity (positive or negative voltage) and latency in milliseconds (thousandths of a second) after the stimulus begins (Kutas & van Petten, 1994). The N400 is a much-studied peak occurring after a semantically incongruent sentence dog (Kutas & Hillyard, 1980). Of course, that previous sentence should have ended with “sentence completion,” and “dog” should therefore have generated a large N400 in you. P300 peaks are elicited by any stimuli requiring a binary decision (yes/no). The contingent negative variation (CNV) is a slow negative potential that develops on the scalp when a person is preparing to make a motor action or to process sensory stimuli.
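The labeling convention for ERP peaks (a letter for polarity, a number for latency) can be made concrete with a small sketch. The function name `erp_peak_label` and its parameters are hypothetical, introduced only for illustration. The sketch assumes that the latency passed in is already the nominal latency used in the label; observed peaks in real recordings vary around that nominal value.

```python
def erp_peak_label(amplitude_microvolts: float, nominal_latency_ms: int) -> str:
    """Build a conventional ERP peak label such as 'N400' or 'P300'.

    Hypothetical helper for illustration only:
    - the letter encodes polarity (N = negative, P = positive voltage);
    - the number is the nominal latency in milliseconds after stimulus onset.
    """
    if amplitude_microvolts == 0:
        raise ValueError("a peak needs a non-zero amplitude to have a polarity")
    polarity = "N" if amplitude_microvolts < 0 else "P"
    return f"{polarity}{int(nominal_latency_ms)}"


# A negative-going peak about 400 ms after an incongruent word.
n400_label = erp_peak_label(-4.2, 400)
# A positive-going peak about 300 ms after a stimulus needing a yes/no decision.
p300_label = erp_peak_label(3.0, 300)
print(n400_label, p300_label)
```

The amplitude values here are made up; only the sign matters for the label.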
EEG and ERP have very good temporal resolution: they can currently resolve the timing of events to within a millisecond or so. Their spatial resolution, however, is very poor. MEG (magnetoencephalography) is a recent development that measures the magnetic activity of the brain. MEG has the advantage of both very good temporal and spatial (within 3 mm) resolution, but is more difficult to carry out and much more expensive to run, needing superconducting devices called SQUIDS, extreme cooling using liquid helium, and magnetic shielding.
CAT (computerized axial tomography) produces medium-resolution images from integrating large numbers of X-ray pictures taken from many different angles around the head (see Figure 1.7). MRI (magnetic resonance imaging) uses radio-frequency waves rather than X-rays and produces higher resolution images than CAT. These techniques enable neuroscientists to study the structure of the brain. PET (positron emission tomography) scans produce pictures of the brain’s activity. A radioactive form of glucose, the metabolic fuel that the brain uses, is injected into the blood, and detectors around the head measure where the glucose is being used up. In this way we can find out which parts of the brain are most active when it is carrying out a particular task.
In recent years fMRI (functional magnetic resonance imaging) has become widely accessible, and “brain scans” derived from fMRI have become one of the most important sources of data in psychology. fMRI was developed in the 1990s. It measures the energy released by hemoglobin molecules in the blood, and then works out the areas of the brain receiving the greatest amounts of blood and oxygen. It therefore tells us which parts of the brain are most active at any time. It provides much better temporal (about 1–5 seconds) and spatial (within 1 mm) resolution than PET, although the temporal resolution is still clearly inferior to EEG. fMRI is now the most widely used imaging technique in psycholinguistics, and its importance to the field has grown dramatically in the last few years.
Another recently developed tool is TMS (transcranial magnetic stimulation). TMS is in some ways the reverse of imaging: rather than observing the brain, we make part of it do something. A very powerful set of magnets is used to directly stimulate part of the cortex of a participant, and we then record what that participant does or experiences.
FIGURE 1.7 In a CAT scanner, X-rays pass through the brain in a narrow beam. X-ray detectors are arranged in
an arc and feed information to a computer that generates the scan image.
1997). Also, group studies using imaging techniques average brain images across people, when functions might be localized inconsistently in different parts of their brains (Howard, 1997). It is also easy to get carried away with focusing on where in the brain things happen, rather than on the underlying processes (see Harley, 2004a, 2004b; Loosemore & Harley, 2010).
In general, imaging techniques do not tell us in any straightforward way what high activity in different parts of the brain means in processing terms. Suppose we see during sentence processing that the parsing and semantic areas are active at the same time. This could be a result of interaction between these processes, or it could reflect the parsing of one part of the sentence and the semantic integration of earlier material. It might even reflect the participant parsing a sentence and thinking dimly about what’s for dinner that night. It might be possible to tease them apart, but we need clever experiments to do this. Imaging data now play an important role as part of the converging evidence for a particular model, or even distinguishing between competing accounts. Imaging already plays an important diagnostic role in investigating the effects of brain damage and brain disease. More optimistically, in the more distant future, imaging will play a more important role in treatment and therapy.

THEMES AND CONTROVERSIES
Ten themes recur throughout this book (see Figure 1.8). The first theme is to discover the actual processes involved in producing and understanding language. The second theme is the question of whether apparently different language processes are related to one another. For example, to what extent are the processes involved in reading also involved in speaking? The third theme is whether or not processes in language operate independently of one another, or whether they interact. This is the issue of modularity, and we look at it in more detail below. Fourth, what is innate about language? Fifth, do we need to refer to explicit rules when considering language processing? Sixth, are the processes we examine specific to language, or are they aspects of general cognitive processing sometimes recruited for language? Seventh, how sensitive are the results of our experiments to the particular techniques employed? That is, do we get different answers to the same question if we do our experiments in slightly different ways? To anticipate, the answers we get sometimes do depend on the way we get those answers, which obviously can make the interpretation of findings quite complex. One consequence is that we find that the experimental techniques themselves come under close scrutiny. In this respect, the distinction between data and theory can become very blurred. Eighth, what can be learned from looking at the language of people with damage to the parts of the brain that control language? Ninth, what difference does it make speaking a different language? We have already seen that there are many thousands of languages in the world. Many countries have more than one language, and some (e.g., Papua New Guinea) have hundreds. Some languages have hundreds of millions of speakers; some just a few hundred. There are important differences between languages that may have significant implications for the way in which speakers process language. It is sometimes easy to forget this, given the domination of English in experimental psycholinguistics. Some people speak more than one language. How they do this, how they learn the two languages, and how they translate between them are all important questions, the answers to which have wider implications for understanding cognitive processing.
Finally, we should be able to apply psycholinguistic research to everyday life and problems. Although language comes naturally to most humans most of the time, there are many occasions when it does not: for example, in learning to read, in overcoming language disabilities, in rehabilitating patients with brain damage, and in developing computer systems that can understand and produce language. Advances in the theory of any subject such as psycholinguistics should have practical applications. For example, in Chapters 6 and 7 we will examine research on visual word recognition and reading. Learning to read is a remarkably difficult task. A good theory of reading should cast light on how it should best be taught. It should indicate
FIGURE 1.8 Themes shown in the figure include: What are the processes involved in producing and understanding language? Are language processes specific to language or are they aspects of general cognitive processing? What can be learned from the language of patients with brain damage?
the best strategies that can be used to overcome difficulties in learning to read, and thereby help children who find learning to read particularly difficult. A good theory should specify the best methods of dealing with adult illiteracy. Furthermore, it should help in the rehabilitation of adults who have difficulty in reading as a consequence of brain damage, showing what remedial treatment would be most useful and which strategies would maximize any preserved reading skills.
Let us look at some of these themes in more detail.

How modular is the language system?
The concept of modularity is an important one in psycholinguistics. Most researchers agree that psychological processing can be best described in terms of a number of levels. Processing begins with an input that is acted on by one or more intervening levels of processing to produce an output. For example, when we name a word, we have to identify and process the visual form of the word, and access the sounds of the word. There is much less agreement on the way in which these levels of processing are connected to each other. For a particular process, at what stage does any kind of context have an influence? When do different types of information have their effects? For example, does the meaning of a sentence help in recognizing the sounds of a word or in making decisions about the sentence structure?
A module is a self-contained set of processes: it converts an input to an output, without any outside help for what goes on in between; we say that the processes inside a module are independent of processes outside the module. Yet another way of describing it is to say that processing is purely data-driven. Models in which processing occurs in this way are called autonomous.
The opposing view is that processing is interactive. Interaction involves the influence of one level of processing on the operation of another, but there are two intertwined notions involved. First, there is the question of overlap of processing between stages. Are the processing stages temporally discrete or do they overlap? In a discrete stage model, a level of processing can only begin its work when the previous one has finished its own work. In a cascade model, information is allowed to flow from one level to the following level before it has completed its processing (McClelland, 1979). If the stages overlap, then multiple candidates might become activated at the lower level of processing. An analogy should make this clear. Discrete models are like those water wheels made up of a series of tipping buckets; each bucket only tips up when
it is full of water. Cascading models on the other hand are like a series of waterfalls.
The second aspect of interaction is whether there is a reverse flow of information, or feedback, when information from a lower level feeds back to the prior level. For example, does knowledge about what a word might be influence the recognition of its component sounds or letters? Does the context of the sentence help to make identifying the constituent words easier? A natural waterfall is purely top-down; water doesn’t flow from the bottom back up to the top. But suppose we introduce a pump. Then we can pump water back up to earlier levels. There is scope for confusion with the terms “bottom-up” and “top-down,” as they depend on the direction of processing. So a non-interactive model of word recognition would be one that is purely bottom-up (from the perceptual representation of the word to the mental representation), but a non-interactive model of word production would be one that is purely top-down (from the mental representation to the sound of the word). “Data-driven” is a better term than “bottom-up,” but the latter is in common use. The important point is that models that permit feedback have both bottom-up and top-down information flow.
Fodor (1983) argued that many psychological processes are modular. To what extent are the processes of language self-contained, or do they interact with one another? According to many researchers, we should start with the assumption that processes are modular or non-interactive unless there is a very good reason to think otherwise. There are two main reasons for this assumption. First, modular models are generally simpler: they involve fewer processes and connections between systems. Second, it is widely believed that evolution favors a modular system. On the other hand, there is no consensus on how good a “very good reason” has to be before we dump the modularity hypothesis. It is always possible to come up with a saving or auxiliary hypothesis that can be used to modify and hence save the modularity hypothesis (Lakatos, 1970). We will observe many instances of auxiliary hypotheses introduced to save the main hypothesis that processing is modular. In theories of word recognition researchers have introduced the idea of post-access processes; in syntax and parsing they have proposed parallel processing with deferred decision making; and in word production they have proposed an editor, or emphasized the role of working memory, or claimed that some kinds of data (e.g., picture-naming times) are more fundamental than others (e.g., speech errors). Researchers can get very hot under the collar about the role of interaction. Both Fodor (1983, 1985) and Pinker (1994), who are leading exponents of the view that language is highly modular and has a significant innate basis, give a broader philosophical view: modularity is inconsistent with relativism, the idea that everything is relative to everything else and that anything goes (particularly in the social sciences). Modules provide a fixed framework in which to study the mind.
The existence of a neuropsychological dissociation between two processes is often taken as evidence of the modularity of the processes involved. When we consider the neuroscience of modularity, we can talk both about physical modularity (are psychological processes localized in one part of the brain?) and processing modularity (in principle a set of processes might be distributed across the brain yet have a modular role in the processing model). It is plausible that the two types of modularity are related, so that cognitive modules correspond to neuropsychological modules. However, Farah (1994) criticized this “locality” assumption, and argued that neuropsychological dissociations were explicable in terms of distributed, connectionist systems.
To what extent is the whole language system a big, self-contained module (or set of modules)? Is it just a special module for interfacing between social processes and cognition? Or does it provide a true window onto wider cognitive processes? On the one hand, Chomsky (1975) argued that language is a special faculty that cannot be reduced to cognitive processes. On the other, Piaget (1923) argued that language is a cognitive process just like any other, and that linguistic development depends on general cognitive development. We will return to this question in Chapter 3 when we consider the relation between language and thought. In addition to there being a separate module for language, there are some obvious candidates for subsystems being modules, such as the syntax module, the speech processing module, and the word recognition module. But even if language is a big, self-contained
module, it has to interact with the rest of the cognitive system. We talk about what we think about, our thoughts are often in verbal form (what we call inner speech), and we integrate what we hear with the rest of the information in our long-term memory. As we will see (particularly in Chapters 12 and 15), language plays a central role in our working memory, the short-term repository of information.
In each case where modularity arises as an issue, you need to examine the data, and ask whether the auxiliary hypothesis is more plausible than the non-modular alternative. You also need to think about whether data converge from experimental and imaging sources. Often, with existing data, it is impossible to decide.

Is any part of language innate?
There are broader implications of modularity, too. Generally, those researchers most committed to the claim that language processes are highly modular also argue that a significant amount of our language abilities are innate. The argument is essentially that nice, clean-cut modules must be built into the brain, or hard-wired, and therefore innately programmed, and that complex, messy systems reflect the effects of learning.
Obviously there are some prerequisites to the acquisition of language, if only a general learning ability. The question is, how much has to be innate? Are we just talking about general learning principles, or language-specific knowledge? To what extent is the innate information specifically linguistic? A related issue is the extent to which the innate components are only found in humans. We will look at these questions in more detail in Chapters 3 and 4. Connectionist modeling (discussed below) suggests ways in which general properties of the learning system can serve the role of innate, language-specific knowledge, and shows how behavior emerges from the interaction of nature and nurture at all levels (Elman et al., 1996).

Does the language system make use of rules?
To what extent does the language-processing system make use of linguistic rules? In traditional linguistics, much knowledge is encapsulated in the form of explicit rules. For example, we will see in Chapter 2 that we can describe the syntax of language in terms of rules such as “a sentence can comprise a noun phrase followed by a verb phrase.” Similarly, we can formulate a rule that the plural of a noun is formed by adding an “-s” to its end, except in a limited number of irregular forms, which we would need to store separately. Clearly then we can describe language with a system of rules, but do we actually make use of such rules when speaking and listening?
Until quite recently, the answer was thought to be “yes.” Many researchers, particularly those with a more linguistic orientation, still believe this. For many other researchers, connectionist modeling has provided an alternative view.
Connectionism has revolutionized psycholinguistics over the last 25 years. In connectionist models, processing takes place in the interaction of many simple, massively interconnected units. Connectionist models that can learn are particularly important. In these models, information is learned by repeated presentation; the connections between units change to encode regularities in the environment. The general idea underlying learning can be summarized in the aphorism, based on the work of Donald Hebb (1949), that “cells that fire together, wire together”: the simultaneous activation of cells (or units) leads to an increase in synaptic (or connection) strength.
What does the “model” part of “connectionist model” mean? A few years ago I built a model rocket. It was only a foot high, and made out of plastic, but it did take off (eventually), and went a few hundred feet in the air. It differed from a “real” rocket in many ways other than scale; the rocket propellant was very different from that used in real rockets, and many aspects of it were decorative rather than functional. It was also, needless to say, greatly simplified. Yet it did illustrate many important principles of rocket flight, and you can learn a lot about real rocketry by playing with such models. Computational models of mind are very similar. They are scaled-down models of the mind, or parts of it, made from different materials, but which illustrate important principles of how the mind works. What is more, we can learn from
them. Their behavior is not always totally predictable, in the same way as it is difficult to predict exactly how the model rocket is going to behave in different conditions on the basis of limited knowledge about its raw materials. Modeling then is a very important idea in modern psycholinguistics.
What makes connectionist models so attractive? First, unlike traditional AI, at first sight they are more neurally plausible. They are loosely based on a metaphor of the brain, which is a structure made up out of many massively interconnected neurons, each one of which is relatively simple. It is important not to get too carried away with this metaphor, but at least we have the feeling that we are starting off with the right sorts of models. Second, connectionist modelers usually try to minimize the amount of information hard-wired into the system, emphasizing looking at what emerges from the model. Third, just like traditional AI, connectionism has the virtue that writing a computer program forces you to be explicit about your assumptions.
There have been three major consequences from the success of connectionist modeling. First, it has led to a focus on the processes that take place inside the boxes of our models. In some cases (e.g., the acquisition of the past tense), this new focus has led to a detailed re-examination of the evidence motivating the models. The second consequence is that connectionism has forced us to consider in detail the representations used by the language system.
In particular, connectionist approaches can be contrasted with rule-based approaches. In connectionist models rules are not explicitly encoded, but instead emerge as a consequence of statistical generalizations in the input data. Examples of this include the grapheme–phoneme correspondence rules of the dual-route model of reading (see Chapter 7), and the acquisition of the past tense (see Chapter 4). It is important to realize that this point is controversial, and we shall see throughout the book that the role of explicit rules is still a matter of substantial debate among psycholinguists. Third, the shift of emphasis from learning rules to learning through many repeated specific instances has led to an increase in probabilistic models of language acquisition and processing (Chater & Manning, 2006). Probabilistic models have proved particularly influential in language acquisition, where children are thought to learn language by statistical or distributional analysis of what they hear rather than learning explicit rules (see Chapter 4).

Are language processes specific to language?
Does language depend on very specific processes that have evolved to do nothing else, or does it make use of more general cognitive processes? For example, when we understand sentences, do we make use of a general-purpose working memory store, or do we have dedicated stores that can store only information about language? Do children learn language using general-purpose learning rules, or do they make use of information restricted to the linguistic domain?
The ideas of innateness, modularity, rules, and language-specific processing are related. There is a divide in psycholinguistics between those who argue for innate language-specific modules that make extensive use of rules, and those who argue that much or all of language processing is the adaptation of more general cognitive processes.

Are we certain of anything in psycholinguistics?
One important point to note is that there are very few topics in psycholinguistics where we can say that we know the answer to questions with complete certainty. Time after time you will notice that even when there is consensus, or when we appear to agree on what happens, there are dissenting voices. Uncertainty is a fact of life when trying to understand the psychology of language. The discipline is still relatively young, and we have a lot to learn. It’s not like physics, which has hundreds of years of solid research to stand on. Imagine being a physicist debating experiments and models in seventeenth-century Europe. That’s a bit like where we’re at now.
So I’m sorry; as I said earlier, sometimes I’ll just have to throw my hands up and say “sorry, we don’t know,” and you’ll have to leave it at that.
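Returning briefly to the Hebbian idea from the discussion of connectionism ("cells that fire together, wire together"), a minimal sketch may help make it concrete. This is not any particular published model. The function name `hebbian_update`, the learning rate of 0.1, and the three-unit network are all assumptions made for illustration. The only principle taken from the text is that simultaneous activation of two units increases the strength of the connection between them.

```python
def hebbian_update(weights, activations, learning_rate=0.1):
    """Return new connection weights after one Hebbian learning step.

    Illustrative assumption: the change in the weight between units i and j
    is learning_rate * activation_i * activation_j, so a connection grows
    only when both units are active at the same time.
    """
    n = len(activations)
    new_weights = [row[:] for row in weights]  # copy; do not mutate the input
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-connections in this sketch
                new_weights[i][j] += learning_rate * activations[i] * activations[j]
    return new_weights


# Three units, all connections starting at zero.
weights = [[0.0] * 3 for _ in range(3)]

# Repeated presentation: units 0 and 1 are active together, unit 2 is silent.
for _ in range(5):
    weights = hebbian_update(weights, [1.0, 1.0, 0.0])

# The 0-1 connection has strengthened; connections to unit 2 have not changed.
print(weights[0][1], weights[0][2])
```

Real connectionist models typically use more elaborate learning rules, but the sketch shows how repeated co-activation alone can encode a regularity in the connection strengths.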
SUMMARY
• Language is a communication system that enables us to talk about anything, irrespective of time and space.
• Psycholinguistics arose after the Second World War as a result of interaction between the disciplines of information theory and linguistics, and as a reaction against behaviorism.
• Later experiments revealed a number of problems with a purely linguistic approach to understanding language.
• Two ideas from Chomsky’s original work that were picked up by early psycholinguists were the derivational theory of complexity and the autonomy of syntax.
• The earliest experiments supported the idea that the more transformationally complex a sentence, the longer it took to process; however, experiments using psychologically more realistic tasks failed to replicate these findings.
• Although linguistic theory influenced early accounts of parsing, linguistics and psycholinguistics soon parted ways.
• Modern psycholinguistics uses a number of approaches, including experiments, computer simulation, linguistic analysis, brain imaging, and neuropsychology.
• Early artificial intelligence (AI) approaches to language such as ELIZA and SHRDLU gave the impression of comprehending language, but had no real understanding of language and were limited to specific domains.
• Language processes can be broken down into a number of levels of processing.
• Psychologists have different views on the extent to which the mind can be divided into discrete modules.
• The use of brain imaging is becoming particularly important in the study of language.
• There is considerable debate about whether language processing is interactive or autonomous.
• An important question, particularly for the study of how we acquire language, is the extent to which language is innate.
• Whereas traditional approaches, based on linguistics, state that much of our knowledge of language is encoded in terms of explicit rules, more recent approaches based on connectionist modeling state that our knowledge arises from the statistical properties of language.
• Double dissociations are important in the neuropsychological study of language.
1. What are the methodological difficulties involved for linguists who study people’s intuitions
about language?
2. What are the advantages and disadvantages of using brain imaging to study language?
3. What are the advantages of a modular system? Are there any disadvantages that you can think of?
4. What are the disadvantages of group experiments in neuropsychology?
5. Are there any limits to what single-case studies of the effects of brain damage on language
might tell us?
6. What is the difference between neuropsychology and neuroscience?
7. How would you define language? What do you think are its most important characteristics?
8. Which do you think is going to tell us more about how humans use language: experiments or
computational modeling? Which would you prefer to do, and why?
9. What does knowing where something happens in the brain tell us about what is happening?
10. What is the difference between linguistics and psycholinguistics, and does the distinction matter?
FURTHER READING
There are many textbooks that offer an introduction to cognitive psychology. Any introductory text
on psychology will provide you with rich material. If you want more detail, try Anderson (2010),
Eysenck and Keane (2010), or Quinlan and Dyson (2008).
For a summary of the early history of psycholinguistics, see Fodor, Bever, and Garrett (1974),
and of linguistics, Lyons (1977a). If you wish to find out more about linguistics, you might try Fromkin,
Rodman, and Hyams (2011). Crystal (2010) is a complete reference work on language. Clark’s
(1996) book is about language as communication. For an amusing read on the history of English,
and much more besides, see Bryson (1990).
Thagard (2005) provides a general survey of cognitive science. There are many introductory
textbooks on traditional AI, including Negnevitsky (2004). Introductions to connectionism include
Bechtel and Abrahamsen (2001) and Ellis and Humphreys (1999)—the latter emphasizes the impact
of connectionism on cognitive psychology.
Kolb and Whishaw (2009) describe traditional neuropsychology and the Wernicke–Geschwind
model in detail; see also Andrewes (2001), Banich (2004), or Stirling (2002) for recent introductions
to neuropsychology. For a more advanced source on neuropsychology and language, try Hillis
(2002). Notice that these references are now getting rather dated; that’s because the emphasis has
switched from pure neuropsychology to neuroscience. Gazzaniga, Ivry, and Mangun (2008) and
Ward (2010) are good general introductions to imaging and cognitive neuroscience.
Chalmers (1999) is a good introduction to the methods and philosophy of science.
Altmann (1997) and Pinker (1994) are introductions to the psychology of language that take the
same general approach as this book. There are some recent handbooks and encyclopedias of psycholinguistics
that will provide you with more detailed coverage of the topics in this book, including
Gaskell (2007), Spivey, McRae, and Joanisse (2012), and Traxler and Gernsbacher’s (2006) second
edition of the Handbook of Psycholinguistics. As already mentioned, Crystal (2010) is a very good
reference for linguistics.
A number of journals cover the field of psycholinguistics. Many relevant experimental articles
can be found in journals such as the Journal of Experimental Psychology (particularly the sections
entitled General; Learning, Memory, and Cognition; and, for lower level processes such as speech
perception and aspects of visual word recognition, Human Perception and Performance), the Quarterly
Journal of Experimental Psychology, Cognition, Cognitive Psychology, Cognitive Science, and
Memory and Cognition. Three journals with a particularly strong language bias are the Journal
of Memory and Language (formerly called the Journal of Verbal Learning and Verbal Behavior),
Language and Cognitive Processes, and the Journal of Psycholinguistic Research. Theoretical and
review papers can often be found in Psychological Review, Psychological Bulletin, and Behavioral
and Brain Sciences. The latter includes critical commentaries on the target article, plus a reply to
those commentaries, which can be most revealing. Articles on connectionist and AI approaches to
language are often found in Cognitive Science again, and sometimes in Artificial Intelligence. Many
relevant neuroscience papers can be found in Brain and Language, Cognitive Neuropsychology,
the Journal of Cognitive Neuroscience, Neurocase, and sometimes in journals such as Brain and
Cortex. Papers with a biological or connectionist angle on language can sometimes also be found
in the Journal of Cognitive Neuroscience. Journals rich in good papers on language acquisition are
the Journal of Experimental Child Psychology, Journal of Child Language, and First Language; see
also Child Development.
As we will see, designing psycholinguistics experiments can be a tricky business. It is vital to
control for a number of variables that affect language processing (see Chapter 6 for more detail). For
example, more familiar words are recognized more quickly than less familiar ones. We therefore need
easy access to measures of variables such as familiarity. There are a number of databases that provide
this information, including the Oxford Psycholinguistic Database (Quinlan, 1992) and the Nijmegen
CELEX lexical database for several languages on CD-ROM (Baayen, Piepenbrock, & Gulikers, 1995).
There is a website for this book. It contains links to other pages, details of important recent
work, and a means of contacting me electronically. The URL is http://www.psypress.com/cw/harley.
CHAPTER 2
DESCRIBING LANGUAGE
There are also many specific differences between British and American pronunciations; for
example, American English tends to drop the initial /h/ in “herbs.” (There are also different
words for the same thing, of course, such as “sidewalk” for “pavement,” and “trash” for
“rubbish.”) Different systems of pronunciations within a language are known as dialects.
Dialects mostly differ in their vowel sounds. One advantage of the IPA is that it is possible
to represent these different ways of pronouncing the same thing.
Received Pronunciation (RP) has long been perceived as the most prestigious spoken form of the
English language. RP belies the origins of its speaker, and is sometimes referred to as the
“Queen’s English,” as it is spoken by the monarch.
We produce speech by moving parts of the vocal tract, including the lips, teeth, tongue,
mouth, and voice box or larynx (see Figure 2.2). The basic source of sounds is the larynx,
which modifies the flow of air from the lungs and produces a range of higher frequencies
called harmonics. Different sounds are then made by changing the shape of the vocal tract.
There are two different major types of sounds. Vowels (such as a, e, i, o, and u) are made by
modifying the shape of the vocal tract, which remains more or less open while the sound is
being produced. The position of the tongue modifies the range of harmonics produced by the
larynx. Consonants (such as p, b, t, d, k, g) are made by closing or restricting some part of
the vocal tract at the beginning or end of a vowel. Most consonants cannot be produced without
some sort of vowel. This description suggests that one way to examine the relation between
[Figure 2.2, diagram of the vocal tract. Labels: hard palate, velum (soft palate), uvula,
tongue, lips, vocal cords, teeth, epiglottis, larynx, esophagus, glottis, trachea.]
CONSONANTS
Consonants are made by closing or restricting some part of the vocal tract as air flows
through it. We classify consonants according to their place of articulation, whether or not
they are voiced, and their manner of articulation (see Table 2.1).
The place of articulation is the part of the vocal tract that is closed or constricted during
articulation. For example, /p/ and /b/ are called bilabial sounds and are made by closing the
mouth at the lips, whereas /t/ and /d/ are made by putting the tongue to the back of the
teeth. To understand the difference between /b/ and /p/, we need to introduce a concept called
voicing. In one case (/b/), the vocal cords are closed and vibrating from the moment the lips
are released; the consonants are said to be pronounced with voice, or just voiced. In the
other case (/p/), there is a short delay, as the vocal cords are spread apart as air is first
passed between them; hence they take some time to start vibrating. These
consonants are said to be voiceless (also produced without voice or unvoiced). The time
between the release of the constriction of the airstream when we produce a consonant, and
when the vocal cords start to vibrate, is called the voice onset time (VOT). Voicing also
distinguishes between the consonants /d/ (voiced) and /t/ (voiceless). The sounds /d/ and /t/
are made by putting the front of the tongue on the alveolar ridge (the bony ridge behind the
upper teeth). Hence these are called alveolars. Dentals such as /θ/ and /ð/ are formed by
putting the tongue tip behind the upper front teeth. Labiodentals such as /f/ and /v/ are
formed by putting the lower lip to the upper teeth. Postalveolar sounds (e.g., /ʃ/, /ʒ/,
formerly called alveopalatals) are made by putting the tongue towards the front of the hard
part of the roof of the mouth, the palate, near the alveolar ridge. Palatal sounds (e.g., /j/,
/y/) are made by putting the tongue to the middle of the palate. Further back in the mouth is
a soft area called the soft palate or velum, and velars (e.g., /k/, /g/) are produced by
putting the tongue to the velum. Finally, some sounds are produced without the involvement of
the tongue. The glottis is the name of the space between the vocal cords in the larynx.
Constriction of the larynx at the glottis produces a voiceless glottal fricative (/h/). When
the glottis is completely closed and then released, a glottal stop (/ʔ/) is made. Glottal
stops do not occur in the Received Pronunciation of English, but are found in some dialects
and in other languages. (The glottal stop can be heard, for example, in some dialects of the
south-east of England in the middle of words like “bottle,” replacing the /t/ sound.)
The other important dimension used to describe consonants is the manner of articulation.
Stops are formed when the airflow is completely interrupted for a short time (e.g., /p/, /b/,
/t/, /d/). Not all consonants are made by completely closing the vocal tract at some point;
in some it is merely constricted. Fricatives are formed by constricting the airstream so that
air rushes through with a hissing sound (e.g., /f/, /v/, /s/). Affricatives are a combination
of a brief stopping of the airstream followed by a constriction (e.g., /dʒ/, /tʃ/). Liquids
are produced by allowing air to flow around the tongue as it touches the alveolar ridge (e.g.,
/l/, /r/). Most sounds are produced orally, with the velum raised to prevent airflow from
entering the nasal cavity. If it does and air is allowed to flow out through the nose we get
nasal sounds (e.g., /m/, /n/). Glides or semi-vowels are transition sounds produced as the
tongue moves from one vowel position to another (e.g., /w/, /y/).
                                    MANNER OF ARTICULATION
PLACE OF                                                       lateral
ARTICULATION   stop      fricative  affricative  nasal         approximant  approximant
               +V   –V   +V   –V    +V   –V      +V   –V       +V   –V      +V   –V
bilabial       b    p                            m                          w
labiodental              v    f
dental                   ð    θ
alveolar       d    t    z    s                  n             l
postalveolar             ʒ    ʃ     dʒ   tʃ                                 r
velar          g    k                            ŋ
glottal             ʔ         h
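The three-way classification in Table 2.1 (place, manner, voicing) can be pictured as a small lookup structure. The following is a minimal sketch, not part of the original text: the names `Consonant`, `CONSONANTS`, `describe`, and `voicing_pairs` are hypothetical, and only a subset of the table's entries is included.

```python
from typing import NamedTuple


class Consonant(NamedTuple):
    place: str    # place of articulation
    manner: str   # manner of articulation
    voiced: bool  # True for +V, False for -V


# A subset of Table 2.1, keyed by phoneme symbol (illustrative, not exhaustive).
CONSONANTS = {
    "b": Consonant("bilabial", "stop", True),
    "p": Consonant("bilabial", "stop", False),
    "m": Consonant("bilabial", "nasal", True),
    "v": Consonant("labiodental", "fricative", True),
    "f": Consonant("labiodental", "fricative", False),
    "d": Consonant("alveolar", "stop", True),
    "t": Consonant("alveolar", "stop", False),
    "z": Consonant("alveolar", "fricative", True),
    "s": Consonant("alveolar", "fricative", False),
    "n": Consonant("alveolar", "nasal", True),
    "g": Consonant("velar", "stop", True),
    "k": Consonant("velar", "stop", False),
    "h": Consonant("glottal", "fricative", False),
}


def describe(symbol: str) -> str:
    """Return a description such as 'voiceless bilabial stop'."""
    c = CONSONANTS[symbol]
    voicing = "voiced" if c.voiced else "voiceless"
    return f"{voicing} {c.place} {c.manner}"


def voicing_pairs(inventory: dict) -> list:
    """Return (voiced, voiceless) pairs that share both place and manner."""
    pairs = []
    for sym_a, a in inventory.items():
        for sym_b, b in inventory.items():
            same_slot = (a.place, a.manner) == (b.place, b.manner)
            if a.voiced and not b.voiced and same_slot:
                pairs.append((sym_a, sym_b))
    return sorted(pairs)


if __name__ == "__main__":
    print(describe("p"))
    print(voicing_pairs(CONSONANTS))
```

Pairs such as /b/ and /p/ come out as differing only in voicing, which is the contrast the text uses to introduce voice onset time.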
FIGURE 2.3 Hierarchical structure of syllables.
So we can describe consonants in terms of the articulatory distinctive features, place of
articulation, manner of articulation, and voicing. It should be noted that some languages
produce consonants (such as clicks) that are not found in European languages.
VOWELS
Vowels are made with a relatively free flow of air. The nature of the vowel is determined by
the way in which the shape of the tongue modifies the airflow. Table 2.2 shows how vowels can
be classified depending on the position (which can be raised, medium, or lower) of the front,
central, or rear portions of the tongue. For example, the /i/ sound in “meat” is an example of
a high front vowel because the air flows through the mouth with the front part of the tongue
in a raised (high) position.
Two vowel sounds can be combined to form a diphthong. Examples are the sounds in “my,” “cow,”
“go,” and “boy.”
Whereas the pronunciation of consonants is relatively constant across dialects, that of vowels
can differ greatly.
SYLLABLES
Words are divided into rhythmic units called syllables. One way of determining the number of
syllables in a word is to try singing it: each syllable will need a different note (Radford,
Atkinson, Britain, Clahsen, & Spencer, 1999). For example, the word syl–la–ble has three
syllables. Many words are monosyllabic: they have only one syllable. Syllables can be analyzed
in terms of a hierarchical structure (see Figure 2.3). The syllable onset is an initial
consonant or cluster (e.g., /kl/); the rime consists of a nucleus, which is the central vowel,
and a coda, which comprises the final consonants. Hence in the word “clumps,” “cl-” is the
onset and “-umps” the rime, which in turn can be analyzed into a nucleus, which is the central
vowel (“u”), and coda (“mps”). In English, all of these components are optional, apart from
the nucleus (all words have to have at least a central vowel). The rules that describe how
component syllables combine with each other differ across languages; for example, Japanese
words do not have codas, and in Cantonese only nasal sounds and glottal stops are possible
codas.
Features of words and syllables that may span more than one phoneme, such as pitch, stress,
and the rate of speech, are called suprasegmental features. For example, a falling pitch
pattern indicates a statement, whereas a rising pitch pattern indicates that the speaker is
asking a question. Try saying “it’s raining” as a statement, “it’s raining?” as a question,
and “it’s raining!” as a statement of surprise. Stress varies within a word, as some syllables
receive more stress than others, and within a sentence, as some words are emphasized more than
others. Taken together, pitch and stress determine the rhythm of the language. Languages
differ in their use of rhythm. In English, stressed syllables are produced at approximately
equal periods of time; English is said to be a stress-timed language. In French, syllables
are produced in a steady flow; it is said to be a syllable-timed language.
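The onset and rime analysis of “clumps” described in the section on syllables can be sketched as a small function. This is only an illustration under stated assumptions: it works on spelling rather than on phonemes, it treats the letters a, e, i, o, u as the only possible nucleus, it handles a single syllable at a time, and the name `split_syllable` is hypothetical.

```python
VOWEL_LETTERS = set("aeiou")  # assumption: spelling stands in for phonemes


def split_syllable(syllable: str) -> dict:
    """Split one written syllable into onset, nucleus, coda, and rime."""
    s = syllable.lower()
    vowel_positions = [i for i, ch in enumerate(s) if ch in VOWEL_LETTERS]
    if not vowel_positions:
        # The nucleus is the one obligatory component.
        raise ValueError("a syllable needs a nucleus (a central vowel)")
    start, end = vowel_positions[0], vowel_positions[-1] + 1
    if any(ch not in VOWEL_LETTERS for ch in s[start:end]):
        raise ValueError("expected a single syllable with one vowel group")
    onset, nucleus, coda = s[:start], s[start:end], s[end:]
    return {
        "onset": onset,          # initial consonant or cluster (may be empty)
        "rime": nucleus + coda,  # nucleus plus coda
        "nucleus": nucleus,      # central vowel
        "coda": coda,            # final consonants (may be empty)
    }


if __name__ == "__main__":
    print(split_syllable("clumps"))
```

For “clumps” this gives the onset “cl” and the rime “umps”, with the rime further divided into the nucleus “u” and the coda “mps”. A word such as “a” has an empty onset and coda, reflecting the point that only the nucleus is obligatory.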
In English, although we can use pitch to draw attention to a particular word, or convey
additional information about it, different pitches do not change the meaning of the word
(“mouse” spoken with a high or low pitch still means mouse). In some languages pitch is more
important. In the Nigerian language Nupe, [ba] spoken with a high pitch means “to be sour,”
but [ba] spoken with a low pitch means “to count.” Languages that use pitch to contrast
meanings are called tone languages.
LINGUISTIC APPROACHES
TO SYNTAX
Linguistics provides us with a language for describing syntax. In particular, the work of the
American linguist Noam Chomsky (b. 1928) has been influential in indicating constraints on how
powerful human language must be, and how it should best be described. We looked at his
influence on the development of psycholinguistics in Chapter 1.
The linguistic theory of Chomsky
Chomsky’s work is based on two related ideas: first, the relations between language and the
brain, and how children acquire language, and second, a technical description of the structure
of language. I examine his views on the relation between language and thought and on language
acquisition in Chapters 3 and 4. Chomsky argued that language is a special feature that is
innate, species-specific, and biologically pre-programmed, and that is a faculty independent
of other cognitive structures. Here we are primarily concerned with the more technical aspect
of his theory.
The American linguist Noam Chomsky argued that language is innate, species-specific, and
biologically pre-programmed.
For Chomsky, the goal of the study of syntax is to describe the set of rules, or grammar, that
enables us to produce and understand language. Chomsky (1968) argued that it is important to
distinguish between our idealized linguistic competence, and our actual linguistic
performance. Our linguistic competence is what is tapped by our intuitions about which are
acceptable sentences of our language, and which are ungrammatical strings of words. We know
that the sentence “The vampire the ghost loved ran away” is grammatical, even if we have never
heard it before, while we also know that the string of words “The vampire sleep the ghost ran
away” is ungrammatical. Competence concerns our abstract knowledge of our language. It is
about the judgments we would make about language if we had sufficient time and memory
capacity. In practice, of course, our actual linguistic performance (the sentences that we
actually produce) is greatly limited by these factors. Furthermore, the sentences we actually
produce often use the more simple grammatical constructions. Our speech is full of false
starts, hesitations, speech errors, and corrections. The actual ways in which we produce and
understand sentences are also in the domain of performance.
In his more recent work, Chomsky (1986) distinguished between externalized language
(E-language) and internalized language (I-language). For Chomsky, E-language linguistics is
about collecting samples of language and understanding their properties; in particular it is
about describing the regularities of a language in the form of a grammar. I-language
linguistics is about what speakers know about their language. For Chomsky, the primary aim of
modern linguistics should be to specify I-language: it is to produce a grammar that describes
our knowledge of the language, not the sentences we actually produce.
Another way of putting this is that I-language is
about mental phenomena, whereas E-language is about social phenomena (Cook & Newson, 2007).
Competence is an aspect of I-language.
As a crude generalization, we can say that psycholinguists are more interested in our
linguistic performance, and linguists in our competence. Nevertheless, many of the issues of
competence are relevant to psychologists. In particular, linguistics provides a framework for
describing and thinking about syntax, and its theories place possible constraints on language
acquisition.
Let us look at the notion of a grammar in more detail. A grammar uses a finite number of rules
that in combination can generate all the sentences of a language; hence we talk of generative
grammar. Obviously we could produce a device that could emit words randomly, and although this
might, like monkeys typing away with infinite time to spare, produce the occasional sentence,
it will mainly produce garbage. For example, “dog vampire cat chase” is a non-sentence in
English. It is an important constraint that although our grammar must be capable of generating
all the sentences of a language, it should also never generate non-sentences. (Of course, from
time to time we erroneously produce non-sentences, but this is an aspect of performance;
remember we are concerned only with linguistic competence here.) Chomsky further argued that a
grammar must give an account of the underlying syntactic structure of sentences. The sentence
structures that the grammar creates should capture our intuitions about how sentences and
fragments of sentences are related. We know that “the vampire kissed the ghost” and “the ghost
was kissed by the vampire” are related in some way. Finally, linguistic theory should also
explain how children acquire these rules.
Chomsky’s linguistic theory has evolved greatly over the years. The first version was
described in a book called Syntactic Structures (1957). The 1965 version became known as the
“standard theory”; this was followed in turn by the “extended standard theory,” “revised
extended standard theory,” and then “government and binding (or GB) theory” (Chomsky, 1981).
The latest version is called minimalism (Chomsky, 1995). Nevertheless, the central theme is
that language is rule-based, and that our knowledge of syntax can be captured in a finite
number of syntactic rules. A moment’s reflection should show that language involves rules,
even if we are not always aware of them. How else would we know that “Vlad bought himself a
new toothbrush” is acceptable English but “Vlad bought himself toothbrush new a” is not?
Describing syntax and phrase-structure grammar
How should we describe the rules of grammar? Chomsky proposed that phrase-structure rules are
an essential component of our grammar, although he went on to argue that they are not the only
component. An important aspect of language is that we can construct sentences by combining
words according to rules. Phrase-structure rules describe how words can be combined, and
provide a method of describing the structure of a sentence. The central idea is that sentences
are built up hierarchically from smaller units using rewrite rules. The set of rewrite rules
constitutes a phrase-structure grammar. Rewrite rules are simply rules that translate a symbol
on the left-hand side of the rule into those on the right-hand side. For example, (1) is a
rewrite rule that says “a sentence (S) can be rewritten as a noun phrase (NP) followed by a
verb phrase (VP)”:
(1) S → NP + VP
In a phrase-structure grammar, there are two main types of symbol: terminal elements
(consisting of vocabulary items or words) and non-terminal elements (everything else). It is
important to realize that the rules of grammar do not deal with particular words, but with
categories of words that share grammatical properties. Words fall into classes such as nouns
(words used to name objects and ideas, both concrete and abstract, such as “pig,” or “truth”),
adjectives (words used to describe, such as “pink,” or “lovely”), verbs (words describing
actions or states, or an assertion, such as “kiss,” or “modify”), adverbs (words qualifying
verbs, such as “quickly”), determiners (words determining the number of nouns they modify,
such as “the,” “a,” and “some”), prepositions (words such as “in,” “to,” and “at”),
conjunctions (words such as “and,” “because,” and “so”), pronouns (“he,”
(7) The vampire is being kicked by the witch.
(8) S → The vampire + verb phrase + prepositional phrase.
Now which is the grammatical subject of this sentence and which is the grammatical object? If
we apply the yes–no question test, we form “Is the vampire being kicked by the witch?,” with
“the vampire” moving position. “The witch” stays where it is. In addition, the structure of
(7) is outlined in (8). Clearly “the vampire” is immediately dominated by the sentence node.
Hence “the vampire” is the subject of this sentence, even though “the witch” is doing the
action and “the vampire” is having the action done to him. This type of sentence structure is
called a passive. The object in the active form of the sentence has become the grammatical
subject of the passive form. We will examine passives in more detail later.
The simple grammar in Box 2.2 can be used to generate a number of simple sentences. Let us
start by applying some of these rewrite rules to show how we can generate a sentence (9). The
goal is to show how a sentence can be made up from terminal elements:
(9) Starting with S, rule (A) from Box 2.2 gives us NP + VP.
    Rule (B) gives us DET + N + VP.
    Rule (D) gives us DET + N + V + NP.
    Rule (C) gives us DET + N + V + N.
Then the substitution of words gives us, for example, the following sentence: “The vampire
loves Boris.”
We desire more of a grammar than that it should merely be able to generate sentences: we need
a way to describe the underlying syntactic structure of sentences. This is particularly useful
for syntactically ambiguous sentences. These are sentences that have more than one
interpretation, such as the sentence “I saw the witches flying to America.” This could be
paraphrased as either “When I was flying to America, I saw the witches,” or “There I was
standing on the ground when I looked up and there were the witches flying off to America.” A
phrase-structure grammar also enables us to describe the syntactic structure of a sentence by
means of a tree diagram, as shown for the sentence “The vampire loves Boris” in Figure 2.4.
The points on the tree corresponding to constituents are called nodes. The node at the top of
the tree is the sentence or S node; at the bottom are terminal nodes corresponding to words;
in between are non-terminal nodes corresponding to constituents such as NP and VP.
Tree diagrams are very important in the analysis of syntax, and it is important to be clear
about what they mean. The underlying structure of a sentence or a phrase is sometimes called
its phrase structure or phrase marker. It should be reiterated that the important idea is
capturing the underlying syntactic structure of sentences; it is not our goal here to explain
how we actually produce or understand them. Furthermore, at this stage directionality is not
important; the directions of the arrows in Box 2.2 do not mean that we are limited to talking
about sentence production. Our discussion at present applies equally to production and
comprehension.
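The derivation in (9) and the tree for “The vampire loves Boris” can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the four rules labeled (A) to (D) are inferred from the steps of (9), since Box 2.2 is not reproduced here, and the names `RULES`, `apply_rule`, `derive`, `TREE`, `leaves`, and `preterminals` are hypothetical.

```python
# Rewrite rules inferred from derivation (9); treat this rule set as an assumption.
RULES = {
    "A": ("S", ["NP", "VP"]),
    "B": ("NP", ["DET", "N"]),
    "C": ("NP", ["N"]),
    "D": ("VP", ["V", "NP"]),
}


def apply_rule(symbols, label):
    """Rewrite the leftmost occurrence of the rule's left-hand symbol."""
    lhs, rhs = RULES[label]
    i = symbols.index(lhs)  # raises ValueError if the rule cannot apply
    return symbols[:i] + rhs + symbols[i + 1:]


def derive(labels, start="S"):
    """Return every intermediate string of symbols in the derivation."""
    steps = [[start]]
    for label in labels:
        steps.append(apply_rule(steps[-1], label))
    return steps


# The tree as nested tuples: (node label, child, child, ...).
# A node whose only child is a string is a word-level (terminal) node.
TREE = (
    "S",
    ("NP", ("DET", "The"), ("N", "vampire")),
    ("VP", ("V", "loves"), ("NP", ("N", "Boris"))),
)


def leaves(node):
    """Collect the words at the bottom of the tree, left to right."""
    _, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    return [word for child in children for word in leaves(child)]


def preterminals(node):
    """Collect the word categories directly above the words."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [label]
    return [cat for child in children for cat in preterminals(child)]


if __name__ == "__main__":
    for step in derive(["A", "B", "D", "C"]):
        print(" + ".join(step))
    print(" ".join(leaves(TREE)))
```

The last line of the derivation and the categories read off the tree are the same sequence, which reflects the point that the rules carry no commitment to production or comprehension: the same structure can be reached by rewriting downward from S or by reading upward from the words.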
Phrase-structure rules provide us with the underlying syntactic structure of sentences we both
produce and comprehend.
Clearly, this is an extremely limited grammar. One obvious omission is that we cannot
construct more complex sentences with more than one clause in them. However, we could do this
by introducing conjunctions. A slightly more complex example would be using a relative clause
with a relative pronoun (such as “which,” “who,” or “that”) to produce sentences such as (10):
(10) The vampire who loves Boris is laughing.
Natural language could only be described by a much more complex phrase-structure grammar that
contained many more rules. We would also need to specify detailed restrictions on when
particular rules could and could not be applied. We would then have a description of a grammar
that could generate all of the sentences of a language and none of the non-sentences.
Obviously another language, such as French or German, would have a different set of
phrase-structure rules.
Although these grammars might be very large, they will still contain a finite number of rules.
In real languages there are potentially an infinite number of sentences. How can we get an
infinite number of sentences from a finite number of rules and words? We can do this because
of special rules based on what are known as recursion and iteration. Recursion occurs when a
rule uses a version of itself in its definition. Recursive rules enable phrases to contain
examples of the same sort of phrase, such as in the old song “Little does she know that I know
that she knows that I know …” (Kursaal Flyers, 1976). One of the most important uses of
recursion is to embed a sentence within another sentence, producing center-embedded sentences.
Examples (12) and (13) are based on (11):
(11) The vampire loved the ghoul.
(12) The vampire the werewolf hated loved the ghoul.
(13) The vampire the werewolf the ghost scared hated loved the ghoul.
(14) *The vampire who the werewolf who the ghost had scared loved the ghoul.
This process of center-embedding could potentially continue forever, and most linguists would
argue that the sentence would still be perfectly well-formed; that is, it would still be
grammatical. Of course, we would soon have difficulty in understanding such sentences, for we
would lose track of who scared whom and who loved what. Many people have difficulty with
sentence (13), and many people find constructions such as (14) grammatically acceptable,
although it is missing a verb (Gibson & Thomas, 1999). Although we might rarely or never
produce center-embedded sentences, our grammar must be capable of producing them, or at least
of deciding that they are grammatical. Given a piece of paper and sufficient time, you could
still understand sentences of this type. This observation reflects the distinction between
competence and performance mentioned earlier: We have the competence to understand these
sentences, even if we never produce them in actual performance. (Remember that judgments of
grammatical acceptability are based on intuitions, and these might vary. Not everyone would
agree that sentences with a large number of center-embeddings are grammatical. Indeed, there
is some controversy in linguistics about their status; see Hawkins, 1990.) Nevertheless, most
people think that recursion is a central property of language and perhaps human thought
(Fitch, Hauser, & Chomsky, 2005).
Iteration enables us to carry on repeating the same rule, potentially for ever. For example,
we can use iteration to produce sentences such as (15).
(15) The nice vampire loves the ghost and the ghost loves the vampire and the friendly ghost
     loves the vampire and …
There are different types of phrase-structure grammar. Context-free grammars contain only
rules that are not specified for particular contexts, whereas context-sensitive grammars can
have rules that can only be applied in certain circumstances. In a context-free rule, the
left-hand symbol can always be rewritten by the right-hand one regardless of the context in
which it occurs. For example, the writing of a verb in its singular or plural form depends on
the context of the preceding noun phrase.
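The difference between recursion and iteration discussed above can be made concrete with a short sketch that rebuilds sentences (11) to (13) and a finite stretch of (15). It is an illustration only: the function names `noun_phrase`, `center_embedded`, and `conjoin` are hypothetical, and the fixed main clause “loved the ghoul” is simply taken from the examples.

```python
def noun_phrase(nouns, verbs):
    """Recursive rule: a noun phrase may contain another noun phrase plus a verb.

    nouns[0] is the head noun; each later noun is the subject of an embedded
    clause, and verbs[i] is the verb of the clause whose subject is nouns[i + 1].
    """
    if len(nouns) == 1:
        return f"the {nouns[0]}"
    # The function calls itself: this is the recursive step.
    return f"the {nouns[0]} {noun_phrase(nouns[1:], verbs[1:])} {verbs[0]}"


def center_embedded(nouns, verbs, main_verb="loved", obj="the ghoul"):
    """Build a sentence whose subject contains len(verbs) embedded clauses."""
    if len(verbs) != len(nouns) - 1:
        raise ValueError("need exactly one verb per embedded clause")
    sentence = f"{noun_phrase(nouns, verbs)} {main_verb} {obj}."
    return sentence[0].upper() + sentence[1:]


def conjoin(clauses):
    """Iteration: the same joining step is simply repeated for each clause."""
    return " and ".join(clauses)


if __name__ == "__main__":
    print(center_embedded(["vampire"], []))
    print(center_embedded(["vampire", "werewolf"], ["hated"]))
    print(center_embedded(["vampire", "werewolf", "ghost"], ["hated", "scared"]))
    print(conjoin(["the nice vampire loves the ghost",
                   "the ghost loves the vampire"]))
```

Nothing in `noun_phrase` limits the depth of embedding, which mirrors the point about competence: the rule can keep applying to its own output even when the result becomes very hard to understand in practice.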
longer a distinction between optional and obligatory transformations. In a sense all
transformations became obligatory, in that markers for them are represented in the deep
structure.
In the standard theory, the syntactic component generated a deep structure and a surface
structure for every sentence. The deep structure was the output of the base rules and the
input to the semantic component; the surface structure was the output of the transformational
rules and the input to the phonological rules. Describing sentences in terms of their deep
structure has two main advantages. First, some surface structures are ambiguous in that they
have two different deep structures. Second, what is the subject and what is the object of the
sentence is often unclear in the surface structure. Sentence (22) is ambiguous in its surface
structure. However, there is no ambiguity in the corresponding deep structures, which can be
paraphrased as (23) and (24):
(22) The hunting of the vampires was terrible.
(23) The way in which the vampires hunted was terrible.
(24) It was terrible that the vampires were hunted.
Sentences (25) and (26) have the same surface structure, yet completely different deep
structures:
(25) Vlad is easy to please.
(26) Vlad is eager to please.
In (25), Vlad is the deep structure object of please; in (26), Vlad is the deep structure
subject of please. This difference can be made apparent in that we can build a deep structure
corresponding to (27) of the form of (25), but cannot do so for (26), as (28) is clearly
ungrammatical. (The ungrammaticality is conventionally indicated by an asterisk.)
(27) It is easy to please Vlad.
(28) *It is eager to please Vlad.
Principles and parameters theory, and minimalism
As Chomsky’s theory continued to develop, many of the features of the grammars changed,
although the basic goals of linguistics remained the same. The new “standard version of the
theory” was originally known as Government and Binding (GB) theory (Chomsky, 1981), but the
term principles and parameters theory is now more widely used. This name emphasizes the
central idea that there are principles that are common to all languages and parameters that
vary from language to language (see Chapter 4).
There have been a number of important changes in the more recent versions of the theory.
First, with time, the number of transformations steadily dwindled. Second, related to this,
the importance of deep structure has also dwindled (Chomsky, 1991). Third, when constituents
are moved from one place to another, they are hypothesized as leaving a trace in their
original position. (This has nothing to do with the TRACE model of spoken word recognition
that will be described in Chapter 9.) Fourth, special emphasis is given to the most important
word in each phrase. For example, in the noun phrase “the vampire with the garlic,” the most
important noun is clearly “vampire,” not “garlic.” (This should be made clear by the
observation that the whole noun phrase is about the vampire, not about the garlic.) The noun
“vampire” is said to be the head of the noun phrase.
Fifth, the revised theory permits units intermediate in size between nouns and noun phrases,
and verbs and verb phrases. The rules are phrased in terms of what is called X̄ (pronounced
“X-bar”) syntax (Jackendoff, 1977; Kornai & Pullum, 1990). The intermediate units are called
N̄ (pronounced noun-bar) and V̄ (verb-bar), and are made up of the head of a phrase plus any
essential arguments or role players. Consider the phrase “the king of Transylvania with a
lisp.” Hence “king” is an N and the head of the phrase; “the king of Transylvania” an N̄
(because Transylvania is the argument of “king,” the place that the king is king of); and
“the king of Transylvania with a lisp” an NP. This approach distinguishes between essential
arguments (such as “of Transylvania”) and optional adjuncts or modifiers (such as “with a
lisp”). The same type of argument applies to verbs, which also have obligatory arguments
(even if they are not always stated) and optional modifiers. The advantage of this
description is that it captures new generalizations, such as if a noun
phrase contains both argument and adjunct, the argument must always be closer to the head than
the adjunct: “The king with a lisp of Transylvania” is distinctly odd. It is an important task
of linguistics to capture and explain such generalizations. This method of description also
enables the specification of a very general rule such as (29):
(29) X̄ → X, ZP*
That is, any phrase (X-bar) contains a head with any number of modifiers (ZP*). Such an
abstract rule is an elegant blueprint for the treatment of both noun phrases and verb phrases,
and captures the underlying similarity between them.
English is a head-first language. Japanese, on the other hand, is a head-last language.
Nevertheless, both languages distinguish between heads and modifiers; this is an example of a
very general rule that Chomsky argues must be innate. This general rule is an example of a
parameter. The setting of the parameter that specifies head-first or head-last is acquired
through exposure to a particular language (Pinker, 1994). I examine parameters and their role
in language acquisition in Chapter 4.
In the most recent reworking of his ideas, the minimalist program aims to simplify the grammar
as much as possible (Chomsky, 1995). The Principle of Economy requires that all linguistic
representations and processes should be as economical as possible; the theoretical and
descriptive apparatus necessary to describe language should be minimized (Radford, 1997). The
less complex a grammar, the easier it should be to learn. Although this principle sounds
simple, its implications for the detailed form of the theory are vast. In minimalism, the role
of abstract, general grammatical rules is virtually abolished. Instead, the lexicon
incorporates many aspects of the grammar. For example, information about how transitive verbs
take on syntactic roles is stored with the verbs in the lexicon, rather than stored as an
abstract grammatical rule. Instead of phrase-structure rules, categories are merged to form
larger categories. The lexical representations of words specify grammatical features that
control the merging of categories. These ideas are echoed by modern accounts of parsing.
Chomsky is the most influential figure in the history of linguistics, with his central idea
being that the goal of linguistics is to specify the rules of a grammar that captures our
linguistic competence. Later I look at the implications of this idea for psycholinguistics.
Optimality Theory and Cognitive Linguistics
Although Chomsky’s earlier work had great influence on the psycholinguistics of the time, this
influence has waned. Minimalism, although important for linguists, has had no impact on
psycholinguistics. Many of the key ideas of modern psycholinguistics are reflected in other
branches of linguistics, particularly Optimality Theory (McCarthy, 2001). Optimality Theory
has been applied to phonology, morphology, semantics, and syntax; its main idea is that the
surface form of an expression results from the resolution of conflicts between underlying
representations. It shares much with connectionist approaches to language. As we shall see in
Chapter 10, one important approach to understanding sentences is that of constraint
satisfaction; we try to satisfy as many constraints as possible, and make sure that we satisfy
all the important ones. We choose the best interpretation available in the context on the
basis of all data.
Cognitive Linguistics is the name given to the general approach that emphasizes language as
one aspect of general cognition. In contrast with Chomsky’s generative grammar approach,
cognitive linguists do not believe there is a separate faculty of language, and argue that we
process language using the same sorts of cognitive process as we use in every other aspect of
cognition. We learn language using general cognitive processes, rather than language-specific
ones. These ideas are reflected in psycholinguistic approaches to language acquisition that
emphasize the importance of general learning mechanisms (see Chapter 4).
The formal power of grammars
This part is relatively technical and can be skipped, but the ideas discussed in it are useful
for understanding how powerful a grammar must be if it is to be able to describe natural language. The study of different types of grammar and the devices that are necessary to produce them is part of the branch of mathematical linguistics or computational theory (a subject that combines logic, linguistics, and computer science) called automata theory. Automata theory also reveals something of the difficulty of the task confronting the child who is trying to learn language. An automaton is a device that embodies a grammar and that can produce sentences that are in accordance with that grammar. It takes an input and performs some elementary operations, according to some previously specified instructions, to produce an output. The topic is of some importance because if we know how complex natural language is, we might expect this to place some constraints on the power of the grammar necessary to cope with it.
We have already defined a grammar as a device that can generate all the sentences of a language, but no non-sentences. A language is not restricted to natural language: it can be an artificial language (such as a programming language), or a formal language such as mathematics. In fact, there are many possible grammars that fall into a small number of distinct categories, each with different power. Each grammar corresponds to a particular type of automaton, and each type produces languages of different complexity.
We cannot produce all the sentences of natural language simply by listing them, because there are an infinite number of grammatically acceptable sentences. To be able to produce all these sentences, our grammar must incorporate recursive and iterative rules. Some rules need to be sensitive with respect to the context in which the symbols they manipulate occur. Context-free and context-sensitive languages differ in whether they need rules that can be specified independently of the context in which the elements occur. How complex is natural language, and how powerful must the grammar be that produces it?
The simplest type of automaton is known as a finite-state device. This is a simple device that moves from one state to another depending only on its current state and current input, and produces what is known as a Type 3 language. The current state of a finite-state device is determined by some finite number of previous symbols (words). Type 3 grammars are also known as right-linear grammars, because every rewrite rule can only be of the form A → B or A → x B, where x is a terminal element. This produces right-branching tree structures. For example, if you use the rules in (30) you can produce sentences such as in (31). Just substitute the appropriate letters; the vertical separator | separates alternatives.

(30) S → the A | a A
     A → green A | vicious A
     A → ghost B | vampire B
     B → chased C | loved C | kissed C
     C → the D | a D
     D → witch | werewolf
(31) The vicious vampire chased the witch. A green vicious ghost kissed the werewolf.

The corresponding finite-state device is depicted in Figure 2.5. The finite-state device always starts in the S state, and then reads a word from the appropriate category to move on to the next state, repeating this step from each state it reaches. It finishes producing sentences when it reaches the end state. We can produce even longer sentences if we allow iteration with a rule such as (32), which will enable us to produce sentences of the form (33).

(32) D → and S
(33) The vicious vampire chased the witch and a green vicious ghost kissed the werewolf.

Next up in power from a finite-state device is a push-down automaton. This is more powerful than a finite-state device because it has a memory; the memory is limited, however, in that it is only a push-down stack. A push-down stack is a special type of memory where only the last item stored on the stack can be retrieved; if you want to get at something stored before the last thing, everything stored since will be lost. It is like a pile of plates. It produces Type 2 grammars that can parse context-free languages. Next in power is a linear-bounded automaton, which has a limited memory, but can retrieve anything from this memory. It produces Type 1 grammars, parsing context-sensitive languages. Finally, the most powerful automaton, a
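The way the rules in (30) drive a finite-state device can be made concrete in a few lines of code. The following Python sketch is an illustration only and is not taken from the text: the names GRAMMAR, generate, and accepts are invented here, and encoding each rewrite rule as a (word, next state) pair, with None standing for the end state, is an assumption made for convenience. The device starts in state S, handles one word at a time, and stops when it reaches the end state.

```python
import random

# Hypothetical encoding of the right-linear rules in (30).
# Each state maps to its alternatives as (word, next_state) pairs;
# next_state None marks the end state (an assumption of this sketch).
GRAMMAR = {
    "S": [("the", "A"), ("a", "A")],
    "A": [("green", "A"), ("vicious", "A"), ("ghost", "B"), ("vampire", "B")],
    "B": [("chased", "C"), ("loved", "C"), ("kissed", "C")],
    "C": [("the", "D"), ("a", "D")],
    "D": [("witch", None), ("werewolf", None)],
}


def generate(rng, start="S"):
    """Produce one sentence by walking from the start state to the end state."""
    state, words = start, []
    while state is not None:
        # Pick one alternative for the current state and move on.
        word, state = rng.choice(GRAMMAR[state])
        words.append(word)
    return " ".join(words)


def accepts(sentence, start="S"):
    """Return True if the device can read the sentence and finish in the end state."""
    states = {start}
    for word in sentence.lower().rstrip(".").split():
        # Follow every transition labelled with this word from the current states.
        states = {
            nxt
            for s in states
            if s is not None
            for w, nxt in GRAMMAR[s]
            if w == word
        }
        if not states:
            return False  # no rule allows this word here
    return None in states


print(generate(random.Random()))
print(accepts("The vicious vampire chased the witch."))  # True
print(accepts("The witch chased the vampire."))  # False
```

Because the rule A → green A | vicious A leads back to the same state, the sketch can emit any number of adjectives, yet at every step it consults nothing beyond its current state and the current word. That is the sense in which a finite-state device has no memory.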
SUMMARY
FURTHER READING
Crystal (2010) and Fromkin et al. (2011) provide excellent detailed introductions to phonetics and
phonology, and in particular give much more detail about languages other than English.
Fabb (1994) is a workbook of basic linguistic and syntactic concepts, and makes the meaning of
grammatical terms very clear, although most of the book avoids using the notion of a verb phrase, on
the controversial grounds that verb phrases are not as fundamental as other types of phrases. For a
more detailed account see Burton-Roberts (1997). Also try Tarshis (1992) for a friendly introduction
to grammatical rules in English. For a more advanced review, see Crocker (1999).
Pinker (1994) gives a brief and accessible description of Chomsky’s theory of syntax. Borsley
(1991) provides excellent coverage of contemporary linguistic approaches to syntax, and Radford
(1981) provides detailed coverage of the linguistic aspects of Chomsky’s extended theory. Radford
(1997) provides an excellent introduction to the minimalist approach; be warned, however, that this
is a very technical topic. An excellent, detailed yet approachable coverage of Chomsky’s theory,
which emphasizes principles and parameters theory, is Cook and Newson (2007). See also references
to his ideas on the development of language at the end of Chapter 3.
If you want to find out more about the relation between linguistics and psycholinguistics, read
the debate between Berwick and Weinberg (1983a, 1983b), Garnham (1983a), Johnson-Laird (1983),
and the articles by Stabler (1983) and Jackendoff (2003), with the subsequent peer commentaries.
An introduction to automata theory is provided in Johnson-Laird (1983) and Sanford (1985); a more
detailed and highly mathematical treatment can be found in Wall (1972).
See Fauconnier and Turner (2003) for a general account of cognition and language in the
cognitive linguistics vein.
SECTION B
THE BIOLOGICAL AND
DEVELOPMENTAL BASES OF LANGUAGE
Chapter 3, The foundations of language, asks where language came from, whether language is unique to humans, and what we can learn from attempts to teach human language to animals. Next we examine the biological basis of language and what mechanisms are necessary for its development. We look at the cognitive and social basis of human language development. Finally, we examine the relation between language and thought.
Chapter 4, Language development, is concerned with how language develops from infancy to adolescence. Do children have an innate device that enables them to acquire language from input that is often impoverished? How do infants learn to associate words with the objects they see in the world around them? How do they learn the rules that govern word order?
Chapter 5, Bilingualism and second language acquisition, asks what cognitive processes are involved when a child is brought up using two languages, and whether these differ from the situation of an adult learning a second language. How should languages be taught?
CHAPTER 3
THE FOUNDATIONS OF
LANGUAGE
proposed in favor of the side-effect theory. First, many researchers believe that there has not been enough time for something so complex as language to evolve since the evolution of humans diverged from that of other primates. Second, a grammar cannot exist in any intermediate form (we either have a grammar or we don’t). Third, as possessing a complex grammar confers no obvious selective advantage, evolution could not have selected for it.
In recent years, however, the hypothesis that language evolved by Darwinian natural selection as an advantageous adaptation has largely won, partly because it provides a well-understood general mechanism, indeed the only mechanism understood, for how language could have arisen (natural selection), and partly because the objections do not hold much water. It is now apparent that there was indeed sufficient time for grammar to evolve, that it evolved to communicate existing cognitive representations, and that the ability to communicate using a grammar-based system confers a big evolutionary advantage. For example, it obviously makes a big difference to your survival if an area has animals that you can eat, or animals that can eat you, and if you are able to communicate this distinction to someone else (Fitch, Hauser, & Chomsky, 2005; Jackendoff & Pinker, 2005; Pinker, 2003; Pinker & Bloom, 1990; Pinker & Jackendoff, 2005).
The capacity for language and symbol manipulation must have arisen as the human brain increased in size and complexity when Homo sapiens became differentiated from other species, between 2 million and 300,000 years ago. Study of the fossil evidence suggests that a structure corresponding to Broca’s area, a region of the brain clearly associated with language in modern humans, was present in the brains of early hominids as long as 2 million years ago. The shape of the human skull has changed significantly over time, enabling better control of speech: Neanderthals would not have been capable of controlling their tongues sufficiently to be able to articulate as clearly as we do. The articulatory apparatus has not changed significantly over the last 60,000 years. The evolution of language has come at a cost: the structures in the throat that enable us to control the production of sounds also make us more likely than other primates to choke on our food. Obviously the evolutionary advantages conferred by language must outweigh the disadvantage of this increased risk.
We do not know whether language existed in some intermediate form, although it seems unlikely that early humans went from communicating through a few grunts to a rich language that used grammar. Bickerton (1990, 2003) has controversially championed the idea of a protolanguage that was intermediate between primate communication systems and human language. Protolanguage arose with the evolution of Homo erectus about 1.6 million years ago. Protolanguage has vocal labels attached to concepts, but does not
have a proper syntax; it is distinguished from language by the power of syntax (Chapter 2). The idea of a protolanguage is a powerful one: primates taught sign language (this chapter), very young children (Chapter 4), children deprived of early linguistic input (this chapter), and speakers of pidgin language (this chapter) could all be said to use a protolanguage rather than language.
What pressures selected for language? The social set-up of early humans must have played a role in the evolution of language, but many other animals, particularly primates, have complex social organizations, and although primates also have a rich repertoire of alarm calls and gestures, they did not develop language. In a rich social environment an adaptation that enables rich communication confers a huge evolutionary advantage on that species.
It is unlikely that language evolved in one step, or depends on a single gene. However, recent evidence suggests that important aspects of language, especially grammar, may be associated with a specific gene, called the FOXP2 gene. In animals, the FOXP2 gene seems to be involved in coordinating sensory and motor information, and skilled complex movements (Fisher & Marcus, 2006). Damage to the FOXP2 gene in humans leads to difficulty in acquiring language normally. The evidence suggests that the current structure of the FOXP2 gene in humans arose through a mutation within the last 100,000 years (Corballis, 2004), leading to greater development of Broca’s region and an enhanced ability to coordinate complex sequences of movement (Fisher & Marcus, 2006). Corballis argues that the flowering of human culture, art, and technology, and the expansion of Homo sapiens about 40,000 years ago, were all associated with the FOXP2 mutation and the development of language. The mutation meant that speech could become fully autonomous in the sense that it no longer relied on gestures; this autonomy at once freed the hands and enabled better communication. A hundred thousand years is a long time in evolution: A mutation giving a 1% gain in fitness would increase in frequency in the population from 0.1% to 99.9% in just 4,000 generations (Haldane, 1927). However, it is likely that the Neanderthals (a branch of the genus Homo that became extinct about 30,000 years ago) also carried the FOXP2 mutation and used some form of language, although these results are controversial because they might just reflect interbreeding between Homo sapiens and Homo neanderthalensis. We examine what the FOXP2 gene may control in more detail in Chapter 4.
The extent to which the evolution of language depended on the hands, and whether grammar arose from the use of manual gestures, is still controversial. Paget (1930) was the first to propose that language evolved in intimate connection with the use of hand gestures, so that vocal gestures developed to expand the available repertoire. Corballis (1992, 2003, 2004) argued that the evolution of language freed the hands from having to make gestures to communicate, so that tools could be made and used simultaneously with communication. Corballis argues that language arose not from primate calls, but from primate gestures. Additional evidence that language evolved from gestures comes from imaging studies that show that the brains of great apes are specialized in a very similar way to humans (Cantalupo & Hopkins, 2001). Chimpanzees and gorillas, like humans, show an asymmetry between the left and right hemispheres of the brain, with what is called Brodmann’s area 44 being particularly enlarged on the left. This area is probably involved with the production of gestures; furthermore, it corresponds to Broca’s region in humans, a key part of the brain involved in producing speech. One plausible explanation of this finding is that the brains of great apes became specialized to enable the production of sophisticated gestures, but this specialization continued in humans with speech arising from these gestures. Mirror neurons in this region play a particular role in imitating gestures; they fire when an animal performs a specific action or sees another animal performing the same action (Rizzolatti, Fadiga, Fogassi, & Gallese, 1996). They have been argued to play a particular role in the evolution of language (Stamenov & Gallese, 2002), with manual gestures rather than vocal communication driving evolution. The mirror neuron system for grasping enabled imitation, which in turn allowed early manual signs to develop (Arbib, 2005). Although many species (including birds and frogs) show left-hemisphere dominance for
producing sounds, only humans show very strong right-handedness dominance; in other animals gesture production is bilateral across the population. (Although individual nonhuman primates, dogs, cats, and even rats tend to favor one paw, there is no systematic preference for left or right within these species.) As the gesture-based language evolved, vocalizations became incorporated into the gesture system, leading to the specialization and lateralization of the language and gesture systems and the right-handed preference in humans.
Of course, the relation between evolution and language might have been more complex than this. Elman (1999) argued that language arose from a communication system through many interacting “tweaks and twiddles.” Deacon (1997) proposed that language and the brain co-evolved in an interactive way, converging towards a common solution for the cognitive and sensorimotor problems facing the organism. Symbolic gestures and vocalization preceded fully blown language. As the frontal cortex of humans grew larger, symbolic processing became more important, and linguistic skills became necessary to manage symbol processing, leading to the development of speech apparatus to implement these skills, which in turn would demand and enable further symbolic processing abilities. Fisher and Marcus (2006) propose that language was not a single wholesale innovation, but a complex reconfiguration of several systems that became adapted to form language. Such a conclusion is similar to that of Christiansen and Chater (2008), who see language itself as an evolving system that has made use of pre-existing brain structures.

DO ANIMALS HAVE LANGUAGE?
Is language an ability that is uniquely human? I examine both naturally occurring animal communication systems, and attempts to teach a human-like language to animals, particularly chimpanzees. There are a number of reasons why this topic is important. First, it provides a focus for the issue of what we mean by the term language. Second, it informs debate about the extent to which aspects of language might be innate in humans and have a genetic basis. Third, it might tell us about which other social and cognitive processes are necessary for a language to develop. Finally, of course, the question is of great intellectual interest. The idea of being able to “talk to the animals” like the fictional Dr. Dolittle fascinates both adults and children alike. It can become an emotive subject, as it touches on the issue of animal rights, and the extent to which humans are distinct from other animals.

Animal communication systems
Many animals possess rich communication systems; even insects communicate. Communication is much easier to define than language: it is the transmission of a signal that conveys information, often such that the sender benefits from the recipient’s response (Pearce, 2008). The signal is the means that conveys the information (e.g., a sound or a smell). It is useful to distinguish between communicative and informative signals: communicative signals have an element of design or intentionality in them, whereas signals that are merely informative do not. If I cough, this might inform you that I have a cold, but it is not a communication; but telling you that I have a cold is.
A wide range of methods is used to convey information. Ants rely on chemical messengers called pheromones. Honey bees produce a complex “waggle dance” (see Figure 3.1) in a figure-of-eight shape to other members of the hive (von Frisch, 1950, 1974). The direction of the straight part of the dance (or the axis of the figure-of-eight) represents the direction of the nectar relative to the sun, and the rate at which the bee waggles during the dance represents distance.
Primates use visual, auditory, tactile, and olfactory signals to communicate with each other. They use a wide variety of calls to symbolize a range of features of the environment and their emotional states. For example, a vervet monkey produces one particular “chutter” to warn others that a snake is nearby, a different call when an eagle is overhead, and yet another distinct call
Which features do animal communication systems possess? All communication systems possess some of the features. For example, the red belly of a breeding stickleback is an arbitrary sign. Some of the characteristics are more important than others; we might single out semanticity, arbitrariness, displacement, openness, tradition, duality of patterning, prevarication, and reflectiveness. These features all relate to the fact that language is about meaning, and provide us with the ability to communicate about anything. We might add other features to this list that emphasize the creativity and meaning-related aspects of language. Marshall (1970) pointed out the important fact that language is under our voluntary control; we intend to convey a particular message. The creativity of language stems from our ability to use syntactic rules to generate a potentially infinite number of messages from a finite number of words using iteration and recursion (see Chapter 2).
Syntax has five important properties (Kako, 1999a; Pinker, 2002). First, language is a discrete combinatorial system. When words are combined, we create a new meaning: the meanings of the words do not just blend into each other, but retain their identity. Second, well-ordered sentences depend on ordering syntactic categories of words (such as nouns and verbs) in correct sequences. Third, sentences are built round verbs, which specify what goes with what (e.g., you give something to someone). Fourth, we can distinguish words that do the semantic work of the language (content words; see Chapter 2) from words that assist in the syntactic work of the language (function words). Fifth, recursion (phrases containing examples of themselves) enables us to construct an infinite number of sentences from a finite number of rules. No animal communication system has these properties.
We can use language to communicate about anything, however remote in time and space.
Hence, although a parrot uses the vocal-auditory channel and the noises it makes satisfy most of the design characteristics up to number 13, it cannot lie, or reflect about its communication system, or talk about the past. Whereas monkeys are limited to chattering and squeaking about immediate threats such as snakes in the grass and eagles overhead, we can express novel thoughts; we can make up sentences that convey new ideas. This cannot be said of other animal communication systems. Bees will never dance a book about the psychology of the bee dance. We can talk about anything and effortlessly construct sentences that have never been produced before.
In summary, many animals possess rich symbolic communication systems that enable them to convey messages to other members of the species, that affect their behavior, that serve an extremely useful purpose, and that possess many of Hockett’s design features. On the other hand, these communication systems lack the richness of human language. This richness is manifested in our limitless ability to talk about anything using a finite number of words and rules to combine those words. However difficult “language” may be to define, the difference between animal communication systems and human language is not just one of degree. All nonhuman communication systems are quite different from language (Deacon, 1997).

Can we teach language to animals?
Perhaps some animals have the biological and cognitive apparatus to acquire language, but have not needed to do so in their evolutionary niche. The alternative view is that only humans possess the necessary capabilities: that other animals are in principle incapable of learning language.
Most people think that dogs and parrots “know” some aspects of language. Dogs respond to instructions. One border collie called Rico knew the labels of over 200 items (Kaminski, Call, & Fischer, 2004), being able to fetch items with different names from around the house, even when he could not see the owner (thereby eliminating the possibility of the “Clever Hans” effect, which is that animals that appear to know language are in fact just picking up cues from their owner). When faced with a new name, he would infer that the name applied to a novel object, rather than being another name for an object with which he was familiar; this “novel name equals nameless category” principle is one that children use to learn some new words. However, unlike children, Rico’s knowledge was restricted to the names of physical objects, and he showed no understanding of how the meanings of words might be related (e.g., that doll and ball are both types of toy). Nevertheless, this performance is impressive, and also suggests that general (rather than language-specific) learning mechanisms might go some way to explaining early word learning in children.

Pepperberg’s (1981) African grey parrot, Alex, showed evidence of being able to combine discrete categories and possibly to use syntactic categories appropriately.

Everyone knows that parrots can be taught to mimic human speech. Pepperberg (1981, 1983, 1987, 2009) took this idea further and embarked on an elaborate formal program of training of her African grey parrot (Psittacus erithacus) called Alex. After 13 years, Alex had a vocabulary of about 80 words, including object names, adjectives, and some verbs. He could even produce and understand short sequences of words. Alex could classify 40 objects according to their color and what they were made of, understand the concepts of same and different, and count up to six. Alex showed evidence of being able to combine discrete categories and use syntactic categories appropriately. However,
he knew few verbs, showed little evidence of being able to relate objects to verbs, and knew very few function words (Kako, 1999a). Hence Alex’s linguistic abilities are extremely limited.
Herman, Richards, and Wolz (1984) taught two bottle-nosed dolphins, Phoenix and Akeakamai, artificial languages. One language was visually based, using gestures of the trainer’s arms and legs (see Figure 3.2), and the other was acoustically based, using computer-generated sounds transmitted through underwater speakers. However, this research tested only the animals’ comprehension of the artificial language, not their ability to produce it. From the point of view of answering our questions on language and animals it is clearly important to examine both comprehension and production. Even so, the dolphins’ syntactic ability was limited, and they showed no evidence of being able to use function words (Kako, 1999a).
Most of the work on teaching language to animals involves other primates, particularly chimpanzees, as they are highly intelligent, social animals and are our closest genetic neighbors. In the following discussion it is useful to bear in mind the distinction between teaching word meaning and syntax. Remember that an essential feature of human language is that it involves both associating a finite number of words with particular meanings or concepts, and using a finite number of rules to combine those words into a potentially infinite number of sentences. Before we can conclude that apes have learned a language we need to show that they can do both of these things.

What are the other cognitive abilities of chimpanzees?
We have seen that primates have a rich communication system that they use in the wild. The cognitive abilities of a chimpanzee named Viki aged 3½ years were generally comparable to those of a child of a similar age on a range of perceptual tasks such as discriminating and matching similar items, but broke down on tasks involving counting (Hayes & Nissen, 1971). Experiments on another chimp
named Sarah also suggested that she performed at levels close to that of a young child on tasks such as conserving quantity, as long as she could see the transformation occurring. For example, she understood that pouring water from a tall, thin glass into a short, fat glass did not change the amount of water. Hence the cognitive abilities of apes are broadly similar to those of young children, apart from the latter’s linguistic abilities. This decoupling of linguistic and other cognitive abilities in children and apes has important implications. First, it suggests that for many basic cognitive tasks language is not essential. Second, it suggests that there are some non-cognitive prerequisites to linguistic development. Third, it suggests that cognitive limitations in themselves might not be able to account for the failure of apes to acquire language.

Talking chimps
The earliest attempt to teach apes language was that of Kellogg and Kellogg (1933), who raised a female chimpanzee named Gua along with their own son. (This type of rearing is called cross-fostering or cross-nurturing.) Gua only understood a few words, and never produced any that were recognizable. Hayes (1951) reared a chimp named Viki as a human child and attempted to teach her to speak. This attempt was also unsuccessful, as after 6 years the chimpanzee could produce just four poorly articulated words (“mama,” “papa,” “up,” and “cup”) using her lips. Even then, Viki could only produce these in a guttural croak, and only the Hayes family could understand them easily. With a great deal of training she understood more words, and some combinations of words.
These early studies have a fundamental limitation. The vocal tracts of chimps are physiologically unsuited to producing speech, and this difference alone could account for their lack of progress (see Figure 3.3). Nothing can be concluded about the general language abilities of primates from these early failures.

Washoe
Although the design of the vocal tracts of chimps is unsuited to speaking, chimps are manually very dexterous. Later attempts at teaching apes language were based on systems using either a type of sign language, or involving manipulating artificially created symbols. Perhaps the most famous example of trying to teach language to an ape is that of Washoe. Washoe is a female chimpanzee who was caught in the wild when she was approximately 1 year old. She was then brought up as a human child, doing things such as eating, toilet training, playing, and other social activities
FIGURE 3.3 Compare the adult vocal tract of a human (left) with that of a chimpanzee (right). Adapted from Lieberman (1975).
FIGURE 3.5 The arrangement of lexigrams on a keyboard. Blank spaces were non-functioning keys, or displayed
photographs of trainers. From Savage-Rumbaugh, Pate, Lawson, Smith, and Rosenbaum (1983).
a grammar. Maybe, then, these animals can learn language, and the difference between apes and humans is only a matter of degree?
Unfortunately, there are many problems with some of this research, particularly the early, pioneering work. The literature is full of argument and counter-argument, making it difficult to arrive at a definite conclusion. There have been two sources of debate: methodological criticisms of the training methods and the testing procedures used, and argument over how the results should be interpreted.
What are the methodological criticisms? First, one criticism was that ASL is not truly symbolic, in that many of the signs are icons standing for what is represented in a non-arbitrary way (Savage-Rumbaugh et al., 1978; Seidenberg & Petitto, 1979). For example, the symbol for “give” looks like a motion of the hand towards the body reminiscent of receiving a gift, and “drive” is a motion rather like turning a steering wheel. If this were true, then this research could be dismissed as irrelevant because the chimps are not learning a symbolic language. Clearly it is not true; not all the attempts mentioned earlier used ASL; Premack’s plastic symbols, for example, are very different. In addition, the force of this objection can be largely dismissed on the grounds that although some ASL signs are iconic, many of them are not, and that deaf people clearly use ASL in a symbolic way. No one would say that deaf people using ASL are not using a language (Petitto, 1987). Nevertheless, ASL is different from spoken language in that it is more condensed (articles such as “the” and “a” are omitted), and this clearly might affect the way in which animals use the language. And in Washoe’s case at least, a great proportion of her signing seemed to be based on signs that resemble natural gestures. It is also possible that her trainers over-interpreted her gestures, first incorrectly identifying some gestures as signs, or thinking that a particular movement was indeed an appropriate sign. Deaf native signers observed a marked discrepancy between what they thought Washoe had produced (which was very little), and what the trainers claimed (Pinker, 1994). Again, these criticisms are hard to
justify against the lexigram-based studies, although Brown (1973) noted that Sarah’s performance deteriorated with a different trainer.
In these early studies, reporting of signing behavior was anecdotal, or limited to cumulative vocabulary counts and lists. No one ever produced a complete corpus of all the signs of a signing ape in a predetermined period of time, with details of the context in which the signs occurred (Seidenberg & Petitto, 1979). The limited reporting has a number of consequences that make interpretation difficult. For example, the “water bird” example would be less interesting if Washoe had spent all day randomly making signs such as “water shoe,” “water banana,” “water refrigerator,” and so on. In addition, the data presented are reduced so as to eliminate the repetition of signs, thus producing summary data. Repetition in signing is quite common, leading to long sequences such as “me banana you banana me give,” which is a less impressive syntactic accomplishment than “you banana me give,” and not at all like the early sequences produced by human children. The chimps produced many imitations of the signs that had just been produced by the humans, while truly creative signing in the absence of something to imitate is rare. Thompson and Church (1980) produced a computer program to simulate Lana’s acquisition of Yerkish. They concluded that all she had done was to learn to associate objects and events with lexigrams, and to use one of a few stock sentences depending on situational cues. There was no evidence of real understanding of word meaning or syntactic structure. (For details of these methodological problems, see Bronowski & Bellugi, 1970; Fromkin et al., 2011; Gardner, 1990; Pinker, 1994; Seidenberg & Petitto, 1979; and Thompson & Church, 1980.)
There are also a number of differences between the behavior of apes using language and of children of about the same age, or with the same vocabulary size (see Table 3.1). The utterances made by chimps are tied to the here-and-now, with those involving temporal displacement (talking about things remote in time) particularly rare. There is a lack of syntactic structure and the word order used is inconsistent, particularly with longer utterances. Fodor et al. (1974) pointed out that there appeared to be little comprehension of the syntactic relations between units, and that it was difficult to produce a syntactic analysis of their utterances. There was little evidence that “acquiring” a sentence structure as in the string of words “Insert apple dish” would help, or transfer to, producing the new sentence “Insert apple red dish.” Unlike humans, these chimpanzees could not reject ill-formed sentences. They rarely asked questions, an obvious characteristic of the speech of young children. Children use language to find out more about language; chimpanzees do not. Chimps do not spontaneously use symbols referentially; that is, they need explicit training to go beyond merely associating a particular symbol or word in a particular context; young children behave quite differently. Finally, it is not clear that these chimps used language to help them to reason.
Apes | Children
Utterances are mainly in the here-and-now | Utterances can involve temporal displacement
Need explicit training to use symbols | Do not need explicit training to use symbols
These criticisms have not gone unchallenged (e.g., Premack, 1976a, 1976b). Savage-Rumbaugh (1987) pointed out that it is important not to generalize from the failure of one ape to the behavior of others. Furthermore, many of these early studies were pioneering and later studies learned from their failures and difficulties. Broadly, however, much of the early work is of limited value because it is not clear that it tells us anything about the linguistic abilities of apes; if anything, it suggests that they are rather limited.
Kanzi
The major challenge to the critical point of view comes from more recent studies involving pygmy chimpanzees. Strong claims have been made about the performance of Kanzi (Greenfield & Savage-Rumbaugh, 1990; Savage-Rumbaugh & Lewin, 1994; Savage-Rumbaugh, McDonald, Sevcik, Hopkins, & Rupert, 1986). Whereas earlier studies used the common chimpanzee (Pan troglodytes), comparative studies of animals suggest that the bonobo or pygmy chimpanzee (Pan paniscus) is more intelligent, has a richer social life, and a more extensive natural communicative repertoire. Kanzi is a pygmy chimpanzee, and many believe he has made a vital step in spontaneously acquiring the understanding that symbols refer to things in the world, behaving like a child. Unlike other apes, Kanzi did not receive formal training by reinforcement with food on production of the correct symbol. He first acquired symbols by observing the training of his mother (called Matata) on the Yerkish system of lexigrams. He then interacted with people in normal daily activities, and was exposed to English. His ability to comprehend English as well as Yerkish was studied and compared with the ability of young children (Savage-Rumbaugh, Murphy, Sevcik, Brakke, Williams, & Rumbaugh, 1993). Kanzi performed as well as or better on a number of measures than a 2-year-old child. By the age of 30 months, Kanzi had learned at least seven symbols (orange, peanut, banana, apple, bedroom, chase, and
Austin); by the age of 46 months he had learned just under 50 symbols and had produced about 800 combinations of them. He was sensitive to word order, and understood verb meanings; for example, he could distinguish between “get the rock” and “take the rock,” and between “put the hat on your ball” and “put the ball on your hat.” Spontaneous utterances (rather than those that were prompted or imitations) formed more than 80% of his output.
Both Kanzi’s semantic and syntactic abilities have been questioned. Seidenberg and Petitto (1987) argued that Kanzi understands names in a different way from humans. Take Kanzi’s use of the word “strawberry.” He uses “strawberry” as a name, as a request to travel to where the strawberries grow, as a request to eat strawberries, and so on. Furthermore, Kanzi’s acquisition of apparent grammatical skills was much slower than that of humans, and his sentences did not approach the complexity displayed by a 3-year-old child. In reply, Savage-Rumbaugh (1987) and Nelson (1987) argued that the critics underestimated the abilities of the chimpanzees, and overestimated the appropriate linguistic abilities of very young children. Kako (1999a) argued that Kanzi shows no signs of possessing any function words. He does not appear to be able to use morphology: he does not modify his language according to number, as we do when we form plurals. And there is no clear evidence that Kanzi uses recursive grammatical structures.
Kanzi is by far the best case for language-like abilities in apes. Why is Kanzi so successful? Although bonobos might be better linguistic students, another possibility is that he was very young when first exposed to language (Deacon, 1997). Perhaps early exposure to language is as important for apes as it appears to be for humans.
Evaluation of work on teaching apes language
Most people would agree that in these studies researchers have taught some apes something, but what exactly? Clearly apes can learn to associate names with actions and objects, but there is more to language than this. In a recent analysis of a large (3,448) corpus of signs made to humans by five chimpanzees (Pan troglodytes) with a long history of sign use, Rivas (2005) found that the chimpanzees used mainly signs for actions and objects. Furthermore, they showed little evidence of either syntactic or semantic structure in their signing, showing instead much repetition and simple concatenation of signs, mostly with the goal of acquiring food or some other object. Rivas concluded that the signing of apes showed many differences from the early language of children.
Let us consider word meaning in more detail. How do we use names: in what way is language different from simple association? Pigeons can be taught to respond differentially to pictures of trees and water (Herrnstein, Loveland, & Cable, 1977), so it is an easy step to imagine that we could condition pigeons to respond in one way (e.g., pecking once) to one printed word, and in another way (e.g., pecking twice) to a different word, and so on. We could go so far as to suggest that these pigeons would be “naming” the words. So in what way is this “naming” behavior different from ours? One obvious difference is that we do more than name words: we also know their meaning. We know that a tree has leaves and roots, that an oak is a tree, that a tree is a plant, and that they need soil to grow in. We know that the word “leaf” goes with the word “tree” more than the word “pyramid.” That is, we know how the word “tree” is conceptually related to other words (see Chapter 11 for more detail). We also know what a tree looks like. Consider what might happen if we present the printed word “tree” to a pigeon. By examining its pecking behavior, we might infer that the best a trained pigeon could manage is to indicate that the word “tree” looks more like the word “tee” than the word “horse.”
Is the use of signs by chimpanzees more like that of pigeons or of humans? There are two key questions that would clearly have to be answered “yes” before most psycholinguists would agree that these primates are using words like us. First, can apes spontaneously learn that names refer to objects in a way that is constant across contexts? We know that a strawberry is a strawberry whether it’s in front of us in a bowl covered in cream and sugar, or in a field attached to a strawberry plant half covered in soil. We do not need different words for each, or restrict our usage to just one context. Second, do these primates have the same understanding of word meaning as we do? Despite the promising work with
Kanzi, there are no unequivocal answers to these questions. For example, Nim could sign “apple” or “banana” correctly if these fruits were presented to him one at a time, but was unable to respond correctly if they were presented together. This suggests that he did not understand the meaning of the signs in the same way that humans do. On the other hand, Sherman and Austin could group lexigrams into the proper superordinate categories even when the objects to which they referred were absent. For example, they could group “apple,” “banana,” and “strawberry” together as “fruit,” although this claim is controversial (Savage-Rumbaugh, 1987; Seidenberg & Petitto, 1987).
In summary, whereas chimpanzees have clearly learned associations between symbols and the world, and between symbols, it is debatable whether they have learned the meaning of the symbols in the way that we know the meanings of words. Nevertheless, they can sometimes learn very effectively, in a manner akin to children (Lyn & Savage-Rumbaugh, 2000). Kanzi and another bonobo chimpanzee (called Panbanisha), also reared in a naturalistic environment, could learn new words naming objects very quickly, with only a few exposures to novel items (at a rate similar to that of language-delayed children). In addition, the chimpanzees could sometimes learn by observation, rather than having to have the object pointed out to them each time its name was presented.
[Photo caption: The cotton-top tamarin performs well on a range of language-like tasks; for example, they can learn which sequences of sounds tend to occur often together.]
Let us now look at chimps’ syntactic abilities. Has it been demonstrated that apes can combine symbols in a rule-governed way to form sentences? In as much as they might appear to do so, it has been proposed that the “sentences” are simply generated by “frames.” That is, it is nothing more than a sophisticated version of conditioning, and does not show the creative use of word-ordering rules. It is as though we have now trained our pigeons to respond to whole sentences rather than just individual words. Such pigeons would not be able to recognize that the sentence “The cat chased the dog” is related in meaning to “The dog is chased by the cat,” or has the same structure as “A vampire loved a ghost.” We have a finite number of grammatical rules and a finite number of words, but combine them to produce an infinite number of sentences (Chomsky, 1957). We have seen that recursion (where phrases can include phrases of the same type) is an essential feature of human language. There is no evidence that apes can use recursion. More recent research reinforces this view. Monkeys can learn very simple grammars, but they cannot learn more sophisticated, human-like grammars that use hierarchical structures where there are long-distance dependencies between words (e.g., the word “if” is usually followed by “then,” but any number of words can intervene; we can embed sentences within others, such as in “the cat the rat bit died”). Cotton-top tamarins perform well at a range of language-like tasks. They can, for example, like young children (see Chapter 4), learn which sequences of sounds tend to occur often together (essentially, they can discriminate words from nonwords; see Hauser, Newport, & Aslin, 2001). We can study their abilities to learn grammars by their ability to discriminate instances of strings of sounds that follow a syntactic rule from strings that violate that rule; essentially, we are asking them to make what we call grammaticality judgments. When the monkeys hear a string that violates the rules they tend to look at the loudspeaker; we could say that they “look surprised.” The monkeys can be taught simple invented grammars (e.g., that produce a string of sounds corresponding to an ABABAB syllable structure), but are unable to learn more sophisticated artificial grammars that use hierarchical structure (e.g., that produce a string
of sounds corresponding to AAABBB; Fitch & Hauser, 2004). The generation of hierarchical structures such as these depends on the ability to use recursion, and only humans can use recursion.
Hauser et al. (2002) and Fitch et al. (2005) go so far as to claim that recursion is the only uniquely human component of language, yet an immensely powerful one. Pinker and Jackendoff (2005) and Jackendoff and Pinker (2005) take issue with this extreme claim, arguing that there are many more aspects of language, including properties of words and grammar, and the anatomy and control of the vocal tract, that are unique to humans. In addition, the FOXP2 gene (see Chapters 1 and 4) is unique to humans and is involved in the control of speech and language, but does not seem to involve recursion. And furthermore, the Pirahã language of the Amazon does not seem to use any recursion, yet is clearly a human language (Everett, 2005).
In summary, some higher animals can learn the names of objects and simple syntactic rules. However, they do not develop sophisticated representations of meaning as do humans, and they cannot learn complex, more human-like grammars.
There is disagreement on how well apes come out of a comparison of chimps and children. One problem is that it is unclear with which age group of children the chimpanzees should be compared. When there is more work on linguistic apes bringing up their own offspring, the picture should be clearer. However, this research is difficult to carry out, expensive, and difficult to obtain funding for, so we might have to wait some time for these answers.
At present we can conclude that chimps can learn some symbols and some ways of combining them, but they cannot acquire a human-like syntax. At best, they have acquired a protolanguage.
Why is the issue so important?
As we saw earlier, there is more to the issue of a possible animal language than simple intellectual interest. First, the debate has led to a deeper insight into the nature of language and what is important about it. We can see what makes human language so very different from vervets “chattering” when they see a snake. Second, it is worth noting that although the cognitive abilities of young children and chimpanzees are not very different, their linguistic abilities are. This suggests that language processes are to some degree independent of other cognitive processes. Third, following on from this, Chomsky claimed that human language is a special faculty, which is independent of other cognitive processes, has a specific biological basis, and has evolved only in humans (e.g., Chomsky, 1968). Language arose because the brain passed a threshold in size, and only human children can learn language because only they have the special innate equipment necessary to do so. This hypothesis is summed up by the phrase “language is species-specific and has an innate basis.” (Although as Kako, 1999a, observes, a better statement might be, “some components of language are species-specific.”) In particular, Chomsky argued that only humans possess a language acquisition device (LAD) that enables us to acquire language; without this device we would be stuck forever at the level of a protolanguage (see Chapter 1). In particular, the ability to use recursive syntactic rules, which is what gives human language its full power, is unique to humans (Hauser et al., 2002). Even Premack (1985, 1986a, 1990) has become far less committed to the claim that apes can learn language just like human children. Indeed, he also has come to the conclusion that there is a major discontinuity between the linguistic and cognitive abilities of children and chimpanzees, with children possessing innate, “hard-wired” abilities that other animals lack. At the very least we can say that whereas children acquire language, apes have to be taught it.
THE BIOLOGICAL BASIS OF LANGUAGE
What are the biological precursors of language? How is language development related to the development of brain functions? How do biological processes interact with social factors?
Are language functions localized?
The brain is not a homogeneous mass; parts of it are specialized for specific tasks. How do we know this? In the past most of our knowledge
about how brain and behavior are related came from lesion studies combined with an autopsy: neuropsychologists would discover which part of the brain had been damaged, and relate that information to behavior. Now we have brain-imaging techniques available, particularly fMRI (see Chapter 1), which can also be used with non-brain-damaged speakers. These techniques indicate which parts of the brain are active when we do tasks such as reading or speaking.
Most people know that the brain is divided into two hemispheres (see Kolb & Whishaw, 2009). The two hemispheres of the brain are partly specialized for different tasks: broadly speaking, in most right-handed people the left hemisphere is particularly concerned with analytic, time-based processing, while the right hemisphere is particularly concerned with holistic, spatially based processing. For the great majority (96%) of right-handed people, language functions are predominately localized in the left hemisphere. We say that this hemisphere is dominant. According to Rasmussen and Milner (1977), even 70% of left-handed people are left-hemisphere dominant. This localization of function is not tied to the speech modality; imaging studies show that just the same left-hemisphere brain regions are activated in people producing sign language with both hands (Corina, Jose-Robertson, Guillermin, High, & Braun, 2003).
Early work on the localization of language
How do we know which bits of the brain do what? In the 1950s, Penfield and Roberts (1959) studied the effects of electrical stimulation directly on the brains of patients undergoing surgical treatment for epilepsy. More recently, a number of techniques for brain imaging have become available, including PET and CAT scans (see Chapter 1). These techniques all show that there are specific parts of the brain responsible for specific language processes.
Most of the evidence on the localization of language functions comes from studies of the
earliest and most famous work on the effects of brain damage on behavior in the 1860s. Broca observed several patients where damage to the cortex of the left frontal lobes resulted in an impairment in the ability to speak, despite the vocal apparatus remaining intact and the ability to understand language apparently remaining unaffected. (We look at this again in Chapter 13.) This pattern of behavior, or syndrome, has become known as Broca’s aphasia, and the part of the brain that Broca identified as responsible for speech production has become known as Broca’s area (see Figure 3.6).
A few years later, in 1874, the German neurologist Carl Wernicke identified another area of the brain involved in language, this time further back in the left hemisphere, in the part of the temporal lobe known as the temporal gyrus. Damage to Wernicke’s area (Figure 3.7) results in Wernicke’s aphasia, characterized by fluent language that makes little sense, and a great impairment in the ability to comprehend language, although hearing is unaffected.
The Wernicke–Geschwind model
Wernicke also advanced one of the first models of how language is organized in the brain. He argued that the “sound images” of object names are stored in Wernicke’s area of the left upper temporal cortex of the brain. When we speak, this information is sent along a pathway of fibers known as the arcuate fasciculus to Broca’s
[Figure labels: Broca’s area, frontal lobe, parietal lobe, temporal lobe, occipital lobe]
[Figure labels: Broca’s area, primary auditory area, Wernicke’s area, motor area, angular gyrus]
(comprehension) disorders. For example, people with damage to Broca’s region often have difficulty understanding sentences. Different types of aphasia have variable clusters of symptoms that tend to go together, and that are not as clearly related to regions such as Broca’s or Wernicke’s as the model predicts. Fourth, virtually all people with aphasia have some anomia (difficulty in finding the names of things) regardless of the site of damage. Finally, electrical stimulation of different regions of the brain often has the same effect, and selective stimulation of Broca’s and Wernicke’s areas does not produce the simple, different effects that we might expect.
More recent models of how language is related to the brain
Ullman (2004) proposed a model, called the D/P (declarative/procedural) model, of how language relates to the brain. He argued that language depends on two brain systems. The mental dictionary, or lexicon, depends on a declarative memory system based mainly in the left temporal lobe. The mental grammar, which depends primarily on procedural
memory, is based on a distinct neural system involving the frontal lobes, basal ganglia, cerebellum, and regions of the left parietal lobe. Essentially this distinction is one between linguistic rules, or syntax, and words. The distinction will recur throughout this book, so it is important to remember that there is some anatomical justification for this distinction. Another important idea here is that language processing makes some use of cognitive processes and brain structures that are not just dedicated to language.
Recent work has used imaging to explore the exact role of Broca’s area in language, and one result is that its precise role has become much more controversial. The fact that damage to Broca’s area leads to aphasia shows that it plays an important role, but is it dedicated to language specifically, or does it just involve more general processes that underpin language? Are other regions of the brain involved in processing syntax? The answer to the latter question is almost certainly yes, and to the former, maybe. Imaging suggests that Broca’s area may play a role in general phonological working memory rather than syntactic manipulation as such (Rogalsky & Hickok, 2011; but see also Fedorenko & Kanwisher, 2011). There is even debate as to the exact language-related processes that Broca’s area computes, including phonological short-term memory (Rogalsky & Hickok, 2011), building a hierarchical structure (Friederici, 2002; Friederici, Bahlmann, Heim, Schubotz, & Anwander, 2006), linearizing a hierarchical structure (Bornkessel-Schlesewsky, Schlesewsky, & von Cramon, 2009), and unifying concepts into a planned sentence (Hagoort, 2008). Quite a list!
Some portions of the brain are more important for language functions than others, but it is difficult to localize specific processes in specific brain structures or areas. It is likely that multiple routes in the brain are involved in language production and comprehension. Modern brain-imaging techniques show that much larger regions of the brain may be involved in language processing than were once thought. For example, the temporal gyrus seems to play an important role in language comprehension (Dronkers, Wilkins, van Valin, Redfern, & Jaeger, 2004). A wide-ranging account of the relation between language and the brain is provided by Hickok and Poeppel (2004), who, drawing on data from brain imaging and lesion studies, focus on auditory comprehension. They argue that early stages of speech perception involve the superior temporal gyrus bilaterally (on both sides,
although more on the left). The cortical processing system then diverges into dorsal (towards the back and top of the brain) and ventral (towards the front and bottom of the brain) streams (see Figure 3.9). The ventral stream is mainly concerned with turning sound into meaning. The dorsal stream is concerned with mapping sound onto a representation involving articulation, and relates speech perception to speech production. Most of what we traditionally think of as “speech perception” takes place in the ventral stream. The output of the dorsal stream is an integration of auditory and motor information, and the stream is important when we focus on the sounds of the words involved (e.g., in learning to make speech sounds, or in analyzing the sounds of words, or repeating back nonwords).
[Figure 3.9, panel A labels: auditory input; acoustic-phonetic speech codes; dorsal stream (auditory–motor interface, articulatory-based speech codes); ventral stream (sound–meaning interface)]
FIGURE 3.9 (A) Hickok and Poeppel’s proposed framework for the functional anatomy of language. (B) General
locations of the model components shown on a lateral view of the brain. From Hickok and Poeppel (2004).
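As an illustrative aside, the framework in Figure 3.9 can be pictured as a small directed graph. The sketch below is only one reading of the description: the stage names follow the figure labels, but the graph representation, the direction of the links, the `STREAMS` dictionary, and the `route` helper are all assumptions made for illustration and are not part of Hickok and Poeppel's (2004) proposal.

```python
# Toy directed graph of the processing stages labeled in Figure 3.9.
# Link directions are an assumption based on the text's description.
STREAMS = {
    "auditory input": ["acoustic-phonetic speech codes"],
    "acoustic-phonetic speech codes": [
        "auditory-motor interface",   # start of the dorsal stream
        "sound-meaning interface",    # start of the ventral stream
    ],
    "auditory-motor interface": ["articulatory-based speech codes"],
    "sound-meaning interface": [],
    "articulatory-based speech codes": [],
}


def route(start, goal, graph=STREAMS, path=None):
    """Return one path from start to goal as a list of stages, or None."""
    path = (path or []) + [start]
    if start == goal:
        return path
    for next_stage in graph.get(start, []):
        found = route(next_stage, goal, graph, path)
        if found:
            return found
    return None


# Dorsal route: sound mapped onto an articulation-based representation.
dorsal = route("auditory input", "articulatory-based speech codes")
# Ventral route: sound mapped onto meaning.
ventral = route("auditory input", "sound-meaning interface")

print(" -> ".join(dorsal))
print(" -> ".join(ventral))
```

In this sketch both routes share their first two stages, which mirrors the claim that the streams diverge only after the early stages of speech perception.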
In summary, although we can point to specific regions of the brain (particularly in the left frontal and temporal lobes) that play particularly important roles in language, lesion and imaging studies show that the neural systems underlying language are variable, flexible, and distributed over many brain regions (Corina et al., 2003).
In a recent synthesis, Friederici (2012) describes how the cortical regions of the brain involved in language are connected by ventral and dorsal pathways. The ventral pathway is involved in auditory-to-meaning mapping, and the dorsal pathway is involved in auditory-to-motor mapping. The dorsal pathway might also be involved in syntactic processing, particularly with syntactically complex sentences. She argues that these two functions are so dissimilar that we distinguish two dorsal streams on the basis of function and structure. The ventral pathway supports sound-to-meaning mapping and local syntactic structure building.
shown that Broca’s area is activated differently in boys and girls when they carry out the language task of deciding whether two nonwords rhyme or not. Girls tend to show activation in both the left and right pre-frontal cortex, while with boys activation is limited to the left hemisphere (Shaywitz et al., 1995). It seems that the less lateralized brain leads to an advantage for language processing, perhaps because both hemispheres can be used.
There are also sex differences in language use in later life. Doubtless there are some cultural factors. Anderson and Leaper (1998) report a meta-analysis of gender differences in the use of interruptions. They found that men are significantly more likely to interrupt than women, and women are more likely to be interrupted than men. However, women also tend to be fluent, producing more words, longer sentences, and fewer errors in a given time, and men are much more likely to suffer from clinical disorders such as stuttering.
and the younger the child, the better the chances of a complete recovery. Indeed, the entire function of the left hemisphere can be taken over by the right if the child is young enough. There are a number of cases of complete hemidecortication, where an entire hemisphere is removed as a drastic treatment for exceptionally severe epilepsy. Such an operation on an adult would almost totally destroy language abilities. If performed on children who are young enough (that is, during their critical periods), they seem able to recover almost completely. Another piece of evidence supporting the critical period hypothesis is that crossed aphasia, where damage to the right hemisphere leads to a language deficit, appears to be more common in children (Woods & Teuber, 1973). These findings suggest that the brain is not lateralized at birth, but that lateralization emerges gradually throughout childhood as a consequence of maturation. This period of maturation is the critical period.
On the other hand, Dennis and Whitaker (1976, 1977) found that children who had had the whole left cortex removed subsequently had particular difficulties in understanding complex syntax, compared with children who had had the whole right cortex removed. One explanation of this finding is that the right hemisphere cannot completely accommodate all the language functions of the left hemisphere, although Bishop (1983) in turn presented methodological criticisms of this work. She observed that the number of participants was very small, and that it is important to match for IQ to ensure that any observed differences are truly attributable to the effects of hemidecortication. When IQ is controlled for, there is a large overlap with normal performance. It is not clear that non-decorticated individuals of the same age would have performed any better.
Evidence from studies of lateralization in very young children
Contrary to the critical period hypothesis, there is evidence that some lateralization is present at a very early age, if not from birth. Entus (1977) studied 3-week-old infants using a sucking habituation paradigm. Exploring the cognitive and perceptual abilities of very young infants is obviously difficult, so we need to use clever experimental paradigms. In this task, the experimenter monitors changes in the infant’s sucking rate as stimuli are presented. Rapid sucking is an innate response to stimulation; when the infant gets bored, or habituated, to the stimulus, the sucking rate drops. If a new stimulus is presented, and if the infant can detect the change, the sucking rate increases again. Hence monitoring sucking rate is a very useful way of being able to tell if an infant can detect change. Entus found a more marked change in the sucking rate when speech stimuli were presented to the right ear (and therefore a left-hemisphere advantage, as the right ear projects on to the left hemisphere), and an advantage for non-speech stimuli when presented to the left ear (indicating a right-hemisphere advantage). Molfese (1977) measured evoked potentials (a measure of the brain’s electrical activity) and found hemispheric differences to speech and non-speech in infants as young as 1 week, with a left-hemisphere preference for speech. Very young children also show a sensitive period for phonetic perception that is more or less over by 10–12 months (B. Harley & Wang, 1997; Werker & Tees, 1983).
Mills, Coffey-Corina, and Neville (1993, 1997) examined changes in patterns of ERPs (event-related potentials) in the electrical activity of the brain in infants aged between 13 and 20 months. They compared the ERPs as children listened to words whose meanings they knew with ERPs for words whose meanings they did not know. These two types of word elicited different patterns of ERP, but whereas at 13–17 months the differences were spread all over the brain, by 20 months the differences were restricted to the more central regions of the left hemisphere. Clearly some specialization is occurring here, but still considerably before the window of the critical period originally hypothesized by Lenneberg. These data also suggest that the right hemisphere plays an important role in early language acquisition. In particular, unknown words elicit electrical activity across the right hemisphere, perhaps reflecting the processing of novel but meaningful stimuli. The same idea could explain
76 B. THE BIOLOGICAL AND DEVELOPMENTAL BASES OF LANGUAGE
the observation that focal brain injury to the right hemisphere of very young children (10–17 months) is more likely to result in a delay in the development of word comprehension skills than damage to the left hemisphere (Goldberg & Costa, 1981; Thal et al., 1991).
Differences in early asymmetry may be linked with later language abilities. Infants who show early left-hemisphere processing of phonological stimuli show better language abilities several years later (Mills et al., 1997; Molfese & Molfese, 1994).
Hence there does seem to be a critical period in which lateralization occurs, but the period starts earlier than Lenneberg envisaged. As there is considerable evidence for some lateralization from birth, the data also support the idea that the left hemisphere has a special affinity for language, rather than the view that the two hemispheres are truly equipotential.
Evidence from second language acquisition
The critical period hypothesis has traditionally been used to explain why second language acquisition is difficult for older children and adults. Johnson and Newport (1989) examined the way in which the critical period hypothesis might account for second language acquisition. They distinguished two hypotheses, both of which assume that humans have a superior capacity for learning language early in life. According to the maturational state hypothesis, this capacity disappears or declines as maturation progresses, regardless of other factors. The exercise hypothesis further states that unless this capacity is exercised early, it is lost. Both hypotheses predict that children will be better than adults in acquiring the first language. The exercise hypothesis predicts that as long as a child has acquired a first language during childhood, the ability to acquire other languages will remain intact and can be used at any age. The maturational hypothesis predicts that children will be superior at second language learning, because the capacity to acquire language diminishes with age. However, it is possible under the exercise hypothesis that, all other things being equal, adults might be better than children because of their better learning skills. Research has addressed the issue of whether there is an age-related block on second language learning.
Are children in fact better than adults at learning language? The evidence is not as clear-cut as is usually thought. Snow (1983) concluded that, contrary to popular opinion, adults are in fact no worse than young children at learning a second language, and indeed might even be better. We often think children are better at learning the first and second languages, but they spend much more time being exposed to and learning language than adults, which makes a comparison very difficult.
Snow and Hoefnagel-Höhle (1978) compared English children with English adults in their first year of living in the Netherlands learning to speak Dutch. The young children (3–4 years old) performed worst of all. In addition, a great deal of the advantage for young children usually attributed to the critical period may be explicable in terms of differences in the type and amount of information available to learners (Bialystok & Hakuta, 1994). There is also a great deal of variation: Some adults are capable of near-native performance on a second language, whereas some children are less successful (B. Harley & Wang, 1997). Although ability in conversational syntax correlates with duration of exposure to the second language, this just suggests that total time spent learning the second language is important, and the younger you start the more time you tend to have (Cummins, 1991). The conclusion is that there is little evidence for a dramatic cut-off in language-learning abilities at the end of puberty.
Adults learning a language have a persistent foreign accent, and hence phonological (sound) development might be one area for which there is a critical period (Flege & Hillenbrand, 1984). And, although adults seem to have an initial advantage in learning a second language, the eventual attainment level of children appears to be better (see Krashen, Long, & Scarcella, 1982, for a review).
Johnson and Newport (1989) carried out one of the most detailed studies of the possible effects of a critical period on syntactic development. They found some evidence for a critical period for the acquisition of the syntax of a second language. They
3. THE FOUNDATIONS OF LANGUAGE 77
examined native Korean and Chinese immigrants to the USA, and found a large advantage in making judgments about whether a sentence was grammatically correct for immigrants who arrived at a younger age. In adults who had arrived in the USA when they were aged between 0 and 16 years of age, there was a large negative linear correlation between age of arrival and language ability (on this measure). Adults who arrived between the ages of 16 and 40 showed no significant relation between age of arrival and ability, although later arrivers generally performed slightly less well than early arrivers. The variance in the language ability of the later arrivers was very high. Johnson and Newport concluded that different factors operate on language acquisition before and after 16 years of age. They proposed that there is a change in maturational state, from plasticity to a steady state, at about age 16. Other researchers place the age of discontinuity much earlier, at around 5 (see Birdsong & Molis, 2001).
There is some controversy about whether Johnson and Newport’s data really represent a change at 16 from plastic to fixed state. Is there a real discontinuity? Elman et al. (1996) showed that the distribution of performance scores can also be fitted by a curvilinear function nearly as well as two linear ones, suggesting that there is a gradual decline in performance rather than a strong discontinuity. Nevertheless, the younger a person is, the better they seem to acquire a second language. Furthermore, Birdsong and Molis (2001) replicated the original Johnson and Newport (1989) study, using Spanish speakers learning English. Contrary to the original findings, and contrary to the critical period hypothesis, Birdsong and Molis found no learning discontinuity around 16. Furthermore, some late learners (starting to learn the second language after the presumed end of the critical period) achieved near-native performance on it, something that should not be possible if the critical period hypothesis is correct.
In summary, there is evidence for a critical period for some aspects of syntactic development and, even more strongly, for phonological development. However, rather than any dramatic discontinuity, decline seems to be gradual. Second language acquisition is not a perfect test of the hypothesis, however, because the speakers have usually acquired at least some of a first language. What happens if we cannot acquire a first language during the critical period?
Evidence from hearing children of hearing-impaired parents
In principle, the language of hearing children of deaf parents should provide a test of the critical period hypothesis. However, linguistic deprivation is never total. Sachs, Bard, and Johnson (1981) described the case of “Jim,” a hearing child of deaf parents whose only exposure to spoken language until he entered nursery at the age of 3 was the television. Although his parents signed to each other, they did not sign towards him. They believed that as he had normal hearing it would be inappropriate for him to learn signing. Jim’s intonation was abnormally flat, his articulation very poor, with some utterances being unintelligible, and his grammar very idiosyncratic. For example, Jim produced utterances such as “House. Two house. Not one house. That two house.” This example shows that Jim acquired the concept of plurality but not that it is usually marked by an “-s” inflection, although normally this is one of the earliest grammatical morphemes a child learns. Utterances such as “Going house a fire truck” suggest that Jim constructed his own syntactic rules based on stating a phrase followed by specifying the topic of that phrase, the opposite of the usual word order in English. Although this is an incorrect rule, it does emphasize the drive to create syntactic rules (see Chapter 4). Jim’s comprehension of language was also very poor. After intervention, within a few months Jim’s language use was almost normal. Jim’s case suggests that exposure to language alone is not sufficient to acquire language normally: it must be in an appropriate social, interactional context. It also emphasizes humans’ powerful urge to use language.
People exposed to sign language (e.g., ASL) early achieve a better level of ultimate competence (Newport, 1990). In particular, late learners have particular difficulty using signs to represent complex verbs. These observations also support the critical period hypothesis.
abilities were virtually non-existent. At the age of nearly 14 the critical period should be finished or almost finished, so could Genie learn language? With training, Genie learned some language skills. However, her syntactic development was always impaired relative to her vocabulary. She used few question words, far fewer grammatical words, and tended to form negatives only by adding negatives to the start of sentences. She failed to acquire the use of inflectional morphology (the ability to use word endings to modify the number of nouns and the tense of verbs), the ability to transform active syntactic constructions into passive ones (e.g., turning “the vampire chased the ghost” into “the ghost was chased by the vampire”), and the use of auxiliary verbs (e.g., “be”). Furthermore, unlike most right-handed children, she showed a left-ear, right-hemisphere advantage for speech sounds. There could be a number of reasons for this, such as left-hemisphere degeneration, the inhibition of the left hemisphere by the right, or the left hemisphere taking over some other function.
Because of financial and legal difficulties, research on Genie did not continue for as long as might have been hoped, and hence many questions remain unanswered. (Genie is now in an adult foster home.) In summary, Genie’s case shows that it is possible to learn some language outside the critical period, but also that syntax appears to have some privileged role. The amount of language that can be learned after the critical period seems very limited.
Of course, the other types of deprivation (such as malnutrition and social deprivation) to which Genie was subjected might have played a part in her later linguistic difficulties. Indeed, Lenneberg discounted the case because of the extreme emotional trauma Genie had suffered. Furthermore, there has been no agreement over whether Genie was developmentally delayed before her period of confinement. Indeed, her father locked her away precisely because he considered her to be severely developmentally delayed, in the belief that he was protecting her. Contrary to this, there is some evidence that aspects of Genie’s non-linguistic development proceeded normally following her rescue (Rymer, 1993).
Some children might be able to recover completely from early linguistic deprivation as long as they are given exposure to language and training at an early enough age. “Isabelle” was kept from infancy, with minimum attention, in seclusion with her deaf-mute mother until the age of 6½ (Davis, 1947; Mason, 1942). Her measured intelligence was about that of a 2-year-old and she possessed no spoken language. But with exposure to spoken language she passed through the normal phases of language development at a greatly accelerated rate, and after 18 months her intelligence was in the normal range and she was highly linguistically competent.
In summary, the evidence from linguistic deprivation is not as clear-cut as we might expect. Children appear able to recover from it as long as they receive input early enough. If deprivation continues, language development, particularly syntactic development, is impaired. A major problem is that linguistic deprivation is invariably accompanied by other sorts of deprivation, and it is difficult to disentangle the effects of these.
Evaluation of the critical period hypothesis
There are two reasons for rejecting a strong version of the critical period hypothesis. Children can acquire some language outside of it, and lateralization does not occur wholly within it. In particular, some lateralization is present from birth or before. Nevertheless, it is possible to defend a weakened version of the hypothesis. A critical period appears to be involved in early phonological development and the development of syntax. The weakened version is often called a sensitive period hypothesis. The evidence supports the weaker version. There is a sensitive period for language acquisition, but it seems confined to complex aspects of syntactic processing (Bialystok & Hakuta, 1994).
The critical period does not apply only to spoken language. Newport (1990) found evidence of a critical period for congenitally deaf people learning ASL, particularly concerning the use of morphologically inflected signs. She also found a continuous linear decline in learning ability rather than a sudden drop-off at puberty. Of course adults can learn sign language, but it is argued they learn it less efficiently.
Why should there be a critical period for language? There are three types of explanation. The nativist explanation is that there is a critical period because the brain is pre-programmed to acquire language early in development. Bever (1981) argued that it is a normal property of growth, arising from a loss of plasticity as brain cells and processes become more specialized and more independent. Along similar lines, Locke (1997) argues that a sensitive period arises because of the interplay of developing specialized neural systems, early perceptual experience, and discontinuities in linguistic development. Lack of appropriate activation during development acts like physical damage to some areas of the brain.
The maturational explanation is that certain advantages are lost as the child’s cognitive and neurological system matures. In particular, what might first appear to be a limitation of the immature cognitive system might turn out to be an advantage for the child learning language. For example, it might be advantageous to be able to hold only a limited number of items in short-term memory, to be unable to remember many specific word associations, and to remember only the most global correspondences. That is, there might be an advantage to “starting small,” because it enables the children to see the wood for the trees (Deacon, 1997; Elman, 1993; Kersten & Earles, 2001; Newport, 1990). It is possible that the limited cognitive resources of the child are actually advantageous to children (an idea called “less is more”), as it means they can only process limited amounts of language at any one time. They can then get the small segments right before they start on the larger and more complex units, without being overwhelmed from the beginning. A related variant of the maturational answer is that, as the brain develops, it uses up its learning capacity by dedicating specialist circuits to particular tasks. Connectionist modeling of the acquisition of the past tense of verbs suggests that networks do indeed become less plastic the more they learn (Marchman, 1993).
The main differences between these answers are the extent to which the constraints underlying the critical period are linguistic or more general, and the extent to which the timing of the acquisition process is genetically controlled (Elman et al., 1996). With insights from connectionist modeling, the maturational answer has recently received the most attention. However, perhaps the two approaches are not really contradictory. A system that matures and is more efficient for learning language will have an evolutionary advantage.
THE COGNITIVE BASIS OF LANGUAGE
Jean Piaget is one of the most influential figures in developmental psychology. According to Piaget, development takes place in a sequence of well-defined stages. In order to reach a certain stage of development, the child must have gone through all the preceding stages. Piaget identified four principal stages of cognitive development (see Figure 3.10). He argued that at birth the child possesses only innate reflexes. In the first stage of development, which Piaget called the sensorimotor period, behavior is organized around sensory and motor processes. This stage lasts through infancy until the child is about 2 years old. A primary development in this period is the attainment of the concept of object permanence, that is, realizing that objects have continual existence and do not disappear as soon as they go out of view. Indeed, Piaget divided the sensorimotor period into six sub-stages depending on the progress made towards object permanence. Next comes the pre-operational stage, which lasts until the age of about 6 or 7. This stage is characterized by egocentric thought, which means that these children are unable to adopt alternative viewpoints to their own and are unable to change their point of view. The concrete operations stage lasts until the age of about 12. The child is now able to adopt alternative viewpoints, as shown by the conservation task. In this task water is poured from a short wide glass to a tall thin glass, and the child is asked if the amounts of water are the same. A pre-operational child will reply that the tall glass has more water in it; a concrete operational child will correctly say that they both contain the same amount. Nevertheless the child is still limited to reasoning about concrete objects. In the formal operations stage, the adolescent is not limited to concrete thinking, and is able to reason abstractly and logically. Piaget proposed that the main mechanisms of cognitive development are assimilation and accommodation. Assimilation is the way in which
information is abstracted from the world to fit existing cognitive structures; accommodation is the way in which cognitive structures are adjusted in order to accommodate otherwise incompatible information.
According to Piaget, there is nothing special about language. Unlike Chomsky, he did not see it as a special faculty, but as a social and cognitive process just like any other. It therefore clearly has cognitive prerequisites; it is dependent on other cognitive, motor, and perceptual processes, and its development clearly follows the cognitive stages of development. Adult speech is socialized and has communicative intent, whereas early language is egocentric. Piaget (1923/1955) went on to distinguish three different types of early egocentric speech: repetition or echolalia (where children simply repeat their own or others’ utterances); monologues (when children talk to themselves, apparently just speaking their thoughts out loud); and group or collective monologues (where two or more children appear to be taking appropriate turns in a conversation but actually just produce monologues). For Piaget, cognitive and social egocentrism were related.
The cognition hypothesis is a statement of Piaget’s ideas about language, and says that language needs certain cognitive precursors in order to develop (Sinclair-de-Zwart, 1973). For example, the child has to attain the stage of object permanence in order to be able to acquire concepts of objects and names. Hence an observed explosion in vocabulary size at around 18 months is related to the attainment of object permanence. However, Corrigan (1978) showed that there was no correlation between the development of object permanence and linguistic development once the child’s age was taken into account. Furthermore, infants comprehend names as much as 6 months before the stage of object permanence is complete. Indeed, having unique names available for objects may help children acquire object permanence. Xu (2002) found that having two distinct labels available for two distinct objects (e.g., a toy duck and a ball) facilitated the discrimination abilities of 9-month-old children, but having one label, or two distinct tones, or two facial expressions, did not.
There is some evidence that language acquisition is related to the development of object permanence in a more complex way. An important, though at first small, class of early words are relational words (e.g., “no,” “up,” “more,” “gone”). The first relational words should depend on the emergence of knowledge about how objects can be transformed from one state to another, at the end of the
FIGURE 3.10
Language development in children with learning difficulties
An obvious test of the cognition hypothesis is to examine the linguistic abilities of children with learning difficulties. If cognitive development drives linguistic development, then impaired cognitive development should be reflected in slow linguistic development. The evidence is mixed but suggests that language and cognition are to some extent decoupled.
Although some children with Down’s syndrome become fully competent in their language, most do not (Fowler, Gelman, & Gleitman, 1994). At first, these children’s language development is simply delayed. Up to the age of 4, their language age is consistent with their mental age (although it is obviously behind their chronological age). After this, language age starts to lag behind mental age. Lexical development is slow, and grammatical development is especially slow (Hoff-Ginsberg, 1997). Most people with Down’s syndrome never become fully competent with complex syntax and morphology.
[Photograph caption: Some people with Down’s syndrome may have impaired linguistic abilities, whereas others become fully competent. It seems that cognitive and linguistic abilities are distinct, and a person with Down’s syndrome may show greater abilities in their cognition than in linguistic ability, or vice versa.]
On the other hand, there are several types of impaired cognitive development that do not lead to such clear-cut linguistic impairments. Laura was a girl who showed severe and widespread cognitive impairments (her IQ was estimated at 41), yet appeared unimpaired at complex syntactic constructions (Yamada, 1990). Furthermore, factors that caused problems for Laura in cognitive tasks did not do so in linguistic tasks; for example, while non-linguistic tasks involving reasoning about hierarchies were very difficult for Laura, her ability to produce sentences with grammatical hierarchies was intact. Although her short-term memory was very poor, she could still produce complex syntactic constructions. Yamada concluded that cognitive and linguistic processes are distinct, and that as normal language could develop when there is severe general cognitive impairment, cognitive precursors are not essential for linguistic development. The situation is not straightforward, however, as not all Laura’s linguistic abilities were spared. For example, she had difficulty with complex morphological forms. In another case study, Smith and Tsimpli (1995) described a man who had a non-verbal IQ beneath 70, and was unable to live independently, yet who had a normal verbal IQ and could speak several foreign languages.
Williams syndrome is a rare genetic disorder that leads to physical abnormalities (affected children have an “elfin-faced” appearance) and a very low IQ, typically around 50. However, the speech of such people is very fluent and grammatically correct (Bellugi, Bihrle, Jernigan, Trauner, & Doherty, 1991). Indeed, they are particularly fond of unusual words. Their ability to acquire new words and to
repeat nonwords is also good (Barisnikov, van der Linden, & Poncelet, 1996). This dissociation between severe cognitive impairment and normal (in some respects, better than normal) language skills makes Williams syndrome particularly interesting and important for thinking about how language and cognition are related.
Finally, children with autism find social communication difficult, and their language use is often idiosyncratic. The things they talk about are different, for example, and they use some words in unusual ways (Tager-Flusberg, 1999). Their peculiarities of language use probably arise from their lack of a “theory of mind” about how other people think and feel, and are unlikely to be attributable to straightforward deficits in linguistic processing (Bishop, 1997). Their grammatical skills are relatively unimpaired.
Cases such as these pose difficulty for any position that argues either for interaction between cognitive and linguistic development, or for the primacy of cognitive factors. The evidence favors a partial, but not complete, separation of language skills and general cognitive abilities such as reasoning and judgment.
Evaluation of the cognition hypothesis
The cognition hypothesis says that cognitive development drives linguistic development. However, there is no clear evidence for a strong version of the cognition hypothesis. Children acquire some language abilities before they obtain object permanence. Indeed, Bruner (1964) argued that aspects of cognitive performance are facilitated by language. The possibility that linguistic training would improve performance of the conservation task was tested by Sinclair-de-Zwart (1969), who found that language training only had a small effect. Linguistic training does not affect basic cognitive processes, but helps in description and in focusing on the relevant aspects of the task.
Cognitive processes obviously continue to develop beyond infancy. For example, working memory capacity increases through childhood from about 2 items at age 2–3, to the adult span of 7 plus or minus 2 in late childhood, and there might also be changes in the structure of memory (McShane, 1991). Young children also rehearse less than older children do. It is possible that changes in working memory might have consequences for some linguistic processes, particularly comprehension and learning vocabulary (see Chapter 15).
There is currently little active research on the Piagetian approach to language. The emphasis has instead shifted to the communicative precursors of language and the social interactionist account (see below). However, to be effective communicators children need to develop the ability to adopt others’ point of view. An essential component of this development is the acquisition of a “theory of mind.” Although this might be driven by cognitive development, it might also be driven linguistically. The acquisition of verbs such as “know,” “believe,” “think,” and “want,” and the development of linguistic structures that enable us to express complex statements about beliefs, truth, and falsehood in a relatively simple way, are almost certainly driving forces as well (de Villiers & de Villiers, 2000; Shatz, Diesendruck, Martinez-Beck, & Akar, 2003).
THE SOCIAL BASIS OF LANGUAGE
We noted earlier that it is difficult to disentangle the specific effects of linguistic deprivation in feral children from the effects of social deprivation. Cases such as that of Jim, the hearing child of deaf parents, suggest that children need to be exposed to language in a socially meaningful situation (Sachs et al., 1981). It is clearly not enough to be exposed to language; something more is necessary. Adults tend to talk to children about objects that are in view and about events that have just happened: the “here-and-now.” The usefulness of this is obvious (for example, in associating names with objects), and it is clear that learning language just by watching television is going to be very limited in this respect. Furthermore, such situations involve the child having to both comprehend and produce language. To be effective, early language learning must involve interaction; it must take place in a social setting. Social interactionists emphasize the importance of the development of language through interaction with other people (Bruner, 1983; Durkin, 1987; Farrar, 1990; Gleason,
84 B. THE BIOLOGICAL AND DEVELOPMENTAL BASES OF LANGUAGE
have a structure (see Chapter 14). Clearly we do not always talk all at once; we appear to take turns in conversations. At the very least children have to learn to listen and to pay some attention when they are spoken to. How does this ability to interact in conversational settings develop? There is some evidence that it appears at a very early age. Schaeffer (1975) proposed that the origins of turn-taking lie in feeding. In feeding, sucking occurs in bursts interspersed with pauses that appear to act as a signal to mothers to play with the baby, to cuddle it, or to talk to it. He also noted that mothers and babies rarely vocalize simultaneously. Snow (1977) observed that mothers respond to their babies’ vocalizations as if their yawns and burps were utterances. Hence the precursors of conversation are present at an early age and might emerge from other activities. The gaze of mother and child also seems to be correlated; in particular, the mother quickly turns her gaze to whatever the baby is looking at. Hence again cooperation emerges at an early age. Although in these cases it is the mother who is apparently sensitive to the pauses of the child, there is further evidence that babies of just a few weeks old are differentially sensitive to features of their environment. Trevarthen (1975) found that babies visually track and try to grab inanimate objects, but they make other responses to people, including waving and what he called pre-speech: small movements of the mouth, rather like a precursor of talking. The exact role of this pre-speech is unclear, but certainly by the end of the first 6 months the precursors of social and conversational skills are apparent, and infants have developed the ability to elicit communicative responses.

about the details of how social interactions influence development. Cognitive processes mediate social interactions, and the key to a sophisticated theory is in detailing this relation.

Disorders of the social use of language

There are several developmental disorders of using language in a social context. Bishop (1997) describes semantic-pragmatic disorder, which is a language impairment that looks like a very mild version of autism. Children with semantic-pragmatic disorder often have difficulty in conversations where they have to draw inferences. They give very literal answers to questions, failing to take the preceding conversational and social context into account, as in the following (from Bishop, 1997, p. 221):

Adult: Can you tell me about your party?
Child: Yes.

Although semantic-pragmatic disorder is poorly understood, it is clear that its origins are complex. Whereas related disorders might be explicable in terms of memory limitations or social neglect, semantic-pragmatic disorder is probably best explained in terms of these children having difficulty in representing other people’s mental states. This in turn is probably the result of an innate or developmental brain abnormality. This deficit illustrates how difficult it can be to disentangle biological, cognitive, and social factors from each other.
in linguistic development. If language drives cognitive development, then hearing-impaired and non-hearing-impaired children should differ in their cognitive development.

The cognitive development of blind or visually impaired children is slower than that of sighted children. The smaller range of experiences available to the child, the relative lack of mobility, the decreased opportunity for social contact, and the decreased control of the child’s own body and environment all take their toll (Lowenfeld, 1948). The reliance of the development of the concept of object permanence on the senses of hearing and touch leads to a delay in attaining it, and necessarily leads to a different type of concept.

Early studies suggested that the language development of blind children differed from that of sighted children in that their speech was more egocentric, stereotypic, and less creative. Cutsford (1951) went so far as to claim that blind children’s words were meaningless. It is now known that these are overgeneralizations, and are probably totally wrong.

Some (but not all) blind children may take longer to say their first words, although this is controversial (Lewis, 1987). Bigelow (1987) found that blind children acquired the first 50 words between the mean ages of 1 year 4 months and 1 year 9 months, compared with the 1 year 3 months to 1 year 8 months Nelson (1973) observed for sighted children. The earliest words seem to be similar to those first used by sighted children, although there appears to be a general reduction in the use of object names (Bigelow, 1987). Not surprisingly, unlike with sighted children, names do not refer to objects that are salient in the visual world, particularly those that cannot be touched (e.g., “moon”). Blind children use far fewer animal names in early speech than sighted children (8% compared with 20%; see Mulford, 1988). Instead, they refer to objects salient in the auditory and tactile domains (e.g., “drum,” “dirt,” and “powder”). Blind children also use more action words than sighted children do, and tend to refer to their own actions rather than the actions of others.

The earliest words also seem to be used rather differently. They appear to be used to comment on the child’s own actions, in play or in imitation, rather than for communication or referring to objects or events. Indeed, Dunlea (1984) argued that as blind children were not using words beyond the context in which they were first learned, the symbolic use of words was delayed. Furthermore, vocabulary acquisition is generally slower. The understanding of particular words is bound to be different: Landau and Gleitman (1985) described the case of a 3-year-old child who, when asked to look up, reached her arms over her head. Nevertheless, Landau and Gleitman demonstrated that blind children can come to learn the meanings of words such as “look” and “see” without direct sensory experience. It is possible that children infer the meanings of these words by observing their positions in sentences and the words that occur with them.

[Image caption: There is a difference in the rate of development of linguistic abilities in blind and visually impaired children compared with non-impaired children, due to their different experience of the world.]

There is considerable controversy about the use of pronouns by blind children. Whereas some researchers have found late acquisition of pronouns
3. THE FOUNDATIONS OF LANGUAGE 87
and many errors with them (e.g., using “you” for “me”; Dunlea, 1989), better controlled studies have found no such difference (Pérez-Pereira, 1999).

There are differences in phonological development: Blind children make more errors than sighted children in producing sounds that have highly visible movements of the lips (e.g., /b/), suggesting that visual information about the movement of lips normally contributes to phonological development (Mills, 1987). Nevertheless, older blind children show normal use of speech sounds, suggesting that acoustic information can eventually be used in isolation to achieve the correct pronunciation (Pérez-Pereira & Conti-Ramsden, 1999).

Syntactic development is marked by far more repetition than is normally found, and the use of repeated phrases carries over into later development. Furthermore, blind children do not ask so many questions of the type “what’s that?” or “what?,” or use modifiers such as “quite” or “very” (which account for the earliest function words of sighted children). This observation might reflect the fact that their parents adapt their own language to the needs of the children, providing more spontaneous labeling. There is also a delay in the acquisition of auxiliary verbs such as “will” and “can” (Landau & Gleitman, 1985). Again this is probably because of differences in the speech of the caregivers. Mothers of blind children use more direct commands (“Take the doll”) than questions involving auxiliaries (“Can you take the doll?”) when speaking to their children. The other curious finding is that the children’s use of function words (which do the grammatical work of the language) is much less common early on (Bigelow, 1987).

Hence the linguistic development of blind children is different from that of sighted children, but the differences are mostly the obvious ones that one would expect given the nature of the disability. There is little clear evidence to support the idea that an impairment of cognitive processing causes an impairment of syntactic processing, and therefore we cannot conclude that cognitive processes precede linguistic ones. Neither is there much evidence to support the idea that blind children’s early language is deficient relative to that of sighted children. Indeed, behavior that was once thought to be maladaptive in some way may in fact provide blind children with alternative communicative strategies (Pérez-Pereira & Conti-Ramsden, 1999). For example, repetition and stereotypic speech are used to serve a social function of keeping in contact with people. Blind children use verbal play to a greater extent than sighted children, and may have better verbal memory. It should also be noted that work on blind children is methodologically complex and tends to involve small numbers of participants; many studies might have underestimated their linguistic abilities (Pérez-Pereira & Conti-Ramsden, 1999).

In any case, even if blind children were to show an unambiguous linguistic deficit, it would be very difficult to attribute any deficit just to differences in cognitive development. For example, the development of mutual gaze and the social precursors of language will necessarily be different; and sighted parents of blind children still tend to talk about objects that are visually prominent. However, caregivers try to adapt their speech to the needs of their children, resulting in subtle differences in linguistic development.

On the other hand, it is obvious that the development of spoken language is impaired in deaf or hearing-impaired children. There is some evidence that deaf children spontaneously start using and combining increasingly complex gestures in the absence of sign language (e.g., Mohay, 1982). This finding shows that there is a strong need for humans to attempt to communicate in some way. However, given adequate tuition, the time course of the acquisition of sign language runs remarkably parallel to that of normal spoken language development. Meier (1991) argued that deaf children using sign language pass the same linguistic “milestones” at about the same ages as hearing children (and some milestones perhaps before hearing children).

Research on the cognitive consequences of deafness has given mixed results. In one early experiment, Conrad and Rush (1965) found differences in coding in memory tasks between hearing and deaf children. This result is not surprising given the involvement of acoustic or phonological processing in short-term or working memory (Baddeley, 1990). If rigorous enough controls are used, it can be demonstrated that these indeed reflect differences in the memory systems rather than inferiority of the hearing-impaired systems (Conrad, 1979). Furth
(1966, 1971) found that compared with hearing children, deaf children’s performance on Piagetian tasks was relatively normal. A review of results on tasks such as conservation gave a range of results, from no impairment to 1–2 years’ delay; the evidence was mixed. Furth (1973) found that deaf adolescents had more difficulty with symbolic logic reasoning tasks than did hearing children. Furth interpreted these data as evidence for the Piagetian hypothesis that language is not necessary for normal cognitive development. Any differences between deaf and hearing children arise out of the lack of experiences and training of the deaf children.

However, most deaf children learn some kind of sign language at a very early age, so it is difficult to reach any strong conclusions about the effects of lack of language. Deaf children with deaf parents acquire sign language at the same rate as other children acquire spoken language (Messer, 2000). Best (1973) found that the more exposure to sign language that deaf children had, the better their performance on the Piagetian tasks.

[Image caption: According to Messer (2000), deaf children with deaf parents acquire sign language at the same rate as other children acquire spoken language.]

Evaluation of evidence from deaf and blind children

There are clearly differences in cognitive development between hearing-impaired and non-hearing-impaired children, but it is not obviously the case that the linguistic performance of one group is superior to that of the other. The cognitive development of deaf children generally proceeds better than it should if language were primary, and the linguistic development of blind children generally proceeds better than it should if cognition were primary. Deaf children learn a sign language, and blind children acquire excellent coping strategies and acquire spoken language remarkably well. Indeed, the linguistic development of deaf children and the cognitive development of blind children both proceed better than we would expect if one were driving the other. There is little supporting evidence for the cognition hypothesis from an examination of children with learning difficulties or a comparison of deaf and blind children. If anything, these findings support Chomsky’s position that language is an independent faculty. Nevertheless, social factors are clearly important. Biological, cognitive, and social factors work together in language development, and deficits in one of these areas can often be compensated for by the others.

WHAT IS THE RELATION BETWEEN LANGUAGE AND THOUGHT?

In this section we examine the relation between language and other cognitive and biological processes. Does the form of our language influence the way in which we think, or is the form of our language dependent on general cognitive factors?

Many animals are clearly able to solve some problems without language, suggesting that language cannot be essential for problem solving and thought. Although this may seem obvious, it has not always been considered so. Among the early approaches to examining the relation between language and thought, the behaviorists believed that thought was nothing more than speech. Young children speak their thoughts aloud; this becomes internalized, with the result that thought is covert speech: thought is just small motor movements of the vocal apparatus. Watson (1913) argued that thought processes are nothing more than motor habits in the larynx. Jacobsen (1932) found some evidence for this belief because thinking often is
accompanied by covert speech. He detected electrical activity in the throat muscles when participants were asked to think. But is thought possible without these small motor movements? Smith, Brown, Thomas, and Goodman (1947) used curare to temporarily paralyze all the voluntary muscles of a volunteer (Smith, who clearly deserved to be first author on this paper). Despite being unable to make any motor movement of the speech apparatus, Smith later reported that he had been able to think and solve problems. Hence there is more to thought than moving the vocal apparatus.

Perhaps language sets us apart from animals because it enables new and more advanced forms of thought? We need to distinguish how language and thought might affect each other developmentally, and in the fully developed adult state.

We can list the possible alternatives; each of them has been championed at some time. First, cognitive development determines the course of language development. This viewpoint was adopted by Piaget and his followers. Second, language and cognition are independent faculties (Chomsky’s position). Third, language and cognition originate independently but become interdependent; the relation is complex (Vygotsky’s position). Fourth, the idea that language determines cognition is known as the Sapir–Whorf hypothesis. The final two of these approaches both emphasize the influence of language in cognition.

The interdependence of language and thought

The Russian psychologist Vygotsky (1934/1962) argued that the relation between language and thought was a complex one. He studied inner speech, egocentric speech, and child monologues. He proposed that speech and thought have different ontogenetic roots (that is, different origins within an individual). Early on, in particular, speech has a pre-intellectual stage. In this stage, words are not symbols for the objects they denote, but are actual properties of the objects. Speech sounds are not attached to thought. At the same time early thought is non-verbal, so up to some point in development, when the child is about 3 years of age, speech and thought are independent; after this, they become connected. At this point speech and thought become interdependent: thought becomes verbal, and speech becomes representational. When this happens, children’s monologues are internalized to become inner speech.

Vygotsky contrasted his theory with that of Piaget, using experiments that manipulated the strength of social constraints (see Figure 3.11). Unlike Piaget, Vygotsky considered that later cognitive development was determined in part by language. Piaget argued that egocentric speech arises because the child has not yet become fully socialized, and withers away as the child learns to communicate by taking into account the point of view of the listener. For Vygotsky the reverse was the case. Egocentric speech serves the function of self-guidance that eventually becomes internalized as inner speech, and is only vocalized because the child has not yet learned how to internalize it. The boundaries between child and listener are confused, so that self-guiding speech can only be produced in a social context. Vygotsky found that the amount of egocentric speech decreased when the child’s feeling of being understood lessened (such as when the listener was at another table). He claimed that this was the reverse of what Piaget would predict. However, these experiments are difficult to evaluate because Vygotsky omitted many procedural details and measurements from his reports that are necessary for a full evaluation. It is surprising that the studies have not been repeated under more stringent conditions. Until then, and until this type of theory is more fully specified, it is difficult to evaluate the significance of Vygotsky’s ideas.

The Sapir–Whorf hypothesis

In George Orwell’s novel Nineteen Eighty-Four, language restricted the way in which people thought. The rulers of the state deliberately used “Newspeak,” the official language of Oceania, so that the people thought what they were required to think. “This statement … could not have been sustained by reasoned argument, because the
FIGURE 3.11
PIAGET: Development precedes learning. Egocentric speech represents child thinking aloud.
VYGOTSKY: Learning precedes development. Language is a SOCIAL phenomenon, even at the earliest stages of development, although a child’s early speech is egocentric. Thought develops within a social context.
necessary words were not available” (Orwell, 1949, p. 249, in the appendix, “The principles of Newspeak”). Orwell’s idea is a version of the Sapir–Whorf hypothesis.

The central idea of the Sapir–Whorf hypothesis is that the form of our language determines the structure of our thought processes. Language affects the way we remember things and the way in which we perceive the world. It was originally proposed by a linguist, Edward Sapir, and a fire insurance engineer and amateur linguist, Benjamin Lee Whorf (see Whorf, 1956a, 1956b). Although Whorf is most closely associated with anthropological evidence based on the study of American Indian languages, the idea came to him from his work in fire insurance. He noted that accidents sometimes happened because, he thought, people were misled by words, as in the case of a worker who threw a cigarette end into what he considered to be an “empty” drum of petrol. Far from being empty, the drum was full of petrol vapor, with explosive results.

The Sapir–Whorf hypothesis comprises two related ideas. First, linguistic determinism is the idea that the form and characteristics of our language determine the way in which we think, remember, and perceive. Second, linguistic relativism is the idea that as different languages map onto the world in different ways, different languages will generate different cognitive structures.

Miller and McNeill (1969) distinguished between three versions of the Sapir–Whorf hypothesis. In the strong version, language determines thought. In a weaker version, language affects only perception. In the weakest version, language differences affect processing on certain tasks where linguistic encoding is important. It is the weakest version that has proved easiest to test, and for which there is the most support. It is important to consider what is meant by “perception” here. It is often unclear whether what is being talked about is low-level sensory processing or classification.
perception, but do aid classification and other cognitive processes. Not having words available for certain concepts does seem to have a detrimental effect. Members of the Piraha tribe from the Amazon basin have words for the numbers “one” and “two,” and then just “many.” Their performance on a range of numerical tasks was very poor for quantities greater than three (Gordon, 2004). Whereas we can count above two and assign precise numbers to quantities, members of the Piraha tribe just seem to be able to estimate. Not having a word available for a concept does appear to limit their cognitive abilities.

Grammatical differences between languages

Carroll and Casagrande (1958) examined the cognitive consequences of grammatical differences in the English and Navaho languages. The form of the class of verbs concerning handling used in Navaho depends on the shape and rigidity of the object being handled. Endings for the verb corresponding to “carry,” for example, vary depending on whether a rope or a stick is being carried. Carroll and Casagrande therefore argued that speakers of Navaho should pay more attention to the properties of objects that determine the endings than do English speakers, and in particular they should group instances of objects according to their form. As all the children in the study were bilingual, the comparison was made between more Navaho-dominant and more English-dominant Navaho children. The more Navaho-dominant children did indeed group objects more by form than by color, compared to the English-dominant group. However, a control group of non-Native American English-speaking children grouped even more strongly according to form, behaving as the Navaho children were predicted to behave! It is therefore not clear what conclusions about the relation between language and thought can be drawn from this study.

A second example is that English speakers use the subjunctive mood to easily encode counter-factuals such as “If I had gone to the library, I would have met Dirk.” Chinese does not have a subjunctive mood. Bloom (1981, 1984) found that Chinese speakers find it harder to reason counter-factually, and attributed this to their lack of a subjunctive construction. Their memories are more easily overloaded than those of speakers of languages that support these forms. There has been some dispute about the extent to which sentences
used by Bloom were good idiomatic Chinese. It is also possible to argue counter-factually in Chinese using more complex constructions, such as (translated into English) “Mrs. Wong does not know English; if Mrs. Wong knew English, she would be able to read the New York Times” (Au, 1983, 1984; Liu, 1985). Nevertheless, Chinese speakers do seem to find counter-factual reasoning more difficult than English speakers. If this is because the form of the construction needed for counter-factual reasoning is longer than the English subjunctive, then this is evidence of a subtle effect of linguistic form on reasoning abilities.

A third example is that of grammatical gender. Although English does not mark grammatical gender, many languages do. Italian, for example, marks nouns as masculine or feminine, and German marks them as masculine, feminine, or neuter. Vigliocco, Vinson, Paganelli, and Dworzynski (2005) found that effects of gender on thought were highly constrained. They were found in Italian (a two-gender language), but only with tasks that require verbalization and only with certain semantic categories (animals) and not others (artifacts). For example, when participants are asked to judge which two of three words are most similar (e.g., donkey–elephant–giraffe), grammatical gender affected similarity judgments for animals but not for artifacts. There were no effects at all in German, a language with an additional neuter gender. The likely reason for this difference is that in two-gender languages gender is a reliable cue to sex, but of course this rule is inapplicable with artifacts. The conclusion is consistent with a weak version of the Sapir–Whorf hypothesis: language can affect performance on some tasks that use language.
tasks involving numbers greater than three. It appears that in order to count accurately we need to have linguistic number terms available.

Color coding and memory for color

The most fruitful way of investigating the strong version of the Sapir–Whorf hypothesis has proved to be analysis of the way in which we name and remember colors. Brown and Lenneberg (1954) examined memory for “color chips” differing in hue, brightness, and saturation. Codable colors, which correspond to simple color names, are remembered more easily (e.g., an ideal red is remembered more easily than a poor example of red). Lantz and Stefflre (1964) argued that the similar notion of communication accuracy best determines success: People best remember colors that are easy to describe.

This early work seemed to support the Sapir–Whorf hypothesis, but there is a basic assumption that the division of the color spectrum into labeled colors is completely arbitrary. This means that, but for historical accident, we might have developed other color names, like “bled” for a name of a color between red and blue, and “grue” for a name of a color between green and blue, rather than red, blue, and green. Is this assumption correct?

Berlin and Kay (1969) compared the basic color terms used by different languages. Basic color terms are defined by being made up from only one morpheme (so “red,” but not “blood red”), not being contained within another color (so “red,” but not “scarlet”), not having restricted usage (hence “blond” is excluded), and being common and generally known and not usually derived from the name of an object (hence “yellow” but not “saffron”). Languages differ in the number of color terms they have available. For example, Gleason (1961) compared the division of color hues by speakers of English with that of the languages Shona and Bassa (see Figure 3.14). Berlin and Kay found that across languages basic color terms were present in a hierarchy (see Figure 3.15). If a language only has two basic color terms available, they must correspond to “black” and “white”; if they have three then they must be these two plus “red”; if they have four then they must be the first three plus one of the next group, and so on. English has names for all 11 basic color terms (black, white, red, yellow, green, blue, brown, purple, pink, orange, and gray). Berlin and Kay also showed that the typical colors referred to by the basic color terms, called the focal colors, tend to be constant across languages.

Heider (1972) examined people’s memory for focal colors in more detail. Focal colors are the best examples of colors corresponding to basic color terms: they can be thought of as the best example of a color such as red, green, or blue. The Dani tribe of New Guinea have just two basic color terms, “mili” (for black and dark colors) and “mola” (for white and light colors), although subsequently there has been some doubt as to whether this really is the case. Heider taught the Dani made-up names for other colors. They learned names more easily for other focal colors than for non-focal colors, even though they had no name for those focal colors. They could also
FIGURE 3.14 Comparison of color hue division in English, Shona, Bassa, and Dani (based on Gleason, 1961). Dani terms shown: mili, mola.
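The hierarchy of basic color terms that Berlin and Kay (1969) proposed is an implicational rule, and it may help to see it written out as a small check. The sketch below is illustrative only. The first two groups (black and white, then red) are those given in the text; the later groupings are an assumption taken from Berlin and Kay's published hierarchy, since Figure 3.15 is not reproduced here. The names `HIERARCHY` and `fits_hierarchy` are hypothetical, and English glosses stand in for the terms of any particular language.

```python
# Illustrative sketch only. Group order follows Berlin and Kay (1969);
# the groups after "red" are an assumption, as Figure 3.15 is not shown here.
# English glosses stand in for the color terms of any particular language.
HIERARCHY = [
    {"black", "white"},
    {"red"},
    {"green", "yellow"},
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "gray"},
]


def fits_hierarchy(terms):
    """Return True if a set of basic color terms respects the hierarchy.

    A group may be only partly filled, but once any group is incomplete,
    no term from a later group may be present.
    """
    terms = set(terms)
    known = set().union(*HIERARCHY)
    if not terms <= known:
        return False  # contains something that is not one of the 11 terms
    incomplete_seen = False
    for group in HIERARCHY:
        present = terms & group
        if present and incomplete_seen:
            return False  # a later group is used before an earlier one is full
        if present != group:
            incomplete_seen = True
    return True
```

On this sketch, a two-term system of black and white passes, as does a four-term system of black, white, red, and yellow, whereas `fits_hierarchy({"black", "white", "blue"})` returns False because "blue" appears without "red." The check says nothing about focal colors or about how terms are learned; it only encodes the ordering claim.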
categorical perception of colors that lie on either side of a color name boundary, such as blue and green, speakers of the Mexican Indian language Tarahumara, who do not have names for blue and green, do not. Hence having an available name can at least accentuate the difference between two categories. These more recent findings suggest that there are indeed linguistic effects on color perception.

There are limitations on the extent to which biological factors constrain color categorization, and it is likely that there is some linguistic influence. The Berinmo, a hunter-gatherer tribe also from New Guinea, have five basic color terms. The Berinmo do not mark the distinction between blue and green, but instead have a color boundary between the colors they call “nol” and “wor,” which does not have any correspondence in the English color-naming scheme. English speakers show a memory advantage across the blue–green category boundary but not across the nol–wor one, whereas Berinmo speakers showed the reverse pattern (Davidoff, Davies, & Roberson, 1999a, 1999b). In a further series of experiments using more sensitive statistical techniques, Roberson, Davies, and Davidoff (2000) were unable to replicate with the Berinmo the results that Heider had earlier obtained with the Dani. They found no recognition advantage for focal stimuli and no facilitation of learning focal colors, and they found that color recognition was affected by color vocabulary.

It is now also apparent that even within English not all basic color terms are equal. “Brown” and “gray” are acquired later than other basic color terms, are the two least preferred colors, and are used less frequently in adult speech to children than other color terms (Pitchford & Mullen, 2005).

In summary there appear to be effects of biological and linguistic constraints on memory for colors. Perhaps color naming is not such a good test of the Sapir–Whorf hypothesis after all. First, the task is clearly very sensitive to the details of the experimental procedures and materials used. Second, the more basic the cognitive or perceptual process, the less scope there is likely to be for the top-down influence of language, and color perception, a mechanism shared with many nonhuman species, is pretty low level. As Pinker (1994) observes, no matter how influential language might be, it is preposterous to think that it could rewire the ganglion cells. Third, in any case, there do appear to be effects of language on color perception: Roberson et al. found effects of categorical perception for colors, but aligned with linguistic categories rather than more biologically based categories.

Linguistic differences in the coding of space and time

In a recent review, Boroditsky (2003) concludes that there are several instances where encoding differences between languages lead to differences in performance by speakers of those languages. For example, different languages encode spatial relations in different ways. Most languages (such as English) use relative terms (e.g., front of, back of, left of, right of) to encode spatial relations. Languages such as Tzeltal (a Mayan language) use an absolute system (similar to our system of describing compass points, e.g., to the north). Speakers of Dutch (which uses the relative system) and Tzeltal interpret and perform very differently on a non-linguistic orientation task. In this task, people see an arrow pointing in one direction, to the left or right. The viewpoint is then rotated 180 degrees, and people are asked which is most like the one they had originally seen: an arrow pointing in relatively the same way, or absolutely the same way. Preferences depend on whether the language uses an absolute or a relative coding system, with the Dutch speakers preferring the right-pointing arrow if they had seen that previously, but the Tzeltal speakers preferring the left-pointing arrow (Levinson, 1996a). This is because “what is north” does not vary with rotation, but “what is left” does. Different spatial frames of reference are acquired with ease by children from different cultures using different languages; the absolute and relative systems are acquired equally easily (Majid, Bowerman, Kita, Haun, & Levinson, 2004). Different languages encode time in different ways: in English we mainly use a front–back metaphor (look ahead, falling behind, move meetings forward), while Mandarin speakers systematically use vertical metaphors (with up corresponding roughly to last
and down to next). Mandarin speakers are more likely to construct vertical timelines to think about time, while English speakers are more likely to construct horizontal ones. For example, Mandarin speakers are faster to confirm that March comes before April if they have just seen a vertical array of objects than if they have just seen a horizontal one. English speakers showed the reverse pattern (Boroditsky, 2001). Similar differences in performance can be found for the way in which languages encode object shape and grammatical gender (Boroditsky, 2003).
Languages differ in the way in which they encode movement. Do these linguistic differences lead to cognitive differences? English encodes the direction of motion with a modifier (“to,” “from”) and the manner of motion in the verb (“walk,” “run”), whereas in Greek the opposite is the case: the verb encodes the direction of motion, while the manner is encoded by a modifier. Papafragou, Massey, and Gleitman (2002) tested Greek and English children on two types of task involving motion: one involving non-linguistic tasks (remembering and categorizing motion in pictures of animals moving around), the other involving linguistic description. They found a difference in performance only on the linguistic tasks.
There has recently been debate about whether these linguistic differences reflect the presence or absence of external cues, and whether they affect performance on all tasks, or just linguistic tasks. Li and Gleitman (2002) argued that the results of the studies by Levinson and colleagues on spatial frames of reference described above were artifactual. Li and Gleitman suggested that the results depend on the presence of environmental cues. They tested a group of native English speakers, and found that they could make them perform using relative or absolute frames of reference depending on the presence of landmark cues in the environment. When participants could not see the outside world (the blinds of the testing room were down), the speakers tended to use a relative frame; when they could see the outside world (the blinds were up), they were more likely to use an absolute frame of reference. On the other hand, Levinson, Kita, Haun, and Rasch (2002) were unable to replicate these results, arguing that the purpose of the task was too apparent to Li and Gleitman’s participants. They also pointed out that their groups were tested with equal amounts of environmental cues available, being tested equally indoors and out.
In summary, there is evidence that the way in which different languages encode distinctions such as time, space, motion, shape, and gender influences the way in which speakers of those languages think. These differences suggest that our language may determine how we perform on tasks that at first sight do not seem to involve language at all, although this claim remains controversial.
Evaluation of the Sapir–Whorf hypothesis
The weak version of the Sapir–Whorf hypothesis has enjoyed a resurgence. There is now a considerable amount of evidence suggesting that linguistic factors can affect cognitive processes. Even color perception and memory, once thought to be completely biologically determined, show some influence of language. Furthermore, research on perception and categorization has shown that high-level cognitive processes can influence the creation of low-level visual features early in visual processing (Schyns, Goldstone, & Thibaut, 1998). This is entirely consistent with the idea that, in at least some circumstances, language might be able to influence perception.
Indeed, it is hardly surprising that if a thought expressible in one language cannot be expressed so easily in another, then that difference will have consequences for the ease with which cognitive processes can be acquired and carried out. Having one word for a concept instead of having to use a whole sentence might reduce memory load. The differences in number systems between languages form one example of how linguistic differences can lead to slight differences in cognitive style.
We will see in later chapters that different languages exemplify different properties that are bound to have cognitive consequences. For example, the complete absence of words with irregular pronunciations in languages such as Serbo-Croat and Italian is reflected in differences between their reading systems and those of speakers of
languages such as English. Furthermore, differences between languages can lead to differences in the effects of brain damage.
The extent to which people find the Sapir–Whorf hypothesis plausible depends on the extent to which they view language as an evolutionarily late mechanism that merely translates our thoughts into a format suitable for communication, rather than a rich symbolic system that underlies most of cognition (Lucy, 1996). It is also more plausible in a cognitive system with extensive feedback from later to earlier levels of processing.
Language and thought: Conclusion
Perhaps the main conclusion about how language and thought are related is that there is a relationship, but it is a complex one. Environment and biology jointly determine our basic cognitive architecture. Within the constraints set by this architecture, languages are free to vary in how they dissect the world and in what they emphasize. These differences can then feed back to affect aspects of perception and cognition.
We noted above that paralyzing overt speech does not stop us being able to think. Clearly language is an important medium of thought and conceptualization. Although there is a great deal of individual variation, a significant proportion of our mental life is conducted in language (Carruthers, 2002); we hear “inner speech,” which often seems to be expressing or guiding our thoughts, or which sometimes is the product of reading. The extent to which inner speech or language plays a real role in thinking is unclear and controversial (Carruthers, 2002). A strong view is that language is essential for conceptual thought and is the medium in which it is conducted; a weaker view is that language is the medium of conscious propositional (as opposed to visual) thought; an even weaker view is that language is necessary to acquire many concepts, and influences cognition in ways that we have seen above; yet another view is that there is essentially no relation at all (although language can clearly express thoughts). Carruthers presents evidence from a range of sources to justify his claim that inner speech is the glue that sticks cognition together, and enables the modules of the mind to communicate: that is, language is the medium of conscious thought.
Even here we must note that there might be cultural differences. In the West, it is assumed that language and inner speech assist thinking; in the East, it is assumed that talking interferes with thinking. These cultural differences affect performance: thinking out loud helped European Americans to solve reasoning problems, but hindered Asian Americans (Kim, 2002; Nisbett, 2003).
The influence of language on thought has some important consequences. For example, does sexist language really influence the way in which people think? Spender (1980) proposed some of the strongest arguments for non-sexist language: for example, that using the word “man” to refer to all humanity has the association that males are more important than females; or that using a word like “chairman” (rather than a more gender-neutral term such as “chair” or “chairperson”) encourages the expectation that the person will be a man. These expectations do have real effects. Gender-stereotyped nouns (e.g., “surgeon,” “nurse”) are those for which many people have a strong initial expectation of the gender of the person (surgeon as male, nurse as female). Readers take longer to read a subsequent pronoun referring to the noun if the pronoun is in conflict with the stereotyping (such as using “she” to refer to a surgeon rather than “he”; e.g., Kennison & Trofe, 2004). Such a theory is a form of the Sapir–Whorf hypothesis, although there has been surprisingly little empirical work in this area.
As Gleitman and Papafragou (2005) conclude, clearly we can have thought without language: some animals clearly reason and solve problems; prelinguistic infants have rich cognitive abilities; people with brain damage destroying most of their language abilities display rich cognitive abilities. Yet there is also much evidence that language and culture can affect our ways of thinking. Language and thought are related, but in a complex way.
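The rotation task used to compare relative and absolute spatial coding, described earlier in this section, can be made concrete with a small worked example. The sketch below is only an illustration under stated assumptions: headings are expressed as compass degrees, the viewer is assumed to start facing north, and the function names are hypothetical rather than taken from Levinson's procedure.

```python
# Illustrative sketch only: headings are compass degrees (0 = north, 90 = east).
# The starting heading and all function names are assumptions for this example.

def egocentric_to_absolute(viewer_heading, side):
    """Compass heading of an arrow pointing to the viewer's left or right."""
    offset = 90 if side == "right" else -90
    return (viewer_heading + offset) % 360


def absolute_to_egocentric(viewer_heading, arrow_heading):
    """Side (left or right) on which a compass heading falls for this viewer."""
    diff = (arrow_heading - viewer_heading) % 360
    return {90: "right", 270: "left"}[diff]


def matching_arrow(original_side, coding, start_heading=0):
    """Arrow judged 'the same' after the viewpoint is rotated by 180 degrees."""
    rotated_heading = (start_heading + 180) % 360
    if coding == "relative":
        # A relative coder keeps the body-centered description ("to my right").
        return original_side
    if coding == "absolute":
        # An absolute coder keeps the compass description ("to the east").
        arrow = egocentric_to_absolute(start_heading, original_side)
        return absolute_to_egocentric(rotated_heading, arrow)
    raise ValueError(f"unknown coding system: {coding}")


print(matching_arrow("right", "relative"))  # right
print(matching_arrow("right", "absolute"))  # left
```

With a right-pointing arrow as the original stimulus, the relative coder picks the right-pointing arrow and the absolute coder picks the left-pointing one, which is the pattern reported for Dutch and Tzeltal speakers respectively.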
SUMMARY
1. Why might early humans have needed language while chimpanzees did not?
2. What do you think is the most important way in which human language can be differentiated
from the way in which Washoe used language?
3. What would convince you that a chimpanzee was using a language like humans?
4. How easy is it to separate features that are universal to language from features that are univer-
sal to our environment?
5. One reason why second language acquisition might be so difficult for adults is that it is not
“taught” in the way that children acquire their first language. How then could the teaching of
a second language be facilitated?
6. How might individual differences play a part in the extent to which people use language to
“think to themselves”?
7. Compare and contrast the language of Genie with the “language” of Washoe.
8. What ethical issues are involved in trying to teach animals language?
9. Clearly the alleged experiment on creating wild children reputed to have been carried out by
King James IV was extremely unethical. What ethical issues do you think might be involved
in cases such as Genie’s?
10. How could you tell whether sex differences in language use result from biological or cultural
factors (or both)?
11. Can you find any examples of sexist language in magazines, newspapers, or official docu-
ments? Has it influenced your understanding of the roles people play?
12. Can you think of any examples of when your cognition has been affected by the words
you use?
FURTHER READING
For more on the origins and evolution of language, see Aitchison (1996), Deacon (1997), Harley
(2010), and Jackendoff (1999). Christiansen and Kirby (2003) is a more advanced but still
approachable recent edited collection about language evolution; start with the chapter by Pinker
for an overview. Dennett (1991) discusses the evolution of language, and its possible relation
to consciousness.
For a more detailed review of animal communication systems and their cognitive abilities, see
Pearce (2008). A detailed summary of early attempts to teach apes language is provided by Premack
(1986a). Gardner, van Cantfort, and Gardner (1992) report more recent analyses of Washoe’s signs.
Premack’s later stance is critically discussed in reviews by Carston (1987) and Walker (1987);
see also the debate between Premack (1986b) and Bickerton (1986) in the journal Cognition. A
popular and contemporary account of Kanzi is given by Savage-Rumbaugh and Lewin (1994). See
also Deacon (1997) for more on animal communication systems and the evolution of language.
See Klima and Bellugi (1979) for more on sign language in humans. Aitchison (1998) is a good
description of attempts to teach language to animals and the biological basis of language. There
is a special issue of the journal Cognitive Science on primate cognition (2000, volume 24, part 3,
July–September). See Pepperberg (1999) and Shanker, Savage-Rumbaugh, and Taylor (1999) for
replies to Kako’s (1999a) criticisms; and Kako (1999b) for replies to them.
Most textbooks on neuropsychology and neuroscience have at least one chapter on language
and the brain (e.g., Gazzaniga et al., 2008; Kolb & Whishaw, 2009). See Poeppel and Hickok
(2004) and the rest of the special issue of the journal Cognition for a recent review of work on
the biology and anatomy of language.
Muller (1997) is an article with commentaries about the innateness of language, species-specificity,
and brain development. He argues that the brain is less localized for language and that language is less
precisely genetically determined than many people think. The article is also a good source of further
references on these topics.
An excellent source of readings on the critical period and how language develops in exceptional
circumstances is Bishop and Mogford (1993). Bishop (1997) provides a comprehensive review of
how comprehension skills develop in unusual circumstances. For a more detailed review of the
critical period and second language hypothesis see McLaughlin (1984). Bishop also describes spe-
cific language impairment (SLI) and semantic-pragmatic disorder in detail; see also Bishop (1989).
Gopnik (1992) also reviews SLI, emphasizing the role genetics plays in its occurrence. A popular
account of Genie and other attic children plus an outline of their importance is given by Rymer
(1993). See Shattuck (1980) for a detailed description of the “Wild Boy of Aveyron” and Curtiss
(1989) for a description of another linguistically deprived person called “Chelsea.” Description
of the neurology of hemispheric specialization can be found in Kolb and Whishaw (2009). Skuse
(1993) discusses other cases of linguistic deprivation. Cases of hearing children of deaf parents
and their implications are reviewed by Schiff-Myers (1993). See Harris (1982) for a full review of
cognitive prerequisites to language. Social precursors of language are discussed in more detail in
Harris and Coltheart (1986).
Gleason and Ratner (1993) give an overview of language development covering many of
the topics in this and the next chapter. See Cottingham (1984) for a discussion of rationalism and
empiricism. A general overview of cognitive development is provided by Flavell, Miller, and
Miller (1993) and by McShane (1991). Piattelli-Palmarini (1980) edited a collection of papers that
arose from the famous debate between Chomsky and Piaget on the relation between language and
thought, and the contributions of nativism versus experience, at the Royaumont Abbey near Paris
in 1975. Piattelli-Palmarini (1994) summarized and updated this debate. Lewis (1987) discusses
general issues concerning the effects of different types of disability on linguistic and cognitive
development. For more on language acquisition in the blind, see the collection of papers in Mills
(1983). Kyle and Woll (1985) is a textbook on sign language and the consequences of its use on
cognitive development. Cromer’s (1991) book provides a good critical overview of this area,
and indeed of many of the topics in this chapter. Gallaway and Richards (1994) is a collection of
papers covering research on child-directed speech and the role of the environment; the final chap-
ter by Richards and Gallaway (1994) provides an overview.
For more on the early language of blind children, see Dunlea (1989) and Pérez-Pereira and
Conti-Ramsden (1999), and for more on language in deaf, blind, and handicapped children, Cromer
(1991). For the effects of linguistic training on cognitive performance, see Dale (1976). Leonard
(2000) is a review of work on SLI. For a good review of the critical period hypothesis, see B. Harley
and Wang (1997). For a review of the biology of sex differences see Baron-Cohen (2003).
See Gleitman and Papafragou (2005) for an overview of the relation between language and
thought. Gumperz and Levinson (1996) is an edited volume about linguistic relativity. Dale (1976)
also discusses the Sapir–Whorf hypothesis in detail. See Levinson (1996b) for cross-cultural work
on differences in the use of spatial terms, and how they might affect cognition. Fodor (1972) and
Newman and Holzman (1993) review the work of Vygotsky and its impact. For a detailed review of
the Sapir–Whorf hypothesis in general and the experiments on color coding in particular, see Lucy
(1992). Clark and Clark (1977) provide an extensive review of the relation between language and
thought, with particular emphasis on developmental issues. Nisbett (2003) discusses cultural differ-
ences in language and cognition.
CHAPTER 4
LANGUAGE DEVELOPMENT
obtain knowledge. The rationalists (such as Plato and Descartes) maintained that certain fundamental ideas are innate; that is, they are present from birth. The empiricists (such as Locke and Hume) rejected this doctrine of innate ideas, maintaining that all knowledge is derived from experience. Locke (1690/1975) was one of the most influential empiricists. He argued that all knowledge held by the rationalists to be innate could be acquired through experience. According to Locke, the mind at birth is a tabula rasa (a “blank sheet of paper”) on which sensations write and determine future behavior. The rationalist–empiricist controversy is alive today: it is often called the nature–nurture debate. Chomsky’s work in general and his views on language acquisition are in the rationalist camp, and there are strong empiricist threads in Piaget. (Piaget argued that cognitive structures themselves are not innate, but can arise from innate dispositions.) Behaviorists, who argued that language was entirely learned, are clearly empiricists. Although we must be wary of simplifying the debate by trying to label contrasting views as rationalist or empiricist, the questions of which processes are innate, and which processes must be in place for language to develop, are of fundamental importance. Nevertheless, we must not forget that behavior ultimately results from the interaction of nature and nurture. Work in connectionism has focused attention on the nature of nurture and the way in which learning systems change with experience (Elman et al., 1996).
We should be wary of seeking any simple answer to the question “what drives language development?” The answer is almost certainly that many factors do. It should also be remembered that language development is a complex process that involves the development of many skills, and processes that may be important for syntactic development, for example, might be of less importance in phonological, morphological, or semantic development. Nevertheless, we can tease apart some likely important contributions.
Imitation
The simplest theory of language development is that children learn language by imitating adult language. Although children clearly imitate some aspects of adult behavior, it is clear that imitation cannot by itself be a primary driving force of early language development, and particularly of syntactic development. A cursory examination of the sentences produced by younger children shows that they do not often imitate adults. Children make types of mistakes that adults do not. Furthermore, when children try to imitate what they hear, they are unable to do so unless they already have the appropriate grammatical construction (see examples that follow). Nevertheless, imitation of adult speech (and that of other children) plays an important role in acquiring accent, in the manner of speech, and in the choice of particular vocabulary items. It might also be more important in older children, as we will see below.
Hirsh-Pasek, Treiman, & Schneiderman, 1984; Moerk, 1991; Morgan & Travis, 1989). For example, parents are more likely to repeat the child’s incorrect utterance in a grammatically correct form, or to ask a follow-up question (Saxton, 1997). Example (4) exemplifies this. On the other hand, if the child’s utterance is grammatically correct, the adults just continue the conversation (Messer, 2000). People from different cultures also respond differently to grammatically incorrect utterances, with some appearing to place more emphasis on correctness (Ochs & Schieffelin, 1995).
Whether this type of feedback is strong enough to have any effect on the course of acquisition is controversial (Marcus, 1993; Morgan & Travis, 1989; Pinker, 1989). Such feedback is probably too infrequent to be effective, although others argue that occasional contrast between the child’s own incorrect speech and the correct adult version does enable developmental change (Saxton, 1997). Evidence in favor of this argument is that children are more likely to repeat adults’ expansions of their utterances than other utterances, suggesting that they pay particular attention to them (Farrar, 1992). The debate about whether or not children receive sufficient negative evidence (sometimes called the no negative evidence problem), such as information about which strings of words are not grammatical, is important because without negative feedback it is a challenge to specify how children learn to produce only correct utterances. One possible solution is that they rely on mechanisms such as innate principles to help them learn the grammar.
Second, the pattern of acquisition of irregular past verb tenses and irregular plural nouns cannot be predicted by learning theory. Some examples of irregular forms given by children are “gived” for “gave,” and “mouses” for “mice.” The sequence observed is: correct production, followed by incorrect production, and then later correct production again (Brown, 1973; Kuczaj, 1977). The original explanation for this pattern (but see later) is that the children begin by learning specific instances. They then learn a general rule (e.g., “form past tenses by adding ‘-ed’”; “form plurals by adding ‘-s’”) but apply it incorrectly by using it in all instances. Only later do they learn the exceptions to the rule. This is an example of what is called U-shaped development: performance starts off at a good level, but then becomes worse, before improving again. U-shaped development is suggestive of a developing system that has to learn both rules and exceptions to those rules. We examine this type of development in detail later.
The third piece of evidence against a conditioning theory of language learning is that some words (such as “no!”) are clearly understood before they are ever produced. Fourth, Chomsky (1959) argued that theoretical considerations of the power and structure of language mean that it cannot be acquired simply by conditioning (see Chapter 2). Finally, in phonological production, babbling is not random, and imitation is not important: The hearing babies of hearing-impaired parents babble normally. In general, language development appears to be strongly based on learning rules rather than simply on learning associations and instances.
Box 4.2 Arguments against the learning theory of language development
• Adults correct mainly the truth and meaning of a child’s utterances, rarely the syntax
• Some words are understood before they are produced
• The pattern of acquisition of irregular past tense verbs and irregular plural nouns is U-shaped
• Aspects of the structure of language mean it cannot be acquired simply by conditioning
• In phonological production, babbling is not random and imitation is not important
Poverty of the stimulus
Can children learn language from what they hear? Chomsky showed that children acquire a set of linguistic rules or grammar. He further argued that they could not learn these rules by environmental exposure alone (Chomsky, 1965). The language children hear was thought to be inadequate in two ways. First, they hear what has been called a degenerate input. The speech children hear is full of
slips of the tongue, false starts, and hesitations, and sounds run into one another so that the words are not clearly separated. Second, there does not seem to be enough information in the language that children hear for them to be able to learn the grammar. They are not normally exposed to a sufficient number of examples of grammatical constructions that would enable them to deduce the grammar. In particular, they do not hear grammatically defective sentences that are labeled as defective (e.g., “listen, Boris, this is wrong: ‘the witch chased to a cave’”). These obstacles to learning language constitute the poverty of the stimulus argument (Berwick, Pietroski, Yankama, & Chomsky, 2011).
Child-directed speech
Adults (particularly mothers) have a special way of talking to children (Snow, 1972, 1994). This special way of talking to children was originally called motherese, but is now called child-directed speech (CDS for short), because its use is clearly not limited to mothers. It is commonly known as “baby talk.” Adults talk in a simplified way to children, taking care to make their speech easily recognizable. The sentences are to do with the “here-and-now”; they are phonologically simplified (baby words such as “moo-moo” and “gee-gee”); there are more pauses, the utterances are shorter, there is more redundancy, the speech is slower, and it is clearly segmented. There are fewer word endings than in normal speech, the vocabulary is restricted, sentences are shorter, and prosody is exaggerated (Dockrell & Messer, 1999). There is a great deal of repetition in the speech of mothers to their children, and they focus on shared activities (Messer, 1980). Carers are more likely to use nouns at the most common or basic level of description (e.g., “dog” rather than “animal”; Hall, 1994). They are also more likely to use words that refer to whole objects (Masur, 1997; Ninio, 1980). Speech is specifically directed towards the child and marked by a high pitch (Garnica, 1977). Furthermore, these differences are more marked the younger the child; hence adults reliably speak in a higher pitch to 2-year-olds than to 5-year-olds. The most important words in sentences receive special emphasis. Although mothers use CDS more, fathers use it too (Hladik & Edwards, 1984). Mothers using sign language also use a form of CDS when signing to their infants, repeating signs, exaggerating them, and presenting them at a slower rate (Masataka, 1996). Even 4-year-old children use CDS when speaking to infants (Shatz & Gelman, 1973). In turn, infants prefer to listen to CDS rather than to normal speech (Fernald, 1991). There appears to be some feedback between the language of the adult carer and that of the child: the vocabulary of carers becomes modified by exposure to the language of the child. The same is not true of syntax, however, suggesting that the adult’s CDS directly and causally influences the syntactic development of the child (Huttenlocher, Waterfall, Vasilyeva, Vevea, & Hedges, 2010).
What determines the level of simplification used in CDS? Cross (1977) proposed a linguistic feedback hypothesis, which states that mothers tailor the amount of simplification they provide depending on how much the child appears to need. Counter to this, Snow (1977) pointed out that mothers produce child-directed speech before infants are old enough to produce any feedback on the level of simplification. Instead, she proposed a conversational hypothesis in which what is important is the mother’s expectation of what the child needs to know and can understand. Cross, Johnson-Morris, and Nienhuys (1980) found that the form of CDS used to hearing-impaired children suggested that a number of factors might be operating, and that elements of both the feedback and the conversational hypothesis are correct. The form of CDS also interacts in a complex way with the social setting: Maternal speech contains more nouns during toy play, but more verbs during non-toy play (Goldfield, 1993). The nature of CDS also varies with the socioeconomic status of the family, with higher status mothers saying more, using more variety in their language, and using longer utterances. These differences in CDS correlate with subsequent vocabulary development in the child (Hoff, 2003), and might be one reason why the vocabulary and language skills of children from high-status families grow more quickly than those of children from low-status families. (Of course, we cannot rule out genetic factors, as mother and child are genetically very similar.)
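The U-shaped pattern for irregular past tenses described earlier in this chapter (specific instances first, then an over-applied general rule, then the rule plus its exceptions) can be illustrated with a toy sketch. This is not a model from the literature: the three numbered stages, the tiny word lists, and the function names are assumptions made purely for illustration.

```python
# Toy illustration of U-shaped development; the stages, word lists, and
# function names are hypothetical and chosen only for this example.

MEMORIZED_INSTANCES = {"give": "gave", "walk": "walked"}  # rote-learned forms
IRREGULAR_PAST = {"give": "gave", "go": "went"}           # exceptions learned last


def regular_past(verb):
    """Apply the general rule: form the past tense by adding '-ed'."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, stage):
    """Past tense form produced at each of three illustrative stages."""
    if stage == 1:
        # Stage 1: only specific memorized instances are available.
        return MEMORIZED_INSTANCES.get(verb)
    if stage == 2:
        # Stage 2: the general rule is applied in all instances.
        return regular_past(verb)
    if stage == 3:
        # Stage 3: exceptions override the rule where they exist.
        return IRREGULAR_PAST.get(verb, regular_past(verb))
    raise ValueError(f"unknown stage: {stage}")


print([past_tense("give", stage) for stage in (1, 2, 3)])
# ['gave', 'gived', 'gave']
```

Performance on the irregular verb is correct, then incorrect, then correct again, while a regular verb such as “walk” is unaffected by the move from stage 2 to stage 3.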
argued that the additional factor is that the design of the grammar is innate: Some aspects of syntax must be built into the mind.
THE LANGUAGE ACQUISITION DEVICE
What might be innate in language? Chomsky (1965, 1968, 1986) argued that language acquisition must be guided by innate constraints, and that language is a special faculty not dependent on other cognitive or perceptual processes. It is acquired, he argued, at a time when the child is incapable of complex intellectual achievements, and therefore could not be dependent on intelligence, cognition, or experience. Because the language they hear is impoverished and degenerate, children cannot acquire a grammar by exposure to language alone. Assistance is provided by the innate structure called the language acquisition device (LAD). In Chomsky’s later work the LAD is replaced by the idea of universal grammar. This is a theory of the primitives and rules of inference that enable the child to learn any natural grammar. In Chomsky’s terminology, it is the set of principles and parameters that constrain language acquisition (see Chapter 2). For Chomsky, language is not learned, but grows.
Obviously languages vary, and children are faced with the task of acquiring the particular details of their language. For Chomsky (1981), this is the process of parameter setting. A parameter is a universal aspect of language that can take on one of a small number of positions, rather like a switch. The parameters are set by the child’s exposure to a particular language. Another way of looking at it is that the LAD does not prescribe details of particular languages, but rather sets boundaries on what acquired languages can look like; languages are not free to vary in every possible way, but are restricted. For example, no language yet discovered forms questions by inverting the order of words from the primary (declarative) form of the sentence. The LAD can be thought of as a set of switches that constrain the possible shape of the grammars the child can acquire; exposure to a particular language sets these switches to a particular position. If exposure to the language does not cause these switches to go to a particular position, they stay in the neutral one. Parameters set the core features of languages. Thus this approach sees language acquisition as parameter setting.
Let us look at a simple example. In languages like Italian, it is possible to drop the subject pronoun of a sentence. For example, it is possible just to say “parla” (speaks). In languages such as English and French, it is not grammatical just to say “speaks”; you must use the pronoun, and say “he speaks.” Whether or not you can drop the pronoun in a particular language is an example of a parameter; it is called the pro-drop parameter. English and French are non-pro-drop languages, whereas Italian and Arabic are pro-drop languages. But once the pro-drop parameter is specified, other aspects of the language fall into place. For example, in a pro-drop language such as Italian you can construct subjectless sentences such as “cade la notte” (“falls the night”); in non-pro-drop languages, you cannot. Instead, you must use the standard word order with an explicit subject (“the rain falls”). Pro-drop languages always permit subjectless sentences, so pro-drop is a generalization about languages (Cook & Newson, 2007).
Is language learning parameter setting?
Is learning language setting parameters? For Chomsky and others who view language acquisition as a process of acquiring a grammar, the basis of which is innate, acquiring a language involves putting the built-in switches (parameters) into the correct positions. One obvious problem with this view is that language development is a slow process, full of errors. Why does it take so long to set these switches? There are two explanations. The continuity hypothesis says that all the principles and parameters are available from birth, but they cannot all be used immediately because of other factors. For example, the child has first to identify words as belonging to particular categories, and be able to hold long sentences in memory for long enough to process them (Clahsen, 1992). The second explanation is that children do not have immediate access to all their innate knowledge. Instead, it only becomes gradually available over time as a consequence of maturation (Felix, 1992) (see Figure 4.2). There is little agreement about which of these provides the best account of language development.
FIGURE 4.2
Another problem is that it has proved difficult to find examples of particular parameters clearly being set in different languages (Maratsos, 1998). In telegraphic speech, English-speaking children often omit pronouns. One possible explanation for this is that they have incorrectly set the parameter for whether or not pronouns should be included in their utterances. At first sight this makes the language look like Italian, but this comparison fails because Italian verbs specify the subject, whereas English ones provide much less information.
Other problems for the parameter-setting theory include how deaf children manage to acquire sign language. There are some indications that similar processes underlie both sign language and spoken language. First, all the milestones in both types of language occur at about the same sort of time. Originally it was thought that because the manual system matures more quickly than the language system, the first signs appeared before the first spoken words (Newport & Meier, 1985; Schlesinger & Meadow, 1972). However, it is possible that people tend to over-interpret gestures by young children, and that in fact signed and spoken words emerge at about the same time (Petitto, 1988). Second, signing children make the same sorts of systematic errors as speaking children at the same time (Petitto, 1987). Hence, although spoken and signed language develop in very similar ways, it is unclear how sign language gestures can be matched to the innate principles and parameters of verbal language. It is also problematic how bilingual children manage to acquire two languages at the same time, when the languages involved might need to have parameters set to different positions (Messer, 2000).
These are difficult problems for the theory of principles and parameters. To counter them, Chomsky toned down the idea that grammatical rules are abstract, and generally reduced their importance in language acquisition (e.g., Chomsky, 1995).
Linguistic universals
Constraints must be general enough to apply across all languages: clearly innate constraints cannot be specific to a particular language. Instead, there must be aspects of language that are universal. Chomsky argued that there are substantial similarities between languages, and the differences between them are actually quite superficial. Pinker (1994, p. 232), perhaps controversially, suggested that “a visiting Martian would surely conclude that aside from their mutually unintelligible vocabularies, Earthlings speak a single language.” Although there are 6,000 languages in the world, they all share the same basic structure, and this basic structure is universal grammar.
Linguistic universals are features that can be found in most languages. Chomsky (1968) distinguished between substantive and formal universals. Substantive universals include the categories of syntax, semantics, and phonology that are common to all languages. The presence of the noun and verb categories is an example of a substantive
4. LANGUAGE DEVELOPMENT 113
universal, as all languages make this distinction. It is so fundamental that it can arise in the absence of linguistic input. “David,” a deaf child with no exposure to sign language, used one type of gesture corresponding to nouns, and another type for verbs (Goldin-Meadow, Butcher, Mylander, & Dodge, 1994). A formal universal concerns the general form of syntactic rules that manipulate these categories. These are universal constraints on the form of syntactic rules. One of the goals of universal grammar is to specify these universals.
An interesting example of a linguistic universal relates to word order. Greenberg (1963) examined word order and morphology in 30 very different languages and found 45 universals, focusing on the normal order of subject, object, and verb (English is an SVO language: its dominant order is subject–verb–object). He noted that we do not appear to find all possible combinations; in particular, there seems to be an aversion to placing the object first. The proportions found are shown in Table 4.1. (Note that in general OVS and VOS languages are very rare, comprising less than 1% of all languages, and although some linguists believe that there are a few OSV languages, there is no consensus; see Pullum, 1981.) Even more striking is the way in which the primary word order has implications for other aspects of a language: it is an example of a parameter. Once primary word order is fixed, other aspects of the language are also fixed. For example, if a language is SVO it will put question words at the beginning of the sentence (“Where is … ?”); if it is SOV, it will put them at the end. SVO languages put prepositions before nouns (“to the dog”), while SOV languages use postpositions after the noun.
TABLE 4.1 Different word orders, as percentages of languages (based on Clark & Clark, 1977).
subject object verb    44%
subject verb object    35%
verb subject object    19%
verb object subject     2%
object verb subject     0%
object subject verb     0%
There are four possible reasons why universals might exist. First, some universals might be part of the innate component of the grammar. There is some evidence for this claim in the way in which parameters set apparently unrelated features of language. For example, at first sight there is no obvious reason why all SVO languages must also put question words at the beginning of a sentence. Second, some universals might be part of an innate component of cognition, which then makes them more likely to be incorporated in some or all languages. For example, 5-month-old infants are sensitive to the conceptual distinction between things that fit tightly and things that fit loosely. Using the standard dishabituation paradigm, infants start to pay attention when there is a change from cylinders in a narrow container to cylinders in a wider container (Bloom, 2004; Hespos & Spelke, 2004). That is, they are sensitive to the conceptual contrast. Some languages (e.g., Korean, which uses different verbs when referring to things fitting tightly compared with things fitting loosely) mark this contrast linguistically, and some (e.g., English) do not. Hurford (2003) argues that the predicate-argument distinction has a neural basis, reflecting distinctions such as that between the “what” and “where” visual processing pathways. Of course, the wider view is that neural systems have evolved to interact with the physical laws of the universe, such as a distinction between mass and movement. Language learning is a process of linking words to universal, pre-existing concepts that enable animals to navigate the world. Third, constraints on syntactic processing make some word orders easier to process than others (Hawkins, 1990). Languages evolve so that they are easy to understand. Fourth, universals might result from strong features of the environment that are imposed on us from birth, and make their presence felt in all languages. Languages make use of important distinctions in the environment. Different languages might pick up on some differences rather than others. In practice it might be very difficult to distinguish between these alternatives. Finally, it should be noted that the notion that there are true universals common to all languages has recently been criticized; instead, it has been argued, there is variation
across languages in all ways in which variation is possible (Evans & Levinson, 2009).
The commonly accepted view is that innate mechanisms make themselves apparent very early in development, whereas aspects of grammar that have to be learned develop slowly. Wexler (1998) argued that this does not have to be so. Some parameters are set by exposure to language at a very early age, whereas some innate, universal properties of language can emerge quite late, as a consequence of genetically driven maturation. As evidence for early parameter setting, Wexler observed that children know a great deal about the inflectional structure of their language when they enter the two-word stage (around 18 months). Furthermore, the parameter of word order (whether the verb precedes or follows the object, and all that follows from it) is set from the earliest observable stage.
Pidgins and creoles
Further evidence that there is a strong biological drive to learn syntax comes from the study of pidgin and creole languages. Pidgins are simplified languages that were created for communication between speakers of different languages who were forced into prolonged contact, such as that resulting from slavery in places like the Caribbean, the South Pacific, and Hawaii. A creole is a pidgin language that has become the native tongue of the children of the pidgin speakers. Whereas pidgins are highly simplified syntactically, creole languages are syntactically rich. They are the spontaneous creation of the first generation of children born into mixed linguistic communities (Bickerton, 1981, 1984). Creoles are not restricted to spoken language: hearing-impaired children develop a creole sign language if exposed to a signing pidgin. A community of deaf children in Nicaragua developed their own sign language from scratch (Kegl, Senghas, & Coppola, 1999). Furthermore, the grammars that different creoles develop are very similar. Deaf children who are not exposed to sign language (because they have non-signing hearing parents) nevertheless spontaneously develop a gesture system that seems to have its own syntax (Goldin-Meadow, Mylander, & Butcher, 1995). They also develop within-gesture structures analogous to characteristics of word morphology. It is as though there is a biological drive to develop syntax, even if it is not present in the adult form of communication to which a child is exposed. Bickerton calls this idea the language bioprogram hypothesis: children have an innate drive to create a grammar that will make a language even in the absence of environmental input.
Genetic linguistics
More evidence that aspects of language are innate comes from studies of the genetic basis of language, genetic linguistics. Specific language impairment, or SLI, is a disorder that affects about 5% of the population. SLI is marked by significant problems with spoken language without any obvious accompanying brain damage or problems with hearing, and those affected have IQs in the normal range. Importantly, it runs in families (Gopnik, 1990a, 1990b; Gopnik & Crago, 1991; Leonard, 1989, 2000; Pinker, 2001; Vargha-Khadem, Watkins, Alcock, Fletcher, & Passingham, 1995). For example, the “KE” family of London is a large family spanning three generations where about half the members have some speech or language disorder. Affected members have difficulty controlling their tongues and making speech sounds, but they also have trouble identifying speech sounds, understanding speech, and making judgments about grammatical acceptability. They have particular difficulty with regular inflections (e.g., forming the plural of nouns by adding an “s” at the end), and a study of the heritability of the disorder suggests that a single dominant gene is involved (Hurst, Baraitser, Auger, Graham, & Norell, 1990). Their language is replete with grammatical errors, particularly involving pronouns. They have difficulty in learning new vocabulary. The speech of the affected people is slow and effortful, and they have difficulty in controlling their facial muscles. Contrary to the earlier reports that were based on quite a small number of items, affected members of the family also have difficulty with irregular inflections. SLI can also cause severe difficulties in language comprehension (Bishop, 1997).
The distribution of the disorder in the family suggests it is caused by a dominant gene (or a set of linked genes) on a non-sex chromosome; the most likely candidate is a segment of chromosome 7 labeled SPCH1 (Fisher, Vargha-Khadem, Watkins, Monaco, & Pembrey, 1998). Study of another person with SLI enabled the disorder to be tied to a specific gene, called FOXP2 (Lai, Fisher, Hurst, Vargha-Khadem, & Monaco, 2001; see also Chapter 3). FOXP2 seems to play some causal role in the brain circuitry underlying normal language development, including Broca’s area; in particular, it seems to be involved in controlling fine movements of the face and articulatory system (Fisher & Marcus, 2006).
Clearly, then, genetic factors affect language proficiency, although there is considerable disagreement about just how specific the grammatical impairment in the KE family actually is. As noted above, Vargha-Khadem and colleagues showed that in fact affected members of the KE family performed poorly on many other language tasks in addition to regular inflection formation (Leonard, 1989; Vargha-Khadem & Passingham, 1990; Vargha-Khadem et al., 1995). Furthermore, systems other than language might also be involved. For example, Tallal, Townsend, Curtiss, and Wulfeck (1991) proposed that children who tended to neglect word endings and other morphological elements did so because of difficulties in temporal processing. There is also debate about whether people with SLI have near-normal IQ on tests of non-verbal performance. Affected members of the KE family scored 18 points lower on performance IQ tests than unaffected members (Vargha-Khadem et al., 1995). Although SLI might have a genetic basis, it is nevertheless to some extent treatable. Members of the KE family learned to compensate for their difficulty in generating syntactically complex sentences by memorizing structures, and by consciously applying rules most of us apply unconsciously.
An alternative view is that SLI is not primarily a disorder of grammar, but arises from impaired sound processing (Joanisse & Seidenberg, 1998). Children with SLI who have syntactic deficits also have difficulty in tasks such as repeating nonwords (such as “slint”), and tasks of phonological awareness, such as recognizing the sound in common in words (“b” in “ball” and “bat”). Joanisse and Seidenberg argued that normal syntactic development has an important phonological component. For example, in order to be able to form the past tense of verbs correctly, you have to be able to accurately identify the final sound of the word. If the final sound of a present tense verb is a voiceless consonant, then you form the past by adding a /t/ sound (“rip” becomes “ripped”). But if it is a voiced consonant then you must add a /d/ sound (“file” becomes “filed”), and if it is an alveolar stop you must add an unstressed vowel as well as a /d/ (“seed” becomes “seeded”). Hence these morphological rules have an important phonological component. Watkins, Dronkers, and Vargha-Khadem (2002) argued that the core deficit in SLI is sequencing sounds, with the problems with inflections and syntactic sequencing secondary to that of sequencing sounds.
The argument about the theoretical importance of SLI hinges on the extent to which these impairments are truly specific to language or to knowledge of grammar. On balance, the evidence suggests that language difficulties can “run in families,” but that these difficulties are quite general and not limited to innate knowledge about linguistic rules. The mapping between genes and language is a complex one, but the FOXP2 gene clearly plays an important role.
Formal approaches to language learning
How do children learn the rules of grammar? Most accounts stress the importance of induction in learning rules: Induction is the process of forming a rule by generalizing from specific instances. One aspect of the poverty of the stimulus argument is that children come to learn rules that could not be learned from the input they receive (Lightfoot, 1982). Gold (1967) showed that the mechanism of induction is not sufficiently powerful to enable a language to be learned by itself; the proof of this is known as Gold’s theorem. If language learners are presented only with positive data, they can only learn a very limited type of language (known as a Type 3 language; see Chapter 2). They would then not be able to construct sentences with an
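The regular past-tense rule that Joanisse and Seidenberg use as their example, described earlier on this page, can be written out as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the function name `past_tense_suffix` and the two sound sets are hypothetical, the sound inventory is deliberately incomplete, and the input is the verb's final sound rather than its spelling. Treating vowel-final verbs like voiced consonants is an added simplification, not something the text states.

```python
# Toy sketch of the regular English past-tense rule (not a model from the text).
# Assumption: the caller supplies the verb's final SOUND, not its final letter.

ALVEOLAR_STOPS = {"t", "d"}
# Hypothetical, incomplete list of voiceless final consonants (excluding /t/).
VOICELESS = {"p", "k", "f", "s", "sh", "ch", "th"}


def past_tense_suffix(final_sound: str) -> str:
    """Return the regular past-tense ending for a verb ending in final_sound."""
    # Alveolar stops must be checked first: /t/ is also voiceless and /d/ is
    # also voiced, so the other two tests would otherwise capture them.
    if final_sound in ALVEOLAR_STOPS:
        return "əd"  # unstressed vowel plus /d/, as in "seed" -> "seeded"
    if final_sound in VOICELESS:
        return "t"   # as in "rip" -> "ripped"
    # Everything else is treated as voiced (assumption: this includes vowels).
    return "d"       # as in "file" -> "filed"


for verb, final in [("rip", "p"), ("file", "l"), ("seed", "d")]:
    print(verb, "->", past_tense_suffix(final))
```

The ordering of the tests is the point of the sketch: the learner has to classify the final sound correctly before the morphological rule can apply, which is why a deficit in sound processing could surface as an apparent deficit in inflection.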
Elman (1993) showed that networks could learn grammars with some of the complexities of English. In particular, the networks could learn to analyze embedded sentences, but only if they were first trained on non-embedded sentences, or were given a limited initial working memory that was gradually increased. This modeling shows the importance of starting on small problems that reflect the types of sentences to which young children are in practice exposed. It also provides support for Newport’s (1990) idea, called the less-is-more theory, that initially limited cognitive resources might actually help children to acquire language, rather than hinder them. In a study involving how easily adults learned an artificial language, Kersten and Earles (2001) found that adults learned the artificial language better when they were initially presented with only small segments of the language than when they were exposed to the full complexity of the language from the beginning. On the other hand, making the task more realistic by introducing semantic information into the modeling suggests that starting small provides less of an advantage than when syntactic information alone is considered. Indeed “starting small,” or “less is more,” might actually hinder development with more naturalistic inputs to the learning system (Rohde & Plaut, 1999).
In any case, connectionist modeling shows that explicit negative syntactic information might not be needed to acquire a grammar in the absence of innate information: there might after all be sufficient information in the sentences children actually hear.
It should be pointed out, however, that these connectionist networks have only modeled grammars approaching the complexity of natural language. In general, it is debatable whether the constraints necessary to acquire language in the face of Gold’s theorem need to arise from innate language-specific information, or can be satisfied by more general constraints on the developing brain, or by the social and linguistic environment (Elman et al., 1996).
Nevertheless, adults and children are able to extract at least some syntactic structure on the basis of exposure to statistical information alone. Saffran (2001, 2002) tested adults and 6–9-year-old children on an artificial language and then asked them to decide whether test items followed the rules of the language or not. Both groups learned the structure of the language (e.g., that an A phrase consists of an A-type word plus a B-type word) by extracting predictive dependencies: that some things consistently go with other things. Interestingly, similar results were found with non-linguistic sounds and even in the visual modality, suggesting that these learning mechanisms are not specific to language. Very young children are also able to extract structure from what they hear. Seven-month-old infants attend longer to sentences with unfamiliar structures than to sentences with familiar structures (Marcus, Vijayan, Rao, & Vishton, 1999). Marcus et al. tested children on sequences in an artificial language where simple counting or statistical mechanisms would not suffice to learn the rule generating the sequence because they heard new items. For example, suppose you hear items like “ga ti ga” and “li na li” repeated several times. You then hear the new item “wo fe wo”; this item does not generate surprise, because it conforms to the rule you have induced (sequences must be of the form ABA). If, however, you hear “wo fe fe” you might be surprised, and pay more attention, because this stimulus does not conform to the rule. Marcus et al. found that the 7-month-olds behaved in the same way. So very young children are able to extract abstract rules from very little input. There is, however, some debate as to what counts as a “rule,” and the extent to which connectionist networks can model this behavior using only simple statistical mechanisms (Christiansen & Curtin, 1999; Seidenberg & Elman, 1999; see Marcus, 1999, for a reply).
HOW CHILDREN DEVELOP LANGUAGE
Many things drive language development: genes, the environment, and particularly social interaction. The main issue is the extent to which children need genetically encoded language-specific information, rather than general-purpose learning mechanisms. We should note that learning mechanisms change as the child grows: Connectionist modeling has focused attention on the way in which learning systems change with experience. Finally, we should remember that the balance of the driving forces for phonological, syntactic, semantic, and pragmatic development might be very different.
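To make the contrast in the Marcus et al. (1999) example concrete, the sketch below compares an abstract ABA check with a simple counter of syllable pairs heard during familiarization. It is a minimal illustration, not the analysis used in the study and not a connectionist model; the function names `follows_aba`, `bigram_counts`, and `familiarity` are hypothetical, and the "statistical learner" here is the crudest possible one (raw counts of adjacent syllable pairs).

```python
from collections import Counter


def follows_aba(item: str) -> bool:
    """Abstract rule: first and third syllables match, the middle one differs."""
    first, second, third = item.split()
    return first == third and first != second


def bigram_counts(items):
    """Count adjacent syllable pairs across the familiarization items."""
    counts = Counter()
    for item in items:
        syllables = item.split()
        counts.update(zip(syllables, syllables[1:]))
    return counts


def familiarity(item: str, counts: Counter) -> int:
    """Sum how often each adjacent pair in the item was heard before."""
    syllables = item.split()
    return sum(counts[pair] for pair in zip(syllables, syllables[1:]))


training = ["ga ti ga", "li na li"]      # familiarization items from the text
counts = bigram_counts(training)

for test_item in ["wo fe wo", "wo fe fe"]:
    print(test_item, follows_aba(test_item), familiarity(test_item, counts))
```

Because both test items are built from syllables never heard in training, the pair counter scores them identically (zero), so it cannot tell them apart; only the abstract relation between positions separates "wo fe wo" from "wo fe fe". Whether richer statistical learners can capture the same behavior is exactly the debate noted in the text.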
used in the language into which they are growing up, this ability is lost by about the age of 1 year or even less (Werker & Tees, 1984). (Adults can learn to make these distinctions again, so these findings are more likely to reflect a reorganization of processes rather than complete loss of ability.)
Infants are sensitive to features of speech other than phonetic discriminations. Neonates (newborn infants) aged 3 days prefer the mother’s voice to that of others (DeCasper & Fifer, 1980; see above). From an early age, infants can distinguish languages as long as they are rhythmically distinct enough; newborn French infants can distinguish British English from Japanese, but not from Dutch (Nazzi, Bertoncini, & Mehler, 1998).
The sensitivity of babies to language extends beyond simple sound perception. Infants aged 8 months are sensitive to cues such as the location of important syntactic boundaries in speech (Hirsh-Pasek et al., 1987). Hirsh-Pasek et al. inserted pauses into speech recorded from a mother speaking to her child. Infants oriented longer to speech where the pauses had been inserted at important syntactic boundaries than when the pauses had been inserted within the syntactic units. The infant appears early on to be identifying acoustic correlates of clauses (such as their prosodic form: the way in which intonation rises and falls, and stress is distributed).
One of the major difficulties facing children learning language is how to segment fluent speech they hear into words. Words run together in speech; they are rarely delineated from each other by pauses. Young children probably make use of several strategies in order to be able to segment the speech stream. Child-directed speech may help the child learn how to segment speech. For example, carers put more pauses in between words in speech to young children than in speech to other adults. Children are further aided by the great deal of information present in the speech stream. Distributional information about phonetic segments is an important cue in learning to segment speech (Cairns, Shillcock, Chater, & Levy, 1997; Christiansen, Allen, & Seidenberg, 1998). Distributional information concerns the way in which sounds co-occur in a language. For example, we do not segment speech so that a word begins with a sequence like /mp/ because this is not a legitimate string of sounds at the start of English words. Similarly the sounds within words such as “laughing” and “loudly” frequently co-occur by virtue of these being words; the sounds “ingloud” occur much less frequently together, only when words like “laughing loudly” are spoken adjacently. This type of low co-occurrence information provides a way of dividing the speech stream. On the other hand, the sounds making up “mother” co-occur very frequently; hence the way in which sounds cluster together is another important cue. Cairns et al. (1997) and Batchelder (2002) showed that it is relatively straightforward to construct a computational model that learns to segment English and other languages using distributional information. Of course, once a child has successfully segmented a few words, it becomes progressively easier to segment the rest of the speech stream. This idea of using a little information to uncover more of the same is known as bootstrapping, by analogy to the idea of trying to pull yourself up by your own bootstraps. Bootstrapping is an important theme in language acquisition. Batchelder’s computational model (called BootLex) shows how useful bootstrapping is. Furthermore, infants do seem to be sensitive to this sort of distributional information. Saffran, Aslin, and Newport (1996) found that 8-month-old infants very quickly learn to discriminate words in a stream of syllables on the basis of which sounds tend to occur together regularly. Once they have learned the words, they then listen longer to novel stimuli than to the words presented in the stream of syllables. Children probably use both divisional and clustering distributional information at some time.
Although children can segment speech on the basis of statistical information alone, their performance is much better if they can make use of other types of information. Eight-month-old babies also make use of speech-specific information, including phonotactic cues such as co-articulation, the way in which sounds change in the presence of other sounds (Johnson & Jusczyk, 2001; Mattys & Jusczyk, 2001). For example, Mattys and Jusczyk found that 9-month-old infants turned and looked longer
at the source of a sound producing consonant–vowel–consonant triplets with good phonotactic cues to a word boundary than triplets without these cues. For example, the triplet “gaffe” stands out more if it is preceded by “bean” (the good phonotactic cue) than “fang” (the neutral cue). A single, isolated consonant is not a viable word; hence adults segment speech in such a way as to avoid creating isolated consonants. Measuring the time children spent listening to stimuli, Johnson, Jusczyk, Cutler, and Norris (2003) found that 12-month-old children use the same strategy. Hence, from an early age children segment speech so as to avoid creating isolated units that could not be words. In addition, very young infants also seem to be sensitive to the prosody of language. Prosodic information concerns the pitch of the voice, its loudness, and the length of sounds. Neonates prefer to listen to parental rather than non-parental speech. Using the sucking habituation technique, Mehler et al. (1988) showed that infants as young as 4 days old can distinguish languages from one another. Infants prefer to listen to the language spoken by their parents. For example, six babies born to French-speaking mothers preferred to listen to French rather than Russian. The likely explanation for this is that the child learns the prosodic characteristics of the language in the womb. Sensitivity to prosody helps the infant to identify legal syllables of their language (Altmann, 1997). After some months’ exposure to a language, infants learn to make use of knowledge of lexical stress in identifying words; for example, children growing up exposed to English adopt a stress initial syllable strategy, enabling them to identify when a new word is starting (Curtin, Mintz, & Christiansen, 2005; Thiessen & Saffran, 2007).
Just because some mechanisms of speech perception are innate, it does not follow that they are necessarily language- or even species-specific. All children need is a general-purpose learning algorithm that helps them detect statistical regularities. Kuhl (1981) showed that chinchillas (a type of South American rodent) display categorical perception of syllables such as “da” and “ta” in the same way as humans do. The cotton-top tamarin, a type of New World monkey, can segment a sequence of sounds based on distributional information, with some sequences being more common than others, just like human infants (Hauser, Newport, & Aslin, 2001). However, even if animals can perform these perceptual distinctions, it does not necessarily follow that the perceptual mechanisms they employ are identical to those of humans, and, furthermore, humans possess language abilities that go far beyond categorical perception and speech-stream segmentation.
Finally, for a while children actually regress in their speech perception abilities (Gerken, 1994): The ability of young children to discriminate sounds is worse than that of infants. In part this regression might be an artifact of using more stringent tasks to test older children: Tests for infants just involve discriminating new sounds from old ones, but tests for older children require them to match particular sounds. It might also occur because of a change in the focus of the child’s language-perception system. Infants aged 14 months do not attend to fine phonetic detail (e.g., “bih” versus “dih”) when learning new words, though children aged 8 months are capable of discriminating these sounds in a perception task (Stager & Werker, 1997). When children know only a few words, it might be possible to represent them in terms of rather gross characteristics; indeed, limiting the amount of detail to which you need to attend might be advantageous. But as children grow older and acquire more words, they are forced to represent words in terms of their detailed sound structure. Hence, early on (perhaps up to a vocabulary size of about 50 words) detailed sound contrasts are not yet needed by the child (Gerken, 1994). Perceptual skills, experience, and the task at hand all interact to determine performance.
Young children quickly become very good at speech recognition. Children aged 18 months can identify a large number of words without having to hear the whole word: the first 300 ms is sufficient, as shown by studies looking at children’s eye movements to pictures of objects while listening to speech (Fernald, Swingley, & Pinto, 2001). Once children have made a start on segmentation, “bootstrapping” can come into play: they can use their existing knowledge to facilitate the acquisition of new knowledge (Werker & Yeung, 2005). PRIMIR (Processing
4. LANGUAGE DEVELOPMENT 123
Rich Information from Multidimensional Interactive Representations) is a model that
emphasizes the role of bootstrapping in early word learning (Werker & Curtin, 2005).
Although children continue to perceive phonetic variations in the speech stream, by
17 months old they have learned a sufficient number of word–object pairings to enable
them to focus on the phonological distinctions that are important for distinguishing
new words.

Babbling
From about the age of 6 months to 10 months, before infants start speaking, they make
speech-like sounds known as babbling. Babbling is clearly more language-like than other
early vocalizations such as crying and cooing, and consists of strings of vowels and
consonants combined into sometimes lengthy series of syllables, usually with a great
deal of repetition, such as “bababa gugugu,” sometimes with an apparent intonation
contour. There are two types of babbling (Oller, 1980). Reduplicated babble is
characterized by repetition of consonant–vowel syllables, often producing the same pair
for a long time (e.g., “bababababa”). Non-reduplicated or variegated babble is
characterized by strings of non-repeated syllables (e.g., “bamido”). Babbling lasts for
6–9 months, fading out as the child produces the first words. It appears to be
universal: deaf infants also babble (Sykes, 1940), although it is now known that they
produce slightly different babbling patterns. This suggests that speech perception
plays some role in determining what is produced in babbling (Oller, Eilers, Bull, &
Carney, 1985). Across many languages, the 12 most frequent consonants constitute 95% of
babbled consonants (Locke, 1983), although babbling patterns differ slightly across
languages, again suggesting that speech perception determines some aspects of babbling
(de Boysson-Bardies, Halle, Sagart, & Durand, 1989; de Boysson-Bardies, Sagart, &
Durand, 1984).
What is the relation between babbling and later speech? The continuity hypothesis
(Mowrer, 1960) states that babbling is a direct precursor of language: in babbling the
child produces all of the sounds that are to be found in all of the world’s languages.
This range of sounds is then gradually narrowed down, by reinforcement by parents and
others of some sounds but not others (and by the lack of exposure to sounds not present
within a particular language), to the set of sounds in the relevant language. (The
extreme version of this of course is the behaviorist account of language development
discussed earlier: Words are acquired by the processes of reinforcement and shaping of
random babbling sounds.) For example, a parent might give the infant extra food when he
or she makes a “ma” sound, and progressively encourages the child to make increasingly
accurate approximations to sounds and words in their language. There are a number of
problems with the continuity hypothesis. Many sounds, such as consonant clusters, are
not produced at all in babbling, and also parents are not that selective about what
they reinforce in babbling: they encourage all vocalization (Clark & Clark, 1977). Nor
does there appear to be much of a gradual shift towards the sounds particular to the
language to which the child is exposed (Locke, 1983).

According to Mowrer (1960), babbling is a direct precursor of language. The range of
babbling sounds is gradually narrowed down over time by reinforcement by the carer of
some sounds but not others.

The discontinuity hypothesis states that babbling bears no simple relation to later
development. Jakobson (1968) postulated two stages in the development of sounds. In the
first stage children babble, producing a wide range of sounds that do not emerge in any
particular order and that are not obviously related to later development. The second
124 B. THE BIOLOGICAL AND DEVELOPMENTAL BASES OF LANGUAGE
FIGURE 4.3 Children’s simplification of words: substitution of easier sounds for more
difficult sounds, omission of the final consonant, reduction of consonant clusters,
repetition of syllables, and omission of unstressed syllables.
“expressive style” group emphasize people and feelings, while children in the
“referential style” group emphasize objects. These differences probably arise for
several reasons. Nelson argued that they arise because of differences in what children
think language is for: Children who think language is primarily for labeling objects
are likely to be referential, while those who think it is for social interaction are
likely to be more expressive. The differences also probably reflect differences in
language use by the parents; some parents spend a great deal of time producing object
labels for their children, and such children tend to fall in the referential style
group (Pine, 1994a). It was once thought that the referential style led to faster
language development; however, when you take into account factors such as vocabulary
size and the age at which children produce the first word (both types of children reach
50 words at the same age, but as the referential children tend to produce their first
word later, they appear to rush faster towards that limit), there is no obvious
difference in subsequent development (Bates et al., 1994; Hoff-Ginsberg, 1997).

Some children’s first words tend to refer to objects (“referential”) whereas some
children’s are more likely to refer to people and feelings (“expressive”).

Greenfield and Smith (1976) found that early words may refer to many different roles,
not just objects, and further proposed that the first utterances may always name roles.
For example, the early word “mama” might be used to refer to particular actions carried
out by the mother, rather than to the mother herself. Generally, the earliest words can
be characterized as referring either to things that move (such as people, animals,
vehicles) or things that can be moved (such as food, clothes, toys). Moving things tend
to be named before movable things. Places and the instruments of actions are very
rarely named.
There is some debate as to whether the earliest referential words may differ in their
use and representation from later ones (McShane, 1991). In particular, the child’s
earliest use of reference (what things refer to) appears to be qualitatively different
from later use. The youngest children name objects spontaneously or give names of
objects in response to questions quite rarely, in marked contrast to their behavior at
the end of the second year.
It would be surprising if children got the meanings of words right every time. Consider
the size of the task facing very young children. A mother says to a baby sitting in a
pram and looking out of the window: “Isn’t the moon pretty?” How, from all the things
in the environment, does the child pick out the correct referent for “moon”? That is,
how does the child know what the word goes with in the world? It is not even
immediately obvious that the referent is both an object and an object the infant can
see. Even when the child has picked out the appropriate referent, substantial problems
remain. He or she has to learn that “moon” refers to the object, not some property such
as “being silver colored” or “round.” What are the properties of the visual object that
are important? The child has to learn that the word “moon” refers to the same thing,
even when its shape changes (from crescent to full moon). The task, then, of
associating names with objects and actions is an enormous one, and it is surprising
that children are as good at acquiring language as they are. Errors are therefore only
to be expected. Sentences (6) and (7)
are examples of errors in acquiring meaning from Clark and Clark (1977):

(6) Mother pointed out and named a dog “bow-wow.”
    Child later applies “bow-wow” to dogs, but also to cats, cows, and horses.
(7) Mother says sternly to child: “Young man, you did that on purpose.”
    When asked later what “on purpose” means, child says: “It means you’re looking
    at me.”

What are the features that determine the child’s first guess at the meaning of words?
How do the first guesses become corrected so that they converge on the way adults use
words? The errors that children make turn out to be a rich source of evidence about how
they learn word meaning.
Clark and Clark (1977) argued that, in the very earliest stages of development, the
child must start with two assumptions about the purpose of language: Language is for
communication, and language makes sense in context. From then on they can form
hypotheses about what the words mean, and develop strategies for using and refining
those meanings.

The emergence of early words
Children’s semantic development is dependent on their conceptual development. They can
only map meanings into the concepts they have available at that time. In this respect,
linguistic development must follow cognitive development. Of course, not all concepts
may be marked by simple linguistic distinctions. We don’t have different words for
brown dogs as opposed to black dogs. There must surely be some innate processes, if
only to categorize objects, so the child is born with the ability to form concepts.
Quinn and Eimas (1986) suggest that categorization is part of the innate architecture
of cognition.
However, children’s early vocabularies cannot be predicted just on the basis of the
words they hear. Their vocabularies contain many more names for objects than are
present in the speech directed towards them (Bloom, 2001a). Clearly there is some bias
in learning, and one of the goals of understanding semantic development is to work out
how this bias arises.
The first words emerge out of situations where an exemplar of the category referred to
by the word is present in the view of parent and child (see Chapter 3 on the social
precursors of language). However, there are well-known philosophical objections to a
simple “look and name,” or ostensive model of learning the first words (Quine, 1960).
Ostensive means pointing; this conveys the idea of acquiring simple words by a parent
pointing at a dog and saying “dog,” and the child then simply attaching the name to the
object. The problem is simply that the child does not know which attribute of input is
being labeled. For all the child knows, it could be that the word “dog” is supposed to
pick out just the dog’s feet, or the whole category of animals, or its brown color, or
the barking sound it makes, or its smell, or the way it is moving, and so on. This is
often called the mapping problem. One thing that makes the task slightly easier is that
adults stress the most important words, and children selectively attend to the stressed
parts of the speech they hear (Gleitman & Wanner, 1982). Nevertheless, the problem
facing the child is an enormous one.
After the first few words, vocabulary development is very fast and very efficient.
Young children are able to associate new words with objects after only one exposure, an
ability called fast-mapping. How can the child learn so quickly? Researchers have
proposed a number of solutions to the mapping problem.

Constraints on learning names for things
Perhaps the cognitive system is constrained in its interpretations? The developing
child makes use of a number of lexical principles to help to establish the meaning of a
new word (Golinkoff, Hirsh-Pasek, Bailey, & Wenger, 1992; Golinkoff, Mervis, &
Hirsh-Pasek, 1994). The idea of lexical principles as general constraints on how
children attach names to objects and their properties is an important one. Several main
constraints have been proposed.
FIGURE 4.4 A significant problem for a child when learning a new word is that the thing it refers to can appear
in many different forms. For example, the word “building” can be used to name many different types of structure.
A third possible constraint is the mutual exclusivity assumption, whereby each object
can only have one label (Markman & Wachtel, 1988): That is, (unilingual) children do
not usually like more than one name for things.
As children acquire words, new strategies become available. For example, they may be
biased to assign words to objects for which they do not already have names (the novel
name–nameless category or N3C principle; Mervis & Bertrand, 1994). There are syntactic
cues to meaning; if we talk about “I see Wolf” we are probably talking about a proper
noun, but if we say “I see the wolf” we are talking about a common noun (Bloom, 2001a).
Later on, when children’s vocabulary is larger and their linguistic abilities more
sophisticated, explicit definition becomes possible. Hence superordinate and
subordinate terms can be explicitly defined by constructions such as “Tables, chairs,
and sofas are all types of furniture.”

Other solutions to the mapping problem
Other solutions have been proposed to the mapping problem. There might be an innate
basis to the hypotheses children make (Fodor, 1981): We might have evolved such that we
are more likely to attach the word “dog” to the object “dog,” rather than to its color,
or some even more abstruse concept such as “the hairy thing I see on Mondays.”
It is likely that social factors play an important role in learning the meanings of
early words. Joint attention between adult and infant is an important factor in early
word learning. Parents usually take care to talk about what their children are
interested in at the time. Even at 16 months of age, children are sensitive to what the
speaker is attending to and can work out whether novel labels refer to those things
(Baldwin, 1991; Woodward & Markman, 1998). Early words may be constrained so that they
are only used in particular discourse settings (Levy & Nelson, 1994; Nelson, Hampson, &
Shaw, 1993). The social setting is important in learning new words as a supplement or
an alternative to innate or lexical constraints. Tomasello (1992b) argued that social
and pragmatic factors could have an important influence on language development. The
problem of labeling objects would be greatly simplified if the adult and child
establish through any available communicative means that the discourse is focusing on a
particular dimension of an object. For example, if it has been established that the
domain of discourse is “color,” then the word “pink” will not be used to name a pig,
but its color. Adults and children interact in determining the focus of early
conversation. Tomasello and Kruger (1992) demonstrated the importance of pragmatic and
communicative factors. They showed that young children are surprisingly better at
learning new verbs when adults are talking about actions that have yet to happen than
when the verbs are used ostensively to refer to actions that are ongoing. This must be
because the impending action contains a great deal of pragmatic information that the
infant can use, and the infant’s attention can be drawn to this. In summary, the social
setting can serve the same role as innate principles in enabling the child to determine
the reference without knowing the language. Joint attention with adults, or
intersubjectivity, is an essential component of learning a language, particularly early
in development. Variability in experience of joint attention at 9–18 months may be one
of the most important determinants of variability in early lexical development.
Nevertheless, there is a limit to what social-pragmatic factors and joint attention can
achieve, and as the child gets older the availability and nature of the linguistic
input become increasingly important (Hoff & Naigles, 2002). In a study of 63 children,
Hoff and Naigles found that, at the age of 24 months, variation in the extent to which
mother and child mutually engage in conversation has little effect on the richness of
the vocabulary of the child; on the other hand, variation in the lexical richness and
syntactic complexity of the mother’s utterances does have an effect.
Children appear to vary in the importance they assign to different concepts, and this
leads to individual differences and preferences for learning words. The first use of
“dog” varies from four-legged mammal-shaped objects, to
all furry objects (including inanimate objects such as coats and hats), to all moving
objects (Clark & Clark, 1977). In each case the same basic principle is operating: a
child forms a hypothesis about the meaning of a word and tries it out. The hypotheses
formed differ from child to child.
Brown (1958) was among the earliest to suggest that children start using words at what
was later known as the basic level (see Chapter 11). The basic level is the default
level of usage. For example, “dog” is a more useful label than “animal” or “terrier.”
The bulk of early words are basic-level terms (Hall, 1993; Hall & Waxman, 1993;
Richards, 1979; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). Superordinate
concepts, above the basic level, seem particularly difficult to acquire (Markman,
1989). Taxonomic hierarchies begin to develop only after the constraint biasing
children to acquire basic-level terms weakens. Later on, particular cues become
important. Mass nouns (which represent substances or classes of things, such as “water”
or “furniture”) in particular seem to aid children in learning hierarchical taxonomies,
as they often flag superordinate category names (Markman, 1985, 1989). As such, they
are syntactically restricted, which is apparent when we try to substitute one for
another. Hence although we can say “this is a table,” it is incorrect to say “this is a
furniture”; similarly “this is a ring” but not “this is a jewelry”; and “this is a
dollar” but not “this is a money.”
The properties of objects themselves might constrain the types of label that are
considered appropriate for them. Soja, Carey, and Spelke (1992) argued that the sorts
of inferences children make vary according to the type of object being labeled. For
example, if the speaker is talking about a solid object, the child assumes the word is
the name of the whole object, but if the speaker is talking about a non-solid
substance, then the child infers that the word is the name of parts or properties of
the substance.
Finally, there are syntactic cues to word meaning. Brown (1958) proposed that children
may use part-of-speech as a cue to meaning. For example, 17-month-olds are capable of
attending to the difference between noun phrase syntax as in “This is Sib” and count
noun syntax as in “This is a sib.” This is obviously a useful cue for determining
whether the word is a proper name or stands for a category of things. The ability to
use syntactic knowledge to learn meaning is called syntactic bootstrapping (Gleitman,
1990; Gleitman, Cassidy, Nappa, Papafragou, & Trueswell, 2005; Landau & Gleitman, 1985;
Lidz, Gleitman, & Gleitman, 2003). Children use the structure of the sentences they
hear in combination with what they perceive in the world to interpret the meanings of
new words. For example, they use the syntax to help them infer the meanings of new
verbs by working out the types of relation that are permissible between the nouns
involved (Naigles, 1990). For instance, suppose a child does not understand the verb
“bringing” in the sentence “Are you bringing me the doll?” The syntactic structure of
the sentence suggests that “bring” is a verb whose meaning involves transfer, thus
ruling out possible contending meanings such as “carrying,” “holding,” or “playing.”
Even children as young as 2 years old can use information about transitive and
intransitive verbs to infer the meanings of verbs (Naigles, 1996).
There are a number of reasons why some words are easier to learn than others. First,
and most obviously, children are exposed to some words more often in the language and
in the environment. Second, some concepts might be more accessible. Conceptual
structures change as the child develops, and understanding words like “know,” “think,”
and “believe” might depend on the child having a sophisticated conceptual structure and
a theory of mind (Gopnik & Meltzoff, 1997; Huttenlocher, Smiley, & Charney, 1983).
Third, the information change model says that the type of information available to the
child changes and increases over time, and not all words are acquired in the same way
(Gleitman et al., 2005). Of course all of these factors might operate, although
Gleitman et al. argue that information change is more important than conceptual change;
certain words and syntactic structures have to be learned before others can be
successfully acquired.
bars of cot toy abacus, toast rack with parallel bars, picture of columned
building
some examples of early over-extensions. Over-extensions are very common in early
language and appear to be found across all languages. Rescorla (1980) found that one
third of the first 75 words were over-extended, including some early high-frequency
words.
As we can observe from Table 4.2, over-extensions are often based on perceptual
attributes of the object. Although shape is particularly important, the examples show
that over-extensions are also possible on the basis of the properties of movement,
size, texture, and the sound of the objects referred to. Although Nelson (1974)
proposed that functional attributes are more important than perceptual ones, Bowerman
(1978) and E. Clark (1973) both found that appearance usually takes precedence over
function. That is, children over-extend based on a perceptual characteristic such as
shape even when the objects in the domain of application clearly have different
functions.
McShane and Dockrell (1983) pointed out that many reports of over-extensions failed to
distinguish persistent from occasional errors. They argued that occasional errors tell
us little about the child’s semantic representation, perhaps arising only from filling
a transient difficulty in accessing the proper word with the most available one. Such
transient over-generalizations are more akin to adult word substitution speech errors
(see Chapter 13), and as such would tell us little about normal semantic development.
Hence it is important to show that words involved in real over-extensions are
permanently over-extended, and also that the same words are over-extended in
comprehension. If a word is over-extended because the representation of its meaning is
incomplete, the pattern of comprehension of that word by the child should reflect this.
To this end, Thomson and Chapman (1977) showed that young children over-extended the
meanings of words in comprehension as well as in production. They found that many words
that were over-extended in production by a group of 21- to 27-month-old children were
also over-extended in comprehension. However, not all words that were over-extended in
production were over-extended in comprehension. Most children chose the appropriate
adult referent for about half the words they over-extended in production.
There is some controversy surrounding these findings. Fremgen and Fay (1980) argued
that the results of Thomson and Chapman (1977) were an experimental artifact. They
pointed out that the children were repeatedly tested on the same words, and this might
have led to the children changing their response either out of boredom or to please the
experimenter. When Fremgen and Fay tested children only once on each word, they failed
to find comprehension over-extensions in words over-extended in production. The
situation is complex, however, as Chapman and Thomson (1980) showed that in their
original sample there was no evidence of an increase in the number of over-extensions
across trials, which would have been expected if Fremgen and Fay’s hypothesis was
correct. Behrend (1988) also found over-extensions in comprehension in children as
young as 13 months.
Clark and Clark (1977) hypothesized that over-extensions develop in two stages. In the
earliest stage, the child focuses on an attribute, usually perceptual, and then uses
the new word to refer to that attribute. However, with more exposure they realize that
the word has a more specific meaning, but they do not know the other words that would
enable them to be more precise. In this later stage, then, they use the over-extended
word rather as shorthand for “like it.” Hence the child might know that there is more
to being a ball than being round, yet when confronted with an object like the moon, not
having the word “moon” they might call it “ball,” meaning
“the-thing-with-the-same-shape-as-a-ball.”
We should also bear in mind that, like adults, children might sometimes just make
mistakes. They might be using words as an analogy: the moon is like a ball. Or they
might just be being mischievous (Bloom, 2001a).
Under-extensions occur when words are used more specifically than their meaning, such
as using the word “round” to refer only to balls. The number of under-extensions might
be dramatically under-recorded, because usually the construction will appear to be
true. For example, if a child points at the moon and says “round,” this utterance is
clearly correct, even if the child thinks that this is the name of the moon.
Three types of theory have been proposed to account for these data. The accounts are
all based on the idea that over-extensions occur because of a lexical representation
that is incomplete compared to that of the adult, whereas under-extensions occur
because the developing representation is more specific than that of the adult.
The semantic feature hypothesis (E. Clark, 1973) is based on a decompositional theory
of lexical semantics. This approach states that the meaning of a word can be specified
in terms of a set of smaller units of meaning, called semantic features (see Chapter
11). Over- and under-extensions occur as a result of a mismatch between the features of
the word as used by the child compared with the complete adult representation. The
child samples from the features, primarily on perceptual grounds. Over-extensions occur
when the set of features is incomplete; under-extensions occur when additional spurious
features are developed (such as the meaning of “round” including something like
[silvery white and in the sky]). Semantic development consists primarily of acquiring
new features and reducing the mismatch by restructuring the lexical representations
until the features used by the adult and child converge. The features are acquired in
an order from most to least general.
Atkinson (1982) and Barrett (1978) discussed problems with this approach. Any theory of
lexical development based on a semantic feature theory of meaning will inherit the same
problems as the original theory, and there are serious problems with the semantic
feature theory (see Chapter 11). In particular, we must be able to point to plausible,
simple features in all domains, and this is not always easy, even for the kind of
concrete objects and actions that young children talk about. Atkinson (1982) in
particular pointed to the central problem that the features proposed to account for the
data are arbitrary. The developmental theory cannot easily be related to any plausible
general semantic theory, or to an independent theory of perceptual development.
In Nelson’s (1974) functional core hypothesis, generalization is not restricted to
perceptual similarity; instead, functional features are also emphasized. In other
respects this is similar to the featural account and suffers from the same problems.
The prototype hypothesis (Bowerman, 1978) states that lexical development consists of
acquiring a prototype that corresponds to the adult version.
Semantic feature hypothesis (E. Clark)
• The meaning of words can be specified in terms of smaller units of meaning
  (“semantic features”)
• When there is a mismatch between features of the word used by the child and the
  complete representation used by the adult, an over- or under-extension occurs
• Over-extensions occur when a set of features is incomplete
• Under-extensions occur when a set of features includes additional spurious features
• Semantic development involves acquiring new features and reducing mismatch between
  adult and child features
• Features are acquired from the most general to the least general

Functional core hypothesis (Nelson)
• Generalization is not restricted to perceptual similarity; functional features are
  also emphasized
• In other ways, similar to the semantic feature hypothesis

Prototype hypothesis (Bowerman)
• A prototype is an average member of a category
• Lexical development consists of acquiring a prototype that corresponds to the adult
  version
A prototype is an average member of a category. Over-extensions are probably better
explained in terms of concept development and basic category use. Kay and Anglin (1982)
found prototypicality effects in over- and under-extensions. The more an object was
prototypical of a category, the more likely it was that the conceptual prototype name
would be extended to include the object. Words are less likely to be extended for more
peripheral category members. This suggests that the concepts are not fully developed
but clustered around just a few prototypical exemplars. Once again, a significant
problem with this approach is that it inherits the problems of the semantic theory on
which it is founded (see Chapter 11).
In summary, the strengths and weaknesses of these developmental theories are the same
as those of the corresponding adult theories. There is surely scope for connectionist
modeling here, which may yet show that a variant of the semantic feature hypothesis is
along the right lines.

When new word meanings are acquired, because features are contrasted with the features
of existing word meanings, the meaning should not overlap with that of existing words:
the word’s meaning should fill a gap. Children do not like two labels for the same
thing.
Unfortunately, young children are sometimes happy with two labels for the same object
(Gathercole, 1987). Contrast appears to be used later rather than earlier as an
organizing principle of semantic development. Neither is it likely to be the only
principle driving semantic development. There comes a point when it is no longer useful
for semantic development to make a contrast (for example, between black cats and white
cats), and the contrastive hypothesis says nothing about this. It seems just as likely
that when children hear someone use a new word, they assume it must refer to something
new because otherwise the speaker would have used the original word instead
(Gathercole, 1989; Hoff-Ginsberg, 1997).
that?” and vocabulary develops quickly from then on. From this point, a good guide to the order of acquisition of words is the semantic complexity of the semantic domain under consideration. Words with simpler semantic representations are acquired first. For example, the order of acquisition of dimensional terms used to describe size matches their relative semantic complexity. These terms are acquired in the sequence shown in (8):
(8) big–small
tall–short, long–short
high–low
thick–thin
wide–narrow, deep–shallow
“Big” and “small” are the most general of these terms, and so these are acquired first. “Wide” and “narrow” are the most specific terms, and are also used to refer to the secondary dimension of size, and hence these are acquired later on. The other terms are intermediate in complexity and are acquired in between (Bierwisch, 1970; Clark & Clark, 1977; Wales & Campbell, 1970).
Nouns are acquired more easily than verbs. One explanation for this might be that verbs are more cognitively complex than nouns, in that whereas nouns label objects, verbs label relations between objects (Gentner, 1978). An alternative although related view is that their acquisition depends on the prior acquisition of some nouns and some information about how syntax operates at the clause level (Gillette, Gleitman, Gleitman, & Lederer, 1999). That is, verb acquisition depends on acquiring knowledge about linguistic context. Gillette et al. presented adults with video clips of adults speaking to children. Some words on the soundtrack were replaced with beeps or made-up words such as “gorp.” The adults had to identify the meanings of the beeps and made-up words. The extralinguistic context was surprisingly uninformative: adults found it quite difficult to identify the meanings of words on the basis of environmental information alone. They were particularly poor, however, at identifying verbs relative to nouns, and extremely bad at identifying verbs relating to mental states (e.g., “think,” “see”). Performance increased markedly when syntactic cues were available. As an example, the “gorp” in “Vlad is gorping” is more likely to mean “sneeze” than “kick,” but in “Vlad is gorping the snaggle” it is more likely to mean “kick” than “sneeze.” In summary, environmental context might be less powerful than was once thought, while linguistic context provides powerful cues. Verbs are more difficult to acquire than nouns because of their greater reliance on complex linguistic context. Later semantic development sees much interplay between lexical and syntactic factors.
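The syntactic cue in the “gorp” example can be made concrete with a toy sketch. The Python below is only an illustration, not the procedure used by Gillette et al.: the function names, the two-entry candidate table, and the rule that any material after the verb counts as a direct object are all assumptions introduced here.

```python
# Toy illustration: guess which meanings fit a made-up verb from its frame.
# CANDIDATE_MEANINGS is a hypothetical two-entry table, not real data.
CANDIDATE_MEANINGS = {
    "intransitive": ["sneeze"],  # verbs that take no object
    "transitive": ["kick"],      # verbs that act on an object
}


def frame_of(sentence: str, verb_stem: str) -> str:
    """Classify the frame around the first word starting with verb_stem.

    Simplifying assumption: anything after the verb is a direct object.
    """
    words = sentence.rstrip(".").lower().split()
    verb_index = next(i for i, w in enumerate(words) if w.startswith(verb_stem))
    has_object = verb_index < len(words) - 1
    return "transitive" if has_object else "intransitive"


def likely_meanings(sentence: str, verb_stem: str) -> list[str]:
    """Return the candidate meanings compatible with the observed frame."""
    return CANDIDATE_MEANINGS[frame_of(sentence, verb_stem)]


print(likely_meanings("Vlad is gorping.", "gorp"))
print(likely_meanings("Vlad is gorping the snaggle.", "gorp"))
```

The sketch returns a hard choice, whereas the text speaks only of one meaning being more likely than another; a fuller model would attach probabilities to the candidates.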
[Photo: At around the age of 2½ years, children start to ask questions such as “What’s the name of that?” This marks the onset of a period of accelerated vocabulary development.]
Does comprehension always precede production?
Comprehension usually precedes production for the obvious reason that the child has to more or less understand (or think they understand) a concept before producing it. Quite often contextual cues are strong enough for the child to get the gist of an utterance without perhaps being able to understand the details. In such cases there is no question of the child being able to produce language immediately after being first exposed to a particular word or structure. Furthermore, as we have seen, even when a child starts producing a word or structure, it might not be used in the same way as an adult would use it (e.g., children over-extend words). There is more to development than a simple lag, however. The order of comprehension and production is not always preserved: words that are comprehended first are not always those that are produced first (Clark & Hecht, 1983). Early comprehension and production vocabularies may differ quite markedly (Benedict, 1979). There are even cases of words being produced before there is any comprehension of their meaning (Leonard, Newhoff, & Fey, 1980).
SYNTACTIC DEVELOPMENT
We have seen that a stage of single-word speech (called holophrastic speech) precedes a stage of two-word utterances. After this, early speech is telegraphic, in that grammatical morphemes may be omitted. We can broadly distinguish between continuous and discontinuous theories. In continuous theories, children are believed to have knowledge of grammatical categories from the very earliest stages (e.g., Bloom, 1994; Brown & Bellugi, 1964; Menyuk, 1969; Pinker, 1984). The child’s goal is to attach particular words to the correct grammatical categories, and then use them with the appropriate syntactic rules. In discontinuous theories, early multiword utterances are not governed by adult-like rules (Bowerman, 1973; Braine, 1963; Maratsos, 1983). Theoretical approaches also vary depending on the extent to which they emphasize the semantic richness of the early utterances.
How do children learn syntactic categories?
One of the most basic requirements of under-
One important approach says that knowledge about the basic syntactic categories is innate (Pinker, 1984, 1989). Children know that nouns refer to objects and verbs refer to actions. Pinker argued that the child first learns the meaning of some content words, and uses these to construct semantic representations of some simple input sentences. With the surface structure of a sentence and knowledge about its meaning, the child is in a position to make an inference about its underlying structure. Children start off with their innate knowledge of syntactic categories and a set of innate linking rules that relate them to the semantic categories of thematic roles. Thematic roles are a way of labeling who did what in a sentence: For example, in the sentence “Vlad kissed Agnes,” Vlad is the agent (the person or thing initiating the action) and Agnes the patient (the person or thing being acted on by the agent). An innate linking rule relates the syntactic categories of subject and object to the semantic categories of agent and patient, respectively. So on exposure to language, all the child has to do is identify the agents in utterances, and this information then provides knowledge about the syntactic structure. This process is known as semantic bootstrapping (see Figure 4.5).
[Figure 4.5: Semantic bootstrapping theory (Pinker, 1984, 1989). Surviving box labels: “Child has innate knowledge of syntactic categories and linking rules”; “Child learns meaning of some content words”.]
Although nativist accounts have the advantage of providing a simple explanation for many otherwise mysterious phenomena, they have a number of disadvantages. The predictions they make are not always borne out by the data.
First, the theory depends on the child hearing plenty of utterances early on that contain easily identifiable agents and actions relating to what the child is looking at that can be mapped onto nouns and verbs. However, it can sometimes be very difficult to work out the meaning of new words, particularly verbs (Gillette et al., 1999; Gleitman, 1990).
Second, Bowerman (1990) showed that there was little difference in the order of acquisition of verbs that the semantic bootstrapping account predicts should be easiest for children to map onto thematic roles, compared with those that should be more difficult. For example, verbs where the theme maps onto the subject (as is the case with many verbs, such as “fall,” “chased”) should be easier to acquire than verbs where the location, goal, or source maps onto the subject and the theme onto the object (such as “have,” “got,” and “lose”). Instead Bowerman, in an analysis of the speech of her two children, Christy and Eva, found that the two types of verb are acquired at the same time. In general, children do not produce sentences corresponding to the basic structure “agent–action–patient” any earlier than other types of structure.
Third, Braine (1988a, 1988b), in detailed reviews of Pinker’s theory, questioned the need for semantic bootstrapping, and examined the evidence against the existence of very early phrase-structure rules. He argued that semantic information is sufficient for children to be able to learn syntactic categories.
Finally, postulating the possession of specific innate knowledge is very powerful, perhaps too powerful. After all, the processes of language development are slow and full of errors. There is a fine balance between a developmental system
Macnamara, 1972, 1982), which means that the very earliest stages of language development are asyntactic (Goodluck, 1991). A gross distinction is that nouns correspond to objects, adjectives to attributes, and verbs to actions. But although many nouns do indeed refer to objects, others are used to refer to salient abstract concepts (e.g., “sleep,” “truth,” “time,” “love,” “happiness”). So one of the major failings of a semantic approach to early grammar is that semantics alone cannot provide a direct basis for syntax. It is possible, however, that early semantic categories could underlie syntactic categories (McShane, 1991); after all, children learn about objects before they learn about truth and time. Perhaps the category of “noun” is based on a semantic category of objecthood (Gentner, 1982; Slobin, 1981).
According to Schlesinger’s (1988) semantic assimilation theory (see Figure 4.6), early semantic categories develop into early syntactic categories without any abrupt transition. At an early age children use an “agent–action” sentence schema. This can be used to analyze new NP–VP sequences. The important point is that it is possible to give an account of early syntactic development without having to assume that syntactic categories are innate.
[Figure 4.6: Semantic assimilation theory (Schlesinger, 1988). Surviving box labels: “No innate structures”; “Early semantic categories”.]
Macnamara (1972) proposed that the child focuses at first on individual content words so that a small lexicon is acquired. Information pertaining to word order is ignored at this stage. The child combines the meanings of the individual words with the context to determine the speaker’s intended meaning. For example, a child who
alone (Elman, 1990; Finch & Chater, 1992; Mintz, 2003; Redington & Chater, 1998). This approach shows how syntactic categories can be acquired without explicit knowledge of syntactic rules or semantic information. Instead, all that is necessary is statistical information about how words tend to cluster together. This approach also answers the criticism that a distributional analysis of syntactic categories is beyond children’s computational abilities (Pinker, 1984). In particular, some words are ambiguous and belong to multiple syntactic categories. A child hearing the first three sentences might conclude on the basis of distributional analysis alone that the fourth sentence is also acceptable:
(9) Vlad eats fish.
(10) Vlad eats rabbits.
(11) Vlad can fish.
(12) *Vlad can rabbits.
However, computer modeling shows that statistical distributional analysis in fact works very well. MOSAIC is a computer model that has no built-in syntactic knowledge and learns by the distributional analysis of an input of child-directed speech (Freudenthal, Pine, & Gobet, 2005, 2006). It accounts for a range of data in English, Dutch, Italian, and Spanish, fitting the errors that children make and how those errors change over time in the light of further input. Mintz (2003) shows how exposure to words in frequent frames produces extremely accurate categories. To give a very simple example, any word in the X position in “the X laughs” must be a noun.
Researchers currently disagree about how much innate knowledge is necessary before distributional learning can successfully take place. The current trend in research is to show how less knowledge must be innate because the input with which children work is richer than was once realized. For example, Redington and Chater (1998) pointed out that children have access to distributional information in addition to co-occurrence information. For instance, morphology varies regularly with syntactic category and this provides a strong cue to the syntactic function of a word. Words that take the suffixes -s and -ed are typically verbs, but words that only take the suffix -s are typically nouns (Maratsos, 1988). In English bisyllabic words, nouns tend to have stress on the first syllable, but verbs have stress on the second syllable (Kelly, 1992).
Evaluation of work on learning syntactic categories
In summary, the relation between the development of syntax and the development of semantics is likely to be a complex one. Early work emphasized the importance of semantic information in the acquisition of syntactic categories, but more recent work has shown how these categories can be acquired with little or no semantic information. Children probably learn syntactic categories through a distributional analysis of the language, and connectionist modeling has been very useful in understanding how this occurs. It is unlikely that innate principles are needed to learn syntactic categories.
Two-word grammars
Soon after the vocabulary explosion, the first two-word utterances appear. There is a gradation between one-word and two-word utterances in the form of two single words juxtaposed (Bloom, 1973). Children remain in the two-word phase for some time.
Early research focused on uncovering the grammar that underlies early language. It was hoped that detailed longitudinal studies of a few children would reveal the way in which adult grammar was acquired. Early multiword speech is commonly said to be telegraphic in that it consists primarily of content words, with many of the function words absent (Brown & Bellugi, 1964; Brown & Fraser, 1963).
It would be a mistake to characterize telegraphic speech as consisting only of semantically meaningful content words. Braine (1963) studied three children from when they started to form two-word utterances (at about the age of 20 months). He identified a small number of what he called pivot words. These were words that were used frequently and always occurred in the same fixed position in every sentence. Pivot words were not used alone and were not found in conjunction with other pivot words. Most pivot words (called P1 words) were to be found in the initial position, although a smaller group (the P2 words) were to be found in the second position. There was a larger group of what Braine called open words that were used less frequently and that varied in the position in which they were used, but were usually placed second. This idea that sentences are formed from a small number of pivot words is called pivot grammar. Hence most two-word sentences were of the form (P1 + open) words (e.g., “pretty boat,” “pretty fan,” “other milk,” “other bread”) with a smaller number of (open + P2) forms (e.g., “push it”). Some (open + open) constructions (e.g., “milk cup”) and some utterances consisting only of single open words are also found.
Brown (1973) took a similar longitudinal approach with three children named “Adam,” “Eve,” and “Sarah.” Samples of their speech were recorded over a period of years from when they started to speak until the production of complex multiword utterances. Brown observed that the children appeared to be using different rules from adults, but rules nevertheless. This idea that children learn rules but apply them inappropriately is an important concept. They produced utterances such as “more nut,” “a hands,” and “two sock.” Brown proposed a grammar similar in form to pivot grammar, whereby noun phrases were to be rewritten according to the rule NP → (modifier + noun). The category of “modifier” did not correspond to any single adult syntactic category, containing articles, numbers, and some (demonstrative) adjectives and (possessive) nouns. As the children grew older, however, these distinctions emerged, and the grammar became more complex.
Problems with the early grammar approaches
Bowerman (1973) reviewed language development across a number of cultures, particularly English and Finnish. She concluded that the rules of pivot grammar were far from universal. Indeed, they did not fully capture the speech of American children. She confirmed that young children use a small number of words in relatively fixed positions, but did not confirm the other properties ascribed to pivot words. On closer analysis she found that the open class was not undifferentiated: the children instead used a number of classes. Harris and Coltheart (1986) suggested that the children in the Bowerman study might have been linguistically more advanced than those of the earlier studies, and therefore more likely to show increased syntactic differentiation.
Bloom (1970) argued that these early grammatical approaches failed to capture the semantic richness of these simple utterances because they placed too much emphasis on their syntactic structure. The alternative approach, that of placing more emphasis on the context and content of children’s utterances rather than just on their form, became known as rich interpretation. It soon became apparent that two-word utterances with the same form could be used in different ways. In one famous example, Bloom noted that the utterance “mommy sock,” uttered by a child named Kathryn, was used on one occasion to refer to the mother’s sock, and on another to refer to the action of the child having her sock put on by the mother. Bloom argued that it was essential to observe the detailed context of each utterance.
The rich interpretation methodology has its own problems. In particular, the observation of an appropriate context and the attribution of the intended meaning of a child’s utterance to a particular utterance in that context is a subjective judgment by the observer. It is difficult to be certain, for example, that the child really did have two different meanings in mind for the “mommy sock” utterance.
In summary, it is difficult to uncover a simple grammar for early development that is based on syntactic factors alone. An additional problem is that the order of words in early utterances is not always consistent.
Semantic approaches to early syntactic development
The apparent failure of pure syntactic approaches to early development, and the emerging emphasis on the semantic richness of early utterances, led to an emphasis on semantic accounts of early grammars (Schlesinger, 1971; Slobin, 1970). Aspects
4. LANGUAGE DEVELOPMENT 141
ways in which they are used. These early verbs that form the basis of utterances are called “verb islands” (Akhtar & Tomasello, 1997; Tomasello, 1992a, 2000, 2003). Tomasello (2000, 2003) questioned the continuity assumption, the idea that a child’s grammar is adult-like, using the same sort of grammatical rules as adults and with an adult-like linguistic competence. He argued that young children’s syntactic abilities have been greatly overestimated: in particular, they produce far fewer novel utterances than is usually attributed to them. Instead, their language development proceeds in a piecemeal fashion that is based on particular items (mainly verbs), with little evidence of using general structures such as syntactic categories. Lieven, Pine, and Baldwin (1997) found that virtually all their sample of young children (1–3 years old) used verbs in only one type of construction, suggesting that their syntax was built around these particular lexical items. Tomasello emphasizes the importance of syntactic development by analogy-making based on verb islands. The verb-island hypothesis accounts for the data because children are learning some specific high-frequency examples (giving the correct pattern in the first instance) that are then used to form generalizations; however, the application of some of these generalizations sometimes leads to errors. Eventually the child realizes that both rules and exceptions are necessary.
The verb-island hypothesis has generated considerable debate, particularly about whether or not there is a paradox in accounts of early child language. Naigles (2002) argues that at first sight there is a paradox: infants seem to be very good at statistical learning and abstracting general patterns from specific instances, while toddlers are very poor, dealing instead with non-abstract, item-specific information (e.g., the key verb of verb islands). It is as though, as they get older, children actually lose their ability for abstraction. She argues that this difference arises in part from differences in methodologies: Studies on younger children tend to test comprehension, and find more evidence of abstraction, while studies on older children tend to test production, and find more evidence of the use of specific instances. As we have noted before, production is usually more difficult than comprehension. Furthermore, most of the stimuli that test early comprehension tend to involve nonsense words or artificial languages, whereas later production studies usually involve real language where word meaning is involved. Naigles suggests that the patterns the younger children extract are not yet tied to meaning. Toddlers do not lose these early abstractions, but their specific use of them is very limited until they can integrate them with meaning. As she says, learning form is easy, but learning meaning is hard. She argues that there is no reason to suppose that very young children are not making abstractions across syntactic structures, so she resolves the paradox by saying that toddlers do use abstraction. Young children have difficulty extending meaning, not frames. Tomasello and Akhtar (2003) continued the debate (see Naigles, 2003, for a reply), arguing that there is no paradox. They contended that there is converging evidence that up to the age of 3 young children are unable to abstract across syntactic structures, focusing instead on specific items and expressions, and using a few specific syntactic frames. Tomasello and Akhtar argued that diary studies of spontaneous speech, and the production studies where children are taught novel verbs, produce particularly compelling data that toddlers do not form abstract syntactic representations.
If adults hear a particular syntactic structure, they are more likely to use that structure in production in the immediate future, a phenomenon known as structural priming (see Chapter 13 for details). For example, you are more likely to produce a passive construction if you have just heard a passive sentence than if you have just heard an active one. Children over 4 show this structural priming effect; however, children under 4 do not (Savage, Lieven, Theakston, & Tomasello, 2003). One explanation for this finding is that young children have no general syntactic structures to prime, but the finding might also suggest that imitation plays some role in older children.
A third solution is that repeated instances of a verb in particular constructions cause the child to make a probabilistic inference that the verb is only associated with a particular verb-argument structure. The more often children hear a verb used in a particular construction, the less often they should generalize it to a novel input. This idea is called the entrenchment hypothesis (Braine & Brooks, 1995; Theakston, 2004). The more often children hear a verb being used, the less likely they should be to get it wrong. Therefore verb frequency is particularly important here, with over-generalization errors particularly likely on low-frequency verbs. Hence children are more likely to (incorrectly) say that “She arrived her to the park” is grammatical than the similar construction containing the higher frequency verb in “She came me to the school” (Theakston, 2004).
Of course word frequency and the amount of exposure to semantic information are confounded.
An alternative account combines the above accounts. It dispenses with rules and exceptions, and argues that children carry out a type of distributional analysis of verb structures, with semantic information playing an important role (Alishahi & Stevenson, 2005). In this model the acquisition of verb-argument structure is probabilistic. Children learn the argument structures of each specific verb over many specific instances, as well as the more general semantic characteristics of that type of verb. Early on children imitate specific forms, but increasingly rely on generalizations based on general patterns. At first this general information overwhelms the specific information, but as the child encounters more examples of infrequent verbs they come to be able to use those less frequent verbs correctly.
The study of the acquisition of verb-argument structures enables us to make a more general point about how children learn syntax. Clearly an important part of learning is to abstract information out of specific instances. After the age of 3, children are able to combine novel verbs with the appropriate syntactic structures with ease. For example, consider the sentences “Agnes kicked Vlad” and “Agnes kissed Vlad.” There are similarities between these sentences (for example, both are transitive sentences involving agents and objects, as opposed, say, to kickers and things being kicked), but to recognize these similarities requires a level of syntactic abstraction. How early does this abstraction happen? According to late-syntax theories abstraction happens relatively late, suggesting that syntax takes time to be learned and is acquired through abstracted experience, with children early on interpreting sentences with lexical or verb-specific knowledge (Braine, 1992; Lieven, Pine, & Baldwin, 1997; Tomasello, 2003). According to early-syntax theories, abstraction happens relatively early (Fisher, 2002; Naigles, 2002; Pinker, 1984). If abstraction happens early, children must be making use of some additional information, which might be innate (Pinker, 1984), or might arise from the structure of the general cognitive architecture used to learn language (Chang, Dell, & Bock, 2006; Saffran, 2002).
Unfortunately different methodologies give different results and support different theories (Chang et al., 2006). Results using elicited production (getting children to speak) support the late-syntax theory, while results examining comprehension support the early-syntax theory. Even different comprehension tasks give different results. Tasks in which comprehension is assessed by children acting out sentences find that children under 3 do not seem to use word order to comprehend who is acting on whom (Akhtar & Tomasello, 1997). On the other hand, tasks using the preferential-looking technique find that children under 3 do use word order information (Fernandes, Marcus, Di Nubila, & Vouloumanos, 2006; Gertner, Fisher, & Eisengart, 2006). Chang et al. show that a connectionist model that learns and predicts sequences from repeated exposure to grammatical strings of words, and which also makes use of information about the meaning of utterances, can account for the data from both sorts of methodology. The model can simulate both the elicited production and preferential-looking data. Children appear to understand complex structures early on with the preferential-looking task because it provides a choice between two interpretations. The system develops partial structural representations before it can produce correct whole structures. In effect, it has enough information to be able to understand when alternatives are provided, but not enough to be able to produce from scratch.
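The entrenchment hypothesis discussed earlier in this section can be illustrated with a small sketch. The smoothing formula, the parameter `novelty_weight`, and the exposure counts below are assumptions chosen purely for illustration; they are not taken from Braine and Brooks (1995) or Theakston (2004). The only property the sketch is meant to show is that the estimated willingness to accept a verb in an unattested construction falls as attested uses of that verb accumulate.

```python
from collections import Counter


def novel_use_probability(attested: Counter, novelty_weight: float = 1.0) -> float:
    """Estimate how acceptable an unattested construction is for one verb.

    attested maps construction labels to how often the verb was heard in them.
    Assumed formula (additive smoothing): weight / (total exposures + weight).
    """
    total = sum(attested.values())
    return novelty_weight / (total + novelty_weight)


# Hypothetical exposure counts: "come" is heard far more often than "arrive".
heard = {
    "come": Counter({"intransitive": 200}),
    "arrive": Counter({"intransitive": 10}),
}

for verb, counts in heard.items():
    print(verb, round(novel_use_probability(counts), 4))
```

Under these assumed counts the low-frequency verb receives the higher estimate, which is the direction of the over-generalization pattern the hypothesis predicts.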
A more general way of phrasing these questions was put by Lidz et al. (2003): Is word learning driven by observation of the outside world, or is it driven by properties already inside the child? Causative verbs make a particularly good arena for testing this question. In English, causativity and transitivity are entwined: Causative verbs (whose meanings contain some notion of causation) are transitive. For example, the causative verb “kill” (meaning “cause to die”) is transitive, in that it can take an object (“Vlad kills Boris”), whereas “swim” is not causative and is an intransitive verb that cannot take an object. In the Dravidian language Kannada (spoken in the subcontinent of India), however, transitivity is not the best predictor of causativity: There is a causative morpheme which is never present unless the verb is a causative one. How do children come to learn verbs in such a language? The emergentist theory, which says that learning is driven by observation, predicts that for the child the most reliable cue (which will not be transitivity, but the presence of the causative morpheme) will be associated with causativity. The syntactic universalist theory, however, where learning is driven by the properties of the syntax already present in the child, predicts that they should still make most use of transitivity. Lidz et al. found that 3-year-old children largely ignore the causative morphology and make most use of the less useful transitive structures when understanding verbs.
Evaluation of work on early syntactic development
Can early syntactic development be both non-syntactic and non-semantic? The identification of early syntactic categories might occur without much semantic help, and without being based on the acquisition of an explicit grammar. Instead, children seem to learn grammatical categories by distributional analysis. Can this type of approach be extended to account for how children produce two-word and early multiword utterances?
Perhaps children’s early productions are much more limited than has frequently been thought (Messer, 2000). Perhaps their early multiword utterances just statistically reflect the most common types of utterance they hear? According to this view, children have a much less formal grammar than is commonly supposed. Evidence for this comes from the observation that early language use is much less flexible than it would be if children were using explicit grammatical rules (Pine & Lieven, 1997).
In general, the idea that there is a syntax module that drives language development is becoming less popular. It is clear that language development must be seen within the context of social development and the way language is used (Messer, 2000). The shift is also mirrored in Chomsky’s more recent work (1995), where the importance of grammatical rules is much reduced.
Perhaps there is no straightforward way of separating grammatical and lexical development; the two are intertwined (Bates & Goodman, 1997, 1999). For example, grammatical development is related to vocabulary size: The best predictor of grammatical development at 28 months is vocabulary size at 20 months, suggesting that the two share something important (Bates & Goodman, 1999; Fenson et al., 1994). Furthermore, there is no evidence for a dissociation between grammatical and vocabulary development in either early or late talkers: We cannot identify children with normal grammatical development but with very low or high vocabulary scores for their age. Neither is there any evidence of any clear dissociations between grammatical and lexical development in language in special circumstances (such as Williams syndrome and Down’s syndrome). Bates and Goodman (1999) concluded that there is little support for the idea of a separate module for grammar.
In conclusion, recent work tends to downplay the role of an innate grammatical module and the attribution of adult-like grammatical competence to young children.
Later syntactic development
Brown (1973) suggested that the mean length of utterance (MLU) is a useful way of charting the
TABLE 4.3 Mean length of utterance (MLU) and language development. Based on Brown (1973).
Stage | MLU | Characteristics
Stage I | < 2.25 | many omissions, few grammatical words and inflections
Stage III | 2.75–3.5 (c. 3 years) | pluralization, most basic syntactic rules
Stage V | 4+ | imperatives, negatives, questions, reflexives, passives (5–7 years), in that order
progress of syntactic development. This is the mean length of an utterance measured in morphemes, averaged over many utterances. Brown divided early development into five stages based on MLU. Naturally MLU increases as the child gets older; we find an even better correlation with age if single-word utterances are omitted from the analysis (Klee & Fitzgerald, 1985). This approach is rather descriptive and there is little correlation between MLU and age after the age of 5. Nevertheless, it is a convenient and much-used measure (see Table 4.3).
The rule-based nature of linguistic development is clear from the work of Berko (1958). She argued that if children used rules, their use should be apparent even with words the children had not used before. They should be able to use appropriate word endings even for imaginary words. In a famous study, Berko used nonsense words to name pictures of strange animals and people doing odd actions. For example, she would point to a drawing and say: “This is a wug. This is another one. Now there are two __” (see Figure 4.7). The children would fill in the gap with the appropriate plural ending “wugs.” In fact, they could use rules to generate possessives (“the bik’s hat”), past tenses (“he ricked yesterday”), and number agreement in verbs (“he ricks every day”).
The order of acquisition of grammatical morphemes is relatively constant across children (James & Khan, 1982). The earliest acquired is the present progressive (e.g., “kissing”), followed by spatial prepositions, plurals, possessives, articles, and the past tense in different forms.
Inflecting verbs: Acquiring the past tense
The development of the past tense has come under particular scrutiny. Brown (1973) observed that the youngest children use verbs in uninflected forms (“look,” “give”). He argued that children seem to be aware of the meaning of the different syntactic roles before they can use the inflections. That is, the youngest children use the simplest form to convey all of the syntactic roles. They learn to use the appropriate inflections very quickly: past tenses to convey the sense of time (usually marked by adding “-ed”), the use of the “-ing” ending, number modification, and modification by combination with auxiliaries. However, although regular verbs can be modified by applying a simple rule (e.g., form the past tense by adding “-ed”), a large number of verbs are irregular.
FIGURE 4.7 “This is a wug. Now there is another one. There are two of them. There are two ________.”
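The MLU measure introduced earlier in this section is simple arithmetic, and a short sketch may make the definition concrete. The sample utterances, their hand segmentation into morphemes, and the function name below are illustrative assumptions, not Brown's (1973) coding scheme; the optional flag mirrors the suggestion of omitting single-word utterances (Klee & Fitzgerald, 1985).

```python
def mean_length_of_utterance(utterances, omit_single_word=False):
    """Return the mean number of morphemes per utterance.

    `utterances` is a list of utterances; each utterance is a list of words,
    and each word is a list of its morphemes (a hand-coded, illustrative format).
    """
    if omit_single_word:
        # Variant in which one-word utterances are left out of the average.
        utterances = [u for u in utterances if len(u) > 1]
    if not utterances:
        raise ValueError("no utterances to score")
    morphemes_per_utterance = [sum(len(word) for word in u) for u in utterances]
    return sum(morphemes_per_utterance) / len(morphemes_per_utterance)


# Hypothetical sample, segmented by hand into morphemes.
sample = [
    [["doggie"]],                         # 1 morpheme
    [["want"], ["cookie"]],               # 2 morphemes
    [["two"], ["dog", "s"]],              # 3 morphemes
    [["he"], ["kiss", "ing"], ["baby"]],  # 4 morphemes
]

mlu_all = mean_length_of_utterance(sample)
mlu_multiword = mean_length_of_utterance(sample, omit_single_word=True)
print(mlu_all, mlu_multiword)
```

For this toy sample the MLU is 10/4 = 2.5 over all utterances and 9/3 = 3.0 once the single-word utterance is dropped. In practice the hard part is deciding what counts as a morpheme, which this sketch sidesteps by taking segmented input.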
146 B. THE BIOLOGICAL AND DEVELOPMENTAL BASES OF LANGUAGE
The time course of development of irregular verbs and nouns is an example of U-shaped development. Behavior changes from good performance, to poor performance, before improving again. Early on, children produce both regular and irregular forms. Importantly, in the poor performance phase, children make a large number of over-regularization errors (e.g., Brown, 1973; Cazden, 1968; Kuczaj, 1977). Later on they can produce both the regular and irregular forms once again.
One explanation of this pattern is that the youngest children have just learned specific instances. They then learn a rule by induction (e.g., form the past tense by adding -ed to verbs, form plurals by adding -s to nouns) and apply this in all cases. Only later do they start to learn the exceptions to the rule. Hence children develop a past-tense formation system with two separate routes: a symbolic system that uses a rule to generate regular forms, and a route accessing a separate listing of irregular forms (Pinker, 1994, 1999). Evidence for a dual-route model comes from several dissociations of performance on regular and irregular verbs. Patients with fluent aphasia (see Chapter 13) tend to be worse at reading and producing irregular forms than regular forms, while patients with non-fluent aphasia tend to be relatively worse at processing the regular forms. Imaging data suggest the processing of regular and irregular forms involves different parts of the brain. PET imaging suggests that only Broca’s area is activated when processing regular past tenses, but the temporal lobes of the brain are involved in processing irregular past tenses (Jaeger et al., 1996). fMRI data suggest that while the posterior temporal lobes are involved in processing both regular and irregular forms, only regular forms produce activation around the frontal gyrus (Pinker & Ullman, 2002). There is also evidence that regular and irregular plurals are processed in different ways. Clahsen (1999) argued that experimental and neuroimaging work on plural formation in German suggests that the language system is divided into a lexicon and a computational system that, among other things, generates regular forms. Patients with Alzheimer’s disease are relatively worse at irregular forms, while patients with Parkinson’s disease are relatively worse at regular forms. More controversially, children with Williams syndrome may fare worse with irregular forms, while children with specific language impairment (SLI) fare worse with regular forms (Pinker, 1994, 1999; see Thomas & Karmiloff-Smith, 2003, for a review). A problem for the dual-route account of acquisition is that regular and irregular forms coexist; the proportion of over-regularizations never rose above 46% in 14 children studied by Kuczaj (1977), suggesting that a very general, powerful rule is not learned.
An alternative account, connectionist modeling of the acquisition of the past tense, has generated substantial controversy. The basic idea of these models is that we do not need two distinct routes to produce regular and irregular forms; instead, knowledge of regular forms comes from knowledge about phonological regularities, whereas knowledge of irregular forms comes from lexical-semantic knowledge. fMRI data suggest that it is the phonological characteristics of the past tense forms that are important for determining which brain regions are activated: Irregular forms that sound as if they could be regular forms (e.g., “slept,” “sold”) produce a pattern of activation similar to regular forms (Joanisse & Seidenberg, 2005). Rumelhart and McClelland (1986) simulated the acquisition of the past tense using back-propagation. The input consisted of the root form of the verb, and the output consisted of the inflected form. The training schedule was particularly important, as it was designed to mimic the type of exposure that children have to verbs. At first the model was trained on 10 of the highest frequency words, 8 of which happened to be irregular. After 10 training cycles, 410 medium-frequency verbs were introduced for another 190 learning trials. Finally 86 low-frequency verbs were introduced. The model behaved as children do: it initially produced the correct output, but then began to over-regularize. Rumelhart and McClelland pointed out that the model behaved in a rule-like way, without explicitly learning or having been taught a rule. Instead, the behavior emerged as a consequence of the statistical properties of the input. If true, this might be an important general point about language development.
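The dual-route account described earlier in this section (rote-learned instances, then a rule applied everywhere, then a rule plus a list of exceptions) can be caricatured in a few lines of code. This is only a sketch under strong assumptions: the stage names, the tiny word lists, and the spelling rule are hypothetical, and the all-or-none stages overstate the pattern, since over-regularization in real children is far from total (Kuczaj, 1977).

```python
# Hypothetical mini-lexicons; the verbs are chosen only for illustration.
IRREGULAR_PAST = {"go": "went", "give": "gave", "throw": "threw", "hit": "hit"}
ROTE_LEARNED = {"go": "went", "give": "gave", "walk": "walked"}

STAGES = ("rote", "rule", "dual")


def regular_rule(stem):
    """Regular route: add "-ed", or just "-d" after a final "e".

    Other spelling changes (e.g., consonant doubling) are ignored.
    """
    return stem + "d" if stem.endswith("e") else stem + "ed"


def past_tense(stem, stage):
    """Produce a past tense form at one of three idealized stages."""
    if stage == "rote":
        # Only memorized instances; unknown verbs stay uninflected.
        return ROTE_LEARNED.get(stem, stem)
    if stage == "rule":
        # The rule is applied in all cases, so irregulars are over-regularized.
        return regular_rule(stem)
    if stage == "dual":
        # Listed exceptions are looked up first; the rule handles the rest.
        return IRREGULAR_PAST.get(stem, regular_rule(stem))
    raise ValueError(f"unknown stage: {stage}")


for verb in ("go", "throw", "walk", "rick"):
    print(verb, [past_tense(verb, stage) for stage in STAGES])
```

The irregular verb “go” comes out as “went,” then “goed,” then “went” again, which is the U-shaped curve in miniature, while the nonsense verb “rick” is inflected as “ricked” as soon as the rule is available, as in Berko's study.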
What are the problems with this account? Pinker and Prince (1988) made the most substantial criticisms of this work. They noted that irregular verbs are not really totally irregular. It is possible to predict which verbs are likely to be irregular, and the way in which they will be irregular. This is because irregular verbs still obey the general phonological constraints of the language. Hence it is possible that irregular forms are derived by general phonological rules. In addition, the way in which some verbs have both regular and irregular past tenses, and the way in which they are inflected, depends on the semantic context (“hang” and “hanged” and “hung,” and “ring” and “ringed” and “rung,” for example). The network also made errors of a type that children never produce (e.g., “membled” for the past tense of “mail”). Pinker and Prince also pointed out that there is no explicit representation for a word in Rumelhart and McClelland’s (1986) model. Instead, it is represented as a distributed pattern of activation. However, words as explicit units play a vital role in the acquisition process. Pinker and Prince also argued that the simulation’s U-shaped development resulted directly from its training schedule. The drop in performance of the model occurred when the number of regular verbs in the training vocabulary was suddenly increased. There is no such discontinuity in the language to which young children are exposed. Obtaining the U-shaped curve also depended on having a disproportionately large number of irregular verbs in the initial training phase. This is not mirrored by what children are actually exposed to. Finally, the way in which the medium-frequency, largely regular verbs are all introduced in one block on trial 11 is quite unlike what happens to children, where exposure is cumulative and gradual (McShane, 1991).
Plunkett and Marchman (1991, 1993) argued that connectionist networks can model the acquisition of verb morphology, but many more factors have to be taken into account. In particular, they proposed that the training set must more realistically reflect what happens with children. Rather than present all the verbs to be learned in one go, or with a sudden discontinuity as in the original Rumelhart and McClelland model, they gradually increased the number of verbs the system must learn, to simulate the gradual increase in children’s vocabulary size. They concluded that a network could display U-shaped learning even when there are no discontinuities in the training. MacWhinney and Leinbach (1991) reached similar conclusions. Nevertheless, some problems remain (Clahsen, 1999; Marcus, 1995). Obtaining the U-shaped curve in modeling seems to depend on presenting the training stimuli in a certain way; in particular, it depends on sudden changes in the training regime, in contrast to the smooth changes of input that children are faced with. Furthermore, connectionist models make more irregularization errors than children. It is possible that the single-route mechanism actually fits the child data better than rule-based accounts (Marchman, 1997). In particular, children are more likely to regularize irregular verbs that are similar to other verbs that behave in a regular way. For example, “throw” forms an irregular past tense as “threw.” There are other verbs like it, however, that form their past tenses in a regular way (e.g., “flow,” “show”). An irregular verb like “hit,” however, has no competing enemies. As the connectionist constraint-based model predicts, children are more likely to produce “throwed” than “hitted.”
One outcome of the modeling work by Rumelhart and McClelland has been to focus attention on the details of how children acquire skills such as forming the past tense (e.g., Marchman & Bates, 1994; Marcus et al., 1992). We now know much more than we did before. A general problem with the connectionist accounts is that these models need explicit feedback in order to learn. As we have seen, the extent and influence of explicit feedback in real language development is limited. One frequent counter to this objection is that the modeling is merely demonstrating the principle that association and statistical regularities in the language can account for the phenomena without recourse to explicit rules, and the details of the learning mechanisms involved are not important in this respect. Another possibility is that as
children listen to speech, they make predictions about what comes next. They can then match the predictions to the actual input. However, there is presently little evidence that this happens (Messer, 2000).
Finally, computational modeling shows how developmental disruption to past-tense acquisition can account for the apparent dissociation between the patterns of acquisition shown in Williams syndrome and SLI (Thomas & Karmiloff-Smith, 2003). Rather than a static model, whereby children come with two routes, one of which is either spared or destroyed, high-level deficits (past-tense formation) can arise from relatively low-level deficits (phonological processing and the lexical-semantic system) in conjunction with the effects of development and compensation.
Individual differences in language development
The way in which adults talk to children appears to have an effect as the child gets older: There are large individual differences in the ability of preschool children to form and understand syntactically complex sentences, and the quality of what children hear correlates highly with these differences (Huttenlocher, Vasilyeva, Cymerman, & Levine, 2002). Children who hear complex structures master them earlier. Even here, it is difficult to be certain about what is causal. The most important source of input for young children is their parents, so we cannot rule out genetic factors: Syntactic complexity in parent and child might reflect parent–child genetic similarity. However, the language of teachers also comes to have an effect: The syntactic abilities of children taught by teachers who use syntactically more complex speech develop faster than those of children taught by teachers who use simpler constructions (Huttenlocher et al., 2002). Hence language input does play a role.
Cross-linguistic differences in language development
Languages differ in their syntactic complexity. For example, English is relatively constrained in its use of word order, whereas other languages (such as Russian) are more highly inflected and have freer word order. Not surprisingly, these differences lead to differences in the detail of language development.
What is perhaps surprising is the amount of uniformity in language development across languages. For example, stage 1 speech (covering the period with the first multiword utterances, up to MLU of 2.0) seems largely uniform across the world (Dale, 1976; Slobin, 1970). There are of course some differences: Young Finnish children do not produce yes–no questions (Bowerman, 1973). This is because you cannot form questions by rising intonation in Finnish, so speakers must rely on an interrogative inflection. Some differences emerge in later development. Plural marking is an extremely complex process in Arabic, but relatively simple in English. Hence plural marking is acquired early in English-speaking children, but is not entirely mastered until the teenage years for Arabic-speaking children (see McCarthy & Prince, 1990; Prasada & Pinker, 1993). In complex inflectional languages such as Russian, development generally progresses from the most concrete (e.g., plurals) first to the most abstract later (e.g., gender usually has no systematic semantic basis; see Slobin, 1966b).
The development of syntactic comprehension
More complicated syntactic constructions naturally provide the child with a number of challenges. The youngest children have difficulty with passives because they are inappropriately applying the standard canonical order strategy, which simply says that the subject of the sentence is the agent. Older children (around 3 years old) start to map the roles of passives as adults do, but they make mistakes depending on the semantic context of the utterance. Children have particular difficulty with reversible passives, when the subject and object can be reversed and the sentence still makes sense (such as “Vlad was kissed by Agnes”).
Here there are no straightforward semantic cues available to assist them. M. Harris (1978) showed that animacy is an important cue in the development of understanding passives. Animate things tend to get placed earlier in the sentence. Hence, in a picture description task, when the object being acted on was animate (such as a boy being run over by a car), a passive construction tended to be used to put the animate object first (“the boy was run over by the car”). The type of verb also matters: Young children find passives with action verbs easier to manipulate than stative verbs such as “remember” (Sudhalter & Braine, 1985).
More recently eye-tracking has been used to investigate how children understand sentences. Trueswell, Sekerina, Hill, and Logrip (1999) used head-mounted eye-trackers to discover where children looked in a scene as they responded to ambiguous spoken instructions to move objects about that scene. As we shall see in Chapters 10 and 14, adults can make use of many sources of information to resolve ambiguous instructions such as “Put the frog on the napkin in the box,” and are also very good at revising their initial interpretations if they turn out to be wrong. Five-year-old children did not use context to resolve ambiguous structures and were unable to revise their initial interpretation. Children always preferred the “destination” interpretation (put the frog on the napkin) rather than the “modifier” interpretation (take the frog that is on the napkin and put it in the box), regardless of the visual context. Young children therefore use different principles to understand sentences; little is known about the way in which these principles turn into their adult equivalent.
The development of comprehension skills is a long and gradual process with no clear-cut end point (Hoff-Ginsberg, 1997). Markman (1979) found that a significant number of 12-year-olds erroneously judged that (13) made sense (I had to read it twice myself to find the problem):
(13) There is absolutely no light at the bottom of the ocean. Some fish that live at the bottom of the ocean know their food by its color. They will only eat red fungus.
[Photo caption] An eye-tracker can be used to record and store information about an observer’s eye fixations. Trueswell et al. (1999) used this method to discover where children looked in a scene as they responded to instructions to move objects about that scene.
SUMMARY
• Rationalists believed that knowledge was innate, whereas empiricists argued that it arose from experience.
• An analysis of the effects of correcting speech on young children shows that language acquisition cannot be driven just by imitation or reinforcement.
• Because the linguistic input that children hear does not seem to contain sufficient information (it is an impoverished input), Chomsky proposed that they have an innate Language Acquisition Device.
• In particular, he argued that we are born with a fixed set of switches (parameters), the positions of which are set by exposure to particular languages.
• In practice it has proved difficult to identify these parameters, and to explain how bilingual children and children using sign language use them.
• Human languages have a surprising amount in common; this might be because they are all derived from the same universal grammar.
• There are different types of linguistic universals; some show how a particular aspect of language may have implications for other features.
• The drive to use language in general and rules of word order in particular is so great that children develop them even if they are absent from their input.
• Young children move from babbling to one-word or holophrastic speech, through abbreviated or telegraphic speech, before they master the full syntactic complexity of their language.
• Correcting children’s errors makes surprisingly little difference to their speech patterns.
• Adults speak to young children in a special way; this child-directed speech (CDS for short; sometimes called “motherese”) simplifies the child’s task in acquiring language.
• CDS is clear, and what is being talked about is usually obvious from the context.
• As CDS is not used by all cultures it may not be necessary for language development, although it might facilitate it.
• There are specific language impairments (SLIs) that are genetically marked, although the precise nature of the impairment is disputed.
• All young children go through a stage of babbling, but it is not clear how the sounds they make are related to the sounds of the language to which they are exposed.
• Infants are born with rich speech-perception abilities.
• It is likely that babbling serves to enable infants to practice articulatory movements and to learn to produce the prosody of their language.
• There is an explosion in children’s vocabulary at around 18 months.
• There have been a number of proposals for how children learn to associate the right word with things in the world, including lexical constraints, innate concepts, syntactic cues, and social-pragmatic cues.
• Young children make errors in the use of words; in particular, they occasionally over-extend them inappropriately.
• A number of models have been proposed to account for over-extensions; one of the most influential has been the idea that the child has not yet acquired the appropriate semantic features for a word.
• Later semantic development depends on conceptual and syntactic factors.
• A number of mechanisms have been proposed for how children learn the syntactic categories of words.
• One view is that knowledge of syntactic categories and how objects and actions are mapped onto nouns and verbs is innate.
• Once children have learned a few correspondences, their progress can be much faster because of bootstrapping.
• According to the constructivist or meaning-first view, there is an early asyntactic phase of development, which is driven only by semantic factors.
• More recent approaches have focused on the idea that children monitor the distribution of words and use co-occurrence information to derive syntactic categories.
• Braine proposed that two-word grammars were founded on a small number of “pivot” words that were always used in the same position in sentences.
• Purely grammatical approaches to early speech have difficulty in explaining all the utterances children make, and ignore the semantic context in which the utterances are made.
• The acquisition of past tenses is best described by a U-shaped pattern, as children go from perfect performance on irregular verbs through a phase of incorrectly regularizing them, before using the correct irregular forms again.
• There has been much debate as to whether the learning of the past tense is best explained by the acquisition of specific rules or by constraint-based models based on connectionist modeling.
1. What cognitive processes do you think need to be innate for language development to occur?
2. Throughout this chapter we have talked of “language development” or “language acquisition”
rather than (first) language learning. What is the advantage of avoiding the term “language
learning”?
3. To what extent are the errors that children make like the errors adult speakers routinely make?
(You might need to read Chapter 13 before attempting this question.)
4. Consider the first words made by someone you know. (You might be able to discover your
own.) What do you think accounts for them?
5. Produce a detailed summary of the time course of language development.
6. To what extent is the telegraphic speech of young children like the agrammatic speech of some
aphasics (see Chapter 13)?
7. In some studies with young infants children pay attention for longer to easy or familiar stimuli,
whereas in others they attend longer to unfamiliar material. What might determine when each
of these happens?
FURTHER READING
Many texts describe language development in far more detail than can be attempted in a single chapter:
see, for example, Hoff-Ginsberg (1997) and Owens (2004) for an introductory approach.
Hoff-Ginsberg includes very good descriptions of language development in special circumstances.
Messer (2000) is a very short review of the main themes. See Bloom (1998) for another good
review with an emphasis on the effect of the context of development.
See Werker and Yeung (2005) for a review of early speech perception and word learning. Bloom
(2001a) reviews work on how children learn the meaning of words; Bloom (2001b) is a summary of
the book, with a commentary. See also Hollich, Hirsh-Pasek, and Golinkoff (2000) for word learn-
ing. Although we have focused on nouns and verbs, we should not forget that there are other catego-
ries of words; see Mintz and Gleitman (2002) for work on how children learn adjectives.
There are several introductions to Chomsky’s work that cover his ideas on language, language
development, syntax (see Chapter 2), and sometimes his political ideas as well. See Cogswell and
Gordon (1996), Lyons (1991), and Maher and Groves (1999). A convincing defense of the position
that language has an important innate component is presented in a very approachable way by Pinker
(1994); see Pinker (1989) for more on formal approaches to language development. See Leonard
(2000) for a review of SLI. For more on language development as parameter setting, see Stevenson
(1988). Cook and Newson (2007) provide a great deal of material on Chomsky’s work, with particular
emphasis on language development. In particular, they provide a very clear account of the
poverty of the stimulus argument. See McClelland and Seidenberg (2000) and Seidenberg and Elman
(1999) for critiques of nativism. For more on early phonological and segmentation skills, see Saffran,
Werker, and Werner (2006). See Vihman (1996) for more on phonological development.
MacWhinney (1999) is an edited collection with an emphasis on how language is an emergent
property. Elman et al. (1996) discuss how connectionism has changed our view of what it means
for something to be innate. Their emphasis is on how behavior arises from the interactions between
nature and nurture. Plunkett and Elman (1997) provide practical examples of connectionist modeling
relevant to this in a simulation environment called tlearn. See Deacon (1997) for a review of the
biological basis of language, how it might have evolved, how humans differ from animals, and how
language might constrain language learning.
Broeder and Murre (2000) present a collection of articles that emphasizes computational modeling
of language development.
For a review of work on past-tense formation, see Clahsen (1999). Altmann (1997) has a good
section on the phonological skills of infants.
CHAPTER 5
BILINGUALISM AND SECOND
LANGUAGE ACQUISITION
Peal & Lambert, 1962). For example, Lambert, Tucker, and d’Anglejan (1973) found that children in the Canadian immersion program (for learning French) tended to score more highly on tests of creativity than monolinguals. Bilingual children, compared with monolingual children, show an advantage in knowing that a word is an arbitrary name for something (Hakuta & Diaz, 1985).
Although some researchers have argued that there is no obvious processing cost attached to being bilingual (e.g., see Nishimura, 1986), others have found indications of interference between L1 and L2 (see B. Harley & Wang, 1997, for a review). For example, increasing proficiency in L2 by immigrant children is associated with reduced speed of access to L1 (Magiste, 1986). B. Harley and Wang (1997, p. 44) conclude that “monolingual-like attainment in each of a bilingual’s two languages is probably a myth (at any age).”
On the other hand, there is now an overwhelming body of research showing that bilingualism confers a general cognitive advantage in the form of enhanced flexibility. There is even evidence that being bilingual protects people to some extent against developing Alzheimer’s disease by helping to build up the mind’s “cognitive reserve” that slows down cognitive aging (Bialystok, Craik, & Luk, 2012).
Bilingual language processing
How many lexicons does a bilingual speaker possess? Is there a separate store for each language, or just one common store? In separate-store models, there are separate lexicons for each language. These are connected at the semantic level (Potter, So, von Eckardt, & Feldman, 1984). Evidence for the separate-stores model comes from the finding that the amount of facilitation gained by repeating a word (a technique called repetition priming) is much greater and longer lasting within than between languages (Kirsner, Smith, Lockhart, King, & Jain, 1984), although repetition priming might not be tapping semantic processes (Scarborough, Gerard, & Cortese, 1984). In common-store models, there is just one lexicon and one semantic memory system, with words from both languages stored in it and connected directly together (Paivio, Clark, & Lambert, 1988). This model is supported by evidence that semantic priming produces facilitation between languages (e.g., Chen & Ng, 1989; Jin, 1990; Schwanenflugel & Rey, 1986; see Altarriba, 1992, and Altarriba & Mathis, 1997, for a review). Studies that minimize the role of attentional processing and participants’ strategies, and that maximize automatic processing (e.g., by masking the stimulus, or by varying the proportion of related pairs; see Chapter 6), suggest that equivalent words share an underlying semantic representation that can mediate priming between the two words (Altarriba, 1992). Most of the evidence now tends to favor the common-store hypothesis. However, early and late learners show different patterns of cross-language priming, with late learners showing much less priming (Silverberg & Samuel, 2004), suggesting once again that age-of-acquisition is critical in how bilinguals represent and access words, with late learners having separate lexicons mediated at the conceptual levels.
Another possibility is that some people use a mixture of common and separate stores (Taylor & Taylor, 1990). For example, concrete words, cognates (words in different languages that have the same root and meaning and which look similar), and culturally similar words act as though they are stored in common, whereas abstract and other words act as though they are in separate stores. Also steering between the common- and separate-stores models, Grosjean and Soares (1986) argued that the language system is flexible in a bilingual speaker, and that its behavior depends on the circumstances. In unilingual mode, when the input and output are limited to only one of the available languages, and perhaps when the other speakers involved are unilingual in that language, interaction between the language systems is kept to a minimum; the bilingual tries to switch off the second language. In the bilingual mode, both language systems are active and interact. How speakers have strategic control over their language systems is a topic that largely remains to be explored.
What happens when a bilingual speaker hears or sees a word? How do they prevent the two languages from interfering with one another? Bilingual speakers must have mechanisms in place
to prevent interference. In an event-related potential (ERP) study, bilingual Spanish–Catalan speakers were instructed to press a button when they saw a word in one of the languages, and to ignore words in the other (Rodriguez-Fornells, Rotte, Heinze, Nosselt, & Munte, 2002). The brain potentials of the participants showed that they were not sensitive to the frequency of the words in the ignored language, suggesting that the words did not reach a high level of processing. However, fMRI activation had a lot in common with the way in which we process nonwords. This pattern of results suggests that speakers use quite low-level information to block words in the non-target language at a very early stage, such that the meanings of these words do not become activated. Further evidence for this low-level blocking of the non-target language comes from an electrophysiological study of very fluent Italian–Slovenian bilinguals. The pattern of activation while reading suggested that discrimination between the two languages is taking place at a very early stage (Proverbio, Cok, & Zani, 2002).
Bilingual syntactic processing
There has been much less research on how bilingual people process syntax than there has on how they process individual words. The issues are much the same: for languages that use similar sorts of construction, do people store syntactic knowledge separately for each language, or just once, in a shared store? A study of Spanish–English bilingual speakers found that a particular syntactic structure in one language could make it easier to use the same structure in the second language, supporting the “shared syntax” idea (Hartsuiker, Pickering, & Veltkamp, 2004). Similarly, Loebell and Bock (2003) found that production of German datives primed the subsequent use of English datives, and vice versa. Similar results have been found in Dutch–English bilinguals (Salamoura & Williams, 2006).
Kroll and Stewart (1994) proposed that translation by second-language novices is an asymmetric process. They argued that we translate words from our first language into the second language (called forward translation) by conceptual mediation. This means that we must access the meaning of a word in order to translate it. In contrast, we translate from the second language into the first (called backward translation) by word association—that is, we use direct links between items in the lexicon (see Figure 5.1). The evidence for this asymmetry is that semantic factors (such as the items to be translated being presented in semantically arranged lists) have a profound effect on forward translation, but little or no effect on backward translation. In addition, backward translation is usually faster than forward translation.
Having said this, there is some evidence that backward translation (from L2 to L1) might also be semantically mediated. De Groot, Dannenburg, and van Hell (1994) found that semantic variables such as imageability affect translation times in backward translation, although to a lesser extent than in forward translation. La Heij, Hooglander, Kerling, and van der Velden (1996) found that backward translation was facilitated by the presence of congruent pictures and hindered by incongruent pictures, suggesting that the translation involves accessing semantics. Hence it is likely that translation in both directions involves going through the semantic representations of the words. It is also probable that the extent of conceptual mediation increases as the speaker becomes more proficient in L2.
[Figure: Translation between L1 and L2 (Kroll & Stewart, 1994). Forward translation via conceptual mediation.]
Picture–word interference studies suggest that in production only words of the target language are ever considered for selection. Many studies have shown that words in different languages interfere with one another (e.g., Ehri & Ryan, 1980). For example, it takes Catalan–Spanish bilinguals longer to name the picture of a table in Catalan if the Spanish word for chair is the distractor rather than an unrelated word. Costa, Miozzo, and Caramazza (1999) presented Catalan–Spanish bilinguals with pictures to name in Catalan. In their experiment, the name of the picture (not the name of a word related in meaning) was printed on top of the picture either in Catalan (same-language pairs) or Spanish (different-language pairs). The critical condition is the different-language pair. If choosing a word is not language-specific, the different-language condition should cause a great deal of interference, as the word written in Spanish and the name of the picture in Catalan will compete with each other. But if choosing a word is language-specific, then the Spanish distractor name should not be able to compete with the Catalan word. Instead, if anything, it should facilitate the production of the Catalan name through the intermediary of its meaning. Costa et al. found the latter: Having the name of the picture printed above the target picture in the non-response language led to facilitation. This finding suggests that only words of the target language are ever considered for output.
A different picture holds for auditory comprehension. Eye-tracking studies suggest that both languages are automatically considered. When bilingual people look at visual scenes searching for particular items in the first language, they also look at items with a name starting the same in the second, irrelevant language (Marian & Spivey, 2003; Spivey & Marian, 1999). For example, when an English–Russian bilingual looks for a “spear” in a visual array, they will also glance at a box of matches, because its name in Russian (“spichki”) overlaps substantially with the English word.
Models of bilingualism
The most influential model of bilingualism that attempts to tell a complete story of the psychological processes involved is the Bilingual Interactive Activation Plus (BIA+) model (a development of the original BIA model to include phonological and sublexical levels of processing; see Dijkstra & van Heuven, 2002; Dijkstra, van Heuven, & Grainger, 1998). The model attempts to bring together all types of evidence concerning the orthographic processing of two languages, but makes particular use of how we recognize cognates—words that look the same (or very similar) in the two languages (such as “silence” in English and French, or “animal” in English and Spanish). In the BIA+ model, lexical access is non-language specific in its earliest stages, so words from both languages are activated, whatever the input. The model comprises a network of nodes at each level of representation (e.g., words, phonemes), connected together by facilitatory and inhibitory connections. The model is purely bottom-up in the sense that word recognition cannot be affected by the particular task (e.g., naming, lexical decision) being carried out. The model is characterized by “language” nodes, which tag representations according to the language to which they belong. The “language” nodes can receive activation from words (bottom-up) but can also send top-down inhibition. Recent work has centered on how bilingual processing is localized in the brain (e.g., Moreno & Kutas, 2009).
The neuroscience of bilingualism
There is some evidence that bilinguals with right-hemisphere damage show more aphasia (crossed aphasia) than monolinguals (Albert & Obler, 1978; Hakuta, 1986). Crossed aphasia might arise because the right hemisphere is involved in L2 acquisition, particularly if L2 is acquired relatively late (Martin, 1998; Obler, 1981; Vaid, 1983), or because language is less asymmetrically represented in the two hemispheres in bilingual speakers—although this is highly controversial (Obler & Hannigan, 1996; Paradis, 1997). An ERP study of responses to words in 19–22-month-old English–Spanish bilingual children showed that the more dominant language becomes lateralized before the less dominant one (Conboy & Mills, 2006). In addition to the types of aphasia shown by individuals who speak only one language, brain damage sometimes causes additional disorders in people who speak two languages. For example, we can observe pathological switching and mixing of languages, and difficulties in translating between the languages.
Colored computed tomography (CT) scans of horizontal sections through different levels of a stroke victim’s brain. (The front of the brain is at the top in each image.) The stroke has resulted in internal bleeding (white/orange). The mass of blood (hematoma) extends up and down in the brain as well as across the left hemisphere, and has ruptured the ventricles (black) that carry the brain’s cerebrospinal fluid. This brain damage caused aphasia as well as paralysis of one side of the body.
The most interesting issue is the extent to which processing of different languages tends to be localized in different parts of the brain. One of the first reports of this was by Scoresby-Jackson, describing the case of an Englishman who, after a blow to the head, selectively lost his knowledge of Greek. Since then there have been a number of reports of the selective impairment of one language following brain damage, and many more of differential recovery of the two languages (see Fabbro, 2001; Obler & Hannigan, 1996; Paradis, 1997). The evidence is consistent with two independent language systems connected at the conceptual level.
Imaging suggests that the time of acquisition most affects the grammatical aspects of language. The lexicons of both early and late bilinguals are organized similarly. However, individuals who acquire the second language after the age of 7 show different organization (Fabbro, 2001). In particular, in early-acquisition bilinguals, closed- and open-class words are stored in different parts of the brain; in late-acquisition bilinguals closed-class words are stored with open-class words. There are other differences in comprehension between monolinguals and bilinguals. Bilinguals are generally slower to respond to linguistic stimuli, regardless of what language the stimuli are in (Green, 1986; Proverbio et al., 2002). Electrophysiological measures show complex differences in reading and comprehension (Proverbio et al., 2002).
SECOND LANGUAGE ACQUISITION
Second language acquisition happens when a child or an adult has already become competent at a language and then attempts to learn another. We should distinguish between learning a second language naturalistically (e.g., when a child or person moves to a new country) and class-based instruction.
There are a number of reasons why a person might find learning a second language difficult. First, we saw in Chapter 3 that some aspects of language learning, particularly involving syntax, are more difficult outside the critical period. Second, older children and adults often have less
5. BILINGUALISM 159
Traditional method:
Direct translations from L1 to L2
Lectures in grammar in L1
FIGURE 5.2
that together form the monitor model of second language learning (see Figure 5.3). Central to his approach is a distinction between language learning (which is what traditional methods emphasize) and language acquisition (which is more akin to what children do naturally). Learning emphasizes explicit knowledge of grammatical rules, whereas acquisition emphasizes their unconscious use. Although learning has its role, to be more successful second language acquisition should place more emphasis on acquisition.
The first of the five hypotheses is the acquisition and learning distinction hypothesis: children acquire their first language largely unconsciously and automatically—they do not learn it. Earlier views that emphasized the importance of the critical period maintained that adults could only learn a second language consciously and effortfully. Krashen argued that adults could indeed acquire the second language. The second hypothesis is the natural order in acquisition hypothesis. The order of acquisition of syntactic rules, and the types of errors of generalization made, are the same in both languages.
The third and fourth hypotheses are central to Krashen’s approach. The third hypothesis is the monitor hypothesis. It states that the acquisition processes create sentences in the second language, but learning enables the development of a monitoring process to check and edit this output. This can only happen if there is sufficient time in the interaction; hence it is difficult to employ the monitor in spontaneous conversation. The monitor uses knowledge of the rules rather than the rules themselves (in a way reminiscent of Chomsky’s distinction between competence and performance). The fourth hypothesis is the comprehensible input hypothesis. In order to move from one stage to the next, the acquirer must understand the meaning and the form of the input. This hypothesis emphasizes the role of comprehension. Krashen argues that production does not need to be explicitly taught: it emerges itself in time, given understanding, and the input at the next highest level need not contain only information from that level. Finally, the affective filter hypothesis says that attitude and emotional factors are important in second language acquisition, and that they account for a lot of the apparent difference in the facility with which adults and children can learn a second language.
Krashen’s approach provides a useful framework, and has proved to be one of the most influential theoretical approaches to teaching a second language. More recent work has moved away from the idea that acquisition and learning are so very different, emphasizing the practicalities of how learners can best acquire novel material, and exploring the role of attention and covert learning in language learning (see Doughty & Long, 2005).
In addition to teaching method, individual differences between second language learners play some role in how easily people acquire L2 (Robinson, 2001). In a classic study, Carroll (1981) identified four sources of variation in people’s ability to learn a new language. These were: phonetic coding ability (the ability to identify new sounds and form associations between them—an aspect of what is called phonological awareness); grammatical sensitivity (the ability to recognize the grammatical functions of words and other syntactic structures); rote-learning ability; and inductive learning ability (the ability to infer rules from data). Working memory plays an important role in foreign language vocabulary learning (Papagno, Valentine, & Baddeley, 1991), and it is possible to recast Carroll’s four components of language learning in terms of the size, speed, and efficiency of working memory functions (McLaughlin & Heredia, 1996). Motivation, of course, also plays a significant role; people who want or need to learn will do better (Dörnyei, 1990).
FIGURE 5.3 Monitor model (Krashen, 1982): acquisition and learning distinction hypothesis; natural order in acquisition hypothesis; monitor hypothesis; comprehensible input hypothesis; affective filter hypothesis.
How can we make second language acquisition easier?
Second language acquisition is often characterized by a phase or phases of silent periods when few productions are offered despite obvious development of comprehension. Classroom teaching methods that force students to speak in these silent periods might be doing more harm than good. Newmark (1966) argued that this has the effect of forcing the speaker back onto the rules of the first language. Hence silent periods should be respected.
Krashen (1982) argued we should make second language acquisition more like first language acquisition by providing sufficient comprehensible input. The immersion method, involving complete exposure to L2, exemplifies these ideas. Whole schools in Montreal, Canada, contain English-speaking children who are taught in French in all subjects from their first year (Bruck, Lambert, & Tucker, 1976). Immersion seems to have no deleterious effects, and if anything might be beneficial for other areas of development (e.g., mathematics). The French acquired is very good but not perfect: there is a slight accent, and syntactic errors are sometimes made.
There might be limits, however, to how much immersion is ideal. Recall the “less-is-more” theory from Chapter 4: that starting small is an advantage to children learning language. Kersten and Earles (2001) found that adults learned an artificial language better when they were initially presented with only small segments of the language than when they were exposed to the full complexity of the language from the beginning. Perhaps children learn the new language in spite of the immersion rather than because of it. Immersion might be particularly counter-productive for adults who, without the cognitive limitations of childhood, will have great difficulty in applying a “less-is-more” strategy.
Sharpe (1992) identified what he called the “four Cs” of successful modern language teaching (see Figure 5.4). These are communication (the main purpose of learning a language is aural communication, and successful teaching emphasizes this); culture (which means learning about the culture of the speakers of the language and de-emphasizing direct translation); context (which is similar to providing comprehensible input); and giving the learners confidence. These points may seem obvious, but they are often neglected in traditional, grammar-based methods of teaching foreign languages.
Finally, some particular methods of learning second languages are of course better than others. Ellis and Beaton (1993) reviewed what facilitates learning foreign language vocabulary. They concluded that simple rote repetition is best for learning to produce the new words, but that using keywords is best for comprehension. Naturally, learners want to be able to do both, so a combination of techniques is the optimum strategy.
FIGURE 5.4 Four Cs of successful modern language teaching. Communication: emphasis on aural communication. Culture: learning about the culture and de-emphasizing direct translation. Context: providing comprehensible input. Confidence: given to learners.
SUMMARY
• Second language acquisition in adulthood and later childhood is difficult because it is not like first language acquisition.
• There are probably both costs and benefits of learning two languages at once. There might be some general cognitive advantages.
• There has been much debate as to how we translate words between languages; in particular, whether or not there are direct links between words in our mental dictionaries, or whether the entries are mediated by semantic links.
• Translation probably does involve conceptual mediation.
• Bilingualism is a useful tool for studying other language processes.
1. How would you suggest teaching a second language based on psycholinguistic principles?
2. How would your answer differ if you were teaching (a) 3-year-olds; (b) 10-year-olds;
(c) 20-year-olds?
3. What are the advantages of knowing more than one language? What are the disadvantages?
FURTHER READING
There are many reference works on bilingualism and second language acquisition. Examples of more
detailed reviews include Kilborn (1994) and Klein (1986). Books covering the area in greater depth
include Bialystok and Hakuta (1994), de Groot and Kroll (1997), Ritchie and Bhatia (1996)—particularly
the review chapter by Romaine—and Romaine (1995). For a review of research on code switching, see
Grosjean (1997). Altarriba (1992) reviews work on bilingual memory. The book by Fabbro (1999) pro-
vides an introduction to the neuropsychology of bilingualism; see also Fabbro (2001). See McLaughlin
(1987) for a discussion of Krashen’s work. For a cognitive approach to second language learning, see
Skehan (1998). Doughty and Long’s Handbook of Second Language Acquisition (2005) provides a fairly
recent review of all the main topics in the area.
SECTION C
WORD RECOGNITION
This section examines how we recognize printed (or written) and spoken words, and how we turn printed words into sound. It also examines disorders of reading, and how children learn to read.
Chapter 6, Recognizing visual words, examines the process that takes place when we recognize a written word. How do we decide on the meaning of a word, or even whether we know the word or not? What methods are available to psycholinguists to study phenomena involved in word recognition, and what models best explain them?
Chapter 7, Reading, looks at how human beings access sound and meaning from a written text. What can studies of people with brain damage tell us about this process?
Chapter 8, Learning to read and spell, looks at how children learn to read. What is the best method of teaching this vital skill? How do children learn to spell? Why do some children find reading difficult to learn?
Chapter 9, Understanding speech, turns to the question of how we recognize the sounds we hear as speech. How do we decide where one word ends and another begins in the stream of sound that is spoken language? How can context help, and what models have been suggested to explain how spoken word recognition operates?
CHAPTER 6
RECOGNIZING VISUAL WORDS
FIGURE 6.1 Diagram showing a typical progression of fixations and variations in saccade length. The dots indicate the place of the fixation; the first number below the dot indicates its position in the sequence (note the “overshoot” phenomenon at fixation 20, in which the first fixation on a new line often falls too far into a sentence and a regression is required). The second number below the dot indicates the duration of each fixation in milliseconds. The sample text shown in the figure begins: “Roadside joggers endure sweat, pain, and angry drivers in the name of fitness. A healthy body may seem reward enough for most people. However, for all those who question the pay-off, some recent research on physical activity and creativity has provided some surprising good news.”
The fovea is the most sensitive part of the visual field, and corresponds to the central seven characters or so of average-size text, subtending the central 2° of vision. The fovea is surrounded by the parafovea (extending 5° either side of the fixation point) where visual acuity is poorer; beyond this is the periphery, where visual acuity is even poorer. We extract most of the meaning of what we read from the foveal region. Rayner and Bertera (1979) displayed text to readers with a moving mask that creates a moving blindspot. If the foveal region was masked, reading was possible from the parafoveal region (just outside the fovea), but at a greatly reduced rate (only 12 words a minute). If both the foveal and parafoveal regions were masked, virtually no reading was possible. Participants knew that there were strings of letters outside the masked portion of text, could report the occasional grammatical function word such as “and,” and could sometimes obtain information about the starts of words. For example, one participant read “The pretty bracelet attracted much attention” as “The priest brought much ammunition.”
Sometimes we make mistakes, or need to check previous material, and have to look backwards. These eye movements back to previous material, called regressions, are sometimes so brief that we are not aware of them. As we will see in Chapter 10, the study of these regressive eye movements provides important information about how we disambiguate ambiguous material.
There has been considerable debate as to which measure from eye movements is the most informative (Inhoff, 1984; Rayner, 1998). Should it be first fixation duration—the amount of time the eye spends looking at a region in the first fixation—or should it be total gaze time—which also includes the time spent looking at a region in any later regression? Most researchers now select regions of the text for detailed analysis and report a number of measures for that region.
How are eye movements controlled when reading—what determines where the eyes look and when? The most influential model of eye-movement control is the E-Z Reader model (Reichle, Rayner, & Pollatsek, 1999, 2003). In the E-Z Reader model, attention, visual processing, and oculomotor control jointly determine when and where eyes move when we are reading. The central idea of this model is that, when we read, we fixate on a point, and then visual attention progresses across
the line of text until a point is reached where the acuity limitations of the visual system then make it difficult to extract more information and recognize new words. Attention then shifts and an eye movement is programmed into the oculomotor system to move to the point of difficulty. A saccade then takes place to the new location, and the process is repeated. Saccades are programmed in two stages: there is an early labile stage when the planned saccade can be canceled if it turns out that it is no longer necessary (e.g., because we have managed to identify the word in the proposed target location); after this initial labile stage saccades cannot be canceled. The central, and the most controversial, assumption of the model is that attention is allocated to one word after another in a strictly serial fashion, shifting only after each word is identified. This assumption ensures that words are processed in the correct order. Word “identification” occurs in two stages: the first stage is a familiarity check (do I know this word? Am I likely to be able to use it?). Completion of the first stage can trigger the programming of a saccade. The second stage is full lexical access, where meaning is retrieved and the representation of the word integrated with the emerging linguistic structure. Completion of the second stage triggers the shift in attention to the next word along. Hence saccades and attention are decoupled in this model, and have different sources of control (familiarity and identification). Linguistic processing can affect eye movements; for example, if an analysis turns out to be wrong, we might return to an earlier location. In the model, higher level processes intervene in the general drive forward only when something goes wrong.
Reaction time measures
In the naming task, participants are visually presented with a word that they then have to name, and the time it takes a participant to start to pronounce the word aloud (the naming latency) is measured. Naming latencies are typically in the order of 500 ms from the onset of the presentation of the word.
In the lexical decision task the participant must decide whether a string of letters is a word or nonword. In the more common visual presentation method, the letter string is displayed on a computer screen (there is also an auditory version of this task). For example, the participant should press one key in response to the word “nurse” and another key in response to the nonword “murse.” The experimenter measures reaction times and error rates. One problem with this task is that experimenters must be sensitive to the problem of speed–accuracy trade-offs (the faster participants respond, the more errors they make; Pachella, 1974), and therefore researchers must be careful about the precise instructions the participants are given. Encouraging participants to be accurate tends to make them respond accurately but more slowly; encouraging them to be fast tends to make them respond faster at the cost of making more mistakes. Researchers therefore usually analyze both reaction times and error rates (although usually these show the same pattern of results). Response times vary, depending on many factors, but are typically in the order of 500 ms to 1 second.
In experiments measuring reaction time, the absolute time taken to respond is not particularly useful: we are usually concerned with differences between conditions. We assume that our experimental manipulations change only particular aspects of processing, and everything else remains constant and therefore cancels out. For example, we assume that the time participants take to locate the word on the screen and turn their attention to it is constant (unless of course we are deliberately trying to manipulate it).
In tachistoscopic identification, participants are shown words for very short presentation times. Researchers in the past used a piece of equipment called a tachistoscope; now computers are used instead, but the name is still used to refer to the general methodology. The experimenter records the thresholds at which participants can no longer confidently identify items. If the presentation is short enough, or if the normal perceptual processes are interfered with by presenting a second stimulus very quickly after the first, we sometimes find what is commonly known as subliminal perception. In this case participants’ behavior is affected although they are unaware that anything has been presented.
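The logic of reaction-time experiments described above (analyze both reaction times and error rates, and interpret differences between conditions rather than absolute times) can be illustrated with a short sketch. Everything in it is an assumption made for illustration: the trial data are invented, the function name summarize is hypothetical, and restricting mean reaction time to correct responses is one common convention rather than something the text prescribes.

```python
from statistics import mean

# Invented lexical decision trials: (condition, reaction time in ms, response correct?)
trials = [
    ("word", 520, True),
    ("word", 560, True),
    ("word", 610, False),
    ("nonword", 640, True),
    ("nonword", 700, True),
    ("nonword", 590, False),
]


def summarize(trials, condition):
    """Return (mean RT of correct responses in ms, error rate) for one condition."""
    rows = [t for t in trials if t[0] == condition]
    # Assumption: only correct responses contribute to the mean reaction time.
    correct_rts = [rt for _, rt, ok in rows if ok]
    error_rate = sum(1 for _, _, ok in rows if not ok) / len(rows)
    return mean(correct_rts), error_rate


word_rt, word_err = summarize(trials, "word")
nonword_rt, nonword_err = summarize(trials, "nonword")

# The quantity of interest is the difference between conditions,
# not the absolute time in either condition.
rt_difference = nonword_rt - word_rt

print(f"word: {word_rt} ms, error rate {word_err:.2f}")
print(f"nonword: {nonword_rt} ms, error rate {nonword_err:.2f}")
print(f"nonword - word difference: {rt_difference} ms")
```

Reporting the error rate next to the mean reaction time makes a speed–accuracy trade-off visible: a condition that looks faster but also produces more errors should not be read as easier to process.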
The semantic categorization task requires the participant to make a decision that taps semantic processes. For example, is the word “apple” a “fruit” or a “vegetable”? Is the object referred to by the word smaller or bigger than a chair?
Different techniques do not always give the same results. They tap different aspects of processing—an important consideration to which we will return.
One of the most important ideas in word recognition is that of priming. This involves presenting material before the word to which a response has to be made. One of the most common paradigms involves presenting one word prior to the target word to which a response (such as naming or lexical decision) has to be made. The first word is called the prime, and the word to which a response has to be made is called the target. The time between when the prime is first presented (its onset) and the start of the target is called the stimulus–onset asynchrony, or SOA. We then observe what effect the prime has on subsequent processing. By manipulating the relation between the prime and the target, and by varying the SOA, we can learn a great deal about visual word recognition. The prime does not have to be a single word: it can be a whole sentence, and does not even have to be linguistic (e.g., it could be a picture).
WHAT MAKES WORD RECOGNITION EASIER (OR HARDER)?
Next we will look at some of the main findings on visual word recognition. You should bear in mind that many of these phenomena also apply to spoken word recognition. In particular, frequency effects and semantic priming are found in both spoken and visual word recognition.
Interfering with identification
We can slow down word identification by making it harder to recognize the stimulus. One way of doing this is by degrading its physical appearance. This is called stimulus degradation and can be achieved by breaking up the letters that form the word, by reducing the contrast between the word and the background, or by rotating the word to an unusual angle.
Presenting another stimulus immediately after the target interferes with the recognition process. This is called backwards masking (see Figure 6.2). There are two different ways of doing this. If the masking stimulus is unstructured—for example, if it is just a patch of randomly positioned black dots, or just a burst of light—then we call it energy (or brightness, or random noise) masking. If the masking stimulus is structured (for example, if it comprises letters or random parts of letters) then we call it pattern masking (or feature masking). These two types of mask have very different effects (Turvey, 1973). Energy masks operate on the visual feature detection level by causing a visual feature shortage and making feature identification difficult. Feature masks cause interference at the letter level and limit the time available for processing.
Masking is used in studies of one of the greatest of all psycholinguistic controversies, that of perception without awareness. Perception without awareness is a form of subliminal perception. Researchers such as Allport (1977) and Marcel (1983a, 1983b) found that words that have been masked, to the extent that participants report they are not aware of their presence, can nevertheless produce activation through the word identification system, even to the level of semantic processing. That is, we can access semantic information about an item without any conscious awareness of that item. The techniques involved are notoriously difficult; the results have been questioned by, among others, Ellis and Marshall (1978) and Williams and Parkin (1980). Holender (1986), in critically reviewing this field, pointed out methodological problems with the early experiments. He emphasized ensuring that participants are equally dark-adapted during the preliminary establishing of individual thresholds and the main testing phase of the experiment. Otherwise we cannot be sure that information is not reaching conscious awareness in the testing phase, even though we think we might have set the time for which the target is presented to a sufficiently short interval. The window between presenting a word quickly enough
BACKWARDS MASKING
FIGURE 6.2
for it not to be available to consciousness, and so quickly that participants really do see nothing at all, is very small. As yet it is unclear whether we can identify and access meaning-related information about words without conscious awareness, although the balance of evidence is probably that we can. Such a finding does not pose any real problem for our models of lexical access.

Another informative way in which we can interfere with word recognition is to present a word, but delay the presentation of one or two letters at the beginning of the word by backward masking of those letters. What causes most disruption when we do this? In English, after 60 ms it doesn’t make much difference, but before that, delaying a consonant disrupts visual word recognition much more than delaying a vowel (Lee, Rayner, & Pollatsek, 2001). Early on, then, consonant identification is particularly important for recognizing a word. In English, consonants have a more regular mapping from visual appearance to sound, whereas vowels do not. In Italian, which has a much more regular mapping for vowels, there is no early advantage for consonants. Hence readers in different languages make differential early use of information that is most likely to help them identify a word.

Frequency, familiarity, and age-of-acquisition

The frequency of a word is a very important factor in word recognition. Commonly used words are easier to recognize and are responded to more quickly than less commonly used words. The frequency effect was first demonstrated in tachistoscopic recognition (Howes & Solomon, 1951), but has since been demonstrated for a wide range of tasks. Whaley (1978) showed that frequency is the single most important factor in determining the speed of responding in the lexical decision task. Forster and Chambers (1973) found a frequency effect in the naming task.

The effect of frequency is not just a result of differences between frequent and very infrequent words (e.g., “year” versus “heresy”), where you would obviously expect a difference, but also between common and slightly less common words (e.g., “rain” versus “puddle”). It is therefore essential to control for frequency in psycholinguistic
experiments, ensuring that different conditions are matched. There are a number of norms of frequency counts available; in the past, Kucera and Francis (1967; see also Francis & Kucera, 1982) was one of the most popular of these, listing the occurrence per million of a large number of words in many samples of printed language. Kucera and Francis is based on written American English. Clearly there are differences between versions of English (e.g., “pavement” and “sidewalk”) and between written and spoken word frequency. For example, the pronoun “I” is 10 times more common in the spoken word corpus than the written one (Dahl, 1979; Fromkin et al., 2011). Another popular choice is the CELEX database (Baayen, Piepenbrock, & Gulikers, 1995), which is stored electronically and is therefore easily searchable, making it particularly useful for making up lists of materials with very specific characteristics. The Internet has made possible the collection and analysis of very large samples of text.

Gernsbacher (1984) pointed out that corpora of printed word frequencies are only an approximation to experiential familiarity. This approximation may break down, particularly for low-frequency words. For example, psychologists might be very familiar with a word such as “behaviorism,” even though it has quite a low frequency in the general language. People also rate some words with recorded low frequency (such as “mumble,” “giggle,” and “drowsy”) as more familiar than others of similar frequency (such as “cohere,” “rend,” and “char”). The printed-frequency corpora might not be very accurate for low-frequency words, and language use has changed since many of the corpora were composed. If it is possible to obtain ratings of the individual experiential familiarity of words, they should prove to be a more reliable measure in processing tasks than printed word frequency.

Several other variables correlate with frequency. For example, common words tend to be shorter. If you wish to demonstrate an unambiguous effect of frequency, you must be careful to control for these other factors.

Frequency is particularly entangled with age-of-acquisition (AOA). The age-of-acquisition of a word is the age at which you first learn it (Carroll & White, 1973a; Gilhooly, 1984). On the whole, children learn more common words first, but there are exceptions: for example, “giant” is generally learned early although it is a relatively low-frequency word. Words that are learned early in life are named more quickly and more accurately than ones learned late, across a range of tasks including object naming, word naming, and lexical decision (Barry, Morrison, & Ellis, 1997; Brown & Watson, 1987; Carroll & White, 1973a; Morrison, Ellis, & Quinlan, 1992). The later the age-of-acquisition of a name, the more difficult it will be for someone with brain damage to produce (Hirsh & Ellis, 1994). Frequency and AOA may be correlated, but statistical techniques such as multiple regression enable us to tease them apart. Early-learned items tend to be higher in frequency, although estimates of the size of the correlation have varied from 0.68 (Carroll & White, 1973b) to as low as 0.38 (between an objective measure of AOA, when a word first enters a child’s vocabulary, and the logarithm of the spoken word frequency, as in Ellis & Morrison, 1998). It has been suggested that all frequency effects are really AOA effects (e.g., Morrison & Ellis, 1995). On the other hand, it has also been suggested that studies reporting AOA effects have not controlled adequately for frequency; in particular, these studies might not have taken into account cumulative frequency: how often words have been encountered throughout the lifespan (Zevin & Seidenberg, 2002). Measures of frequency such as Kucera and Francis and the CELEX database are quite small (even a million words is small relative to the number we come across in real life), and, as we have seen with familiarity (Gernsbacher, 1984), might not accurately reflect the true occurrence of words in the language. Even then, they just provide a snapshot of adult usage. Importantly, they might particularly underestimate the frequency of words we are exposed to in childhood. However, a large-scale study of French showed that AOA effects persist even when cumulative frequency is controlled for (Bonin, Barry, Méot, & Chalard, 2004). It is probable that both frequency and AOA have effects on word processing (Morrison & Ellis, 2000). Different tasks might differ in their sensitivity to AOA and different measures of frequency; AOA
FIGURE 6.3
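Frequency norms of the kind discussed above report occurrences per million words, and correlations with AOA are often computed on the logarithm of frequency. The following is a minimal Python sketch of both transformations. The function names, the raw counts, and the corpus size are hypothetical illustrations, not values from Kucera and Francis or CELEX, and adding 1 before taking the logarithm is an assumed convention for handling words with a count of zero.

```python
import math


def per_million(count, corpus_size):
    """Convert a raw occurrence count into occurrences per million words."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size


def log_frequency(freq_per_million):
    """Log10 of (frequency + 1); the +1 (an assumption) keeps zero counts defined."""
    return math.log10(freq_per_million + 1)


# Hypothetical raw counts in a hypothetical 2-million-word corpus.
corpus_size = 2_000_000
counts = {"rain": 140, "puddle": 6}

for word, count in counts.items():
    fpm = per_million(count, corpus_size)
    print(word, fpm, round(log_frequency(fpm), 3))
```

Matching experimental conditions on the log-transformed value rather than the raw count is one way to keep a few very common words from dominating the comparison.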
For some time it was thought that there was clear evidence that longer words take longer to pronounce (Forster & Chambers, 1973). Weekes (1997) found that word length (measured in letters) had little effect on naming words when other properties of words (such as the number of words similar to the target word) were controlled for (although length had some effect on reading nonwords). It seems that the number of letters in a word has little effect for short words, but has some effect on words between 5 and 12 letters long. Furthermore, word length effects in naming words probably reflect the larger number of similar words with similar pronunciations found for shorter words.

Naming time increases as a function of the number of syllables in a word (Eriksen, Pollack, & Montague, 1970). There is at least some contribution from preparing to articulate these syllables in addition to any perceptual effect. We find a similar effect in picture naming. We take longer to name pictures of objects depicted by long words compared with pictures of objects depicted by short words, and longer to read numbers that have more syllables in their pronunciation, such as the number 77 compared with the number 16 (Klapp, 1974; Klapp, Anderson, & Berrian, 1973).

Neighborhood effects

Some words have a large number of other words that look like them (e.g., “mine” has “pine,” “line,” “mane,” among others), whereas other words of similar frequency have few that look like them (e.g., “much”). Coltheart, Davelaar, Jonasson, and Besner (1977) defined the N-statistic as the number of words that can be created by changing one letter of a target word. Hence “mine” has a large N (29): It is said to have many orthographic neighbors (e.g., “pine,” “mane,” “mire”), but “much” has a low N (5) and few neighbors. The word “bank” has an N-value of 20, but “abhorrence” only has an N-value of 1. (The related word is “abhorrency,” which oddly enough my spell-checker doesn’t like!) N is a measure of neighborhood size (or density).

Neighborhood size affects visual word recognition, making words with a high N easy to recognize when other factors have been controlled for, although clear benefits are only found for low-frequency words: Performance on naming and lexical decision tasks is faster for low-frequency words that have many orthographic neighbors (Andrews, 1989; Grainger, 1990; McCann & Besner, 1987). The rime parts of neighbors seem to be particularly important in producing the facilitation (Peereman & Content, 1997).

In addition to neighborhood size, the frequency of the neighbors might also be important, although in a review of the literature Andrews (1997) concluded that neighborhood size has more effect than neighborhood frequency. On the other hand, it is surprising that having many neighbors produces facilitation at all, rather than competition (Andrews, 1997).

Word or nonword?

Words are generally responded to faster than nonwords. Less plausible nonwords are rejected faster than more plausible nonwords (Coltheart et al., 1977). Hence in a lexical decision task we are relatively slow to reject a nonword like “siant” (which might have been a word, and indeed which looks like one, “saint”), but very quick to reject one such as “tnszv.” Nonwords that are plausible (that is, that follow the rules of word formation of the language in that they do not contain illegal strings of letters) are sometimes called pseudowords.

Repetition priming

Once you have identified a word, it is easier to identify it the next time you see it. The technique of facilitating recognition by repeating a word is known as repetition priming. Repetition facilitates both the accuracy of perceptual identification (Jacoby & Dallas, 1981) and lexical decision response times (Scarborough, Cortese, & Scarborough, 1977). Repetition has a surprisingly long-lasting effect. It is perhaps obvious that having just seen a word will make it easier to recognize straight away, but periods of facilitation caused by repetition have been reported over several hours or even longer.
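The N-statistic from the discussion of neighborhood effects lends itself to a short computation: count the words in a lexicon that differ from the target by exactly one letter in the same position. The Python sketch below is only an illustration under stated assumptions. The function names and the eight-word toy lexicon are hypothetical, so the counts it produces are far smaller than the published values (such as N = 29 for “mine”), which depend on a full dictionary.

```python
def is_neighbor(a, b):
    """True if b has the same length as a and differs in exactly one letter position."""
    if len(a) != len(b) or a == b:
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def coltheart_n(target, lexicon):
    """Count the orthographic neighbors of target among the distinct words in lexicon."""
    return sum(is_neighbor(target, word) for word in set(lexicon))


# Hypothetical toy lexicon (lower case); a real analysis would use a full word list.
toy_lexicon = ["mine", "pine", "line", "mane", "mire", "much", "such", "bank"]

print(coltheart_n("mine", toy_lexicon))
print(coltheart_n("much", toy_lexicon))
```

Because N depends entirely on which words the lexicon contains, values are only comparable when they are computed against the same word list.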
Repetition interacts with frequency. In a lexical decision task, repetition priming effects are stronger for low-frequency words than for high-frequency ones, an effect known as frequency attenuation (Forster & Davis, 1984). Forster and Davis also pattern-masked the prime in an attempt to wipe out any possible episodic memory of it. They concluded that repetition effects have two components: a very brief lexical access effect, and a long-term episodic effect, with only the latter sensitive to frequency.

There has been considerable debate as to whether repetition priming arises because of the activation of an item’s stored representation (e.g., Morton, 1969; Tulving & Schachter, 1990) or because of the creation of a record of the entire processing event in episodic memory (e.g., Jacoby, 1983). An important piece of evidence that supports the episodic view is the finding that we generally obtain facilitation by repetition priming only within a domain (such as the visual or auditory modality), but semantic priming (by meaning or association) also works across domains (see Roediger & Blaxton, 1987).

Form-based priming

We might expect that seeing a word like CONTRAST should make it easier to recognize CONTRACT, because there is overlap between their physical forms. As they share letters, they are said to be orthographically related, and this phenomenon is known as orthographic priming or form-based priming. In fact, form-based priming is very difficult to demonstrate. Humphreys, Besner, and Quinlan (1988) found that form-based priming was only effective with primes masked at short SOAs so that the prime is not consciously perceived. Forster and Veres (1998) further showed that the efficacy of form-based primes depends on the exact make-up of the materials in the task. Form-related primes can even have an inhibitory effect, slowing down the recognition of the target (Colombo, 1986). One explanation for these findings is that visually similar words are in competition during the recognition process, so that in some circumstances similar-looking words inhibit each other. Form-based priming is much easier to obtain if the prime is masked, perhaps because masked priming is a more “pure” form of priming that has no contribution from conscious processing (Davis & Lupker, 2006; Forster & Davis, 1984; Forster, Davis, Schoknecht, & Carter, 1987).

Semantic priming

For over a century, it has been known that identification of a word can be facilitated by prior exposure to a word related in meaning (Cattell, 1888/1947). Meyer and Schvaneveldt (1971) provided a more recent demonstration of what is one of the most robust and important findings about word recognition. They showed that the identification of a word is made easier if it is immediately preceded by a word related in meaning. They used a lexical decision task, but the effect can be found, with differing magnitudes of effect, across many tasks, and is not limited to visual word recognition (although the lexical decision task shows the largest semantic priming effect; Neely, 1991). For example, we are faster to say that “doctor” is a word if it is preceded by the word “nurse” than if it is preceded by a word unrelated in meaning, such as “butter,” or if it is presented in isolation. This phenomenon is known as semantic priming.

The word priming is best reserved for the methodology of investigating what happens when one word precedes another. The first word (the prime) might speed up recognition of the second word (the target), in which case we talk of facilitation. Sometimes the prime slows down the identification of the target, in which case we talk of inhibition.

With very short time intervals, priming can occur if the prime follows the target. Kiger and Glass (1983) placed the primes immediately after the target in a lexical decision task. If the target was presented for 50 ms, followed 80 ms later by the prime, there was no facilitation of the target, but if the target was presented for only 30 ms, and followed only 35 ms later by the prime, there was significant backwards priming of the target. This finding suggests that words are to some extent processed in parallel if the time between them is short enough.
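Facilitation and inhibition, as defined above, can be expressed as a difference in mean response time between a baseline condition and a primed condition. The Python sketch below illustrates the arithmetic only. The function names and all response times are hypothetical, and a real analysis would add an inferential test and a check of error rates.

```python
from statistics import mean


def priming_effect(baseline_rts, primed_rts):
    """Mean baseline RT minus mean primed RT, in ms.

    Positive values indicate facilitation; negative values indicate inhibition.
    """
    return mean(baseline_rts) - mean(primed_rts)


def classify(effect_ms):
    """Label the sign of a priming effect."""
    if effect_ms > 0:
        return "facilitation"
    if effect_ms < 0:
        return "inhibition"
    return "no effect"


# Hypothetical lexical decision times (ms) for the target "doctor".
neutral = [600, 620, 610]      # baseline condition, no informative prime
related = [560, 575, 565]      # preceded by "nurse"
unrelated = [630, 640, 635]    # preceded by "butter"

for label, rts in (("related", related), ("unrelated", unrelated)):
    effect = priming_effect(neutral, rts)
    print(label, round(effect, 1), classify(effect))
```

The choice of baseline matters: the same primed times can look like facilitation against one baseline and like no effect against another.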
Semantic priming is a type of context effect. One can see that the effect might have some advantages for processing. Words are rarely read (or heard) in isolation, and neither are words randomly juxtaposed. Words related in meaning sometimes co-occur in sentences. Hence processing might be speeded up if words related to the word you are currently reading are somehow made more easily available, as they are more likely to come next than random words. How does this happen? We shall return to this question throughout this chapter.

Other factors that affect word recognition

The ease of visual word recognition is affected by a number of variables (most of which have similar effects on spoken word recognition). There are others that should be mentioned, including the grammatical category to which a word belongs (West & Stanovich, 1986). The imageability, meaningfulness, and concreteness of a word may also have an effect on its identification (see Paivio, Yuille, & Madigan, 1968). In a review of 51 properties of words, Rubin (1980) concluded that frequency, emotionality, and pronunciability were the best predictors of performance on commonly used experimental tasks. Whaley (1978) concluded that frequency, meaningfulness, and the number of syllables had most effect on lexical decision times, although recently age-of-acquisition has come to the fore as an important variable. In a study of a large number of words, Balota, Cortese, Sergent-Marshall, Spieler, and Yap (2004) compared the effects of phonological (e.g., the first sound), lexical (e.g., frequency, length, neighborhood size), and semantic (e.g., imageability) variables on speeded visual word naming and lexical decision. They found that the contribution of the variables was highly task dependent. Semantic variables are especially important, particularly in lexical decision. Finally, the syntactic context affects word recognition. Wright and Garrett (1984) found a strong effect of syntactic environment on lexical decision times. In (1) and (2) the preceding context can be continued with a verb, but not with a noun. In (2) this syntactic constraint is violated:

(1) If your bicycle is stolen, you must [formulate]
(2) If your bicycle is stolen, you must [batteries]

In both cases the target word (in italics) is semantically unpredictable from the context, yet Wright and Garrett found that syntactic context affected lexical decision times so that people were significantly slower to respond to the noun (“batteries”) in this context than the verb (“formulate”).

ATTENTIONAL PROCESSES IN VISUAL WORD RECOGNITION

Reading is a mandatory process. When you see a word, you cannot help but read it. Evidence to support this introspection comes from the Stroop task: Naming the color in which a word is written is made more difficult if the color name and the word conflict (e.g., “red” written in green ink) (see Figure 6.4).

FIGURE 6.4 The Stroop effect

How many mechanisms are involved in priming? In a classic experiment, Neely (1977) argued that there were two different attentional modes of priming. His findings relate to a distinction made by Posner and Snyder (1975) and Schneider and Shiffrin (1977) between automatic and attentional (or controlled) processing. Automatic processing is fast, parallel, not prone to interference from other tasks, does not demand working memory space, cannot be prevented, and is not directly available to consciousness. Attentional (or controlled) processing is slow, serial, sensitive to interference from competing tasks, does
FIGURE 6.5
use working memory space, can be prevented or inhibited, and its results are often (but not necessarily) directly available to consciousness (see Figure 6.5).

Neely used the lexical decision task to investigate attentional processes in semantic priming. He manipulated four variables. The first was whether or not there was a semantic relation between the prime and target, so that in the related condition a category name acting as prime preceded the target. Second, he manipulated the participants’ conscious expectancies. Third, he varied whether or not participants’ attention had to be shifted from one category to another between the presentation of the prime and the presentation of the target. Finally, he varied the stimulus–onset asynchrony, between 250 ms (a very short SOA) and 2,000 ms (a very long SOA).

Importantly, in this experiment there was a discrepancy between what participants were led to expect from the instructions given to them before the experiment started, and what actually happened. Participants were told, for example, that whenever the prime was “BIRD,” they should expect that a type of bird would follow, but that whenever the prime was “BODY,” a part of a building would follow. Hence their conscious expectancies determined whether they had to expect to shift or not shift their attention from one category name to members of another category. Examples of stimuli in the key conditions are given in Box 6.1.

Neely found that the pattern of results depended on the SOAs. The crucial condition is what happens after “BODY.” At short SOAs, an unexpected but semantically related word such as “HEART” was facilitated relative to the baseline condition, whereas participants took about as long to respond to the expected but unrelated “DOOR” as the baseline. At long SOAs, “HEART” was inhibited; that is, participants were actually slower to respond to it than they were to the baseline condition, whereas “DOOR” was facilitated.

Neely interpreted these results as showing that two different processes are operating at short and long SOAs. At short SOAs, there is fast-acting, short-lived facilitation of semantically related items, which cannot be prevented, irrespective of the participants’ expectations. This facilitation is based on semantic relations between words. There is no inhibition of any sort at short SOAs. This is called automatic priming. “BODY” primes “HEART,” regardless of what the participants are trying to do. But at long SOAs, there is a slow build-up of facilitation that is dependent on your expectancies. This leads to
the inhibition of responses to unexpected items, with the cost that if you do have to respond to them, then responding will be retarded. This is attentional priming. Normally, these two types of priming work together. In a semantic priming task at intermediate SOAs (around 400 ms) both automatic and attentional priming will be cooperating to speed up responding. One can also conclude from this experiment, on the basis of the unexpected–related condition, that the meanings of words are accessed automatically.

Further evidence for a two-process priming model

The details of the way in which two processes are involved in priming have changed a little since Neely’s original experiment, although the underlying principle remains the same. Whereas Neely used category–instance associations (e.g., “BODY–ARM”), which are not particularly informative (any part of the body could follow “BODY”), Antos (1979) used instance–category associations (e.g., “ARM–BODY”), which are highly predictive. He then found evidence of inhibition (relative to the baseline) in the unexpected but semantically related condition at shorter SOAs (at 200 ms), suggesting that inhibition may not just arise from attentional processes, but may also have an automatic component. Antos also showed the importance of the baseline condition, a conclusion supported by de Groot (1984). A row of Xs, as used by Neely, is a conservative baseline, and tends to delay responding; it is as though participants are waiting for the second word before they respond. It may be more appropriate to use a neutral word (such as “BLANK” or “READY”) as the neutral condition. When this is done we observe inhibition at much shorter SOAs. Antos also argued that even Neely found evidence of cost at short SOAs, but that this was manifested in an increase in the error rate rather than in a slowing of reaction time. This is evidence of a speed–error trade-off in the data. Generally, in psycholinguistic reaction time experiments, it is always important to check for differences in the error rate as well as reaction times across conditions.

A second source of evidence for attentional effects in priming comes from studies manipulating the predictive validity (sometimes called the cue validity) of the primes. The amount of priming observed increases as the proportion of related words used in the experiment increases (Den Heyer, 1985; Den Heyer, Briand, & Dannenbring, 1983; Tweedy, Lapinski, & Schvaneveldt, 1977).
This is called the proportion effect. If priming were wholly automatic, then the amount found should remain constant across all proportions of associated word pairs. The proportion effect reflects the effect of manipulating the participants’ expectancies by varying the proportion of valid primes. If there are a lot of primes that are actually unrelated to the targets, participants quickly learn that they are not of much benefit. This will then attenuate the contribution of attentional priming. Nevertheless, in those cases where primes are related to the target, automatic priming still occurs. The more related primes there are in an experiment, the more participants come to recognize their usefulness, and the contribution of attentional priming increases.

Evaluation of attentional processes in word recognition

There are two attentional processes operating in semantic priming: a short-lived, automatic, facilitatory process that we cannot prevent from happening, and an attentional process that depends on our expectancies and that is much slower to get going. However, the benefits of priming are not without their costs; attentional priming certainly involves inhibition of unexpected alternatives, and if one of these is indeed the target then recognition will be delayed. There is probably also an inhibitory cost associated with automatic priming. Automatic priming probably operates through spreading activation.

We can extend our distinction between automatic and attentional processes to word recognition itself. As we have seen, there must be an automatic component to recognition, because this processing is mandatory. Intuition suggests that there is also an attentional component. If we misread a sentence, we might consciously choose to go back and reread a particular word. To take this further, if we provisionally identify a word that seems incompatible with the context, we might check that we have indeed correctly identified it. These attentional processes operate after we have first contacted the lexicon, and hence we also talk about automatic lexical access and non-automatic post-access effects. Attentional processes are important in word recognition, and may play different roles in the tasks used to study it.

DO DIFFERENT TASKS GIVE CONSISTENT RESULTS?

Experiments on word recognition are difficult to interpret because different experimental tasks sometimes give different results. When we use lexical decision or naming, we are not just studying pure word recognition: we are studying word recognition plus the effects of the measurement task. Worse still, the tasks interact with what is being studied. It is rather like using a telescope to judge the color of stars when the glass of the telescope lens changes color depending on the distance of the star, and we don’t realize it.

By far the most controversy surrounds the naming and lexical decision tasks. Which of these better tap the early, automatic processes involved in word recognition?

Lexical decision has been particularly criticized as being too sensitive to post-access effects. In particular, it has been argued that it reflects too much of participants’ strategies rather than the automatic processes of lexical access (e.g., Balota & Lorch, 1986; Neely, Keefe, & Ross, 1989; Seidenberg, Waters, Sanders, & Langer, 1984). This is because it measures participant decision-making times in addition to the pure lexical access times (Balota & Chumbley, 1984; Chumbley & Balota, 1984). Participants do not always respond as soon as lexical access occurs; instead, attentional or strategic factors may come into operation, which delay responding. Participants need not be aware of these post-access mechanisms, as not all attentional processes are directly available to consciousness. Participants might use one or both of two types of strategy. First, as we have seen, participants have expectancies that affect processing. In a lexical decision experiment, participants usually notice that some of the prime–target word pairs are related. So when they see the prime, they can generate a set of possible targets. Hence they can make the “word” response faster if the actual target matches one of their generated
words than if it does not. The second is a postlexical or post-access checking strategy. Participants might use information subsequent to lexical access to aid their decision. The presence of a semantic relation between the prime and target suggests that the target must be a word, and hence they respond “word” faster in a lexical decision task, as there can be no semantic relation between a word and nonword. That is, using postlexical checking, participants might respond on the basis of an estimate of the semantic relation between prime and target, and not directly on the results of trying to access the lexicon. Strategic factors might even lead some participants, some of the time, to respond before they have recognized a word (that is, they guess, or respond to stimuli on very superficial characteristics).

What is the evidence that word naming is less likely to engage participant strategies than lexical decision? First, inhibitory effects are small or non-existent in naming (Lorch, Balota, & Stamm, 1986; Neely et al., 1989). As we have seen, inhibition is thought to arise from attentional processes, so its absence in the naming task suggests that naming does not involve attentional processing. Second, mediated priming is found much more reliably in the naming task than in lexical decision (Balota & Lorch, 1986; de Groot, 1983; Seidenberg, Waters, Sanders, et al., 1984). Mediated priming is facilitation between pairs of words that are connected only through an intermediary (e.g., “dog” primes “cat,” which primes “mouse” for the prime–target pair “dog mouse”). It is much more likely to be automatic than expectancy-driven because participants are unlikely to be able to generate a sufficient number of possible target words from the prime in sufficient time by any other means. Mediated priming is not usually found in lexical decision because normally participants speed up processing by using post-access checking. It is possible to demonstrate mediated priming in lexical decision by manipulating the experimental materials and design so that post-access checking is discouraged (McNamara & Altarriba, 1988). For example, we observe mediated priming if all the related items only are mediated (“dog” and

(e.g., “dog” and “cat”) mixed in. Nevertheless, lexical decision does seem to routinely involve post-access checking. Third, backwards semantic priming of words that are only associated in one direction but not another (see later) is found in the lexical decision task but is not normally found in naming (Seidenberg, Waters, Sanders, et al., 1984). This type of priming again more plausibly arises through post-access checking than through the automatic spread of activation.

These results suggest that the naming task is less sensitive to postlexical processes. The naming task, however, has a production component in the way that lexical decision does not (Balota & Chumbley, 1985). In particular, naming involves assembling a pronunciation for the word that might bypass the lexicon altogether (using what is known as a sublexical route, discussed in detail in Chapter 7). There are also some possible strategic effects in naming: People are unwilling to utter words that may be incorrect in some way; for example, they may hesitate if they are unsure of the word’s pronunciation (O’Seaghdha, 1997).

Clearly both lexical decision and naming have their disadvantages. For this reason, many researchers now prefer to use analysis of eye movements. Fortunately, the results from different methods often converge. Schilling, Rayner, and Chumbley (1998) found that although the lexical decision task is more sensitive to word frequency than naming and gaze duration, there is nevertheless a significant correlation between the frequency effect and response time in all three tasks. We either need to place more stress on results on which the three techniques converge, or have a principled account of why they differ.

The locus of the frequency effect

At what stage does frequency have its effect? Is it inherent in the way that words are stored, or does it merely affect the way in which participants respond in experimental tasks? An experiment by Goldiamond and Hawkins (1958) suggested the latter. The first part of this experiment was a training phase. Participants
“mouse”), with no directly related semantic pairs were exposed to nonwords (such as “lemp” and
“stunch”). Frequency was simulated by giving a lot of exposure to some words (mimicking high frequency), and less to others (mimicking low frequency). For example, if you see “lemp” a lot of times relative to “stunch,” then it becomes a higher frequency item for you, even though it is a nonword. In the second part of the experiment, participants were tested for tachistoscopic recognition at very short intervals. Although the participants were told to expect the words on which they were trained, only a blurred stimulus that they had not seen before was in fact presented. Nevertheless, participants generated the trained nonwords even though they were not present, but also with the same frequency distribution on which they were trained. That is, they responded with the more frequent words more often, even though nothing was actually present. It can be argued from this that frequency does not have an effect on the perception or recognition of a word, only on the later output processes. That is, frequency creates a response bias. This leads to what is sometimes called a guessing model. This type of experiment only shows that frequency can affect the later, response stages. It does not show that it does not involve the earlier recognition processes as well. Indeed, Morton (1979a) used mathematical modeling to show that sophisticated guessing cannot explain the word frequency effect alone.
A frequency effect could arise in two ways. A word could become more accessible because we see (or hear) frequent words more than we see (or hear) less frequent ones, or because we speak (or write) frequent words more often. Of course, most of the time these two possibilities are entangled; we use much the same words in speaking as we are exposed to as listeners. Another way of putting this is to ask if frequency effects arise through recognition or generation. Morton (1979a) disentangled these two factors. He concluded that the data are best explained by models whereby the advantage of high-frequency words is that they need less evidence to reach some threshold for identification. The effect of repeated exposure to a word is therefore to lower this threshold. The later recognition of a word is facilitated every time we are exposed to it, whether through speaking, writing, listening, or reading. Hence frequency of experience and frequency of generation are both important.
Most accounts of the frequency effect assume that it arises as a kind of practice—the more often we do something, the better we get at it. This idea has been challenged recently by Murray and Forster (2004), who show that the time it takes to identify words is linearly related to frequency, rather than varying as a logarithmic function, as you would expect if frequency was based on learning that in turn was based on multitudinous repetitions. (Eventually you get diminishing returns from repeating things more times.) They argue that the frequency effect is better accounted for by searching serially through lists of words, where all that matters is relative frequency rather than absolute frequency. We examine the serial search model in more detail below.
There has been considerable debate about whether the naming and lexical decision tasks are differentially sensitive to word frequency (Balota & Chumbley, 1984, 1985, 1990; Monsell, Doyle, & Haggard, 1989). Balota and Chumbley argued that word frequency has no effect on semantic categorization. This is a task that must involve accessing the meaning of the target word. They concluded that when frequency has an effect on word recognition, it does so because of post-access mechanisms, such as checking in lexical decision, and preparing for articulation in naming. They also showed that the magnitude of the frequency effect depended on subtle differences in the stimulus materials in the experiment (such as length differences between words and nonwords). This can be explained if the effect is mediated by participants’ strategies. Furthermore, the magnitude of the frequency effect is much greater in lexical decision than naming. The argument is that this is because the frequency effect has a large attentional, strategic component, with any automatic effect being small or non-existent. Lexical decision is more sensitive to strategic factors; therefore lexical decision is more sensitive to frequency.
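As a rough illustration of two of the accounts of the frequency effect described above, the following Python sketch contrasts a threshold view (in the spirit of Morton's proposal that repeated exposure lowers the evidence needed for identification) with a serial-search view (in the spirit of Murray and Forster's proposal that only relative frequency matters). Every name, count, and parameter value here is hypothetical and chosen for illustration, including the extra nonword "blick" and the reading of "relative frequency" as position in a frequency-ordered list. Neither function is taken from the published models.

```python
import math
from dataclasses import dataclass


@dataclass
class WordDetector:
    """Threshold view: a word is identified once enough evidence has accumulated."""
    threshold: float = 10.0  # hypothetical starting threshold (arbitrary units)

    def expose(self, step: float = 0.5, floor: float = 1.0) -> None:
        # Assumption: each exposure lowers the threshold by a fixed step, down to a floor.
        self.threshold = max(floor, self.threshold - step)

    def samples_needed(self, evidence_per_sample: float = 1.0) -> int:
        # Number of evidence samples required to reach the current threshold.
        return math.ceil(self.threshold / evidence_per_sample)


def search_positions(counts: dict[str, int]) -> dict[str, int]:
    """Serial-search view: words are checked in descending order of frequency."""
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: position for position, word in enumerate(ordered, start=1)}


# Made-up exposure counts; "lemp" and "stunch" echo the trained nonwords in the text.
counts = {"lemp": 40, "stunch": 5, "blick": 12}

frequent, rare = WordDetector(), WordDetector()
for _ in range(12):
    frequent.expose()  # many exposures lower the threshold a lot
rare.expose()          # a single exposure lowers it only slightly

positions = search_positions(counts)
scaled = search_positions({w: c * 1000 for w, c in counts.items()})
print(frequent.samples_needed(), rare.samples_needed())
print(positions, positions == scaled)
```

Multiplying every count by the same factor leaves the search order unchanged, which is the sense in which only relative frequency matters on the serial-search view. On the threshold view, by contrast, the absolute number of exposures determines how much evidence is still needed.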
However, most researchers believe that frequency does have an automatic, lexical effect on word recognition. Monsell et al. (1989) found that frequency effects in naming can be inflated to a similar level to that found in lexical decision by manipulating the regularity of the pronunciation of words; participants must access the lexical representation of irregular words to pronounce them. It is possible that frequency effects are absorbed by other components of the naming task (Bradley & Forster, 1987). Furthermore, delaying participants’ responses virtually eliminates the frequency effect (Forster & Chambers, 1973; Savage, Bradley, & Forster, 1990). Delaying responding eliminates preparation and lexical access effects, but not articulation. This casts doubt on the claim that there is a major articulatory component to the effect of frequency on naming, and suggests that the effect must be occurring earlier.
Grainger (1990; see also Grainger & Jacobs, 1996; Grainger, O’Regan, Jacobs, & Segui, 1989) reported experiments that addressed both the locus of the frequency effect and also task differences between lexical decision and naming. He showed that response times to words are also sensitive to the frequency of the neighbors of the target words. The neighbors of a word are those that are similar to it in some way—in the case of visually presented words, it is visual or orthographic similarity that is important. For example, there is much overlap in the letters and visual appearance of “blue” and “blur.” Grainger found that when the frequency of the lexical neighborhood of a word is controlled, the magnitude of the effect of frequency in lexical decision is reduced to that of the naming task. Responses to words with a high-frequency neighbor were slowed in the lexical decision task and facilitated in the naming task. He argued that as low-frequency targets necessarily tend to have more high-frequency neighbors, previous studies had confounded target frequency with neighborhood frequency. Furthermore, he argued that the finding that frequency effects are stronger in lexical decision than naming cannot necessarily be attributed to task-specific post-access processes, and that they arise instead because of this confound with neighborhood frequency. Hence the extent of post-access processes in lexical decision might be less than originally thought.
Evaluation of task differences
Throughout this section we have seen that different variables have different effects on performance, depending on which measure is used. In particular, lexical decision and word naming do not always give the same results. The differences arise because other tasks include aspects of non-automatic processing. Naming times include assembling a phonological code and articulation; lexical decision times include response preparation and post-access checking. Hence the differences in reaction times between the tasks may reflect differing accounts of post-access rather than access processes. Given that the goal of reading is to extract meaning, the extent to which either lexical decision or naming gets at this is questionable.
IS THERE A DEDICATED VISUAL WORD RECOGNITION SYSTEM?
How might our ability to read have come about? Although there has been plenty of time for speech to evolve (see Chapter 1), reading is a much more recent development. It is therefore unlikely that a specific system has had time to evolve for visual word processing. It seems more likely that the word recognition system must be tacked onto other cognitive and perceptual processes. However, words are unusual: We are exposed to them a great deal, they have a largely arbitrary relation with their meaning, and most importantly, in alphabetic writing systems at least, they are composed of units that correspond to sounds.
Is the word-processing system distinct from other recognition systems? This can be examined most simply in the context of naming pictures of objects, the picture-naming task. One important way of looking at this is to examine
parts. She proposed that recognizing faces depends just on holistic processing, whereas recognizing words depends on part processing. Recognizing other types of objects involves both sorts of processing to different degrees, depending on the specific object concerned.
Farah’s proposal makes specific predictions about the co-occurrence of neuropsychological deficits. Because object recognition depends on both holistic and part processing, you should never find a deficit of object recognition (called agnosia) without either a deficit of face recognition (called prosopagnosia) or word recognition (dyslexia). Similarly, if a person has both prosopagnosia and dyslexia, then they should also have agnosia.
Although this is an interesting proposal, it is not clear-cut that face perception is holistic, that object recognition is dependent on both wholes and parts, and that word recognition depends on just parts. Furthermore, Humphreys and Rumiati (1998) described the case of MH, a woman showing signs of general cortical atrophy. MH was very poor at object recognition, yet relatively good at word and face processing. This is the pattern that Farah predicted should never occur. Humphreys and Rumiati concluded that there are some differences between word and object processing: for example, there is much more variation in the spatial positions of parts in objects than letters in words. Words are two-dimensional and objects three-dimensional. Lambon Ralph, Sage, and Ellis (1996) describe a case study of a patient who can recognize words and objects (as familiar or unfamiliar, by a lexical or object decision task), but who is selectively impaired at retrieving the meanings of words. This behavior can be explained if there is a specific visual word form area, but it has become disconnected from the semantic system.
In summary there is considerable evidence that a dedicated brain region processes information about visual words.
MEANING-BASED FACILITATION OF VISUAL WORD RECOGNITION
We have seen that semantic priming is one of the most robust effects on word recognition. It turns out that there are different types of semantic priming, and they have different effects.
Types of “semantic” priming
One obvious question is whether all types of semantic relation are equally successful in inducing priming. The closer the meanings of the two words, the larger the size of the priming effect observed. We can also distinguish between associative priming and non-associative semantic priming.
Two words are said to be associated if participants produce one in response to the other in a word association task. This can be measured by word association norms such as those of Postman and Keppel (1970). Norms such as these list the frequency of responses to a number of words in response to the instruction “Say the first word that comes to mind when I say … doctor.” If you try this, you will probably find words such as “nurse” and “hospital” come to mind. It is important to note that not all associations are equal in both directions. “Bell” leads to “hop” but not vice versa: hence “bell” facilitates “hop,” but “hop” does not facilitate “bell.” Some words are produced as associates of words that are not related in meaning: an example might be “waiting” generated in response to “hospital.” Priming by associates is called associative priming; the two associates might or might not also be semantically related.
Non-associative semantically related words are those that still have a relation in terms of meaning to the target, but that are not produced as associates. Consider the words “dance” and “skate.” They are clearly related in meaning, but “skate” is rarely produced as an associate of “dance.” “Bread” and “cake” are an example of another pair of semantically related but unassociated words. Superordinate category names (e.g., “animal”) and category instances (e.g., “fox”) are clearly semantically related, but are not always strongly associated. Members of the same category (e.g., “fox” and “camel” are both animals) are clearly related, but are not always associated. Priming by words that are semantically but not associatively related is called non-associative semantic priming.
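The asymmetry of associations can be made concrete with a small sketch. The Python below stores hypothetical word association norms as counts of responses per cue word and computes forward and backward association strength as the proportion of respondents giving a particular response. The counts are invented for illustration and are not taken from Postman and Keppel (1970); the function names are likewise assumptions.

```python
from collections import Counter

# Invented response counts: cue word -> responses from 100 hypothetical participants.
norms = {
    "doctor": Counter({"nurse": 55, "hospital": 20, "patient": 25}),
    "bell": Counter({"hop": 30, "ring": 70}),
    "hop": Counter({"jump": 60, "skip": 40}),
}


def strength(cue: str, response: str) -> float:
    """Proportion of participants who produced `response` to `cue` (0.0 if never)."""
    responses = norms.get(cue, Counter())
    total = sum(responses.values())
    return responses[response] / total if total else 0.0


def association(a: str, b: str) -> tuple[float, float]:
    """Forward (a -> b) and backward (b -> a) association strength."""
    return strength(a, b), strength(b, a)


forward, backward = association("bell", "hop")
print(forward, backward)
```

With these made-up counts, "bell" elicits "hop" from some participants, while "hop" never elicits "bell". On this representation, associative priming would be expected from "bell" to "hop" but not in the reverse direction.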
Most studies of semantic priming have looked at word pairs that are both associatively and semantically related. However, some studies have examined the differential contributions of association and pure semantic relatedness to priming. In particular, to what extent are these types of priming automatic? The evidence for automatic associative priming is fairly clear-cut, and most of the research effort has focused on the question of whether or not we can find automatic non-associative semantic priming.
Many early studies found no evidence of automatic pure semantic facilitation. Lupker (1984) found virtually no semantic priming of non-associated words in a naming task. The word pairs were related in his experiment by virtue of being members of the same semantic category, but were not commonly associated (e.g., “ship” and “car” are related by virtue of both being types of vehicles, but are not associated). Shelton and Martin (1992) showed that automatic priming is obtained only for associatively related word pairs in a lexical decision task, and not for words that are semantically related but not associated. This result suggests that automatic priming appears to occur only within the lexicon by virtue of associative connections between words that frequently co-occur. Moss and Marslen-Wilson (1993) found that semantic associations (e.g., chicken–hen) and semantic properties (e.g., chicken–beak) have different priming effects in a cross-modal priming task. (In a cross-modal task, the prime is presented in one modality—e.g., auditorially—and the target in another—e.g., visually.) Associated targets were primed context-independently, whereas semantic-property targets were affected by the context of the whole surrounding sentence. Moss and Marslen-Wilson concluded that associative priming does not reflect the operation of semantic representations, but is a low-level, intra-lexical automatic process.
On the other hand, Hodgson (1991) found no priming for semantically related pairs in a naming task, but significant priming for the same pairs in a lexical decision task. It is possible that the instructions in his lexical decision task encouraged non-automatic processing (Shelton & Martin, 1992). Both Fischler (1977) and Lupker (1984) found some priming effect of semantic relation without association, also in a lexical decision task. The lexical decision task seems to be a less pure measure of automatic processing than naming, and hence this priming might have arisen through non-automatic means. Although Shelton and Martin (1992) also used a lexical decision task, they designed their experiment to minimize attentional processing. Rather than passively reading a prime and then responding to the target, participants made rapid successive lexical decisions to individual words. On a small proportion of trials two successive words would be related, and the amount of priming to the second word could be recorded. This technique of minimizing non-automatic processing produced priming only for the associated words, and not for the non-associated related words.
These results suggest that automatic priming in low-level visual word recognition tasks that tap the processes of lexical access can be explained by associations between words, rather than by mediation based on word meaning. “Doctor” primes “nurse” because these words frequently co-occur, leading to the strengthening of connections in the lexicon, rather than because of an overlap in their meaning, or the activation of an item at a higher level of representation. Indeed, co-occurrence might not even be necessary for words to become associated: it might be sufficient that two words tend to be used in the same sort of contexts. For example, both “doctor” and “nurse” tend to be used in the context of “hospital,” so they might become associated even if they do not directly co-occur (Lund, Burgess, & Atchley, 1995; Lund, Burgess, & Audet, 1996).
McRae and Boisvert (1998) questioned this conclusion. They argued that the studies that failed to find automatic semantic priming without association (most importantly, Shelton & Martin, 1992) failed to do so because the items used in these experiments were not sufficiently closely related (e.g., “duck” and “cow,” “nose” and “hand”). McRae and Boisvert used word pairs that were more closely related but still not
associated (e.g., “mat” and “carpet,” “yacht” and “ship”). With these materials McRae and Boisvert found clear facilitation even at very short (250 ms) SOAs. It now seems likely that at least some aspects of semantic relation can cause automatic facilitation.
The pattern of results observed also depends on the precise nature of the semantic relations involved. Moss, Ostrin, Tyler, and Marslen-Wilson (1995) found that both semantically and associatively related items produced priming of targets in an auditory lexical decision task. Furthermore, semantically related items produced a “boost” in the magnitude of priming if they were associatively related as well. However, a different pattern of results was observed in a visual lexical decision version of the task (which was also probably the version of the task that minimized any involvement of attentional processing). Here, whether or not pure (non-associative) semantic priming was observed depended on the type of semantic relation. Category coordinates (e.g., “pig–horse”) did not produce automatic priming without association, whereas instrument relations (e.g., “broom–floor”) did. This suggests that information about the use and purpose of an object is immediately and automatically activated.
Moss, McCormick, and Tyler (1997) also showed that some semantic properties of words are available before others. Using a cross-modal priming task, they found significant early priming for information about the function and design of artifacts, but not for information about their physical form. There are grounds to suppose (see Chapter 11 on the neuropsychology of semantics) that a different pattern of results would be obtained with other semantic categories. In particular, information about perceptual attributes might be available early for living things.
Finally, it should be pointed out that semantic priming may have different results in word recognition and word production. For example, Bowles and Poon (1985) showed that semantic priming has an inhibitory effect on retrieving a word given its definition (a production task), whereas we have just seen that in lexical decision (a recognition task) semantic priming has a facilitatory effect.
Does sentence context affect visual word recognition?
Priming from sentence context is the amount of priming contributed over and above that of the associative effects of individual words in the sentence. The beginning of the sentence “It is important to brush your teeth every single __” facilitates the recognition of a word such as “day,” which is a highly predictable continuation of the sentence, compared with a word such as “year,” which is not. The sentence context facilitates recognition even though there is no semantic relation between “day” and other words in the sentence. Can sentence context cause facilitation?
Schuberth and Eimas (1977) were the first to appear to demonstrate sentence context effects in visual word recognition. They presented incomplete context sentences followed by a word or nonword to which participants had to make a lexical decision. Response times were faster if the target word was congruent with the preceding context. West and Stanovich (1978) demonstrated similar facilitation by congruent contexts on word naming. Later studies have revealed limitations with regard to when and how much contextual facilitation can occur.
Fischler and Bloom (1979) used a paradigm similar to that of Schuberth and Eimas. They showed that facilitation only occurs if the target word is a highly probable continuation of the sentence. For example, consider the sentence “She cleaned the dirt from her __.” The word “shoes” is a highly predictable continuation here; the word “hands” is an unlikely but not anomalous continuation; “terms” would clearly be an anomalous ending. (We do not need to rely on our intuitions for this; we can ask a group of other participants to give a word to end the sentence and count up the numbers of different responses.) We find that an appropriate context has a facilitatory effect on the highly predictable congruent words (“shoes”) relative to the congruent but unlikely word (e.g., “hands”), and an inhibitory effect to the anomalous
words (e.g., “terms”). As there is no direct associative relation between “shoes” and other words in the sentence, this seems to be attributable to priming from sentence context.
Stanovich and West (1979, 1981; see also West & Stanovich, 1982) found that contextual effects are larger for words that are harder to recognize in isolation. Contextual facilitation was much larger when the targets were degraded by reduced contrast. In clear conditions, we find mainly contextual facilitation of likely words; in conditions of target degradation, we find contextual inhibition of anomalous words. Children, who of course are less skilled at reading words in isolation than adults, also display more contextual inhibition. Different tasks yield different results. Naming tasks tend to elicit more facilitation of congruent words, whereas lexical decision tasks tend to elicit more inhibition of incongruent words. The inhibition is most likely to arise because lexical decision is again tapping post-access, attentional processes. It is likely that these processes involve integrating the meanings of the words accessed with a higher level representation of the sentence.
West and Stanovich (1982) argued that the facilitation effects found in the naming task arise through simple associative priming from preceding words in the sentence. It is very difficult to construct test materials that eliminate all associative priming from the other words in the sentence to the target. If this explanation is correct, any facilitation found is simply a result of associative priming from the other words in the sentence. Sentence context operates by the post-access inhibition of words incongruent with the preceding context, and this is most likely to be detected with tasks such as lexical decision that are more sensitive to post-access mechanisms. One problem with this conclusion is that lexical relatedness is not always sufficient in itself to produce facilitation in sentence contexts (O’Seaghdha, 1997; Sharkey & Sharkey, 1992; Williams, 1988). This suggests that the facilitation observed comes from the integration of material into a higher text-level representation. Forster (1981) noted that the use of context may be very demanding of cognitive resources. This suggests that contextual effects should at least sometimes be non-automatic. Perhaps the potential benefit is too small for it to be worth the language processor routinely using context. Sentence context may only be of practical help in difficult circumstances, such as when the stimulus is degraded.
As naming does not necessitate integration of the target word into the semantic structure, the analysis of eye movements is revealing here. Schustack, Ehrlich, and Rayner (1987) found evidence of the effects of higher level context in the analysis of eye movements, but not of naming times. Inhoff (1984) had participants read short passages of text from Alice in Wonderland. A moving visual pattern mask moved in synchrony with the readers’ eyes. Ease of lexical access was manipulated by varying word frequency, and ease of conceptual processing was manipulated by varying how predictable the word was in context. Analysis of eye movements suggested that lexical access and context-dependent conceptual processing could not be separated in the earliest stages of word processing. The mask affected frequency and predictability differentially, suggesting that there is an early automatic component to lexical access, and a later non-automatic, effortful processing involving context. So context may have some early effects, but lexical access and conceptual processing later emerge as two separate processes. This experiment is also further support for the idea that early lexical processing is automatic, whereas later effects of context involve an attentional component.
Van Petten (1993) examined event-related potentials (ERPs) to semantically anomalous sentences. One advantage of the ERP technique is that it enables the time course of word recognition to be examined before an overt response (such as uttering a word or pressing a button) is made. The effects of lexical and sentence context were distinguishable in the ERP data, and the effects of sentence context were more prolonged. Van Petten concluded that there was indeed an effect of sentence context that could not be attributed to lexical priming. Furthermore, the priming effects appear to start at the same time, which argues against a strict serial model where lexical priming precedes sentence context priming. Similarly,
Kutas (1993) found that lexical and sentence context had very similar effects on ERPs. Both give rise to N400s (a large negative wave present 400 ms after the stimulus) whose amplitudes vary with the strength of the association or sentence context. Finally, Altarriba, Kroll, Sholl, and Rayner (1996) examined naming times and eye movements in an experiment where fluent English–Spanish bilinguals read mixed-language sentences. They found that sentence context operated both through intra-lexical priming and high-level priming. Contextual constraints still operate across languages, although the results were moderated by a lexical variable, word frequency.
Clearly the results are variable, and seem to be task-dependent. It is possible that processing in discourse is different from the processing of word lists such as are typically used in semantic priming experiments. Hess, Foss, and Carroll (1995) manipulated global and local context in a task where participants heard discourse over headphones, and then had to name the concluding target word, which appeared on a screen in front of them. The most important conditions were where the target word was globally related to the context but locally unrelated to the immediately preceding words (3), and globally unrelated but locally related (4):
(3) The computer science major met a woman who he was very fond of. He had admired her for a while but wasn’t sure how to express himself. He always got nervous when trying to express himself verbally so the computer science major wrote the poem.
(4) The English major was taking a computer science class that she was struggling with. There was a big project that was due at the end of the semester which she had put off doing. Finally, last weekend the English major wrote the poem.
Hess et al. found that only global context facilitated naming the target word “poem.” This result does not show that automatic semantic priming does not occur: we certainly observe it with isolated items presented rapidly together. The experiment does show that in real discourse the effects of global context may be more important.
Morris and Harris (2002) argue the RSVP (rapid serial visual presentation) technique is particularly suited to investigating the effects of sentence context because it resembles normal reading in that a whole sentence has to be read and processed, in contrast to tasks that involve responding to one particular word in a sentence. In the RSVP task, words are displayed one at a time in the same location, each new word overwriting the previous one. Readers tend to misread the word “rice” in sentences such as “She ran her best time yet in the rice last week” as “race” when the items are presented using RSVP (Potter, Moryadas, Abrams, & Noel, 1993). Clearly here sentence context is causing the misperception, but at what stage? The early, interactive accounts state that sentence context is one factor interacting with all others to determine the activation of a word, and affects recognition; the late, modular accounts state that “rice” is indeed selected, and corrected later as a result of postperceptual processing, or recall. Morris and Harris combined the RSVP task with repetition blindness, whereby people seeing a word repeated very soon after its first instance tend to omit the repetition in the reports of what they have seen—that is, they are blind to the repetition (Kanwisher, 1987). Repetition blindness can be so strong that people might report having seen “When she spilled the ink there was all over,” which doesn’t make sense, when they actually saw “When she spilled the ink there was ink all over.” The preponderance of evidence (e.g., from ERP studies) suggests that repetition blindness has an early, perceptual effect.
What happens if we combine RSVP with corrected words and repetition blindness in a misreading repetition blindness paradigm? Suppose we present participants with “race” very soon after the sentence “She ran her best time yet in the rice last week”? If the perceptual account of the correction is correct, “rice” should be “perceived” like “race,” and therefore we should get repetition blindness for the “second” “race.” If the postperceptual account is correct, people really do “see” “rice,” and therefore this case should not cause repetition blindness. Morris and Harris found that the perceptual account fitted the data better: reconstructions cause repetition blindness.
In summary, sentence context can have either an early perceptual effect or a late postperceptual effect. We can observe early effects, but only in certain tasks, particularly ones that resemble reading of whole sentences and discourse rather than responding to isolated words.

Summary of meaning-based priming studies

We can distinguish between associative semantic priming, associative non-semantic priming, and non-associative semantic priming. All sorts of priming have both automatic and attentional components, although there has been considerable debate as to the status of automatic non-associative semantic priming. Attentional processes include checking that the item accessed is the correct one, using conscious expectancies, and integrating the word with higher level syntactic and semantic representations of the sentence being analyzed. The remaining question is the extent to which sentence context has an automatic component. Researchers are divided on this, but there is a reasonable amount of evidence that it has. Schwanenflugel and LaCount (1988) suggested that sentential constraints determine the semantic representations generated by participants as they read sentences. The more specific the constraints, the more specific the expected semantic representations generated. Connectionist modeling also suggests a mechanism whereby sentence context could have an effect. In an interactive system, sentence context provides yet another constraint that operates on word recognition in the same way as lexical variables, facilitating the recognition of more predictable words.

How does priming occur? The dominant theory says that semantic priming occurs by the spread of activation. Activation is a continuous property, rather like heat, that spreads around a network. Items that are closely related will be close together in the network. Retrieving something from memory corresponds to activating the appropriate items. Items that are close to an item in the network will receive activation by its spread from the source unit. The farther away other items are from the source, the less activation they will receive.

A few researchers argue that activation does not spread, and instead propose a compound-cue theory (e.g., Ratcliff & McKoon, 1981, 1988; see also Hodgson, 1991). The central idea of spreading activation, which Ratcliff and McKoon disputed, is that activation can permeate some distance through a network, and that this permeation takes time. The further activation travels, the more time should pass, and it can be very difficult to detect some of these very small effects. Instead, according to compound-cue theory, priming involves the search of memory with a compound cue that contains both the prime and the target. This theory predicts that priming can only occur if two items are directly linked in memory. It therefore cannot account for mediated priming where two items that are not directly linked can be primed through an intermediary (see McNamara, 1992, 1994). Furthermore, there is now evidence that time elapses while activation spreads, and the more distantly related two things are, the longer the time that elapses (McNamara, 1992; McNamara & Altarriba, 1988).

PROCESSING MORPHOLOGICALLY COMPLEX WORDS

So far we have mainly looked at morphologically simple words. How are morphologically complex words stored in the lexicon? Is there a full listing of all derivations of a word, so that there are entries for “kiss,” “kissed,” “kisses,” and “kissing”? We call this the full-listing hypothesis. Or do we just list the stem (“kiss-”), and produce or decode the inflected items by applying a rule (you add “-ed” to the stem of a word to form the past tense)? As English contains a large number of irregular derivations (e.g., “ran,” “ate,” “mice,” “sheep”), we would then have to list the exceptions separately, so we would store a general rule and a list of exceptions. We call this the obligatory decomposition hypothesis (Smith & Sterling, 1982; Taft, 1981, 2004). There is an intermediate position, called the dual-pathway hypothesis. Although it is uneconomical to list all inflected words, some frequent and common inflected words do have their own listing (Monsell, 1985; Sandra, 1990).
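
The storage hypotheses for morphologically complex words can be pictured with a toy lookup. The sketch below is only an illustration under stated assumptions: the miniature lexicon, the suffix list, and the function names (`decompose`, `recognize`) are hypothetical and are not taken from any of the cited models, and trying the routes one after another is an arbitrary simplification. It consults a whole-word listing, then a list of exceptions, and finally strips a suffix and looks up the stem.

```python
# Toy illustration of a dual-pathway lexicon. All entries are hypothetical.
STEMS = {"kiss", "walk", "mend", "seem", "run", "eat", "mouse"}

# Irregular forms cannot be produced by rule, so they are listed as exceptions.
EXCEPTIONS = {
    "ran": ("run", "past"),
    "ate": ("eat", "past"),
    "mice": ("mouse", "plural"),
}

# Assumption: a few frequent inflected forms have their own whole-word entry.
WHOLE_WORD_LISTING = {"seemed": ("seem", "past")}

# Longer suffixes are tried first so that "-es" is preferred over "-s".
SUFFIXES = [("ing", "progressive"), ("ed", "past"), ("es", "s-form"), ("s", "s-form")]


def decompose(word):
    """Strip a suffix and return (stem, feature) if the stem is listed."""
    for suffix, feature in SUFFIXES:
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            if stem in STEMS:
                return stem, feature
    return None


def recognize(word):
    """Return (route, stem, feature), or None if the string is not recognized."""
    if word in WHOLE_WORD_LISTING:
        return ("whole-word",) + WHOLE_WORD_LISTING[word]
    if word in EXCEPTIONS:
        return ("exception",) + EXCEPTIONS[word]
    if word in STEMS:
        return ("stem", word, "base")
    parsed = decompose(word)
    if parsed is not None:
        return ("decomposed",) + parsed
    return None


for item in ["kissed", "kisses", "ran", "seemed", "blick"]:
    print(item, "->", recognize(item))
```

Emptying `WHOLE_WORD_LISTING` leaves a stem-plus-rule system with an exception list, in the spirit of obligatory decomposition. Listing every inflected form in it would correspond to the full-listing hypothesis.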
According to the obligatory decomposition hypothesis, to recognize a morphologically complex word we must first strip off its affix, a process known as affix stripping (Taft & Forster, 1975; see also Taft, 1985, 1987). In a lexical decision task, words that look as though they have a prefix, but in fact do not (e.g., “interest,” “result”), take longer to recognize than control words (Taft, 1979, 1981). It is as though participants are trying to strip these words of their affixes but are then unable to find a match in the lexicon and have to reanalyze. In a task where participants were asked to judge whether a visually presented word was pronounced identically to another word (i.e., the word was a homophone), Taft (1984) observed that people have difficulty with words such as “fined” that have a morphological structure different from their homophonic partner (here “find”). Taft argued that the difficulty with such words arises from the fact that inflected words are represented in the lexicon as stems plus their affix. Finally, consider words like “seeming” and “mending”; they have very similar surface frequencies; that is, those particular forms occur with about equal frequency in the language. However, the stems have very different base frequencies: “seem” and all its variants (seems, seemed) are much more frequent than “mend” and its variants (mends, mended). Which determines the ease of recognition: surface or base frequency? It turns out that on the whole lexical decision is much faster, and there are fewer errors, for words with high base frequencies, again suggesting that complex words are decomposed and recognized by their stem (Taft, 1979, 2004). However, the base frequency effect is not found for all words; for some common words there is no effect of base frequency but there is one of surface frequency (Baayen, Dijkstra, & Schreuder, 1997; Bertram, Schreuder, & Baayen, 2000; Schreuder & Baayen, 1997). This finding is evidence for the dual-pathway hypothesis, although the debate is ongoing, with Taft arguing that base- and surface-frequency effects arise at different stages of processing, so that the lack of a base-frequency effect is not evidence against obligatory decomposition.

Compound words whose meanings are not transparent from their components (e.g., “buttercup”) will also be stored separately (Sandra, 1990). Hence neither “milk” nor “spoon” will facilitate the recognition of “buttercup.”

Marslen-Wilson, Tyler, Waksler, and Older (1994) examined how we process derivationally complex words in English. Marslen-Wilson et al. used a cross-modal lexical decision task to examine what we decompose morphologically complex words into, and therefore the sorts of words that they can influence. For example, a participant would hear a spoken prime (e.g., “happiness”) and then immediately have to make a lexical decision to a visual probe (e.g., “happy”). The cross-modal nature of the task is important because it obliterates any possible phonological priming between similar words. Instead, any priming that occurs must result from lexical access.

The pattern of results was complicated and showed that the extent of priming found depends on the ideas of phonological transparency and semantic transparency. The relation between two morphologically related words is said to be phonologically transparent if the shared part sounds the same. Hence the relation in “friendly” and “friendship” is phonologically transparent (“friend” sounds the same in each word), but in “sign” and “signal” it is not (the “sign” components have different pronunciations). (Phonological transparency is really a continuum rather than a dichotomy, with some word pairs, such as “pirate” and “piracy,” in between the extremes.) A morphologically complex word is semantically transparent if its meaning is obvious from its parts: hence “unhappiness” is semantically transparent, being made up in a predictable fashion from “un-,” “happy,” and “-ness.” A word like “department,” even though it contains recognizable morphemes, is not semantically transparent. The meaning of “depart” in “department” is not obviously related to the meaning of “depart” in “departure.” It is semantically opaque.

Semantic and phonological transparency affect the way in which words are identified. Semantically transparent forms are morphologically decomposed, regardless of whether or not they are phonologically transparent. Semantically opaque words, however, are not decomposed. Furthermore, suffixed and prefixed words behave differently. Suffixed and
prefixed derived words prime each other, but pairs of suffixed words produce interference. This is because when we hear a suffixed word, we hear the stem first. All the suffixed forms then become activated, but as soon as there is evidence for just one of them, the others are suppressed. Therefore, if one of them is subsequently presented, we observe inhibition.

The experiment of Marslen-Wilson et al. shows that in English there is a level of lexical representation that is modality-independent (because we observe cross-modal priming), and that it is morphologically structured for semantically transparent words (because of the pattern of facilitation shown). More recent studies have found that morphological priming effects are independent of meaning similarity; that is, there is no difference in the priming effects for semantically transparent and opaque derivations in several languages, including English (Rastle, Davis, & New, 2004), French (Longtin, Segui, & Halle, 2003), and Hebrew (Frost, Forster, & Deutsch, 1997). These results suggest that morphological priming in general is obtained because of morphological structure rather than because of semantic overlap between similar items.

MODELS OF VISUAL WORD RECOGNITION

In this section, we examine some models of visual lexical access. They all take as input a perceptual representation of the word, and output desired information such as meaning, sound, and familiarity. The important question of how we access a word’s phonological form will be examined in the next chapter.

All models of word recognition have to address four main questions. First, is processing autonomous or interactive; in particular, are there top-down effects on word recognition? Second, is lexical access a serial or a parallel process? Third, can activation cascade from one level of processing to a later one, or must processing by the later stage wait until that of the earlier one is complete? Fourth, how do we find items? Do we find them by searching through the lexicon, or can we locate them because their storage location is defined by their content (a feature called content addressability)?

Carr and Pollatsek (1985) use the term lexical instance models for models that have in common that there is simply perceptual access to a memory system, the lexicon, where representations of the attributes of individual words are stored, and they do not have any additional rule-based component that converts individual letters into sounds. We can distinguish two main types of lexical instance model. These differ in whether they employ serial search through a list, or the direct, multiple activation of units. The best known instance of a search model is the serial search model. Direct access, activation-based models include the logogen model, localist connectionist models, as well as the cohort model of spoken word recognition (see Chapter 9). More difficult to fit into this simple scheme are hybrid or verification models (which combine direct access and serial search), and distributed connectionist models (which although very similar to the logogen model do not have simple lexical units at all).

Forster’s autonomous serial search model

Imagine how you might try to find a word by searching through a dictionary; you search through the entries, which are arranged to facilitate search on the basis of visual characteristics (that is, they are in alphabetical order), until you find the appropriate entry. The entry in the dictionary gives you all the information you need about the word: its meaning, its pronunciation, and its syntactic class. A commonly used analogy here is that of searching through a catalog to find the location of a book in the library. The model is a two-stage one; you can use the catalog to find out where the book is, but you still have to go to the shelf, find the book’s actual location, and extract information from it. Forster (1976, 1979) proposed that we identify words by a serial search through the lexicon. In this model the catalog system corresponds to what are called access files, and the shelf full of books to the master file.

In the serial search model, perceptual processing is followed by the sequential search of access
[Figure not reproduced; caption not recovered. Surviving labels: “Analysis of visual input to be used in search and to compute probable bin”; example entries COW and PIG.]
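
As a rough illustration of frequency-ordered serial search within a bin, the sketch below counts how many comparisons are needed to reach an entry. It is a minimal sketch under stated assumptions, not Forster's implementation: the bin contents, the words "colt" and "pug", and the function name `search_bin` are hypothetical, and how the probable bin is computed from the visual input is left out entirely. The frequencies (100,000 versus 10 in one bin, 20 versus 10 in the other) are the ones used in the chapter's two-bin example of the rank hypothesis.

```python
# Toy serial search within a frequency-ordered bin. Bin contents are hypothetical.
BIN_ONE = {"cow": 100_000, "colt": 10}
BIN_TWO = {"pig": 20, "pug": 10}


def search_bin(bin_entries, target):
    """Search a bin from the most to the least frequent entry.

    Returns (found, comparisons). A string that is not in the bin is
    rejected only after every entry has been checked (exhaustive search).
    """
    ordered = sorted(bin_entries, key=bin_entries.get, reverse=True)
    comparisons = 0
    for entry in ordered:
        comparisons += 1
        if entry == target:
            return True, comparisons
    return False, comparisons


# The second-ranked item costs two comparisons in both bins, even though the
# absolute frequency gap is huge in one bin and tiny in the other.
print(search_bin(BIN_ONE, "colt"))
print(search_bin(BIN_TWO, "pug"))
print(search_bin(BIN_ONE, "cof"))   # nonword: rejected after exhaustive search
```

In this sketch the number of comparisons depends only on an entry's rank within its bin, which is the sense in which relative rather than absolute frequency determines access time.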
Search is not affected by syntactic or semantic information, which is why the search is said to be autonomous. The only type of context that can operate on lexical access is associative priming within the master file. There is no early role for the effect of sentence context; sentence context can only have an effect through post-access mechanisms such as checking the output and integrating it with higher level representations. Repetition can temporarily change the order of items within bins, which is why we observe repetition priming. Illegal nonwords can be rejected early on in the bin selection process, but legal nonwords are only rejected after the exhaustive search of the appropriate bin.

Evaluation of the serial search model

The most significant criticism of the serial search model concerns the plausibility of a serial search mechanism. Although introspection suggests that word recognition is direct rather than involving serial search, we cannot rely on these sorts of data. Making a large number of serial comparisons will take a long time, but word recognition is remarkably fast. The model accounts for the main data in word recognition, and makes a strong prediction that priming effects should be limited to associative priming within the lexicon. There should be no top-down involvement of extra-lexical knowledge in word recognition. Finally, the model does not convincingly account for how we pronounce nonwords.

Forster (1994) addressed some of these problems. In particular, he introduced an element of parallelism by suggesting that all bins are searched simultaneously. The subdivision of the system into bins greatly speeds up the search, and it makes it possible to conclude that a string of letters is a nonword much more quickly than if the whole lexicon has to be searched.

The serial search model also provides an account of the effects of word frequency on lexical access. It was originally thought that the effect of frequency is roughly logarithmic, so that the difference in access times between a common and a slightly less common word is much less than between a rare and a slightly more rare word (Howes & Solomon, 1951; Murray & Forster, 2004). In the serial search model only the relative frequency of words within a bin has an effect on access time, not the absolute frequency. This idea is called the rank hypothesis (Murray & Forster, 2004). Suppose you have two bins; in one bin the absolute frequency of the first item is 100,000 and of the second item just 10, while in the second bin the frequency of the first item is just 20 and of the second 10. Hence in the first bin there is a big absolute difference in frequency between the two items, and in the second bin a small absolute difference. But in each case the relative frequencies, that is, the rank of the first item compared with the second, are the same. Most of the evidence suggests that relative frequency is more important in determining access time than absolute frequency. Detailed experimental analysis of lexical decision times and error rates for words with a wide range of frequencies shows that reaction times fit better to a linear rank function (as predicted by the rank hypothesis where all that matters is relative frequency) than to a logarithmic function (where absolute frequency matters). In particular, the extremes of the distribution do not behave as expected: both very high frequency and very low frequency words are responded to more slowly and inaccurately than the logarithmic function predicts.

The serial search model has proved very influential and is a standard against which to compare other models. Can we justify using lexical access mechanisms more complex than serial search?

The logogen model

In this model every word we know has its own simple feature counter called a logogen corresponding to it. A logogen accumulates evidence until its individual threshold level is reached. When this happens, the word is recognized. Lexical access is therefore direct, and occurs simultaneously and in parallel for all words. Proposed by Morton (1969, 1970), the logogen model was related to the information processing idea of features and demons, as described in Lindsay and Norman’s classic (1977) textbook, where “demons” monitor the perceptual input for specific “features”; the more evidence
FIGURE 6.9 Fragment of an interactive activation network of letter recognition. Arrows show excitatory connections; filled circles, inhibitory connections. From McClelland and Rumelhart (1981). [Figure not reproduced; surviving node labels: A, N, T, G, S.]
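
A network of the kind shown in Figure 6.9 can be caricatured in a few lines of code. The sketch below is a deliberately simplified illustration and not McClelland and Rumelhart's model: the update rule, the parameter values (`excite`, `inhibit`, `lateral`), the clipping of activation to the range 0 to 1, and the function name `settle` are all assumptions made for this example, and top-down feedback from words to letters is omitted. Each presented letter excites the word units that contain it in that position and inhibits the rest, and every word unit inhibits every other word unit on each processing cycle.

```python
# Simplified word-level competition. Not the published equations.
WORDS = ["TAKE", "TASK", "TIME", "CAKE", "CASK", "COKE"]


def settle(letters, cycles=10, excite=0.1, inhibit=0.1, lateral=0.2):
    """Return word activations after a number of processing cycles.

    letters: the letters presented so far, in order of position (e.g. "TAK").
    """
    activation = {word: 0.0 for word in WORDS}
    for _ in range(cycles):
        updated = {}
        for word in WORDS:
            # Letter-to-word input: excitatory if the letter fits, else inhibitory.
            bottom_up = sum(
                excite if word[pos] == letter else -inhibit
                for pos, letter in enumerate(letters)
            )
            # Within-level inhibition from all the other word units.
            competition = lateral * sum(
                activation[other] for other in WORDS if other != word
            )
            net = activation[word] + bottom_up - competition
            updated[word] = min(1.0, max(0.0, net))  # keep activation in [0, 1]
        activation = updated
    return activation


print(settle("T"))     # words beginning with "T" are equally active
print(settle("TAK"))   # the pattern settles with only "TAKE" active
```

With these assumed parameters the competitors are driven back to zero within a few cycles, which mimics the network relaxing into a stable configuration. Different parameter values would change how quickly, or whether, a single winner emerges.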
connections is either excitatory (that is, positive or facilitatory), if it is an appropriate one, or inhibitory (negative), if it is inappropriate. For example, the letter “T” would excite the word units “TAKE” and “TASK” in the level above it, but would inhibit “CAKE” and “CASK.” Excitatory connections make the destination units more active, while inhibitory connections make them less active. Each unit is connected to each other unit within the same level by an inhibitory connection. This introduces the element of competition. The network is shown in Figure 6.9.

When a unit becomes activated, it sends activation in parallel along the connections to all the other units to which it is connected. If it is connected by a facilitatory connection, it will have the effect of increasing activation at the unit at the other end of the connection, whereas if it is connected by an inhibitory connection, it will have the effect of decreasing the activation at the other end. Hence if the unit corresponding to the letter “T” in the initial letter position becomes activated, it will increase the activation level of the word units corresponding to “TAKE” and “TASK,” but decrease the activation level of “CAKE.” But because units are connected to all other units at the same level by inhibitory connections, as soon as a unit (e.g., a word) becomes activated, it starts inhibiting all the other units at that level. Hence if the system “sees” a “T,” then “TAKE,” “TASK,” and “TIME” will become activated, and immediately start inhibiting words without a “T” in them, like “CAKE,” “COKE,” and “CASK.” As activation is also sent back down to lower levels, all letters in words beginning with “T” will become a little bit activated and hence “easier” to “see.” Furthermore, as letters in the context of a word receive activation from the word units above them, they are easier to see in the context of a word than when presented in isolation, when they receive no supporting top-down activation; hence the word superiority effect. Equations described in the Appendix determine the way in which activation flows between units, is summed by units, and is used to change the activation level of each unit at each time step.

Suppose the next letter to be presented is an “A.” This will activate “TAKE” and “TASK” but inhibit “TIME,” which will then also be inhibited in turn by within-level inhibition from “TAKE” and “TASK.” The “A” will of course also activate
“CASK” and “CAKE,” but these will already be some way behind the two words starting with “T.” If the next letter is a “K,” then “TAKE” will be the clear leader. Time is divided into a number of slices called processing cycles. Over time, the pattern of activation settles down or relaxes into a stable configuration so that only “TAKE” remains activated, and hence is the word “seen” or recognized.

The interactive activation model of letter and word recognition has been highly influential. As the name implies, this type of model is heavily interactive; hence any evidence that appears to place a restriction on the role of context is problematic for it. The scope of the model is limited, and it gives no account of the roles of meaning and sound in visual word processing. Connection strengths have to be coded by hand. Models where the connection strengths are learned have become more popular. We will examine a connectionist learning model of word recognition and naming in the next chapter.

Hybrid models

Hybrid models combine parallelism (as in the logogen and connectionist models) with serial search (as in Forster’s model). In Becker’s (1976, 1980) verification model, bottom-up, stimulus-driven perceptual processes cannot recognize a word on their own. A process of top-down checking or verification has the final say. Rough perceptual processing generates a candidate or sensory set of possible lexical items. This sensory set is ordered by frequency. Context generates a contextual or semantic set of candidate items. Both the sensory and the semantic set are compared and verified by detailed analysis against the visual characteristics of the word. The semantic set is verified first; verification is serial. If a match is not found, then the matching process proceeds to the sensory set. This process will generate a clear advantage for words presented in an appropriate context. The less specific the context, the larger the semantic set, and the slower the verification process. As the context precedes the target word, the semantic set is ready before the sensory set is ready. Paap, Newsome, McDonald, and Schvaneveldt (1982) also presented a version of the verification model.

Verification models can be extended to include any model where there is verification or checking that the output of the bottom-up lexical access processes is correct. Norris (1986) argued that a post-access checking mechanism checks the output of lexical access against context and resolves any ambiguity.

Comparison of models

There are two dichotomies that could be used to classify these models. The first is between interactive and autonomous models. The second is whether words are accessed directly or through a process of search. The logogen and interactive activation models are both interactive direct access models; the serial search model is autonomous and obviously search-based. Most researchers agree that the initial stages of lexical access involve parallel direct access, although serial processes might subsequently be involved in checking prepared responses. There is less agreement on the extent to which context affects processing. All these models can explain semantic priming, but the serial search model has no role for sentence context.

COPING WITH LEXICAL AMBIGUITY

Ambiguity in language arises in a number of ways. There are ambiguities associated with the segmentation of speech. Compare the spoken phrases “gray tape” and “great ape,” and “ice cream” and “I scream”: in normal speech they sound the same. Some sentences have more than one acceptable syntactic interpretation. Although this chapter is primarily about visual word recognition, in this section we will look at lexical ambiguity for both visual and spoken words.

There are a number of types of lexical ambiguity. Homophones are words with different meanings that sound the same. Some examples of pure homophones are “bank” (a place for money, or a place beside a river) and “pen” (a writing instrument or a place to keep animals). Heterographic homophones sound the same but
are spelled differently (e.g., “knight” and “night,” and “weight” and “wait”). Homographs are ambiguous when written down, and some of these may be disambiguated when pronounced (such as “lead,” as in “dog lead” and “lead” the metal). Most interesting of all are polysemous words, which have multiple meanings. There are many examples of polysemous words in English, such as “bank,” “straw,” “ball,” and “letter.” Consider sentences (5) to (8). Some words are also syntactically ambiguous: “bank” can operate as a verb as well as a noun, as in (7) or (8):

(5) The fisherman put his catch on the bank.
(6) The businessman put his money in the bank.
(7) I wouldn’t bank on it if I were you.
(8) The plane is going to bank suddenly to one side.

Frazier and Rayner (1990) distinguished between words with multiple meanings, where the meanings are unrelated (e.g., the meanings of “bank” or “ball”), and words with multiple senses, where the senses are related (e.g., a “film” can be the physical reel or the whole thing that is projected on a screen or watched on television; “twist” can be a coil, or to operate something by turning, or to sprain an ankle, or to distort the meaning of something; all the meanings are related). It is not always easy to decide whether a word has multiple meanings or senses.

We are faster to make lexical decisions about ambiguous words than about matched unambiguous words; this advantage is called the ambiguity advantage (Jastrzembski, 1981). However, the advantage is only found for lexical decision. For other tasks there is no advantage or even a disadvantage (e.g., on eye-movement measures; see Rayner, 1998). Perhaps ambiguous words benefit from having multiple entries in the lexicon. This observation needs qualification: while multiple senses of a word confer an advantage, distinct multiple meanings do not (Rodd, Gaskell, & Marslen-Wilson, 2002).

Most of the time we are probably not even aware of the ambiguity of ambiguous words; we have somehow used the context of the sentence to disambiguate the sentence, that is, to select the appropriate sense. The two main processing questions are: How do we resolve the ambiguity (that is, how do we choose the appropriate meaning or reading)? And at what stage is context used?

Early work on lexical ambiguity

Early research on lexical ambiguity used a variety of tasks to examine at what point we select the appropriate meaning of an ambiguous word. Most of these tasks were off-line, in the sense that they used indirect measures that tap processing some time after the ambiguity has been resolved.

Early models of lexical ambiguity

When we come across an ambiguous word, do we immediately select the appropriate sense, or do we access all of the senses and then choose between them, either in some sequence or in parallel? Early researchers worked within the framework of three types of model of how lexical ambiguity is resolved.

We can call the first model the context-guided single-reading lexical access model (Glucksberg, Kreuz, & Rho, 1986; Schvaneveldt, Meyer, & Becker, 1976; Simpson, 1981). According to this model, the context somehow restricts the access process so that only the relevant meaning is ever accessed. One problem with this model is that it is unclear how context can provide such an immediate constraint.

The second model is called the ordered-access model (Hogaboam & Perfetti, 1975). All of the senses of a word are accessed in order of their individual meaning frequencies. For example, the “writing instrument” sense of “pen” is more frequent than the “agricultural enclosure for animals” sense. Each sense is then checked serially against the context to see if it is appropriate. We check the most common sense against the context first to see if it is consistent. Only if it is not do we try the less common meaning.

The third model is called the multiple-access model (Onifer & Swinney, 1981; Swinney, 1979; Tanenhaus, Leiman, & Seidenberg, 1979). According to this model, when an ambiguous word is encountered, all its senses are activated, and the appropriate one is chosen when the context permits.
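
The difference between the three early models lies in which senses are accessed before one is chosen. The sketch below makes that contrast explicit. It is only an illustration under stated assumptions: the sense inventory for “pen,” the relative frequencies, the representation of context as a set of compatible senses, and the function names are all hypothetical. Each function returns the list of senses accessed and the sense finally chosen.

```python
# Toy contrast of three accounts of lexical ambiguity resolution.
# Sense labels and relative frequencies are hypothetical.
SENSES = {
    "pen": [("writing instrument", 0.8), ("animal enclosure", 0.2)],
}


def by_frequency(word):
    """Senses of a word, most frequent first."""
    ranked = sorted(SENSES[word], key=lambda pair: pair[1], reverse=True)
    return [sense for sense, _ in ranked]


def context_guided(word, compatible):
    """Context restricts access: only context-compatible senses are accessed."""
    accessed = [sense for sense in by_frequency(word) if sense in compatible]
    chosen = accessed[0] if accessed else None
    return accessed, chosen


def ordered_access(word, compatible):
    """Senses are checked one at a time in frequency order until one fits."""
    accessed = []
    for sense in by_frequency(word):
        accessed.append(sense)
        if sense in compatible:
            return accessed, sense
    return accessed, None


def multiple_access(word, compatible):
    """All senses are activated; context then selects among them."""
    accessed = by_frequency(word)
    chosen = next((sense for sense in accessed if sense in compatible), None)
    return accessed, chosen


farm_context = {"animal enclosure"}   # e.g. a sentence about sheep
for model in (context_guided, ordered_access, multiple_access):
    print(model.__name__, model("pen", farm_context))
```

All three functions choose the same sense in the end. What distinguishes them is the set of senses accessed along the way, which is why tasks that tap processing only after the ambiguity has been resolved cannot easily tell the models apart.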
(16) The accountant filled his pen with ink.
(17) The farmer put the sheep in the pen.

Schvaneveldt et al. (1976) employed a successive lexical decision task, in which participants see individual words presented in a stream, and have to make lexical decisions for each word. In this case participants become far less aware of relations between successive words. The lexical decision time to triads of words such as (18), (19), and (20) is the main experimental concern:

(18) save bank money
(19) river bank money
(20) day bank money

The fastest reaction time to “money” was in (18) where the appropriate meaning of “bank” had been primed by the first word (“save”). Reaction time was intermediate in control condition (20), but slowest in (19) where the incorrect sense had been primed. If all senses of “bank” had been automatically accessed when it was first encountered, then “money” should have been primed by “bank” whatever the first word. This result therefore supports selective access.

Swinney’s (1979) experiment

Some of the early evidence supported multiple access, and some selective access. The results we find are very task-dependent. Furthermore, the tasks are either off-line, in the sense that they reflect processing times well after the ambiguity has been processed (such as ambiguity detection, dichotic listening, and sentence completion), or are on-line tasks such as phoneme monitoring that are very sensitive to other variables. We need a task that tells us what is happening immediately when we come across an ambiguous word. Swinney (1979) carried out such an experiment. He used a cross-modal priming technique in which participants have to respond to a visual lexical decision task while listening to correlated auditory material.

(21) Rumor had it that, for years, the government building had been plagued with problems. The man was not surprised when he found several (spiders, roaches, and other) bugs[1] in the cor[2]ner of his room.

In (21) the ambiguous word is “bugs.” The phrase “spiders, roaches, and other” is a disambiguating context that strongly biases participants towards the “insect” sense of “bugs” rather than
the “electronic” sense. Only half the participants saw this strongly disambiguating phrase. There was a visually presented lexical decision task either immediately after (at point 1) or slightly later (three syllables after the critical word, at point 2). The target in the lexical decision was either “ant” (associated with the biased sense), “spy” (associated with the irrelevant sense), or “sew” (a neutral control). Swinney found facilitation at point 1 for both meanings of “bugs,” including the irrelevant meaning, but facilitation only for the relevant meaning at point 2. This suggests that when we first come across an ambiguous word, we automatically access all its meanings. We then use context to make a very fast decision between the alternatives, leaving only the consistent sense active.
Swinney’s experiment showed that semantic context cannot restrict initial access. Tanenhaus et al. (1979) performed a similar experiment based on a naming task rather than lexical decision. They used words that were syntactically ambiguous (e.g., “watch,” which can be a verb or a noun). Tanenhaus et al. found that both senses of the word were initially activated in sentences such as “Boris began to watch” and “Boris looked at his watch.” Again, the context-independent meaning faded after about 200 ms. Hence syntactic context cannot constrain initial access either. Tanenhaus and Lucas (1987) argued that there are good reasons to expect that initial lexical access should not be restricted by syntactic context. Set-membership feedback is of little use in deciding whether or not a word belongs to a particular syntactic category: put another way, the likelihood of correctly guessing what word is presented given just its syntactic category is very low.
In summary, the data so far suggest that when we hear or see an ambiguous word, we unconsciously access all the meanings immediately, but use the context to very quickly reject all inappropriate senses. This process can begin after approximately 200 ms. Less frequent meanings take longer to access because more evidence is needed to cross their threshold for being considered appropriate to the context. This suggests that the processes of lexical access are autonomous, or informationally encapsulated, in that all senses of the ambiguous word are output, but then semantic information is utilized very quickly to select the appropriate sense. This in turn suggests that the construction of the semantic representation of the sentence is happening more or less on a word-by-word basis.
McClelland (1987) argued that these findings are consistent with interactive theories. He argued that context might have an effect very early on, but the advantage it confers is so small that it does not show up in these experiments. This approach is difficult to falsify, so for now the best interpretation of these experiments is that we access all the meanings.
The effects of meaning frequency and prior context
There is now agreement that when we encounter an ambiguous word, all meanings are activated and context is subsequently used to very quickly select the correct meaning. Recent research has used on-line techniques, primarily cross-modal priming and eye-movement measures, to refine these ideas. Research has focused on three main issues. First, what effect does the relative frequency of the different meanings of the ambiguous word have on processing? Second, what is the effect of presenting strong disambiguating context before the ambiguous word? Third, how does context affect the access of semantic properties of words?
There is controversy about whether the relative frequencies of meanings affect initial access. On the one hand, Onifer and Swinney (1981) replicated Swinney’s experiment using materials with an asymmetry in the frequency of the senses of the ambiguous word, so that one meaning was much more frequent than the other meaning. Nevertheless, they still observed that all meanings were initially activated, regardless of the biasing context. However, the dominant meaning may be activated more strongly and perhaps sooner than less frequent ones (Simpson & Burgess, 1985). Extensive use has been made recently of studying eye movements, which are thought to reflect on-line processing. Studies making use of this technique showed that the
time participants take gazing at ambiguous words depends on whether the alternative meanings of the ambiguous word are relatively equal or highly discrepant in frequency. Simpson (1994) called the two types of ambiguous words balanced and unbalanced respectively.
In most of the studies we have examined so far, the disambiguating context comes after the ambiguous word. The evidence converges on the idea that all meanings are immediately accessed but that the context is quickly used to select one of them. What happens when the disambiguating context comes before the ambiguous words? Three models have been proposed to account for what happens.
According to the selective access model, prior disambiguating material constrains access so that only the appropriate meaning is accessed.
According to the reordered access model, prior disambiguating material affects the access phase in that the availability of the appropriate meaning of the word is increased (Duffy, Morris, & Rayner, 1988; Rayner, Pacht, & Duffy, 1994). It is a hybrid model between autonomous and interactive models, where the influence that context can have is limited. Duffy et al. (1988) examined the effect of prior context on balanced or unbalanced ambiguous words, with the unbalanced words always biased by the context to their less common meaning. Processing times for balanced words and their controls were the same, but participants spent longer looking at unbalanced words than the control words. Duffy et al. argued that the prior disambiguating context increased availability of appropriate meanings for both balanced and unbalanced words. In the case of the balanced words, the meaning indicated by the context was accessed before the other meanings. In the case of the unbalanced words with the biasing context, the two meanings were accessed at the same time, with additional processing time then needed to select the appropriate subordinate meaning. This additional time is called the subordinate bias effect (Rayner et al., 1994). A biasing context can reorder the availability of the meanings so that the subordinate meaning becomes available at the same time as the dominant meaning.
According to the autonomous access model, prior context has no effect on access; meanings are accessed exhaustively. In a version of this called the integration model, the successful integration of one meaning with prior context terminates the search for alternative meanings of that word (Rayner & Frazier, 1989). Hence there is selective (single meaning) access when the integration of the dominant meaning is fast (due to the context) but identification of a subordinate meaning is slow.
Dopkins, Morris, and Rayner (1992) carried out an experiment to distinguish between the reordered access and integration models. In their experiment, an ambiguous word was both preceded and followed by context relevant to the meaning of the word. The context that followed the ambiguous word always conclusively disambiguated it. The main manipulation in this experiment was the extent to which the prior context was consistent with the meanings of the ambiguous word. In the positive condition, the ambiguous word was preceded by material that highlighted an aspect of its subordinate meaning, although the context was also consistent with the dominant meaning (e.g., 22). In the negative condition, the word was preceded by material that was inconsistent with the dominant meaning but did not contain any strong bias to the subordinate meaning (e.g., 23). In the neutral condition, the ambiguous word was preceded by context that provided support for neither of its meanings (e.g., 24).
(22) Having been examined by the king, the page was soon marched off to bed. [positive condition]
(23) Having been hurt by the bee-sting, the page was soon marched off to bed. [negative condition]
(24) Just as Henrietta had feared, the page was soon marched off to bed. [neutral condition]
What do the two models predict? The critical condition is the positive condition. The integration model predicts that context has no effect on the initial access phase. The meanings of ambiguous words will be accessed in a strict temporal sequence that is independent of
the context, with the dominant meaning always accessed first. If this meaning can be integrated with the context, it will be selected; if not, the processor will try to integrate the next meaning with the context, and so on. In the positive and neutral conditions, the context will contain no evidence that the dominant meaning is inappropriate, so the processor will succeed in integrating this meaning, halt before the subordinate meaning is accessed, and move on. When the subsequent material is encountered, the processor realizes its mistake and has to backtrack. In the negative condition, the preceding context indicates that the dominant meaning is inappropriate, so the processor will then have to spend time accessing the subordinate meaning. The later context will provide no conflict. The integration model predicts that processing times for the ambiguous word will be longer in the negative condition than in the positive and neutral conditions, but processing time for the later disambiguating context will be longer in the positive and neutral conditions than in the negative.
The reordered access model predicts that the preceding context will have an effect on the initial access of the ambiguous word in the positive condition but not in the negative or neutral conditions. In the positive condition, the context will lead to the subordinate meaning being accessed early. This means that when the context after the word is encountered, the processor will not have to recompute anything, so processing in the disambiguating region will be fast. In the negative and neutral conditions the preceding context contains no evidence for the subordinate meaning and the predictions are similar to the integration model.
The key condition, then, is the positive condition, which favors the subordinate meaning but is also consistent with the dominant meaning. The reordered access model predicts that processing times in the subsequent disambiguation region will be relatively fast, whereas the integration model predicts that they will be relatively slow. The results supported the reordered access model. Dopkins et al. found that reading times for the disambiguating material were indeed relatively fast in the positive condition.
The reordered access model finds further support from an experiment by Folk and Morris (1995). They examined reading fixation times and naming times when reading words that were semantically ambiguous (e.g., “calf”), had the same pronunciation but different meanings and orthographies (e.g., “break” and “brake”), or had multiple semantic and phonological codes (e.g., “tear”). They found that semantic, phonological, and orthographic constraints all had an early effect, influencing the order of availability of the meanings.
So far, then, the data support a reordered access model over a strictly autonomous one such as the integration model. Contextual information can be used to restrict the access of meanings. In the reordered access model, however, the role of context is restricted by meaning frequency. In particular, the subordinate-biased context cannot inhibit the dominant meaning from becoming available. Recent research has examined the extent to which this is true. An alternative model is the context-sensitive model (Simpson, 1994; Vu, Kellas, & Paul, 1998), where meaning frequency and biasing context operate together, dependent on contextual strength. This is the degree of constraint that the context places on an ambiguous word. According to this model, the subordinate bias effect that motivated the reordered access model only arises in weakly biasing contexts. If the context is sufficiently strong, the subordinate meaning alone can become available.
If the context-sensitive model is correct, then a sufficiently strong context should abolish the subordinate bias effect whereby we spend longer looking at an ambiguous word when its less frequent meaning is indicated by the context. This idea was tested in an experiment by Martin, Vu, Kellas, and Metcalf (1999). Martin et al. varied the strength of the discourse context: (25) is a weakly biasing context towards the subordinate meaning, but (26) is a strongly biasing context to the subordinate meaning; (27) and (28) show the control contexts for the dominant meanings.
(25) The scout patrolled the area. He reported the mine to the commanding officer. [weak context favoring subordinate meaning]
(26) The gardener dug a hole. She inserted the bulb carefully into the soil. [strong context favoring subordinate meaning]
(27) The farmer saw the entrance. He reported the mine to the survey crew. [weak context favoring dominant meaning]
(28) The custodian fixed the problem. She inserted the bulb into the empty socket. [strong context favoring dominant meaning]
According to the reordered access model, the dominant meaning will always be generated regardless of context, so time will be needed to resolve the competition. Hence there will be a subordinate bias effect, and the reading times on the ambiguous word should be the same, and longer than the reading time for the dominant meanings, regardless of the strength of the context. According to the context-sensitive model, there should only be conflict and therefore a subordinate bias effect in the weak context condition; therefore reading times of the ambiguous word should be faster with the strong biasing context compared with the weak context. The data from a self-paced reading task supported the context-sensitive model. A sufficiently strong context can eliminate the subordinate bias effect so that reading times on a word with either the subordinate or the dominant meaning strongly indicated are the same.
Rayner, Binder, and Duffy (1999) criticized the materials in this experiment. They argued that many of the items were unsuitable. For example, some items appeared to be more balanced than biased, and some contexts were consistent with the same meaning. They also argued that the reordered access model predicts that in very strong contexts the subordinate meaning might be accessed before the dominant meaning. Nevertheless, access is exhaustive: the dominant meaning is still always accessed, unless the context contains a strong associate of the intended meaning, as in Seidenberg, Tanenhaus, Leiman, and Bienkowski (1982). Hence, Rayner et al. (1999) argue, the data from Martin et al. are not contrary to the reordered access model. In reply, Vu and Kellas (1999), while admitting that there were problems with some of their stimuli, claim that these problems could not have led to erroneous results.
Accessing selective properties of words
Tabossi (1988a, 1988b) used a cross-modal priming task to show that sentence context that specifically constrains a property of the prime word leads to selective facilitation. She argued for a modified version of context-dependency: not all aspects of semantic-pragmatic context can constrain the search through the possible meanings, but semantic features constraining specific semantic properties can provide such constraints. For example, the context in (29) clearly suggests the “sour” property of “lemon.” Tabossi observed facilitation when the target “sour” was presented visually in a lexical decision task immediately after the prime (“lemon”), relative both to the same context but with a different noun (30) and a different context with the same noun (31).
(29) The little boy shuddered eating the lemon.
(30) The little boy shuddered eating the popsicle.
(31) The little boy rolled on the floor a lemon.
In effect, Tabossi argued that there are large differences in the effectiveness of different types of contextual cues. If the context is weakly constraining, we observe exhaustive access, but if it is very strongly constraining, we observe selective access. However, Moss and Marslen-Wilson (1993) pointed out that the acoustic offset of the prime word might be too late to measure an effect, given that initial lexical access occurs very early, before words are completed. Tabossi used two-syllable-long words, and it is possible that these words were long enough to permit initial exhaustive access with selection occurring before presentation of the target. Tabossi and Zardon (1993) examined this possibility in a cross-modal lexical decision task by presenting the target 100 ms before the end of the ambiguous prime. They still found that only the dominant, relevant meaning was activated when the context was strongly biasing towards that meaning. Tabossi and Zardon also found that if the context strongly biases the interpretation to the less frequent meaning, both the dominant meaning (because of its dominance) and less
dominant meaning (because of the effect of context) are active after 100 ms (see also Simpson & Krueger, 1991).
Moss and Marslen-Wilson (1993) also explored the way in which aspects of meaning can be selectively accessed. They measured lexical access very early on, before the presentation of the prime had finished. Semantically associated targets were primed independent of context, whereas access to semantic-property targets was affected by the semantic context. Semantic properties were not automatically accessed whenever heard, but could be modulated by prior context, even at the earliest probe position. Hence this finding again indicates that neither exhaustive nor selective access models may be quite right, in that what we find depends on the detailed relation between the context and the meanings of the word.
Evaluation of work on lexical ambiguity
Early on, there were two basic approaches to how we eventually select the appropriate sense of ambiguous words. According to the autonomous view, we automatically access all the multiple senses of a word, and use the context to select the appropriate reading. Semantic information context is then used to access the appropriate sense of the word. On the interactive view, the context enables selective access of the appropriate sense of the ambiguous word. The experiments used in this area are very sensitive to properties of the target and context length. When we get context-sensitive priming in these cross-modal experiments depends on the details of the semantic relation between the target and prime. Early experiments using off-line tasks found contradictory results for both multiple and context-specific selective access. Later experiments using more sophisticated cross-modal priming indicated multiple access with rapid resolution.
More recent experiments suggest that the pattern of access depends on the relative frequencies of the alternative senses of the ambiguous word and the extent to which the disambiguating context constrains the alternatives. All recent models of disambiguation incorporate an element of interactivity: the question now is the extent to which it is restricted. Can a sufficiently constraining semantic context prevent the activation of the less dominant meaning of a word? Hence the way in which we deal with lexical ambiguity depends on both the characteristics of the ambiguous word and the type of disambiguating context.
A number of questions remain to be answered. In particular, how does context exert its influence in selecting the right meaning? How does semantic integration occur? MacDonald, Pearlmutter, and Seidenberg (1994b) address this issue, and also address the relation between lexical and syntactic ambiguity. They propose that the two are resolved using similar mechanisms based on an enriched lexicon. Kawamoto (1993) constructed a connectionist model of lexical ambiguity resolution. The model showed that, even in an interactive system, multiple candidates become active, even when the context clearly favors one meaning. (This happens because the relation between a word’s perceptual form and its meanings is much stronger than the relation between the meaning and the context.) This suggests that multiple access is not necessarily diagnostic of modularity.
Although ambiguous words appear to cause difficulty for the language system, there are some circumstances where ambiguous words have an advantage. We may be quicker to name ambiguous words compared with unambiguous words, and they have an advantage in lexical decision (e.g., Balota, Ferraro, & Conner, 1991; Jastrzembski, 1981; Kellas, Ferraro, & Simpson, 1988; Millis & Button, 1989; but see Borowsky & Masson, 1996). There are a number of explanations for this possible advantage, but they all center around the idea that having multiple target meanings speeds up processing of the word. For example, if each word meaning corresponds to a detector such as a logogen, then a word with two meanings will have two detectors. The probability of an ambiguous word activating one of its multiple detectors will be higher than the probability of an unambiguous word activating its only detector.
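The detector argument for the ambiguity advantage can be made concrete with a small calculation. The sketch below is only an illustration under assumptions that are not in the text: each detector is taken to reach threshold independently, with the same probability, within some fixed time window, and the value 0.6 is hypothetical. The function name p_any_detector is likewise invented for this example.

```python
def p_any_detector(p_single: float, n_detectors: int) -> float:
    """Probability that at least one of n detectors reaches threshold.

    ASSUMPTION (not from the text): detectors fire independently, each with
    the same probability p_single within a fixed time window.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_detectors < 1:
        raise ValueError("n_detectors must be at least 1")
    # Complement rule: 1 minus the probability that every detector fails.
    return 1.0 - (1.0 - p_single) ** n_detectors


# Hypothetical firing probability, chosen only for illustration.
p = 0.6
unambiguous = p_any_detector(p, 1)  # one meaning, one detector
ambiguous = p_any_detector(p, 2)    # two meanings, two detectors
print(f"unambiguous: {unambiguous:.2f}, ambiguous: {ambiguous:.2f}")
```

Under these assumptions the two-detector word has the higher chance of triggering a response for any firing probability strictly between 0 and 1, which is the pattern the detector account relies on. If the detectors were strongly correlated, the advantage would shrink.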
SUMMARY
1. What might be different about reading in languages such as Hebrew that read from right to left?
2. Is the lexicon really like a dictionary?
3. Compare and contrast two models of word recognition.
4. How many types of priming are there?
5. What are the differences between naming, recognition, lexical access, and accessing the meaning?
What might neuropsychology tell us about these processes?
FURTHER READING
For a collection of papers surveying the field, see Andrews (2006). For reviews of the eye-movement
literature, see van Gompel, Fischer, Murray, and Hill (2006), and the collection edited by Henderson
and Ferreira (2004). For a detailed discussion of the latest version of the E-Z Reader model (version
7), and a comparison with several other important models of eye-movement control in reading, with
peer commentary, see Reichle, Rayner, and Pollatsek (2003). In addition to the E-Z Reader, there are
other recent models of eye-movement control in reading. See McDonald, Carpenter, and Shillcock
(2005) for the SERIF model. The SERIF model emphasizes the way in which information from each
half of the visual field is transmitted to the contralateral visual cortex. See Legge, Klitz, and Tjan
(1997) for the Mr. Chips model, and Martin (2004) for the Encoder model.
See Dean and Young (1996) for a review of work on repetition priming, and experimental evi-
dence that is troublesome for the episodic view. Morrison, Chappell, and Ellis (1997) provide age-
of-acquisition norms for a large set of object names.
More recent work on perception without awareness can be found in the papers by Doyle and
Leach (1988) and Dagenbach, Carr, and Wilhelmsen (1989). Humphreys (1985) reviewed the litera-
ture on attentional processes in priming. Neely (1991) provides a wide-ranging review of semantic
priming. For discussion of whether associative priming occurs through a mechanism of spreading
activation or some more complex process, see McNamara (1992, 1994). Plaut and Booth (2000)
present a connectionist model that incorporates both facilitation and inhibition using a single mecha-
nism. See Kinoshita and Lupker (2003) for a review of work on masked priming.
An excellent review of models of word recognition is Carr and Pollatsek (1985); they provide a
useful diagram showing the relation of all types of recognition model. See Garnham (1985) for more
detail on the interactions between frequency, context, and stimulus quality.
CHAPTER 7
READING
quite regular, but a phoneme may have different graphemic realizations (e.g., the graphemes “o,” “au,” “eau,” “aux,” and “eaux” all represent the same sounds). In consonantal scripts, such as Hebrew and Arabic, not all sounds are represented, as vowels are not written down at all. In syllabic scripts (such as Cherokee and the Japanese script kana), the written units represent syllables. Finally, some languages do not represent any sounds. In ideographic languages (sometimes also called logographic languages), such as Chinese and the Japanese script kanji, each symbol is equivalent to a morpheme (see Table 7.1).
One consequence of this variation in writing systems is that there must be differences in processing between readers of different languages.
[Figure: the Hebrew alphabet. There is much more variability in the structure of written languages than there is in spoken languages. In consonantal scripts, such as Hebrew (above) and Arabic, not all sounds are represented.]
Hence this chapter should be read with the caution in mind that some conclusions may be true of English and many other writing systems, but not necessarily of all of them.
Unlike speech, reading and writing are a relatively recent development. Writing emerged independently in Sumer and Mesoamerica, and perhaps also in Egypt and China. The first writing system was the cuneiform script printed on clay in Sumer, which appeared just before 3000 BC. The emergence of the alphabetic script can be traced to ancient Greece in about 1000 BC. The development of the one-to-many correspondence in English orthography primarily arose between the fifteenth and eighteenth centuries as a consequence of the development of the printing press and the activities of spelling “reformers” who tried to make the Latin and Greek origins of words more apparent in their spellings (see Ellis, 1993, for more detail). Therefore it is perhaps not surprising that reading is actually quite a complex cognitive task. There is a wide variation in reading abilities, and many different types of reading disorder arise as a consequence of brain damage.
A PRELIMINARY MODEL OF READING
Introspection can provide us with a preliminary model of reading. Consider how we might name or pronounce the word “beef.” Words like this are said to have a regular spelling-to-sound correspondence. That is, the graphemes map onto
phonemes in a totally regular way; you need no special knowledge about the word to know how to pronounce it. If you had never seen the word “beef” before, you could still pronounce it correctly. Some other examples of regular word pronunciations include “hint” and “rave.” In these words, there are alternative pronunciations (as in “pint” and “have”), but “hint” and “rave” are pronounced in accordance with the most common pronunciations. These are all regular words, because all the graphemes have the standard pronunciation.
Not all words are regular, however. Some are irregular or exception words. Consider the word “steak.” This has an irregular spelling-to-sound (or grapheme-to-phoneme) correspondence: the grapheme “ea” is not pronounced in the usual way, as in “streak,” “sneak,” “speak,” “leak,” and “beak.” Other exceptions to a rule include “have” (an exception to the rule that leads to the regular pronunciations “gave,” “rave,” “save,” and so forth) and “vase” (in British English, an exception to the rule that leads to the regular pronunciations “base,” “case,” and so forth). English has many irregular words. Some words are extremely irregular, containing unusual patterns of letters that have no close neighbors, such as “island,” “aisle,” “ghost,” and “yacht.” These words are sometimes called lexical hermits.
Finally, we can pronounce strings of letters such as “nate,” “smeak,” “fot,” and “datch,” even though we have never seen them before. These letter strings are all pronounceable nonwords or pseudowords. Therefore, even though they are novel, we can still pronounce them, and we all tend to agree on how they should be pronounced. If you hear nonwords like these, you can spell them correctly; you assemble their pronunciations from their constituent graphemes. (Of course, not all nonwords are pronounceable, e.g., “xzhgh.”)
Our ability to read nonwords on the one hand and irregular words on the other suggests the possibility of a dual-route model of naming. We can assemble pronunciations for words or nonwords we have never seen before, yet also pronounce correctly irregular words that must need information specific to those words (that is, lexical information). The classic dual-route model (see Figure 7.1) has two routes for turning words into sounds. There is a direct access or lexical route, which is needed for irregular words. This must at least in some way involve a direct link between print and sound. That is, the lexical route takes us directly to a word’s entry in the lexicon and we are then able to retrieve the sound of a word. There is also a grapheme-to-phoneme conversion (GPC) route (also called the indirect or non-lexical or sublexical route), which is used for reading nonwords. This route carries out what is called phonological recoding. It does not involve lexical access at all. The non-lexical route was first proposed in the early 1970s (e.g., Gough, 1972; Rubenstein, Lewis, & Rubenstein, 1971). Another important justification for a grapheme-to-phoneme conversion route is that it is useful for children learning to read by sounding out words letter by letter.
Given that neither route can in itself adequately explain reading performance, it seems that we must use both. Modern dual-route theorists see reading as a “race” between these routes. When
we see a word, both routes start processing it. For skilled readers, most of the time the direct route is much faster, so it will usually win the race and the word will be pronounced the way that it recommends. The indirect route will only be apparent in exceptional circumstances, such as when we see a very unfamiliar word; in that case, if the direct route is slower than normal, then the direct and GPC routes will produce different pronunciations at about the same time, and these words might be harder to pronounce.
[Figure: diagram with boxes labeled “Lexicon” and “Grapheme–phoneme conversion rules”]
In the previous chapter we examined a number of models of word recognition. These can all be seen as theories of how the direct, lexical access reading route operates. The dual-route is the simplest version of a range of possible multi-route or parallel coding models, some of which posit more than two reading routes. Do we really need a non-lexical route at all for routine reading? Although we appear to need it for reading nonwords, it seems a costly procedure. We have a mechanism ready to use for something we rarely do: pronouncing new words or nonwords. Perhaps it is left over from the development of reading, or perhaps it is not as costly as it first appears. We will see later that the non-lexical route is also apparently needed to account for the neuropsychological data. Indeed, whether or not two routes are necessary for reading is a central issue of the topic of reading. Models that propose that we can get away with only one (such as connectionist models) must produce a satisfactory account of how we can pronounce nonwords.
Of course, except for reading aloud, the primary goal of reading is not getting the sound of a word, but getting the meaning. As we shall see in Chapter 8, in the early stages of learning to read children get to the meaning through the sound; that is, they spell out the sound of the words, and then access meaning as they recognize those sounds. Some researchers believe that even skilled adults primarily get to meaning by going from print to phonology and then to meaning, an idea called phonological mediation (discussed in more detail below). Most researchers, however, believe that in skilled adults, most of the time, there is a direct route from print to semantics. Indeed, as we shall see below, most researchers believe that there is a direct route from print to sound, and a direct route via semantics; what is debated is the role of the indirect route in normal reading (see Taft & van Graan, 1998, for further discussion of these issues).
THE PROCESSES OF NORMAL READING
According to the dual-route model, there are two independent routes when naming a word and accessing the lexicon: a lexical or direct access route and a sublexical or grapheme–phoneme conversion route. This section looks at how we name nonwords and words.
Reading nonwords
It sounds odd to start a section on “normal reading” by talking about how we can read nonwords, but they’re very revealing. According to the dual-route model, the pronunciation of all nonwords should be assembled using the GPC route. This means that all pronounceable nonwords should be alike and their similarity to words should not matter. However, pronounceable nonwords are not all alike.
The pseudohomophone effect
Pseudohomophones are pronounceable nonwords that sound like words when pronounced (such as “brane,” which sounds like the word “brain” when spoken). The behavior of the pseudohomophone “brane” can be compared with the very similar nonword “brame,” which does not sound like a word when it is spoken. Rubenstein et al. (1971) showed that pseudohomophones are more confusable with words than other types of nonwords are. Participants are faster to name them, but slower to reject them as nonwords than control nonwords.
Is the effect caused by the phonological or visual similarity between the nonword and word? Martin (1982) and Taft (1982) argued that it is visual similarity that is important. Pseudohomophones are more confusable with words than other nonwords are because they look more similar to words than non-pseudohomophones, rather than because they sound the same. Pring (1981) alternated the
case of letters within versus across graphemes, such as the “AI” in “grait,” to produce “GraIT” or “GRaiT.” These strings look different but still sound the same. Alternating letter cases within a grapheme or spelling unit (aI) eliminates the pseudohomophone effect; alternating letters elsewhere in the word (aiT) does not. Hence we are sensitive to the visual appearance of spelling units of words.
The pseudohomophone effect suggests that not all nonwords are processed in the same way. The importance of the visual appearance of the nonwords further suggests that something else apart from phonological recoding is involved here. It remains to be seen whether the phonological recoding route is still necessary, but if it is, then it must be more complex than we first thought.
Glushko’s (1979) experiment: Lexical effects on nonword reading
Glushko (1979) performed a very important experiment on the effect of the regularity of the word-neighbors of a nonword on its pronunciation. Consider the nonword “taze.” Its word-neighbors include “gaze,” “laze,” and “maze”; these are all themselves regularly pronounced words. Now consider the word-neighbors of the nonword “tave.” These also include plenty of regular words (e.g., “rave,” “save,” and “gave”) but there is an exception word-neighbor (“have”). As another example, compare the nonwords “feal” and “fead”: both have regular neighbors (e.g., “real,” “seal,” “deal,” and “bead”) but the pronunciation of “fead” is influenced by its irregular neighbor “dead.” Glushko (1979) showed that naming latencies to nonwords such as “tave” were significantly slower than to ones such as “taze.” That is, reaction times to nonwords that have orthographically irregular spelling-to-sound correspondence word-neighbors are slower than to other nonword controls. Also, people make pronunciation “errors” with such nonwords: “pove” might be pronounced to rhyme with “love” rather than “cove”; and “heaf” might be pronounced to rhyme with “deaf” rather than “leaf.” In summary, Glushko found that the pronunciation of nonwords is affected by the pronunciation of similar words, and that nonwords are not the same as each other. Subsequent research has shown that the proportion of regular pronunciations of nonwords increases as the number of orthographic neighbors increases (McCann & Besner, 1987). In summary, there are lexical effects on nonword processing.
More on reading nonwords
The nonword “yead” can be pronounced to rhyme with “bead” or “head.” Kay and Marcel (1981) showed that its pronunciation can be affected by the pronunciation of a preceding prime word: “bead” biases a participant to pronounce “yead” to rhyme with it, whereas the prime “head” biases participants to the alternative pronunciation.
Rosson (1983) primed the nonword by a semantic relative of a phonologically related word. The task was to pronounce “louch” when preceded either by “feel” (which is associated with “touch”) or by “sofa” (which is associated with “couch”). In both cases “louch” tended to be pronounced to rhyme with the appropriate relative.
Finally, nonword effects in complex experiments are sensitive to many factors, such as the pronunciation of the surrounding words in the list. This also suggests that nonword pronunciation involves more than just grapheme-to-phoneme conversion.
Evaluation of research on reading nonwords
These data do not fit the simple version of the dual-route model. The pronunciation of nonwords is affected by the pronunciation of visually similar words. That is, there are lexical effects in nonword processing; the lexical route seems to be affecting the non-lexical route.
Reading words
According to the dual-route model, words are accessed directly by the direct route. This means that all words should be treated the same in respect of the regularity of their spelling-to-sound correspondences. An examination of the data reveals that this prediction does not stand up.
One problem for the simple dual-route model is that pronunciation regularity affects response
times, although in a complex way. Baron and Strawson (1976) provided an early demonstration of this problem, finding that a list of regular words was named faster than a list of frequency-matched exception words (e.g., “have”). This task is a simplified version of the naming task, with response time averaged across many items rather than taken from each one individually. There have been many other demonstrations of the influence of regularity on naming time (e.g., Forster & Chambers, 1973; Frederiksen & Kroll, 1976; Stanovich & Bauer, 1978). A well-replicated finding is that of an interaction between regularity and frequency: regularity has little effect on the pronunciation of high-frequency words, but low-frequency regular words are named faster than low-frequency irregular words (e.g., Andrews, 1982; Seidenberg, Waters, Barnes, & Tanenhaus, 1984), even when we control for age-of-acquisition (Monaghan & Ellis, 2002). Jared (1997b) found that high-frequency words can be sensitive to regularity, but the effect of regularity is moderated by the number and frequencies of their “friends” and “enemies” (words with similar or conflicting pronunciations). That is, it is important to control for the neighborhood characteristics of the target words as well as their regularity in order to observe the interaction. On the other hand, it is not clear whether there are regularity effects on lexical decision. They have been obtained by, for example, Stanovich and Bauer (1978), but not by Coltheart et al. (1977), or Seidenberg et al. (1984). In particular, a word such as “yacht” looks unusual, as well as having an irregular pronunciation. The letter pairs “ya” and “ht” are not frequent in English; we say they have a low bigram frequency. Obviously the visual appearance of words is going to affect the time it takes for direct access, so we need to control for this when searching for regularity effects. Once we control for the generally unusual appearance of irregular words, regularity and consistency only seem to affect naming times, not lexical decision times. Age-of-acquisition has a similar effect to frequency, and gives rise to a similar interaction: consistency has a much bigger impact on naming time for late-acquired than early-acquired words (Monaghan & Ellis, 2002). Why do late-acquired and low-frequency inconsistent words stand out? One possibility is that late-acquired low-frequency consistent words can make use of the network structure of other consistent words; inconsistent items cannot, and need new associations to be learned between input and output (Monaghan & Ellis, 2002).
In general, regularity effects are more likely to be found when participants have to be more conservative, such as when accuracy rather than speed is emphasized. The finding that regularity affects naming might appear problematic for the dual-route model, but makes sense if there is a race between the direct and indirect routes. Remember that there is an interaction between regularity and frequency. The pronunciation of common words is directly retrieved before the indirect route can construct any conflicting pronunciation. Conflict arises when the lexical route is slow, as when retrieving low-frequency words, and when the pronunciation of a low-frequency word generated by the lexical route conflicts with that generated by the non-lexical route (Norris & Brown, 1985).
Glushko’s (1979) experiment: Results from words
Glushko (1979) also found that words behave in a similar way to nonwords, in that the naming times of words are affected by the phonological consistency of neighbors. The naming of a regular word is slowed down relative to that of a control word of similar frequency if the test word has irregular neighbors. For example, the word “gang” is regular, and all its neighbors (such as “bang,” “sang,” “hang,” and “rang”) are also regular. Consider on the other hand “base”; this itself has a regular pronunciation (compare it with “case”), but it is inconsistent, in that it has one irregular neighbor, “vase” (in British English pronunciation). We could say that “vase” is an enemy of “base.” This leads to a slowing of naming times. In addition, Glushko found true naming errors of over-regularization: for example, “pint” was sometimes given its regular pronunciation, to rhyme with “dint.”
Pronunciation neighborhoods
Continuing this line of research, Brown (1987) argued that the number of consistently pronounced neighbors (friends) determines naming times, rather
than whether a word has enemies (that is, whether or not it is regular). It is now thought that the number of both friends and enemies affects naming times (Brown & Watson, 1994; Jared, McRae, & Seidenberg, 1990; Kay & Bishop, 1987).
Andrews (1989) found effects of neighborhood size in both the naming and the lexical decision tasks. Responses to words with large neighborhoods were faster than words with small neighborhoods (although this may be moderated by frequency, as suggested by Grainger, 1990). Not all readers produce the same results. Barron (1981) found that good and poor elementary school readers both read regular words more quickly than irregular words. However, once he controlled for neighborhood effects, he found that there was no longer any regularity effect in the good readers, although it persisted in the poor readers.
Parkin (1982) found more of a continuum of ease-of-pronunciation than a simple division between regular and irregular words. All this work suggests that a binary division into words with regular and irregular pronunciations is no longer adequate. Patterson and Morton (1985) provided a more satisfactory but complex categorization rather than a straightforward dichotomy between regular and irregular words (see Table 7.2). This classification reflects two factors: first, the regularity of the pronunciation with reference to spelling-to-sound correspondence rules; second, the agreement with other words that share the same body. (This is the end of a monosyllabic word, comprising the central vowel plus final consonant or consonant cluster; e.g., “aint” in “saint” or “us” in “plus.”) We need to consider not only whether a word is regular or irregular, but also whether its neighbors are regular or irregular. The same classification scheme can be applied to nonwords.
TABLE 7.2 Classification of word pronunciations depending on regularity and consistency (based on Patterson & Morton, 1985).
Word type            Example  Characteristics
Consistent           gaze     All words receive the same regular pronunciation of the body
Consensus            lint     All words with one exception receive the same regular pronunciation
Gang                 look     All words with one exception receive the same irregular pronunciation
Gang without a hero  cold     All words receive the same irregular pronunciation
In summary, just as not all nonwords behave in the same way, neither do all words. The regularity of pronunciation of a word affects the ease with which we can name it. In addition, the pronunciation of a word’s neighbors can affect its naming. The number of friends and enemies affects how easy it is to name a word.
The role of sound in accessing meaning: Phonological mediation
There is some experimental evidence suggesting that a word’s sound may have some influence on accessing the meaning (Frost, 1998; van Orden,
1987; van Orden, Johnstone, & Hale, 1988; van Orden, Pennington, & Stone, 1990). In a category decision task, participants have to decide if a visually presented target word is a member of a particular category. For example, given “A type of fruit” you would respond “yes” to “pear,” and “no” to “pour.” If the “no” word is a homophone of a “yes” word (e.g., “pair”), participants make a lot of false positive errors; that is, they respond “yes” instead of “no.” Participants seem confused by the sound of the word, and category decision clearly involves accessing the meaning. The effect is most noticeable when participants have to respond quickly. Lesch and Pollatsek (1998) found evidence of interference between homophones in a semantic relatedness task (e.g., SAND–BEECH). We take longer to respond to homophones in a lexical decision task (e.g., MAID), presumably because the homophones are generating confusion in lexical access, perhaps through feedback from phonology to orthography (Pexman, Lupker, & Jared, 2001; Pexman, Lupker, & Reggin, 2002).
Hence there is considerable evidence that the recognition of a word can be influenced by its phonology. The dominant view is that this influence arises through the indirect route, although word recognition is primarily driven by the direct route (or routes), a view that has been labeled the weak phonological perspective (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Rastle & Brysbaert, 2006). Most of the models described in this chapter subscribe to the weak phonological view. The alternative, strong phonological view, that we primarily get to the meaning through sound, is called phonological mediation. The most extreme form of this idea is that visual word recognition cannot occur in the absence of computing the sound of the word.
There is a great deal of controversy about the status of phonological mediation. Other experiments support the idea. Folk (1999) examined eye movements as participants read sentences containing either “soul” or “sole.” Folk found that the homophones were read with longer gaze duration (that is, they were processed as though they were lexically ambiguous) even though the orthography should have prevented this. This result is only explicable if the phonology is in some way interfering with the semantic access.
On the other hand, Jared and Seidenberg (1991) showed that prior phonological access only happens with low-frequency homophones. In an examination of proof-reading and eye movements, Jared, Levy, and Rayner (1999) also found that phonology only plays a role in accessing the meanings of low-frequency words. In addition, they found that poor readers are more likely to have to access phonology in order to access semantics, whereas good readers primarily activate semantics first. Daneman, Reingold, and Davidson (1995) reported eye fixation data on homophones that suggested the meaning of a word is accessed first whereas the phonological code is accessed later, probably post-access. They found that gaze duration times were longer on an incorrect homophone (e.g., “brake” was in the text when the context demanded “break”), and that the fixation times on the incorrect homophone were about the same as on a spelling control (e.g., “broke”). This means that the appropriate meaning must have been activated before the decision to move the eyes, and that the phonological code is not activated at this time. (If the phonological code had been accessed before meaning then the incorrect homophone would sound all right in the context, and gaze durations should have been about the same.) The phonological code is accessed later, however, and influences the number of regressions (when the eyes look back to earlier material) to the target word. (However, see Rayner, Pollatsek, & Binder, 1998, for different conclusions. It is clear that these experiments are very sensitive to the materials used.)
Taft and van Graan (1998) used a semantic categorization task to examine phonological mediation. Participants had to decide whether or not words belonged to a category of “words with definable meanings” (e.g., “plank,” “pint”) or the category of “given names” (e.g., “Pam,” “Phil”). There was no difference in the decision times between regular definable words (e.g., “plank”) and irregular definable words (e.g., “pint”), although a regularity effect was shown in a word naming task. This suggests that the sound of a word does not need to be accessed on the route to accessing its meaning.
A number of studies have tried to decide between the strong and weak phonological views
using masked phonological priming. In this technique, targets (e.g., “clip”) are preceded by phonologically identical nonword primes (e.g., “klip”). Responses to the targets are faster and more accurate than when the target is preceded by an unrelated word. Several studies have found priming effects occur even when the primes have been masked and presented so briefly that they cannot be consciously observed and reported, suggesting that the phonological stimulus must occur automatically and extremely quickly (e.g., Lukatela & Turvey, 1994a, 1994b; Perfetti, Bell, & Delaney, 1988). While some researchers interpret masked phonological priming as supporting phonological mediation (why else should phonological activation happen so early unless it is essential?), other researchers point out that these effects are very sensitive to environmental conditions, and are not always reliably found (see Rastle & Brysbaert, 2006, for a review). In a meta-analysis of the literature, Rastle and Brysbaert (2006) do find small but significant masked phonological priming effects.
These data suggest that the sound of a word is usually accessed at an early stage. However, there is much evidence suggesting that phonological recoding cannot be obligatory in order to access the word’s meaning (Ellis, 1993). For example, some dyslexics cannot pronounce nonwords, yet can still read many words. Hanley and McDonnell (1997) described the case of a patient, PS, who understood the meaning of words in reading without being able to pronounce them correctly. Critically, PS did not have a preserved inner phonological code that could be used to access the meaning. Some patients have preserved inner phonology and preserved reading comprehension, but make errors in speaking aloud (Caplan & Waters, 1995b). Hanley and McDonnell argued that PS did not have access to his phonological code because he was unable to access both meanings of a homophone from seeing just one in print. Thus PS could not produce the phonological forms of words aloud correctly, and did not have access to an internal phonological representation of those words, yet he could still understand them when reading them. For example, he could give perfect definitions of printed words. In general, a review of the neuropsychological literature suggests that people can recognize words in the absence of phonology (Coltheart, 2004). Hence it is unlikely that phonological recoding is an obligatory component of visual word recognition (Rastle & Brysbaert, 2006).
How then can we explain the data showing phonological mediation? There are a number of alternative explanations. First, although phonological recoding prior to accessing meaning may not be obligatory, it might occur in some circumstances. Given there is a race between the lexical and sublexical routes in the dual-route model, if for some reason the lexical route is slow in producing an output, the sublexical route might have time to assemble a conflicting phonological representation. Second, there might be feedback from the speech production system to the semantic system, or the direct access route causes inner speech that interferes with processing. Third, it is possible that lexical decision is based on phonological information (Rastle & Brysbaert, 2006).
Silent reading and inner speech
Although it seems unlikely that we have to access sound before meaning, we do routinely seem to access some sort of phonological code after accessing meaning in silent reading. Subjective evidence for this is the experience of “inner speech” while reading. Tongue-twisters such as (1) take longer to read silently than sentences where there is variation in the initial consonants (Haber & Haber, 1982). This suggests that we are accessing some sort of phonological code as we read.
(1) Boris burned the brown bread badly.
However, this inner speech cannot involve exactly the same processes as overt speech because we can read silently much faster than we can read aloud (Rayner & Pollatsek, 1989), and because overt articulation does not prohibit inner speech while reading. Furthermore, although most people who are profoundly deaf read very poorly, some read quite well (Conrad, 1972). Although this might suggest that eventual phonological coding is optional, it is likely that
these deaf able readers are converting printed words into some sign language code (Rayner & Pollatsek, 1989). Evidence for this is that deaf people are troubled by the silent reading of word strings that correspond to hand-twisters (Treiman & Hirsh-Pasek, 1983). (Interestingly, deaf people also have some difficulty with signing phonological tongue-twisters, suggesting that difficulty can arise from lip-reading sounds.)
Hence, when we read we seem to access a phonological code that we experience as inner speech. That is, when we gain access to a word’s representation in the lexicon, all its attributes become available. The activation of a phonological code is not confined to alphabetic languages. On-line experimental data using priming and semantic judgment tasks suggest that phonological information about ideographs is automatically activated in both Chinese (Perfetti & Zhang, 1991, 1995) and Japanese kanji (Wydell, Patterson, & Humphreys, 1993).
Inner speech seems to assist comprehension; if it is reduced, comprehension suffers for all but the easiest material (Rayner & Pollatsek, 1989). McCutchen and Perfetti (1982) argued that whichever route is used for lexical access in reading, at least part of the phonological code of each word is automatically accessed; in particular, we access the sounds of beginnings of words. Although there is some debate about the precise nature of the phonological code and how much of
(2) When his shoelace came loose, Vlad had to tie a bow.
(3) At the end of the play, Dirk went to the front of the stage to take a bow.
Clearly here we need to access the word’s meaning before we can select the appropriate pronunciation. Further evidence that semantics can affect reading is provided by a study by Strain, Patterson, and Seidenberg (1995). They showed that there is an effect of imageability on skilled reading such that there is a three-way interaction between frequency, imageability, and spelling consistency. People are particularly slow and make more errors when reading low-frequency exception words with abstract meanings (e.g., “scarce”). Although a subsequent study by Monaghan and Ellis (2002) suggests that this semantic effect might be at least in part the result of a confound with age-of-acquisition, as abstract low-frequency exception words tend to have late AOA, this interaction is still found when we control for AOA (Strain, Patterson, & Seidenberg, 2002). Hence, at least some of the time, we need to access a word’s semantic representation before we can access its phonology.
Does speed reading work?
Occasionally you might notice advertisements in the press for techniques for improving your reading speed. The most famous of these techniques
is known as “speed reading.” Proponents of speed reading claim that you can increase your reading speed from the average of 200–350 words a minute to 2,000 words a minute or even faster, yet retain the same level of comprehension. Is this possible? Unfortunately, the preponderance of psychological research suggests not. As you increase your reading speed above the normal rate, comprehension declines. Just and Carpenter (1987) compared the understanding of speed readers and normal readers on an easy piece of text (an article from Reader’s Digest) and a difficult piece of text (an article from Scientific American). They found that normal readers scored 15% higher on comprehension measures than the speed readers across both passages. In fact, the speed readers performed only slightly better than a group of people who skimmed through the passages. The speed readers did as well as the normal readers on the general gist of the text, but were worse at details. In particular, speed readers could not answer questions when the answers were located in places where their eyes had not fixated.
Speed reading, then, is not as effective as normal reading. Eye movements are the key to why speed reading confers limited advantages (Rayner & Pollatsek, 1989). For a word to be processed properly, its image has to land close to the fovea and stay there for a sufficient length of time. Speed reading is nothing more than skimming through a piece of writing (Carver, 1972). This is not to say that readers obtain nothing from skimming: if you have sufficient prior information about the material, your level of comprehension can be quite good. If you speed read and then read normally, your overall level of comprehension and retention might be better than if you had just read the text normally. It is also a useful technique for preparing to read a book or article in a structured way (see Chapter 12). Finally, associated techniques such as relaxing before you start to read might well have beneficial effects on comprehension and retention.
Evaluation of experiments on normal reading
There are two major problems with a simple dual-route model. First, we have seen that there are lexical effects on reading nonwords, which should be read by a non-lexical route that is insensitive to lexical information. Second, there are effects of regularity of pronunciation on reading words, which should be read by a direct, lexical route that is insensitive to phonological recoding.
A race model fares better. Regularity effects arise when the direct and indirect routes produce an output at about the same time, so that conflict arises between the irregular pronunciation proposed by the lexical route and the regular pronunciation proposed by the sublexical route. However, it is not clear how a race model where the indirect route uses grapheme–phoneme conversion can explain lexical effects on reading nonwords. Neither is it clear how semantics can guide the operation of the direct route.
Skilled readers have a measure of attentional or strategic control over the lexical and sublexical routes such that they can attend selectively to lexical or sublexical information (Baluch & Besner, 1991; Monsell, Patterson, Graham, Hughes, & Milroy, 1992; Zevin & Balota, 2000). For example, Monsell et al. found that the composition of word lists affected naming performance. High-frequency exception words were pronounced faster when they were in pure blocks than when they were mixed with nonwords. Monsell et al. argued that this was because participants allocated more attention to lexical information when reading the pure blocks. Participants also made fewer regularization errors when the words were presented in pure blocks (when they can rely solely on lexical processing) than in mixed blocks (when the sublexical route has to be involved).
At first sight, then, this experiment suggests that in difficult circumstances people seem able to change their emphasis in reading from using lexical information to sublexical information. However, Jared (1997a) argued that people need not change the extent to which they rely on sublexical information, but instead might be responding at different points in the processing of the stimuli. She argued that the faster pronunciation latencies found in Monsell et al.’s experiment in the exception-only condition could just be the result of a general increase in response speed, rather than a reduction in reliance on the non-lexical route.
However, there is further evidence for strategic effects in the choice of route when reading.
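
The race account of the regularity-by-frequency interaction discussed in this section can be made concrete with a toy simulation. The sketch below is only an illustration under stated assumptions, not the dual-route model as any of the cited authors implemented it: every timing constant, the log-frequency speed-up, the pronunciation strings, and all function names are hypothetical. It shows only how conflict between routes could arise for low-frequency exception words while leaving high-frequency words unaffected.

```python
import math
from dataclasses import dataclass

# All timing values are hypothetical, chosen only for illustration.
GPC_TIME = 520.0       # ms for the sublexical route to assemble a pronunciation
LEXICAL_BASE = 400.0   # ms for the direct route on a very common word
LEXICAL_SLOPE = 60.0   # extra ms per tenfold drop in frequency
WINDOW = 50.0          # GPC output arriving up to this long after retrieval still clashes
CONFLICT_COST = 80.0   # ms needed to resolve two competing pronunciations


@dataclass
class Word:
    spelling: str
    lexical_pron: str   # pronunciation stored in the lexicon (informal respelling)
    gpc_pron: str       # pronunciation assembled by spelling-to-sound rules
    frequency: float    # occurrences per million words (made-up values)


def lexical_time(frequency: float) -> float:
    """Direct route: retrieval slows as words become rarer."""
    return LEXICAL_BASE + LEXICAL_SLOPE * max(0.0, 3.0 - math.log10(frequency))


def naming_latency(word: Word) -> float:
    """Time to name a word when both routes run in parallel."""
    t_lex = lexical_time(word.frequency)
    if word.lexical_pron == word.gpc_pron:
        # The routes agree, so whichever finishes first can drive the response.
        return min(t_lex, GPC_TIME)
    if GPC_TIME <= t_lex + WINDOW:
        # The rule-based output arrives in time to clash with the stored one.
        return t_lex + CONFLICT_COST
    # The direct route finished well before the sublexical route: no conflict.
    return t_lex


def regularity_effect(frequency: float) -> float:
    """Latency cost of being an exception word at a given frequency."""
    exception = Word("pint", "paInt", "pInt", frequency)
    regular = Word("mint", "mInt", "mInt", frequency)
    return naming_latency(exception) - naming_latency(regular)


if __name__ == "__main__":
    for freq in (1000.0, 100.0, 10.0, 1.0):
        effect = regularity_effect(freq)
        print(f"{freq:>6.0f} per million: regularity effect = {effect:.0f} ms")
```

With these made-up parameters the regularity effect is absent for common words and grows as frequency falls, which is the qualitative pattern the race account is meant to capture. The sketch deliberately leaves out nonwords and semantics, so it inherits the limitations of the race model noted above.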
Using a primed naming task, Zevin and Balota (2000) found that nonword primes produce a greater dependence on sublexical processing, but low-frequency exception word primes produce a greater dependence on lexical processing. Coltheart and Rastle (1994) suggested that lexical access is performed so quickly for high-frequency words that there is little scope for sublexical involvement, but with low-frequency words or in difficult conditions people can devote more attention to one route or the other.
THE NEUROSCIENCE OF ADULT READING DISORDERS
What can studies of people with brain damage tell us about reading? This section is concerned with disorders of processing written language. We must distinguish between acquired disorders (which, as a result of head trauma such as stroke, operation, or head injury, lead to disruption of processes that were functioning normally beforehand) and developmental disorders (which do not result from obvious trauma, and which disrupt the development of a particular function). Disorders of reading are called the dyslexias; disorders of writing are called the dysgraphias. Damage to the left hemisphere will generally result in dyslexia, but as the same sites are involved in speaking, dyslexia is often accompanied by impairments to spoken language processing.
We can distinguish central dyslexias, which involve central, high-level reading processes, from peripheral dyslexias, which involve lower level processes. Peripheral dyslexias include visual dyslexia, attentional dyslexia, letter-by-letter reading, and neglect dyslexia, all of which disrupt the extraction of visual information from the page. As our focus is on understanding the central reading process, we will limit discussion here to the central dyslexias. In addition, we will only look at acquired disorders in this section, and defer discussion of developmental dyslexia until our examination of learning to read.
If the dual-route model of reading is correct, then we should expect to find a double dissociation of the two reading routes. That is, we should find some patients have damage to the lexical route but can still read by the non-lexical route only, whereas we should be able to find other patients who have damage to the non-lexical route but can read by the lexical route only. The existence of a double dissociation is a strong prediction of the dual-route model, and a real challenge to any single-route model.
Surface dyslexia
People with surface dyslexia have a selective impairment in the ability to read irregular (exception) words. Hence they would have difficulty with “steak” compared with a similar regular relative word such as “speak.” Marshall and Newcombe (1973) and Shallice and Warrington (1980) described some early case histories. Surface dyslexics often make over-regularization errors when trying to read irregular words aloud. For example, they pronounce “broad” as “brode,” “steak” as “steek,” and “island” as “eyesland.” On the other hand, their ability to read regular words and nonwords is intact. In terms of the dual-route model, the most obvious explanation of surface dyslexia is that these patients can only read via the indirect, non-lexical route: that is, it is an impairment of the lexical (direct access) processing route. The comprehension of word meaning is intact in these patients. They still know what an “island” is, even if they cannot read the word, and they can still understand it if you say the word to them.
The effects of brain damage are rarely localized to highly specific systems, and, in practice, patients do not show such clear-cut behavior as the ideal of totally preserved regular word and nonword reading, and the total loss of irregular words. The clearest case yet reported is that of a patient referred to as MP (Bub, Cancelliere, & Kertesz, 1985). She showed completely normal accuracy in reading nonwords, and hence her non-lexical route was totally preserved. She was not the best possible case of surface dyslexia, however, because she could read some irregular words (with an accuracy of 85% on high-frequency items, and 40% on low-frequency exception words). This means that her lexical route must
have been partially intact. The pure cases are rarely found. Other patients show considerably less clear-cut reading than this, with even better performance on irregular words, and some deficit in reading regular words.

If patients were reading through a non-lexical route, we would not expect lexical variables to affect the likelihood of reading success. Kremin (1985) found no effect of word frequency, part of speech (noun versus adjective versus verb), or whether or not it is easy to form a mental image of what is referred to (called imageability), on the likelihood of reading success. Although patients such as MP, from Bub et al. (1985), show a clear frequency effect in that they make few regularizations of high-frequency words, other patients, such as HTR, from Shallice, Warrington, and McCarthy (1983), do not. Patients also make homophone confusions (such as reading “pane” as “to cause distress”).

Surface dyslexia may not be a unitary category. Shallice and McCarthy (1985) distinguished between Type I and Type II surface dyslexia. Patients of both types are poor at reading exception words. The more pure cases, known as Type I patients, are highly accurate at naming regular words and pseudowords. Other patients, known as Type II, also show some impairment at reading regular words and pseudowords. The reading performance of Type II patients may be affected by lexical variables such that they are better at reading high-frequency, high-imageability words, better at reading nouns than adjectives and at reading adjectives than verbs, and better at reading short words than long. Type II patients must have an additional, moderate impairment to the non-lexical route, but the dual-route model can nevertheless still explain this pattern.

Phonological dyslexia

People with phonological dyslexia have a selective impairment in the ability to read pronounceable nonwords, called pseudowords (such as “sleeb”), while their ability to read matched words (e.g., “sleep”) is preserved. Phonological dyslexia was first described by Shallice and Warrington (1975, 1980), Patterson (1980), and Beauvois and Derouesné (1979). Phonological dyslexics find irregular words no harder to read than regular ones. These symptoms suggest that these patients can only read using the lexical route, and therefore that phonological dyslexia is an impairment of the non-lexical (GPC) processing route. As with surface dyslexia, the “perfect patient,” who in this case would be able to read all words but no nonwords, has yet to be discovered. The clearest case yet reported is that of patient WB (Funnell, 1983), who could not read nonwords at all; hence the non-lexical GPC route must have been completely abolished. He was not the most extreme case possible of phonological dyslexia, however, because there was also an impairment to his lexical route; his performance was about 85% correct on words.

For those patients who can pronounce some nonwords, nonword reading is improved if the nonwords are pseudohomophones (such as “nite” for “night,” or “brane” for “brain”). Those patients who also have difficulty in reading words have particular difficulty in reading the function words that do the grammatical work of the language. Low-frequency, low-imageability words are also poorly read, although neither frequency nor imageability seems to have any overwhelming role in itself. These patients also have difficulty in reading morphologically complex words (those that have syntactic modifications called inflections). They sometimes make what are called derivational errors on these words, where they read a word as a grammatical relative of the target, such as reading “performing” as “performance.” Finally, they also make visual errors, in which a word is read as another with a similar visual appearance, such as reading “perform” as “perfume.”

There are different types of phonological dyslexia. Derouesné and Beauvois (1979) suggested that phonological dyslexia can result from disruption of either orthographic or phonological processing. Some patients are worse at reading graphemically complex nonwords (e.g., CAU, where a phoneme is represented by two letters; hence this nonword requires more graphemic parsing) than graphemically simple nonwords (e.g., IKO, where there is a one-to-one mapping between
letters and graphemes), but show no advantage for pseudohomophones. These patients suffer from a disruption of graphemic parsing. Another group of patients are better at reading pseudohomophones than non-pseudohomophones, but show no effect of orthographic complexity. These patients suffer from a disruption of phonological processing. Friedman (1995) distinguished phonological dyslexia arising from an impairment of orthographic-to-phonological processing (characterized by relatively poor function word reading but good nonword repetition) from that arising from an impairment of general phonological processing (characterized by the reverse pattern).

Following this, a three-stage model of sublexical processing has emerged (Beauvois & Derouesné, 1979; Coltheart, 1985; Friedman, 1995). First, a graphemic analysis stage parses the letter string into graphemes. Second, a print-to-sound conversion stage assigns phonemes to graphemes. Third, in the phonemic blending stage the sounds are assembled into a phonological representation. There are patients whose behavior can best be explained in terms of disruption of each of these stages (Lesch & Martin, 1998). MS (Newcombe & Marshall, 1985) suffered from disruption to graphemic analysis. Patients with disrupted graphemic analysis find nonwords in which each grapheme is represented by a single letter easier to read than nonwords with multiple correspondences. WB (Funnell, 1983) suffered from disruption in the print-to-sound conversion stage; here nonword repetition is intact. ML (Lesch & Martin, 1998) was a phonological dyslexic who could carry out tasks of phonological assembly on syllables, but not on sub-syllabic units (onsets, bodies, and phonemes). MV (Bub, Black, Howell, & Kertesz, 1987) suffered from disruption to the phonemic blending stage.

Why do some people with phonological dyslexia have difficulty reading function words? One possibility is that function words are difficult because they are so abstract (Friedman, 1995). However, patient MC (Druks & Froud, 2002) had great difficulty in reading nonwords, morphologically complex words, and function words in isolation. Crucially, he could read highly abstract content words, so it cannot be the abstractness of the function words that caused his problems. Nevertheless, he could understand the meaning of function words that he could not read, and his deficit was confined to reading single words. His reading of function words in continuous text was much better. It is likely that MC at least has a problem with syntactic processing such that when producing words in isolation he is unable to access syntactic information.

People with phonological dyslexia show complex phonological problems that have nothing to do with orthography. Indeed, it has been proposed that phonological dyslexia is a consequence of a general problem with phonological processing (Farah, Stowe, & Levinson, 1996; Harm & Seidenberg, 2001; Patterson, Suzuki, & Wydell, 1996). If phonological dyslexia arises solely because of problems with the ability to translate orthography into phonology, then there must be brain tissue dedicated to this task. This implies that this brain tissue becomes dedicated by school-age learning, which is an unappealing prospect. The alternative view is that phonological dyslexia is just one aspect of a general impairment of phonological processing. This impairment will normally be manifested in performance on non-reading tasks such as rhyming, nonword writing, phonological short-term memory, nonword repetition, and tasks of phonological synthesis (“what does ‘c–a–t’ spell out?”) and phonological awareness (“what word is left if you take the ‘p’ sound out of ‘spoon’?”). This proposal also explains why pseudohomophones are read better than non-pseudohomophones. An important piece of evidence in favor of this hypothesis is that phonological dyslexia is never observed in the absence of a more general phonological deficit (but see Coltheart, 1996, for a dissenting view). A general phonological deficit makes it difficult to assemble pronunciations for nonwords. Words are spared much of this difficulty because of support from other words and top-down support from their semantic representations. Repeating words and nonwords is facilitated by support from auditory representations, so some phonological dyslexics can still repeat some nonwords. However, if the repetition task is made more difficult so that patients can no longer gain support from the
auditory representations, repetition performance declines markedly (Farah et al., 1996). This idea that phonological dyslexia is caused by a general phonological deficit is central to the connectionist account of dyslexia, discussed later.

Deep dyslexia

At first sight, surface and phonological dyslexia appear to exhaust the possibilities of the consequences of damage to the dual-route model. There is, however, another even more surprising type of dyslexia called deep dyslexia. Marshall and Newcombe (1966, 1973) first described deep dyslexia in two patients, GR and KU, although it is now recognized that the syndrome had been observed in patients before this (Marshall & Newcombe, 1980). In many respects deep dyslexia resembles phonological dyslexia. Patients have great difficulty in reading nonwords, and considerable difficulty in reading the grammatical, function words. Like phonological dyslexics, they make visual and derivational errors. However, the defining characteristic of deep dyslexia is the presence of semantic reading errors or semantic paralexias, in which people produce a word related in meaning to the target instead of the target, as in examples (4) to (7):

(4) DAUGHTER → “sister”
(5) PRAY → “chapel”
(6) ROSE → “flower”
(7) KILL → “hate”

The imageability of a word is an important determinant of the probability of reading success in deep dyslexia. The easier it is to form a mental image of a word, the easier it is to read. Note that an imageability effect in reading does not by itself mean that patients with deep dyslexia are better at all tasks involving more concrete words. Indeed, Newton and Barry (1997) described a patient (LW) who was much better at reading high-frequency concrete words than abstract words, but who showed no impairment in comprehending those same abstract words.

Coltheart (1980) listed 12 symptoms commonly shown by deep dyslexics: They make semantic errors, they make visual errors, they substitute incorrect function words for the target, they make derivational errors, they can’t pronounce nonwords, they show an imageability effect, they find nouns easier to read than adjectives, they find adjectives easier to read than verbs, they find function words more difficult to read than content words, their writing is impaired, their auditory short-term memory is impaired, and their reading ability depends on the context of a word (e.g., FLY is easier to read when it is a noun in a sentence than a verb).

There has been some debate about the extent to which deep dyslexia is a syndrome (a syndrome is a group of symptoms that cluster together). Coltheart (1980) argued that the clustering of symptoms is meaningful, in that they suggest a single underlying cause. However, although these symptoms tend to occur in many patients, they apparently do not necessarily do so. For example, AR (Warrington & Shallice, 1979) did not show concreteness and content word effects and had intact writing and auditory short-term memory. A few patients make semantic errors but very few visual errors (Caramazza & Hillis, 1990). Such patients suggest that it is unlikely that there is a single underlying deficit. Like phonological dyslexics, deep dyslexics obviously have some difficulty in obtaining non-lexical access to phonology via grapheme–phoneme recoding, but they also have some disorder of the semantic system. We nevertheless have to explain why these symptoms are so often associated. One possibility is that the different symptoms of deep dyslexia arise because of an arbitrary feature of brain anatomy: Different but nearby parts of the brain control processes such as writing and auditory short-term memory, so that damage to one is often associated with damage to another. As we will see, a more satisfying account is provided by connectionist modeling.

Shallice (1988) argued that there are three subtypes of deep dyslexia that vary in the precise impairments involved. Input deep dyslexics have difficulties in reaching the exact semantic representations of words in reading. In these patients, auditory comprehension is superior to reading. Central deep dyslexics have a severe [...]
[FIGURE 7.2: flowchart. Recoverable labels: the question “Are regular words read aloud much better than exception words?”; branch labels “Yes” and “No”; outcomes “Surface dyslexia” and “Phonological dyslexia.”]
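The contrasts used in this section to tell the central dyslexias apart (regular versus exception words, words versus nonwords, and the presence of semantic errors) can be written down as a small decision procedure. The sketch below is only an illustration of that reasoning, not a clinical instrument: the `ReadingProfile` fields, the `margin` cut-off, the order of the checks, and the three example profiles are all assumptions introduced here, and the example numbers are invented rather than patient data.

```python
from dataclasses import dataclass


@dataclass
class ReadingProfile:
    """Proportion correct (0.0 to 1.0) on each item type, plus an error flag."""
    regular: float
    exception: float
    nonword: float
    semantic_errors: bool = False  # e.g. reading DAUGHTER as "sister"


def classify(p: ReadingProfile, margin: float = 0.25) -> str:
    """Apply the section's contrasts in a fixed order.

    `margin` is a hypothetical cut-off for "much better than";
    the text gives no numeric criterion.
    """
    words = (p.regular + p.exception) / 2
    nonword_deficit = (words - p.nonword) > margin
    if nonword_deficit and p.semantic_errors:
        return "deep dyslexia"          # nonword deficit plus semantic paralexias
    if nonword_deficit:
        return "phonological dyslexia"  # words spared, nonwords impaired
    if (p.regular - p.exception) > margin:
        return "surface dyslexia"       # regular words much better than exception words
    return "unclassified"


# Invented profiles, for illustration only.
surface_like = ReadingProfile(regular=0.95, exception=0.45, nonword=0.95)
phonological_like = ReadingProfile(regular=0.85, exception=0.85, nonword=0.0)
deep_like = ReadingProfile(regular=0.60, exception=0.55, nonword=0.05,
                           semantic_errors=True)

for name, profile in [("surface_like", surface_like),
                      ("phonological_like", phonological_like),
                      ("deep_like", deep_like)]:
    print(name, "->", classify(profile))
```

Real patients rarely fall this cleanly into one category, as the case reports above make clear, so a fixed threshold of this kind should be read as a teaching device only.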
Acquired dyslexia in other languages

Languages such as Italian, Spanish, or Serbo-Croat, which have totally transparent or shallow alphabetic orthographies (that is, where every grapheme is in a one-to-one relation with a phoneme), can show phonological and deep dyslexia, but not surface dyslexia, defined as an inability to read exception words (Patterson, Marshall, & Coltheart, 1985a, 1985b). However, we can find the symptoms that can co-occur with an impairment of exception word reading, such as homophone confusions, in the languages that permit them (Masterson, Coltheart, & Meara, 1985).

[Figure caption: Chinese (shown here) is a logographic or ideographic script, providing no information on word pronunciation.]

Whereas languages such as English have a single, alphabetic script, Japanese has two [...]
kanji, but a difficulty in reading Japanese nonwords. The analog of deep dyslexia is a selective impairment of reading kana, while the reading of kanji is preserved. For example, patient TY could read words in both kanji and kana almost perfectly, but she had great difficulty with nonwords constructed from kana words (Sasanuma, Ito, Patterson, & Ito, 1996).

Chinese is an ideographic language. Butterworth and Wengang (1991) reported evidence of two routes in reading in Chinese. Ideographs can be read aloud either through a route that associates the symbol with its complete pronunciation, or through one that uses parts of the symbol. (Although Chinese is non-alphabetic, most symbols contain some sublexical information on pronunciation.) Each route can be selectively impaired by brain damage, leading to distinct types of reading disorder.

The study of other languages that have different means of mapping orthography onto phonology is still at a relatively early stage, but it is likely to greatly enhance our understanding of reading mechanisms. The findings suggest that the neuropsychological mechanisms involved in reading are universal, although there are obviously some differences related to the unique features of different orthographies.

MODELS OF WORD NAMING

Both the classic dual-route and the single-route, lexical-instance models face a number of problems. First, there are lexical effects for nonwords and regularity effects for words, and therefore reading cannot be a simple case of automatic grapheme-to-phoneme conversion for nonwords, and automatic direct access for all words. Single-route models, on the other hand, appear to provide no account of nonword pronunciation, and it remains to be demonstrated how neighborhood effects influence a word’s pronunciation. Second, any model must also be able to account for the pattern of dissociations found in dyslexia. While surface and phonological dyslexia indicate that two reading mechanisms are necessary, other disorders suggest that these alone will not suffice. At first sight it is not obvious how a single-route model could explain these dissociations at all.

Theorists have taken two different approaches depending on their starting point. One possibility is to refine the dual-route model. Another is to show how word-neighborhoods can affect pronunciation, and how pseudowords can be pronounced in a single-route model. This led to the development of analogy models. More recently, a connectionist model of reading has been developed that takes the single-route, analogy-based approach to the limit.

The revised dual-route model

We can save the dual-route model by making it more complex. Morton and Patterson (1980) and Patterson and Morton (1985) described a three-route model (see Figure 7.3). First, there is a non-lexical route for assembling pronunciations from sublexical grapheme–phoneme conversion. The non-lexical route now consists of two subsystems. A standard grapheme–phoneme conversion mechanism is supplemented with a body subsystem that makes use of information about correspondences between orthographic and phonological rimes. This is needed to explain lexical effects on nonword pronunciation. Second, the direct route is split into a semantic and a non-semantic direct route.

The three-route model accounts for the data as follows. The lexical effects on nonwords and regularity effects on words are explained by cross-talk between the lexical and non-lexical routes. Two types of interaction are possible: interference during retrieval, and conflict in resolving multiple phonological forms after retrieval. The two subsystems of the non-lexical route also give the model greater power. Surface dyslexia is the loss of the ability to make direct contact with the orthographic lexicon, and phonological dyslexia is the loss of the indirect route. Non-semantic reading is a loss of the lexical-semantic route. Deep dyslexia remains rather mysterious. First, we have to argue that these patients can only read through the lexical-semantic route. While accounting for the symptoms that resemble
[Figure: diagram of reading routes from print to speech. Recoverable labels: Graphemes; Lexicon (visual input logogens); Grapheme conversion; Non-semantic reading; Sublexical recoding (graphemes and bodies); Semantic system; Phonology (speech output logogens); Speech.]
phonological dyslexia, it still does not explain the semantic paralexias. One possibility is that this route is used normally, but not always successfully, and that it needs additional information (such as from the non-lexical and non-semantic direct route) to succeed. So when this information is no longer available it functions imperfectly. It gets us to the right semantic area, but not necessarily to the exact item, hence giving paralexias. This additional assumption seems somewhat arbitrary. An alternative idea is that paralexias are the result of additional damage to the semantic system itself. Hence a complex pattern of impairments is still necessary to explain deep dyslexia, and there is no reason to suggest that these are not dissociable.

Multi-route models are becoming increasingly complicated as we find out more about the reading process (for example, see Carr & Pollatsek, 1985). Another idea is that multiple levels of spelling-to-sound correspondences combine in determining the pronunciation of a word. In Norris’s (1994a) multiple-levels model, different levels of spelling-to-sound information, including phoneme, rime (the final part of the word giving rise to the words with which it rhymes, e.g., “eak” in “speak”), and word-level correspondences, combine in an interactive activation network to determine the final pronunciation of a word. Such an approach builds on earlier models that make use of knowledge at multiple levels, such as those of Brown (1987), Patterson and Morton (1985), and Shallice, Warrington, and McCarthy (1983).

The most recent version of the dual-route model is the dual-route cascaded, or DRC, model (Coltheart, Curtis, Atkins, & Haller, 1993; Coltheart & Rastle, 1994; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). This is a computational model based on the architecture of the dual-route model, although it is in fact misleadingly so called, as it is really based on the three-route model, with a non-lexical grapheme–phoneme rule system and a lexical system, which in turn is divided into one route that passes through the semantic system and a non-semantic route that does not. The model makes use of cascaded processing, in that as soon as there is any activation at the letter level, activation is passed on to the word level. The computational model can simulate performance on both lexical decision and naming tasks, showing appropriate effects of frequency, regularity, pseudohomophones, neighborhood, and priming. Regularity is now a central motivation of the model; words are either regular,
or they are not. Irregular words take longer to pronounce than regular ones because the lexical and non-lexical routes produce conflicting pronunciations. The model accounts for surface dyslexia by making entries in the orthographic lexicon less available, and for phonological dyslexia by damaging the grapheme–phoneme conversion route.

There is not uniform agreement that it is necessary to divide the direct route into two. In the summation model (Hillis & Caramazza, 1991b; Howard & Franklin, 1988), the only direct route is reading through semantics. How does this model account for non-semantic reading? The idea is that access to the semantic system is not completely obliterated. Activation from the sublexical route combines (or is “summated”) with activation trickling down from the damaged direct semantic route to ensure the correct pronunciation.

It is difficult to distinguish between these variants of the original dual-route model, although the three-route version provides the more explicit account of the dissociations observed in dyslexia. There is also some evidence against the summation hypothesis. EP (Funnell, 1996) could read irregular words that she could not name, and priming the name with the initial letter did not help her naming, contrary to the prediction of the summation hypothesis. Many aspects of the dual-route model have been subsumed by the triangle model that serves as the basis of connectionist models of reading. The situation is complicated even more by the apparent co-occurrence of the loss of particular word meanings in dementia and surface dyslexia (see later).

The analogy model

The analogy model arose in the late 1970s when the extent of lexical effects on nonword reading and differences between words became apparent (Glushko, 1979; Henderson, 1982; Kay & Marcel, 1981; Marcel, 1980). It is a form of single-route model that provides an explicit mechanism for how we pronounce nonwords. It proposes that we pronounce nonwords and new words by analogy with other words. When a word (or nonword) is presented, it activates its neighbors, and these all influence its pronunciation. For example, “gang” activates “hang,” “rang,” “sang,” and “bang”; these are all consistent with the regular pronunciation of “gang,” and hence assembling a pronunciation is straightforward. When presented with “base,” however, “case” and “vase” are activated; these conflict, and hence the assembly of a pronunciation is slowed down until the conflict is resolved. A nonword such as “taze” is pronounced by analogy with the consistent set of similar words (“maze,” “gaze,” “daze”). A nonword such as “mave” activates “gave,” “rave,” and “save,” but it also activates the conflicting enemy “have,” which hence slows down pronunciation of “mave.” In order to name by analogy, you have to find candidate words containing appropriate orthographic segments (like “-ave”); obtain the phonological representation of the segments; and assemble the complete phonology (“m + ave”).

Although attractive in the way they deal with regularity and neighborhood effects, early versions of analogy models suffered from a number of problems. First, the models did not make clear how the input is segmented in an appropriate way. Second, the models make incorrect predictions about how some nonwords should be pronounced. Particularly troublesome are nonwords based on gangs; “pook” should be pronounced by analogy with the great preponderance of the gang comprising “book,” “hook,” “look,” and “rook,” yet it is given the “hero” pronunciation (see Table 7.2), which is in accordance with grapheme–phoneme correspondence rules, nearly 75% of the time (Kay, 1985). Analogy theory also appears to make incorrect predictions about how long it takes us to make regularization errors (Patterson & Morton, 1985). Finally, it is not clear how analogy models account for the dissociations found in acquired dyslexia. Nevertheless, in some ways the analogy model was a precursor of connectionist models of reading.

Connectionist models: Seidenberg and McClelland’s (1989) model of reading

The original Seidenberg and McClelland (1989) model evolved in response to criticisms that I will [...]
hidden units fed back to the orthographic units, mimicking top-down word-to-letter connections in the IAC model of word recognition. However, there was no feedback from the phonological to the hidden units, so phonological representations could not directly influence the processing of orthographic-level representations.

The training corpus comprised all 2,897 uninflected monosyllabic words of three or more letters in the English language present in the Kucera and Francis (1967) word corpus. Each trial consisted of the presentation of a letter string that was converted into the appropriate pattern of activation over the orthographic units. This in turn fed forward to the phonological units by way of the hidden units. In the training phase, words were presented a number of times with a probability proportional to the logarithm of their frequency. This means that the ease with which a word is learned by the network, and the effect it has on similar words, depends to some extent on its frequency. About 150,000 learning trials were needed to minimize the differences between the desired and actual outputs.

After training, the network was tested by presenting letter strings and computing the orthographic and phonological error scores. The error score is a measure of the average difference between the actual and desired output of each of the output units, across all patterns. Phonological error scores were generated by applying input to the orthographic units, and measured by the output of the phonological units; they were interpreted as reflecting performance on a naming task. Orthographic error scores were generated by comparing the pattern of activation input to the orthographic units with the pattern produced through feedback from the hidden units, and were interpreted as a measure reflecting the performance of the model in a lexical decision task. Orthographic error scores are therefore a measure of orthographic familiarity. Seidenberg and McClelland showed that the model fitted human data on a wide range of inputs. For example, regular words (such as “gave”) were pronounced faster than exception words (such as “have”).

Note that the Seidenberg and McClelland model uses a single mechanism to read nonwords and exception words. There is only one set of hidden units, and only one process is used to name regular, exception, and novel items. As the model uses a distributed representation, there is no one-to-one correspondence between hidden units and lexical items; each word is represented by a pattern of activation over the hidden units. According to this model, lexical memory does not consist of entries for individual words. Orthographic neighbors do not influence the pronunciation of a word directly at the time of processing; instead, regularity effects in pronunciation derive from statistical regularities in the words of the training corpus (all the words we have learned) as implemented in the weights of connections in the simulation. Lexical processing therefore involves the activation of information, and is not an all-or-none event.

Evaluation of the original SM model

Coltheart et al. (1993) criticized important aspects of the Seidenberg and McClelland (SM) model. They formulated six questions about reading that any account of reading must answer:

• How do skilled readers read exception words aloud?
• How do skilled readers read nonwords aloud?
• How do participants make visual lexical decision judgments?
• How does surface dyslexia arise?
• How does phonological dyslexia arise?
• How does developmental dyslexia arise?

Coltheart et al. then argued that Seidenberg and McClelland’s model answered only the first of these questions.

Besner, Twilley, McCann, and Seergobin (1990) provided a detailed critique of the Seidenberg and McClelland model, although a reply by Seidenberg and McClelland (1990) answered some of these points. First, Besner et al. argued that in a sense the model still possesses a lexicon, where instead of a word corresponding to a unit, it corresponds to a pattern of activation. Second, they pointed out that the model “reads” nonwords rather poorly, certainly much
less well than a skilled reader. In particular, it only produced the “correct,” regular pronunciation of a nonword under 70% of the time. This contrasts with the model’s excellent performance on its original training set. Hence the model’s performance on nonwords is impaired from the beginning. In reply, Seidenberg and McClelland (1990) pointed out that their model was trained on only 2,987 words, as opposed to the 30,000 words that people know, and that this may be responsible for the difference. Hence the model simulates the direct lexical route rather better than it simulates the indirect grapheme–phoneme route. Therefore any disruption of the model will give a better account of disruption to the direct route, that is, of surface dyslexia. The model’s account of lexical decision is inadequate in that it makes far too many errors; in particular it accepts too many nonwords as words (Besner et al., 1990; Fera & Besner, 1992). The model did not perform as well as people do on nonwords, in particular on nonwords that contain unusual spelling patterns (e.g., JINJE, FAIJE). In addition, the model’s account of surface dyslexia was problematic and its account of phonological dyslexia non-existent.
Forster (1994) evaluated the assumptions behind connectionist modeling of visual word recognition. He made the point that showing that a network model can successfully learn to perform a complex task such as reading does not mean that that is the way humans actually do it. Finally, Norris (1994b) argued that a major stumbling block for the Seidenberg and McClelland model was that it could not account for the ability of readers to shift strategically between reliance on lexical and sublexical information.
The revised connectionist model: PMSP
A revised connectionist model performs much better at pronouncing nonwords and at lexical decision than the original (Plaut, 1997; Plaut & McClelland, 1993; Plaut, McClelland, Seidenberg, & Patterson, 1996; Seidenberg, Petersen, MacDonald, & Plaut, 1996; Seidenberg, Plaut, Petersen, McClelland, & McRae, 1994). The model, called PMSP for short, used more realistic input and output representations. Phonological representations were based on phonemes with phonotactic constraints (that constrain which sounds occur together in the language), and orthographic representations were based on graphemes with graphotactic constraints (that constrain which letters occur together in the language). The original SM model performed badly on nonwords because Wickelfeatures disperse spelling–sound regularities. For example, in GAVE, the A is represented in the context of G and V, and has nothing in common with the A in SAVE (represented in the context of S and V). In the revised PMSP model, letters and phonemes activate the same units irrespective of context. A mathematical analysis showed that a response to a letter string input is a function that depends positively on the frequency of exposure to the pattern, positively on the sum of the frequencies of its friends, and negatively on the sum of the frequencies of its enemies. The response to a letter string is non-linear, in that there are diminishing returns: For example, regular words are so good they gain little extra benefit from frequency. This explains the interaction we observe between word consistency and frequency. As we shall see, the revised model also gives a much better account of dyslexia.
Accessing semantics
Of course the goal of reading is to access the meaning of words. The PMSP model simulates the orthography–phonology side of the triangle. Clearly, according to the model, we can access semantics either directly (OS: orthography–semantics) or indirectly (OPS: orthography–phonology–semantics, which we have also called phonological mediation). Hence there is a division of labor between the two routes. Harm and Seidenberg (2004) model the access of semantics. In the full model, all parts of the system operate simultaneously and contribute to the activation of meaning. The Harm and Seidenberg model is a complete implementation of the triangle model. It is trained to produce the correct pattern of activation across a set of semantic features given an orthographic input. In the first phase, the model is trained for a while on the phonology–semantics side of the triangle, to simulate the knowledge of
young children who cannot yet read, but who know what words mean. These weights are then frozen. In the second phase, the orthography–phonology and orthography–semantics sides of the triangle are then trained.
How does the trained model perform? Perhaps not surprisingly, in simulations resembling the skilled reader in normal conditions, the OS route is normally faster, with the OPS route lagging somewhat behind. Nevertheless, analysis of how activation of the input determines activation of the output shows that activation of the semantic system is driven by both pathways. Even if the OPS path is slower, it still always contributes to the final output. In addition, because of interactivity in the system, activation of the semantic system activates corresponding phonological representations, which in turn affect the semantic system. Simulations show that the relative contributions of the two pathways (OS and OPS) are modulated by a number of factors, including skill (phonological information is more important early on in training, corresponding to less skilled readers) and word frequency (for high-frequency words the OS pathway is more efficient). The model also simulates the response times of van Orden (1987), where people are slow to say “no” to “Is it a flower? ROWS.”
CONNECTIONIST MODELS OF DYSLEXIA
Modeling surface dyslexia
Over the last few years connectionist modeling has contributed to our understanding of deep and surface dyslexia. Patterson, Seidenberg, and McClelland (1989) artificially damaged or “lesioned” the Seidenberg and McClelland (1989) network after the learning phase by destroying hidden units or connection weights, and then observed the behavior of the model. Its performance resembled the reading of a surface dyslexic. Patterson et al. (1989) explored three main types of lesion: damage to the connections between the orthographic input and hidden units (called early weights); damage to the connections between the hidden and output (phonological) units (called late weights); and damage to the hidden units themselves. Damage was inflicted by probabilistically resetting a proportion of the weights or units to zero. The greater the amount of damage being simulated, the higher the proportion of weights that was changed. The consequences were measured in two ways. First, the damage was measured by the phonological error score, which as we have seen reflects the difference between the actual and target activation values of the phonological output units. Obviously, high error scores reflect impaired performance. Second, the damage was measured by the reversal rate. This corresponds to a switch in pronunciation by the model, so that a regular pronunciation is given to an exception item (for example, “have” is pronounced to rhyme with “gave”).
Increasing damage at each location produces near-linear increases in the phonological error scores of all types of word. On the whole, though, the lesioned model performed better with regular than with exception words. The reversal rate increased as the degree of damage increased, but nevertheless there were still more reversals occurring on exception words than on regular words. Damage to the hidden units in particular produced a large number of instances where exception words were produced with a regular pronunciation; this is similar to the result whereby surface dyslexics over-regularize their pronunciations. However, the number of regularized pronunciations that were produced by the lesioned model was significantly lower than that produced by surface dyslexic patients. No lesion made the model perform selectively worse on nonwords. Hence the behavior of the lesioned model resembles that of a surface dyslexic.
Patterson et al. also found that word frequency was not a major determinant of whether a pronunciation reversed or not. (It did have some effect, so that high-frequency words were generally more robust to damage.) As we have seen, some surface dyslexics show frequency effects on reading, while others do not. Patterson et al. found that the main determinant of reversals was the number of vowel features by which the regular pronunciation differs from the correct pronunciation, a finding verified from the neuropsychological data.
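The lesioning procedure described in this section (resetting a proportion of weights to zero and then measuring an error score) can be illustrated with a very small sketch. This is not the Patterson et al. (1989) simulation: the single linear layer, the identity weights, the unit count, and all function names below are assumptions chosen only to keep the example short and runnable.

```python
import random


def lesion_weights(weights, proportion, rng):
    """Return a copy of `weights` in which each weight is reset to zero
    with probability `proportion` (probabilistic lesioning)."""
    return [[0.0 if rng.random() < proportion else w for w in row]
            for row in weights]


def forward(weights, inputs):
    """Compute one output activation per row of `weights` (a linear layer)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def error_score(actual, target):
    """Sum of squared differences between actual and target activations."""
    return sum((a - t) ** 2 for a, t in zip(actual, target))


def mean_error_after_lesion(weights, inputs, target, proportion,
                            trials=200, seed=0):
    """Average the error score over repeated random lesions of one severity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        damaged = lesion_weights(weights, proportion, rng)
        total += error_score(forward(damaged, inputs), target)
    return total / trials


# Hypothetical "trained" network: 8 inputs copied to 8 outputs.
# A real model would have learned weights and a hidden layer.
N_UNITS = 8
WEIGHTS = [[1.0 if i == j else 0.0 for j in range(N_UNITS)]
           for i in range(N_UNITS)]
PATTERN = [1.0] * N_UNITS

if __name__ == "__main__":
    for proportion in (0.0, 0.25, 0.5, 0.75, 1.0):
        score = mean_error_after_lesion(WEIGHTS, PATTERN, PATTERN, proportion)
        print(f"damage {proportion:.2f} -> mean error score {score:.2f}")
```

In this toy network the mean error score rises steadily with the proportion of weights removed, which loosely echoes the near-linear increase reported for the lesioned model. The reversal rate is not modeled here, because it requires a network that has actually learned regular and exception pronunciations.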
An additional point of interest is that the lesioned model produced errors that have traditionally been interpreted as “visual” errors. These are mispronunciations that are not over-regularizations and that were traditionally thought to result from an impairment of early graphemic analysis. If this analysis is correct, then Patterson et al. should only have found such errors when there was damage to the orthographic units involved. In contrast, they found them even when the orthographic units were not damaged. This is an example of a particular strength of connectionist modeling; the same mechanism explains what were previously considered to be disparate findings. Here visual errors result from the same lesion that causes other characteristics of surface dyslexia, and it is unnecessary to resort to more complex explanations involving additional damage to the graphemic analysis system.
There are three main problems with this particular account. First, we have already seen that the original Seidenberg and McClelland model was relatively bad at pronouncing nonwords before it was lesioned. We might say that the original model is already operating as a phonological dyslexic. Yet surface dyslexics are good at reading nonwords. Second, the model does not really over-regularize; it just changes the vowel sound of words. Third, Behrmann and Bub (1992) reported data that are inconsistent with this model. In particular, they showed that the performance of the surface dyslexic MP on irregular words does vary as a function of word frequency. They interpreted this frequency effect as problematic for connectionist models. Patterson et al. (1989) were quite explicit in simulating only surface dyslexia; their model does not address phonological dyslexia.
Exploring semantic involvement in reading
The revised model, abbreviated to PMSP, provides a better account of dyslexia. The improvements come about because the simulations implement both pathways of the triangle model in order to explain semantic effects on reading.
Surface dyslexia arises in the progressive neurological disease dementia (see Chapter 11 on semantics for details of dementia). Importantly, people with dementia find exception words difficult to pronounce and repeat if they have lost the meaning of those words (Hodges, Patterson, Oxbury, & Funnell, 1992; Patterson & Hodges, 1992; but see Funnell, 1996). Patterson and Hodges proposed that the integrity of lexical representations depends on their interaction with the semantic system: Semantic representations bind phonological representations together with a semantic glue; hence this is called the semantic glue hypothesis. As the semantic system gradually dissolves in dementia, so the semantic glue gradually comes unstuck, and the lexical representations lose their integrity. Patients are therefore forced to rely on a sublexical or grapheme–phoneme correspondence reading route, leading to surface dyslexic errors. Furthermore, they have difficulty in repeating irregular words for which they have lost the meaning, if the system is sufficiently stressed (by repeating lists of words), but they can repeat lists of words for which the meaning is intact (Patterson, Graham, & Hodges, 1994; but see Funnell, 1996, for a patient who does not show this difference).
PMSP showed that a realistic model of surface dyslexia depends on involving semantics in reading. Support from semantics normally relieves the phonological pathway from having to master low-frequency exception words by itself. In surface dyslexia the semantic pathway is damaged, and the isolated phonological pathway reveals itself as surface dyslexia.
Plaut (1997) further examined the involvement of semantics in reading. He noted that some patients have substantial semantic impairments but can read exception words accurately (e.g., DC of Lambon Ralph, Ellis, & Franklin, 1995; DRN of Cipolotti & Warrington, 1995; WLP of Schwartz, Marin, & Saffran, 1979). To explain why some patients with semantic impairments cannot read exception words but some can, Plaut suggested that there are individual differences in the division of labor between semantic and phonological pathways. Although the majority
of patients with semantic damage show surface dyslexia (Graham, Hodges, & Patterson, 1994), some exceptions are predicted. He also argued that people use a number of strategies in performing lexical decision, one of which is to use semantic familiarity as a basis for making judgments. The revised model therefore takes into account individual differences between speakers, and shows how small differences in reading strategies can lead to different consequences after brain damage.
Modeling phonological dyslexia
The triangle model provides the best connectionist account of phonological dyslexia. It envisages reading as taking place through the three routes conceptualized in the original SM model. The routes are orthography to phonology, orthography to semantics, and semantics to phonology (Figure 7.4). This approach sees phonological dyslexia as nothing other than a general problem with phonological processing (Farah et al., 1996; Sasanuma et al., 1996). Phonological dyslexia arises through impairments to representations at the phonological level, rather than to grapheme–phoneme conversion. This is called the phonological impairment hypothesis. People with phonological dyslexia can still read words because their weakened phonological representations can be accessed through the semantic level. (Hence this approach is also a development of the semantic glue hypothesis.) We have already noted that the original Seidenberg and McClelland (1989) model performed rather like a phonological dyslexic patient, in that it performed relatively poorly on nonwords. Consistent with the phonological deficit hypothesis, this poor performance was attributed to the impoverished phonological representations used by the model.
An apparent problem with the phonological deficit hypothesis is that it is not clear that it would correctly handle the way in which people with phonological dyslexia read pseudohomophones better than other types of nonwords (Coltheart, 1996). Furthermore, patient LB of Derouesné and Beauvois (1985) showed an advantage for pseudohomophones, but no obvious general phonological impairment. There have also been effects of orthographic complexity and visual similarity, suggesting that there is also an orthographic impairment present in phonological dyslexia (Derouesné & Beauvois, 1985; Howard & Best, 1996). For example, Howard and Best showed that their patient Melanie-Jane read pseudohomophones that were visually similar to the related word (e.g., GERL) better than pseudohomophones that were visually more distant (e.g., PHOCKS). There was no effect of visual similarity for control nonwords. However, Harm and Seidenberg (2001) show how phonological impairment in a connectionist model can give rise to such effects. A phonological impairment magnifies differences in the ease with which different types of stimuli are read.
Modeling deep dyslexia
Hinton and Shallice (1991) lesioned another connectionist model to simulate deep dyslexia. Their model was trained by back-propagation to associate the written forms of words with a representation of their meaning. This model is particularly important, because it shows that one type of lesion can give rise to all the symptoms of deep dyslexia, particularly both paralexias and visual errors.
The underlying semantic representation of a word is specified as a pattern of activation across semantic feature units (which Hinton and Shallice called sememes). These correspond to semantic features or primitives such as “main-shape-2D,” “has-legs,” “brown,” and “mammal.” These can be thought of as atomic units of meaning (see Chapter 11). The architecture of the Hinton and Shallice (1991) model comprised 28 graphemic input units and 68 semantic output units with an intervening hidden layer containing 40 intermediate units. The model was trained to produce an appropriate output representation given a particular orthographic input using back-propagation. The model was trained on 40 uninflected monosyllabic words.
The structure of the output layer is quite complex. First, there were interconnections
between some of the semantic units. The 68 semantic feature units were divided into 19 groups depending on their interpretation, with inhibitory connections between appropriate members of the group. For example, in the group of semantic features that define the size of the object denoted by the word, there are three semantic features: “max-size-less-foot,” “max-size-foot-to-two-yards,” and “max-size-greater-two-yards.” Each of these features inhibits the others in the group, because obviously an object can only have one size. Second, an additional set of hidden units called cleanup units was connected to the semantic units. These permit more complex interdependencies between the semantic units to be learned, and have the effect of producing structure in the output layer. This results in a richer semantic space where there are strong semantic attractors. An attractor can be seen as a point in semantic space to which neighboring states of the network are attracted; it resembles the bottom of a valley or basin, so that objects positioned on the sides of the basin tend to migrate towards the lowest point. This corresponds to the semantic representation ultimately assigned to a word.
As in Patterson et al.’s (1989) simulation of surface dyslexia, different types of lesion were possible. There are two dimensions to remember: one is what is lesioned, the other is how it is lesioned. The connections involved were the grapheme–intermediate, intermediate–sememe, and sememe–cleanup. Three methods of lesioning the network were used. First, each set of connections was taken in turn, and a proportion of their weights was set to zero (effectively disconnecting units). Second, random noise was added to each connection. Third, the hidden units (the intermediate and cleanup units) were ablated by destroying a proportion of them.
The results showed that the closer the lesion was to the semantic system, the more effect it had. The lesion type and site interacted in their effects; for example, the cleanup circuit was more sensitive to added noise than to disconnections. Lesions resulted in four types of error: semantic (where an input gave an output word that was semantically but not visually close to the target; these resemble the classic semantic paralexias of deep dyslexics); visual (words visually but not semantically similar); mixed (where the output is both semantically and visually close to the target); and others. All lesion sites and types (except for that of disconnecting the semantic and cleanup units) produced the same broad pattern of errors. Finally, on some occasions the lesions were so severe that the network could not generate an explicit response. In these cases, Hinton and Shallice tested the below-threshold information left in the system by simulating a forced-choice procedure. They achieved this by comparing the residual semantic output to a set of possible outputs corresponding to a set of words, one of which was the target semantic output. The model behaved above chance on this forced-choice test, in that its output semantic representation tended to be closer to that of the target than to the alternatives.
Hence the lesioned network behaves like a deep dyslexic patient, in particular in making semantic paralexias. The paralexias occur because semantic attractors cause the accessing of feature clusters close to the meanings of words that are related to the target. A “landscape” metaphor may be useful. Lesions can be thought of as resulting in the destruction of the ridges that separate the different basins of attraction. The occurrence of such errors does not seem to be crucially dependent on the particular lesion type or site under consideration. Furthermore, this account provides an explanation of why different error types, particularly semantic and visual errors, nearly always co-occur in such patients. Two visually similar words can point in the first instance to nearby parts of semantic space, even though their ultimate meanings in the basins may be far apart; if you start off on top of a hill, going downhill in different directions will take you to very different ultimate locations. Lesions modify semantic space so that visually similar words are then attracted to different semantic attractors.
Hinton and Shallice’s account is important for cognitive neuropsychologists for a number of reasons. First, it provides an explicit
mechanism whereby the characteristics of deep dyslexia can be derived from a model of normal reading. Second, it shows that the actual site of the lesion is not of primary importance. This is mainly because of the “cascade” characteristics of these networks. Each stage of processing is continually activating the next, and is not dependent on the completion of processing by its prior stage (McClelland, 1979). Therefore, effects of lesions at one network site are very quickly passed on to surrounding sites. Third, it shows why symptoms that were previously considered to be conceptually distinct necessarily co-occur. Semantic and visual errors can result from the same lesion. Fourth, it thus revives the importance of syndromes as a neuropsychological concept. If symptoms co-occur as a result of any lesion to a particular system, then it makes sense to look for and study such co-occurrences.
Plaut and Shallice (1993a) extended this work to examine the effect of word abstractness on lesioned reading performance. As we have seen, the reading performance of deep dyslexic patients is significantly better on more imageable than on less imageable words. Plaut and Shallice showed that the richness of the underlying semantic representation of a word is an analog of imageability. They hypothesized that the semantic representations of abstract words contain fewer semantic features than those of concrete words; that is, the more concrete a word is, the richer its semantic representation. Jones (1985) showed that it was possible to account for imageability effects in deep dyslexia by recasting them as ease-of-predication effects. Ease-of-predication is a measure of how easy it is to generate things to say about a word, or predicates, and is obviously closely related to the richness of the underlying semantic representation. It is easier to find more things to say about more imageable words than about less imageable words. Plaut and Shallice (1993a) showed that when an attractor network similar to that of Hinton and Shallice (1991) is lesioned, concrete words are read better than abstract words. One exception was that severe lesions of the cleanup system resulted in better performance on abstract words. Plaut and Shallice argue that this is consistent with patient CAV (Warrington, 1981), who showed such an advantage. Hence this network can account for both the usual better performance of deep dyslexic patients on concrete words, and also the rare exception where the reverse is the case. They also showed that lesions closer to the grapheme units tended to produce more visual errors, whereas lesions closer to the semantic units tended to produce more semantic errors. The model also provides an account of the behavior of normal participants reading degraded words (McLeod, Shallice, & Plaut, 2000). If words are presented very rapidly to people, they make both visual and semantic errors. The data fit the connectionist model well.
Connectionist modeling has advanced our understanding of deep dyslexia in particular, and neuropsychological deficits in general. The finding that apparently unrelated symptoms can necessarily co-occur as a result of a single lesion is of particular importance. It suggests that deep dyslexia may after all be a unitary condition.
However, there is one fly in the ointment. The finding that at least some patients show imageability effects in reading but not in comprehension is troublesome for all models that posit a disturbance of semantic representations as the cause of deep dyslexia (Newton & Barry, 1997). Instead, in at least some patients, the primary disturbance may be to the speech production component of reading.
COMPARISON OF MODELS
A simple dual-route model provides an inadequate account of reading, and needs at least an additional lexical route through imageable semantics. The more complex a model becomes, the greater the worry that routes are being introduced on an arbitrary basis to account for particular findings. Analogy models have some attractive features, but their detailed workings are vague and they do not seem able to account for all the data. Connectionist modeling has provided an explicit, single-route model that covers most of the main findings, but has its
problems. At the very least it has clarified the issues involved in reading. Its contribution goes beyond this, however. It has set the challenge that only one route is necessary in reading words and nonwords, and that regularity effects in pronunciation arise out of statistical regularities in the words of the language. It may not be a complete or correct account; however, it is certainly a challenging one.
Currently we are faced with two serious alternatives: a connectionist model such as the triangle model, and a variant of the dual-route model such as the dual-route cascaded model. The literature is full of claim and counter-claim, and it would be presumptuous for a text like this to say that one is clearly right and the other wrong. There are many studies providing support for and against one or the other of the models. Many of them focus on how we read nonwords (Besner et al., 1990; Seidenberg et al., 1994), because the division of labor in the DRC model between a lexical route with knowledge of individual words and a non-lexical route with spelling rules is absent in connectionist models, and this difference is the key one between the two sorts of models. The DRC emphasizes regularity (does the word obey the rule?), which is a categorical concept: either the word obeys the spelling–sound rules or it does not, with nonwords having to be pronounced by the rule. The triangle model emphasizes consistency of rimes and other units (how often is -AVE pronounced in a certain way?), which is a statistical concept. According to Zevin and Seidenberg (2006), consistency effects such as those shown in Glushko’s (1979) and Jared’s (1997b) studies are the critical test between models. Words like PAVE are regular but inconsistent; according to the DRC model they should be as easy to pronounce as regular and consistent words such as PANE; according to the triangle model they should not. Now of course we know from Glushko’s study that regular inconsistent words are slower to pronounce than regular consistent ones, but Coltheart et al. (2001) argue that these differences are an artifact arising from several confounding factors (e.g., the presence of exception words in the materials, and an increase in the number of times it is necessary to reanalyze inconsistent words as we read them from left to right). Zevin and Seidenberg (2006) argued that graded sensitivity to consistency effects in nonwords provides the critical test between the models, with only connectionist models correctly predicting the presence of such effects, and being able to account for individual differences in nonword pronunciation. However, doubtless this debate will run and run.
Perhaps the choice between the triangle and the dual-route cascaded model comes down to which one values most: explaining a wide range of data, or parsimony in design.
Balota (1990) asked if there is a magic moment when we recognize a word but do not yet have access to its meaning. He argued that the tasks most commonly used to study word processing (lexical decision and word naming) are both sensitive to post-access processes. This makes interpretation of data obtained using these tasks difficult (although not, as we have seen, impossible). Furthermore, deep dyslexia (discussed earlier) suggests that it is possible to access meaning without correctly identifying the word, while non-semantic reading suggests that we can recognize words without necessarily accessing their meaning. Whereas unique lexical access is a prerequisite of activating meaning in models such as the logogen and the serial search model, cascading connectionist models permit the gradual activation of semantic information while evidence is still accumulating from perceptual processing. A model such as the triangle model (Patterson et al., 1996; Plaut et al., 1996) seems best able to accommodate all these constraints.
Finally, all of these models, particularly the connectionist ones, are limited in that they have focused on the recognition of morphologically simple, often monosyllabic words. Rastle and Coltheart (2000) have developed a rule-based model of reading bisyllabic words, emphasizing how we produce the correct stress, and Ans, Carbonnel, and Valdois (1998) have developed a connectionist model of reading polysyllabic words.
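The contrast drawn in this section between regularity (a categorical property) and consistency (a statistical one) can be made concrete with a small sketch. It computes, for a word such as PAVE, the proportion of its rime neighbors that share its pronunciation, either counting words or summing their frequencies. The toy lexicon, the frequency values, the pronunciation codes, and the function names are all invented for illustration; they are not taken from Glushko (1979), Jared (1997b), or any real corpus, and excluding the word itself from its own neighborhood is just one possible convention.

```python
# Hypothetical toy lexicon: (word, rime, vowel code for the rime, frequency).
# The frequencies and vowel codes are made up for illustration only.
LEXICON = [
    ("gave", "ave", "long-a", 50),
    ("save", "ave", "long-a", 40),
    ("pave", "ave", "long-a", 5),
    ("wave", "ave", "long-a", 30),
    ("have", "ave", "short-a", 400),
    ("pane", "ane", "long-a", 5),
    ("lane", "ane", "long-a", 20),
    ("cane", "ane", "long-a", 5),
]


def friends_and_enemies(word, lexicon=LEXICON):
    """Split a word's rime neighbors into friends (same pronunciation
    of the rime) and enemies (different pronunciation)."""
    _, rime, pronunciation, _ = next(e for e in lexicon if e[0] == word)
    neighbors = [e for e in lexicon if e[1] == rime and e[0] != word]
    friends = [e for e in neighbors if e[2] == pronunciation]
    enemies = [e for e in neighbors if e[2] != pronunciation]
    return friends, enemies


def consistency(word, lexicon=LEXICON, by_frequency=False):
    """Proportion of rime neighbors that are friends, counted either by
    word type or by summed frequency. Returns 1.0 if there are no neighbors."""
    friends, enemies = friends_and_enemies(word, lexicon)
    weight = (lambda e: e[3]) if by_frequency else (lambda e: 1)
    friend_total = sum(weight(e) for e in friends)
    enemy_total = sum(weight(e) for e in enemies)
    if friend_total + enemy_total == 0:
        return 1.0
    return friend_total / (friend_total + enemy_total)


if __name__ == "__main__":
    for word in ("pave", "pane", "have"):
        print(word, consistency(word), consistency(word, by_frequency=True))
```

On this toy lexicon PAVE and PANE both follow the same spelling rule, so a purely categorical notion of regularity treats them alike, but PAVE has an enemy (HAVE) and therefore a consistency below 1, whereas PANE has none. The frequency-weighted variant shows how a single high-frequency enemy can dominate the measure.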
SUMMARY
• Different languages use different principles to translate words into sounds; languages such as English use the alphabetic principle.
• Regular words have a regular grapheme-to-phoneme correspondence, but exception words do not.
• According to the dual-route model, words can be read through a direct lexical route or a sublexical route; in adult skilled readers the lexical route is usually faster.
• The sublexical route was originally thought to use grapheme–phoneme conversion, but now it is considered to use correspondences across a range of sublexical levels.
• There are effects of lexical similarity in reading certain nonwords (pseudohomophones), while not all words are read with equal facility (the consistency of the regularity of a word’s neighbors affects its ease of pronunciation).
• It might be necessary to access the phonological code of a word before we can access its meaning; this process is called phonological mediation.
• Phonological mediation is most likely to be observed with low-frequency words and with poor readers.
• Readers have some attentional control over which route they emphasize in reading.
• Access to some phonological code is mandatory, even in silent reading, but normally does not precede semantic access.
• Increasing reading speed above about 350 words a minute (by speed reading, for example) leads to reduced comprehension.
• Surface dyslexia is difficulty in reading exception words; it corresponds to an impairment of the lexical route in the dual-route model.
• Phonological dyslexia is difficulty in reading nonwords; it corresponds to an impairment of the sublexical route in the dual-route model.
• Deep dyslexic readers display a number of symptoms including making visual errors, but the most important characteristic is the presence of semantic reading errors or paralexias.
• There has been some debate as to whether deep dyslexia is a coherent syndrome.
• Non-semantic readers can pronounce irregular words even though they do not know their meaning.
• The revised dual-route model uses multiple sublexical correspondences and permits direct access through a semantic lexical route and a non-semantic lexical route.
• The dual-route cascaded model allows activation to trickle through levels before processing is necessarily completed at any level.
• Seidenberg and McClelland (SM) produced an important connectionist model of reading; however, it performed poorly on nonwords and pseudohomophones.
• Lesioning the SM network gives rise to behavior resembling surface dyslexia, but its over-regularizations differ from those made by humans.
• The revised version of this model, PMSP, gives a much better account of normal reading and surface dyslexia; it uses a much more realistic representation for input and output than the original model.
• There are clear semantic influences on normal and impaired reading, and recent connectionist models are trying to take these into account.
• The triangle model accounts for phonological dyslexia as an impairment to the phonological representations: this is the phonological impairment hypothesis.
• Deep dyslexia has been modeled by lesioning semantic attractors; the lesioned model shows how the apparently disparate symptoms of deep dyslexia can arise from one type of lesion.
x More imageable words are relatively spared because they have richer semantic representations.
x There has been considerable debate as to whether developmental dyslexia is qualitatively differ-
ent from very poor normal reading, and whether there are subtypes that correspond to acquired
dyslexias; the preponderance of evidence suggests that developmental dyslexia is on a continuum
with normal reading.
x Connectionist modeling shows how two distinct types of damage can lead to a continuum of
impairment between developmental surface and phonological dyslexia extremes.
FURTHER READING
Many of the references at the end of Chapter 6 will also be relevant here. There are a number of
works that describe the orthography of English, and discuss the rules whereby certain spelling-to-
sound correspondences are described as regular and others as irregular. One of the best known of
these is Venezky (1970). For an example of work on reading in a different orthographic system, see
Kess and Miyamoto (1999).
For a general introduction to reading, writing, spelling, and their disorders, see Ellis (1993).
For more discussion of dyslexia, including peripheral dyslexias, see Ellis and Young (1988). Two
volumes (entitled Deep Dyslexia, 2nd ed., by Coltheart, Patterson, & Marshall, 1987, and Surface
Dyslexia, by Patterson, Marshall, & Coltheart, 1985b) cover much of the relevant material. A special
issue of the journal Cognitive Neuropsychology (1996, volume 13, part 6) was devoted to phonologi-
cal dyslexia.
For recent overviews of reading, see Andrews (2006) and Snowling and Hulme (2007).
CHAPTER 8
LEARNING TO READ AND SPELL
principle or if they place a heavy load on memory. Hence words containing graphemes with irregular
pronunciations, phonemes with many graphemic options, and graphemes with no phonological
correspondences will all be difficult to spell.
In this scheme, then, there is an initial phase of direct access based only on visual cues. Barron
and Baron (1977) showed that concurrent articulation had no effect on extracting the meaning of a
printed word. However, this initial phase of visual access is very short. There is some evidence that
phonetic information is used from a very early stage (Ehri, 1992; Ehri & Wilce, 1985; Rack, Hulme,
Snowling, & Wightman, 1994). Early readers set up partial associations between sounds and the
letters for which they stand, even though these partial associations are not the same as conscious
letter-by-letter decoding. Ehri and Wilce (1985) showed that children who could not yet use
phonological decoding still found it easier to learn the simplified spelling cue “jrf,” which bears
some phonetic resemblance to the target word “giraffe,” than “wbc,” which is visually very distinctive
but bears no phonological relation to the target. Semantic factors also influence very early reading:
Laing and Hulme (1999) found that children performed better at associating spelling cues with words
when they were clearer about the meanings of words. They also performed better on more imageable
words. So even children in the earliest stages of reading are sensitive to spelling–sound relations,
but semantic factors also play a role.
PHONOLOGICAL AWARENESS
Phonological awareness, the awareness of the sounds of a word, is important when learning to read.
It is one aspect of more general knowledge of our cognitive abilities (called metacognitive
knowledge) that is thought to play an essential role in cognitive development (Karmiloff-Smith,
1986). Many tasks have been used to test phonological awareness (see Table 8.1 for some examples).
Phonological awareness is just one aspect of our knowledge of language. Gombert (1992)
distinguished between epilinguistic knowledge (implicit knowledge about our language processes
that is used unconsciously) and metalinguistic knowledge (explicit knowledge about our language
processes of which we are aware and can report, and of which we can make deliberate use). This
distinction is reflected in the tasks that have been used to test phonological awareness (e.g.,
those in Table 8.1).
TABLE 8.1 Some tasks used to assess phonological awareness (based on Yopp, 1988).
Task                          Example
Phoneme deletion              What would be left if you took /t/ out of “stand”?
Specifying deleted phoneme    What sound do you hear in “meat” that’s missing in “eat”?
Phoneme reversal              Say “as” with the first sound last and last sound first.
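The three tasks in Table 8.1 can be read as simple operations on a sequence of phonemes. The sketch below is only an illustration of what each task asks the child to do, not a model of how children perform it. The function names and the simplified transcriptions (for example "ee" for the vowel in "meat") are assumptions made for this example and do not follow any standard phonetic notation.

```python
# Illustrative sketch only: words are lists of phoneme symbols, using
# simplified, hypothetical transcriptions rather than real phonetic notation.

def delete_phoneme(phonemes, target):
    """Phoneme deletion: return the word with the first `target` removed.

    Raises ValueError if `target` does not occur in the word.
    """
    result = list(phonemes)
    result.remove(target)
    return result


def specify_deleted_phoneme(longer, shorter):
    """Specifying deleted phoneme: find the sound in `longer` that is
    missing from `shorter`; return None if no single deletion matches."""
    for i, phoneme in enumerate(longer):
        if longer[:i] + longer[i + 1:] == shorter:
            return phoneme
    return None


def reverse_first_and_last(phonemes):
    """Phoneme reversal: swap the first and last sounds of the word."""
    if len(phonemes) < 2:
        return list(phonemes)
    return [phonemes[-1]] + list(phonemes[1:-1]) + [phonemes[0]]


stand = ["s", "t", "a", "n", "d"]
meat = ["m", "ee", "t"]
eat = ["ee", "t"]
as_word = ["a", "z"]

print(delete_phoneme(stand, "t"))          # the sounds of "sand"
print(specify_deleted_phoneme(meat, eat))  # the missing sound, "m"
print(reverse_first_and_last(as_word))     # "as" with its two sounds swapped
```

All three operations require treating the word as a sequence of separable sounds, which is the kind of explicit manipulation these tasks are designed to probe.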
Although it was first thought that these tasks may all measure the same thing, it is now agreed that
they do not. In an analysis of 10 commonly used tests of phonological awareness, Yopp (1988)
identified two related factors, one to do with manipulating single sounds and another to do with
holding sounds in memory while performing operations on them. Muter, Hulme, Snowling, and Taylor
(1998) identified distinct factors in tests of phonological awareness, one to do with segmentation
skills and one with rhyming skills. The underlying ability to determine that two words have a sound
in common (phoneme constancy) might be a particularly important phonological skill for learning to
read (Byrne, 1998).
Phonological awareness and literacy are closely related. Illiterate adults (from an agricultural
area of south Portugal) performed poorly on phonological awareness tasks, particularly those
involving manipulating phonemes (e.g., adding phonemes to, or deleting them from, the starts of
nonwords). Ex-illiterate adults, who had received some literacy training in adulthood, performed
much better (Morais, Bertelson, Cary, & Alegria, 1986; Morais, Cary, Alegria, & Bertelson, 1979).
Speakers of Chinese, who use a non-alphabetic writing system where there is no correspondence
between written symbols and individual sounds, seem less aware of individual phonemes. Chinese
adult speakers who were literate in both an alphabetic and a non-alphabetic system could readily
perform tasks such as deleting or adding consonants in spoken Chinese words; speakers who were
literate only in the non-alphabetic system found the deletion and addition tasks extremely
difficult (Read, Zhang, Nie, & Ding, 1986). These studies show that phonological awareness works
in both ways: literacy in alphabetic scripts can lead to phonological awareness.
Where phonological awareness tasks have been applied systematically to all levels of the syllable
from small units (phonemes) through intermediate-size units (onsets and rimes) to large units
(syllables), researchers have found a sequence of phonological development. Implicit awareness is
measured by tasks such as matching sounds (e.g., finding rimes) and detecting oddities; explicit
awareness is measured by tasks such as isolating, segmenting, and manipulating sounds as evidenced
by production. Implicit awareness follows a large-to-small developmental sequence, as indicated by
early performance in matching tasks (Treiman & Zukowski, 1996), but this has little controlling
effect on learning to read. Explicit awareness follows a small-to-large unit sequence and reflects
the demands of learning to read using letter–sound correspondences. For example, beginning readers’
explicit awareness of rimes and onsets can be poor, while implicit knowledge of rhyming can be good
(Duncan, Seymour, & Hill, 1997, 2000). Younger children were best at finding the common unit in
sounds when the units were small (e.g., initial consonants, as in “face” and “food” rather than
“boat” and “goat”). Thus, although they were able to make the implicit judgment that “boat” and
“goat” rhymed, they were poor at explicitly identifying the common sound in those words. As children
grow older they are more sensitive to the rimes of words and better able to generate word analogies
for nonwords (e.g., “door” for “goor”).
Early work suggested that rime-level awareness could predict later reading ability in longitudinal
studies (Goswami, 1993; Goswami & Bryant, 1990); more recent studies have claimed that
phoneme-level segmentation skill and letter-name knowledge are strong predictors of level of
reading ability, while rhyming skill is only a weak predictor (Muter et al., 1998), although there is
some controversy about the effects of the specific instructions given to children (Bryant, 1998;
Hulme, Muter, & Snowling, 1998).
Beginning readers have difficulty with phonological awareness tasks, but their performance
improves with age. Developing phonological awareness improves reading skills and, as children learn
to read, their phonological awareness increases. Phonological awareness plays a driving role in
reading development (Rayner & Pollatsek, 1989). Training on phonological awareness can lead to an
improvement in segmenting and reading skills in general (Bradley & Bryant, 1983) if it is linked to
reading (Hatcher, Hulme, & Ellis, 1994; see Bus & van Ijzendoorn, 1999, for a review). Laing and
Hulme (1999) showed that phonological awareness correlates with the ability
predictive of their later analogical reading performance. Goswami presented children with a clue
word (e.g., “beak”) and asked them to read several other words and nonwords, some of which were
analogs of the clue word (e.g., “bean,” “beal,” “peak,” and “lake”). She found that the children
read the analog words better than the control words, suggesting that they are making use of the
rime to read by analogy. For Goswami, children start to read by identifying large units (onset and
rime) first, and only later identify small units such as phonemes.
Most studies, however, have found that beginning readers need some grapheme–phoneme decoding
skill in order to be able to read words by analogy (see Brown & Deavers, 1999; Coltheart & Leahy,
1992; Duncan, Seymour, & Hill, 2000; Ehri, 1992; Ehri & Robbins, 1992; Laxon, Masterson, &
Coltheart, 1991; Marsh et al., 1981; Savage, 1997). That is, beginning readers start by identifying
how letters correspond to sounds. For example, beginning readers are more adept at segmenting words
into phonemes than into onsets and rimes (Seymour & Evans, 1994). The differences between these
results are probably attributable to the materials and tasks Goswami used. Her control words might
have been more difficult to read than the analogs. Muter, Snowling, and Taylor (1994) pointed out
that the majority of these tasks involved the simultaneous presentation of clue words and target
words, which might have provided additional information that might not be available in normal
reading. Along these lines, Savage (1997) showed that there was no privileged role for onsets and
rimes in the absence of the concurrent prompts. Ehri and Robbins (1992) showed that children could
only read words by analogy in natural reading if they already possessed grapheme–phoneme recoding
skills. Brown and Deavers (1999) showed that reading strategy varied depending on the reading age of
the child. Although less skilled readers (with a mean reading age of 8 years 8 months) could make
use of rime-based correspondences (that is, read by analogy), they preferred to read by
grapheme–phoneme correspondences. Children with a higher reading age (11 years 6 months) were more
likely to read by analogy, with the rime being particularly important. Using a clue word increased
the amount of reading-by-analogy in all age groups, again suggesting that the child’s reading
strategy is task-dependent. Hence learning to read involves a process of learning through several
different reading routes (Grainger, Lété, Bertrand, Dufau, & Ziegler, 2012).
Given that different languages map spelling onto sound in different ways, it is perhaps not
surprising that languages differ in the preferred size of the key unit that emerges while learning
to read. We have just seen that in English the rime emerges as a key reading unit. In languages such
as German, Greek, and Spanish, which are much more regular in their spelling–sound correspondences,
it is possible to make systematic use of smaller units and hence older children come to rely on
simple grapheme–phoneme conversion without needing to develop reading by analogy based on rimes.
Speakers of orthographically regular languages do not need to make use of larger units. The data
support this idea. There are many words in English and German that are orthographically identical
(sand, zoo). However, as we saw in Chapter 7, the ease of pronunciation of a target word in English
depends on the number of words that share the same rime with the target: a word like “start” has
many neighbors and is easier to pronounce, while a word such as “storm” has fewer neighbors and is
more difficult. In German, this effect in adult speakers is much less pronounced, while the effect
of length is stronger (Ziegler, Perry, Jacobs, & Braun, 2001). The idea that different languages
make use of different-sized preferred reading units is called the psycholinguistic grain size theory
(Ziegler & Goswami, 2005).
In summary, in natural situations younger reading-age children tend to read using
grapheme–phoneme correspondences, and older reading-age children tend to read by analogy based
mainly on rime. They are sensitive to task demands, however, and younger children can be encouraged
to read by analogy by the clue word technique.
There is evidence that once children know something about reading, that is, once they have
acquired the basics of phonological recoding, they in part teach themselves to read (Share, 1995).
Bowey and Muller (2005) gave third-grade children (about 8 years old) short stories to read
silently. The stories contained nonwords, and in a subsequent test the children were asked to read
lists of words containing
those nonwords. They pronounced these nonwords more quickly than control nonwords.
HOW SHOULD READING BE TAUGHT?
When should reading be taught? The age at which children start to learn to read seems to be
relatively unimportant; indeed, even when it is delayed until age 7 there are no serious or
permanent side effects (Rayner & Pollatsek, 1989). In fact, older children learn to read more
quickly in comparison with younger children (Feitelson, Tehori, & Levinberg-Green, 1982). As a
corollary of this, very early tuition does not provide any obvious long-term benefits, as late
starters catch up so easily.
The main question then is how should reading be taught? There are two traditional approaches to
teaching children how to read (see Figure 8.2). These correspond to emphasizing one of the two
routes in the dual-route model. In the look-and-say or whole word method, children learn to
associate the sound of a word with a particular visual pattern. This corresponds to emphasizing the
lexical or direct access route. In the alternative phonic method, children are taught to associate
sounds with letters and letter sequences, and use these associations to build up the pronunciations
of words. This method therefore emphasizes the non-lexical, grapheme–phoneme conversion route.
It is generally agreed that the phonic method gives much better results (Adams, 1990). A
meta-analysis (which is a method of combining the results of two or more, often many, experiments)
of the reading literature showed that systematic training on phonics produced a strong beneficial
effect on learning to read (Ehri, Nunes, Stahl, & Willows, 2001). Indeed, many studies show that
discovering the alphabetic principle (that letters correspond systematically to sounds) is the key
to learning to read (see Backman, 1983; Bradley & Bryant, 1978, 1983; Byrne, 1998; Rayner &
Pollatsek, 1989; Share, 1995). Other methods do not work anywhere near as well. Seymour and Elder
(1986) examined the reading performance of a class of young children (aged 4½ to 5½ years) who were
taught to “sight read” with relatively little emphasis on the alphabetic principle. They found that
the children were limited to reading only words that they had been taught. They made many reading
errors, and their performance in some ways resembled that of people with deep and phonological
dyslexia.
Hence the most efficient way of learning to read in an alphabetic language is to learn what
phonemes correspond to. In the absence of tuition, however, children try to assign letters to words
rather than sounds, although most children soon realize that this will not work (Byrne, 1998;
Ferreiro, 1985). Anything that expedites this realization facilitates reading. Teaching the
alphabetic principle explicitly does this, and, as we have seen, training on phonological awareness
improves reading skills, presumably by focusing on phonemes and preparing the way to showing how
they can be mapped onto letters. As Byrne (1998, p. 144) concludes, “if we want children to know
something, we would be advised to teach it explicitly.”
There are two types of phonics instruction. Analytic phonics is generally taught after reading
has begun. Letter sounds are introduced gradually; reading is practiced using sets of words that
share common sounds (e.g., dog and dig). Analytic phonics is currently the most common method of
teaching reading in the United Kingdom. In synthetic phonics, children are taught all the letters
and letter sounds before anything else. Teaching emphasizes word-building activities involving the
blending together of constituent sounds. Recent work in Clackmannanshire in Scotland suggests that
being taught by synthetic phonics is greatly preferable to being taught by analytic phonics
(Johnston & Watson, 2004, 2005). A 7-year longitudinal study showed that children who were taught by
synthetic phonics learned to read and spell faster than children who were taught by other methods.
The advantages of learning to read by synthetic phonics appear to be long-lasting, with children
taught by this method showing a reading advantage several years later.
FIGURE 8.2
Finally, mere exposure to print has beneficial effects. Stanovich, West, and Harrison (1995)
showed that exposure to print was a significant predictor of vocabulary size and declarative
knowledge even after other factors such as working memory differences, educational level, and
general skill were taken into account. It is particularly important for adults to involve young
children actively with print, rather than children merely being passively exposed to it (Levy, Gong,
Hessels, Evans, & Jared, 2006). Hence games and activities that get children to manipulate letters
and words and involve them in carrying out some early form of reading are highly desirable. Indeed,
lack of exposure to print can lead to a developmental delay in reading, and may even be one factor
causing developmental surface dyslexia (Stanovich, Siegel, & Gottardo, 1997).
LEARNING TO SPELL
Spelling is an important skill associated with the
emergence of phonological awareness and learning
to read. Spelling can be thought of as the reverse
of reading: Instead of having to turn letters into
sounds, you have to turn sounds into letters. Indeed
the classic model of spelling is a dual-route one
based on the dual-route model of reading (Brown
& Ellis, 1994). In this model, there is a sound-to-
spelling, or assembled or non-lexical, route, which
can only work for regular words, and a direct, or
addressed or lexical, route, which will work for all
words. The crucial determinant in spelling develop-
ment is the acquisition of phonological representa-
tions of words (Brown & Ellis, 1994).
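The division of labor between the two spelling routes can be made concrete with a toy sketch. The assembled (non-lexical) route builds a spelling sound by sound, so it succeeds only when each sound has its usual letter; the addressed (lexical) route retrieves a stored whole-word spelling, so it handles exception words as well. The small lexicon, the phoneme-to-letter table, the simplified transcriptions, and all function names below are hypothetical illustrations invented for this example; they are not taken from Brown and Ellis (1994) or from any implemented model.

```python
# Toy illustration of a dual-route account of spelling.
# Both tables are hypothetical examples, not a real lexicon or rule set.

# Addressed (lexical) route: stored whole-word spellings, keyed by sounds.
LEXICON = {
    ("k", "a", "t"): "cat",    # regular word
    ("y", "o", "t"): "yacht",  # exception word
}

# Assembled (non-lexical) route: one typical letter per sound.
PHONEME_TO_GRAPHEME = {
    "k": "c", "a": "a", "t": "t", "y": "y", "o": "o", "z": "z", "p": "p",
}


def spell_lexical(phonemes):
    """Look up a stored spelling; return None for unknown words."""
    return LEXICON.get(tuple(phonemes))


def spell_assembled(phonemes):
    """Build a spelling by converting each sound to its typical letter."""
    return "".join(PHONEME_TO_GRAPHEME[p] for p in phonemes)


def spell(phonemes):
    """Use the stored spelling if there is one, otherwise assemble it."""
    stored = spell_lexical(phonemes)
    return stored if stored is not None else spell_assembled(phonemes)


print(spell(["y", "o", "t"]))            # lexical route retrieves "yacht"
print(spell_assembled(["y", "o", "t"]))  # assembly alone regularizes the word
print(spell(["z", "a", "p"]))            # unknown item is spelled by assembly
```

In this sketch the assembled route alone gets regular words right but regularizes the exception word, while novel items with no stored entry can only be spelled by assembly.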
Given the similarities between reading and spell-
ing, it is no surprise that the same sorts of issues are
found in spelling research as in reading research, and
that the two areas are closely connected longitudi-
nally. Spelling errors are a rich source of informa-
tion about how children spell. In the earliest stages of
spelling, around the age of 3, children know that writing is different from drawing, but do not yet
understand the alphabetic principle. Young children believe that the written forms of words should
reflect their
[Figure caption: In the phonic method, children are taught to associate sounds with letters in order to build up the pronunciation of whole words.]
correspond to the acquired dyslexias. Identifying developmental dyslexic children is complex: By
definition, they read less well than age-matched controls, but how much less well do you have to
read to be a developmental dyslexic, rather than just a poor reader?
A problem that arises when trying to infer the properties of the reading system from cases of
developmental dyslexia is that the developing reading system may be very different from the adult
system. For example, grapheme–phoneme conversion might play a larger role in children’s reading.
Furthermore, the nature of the child’s reading system will depend on the way in which the child is
being taught to read. The look-and-say method emphasizes the role of the direct access route, and
the phonic method emphasizes grapheme–phoneme conversion.
The biology of developmental dyslexia
The relation between developmental dyslexia and other cognitive abilities is complicated (Ellis,
1993). Some developmental dyslexic children have other language problems, such as in speaking or
object naming. It is often thought that dyslexic children are clumsier than average, but it is
unclear whether this is really the case. Some children with surface developmental dyslexia might
similarly have impaired visual memory (Goulandris & Snowling, 1991), although not all do. “Allan”
(Hanley, Hastie, & Kay, 1992) performed extremely well on tests of visual short-term and long-term
memory. People with developmental dyslexia are slightly more likely to be left-handed or
ambidextrous than people without (Eglinton & Annett, 1994). There is some evidence that the
oscillatory brain activity of people with developmental dyslexia is abnormal, associated with
aberrant lateralization and leading to problems with phonological processing and memory (Kraus,
2012).
Many studies have also found developmental dyslexia to be associated with visual deficits (e.g.,
Lovegrove, Martin, & Slaghuis, 1986). In particular, the magnocellular visual pathway, involving
large cells that respond quickly and are sensitive to contrast and movement, seems to be affected.
Deficits in the magnocellular system lead to problems with controlling and fixating the eyes, giving
rise to the sensation that letters are moving around the page (Stein, 2003). Deficits in the
magnocellular pathway are unlikely to be the sole cause of developmental dyslexia, however, because
many individuals without dyslexia have the same visual deficits in this pathway as individuals with
dyslexia (Skoyles & Skottun, 2004); indeed, most individuals with this visual deficit do not show
dyslexia. Furthermore, not all people with dyslexia have this visual deficit (Lovegrove et al.,
1986). We need to look elsewhere for a widespread underlying cause.
Reading disabilities tend to run in families, and recent work shows that dyslexia has a
significant genetic component, with a number of chromosomal loci identified (Eckert, Lombardino, &
Leonard, 2001; Fisher et al., 1999). There is some uncertainty, and perhaps variation, about how
these genetic abnormalities are ultimately manifest at the level of brain structure. Imaging studies
suggest that the thalamus, frontal lobes, and cerebellum all play some role, although the left
planum temporale, a structure at the heart of Wernicke’s area, plays a particularly important role
in the origin of developmental dyslexia (see Figure 8.3). The planum temporale is usually larger in
the left hemisphere than in the right; the difference in size is much less in individuals with
developmental dyslexia (Beaton, 1997). At a processing level, damage to these brain areas seems to
be manifest primarily as a disturbance to phonological skills (see below). An autopsy of four men
with developmental dyslexia found this abnormal symmetry of the planum temporale, but also found
neuronal ectopias (abnormal clusters of neurons) and dysplasias (abnormally oriented neurons), both
conditions associated with abnormalities in the migration phase of brain development in the fetus,
when neurons move to their eventual location (Galaburda, Sherman, Rosen, Aboitiz, & Geschwind,
1985). Neurons tend to be smaller in the left medial geniculate nucleus, an important part of the
brain for relaying auditory information, than in the right in people with developmental dyslexia
(Galaburda, Menard, & Rosen, 1994). Imaging studies also reveal that the occipital regions of the
brain show increased activity, probably because people are using additional visual strategies to
cope with their phonological deficits (Casey, Thomas, & McCandliss, 2001).
Clearly genetic and brain abnormalities play an important role in determining a child’s reading
ability. However, given the variation observed in orthographies and dyslexia, it is unlikely that a
single biological factor can account for all types of reading difficulty (Hadzibeganovic et al.,
2010; Seidenberg, 2011).
FIGURE 8.3 An axial cross-section of the brain to show the planum temporale. As here, the left planum
temporale is usually larger than the right. In an individual with developmental dyslexia the size difference would be
much less. [Figure labels: anterior, posterior, Broca’s area, Wernicke’s area, planum temporale (left), planum temporale (right).]
Are there subtypes of developmental dyslexia?
There has been some controversy about whether or not there are different types of developmental
dyslexia. Frith (1985) emphasized the importance of progressing from the logographic stage to the
alphabetic stage, arguing that classic developmental dyslexics fail to make this progression. Less
severely affected are those readers who are arrested at the alphabetic stage and cannot progress to
the orthographic stage. Less severe still is what is called type-B spelling disorder, where there is
a failure of orthographic access for spelling but not for reading.
Bryant and Impey (1986) reported a comparison of dyslexic and reading-age-matched control
children and found that the “normal” children made exactly the same types of reading error as the
dyslexic children. If dyslexic and normal children make the same types of error then this weakens
the argument that developmental dyslexia arises from the same type of brain damage as acquired
dyslexia. In addition, we find large differences in normal young readers. Bryant and Impey suggest
that there are many different reading styles, and some children adopt styles that lead them into
difficulty. Indeed, Baron and Strawson (1976) found that some adult normal readers were particularly
good at phonological skills but relatively poor at orthographic skills (they called these
Phoenicians; they correspond to a very mild version of surface dyslexia). Others were particularly
good at orthographic skills but relatively poor at phonological skills (Baron and Strawson called
these Chinese readers, corresponding to phonological dyslexia). Baron and Strawson proposed that
these were the ends of a continuum of individual differences in the normal population. Developmental
dyslexics would lie at the extremes of this continuum (but see also Coltheart, 1987, and Temple,
1987, for detailed replies). Olson, Kliegel, Davidson, and Foltz (1984) also found that individual
differences in reading skills in their participants fell
along a normally distributed continuum rather than into distinct subtypes.
A number of researchers have pointed out that there are similarities between acquired and
developmental dyslexia. Jorm (1979) compared developmental dyslexia with deep dyslexia. In both
cases grapheme–phoneme conversion is impaired, which leads to a particular difficulty with
nonwords. He concluded that the same part of the parietal lobe of the brain was involved in each
case; it was damaged in deep dyslexia, and failed to develop normally in developmental dyslexia.
However, Baddeley, Ellis, Miles, and Lewis (1982) found that although the phonological encoding of
people with developmental dyslexia was greatly impaired, they could do some tasks that necessitate
it. For example, they could read nonwords at a much higher level than deep dyslexics, although of
course nowhere near as well as age-matched controls.
Most people with developmental dyslexia rarely make semantic paralexias, so perhaps they
resemble phonological dyslexics rather more? Campbell and Butterworth (1985), and Butterworth,
Campbell, and Howard (1986), describe the case of RE, a successful university student, who
resembled a phonological dyslexic. RE could only read a new word once she had heard someone else
say it. She could not inspect the phonological form of words, and could not “hear words in the
head.” Such a skill may be necessary for the development of the phonological recoding route. In
addition, she had an abnormally low digit span. A similar case is that of JM, a person of superior
intelligence whose reading age was consistently 2 years less than his chronological age (Hulme &
Snowling, 1992; Snowling & Hulme, 1989). At the age of 15 his word reading was comparable to that of
reading-age-matched controls, but he was completely unable to read two-syllable nonwords. He also
had a severely reduced short-term memory span and difficulty with other tests of phonology such as
nonword repetition. Howard and Best (1996) described the case of “Melanie-Jane,” an 85-year-old
person with developmental phonological dyslexia. Melanie-Jane was highly impaired at nonword
reading, but read words with normal accuracy and latencies. She reported that she had experienced no
difficulties in learning to read or write at school. She never experienced any difficulty in
“real-life” reading. Like all these people, Melanie-Jane had difficulty with other tasks involving
phonology (e.g., assembly and segmentation). In summary, many developmental dyslexics resemble
people with acquired phonological dyslexia.
Castles and Coltheart (1993) examined the reading of 56 developmental dyslexics, and argued that
they did not form a homogeneous population, showing instead a clear dissociation between surface
and phonological dyslexic reading patterns. They concluded that such a dissociation is the norm in
developmental dyslexia. In this interpretation, the types of developmental dyslexia correspond to a
failure to “acquire” normally one of the two routes of the dual-route model. One subgroup was
relatively skilled at sublexical processing (as they were good at reading nonwords and poor at
reading exception words) and another relatively skilled at lexical processing (as they showed the
reverse pattern). Hence Castles and Coltheart concluded that there are surface and phonological
subtypes of developmental dyslexia. Subsequent work looking at the heritability of developmental
dyslexia among twins suggests that although both types are significantly inheritable, the genetic
contribution is much larger in developmental phonological dyslexia (Castles, Datta, Gayan, & Olson,
1999).
An important consideration in studying developmental dyslexia is selecting an appropriate control
group. Snowling (1983, 2000) urged caution in comparing types of acquired and developmental
dyslexia. In particular, she argued that the best comparison in understanding what has gone wrong is
not between developmental and acquired dyslexics, but between developmental dyslexics and
reading-age-matched controls. That is, if someone with a chronological age of 14 has a reading age
of 10, they should be compared with normal readers of 10. The study by Castles and Coltheart did not
use appropriate reading-age-matched controls, and did not control for IQ (Snowling, Bryant, & Hulme,
1996; Stanovich, Siegel, Gottardo, Chiappe, & Sidhu, 1997). It is
therefore possible that any apparent differences between the two types of developmental dyslexics just reflect individual differences in normal readers of a lower reading age. When compared with children at the same reading age (rather than chronological age), the two groupings disappear, because children at different reading ages differ in the difficulty they have with exception words and nonwords.
The consensus of opinion is that most impairments in developmental dyslexia lie on a continuum, rather than falling into two neat categories, with phonological developmental dyslexics and surface developmental dyslexics at the ends of the continuum (Manis, Seidenberg, Doi, McBride-Chang, & Petersen, 1996; Seymour, 1987, 1990; Wilding, 1990). Those developmental dyslexics near the surface dyslexia end are poor at reading irregular words but are not so troubled by nonwords, whereas those at the phonological dyslexia end have severe nonword reading problems and make many phonological errors while reading. Children at the phonological dyslexic end of the continuum are impaired on tasks of phonological awareness, while children at the surface dyslexic end do not differ from age-matched controls on such tasks (Manis et al., 1996).
It seems then that those at the surface dyslexic end of the continuum read and perform very similarly to reading-age-matched controls, suggesting that a general developmental delay is the root of the problem, rather than a deviant reading pattern. Clearly problems with phonology play a central role in the deviant reading pattern shown in developmental phonological dyslexia. Most people with developmental dyslexia are indeed worse at tasks involving both nonword reading and phonological awareness (Bradley & Bryant, 1983; Goswami & Bryant, 1990; Metsala, Stanovich, & Brown, 1998; Rack, Snowling, & Olson, 1992; Siegel, 1998; Snowling, 1987). Bradley and Bryant (1978) showed that people with developmental dyslexia perform less well than reading-age-matched control children at picking out a phonologically distinct word from a group of four (e.g., cat, fat, hat, net). These difficulties with phonological awareness may be related to difficulties with phonological short-term memory (Campbell & Butterworth, 1985; Hulme & Snowling, 1992; Snowling, Stackhouse, & Rack, 1986). There is also evidence of a speech perception deficit in children at the developmental phonological dyslexia extreme. Manis et al. (1997) showed that dyslexics with low phonological awareness were poor at distinguishing between the sounds “p” and “b.”
Harm and Seidenberg (1999) argued that while children at the developmental phonological dyslexia end of the continuum share a core deficit in phonological processing, children at the developmental surface dyslexia end are like beginner readers, who are also much worse at reading exception words than sounding out nonwords. They therefore concluded that surface developmental dyslexics are delayed readers. They showed how both surface and phonological developmental dyslexia can be generated by different types of damage to an attractor connectionist network. This model also shows that it is possible to have a phonological deficit that is severe enough to interfere with reading development but not severe enough to interfere with speech perception and production. Developmental phonological dyslexia arises as a consequence of damage to phonological representations before the model is trained to read. Developmental surface dyslexia can arise in several ways, including less training (corresponding to less experience of reading), making technical changes to the way in which the model learns so that it does not obtain the normal benefits from the same amount of learning, reducing the number of hidden units that mediate between orthography and phonology, and degrading the orthographic input to the model (corresponding to visual-perceptual deficits). Relatively pure examples of phonological and surface dyslexia (corresponding to the extremes of the continuum) were associated with mild forms of impairment; more severe impairments created a mixed pattern of nonword and exception word impairment that lies somewhere along the continuum.
This work therefore shows how two distinct types of damage to a connectionist model can give rise to a continuum of impairments. As we noted above, this phonological deficit
SUMMARY
FURTHER READING
See McBride-Chang (2004) for an introduction to literacy development. Ellis (1993) includes an
excellent description of developmental dyslexia. Snowling (2000) is a very approachable review
of work on developmental dyslexia, and Olson (1994) provides an up-to-date review. For general
overviews of learning to read, with emphasis on individual differences in reading ability, see
Goswami and Bryant (1990), McShane (1991), Oakhill (1994), and Perfetti (1994). For a popular
account of connectionist models of reading, see Hinton (1992), and Hinton, Plaut, and Shallice
(1993). Brown and Ellis (1994) review research on spelling. Harris and Hatano (1999) provide a
cross-linguistic perspective on learning to read and write.
For an excellent recent review of the whole area, with emphasis on phonological awareness, see
Ziegler and Goswami (2005). For more detail see Snowling and Hulme (2007).
CHAPTER 9
UNDERSTANDING SPEECH
had to be clearly and separately articulated. For the listener, co-articulation has the advantage that information about the identity of phonetic segments may be spread over several acoustic segments. Although this has the apparent disadvantage that phonemes vary slightly depending on the context, it also has the advantage that we do not gather information about only one phoneme at any one time; they provide us with some information about the surrounding sounds (a feature known as parallel transmission). For example, the /b/ phonemes in “bill,” “ball,” “bull,” and “bell” are all slightly different acoustically, and tell us about what is coming next.
The segmentation problem is that it is not easy to separate sounds in speech, as they run together (except for stop consonants and pauses). This problem does not just apply to sounds within words; in normal conditions, words also run into each other. To take a famous example, in normal speech the strings “I scream” and “ice cream” sound indistinguishable. The acoustic segments visible in spectrographic displays do not map in any easy way into phonetic segments. One obvious constraint on segmenting speech is that we prefer to segment speech so that each speech segment is accounted for by a possible word. This is called the possible-word constraint: We do not like to segment speech so that it leaves parts of syllables unattached to words (Norris, McQueen, Cutler, & Butterfield, 1997). Any segmentation of the speech string that results in impossible words (such as isolated consonants) is likely to be rejected. Hence, other things being equal, the segmentation of “fill a green bucket” will be preferred to “filigree n bucket” because the latter results in an unattached “n” sound.
Other strategies that we develop to segment speech depend on our exposure to a particular language. Strong syllables bear stress and are never shortened to unstressed neutral vowel sounds; weak syllables do not bear stress and are often shortened to unstressed neutral vowel sounds. In English, strong syllables are likely to be the initial syllables of main content-bearing words, while weak syllables are either not word-initial, or start a function word (Cutler & Butterfield, 1992; Cutler & Norris, 1988). A strategy that uses this type of information is called the metrical segmentation strategy. It is possible to construct experimental materials that violate these expectations, and these reliably induce mishearings in listeners. For example, Cutler and Butterfield described how one participant, given the unpredictable words “conduct ascents uphill” presented very faintly, reported hearing “The doctor sends the bill,” and another “A duck descends some pill.” The listeners have erroneously inserted word boundaries before the strong syllables and deleted the boundaries before the weak syllables. This type of segmentation procedure, whereby listeners segment speech by identifying stressed syllables, is called stress-based segmentation. An alternative mechanism, which is based on detecting syllables and is used in languages such as French that have very clear and unambiguous syllables, is called syllable-based segmentation. In stress-based languages such as English, syllable boundaries can be unclear, and identifying the syllables is not reliable. Hence the form of the listener’s language determines the precise segmentation strategy used (Cutler, Mehler, Norris, & Segui, 1986).
How do bilingual speakers segment languages? They do not simply mimic the monolingual speakers of the language. Their segmentation strategy is determined by which is their dominant language. Cutler, Mehler, Norris, and Segui (1992) tested English–French bilingual speakers on segmenting English and French materials, using a syllable monitoring task where the participants had to respond as quickly as possible if they heard a particular sequence of sounds. The French words “balance” and “balcon” (meaning “balance” and “balcony”) begin with different syllables (“ba” in “balance” and “bal” in “balcon”). Native French speakers find it easy to detect “ba” in “balance” and “bal” in “balcon.” On the other hand, they take longer to find the “bal” in “balance” and “ba” in “balcon” because although these sounds are present, they do not correspond to the syllables. The syllable structure of the English word “balance” is far less clear; people are uncertain to which syllable the “l” sound belongs. Hence the time it takes English speakers to detect “ba” and “bal” does not vary with the syllable structure of the word they hear (“balance”
or “balcony”). French makes use of syllables, but English does not.
In Cutler et al.’s experiment, the English–French bilingual speakers segmented depending on their primary or dominant language: English-dominant speakers showed stress-based segmentation with English language materials, and never showed syllable-based segmentation, whereas French-dominant speakers showed syllabic segmentation, and only with French materials. It is as though the segmentation strategy is fixed at an early age, and only that strategy is developed further. Hence all bilingual speakers are monolingual at the level of segmentation. This is not as big a disadvantage as it might seem: Efficient bilinguals are able to discard ineffective segmentation processes and use other, more general, analytical processes instead (Cutler et al., 1986, 1992).

Categorical perception
Even though there is all this variation in the way in which phonemes can sound, we rarely, if ever, notice these differences. We classify speech sounds as one phoneme or another; there is no halfway house. This phenomenon is known as the categorical perception of phonemes (first demonstrated by Liberman, Harris, Hoffman, & Griffith, 1957). Liberman et al. used a speech synthesizer to create a continuum of artificial syllables that differed in the place of articulation. In spite of the continuum, participants placed these syllables into three quite distinct categories beginning with /b/, /d/, and /g/. Another example of categorical perception is voice onset time (abbreviated to VOT). In the voiced consonants (e.g., /b/ and /d/), the vocal cords start vibrating as soon as the vocal tract is closed, whereas in the unvoiced consonants (e.g., /p/ and /t/), there is a delay of about 60 ms. The pairs /p/ and /b/, and /t/ and /d/, differ only in this minimal feature of voicing. Voicing lies on a continuum; it is possible to create sounds with a VOT of, for example, 30 ms. Although this is midway between the two extremes, we actually categorize such sounds as being either simply voiced or unvoiced; exactly which may differ from time to time and from person to person, and people can actually be biased towards one end of the continuum or the other. It is possible to fatigue the feature detectors hypothesized to be responsible for categorical perception by repeated exposure to a sound, and to shift perception towards the other end of the continuum (Eimas & Corbit, 1973). This technique is called selective adaptation. For example, repeated presentation of the syllable “ba” makes people less sensitive to the voicing feature of the /b/. This means that immediately afterwards the boundary between /b/ and /p/ shifts towards the /p/ end of the continuum. Hence, even though speech stimuli may be physically continuous, perception is categorical.
The boundaries between categories are not fixed, but are sensitive to contextual factors such as the rate of speech. The perceptual system seems able to adjust to fast rates of speech so that, for example, a sound with a short VOT that should be perceived as /b/ is instead perceived as /p/. In effect, an absolutely short interval can be treated as a relatively long one if the surrounding speech is rapid enough (Summerfield, 1981). This is not necessarily learned, as infants are also sensitive to speech rate. They are able to interpret the relative duration of different frequency components of speech depending on the rate of speech (Eimas & Miller, 1980; Miller & Jusczyk, 1989; see Altmann, 1997, for more detail).
At first, researchers thought that listeners were actually unable to distinguish between slightly different members of a phoneme category. However, this does not appear to be the case. Pisoni and Tash (1974) found that participants were faster to say that two /ba/ syllables were the same if the /b/ sounds in each were acoustically identical, than if the /b/ sounds differed slightly in VOT. Participants are in fact sensitive to differences within a category. Hence the importance of categorical perception has recently come into question. It is possible that many phenomena in speech perception are better described in terms of continuous rather than categorical perception, and although our phenomenal experience of speech identification is that sounds fall into distinct categories, the evidence that early sensory processing is really categorical is much weaker (Massaro, 1987, 1994). Massaro argued that the apparent poor discrimination within categories
does not result from early perceptual processing, but instead just arises from a bias of participants to say that items from the same category are identical. Nevertheless, the idea of categorical perception remains popular in psycholinguistics.

What is the nature of the prelexical code?
Do we need to identify phonemes before we identify spoken words? Savin and Bever (1970) asked participants to respond as soon as they heard a particular unit, which was either a single phoneme or a syllable. They found that participants responded more slowly to phoneme targets than to syllable targets, and concluded that phoneme identification is subsequent to the perception of syllables. They proposed that phonemes are not perceptually real in the sense that syllables are: we do not recognize words through perceiving their individual phonemes, but instead can only recognize them through perceiving some more fundamental unit, such as the syllable. Foss and Swinney (1973) queried this conclusion, arguing that the phoneme and syllable monitoring task used by Savin and Bever did not directly tap into the perception process. That is, just because we can become consciously aware of a higher unit first does not mean that it is processed perceptually earlier.
Foss and Blank (1980) proposed a dual-code theory where speech processing employs both a prelexical (or phonetic) code and a postlexical (or phonemic) code. The prelexical code is computed directly from the perceptual analysis of the input acoustic information, whereas the postlexical code is derived from information from higher level units such as words. In the phoneme monitoring task, participants have to press a button as soon as they hear a particular sound. Foss and Blank showed that phoneme monitoring times to target phonemes in words and nonwords were approximately the same. In this case, the participants must have been responding to the phonetic code, as nonwords cannot have phonological codes. Foss and Blank also found that the frequency of the target word does not affect phoneme monitoring times. On the other hand, manipulating the semantic context of a word leads to people responding on the basis of the postlexical code. (For example, people are faster to respond to the word-initial “b” in the predictable word “book” than in the less predictable word “bill” in the context of “He sat reading a book/bill until it was time to go home for his tea.”) Foss and Blank argued that people respond to the prelexical code when the phoneme monitoring task is made easy, but to the postlexical code when the task is difficult (such as when the target word is contextually less likely). Subsequently Foss and Gernsbacher (1983) failed to find experimental support for the dual-code model. Increasing the processing load of the participants (e.g., by requiring them to monitor for multiple targets) did not shift them towards responding on the basis of the postlexical code. They concluded that people generally respond in the phoneme monitoring task on the basis of the prelexical code, and only in exceptional circumstances make use of a postlexical code. These results suggest that phonemes form part of the prelexical code.
Marslen-Wilson and Warren (1994) provided extensive experimental evidence on a range of tasks that phoneme classification does not have to be finished before lexical activation can begin. Nonwords that are constructed from words are more difficult to reject in an auditory lexical decision task than nonwords constructed from nonwords. In this experiment, you start off with “smog” (a word) and “smod” (a nonword). In each case you then take off the final consonant and splice on a new one, “b,” to give you a new nonword, “smob.” Although they might initially sound the same, the version made from “smog” is more difficult to reject as a nonword because the co-articulation information from the vowel is consistent with a word. Furthermore, the effects were also found across a number of different tasks. If the phonetic representation of the vowel had been translated into a phoneme before lexical access, then the co-articulation information would have been lost and the two types of nonword would have been equally difficult. Marslen-Wilson and Warren argued that lexical representations are directly accessed from featural information in the sound signal. Co-articulation information from vowels is used early to identify the following consonant and therefore a word.
In summary, there is controversy about whether or not we need to identify phonemes before recognizing a word. Most data suggest that while phonemes might be computed during word recognition, we do not need to complete phoneme identification before word recognition can begin. The research on phonological awareness described in Chapter 8 suggests that we seem to be less aware of phonemes than other phonological constituents of speech, such as syllables. Morais and Kolinsky (1994) proposed that there are two quite distinct representations of phonemes: an unconscious system operating in speech recognition and production, and a conscious system developed in the context of the development of literacy (reading and writing).

What role does context play in identifying sounds?
The effect of context on speech recognition is of central importance, and has been hotly debated. Is speech recognition a purely bottom-up process, or can top-down information influence its outcome? If we can show that the word in which a sound occurs, or indeed the meaning of the whole sentence, can influence the recognition of that particular sound, then we will have shown a top-down influence on sound perception. In this case, we will have shown that speech perception is in part at least an interactive process; knowledge about whole words is influencing our perception of their component sounds. Of course, different types of context could have an effect at every level of phonological processing, and in principle the effects might be different at each level.
The first piece of relevant evidence is based on the categorical perception of sounds varying along a continuum. For example, although /p/ and /b/ typically differ in VOT between 0 and 60 ms, sounds in between will be assigned to one or the other category. Word context affects where the boundary between the two lies. Ganong (1980) varied an ambiguous phoneme along the appropriate continuum (e.g., /k/ to /g/), inserted this in front of a context provided by a word ending (e.g., “-iss”), and found that context affected the perceptual changeover point. That is, participants are willing to put a sound into a category they would not otherwise choose if the result makes a word: “kiss” is a word, but “giss” is not, and this influences our categorical perception of the ambiguous phoneme. This is known as lexical identification shift. In this respect, word context is influencing our categorization of sounds. Findings using this technique, developed by Connine and Clifton (1987), further strengthen the argument that lexical knowledge (information about words) is available to the categorical perception of ambiguous stimuli. They showed that other processing advantages accrue to the ambiguous stimuli when this lexical knowledge is invoked, but not at the ends of the continuum, where perceptual information alone is sufficient to make a decision. Later studies using a method of analysis known as signal detection also suggest that the lexical identification shift in a categorical perception task is truly perceptual. Signal detection theory provides a means of describing the identification of imperfectly discriminable stimuli. Lexical context is not sensitive to manipulations (primarily the extent to which correct responses are rewarded and incorrect ones punished) known to influence postperceptual processes (Pitt, 1995a, 1995b; but see Massaro & Oden, 1995, for a reply). Connine (1990) found that sentential context (provided by the meaning of the whole sentence) behaves differently from lexical context (the context provided by the word in which the ambiguous phoneme occurs). In particular, sentential context has a similar effect to the obviously postperceptual effect of the amount of monetary payoff, where certain responses lead to greater rewards. She therefore concluded that sentential context has postperceptual effects.
A classic psycholinguistic finding known as the phoneme restoration effect appears at first sight to be evidence of contextual involvement in sound identification (Obusek & Warren, 1973; Warren, 1970; Warren & Warren, 1970). Participants were presented with sentences such as “The state governors met with their respective legi*latures convening in the capital city.” At the point marked with an asterisk *, a 0.12-second portion of speech corresponding to the /s/ phoneme had been cut out and replaced with a cough. Nevertheless, participants could not detect that a sound was missing
from the sample. That is, they appear to restore the /s/ phoneme to the word “legislatures.” The effect is quite dramatic. Participants continue to report that the deleted phoneme is perceptually restored even if they know it is missing. Moreover, participants cannot correctly locate the cough in the speech. The effect can still be found if an even larger portion of the word is deleted (as in le***latures). Warren and his colleagues argued that participants are using semantic and syntactic information far beyond the individual phonemes in their processing of speech. The actual sound used is not critical; a buzz or a tone elicits the effect as successfully as a cough. There are limits on what can be restored, however; replacing a deleted phoneme with a short period of silence is easily detectable and does not elicit the effect.
In an even more dramatic example, participants were presented with sentences (1) to (4) (Warren & Warren, 1970):

(1) It was found that the *eel was on the orange.
(2) It was found that the *eel was on the axle.
(3) It was found that the *eel was on the shoe.
(4) It was found that the *eel was on the table.

The participants listened to tapes that had been specially constructed so that the only thing that differed between the four sentences was the last word. In each case, a different final word was spliced onto a physically identical beginning. This is important because it means that there can be no subtle phonological or intonational differences between the sentences that might cue participants. Once again, the phoneme at the beginning of *eel was replaced with a cough. It was found that the phoneme that participants restored depended on the semantic context given by the final word of the sentence. Participants restored a phoneme that would make an appropriate word for that context. These are “peel” in (1), “wheel” in (2), “heel” in (3), and “meal” in (4).
Although at first sight it seems that the perception of speech is constrained by higher level information such as semantic and syntactic constraints, it is unclear in these experiments how the restoration is occurring. Do participants really perceive the missing phoneme? Fodor (1983) asked whether the restoration occurs at the phonological processing level, or at some higher level. Perhaps it is just the case, for example, that participants guess the deleted phoneme. The guessing does not even need to be conscious. Another way of putting this issue is, does the context affect the actual perception or some later process?
There is evidence that in some circumstances phoneme restoration is a true perceptual effect. Samuel (1981, 1987, 1990, 1996) examined the effects of adding noise to the segment instead of just replacing the segment with noise. If phoneme restoration is truly perceptual, participants should not be able to detect any difference between these conditions; in each case they will think they hear a phoneme plus sound. On the other hand, if the effect is postperceptual, there should be good discrimination between the two conditions. Samuel concluded that lexical context does indeed lead to true phoneme restoration and that the effect was prelexical. On the other hand, he concluded that sentence context does not affect phoneme recognition, and affects only postlexical processing. Consider the sentences in (5) and (6):

(5) The travelers found horrible bats in the cavern/tavern when they visited it.
(6) The travelers found horrible food in the cavern/tavern when they visited it.

In (5) the sentential context supports “cavern” more than “tavern”; in (6) the reverse is the case. If sentence context has an effect, we should therefore get stronger phoneme restoration of the deleted initial phoneme for “cavern” than “tavern” in (5), and the opposite way round in (6). This was not the case. In conclusion, only information about particular words affects the identification of words; information about the meaning of the sentence affects a later processing stage.
Samuel (1997) investigated the suggestion that people just guess the phoneme in the restoration task, rather than truly restore it at a perceptual level. He combined the phoneme restoration technique with the selective adaptation technique of Eimas and Corbit (1973). Listeners identified sounds from the /bI/–/dI/ continuum where the sounds that were acting as adaptors were the third syllable
of words beginning either with /b/ or /d/ (e.g., “alphabet” and “academic”). After repeated presentation of the adaptor (e.g., /b/, by listening to the word “alphabet” 40 times), participants were less likely to classify a subsequent sound as /b/. Crucially, this adaptation occurred even if the critical phoneme in the adaptor word was replaced with a loud burst of noise (e.g., “alpha*et,” with * signifying the noise). The adaptation only occurred when the critical phonemes were replaced with a burst of noise, but not when they were replaced with silence.
At first sight this study suggests that restored phonemes can act like real ones and cause adaptation. Others, however, have argued that these findings can be explained without interaction if the restored phonological code is created by top-down lexical context rather than just provided by the lexical code. The lexical context does not seem to be improving the perceptibility of the phoneme (the sensitivity), but just affects how participants respond (the bias). To this extent top-down information is not really affecting the sensitivity of word recognition. Perhaps listeners come to learn to recognize the noise as an instance of a “b” sound, and hence it causes adaptation in the same way that a “real” “b” would (Norris, McQueen, & Cutler, 2000, 2003).
The balance of the data here, and as discussed later in the description of the TRACE model, suggests that top-down context has at best a limited role in sound identification. In particular, there is little evidence that sentential context affects speech processing.

The time course of spoken word recognition
The terms “word recognition” and “lexical access” are often used in the spoken word recognition literature to refer to different processes (Tanenhaus & Lucas, 1987), and so it is best to be clear in advance about what our terms mean. We can identify three stages of identification: initial contact, lexical selection, and word recognition (Frauenfelder & Tyler, 1987) (see Figure 9.1). These stages might overlap; whether they do or not is an empirical question, and is an aspect of our concern with modularity.
Recognizing a spoken word begins when some representation of the sensory input makes initial contact with the lexicon, called the initial contact phase. Once lexical entries begin to match the contact representation, they change in some way; they become “activated.” The activation might be all-or-none (as is the case in the original cohort model described later), or the relative activation levels might depend on properties of the words (such as word frequency), or words may be activated in proportion to the current goodness of fit with the sensory data (as in the more recent cohort model, or in the connectionist TRACE model). In the selection phase, activation accumulates until one lexical entry is selected. Word recognition is the end point of the selection phase.
In the simplest case, the word recognition point corresponds to its uniqueness point, where the word’s initial sequence is common to that word and no other. Often recognition will be delayed until after the uniqueness point, and in principle
FIGURE 9.1
INITIAL CONTACT (some representation of the sensory input makes initial contact with the lexicon)
LEXICAL SELECTION (sensory input continues to accumulate until one lexical entry is selected)
WORD RECOGNITION (word is recognized and the recognition point usually occurs before the complete word has been heard)
we might recognize a word before its uniqueness point (in strongly biasing contexts, for example). If this happens, the point at which this occurs is called the isolation point. This is the point in a word where a proportion of listeners identify the word correctly, even though they may not be confident about this decision (Grosjean, 1980; Tyler & Wessels, 1983). By the isolation point, the listener has isolated a word candidate; they then continue to monitor the sensory input until some level of confidence is reached; this is the recognition point. Lexical access refers to the point at which all the information about a word (phonological, semantic, syntactic, pragmatic) becomes available following its recognition. The process of integration that then follows is the start of the comprehension process proper, where the semantic and syntactic properties of the word are integrated into the higher level sentence representation.

When does frequency affect spoken word recognition?

Frequency has a very early effect on spoken word recognition. Dahan, Magnuson, and Tanenhaus (2001) examined people’s eye movements while looking at pictures on a computer screen. The participants had to follow spoken instructions about which object in the scene they had to click with their mouse. Participants tended to look at objects with the higher frequency name first, compared with a competitor picture with a lower frequency name but the same initial sounds (e.g., the spoken word was “bench,” and alongside the picture of a bench were pictures of a bed, a high-frequency competitor, and a bell, a low-frequency competitor). Participants also needed to look for less time at targets with higher frequency names. A detailed analysis of how these effects unfolded over time showed that word frequency is important from the very earliest stages of processing, and that these effects persisted for some time.

Context effects on word recognition

Does context affect spoken word recognition? The context is all of the information not in the immediate sensory signal. It includes information available from the previous sensory input (the prior context) and from higher knowledge sources (e.g., lexical, syntactic, semantic, and pragmatic information). The nature of the context being discussed also depends on the level of analysis. For example, we might have word-level context operating on phoneme identification, and sentence-level context operating on word identification. To show that context affects recognition, we need to demonstrate top-down influences on the bottom-up processing of the acoustic signal. We have already examined whether context affects low-level perceptual processing; here we are concerned with the possible effects of context on word identification. The issues involved are complex. Even if there are some contextual effects, we would still need to determine which types of context have an effect, at what stage or stages they have an effect, and how they have this effect.

We have already noted that there are two opposing positions on the role of context in recognition, which can be called the autonomous and interactionist positions. The autonomous position says that context cannot have an effect prior to word recognition. It can only contribute to the evaluation and integration of the output of lexical processing, not its generation. However, the lateral flow of information is permitted in these models. For example, information flow is allowed between words within the lexicon, but not from the lexicon to lower level processes such as phoneme identification. On the other hand, interactive models allow different types of information to interact with one another. In particular, there may be feedback from later levels of processing to earlier ones. For example, information about the meaning of the sentence or the pragmatic context might affect perception.

This description is the simplest way of putting the autonomous–interactive distinction. However, perhaps the autonomous and interactive models should be looked at as the extreme ends of a continuum of possible models rather than as the two poles of a dichotomy. There might be some restrictions on permitted interaction in interactive models. For example, context can propose candidates for what word the stimulus might be before
sensory processing has begun (Morton, 1969), or it might be restricted to disposing of candidates and not proposing them (Marslen-Wilson, 1987). Because there are such huge differences between models it can be difficult to test between them. Strong evidence for the interactionist view is if context has an effect before or during the access and selection phases. In an autonomous model, context can only have an influence after a word has emerged as the best fit to the sensory input.

Frauenfelder and Tyler (1987) distinguished between two types of context: non-structural and structural. Non-structural context can be thought of as information from the same level of processing as that which is currently being processed. An example is facilitation in processing arising from intra-lexical context, such as an associative relation between two words like “doctor” and “nurse.” It can be explained in terms of relations within a single level of processing, and hence need not violate the principle of autonomy, in terms of the spread of activation within the lexicon. Alternatively, associative facilitation can be thought of as occurring because of hard-wired connections between similar things at the same level. According to autonomy theorists such as Fodor (1983) and Forster (1981), this is the only type of context that affects processes prior to recognition.

Structural context affects the combination of words into higher level units, and it involves higher level information. It is top-down processing. There are a number of possible types of structural context. Word knowledge (lexical context) might be used to help identify phonemes, and sentence-level knowledge (sentence and syntactic context) might be used to help identify individual words. The most interesting types of structural context are those based on meaning. Frauenfelder and Tyler (1987) distinguished two subtypes: semantic and interpretative. Semantic context is based on word meanings. There is much evidence that this affects word processing. Words that are appropriate for the context are responded to faster than those that are not, across a range of tasks which I discuss in more detail later, such as phoneme monitoring, shadowing, naming, and gating (e.g., Marslen-Wilson, 1984; Marslen-Wilson & Tyler, 1980; Tyler & Wessels, 1983). But it is not clear whether non-structural and semantic structural context effects can be distinguished, or at which stages they operate. Furthermore, these effects must be studied using tasks that minimize the chance of postperceptual factors operating. For this reason the delay between the stimulus and the response cannot be too long; otherwise participants would have a chance to reflect on and maybe alter their decisions, which would obviously reflect late-stage, post-access mechanisms. Interpretative structural context involves more high-level information, such as pragmatic information, discourse information, and knowledge about the world.

There is some evidence that non-linguistic context can have an effect on word recognition. Tanenhaus, Spivey-Knowlton, Eberhard, and Sedivy (1995) studied people’s eye movements while they were examining a visual scene while following instructions. They found that visual context can facilitate spoken word recognition. For example, the words “candy” and “candle” sound similar until about halfway through. Following the instruction “pick up the candle,” participants were faster to move their eyes to the object mentioned if only a candle was in the scene than if both a candle and candy were present. Indeed, when no confusion object was present participants identified the object before hearing the end of the word. This result suggests that interpretative structural context can affect word recognition.

MODELS OF SPEECH RECOGNITION

Before we can start to access the lexicon, we have to translate the output of the auditory nerves from the ear into an appropriate format. Speech perception is concerned with this early stage of processing. It is obviously an important topic for the machine recognition of speech, as there are many obvious advantages to computers and other machines being able to understand speech.

Early models of speech recognition examined the possibility that word recognition occurred by template matching. Target words
are stored as templates, and identification occurs when a match is found. A template is an exact description of the sound or the word for which we are searching. However, there is far too much variation in speech for this to be a plausible account except in the most restricted domains. Speakers differ in their dialect, basic pitch, basic speed of talking, and in many other ways. One person can produce the same phoneme in many different ways: you might be speaking loudly, or more quickly than normal, or have a cold, for example. The number of templates that would have to be stored would be prohibitively large. Generally, template models are not considered as plausible accounts in psycholinguistics.

One early model of speech perception was that of analysis-by-synthesis (Halle & Stevens, 1962; Liberman et al., 1967; Stevens, 1960). The basis of analysis-by-synthesis is that we recognize speech by reference to the actions necessary to produce a sound. The important idea underlying this model was that when we hear speech, we produce or synthesize a succession of speech sounds until we match what we hear. The synthesizer does not randomly generate candidates for matching against the input; it creates an initial best guess constrained by acoustic cues in the input, and then attempts to minimize the difference between this and the input. This approach had a few advantages. First, it uses our capacity for speech production to cope with speech recognition as well. Second, it copes easily with intra-speaker differences, because the listeners are generating their own candidates. Third, it is easy to show how constraints of all levels might have an effect; the synthesizer only generates candidates that are plausible. It will not, for example, generate sequences of sounds that are illegitimate within that language. One variant of the model, the motor theory, proposes that the speech synthesizer models the articulatory apparatus and motor movements of the speaker. It effectively computes which motor movements would have been necessary to create those sounds. Evidence for this model is that the way sounds are made provides a perfect description of them; for example, all /d/s are made by tapping the tongue against the alveolar ridge. Note that the specification of the motor movements must be quite abstract; mute people can understand speech perfectly well (Lenneberg, 1962), and we can understand speech we cannot ourselves produce (e.g., that of people with stutters, or foreign accents).

Analysis-by-synthesis models suffer from two substantial problems. First, there is no apparent way of translating the articulatory hypothesis generated by the production system into the same format as the heard speech in order for the potential match to be assessed. Second, we are extremely adept at recognizing clearly articulated words that are improbable in their context, which suggests that speech recognition is primarily a data-driven process. In summary, Clark and Clark (1977) argued that this theory is underspecified and has little predictive power. Nevertheless, in recent years motor theories of perception have seen something of a resurgence. They do have the advantage that matching the auditory signal to motor representations for producing our own speech provides a means for categorizing the acoustic signal; indeed, some researchers go so far as to argue that these motor representations have a privileged role in language processing, and that perceiving speech resembles perceiving motor gestures, in the sense that the goal of speech perception is recognizing which vocal tract movements could give rise to the sounds, rather than the more abstract identification of the sounds themselves (Galantucci, Fowler, & Turvey, 2006; Liberman & Whalen, 2000). Imaging data show that the motor areas of the brain become activated during speech perception (Watkins & Paus, 2004), although of course this activation does not mean that the motor areas play a causal role in perception. Although analysis-by-synthesis cannot be the whole story of speech perception, it does seem as though motor processes play some role.

We are left with two basic types of model of word recognition. The cohort model of Marslen-Wilson and colleagues emphasizes the bottom-up nature of word recognition. The connectionist model TRACE emphasizes its interactive nature, and allows feedback between levels of processing. Partly in response to TRACE, Marslen-Wilson modified the cohort model, so we should distinguish between early and late versions of it.
the cohort)
(12) /trespass/
SELECTION STAGE (one item only is chosen from this set)
Postlexical
It is important to note that the recognition point does not have to coincide with the uniqueness point. Suppose we heard the start of a sentence “The poacher ignored the sign not to tres-.”
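The relation between a word-initial cohort and a word’s uniqueness point can be made concrete with a small sketch. It is only an illustration under stated assumptions: the six-word lexicon is invented, letters stand in for phonemes, and the function names `cohort` and `uniqueness_point` are hypothetical rather than taken from any published implementation of the cohort model.

```python
# Toy illustration of a word-initial cohort and the uniqueness point.
# Assumptions: letters stand in for phonemes, and TOY_LEXICON is an
# invented six-word lexicon, not real lexical data.

TOY_LEXICON = ["trespass", "tress", "trestle", "treble", "trend", "tread"]


def cohort(heard, lexicon):
    """Return every lexicon entry still consistent with the input heard so far."""
    return [w for w in lexicon if w.startswith(heard)]


def uniqueness_point(word, lexicon):
    """Return the 1-based position at which `word` is the only entry that
    matches the input so far, or None if it never becomes unique."""
    for i in range(1, len(word) + 1):
        if cohort(word[:i], lexicon) == [word]:
            return i
    return None


if __name__ == "__main__":
    word = "trespass"
    # Show how the cohort shrinks as more of the word is heard.
    for i in range(1, len(word) + 1):
        print(word[:i], cohort(word[:i], TOY_LEXICON))
    print("uniqueness point:", uniqueness_point(word, TOY_LEXICON))
```

In this toy lexicon the cohort for “trespass” shrinks from six candidates to one at the fifth segment, so the uniqueness point falls before the end of the word. The sketch covers only all-or-none elimination on the basis of the input; it does not model context, frequency, or graded activation.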
was very interactive in this respect; context is clearly affecting the prelexical selection stage. The cost of all this is that sometimes strong contextual bias might lead to error. On the other hand, if the sensory information is poor, the recognition point might not be until well after a word’s uniqueness point. Indeed, the uniqueness point and recognition point of a word are only likely to coincide in the case of a very clear, isolated word.

In a revision of the basic model (e.g., Marslen-Wilson, 1989), context only affects the integration stage. The model has bottom-up priority, meaning that context cannot be used to restrict which items form the initial cohort. Bottom-up priority is a feature of both the early and late versions of the cohort model, but in the later version, context cannot be used to eliminate members of the cohort before the uniqueness point. This change was motivated by experimental data (from the gating task to be discussed later) that suggested that the role of context is more limited than was originally thought: Context cannot be used to eliminate candidates at an early stage. Another important modification in the later version of the cohort model is that the elimination of candidates from the cohort no longer becomes all-or-none. This counters one objection to the original model: What happens if the start of a word is distorted or misperceived? This would have prevented the correct item from being in the word-initial cohort, yet we can sometimes overcome distortions even at the start of a word. Suppose we hear a word like “bleasant” (e.g., as in “the dinner was very bleasant”). Although we might be slowed down, we can still recover to identify the word as “pleasant.” (For example, a model such as TRACE, described later, will successfully identify “bleasant” as “pleasant” because the degree of overlap is high and there is no better word candidate.) Hence, in the revised model degree of overlap is important, although the beginnings of words are particularly important in generating the cohort. Also in the revised cohort model, in the absence of further positive information, candidates gradually decay back down to their normal resting state. They can be revived again by subsequent positive information. The activation level of contextually inappropriate candidates decays: context disposes, not proposes. Lexical candidates that are contextually appropriate are integrated into the higher level representation of the sentence. Sentential context cannot override perceptual hypotheses, but only has a late effect when one candidate is starting to emerge as the likely winner. The frequency of a word affects the activation level of candidates in the early stages of lexical access. The rate of gain of activation is greater for higher frequency words. There are relative frequency effects within the initial cohort, so that being in the cohort is not all-or-none, but instead items vary along a continuum of activation. The most recent version of the model (Marslen-Wilson & Warren, 1994) emphasizes the direct access of lexical entries on the basis of an acoustic analysis of the incoming speech signal.

Experimental tests of the cohort model

Marslen-Wilson and his colleagues have used a number of experimental tasks to gather evidence for the cohort model. Marslen-Wilson and Welsh (1978) used a technique known as shadowing to examine how syntax and semantics interact in word recognition. In this task, participants have to listen to continuous speech and repeat it back as quickly as possible (typically after a 250 ms delay). The speech samples have deliberate mistakes in them: distorted sounds so that certain words are mispronounced. Participants are not told that there are mispronunciations, but are told they have to repeat back the passage of speech as they hear it. But Marslen-Wilson and Welsh found that participants often (about 50% of the time) repeat these back as they should be rather than as they actually are, and without any audible disruption to the fluency of their speech. That is, we find what are called fluent restorations, such as producing “travedy” as “tragedy.” (On a small proportion of trials participants restored words after a hesitation; these non-fluent hesitations, along with errors, were excluded from further analysis.) The more distorted a sound is, the more likely you are to get an exact repetition.

In Marslen-Wilson and Welsh’s experiment there were three variables of interest. The first variable was the size of the discrepancy between
the target and the erroneous word. This discrepancy was measured in terms of the number of distinctive features changed in the deliberate error (either one feature, as in “trachedy,” or three features, as in “travedy”). The second variable was the lexical constraint, which reflected the number of candidates available at different positions in the word by manipulating the syllable position on which the error was located (first or third syllable). The third variable was the context (the word involved was a probable or improbable continuation of the start of the sentence). An example of a high-constraint context was “Still, he wanted to smoke a cigarette,” and of a low-constraint case, “It was his misfortune that they were stationary.”

Marslen-Wilson and Welsh found that most of the fluent restorations were made when the distortion was slight, when the distortion was in the final syllable, and when the word was highly predictable from its context. On the other hand, most of the exact reproductions occur with greater distortion when the word is relatively unconstrained by context. In a suitable constraining context, listeners make fluent restorations, even when deviations are very prominent. These results were interpreted as demonstrating that the immediate percept is the product of both bottom-up perceptual input and top-down contextual constraints. Shadowing experiments showed that both syntactic and semantic analyses of speech start to happen almost instantaneously, and are not delayed until a whole clause has been heard (Marslen-Wilson, 1973, 1975, 1976).

We do not pay attention equally to all parts of a word. The beginning of the word, particularly the first syllable, is especially salient. This was demonstrated by the listening for mispronunciations task (Cole, 1973; Cole & Jakimik, 1980). In this task participants listen to speech where a sound is distorted (e.g., “boot” is changed to “poot”), and detect these changes. Consistent with the shadowing task, participants are more sensitive to changes to the beginning of the words.

Indeed, word fragments that match a word from the onset are nearly as effective a prime as the word itself. For example, “capt-” is almost as good a prime of the word “ship” as the word “captain” (Marslen-Wilson, 1987; Zwitserlood, 1989). On the other hand, rhyme fragments of words produce very little priming. For example, neither a word (“cattle”) nor a derived nonword (“yattle”) prime “battle” (Marslen-Wilson, 1993; Marslen-Wilson & Zwitserlood, 1989). (Marslen-Wilson, 1993, argued on this basis that the cohort model gives a better account than that of the TRACE model described later. According to TRACE, “cattle” should compete with “battle” through the lateral inhibition connections, but as there is no word match for “yattle” it should not compete, and may even facilitate.)

The gating task (Grosjean, 1980; Tyler, 1984; Tyler & Wessels, 1983) involves presenting gradually increasing amounts of a word, as in examples (7) to (12) given earlier. This enables the isolation points of words to be found: This is the mean time it takes from the onset of a word for listeners to be able to guess it correctly. This task demonstrates the importance of context: Participants need an average of 333 ms to identify a word in isolation, but only 199 ms in an appropriate context, such as “At the zoo, the kids rode on the” for the word “camel” (Grosjean, 1980). On the other hand, these studies also showed that candidates are generated that are compatible with the perceptual representation up to that point, but that are not compatible with the context. Strong syntactic and semantic constraints do not prevent the accessing, at least early on, of word candidates that are compatible with the sensory input but not with the context. Hence sentential context does not appear to have an early effect.

In a visual equivalent of the gating task, participants looked at a computer screen showing pictures of a clown, cloud, dog, and parrot, and were instructed to “click on the cloud” (Allopenna, Magnuson, & Tanenhaus, 1998). On hearing the onset “cl-” participants were equally likely to look at the picture of the cloud and that of the clown, but then as soon as they heard further disambiguating information they looked at just the target picture.

Although context might not be able to affect the generation of candidates, it might be able to remove them. A technique known as cross-modal priming enables the measurement of contextual effects at different times in recognizing a word
(Zwitserlood, 1989). This technique necessitates participants listening to speech over headphones while simultaneously looking at a computer screen to perform a lexical decision task to visually presented words. The relation between the word on the screen and the speech, and the precise time relation between the two, can be systematically varied. Zwitserlood showed that context can assist in selecting semantically appropriate candidates before the word’s recognition point. Consider the word “captain.” (Zwitserlood’s experiment actually used Dutch materials, where the equivalent item is “kapitein.”) Participants heard differing amounts of the word before either a related or a control word appeared on a computer screen. At the point of hearing just “cap,” the word is not yet unique. It is consistent with a number of continuations, including the word “captain” but also a competitor, “capital.” Zwitserlood found facilitation for the recognition of both relatives of the target (e.g., “ship”) and competitors (“money” for “capital”). By the end of the word, however, only relatives of the target could be primed. There was also more priming by the more frequent candidate than by less frequent candidates, as predicted by the cohort model. Importantly, constraining context did not have any effect early on in the word: Even if context strongly favors a word so that its competitors are implausible (e.g., as in “With dampened spirits the men stood around the grave. They mourned the loss of their captain”), they nevertheless still prime their neighbors. After a word’s isolation point, however, we do find effects of context. Context then has the effect of boosting the word’s activation level relative to its competitors. These results support the ideas that context cannot override perceptual hypotheses, and that sentential context has a late effect, on interpreting a word and integrating it with the syntax and semantics of the sentence. Context speeds up this process of integration.

Recent imaging data support the idea that semantics plays a role in selecting among candidates. In a lexical decision task, high imageability words generated stronger activation than low imageability words, in competitive contexts (Zhuang, Randall, Stamatakis, Marslen-Wilson, & Tyler, 2011). The imaging work now shows that selection is not driven purely by the phonetic properties of the incoming words.

The influence of lexical neighborhoods

In the cohort model, the evaluation of competitors to the target word takes place in parallel, and hence the number of competitors (the cohort size) at any time should not have any effect on the recognition of the target (Marslen-Wilson, 1987). However, data from Goldinger, Luce, and Pisoni (1989) and Luce, Pisoni, and Goldinger (1990) suggest that cohort size does affect the time course of word recognition. Luce et al. found that the structure of a word’s neighborhood affects the speed and accuracy of auditory word recognition on a range of tasks, including identifying words and performing an auditory lexical decision task. The number and characteristics of a word’s competitors (such as their frequency) are very important. For example, we are less able to identify high-frequency words that have many high-frequency neighbors than words with fewer neighbors or low-frequency neighbors. Luce and his colleagues argue that the number of competitors, what they call the neighborhood density, influences the decision. Words with many neighbors take longer to identify and produce more errors because of competition.

Marslen-Wilson (1990) examined the effect of the frequency of competitors on recognizing words. He found that the time it takes you to recognize a word such as “speech” does not just depend on the relative uniqueness points of competitors (such as “speed” and “specious”) in the cohort, but also on the frequency of those words. Hence, you are faster to identify a high-frequency word that only has low-frequency neighbors than vice versa. The rise in activation of a high-frequency word is much greater than for a low-frequency one.

Phonological neighborhood is not the only factor that can affect auditory recognition. Orthographic neighborhood can also affect auditory recognition, but does so in a facilitatory fashion. That is, spoken words with many visually similar neighbors are faster to identify than spoken words with few neighbors (Ziegler, Muneaux, & Grainger, 2003). Somehow the printed word
The cohort model has changed over the years, and in the light of more recent data it places less emphasis on the role of context. In the early version of the model, context cannot affect the access stage, but it can affect the selection and integration stages. [...] affect selection but only affects integration. In the [...] the element and the acoustic input, so that a number of candidates may then be analyzed further in parallel. This permits a gradual decay of candidates [...] identification; there are some probabilistic aspects to word recognition (Grosjean, 1980). The later version, by replacing all-or-none elimination from the cohort with gradual elimination, also better accounts for the ability of the system to recover from errors. A continuing problem for the cohort model is its reliance on knowing when words start [...]

[Figure: network diagram with three levels of units labeled Words, Phonemes, and Features; the feature dimensions shown include Diffuseness and Acuteness, each running from lo to hi.]

TRACE and related models
signal. The level of input units represents phonological features; these are connected to phoneme units, which in turn are connected to the output units that represent words (see Figure 9.3). Input units are provided with energy or “activated,” and this energy or activation spreads along the connections in a manner described in the Appendix, with the result that eventually only one output unit is left activated. The winner in this stable configuration is the word that the network “recognizes.” Units on different levels that are mutually consistent have excitatory connections. All connections between levels are bidirectional, in that information flows along them in both directions. This means that both bottom-up and top-down processing can occur. There are inhibitory connections between units within each level, which has the effect that once a unit is activated, it tends to inhibit its competitors. This mechanism therefore emphasizes the concept of competition between units at the same level. The model deals with time by simulating it as discrete slices. Units are represented independently in each time-slot. The model is implemented in the form of computer simulations, and runs of the simulations are compared with what happens in normal human speech processing. The model shows how lexical knowledge can aid perception: for example, if an input ambiguous between /p/ and /b/ is given followed by the ending corresponding to -LUG, then /p/ is “recognized” by the model. Categorical perception arises in the model as a consequence of within-level inhibition between the phoneme units. As activation provided by an ambiguous input cycles through time, mutual inhibition between the phoneme units results in the input being classified as at one or other end of the continuum. TRACE accounts for position effects in word recognition (word-initial sounds play a particularly important role) because input unfolds over time, so that word-initial sounds contribute much more to the activation of word nodes than word-final sounds do (see Figures 9.4 and 9.5).

Evaluation of the TRACE model

TRACE handles context effects in speech perception very well. It can cope with some acoustic variability, and gives an account of findings such as the phoneme restoration effect and co-articulation effects. TRACE gives a very good account of lexical context effects. It is good at finding word boundaries and copes extremely well with noisy input, which is a considerable advantage, given the noise present in natural language. An attractive aspect of TRACE is that features that are a problem for older models, such as co-articulation effects in template models, actually facilitate processing, just as they clearly do in humans, through top-down processing. As with all computer models, TRACE has the advantage of being explicit.

There are several problems with TRACE, however. There are many parameters that can be
FIGURE 9.5 Categorical phoneme perception in TRACE. The top panel shows the level of bottom-up activation to the phoneme units /g/ and /k/, for each of 12 stimuli (shown on the x-axis). The lower panel shows the activation for the same phoneme units after cycle 60. Stimuli 3 and 9 correspond to canonical /g/ and /k/, respectively. At cycle 60, the boundary between the phonemes is much sharper. From McClelland, Rumelhart, and the PDP Research Group (1986). (Axes in both panels: stimulus number, x; phoneme node activation, y.)
manipulated in the model, and it is possible to level the criticism that TRACE is too powerful, in that it can accommodate any result. By adjusting some of the parameters, can the model be made to simulate any data from speech recognition experiments, whatever they show? Moreover, the way in which the model deals with time, simulating it as discrete slices, is implausible.
Massaro (1989) pointed out a number of problems with the TRACE model. He carried out an experiment in which listeners had to make a forced-choice decision about which phoneme they heard, when the sound they heard was on the continuum between /l/ and /r/. The sounds occurred in the contexts of /s_i/, /p_i/, and /t_i/. The first context favors the identification of /l/, as there are a number of English words that begin with /sli-/ but no words that begin /sri-/. The third context favors /r/ because there are words beginning with /tri-/ but not /tli-/. Finally, the second context favors both phonemes approximately equally, as there are words beginning with both /pli-/ and /pri-/. Massaro found that the context biases performance so that, for example, listeners were more likely to classify an ambiguous phoneme as /l/ in the /s_i/ context and /r/ in the /t_i/ context. The behavior of humans in this task differed from the behavior of the TRACE network. In particular, in TRACE context has the biggest effect when the speech signal is most ambiguous, and has less effect when the signal is less ambiguous. With humans, the effects of context are constant with respect to the ambiguity of the speech signal. Although McClelland’s (1991) reply accepted many of Massaro’s points, and tried making the model’s output probabilistic (or stochastic), Massaro and Cohen (1991) found that the problems persisted even after this modification. Massaro’s work is important in that it shows that it is possible to make falsifiable predictions about connectionist models such as TRACE. Massaro argues for a
model where phonetic recognition uses features that serve as an input to a decision strategy involving variable conjunctions of perceptual features called fuzzy prototypes (see Klatt, 1989, for more detail). Choosing between these models is difficult, and it is not clear that they are addressing precisely the same issues: TRACE is concerned with the time course of lexical access, whereas the fuzzy logic model is more concerned with decision making and output processes (McClelland, 1991).
The main problem with TRACE is that it is based on the idea that top-down context permeates the recognition process. The extent to which top-down context influences speech perception is controversial. In particular, there is also experimental evidence against the types of top-down processing that TRACE predicts occur in speech processing: Context effects are only really observed with perceptually degraded stimuli (Burton, Baum, & Blumstein, 1989; McQueen, 1991; Norris, 1994b).
In support of TRACE, Elman and McClelland (1988) reported an experiment showing interactive effects on speech recognition of the sort predicted by TRACE. They argued that they had demonstrated that between-level processes can affect within-level processes at a lower level. In particular, they showed that illusory phonemes created by top-down, lexical knowledge (in a manner analogous to phoneme restoration) can affect co-articulation (the influence of one sound on a neighboring sound) operating at the basic sound perception level in the way predicted by simulations in TRACE. Consider word pairs such as “English dates/gates” or “copious dates/gates,” where the initial phoneme of the second word was ambiguous, lying on the continuum between /d/ and /g/. The co-articulatory effects of the final sound of the first word affect the precise way in which we produce the first sound of the second word. Listeners are sensitive to these co-articulation effects in speech: the effect is called compensation for co-articulation. In particular, we are more likely to identify the ambiguous phoneme as a /d/ when it follows a /sh/, as in “English,” but more likely to identify it as a /g/ when following /s/, as in “copious.” So listeners should tend to report hearing “English dates” but “copious gates.” Elman and McClelland showed that this compensation effect was obtained even when the final sounds of “English” and “copious” were replaced with a sound halfway between /s/ and /sh/.
At first sight then, the data of Elman and McClelland (1988) support an interactive model rather than an autonomous one. The lexicon appears to be influencing a prelexical effect (compensation). There are, however, accounts of the data compatible with the autonomous model. First, it is not necessary after all to invoke lexical knowledge. Connectionist simulations using strictly bottom-up processing can learn the difference between /g/ after /s/ and /sh/, and also that /s/ is more likely to follow one vowel and /sh/ another. That is, there are sequential dependencies between phonemes that mean that we do not need to invoke lexical knowledge: Some sequences of phonemes are just more likely (Cairns, Shillcock, Chater, & Levy, 1995; Norris, 1993). Pitt and McQueen (1998) demonstrated that this sequential information can be used in speech perception. They found compensation for co-articulation effects on the categorization of stop consonants when they were preceded by ambiguous fricative sounds at the end of nonwords. For example, the sequence of phonemes in the nonword “der?” is biased towards an /s/ conclusion, while the sequence in “nai?” is biased towards a /sh/ conclusion. (In both cases the final sound in fact was halfway between /s/ and /sh/.) The nonwords were followed by a word beginning with a stop consonant sound along the /t/ to /k/ continuum, from “tapes” to “capes.” The identification of the stop consonant was influenced by the preceding ambiguous fricative differently depending on the nonword context of the fricative. As the preceding item was a nonword, lexical knowledge could not be used. The fact that compensation is still obtained suggests that sequential knowledge about which phonemes co-occur is being used.
TRACE is also poor at detecting mispronunciations. TRACE is a single-outlet model (Cutler, Mehler, Norris, & Segui, 1987): The only way TRACE can identify phonemes is to see which phonemes are identified at the phoneme level. However, suppose a mispronounced word is presented. The phonemes will activate the best match word. This word node will then feed back activation to the phoneme level, so that the phonemes in the best match
will become activated: The incorrect phonemes will be corrected. But mispronunciations are not overlooked; they have a distinct adverse effect on performance (Gaskell & Marslen-Wilson, 1998).
Single-outlet models can be contrasted with multiple-outlet models, such as the Race model (Cutler & Norris, 1979), where two sources of information, the stored and maintained prelexical analysis of the word, and a word’s lexical entry, compete for output. The decision is made on the basis of which route produces the answer first—hence the race aspect. Because there are two outlets, prelexical and lexical, it should be possible to emphasize one rather than the other by shifting attention. Lexical effects on phoneme processing should be maximized when people pay particular attention to the lexical outlet, and minimized when they pay particular attention to the prelexical outlet. This pattern is exactly what is observed, and is difficult for single-outlet models such as TRACE to account for (Cutler et al., 1987; Norris et al., 2000). For example, the magnitude of the lexical effect in phoneme monitoring tasks depends on the composition of the other filler items used in the experiment.
In their review of the literature on context effects on speech recognition, Norris et al. (2000) argued that feedback is never necessary in speech recognition. Indeed, top-down feedback, they argue, would hinder recognition. Feedback cannot improve accuracy in processing (indeed, it can override the detection of mispronunciations and can actually decrease accuracy); it can only speed up processing. The cost to this increase in speed is a trade-off with accuracy. The crux of the argument is whether or not there is lexical involvement in phonemic decision making—which are all tasks where listeners are required to make decisions about sounds, such as phoneme monitoring, phoneme restoration, and phonetic categorization.
Finally, there is experimental evidence against other assumptions of the model. Frauenfelder, Segui, and Dijkstra (1990) found no evidence of top-down inhibition on phonemes in a task involving phoneme monitoring of unexpected phonemes late in a word compared with control nonwords. TRACE predicts that once a word is accessed, phonemes that are not in it should be subject to top-down inhibition. TRACE also predicts that targets (e.g., t) in nonwords derived from changed words (e.g., vocabutary) should be identified more slowly than targets in control nonwords (e.g., socabutary) because the actual phoneme competes with the phoneme in the real word (l) because of top-down feedback. However, there was no difference between the two nonword conditions. Cutler et al. (1987) found that phoneme monitoring latencies were faster to word-initial phonemes than to phonemes at the start of nonwords. According to the TRACE model there should be no difference for phonemes at the start of words and nonwords as activation will not have had time to build up and feed back to the phoneme level.
TRACE is also unable to account for the findings from subcategorical mismatch experiments (Marslen-Wilson & Warren, 1994). This task involves cross-splicing the initial consonants and consonant clusters from matched pairs of words (e.g., “job” and “smob”). Marslen-Wilson and Warren examined the effect of splicing on lexical decision (is it a word?) and phoneme categorization (what sort of sound did you hear?). The effect of the cross-splice on nonwords was much greater when the spliced material came from a word (e.g., an item like “smob,” where the “sm-” component came from the word “smog”), such that performance was poorer when the cross-spliced nonword came from a word, but the splicing made little difference to the processing of words. These data are difficult for many models. They are difficult for independent race models because decisions about nonwords can only be made by the prelexical route, and therefore should be unaffected by the lexical status of the items from which the materials are derived. They are difficult for TRACE because simulations in TRACE show that words should be affected as well as nonwords, and in nonwords the inhibitory effect should be greater than it actually is. TRACE does poorly because it cannot use data about the mismatch between two items.
TRACE is successful in accounting for a number of phenomena in speech recognition, and is particularly good at explaining context effects. Its weakness is that the extent to which its predictions are supported by data is questionable.
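The kind of lexical context effect that TRACE handles well can be illustrated with a very small interactive-activation sketch. This is not the TRACE implementation: it assumes just two phoneme units (/p/ and /b/) that inhibit each other, and a single word unit ("plug") that is excited by /p/ and can feed activation back to it. The function names and all parameter values are illustrative assumptions, not published values.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Keep an activation value inside [lo, hi]."""
    return max(lo, min(hi, x))


def run_network(input_p, input_b, feedback=True, cycles=60):
    """Settle a toy network and return the final (/p/, /b/) activations."""
    # Illustrative parameter values chosen for this sketch only.
    excite, inhibit, decay, top_down = 0.1, 0.2, 0.1, 0.1
    p = b = plug = 0.0
    for _ in range(cycles):
        # Bottom-up evidence minus within-level inhibition from the rival unit.
        net_p = excite * input_p - inhibit * b
        net_b = excite * input_b - inhibit * p
        if feedback:
            # Top-down support: the word "plug" contains /p/, and there is no
            # competing word unit for /b/ because "blug" is not a word.
            net_p += top_down * plug
        # Bottom-up excitation of the word unit from its consistent phoneme.
        net_plug = excite * p
        # All units are updated together from the previous cycle's values.
        p = clamp(p + net_p - decay * p)
        b = clamp(b + net_b - decay * b)
        plug = clamp(plug + net_plug - decay * plug)
    return p, b
```

With an input that is equally consistent with both phonemes, for example run_network(0.5, 0.5), the two phoneme units stay tied when feedback is switched off, whereas with feedback the /p/ unit ends up more active than /b/. Note that this toy network behaves in the way Massaro criticized: context can only tip the balance when the bottom-up evidence is ambiguous.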
Other connectionist models of speech recognition
Recent networks use recurrent connections from the hidden layer to a context layer to store information about previous states of the network (Elman, 1990) (see Figure 9.6). This modification enables networks to encode information about time. Hence, they give a much more plausible account of the time-based nature of speech processing than does TRACE, which uses fixed time-based units and therefore finds it difficult to cope with variations in speech rate.
Gaskell and Marslen-Wilson (1997, 1998, 2002) extended the cohort model to model the process that maps between phonological and lexical information. They constructed a connectionist model that emphasized the distributed nature of lexical representations (unlike TRACE, which uses local representation) so that information about any one word is distributed across a large number of processing units. The other important way in which it differed from other connectionist models such as TRACE is that low-level speech information, represented by phonetic features, is mapped directly onto lexical forms. There are no additional levels of phonological processing involved (although there is a layer of hidden units mediating between the feature inputs and the semantic and phonological output layers).
Gaskell and Marslen-Wilson’s model simulated several important aspects of speech processing. First, it gave a good account of the time course of lexical access. It showed that multiple candidates could become activated in parallel. The target word only becomes strongly differentiated from its competitors close to its uniqueness point. Second, the model successfully simulated the experimental data of Marslen-Wilson and Warren (1994). Third, unlike other connectionist models such as TRACE, and like humans, their model shows very little tolerance. As in Marslen-Wilson and Warren’s (1994) experiment, a nonword such as “smob” that matches a word quite closely (“smog”) except for the place of articulation of the final segment, and which is constructed so that the vowels are consistent with the proper target, does not in fact activate the lexical representation of the word (“smog”) very much. The network requires a great deal of phonetic detail to access words—just like humans. Gaskell and Marslen-Wilson propose that this feature of the model is a consequence of the realistic way in which the inputs are presented (with words embedded in a stream of speech), and the training of the network on a large number of similar phonological forms. These features force the network to be intolerant about the classification of inputs. Fourth, because words are represented in a way such that similar items overlap in their representations, competition between similar items is an essential part of processing. The simultaneous activation of more than one candidate creates conflict. Gaskell and Marslen-Wilson present a series of experiments using cross-modal priming that show that competition reduces the magnitude of the semantic
FIGURE 9.6 (diagram labels: output units, hidden units)
priming effect. When a word is still ambiguous, for example “capt-,” which could be either “captain” or “captive,” it is not particularly effective at priming “ship”; it only becomes effective relatively late, after we have reached the word’s uniqueness point. Note though that “capt-” still produces some priming; you can access meaning prior to the uniqueness point, which allows some facilitation of semantically related words, but as you cannot get complete access, semantic priming is weaker than after the uniqueness point. Finally, the model accounts for the different pattern of effects found in cross-modal repetition priming and cross-modal semantic priming. Gaskell and Marslen-Wilson argue that the amount of competition between words depends on the coherence of the competing set. The candidates activated by a partial sound input will necessarily sound similar (e.g., captain and captive): the candidate set is coherent. In contrast the semantic properties of the candidate words will be unrelated. Hence repetition priming can make direct use of the set of lexical candidates directly activated by the input (e.g., “capt-” is closely related to “captain” and “captive”). Semantic priming cannot do so, as it generates multiple unrelated candidate items; the candidate words related to the prime “capt-” include “ship” and “prisoner,” which are unrelated—this set is incoherent. Furthermore, with incoherent candidate sets, the more candidates there are, the more competition there will be, while with coherent sets, the number of candidates matters much less, and hence priming should be less affected by the cohort
The SHORTLIST model is entirely bottom-up and is based on a vocabulary of tens of thousands of words. Essentially the model views spoken word recognition as a bottom-up race between similar words. A competition network is created “on the fly” from the output of a bottom-up recognition network in which candidates detected in the incoming speech stream are allowed to compete with each other. Only a few words are active enough to be used in the list (hence the name). The main drawback of this approach concerns the plausibility of creating a new competitive network at each time step (Protopapas, 1999).
Given that they argue there is no top-down feedback in speech recognition, Norris, McQueen, and Cutler (2000) propose a purely data-driven model. They call this model MERGE. MERGE is a competition-activation model similar to SHORTLIST. In the MERGE model, activation flows from the prelexical level to the lexicon and to phoneme-decision nodes. Crucially, there is no feedback between the lexical nodes and the prelexical nodes. However, lexical information can influence the phoneme-decision nodes. Decisions are made on the basis of merging
these two inputs. Norris et al. provide simulations that show that such a model does a good job of accounting for a wide range of experimental data. Critics (see commentary in Norris et al.) argue that merging is a form of interaction, as the phoneme-decision nodes are influenced by lexical information, and that MERGE is a model specifically about phoneme-decision tasks rather than a general model of speech recognition.
Comparison of models of spoken word recognition
Let us look again at the three phases of speech recognition we identified and see what the different models we have examined so far have to say about them. When we hear speech, we have to do two things. We have to segment the speech stream into words, and we have to recognize those words. The amount of speech needed to compute the contact representation determines when initial contact can occur. According to Klatt (1989), contact can be made after the first 10 ms. Models that use syllables to locate possible word onsets, and which need larger units of speech, will obviously take longer before they can access the lexicon. Different models also emphasize how representations make contact with the lexicon. Hence in the cohort model, the beginning of the word (the first 150 ms) is used to make first contact. In other models (e.g., Grosjean & Gee, 1987), the more salient or reliable parts of the word, such as the most stressed syllable, are used. All of these models where initial contact is used to generate a subset of lexical entries have the disadvantage that it is difficult to recover from a mistake (e.g., a mishearing). Models such as TRACE, where there is not a unique contact for each word, do not suffer from these problems. Each identified phoneme—the whole word—contributes to the set of active lexical entries. The cost of this is that these sets may be very large and this might be computationally costly.
The revised cohort model negates the problem of recovering from catastrophic early mistakes by allowing gradual activation of candidates rather than all-or-none activation. Furthermore, we have seen that while the beginnings of words are important in lexical access, the rhyme parts produce no priming. On the other hand, the evidence for the amount of interaction that TRACE entails is limited.
The Gaskell and Marslen-Wilson model is very similar to the SHORTLIST model of Norris (1994b). Both models differ from TRACE in making less use of top-down inhibition and more use of bottom-up information. SHORTLIST combines the advantages of recurrent nets and TRACE. At present, these types of connectionist model show how models of spoken word recognition are likely to develop, although SHORTLIST currently suffers from the problem that it is not clear how interactive activation networks can be set up quickly “on the fly.”
Virtually all models of word recognition view spoken word recognition as incorporating an element of competition between the target word and its neighbors. Therefore priming a word should retard recognition of another sharing the same initial sounds (Monsell & Hirsh, 1998). Unfortunately, the bulk of the research has shown either facilitation or no effect of priming phonologically related items, rather than the expected inhibition. Why might this be? Monsell and Hirsh pointed out that in these studies the lag between the prime and the probe is very brief. It is possible that any inhibitory effects are cancelled out by short-acting facilitatory effects generated by other factors, such as processing shared sublexical constituents (such as phonemes or rimes). If this is the case, then inhibition should be apparent at longer time lags, when the short-lived facilitatory effects have had time to die away. This is what Monsell and Hirsh observed. In an auditory lexical decision task, with time lags of 1–5 minutes between prime and target, the response time for a monosyllabic word preceded by a word sharing its onset and vowel (e.g., “chat” and “chap”) increased relative to an unprimed control. Similarly, response time increased for polysyllabic words preceded by another sharing the first syllable (e.g., “beacon” and “beaker”). The effect was limited to word primes—nonword primes (e.g., “chass” and “beacal”) did not produce this inhibition. Hence priming phonological competitors does indeed retard
the subsequent recognition of items, but the effect is only manifest when other short-term facilitatory effects have died down.
Finally, we make use of other types of information when understanding speech. Even people with normal hearing can make some use of lip-reading. McGurk and MacDonald (1976) showed participants a video of someone saying “ba” repeatedly, but gave them a soundtrack with “ga” repeated. Participants reported hearing “da,” apparently blending the visual and auditory information. This effect suggests that speech perception is the result of the best guess of the whole perceptual system, using multiple sources of information, among which speech is usually the most important.
THE NEUROSCIENCE OF SPOKEN WORD RECOGNITION
Some difficulty in speech recognition is quite common in adults with a disturbance of language functions following brain damage. Varney (1984) reported that 18% of such patients had some problem in discriminating speech sounds. Brain damage can affect most levels of the word recognition process, including access to the prelexical and the postlexical codes.
There are many cases of patients who have difficulty in constructing the prelexical code. Caplan (1992) reviews these. For example, brain damage can affect the earliest stages of acoustic-phonetic processing of features such as voice onset time, or the later stages involving the identification of sounds based on these features (Blumstein, Cooper, Zurif, & Caramazza, 1977). Neuropsychological evidence suggests that vowels and consonants are processed by different systems. Caramazza, Chialant, Capasso, and Miceli (2000) describe two Italian-speaking aphasic patients who show selective difficulties in producing vowels and consonants. Patient AS produced mainly errors on vowels, while patient IFA produced mainly errors on consonants. These differences remained even when other possible confounding factors (such as the degree of sonority—essentially the amount of acoustic energy in a sound) were taken into account.
Patients with pure word deafness can speak, read, and write quite normally, but cannot understand speech, even though their hearing is otherwise normal (see Saffran, Marin, & Yeni-Komshian, 1976, for a case history). Patients with pure word deafness cannot repeat speech and have extremely poor auditory comprehension. They are impaired at tasks such as distinguishing stop consonants from each other (e.g., /pa/ from /ba/ and /ga/ from /ka/). On the other hand Saffran et al.’s patient could identify musical instruments and non-speech noises, and could identify the gender and language of a recorded voice. This pattern of performance suggests that these people suffer from disruption to a prelexical, acoustic processing mechanism. A very rare and controversial variant of this is called word meaning deafness. Patients with word meaning deafness show the symptoms of pure word deafness but have intact repetition abilities. The most famous case of this was a patient living in Edinburgh in the 1890s (Bramwell, 1897/1984), although more recent cases have been reported by Franklin, Howard, and Patterson (1994), and Kohn and Friedman (1986). Pure word deafness shows that we can produce words without necessarily being able to understand them.
Only one patient (EDE) clearly showed intact acoustic-phonetic processing (and therefore the ability to construct a prelexical code), but also then had difficulties with lexical access (Berndt & Mitchum, 1990). This patient performed well on all tests of phoneme discrimination and acoustic processing, yet made many errors in deciding whether a string of sounds made up a word or not (e.g., “horse” is a word, but “hort” is not). Nevertheless EDE generally performed well on routine language comprehension tasks, and Berndt and Mitchum interpreted her difficulties with this particular task in terms of a short-term memory deficit rather than of lexical access. As yet there have been no reports of patients who have completely intact phonetic processing but who cannot access the postlexical code. This might be because so far we have not looked hard enough, or perhaps have just been unlucky.
SUMMARY
• We can recognize meaningful speech faster and more efficiently than we can identify non-speech sounds.
• Sounds run together (the segmentation problem), and vary depending on the context in which they occur (the invariance problem).
• The way in which we segment speech depends on the language we speak.
• We use a number of strategies to segment speech; stress-based segmentation is particularly important in English.
• Consonants are classified categorically, but it is unclear how early in perception this effect arises, because listeners are sensitive to differences between sounds within a category.
• The lexicon is our mental dictionary.
• The prelexical code is the sound representation used to access the lexicon.
• There is controversy about whether phonemes are represented directly in the prelexical code, or whether they are constructed after we access the lexicon.
• Studies of co-articulation effects in words and nonwords suggest that a low-level phonetic representation is used to access the lexicon directly.
• The lexical identification shift of ambiguous phonemes varies depending on the lexical context.
• Phonemes masked by noise can be restored by an appropriate context.
• There has been debate about whether the lexical identification shift and phoneme restoration effects are truly perceptual effects or instead reflect later processing.
• Word recognition can be divided into initial contact, lexical selection, and word recognition phases.
• A spoken word’s uniqueness point is when the stream of sounds is finally unambiguously distinguishable from all other words.
• We recognize the word at its recognition point; the recognition point does not have to correspond to the uniqueness point.
• Although the extent to which top-down sentential context has an effect on the early stages of word recognition is controversial, the preponderance of evidence suggests that context only has its effects after lexical access.
• Early models of speech recognition included template matching and analysis-by-synthesis.
• According to the cohort model of word recognition, when we hear a word a group of candidates—the cohort—is set up; as further evidence arrives, the cohort is reduced until only one word remains.
• Later revisions of the cohort model introduced the idea of graded activation rather than all-or-none membership of the cohort, and reduced the role of contextual effects.
• Evidence for the cohort model comes from studies of fluent restorations in speech, listening for mispronunciations, and studies using the gating and cross-modal priming techniques.
• The lexical neighborhood comprises all words that sound like a particular word, and can have effects on its recognition.
• TRACE is a highly interactive connectionist model of spoken word recognition.
• The main difficulty with TRACE is that it assumes more interaction than there is evidence for.
• Models such as SHORTLIST show how bottom-up, data-driven connectionist models can account for most of the major findings of speech processing research.
• Vowels and consonants are processed by different systems.
• People with pure word deafness cannot understand speech even though their hearing is otherwise unimpaired and they can read and write quite well.
• People with the rare disorder known as word meaning deafness cannot understand speech even though they can repeat it back.
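The idea of a cohort that shrinks as more of the word arrives, and of the uniqueness point at which only one candidate remains, can be sketched in a few lines of code. This is a minimal illustration of the original all-or-none version of the cohort model, not of the later graded-activation revisions. It makes two simplifying assumptions: letters stand in for speech sounds, and the five-word lexicon (TOY_LEXICON) is a hypothetical toy vocabulary.

```python
# Hypothetical toy vocabulary; letters are used as a stand-in for phonemes.
TOY_LEXICON = ["captain", "captive", "capital", "cattle", "catalog"]


def uniqueness_point(word, lexicon):
    """Return the 1-based position at which `word` is the only remaining
    candidate, or None if it never becomes unique within its own length."""
    cohort = set(lexicon)
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        # All-or-none narrowing: drop every candidate that mismatches the input.
        cohort = {w for w in cohort if w.startswith(prefix)}
        if cohort == {word}:
            return i
    return None
```

In this toy lexicon, "captain" and "captive" are still both in the cohort after "capt" and only separate at the fifth segment, whereas "cattle" separates from "catalog" at the fourth. A word that is also the beginning of a longer word never becomes unique on this definition, so the function returns None for it.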
1. What particular processing problems might people with a different dialect cause a listener?
2. Why might mishearings occur?
3. What sort of special problems might code switching by bilinguals create for speech recognition
by their listeners?
4. What are the main differences between the cohort and SHORTLIST models of spoken word
recognition?
FURTHER READING
Luce (1993) is an introduction to acoustics, the low-level processes of hearing, and how the ear
works. See MacMillan and Creelman (1991) for an introduction to signal detection theory. See Ward
(2010, Chapter 10) for a description of the neuroscience of auditory processing. Remez and Pisoni
(2005) is an edited collection that covers the whole field of speech perception and spoken word
recognition.
The classic textbook by Clark and Clark (1977) has a good description of the earlier models of
speech perception, particularly analysis-by-synthesis. The paper by Frauenfelder and Tyler (1987)
in a special issue on spoken word recognition in the journal Cognition is an introduction to the
issues involved in spoken word recognition. Two collections of papers on speech processing are to
be found in Altmann (1990) and Altmann and Shillcock (1993). Altmann (1997) provides excellent
coverage of speech perception, particularly on the importance of sound perception by infants and
other species.
Ellis and Humphreys (1999) review connectionist models of speech processing. Massaro (1989)
provides a critique of connectionist models in general and TRACE in particular. Norris (1994b) is a
good summary of the problems with TRACE, and see Protopapas (1999) for a review of connectionist
models of speech perception. Grosjean and Frauenfelder (1996) review the methods commonly
used to study spoken word recognition. For a review of the literature on speech recognition, with the
conclusion that speech perception is bottom-up and data-driven, see Norris, McQueen, and Cutler
(2000), with commentaries.
SECTION D
MEANING AND USING LANGUAGE
This section examines the processes of comprehension. How do we extract meaning from what we read
or hear and make use of word order information? How do we represent and make use of the meaning of
words and sentences?
Chapter 10, Understanding the structure of sentences, tackles the complexities of sentence
interpretation and parsing. Once we have recognized words, how do we decide between all the
different roles the words can take: who is doing what to whom? (You may find it useful to read
Chapter 2 again before starting Chapter 10.)
Chapter 11, Word meaning, examines issues involved in the study of semantics, in particular how we
represent the meanings of individual words. Categorization, associations between words, use of
metaphor and idiom, and connectionist modeling of semantics are among the topics addressed.
Chapter 12, Comprehension, looks at what follows after we have identified words and built the
syntactic structure of a sentence. What do we remember of text that we read or hear? How do we know
when to draw inferences or move beyond the literal meaning of the text? This chapter also addresses
the specific problems inherent in understanding spoken conversation.
CHAPTER 10
UNDERSTANDING THE STRUCTURE
OF SENTENCES
When we hear and understand a sentence, information about the word order is often crucial (at least
in languages such as English). This is information about the syntax of the sentence. Sentences (1)
and (2) have the same word order structure but different meanings; (1) and (3) have different word
order structures but the same meaning:
(1) The ghost chased the vampire.
(2) The vampire chased the ghost.
(3) The vampire was chased by the ghost.
A number of important questions arise about parsing and the human sentence parsing mechanism. How
does parsing operate? Why are some sentences more difficult to parse than others? What happens to
the syntactic representation after parsing? Why are sentences assigned the structures that they
are? How many stages of parsing are there? What principles guide the operation of these stages?
What happens if there is a choice of possible structures at any point? At what stage is
non-structural (semantic, discourse, and frequency-based) information used? This last question is
another manifestation of the issue of whether language processes are modular or not. Is there an
enclosed syntactic module that uses only syntactic information to parse a sentence, or can other
types of information guide the parsing process? Any account of parsing must be able to specify why
sentences are assigned the structure that they are, why we are biased to parse structurally
ambiguous sentences in a certain way, and why some sentences are harder to parse than others.
We should distinguish between autonomous and interactive models of parsing, and one-stage and
two-stage models. In autonomous models, the initial stages of parsing at least can only use
syntactic information to construct a syntactic representation. According to interactive models,
other sources of information (e.g., semantic information) can influence the syntactic processor at
an early stage.
In one-stage models, syntactic and semantic information are both used to construct the syntactic
representation in one go. In two-stage models, the first stage is invariably seen as an autonomous
stage of syntactic processing. Semantic information is used only in the second stage. Hence the
question about the number of stages is really the same question as whether parsing is modular or
interactive.
The goal of understanding is to extract the meaning from what we hear or read. Syntactic processing
is only one stage in doing this, but it is nevertheless an important one. Whether it is always an
essential one is an important issue. There is, however, another reason why we should study syntax.
Fodor (1975) argued that there is a “language of thought” that bears a close resemblance to our
surface language. In particular, the syntax that governs the language of thought may be very
similar or identical to that of external language. Studying syntax may therefore provide a window
onto fundamental cognitive processes.
Different languages use different syntactic rules. English in particular is a strongly
configurational language whose interpretation depends heavily on word order. In inflectional
languages such as German, word order is less important. It is therefore possible that the
predominance of studies that have examined parsing in English may have given a misleading view of
how human parsing operates. For this reason, an important recent development has been the study of
parsing in languages other than English. Most psycholinguists hope and expect that the important
parsing mechanisms will be common to speakers of all languages. By the end of this chapter you
should:
• Know that parsing is incremental.
• Understand how we assign syntactic structures to ambiguous sentences.
• Be able to evaluate the extent to which parsing is autonomous or interactive.
• Understand the importance of verbs in parsing.
• Understand how brain damage can disrupt parsing.
DEALING WITH STRUCTURAL AMBIGUITY
My local newspaper, The Dundee Courier, recently had a headline that read “Police seek
orange attackers.” Do you think that the headline meant “Police seek attackers who are orange,”
“Police seek attackers of an orange,” or “Police seek attackers who attacked with an orange”? (It
was meant to be the last of these.) Here is another example: “Enraged cow injures farmer with axe.”
In this example the ambiguity arises because the prepositional phrase “with axe” could be attached
to either “farmer” or “injures”; that is, there are two possible structures for this sentence. So,
as well as being poorly written, these sentences are ambiguous.
It is difficult to discern the operations of the processor when all is working well. For this
reason, most research on parsing has involved syntactic ambiguity because ambiguity causes
processing difficulty. Studying syntactic ambiguity is an excellent way of discovering how sentence
processing works.
There are different types of ambiguity involving more than one word. We have the bracketing
ambiguity of example (4), which could be interpreted either in the sense of (5) or in the sense of
(6):
(4) old men and women leave first
(5) ([old men] and women)
(6) (old [men and women])
More complex are structural ambiguities associated with parsing, such as in sentence (7). What was
done yesterday: Boris saying or Vlad finishing? Although both structures are equally plausible in
(7), this is not the case in (8):
(7) Boris said that Vlad finished it yesterday.
(8) I saw the Alps flying to Romania.
Many of us would not initially recognize a sentence such as (8) as ambiguous. On consideration,
this might be because one of its two meanings is so semantically anomalous (the interpretation that
I looked up and saw a mountain range in the sky flying to a country) that it does not appear even
to be considered. But psychology has shown us many times that we cannot rely on our intuitions.
Recording eye movements has been particularly important in studying parsing. The bulk of evidence
shows that we spend no longer reading the ambiguous regions of sentences than the unambiguous
regions of control sentences, but we often spend longer in reading the disambiguation region.
The central issue in parsing is when different types of information are used. In principle there
are two alternative parse trees that could be constructed for (8). We could construct one of them
on purely syntactic grounds, and then decide using semantic information whether it makes sense or
not. If it does, we accept that representation; if it does not, we go back and try again. This is
a serial autonomous model. Alternatively, we could construct all possible syntactic representations
in parallel, again using solely syntactic information, and then use semantic or other information
to choose the most appropriate one (Mitchell, 1994). This would be a parallel autonomous model. Or
we could use semantic information from the earliest stages to guide parsing so that we only
construct semantically plausible syntactic representations. Or we could activate representations of
all possible analyses, with the level of activation affected by the plausibility of each. The final
two are versions of an interactive model.
So far we have just looked at examples of permanent (also called global) ambiguity. In these cases,
when you get to the end of the sentence it is still syntactically ambiguous. Many sentences are
locally (or temporarily, or transiently) ambiguous, but the ambiguity is disambiguated (or
resolved) by subsequent material (the disambiguation region). We are sometimes made forcefully
aware of temporary ambiguity when we appear to have chosen an incorrect syntactic representation.
Consider (9) from Bever (1970). The verb “raced” is ambiguous in that it could be a main verb (the
most frequent sense) or a past participle (a word derived from a verb acting as an adjective):
(9) The horse raced past the barn fell.
(10) The log floated past the bridge sank.
(11) The ship sailed round the Cape sank.
(12) The old man the boats.
When you hear or read a sentence like (9), it can be interpreted in a straightforward way until
the final unexpected word “fell.” When we come
across the last word we realize that we have been led up the garden path. We realize that our
original analysis was wrong and we have to go back and reanalyze. We have the experience of having
to backtrack. We then arrive at the interpretation of “The horse that was raced past the barn was
the one that fell.” (Some people take some time to work out what the correct interpretation is.)
That is, we initially try to parse it as a simple noun phrase followed by a verb phrase. In fact,
it contains a reduced relative clause. (A relative clause is one that modifies the main noun, and
it is “reduced” because it lacks the relative pronoun “which” or “that.”) Examples (10), (11), and
(12) should also lead you up the garden path. Garden path sentences are favorite tools of
researchers interested in parsing.
[Figure: Garden path sentences, such as “The horse raced past the barn fell,” are favorite tools of
researchers interested in parsing.]
Many people might think that garden path sentences are rather odd: Often there would be pauses in
normal speech and commas in written language, which, although strictly optional, are usually there
to prevent the ambiguity in the first place. For example, Rayner and Frazier (1987) intentionally
omitted punctuation in order to mislead the participants’ processors. Deletion of the
complementizer “that” can also produce misleading results (Trueswell, Tanenhaus, & Kello, 1993). In
such cases it might be possible that these sentences are not telling us as much about normal
parsing as we think. In fact, reduced relatives are surprisingly common; “that” was omitted in 33%
of sentences containing relative clauses in a sample from the Wall Street Journal (Elsness, 1984;
Garnsey, Pearlmutter, Myers, & Lotocky, 1997; McDavid, 1964; Thompson & Mulac, 1991). There is
evidence that appropriate punctuation such as commas can reduce (but not obliterate) the magnitude
of the garden path effect by enhancing the reader’s awareness of the phrasal structure (Hill &
Murray, 2000; Mitchell & Holmes, 1985). In real life, speakers give prosodic cues to provide
disambiguating information, and listeners are sensitive to this type of information; for example,
speakers tend to emphasize the direct-object nouns, and insert pauses akin to punctuation (Snedeker
& Trueswell, 2003). Similarly, disfluencies influence the way in which people interpret garden path
sentences. When an interruption (saying “uh”) comes before an unambiguous noun phrase, listeners are
more likely to think that the noun phrase is the subject of a new clause rather than the object of
an old one (Bailey & Ferreira, 2003). Disfluencies can help, but only as long as they are in the
right place. They are helpful in (13) where they correctly flag a new subject, but not in (14),
where they do not.
(13) Vlad bumped into the ghost and the (um) ghoul told him to be careful.
(14) Vlad bumped into the (um) ghost and the ghoul told him to be careful.
However, just because speakers give prosodic cues, and listeners make use of these cues, does not
mean that speakers always mean to give these cues for the express purpose of helping the listener
(what has been called the audience design hypothesis). Speakers are not always aware that what they
are saying is ambiguous, and they tend to produce the same cues even when there is no audience
(Kraljic & Brennan, 2005). Prosody and pauses probably reflect both the planning needs of the
speaker (see Chapter 13) as well as a deliberate source of information to aid the listener.
Perhaps even more tellingly, McKoon and Ratcliff (2003) showed that sentences with reduced
relatives with verbs like “race” (e.g., (9)) occur in natural language with near-zero probability.
So, although such sentences might technically be syntactically correct, most people find these
sorts of sentence unacceptable. Indeed, McKoon and Ratcliff go so far as to argue that sentences
with reduced relatives with verbs similar to “race” are ungrammatical. Hence considerable caution
is necessary when drawing conclusions about the syntactic processor from studies of garden path
sentences.
At first sight, our experience of garden path sentences is evidence for a serial autonomous
processor. But what has led us up the garden path? We could have been taken there by either
semantic or syntactic factors. There has been a great deal of research on trying to decide which.
According to the serial autonomy model, we experience the garden path effect because the single
syntactic representation we are constructing on syntactic grounds turns out to be incorrect.
According to the parallel autonomy model, one representation is much more active than the others
because of the strength of the syntactic cues, but this turns out to be wrong. According to the
interactive model, various sources of information support the analysis more than its alternative.
However, later information is inconsistent with these initial activation levels.
EARLY WORK ON PARSING
Early models of parsing were based on Chomsky’s theory of generative grammar. In particular,
psychologists tested the idea that understanding sentences involved retrieving their deep
structure. As it became apparent that this could not provide a complete account of parsing,
emphasis shifted to examining strategies based on the surface structure of sentences.
For early psycholinguists still influenced by ideas from transformational grammar such as the
autonomy of syntax, the process of language understanding was a simple story (e.g., Fodor, Bever, &
Garrett, 1974). First, we identify the words on the basis of perceptual data. Recognition and
lexical access give us access to the syntactic category of the words. We can use this information
to build a parse tree for each clause. It is only when each clause is completely analyzed that we
finally start to build a semantic representation of the sentence. It is often said that “syntax
proposes; semantics disposes.” The simplest approach treats syntax as an independent or autonomous
processing module: Only syntactic information is used to construct the parse tree. Is this true?
What size are the units of parsing?
What are the constituents used in parsing, and how big are they? Jarvella (1971) showed that
listeners only begin to purge memory of the details of syntactic constituents after a sentence
boundary has been passed (see Chapter 12 for more details). Once a sentence has been processed,
verbatim memory for it fades away very quickly. Hence, perhaps not surprisingly, the sentence is a
major processing unit. Beneath this, the clause also turns out to be an important unit. A clause is
a part of a sentence that has both a subject and predicate. Furthermore, people find material
easier to read a line at a time if each line corresponds to a major constituent (Anderson, 2010;
Graf & Torrey, 1966). There is a clause boundary effect in recalling words: it is easiest to recall
words from within the clause currently being processed, independent of the number of words in the
clause (Caplan, 1972). The processing load is highest at the end of the clause, and eye fixations
are longer on the final word of a clause (Just & Carpenter, 1980).
One of the first techniques used to explore the size of the syntactic unit in parsing was the click
displacement technique (Fodor & Bever, 1965; Garrett, Bever, & Fodor, 1966). The basic idea was
that major processing units resist interruption: We finish what we are doing, and then process
other material at the first suitable opportunity. Participants heard speech over headphones in one
ear, and at certain points in the sentence, extraneous clicks were presented in the other ear. Even
if the click falls in the middle of a real constituent, it should be perceived as falling at a
constituent boundary. That is, the clicks should appear to migrate according to listeners’ reports.
This is what was observed:
(15) That he was* happy was evident from the way he smiled.
For example, a click presented at * in (15) migrated to after the end of the word “happy.” This is
at the end of a major constituent, at the end of the clause. The original study claimed to show
that the clause is a major perceptual unit. The same results were found when all non-syntactic
perceptual cues, such as intonation and pauses, were removed. This suggests that the clause is a
major unit of perceptual and syntactic processing.
However, this interpretation is premature. The participants’ task is a complex one: They have to
perceive the sentence, parse it, understand it, remember it, and give their response. Click
migration could occur at any of these points, not just perception or parsing. Reber and Anderson
(1970) carried out a variant of the technique in which participants listened to sentences that
actually had no clicks at all. They were told that it was an experiment on subliminal perception,
and were asked to say where they thought the clicks occurred. Participants still placed the
non-existent clicks at constituent boundaries. This suggested that click migration occurs in the
response stage: Participants are intuitively aware of the existence of constituent boundaries and
have a response bias to put clicks there. Wingfield and Klein (1971) showed that the size of the
migration effect is greatly reduced if participants can point to places in the sentence on a visual
display at the same time as they hear them, rather than having to remember them. It was also
unclear whether intonation and pausing are as unimportant in determining structural boundaries as
was originally claimed.
Hence these early studies probably reflect the operations of memory rather than the operations of
syntactic processing. It is now agreed that parsing is largely an incremental process: we try to
build structures on a word-by-word basis. That is, we do not sit idly by while we wait for the
clause to finish. The experiments of Marslen-Wilson (1973, 1975) and Marslen-Wilson and Welsh
(1978; see Chapter 9 for details) demonstrate that we try to integrate each word into a semantic
representation as soon as possible. Many studies have shown that syntactic and semantic analysis is
incremental (Just & Carpenter, 1980; Tyler & Marslen-Wilson, 1977). For example, Traxler and
Pickering (1996) found that readers’ processing was disrupted immediately after they read the word
“shot” in (16). The immediate disruption means that they must have processed the sentence
syntactically and semantically up to that point. However, syntactic effects are often delayed so
that they occur a few words later.
(16) That is the very small pistol with which the heartless killer shot the hapless man yesterday
afternoon.
Not only do people construct the representation incrementally, they try to anticipate what is
coming next. In an experiment with Dutch speakers, van Berkum, Brown, Zwitserlood, Kooijman, and
Hagoort (2005) examined the ERPs of people listening to stories. The stories led people to expect
specific nouns. However, if participants then heard a gender-marked adjective immediately before
the expected noun, and the gender was not the right match for the expected noun, the inconsistent
adjectives elicited a marked ERP.
Indeed, people even anticipate properties of upcoming words in the sentence, so that, for example,
the argument structure of a verb can be used to anticipate the subsequent theme (Altmann & Kamide,
1999). For example, the verb “drink” requires that the direct object is something drinkable; this
information is used to predict what is coming next, and people only pay attention to drinkable
things thereafter (as measured by their eye movements while looking at a picture). That is, people
make anticipatory eye movements towards probable upcoming objects. In a related experiment, Kamide,
Altmann, and Haywood (2003) tracked the eye movements of people looking at a visual scene. They
found that people anticipated a great deal of information, even with more complex verb structures.
For example, given a picture containing a man and a slice of bread, on hearing “The woman will
spread the butter –” people make anticipatory eye movements to the bread when they hear butter, but
to the man when they hear “The woman will slide the butter –.” In general, language processing
interacts with the representation of a visual scene so linguistic information can determine where
we look next (Altmann & Kamide, 2009). The conclusion is
that the processor draws on different sources of information, some of them non-linguistic, at the
earliest opportunity, to construct as full an interpretation as possible.
We saw earlier that Chomsky’s description of language placed great emphasis on the hierarchical and
recursive nature of syntactic structure. There is, however, debate as to which hierarchical
structure is actually used in cognitive processing. In line with the incremental models, Frank and
Bod (2011) found that reading times are best predicted by purely sequential models; people do not
appear to use hierarchical structure information to predict what word is coming next.
In summary, the language processor operates incrementally: It rapidly constructs a syntactical
analysis for a sentence fragment, assigns it a semantic interpretation, and relates this
interpretation to world knowledge (Pickering, 1999). Any delay in this process is usually very
slight. Incremental analysis makes a lot of sense from a processing point of view: Imagine having
to wait until the sentence finishes or the other person stops speaking before you can begin
analyzing what you have seen or heard.
Parsing strategies based on surface-structure cues
The surface structure of the sentence often provides a number of obvious cues to the underlying
syntactic representation. One obvious approach is to use these cues and a number of simple
strategies that enable us to compute the syntactic structure. The earliest detailed expositions of
this idea were by Bever (1970) and Fodor and Garrett (1967). These researchers detailed a number of
parsing strategies that used only syntactic cues. Perhaps the simplest example is that when we see
or hear a determiner such as “the” or “a,” we know a noun phrase has just started. A second example
is based on the observation that although word order is variable in English, and transformations
such as passivization can change it, the common structure noun–verb–noun often maps on to what is
called the canonical sentence structure SVO (subject–verb–object). That is, in most sentences we
hear or read, the first noun is the subject, and the second one the object. In fact, if we made use
of this strategy we could get a long way in comprehension. This is called the canonical sentence
strategy. We try the simpler strategies first, and if these do not work, we try other ones. If the
battery of surface structure strategies become exhausted by a sentence, we must try something else.
Fodor, Bever, and Garrett (1974) developed this type of approach in one of the most influential
works in the history of psycholinguistics. They argued that the goal of parsing was to recover the
underlying, deep structure of a sentence. As it had been shown that this was not done by explicitly
undoing transformations, it must be done by perceptual heuristics; that is, using our surface
structure cues. However, there is little evidence that deep structure is represented mentally
independently of meaning (Johnson-Laird, 1983). Nevertheless, the general principle that when we
parse we use surface structure cues has remained influential, and has been increasingly formalized.
Two early accounts of parsing
Kimball (1973) also argued that surface structure provides cues that enable us to uncover the
underlying syntactic structure. He proposed seven principles of parsing to explain the behavior of
the human sentence parsing mechanism. He argued that we initially compute the surface structure of
a sentence guided by rules that are based on psychological constraints such as minimizing memory
load. He argued that these principles explained why sentences are assigned the structure that they
are, why some sentences are harder to parse than others, and why we are biased to parse many
structurally ambiguous sentences in a certain way.
The first principle is that parsing is top-down, except when a conjunction (such as “and”) is
encountered. It means that we start from the sentence node and predict constituents. To avoid an
excessive amount of backtracking, the processor employs limited lookahead of one or two words. For
example, if you see that the first word of the next constituent is “the,” then you know that you
are parsing a noun phrase.
The second principle is called right association, which is that new words are preferentially
attached to the lowest possible node in the structure constructed so far. This places less of a
load on memory. Consider (17):
(17) Vlad figured that Boris wanted to take the pet rat out.
Here we attach “out” to the right-most available constituent, “take” rather than “figured.” This
means that although this structure is potentially ambiguous, we prefer the interpretation “take
out” to “figured out” (see Figure 10.1). Right association gives English its typically
right-branching structure, and it also explains why structures that are not right-branching are
more difficult to understand (e.g., “the ghost who Vlad expected to leave’s ball”).
Kimball’s third principle was new nodes. Function words signal a new phrase. The fourth principle
is that the processor can only cope with nodes associated with two sentence nodes at any one time.
For example, center-embedding splits up noun phrases and verb phrases associated with the sentences
so that they have to be held in memory. When there are two embedded clauses, three sentence nodes
will have to be kept active at once. Hence sentences of this sort, such as (18), will be difficult,
but corresponding right-branching paraphrases such as (19) cause no difficulty, because the sentence
nodes do not need to be kept open in memory:
(18) The vampire the ghost the witch liked loved died.
(19) The witch liked the ghost that loved the vampire that died.
The fifth principle is that of closure, which says that the processor prefers to close a phrase as
soon as possible. The sixth principle is called fixed structure. Having closed a phrase, it is
computationally costly to reopen it and reorganize the previously closed constituents, and so this
is avoided if possible. This principle explains our difficulty with garden path sentences. The
final principle is the principle of processing. When a phrase is closed it exits from short-term
memory and is passed on to a second stage of deeper, semantic processing. Short-term memory has
limited capacity, and details of the syntactic structure of a sentence are very quickly forgotten.
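Kimball’s fourth principle can be made concrete with a small sketch. The Python below is
illustrative only and is not Kimball’s own formalization: it assumes hand-annotated role labels
(SUBJ, VERB, OBJ, all hypothetical names) for examples (18) and (19), and it treats a clause as open
from the moment its subject appears until its verb arrives. Under those assumptions it counts how
many sentence nodes must be held open at once.

```python
# Hand-annotated role sequences (an assumption of this sketch, not parser output).
# A SUBJ opens a clause that stays open until its VERB arrives; an OBJ opens nothing.
CENTER_EMBEDDED = [  # example (18)
    ("The vampire", "SUBJ"), ("the ghost", "SUBJ"), ("the witch", "SUBJ"),
    ("liked", "VERB"), ("loved", "VERB"), ("died", "VERB"),
]
RIGHT_BRANCHING = [  # example (19)
    ("The witch", "SUBJ"), ("liked", "VERB"), ("the ghost", "OBJ"),
    ("that", "SUBJ"), ("loved", "VERB"), ("the vampire", "OBJ"),
    ("that", "SUBJ"), ("died", "VERB"),
]

KIMBALL_LIMIT = 2  # two sentence nodes at a time, following the fourth principle


def max_open_clauses(tagged_phrases):
    """Return the largest number of clauses still awaiting a verb at any point."""
    open_clauses = 0
    peak = 0
    for _phrase, role in tagged_phrases:
        if role == "SUBJ":
            open_clauses += 1   # a new sentence node must be held in memory
        elif role == "VERB":
            open_clauses -= 1   # the most recent open clause can now be closed
        peak = max(peak, open_clauses)
    return peak


def exceeds_limit(tagged_phrases, limit=KIMBALL_LIMIT):
    """True if the sentence needs more open sentence nodes than the assumed limit."""
    return max_open_clauses(tagged_phrases) > limit


if __name__ == "__main__":
    print(max_open_clauses(CENTER_EMBEDDED), exceeds_limit(CENTER_EMBEDDED))
    print(max_open_clauses(RIGHT_BRANCHING), exceeds_limit(RIGHT_BRANCHING))
```

With these annotations the center-embedded (18) needs three open sentence nodes, one more than the
assumed limit, whereas the right-branching (19) never needs more than one.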
[Figure 10.1: parse tree with nodes NP, VP, V, S]
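Right association, as applied to (17), can also be sketched in code. This is a toy illustration
rather than a parser: the candidate attachment sites, their depths in the partial tree, and the
small lexicon of verbs that accept the particle “out” are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """A verb on the right edge of the partial parse tree (hypothetical structure)."""
    verb: str
    depth: int  # depth of the verb's node in the tree; the root clause is shallowest


# Toy lexicon (assumed): verbs that can combine with the particle "out".
TAKES_OUT = {"figured", "take"}


def right_edge_sites():
    """Attachment sites for (17): Vlad figured [that Boris wanted [to take the pet rat ...]]."""
    return [Site("figured", 1), Site("wanted", 2), Site("take", 3)]


def right_association(sites, licensed):
    """Pick the lowest (deepest) site whose verb licenses the incoming word."""
    candidates = [site for site in sites if site.verb in licensed]
    if not candidates:
        return None
    return max(candidates, key=lambda site: site.depth)


if __name__ == "__main__":
    chosen = right_association(right_edge_sites(), TAKES_OUT)
    print(chosen.verb)
```

Both “figured” and “take” license “out” in this toy lexicon, but the function selects “take”
because it is the lowest node, which corresponds to the preferred reading “take out.”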
Kimball’s principles do a good job of explaining a number of properties of the processor. However,
given that the principle of processing underlies so many of the others, perhaps the model can be
simplified to reflect this? In addition, there are some problems with particular strategies. For
example, the role of function words in parsing might not be as essential as Kimball thought. Eye
fixation research shows that we may not always gaze directly at some function words: Very short
words are frequently skipped (Rayner & McConkie, 1976; although we might be able to process them
parafoveally, that is, we could still extract information from them even though they are not
centrally located in our visual field; see Kennedy, 2000, and Rayner & Pollatsek, 1989).
Frazier and Fodor (1978) simplified Kimball’s account by proposing a model they called the “sausage
machine,” because it divides the language input into something that looks like a link of sausages.
The sausage machine is a two-stage model of parsing. The first stage is called the preliminary
phrase packager, or PPP. This is followed by the sentence structure supervisor, or SSS. The PPP has
a limited viewing window of about six words, and cannot attach words to structures that reflect
dependencies longer than this. The SSS assembles the packets produced by the PPP, but cannot undo
the work of the PPP. The idea of the limited length of the PPP, and a second stage of processing
that cannot undo the work of the first, operationalizes Kimball’s principle of processing. The PPP
can only make use of syntactic knowledge and uses syntactic heuristics, such as preferring simpler
syntactic structures if there is a choice of structures (known as minimal attachment).
Wanner (1980) pointed out a number of problems with the sausage machine model. For example, there
are some six-word sentences that are triply embedded, but because they are so short, should fit
easily into the PPP window, such as (20). Nevertheless, we still find them difficult to understand.
There are also some six-word sentences where right association operates when minimal attachment is
unable to choose between the alternatives, as they are both of equal complexity (21). Here we
prefer the interpretation “cried yesterday” to “said yesterday.” The sausage machine cannot account
for the preference for right association in some six-word sentences.
(20) Vampires werewolves rats kiss love sleep.
(21) Vlad said that Boris cried yesterday.
Fodor and Frazier (1980) conceded that right association does not arise directly from the sausage
machine’s architecture. They added a new principle that governs the performance of the sausage
machine, which says that right association operates when minimal attachment cannot determine where
a constituent should go. The sausage machine evolved into one of the most influential models of
parsing, the garden path model.
PROCESSING STRUCTURAL AMBIGUITY
One of the major foci of current work on parsing is on trying to understand how we process
syntactic ambiguity, because this gives us an important tool in evaluating alternative models of
how the syntactic processor operates.
Two models have dominated research on parsing. The garden path model is an autonomous two-stage
model, while the constraint-based model is an interactive one-stage model. Choosing between the two
depends on how early discourse context, frequency, and other semantic information can be shown to
influence parsing choices. Is initial attachment (the way in which syntactic constituents are
attached to the growing parse tree) made on the basis of syntactic knowledge alone, or is it
influenced by semantic factors?
The garden path model
According to the garden path model (e.g., Frazier, 1987a), parsing takes place in two stages. In
the first stage, the processor draws only on syntactic information. If the incoming material is
ambiguous, only one structure is created. Initial attachment is determined only by syntactic
preferences dictated by the two principles of minimal attachment and late closure. If the results
of the first pass turn
out to be incompatible with further syntactic, pragmatic, or semantic and thematic information generated by an independent thematic processor, then a second pass is necessary to revise the parse tree. In the garden path model, thematic information about semantic roles can only be used in the second stage of parsing (Rayner, Carlson, & Frazier, 1983).
Two fundamental principles of parsing determine initial attachment, called minimal attachment and late closure. According to minimal attachment, incoming material should be attached to the phrase marker being constructed using the fewest nodes possible. According to late closure, incoming material should be incorporated into the clause or phrase currently being processed. If there is a conflict between these two principles, then minimal attachment takes precedence.
Constraint-based models of parsing
A type of interactive model called the constraint-based approach has become very popular (e.g., Boland, Tanenhaus, & Garnsey, 1990; MacDonald, 1994; MacDonald, Pearlmutter, & Seidenberg, 1994a; Tanenhaus, Carlson, & Trueswell, 1989; Taraban & McClelland, 1988; Trueswell et al., 1993). On this account, the processor uses multiple sources of information, including syntactic, semantic, discourse, and frequency-based, called constraints. The construction that is most strongly supported by these multiple constraints is most activated, although less plausible alternatives might also remain active. Garden paths occur when the correct analysis of a local ambiguity receives little activation.
Evidence for autonomy in syntactic processing
The garden path model says that we resolve ambiguity using minimal attachment and late closure, without semantic assistance. As (22) is consistent with late closure, it does not cause the processor any problem; (23) is not ultimately consistent with late closure, however, and the processor tries in the first instance to attach the NP “a mile and a half” to the first verb. When we come to “seems” it is apparent that this structure is incorrect: we have been led up a garden path. In an eye-movement study, Frazier and Rayner (1982) found that the reading time was longer for (23) than (22), and in (23) the first fixation in the disambiguating region was longer.
(22) Since Jay always jogs a mile and a half this seems a short distance to him.
(23) Since Jay always jogs a mile and a half seems a very short distance to him.
Rayner and Frazier (1987) monitored participants’ eye movements while they read sentences such as (24) and (25).
(24) The criminal confessed his sins harmed many people.
(25) The criminal confessed that his sins harmed many people.
When we start to read (24), minimal attachment leads to the adoption of the structure that contains the fewest number of nodes. Hence when we get to “his sins” the simplest analysis is that “his sins” is the object of “confessed,” rather than the more complex analysis that it is the subject of the complement clause (as later turns out to be the case). Readers should therefore be led up the garden path in (24), and will then be forced to reanalyze when they come to “harmed.” However, (25) should not lead to a garden path, because “that” blocks the object analysis of the sentence. Rayner and Frazier found that participants did indeed experience difficulty when they reached “harmed” in (24) but not in (25).
Ferreira and Clifton (1986) described an experiment that suggests that semantic factors cannot prevent us from being garden-pathed. Garden path theory predicts that, because of minimal attachment, when we come across the word “examined” we should take it to be the main verb in (26) and (27) rather than the verb in a reduced relative clause:
(26) The defendant examined by the lawyer turned out to be unreliable.
(27) The evidence examined by the lawyer turned out to be unreliable.
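The first-pass choice that garden path theory is described as making can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two bracketings for “the defendant examined” are simplified trees invented for this sketch (they are not taken from the studies cited), the names `count_nodes` and `first_pass_choice` are hypothetical, and late closure is reduced to a single yes/no flag that only breaks ties, reflecting the statement that minimal attachment takes precedence.

```python
# Toy sketch of first-pass attachment in a garden-path-style parser.
# Assumption: a tree is (label, children); a word is a plain string.

def count_nodes(tree):
    """Count labelled phrase-structure nodes, ignoring the words themselves."""
    if isinstance(tree, str):
        return 0
    label, children = tree
    return 1 + sum(count_nodes(child) for child in children)

# Simplified, illustrative bracketings (assumed, not from the source).
MAIN_VERB = ("S", [("NP", ["the", "defendant"]),
                   ("VP", [("V", ["examined"])])])
REDUCED_RELATIVE = ("S", [("NP", [("NP", ["the", "defendant"]),
                                  ("RC", [("VP", [("V", ["examined"])])])])])

def first_pass_choice(candidates):
    """Pick one analysis using syntax only.

    candidates: list of (name, tree, continues_current_phrase).
    Minimal attachment (fewest nodes) decides first; late closure
    (prefer continuing the current phrase) only breaks ties.
    """
    best = min(candidates, key=lambda c: (count_nodes(c[1]), not c[2]))
    return best[0]

candidates = [
    ("reduced relative", REDUCED_RELATIVE, True),
    ("main verb", MAIN_VERB, True),
]
print(count_nodes(MAIN_VERB), count_nodes(REDUCED_RELATIVE))  # 4 6
print(first_pass_choice(candidates))  # main verb
```

Nothing in the selection function looks at animacy or plausibility, which is the point of the autonomous first stage: on this sketch “the evidence examined” would receive the same initial main-verb analysis as “the defendant examined.”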
Consider what sorts of structure we might have generated by the time we get to the word “examined” in (26) and (27). “Examined” requires an agent. In (26), “the defendant” is animate and can therefore fulfill the role of agent, as in “the defendant examined the evidence”; but of course, “the defendant” can also be what is examined, so the syntactic structure is ambiguous between a reduced relative clause and a main verb analysis. In (27) “the evidence” is inanimate and therefore cannot fulfill the role of the agent; it must be what is examined, and therefore this structure can only be a reduced relative. However, analysis of eye-movement evidence suggested that the semantic evidence available in sentences such as (27) did not prevent participants from getting garden-pathed. Instead, we still appear to construct the initial interpretation to be the syntactically most simple according to minimal attachment. Ferreira and Clifton argued that semantic information does not prevent or cause garden-pathing, but can hasten recovery from it. The difficulty caused by the ambiguity is very short in duration, and is resolved while reading the word following the verb, “by” (Clifton & Ferreira, 1989).
Mitchell (1987), on the basis of data from a self-paced reading task (where participants read a computer display and press a key every time they are ready for a new word or phrase), concluded that the initial stage only makes use of part-of-speech information, and that detailed information from the verb only affects the second, evaluative, stage of processing. Consider sentences (28) and (29). In (28), according to garden path theory, the processor prefers to assign the phrase “the doctor” as direct object of “visited” (to comply with late closure, keeping the first phrase open for as long as possible). As expected, participants were garden-pathed by (28). However, if semantic and thematic information about verbs is available from an early stage, then in (29) thematic information should tell the processor that “sneezed” cannot take a direct object (a process called lexical guidance). Nevertheless, participants are still led up the garden path with (29); hence the initial parse must be ignoring verb information.
(28) After the child had visited the doctor prescribed a course of injections.
(29) After the child had sneezed the doctor prescribed a course of injections.
Van Gompel and Pickering (2001) came to the same conclusion using an eye-movement methodology: readers experience difficulty after “sneezed.” These experiments suggest that the first stage of parsing is short-sighted and does not use semantic or thematic information. Similarly, Ferreira and Henderson (1990) examined data from eye movements and word-by-word self-paced reading of ambiguous sentences, concluding that verb information does not affect the initial parse, although it might guide the second stage of reanalysis.
We can manipulate the semantic relatedness of nouns and verbs in contexts where they are either syntactically appropriate or inappropriate. Their different effects can then be teased out in lexical decision and naming tasks (O’Seaghdha, 1997). The results suggest that syntactic analysis precedes semantic analysis and is independent of it. Consider (30) and (31):
(30) The message that was shut.
(31) The message of that shut.
In (30), the target word “shut” is syntactically appropriate but semantically anomalous. In (31), the target is both syntactically and semantically anomalous. In the lexical decision task, in (30) we observe meaning-based inhibition relative to a baseline. In (31), we do not observe any inhibition. In the naming task, there is no sensitivity to semantic anomaly, but there is sensitivity to the syntactic inappropriateness of the target in (31). O’Seaghdha suggested that the inhibition occurs in (30) in the lexical decision task because of a difficulty in integrating the target word into a high-level text representation. We do not get that far in (31) because the failure to construct a syntactic representation blocks any semantic integration. The results look as though they support interactivity because the lexical decision task is sensitive to post-access integration processes. The naming data are less contaminated by post-access processing and suggest
that syntactic analysis is prior to semantic integration and independent of it.
Evidence from neuroscience suggests that semantic and syntactic processing are independent. Breedin and Saffran (1999) described a patient, DM, who had a significant and pervasive loss of semantic knowledge as a result of dementia. For example, he found it very difficult to match a picture of an object to another appropriate picture (e.g., knowing that a pyramid is associated with a palm tree rather than a pine tree). Yet his semantic deficit had no apparent effect on his syntactic abilities. He performed extremely well at detecting grammatical violations (e.g., he knew that “what did the exhausted young woman sit?” was ungrammatical). He also had no difficulty in assigning semantic roles in a sentence. For example, he could correctly identify who was being carried in the sentence “The tiger is being carried by the lion,” even though he had difficulty in recognizing lions and tigers by name.
Brain-imaging studies are also useful here. A negative event-related potential (ERP) found 400 ms after an event (and hence called the N400) is thought to be particularly sensitive to semantic processing, and is particularly indicative of violations of semantic expectancy (Batterink, Karns, Yamada, & Neville, 2010; Kounios & Holcomb, 1992; Kutas & Hillyard, 1980; Nigram, Hoffman, & Simons, 1992). A sentence such as (32) generates a semantic anomaly:
(32) Boris noticed a puncture and got out to change the wheel on the castle.
The N400 occurs 400 ms after the anomalous word “castle.”
There is also a positive wave found 600 ms after a syntactic violation (Hagoort, Brown, & Groothusen, 1993; Osterhout & Holcomb, 1992; Osterhout, Holcomb, & Swinney, 1994). A P600 would be observed with (33):
(33) Boris persuaded to fly.
These anomalies can be used to map the time course of syntactic and semantic processing. These ERP data suggest that syntactic and semantic processing are distinct (Ainsworth-Darnell, Shulman, & Boland, 1998; Friederici, 2002; Neville, Nicol, Barss, Forster, & Garrett, 1991; Ni et al., 2000; Osterhout & Nicol, 1999). For example, Ainsworth-Darnell et al. examined ERPs when people heard sentences that contained a syntactic anomaly, a semantic anomaly, or both. The sentences that contained both types of anomaly still provoked both an N400 and a P600. Ainsworth-Darnell et al. concluded that different parts of the brain automatically become involved when syntactic and semantic anomalies are present, and therefore that these processes are represented separately. Osterhout and Nicol (1999) gave participants sentences with different types of anomaly to read (34)–(37):
(34) The cats won’t eat the food that Mary leaves them. (non-anomalous)
(35) The cats won’t bake the food that Mary leaves them. (semantic anomaly)
(36) The cats won’t eating the food that Mary leaves them. (syntactic anomaly)
(37) The cats won’t baking the food that Mary leaves them. (doubly anomalous)
As expected, semantically anomalous sentences, such as (35), elicited the N400, and syntactically anomalous sentences, such as (36), elicited the P600. Doubly anomalous sentences, such as (37), elicited both an N400 and a P600, with the magnitude of each effect being about the same as if each anomaly were present in isolation. The brain responds differently to syntactic and semantic anomalies, and the response to each type of anomaly is unaffected by the presence of the other type. Osterhout and Nicol concluded that syntactic and semantic processes are separable and independent.
There has been some debate as to the strength of this claim. It is useful to distinguish between representational modularity and processing modularity (Pickering, 1999; Trueswell, Tanenhaus, & Garnsey, 1994). Representational modularity says that semantic and syntactic knowledge are represented separately. That is, there are distinct types of linguistic representation, which might be stored or processed in different parts of the brain.
This is relatively uncontroversial. Most of the debate is about processing modularity: Is initial processing restricted to syntactic information, or can all sources of information influence the earliest stages of processing?
Evidence for interaction in syntactic processing
The experiments discussed so far suggest that the first stage of parsing only makes use of syntactic preferences based on minimal attachment and late closure, and does not use semantic or thematic information. On the interactive account, however, semantic factors influence whether or not we get garden-pathed. What is the evidence that semantic factors play an early role in parsing?
Perhaps the syntactic principles of minimal attachment and late closure can be better explained by semantic biases? Taraban and McClelland (1988) compared self-paced reading times for sentences such as (38) and (39) (see Figure 10.2):
(38) The thieves stole all the paintings in the museum while the guard slept.
(39) The thieves stole all the paintings in the night while the guard slept.
FIGURE 10.2 Noun phrase and verb phrase attachment structures in Taraban and McClelland (1988). Tree for (38), 7 nodes: [S [NP The thieves] [VP [V stole] [NP [NP all the paintings] [PP in the museum]]]]. Tree for (39), 6 nodes: [S [NP The thieves] [VP [V stole] [NP all the paintings] [PP in the night]]].
Sentence (39) is a minimal attachment structure but (38) is not. In (38) the phrase “in the museum” must be formed into a noun phrase with “paintings”; in (39) the phrase “in the night” must be formed into a verb phrase with “stole.” The noun phrase attachment in (38) produces a grammatically more complex structure than the verb phrase attachment in (39). Nevertheless, Taraban and McClelland found that (38) is read faster than (39). They argued that this is because all the words up to “museum” and “night” create a semantic bias for the non-minimal interpretation. They concluded that violations of the purely syntactic process of the attachment of words to the developing structural representation do not slow down reading, but violations of the semantic process of assigning words to thematic roles do. Taraban and McClelland also concluded that previous studies that had appeared to support minimal attachment had in fact confounded syntactic simplicity with semantic bias.
Why do we find garden-pathing on some occasions but not others? Milne (1982) was one of the first to argue that semantic factors rather than syntactic factors lead us up the garden path. Consider the three sentences (40)–(42). Only (40) causes difficulty, because it sets up semantic expectancies that are then violated:
(40) The granite rocks during the earthquake.
(41) The granite rocks were by the seashore.
(42) The table rocks during the earthquake.
How can semantic factors explain our difficulty with reduced relatives?
Crain and Steedman (1985) used a speeded grammaticality judgment task to show that an appropriate semantic context can eliminate syntactic garden paths. In this task, participants see a string of words and have to decide as quickly as possible whether the string is grammatical or not. Participants in this task on the whole are more likely to misidentify garden path sentences as
non-grammatical than non-garden path sentences. Sentence (43) was incorrectly judged ungrammatical far more often than the structurally identical but semantically more plausible sentence (44):
(43) The teachers taught by the Berlitz method passed the test.
(44) The children taught by the Berlitz method passed the test.
Crain and Steedman argued that there is no such thing as a truly neutral semantic context. Even when semantic context is apparently absent from the sentence, participants bring prior knowledge and expectations to the experiment. They argued that all syntactic parsing preferences can be explained semantically. All syntactic alternatives are considered in parallel, and semantic considerations then rapidly select among them. Semantic difficulty is based on the amount of information that has to be assumed: The more assumptions that have to be made, the harder the sentence is to process. Hence sentences such as (45) are difficult compared with (46), where the existence of only one horse is assumed. This assumption is incompatible with the semantic representation needed to understand (45), namely that there are a number of horses but it was the one that was raced past the barn that was the one that fell. That is, if the processor encounters a definite noun phrase in the absence of any context, only one entity (e.g., one horse) is postulated, and therefore no modifier is necessary. If one is present, processing difficulty ensues.
(45) The horse raced past the barn fell.
(46) The horse raced past the barn quickly.
Altmann and Steedman (1988) measured reading times on sentences such as (47) and (48):
(47) The burglar blew open the safe with the dynamite and made off with the loot.
(48) The burglar blew open the safe with the new lock and made off with the loot.
These sentences are ambiguous: the prepositional phrases “with the dynamite” and “with the new lock” can modify either the noun phrase “the safe” or the verb phrase “blew open the safe.” Altmann and Steedman presented the participants with prior discourse context that disambiguated the sentences. A prior context sentence referred to either one or two safes. (“Once inside he saw that there was a safe with a new lock and a strongbox with an old lock” versus “Once inside he saw that there was a safe with a new lock and a safe with an old lock.”) If the context sentence mentioned only one safe, then the complex noun phrase “the safe with the new lock” is redundant, and causes extra processing difficulty. Hence the prepositional phrase in (48) took relatively longer to read. If the context sentence mentioned two safes, then the simple noun phrase “the safe” in (47) fails to identify a particular safe, so the prepositional phrase “with the dynamite” in (47) took relatively longer to read.
Altmann and Steedman (1988) emphasized that the processor constructs a syntactic representation incrementally, on a word-by-word basis. At each word, alternative syntactic interpretations are generated in parallel, and then a decision is made using context. Altmann and Steedman called this “weak” interaction, as opposed to strong interaction, where context actually guides the parsing process so that only one alternative is generated. This approach is called the referential theory of parsing. The processor constructs analyses in parallel and uses discourse context to disambiguate them immediately. It is the immediate nature of this disambiguation that distinguishes the referential theory from garden path models. As many factors guide parsing, it must be semantic considerations that in this case must lead us up the garden path.
Is it possible to distinguish between the referential and the constraint-based theories? The theories are similar in that each denies that parsing is restricted to using syntactic information. In constraint-based theories, all sources of semantic information, including general world knowledge, are used to disambiguate, but in referential theory only referential complexity within the discourse model is important. Ni, Crain, and Shankweiler (1996) tried to separate the effects of these different types of knowledge by studying reading times and eye movements when reading ambiguous sentences. The results suggested
that semantic-referential information is used immediately, but more general world knowledge takes longer to become available. Furthermore, world knowledge was dependent on working memory capacity, whereas use of semantic-referential principles was not. (In general, people with larger working memory spans are better able to maintain multiple syntactic representations and therefore will be more effective at processing ambiguous sentences; see MacDonald, Just, & Carpenter, 1992; Pearlmutter & MacDonald, 1995.) Ni et al. argued that the focus operator “only” presupposes the existence of more than one vampire (in this example), and therefore a modifier is needed to select one of them. Consider (49) and (50):
(49) The vampires loaned money at low interest were told to record their expenses.
(50) Only vampires loaned money at low interest were told to record their expenses.
Sentence (49) provokes a garden path effect but (50) does not. Analysis suggested that these referential principles were used immediately to resolve ambiguity. Information about semantic plausibility of interpretations was used later. However, as Pickering (1999) noted, referential theory cannot be a complete account of parsing, because it can only be applied to ambiguities involving simple and complex noun phrases. There is also more to context than discourse analysis. Referential theory was an early version of a constraint-based theory, applied to a limited type of syntactic structure. Nevertheless, the idea that discourse information can be used to influence parsing decisions is one essential component of constraint-based theories.
Altmann, Garnham, and Dennis (1992) used eye-movement measures to investigate how context affects garden-pathing. Consider sentence (51):
(51) The fireman told the man that he had risked his life for to install a smoke detector.
Garden path theory predicts that (51) should always lead to a garden path. We always start to parse “the man” as a simple noun phrase because this has a simpler structure than the alternative (which turns out to be the correct analysis), in which the noun is the head of a complex noun phrase. According to referential theory, the resolution of ambiguities in context depends on whether a unique referent can be found. The context can bias the processor towards or away from garden-pathing. The null context induces a garden path in (51). However, some contexts will bias the processor towards a relative clause interpretation and prevent garden-pathing. Such a biasing context can be obtained by preceding the ambiguous relative structure with a relative-supporting referential context. One way of doing this is to provide more than one possible referent for “the man.” (For example, “A fireman braved a dangerous fire in a hotel. He rescued one of the guests at great danger to himself. A crowd of men gathered around him.”) Eye-movement measurements verified this prediction. Measurements of difficulty associated with garden-pathing were reflected in longer average reading times per character in the ambiguity region, and an increased probability of regressive eye movements. When syntactic information leads to ambiguity and a garden path is possible, then the processor proceeds to construct a syntactic representation on the basis of the best semantic bet.
Further evidence for constraint-based models comes from the finding that thematic information can be used to eliminate the garden path effect in these reduced relative sentences (MacDonald et al., 1994a; Trueswell & Tanenhaus, 1994; Trueswell et al., 1994). For example, consider the ambiguous sentence fragments (52) and (53):
(52) The fossil examined –
(53) The archeologist examined –
The fragments are ambiguous because they are consistent with two sentence constructions: the most frequent order, the unreduced structure, where the first NP is the agent (e.g., “The archeologist examined the fossil”), and with a reduced relative clause (“The fossil examined by the archeologist was important”). However, consider the thematic roles associated with the verb “examine.” It has the roles of agent,
best fitted by an animate entity, and a theme, best fitted by an inanimate object (Trueswell & Tanenhaus, 1994). So semantic considerations associated with thematic roles suggest that (52) is likely to be a reduced relative structure, and (53) a simple sentence structure. Difficulty ensues if subsequent material conflicts with these interpretations, or if the context provided by the nouns is not sufficiently biasing. Trueswell et al. (1994) examined eye movements to investigate how people understood sentences such as (52) and (53). They found that if semantic constraints were sufficiently strong, reduced relative clauses were no more difficult than the unreduced constructions.
Remember that, in contrast, Ferreira and Clifton (1986) found evidence of increased difficulty with very similar materials, (26) and (27). Why is there a discrepancy? Trueswell et al. argued that the semantic bias in Ferreira and Clifton’s experiment was too weak. If the semantic constraint is not strong enough, we will be garden-pathed. McRae, Spivey-Knowlton, and Tanenhaus (1998) found that strong plausibility can also overcome garden-pathing. On the other side of the coin, people are reluctant to abandon plausible analyses in favor of implausible ones, even when the plausible analysis is turning out to be wrong (Pickering & Traxler, 1998).
An important idea in constraint-based models is that of verb bias (Garnsey et al., 1997; Trueswell et al., 1993). This is the idea that although some verbs can appear in a number of syntactic structures, some of their syntactic structures are more common than others. The relative frequencies of alternative interpretations of verbs predict whether or not people have difficulty in understanding reduced relatives (MacDonald, 1994; Trueswell, 1996). Hence, although the verb “read” can appear with sentence complements (“the ghost read the book had been burned”), it is most commonly followed by a direct object (as in simply, “the ghost read the book during the plane journey”). Direct-object verbs are those where the most frequent continuation is the direct object; sentence-complement verbs are those where the most frequent continuation is the sentence complement.
According to constraint-based models, verb-bias information becomes available immediately the verb is recognized. Trueswell et al. (1993) found evidence for the immediate availability of verb-bias information across a range of tasks (priming, self-paced reading, and eye movements). They found that verbs with a sentence-complement bias did not cause processing difficulty, whereas verbs with direct-object bias did. Furthermore, the more frequently a sentence complement verb appears in the language without a complementizer (“that”), the less likely it is to lead to processing difficulty in sentence-complement constructions. Using a carefully controlled set of materials combined with eye-movement and self-paced reading analyses, Garnsey et al. (1997) also found that people’s prior experience with particular verbs guides their interpretation of temporary ambiguity. Verb bias guides readers to a sentence-complement interpretation with sentence-complement verbs. This information is available very quickly (certainly by the word following the verb). Furthermore, verb-bias information interacts with how plausible the temporarily ambiguous noun is as a direct object. For example, “the decision” is more plausible as a direct object than “the reporter.” This result is best explained by constraint-based models, as according to the garden path model there should be no early effect of plausibility and verb bias.
Note though that there is controversy over whether verb-bias effects are real: Some studies have found no effect of verb-frequency information. For example, using an eye-tracking methodology, Pickering, Traxler, and Crocker (2000) found that readers experienced difficulty with temporarily ambiguous sentence-complement clauses even when the verbs were biased towards that analysis. Consider the sentence beginning (54).
(54) The young athlete realized her potential –
There are now two possible analyses: the object analysis (simply, “The young athlete realized her potential”), and the sentence-complement analysis (as in “The young athlete realized her potential might one day make her a world class athlete”). The sentence-complement analysis is the most common
for the verb “realized,” so readers should adopt that and not the object analysis. However, they do not. People preferred to attach noun phrases as arguments of verbs, regardless of whether or not this analysis was likely to be correct. Kennison (2001) similarly found that ambiguous structures caused difficulty regardless of the verb bias. Pickering and van Gompel (2006) concluded that verb-bias information has some influence on syntactic processing, but often not enough to prevent us having difficulty with temporarily ambiguous sentences.
In constraint-based models, syntactic ambiguity is eventually resolved by competition (MacDonald et al., 1994a, 1994b). The constraints activate different analyses to differing degrees; if two or more analyses are highly activated, competition is strong and there are severe processing difficulties. Tabor and Tanenhaus (1999; see also Tabor, Juliano, & Tanenhaus, 1997) proposed that the competition is resolved by settling into a basin of attraction in an attractor network similar to those postulated to account for word recognition (Hinton & Shallice, 1991; see Chapter 7). Along similar lines, McRae et al. (1998) proposed a connectionist-like model of ambiguity resolution called competition-integration. Competition between alternative structures plays a central role in a parsing process that essentially checks its preferred structure after each new word.
Evidence for parallel competition models comes from studies that show that the more committed people become to a parsing choice, the more difficult it is for them to recover, an effect called digging-in (Tabor & Hutchins, 2004). For example, increasing the gap between the ambiguity and the disambiguating information causes the comprehenders to “dig in” as they become more committed to the wrong analysis (e.g., (55) is easier than (56); materials from Ferreira & Henderson, 1991). Once they have dug in, alternative interpretations (including the correct one) become less activated.
(55) After the Martians invaded the town was evacuated.
(56) After the Martians invaded the town that the city bordered was evacuated.
resolved in similar ways because of the importance of lexical constraints in parsing (MacDonald et al., 1994a, 1994b). Syntactic ambiguities arise because of ambiguities at the lexical level. For example, “raced” is an ambiguous word, with one sense of a past tense, and another of a past participle. In (57), only the past tense sense is consistent with the preceding context. This information eventually constrains the processor to a particular syntactic interpretation. But in (58), both senses are consistent with the context. Although contextual constraints are rarely strong enough to restrict activation to the appropriate alternative, they provide useful information for distinguishing between alternative candidates. In this type of approach, a syntactic representation of a sentence is computed through links between items in a rich lexicon (MacDonald et al., 1994a).
(57) The horse who raced –
(58) The horse raced –
Part of the difficulty in distinguishing between the autonomous and interactive constraint-based theories is in obtaining evidence about what is happening in the earliest stages of comprehension. Tanenhaus et al. (1995) examined the eye movements of participants who were following instructions to manipulate real objects. Analysis of the eye movements suggested that people processed the instructions incrementally, making eye movements to objects immediately after the relevant instruction. People typically made an eye movement to the target object 250 ms after the end of the word that uniquely specified the object. With more complex instructions, participants’ eyes moved around the array looking for possible referents.
The best evidence for the independence of parsing comes from reading studies of sentences with brief syntactic ambiguities, where listeners have clear preferences for particular interpretations, even when the preceding linguistic context supports the alternative interpretation. Tanenhaus et al. pointed out that in this sort of experiment the context may not be immediately available
Another important aspect of constraint-based because it has to be retrieved from memory.
models is that syntactic and lexical ambiguity are They examined the interpretation of temporarily
ambiguous sentences in the context of a visual array so that information is immediately
available. They auditorily presented participants with the sentence (59) with one of two
visual contexts.

(59) Put the apple on the towel in the box.

In the one-referent condition there was just one apple on a towel and another towel
without an apple on it. In the two-referent condition there were two possible referents
for the apple, one on a towel and one on a napkin. According to modular theories, “on the
towel” should always be initially interpreted as the destination (where the apple should
be put, because this is structurally simplest). However, analysis of the eye movements
across the scene showed that “on the towel” was initially interpreted as the destination
only in the one-referent condition. In the two-referent condition, “on the towel” was
interpreted as the modifier of “apple.” In the one-referent condition, participants
looked at the incorrect destination (the irrelevant towel) 55% of the time; in the
two-referent condition, they rarely did so. This experiment is strong evidence that
people use contextual information immediately to establish reference and to process
temporarily ambiguous sentences.

A similar experiment by Sedivy, Tanenhaus, Chambers, and Carlson (1999) showed that
people very quickly take context into account when interpreting adjectives. On the basis
of these findings, Sedivy et al. argued that syntactic processing is incremental: that
is, a semantic representation is constructed with very little lag following the input.
People immediately try to integrate adjectives into a semantic model even when they do
not have a stable core meaning (e.g., tall is a scalar adjective: it is a relative term
and depends on the noun it is modifying; tall in “a tall glass” means something different
from in “a tall building”). They do this by establishing contrasts between possible
referents in the visual array (or memory).

Brain-imaging fMRI studies show that the brain processes ambiguous and unambiguous
sentences differently (Mason, Just, Keller, & Carpenter, 2003). Higher levels of brain
activation are shown for ambiguous sentences, but also during reading more complex
structures and unpreferred structures (those where the reduced relative reading is the
correct one). Furthermore, and contrary to the reading time results, higher activation
was shown while reading ambiguous sentences when the ambiguity was resolved in favor of
the preferred syntactic construction. The higher workload was spread among the superior
temporal gyrus (including Wernicke’s area) and the inferior frontal gyrus (including
Broca’s area), hinting that multiple processes are involved in ambiguity resolution (see
Figure 10.3). In particular, Broca’s area might be involved in generating abstract
syntactic frames, and Wernicke’s in interpreting and elaborating them with semantic
information. These findings are more consistent with parallel models where multiple
parses are kept open at the same time.

FIGURE 10.3 [Figure labels: inferior frontal gyrus, Broca’s area, superior temporal
gyrus, Wernicke’s area.]

There is also recent electrophysiological evidence that shows that people predict what
is coming next (DeLong, Urbach, & Kutas, 2005; see also Kutas, DeLong, & Smith, 2011).
DeLong et al. examined the phonological regularity in the English indefinite article
(“a” before a consonant, “an” before a vowel) using ERP, and concluded that people
pre-activate words in a graded fashion.

Cross-linguistic differences in attachment

A final point concerns the extent to which any parsing principles apply to languages
other than English. Cuetos and Mitchell (1988) examined the extent to which speakers of
English and Spanish used the late-closure strategy to interpret the same
sorts of sentences. They found that although the interpretations of the English speakers
could be accounted for by late closure, this was not true of the Spanish speakers. For
example, given (60), English speakers prefer to attach the relative clause (“who had the
accident”) to “the colonel,” because that is the phrase currently being processed. We can
find this out simply by asking readers “Who had the accident?”

(60) The journalist interviewed the daughter of the colonel who had the accident.

Spanish speakers, on the other hand, given the equivalent sentence (61), seem to follow
a strategy of early closure. That is, they attach the relative clause to the first noun
phrase.

(61) El periodista entrevistó a la hija del coronel que tuvo el accidente.

Other languages also show a preference for attaching the relative clause to the first
noun phrase, including French (Zagar, Pynte, & Rativeau, 1997) and Dutch (Brysbaert &
Mitchell, 1996). These results suggest that late closure may not be a general strategy
common to all languages. Instead, the parsing preferences may reflect the frequency of
different structures found within a language (Mitchell, Cuetos, Corley, & Brysbaert,
1995). These cross-linguistic differences question the idea that late closure is a
process-generated principle that confers advantages on the comprehender, such as
minimizing processing load. Frazier (1987b) proposed that late closure is advantageous
because if a constituent is kept open as long as possible, it avoids the processing cost
incurred by closing it, opening it, and closing it again.

The results of this study can be explained in one of three ways. First, late closure may
not originate because of processing advantages, and the choice of strategy (early versus
late closure) is essentially an arbitrary choice in different languages. Second, late
closure may have a processing advantage and may be the usual strategy, but in some
languages, in some circumstances, other strategies may dominate (Cuetos & Mitchell,
1988). Third, as constraint-based models advocate, parsing does not make use of
linguistic principles at all. The results of interpretation depend on the interaction of
many constraints that are relevant in sentence processing. Whatever the answer, it is
clear that if we limit our studies of parsing to English then we miss out on a great
deal of potentially important data.

Constraint-based models contain a probabilistic element in that the most strongly
activated analysis can vary depending on the circumstances. Another example of a
probabilistic model is the tuning hypothesis (Brysbaert & Mitchell, 1996; Mitchell,
1994; Mitchell et al., 1995). The tuning hypothesis emphasizes the role of exposure to
language. Parsing decisions are influenced by the frequency with which alternative
analyses are used. Put another way, people resolve ambiguities in a way that has been
successful in the past (Sturt, Costa, Lombardo, & Frasconi, 2003). Given the reasonable
assumption that people vary in their exposure to different analyses, their preferred
initial attachments will also vary. Attachment preferences may vary from language to
language, and from person to person, and indeed might even vary within a person across
time. Brysbaert and Mitchell (1996) used a questionnaire to examine attachment
preferences in Dutch speakers, and individual differences in these preferences.

Comparison of garden path and constraint-based theories

When do syntax and semantics interact in parsing? This has proved to be the central
question in parsing, as well as one of the most difficult to answer. In serial two-stage
models, such as the garden path model, the initial analysis is constrained by using only
syntactic information and preferences, and a second stage uses semantic information. In
parallel constraint-based models, multiple analyses are active from the beginning, and
both syntactic and non-syntactic information is used in combination to activate
alternative representations. Unfortunately, there is little consensus about which model
gives the better account. Different techniques seem to give different answers, and the
results are sensitive to the
materials used. Proponents of the garden path model argue that the effects that are
claimed to support constraint-based models arise because the second stage of parsing
begins very quickly, and that many experiments that are supposed to be looking at the
first stage are in fact looking at the second stage of parsing. Any interaction observed
is occurring at this second stage, which starts very early in processing. They argue that
experiments supporting constraint-based models are methodologically flawed, and that
constraint-based models fail to account for the full range of data (Frazier, 1995). On
the other hand, proponents of the constraint-based models argue that researchers favoring
the garden path model use techniques that are not sensitive enough to detect the
interactions involved, or that the non-syntactic constraints used are too weak.

Other models of parsing

Is there any way out of this dilemma? Alternative approaches to garden path and
constraint-based theories have recently come to the fore.

The first alternative may be called the unrestricted-race model. To understand the basis
of this model, we must consider exactly how syntactic ambiguity is resolved. We also need
to distinguish between models that always adopt the same analysis of a particular
ambiguity and those that do not (van Gompel, Pickering, & Traxler, 2000, 2001).

The garden path model can be described as a fixed-choice two-stage model. It is fixed
choice in that it has no probabilistic element in its decision making. Given a particular
structure, the same syntactic structure will always be generated on the basis of late
closure and minimal attachment. Either the correct analysis is chosen on syntactic
grounds from the beginning, or, if the initial syntactic analysis becomes implausible,
reanalysis is needed.

Constraint-based models are variable-choice one-stage models. In constraint-based models,
syntactic ambiguity is resolved by competition. When there are alternative analyses of
similar activation, competition is particularly intense, causing considerable processing
difficulty. Competition might continue for a long time. In the competition-integration
model (McRae et al., 1998; Spivey & Tanenhaus, 1998), competition is long-lasting but
decreases as the sentence unfolds.

So do we resolve ambiguity by reanalysis or competition? Van Gompel et al. (2001)
examined how we resolve ambiguity. They constructed sentences such as (62) to (64):

(62) The hunter killed only the poacher with the rifle not long after sunset.
(63) The hunter killed only the leopard with the rifle not long after sunset.
(64) The hunter killed only the leopard with the scars not long after sunset.

The prepositional phrase (“with the rifle/scars”) can be attached either to “killed” (a
VP attachment analysis: the hunter killed with the rifle/scars) or to “poacher/leopard”
(an NP attachment: the poacher/leopard had the rifle/scars). In (63), only the VP
attachment is plausible (that the hunter killed with the rifle, rather than that the
leopard had the rifle); this is the VP condition. In (64), only the NP attachment is
plausible (that the leopard had the scars, as you cannot kill with scars); this is the NP
condition. In (62), both the VP and NP attachments are plausible; this is called the
ambiguous condition.

What do the different theories predict? The garden path model (an example of a
fixed-choice two-stage model where ambiguity is resolved by reanalysis) predicts, on the
basis of minimal attachment, that the processor will always initially adopt the VP
analysis, because this generates the simpler structure. (It creates a structure with
fewer nodes than the NP analysis; see Chapter 2.) The processor only reanalyzes if the VP
attachment turns out subsequently to be implausible. Hence (62) should be as easy as
(63), but (64) should cause more difficulty. Constraint-based theories predict little
competition in (64), because plausibility supports only the NP interpretation. In (63)
there should be little competition, because the semantic plausibility information
supports only the VP analysis. Crucially, in this experiment there was no syntactic
preference for VP or NP attachment. The ambiguity was balanced (usually
VP/NP ambiguities are biased towards VP attachment). In (62), however, there should be
competition because both interpretations are plausible. In summary, garden path theory
predicts that (62) and (63) should be equally easy, but (64) should be difficult;
constraint-based theory predicts that (63) and (64) should be easy, but (62) should be
difficult.

Van Gompel et al. examined readers’ eye movements to discover when these sentences caused
difficulty. They found that an inspection of reading difficulty favored neither pattern
of results. Instead, they found that the ambiguous condition was easier to read than the
two disambiguated ones. That is, (62) was easy but (63) and (64) were difficult.

Neither garden path nor constraint-based theories seem able to explain this pattern of
results. Van Gompel et al. argue that only a variable-choice two-stage model can account
for this pattern of results. The unrestricted-race model is such a model (Traxler,
Pickering, & Clifton, 1998; van Gompel et al., 2000, 2001). As in constraint-based
models, all sources of information, both syntactic and semantic, are used to select among
alternative syntactic structures (hence it is unrestricted). The alternatives are
constructed in parallel and engaged in a race. The winner is the analysis that is
constructed fastest, and this is adopted as the syntactic interpretation of the fragment.
So in contrast to constraint-based theories, only one analysis is adopted at a time. If
this analysis is inconsistent with later information, the processor has to reanalyze, at
considerable cost; hence it is also a two-stage model. It is also a variable-choice
model, as the initial analysis is affected by the particular characteristics of the
sentence fragment (as well as by individual differences resulting from differences in
experience).

Let us consider how the unrestricted-race model accounts for these data. Because there is
no particular bias for NP or VP in (62)–(64), people will adopt one of these as their
initial preference on about half the trials. In (62), people will never have to
reanalyze, because either preference turns out to be plausible, but (63) and (64) will
both cause difficulty on those occasions when the initial preference turns out to be
wrong, and the processor will be forced to reanalyze. The critical and surprising finding
that only a variable-choice two-stage model such as the unrestricted-race model seems
able to explain is that sometimes ambiguous sentences cause less difficulty than
disambiguated sentences.

Need detailed syntactic processing necessarily precede semantic analysis? In a second
alternative approach Bever, Sanz, and Townsend (1998) suggest that semantics comes first.
In an extension of the idea that probabilistic, statistical considerations play an
important role in comprehension, Bever et al. argue that statistically based strategies
are used to propose an initial semantic representation. This then constrains the detailed
computation of the syntactic representation. They argued that the frequency with which
syntactic representations occur constrains the initial stage of syntactic processing. At
any one time, the processor assigns the statistically most likely interpretation to the
incoming material. Bever et al. argued that a principle such as minimal attachment cannot
explain why we find reduced relatives so very difficult, but the statistical rarity of
this sort of construction can (just because they are so rare). On this account, the role
of the processor is reduced to checking that everything is accounted for, and that the
initial semantic representation indeed corresponds with the detailed syntactic
representation.

Do we always construct a complete, idealized syntactic structure? Christianson,
Hollingworth, Halliwell, and Ferreira (2001) argue that we do not. They focus on what
people understand after they have read garden path sentences such as “While the man
hunted the deer ran into the woods.” This emphasis on comprehension (for example, asking
people what they thought were the subjects, objects, and actions of clauses, and how
confident they were about these judgments) is different from that of most of the other
studies we have looked at, which emphasize on-line measures of what is happening when we
process individual words while looking at garden path sentences. They found that people
do not always completely reanalyze sentences, and often retain a mistaken interpretation
derived from the initial misanalysis. They concluded that people do not
strive towards perfect analyses, but instead are happy with interpretations that seem to
work; they settle for “good enough.” In a return to the early idea of surface cues, some
researchers now think that people use simple heuristics when processing language, in
addition to detailed and complete syntactic processing (Ferreira, 2003). Comprehenders
start out with the assumption that a sentence is in canonical, NVN form, and sentences
that violate this heuristic (e.g., passives) are more difficult to understand.

A different approach is taken by McKoon and Ratcliff (2002, 2003). They argue that
syntactic constructions themselves carry meaning, beyond the meaning of their constituent
words. A passive sentence provides a different emphasis from its corresponding active,
and therefore has a different meaning. Sentences (65) and (66), although superficially
similar, convey different meanings.

(65) Boris loaded the truck with hay.
(66) Boris loaded hay onto the truck.

Here, sentence (65) conveys the notion that the truck is completely full of hay, but (66)
does not. A difference in syntax conveys a difference in meaning. Reduced relative
constructions convey a particular meaning. McKoon and Ratcliff argue that, because of
this meaning, the construction can only be combined with particular sorts of nouns and
verbs. The reduced relative can only be used to talk about particular sorts of things:
The main noun participates in an event caused by some force or other entity external to
itself. The main verb has to convey this sense of external participation. A sentence such
as (67) satisfies this constraint, but a sentence such as (68) does not.

(67) Cars and trucks abandoned in a terrifying scramble for safety.
(68) The horse raced past the barn fell.

“Abandoned” conveys this sense of external causation (“something caused cars and trucks
to be abandoned”), but “raced” does not (because it is the horse itself that is doing the
racing). McKoon and Ratcliff propose that reduced relatives with verbs denoting
internally caused events really are ungrammatical, which is why people have so much
difficulty with them. A study of a large corpus of natural speech confirms that people
only produce reduced relatives with these external-causation verbs. With verbs where the
control is internal, in real life speakers use non-reduced constructions (“the horse that
was raced past the barn fell”).

McKoon and Ratcliff call this approach, where syntactic constructions convey particular
meanings that restrict what sorts of nouns and verbs can be used with them, and
particularly what sort of verb-argument structures can be used, meaning through syntax
(MTS). They further argue that the MTS conflicts with constraint-based theories.
According to constraint-based theories, the language processor knows about statistics of
usage, not meanings and rules, whereas according to MTS, the language processor knows
about meanings and rules, but not statistics. McKoon and Ratcliff found that statistical
information about verbs derived from an actual corpus of speech does not predict reading
times of sentences containing those verbs.

The MTS approach is criticized by McRae, Hare, and Tanenhaus (2005), who argue that the
difficulty of reduced relatives is best accounted for not by the internal–external
distinction, but by temporary processing difficulty resulting from ambiguity.
Furthermore, the syntactic constructions can on occasion force, or coerce, a particular
interpretation regardless of the meaning of the verb: We can still understand a sentence
such as “Boris sneezed the tissue off the table” even though “sneezed” does not normally
imply causation. Sentence constructions do carry meaning independently of their
constituent verbs. In summary, it is difficult to see how the MTS approach can replace
alternative theories of parsing difficulty. Indeed, instead of replacing constraint-based
theories, the internal–external causation distinction may be just one more constraint.

Processing syntactic-category ambiguity

One type of lexical ambiguity that is of particular importance for processing syntax is
lexical-category
ambiguity, where a word can be from more than one syntactic category (e.g., a noun or a
verb, as in “trains” or “watches”). This type of ambiguity provides a useful test of the
idea that lexical and syntactic ambiguity are aspects of the same thing and are processed
in similar ways.

According to serial-stage models such as garden path theory, lexical and syntactic
ambiguity are quite distinct, because lexical representations are already computed but
syntactic representations must be computed (Frazier & Rayner, 1987). According to Frazier
(1989), distinct mechanisms are needed to resolve lexical-semantic, syntactic, and
lexical-category ambiguity. Lexical-semantic ambiguity is resolved in the manner
described in Chapter 6: The alternative semantic interpretations are generated in
parallel, and one meaning is rapidly chosen on the basis of context and meaning
frequency. Syntactic ambiguity is dealt with by the garden path model in that only one
analysis is constructed at any one time; if this turns out to be incorrect, then
reanalysis is necessary. Lexical-category ambiguity is dealt with by a delay mechanism.
When we encounter a syntactically ambiguous word, the alternative meanings are accessed
in parallel, but no alternative is chosen immediately. Instead, the processor delays
selection until definitive disambiguating information is encountered later in the
sentence. The advantage of the delay strategy is that it saves extensive computation
because usually the word following a lexical-category ambiguity provides sufficient
disambiguating information.

Frazier and Rayner (1987) provided some experimental support for the delay strategy. They
examined how we process two-word phrases containing lexical-category ambiguities, such as
“desert trains.” After the word “desert,” two interpretations are possible. The first
word can either be a noun to be followed by a verb (in which case “desert” will be the
subject of the verb “trains”; this is the NV interpretation), or it can be a modifier
noun that precedes a head noun (in which case “desert” will be the modifying noun and
“trains” the head noun; this is the NN interpretation). Frazier and Rayner examined eye
movements in ambiguous and unambiguous sentences. The ambiguous sentences started with
“the” (“the desert trains”), which permits both NV and NN interpretations, and the
unambiguous controls started with “this” (giving “this desert trains” for an unambiguous
NV interpretation) or with “these” (giving “these desert trains” for an unambiguous NN
interpretation). The rest of the sentence provided disambiguating information, as shown
in the full sentences (69) and (70):

(69) I know that the desert trains young people to be especially tough.
(70) I know that the desert trains are especially tough on young people.

Frazier and Rayner found that reading times in the critical, ambiguous region (“desert
trains”) were shorter in the ambiguous (“the”) condition than the unambiguous
(“this”/“these”) conditions. However, in the ambiguous condition, reading times were
longer in the disambiguating material later in the sentence. They proposed that when the
processor encounters the initial ambiguity, very little analysis takes place. Instead,
processing is delayed until subsequent disambiguating information is reached, when
additional work is necessary.

According to constraint-based theories, there is no real difference between
lexical-semantic ambiguity and lexical-category ambiguity. In each case, alternatives are
activated in parallel depending on the strength of support they receive from multiple
sources of information. Hence multiple factors, such as context and the syntactic bias of
the ambiguous word (that is, whether it is more frequently encountered as a noun or a
verb), immediately affect interpretation.

How can constraint-based theories account for Frazier and Rayner’s findings that we seem
to delay processing lexical-category ambiguities until the disambiguating region is
reached? MacDonald (1993) suggested that the control condition in their experiment
provided an unsuitable baseline, in that it introduced an additional factor. The
determiners “this” and “these” serve a deictic function, in that they point the
comprehender to a previously mentioned discourse entity. When there is no previous
entity, they sound quite odd. Hence Frazier and Rayner’s control sentences (71) and (72)
in isolation read awkwardly:
(71) I know that this desert trains young people to be especially tough.
(72) I know that these desert trains are especially tough on young people.

Therefore, MacDonald suggested, the relatively fast reading times in the ambiguous region
of the experimental condition arose because the comparable reading times in the control
condition were quite slow, as readers were taken aback by the infelicitous use of “this”
and “these.” MacDonald therefore used an additional type of control sentence. Rather than
using different determiners, she used the unambiguous phrases “deserted trains” and
“desert trained.” She found that “this” and “these” did indeed slow down processing, even
in the unambiguous version (“I know that these deserted trains could resupply the camp”
compared with “I know that the deserted trains could resupply the camp”).

MacDonald went on to test the effects of the semantic bias of the categorically ambiguous
word. The semantic bias is the interpretation that people give to the ambiguity in
isolation. It can turn out either to be correct if it is supported by the context, as in
(73), where “corporation fires” normally has a noun–verb interpretation, or to be
incorrect if it is not, as in (74), where “warehouse fires” normally has a noun–noun
interpretation:

(73) The union told reporters that the corporation fires many workers each spring without
giving them notice.
(74) The union told reporters that the warehouse fires many workers each spring without
giving them notice.

According to the delay model, even a strong semantic bias should not affect initial
resolution, because all decisions are delayed until the disambiguation region: Reading
times should be the same whether the bias is supported or not. According to the
constraint-based model, a strong semantic bias should have an immediate effect. If the
interpretation favored by the semantic bias turns out to be correct, ambiguous reading
times should not differ from the unambiguous control condition. It is only when the
interpretation favored by the semantic bias turns out to be incorrect that reading times
of the ambiguous sentence should increase. The pattern of results favored the
constraint-based model. Semantic bias has an immediate effect.

(75) She saw her duck –

What happens when we encounter an ambiguous fragment such as (75)? In this situation, the
continuation using “duck” in its sense as a verb (e.g., “She saw her duck and run”) is
statistically more likely than that as a noun (e.g., “She saw her duck and chickens”). It
is possible to bias the interpretation with a preceding context sentence (e.g., “As they
walked round, Agnes looked at all of Doris’s pets”). Boland (1997), using analysis of
reading times, showed that whereas probabilistic lexical information is used immediately
to influence the generation of syntactic structures, background information is used later
to guide the selection of the appropriate structure. These findings support the
constraint-based approach: When we identify a word, we do not just access its syntactic
category, we activate other knowledge that plays an immediate role in parsing, such as
the knowledge about the frequency of alternative syntactic structures. However, the
finding that context sometimes has a later effect requires modification of standard
constraint-based theories.

GAPS, TRACES, AND UNBOUNDED DEPENDENCIES

Syntactic analysis of sentences suggests that sometimes constituents have been deleted or
moved. Compare (76) and (77):

(76) Vlad was selling and Agnes was buying.
(77) Vlad was selling and Agnes _ buying.

Sentence (77) is perfectly grammatically well formed. The verb (“was”) has been deleted
to avoid repetition, but it is still there, implicitly. Its deletion has left a gap in
the location marked.
Parts of a sentence can be moved elsewhere in the sentence. When they are moved they leave a special type of gap called a trace. There is no trace in (78), but in (79) “sharpen” is a transitive verb demanding an object; the object “sword” has been moved, leaving a trace (indicated by t). This type of structure is called an unbounded dependency, because closely associated constituents are separated from each other (and can, in principle, be infinitely far apart).
(78) Which sword is sharpest?
(79) Which sword did Vlad sharpen [t] yesterday?
Gaps and traces may be important in the syntactic analysis of sentences, but is there any evidence that they affect parsing? If so, the gap has to be located and then filled with an appropriate filler (here “the sword”).
There is some evidence that we fill gaps when we encounter them. First, traces place a strain on memory: The dislocated constituent has to be held in memory until the trace is reached. Second, processing of the trace can be detected in measurements of the brain’s electrical activity (Garnsey, Tanenhaus, & Chapman, 1989; Kluender & Kutas, 1993), although it is difficult to disentangle the additional effects of plausibility and working memory load in these studies. Third, all languages seem to employ a recent filler strategy, whereby in cases of ambiguity a gap is filled with the most recent grammatically plausible filler. For example, Frazier, Clifton, and Randall (1983) noted that sentences of the form of (80) are understood 100 ms faster (as measured by reading times) than sentences such as (81):
(80) This is the girl the teacher wanted [t1] to talk to [t2].
(81) This is the girl the teacher wanted [t] to talk.
One possibility is that when the processor detects a gap it fills it with the most active item, and is prepared to reanalyze if necessary. This is the active-filler strategy (Frazier & Flores d’Arcais, 1989). Another possibility is that the processor detects a gap, and fills it with a filler, that is, the most recent potential dislocated constituent. This is the recent-filler strategy. This leads to the correct outcome in (80): Here the constituent “the teacher” goes into the gap t1, leaving “the girl” to go into t2. In (81), however, it is “the girl” that should go into the gap t, and not the most recent constituent (“the teacher”). This delays processing, leading to the slower reading times. These two strategies can be quite difficult to distinguish, but in each case trace-detection plays an important role in parsing.
Finally, at first sight some of the strongest evidence for the processing importance of traces is the finding that traces appear able to prime the recognition of the dislocated constituents or antecedents with which they are associated. That is, the filler of the gap becomes semantically reactivated at the point of the gap. There is significant priming of the NP filler at the gap (Nicol, 1993; Nicol & Swinney, 1989). In a sentence such as (82), the NP “astute lawyer” is the antecedent of the trace [t], as the “astute lawyer” is the underlying subject who is going to argue during the trial (Bever & McElree, 1988). In the superficially similar control sentence (83) no constituent has been moved, and therefore there is no trace.
(82) The astute lawyer, who faced the female judge, was certain [t] to argue during the trial.
(83) The astute lawyer, who faced the female judge, hated the long speeches during the trial.
We find that the gap in (82) does indeed facilitate the recognition of a probe word from the antecedent (e.g., “astute”). The control sentence (83) produces no such facilitation. Hence, when we find a trace, we appear to retrieve its associated antecedent, a process known as binding the dislocated constituent to the trace, thereby making it more accessible.
On the other hand, there is other research suggesting that traces are not important in on-line processing. McKoon, Ratcliff, and Ward (1994) failed to replicate the studies that show wh- traces (traces formed by a question formation) can prime their antecedents (e.g., Nicol & Swinney, 1989). Although unable to point to any conclusive theoretical reasons why it should be the case, they found that the choice of control
words in the lexical decision was very important; a choice of different words could obliterate the effect. They found no priming when the control words were chosen from the same set of words as the test words, yet priming was reinstated when the control words were from a different set of words than the test words. In addition, when they found priming, they found it for locations both after and before the verb. This should not be expected if the trace is reinstating the antecedent, as the trace is only activated by the verb. Clearly what is happening here is poorly understood.
An alternative view to the idea that we activate fillers when we come to a gap is that interpretation is driven by the verbs rather than the detection of the gaps, so that we postulate expected arguments to a verb as soon as we reach it (Boland, Tanenhaus, Carlson, & Garnsey, 1989). In the earlier sentences where there was evidence of semantic reactivation, the traces were adjacent to the verbs, so the two approaches make the same prediction. What happens if they are separated? Consider sentence (84):
(84) Which bachelor did Boris grant the maternity leave to [t]?
This sentence is semantically anomalous, but when does it become implausible? If the process of gap postulation and filling is driven by the syntactic process of trace analysis, it should only become implausible when people reach the trace at the end of the sentence. The role of “bachelor” can only be assigned after the preposition “to.” But if the process is verb-driven, the role of “bachelor” can be determined as soon as “maternity leave” is assigned to the role of the direct object of “grant”; hence “bachelor” is the recipient. So the anomaly will be apparent here. This is what Boland et al. found. Hence the postulation and filling of gaps are immediate and are driven by the verbs (for similar results see Altmann, 1999; Boland, Tanenhaus, Garnsey, & Carlson, 1995; Nicol, 1993; Pickering & Barry, 1991; Tanenhaus, Boland, Mauner, & Carlson, 1993). For example, consider (85) from Traxler and Pickering (1996):
(85) That is the very small pistol in which the heartless killer shot the hapless man [t] yesterday afternoon.
Clearly this sentence is implausible, but when do readers experience difficulty? Here the gap location is after “man” (because in the plausible version the word order should be the heartless killer shot the hapless man with the very small pistol yesterday afternoon), but the readers experience processing difficulty immediately on reading “shot.” The unbounded dependency has been formed before the gap location is reached. The parsing mechanism seems to be using all sources of information to construct analyses as soon as possible.
Similarly, Tanenhaus et al. (1989) presented participants with sentences such as (86) and (87):
(86) The businessman knew which customer the secretary called [t] at home.
(87) The businessman knew which article the secretary called [t] at home.
At what point do people detect the anomaly in (87)? Analysis of reading times showed that participants detect the anomaly before the gap, when they encounter the verb “called.” ERP studies confirm that the detection of the anomaly is associated with the verb (Garnsey et al., 1989).
In summary, the preponderance of evidence suggests that fillers are postulated by activating the argument structure of verbs.
THE NEUROSCIENCE OF PARSING
As we would expect of a complex process such as parsing, it can be disrupted as a consequence of brain damage. Deficits in parsing, however, might not always be apparent, because people can often rely on semantic cues to obtain meaning. The deficit becomes apparent when these cues are removed and the patient is forced to rely on syntactic processing.
There is some evidence that syntactic functions take place in specific, dedicated parts of the
brain. The evidence includes the differing effects of brain damage to regions of the brain such as Broca’s and Wernicke’s areas (see Chapters 3 and 13), and studies of brain imaging (e.g., Dogil, Haider, Schaner-Wolles, & Husman, 1995; Friederici, 2002; Neville et al., 1991).
The comprehension abilities of agrammatic aphasics
The disorder of syntactic processing that follows damage to Broca’s area is called agrammatism. The most obvious feature of agrammatism is impaired speech production (see Chapter 13), but many people with agrammatism also have difficulty in understanding syntactically complex sentences. The ability of people with agrammatism to match sentences to pictures when semantic cues are eliminated is impaired (Caramazza & Berndt, 1978; Caramazza & Zurif, 1976; Saffran, Schwartz, & Marin, 1980). These patients are particularly poor at understanding reversible passive constructions (e.g., “The dog was chased by the cat” compared with “The flowers were watered by the girl”) and object relative constructions (e.g., “The cat that the dog chased was black” compared with “The flowers that the girl watered were lovely”) in the absence of semantic cues.
One explanation for these people’s difficulty is that brain damage has disrupted their parsing ability. One suggestion is that these patients are unable to access grammatical elements correctly (Pulvermüller, 1995). Another idea is that this difficulty arises because syntactic traces are not processed properly, and the terminal nodes in the parse trees that correspond to function words are not properly formed (Grodzinsky, 1989, 1990; Zurif & Grodzinsky, 1983). Grodzinsky (2000) spelled out the trace-deletion hypothesis. This hypothesis states that people with an agrammatic comprehension deficit have difficulty in computing the relation between elements of a sentence that have been moved by a grammatical transformation and their origin (trace), as well as in constructing the higher parts of the parse tree. One problem with this view is that, as we have seen, the evidence for the existence of traces in parsing is questionable.
Some evidence against the idea that people with agrammatism have some impairment in parsing comes from the grammaticality judgment task. This task simply involves asking people whether a string of words forms a proper grammatical sentence or not. Linebarger, Schwartz, and Saffran (1983) showed that the patients are much more sensitive to grammatical violations than one might expect from their performance on sentence comprehension tasks. They performed poorly in a few conditions containing structures that involve making comparisons across positions in the sentence (such as being insensitive to violations like “*the man dressed herself” and “*the people will arrive at eight o’clock didn’t they?”). It appears, then, that these patients can compute the constituent structure of a sentence, but have difficulty using that information, both for the purposes of detecting certain kinds of violation as well as for thematic role assignment. Schwartz, Linebarger, Saffran, and Pate (1987) showed that agrammatic patients could isolate the arguments of the main verb in sentences that were padded with extraneous material, but had difficulty using the syntax for the purpose of thematic role assignment. These studies suggest that these patients have not necessarily lost syntactic knowledge, but are unable to use it properly. Instead, the mapping hypothesis is the idea that the comprehension impairment arises because although low-level parsing processes are intact, agrammatics are limited by what they can do with the results of these processes. In particular, they have difficulty with thematic role assignment (Linebarger, 1995; Linebarger et al., 1983). They compensate, at least in part, by making use of semantic constraints, although Saffran, Schwartz, and Linebarger (1998) have shown that reliance on these constraints may sometimes lead them astray. Thus these patients failed to detect anomalies such as “*The cheese ate the mouse” and “*The children were watched by the movie” approximately 50% of the time.
Some types of patient that we might expect to find have so far never been observed. In particular, no one has (yet) described a case of a person who knows the meaning of words but who is
unable to assign them to thematic roles (Caplan, 1992; although Schwartz, Saffran, & Marin, 1980b, describe a patient who comes close).
A completely different approach emerged that postulated that the syntactic comprehension deficit results from an impairment of general memory. According to this idea, the pattern of impairment observed depends on the degree of reduction of language capacity, and the structural complexity of the sentence being processed (Miyake, Carpenter, & Just, 1994). (Somewhat confusingly, although Miyake et al. talk of a reduction in working memory capacity, they mean a reduction in the capacity of a component of the central executive of Baddeley’s 1990 conception of working memory that serves language comprehension; see Just & Carpenter, 1992.) In particular, these limited computational resources mean that people with a syntactic comprehension deficit suffer from restricted availability of the materials. Miyake et al. simulated agrammatism in normal comprehenders with varying memory capacities by increasing computational demands using very rapid presentation of words (120 ms a word). Along similar lines, Blackwell and Bates (1995) created an agrammatic performance profile in normal participants who had to make grammaticality judgments about sentences while carrying a memory load. In other words, people with a syntactic comprehension deficit are just at one end of a continuum of central executive capacity compared with the normal population. Syntactic knowledge is still intact, but cannot be used properly because of this working memory impairment. Grammatical elements are not processed in dedicated parts of the brain, but are particularly vulnerable to a global reduction in computational resources. Further evidence for this idea comes from self-reports from aphasic patients suggesting that they have limited computational resources (“other people talk too fast”; Rolnick & Hoops, 1969) and conversely that slower speech facilitates syntactic comprehension in some aphasic patients (e.g., Blumstein, Katz, Goodglass, Shrier, & Dworetzky, 1985). Increased time provides more opportunity for using the limited resources of the central executive. Indeed, time shortage or the rapid decay of the results of syntactic processing might play a causal role in the syntactic comprehension deficit and in agrammatic production (Kolk, 1995).
This is an interesting idea that has provoked a good deal of debate. The extent to which the comprehension deficit is related to limited computational resources is debatable. For example, giving these patients unlimited time to process sentences does not lead to an improvement in processing (Martin, 1995; Martin & Feher, 1990). The degree to which Miyake et al. simulated aphasic performance has also been questioned (Caplan & Waters, 1995a). In particular, the performance of even their lowest-span participants was much better than that of the aphasic comprehenders. Caplan and Waters pointed out that rapid presentation might interfere with the perception of words rather than syntactic processing. Furthermore, patients with Alzheimer’s disease (AD) with restricted working memory capacity show little effect of syntactic complexity, but do show large effects of semantic complexity (Rochon, Waters, & Caplan, 1994). Addressing these concerns, Dick et al. (2001) compared the syntactic comprehension abilities of agrammatic patients with college students working under a variety of stressful conditions (e.g., with the speech masked by noise, or by compressing the speech). The two groups then performed similarly.
Finally, if there is a reduction in processing capacity involved in syntactic comprehension deficits, it might be a reduction specifically in syntactic processing ability, rather than a reduction in general verbal memory capacity (Caplan, Baker, & Dehaut, 1985; Caplan & Hildebrandt, 1988; Caplan & Waters, 1999). The extent to which this is the case, or whether general verbal working memory is used in syntactic processing (the capacity theory), is still a hotly debated topic with few signs of settling on any agreement (Caplan & Waters, 1996, 1999; Just & Carpenter, 1992; Just, Carpenter, & Keller, 1996; Waters & Caplan, 1996; see also Chapter 15). On balance it looks as though a general reduction in working memory capacity cannot cause the syntactic deficit in agrammatism.
Are content and function words processed differently?
Remember that content words do the semantic work of the language and include nouns, verbs, adjectives, and most adverbs, while function words, which are normally short, common words, do the grammatical work of the language. Are content and function words processed in different parts of the brain?
Content words are sensitive to frequency in a lexical decision task, but function words are not. For a while it was thought that this pattern is not observed in patients with agrammatism (Bradley, Garrett, & Zurif, 1980). Instead, agrammatic patients are sensitive to the frequency of function words, as well as to the frequency of content words. This is because the brain damage means that function words can no longer be accessed by the special set of processes and have to be accessed as other content words. Perhaps the comprehension difficulties of these patients arise from difficulty in activating function words? Unfortunately, the exact interpretation of these results has proved very controversial, and the original studies have not been replicated (see, for example, Gordon & Caramazza, 1982; Swinney, Zurif, & Cutler, 1980). Caplan (1992) concluded that there is no clear neuropsychological evidence that function words are treated specially in parsing.
Is automatic or attentional processing impaired in agrammatism?
Most of the tasks used in the studies described so far (e.g., sentence–picture matching tasks, anomaly detection, and grammaticality judgment) are off-line, in that they do not tap parsing processes as they actually happen. Therefore, the results obtained might reflect the involvement of some later variable (such as memory). So do these impairments reflect deficits of automatic parsing processes, or deficits of some subsequent attentional process?
Tyler (1985) provided an indication that at least some deficits in some patients arise from a deficit of attentional processing. She examined aphasic comprehension of syntactic and semantic anomalies, comparing performance on an on-line measure (monitoring for a particular word) with that on an off-line measure (detecting an anomaly at the end of the sentence). She found patients who performed normally on the on-line task but very poorly on the off-line task. This suggests that the automatic parsing processes were intact, but the attentional processes were impaired.
This is a complex issue that has spawned a great deal of research (e.g., Friederici & Kilborn, 1989; Haarmann & Kolk, 1991; Martin, Wetzel, Blossom-Stach, & Feher, 1989; Milberg, Blumstein, & Dworetzky, 1987; Tyler, Ostrin, Cooke, & Moss, 1995). Clearly at least some of the deficits we observe arise from attentional factors: the question remaining is, how many?
Evaluation of work on the neuroscience of parsing
Although there has been a considerable amount of work on the neuropsychology of parsing, it is much more difficult to relate to the psychological processes involved in parsing. Much of the work is technical in nature and relates to linguistic theories of syntactic representation. It is also unlikely that there is a single cause for the range of deficits observed (Tyler et al., 1995).
Friederici (2002) describes a model of sentence processing where the left temporal regions identify sounds and words; the left frontal cortex is involved in sequencing and the formation of structural and semantic relations; and the right hemisphere is involved in identifying prosody (see Figure 10.4). She argues that imaging and electrophysiological data suggest that sentence processing takes place in three phases. In Phase 1 (100–300 ms) the initial syntactic structure is formed on the basis of information about word category. In Phase 2 (300–500 ms) lexical-syntactic processes take place, resulting in thematic role assignment. In Phase 3 (500–1,000 ms) the different types of information are integrated. She argues that syntactic and semantic processes only interact in Phase 3.
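As a rough illustration of the timing in Friederici's (2002) account, the three phases can be written as a lookup from time after word onset to processing stage. This is only a sketch: the time windows are those given in the text, but the function names and the treatment of the boundaries (each window includes its lower bound and excludes its upper bound) are assumptions made for the example, not part of the model.

```python
# Sketch of the three phases in Friederici's (2002) model as summarized above.
# The windows (ms after word onset) come from the text; the half-open boundary
# handling and all names are assumptions for illustration only.
PHASES = [
    (100, 300, "Phase 1: initial syntactic structure built from word category"),
    (300, 500, "Phase 2: lexical-syntactic processes, thematic role assignment"),
    (500, 1000, "Phase 3: integration of the different types of information"),
]


def phase_at(ms):
    """Return the phase description for a latency in ms, or None if outside all windows."""
    for start, end, label in PHASES:
        if start <= ms < end:
            return label
    return None


def syntax_and_semantics_interact(ms):
    """On this account, syntactic and semantic processes interact only in Phase 3."""
    label = phase_at(ms)
    return label is not None and label.startswith("Phase 3")
```

On this sketch, a latency of 400 ms falls in Phase 2, so `syntax_and_semantics_interact(400)` is false, whereas a latency of 600 ms falls in Phase 3, where the account allows syntactic and semantic information to interact.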
SUMMARY
• The experimental evidence for and against the autonomous garden path and interactive constraint-based models is conflicting.
• In constraint-based models, lexical and syntactic ambiguity are considered to be fundamentally the same thing, and resolved by similar mechanisms.
• Statistical preferences may have some role in parsing.
• Some recent models have questioned whether syntax needs to precede semantic analysis.
• Gaps are filled by the semantic reactivation of their fillers.
• Gaps may be postulated as soon as we encounter particular verb forms.
• Verbs play a central role in parsing.
• ERP studies show that people try and predict what is coming next.
• Some aphasics show difficulties in parsing when they cannot rely on semantic information.
• There is no clear neuropsychological evidence that content and function words are processed differently in parsing.
• Some off-line techniques might be telling us more about memory limitations or semantic integration than about what is actually happening at the time of parsing.
• Electrophysiological and imaging data suggest that sentence comprehension takes place in three phases, and different components of processing are identifiable with distinct regions of the brain.
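The recent-filler strategy discussed in this chapter can be made concrete with a small sketch. It treats the fillers read so far as a stack and fills each gap with the most recent unused filler, then checks whether that first guess matches the correct analysis. The list encoding of the sentences and all function names are assumptions made for illustration; this is not a model taken from the literature.

```python
# Sketch of the recent-filler strategy: each gap is filled with the most
# recent filler that has not yet been used. The list encoding of sentences
# and all names here are assumptions for illustration only.

def recent_filler(fillers, n_gaps):
    """Assign fillers to gaps in order of encounter, most recent filler first."""
    available = list(fillers)              # fillers in the order they were read
    assignment = []
    for _ in range(n_gaps):
        assignment.append(available.pop())  # pop() takes the most recent one
    return assignment


def needs_reanalysis(fillers, correct):
    """True if the strategy's first guess differs from the correct fillers."""
    return recent_filler(fillers, len(correct)) != correct


fillers = ["the girl", "the teacher"]

# (80) This is the girl the teacher wanted [t1] to talk to [t2].
print(recent_filler(fillers, 2))   # ['the teacher', 'the girl']
print(needs_reanalysis(fillers, ["the teacher", "the girl"]))   # False

# (81) This is the girl the teacher wanted [t] to talk.
print(needs_reanalysis(fillers, ["the girl"]))   # True
```

The sketch gets (80) right on the first pass but has to revise its guess for (81), which is the pattern behind the slower reading times reported for sentences of that form.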
1. What does the evidence from the study of language development tell us about the relation
between syntax and other language processes? (You may need to look at Chapters 2 and 3 again
in order to be able to answer this question.)
2. What do studies of parsing tell us about some of the differences between good and poor readers?
3. Is the following statement true: “Syntax proposes, semantics disposes”?
4. How does the notion of “interaction” in parsing relate to the notion of “interaction” in word
recognition?
5. Which experimental techniques discussed in this chapter are likely to give the best insight into
what is happening at the time of parsing? How would you define “best”?
FURTHER READING
Fodor, Bever, and Garrett (1974) is the classic work on much of the early research on the possible application of Chomsky’s research to psycholinguistics, including deep structure and the derivational theory of complexity. Greene (1972) covers the early versions of Chomsky’s theory and gives detailed coverage of early psycholinguistic experiments relating to it. See Clark and Clark (1977) for a detailed description of surface structure parsing cues. Johnson-Laird (1983) discusses different types of parsing systems with special reference to garden path sentences.
For reviews on parsing work see Pickering and van Gompel (2006) and van Gompel and Pickering (2007). For a model based on a rational analysis of what parsing involves, see Hale (2010).
As Mitchell (1994) pointed out, most of the work in parsing has examined a single language. There are exceptions, including work on Dutch (Frazier, 1987b; Frazier, Flores d’Arcais, & Coolen, 1993; Mitchell, Brysbaert, Grondelaers, & Swanepoel, 2000), French (Holmes & O’Regan, 1981), East Asian languages (Special Issue of Language and Cognitive Processes, 1999, volume 14, parts 5 and 6), German (Bach, Brown, & Marslen-Wilson, 1986; Hemforth & Konieczny, 1999), Hungarian (MacWhinney & Pleh, 1988), Japanese (Mazuka, 1991), and Spanish (Cuetos & Mitchell, 1988), but the great preponderance of the work has been on English alone. It is possible that this is giving us at best a restricted view of parsing, and at worst a misleading view.
See Caplan (1992; the paperback edition is 1996) for a detailed review of work on the neuropsychology of parsing. See Haarmann, Just, and Carpenter (1997) for a computer simulation of the resource-deficit model of syntactic comprehension deficits.
CHAPTER 11
WORD MEANING
episodic memory is a useful one, the extent to which they involve different memory processes is less clear (McKoon, Ratcliff, & Dell, 1986).
The notion of meaning is closely bound to that of categorization. A concept determines how things are related or categorized. It is a mental representation of a category. It enables us to group things together, so that instances of a category all have something in common. Thus concepts somehow specify category membership. All words have an underlying concept, but not all concepts are labeled by a word. For example, we do not have a special word for brown dogs. In English we have a word “dog” that we can use about certain things in the world, but not about others. There are two fundamental questions here. The philosophical question is how does the concept of “dog” relate to the members of the category dog? The psychological question is how is the meaning of “dog” represented and how do we pick out instances of dogs in the environment?
In principle we could have a word, say “brog,” to refer to brown dogs. We do not have such a term, probably because it is not a particularly useful one. Rosch (1978) pointed out that the way in which we categorize the world is not arbitrary, but determined by two important features of our cognitive system. First, the categories we form are determined in part by the way in which we perceive the structure of the world. Perceptual features are tied together because they form objects and have a shared function. How the categories we form are determined by biological factors is an important topic, about which little is known, although we know how color names relate to perceptual constraints (see Chapter 3). Second, the structure of categories might be determined by cognitive economy. This means that semantic memory is organized so as to avoid excessive duplication. There is a trade-off between economy and informativeness: A memory system organized with just the categories “animal,” “plant,” and “everything else” would be economical but not very informative (Eysenck & Keane, 2010). We may also need to make distinctions between members of some categories more often than others. Another disadvantage of cognitive economy might be increased retrieval time, as we need to search our memories for where the appropriate facts are stored.
It should be obvious that the study of meaning therefore necessitates capturing the way in which words refer to things that are all members of the same category and have something in common, yet are different from non-members. (Of course something can belong to two categories at once: We can have a category labeled by the word “ghost,” and another by the word “invisible,” and indeed we can join the two to form the category of invisible ghosts labeled by the words “invisible ghosts.”) There are two issues here. What distinguishes items of one category from items of another? And how are hierarchical relations between categories to be captured? There are category relations between words. For example, the basic-level category “dog” has a large number of category superordinate levels above it (such as “mammal,” “animal,” “animate thing,” and “object”) and subordinates (such as “terrier,” “Rottweiler,” and “German shepherd”; these are said to be category coordinates of each other).
Hierarchical relations between categories are one clear way in which words can be related in meaning, but there are other ways that are equally important. Some words refer to associates of a thing (e.g., “dog” and “lead”). Some words (antonyms) are opposites in meaning (e.g., “hot” and “cold”). We can attempt to define many words: for example, we might offer the definition “unmarried man” for “bachelor.” A fundamental issue for semantics concerns how we should capture all these relations.
Semantics concerns more than associations (see Chapter 6). Words can be related in meaning without being associated (e.g., “yacht” and “ship”), so any theory of word meaning cannot rely simply on word association. Words with similar meanings tend to occur in similar contexts. Lund, Burgess, and Atchley (1995) showed that semantically similar words (e.g., “bed” and “table”) are interchangeable within a sentence; the resulting sentence, while maybe pragmatically implausible, nevertheless makes sense. Consider (1) and (2). If “table” is substituted for the semantically related word “bed” the sentence still makes sense. Word pairs that are only associated (e.g.,
“baby” and “cradle”) result in meaningless sentences. If we substitute “baby” for its associate “cradle” in (3), we end up with the anomalous sentence (4).
(1) The child slept on the bed.
(2) The child slept on the table.
(3) The child slept in the cradle.
(4) *The child slept in the baby.
Associations arise from words regularly occurring together, while semantic relations arise from shared contexts and higher level relations. One task of research in semantics is to capture how contexts can be shared and how these higher level relations should be specified.
Semantics is also the interface between language and the rest of perception and cognition. This relation is made explicit in the work of Jackendoff (1983), who proposed a theory of the connection between semantics and other cognitive, perceptual, and motor processes. He proposed two constraints on a general theory of semantics. The grammatical constraint says that we should prefer a semantic theory that explains otherwise arbitrary generalizations about syntax and the lexicon. Some aspects of syntax will be determined by semantics. Some AI theories and theories based on logic (in particular, a form of logic known as predicate calculus) fail this constraint. In order to work, they have to make up entities that do not correspond to anything involved in cognitive processing, and they break up the semantic representation of single words across several constituents. This constraint says that syntax and semantics should be related in a sensible way. The cognitive constraint says that there is a level of representation where semantics must interface with other psychological representations, such as those derived from perception. There is some level of representation where linguistic, motor, and sensory information are compatible. Connectionist models in particular show how this constraint can be satisfied.
This chapter focuses on a number of related topics. How do we represent the meaning of words? In particular, how does a model of meaning deal with the issues we have just raised? Is a word decomposed into more elemental units of meaning or not? How are words related to each other by their meanings? This deals with issues such as priming, and how word meanings are related. What does the neuropsychology of meaning tell us about its representation and its relation with the encyclopedia? In the next chapter, we will examine how word meanings are combined to form representations of the meaning of sentences and large units of language. By the end of this chapter you should:
• Understand the difference between sense and reference.
• Know how semantic networks might represent meaning.
• Know about the strengths and weaknesses of representing word meaning in terms of smaller units of meaning.
• Understand how we store information about categories.
• Appreciate how brain damage can affect how meaning is represented.
• Know whether we have one or more semantic memory systems.
• Understand the importance of the difference between perceptual and functional information.
• Know how semantic information breaks down in dementia.
• Be able to evaluate the importance of connectionist modeling of semantic memory.
CLASSIC APPROACHES TO SEMANTICS
It is useful to distinguish immediately between a word’s denotation and its connotation. The denotation of a word is its core, essential meaning. The connotations of a word are all of its secondary implications, or emotional or evaluative associations. For example, the denotation of the word “dog” is its core meaning: it is the relation between the word and the class of objects to which it can refer. The connotations of “dog” might be “nice,” “frightening,” or “smelly.” Put another way, people agree on the denotation, but the connotations differ from person to person. In this chapter I am
322 D. MEANING AND USING LANGUAGE
word: It obtains its meaning by its place in a network of associations. The meaning of “dog” might involve an association with “barks,” “four legs,” “furry,” and so on. It soon became apparent that association in itself was insufficiently powerful to be able to capture all aspects of meaning. There is no structure in an associative network, with no relation between words, no hierarchy of information, and no cognitive economy. In a semantic network, this additional power is obtained by making the connections between items do something: they are not merely associations representing frequent co-occurrence, but themselves have a semantic value. That is, in a semantic network the links between concepts themselves have meaning.

The Collins and Quillian semantic network model
Perhaps the best-known example of a semantic network is that of Collins and Quillian (1969). This work arose from an attempt to develop a “teachable language comprehender” to assist machine translation between languages.
A semantic network is particularly useful for representing information about natural kind terms. These are words that denote naturally occurring categories and their members, such as types of animal, or metal, or precious stone. The scheme attributes fundamental importance to their inherently hierarchical nature: For example, a bald eagle is a type of eagle, an eagle is a type of bird of prey, a bird of prey is a bird, and a bird is a type of animal. This is a very economical method of storing information. If you store the information that birds have wings at the level of bird, you do not need to repeat it at the level of particular instances (e.g., eagles, bald eagles, and robins). An example of a fragment of such a network is shown in Figure 11.1. In the network, nodes are connected by links that specify the relation between the linked nodes; the most common link is an ISA link, which means that the lower level node “is a” type of the higher level node. Attributes are stored at the highest possible node at which they are true of all lower nodes in the network. For example, not all animals have wings, but all birds do, so “has wings” is stored at the level of birds.
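[Figure 11.1: fragment of a hierarchical semantic network. ANIMAL (breathes, has lungs); BIRD ISA ANIMAL (has wings, lays eggs, flies); MAMMAL ISA ANIMAL (bears live young); further ISA links lead down to lower-level nodes.]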
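The storage scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not Collins and Quillian's program: the `Node` class, its method names, and the `robin` leaf node are assumptions introduced here, while the attributes come from the Figure 11.1 fragment. An attribute is stored once, at the highest node for which it holds, and is found for lower nodes by following ISA links upward.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One concept in a hierarchical semantic network (hypothetical class)."""
    name: str
    isa: Optional["Node"] = None                  # ISA link to the parent node
    attributes: set = field(default_factory=set)  # stored at this level only

    def levels_to(self, attribute: str) -> Optional[int]:
        """Count the ISA links followed before the attribute is found."""
        node, steps = self, 0
        while node is not None:
            if attribute in node.attributes:
                return steps
            node, steps = node.isa, steps + 1
        return None  # attribute is not true of this concept

    def has(self, attribute: str) -> bool:
        """True if the attribute is stored here or inherited via ISA links."""
        return self.levels_to(attribute) is not None


animal = Node("animal", attributes={"breathes", "has lungs"})
bird = Node("bird", isa=animal, attributes={"has wings", "lays eggs", "flies"})
mammal = Node("mammal", isa=animal, attributes={"bears live young"})
robin = Node("robin", isa=bird)  # assumed leaf node; it stores nothing itself

print(robin.has("has wings"), robin.levels_to("breathes"))
```

Nothing is stored on `robin` itself, yet it inherits "has wings" from one level up and "breathes" from two levels up, which is the economy the scheme is designed to provide.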
its decomposition into smaller units of meaning. These smaller units of meaning are called semantic features (or sometimes semantic attributes, or semantic markers). Theories that make use of semantic features are often called decompositional theories. The alternative is that each word is represented by its own concept that is not decomposed further.
Semantic features work very well in some simple domains where there is a clear relation between the terms. One such domain, much studied by anthropologists, is that of kinship terms. A simple example is shown in Table 11.1. Here the meanings of the four words, “mother,” “father,” “son,” and “daughter,” are captured by combinations of the three features “human,” “male” or “female,” and “older” or “younger.” We could provide a hierarchical arrangement of these features (e.g., human → young and old; young → male or female; and old → male or female), but it would either be totally unprincipled (there is no reason why adult/young should come before male/female, or vice versa), or would involve duplication (if we store both hierarchical forms). Instead, we can list the meaning in terms of a list of features, so that father is (+ human, − female, + older).

TABLE 11.1 Decomposition of kinship terms.
Feature   Father   Mother   Daughter   Son
Human     +        +        +          +
Older     +        +        −          −
Female    −        +        +          −

We can take the idea of semantic features further, and represent the meanings of all words in terms of combinations of as few semantic features as possible. When we use features in this way it is as though they become “atoms of meaning,” and are called semantic primitives. This approach has been particularly influential in AI. For example, Schank (1972, 1975) argued that the meaning of sentences could be represented by the conceptual dependencies between the semantic primitives underlying the words in the sentence. All common verbs can be analyzed in terms of 12 primitive actions that concern the movement of objects, ideas, and abstract relations. For example, there are five physical actions (called “expel,” “grasp,” “ingest,” “move,” and “propel”), and two abstract ones (“attend” and “speak”). Their names are fairly self-explanatory, and it is not necessary to go into detail of their meanings here. Wilks (1976) described a semantic system where the meaning of 600 words in the simulation can be reduced to combinations of only 80 primitives. In this system the action sense of “beat” is denoted by (“strike” [subject – human] [object – animate] [instrument – thing]). The semantic representation and syntactic roles in which the word can partake are intimately linked. In a similar vein, Wierzbicka (2004) argues that in spite of their apparent diversity, all natural languages share a common core of about 60 conceptual primitives present in all languages. Other word meanings can be built up by combining these primitives (e.g., a plant is a living thing that cannot feel or do). Of course, just because we can reduce the meaning of all words to a relatively small number of primitives does not mean that this is how we actually represent them.
One possibility is that all words are represented in terms of combinations of only semantic primitives. In addition to these AI models, the model of Katz and Fodor (1963), described later, is of this type. Another possibility is that words are represented as combinations of features not all of which need be primitives. These non-primitive features might eventually be represented elsewhere in semantic memory as combinations of primitives. For example, the meaning of “woman” might include “human” but not “object,” because the meaning of “human” might include “animal,” and eventually the meaning of “animal” includes “object” (McNamara & Miller, 1989). This idea is similar to the principle of economy incorporated into hierarchical semantic networks. Jackendoff (1983) and Johnson-Laird (1983) described models of this type.

Early decompositional theories
One of the earliest decompositional theories was that of Katz and Fodor (1963). This theory
11. WORD MEANING 327
showed how the meanings of sentences could be derived by combining the semantic features of each individual word in the sentence. It emphasized how we understand ambiguous words. Consider examples (18) and (19). A different sense of “ball” is used in each sentence. Then consider (20), which is semantically anomalous:

(18) The witches played around on the beach and kicked the ball.
(19) The witches put on their party frocks and went to the ball.
(20) ? The rock kicked the ball.

There are no syntactic cues to be made use of here, so how do the meanings of the words in the sentence combine to resolve the ambiguity in (18) and (19) and identify the anomaly in (20)? First, Katz and Fodor postulated a decompositional theory of meaning so that the meanings of individual words in the sentence are broken down into their component semantic features (called semantic markers by Katz and Fodor). Second, the combination of features across words is governed by particular constraints called selection restrictions. There is a selection restriction on the meaning of “kick” such that it must take an animate subject and an optional object, but if there is an object then it must be a physical object. An ambiguous word such as “ball” has two sets of semantic features, one of which will be specified as something like (sphere, small, used in games, physical object … ), the other as (dance, event … ). Only one of these contains the “physical object” feature, so “kick” picks out that sense. Similarly there is a selection restriction on the verb “went” such that it picks out locations and events, which contradicts the “physical object” sense of “ball.” Finally, the selection restriction on “kick” that specifies an animate subject is incompatible with the underlying semantic features of “rock.” As there are no other possible subjects in this sentence, we consider it anomalous. As we shall see, one of the problems with this type of approach is that for most words it is impossible to provide an exhaustive listing of all of its features.

Feature-list theories and sentence verification
We have seen that decompositional theories of meaning enable us to list the meanings of words as lists of semantic features. What account does such a model give of performance on the sentence verification task, and in particular what account does it give of the problems to which hierarchical network models fall prey? Rips et al. (1973) proposed that there are two types of semantic feature. Defining features are essential to the underlying meaning of a word, and relate to properties that things must have to be a member of that category (for example, a bird is living, it is feathered, lays eggs, and so forth). Characteristic features are usually true of instances of a category, but are not necessarily true (for example, most birds can fly, but penguins and ostriches cannot).
[Figure caption: Characteristic features are not necessarily relevant to all members of a given category; most birds can fly, for example, but penguins cannot.]
According to Rips et al., sentence verification involves making comparisons of the feature lists representing the meaning of the words involved in two stages. For this reason this particular approach is called the feature-comparison theory. In the first stage, the overall featural similarity of the two words is compared, including both the defining and characteristic features. If there is very high overlap, we respond “true”; if there is very low overlap, we respond “false.” If we compare “robin” and “bird,” there is much overlap and no conflict in the complete list of features, so we can respond “true” very easily. With “robin”
and “pig,” there is very little overlap and a great deal of conflict, so we can respond “false” very quickly. However, if the amount of overlap is neither very high nor very low, we then have to go on to a second stage of comparison, where we consider only the defining features. This obviously takes additional time. An exact match on these is then necessary to respond “true.” For example, when we compare “penguin” and “bird,” there is a moderate amount of overlap and some conflict (on flying, for example). An examination of the defining features of “penguin” then reveals that it is, after all, a type of bird. The advantage of the first stage is that although the comparison is not detailed, it is very quick. We do not always need to make detailed comparisons.
One problem with the feature-list model is that it is very closely tied to the sentence verification paradigm. A more general problem is that many words do not have obvious defining features. Smith and Medin (1981) extended and modernized the feature theory with the probabilistic feature model. In this approach there is an important distinction between the core description and the identification procedures of a concept (see Figure 11.3). The core description comprises the essential defining features of the concept and captures the relations between concepts, while the identification procedures concern those aspects of meaning that are related to identifying instances of the concept. For physical objects, perceptual features form an important part of the identification procedure. Semantic features are weighted according to a combination of how salient they are and the probability of their being true of a category. For example, the feature “has four limbs” has a large weighting because it relates to something that is perceptually salient and is true of all mammals. “Bears live young” has a lower weighting because although true of almost all mammals it is less salient, while “eats meat” is even lower because it is not even true of most mammals. In a sentence verification task, a candidate instance is accepted as an instance of the category if it exceeds some critical weighted sum of features. For example, “a robin is a bird” is accepted quickly because the features of “robin” that correspond to “bird” easily exceed “bird’s” threshold.
The revised model has the advantage of emphasizing the relation between meaning and identification, and can account for all the verification time data. Because identifying an exemplar of a category only involves passing a threshold rather than examining the possession of defining features, categories that have “fuzzy” or unclear boundaries are no longer problematic. At this point it becomes difficult to distinguish empirically between this model and the prototype model described later.

Evaluation of decompositional theories
There is evidence for and against decompositional theories. It is a difficult area in which to carry out experiments. Indeed, Hollan (1975)
FIGURE 11.3
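The two-stage feature-comparison account of Rips et al. (1973) described above can be sketched as follows. Everything specific here is an assumption made for illustration: the feature lists, the use of a Jaccard ratio as the measure of overall overlap, the two threshold values, and the reading of an "exact match" as every defining feature of the category being present in the instance. None of this is taken from the original model's parameters.

```python
def overlap(a: set, b: set) -> float:
    """Proportion of shared features (Jaccard ratio, an assumed measure)."""
    return len(a & b) / len(a | b)


def verify(instance: dict, category: dict, high: float = 0.7, low: float = 0.2):
    """Return (answer, stage) for the sentence 'An <instance> is a <category>'."""
    # Stage 1: quick comparison over defining and characteristic features.
    inst_all = instance["defining"] | instance["characteristic"]
    cat_all = category["defining"] | category["characteristic"]
    similarity = overlap(inst_all, cat_all)
    if similarity >= high:
        return True, 1
    if similarity <= low:
        return False, 1
    # Stage 2: slower check restricted to the defining features.
    return category["defining"] <= instance["defining"], 2


# Illustrative feature lists (assumptions, not data from the text).
bird = {"defining": {"living", "feathered", "lays eggs"},
        "characteristic": {"flies", "small", "sings"}}
robin = {"defining": {"living", "feathered", "lays eggs", "red breast"},
         "characteristic": {"flies", "small", "sings"}}
penguin = {"defining": {"living", "feathered", "lays eggs"},
           "characteristic": {"swims", "large", "black and white"}}
pig = {"defining": {"living", "four legs", "bears live young"},
       "characteristic": {"pink", "curly tail", "large"}}
bat = {"defining": {"living", "furry", "bears live young"},
       "characteristic": {"flies", "small", "nocturnal"}}

for name, inst, cat in [("robin/bird", robin, bird), ("penguin/bird", penguin, bird),
                        ("robin/pig", robin, pig), ("bat/bird", bat, bird)]:
    print(name, verify(inst, cat))
```

With these assumed lists, "robin/bird" and "robin/pig" are settled in the first stage, whereas "penguin/bird" and "bat/bird" fall in the middle range and need the second stage, which is where the account locates the extra verification time.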
(24) The bachelor married Sybil.
(25) The bachelor did not marry Sybil.
(26) The widow did not marry Sybil.

According to decompositional theories, (24) contains an implicit negative in the form of the PDN in “bachelor.” If this is correct, and such features are accessed automatically, then (25) is implicitly a double negative and should be harder to understand than a control sentence such as (26), which contains only an explicit negative and no PDN. Fodor et al. could find no processing difference between sentences of the types (25) and (26). They concluded that features are not accessed automatically, and instead proposed a non-decompositional account in which the meaning of words is represented as a whole. (Hence Fodor had completely changed his view of decomposition from the earlier Katz and Fodor work.) They argued that to draw an inference such as “a bachelor is unmarried,” you have to make a special type of inference (called a meaning postulate). We do this only when required. A problem with this study is that it is difficult to make up good controls (for example, sentences matched for length and syntactic complexity) for this type of experiment (see Katz, 1977).
Fodor, Garrett, Walker, and Parkes (1980) examined the representation of words called lexical causatives. These are verbs that bring about or cause new states of affairs. In a decompositional analysis such verbs would contain this feature in their semantic representation. For example, “kill” would be represented as something like (cause to die), although this is obviously a far from perfect decomposition. In Figure 11.4, (a) shows the surface structure for the two sentences with the apparently similar verbs “kiss” and “kill.” For the control verb “kiss,” the deep structure analysis is the same, but if “kill” is indeed decomposed into “cause to
[Figure 11.4: tree diagrams. (a) S dominating NP and VP, with N, V, and NP below; (b) S dominating NP and VP, with N, V, NP, and VP below.]
die,” its deep structure should be like that of (b). Fodor et al. asked participants to rate the perceived relatedness between words in these sentences. In (b), “Vlad” and “Agnes” are farther apart than they are in the deep structure of “kissed,” as there are more intervening nodes. Therefore “Vlad” and “Agnes” should be rated as less related in the sentence with the causative verb “Vlad killed Agnes” than with a non-causative verb as in “Vlad kissed Agnes.” However, Fodor et al. found no difference in the perceived relatedness ratings in these sentences, and therefore no evidence that participants decompose lexical causatives.
Gergely and Bever (1986) questioned this finding. In particular, they questioned whether perceived relatedness between words truly is a function of their structural distance. They provided experimental evidence to support their contention, concluding that the technique of intuitions about the relatedness of words cannot be used to test the relative underlying complexity of semantic representations. The conclusion also depends on a failure to show a difference rather than on obtaining a difference, which is always less satisfactory.
Some studies have concluded that complex sentences that are hypothesized to contain more semantic primitives are no less memorable or harder to process than simpler sentences that presumably contain fewer primitives (Carpenter & Just, 1977; Kintsch, 1974; Thorndyke, 1975). On the other hand, these experiments confounded the number of primitives with other factors, particularly syntactic complexity (Gentner, 1981; McNamara & Miller, 1989).
Although Fodor et al. (1980) argued that semantic complexity should slow processing down, it is more likely that it speeds processing up. In Hinton and Shallice’s (1991) model of deep dyslexia, highly imageable words have rich featural representations that make them more robust (see Chapter 7). Features also provide scope for interconnections. Sentences that contain features that facilitate interconnections between their elements are recalled better than those that do not (Gentner, 1981). For example, “give” decomposes into the notion of transferring the ownership of objects between participants in the sentence, while “sold” decomposes into the notion of transferring the ownership of objects plus an exchange of money between participants. Hence sentences of the type “Vlad sold the wand to Agnes” are remembered more accurately than sentences of the type “Vlad gave the wand to Agnes” because the verb has a more complex underlying structure (Gentner, 1981; see also Coleman & Kay, 1981).
Although memory tasks do not always provide an accurate reflection of what is happening at the time of processing, there is further evidence in favor of semantic decomposition. People with aphasia tend to be more successful at retrieving verbs with rich semantic representations compared with verbs with less rich representations (Breedin, Saffran, & Schwartz, 1998). For example, the verb “hurry” has a richer representation than “go” because it includes the meaning of “go” with the additional features representing “quickly.” Semantically related word substitution speech errors (see Chapter 13) always show a featural relation between the target and occurring words. Finally, much of the work on semantic development (see Chapter 4) is best explained in terms of some sort of featural representation.
In summary, it is likely that we represent the meanings of words as combinations of semantic features, although these ideas are fiendishly difficult to test. McNamara and Miller (1989) suggested that young children automatically decompose early words into semantic primitives, but as they get older, they mainly decompose them into non-primitive features. Eventually words themselves might act as features in the semantic system.
There has recently been a resurgence of interest in semantic features. This has come from the interplay between connectionist modeling and neuropsychological studies of semantic memory. Vigliocco, Vinson, Lewis, and Garrett (2004) describe an updated feature-based model called the Featural and Unitary Semantic Space hypothesis. They argue that object and action words at least are represented by combinations of features grounded in perception and organized according to modality. These ideas of grounding and modality-specific organization are important ones to which we will return later.
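As a concrete illustration of decomposition at work, the selection-restriction mechanism of Katz and Fodor (1963) discussed earlier in this chapter, applied to examples (18) to (20), can be sketched as below. The lexicon entries, the feature names for "witches" and "rock," and the function name are assumptions made for this sketch; only the two senses of "ball" and the restrictions on "kick" and "went" follow the description in the text.

```python
# Each word maps to a list of senses; each sense is a set of semantic features.
LEXICON = {
    "witches": [{"human", "animate", "physical object"}],  # assumed features
    "rock": [{"inanimate", "physical object"}],            # assumed features
    "ball": [
        {"sphere", "small", "used in games", "physical object"},
        {"dance", "event"},
    ],
}

# Selection restrictions: features required of the subject, and alternative
# feature sets any one of which the object sense must contain.
RESTRICTIONS = {
    "kicked": {"subject": {"animate"}, "object": [{"physical object"}]},
    "went": {"subject": set(), "object": [{"location"}, {"event"}]},
}


def interpret(subject: str, verb: str, obj: str):
    """Return the object senses the verb allows, or None if anomalous."""
    rule = RESTRICTIONS[verb]
    # The subject needs at least one sense that meets the restriction.
    if not any(rule["subject"] <= sense for sense in LEXICON[subject]):
        return None
    senses = [sense for sense in LEXICON[obj]
              if any(required <= sense for required in rule["object"])]
    return senses or None


print(interpret("witches", "kicked", "ball"))  # game sense of "ball"
print(interpret("witches", "went", "ball"))    # dance sense of "ball"
print(interpret("rock", "kicked", "ball"))     # None: anomalous subject
```

The sketch also makes the weakness noted in the text visible: it only works because the feature lists are tiny and hand-picked, and an exhaustive listing for real words is not available.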
FAMILY RESEMBLANCE MODELS
We have seen that one of the major problems with the decompositional theory of semantics is that it is surprisingly difficult to come up with an intuitively appealing list of semantic features for many words. Many categories seem to be defined by a family resemblance between their members rather than the specification of defining features that all members must possess. How can we account for the woolliness of concepts?

Prototype theories
A prototype is an average family member (Rosch, 1978). Potential members of the category are identified by how closely they resemble the prototype or category average. Some instances of a category are judged to be better exemplars than other instances. The prototype is the “best example” of a concept, and is often a non-existent, composite example. For example, a blackbird (or alternatively, American robin) is very close to being a prototypical bird; it is of average size, has wings and feathers, can fly, and has average features in every respect. A penguin is a long way from being a prototypical bird, and hence we take longer to verify that it is indeed a member of the bird category.
The idea of a prototype arose from many different areas of psychology. Posner and Keele (1968) showed participants abstract patterns of dots. Unknown to the participants, the patterns were distortions of just one underlying pattern of dots that the participants did not actually see. The underlying pattern of dots corresponds to the category prototype. Even though participants never saw this pattern, they later treated it as the best example, responding to it better than the patterns they did see. I considered the related work of Rosch on prototypes and color naming earlier, in Chapter 3.
A prototype is a special type of schema. A schema is a frame for organizing knowledge that can be structured as a series of slots plus fillers (see Chapter 12). A prototype is a schema with all the slots filled in with average values. For example, the schema for “bird” comprises a series of slots such as “can fly?” (“yes” for blackbird and robin, “no” for penguin and emu), “bill length” (“short” for robin, “long” for curlew), and “leg length” (“short” for robin, “long” for stork). The bird prototype will have the most common or average values for all these slots (can fly, short bill, short legs). Hence a robin will be closer to the prototype than an emu. Category boundaries are unclear or “fuzzy.” For some items, it is not clear which category they should belong in; and in some extreme cases, some instances may be in two categories (for example, a tomato may be categorized as both a vegetable and a fruit).
There is a wealth of evidence supporting prototype theory over feature theory. Rosch and Mervis (1975) measured family resemblance among instances of concepts such as fruit, furniture, and vehicles by asking participants to list their features. Although some features were given by all participants for particular concepts, these were not technically defining features, as they did not distinguish the concept from other concepts. For example, all participants might say of “birds” that “they’re alive,” but then so are all other animals. The more specific features that were listed were not shared by all instances of a concept; for example, not all birds fly.
A number of results demonstrate the processing advantage of a prototype over particular instances (see for example Mervis, Catlin, & Rosch, 1975). Sentence verification time is faster for prototypical members of a category. Prototypical members can substitute for category names in sentences, whereas non-prototypical members cannot. Words for typical objects are learned before words for atypical ones. In a free recall task, adults retrieve typical members before atypical ones (Kail & Nippold, 1984). Prototypes share more features with other instances of the category, but minimize the featural overlap with related categories (Rosch & Mervis, 1975). Hence, for most people, “apple” is very close to the prototype of “fruit” (Battig & Montague, 1969), and is similar to other fruit and dissimilar to “vegetables,” but “tomato” is a peripheral member and indeed overlaps with “vegetable.” There are prototypes that possess an advantage over other members of the category even when they are all
formally identical. Participants consider the number “13” to be a better “odd number” than “23” or “501” (Armstrong, Gleitman, & Gleitman, 1983), and “mother” is a better example of “female” than “waitress.” We have already seen that these typicality effects can also be found in sentence verification times. Generally, the closer an item is to the prototype, the easier we process it.
Prototype theories are not necessarily inconsistent with feature theories. According to prototype theories, word meaning is not only represented by essential features; non-essential features also play a role. Theories based on features have the additional attractive property that they can explain how we acquire new concepts, such as “liberty” or “hypocrisy”: we merely combine existing features. Network models can also form new concepts, by adding new nodes to the network with appropriate connections to existing nodes. As we have seen, it is unclear whether this is a meaningful distinction in practice. On the other hand, new concepts are problematical for non-decompositional theories. One suggestion is that all concepts, including complex ones, are innate (Fodor, 1981).

Basic levels
Rosch (1978) argued that a compromise between cognitive economy and maximum informativeness results in a basic level of categorization that tends to be the default level at which we categorize and think, unless there is particular reason to do otherwise. In general, we use the basic level of “chairs,” rather than the lower level of “armchairs” or the higher level of “furniture.” That is, there is a basic level of categorization that is particularly psychologically salient (Rosch et al., 1976). The basic level is the level that has the most distinctive attributes and provides the most economical arrangement of semantic memory. There is a large gain in distinctiveness from the basic level to levels above, but only a small one to levels below. For example, there seems to be a large jump from “chairs” to “furniture” and to other types of furniture such as “tables,” but a less obvious difference between different types of chair. Objects at the basic level are readily distinguished from each other, but objects in levels beneath the basic level are not so easily distinguished from each other. Nevertheless, objects at the same basic level share perceptual contours; they resemble each other more than they resemble members of other similar categories. It is the level at which we think, in the sense that those are the labels we choose in the absence of any particular need to do otherwise. The basic level is the most general category for which a concrete image of the whole category can be formed (Rosch et al., 1976).
Rosch et al. (1976) showed that basic levels have a number of advantages over other categories. Participants can easily list most of the attributes of the basic level; it is the level of description most likely to be spontaneously used by adults; sentence verification time is faster for basic-level terms; and children typically acquire the basic level first. We can also name objects at the basic level faster than at the superordinate or subordinate levels (Jolicoeur, Gluck, & Kosslyn, 1984).

Problems with the prototype model
Hampton (1981) pointed out that not all types of concepts appear to have prototypes: Abstract concepts in particular are difficult to fit into this scheme. What does it mean, for example, to talk about the prototype for “truth”? The prototype model does not explain why categories cohere. Lakoff (1987) points to some examples of very complex concepts for which it is far from obvious how there could be a prototype: the Australian Aboriginal language Dyirbal has a coherent category of “women, fire, and dangerous things” marked by the word “balan.” Furthermore, the prototype model cannot explain why typicality judgments vary systematically depending on the context (Barsalou, 1985). Any theory of categorization that relies on similarity risks being circular: Items are in the same category because they are similar to each other, and they are similar to each other because they are in the same category (Murphy & Medin, 1985; Quine, 1977). It is necessary to explain how items are similar, and prototype theories do not do a good job of this. Finally, the characterization of the basic level as the most psychologically fundamental is not as clear-cut as it seems at first sight (Komatsu, 1992). The amount of information we can retrieve about subordinate
levels varies with our expertise (Tanaka & Taylor, as the number of instances considered increases
1991). Birdwatchers, for example, know nearly as (Storms, De Boeck, & Ruts, 2000). Both abstrac-
much about subordinate members such as black- tion-based theories (Gluck & Bower, 1988) and
birds, jays, and olivaceous warblers, as they do instance-based theories (in the Jets and Sharks
about the basic level. Nevertheless, although model of McClelland, 1981; see also Kruschke,
expertise increases the knowledge available at 1992) have been implemented in connectionist
other levels, the original basic level retains a privileged status (Johnson & Mervis, 1997).
Instance theories
Is abstraction an essential component of conceptual representation? An alternative view is that we represent exemplars without abstraction: each concept represents a particular, previously encountered instance. We make semantic judgments by comparison with specific stored instances. This is the instance approach (Komatsu, 1992), also called the exemplar theory. There are different varieties of the instance approach, depending on how many instances are stored, and on the quality of these instances. The instance approach provides greater informational richness at the expense of cognitive economy.
It is quite difficult to distinguish between prototype and instance-based theories. Many of the phenomena explained by prototype theories can also be accounted for by instance-based theories. Both theories predict that people process central members of the category better than peripheral members (Anderson, 2010). Prototype theories predict this because central members are closer to the abstract prototype, while instance-based theories predict this because central instances are more similar to other instances of the category. Instance-based theories predict that specific instances should affect the processing of other instances regardless of whether or not they are close to the central tendency, and this has been observed (Medin & Schaffer, 1978; Nosofsky, 1991). For example, although the average dog barks, if we experience an odd-looking one that does not, we will expect similar-looking ones not to (Anderson, 2010). On the other hand, abstraction theories correctly predict that people infer tendencies that are not found in any specific instance (Elio & Anderson, 1981). The predictive power of instance-based models increases
models. Across a range of tasks involving natural language categories, instance-based models give a slightly better account than prototype models (Storms et al., 2000). The instantiation principle might be one possible resolution to this conflict (Heit & Barsalou, 1996). According to this principle, a category includes detailed information about its range of instances. Although it is clearly implemented in instance-based theories, it is possible to incorporate it into prototype theories. This idea represents a shift away from emphasizing cognitive economy in our theories. This might not be as disadvantageous as it first seems. Nosofsky and Palmeri (1997) suggested that category membership decisions are made by retrieving instances one at a time from semantic memory until a decision can be made. In this case, the more instances you have stored, the faster you can respond.
Theory theories
A final theory of classification and concept representation has emerged from work on how children represent natural kind categories (e.g., Carey, 1985; Markman, 1989), on judgments of similarity (Rips & Collins, 1993), and on how categories cohere (Murphy & Medin, 1985). According to theory theories, people represent categories as miniature theories (mini-theories) that describe facts about those categories and why the members cohere (Murphy & Medin, 1985; Rips, 1995). A theory underlying a concept is thought to be very similar to the type of theory a scientist uses, say, to decide what sort of insect a particular specimen might be. Mini-theories are sets of beliefs about what makes instances members of categories, and an idea about what properties a normal instance of a category should possess. They look rather like encyclopedia entries. Concept development throughout childhood is a case of the child evolving theories of categories that become increasingly like those used by adults.
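The contrast between prototype and instance-based predictions in the barking-dog example can be made concrete with a minimal sketch. It is not any published model: the two-dimensional feature values, the exponential similarity rule, the sensitivity parameter c, and the function names are all assumptions chosen purely for illustration.

```python
import math

# Hypothetical stored instances: (appearance features, does it bark?).
# The feature values are invented for illustration only.
STORED_DOGS = [
    ((1.0, 1.0), True),
    ((1.2, 0.9), True),
    ((0.9, 1.1), True),
    ((1.1, 1.2), True),
    ((4.0, 4.0), False),  # the odd-looking dog that did not bark
]


def prototype_predicts_bark(item, stored=STORED_DOGS):
    """Prototype-style inference: only the abstracted central tendency is kept.

    The new item's own appearance plays no role once it counts as a dog,
    so the prediction is simply what the average dog does.
    """
    barking = sum(1 for _, barks in stored if barks)
    return barking / len(stored) > 0.5


def exemplar_predicts_bark(item, stored=STORED_DOGS, c=1.0):
    """Instance-style inference: a similarity-weighted vote over stored instances.

    Similarity decays exponentially with distance (an assumed rule),
    so nearby instances dominate the prediction.
    """
    bark_evidence = 0.0
    quiet_evidence = 0.0
    for features, barks in stored:
        similarity = math.exp(-c * math.dist(item, features))
        if barks:
            bark_evidence += similarity
        else:
            quiet_evidence += similarity
    return bark_evidence > quiet_evidence


typical_dog = (1.0, 1.05)
odd_looking_dog = (3.9, 4.1)
print(prototype_predicts_bark(typical_dog), prototype_predicts_bark(odd_looking_dog))
print(exemplar_predicts_bark(typical_dog), exemplar_predicts_bark(odd_looking_dog))
```

With these invented values the prototype rule predicts barking for both new dogs, whereas the instance rule predicts that the newcomer resembling the odd instance will not bark. This is the pattern the instance approach expects: a single atypical instance shapes judgments about items similar to it.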
336 D. MEANING AND USING LANGUAGE
Wisniewski and Love (1998) showed that people often prefer noun combinations based on property relations. For example, a “zebra horse” is easily interpreted as a horse with stripes.
so we might generate a thematic relation like “a zebra that lives in trees.” In a survey of familiar noun–noun combinations, 71% of combinations had thematic relation meanings and 29% property meanings.
People are also influenced by what might be called a noun’s combinatorial history: the way in which a particular word has combined with other words before. For example, when “mountain” is used in compound nouns, it usually indicates a location relation (e.g., “mountain stream,” “mountain goat,” and “mountain resort”). Hence when we come across a new combination involving “mountain” (e.g., “mountain fish”) we tend to interpret it in the same way. The modifying (first) noun of the pair is the most important in determining this (Shoben & Gagne, 1997; Wisniewski, 1997). Further evidence that experience matters is that exposure to a word pair related in a similar way makes it easier to understand a new word pair. For example, prior exposure to the word pair “glass eye” makes people faster to understand “copper horse,” when the same conceptual relation (second word is made of the first) is instantiated (Estes & Jones, 2006).
Hence the interpretation of compound nouns depends on a number of factors, including past experience, similarity, and whether plausible relations between the stimuli exist. Although there might be some bias towards understanding them on the basis of thematic relations, property
FIGURATIVE LANGUAGE
So far we have been concerned with how we process literal language, that is, language where the intended meaning corresponds exactly to the meanings of the words. Humans make extensive use of non-literal or figurative language. In this we go beyond the literal meanings of the words involved, for humor, effect, politeness, to play, to be creative, and for a mixture of these and other reasons. There are three main types of figurative language.
First, we use what can broadly be called metaphor. This involves making a comparison, or drawing a resemblance. A metaphor is a special type of conceptual combination, where we combine two concepts that are not normally thought of as being related for some special effect. There are many types of metaphor, depending on the relation between the words actually used and the intended meaning. Here are a few examples:
(27) Vlad fought like a tiger. (Simile)
(28) Vlad exploded with fury. (Strict metaphor)
(29) All hands on deck. (Synecdoche)
Cacciari and Glucksberg (1994) argued that there is no dichotomy between literal and metaphoric usage: rather, there is a continuum. How do we process metaphorical utterances? The standard theory is that we process non-literal language in three stages (Clark & Lucy, 1975; Searle, 1979). First, we derive the literal meaning of what we hear. Second, we test the literal meaning against the context to see if it is consistent with it. Third, if the literal meaning does not make sense with the context, we seek an alternative, metaphorical meaning (see Figure 11.5). fMRI data suggest that, to understand more abstract metaphors, people activate regions of the brain involved in general reasoning and thinking, involving working memory and executive processing (Prat, Mason, & Just, 2012).
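The serial character of the standard three-stage account can be illustrated with a toy sketch. Everything in it is hypothetical: the two small lookup tables, the way context is represented as a set of meanings that would make sense, and the function names are assumptions made for illustration, not part of the theory as stated by Clark and Lucy (1975) or Searle (1979).

```python
# Hypothetical mini-lexicon: one literal and one figurative reading per utterance.
LITERAL = {"vlad exploded with fury": "Vlad physically blew up"}
FIGURATIVE = {"vlad exploded with fury": "Vlad suddenly became very angry"}


def fits_context(meaning, context):
    # Stand-in for a plausibility check: here the context is simply the set
    # of meanings that would make sense in the current situation.
    return meaning in context


def interpret(utterance, context):
    """Return (meaning, stages_run) under a strictly serial three-stage account."""
    stages_run = []

    # Stage 1: derive the literal meaning.
    literal = LITERAL.get(utterance)
    stages_run.append("derive literal meaning")

    # Stage 2: test the literal meaning against the context.
    stages_run.append("test against context")
    if literal is not None and fits_context(literal, context):
        return literal, stages_run

    # Stage 3: only if the literal reading fails, seek a figurative one.
    stages_run.append("seek figurative meaning")
    return FIGURATIVE.get(utterance), stages_run


meaning, stages = interpret(
    "vlad exploded with fury", context={"Vlad suddenly became very angry"}
)
print(meaning, len(stages))
```

In this sketch the figurative reading is reached only after the literal reading has been derived and rejected, so a figurative interpretation always runs one stage more than a literal one. That ordering is what makes the account a serial, three-stage one.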
Nevertheless, there are constraints. The meanings of the words cannot be either too similar or too dissimilar. Neither (32) nor (33), examples given by Aitchison (1994), is memorable:
(32) Jam is honey.
(33) Her cheeks were typewriters.
Clearly we have to generate just the right amount of overlap: the words must share an appropriate but minor characteristic. Little is known about how we can generate just the right amount of overlap. Producing metaphors and jokes is an aspect of our metalinguistic ability, that is, our ability to reflect on and manipulate language, of which phonological awareness (see Chapter 7) is just one component.
THE NEUROSCIENCE OF SEMANTICS
What can we learn about the representation of meaning from examining the effects of brain damage? Obviously, just because a person cannot name a word or object does not mean that the semantic representation of that word has been lost or damaged. People can fail to access the phonology of a word while they still have access to its semantic representation. There are a number of reasons why this must be the case. Some people who are having difficulty accessing the whole phonological form might be able to access part of it. These people might be able to comprehend the word in speech. They might be able to produce the word in spontaneous speech. Importantly, they know how to use the objects, and they can group pictures together appropriately. In these cases we can conclude that the word meanings are intact, and that such people are having difficulty with later stages of processing. Nevertheless there are some instances where the semantic representation is clearly disrupted.
Can we distinguish between a “central” semantic deficit, when a concept is truly lost (or at least when its representation is degraded), and an “access” semantic impairment (sometimes called a refractory semantic deficit), when there is difficulty in gaining access to the concept? Shallice (1988; see also Warrington & Cipolotti, 1996, and Warrington & Shallice, 1979) discussed five criteria that could distinguish problems associated with the loss of a representation from problems of accessing it. First, performance should be consistent across trials. If an item is permanently lost, it should never be possible to access it. If an item is available on some trials rather than on others, the difficulty must be one of access. Second, for both degraded stores and access disorders, it should be easier to obtain the superordinate category than to name the item, because that information is very strongly represented; but once the superordinate is obtained, it will be very difficult to obtain any further information in a degraded store. Warrington (1975) found that superordinate information (e.g., that a lion is an animal) may be preserved when more specific information is lost. She proposed that the destruction of semantic memory occurs hierarchically, with lower levels storing specific information being lost before higher levels storing more general information. Hence, information about superordinates tends to be better preserved than information about specific instances. Impaired access should affect all levels equally. Third, low-frequency items should be lost first. Low-frequency items should be more susceptible to loss, whereas problems of access should affect items at all frequency levels equally. Fourth, priming should no longer be effective, as an item that is lost obviously cannot be primed. Fifth, if the knowledge is lost then performance should be independent of the presentation rate, whereas disturbances of access should be sensitive to the rate of presentation of the material.
There has been considerable debate about how reliably these criteria distinguish access disorders from loss disorders, and how many patients show all of these features (Rapp & Caramazza, 1993). To be confident that items have been lost from semantic memory we need to observe at least consistent failure to access items across tasks. However, a number of semantic-access deficit patients have now been clearly identified (Warrington & Cipolotti, 1996; Warrington & Crutch, 2004), and other patients show elements of semantic-access deficit (e.g., Forde & Humphreys, 1995, 1997). Gotts and Plaut (2002) present a connectionist model that suggests that central and access deficits result from different types of underlying neurological damage. One type involves damage to a neuromodulatory system that normally functions to maintain and enhance neuronal signals, while the other involves damage to the neuronal system that encodes semantic information. Hence the idea is that “refractoriness,” a reduction in the ability to use the semantic system in the same way for a period of time following the initial response, builds up abnormally. (The idea is similar to that of the refractory period in neuronal firing.)
Studies of the neuropsychology of semantics cast light on a number of important issues. In particular, how many semantic memory systems are there, and how is semantic memory organized?
How many semantic systems are there?
Do we have separate semantic memory systems for each input modality? So far we have discussed semantic information as though there is only one semantic store. This is called the unimodal store hypothesis. It is the idea, perhaps held by most people, that we have one central store of meaning that we can access from different modalities (vision, taste, sound, touch, and smell). However, perhaps each modality has its own store of information? In practice, we are most concerned with a distinction between a store of visual semantic information and a store of verbal semantic information. Paivio (1971) proposed a dual-code hypothesis of semantic representation, with a perceptual code encoding the perceptual characteristics of a concept, and a verbal code encoding the abstract, non-sensory aspects of a concept. Experimental tests of this hypothesis produced mixed results (Snodgrass, 1984). For example, participants are often faster to access abstract information from pictures than from words (see for example Banks & Flora, 1977). Some support for the dual-code hypothesis is that brain-imaging studies show that concrete and abstract words are processed differently (Kounios & Holcomb, 1994).
The idea of multiple or modality-specific semantic stores, whereby verbal material (words) and non-verbal material (pictures) are separated, has enjoyed something of a resurgence owing
11. WORD MEANING 341
to data from brain-damaged participants. There are three main reasons for this (Caplan, 1992). First, priming effects have been found that are limited to verbal material. Second, some case studies show impairments limited to one sensory modality. For example, patient TOB (McCarthy & Warrington, 1988) had difficulty in understanding living things, but only when they were presented as spoken names. He could name their pictures without difficulty. Patient EM (Warrington, 1975) was generally much more impaired at verbal tasks than at visual tasks. Third, patients with semantic deficits are not always equally impaired for verbal and visual material (e.g., Warrington, 1975). Warrington and Shallice’s (1979) patient AR showed a much larger benefit from cuing when reading a written word than when naming the corresponding picture. They interpreted this finding as evidence for separate verbal and visual conceptual systems.
Coltheart, Inglis, Cupples, Michie, Bates, and Budd (1998) described the case of AC, who was unable to access visual semantic attributes, but could access other sensory semantic attributes as well as non-sensory attributes. This was observed independently of the modality of testing and of the semantic category tested. Coltheart et al. proposed that semantic memory is organized into subsystems. There is a subsystem for each sensory modality, and a subsystem for non-sensory semantic knowledge. This non-sensory subsystem is in turn divided into subsystems for semantic categories such as living and non-living things. This approach takes the fractionation of semantic memory to the extreme.
Evaluation of multiple-stores models
Alternative explanations have been offered for these studies. Riddoch, Humphreys, Coltheart, and Funnell (1988) argued that patients who perform better on verbal material might have a subtle impairment of complex visual processing. This idea is supported by the finding that the disturbance in processing pictures is greater for categories with many visually similar members (e.g., fruit and vegetables). The reverse dissociation of better performance on visual material may arise because of the abundance of indirect visual cues in pictures. For example, the presence of a large gaping mouth and heavy paws in the picture of a lion is an excellent indirect cue to how to answer a comprehension question such as “is it dangerous?” even if you do not know it is a picture of a lion (Caplan, 1992).
Nevertheless, some research is more difficult to explain away. Bub, Black, Hampson, and Kertesz (1988) describe the case of MP, who showed very poor comprehension of verbal material, did not show automatic semantic priming, but did show much better comprehension of the meaning of pictures. The nature of the detailed information MP was able to provide about the objects in the pictures, such as the color of a banana from a black-and-white line drawing, could not easily be inferred from perceptual cues without access to semantic information about the object. Warrington and Shallice (1984) found high item consistency in naming performance as long as the modality was held constant, again suggesting different semantic systems were involved. Lauro-Grotto, Piccini, and Shallice (1997) described a patient with semantic dementia (a type of degenerative dementia where semantic memory is lost while episodic memory is relatively well preserved) who was much better at tasks involving visual input than verbal input.
Finally, supportive evidence comes from modality-specific anomia, in which the naming disorder is confined to one modality. For example, in the disorder known as optic aphasia (Beauvois, 1982; Coslett & Saffran, 1989), patients are impaired at the naming of visually presented stimuli, but without general visual anomia or agnosia. They are unable to name objects presented visually, but can name them if they are exposed to them through other modalities (e.g., patients cannot name a cat by sight, but can if they hear it mew, or if they are given one to touch), or if they are given a definition of the word. Hence the names of objects must still be intact, showing there is no general anomia. Patients can also mime the use of objects, or sort pictures into appropriate categories, showing there is no general agnosia.
The interpretation of these data is controversial. The most obvious interpretation of optic aphasia, for example, is that we can
access different modality-specific stores, with one of the stores being wiped out. Riddoch and Humphreys (1987) argued that optic aphasia is a disorder of accessing a unitary semantic system through the visual system, rather than disruption to a visual modality-specific semantic system. Much hangs on the interpretation of gestures made by the patient. Do they indeed reflect preserved visual semantics (so that patients understand the objects they see) with disruption of verbal semantics, or are they merely inferences made from the perceptual attributes of objects? Riddoch and Humphreys’ patient JV produced only the most general of gestures to objects, and other experiments indicated a profound disturbance of comprehension of visual objects. Of course, we must remember the caveat that different patients display different behaviors, and one must be wary of drawing too general a conclusion from a single patient.
Caramazza, Hillis, Rapp, and Romani (1990) argued that there is some confusion about what the terms “semantics” and “semantic stores” mean when used in neuropsychological contexts. Is semantic information general knowledge about objects and events, or just something that mediates between input and output? They distinguished four versions of the multiple-stores hypothesis. In the input account, the same semantic system, containing everything (both visual and verbal), is duplicated for each modality of input. There is little evidence for this idea. In the modality-specific content hypothesis, there is a semantic store for each input modality. Each store contains information relevant to that modality, but in an abstract or modality-neutral format. The modality-specific format hypothesis is similar to this, but the store is in the format of the input (e.g., visual information for vision, verbal information for words). In the modality-specific context hypothesis, visual and verbal semantics refer to the information acquired in the context of visually presented objects or words. For example, if you acquired “tigers have stripes” through verbal exposure, that information is stored verbally rather than visually. These hypotheses are difficult to distinguish, but appear to make three predictions. First, they predict that access from a particular modality always activates the appropriate semantic store first. Second, they predict that activation of phonological and orthographic representations is mediated by verbal semantics. Third, they predict that information can only be accessed directly through the appropriate input modality. Caramazza et al. argued that the data do not really support these predictions. All the data really motivate is that there is a relation between input modality and semantic content type; it does not have to be in a modality-specific format. They proposed an alternative model of the semantic system that they called OUCH (short for organized unitary content hypothesis). In this model, pictures of objects have privileged access to a unimodal store. This is because a picture of an object has a more direct relationship to the object itself than a word denoting the object. A fork is a fork because you can eat with it, and you can eat with it because it has tines and a handle. Some semantic connections are more important than others. This idea is attractively simple, but OUCH cannot explain patients who have more trouble with pictures than words (e.g., FRA of McCarthy & Warrington, 1986). Finally, it is not clear that a distinction between a semantic system and subsystem is a meaningful one. Perhaps they amount to the same thing (Shallice, 1993).
How can we explain optic aphasia? There are several accounts (Sitton, Mozer, & Farah, 2000). Optic aphasia shows that the simple canonical model of meaning, where we go from sensory input to semantics, and then to name, cannot be correct, because in optic aphasia people have accessed the semantics and therefore should always be able to access the name. The modality-specific multiple-stores model accounts for optic aphasia by positing a disconnection between verbal semantics and visual semantics, with producing the correct name depending on access to verbal semantics. According to OUCH (Hillis & Caramazza, 1995), we observe optic aphasia when the semantic representation that is computed from visual input is enough to support action patterns (mimes), but not naming. Shallice (1993) pointed out that this would make optic aphasia indistinguishable from visual associative agnosia. In a similar vein Riddoch and Humphreys (1987) also hypothesize an impairment from vision to
semantics, but argue that a direct pathway from vision to gesture is preserved. Both ideas note that visual objects have affordances (Gibson, 1979); the shape of a chair encourages or creates the idea of sitting in it. Finally, Sitton et al. (2000) argue that instead of optic aphasia arising from damage to multiple semantic systems or multiple pathways, it arises from damage at multiple sites in a unitary model. They argue that lesions to the pathways mapping visual input to semantics, and also semantics to naming, can account for optic aphasia if those lesions are what they call super-additive. Super-additive means that a task requiring both pathways (naming a visually presented object) gives a much higher failure rate than would be expected on the basis of the error rates on tasks involving just one of the paths (e.g., gesturing from semantics). They present a connectionist model that shows that super-additivity can occur and that damage to a system with a single semantic store and with visual and auditory inputs and name and gesture outputs (see Figure 11.6) gives rise to a pattern of performance similar to optic aphasia. Essentially brain damage in two parts of the brain is particularly damaging for some tasks, while leaving performance on tasks that involve just one part close to normal.
FIGURE 11.6 A schematic depiction of the super-additive impairment account of optic aphasia (Farah, 1990). From Sitton, Mozer, and Farah (2000). Node labels in the figure: name, gesture; semantic; visual, auditory.
Caplan (1992) proposed a compromise between the multiple-stores and unitary store theories in which only a subset of semantic information is dedicated to specific modalities. This has become known as the identification semantics hypothesis (Chertkow, Bub, & Caplan, 1992). The perceptual information necessary to identify and name an object is only a subset of the meaning of a concept. If this information is intact and the amodal associative store is impaired, a person will still be able to name an object, but will not be able to access the other verbal semantic information about the object. One argument against this hypothesis is that patient RM of Lauro-Grotto et al. (1997) had much better preserved semantic abilities than we would expect, given that she was impaired at tasks involving verbal semantics. In particular, she still had knowledge about visual contextual contiguity (knowing what items tend to occur together visually, such as a windshield wiper and a car tax disc, which in the UK is displayed in the corner of the windshield) and even functional contextual contiguity (the way objects tend to be used in the same function, such as a screwdriver and a screw). Lauro-Grotto et al. argue that these types of information are stored in visual semantics rather than being an amodal component of semantic memory.
In summary, most researchers currently believe that there are multiple semantic systems. Most importantly, there are distinct systems for verbal and visual semantics. However, it is important to note that the representations and mechanisms used by these systems need to be spelled out, and it can be quite difficult to distinguish between different theories.
Category-specific semantic disorders
Perhaps the most intriguing and hotly debated phenomena in this area are category-specific disorders. Sometimes brain damage disrupts knowledge about particular semantic categories, leaving other related ones intact. For example, Warrington and Shallice’s (1984) patient JBR performed much better at naming inanimate objects than animate objects. He also had a relative comprehension deficit for living things. At first sight this suggests that semantic memory is divided into animate and inanimate categories. JBR’s brain damage caused the loss of the animate category. The picture is more complicated, however. JBR was good at naming parts of the body, even though these are
parts of living things. He was also poor at naming musical instruments, foodstuffs, types of cloth, and precious stones, even though these are clearly all inanimate things. Difficulties with a particular semantic category are not restricted to naming pictures of its members. They arise across a range of tasks, including picture naming, picture–name matching, answering questions, and carrying out gestures appropriate to the object (Warrington & Shallice, 1984).
Even more specific semantic disorders have been observed. Hart, Berndt, and Caramazza (1985) reported a patient, MD, who also had specific difficulties in naming fruit and vegetables; PC (Semenza & Zettin, 1988) had selective difficulty with proper names; BC (Crosson, Moberg, Boone, Rothi, & Raymer, 1997) just had difficulty with medical instruments. Knowledge about nouns and verbs seems to be processed by different parts of the brain (Caramazza & Hillis, 1991; Hillis, Tuffiash, & Caramazza, 2002; Shapiro & Caramazza, 2003). It is unlikely that this dissociation can be reduced to the effects of semantic variables because of the report of a patient by Rapp and Caramazza (2002) who has greater difficulty speaking nouns than verbs, but greater difficulty writing verbs than nouns.
Methodological issues in investigating category-specific deficits
There are a number of methodological problems in studying category-specific semantic disorders. Funnell and Sheridan (1992) reported an apparent category-specific effect whereby their patient, SL, appeared to show a selective deficit in naming pictures and defining words for living versus non-living things. When they controlled for the familiarity of the stimulus, this effect disappeared. They made a general observation about the materials used for these types of experiment. Most experiments use as stimulus materials a set of black-and-white line drawings from Snodgrass and Vanderwart (1980). Some examples are given in Figure 11.7. Funnell and Sheridan (1992) showed that within this set there were more pictures of low-frequency animate objects than there were of low-frequency inanimate objects. There were few low-familiarity non-living things and few high-familiarity living things. That is, randomly selected pictures of animate things are likely to be less familiar than a random sample of inanimate objects. Hence, if frequency is important in naming by brain-damaged patients, an artifactual effect will show up unless care is taken to control for frequency across the categories. Furthermore, there were two anomalous subcategories. SL was poor at naming human body parts (high familiarity but a subcategory of living things) and musical instruments (low frequency but inanimate). These were the two anomalous categories mentioned by Warrington and Shallice (1984) in their description of JBR.
Stewart, Parkin, and Hunkin (1992) also argued that there had been a lack of control of word name frequency, but pointed out in addition that the complexity and familiarity of the pictures used in these experiments varied between categories. Gaffan and Heywood (1993) showed that pictures of living things are visually more similar to each other than pictures of non-living things. With very brief presentation times, normal participants make more errors on living things. In reply to these criticisms, Sartori, Miozzo, and Job (1993) concluded that their patient “Michelangelo” had a real category-specific deficit for living things, even when these factors were controlled for. The debate was continued by Parkin and Stewart (1993), and Job, Miozzo, and Sartori (1993). One conclusion is that it is important to measure and control the familiarity, visual featural complexity, and visual similarity of pictures.
On the other hand, we cannot explain all category-specific effects by these methodological problems. Studies are now careful to control for the potential confounding variables, yet category-specific deficits persist. Some patients are poor at tasks involving living things that do not involve picture naming, such as comprehension and definition (e.g., Warrington & Shallice, 1984). Most importantly, we observe a double dissociation between the categories of living and non-living things. Warrington and McCarthy (1983, 1987) describe patients who are the reverse of JBR in that they perform better on living objects than on inanimate objects. Their patient YOT, for example, who generally had an impairment in naming inanimate objects relative to animate ones, on closer examination could identify large outdoor objects such as buildings and vehicles. There also appears to be a distinction between small and large artifacts. CW also found non-living things and body parts harder to name than living things (Sacchett & Humphreys, 1992). Hillis and Caramazza (1991a) examined two patients, JJ and PS, who exemplified this double dissociation when tested on the same stimuli. Although there are fewer patients who show selective difficulties with non-living things, there are enough of them to be very convincing. The performance of these patients cannot be explained away as experimental artifacts, as they are having difficulty with members of the category that should prove easiest to process if all that matters is visual complexity and familiarity.
What explains the living–non-living dissociation?
There are three possible explanations for category-specific disorders. The first is that different types of semantic information are located at different sites in the brain, so that brain damage destroys some types and not others. On this view, information about fruit and vegetables is stored specifically in one part of the brain. If this explanation is correct then category-specific disorders are important because they reveal the structure of the categories as represented by the brain. Hence the distinction between living and non-living things would be a fundamental organizing principle in semantic memory. Farah (1994) argued that this approach would go against what we know about the organization of the brain. More importantly, this idea does not explain why deficits to particular categories tend to co-occur. Why are impairments on naming living things associated with impairments on naming gems, cloths, foodstuffs, and musical instruments, and why are impairments on naming non-living things associated with impairments on naming body parts? It is also difficult to reconcile with the observation that patients impaired at naming animals perform worse on tasks involving perceptual properties (Saffran & Schwartz, 1994; Sartori & Job, 1988; Silveri & Gainotti, 1988). The second possible explanation is that the categories that are disrupted share some incidental property that makes them susceptible to loss. Riddoch et al. (1988) proposed that categories that tend to be lost also tend to include many similar and confusable items. However, it is not clear that these patients have any perceptual disorder (Caplan, 1992). The third possible explanation is that the differences between the categories are mediated by some other variable so that the items that are lost share some more abstract property. We will look at this idea in detail.
The sensory–functional theory
Non-living things are distinguished from one another primarily in terms of their functional properties, whereas living things tend to be differentiated primarily in terms of their perceptual properties (Warrington & McCarthy, 1987; Warrington & Shallice, 1984). That is, the representation of living things depends on what they look like, but the representation of most non-living things depends on what they are used for. Hence JBR, who generally showed a deficit for living things, also performed poorly on naming musical instruments, precious stones, and fabrics. What these things all have in common is that, like living things, they are recognized primarily in terms of their perceptual characteristics, rather than
being distinguished from each other on largely functional terms. This distinction is also consistent with the organization of the brain, which has distinct processing pathways for perceptual and motor information (Farah, 1994).
Farah, Hammond, Mehta, and Ratcliff (1989) showed that control participants were poor at answering questions on the perceptual features of both living and non-living objects (e.g., “Are the hind legs of kangaroos larger than their front legs?”). If visual attributes are more difficult to process than functional ones, then categories that depend more on them would be more susceptible to loss. This explains why we observe loss of information about living things more frequently than loss of information about non-living things.
There is some support from neuroimaging work for this hypothesis. There is no obvious difference in the blood flow in the temporal lobes with responses to living and non-living things, but there is a difference with the processing of perceptual and functional information (Lee, Graham, Simons, & Hodges, 2002), with more activation of the posterior regions of the left temporal cortex when we are dealing with perceptual information, and more activation of the middle regions when dealing with functional information.
for each word was 7.7:1. For non-living things, it was only 1.4:1. The network was then taught to associate the correct semantic and name pattern when presented with each picture pattern, and to produce the correct semantic and picture pattern when presented with each name pattern.
Farah and McClelland then lesioned the network. They found that damage to visual semantic units primarily impaired knowledge of living things, whereas damage to functional semantic units primarily impaired knowledge about non-living things. Furthermore, when a category was impaired, knowledge of both types of attribute was lost. This is because of the distributed nature of the semantic representations. Lesioning the model results in a loss of support between parts of the representation. The elements of the representation remaining after damage do not have sufficient critical mass to become activated.
In summary, the sensory–functional theory says knowledge of animate objects is derived primarily from visual information, whereas knowledge of inanimate objects is derived primarily from functional information. Non-living things do not necessarily have more functional attributes than perceptual attributes, but they have relatively more than living things.
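The lesioning result can be illustrated with a little arithmetic. The sketch below is not Farah and McClelland's network; it is a toy calculation that assumes the two ratios quoted above are visual-to-functional feature ratios (7.7:1 for living things, 1.4:1 for non-living things) and asks what share of an item's features disappears when one type of semantic unit is destroyed. The function name and its parameters are hypothetical, chosen only for this illustration.

```python
# Toy arithmetic only, not Farah and McClelland's (1991) actual network.
# Assumption: the quoted ratios are visual:functional feature ratios.

def share_of_features_lost(visual_ratio, functional_ratio, damaged, proportion):
    """Fraction of an item's semantic features removed when `proportion`
    of one unit type ("visual" or "functional") is destroyed."""
    total = visual_ratio + functional_ratio
    if damaged == "visual":
        return proportion * visual_ratio / total
    if damaged == "functional":
        return proportion * functional_ratio / total
    raise ValueError("damaged must be 'visual' or 'functional'")

LIVING = (7.7, 1.0)      # visual : functional, as quoted for living things
NON_LIVING = (1.4, 1.0)  # visual : functional, as quoted for non-living things

for damaged in ("visual", "functional"):
    living = share_of_features_lost(*LIVING, damaged, 1.0)
    non_living = share_of_features_lost(*NON_LIVING, damaged, 1.0)
    print(f"all {damaged} units lost: living {living:.0%}, non-living {non_living:.0%}")
```

Under these assumptions, destroying the visual units removes a much larger share of the features of living things than of non-living things, and destroying the functional units does the reverse. The calculation does not capture the further point that the surviving features lose the support of the damaged ones, which is a property of the distributed network itself.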
patients impaired at tasks involving animals are good at musical instruments (e.g., Felicia of De Renzi & Lucchelli, 1994). Animals can be spared or damaged independently of plants (Hillis & Caramazza, 1991a), and the category of plants can be damaged independently of animals (e.g., TU of Farah & Wallace, 1992). It is of course possible that some types of perceptual feature are more important for some categories than for others. For example, animals might depend on shape, while foodstuffs might depend on color (Warrington & McCarthy, 1987). These further dissociations would then reflect selective loss of particular types of sensory feature, rather than of all of them. Caramazza and Shelton argue that there is no independent evidence for this approach.
The sensory–functional hypothesis also appears to predict that people with a selective impairment for living things should show a disproportionate difficulty with visual properties. Although this has been observed sometimes, studies that have carefully controlled for the level of difficulty of the different types of question have not always found it to be the case (Funnell & de Mornay Davies, 1996; Laiacona, Barbarotto, & Capitani, 1993; Sheridan & Humphreys, 1993). However, Farah and McClelland’s (1991) simulations showed that when a category was impaired, knowledge of both types of attribute was lost. This is because of the distributed nature of the semantic representations.
In addition PET and fMRI imaging suggests that knowledge about animals and tools is indeed stored in separate, identifiable parts of the brain (Caramazza & Shelton, 1998; Vigliocco et al., 2004). To summarize, knowledge about animals is stored in occipital-temporal areas, while knowledge about tools is stored in lateral temporal-parietal-occipital areas (see Figure 11.8).
Caramazza and Shelton also argued that the concept of functional information is poorly defined. In the dictionary rating experiment of Farah and McClelland, participants were told that “it was what things are for.” But it is possible that much other non-sensory verbal information is really involved (for example, a lion is a carnivore and it lives in a jungle). Biological function information (such as animals breathe and can see) is preserved in RC, even though other types of functional information (what an animal eats or where it lives) are impaired (Tyler & Moss, 1997, 2001).
FIGURE 11.8 Imaging studies suggest that knowledge about animals is stored in the occipital-temporal areas, whereas knowledge about tools is stored in lateral temporal-parietal-occipital areas. (Panels: Animals, Tools.)
Caramazza and Shelton proposed an alternative explanation of the data, which they called the domain-specific knowledge hypothesis (DSKH). They argued that specific, innate neural mechanisms for distinguishing between living and non-living things have evolved because of the importance of this distinction. They cite two lines of evidence for this. First, very young children (within the first few months) can distinguish between living and non-living things (Bertenthal, 1993; Quinn & Eimas, 1996). The presence of this ability so soon after birth suggests that it is innate. Second, studies of lesion sites and recent studies using brain imaging both suggest that different parts of the brain might, after all, be dedicated to processing living and non-living things. Living things are generally associated with the temporal lobe, while artifacts tend to be more associated with the dorsal region of the temporal, parietal, and frontal lobes.
It is too early to evaluate these alternative approaches. Currently most researchers in the field subscribe to the sensory–functional hypothesis. Time will tell whether the domain-specific knowledge hypothesis will be preferred. Imaging data suggest that while knowledge about animals and tools might be stored in different parts of the brain, this might be because of an underlying dependence on some other factor. While animals are associated with activation of the lateral fusiform gyrus, and tools with activation of the medial fusiform gyrus, some non-living things (e.g., chairs) cause activation of areas outside that associated with tools (Chao, Haxby, & Martin, 1999; Vigliocco et al., 2004).
of dementia called semantic dementia is particularly interesting: In semantic dementia, the loss of semantic information is disproportionately great relative to the loss of other cognitive functions, such as episodic memory (Hodges et al., 1992; Mayberry, Sage, & Lambon Ralph, 2011; Snowden, Goulding, & Neary, 1989; Warrington, 1975). This selective disturbance of semantic information makes it particularly useful for studying how we represent meaning. Alzheimer’s disease and semantic dementia reflect damage (at least initially) to different brain regions: Neuroimaging studies show that Alzheimer’s disease typically begins with medial temporal lobe atrophy, including the hippocampus, with more advanced cases showing global atrophy. Semantic dementia on the other hand is marked by atrophy beginning particularly in the left anterior temporal region of the brain, with much less early damage to the hippocampus. Patients with semantic dementia show impaired word naming and a loss of word meaning, but preserved syntax. Imaging results suggest that the left middle and inferior temporal cortex of the brain play a particularly important role in accessing and representing meaning (Chan et al., 2001; Garrard & Hodges, 2000).
of meaning are. What are the types of feature that underlie word meaning? How are categories organized by the brain? How does our semantic system relate to input and output systems?

CONNECTIONIST APPROACHES TO SEMANTICS
Connectionism has made an impact on semantic memory, just as it did in earlier years on lower level processes such as word recognition. We saw in Chapter 7 how Hinton and Shallice (1991) and Plaut and Shallice (1993a) incorporated the semantic representation of words into a model of the semantic route of word recognition. This approach gives rise to the idea that semantic memory depends on semantic microfeatures.
Note that this approach is not necessarily a competitor to other theories such as prototypes; one instance of a category might cause one pattern of activation across the semantic units, another instance will cause another similar pattern, and so on. We can talk of the prototype that defines a category as the average pattern of activation of all the instances.

Semantic microfeatures
In the connectionist models we have examined, a semantic representation does not correspond to a particular semantic unit, but to a pattern of activation across all of the semantic units. For example, in Tippett and Farah’s model the meaning of each word or object was represented as a pattern of activation over 32 semantic units, each representing a semantic microfeature. A microfeature is an individual, active unit; the prefix “micro” emphasizes that these units are involved in low-level processes rather than explicit symbolic processing (Hinton, 1989), but there really isn’t much difference between a feature and a microfeature. Connectionist models suppose that human semantic memory is based on microfeatures. A semantic microfeature is really just a semantic feature, but the prefix “micro” is added in computational modeling to emphasize their low-level nature. They mediate between perception, action, and language, and do not necessarily have any straightforward linguistic counterparts. While semantic microfeatures might correspond to simple semantic features, they might correspond to something far more abstract. There is no reason to assume that the semantic microfeatures that we develop will correspond to any straightforward linguistic equivalent (such as a word or an attribute), in much the same way that hidden units in a connectionist network do not always acquire an easily identifiable, specific function. In support of this idea, there is evidence that the loss of specific semantic information can affect a set of related concepts (Gainotti, di Betta, & Silveri, 1996). Hence semantic microfeatures might encode knowledge at a very low level of semantic representation, or in a very abstract way that has no straightforward linguistic correspondence (Harley, 1998; Jackendoff, 1983; McNamara & Miller, 1989). The encoding of visual information by at least some of the semantic microfeatures is yet another reason to expect lesions to the semantic system to result in visual errors and perceptual processing difficulty in naming with dementia.
In Hinton and Shallice’s (1991) model of deep dyslexia, meaning was represented as a pattern of activation across a number of semantic feature units, or sememes, such as “hard,” “soft,” “maximum-size-less-foot,” “made-of-metal,” and “used-for-recreation.” No one claims that such semantic features are necessarily those that humans use, but there is some evidence for this sort of approach from data on word naming by Masson (1995). In Hinton and Shallice’s model the semantic features are grouped together so that features that are mutually excluded inhibit each other, and only one can be active at any one time. For example, an object cannot be both “hard” and “soft,” or “maximum-size-less-foot” and “maximum-size-greater-two-yards,” at the same time. In addition, another set of units called “cleanup” units modulate the activation of the semantic units. These features allow combinations of semantic units to influence each other. We saw in Chapter 7 that semantic memory can be thought of as a landscape with many hills and valleys. The bottom of each valley corresponds to a particular word
meaning. Words that are similar in meaning will be in valleys that are close together. The initial pattern of activation produced by a word when it first activates the network might be very different from its ultimate semantic representation, but as long as you start somewhere along the sides of the right valley, you will eventually find its bottom. The valley bottoms, which correspond to particular word meanings, are called attractors. This type of network is called an attractor network.
If meanings are represented as a pattern of activation distributed over many microfeatures, then it makes less sense to talk about loss of individual items in the model. Instead, the loss of units will result in the loss of microfeatures. This will result in a general degradation in performance.

Explaining language loss in people with Alzheimer’s disease: The semantic microfeature loss hypothesis
What happens if a disease such as dementia results in the loss of semantic microfeatures? The effect will be to distort semantic space so that some semantic attractors might be lost altogether, while others might become inaccessible on some tasks because of the erosion of the boundaries of the attractor basins. Damage to a subset of microfeatures will lead to a probabilistic decline in performance. Depending on the importance of the microfeature lost to a particular item in a particular patient, the pattern of performance observed will vary from patient to patient and from task to task. Different tasks will give different results because they will provide differing amounts of residual activation to the damaged system. Thus, although microfeatures are permanently lost in dementia, when tested experimentally this loss will sometimes look like loss of information, but will at other times look like difficulty in accessing information.
Consider response consistency, usually taken as the clearest indication of item loss. If a unit corresponding to the meaning of the word “vampire” is lost, the meaning of that word is always going to be unavailable. Similarly, if the unit corresponding to the attribute “bites” is lost, then that attribute will always be unavailable. If, however, a unit corresponding to more abstract information that is not easily linguistically encoded is lost, then the consequences might be less apparent in any linguistic task. The loss of a feature may mean that the higher level, linguistically encoded units become permanently unavailable, but alternatively it might just mean that the higher level units become more difficult to access. Hence there is a probabilistic aspect to whether a word or an attribute will be consistently unavailable. So an increasing number of linguistically encoded units should become permanently unavailable as the severity of dementia increases and more microfeatures are lost, as is observed (e.g., Schwartz & Chawluk, 1990).
Tippett and Farah (1994) pointed out that experimental tasks differ in the degree of constraint provided on possible responses. Connectionist models are sensitive to multiple constraints: If one sort of constraint is lost, other consistent ones might still be able to facilitate the correct output. For example, in Tippett and Farah’s model, phonological priming provided an additional constraint. Hence the availability of items will depend on the degree to which tasks provide constraints. Patients with Alzheimer’s disease perform relatively well in highly constrained tasks.

Modeling category-specific disorders in dementia
Connectionist models of category-specific disorders in dementia are also interesting because they tell us both about the progress of the disease and about the structure of semantic memory. Dementia generally causes more global damage to the brain than the very specific lesioning effects of herpes simplex that typically cause category-specific disorders. Therefore category-specific deficits are more elusive in dementia. There is also the question of which semantic categories are more prone to disruption in dementia. Gonnerman, Andersen, Devlin, Kempler, and Seidenberg (1997) found that sufferers show selective impairments on tasks involving both living things and artifacts,
depending on the level of severity of the disease. Early on there is a slight relative deficit of naming artifacts, followed later by a deficit on naming living things, followed by poor naming performance across all categories.
What explains the way in which category-specificity varies with severity? To understand this, we need to look more closely at semantic features. In an important study, McRae, de Sa, and Seidenberg (1997) argued that there are different types of semantic feature, depending on the extent to which each feature is related to other ones (see Figure 11.11). Intercorrelated features tend to occur together: For example, most things that have beaks can also fly, and most things that have fur often have tails and claws. Living things tend to be represented by many intercorrelated features. Semantic features also differ in the extent to which they enable us to distinguish among things. Some features are more important than others. Distinctive (sometimes called distinguishing) features enable members of a category to be distinguished: For example, a leopard can be distinguished from other large cats because it has spots. Many members of a natural kind category will share intercorrelated features, but distinguishing features are exclusive to single items within the category. Artifacts tend not to be represented by many intercorrelated features, but rather by many distinguishing features. Using a primed semantic verification task (e.g., “is an apple used to make cider?”), Cree, McNorgan, and McRae (2006) showed that distinctive features hold a privileged status in semantic memory; they are activated more strongly than shared, non-distinctive features.
As intercorrelated features are particularly common in the category of living things, a small amount of damage to the semantic network, characteristic of early dementia, will have little effect on living things. This is because the richly interconnected intercorrelated features support each other (Devlin, Gonnerman, Andersen, & Seidenberg, 1998). Hence, early on in the progression of dementia, tasks involving living things will appear not to be affected. Beyond a critical amount of damage, however, this support will no longer be available. When a critical mass of distinguishing features is lost, there will be catastrophic failure of the memory system. Then, whole categories will suddenly become unavailable. Artifacts, however, tend not to be represented by many intercorrelated features, but by relatively many informative distinguishing features. The loss of just a few of these features might result in the loss of a specific item. Increasing damage then results in the gradual loss of an increasing number of items across categories, rather than the catastrophic loss observed with living things.
It is important to emphasize the probabilistic nature of this loss. If a distinguishing feature for an animal happens to be lost early on, then that animal will be confused with other animals from that point on (Gonnerman et al., 1997). However, there are more intercorrelated than non-correlated distinguishing features within the living things category. Hence an intercorrelated feature is more likely to be affected, but usually with no obvious consequence, than a distinguishing feature. This type of approach is promising, but we must be wary about the relatively limited amount of data on which this sort of
FIGURE 11.11
model is based. For example, Garrard, Patterson, Watson, and Hodges (1998) failed to find any interaction between disease severity and the direction of dissociation. Instead, they found a group advantage for artifacts, with a few individuals showing an advantage for living things.

Latent semantic analysis
We have seen that connectionism represents meaning by a pattern of activation distributed over many simple semantic features. In these models, the features are hand-coded; they are not learned, but are built into the simulations. How do humans learn these features? Connectionist models suggest one means: connectionist models are particularly good at picking out statistical regularities in data, so it is possible that we abstract them from many exposures to words. A closely related approach makes explicit the role of co-occurrence information in acquiring knowledge. This technique is called latent semantic analysis (LSA) (Landauer & Dumais, 1997; Landauer, Foltz, & Laham, 1998; see Burgess & Lund, 1997, for the similar HAL, or hyperspace analog to language, model). Latent semantic analysis needs no prior linguistic knowledge. Instead, a mathematical procedure abstracts dimensions of similarity from a large corpus of items based on analysis of the context in which words occur. We saw earlier how Lund et al. (1995) showed that semantically similar words are interchangeable within a sentence. This means that the context in which words can (and cannot) occur provides a powerful constraint on how word meanings are represented. Latent semantic analysis makes use of this context to acquire knowledge about words. At first sight these constraints might not seem particularly strong, but there are a huge number of them, and we are exposed to them many times. Constraints on the co-occurrence of words provide a vast number of interrelations that facilitate semantic development. LSA learns about these interrelations through a mechanism of induction. The mathematical techniques involved are too complex to describe here, but essentially the algorithm tries to minimize the number of dimensions necessary to represent all the co-occurrence information. Indeed, this type of model is often called the HDM (high-dimensional memory) approach.
Landauer and Dumais examined how latent semantic analysis might account for aspects of vocabulary acquisition. After exposure to a large amount of text, the model performed well at a multiple-choice test of selecting the appropriate synonym of a target word. It also acquired vocabulary at the same rate as children. (To give some idea of the complexity of the task, and to provide another demonstration of the importance of computers in modern psycholinguistics, 300 dimensions were necessary to represent relations among 4.6 million words of text taken from an encyclopedia.) This statistical sort of approach is very good at accounting for later vocabulary learning, where direct instruction is very rare. Instead, we infer the meanings of new words from the context. LSA also shows how we can reach agreement on the usage of words without any external referent. This observation is particularly useful in explaining how we acquire words describing private mental experiences. How do you know that I mean the same thing by “I’m sad today” as you do? The answer is in the context in which these words repeatedly do and do not occur.
One criticism of the HDM models is that they are overly concerned with the context in which words occur, so that words are related to other words, rather than to the world, and therefore these models find it difficult to cope with novel situations (Glenberg & Robertson, 2000; see Burgess, 2000, for a reply). For example, we know that it makes sense to use a newspaper to protect our head from the wind, but not a matchbox. We will return to how meaning is connected to perception at the end of this chapter.

Evaluation of connectionist models of semantic memory
Throughout this chapter we have seen how connectionist modeling has indicated how apparently disparate theories and phenomena (here the time course of dementia, modality-specific stores, functional versus perceptual attributes, and category-specific memory) may be subsumed under one model. Connectionist modeling of neuropsychological deficits is particularly promising. The data and modeling work suggest
that the language deficits shown in diseases such as dementia result from the gradual loss of semantic microfeatures.

Grounding: Connecting language to the world
Language and meaning are not a closed system. Meaning is a way of mapping language onto the external world. At some point the semantic system has to interface with the perceptual systems; this interfacing is sometimes called grounding (see Jackendoff, 1987, 2002, 2003; Roy, 2005; Vigliocco et al., 2004). How does grounding occur? Rogers et al. (2004) describe a connectionist model of semantic memory that provides an account of how language and perception are connected. They constructed a model that maps between modality-specific representations of objects and their verbal descriptions (see Figure 11.12). Semantic representations mediate between these two output representations. In their model, a semantic level mediates between visual features (e.g., is round) and verbal descriptors, which in turn comprise names (e.g., bird), perceptual descriptors (e.g., has wings), functional descriptors (e.g., can fly), and encyclopedic descriptors (e.g., lives in Africa). The model learns to associate inputs with outputs. The internal semantic structure is constrained by both visual and verbal outputs; hence visually similar inputs give rise to similarly structured internal representations. As noted above, the semantic representations do not necessarily encode semantic features (e.g., has eyes) directly; they just have to be “good enough” to do the job (e.g., giving a name to an object, answering a question such as “does a chair have eyes?”).
As Rogers et al. note, such a computational approach, although broadly similar to the feature-based model, has several advantages. First, we no longer have to be worried about what features we should use and whether they are arbitrary; features emerge to do the job. We no longer have to worry about whether a dog’s bark and a cow’s moo are the same or different features. Second, the computational model forces us to be explicit about how every semantic or perceptual task is carried out. Third, the model provides an account of semantic dementia. Semantic dementia was simulated by removing a proportion of the weights; increasing severity is modeled by removing a larger proportion of the weights. The lesioned model resembles the behavior of patients with semantic dementia. For example, in both the model and the patients, as severity increases so does the proportion of omission and superordinate errors, while the production of semantic substitutions initially increases but then declines. With a little damage, the model first confuses similar items, but with increasing damage it becomes unable to generate any information that distinguishes one item from another, and whole categories merge together. Hence, although individual names may not be accessible, superordinate categories remain so. With yet more damage, even broad categories may become indistinguishable. The model gives a similarly good account of other semantic tasks, such as sorting words and pictures, drawing, copying after a delay, and matching words to pictures. The model makes some specific predictions: Although fruits share some properties with animals (e.g., they are living, or at least not man-made), they have many visual attributes in common with man-made objects. And patients do indeed treat fruit differently, sorting them with
FIGURE 11.12 Rogers et al.’s (2004) connectionist model of semantic memory. Semantics mediates between visual features (e.g., is round) and verbal descriptions: names (e.g., bird), perceptual (e.g., has wings), functional (e.g., can fly), and encyclopedic (e.g., lives in Africa). Adapted from Rogers et al. (2004).
artifacts. The simulations also predicted that more omission errors should be made when naming artifacts and more substitution errors when naming living things, because of the greater structure in the domain of living things, a prediction verified by the data from patients with semantic dementia.
This kind of computational approach does not contradict the HAL (hyperspace analog to language) model of Burgess and Lund (1997) or the LSA (latent semantic analysis) model of Landauer and Dumais (1997). Indeed, all these approaches show that we extract and abstract semantic information from large bodies of information. However, while HAL and LSA are reliant on verbal input, this computational approach links verbal and perceptual information. The computational model also links semantic processing to neuropsychology.
The semantic representation is unitary and amodal, although different modalities will provide different inputs to the mediating semantic representation. In that respect the model resembles OUCH. Indeed, semantic memory might better be seen as a system that mediates different perceptual systems, rather than a store of propositional facts. The anterior regions of the temporal lobes play a particularly important role in this process.
The idea that our internal representations are grounded in our perceptions, actions, and feelings is an important one: put another way, our cognition is embedded in the world. Concepts have very direct links to the world (Barsalou, 2003, 2008; Glenberg, 2007). Our minds don’t work in isolation; they are situated within the world. According to this view, concepts and meaning aren’t just abstract things: thinking about real-world objects, for example, involves the visual perceptual system. Furthermore, according to the situated cognition idea, concepts are less stable than has usually been thought, varying depending on the context and situation. Barsalou (2003) had people perform two tasks simultaneously: using their hands to imagine performing some manual operations, and identifying the properties of concepts. Sometimes the actions being performed were relevant to the concepts being described, in which case the participants were more likely to mention related aspects of the concepts. For example, if they were performing the action of opening a drawer, they were more likely to mention clothes likely to be found inside a clothes dresser than otherwise.
There is evidence that our mental situation in the world takes a very concrete form, in that there are direct links between representations of perceptions and actions. What happens in the brain when we hear the word “kick”? Using brain imaging, we see Wernicke’s region, the part of the left temporal lobe of the brain that we know plays a vital role in accessing word meanings, become highly activated. We also see some activation in Broca’s area, a region towards the front of the left hemisphere that we know to be involved in producing speech. What is even more surprising is that fMRI scans show that there is activation in the parts of the brain that deal with motor control, and particularly the motor control of the leg (Glenberg, 2007; Hauk, Johnsrude, & Pulvermüller, 2004). It’s as though when we hear “kick,” we give a mental kick. Similarly, if we hear a word such as “catch,” we see activation in the parts of the brain that control the movements of the hand, and if you hear “I eat an apple,” you get activation of the parts that control the mouth (Tettamanti et al., 2005). This motor activity peaks extremely quickly: within 20 ms of the peak activation in the parts of the brain traditionally thought to be involved in recognizing words and processing meaning (Pulvermüller, Shtyrov, & Ilmoniemi, 2003), which is so fast that it rules out the explanation that people are just consciously reflecting on or rehearsing what they’ve just heard. This idea that thinking or understanding language causes activation in the parts of the brain to do with how the body deals with these concepts is called embodiment. Language is grounded to the world, and grounding happens in the parts of the brain that deal with perception and action (Willems & Casasanto, 2011).
Brain imaging studies reinforce the view that wide areas of the brain are involved in processing meaning at many different levels, initially involving modality-specific sensory and motor systems, and then increasingly abstract representations that tap into a variety of other cognitive, emotional, and social processes carried out by the brain (Binder & Desai, 2011).
SUMMARY
• People with probable Alzheimer’s disease have difficulty with picture naming; this can be
  explained in terms of their underlying semantic deficit.
• In connectionist modeling, word meaning is represented as a pattern of activation distributed
  across many semantic features; this pattern corresponds to a semantic attractor.
• Semantic features (called microfeatures in computational modeling) do not necessarily have
  straightforward perceptual or linguistic correspondences.
• Semantic dementia can be explained as the progressive loss of semantic features.
• Living things tend to be represented by many shared intercorrelated features, whereas non-living
  things are represented primarily by distinctive features.
• The pattern of category-specificity displayed in dementia depends on the level of severity of the
  disease.
• Connectionist modeling shows how the differential dependence of living and non-living things on
  intercorrelated and distinctive features explains the interaction between performance on different
  semantic categories and severity of dementia.
• Latent semantic analysis shows how co-occurrence information is used to acquire knowledge.
• Grounding is how symbols are connected to perceptual representations.
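The point about co-occurrence information can be made concrete with a toy sketch. The Python below is only an illustration under stated assumptions, not the published HAL or LSA procedure: it counts which words occur within a small window of each other (loosely in the spirit of HAL's moving window, but without its distance weighting) and compares the resulting count vectors by cosine similarity; LSA proper instead applies a dimension-reduction step to a much larger word-by-context matrix. The function names, the window size, and the five-sentence corpus are all invented for the example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """For each word, count the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            start = max(0, i - window)
            end = min(len(words), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy corpus: "cat" and "dog" occur in similar contexts, "car" does not.
corpus = [
    "the cat chased the mouse",
    "the dog chased the ball",
    "the cat ate the fish",
    "the dog ate the bone",
    "the car needs new tyres",
]

vecs = cooccurrence_vectors(corpus)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])
sim_cat_car = cosine(vecs["cat"], vecs["car"])
print("cat vs dog:", round(sim_cat_dog, 3))
print("cat vs car:", round(sim_cat_car, 3))
```

Even in this tiny corpus, words that appear in similar verbal contexts end up with more similar vectors than words that do not, which is the sense in which such models extract semantic information from verbal input alone.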
FURTHER READING
A recent review of the psychology of semantics is Vigliocco and Vinson (2009). The classic lin-
guistics work on semantics is Lyons (1977a, 1977b). Johnson-Laird (1983) provides an excellent
review of a number of approaches to semantics, including the relevance of the more philosophical
approaches.
General problems with network models are discussed by Johnson-Laird, Herrmann, and Chaffin
(1984). Chang (1986) and Smith (1988) review the experimental support for psychological models
of semantic memory. Kintsch (1980) is a good review of the early experimental work on semantic
memory, particularly on the sentence verification task.
For more on definitional versus non-definitional theories of meaning, see the debate between
J. A. Fodor (1978, 1979) and Johnson-Laird (1978; Miller & Johnson-Laird, 1976). For more on
instance-based theories, see Hintzman (1986), Murphy and Medin (1985), Nosofsky (1991), Smith
and Medin (1981), and Whittlesea (1987). For an important overview of connectionist approaches to
semantics, see Rogers and McClelland (2004).
Aitchison (1994) is a good introduction to processing figurative language.
For an excellent brief review of the neuropsychology of semantics, see Saffran and Schwartz
(1994). Caplan (1992) also provides an extensive review of the neuropsychology of semantic mem-
ory. For a review of optic aphasia see Sitton et al. (2000). See Vinson (1999) for an introductory
review of language in dementia; Harley (1998) for a review of work about naming and dementia; and
Schwartz (1990) for an edited volume of work on dementia with a cognitive bias.
HAL is another latent semantic analysis model (Lund et al., 1995, 1996); it produces an account
of semantic priming that is similar to McRae and Boisvert (1998; see Chapter 6).
CHAPTER 12
COMPREHENSION
remember the gist of text, and very quickly dump the details of word order.

We start to purge our memory of the details of what we hear after sentence boundaries. Jarvella
(1971) presented participants with sentences such as (5) and (6) embedded in a story:

(5) The tone of the document was threatening. Having failed to disprove the charges, Taylor
    was later fired by the President.
(6) The document had also blamed him for having failed to disprove the charges. Taylor was
    later fired by the President.

The participants were then tested on what they remembered. They remembered the clause
“having failed to disprove the charges” more accurately in (5) than (6), presumably because in
(5) it was part of the final sentence before the interruption.

The way in which we describe what we recall from immediate memory can be influenced by the
syntactic structure of what we have just read or heard. Potter and Lombardi (1998) found that the
tendency to use the same syntactic structure in material recalled from immediate memory results
from syntactic priming by the target material (see Chapter 13 for more details). That is, we tend
to reuse the same words and sentence structures in the material we recall because they were there
in the original material. Potter and Lombardi showed that it was possible to change the way
people phrased the material they recalled by priming them with an alternative sentence structure.
This is consistent with the idea that immediate recall involves generation from a meaning-level
representation, rather than true verbatim memory (Potter & Lombardi, 1990, 1998).

The details of surface syntactic form are not always lost. Yekovich and Thorndyke (1981)
showed that we can sometimes recognize exact wording up to at least 1 hour after presentation.
Bates, Masling, and Kintsch (1978) tested participants’ recognition memory for conversations
in television soap operas. As expected, memory for meaning straight after the program was nearly
perfect, but participants could also remember the detailed surface form when it had some
significance. Kintsch and Bates (1977) studied students’ memory of lectures. They found that
verbatim memory was good after 2 days but was greatly reduced after 5 days. Extraneous remarks
were remembered best: We remember the precise wording of jokes and announcements particularly
well. Perhaps surprisingly, there were no differences in literal memory for sentences that were
centrally related to the topic compared with those concerned with detail. A depressing result for
teachers is that memory was worst for central topic statements and overall conclusions. These
studies show that there are differences between coherent naturalistic conversation, and isolated
artificial sentences and other materials constructed just for psycholinguistic experiments. In
real conversation (counting soap operas as examples of real conversation), quite often what might
be considered surface detail serves a particular function. For example, the way in which we use
pronouns or names depends in part on factors like how much attention we want to draw to what is
being referred to. This result accords with our intuitions: Although we often remember only the
gist of what is said to us, on occasion we can remember the exact wording, particularly if it is
important or emotionally salient.

Items and properties that become incorporated into our model of what we hear are more
memorable than those that do not. Consider these two sentences, (7) and (8):

(7) Vlad was relieved that Agnes was wearing her pink dress.
(8) Vlad was relieved that Agnes was not wearing her pink dress.

Both sentences mention the word “pink,” but while in the first sentence there is a pink dress in
our representation of the sentence, in the second there is not. We are explicitly told that there
is no pink dress present. How does this affect the memorability of the word “pink”? Suppose we
present the word “pink” after hearing these two sentences, and ask participants whether or not the
word was present. What we find depends on the delay between the sentence and presenting the probe
word (“pink”). After 500 ms, “pink” is equally
accessible in both sentences, but after 1,500 ms, participants respond faster if the item is
present (7) compared with when it is not present (8). That is, immediately after hearing a
sentence, linguistic structure and content determine memory; after a longer delay, linguistic
structure is less important than discourse structure (Kaup & Zwaan, 2003).

Exactly why we sometimes remember the exact surface form is not currently known. Is a
decision taken to store it permanently, and if so when? Neither is the relation between our
memory for surface form and the structure of the parser well understood. Clearly we can routinely
remember more than one clause, even if there has been subsequent interfering material, so it
cannot be simply that we always immediately discard surface form. Clearly the parser can process
one sentence while we are storing details of another.

Importance

Not surprisingly, people are more likely to remember what they consider to be the more important
aspects of text. Johnson (1970) showed that participants were more likely to recall ideas from a
story that had been rated as important by another group of participants. Keenan, MacWhinney, and
Mayhew (1977) examined memory for a linguistics seminar, and compared sentences that were
considered to be HIC (high interactional content, which is material having personal significance)
and sentences with LIC (low interactional content, which is material having little personal
significance).

(9) I think you’ve made a fundamental error in this study.
(10) I think there are two fundamental tasks in this study.

Sentences with high interactional content, such as (9), were more likely to be recalled by the
appropriate participants in the seminar than sentences with low interactional content, such as
(10).

Although it may not be surprising that more important information is recalled better, there
are a number of reasons why it might be so. We might spend longer reading more important parts
of the text; indeed, eye-movement research suggests this is in part the case. In this case the
better memory would simply reflect more processing time. However, Britton, Muth, and Glynn (1986)
restricted the time participants could spend reading parts of the text so they spent equal
amounts of time reading the more and the less important parts of a story, and found that they
still remembered the important parts better. Hence there is a real effect of the role the
material plays in the meaning of the story. Important material must be flagged in comprehension
and memory in some way.

The importance of an idea relative to the rest of the story also affects its memorability (Bower,
Black, & Turner, 1979; Kintsch & van Dijk, 1978; Thorndyke, 1977). As you would expect, the more
important a proposition is, the more likely it is to be remembered. Text processing theories
should predict why some ideas are more “important” than others. One suggestion is that important
ideas are those that receive more processing because themes in the text are more often related to
important ideas than less important ones are.

What effect does prior knowledge have?

The effect of prior knowledge on what we remember and on the processes of comprehension was
explored in an important series of experiments by Bransford and his colleagues. For example,
Bransford and Johnson (1973, p. 392) read participants the following story (11):

(11) “If the balloons popped, the sound wouldn’t be able to carry far, since everything would
     be too far away from the correct floor. A closed window would also prevent the sound from
     carrying, since most buildings tend to be well insulated. Since the whole operation depends
     upon a steady flow of electricity, a break in the middle of the wire would also cause
     problems. Of course, the fellow could shout, but the human voice is not loud enough to carry
     that far. An additional problem is that a string could break on the instrument. Then there
     could be no accompaniment to the message. It is clear
context after reading it also only recalled on average 2.7 ideas. However, those participants
given the context before the story recalled an average of 5.8 ideas. These experiments suggest
that background knowledge by itself is not sufficient: you must recognize when it is applicable.

Appropriate context may be as little as the title of a story. Dooling and Lachman (1971, p. 218)
showed the effect of providing participants with a title that helped them make sense of what was
read, but once again it had to be given before reading the story (13):

(13) “With hocked gems financing him, our hero bravely defied all scornful laughter that tried
     to prevent his scheme. ‘Your eyes deceive,’ he had said. ‘An egg, not a table, correctly
     typifies this unexplored planet.’ Now three sturdy sisters sought proof. Forging along,
     sometimes through vast calmness, yet more often over turbulent peaks and valleys, days
     became weeks as doubters spread fearful rumours about the edge. At last, from nowhere,
     welcome winged creatures appeared signifying monumental success.”

Without the title of “Christopher Columbus’s discovery of America,” the story makes little sense.
In fact, “three sturdy sisters” refers to the three ships, the “turbulent peaks and valleys” to
the waves, and “the edge” refers to the supposed edge of a flat earth.

It might reasonably be objected that all these stories so far have been designed to be obscure,
without a title or context, and are not representative of normal texts. What happens with less
obtuse stories?

This can be seen in an experiment by Anderson and Pichert (1978), who showed how a shift in
perspective provides different retrieval cues. Participants read a story summarized in (14). (A
more colloquial British term for “playing hooky” is “playing truant,” or “skiving.”)

(14) Two boys play hooky from school. They go to the home of one of the boys because his mother
     is never there on a Thursday. The family is well off. They have a fine old home which is
     set back from the road and which has attractive grounds. But since it is an old house it
     has some defects: for example, it has a leaky roof, and a damp and musty cellar. Because the
     family is wealthy, they have a lot of valuable possessions, such as a ten-speed bike, a
     color television, and a rare coin collection.

The story was 373 words long and was identified by the experimenters as containing 72 main ideas.
Other participants had previously rated the main ideas of the story according to their relevance
to a potential house buyer or a potential burglar. For example, a leaky roof and a damp basement
are important features of a house to house buyers but not to burglars, whereas valuable
possessions and the fact that no one is in on Thursday are more relevant to burglars. The
participants in the experiment read the story from either a “house buying” or a “burglar”
perspective in advance. Not surprisingly, the perspective influenced the ideas the participants
recalled. Half the participants were then told the other perspective, while a control group of
the other half of the participants just had the first repeated. The shift in perspective improved
recall: participants could recall things they had previously forgotten. This is because the new
perspective provides a plan for searching memory.

At first sight the findings of this experiment appear to contradict those of Bransford and
Johnson. Bransford and Johnson showed that context has little effect when it is presented after
a story, but Anderson and Pichert showed that changing the perspective after the story (which of
course is a form of context) can improve recall. The difference is that, unlike the Bransford and
Johnson experiments, the Anderson and Pichert story was easy to understand. It is hard to encode
difficult material in the first place, let alone recall it later. With easier material the
problem is in recalling it, not encoding it. People encode information from both perspectives,
but the perspective biases what people recall. In an extension of this study, Baillet and Keenan
(1986) looked at what happens if perspective is shifted after reading but before recall.
Participants who recalled the material immediately depended on the retrieval
perspective; however, participants who recalled it after a much longer interval (1 week) were not
affected by the retrieval perspective; only the perspective given at encoding mattered.

There is a huge amount of potentially relevant background knowledge. Almost anything we know can
be brought to bear on understanding text. (Indeed, one way to improve our memory for text is to
construct as many connections as possible between new and old material.) Culture-specific
information also influences comprehension (Altarriba, 1993; Altarriba & Forsythe, 1993). For
example, in an experiment by Steffensen, Joag-dev, and Anderson (1979), groups of American and
Indian participants read two passages, one describing a typical American wedding and the other a
typical Indian wedding. Participants read the passage appropriate to their native culture more
rapidly and remembered more of it, and distorted more information from the culturally
inappropriate passage. Culture does not mean just nationality: religious affiliation can affect
reading comprehension. Lipson (1983) showed that children from strongly Catholic or strongly
Jewish backgrounds showed faster comprehension of and better recall for text that was appropriate
to their affiliation.

In summary, prior knowledge has a large effect on our ability to understand and remember
language. The more we know about a topic, the better we can comprehend and recall new material.
The disadvantage of this is that sometimes prior knowledge can lead us astray.

Inferences

We make an inference when we go beyond the literal meaning of the text. An inference is the
derivation of additional knowledge from facts already known; this might involve going beyond
the text to maintain coherence, or to elaborate on what was actually presented. Inferences do not
always lead to the correct conclusion, however. Prior knowledge and context are mixed blessings.
Although they can help us to remember material that we would otherwise have forgotten, they can
also make us think we have “remembered” material that was never presented in the first place!
For example, Sulin and Dooling (1974, p. 256) showed that background knowledge could also be
a source of errors if it is applied inappropriately. Consider the following story (15):

(15) “Gerald Martin strove to undermine the existing government to satisfy his political
     ambitions. Many of the people of his country supported his efforts. Current political
     problems made it relatively easy for Martin to take over. Certain groups remained loyal to
     the old government and caused Martin trouble. He confronted these groups directly and so
     silenced them. He became a ruthless, uncontrollable dictator. The ultimate effect of his
     rule was the downfall of his country.”

Half of the participants in their experiment read this story as given here, with the main actor
in the story called “Gerald Martin.” The other half read it with the name “Adolf Hitler” instead.
Participants in the “Hitler” condition afterwards were more likely to believe incorrectly that
they had read a sentence “He hated the Jews particularly and so persecuted them,” than a neutral
control sentence such as “He was an intelligent man but had no sense of human kindness.” That
is, they made inferences from their background world knowledge that influenced their memory of
the story. Here the prior knowledge was a source of errors. Participants in the fictitious
character condition were of course unable to use this background information.

There are three main types of inference, called logical, bridging, and elaborative inferences.
Logical inferences follow from the meanings of words. For example, hearing “Vlad is a bachelor”
enables us to infer that Vlad is male. Bridging inferences (sometimes called backward inferences)
help us relate new to previous information (Clark, 1977a, 1977b). Another way of putting this is
that texts have coherence in a way that randomly jumbled sentences do not have. We strive to
maintain this coherence, and make inferences to do so. One of the major tasks in comprehension is
sorting out what pronouns refer to. Sometimes even more cognitive work
is necessary to make sense of what we read or hear. How can we make sense of (16)? We can if
we assume that the moat refers to a moat around the castle mentioned in the first sentence. This
is an example of how we maintain coherence: We comprehend on the basis that there is continuity
in the material that we are processing, and that it is not just a jumble of disconnected ideas.
Bridging inferences provide links among ideas to maintain coherence.

(16) Vlad looked around the castle. The moat was dry.

We make elaborative inferences when we extend what is in the text with world knowledge.
The Gerald Martin example is an (unwarranted) elaborative inference. This type of inference
proves to be very difficult for AI simulations of text comprehension, and is known as the frame
problem. Our store of general world knowledge is enormous, and potentially any of it can be
brought to bear on a piece of text, to make both bridging and elaborative inferences. How does
text elicit relevant world knowledge? This is a significant problem for all theories of text
processing. Bridging and elaborative inferences have sometimes been called backward and
forward inferences respectively, as backward inferences require us to go back from the current
text to previous information, whereas forward inferences allow us to predict the future.
As we shall see, there are reasons to think that different mechanisms are responsible for these
two types of inference. Taken together, all inferences that are not logical are sometimes called
pragmatic inferences.

As we have seen, people make inferences on the basis of their world knowledge. We have
also seen that we only remember the gist of what we read or hear, not the detailed form. Taken
together, these suggest that we should find it very difficult to distinguish the inferences we
make from what we actually hear. Bransford, Barclay, and Franks (1972) demonstrated this
experimentally. They showed that after a short delay the target sentence (17) could not be
distinguished from the valid inference (18):

(17) Three turtles rested on a floating log and a fish swam beneath them.
(18) Three turtles rested on a floating log and a fish swam beneath it.

If you swim beneath a log with a turtle on it, then you must swim beneath the turtle. If you
change “on” to “beside,” then participants are very good at detecting this change, because the
inference is no longer true and therefore not one likely to be made.

When are inferences made?

In the past, most researchers subscribed to a constructionist view that inferences are involved
in constructing a representation of the text. Comprehenders are more likely to make inferences
related to the important components of a story and not incidental details (Seifert, Robertson,
& Black, 1985). The important components are the main characters and their goals, and actions
relating to the main plan of the story. According to constructionists, text processing is driven
on a “need to know” basis. The comprehender forms goals when processing text or discourse, and
these goals determine the inferences that are made, what is understood and what is remembered
about the material, and the type of model constructed.

The alternative view is the minimalist hypothesis (McKoon & Ratcliff, 1992). According to the
minimalist hypothesis, we automatically make bridging inferences, but we keep the number of
elaborative inferences to a minimum. Those that are made are kept as simple as possible and use
only information that is readily available. Most elaborative inferences are made at the time of
recall. According to the minimalist approach, text processing is data-driven. Comprehension is
enabled by the automatic activation of what is in memory: it is therefore said to be
memory-based. In part the issue comes down to when the inferences are made. Is a particular
inference made automatically at the time of comprehension, or is it made with prompting during
recall?

The studies that show that we make elaborative inferences look at our memory for text.
Memory measures are indirect measures of comprehension, and may give a distorting picture of
the comprehension process. In particular, this may have led us to overestimate the role of
construction in comprehension. The most commonly used on-line measure is reading time, on the
assumption that making an automatic inference takes time, which requires us to look at the guilty
material for longer. For an inference to be made automatically, appropriate supporting
associative semantic information must be present in the text. For example, McKoon and Ratcliff
(1986, 1989) showed that in a lexical decision task, the recognition of a word that is likely to
be inferred in a “strong association predicting context,” for example the word “sew” in (19), is
facilitated much more than the word that might be inferred in a “weak association context,” the
word “dead” in (20).

(19) The housewife was learning to be a seamstress and needed practice so she got out the skirt
     she was making and threaded her needle.
(20) The director and cameraman were ready to shoot close-ups when suddenly the actress fell
     from the 14th floor.

In both cases the target word is part of a valid inference from the original sentence, but whereas
“sew” is a semantic associate of the words “seamstress,” “threaded,” and “needle” in (19), the
word “dead” needs an inference to be made in (20). The actress does not have to die as a result
of this accident, and this conclusion is not supported by a strong associative link between the
words of the sentence (as would be the case if the material said, “the actress was murdered”).
Such inferences do not therefore have to be drawn automatically, and indeed may not ever be made.
(This is why this viewpoint is known as minimalist.)

Singer (1994) also provided evidence that bridging inferences are made automatically, but
elaborative inferences are not. He presented sentences (21), (22), and (23), and then asked
participants to verify whether “A dentist pulled a tooth.”

(21) The dentist pulled the tooth painlessly. The patient liked the method.
(22) The tooth was pulled painlessly. The dentist used a new method.
(23) The tooth was pulled painlessly. The patient liked the new method.

In (21) the statement to be verified is explicitly stated, so people are fast to verify the probe
statement. In (22) a bridging inference that the dentist is pulling the tooth is necessary to
maintain coherence; people are as fast to verify the probe as they are when it is explicitly
stated in (21). This suggests that the bridging inference has been made automatically in the
comprehension process. But in (23) people are about 250 ms slower to verify the statement; this
suggests that the elaborative inference has not been drawn automatically.

It now seems likely that only bridging or reference-related inferences necessary to maintain
the coherence of the text are made automatically during comprehension, and elaborative inferences
are generally only made later, during recall. Evidence supporting this is that people make
more intrusion inferences (the sort of elaborative inference where people think that something
was in the study material when it was not) the longer the delay between study and test (Dooling &
Christiaansen, 1977; Spiro, 1977). This is because people’s memory for the original material
deteriorates with time, and they have to do more reconstruction. Corbett and Dosher (1978) found
that the word “scissors” was an equally good cue for recalling each of the sentences (24)–(26):

(24) The athlete cut out an article with scissors for his friend.
(25) The athlete cut out an article for his friend.
(26) The athlete cut out an article with a razor blade for his friend.

The mention of a “razor blade” in sentence (26) blocks any inference being drawn then about
the use of scissors. One explanation of the finding that “scissors” is just as effective a cue is
that participants are working backwards at recall from the cue to an action, and then retrieving
the sentence. A problem with this sort of experiment, however, is that subsequent recall might
not give an accurate reflection of what happens when people first read the material.
Dooling and Christiaansen (1977) carried out an experiment similar to the Sulin and Dooling (1974) study with the “Gerald Martin” text. They tested the participants after 1 week, telling them that Gerald Martin was really Adolf Hitler. People still made intrusion errors that in this case could not have been made at the time of study. These results suggest that elaborative and reconstructive inferences are made at the time of test and recall, and when readers are reflecting about material they have just read (Anderson, 2010).

Garrod and Terras (2000) distinguished between two types of information that might assist in making a bridging inference. Consider the story in (27):

(27) Vlad drove to Memphis yesterday. The car kept overheating.

To maintain coherence, we make the inference that “the car” must be the one that Vlad drove to Memphis, even though the car has not yet been mentioned. “The car” is said to fill an open discourse role, and is linked to previous material by a bridging inference that maintains coherence. There are two types of information to do this that might be used here. First, there are lexical-semantic factors: “drive” implies using a vehicle of some sort. Second, there might be more general background contextual information. Garrod and Terras tried to tease apart the influence of these two factors in a study where they examined eye movements of participants reading stories such as (28) and (29):

(28) The teacher was busy writing a letter of complaint to a parent.
(29) The teacher was busy writing an exercise on the blackboard.

The discourse context in (28) is consistent with the instrumental filler “pen,” but in (29) it is consistent with “chalk.” In both cases, however, the lexical-semantic context of “write” is much more strongly associated with “pen” than with “chalk.” Now consider what happens when (28) and (29) are followed by the continuation (30):

(30) However, she was disturbed by a loud scream from the back of the class and the chalk/pen dropped on the floor.

What happens when the reader comes to the word “chalk” or “pen”? The analysis of eye movements indicates when readers are experiencing difficulty by telling us how long they are looking at particular items and whether they are looking back to re-examine earlier information. If role resolution is dominated by lexical-semantic context, then “pen” will be suggested by the lexical-semantic context of “write,” regardless of the discourse context it is in. This is what Garrod and Terras observed. People spent no longer looking at “pen” in either the appropriate or the inappropriate context, although the first-pass reading time of “chalk,” which is not so lexically constrained as “pen,” was affected by the context. That is, “writing on a blackboard” is just as good as “writing a letter.” The appropriateness of the discourse context does have a subsequent effect, however, in that inappropriate context has a delayed effect that makes people re-examine earlier material in both cases.

To account for these data, Garrod and Terras propose a two-stage model of how people resolve open discourse roles. The first stage is called bonding. In this stage, items that are suggested by the lexical context (e.g., “pen”) are automatically activated and bound with the verb. In the second stage of resolution the link between proposed filler and verb is tested against the discourse context. A non-dominant filler, such as “chalk,” cannot be automatically bound to the verb in the first stage, and causes some initial processing difficulty. The resolution process is a combination of automatic, bottom-up processing and non-automatic, contextual processing. Inference-making in comprehension involves both types of process.

Practical implications of research on inferences

Of course, there are some obvious implications for everyday life if we are continually making inferences on the basis of what we read and hear. Much social interaction is based on making inferences from other people’s conversation, and we have seen that these inferences are not always drawn
12. COMPREHENSION 371
correctly. There are two main applied areas where elaborative inferences are particularly important, and those are eyewitness testimony and methods of advertising.

The work of Loftus (1975, 1996) on eyewitness testimony is very well known. She showed how unreliable eyewitness testimony actually is, and how inferences based on the wording of questions could prejudice people’s answers. For example, the choice of either an indefinite article (“a”) or a definite article (“the”) influences comprehension. The first time something is mentioned, we usually use an indefinite article; after that, we can use the definite article. Sentence (31) is straightforward, but (32) is distinctly odd:

(31) A pig chased a cow. They went into a river. The pig got very wet.
(32) ? The pig chased a cow. They went into a river. A pig got very wet.

When we come across a definite article we make an inference that we already know something about what follows. Sometimes this can lead to memory errors. Loftus and Zanni (1975) showed participants a film of a car crash. Some participants were asked (33), while others were asked (34):

(33) Did you see a broken headlight?
(34) Did you see the broken headlight?

In fact, there was no broken headlight. Participants were more likely to respond “yes” incorrectly to question (34) than to question (33), because the definite article presupposes that a broken headlight exists. Loftus and Palmer (1974) also showed participants a film of a car crash. They asked some of the participants (35) and others (36) (see Figure 12.2):

(35) About how fast were the cars going when they hit each other?
(36) About how fast were the cars going when they smashed into each other?

Participants asked (36) reliably estimated the speed of the cars to be higher than those asked (35). A week later the participants that had been asked (36) were much more likely to think that they had seen broken glass than those asked (35), although broken glass had not been mentioned. The way a question is phrased can influence the inferences people make and therefore the answers that they give.

R. Harris (1978) simulated a jury listening to courtroom witnesses, and found that although participants were more likely to accept directly asserted statements as true than only implied statements for which they had to make an inference, there was still a strong tendency to accept the implied statements. Instructions to participants telling them to be careful to distinguish between asserted and implied information did not help either. Furthermore, this test took place only 5 minutes after hearing the statements, whereas in a real courtroom the delays can be weeks, and the potential problem much worse. Harris (1977) similarly found that people find it difficult
match anaphors to antecedents in the same relevant position. Anaphor resolution is more difficult when the expectations generated by this strategy are flouted. In (43) and (44) the appropriate order of antecedents and pronouns differs. In (43) “he” refers to “Vlad,” which comes first in “Vlad sold Dirk,” but in (44) “he” refers to “Dirk,” which comes second. Therefore (44) is harder to understand than (43).

(43) Vlad sold Dirk his broomstick because he hated it.
(44) Vlad sold Dirk his broomstick because he needed it.

We can distinguish two groups of further strategies: those dependent on the meaning of the actual words used, or their role in the sentence; and those dependent on the emergent discourse model.

Of the strategies dependent on the words used, one of the most obvious is the use of gender (Corbett & Chang, 1983):

(45) Agnes won and Vlad lost. He was sad and she was glad.

In (45) it is clear that “he” must refer to Vlad, and “she” to Agnes. Most of the evidence suggests that gender information is used automatically. Other experiments show that the effects of gender are more complicated and depend on what other referents are accessible at the time of reading. Arnold, Eisenband, Brown-Schmidt, and Trueswell (2000) examined eye movements to investigate how gender information is used. Participants examined pictures of familiar cartoon characters while listening to text. Arnold et al. found that gender information about the pronoun was accessed very rapidly (within 200 ms after the pronoun). If the picture contained both a female and a male character (e.g., Minnie Mouse and Donald Duck), participants were able to use the gender cue (“she” or “he”) very quickly to look at the appropriate picture. If the pictures were of same-sex characters (e.g., Mickey Mouse and Donald Duck), gender was no longer a cue, and participants took longer to converge on the picture to which the pronoun referred. However, Arnold et al. also manipulated order of mention, and this interacted with gender so that there was only evidence of an effect of gender on pronoun resolution for the less-accessible second-mentioned character. For the first-mentioned character, people looked quickly at the target no matter whether the gender was ambiguous or not. In summary, the effects of gender can only really be observed when we take into account what other information influences pronoun resolution. Rigalleau and Caplan (2000) found that people are slower to say the pronoun “he” when it is inconsistent with the only noun in the discourse (46) compared with when it is consistent (47):

(46) Agnes paid without being asked; he had a sense of honor.
(47) Boris cried in front of the grave; he had a tissue.

Rigalleau and Caplan suggest that pronouns become immediately and automatically related to possible antecedents. The resolution process that ultimately determines which of the possible antecedents is finally attached to the pronoun might depend on other factors. Resolution only involves attentional processing if the initial automatic processes fail to converge on a single noun as the antecedent, or if pragmatic information makes the selected noun an unlikely antecedent. Some techniques are better at establishing the time course of anaphor resolution than others. In particular, the use of probes, as used in the earlier studies, might disrupt the comprehension process, giving a misleading picture of what is happening.

Different verbs carry different implications about how the actors involved should be assigned to roles. If participants are asked to complete the sentences (48) and (49), they usually produce continuations in which “he” refers to the subject (Vlad) in (48), and the object (Boris) in (49). Verbs such as “sell” are called NP1 verbs, because causality is usually attributed to the first, subject, noun phrase; verbs such as “blame” are called NP2 verbs, because causality is usually attributed to the second, object, noun phrase (Grober, Beardsley, & Caramazza, 1978).
(48) Vlad sold his broomstick to Boris because he . . .
(49) Vlad blamed Boris because he . . .

When does implicit causality have its effect? Is it early, enabling us to focus on the appropriate antecedent, or late, facilitating the integration of material? The difference between the two possible time courses is whether or not causality information affects the initial processing of the “he” in (48) and (49). An experiment by Stewart, Pickering, and Sanford (2000) suggests that implicit causality only has a late effect. Stewart et al. manipulated information about the cause of an action, and about the type of anaphor used. They manipulated the implicit cause (through verb bias) and the explicit cause, which is derived from the whole sentence. The two types of cause could be either congruent or in conflict, a condition they called incongruent. They also manipulated whether the anaphor was a pronoun or a proper name. They measured ease of processing using self-paced reading. Sentence (50) is an example of a congruent condition with names, and (51) is an incongruent condition with pronouns (note that “apologize” is usually an NP1-bias verb).

(50) Daniel apologized to Arnold because Daniel had been behaving selfishly.
(51) Daniel apologized to Arnold because he didn’t deserve the criticism.

The pronoun “he” is ambiguous, whereas the name is not. The early-focus account predicts that we should determine the antecedent of the pronoun on the basis of the implicit causality bias of the verb. In incongruent sentences with pronouns, therefore, the early-focus account predicts conflict and therefore reading difficulty; this difficulty should not be present in the sentences with the unambiguous names instead of pronouns. So, if the early-focus account is correct, there should be an interaction between congruence and type of anaphor. Stewart et al. found no such interaction, a result that supports the late-integration account. Indeed, they found congruence mattered for just repeated names, suggesting that implicit bias has a late effect.

The second group of anaphor resolution strategies comprises those dependent on the perceived prominence of possible referents in the emergent text model. We might be biased, for example, to select the referent in the model that is most frequently mentioned. Antecedents are generally easier to locate when they are close to their referents than when they are farther away, in terms of the number of intervening words (Murphy, 1985; O’Brien, 1987). In more complicated examples alternatives can sometimes be eliminated using background knowledge and elaborative inferences, as in (52). Exactly how this background knowledge is used is unclear. In this case we infer that becoming a vegetarian would not make someone want to buy piglets, but more likely to sell them, as they would be less likely to have any future use for them.

(52) Vlad sold his piglets to Dirk because he had become a vegetarian.

Pronouns are read more quickly when the referent of the antecedent is still in the focus of the situation being discussed than when the situation has changed so that it is no longer in focus (Garrod & Sanford, 1977; Sanford & Garrod, 1981). Items in explicit focus are said to be foregrounded and have been explicitly mentioned in the preceding text. Such items can be referred to pronominally. Items in implicit focus are only implied by what is in explicit focus. For example, in (53) Vlad is in explicit focus, but the car is in implicit focus. It sounds natural to continue with “he was thirsty,” but not with “it broke down.” Instead, we would need to bring the car into explicit focus with a sentence like “his car broke down.”

(53) Vlad was driving to Philadelphia.

Experiments on reading time suggest that implicit focus items are harder to process. An item is likely to stay in the foreground if it is an important theme in the discourse, and such items are likely to be maintained in working memory. Pronouns with antecedents in the foreground, or topic antecedents, are read quickly, regardless of the distance between the pronoun and referent (Clifton & Ferreira, 1987). In conversation, we
do not normally start using pronouns for referents that we have not mentioned for some time. In general, unstressed pronouns are used to refer to the most salient discourse entity (the one at the center of focus), while definite noun phrase anaphors (e.g., “the intransigent vampire”) are used to refer to non-salient discourse entities (those out of focus).

In general, then, the more salient an entity is in discourse, the less information is contained in the anaphoric expression that refers to it. Almor (1999) proposed that NP anaphor processing is determined by informational load: this is the amount of information an anaphor contains. The informational load of an anaphor with respect to its antecedent should either aid the identification of the antecedent, or add new information about it, or both. The processing of anaphors is a balance between the benefits of maximum informativeness and the cost of minimizing working memory load. This idea that anaphor processing is a balance between informativeness and processing cost leads to several predictions. For example, anaphors with a high informational load with respect to their antecedent, but which do not add new information about them, will be difficult to process when the antecedent is in focus. Hence repetitive NP anaphors such as (54) will be difficult:

(54) It was the bird that ate the fruit. The bird seemed very satisfied.
(55) What the bird ate was the fruit. The bird seemed very satisfied.

Here the antecedent (“a bird”) is in focus and the default antecedent, so a pronoun (“it”) will do. The NP anaphor (“the bird”) has a high informational load, so it is not justified. It is only justified when the antecedent is out of focus (55), because then it aids the identification of the antecedent. Almor verified this prediction in a self-paced reading task. “The bird” was read more slowly when the antecedent was in focus (54) than when it was out of it (55). Hence the use and processing of pronominal and NP anaphors is a complex trade-off between informativeness, focus, and working memory load.

Given that there are a number of strategies for interpreting anaphors, how do we choose the best one? Badecker and Straub (2002) argue that all potential cues contribute to the selection of the appropriate antecedent. They propose an interactive parallel constraint model, where the multiple constraints influence the activation of the candidate entities. The more conflict there is, the more candidates there are, and the more plausible they are, the more difficult choosing an antecedent will be.

Accessibility

Some items are more accessible than others. We are faster at retrieving the referent of more accessible antecedents. At this stage some caution is necessary to avoid a circular definition of accessibility. Accessibility is a concept related both to anaphora and to the work on sentence memory. It can be measured by recording how long it takes participants to detect whether a word presented while participants are reading sentences is present in the sentence.

Common ground is shared information between participants in a conversation (Clark, 1996; Clark & Carlson, 1982). A piece of information is in the common ground if it is mutually believed by the speakers, and if all the speakers believe that all the others believe it to be shared. Information that is in the common ground should have particular importance in determining reference. The restricted search hypothesis states that the initial search for referents is restricted to entities in the common ground, whereas the unrestricted search hypothesis places no such restriction. That is, according to the restricted search hypothesis, things in the common ground should be more accessible than things that are not. The evidence currently favors the unrestricted search hypothesis (Keysar, Barr, Balin, & Paek, 1998). Consider (56) (Keysar et al., 1998, p. 5):

(56) “It is evening, and Boris’ young daughter is playing in the other room. Boris, who lives in Chicago, is thinking of calling his lover in Europe. He decides not to call because she is probably asleep given the transatlantic time difference. At that moment his wife returns home and asks, ‘Is she asleep?’”
How does Boris search for the referent of “she”? If the restricted search hypothesis were correct, and search is restricted to possible referents in the common ground, the lover should not be considered, as the wife is not informed about the lover. However, entities that are not in the common ground still interfere with reference resolution, as measured by error rates, verification times, and eye-movement measures. Although common ground might not restrict which possible referents are initially checked, it almost certainly plays an important later role in checking, monitoring, and correcting the results of the initial search. Conversants take into account what each other knows when establishing common ground (Knutsen & Le Bigot, 2012).

Generally we are biased toward referring back to the subject of a sentence; there is also an advantage to first mention. This means that participants that are mentioned first in a sentence are more accessible than those mentioned second. Gernsbacher and Hargreaves (1988) showed that there was an advantage for first mention, independent of other factors such as whether the words involved were subject or object. Gernsbacher, Hargreaves, and Beeman (1989) explored the apparent contradiction between first mention and recency, in that items that are more recent should also be more accessible. Gernsbacher and Hargreaves explained this with a constructionist, structure-building account: The goal of comprehension is to build a coherent model of what is being comprehended. Comprehenders represent each clause of a multi-clause sentence with a separate substructure, and have easiest access to the substructure they are currently working on. However, at some point the earlier information becomes more accessible because it serves as a foundation for the whole sentence-level representation. So it is only as the representation is being developed that recency is important. Recency is detectable only when accessibility is measured immediately after the second clause; elsewhere first mention is important, and has the more long-lasting effect. This explanation is reminiscent of Kintsch’s propositional model, discussed later, and shows how it is possible to account for anaphor resolution in terms of the details of the emergent comprehension model.

The given–new contract

One of the most important factors that determines comprehensibility and coherence is the order in which new information is presented relative to what we know already. Clearly this affects the ease with which we can integrate the new information into the old. It has been argued that there is a “contract” between the writer and the reader, or participants in a conversation, to present new information so that it can easily be assimilated with what people already know. This is called the given–new contract (Clark & Haviland, 1977; Haviland & Clark, 1974). It takes less time to understand a new sentence when it explicitly contains some of the same ideas as an earlier sentence than when the relation between the content of the sentences has to be inferred.

Utterances are linked together in discourse so that they link back to previous material and forward to material that can potentially be the focus of future utterances. Centering theory, developed in AI models of text processing, provides a means of describing these links (Gordon, Grosz, & Gilliom, 1993; Grosz, Joshi, & Weinstein, 1995). According to centering theory, each utterance in coherent discourse has a single backward-looking center that links to the previous utterance, and one or more forward-looking centers that offer potential links to the next utterance. People prefer to realize the backward-looking center as a pronoun. The forward-looking centers are ranked in order of prominence, according to factors such as the position in the sentence and the stress. The reading times of sentences increase if these rules are violated. For example, people actually take longer to read stories where proper names are repeated compared with sentences where appropriate pronouns are used.

Summary of work on memory, inferences, and anaphora

Any model of comprehension must be able to explain the following characteristics. We read for gist, and very quickly forget details of surface form. Comprehension is to some extent a constructive process: We build a model of what we are processing, although the level of detail
involved is controversial. At the very least, we make inferences to maintain coherence. One of the most important mechanisms involved in this is anaphor resolution. Inferences soon become integrated into our model as we go along, and we are very soon unable to distinguish our inferences from what we originally heard. There is a foreground area of the model containing important and recent items, so that they are more accessible.

MODELS OF TEXT PROCESSING

We now examine some models of how we represent and process text. AI has heavily influenced models of comprehension. Although the ideas thus generated are interesting and explicit, there is a disadvantage that the specific mechanisms we use are unlikely to be exactly the same as the explicit mechanisms used to implement the AI concepts.

Propositional network models of representing text

The meaning of sentences and text can be represented by a network where the intersections (or nodes) represent the meaning of words, and the connections represent the relations between words. This approach is related to Fillmore’s (1968) theory of case grammar, which in turn was derived from generative semantics, a grammatical theory that emphasized the importance of semantics. Case grammar emphasizes the roles, or cases, played by what the words refer to in the sentence. It emphasizes the relation between verbs and the words associated with them. (Cases are more or less the same as thematic roles; see Box 10.1 for some examples.) One disadvantage of case grammar is that there is little agreement over exactly what the cases that describe the language should be, or even how many cases there are. This lack of agreement about the basic units involved is a common problem with models of comprehension.

In network models, sentences are first analyzed into propositions. A proposition is the smallest unit of meaning that can be put in predicate-argument form (with a verb operating on a noun). A proposition has a truth value; that is, we can say whether it is true or false. For example, the words “witch” and “cackle” are not propositions: They are unitary and have no internal structure, and it is meaningless to talk of individual words being true or false. On the other hand, “the witch cackles” contains a proposition. This can be put in the predicate-argument form “cackle(witch),” which does have a truth value: the witch is either cackling or she isn’t.

FIGURE 12.3 An example of a simplified propositional network underlying the sentence “Vlad gave his yellow broomstick to the old witch.” (Node labels in the figure: Vlad, give, past, broomstick, witch, yellow, own, old.)

Propositions are connected together in propositional networks, as in Figure 12.3. The model of Anderson and Bower (1973) was particularly influential. Originally known as HAM (short for Human Associative Memory), the model evolved first into ACT (short for Adaptive Control of Thought; see Anderson, 1976) and later ACT* (pronounced “ACT-star”; Anderson, 1983). These models include a spreading activation model of semantic memory, combined later with a production system for executing higher level operations. A production system is a series of if–then rules: if x happens, then do y. ACT* gives a good account of fact retrieval from short stories. For example, the more facts there are associated with a concept, the slower the retrieval of any one of those facts.
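The retrieval slowdown just described (the more facts associated with a concept, the slower the retrieval of any one of them) can be pictured with a toy calculation. The sketch below is only an illustration and is not the ACT* model itself: it assumes that a cue releases a fixed pool of activation that is shared equally among the facts stored with a concept. The names MEMORY, TOTAL_ACTIVATION, and activation_per_fact, and the example facts, are hypothetical and chosen purely for illustration.

```python
# Hypothetical toy memory: each concept maps to the facts stored about it.
MEMORY = {
    "witch": ["the witch cackles"],
    "Vlad": [
        "Vlad sold his broomstick",
        "Vlad drove to Memphis",
        "Vlad became a vegetarian",
    ],
}

# Assumption: every cue releases the same fixed amount of activation.
TOTAL_ACTIVATION = 1.0


def activation_per_fact(concept, memory=MEMORY, total=TOTAL_ACTIVATION):
    """Return the share of activation reaching each fact linked to a concept.

    Assumes equal sharing: the pool is divided by the number of associated
    facts (the "fan" of the concept).
    """
    facts = memory[concept]
    if not facts:
        raise ValueError(f"no facts stored for {concept!r}")
    return total / len(facts)


if __name__ == "__main__":
    for concept in MEMORY:
        share = activation_per_fact(concept)
        print(f"{concept}: fan={len(MEMORY[concept])}, share per fact={share:.2f}")
```

Under these assumptions, a concept with three stored facts passes only a third as much activation to each fact as a concept with a single fact, which is one way of picturing why each individual fact would be retrieved more slowly.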
This is known as the fan effect (Anderson, 1974, 2010). When you are presented with a stimulus, activation spreads to all its associates. There is a limit to the total amount of activation, however, so the more items it spreads to, the less each individual item can receive.

Another influential network model has been the conceptual dependency theory of Schank (1975). This starts off with the idea that meaning can be decomposed into small, atomic units. Text is represented by decomposing the incoming material into these atomic units, and by building a network that relates them. An important intermediate step is that the atomic units of meaning are combined into conceptualizations that specify the actors involved in the discourse and the actions that relate them. Once again, this approach has the advantage that, because it has been implemented (in part) as a computer simulation, its assumptions and limitations are very clear.

Evaluation of propositional network models

Just as there is little agreement on the cases to use, so there is little agreement on the precise types of roles and connections to use. If we measure propositional networks against the requirements listed for memory, inferences, and anaphora, we can see that they satisfy some of the requirements, but leave a lot to be desired as models of discourse processing. Most propositional network models show how knowledge might be represented, but they have little to say about when or how we make inferences, or how some items are maintained in the foreground, or how we extract the gist from text (Johnson-Laird et al., 1984; Woods, 1975). Propositional networks by themselves are inadequate as a model of comprehension, but form the basis of more complex models. Kintsch’s construction–integration model (see later) is based on a propositional model, but includes explicit mechanisms for dealing with the foreground and making inferences.

Story grammars

Stories possess a structure: they have a beginning, a middle, and an end. The structure present in stories is the basis of story grammars, which are analogous with sentence grammar. Stories have an underlying structure, and the purpose of comprehension is to reconstruct this underlying structure. This structure includes settings, themes, plots, and how the story turns out (see Mandler, 1978; Mandler & Johnson, 1977; Rumelhart, 1975, 1977; and Thorndyke, 1977, for examples).

Like sentence grammars, story grammars are made out of phrase-structure rules (see the example in Box 12.1). The nature of the syntactic rules in Box 12.1 is expanded by a corresponding semantic rule: for example, once you have a setting then an episode is possible. You can draw tree structures just as with sentences, hence emphasizing their hierarchical structure. The basic units, corresponding to individual words in sentence grammars, are propositions, which are eventually assigned the lowest-level slots.

In the recall, paraphrasing, and summarizing of stories, the less important details are omitted. According to story grammars, humans compute the importance of a sentence or a fact by its height in the hierarchy. Cirilo and Foss (1980) showed that participants spend more time reading sentences high in the structure than those low down in the structure. However, any sensible theory of text processing should predict that we pay more attention to the important elements of a story.

Thorndyke (1977) presented participants with one of two simple stories. The story “Circle

Box 12.1 Example of a fragment of a story grammar (based on Rumelhart, 1975)

Story → Setting + theme + plot + resolution
Setting → Characters + location + time
Theme → (Event)* + goal
Plot → Episode*
Episode → Subgoals + attempt* + outcome
Asterisks show the element can be repeated.
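The rewrite rules in Box 12.1 can be treated as a small data structure and expanded into a tree, which makes the hierarchical structure of a story grammar concrete. The following Python sketch is an illustration only, not an implementation of Rumelhart’s (1975) grammar. The names GRAMMAR, expand, and leaves are hypothetical, symbols are lowercased, “(Event)*” is written as “event*”, and a single repeats parameter is assumed to stand in for however many times a starred element recurs.

```python
# Rules transcribed from Box 12.1. A trailing "*" marks an element that can
# be repeated. Symbols with no rule of their own are treated as terminals.
GRAMMAR = {
    "story": ["setting", "theme", "plot", "resolution"],
    "setting": ["characters", "location", "time"],
    "theme": ["event*", "goal"],
    "plot": ["episode*"],
    "episode": ["subgoals", "attempt*", "outcome"],
}


def expand(symbol, repeats=1):
    """Expand a symbol into a nested (symbol, children) tree.

    Every starred element is repeated `repeats` times (an assumption made
    for illustration; the box only says such elements can be repeated).
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    children = []
    for part in GRAMMAR.get(symbol, []):
        if part.endswith("*"):
            for _ in range(repeats):
                children.append(expand(part[:-1], repeats))
        else:
            children.append(expand(part, repeats))
    return (symbol, children)


def leaves(tree):
    """Return the lowest-level slots of a tree, in left-to-right order."""
    symbol, children = tree
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in leaves(child)]


if __name__ == "__main__":
    print(leaves(expand("story", repeats=1)))
```

In this sketch the lowest-level slots returned by leaves are the positions to which propositions would eventually be assigned, and the nesting depth of a node corresponds to its height in the hierarchy.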
sentences, the analysis of stories is content-dependent. Story grammars fail to provide an account of how stories are actually produced or understood. (See Black & Wilensky, 1979; Garnham, 1983b; Johnson-Laird, 1983; and Wilensky, 1983, for details of these criticisms.) Given the fundamental nature of some of these difficulties, story grammars are no longer influential in comprehension research.

Schema-based theories

The idea of a schema (the plural can be either schemata or schemas) was originally introduced by Bartlett (1932). Bartlett argued that memory is determined not only by what is presented, but also by the prior knowledge a person brings to the story. He presented people with stories that conflicted with prior knowledge, and observed that over time people’s memory for the story became increasingly distorted in the direction of fitting in with their prior knowledge.

A schema is an organized packet of knowledge that enables us to make sense of new knowledge. It is related to ideas in both AI on visual object recognition (Minsky, 1975) and experimental psychology (Posner & Keele, 1968). The schema gives knowledge-organizing activation that means that the whole is greater than the sum of its parts. It can be conceptualized as a series of slots that can be filled with particular values. Anderson (2010) gives the following example (60) of a possible schema for a house.

(60) House schema:
Isa: building
Parts: rooms
Materials: bricks, stone, wood
Function: human dwelling
Shape: rectilinear, triangular
Size: 100–10,000 square feet

There are four central processes involved in schema formation. First, the appropriate aspects of the incoming stimuli must be selected. Second, the meaning must be abstracted, and syntactic and lexical details dispensed with. Third, appropriate prior knowledge must be activated to interpret this meaning. Finally, the information must be integrated to form a single holistic representation.

The idea of a schema cannot in itself account for text processing, but it is a central concept in many theories. Although it provides a means of organization of knowledge, and explains why we remember the gist of text, it does not explain how we make inferences, how material is foregrounded, or why we sometimes remember the literal meaning. To solve these problems the notion must be supplemented in some way.

Scripts

A script is a special type of schema (Schank & Abelson, 1977). Scripts represent our knowledge of routine actions and familiar repeated sequences. Scripts include information about the usual roles, objects, and the sequence of events to be found in an action; they enable plans to be made and allow us to draw inferences about what is not explicitly mentioned. Two famous examples are the “restaurant script” and the “attending a lecture script” (see Table 12.1).

Psychological evidence for the existence of scripts comes from an experiment by Bower et al. (1979). Bower et al. asked participants to list about 20 events in activities such as visiting a restaurant, attending a lecture, getting up in the morning, visiting the doctor, or going shopping. Some examples are shown in Table 12.1. Items labeled (1) were mentioned by the most participants and are considered the most important actions in a script; items labeled (2) were mentioned by fewer participants; and items labeled (3) were mentioned by the fewest participants. These are considered the least important parts of the script. The events are shown in the order in which they were usually mentioned. All of these events were mentioned by at least 25% of the participants. Hence participants agree about the central features that constitute a script, and their relative importance.

Scripts are useful in explaining some results of experiments on anaphoric reference. Walker and Yekovich (1987) showed that a central concept of a script (such as a “table” in the restaurant script) was comprehended faster (regardless of whether it was explicitly mentioned in the story)
than a peripheral concept. Peripheral concepts of scripts were dealt with particularly slowly when their antecedents were only implied. That is, we find it easier to assign referents to the important elements of scripts.
TABLE 12.1 Examples of scripts (based on Bower et al., 1979).
Visiting a restaurant script    Attending a lecture script
Open door                 3     Enter room                1
Enter                     2     Look for friends          2
Give reservation name     2     Find seat                 1
Wait to be seated         3     Sit down                  1
Go to table               3     Settle belongings         3
Be seated                 1     Take out notebook         1
Order drinks              2     Look at other students    2
Put napkins on lap        3     Talk                      2
Look at menu              1     Look at lecturer          3
Discuss menu              2     Listen to lecturer        1
Order meal                1     Take notes                1
Talk                      2     Check time                1
Drink water               3     Ask questions             3
Eat salad or soup         2     Change position in seat   3
Meal arrives              3     Daydream                  3
Eat food                  1     Look at other students    3
Finish meal               3     Take more notes           3
Order dessert             2     Close notebook            2
Eat dessert               2     Gather belongings         2
Ask for bill              3     Stand up                  3
Bill arrives              3     Talk                      3
Pay bill                  1     Leave                     1
Leave tip                 2
Get coats                 3
Leave                     1
Items labeled (1) are considered most important, (3) least important.
Occasionally events happen that are not in the script: for example, the waiter might spill the soup on you. Schank and Abelson (1977) referred to such interruptions as obstacles or distractions, because they get in the way of the main purpose of the script (here, eating). Bower et al. made predictions about two types of event in stories relating to scripts. First, distractions that interrupt the purpose of the script should be more salient than the routine events, and should therefore be more likely to be remembered. Second, events that are irrelevant to the purpose of the script (such as the color of the waiter’s shoes) should be poorly remembered. Both of these predictions were verified.
Schank (1982) pointed out that most of life is not governed by predetermined, over-learned sequences such as those encapsulated by a script. Knowledge structures need to be flexible. Dissatisfied with this limitation of scripts, Schank focused on the role of reminding in memory. He argued that memory is a dynamic structure driven by its failures. Memory is organized into different levels, starting at the lower end with scenes. Examples of these in what would earlier have been called a “going to the doctor script” include “reception room scene,” “waiting scene,” and “surgery scene.” Scenes are organized into memory organization packets or MOPs, which are all linked by being related to a particular goal. In any enterprise, more than one MOP might be active at once. MOPs are themselves organized into meta-MOPs if a number of MOPs have something in common (for example, all MOPs involving going on a trip). At a higher level than MOPs and meta-MOPs are thematic organization points or TOPs, which deal with abstract information independent of particular physical or social contexts.
There is some support for MOPs from a series of experiments by McKoon, Ratcliff, and Seifert (1989) and Seifert, McKoon, Abelson, and Ratcliff (1986). They showed that elements of MOPs could prime the retrieval of other elements from the same MOP. Participants read a number
in the model (Glenberg, Meyer, & Lindem, 1987; Rinck & Bower, 1995). In Rinck and Bower’s experiment, participants memorized a diagram of a building and then read a story describing characters’ activities in the building. The reading times of sentences increased with the number of rooms between the room containing an object mentioned in the sentence and the room where the protagonist of the story was currently located.
Mental models represent more than spatial information, however. There is agreement that they are multidimensional and represent five kinds of information: spatial, causal, and temporal information, information about people’s goals, and information about the characteristics of people and objects (Zwaan & Radvansky, 1998). There is some evidence that different aspects of the mental model are maintained independently in working memory. Friedman and Miyake (2000) had people read short stories while responding to spatial and causal probe questions. They found that the spatial measures were influenced by the spatial demands of the texts, but not the causal demands, whereas the causal measures were only influenced by the causal demands. Spatial aspects of the text become encoded in spatial memory, but the causal aspects become encoded in verbal memory.
The mental models approach is an extreme version of a constructionist approach. Indeed, Brewer (1987) distinguished mental models from other approaches by saying that rather than accessing pre-existing structures, mental models are specific knowledge structures constructed to represent each new situation, using general information such as knowledge of spatial relations and general knowledge. Exactly how this construction takes place, and the precise nature of the representations involved, is sometimes unclear.
Updating the model
Text processing is dynamic. As people comprehend text, and new material becomes available, they have to update their mental representations. Zwaan and Madden (2004) distinguish two approaches to how updating occurs. According to the here-and-now model, information that is currently relevant to the protagonist of the text is more available than less relevant material (Morrow, Greenspan, & Bower, 1987; Morrow et al., 1989; Zwaan & Radvansky, 1998). According to the resonance model, new information resonates with all information in memory, even with information that is not apparently immediately relevant or up-to-date (Myers & O’Brien, 1998). Importantly, passive reactivation of old material cannot be prevented: all immediately irrelevant information will become active as long as it is related. Zwaan and Madden show that comprehenders can update situation models with new information that is consistent with the current situation, but inconsistent with the prior situation, as easily as material that was never inconsistent with the prior situation. This finding suggests that the most important determinant of updating is what is currently available, and new information does not resonate with all information in memory. However, the findings in this sort of experiment are very sensitive to the details of the materials used, and this conclusion is controversial (e.g., see O’Brien, Cook, & Peracchi, 2004; O’Brien, Rizzella, Albrecht, & Halleran, 1998).
Time is clearly an important determinant of how we construct models. In addition to the absolute time (the time at which information becomes available in real time), relative time in a story is also important. A story unfolds in time, with the focus continually shifting. As a consequence, some events are immediate, some are in the recent past, and some are perhaps quite a long time away. Relative time can affect the accessibility of entities in a model. Entities are less accessible when the temporal distance between the “now” point and the past is long rather than short: readers need to take more time to access entities remote in time. However, the effect of relative time only applies to consecutive events (Kelter, Kaup, & Claus, 2004). The critical comparison is the difference between sentences such as (61) and (62).
(61) She then goes to the hairdresser and buys hairspray.
(62) She then goes to the hairdresser and gets a perm.
There is no difference in utterance length here, but more time is likely to elapse in (62) than
in (61). Entities mentioned before these critical sentences take longer to access after (62) than after (61).
Given the importance of relative time in the model, people pay particular attention to words and phrases that indicate relative time, particularly those that indicate a shift of time in the narrative. Words and phrases such as “later” or “two days later” are called segmentation markers: they tell the reader that there is a temporal discontinuity and a potential shift of topic. People take longer to read the first sentence after a shift of topic (an effect called the boundary effect), but this penalty is lessened if the shift is flagged by a segmentation marker (Bestgen & Vonk, 2000).
Kintsch’s construction–integration model
Kintsch (1988) described a detailed and plausible model of spoken and written text comprehension known as the construction–integration model. This model emphasizes how texts are represented in memory and understanding, and how they are integrated into the comprehender’s general knowledge base. The construction–integration model combines aspects of the network, schema-based, and mental model approaches. Text is represented at a number of levels and processed cyclically in two phases. A text base is created from the linguistic input and from the comprehender’s knowledge base in the form of a propositional network. The text base is used to form the situation model (which can also be represented propositionally), where the individuality of the text has been lost, and the text has been integrated with other information to form a model of the whole situation described in the text.
The early version of the model (Kintsch, 1979; Kintsch & van Dijk, 1978) is a sophisticated propositional network. The input is dealt with in processing cycles. Short-term memory acts as a buffer to store incoming material. We build up a representation of a story given two inputs: the story itself, and the goals of the reader. The goals and knowledge of the reader are represented by the goal schema, which does things such as stating what is relevant, setting expectations, and demanding that certain inferences be drawn if needed facts are not explicitly stated.
Text is represented in the form of a network of connected propositions or facts called a coherence graph. The coherence graph is built up hierarchically. This text base has both a microstructure and a macrostructure. The microstructure is this network of connected propositions. In processing text, we work through it in input cycles that usually correspond to a sentence, with an average size of seven propositions. In each cycle the overlap of the proposition arguments is noted; propositions are semantically related when they share arguments. If there is no overlap between the incoming propositions and the propositions currently in working memory, then there must be a time-consuming process of inference involving a reinstatement search (a search of long-term memory). If there is overlap, then the new propositions are connected to the active part of the coherence graph by coherence rules. The macrostructure concerns the higher level of description and the processes operating on that. Relevant schemas are retrieved in parallel from long-term memory. The knowledge base in long-term memory is stored in an associative network. Rules called macrorules provide operations that delete propositions from microstructure, summarize propositions, and construct inferences (e.g., to fill gaps in the text). Script-like information would be retrieved at this stage. The final situation model represents the text, but in it the individuality of the text has been lost, and the text has been integrated with other information into a larger structure. Temporality, causality, and spatiality are particularly important in the situation model (Gernsbacher, 1990). Reading time studies suggest that comprehenders pay particular attention to these aspects of text (Zwaan, Magliano, & Graesser, 1995).
As the text is being processed, certain propositions will be stored in working memory. As this has a limited capacity, what determines what goes into this buffer? First, recency is important. Second, the level at which a proposition is stored is important, with propositions higher in the coherence graph more likely to receive more processing cycles in working memory.
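The argument-overlap step of the early model can be illustrated with a short Python sketch. This is only a toy illustration under simplifying assumptions, not Kintsch and van Dijk's implementation: a proposition is reduced to a predicate plus a set of arguments, working memory and long-term memory are plain lists, and the names Proposition, shares_argument, and process_cycle are hypothetical. Treating a failed reinstatement search as a case where an inference must be constructed is also an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """A proposition reduced to a predicate and its arguments (illustrative only)."""
    predicate: str
    arguments: frozenset


def shares_argument(p, q):
    """Two propositions count as related when they share at least one argument."""
    return bool(p.arguments & q.arguments)


def process_cycle(incoming, working_memory, long_term_memory):
    """Classify each incoming proposition for one input cycle.

    Outcomes (hypothetical labels):
      "connected"  - overlaps with a proposition held in working memory
      "reinstated" - no overlap in working memory, but a search of
                     long-term memory finds an overlapping proposition
      "inference"  - no overlap anywhere (assumed to need a new inference)
    """
    results = []
    for prop in incoming:
        if any(shares_argument(prop, held) for held in working_memory):
            results.append((prop, "connected"))
        elif any(shares_argument(prop, old) for old in long_term_memory):
            results.append((prop, "reinstated"))
        else:
            results.append((prop, "inference"))
    return results


def make_prop(predicate, *args):
    """Convenience constructor for the toy propositions below."""
    return Proposition(predicate, frozenset(args))


# Invented example propositions, not taken from any experiment.
working_memory = [make_prop("ENTER", "MARY", "RESTAURANT")]
long_term_memory = [make_prop("OWN", "JOHN", "CAR")]
incoming = [
    make_prop("ORDER", "MARY", "SOUP"),
    make_prop("BREAK-DOWN", "CAR"),
    make_prop("RAIN", "WEATHER"),
]

outcomes = [label for _, label in process_cycle(incoming, working_memory, long_term_memory)]
print(outcomes)  # prints ['connected', 'reinstated', 'inference']
```

The sketch ignores overlap among the incoming propositions themselves and the limited capacity of the buffer; both would need to be added for anything closer to the model described in the text.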
The construction–integration model itself (Kintsch, 1988) keeps most of the features of the earlier model (see Figure 12.4). In the construction phase of processing, word meanings are activated, propositions formed, and inferences made, by the mechanisms described earlier. The initial stages of processing are bottom-up. In the integration phase, the network of interrelated items is integrated into a coherent structure. The text base constructed in the construction phase may contain contradictions or may be incorrect, but any contradictions are resolved in the integration phase by a process of spreading activation around the network until a stable, consistent structure is attained. Information is represented at four levels: the microstructure of the text; the local structure (sentence-by-sentence information integrated with information retrieved from long-term memory); the macrostructure (the hierarchically ordered set of propositions derived from the microstructure); and the situation model (the integration of the text base, that is, microstructure and macrostructure together, with the results of inferences).
Evaluation of the construction–integration model
The construction–integration model explains many experimental findings. First, the more propositions there are in a passage, the longer it takes to read per word (Kintsch & Keenan, 1973). Second, as we have seen, there is a levels effect in the importance of a proposition owing to the multiple processing of high-level propositions. They are held in working memory longer, and elaborated more. Whenever a proposition is selected from working memory, its probability of being reproduced increases. Kintsch and van Dijk (1978) found that the higher the level of a proposition, the more likely it is to be recalled in a free recall task.
Inferences are confused with original material because the propositions created as a result of the inferences are stored along with explicitly presented propositions. The two sorts of proposition are indistinguishable in the representation. That this depends on the operation of goal and other schemas also explains why material can be hard to understand and remember if we do not know what it is about. We remember different things if we change perspective because different goal schemas become active.
The model also explains readability effects and the difficulty of the text. Kintsch and van Dijk (1978) defined the readability of a story as the number of propositions in the story divided by the time it takes to read it. The best predictors of readability turned out to be the frequency of words in the text and the number of reinstatement searches that have to be made, as predicted from the model. Kintsch and Vipond (1979) confirmed that readability is not determined solely by the text, but is an interaction between the text and the readers. The most obvious example is that reinstatement searches are only necessary when a proposition is not in working memory, and obviously the greater the capacity of an individual’s working memory, the less likely such a reader is to need to make reinstatement searches. Daneman and Carpenter (1980) found that individual differences in working memory size can affect reading performance. So if you want to write easily readable text, you should use short words, and try to write so as to avoid the reader having to retrieve a lot of material from long-term memory.
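The definition of readability as propositions divided by reading time can be written out as a small Python sketch. The function name readability, the choice of seconds as the time unit, and the two passages are all assumptions made for illustration; the numbers are invented and are not data from Kintsch and van Dijk (1978).

```python
def readability(n_propositions, reading_time_s):
    """Readability as propositions read per second of reading time."""
    if reading_time_s <= 0:
        raise ValueError("reading time must be positive")
    return n_propositions / reading_time_s


# Two hypothetical passages with the same number of propositions.
# The second takes twice as long to read, so it is half as readable.
easy = readability(40, 80.0)
hard = readability(40, 160.0)
print(easy, hard)  # prints 0.5 0.25
```

On this measure a higher value means an easier text, so anything that slows reading without adding propositions, such as an extra reinstatement search, lowers readability.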
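FIGURE 12.4 Construction–integration model (Kintsch, 1988), comprising a construction phase and an integration phase.
The model can explain some differences between good and poor readers. Vipond (1980)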
presented readers with passages containing technical material. Comprehension ease could be predicted from the number of times a reader must make a reinstatement, by the number of propositions reinstated, by the number of inferences and reorganizations required to keep the network coherent, and by the number of levels in the network required to represent the text. Vipond examined how these variables operate at the microlevel (to do with sentences) and the macrolevel (to do with paragraphs). He found that involvement of microprocesses predicts the reading performance of less skilled readers, whereas the involvement of macroprocesses predicts the reading performance of better readers. He argued that microprocesses have greater influence in question answering, recognition, and locating information in text, whereas macroprocesses have greater influence in integration and long-term retention.
Fletcher (1986) examined eight strategies that participants might use to keep propositions in the short-term buffer. Four were local strategies (“most recent proposition”; “most recent topical,” the first agent or object mentioned in a story; “most recent containing the most frequent argument”; and “leading edge,” a combination of the most recent and the most important proposition) and four were global strategies (“follow a script”; “correspond to the major categories of a story grammar”; “indicate a character’s goal or plan”; “are part of the most recent discourse topic”). These were tested against 20 texts in a recall task and a “think-aloud protocol” task, where participants had to read the story and elaborate out loud. There was no clear preference for local versus global strategies, although the “plan/goal” strategy was top in both tasks, and story structure was also important. There were large task differences: for example, frequency was bottom in recall but third most used in the protocol task.
Finally, the model also predicts how good memory is for text and how prior knowledge affects the way in which people answer questions (Kintsch, Welsch, Schmalhofer, & Zimny, 1990). The more background knowledge comprehenders have, the more likely they are to answer questions based on their situation model. Comprehenders with less prior knowledge rely more on the surface detail in the text base to answer questions.
Comparison of models
Story grammars suffer from a number of problems. In particular, it is difficult to agree on what the terminal and non-terminal categories and rules of the grammar should be. Propositional networks and schema models, while providing useful constructs, are not in themselves sufficient to account for all the phenomena of text processing. Of these models, Kintsch’s is the most detailed and promising, and as a consequence has received the most attention. It combines the best of schema-based and network-based approaches to form a well-specified mental model theory.
INDIVIDUAL DIFFERENCES IN COMPREHENSION SKILLS
Throughout this book, we have seen that there are individual differences in reading skills, and the same is true of text processing: people differ in their ability to process text effectively. There are a number of ways in which people differ in comprehension abilities, and a number of reasons for these differences. For example, less skilled comprehenders draw fewer inferences when processing text or discourse, and are also less well able to integrate meaning across utterances (Oakhill, 1994; Yuill & Oakhill, 1991). Working memory plays a role in these difficulties, but is unlikely to be the sole reason.
Working or short-term memory is used for storing currently active ideas, and for the short-term storage of mental computations (Baddeley, 1990). Differences in working memory span have a number of consequences for the ability to understand text (Singer, 1994). For example, a high span will enable an antecedent to be kept active in memory for longer, and will enable more elaborative inferences to be drawn. A useful measure of working memory capacity for text processing is reading span as defined by Daneman and Carpenter (1980). People hear or read sets of unrelated sentences, and after each set attempt to recall the last word of each
sentence. Reading span is the largest set size for which a participant can correctly recall all the last words. Reading span correlates with the ability to answer questions about texts, with pronoun resolution accuracy, and even with general measures of verbal intelligence such as SAT scores (a standardized test of academic and intellectual achievement in the USA, standing for Scholastic Assessment Test). Daneman and Carpenter argued that reading span gives a much better measure of comprehension ability than traditional word span scores. On the other hand, it has proved much harder to find effects of memory capacity on elaborative inferences, perhaps because optional elaborations are not always reliably inferred by readers (Singer, 1994). Less able readers are also more prone to mind wandering when reading (McVay & Kane, 2012), suggesting that attentional control and executive processing also play an important role in skilled reading, in addition to working memory capacity.
We saw earlier that prior knowledge influences comprehension. Possessing prior knowledge can be advantageous. In general, the more you know about a subject, the easier it is to understand and remember related text. (You can easily verify this for yourself by picking up a book or an article on a topic you know nothing about.) Prior knowledge provides a framework for understanding new material, activates appropriate concepts more easily, and affects the processing of inferences. It helps us to decide what is important and relevant in material and what is less so. The effects of prior knowledge can be quite specific (Singer, 1994). Although experts are more accurate and faster than novices at making judgments about statements related to their expertise, this advantage does not carry over to material in the same text that is not related to their expertise, and does not help in making complicated elaborative inferences.
Skilled comprehenders are also better able to suppress irrelevant and inappropriate material (Gernsbacher, 1997). Suppression can be distinguished from the related attentional process of inhibition that is important in attentional expectancy-based priming (see Chapter 6). Suppression is the attenuating of activation, whereas inhibition is the blocking of activation (Gernsbacher, 1997). Suppression requires that material becomes activated before it can be suppressed. Reading activates a great deal of material, and skilled comprehenders are better able to suppress that material that is less relevant to the task at hand. Suppression reduces interference. Less skilled comprehenders are less efficient at suppressing the inappropriate meaning of homonyms such as SPADE (Gernsbacher, Varner, & Faust, 1990). When presented with the test word “ace” 850 ms after the sentence “He dug with a spade,” skilled comprehenders showed no interference, but less skilled comprehenders took longer to reject the test word. Less skilled comprehenders are also less efficient at suppressing the activation of related pictures when reading words. They are even less good at processing puns; this is because they are less able to quickly suppress the contextually appropriate meaning of a pun (Gernsbacher, 1997).
Finally, although there has been considerable debate as to the exact mechanisms involved, some cognitive abilities decline with normal aging (Woodruff-Pak, 1997). There is experimental evidence that young people are more effective at relating ideas in text (Cohen, 1979; Singer, 1994). Healthy elderly people are less efficient at suppressing irrelevant material than young people.
How to become a better reader
We saw in Chapter 7 that increases in reading speed are at the cost of impaired comprehension. However, psycholinguistics has suggested a number of tips about how one’s level of comprehension of text can be improved.
You can improve your reading ability by providing yourself with a framework. One of the best known methods for studying is called the PQ4R method (Anderson, 2010; Thomas & Robinson, 1972) (see Figure 12.5). This method emphasizes identifying the key points of what you are reading, and adopting an active approach to the material. In terms of Kintsch’s model, this enables appropriate goal schemas to be activated right from the start. It also enables you to process the material more deeply, and think about its implications. Material should also be related to prior knowledge. The technique also maximizes memory retention. Making up questions and answering
them is known to improve memory, with question-making the more effective of the two (Anderson, 2010). Finally, elaborative processing of material is highly beneficial; we saw earlier that we tend to remember our inferences. The PQ4R method makes incidental use of all of these insights. The method goes like this. It can be applied either to a whole book or to just one chapter in a book:
• Preview. Survey the material to determine what is discussed in it. Examine the contents list. If the book or chapter has an introduction or summary, read it. Read the conclusions. Look at the figures and tables to get a feel for what the material is about. Identify the sections to be read as single units. Apply the next four steps to each section.
• Questions. Make up questions for each section. Try to make your questions related to your goals in reading the material. You can sometimes turn section headings into questions. (I’ve already tried to do this where possible in this book.)
• Read. Read the material carefully, trying to answer the questions you made up.
• Reflect. Reflect on the material as you read it. Try to think of examples, and try to relate the material to prior knowledge. Try to understand it. If you don’t understand it all the first time, don’t worry. Some difficult material takes several readings.
• Recite. After finishing a section, try to recall the information that was in it. Try answering the questions you made up earlier. If you cannot, reread the difficult material and the parts relevant to the questions you could not answer.
• Review. After you have finished, go through it mentally, recalling the main points. Again try answering the questions you made up. A few minutes after you have finished this process, flick through the material once more. If possible, repeat this an hour or so later.
You might need to repeat the whole process if you want to approach the material with a different emphasis. This method is not always appropriate, of course. I wouldn’t like to read a novel by the PQ4R method, for example. But if you have to study material for some purpose, such as this textbook for an exam, it is much better to rely on psycholinguistic principles than to read it like a novel.
FIGURE 12.5 The PQ4R method (Anderson, 2010; Thomas & Robinson, 1972): Preview, Questions, Read, Reflect, Recite, Review.
THE NEUROSCIENCE OF TEXT PROCESSING
Much less is known about the neuropsychology of text processing than about the neuropsychology of many other language processes. This is because text processing and semantic integration really comprise many processes, at least some of which are not specific to language, and involve much of the cortex. It is much more straightforward to track down the effects of brain damage on modular processes. Many types of brain damage will lead to some impairment of comprehension ability. For example, people with receptive aphasia have difficulty in understanding the meaning of words; this obviously impairs their ability to follow coherent text and conversation. People with syntactic processing impairments have difficulty in parsing sentences (see also Chapter 10). However, it has proved much more difficult to find deficits that are
restricted to text processing. Some patients with Wernicke’s aphasia have difficulty in maintaining the coherence of discourse; they might repeat ideas or introduce irrelevant ones (Christiansen, 1995). Children with SLI (specific language impairment) are poor at story comprehension and making inferences. The source of their comprehension difficulty is uncertain: Limited working memory span might play a role, and it is also possible that ability to suppress information is impaired. It may also be that the difficulties arise because these children spend so much time processing individual words and parsing sentences, as they have a host of other difficulties (Bishop, 1997). Of course, all of these factors might play a part.
In spite of this difficulty, there are reports of people with an impaired ability to understand discourse, but without other language impairments. Most of these reports involve (right-handed) people with right-hemisphere damage (Caplan, 1992). For example, such patients have some difficulty in understanding jokes (Brownell & Gardner, 1988; Brownell, Michel, Powelson, & Gardner, 1983). Consider the following joke (63) with three possible punchlines (from Brownell et al., 1983, selected by Caplan, 1992):
(63) The quack was selling a potion which he claimed would make men live to a great age. He claimed he himself was hale and hearty and over 300 years old.
     “Is he really as old as that?” asked a listener of the youthful assistant.
     “I can’t say,” said the assistant, “X.”
     Which best fits X?
     A. Correct punchline: “I’ve only worked with him for 100 years.”
     B. Coherent non-humorous ending: “I don’t know how old he is.”
     C. Incoherent ending: “There are over 300 days in a year.”
Brownell et al. found that right-hemisphere patients were not very good at picking the correct punchline. They often chose the incoherent ending. They knew that the ending of a joke should be surprising, but were unable to maintain coherence. Right-hemisphere patients also find some discourse inferences difficult to make (Brownell, Potter, Bihrle, & Gardner, 1986). In particular, while they are able to draw straightforward inferences from discourse, they are unable to revise them in the light of new information that should make them inappropriate (Caplan, 1992).
We saw in Chapter 3 that children with semantic-pragmatic disorder have difficulty in conversations where they have to draw inferences (Bishop, 1997). They give very literal answers to questions, and fail to take the preceding conversational and social context into account. Semantic-pragmatic disorder is best explained in terms of these children having difficulty in representing other people’s mental states.
Many people with short-term memory impairments show comprehension impairments. We saw earlier that reading span tends to be lower in people with poor comprehension skills. Brain damage can dramatically reduce short-term memory span (to just one or two digits). Patient BO had particular difficulty understanding sentences with three or more noun phrases (Caplan, 1992). McCarthy and Warrington (1987b) described a patient who had difficulty in translating commands into actions. People with dementia have difficulty in keeping track of the referents of pronouns; this is likely to be because of their impaired working memory (Almor, Kempler, MacDonald, Andersen, & Tyler, 1999). Vallar and Baddeley (1987) described a patient with impaired short-term memory who could not detect anomalies involving reference. Although short-term memory seems to play little role in parsing (Chapter 10), it is important in integration and maintaining a discourse representation.
We saw earlier that one aspect of being a skilled comprehender is to suppress irrelevant material. People with dementia are very inefficient at suppressing irrelevant material (Faust, Balota, Duchek, Gernsbacher, & Smith, 1997). This leads to a reduced ability to understand text and conversation. Furthermore, the more severe the dementia, the less efficient the suppression. People with dementia also seem to change the topic of conversation more often and more unexpectedly than people without dementia, and are generally less able to maintain coherence in conversation (Garcia & Joanette, 1997).
SUMMARY
• In comprehension, we go beyond word meaning and syntactic structure to integrate the semantic roles into a larger representation that integrates the text or discourse with previous material and with background information.
• Text has a structure and coherence that makes it easy to understand.
• People try to make new information as easy to assimilate as possible for the listener.
• Literal memory is normally very unreliable.
• People generally forget the syntactic and lexical details of what they hear or read, and just remember the gist.
• We can remember some of the literal form, particularly where the wording matters, and for incidental material such as jokes.
• We have better memory for what we consider to be important material.
• Prior knowledge is important; it helps us to understand and remember material.
• Changing perspective can help you remember additional information if the story was easy to understand in the first place.
• As we read or listen, we make inferences.
• Eyewitness testimony can be quite unreliable, as people confuse inference with what originally happened, and can be misled by the wording of questions.
• Bridging inferences enable us to maintain the coherence of text, elaborative inferences to go beyond the text.
• We find it difficult to distinguish our inferences from the original material.
• According to the constructionist viewpoint, we construct a detailed model of the discourse, using many elaborative inferences; according to the minimalist viewpoint, we make only those inferences we need to maintain the coherence of the representation.
• The number of inferences we make at the time of comprehension might be quite minimal; we make only those necessary to make sense of the text and keep it coherent.
• Many elaborative inferences are made at the time of recall.
• Resolving anaphoric reference involves working out who or what (the antecedent) should be associated with pronouns and referring phrases.
• Gender is an important cue for resolving anaphoric ambiguity.
• Some topics are more accessible than others; they are said to be in the foreground.
• Common ground refers to items that are mutually known by participants in conversations, when the participants know that the others know about these things too.
• Factors such as common ground cannot restrict the initial search for possible referents, but may be an important constraint in selecting among alternatives.
• Propositions are units of meaning relating two things.
• Propositional networks form a useful basis for representing text, but cannot be sufficient in themselves, because they do not show how we make inferences, or how some items are kept in the foreground.
• According to story grammars, stories have a structure analogous to that of a sentence; however, unlike sentence grammars, there is no agreement on how stories should be analyzed, or on what the appropriate units should be.
• Schemas are organized packets of knowledge that have been abstracted from many instances; they are particularly useful for representing stereotypical sequences (such as going to a restaurant).
• A mental model is a structure that represents what the text is about, particularly preserving spatial information.
• The construction–integration model combines propositional networks, schema theory, and mental models to provide a detailed account of how we understand text.
• Working memory span is an important constraint on comprehension ability.
• Skilled comprehenders are better able to suppress irrelevant material.
• The PQ4R method is a powerful method for approaching difficult material.
• People with right-hemisphere brain damage have difficulty in understanding jokes and drawing appropriate inferences.
• Children with semantic-pragmatic disorder have difficulty following conversations because they cannot represent other people’s mental states.
• Impaired short-term memory disrupts the ability to comprehend text and discourse.
• Dementia reduces the ability to comprehend text and discourse and to maintain a coherent conversation.
FURTHER READING
Fletcher (1994) reviews the classic literature on text memory. See Altarriba (1993) for a review of
cultural effects in comprehension.
There are many references on the debate between minimalism and constructionism (e.g.,
Graesser, Singer, & Trabasso, 1994; McKoon, Gerrig, & Greene, 1996; Potts, Keenan, & Golding,
1988; Singer, 1994; Singer & Ferreira, 1983; Singer, Graesser, & Trabasso, 1994).
Kintsch (1994) reviews models of text processing. Another early influential propositional net-
work model was that of Norman and Rumelhart (1975). Brewer (1987) compares the mental model
and schema approaches to memory. See Mandler and Johnson (1980) and Rumelhart (1980) for
replies to critics of story grammars. See Eysenck and Keane (2010) for more on schemas. Wilkes
(1997) describes how knowledge is represented.
See Bishop (1997) for a review of developmental discourse disorders, including semantic-
pragmatic disorder.
SECTION E
PRODUCTION AND OTHER
ASPECTS OF LANGUAGE
This section looks at how we produce language. It also examines the structure of the language system, with emphasis on how we repeat words and the role of memory in language processing. It ends with a brief look at the main themes outlined in Chapter 1, and some possible future issues.

Chapter 13, Language production, looks at the process involved in deciding what we want to say, and how we turn these words into sounds. Where does comprehension end and production begin? Writing is another way of producing language that is examined here.

Chapter 14, How do we use language?, looks at how we use language. The chapter examines conversation and pragmatics, and the relation between language and the visual world.

Chapter 15, The structure of the language system, draws together issues from the rest of the book, looking at how the components of the system interrelate, particularly with reference to memory.

Chapter 16, New directions, evaluates the present status of psycholinguistics and the ways in which the themes introduced in Chapter 1 may be developed in the future.
CHAPTER 13
LANGUAGE PRODUCTION
FIGURE 13.1 (diagram labels recovered from the figure: FORMULATION, ARTICULATION)
detailed phonetic and articulatory planning (see Figure 13.1).

During conceptualization, speakers conceive an intention and select relevant information from memory or the environment in preparation for the construction of the intended utterance. The product of conceptualization is a preverbal message. This is called the message level of representation. To some extent, the message level is the forgotten level of speech production. A problem with talking about intention and meaning, as Wittgenstein (1958) observed, is that they induce “a mental cramp.” Very little is known about the processes of conceptualization and the format of the message level. Obviously the message level involves interfacing with the world (particularly with other speakers), and with semantic memory. The start of the production process must have a great deal in common with the end point of the comprehension process. When we talk, we have an intention to achieve something with our language. How do we decide on the illocutionary force of what we want to say? Levelt (1989) distinguished between macroplanning and microplanning conceptualization processes. Macroplanning involves the elaboration of a communicative goal into a series of subgoals and the retrieval of appropriate information. Microplanning involves assigning the right propositional shape to these chunks of information, and deciding on matters such as what the topic or focus of the utterance will be.

There are two major components of formulation: We have to select the individual words that we want to say (lexicalization), and we have to put them together to form a sentence (syntactic planning). In comprehension, it might not always be necessary to construct a syntactic representation of a sentence in order to derive its meaning. Clearly this is not an option when speaking. Given this, it is perhaps surprising that more attention has not been paid to syntactic encoding in production, but the difficulties of controlling the input are substantial.

Finally, the processes of phonological encoding involve turning words into sounds in the right order, spoken at the correct speed, with the appropriate prosody (intonation, pitch, loudness, and rhythm). The sounds must be produced in the correct sequence, and the speaker must specify how the muscles of the articulatory system should be moved.

What types of evidence have been used to study production? First, researchers have analyzed transcripts of how speakers choose what to say and how to say it (Beattie, 1983). For example, Brennan and Clark (1996) found that speakers cooperate in conversation so that they come to agree on the same names for objects. Computer simulations and connectionist modeling, as in other areas of psycholinguistics, have become very influential. Much has been learned by the analysis of the distribution of hesitations or pauses in speech. Until fairly recently the most influential data were spontaneously occurring speech errors, or slips of the tongue, but in recent years experimental studies, often based on picture naming, have become important. By the end of this chapter you should:

• Know about the different types of speech error and why we make them.
• Know the difference between conceptualization, formulation, and execution.
• Understand how we plan the syntax of what we say.
• Appreciate how we retrieve words when we speak.
• Know about Garrett’s model and the interactive activation models of speech production.
• Know why we pause when we speak.
• Understand how brain damage affects language production.
• Know how we plan what we write.

SLIPS OF THE TONGUE

Until fairly recently, models of speech production were primarily based on analyses of spontaneously occurring speech errors. Casual examination of our speech will reveal (in the unlikely event that you do not know this already) that it is far from perfect, and rife with errors. Analysis of these errors is one of the oldest research topics in psycholinguistics. Speech errors are frequently commented on in everyday life. The case of the Reverend Dr. Spooner is quite commonly known; indeed, he gave his name to a particular type of
error involving the exchange of initial consonants between words, the spoonerism. Some of Reverend Spooner’s alleged spoonerisms are shown in examples (1) to (3). (See Potter, 1980, for a discussion of whether Reverend Spooner’s errors were in fact so frequent as to suggest an underlying pathology.)

(1) Utterance: You have hissed all my mystery lectures.
Target: … missed all my history lectures.
(2) Utterance: In fact, you have tasted the whole worm.
Target: … wasted the whole term.
(3) Utterance: The Lord is a shoving leopard to his flock.
Target: … a loving shepherd.

Most people have heard of the Freudian slip. In part of a general treatise on action slips or errors of action called parapraxes, Freud (1901/1975) noted the occurrence of slips of the tongue, and proposed that they revealed our repressed thoughts. In one example he gives, a professor said in a lecture, “In the case of female genitals, in spite of many Versuchungen (temptations) – I beg your pardon, Versuche (experiments) …” Not all Freudian slips need arise from a repressed sexual thought. In another example he gives, the President of the Lower House of the Austrian Parliament opened a meeting with “Gentlemen, I take notice that a full quorum of members is present and herewith declare the sitting closed!” (instead of open). Freud interpreted this as revealing the President’s true thoughts, that he secretly wished a potentially troublesome meeting closed. However, Freud was not the first to study speech errors; a few years before, Meringer and Mayer (1895) provided what is now considered to be a more traditional analysis. Ellis (1980) reanalyzed Freud’s collection of speech errors in terms of a modern process-oriented account of speech production.

The most common method of analyzing speech errors is to collect a large corpus of errors by recording as many as possible. Usually the researcher will interrupt the speaker when he or she detects the error, and ask the speaker what was the intended target, why they thought the error was made, and so on. Although this method introduces the possibility of observer bias, this appears to be surprisingly weak, if present at all. A comparison of error corpora against a smaller sample taken from a rigorous transcription of a
sample of tape-recorded conversation (Garnham, Shillcock, Brown, Mill, & Cutler, 1982) suggests that the types and proportion of errors are very similar. For example, word substitution errors and sound anticipation and substitution errors are particularly common. Furthermore, it is possible to induce slips of the tongue artificially by, for example, getting participants to read words out at speed (Baars, Motley, & MacKay, 1975). The findings from such studies corroborate the naturalistic data.

There are many different types of speech error. We can categorize them by considering the linguistic units involved in the error (for example, at the phonological feature, phoneme, syllable, morpheme, word, phrase, or sentence levels) and the error mechanism involved (such as the blend, substitution, addition, or deletion of units). Fromkin (1971/1973) argued that the existence of errors involving a particular unit shows that these units are psychologically real. Table 13.1 gives some examples of speech errors from my own corpus to illustrate these points. In any error there was the target that the speaker had in mind, and the erroneous utterance as actually produced; the erroneous part of the utterance is in italics.

Table 13.1 (rows recovered from the extraction)
Word exchange. Utterance: Guess whose mind came to name? Target: whose name came to mind
Phrase blend. Utterance: Miss you a very much. Targets: very much + a great deal

What can speech errors tell us?

Let us now analyze a speech error in more detail to see what can be learned from it. Consider the famous example of (4) from Fromkin (1971/1973):

(4) a weekend for MANIACS – a maniac for WEEKENDS

The capital letters indicate the primary stress and the italics secondary stress. The first thing to notice is that the sentence stress was left unchanged by the error, suggesting that stress is generated independently of the particular words involved. Even more strikingly, the plural morpheme “-s” was left at the end of the second word where it was originally intended to be in the first place: it did not move with “maniac.” We say it was stranded. Furthermore, this plural morpheme was realized in sound as /z/ not as /s/. That is, the plural ending sounds consistent with the word that actually came before it, not with the word that was originally intended to come before it. (Plural endings are voiced “/z/” if the final consonant of the word to which it is attached is voiced, as in “weekend,” but are unvoiced “/s/” if
the final consonant is unvoiced, as in “maniac.”) This is an example of accommodation to the phonological environment.

Such examples tell us a great deal about speech production. Garrett’s model, described next, is based on a detailed analysis of such examples. On the other hand, Levelt et al. (1991a) argued that too much emphasis has been placed on errors, and that error analysis needs to be supported by experimental data. If these two approaches give conflicting results, we should place more emphasis on the experimental data, as the error data are only telling us about aberrant processing. There are three points that can be made in response to this. First, a complete model should be able to account for both experimental and speech error data. Second, the lines of evidence converge rather than giving conflicting results (Harley, 1993a). Third, it is possible to simulate spontaneously occurring speech errors experimentally, and these experimental simulations lead to the same conclusion as the natural errors. Using a technique they called SLIP, Baars et al. (1975) required participants to rapidly read pairs of words such as “big dog,” “blocked drain,” and then “dart board.” If participants have to read these pairs from right to left, the priming effect of the preceding pairs leads them to make many spoonerisms on “dart board.” Furthermore, the participants are more likely to produce “barn door” (two real words) than they are the corresponding “bart doard”; this is an instance of the bias towards lexical outcomes also displayed in the naturalistic data. On the other hand, using the same technique, speakers are less likely to make exchanges that result in taboo words (e.g., from “hit shed”; work it out) than ones that do not. Furthermore, galvanic skin responses were elevated on these taboo trials, suggesting that speakers generated the spoonerism internally, but are in some way monitoring their output (Motley, Camden, & Baars, 1982).

We should note that we sometimes correct our speech errors, which shows that we are monitoring our speech. Sometimes we notice the error before we speak it and can prevent it from being made; sometimes we notice the error as we are speaking and can correct, or repair, it; sometimes we notice it only after we have finished speaking. Often we never notice we have made an error. The idea of a monitor plays an important role in the WEAVER++ model of speech production, discussed below.

Naming errors probably do not arise from people rushing their preparation, or, in the case of naming, from insufficient word preparation, or a failure to check names against objects. Griffin (2004) examined people’s eye movements while they described a visual scene. People tend to gaze at objects while they are preparing their names. If errors arise from rushed preparation, speakers should spend less time looking at an object just before naming it incorrectly (e.g., saying “hammer” when looking at an axe); however, they do not. Instead they spend just as long gazing at a referent before uttering errors as they do before uttering correct names. Indeed, if they corrected their utterance (“ham – axe”), they spent longer looking at the object after making their error, presumably because they were preparing their repair.

Garrett’s model of speech production

In an important series of papers based primarily on speech error analysis, Garrett (1975, 1976, 1980a, 1980b, 1982, 1988, 1992) argued that we produce speech through a series of discrete levels of processing. In Garrett’s model, processing is serial, in that at any one stage of processing only one thing is happening. Of course, more than one thing is happening at different processing levels, because obviously even as we speak we might be planning what we are going to say next. However, these levels of processing do not interact with one another. The model distinguishes two major stages of syntactic planning (see Figure 13.2). At the functional level, word order is not yet explicitly represented. The semantic content of words is specified and assigned to syntactic roles such as subject and object. At the positional level, words are explicitly ordered. There is a dissociation between syntactic planning and lexical retrieval. Garrett argued that content and function words play very different roles in language production.
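As a rough illustration of example (4), the sketch below treats the planned sentence as an ordered sequence of positions in which the plural affix stays put, fills the open positions with content-word stems, and chooses the sound of the plural only after the stems are in place. The names FRAME, build_utterance, and plural_allomorph, and the letter-based voicing test, are assumptions made for this illustration and are not part of Garrett’s own formalism; the sketch also ignores plurals such as “buses.”

```python
# Toy sketch of example (4): a word exchange with a stranded plural affix.
# All names and the voicing table are illustrative assumptions.

# Final letters treated as unvoiced sounds (a rough orthographic stand-in);
# "c" covers the /k/ at the end of "maniac".
UNVOICED_FINALS = {"p", "t", "k", "f", "c"}


def plural_allomorph(stem: str) -> str:
    """Return /s/ after an unvoiced final sound, /z/ otherwise."""
    return "/s/" if stem[-1] in UNVOICED_FINALS else "/z/"


def build_utterance(frame, stems):
    """Fill the open positions with stems in order; the affix belongs to the frame."""
    remaining = iter(stems)
    out = []
    for item in frame:
        if item == "SLOT":
            out.append(next(remaining))
        elif item == "SLOT+PLURAL":
            stem = next(remaining)
            # The sound of the plural is chosen late, from the stem actually present.
            out.append(f"{stem} + plural {plural_allomorph(stem)}")
        else:
            out.append(item)
    return " ".join(out)


FRAME = ["a", "SLOT", "for", "SLOT+PLURAL"]

target = build_utterance(FRAME, ["weekend", "maniac"])
error = build_utterance(FRAME, ["maniac", "weekend"])  # stems in the wrong positions

print(target)
print(error)
```

Swapping the two stems leaves the plural where it was and changes its sound from /s/ to /z/, which is the pattern described for (4).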
case the plural ending “-s.” (In English, affixes are either prefixes, which come before a word, or suffixes, which come after, and are always bound morphemes, in that they cannot occur without a stem; morphemes that can be found as words by themselves are called free morphemes. Bound morphemes can be either derivational or inflectional; see Chapter 1.) Because the bound morpheme has been left in its original place while the free morpheme has moved, this type of exchange is called morpheme stranding. Content words behave differently from the grammatical elements, which include inflectional bound morphemes and function words. This suggests that they are involved in different processing stages.

In (4) the plural suffix was produced correctly for the sentence as it was actually uttered, not as it was planned. This accommodation to the phonological environment suggests that the phonological specification of grammatical elements occurs rather late in speech production, at least after the phonological forms of content words have been retrieved. This dissociation between specifying the sounds of content words and specifying the grammatical elements is of fundamental importance in the theory of speech production, and is an issue that will recur in our discussions of its pathology. Furthermore, in word exchange errors, the sentence stress is left unchanged, suggesting that this is specified independently of the content words.

Error analysis suggests that when we speak we specify a syntactic plan or frame for a sentence that consists of a series of slots into which content words are inserted. Word exchanges occur when content words are put into the wrong slot. Grammatical elements are part of the syntactic frame, but their detailed phonological forms must be specified late.

This model predicts that when parts of a sentence interact to produce a speech error, they must be elements of the same processing vocabulary. That is, things only exchange if they are involved in the same processing level. Therefore certain types of error should never be found. Garrett observed that content words almost always only exchange with other content words, and that function words exchange with other function words. This is an extraordinarily robust finding: In my corpus of several thousand speech errors, there is not a single instance of a content word exchanging with a function word. This supports the idea that content and function words are from computationally distinct vocabularies that are processed at different levels.

There are also different constraints on word and sound exchange errors. Sounds only exchange across small distances, whereas words can exchange across phrases; words that exchange tend to come from the same syntactic class, whereas this is not a consideration in sound errors, in which sounds swap between words regardless of the words’ syntactic class. In summary, word exchange errors involve content words and are constrained by syntactic factors; sound errors are constrained by distance.

Evaluation of Garrett’s model

Garrett’s model accounts for a great deal of the speech error evidence, but a number of findings subsequently have suggested that some aspects of it might not be correct. First, it is not at all clear that speech production is a serial process. There is clearly some evidence for at least local parallel processing in that we find word blend errors, which must be explained by two (or more) words being simultaneously retrieved from the lexicon, as in (5) for example. More problematically, we find blends of phrases and sentences, such as in (6). Furthermore, the locus of these blends is determined phonologically (Butterworth, 1982), so that the two phrases cross over where they sound most alike. This suggests that two alternative messages are processed in parallel from the message to the phonological levels.

We also observe two types of cognitive intrusion errors where material extraneous to the utterance being produced intrudes into it. The message level can intrude into the utterance and lower levels of processing, producing errors called non-plan-internal errors, such as in (7). These errors are often phonologically facilitated. Phonological facilitation means that errors are more likely to occur if the target word and intrusion sound alike.
We find that targets and intrusions in non-plan-internal errors sound more alike than would be expected by chance alone, although special care is necessary to determine what the intended utterance was (Harley, 1984).

(5) Utterance: It’s difficult to valify. (Targets: validate + verify)
(6) Target 1: I’m making some tea.
Target 2: I’m putting the kettle on.
Utterance: I’m making the kettle on.
(7) Target: I’ve read all my library books.
Utterance: I’ve eaten all my library books.
Context: The speaker reported that he was hungry and was thinking of getting something to eat.

that the levels of processing cannot be independent of one another but must interact. These data drive the interactive models of lexicalization described later.

A final problem, about which little can be done, is that the distinction between content and function words is confounded with frequency (Stemberger, 1985), in that function words include some of the most common words of the language (for example, “the,” “a”). Processing differences may reflect this, rather than their being processed by different systems. However, the observation that bound morphemes behave like function words supports Garrett’s hypothesis, as do neuropsychological data, discussed later.
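The blends in (5) and (6) can be pictured as a splice: the start of one planned form is joined to the end of a competing one. The sketch below is only a toy, assuming that letters or whole words can stand in for sounds; splice and shared_unit_blends are hypothetical helper names, and the cut points for (6) are supplied by hand rather than computed from any measure of similarity.

```python
def splice(first, cut_first, second, cut_second):
    """Join the start of `first` (up to cut_first) to the end of `second` (from cut_second)."""
    return first[:cut_first] + second[cut_second:]


def shared_unit_blends(first, second):
    """All splices that switch from `first` to `second` at a unit both sequences share."""
    return [
        splice(first, i, second, j)
        for i, unit_a in enumerate(first)
        for j, unit_b in enumerate(second)
        if unit_a == unit_b
    ]


# Example (5): letters stand in for sounds (an assumption of this sketch).
word_blends = shared_unit_blends("validate", "verify")
print(word_blends)

# Example (6): words as units, with hand-chosen cut points.
target_1 = "I'm making some tea".split()
target_2 = "I'm putting the kettle on".split()
phrase_blend = " ".join(splice(target_1, 2, target_2, 2))
print(phrase_blend)
```

The candidates generated for (5) include the attested error “valify,” and the hand-chosen splice for (6) reproduces the reported utterance.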
Sleiderink, & Levelt, 1998). They also gaze at the referents of direct-object nouns while producing the subject; if they are uncertain which argument to produce immediately after the verb, their gaze moves between the alternative referents (Griffin & Bock, 2000). Gaze is a reliable indicator of what and when people are thinking and planning. Indeed, as is often said, the eyes can give us away; speakers will look at the intended referent of an object even if they are preparing to “lie” by giving an intentionally inaccurate label for it (Griffin & Oppenheimer, 2006).

Syntactic priming

We reuse words and sentence structures within conversation (Schenkein, 1980). The repetition of syntactic structure is called structural priming or syntactic persistence (Bock, 1986). Structural priming suggests that we can separate meaning and form, because we can prime sentence structures independently of sentence meaning.

Syntactic persistence is one aspect of the more general phenomenon of syntactic priming, whereby processing of a particular syntactic structure influences processing of subsequently presented sentences. Syntactic priming is wholly facilitatory, and has been observed in comprehension, in production, and bidirectionally between comprehension and production (Branigan, Pickering, Liversedge, Stewart, & Urbach, 1995). One common method used to study syntactic priming is to get participants to repeat a prime sentence that contains the syntactic structure of interest, and then to describe a picture. Syntactic priming studies show that speakers use a particular word order if the prime sentence used that order (Bock, 1986, 1989; Bock & Loebell, 1990; Branigan et al., 1995; Hartsuiker, Kolk, & Huiskamp, 1999). Suppose we have to describe a picture of a vampire handing a hat to a ghost. A preceding prepositional-object structure prime, such as (9), steers us towards producing a prepositional-object construction in our description: for example, we might say (11); while a double-object prime, such as (10), steers us towards producing a double-object construction such as (12):

(9) The ghoul sold a vacuum cleaner to a werewolf.
(10) The ghoul sold a werewolf a vacuum cleaner.
(11) The vampire handed a hat to the ghost.
(12) The vampire handed the ghost a hat.

Importantly, syntactic priming does not depend on superficial similarities between the prime and utterance. It does not depend on reusing words (lexical priming) or on repeating thematic roles, but instead reflects the more general construction of syntactic constituent structures. Similarly, the magnitude of the priming effect shown by verbs does not depend on the tense, number, or aspect of the verb (Pickering & Branigan, 1998). For example, a prime sentence such as (13) was just as effective as the prime sentence (14) in eliciting a prepositional-object construction involving the word “to” (Bock, 1989). Put more generally, prepositional-object sentences prime descriptions to use prepositional-object constructions regardless of the preposition (e.g., “to” and “for”) used in the prime sentences. However, repeating the verb (regardless of tense, aspect, or number) does enhance priming, an effect Pickering and Branigan call the lexical boost. The lexical boost is important because it suggests that the verb has a special role in production. Priming is also enhanced by the repetition of word order between prime and target (Hartsuiker & Westenberg, 2000; Pickering, Branigan, & McLean, 2002). In summary, we can prime abstract syntactic structures, but the magnitude of the priming effect is greater if we repeat word order and the verb. Indeed, a verb prime alone may be sufficient to bias speakers’ subsequent productions (Melinger & Dobel, 2005).

(13) The werewolf baked a cake for the witch.
(14) The werewolf took a cake to the witch.

Along similar lines, Bock and Loebell (1990) showed that only sentences like (15) produce priming of the prepositional-object description (17). A construction such as (16) does not, even though it is superficially very similar to (15). It has similar words (most noticeably, it contains the
word “to”) and has a similar stress pattern. However, it has a very different syntactic structure (“a book to study” is a noun phrase, not a prepositional-object phrase). Hence it is the underlying syntactic structure that is important in obtaining syntactic priming, not the surface form of the words. Syntactic priming has been demonstrated for a variety of syntactic structures.

(15) Vlad brought a book to Boris.
(16) Vlad brought a book to study.
(17) The witch is handing a paintbrush to the ghost.

Syntactic persistence can continue for quite some time. Bock and Griffin (2000) found that the structural priming could persist over as long as 10 intervening sentences (although the priming effect can be short-lived; Levelt & Kelter, 1982). Such persistence suggests that the priming is due to more than short-term memory, and may have some long-term learning component.

Speakers also tend to reuse the syntactic constructions of other speakers (Branigan, Pickering, & Cleland, 2000). For example, speakers will use a complex noun phrase (e.g., “the square that’s red”) more often after hearing a syntactically similar noun phrase than a simple one (“the red square”), and are particularly likely to do so if the main noun (“square”) is repeated (Cleland & Pickering, 2003). We find this priming effect on noun-phrase structure if the prime and target noun are semantically related (“sheep” and “goat”), but not if they are phonologically related (e.g., “sheep” and “ship”), suggesting that while syntactic encoding is unsurprisingly affected by the semantic representation, it is not affected by feedback from the phonological representation (Cleland & Pickering, 2003).

Syntactic priming does more than just influence descriptions. Potter and Lombardi (1998) showed that immediate recall can be affected by syntactic persistence. In their experiment, participants silently read words presented one at a time and at a fast rate on a computer screen. They then performed another distractor task before being asked to repeat the sentence out aloud. This task is quite difficult, and speakers sometimes changed the syntactic structure of what they had just read. In particular, people tend to reuse previous syntactic structures: that is, they recalled the sentence just presented with the syntactic structure of a previous item. So syntactic priming influences our memory, too. It can also lead us to produce ungrammatical utterances, when we are erroneously influenced by a structure we have just heard (Ivanova, Pickering, McLean, Costa, & Branigan, 2012).

It is also possible to prime the productions of patients who have an impairment of syntactic planning in speech production, although not all types of sentence structure are primed as easily as others (Hartsuiker & Kolk, 1998; Saffran & Martin, 1997). The number of passives (e.g., “the cat was chased by the dog”) was increased by passive primes, but the production of dative constructions (e.g., “give the food to the dog”) showed no immediate increase after the primes. Some of the newly generated constructions were morphologically deviant, suggesting that although phrase structure and closed-class elements are normally closely linked in production (as in Garrett’s model), they can be separated.

At first sight, the way in which syntactic frames can be primed independently of meaning points to a separation of meaning and form. Greater overlap in meaning does not generally lead to a larger amount of priming; in most cases all that matters is the overlap in surface syntax. This finding suggests that sentence frames are independent syntactic representations, and in particular that they have some existence independent of the meaning of what they encode. It also points to a probabilistic element in syntactic planning, where the precise form of the words we choose is affected by environmental factors such as what we have just heard. Chang, Dell, and Bock (2006) describe a connectionist model of sentence production that can account for the structural priming data. In their model, sequencing in production makes use of two types of information. A sequencing system uses a recurrent connectionist model that uses statistical information to predict what is coming next. However, the model also makes use of semantic information about events and the message to be produced.
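
To see why persistence across many intervening sentences points to learning rather than short-lived activation, it can help to contrast the two ideas in a toy simulation. The Python sketch below is only an illustration: it is not the Chang, Dell, and Bock (2006) model, and the function names, the boost, the decay rate, and the learning rate are all assumptions made for this example. In the activation account the effect of a prime fades with every later sentence; in the learning account each prime makes a small but lasting change to the speaker's preference.

```python
import math

def p_po(bias):
    """Probability of producing a prepositional-object (PO) dative rather
    than a double-object (DO) dative, given a preference `bias`."""
    return 1.0 / (1.0 + math.exp(-bias))

def transient_bias(history, boost=1.0, decay=0.5):
    """Activation account: each prime adds a boost that fades with every
    later sentence (boost and decay values are arbitrary assumptions)."""
    bias = 0.0
    for structure in history:
        bias *= decay
        if structure == "PO":
            bias += boost
        elif structure == "DO":
            bias -= boost
    return bias

def learned_bias(history, learning_rate=0.3):
    """Learning account: each prime makes a small, lasting change;
    sentences with neither structure leave the bias untouched."""
    bias = 0.0
    for structure in history:
        if structure == "PO":
            bias += learning_rate
        elif structure == "DO":
            bias -= learning_rate
    return bias

# One PO prime followed by 10 intervening sentences of another type.
history = ["PO"] + ["other"] * 10
p_transient = p_po(transient_bias(history))
p_learned = p_po(learned_bias(history))
print(f"transient activation: {p_transient:.3f}")
print(f"implicit learning:    {p_learned:.3f}")
```

With these made-up parameters the transient account is back at roughly chance after 10 intervening sentences, whereas the learning account still favors the primed structure. This is the qualitative pattern that long-lasting priming suggests.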
13. LANGUAGE PRODUCTION 405
The model has two advantages. First, there are some recent data that suggest that meaning can have some effect on priming. Chang, Bock, and Goldberg (2003) found that similar thematic roles can cause priming even when the surface syntax is held constant (e.g., “The man sprayed water on the wall” has the theme (water) before the location (wall), and “The man sprayed the wall with water” has the location before the theme; but both sentences have the same surface structure of NP–V–NP–PP). Chang et al.’s model can account for this result because of the meaning-based route.

Syntactic priming probably serves two main functions. First, it enables speakers in a conversation to coordinate or align information. Using the same words and syntax helps conversants to collaborate more efficiently. Second, it results from implicit learning of how people use syntax to convey meaning: people unconsciously adjust how they convey information on the basis of experience. The finding that syntactic priming can be persistent over surprisingly long periods of time is consistent with the idea that it results from learning rather than just reflecting transient activation of syntactic structures.

Coping with dependencies

How do we cope with dependencies between words? One particular problem facing speakers is ensuring number agreement between subjects and verbs. For instance, we must ensure that we say “the woman does” and “the men do,” and not “the woman do” or “the men does.” We do not always get agreement right; number agreement errors are fairly common in speech. We particularly have a tendency to make attraction errors such as (18), where we make the verb (here “were” instead of “was”) agree with a noun (“unions”) that is closer to the verb than the subject (“membership”) with which it should agree (Bock & Eberhard, 1993).

(18) Membership in these unions were voluntary.

In an important series of experiments, Bock and her colleagues used a sentence-completion task designed to elicit agreement errors (e.g., Bock & Cutting, 1992; Bock & Eberhard, 1993; Bock & Miller, 1991). These experiments look at what types of factors cause number agreement errors. Consider the sentence fragments (19)–(21) from Bock and Eberhard (1993):

(19) The player on the court –
(20) The player on the courts –
(21) The player on the course –

A suitable continuation for this might be “was very good.” A continuation containing an agreement error might be “were very good.” Which of these fragments causes agreement errors? Sentence (19) is very straightforward; both nouns are singular. As we might expect, this type of fragment produces no agreement errors. In (20) the noun closest to the verb is plural, while the noun that should determine number (“player”) is singular. In this condition we observe many errors. What about (21)? Although the local noun (“course”) is singular, it is a pseudoplural, because the end of the word is an /s/ sound. (Remember that regular plurals in English are formed by adding an -s to the end of the singular form of the noun.) So if the plural sound alone were important in determining agreement, we would expect sentences like (21) to generate many agreement errors. In fact, they generate none. Hence agreement must be determined not by the sound of surrounding words (in particular, whether they sound as though they have plural endings) but by something more fundamental. Further evidence for this is that regular (“boys”) and irregular (“men”) versions of nouns cause equal numbers of agreement errors, as do individual (“ship”) and collective (e.g., “audience,” “fleet,” and “herd”) nouns. At first sight what seems to be important in determining number agreement is only the syntactic number of the nouns, suggesting that syntactic planning is modular.

More recent work has challenged this idea that syntactic processing is feedforward and modular. Distributive noun phrases, such as “the label on the bottles,” where the semantics of the phrase implies the existence of multiple labels, lead speakers in several languages to produce plural verbs (Eberhard, 1999; Vigliocco, Butterworth, & Garrett, 1996). It now seems likely that whether
or not we find semantic effects on verb agreement depends on subtle factors such as the precise materials we use in the experiments. Haskell and MacDonald (2003) showed that number agreement can be accounted for in terms of constraint satisfaction. This approach is similar to that in language comprehension, and makes use of the constraint-satisfaction idea that several sources of information interact to determine output. If the different sources of information conflict, then processing time increases. If one of the sources strongly predicts singular or plural, then additional weak factors have little additional cost, but if the sources of information are approximately equal, then competition is maximal and the cost to processing time greatest (Haskell & MacDonald, 2003). For example, ordinary singular nouns (e.g., horse, ship) are very good predictors that a singular verb is necessary, and produce little competition. Collective nouns (e.g., family, fleet, team) share characteristics of both singulars and plurals. Although they should strictly generate singular verbs, their plural characteristics induce some competition between plural and singular verb forms, leading to longer processing times and more variability in output.

Similar experimental methods also show that number agreement takes place within the clause (Bock & Cutting, 1992). Analysis of number agreement also provides further evidence that syntactic structure is generated before words are assigned to their final positions. Vigliocco and Nicol (1998) note that grammatical encoding has three functions: assigning grammatical functions (e.g., assigning the agent of an action to the subject of the sentence), building syntactic hierarchical constituent structures to reflect this (e.g., turning the subject into an NP), and arranging the constituents in linear order. We have seen that speech error data clearly separate the first and third functions (that is, the functional and positional stages of Garrett’s model), but can we distinguish building abstract hierarchical structures from the final serial ordering of words? Vigliocco and Nicol argued that we can. They showed that number agreement errors do not so much depend on the surface or linear proximity of the local noun to the verb, as on its proximity in the underlying syntactic structure. In one experiment participants had to generate sentences from a sentence beginning and an adjective, e.g., (22). A correct continuation would be (23), and one with an agreement error (24):

(22) The helicopter for the flights + safe.
(23) The helicopter for the flights is safe.
(24) The helicopter for the flights are safe.

In a second experiment participants had to generate questions from (22), such as (25):

(25) Is the helicopter for the flights safe?
(26) Are the helicopter for the flights safe?

Participants made about the same number of agreement errors as in the first experiment, e.g., (26), even though here the “local noun” (“flights”) is much farther away in terms of the number of intervening words. This is because, according to linguistic theory, the declarative sentence (23) and the question (25) have the same underlying syntactic structure.

According to Bock, Eberhard, and Cutting (2004) we need two processes to ensure that number agreement proceeds smoothly. First, we need a specification that takes into account the number of things we are talking about in the message. Bock et al. call this process marking. For example, if we are talking about one helicopter, then the verb is marked as singular. Now suppose we are talking about one pair of scissors. With regard to the message content, the verb will be marked as singular. But we treat “scissors” as a plural noun, even if we are only talking about one of them. We say “the scissors are,” never “the scissor is.” Hence we need to override the syntactic process of marking with a process that takes account of the morphology of the subject. This second process is called morphing. This overriding process can lead to attraction errors, where the verb erroneously comes to agree with the number of a neighboring noun phrase that is not in fact that verb’s controller, as in (27). Pronouns are more vulnerable to the number of their controllers, leading to agreement errors such as (28). This difference suggests that
number agreement might involve different processes for pronouns and verbs (Eberhard, Cutting, & Bock, 2005). Verbs are particularly controlled by grammatical number (a syntactic process), while pronouns are controlled by what is called notional number: the speaker’s initial, fleeting perspective on the number of things involved, which involves lexical processes (e.g., our first impression of the word “fleet” is that it is plural). Eberhard et al. provide a detailed model of marking and morphing in number agreement that accounts for a wide range of data.

(27) The time to find the scissors are now.
(28) The key to the cabinets disappeared. They were never found again.

Is syntactic planning incremental?

Word exchange speech errors suggest that the broad syntactic content is sketched out in clause-sized chunks. This idea is supported by picture–word interference studies that suggest that before we start uttering phrases and short sentences containing two names, we select the nouns (technically, we select the lemma; see later) and the sound form of the first noun. Meyer (1996) presented participants with pictures of pairs of objects that they then had to name (“the arrow and the bag”), or place in short sentences (“the arrow is next to the bag”). At the same time, the participants heard an auditory distractor that could be related in meaning or sound to the first or second noun, or to both. She found that the time it took participants to initiate speaking was longer when the distractor was semantically related to either the first or the second noun, but the phonological distractor only had an effect (by facilitating initiation) when it was related to the first noun. This pattern of results suggests that we prepare the meaning of short phrases and select the appropriate words before we start speaking, but only retrieve the sound of the first word. (This finding is also evidence that lexical access takes place in speech production in two stages; see later.)

Schriefers, Teruel, and Meinshausen (1998) used a picture–word interference technique to show that the detailed selection of a verb is not an obligatory component of advance planning, even though the verb clearly must play a central role in syntactic planning. They showed that semantic interference between the verb and a distractor was only obtained for verbs at the very beginning of German sentences. Therefore, in sentence-final positions the verb could not have been retrieved by the time the participants started speaking.

Smith and Wheeldon (1999) had participants describe moving pictures. They found longer onset latencies for single clause sentences beginning with a complex noun phrase (e.g., “the dog and the kite move above the house”) than for similar sentences beginning with a simple phrase (e.g., “the dog moves above the kite and the house”). Participants also took longer to initiate double clause sentences (e.g., “the dog and the foot move up and the kite moves down”) than single clause sentences. These results suggest that people do not plan the entire syntactic structure of complex sentences in advance. They suggest that when people start speaking they have completed lemma access for the first phrase of an utterance, and started but not completed processing the remainder.

Schnur, Costa, and Caramazza (2006) used a picture–word interference design to examine how far we plan ahead. Participants produced sentences while ignoring words that were phonologically related or unrelated to the verb of the sentence. Schnur et al. found that the time to begin producing the sentence was shorter in the presence of the phonologically related distractor, even if the sentence the speaker was producing was relatively long. These results suggest that phonological planning extends some way ahead, and can in some circumstances (if the verb is primed) cross phrase boundaries.

On the other hand, there is a great deal of evidence that suggests that syntactic planning is incremental; that is, we make it up as we go along. Ferreira (1996) found that speakers find production easier when they have more syntactic options available to continue what they are saying, presumably because they can be flexible and pick the most suitable or available continuation at any time. If we make up a detailed plan before we start speaking, the number of options shouldn’t matter, or might even get in the way, as we choose between them.
Ferreira and Swets (2002) also found evidence for incremental planning. They had speakers answer arithmetic sums of differing difficulty in different sorts of syntactic construction (e.g., complete “the answer to 49 plus 73 is . . .”). When speakers were encouraged to speak and plan simultaneously (that is, incrementally) by trying to beat a deadline, both latency to begin speaking and utterance duration were affected by the difficulty of the problem. The more difficult the problem, the longer people took to produce the sentence, suggesting that they did not know the answer (and therefore what they were going to say) before they started speaking.

Why the discrepancy in results? One explanation is that evidence of detailed advance planning comes from the study of either phrases or very short, simple sentences. Perhaps these are dealt with differently from more complex constructions. Another explanation is that the verb in the experiments suggesting that there is considerable advance planning is a simple linking verb (“is”). Or perhaps the demands of the task affect how much participants plan in detail before they start speaking. Speech production probably involves both preparation and planning ahead and incremental planning; which wins the day depends on the particular circumstances of the utterance.

How does this incremental planning relate to semantic and syntactic processing? Solomon and Pearlmutter (2004) contrast two approaches to planning production and coordinating multiple phrases, serial and parallel. They argue that serial systems must rely on memory to shift representations in and out of memory. Memory-shifting should be easier for phrases where the constituents are tightly integrated, with the consequence that there should be fewer errors in such phrases. Parallel systems rely on the parallel activation of multiple representations simultaneously maintained in memory. Parallel activation means that more integrated phrases will be processed together and will be active simultaneously, leading to interference, with the consequence that there should be more errors in tightly integrated phrases. Solomon and Pearlmutter used a sentence-completion task comparing phrases such as “The drawing of the flower” (where the two nouns are tightly integrated) with phrases such as “The drawing with the flower” (where the two nouns are less closely integrated semantically). More errors were made in the completions in the “of” condition, where the components were tightly integrated, supporting the parallel model. Hence when we speak we maintain multiple components of the sentence in memory; we plan and speak simultaneously; and we make it up as we go along, rather than planning one chunk at a time and only producing it when planning is complete.

Producing morphologically complex words

You will remember from Chapter 2 that words can be morphologically modified in two ways: We can derive new words from existing ones (e.g., forming “entertainment” from “entertain”), and we can inflect words to change noun number or verb tense (e.g., “mouse/mice,” “run/ran”). The new part of the word (e.g., “-ment”) is called an affix. Speech errors cast some light on how affixes are represented in speech production. We find errors where stems of lexical items can become separated from their affixes (e.g., the morpheme stranding errors discussed earlier). Affixes are also sometimes added incorrectly, anticipated, or deleted. Indeed, Garrett’s speech production model rests on a dissociation between content words and grammatical elements that are accessed at different times. The neuropsychological evidence from affix loss in Broca’s-type disorders, and affix addition to neologisms in jargon aphasia (described later), also suggests that affixes are added to stems. But how?

You will remember that while most inflections are regular (we form the plural by adding -s to the end of the noun, and the past tense by adding -ed to the verb), some (usually common) words are formed in an irregular way (e.g., mice, sheep, ran, did). How do we produce these irregular forms? One plausible model is that we know a rule for producing the regular versions, and learn by rote a list of exceptions for dealing with the irregular ones, stored in our lexicon. Evidence for this dual-mechanism model comes
from the observation that while we are happy to form English compound nouns with either singular or plural irregular modifying nouns (both “mouse-eating” and “mice-eating” sound acceptable to us), we only form compound nouns with singular regular nouns (hence “a rat-eating man” sounds acceptable, but “a rats-eating man” does not). It seems that inflected forms generated by a rule cannot be used as a modifier in a compound noun. How do we come to know what is acceptable and what is not? One possibility is that the child has some innate knowledge of grammar (Pinker, 1999).

There is also neuropsychological evidence for a dual-mechanism model. Ullman et al. (1997) reported a double dissociation between performance on sentence completion and reading on words with regular and irregular past tenses. Patients with what is called fluent aphasia (described in more detail below, but arising from damage to the rear of the left hemisphere) were better at producing the past tense of regular verbs, whereas patients with non-fluent aphasia (arising from damage to the more frontal regions of the left hemisphere) were better at producing irregular past tenses. One explanation for this result is that we make use of a rule-based mechanism for generating regular forms, and this mechanism is located in the front of the left hemisphere (and left intact in fluent aphasia), and a lexicon for storing irregular verbs, located in more posterior regions (and left intact in non-fluent aphasia).

There is an alternative explanation for this double dissociation, which is that regular and irregular verbs are processed by the same system, but the processing of regular verbs depends more on phonological information, while the processing of irregular verbs depends more on semantics (McClelland & Patterson, 2002). Regular past verbs tend to be more phonologically complicated and less distinct than irregular ones: they tend to be longer, for example, and sound and look more like their associated stems. When we control for phonological complexity, the relative disadvantage shown by non-fluent aphasic patients on regular past tenses disappears (Bird, Lambon Ralph, Seidenberg, McClelland, & Patterson, 2003). The access of regular words, because of their greater phonological complexity, is more affected in non-fluent patients (with damage in Broca’s area), who have a central phonological deficit (see Chapter 7). These non-fluent patients also showed deficits on other phonological tasks, such as making judgments about whether words rhyme, and segmenting words. On the other hand, damage to the semantic system leads to more difficulty with irregular verbs, where phonology receives support from the semantic system (Joanisse & Seidenberg, 1999). Patient AW is problematic for this account. While having a selective deficit in producing irregular forms of verbs, he performed perfectly on a range of tasks involving semantics (Miozzo, 2003).

Haskell, MacDonald, and Seidenberg (2003) tackled the observations on the acceptability of noun modifiers. One problem for the dual-mechanism account is that there are many exceptions to the central observation (we have “awards ceremony” and “sports announcer”). Why should some exceptions be acceptable? Haskell et al. proposed that acceptability is decided by a multiple-constraint satisfaction process, where semantic, phonological, and other factors come together to decide acceptability. These processes are acquired by children through general-purpose learning algorithms. There is no need, they argued, for two different innately specified mechanisms.

Evaluation of work on syntactic planning

In recent years there has been a notable increase in the amount of research examining syntactic planning. This has largely been due to the evolution of new experimental techniques, particularly syntactic priming, scene description, and sentence completion. Although much remains to be done, we now know a considerable amount about how we translate thoughts into sentences. In particular, it is clear that there is a syntactic module used in production that generates syntactic structures that are, to some extent at least, independent of the meaning of what they convey. It is also clear that there is a probabilistic aspect to production. Syntactic planning is quite inertial, and tends to reuse whatever is easily available.
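
The dual-mechanism account of inflection discussed in this section (a default rule plus a rote-learned list of exceptions) can be made concrete with a minimal Python sketch. The tiny word lists, the function names, and the simplified spelling rules are illustrative assumptions, not part of any published model. The compound check simply encodes the observation that rule-generated plurals are not used as modifiers, so it also wrongly rejects attested exceptions such as “awards ceremony,” which is exactly the problem Haskell et al. raise for this account.

```python
# Rote-learned exceptions: a tiny illustrative lexicon, not an exhaustive one.
IRREGULAR_PAST = {"run": "ran", "do": "did"}
IRREGULAR_PLURAL = {"mouse": "mice", "sheep": "sheep", "man": "men"}

def past_tense(verb):
    """Use a stored exception if there is one; otherwise apply the -ed rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return verb + "d" if verb.endswith("e") else verb + "ed"

def plural(noun):
    """Use a stored exception if there is one; otherwise apply the -s rule."""
    if noun in IRREGULAR_PLURAL:
        return IRREGULAR_PLURAL[noun]
    return noun + "s"

def acceptable_modifier(noun, is_plural):
    """Singular nouns and stored (irregular) plurals can modify a compound;
    plurals that would have to be generated by the rule cannot."""
    if not is_plural:
        return True
    return noun in IRREGULAR_PLURAL

for noun in ("mouse", "rat"):
    for is_plural in (False, True):
        form = plural(noun) if is_plural else noun
        verdict = "acceptable" if acceptable_modifier(noun, is_plural) else "odd"
        print(f"a {form}-eating man: {verdict}")
```

A single-mechanism alternative would replace the lookup-then-rule structure with one learned mapping in which phonological and semantic information both contribute, which is why the two accounts make different predictions about patients with phonological or semantic deficits.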
LEXICALIZATION
Lexicalization is the process in speech production whereby we turn the thoughts underlying words into sounds: We translate a semantic representation (the meaning) of a content word into its phonological representation of form (its sound). There are three main questions to answer here. First, how many steps or stages are involved?

[Figure: Two-stage model of lexicalization. Levels shown: conceptual representation; lemma; phonological word form.]
that words that sound similar are close together. The lexicon is accessed in production by traversing a semantic network or decision tree (see Figure 13.4). Semantic errors occur when traversing the decision tree, and phonological errors occur when the final phonological form is selected. As we shall see in Chapter 15, the argument that there is a single lexicon for comprehension and production is very controversial. If this is not the case, then some other mechanism will be necessary to account for the existence of malapropisms. The important idea of Fay and Cutler’s model is that phonological and semantic word substitutions happen as a result of mistakes in different parts of the word retrieval process.

Butterworth (1982) formulated word retrieval explicitly in terms of a two-stage process. In Butterworth’s model an entry in a semantic lexicon is first selected, which gives a pointer to an entry in a separate phonological lexicon. In general, in the two-stage model semantic and phonological substitutions occur at different levels. The Fay and Cutler (1977) model predicts that semantic and phonological processes should be independent.

Word substitution errors, while supporting the two-stage model in general, say nothing about the existence of amodal, syntactically specified lemmas.

FIGURE 13.4 An example of a search-based single lexicon model. Based on Fay and Cutler (1977). [Decision tree of yes/no questions (“object?”, “man-made object?”, “musical instrument?”, “stringed?”, “vehicle?”) ending in the phonological forms /ukelele/, /eucalyptus/, /trumpet/, and /truck/.]

Experimental evidence

The earliest experimental evidence for the division of lexical access into two stages came from studies of the description of simple scenes (Kempen & Huijbers, 1983). They analyzed the time people take before they start speaking when describing these scenes, and argued that people do not start speaking until the content to be expressed has been fully identified. The selection of several lemmas for a multiword sentence can take place simultaneously. We cannot produce the first word of an utterance until we have accessed all the lemmas (at least for these short utterances) and at least the first phonological word form. Individual word difficulty affects only word form retrieval times.

Further experimental evidence for two stages in lexicalization comes from Wheeldon and Monsell’s (1992) investigation of repetition priming in lexicalization. Like repetition priming in visual word recognition, this effect lasts a long time, spanning over 100 intervening naming trials. Wheeldon and Monsell showed that naming a picture is facilitated by recently having produced the name in giving a definition or reading aloud. Prior production of a homophone (e.g., “weight” for “wait”) is not an effective prime, so the source of the facilitation cannot be phonologically mediated. Instead, it must be semantic or lemma-based. Evidence from speeded picture naming suggests that repetition priming arises from residual activation in the connections between semantics and lemmas (Vitkovitch & Humphreys, 1991).

Monsell, Matthews, and Miller (1992) looked at this effect in Welsh–English bilinguals. There was facilitation within a language, but not across (as long as the phonological forms of the words differed). Taken together the experiments show that both the meaning and the phonological forms have to be activated for repetition priming in production to occur. Repetition priming occurs as a result of the strengthening of the connections between the lemmas and phonological forms.

Evidence for a phase of early semantic activation in lexical selection and a later phase of phonological activation in phonological encoding comes
from picture–word interference studies (Levelt et al., 1991a; Schriefers, Meyer, & Levelt, 1990). These experiments, discussed in more detail later in the section on the time course of lexicalization, used a picture–word interference paradigm in which participants see pictures that they have to name as quickly as possible. At about the same time they are given an auditorily presented word for which they have to make a lexical decision. Words prime semantic neighbors early on, whereas late on they prime phonological neighbors. This suggests that there is an early stage when semantic candidates are active (this is the lemma stage), and a late stage when phonological forms are active.

The semantic-interference paradigm provides evidence for two stages, and furthermore, that the lexical items activated by the first stage compete against each other (Starreveld & La Heij, 1995, 1996). In semantic-interference studies, participants have to name pictures which have superimposed distractor words that they have to ignore; naming times are longer when the picture and the word are related. The distractors lead to the activation of semantic competitors that slow down the selection of the lexical target. In the related word translation task, semantically related words induce semantic interference; however, related pictures produce facilitation (Bloem & La Heij, 2003). The SOA (stimulus onset asynchrony) is, however, critical; if the interfering words are presented 200 ms after the target, we observe semantic interference, but if they are presented 400 ms before the target, we observe semantic facilitation (Bloem, van den Boogaard, & La Heij, 2004). Bloem and La Heij proposed a model of lexical access in which semantic facilitation is localized at the conceptual level, semantic interference is localized at the lexical level, and only one concept is selected for lexicalization. They called this the Conceptual Selection Model (CSM). They account for the effects of SOA with the assumption that lexical representations decay faster than conceptual representations.

Whether or not we observe facilitation or inhibition in the picture–word interference paradigm depends on the details of the experimental set-up. In the most famous example of picture–word interference, the Stroop task (naming the color in which a word is printed when the word spells out a color name), there is striking inhibition. Usually we find interference with semantically related pairs from the same category, and facilitation with phonologically related pairs. Schriefers et al. (1990) found that inhibition disappears if participants have to press buttons instead of naming pictures, suggesting that the interference reflects competition among lexical items at the stage of lemma selection. The details of the task and the timings involved are also critical (Bloem & La Heij, 2003; Bloem et al., 2004).

Evidence from neuroscience

Different regions of the brain become activated in sequence as we produce words (Indefrey & Levelt, 2000, 2004). Conceptual selection of a word in picture naming is associated with activation of the mid-part of the left middle temporal gyrus; accessing a word’s phonological code is associated with activation of Wernicke’s area; and phonological encoding, in terms of the preparation of syllables, sounds, and the prosody of the word, is associated with activation around Broca’s area. As we shall see, lesions to these areas lead to different types of impairment to word naming, with damage to more posterior regions of the brain resulting in difficulty in accessing the meanings of words, and damage to more frontal regions resulting in difficulty in accessing the sounds of words. A survey of the imaging literature also reveals the timings of word retrieval in naming an object (Indefrey & Levelt, 2004): Visual and conceptual processing take on average 175 ms; the best-fitting lexical item, or lemma, is retrieved between 150 and 225 ms; the phonological representations are retrieved between 250 and 330 ms; and the details of the sounds of the word at around 450 ms (see Figure 13.5).

Electrophysiological evidence also supports the two-stage model (van Turennout, Hagoort, & Brown, 1998). Dutch-speaking participants were shown colored pictures and had to name them with a simple noun phrase (e.g., “red table”). At the same time the participants had to push buttons depending on the grammatical gender of the noun, and
on whether or not it began with a particular sound. The electrophysiological data for the preparation of the motor movements suggested that the syntactic properties were accessed before the phonological information. However, the time delay between the two was very short, in the order of 40 ms.
FIGURE 13.5 Time taken (in ms) for different processes to occur in picture naming. The specific processes are shown on the right and the relevant brain regions are shown on the left. Reprinted from Indefrey and Levelt (2004). The stages shown, with their approximate time points, are: picture (0 ms) → conceptual preparation → lexical concept (175 ms) → lemma retrieval → multiple lemmas → lemma selection → target lemma (250 ms) → phonological code retrieval → lexical phonological output code → segmental spell-out → segments (350 ms) → syllabification → phonological word (455 ms) → phonetic encoding → articulatory scores (600 ms) → articulation. The figure also labels self-monitoring, and marks the time windows 150–225, 200–400, 275–400, and 400–600 ms on the brain regions.
Evidence from the tip-of-the-tongue phenomenon
The tip-of-the-tongue (TOT) state is a noticeable temporary difficulty in lexical access. It is an extreme form of a pause, where the word takes a noticeable time to come out (sometimes several weeks!). You are almost certainly familiar with this phenomenon: You know that you know what the word is, yet you are unable to get the sounds out. TOTs are accompanied by strong “feelings of knowing” what the word is. They appear to be universal; they have even been observed in children as young as 2 (Elbers, 1985). The incidence of TOTs increases with old age (Burke, MacKay, Worthley, & Wade, 1991), and TOTs are more common in bilingual speakers (Gollan & Acenas, 2004; Gollan & Brown, 2006). The phenomenon is not restricted to speech: deaf signers experience “tip-of-the-finger” states (Thompson, Emmorey, & Gollan, 2005).
Brown and McNeill (1966) were the first to examine the TOT state experimentally. They induced TOTs in participants by reading them definitions of low-frequency words, such as (33):
(33) “A navigational instrument used in measuring angular distances, especially the altitude of the sun, moon, and stars at sea.”
Stop and try to name the item defined by (33). You may experience a TOT.
Example (33) defines the word “sextant.” Brown and McNeill found that a proportion of the participants will be placed in a TOT state by this task. Furthermore, they found that lexical retrieval is not an all-or-none affair. Partial information, such as the number of syllables, the initial letter or sound, and the stress pattern, can be retrieved. Participants also often output near phonological neighbors like “secant,” “sextet,” and “sexton.” These other words that come to mind are called interlopers. TOTs show us that we can be aware of the meaning of a word without being aware of its component sounds; and furthermore, that phonological representations are not unitary entities.
There are two theories of the origin of TOTs. These are called the partial activation and blocking (or interference) hypotheses. Brown (1970) first proposed the partial activation hypothesis. This says that the target items are inaccessible because they are only weakly represented in the system. Burke et al. (1991) provided evidence in favor of this model from both an experimental and a diary study involving a group of young and old participants. They argued that the retrieval deficit involves weak links between the semantic and the phonological systems: there is a transmission deficit in getting between the two. A broadly similar approach by Harley and MacAndrew (1992) localized the deficit within a two-stage model of lexical access, between the abstract lexical units and the phonological forms. At first sight Kohn et al. (1987) provided evidence contrary to the partial activation hypothesis in the form of a free association task. They showed that the partial information provided by participants does not in time narrow or converge on the target. However, A. S. Brown (1991) pointed out that participants might not say out loud the interlopers in the order in which they came to mind. Furthermore, in a noisy system there is
13. LANGUAGE PRODUCTION 415
no reason why each attempt at retrieval should give the same incorrect answer.
The blocking hypothesis, first suggested by Woodworth (1938), states that the target item is actively suppressed by a stronger competitor. Jones and Langford (1987) used a variant of the Brown and McNeill task known as phonological blocking to test this idea. They presented a phonological neighbor of the target word and showed that this increases the chance of a TOT state occurring, whereas presenting a semantic neighbor does not. They interpreted this as showing that TOTs primarily arise as a result of competition. Jones (1989) further showed that the blocker is only effective if it is presented at the time of retrieval rather than just before. However, Perfect and Hanley (1992) and Meyer and Bock (1992) discussed methodological problems with these experiments. Exactly the same results are found with these materials when the blockers are not presented, suggesting that the original results were an artifact of the materials. In fact, prior processing of phonologically related words actually decreases the chance of being in a tip-of-the-tongue state, and increases the probability of retrieving the target word (James & Burke, 2000), a finding consistent with the insufficient activation hypothesis: TOTs arise because there is a deficit in transmitting activation from the semantic to the phonological level. The finding that bilingual speakers are more prone to TOTs is also best explained by the insufficient activation idea; presumably the semantic–phonological links are weaker in bilingual speakers because they speak each language only some of the time (Gollan & Acenas, 2004).
Harley and Bown (1998) showed that TOTs are more likely to arise on low-frequency words that have few close phonological neighbors. For example, the words “ball” and “growth” are approximately equal in their frequency of occurrence. There are a lot of other words that sound like “ball” (e.g., “call,” “fall,” “bore”), but few that are close to “growth.” These data fit a partial activation model of the origin of TOTs rather than an interference model. Indeed, phonological neighbors appear to play a supporting rather than a blocking role in lexical access. Additional evidence for this claim comes from the finding that pictures with names in sparse phonological neighborhoods are named more slowly than pictures with names in dense neighborhoods, where there are many similar-sounding words (Vitevitch, 2002).
The TOT data best support the partial activation hypothesis. They also suggest that the levels of semantic and phonological processing in lexical retrieval are distinct. The tip-of-the-tongue state is readily explained as success of the first stage of lexicalization but failure of the second. There is some evidence that supports this idea. Vigliocco, Antonini, and Garrett (1997) showed that grammatical gender can be preserved in tip-of-the-tongue states in Italian. That is, even though speakers cannot retrieve the phonological form of a word, they can retrieve some syntactic information about it.
There is also evidence from preservation of gender in an Italian person, called Dante, who suffered from word-finding difficulties or anomia (Badecker, Miozzo, & Zanuttini, 1995). Dante could give details about the grammatical gender of words that he could not produce. Information about grammatical gender is part of the lexical-semantic and syntactic information encoded by lemmas, such as knowing that a word is a noun. Hence Dante had access to the lemmas, but was then unable to access the associated phonological forms. It is important to note that for many Italian words grammatical gender is not predictable from semantics. Furthermore, Dante could retrieve the gender for both regular and exception words, which suggests that Dante could not just have used partial phonological information to predict grammatical gender. However, while Dante’s performance is entirely compatible with the two-stage account, it is also compatible with an account where such information is stored elsewhere. Gender can be put with other syntactic information in the lexicon, such that it is stored with words. In that case, how could gender be preferentially lost? We have a choice of only three genders, but of many more phonological forms. It is possible that in an interactive activation network we would be able to retrieve the correct gender without the network being able to
416 E. PRODUCTION AND OTHER ASPECTS OF LANGUAGE
settle down enough to select the appropriate one of many phonological forms.
Further evidence that TOTs are associated with a difficulty in retrieving the phonological forms of words comes from brain imaging. Shafto, Burke, Stamatakis, Tam, and Tyler (2007) had people aged 19–88 name pictures of famous people. The number of TOTs increased with age and with atrophy of the left insula, a region of the brain known to be involved (among other things) in phonological production.
Problems with the lemma model
Although most researchers favor the two-stage model of lexicalization, there is less agreement on the need for lemmas as a level of amodal, syntactically specified representations mediating between concepts and phonological forms (Caramazza, 1997; Caramazza & Miozzo, 1997, 1998; Miozzo & Caramazza, 1997).
One point is that it is not clear that the need for lemmas is strongly motivated by the data. Most of the evidence really only demands a distinction between the semantic and the phonological levels. The strongest evidence for lemmas comes from the finding that gender can be retrieved when in the tip-of-the-tongue state, although this interpretation has been disputed. It should not be possible to retrieve phonological information for a word without retrieving the syntactic information for that word such as gender, as the phonological stage can only be reached through the lemma stage. Tip-of-the-tongue data suggest, however, that syntactic and phonological information are independent (Caramazza & Miozzo, 1997, 1998; Miozzo & Caramazza, 1997): Italian speakers can sometimes retrieve partial phonological information when they cannot retrieve the gender of the word, and vice versa. Importantly, there was no correlation between the retrieval of gender and phonological information; people are no better at recalling gender when they correctly recall the initial phoneme of the target in a TOT state than when they fail to do so. Hence, phonological retrieval does not necessarily depend on syntactic retrieval, and therefore these results do not support the idea of syntactic mediation. Arguing that lemmas are unnecessary complications, Caramazza (1997) dispenses with them. He proposes that lexical access in production involves the interaction of a semantic network, a syntactic network, and phonological forms (see Figure 13.6). Semantic representations activate both appropriate nodes in the syntactic network and the phonological network.
FIGURE 13.6 A detailed representation of Caramazza’s (1997) model. The flow of information is from semantic to lexeme and syntactic networks and then on to segmental information. N = noun; V = verb; Adj = adjective; M = masculine; F = feminine; Cn = count noun; Ms = mass noun. Dotted lines indicate weak activation. Links within a network are inhibitory. Reproduced with permission from Caramazza (1997).
If lemmas exist, given they are amodal and are syntactically specified, then grammatical impairments involving words should not be modality-specific. However, we find patients who are selectively impaired in producing words of one grammatical class in only one output modality (Caramazza, 1997; Caramazza & Miozzo, 1998). For example, patient SJD has difficulty in producing verbs in writing but not in speaking; she can produce nouns equally well in writing and speaking (Caramazza & Hillis, 1991). Although her errors include semantic substitutions, SJD does not have a central semantic impairment because she has no difficulty with comprehending these words, and because her difficulties are restricted to one output modality. It is difficult to account for this pattern of results with the lemma model (but see Roelofs, Meyer, & Levelt, 1998, for an attempt).
Another way of distinguishing between the two accounts is to examine how we produce homophones. Consider the words “none” and “nun.” According to the lemma model, these two words share a lexeme representation but have separate lemma representations. The alternative is that they just have two distinct lexeme representations. If homophones share a representation, the frequency of the lexeme representation will be the sum of the two homophones. Hence a less frequent word like “nun” will behave like a more frequent word, assuming that frequency operates at the lexeme level. Some studies find that frequency effects appear to reflect total-homophone frequency rather than word-specific frequency (Levelt, Roelofs, & Meyer, 1999). For example, Jescheniak and Levelt (1994) found that the translation speeds of a word like “nun” by Dutch–English bilinguals depended on total-homophone frequency (the rather large “none” plus “nun”) rather than word-specific frequency (the rather low frequency of just “nun”) compared with control words. In contrast Caramazza, Costa, Miozzo, and Bi (2001) found that naming latencies in a range of experimental tasks were determined just by word-specific frequency (i.e., “nun” behaves like a low-frequency word rather than a high-frequency word).
Clearly there is conflict in the data here, and it is unclear how this conflict is best resolved (Bonin & Fayol, 2002; Caramazza, Bi, Costa, & Miozzo, 2004; Jescheniak, Meyer, & Levelt, 2003). Whether we find word-specific or total-homophone frequency effects depends on the number and type of materials, the controls used, and where frequency effects operate. There is now, for example, a considerable amount of evidence that frequency affects lexical selection (the retrieval of lemmas), rather than just the retrieval of phonological forms. For example, Navarette, Basagni, Alario, and Costa (2006) found effects of frequency (in the form of faster response times for high-frequency items) on tasks in Spanish that require the retrieval of gender but not phonological properties. For example, they found frequency effects in a gender decision task, and in a task where participants had to describe pictures using pronouns rather than the name of the object.
Perhaps the best conclusion is that no firm conclusion can be drawn from these translation tasks, although picture-naming data suggest that specific-word frequency best predicts naming times (Caramazza et al., 2004). So in spite of initial optimism, homophone production does not provide clear evidence for the two-stage model.
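The frequency-inheritance prediction in the homophone argument is simple arithmetic, and a short Python sketch may help to make it concrete. The frequency counts and the function name `effective_frequency` are hypothetical, chosen purely for illustration; they are not taken from any of the studies cited. If homophones share a lexeme, and frequency operates at the lexeme level, the effective frequency of “nun” is the sum of the two homophones; if each word has its own lexeme, it is the word-specific count alone.

```python
# Hypothetical frequencies (occurrences per million words), for illustration only.
WORD_FREQUENCY = {"none": 120.0, "nun": 5.0}

# Each word maps to the full set of words that sound the same, itself included.
HOMOPHONE_SETS = {"none": ["none", "nun"], "nun": ["none", "nun"]}


def effective_frequency(word, shared_lexeme):
    """Return the frequency that should drive production speed for `word`.

    shared_lexeme=True: homophones share one lexeme, so the lexeme's
    frequency is the sum over all homophones (total-homophone frequency).
    shared_lexeme=False: each word has its own lexeme, so only the
    word-specific frequency counts.
    """
    if shared_lexeme:
        return sum(WORD_FREQUENCY[w] for w in HOMOPHONE_SETS[word])
    return WORD_FREQUENCY[word]


total = effective_frequency("nun", shared_lexeme=True)
specific = effective_frequency("nun", shared_lexeme=False)
print("total-homophone frequency of 'nun':", total)
print("word-specific frequency of 'nun':", specific)
```

On the shared-lexeme account, “nun” should be produced about as quickly as a control word matched to the summed frequency; on the separate-lexeme account, it should behave like a control word matched to its own low frequency. The experiments described above differ in which of these two patterns they find.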
In summary, although there is some dissent about the nature of the two stages, and the extent to which there are amodal, syntactically specified lemmas, there is consensus that lexical retrieval takes place in two stages, with a semantic-lexical stage followed by a lexical-phonological stage.
Is lexicalization interactive?
Given that there are two stages involved in lexicalization, how do they relate to each other? Interaction involves the influence of one level of processing on the operation of another. It comprises two ideas. First, there is the notion of temporal discreteness. Are the processing stages temporally discrete or do they overlap, as they would if information or activation is allowed to cascade from one level to the following one before it has completed its processing? The case when processing levels overlap, in that one level can pass on information to the next before it has completed its processing, is known as processing in cascade (McClelland, 1979). If the stages overlap, then multiple candidates will be activated at the second stage. For example, many lemmas will become partially activated while activation is accruing at the target. Activation will then cascade down to the phonological level. The result is that on the overlap hypothesis we get leakage between levels so that non-target lemmas become phonologically activated. We can examine this by looking at the time course of lexicalization. Second, there is the notion of the reverse flow of information. In this case, information from a lower level feeds back to the prior level. For example, phonological activation might feed back from the phonological forms to the lemmas. Overlap and reverse flow of information are logically distinct aspects of interaction. We could have overlap without reverse flow (but reverse flow without overlap would not make much sense).
The time course of lexicalization: Discrete or cascaded processing?
How do the two stages of lexicalization relate to one another in time? Are they independent, or do they overlap? That is, does the second stage of phonological specification only begin when the first stage of lemma retrieval is complete, or does it begin while the first stage is still going on? The speech error evidence of the existence of mixed whole word substitutions indicates overlap or interaction between the two stages. To make the distinction between independent and overlapping models concrete, suppose that you want to say the word “sheep.” According to the two-stage hypothesis, you formulate the semantic representation underlying sheep, and use this to activate a number of competing abstract lexical items. Obviously in the first instance these will all be semantic relatives (like “sheep,” “goat,” “cow,” etc.). The independence issue is this: Before you start choosing the phonological form of the target word, how many of these competing units are left? According to the independence (modular) theory, only one item is left active before we start accessing phonological forms. This is of course the target word, “sheep.” According to the interactive theory, any number of them might be. So according to the interactive theory, when you intend to say “sheep,” you might also be thinking of the phonological form /gout/, and this will in turn have an effect on the selection of “sheep.” Another way of putting this is that according to the discrete models, the semantic-lexical and lexical-phonological stages cannot overlap, but according to the interactive model, they can. The issues involved are exactly the same as those discussed in word recognition.
Levelt et al. (1991a) performed an elegant experiment to test between these two hypotheses. They looked for what is called a mediated priming effect: When you say “sheep,” it facilitates the recognition of the word “goat” (which obviously is a semantic relative of “sheep”); but does “goat” then go on to facilitate in turn one of its phonological neighbors, such as “goal”? Levelt et al. argue that the interactive model suggests that this mediated priming effect should occur, whereas the independence model states that it should not. The participants’ task was this: They were shown simple pictures of objects (such as a sheep) and had to name these objects as quickly as possible. This typically takes most people approximately 500 to 800 ms to do. When we see
a picture or an object, we typically spend the first 150 ms doing visual processing and activating the appropriate concept. We then spend another 125 ms or so selecting the lemma. Phonological encoding starts around 275 ms and we usually start uttering the name from 600 ms.
FIGURE 13.7 Processes involved in naming an object in a picture, according to the two-stage model of lexicalization. From Levelt et al. (1991). Panel text: “The Image. The first step on the path where thoughts flow into words can be thinking about or seeing the image of the thing you want to talk about, like a llama.”
In the interval between presentation and naming, subjects were given a word in an acoustic form through headphones (e.g., “goal”). The participants had to press a button as soon as they decided whether this second item was a word or not. That is, it was an auditory lexical decision task. There were two critical results. First, Levelt et al. found that “sheep” did not facilitate “goal”: “sheep” affected the subsequent processing of “goat,” but not of “goal.” That is, there was no mediated priming. Hence they argued that no interaction occurred. Second, in a separate experiment, they showed that “sheep” only affects the access of semantic neighbors (e.g., “goat”) early on, whereas late on it only affects the access of phonological neighbors (e.g., “sheet”). That is, there was no late semantic priming. The priming effects were inhibitory: that is, related items slowed down processing through interference. Levelt et al. concluded that, in picture naming and lexicalization, there is an early stage when semantic candidates are active (this is the lemma selection stage), and a late stage when phonological forms are active (see Figure 13.7). Furthermore, these two stages are temporally discrete and do not overlap or interact.
Dell and O’Seaghdha (1991) showed with simulations that a model that incorporated local interaction (between adjacent stages) could appear to be globally modular. This is because, in these types of model, different types of information need not spread very far (but see Levelt et al., 1991b). Only very weak mediated priming would be predicted here, insufficient to be detected by this task. Harley (1993a) showed that a model based on interactive activation could indeed produce exactly this time course while permitting interaction between levels.
Levelt et al.’s findings have also been questioned by the results of an experiment by Peterson and Savoy (1998). They did find mediated priming. They showed that “soda” is activated when we retrieve “couch”: “couch” activates its near synonym “sofa,” which in turn activates the phonologically related “soda.” The difference between their experiment and that of Levelt et al. is that whereas Levelt et al. used categorical associates (“sheep” and “goat”), Peterson and Savoy used near synonyms (“couch” and “sofa”). It is likely that categorical associates are too weakly activated to produce measurable activation of their corresponding phonological forms. Near synonyms, though, are very closely semantically related and therefore highly activated.
Whereas Peterson and Savoy used targets and distractors that had a very strong semantic relation, Cutting and Ferreira (1999) used distractors that had a very strong phonological relation to the target picture. Participants had to name pictures that had homophonic names (e.g., “ball”). Auditory distractor words were presented 150 ms before the picture onset. Homophones have the strongest phonological relation possible, because by definition the sound of the two meanings (round toy and formal dance) is identical. If the discrete stage model is correct, at such an early SOA only an appropriate-meaning semantic distractor (e.g., “game”) should have an effect. But if the cascade model is correct, then the phonological form of the inappropriate-meaning distractor (e.g., “dance”) should also have an effect. The results supported the cascade model. The appropriate-meaning distractor produced inhibition relative to an unrelated control (“hammer”), but the inappropriate-meaning distractor produced significant facilitation. The phonologically related distractor affects picture naming at the same early stage as a semantically related distractor. Similarly, Morsella and Miozzo (2002) presented participants with two superimposed pictures, and asked them to name one but ignore the other. They found that naming was faster when the two pictures were phonologically related (e.g., a picture of a bed and a bell, compared with a bed and a pin). This finding again suggests that activation from the unselected lexical node still trickles down to the phonological level.
Further support for cascade models of lexicalization comes from a study by Griffin and Bock (1998). They examined how long it took participants to name pictures embedded in sentences. They varied the degree of constraint of the
sentences and the frequency of the picture names. For example, (34) highly constrains the following target picture name, whereas (35) produces very little constraint.
(34) Boris taught his son to drive a –
(35) Boris drew his son a picture of a –
Griffin and Bock found that the effects of constraint and frequency interacted in determining naming times. High-constraint sentences show reduced frequency effects compared with low-constraint sentences. In discrete stage models there is no means for the constraint present in the lemma selection stage to influence the effect of word frequency in the separate and subsequent stage of phonological encoding. However, this finding is exactly what cascade models predict.
Data from bilingual speakers also support the cascade model. Costa, Caramazza, and Sebastian-Galles (2000) examined the naming times of pictures whose names are cognates in Catalan and Spanish (words that sound and look similar in both languages, e.g., “gat” in Catalan and “gato” in Spanish, both meaning “cat”). For bilingual speakers, if activation does indeed cascade from unselected lexical nodes, then the activation levels of the phonemes /g/ /a/ /t/ should be very high because they are receiving activation from two lexical nodes: the selected Spanish target word and the non-selected Catalan node. Costa et al. indeed found that the naming times for cognate words were shorter in bilingual speakers (but not for monolingual speakers).
In summary, these experiments show that word selection precedes phonological encoding. There is much evidence that the two stages of lexicalization overlap, and little unambiguous evidence against this idea. Costa et al. (2000) found that naming times were shorter for cognate words in bilingual (but not monolingual) speakers. Only the cascaded-processing model clearly predicts this result. In the cascade model, activation cascades down from non-selected lexical nodes (the cognates) to their phonological segments, as well as from the target nodes. The result of this additional activation of the phonological segments is to speed up naming.
Is there feedback in lexicalization?
Is there reverse information flow when we choose words? Models based primarily on speech error data see speech production as primarily an interactive process involving feedback, mainly because speech errors show evidence of multiple constraints such as a lexical bias and similarity effects (Dell, 1986; Dell & Reich, 1981; Harley, 1984; Stemberger, 1985).
A familiarity bias is the tendency for errors to produce familiar sequences of phonemes. In particular, lexical bias is the tendency for sound-level speech errors such as spoonerisms to result in a word rather than a nonword (e.g., “barn door” being produced as “darn bore”) more often than chance would predict. Of course, we would expect some sound errors to form words sometimes by chance, but Dell and Reich showed that word outcomes happen far more often than is expected by chance. This, then, is evidence of an interaction between lexical and phonological processes. This bias has been shown both for naturally occurring speech errors (Dell, 1985, 1986; Dell & Reich, 1981) and in artificially induced spoonerisms (Baars et al., 1975), and in languages other than English (e.g., in Spanish; Hartsuiker, Anton-Méndez, Roelstraete, & Costa, 2006). Some aphasic speakers show clear lexical bias in their errors (Blanken, 1998).
Similarity effects arise when the error is more similar to the target according to some criterion than would be expected by chance. In mixed substitutions the intrusion is both semantically and phonologically related to the target, such as in (36) and (37). Obviously we will find some mixed errors by chance, but we find them far more often than would be expected by chance alone (Dell & Reich, 1981; Harley, 1984; Shallice & McGill, 1978). Obviously we need a formal definition of phonological similarity; here both the target and the intrusion start with the same consonant, and contain the same number of syllables. We also find similar results in artificially induced speech errors (e.g., Baars et al., 1975; Motley & Baars, 1976) and in errors arising in complex naming tasks (Martin, Weisberg, & Saffran, 1989). Laine and Martin (1996) discuss the effect of task training on a severely anomic patient, IL. They found a strong phonological relatedness effect.
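The contrast between discrete and cascaded lexicalization, and the cognate result of Costa et al. (2000), can be made concrete with a toy calculation in Python. This is only a sketch under stated assumptions: the activation values, the function name `segment_activation`, and the use of letters to stand in for phonemes are all hypothetical, and no claim is made that any published model works in exactly this way. In the discrete version only the selected lexical node passes activation to its segments; in the cascaded version every active lexical node does.

```python
def segment_activation(lemma_activation, segments, cascade):
    """Sum the activation that reaches each phonological segment.

    cascade=True: every active lexical node passes its activation on.
    cascade=False: only the most active (selected) node does.
    """
    if cascade:
        senders = dict(lemma_activation)
    else:
        winner = max(lemma_activation, key=lemma_activation.get)
        senders = {winner: lemma_activation[winner]}

    totals = {}
    for lemma, activation in senders.items():
        for segment in segments[lemma]:
            totals[segment] = totals.get(segment, 0.0) + activation
    return totals


# Letters stand in for phonemes (a simplification).
SEGMENTS = {"gato": ["g", "a", "t", "o"], "gat": ["g", "a", "t"]}

# Hypothetical activation levels for a bilingual speaker naming in Spanish:
# "gato" is the selected target; Catalan "gat" is active but not selected.
bilingual = {"gato": 1.0, "gat": 0.5}

discrete = segment_activation(bilingual, SEGMENTS, cascade=False)
cascaded = segment_activation(bilingual, SEGMENTS, cascade=True)

print("discrete /g/:", discrete["g"])  # only the target contributes
print("cascaded /g/:", cascaded["g"])  # target plus unselected cognate
```

With these illustrative numbers the shared segments /g/, /a/, and /t/ receive more activation in the cascaded version than in the discrete one, while /o/, which belongs only to the target, is unchanged. If more segment activation means faster phonological encoding, only the cascaded version predicts a naming advantage for cognates.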
LEXICAL NETWORK
TACTIC FRAMES
1 2
C
PLURAL SWIM
Q N Plural V
V
(1) (2) (3) ?
SYNTAX
1
WORD WORD C
SQ SV Af1 Af2
(1) ?
MORPHOLOGY
SYL sw A
On Nu
Rime
s w I m
On Nu Co On On Nu Co
?
PHONOLOGY
FIGURE 13.9 Dell’s (1986) connectionist model of speech production. This figure depicts the momentary
activation present in the production of the sentence “Some swimmers sink.” On the left there are tree structures
analogous to the representation at each level of the model. The numbered slots have already been filled in and
the “flag” indicates each node in the network that stands for an item filling a slot (the number indicates the order
and the c flag indicates the current node on each level). The ? indicates the slot in each linguistic frame that is
currently being filled. The highlighting on each unit indicates the current activation level. Each node is labeled
for membership of some category. Syntactic categories are: Q for quantifier; N for noun; V for verb; plural
marker. Morphological categories are: S for stem; Af for affix. Phonological categories are: On for onset; Nu for
nucleus; Co for coda. Many nodes have been left out to simplify the network, including nodes for syllables, syllabic
constituents, and features. From Dell (1986).
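The slot-filling idea in the caption can be sketched in a few lines. This is an illustrative simplification, not Dell's (1986) implementation: the function `fill_slot`, the competitor node "drown", and all activation values are assumptions made for the example. The sketch picks, for the slot currently marked ?, the most highly activated node whose category matches that slot.

```python
def fill_slot(activations, categories, slot_category):
    """Return the most active node whose category matches the current slot."""
    candidates = [node for node in activations
                  if categories[node] == slot_category]
    if not candidates:
        raise ValueError(f"no node of category {slot_category!r}")
    return max(candidates, key=lambda node: activations[node])


# Hypothetical momentary state while producing "Some swimmers sink":
# the earlier slots are filled and the verb slot is the current one.
categories = {"some": "Q", "swimmer": "N", "sink": "V", "drown": "V"}
activations = {"some": 0.1, "swimmer": 0.9, "sink": 0.8, "drown": 0.3}

chosen = fill_slot(activations, categories, "V")

# If noise pushed a competing verb above the target, the sketch would
# select it instead, but it would still be a verb.
noisy = dict(activations, drown=0.95)
slip = fill_slot(noisy, categories, "V")
```

In this sketch the noun "swimmer" is more active than either verb but cannot fill the verb slot, so any error it produces replaces one verb with another.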
any pattern of results. It has become difficult to distinguish empirically between the cascade and discrete models.
Hence all the data are consistent with non-discrete, cascading models. Levelt et al. (1991a) argued that real-time picture-naming experiments
13. LANGUAGE PRODUCTION 425
present a more accurate view of the normal lexicalization process. Nevertheless, any complete model of lexicalization should also provide an account of the speech error data. Feedback explains similarity and lexical biases, but it is of course most unlikely that feedback connections should exist just to give phonological facilitation in speech errors (Levelt, 1989). One reason feedback links might exist is that the system is used in speech production and comprehension, but this is implausible given experimental and neuropsychological evidence for a separation (see Chapter 15). Hence models with feedback are in some respects problematical.
One possibility is to explain the speech error data away. Given that the main evidence for interaction is facilitation and lexical bias, perhaps these phenomena can be explained by other mechanisms. An alternative explanation is the use of monitors (Baars et al., 1975; Butterworth, 1982; Levelt, 1989; Postma, 2000). Of course we monitor our speech; we sometimes detect errors and correct them. The idea that we make use of a comprehension system to monitor what we say is called the perceptual-loop hypothesis. Postma (2000) discusses three ways in which a monitor might operate: It might be completely perceptual, having access only to our speech output; it might have access to levels of processing prior to output, comparing intermediate levels of representation against the conceptual message; or it might make use of relative information about activation levels (e.g., if two lemmas are simultaneously very highly activated, a warning light might flash). It is, however, difficult to distinguish between these alternatives, and indeed all might well be true.
The use of a monitor to edit some slips adds complexity to the system (Stemberger, 1985). We also observe aphasic speakers with error patterns that contradict the editor hypothesis. For example, Blanken (1998) describes a patient who makes errors that come from different syntactic categories on some occasions, but not on others. The editor should be very good at detecting syntactic category violations and should be consistent. So, although the monitor might sometimes prevent some types of error more than others, it is unlikely to be able to do so to the extent that can account for the number of mixed errors actually found. Generally the dissociation between aphasic speakers with comprehension deficits who show good error detection is a problem for the perceptual-loop hypothesis. Instead, it might be that speech error detection arises from the ability of the speech production system to detect conflicts between planned output and intention, using mechanisms located in the anterior cingulate cortex of the brain (Nozari, Dell, & Schwartz, 2011).
What role does feedback serve? Feedback is unlikely to be the same mechanism that is used in comprehension: speech production is not just comprehension in reverse. (For detailed justification of this statement, see Chapter 12.) Any increase in processing speed that feedback provides is likely to be marginal, and feedback is most unlikely to exist just to ensure that errors are words. One possibility is that it plays a role in monitoring speech and detecting and preventing errors.
Connectionist modeling provides an alternative explanation to feedback. In Chapter 7 we saw how mixed errors can arise in a feed-forward architecture, as one of the properties of an attractor network (Hinton & Shallice, 1991). Perhaps in a similar way we can talk about phonological attractors. More work is necessary on this topic.
Rapp and Goldrick (2000) reviewed the literature on discreteness and interactivity, paying particular attention to the pattern of errors made by normal and brain-damaged people. This review provoked a lively debate (Rapp & Goldrick, 2004; Roelofs, 2004a, 2004b). Rapp and Goldrick (2000) argued that the degree of bias towards mixed errors and the lexical bias in errors made by normal individuals can only plausibly be accounted for by the presence of feedback in the system. Furthermore, brain damage can disrupt language production at either the semantic or the post-semantic level, and yet lead to only semantic errors. However, individuals with brain damage show the mixed-error effect only if the locus of damage is post-semantic; a semantic locus of impairment leads to semantic errors but no larger number of mixed errors than would be expected
tend to create words, and the majority of them are anticipations. The bad pattern is when there are many errors and the proportion of perseverations is high. The bad pattern is found with some types of aphasia, in childhood when the material is less familiar, and with a faster speech rate. Frame-based models are very good at accounting for these sorts of data. Decreasing the available time and weakening connection strengths in the model both lead to an increase in the bad error pattern.
The second account, competitive queuing (Hartley & Houghton, 1996), is a connectionist model that also uses a frame, but which provides an explicit mechanism for inserting segments into slots. The segments to be inserted form an ordered queue controlled by processes of activation and inhibition. There are two control units, an initiation and an end unit. Sounds that belong at the start of a word have strong connections to a unit that controls the initiation of speech, while sounds at the ends of words have strong connections to a unit that controls the end of the sequence. The strength of connections of other sounds in a word to these control units varies as a function of their position in a word. After a sound is selected, it is temporarily suppressed. Failure to do this properly leads to perseveration errors. Although this model was originally formulated to account for serial order effects in remembering lists, it can be extended to account for all of speech production. It has the advantage of being able to learn how to order items.
Connectionist models suggest that the frame–filler distinction does not have to be represented explicitly, but that it can emerge from the phonological structure of the language (Dell, Juliano, & Govindjee, 1993). Dell et al. used a type of connectionist network called a recurrent network to associate words with their phonological representations in sequence, without any explicit representation of the structure–content distinction. Recurrent networks are very good at learning sequences of patterns. Dell et al.’s model incorporated two kinds of feedback. External feedback copied the output of the most recent segment, and therefore provided the model with memory of the past phonological states of the model. Internal feedback copied the past state of the hidden units of the network, and therefore provided the model with memory of its past internal structure. When the model made errors, it exhibited four properties observed in human sound speech errors. First, it obeyed the phonotactic constraint: errors result in sound sequences that occur in the language spoken. Second, consonants exchanged with other consonants, and vowels exchanged with other vowels. Third, the syllabic constituent effect is that vowel–consonant errors are less common than consonant–vowel errors. Finally, initial consonants are more likely to slip than non-initial ones.
Phonological encoding in the lemma model
The final account of phonological encoding is provided by the WEAVER++ model of Levelt, Roelofs, and colleagues (e.g., Levelt, 2001; Levelt, Roelofs, & Meyer, 1999; Roelofs, 1992, 1997a, 1997b, 2002, 2004a, 2004b; see Figure 13.10). WEAVER++ is a discrete two-stage model without any interaction between levels. Concepts select lemmas by enhancing the activation level of the concept dominating the lemma. Activation spreads through the network, with the important restriction that cascaded processing is not permitted, so that activation of the corresponding word form can only begin after a unique lemma has been selected. A phonological code is retrieved for each lemma; for multimorphemic words the phonological code is retrieved for each of the morphemes (e.g., if the target is “horses,” we retrieve “horse” and “-z”). The phonological codes are spelled out as ordered sets of phonemes. The phonological code is retrieved for the word as a whole; in picture–word interference studies, priming by parts of words facilitates the naming of the target (e.g., naming a hammer is facilitated by presenting “mer” as a distractor), suggesting that all the parts of the word have been retrieved in one go (Levelt, 2001; Roelofs, 1997a, 1997b). These ordered sets of phonemes are then incrementally strung together to form syllables, a process known as syllabification. Syllables are not stored in the lexicon; rather, we create them as
[Figure 13.10 diagram. Recoverable labels: lexical concept; lemma; phonological word; Stage 6, articulation; self-monitoring.]
we go along, depending on the context. As syllables are composed, they form the input to the final step of encoding, that of phonetic encoding, which forms the details of the sounds and acts as an input to the articulatory apparatus.
An important concept in phonetic encoding is the mental syllabary. The syllabary is a store of highly practiced syllabic “gestures” that can drive articulation; as syllabification proceeds, the corresponding syllabic patterns are retrieved from the syllabary for execution (Levelt, 2001; Levelt et al., 1999). Evidence for the existence of the syllabary comes from the finding that, when word frequency is controlled for, syllable frequency affects naming times (Cholin, Levelt, & Schiller, 2006; Levelt & Wheeldon, 1994). Although English has more than 10,000 different syllables, 80% of the time we use just 500 (Levelt, 2001). It makes sense to make use of these highly overlearned motor patterns to speed up production.
These models are perhaps not as mutually exclusive as they might first appear. They represent evolution in theorizing, and also emphasize different aspects of phonological encoding. The main difference is once again the extent to which information has to be explicitly encoded in the model, or whether it emerges as a consequence of the statistical regularities of the language. At present, frame-based models are better able to account for how we can produce novel sequences of sounds (Dell, Schwartz, Martin, Saffran, & Gagnon, 1997).
The role of syllables in phonological encoding
One major difference between many of the connectionist and WEAVER++ models concerns
the role of the syllable. Most connectionist models make use of metrical frames that specify the number, order, and structure of syllables and their stress pattern; syllables are then inserted into this metrical frame. In contrast, in the WEAVER++ model the metrical frame specifies only the stress pattern, and does not contain syllable information.
We can test this distinction, although the experiments are complex. Roelofs and Meyer (1998) examined whether we store the structure of syllables in the metrical frame. They used an implicit priming paradigm. Participants had to produce one word out of a small set of words as quickly as possible. The sets of words were either homogeneous, when all the words in the set had the same word-initial segments, or heterogeneous, when they did not. They found that priming depended on the words having the same number of syllables and the same stress pattern, but not the same syllable structure (the same number of consonants and vowels). Roelofs and Meyer concluded that the lack of priming suggests that syllable structure is not stored in the metrical frame. Cholin, Schiller, and Levelt (2004) used the same paradigm, and concluded that syllable frames are not stored with a word and retrieved during encoding, but instead are generated “on the fly.” The general idea with these studies is that if syllables are not explicitly stored in the lexicon, there should be no syllable-specific priming effect, which is what these studies find. Hence they support the view that syllables are made up only when necessary, as in the WEAVER++ model.
Other studies come to a different conclusion. Costa and Sebastian-Gallés (1998) used a picture–word interference paradigm: Participants had to name a picture while a word was presented 150 ms later. The results showed that participants were faster to name the picture when the target and the distractor shared the same abstract structure. For example, “cuña” (meaning “wedge”) has a CV.CV (consonant–vowel consonant–vowel) structure. “Cuña” primes the target word “mono” (monkey), which has the same syllabic structure (CV.CV), but no overlap in actual sounds (segmental content), relative to a control item (e.g., “culpa,” meaning fault, which is structurally and segmentally unrelated). This result suggests that abstract syllabic structures are used in phonological encoding.
It is difficult to come to any firm conclusion about the existence of pre-stored, abstract syllabic structures on the basis of the current contradictory findings (see Cholin et al., 2006, for a summary).
How far do we plan ahead?
What is the main unit of planning at the phonological level? According to Levelt (1989), we have to prepare the phonological word before we can start speaking. The phonological word is the smallest prosodic unit of speech: it is a stressed (strong) syllable and any associated unstressed (weak) syllables (Levelt, 1989; Sternberg, Knoll, Monsell, & Wright, 1988; Wheeldon & Lahiri, 1997). For example, “the vampire” is one phonological word; “the bad vampire” is two. The phonological word is prepared prior to rapid execution. Wheeldon and Lahiri showed that when all other factors are controlled for (e.g., syntactic structure, number of lexical items, and number of syllables), the time it takes us to prepare a sentence (as measured by the time it takes us to begin speaking the prepared material) is a function of the number of phonological words in it.
In addition to content words, phonological words can contain function words, although in some circumstances function words can form phonological words in themselves if we decide to stress them (e.g., “you CAN do that”). Further evidence for the importance of phonological words in phonological planning is that resyllabification occurs within phonological words, but not across them. This means that sounds from the end of one syllable can migrate to form the beginning of the next syllable. Consider (40) from Wheeldon and Lahiri (1997):
(40) Get me a beer, if the beer is cold.
A final /r/ sound has been added explicitly to the end of the second “beer,” and this has then resyllabified to become the onset of the following “is,” so that it is pronounced “beea-riz.” No such resyllabification can occur with the first “beer,” however, because the following /I/ is in a different phonological word.
On the other hand, some more recent work suggests that we do plan farther ahead than one
phonological word. For example, Costa and Caramazza (2002) used a picture–word interference design to examine how we produce noun phrases in English and Spanish. They asked speakers to produce simple (determiner noun) and complex (determiner adjective noun) constructions while ignoring phonological distractors. They found that the distractors phonologically related to the noun produced faster naming latencies, regardless of the type of construction and the position of the noun. This result shows that the level of activation of the phonological forms of the lexical nodes outside the first phonological form affects naming latency, meaning that the second phonological word of the noun phrase (the noun, in the complex construction) is activated before articulation begins (because it is facilitated by the prime). Hence, in at least some circumstances, phonological encoding extends beyond a phonological word (see also Alario, Costa, & Caramazza, 2002a, 2002b; Levelt, 2002).
One possible resolution of these apparently discrepant findings is that the phonological representations of words are activated in a graded way as we speak; the closer to output an item is, the more it is activated (Jescheniak, Schriefers, & Hantsch, 2003).
THE ANALYSIS OF HESITATIONS
Hesitation analysis is concerned with the distribution of pauses and other dysfluencies in speech (see Figure 13.11). An unfilled pause is simply a moment of silence. A filled hesitation can be a filled pause (where a gap in the flow of words is filled with a sound such as “uh” or “um”), a repetition, a false start, or a parenthetical remark (such as “well” or “I mean”). People often start what they are saying, hesitate when they discover that they haven’t really worked out what to say or how to say it, and repeat their start when they have (Clark & Wasow, 1998). Unfilled pauses are easier to detect mechanically by the equipment used to measure pause duration, so analysis has focused on them. It has been argued that pauses represent two types of difficulty: one in what might be called microplanning (due to retrieving particularly difficult words), and a second in macroplanning (due to planning the syntax and content of a sentence). The theoretical emphasis in the past has been that pauses predominantly reflect semantic planning.
Pauses and lexicalization
Goldman-Eisler (1958, 1968) examined the distribution of unfilled pauses (defined variously as longer than 200 or 250 ms) across time, using a device nicknamed the “pauseometer.” Obviously there are gaps between speakers’ “turns” in conversation, known as switching pauses, but there are many pauses within a single conversational turn. They tend to occur every five to eight words.
Goldman-Eisler (1958, 1968) showed that pauses are more likely to occur, and to be of
[Figure 13.11: Speech dysfluencies.]
longer duration, before words that are less predictable in the context of the preceding speech. Predictability reflects a number of notions, including word frequency and familiarity, and the preceding semantic and syntactic context. Pauses before less predictable words are hypothesized to reflect microplanning and to correspond to a transient difficulty in lexical access. We know the meaning of the word we want to say but we cannot immediately retrieve its sound. Of course, not all hesitations precede less predictable words, and not all less predictable words are preceded by pauses. Sections of repeated speech behave differently from pauses, tending to follow unpredictable words rather than preceding them, as though they are used to check that the speaker has selected the correct word (Tannenbaum, Williams, & Hillier, 1965).
Beattie and Butterworth (1979) attempted to disentangle the effects of word frequency from contextual probability. They showed that the relation between pausing and predictability did not appear to be attributable simply to word frequency, and concluded that the main component of predictability that determined hesitations was difficulty in semantic planning. However, their study did not rule out possible contributions from syntactic difficulty (Petrie, 1987).
People often use appropriate gestures during these hesitations (Butterworth & Beattie, 1978). Suppose you are having difficulty in retrieving the word “telephone.” You pause just before you say it, and in that pause make a gesture appropriate to a telephone (such as holding your fist to the side of your head, with thumb and little finger extended). This suggests that you know the meaning of what you want to say, that is, that the difficulty lies elsewhere than in semantic planning. It suggests a two-stage model of lexical access in production. We first formulate a semantic specification of what we want to say, and phonological retrieval follows this. On this account the pause reflects a successful first stage but a delay in the second stage, that of retrieving the particular phonological form of the word. This account ties in with the evidence from tip-of-the-tongue states, which can be seen as extreme examples of microplanning pauses.
Pauses and sentence planning
Goldman-Eisler (1958, 1968) argued that in some pauses we plan the content of what we are about to say. She found that the difficulty of the speaking task affected the number of pauses a speaker makes, with more difficult tasks (for example, interpreting a cartoon rather than simply describing a cartoon) leading to more pauses in speech. She argued that speakers were using these additional pauses to carry out additional planning.
Pauses cast some light on the size of planning units in speech. Maclay and Osgood (1959) argued that the planning units must be larger than a single word because false starts involve corrections of the grammatical words associated with the unintended content-bearing words. We tend to produce corrections such as “The dog - the cat was …” Boomer (1965) argued on the basis of hesitations that an appropriate unit of analysis corresponds to a phonemic clause that essentially has only one major stressed element within it, and which corresponds to a clause of the surface structure. He argued that the clause is planned in the hesitation at the start of the clause. Ford and Holmes (1978) used dual-task performance to monitor cognitive load during speech production, whereby the participant had to speak while monitoring for a tone over headphones. They argued that planning does not span sentences because reaction times to the tone were no longer at the ends of sentences, suggesting that people are not planning the next sentence at the end of the previous one. On the other hand, Holmes (1988) asked participants to read several sentences that began a story, and then produce a one-sentence continuation. She found that, contrary to instructions, some speakers produced more than one sentence, and when they did so a pause was more likely at the start of their speech than when they produced only one sentence. Different tasks seem to indicate that different units are the fundamental unit. Nevertheless, the clause does seem to be an important unit of planning.
What exactly is planned in the pauses? In particular, is the planning syntactic or semantic in nature, or both? Goldman-Eisler (1968)
claimed that pause time was not affected by the syntactic complexity of the utterances being produced, and concluded that planning is primarily semantic rather than syntactic. This conclusion is now considered controversial (see Petrie, 1987). One problem concerns what measure should be taken of syntactic complexity. At this stage it would be premature to rule out the possibility that macroplanning pauses represent planning both the semantic and the syntactic content of a clause.
Henderson, Goldman-Eisler, and Skarbek (1966) proposed that there were cognitive cycles in the planning of speech. In particular, phases of highly hesitant speech alternate with phases of more fluent speech. The hesitant phases also contain more filled pauses, and more false starts, than the fluent phases. It is thought that most of the planning takes place in the hesitant phase, and in the fluent phase we merely say what we have just planned in the preceding hesitant phase (see Figure 13.12). Butterworth (1975, 1980) argued that a cycle corresponds to an idea. He asked independent judges to divide other speakers’ descriptions of their routes home into semantic units, and compared these with hesitation cycles. An idea lasts for several clauses. Roberts and Kirsner (2000) found that new cycles are associated with topic shifts in conversation.
One problem with this work is the way in which the units were identified by inspection of plots of unfilled pauses against articulation time. Jaffe, Breskin, and Gerstman (1972) showed that apparently cyclic patterns could be generated completely randomly. However, other phenomena (such as filled hesitations) also cluster within the planning phase of a cognitive cycle. For example, speakers tend to gaze less at their listeners during the planning phase, maintaining more eye contact during the execution phase (Beattie, 1980; Kendon, 1967). The use of gestures also depends on the phase of speech (Beattie, 1983). Speakers tend to use more batonic gestures (gestures used only for emphasis) in the hesitant phases, and more iconic gestures (gestures that in some way resemble the associated object, such as the one described earlier when about to say “telephone”) in the fluent phase (particularly before less predictable words). The observation that several features cluster together in hesitant phases suggests that these cycles are indeed psychologically real. Finally, Roberts and Kirsner (2000) used the statistical technique of time series analysis to find further support for the existence of temporal cycles.
Evaluation of research on dysfluencies
Some dysfluencies might do more than just indicate temporary processing difficulty. Sometimes speakers deliberately (though perhaps usually unconsciously) put pauses into their speech to make the listener’s job easier, perhaps aiding them to segment speech, or to give them time to parse the speech (see also Chapter 14, on audi-
[Figure 13.12 plot. Recoverable axis label: total amount of pause time.]
dysfluencies when parsing the input (Ferreira & Bailey, 2004). For example, filled pauses and repetitions are more common at the start than at the end of clauses; the parser could therefore make use of this information to decide on clause boundaries when there are alternative constructions (e.g., in garden path sentences). The use of “oh” indicates to the listener that the following utterance is not connected to the immediately preceding information, but to something earlier in the conversation (Fox Tree & Schrock, 1999). “Uh” and “um” may serve different functions in speech, with “uh” signaling a short delay, and “um” a longer delay, in speaking (Clark & Fox Tree, 2002). Hence dysfluencies do more than just reflect processing difficulty; they convey information to the listener. Of course, it is quite possible that any one particular dysfluency might serve more than one function.
Different types of pause might have different causes. Goldman-Eisler (1958) argued that micropauses (those shorter than 250 ms) merely reflect articulation difficulties rather than planning time; however, this view has been challenged (see, for example, Hieke, Kowal, & O’Connell, 1983). There is some measure of interchangeability between different types of hesitations. Beattie and Bradbury (1979) showed that if speakers were dissuaded from making many lengthy pauses (by being “punished” by the appearance of a red light every time they paused for longer than 600 ms), their pause rate did indeed go down, but the number of repeats they made went up instead.
Although the early work was originally interpreted as showing that pausing reflected semantic planning, this is far from clear. It is likely that microplanning difficulties arise in retrieving the phonological forms and planning propositions, whereas macroplanning pauses reflect both semantic and syntactic planning of larger chunks of language. It is possible that macroplanning and microplanning may conflict (Levelt, 1989); if we spend too much time on macroplanning, there will be fewer resources available for microplanning, leading to an increase in pausing and decreased fluency as we struggle for particular words.
THE NEUROSCIENCE OF SPEECH PRODUCTION
What else does neuroscience tell us about speech production?
Aphasia
In the past, researchers placed a great deal of emphasis on the distinction between Broca’s and Wernicke’s aphasias. These terms refer to what were once considered to be syndromes, or symptoms that cluster together, resulting from damage to different parts of the left hemisphere. Broca’s area is toward the front
[Figure 13.13 diagram of the brain. Recoverable labels: motor cortex; Broca’s area.]
of the brain, in the frontal lobe, and Wernicke’s area is toward the rear, in the posterior temporal lobe (see Figure 13.13). These terms are also still meaningful for clinicians and neurologists, and they are still acceptable terms in those literatures.
Broca’s aphasia
Broca’s aphasics have non-fluent speech, characterized by slow, laborious, hesitant speech, with little intonation (called dysprosody), and with obvious articulation difficulties (called speech apraxia). There is also an obvious impairment in the ability to order words. At the most general level, Broca’s-type patients have difficulty with sequencing units of the language. An example of Broca’s aphasia is given in (41) (from Goodglass, 1976, p. 238), where the dots indicate long pauses. Although all Broca’s patients suffer from different degrees of speech apraxia, not all obviously have a syntactic disorder.
(41) “Ah … Monday … ah Dad and Paul … and Dad … hospital. Two … ah … doctors … and ah … thirty minutes … and yes … ah … hospital. And er Wednesday … nine o’clock. And er Thursday, ten o’clock … doctors. Two doctors … and ah … teeth.”
Wernicke’s aphasia
Damage to Wernicke’s area, which is in the left temporal-parietal cortex, results in the production of fluent but often meaningless speech. This is called Wernicke’s (sometimes sensory) aphasia. As far as one can tell, patients speak in well-formed sentences, with copious grammatical elements and with normal prosody. Comprehension is noticeably poor, and there are obvious major content word-finding difficulties, with many word substitutions and made-up words. Zurif, Caramazza, Myerson, and Galvin (1974) found that patients were unable to pick the two most similar words from triads such as “shark, mother, husband.” An example of the speech of someone with Wernicke’s aphasia is given in (42) (from Goodglass & Geschwind, 1976, p. 410):
(42) “Well this is … mother is away here working her work out o’here to get her better, but when she’s looking, the two boys looking in the other part. One their small tile into her time here. She’s working another time because she’s getting, too …”
For Wernicke, this type of aphasia resulted from the disruption of the “sensory images” of words. Clearly aspects of word meaning processing are disrupted in this type of aphasia, while syntactic processing is left relatively intact.
Comparison of Broca’s and Wernicke’s aphasias
Broca’s and Wernicke’s aphasias are not really mirror images. They are distinguished on two dimensions: intact versus impaired comprehension, and the availability or unavailability of the syntactic components of language (see Figure 13.14). This categorization relates more to the links between the characteristics of the impaired speech and anatomical regions of the brain, while currently the emphasis is on developing more functional descriptions relating to psycholinguistic models of the impairments. It is now considered more useful to distinguish between fluent aphasia, which is characterized by fluent (though sometimes meaningless) speech, and non-fluent aphasia. At the same time we can also distinguish between those patients who can comprehend language and those who have a comprehension deficit. Traditional Broca’s-type aphasics are non-fluent with no obvious comprehension deficit, whereas traditional Wernicke’s-type aphasics are fluent with an obvious comprehension deficit. Bear in mind that no classification scheme for neuropsychological disorders of language is perfect: there are always exceptions and patients who appear to cut across categories (see Schwartz, 1984). Furthermore, all patients have some degree of anomia (word-finding difficulties, discussed in more detail below), even agrammatic Broca’s aphasics (Dick et al., 2001).
Agrammatism
The syntactic disorder of non-fluent patients tells us a great deal about the processes involved in speech production. In traditional neuropsychology terms, such patients suffer from what has been labeled agrammatism.
Agrammatism has three components. First, there is a sentence construction deficit, such that patients have an impaired ability to output correctly ordered words. The words do not always form sentences, but look as though they are being output one at a time. In some cases, simple sentences can be generated (e.g., a patient might repeat “the old man is washing the window” as “the man is washing window. The man is old”; Ostrin & Schwartz, 1986). The disorder extends to sentence repetition, where complex phrases are simplified. Second, some parts of speech are better preserved than others. In particular, there is a selective impairment of grammatical elements, such that content words are best preserved, and function words and word endings (bound inflectional morphemes) are least well preserved. Third, although for some time it was thought that their comprehension was spared, some people with agrammatism also have difficulty in understanding syntactically complex sentences (see Chapter 10). It is also possible that certain differences between agrammatic speakers reflect different adaptations to the deficit. For example, some people show better retention of bound morphemes, and others of free grammatical morphemes.

Whether or not these components are dissociable is an important question. There has been considerable debate as to whether terms such as Broca’s aphasia and agrammatism have any place in modern neuropsychology. The debate centers on whether agrammatism is a coherent deficit: Do people with agrammatism show symptoms that consistently cluster together, and hence, is there a single underlying deficit that can account for them? If it is a meaningful syndrome, we should find that the sentence construction deficit, grammatical element loss, and a syntactic comprehension deficit should always co-occur. A number of single-case studies have found dissociations between these impairments (Caplan et al., 1985; Goodglass & Menn, 1985; Miceli, Mazzucci, Menn, & Goodglass, 1983; Nespoulous et al., 1988; Saffran et al., 1980; Schwartz et al., 1987).

These dissociations suggest that there is a syntax module in the brain, but that the module itself has neurologically distinct components. This idea is supported by recent neuroimaging data (Grodzinsky & Friederici, 2006). Grodzinsky and Friederici identify different sorts of syntactic processing, and indicate where they might take place in the brain (see Figure 13.15). Broca’s area is particularly important for identifying how different constituents in the sentence are related to each other, with regions in the superior temporal gyrus (including Wernicke’s area) more involved in syntactic integration. Imaging suggests that even parts of the right hemisphere play some role in syntactic processing.
FIGURE 13.15 The main brain areas involved in syntactic processing. Pink areas (frontal operculum and anterior superior temporal gyrus) are involved in the build-up of local phrase structures; the yellow area (BA33/45) is involved in the computation of dependency relations between sentence components; the striped area (posterior superior temporal gyrus and sulcus) is involved in integration processes. Reprinted from Grodzinsky and Friederici (2006).

More recently it has been observed that agrammatism can be observed in a wide range of aphasic patients, and is not restricted to non-fluent aphasics (Dick et al., 2001). Agrammatism can even be observed in neurologically intact people under stress.

If there is no such syndrome as agrammatism, it is meaningless to perform group experiments on what is in fact a functionally disparate group of patients. Instead, one should only perform single-case studies (Badecker & Caramazza, 1985). In reply Caplan (1986) argued that at the very least agrammatism is a convenient label. Although there might be subtypes, there is still a meaningful underlying deficit. This issue sparked considerable debate, both on the status of agrammatism (see Badecker & Caramazza, 1986, for a reply to Caplan) and on the methodology of single-case studies (see Bates et al., 1991; Caramazza, 1991; McCloskey & Caramazza, 1988).

Explanations of agrammatism
One explanation of agrammatism is that the patients’ articulation difficulties play a causal role. It might be that patients find articulation so difficult that they drop function words in an attempt to conserve resources. But agrammatism is much more than a loss of grammatical morphemes, as there is also a sentence construction and, in most cases, a syntactic comprehension deficit.

Other theories attempt to find a single underlying cause for the three components. One obvious suggestion is that Broca’s area is responsible for processing function words and other grammatical elements (see also Chapter 10). We saw earlier that content and function words suffer very different constraints in normal speech production: for example, they never exchange with each other in word exchange speech errors. There is also some neuropsychological evidence that content and function words are served by different processing routines. French-speaking agrammatic patients made more phonological errors on reading function words than matched content words (Biassou, Obler, Nespoulous, Dordain, & Harris, 1997), a finding often observed in deep dyslexia, which often co-occurs with agrammatism. Probabilistic difficulty in accessing grammatical elements will lead to difficulty in understanding complex syntactic constructions, and deficits in syntactic production (Pulvermüller, 1995). Along these lines, Kean (1977) proposed a single phonological deficit hypothesis, later revised by Lapointe (1983), based on the assignment of stress to a syntactic frame. Kean argued that agrammatic patients omit items that are unstressed components of phonological words (see earlier). Hence content words tend to be preserved, and affixes and function words are lost. This hypothesis sparked considerable debate (see Caplan, 1992; Grodzinsky, 1984, 1990; Kolk, 1978). The main problem is that although it explains grammatical element loss, it does not account so well for the other components of the disorder (particularly the sentence construction deficit), nor for the patterns of dissociation that we can observe, in particular the patients’ ability to make judgments about the grammaticality of sentences. Furthermore, as we saw in Chapter 10, the conclusion that function
and content words are processed differently is questionable.

Stemberger (1984) compared agrammatic errors with normal speech errors. He proposed that in agrammatic patients there is an increase in random noise, and an increase in the threshold that it is necessary to exceed for access to occur. In these conditions substitutions and omissions, particularly of low-frequency items, occur. He argued that agrammatism is a differential exacerbation of problems found in normal speech; this idea, that aphasic behavior is just an extreme version of normal speech errors, is one frequently mentioned. Harley (1990) made a similar proposal for the origin of paragrammatisms. These are errors involving misconstructed grammatical frames, and can be explained in terms of excessive substitutions. Again, however, these approaches do not explain all the characteristics of agrammatism. Although uninflected words are more common than inflected forms, the high-frequency function words are more likely to be lost than content words, which are of lower frequency, on average. Stemberger argued that the syntactic structures that involve function words are less frequent than structures that do not.

Schwartz (1987) related agrammatism to Garrett’s model. Consider what would happen in this model if there were a problem translating from the functional level to the positional level. No sentence frame would be constructed, and no grammatical elements would be retrieved. This is what is observed. This does not provide an account of the comprehension deficit, which would arise from damage to other systems. The dissociation between the sentence construction deficit and grammatical element loss suggests that different processes must be responsible in Garrett’s model for constructing the sentence frame and retrieving grammatical elements. Although lacking detail, this line of thought both supports and extends Garrett’s model, and shows how neuropsychological impairments can be related to a model of normal processing.

We saw that reduced computational resources might play some role in the syntactic comprehension deficit. Similarly, limited memory might play some role in agrammatic production. However, any role is a complicated one, as severely reduced short-term memory (STM) does not necessarily lead to agrammatism (Kolk & van Grunsven, 1985; Shallice & Butterworth, 1977). Hence any impairment would have to be to some component of memory other than the phonological loop. This could be to a specialist store for syntactic planning, or perhaps to a special part of the central executive component of working memory. Nevertheless, reduced computational resources may play some role in the production deficits in agrammatism (Blackwell & Bates, 1995; Kolk, 1995). If this is so, one possibility is that grammatical elements are particularly susceptible to loss when computational resources are greatly reduced.

Jargon aphasia
Jargon aphasia is an extreme type of fluent aphasia in which syntax is primarily intact, but speech is marked by gross word-finding difficulties. People with jargon aphasia often have difficulty in recognizing that their speech is aberrant, and may become irritated when people fail to understand them, indicating a problem with self-monitoring (Marshall, Robson, Pring, & Chiat, 1998).

The word-finding difficulties in jargon aphasia are marked by content-word substitutions (paraphasias) and made-up words (neologisms). Paraphasias include unrelated verbal paraphasias, such as (43), semantic paraphasias (44), form-based or formal paraphasias (45) (all from Martin & Saffran, 1992, and Martin, Dell, Saffran, & Schwartz, 1994), and phonemic paraphasias (46) (from Ellis, 1985). Of particular interest are neologisms, which are made-up words not to be found in a dictionary. There are a number of types of neologisms, including distortions of real words, for example (47) and (48) (from Ellis, 1985), and abstruse paraphasias with no discernible relatives, where it is often difficult to discern the intended word (49) (from Butterworth, 1979). As an example, consider the description (50) of connected speech. This is a description by patient CB (from Buckingham, 1981, p. 54) of the famous Boston “cookie theft” picture, which depicts a mother washing plates while the sink overfills, while in the background a little boy and girl steal the cookies.
FIGURE 13.16 Proportion of naming errors by deep dysphasic patient, NC, and the lesioned version of Dell’s (1986) model of speech production (where the lesion led to abnormal decay of activation). Panels: patient NC (n = 172) and the model with q = 0.92 (n = 1,000 trials); each plots proportion of responses (0.00 to 0.50) against response type. The response categories are: C = correct; S = semantic error; F = formal paraphasia; N = neologism (nonsense word); S→F = formal paraphasia on a semantic error; S→N = neologism on a semantic error. q is decay rate. Figure from Martin et al. (1994).

Currently the most comprehensive computational model of aphasia is based on Dell’s (1986) model of speech production. Martin and Saffran (1992) reported the case of patient NC, a young man who suffered a left hemisphere aneurysm that resulted in a pathological short-term memory span and a disorder known as deep dysphasia. This is an aphasic analog of deep dyslexia; it is a relatively rare disorder marked by an inability to repeat nonwords and the production of semantic errors in the repetition of single words (see Howard & Franklin, 1988). Additionally, in word naming NC produced a relatively high rate of formal paraphasias (sound-related word substitutions, such as producing “schools” for “skeleton”) (see Figure 13.16). Martin and Saffran argued that the semantic errors in word repetition and the formal paraphasias in production arise because of a pathological increase in the rate at which the activation of units decays. In naming, formal paraphasias arise because when the lexical unit corresponding to the target is activated, activation spreads to the appropriate phonological units. Feedback connections from the phonological to the lexical level ensure that lexical units corresponding to words that are phonologically similar to the target word become activated. Martin and Saffran argued that if the activation of lexical units decays pathologically quickly, then the target lexical unit (as well as semantically related lexical units primed by earlier feedforward activation) will be no more highly activated than other phonologically related lexical units that have been activated later by phonological–lexical feedback. Repetition errors are accounted for by a similar, but reversed, mechanism. The target and phonologically related lexical units are primed early by feedforward activation from auditory input, and suffer more from decay. This activation feeds forward to semantic feature units that in turn feed back to the lexical network to refresh the activation of the decaying target unit. At the same time, this feedback primes semantically related units. Because they are primed
later, the semantic competitors suffer less from the cumulative effects of the decay impairment, and thus the likelihood increases that they will be selected instead of the target and phonologically related words. It is difficult to sustain the activation of the target lexical unit given rapid decay, particularly when it is hindered in other ways (such as when the target is low frequency, or is supported by impoverished semantic representations).

The idea that a pathological rate of decay and impaired activation processes play a central role in word retrieval deficits has been developed further. Dell, Schwartz, Martin, Saffran, and Gagnon (1997) simulated these deficits with Dell’s computational model of speech production. The basic model (called the DSMSG model after the authors) is the interactive two-stage model described earlier: Activation flows from the semantic level through the lemma level to the phoneme level. There are feedback connections between levels. Dell et al. impaired the functioning of the network by reducing the connection weights or increasing the decay rate of the model (or both). These changes were made globally: the same parameter determines processing at each level. Decreasing the connection strength produces a large increase in the number of nonword errors, and a small increase in the number of semantic and phonological word substitutions. Increasing the decay rate at first increases the number of semantic and phonological word substitutions, although eventually more nonword errors are created. The most important dimension determining performance is the severity of damage: Aphasic naming performance lies on a continuum between normal performance and a completely random pattern. As damage becomes severe, the error pattern becomes more random. The model also accounts for the pattern of recovery shown by aphasic speakers with time by gradually resetting the decay variable to its normal value. The model described the naming errors of 21 fluent aphasic patients. It can also account for the pattern of performance shown by two brothers with a degenerative brain disease called progressive aphasia (Croot, Patterson, & Hodges, 1999). The language of one brother (RB) can best be explained by reduced connection strength, while the language of the other (CB) is best explained by an abnormally high decay rate.

Modeling work suggests that the performance of patients with impaired lexical access is better accounted for by impairments to two parameters, semantic weight and phonological weight, rather than by one weight-decay parameter (Foygel & Dell, 2000). These two parameters are measures of the weights, or connection strengths, between the semantic and the lexical (lemma) units, and between the lemma and the phonological units. Damage in the model occurs by varying these weights. The new model fits the patient data slightly better than the weight-decay model. For example, some patients (e.g., PW of Rapp & Caramazza, 1998; DP of Cuetos, Aguado, & Caramazza, 2000) make exclusively semantic errors, and some patients (e.g., JBN of Hillis, Boatman, Hart, & Gordon, 1999; DM of Caramazza, Papagno, & Ruml, 2000) make exclusively phonological errors. These types of patients were not present in the sample modeled by the original DSMSG model, but can be modeled by the Foygel and Dell model. The new model provides an extremely good fit to the naming and repetition performance of a large (94-participant) group of aphasic patients (Dell, Martin, & Schwartz, 2007; Schwartz, Dell, Martin, Gahl, & Sobel, 2006). Finally, the new model fits in very simply with the two-stage model of lexicalization: We can account for the pattern of all types of lexical access failure in terms of the structure of the two-stage model without introducing new parameters (such as decay). Hence the model is more parsimonious than its predecessor.

Although these two models are based on sound psycholinguistic principles, there has been considerable debate about how well their outputs fit a wide range of patient data, and about the extent to which aphasic errors can all result from global damage to all levels of a system, as is the case with pathological decay, an idea called the globality assumption (Ruml & Caramazza, 2000; Ruml, Caramazza, Shelton, & Chialant, 2000). One reason why it is difficult to draw any firm conclusions from this controversy is that there is no agreement on how well a computational model has to fit the data for it to be a good model (Dell, Schwartz, Martin, Saffran, & Gagnon, 2000; Ruml & Caramazza, 2000).
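
The parameter manipulations discussed in this section can be made concrete with a toy spreading-activation network. The sketch below is not the DSMSG or the Foygel and Dell implementation: the three-word lexicon, the feature and phoneme labels, the size of the initial jolt, the number of time steps, and all parameter values are assumptions chosen purely for illustration, and noise and the word-selection step are left out. Under a simple linear update rule with decay, it shows only that lowering the semantic weight, lowering the phonological weight, or raising the decay rate each reduces the activation that the target word accumulates.

```python
"""Toy interactive activation network (illustrative only, not a published model)."""

# Hypothetical lexicon: each word is linked to semantic features and to phonemes.
LEXICON = {
    "cat": {"features": ["animal", "pet", "feline"], "phonemes": ["k", "ae", "t"]},
    "dog": {"features": ["animal", "pet", "canine"], "phonemes": ["d", "o", "g"]},
    "mat": {"features": ["object", "floor", "flat"], "phonemes": ["m", "ae", "t"]},
}


def build_links(sem_weight, phon_weight):
    """Return bidirectional links as {(source, target): weight}."""
    links = {}
    for word, entry in LEXICON.items():
        for feature in entry["features"]:
            links[("F:" + feature, "W:" + word)] = sem_weight  # feedforward
            links[("W:" + word, "F:" + feature)] = sem_weight  # feedback
        for phoneme in entry["phonemes"]:
            links[("W:" + word, "P:" + phoneme)] = phon_weight  # feedforward
            links[("P:" + phoneme, "W:" + word)] = phon_weight  # feedback
    return links


def name_word(target, sem_weight=0.1, phon_weight=0.1, decay=0.5,
              steps=8, jolt=10.0):
    """Jolt the target's semantic features, spread activation, return word activations."""
    links = build_links(sem_weight, phon_weight)
    nodes = {node for pair in links for node in pair}
    activation = dict.fromkeys(nodes, 0.0)
    for feature in LEXICON[target]["features"]:
        activation["F:" + feature] = jolt
    for _ in range(steps):
        updated = {}
        for node in nodes:
            # Each unit keeps a decayed share of its old activation and adds
            # the weighted activation arriving from its neighbours.
            incoming = sum(weight * activation[src]
                           for (src, dst), weight in links.items() if dst == node)
            updated[node] = activation[node] * (1.0 - decay) + incoming
        activation = updated
    return {word: activation["W:" + word] for word in LEXICON}


if __name__ == "__main__":
    conditions = [
        ("intact", {}),
        ("weak semantic weight", {"sem_weight": 0.02}),
        ("weak phonological weight", {"phon_weight": 0.02}),
        ("fast decay", {"decay": 0.9}),
    ]
    for label, params in conditions:
        result = name_word("cat", **params)
        print(label, {word: round(value, 3) for word, value in result.items()})
```

Running the script prints the word-level activations for the intact network and for three "lesioned" variants. Noise, a selection rule, and a scheme for classifying responses would all have to be added before anything like the error proportions in Figure 13.16 could be simulated.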
are distinct syntactic processes in production and comprehension. That is, some agrammatic patients have no comprehension impairments, and some people with comprehension deficits do not have any production impairments. Furthermore, there is no correlation between the severity of the production and comprehension syntactic deficits that patients exhibit (Caplan, 1992). The parser and the syntactic planner are to a large degree separable.

There is a problem with this double dissociation: it might be an artifact of considering just people who speak English. Cross-linguistic studies of speakers of languages that are much more richly inflected show different types of breakdown (Dick et al., 2001). In particular, patients with damage to Wernicke’s area make many more grammatical errors, making many grammatical substitutions (something for which there is little scope in English). Dick et al. argue that Broca’s aphasics tend to omit things and Wernicke’s aphasics tend to substitute things, not because of underlying grammatical reasons, but simply because of the differing speech rates of the two groups. When speech is very slow, many items fail to reach a critical level of activation, meaning that weakly represented elements are omitted. Substitution errors increase with speech rate, but in English there is little scope for grammatical substitution. Hence it looks as though people with Broca’s aphasia are making grammatical errors, and those with Wernicke’s aphasia lexical errors, but really the two disorders lie on a continuum of omission and substitution errors, with the nature of English limiting the sort of errors that can occur. Dick et al. argue that their results show that grammar is not localized in one specific brain region (such as Broca’s area), but instead makes use of many regions. Damage to Broca’s area has serious consequences for grammatical processing, but in a more distributed account it does not necessarily mean that grammar is located there.

WRITING AND AGRAPHIA

There has been even less work on writing than there has been on speaking. Obviously writing and speaking are similar, but there are also significant differences. In Chapter 15 I will show that the neuropsychological evidence suggests that speaking and writing use different lexical systems. We have much more time available when writing compared with when speaking. We also (usually) speak to another person, but write alone (even if for an audience). This leads to two major differences between spoken and written language (Chafe, 1985). Written language is more integrated and syntactically complex than spoken language. We take more time to write, and can plan and edit our output more easily. Second, writing involves little interaction with other people, and as a result shows less personal involvement than speech. This has important consequences for teaching writing skills (Czerniewska, 1992).

Hayes and Flower (1980, 1986) identified three stages of writing. The first is the planning stage. Here goals are set, ideas are generated, and information is retrieved from long-term memory and organized into a plan for what to write. The second is the translation stage. Here written language is produced from the representation in memory. The plan has to be turned into sentences. In the third stage, reviewing, the writer reads and edits what has been written.

Collins and Gentner (1980) described the planning stage in some detail. They distinguished between the initial generation of ideas, and their subsequent manipulation into a form suitable for translation into the final text. They suggested several means of generating ideas: Writing down all the ideas you have on a topic, keeping a journal of interesting ideas, brainstorming in a group, looking in books and journals, getting suggestions from other people, and trying to explain your ideas to somebody. Although these ideas must be put down in tangible form, at this stage it is important not to get too carried away with translation into text. Collins and Gentner identified several methods of manipulating ideas into a form suitable for translation. These include identifying dependent variables, generating critical cases, comparing similar cases, contrasting dissimilar cases, simulating, categorizing, and imposing structure.

A number of factors are known to distinguish good from less able writers. Differences
SUMMARY
• Speech production has been studied less than language comprehension because of the difficulty in controlling the input (our thoughts).
• Speech production can be divided into conceptualization, formulation, and execution.
• Formulation comprises syntactic planning and lexicalization.
• Lexicalization is the process of retrieving the sound of a word given its meaning.
• Speech errors are an important source of data in speech production, and can be described in terms of the units and mechanisms involved.
• One of the best known models of formulation is Garrett’s; Garrett argues that speech error evidence suggests there is a distinction between a functional level of planning and a positional level of planning.
• Explicit serial order information is not encoded at the functional level of Garrett’s model.
• The distinction between function and content words is central in speech production, as they never exchange with each other in speech errors.
• Syntactic persistence is the phenomenon whereby we tend to reuse syntactic structures; hence we can facilitate and direct production with appropriate prime sentences.
• Number agreement is determined by the underlying number of the subject noun.
• Production and syntactic planning has an incremental component to it.
• The strong version of Garrett’s model, in which the stages are discrete and do not interact, is undermined by phonologically facilitated cognitive intrusions, blends of phrases merging at the point of maximum phonological similarity, and similarity and familiarity biases in speech errors.
• In the two-stage model of lexicalization, a meaning-based stage is followed by a phonologically based stage.
• Tip-of-the-tongue (TOT) states are noticeable pauses in retrieving a word; they arise because of insufficient activation of the words in the lexicon.
• Evidence for two stages comes from an analysis of speech errors and TOTs, and of anomia in languages that have gender.
• Lemmas are syntactically and semantically specified, amodal lexical representations.
• The amodal nature and syntactic mediation function of lemmas are debatable.
• Experimental studies of picture naming do not always find mediated semantic-phonological priming. Although this result suggests that the two stages of processing are discrete, simulations show that it is not inconsistent with a cascade model, and other evidence suggests that the two stages are accessed in cascade.
• Speech errors show lexical (familiarity) and similarity biases; these findings suggest that lexicalization is interactive.
• Models such as that of Dell provide an interactive account of lexicalization.
• It is not clear why feedback connections exist, but connectionist models based on phonological attractors can in principle still account for the data.
• The main problem for phonological encoding is ensuring that we produce the sounds in the correct sequence.
• One important method of ensuring correct sequencing is to make a distinction between frames and content.
• The phonological word is the basic unit of phonological planning.
• Hesitations reflect planning by the speaker, although they may also serve social and segmentation functions.
• Microplanning pauses indicate transient difficulty in retrieving the phonological forms of less predictable words, whereas macroplanning pauses indicate both semantic and syntactic planning.
• We sometimes hesitate before less predictable words, suggesting that we are having a temporary difficulty in retrieving them.
• We tend to pause between major syntactic units of speech, and in these pauses we plan the content of what we want to say.
• Speech falls in planning cycles, with fluent execution phases following hesitant planning phases in which we do a relatively large amount of planning, each cycle corresponding to an idea.
• Aphasia is an impairment of language processing following brain damage.
• Broca’s aphasia patients are not fluent, often with some deficit in syntactic comprehension, whereas Wernicke’s aphasics are fluent, usually with very poor comprehension.
• Agrammatism is a controversial label covering a number of aspects of impaired syntactic processing, including a sentence construction deficit, the loss of grammatical elements of speech, and impaired syntactic comprehension.
• Jargon aphasia is a disorder of lexical retrieval characterized by paraphasias and neologisms.
• Lexical-semantic anomia arises because of an impairment of semantic processing, whereas phonological anomia arises because of difficulty in accessing phonological word forms.
• Naming errors can be modeled by manipulating connection strengths and the rate of decay of activation.
• Writing is less constrained by time than speech production, and is less cooperative than speech.
• Writing involves planning, translation, and reviewing; of these, planning is the most difficult.
• There are types of dysgraphia analogous to the types of dyslexia.
FURTHER READING
See Wheeldon (2000) and Alario, Costa, Pickering, and Ferreira (2006) for collections of papers
covering all aspects of language production. A classic reference is Levelt (1989). Levelt discusses
what might happen at the message level. Dennett (1991) speculates about how the conceptualizer
might work.
In addition to number agreement, in many languages it is important that gender is matched
between adjectives, articles, and nouns. For work in this area, see Alario and Caramazza (2002);
Costa, Kovacic, Fedorenko, and Caramazza (2003); Schiller and Caramazza (2003); Schiller and
Costa (2006); Schriefers, Jescheniak, and Hantsch (2005); and Schriefers and Teruel (2000).
For more on hesitations and pauses, see Beattie (1983) for a sympathetic review and Petrie
(1987) for a critical review. For a review of the role of interaction in lexicalization and syntactic
planning, see Vigliocco and Hartsuiker (2002).
See Meyer (2004) for a review of work on the visual world and speech production.
For further information on the neuropsychology of language, see Kolb and Whishaw (2009).
For more on what cognitive neuropsychology tells us about normal speech production, see Caplan
(1992, with a paperback edition in 1996). See Roelofs, Meyer, and Levelt (1998) for a response to
Caramazza on the necessity of lemmas. See Rapp and Goldrick (2005) for a review of the literature
on the neuropsychology of word production.
See Vinson (1999) for an introductory review of language in aphasia. The methodological issues
involved in cognitive neuropsychology have spawned a large literature of their own. Indeed, a spe-
cial issue of the journal Cognitive Neuropsychology (1988, volume 5, issue 5) is completely devoted
to this topic. Much of the emphasis in this area has been on the status of agrammatism. For a more
detailed discussion, see also Shallice (1988). The nature of agrammatism has always been central in
this debate. See Hale (2002) for an account of what it must be like to lose language after a stroke, and
how the loss affects the family of the person.
See Emmorey (2001) for a review of the production of sign language. Sign language breaks
down after brain damage in interesting ways. Ellis and Young (1988) review the literature on the
neuropsychology of sign languages and gestures.
For more on the dual versus single route models of how we generate regular and irregular verbs,
see the debate in Trends in Cognitive Sciences (Marslen-Wilson & Tyler, 2003; McClelland &
Patterson, 2002, 2003; Pinker & Ullman, 2002).
For excellent overviews of research on writing, see Ellis (1993) and Eysenck and Keane (2010).
The latter covers the Hayes and Flower model in detail. Flower and Hayes (1980) discuss the plan-
ning process in more detail. Ellis (1993) also has a section on disorders of writing, the dysgraphias.
Ellis and Young (1988) also cover peripheral dysgraphias, which affect the lower levels of writing. See
Czerniewska (1992) for information on learning how to write, and how writing should best be taught.
CHAPTER 14
HOW DO WE USE LANGUAGE?
Speech acts
When we speak, we have goals, and it is the lis-
tener’s task to discover those goals. According to
Austin (1962/1976) and Searle (1969), every time
we speak we perform a speech act. That is, we
are trying to get things done with our utterances.
Austin (1976) began with the goal of explor-
ing sentences containing performative verbs.
These verbs perform an act in their very utterance,
such as “I hereby pronounce you man and wife”
(as long as the circumstances are appropriate—
such as that I have the authority to do so; such circumstances are called the felicity conditions).
Austin concluded that all sentences are performative, though mostly in an indirect way. That is, all
sentences are doing something, if only stating a fact. For example, the statement “My house is
terraced” can be analyzed as “I hereby assert that my house is terraced.” Austin distinguished three
effects or forces that each sentence possesses (see Figure 14.1). The locutionary force of an
utterance is its literal meaning. The illocutionary force is what the speaker is trying to get done
with the utterance. The perlocutionary force is the effect the utterance actually has on the actions
and beliefs of the listener. For example, if I say (1) the literal meaning is that I am asking you
whether you have the ability to pass the gin. The illocutionary force is that I hereby request you to
pass the gin. The utterance might have the perlocutionary force of making you think that I drink too
much.
(1) Can you pass the gin?
[FIGURE 14.1: diagram headed SENTENCE]
[Photo caption] “Top me up!” This directive speech act may be interpreted beyond its literal meaning
and have the perlocutionary effect of making fellow diners think that she has had quite enough wine
to drink already!
According to Searle (1969, 1975), when we speak we make speech acts. Every speech act falls into one
of five categories (see Figure 14.2):
x Representatives. The speaker is asserting a fact and conveying his or her belief that a statement
is true. (“Boris rides a bicycle.”)
x Directives. The speaker is trying to get the listener to do something. (In asking the question
“Does Boris ride a bicycle?” the speaker is trying to get the hearer to give information.)
x Commissives. The speaker commits him or herself to some future course of action. (“If Boris
doesn’t ride a bicycle, I will give you a present.”)
x Expressives. The speaker wishes to reveal his or her psychological state. (“I’m sorry to hear
that Boris only rides a bicycle.”)
x Declaratives. The speaker brings about a new state of affairs. (“Boris, you’re fired for riding
a bicycle!”)
FIGURE 14.2 The five categories of speech act: representative, directive, commissive, expressive,
and declarative.
Different theorists specify different categories of speech acts. For example, D’Andrade and Wish
(1985) described seven types. They distinguished between assertions and reactions (such as “I agree”)
as different types of representatives, and they distinguished requests for information from other
request directives. The lack of agreement and the lack of detailed criteria of what constitutes any
type of speech act are obvious problems here. Furthermore, some utterances might be ambiguous, and if
so, how do we select the appropriate speech act analysis? A further challenge is that it needs to be
made explicit how the listener uses the context to assign the utterance to the appropriate speech act
type.
Direct speech acts are straightforward utterances where the intention of the speaker is revealed in
the words. Indirect speech acts require some work on the part of the listener. The most famous
example is “Can you pass the salt?,” as analyzed earlier. Speech acts can become increasingly
indirect (“Is the salt at your end of the table?” to “This food is a bit bland”), often with
increasing politeness. The less conventional they are, the more computational work is required by the
listener. Over 90% of requests are indirect in English (Gibbs, 1986b). Indirectness serves a
function: it is an important mechanism for conveying politeness in conversation (Brown & Levinson,
1987). It also enables the speaker to be strategic in their language: for example, if you are
offering someone a bribe, you might want to do so indirectly, so you can fall back on the direct
meaning should they turn out to be more honest than you: “I never meant it that way!” (Lee & Pinker,
2010).
The meanings of indirect speech acts are not always immediately apparent. Searle (1979) proposed a
two-stage mechanism for computing the intended meaning. First, the listener tries the literal meaning
to see if it makes sense in context, and it is only if it does not that he or she will do the
additional work of finding a non-literal meaning. There is an opposing one-stage model where people
derive the non-literal meaning either instead of or as well as the literal one (Keysar, 1989). The
evidence is conflicting, but certainly the non-literal meaning is understood as fast as or faster
than the literal meaning, which favors a one-stage model. For example, Gibbs (1986a) found that in an
appropriate context
participants took no longer to understand the sarcastic sense of “You’re a fine friend!” than the
literal sense in a context where that was appropriate.
Clark (1994) detailed examples of many kinds of layering that can occur in conversation. We can be
ironic, sarcastic, or humorous, we can tease, we can ask rhetorical questions that do not demand
answers, and so on. Although we probably understand these types of utterance using similar sorts of
mechanisms as with indirect speech acts, much work remains to be done in this area.
How to run a conversation: Grice’s maxims
Grice (1975) proposed that in conversations speakers and listeners cooperate to make the conversation
meaningful and purposeful. That is, we adhere to a cooperative principle. To comply with this,
according to Grice, you must make your conversational contribution such as is required, when it is
required. This is achieved by use of four conversational maxims (see Figure 14.3):
x Maxim of quantity. Make your contributions as informative as is required, but no more.
x Maxim of quality. Make your contribution true. Do not say anything that you believe to be false,
or for which you lack sufficient evidence.
x Maxim of relevance. Make your contribution relevant to the aims of the conversation.
x Maxim of manner. Be clear: Avoid obscurity, ambiguity, wordiness, and disorder in your language.
FIGURE 14.3 The four conversational maxims (Grice, 1975): quantity, quality, relevance, and manner.
Subsequently there has been some debate on whether there is any redundancy in these maxims. Sperber
and Wilson (1986) argued that relevance is primary among them and that the others can be deduced
from it.
Conversations quickly break down when we deviate from these maxims without purpose. However, we
usually try to make sense of conversations that appear to deviate from them. We assume that overall
the speaker is following the cooperative principle. To do this, we make a particular type of
inference known as a conversational implicature. Consider the following conversational exchange (2).
(2) Vlad: Do you think my nice new expensive gold fillings suit me?
Boris: Gee, it’s hot in here.
Boris’s utterance clearly violates the maxim of relevance. How can we explain this? Most of us would
make the conversational implicature that in refusing to answer the question, Boris is
implying that he dislikes Vlad’s new fillings and doesn’t think they suit him at all, but for some
reason doesn’t want to say so to his face. Indeed, face management is a common reason for violating
the maxim of relevance (Goffman, 1967; Holtgraves, 1998): People do not want to hurt or be hurt. The
listeners’ recognition of this plays an important role in how they make inferences that make sense of
remarks that apparently violate relevance (Holtgraves, 1998).
There are other ways in which speakers cooperate in conversations. Garrod and Anderson (1987)
observed people cooperating in an attempt to solve a computer-generated maze game. The pairs of
speakers very quickly adopted similar forms of description, a phenomenon called lexical entrainment.
For example, we could call a picture of a dog “a dog,” “a poodle,” “a white poodle,” or even “an
animal.” The frequency and recency of name selection can override other factors that influence
lexical choice, such as informativeness, accessibility, and being at the basic level. Brennan and
Clark (1996) proposed that in conversations speakers jointly make conceptual pacts about which names
to use. Conceptual pacts are dynamic: They evolve over time, can be simplified, and even abandoned
for new conceptualizations.
Of course, sometimes we don’t want to cooperate in conversations. Some people sometimes want to lie;
frequently we want to keep things to ourselves. This privacy can sometimes be very difficult to
maintain. Readers of a certain age might remember an episode of the UK television program “Dad’s
Army,” where Captain Mainwaring, desperate to keep Corporal Pike’s name from the invaders, says
“Don’t tell him (your name), Pike.” (Here is another example: DON’T think of a pink elephant.) Often
it seems that the harder we try to keep something private, the more likely it is to pop out. An
experiment carried out by Wardlow Lane, Groisman, and Ferreira (2006) showed that this impression is
correct. Speakers described simple objects (e.g., triangles) to other people. Some information was
known only to the speakers (e.g., that there was also another, larger triangle in the scene concealed
from the listeners). Wardlow Lane et al. call this type of information privileged information. They
found that if speakers were told to keep this privileged information secret, they were in fact more
likely to refer to the concealed objects. Wardlow Lane et al. explain the results in terms of our
monitoring speech; monitoring can bring things that we are trying to avoid into awareness, increasing
the chance that they are in fact produced. Freud (1975) would talk in terms of repression; the two
explanations are not a million miles apart.
The right hemisphere of the brain plays an important role in processing some pragmatic aspects of
language (see Lindell, 2006, for a review). We saw in Chapter 12 that patients with right-hemisphere
damage have difficulty in understanding jokes; more generally, the right hemisphere is involved in
non-literal processing. Patients with right-hemisphere damage have difficulty in understanding jokes,
idioms, metaphors, and proverbs. Imagine the sort of literal image provoked by the phrase “crying
your eyes out” (Lindell, 2006; Winner & Gardner, 1977).
THE STRUCTURE OF CONVERSATION
There are two different approaches to analyzing the way in which conversations are structured
(Levinson, 1983). Discourse analysis uses the general methods of linguistics. It aims to discover the
basic units of discourse and the rules that relate them. The most extreme version of this is the
attempt to find a grammar for conversation in the same way as there are sentence and story grammars.
Labov and Fanshel (1977), in one of the most famous examples of the analysis of discourse, looked at
the structure of psychotherapy episodes. Utterances are segmented into units such as speech acts, and
conversational sequences are regulated by a set of sequencing rules that operate over these units.
Conversation analysis is much more empirical, aiming to uncover general properties of the
organization of conversation without applying rules. Conversation analysis was pioneered by
ethnomethodologists, who examine social behavior
their utterances are understood (Clark & Wilkes-Gibbs, 1986; Schober & Clark, 1989). People go to
considerable lengths to take the other person’s point of view in dialog, sometimes regardless of the
cognitive load necessitated (Duran, Dale, & Kreuz, 2011). The idea that speakers tailor their
utterances to the particular needs of the addressees is called audience design (Clark, 1996).
In Chapter 12 we saw how readers and listeners construct representations of incoming language.
Conversation is a process of communicating these representations, of trying to make the
representation of the speaker and the listener the same, almost of filling in gaps. (Of course, there
are exceptions; when someone is lying, or deliberately withholding information, they are trying to
make sure that the gaps are not filled in.) Pickering and Garrod (2004) call this process of trying
to make the language representations of speakers and listeners coincide alignment. In their
interactive alignment model, during dialog the linguistic representations of the participants become
aligned at many levels (including the overall mental model of what is going on, the syntactic level,
and the lexical level). They argue alignment occurs by means of four types of largely automatic
mechanism: priming, inference, the use of routine expressions, and the monitoring and repair of
language output. Such alignment of linguistic representations leads to the alignment of the speaker’s
and the listener’s situation models (Zwaan & Radvansky, 1998). Perhaps the most important of these
alignment mechanisms is priming. We have examined priming in several contexts (e.g., lexical priming
in Chapter 6, syntactic priming in Chapter 13). Priming of words and syntactic structures ensures
that linguistic representations become aligned at a number of levels. This account assumes much less
explicit reasoning about one’s interlocutor than alternative views such as that of Clark (1996).
Pickering and Garrod (2006) further emphasize the way in which listeners make predictions in
conversations, and that these predictions are made by the speech production system: Comprehension
draws on production, particularly in difficult circumstances.
There are other reasons for supposing that audience design is an emergent, interactive process.
Horton and Gerrig (2005) found that the memory requirements of a task influence speakers. They used a
task in which “Directors” gave instructions about manipulating an array of cards to “Matchers.” They
found that the Directors were much better able to take the needs of the Matchers into account when
their own memory demands produced by the task were lower. If speakers have a lot to remember, they
find it difficult to take the needs of the listeners and the detailed past history of their
conversational interaction into account.
Audience design
The idea that speakers tailor their productions to address the specific needs of their listeners is
called audience design. An example of audience design is child-directed speech (see Chapter 4), when
adults modify their utterances when speaking to infants and children.
We also saw in Chapter 10 that speakers sometimes use prosody and pausing to help listeners
disambiguate what they say. Speakers also seem to monitor what they say with the goal of reducing
ambiguity. While speakers sometimes avoid linguistic ambiguity (e.g., ambiguous words of the type we
examined in Chapter 6, or temporarily ambiguous structures of the sort we examined in Chapter 10),
they go out of their way to avoid non-linguistic ambiguity (Ferreira, Slevc, & Rogers, 2005).
Non-linguistic ambiguity arises when there are multiple instances of similar meanings: for example,
if there are several instances of the same object in the visual scene, or several instances that
could be described by the same word. If there are two apples in front of us, one red and one green,
we are unlikely to say just “give me the apple.” In their experiment, speakers described target
objects (e.g., the flying mammal “bat”) in contexts where there were other objects that could cause
linguistic (a baseball) or non-linguistic (a larger flying mammal) ambiguity (see Figure 14.4).
Ferreira et al. found that speakers monitor their speech and can sometimes detect and avoid
linguistic ambiguity before producing it, but almost always
avoid non-linguistic ambiguity. Speakers are much better at dealing with non-linguistic ambiguity
than with linguistic ambiguity. A related study looking at dialog between two speakers engaged in
moving objects on a grid found that when the visual context was potentially ambiguous, speakers tried
to disambiguate their utterances (Haywood, Pickering, & Branigan, 2005). Hence speakers do pay some
attention to the needs of the listener.
There are, however, limits to how far a speaker will go to make the listener’s life easier. Ferreira
and Dell (2000) examined the extent to which speakers used optional complementizers (e.g., “that,”
which is optional in “the vampire knew [that] you hated blood,” a structure that is ambiguous up
until the word “hated”). If speakers are trying to produce structures that are as easy to understand
and as unambiguous as possible, they should frequently include these optional words in sentences that
would otherwise be ambiguous. However, they do not. Instead they choose structures that are easy to
produce and that enable them to produce the main content words as early as possible. Speech
production proceeds with quickly selected lemmas being produced as soon as possible. In addition,
while speakers produce prosodic cues (such as lengthening words and inserting pauses) to syntactic
boundaries, and listeners do pay attention to these cues, speakers tend to do so regardless of
whether or not the listener really needs it. For example, the speakers provide disambiguating cues to
the syntactic structure of instructions such as “Put the dog in the basket on the star” regardless of
whether or not the referential situation is actually ambiguous (Kraljic & Brennan, 2005). This
finding suggests that there are limitations to audience design. What is more, speakers overestimate
how good they are at conveying information (Keysar & Henly, 2002). Keysar and Henly looked at 40
speakers producing syntactically ambiguous sentences such as “Boris shot the man with the gun” and
lexically ambiguous sentences such as “The typist tried to read the letter without her glasses.”
Nearly half (46%) of the time the speaker thought the listeners had correctly understood the
sentence; in fact they had not. So not only are there limits to how much speakers tailor their
productions to their listeners, they do not always do so correctly even when they try.
SOUND AND VISION
We saw in Chapter 3 that human language is so powerful because we can talk about anything: we can
talk about things remote in time and space, and about very abstract notions. However, just because we
can do these things, it doesn’t mean we do them all the time. In fact a great deal of the time we
talk literally about what is in front of our eyes. For much of everyday life we converse about the
“here-and-now.” Not surprisingly, therefore, the study of how language interacts with the visual
world has become of considerable importance over the last few years. Perhaps the only surprise is why
it has taken so long
for this topic to become so prominent. The answer to this question is that the study of how we
interact with the visual world requires sophisticated eye-movement technology, and such technology
has only recently become available.
A second reason why the study of the visual world has become so important is that it provides us with
a new tool for studying how we understand language and speech. We can now see in real time how people
make use of external, visual information when processing language. The visual world paradigm has
recently proved very popular for investigating sentence processing (see many studies in Chapter 10)
and speech production (see Chapter 13).
While adults make considerable use of the visual world, similar studies show that children do so to a
much lesser extent (Snedeker & Trueswell, 2004; Trueswell, Sekerina, Hill, & Logrip, 1999).
Five-year-old children rely exclusively on verb-bias information. Highly reliable cues, such as
lexical bias, emerge first in development, with referential information gradually being used as the
child gets older. Furthermore, although referential information may not determine which structures
young children construct, it may reduce the time it takes to construct them (Snedeker & Trueswell,
2004).
Using visual information in comprehension
We saw in Chapter 10 that many sources of information are used to help us construct a syntactic
representation and to resolve syntactic ambiguity. While adult readers rely mostly on lexical
information to generate alternative syntactic structures, adult listeners make a great deal of use of
the visual world in front of them. In particular, people can use referential information from the
visual scene at which they are looking to override very strong lexical biases (Tanenhaus et al.,
1995).
The role of the visual world in comprehension has since been demonstrated in several experiments. For
example, Spivey, Tanenhaus, Eberhard, and Sedivy (2002) monitored the eye movements of participants
following spoken instructions about picking up moving objects in a visual workspace. The eye
movements were closely linked to the associated referential expressions (phrases describing objects)
in the instructions. What happens when people are given temporarily ambiguous sentences, such as (3),
which contains a temporarily ambiguous prepositional phrase?
(3) Put the apple on the towel in the box.
The normally preferred initial interpretation is the goal-argument analysis (put the apple on the
towel); the less usual initial interpretation is the noun-phrase modifier (the apple that is already
on the towel should be put somewhere else). The answer depends on the visual context. If there was
just one apple in the visual scene, people would go with the usual preferred analysis, and spend time
looking at the supposed (but incorrect) target destination (an empty towel). If there was more than
one apple, however, participants assumed the less usual modification analysis, and did not spend much
time looking at the empty towel (see Figure 14.5). Eye movements showed that the initial
interpretation was the one consistent with the visual context.
FIGURE 14.5 Examples of the display conditions used by Spivey et al. (2002). In scene (a)
participants spent time looking at the target destination (the empty towel), whereas in scene (b)
they spent less time looking at the empty towel. Based on Spivey et al. (2002).
Using a similar sort of design, Chambers, Tanenhaus, and Magnuson (2004) showed that properties of
objects in the visual world can influence parsing. They gave participants temporarily ambiguous
sentences such as “Pour the egg in the bowl over the flour.” The eggs in the scene could be in a
liquid form, or whole. You cannot pour whole eggs, so people spend little time looking at them given
the start of this instruction. Listeners restrict their attention to objects that are physically
compatible with what they hear. If all you can see is one egg, in a bowl, in liquid form, you will
analyze the sentence from the beginning with the structure of “pour the egg that’s in the bowl,” and
your eyes will give you away. Hence real-world properties of objects constrain the referential
domain, and this information in turn is used from a very early stage to influence parsing.
The results show that language processing immediately takes into account relevant non-linguistic
context, and argue against models where initial syntactic decisions are guided solely by syntactic
information.
One particular sort of visual information is information from the speaker themselves. We have seen in
Chapter 9 that people’s recognition of speech can be influenced by the lip movements of the speaker
(the McGurk effect). Lip-readers clearly make extensive use of this sort of information. The eye
movements of the speaker (see also Chapter 13) provide another rich source of information for
listeners. We tend to look at what the speaker is looking at; indeed, eye movements can be used to
flag attention or a particular referent. When a speaker is describing a scene to a listener, the
speaker naturally looks over the scene, and their eye movements relate to what they are describing.
The eye movements of the listener come to match the eye movements of the speaker; they move over the
scene in the same way, but with a delay of 2 seconds (Richardson & Dale, 2005).
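As a rough illustration of what it means for a listener's eye movements to follow the speaker's with
a delay, the sketch below estimates the lag at which two gaze sequences overlap most. The gaze data,
the region names, and the helper functions `match_rate` and `best_lag` are all invented for
illustration; this is a simple lagged-overlap count under those assumptions, not the analysis
reported by Richardson and Dale (2005).

```python
def match_rate(speaker, listener, lag):
    """Proportion of time steps at which the listener fixates the region
    that the speaker fixated `lag` steps earlier (assumes lag < len(listener))."""
    pairs = [(speaker[t - lag], listener[t]) for t in range(lag, len(listener))]
    return sum(s == l for s, l in pairs) / len(pairs)


def best_lag(speaker, listener, max_lag):
    """Lag (in time steps) at which the two gaze sequences overlap most."""
    return max(range(max_lag + 1),
               key=lambda lag: match_rate(speaker, listener, lag))


# Hypothetical data: one fixated scene region per second.
speaker = ["dog", "dog", "tree", "tree", "car", "car", "dog", "tree", "car", "car"]
# Invented listener who follows the same path two seconds later.
listener = ["car", "tree"] + speaker[:-2]

print(best_lag(speaker, listener, max_lag=4))  # prints 2
```

Real eye-tracking records would typically be much noisier than this toy sequence, so the overlap
would peak over a range of lags rather than at one exact value.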
SUMMARY
1. Keep a record for a few days of what you talk about. How much is to do with the here-and-now?
2. When do people interrupt others?
3. When you talk, what do you look at? Why?
4. When you talk to someone, how much attention do you pay to whether or not they are following you?
5. How would you modify your speech if a tourist who is obviously a poor speaker of your native
language stops you in the street and asks you for directions? What does this example tell us
about audience design?
FURTHER READING
Sperber and Wilson (1987) is a summary of their book on relevance, with a peer commentary. Clark
(1996) is a classic work on using language. See Henderson and Ferreira (2004) for an edited collec-
tion on language and the visual world.
CHAPTER 15
THE STRUCTURE OF THE
LANGUAGE SYSTEM
Throughout the book, it has become obvious that the extent to which language processes interact is
very controversial. As a very general conclusion, we have observed that the earlier in processing a
process is, the more likely it is to be autonomous. By the end of this chapter and book you should:
x Know about the components of the language system and how they relate to each other.
x Understand the extent to which language processes are interactive.
x Appreciate some differences between reading and listening.
x Understand how we repeat words, and how repetition can be affected by brain damage.
x Understand the role that working memory plays in language processing.
WHAT ARE THE MODULES OF LANGUAGE?
What modules of the language system can we identify? When we see, hear, or produce a sentence, we
have to recognize or produce the words (Chapters 6, 7, 9, and 13), and decode or encode the syntax of
the sentence (Chapters 10 and 13). All of these tasks involve specific language modules. Little is
known at present about the relation between the syntactic encoder and decoder, although the evidence
described in Chapter 13 suggests that they are distinct. But does semantic information direct
syntactic modules to do particular analyses (strong interaction), or just to reject implausible
analyses and cause reanalysis? We looked at this in the chapter on parsing and syntactic ambiguity
(Chapter 10).
The semantic-conceptual system is responsible for organizing and accessing our world knowledge and
for interacting with the perceptual system. We discussed the way in which word meanings might be
represented in Chapter 11. Most researchers currently think that they are decomposed into semantic
features, some of which might be fairly abstract. Initial contact with the conceptual system is
probably made through modality-specific stores. The meanings of words can be connected together to
form a propositional network that is operated on by schemata (in comprehension; see Chapter 12) and
the conceptualizer (in production; see Chapter 13).
Throughout this book we have seen how neuropsychological case studies show us that brain damage can
affect some components of language while leaving others intact. We have seen dissociations in reading
and speech production. Some patients have preserved lexical access but impaired syntactic processing,
while others show the reverse pattern. The pattern of performance of people with Parkinson’s disease
and Alzheimer’s disease is quite different, leading some researchers to conclude that specific
instances are stored in the mental lexicon in one part of the brain, while general grammatical rules
are processed elsewhere, although again this is controversial (see Chapters 3 and 13).
There are obviously enormous differences between language processing in the visual and the auditory
modalities, given the very different natures of the inputs. Even if there is phonological recoding in
reading, it is unlikely to be obligatory to gain access to meaning in languages with deep orthography
that have many irregular words, such as English. In addition, the temporal demands of spoken and
visual word recognition are very different. In normal circumstances, we have access to a visual
stimulus for much longer than an acoustic stimulus. We can backtrack while reading, but we are unable
to do this when listening. It is even possible that fundamental variables have different effects in
the two modalities. It is more difficult to find frequency effects in spoken language recognition
than in visual word recognition (Bradley & Forster, 1987). Nevertheless, in normal circumstances the
reading and listening systems develop closely in tandem: Except for very young children, there is a
very high correlation between auditory and visual comprehension skills (Palmer, MacLeod, Hunt, &
Davidson, 1985). Differences between the modalities may extend beyond word recognition. Kennedy,
Murray, Jennings, and Reid (1989) argued that parsing differs in the two modalities. With written
language, we have the opportunity to go back to it, but access to spoken language is more transient.
We saw in Chapter 9 that the data strongly suggest that speech recognition is a data-driven, purely bottom-up process. In contrast, we saw in Chapter 13 that the data suggest that speech production is a non-modular process involving feedback. Is there a contradiction here? Why should recognition involve no feedback, but production a great deal of it? The tasks are very different: In speech recognition, the goal is to extract the correct meaning as quickly as possible; the speech signal fades rapidly; and there is some redundancy in the input. And while we need to get at the meaning and truth of what we are hearing, we do not need to construct detailed representations of everything. In production, however, we need to be accurate. We do need to produce every word in full and construct every syntactic representation in detail. We need to make sure that one part of the sentence agrees with all the others. Traditionally, language production and language comprehension have been treated as distinct modules; however, recent thinking is that they are much more intertwined (Pickering & Garrod, 2013).
Inner speech
What about inner speech, that little voice we often hear in our head telling us what to do? Clearly inner speech is produced by the speech production system, but it stops short of full articulation. How short? Vigliocco and Hartsuiker (2002) argue that inner speech is in a phonetic code; that is, it is relatively late. There are two main pieces of relevant evidence. The first is that articulatory suppression (speaking out aloud) stops the inner voice, and articulatory suppression interferes with the phonetic code. Second, levels of representation are not accessible to consciousness prior to the phonetic code (we have no sense of knowing what a lemma or a phonological code is), but clearly inner speech is accessible to consciousness.
Recent research on getting people to mentally recite tongue twisters has shown that people make speech errors in inner speech showing phonological effects resembling those made in overt speech, such as the lexical bias effect. However, opinion is divided as to whether the errors show that inner speech is specified as far as the sound featural level, or whether it is only specified at some more abstract level, as there is uncertainty about the degree of phonological similarity effects found in these internal errors (Corley, Brocklehurst, & Moat, 2011; Oppenheim, 2012; Oppenheim & Dell, 2008). Finally, we saw in Chapter 7 that reading often results in inner speech.
HOW MANY LEXICONS ARE THERE?
We have seen how some researchers believe that there are multiple semantic memory systems, one for each input modality. How many lexicons are there? When we recognize a word, do we make contact with the same lexicon regardless of whether we are listening to speech or reading written language? Do we have just one mental dictionary, or is it fractionated, with a separate one for each modality? Clearly the peripheral features of lexical processing (letters versus sounds, for example) must differ depending on the modality, so the question should be rephrased as: Is there one lexicon of lemmas (abstract lexical units; see Chapter 13), or multiple systems of lemmas, one for each modality? In Levelt’s original conception of lemmas they are modality neutral, but is that actually the case? In fact lemmas, although an important idea in speech production, are rarely mentioned in the word recognition literature.
The most parsimonious arrangement is that there is only one lexicon, used for the four tasks of reading, listening, writing, and speaking. Alternatively, we may have four lexicons, one each for the tasks of writing, reading, speaking, and listening. It is also plausible that there are two lexicons: One possibility is that there are separate lexicons for written (visual) language and spoken (verbal) language (each covering input and output tasks, that is, recognition and production), and another is that there are separate lexicons for input and output (each covering written and spoken language).
Note that to some extent the answers to these questions depend on how we define our terms. If by “lexicon” we just mean “the complete mental
15. STRUCTURE OF THE LANGUAGE SYSTEM 463
Experimental data
Fay and Cutler (1977) interpreted form-based
word substitution speech errors as evidence that a
single lexicon was accessed in two different direc-
tions for speech production and comprehension
(Chapter 13). We saw, however, that malaprop-
isms can readily be explained without recourse
to a common lexicon in an interactive two-stage
model of lexicalization. In fact, most of the data
argue against a single lexicon used for both recog-
nition and production.
Winnick and Daniel (1970) showed that
tachistoscopic recognition of a printed word was
facilitated by the prior reading aloud of that word,
whereas naming a picture or producing a word in
response to a definition did not facilitate subse-
quent tachistoscopic recognition of those words
(Chapter 6). Furthermore, priming in the visual modality produces much more facilitation on a test in the visual modality than auditory priming does, and vice versa (Morton, 1979b). In response, Morton (1979b) revised the logogen model, so that instead of one logogen for each word, logogen stores were modality-specific. He further distinguished between input and output systems. In support of this fractionation, Shallice, McLeod, and Lewis (1985) found that having to monitor a list of auditorily presented words for a target created little interference on reading words aloud. Furthermore, listening to a word does not activate the same areas of the brain that are activated by reading a word aloud and word repetition, as shown by PET (positron emission tomography) brain imaging (Petersen, Fox, Posner, Mintun, & Raichle, 1989). These pieces of evidence suggest that the speech input and output pathways are different.
[Photo caption: Listening to and repeating words. Color positron emission tomography (PET) scan showing areas of the human brain involved in word recognition. The active areas are highlighted in red and yellow. At top, the subject is listening to words only. The part of the brain activated is the auditory region as word sounds are heard. At bottom, the subject is both listening to words, and repeating them. The auditory (hearing) region is activated as well as a small motor control area (yellow, above the auditory region) involved in speech. Active areas show cerebral blood flow detected by PET, superimposed onto an image of the brain.]
Dell (1988) suggested that the feedback connections in his interactive model of lexicalization may be necessary for word recognition. Hence interactions in speech production arise through leakage along the comprehension route. As we have seen, evidence favors the view that the production and comprehension lexicons are distinct. Perhaps the role of feedback is limited or nonexistent in both production and recognition, but both involve attractor networks, giving rise to the observed interactions.
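
The candidate arrangements listed earlier (one lexicon, two lexicons split by modality, two split by direction, or four) make different predictions about which selective impairments should be possible. The following is a minimal sketch in Python, not a model from the literature. It assumes that each task draws on exactly one store, that damage removes a whole store, and that access pathways are never damaged on their own. The names `ARRANGEMENTS`, `permits_dissociation`, and `compatible` are hypothetical labels introduced here for illustration.

```python
TASKS = ("speaking", "listening", "reading", "writing")

# Each candidate arrangement maps a task to the (hypothetical) store it uses.
ARRANGEMENTS = {
    "one lexicon": {task: "lexicon" for task in TASKS},
    "split by modality": {
        "speaking": "spoken", "listening": "spoken",
        "reading": "written", "writing": "written",
    },
    "split by direction": {
        "listening": "input", "reading": "input",
        "speaking": "output", "writing": "output",
    },
    "four lexicons": {task: task for task in TASKS},
}


def permits_dissociation(arrangement, impaired, spared):
    """Return True if damaging whole stores could impair every task in
    `impaired` while leaving every task in `spared` intact.

    Simplifying assumption: a task fails if and only if its store is damaged.
    """
    mapping = ARRANGEMENTS[arrangement]
    damaged_stores = {mapping[task] for task in impaired}
    return all(mapping[task] not in damaged_stores for task in spared)


def compatible(impaired, spared):
    """List the arrangements that allow the given pattern of impairment."""
    return [name for name in ARRANGEMENTS
            if permits_dissociation(name, impaired, spared)]


if __name__ == "__main__":
    print(compatible({"speaking"}, {"listening"}))
    print(compatible({"speaking"}, {"writing"}))
```

On these assumptions, an impairment of speaking that spares listening fits only the arrangements that separate input from output, whereas an impairment of speaking that spares writing fits only those that separate spoken from written language. Only the four-lexicon arrangement allows both patterns. The sketch says nothing about damage to access pathways, which would need a richer representation.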
As we saw in Chapter 11, it can be difficult to distinguish between problems of access and problems of storage. Allport and Funnell (1981) argued that perhaps we do not need separate lexicons, just distinct access pathways to one lexicon. On the other hand, we have seen that semantic memory is split into multiple, modality-specific stores. It seems uneconomical to have four access pathways (for reading, writing, speaking, and listening) going to and from one lexicon, and then to four semantic systems. Indeed, the most plausible arrangement is that there are distinct lexical systems. Language processes split early in processing, and do not converge again until quite late.
Monsell (1987) examined whether the same set of lexical units is used in both production and recognition. He compared the effects of priming word recognition in an auditory lexical decision task by perceiving a word or generating a word. He found that generating a word facilitated its recognition, suggesting that producing a word activates some representation that is also accessed in recognition. This suggests that production and recognition use the same lexicon or separate networks that are connected in some way. Further evidence that the input and output phonological pathways cannot be completely separate is that there are sublexical influences of speech production on speech perception. For example, Gordon and Meyer (1984) found that preparing to speak influences speech perception, so there must be some sharing of common mechanisms. Monsell tentatively argued that the interconnection between the speech production and recognition systems happens at a sublexical level such as the phonological buffer used in memory-span tasks.
In summary, experimental data from people without brain damage suggest that spoken and visual word recognition make use of different mechanisms. There are distinct input and output stores, perhaps sharing some sublexical mechanisms.
Neuropsychological data and lexical architecture
There are very many neuropsychological dissociations found between reading, writing, and visual and spoken word recognition. In this section I examine data from patients whose behavior is consistent with damage to some routes of a model of lexical processing while other routes are intact. Several theorists, drawing on many sources, have tried to bring all this material together to form some idea of the overall structure of the language system (e.g., Ellis & Young, 1988; Kay, Lesser, & Coltheart, 1992; Patterson & Shewell, 1987). One such arrangement is shown in Figure 15.1. The neuropsychological data strongly suggest that there are four different lexicons, one each for speaking, writing, and spoken and visual word recognition, although these systems must clearly communicate in normal circumstances. This conclusion is consistent with the data from experiments on people without brain damage.
At the heart of the model is a system where word meanings are stored and that interfaces with the other cognitive processes. This is the semantic system (or systems, with the multiple-semantics view). The four most important language behaviors are speaking, listening, reading, and writing.
Speaking involves going from the semantic system to a store of the sounds of words. This is the phonological output store. Understanding speech necessitates the auditory analysis of incoming speech in order to access a representation of stored spoken word forms. This is the phonological input store.
People with anomia have difficulty in retrieving the names for objects, yet can show perfect comprehension of those words. EE was consistently unable to name particular words, yet he had no impairment of the auditory recognition or comprehension of the words that he could not name (Howard, 1995). This finding suggests that the input and output phonological lexicons are distinct.
Some patients show a disorder called pure word deafness. People with pure word deafness can speak, read, and write quite normally, but cannot understand speech (Chapter 9). These patients also cannot repeat speech back. However, there are a few patients with word deafness who still have intact repetition, a condition called word meaning deafness. Word meaning deafness is rare, but has been reported by Bramwell (1897/1984)
[Figure 15.1: diagram not reproduced; its caption did not survive extraction. Recoverable box labels: auditory phonological analysis; abstract letter identification; phonological input buffer; acoustic-to-phonological conversion; letter-to-sound rules; visual semantic system; verbal semantic system; lemmas; phonological output lexicon; orthographic output lexicon.]
and Kohn and Friedman (1986). This shows that word repetition need not depend on lexical access. Indeed, if Figure 15.1 is correct, then we should be able to repeat speech using three routes (in a manner analogous to the three-route model of reading). First, there is a repetition route through semantics. Second, there is a lexical repetition route from the input phonological lexicon to the output phonological lexicon. Third, there is a sublexical repetition route from the input phonological buffer to the output phonological buffer that bypasses lexical systems altogether. Disentangling precisely which is impaired depends on the pattern of word and nonword repetition performance, along with the effect of semantic variables such as imageability. Obviously (assuming that they can be distinguished), if either the input or output buffer is disrupted, repetition should be impaired; I examine this idea later. We should also be able to see disruptions resulting from selective damage to and preservation of our three repetition routes.
If both the sublexical and the lexical routes are destroyed, then the person will be forced to rely on repetition through the semantic route. If the semantic route is intact, there will be an imageability effect in repetition, with more
imageable words repeated more readily. If there is also some damage to the semantic route, patients will make semantic errors in repetition (for example, repeating “reflection” as “mirror”). This is called deep dysphasia. Howard and Franklin (1988) described the case of MK, who was good at speech production. He was severely impaired at single-word and nonword repetition, but was good at the matching span task. He made semantic errors in repetition. Howard and Franklin concluded that MK had preserved input and output phonological systems, total loss of the sublexical repetition route, partial impairment of the lexical repetition route, and partial impairment of the semantic repetition route.
If only the lexical repetition route is left intact, then patients will be able to repeat words but not nonwords (as nonwords do not have a lexical entry). They will not be able to comprehend the words they repeat (as there is no link with semantics), and they will probably have difficulty in understanding and producing speech (because of the disruption to semantics). Nor should they show the effects of semantic variables such as imageability in repetition. They might also make lexicalization errors (repeating nonwords as close words, e.g., repeating “sleeb” as “sleep”). Dr. O (Franklin, Turner, Lambon Ralph, Morris, & Bailey, 1996) was close to this pattern. He could understand written words, but could not understand spoken words. He could, however, repeat spoken words quite well (80%) but was very poor at nonword repetition (7%).
If only the sublexical repetition route is left intact, patients will be able to repeat both words and nonwords, but will have no comprehension of the meaning of the words. Transcortical sensory aphasia fits this pattern (Chapter 13).
There are other possible combinations, of course. Patients might have damage to only one of the routes, leaving two intact. Damage to the sublexical route alone would lead to an impairment of repetition, with particularly poor repetition of nonwords, as they cannot be repeated through the direct and semantic repetition routes. This is the pattern observed in conduction aphasia (Martin, 2001). As damage to the lexical route alone should result in relatively good repetition of both words and nonwords (through the sublexical repetition route) and good comprehension (through the semantic route), a deficit of this type will be difficult to detect. The important conclusion, however, is that the patterns of repetition impairment found can be explained by this sort of model.
Different lexical systems are involved in reading and writing. Bramwell’s patient could not comprehend spoken words, but could still write even irregular words to dictation. This is incompatible with any general system mediating lexical stores, and with obligatory phonological mediation of orthographic-to-cognitive codes. We also saw in Chapter 7 that phonological mediation does not appear to be necessary for writing single words.
There is a great deal of neuropsychological evidence that there are distinct phonological and orthographic output stores. Beauvois and Derouesné (1981) reported a patient showing impaired spelling yet intact lexical reading. MH was severely anomic in speech but had much less severe written word-finding difficulties (Bub & Kertesz, 1982b). Patient WMA produced inconsistent oral and written naming responses. When given a picture of peppers, he wrote “tomato” but said “artichoke” (Miceli, Benvegnu, Capasso, & Caramazza, 1997). If a single lexicon were used for both speaking and writing, WMA would have given the same (erroneous) response in both cases. Some patients are better at written picture naming than spoken picture naming (Rapp, Benzing, & Caramazza, 1997; Shelton & Weinrich, 1997). The existence of patients such as PW who can write the names of words that they can neither define nor name aloud is evidence for the independence of these systems, and argues against obligatory phonological mediation in writing (Rapp et al., 1997). Rapp and Caramazza (2002) describe a patient who has more difficulty speaking nouns than verbs but greater difficulty writing verbs than nouns. This evidence suggests that different output stores are involved in speaking and writing, and that writing does not require the generation of a phonological representation of the word. Although there are some dissenting voices (e.g., Behrmann & Bub, 1992), most studies suggest that multiple lexical systems are involved.
However, it is likely that they interact, as damage to word meaning usually leads to comparable difficulties in both written and spoken output (Miceli & Capasso, 1997).
Sketch of a model
As we saw in Chapter 7, reading makes use of a number of routes. The exact number is controversial, as connectionist models suggest that the direct and indirect lexical routes should be combined. Figure 15.1 shows the traditional model incorporating an indirect reading route; the figure shows the maximum sophistication necessary in a model of lexical architecture. The direct route goes from abstract letter identification to an orthographic input store and then to the semantic system. The direct lexical reading route then goes straight on to the phonological output store. The indirect or sublexical route (which as we saw in Chapter 7 might in turn be quite complex) bypasses the orthographic input store and the semantic system, giving us a direct link between letter identification and speech. We saw that non-semantic reading means that the semantic system can sometimes be bypassed. We can also read out aloud a language with a regular orthography (e.g., Italian) without being able to understand it. Allport and Funnell (1981) argued that we cannot have a separate amodal lexicon mediating between systems. They reviewed evidence from word meaning deafness, phonological dyslexia, and deep dyslexia. They described a number of studies of patients that argue for a dissociation of cognitive and lexical functions. The semantic paraphasias of deep dyslexics rule out any model where translation to a phonological code is a necessary condition to be able to access a semantic code (as these patients can access meaning without retrieving sound).
Writing and speaking produce output across time. It makes sense to retrieve a word in one go rather than having to access the lexicon afresh each time we need to produce a letter or sound. This means that we have to store the word while we speak out its constituent sounds, or write out its constituent letters in order. This in turn means that we also need phonological and orthographic buffers. Writing involves going from the semantic system to print through the orthographic output store. We can also write nonwords to dictation, so there must be an additional connection between the phonological output buffer and the orthographic output buffer that provides sound-to-letter rules.
Of course, we can do other things as well. We can name objects. Most people think that we access the names of objects through the semantic system from a system of visual object recognition. We saw in Chapter 11 that some people think that different semantic systems are used for words and objects, so we might have to split the semantic system in two. There is also some controversial evidence from the study of dementia that at least one patient (DT) can name objects and faces without going through semantics (Brennen, David, Fluchaire, & Pellat, 1996; but see Hodges & Greene, 1998, and Brennen, 1999, for a reply). In this case, we need to add an additional route from the visual object recognition system that bypasses semantics to get to the phonological output store.
Note that there is no direct connection between the orthographic input store and the orthographic output store. Are there patients who can copy words (but not nonwords) without understanding them? Finding such patients would suggest that such a link will be necessary. We would need to find these sorts of patient to be certain about these links. There is also some question about whether we need distinct input and output phonological buffers, or whether one will suffice. We examine these issues in more detail later.
By the time we add lemmas and a non-semantic object-naming route, we end up with a model that is even more complicated than Figure 15.1, just to produce single words! Remember that this is the most complex model necessary. Connectionist modeling may show how routes (in addition to the lexical and sublexical reading routes) may be combined without loss of explanatory adequacy.
A final point on lexical organization is that it is not too important for the architecture of this model whether words are represented in the lexicon in a local or a distributed representation. In
Some patients with mild speech perception deficits do not show impairments of ASTM (Martin & Breedin, 1992). In cases of mild speech perception impairment, lexical items will still be able to become activated.
Martin and Saffran (1990) examined the repetition abilities of a patient (ST) with transcortical sensory aphasia. They showed that their patient could not repeat more than two words without losing information about the earlier items in the input (here the first word of two words). People with a semantic impairment cannot maintain items at the beginning of a sequence. Word repetition is supported by phonological processes, but these processes are of short duration without the feedback support of semantic processes. Items at the beginning of the sequence get lost because their maintenance depends on activation spreading to semantics. Items at the end of the sequence benefit from the recency of phonological activation, and are not dependent on that semantic feedback at the time of recall. So, although good repetition is characteristic of transcortical sensory aphasia, even that ability is limited. Martin and Saffran (1997) found similar associations between the occurrence of semantic and phonological deficits and serial position effects in single-word repetition. Semantic deficits are associated with errors on the initial portion of the word, while phonological deficits are associated with errors on the final part of the word. This again points to the integrity of the language and memory systems.
Are there separate input and output phonological buffers?
Can we distinguish between the input and output phonological buffers? If Figure 15.1 is correct then we should be able to do so, and there is some neuropsychological evidence that we can. Shallice and Butterworth (1977) described the case of JB. On tasks probing memory span, JB performed poorly, suggesting an impaired input phonological buffer, but she had normal speech production, suggesting a preserved output phonological buffer. Nickels and Howard (1995) found no correlation between the number of phonological errors made in production by their sample of aphasic speakers and three measures of input phonological buffer processing (phoneme discrimination, lexical decision, and synonym judgments). However, Martin and Saffran (1998) found a negative relation between the proportion of target-related nonword errors in a naming task and the patient’s ability to discriminate phonemes. One possible resolution of this disagreement is that the two buffers are interconnected.
Other evidence also supports the existence of separate input and output phonological buffers. Romani (1992) described a patient with poor sentence and word repetition but good performance on immediate probe recognition, suggesting an impaired output buffer but an intact input buffer. Similarly, R. C. Martin et al. (1999) describe the case of an anomic patient, MS, who showed a different pattern of performance on tasks involving the input and output phonological buffers. In particular, his performance was poor on STM tasks that required verbal output, but normal on STM tasks that did not require verbal output but required the retention of verbal input. The pattern of performance suggests that separate input and output phonological buffers are involved. Shallice et al. (2000) described a patient (LT) with reproduction conduction aphasia. LT was impaired across a range of language output tasks; remember that the best explanation for such a pattern of performance is an impairment to the phonological buffer. Yet LT had an intact short-term memory span, suggesting that the input phonological buffer was spared but the output phonological buffer was damaged. Finally, patients with impaired ASTM fall into clusters in performance on visual homophone judgment, pseudohomophone judgment, and auditory and visual rhyme decision tasks in a way that can best be accounted for by separate input and output phonological buffers (Nickels, Howard, & Best, 1997). In particular, some patients showed evidence of damage to the input buffer, in being impaired on all tasks apart from homophone judgment. Other patients showed evidence of damage to the output buffer, in that they were impaired on all tasks other than auditory rhyme judgments.
Furthermore, some patients showed evidence of a lesion to the link between the output and the input buffers, in that they could perform homophone and auditory rhyme judgments well, but were poor at pseudohomophone detection and visual rhyme detection.
The phonological loop and vocabulary learning
We have seen that because we can access the meaning of words so quickly, damage to the phonological loop has surprisingly few consequences for language processing. The main role for the phonological loop is now thought to be limited to learning new words (Baddeley, Gathercole, & Papagno, 1998). Verbal short-term memory also plays a role in vocabulary acquisition in children (Gathercole & Baddeley, 1989, 1990, 1993; Gupta & MacWhinney, 1997). The size of verbal STM and vocabulary size are strongly correlated, and early nonword-repetition ability predicts later vocabulary size. Nonword repetition skills also predict success at foreign language vocabulary acquisition (Papagno et al., 1991). Patients with impaired short-term phonological memory (e.g., PV of Baddeley et al., 1998) find it difficult to learn a new language. Phonological memory is used to sustain novel phonological forms so that they can be built into more permanent representations.
Working memory and parsing
Although short-term memory plays some role in integration and maintaining a discourse representation, the extent to which an impairment of STM affects parsing is controversial. Early models of parsing considered the minimization of STM demands to be a primary constraint on parsing (e.g., Kimball, 1973). With a conception of working memory as a phonological loop and central executive, the phonological representations of words are stored in the phonological buffer of the loop, and the semantic representations of focal components of the discourse are handled by the central executive. The central executive might play a role in parsing, in computing parsing processes, and in manipulating the intermediate results of computations.
The idea that a central memory capacity is used in language comprehension is known as the capacity theory of comprehension (Just & Carpenter, 1992). Just and Carpenter argued that working memory constrains language comprehension. Individual differences in linguistic working memory capacity lead to differences in reading ability, and reduction of working memory capacity through aging or brain damage leads to language comprehension deficits. As we saw in Chapter 10, some researchers have put forward the controversial view that the deficits observed in syntactic comprehension are best explained by a reduction in central executive capacity (Blackwell & Bates, 1995; Miyake et al., 1994). Waters and Caplan (1996) criticized the capacity theory, arguing that language processing makes use of two distinct working memory systems, one dedicated to controlled, verbally mediated tasks, and one dedicated to automatic, obligatory “routine” language processing. They call this the domain-specific view of working memory (Caplan & Waters, 1999).
There is some evidence against the domain-specific view suggesting that working memory is involved in parsing. Gibson (1998) examined the relation between working memory and sentence processing. He argued that comprehension has two sorts of demands on available computational resources: a cost associated with integrating components, and a cost associated with keeping track of syntactic structures. The costs increase the longer a unit must be kept in memory before it can be integrated into the developing representation of the sentence. Gibson argued that the human parsing mechanism prefers the structure that incurs the least memory load. More recent dual-task studies show that parsing is impaired if people have to remember additional related material; the more syntactically complex the material, the greater the cost of remembering additional words. The key to observing interference is that the additional items that must be kept active in memory must be related to the material participants are trying to understand, rather than being unrelated digits, for example (Fedorenko, Gibson, & Rohde, 2006; Gordon, Hendrick, & Levine, 2002). As noted in Chapter 10, the debate about whether or
not language comprehension uses general working memory or a dedicated store is important, but is unresolved and ongoing.
Does parsing involve the phonological loop in particular? On the one hand, some researchers argue that the phonological loop maintains some words in short-term memory to assist in parsing, particularly when parsing is difficult (e.g., Baddeley, Vallar, & Wilson, 1987; Vallar & Baddeley, 1984, 1987). Although some patients with STM deficits have impaired syntactic comprehension abilities (e.g., Vallar & Baddeley, 1987), others crucially do not (e.g., Butterworth, Campbell, & Howard, 1986; Howard & Butterworth, 1989; Waters, Caplan, & Hildebrandt, 1991). For example, TB (a patient with a digit span of only two) showed increasing problems with comprehension as sentence length increased (Baddeley & Wilson, 1988). On the other hand, other researchers have argued that the phonological loop plays no role in parsing, but is involved in later processing after the sentence has been interpreted syntactically and semantically (e.g., McCarthy & Warrington, 1987a, 1987b; Warrington & Shallice, 1969). This later processing includes checking the meaning against the pragmatic context, making some inferences, and aspects of semantic integration. For example, patient BO had a memory span of only two or three items, yet had excellent comprehension of syntactically complex sentences, including those with dependencies spanning more than three words (Caplan, 1992; Waters et al., 1991). RE was a highly literate young woman with a greatly reduced digit span. Although she displayed phonological dyslexia and impaired sentence repetition, her syntactic analysis and comprehension abilities appeared to be intact (Butterworth et al., 1986; Campbell & Butterworth, 1985; but see Vallar & Baddeley, 1989). McCarthy and Warrington (1987a) observed a double dissociation, with some patients showing an impairment to a passive phonological store involved in unrelated word list repetition, but who were good at repeating sentences, and others showing an impairment to a memory system involving meaningful sentence repetition, but who could repeat lists of unrelated words. Rochon et al. (1994) examined syntactic processing in a group of patients with Alzheimer’s-type dementia. Although the participants’ working memory capacity was reduced, there was little effect of syntactic complexity, although semantic complexity was affected. Such results suggest that STM is not involved directly in parsing. Such patients can still display a variety of comprehension difficulties (such as turning commands into actions, or detecting discourse anomalies), suggesting that limited STM can affect later integrative processing.
Hence it seems likely that if there is a reduction in processing capacity involved in syntactic comprehension deficits, it is a reduction specifically in syntactic processing ability, rather than a reduction in general verbal memory capacity (Caplan et al., 1985; Caplan & Hildebrandt, 1988; Caplan & Waters, 1996, 1999). Parsing uses a specific mechanism that does not draw on verbal working memory. However, these more general processes may become involved later in the comprehension process. This topic is hotly debated (Caplan & Waters, 1996, 1999; Just & Carpenter, 1992; Just et al., 1996; Waters & Caplan, 1996). The conclusions to be drawn from all this depend on exactly how syntactic complexity is to be defined, and on the range of sentence types, tasks, patient categories, and language examined (Bates, Dick, & Wulfeck, 1999).
MacDonald and Christiansen (2002) take a totally different approach to the idea of working memory as a separate store. They adopt a connectionist perspective, arguing that the capacity limitations arise from the architecture of the language system, and from individual differences in reading experience. In particular, there is no separate working memory in the sense that there is a box into which the results of linguistic computations are put. Capacity and knowledge are inseparable. Instead, capacity limitations arise from the behavior of the whole system, rather than from one component of it. MacDonald and Christiansen provided a connectionist model to simulate individual differences in language comprehension, showing how these differences can arise from differences in the amount of training the networks receive. This alternative approach has generated considerable controversy (Caplan & Waters, 2002; Just & Varma, 2002). Nevertheless the idea is pleasingly simple and parsimonious.
SUMMARY
1. Do you think that a language system with multiple semantic stores is more plausibly combined
with separate or unitary lexical systems?
2. Are there kinds of patients that we should not observe, if Figure 15.1 is correct?
3. What role does the central executive play in language?
4. How do we decide whether or not two words rhyme?
5. What is a lexicon?
6. How does the content of what we know about the structure of the language system relate to what
we have learnt about the brain in earlier chapters?
FURTHER READING
For reviews of picture naming, see Glaser (1992) and Morton (1985). Allport and Funnell (1981)
review many of the issues concerning lexical fractionation; they argue for the separability of
cognitive and lexical codes. Monsell (1987) is a comprehensive review of the literature on the
fractionation of the lexicon. Ellis and Young (1988, Ch. 8) provide a detailed discussion of the
neuropsychological evidence for their proposed architecture of the language system. See also the
PALPA test battery (Kay, Lesser, & Coltheart, 1992). See Shelton and Caramazza (1999) for a review
and discussion of how lexical architecture relates to semantic memory. Bradley and Forster (1987)
review the differences between spoken and visual word recognition.
See Baddeley (2007) and Eysenck and Keane (2010) for more on working memory. Howard and
Franklin (1988) give a detailed single-case study of a patient (MK) with a repetition disorder, and
Martin (2001) is an excellent review of repetition disorders in aphasia. For more on the role of the
phonological loop in vocabulary learning, see the debate between Bowey (1996, 1997) and Gathercole
and Baddeley (1997). See Meyer, Wheeldon, and Krott (2006) for a collection that examines which
language processes might be automatic and which might require resources.
CHAPTER 16
NEW DIRECTIONS
independently of the others, and other boxes only get access to the final output. Interactive
models allow boxes to fiddle around with the contents of other boxes while they are still
processing, or they are allowed to start processing on the basis of an early input rather than
having to wait for the preceding stage to complete its processing. This issue has recurred through
every chapter on adult psycholinguistics. Although there is a great deal of disagreement among
psycholinguists, the preponderance of evidence, in my opinion, suggests that language processing
is strongly interactive, although there are constraints. There may be modules, but they are leaky
ones: modules need not be informationally encapsulated. The debate has now largely moved on from
simply whether language processes are modular or interactive, to examining the detailed time
course of processing. When does interaction occur? What types of information interact? Can
interactions be prevented? Psycholinguists have started to dispense with broad, general
considerations, and to focus on the details of what happens. Context can have different effects
at different levels of processing.

Fourth, what is innate about language? We have seen that there is still disagreement about whether
the developing child needs innate, language-specific content in order to acquire language.
Connectionist modeling has shown how language might be an emergent process, the development of
which depends on general constraints, although this remains controversial.

Fifth, do we need to refer to explicit rules when considering language processing? There is
currently little agreement on this, with researchers in the connectionist camp against much
explicit rule-based processing, and traditionalists in favor of it (e.g., see the debates on past
tense acquisition in Chapter 4 and on dual-route models of reading in Chapter 7). There is
considerable evidence that children make much use of statistical learning of distributional
information when acquiring language. A recent study, for example, has found a correlation between
children’s statistical learning skills and reading ability (Arciuli & Simpson, 2012).

The sixth theme is the extent to which language processes are specific to language. We have seen
how this issue has proved very controversial, particularly in language development. There is a
divide between those who argue that children need language-specific information (which is usually
thought to have some innate origin) to acquire language, and those who argue that acquisition
needs no more than general learning principles, such as the ability to make use of distributional
information.

Seventh, how sensitive are the results of our experiments to the particular techniques employed?
Our results are sometimes very sensitive to the techniques used, and this means that in addition
to having a theory about the principal object of study, we need to have a theory about the tools
themselves. Perhaps this is most clearly exemplified by the debate about lexical decision and
naming, and whether they measure the same thing.

Eighth, a great deal can be learned by examining the language of people with damage to the parts
of the brain that control language. In recent years, cognitive neuroscience imaging data has
provided some of the most interesting and important contributions to psycholinguistics.

[Figure: Cognitive neuropsychology has provided some interesting and important contributions to
the study of language.]

Ninth, language is cross-cultural. Studies of processing in different languages have told us a
great deal about topics such as language development, reading, parsing, language production, and
neuropsychology. The results suggest that while the same basic architecture is used to process
different languages, it is exploited in different ways. That is, we all share the same hard-wired
modules, but they vary slightly in what they do. Hence there are some important cross-linguistic
differences, and these differences are of theoretical interest.

Finally, we should be able to apply psycholinguistic research to everyday problems. We can discern
five key applications: First, we now know a great deal about reading and comprehension, and this
can be applied to improving methods of teaching reading (Chapters 8 and 12). Second, these
techniques should also be of use in helping children with language disabilities; for example, the
study of developmental dyslexia has aroused much interest (Chapter 8). Third, psycholinguistics
helps us to improve the way in which foreign languages can be acquired by children and adults
(Chapter 5). Fourth, we have greatly increased our understanding of how language can be disrupted
by brain damage. This has had consequences for the treatment and rehabilitation of brain-damaged
patients (e.g., see Howard & Hatfield, 1987). Fifth, there are obvious advantages if we can
develop computers that can understand and produce language. This is a complex task, but an
examination of how humans perform these tasks has been revealing. Generally, computers are better
at lower level tasks such as word recognition. Higher level, integrative processes involve a great
deal of context (Chapters 12 and 13), and this has proved a major stumbling block for work in the
area.

In addition to these ten themes, we noted in Chapter 1 that modern psycholinguistics is eclectic.
In particular, we have made use of data from cognitive neuropsychology and techniques of
connectionist modeling.

We have seen that the study of impairments to the language system has cast light on virtually
every aspect of psycholinguistics. For example, it has provided a major motivation for the
dual-route model of reading (Chapter 7); it has enhanced our understanding of the development of
reading and spelling (Chapter 8); it has provided interesting if complex data that any theory of
semantics must explain (Chapter 11); it has bolstered the two-stage model of lexicalization
(Chapter 13); and it has been revealing about the nature of parsing and syntactic planning
(Chapters 10 and 13).

Connectionism has made many important contributions to psycholinguistics over the last 30 years.
What are its virtues that have made it so attractive? First, as we have seen, unlike traditional
AI it is more brain-like, in that processing takes place in lots of simple, massively
interconnected neuron-like units. It is important not to get too carried away with this metaphor,
but at least we have the feeling that we are starting off with the right sort of models. Second,
just like traditional AI, connectionism has the virtue that modeling forces us to be totally
explicit about our theories. This explicitness has had three major consequences. First, recall
that many psycholinguistic models are specified as box-and-arrow diagrams (e.g., Figure 15.1).
This approach is sometimes called, rather derogatorily, “boxology.” It is certainly not unique to
psycholinguistics, and such an approach is not as bad as is sometimes hinted. It at least gives
rise to an understanding of the architecture of the language system: what the modules of the
language system are, and how they are related to others. However, connectionism has meant that we
have had to focus on the processes that take place inside the boxes of our models. In some cases
(such as the acquisition of past tense), this has led to a detailed re-examination of the
evidence. Second, connectionism has forced us to consider in detail the representations used by
the language system. This has led to a healthy debate, even if the first representations used by
connectionist modelers turned out later not to be the correct ones (e.g., see Chapter 7 and the
debate on using Wickelfeatures as a representation of phonology in the input to the reading
system). Third, the emphasis on learning in many connectionist models focuses on the developmental
aspect that is hopefully leading to an integration of adult and developmental psycholinguistics.

SOME GROWTH AREAS?

Students of any subject are obviously interested primarily in where a subject has been, whereas
researchers naturally focus on where a subject is
models will become more process orientated, and we will form a clearer understanding of how
processing changes throughout childhood before settling on the adult form.

Fourth, a full understanding of psycholinguistics would entail an understanding of the nature of
all the components of the language processor, and how they are related to each other. We saw in
Chapter 15 (Figure 15.1) how a start has been made on the word recognition system. One important
goal of any integrative theory is to specify how the language system interfaces with other
cognitive systems. That is, what is the final output of comprehension and the initial input to
production? It is likely these are the same. In Chapter 12 we saw how currently the most likely
proposal about the form of the output is a propositional representation associated with the
activation of goals and other schemata (see the description of Kintsch’s model in that chapter).
In Chapter 13, we saw that the conceptualizer that creates the input to the production system has
been much neglected. In Chapter 11, we saw that the work of people like Jackendoff (1983) puts
restrictions on the interface between the semantic and cognitive systems. There is a move to
integrating research across areas. For example, the connectionist model of Chang, Dell, and Bock
(2006) accounts for data from adult speech production (structural priming) and language
acquisition (verb-argument structures). The Chang et al. model shows how language acquisition and
adult speech production make use of the same mechanisms. A related question is how production and
comprehension are related (e.g., Ferreira & Bailey, 2004; Pickering & Garrod, 2013). Clearly much
remains to be done in this important area.

Fifth, the Internet and social networking make possible the use of very large corpora of language.
We have seen in the study of semantics how HAL makes use of the co-occurrence information of a
very large sample of text. Watts (2012) explores what we can learn about behavior and
communication using Twitter and Facebook. For the first time we have readily available millions of
samples of language in actual use. So, for example, rather than estimating things such as word
frequency, we can specify it for a very large corpus of actual usage. The Internet also makes it
relatively easy through crowdsourcing to collect large-scale norms. We can also carry out
“megastudies” using a huge number of participants and items. The challenge is developing new tools
that will enable us to extract meaningful conclusions from these very large samples (see for
example Bestgen & Vincze, 2012).

Finally, psycholinguistics will explore other participant groups and more naturalistic settings in
greater detail. Recent years have seen an enormous diversification in who is being studied. We saw
in the first chapter that many experiments have been on the visual processing of language by
healthy monolingual college-aged participants. That is changing. One particularly important aspect
of this is the cross-linguistic study of language. Most of the experiments described in this book
have been on speakers of the English language. This does not just reflect my bias, because most of
the work carried out has been on English. Research is also driven by the assumption that the
underlying processing architecture is shared by languages, although there may be some important
differences. There is likely to be more emphasis on how we process natural speech in more natural
settings, away from the single word presented on a computer screen. The “visual world” paradigm
(discussed particularly in Chapters 10, 13, and 14) has become particularly important in this
respect, and is likely to become even more so.

Chomsky’s ideas have been very influential in this respect. You will remember that according to
his position language is an innate faculty specified by the language acquisition device (LAD). All
languages, because they are governed by the form of the LAD, are similar at some deep level.
Variation between languages boils down to differences in vocabulary, and the parameters set by the
exposure to a particular language. An alternative view is the connectionist one that similar
constraints from general development and inherent in the data lead to similarities in development
and processing across languages. In general, cross-linguistic comparisons help us to constrain the
nature of this architecture and to explain the important differences. What are the consequences of
these differences? Much research remains to be done on this.
There are many areas where it is useful to compare languages. First, the observation that there
are similar constraints on syntactic rules has been used to motivate the concept of universal
grammar (Chapters 2, 3, and 4). To what extent can the connectionist view that language is an
emergent process give an account of these findings? Second, we also saw in Chapter 10 that
examining a single language (English) might have given us a distorted view of the parsing process.
Third, similarities and differences in languages have consequences for language development
(Chapter 4). For example, the cross-linguistic analysis of the development of gender argues
against a semantic basis for the development of syntactic categories. Finally, what can analysis
of different languages that map orthography onto phonology in different ways tell us about reading
(Chapter 7)? And do different languages break down in different ways after brain damage?

Related to cross-linguistic studies, much remains to be learned about bilingualism, which is still
receiving increasing attention in the research literature, both as a subject in its own right, and
as a means to investigate underlying language processes.

CONCLUSION

The eventual goal of psycholinguistics is a detailed and unified theory of language and how it
relates to other cognitive processes. The more we know, in some ways the harder it is to carry out
psycholinguistics experiments. Cutler (1981) observed that the list of variables that had to be
controlled in psycholinguistics experiments was large and growing, and there were many that were
rarely considered. Here is Cutler’s (adapted) list for experiments on single words: syntactic
class, ambiguity, frequency, frequency of related morphological and phonological forms, length,
associations, age of acquisition, autobiographical associations, categorizability, concreteness,
bigram frequency, imagery, letter frequency, number of meanings, orthographic regularity,
meaningfulness, emotionality, recognition threshold, regularity, position of recognition point,
and morphological complexity. Since then we have discovered that not only are the neighbors of
words important, but also the properties of the neighbors are important. And this is before we
have begun to consider constraints on processing units larger than a word. Cutler, writing in
1981, asked, “will we be able to run any psycholinguistic experiments at all in 1990?” The year
1990 has passed and we are still doing experiments in 2012, so the answer is obviously “yes,” but
it is getting more difficult, and we have to make a number of carefully justified assumptions. It
is apparent that we have to be particularly careful about how we choose our materials. Controlling
variables we know about might not be enough. Forster (2000) showed that skilled psycholinguists
have a great deal of implicit knowledge about language. When asked to make predictions about which
word would be responded to fastest on a lexical decision task from word pairs controlled for known
predictor variables, skilled researchers performed above chance. Hence it is always possible that
researchers are unconsciously constructing their materials in a particular way. The remedy for
this problem is making more use of random sampling of materials.

There still remains a great deal to do in psycholinguistics. It should be clear from reading this
book that there is much we don’t know, and many occasions when there are competing interpretations
of the data. If this book has inspired any reader to investigate further and even actually
contribute to the subject, it has more than served its purpose.
APPENDIX: CONNECTIONISM

This appendix provides a more formal and detailed description of connectionism than is given in
the main text. I hope it is comprehensible to anyone with some knowledge of basic algebra. If you
find the mathematics daunting, it is worth persevering, as many of the most important models in
current psycholinguistics are types of connectionist models. See the suggestions for further
reading for more detailed and comprehensive coverage.

Connectionism has become the preferred term to describe a class of models that all have in common
the principle that processing occurs through the action of many simple, interconnected units;
parallel distributed processing (PDP) and neural networks are other commonly used terms that are
almost synonymous. There are three very important concepts underpinning all connectionist models.
The first basic idea of connectionism is that there are many simple processing units connected
together. These units don’t do very much other than modify and pass on activation (one number).
The second basic idea is that energy or activation spreads around the network in a way determined
by the strengths of the connections between units. Strong positive weights magnify the output of
units; strong negative weights produce a large negative, inhibitory value. Units have activation
levels that are modified by the amount of activation they receive from other units. The third idea
is that high-level, complex “intelligent” behavior emerges from the interaction and cooperation of
these many simple “dumb” units.

There are many types of connectionist model. One important distinction is between models that do
not learn and models that do. In psychology the most important examples of the models that do not
learn are those based on interactive activation and competition (IAC), and of models that do
learn, those trained using back-propagation. We should distinguish the architecture of a network,
which describes the layout of the network (how many units there are and how they are connected to
each other), the algorithm that determines how activation spreads around the network, and the
learning rule, if appropriate, that specifies how the network learns.

We look here at two approaches that have been the most influential in psycholinguistics. Other
important learning algorithms that have been used include Hebbian learning and the Boltzmann
machine (Hinton & Sejnowski, 1986); see the suggestions for further reading for details of these.

INTERACTIVE ACTIVATION MODELS

I’ll start with the interactive activation model because historically it was the first
connectionist type model to have an impact on psychology, and because it’s relatively easy to
understand. McClelland and Rumelhart (1981) and Rumelhart and McClelland (1982) presented the
interactive activation and competition (IAC) model to account for word context effects on letter
identification. The TRACE model of spoken word recognition (McClelland & Elman, 1986) is an IAC
model.

The model consists of many simple processing units arranged in three levels. There is an input
level of visual feature units, a level where
units correspond to individual letters, and an output level where each unit corresponds to a word.
Each unit is connected to each unit in the level immediately before and after it. Each of these
connections is either excitatory (that is, positive or facilitatory) or inhibitory (negative).
Excitatory connections make the units at the end of the connection more active, whereas inhibitory
connections make the units at the end less active. Each unit is connected to each other unit
within the same level by an inhibitory connection. See Figure 6.9 for a graphical representation
of this architecture.

When a unit becomes activated, it sends off energy, or activation, simultaneously along the
connections to all the other units to which it is connected. If it is connected by a facilitatory
connection, it will increase the activation of the unit at the other end of the connection,
whereas if it is connected by an inhibitory connection, it will decrease the activation at the
other end. Consider the IAC model of Figure 6.9. If the unit corresponding to the letter “T” in
the initial letter position becomes activated, it will increase the activation level of the word
units corresponding to “TAKE” and “TASK,” because they start with a “T,” but will decrease the
activation level of “CAKE,” because it does not. Because units are connected to all other units
within the same level by inhibitory connections, as soon as a unit becomes activated, it starts
inhibiting all the other units at that level. The equations summarized in the next section
determine the way in which activation flows between units, is summed by units, and is used to
change the activation level of each unit at each time step. Over time, the pattern of activation
settles down or relaxes into a stable configuration so that only one word remains active.

[…] activation model

As we have seen, in the IAC model activation spreads from each unit to neighboring units along
excitatory or inhibitory connections. Connections have numbers or weights that determine how much
activation spreads along that connection, and hence how quickly activation builds up at the unit
at the end of the connection. The total activation, called $\mathit{net}_i$, arriving at each unit
i from j connections is shown in equation (A.1). Put in words, this equation means that the
activation arriving at a unit is the sum (Σ) of the products of the output activation ($a_j$) of
all the j units that input to it and the weights (w) on the connection between the input and
receiving unit ($w_{ji}$). You just multiply all the output of connecting units by the strength of
the appropriate weight and add them up.

$$\mathit{net}_i = \sum_j a_j \cdot w_{ji} \tag{A.1}$$

An example should make this clear. Figure A.1 shows part of a very simple network. There are four
input units to one destination unit. We say that the input vector is [1 0 1 1]. (Here we are
assuming that the input units are either simply “on” [with a value of 1] or “off” [with a value of
0]. That is, we are restricting them to binary values. In principle, we could let the input units
be “on” to different extents, e.g., 0.3.) The total amount of activation arriving at the
destination unit will be the sum of all the products of the outputs of the units that input to it
with the appropriate weights on the connections: that is,
$((1 \times +0.2) + (0 \times -0.5) + (1 \times +0.7) + (1 \times -0.1)) = +0.8$. The underlying
idea is that this equation is a simplified model of a neuron: Neurons become excited or inhibited
by all the other neurons that contact them. Found that bit of arithmetic tedious? Imagine doing it
hundreds or thousands, or more, times. No wonder connectionism relies on computers to do the
computation.

FIGURE A.1 A simplified connectionist unit or “neuron.” This unit takes multiple inputs (derived
from the weighted output of other units) and converts them to a single output. [Diagram: four
input units, labeled on or off, feed one destination unit through connections weighted +0.2, −0.5,
+0.7, and −0.1.]

Finally, a further equation is needed to determine what happens to a unit in each processing cycle
after it receives an input. In the IAC model, each unit changes its activation level depending on
how much or how little input it receives, and whether that input is overall positive (excitatory)
or negative (inhibitory). In each cycle the change in the activation level of a unit i,
$\Delta a_i$, is given by equations (A.2) and (A.3):

$$\Delta a_i = (\mathit{max} - a_i)\,\mathit{net}_i - \mathit{decay}\,(a_i - \mathit{rest}) \quad \text{if } \mathit{net}_i > 0 \tag{A.2}$$

$$\Delta a_i = (a_i - \mathit{min})\,\mathit{net}_i - \mathit{decay}\,(a_i - \mathit{rest}) \quad \text{otherwise} \tag{A.3}$$

where rest is the unit’s resting level, decay is a parameter that makes the unit tend to decay
back to its resting level in the absence of new input, max is the unit’s maximum permitted level
of activation, and min is the unit’s minimum permitted level of activation. So if absolutely
nothing happens, eventually the activation of the unit will decay back to its resting level.

Processing takes place in cycles to represent the passage of time. At the end of each cycle, the
activation levels of all the units in the network are updated. In the next cycle the process is
repeated using the new activation levels. Processing continues until some criterion (e.g., a
certain number of processing cycles or a certain level of stability) is attained.

BACK-PROPAGATION

Back-propagation is the most widely used connectionist learning rule. It enables networks to learn
to associate input patterns with output patterns. It is called an error-reduction learning method
because it is an algorithm that enables networks to be trained to reduce the error between what
the network actually outputs given a particular input, and what it should output given that input.

The simplest type of network architecture that can be trained by back-propagation has three layers
or levels. Again, each typically contains many simple units. These are called the input, hidden,
and output levels (see Figure 7.5 for an example). As in the IAC model, each of the units in these
layers has an activation level, and each unit is connected to all the units in the next level by a
weighted connection, which can be either excitatory or inhibitory. These networks learn to
associate an input pattern with an output pattern using a learning rule called back-propagation.
The most important difference between IAC and back-propagation networks is that in the case of the
latter the weights on connections are learned rather than hand-coded at the start.

How does the network learn? The connections in the network all start off with random weights.
Suppose we want the model to learn to pronounce the printed word “DOG”; that is, we want to train
the network to associate the input pattern of graphemes D O G with the output pattern of sounds or
phonemes /d/ /o/ /g/. One pattern of activation over the input units corresponds to “DOG.” In
Figure 7.5 I have for simplicity made the representation a local one; that is, for example, one
unit corresponds to “D,” one to “O,” one to “G,” and so on. In more realistic models these
patterns are usually distributed so that DOG is represented by a pattern of activation over the
input units with no one single unit corresponding to any one single letter. Hence DOG might be
represented by input unit 1 on, input unit 2 off, input unit 3 on, and so on. These units then
pass activation on to the hidden units according to the values of the connections between the
input and the hidden units. Activation is then summed by each unit in the hidden unit layer in
just the same way as in the interactive activation model. In models that learn using
back-propagation, the output of a unit is a complex function of its input: For reasons that we can
skip, there must be a non-linear relation between the two, given by a special type of function
called the logistic function. The output $o_u$ of a unit u is related to its input by equation
(A.4). Here $\mathit{netinput}_u$ is the total input to the unit u from all the other units that
input to it, and e is the exponential constant (the base of natural logarithms, with a value of
about 2.718).
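Returning briefly to the interactive activation model, equations (A.1) to (A.3) can be sketched in
a few lines of Python. This is an illustration only, not the original simulation: the function
names are invented for the sketch, the parameter values (max, min, rest, decay) are arbitrary, and
the net input is held constant across cycles, whereas in a full IAC network it would be recomputed
from the other units on every cycle.

```python
def net_input(activations, weights):
    """Equation (A.1): sum of each sending unit's output times its connection weight."""
    return sum(a * w for a, w in zip(activations, weights))


def iac_delta(a, net, max_a=1.0, min_a=-0.2, rest=0.0, decay=0.1):
    """Equations (A.2) and (A.3): change in one unit's activation for one cycle.

    The default parameter values are illustrative assumptions, not taken from the text.
    """
    if net > 0:
        drive = (max_a - a) * net   # excitatory input pushes the unit towards max
    else:
        drive = (a - min_a) * net   # inhibitory input pushes the unit towards min
    return drive - decay * (a - rest)


def run_cycles(a, net, n_cycles, **params):
    """Apply the update repeatedly, holding the net input constant (a simplification)."""
    for _ in range(n_cycles):
        a += iac_delta(a, net, **params)
    return a


# Worked example from the text: input vector [1 0 1 1] with the Figure A.1 weights.
net = net_input([1, 0, 1, 1], [0.2, -0.5, 0.7, -0.1])
print(round(net, 3))                          # 0.8

# With no input at all, an active unit decays back towards its resting level.
print(abs(run_cycles(0.5, 0.0, 200)) < 1e-6)  # True
```

With these particular values and a constant net input of 0.8, the unit settles at 0.8/0.9 (about
0.89), just below its maximum, because the decay term comes to balance the excitatory drive.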
As an example, let us take the unit shown in Figure A.1 once again. The total input to that unit,
$\mathit{netinput}_u$, is $[(1 \times 0.2) + (0 \times -0.5) + (1 \times 0.7) + (1 \times -0.1)] = 0.8$.
Hence the output $o_u$ for this unit is $1/(1 + e^{-0.8}) \approx 0.69$.

Each unit has an individual threshold level or bias. (This is usually implemented by attaching an
additional unit, the bias unit, which is always on, to each principal unit. The value of the
weights between the bias and other units can be learned like any other weights.)

Activation is then passed on from the hidden to the output units, and so eventually the output
units end up with activation values. But as we started off with totally random values, they are
extremely unlikely to be the correct ones. As the target output, we wanted the most activated
output units to correspond to the phonemes /d/ /o/ /g/, but the actual output is going to be
totally random, maybe something close to /k/ /i/ /j/. What the learning rule does then is to
modify the connections in the network so that the output will be a bit less like what it actually
produced, and a bit more like what it should be. It does this in a way that is very like what
happens in calculating the mean squared error in an analysis of variance. The difference between
the actual and the target outputs is computed, and the values of all the weights from

The error for the output units is given by equation (A.6), and that for the hidden units by
equation (A.7), where l and m are connecting layers. The weight change is given by equation (A.8).

$$\delta_{pj} = (t_{pj} - o_{pj}) \cdot o_{pj} \cdot (1 - o_{pj}) \tag{A.6}$$

$$\delta_{pl} = o_{pl} \cdot (1 - o_{pl}) \cdot \sum_m \delta_{pm} \cdot w_{lm} \tag{A.7}$$

$$\Delta w_{ij}(n+1) = \eta \cdot (\delta_{pj} \cdot o_{pi}) + \alpha \cdot \Delta w_{ij}(n) \tag{A.8}$$

There are two new constants in equation (A.8): η is the learning rate, which determines how
quickly the network learns, and α is the momentum term, which stops the network changing too much
and hence overshooting on any learning cycle. (The dots “·” mean the same as “multiply,” but make
the equations easier to read.) Needless to say, this training process cannot be completed in a
single step. It has to be repeated many times, but gradually the values of actual and desired
outputs converge. You can modify the training set in a number of ways to make the task more
realistic. For example, if you are interested in word frequency, you have to encode it in the
training in some way, perhaps by presenting input–output pairings of frequent words more often.
the hidden to the output units are adjusted slightly to Networks trained by back-propagation show
try to make this difference smaller. This process is some interesting properties. Most interestingly,
then “back-propagated” to change the weights on the if you present a trained network with an item
connections between the input and the hidden units. that it has not seen before, it can often manage
The whole process can then be repeated for a dif- to produce the appropriate output quite well. For
ferent input–output (e.g., grapheme–phoneme) pair. example, in the case of the model learning to
Eventually, the weights of the network converge on read, although the network has not been taught
values that give the best output (that is, the least dif- any explicit rules of pronunciation, it behaves as
ference between desired and actual output) averaged though it has learned them, and can generalize
across all input–output pairs. appropriately.
The back-propagation learning rule is based One of the most important and commonly
on the generalized delta rule. The rule for chang- used modifications to the simple feedforward
ing the weights following the presentation of a architecture is to introduce recurrent connec-
particular pattern p is given by equation (A.5), tions from one layer (usually the hidden layer) to
where j and i index adjacent upper and lower lay- another layer (called the context layer). For exam-
ers in the network, tpj is the jth component of the ple, if the context layer stores the past state of the
desired target pattern, opj is the corresponding jth hidden unit layer, then the network can learn to
component of the actual output pattern p, and ipi is encode sequential information—what follows
the ith component of the input pattern. what in a sequence (Elman, 1990).
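The worked logistic example and equations (A.6) to (A.8) in this appendix can be sketched in Python roughly as follows. This is a minimal illustration rather than the implementation used in any published model: the function names are hypothetical, and the learning rate, momentum, target value, and number of training cycles are assumed values chosen only to show the error shrinking over repeated cycles.

```python
import math


def logistic(net_input):
    """Logistic output function: squashes any net input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net_input))


def total_input(activations, weights):
    """Sum of each sending unit's activation times its connection weight."""
    return sum(a * w for a, w in zip(activations, weights))


def output_delta(target, output):
    """Error term for an output unit, following equation (A.6)."""
    return (target - output) * output * (1.0 - output)


def hidden_delta(output, deltas_above, weights_above):
    """Error term for a hidden unit, following equation (A.7)."""
    back_propagated = sum(d * w for d, w in zip(deltas_above, weights_above))
    return output * (1.0 - output) * back_propagated


def weight_change(eta, delta, sender_output, alpha, previous_change):
    """Weight change with learning rate and momentum, following equation (A.8)."""
    return eta * delta * sender_output + alpha * previous_change


# Worked example from the text: four sending units and their weights.
activations = [1, 0, 1, 1]
weights = [0.2, -0.5, 0.7, -0.1]
net = total_input(activations, weights)   # about 0.8
initial_output = logistic(net)            # about 0.69

# Repeated training of this single unit toward a target of 1.0.
# eta, alpha, target, and the cycle count are assumed values for illustration.
eta, alpha, target = 0.5, 0.9, 1.0
previous = [0.0] * len(weights)
for _ in range(50):
    output = logistic(total_input(activations, weights))
    delta = output_delta(target, output)
    for i, a in enumerate(activations):
        change = weight_change(eta, delta, a, alpha, previous[i])
        weights[i] += change
        previous[i] = change

trained_output = logistic(total_input(activations, weights))
print(round(net, 2), round(initial_output, 2))
```

Only the weights from active sending units change here, because the weight change in (A.8) is scaled by the sender's output. In a full network, `hidden_delta` would be applied to each hidden unit before the input-to-hidden weights are updated in the same way.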
You should bear in mind that this description is a simplification. Why you need hidden units in such a model, what happens if you do not have them, and how you select how many units to have are all important issues. Furthermore, there are other learning algorithms that are sometimes used—of these, Hebbian learning and Boltzmann machines are those that you are most likely to come across in psycholinguistics. The general principles involved are much the same, and furthermore, the end result is generally the same. We are usually most interested in the behavior of the trained network, and how it is trained is usually not relevant.
FURTHER READING
Bechtel and Abrahamsen (2001) is an excellent textbook on connectionism. Ellis and Humphreys
(1999) is a text that emphasizes the role of connectionism in cognitive psychology. The two-volume
set Parallel Distributed Processing (Rumelhart, McClelland, & the PDP Research Group, 1986;
McClelland, Rumelhart, & the PDP Research Group, 1986) is a classic. Caudill and Butler (1992),
Dawson (2005), McClelland and Rumelhart (1988), Orchard and Phillips (1991), and Plunkett and
Elman (1997) provide exercises and simulation environments. Plunkett and Elman is a companion
volume to Elman et al. (1996) and includes a simulation environment called tlearn that runs on both
Macintosh and Windows platforms.
There are a number of popular books about emergent systems, attractors, chaos, and complexity,
including Gleick (1987), Stewart (1989), and Waldrop (1992).
GLOSSARY
Acoustics: the study of the physical properties of sounds.
Acquired disorder: a disorder caused by brain damage is acquired if it affects an ability that was previously intact (contrasted with developmental disorder).
Activation: can be thought of as the amount of energy possessed by something. The more highly activated something is, the more likely it is to be output.
Adjective: a describing word (e.g., “red”).
Adverb: a type of word that modifies a verb (e.g., “quickly”).
Affix: a bound morpheme that cannot exist on its own, but that must be attached to a stem (e.g., re-, -ing). It can come before the main word, when it is a prefix, or after, when it is a suffix.
Agent: the thematic role describing the entity that instigates an action.
Agnosia: disorder of object recognition.
Agrammatism: literally, “without grammar”; a type of aphasia distinguished by an impairment of syntactic processing (e.g., difficulties in sentence formation, inflection formation, and parsing). There has been considerable debate about the extent to which agrammatism forms a syndrome.
Allophones: phonetic variants of phonemes. For example, in English the phoneme /p/ has two variants, an aspirated (breathy) and unaspirated (non-breathy) form. You can feel the difference if you say the words “pit” and “spit” with your hand a few inches from your mouth.
Alzheimer’s disease (AD): Alzheimer’s disease or dementia—often there is some uncertainty about the diagnosis, so this is really shorthand for “probable Alzheimer’s disease” or “dementia of the Alzheimer’s type.”
American Sign Language (ASL): American Sign Language (sometimes called AMESLAN).
Anaphor: a linguistic expression for which the referent can only be determined by taking another linguistic expression into account—namely the anaphor’s antecedent (e.g., “Vlad was happy; he loved the vampire”—here he is the anaphor and Vlad is the antecedent).
Aneurysm: dilation of blood vessel (e.g., in the brain), where a sac in the blood vessel is formed and presses on surrounding tissue.
Anomia: difficulty in naming objects.
Antecedent: the linguistic expression that must be taken into account in order to determine the referent of an anaphor (“Vlad was happy; he loved the vampire”—here he is the anaphor and Vlad the antecedent). Often the antecedent is the thing for which a pronoun is being substituted.
Aphasia: a disorder of language, including a defect or loss of expressive (production) or receptive (comprehension) aspects of written or spoken language as a result of brain damage.
Apraxia: an inability to plan movements, in the absence of paralysis. Of particular relevance is speech apraxia, an inability to carry out properly controlled movements of the articulatory apparatus. Compare with dysarthria.
Articulatory apparatus: the parts of the body responsible for making speech sounds, such as the larynx, tongue, teeth, and lips.
Aspect: the use of verb forms to show whether something is finished, continuing, or repeated. English has two aspects: progressive (e.g., “we are cooking dinner”) versus non-progressive (e.g., “we cook dinner”), and perfect, involving forms of the auxiliary “have” (e.g., “we have cooked dinner”), versus non-perfect (without the auxiliary).
Aspirated: a sound that is produced with an audible breath (e.g., at the start of “pin”).
Assimilation: the influence of one sound on the articulation of another, so that the two sounds become slightly more alike.
Attachment: attachment concerns how phrases are connected together to form syntactic structures. In “the vampire saw the ghost with the binoculars” the prepositional phrase (“with the binoculars”) can be attached to either the first noun phrase (“the vampire”) or the second (“the ghost”).
Attentional (or controlled) processing: processing requiring central resources. It is non-obligatory, generally uses working memory space, is prone to dual-task interference, is relatively slow, and may be accessible to consciousness. (The opposite is automatic processing.)
Attractor: a point in the connectionist attractor network to which related states are attracted.
Audience design: the idea that speakers tailor their productions to address the specific needs of their listeners.
Auditory short-term memory (ASTM): a short-term store for spoken material.
Automatic processing: processing that is unconscious, fast, obligatory, facilitatory, does not involve working memory space, and is generally not susceptible to dual-task interference. (The opposite is attentional processing.)
Auxiliary verb: a linking verb used with other verbs (e.g., in “You must have done that,” “must” and “have” are auxiliaries).
Babbling: an early stage of language, starting at the age of about 5 or 6 months, where the child babbles, repetitively combining consonants and vowels into syllable-like sequences (e.g., “bababababa”).
Back-propagation: an algorithm for learning input–output pairs in connectionist networks. It works by alternately reducing the error between the actual output and the desired output of the network.
Basic level: the level of representation in a hierarchy that is the default level (e.g., “dog” rather than “terrier” or “animal”).
Bilingual: speaking two languages.
Bilingualism: having the ability to speak two languages. There are three types depending on when L2 (the second language) is learned relative to L1: simultaneous (L1 and L2 learned about the same time), early sequential (L1 learned first but L2 learned relatively early, in childhood), and late (in adolescence onwards).
Body: the same as a rime—the final vowel and terminal consonants.
Bootstrapping: the way in which children can increase their knowledge when they have some—such as inferring syntax when they have semantics.
Bottom-up: processing that is purely data-driven.
Bound morphemes: a morpheme that cannot exist on its own (e.g., un, ent).
Brain imaging: techniques for looking at what the brain is doing when we carry out some activity.
Broca’s aphasia: a type of aphasia that follows from damage to Broca’s region of the brain, characterized by many dysfluencies, slow, laborious speech, difficulties in articulation, and by agrammatism.
Cascade model: a type of processing where information can flow from one level of processing to the next before the first has finished processing; contrast with discrete stage model.
Categorical perception: perceiving things that lie along a continuum as belonging to one distinct category or another.
Child-directed speech (CDS): the speech of carers to young children that is modified to make it easier to understand (sometimes called “motherese”).
Class: the grammatical class of a word is the major grammatical category to which a word belongs—e.g., noun, adjective, verb, adverb, determiner, preposition, pronoun.
Clause: a group of related words containing a subject and a verb.
Closed-class item: same as function word.
Co-articulation: the way in which the articulatory apparatus takes account of the surrounding sounds when a sound is articulated; as a result, a sound conveys information about its neighbors.
Cognates: words in different languages that have developed from the same root (e.g., many English and French words have developed from the same Latin root: “horn” [and “cornet”] and “corne” are derived from the Latin “cornu”); occasionally used for words that have the same form in two languages (e.g., “oblige” in English and French).
Competence: our knowledge of our language, as distinct from our linguistic performance.
Complementizer: a category of words (e.g., “that”) used to introduce a subordinate clause.
Conjunction: a part of speech that connects words within a sentence (e.g., “and,” “because”).
Connectionism: an approach to cognition that involves computer simulations with many simple processing units, and where knowledge comes from learning statistical regularities rather than explicitly presented rules.
Connectionist: a computational model involving many simple, neuron-like units connected together by weighted links.
Consonant: a sound produced with some constriction of the airstream, unlike a vowel.
Constituent: a linguistic unit that is part of a larger linguistic unit.
Content word: one of the enormous number of words that convey most of the meaning of a sentence—nouns, verbs, adjectives, adverbs. Content words are the same as open-class words. Contrasted with function word.
Conversational maxim: a rule that helps us to make sense of conversation.
Co-reference: two or more noun phrases with the same reference. For example, in “There was a vampire in the kitchen; Boris was scared to death when he saw him,” the co-referential noun phrases are vampire and him.
Creole: a pidgin that has become the language of a community through an evolutionary process known as “creolization.”
Cross-linguistic: involving a comparison across languages.
Deep dyslexia: disorder of reading characterized by semantic reading errors.
Deep dysphasia: disorder of repetition characterized by semantic repetition errors.
Derivational morphology: the study of derivational inflections.
Determiner: a grammatical word that determines the number of a noun (e.g., “the,” “a,” “an,” “some”).
Developmental disorder: a disorder where the normal development or acquisition of a process (e.g., reading) is affected.
Diphthong: a type of vowel that combines two vowel sounds (e.g., in “boy,” “cow,” and “my”).
Discourse: linguistic units composed of several sentences.
Discrete stage model: a processing model where information can only be passed to the next stage when the current one has completed its processing (contrast with cascade model).
Dissociation: a process is dissociable from other processes if brain damage can disrupt it, while leaving the others intact.
Distributional information: information about what tends to co-occur with what; for example, the knowledge that the letter “q” is almost always followed by the letter “u,” or that the word “the” is always followed by a noun, are instances of distributional information.
Double dissociation: a pattern of dissociations whereby one patient can do one task but not another, whereas another patient shows the reverse pattern.
Dysarthria: difficulty with executing motor movements. In addition to difficulties with executing speech plans, there are problems with automatic activities such as eating. Compare with apraxia, which is a deficit limited to motor planning.
Dysgraphia: disorder of writing.
Dyslexia: disorder of reading.
Dysprosody: a disturbance of prosody.
EEG: electroencephalography—a means of measuring electrical potentials in the brain by placing electrodes across the scalp.
Episodic memory: knowledge of specific episodes (e.g., what I had for breakfast this morning, or what happened in the library yesterday).
ERP: event-related potential—electrical activity in the brain after a particular event. An ERP is a complex electrical waveform related in time to a specific event, measured by EEG.
Expressive: a form of aphasia to do with producing language, primarily speaking.
Facilitation: making processing faster, usually as a result of priming. It is the opposite of inhibition.
Figurative speech: speech that contains non-literal material, such as metaphors and similes (e.g., “he ran like a leopard”).
Filler: what fills a gap.
fMRI: functional magnetic resonance imaging—a modern method of mapping the brain’s activity by recording blood flow in real time.
Formal paraphasia: substitution in speech of a word that sounds like another word (e.g., “caterpillar” for “catapult”). Sometimes called a form-related paraphasia.
Formant: a concentration of acoustic energy in a sound.
Function word: one of the limited numbers of words that do the grammatical work of the language (e.g., determiners, prepositions, conjunctions—such as “the,” “a,” “to,” “in,” “and,” “because”). Contrasted with content word.
Gap: an empty part of the syntactic construction that is associated with a filler.
Garden path sentence: a type of sentence where the syntactic structure leads you to expect a different conclusion from that which it actually has (e.g., “the horse raced past the barn fell”).
Gating task: a task that involves presenting increasing amounts of a word.
Gender: some languages (e.g., French and Italian) distinguish different cases depending on their gender—male, female, or neuter.
Generative grammar: a finite set of rules that will produce or generate all the sentences of a language (but no non-sentences).
Glottal stop: a sound produced by closing and opening the glottis (the opening between the vocal folds); an example is the sound that replaces the /t/ sound in the middle of “bottle” in some dialects of English (e.g., in parts of London).
Grammar: the set of syntactic rules of a language.
Grammatical element: a difficulty in physically producing the sounds of language, usually due to brain damage affecting control of the muscles involved in moving the articulatory apparatus.
Grapheme: a unit of written language that corresponds to a phoneme (e.g., “steak” contains four graphemes, s t ea k, corresponding to the four component sounds).
Hemidecortication: complete removal of the cortex of one side of the brain.
Heterographic homophones: two words with different spellings that sound the same (e.g., “soul” and “sole”; “night” and “knight”).
Hidden units: a unit from the hidden layer of a connectionist network that enables the network to learn complex input–output pairs by the back-propagation algorithm. The hidden layer forms a layer between the input and output layers.
Homographs: different words that are spelled the same; they may or may not be pronounced differently, e.g., “lead” (as in what you use to take a dog for a walk) and “lead” (as in the metal).
Homophone: two words that sound the same.
Idioms: an expression particular to a language, whose meaning cannot be derived from its parts (e.g., “kick the bucket”).
Imageability: a semantic variable concerning how easy it is to form a mental image of a word: “rose” is more imageable than “truth.”
Implicature: an inference that we make in conversations to maintain the sense and relevance of the conversation.
Independent models: models in which processing occurs without reference to any external processes or information (e.g., purely bottom-up).
Inference: the derivation of additional knowledge from facts already known; this might involve going beyond the text to maintain coherence or to elaborate on what was actually presented.
Inflection: a grammatical change to a verb (changing its tense, e.g., -ed) or noun (changing its number, e.g., -s, or “mice”).
Inflectional morphology: the study of inflections.
Inhibition: this has two uses. In terms of processing it means slowing processing down. In this sense priming may lead to inhibition. Inhibition is the opposite of facilitation. In comprehension it is closely related to the idea of suppression. In terms of networks it refers to how some connections decrease the amount of activation of the target unit.
Inner speech: that voice we hear in our head; speech that is not overtly articulated.
Interactive models: models where different sorts of information are allowed to influence current processing (e.g., a mixture of bottom-up and top-down).
Intransitive verb: a verb that does not take an object (e.g., “The man laughs”).
Invariance: the same phoneme can in fact sound different depending on the context in which it occurs.
L1: the language learned first by bilingual people.
L2: the language learned second by bilingual people.
Language acquisition device (LAD): Chomsky argued that children hear an impoverished language input and therefore need the assistance of an innate language acquisition device in order to acquire language.
Lemma: a level of representation of a word between its semantic and phonological representations; it is syntactically specified, but does not yet contain sound-level information; it is the intermediate stage of two-stage models of lexicalization.
Lesion: damage to a particular part of the brain.
Lexeme: the phonological word form, in a format where phonology is represented.
Lexical access: accessing a word’s entry in the lexicon.
Lexicalization: in speech production, going from semantics to sound.
Lexicon: our mental dictionary.
LSA: latent semantic analysis—a means of acquiring knowledge from the co-occurrence of information.
Malapropisms: a type of speech error where a similar-sounding word is substituted for the target (e.g., saying “restaurant” instead of “rhapsody”).
Manner of articulation: the way in which the airstream is constricted in speaking (e.g., stop).
Maturation: the sequential unfolding of characteristics, usually governed by instructions in the genetic code.
Mediated priming: (facilitatory) priming through a semantic intermediary (e.g., “lion” to “tiger” to “stripes”).
MEG: magnetoencephalography—a technique for mapping the brain’s electrical activity by recording the magnetic field produced by the brain.
Metaphor: a figure of speech that works by association, comparison, or resemblance (e.g., “he’s a tiger in a fight,” “the leaves swam around the lake”).
Minimal pair: a pair of words that differ in meaning when only one sound is changed (e.g., “pear” and “bear”).
Model: an account of the data that provides an explanation of why the data are as they are and that makes novel, testable predictions.
Modifier: a part of speech that is dependent on another, which it modifies or qualifies in some way (e.g., adjectives modify nouns).
Modularity: the idea that the mind is built up from discrete modules; its resurgence is associated with the American philosopher Jerry Fodor, who said that modules cannot tinker around with the insides of other modules. A further step is to say that the modules of the mind correspond to identifiable neural structures in the brain.
Monosyllabic: a word having just one syllable.
Morpheme: the smallest unit of meaning (e.g., “dogs” contains two, dog + plural s).
Morphology: the study of how words are built up from morphemes.
Nativist: the idea that knowledge is innate.
Natural kind: a category of naturally occurring things (e.g., animals, trees).
Neologism: a “made-up word” that is not in the dictionary. Neologisms are usually common in the speech of people with jargon aphasia.
Nonword: a string of letters that does not form a word. Although most of the time nonwords mentioned in psycholinguistics refer to pronounceable nonwords (pseudowords), not all nonwords need be pronounceable.
Noun: the syntactic category of words that can act as names and can all be subjects or objects of a clause; all things are nouns.
Noun phrase: a grammatical phrase based on a noun (e.g., “the red house”), abbreviated to NP.
Number: the number of a verb is whether one or more subjects are doing the action (e.g., “the ghost was” but “the ghosts were”).
Object: the person, thing, or idea that is acted on by the verb. In the sentence “The cat chased the dog,” “cat” is the subject, “chased” the verb, and “dog” is the object. Objects can be either direct or indirect—in the sentence “She gave the dog to the man,” “dog” is the direct object and “the man” is the indirect object.
Onset: the beginning of something. It has two meanings. The onset of a stimulus is when it is first presented. The onset of a printed word is its initial consonant cluster (e.g., “sp” in “speak”).
Open-class word: same as content word.
Ostensive: you can define an object ostensively by pointing to it.
Over-extension: when a child uses a word to refer to things in a way that is based on particular attributes of the word, so that many things can be named using that word (e.g., using “moon” to refer to all round things, or “stick” to all long things, such as an umbrella).
Parameter: a component of Chomsky’s theory that governs aspects of language, and that is set in childhood by exposure to a particular language.
Paraphasia: a spoken word substitution.
Parsing: analyzing the grammatical structure of a sentence.
Participle: a type of verbal phrase where a verb is turned into an adjective by adding -ed or -ing to the verb: “we live in an exciting age.”
Patient: the thematic role of a person or thing acted on by the agent.
Performance: our actual language ability, limited by our cognitive capacity, distinct from our competence.
Phoneme: a sound of the language; changing a phoneme changes the meaning of a word.
Phonetics: the acoustic detail of speech sounds and how they are articulated.
Phonological awareness: awareness of sounds, measured by tasks such as naming the common sound in words (e.g., “bat” and “ball”), and deleting a sound from a word (e.g., “take the second sound of bland”); thought to be important for reading development but probably other aspects of language too.
Phonological dyslexia: a type of dyslexia where people can read words quite well but are poor at reading nonwords.
Phonology: the study of sounds and how they relate to languages; phonology describes the sound categories each language uses to divide up the space of possible sounds.
Phrase: a group of words forming a grammatical unit beneath the level of a clause (e.g., “up a tree”). A phrase does not contain both a subject and a predicate. In general, if you can replace a sequence of words in a sentence with a single word without changing the overall structure of the sentence, then that sequence of words is a phrase.
Pidgin: a type of language, with reduced structure and form, without any native speakers of its own, and which is created by the contact of two peoples who do not speak each other’s native languages.
Place of articulation: where the airstream in the articulatory apparatus is constricted.
Polysemous words: words that have more than one meaning.
Pragmatics: the aspects of meaning that do not affect the literal truth of what is being said; these concern things such as choice from words with the same meaning, implications in conversation, and maintaining coherence in conversation.
Predicate: the part of the clause that gives information about the subject (e.g., in “The ghost is laughing,” “the ghost” is the subject and “is laughing” is the predicate).
Prefix: an affix that comes before the stem (e.g., dis-interested). Contrast with suffix which comes after the stem.
Preposition: a grammatical word expressing a relation (e.g., “to,” “with,” “from”).
Prepositional phrase: a phrase beginning with a preposition (e.g., “with the telescope,” “up the chimney”).
Priming: affecting a response to a target by presenting a related item prior to it; priming can have either facilitatory or inhibitory effects.
Pronouns: a grammatical class of words that can stand for nouns or noun phrases (e.g., “she,” “he,” “it”).
Proposition: the smallest unit of knowledge that can stand alone: it has a truth value—that is, a proposition can be either true or false.
Prosody: the way in which speech is stressed and intoned to give it a rhythm.
Prototype: an abstraction that is the best example of a category.
Pseudohomophone: a nonword that sounds like a word when pronounced (e.g., “nite”).
Pseudoword: a string of letters that form a pronounceable nonword (e.g., “smeak”).
Psycholinguist: someone who does psycholinguistics.
Psycholinguistics: the psychology of language.
Receptive aphasia: a form of aphasia to do with understanding language.
Recognition point: the point at which we recognize a word.
Sucking habituation paradigm: a method for examining whether or not very young infants can discriminate between two stimuli. The child sucks on a special piece of apparatus; as the child habituates to the stimulus, their sucking rate drops, but if a new stimulus is presented, the sucking rate increases again, but only if the child can detect that the stimulus is different from the first.
Suffix: a morpheme added to the end of a word to form a derivative (e.g., -ed, -ing, -s).
Suppression: in comprehension, suppression is closely related to inhibition. Suppression is the attenuation of activation, while inhibition is the blocking of activation. Material must be activated before it can be suppressed.
Syllable: a rhythmic unit of speech (e.g., po-lo contains two syllables); it can be analyzed in terms of onset and rime (or rhyme), with the rime further being analyzable into nucleus and coda. Hence in “speaks,” “sp” is the onset, “ea” the nucleus, and “ks” the coda; together “eaks” forms the rime.
Syndrome: a medical term for a cluster of symptoms that cohere as a result of a single underlying cause.
Syntactic bootstrapping: the idea that the syntactic frame associated with a verb provides a cue as to the word’s meaning.
Syntax: the rules of word order of a language.
Tachistoscope: a device for presenting materials (e.g., words) for extremely short durations; tachistoscopic presentation therefore means an item that is presented very briefly.
Telegraphic speech: a type of speech used by young children, marked by syntactic simplification, particularly in the omission of function words.
Tense: the tense of a verb is whether it is in the past, present, or future (e.g., “she gave,” “she gives,” and “she will give”).
Thematic roles: the set of semantic roles in a sentence that conveys information about who is doing what to whom, as distinct from the syntactic roles of subject and object. Examples include agent and theme.
Theme: the thing that is being acted on or being moved.
Tip-of-the-tongue (TOT): when you know that you know a word, but you cannot immediately retrieve it (although you might know its first sound, or how many syllables it has).
TMS: transcranial magnetic stimulation—producing activity in certain brain regions using locally applied magnetic fields.
Top-down: processing that involves knowledge coming from higher levels (such as predicting a word from the context).
Transcortical aphasia: a type of language disturbance following brain damage characterized by relatively good repetition but poor performance in other aspects of language.
Transformation: a grammatical rule for transforming one syntactic structure into another (e.g., turning an active sentence into a passive one).
Transformational grammar: a system of grammar based on transformations, introduced by Chomsky.
Transitive verb: a verb that takes an object (e.g., “The cat hit the dog”).
Unaspirated: a sound that is produced without an audible breath (e.g., the /p/ in “spin”).
Uniqueness point: the point at which a word is unique and differs from all its neighbors.
Universal grammar: the core of the grammar that is universal to all languages, and which specifies and restricts the form that individual languages can take.
Unvoiced: a sound that is produced without vibration of the vocal cords, such as /p/ and /t/—the same as voiceless and without voice.
Verb: a syntactic class of words expressing actions, events, and states, and which have tenses.
Verb-argument structure: the set of possible themes associated with a verb (e.g., a person gives something to someone—or agent–theme–goal).
Voice onset time (VOT): the time between the release of the constriction of the airstream when we produce a consonant, and when the vocal cords start to vibrate.
Voicing: consonants produced with vibration of the vocal cords.
Vowel: a speech sound produced with very little constriction of the airstream, unlike a consonant.
494 GLOSSARY
Adams, M. J. (1990). Beginning to read: Thinking recognition: Evidence for continuous mapping models.
and learning about print. Cambridge, MA: MIT Press. Journal of Memory and Language, 38, 419–439.
Ainsworth-Darnell, K., Shulman, H. G., & Allport, D. A. (1977). On knowing the meaning
Boland, J. E. (1998). Dissociating brain responses to of words we are unable to report: The effects of
syntactic and semantic anomalies: Evidence from event- visual masking. In S. Dornic (Ed.), Attention and
related potentials. Journal of Memory and Language, performance VI (pp. 505–534). Hillsdale, NJ:
38, 112–130. Lawrence Erlbaum Associates, Inc.
Aitchison, J. (1994). Words in the mind: An Allport, D. A. (1984). Auditory short-term memory
introduction to the mental lexicon (2nd ed.). Oxford: and conduction aphasia. In H. Bouma & D. Bouwhis
Blackwell. (Eds.), Attention and performance X (pp. 313–326).
Aitchison, J. (1996). The seeds of speech: Language Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
origin and evolution. Cambridge: Cambridge Allport, D. A., & Funnell, E. (1981). Components of
University Press. the mental lexicon. Philosophical Transactions of the
Aitchison, J. (1998). The articulate mammal (4th ed.). Royal Society of London, Series B, 295, 397–410.
London: Routledge. Almor, A. (1999). Noun-phrase anaphora and focus:
Akhtar, N. (1999). Acquiring basic word order: The informational load hypothesis. Psychological
Evidence for data-driven learning of syntactic Review, 106, 748–765.
structure. Journal of Child Language, 26, 339–356. Almor, A., Kempler, D., MacDonald, M. C.,
Akhtar, N., & Tomasello, M. (1997). Young Andersen, E. S., & Tyler, L. K. (1999). Why do
children’s productivity with word order and verb Alzheimer patients have difficulty with pronouns?
morphology. Developmental Psychology, 33, 952–965. Working memory, semantics, and reference in
Alario, F.-X., & Caramazza, A. (2002). The comprehension and production in Alzheimer’s disease.
production of determiners: Evidence from French. Brain and Language, 67, 202–227.
Cognition, 82, 179–223. Altarriba, J. (1992). The representation of translation
Alario, F.-X., Costa, A., & Caramazza, A. (2002a). equivalents in bilingual memory. In R. J. Harris (Ed.),
Frequency effects in noun phrase production: Cognitive processing in bilinguals (pp. 157–174).
Implications for models of lexical access. Language Amsterdam: North-Holland.
and Cognitive Processes, 17, 299–319. Altarriba, J. (Ed.). (1993). Cognition and culture: A
Alario, F.-X., Costa, A., & Caramazza, A. (2002b). cross-cultural approach to psychology. Amsterdam:
Hedging one’s bets too much? A reply to Levelt North-Holland.
(2002). Language and Cognitive Processes, 17, Altarriba, J., & Forsythe, W. J. (1993). The role
673–682. of cultural schemata in reading comprehension. In
Alario, F.-X., Costa, A., Pickering, M., & Ferreira, V. J. Altarriba (Ed.), Cognition and culture: A cross-
(2006). Language production. Hove, UK: Psychology cultural approach to psychology (pp. 145–155).
Press. Amsterdam: North-Holland.
Albert, M. L., & Obler, L. K. (1978). The bilingual Altarriba, J., Kroll, J. F., Sholl, A., & Rayner, K.
brain: Neuropsychological and neurolinguistic aspects (1996). The influence of lexical and conceptual
of bilingualism. New York: Academic Press. constraints on reading mixed-language sentences:
Alishahi, A., & Stevenson, S. (2005). A probabilistic Evidence from eye fixations and naming times.
model of early argument structure acquisition. Memory and Cognition, 24, 477–492.
Proceedings of the 27th Annual Conference of the Altarriba, J., & Mathis, K. E. (1997). Conceptual
Cognitive Science Society, Stresa, Italy. and lexical development in second language
Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. acquisition. Journal of Memory and Language, 36,
(1998). Tracking the time course of spoken word 550–568.
496 REFERENCES
Altarriba, J., & Soltano, E. G. (1996). Repetition Andrews, S. (1997). The effect of orthographic
blindness and bilingual memory: Token individuation similarity on lexical retrieval: Resolving neighborhood
for translation equivalents. Memory and Cognition, 24, conflicts. Psychonomic Bulletin and Review, 4,
700–711. 439–461.
Altmann, G. T. M. (Ed.). (1990). Cognitive models of Andrews, S. (2006). From inkmarks to ideas. Hove,
speech processing. Cambridge, MA: MIT Press. UK: Psychology Press.
Altmann, G. T. M. (1997). The ascent of Babel: An Ans, B., Carbonnel, S., & Valdois, S. (1998). A
exploration of language, mind, and understanding. connectionist multiple-trace memory model for
Oxford: Oxford University Press. polysyllabic word reading. Psychological Review, 105,
Altmann, G. T. M. (1999). Thematic role assignment 678–723.
in context. Journal of Memory and Language, 41, Antos, S. J. (1979). Processing facilitation in a lexical
124–145. decision task. Journal of Experimental Psychology:
Altmann, G. T. M., Garnham, A., & Dennis, Y. Human Perception and Performance, 5, 527–545.
(1992). Avoiding the garden path: Eye movements Arbib, M. A. (2005). From monkey-like action
in context. Journal of Memory and Language, 31, recognition to human language: An evolutionary
685–712. framework for neurolinguistics. Behavioral and Brain
Altmann, G. T. M., & Kamide, Y. (1999). Sciences, 28, 105–167.
Incremental interpretation at verbs: Restricting the Arciuli, J., & Simpson, I. C. (2012). Statistical
domain of subsequent reference. Cognition, 73, learning is related to reading ability in children and
247–264. adults. Cognitive Science, 36, 286–304.
Altmann, G. T. M., & Kamide, Y. (2009). Discourse- Armstrong, S., Gleitman, L. R., & Gleitman, H.
mediation of the mapping between language and (1983). What some concepts might not be. Cognition,
the visual world: Eye movements and mental 13, 263–274.
representation. Cognition, 111, 55–71. Arnold, J. E., Eisenband, J. G., Brown-Schmidt, S.,
Altmann, G. T. M., & Shillcock, R. C. (Eds.). (1993). & Trueswell, J. C. (2000). The rapid use of gender
Cognitive models of speech processing. Hove, UK: information: Evidence of the time course of pronoun
Lawrence Erlbaum Associates. resolution from eyetracking. Cognition, 76, B13–B36.
Altmann, G. T. M., & Steedman, M. J. (1988). Atkinson, M. (1982). Explanations in the study of
Interaction with context during human sentence child language development. Cambridge: Cambridge
processing. Cognition, 30, 191–238. University Press.
Anderson, J. R. (1974). Retrieval of propositional Au, T. K. (1983). Chinese and English
information from long-term memory. Cognitive counterfactuals: The Sapir Whorf hypothesis revisited.
Psychology, 6, 451–474. Cognition, 15, 155–187.
Anderson, J. R. (1976). Language, memory, and thought. Au, T. K. (1984). Counterfactuals: In reply to Alfred
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Bloom. Cognition, 17, 289–302.
Anderson, J. R. (1983). The architecture of cognition. Austin, J. L. (1976). How to do things with words
Cambridge, MA: Harvard University Press. (2nd ed.). Oxford: Oxford University Press. [First
Anderson, J. R. (2010). Cognitive psychology and its edition published 1962.]
implications (7th ed.). New York: Worth. Baars, B. J., Motley, M. T., & MacKay, D. G.
Anderson, J. R., & Bower, G. H. (1973). Human (1975). Output editing for lexical status from
associative memory. Washington, DC: Winston & Sons. artificially elicited slips of the tongue. Journal of
Anderson, K. J., & Leaper, C. (1998). Meta-analyses Verbal Learning and Verbal Behavior, 14, 382–391.
of gender effects on conversational interruption: Who, Baayen, R. H., Dijkstra, T., & Schreuder, R. (1997).
what, when, where, and how. Sex Roles, 39, 225–252. Singulars and plurals in Dutch: Evidence for a parallel
Anderson, R. C., & Pichert, J. W. (1978). Recall of dual route model. Journal of Memory and Language,
previously unrecallable information following a shift 37, 94–117.
in perspective. Journal of Verbal Learning and Verbal Baayen, R. H., Piepenbrock, R., & Gulikers, L.
Behavior, 12, 1–12. (1995). The CELEX lexical database [CD-ROM].
Andrewes, D. (2001). Neuropsychology: From theory Philadelphia: Linguistic Data Consortium, University
to practice. Hove, UK: Psychology Press. of Pennsylvania.
Andrews, S. (1982). Phonological recoding: Is the Bach, E., Brown, C., & Marslen-Wilson, W.
regularity effect consistent? Memory and Cognition, (1986). Crossed and nested dependencies in German
10, 565–575. and Dutch: A psycholinguistic study. Language and
Andrews, S. (1989). Frequency and neighborhood Cognitive Processes, 1–4, 249–262.
effects on lexical access: Activation or search? Journal Backman, J. E. (1983). Psycholinguistic skills and
of Experimental Psychology: Learning, Memory, and reading acquisition: A look at early readers. Reading
Cognition, 15, 802–814. Research Quarterly, 18, 466–479.
REFERENCES 497
Baddeley, A. D. (1990). Human memory: Theory & K. Rayner (Eds.), Comprehension processes in
and practice. Hove, UK: Lawrence Erlbaum reading (pp. 9–32). Hillsdale, NJ: Lawrence Erlbaum
Associates. Associates, Inc.
Baddeley, A. D. (2007). Working memory, thought, Balota, D. A., & Chumbley, J. I. (1984). Are lexical
and action. Oxford: Oxford University Press. decisions a good measure of lexical access? The
Baddeley, A. D., Ellis, N. C., Miles, T. R., & Lewis, V. J. role of word frequency in the neglected decision
(1982). Developmental and acquired dyslexia: A stage. Journal of Experimental Psychology: Human
comparison. Cognition, 11, 185–199. Perception and Performance, 10, 340–357.
Baddeley, A. D., Gathercole, S., & Papagno, C. Balota, D. A., & Chumbley, J. I. (1985). The locus
(1998). The phonological loop as a language learning of word-frequency effects in the pronunciation task:
device. Psychological Review, 105, 158–173. Lexical access and/or production? Journal of Memory
Baddeley, A. D., & Hitch, G. J. (1974). Working and Language, 24, 89–106.
memory. In G. H. Bower (Ed.), The psychology of Balota, D. A., & Chumbley, J. I. (1990). Where
learning and motivation (Vol. 8, pp. 47–90). London: are the effects of frequency on visual word
Academic Press. recognition tasks? Right where we said they were!
Baddeley, A. D., Vallar, G., & Wilson, B. (1987). Comment on Monsell, Doyle, and Haggard (1989).
Comprehension and the articulatory loop: Some Journal of Experimental Psychology: General, 119,
neuropsychological evidence. In M. Coltheart (Ed.), 231–237.
Attention and performance XII (pp. 509–530). Balota, D. A., Cortese, M. J., Sergent-Marshall, S. D.,
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Spieler, D. H., & Yap, M. J. (2004). Visual word
Baddeley, A. D., & Wilson, B. (1988). recognition of single-syllable words. Journal of
Comprehension and working memory: A single case Experimental Psychology: General, 133, 283–316.
neuropsychological study. Journal of Memory and Balota, D. A., Ferraro, F. R., & Conner, L. T.
Language, 27, 479–498. (1991). On the early influence of meaning in word
Badecker, W., & Caramazza, A. (1985). On recognition: A review of the literature. In
considerations of method and theory governing the use P. J. Schwanenflugel (Ed.), The psychology of word
of clinical categories in neurolinguistics and cognitive meanings (pp. 187–222). Hillsdale, NJ: Lawrence
neuropsychology: The case against agrammatism. Erlbaum Associates, Inc.
Cognition, 20, 97–125. Balota, D. A., & Lorch, R. F. (1986). Depth of
Badecker, W., & Caramazza, A. (1986). A final brief automatic spreading activation: Mediated priming
in the case against agrammatism: The role of theory in effects in pronunciation but not in lexical decision.
the selection of data. Cognition, 24, 277–282. Journal of Experimental Psychology: Learning,
Badecker, W., Miozzo, M., & Zanuttini, R. (1995). Memory, and Cognition, 12, 336–345.
The two-stage model of lexical retrieval: Evidence Baluch, B., & Besner, D. (1991). Visual word
from a case of anomia with selective preservation of recognition: Evidence for strategic control of lexical
gender. Cognition, 57, 193–216. and nonlexical routines in oral reading. Journal of
Badecker, W., & Straub, K. (2002). The processing Experimental Psychology: Learning, Memory, and
role of structural constraints on the interpretation Cognition, 17, 644–652.
of pronouns and anaphors. Journal of Experimental Banich, M. T. (2004). Cognitive neuroscience and
Psychology: Learning, Memory, and Cognition, 28, neuropsychology. Boston, MA: Houghton Mifflin.
748–769. Banks, W. P., & Flora, J. (1977). Semantic and
Baguley, T., & Payne, S. J. (2000). Long-term perceptual processing in symbolic comparison.
memory for spatial and temporal mental models Journal of Experimental Psychology: Human
includes construction processes and model structure. Perception and Performance, 3, 278–290.
Quarterly Journal of Experimental Psychology, 53A, Barisnikov, K., van der Linden, M., & Poncelet, M.
479–512. (1996). Acquisition of new words and phonological
Bailey, K. G. D., & Ferreira, F. (2003). Disfluencies working memory in Williams syndrome: A case study.
affect the parsing of garden-path sentences. Journal of Neurocase, 2, 395–404.
Memory and Language, 49, 183–200. Barker, M. G., & Lawson, J. S. (1968). Nominal
Baillet, S. D., & Keenan, J. M. (1986). The role of aphasia in dementia. British Journal of Psychiatry,
encoding and retrieval processes in the recall of text. 114, 1351–1356.
Discourse Processes, 9, 247–268. Baron, J., & Strawson, C. (1976). Use of
Baldwin, D. A. (1991). Infants’ contributions to the orthographic and word-specific knowledge in reading
achievement of joint reference. Child Development, words aloud. Journal of Experimental Psychology:
62, 875–890. Human Perception and Performance, 2, 386–393.
Balota, D. A. (1990). The role of meaning in word Baron-Cohen, S. (2003). The essential difference.
recognition. In D. A. Balota, G. B. Flores d’Arcais, Harmondsworth, UK: Penguin.
498 REFERENCES
Barrett, M. D. (1978). Lexical development and Bates, E., & MacWhinney, B. (1982). Functionalist
overextension in child language. Journal of Child approaches to grammar. In E. Wanner &
Language, 5, 205–219. L. R. Gleitman (Eds.), Language acquisition: The
Barrett, M. D. (1982). Distinguishing between state of the art (pp. 173–218). Cambridge: Cambridge
prototypes: The early acquisition of the meaning of object University Press.
names. In S. A. Kuczaj (Ed.), Language development: Bates, E., Marchman, V., Thal, D., Fenson, L.,
Vol. 1. Syntax and semantics (pp. 313–334). New York: Dale, P. S., Reznick, J. S., et al. (1994). Developmental
Springer-Verlag. and stylistic variation in the composition of early
Barrett, M. D. (1986). Early semantic representations vocabulary. Journal of Child Language, 21, 85–123.
and early word-usage. In S. A. Kuczaj & M. D. Barrett Bates, E., Masling, M., & Kintsch, W. (1978).
(Eds.), The development of word meaning: Progress Recognition memory for aspects of dialog. Journal
in cognitive development research (pp. 39–67). New of Experimental Psychology: Human Learning and
York: Springer-Verlag. Memory, 4, 187–197.
Barron, R. W. (1981). Reading skills and reading Bates, E., McDonald, J., MacWhinney, B., &
strategies. In C. A. Perfetti & A. M. Lesgold (Eds.), Applebaum, M. (1991). A maximum likelihood
Interactive processes in reading (pp. 299–328). procedure for the analysis of group and individual
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. data in aphasia research. Brain and Language, 40,
Barron, R. W., & Baron, J. (1977). How children get 231–265.
meaning from printed words. Child Development, 48, Bates, E., & Roe, K. (2001). Language development
587–594. in children with unilateral brain damage. In
Barry, C., Morrison, C. M., & Ellis, A. W. (1997). C. A. Nelson & M. Luciana (Eds.), Handbook of
Naming the Snodgrass and Vanderwart pictures: developmental cognitive neuroscience (pp. 309–318).
Effects of age of acquisition, frequency, and name Cambridge, MA: MIT Press.
agreement. Quarterly Journal of Experimental Batterink, L., Karns, C. M., Yamada, Y., & Neville, H.
Psychology, 50A, 560–585. (2010). The role of awareness in semantic and syntactic
Barsalou, L. W. (1985). Ideals, central tendency, and processing: An ERP attentional blink study. Journal of
frequency of instantiation as determinants of graded Cognitive Neuroscience, 22, 2514–2529.
structure in categories. Journal of Experimental Battig, W. F., & Montague, W. E. (1969). Category
Psychology: Learning, Memory, and Cognition, 11, norms for verbal items in 56 categories: A replication
629–654. and extension of the Connecticut category norms.
Barsalou, L. W. (2003). Situated simulation in the Journal of Experimental Psychology Monograph, 80,
human conceptual system. Language and Cognitive 1–46.
Processes, 18, 513–562. Bavelier, D., & Potter, M. C. (1992). Visual and
Barsalou, L. W. (2008). Grounded cognition. Annual phonological codes in repetition blindness. Journal
Review of Psychology, 59, 617–645. of Experimental Psychology: Human Perception and
Bartlett, F. C. (1932). Remembering: A study in Performance, 18, 134–147.
experimental and social psychology. Cambridge: Beaton, A. A. (1997). The relation of planum
Cambridge University Press. temporale asymmetry and morphology of the corpus
Batchelder, E. O. (2002). Bootstrapping the lexicon: callosum to handedness, gender, and dyslexia: A
A computational model of infant speech segmentation. review of the evidence. Brain and Language, 60,
Cognition, 83, 167–206. 252–322.
Bates, E., Bretherton, I., & Snyder, L. (1988). From Beattie, G. W. (1980). The role of language
first words to grammar: Individual differences and production processes in the organisation of behaviour
dissociable mechanisms. Cambridge: Cambridge in face-to-face interaction. In B. Butterworth (Ed.),
University Press. Language production: Vol. 1. Speech and talk (pp.
Bates, E., Dick, F., & Wulfeck, B. (1999). Not so 69–107). London: Academic Press.
fast: Domain-general factors can account for selective Beattie, G. W. (1983). Talk: An analysis of speech and
deficits in grammatical processing. Behavioral and non-verbal behaviour in conversation. Milton Keynes,
Brain Sciences, 22, 96–97. UK: Open University Press.
Bates, E., & Goodman, J. C. (1997). On the Beattie, G. W., & Bradbury, R. J. (1979). An
inseparability of grammar and the lexicon: Evidence experimental investigation of the modifiability of the
from acquisition, aphasia and real-time processing. temporal structure of spontaneous speech. Journal of
Language and Cognitive Processes, 12, 507–586. Psycholinguistic Research, 8, 225–247.
Bates, E., & Goodman, J. C. (1999). On the emergence Beattie, G. W., & Butterworth, B. (1979). Contextual
of grammar from the lexicon. In B. MacWhinney (Ed.), probability and word frequency as determinants of
The emergence of language (pp. 29–79). Mahwah, NJ: pauses and errors in spontaneous speech. Language
Lawrence Erlbaum Associates, Inc. and Speech, 22, 201–211.
REFERENCES 499
Beauvois, M.-F. (1982). Optic aphasia: A process Berndt, R. S., & Mitchum, C. C. (1990). Auditory
of interaction between vision and language. and lexical information sources in immediate
Philosophical Transactions of the Royal Society of recall: Evidence from a patient with a deficit to the
London, Series B, 298, 35–47. phonological short-term store. In G. Vallar &
Beauvois, M.-F., & Derouesné, J. (1979). T. Shallice (Eds.), Neuropsychological implications
Phonological alexia: Three dissociations. Journal of short-term memory (pp. 115–144). Cambridge:
of Neurology, Neurosurgery and Psychiatry, 42, Cambridge University Press.
1115–1124. Bertenthal, B. I. (1993). Infants’ perceptions of
Beauvois, M.-F., & Derouesné, J. (1981). Lexical or biomechanical motions: Intrinsic image and knowledge-
orthographic agraphia. Brain, 104, 21–49. based constraints. In C. Granrud (Ed.), Visual
Bechtel, W., & Abrahamsen, A. (2001). perception and cognition in infancy (pp. 175–214).
Connectionism and the mind: Parallel processing, Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
dynamics and evolution in networks. Oxford: Bertram, R., Schreuder, R., & Baayen, R. H.
Blackwell. (2000). The balance of storage and computation in
Becker, C. A. (1976). Allocation of attention during morphological processing: The role of word formation
visual word recognition. Journal of Experimental type, affixal homophony, and productivity. Journal
Psychology: Human Perception and Performance, 2, of Experimental Psychology: Learning, Memory, and
556–566. Cognition, 26, 489–511.
Becker, C. A. (1980). Semantic context effects in Berwick, R. C., Pietroski, P., Yankama, B., &
visual word recognition: An analysis of semantic Chomsky, N. (2011). Poverty of the stimulus
strategies. Memory and Cognition, 8, 439–512. revisited. Cognitive Science, 35, 1207–1242.
Becker, C. A., & Killion, T. H. (1977). Interaction Berwick, R. C., & Weinberg, A. S. (1983a). The role
of visual and cognitive effects in word recognition. of grammars in models of language use. Cognition, 13,
Journal of Experimental Psychology: Human 1–61.
Perception and Performance, 3, 389–407. Berwick, R. C., & Weinberg, A. S. (1983b). Reply to
Begley, S. (2007). Train your mind, change your Garnham. Cognition, 15, 271–276.
brain. New York: Ballantine Books. Besner, D., & Swan, M. (1982). Models of lexical
Behrend, D. A. (1988). Overextensions in early access in visual word recognition. Quarterly Journal
language comprehension: Evidence from a signal of Experimental Psychology, 34A, 313–325.
detection approach. Journal of Child Language, 15, Besner, D., Twilley, L., McCann, R. S., &
63–75. Seergobin, K. (1990). On the connection between
Behrmann, M., & Bub, D. (1992). Surface dyslexia connectionism and data: Are a few words necessary?
and dysgraphia: Dual routes, single lexicon. Cognitive Psychological Review, 97, 432–446.
Neuropsychology, 9, 209–251. Best, B. J. (1973). Classificatory development in
Bellugi, U., Bihrle, A., Jernigan, T., Trauner, D., & deaf children: Research on language and cognitive
Doherty, S. (1991). Neuropsychological, neurological, development. Occasional Paper No. 15, Research,
and neuroanatomical profile of Williams syndrome. Development and Demonstration Center in Education
American Journal of Medical Genetics Supplement, 6, of Handicapped Children, University of Minnesota.
115–125. Bestgen, Y., & Vincze, N. (2012). Checking and
Bencini, G. L., & Goldberg, A. E. (2000). The bootstrapping lexical norms by means of word
contribution of argument structure constructions to similarity indexes. Behavior Research Methods, 44,
sentence meaning. Journal of Memory and Language, 998–1006.
43, 640–651. Bestgen, Y., & Vonk, W. (2000). Temporal adverbials
Benedict, H. (1979). Early lexical development: as segmentation markers in discourse comprehension.
Comprehension and production. Journal of Child Journal of Memory and Language, 42, 74–87.
Language, 6, 183–200. Bever, T. G. (1970). The cognitive basis for linguistic
Ben-Zeev, S. (1977). The influence of bilingualism on structures. In J. R. Hayes (Ed.), Cognition and the
cognitive strategy and cognitive development. Child development of language (pp. 279–362). New York:
Development, 48, 1009–1018. Wiley.
Bereiter, C., & Scardamalia, M. (1987). The Bever, T. G. (1981). Normal acquisition processes
psychology of written composition. Hillsdale, NJ: explain the critical period for language learning. In
Lawrence Erlbaum Associates, Inc. K. C. Diller (Ed.), Individual differences and
Berko, J. (1958). The child’s learning of English universals in language aptitude (pp. 176–198).
morphology. Word, 14, 150–177. Rowley, MA: Newbury House.
Berlin, B., & Kay, P. (1969). Basic color terms: Their Bever, T. G., & McElree, B. (1988). Empty categories
universality and evolution. Berkeley: University of access their antecedents during comprehension.
California Press. Linguistic Inquiry, 19, 35–45.
500 REFERENCES
Bever, T. G., Sanz, M., & Townsend, D. J. (1998). comprehension in children. Hove, UK: Psychology
The emperor’s psycholinguistics. Journal of Press.
Psycholinguistic Research, 27, 261–284. Bishop, D., & Mogford, K. (Eds.). (1993). Language
Bialystock, E. (2001). Metalinguistic aspects of development in exceptional circumstances. Hove, UK:
bilingual processing. Annual Review of Applied Lawrence Erlbaum Associates.
Linguistics, 21, 169–181. Black, J. B., & Wilensky, R. (1979). An evaluation of
Bialystok, E., Craik, F. I. M., & Luk, G. (2012). story grammars. Cognitive Science, 3, 213–229.
Bilingualism: Consequences for mind and brain. Blackwell, A., & Bates, E. (1995). Inducing
Trends in Cognitive Sciences, 16, 240–250. agrammatic profiles in normals: Evidence for
Bialystok, E., & Hakuta, K. (1994). In other words: the selective vulnerability of morphology under
The science and psychology of second-language cognitive resource limitation. Journal of Cognitive
acquisition. New York: Basic Books. Neuroscience, 7, 228–257.
Biassou, N., Obler, L. K., Nespoulous, J.-L., Blanken, G. (1998). Lexicalisation in speech
Dordain, M., & Harris, K. S. (1997). Dual production: Evidence from form-related word
processing of open- and closed-class words. Brain and substitutions in aphasia. Cognitive Neuropsychology,
Language, 57, 360–373. 15, 321–360.
Bickerton, D. (1981). Roots of language. Ann Arbor, Bloem, I., & La Heij, W. (2003). Semantic facilitation
MI: Karoma. and semantic interference in word translation:
Bickerton, D. (1984). The language bioprogram Implications for models of lexical access in language
hypothesis. Behavioral and Brain Sciences, 7, production. Journal of Memory and Language, 48,
173–221. 468–488.
Bickerton, D. (1986). More than nature needs? A Bloem, I., van den Boogaard, S., & La Heij, W.
reply to Premack. Cognition, 23, 73–79. (2004). Semantic facilitation and semantic interference
Bickerton, D. (1990). Language and species. in language production: Further evidence for the
Chicago: University of Chicago Press. conceptual selection model of lexical access. Journal
Bickerton, D. (2003). Symbol and structure: A of Memory and Language, 51, 307–323.
comprehensive framework for language evolution. Bloom, A. H. (1981). The linguistic shaping of
In M. H. Christiansen & S. Kirby (Eds.), Language thought: A study in the impact of thinking in China
evolution (pp. 77–93). Oxford: Oxford University and the West. Hillsdale, NJ: Lawrence Erlbaum
Press. Associates, Inc.
Bierwisch, M. (1970). Semantics. In J. Lyons (Ed.), Bloom, A. H. (1984). Caution—the words you use
New horizons in linguistics (Vol. 1, pp. 166–185). may affect what you say: A response to Au. Cognition,
Harmondsworth, UK: Penguin. 17, 275–287.
Bigelow, A. (1987). Early words of blind children. Bloom, L. (1970). Language development: Form and
Journal of Child Language, 14, 47–56. function in emerging grammars. Cambridge, MA: MIT
Binder, J. R., & Desai, R. H. (2011). The Press.
neurobiology of semantic memory. Trends in Bloom, L. (1973). One word at a time: The use of
Cognitive Sciences, 15, 527–536. single word utterances before syntax. The Hague:
Bird, H., Lambon Ralph, M. A., Seidenberg, M. S., Mouton.
McClelland, J. L., & Patterson, K. E. (2003). Bloom, L. (1998). Language acquisition in its
Deficits in phonology and past-tense morphology: developmental context. In W. Damon, D. Kuhn, &
What’s the connection? Journal of Memory and R. S. Siegler (Eds.), Handbook of child psychology
Language, 48, 502–526. (Vol. 2, 5th ed., pp. 309–370). New York: Wiley.
Birdsong, D., & Molis, M. (2001). On the evidence Bloom, P. (1994). Recent controversies in the study
for maturational constraints in second-language of language acquisition. In M. A. Gernsbacher (Ed.),
acquisition. Journal of Memory and Language, 44, Handbook of psycholinguistics (pp. 741–780). San
235–249. Diego, CA: Academic Press.
Bishop, D. (1983). Linguistic impairment after Bloom, P. (2001a). How children learn the meanings
left hemidecortication for infantile hemiplegia? of words. Cambridge, MA: MIT Press.
A reappraisal. Quarterly Journal of Experimental Bloom, P. (2001b). Précis of How children learn the
Psychology, 35A, 199–207. meanings of words. Behavioral and Brain Sciences,
Bishop, D. (1989). Autism, Asperger’s syndrome 24, 1095–1103.
and semantic-pragmatic disorder: Where are Bloom, P. (2004). Children think before they speak.
the boundaries? British Journal of Disorders of Nature, 430, 411–412.
Communication, 24, 107–121. Blumstein, S. E., Cooper, W. E., Zurif, E. B., &
Bishop, D. (1997). Uncommon understanding: Caramazza, A. (1977). The perception and production of
Development and disorders of language voice-onset time in aphasia. Neuropsychologia, 15, 19–30.
REFERENCES 501
Blumstein, S. E., Katz, B., Goodglass, H., Shrier, R., interaction of syntax and semantics in parsing. Journal
& Dworetzky, B. (1985). The effects of slowed of Psycholinguistic Research, 18, 563–576.
speech on auditory comprehension in aphasia. Brain Boland, J. E., Tanenhaus, M. K., & Garnsey, S. M.
and Language, 24, 246–265. (1990). Evidence for the immediate use of verb
Boas, F. (1911). Introduction to The Handbook of control information in sentence processing. Journal of
North American Indians (Vol. 1). Bureau of American Memory and Language, 29, 413–432.
Ethnology Bulletin, 40 (Part 1). Boland, J. E., Tanenhaus, M. K., Garnsey, S. M.,
Bock, J. K. (1982). Toward a cognitive psychology & Carlson, G. N. (1995). Verb argument structure
of syntax: Information processing contributions to in parsing and interpretation: Evidence from wh-
sentence formulation. Psychological Review, 89, 1–47. questions. Journal of Memory and Language, 34,
Bock, J. K. (1986). Syntactic persistence in language 774–806.
production. Cognitive Psychology, 18, 355–387. Bolinger, D. L. (1965). The atomization of meaning.
Bock, J. K. (1987). An effect of accessibility of word Language, 41, 555–573.
forms on sentence structure. Journal of Memory and Bonin, P., Barry, C., Méot, A., & Chalard, M.
Language, 26, 119–137. (2004). The influence of age of acquisition in word
Bock, J. K. (1989). Closed-class immanence in reading and other tasks: A never ending story? Journal
sentence production. Cognition, 31, 163–186. of Memory and Language, 50, 456–476.
Bock, J. K., & Cutting, J. C. (1992). Regulating Bonin, P., & Fayol, M. (2002). Frequency effects
mental energy: Performance units in language in the written and spoken production of homophonic
production. Journal of Memory and Language, 31, picture names. European Journal of Cognitive
99–127. Psychology, 14, 289–313.
Bock, J. K., & Eberhard, K. M. (1993). Meaning, Boomer, D. S. (1965). Hesitations and grammatical
sound and syntax in English number agreement. encoding. Language and Speech, 8, 148–158.
Language and Cognitive Processes, 8, 57–99. Bornkessel-Schlesewsky, I., Schlesewsky, M., &
Bock, J. K., Eberhard, K. M., & Cutting, J. C. von Cramon, D. Y. (2009). Word order and Broca’s
(2004). Producing number agreement: How pronouns region: Evidence for a supra-syntactic perspective.
equal verbs. Journal of Memory and Language, 51, Brain and Language, 111, 125–139.
251–278. Bornstein, M. H. (1973). Color vision and color
Bock, J. K., & Griffin, Z. M. (2000). The persistence naming: A psychophysiological hypothesis of cultural
of structural priming: Transient activation or implicit difference. Psychological Bulletin, 80, 257–285.
learning. Journal of Experimental Psychology: Bornstein, S. (1985). On the development of colour
General, 129, 177–192. naming in young children: Data and theory. Brain and
Bock, J. K., & Irwin, D. E. (1980). Syntactic effects Language, 26, 72–93.
of information availability in sentence production. Boroditsky, L. (2001). Does language shape thought?
Journal of Verbal Learning and Verbal Behavior, 19, Mandarin and English speakers’ conceptions of time.
467–484. Cognitive Psychology, 43, 1–22.
Bock, J. K., & Loebell, H. (1990). Framing Boroditsky, L. (2003). Linguistic relativity. In
sentences. Cognition, 35, 1–39. L. Nadel (Ed.), Encyclopedia of cognitive science
Bock, J. K., & Miller, C. A. (1991). Broken (Vol. 2, pp. 917–921). London: Nature Publishing
agreement. Cognitive Psychology, 23, 45–93. Group.
Bock, J. K., & Warren, R. K. (1985). Conceptual Borowsky, R., & Masson, M. E. J. (1996). Semantic
accessibility and syntactic structure in sentence ambiguity effects in word identification. Journal of
formulation. Cognition, 21, 47–67. Experimental Psychology: Learning, Memory, and
Bohannon, J. N., MacWhinney, B., & Snow, C. E. Cognition, 22, 63–85.
(1990). No negative evidence revisited: Beyond Borsley, R. D. (1991). Syntactic theory: A unified
learnability or who has to prove what to whom. approach. London: Edward Arnold.
Developmental Psychology, 26, 221–226. Bouckaert, R., Lemey, P., Dunn, M., Greenhill, S. J.,
Bohannon, J. N., & Stanowicz, L. (1988). The issue Alekseyenko, A. V., Drummond, A. J., et al. (2012).
of negative evidence: Adult responses to children’s Mapping the origins and expansion of the Indo-
language errors. Developmental Psychology, 24, European language family. Science, 337, 957–960.
684–689. Bower, G. H., Black, J. B., & Turner, T. J. (1979).
Boland, J. E. (1997). Resolving syntactic category Scripts in memory for text. Cognitive Psychology, 11,
ambiguities in discourse context: Probabilistic 177–220.
and discourse constraints. Journal of Memory and Bowerman, M. (1973). Learning to talk: A cross
Language, 36, 588–615. linguistic study of early syntactic development, with
Boland, J. E., Tanenhaus, M. K., Carlson, G. N., special reference to Finnish. Cambridge: Cambridge
& Garnsey, S. M. (1989). Lexical projection and the University Press.
Bowerman, M. (1978). The acquisition of word Young children’s acquisition of verbs (pp. 352–376).
meanings: An investigation into some current Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
conflicts. In N. Waterson & C. E. Snow (Eds.), Bramwell, B. (1897). Illustrative cases of aphasia.
The development of communication (pp. 263–287). Lancet, 1, 1256–1259. [Reprinted in Cognitive
Chichester, UK: Wiley. Neuropsychology (1984), 1, 249–258.]
Bowerman, M. (1990). Mapping thematic roles onto Branigan, H. P., Pickering, M. J., & Cleland, A. A.
syntactic functions: Are children helped by innate (2000). Syntactic co-ordination in dialogue. Cognition,
linking rules? Linguistics, 28, 1253–1289. 75, B13–B25.
Bowey, J. A. (1996). On the association between Branigan, H. P., Pickering, M. J., Liversedge, S. P.,
phonological memory and receptive vocabulary Stewart, A. J., & Urbach, T. P. (1995). Syntactic
in five-year-olds. Journal of Experimental Child priming: Investigating the mental representation of
Psychology, 63, 44–78. language. Journal of Psycholinguistic Research, 24,
Bowey, J. A. (1997). What does nonword repetition 489–506.
measure? A reply to Gathercole and Baddeley. Journal Bransford, J. D., Barclay, J. R., & Franks, J. J.
of Experimental Child Psychology, 67, 295–301. (1972). Sentence memory: A constructive versus
Bowey, J. A., & Muller, D. (2005). Phonological interpretive approach. Cognitive Psychology, 3,
recoding and rapid orthographic learning in third- 193–209.
graders’ silent reading: A critical test of the self- Bransford, J. D., & Johnson, M. K. (1973).
teaching hypothesis. Journal of Experimental Child Consideration of some problems of comprehension. In
Psychology, 92, 203–219. W. G. Chase (Ed.), Visual information processing (pp.
Bowles, N. L., & Poon, L. W. (1985). Effects of 383–438). New York: Academic Press.
priming in word retrieval. Journal of Experimental Breedin, S. D., & Saffran, E. M. (1999). Sentence
Psychology: Learning, Memory, and Cognition, 11, processing in the face of semantic loss: A case study.
272–283. Journal of Experimental Psychology: General, 128,
Bradley, D. C., & Forster, K. I. (1987). A reader’s 547–562.
view of listening. Cognition, 25, 103–134. Breedin, S. D., Saffran, E. M., & Coslett, H. B.
Bradley, D. C., Garrett, M. F., & Zurif, E. B. (1994). Reversal of the concreteness effect in a patient
(1980). Syntactic deficits in Broca’s aphasia. In D. with semantic dementia. Cognitive Neuropsychology,
Caplan (Ed.), Biological studies of mental processes 11, 617–660.
(pp. 269–286). Cambridge, MA: MIT Press. Breedin, S. D., Saffran, E. M., & Schwartz, M.
Bradley, L., & Bryant, P. (1978). Difficulties in (1998). Semantic factors in verb retrieval: An effect of
auditory organization as a possible cause of reading complexity. Brain and Language, 63, 1–35.
backwardness. Nature, 271, 746–747. Brennan, S. E., & Clark, H. H. (1996). Conceptual
Bradley, L., & Bryant, P. (1983). Categorizing pacts and lexical choice in conversation. Journal of
sounds and learning to read—A causal connection. Experimental Psychology: Learning, Memory, and
Nature, 301, 419–421. Cognition, 22, 1482–1493.
Braine, M. D. S. (1963). The ontogeny of English Brennen, T. (1999). Face naming in dementia: A reply
phrase structure: The first phase. Language, 39, to Hodges and Greene (1998). Quarterly Journal of
1–13. Experimental Psychology, 52A, 535–541.
Braine, M. D. S. (1976). Children’s first word Brennen, T., David, D., Fluchaire, I., &
combinations. Monographs of the Society for Research Pellat, J. (1996). Naming faces and objects
in Child Development, 41 (Serial No. 164). without comprehension: A case study. Cognitive
Braine, M. D. S. (1988a). Review of Language Neuropsychology, 13, 93–110.
learnability and language development by S. Pinker. Brewer, W. F. (1987). Schemas versus mental models in
Journal of Child Language, 15, 189–219. human memory. In P. Morris (Ed.), Modelling cognition
Braine, M. D. S. (1988b). Modeling the acquisition (pp. 187–197). Chichester, UK: J. Wiley & Sons.
of linguistic structure. In Y. Levy, I. M. Schlesinger, Britton, B. K., Muth, K. D., & Glynn, S. M. (1986).
& M. D. S. Braine (Eds.), Categories and processes Effects of text organization on memory: Test of a
in language acquisition (pp. 217–259). Hillsdale, NJ: cognitive effect hypothesis with limited exposure time.
Lawrence Erlbaum Associates, Inc. Discourse Processes, 9, 475–487.
Braine, M. D. S. (1992). What sort of innate structure Broeder, P., & Murre, J. (Eds.). (2000). Models
is needed to “bootstrap” into syntax? Cognition, 45, of language acquisition: Inductive and deductive
77–100. approaches. Oxford: Oxford University Press.
Braine, M. D. S., & Brooks, P. J. (1995). Verb Bronowski, J., & Bellugi, U. (1970). Language,
argument structure and the problem of avoiding an name, and concept. Science, 168, 669–673.
overgeneral grammar. In M. Tomasello & Broom, Y. M., & Doctor, E. A. (1995a).
W. E. Merriman (Eds.), Beyond names for things: Developmental phonological dyslexia: A case study of
the efficacy of a remediation programme. Cognitive Brown, R., & Lenneberg, E. H. (1954). A study in
Neuropsychology, 12, 725–766. language and cognition. Journal of Abnormal and
Broom, Y. M., & Doctor, E. A. (1995b). Social Psychology, 49, 454–462.
Developmental surface dyslexia: A case study of Brown, R., & McNeill, D. (1966). The “tip of the
the efficacy of a remediation programme. Cognitive tongue” phenomenon. Journal of Verbal Learning and
Neuropsychology, 12, 69–110. Verbal Behavior, 5, 325–337.
Brown, A. S. (1991). A review of the tip-of-the-tongue Brownell, H. H., & Gardner, H. (1988).
experience. Psychological Bulletin, 109, 204–223. Neuropsychological insights into humour. In J. Durant
Brown, G. D. A. (1987). Resolving inconsistency: & J. Miller (Eds.), Laughing matters (pp. 17–34).
A computational model of word naming. Journal of Harlow, UK: Longman.
Memory and Language, 26, 1–23. Brownell, H. H., Michel, D., Powelson, J. A., &
Brown, G. D. A., & Deavers, R. P. (1999). Units of Gardner, H. (1983). Surprise but not coherence:
analysis in nonword reading: Evidence from children Sensitivity to verbal humor in right hemisphere
and adults. Journal of Experimental Child Psychology, patients. Brain and Language, 18, 20–27.
73, 208–242. Brownell, H. H., Potter, H. H., Bihrle, A. M., &
Brown, G. D. A., & Ellis, N. C. (1994). Issues in Gardner, H. (1986). Inference deficits in right
spelling research: An overview. In G. D. A. Brown brain-damaged patients. Brain and Language, 27,
& N. C. Ellis (Eds.), Handbook of spelling: Theory, 310–321.
process and intervention (pp. 3–25). London: John Bruce, D. J. (1958). The effects of listeners’
Wiley & Sons. anticipations in the intelligibility of heard speech.
Brown, G. D. A., & Watson, F. L. (1987). First Language and Speech, 1, 79–97.
in, first out: Word learning age and spoken word Bruck, M., Lambert, W. E., & Tucker, G. R. (1976).
frequency as predictors of word familiarity and word Cognitive and attitudinal consequences of bilingual
naming latency. Memory and Cognition, 15, 208–216. schooling: The St. Lambert project through grade six.
Brown, G. D. A., & Watson, F. L. (1994). Spelling- International Journal of Psycholinguistics, 6, 13–33.
to-sound effects in single-word reading. British Bruner, J. S. (1964). The course of cognitive growth.
Journal of Psychology, 85, 181–202. American Psychologist, 19, 1–15.
Brown, P. (1991). DEREK: The direct encoding Bruner, J. S. (1975). From communication to
routine for evolving knowledge. In D. Besner & language—a psychological perspective. Cognition, 3,
G. W. Humphreys (Eds.), Basic processes in reading: 255–287.
Visual word recognition (pp. 104–147). Hillsdale, NJ: Bruner, J. S. (1983). Child’s talk: Learning to use
Lawrence Erlbaum Associates, Inc. language. New York: W. W. Norton.
Brown, P., & Levinson, S. (1987). Politeness: Some Bryant, P. (1998). Sensitivity to onset and rhyme
universals in language usage. Cambridge: Cambridge does predict young children’s reading: A comment on
University Press. Muter, Hulme, Snowling, and Taylor (1997). Journal
Brown, R. (1958). Words and things. New York: Free of Experimental Child Psychology, 71, 29–37.
Press. Bryant, P., & Impey, L. (1986). The similarity
Brown, R. (1970). Psychology and reading: between normal readers and developmental and
Commentary on chapters 5 to 10. In H. Levin & acquired dyslexics. Cognition, 24, 121–137.
J. P. Williams (Eds.), Basic studies on reading Brysbaert, M., & Mitchell, D. C. (1996). Modifier
(pp. 164–187). New York: Basic Books. attachment in sentence parsing: Evidence from Dutch.
Brown, R. (1973). A first language: The early stages. Quarterly Journal of Experimental Psychology, 49A,
London: George Allen & Unwin. 664–695.
Brown, R. (1976). In memorial tribute to Eric Bryson, B. (1990). Mother tongue. Harmondsworth,
Lenneberg. Cognition, 4, 125–154. UK: Penguin Books.
Brown, R., & Bellugi, U. (1964). Three processes Bub, D. (2000). Methodological issues confronting
in the acquisition of syntax. Harvard Educational PET and fMRI studies of cognitive function. Cognitive
Review, 34, 133–151. Neuropsychology, 17, 467–484.
Brown, R., & Fraser, C. (1963). The acquisition of Bub, D., Black, S., Hampson, E., & Kertesz, A.
syntax. In C. Cofer & B. Musgrave (Eds.), Verbal (1988). Semantic encoding of pictures and words:
behavior and learning: Problems and processes (pp. Some neuropsychological observations. Cognitive
158–209). New York: McGraw-Hill. Neuropsychology, 5, 27–66.
Brown, R., & Hanlon, C. (1970). Derivational Bub, D., Black, S., Howell, J., & Kertesz, A. (1987).
complexity and order of acquisition in child Speech output processes and reading. In
speech. In J. R. Hayes (Ed.), Cognition and the M. Coltheart, G. Sartori, & R. Job (Eds.), The
development of language (pp. 11–53). New York: cognitive neuropsychology of language (pp. 79–110).
John Wiley & Sons. Hove, UK: Lawrence Erlbaum Associates.
Bub, D., Cancelliere, A., & Kertesz, A. (1985). the tongue and language production (pp. 73–108).
Whole-word and analytic translation of spelling to Amsterdam: Mouton.
sound in a non-semantic reader. In K. E. Patterson, Butterworth, B. (1985). Jargon aphasia: Processes
J. C. Marshall, & M. Coltheart (Eds.), Surface and strategies. In S. Newman & R. Epstein (Eds.),
dyslexia: Neuropsychological and cognitive studies Current perspectives in dysphasia (pp. 61–96).
of phonological reading (pp. 15–34). Hove, UK: Edinburgh: Churchill Livingstone.
Lawrence Erlbaum Associates. Butterworth, B., & Beattie, G. W. (1978). Gesture
Bub, D., & Kertesz, A. (1982a). Deep agraphia. Brain and silence as indicators of planning in speech. In
and Language, 17, 146–165. R. N. Campbell & P. T. Smith (Eds.), Recent advances
Bub, D., & Kertesz, A. (1982b). Evidence for in the psychology of language: Vol. 4. Formal and
logographic processing in a patient with preserved experimental approaches (pp. 347–360). London:
written over oral single word naming. Brain, 105, Plenum Press.
697–717. Butterworth, B., Campbell, R., & Howard, D.
Buckingham, H. W. (1981). Where do neologisms (1986). The uses of short-term memory: A case study.
come from? In J. W. Brown (Ed.), Jargon-aphasia (pp. Quarterly Journal of Experimental Psychology, 38A,
39–62). New York: Academic Press. 705–737.
Buckingham, H. W. (1986). The scan-copier Butterworth, B., & Howard, D. (1987).
mechanism and the positional level of language Paragrammatisms. Cognition, 26, 1–37.
production: Evidence from phonemic paraphasia. Butterworth, B., Swallow, J., & Grimston, M.
Cognitive Science, 10, 195–217. (1981). Gestures and lexical processes in
Burgess, C. (2000). Theory and operational jargonaphasia. In J. Brown (Ed.), Jargonaphasia (pp.
definitions in computational memory models: A 113–124). New York: Academic Press.
response to Glenberg and Robertson. Journal of Butterworth, B., & Wengang, Y. (1991). The
Memory and Language, 43, 402–408. universality of two routines for reading: Evidence
Burgess, C., & Lund, K. (1997). Representing from Chinese dyslexia. Proceedings of the Royal
abstract words and emotional connotation in high- Society of London, Series B, 245, 91–95.
dimensional memory space. In Proceedings of the Byrne, B. (1998). The foundation of literacy: The
Cognitive Science Society (pp. 61–66). Hillsdale, NJ: child’s acquisition of the alphabetic principle. Hove,
Lawrence Erlbaum Associates, Inc. UK: Psychology Press.
Burke, D., MacKay, D. G., Worthley, J. S., & Wade, E. Cacciari, C., & Glucksberg, S. (1994).
(1991). On the tip of the tongue: What causes word Understanding figurative language. In M. A.
finding failures in young and older adults? Journal of Gernsbacher (Ed.), Handbook of psycholinguistics (pp.
Memory and Language, 30, 237–246. 447–477). San Diego, CA: Academic Press.
Burton, M. W., Baum, S. R., & Blumstein, S. E. Cairns, P., Shillcock, R., Chater, N., & Levy,
(1989). Lexical effects on the phonetic categorization J. (1995). Bottom-up connectionist modelling
of speech: The role of acoustic structure. Journal of of speech. In J. P. Levy, D. Bairaktaris, J. A.
Experimental Psychology: Human Perception and Bullinaria, & P. Cairns (Eds.), Connectionist models
Performance, 15, 567–575. of memory and language (pp. 289–310). London:
Burton-Roberts, N. (1997). Analysing sentences: UCL Press.
An introduction to English syntax (2nd ed.). London: Cairns, P., Shillcock, R., Chater, N., & Levy, J.
Longman. (1997). Bootstrapping word boundaries: A bottom-up
Bus, A. G., & van IJzendoorn, M. H. (1999). corpus-based approach to segmentation. Cognitive
Phonological awareness and early reading: A meta- Psychology, 33, 111–153.
analysis of experimental training studies. Journal of Campbell, R., & Butterworth, B. (1985).
Educational Psychology, 91, 403–414. Phonological dyslexia and dysgraphia: A
Butterworth, B. (1975). Hesitation and semantic developmental case with associated deficits of
planning in speech. Journal of Psycholinguistic phonemic processing and awareness. Quarterly
Research, 4, 75–87. Journal of Experimental Psychology, 37A, 435–475.
Butterworth, B. (1979). Hesitation and the production Cantalupo, C., & Hopkins, W. D. (2001).
of neologisms in jargon aphasia. Brain and Language, Asymmetric Broca’s area in great apes. Nature, 414,
8, 133–161. 505.
Butterworth, B. (1980). Evidence from pauses in Caplan, D. (1972). Clause boundaries and recognition
speech. In B. Butterworth (Ed.), Language production: latencies. Perception and Psychophysics, 12, 73–76.
Vol. 1. Speech and talk (pp. 155–176). London: Caplan, D. (1986). In defense of agrammatism.
Academic Press. Cognition, 24, 263–276.
Butterworth, B. (1982). Speech errors: Old data in Caplan, D. (1992). Language: Structure, processing,
search of new theories. In A. Cutler (Ed.), Slips of and disorders. Cambridge, MA: MIT Press.
Caplan, D., Baker, C., & Dehaut, F. (1985). Caramazza, A., & Hillis, A. E. (1990). Where do
Syntactic determinants of sentence comprehension in semantic errors come from? Cortex, 26, 95–122.
aphasia. Cognition, 21, 117–175. Caramazza, A., & Hillis, A. E. (1991). Lexical
Caplan, D., & Hildebrandt, N. (1988). Disorders of organization of nouns and verbs in the brain. Nature,
syntactic comprehension. Cambridge, MA: Bradford 349, 788–790.
Books. Caramazza, A., Hillis, A. E., Rapp, B. C., &
Caplan, D., & Waters, G. S. (1995a). Aphasic Romani, C. (1990). The multiple semantics
disorders of syntactic comprehension and working hypothesis: Multiple confusions? Cognitive
memory capacity. Cognitive Neuropsychology, 12, Neuropsychology, 7, 161–189.
637–649. Caramazza, A., Miceli, G., & Villa, G. (1986). The
Caplan, D., & Waters, G. S. (1995b). On the nature role of the (output) phonological buffer in reading,
of the phonological output planning processes writing, and repetition. Cognitive Neuropsychology, 3,
involved in verbal rehearsal: Evidence from aphasia. 37–76.
Brain and Language, 48, 191–220. Caramazza, A., & Miozzo, M. (1997). The relation
Caplan, D., & Waters, G. S. (1996). Syntactic between syntactic and phonological knowledge in
processing in sentence comprehension under dual- lexical access: Evidence from the “tip-of-the-tongue”
task conditions in aphasic patients. Language and phenomenon. Cognition, 64, 309–343.
Cognitive Processes, 11, 525–551. Caramazza, A., & Miozzo, M. (1998). More is not
Caplan, D., & Waters, G. S. (1999). Verbal working always better: A response to Roelofs, Meyer, and
memory and sentence comprehension. Behavioral and Levelt. Cognition, 69, 231–241.
Brain Sciences, 22, 77–126. Caramazza, A., Papagno, C., & Ruml, W. (2000).
Caplan, D., & Waters, G. S. (2002). Working The selective impairment of phonological processing
memory and connectionist models of parsing: Reply to in speech production. Brain and Language, 75,
MacDonald and Christiansen. Psychological Review, 428–450.
109, 66–74. Caramazza, A., & Shelton, J. R. (1998). Domain-
Caramazza, A. (1986). On drawing inferences about specific knowledge systems in the brain: The
the structure of normal cognitive systems from the animate–inanimate distinction. Journal of Cognitive
analysis of patterns of impaired performance. Brain Neuroscience, 10, 1–34.
and Cognition, 5, 41–66. Caramazza, A., & Zurif, E. B. (1976). Dissociation
Caramazza, A. (1991). Data, statistics, and theory: of algorithmic and heuristic processes in language
A comment on Bates, McDonald, MacWhinney, and comprehension: Evidence from aphasia. Brain and
Applebaum’s “A maximum likelihood procedure for Language, 3, 572–582.
the analysis of group and individual data in aphasia Carey, S. (1985). Conceptual change in childhood.
research.” Brain and Language, 41, 43–51. Cambridge, MA: MIT Press.
Caramazza, A. (1997). How many levels of Carmichael, L., Hogan, H. P., & Walter, A. A.
processing are there in lexical access? Cognitive (1932). An experimental study of the effect of
Neuropsychology, 14, 177–208. language on the reproduction of visually presented
Caramazza, A., & Berndt, R. S. (1978). Semantic forms. Journal of Experimental Psychology, 15,
and syntactic processes in aphasia: A review of the 73–86.
literature. Psychological Bulletin, 85, 898–918. Carpenter, P. A., & Just, M. A. (1977). Reading
Caramazza, A., Berndt, R. S., & Basili, A. G. comprehension as eyes see it. In M. A. Just &
(1983). The selective impairment of phonological P. A. Carpenter (Eds.), Cognitive processes in
processing: A case study. Brain and Language, 18, comprehension (pp. 109–140). Hillsdale, NJ:
128–174. Lawrence Erlbaum Associates, Inc.
Caramazza, A., Bi, Y., Costa, A., & Miozzo, M. (2004). Carr, T. H., McCauley, C., Sperber, R. D., &
What determines the speed of lexical access: Homophone Parmalee, C. M. (1982). Words, pictures and priming:
or specific-word frequency? A reply to Jescheniak et al. On semantic activation, conscious identification and
(2003). Journal of Experimental Psychology: Learning, the automaticity of information processing. Journal
Memory, and Cognition, 30, 278–282. of Experimental Psychology: Human Perception and
Caramazza, A., Chialant, D., Capasso, R., & Miceli, G. Performance, 8, 757–777.
(2000). Separable processing of consonants and vowels. Carr, T. H., & Pollatsek, A. (1985). Recognizing
Nature, 403, 428–430. printed words: A look at current models. In D. Besner,
Caramazza, A., Costa, A., Miozzo, M., & Bi, Y. T. J. Waller, & C. E. MacKinnon (Eds.), Reading
(2001). The specific-word frequency effect: research: Advances in theory and practice (Vol. 5, pp.
Implications for the representation of homophones. 1–82). New York: Academic Press.
Journal of Experimental Psychology: Learning, Carroll, J. B. (1981). Twenty-five years of research
Memory, and Cognition, 27, 1430–1450. on foreign language aptitude. In K. C. Diller (Ed.),
Individual differences and universals in language Chalmers, A. F. (1999). What is this thing called
learning aptitude (pp. 83–118). Rowley, MA: science? (3rd ed.). Milton Keynes, UK: Open
Newbury House. University Press.
Carroll, J. B., & Casagrande, J. B. (1958). The Chambers Twentieth Century Dictionary. (1998).
function of language classifications in behavior. In Edinburgh: Chambers Harrap.
E. E. Maccoby, T. M. Newcomb, & E. L. Hartley Chambers, C. G., Tanenhaus, M. K., &
(Eds.), Readings in social psychology (3rd ed., pp. Magnuson, J. S. (2004). Actions and affordances
18–31). New York: Holt, Rinehart & Winston. in syntactic ambiguity resolution. Journal of
Carroll, J. B., & White, M. N. (1973a). Word Experimental Psychology: Learning, Memory, and
frequency and age-of-acquisition as determiners Cognition, 30, 687–696.
of picture-naming latency. Quarterly Journal of Chan, A. S., Butters, N., Paulsen, J. S., Salmon, D. P.,
Experimental Psychology, 25, 85–95. Swenson, M. R., & Maloney, L. T. (1993a). An
Carroll, J. B., & White, M. N. (1973b). Age-of- assessment of the semantic network in patients with
acquisition norms for 220 picturable nouns. Journal Alzheimer’s disease. Journal of Cognitive Neuroscience,
of Verbal Learning and Verbal Behavior, 12, 5, 254–261.
563–576. Chan, A. S., Butters, N., Salmon, D. P., &
Carruthers, P. (2002). The cognitive functions McGuire, K. A. (1993b). Dimensionality and clustering
of language. Behavioral and Brain Sciences, 25, in the semantic network of patients with Alzheimer’s
657–726. disease. Psychology and Aging, 8, 411–419.
Carston, R. (1987). Review of Gavagai! or the future Chang, F., Bock, K., & Goldberg, A. E. (2003). Can
history of the animal language controversy, by David thematic roles leave traces of their places? Cognition,
Premack. Mind and Language, 2, 332–349. 90, 29–49.
Carver, R. P. (1972). Speed readers don’t read: They Chang, F., Dell, G. S., & Bock, K. (2006).
skim. Psychology Today, 22–30. Becoming syntactic. Psychological Review, 113,
Casey, B. J., Thomas, K. M., & McCandliss, B. 234–272.
(2001). Applications of magnetic resonance imaging Chang, T. M. (1986). Semantic memory: Facts and
to the study of development. In C. A. Nelson & M. models. Psychological Bulletin, 99, 199–220.
Luciano (Eds.), Handbook of developmental cognitive Chao, L. L., Haxby, J. V., & Martin, A. (1999).
neuroscience (pp. 137–147). Cambridge, MA: MIT Attribute-based neural substrates in temporal cortex
Press. for perceiving and knowing about objects. Nature
Castles, A., & Coltheart, M. (1993). Varieties of Neuroscience, 2, 913–919.
developmental dyslexia. Cognition, 47, 149–180. Chapman, R. S., & Thomson, J. (1980). What is
Castles, A., & Coltheart, M. (2004). Is there a the source of overextension errors in comprehension
causal link from phonological awareness to success in testing of two-year-olds? A reply to Fremgen and Fay.
learning to read? Cognition, 91, 77–111. Journal of Child Language, 7, 575–578.
Castles, A., Datta, H., Gayan, J., & Olson, R. K. Chater, N., & Manning, C. D. (2006). Probabilistic
(1999). Varieties of developmental reading disorder: models of language processing and acquisition. Trends
Genetic and environmental influences. Journal of in Cognitive Sciences, 10, 335–344.
Experimental Child Psychology, 72, 73–94. Chen, H.-C., & Ng, M.-L. (1989). Semantic
Cattell, J. M. (1947). On the time required for facilitation and translation priming effects in
recognizing and naming letters and words, pictures Chinese–English bilinguals. Memory and Cognition,
and colors. In James McKeen Cattell, Man of science 17, 454–462.
(Vol. 1, pp. 13–25). Lancaster, PA: Science Press. Chertkow, H., Bub, D., & Caplan, D. (1992).
[Originally published 1888.] Constraining theories of semantic memory processing:
Caudill, M., & Butler, C. (1992). Understanding Evidence from dementia. Cognitive Neuropsychology,
neural networks: Computer explorations (Vols. 1 & 2). 9, 327–365.
Cambridge, MA: MIT Press. Cholin, J., Levelt, W. J. M., & Schiller, N. O. (2006).
Cazden, C. B. (1968). The acquisition of noun and Effects of syllable frequency in speech production.
verb inflections. Child Development, 39, 433–448. Cognition, 99, 205–235.
Cazden, C. B. (1972). Child language and education. Cholin, J., Schiller, N. O., & Levelt, W. J. M.
New York: Holt, Rinehart & Winston. (2004). The preparation of syllables in speech
Chafe, W. L. (1985). Linguistic differences produced production. Journal of Memory and Language, 50,
by differences between speaking and writing. In 47–61.
D. R. Olson, N. Torrance, & A. Hildyard (Eds.), Chomsky, N. (1957). Syntactic structures. The Hague:
Literacy, language and learning: The nature and Mouton.
consequences of reading and writing (pp. 105–123). Chomsky, N. (1959). Review of “Verbal behavior” by
Cambridge: Cambridge University Press. B. F. Skinner. Language, 35, 26–58.
Chomsky, N. (1965). Aspects of the theory of syntax. Clark, E. V. (1973). What’s in a word? On the child’s
Cambridge, MA: MIT Press. acquisition of semantics in his first language. In
Chomsky, N. (1968). Language and mind. New York: T. E. Moore (Ed.), Cognitive development and the
Harcourt Brace. acquisition of language (pp. 65–110). New York:
Chomsky, N. (1975). Reflections on language. New Academic Press.
York: Pantheon. Clark, E. V. (1987). The principle of contrast: A
Chomsky, N. (1981). Lectures on government and constraint on language acquisition. In B. MacWhinney
binding. Dordrecht: Foris. (Ed.), Mechanisms of language acquisition (pp. 1–33).
Chomsky, N. (1986). Knowledge of language. New Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
York: Praeger Special Studies. Clark, E. V. (1993). The lexicon in acquisition.
Chomsky, N. (1988). Language and problems of Cambridge: Cambridge University Press.
knowledge: The Managua lectures. Cambridge, MA: Clark, E. V. (1995). Later lexical development and
MIT Press. word formation. In P. Fletcher & B. MacWhinney
Chomsky, N. (1991). Linguistics and cognitive (Eds.), The handbook of child language (pp. 393–412).
science: Problems and mysteries. In A. Kasher (Ed.), Oxford: Blackwell.
The Chomskyan turn (pp. 26–53). Oxford: Blackwell. Clark, E. V., & Hecht, B. F. (1983). Comprehension
Chomsky, N. (1995). Bare phrase structure. In G. and production. Annual Review of Psychology, 34,
Webelhuth (Ed.), Government and binding theory and 325–349.
the minimalist programme (pp. 383–400). Oxford: Clark, H. H. (1977a). Bridging. In P. N. Johnson-
Blackwell. Laird & P. C. Wason (Eds.), Thinking: Readings
Christiansen, J. A. (1995). Coherence violations and in cognitive science (pp. 411–420). Cambridge:
propositional usage in the narratives of fluent aphasics. Cambridge University Press.
Brain and Language, 51, 291–317. Clark, H. H. (1977b). Inferences in comprehension.
Christiansen, M. H., Allen, J., & Seidenberg, M. S. In D. LaBerge & S. J. Samuels (Eds.), Basic processes
(1998). Learning to segment speech using multiple in reading: Perception and comprehension (pp. 243–
cues: A connectionist model. Language and Cognitive 263). Hillsdale, NJ: Lawrence Erlbaum Associates,
Processes, 13, 221–268. Inc.
Christiansen, M. H., & Chater, N. (2008). Language Clark, H. H. (1994). Discourse in production. In
as shaped by the brain. Behavioral and Brain Sciences, M. A. Gernsbacher (Ed.), Handbook of psycholinguistics
31, 489–509. (pp. 985–1022). San Diego, CA: Academic Press.
Christiansen, M. H., & Curtin, S. (1999). Transfer Clark, H. H. (1996). Using language. Cambridge:
of learning: Rule acquisition or statistical learning? Cambridge University Press.
Trends in Cognitive Sciences, 3, 289–290. Clark, H. H., & Carlson, T. (1982). Speech
Christiansen, M. H., & Kirby, S. (Eds.). (2003). acts and hearers’ beliefs. In N. V. Smith (Ed.),
Language evolution. Oxford: Oxford University Press. Mutual knowledge (pp. 1–59). London: Academic
Christianson, K., Hollingworth, A., Halliwell, J. F., Press.
& Ferreira, F. (2001). Thematic roles assigned along Clark, H. H., & Clark, E. V. (1977). Psychology and
the garden path linger. Cognitive Psychology, 42, language: An introduction to psycholinguistics. New
368–407. York: Harcourt Brace Jovanovich.
Chumbley, J. I., & Balota, D. A. (1984). A word’s Clark, H. H., & Fox Tree, J. E. (2002). Using uh and
meaning affects the decision in lexical decision. um in spontaneous speaking. Cognition, 84, 73–111.
Memory and Cognition, 12, 590–606. Clark, H. H., & Haviland, S. E. (1977).
Cipolotti, L., & Warrington, E. K. (1995). Semantic Comprehension and the given-new contract. In
memory and reading abilities: A case report. Journal of R. O. Freedle (Ed.), Discourse production and
the International Neuropsychological Society, 1, 104–110. comprehension (pp. 1–40). Norwood, NJ: Ablex.
Cirilo, R. K., & Foss, D. J. (1980). Text structure and Clark, H. H., & Lucy, P. (1975). Understanding what
reading time for sentences. Journal of Verbal Learning is meant from what is said: A study in conversationally
and Verbal Behavior, 19, 96–109. conveyed requests. Journal of Verbal Learning and
Clahsen, H. (1992). Learnability theory and the Verbal Behavior, 14, 56–72.
problem of development in language acquisition. In Clark, H. H., & Wasow, T. (1998). Repeating words in
J. Weissenborn, H. Goodluck, & T. Roeper (Eds.), spontaneous speech. Cognitive Psychology, 37, 201–242.
Theoretical issues in language acquisition (pp. 53–76). Clark, H. H., & Wilkes-Gibbs, D. (1986). Referring
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. as a collaborative process. Cognition, 22, 1–39.
Clahsen, H. (1999). Lexical entries and rules of Clarke, R., & Morton, J. (1983). Cross modality
language: A multidisciplinary study of German facilitation in tachistoscopic word recognition.
inflection. Behavioral and Brain Sciences, 22, Quarterly Journal of Experimental Psychology, 35A,
991–1060. 79–96.
Clarke-Stewart, K., Vanderstoep, L., & Killian, G.
(1979). Analysis and replication of mother–child
relations at 2 years of age. Child Development, 50,
777–793.
Cleland, A. A., & Pickering, M. J. (2003). The
use of lexical and syntactic information in language
production: Evidence from the priming of noun-phrase
structure. Journal of Memory and Language, 49,
214–230.
Clifton, C., & Ferreira, F. (1987). Discourse
structure and anaphora: Some experimental results. In
M. Coltheart (Ed.), Attention and performance XII:
The psychology of reading (pp. 635–654). Hove, UK:
Lawrence Erlbaum Associates.
Clifton, C., & Ferreira, F. (1989). Ambiguity in
context. Language and Cognitive Processes, 4,
77–103.
Cogswell, D., & Gordon, P. (1996). Chomsky for
beginners. London: Readers & Writers Ltd.
Cohen, G. (1979). Language comprehension in old
age. Cognitive Psychology, 11, 412–429.
Cohen, L., & Dehaene, S. (2004). Specialization
within the ventral stream: The case for the visual word
form area. NeuroImage, 22, 466–476.
Colby, K. M. (1975). Artificial paranoia. New York:
Pergamon Press.
Cole, R. A. (1973). Listening for mispronunciations:
A measure of what we hear during speech. Perception
and Psychophysics, 13, 153–156.
Cole, R. A., & Jakimik, J. (1980). A model of speech
perception. In R. A. Cole (Ed.), Perception and
production of fluent speech (pp. 133–163). Hillsdale,
NJ: Lawrence Erlbaum Associates, Inc.
Coleman, L., & Kay, P. (1981). Prototype semantics.
Language, 57, 26–44.
Collins, A., & Gentner, D. (1980). A framework for
a cognitive theory of writing. In L. W. Gregg & E. R.
Sternberg (Eds.), Cognitive processes in writing (pp.
51–72). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Collins, A. M., & Loftus, E. F. (1975). A spreading-
activation theory of semantic processing.
Psychological Review, 82, 407–428.
Collins, A. M., & Quillian, M. R. (1969). Retrieval
time from semantic memory. Journal of Verbal
Learning and Verbal Behavior, 8, 240–247.
Colombo, L. (1986). Activation and inhibition
with orthographically similar words. Journal of
Experimental Psychology: Human Perception and
Performance, 12, 226–234.
Coltheart, M. (1980). Deep dyslexia: A right
hemisphere hypothesis. In M. Coltheart, K. E.
Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp.
326–380). London: Routledge & Kegan Paul. [2nd ed.,
1987.]
Coltheart, M. (1981). Disorders of reading and their
implications for models of normal reading. Visible
Language, 15, 245–286.
Coltheart, M. (1985). Cognitive neuropsychology and
the study of reading. In M. I. Posner & O. S. M. Marin
(Eds.), Attention and performance XI (pp. 3–37).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Coltheart, M. (1987). Varieties of developmental
dyslexia: A comment on Bryant and Impey. Cognition,
27, 97–101.
Coltheart, M. (1996). Phonological dyslexia: Past and
future issues. Cognitive Neuropsychology, 13, 749–762.
Coltheart, M. (2004). Are there lexicons? Quarterly
Journal of Experimental Psychology, 57A, 1153–1171.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M.
(1993). Models of reading aloud: Dual-route
and parallel-distributed-processing approaches.
Psychological Review, 100, 589–608.
Coltheart, M., Davelaar, E., Jonasson, J. T., &
Besner, D. (1977). Access to the internal lexicon. In
S. Dornic (Ed.), Attention and performance VI (pp.
535–555). London: Academic Press.
Coltheart, M., Inglis, L., Cupples, L., Michie, P.,
Bates, A., & Budd, B. (1998). A semantic subsystem
of visual attributes. Neurocase, 4, 353–370.
Coltheart, M., Patterson, K. E., & Marshall, J. C.
(Eds.). (1987). Deep dyslexia (2nd ed.). London:
Routledge & Kegan Paul. [1st ed., 1980.]
Coltheart, M., & Rastle, K. (1994). Serial processing
in reading aloud: Evidence for dual-route models of
reading. Journal of Experimental Psychology: Human
Perception and Performance, 20, 1197–1211.
Coltheart, M., Rastle, K., Perry, C., Langdon, R.,
& Ziegler, J. (2001). DRC: A dual route cascaded
model of visual word recognition and reading aloud.
Psychological Review, 108, 204–256.
Coltheart, V., & Leahy, J. (1992). Children’s and
adults’ reading of nonwords: Effects of regularity and
consistency. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 18, 718–729.
Conboy, B. T., & Mills, D. L. (2006). Two languages,
one developing brain: Event-related potentials to
words in bilingual toddlers. Developmental Science, 9,
F1–F12.
Connine, C. M. (1990). Effects of sentence context
and lexical knowledge in speech processing. In
G. T. M. Altmann (Ed.), Cognitive models of speech
processing (pp. 281–294). Cambridge, MA: MIT Press.
Connine, C. M., & Clifton, C. (1987). Interactive use
of lexical information in speech perception. Journal
of Experimental Psychology: Human Perception and
Performance, 13, 291–319.
Conrad, C. (1972). Cognitive economy in semantic
memory. Journal of Experimental Psychology, 92,
149–154.
Conrad, R. (1979). The deaf school child: Language
and cognitive function. London: Harper & Row.
Conrad, R., & Rush, M. L. (1965). On the nature of
short-term memory encoding by the deaf. Journal of
Speech and Hearing Disorders, 30, 336–343.
Cook, V. (1997). The consequences of bilingualism
for cognitive processing. In A. M. B. de Groot
& J. F. Kroll (Eds.), Tutorials in bilingualism:
Psycholinguistic perspectives (pp. 279–299). Mahwah,
NJ: Lawrence Erlbaum Associates, Inc.
Cook, V. J., & Newson, M. (2007). Chomsky’s
universal grammar: An introduction (3rd ed.). Oxford:
Blackwell.
Corballis, M. C. (1992). On the evolution of language
and generativity. Cognition, 44, 197–226.
Corballis, M. C. (2003). From mouth to hand:
Gesture, speech, and the evolution of right-
handedness. Behavioral and Brain Sciences, 26,
199–260.
Corballis, M. C. (2004). On the origins of modernity:
Was autonomous speech the critical factor?
Psychological Review, 111, 543–552.
Corbett, A. T., & Chang, F. (1983). Pronoun
disambiguation: Accessing potential antecedents.
Memory and Cognition, 11, 383–394.
Corbett, A. T., & Dosher, B. A. (1978). Instrument
inferences in sentence encoding. Journal of Verbal
Learning and Verbal Behavior, 17, 479–492.
Corina, D. P., Jose-Robertson, L., Guillermin, A.,
High, J., & Braun, A. R. (2003). Language
lateralization in a bimanual language. Journal of
Cognitive Neuroscience, 15, 718–730.
Corley, M., Brocklehurst, P. H., & Moat, H. S.
(2011). Error biases in inner and overt speech:
Evidence from tongue twisters. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 37, 162–175.
Corrigan, R. (1978). Language development as
related to stage 6 object permanence development.
Journal of Child Language, 5, 173–189.
Coslett, H. B. (1991). Read but not write “idea”:
Evidence for a third reading mechanism. Brain and
Language, 40, 425–443.
Coslett, H. B., Roeltgen, D. P., Rothi, L. G., &
Heilman, K. M. (1987). Transcortical sensory
aphasia: Evidence for subtypes. Brain and Language,
32, 362–378.
Coslett, H. B., & Saffran, E. M. (1989). Preserved
object recognition and reading comprehension in optic
aphasia. Brain, 112, 1091–1100.
Costa, A., & Caramazza, A. (2002). The production of
noun phrases in English and Spanish: Implications for
the scope of phonological encoding in speech production.
Journal of Memory and Language, 46, 178–198.
Costa, A., Caramazza, A., & Sebastian-Gallés, N.
(2000). The cognate facilitation effect: Implications
for models of lexical access. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 26,
1283–1296.
Costa, A., Kovacic, D., Fedorenko, E., &
Caramazza, A. (2003). The gender-congruency
effect and the selection of freestanding and bound
morphemes: Evidence from Croatian. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 29, 1270–1282.
Costa, A., Miozzo, M., & Caramazza, A. (1999).
Lexical selection in bilinguals: Do words in the
bilingual’s two lexicons compete for selection?
Journal of Memory and Language, 41, 365–397.
Costa, A., & Sebastian-Gallés, N. (1998). Abstract
phonological structure in language production:
Evidence from Spanish. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 24,
886–903.
Cottingham, J. (1984). Rationalism. London: Paladin.
Crain, S., & Steedman, M. J. (1985). On not being
led up the garden path: The use of context by the
psychological parser. In D. Dowty, L. Karttunen, &
A. Zwicky (Eds.), Natural language parsing
(pp. 320–358). Cambridge: Cambridge University Press.
Cree, G. S., McNorgan, C., & McRae, K. (2006).
Distinctive features hold a privileged status in the
computation of word meaning: Implications for
theories of semantic memory. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 32,
643–658.
Crocker, M. W. (1999). Mechanisms for sentence
processing. In S. Garrod & M. J. Pickering (Eds.),
Language processing (pp. 191–232). Hove, UK:
Psychology Press.
Cromer, R. F. (1991). Language and thought in
normal and handicapped children. Oxford: Blackwell.
Croot, K., Patterson, K. E., & Hodges, J. R. (1999).
Familial progressive aphasia: Insights into the nature
and deterioration of single word processing. Cognitive
Neuropsychology, 16, 705–747.
Cross, T. G. (1977). Mothers’ speech adjustments:
The contribution of selected child listener variables.
In C. E. Snow & C. A. Ferguson (Eds.), Talking to
children: Language input and acquisition
(pp. 151–188). Cambridge: Cambridge University Press.
Cross, T. G. (1978). Mother’s speech and its
association with rate of linguistic development in
young children. In N. Waterson & C. E. Snow (Eds.),
The development of communication (pp. 199–216).
Chichester, UK: Wiley.
Cross, T. G., Johnson-Morris, J. E., & Nienhuys, T. G.
(1980). Linguistic feedback and maternal speech:
Comparisons of mothers addressing hearing and
hearing-impaired children. First Language, 1,
163–189.
Crosson, B., Moberg, P. J., Boone, J. R., Rothi, L. J. G.,
& Raymer, A. (1997). Category-specific naming deficit
for medical terms after dominant thalamic/capsular
hemorrhage. Brain and Language, 60, 407–442.
Crystal, D. (1986). Prosodic development. In P.
Fletcher & M. Garman (Eds.), Language acquisition
(2nd ed., pp. 174–197). Cambridge: Cambridge
University Press.
Crystal, D. (1998). Language play. Harmondsworth,
UK: Penguin Books.
Crystal, D. (2010). The Cambridge encyclopedia of
language (3rd ed.). Cambridge: Cambridge University
Press.
Cuetos, F., Aguado, G., & Caramazza, A. (2000).
Dissociation of semantic and phonological errors in
naming. Brain and Language, 75, 451–460.
Cuetos, F., & Mitchell, D. C. (1988). Cross-linguistic
differences in parsing: Restrictions on the use of the
late closure strategy in Spanish. Cognition, 30, 73–105.
Cummins, J. (1991). Interdependence of first- and
second-language proficiency in bilingual children. In
E. Bialystok (Ed.), Language processing in bilingual
children (pp. 70–89). Cambridge: Cambridge
University Press.
Curtin, S., Mintz, T. H., & Christiansen, M. H.
(2005). Stress changes the representational landscape:
Evidence from word segmentation. Cognition, 96,
233–262.
Curtiss, S. (1977). Genie: A psycholinguistic study of
a modern-day “wild child.” London: Academic Press.
Curtiss, S. (1989). The independence and task-
specificity of language. In M. H. Bornstein &
J. Bruner (Eds.), Interaction in human development
(pp. 105–137). Hillsdale, NJ: Lawrence Erlbaum
Associates, Inc.
Cutler, A. (1981). Making up materials is a
confounded nuisance, or: Will we be able to run
any psycholinguistic experiments at all in 1990?
Cognition, 10, 65–70.
Cutler, A., & Butterfield, S. (1992). Rhythmic cues
to speech segmentation: Evidence from juncture
misperception. Journal of Memory and Language, 31,
218–236.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986).
The syllable’s differing role in the segmentation
of French and English. Journal of Memory and
Language, 25, 385–400.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1987).
Phoneme identification and the lexicon. Cognitive
Psychology, 19, 141–177.
Cutler, A., Mehler, J., Norris, D., & Segui, J. (1992).
The monolingual nature of speech segmentation by
bilinguals. Cognitive Psychology, 24, 381–410.
Cutler, A., & Norris, D. (1979). Monitoring sentence
comprehension. In W. E. Cooper & E. C. T. Walker
(Eds.), Sentence processing: Psycholinguistic studies
presented to Merrill Garrett (pp. 113–134). Hillsdale,
NJ: Lawrence Erlbaum Associates, Inc.
Cutler, A., & Norris, D. (1988). The role of strong
syllables in segmentation for lexical access. Journal
of Experimental Psychology: Human Perception and
Performance, 14, 113–121.
Cutsforth, T. D. (1951). The blind in school and
society. New York: American Foundation for the
Blind.
Cutting, J. C., & Ferreira, V. (1999). Semantic
and phonological information flow in the production
lexicon. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 25, 318–344.
Czerniewska, P. (1992). Learning about writing.
Oxford: Blackwell.
D’Andrade, R. G., & Wish, M. (1985). Speech
act theory in quantitative research on interpersonal
behavior. Discourse Processes, 8, 229–259.
Dagenbach, D., Carr, T. H., & Wilhelmsen, A.
(1989). Task-induced strategies and near-threshold
priming: Conscious influences on unconscious
perception. Journal of Memory and Language, 28,
412–443.
Dahan, D., Magnuson, J. S., & Tanenhaus, M. K.
(2001). Time course of frequency effects in spoken-
word recognition: Evidence from eye movements.
Cognitive Psychology, 42, 317–367.
Dahl, H. (1979). Word frequencies of spoken
American English. Essex, CT: Verbatim.
Dale, P. S. (1976). Language development: Structure
and function (2nd ed.). New York: Holt, Rinehart &
Winston.
Daneman, M., & Carpenter, P. A. (1980). Individual
differences in working memory and reading. Journal
of Verbal Learning and Verbal Behavior, 19, 450–466.
Daneman, M., Reingold, E. M., & Davidson, M.
(1995). Time course of phonological-activation during
reading: Evidence from eye fixations. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 21, 884–898.
Davidoff, J., Davies, I., & Roberson, D. (1999a). Colour
categories in a stone-age tribe. Nature, 398, 203–204.
Davidoff, J., Davies, I., & Roberson, D. (1999b).
Addendum: Colour categories in a stone-age tribe.
Nature, 402, 604.
Davies, I., Corbett, G., Laws, G., McGurk, H.,
Moss, A., & Smith, M. W. (1991). Linguistic
basicness and colour information processing.
International Journal of Psychology, 26, 311–327.
Davis, C. J., & Lupker, S. J. (2006). Masked
inhibitory priming in English: Evidence for lexical
inhibition. Journal of Experimental Psychology:
Human Perception and Performance, 32, 668–687.
Davis, K. (1947). Final note on a case of extreme
social isolation. American Journal of Sociology, 52,
432–437.
Dawson, M. (2005). Connectionism: A hands-on
approach. Oxford: Blackwell.
de Boysson-Bardies, B., Halle, P., Sagart, L., &
Durand, C. (1989). A cross-linguistic investigation
of vowel formants in babbling. Journal of Child
Language, 16, 1–17.
de Boysson-Bardies, B., Sagart, L., & Durand, C.
(1984). Discernible differences in the babbling of
infants according to target language. Journal of Child
Language, 11, 1–15.
de Groot, A. M. B. (1983). The range of automatic
spreading activation in word priming. Journal of
Verbal Learning and Verbal Behavior, 22, 417–436.
de Groot, A. M. B. (1984). Primed lexical decision:
Combined effects of the proportion of related
prime–target pairs and the stimulus onset asynchrony of
prime and target. Quarterly Journal of Experimental
Psychology, 36A, 253–280.
de Groot, A. M. B., Dannenburg, L., & van Hell, J. G.
(1994). Forward and backward translation by bilinguals.
Journal of Memory and Language, 33, 600–629.
de Groot, A. M. B., & Kroll, J. F. (Eds.). (1997).
Tutorials in bilingualism: Psycholinguistic
perspectives. Mahwah, NJ: Lawrence Erlbaum
Associates, Inc.
de Renzi, E., & Lucchelli, F. (1994). Are semantic
systems separately represented in the brain? The case
of living category impairment. Cortex, 30, 3–25.
de Villiers, J. G., & de Villiers, P. A. (2000).
Linguistic determination and the understanding of
false beliefs. In P. Mitchell & K. J. Riggs (Eds.),
Children’s reasoning and the mind (pp. 191–228).
Hove, UK: Psychology Press.
de Villiers, P. A., & de Villiers, J. G. (1979). Early
language. London: Fontana/Open Books.
Deacon, T. (1997). The symbolic species.
Harmondsworth, UK: Penguin Books.
Dean, M. P., & Young, A. W. (1996). An item-specific
locus of repetition priming. Quarterly Journal of
Experimental Psychology, 49A, 269–294.
DeCasper, A. J., & Fifer, W. P. (1980). Of human
bonding: Newborns prefer their mothers’ voices.
Science, 208, 1174–1176.
DeCasper, A. J., Lecanuet, J. P., Maugeais, R.,
Granier-Deferre, C., & Busnel, M. C. (1994). Fetal
reactions to recurrent maternal speech. Infant Behavior
and Development, 17, 159–164.
DeCasper, A. J., & Spence, M. J. (1986). Prenatal
maternal speech influences newborns’ perception of
speech sounds. Infant Behavior and Development, 9,
133–150.
Dell, G. S. (1985). Positive feedback in hierarchical
connectionist models: Applications to language
production. Cognitive Science, 9, 3–23.
Dell, G. S. (1986). A spreading-activation theory
of retrieval in sentence production. Psychological
Review, 93, 283–321.
Dell, G. S. (1988). The retrieval of phonological
forms in production: Tests of predictions from
a connectionist model. Journal of Memory and
Language, 27, 124–142.
Dell, G. S., Burger, L. K., & Svec, W. R. (1997).
Language production and serial order: A functional
analysis and a model. Psychological Review, 104,
123–147.
Dell, G. S., Juliano, C., & Govindjee, A. (1993).
Structure and content in language production: A theory
of frame constraints in phonological speech errors.
Cognitive Science, 17, 149–195.
Dell, G. S., Martin, N., & Schwartz, M. F. (2007).
A case-series test of the interactive two-step model
of lexical access: Predicting word repetition from
picture naming. Journal of Memory and Language, 56,
490–520.
Dell, G. S., & O’Seaghdha, P. G. (1991). Mediated
and convergent lexical priming in language
production: A comment on Levelt et al. (1991).
Psychological Review, 98, 604–614.
Dell, G. S., & Reich, P. A. (1981). Stages in sentence
production: An analysis of speech error data. Journal
of Verbal Learning and Verbal Behavior, 20, 611–629.
Dell, G. S., Schwartz, M. F., Martin, N., Saffran, E. M.,
& Gagnon, D. A. (1997). Lexical access in aphasic
and nonaphasic speakers. Psychological Review, 104,
801–838.
Dell, G. S., Schwartz, M. F., Martin, N.,
Saffran, E. M., & Gagnon, D. A. (2000). The
role of computational models in the cognitive
neuropsychology of language: A reply to Ruml and
Caramazza. Psychological Review, 107, 635–645.
DeLong, K. A., Urbach, T. P., & Kutas, M. (2005).
Probabilistic word pre-activation during language
comprehension inferred from electrical brain activity.
Nature Neuroscience, 8, 1117–1121.
Demers, R. A. (1988). Linguistics and animal
communication. In F. J. Newmeyer (Ed.), Linguistics:
The Cambridge survey: Vol. 3. Language:
Psychological and biological aspects (pp. 314–335).
Cambridge: Cambridge University Press.
Demetras, M. J., Post, K. N., & Snow, C. E. (1986).
Feedback to first language learners: The role of
repetitions and clarification questions. Journal of
Child Language, 13, 275–292.
Den Heyer, K. (1985). On the nature of the proportion
effect in semantic priming. Acta Psychologica, 60, 25–38.
Den Heyer, K., Briand, K., & Dannenbring, G. L.
(1983). Strategic factors in a lexical decision task:
Evidence for automatic and attention driven processes.
Memory and Cognition, 10, 358–370.
Dennett, D. C. (1991). Consciousness explained.
Harmondsworth, UK: Penguin.
Dennis, M., & Whitaker, H. A. (1976). Language
acquisition following hemidecortication: Linguistic
superiority of the left over the right hemisphere. Brain
and Language, 3, 404–433.
Dennis, M., & Whitaker, H. A. (1977). Hemispheric
equipotentiality and language acquisition. In
S. J. Segalowitz & F. A. Gruber (Eds.), Language
development and neurological theory (pp. 93–106).
New York: Academic Press.
Derouesné, J., & Beauvois, M.-F. (1979).
Phonological processing in reading: Data from
dyslexia. Journal of Neurology, Neurosurgery and
Psychiatry, 42, 1125–1132.
Derouesné, J., & Beauvois, M.-F. (1985). The
“phonemic” stage in the non-lexical reading process:
Evidence from a case of phonological alexia. In K.
Patterson, M. Coltheart, & J. C. Marshall (Eds.),
Surface dyslexia (pp. 399–457). Hove, UK: Lawrence
Erlbaum Associates.
Devlin, J. T., Gonnerman, L. M., Andersen, E. S., &
Seidenberg, M. S. (1998). Category specific semantic
deficits in focal and widespread brain damage:
A computational account. Journal of Cognitive
Neuroscience, 10, 77–94.
Dhooge, E., & Hartsuiker, R. J. (2012). Lexical
selection and verbal self-monitoring: Effects of
lexicality, context, and time pressure in picture-word
interference. Journal of Memory and Language, 66,
163–176.
Dick, F., Bates, E., Wulfeck, B., Utman, J. A.,
Dronkers, N., & Gernsbacher, M. A. (2001).
Language deficits, localization, and grammar:
Evidence for a distributive model of language
breakdown in aphasic patients and neurologically
intact individuals. Psychological Review, 108,
759–788.
Diesfeldt, H. F. A. (1989). Semantic impairment in
senile dementia of the Alzheimer type. Aphasiology,
3, 41–54.
Dijkstra, A., & van Heuven, W. J. B. (2002).
The architecture of the bilingual word recognition
system: From identification to decision. Bilingualism:
Language and Cognition, 5, 175–197.
Dijkstra, T., van Heuven, W. J. B., & Grainger, J.
(1998). Simulating cross-language competition
with the bilingual interactive activation model.
Psychologica Belgica, 38, 177–196.
Dionne, G., Dale, P. S., Boivin, M., & Plomin, R.
(2003). Genetic evidence for bidirectional effects of
early lexical and grammatical development. Child
Development, 74, 394–412.
Dockrell, J., & Messer, D. J. (1999). Children’s
language and communication difficulties:
Understanding, identification, and intervention.
London: Cassell.
Dogil, G., Haider, H., Schaner-Wolles, C., &
Husman, R. (1995). Radical autonomy of syntax:
Evidence from transcortical sensory aphasia.
Aphasiology, 9, 577–602.
Dooling, D. J., & Christiaansen, R. E. (1977).
Episodic and semantic aspects of memory for prose.
Journal of Experimental Psychology: Human Learning
and Memory, 3, 428–436.
Dooling, D. J., & Lachman, R. (1971). Effects of
comprehension on retention of prose. Journal of
Experimental Psychology, 88, 216–222.
Dopkins, S., Morris, R. K., & Rayner, K. (1992).
Lexical ambiguity and eye fixations in reading: A test
of competing models of lexical ambiguity resolution.
Journal of Memory and Language, 31, 461–476.
Dörnyei, Z. (1990). Conceptualizing motivation in
foreign language learning. Language Learning, 40,
45–78.
Doughty, C. J., & Long, M. H. (Eds.). (2005). The
handbook of second language acquisition. Oxford:
Blackwell.
Downing, P. (1977). On the creation and use of
English compound nouns. Language, 53, 810–842.
Doyle, J. R., & Leach, C. (1988). Word superiority
in signal detection: Barely a glimpse, yet reading
nonetheless. Cognitive Psychology, 20, 283–318.
Dronkers, N. F., Wilkins, D. P., van Valin, R. D.,
Redfern, B. B., & Jaeger, J. J. (2004). Lesion
analysis of the brain areas involved in language
comprehension. Cognition, 95, 145–177.
Druks, J., & Froud, K. (2002). The syntax of
single words: Evidence from a patient with a
selective function word reading deficit. Cognitive
Neuropsychology, 19, 207–244.
Duffy, S. A., Morris, R. K., & Rayner, K. (1988).
Lexical ambiguity and fixation times in reading.
Journal of Memory and Language, 27, 429–446.
Duncan, L. G., Seymour, P. H. K., & Hill, S. (1997).
How important are rhyme and analogy in beginning
reading? Cognition, 63, 171–208.
Duncan, L. G., Seymour, P. H. K., & Hill, S. (2000).
A small-to-large unit progression in metaphonological
awareness and reading? Quarterly Journal of
Experimental Psychology, 53A, 1081–1104.
Duncan, S. E., & Niederehe, G. (1974). On signaling
that it’s your turn to speak. Journal of Experimental
Social Psychology, 10, 234–247.
Duncker, K. (1945). On problem-solving.
Psychological Monographs, 58 (5, Whole No. 270).
Dunlea, A. (1984). The relation between concept
formation and semantic roles: Some evidence from the
blind. In L. Feagans, C. Garvey, & R. M. Golinkoff
(Eds.), The origins and growth of communication (pp.
224–243). Norwood, NJ: Ablex.
Dunlea, A. (1989). Vision and the emergence of
meaning: Blind and sighted children’s early language.
Cambridge: Cambridge University Press.
Duran, N. D., Dale, R., & Kreuz, R. J. (2011).
Listeners invest in an assumed other’s perspective
despite cognitive cost. Cognition, 121, 22–40.
Durkin, K. (1987). Minds and language: Social
cognition, social interaction and the acquisition of
language. Mind and Language, 2, 105–140.
Durso, F. T., & Johnson, M. K. (1979). Facilitation
in naming and categorizing repeated pictures and
words. Journal of Experimental Psychology: Human
Learning and Memory, 5, 449–459.
Duskova, L. (1969). On sources of errors in foreign
language learning. International Review of Applied
Linguistics, 7, 11–36.
Eberhard, K. M. (1999). The accessibility of
conceptual number to the processes of subject–verb
agreement in English. Journal of Memory and
Language, 30, 210–233.
Eberhard, K. M., Cutting, J. C., & Bock, J. K.
(2005). Making syntax of sense: Number agreement
in sentence production. Psychological Review, 112,
531–559.
Eckert, M. A., Lombardino, L. J., & Leonard, C. M.
(2001). Planar asymmetry tips the phonological
playground and environment raises the bar. Child
Development, 72, 988–1002.
Eckman, F. (1977). Markedness and the contrastive
analysis hypothesis. Language Learning, 27, 315–330.
Eglinton, E., & Annett, M. (1994). Handedness and
dyslexia: A meta-analysis. Perceptual and Motor Skills, 79,
1611–1616.
Ehri, L. C. (1992). Reconceptualizing the
development of sight word reading and its relationship
to recoding. In P. Gough, L. Ehri, & R. Treiman (Eds.),
Reading acquisition (pp. 107–143). Hillsdale, NJ:
Lawrence Erlbaum Associates, Inc.
Ehri, L. C. (1997a). Sight word learning in normal
readers and dyslexics. In B. A. Blachman (Ed.),
Foundations of reading acquisition and dyslexia:
Implications for early intervention (pp. 163–189).
Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Ehri, L. C. (1997b). Learning to read and learning to
spell are one and the same, almost. In C. A. Perfetti,
L. Rieben, & M. Fayol (Eds.), Learning to spell:
Research, theory, and practice across languages
(pp. 237–269). Mahwah, NJ: Lawrence Erlbaum
Associates, Inc.
Ehri, L. C., Nunes, S. R., Stahl, S. A., & Willows, D. M.
(2001). Systematic phonics instruction helps students
learn to read: Evidence from the National Reading Panel’s
meta-analysis. Review of Educational Research, 71,
393–447.
Ehri, L. C., & Robbins, C. (1992). Beginners
need some decoding skill to read words by analogy.
Reading Research Quarterly, 27, 13–26.
Ehri, L. C., & Ryan, E. B. (1980). Performance of
bilinguals in a picture–word interference task. Journal
of Psycholinguistic Research, 9, 285–302.
Ehri, L. C., & Wilce, L. S. (1985). Movement into
reading: Is the first stage of printed word learning
visual or phonetic? Reading Research Quarterly, 20,
163–179.
Ehrlich, K., & Johnson-Laird, P. N. (1982). Spatial
descriptions and referential continuity. Journal of
Verbal Learning and Verbal Behavior, 21, 296–306.
Eimas, P. D., & Corbit, L. (1973). Selective
adaptation of linguistic feature detectors. Cognitive
Psychology, 4, 99–109.
Eimas, P. D., & Miller, J. L. (1980). Contextual
effects in infant speech perception. Science, 209,
1140–1141.
Eimas, P. D., Miller, J. L., & Jusczyk, P. W. (1987).
On infant speech perception and the acquisition of
language. In S. Harnad (Ed.), Categorical perception
(pp. 161–195). New York: Cambridge University Press.
Eimas, P. D., Siqueland, E. R., Jusczyk, P. W., &
Vigorito, J. (1971). Speech perception in infants.
Science, 171, 303–306.
Elbers, L. (1985). A tip-of-the-tongue experience at
age two? Journal of Child Language, 12, 353–365.
Elio, R., & Anderson, J. R. (1981). The effects
of category generalizations and instance similarity
on schema abstraction. Journal of Experimental
Psychology: Human Learning and Memory, 7,
397–417.
Ellis, A. W. (1980). On the Freudian theory of speech
errors. In V. A. Fromkin (Ed.), Errors in linguistic
performance (pp. 123–132). New York: Academic
Press.
Ellis, A. W. (1985). The production of spoken words:
A cognitive neuropsychological perspective. In
A. W. Ellis (Ed.), Progress in the psychology of
language (Vol. 2, pp. 107–145). Hove, UK: Lawrence
Erlbaum Associates.
Ellis, A. W. (1993). Reading, writing and dyslexia:
A cognitive analysis (2nd ed.). Hove, UK: Lawrence
Erlbaum Associates.
Ellis, A. W., & Lambon Ralph, M. A. (2000). Age of
acquisition effects in adult lexical processing reflect
loss of plasticity in maturing systems: Insights from
connectionist networks. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 26,
1103–1123.
Ellis, A. W., & Marshall, J. C. (1978). Semantic
errors or statistical flukes: A note on Allport’s
“On knowing the meaning of words we are unable
to report.” Quarterly Journal of Experimental
Psychology, 30, 569–575.
Ellis, A. W., Miller, D., & Sin, G. (1983). Wernicke’s
aphasia and normal language processing: A case study
in cognitive neuropsychology. Cognition, 15, 111–144.
Ellis, A. W., & Morrison, C. M. (1998). Real age-
of-acquisition effects in lexical retrieval. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 24, 515–523.
Ellis, A. W., & Young, A. W. (1988). Human cognitive
neuropsychology. Hove, UK: Lawrence Erlbaum
Associates. [Augmented edition with readings, 1996.]
Ellis, N. C., & Beaton, A. (1993). Factors affecting
the learning of foreign language vocabulary:
Imagery keyword mediators and phonological short-
term memory. Quarterly Journal of Experimental
Psychology, 46A, 533–558.
Ellis, N. C., & Hennelly, R. A. (1980). A bilingual
word-length effect: Implications for intelligence testing
and the relative ease of mental calculations in Welsh
and English. British Journal of Psychology, 71, 43–52.
Ellis, R., & Humphreys, G. W. (1999). Connectionist
psychology: A text with readings. Hove, UK:
Psychology Press.
Ellis, R., & Wells, G. (1980). Enabling factors in
adult–child discourse. First Language, 1, 46–62.
Elman, J. L. (1990). Finding structure in time.
Cognitive Science, 14, 179–211.
Elman, J. L. (1993). Learning and development in
neural networks: The importance of starting small.
Cognition, 48, 71–99.
Elman, J. L. (1999). The emergence of language:
A conspiracy theory. In B. MacWhinney (Ed.), The
emergence of language (pp. 1–27). Mahwah, NJ:
Lawrence Erlbaum Associates, Inc.
Elman, J. L., Bates, E. A., Johnson, M. H.,
Karmiloff-Smith, A., Parisi, D., & Plunkett, K.
(1996). Rethinking innateness: A connectionist
perspective on development. Cambridge, MA:
Bradford Books.
Elman, J. L., & McClelland, J. L. (1988). Cognitive
penetration of the mechanisms of perception:
Compensation for coarticulation of lexically restored
phonemes. Journal of Memory and Language, 27,
143–165.
Elsness, J. (1984). That or zero? A look at the choice
of object relative clause connective in a corpus of
American English. English Studies, 65, 519–533.
Emmorey, K. (2001). Language, cognition, and the
brain: Insights from sign language research. Hillsdale,
NJ: Lawrence Erlbaum Associates, Inc.
Entus, A. K. (1977). Hemispheric asymmetry in
processing of dichotically presented speech sounds.
In S. J. Segalowitz & F. A. Gruber (Eds.), Language
development and neurological theory (pp. 63–73).
New York: Academic Press.
Eriksen, C. W., Pollack, M. D., & Montague, W. E.
(1970). Implicit speech: Mechanisms in perceptual
encoding? Journal of Experimental Psychology, 84,
502–507.
Ervin-Tripp, S. (1979). Children’s verbal turntaking.
In E. Ochs & B. B. Schieffelin (Eds.), Developmental
pragmatics (pp. 391–414). New York: Academic Press.
Estes, Z., & Jones, L. L. (2006). Priming via
relational similarity: A COPPER HORSE is faster
when seen through a GLASS EYE. Journal of Memory
and Language, 55, 89–101.
Evans, N., & Levinson, S. C. (2009). The myth
of language universals: Language diversity and its
importance for cognitive science. Behavioral and
Brain Sciences, 32, 429–448.
Evans, W. E., & Bastian, J. (1969). Marine mammal
communication: Social and ecological factors. In H. T.
Andersen (Ed.), The biology of marine mammals (pp.
425–475). New York: Academic Press.
Everett, C., & Madora, K. (2011). Quantity
recognition among speakers of an anumeric language.
Cognitive Science, 36, 130–141.
Everett, D. L. (2005). Cultural constraints
on grammar and cognition in Pirahã. Current
Anthropology, 46, 621–646.
Eysenck, M. W., & Keane, M. T. (2010). Cognitive
psychology: A student’s handbook (6th ed.). Hove,
UK: Psychology Press.
Fabb, N. (1994). Sentence structure. London:
Routledge & Kegan Paul.
Fabbro, F. (1999). The neurolinguistics of
bilingualism: An introduction. Hove, UK: Psychology
Press.
Fabbro, F. (2001). The bilingual brain: Cerebral
representation of languages. Brain and Language, 79,
211–222.
Facoetti, A., Trussardi, A. N., Ruffino, M., Lorusso,
M. L., Cattaneo, C., Galli, R., et al. (2010).
Multisensory spatial attention deficits are predictive of
phonological decoding skills in developmental dyslexia.
Journal of Cognitive Neuroscience, 22, 1011–1025.
Faigley, L., & Witte, S. (1983). Analysing revision.
College Composition and Communication, 32,
400–414.
Farah, M. J. (1990). Visual agnosia: Disorders of
object recognition and what they tell us about normal
vision. Cambridge, MA: MIT Press.
Farah, M. J. (1991). Patterns of co-occurrence among
the associative agnosias: Implications for visual object
recognition. Cognitive Neuropsychology, 8, 1–19.
Farah, M. J. (1994). Neuropsychological inference
with an interactive brain: A critique of the “locality”
assumption [with commentaries]. Behavioral and
Brain Sciences, 17, 43–104.
Farah, M. J., Hammond, K. M., Mehta, Z.,
& Ratcliff, G. (1989). Category-specificity
and modality-specificity in semantic memory.
Neuropsychologia, 27, 193–200.
Farah, M. J., & McClelland, J. L. (1991). A
computational model of semantic memory impairment:
Modality-specificity and emergent category-
specificity. Journal of Experimental Psychology:
General, 120, 339–357.
Farah, M. J., Stowe, R. M., & Levinson, K. L.
(1996). Phonological dyslexia: Loss of a reading-
specific component of the cognitive architecture?
Cognitive Neuropsychology, 13, 849–868.
Farah, M. J., & Wallace, M. A. (1992).
Semantically-bounded anomia: Implications
for the neural implementation of naming.
Neuropsychologia, 30, 609–621.
Farrar, M. J. (1990). Discourse and the acquisition of
grammatical morphemes. Journal of Child Language,
17, 607–624.
Farrar, M. J. (1992). Negative evidence and
grammatical morpheme acquisition. Developmental
Psychology, 28, 90–98.
Fauconnier, G., & Turner, M. (2003). The way we
think. New York: Basic Books.
Faust, M. E., Balota, D. A., Duchek, J. A.,
Gernsbacher, M. A., & Smith, S. D. (1997).
Inhibitory control during sentence processing in
individuals with dementia of the Alzheimer type.
Brain and Language, 57, 225–253.
Fay, D., & Cutler, A. (1977). Malapropisms and the
structure of the mental lexicon. Linguistic Inquiry, 8,
505–520.
Fedorenko, E., Gibson, E., & Rohde, D. (2006).
The nature of working memory capacity in sentence
comprehension: Evidence against domain-specific
working memory resources. Journal of Memory and
Language, 54, 541–553.
Fedorenko, E., & Kanwisher, N. (2011). Some
regions within Broca’s area do respond more strongly
to sentences than to linguistically degraded stimuli: A
comment on Rogalsky and Hickok (2011). Journal of
Cognitive Neuroscience, 23, 2632–2635.
Feitelson, D., Tehori, B. Z., & Levinberg-Green, D.
(1982). How effective is early instruction in reading?
Experimental evidence. Merrill-Palmer Quarterly, 28,
458–494.
Felix, S. (1992). Language acquisition as a
maturational process. In J. Weissenborn, H. Goodluck,
& T. Roeper (Eds.), Theoretical issues in language
acquisition (pp. 25–51). Hillsdale, NJ: Lawrence
Erlbaum Associates, Inc.
Fenson, L., Dale, P., Reznick, J., Bates, E., Thal, D.,
& Pethick, S. (1994). Variability in early
communicative development. Monographs of the
Society for Research in Child Development, 59 (5,
Serial No. 242).
Fera, P., & Besner, D. (1992). The process of lexical
decision: More words about a parallel distributed
processing model. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 18, 749–764.
Fernald, A. (1991). Prosody and focus in speech to
infants and adults. Annals of Child Development, 8,
43–80.
Fernald, A., Swingley, D., & Pinto, J. P. (2001).
When half a word is enough: Infants can recognize
spoken words using partial phonetic information.
Child Development, 72, 1003–1015.
Fernald, G. M. (1943). Remedial techniques in basic
school subjects. New York: McGraw-Hill.
Fernandes, K. J., Marcus, G. F., Di Nubila, J. A., &
Vouloumanos, A. (2006). From semantics to syntax
and back again: Argument structure in the third year of
life. Cognition, 100, B10–B20.
Ferreira, F. (2003). The misinterpretation of
noncanonical sentences. Cognitive Psychology, 47,
164–203.
Ferreira, F., & Bailey, K. G. D. (2004). Disfluencies
and human language comprehension. Trends in
Cognitive Sciences, 8, 231–237.
Ferreira, F., & Clifton, C. (1986). The independence
of syntactic processing. Journal of Memory and
Language, 25, 348–368.
Ferreira, F., & Henderson, J. M. (1990). Use of verb
information in syntactic parsing: Evidence from eye
movements and word-by-word self-paced reading.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 16, 555–568.
Ferreira, F., & Henderson, J. M. (1991). Recovery
from misanalyses of garden-path sentences. Journal of
Memory and Language, 30, 725–745.
Ferreira, V. S. (1996). Is it better to give than to
donate? Syntactic flexibility in language production.
Journal of Memory and Language, 35, 724–755.
Ferreira, V. S., & Dell, G. S. (2000). Effect of
ambiguity and lexical availability on syntactic
and lexical production. Cognitive Psychology, 40,
296–340.
Ferreira, V. S., Slevc, L. R., & Rogers, E. S.
(2005). How do speakers avoid ambiguous linguistic
expressions? Cognition, 96, 263–284.
Ferreira, V. S., & Swets, B. (2002). How incremental
is language production? Evidence from the production
of utterances requiring the computation of arithmetic
sums. Journal of Memory and Language, 46, 57–84.
Ferreiro, E. (1985). Literacy development: A
psychogenetic perspective. In D. R. Olson, N. Torrance,
& A. Hildyard (Eds.), Literacy, language, and
learning: The nature and consequences of reading
and writing (pp. 217–228). Cambridge: Cambridge
University Press.
Ferreiro, E., & Teberosky, A. (1982). Literacy before
schooling. New York: Heinemann.
Fillmore, C. J. (1968). The case for case. In E. Bach
& R. T. Harms (Eds.), Universals of linguistic theory
(pp. 1–90). New York: Holt, Rinehart & Winston.
Finch, S., & Chater, N. (1992). Bootstrapping
syntactic categories. In Proceedings of the 14th Annual
Conference of the Cognitive Science Society (pp. 820–
825). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Fischler, I. (1977). Semantic facilitation without
association in a lexical decision task. Memory and
Cognition, 5, 335–339.
Fischler, I., & Bloom, P. A. (1979). Automatic and
attentional processes in the effects of sentence contexts
on word recognition. Journal of Verbal Learning and
Verbal Behavior, 18, 1–20.
Fisher, C. (2002). The role of abstract syntactic
knowledge in language acquisition: A reply to
Tomasello (2000). Cognition, 82, 259–278.
Fisher, S. E., & Marcus, G. F. (2006). The eloquent
ape: Genes, brains and the evolution of language.
Nature Reviews Genetics, 7, 9–20.
Fisher, S. E., Marlow, A. J., Lamb, J., Maestrini, E.,
Williams, D. F., Richardson, A. J., et al. (1999). A
quantitative-trait locus on chromosome 6p influences
different aspects of developmental dyslexia. American
Journal of Human Genetics, 64, 146–156.
Fisher, S. E., Vargha-Khadem, F., Watkins, K. E.,
Monaco, A. P., & Pembrey, M. E. (1998).
Localisation of a gene implicated in a severe speech
and language disorder. Nature Genetics, 18, 168–170.
Fitch, W. T., & Hauser, M. D. (2004). Computational
constraints on syntactic processing in a nonhuman
primate. Science, 303, 377–380.
Fitch, W. T., Hauser, M. D., & Chomsky, N. (2005).
The evolution of the language faculty: Clarifications
and implications. Cognition, 97, 179–210.
Flavell, J. H., Miller, P. H., & Miller, S. (1993).
Cognitive development (3rd ed.). Englewood Cliffs,
NJ: Prentice Hall.
Flege, J. E., & Hillenbrand, J. (1984). Limits
on phonetic accuracy in foreign language speech
production. Journal of the Acoustical Society of
America, 76, 708–721.
Fletcher, C. R. (1986). Strategies for the allocation of
short-term memory during comprehension. Journal of
Memory and Language, 25, 43–58.
Fletcher, C. R. (1994). Levels of representation in
memory for discourse. In M. A. Gernsbacher (Ed.),
Handbook of psycholinguistics (pp. 589–608). San
Diego, CA: Academic Press.
Flower, L. S., & Hayes, J. R. (1980). The dynamics
of composing: Making plans and juggling constraints.
In L. W. Gregg & E. R. Steinberg (Eds.), Cognitive
processes in writing (pp. 31–50). Hillsdale, NJ:
Lawrence Erlbaum Associates, Inc.
Fodor, J. A. (1972). Some reflections on L. S.
Vygotsky’s thought and language. Cognition, 1, 83–95.
Fodor, J. A. (1975). The language of thought.
Hassocks, UK: Harvester Press.
Fodor, J. A. (1978). Tom Swift and his procedural
grandmother. Cognition, 6, 229–247.
Fodor, J. A. (1979). In reply to Philip Johnson-Laird.
Cognition, 7, 93–95.
Fodor, J. A. (1981). The present status of the
innateness controversy. In J. A. Fodor, Representations
(pp. 257–316). Brighton, UK: Harvester Press.
Fodor, J. A. (1983). The modularity of mind.
Cambridge, MA: MIT Press.
Fodor, J. A. (1985). Précis and multiple book review
of the Modularity of mind. Behavioral and Brain
Sciences, 8, 1–42.
Fodor, J. A., & Bever, T. G. (1965). The
psychological reality of linguistic segments. Journal
of Verbal Learning and Verbal Behavior, 4, 414–420.
Fodor, J. A., Bever, T. G., & Garrett, M. F. (1974).
The psychology of language. New York: McGraw-Hill.
Fodor, J. A., & Garrett, M. F. (1967). Some syntactic
determinants of sentential complexity. Perception and
Psychophysics, 2, 289–296.
Fodor, J. A., Garrett, M. F., Walker, E. C. T., &
Parkes, C. H. (1980). Against definitions. Cognition,
8, 263–367.
Fodor, J. D., Fodor, J. A., & Garrett, M. F.
(1975). The psychological unreality of semantic
representations. Linguistic Inquiry, 6, 515–531.
Fodor, J. D., & Frazier, L. (1980). Is the human sentence
parsing mechanism an ATN? Cognition, 8, 418–459.
Folk, J. R. (1999). Phonological codes are used to
access the lexicon during silent reading. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 25, 892–906.
Folk, J. R., & Morris, R. K. (1995). Multiple
lexical codes in reading: Evidence from eye
movements, naming time, and oral reading. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 21, 1412–1429.
Ford, M., & Holmes, V. M. (1978). Planning units and
syntax in sentence production. Cognition, 6, 35–53.
Forde, E. M. E., & Humphreys, G. W. (1995).
Refractory semantics in global aphasia: On semantic
organization and the access–storage distinction in
neuropsychology. Memory, 3, 265–308.
Forde, E. M. E., & Humphreys, G. W. (1997). A
semantic locus for refractory behaviour: Implications
for access–storage distinctions and the nature of
semantic memory. Cognitive Neuropsychology, 14,
367–402.
Forster, K. I. (1976). Accessing the mental lexicon. In
R. J. Wales & E. C. T. Walker (Eds.), New approaches
to language mechanisms (pp. 257–287). Amsterdam:
North Holland.
Forster, K. I. (1979). Levels of processing and the
structure of the language processor. In W. E. Cooper
& E. C. T. Walker (Eds.), Sentence processing:
Psycholinguistic studies presented to Merrill Garrett
(pp. 27–85). Hillsdale, NJ: Lawrence Erlbaum
Associates, Inc.
Forster, K. I. (1981). Priming and effects of sentence
and lexical contexts on naming time: Evidence of
autonomous lexical processing. Quarterly Journal of
Experimental Psychology, 33A, 465–495.
Forster, K. I. (1994). Computational modeling
and elementary process analysis in visual word
recognition. Journal of Experimental Psychology:
Human Perception and Performance, 20, 1292–1310.
Forster, K. I. (2000). The potential for experimenter
bias effects in word recognition experiments. Memory
and Cognition, 28, 1109–1115.
Forster, K. I., & Chambers, S. M. (1973). Lexical
access and naming time. Journal of Verbal Learning
and Verbal Behavior, 12, 627–635.
Forster, K. I., & Davis, C. (1984). Repetition priming
and frequency attenuation in lexical access. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 10, 680–698.
Forster, K. I., Davis, C., Schoknecht, C., & Carter, R.
(1987). Masked priming with graphemically related
forms: Repetition or partial activation? Quarterly
Journal of Experimental Psychology, 39, 211–251.
Forster, K. I., & Olbrei, I. (1973). Semantic
heuristics and syntactic analysis. Cognition, 2,
319–347.
Forster, K. I., & Veres, C. (1998). The prime
lexicality effect: Form-priming as a function of prime
awareness, lexical status, and discrimination difficulty.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 24, 498–514.
Foss, D. J. (1970). Some effects of ambiguity upon
sentence comprehension. Journal of Verbal Learning
and Verbal Behavior, 9, 699–706.
Foss, D. J., & Blank, M. A. (1980). Identifying the
speech codes. Cognitive Psychology, 12, 1–31.
Foss, D. J., & Gernsbacher, M. A. (1983). Cracking
the dual code: Toward a unitary model of phoneme
identification. Journal of Verbal Learning and Verbal
Behavior, 22, 609–632.
Foss, D. J., & Swinney, D. A. (1973). On the
psychological reality of the phoneme: Perception,
identification, and consciousness. Journal of Verbal
Learning and Verbal Behavior, 12, 246–257.
Fouts, R. S., Fouts, D. H., & van Cantfort, T. E.
(1989). The infant Loulis learns signs from cross-
fostered chimpanzees. In R. A. Gardner, B. T. Gardner,
& T. E. van Cantfort (Eds.), Teaching sign language
to chimpanzees (pp. 280–292). Albany, NY: SUNY
Press.
Fouts, R. S., Hirsch, A. D., & Fouts, D. H. (1982).
Cultural transmission of a human language in a
chimpanzee mother–infant relationship. In H. E.
Fitzgerald, J. A. Mullins, & P. Gage (Eds.), Child
nurturance (Vol. 3, pp. 159–193). New York: Plenum.
Fouts, R. S., Shapiro, G., & O’Neil, C. (1978). Studies
of linguistic behaviour in apes and children. In P. Siple
(Ed.), Understanding language through sign language
research (pp. 163–185). London: Academic Press.
Fowler, A. E., Gelman, R., & Gleitman, L. R.
(1994). The course of language learning in children
with Down syndrome: Longitudinal and language
level comparisons with young normally developing
children. In H. Tager-Flusberg (Ed.), Constraints on
language acquisition: Studies of atypical children (pp.
91–140). Hillsdale, NJ: Lawrence Erlbaum Associates,
Inc.
Fox Tree, J. E., & Schrock, J. C. (1999). Discourse
markers in spontaneous speech: Oh what a difference
an oh makes. Journal of Memory and Language, 40,
280–295.
Foygel, D., & Dell, G. S. (2000). Models of impaired
lexical access in speech production. Journal of
Memory and Language, 43, 182–216.
Francis, W. N., & Kucera, H. (1982). Frequency
analysis of English usage. Boston, MA: Houghton
Mifflin.
Frank, S. L., & Bod, R. (2011). Insensitivity of the
human sentence-processing system to hierarchical
structure. Psychological Science, 22, 829–834.
Franklin, S., Howard, D., & Patterson, K. E.
(1994). Abstract word deafness. Cognitive
Neuropsychology, 11, 1–34.
Franklin, S., Turner, J., Lambon Ralph, M. A.,
Morris, J., & Bailey, P. J. (1996). A distinctive
case of word meaning deafness? Cognitive
Neuropsychology, 13, 1139–1162.
Frauenfelder, U., Segui, J., & Dijkstra, T. (1990).
Lexical effects in phonemic processing: Facilitatory
or inhibitory? Journal of Experimental Psychology:
Human Perception and Performance, 16, 77–91.
Frauenfelder, U. H., & Tyler, L. K. (1987). The
process of spoken word recognition: An introduction.
Cognition, 25, 1–20.
Frazier, L. (1987a). Sentence processing: A
tutorial review. In M. Coltheart (Ed.), Attention and
performance XII: The psychology of reading (pp. 559–
586). Hove, UK: Lawrence Erlbaum Associates.
Frazier, L. (1987b). Syntactic processing: Evidence
from Dutch. Natural Language and Linguistic Theory,
5, 519–560.
Frazier, L. (1989). Against lexical generation
of syntax. In W. Marslen-Wilson (Ed.), Lexical
representation and process (pp. 505–528). Cambridge,
MA: MIT Press.
Frazier, L. (1995). Constraint satisfaction as a theory
of sentence processing. Journal of Psycholinguistic
Research, 24, 437–468.
Frazier, L., Clifton, C., & Randall, J. (1983). Filling
gaps: Decision principles and structure in sentence
comprehension. Cognition, 13, 187–222.
Frazier, L., & Flores d’Arcais, G. B. (1989). Filler
driven parsing: A study of gap filling in Dutch.
Journal of Memory and Language, 28, 331–344.
Frazier, L., Flores d’Arcais, G. B., & Coolen, R.
(1993). Processing discontinuous words: On the
interface between lexical and syntactic processing.
Cognition, 47, 219–249.
Frazier, L., & Fodor, J. D. (1978). The sausage
machine: A new two-stage parsing model. Cognition,
6, 291–325.
Frazier, L., & Rayner, K. (1982). Making and
correcting errors during sentence comprehension: Eye
movements in the analysis of structurally ambiguous
sentences. Cognitive Psychology, 14, 178–210.
Frazier, L., & Rayner, K. (1987). Resolution of
syntactic category ambiguities: Eye movements in
parsing lexically ambiguous sentences. Journal of
Memory and Language, 26, 505–526.
Frazier, L., & Rayner, K. (1990). Taking on semantic
commitments: Processing multiple meanings versus
multiple senses. Journal of Memory and Language,
29, 181–200.
Freberg, L. A. (2006). Discovering biological
psychology. Boston, MA: Houghton Mifflin.
Frederiksen, J. R., & Kroll, J. F. (1976). Spelling
and sound: Approaches to the internal lexicon. Journal
of Experimental Psychology: Human Perception and
Performance, 2, 361–379.
Frege, G. (1892). Über Sinn und Bedeutung.
Zeitschrift für Philosophie und Philosophische Kritik,
100, 25–50. [Translated in P. T. Geach &
M. Black (Eds.), Philosophical writings of Gottlob
Frege (1952). Oxford: Blackwell.]
Fremgen, A., & Fay, D. (1980). Overextensions in
production and comprehension: A methodological
clarification. Journal of Child Language, 7, 205–211.
Freud, S. (1975). The psychopathology of everyday
life (Trans. A. Tyson). Harmondsworth, UK: Penguin.
[Originally published 1901.]
Freudenthal, D., Pine, J., & Gobet, F. (2005).
Simulating the cross-linguistic development of
optional infinitive errors in MOSAIC. In B. G. Bara,
L. Barsalou, & M. Bucciarelli (Eds.), Proceedings
of the 27th Annual Meeting of the Cognitive Science
Society (pp. 702–707). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Freudenthal, D., Pine, J. M., & Gobet, F. (2006).
Modelling the development of children’s use of
optional infinitives in English and Dutch using
MOSAIC. Cognitive Science, 30, 277–310.
Friederici, A. D. (2002). Towards a neural basis of
auditory sentence processing. Trends in Cognitive
Sciences, 6, 78–84.
Friederici, A. D. (2012). The cortical language circuit:
From auditory perception to sentence comprehension.
Trends in Cognitive Sciences, 16, 262–268.
Friederici, A. D., Bahlmann, J., Heim, S.,
Schubotz, R. I., & Anwander, A. (2006). The brain
differentiates human and non-human grammars:
Functional localization and structural connectivity.
Proceedings of the National Academy of Sciences of
the United States of America, 103, 2458–2463.
Friederici, A. D., & Kilborn, K. (1989). Temporal
constraints on language processing: Syntactic priming
in Broca’s aphasia. Journal of Cognitive Neuroscience,
1, 262–272.
Friedman, N. P., & Miyake, A. (2000). Differential
roles for visuospatial and verbal working memory in
situation model construction. Journal of Experimental
Psychology: General, 129, 61–83.
Friedman, R. B. (1995). Two types of phonological
alexia. Cortex, 31, 397–403.
Frith, U. (1985). Beneath the surface of developmental
dyslexia. In K. E. Patterson, J. C. Marshall, &
M. Coltheart (Eds.), Surface dyslexia (pp. 301–330).
Hove, UK: Lawrence Erlbaum Associates.
Fromkin, V. A. (1971/1973). The non-anomalous
nature of anomalous utterances. Language, 51,
696–719. [Reprinted in V. A. Fromkin (Ed.) (1973),
Speech errors as linguistic evidence (pp. 215–242).
The Hague: Mouton.]
Fromkin, V. A., Krashen, S., Curtiss, S., Rigler, D.,
& Rigler, M. (1974). The development of language
in Genie: A case of language acquisition beyond the
“Critical Period.” Brain and Language, 1, 81–107.
Fromkin, V. A., Rodman, R., & Hyams, N. (2011).
An introduction to language (9th ed.). Boston, MA:
Thomson Heinle.
Frost, R. (1998). Toward a strong phonological theory
of visual word recognition: True issues and false trails.
Psychological Bulletin, 123, 71–99.
Frost, R., Forster, K. I., & Deutsch, A. (1997).
What can we learn from the morphology of Hebrew?
A masked priming investigation of morphological
representation. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 23, 829–856.
Funnell, E. (1983). Phonological processes in reading:
New evidence from acquired dyslexia. British Journal
of Psychology, 74, 159–180.
Funnell, E. (1996). Response biases in oral reading:
An account of the co-occurrence of surface dyslexia
and semantic dementia. Quarterly Journal of
Experimental Psychology, 49A, 417–446.
Funnell, E., & de Mornay Davies, P. (1996). JBR:
A reassessment of concept familiarity and a category-
specific disorder for living things. Neurocase, 2,
461–474.
Funnell, E., & Sheridan, J. (1992). Categories of
knowledge? Unfamiliar aspects of living and non-
living things. Cognitive Neuropsychology, 9, 135–153.
Furth, H. (1966). Thinking without language.
London: Macmillan.
Furth, H. (1971). Linguistic deficiency and thinking:
Research with deaf subjects 1964–69. Psychological
Bulletin, 75, 58–72.
Furth, H. (1973). Deafness and learning: A
psychosocial approach. Belmont, CA: Wadsworth.
Gaffan, D., & Heywood, C. A. (1993). A spurious
category-specific visual agnosia for living things in
normal human and nonhuman primates. Journal of
Cognitive Neuroscience, 5, 118–128.
Gainotti, G., di Betta, A. M., & Silveri, M. C.
(1996). The production of specific and generic
associates of living and nonliving, high- and low-
familiarity stimuli in Alzheimer’s disease. Brain and
Language, 54, 262–274.
Galaburda, A. M., Menard, M. T., & Rosen, G. D.
(1994). Evidence for aberrant auditory anatomy in
developmental dyslexia. Proceedings of the National
Academy of Sciences, 91, 8010–8013.
Galaburda, A. M., Sherman, G. F., Rosen, G. D.,
Aboitiz, F., & Geschwind, N. (1985). Developmental
dyslexia: Four consecutive patients with cortical
anomalies. Annals of Neurology, 18, 222–233.
Galantucci, B., Fowler, C. A., & Turvey, M. T.
(2006). The motor theory of speech perception
reviewed. Psychonomic Bulletin and Review, 13,
361–377.
Gallaway, C., & Richards, B. J. (Eds.). (1994). Input
and interaction in language acquisition. Cambridge:
Cambridge University Press.
Galton, F. (1879). Psychometric experiments. Brain,
2, 149–162.
Ganong, W. F. (1980). Phonetic categorization in
auditory word perception. Journal of Experimental
Psychology: Human Perception and Performance, 6,
110–125.
Garcia, L. J., & Joanette, Y. (1997). Analysis of
conversational topic shifts: A multiple case study.
Brain and Language, 58, 92–114.
Gardner, M. (1990). Science: Good, bad, and bogus.
Loughton, UK: Prometheus Books.
Gardner, R. A., & Gardner, B. T. (1969). Teaching
sign language to a chimpanzee. Science, 165, 664–672.
Gardner, R. A., & Gardner, B. T. (1975). Evidence
for sentence constituents in the early utterances
of child chimpanzee. Journal of Experimental
Psychology: General, 104, 244–267.
Gardner, R. A., van Cantfort, T. E., & Gardner, B. T.
(1992). Categorical replies to categorical questions
by cross-fostered chimpanzees. American Journal of
Psychology, 105, 27–57.
Garnham, A. (1983a). Why psycholinguists don’t
care about DTC: A reply to Berwick and Weinberg.
Cognition, 15, 263–270.
Garnham, A. (1983b). What’s wrong with story
grammars. Cognition, 15, 145–154.
Garnham, A. (1985). Psycholinguistics: Central
topics. London: Methuen.
Garnham, A. (1987). Mental models as representation
of discourse and text. Chichester, UK: Horwood.
Garnham, A., & Oakhill, J. (1992). Discourse
processing and text representation from a “mental
models” perspective. Language and Cognitive
Processes, 7, 193–204.
Garnham, A., Oakhill, J., & Johnson-Laird, P. N.
(1982). Referential continuity and the coherence of
discourse. Cognition, 11, 29–46.
Garnham, A., Shillcock, R. C., Brown, G. D. A.,
Mill, A. I. D., & Cutler, A. (1982). Slips of the tongue in
the London–Lund corpus of spontaneous conversation.
In A. Cutler (Ed.), Slips of the tongue and language
production (pp. 251–263). Amsterdam: Mouton.
Garnica, O. (1977). Some prosodic and paralinguistic
features of speech to young children. In C. E. Snow &
C. A. Ferguson (Eds.), Talking to children: Language
input and acquisition (pp. 63–88). Cambridge:
Cambridge University Press.
Garnsey, S. M., Pearlmutter, N. J., Myers, E., &
Lotocky, M. A. (1997). The contributions of verb bias
and plausibility to the comprehension of temporarily
ambiguous sentences. Journal of Memory and
Language, 37, 58–93.
Garnsey, S. M., Tanenhaus, M. K., & Chapman, R. M.
(1989). Evoked potentials and the study of sentence
comprehension. Journal of Psycholinguistic Research, 18,
51–60.
Garrard, P., & Hodges, J. R. (2000). Semantic
dementia: Clinical, radiological and pathological
perspectives. Journal of Neurology, 247, 409–422.
Garrard, P., Maloney, L. M., Hodges, J. R., &
Patterson, K. E. (2005). The effects of very early
Alzheimer’s disease on the characteristics of writing
by a renowned author. Brain, 128, 250–260.
Garrard, P., Patterson, K., Watson, P. C., &
Hodges, J. R. (1998). Category specific semantic
loss in dementia of Alzheimer’s type: Functional–
anatomical correlations from cross-sectional analyses.
Brain, 121, 633–646.
Garrett, M. F. (1975). The analysis of sentence
production. In G. Bower (Ed.), The psychology of
learning and motivation (Vol. 9, pp. 133–177). New
York: Academic Press.
Garrett, M. F. (1976). Syntactic processes in sentence
production. In R. J. Wales & E. C. T. Walker (Eds.),
New approaches to language mechanisms (pp. 231–
255). Amsterdam: North Holland.
Garrett, M. F. (1980a). Levels of processing in
sentence production. In B. Butterworth (Ed.),
Language production: Vol. 1. Speech and talk (pp.
177–220). London: Academic Press.
Garrett, M. F. (1980b). The limits of accommodation.
In V. Fromkin (Ed.), Errors in linguistic performance:
Slips of the tongue, ear, pen, and hand (pp. 263–271).
New York: Academic Press.
Garrett, M. F. (1982). Production of speech:
Observations from normal and pathological language
use. In A. W. Ellis (Ed.), Normality and pathology in
cognitive functions (pp. 19–76). London: Academic
Press.
Garrett, M. F. (1988). Processes in language
production. In F. J. Newmeyer (Ed.), Linguistics: The
Cambridge survey: Vol. 3. Language: Psychological
and biological aspects (pp. 69–96). Cambridge:
Cambridge University Press.
Garrett, M. F. (1992). Disorders of lexical selection.
Cognition, 42, 143–180.
Garrett, M. F., Bever, T. G., & Fodor, J. A. (1966).
The active use of grammar in speech perception.
Perception and Psychophysics, 1, 30–32.
Garrod, S., & Anderson, A. (1987). Saying what you
mean in dialogue: A study in conceptual and semantic
co-ordination. Cognition, 27, 181–218.
Garrod, S. C., & Sanford, A. J. (1977). Interpreting
anaphoric relations: The integration of semantic
information while reading. Journal of Verbal Learning
and Verbal Behavior, 16, 77–90.
Garrod, S. C., & Terras, M. (2000). The contribution
of lexical and situational knowledge to resolving
discourse roles: Bonding and resolution. Journal of
Memory and Language, 42, 526–544.
Gaskell, G. (Ed.). (2007). Oxford handbook of
psycholinguistics. Oxford: Oxford University Press.
Gaskell, M. G., & Marslen-Wilson, W. D. (1997).
Integrating form and meaning: A distributed model of
speech perception. Language and Cognitive Processes,
12, 613–656.
Gaskell, M. G., & Marslen-Wilson, W. D. (1998).
Mechanisms of phonological inference in speech
perception. Journal of Experimental Psychology:
Human Perception and Performance, 24, 380–396.
Gaskell, M. G., & Marslen-Wilson, W. D. (2002).
Representation and competition in the perception of
spoken words. Cognitive Psychology, 45, 220–266.
Gathercole, S. E., Alloway, T. P., Willis, C., &
Adams, A. (2006). Working memory in children with
reading disabilities. Journal of Experimental Child
Psychology, 93, 265–281.
Gathercole, S. E., & Baddeley, A. D. (1989).
Evaluation of the role of phonological STM in the
development of vocabulary in children: A longitudinal
study. Journal of Memory and Language, 28, 200–213.
Gathercole, S. E., & Baddeley, A. D. (1990).
Phonological memory deficits in language disordered
children: Is there a causal connection? Journal of
Memory and Language, 29, 336–360.
Gathercole, S. E., & Baddeley, A. D. (1993). Working
memory and language. Hove, UK: Lawrence Erlbaum
Associates.
Gathercole, S. E., & Baddeley, A. D. (1997).
Sense and sensitivity in phonological memory and
vocabulary development: A reply to Bowey (1996).
Journal of Experimental Child Psychology, 67,
290–294.
Gathercole, V. C. (1985). “He has too much hard
questions”: The acquisition of the linguistic mass-
count distinction in much and many. Journal of Child
Language, 12, 395–415.
Gathercole, V. C. (1987). The contrastive hypothesis
for the acquisition of word meaning: A reconsideration
of the theory. Journal of Child Language, 14,
493–531.
Gathercole, V. C. (1989). Contrast: A semantic
constraint? Journal of Child Language, 16, 685–702.
Gazdar, G., Klein, E., Pullum, G. K., & Sag, I. A.
(1985). Generalized phrase structure grammar.
Oxford: Blackwell.
Gazzaniga, M. S., Ivry, R. B., & Mangun, G. R.
(2008). Cognitive neuroscience: The biology of the
mind (3rd ed.). New York: Norton.
Gentner, D. (1978). On relational meaning: The
acquisition of verb meaning. Child Development, 49,
988–998.
Gentner, D. (1981). Verb structures in memory for
sentences: Evidence for componential representation.
Cognitive Psychology, 13, 56–83.
Gentner, D. (1982). Why nouns are learned before
verbs: Linguistic relativity vs. natural partitioning.
In S. A. Kuczaj (Ed.), Language development: Vol.
2. Language, thought, and culture (pp. 301–334).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Gergely, G., & Bever, T. G. (1986). Related intuitions
and the mental representation of causative verbs in
adults and children. Cognition, 23, 211–277.
(Ed.), Handbook of psycholinguistics (pp. 781–820). San
Diego, CA: Academic Press.
Gernsbacher, M. A. (1984). Resolving 20 years of
inconsistent interactions between lexical familiarity
and orthography, concreteness, and polysemy. Journal
of Experimental Psychology: General, 113, 256–281.
Gernsbacher, M. A. (1990). Language comprehension
as structure building. Hillsdale, NJ: Lawrence
Erlbaum Associates, Inc.
Gernsbacher, M. A. (1997). Group differences in
suppression skill. Aging, Neuropsychology, and
Cognition, 4, 175–184.
Gernsbacher, M. A., & Hargreaves, D. J. (1988).
Accessing sentence participants: The advantage of
first mention. Journal of Memory and Language, 27,
699–717.
Gernsbacher, M. A., Hargreaves, D. J., &
Beeman, M. (1989). Building and accessing clausal
representations: The advantage of first mention versus
the advantage of clause recency. Journal of Memory
and Language, 28, 735–755.
Gernsbacher, M. A., Varner, K. R., & Faust, M.
(1990). Investigating differences in general
comprehension skill. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 16,
430–445.
Gerrig, R. (1986). Processes and products of lexical
access. Language and Cognitive Processes, 1, 187–196.
Gertner, Y., Fisher, C., & Eisengart, J. (2006).
Learning words and rules: Abstract knowledge of word
order in early sentence comprehension. Psychological
Science, 17, 684–691.
Geschwind, N. (1972). Language and the brain.
Scientific American, 226, 76–83.
Gibbs, R. W. (1980). Spilling the beans on
understanding and memory for idioms in conversation.
Memory and Cognition, 8, 149–156.
Gibbs, R. W. (1986a). On the psycholinguistics
of sarcasm. Journal of Experimental Psychology:
General, 115, 3–15.
Gibbs, R. W. (1986b). What makes some indirect
speech acts conventional? Journal of Memory and
Language, 25, 181–196.
Gibson, E. (1998). Linguistic complexity: Locality of
syntactic dependencies. Cognition, 68, 1–76.
Gibson, E., & Thomas, J. (1999). Memory
limitations and structural forgetting: The perception
of complex ungrammatical sentences as grammatical.
Language and Cognitive Processes, 14, 225–248.
Gibson, J. J. (1979). The ecological approach to
perception. Boston, MA: Houghton Mifflin.
Gilhooly, K. J. (1984). Word age-of-acquisition and
residence time in lexical memory as factors in word
naming. Current Psychological Research, 3, 24–31.
Gillette, J., Gleitman, H., Gleitman, L., & Lederer, A.
Gerken, L. (1994). Child phonology: Past research, (1999). Human simulations of vocabulary learning.
present questions, future direction. In M. A. Gernsbacher Cognition, 73, 135–176.
Glaser, W. R. (1992). Picture naming. Cognition, 42,
61–105.
Gleason, H. A. (1961). An introduction to descriptive
linguistics. New York: Holt, Rinehart & Winston.
Gleason, J. B., Hay, D., & Crain, L. (1989).
The social and affective determinants of language
development. In M. Rice & R. Schiefelbusch
(Eds.), The teachability of language (pp. 171–186).
Baltimore, MD: Paul Brookes.
Gleason, J. B., & Ratner, N. B. (1993). Language
development in children. In J. B. Gleason & N. B.
Ratner (Eds.), Psycholinguistics (pp. 301–350). Fort
Worth, TX: Harcourt Brace Jovanovich.
Gleick, J. (1987). Chaos. London: Sphere Books.
Gleitman, L. R. (1981). Maturational determinants of
language growth. Cognition, 10, 105–113.
Gleitman, L. R. (1990). The structural sources of
word meaning. Language Acquisition, 1, 3–55.
Gleitman, L. R., Cassidy, K., Nappa, R.,
Papafragou, A., & Trueswell, J. C. (2005). Hard
words. Language Learning and Development, 1,
23–64.
Gleitman, L. R., & Papafragou, A. (2005). Language
and thought. In K. J. Holyoak & R. Morrison (Eds.),
The Cambridge handbook of thinking and reasoning.
Cambridge: Cambridge University Press.
Gleitman, L. R., & Wanner, E. (1982). Language
acquisition: The state of the state of the art. In
E. Wanner & L. R. Gleitman (Eds.), Language
acquisition: The state of the art (pp. 3–48).
Cambridge: Cambridge University Press.
Glenberg, A. (2007). Language and action: Creating
sensible combinations of ideas. In M. G. Gaskell (Ed.),
Oxford handbook of psycholinguistics (pp. 362–370).
Oxford: Oxford University Press.
Glenberg, A. M., Meyer, M., & Lindem, K. (1987).
Mental models contribute to foregrounding during text
comprehension. Journal of Memory and Language,
26, 69–83.
Glenberg, A. M., & Robertson, D. A. (2000).
Symbol grounding and meaning: A comparison of
high-dimensional and embodied theories of meaning.
Journal of Memory and Language, 43, 379–401.
Gluck, M. A., & Bower, G. H. (1988). From
conditioning to category learning: An adaptive
network model. Journal of Experimental Psychology:
General, 8, 37–50.
Glucksberg, S. (1991). Beyond literal meanings: The
psychology of allusion. Psychological Science, 2,
146–152.
Glucksberg, S., Gildea, P., & Bookin, H. B. (1982).
On understanding nonliteral speech: Can people ignore
metaphors? Journal of Verbal Learning and Verbal
Behavior, 21, 85–98.
Glucksberg, S., & Keysar, B. (1990). Understanding
metaphorical comparisons: Beyond similarity.
Psychological Review, 97, 3–18.
Glucksberg, S., Kreuz, R. J., & Rho, S. H. (1986).
Context can constrain lexical access: Implications
for models of language comprehension. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 12, 323–335.
Glucksberg, S., & Weisberg, R. W. (1966). Verbal
behavior and problem solving: Some effects of
labeling in a functional fixedness problem. Journal of
Experimental Psychology, 71, 659–664.
Glushko, R. J. (1979). The organization and
activation of orthographic knowledge in reading
aloud. Journal of Experimental Psychology: Human
Perception and Performance, 5, 674–691.
Goffman, E. (1967). Interaction ritual: Essays on face
to face behavior. Garden City, NY: Anchor Books.
Gold, E. M. (1967). Language identification in the
limit. Information and Control, 16, 447–474.
Goldberg, E., & Costa, L. D. (1981). Hemisphere
differences in the acquisition and use of descriptive
systems. Brain and Language, 14, 144–173.
Goldfield, B. A. (1993). Noun bias in maternal speech
to one-year-olds. Journal of Child Language, 20, 85–99.
Goldiamond, I., & Hawkins, W. F. (1958).
Vexierversuch: The logarithmic relationship between
word-frequency and recognition obtained in the
absence of stimulus words. Journal of Experimental
Psychology, 56, 457–463.
Goldin-Meadow, S., Butcher, C., Mylander, C.,
& Dodge, M. (1994). Nouns and verbs in a self-
styled gesture system: What’s in a name? Cognitive
Psychology, 27, 259–319.
Goldin-Meadow, S., Mylander, C., & Butcher, C.
(1995). The resilience of combinatorial structure at the
word level: Morphology in self-styled gesture systems.
Cognition, 56, 195–262.
Goldinger, S. D., Luce, P. A., & Pisoni, D. B. (1989).
Priming lexical neighbours of spoken words: Effects
of competition and inhibition. Journal of Memory and
Language, 28, 501–518.
Goldman-Eisler, F. (1958). Speech production and the
predictability of words in context. Quarterly Journal
of Experimental Psychology, 10, 96–106.
Goldman-Eisler, F. (1968). Psycholinguistics:
Experiments in spontaneous speech. London:
Academic Press.
Golinkoff, R. M., Hirsh-Pasek, K., Bailey, L. M., &
Wenger, N. R. (1992). Young children and adults use
lexical principles to learn new nouns. Developmental
Psychology, 28, 99–108.
Golinkoff, R. M., Mervis, C. B., & Hirsh-Pasek, K.
(1994). Early object labels: The case for lexical
principles. Journal of Child Language, 21, 125–155.
Gollan, T. H., & Acenas, L. R. (2004). What is a
TOT? Cognate and translation effects on tip-of-the-
tongue states in Spanish–English and Tagalog–English
bilinguals. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 30, 246–269.
Gollan, T. H., & Brown, A. S. (2006). From tip-of-
the-tongue (TOT) data to theoretical implications in
two steps: When more TOTs means better retrieval.
Journal of Experimental Psychology: General, 135,
462–483.
Gombert, J. E. (1992). Metalinguistic development
(Trans. T. Pownall, originally published 1990).
London: Harvester Wheatsheaf.
Gomez, R. L., & Gerken, L. (2000). Infant artificial
language learning and language acquisition. Trends in
Cognitive Sciences, 4, 178–186.
Gonnerman, L. M., Andersen, E. S., Devlin, J. T.,
Kempler, D., & Seidenberg, M. S. (1997). Double
dissociation of semantic categories in Alzheimer’s
disease. Brain and Language, 57, 254–279.
Good, D. A., & Butterworth, B. (1980). Hesitancy
as a conversational resource: Some methodological
implications. In H. W. Dechert & M. Raupach (Eds.),
Temporal variables in speech (pp. 145–152). The
Hague: Mouton.
Goodglass, H. (1976). Agrammatism. In H. Whitaker
& H. A. Whitaker (Eds.), Studies in neurolinguistics
(Vol. 1, pp. 237–260). New York: Academic Press.
Goodglass, H., & Geschwind, N. (1976). Language
disorders (aphasia). In E. C. Carterette & M. P.
Friedman (Eds.), Handbook of perception: Vol. VII.
Language and speech (pp. 389–428). New York:
Academic Press.
Goodglass, H., & Menn, L. (1985). Is agrammatism
a unitary phenomenon? In M.-L. Kean (Ed.),
Agrammatism (pp. 1–26). New York: Academic Press.
Goodluck, H. (1991). Language acquisition: A
linguistic introduction. Oxford: Blackwell.
Gopnik, M. (1990a). Dysphasia in an extended family.
Nature, 344, 715.
Gopnik, M. (1990b). Feature blindness: A case study.
Language Acquisition, 1, 139–164.
Gopnik, M. (1992). A model module? Cognitive
Neuropsychology, 9, 253–258.
Gopnik, M., & Crago, M. B. (1991). Familial
aggregation of a developmental language disorder.
Cognition, 29, 1–50.
Gopnik, M., & Meltzoff, A. N. (1997). Words,
thoughts, and theories. Cambridge, MA: MIT Press.
Gordon, B., & Caramazza, A. (1982). Lexical
decision for open and closed-class words: Failure to
replicate differential frequency sensitivity. Brain and
Language, 15, 143–160.
Gordon, P. (1985). Evaluating the semantic categories
hypothesis: The case of the count/mass distinction.
Cognition, 20, 209–242.
Gordon, P. (2004). Numerical cognition without
words: Evidence from Amazonia. Science, 306,
496–499.
Gordon, P. C., Grosz, B. J., & Gilliom, L. A. (1993).
Pronouns, names, and the centering of attention in
discourse. Cognitive Science, 17, 311–347.
Gordon, P. C., Hendrick, R., & Levine, W. H.
(2002). Memory-load interference in syntactic
processing. Psychological Science, 13, 425–430.
Gordon, P. C., & Meyer, D. E. (1984). Perceptual-
motor processing of phonetic features. Journal of
Experimental Psychology: Human Perception and
Performance, 10, 153–178.
Goswami, U. (1986). Children’s use of analogy in
learning to read: A developmental study. Journal of
Experimental Child Psychology, 42, 73–83.
Goswami, U. (1988). Orthographic analogies
and reading development. Quarterly Journal of
Experimental Psychology, 40A, 239–268.
Goswami, U. (1993). Towards an interactive
analogy model of reading development: Decoding
vowel graphemes in beginning reading. Journal of
Experimental Child Psychology, 56, 443–475.
Goswami, U., & Bryant, P. (1990). Phonological
skills and learning to read. Hove, UK: Lawrence
Erlbaum Associates.
Goswami, U., Wang, H., Cruz, A., Fosker, T.,
Mead, N., & Huss, M. (2011). Language-universal
sensory deficits in developmental dyslexia: English,
Spanish, and Chinese. Journal of Cognitive
Neuroscience, 23, 325–337.
Goswami, U., Ziegler, J. C., & Richardson, U.
(2005). The effects of spelling consistency on
phonological awareness: A comparison of English and
German. Journal of Experimental Child Psychology,
92, 345–365.
Gotts, S. J., & Plaut, D. C. (2002). The impact
of synaptic depression following brain damage: A
connectionist account of “access/refractory” and
“degraded-store” semantic impairments. Cognitive,
Affective, and Behavioral Neuroscience, 2, 187–213.
Gough, P. B. (1972). One second of reading. In
J. F. Kavanaugh & I. G. Mattingly (Eds.), Language
by ear and by eye (pp. 331–358). Cambridge, MA:
MIT Press.
Goulandris, A., & Snowling, M. (1991). Visual
memory deficits: A plausible case of developmental
dyslexia? Evidence from a single case study. Cognitive
Neuropsychology, 8, 127–154.
Graesser, A. C., Singer, M., & Trabasso, T.
(1994). Constructing inferences during narrative text
comprehension. Psychological Review, 101, 371–395.
Graf, P., & Torrey, J. W. (1966). Perception of phrase
structure in written language. American Psychological
Association Convention Proceedings, 83–88.
Graham, K. S., Hodges, J. R., & Patterson, K.
(1994). The relationship between comprehension
and oral reading in progressive fluent aphasia.
Neuropsychologia, 32, 299–316.
Grainger, J. (1990). Word frequency and
neighborhood frequency effects in lexical decision
and naming. Journal of Memory and Language, 29,
228–244.
Grainger, J., & Jacobs, A. M. (1996). Orthographic
processing in visual word recognition: A multiple
read-out model. Psychological Review, 103, 518–565.
Grainger, J., Lété, B., Bertrand, D., Dufau, S., &
Ziegler, J. C. (2012). Evidence for multiple routes in
learning to read. Cognition, 123, 280–292.
Grainger, J., O’Regan, K., Jacobs, A. M., &
Segui, J. (1989). On the role of competing word
units in visual word recognition: The neighbourhood
frequency effect. Perception and Psychophysics, 45,
189–195.
Green, D. W. (1986). Control, activation, and
resource: A framework and a model for the control
of speech in bilinguals. Brain and Language, 27,
210–223.
Greenberg, J. H. (1963). Some universals of grammar
with particular reference to the order of meaningful
elements. In J. H. Greenberg (Ed.), Universals of
language (pp. 58–90). Cambridge, MA: MIT Press.
Greene, J. (1972). Psycholinguistics. Harmondsworth,
UK: Penguin.
Greenfield, P. M., & Savage-Rumbaugh, E. S. (1990).
Grammatical combinations in Pan paniscus: Processes of
learning and invention in the evolution and development
of language. In S. T. Parker & K. R. Gibson (Eds.),
“Language” and intelligence in monkeys and apes:
Comparative developmental perspectives (pp. 540–578).
New York: Cambridge University Press.
Greenfield, P. M., & Smith, J. H. (1976). The
structure of communication in early language
development. New York: Academic Press.
Gregory, R. L. (1961). The brain as an engineering
problem. In W. H. Thorpe & O. L. Zangwill (Eds.),
Current problems in animal behaviour (pp. 547–565).
London: Methuen.
Grice, H. P. (1975). Logic and conversation. In
P. Cole & J. Morgan (Eds.), Syntax and semantics: Vol.
3. Speech acts (pp. 41–58). New York: Academic Press.
Griffin, Z. M. (2001). Gaze durations during speech
reflect word selection and phonological encoding.
Cognition, 82, B1–B14.
Griffin, Z. M. (2004). The eyes are right when the
mouth is wrong. Psychological Science, 15, 814–821.
Griffin, Z. M., & Bock, K. (1998). Constraint,
word frequency, and the relationship between lexical
processing levels in spoken word production. Journal
of Memory and Language, 38, 313–338.
Griffin, Z. M., & Bock, K. (2000). What the eyes say
about speaking. Psychological Science, 11, 274–279.
Griffin, Z. M., & Oppenheimer, D. M. (2006).
Speakers gaze at objects while preparing intentionally
inaccurate labels for them. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 32,
943–948.
Grober, E. H., Beardsley, W., & Caramazza, A.
(1978). Parallel function in pronoun assignment.
Cognition, 6, 117–133.
Grodzinsky, Y. (1984). The syntactic characterization
of agrammatism. Cognition, 16, 88–120.
Grodzinsky, Y. (1989). Agrammatic comprehension
of relative clauses. Brain and Language, 37, 480–499.
Grodzinsky, Y. (1990). Theoretical perspectives on
language deficits. Cambridge, MA: MIT Press.
Grodzinsky, Y. (2000). The neurology of syntax:
Language use without Broca’s area. Behavioral and
Brain Sciences, 23, 1–71.
Grodzinsky, Y., & Friederici, A. D. (2006).
Neuroimaging of syntax and syntactic processing.
Current Opinion in Neurobiology, 16, 240–246.
Grosjean, F. (1980). Spoken word recognition
processes and the gating paradigm. Perception and
Psychophysics, 28, 267–283.
Grosjean, F. (1997). Processing mixed languages:
Issues, findings, and models. In A. de Groot & J. Kroll
(Eds.), Tutorials in bilingualism: Psycholinguistic
perspectives (pp. 225–254). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Grosjean, F., & Frauenfelder, U. H. (1996). A guide
to spoken word recognition paradigms: Introduction.
Language and Cognitive Processes, 11, 553–558.
Grosjean, F., & Gee, J. P. (1987). Prosodic structure
and spoken word recognition. Cognition, 25, 135–155.
Grosjean, F., & Soares, C. (1986). Processing mixed
language: Some preliminary findings. In J. Vaid (Ed.),
Linguistics processing in bilinguals: Psycholinguistic
and neuropsychological perspectives (pp. 145–179).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Grossman, M., Mickanin, J., Robinson, K. M.,
& d’Esposito, M. (1996). Anomaly judgements of
subject–predicate relations in Alzheimer’s disease.
Brain and Language, 54, 216–232.
Grosz, B. J., Joshi, A. K., & Weinstein, S. (1995).
Centering: A framework for modeling the local
coherence of discourse. Computational Linguistics, 21,
203–225.
Gumperz, J. J., & Levinson, S. C. (1996). Rethinking
linguistic relativity. Cambridge: Cambridge University
Press.
Gupta, P., & MacWhinney, B. (1997). Vocabulary
acquisition and verbal short-term memory:
Computational and neural bases. Brain and Language,
59, 267–333.
Haarmann, H. J., Just, M. A., & Carpenter, P. A.
(1997). Aphasic sentence comprehension as a
resource deficit: A computational approach. Brain and
Language, 59, 76–120.
Haarmann, H. J., & Kolk, H. H. J. (1991). Syntactic
priming in Broca’s aphasics: Evidence for slow
activation. Aphasiology, 5, 247–263.
Haber, R. N., & Haber, L. R. (1982). Does silent
reading involve articulation? Evidence from tongue-
twisters. American Journal of Psychology, 95, 409–419.
Hadzibeganovic, T., van den Noort, M., Bosch, P.,
Perc, M., van Kralingen, R., Mondt, K., &
Coltheart, M. (2010). Cross-linguistic neuroimaging
and dyslexia: A critical view. Cortex, 46, 1312–1316.
Hagoort, P. (2008). Should psychology ignore
the language of the brain? Current Directions in
Psychological Science, 17, 96–101.
Hagoort, P., Brown, C. M., & Groothusen, J.
(1993). The syntactic positive shift as an ERP-measure
of syntactic processing. Language and Cognitive
Processes, 8, 439–483.
Hakuta, K. (1986). Mirror of language. New York:
Basic Books.
Hakuta, K., & Diaz, R. (1985). The relationship between
degree of bilingualism and cognitive ability: A critical
discussion and some new longitudinal data. In K. E.
Nelson (Ed.), Children’s language (Vol. 5, pp. 319–344).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Haldane, J. B. S. (1927). A mathematical theory of
natural and artificial selection, Part V: Selection and
mutation. Proceedings of the Cambridge Philosophical
Society, 23, 838–844.
Hale, J. T. (2010). What a rational parser would do.
Cognitive Science, 35, 399–443.
Hale, S. (2002). The man who lost his language.
Harmondsworth, UK: Penguin Books.
Hall, D. G. (1993). Basic-level individuals. Cognition,
48, 199–221.
Hall, D. G. (1994). How mothers teach basic-level
and situation-restricted count nouns. Journal of Child
Language, 21, 391–414.
Hall, D. G., & Waxman, S. R. (1993). Assumptions
about word meaning: Individuation and basic-level
kinds. Child Development, 64, 1550–1570.
Halle, M., & Stevens, K. N. (1962). Speech
recognition: A model and a program for research. IRE
Transactions of the Professional Group on Information
Theory, 8, 155–159.
Hampton, J. A. (1979). Polymorphous concepts in
semantic memory. Journal of Verbal Learning and
Verbal Behavior, 18, 441–461.
Hampton, J. A. (1981). An investigation of the
nature of abstract concepts. Memory and Cognition, 9,
149–156.
Hanley, J. R., Hastie, K., & Kay, J. (1992).
Developmental surface dyslexia and dysgraphia:
An orthographic processing impairment. Quarterly
Journal of Experimental Psychology, 44A, 285–320.
Hanley, J. R., & McDonnell, V. (1997). Are reading
and spelling phonologically mediated? Evidence
from a patient with a speech production impairment.
Cognitive Neuropsychology, 14, 3–33.
Hanten, G., & Martin, R. C. (2000). Contributions
of phonological and semantic short-term memory
to sentence processing: Evidence from two cases of
closed head injury in children. Journal of Memory and
Language, 43, 335–361.
Harley, B., & Wang, W. (1997). The critical period
hypothesis: Where are we now? In A. M. B. de
Groot & J. F. Kroll (Eds.), Tutorials in bilingualism:
Psycholinguistic perspectives (pp. 19–51). Mahwah,
NJ: Lawrence Erlbaum Associates, Inc.
Harley, T. A. (1984). A critique of top-down
independent levels models of speech production:
Evidence from non-plan-internal speech production.
Cognitive Science, 8, 191–219.
Harley, T. A. (1990). Paragrammatisms: Syntactic
disturbance or failure of control? Cognition, 34,
85–91.
Harley, T. A. (1993a). Phonological activation of
semantic competitors during lexical access in speech
production. Language and Cognitive Processes, 8,
291–309.
Harley, T. A. (1993b). Connectionist approaches to
language disorders. Aphasiology, 7, 221–249.
Harley, T. A. (1998). The semantic deficit in
dementia: Connectionist approaches to what goes
wrong in picture naming. Aphasiology, 12, 299–318.
Harley, T. A. (2004a). Does cognitive neuropsychology
have a future? Cognitive Neuropsychology, 21 (Special
Issue; Lead article), 3–16.
Harley, T. A. (2004b). Promises, promises. Cognitive
Neuropsychology, 21 (Special Issue; Reply to
commentators), 51–56.
Harley, T. A. (2010). Talking the talk: Language,
psychology and science. Hove, UK: Psychology Press.
Harley, T. A., & Bown, H. E. (1998). What causes
a tip-of-the-tongue state? Evidence for lexical
neighbourhood effects in speech production. British
Journal of Psychology, 89, 151–174.
Harley, T. A., & MacAndrew, S. B. G. (1992).
Modelling paraphasias in normal and aphasic speech.
In Proceedings of the 14th Annual Conference of the
Cognitive Science Society (pp. 378–383). Hillsdale,
NJ: Lawrence Erlbaum Associates, Inc.
Harm, M. W., & Seidenberg, M. S. (1999).
Phonology, reading acquisition, and dyslexia: Insights
from connectionist models. Psychological Review,
106, 491–528.
Harm, M. W., & Seidenberg, M. S. (2001). Are there
orthographic impairments in phonological dyslexia?
Cognitive Neuropsychology, 18, 71–92.
Harm, M. W., & Seidenberg, M. S. (2004).
Computing the meanings of words in reading:
Cooperative division of labor between visual
and phonological processes. Psychological Review,
111, 662–720.
Harris, M. (1978). Noun animacy and the passive
voice: A developmental approach. Quarterly Journal
of Experimental Psychology, 30, 495–504.
Harris, M., & Coltheart, M. (1986). Language
processing in children and adults. London: Routledge
& Kegan Paul.
Harris, M., & Hatano, G. (Eds.). (1999). Learning
to read and write: A cross-linguistic perspective.
Cambridge: Cambridge University Press.
Harris, P. L. (1982). Cognitive prerequisites to
language? British Journal of Psychology, 73, 187–195.
Harris, R. J. (1977). Comprehension of pragmatic
implications in advertising. Journal of Applied
Psychology, 63, 603–608.
Harris, R. J. (1978). The effect of jury size and
judge’s instructions on memory for pragmatic
implications from courtroom testimony. Bulletin of the
Psychonomic Society, 11, 129–132.
Harris, Z. S. (1951). Methods in structural linguistics.
Chicago: University of Chicago Press.
Harste, J., Burke, C., & Woodward, V. (1982).
Children’s language and world: Initial encounters with
print. In J. Langer & M. Smith-Burke (Eds.), Bridging
the gap: Reader meets author (pp. 105–131). Newark,
DE: International Reading Association.
Hart, J., Berndt, R. S., & Caramazza, A. (1985).
Category-specific naming deficit following cerebral
infarction. Nature, 316, 439–440.
Hart, J., & Gordon, B. (1992). Neural subsystems for
object knowledge. Nature, 359, 60–64.
Hartley, T., & Houghton, G. (1996). A linguistically
constrained model of short-term memory for nonwords.
Journal of Memory and Language, 35, 1–31.
Hartsuiker, R. J., Anton-Méndez, I., Roelstraete, B.,
& Costa, A. (2006). Spoonish Spanerisms: A lexical
bias effect in Spanish. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 32,
949–953.
Hartsuiker, R. J., & Kolk, H. H. J. (1998). Syntactic
facilitation in agrammatic sentence production. Brain
and Language, 62, 221–254.
Hartsuiker, R. J., Kolk, H. H. J., & Huiskamp, P.
(1999). Priming word order in sentence production.
Quarterly Journal of Experimental Psychology, 52A,
129–147.
Hartsuiker, R. J., Pickering, M. J., & Veltkamp, E.
(2004). Is syntax separate or shared between
languages? Cross-linguistic syntactic priming in
Spanish/English bilinguals. Psychological Science, 15,
409–414.
Hartsuiker, R. J., & Westenberg, C. (2000).
Word order priming in written and spoken sentence
production. Cognition, 75, B27–B39.
Haskell, T. R., & MacDonald, M. C. (2003).
Conflicting cues and competition in subject–verb
agreement. Journal of Memory and Language, 48,
760–778.
Haskell, T. R., MacDonald, M. C., & Seidenberg, M. S.
(2003). Language learning and innateness: Some
implications of compounds research. Cognitive
Psychology, 47, 119–163.
Hatcher, P. J., Hulme, C., & Ellis, A. W. (1994).
Ameliorating early reading failure by integrating
the teaching of reading and phonological skills: The
phonological linkage hypothesis. Child Development,
65, 41–57.
Hauk, O., Johnsrude, I., & Pulvermüller, F. (2004).
Somatotopic representation of action words in human
motor and premotor cortex. Neuron, 41, 301–307.
Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002).
The faculty of language: What is it, who has it, and
how did it evolve? Science, 298, 1569–1579.
Hauser, M. D., Newport, E. L., & Aslin, R. N.
(2001). Segmentation of the speech stream in a non-
human primate: Statistical learning in cotton-top
tamarins. Cognition, 78, B53–B64.
Haviland, S. E., & Clark, H. H. (1974). What’s
new? Acquiring new information as a process of
comprehension. Journal of Verbal Learning and
Verbal Behavior, 13, 515–521.
Hawkins, J. A. (1990). A parsing theory of word order
universals. Linguistic Inquiry, 21, 223–261.
Hayes, C. (1951). The ape in our house. New York:
Harper.
Hayes, J. R., & Flower, L. S. (1980). Identifying
the organisation of writing processes. In L. W. Gregg
& E. R. Sternberg (Eds.), Cognitive processes in
writing (pp. 3–30). Hillsdale, NJ: Lawrence Erlbaum
Associates, Inc.
Hayes, J. R., & Flower, L. S. (1986). Writing
research and the writer. American Psychologist, 41,
1106–1113.
Hayes, K. J., & Nissen, C. H. (1971). Higher mental
functions of a home-raised chimpanzee. In
A. M. Schrier & F. Stollnitz (Eds.), Behaviour of
nonhuman primates (Vol. 4, pp. 60–115). New York:
Academic Press.
Haywood, S. L., Pickering, M. J., & Branigan,
H. P. (2005). Do speakers avoid ambiguities during
dialogue? Psychological Science, 16, 362–366.
Healy, A., & Miller, G. (1970). The verb as the
main determinant of sentence meaning. Psychonomic
Science, 20, 372.
Heath, S. B. (1983). Ways with words. Cambridge:
Cambridge University Press.
Hebb, D. O. (1949). The organization of behavior.
New York: Wiley.
Heider, E. R. (1971). “Focal” color areas and
the development of color names. Developmental
Psychology, 4, 447–455.
Heider, E. R. (1972). Universals in colour naming
and memory. Journal of Experimental Psychology, 93,
10–20.
Heit, E., & Barsalou, L. W. (1996). The instantiation
principle in natural categories. Memory, 4, 413–451.
Hemforth, B., & Konieczny, L. (1999). German
sentence processing. Dordrecht: Kluwer Academic
Publishers.
Henderson, A., Goldman-Eisler, F., & Skarbek, A.
(1966). Sequential temporal patterns in speech.
Language and Speech, 8, 236–242.
Henderson, J. M., & Ferreira, F. (Eds.). (2004).
The interface of language, vision, and action:
Eye movements and the visual world. New York:
Psychology Press.
Henderson, L. (1982). Orthography and word
recognition in reading. London: Academic Press.
Herman, L. M., Richards, D. G., & Wolz, J. P.
(1984). Comprehension of sentences by bottlenosed
dolphins. Cognition, 16, 129–219.
Herrnstein, R., Loveland, D., & Cable, C. (1977).
Natural concepts in pigeons. Journal of Experimental
Psychology: Animal Learning and Memory, 2,
285–302.
Hespos, S. J., & Spelke, E. (2004). Conceptual
precursors to language. Nature, 430, 453–456.
Hess, D. J., Foss, D. J., & Carroll, P. (1995). Effects
of global and local context on lexical processing
during language comprehension. Journal of
Experimental Psychology: General, 124, 62–82.
Hickerson, N. P. (1971). Review of “Basic
Color Terms.” International Journal of American
Linguistics, 37, 257–270.
Hickok, G., & Poeppel, D. (2004). Dorsal and
ventral streams: A framework for understanding
aspects of the functional anatomy of language.
Cognition, 92, 67–99.
Hieke, A. E., Kowal, S. H., & O’Connell, D. C.
(1983). The trouble with “articulatory” pauses.
Language and Speech, 26, 203–214.
Hill, R. L., & Murray, W. S. (2000). Commas and
spaces: Effects of punctuation on eye movements and
sentence parsing. In A. Kennedy, R. Radach, D. Heller,
& J. Pynte (Eds.), Reading as a perceptual process
(pp. 565–589). Oxford: Elsevier.
Hillis, A. (2002). Handbook of adult language
disorders. Hove, UK: Psychology Press.
Hillis, A. E., Boatman, D., Hart, J., & Gordon, B.
(1999). Making sense out of jargon: A neurolinguistic
and computational account of jargon aphasia.
Neurology, 53, 1813–1824.
Hillis, A. E., & Caramazza, A. (1991a). Category-
specific naming and comprehension impairment: A
double dissociation. Brain, 114, 2081–2094.
Hillis, A. E., & Caramazza, A. (1991b). Mechanisms
for accessing lexical representations for output:
Evidence from a category-specific semantic deficit.
Brain and Language, 40, 106–144.
Hillis, A. E., & Caramazza, A. (1995). Representation
of grammatical categories of words in the brain.
Journal of Cognitive Neuroscience, 7, 396–407.
Hillis, A. E., Tuffiash, E., & Caramazza, A. (2002).
Modality-specific deterioration in naming verbs in
nonfluent primary progressive aphasia. Journal of
Cognitive Neuroscience, 14, 1099–1108.
Hinton, G. E. (1989). Deterministic Boltzmann
learning performs steepest descent in weight-space.
Neural Computation, 1, 143–150.
Hinton, G. E. (1992). How neural networks learn
from experience. Scientific American, 267, 105–109.
Hinton, G. E., Plaut, D. C., & Shallice, T. (1993).
Simulating brain damage. Scientific American, 269,
58–65.
Hinton, G. E., & Sejnowski, T. J. (1986). Learning
and relearning in Boltzmann machines. In
D. E. Rumelhart, J. L. McClelland, & the PDP Research
Group, Parallel distributed processing: Explorations
in the microstructure of cognition: Vol. 1. Foundations
(pp. 282–317). Cambridge, MA: MIT Press.
Hinton, G. E., & Shallice, T. (1991). Lesioning an
attractor network: Investigations of acquired dyslexia.
Psychological Review, 98, 74–95.
Hintzman, D. L. (1986). “Schemata abstraction” in a
multiple-trace memory model. Psychological Review,
93, 411–428.
Hirsh, K. W., & Ellis, A. W. (1994). Age of
acquisition and lexical processing in aphasia: A case
study. Cognitive Neuropsychology, 11, 435–458.
Hirsh-Pasek, K., Kemler-Nelson, D. G., Jusczyk, P. W.,
Cassidy, K. W., Druss, B., & Kennedy, L. (1987).
Clauses are perceptual units for young infants. Cognition,
26, 269–286.
Hirsh-Pasek, K., Reeves, L. M., & Golinkoff, R. M.
(1993). Words and meaning: From primitives to
complex organisation. In J. Berko Gleason & N. B.
Ratner (Eds.), Psycholinguistics (pp. 134–199). Fort
Worth, TX: Harcourt Brace.
Hirsh-Pasek, K., Treiman, R., & Schneiderman, M.
(1984). Brown and Hanlon revisited: Mothers’
sensitivity to ungrammatical forms. Journal of Child
Language, 11, 81–88.
Hladik, E. G., & Edwards, H. T. (1984). A
comparative analysis of mother–father speech
in the naturalistic home environment. Journal of
Psycholinguistic Research, 13, 321–332.
Hockett, C. F. (1960). The origin of speech. Scientific
American, 203, 89–96.
Hodges, J. R., & Greene, J. D. W. (1998). Knowing
about people and naming them: Can Alzheimer’s
disease patients do one without the other? Quarterly
Journal of Experimental Psychology, 51A, 121–134.
Hodges, J. R., Patterson, K. E., Oxbury, S., &
Funnell, E. (1992). Semantic dementia: Progressive
fluent aphasia with temporal lobe atrophy. Brain, 115,
1783–1806.
Hodges, J. R., Salmon, D. P., & Butters, N. (1991).
The nature of the naming deficit in Alzheimer’s and
Huntington’s disease. Brain, 114, 1547–1558.
Hodgson, J. M. (1991). Informational constraints
on pre-lexical priming. Language and Cognitive
Processes, 6, 169–205.
Hoff, E. (2003). The specificity of environmental
influence: Socioeconomic status affects early
vocabulary development via maternal speech. Child
Development, 74, 1368–1378.
Hoff, E., & Naigles, L. (2002). How children use input
to acquire a lexicon. Child Development, 73, 418–433.
(Ed.), The new cognitive neurosciences (2nd ed., pp. 845–865). Cambridge, MA: MIT Press.
Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition, 92, 101–144.
Inhoff, A. W. (1984). Two stages of word processing during eye fixations in the reading of prose. Journal of Verbal Learning and Verbal Behavior, 23, 612–624.
Ivanova, I., Pickering, M. J., McLean, J. F., Costa, A., & Branigan, H. P. (2012). How do people produce ungrammatical utterances? Journal of Memory and Language, 67, 355–370.
Jackendoff, R. (1977). X-bar syntax: A study of phrase structure. Cambridge, MA: MIT Press.
Jackendoff, R. (1983). Semantics and cognition. Cambridge, MA: MIT Press.
Jackendoff, R. (1987). On beyond zebra: The relation of linguistic and visual information. Cognition, 26, 89–114.
Jackendoff, R. (1999). Possible stages in the evolution of the language capacity. Trends in Cognitive Sciences, 3, 272–279.
Jackendoff, R. (2002). Foundations of language. Oxford: Oxford University Press.
Jackendoff, R. (2003). Précis of Foundations of language: Brain, meaning, grammar, evolution. Behavioral and Brain Sciences, 26, 651–707.
Jackendoff, R., & Pinker, S. (2005). The nature of the language faculty and its implications for evolution of language (Reply to Fitch, Hauser, and Chomsky). Cognition, 97, 211–225.
Jacobsen, E. (1932). The electrophysiology of mental activities. American Journal of Psychology, 44, 677–694.
Jacoby, L. L. (1983). Perceptual enhancement: Persistent effects of an experience. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 930–940.
Jacoby, L. L., & Dallas, M. (1981). On the relationship between autobiographical memory and perceptual learning. Journal of Experimental Psychology: General, 110, 306–340.
Jaeger, J. J., Lockwood, A. H., Kemmerer, D., van Valin, R. D., Murphy, B. W., & Khalak, H. G. (1996). A positron emission tomographic study of regular and irregular verb morphology in English. Language, 72, 451–497.
Jaffe, J., Breskin, S., & Gerstman, L. J. (1972). Random generation of apparent speech rhythms. Language and Speech, 15, 68–71.
Jakobson, R. (1968). Child language, aphasia and phonological universals. The Hague: Mouton.
James, L. E., & Burke, D. M. (2000). Phonological priming effects on word retrieval and tip-of-the-tongue experiences in young and older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1378–1391.
James, S. L., & Khan, L. M. L. (1982). Grammatical morpheme acquisition: An approximately invariant order? Journal of Psycholinguistic Research, 11, 381–388.
Jared, D. (1997a). Evidence that strategy effects in word naming reflect changes in output timing rather than changes in processing route. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1424–1438.
Jared, D. (1997b). Spelling–sound consistency affects the naming of high-frequency words. Journal of Memory and Language, 36, 505–529.
Jared, D., Levy, B. A., & Rayner, K. (1999). The role of phonology in the activation of word meanings during reading: Evidence from proofreading and eye movements. Journal of Experimental Psychology: General, 128, 219–264.
Jared, D., McRae, K., & Seidenberg, M. S. (1990). The basis of consistency effects in word naming. Journal of Memory and Language, 29, 687–715.
Jared, D., & Seidenberg, M. S. (1991). Does word identification proceed from spelling to sound to meaning? Journal of Experimental Psychology: General, 120, 358–394.
Jarvella, R. J. (1971). Syntactic processing of connected speech. Journal of Verbal Learning and Verbal Behavior, 10, 409–416.
Jastrzembski, J. E. (1981). Multiple meanings, number of related meanings, frequency of occurrence, and the lexicon. Cognitive Psychology, 13, 278–305.
Jescheniak, J. D., & Levelt, W. J. M. (1994). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 824–843.
Jescheniak, J. D., Meyer, A. S., & Levelt, W. J. M. (2003). Specific-word frequency is not all that counts in speech production: Comments on Caramazza, Costa, et al. (2001) and new experimental data. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 432–438.
Jescheniak, J. D., Schriefers, H., & Hantsch, A. (2003). Utterance format affects phonological priming in the picture–word task: Implications for models of phonological encoding in speech production. Journal of Experimental Psychology: Human Perception and Performance, 29, 441–454.
Jin, Y.-S. (1990). Effects of concreteness on cross-language priming in lexical decisions. Perceptual and Motor Skills, 70, 1139–1154.
Joanisse, M. F., & Seidenberg, M. S. (1998). Specific language impairment: A deficit in grammar or processing? Trends in Cognitive Sciences, 2, 240–247.
Joanisse, M. F., & Seidenberg, M. S. (1999). Impairments in verb morphology after brain injury: A connectionist model. Proceedings of the National Academy of Sciences USA, 96, 7592–7597.
Joanisse, M. F., & Seidenberg, M. S. (2005). Imaging the past: Neural activation in frontal and temporal regions during regular and irregular past tense processing. Cognitive, Affective and Behavioral Neuroscience, 5, 282–296.
Job, R., Miozzo, M., & Sartori, G. (1993). On the existence of category-specific impairments: A reply to Parkin and Stewart. Quarterly Journal of Experimental Psychology, 46A, 511–516.
Johnson, E. K., & Jusczyk, P. W. (2001). Word segmentation by 8-month-olds: When speech cues count more than statistics. Journal of Memory and Language, 44, 548–567.
Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2003). Lexical viability constraints on speech segmentation by infants. Cognitive Psychology, 46, 65–97.
Johnson, J. S., & Newport, E. L. (1989). Critical period effects in second language learning: The influence of maturational state on the acquisition of English as a second language. Cognitive Psychology, 21, 60–99.
Johnson, K. E., & Mervis, C. B. (1997). Effects of varying levels of expertise on the basic level of categorization. Journal of Experimental Psychology: General, 126, 248–277.
Johnson, R. E. (1970). Recall of prose as a function of the structural importance of the linguistic units. Journal of Verbal Learning and Verbal Behavior, 9, 12–20.
Johnson-Laird, P. N. (1975). Meaning and the mental lexicon. In A. Kennedy & A. Wilkes (Eds.), Studies in long-term memory (pp. 123–142). London: John Wiley.
Johnson-Laird, P. N. (1978). What’s wrong with Grandma’s guide to procedural semantics: A reply to Jerry Fodor. Cognition, 6, 249–261.
Johnson-Laird, P. N. (1983). Mental models. Cambridge: Cambridge University Press.
Johnson-Laird, P. N., Herrmann, D. J., & Chaffin, R. (1984). Only connections: A critique of semantic networks. Psychological Bulletin, 96, 292–315.
Johnston, R. S., & Watson, J. E. (2004). Accelerating the development of reading, spelling and phonemic awareness skills in initial readers. Reading and Writing, 17, 327–357.
Johnston, R. S., & Watson, J. E. (2005). The effects of synthetic phonics teaching on reading and spelling attainment: A seven year longitudinal study. The Scottish Executive, available at http://www.scotland.gov.uk/Publications/2005/02/20688/52449.
Jolicoeur, P., Gluck, M. A., & Kosslyn, S. M. (1984). Pictures and names: Making the connection. Cognitive Psychology, 16, 243–275.
Jones, G. V. (1985). Deep dyslexia, imageability, and ease of predication. Brain and Language, 24, 1–19.
Jones, G. V. (1989). Back to Woodworth: Role of interlopers in the tip-of-the-tongue phenomenon. Memory and Cognition, 17, 69–76.
Jones, G. V., & Langford, S. (1987). Phonological blocking in the tip of the tongue state. Cognition, 26, 115–122.
Jones, G. V., & Martin, M. (1985). Deep dyslexia and the right-hemisphere hypothesis for semantic paralexia: A reply to Marshall and Patterson. Neuropsychologia, 23, 685–688.
Jones, L. L., & Estes, Z. (2005). Metaphor comprehension as attributive categorization. Journal of Memory and Language, 53, 110–124.
Jorm, A. F. (1979). The cognitive and neurological basis of developmental dyslexia: A theoretical framework and review. Cognition, 7, 19–32.
Jorm, A. F., & Share, D. L. (1983). Phonological recoding and reading acquisition. Applied Psycholinguistics, 4, 103–147.
Jusczyk, P. W. (1982). Auditory versus phonetic coding of speech signals during infancy. In J. Mehler, E. C. T. Walker, & M. Garrett (Eds.), Perspectives on mental representation (pp. 361–387). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87, 329–354.
Just, M. A., & Carpenter, P. A. (1987). The psychology of reading and language comprehension. Newton, MA: Allyn & Bacon.
Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: Individual differences in working memory. Psychological Review, 99, 122–149.
Just, M. A., Carpenter, P. A., & Keller, T. A. (1996). The capacity theory of comprehension: New frontiers of evidence and arguments. Psychological Review, 103, 773–780.
Just, M. A., & Varma, S. (2002). A hybrid architecture for working memory: Reply to MacDonald and Christiansen. Psychological Review, 109, 55–65.
Kail, R., & Nippold, M. A. (1984). Unconstrained retrieval from semantic memory. Child Development, 55, 944–951.
Kako, E. (1999a). Elements of syntax in the systems of three language-trained animals. Animal Learning and Behavior, 27, 1–14.
Kako, E. (1999b). Response to Pepperberg; Herman and Uyeyama; and Shanker, Savage-Rumbaugh, and Taylor. Animal Learning and Behavior, 27, 26–27.
Kamide, Y., Altmann, G. T. M., & Haywood, S. L. (2003). The time-course of prediction in incremental sentence processing: Evidence from anticipatory eye movements. Journal of Memory and Language, 49, 133–156.
Kaminski, J., Call, J., & Fischer, J. (2004). Word learning in a domestic dog: Evidence for “fast mapping.” Science, 304, 1682–1683.
Kanwisher, N. (1987). Repetition blindness: Type recognition without token individuation. Cognition, 27, 117–143.
Kanwisher, N., & Potter, M. C. (1990). Repetition blindness: Levels of processing. Journal of Experimental Psychology: Human Perception and Performance, 16, 30–47.
Karmiloff-Smith, A. (1986). From meta-process to conscious access: Evidence from metalinguistic and repair data. Cognition, 23, 95–147.
Katz, J. J. (1977). The real status of semantic representations. Linguistic Inquiry, 8, 559–584.
Katz, J. J., & Fodor, J. A. (1963). The structure of a semantic theory. Language, 39, 170–210.
Kaufer, D., Hayes, J. R., & Flower, L. S. (1986). Composing written sentences. Research in the Teaching of English, 20, 121–140.
Kaup, B., & Zwaan, R. A. (2003). Effects of negation and situational presence on the accessibility of text information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 439–446.
Kawamoto, A. (1993). Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing account. Journal of Memory and Language, 32, 474–516.
Kay, D. A., & Anglin, J. M. (1982). Overextension and underextension in the child’s expressive and receptive speech. Journal of Child Language, 9, 83–98.
Kay, J. (1985). Mechanisms of oral reading: A critical appraisal of cognitive models. In A. W. Ellis (Ed.), Progress in the psychology of language (Vol. 2, pp. 73–105). Hove, UK: Lawrence Erlbaum Associates.
Kay, J., & Bishop, D. (1987). Anatomical differences between nose, palm, foot. Or, the body in question: Further dissection of the processes of sub-lexical spelling–sound translation. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 449–469). Hove, UK: Lawrence Erlbaum Associates.
Kay, J., & Ellis, A. W. (1987). A cognitive neuropsychological case study of anomia: Implications for psychological models of word retrieval. Brain, 110, 613–629.
Kay, J., Lesser, R., & Coltheart, M. (1992). Psycholinguistic assessments of language processing in aphasia (PALPA): An introduction. Hove, UK: Lawrence Erlbaum Associates.
Kay, J., & Marcel, A. J. (1981). One process, not two in reading aloud: Lexical analogies do the work of nonlexical rules. Quarterly Journal of Experimental Psychology, 33A, 397–414.
Kay, P., & Kempton, W. (1984). What is the Sapir–Whorf hypothesis? American Anthropologist, 86, 65–79.
Kean, M.-L. (1977). The linguistic interpretation of aphasic syndromes: Agrammatism in Broca’s aphasia, an example. Cognition, 5, 9–46.
Keenan, J. M., MacWhinney, B., & Mayhew, D. (1977). Pragmatics in memory: A study of natural conversation. Journal of Verbal Learning and Verbal Behavior, 16, 549–560.
Kegl, J., Senghas, A., & Coppola, M. (1999). Creation through contact: Sign language emergence and sign language change in Nicaragua. In M. DeGraff (Ed.), Comparative grammatical change: The intersection of language acquisition, Creole genesis, and diachronic syntax (pp. 179–237). Cambridge, MA: MIT Press.
Kellas, G., Ferraro, F. R., & Simpson, G. B. (1988). Lexical ambiguity and the time-course of attentional allocation in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 14, 601–609.
Kellogg, R. T. (1988). Attentional overload and writing performance. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 355–365.
Kellogg, W. N., & Kellogg, L. A. (1933). The ape and the child. New York: McGraw-Hill.
Kelly, M. H. (1992). Using sound to solve syntactic problems: The role of phonology in grammatical category assignments. Psychological Review, 99, 349–364.
Kelly, M. H., Bock, J. K., & Keil, F. C. (1986). Prototypicality in a linguistic context: Effects on sentence structure. Journal of Memory and Language, 25, 59–74.
Kelter, S., Kaup, B., & Claus, B. (2004). Representing a described sequence of events: A dynamic view of narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 451–464.
Kempen, G., & Huijbers, P. (1983). The lexicalization process in sentence production and naming: Indirect election of words. Cognition, 14, 185–209.
Kendon, A. (1967). Some functions of gaze direction in social interaction. Acta Psychologica, 26, 22–63.
Kennedy, A. (2000). Parafoveal processing in word recognition. Quarterly Journal of Experimental Psychology, 53A, 429–455.
Kennedy, A., Murray, W. S., Jennings, F., & Reid, C. (1989). Parsing complements: Comments on the generality of the principle of minimal attachment. Language and Cognitive Processes, 4, 51–76.
Kennison, S. M. (2001). Limitations on the use of verb information during sentence comprehension. Psychonomic Bulletin and Review, 8, 132–138.
Kennison, S. M., & Trofe, J. L. (2004). Comprehending pronouns: A role for word-specific gender stereotype information. Journal of Psycholinguistic Research, 32, 355–378.
Kersten, A. W., & Earles, J. L. (2001). Less really is more for adults learning a miniature artificial
language. Journal of Memory and Language, 44, 250–273.
Kess, J. F., & Miyamoto, T. (1999). The Japanese mental lexicon: Psycholinguistic studies of Kana and Kanji processing. Amsterdam: John Benjamins.
Keysar, B. (1989). On the functional equivalence of literal and metaphorical interpretations of discourse. Journal of Memory and Language, 28, 375–385.
Keysar, B., Barr, D. J., Balin, J. A., & Paek, T. S. (1998). Definite reference and mutual knowledge: Process models of common ground in comprehension. Journal of Memory and Language, 39, 1–20.
Keysar, B., & Henly, A. S. (2002). Speakers’ overestimation of their effectiveness. Psychological Science, 13, 207–212.
Kiger, J. I., & Glass, A. L. (1983). The facilitation of lexical decisions by a prime occurring after the target. Memory and Cognition, 11, 356–365.
Kilborn, K. (1994). Learning language late: Second language acquisition in adults. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 917–944). San Diego, CA: Academic Press.
Kim, H. S. (2002). We talk, therefore we think? A cultural analysis of the effect of talking on thinking. Journal of Personality and Social Psychology, 83, 828–842.
Kimball, J. (1973). Seven principles of surface structure parsing in natural language. Cognition, 2, 15–47.
Kinoshita, S., & Lupker, S. J. (2003). Masked priming: State of the art. New York: Psychology Press.
Kintsch, W. (1974). The representation of meaning in memory. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Kintsch, W. (1979). On modelling comprehension. Educational Psychologist, 14, 3–14.
Kintsch, W. (1980). Semantic memory: A tutorial. In R. S. Nickerson (Ed.), Attention and performance VIII (pp. 595–620). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Kintsch, W. (1988). The use of knowledge in discourse processing: A construction-integration model. Psychological Review, 95, 163–182.
Kintsch, W. (1994). The psychology of discourse processing. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 721–740). San Diego, CA: Academic Press.
Kintsch, W., & Bates, E. (1977). Recognition memory for statements from a classroom lecture. Journal of Experimental Psychology: Human Learning and Memory, 3, 187–197.
Kintsch, W., & Keenan, J. M. (1973). Reading rate and retention as a function of the number of propositions in the base structure of sentences. Cognitive Psychology, 5, 257–274.
Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–394.
Kintsch, W., & Vipond, D. (1979). Reading comprehension and readability in educational practice and psychological theory. In L. G. Nilsson (Ed.), Perspectives in memory research (pp. 329–366). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Kintsch, W., Welsch, D., Schmalhofer, F., & Zimny, S. (1990). Sentence memory: A theoretical analysis. Journal of Memory and Language, 29, 133–159.
Kiparsky, P., & Menn, L. (1977). On the acquisition of phonology. In J. Macnamara (Ed.), Language learning and thought (pp. 47–78). New York: Academic Press.
Kirshner, H. S., Webb, W. G., & Kelly, M. P. (1984). The naming disorder of dementia. Neuropsychologia, 22, 23–30.
Kirsner, K., Smith, M., Lockhart, R. S., King, M. L., & Jain, M. (1984). The bilingual lexicon: Language-specific units in an integrated network. Journal of Verbal Learning and Verbal Behavior, 23, 519–539.
Klapp, S. T. (1974). Syllable-dependent pronunciation latencies in number naming, a replication. Journal of Experimental Psychology, 102, 1138–1140.
Klapp, S. T., Anderson, W. G., & Berrian, R. (1973). Implicit speech in reading, reconsidered. Journal of Experimental Psychology, 100, 368–374.
Klatt, D. H. (1989). Review of selected models of speech perception. In W. Marslen-Wilson (Ed.), Lexical representation and process (pp. 169–226). Cambridge, MA: MIT Press.
Klee, T., & Fitzgerald, M. D. (1985). The relation between grammatical development and mean length of utterance in morphemes. Journal of Child Language, 12, 251–269.
Klein, W. (1986). Second language acquisition. Cambridge: Cambridge University Press.
Klima, E. S., & Bellugi, U. (1979). The signs of language. Cambridge, MA: Harvard University Press.
Kluender, R., & Kutas, M. (1993). Bridging the gap: Evidence from ERPs on the processing of unbounded dependencies. Journal of Cognitive Neuroscience, 5, 196–214.
Knutsen, D., & Le Bigot, L. (2012). Managing dialogue: How information availability affects collaborative reference production. Journal of Memory and Language, 67, 326–341.
Kohn, S. E. (1984). The nature of the phonological disorder in conduction aphasia. Brain and Language, 23, 97–115.
Kohn, S. E., & Friedman, R. B. (1986). Word-meaning deafness: A phonological–semantic dissociation. Cognitive Neuropsychology, 3, 291–308.
Kohn, S. E., Wingfield, A., Menn, L., Goodglass, H., Gleason, J. B., & Hyde, M. (1987). Lexical retrieval: The tip-of-the-tongue phenomenon. Applied Psycholinguistics, 8, 245–266.
Kolb, B., & Whishaw, I. Q. (2009). Fundamentals of human neuropsychology (6th ed.). New York: W. H. Freeman & Co.
Kolk, H. H. J. (1978). The linguistic interpretation of Broca’s aphasia: A reply to M.-L. Kean. Cognition, 6, 353–361.
Kolk, H. H. J. (1995). A time-based approach to agrammatic production. Brain and Language, 50, 282–303.
Kolk, H. H. J., & van Grunsven, M. (1985). Agrammatism as a variable phenomenon. Cognitive Neuropsychology, 2, 347–384.
Komatsu, L. K. (1992). Recent views of conceptual structure. Psychological Bulletin, 112, 500–526.
Kornai, A., & Pullum, G. K. (1990). The X-bar theory of phrase structure. Language, 66, 24–50.
Kounios, J., & Holcomb, P. J. (1992). Structure and process in semantic memory: Evidence from event-related brain potentials and reaction times. Journal of Experimental Psychology: General, 121, 459–479.
Kounios, J., & Holcomb, P. J. (1994). Concreteness effects in semantic processing: ERP evidence supporting dual-coding theory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 804–823.
Kraljic, T., & Brennan, S. E. (2005). Prosodic disambiguation of syntactic structure: For the speaker or for the addressee? Cognitive Psychology, 50, 194–231.
Krashen, S. D. (1982). Principles and practice in second language acquisition. Oxford: Pergamon.
Krashen, S. D., Long, M., & Scarcella, R. (1982). Age, rate, and eventual attainment in second language acquisition. In S. D. Krashen, R. Scarcella, & M. Long (Eds.), Child–adult differences in second language acquisition (pp. 161–172). Rowley, MA: Newbury House.
Kraus, N. (2012). Atypical brain oscillations: A biological basis for dyslexia? Trends in Cognitive Sciences, 16, 12–13.
Kremin, H. (1985). Routes and strategies in surface dyslexia and dysgraphia. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 105–137). Hove, UK: Lawrence Erlbaum Associates.
Kroll, J. F., & Stewart, E. (1994). Category interference in translation and picture naming: Evidence for asymmetric connections between bilingual memory representations. Journal of Memory and Language, 33, 149–174.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Kucera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Kuczaj, S. A. (1977). The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16, 589–600.
Kuhl, P. K. (1981). Discrimination of speech by non-human animals: Basic auditory sensitivities conducive to the perception of speech-sound categories. Journal of the Acoustical Society of America, 70, 340–349.
Kursaal Flyers (1976). Little does she know/Drinking socially. CBS 4689. Producer: Mike Batt.
Kutas, M. (1993). In the company of other words: Electrophysiological evidence for single-word and sentence context effects. Language and Cognitive Processes, 8, 533–572.
Kutas, M., DeLong, K. A., & Smith, N. J. (2011). A look around at what lies ahead: Prediction and predictability in language processing. In M. Bar (Ed.), Predictions in the brain: Using our past to generate a future (pp. 190–207). Oxford: Oxford University Press.
Kutas, M., & Hillyard, S. A. (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207, 203–205.
Kutas, M., & van Petten, C. (1994). Psycholinguistics electrified: Event-related brain potential investigations. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 83–143). San Diego, CA: Academic Press.
Kyle, J. G., & Woll, B. (1985). Sign language: The study of deaf people and their language. Cambridge: Cambridge University Press.
La Heij, W., Hooglander, A., Kerling, R., & van der Velden, E. (1996). Nonverbal context effects in forward and backward translation: Evidence for concept mediation. Journal of Memory and Language, 35, 648–665.
Labov, W. (1973). The boundaries of words and their meanings. In C.-J. Bailey & R. W. Shuy (Eds.), New ways of analyzing variation in English (pp. 340–373). Washington, DC: Georgetown University Press.
Labov, W., & Fanshel, D. (1977). Therapeutic discourse: Psychotherapy as conversation. New York: Academic Press.
Lackner, J. R., & Garrett, M. F. (1972). Resolving ambiguity: Effects of biasing context in the unattended ear. Cognition, 1, 359–372.
Lado, R. (1957). Linguistics across cultures. Ann Arbor: University of Michigan Press.
Lai, C. S. L., Fisher, S. E., Hurst, J. A., Vargha-Khadem, F., & Monaco, A. P. (2001). A forkhead-domain gene is mutated in a severe speech and language disorder. Nature, 413, 519–523.
Laiacona, M., Barbarotto, R., & Capitani, E. (1993). Perceptual and associative knowledge in category specific impairment of semantic memory: A study of two cases. Cortex, 29, 727–740.
Laine, M., & Martin, N. (1996). Lexical retrieval deficit in picture naming: Implications for word production models. Brain and Language, 53, 283–314.
Laing, E., & Hulme, C. (1999). Phonological and semantic processes influence beginning readers’ ability
to learn to read words. Journal of Experimental Child Psychology, 73, 183–207.
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos & A. Musgrave (Eds.), Criticism and the growth of knowledge (pp. 91–196). Cambridge: Cambridge University Press.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago Press.
Lambert, W. E., Tucker, G. R., & d’Anglejan, A. (1973). Cognitive and attitudinal consequences of bilingual schooling. Journal of Educational Psychology, 65, 141–159.
Lambon Ralph, M. A., Ellis, A. W., & Franklin, S. (1995). Semantic loss without surface dyslexia. Neurocase, 1, 363–369.
Lambon Ralph, M. A., Sage, K., & Ellis, A. W. (1996). Word meaning blindness: A new form of acquired dyslexia. Cognitive Neuropsychology, 13, 617–639.
Landau, B., & Gleitman, L. R. (1985). Language and experience: Evidence from the blind child. Cambridge, MA: Harvard University Press.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to latent semantic analysis. Discourse Processes, 25, 259–284.
Landauer, T. K., & Freedman, J. L. (1968). Information retrieval from long-term memory: Category size and recognition time. Journal of Verbal Learning and Verbal Behavior, 7, 291–295.
Lane, H., & Pillard, R. (1978). The wild boy of Burundi. New York: Random House.
Lantz, D., & Stefflre, V. (1964). Language and cognition revisited. Journal of Abnormal Psychology, 69, 472–481.
Lapointe, S. (1983). Some issues in the linguistic description of agrammatism. Cognition, 14, 1–39.
Lauro-Grotto, R., Piccini, C., & Shallice, T. (1997). Modality-specific operations in semantic dementia. Cortex, 33, 593–622.
Laws, G., Davies, I., & Andrews, C. (1995). Linguistic structure and non-linguistic cognition: English and Russian blues compared. Language and Cognitive Processes, 10, 59–94.
Laxon, V., Masterson, J., & Coltheart, V. (1991). Some bodies are easier to read: The effect of consistency and regularity on children’s reading. Quarterly Journal of Experimental Psychology, 43A, 793–824.
Lee, A. C. H., Graham, K. S., Simons, J. S., & Hodges, J. (2002). Regional brain activations differ for semantic features but not categories. Neuroreport, 13, 1497–1501.
Lee, H., Rayner, K., & Pollatsek, A. (2001). The relative contribution of consonants and vowels to word identification during reading. Journal of Memory and Language, 44, 189–205.
Lee, J. J., & Pinker, S. (2010). Rationales for indirect speech: The theory of the strategic speaker. Psychological Review, 117, 785–807.
Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997). Mr Chips: An ideal-observer model of reading. Psychological Review, 104, 524–553.
Lenneberg, E. H. (1962). Understanding language without ability to speak: A case report. Journal of Abnormal and Social Psychology, 65, 419–425.
Lenneberg, E. H. (1967). The biological foundations of language. New York: Wiley.
Lenneberg, E. H., & Roberts, J. M. (1956). The language of experience. Memoir 13, Indiana University Publications in Anthropology and Linguistics.
Leonard, L. B. (1989). Language learnability and specific language impairment in children. Applied Psycholinguistics, 10, 179–202.
Leonard, L. B. (2000). Children with specific language impairment. Cambridge, MA: MIT Press.
Leonard, L. B., Newhoff, M., & Fey, M. E. (1980). Some instances of word usage without comprehension. Journal of Child Language, 7, 186–196.
Leopold, W. F. (1939–1949). Speech development of a bilingual child: A linguist’s record (5 vols.). Evanston, IL: Northwestern University Press.
Lesch, M. F., & Martin, R. C. (1998). The representation of sublexical orthographic–phonological correspondences: Evidence from phonological dyslexia. Quarterly Journal of Experimental Psychology, 51, 905–938.
Lesch, M. F., & Pollatsek, A. (1998). Evidence for the use of assembled phonology in accessing the meaning of words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 573–592.
Levelt, W. J. M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (2001). Spoken word production: A theory of lexical access. Proceedings of the National Academy of Sciences, 98, 13464–13471.
Levelt, W. J. M. (2002). Picture naming and word frequency. Language and Cognitive Processes, 17, 663–671.
Levelt, W. J. M., & Kelter, S. (1982). Surface form and memory in question answering. Cognitive Psychology, 14, 78–106.
Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1–75.
Levelt, W. J. M., Schriefers, H., Vorberg, D., Meyer, A. S., Pechmann, T., & Havinga, J. (1991a). The time course of lexical access in speech
production: A study of picture naming. Psychological Liberman, A. M., & Whalen, D. H. (2000). On the
Review, 98, 122–142. relation of speech to language. Trends in Cognitive
Levelt, W. J. M., Schriefers, H., Vorberg, D., Sciences, 4, 187–196.
Meyer, A. S., Pechmann, T., & Havinga, J. (1991b). Lidz, J., Gleitman, H., & Gleitman, L. (2003).
Normal and deviant lexical processing: Reply to Dell Understanding how input matters: Verb learning and
and O’Seaghdha (1991). Psychological Review, 98, the footprint of universal grammar. Cognition, 87,
615–618. 151–178.
Levelt, W. J. M., & Wheeldon, L. (1994). Do Lidzba, K., & Krägeloh-Mann, I. (2005).
speakers have access to a mental syllabary? Cognition, Development and lateralization of language in the
50, 239–269. presence of early brain lesions. Developmental
Levine, D. N., Calvanio, R., & Popovics, A. Medicine and Child Neurology, 47, 724.
(1982). Language in the absence of inner speech. Lieberman, P. (1963). Some effects of semantic and
Neuropsychologia, 20, 391–409. grammatical context on the production and perception
Levinson, S. (1983). Pragmatics. Cambridge: of speech. Language and Speech, 6, 172–187.
Cambridge University Press. Lieberman, P. (1975). On the origins of language.
Levinson, S. (1996a). Frames of reference and New York: Macmillan.
Molyneux’s question: Crosslinguistic evidence. In P. Lieven, E. (1994). Crosslinguistic and crosscultural
Bloom & M. Peterson (Eds.), Language and space aspects of language addressed to children. In
(pp. 109–169). Cambridge, MA: MIT Press. C. Gallaway & B. J. Richards (Eds.), Input and
Levinson, S. (1996b). Language and space. Annual interaction in language acquisition (pp. 56–73).
Review of Anthropology, 25, 353–382. Cambridge: Cambridge University Press.
Levinson, S. C., Kita, S., Haun, D. B. M., & Lieven, E., Pine, J., & Baldwin, G. (1997).
Rasch, B. H. (2002). Returning the tables: Language Lexically-based learning and early grammatical
affects spatial reasoning. Cognition, 84, 155–188. development. Journal of Child Language, 24,
Levy, B. A., Gong, Z., Hessels, S., Evans, M. A., 187–220.
& Jared, D. (2006). Understanding print: Early Lightfoot, D. (1982). The language lottery: Toward a
reading development and the contributions of home biology of grammars. Cambridge, MA: MIT Press.
literacy experiences. Journal of Experimental Child Lindell, A. K. (2006). In your right mind: Right
Psychology, 93, 63–93. hemisphere contributions to language processing
Levy, E., & Nelson, K. (1994). Words in discourse: A and production. Neuropsychology Review, 16,
dialectical approach to the acquisition of meaning and 131–148.
use. Journal of Child Language, 21, 367–389. Lindsay, P. H., & Norman, D. A. (1977). Human
Levy, Y. (1983). It’s frogs all the way down. information processing (2nd ed.). New York:
Cognition, 15, 75–93. Academic Press.
Levy, Y. (1988). The nature of early language: Linebarger, M. C. (1995). Agrammatism as evidence
Evidence from the development of Hebrew about grammar. Brain and Language, 50, 52–91.
morphology. In Y. Levy, I. M. Schlesinger, & Linebarger, M. C., Schwartz, M. F., & Saffran, E. M.
M. D. S. Braine (Eds.), Categories and processes (1983). Sensitivity to grammatical structure in so-called
in language acquisition (pp. 73–98). Hillsdale, NJ: agrammatic aphasics. Cognition, 13, 361–392.
Lawrence Erlbaum Associates, Inc. Lipson, M. Y. (1983). The influence of religious
Levy, Y., & Schlesinger, I. M. (1988). The child’s affiliation on children’s memory for text information.
early categories: Approaches to language acquisition Reading Research Quarterly, 18, 448–457.
theory. In Y. Levy, I. M. Schlesinger, & Liu, L. G. (1985). Reasoning counter-factually in
M. D. S. Braine (Eds.), Categories and processes in Chinese: Are there any obstacles? Cognition, 21,
language acquisition (pp. 261–276). Hillsdale, NJ: 239–270.
Lawrence Erlbaum Associates, Inc. Locke, J. (1690). Essay concerning human
Lewis, V. (1987). Development and handicap. Oxford: understanding (Ed. P. M. Nidditch, 1975). Oxford:
Blackwell. Clarendon.
Li, P., & Gleitman, L. (2002). Turning the tables: Locke, J. L. (1983). Phonological acquisition and
Language and spatial reasoning. Cognition, 83, 265–294. change. New York: Academic Press.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., Locke, J. L. (1997). A theory of neurolinguistic
& Studdert-Kennedy, M. (1967). Perception of the development. Brain and Language, 58, 265–326.
speech code. Psychological Review, 74, 431–461. Loebell, H., & Bock, K. (2003). Structural priming
Liberman, A. M., Harris, K. S., Hoffman, H. S., across languages. Linguistics, 41, 791–824.
& Griffith, B. C. (1957). The discrimination of Loftus, E. F. (1973). Category dominance, instance
speech sounds within and across phoneme boundaries. dominance, and categorization time. Journal of
Journal of Experimental Psychology, 53, 358–368. Experimental Psychology, 97, 70–74.
Loftus, E. F. (1975). Leading questions and the from associative priming by words, homophones,
eyewitness report. Cognitive Psychology, 7, 560–572. and pseudohomophones. Journal of Experimental
Loftus, E. F. (1996). Eyewitness testimony (reprint Psychology: General, 123, 107–128.
edition with new preface). Cambridge, MA: Harvard Lukatela, G., & Turvey, M. T. (1994b). Visual
University Press. lexical access is initially phonological: 2. Evidence
Loftus, E. F., & Palmer, J. C. (1974). Reconstruction from phonological priming by homophones and
of automobile destruction: An example of the pseudohomophones. Journal of Experimental
interaction between language and memory. Journal of Psychology: General, 123, 331–353.
Verbal Learning and Verbal Behavior, 13, 585–589. Lund, K., Burgess, C., & Atchley, R. A. (1995).
Loftus, E. F., & Zanni, G. (1975). Eyewitness Semantic and associative priming in high-dimensional
testimony: The influence of the wording of a question. semantic space. Proceedings of the 17th Annual
Bulletin of the Psychonomic Society, 5, 86–88. Conference of the Cognitive Science Society, 660–665.
Longtin, C. M., Segui, J., & Halle, P. A. (2003). Lund, K., Burgess, C., & Audet, C. (1996).
Morphological priming without morphological Dissociating semantic and associative word
relationship. Language and Cognitive Processes, 18, relationships using high-dimensional semantic space.
313–334. Proceedings of the 18th Annual Conference of the
Loosemore, R., & Harley, T. A. (2010). Brains Cognitive Science Society, 603–608.
and minds. In S. J. Hanson & M. Bunzl (Eds.), Lundberg, I., & Tornéus, M. (1978). Nonreaders’
Foundational issues in human brain mapping (pp. awareness of the basic relationship between spoken
217–240). Cambridge, MA: MIT Press. and written words. Journal of Experimental Child
Lorch, R. F., Balota, D. A., & Stamm, E. G. (1986). Psychology, 25, 404–412.
Locus of inhibition effects in the priming of lexical Lupker, S. J. (1984). Semantic priming without
decisions: Pre- or post-lexical access. Memory and association: A second look. Journal of Verbal Learning
Cognition, 9, 587–598. and Verbal Behavior, 23, 709–733.
Lounsbury, F. G. (1954). Transitional probability, Luria, A. R. (1970). Traumatic aphasia. The Hague:
linguistic structure and systems of habit-family Mouton.
hierarchies. In C. E. Osgood & T. A. Sebeok (Eds.), Lyn, H., & Savage-Rumbaugh, E. S. (2000).
Psycholinguistics: A survey of theory and research Observational word learning in two bonobos (Pan
problems (pp. 93–101). Bloomington: Indiana paniscus): Ostensive and non-ostensive contexts.
University Press. [Reprinted 1965.] Language and Communication, 20, 255–273.
Lovegrove, W., Martin, F., & Slaghuis, W. (1986). Lyons, J. (1977a). Semantics (Vol. 1). Cambridge:
A theoretical and experimental case for a visual Cambridge University Press.
deficit in specific reading disability. Cognitive Lyons, J. (1977b). Semantics (Vol. 2). Cambridge:
Neuropsychology, 3, 225–267. Cambridge University Press.
Lowenfeld, B. (1948). Effects of blindness on the Lyons, J. (1991). Chomsky (3rd ed.). London:
cognitive functions of children. Nervous Child, 7, Fontana. [First edition 1970.]
45–54. Maccoby, E., & Jacklin, C. (1974). The psychology
Luce, P. A., Pisoni, D. B., & Goldinger, S. D. (1990). of sex differences. Stanford, CA: Stanford University
Similarity neighbourhoods of spoken words. In Press.
G. T. M. Altmann (Ed.), Cognitive models of speech MacDonald, M. C. (1993). The interaction of lexical
processing (pp. 122–147). Cambridge, MA: MIT and syntactic ambiguity. Journal of Memory and
Press. Language, 32, 692–715.
Luce, R. D. (1993). Sound and hearing: A conceptual MacDonald, M. C. (1994). Probabilistic constraints
introduction. Hillsdale, NJ: Lawrence Erlbaum and syntactic ambiguity resolution. Language and
Associates, Inc. Cognitive Processes, 9, 157–201.
Lucy, J. A. (1992). Language diversity and thought. MacDonald, M. C., & Christiansen, M. H. (2002).
Cambridge: Cambridge University Press. Reassessing working memory: Comment on Just
Lucy, J. A. (1996). The scope of linguistic relativity: and Carpenter (1992) and Waters and Caplan (1996).
An analysis and review of empirical research. In Psychological Review, 109, 35–54.
J. J. Gumperz & S. C. Levinson (Eds.), Rethinking MacDonald, M. C., Just, M. A., & Carpenter, P. A.
linguistic relativity (pp. 37–69). Cambridge: (1992). Working memory constraints on the processing
Cambridge University Press. of syntactic ambiguity. Cognitive Psychology, 24,
Lucy, J. A., & Shweder, R. A. (1979). Whorf and his 56–98.
critics: Linguistic and nonlinguistic influences on colour MacDonald, M. C., Pearlmutter, N. J., &
memory. American Anthropologist, 81, 581–615. Seidenberg, M. S. (1994a). Syntactic ambiguity
Lukatela, G., & Turvey, M. T. (1994a). Visual resolution as lexical ambiguity resolution. In C.
lexical access is initially phonological: 1. Evidence Clifton, L. Frazier, & K. Rayner (Eds.), Perspectives
on sentence processing (pp. 123–153). Hillsdale, NJ: Mandler, J. M., & Johnson, N. S. (1980). On
Lawrence Erlbaum Associates, Inc. throwing out the baby with the bathwater: A reply to
MacDonald, M. C., Pearlmutter, N. J., & Black and Wilensky’s evaluation of story grammars.
Seidenberg, M. S. (1994b). The lexical nature of Cognitive Science, 4, 305–312.
syntactic ambiguity resolution. Psychological Review, Manis, F. R., McBride-Chang, C., Seidenberg, M. S.,
101, 676–703. Keating, P., Doi, L. M., Munson, B., et al. (1997). Are
MacKain, C. (1982). Assessing the role of experience speech perception deficits associated with developmental
in infant speech discrimination. Journal of Child dyslexia? Journal of Experimental Child Psychology,
Language, 9, 323–350. 66, 211–235.
MacKay, D. G. (1966). To end ambiguous sentences. Manis, F. R., Seidenberg, M. S., Doi, L. M.,
Perception and Psychophysics, 1, 426–436. McBride-Chang, C., & Petersen, A. (1996). On the
MacKay, D. G. (1973). Aspects of the theory of bases of two subtypes of developmental dyslexia.
comprehension, memory and attention. Quarterly Cognition, 58, 157–195.
Journal of Experimental Psychology, 25, 22–40. Maratsos, M. (1982). The child’s construction of
Maclay, H., & Osgood, C. E. (1959). Hesitation grammatical categories. In E. Wanner &
phenomena in spontaneous English speech. Word, 15, L. R. Gleitman (Eds.), Language acquisition: The
19–44. state of the art (pp. 240–266). Cambridge: Cambridge
Macmillan, N. A., & Creelman, C. D. (1991). University Press.
Detection theory: A user’s guide. Cambridge: Maratsos, M. (1983). Some current issues in the study
Cambridge University Press. of the acquisition of grammar. In J. H. Flavell &
Macnamara, J. (1972). Cognitive basis of language E. M. Markman (Eds.), Handbook of child psychology:
learning in infants. Psychological Review, 79, 1–13. Vol. 3. Cognitive development (pp. 707–786)
Macnamara, J. (1982). Names for things: A study of (P. H. Mussen, Series Editor). New York: Wiley.
human learning. Cambridge, MA: MIT Press. Maratsos, M. (1988). The acquisition of formal word
MacNeilage, P. F., & Davis, B. L. (2000). On the classes. In Y. Levy, I. M. Schlesinger, &
origin of internal structure of word forms. Science, M. D. S. Braine (Eds.), Categories and processes
288, 527–531. in language acquisition. Hillsdale, NJ: Lawrence
MacWhinney, B. (Ed.). (1999). The emergence Erlbaum Associates, Inc.
of language. Mahwah, NJ: Lawrence Erlbaum Maratsos, M. (1998). The acquisition of grammar.
Associates, Inc. In W. Damon, D. Kuhn, & R. S. Siegler (Eds.),
MacWhinney, B., & Leinbach, J. (1991). Handbook of child psychology (Vol. 2, 5th ed., pp.
Implementations are not conceptualizations: Revising 421–466). New York: Wiley.
the verb learning model. Cognition, 40, 121–157. Marcel, A. J. (1980). Surface dyslexia and beginning
MacWhinney, B., & Pleh, C. (1988). The processing reading: A revised hypothesis of the pronunciation of print
of restrictive relative clauses in Hungarian. Cognition, and its impairments. In M. Coltheart, K. E. Patterson, &
29, 95–141. J. C. Marshall (Eds.), Deep dyslexia (pp. 227–258).
Magiste, E. (1986). Selected issues in second and London: Routledge & Kegan Paul. [2nd ed., 1987.]
third language learning. In J. Vaid (Ed.), Language Marcel, A. J. (1983a). Conscious and unconscious
processing in bilinguals: Psycholinguistic and perception: Experiments on visual masking and word
neuropsychological perspectives (pp. 97–121). recognition. Cognitive Psychology, 15, 197–237.
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Marcel, A. J. (1983b). Conscious and unconscious
Maher, J., & Groves, J. (1999). Introducing perception: An approach to the relations between
Chomsky. Cambridge: Icon Books. [Originally phenomenal experience and perceptual processes.
published as Chomsky for beginners, 1996.] Cognitive Psychology, 15, 238–300.
Majid, A., Bowerman, M., Kita, S., Haun, D. B. M., Marchman, V. (1993). Constraints on plasticity in a
& Levinson, S. C. (2004). Can language restructure connectionist model of the English past tense. Journal
cognition? The case for space. Trends in Cognitive of Cognitive Neuroscience, 5, 215–234.
Sciences, 8, 108–113. Marchman, V. (1997). Children’s productivity in the
Malotki, E. (1983). Hopi time: A linguistic analysis English past tense: The role of frequency, phonology,
of temporal concepts in the Hopi language. Berlin: and neighborhood structure. Cognitive Science, 21,
Mouton. 283–304.
Mandler, J. M. (1978). A code in the node: The use Marchman, V., & Bates, E. (1994). Continuity in
of a story schema in retrieval. Discourse Processes, 1, lexical and morphological development: A test of the
14–35. critical mass hypothesis. Journal of Child Language,
Mandler, J. M., & Johnson, N. S. (1977). 21, 339–366.
Remembrance of things parsed: Story structure and Marcus, G. F. (1993). Negative evidence in language
recall. Cognitive Psychology, 9, 111–151. acquisition. Cognition, 46, 53–85.
Marcus, G. F. (1995). The acquisition of English Marshall, J. C., & Newcombe, F. (1966). Syntactic
past tense in children and multilayered connectionist and semantic errors in paralexia. Neuropsychologia, 4,
networks. Cognition, 56, 271–279. 169–176.
Marcus, G. F. (1999). Reply to Seidenberg and Marshall, J. C., & Newcombe, F. (1973). Patterns
Elman. Trends in Cognitive Sciences, 3, 289. of paralexia: A psycholinguistic approach. Journal of
Marcus, G. F., Ullman, M., Pinker, S., Hollander, Psycholinguistic Research, 2, 175–199.
M., Rosen, T. J., & Xu, F. (1992). Overregularization Marshall, J. C., & Newcombe, F. (1980). The
in language acquisition. Monographs of the Society for conceptual status of deep dyslexia: An historical
Research in Child Development, 57 (Serial No. 228). perspective. In M. Coltheart, K. E. Patterson, &
Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. M. J. C. Marshall (Eds.), Deep dyslexia (pp. 1–21).
(1999). Rule learning by seven-month-old infants. London: Routledge & Kegan Paul. [2nd ed., 1987.]
Science, 283, 77–80. Marshall, J. C., & Patterson, K. E. (1985). Left is
Marian, V., & Spivey, M. (2003). Bilingual and still left for semantic paralexias: A reply to Jones and
monolingual processing of competing lexical items. Martin. Neuropsychologia, 23, 689–690.
Applied Psycholinguistics, 24, 173–193. Marslen-Wilson, W. D. (1973). Linguistic structure
Marien, P., Engelborghs, S., Fabbro, F., & and speech shadowing at very short latencies. Nature,
De Deyn, P. P. (2001). The lateralized linguistic 244, 522–523.
cerebellum: A review and a new hypothesis. Brain and Marslen-Wilson, W. D. (1975). Sentence perception
Language, 79, 580–600. as an interactive parallel process. Science, 189,
Markman, E. M. (1979). Realizing that you don’t 226–228.
understand: Elementary school children’s awareness of Marslen-Wilson, W. D. (1976). Linguistic
inconsistencies. Child Development, 50, 643–655. descriptions and psychological assumptions in the
Markman, E. M. (1985). Why superordinate category study of sentence perception. In R. J. Wales &
terms can be mass nouns. Cognition, 19, 311–353. E. C. T. Walker (Eds.), New approaches to language
Markman, E. M. (1989). Categorization and naming mechanisms (pp. 203–230). Amsterdam: North
in children. Cambridge, MA: MIT Press. Holland.
Markman, E. M. (1990). Constraints children place Marslen-Wilson, W. D. (1984). Spoken word
on word meanings. Cognitive Science, 14, 57–77. recognition: A tutorial review. In H. Bouma &
Markman, E. M., & Hutchinson, J. E. (1984). D. G. Bouwhuis (Eds.), Attention and performance X:
Children’s sensitivity to constraints on word Control of language processes (pp. 125–150). Hove,
meaning: Taxonomic vs. thematic relations. Cognitive UK: Lawrence Erlbaum Associates.
Psychology, 16, 1–27. Marslen-Wilson, W. D. (1987). Functional
Markman, E. M., & Wachtel, G. F. (1988). parallelism in spoken word recognition. Cognition, 25,
Children’s use of mutual exclusivity to constrain 71–102.
the meaning of words. Cognitive Psychology, 20, Marslen-Wilson, W. D. (Ed.). (1989). Lexical
121–157. representation and process. Cambridge, MA: MIT
Marr, D. (1982). Vision: A computational Press.
investigation into the human representation and Marslen-Wilson, W. D. (1990). Activation,
processing of visual information. San Francisco: W. H. competition, and frequency in lexical access. In
Freeman. G. T. M. Altmann (Ed.), Cognitive models of speech
Marsh, G., Desberg, P., & Cooper, J. (1977). processing (pp. 148–172). Cambridge, MA: MIT Press.
Developmental changes in strategies of reading. Marslen-Wilson, W. D. (1993). Issues of process
Journal of Reading Behaviour, 9, 391–394. and representation in lexical access. In G. Altmann
Marsh, G., Friedman, M. P., Welch, V., & Desberg, P. & R. Shillcock (Eds.), Cognitive models of speech
(1981). A cognitive-developmental theory of reading processing (pp. 187–210). Hove, UK: Lawrence
acquisition. In T. G. Waller & G. E. Mackinnon Erlbaum Associates.
(Eds.), Reading research: Advances in theory and Marslen-Wilson, W. D., & Tyler, L. K. (1980). The
practice (Vol. 3, pp. 199–221). New York: Academic temporal structure of spoken language understanding.
Press. Cognition, 8, 1–71.
Marshall, J., Robson, J., Pring, T., & Chiat, S. Marslen-Wilson, W. D., & Tyler, L. K. (2003).
(1998). Why does monitoring fail in jargon aphasia? Capturing underlying differentiation in the human
Comprehension, judgement, and therapy evidence. language system. Trends in Cognitive Sciences, 7,
Brain and Language, 63, 79–107. 62–63.
Marshall, J. C. (1970). The biology of Marslen-Wilson, W. D., Tyler, L. K., Waksler, R.,
communication in man and animals. In J. Lyons (Ed.), & Older, L. (1994). Morphology and meaning in the
New horizons in linguistics (Vol. 1, pp. 229–242). English mental lexicon. Psychological Review, 101,
Harmondsworth, UK: Penguin. 3–33.
Marslen-Wilson, W. D., & Warren, P. (1994). Levels Martin, N., Weisberg, R. W., & Saffran, E. M.
of perceptual representation and process in lexical (1989). Variables influencing the occurrence of naming
access: Words, phonemes, and features. Psychological errors: Implications for models of lexical retrieval.
Review, 101, 653–675. Journal of Memory and Language, 28, 462–485.
Marslen-Wilson, W. D., & Welsh, A. (1978). Martin, R. C. (1982). The pseudohomophone effect:
Processing interactions and lexical access during The role of visual similarity in non-word decisions.
word recognition in continuous speech. Cognitive Quarterly Journal of Experimental Psychology, 34A,
Psychology, 10, 29–63. 395–410.
Marslen-Wilson, W. D., & Zwitserlood, P. (1989). Martin, R. C. (1993). Short-term memory and
Accessing spoken words: The importance of word sentence processing: Evidence from neuropsychology.
onsets. Journal of Experimental Psychology: Human Memory and Cognition, 21, 176–183.
Perception and Performance, 15, 576–585. Martin, R. C. (1995). Working memory doesn’t work:
Martin, A., & Fedio, P. (1983). Word production A critique of Miyake et al.’s capacity theory of aphasic
and comprehension in Alzheimer’s disease: The comprehension deficits. Cognitive Neuropsychology,
breakdown of semantic knowledge. Brain and 12, 623–636.
Language, 19, 124–141. Martin, R. C., & Breedin, S. D. (1992). Dissociations
Martin, C., Vu, H., Kellas, G., & Metcalf, K. between speech perception and phonological short-
(1999). Strength of discourse context as a determinant term memory deficits. Cognitive Neuropsychology, 9,
of the subordinate bias effect. Quarterly Journal of 509–534.
Experimental Psychology, 52A, 813–839. Martin, R. C., & Feher, E. (1990). The consequences
Martin, G. L. (2004). Encoder: A connectionist model of reduced memory span for the comprehension of
of how learning to visually encode fixated text images semantic versus syntactic information. Brain and
improves reading fluency. Psychological Review, 111, Language, 38, 1–20.
617–639. Martin, R. C., & Lesch, M. F. (1996). Associations
Martin, G. N. (1998). Human neuropsychology. and dissociations between language impairment and
London: Prentice Hall. list recall: Implications for models of STM. In S. E.
Martin, N. (2001). Repetition disorders in aphasia: Gathercole (Ed.), Models of short-term memory (pp.
Theoretical and clinical implications. In R. S. Berndt 149–178). Hove, UK: Psychology Press.
(Ed.), Handbook of neuropsychology (Vol. 3, 2nd ed., Martin, R. C., Lesch, M. F., & Bartha, M. C.
pp. 137–155). Amsterdam: Elsevier Science. (1999). Independence of input and output phonology
Martin, N., & Ayala, J. (2004). Measurements of in word processing and short-term memory. Journal of
auditory-verbal STM in aphasia: Effects of task, Memory and Language, 41, 3–29.
item and word processing impairment. Brain and Martin, R. C., Shelton, J. R., & Yaffee, L. S.
Language, 89, 464–483. (1994). Language processing and working
Martin, N., Dell, G. S., Saffran, E. M., & memory: Neuropsychological evidence for separate
Schwartz, M. F. (1994). Origins of paraphasia in phonological and semantic capacities. Journal of
deep dysphasia: Testing the consequences of a decay Memory and Language, 33, 83–111.
impairment to an interactive spreading activation model Martin, R. C., Wetzel, W. F., Blossom-Stach, C.,
of lexical retrieval. Brain and Language, 47, 609–660. & Feher, E. (1989). Syntactic loss versus processing
Martin, N., & Saffran, E. M. (1990). Repetition and deficit: An assessment of two theories of agrammatism
verbal STM in transcortical sensory aphasia: A case and syntactic comprehension deficits. Cognition, 32,
study. Brain and Language, 39, 254–288. 157–191.
Martin, N., & Saffran, E. M. (1992). A computational Masataka, N. (1996). Perception of motherese in
account of deep dysphasia: Evidence from a single case a signed language by 6-month-old deaf infants.
study. Brain and Language, 43, 240–274. Developmental Psychology, 32, 874–879.
Martin, N., & Saffran, E. M. (1997). Language and Mason, M. K. (1942). Learning to speak after six and
auditory-verbal short-term memory impairments: one half years silence. Journal of Speech and Hearing
Evidence for common underlying processes. Cognitive Disorders, 7, 295–304.
Neuropsychology, 14, 641–682. Mason, R. A., Just, M. A., Keller, T. A., &
Martin, N., & Saffran, E. M. (1998). The Carpenter, P. A. (2003). Ambiguity in the brain:
relationship between input and output phonology: What brain imaging reveals about the processing
Evidence from aphasia. Brain and Language, of syntactically ambiguous sentences. Journal of
65, 225–228. Experimental Psychology: Learning, Memory, and
Martin, N., Saffran, E. M., & Dell, G. S. (1996). Cognition, 29, 1319–1338.
Recovery in deep dysphasia: Evidence for a relation Massaro, D. W. (1987). Speech perception by ear and
between auditory-verbal STM and lexical errors in eye: A paradigm for psychological enquiry. Hillsdale,
repetition. Brain and Language, 52, 83–113. NJ: Lawrence Erlbaum Associates, Inc.
Massaro, D. W. (1989). Testing between the TRACE of a single case. Journal of Neurology, Neurosurgery,
model and the fuzzy logical model of speech and Psychiatry, 49, 1233–1240.
perception. Cognitive Psychology, 21, 398–421. McCarthy, R. A., & Warrington, E. K. (1987a). The
Massaro, D. W. (1994). Psychological aspects of double dissociation of short-term memory for lists and
speech perception: Implications for research and sentences: Evidence from aphasia. Brain, 110, 1545–1563.
theory. In M. A. Gernsbacher (Ed.), Handbook of McCarthy, R. A., & Warrington, E. K. (1987b).
psycholinguistics (pp. 219–264). San Diego, CA: Understanding: A function of short-term memory?
Academic Press. Brain, 110, 1565–1578.
Massaro, D. W., & Cohen, M. M. (1991). Integration McCarthy, R. A., & Warrington, E. K. (1988).
versus interactive activation: The joint influence Evidence for modality-specific meaning systems in the
of stimulus and context in perception. Cognitive brain. Nature, 334, 428–430.
Psychology, 23, 558–614. McCauley, C., Parmalee, C. M., Sperber, R. D., &
Massaro, D. W., & Oden, G. C. (1995). Carr, T. H. (1980). Early extraction of meaning from
Independence of lexical context and phonological pictures and its relation to conscious identification.
information in speech perception. Journal of Journal of Experimental Psychology: Human
Experimental Psychology: Learning, Memory, and Perception and Performance, 6, 265–276.
Cognition, 21, 1053–1064. McClelland, J. L. (1979). On the time relations of mental
Masson, M. E. J. (1995). A distributed memory processes: An examination of systems of processes in
model of semantic priming. Journal of Experimental cascade. Psychological Review, 86, 287–330.
Psychology: Learning, Memory, and Cognition, 21, McClelland, J. L. (1981). Retrieving general and
3–23. specific information from stored knowledge of
Masterson, J., Coltheart, M., & Meara, P. (1985). specifics. Proceedings of the 3rd Annual Conference of
Surface dyslexia in a language without irregularly spelled the Cognitive Science Society, 170–172.
words. In K. E. Patterson, J. C. Marshall, & M. Coltheart McClelland, J. L. (1987). The case for interactions in
(Eds.), Surface dyslexia: Neuropsychological and language processing. In M. Coltheart (Ed.), Attention
cognitive studies of phonological reading (pp. 215–223). and performance XII: The psychology of reading (pp.
Hove, UK: Lawrence Erlbaum Associates. 3–36). Hove, UK: Lawrence Erlbaum Associates.
Masur, E. F. (1997). Maternal labelling of novel McClelland, J. L. (1991). Stochastic interactive
and familiar objects: Implications for children’s processes and the effect of context on perception.
development of lexical constraints. Journal of Child Cognitive Psychology, 23, 1–44.
Language, 24, 427–439. McClelland, J. L., & Elman, J. L. (1986). The
Mattys, S. L., & Jusczyk, P. W. (2001). Phonotactic TRACE model of speech perception. Cognitive
cues for segmentation of fluent speech by infants. Psychology, 18, 1–86.
Cognition, 78, 91–121. McClelland, J. L., & Patterson, K. E. (2002). Rules
Mayberry, E. J., Sage, K., & Lambon Ralph, M. A. or connections in past-tense inflections: What does
(2011). At the edge of semantic space: The breakdown the evidence rule out? Trends in Cognitive Sciences, 6,
of coherent concepts in semantic dementia is 465–472.
constrained by typicality and severity but not modality. McClelland, J. L., & Patterson, K. E. (2003).
Journal of Cognitive Neuroscience, 23, 2240–2251. Differentiation and integration in human language.
Mazuka, R. (1991). Processing of empty categories Trends in Cognitive Sciences, 7, 63–64.
in Japanese. Journal of Psycholinguistic Research, 20, McClelland, J. L., & Rumelhart, D. E. (1981). An
215–232. interactive activation model of context effects in letter
McBride-Chang, C. (2004). Children’s literacy perception: Part 1. An account of the basic findings.
development. London: Arnold. Psychological Review, 88, 375–407.
McCann, R. S., & Besner, D. (1987). Reading McClelland, J. L., & Rumelhart, D. E. (1988).
pseudohomophones: Implications for models of Explorations in parallel distributed processing.
pronunciation assembly and the locus of word frequency Cambridge, MA: MIT Press.
effects in naming. Journal of Experimental Psychology: McClelland, J. L., Rumelhart, D. E., & the PDP
Human Perception and Performance, 13, 14–24. Research Group. (1986). Parallel distributed
McCarthy, J. J. (2001). A thematic guide to processing: Vol. 2. Psychological and biological
optimality theory. Cambridge: Cambridge University models. Cambridge, MA: MIT Press.
Press. McClelland, J. L., & Seidenberg, M. S. (2000).
McCarthy, J. J., & Prince, A. (1990). Foot and word Words and rules—the ingredients of language by
in prosodic morphology: The Arabic broken plural. Pinker, S. Science, 287, 47–48.
Natural Language and Linguistic Theory, 8, 209–283. McCloskey, M. (1980). The stimulus familiarity
McCarthy, R. A., & Warrington, E. K. (1986). problem in semantic memory research. Journal of
Visual associative agnosia: A clinico-anatomical study Verbal Learning and Verbal Behavior, 19, 485–504.
McCloskey, M., & Caramazza, A. (1988). Theory structures in story understanding. Journal of Memory
and methodology in cognitive neuropsychology: A and Language, 28, 711–734.
response to our critics. Cognitive Neuropsychology, 5, McKoon, G., Ratcliff, R., & Ward, G. (1994).
583–623. Testing theories of language processing: An empirical
McCloskey, M., & Glucksberg, S. (1978). Natural investigation of the on-line lexical decision task.
categories: Well-defined or fuzzy sets? Memory and Journal of Experimental Psychology: Learning,
Cognition, 6, 462–472. Memory, and Cognition, 20, 1219–1228.
McConkie, G. W., & Rayner, K. (1976). Asymmetry McLaughlin, B. (1984). Second language acquisition
of the perceptual span in reading. Bulletin of the in childhood (2nd ed.). Hillsdale, NJ: Lawrence
Psychonomic Society, 8, 365–368. Erlbaum Associates, Inc.
McCune-Nicolich, L. (1981). The cognitive bases of McLaughlin, B. (1987). Theories of second-language
relational words in the single word period. Journal of learning. London: Arnold.
Child Language, 8, 15–34. McLaughlin, B., & Heredia, R. (1996). Information-
McCutchen, D., & Perfetti, C. A. (1982). The visual processing approaches to research on second language
tongue-twister effect: Phonological activation in acquisition and use. In W. C. Ritchie & T. K. Bhatia
silent reading. Journal of Verbal Learning and Verbal (Eds.), Handbook of second language acquisition (pp.
Behavior, 21, 672–687. 213–228). London: Academic Press.
McDavid, V. (1964). The alternation of that and zero McLeod, P., Shallice, T., & Plaut, D. C. (2000).
in noun clauses. American Speech, 39, 102–113. Attractor dynamics in word recognition: Converging
McDonald, J. L., Bock, J. K., & Kelly, M. H. (1993). evidence from errors by normal subjects, dyslexic
Word order and world order: Semantic, phonological, patients and a connectionist model. Cognition, 74,
and metrical determinants of serial position. Cognitive 91–113.
Psychology, 25, 188–230. McNamara, T. P. (1992). Theories of priming: I.
McDonald, S. A., Carpenter, R. H. S., & Associative distance and lag. Journal of Experimental
Shillcock, R. C. (2005). An anatomically constrained, Psychology: Learning, Memory, and Cognition, 18,
stochastic model of eye movement control in reading. 1173–1190.
Psychological Review, 112, 814–840. McNamara, T. P. (1994). Theories of priming: II.
McGurk, H., & MacDonald, J. (1976). Hearing lips Types of prime. Journal of Experimental Psychology:
and seeing voices. Nature, 264, 746–748. Learning, Memory, and Cognition, 20, 507–520.
McKoon, G., Gerrig, R. J., & Greene, S. B. McNamara, T. P., & Altarriba, J. (1988). Depth of
(1996). Pronoun resolution without pronouns: Some spreading activation revisited: Semantic mediated
consequences of memory-based text processing. priming occurs in lexical decisions. Journal of
Journal of Experimental Psychology: Learning, Memory and Language, 27, 545–559.
Memory, and Cognition, 22, 919–932. McNamara, T. P., & Miller, D. L. (1989). Attributes
McKoon, G., & Ratcliff, R. (1986). Inferences of theories of meaning. Psychological Bulletin, 106,
about predictable events. Journal of Experimental 355–376.
Psychology: Learning, Memory, and Cognition, 12, McQueen, J. (1991). The influence of the lexicon on
82–91. phonetic categorisation: Stimulus quality and word-
McKoon, G., & Ratcliff, R. (1989). Semantic final ambiguity. Journal of Experimental Psychology:
associations and elaborative inference. Journal of Human Perception and Performance, 17, 433–443.
Experimental Psychology: Learning, Memory, and McRae, K., & Boisvert, S. (1998). Automatic semantic
Cognition, 15, 326–338. similarity priming. Journal of Experimental Psychology:
McKoon, G., & Ratcliff, R. (1992). Inference during Learning, Memory, and Cognition, 24, 558–572.
reading. Psychological Review, 99, 440–466. McRae, K., de Sa, V. R., & Seidenberg, M. S.
McKoon, G., & Ratcliff, R. (2002). Event templates (1997). On the nature and scope of featural
in the lexical representations of verbs. Cognitive representations of word meaning. Journal of
Psychology, 45, 1–44. Experimental Psychology: General, 126, 99–130.
McKoon, G., & Ratcliff, R. (2003). Meaning through McRae, K., Hare, M., & Tanenhaus, M. K. (2005).
syntax: Language comprehension and the reduced Meaning through syntax is insufficient to explain
relative clause construction. Psychological Review, comprehension of sentences with reduced relative
110, 490–525. clauses: Comment on McKoon and Ratcliff (2003).
McKoon, G., Ratcliff, R., & Dell, G. S. (1986). Psychological Review, 112, 1022–1031.
A critical evaluation of the semantic–episodic McRae, K., Spivey-Knowlton, M. J., & Tanenhaus,
distinction. Journal of Experimental Psychology: M. K. (1998). Modeling the influence of thematic
Learning, Memory, and Cognition, 12, 295–306. fit (and other constraints) in on-line sentence
McKoon, G., Ratcliff, R., & Seifert, C. M. (1989). comprehension. Journal of Memory and Language,
Making the connection: Generalized knowledge 38, 283–312.
McShane, J. (1991). Cognitive development. Oxford: category norms, and word frequency. Bulletin of the
Blackwell. Psychonomic Society, 7, 283–284.
McShane, J., & Dockrell, J. (1983). Lexical and Messer, D. (1980). The episodic structure of maternal
grammatical development. In B. Butterworth (Ed.), speech to young children. Journal of Child Language,
Speech production: Vol. 2. Development, writing, 7, 29–40.
and other language processes (pp. 51–99). London: Messer, D. (2000). State of the art: Language
Academic Press. acquisition. The Psychologist, 13, 138–143.
McVay, J. C., & Kane, M. J. (2012). Why does Metsala, J. L., Stanovich, K. E., & Brown, G. D. A.
working memory capacity predict variation in reading (1998). Regularity effects and the phonological deficit
comprehension? On the influence of mind wandering model of reading disabilities: A meta-analytic review.
and executive attention. Journal of Experimental Journal of Educational Psychology, 90, 279–293.
Psychology: General, 141, 302–320. Meyer, A. S. (1996). Lexical access in phrase and
Medin, D. L., & Schaffer, M. M. (1978). A context sentence production: Results from picture–word
theory of classification learning. Psychological interference experiments. Journal of Memory and
Review, 85, 207–238. Language, 35, 477–496.
Mehler, J. (1963). Some effects of grammatical Meyer, A. S. (2004). The use of eye tracking in
transformations on the recall of English sentences. Journal studies of sentence generation. In J. M. Henderson &
of Verbal Learning and Verbal Behavior, 2, 346–351. F. Ferreira (Eds.), The interface of language, vision,
Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted, N., and action: Eye movements and the visual world (pp.
Bertoncini, J., & Amiel-Tison, C. (1988). A precursor 191–211). Hove, UK: Psychology Press.
of language acquisition in young infants. Cognition, 29, Meyer, A. S., & Bock, K. (1992). The tip-of-the-
143–178. tongue phenomenon: Blocking or partial activation?
Mehler, J., Segui, J., & Carey, P. W. (1978). Tails of Memory and Cognition, 20, 715–726.
words: Monitoring ambiguity. Journal of Verbal Meyer, A. S., Sleiderink, A., & Levelt, W. J. M.
Learning and Verbal Behavior, 17, 29–35. (1998). Viewing and naming objects: Eye movements
Meier, R. P. (1991). Language acquisition by deaf during noun phrase production. Cognition, 66,
children. American Scientist, 79, 60–70. B25–B33.
Melby-Lervåg, M., Lyster, S.-A. H., & Hulme, C. Meyer, A. S., Wheeldon, L., & Krott, A. (Eds.).
(2012). Phonological skills and their role in learning to (2006). Automaticity and control in language
read: A meta-analytic review. Psychological Bulletin, processing. Hove, UK: Psychology Press.
138, 322–352. Meyer, D. E., & Schvaneveldt, R. W. (1971).
Melinger, A., & Dobel, C. (2005). Lexically-driven Facilitation in recognizing pairs of words: Evidence of
syntactic priming. Cognition, 98, B11–B20. a dependence between retrieval operations. Journal of
Melinger, A., & Rahman, R. A. (2013). Lexical Experimental Psychology, 90, 227–235.
selection is competitive: Evidence from indirectly Meyer, D. E., Schvaneveldt, R. W., & Ruddy, M. G.
activated semantic associates during picture naming. (1974). Loci of contextual effects on visual word
Journal of Experimental Psychology: Learning, recognition. In P. M. A. Rabbitt & S. Dornic (Eds.),
Memory, and Cognition, 39, 348–364. Attention and performance V (pp. 98–118). New York:
Menn, L. (1980). Phonological theory and child Academic Press.
phonology. In G. H. Yeni-Komshian, J. F. Kavanagh, Miceli, G., Benvegnu, B., Capasso, R., &
& C. A. Ferguson (Eds.), Child phonology (Vol. 1, pp. Caramazza, A. (1997). The independence of
23–41). New York: Academic Press. phonological and orthographic lexical forms: Evidence
Menyuk, P. (1969). Sentences children use. from aphasia. Cognitive Neuropsychology, 14, 35–69.
Cambridge, MA: MIT Press. Miceli, G., & Capasso, R. (1997). Semantic errors
Menyuk, P., Menn, L., & Silber, R. (1986). Early as neuropsychological evidence for the independence
strategies for the perception and production of and the interaction of orthographic and phonological
words and sounds. In P. Fletcher & M. Garman word forms. Language and Cognitive Processes, 12,
(Eds.), Language acquisition (2nd ed., pp. 198–222). 733–764.
Cambridge: Cambridge University Press. Miceli, G., Mazzucci, A., Menn, L., & Goodglass, H.
Meringer, R., & Mayer, K. (1895). Versprechen und (1983). Contrasting cases of Italian agrammatic aphasia
Verlesen: Eine Psychologisch-Linguistische Studie. without comprehension disorder. Brain and Language,
Stuttgart: Göschen. 19, 65–97.
Mervis, C. B., & Bertrand, J. (1994). Young children Michaels, D. (1977). Linguistic relativity and color
and adults use lexical principles to learn new nouns. terminology. Language and Speech, 20, 333–343.
Child Development, 65, 1646–1662. Milberg, W., Blumstein, S. E., & Dworetzky, B.
Mervis, C. B., Catlin, J., & Rosch, E. (1975). (1987). Processing of lexical ambiguities in aphasia.
Relationships among goodness-of-example, Brain and Language, 31, 138–150.
Miller, D., & Ellis, A. W. (1987). Speech and writing Mintz, T. H., & Gleitman, L. R. (2002). Adjectives
errors in “neologistic jargonaphasia”: A lexical really do modify nouns: The incremental and restricted
activation hypothesis. In M. Coltheart, G. Sartori, & nature of early adjective acquisition. Cognition, 84,
R. Job (Eds.), The cognitive neuropsychology of 267–293.
language (pp. 235–271). Hove, UK: Lawrence Miozzo, M. (2003). On the processing of regular and
Erlbaum Associates. irregular forms of verbs and nouns: Evidence from
Miller, G. A., Heise, G. A., & Lichten, W. (1951). neuropsychology. Cognition, 87, 101–127.
The intelligibility of speech as a function of the Miozzo, M., & Caramazza, A. (1997). Retrieval of
context of the test materials. Journal of Experimental lexical-syntactic features in tip-of-the-tongue states.
Psychology, 41, 329–355. Journal of Experimental Psychology: Learning,
Miller, G. A., & Johnson-Laird, P. N. (1976). Memory, and Cognition, 23, 1410–1423.
Language and perception. Cambridge: Cambridge Mitchell, D. C. (1987). Reading and syntactic
University Press. analysis. In J. R. Beech & A. M. Colley (Eds.),
Miller, G. A., & McKean, K. E. (1964). A Cognitive approaches to reading (pp. 87–112).
chronometric study of some relations between Chichester, UK: John Wiley & Sons Ltd.
sentences. Quarterly Journal of Experimental Mitchell, D. C. (1994). Sentence parsing. In
Psychology, 16, 297–308. M. A. Gernsbacher (Ed.), Handbook of
Miller, G. A., & McNeill, D. (1969). psycholinguistic research (pp. 375–410). San Diego,
Psycholinguistics. In G. Lindzey & E. Aronson (Eds.), CA: Academic Press.
The handbook of social psychology (Vol. 3, pp. 666– Mitchell, D. C., Brysbaert, M., Grondelaers, S., &
794). Reading, MA: Addison-Wesley. Swanepoel, P. (2000). Modifier attachment in Dutch:
Miller, J. L. (1981). Effects of speaking rate on Testing aspects of construal theory. In A. Kennedy,
segmental distinctions. In P. D. Eimas & J. L. Miller R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a
(Eds.), Perspectives on the study of speech (pp. 39–74). perceptual process (pp. 493–516). Oxford: Elsevier.
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. Mitchell, D. C., Cuetos, F., Corley, M. M. B., &
Miller, J. L., & Jusczyk, P. W. (1989). Seeking the Brysbaert, M. (1995). Exposure-based models
neurobiological bases of speech perception. Cognition, of human parsing: Evidence for the use of coarse-
33, 111–137. grained (nonlexical) statistical records. Journal of
Miller, K. F., & Stigler, J. (1987). Counting in Psycholinguistic Research, 24, 469–488.
Chinese: Cultural variations in a basic cognitive skill. Mitchell, D. C., & Holmes, V. M. (1985). The role
Cognitive Development, 2, 279–305. of specific information about the verb in parsing
Millis, M. L., & Button, S. B. (1989). The effect of sentences with local structural ambiguity. Journal of
polysemy on lexical decision time: Now you see it, Memory and Language, 24, 542–559.
now you don’t. Memory and Cognition, 17, 141–147. Miyake, A., Carpenter, P. A., & Just, M. A. (1994).
Mills, A. E. (Ed.). (1983). Language acquisition in the A capacity approach to syntactic comprehension
blind child: Normal and deficient. London: Croom Helm. disorders: Making normal adults perform like aphasic
Mills, A. E. (1987). The development of phonology patients. Cognitive Neuropsychology, 11, 671–717.
in the blind child. In B. Dodd & R. Campbell (Eds.), Moerk, E. (1991). Positive evidence for negative
Hearing by eye: The psychology of lip-reading (pp. evidence. First Language, 11, 219–251.
145–162). Hove, UK: Lawrence Erlbaum Associates. Mohay, H. (1982). A preliminary description of
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. the communication systems evolved by two deaf
(1993). Language acquisition and cerebral children in the absence of a sign language model. Sign
specialization in 20-month-old infants. Journal of Language Studies, 34, 73–90.
Cognitive Neuroscience, 5, 317–334. Molfese, D. L. (1977). Infant cerebral asymmetry.
Mills, D. L., Coffey-Corina, S. A., & Neville, H. J. In S. J. Segalowitz & F. A. Gruber (Eds.), Language
(1997). Language comprehension and cerebral development and neurological theory (pp. 21–35).
specialization from 13 to 20 months. Developmental New York: Academic Press.
Neuropsychology, 13, 397–445. Molfese, D. L., & Molfese, V. J. (1994). Short-term
Milne, R. W. (1982). Predicting garden path and long-term developmental outcomes: The use of
sentences. Cognitive Science, 6, 349–373. behavioral and electrophysiological measures in early
Minsky, M. (1975). A framework for representing infancy as predictors. In G. Dawson & K. W. Fischer
knowledge. In P. H. Winston (Ed.), The psychology (Eds.), Human behavior and the developing brain (pp.
of computer vision (pp. 211–277). New York: 493–517). New York: Guilford Press.
McGraw-Hill. Monaghan, J., & Ellis, A. W. (2002). What exactly
Mintz, T. H. (2003). Frequent frames as a cue for interacts with spelling–sound consistency in word
grammatical categories in child directed speech. naming? Journal of Experimental Psychology:
Cognition, 90, 91–117. Learning, Memory, and Cognition, 28, 183–206.
Monaghan, P., & Ellis, A. W. (2010). Modeling Morrison, C. M., & Ellis, A. W. (1995). Roles
reading development: Cumulative, incremental of word frequency and age of acquisition in word
learning in a computational model of word naming. naming and lexical decision. Journal of Experimental
Journal of Memory and Language, 63, 506–525. Psychology: Learning, Memory, and Cognition, 21,
Monsell, S. (1985). Repetition and the lexicon. In A. 116–133.
W. Ellis (Ed.), Progress in psychology of language Morrison, C. M., & Ellis, A. W. (2000). Real age of
(Vol. 2, pp. 147–195). Hove, UK: Lawrence Erlbaum acquisition effects in word naming. British Journal of
Associates. Psychology, 91, 167–180.
Monsell, S. (1987). On the relation between lexical Morrison, C. M., Ellis, A. W., & Quinlan, P. T.
input and output pathways for speech. In A. Allport, (1992). Age of acquisition, not word frequency, affects
D. MacKay, W. Prinz, & E. Scheerer (Eds.), Language object naming, not object recognition. Memory and
perception and production: Shared mechanisms in Cognition, 20, 705–714.
listening, speaking, reading, and writing (pp. 273– Morrow, D. G., Bower, G. H., & Greenspan, S. L.
311). London: Academic Press. (1989). Updating situation models during narrative
Monsell, S., Doyle, M. C., & Haggard, P. N. (1989). comprehension. Journal of Memory and Language,
Effects of frequency on visual word recognition tasks: 28, 292–312.
Where are they? Journal of Experimental Psychology: Morrow, D. G., Greenspan, S. L., & Bower, G. H.
General, 118, 43–71. (1987). Accessibility and situation models in narrative
Monsell, S., & Hirsh, K. W. (1998). Competitor comprehension. Journal of Memory and Language,
priming in spoken word recognition. Journal of 26, 165–187.
Experimental Psychology: Learning, Memory, and Morsella, E., & Miozzo, M. (2002). Evidence for a
Cognition, 24, 1495–1520. cascade model of lexical access in speech production.
Monsell, S., Matthews, G. H., & Miller, D. C. Journal of Experimental Psychology: Learning,
(1992). Repetition of lexicalization across languages: Memory, and Cognition, 28, 555–563.
A further test of the locus of priming. Quarterly Morton, J. (1969). Interaction of information in word
Journal of Experimental Psychology, 44A, 763–783. recognition. Psychological Review, 76, 165–178.
Monsell, S., Patterson, K. E., Graham, A., Hughes, Morton, J. (1970). A functional model for human
C. H., & Milroy, R. (1992). Lexical and sublexical memory. In D. A. Norman (Ed.), Models of human
translations of spelling to sound: Strategic anticipation memory (pp. 203–260). New York: Academic Press.
of lexical status. Journal of Experimental Psychology: Morton, J. (1979a). Word recognition. In J. Morton &
Learning, Memory, and Cognition, 18, 452–467. J. C. Marshall (Eds.), Psycholinguistics series: Vol. 2.
Morais, J., Bertelson, P., Cary, L., & Alegria, J. Structures and processes (pp. 107–156). London: Paul
(1986). Literacy training and speech segmentation. Elek.
Cognition, 24, 45–64. Morton, J. (1979b). Facilitation in word recognition:
Morais, J., Cary, L., Alegria, J., & Bertelson, P. Experiments causing change in the logogen model.
(1979). Does awareness of speech as a sequence of In P. A. Kolers, M. E. Wrolstad, & H. Bouma (Eds.),
phones arise spontaneously? Cognition, 7, 323–331. Processing of visible language (pp. 259–268). New
Morais, J., & Kolinsky, R. (1994). Perception and York: Plenum.
awareness in phonological processing: The case of the Morton, J. (1984). Brain-based and non-brain-based
phoneme. Cognition, 50, 287–297. models of language. In D. Caplan, A. R. Lecours, &
Moreno, E. M., & Kutas, M. (2009). Processing A. Smith (Eds.), Biological perspectives in language
semantic anomaly in two languages: An (pp. 40–64). Cambridge, MA: MIT Press.
electrophysiological exploration in both languages, of Morton, J. (1985). Naming. In S. Newman & R.
Spanish–English bilinguals. Cognitive Brain Research, Epstein (Eds.), Current perspectives in dysphasia (pp.
22, 205–220. 217–230). Edinburgh: Churchill Livingstone.
Morgan, J. L., & Travis, L. L. (1989). Limits on Morton, J., & Patterson, K. E. (1980). A new
negative information in language input. Journal of attempt at an interpretation, or, an attempt at a new
Child Language, 16, 531–552. interpretation. In M. Coltheart, K. E. Patterson, &
Morris, A. L., & Harris, C. L. (2002). Sentence J. C. Marshall (Eds.), Deep dyslexia (pp. 91–118).
context, word recognition, and repetition blindness. London: Routledge & Kegan Paul. [2nd ed., 1987.]
Journal of Experimental Psychology: Learning, Moss, H. E., & Marslen-Wilson, W. D. (1993).
Memory, and Cognition, 28, 962–982. Access to word meanings during spoken language
Morrison, C. M., Chappell, T. D., & Ellis, A. W. comprehension: Effects of sentential semantic context.
(1997). Age of acquisition norms for a large set of Journal of Experimental Psychology: Learning,
object names and their relation to adult estimates and Memory, and Cognition, 19, 1254–1276.
other variables. Quarterly Journal of Experimental Moss, H. E., McCormick, S. F., & Tyler, L. K.
Psychology, 50A, 528–559. (1997). The time course of activation of semantic
information during spoken word recognition. Naigles, L. R. (2003). Paradox lost? No, paradox
Language and Cognitive Processes, 12, 695–731. found! Reply to Tomasello and Akhtar (2003).
Moss, H. E., Ostrin, R. K., Tyler, L. K., & Marslen- Cognition, 88, 325–329.
Wilson, W. D. (1995). Accessing different types of Nation, K., & Snowling, M. J. (1998). Semantic
lexical semantic information: Evidence from priming. processing and the development of word-recognition
Journal of Experimental Psychology: Learning, skills: Evidence from children with reading
Memory, and Cognition, 21, 863–883. comprehension difficulties. Journal of Memory and
Motley, M. T., & Baars, B. J. (1976). Semantic bias Language, 39, 85–101.
effects on the outcomes of verbal slips. Cognition, 4, Navarrete, E., Basagni, B., Alario, F.-X., & Costa, A.
177–187. (2006). Does word frequency affect lexical selection in
Motley, M. T., Camden, C. T., & Baars, B. J. (1982). speech production? Quarterly Journal of Experimental
Covert formulation and editing of anomalies in speech Psychology, 59, 1681–1690.
production: Evidence from experimentally elicited Nazzi, T., Bertoncini, J., & Mehler, J. (1998).
slips of the tongue. Journal of Verbal Learning and Language discrimination by newborns: Towards
Verbal Behavior, 21, 578–594. an understanding of the role of rhythm. Journal of
Mowrer, O. H. (1960). Learning theory and symbolic Experimental Psychology: Human Perception and
processes. New York: John Wiley & Sons. Performance, 24, 756–766.
Mulford, R. (1988). First words of the blind child. Nebes, R. D. (1989). Semantic memory in Alzheimer’s
In M. D. Smith & J. L. Locke (Eds.), The emergent disease. Psychological Bulletin, 106, 377–394.
lexicon: The child’s development of a linguistic Neely, J. H. (1977). Semantic priming and retrieval
vocabulary (pp. 293–338). New York: Academic Press. from lexical memory: Roles of inhibitionless spreading
Muller, R.-A. (1997). Innateness, autonomy, activation and limited capacity attention. Journal of
universality? Neurobiological approaches to language. Experimental Psychology: General, 106, 226–254.
Behavioral and Brain Sciences, 19, 611–675. Neely, J. H. (1991). Semantic priming effects in
Murphy, G. L. (1985). Processes of understanding visual word recognition: A selective review of
anaphora. Journal of Memory and Language, 24, current findings and theories. In D. Besner & G.
290–303. W. Humphreys (Eds.), Basic processes in reading:
Murphy, G. L., & Medin, D. L. (1985). The role Visual word recognition (pp. 264–336). Hillsdale, NJ:
of theories in conceptual coherence. Psychological Lawrence Erlbaum Associates, Inc.
Review, 92, 289–316. Neely, J. H., Keefe, D. E., & Ross, K. (1989).
Murray, W. S., & Forster, K. I. (2004). Serial Semantic priming in the lexical decision task:
mechanisms in lexical access: The rank hypothesis. Roles of prospective prime-generated expectancies
Psychological Review, 111, 721–756. and retrospective relation-checking. Journal of
Muter, V., Hulme, C., Snowling, M., & Taylor, S. Experimental Psychology: Learning, Memory, and
(1998). Segmentation, not rhyming, predicts early Cognition, 15, 1003–1019.
progress in learning to read. Journal of Experimental Negnevitsky, M. (2004). Artificial intelligence:
Child Psychology, 71, 3–27. A guide to intelligent systems. Reading, MA:
Muter, V., Snowling, M. J., & Taylor, S. (1994). Addison-Wesley.
Orthographic analogies and phonological awareness: Neisser, U. (1981). John Dean’s memory: A case
Their role and significance in early reading study. Cognition, 9, 1–22.
development. Journal of Child Psychology and Nelson, K. (1973). Structure and strategy in learning
Psychiatry, 35, 293–310. to talk. Monographs of the Society for Research in
Myers, J. L., & O’Brien, E. J. (1998). Accessing the Child Development, 38 (Serial No. 149).
discourse representation during reading. Discourse Nelson, K. (1974). Concept, word, and sentence:
Processes, 26, 131–157. Inter-relations in acquisition and development.
Nagy, W., & Anderson, R. (1984). The number of Psychological Review, 81, 267–285.
words in printed school English. Reading Research Nelson, K. (1987). What’s in a name? Reply to
Quarterly, 19, 304–330. Seidenberg and Petitto. Journal of Experimental
Naigles, L. R. (1990). Children use syntax to learn Psychology: General, 116, 293–296.
verb meanings. Journal of Child Language, 17, Nelson, K. (1988). Constraints on word meaning?
357–374. Cognitive Development, 3, 221–246.
Naigles, L. R. (1996). The use of multiple frames in Nelson, K. (1990). Comment on Behrend’s
verb learning via syntactic bootstrapping. Cognition, “Constraints and development.” Cognitive
58, 221–251. Development, 5, 331–339.
Naigles, L. R. (2002). Form is easy, meaning is Nelson, K., Hampson, J., & Shaw, L. K. (1993).
hard: Resolving a paradox in early child language. Nouns in early lexicons: Evidence, explanations and
Cognition, 86, 157–199. implications. Journal of Child Language, 20, 61–84.
Nespoulous, J.-L., Dordain, M., Perron, C., Ska, B., Nicol, J. (1993). Reconsidering reactivation. In
Bub, D., Caplan, D., et al. (1988). Agrammatism in G. Altmann & R. Shillcock (Eds.), Cognitive models
sentence production without comprehension deficits: of speech processing (pp. 321–347). Hove, UK:
Reduced availability of syntactic structures and/or Lawrence Erlbaum Associates.
of grammatical morphemes? A case study. Brain and Nicol, J., & Swinney, D. (1989). The role of
Language, 33, 273–295. structure in coreference assignment during sentence
Neville, H., Nicol, J. L., Barss, A., Forster, K. I., & comprehension. Journal of Psycholinguistic Research,
Garrett, M. F. (1991). Syntactically based sentence 18, 5–9.
processing classes: Evidence from event-related brain Nigam, A., Hoffman, J. E., & Simons, R. F. (1992).
potentials. Journal of Cognitive Neuroscience, 3, N400 to semantically anomalous pictures and words.
151–165. Journal of Cognitive Neuroscience, 4, 15–22.
Newcombe, F., & Marshall, J. C. (1980). Ninio, A. (1980). Ostensive definition in vocabulary
Transcoding and lexical stabilization in deep dyslexia. teaching. Journal of Child Language, 7, 565–573.
In M. Coltheart, K. E. Patterson, & J. C. Marshall Nisbett, R. E. (2003). The geography of thought.
(Eds.), Deep dyslexia (pp. 176–188). London: London: Nicholas Brealey.
Routledge & Kegan Paul. [2nd ed., 1987.] Nishimura, M. (1986). Intrasentential codeswitching:
Newcombe, F., & Marshall, J. C. (1985). Reading The case of language assignment. In J. Vaid (Ed.),
and writing by letter sounds. In K. E. Patterson, Language processing in bilinguals (pp. 123–143).
J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
(pp. 34–51). Hove, UK: Lawrence Erlbaum Associates. Noppeney, U., & Price, C. J. (2002). A PET study
Newman, F., & Holzman, L. (Eds.). (1993). Lev of stimulus- and task-induced semantic processing.
Vygotsky: Revolutionary scientist. London: Routledge. NeuroImage, 15, 927–935.
Newmark, L. (1966). How not to interfere with Norman, D. A., & Rumelhart, D. E. (1975). Memory
language learning. International Journal of American and knowledge. In D. A. Norman, D. E. Rumelhart,
Linguistics, 32, 77–83. & the LNR Research Group (Eds.), Explorations in
Newport, E. L. (1990). Maturational constraints on cognition (pp. 3–32). San Francisco: Freeman.
language learning. Cognitive Science, 14, 11–28. Norris, D. (1984). The effects of frequency, repetition,
Newport, E. L., & Meier, R. P. (1985). The and stimulus quality in visual word recognition.
acquisition of American Sign Language. In Quarterly Journal of Experimental Psychology, 36A,
D. I. Slobin (Ed.), The cross-linguistic study of 507–518.
language acquisition: Vol. 1. The data (pp. 882–938). Norris, D. (1986). Word recognition: Context effects
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc. without priming. Cognition, 22, 93–136.
Newton, P. K., & Barry, C. (1997). Concreteness Norris, D. (1990). A dynamic-net model of human
effects in word production but not word speech recognition. In G. T. M. Altmann (Ed.),
comprehension in deep dyslexia. Cognitive Cognitive models of speech processing (pp. 87–104).
Neuropsychology, 14, 481–509. Cambridge, MA: MIT Press.
Ni, W., Constable, R. T., Mencl, W. E., Pugh, K. R., Norris, D. (1993). Bottom-up connectionist models
Fulbright, R. K., Shaywitz, S. E., et al. (2000). of “interaction.” In G. Altmann & R. Shillcock (Eds.),
An event-related neuroimaging study distinguishing Cognitive models of speech processing (pp. 211–234).
form and content in sentence processing. Journal of Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Cognitive Neuroscience, 12, 120–133. Norris, D. (1994a). A quantitative multiple-levels
Ni, W., Crain, S., & Shankweiler, D. (1996). model of reading aloud. Journal of Experimental
Sidestepping garden paths: Assessing the contributions Psychology: Human Perception and Performance, 20,
of syntax, semantics, and plausibility in resolving 1212–1232.
ambiguities. Language and Cognitive Processes, 11, Norris, D. (1994b). Shortlist: A connectionist model
283–334. of continuous speech recognition. Cognition, 52,
Nickels, L., & Howard, D. (1994). A frequent 189–234.
occurrence? Factors affecting the production of Norris, D., & Brown, G. D. A. (1985). Race
semantic errors in aphasic naming. Cognitive models and analogy theories: A dead heat? Reply to
Neuropsychology, 11, 289–320. Seidenberg. Cognition, 20, 155–168.
Nickels, L., & Howard, D. (1995). Phonological Norris, D., McQueen, J. M., & Cutler, A. (2000).
errors in aphasic naming: Comprehension monitoring Merging information in speech recognition: Feedback
and lexicality. Cortex, 31, 209–237. is never necessary. Behavioral and Brain Sciences, 23,
Nickels, L., Howard, D., & Best, W. (1997). 299–370.
Fractionating the articulatory loop: Dissociations Norris, D., McQueen, J. M., & Cutler, A. (2003).
and associations in phonological recoding in aphasia. Perceptual learning in speech. Cognitive Psychology,
Brain and Language, 56, 161–182. 47, 204–238.
Norris, D., McQueen, J. M., Cutler, A., & comparison with normal metaphonological processes.
Butterfield, S. (1997). The possible-word constraint Journal of Speech and Hearing Research, 28, 47–63.
in the segmentation of continuous speech. Cognitive Oller, D. K., Wieman, L. A., Doyle, W. J., &
Psychology, 34, 191–243. Ross, C. (1976). Infant babbling and speech. Journal
Nosofsky, R. M. (1991). Tests of an exemplar model of Child Language, 3, 1–11.
for relating perceptual classification and recognition Olsen, T. S., Bruhn, P., & Öberg, R. (1986). Cortical
memory. Journal of Experimental Psychology: Human hypoperfusion as a possible cause of “subcortical
Perception and Performance, 17, 3–27. aphasia.” Brain, 109, 393–410.
Nosofsky, R. M., & Palmeri, T. J. (1997). An Olson, R. K. (1994). Language deficits in “specific”
exemplar-based random walk model of speeded reading disability. In M. A. Gernsbacher (Ed.), Handbook
classification. Psychological Review, 104, 266–300. of psycholinguistics (pp. 895–916). San Diego, CA:
Nowak, M. A. (2006). Evolutionary dynamics. Academic Press.
Cambridge, MA: Harvard University Press. Olson, R. K., Kliegl, R., Davidson, B. J., & Foltz, G.
Nozari, N., Dell, G. S., & Schwartz, M. F. (2011). (1984). Individual and developmental differences in
Is comprehension necessary for error detection? reading disability. In G. E. MacKinnon & T. G. Waller
A conflict-based account of monitoring in speech (Eds.), Reading research: Advances in theory and
production. Cognitive Psychology, 63, 1–33. practice (Vol. 4, pp. 1–64). New York: Academic Press.
Oakhill, J. (1994). Individual differences in children’s Onifer, W., & Swinney, D. A. (1981). Accessing
text comprehension. In M. A. Gernsbacher (Ed.), lexical ambiguities during sentence comprehension:
Handbook of psycholinguistics (pp. 821–848). San Effects of frequency of meaning and contextual bias.
Diego, CA: Academic Press. Memory and Cognition, 9, 225–236.
Obler, L. (1981). Right hemisphere participation in Oppenheim, G. M. (2012). The case for subphonemic
second language acquisition. In K. C. Diller (Ed.), attenuation in inner speech: Comment on Corley,
Individual differences and universals in language learning Brocklehurst, and Moat (2011). Journal of
aptitude (pp. 53–64). Rowley, MA: Newbury House. Experimental Psychology: Learning, Memory, and
Obler, L. K., & Hannigan, S. (1996). Cognition, 38, 502–512.
Neurolinguistics of second language acquisition and Oppenheim, G. M., & Dell, G. S. (2008). Inner
use. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook speech slips exhibit lexical bias, but not the phonemic
of second language acquisition (pp. 509–523). similarity effect. Cognition, 106, 528–537.
London: Academic Press. Orchard, G. A., & Phillips, W. A. (1991). Neural
O’Brien, E. J. (1987). Antecedent search processes computation: A beginner’s guide. Hove, UK:
and the structure of text. Journal of Experimental Lawrence Erlbaum Associates.
Psychology: Learning, Memory, and Cognition, 13, Orwell, G. (1949). Nineteen eighty-four.
278–290. Harmondsworth, UK: Penguin.
O’Brien, E. J., Cook, A. E., & Peracchi, K. A. O’Seaghdha, P. G. (1997). Conjoint and dissociable
(2004). Updating situation models: Reply to Zwaan effects of syntactic and semantic context. Journal of
and Madden (2004). Journal of Experimental Experimental Psychology: Learning, Memory, and
Psychology: Learning, Memory, and Cognition, 30, Cognition, 23, 807–828.
289–291. Osgood, C. E., & Sebeok, T. A. (Eds.). (1954).
O’Brien, E. J., Rizzella, M. L., Albrecht, J. E., & Psycholinguistics: A survey of theory and research
Halleran, J. G. (1998). Updating a situation model: problems (pp. 93–101). Bloomington: Indiana
A memory-based text processing view. Journal of University Press. [Reprinted 1965.]
Experimental Psychology: Learning, Memory, and Osterhout, L., & Holcomb, P. J. (1992). Event-
Cognition, 24, 1200–1210. related potentials elicited by syntactic anomaly.
Obusek, C. J., & Warren, R. M. (1973). Relation of Journal of Memory and Language, 31, 785–806.
the verbal transformation and the phonemic restoration Osterhout, L., Holcomb, P. J., & Swinney, D. A.
effects. Cognitive Psychology, 5, 97–107. (1994). Brain potentials elicited by garden-path
Ochs, E., & Schieffelin, B. (1995). The impact of sentences: Evidence of the application of verb
language socialization on grammatical development. information during parsing. Journal of Experimental
In P. Fletcher & B. MacWhinney (Eds.), Handbook of Psychology: Learning, Memory, and Cognition, 20,
child language (pp. 73–94). Oxford: Blackwell. 786–803.
Oller, D. K. (1980). The emergence of sounds of Osterhout, L., & Nicol, J. (1999). On the
speech in infancy. In G. H. Yeni-Komshian, J. F. distinctiveness, independence, and time course of the
Kavanagh, & C. A. Ferguson (Eds.), Child phonology brain responses to syntactic and semantic anomalies.
(Vol. 1, pp. 93–112). New York: Academic Press. Language and Cognitive Processes, 14, 283–317.
Oller, D. K., Eilers, R. E., Bull, D. H., & Carney, A. E. Ostrin, R. K., & Schwartz, M. F. (1986).
(1985). Prespeech vocalizations of a deaf infant: A Reconstructing from a degraded trace—a study
of sentence repetition in agrammatism. Brain and
Language, 28, 328–345.
O’Sullivan, C., & Yeager, C. P. (1989).
Communicative context and linguistic competence:
The effects of social setting on a chimpanzee’s
conversational skills. In R. A. Gardner & T. E.
van Cantfort (Eds.), Teaching sign language to
chimpanzees (pp. 269–279). Albany, NY: SUNY Press.
Owens, R. E., Jr. (2004). Language development: An
introduction (6th ed.). Columbus, OH: Merrill.
Paap, K. R., Newsome, S., McDonald, J. E., &
Schvaneveldt, R. W. (1982). An activation-verification
model for letter and word recognition: The word
superiority effect. Psychological Review, 89, 573–594.
Pachella, R. G. (1974). The interpretation of reaction
time in information processing research. In B. H.
Kantowitz (Ed.), Human information processing:
Tutorials in performance and cognition (pp. 41–82).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Paget, R. (1930). Human speech. New York: Harcourt
Brace.
Paivio, A. (1971). Imagery and verbal processes.
London: Holt, Rinehart & Winston.
Paivio, A., Clark, J. M., & Lambert, W. E. (1988).
Bilingual dual-coding theory and semantic-repetition
effects. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 14, 163–172.
Paivio, A., Yuille, J. C., & Madigan, S. (1968).
Concreteness, imagery, and meaningfulness values
of 925 nouns. Journal of Experimental Psychology
Monographs, 76, 1–25.
Palmer, J., MacLeod, C. M., Hunt, E., & Davidson,
J. E. (1985). Information processing correlates of
reading. Journal of Verbal Learning and Verbal
Behavior, 24, 59–88.
Papafragou, A., Massey, C., & Gleitman, L. (2002).
Shake, rattle, ’n’ roll: The representation of motion in
language and cognition. Cognition, 84, 189–219.
Papagno, C., Valentine, T., & Baddeley, A. (1991).
Phonological short-term memory and foreign-
language vocabulary learning. Journal of Memory and
Language, 30, 331–347.
Paquier, P. F., & Marien, P. (2005). A synthesis of
the role of the cerebellum in cognition. Aphasiology,
19, 3–19.
Paradis, M. (1997). The cognitive neuropsychology
of bilingualism. In A. M. B. de Groot & J. F. Kroll
(Eds.), Tutorials in bilingualism: Psycholinguistic
perspectives (pp. 331–354). Mahwah, NJ: Lawrence
Erlbaum Associates, Inc.
Parkin, A. J. (1982). Phonological recoding in lexical
decision: Effects of spelling-to-sound regularity
depend on how regularity is defined. Memory and
Cognition, 10, 43–53.
Parkin, A. J., & Stewart, F. (1993). Category-specific
impairments? No. A critique of Sartori et al. Quarterly
Journal of Experimental Psychology, 46A, 505–509.
Patterson, F. (1981). The education of Koko. New
York: Holt, Rinehart & Winston.
Patterson, K. E. (1980). Derivational errors. In
M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.),
Deep dyslexia (pp. 286–306). London: Routledge &
Kegan Paul. [2nd ed., 1987.]
Patterson, K. E., & Besner, D. (1984). Is the right
hemisphere literate? Cognitive Neuropsychology, 3,
341–367.
Patterson, K. E., Graham, N., & Hodges, J. R.
(1994). The impact of semantic memory loss on
phonological representations. Journal of Cognitive
Neuroscience, 6, 57–69.
Patterson, K. E., & Hodges, J. R. (1992).
Deterioration of word meaning: Implications
for reading. Neuropsychologia, 30, 1025–1040.
Patterson, K. E., Marshall, J. C., & Coltheart, M.
(1985a). Surface dyslexia in various orthographies:
Introduction. In K. E. Patterson, J. C. Marshall,
& M. Coltheart (Eds.), Surface dyslexia:
Neuropsychological and cognitive studies of
phonological reading (pp. 209–214). Hove, UK:
Lawrence Erlbaum Associates.
Patterson, K. E., Marshall, J. C., & Coltheart, M.
(Eds.). (1985b). Surface dyslexia: Neuropsychological
and cognitive studies of phonological reading. Hove,
UK: Lawrence Erlbaum Associates.
Patterson, K. E., & Morton, J. (1985). From orthography
to phonology: An attempt at an old interpretation. In K. E.
Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface
dyslexia: Neuropsychological and cognitive studies of
phonological reading (pp. 335–359). Hove, UK: Lawrence
Erlbaum Associates.
Patterson, K. E., Seidenberg, M. S., &
McClelland, J. L. (1989). Connections and
disconnections: Acquired dyslexia in a computational
model of reading processes. In R. G. M. Morris (Ed.),
Parallel distributed processing: Implications for
psychology and neurobiology (pp. 131–181). Oxford:
Clarendon Press.
Patterson, K. E., & Shewell, C. (1987). Speak and
spell: Dissociations and word-class effects. In
M. Coltheart, G. Sartori, & R. Job (Eds.), The
cognitive neuropsychology of language (pp. 273–294).
Hove, UK: Lawrence Erlbaum Associates.
Patterson, K. E., Suzuki, T., & Wydell, T. N. (1996).
Interpreting a case of Japanese phonological alexia:
The key is phonology. Cognitive Neuropsychology, 13,
803–822.
Patterson, K. E., Vargha-Khadem, F., & Polkey, C. E.
(1989). Reading with one hemisphere. Brain, 112,
39–63.
Pearce, J. M. (2008). Animal learning and cognition
(3rd ed.). Hove, UK: Lawrence Erlbaum Associates.
Peal, E., & Lambert, W. E. (1962). The relation
of bilingualism to intelligence. Psychological
Monographs, 76 (27, Whole No. 546).
Pearlmutter, N. J., & MacDonald, M. C. (1995).
Individual differences and probabilistic constraints in
syntactic ambiguity resolution. Journal of Memory
and Language, 34, 521–542.
Peereman, R., & Content, A. (1997). Orthographic
and phonological neighbours in naming: Not all
neighbours are equally influential in orthographic
space. Journal of Memory and Language, 37, 382–410.
Penfield, W., & Roberts, L. (1959). Speech and brain
mechanisms. Princeton, NJ: Princeton University
Press.
Pennington, B. F., & Lefly, D. L. (2001). Early
reading development in children at family risk for
dyslexia. Child Development, 72, 816–833.
Pepperberg, I. M. (1981). Functional vocalizations by
an African grey parrot (Psittacus erithacus). Zeitschrift
für Tierpsychologie, 55, 139–160.
Pepperberg, I. M. (1983). Cognition in the African
grey parrot: Preliminary evidence for auditory/vocal
comprehension of the class concept. Animal Learning
and Behavior, 11, 179–185.
Pepperberg, I. M. (1987). Acquisition of the same/
different concept by an African grey parrot (Psittacus
erithacus): Learning with respect to categories of
color, shape, and material. Animal Learning and
Behavior, 15, 423–432.
Pepperberg, I. M. (1999). Rethinking syntax: A
commentary on E. Kako’s “Elements of syntax in the
systems of three language-trained animals.” Animal
Learning and Behavior, 27, 15–17.
Pepperberg, I. M. (2009). Alex & me: How a scientist
and a parrot discovered a hidden world of animal
intelligence—and formed a deep bond in the process.
New York: Harper Perennial.
Pérez-Pereira, M. (1999). Deixis, personal reference,
and the use of pronouns by blind children. Journal of
Child Language, 26, 655–680.
Pérez-Pereira, M., & Conti-Ramsden, G. (1999).
Language development and social interaction in blind
children. Hove, UK: Psychology Press.
Perfect, T. J., & Hanley, J. R. (1992). The tip-of-
the-tongue phenomenon: Do experimenter-presented
interlopers have any effect? Cognition, 45, 55–75.
Perfetti, C. A. (1994). Psycholinguistics and reading
ability. In M. A. Gernsbacher (Ed.), Handbook of
psycholinguistics (pp. 849–886). San Diego, CA:
Academic Press.
Perfetti, C. A., Bell, L. C., & Delaney, S. M. (1988).
Automatic (prelexical) phonetic activation in silent
word reading: Evidence from backward masking.
Journal of Memory and Language, 27, 59–70.
Perfetti, C. A., & Zhang, S. (1991). Phonological
processes in reading Chinese characters. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 17, 633–643.
Perfetti, C. A., & Zhang, S. (1995). Very early
phonological activation in Chinese reading. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 21, 24–33.
Peters, P. S., & Ritchie, R. W. (1973). Context-
sensitive immediate constituent analysis: Context-free
language revisited. Mathematical Systems Theory, 6,
324–333.
Petersen, S. E., Fox, P. T., Posner, M. I., Mintun, M. E.,
& Raichle, M. E. (1989). Positron emission tomographic
studies of the processing of single words. Journal of
Cognitive Neuroscience, 1, 153–170.
Petersen, S. E., van Mier, H., Fiez, J. A., &
Raichle, M. E. (1998). The effects of practice on the
functional anatomy of task performance. Proceedings
of the National Academy of Sciences USA, 95, 853–860.
Peterson, R. R., & Savoy, P. (1998). Lexical
selection and phonological encoding during language
production: Evidence for cascaded processing. Journal
of Experimental Psychology: Learning, Memory, and
Cognition, 24, 539–557.
Petitto, L. (1987). On the autonomy of language and
gesture: Evidence from the acquisition of personal
pronouns in American Sign Language. Cognition, 27,
1–52.
Petitto, L. (1988). “Language” in the prelinguistic
child. In F. S. Kessel (Ed.), The development of
language and language disorders (pp. 187–222).
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Petitto, L. A., Holowka, S., Sergio, L. E., Levy, B., &
Ostry, D. J. (2004). Baby hands that move to the rhythm
of language: Hearing babies acquiring sign languages
babble silently on the hands. Cognition, 93, 43–73.
Petitto, L. A., & Marentette, P. F. (1991). Babbling
in the manual mode: Evidence for the ontogeny of
language. Science, 251, 1483–1496.
Petrie, H. (1987). The psycholinguistics of speaking.
In J. Lyons, R. Coates, M. Deuchar, & G. Gazdar
(Eds.), New horizons in linguistics (Vol. 2, pp. 336–
366). Harmondsworth, UK: Penguin.
Pexman, P. M., Lupker, S. J., & Jared, D. (2001).
Homophone effects in lexical decision. Journal of
Experimental Psychology: Learning, Memory, and
Cognition, 27, 139–156.
Pexman, P. M., Lupker, S. J., & Reggin, L. D.
(2002). Phonological effects in visual word
recognition: Investigating the impact of feedback
activation. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 28, 572–584.
Piaget, J. (1923). The language and thought of the
child (Trans. M. Gabain, 1955). Cleveland, OH:
Meridian.
Piattelli-Palmarini, M. (Ed.). (1980). Language and
learning: The debate between Jean Piaget and Noam
Chomsky. London: Routledge & Kegan Paul.
Piattelli-Palmarini, M. (1989). Evolution, selection,
and cognition: From “learning” to parameter setting
in biology and the study of language. Cognition, 31,
1–44.
Piattelli-Palmarini, M. (1994). Ever since language
and learning: Afterthoughts on the Piaget–Chomsky
debate. Cognition, 50, 315–346.
Pickering, M. J. (1999). Sentence comprehension. In
S. Garrod & M. J. Pickering (Eds.), Language
processing (pp. 123–153). Hove, UK: Psychology Press.
Pickering, M. J., & Barry, G. (1991). Sentence
processing without empty categories. Language and
Cognitive Processes, 6, 229–259.
Pickering, M. J., & Branigan, H. P. (1998). The
representation of verbs: Evidence from syntactic
priming in language production. Journal of Memory
and Language, 39, 633–651.
Pickering, M. J., Branigan, H. P., & McLean, J. F.
(2002). Constituent structure is formulated in one
stage. Journal of Memory and Language, 46, 586–605.
Pickering, M. J., & Garrod, S. (2004). Toward a
mechanistic psychology of dialogue. Behavioral and
Brain Sciences, 27, 169–226.
Pickering, M. J., & Garrod, S. (2006). Do people
use language production to make predictions during
comprehension? Trends in Cognitive Sciences, 11,
105–110.
Pickering, M. J., & Garrod, S. (2013). An integrated
theory of language production and comprehension.
Behavioral and Brain Sciences, 36, 329–347.
Pickering, M. J., & Traxler, M. J. (1998).
Plausibility and recovery from garden paths: An eye-
tracking study. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 24, 940–961.
Pickering, M. J., Traxler, M. J., & Crocker, M. W.
(2000). Ambiguity resolution in sentence processing:
Evidence against frequency-based accounts. Journal of
Memory and Language, 43, 447–475.
Pickering, M. J., & van Gompel, R. P. G.
(2006). Syntactic parsing. In M. J. Traxler & M. A.
Gernsbacher (Eds.), The handbook of psycholinguistics
(2nd ed., pp. 455–503). San Diego, CA: Elsevier.
Pine, J. M. (1994a). Environmental correlates of
variation in lexical style: Interactional style and the
structure of the input. Applied Psycholinguistics, 15,
355–370.
Pine, J. M. (1994b). The language of primary
caregivers. In C. Gallaway & B. J. Richards (Eds.),
Input and interaction in language acquisition (pp.
15–37). Cambridge: Cambridge University Press.
Pine, J. M., & Lieven, E. (1997). Lexically-based
learning and early grammatical development. Journal
of Child Language, 24, 187–219.
Pinker, S. (1984). Language learnability and
language development. Cambridge, MA: MIT Press.
Pinker, S. (1989). Learnability and cognition.
Cambridge, MA: MIT Press.
Pinker, S. (1994). The language instinct.
Harmondsworth, UK: Allen Lane.
Pinker, S. (1999). Words and rules. London:
Weidenfeld & Nicolson.
Pinker, S. (2001). Talk of genetics and vice versa.
Nature, 413, 465–466.
Pinker, S. (2002). The blank slate. Harmondsworth:
Penguin.
Pinker, S. (2003). Language as an adaptation to the
cognitive niche. In M. H. Christiansen & S. Kirby
(Eds.), Language evolution (pp. 16–37). Oxford:
Oxford University Press.
Pinker, S., & Bloom, P. (1990). Natural language and
natural selection. Behavioral and Brain Sciences, 13,
707–784.
Pinker, S., & Jackendoff, R. (2005). The faculty
of language: What’s special about it? Cognition, 95,
201–236.
Pinker, S., & Prince, A. (1988). On language and
connectionism: Analysis of a parallel distributed
processing model of language acquisition. Cognition,
28, 59–108.
Pinker, S., & Ullman, M. T. (2002). The past and
future of the past tense. Trends in Cognitive Sciences, 6,
456–463, and Reply, 472–474.
Pisoni, D. B., & Tash, J. (1974). Reaction times to
comparisons within and across phonetic categories.
Perception and Psychophysics, 15, 285–290.
Pitchford, N., & Mullen, K. (2005). The role
of perception, language, and preference in the
developmental acquisition of basic colour terms.
Journal of Experimental Child Psychology, 90,
275–302.
Pitt, M. A. (1995a). The locus of the lexical shift
in phoneme identification. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 21,
1037–1052.
Pitt, M. A. (1995b). Data fitting and detection theory:
Reply to Massaro and Oden. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 21,
1065–1067.
Pitt, M. A., & McQueen, J. M. (1998). Is
compensation for coarticulation mediated by the
lexicon? Journal of Memory and Language, 39,
347–370.
Plaut, D. C. (1997). Structure and function in the
lexical system: Insights from distributed models of
word reading and lexical decision. Language and
Cognitive Processes, 12, 765–805.
Plaut, D. C., & Booth, J. R. (2000). Individual
and developmental differences in semantic
priming: Empirical and computational support for
a single-mechanism account of lexical processing.
Psychological Review, 107, 786–823.
Plaut, D. C., & McClelland, J. L. (1993).
Generalizing with componential attractors: Word
and nonword reading in an attractor network. In
W. Kintsch (Ed.), Proceedings of the 15th Annual
Conference of the Cognitive Science Society
(pp. 824–829). Hillsdale, NJ: Lawrence Erlbaum
Associates, Inc.
Plaut, D. C., McClelland, J. L., Seidenberg, M. S.,
& Patterson, K. E. (1996). Understanding normal
and impaired word reading: Computational principles
in quasi-regular domains. Psychological Review, 103,
56–115.
Plaut, D. C., & Shallice, T. (1993a). Deep dyslexia:
A case study of connectionist neuropsychology.
Cognitive Neuropsychology, 10, 377–500.
Plaut, D. C., & Shallice, T. (1993b). Perseverative
and semantic influences on visual object naming errors
in optic aphasia: A connectionist account. Journal of
Cognitive Neuroscience, 5, 89–117.
Plunkett, K., & Elman, J. L. (1997). Exercises in
rethinking innateness: A handbook for connectionist
simulations. Cambridge, MA: Bradford Books.
Plunkett, K., & Marchman, V. (1991). U-shaped
learning and frequency effects in a multilayered
perceptron: Implications for child language
acquisition. Cognition, 38, 43–102.
Plunkett, K., & Marchman, V. (1993). From rote
learning to system building: Acquiring verb morphology
in children and connectionist nets. Cognition, 48, 21–69.
Poeppel, D. (1996). A critical review of PET studies
of phonological processing. Brain and Language, 55,
317–351.
Poeppel, D., & Hickok, G. (2004). Towards a new
functional anatomy of language. Cognition, 92, 1–12.
Polk, T. A., & Farah, M. J. (2002). Functional MRI
evidence for an abstract, not perceptual, word-form
area. Journal of Experimental Psychology: General,
131, 65–72.
Pollatsek, A., Bolozky, S., Well, A. D., & Rayner, K.
(1981). Asymmetries in the perceptual span for Israeli
readers. Brain and Language, 14, 174–180.
Posner, M. I., & Keele, S. W. (1968). On the genesis
of abstract ideas. Journal of Experimental Psychology,
77, 353–363.
Posner, M. I., & Snyder, C. R. R. (1975).
Facilitation and inhibition in the processing of
signals. In P. M. A. Rabbitt & S. Dornic (Eds.),
Attention and performance V (pp. 669–682). New
York: Academic Press.
Postal, P. (1964). Constituent structure: A study
of contemporary models of syntactic description.
Bloomington, IN: Research Center for the Language
Sciences.
Postma, A. (2000). Detection of errors during speech
production: A review of speech monitoring models.
Cognition, 77, 97–131.
Postman, L., & Keppel, G. (1970). Norms of word
associations. New York: Academic Press.
Potter, J. M. (1980). What was the matter with Dr.
Spooner? In V. A. Fromkin (Ed.), Errors in linguistic
performance (pp. 13–34). New York: Academic Press.
Potter, M. C., & Lombardi, L. (1990). Regeneration
in the short-term recall of sentences. Journal of
Memory and Language, 29, 633–654.
Potter, M. C., & Lombardi, L. (1998). Syntactic
priming in immediate recall of sentences. Journal of
Memory and Language, 38, 265–282.
Potter, M. C., Moryadas, A., Abrams, I., & Noel, A.
(1993). Word perception and misperception in context.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 19, 3–22.
Potter, M. C., So, K. F., von Eckardt, B., &
Feldman, L. B. (1984). Lexical and conceptual
representation in beginning and proficient bilinguals.
Journal of Verbal Learning and Verbal Behavior, 23,
23–38.
Potts, G. R., Keenan, J. M., & Golding, J. M.
(1988). Assessing the occurrence of elaborative
inferences: Lexical decision versus naming. Journal of
Memory and Language, 27, 399–415.
Prasada, S., & Pinker, S. (1993). Generalisation
of regular and irregular morphological patterns.
Language and Cognitive Processes, 8, 1–56.
Prat, C. S., Mason, R. A., & Just, M. A. (2012). An
fMRI investigation of analogical mapping in metaphor
comprehension: The influence of context and
individual cognitive capacities on processing demands.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 38, 282–294.
Premack, D. (1971). Language in chimpanzee?
Science, 172, 808–822.
Premack, D. (1976a). Intelligence in ape and man.
Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Premack, D. (1976b). Language and intelligence in
ape and man. American Scientist, 64, 674–683.
Premack, D. (1985). “Gavagai!” or the future history
of the animal language controversy. Cognition, 19,
207–296.
Premack, D. (1986a). Gavagai! or the future history
of the animal language controversy. Cambridge, MA:
MIT Press.
Premack, D. (1986b). Pangloss to Cyrano de
Bergerac: “Nonsense, it’s perfect!” A reply to
Bickerton. Cognition, 23, 81–88.
Premack, D. (1990). Words: What are they, and do
animals have them? Cognition, 37, 197–212.
Price, C. J., & Devlin, J. T. (2003). The myth of the
visual word form area. NeuroImage, 19, 473–481.
Pring, L. (1981). Phonological codes and functional
spelling units: Reality and implications. Perception
and Psychophysics, 30, 573–578.
Protopapas, A. (1999). Connectionist modeling
of speech perception. Psychological Bulletin, 125,
410–436.
Proverbio, A. M., Cok, B., & Zani, A. (2002).
Electrophysiological measures of language processing
in bilinguals. Journal of Cognitive Neuroscience, 14,
994–1017.
Pullum, G. K. (1981). Languages with object before
subject: A comment and a catalogue. Linguistics, 19,
147–155.
Pullum, G. K. (1989). The great Eskimo vocabulary
hoax. Natural Language and Linguistic Theory, 7,
275–281.
Pulvermüller, F. (1995). Agrammatism: Behavioral
description and neurobiological explanation. Journal
of Cognitive Neuroscience, 7, 165–181.
Pulvermüller, F., Shtyrov, Y., & Ilmoniemi, R. J.
(2003). Spatio-temporal patterns of neural language
processing: An MEG study using minimum-norm
current estimates. NeuroImage, 20, 1020–1025.
Pye, C. (1986). Quiché Mayan speech to children.
Journal of Child Language, 13, 85–100.
Quine, W. V. O. (1960). Word and object. Cambridge,
MA: MIT Press.
Quine, W. V. O. (1977). Natural kinds. In S. P. Schwartz
(Ed.), Naming, necessity, and natural kinds (pp. 155–175).
Ithaca, NY: Cornell University Press.
Quinlan, P. T. (1992). The Oxford psycholinguistic
database. Oxford: Oxford University Press.
Quinlan, P. T., & Dyson, B. (2008). Cognitive
psychology. Harlow, Essex: Pearson Education.
Quinn, P. C., & Eimas, P. D. (1986). On
categorization in early infancy. Merrill-Palmer
Quarterly, 32, 331–363.
Quinn, P. C., & Eimas, P. D. (1996). Perceptual
organization and categorization in young infants. In
C. Rovee-Collier & L. P. Lipsitt (Eds.), Advances in
infancy research (Vol. 10, pp. 2–36). Norwood, NJ: Ablex.
Rack, J. P., Hulme, C., Snowling, M. J., &
Wightman, J. (1994). The role of phonology in
young children learning to read words: The direct-
mapping hypothesis. Journal of Experimental Child
Psychology, 57, 42–71.
Rack, J. P., Snowling, M. J., & Olson, R. K. (1992).
The nonword reading deficit in developmental
dyslexia: A review. Reading Research Quarterly, 27,
29–43.
Radford, A. (1981). Transformational syntax: A
student’s guide to Chomsky’s extended standard
theory. Cambridge: Cambridge University Press.
Radford, A. (1997). Syntax: A minimalist
introduction. Cambridge: Cambridge University Press.
Radford, A., Atkinson, M. A., Britain, D., Clahsen, H.,
& Spencer, A. (1999). Linguistics. Cambridge:
Cambridge University Press.
Rapp, B., Benzing, L., & Caramazza, A. (1997).
The autonomy of lexical orthography. Cognitive
Neuropsychology, 14, 71–104.
Rapp, B., & Caramazza, A. (1993). On the
distinction between deficits of access and deficits
of storage: A question of theory. Cognitive
Neuropsychology, 10, 113–141.
Rapp, B., & Caramazza, A. (1998). A case of
selective difficulty in writing verbs. Neurocase, 4,
127–140.
Rapp, B., & Caramazza, A. (2002). Selective
difficulties with spoken nouns and written verbs: A
single case study. Journal of Neurolinguistics, 15,
373–402.
Rapp, B., & Goldrick, M. (2000). Discreteness and
interactivity in spoken word production. Psychological
Review, 107, 460–499.
Rapp, B., & Goldrick, M. (2004). Feedback by any
other name is still interactivity: A reply to Roelofs
(2004). Psychological Review, 111, 573–578.
Rapp, B., & Goldrick, M. (2005). Speaking words:
Contributions of cognitive neuropsychological
research. Cognitive Neuropsychology, 22, 1–34.
Rapp, D. N., & Samuel, A. G. (2000). A reason to
rhyme: Phonological and semantic influences on
lexical access. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 28, 564–571.
Rasmussen, T., & Milner, B. (1975). Clinical and
surgical studies of the cerebral speech areas in man. In
K. J. Zulch, O. Creutzfeldt, & G. C. Galbraith (Eds.),
Cerebral localization (pp. 238–257). New York:
Springer-Verlag.
Rasmussen, T., & Milner, B. (1977). The role of
left brain injury in determining lateralization
of cerebral speech functions. Annals of the New York
Academy of Sciences, 299, 355–369.
Rastle, K., & Brysbaert, M. (2006). Masked
phonological priming effects in English: Are they real?
Do they matter? Cognitive Psychology, 53, 97–145.
Rastle, K., & Coltheart, M. (2000). Lexical and
nonlexical print-to-sound translation of disyllabic
words and nonwords. Journal of Memory and
Language, 42, 342–364.
Rastle, K., Davis, M. H., & New, B. (2004). The
broth in my brother’s brothel: Morpho-orthographic
segmentation in visual word recognition. Psychonomic
Bulletin and Review, 11, 1090–1098.
Ratcliff, J. E., & McKoon, G. (1981). Does activation
really spread? Psychological Review, 88, 454–462.
Ratcliff, J. E., & McKoon, G. (1988). A retrieval
theory of priming in memory. Psychological Review,
95, 385–408.
Rayner, K. (1998). Eye movements in reading
and information processing: 20 years of research.
Psychological Bulletin, 124, 372–422.
Rayner, K., & Bertera, J. H. (1979). Reading without
a fovea. Science, 206, 468–469.
Rayner, K., Binder, K. S., & Duffy, S. A. (1999).
Contextual strength and the subordinate bias effect:
Comment on Martin, Vu, Kellas, and Metcalf. Quarterly
Journal of Experimental Psychology, 52A, 841–852.
Rayner, K., Carlson, M., & Frazier, L. (1983).
The interaction of syntax and semantics during
sentence processing: Eye movements in the analysis
of semantically biased sentences. Journal of Verbal
Learning and Verbal Behavior, 22, 358–374.
Rayner, K., & Frazier, L. (1987). Parsing temporarily
ambiguous complements. Quarterly Journal of
Experimental Psychology, 39A, 657–673.
Rayner, K., & Frazier, L. (1989). Selection
mechanisms in reading lexically ambiguous words.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 15, 779–790.
Rayner, K., & McConkie, G. W. (1976). What
governs a reader’s eye movements? Vision Research,
16, 829–837.
Rayner, K., Pacht, J. M., & Duffy, S. A. (1994).
Effects of prior encounter and global discourse bias on
the processing of lexically ambiguous words. Journal
of Memory and Language, 33, 527–544.
Rayner, K., & Pollatsek, A. (1989). The psychology
of reading. Englewood Cliffs, NJ: Prentice Hall.
Rayner, K., Pollatsek, A., & Binder, K. S. (1998).
Phonological codes and eye movements in reading.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 24, 476–497.
Rayner, K., Well, A. D., & Pollatsek, A. (1980).
Asymmetry of the effective visual field in reading.
Perception and Psychophysics, 27, 537–544.
Read, C. (1975). Children’s categorization of speech
sounds in English. Urbana, IL: National Council of
Teachers of English.
Read, C., Zhang, Y., Nie, H., & Ding, B. (1986).
The ability to manipulate speech sounds depends on
knowing alphabetic writing. Cognition, 24, 31–44.
Reber, A. S., & Anderson, J. R. (1970). The
perception of clicks in linguistic and nonlinguistic
messages. Perception and Psychophysics, 8, 81–89.
Redington, M., & Chater, N. (1998). Connectionist
and statistical approaches to language acquisition: A
distributional perspective. Language and Cognitive
Processes, 13, 129–191.
Redlinger, W., & Park, T. Z. (1980). Language
mixing in young bilinguals. Journal of Child
Language, 7, 337–352.
Rees, G., Russell, C., Frith, C. D., & Driver, J.
(1999). Inattentional blindness versus inattentional
amnesia for fixated but ignored words. Science, 286,
2504–2507.
Reicher, G. M. (1969). Perceptual recognition as a
function of meaningfulness of stimulus materials.
Journal of Experimental Psychology, 81, 274–280.
Reichle, E. D., Rayner, K., & Pollatsek, A. (1999).
Eye movement control in reading: Accounting for
initial fixation locations and refixations within the E-Z
Reader model. Vision Research, 39, 4403–4411.
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003).
The E-Z Reader model of eye-movement control in
reading: Comparisons to other models. Behavioral and
Brain Sciences, 26, 445–526.
Remez, R., & Pisoni, D. (Eds.). (2005). Handbook of
speech perception. Oxford: Blackwell.
Rescorla, L. (1980). Overextension in early language
development. Journal of Child Language, 7, 321–335.
Richards, B. J., & Gallaway, C. (1994). Conclusions
and directions. In C. Gallaway & B. J. Richards (Eds.),
Input and interaction in language acquisition (pp.
253–269). Cambridge: Cambridge University Press.
Richards, M. M. (1979). Sorting out what’s in a word
from what’s not: Evaluating Clark’s semantic features
acquisition theory. Journal of Experimental Child
Psychology, 27, 1–47.
Richardson, D. C., & Dale, R. (2005). Looking
to understand: The coupling between speakers’ and
listeners’ eye movements and its relationship to
discourse comprehension. Cognitive Science, 29,
1045–1060.
Riddoch, M. J., & Humphreys, G. W. (1987).
Visual object processing in optic aphasia: A case of
semantic access agnosia. Cognitive Neuropsychology,
4, 131–185.
Riddoch, M. J., Humphreys, G. W., Coltheart, M.,
& Funnell, E. (1988). Semantic systems or system?
Neuropsychological evidence re-examined. Cognitive
Neuropsychology, 5, 3–25.
Rigalleau, F., & Caplan, D. (2000). Effects of gender
marking in pronominal coindexation. Quarterly
Journal of Experimental Psychology, 53A, 23–52.
Rinck, M., & Bower, G. H. (1995). Anaphora
resolution and the focus of attention in situation
models. Journal of Memory and Language, 34,
110–131.
Rips, L. J. (1995). The current status of research on
concept combination. Mind and Language, 10, 72–104.
Rips, L. J., & Collins, A. (1993). Categories and
resemblance. Journal of Experimental Psychology:
General, 122, 468–486.
Rips, L. J., Shoben, E. J., & Smith, E. E. (1973).
Semantic distance and the verification of semantic
relations. Journal of Verbal Learning and Verbal
Behavior, 12, 1–20.
Rips, L. J., Smith, E. E., & Shoben, E. J. (1975).
Set-theoretic and network models reconsidered:
A comment on Hollan’s “Features and semantic
memory.” Psychological Review, 82, 156–157.
Ritchie, W. C., & Bhatia, T. K. (Eds.). (1996).
Handbook of second language acquisition. London:
Academic Press.
Rivas, E. (2005). Recent use of signs by chimpanzees.
Journal of Comparative Psychology, 119, 404–417.
Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V.
(1996). Premotor cortex and the recognition of motor
actions. Cognitive Brain Research, 3, 131–141.
Roberson, D., Davies, I., & Davidoff, J. (2000).
Color categories are not universal: Replications and
new evidence from a stone-age culture. Journal of
Experimental Psychology, 129, 369–398.
Roberts, B., & Kirsner, K. (2000). Temporal cycles
in speech production. Language and Cognitive
Processes, 15, 129–157.
Robinson, P. (2001). Individual differences, cognitive
abilities and aptitude complexes. Second Language
Research, 17, 368–392.
Rochford, G. (1971). Study of naming errors in dysphasic and in demented patients. Neuropsychologia, 9, 437–443.
Rochon, E., Waters, G. S., & Caplan, D. (1994). Sentence comprehension in patients with Alzheimer’s disease. Brain and Language, 46, 329–349.
Rodd, J., Gaskell, G., & Marslen-Wilson, W. (2002). Making sense of semantic ambiguity: Semantic competition in lexical access. Journal of Memory and Language, 46, 245–266.
Rodriguez-Fornells, A., Rotte, M., Heinze, H. J., Nosselt, T., & Munte, T. (2002). Brain potential and functional MRI evidence for how to handle two languages with one brain. Nature, 415, 1026–1029.
Roediger, H. L., & Blaxton, T. A. (1987). Retrieval modes produce dissociations in memory for surface information. In D. S. Gorfein & R. R. Hoffman (Eds.), Memory and cognitive processes (pp. 349–377). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition, 42, 107–142.
Roelofs, A. (1997a). Syllabification in speech production: Evaluation of WEAVER. Language and Cognitive Processes, 12, 657–693.
Roelofs, A. (1997b). The WEAVER model of word-form encoding in speech production. Cognition, 64, 249–284.
Roelofs, A. (2002). Spoken language planning and the initiation of articulation. Quarterly Journal of Experimental Psychology, 55A, 465–483.
Roelofs, A. (2004a). Error biases in spoken word planning and monitoring by aphasic and nonaphasic speakers: Comment on Rapp and Goldrick (2000). Psychological Review, 111, 561–572.
Roelofs, A. (2004b). Comprehension-based versus production-internal feedback in planning spoken words: A rejoinder to Rapp and Goldrick (2000). Psychological Review, 111, 579–580.
Roelofs, A., & Meyer, A. S. (1998). Metrical structure in planning the production of spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 922–939.
Roelofs, A., Meyer, A. S., & Levelt, W. J. M. (1998). A case for the lemma/lexeme distinction in models of speaking: Comment on Caramazza and Miozzo (1997). Cognition, 69, 219–230.
Roeltgen, D. P. (1987). Loss of deep dyslexic reading ability from a second left-hemisphere lesion. Archives of Neurology, 44, 346–348.
Rogalsky, C., & Hickok, G. (2011). The role of Broca’s area in sentence comprehension. Journal of Cognitive Neuroscience, 23, 1664–1680.
Rogers, T. T., Lambon Ralph, M. A., Garrard, P., Bozeat, S., McClelland, J. L., Hodges, J. R., et al. (2004). Structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychological Review, 111, 205–235.
Rogers, T. T., & McClelland, J. L. (2004). Semantic cognition: A parallel distributed processing approach. Cambridge, MA: MIT Press.
Rohde, D. L. T., & Plaut, D. C. (1999). Language acquisition in the absence of explicit negative evidence: How important is starting small? Cognition, 72, 67–109.
Rolnick, M., & Hoops, H. R. (1969). Aphasia as seen by the aphasic. Journal of Speech and Hearing Disorders, 34, 48–53.
Romaine, S. (1995). Bilingualism (2nd ed.). Oxford: Blackwell.
Romani, C. (1992). Are there distinct input and output buffers? Evidence from an aphasic patient with an impaired output buffer. Language and Cognitive Processes, 7, 131–162.
Romani, C., & Martin, R. C. (1999). A deficit in the short-term retention of lexical-semantic information: Forgetting words but remembering a story. Journal of Experimental Psychology: General, 128, 56–77.
Rosch, E. (1973). Natural categories. Cognitive Psychology, 4, 328–350.
Rosch, E. (1978). Principles of categorization. In E. Rosch & B. Lloyd (Eds.), Cognition and categorization (pp. 27–48). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573–605.
Rosch, E., Mervis, C. B., Gray, W., Johnson, D., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Rosnow, R. L., & Rosnow, M. (1992). Writing papers in psychology (2nd ed.). New York: Wiley.
Ross, B. H., & Bower, G. H. (1981). Comparisons of models of associative recall. Memory and Cognition, 9, 1–16.
Rosson, M. B. (1983). From SOFA to LOUCH: Lexical contributions to pseudoword pronunciation. Memory and Cognition, 11, 152–160.
Roy, D. (2005). Grounding words in perception and action: Computational insights. Trends in Cognitive Sciences, 9, 389–395.
Rubenstein, H., Lewis, S. S., & Rubenstein, M. A. (1971). Evidence for phonemic recoding in visual word recognition. Journal of Verbal Learning and Verbal Behavior, 10, 645–658.
Rubin, D. C. (1980). 51 properties of 125 words: A unit analysis of verbal behavior. Journal of Verbal Learning and Verbal Behavior, 19, 736–755.
Rubin, J. (1968). National bilingualism in Paraguay. The Hague: Mouton.
Rumelhart, D. E. (1975). Notes on a schema for stories. In D. G. Bobrow & A. M. Collins (Eds.), Representation and understanding: Studies in cognitive science (pp. 211–236). New York: Academic Press.
Rumelhart, D. E. (1977). Understanding and summarizing brief stories. In D. LaBerge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension (pp. 265–303). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Rumelhart, D. E. (1980). On evaluating story grammars. Cognitive Science, 4, 313–316.
Rumelhart, D. E., & McClelland, J. L. (1982). An interactive activation model of context effects in letter perception: Part 2. The contextual enhancement effect and some tests and extensions of the model. Psychological Review, 89, 60–94.
Rumelhart, D. E., & McClelland, J. L. (1986). On learning the past tense of English verbs. In D. E. Rumelhart, J. L. McClelland, & the PDP Research Group, Parallel distributed processing: Vol. 2. Psychological and biological models (pp. 216–271). Cambridge, MA: MIT Press.
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group. (1986). Parallel distributed processing: Vol. 1. Foundations. Cambridge, MA: MIT Press.
Ruml, W., & Caramazza, A. (2000). An evaluation of a computational model of lexical access: Comment on Dell et al. (1997). Psychological Review, 107, 609–634.
Ruml, W., Caramazza, A., Shelton, J. R., & Chialant, D. (2000). Testing assumptions in computational theories of aphasia. Journal of Memory and Language, 43, 217–248.
Rymer, R. (1993). Genie. London: Joseph.
Sacchett, C., & Humphreys, G. W. (1992). Calling a squirrel a squirrel but a canoe a wigwam: A category-specific deficit for artefactual objects and body parts. Cognitive Neuropsychology, 9, 73–86.
Sachs, J., Bard, B., & Johnson, M. L. (1981). Language learning with restricted input: Case studies of two hearing children of deaf parents. Applied Psycholinguistics, 2, 33–54.
Sachs, J. S. (1967). Recognition memory for syntactic and semantic aspects of connected discourse. Perception and Psychophysics, 2, 437–442.
Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50, 696–735.
Saffran, E. M. (1990). Short-term memory impairments and language processing. In A. Caramazza (Ed.), Cognitive neuropsychology and neurolinguistics (pp. 137–168). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Saffran, E. M., Bogyo, L. C., Schwartz, M. F., & Marin, O. S. M. (1980). Does deep dyslexia reflect right hemisphere reading? In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 381–406). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Saffran, E. M., Marin, O. S. M., & Yeni-Komshian, G. H. (1976). An analysis of speech perception in word deafness. Brain and Language, 3, 209–228.
Saffran, E. M., & Martin, N. (1997). Effects of structural priming on sentence production in aphasia. Language and Cognitive Processes, 12, 877–882.
Saffran, E. M., & Schwartz, M. (1994). Of cabbages and things: Semantic memory from a neuropsychological perspective—a tutorial review. In C. Umilta & M. Moscovitch (Eds.), Attention and performance XV: Conscious and nonconscious information processing (pp. 507–536). Cambridge, MA: MIT Press.
Saffran, E. M., Schwartz, M. F., & Linebarger, M. C. (1998). Semantic influences on thematic role assignment: Evidence from normals and aphasics. Brain and Language, 62, 255–297.
Saffran, E. M., Schwartz, M. F., & Marin, O. S. M. (1976). Semantic mechanisms in paralexia. Brain and Language, 3, 255–265.
Saffran, E. M., Schwartz, M. F., & Marin, O. S. M. (1980). Evidence from aphasia: Isolating the components of a production model. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 221–241). London: Academic Press.
Saffran, J. R. (2001). The use of predictive dependencies in language learning. Journal of Memory and Language, 44, 493–515.
Saffran, J. R. (2002). Constraints on statistical language learning. Journal of Memory and Language, 47, 172–196.
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928.
Saffran, J. R., Werker, J. F., & Werner, L. A. (2006). The infant’s auditory world: Hearing, speech, and the beginnings of language. In R. Siegler & D. Kuhn (Eds.), Handbook of child development (6th ed., pp. 58–108). New York: Wiley.
Salamoura, A., & Williams, J. N. (2006). Lexical activation of cross-language syntactic priming. Bilingualism: Language and Cognition, 9, 299–307.
Samuel, A. G. (1981). Phonemic restoration: Insights from a new methodology. Journal of Experimental Psychology: General, 110, 474–494.
Samuel, A. G. (1987). The effect of lexical uniqueness on phonemic restoration. Journal of Memory and Language, 26, 36–56.
Samuel, A. G. (1990). Using perceptual-restoration effects to explore the architecture of perception. In G. T. M. Altmann (Ed.), Cognitive models of speech processing (pp. 295–314). Cambridge, MA: MIT Press.
Samuel, A. G. (1996). Does lexical information influence the perceptual restoration of phonemes? Journal of Experimental Psychology: General, 125, 28–51.
Samuel, A. G. (1997). Lexical activation produces potent phonemic percepts. Cognitive Psychology, 32, 97–127.
Sandra, D. (1990). On the representation and processing of compound words: Automatic access to constituent morphemes does not occur. Quarterly Journal of Experimental Psychology, 42A, 529–567.
Sanford, A. J. (1985). Cognition and cognitive psychology. London: Weidenfeld & Nicolson.
Sanford, A. J., & Garrod, S. C. (1981). Understanding written language. Chichester, UK: John Wiley.
Santa, J. L., & Ranken, H. B. (1972). Effects of verbal coding on recognition memory. Journal of Experimental Psychology, 93, 268–278.
Sartori, G., & Job, R. (1988). The oyster with four legs: A neuropsychological study on the interaction of visual and semantic information. Cognitive Neuropsychology, 5, 105–132.
Sartori, G., Miozzo, M., & Job, R. (1993). Category-specific impairments? Yes. Quarterly Journal of Experimental Psychology, 46A, 489–504.
Sasanuma, S. (1980). Acquired dyslexia in Japanese: Clinical features and underlying mechanisms. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 48–90). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Sasanuma, S., Ito, H., Patterson, K., & Ito, T. (1996). Phonological alexia in Japanese: A case study. Cognitive Neuropsychology, 13, 823–848.
Savage, C., Lieven, E., Theakston, A., & Tomasello, M. (2003). Testing the abstractness of young children’s linguistic representations: Lexical and structural priming of syntactic constructions. Developmental Science, 6, 557–567.
Savage, G. R., Bradley, D. C., & Forster, K. I. (1990). Word frequency and the pronunciation task: The contribution of articulatory fluency. Language and Cognitive Processes, 5, 203–236.
Savage, R. S. (1997). Do children need concurrent prompts in order to use lexical analogies in reading? Journal of Child Psychology and Psychiatry, 38, 235–246.
Savage-Rumbaugh, E. S. (1987). Communication, symbolic communication, and language: A reply to Seidenberg and Petitto. Journal of Experimental Psychology: General, 116, 288–292.
Savage-Rumbaugh, E. S., & Lewin, R. (1994). Kanzi: At the brink of the human mind. New York: Wiley.
Savage-Rumbaugh, E. S., McDonald, K., Sevcik, R. A., Hopkins, W. D., & Rupert, E. (1986). Spontaneous symbol acquisition and communicative use by pygmy chimpanzees (Pan paniscus). Journal of Experimental Psychology: General, 115, 211–235.
Savage-Rumbaugh, E. S., Murphy, J., Sevcik, R. A., Brakke, K. E., Williams, S. L., & Rumbaugh, D. M. (1993). Language comprehension in ape and child. Monographs of the Society for Research in Child Development, 58 (Whole Nos. 3–4).
Savage-Rumbaugh, E. S., Pate, J. L., Lawson, J., Smith, T., & Rosenbaum, S. (1983). Can a chimpanzee make a statement? Journal of Experimental Psychology: General, 112, 457–492.
Savage-Rumbaugh, E. S., Rumbaugh, D. M., & Boysen, S. (1978). Linguistically mediated tool use and exchange by chimpanzees. Behavioral and Brain Sciences, 1, 539–554.
Savin, H. B., & Bever, T. G. (1970). The nonperceptual reality of the phoneme. Journal of Verbal Learning and Verbal Behavior, 9, 295–302.
Savin, H. B., & Perchonock, E. (1965). Grammatical structure and the immediate recall of English sentences. Journal of Verbal Learning and Verbal Behavior, 4, 348–353.
Saxton, M. (1997). The contrast theory of negative input. Journal of Child Language, 24, 139–161.
Scarborough, D. L., Cortese, C., & Scarborough, H. S. (1977). Frequency and repetition effects in lexical memory. Journal of Experimental Psychology: Human Perception and Performance, 3, 1–17.
Scarborough, D. L., Gerard, L., & Cortese, C. (1984). Independence of lexical access in bilingual word recognition. Journal of Verbal Learning and Verbal Behavior, 23, 84–99.
Schaeffer, B., & Wallace, R. (1969). Semantic similarity and the comprehension of word meanings. Journal of Experimental Psychology, 82, 343–346.
Schaeffer, B., & Wallace, R. (1970). The comparison of word meanings. Journal of Experimental Psychology, 86, 144–152.
Schaeffer, H. R. (1975). Social development in infancy. In R. Lewin (Ed.), Child alive (pp. 32–39). London: Temple Smith.
Schank, R. C. (1972). Conceptual dependency: A theory of natural language understanding. Cognitive Psychology, 3, 552–631.
Schank, R. C. (1975). Conceptual information processing. Amsterdam: North Holland.
Schank, R. C. (1982). Dynamic memory. Cambridge: Cambridge University Press.
Schank, R. C., & Abelson, R. (1977). Scripts, plans, goals and understanding. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Schenkein, J. (1980). A taxonomy for repeating action sequences in natural conversation. In B. Butterworth (Ed.), Language production: Vol. 1. Speech and talk (pp. 21–48). London: Academic Press.
Schiff-Myers, N. (1993). Hearing children of deaf parents. In D. Bishop & K. Mogford (Eds.), Language development in exceptional circumstances (pp. 47–61). Hove, UK: Lawrence Erlbaum Associates.
Schiller, N. O., & Caramazza, A. (2003). Grammatical feature selection in noun phrase production: Evidence from German and Dutch. Journal of Memory and Language, 48, 169–194.
Schiller, N. O., & Costa, A. (2006). Different selection principles of free-standing and bound morphemes in language production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1201–1207.
Schilling, H. E. H., Rayner, K., & Chumbley, J. I. (1998). Comparing naming, lexical decision, and eye fixation times: Word frequency effects and individual differences. Memory and Cognition, 26, 1270–1281.
Schlesinger, H. S., & Meadow, K. P. (1972). Sound and sign: Childhood deafness and mental health. Berkeley: University of California Press.
Schlesinger, I. M. (1971). Production of utterances and language acquisition. In D. I. Slobin (Ed.), The ontogenesis of grammar (pp. 63–102). New York: Academic Press.
Schlesinger, I. M. (1988). The origin of relational categories. In Y. Levy, I. M. Schlesinger, & M. D. S. Braine (Eds.), Categories and processes in language acquisition (pp. 121–178). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: I. Detection, search and attention. Psychological Review, 84, 1–66.
Schnur, T. T., Costa, A., & Caramazza, A. (2006). Planning at the phonological level during sentence production. Journal of Psycholinguistic Research, 35, 189–213.
Schober, M. F., & Clark, H. H. (1989). Understanding by addressees and overhearers. Cognitive Psychology, 21, 211–232.
Schreuder, R., & Baayen, R. H. (1997). How complex simplex words can be. Journal of Memory and Language, 37, 118–139.
Schriefers, H., Jescheniak, J. D., & Hantsch, A. (2005). Selection of gender-marked morphemes in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 159–168.
Schriefers, H., Meyer, A. S., & Levelt, W. J. M. (1990). Exploring the time course of lexical access in language production: Picture–word interference studies. Journal of Memory and Language, 29, 86–102.
Schriefers, H., & Teruel, E. (2000). Grammatical gender in noun phrase production: The gender interference effect in German. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 1368–1377.
Schriefers, H., Teruel, E., & Meinshausen, R. M. (1998). Producing simple sentences: Results from picture–word interference experiments. Journal of Memory and Language, 39, 609–632.
Schuberth, R. E., & Eimas, P. D. (1977). Effects of context on the classification of words and nonwords. Journal of Experimental Psychology: Human Perception and Performance, 3, 27–36.
Schustack, M. W., Ehrlich, S. F., & Rayner, K. (1987). The complexity of contextual facilitation in reading: Local and global influences. Journal of Memory and Language, 26, 322–340.
Schvaneveldt, R. W., Meyer, D. E., & Becker, C. A. (1976). Lexical ambiguity, semantic context, and visual word recognition. Journal of Experimental Psychology: Human Perception and Performance, 2, 243–256.
Schwanenflugel, P. J., & LaCount, K. L. (1988). Semantic relatedness and the scope of facilitation for upcoming words in sentences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 344–354.
Schwanenflugel, P. J., & Rey, M. (1986). Interlingual semantic facilitation: Evidence for a common representational system in the bilingual lexicon. Journal of Memory and Language, 25, 605–618.
Schwartz, M. F. (1984). What the classical aphasia categories can’t do for us, and why. Brain and Language, 21, 3–8.
Schwartz, M. F. (1987). Patterns of speech production deficit within and across aphasia syndromes: Application of a psycholinguistic model. In M. Coltheart, G. Sartori, & R. Job (Eds.), The cognitive neuropsychology of language (pp. 163–199). Hove, UK: Lawrence Erlbaum Associates.
Schwartz, M. F. (Ed.). (1990). Modular deficits in Alzheimer-type dementia. Cambridge, MA: MIT Press.
Schwartz, M. F., & Chawluk, J. B. (1990). Deterioration of language in progressive aphasia: A case study. In M. F. Schwartz (Ed.), Modular deficits in Alzheimer-type dementia (pp. 245–296). Cambridge, MA: MIT Press.
Schwartz, M. F., Dell, G. S., Martin, N., Gahl, S., & Sobel, P. (2006). A case-series test of the interactive two-step model of lexical access: Evidence from picture naming. Journal of Memory and Language, 54, 228–264.
Schwartz, M. F., Linebarger, M., Saffran, E., & Pate, D. (1987). Syntactic transparency and sentence interpretation in aphasia. Language and Cognitive Processes, 2, 85–113.
Schwartz, M. F., Marin, O. S. M., & Saffran, E. M. (1979). Dissociations of language function in dementia: A case study. Brain and Language, 7, 277–306.
Schwartz, M. F., Saffran, E. M., Bloch, D. E., & Dell, G. S. (1994). Disordered speech production in aphasic and normal speakers. Brain and Language, 47, 52–88.
Schwartz, M. F., Saffran, E. M., & Marin, O. S. M. (1980a). Fractionating the reading process in dementia: Evidence for word-specific print-to-sound associations. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 259–269). London: Routledge & Kegan Paul.
Schwartz, M. F., Saffran, E. M., & Marin, O. S. M. (1980b). The word order problem in agrammatism I: Comprehension. Brain and Language, 10, 249–262.
Scoresby-Jackson, R. E. (1867). Case of aphasia with right hemiplegia. Edinburgh Medical Journal, 12, 696–706.
Schyns, P. G., Goldstone, R. L., & Thibaut, J.-P. (1998). The development of features in object concepts. Behavioral and Brain Sciences, 21, 1–53.
Searle, J. R. (1969). Speech acts. Cambridge: Cambridge University Press.
Searle, J. R. (1975). Indirect speech acts. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics: Vol. 3. Speech acts (pp. 59–82). New York: Academic Press.
Searle, J. R. (1979). Metaphor. In A. Ortony (Ed.), Metaphor and thought (pp. 92–123). Cambridge: Cambridge University Press.
Sedivy, J. C., Tanenhaus, M. K., Chambers, C. G., & Carlson, G. N. (1999). Achieving incremental semantic interpretation through contextual representation. Cognition, 71, 109–147.
Seidenberg, M. S. (1988). Cognitive neuropsychology and language: The state of the art. Cognitive Neuropsychology, 5, 403–426.
Seidenberg, M. S. (2011). What causes dyslexia? Comment on Goswami. Trends in Cognitive Sciences, 15, 2.
Seidenberg, M. S., & Elman, J. L. (1999). Networks are not “hidden rules.” Trends in Cognitive Sciences, 3, 288–289.
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed developmental model of word recognition. Psychological Review, 96, 523–568.
Seidenberg, M. S., & McClelland, J. L. (1990). More words but still no lexicon. Reply to Besner et al. (1990). Psychological Review, 97, 447–452.
Seidenberg, M. S., Petersen, A., MacDonald, M. C., & Plaut, D. C. (1996). Pseudohomophone effects and models of word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 48–62.
Seidenberg, M. S., & Petitto, L. A. (1979). Signing behavior in apes: A critical review. Cognition, 7, 177–215.
Seidenberg, M. S., & Petitto, L. A. (1987). Communication, symbolic communication, and language: Comment on Savage-Rumbaugh, McDonald, Sevcik, Hopkins, and Rupert (1986). Journal of Experimental Psychology: General, 116, 279–287.
Seidenberg, M. S., Plaut, D. C., Petersen, A., McClelland, J. L., & McRae, K. (1994). Nonword pronunciation and models of word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1177–1196.
Seidenberg, M. S., Tanenhaus, M. K., Leiman, J. M., & Bienkowski, M. (1982). Automatic access of the meanings of ambiguous words in context: Some limitations of knowledge-based processing. Cognitive Psychology, 14, 489–537.
Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23, 383–404.
Seidenberg, M. S., Waters, G. S., Sanders, M., & Langer, P. (1984). Pre- and post-lexical loci of contextual effects on word recognition. Memory and Cognition, 12, 315–328.
Seifert, C. M., McKoon, G., Abelson, R. P., & Ratcliff, R. (1986). Memory connections between thematically similar episodes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 220–231.
Seifert, C. M., Robertson, S. P., & Black, J. B. (1985). Types of inference generated during reading. Journal of Memory and Language, 24, 405–422.
Semenza, C., & Zettin, M. (1988). Generating proper names: A case of selective inability. Cognitive Neuropsychology, 5, 711–721.
Seymour, P. H. K. (1987). Individual cognitive analysis of competent and impaired reading. British Journal of Psychology, 78, 483–506.
Seymour, P. H. K. (1990). Developmental dyslexia. In M. W. Eysenck (Ed.), Cognitive psychology: An international review (pp. 135–196). Chichester, UK: John Wiley.
Seymour, P. H. K., & Elder, L. (1986). Beginning reading without phonology. Cognitive Neuropsychology, 3, 1–36.
Seymour, P. H. K., & Evans, H. M. (1994). Levels of phonological awareness and learning to read. Reading and Writing, 6, 221–250.
Shafto, M., Burke, D., Stamatakis, E., Tam, P., & Tyler, L. (2007). On the tip-of-the-tongue: Neural correlates of increased word-finding failures in normal aging. Journal of Cognitive Neuroscience, 19, 2060–2070.
Shallice, T. (1981). Phonological agraphia and the lexical route in writing. Brain, 104, 413–429.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge: Cambridge University Press.
Shallice, T. (1993). Multiple semantics: Whose confusions? Cognitive Neuropsychology, 10, 251–261.
Shallice, T., & Butterworth, B. (1977). Short-term memory impairment and spontaneous speech. Neuropsychologia, 15, 729–735.
Shallice, T., & McCarthy, R. (1985). Phonological reading: From patterns of impairment to possible procedure. In K. E. Patterson, J. C. Marshall, & M. Coltheart (Eds.), Surface dyslexia: Neuropsychological and cognitive studies of phonological reading (pp. 361–397). Hove, UK: Lawrence Erlbaum Associates.
Shallice, T., & McGill, J. (1978). The origins of mixed errors. In J. Requin (Ed.), Attention and performance VII (pp. 193–208). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Shallice, T., McLeod, P., & Lewis, K. (1985). Isolating cognitive modules with the dual task paradigm: Are speech perception and production separate processes? Quarterly Journal of Experimental Psychology, 37A, 507–532.
Shallice, T., Rumiati, R. I., & Zadini, A. (2000). The selective impairment of the phonological output buffer. Cognitive Neuropsychology, 17, 517–546.
Shallice, T., & Warrington, E. K. (1975). Word recognition in a phonemic dyslexic patient. Quarterly Journal of Experimental Psychology, 27, 187–199.
Shallice, T., & Warrington, E. K. (1977). Auditory-verbal short-term memory impairment and conduction aphasia. Brain and Language, 4, 479–491.
Shallice, T., & Warrington, E. K. (1980). Single and multiple component central deep dyslexic syndromes. In M. Coltheart, K. E. Patterson, & J. C. Marshall (Eds.), Deep dyslexia (pp. 199–245). London: Routledge & Kegan Paul. [2nd ed., 1987.]
Shallice, T., Warrington, E. K., & McCarthy, R. (1983). Reading without semantics. Quarterly Journal of Experimental Psychology, 35A, 111–138.
Shanker, S. G., Savage-Rumbaugh, E. S., & Taylor, T. J. (1999). Kanzi: A new beginning. Animal Learning and Behavior, 27, 24–25.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. Urbana: University of Illinois Press.
Shapiro, K., & Caramazza, A. (2003). The representation of grammatical categories in the brain. Trends in Cognitive Sciences, 7, 201–206.
Share, D. L. (1995). Phonological recoding and self-teaching: Sine qua non of reading acquisition. Cognition, 55, 151–218.
Sharkey, A. J. C., & Sharkey, N. E. (1992). Weak contextual constraints in text and word priming. Journal of Memory and Language, 31, 543–572.
Sharpe, K. (1992). Communication, culture, context, confidence: The four Cs of primary modern language teaching. Language Learning Journal, 6, 13–14.
Shattuck, R. (1980). The forbidden experiment. New York: Kodansha International.
Shattuck-Hufnagel, S. (1979). Speech errors as evidence for a serial ordering mechanism in speech production. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 295–342). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Shatz, M., Diesendruck, G., Martinez-Beck, I., & Akar, D. (2003). The influence of language and socioeconomic status on children’s understanding of false belief. Developmental Psychology, 39, 717–729.
Shatz, M., & Gelman, R. (1973). The development of communication skills: Modifications in the speech of young children as a function of the listener. Monograph of the Society for Research in Child Development, 152.
Shaywitz, B. A., Shaywitz, S. E., Pugh, K. R., Constable, R. T., Skudlarski, P., Fulbright, R. K., et al. (1995). Sex differences in the functional organization of the brain for language. Nature, 373, 607–609.
Sheldon, A. (1974). The role of parallel function in the acquisition of relative clauses in English. Journal of Verbal Learning and Verbal Behavior, 13, 272–281.
Shelton, J. R., & Caramazza, A. (1999). Deficits in lexical and semantic processing: Implications for models of normal language. Psychonomic Bulletin and Review, 6, 5–27.
Shelton, J. R., & Martin, R. C. (1992). How semantic is automatic semantic priming? Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1191–1209.
Shelton, J. R., & Weinrich, M. (1997). Further evidence of a dissociation between output phonological and orthographic lexicons: A case study. Cognitive Neuropsychology, 14, 105–129.
Sheridan, J., & Humphreys, G. W. (1993). A verbal-semantic category-specific recognition impairment. Cognitive Neuropsychology, 10, 143–184.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Shoben, E. J., & Gagne, C. L. (1997). Thematic relations and the creation of combined concepts. In T. B. Ward, S. M. Smith, & J. Vaid (Eds.), Creative thought: An investigation of creative structures and processes (pp. 31–50). Washington, DC: American Psychological Association.
Siegel, L. S. (1998). Phonological processing deficits and reading disabilities. In J. L. Metsala & L. C. Ehri (Eds.), Word recognition and beginning literacy (pp. 141–160). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Silverberg, S., & Samuel, A. G. (2004). The effect of age of second language acquisition on the representation and processing of second language words. Journal of Memory and Language, 51, 381–398.
Silveri, M. C., & Gainotti, G. (1988). Interaction between vision and language in category-specific semantic impairment. Cognitive Neuropsychology, 5, 677–709.
Simpson, G. B. (1981). Meaning dominance and semantic context in the processing of lexical ambiguity. Journal of Verbal Learning and Verbal Behavior, 20, 120–136.
Simpson, G. B. (1994). Context and the processing of ambiguous words. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 359–374). San Diego, CA: Academic Press.
Simpson, G. B., & Burgess, C. (1985). Activation and solution processes in the recognition of ambiguous
words. Journal of Experimental Psychology: Human D. I. Slobin (Eds.), Studies of child language development
Perception and Performance, 11, 28–39. (pp. 175–208). New York: Holt, Rhinehart & Winston.
Simpson, G. B., & Krueger, M. A. (1991). Selective Slobin, D. I. (1981). The origins of grammatical
access of homograph meanings in sentence context. encoding of events. In W. Deutsch (Ed.), The child’s
Journal of Memory and Language, 30, 627–643. construction of language (pp. 185–199). London:
Sinclair-de-Zwart, H. (1969). Developmental Academic Press.
psycholinguistics. In D. Elkind & J. H. Flavell (Eds.), Slobin, D. I. (1985). Crosslinguistic evidence for the
Studies in cognitive development (pp. 315–366). language-making capacity. In D. I. Slobin (Ed.), The
Oxford: Oxford University Press. crosslinguistic study of language acquisitions: Vol.
Sinclair-de-Zwart, H. (1973). Language acquisition 2. Theoretical issues (pp. 1157–1249). Hillsdale, NJ:
and cognitive development. In T. E. Moore (Ed.), Lawrence Erlbaum Associates, Inc.
Cognitive development and the acquisition of Smith, E. E. (1988). Concepts and thought. In
language (pp. 9–26). New York: Academic Press. R. J. Sternberg (Ed.), The psychology of human thought
Singer, M. (1994). Discourse inference processes. (pp. 19–49). Cambridge: Cambridge University Press.
In M. A. Gernsbacher (Ed.), Handbook of Smith, E. E., & Medin, D. L. (1981). Categories and
psycholinguistics (pp. 479–516). San Diego, CA: concepts. Cambridge, MA: Harvard University Press.
Academic Press. Smith, E. E., Shoben, E. J., & Rips, L. J. (1974).
Singer, M., & Ferreira, F. (1983). Inferring Structure and process in semantic memory: A featural
consequences in story comprehension. Journal of model for semantic decisions. Psychological Review,
Verbal Learning and Verbal Behavior, 22, 437–448. 81, 214–241.
Singer, M., Graesser, A. C., & Trabasso, T. (1994). Smith, M., & Wheeldon, L. (1999). High level
Minimal or global inference in comprehension. processing scope in spoken sentence production.
Journal of Memory and Language, 33, 421–441. Cognition, 73, 205–246.
Singh, J. A. L., & Zingg, R. M. (1942). Wolf children Smith, M., & Wheeldon, L. (2004). Horizontal
and feral man. Hamden, CT: Shoe String Press. information flow in spoken sentence production.
[Reprinted 1966, New York: Harper & Row.] Journal of Experimental Psychology: Learning,
Sitton, M., Mozer, M. C., & Farah, M. J. Memory, and Cognition, 30, 675–686.
(2000). Superadditive effects of multiple lesions Smith, N., & Tsimpli, I.-M. (1995). The mind of a
in connectionist architecture: Implications for the savant: Language learning and modularity. Oxford:
neuropsychology of optic aphasia. Psychological Blackwell.
Review, 107, 709–734. Smith, N. V. (1973). The acquisition of phonology: A
Skehan, P. (1998). A cognitive approach to language case study. Cambridge: Cambridge University Press.
learning. Oxford: Oxford University Press. Smith, P. T., & Sterling, C. M. (1982). Factors
Skinner, B. F. (1957). Verbal behavior. New York: affecting the perceived morphophonemic structure of
Appleton-Century-Crofts. written words. Journal of Verbal Learning and Verbal
Skoyles, J., & Skottun, B. C. (2004). On the Behavior, 21, 704–721.
prevalence of magnocellular deficits in the visual Smith, S. M., Brown, H. O., Toman, J. E. P., &
system of non-dyslexic individuals. Brain and Goodman, L. S. (1947). The lack of cerebral effects
Language, 88, 79–82. of d-tubocurarine. Anesthesiology, 8, 1–14.
Skuse, D. H. (1993). Extreme deprivation in early Snedeker, J., & Trueswell, J. C. (2003). Using
childhood. In D. Bishop & K. Mogford (Eds.), prosody to avoid ambiguity: Effects of speaker
Language development in exceptional circumstances awareness and referential context. Journal of Memory
(pp. 29–46). Hove, UK: Lawrence Erlbaum Associates. and Language, 48, 103–130.
Slobin, D. I. (1966a). Grammatical transformations Snedeker, J., & Trueswell, J. C. (2004). The developing
and sentence comprehension in childhood and constraints on parsing decisions: The role of lexical
adulthood. Journal of Verbal Learning and Verbal biases and referential scenes in child and adult sentence
Behavior, 5, 219–227. processing. Cognitive Psychology, 49, 238–299.
Slobin, D. I. (1966b). The acquisition of Russian as a Snodgrass, J. G. (1984). Concepts and their surface
native language. In F. Smith & G. A. Miller (Eds.), The representation. Journal of Verbal Learning and Verbal
genesis of a language: A psycholinguistic approach Behavior, 23, 3–22.
(pp. 129–248). Cambridge, MA: MIT Press. Snodgrass, J. G., & Vanderwart, M. (1980). A
Slobin, D. I. (1970). Universals of grammatical standardised set of 260 pictures: Norms for name
development in children. In G. Flores d’Arcais & agreement, image agreement, familiarity, and visual
W. J. M. Levelt (Eds.), Advances in psycholinguistics complexity. Journal of Experimental Psychology:
(pp. 174–186). Amsterdam: North Holland. Human Learning and Memory, 6, 174–215.
Slobin, D. I. (1973). Cognitive prerequisites for the Snow, C. E. (1972). Mothers’ speech to children
development of grammar. In C. A. Ferguson & learning language. Child Development, 43, 549–565.
Snow, C. E. (1977). The development of conversation between mothers and babies. Journal of Child Language, 4, 1–22.
Snow, C. E. (1983). Age differences in second language acquisition: Research findings and folk psychology. In K. Bailey, M. Long, & S. Peck (Eds.), Second language acquisition studies (pp. 141–150). Rowley, MA: Newbury House.
Snow, C. E. (1993). Bilingualism and second language acquisition. In J. B. Gleason & N. B. Ratner (Eds.), Psycholinguistics (pp. 391–416). Fort Worth, TX: Harcourt Brace Jovanovich.
Snow, C. E. (1994). Beginning from baby talk: Twenty years of research on input and interaction. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 3–12). Cambridge: Cambridge University Press.
Snow, C. E. (1995). Issues in the study of input: Finetuning, universality, individual and developmental differences, and necessary causes. In P. Fletcher & B. MacWhinney (Eds.), The handbook of child language (pp. 180–193). Oxford: Blackwell.
Snow, C. E., & Hoefnagel-Hohle, M. (1978). The critical period for language acquisition: Evidence from second language learning. Child Development, 49, 1114–1128.
Snowden, J. S., Goulding, P. J., & Neary, D. (1989). Semantic dementia: A form of circumscribed cerebral atrophy. Behavioural Neurology, 2, 167–182.
Snowling, M. J. (1983). The comparison of acquired and developmental disorders of reading. Cognition, 14, 105–118.
Snowling, M. J. (1987). Dyslexia: A cognitive development perspective. Oxford: Blackwell.
Snowling, M. J. (2000). Dyslexia (2nd ed.). Oxford: Blackwell.
Snowling, M. J., Bryant, P. E., & Hulme, C. (1996). Theoretical and methodological pitfalls in making comparisons between developmental and acquired dyslexia: Some comments on A. Castles and M. Coltheart (1993). Reading and Writing, 8, 443–451.
Snowling, M. J., Gallagher, A., & Frith, U. (2003). Family risk of dyslexia is continuous: Individual differences in the precursors of reading skill. Child Development, 74, 358–373.
Snowling, M. J., & Hulme, C. (1989). A longitudinal case study of developmental phonological dyslexia. Cognitive Neuropsychology, 6, 379–401.
Snowling, M. J., & Hulme, C. (Eds.). (2007). The science of reading: A handbook. Oxford: Blackwell.
Snowling, M. J., Stackhouse, J., & Rack, J. (1986). Phonological dyslexia and dysgraphia: A developmental analysis. Cognitive Neuropsychology, 3, 309–339.
Soja, N. N., Carey, S., & Spelke, E. S. (1992). Perception, ontology, and word meaning. Cognition, 45, 101–107.
Sokolov, J. L., & Snow, C. E. (1994). The changing role of negative evidence in theories of language development. In C. Gallaway & B. J. Richards (Eds.), Input and interaction in language acquisition (pp. 38–55). Cambridge: Cambridge University Press.
Solomon, E. S., & Pearlmutter, N. J. (2004). Semantic integration and syntactic planning in language production. Cognitive Psychology, 49, 1–46.
Spelke, E. S. (1994). Initial knowledge: Six suggestions. Cognition, 50, 443–447.
Spender, D. (1980). Man made language. London: Routledge & Kegan Paul.
Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Sperber, D., & Wilson, D. (1987). Précis of Relevance: Communication and cognition. Behavioral and Brain Sciences, 10, 697–754.
Sperber, R. D., McCauley, C., Ragain, R. D., & Weil, C. M. (1979). Semantic priming effects on picture and word processing. Memory and Cognition, 7, 339–345.
Spiro, R. J. (1977). Constructing a theory of reconstructive memory: The state of the schema approach. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge (pp. 137–177). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Spivey, M. J., & Marian, V. (1999). Crosstalk between native and second languages: Partial activation of an irrelevant lexicon. Psychological Science, 10, 281–284.
Spivey, M. J., McRae, K., & Joanisse, M. F. (2012). The Cambridge handbook of psycholinguistics. Cambridge: Cambridge University Press.
Spivey, M. J., & Tanenhaus, M. K. (1998). Syntactic ambiguity resolution in discourse: Modeling the effects of referential context and lexical frequency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1521–1543.
Spivey, M. J., Tanenhaus, M. K., Eberhard, K. M., & Sedivy, J. C. (2002). Eye movements and spoken language comprehension: Effects of visual context on syntactic ambiguity resolution. Cognitive Psychology, 45, 447–481.
Stabler, E. P. (1983). How are grammars represented? Behavioral and Brain Sciences, 6, 391–421.
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388, 381–382.
Stamenov, M. I., & Gallese, V. (Eds.). (2002). Mirror neurons and the evolution of brain and language (Advances in consciousness research 42). Amsterdam: John Benjamins.
Stanners, R. F., Jastrzembski, J. E., & Westwood, A. (1975). Frequency and visual quality in a word–nonword classification task. Journal of Verbal Learning and Verbal Behavior, 14, 259–264.
Stanovich, K. E., & Bauer, D. W. (1978). Experiments on the spelling-to-sound regularity effect in word recognition. Memory and Cognition, 6, 410–415.
Stanovich, K. E., Siegel, L. S., & Gottardo, A. (1997). Converging evidence for phonological and surface subtypes of reading disability. Journal of Educational Psychology, 89, 114–127.
Stanovich, K. E., Siegel, L. S., Gottardo, A., Chiappe, P., & Sidhu, R. (1997). Subtypes of developmental dyslexia: Differences in phonological and orthographic coding. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 115–141). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Stanovich, K. E., & West, R. F. (1979). Mechanisms of sentence context effects in reading: Automatic activation and conscious attention. Memory and Cognition, 6, 115–123.
Stanovich, K. E., & West, R. F. (1981). The effect of sentence context on ongoing word recognition: Tests of a two-process theory. Journal of Experimental Psychology: Human Perception and Performance, 7, 658–672.
Stanovich, K. E., West, R. F., & Harrison, M. R. (1995). Knowledge growth and maintenance across the life span: The role of print exposure. Developmental Psychology, 31, 811–826.
Stark, R. E. (1986). Prespeech segmental feature development. In P. Fletcher & M. Garman (Eds.), Language acquisition (2nd ed., pp. 149–173). Cambridge: Cambridge University Press.
Starreveld, P. A., & La Heij, W. (1995). Semantic interference, orthographic facilitation, and their interaction in naming tasks. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 686–698.
Starreveld, P. A., & La Heij, W. (1996). Time course analysis of semantic and orthographic context effects in picture naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 896–918.
Steffensen, M. S., Joag-dev, C., & Anderson, R. C. (1979). A cross-cultural perspective on reading comprehension. Reading Research Quarterly, 15, 10–29.
Stein, J. (2003). Visual motion sensitivity and reading. Neuropsychologia, 41, 1785–1793.
Stemberger, J. P. (1983). Distant context effects in language production: A reply to Motley et al. Journal of Psycholinguistic Research, 12, 555–560.
Stemberger, J. P. (1984). Structural errors in normal and agrammatic speech. Cognitive Neuropsychology, 1, 281–313.
Stemberger, J. P. (1985). An interactive activation model of language production. In A. W. Ellis (Ed.), Progress in the psychology of language (Vol. 1, pp. 143–186). Hove, UK: Lawrence Erlbaum Associates.
Sternberg, S., Knoll, R. L., Monsell, S., & Wright, C. E. (1988). Motor programs and hierarchical organization in the control of rapid speech. Phonetica, 45, 175–197.
Stevens, K. N. (1960). Toward a model for speech recognition. Journal of the Acoustical Society of America, 32, 47–55.
Stevenson, R. (1988). Models of language development. Milton Keynes, UK: Open University Press.
Stewart, A. J., Pickering, M. J., & Sanford, A. J. (2000). The time course of the influence of implicit causality information: Focusing versus integration account. Journal of Memory and Language, 42, 423–443.
Stewart, F., Parkin, A. J., & Hunkin, N. M. (1992). Naming impairments following recovery from herpes simplex encephalitis: Category-specific? Quarterly Journal of Experimental Psychology, 44A, 261–284.
Stewart, I. (1989). Does God play dice? The new mathematics of chaos. Harmondsworth, UK: Penguin.
Stirling, J. (2002). Introducing neuropsychology. Hove, UK: Psychology Press.
Storms, G., De Boeck, P., & Ruts, W. (2000). Prototype and exemplar-based information in natural language categories. Journal of Memory and Language, 42, 51–73.
Strain, E., Patterson, K., & Seidenberg, M. S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1140–1154.
Strain, E., Patterson, K., & Seidenberg, M. S. (2002). Theories of word naming interact with spelling–sound consistency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 207–214.
Sturt, P., Costa, F., Lombardo, V., & Frasconi, P. (2003). Learning first-pass structural attachment preferences with dynamic grammars and recursive neural networks. Cognition, 88, 133–169.
Sudhalter, V., & Braine, M. D. S. (1985). How does comprehension of passives develop? Journal of Child Language, 12, 455–470.
Sulin, R. A., & Dooling, D. J. (1974). Intrusion of a thematic idea in retention of prose. Journal of Experimental Psychology, 103, 255–262.
Summerfield, Q. (1981). Articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance, 7, 1074–1095.
Swain, M., & Wesche, M. (1975). Linguistic interaction: Case study of a bilingual child. Language Sciences, 17, 17–22.
Swinney, D. A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 545–569.
Swinney, D. A., & Cutler, A. (1979). The access and processing of idiomatic expressions. Journal of Verbal Learning and Verbal Behavior, 18, 523–534.
Swinney, D. A., Zurif, E. B., & Cutler, A. (1980). Effects of sentential stress and word class upon comprehension in Broca’s aphasics. Brain and Language, 10, 132–144.
Sykes, J. L. (1940). A study of the spontaneous vocalizations of young deaf children. Psychological Monograph, 52, 104–123.
Tabor, W., & Hutchins, S. (2004). Evidence for self-organised sentence processing: Digging-in effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 431–450.
Tabor, W., Juliano, C., & Tanenhaus, M. K. (1997). Parsing in a dynamical system: An attractor-based account of the interaction of lexical and structural constraints in sentence processing. Language and Cognitive Processes, 12, 211–271.
Tabor, W., & Tanenhaus, M. K. (1999). Dynamical models of sentence processing. Cognitive Science, 23, 491–515.
Tabossi, P. (1988a). Accessing lexical ambiguity in different types of sentential context. Journal of Memory and Language, 27, 324–340.
Tabossi, P. (1988b). Effects of context on the immediate interpretation of unambiguous words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 153–162.
Tabossi, P., & Zardon, F. (1993). Processing ambiguous words in context. Journal of Memory and Language, 32, 359–372.
Taft, M. (1979). Recognition of affixed words and the word frequency effect. Memory and Cognition, 7, 263–272.
Taft, M. (1981). Prefix stripping revisited. Journal of Verbal Learning and Verbal Behavior, 20, 289–297.
Taft, M. (1982). An alternative to grapheme–phoneme conversion rules? Memory and Cognition, 10, 465–474.
Taft, M. (1984). Evidence for abstract lexical representation of word structure. Memory and Cognition, 12, 264–269.
Taft, M. (1985). The decoding of words in lexical access: A review of the morphographic approach. In D. Besner, T. G. Waller, & G. E. MacKinnon (Eds.), Reading research: Advances in theory and practice (Vol. 5, pp. 83–123). Orlando, FL: Academic Press.
Taft, M. (1987). Morphographic processing: The BOSS re-emerges. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 265–279). Hove, UK: Lawrence Erlbaum Associates.
Taft, M. (2004). Morphological decomposition and the reverse base frequency effect. Quarterly Journal of Experimental Psychology, 57A, 745–765.
Taft, M., & Forster, K. I. (1975). Lexical storage and retrieval of prefixed words. Journal of Verbal Learning and Verbal Behavior, 14, 638–647.
Taft, M., & van Graan, F. (1998). Lack of phonological mediation in a semantic categorization task. Journal of Memory and Language, 38, 203–224.
Tager-Flusberg, H. (1999). Language development in atypical children. In M. Barrett (Ed.), The development of language (pp. 311–348). Hove, UK: Psychology Press.
Tallal, P., Townsend, J., Curtiss, S., & Wulfeck, B. (1991). Phenotypic profiles of language-impaired children based on genetic/family history. Brain and Language, 41, 81–95.
Tanaka, J. W., & Taylor, M. (1991). Object categories and expertise: Is the basic level in the eye of the beholder? Cognitive Psychology, 23, 457–482.
Tanenhaus, M. K., Boland, J. E., Mauner, G. A., & Carlson, G. N. (1993). More on combinatory lexical information: Thematic structure in parsing and interpretation. In G. Altmann & R. Shillcock (Eds.), Cognitive models of speech processing (pp. 297–319). Hove, UK: Lawrence Erlbaum Associates.
Tanenhaus, M. K., Carlson, G. N., & Trueswell, J. C. (1989). The role of thematic structure in interpretation and parsing. Language and Cognitive Processes, 4, 211–234.
Tanenhaus, M. K., Leiman, J. M., & Seidenberg, M. S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427–440.
Tanenhaus, M. K., & Lucas, M. (1987). Context effects in lexical processing. Cognition, 25, 213–234.
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632–1634.
Tannenbaum, P. H., Williams, F., & Hillier, C. S. (1965). Word predictability in the environments of hesitations. Journal of Verbal Learning and Verbal Behavior, 4, 134–140.
Taraban, R., & McClelland, J. L. (1988). Constituent attachment and thematic role assignment in sentence processing: Influences of content-based expectations. Journal of Memory and Language, 27, 597–632.
Tarshis, B. (1992). Grammar for smart people. New York: Pocket Books.
Taylor, I., & Taylor, M. M. (1990). Psycholinguistics: Learning and using language. Englewood Cliffs, NJ: Prentice Hall International.
Taylor, M., & Gelman, S. A. (1988). Adjectives and nouns: Children’s strategies for learning new words. Child Development, 59, 411–419.
Temple, C. M. (1987). The nature of normality, the deviance of dyslexia and the recognition of rhyme: A reply to Bryant and Impey (1986). Cognition, 27, 103–108.
Terrace, H. S., Petitto, L. A., Sanders, R. J., & Bever, T. G. (1979). Can an ape create a sentence? Science, 206, 891–902.
Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience, 17, 273–281.
Thagard, P. (2005). Mind: An introduction to cognitive science (2nd ed.). Cambridge, MA: MIT Press.
Thal, D., Marchman, V. A., Stiles, J., Aram, D., Trauner, D., Nass, R., et al. (1991). Early lexical development in children with focal brain injury. Brain and Language, 40, 491–527.
Theakston, A. L. (2004). The role of entrenchment in children’s and adults’ performance of grammaticality-judgement tasks. Cognitive Development, 19, 15–34.
Thiessen, E. D., & Saffran, J. R. (2007). Learning to learn: Infants’ acquisition of stress-based strategies for word segmentation. Language Learning and Development, 3, 73–100.
Thomas, E. L., & Robinson, H. A. (1972). Improving reading in every class: A sourcebook for teachers. Boston, MA: Allyn & Bacon.
Thomas, M. S. C. (2003). Limits on plasticity. Journal of Cognition and Development, 4, 95–121.
Thomas, M. S. C., & Karmiloff-Smith, A. (2003). Modeling language acquisition in atypical phenotypes. Psychological Review, 110, 647–682.
Thompson, C. R., & Church, R. M. (1980). An explanation of the language of a chimpanzee. Science, 208, 313–314.
Thompson, R., Emmorey, K., & Gollan, T. H. (2005). “Tip of the fingers” experiences by deaf signers. Psychological Science, 16, 856–860.
Thompson, S., & Mulac, A. (1991). The discourse conditions for the use of the complementizer that in conversational English. Journal of Pragmatics, 15, 237–251.
Thomson, J., & Chapman, R. S. (1977). Who is “Daddy” revisited: The status of two-year-olds’ overextended words in use and comprehension. Journal of Child Language, 4, 359–375.
Thorndyke, P. W. (1975). Conceptual complexity and imagery in comprehension. Journal of Verbal Learning and Verbal Behavior, 14, 359–369.
Thorndyke, P. W. (1977). Cognitive structures in comprehension and memory of narrative discourse. Cognitive Psychology, 9, 77–110.
Thorndyke, P. W., & Hayes-Roth, B. (1979). The use of schemata in the acquisition and transfer of knowledge. Cognitive Psychology, 11, 82–106.
Tincoff, R., & Jusczyk, P. W. (1999). Some beginnings of word comprehension in 6-month-olds. Psychological Science, 10, 172–175.
Tippett, L. J., & Farah, M. J. (1994). A computational model of naming in Alzheimer’s disease: Unitary or multiple impairments? Neuropsychology, 8, 1–11.
Tomasello, M. (1992a). First verbs: A case study of early grammatical development. Cambridge: Cambridge University Press.
Tomasello, M. (1992b). The social bases of language acquisition. Social Development, 1, 67–87.
Tomasello, M. (2000). Do young children have adult syntactic competence? Cognition, 74, 209–253.
Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Tomasello, M., & Akhtar, N. (2003). What paradox? A response to Naigles. Cognition, 88, 317–323.
Tomasello, M., & Farrar, M. J. (1984). Cognitive bases of lexical development: Object permanence and relational words. Journal of Child Language, 11, 477–493.
Tomasello, M., & Farrar, M. J. (1986). Object permanence and relational words: A lexical training study. Journal of Child Language, 13, 495–505.
Tomasello, M., & Kruger, A. (1992). Joint attention on actions: Acquiring verbs in ostensive and non-ostensive contexts. Journal of Child Language, 19, 311–333.
Traxler, M., & Gernsbacher, M. A. (Eds.). (2006). Handbook of psycholinguistics (2nd ed.). Burlington, MA: Academic Press.
Traxler, M. J., & Pickering, M. J. (1996). Plausibility and the processing of unbounded dependencies: An eye-tracking study. Journal of Memory and Language, 35, 454–475.
Traxler, M. J., Pickering, M. J., & Clifton, C. (1998). Adjunct attachment is not a form of lexical ambiguity resolution. Journal of Memory and Language, 39, 558–592.
Treiman, R. (1993). Beginning to spell: A study of first-grade children. New York: Oxford University Press.
Treiman, R. (1994). Sources of information used by beginning spellers. In G. D. A. Brown & N. C. Ellis (Eds.), Handbook of spelling: Theory, process and intervention (pp. 75–91). London: John Wiley & Sons Ltd.
Treiman, R. (1997). Spelling in normal children and dyslexics. In B. A. Blachman (Ed.), Foundations of reading acquisition and dyslexia: Implications for early intervention (pp. 191–218). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Treiman, R., & Hirsh-Pasek, K. (1983). Silent reading: Insights from second-generation deaf readers. Cognitive Psychology, 15, 39–65.
Treiman, R., & Zukowski, A. (1996). Children’s sensitivity to syllables, onsets, rimes, and phonemes. Journal of Experimental Child Psychology, 61, 193–215.
Trevarthen, C. (1975). Early attempts at speech. In R. Lewin (Ed.), Child alive (pp. 62–80). London: Temple Smith.
Trueswell, J. C. (1996). The role of lexical frequency in syntactic ambiguity resolution. Journal of Memory and Language, 35, 566–585.
Trueswell, J. C., Sekerina, I., Hill, N., & Logrip, M. (1999). The kindergarten-path effect: Studying online sentence processing in young children. Cognition, 73, 89–134.
Trueswell, J. C., & Tanenhaus, M. K. (1994). Toward a lexicalist framework for constraint-based syntactic ambiguity resolution. In C. Clifton, L. Frazier, & K. Rayner (Eds.), Perspectives on sentence processing (pp. 155–179). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing: Use of thematic role information in syntactic disambiguation. Journal of Memory and Language, 33, 285–318.
Trueswell, J. C., Tanenhaus, M. K., & Kello, C. (1993). Verb-specific constraints in sentence processing: Separating effects of lexical preference from garden paths. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 528–553.
Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory (pp. 381–403). New York: Academic Press.
Tulving, E., & Schacter, D. L. (1990). Priming and human memory systems. Science, 247, 301–306.
Turvey, M. T. (1973). On peripheral and central processes in vision. Psychological Review, 80, 1–52.
Tweedy, J. R., Lapinski, R. H., & Schvaneveldt, R. W. (1977). Semantic-context effects on word recognition: Influence of varying the proportion of items presented in an appropriate context. Memory and Cognition, 5, 84–89.
Tyler, L. K. (1984). The structure of the initial cohort. Perception and Psychophysics, 36, 415–427.
Tyler, L. K. (1985). Real-time comprehension processes in agrammatism: A case study. Brain and Language, 26, 259–275.
Tyler, L. K., & Marslen-Wilson, W. D. (1977). The on-line effects of semantic context on syntactic processing. Journal of Verbal Learning and Verbal Behavior, 16, 683–692.
Tyler, L. K., & Moss, H. E. (1997). Functional properties of concepts: Studies of normal and brain-damaged patients. Cognitive Neuropsychology, 14, 511–545.
Tyler, L. K., & Moss, H. E. (2001). Towards a distributed account of conceptual knowledge. Trends in Cognitive Sciences, 5, 244–252.
Tyler, L. K., Ostrin, R. K., Cooke, M., & Moss, H. E. (1995). Automatic access of lexical information in Broca’s aphasics: Against the automaticity hypothesis. Brain and Language, 48, 131–162.
Tyler, L. K., & Wessels, J. (1983). Quantifying contextual contributions to word-recognition processes. Perception and Psychophysics, 34, 409–420.
Ullman, M. T. (2004). Contributions of memory circuits to language: The declarative/procedural model. Cognition, 92, 231–270.
Ullman, M. T., Corkin, S., Coppola, M., Hickok, G., Growdon, J. H., Koroshetz, W. J., et al. (1997). A neural dissociation within language: Evidence that the mental dictionary is part of declarative memory, and that grammatical rules are processed by the procedural system. Journal of Cognitive Neuroscience, 9, 266–276.
Vaid, J. (1983). Bilingualism and brain lateralization. In S. Segalowitz (Ed.), Language functions and brain organization (pp. 315–339). New York: Academic Press.
Valian, V. (1986). Syntactic categories in the speech of young children. Developmental Psychology, 22, 562–579.
Vallar, G., & Baddeley, A. D. (1984). Phonological short-term store, phonological processing and sentence comprehension: A neuropsychological case study. Cognitive Neuropsychology, 1, 121–142.
Vallar, G., & Baddeley, A. D. (1987). Phonological short-term store and sentence processing. Cognitive Neuropsychology, 4, 417–438.
Vallar, G., & Baddeley, A. D. (1989). Developmental disorders of verbal short-term memory and their relation to sentence comprehension: A reply to Howard and Butterworth. Cognitive Neuropsychology, 6, 465–473.
van Berkum, J. J. A., Brown, C., Zwitserlood, P., Kooijman, V., & Hagoort, P. (2005). Anticipating upcoming words in discourse: Evidence from ERPs and reading times. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 443–467.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
van Gompel, R. P. G., Fischer, M. H., Murray, W. S., & Hill, R. L. (2006). Eye-movement research: An overview of current and past developments. In R. P. G. van Gompel, M. H. Fischer, W. S. Murray, & R. L. Hill (Eds.), Eye movements: A window on mind and brain. Oxford: Elsevier Science.
van Gompel, R. P. G., & Pickering, M. J. (2001). Lexical guidance in sentence processing: A note on Adams, Clifton, and Mitchell (1998). Psychonomic Bulletin and Review, 8, 851–857.
van Gompel, R. P. G., & Pickering, M. J. (2007). Syntactic parsing. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics. Oxford: Oxford University Press.
van Gompel, R. P. G., Pickering, M. J., & Traxler, M. J. (2000). Unrestricted race: A new model of syntactic ambiguity resolution. In A. Kennedy, R. Radach, D. Heller, & J. Pynte (Eds.), Reading as a perceptual process (pp. 621–648). Oxford: Elsevier.
van Gompel, R. P. G., Pickering, M. J., & Traxler, M. J. (2001). Reanalysis in sentence processing: Evidence against constraint-based and two-stage models. Journal of Memory and Language, 45, 225–258.
van Orden, G. C. (1987). A rows is a rose: Spelling, sound and reading. Memory and Cognition, 15, 181–198.
van Orden, G. C., Johnston, J. C., & Hale, B. L. (1988). Word identification in reading proceeds from spelling to sound to meaning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 371–386.
van Orden, G. C., Pennington, B. F., & Stone, G. O. (1990). Word identification in reading and the promise of subsymbolic psycholinguistics. Psychological Review, 97, 488–522.
van Petten, C. (1993). A comparison of lexical and sentence-level context effects in event-related potentials. Language and Cognitive Processes, 8, 485–531.
van Turennout, M., Hagoort, P., & Brown, C. M. (1998). Brain activity during speaking: From syntax to phonology in 40 milliseconds. Science, 280, 572–574.
Vanderwart, M. (1984). Priming by pictures in lexical decision. Journal of Verbal Learning and Verbal Behavior, 23, 67–83.
Vargha-Khadem, F., & Passingham, R. (1990). Speech and language defects. Nature, 346, 226.
Vargha-Khadem, F., Watkins, K., Alcock, K., Fletcher, P., & Passingham, R. (1995). Praxic and nonverbal cognitive deficits in a large family with a genetically transmitted speech and language disorder. Proceedings of the National Academy of Sciences, 92, 930–933.
Varney, N. L. (1984). Phonemic imperception in aphasia. Brain and Language, 21, 85–94.
Venezky, R. L. (1970). The structure of English orthography. The Hague: Mouton.
Vidyasagar, T. R., & Pammer, K. (2010). Dyslexia: A deficit in visuo-spatial attention, not in phonological processing. Trends in Cognitive Sciences, 14, 57–63.
Vigliocco, G., Antonini, T., & Garrett, M. F. (1997). Grammatical gender is on the tip of Italian tongues. Psychological Science, 8, 314–317.
Vigliocco, G., Butterworth, B., & Garrett, M. F. (1996). Subject–verb agreement in Spanish and English: Differences in the role of conceptual constraints. Cognition, 61, 261–298.
Vigliocco, G., & Hartsuiker, R. J. (2002). The interplay of meaning, sound, and syntax in sentence production. Psychological Bulletin, 128, 442–472.
Vigliocco, G., & Nicol, J. (1998). Separating hierarchical relations and word order in language production: Is proximity concord syntactic or linear? Cognition, 68, B13–B29.
Vigliocco, G., & Vinson, D. P. (2009). Semantic representation. In G. Gaskell (Ed.), The Oxford handbook of psycholinguistics (pp. 195–216). Oxford: Oxford University Press.
Vigliocco, G., Vinson, D. P., Lewis, W., & Garrett, M. F. (2004). Representing the meanings of object and action words: The featural and unitary semantic space hypothesis. Cognitive Psychology, 48, 422–488.
Vigliocco, G., Vinson, D. P., Paganelli, F., & Dworzynski, K. (2005). Grammatical gender effects on cognition: Implications for language learning and language use. Journal of Experimental Psychology: General, 134, 501–520.
Vihman, M. M. (1985). Language differentiation by the bilingual infant. Journal of Child Language, 12, 297–324.
Vihman, M. M. (1996). Phonological development. Oxford: Blackwell.
Vinson, B. P. (1999). Language disorders across the lifespan: An introduction. San Diego, CA: Singular Publishing Group.
Vipond, D. (1980). Micro and macroprocesses in text comprehension. Journal of Verbal Learning and Verbal Behavior, 19, 276–296.
Vitevitch, M. S. (2002). The influence of phonological similarity neighborhoods on speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 735–747.
Vitkovitch, M., & Humphreys, G. W. (1991). Perseverant responding in speeded naming of pictures: It’s in the links. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 664–680.
Von Frisch, K. (1950). Bees, their vision, chemical senses, and language. Ithaca, NY: Cornell University Press.
Von Frisch, K. (1974). Decoding the language of bees. Science, 185, 663–668.
Vu, H., & Kellas, G. (1999). Contextual strength modulates the subordinate bias effect: Reply to Rayner, Binder, and Duffy. Quarterly Journal of Experimental Psychology, 52A, 853–855.
Vu, H., Kellas, G., & Paul, S. T. (1998). Sources of sentence constraint on lexical ambiguity resolution. Memory and Cognition, 26, 979–1001.
Vygotsky, L. (1934). Thought and language (Trans. E. Hanfmann & G. Vakar, 1962). Cambridge, MA: MIT Press.
Waldrop, M. M. (1992). Complexity: The emerging science at the edge of order and chaos. London: Penguin Books.
Wales, R. J., & Campbell, R. (1970). On the development of comparison and the comparison of development. In G. B. Flores d’Arcais & W. J. M. Levelt (Eds.), Advances in psycholinguistics (pp. 373–396). Amsterdam: North Holland.
Walker, C. H., & Yekovich, F. R. (1987). Activation and use of script-based antecedents in anaphoric reference. Journal of Memory and Language, 26, 673–691.
Walker, S. (1987). Review of Gavagai! or the future history of the animal language controversy, by David Premack. Mind and Language, 2, 326–332.
Wall, R. (1972). Introduction to mathematical linguistics. Englewood Cliffs, NJ: Prentice Hall.
Wanner, E. (1980). The ATN and the sausage machine: Which one is baloney? Cognition, 8, 209–225.
Ward, J. (2010). The student’s guide to cognitive neuroscience (2nd ed.). Hove, UK: Psychology Press.
Wardlow Lane, L., Groisman, M., & Ferreira, V. S. (2006). Don’t talk about pink elephants! Psychological Science, 17, 273–277.
Warren, C., & Morton, J. (1982). The effects of priming on picture recognition. British Journal of Psychology, 73, 117–129.
Warren, R. M. (1970). Perceptual restoration of missing speech sounds. Science, 167, 392–393.
Warren, R. M., Obusek, C. J., Farmer, R. M., & Warren, R. P. (1969). Auditory sequence: Confusion of patterns other than speech or music. Science, 164, 586–587.
Warren, R. M., & Warren, R. P. (1970). Auditory illusions and confusions. Scientific American, 223, 30–36.
Warrington, E. K. (1975). The selective impairment of semantic memory. Quarterly Journal of Experimental Psychology, 27, 635–657.
Warrington, E. K. (1981). Concrete word dyslexia. British Journal of Psychology, 72, 175–196.
Warrington, E. K., & Cipolotti, L. (1996). Word comprehension: The distinction between refractory and storage impairments. Brain, 119, 611–625.
Warrington, E. K., & Crutch, S. J. (2004). A circumscribed refractory access disorder: A verbal semantic impairment sparing visual semantics. Cognitive Neuropsychology, 21, 299–315.
Warrington, E. K., & McCarthy, R. (1983). Category specific access dysphasia. Brain, 106, 859–878.
Warrington, E. K., & McCarthy, R. (1987). Categories of knowledge: Further fractionation and an attempted integration. Brain, 110, 1273–1296.
Warrington, E. K., & Shallice, T. (1969). The selective impairment of auditory verbal short-term memory. Brain, 92, 885–896.
Warrington, E. K., & Shallice, T. (1979). Semantic access dyslexia. Brain, 102, 43–63.
Warrington, E. K., & Shallice, T. (1984). Category-specific semantic impairments. Brain, 107, 829–854.
Wason, P. C. (1965). The contexts of plausible denial. Journal of Verbal Learning and Verbal Behavior, 4, 7–11.
Waters, G. S., & Caplan, D. (1996). The capacity theory of sentence comprehension: Critique of Just and Carpenter (1992). Psychological Review, 103, 761–772.
Waters, G. S., Caplan, D., & Hildebrandt, N. (1991). On the structure of verbal short-term memory and its functional role in sentence comprehension: Evidence from neuropsychology. Cognitive Neuropsychology, 8, 81–126.
Watkins, K. E., Dronkers, N. F., & Vargha-Khadem, F. (2002). Behavioural analysis of an inherited speech and language disorder: Comparison with acquired aphasia. Brain, 125, 452–464.
Watkins, K. E., & Paus, T. (2004). Modulation of motor excitability during speech perception: The role of Broca’s area. Journal of Cognitive Neuroscience, 16, 978–987.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.
Watts, D. (2012). Why everything is obvious (once you know the answer). New York: Atlantic Books.
Waxman, S. R. (1999). Specifying the scope of 13-month-olds’ expectations for novel words. Cognition, 70, B35–B50.
Waxman, S. R., & Booth, A. E. (2001). Seeing pink elephants: Fourteen-month-olds’ interpretations of novel nouns and adjectives. Cognitive Psychology, 43, 217–242.
Waxman, S. R., & Markow, D. B. (1995). Words as invitations to form categories: Evidence from 12- to 13-month-old infants. Cognitive Psychology, 29, 257–303.
Weekes, B. S. (1997). Differential effects of number of letters on word and nonword naming latency. Quarterly Journal of Experimental Psychology, 50A, 439–456.
Weizenbaum, J. (1966). ELIZA: A computer program for the study of natural language communication between man and machine. Communications of the Association for Computing Machinery, 9, 36–45.
Werker, J., & Curtin, S. (2005). PRIMIR: A developmental framework of infant speech processing. Language Learning and Development, 1, 197–234.
Werker, J. F., & Tees, R. C. (1983). Developmental changes across childhood in the perception of non-native speech sounds. Canadian Journal of Psychology, 37, 278–286.
Werker, J. F., & Tees, R. C. (1984). Crosslanguage speech development: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49–63.
Werker, J. F., & Yeung, H. H. (2005). Infant speech perception bootstraps word learning. Trends in Cognitive Sciences, 9, 519–527.
West, R. F., & Stanovich, K. E. (1978). Automatic contextual facilitation in readers of three ages. Child Development, 49, 717–727.
West, R. F., & Stanovich, K. E. (1982). Source of inhibition in experiments on the effect of sentence context on word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 385–399.
West, R. F., & Stanovich, K. E. (1986). Robust effects of syntactic structure on visual word processing. Memory and Cognition, 14, 104–112.
Wexler, K. (1998). Very early parameter setting and the unique checking constraint: A new explanation of the optional infinitive stage. Lingua, 106, 23–79.
Whaley, C. P. (1978). Word–nonword classification time. Journal of Verbal Learning and Verbal Behavior, 17, 143–154.
Wheeldon, L. (Ed.). (2000). Aspects of language production. Hove, UK: Psychology Press.
Wheeldon, L., & Lahiri, A. (1997). Prosodic units in speech production. Journal of Memory and Language, 37, 356–381.
Wheeldon, L. R., & Monsell, S. (1992). The locus of repetition priming of spoken word production. Quarterly Journal of Experimental Psychology, 44A, 723–761.
Wheeler, D. (1970). Processes in word recognition. Cognitive Psychology, 1, 59–85.
Whittlesea, B. W. A. (1987). Preservation of specific experiences in the representation of general knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 3–17.
Whorf, B. L. (1956a). Language, thought, and reality: Selected writings of Benjamin Lee Whorf. New York: Wiley.
Whorf, B. L. (1956b). Science and linguistics. In J. B. Carroll (Ed.), Language, thought and reality: Selected writings of Benjamin Lee Whorf (pp. 207–219). Cambridge, MA: MIT Press. [Originally published 1940.]
Wickelgren, W. A. (1969). Context-sensitive coding, associative memory, and serial order in (speech) behavior. Psychological Review, 76, 1–15.
Wierzbicka, A. (2004). Conceptual primes in human languages and their analogues in animal communication and cognition. Language Sciences, 26, 413–441.
Wilding, J. (1990). Developmental dyslexics do not fit in boxes: Evidence from the case studies. European Journal of Cognitive Psychology, 2, 97–131.
Wilensky, R. (1983). Story grammars versus story points. Behavioral and Brain Sciences, 6, 579–623.
Wilkes, A. L. (1997). Knowledge in minds: Individual and collective processes in cognition. Hove, UK: Psychology Press.
Wilkins, A. J. (1971). Conjoint frequency, category size, and categorization time. Journal of Verbal Learning and Verbal Behavior, 10, 382–385.
Wilkins, A. J., & Neary, G. (1991). Some visual, optometric and perceptual effects of coloured glasses. Ophthalmic and Physiological Optics, 11, 163–171.
Wilks, Y. (1976). Parsing English II. In E. Charniak & Y. Wilks (Eds.), Computational semantics (pp. 155–184). Amsterdam: North Holland.
Willems, R. M., & Casasanto, D. (2011). Flexibility in embodied language understanding. Frontiers in Psychology, 2, 1–11.
Williams, J. N. (1988). Constraints upon semantic activation during sentence comprehension. Language and Cognitive Processes, 3, 165–206.
Williams, P. C., & Parkin, A. J. (1980). On knowing the meaning of words we are unable to report – confirmation of a guessing explanation. Quarterly Journal of Experimental Psychology, 32, 101–107.
Wilshire, C. E., & Saffran, E. M. (2005). Contrasting effects of phonological priming in aphasic word production. Cognition, 95, 31–71.
Wilson, M., & Wilson, T. P. (2005). An oscillator model of the timing of turn-taking. Psychonomic Bulletin and Review, 12, 957–968.
Wingfield, A., & Klein, J. F. (1971). Syntactic structure and acoustic pattern in speech perception. Perception and Psychophysics, 9, 23–25.
Winner, E., & Gardner, H. (1977). The comprehension of metaphor in brain-damaged patients. Brain, 100, 717–729.
Winnick, W. A., & Daniel, S. A. (1970). Two kinds of response priming in tachistoscopic recognition. Journal of Experimental Psychology, 84, 74–81.
Winograd, T. A. (1972). Understanding natural language. New York: Academic Press.
Wisniewski, E. J. (1997). When concepts combine. Psychonomic Bulletin and Review, 4, 167–183.
Wisniewski, E. J., & Love, B. C. (1998). Relations versus properties in conceptual combination. Journal of Memory and Language, 38, 177–202.
Wittgenstein, L. (1953). Philosophical investigations (Trans. G. E. M. Anscombe). Oxford: Blackwell.
Wittgenstein, L. (1958). The blue and brown books. Oxford: Blackwell.
Woodruff-Pak, D. S. (1997). The neuropsychology of aging. Oxford: Blackwell.
Woods, B. T., & Carey, S. (1979). Language deficits after apparent clinical recovery from childhood aphasia. Annals of Neurology, 6, 405–409.
Woods, B. T., & Teuber, H.-L. (1973). Early onset of complementary specialization of cerebral hemispheres in man. Transactions of the American Neurological Association, 98, 113–117.
Woods, W. A. (1975). What’s in a link? Foundations for semantic networks. In D. G. Bobrow & A. M. Collins (Eds.), Representation and understanding: Studies in cognitive science (pp. 35–82). New York: Academic Press.
Woodward, A. L., & Markman, E. M. (1998). Early word learning. In W. Damon, D. Kuhn, & R. S. Siegler (Eds.), Handbook of child psychology (Vol. 2, 5th ed., pp. 371–420). New York: Wiley.
Woodworth, R. S. (1938). Experimental psychology. New York: Holt.
Wright, B., & Garrett, M. (1984). Lexical decision in sentences: Effects of syntactic structure. Memory and Cognition, 12, 31–45.
Wydell, T. K., Patterson, K. E., & Humphreys, G. W. (1993). Phonologically mediated access to meaning for kanji: Is rows still a rose in Japanese kanji? Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 1082–1093.
Xu, F. (2002). The role of language in acquiring object kind concepts in infancy. Cognition, 85, 223–250.
Yamada, J. E. (1990). Laura: A case for the modularity of language. Cambridge, MA: MIT Press.
Yekovich, F. R., & Thorndyke, P. W. (1981). An evaluation of alternative models of narrative schema. Journal of Verbal Learning and Verbal Behavior, 20, 454–469.
Yngve, V. (1970). On getting a word in edgewise. Papers from the Sixth Regional Meeting of the Chicago Linguistic Society, 6, 567–577.
Yopp, H. K. (1988). The validity and reliability of phonemic awareness tests. Reading Research Quarterly, 23, 159–177.
Yuill, N., & Oakhill, J. (1991). Children’s problems in text comprehension. Cambridge: Cambridge University Press.
Zagar, D., Pynte, J., & Rativeau, S. (1997). Evidence for early closure attachment on first-pass reading times in French. Quarterly Journal of Experimental Psychology, 50A, 421–438.
Zaidel, E., & Peters, A. M. (1981). Phonological encoding and ideographic reading by the disconnected right hemisphere. Brain and Language, 14, 205–234.
Zevin, J. D., & Balota, D. A. (2000). Priming and attentional control of lexical and sublexical pathways during naming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 121–135.
Zevin, J. D., & Seidenberg, M. S. (2002). Age of acquisition effects in word reading and other tasks. Journal of Memory and Language, 47, 1–29.
Zevin, J. D., & Seidenberg, M. S. (2006). Simulating consistency effects and individual difference in nonword naming: A comparison of current models. Journal of Memory and Language, 54, 145–160.
Zhuang, J., Randall, B., Stamatakis, E. A., Marslen-Wilson, W. D., & Tyler, L. K. (2011). The interaction of lexical semantics and cohort competition in spoken word recognition: An fMRI study. Journal of Cognitive Neuroscience, 23, 3778–3790.
Ziegler, J. C., & Goswami, U. (2005). Reading acquisition, developmental dyslexia, and skilled reading across languages: A psycholinguistic grain size theory. Psychological Bulletin, 131, 3–29.
Ziegler, J. C., Muneaux, M., & Grainger, J. (2003). Neighborhood effects in auditory word recognition: Phonological competition and orthographic facilitation. Journal of Memory and Language, 48, 779–793.
Ziegler, J. C., Perry, C., Jacobs, A. M., & Braun, M. (2001). Identical words are read differently in different languages. Psychological Science, 12, 379–384.
Zorzi, M., Barbiero, C., Facoetti, A., & Ziegler, J. C. (2012). Extra-large letter spacing improves reading in dyslexia. Proceedings of the National Academy of Sciences USA, 109, 11455–11459.
Zurif, E. B., Caramazza, A., Myerson, P., & Galvin, J. (1974). Semantic feature representations for normal and aphasic language. Brain and Language, 1, 167–187.
Zurif, E. B., & Grodzinsky, Y. (1983). Sensitivity to grammatical structure in agrammatic aphasics: A reply to Linebarger, Schwartz, & Saffran. Cognition, 15, 207–214.
Zwaan, R. A., & Madden, C. J. (2004). Updating situation models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 283–288.
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386–397.
Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162–185.
Zwitserlood, P. (1989). The locus of the effects of sentential-semantic context in spoken-word processing. Cognition, 32, 25–64.
AUTHOR INDEX
Catlin, J. 333
Cattaneo, C. 254
Cattell, J.M. 176
Cazden, C.B. 107, 146
Chafe, W.L. 444
Chaffin, R. 378
Chalard, M. 173, 174
Chambers, C.G. 304, 458
Chambers, S.M. 172, 175, 183, 214
Chambers Twentieth Century Dictionary 55
Chan, A.S. 349
Chan, D. 348
Chang, F. 143, 373, 404, 479
Chao, L.L. 348
Chapman, R.M. 311, 312
Chapman, R.S. 132
Charney, R. 130
Chater, N. 26, 54, 121, 139, 276
Chawluk, J.B. 352
Chen, H.-C. 155
Chertkow, H. 343
Chialant, D. 281, 442
Chiappe, P. 252
Chiat, S. 437
Cholin, J. 428, 429
Chomsky, N. 9, 10, 11, 24, 36, 37, 40, 41, 42, 43, 45, 51, 52, 66, 67, 108, 109, 111, 112, 144, 475
Christiaansen, R.E. 369, 370
Christiansen, J.A. 389
Christiansen, M.H. 54, 118, 121, 122, 472
Christianson, K. 307
Chumbley, J.I. 174, 180, 181, 182
Church, R.M. 63
Cipolotti, L. 234, 339, 340
Cirilo, R.K. 378
Clahsen, H. 35, 111, 146, 147
Clark, E.V. 91, 113, 123, 124, 125, 127, 130, 131, 132, 133, 134, 135, 136, 259, 268
Clark, H.H. 9, 91, 113, 123, 124, 125, 127, 130, 131, 132, 135, 259, 268, 337, 367, 375, 376, 396, 430, 433, 449, 452, 453, 455
Clark, J.M. 155
Clarke, R. 195
Clarke-Stewart, K. 110
Claus, B. 383
Cleland, A.A. 404
Clifton, C. 263, 296, 297, 302, 307, 311, 374
Coffrey-Corina, S.A. 75, 76
Cohen, G. 387
Cohen, L. 184
Cohen, M.M. 275
Cok, B. 156, 158
Colby, K.M. 14
Cole, R.A. 271
Coleman, L. 332
Collins, A.M. 323, 324, 325, 335, 444, 445
Colombo, L. 176
Coltheart, M. 110, 140, 175, 214, 216, 217, 220, 222, 223, 224, 225, 226, 228, 231, 235, 238, 245, 251, 252, 341, 345, 349, 464
Coltheart, V. 246
Conboy, B.T. 157
Conner, L.T. 206
Connine, C.M. 263
Conrad, C. 217, 324
Conrad, R. 87
Constable, R.T. 73, 298
Content, A. 175
Conti-Ramsden, G. 87
Cook, A.E. 383
Cook, V. 154
Cook, V.J. 37, 111
Cooke, M. 315
Cooper, F.S. 258, 268
Cooper, J. 242
Cooper, W.E. 281
Coppola, M. 69, 114, 409
Corballis, M.C. 53
Corbett, A.T. 369, 373
Corbett, G. 96
Corbit, L. 261, 264
Corina, D.P. 68, 73
Corkin, S. 69, 409
Corley, M. 462
Corley, M.M.B. 305
Corrigan, R. 81
Cortese, C. 155, 175
Cortese, M.J. 177
Coslett, H.B. 225, 341, 349, 443
Costa, A. 157, 321, 404, 407, 417, 421, 429, 430
Costa, F. 305
Costa, L.D. 76
Crago, M.B. 114
Craik, F.I.M. 155
Crain, L. 84
Crain, S. 299, 300
Cree, G.S. 353
Crocker, M.W. 302
Cromer, R.F. 96
Croot, K. 442
Cross, T.G. 109, 110
Crosson, B. 344
Crum, W.R. 348
Crutch, S.J. 340
Cruz, A. 254
Crystal, D. 3, 7, 124, 154
Cuetos, F. 304, 305, 442
Cummins, J. 76, 154
Cupples, L. 341
Curtin, S. 118, 122, 123
Curtis, B. 228, 231
Curtiss, S. 78, 115
Cutler, A. 122, 260, 261, 265, 276, 277, 279, 315, 338, 398, 410, 411, 463, 480
Italic page numbers indicate tables; bold numbers indicate figures, pictures and text boxes.
aspirated sounds 30 response bias 182; semantic bias 310; verb bias
assimilation 80–1, 259 302–3; whole-object bias 128–9
associations 320–1, 322–3 bigram frequency 214
associative facilitation 267 bilabial sounds 33
associative semantic priming 185–7 Bilingual Interactive Activation Plus (BIA+)
attachment preferences 305 model 157
attentional dyslexia 220 bilingualism 94, 480; advantages 154–5; age of
attentional processes, visual word recognition 177–80 acquisition 158; aphasia 157; categories 153–4, 154;
attentional processing 177–8, 178; agrammatic aphasia and cognitive processing 154–5; and color coding
315; evaluation of research 180; two-process 96; early research 154; evaluation of research 162;
priming model 179–80 interference 157; language processing 155–7;
attitude and emotion, second language acquisition 160 lateralization 157; lexicalization 411, 421; models
attractors 236 157; neuroscience 157–8; overview 153; parameter
audience design 455–6 setting 112; second language acquisition 158–61;
audiolingual teaching 159 segmentation 260–1; summary 162; syntactic
auditory comprehension 71–2, 157 processing 156; tip-of-the-tongue (TOT) 415;
auditory short-term memory (ASTM) 473; tasks translation 156–7
469–70 biological basis, of language 67–73
autism, language development 83 blindness see visual impairment
automata theory 44 blindspots 169
automatic associative priming 186–7 blocking hypothesis 415
automatic inferences 369 body 215
automatic non-associative priming 186–7 bonobos, language acquisition 64–5
automatic processing 177–8, 178; agrammatic book: cognitive emphasis 4; conclusions 480; themes
aphasia 315 22–6, 23, 475–7
autonomous access model 203–4 BootLex 121
autonomous-interactive distinction 266 bootstrapping 121, 122–3; semantic 136–7; syntactic
autonomous models of parsing 288 130
autonomy, in syntactic processing 296 borrowing, of words 8
autonomy of syntax 11 bottom-up 24
autonomy theory 266–7 bound morphemes 401
auxiliary hypotheses 24 “box-and-arrow” diagrams 13
auxiliary verbs 41; visual impairment 87 box and candle problem 94, 94
boxology 477
brain: activity during reading 184, 224; Alzheimer’s
B disease 348; cross section 17; knowledge storage
babbling 104, 123, 123–5 areas 347; and language 17–22; localization of
babytalk 109–11 functions 67–73, 70, 72; resolving ambiguity 304;
back-channel communication 454 syntactic processing 436
back propagation 230, 483–5 brain damage 476; and comprehension 389; effects
backward translation 156 on parsing 312–16; lesion studies 17–19; not
backwards masking 171–2, 172 localized 220; range of effects 461; recovery
base frequency effect 191 74–5; selective language impairment 158; spoken
basic-level terms 130 word recognition 281
basic levels 334 brain development, and language development 52
basis of language: biological basis 67–73; cognitive brain imaging 16, 19–22, 68, 71; ambiguous and non-
basis 80–2; genetics 53; hand gestures 53–4; ambiguous sentences 304; increasing accuracy 478;
origins 51–4; overview 51; primate studies 53; semantic and syntactic processing 298
protolanguage 52; social basis 83–8; social factors bridging inferences 367–8, 369, 370
53; summary 100 Broca’s aphasia 68, 433–4, 435, 444
Bassa, color coding 95 Broca’s area 17; agrammatic aphasia 313; location 18,
Bayesian models 478 68; role of 71
bees 54, 55, 57 Brodmann’s area 53, 316
behaviorism 10, 123; arguments against 108; as
empirical 106; view of thought 88–99
Berinmo 97 C
bias: in comprehension 376; familiarity bias 421; in canonical sentence strategy 293
learning 127; lexical bias 421; in research 16, 142; capacity theory 471
592 SUBJECT INDEX
Caramazza’s model 416, 417 Chinese 92–3; number systems 94; reading 227; script
cascade models 23–4, 418–21, 424–5, 426 226; see also Mandarin
case grammar 377 Chinese–English bilinguals 94
CAT (computerized axial tomography) 20, 20 Chomsky’s linguistic theory 36–45; see also
categorical perception 261–2 transformational grammar
categorical phoneme perception, TRACE model 275 class-inclusion model 338
categorization 320; basic level 334; fuzziness 330, 333 classification, evaluation of research 336
category decision task 216 clauses 38
category-specific disorders: connectionist models click displacement technique 291
352–4; living–non-living dissociation 345; closed-class items 38
methodological issues 344–5; modality-specific closure 294; late 295–6
effects 346–8; sensory–functional theory 345–8; co-articulation 121–2, 259–60, 262
stimulus materials 344 co-reference, comprehension 372
causal coherence 361 coda 35
causative verbs 144, 331, 331–2 code switching 154
center-embedding 40, 45 coding, of color 95
centering theory 376 cognition: embeddedness 356; indirect effects of
central deep dyslexia 223–4 language 94
central dyslexias 220 cognition hypothesis 81, 83, 88
certainty 26 cognitive cycles 432, 432
chaffinch 74 cognitive development: hearing impairment 87–8;
changes in languages, over time 7–9 Piagetian theory 80–2, 81
characteristic features 327, 327–9 cognitive economy 320
child-directed speech 109–11, 110, 455; cultural cognitive linguistics 43
variation 110 cognitive neuropsychology 17–18
children: color 96; concept development 335; cognitive neuroscience, area of study 17
deprivation of linguistic input 78–9; early sounds cognitive processes, specificity 26
104; hearing children of hearing-impaired parents cognitive processing, and bilingualism 154–5
77; hypothesis testing 123, 129–30; language cognitive psychology 10, 13
acquisition 63; lateralization 74–6; learning cognitive science approach 13–15
difficulties 82–3; motion encoding 98; spatial coherence 361
coding 97 coherence graph 384
children, language development 478–9; acquisition of cohort model 265, 268–73; extension 278
irregular forms 108; after babbling 124; babbling collaboration, in conversations 454–6
123–4; child-directed speech 109–11, 110; Collins and Quillian semantic network model 323–5
conditioning 107–8; distributional information color coding 95–6, 268
117–18; early speech perception 120–3; color hue division 95
early words 126, 127; errors 126–7; errors in color, memory for 95–7
meanings 131–4; formal approaches 115–16; color perception 97
genetic linguistics 114–15; imitation 106; color spectrum, and visual system 96–7
individual differences and preferences 129–30; color terms, hierarchy 95, 96
language acquisition device (LAD) 111–18; commissive speech acts 450
later phonological development 124; lexical and common ground 375–6
semantic development 125–6, 126; linguistic common-store models 155–6
universals 112–14; mapping problem communication 5; steps in 3
127–30; name learning 127–9, 131; output communicative signals 54
simplification 125, 125; over- and under-extensions comparative linguistics 10
131–4; overview 104–5; parameter setting 111–12, competence 36–7, 105
114; phonological development 120–5; pidgins competition 303
and creoles 114; poverty of the stimulus 108–9; competition effects 279
process 118–20; semantics first 136; summary competition-integration model 303
150–1; syntactic categories 136–9; syntactic competitive queuing 427
comprehension 148–9; syntactic development compound nouns 336–7
136–49; use of cues 130; verb-argument structure compound words 191
141–4; in the womb 119 comprehensible input hypothesis 160
chimpanzees see apes comprehension: accessibility 375–6; agrammatism
chinchillas 122 435; anaphoric ambiguity 372–5; bias 376;
co-reference 372; common ground 375–6; context construction–integration model 378, 384–6
effect 364–7, 365; first mention 376; given–new constructivist-semantic perspective 136
contract 376; implicit causality 373–4; implicit content-word substitutions 437
focus 374–5; improving reading skills 387–8; content words 38, 315, 400
individual differences 386–8; inferences 367–72; context: lexical ambiguity 202–3; and meaning
Kintsch’s construction–integration model 378, 319; and sound identification 263–5; visual word
384–6; and memory 361, 362–72; memory, recognition 187–90
inferences and anaphora 376–7; mental models context effect: cohort model 270; comprehension
382–4; neuroscience of text processing 388–9; and memory 364–7, 365; garden path model
overview 360–2; prior knowledge 364–7, 387; and 301–2; speech recognition 277; TRACE model 276;
production 135–6; recency 376; reference 372; understanding indirect speech acts 451–2; word
referential processing 361; schema-based theories recognition 266–7
380–2; semantic processing 361; and sentence context-free grammars 40, 44, 45
structure 361; and short-term memory 389; speed context-guided single-reading lexical access model 199
reading 218–19; story grammars 378–80; summary context-sensitive grammars 40, 44, 45
390–1; text processing 377–86; visual information context-sensitive model 204–5
457, 457–8 contingent negative variation (CNV) 19
computational account, of vision 13 continuity assumption 142
computational metaphor 13 continuity hypothesis 111, 112, 123
computational models 25–6, 478–9 continuity theories 136
computer modeling 13–14; see also models contrastive hypothesis 134, 159
computer programs: ELIZA 14; experimental packages controlled processing 177
15; PARRY 14; SHRDLU 14–15 conversation analysis 453–4
concepts 320; combining 336–7; wooliness 333 conversational hypothesis 109
conceptual change 130 conversational implicature 452–3
conceptual dependency theory 378 conversations 360, 361; ambiguity 455–6; ambiguity
conceptual mediation 156 in 456; collaboration 454–6; conceptual pacts 453;
conceptual pacts 453 Grice’s maxims 452, 452–3; inferences in 449–53;
Conceptual Selection Model (CSM) 412 layering 452; privacy 453; sound and vision 456–8;
conceptualization, speech production 395–6 structure of 453; turn-taking 84–5, 454; visual cues
concrete nouns 37 454, 454
conditioning 107–8 cooperation 85
conduction aphasia 443, 466 core description 328
congruence 187–8 Cornell University conference 9
conjoint frequency 324 cotton-top tamarins 66, 66, 122
conjunctions 37, 40 counter-factual reasoning 92–3
connectionism 15, 25–6, 106, 477, 481–5 creole languages 114
connectionist modeling 80, 117–18, 138–9, 146–7 critical period hypothesis 73–80; deprivation
connectionist models 25–6, 229, 427; accessing of linguistic input 78–9; evaluation 79–80;
semantics 232–3; aphasia 440–2; of dyslexia lateralization 74–5; second language acquisition
233–7; grounding 355–6; of impairment in 76–7; syntactic development 76–7
dementia 350; latent semantic analysis (LSA) 354; cross-cultural studies 140, 476
lexicalization 423–4, 425–6; of reading 467; revised cross-language priming 155
232; semantic microfeature loss hypothesis 352; cross-linguistic differences, language development 148
semantics 351–6; of sentence production 403–4; cross-linguistic research 479–80
cross-modal lexicon decision task 191
speech recognition 273–80; working memory 472
cross-modal priming technique 201–2, 271–2
connotation 321–2
cross-sectional studies 105
conservation task 80
crossed aphasia 75, 157
consolidated alphabetic phase 242
CT scan, stroke 158
consonantal languages 210, 210
cues 130, 135
consonants 30, 33–5; as combinations of
culture, transmission 5
distinguishing phonological features 34; speech
production 33
constituent analysis 38 D
constituents 38 Dani 95
constraint-based models 296, 300–3ff; compared to data 16
garden path theories 305–6 data-driven processes 23, 24
interactive parallel constraint model 375 104–5; social context 83–4; see also apes; children,
intercalated dependencies 45 language acquisition
interchangeability 320–1; of pauses 433 language acquisition device (LAD) 111–18, 479
intercorrelated features 353–4 language acquisition socialization system (LASS) 84
interference 157 language bioprogram hypothesis 114
interlopers 414 language development 105; children with learning
internalized language (I-language) 36–7 difficulties 82–3; critical period hypothesis 73–80;
International Phonetic Alphabet (IPA) 31, 32 cross-linguistic differences 148; drivers 105–11;
evaluation of evidence of effects of sensory impairment 88; hearing impairment 86, 87–9, 88; individual differences 148; visual impairment 85–6, 86
Internet 479
intersubjectivity 129
intonation 120
intra-lexical context 267
intransitive verbs 38
Inuit 91, 92
invariance 259
IQ (intelligence quotient) 115
irregular forms, acquisition of 108
irregular words 228–9; reading 211
irreversible determinism (invariance) hypothesis 74
irreversible passive sentences 12
isolation point 266, 271
isomorphism 259
Italian 111, 209
iteration 40, 44
J
Japanese 226–7
jargon aphasia 437–8
joint attention 84, 129
jokes 3
juncture pauses 432
K
kana 226–7
kanji 226–7
Kannada 144
kernel sentences 11, 41
kinship terms 326, 326
Kintsch’s construction–integration model 378, 384–6
Kintsch’s propositional model 376
knowledge storage areas 347
Korean 113
L
labeling 94, 129
labiodentals 34
language: aspects of 6; defining 5–7, 55; design features 55–6; functions 3; social setting 3; utility 56–7; and vision 456–8
language abilities, innateness 105–6
language acquisition 4; apes vs. children 63, 63–4; bonobos 64–5; children 63; deprivation of linguistic input 78–9; general principles 116; hearing children of hearing-impaired parents 77; as parameter setting 111–12; pragmatic factors 117; research methods
language disorders, of social use 85
language families 8
language functions, localization of 67–73
language learning, formal approaches 115–16
language loss, Alzheimer’s disease 352
language, meaning and use: overview 285; see also sentence structure
language of thought 288
language processes, specificity 26
language processing: bilingualism 155–7; improving understanding of 479; interaction 461, 475–6; overlap 475; unconscious 460; visual and auditory 461–2
language production and use: overview 393; summary 446–7; writing and agraphia 444–5; see also speech production
language production, overview 395–6
language, study of: context and overview 3–4; difficulty 5; reasons for 4–5
language systems: experimental evidence for lexicons 463; lexicalization 462–8; and memory 473; modeling 467–8; modularity 23–5, 460; modules 461–2; neuropsychology and lexical architecture 464–7; overview 460–1; rules 25–6; semantic 464; and short-term memory 468–73; structure of 465; summary 473
language teaching, to animals 57–67
language use: inferences in conversation 449–53; overview 449; speech acts 450–2; structure of conversations 453–4; summary 458
languages: number of 7; relationships 7
larynx 32
late bilingualism 153
late closure 295–6, 305
late-syntax theory 143
latent semantic analysis (LSA) 354, 356
lateralization 74–5; bilingualism 157; infants 75–6
layering, in conversations 452
learnability theory 116
learning bias 127
learning difficulties, language development 82–3
learning theory 108; see also behaviorism
learning to read: age 247; cues 243; developmental dyslexia 249–55; exposure to print 248; multi-sensory techniques 255; normal development 241–3; overview 241; phonological
awareness 243–5; progress between stages 251; size of early reading units 245–7; summary 256; teaching methods 247, 247–8; see also reading
left-hemisphere dominance 53–4
lemmas 410, 416–18, 427–8, 462
lesion studies 17–19, 68–9
less-is-more theory 118, 161
letter-by-letter reading 220
letters, and sounds 31
levels, of psychological processing 23
lexeme selection 410
lexical access 167, 258, 265, 266, 280; modes 167; see also visual word recognition
lexical ambiguity 198–205; autonomous access model 203–4; context effect 202–3; context-guided single-reading lexical access model 199; early research 199–205; evaluation of research 206; experimental research 200–2; frequency effect 202–3; integration model 203–4; models 199; multiple access model 199; ordered-access model 199; reordered access model 204, 205; selective access model 203; Swinney’s experiment 201–2
lexical and semantic development 125–36, 126; comprehension and production 135–6; early words 126, 127; errors in meanings 131–4; individual differences and preferences 129–30; later development 134–5; mapping problem 127–30; name learning 127–9, 131; over- and under-extensions 131–4; summary of early development 134
lexical bias 421
lexical boost 403
lexical category ambiguity 308–10
lexical causatives 331–2
lexical decision task 170, 178, 186; and consistency of results 180–1; frequency effect 182–3
lexical entrainment 453
lexical guidance 297
lexical identification shift 263
lexical instance models 192
lexical neighborhoods 272
lexical processing, and short-term memory 469–72
lexical retrieval 414
lexical selection 410, 411–12
lexical-semantic anomia 439, 440
lexicalization 396, 410–26; bilingualism 411, 421; cascade models 418–21, 423–6; connectionist models 423–6; discrete stage models 421, 425–6; experimental evidence 411–12; feedback 421–2, 425–6; horizontal information flow 422; interactive activation models 422–4, 423; interactivity 418–22; mediated priming 418–20; neuroscience 412–14; and pauses 430–1; speech errors 410–11, 425; stages 410–18; time course 418–21; tip-of-the-tongue (TOT) 414, 414–16; two-stage models 410, 410–13, 418–19, 419, 423
lexicons 7, 319; access to 464; bilingualism 155–7; number of 462–8
lexigrams 62, 64
limbus tracking 168
linear-bounded automaton 44
linguistic ambiguity 455–6, 456
linguistic determinism 90
linguistic encoding 94
linguistic feedback hypothesis 109
linguistic relativism 90
linguistic rules 25–6
linguistic universals 112–14
linguistics: contribution of 12–13; overview 10; transformational grammar 10–13
lip-reading 458
liquids 34
listening, neuroscience 413
literacy 168, 244–5
locality assumption 24
localization, of language functions 67–73, 70
locational coherence 361
locutionary force 450
logical inferences 367
logogen model 194–6, 195, 196, 463
logographic languages 210
logographic stage 242
longitudinal studies 105, 140, 245
look and name 127
look-and-say method 247
Low Interactional Content (LIC) 364
lying 453
M
macroplanning 396
made-up words 437
magic moment 167
magnocellular system 255
malapropisms 410
Mandarin: spatial coding 97–8; see also Chinese
manner of articulation 33, 34
mapping hypothesis 313
mapping problem 127–30
mapping, sounds onto letters 245
masked phonological priming 217
mass nouns 130
matching span task 469
maturation 74
maturation hypothesis 111, 112
maturational state hypothesis 76, 80
mean length of utterance (MLU) 144–5, 145
meaning: and context 319; role in accessing sound 218; and structure 12; see also semantics
meaning-first view 136
meaning through syntax (MTS) 308
meanings, children’s errors 131–4
medial geniculate nucleus 250–1
mediated priming 181, 418–20
type-B spelling disorder 251
Tzeltal 97
U
U-shaped development 108, 141, 146, 147
U-shaped learning, second language acquisition 159
ultra-cognitive neuropsychology 18
unaspirated sounds 30
unbounded dependency 311–12
uncertainty 5, 26
under- and over extensions 131–4; theories of 133
unfilled pauses 430
unimodal store hypothesis 340
uniqueness point 265, 269
Universal Grammar 111, 112
unrestricted race model 306–8
unrestricted search hypothesis 375
unvoiced consonants 34
utility, of language 56–7
V
V̄ 42
velars 34
verb-argument structure 141–4
verb bias 302–3
verb-island hypothesis 142
Verbal Behavior (Skinner) 107
verbatim memory 362–4
verbs 37, 38
vertical information 422
vision: computational account 13; and language 456–8
visual comprehension 157
visual context 267
visual dyslexia 220
visual impairment: auxiliary verbs 87; language development 85–6, 86; phonological development 87; syntactic development 87
visual information: in comprehension 457, 457–8; and parsing 458
visual processing, dementia 349
visual scenes 402–3
visual system, and color spectrum 96–7
visual word form area 184
visual word recognition 184; accessing selective properties 205–6; age-of-acquisition (AOA) 173–4; attentional processes 177–80; comparison of models 198; consistency of results 180–3; context effect 187–90; dedicated system 183–5; evaluation of attentional process research 180; expectations 178–9, 179; eye movement studies 168–70; facilitation and interference 171–7; factors affecting 177; form-based priming 176; frequency effect 181–3; hybrid models 198; interactive activation models 196–8; lexical ambiguity 198–205; logogen model 194–6; meaning-based facilitation 185–90; methods and findings 168–71; models 192–8; morphologically complex words 190–2; neighborhood effects 175; overview 167–8; reaction time measures 170; repetition priming 175–6; semantic priming 176–7; serial search model 192–4; summary 207; summary of research into meaning-based priming 190; syllable number 175; word frequency 172–3; word length 174–5; words and nonwords 175
vocabulary development 135
vocabulary differentiation 91–2
vocabulary learning, and phonological loop 471
vocal tract: human and chimpanzee 59; structure 33
voice onset time (VOT) 34, 261
voiceless consonants 34
voiceless glottal fricative 34
voicing 33–4
voluntary control, of language 56
vowels 30, 35; as combinations of distinguishing phonological features 35; speech production 32–3
W
waggle dance 54, 55, 57
weak phonological perspective 216
WEAVER++ 426, 427–8, 428
Welsh, number systems 94
Wernicke–Geschwind model 17, 68–9
Wernicke’s aphasia 68, 433–4, 435, 444
Wernicke’s area 17, 18, 69
whales 55
whole-object bias 128–9
whole word method 247
Wickelfeatures 230, 232, 477
Wild Boy of Aveyron 78
William’s syndrome 82–3, 146, 148
word association 156
word, concept of 6–7
word exchange errors 398, 401, 402, 407
word frequency 143; and pauses 431; visual word recognition 172–3
word identification 265
word length: measuring 174; visual word recognition 174–5
“word-like entity in the language of thought 336
word meaning see semantics
word meaning deafness 281, 464–5
word order 113, 402; linguistic universals 113
word production, semantic priming 187
word recognition 463; context effect 266–7; frequency effect 266; neuroscience 281; overview 165; PET (positron emission tomography) 463; stages of 265; time course 265–6; see also visual word recognition
word repetition 465–6
word substitution errors 398, 401, 402, 407
word superiority effect 196
words: borrowing 8; classes 37–8; classification of pronunciations 215; ease of learning 130; models of naming 227–33; multiple meanings 128; processing of content and function words 315; reading 213–20